English Pages 606 [585] Year 2021
Lecture Notes in Networks and Systems 185
Siba K. Udgata Srinivas Sethi Satish N. Srirama Editors
Intelligent Systems Proceedings of ICMIB 2020
Lecture Notes in Networks and Systems Volume 185
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas— UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/15179
Editors Siba K. Udgata School of Computer and Information Sciences University of Hyderabad Hyderabad, Telangana, India
Srinivas Sethi IGIT Sarang Dhenkanal, Odisha, India
Satish N. Srirama Institute of Computer Science University of Tartu Tartu, Estonia School of Computer and Information Sciences University of Hyderabad Hyderabad, Telangana, India
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-33-6080-8 ISBN 978-981-33-6081-5 (eBook) https://doi.org/10.1007/978-981-33-6081-5 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Committee Members
Patron
Prof. Satyabrata Mohanta, Director, IGIT Sarang

General Chair
Prof. Lalit Mohan Patnaik, National Institute of Advanced Studies and Indian Institute of Science, Bangalore

Program Chair
Prof. Siba K. Udgata, University of Hyderabad, India

Program Co-chair
Prof. Satish Srirama, University of Tartu, Estonia

Organising Chairs
Prof. S. N. Mishra, IGIT Sarang
Prof. Srinivas Sethi, IGIT Sarang

Convenor
Prof. Srinivas Sethi, IGIT Sarang

Publicity Chairs
Prof. B. P. Panigrahi, IGIT Sarang
Prof. Sasmita Mishra, IGIT Sarang
Prof. Pranati Das, IGIT Sarang
Prof. Urmila Bhanja, IGIT Sarang
Prof. B. B. Choudhury, IGIT Sarang
Prof. Brojo Kishore Mishra, GIET University, Gunupur, India

Accommodation Chairs
Prof. S. K. Tripathy, IGIT Sarang
Prof. Sanjaya Patra, IGIT Sarang
Prof. R. N. Sethi, IGIT Sarang
Prof. P. R. Dhal, IGIT Sarang
Prof. K. D. Sa, IGIT Sarang

Finance Chairs
Prof. Rabindra Behera, IGIT Sarang
Prof. Ashima Rout, IGIT Sarang

Technical Committee Members
Manas Ranjan Patra, Berhampur University
Siba K. Udgata, University of Hyderabad
O. B. V. Ramanaiah, OBV, JNTU Hyderabad
G. Suvarna Kumar, MVGRCE, Vijayanagaram
G. Sandhya, MVGRCE, Vijayanagaram
R. Hemalatha, University College of Engineering, Osmania University
Amit Kumar Mishra, University of Cape Town, South Africa
R. Thangarajan, Kongu Engineering College, Tamil Nadu
Birendra Biswal, Gayatri Vidya Parishad College of Engineering, Vishakhapatnam
Somanath Tripathy, IIT Patna
A. K. Turuk, NIT Rourkela
B. D. Sahoo, NIT Rourkela
D. P. Mohapatra, NIT Rourkela
P. M. Khilar, NIT Rourkela
P. G. Sapna, CIT, Coimbatore
R. K. Dash, NIC, Mizoram
B. K. Tripathy, VIT, Vellore
Moumita Patra, IIT Guwahati
S. N. Das, GIET University, Gunupur, Odisha
Ram Kumar Dhurkari, IIM, Sirmaur, Himachal Pradesh
Chitta Ranja Hota, BITS Pilani, Hyderabad
A. Kavitha, JNTU, Hyderabad
S. Mini, NIT, Goa
S. Nagender Kumar, University of Hyderabad
Lalit Garg, University of Malta, Malta
Lalitha Krishna, Kongu Engineering College, Tamil Nadu
C. Poongodi, Kongu Engineering College, Tamil Nadu
Sumanth Yenduri, Kennesaw University, USA
Shaik Shakeel Ahamad, Majmaah University, Saudi Arabia
K. Srujan Raju, CMR Technical Campus, Hyderabad
R. Hemalatha, University College of Engineering, Osmania University, Hyderabad
P. Sakthivel, Anna University
Pavan Kumar Mishra, NIT, Raipur
Tapan Kumar Gandhi, Department of Electrical Engineering, IIT Delhi
Annappa B., NIT Surathkal
Prafulla Kumar Behera, Utkal University, Bhubaneswar, Odisha
Nekuri Naveen, School of Computer and Information Sciences, University of Hyderabad
Ch. Venkaiah, School of Computer and Information Sciences, University of Hyderabad
Rajendra Lal, School of Computer and Information Sciences, University of Hyderabad
Dillip Singh Sisodia, Department of Computer Science and Engineering, NIT, Raipur
Pradeep Singh, Department of Computer Science and Engineering, NIT, Raipur
Jay Bagga, Ball State University, USA
Sumagna Patnaik, J. B. Institute of Engineering and Technology, Hyderabad
Ajit K. Sahoo, University of Hyderabad
Atluri Rahul, Neurolus Systems, Hyderabad
Samrat L. Sabat, Center for Advanced Studies in Electronic Science and Technology (CASEST), University of Hyderabad
Nihar Satapathy, Sambalpur University
Susil Kumar Mohanty, Department of Computer Science and Engineering, IIT Patna
Kagita Venkat, NIT, Warangal
Sanjay Kuanar, GIET University, Gunupur, Orissa
Bhabendra Biswal, College of Engineering, Bhubaneswar
Padmalaya Nayak, G. R. Institute of Engineering and Technology, Hyderabad
Bhibudendu Pati, R. D. Women's University, Bhubaneswar
Chabi Rani Panigrahi, R. D. Women's University, Bhubaneswar
Rajesh Verma, Infosys Ltd, Hyderabad
Arun Avinash Chauhan, School of Computer and Information Sciences, University of Hyderabad
Khusbu Pahwa, Delhi Technological University, New Delhi
Soumen Roy, DRDL, Hyderabad
Satyajit Acharya, Tech Mahindra, Hyderabad
Subhrakanta Panda, BITS Pilani, Hyderabad
Vineet P. Nair, School of Computer and Information Sciences, University of Hyderabad
Subash Yadav, Department of Computer Science, Central University of Jharkhand, Ranchi
Layak Ali, Central University Karnataka, Gulbarga
Deepak Kumar, NIT Meghalaya
Bunil Balabantaray, NIT, Meghalaya
Sumanta Pyne, NIT Rourkela
Asis Tripathy, VIT, Vellore
Mousumi Saha, NIT Durgapur
Abhijit Sharma, NIT, Durgapur
Mayukh Sarkar, MNNIT, Allahabad
Oishila Bandyopadhyay, IIIT, Kalyani
Subrat Kumar Mohanty, IIIT, Bhubaneswar
Ramesh Chandra Mishra, IIIT Manipur
Hirak Maity, Kolaghat Engineering College
Sandeep Kumar Panda, ICFAI, Hyderabad
Prasanta Kumar Swain, NOU, Odisha
Ashim Rout, IGIT Sarang
Srinivas Sethi, IGIT Sarang
S. N. Mishra, IGIT Sarang
Urmila Bhanja, IGIT Sarang
D. J. Mishra, IGIT Sarang
S. Mishra, IGIT Sarang
Sangita Pal, IGIT Sarang
Sanjaya Patra, IGIT Sarang
Biswanath Sethi, IGIT Sarang
Niroj Pani, IGIT Sarang
Dillip Kumar Swain, IGIT Sarang
Pranati Dash, IGIT Sarang
B. P. Panigrahy, IGIT Sarang
Rabindra Behera, IGIT Sarang
L. N. Tripathy, CET Bhubaneswar
B. B. Choudhary, IGIT Sarang
Dhiren Behera, IGIT Sarang
R. N. Sethi, IGIT Sarang
Anand Gupta, IGIT Sarang
Ayaskanta Swain, NIT Rourkela
Gayadhar Panda, NIT Meghalaya
S. K. Tripathy, IGIT Sarang
B. B. Panda, IGIT Sarang
Md. N. Khan, IGIT Sarang
Anukul Padhi, IGIT Sarang
Debakanta Tripathy, IGIT Sarang
S. K. Maity, IGIT Sarang
Devi Acharya, VIT, Vellore
Sanjaya Kumar, PRSU, Raipur, India
V. Patle, PRSU, Raipur, India
Preface
This Springer LNNS volume contains the papers presented at the 1st International Conference on Machine Learning, Internet of Things and Big Data (ICMIB-2020), held from 19th to 20th September 2020 and organized by Indira Gandhi Institute of Technology (IGIT), Sarang, Odisha. The conference was originally planned for 11th to 13th July 2020, but the pandemic forced us to postpone it. The pandemic threw a lot of challenges at us, and no words of appreciation are enough for the organizing committee, who still pulled it off successfully. The conference featured some excellent technical keynote talks and papers. Two tutorial talks, by Prof. D. Manjunath, IIT Bombay, and Prof. Nitin Auluck, IIT Ropar, were held on September 18, 2020, and drew an overwhelming response. Apart from the tutorial sessions, the conference featured six keynote talks, by Prof. Raj Kumar Buyya (University of Melbourne, Australia), Prof. Md. Atiqur Rahman Ahad (University of Osaka, Japan), Prof. Yu-dong Zhang (University of Leicester, UK), Prof. Soodkhet Pojpapai (Suranaree University of Technology, Thailand), Prof. Siba K. Udgata (University of Hyderabad, India), and Mr. Aninda Bose (Springer Nature publishing house). We are grateful to all the speakers for accepting our invitation and sparing their time to deliver the talks. In spite of the adverse situation and lockdowns throughout the world, we received 120 full paper submissions, of which we accepted only 49 for presentation and publication. The contributing authors come from different parts of the globe, including the UK, China, Saudi Arabia, Zambia, Bangladesh, Pakistan and India. All the papers were reviewed by at least three independent reviewers, and in some cases by as many as six. All the papers were also checked for plagiarism and similarity score.
It was really a tough job for us to select the best papers out of so many good papers for presentation at the conference. We had to do this unpleasant task keeping the Springer guidelines and approval conditions in view. We take this opportunity to thank all the authors for their excellent work and contributions, and also the reviewers, who have done an excellent job.
On behalf of the technical committee, we are indebted to Prof. L. M. Patnaik, General Chair of the Conference, for his timely and valuable advice. We cannot imagine the conference without his active support at all the crossroads of the decision-making process. The management of the host institute, particularly the Director Prof. Satyabrata Mohanta, TEQIP Coordinator Prof. Rabindra K. Behera, Organising Chair and Convenor Prof. Srinivas Sethi, and Organising Chair Prof. S. N. Mishra, extended all possible support for the smooth conduct of the Conference. Our sincere thanks to all of them. We would also like to place on record our thanks to all the keynote speakers, tutorial speakers, reviewers, session chairs, authors, technical program committee members, the various chairs who handled finance, accommodation and publicity, and above all the many volunteers. Our sincere thanks to all sponsors and to the press, print and electronic media for their excellent coverage of this conference. We are also thankful to the Springer Nature publishing house for agreeing to publish the accepted papers in their Lecture Notes in Networks and Systems (LNNS) series.
September 2020
Dr. Siba K. Udgata, Hyderabad, India
Dr. Satish N. Srirama, Tartu, Estonia
Dr. Srinivas Sethi, Sarang, India
List of Reviewers
A. N. Sharma Abhishek Pate Ajit Sahoo Alok R. Prusty Amit Kumar Chandanan Anuroop Mrutyunjar Arunima Pattanaik Ashutosh Bhoi Ashwini Kumar Nayak Asmita Behera B. Tirimula Roa Babita Majhi Bhawani Pattanaik Bibhu Prasad Behera Bibhu Prasad Panigrahy Bibhudatta Sahoo Bighna Raj Naik Chinmaya Sahu Christy Jeba Mala A. Dhruba Charana Panda Dillip Kumar Dipak Agarwal Abhishek Thakur Abirami T. Alok Ranjan Prusty Anitha V. Aswini K. B. N. B. Ray Bibhudendu Pati Chhabi Panigrahi Chinamay Misra
Deva Priya M. Digambar Pawar Dilip Singh Jibendu Mantri Kavitha Athota Lalitha Krishna Layak Ali Manikandan R. Moumita Patra Mufti Mahamud Mahmud Nagender Kumar Suryadevara Naveen Nekuri Neelamadhab Padhi Padmalaya Naya Poongodi Chinaswami Preeti Parwekar Rajiv Senapati Sachi Pandey Saifulla Abdul Md. Sandhya Devi Sengathir J. Shakeel Ahamad Susant Sahu Swarup Roy Swetha Karima Thangarajan R. Veena Khandewal Venkat Kagita Vijender Busi Reddy Vimal Kumar K. Kamakhya Singh Khusboo Pahwa Krushna Chandra Sethi Kshirasagar Sahoo Lallit Kumar Sahoo Maheswar Behera Manmath K. Bhuyan Mihir Kumar Sutar Mrutyunjay Anuroop Mukesh Bhatre Niranjan Panigrahy Paresh Mahanty Plagiarism IGIT Sarang Pradyumna Ratha Pramod Parida
Prases Kumar Mohanty Pravat Dansena Amit Mishra Annappa Basava Hota Chittaranjan Janakiramaiah B. Latha P. Mahapatra R. P. Prasant Patnaik Rajeev Wankar Salman Moiz Samrat Sabat Sanjay Konhar Thirupathi Rao Puspalata Pujari Rajendra Prasad Nayak Rakesh Tripathy Ramesh Kumar Sahoo Ranjan Kumar Behera Ratnakar Swain S. P. Padhy Sagarika Mohanty Sambit Kumar Mishra Sampaa Sahoo Sangita Pal Sanjaya Kumar Sanjaya Panda Sanjib Mohanty Santanu Kumar Dash Sarojrani Patttnaik Saurav Bhoi Shivananda Poojara Somayya Avdut Srichandan Mohanty Srichandan Sobhanayak Sruthi P. Subasish Mohapatra Subhransu Dash Sudarshan Nandy Sukanta Bisoyi Sumit Kar Suraj Sharma Sushma Jaiswal Susil Mohanty Trilochan Rout
Trilochon Rout Vikas Pandey Vinod Patle
Contents
An Approach for Heart Disease Prediction Using Machine Learning
    Subasish Mohapatra, Jijnasee Dash, Subhadarshini Mohanty, and Arunima Hota
Low-Cost Smart Solar DC Nano-Grid for Isolated Rural Electrification: Cyber-Physical System Design and Implementation
    Ranjan K. Behera, Swati Sneha, and Rustom Kumar
Global Path Optimization of Humanoid NAO in Static Environment Using Prim’s Algorithm
    Manoj Kumar Muni, Dayal R. Parhi, Priyadarshi Biplab Kumar, Chinmaya Sahu, Prasant Ranjan Dhal, and Saroj Kumar
Weather Prediction Using Hybrid Soft Computing Models
    Suvendra Kumar Jayasingh, Jibendu Kumar Mantri, and Sipali Pradhan
FindMoviez: A Movie Recommendation System
    Ashis Kumar Padhi, Ayog Mohanty, and Sipra Sahoo
Active Filter with 2-Fuzzy Intelligent Controller: A Solution to Power Quality Problem
    Veeravelli Lakshmi Prasana, Pratap Sekhar Puhan, and Satyabrata Sahoo
Analysis of Covid Confirmed and Death Cases Using Different ML Algorithms
    G. Naga Satish, Ch. V. Raghavendran, and R. S. Murali Nath
How Good Are Classification Models in Handling Dynamic Intrusion Attacks in IoT?
    Lekhika Chettri and Swarup Roy
Sediment Rating Curve and Sediment Concentration Estimation for Mahanadi River
    Pratik Acharya, Tushar Kumar Nath, and Ram Babu Nimma
An Energy-Efficient Routing with Particle Swarm Optimization and Aggregate Data for IOT-Enabled Software-Defined Networks
    Krishnasamy Lalitha, Chinnasamy Poongodi, Shanmugam Anitha, and Duraisamy Vijay Anand
Design of IoT-Based Real-Time Video Surveillance System Using Raspberry Pi and Sensor Network
    Saroja Kanta Panda and Sushanta Kumar Sahu
Multi-agent System of Autonomous Underwater Vehicles in Octagon Formation
    Madhusmita Panda and Bikramaditya Das
Fuzzy Q-Reinforcement Learning-Based Energy Optimization in IoT Network
    Manoj Kumar, Pankaj Kumar Kashyap, and Sushil Kumar
A Circumstantial Methodological Analysis of Recent Studies on NLP-driven Test Automation Approaches
    Atulya Gupta and Rajendra Prasad Mahapatra
Plant Disease Recognition from Leaf Images Using Convolutional Neural Network
    S. Preethi, A. Arun Prakash, and R. Thangarajan
Optimum Design of Profile Modified Spur Gear Using PSO
    Jawaz Alam, Srusti Priyadarshini, Sumanta Panda, and Padmanav Dash
Benchmark of Unsupervised Machine Learning Algorithms for Condition Monitoring
    Krishna Chandra Patra, Rabi Narayan Sethi, and Dhiren Kumar Behera
Investigation of the Efficiency for Fuzzy Logic-Based MPPT Algorithm Dedicated for Standalone Low-Cost PV Systems
    Garg Priyanka, Santanu Kumar Dash, and Vangala Padmaja
Distributed Channel Assignment in Cognitive-Radio Enabled Internet of Vehicles
    Kapil Goyal and Moumita Patra
Load Reduction Using Temporal Modeling and Prediction in Periodic Sensor Networks
    Arun Avinash Chauhan and Siba K. Udgata
Direct Torque Control of Mathematically Modeled Induction Motor Drive Using PI-Type-I Fuzzy Logic Controller and Sliding Mode Controller
    Soumya Ranjan Satpathy, Soumyaranjan Pradhan, Rosalin Pradhan, Rajashree Sahu, Aparesh Prasad Biswal, and Bibhu Prasad Ganthia
Measuring the Performance of a Model Semantic Knowledge-Base for Automation of Commonsense Reasoning
    Chandan Hegde and K. Ashwini
COVID-19 Detection and Prediction Using Chest X-Ray Images
    Shreyas Mishra
Automated Precision Irrigation System Using Machine Learning and IoT
    Ashutosh Bhoi, Rajendra Prasad Nayak, Sourav Kumar Bhoi, and Srinivas Sethi
UHWSF: Univariate Holt Winter’s Based Store Sales Forecasting
    Gopal Behera, Ashok Kumar Bhoi, and Ashutosh Bhoi
A Nature-Inspired-Based Multi-objective Service Placement in Fog Computing Environment
    Hemant Kumar Apat, Kunal Bhaisare, Bibhudatta Sahoo, and Prasenjit Maiti
Advanced Binary Matrix-Based Frequent Pattern Mining Algorithm
    Pranaya Pournamashi Patro and Rajiv Senapati
Sentiment Analysis Using Semi Supervised Machine Learning Technique
    Abinash Tripathy and Alok Kumar Jena
Unconstrained Optimization Technique in Wireless Sensor Network for Energy Efficient Clustering
    Binaya Kumar Patra, Sarojananda Mishra, and Sanjay Kumar Patra
A Smartphone App Based Model for Classification of Users and Reviews (A Case Study for Tourism Application)
    Ramesh K. Sahoo, Srinivas Sethi, and Siba K. Udgata
Classification of Arrhythmia Beats Using Optimized K-Nearest Neighbor Classifier
    Mohebbanaaz, L. V. Rajani Kumari, and Y. Padma Sai
A Comparative Analysis of Fuzzy Logic-Based DTC and ST-DTC Using Three-Level Inverter for Torque Ripple Reduction
    Umakanta Mahanta, Bhabesh Chandra Mohanta, Bibhu Prasad Panigrahi, and Anup Kumar Panda
An Inference Engine Integrated with Health Parameters for Medical Web Platform
    P. S. S. Sree Dhruti, L. V. Rajani Kumari, and Y. Padma Sai
Diabetes Prediction Using Machine Learning
    Arshdeep Kaur Jaggi, Ananya Sharma, Nikhil Sharma, Ridhiman Singh, and Partha Sarathi Chakraborty
Analysis of Security Vulnerabilities of Internet of Things and It’s Solutions
    Puspanjali Mallik and Om Prakash Jena
Cognitive Function of Human Memory Using Machine Learning
    Ashima Rout, Ramesh K. Sahoo, Sangita Pal, and Divyajyoti Dehury
TB-EDA: A Trust-Based Event Detection Algorithm to Detect False Events in Software-Defined Vehicular Network
    Rajendra Prasad Nayak, Srinivas Sethi, and Sourav Kumar Bhoi
A Sensor Deployment Scheme for Fault-Tolerant Connected Probabilistic Target Coverage: A Trade-Off Among Coverage, Connectivity, and Fault Tolerance
    Pradyumna Kumar Ratha, Siba K. Udgata, and Nihar Ranjan Satapathy
An Empirical Study of Green Supply Chain Management by Using an Optimisation Tool: An Eastern India Perspective
    Bandita Sahu and Prasant Ranjan Dhal
A Fuzzy AHP Approach to Evaluate the Strategic Design Criteria of a Smart Robotic Powered Wheelchair Prototype
    Sushil Kumar Sahoo and Bibhuti Bhusan Choudhury
An Earthquake Prediction System for Bangladesh Using Deep Long Short-Term Memory Architecture
    Md. Hasan Al Banna, Tapotosh Ghosh, Kazi Abu Taher, M. Shamim Kaiser, and Mufti Mahmud
Offline Odia Handwritten Characters Recognition Using WEKA Environment
    Anupama Sahu and S. N. Mishra
Neural Network-Based Receiver in MIMO-OFDM System for Multiuser Detection in UWA Communication
    Md Rizwan Khan and Bikramaditya Das
Employing Deep Neural Network for Early Prediction of Students’ Performance
    Sachin Garg, Abdul Aleem, and Manoj Madhava Gore
Minimizing Energy and Cost Through VM Placement Using Meta-Heuristic Algorithm in Cloud Data Center
    Sudhansu Shekhar Patra, Mahendra Kumar Gourisaria, G. M. Harshvardhan, and Smruti Rekha Prusty
Mobile Cloud Computing: A Green Perspective
    Atta-ur-Rahman, Sujata Dash, Munir Ahmad, and Tahir Iqbal
Fault Detection in Differential-Based STATCOM Compensated Double Circuit Line
    S. K. Mishra, R. R. Hete, S. K. Bhuyan, L. N. Tripathy, and Ritesh Dash
Optimizing the Valid Transaction Using Reinforcement Learning-Based Blockchain Ecosystem in WSN
    P. Anitha Rajakumari and Pritee Parwekar
Wi-Fi Fingerprint Localization Based on Multi-output Least Square Support Vector Regression
    A. Christy Jeba Malar, M. Deva Priya, F. Femila, S. Sam Peter, and Viraja Ravi
Author Index
Editors and Contributors
About the Editors
Prof. Siba K. Udgata is Professor in Computer and Information Sciences at the University of Hyderabad, India. He has a Ph.D. in Computer Science in the area of mobile computing and wireless communications and worked as United Nations Fellow at UNU/IIST, Macau. His research focus is on wireless communication, mobile computing, intelligent sensors, sensor network algorithms, Internet of Things, and applications. He was Volume Editor and author for several Springer LNAI and AISC international conference proceedings and books, and also Associate Editor and editorial board member of the IOS Press KES Journal and the Elsevier AKCE International Journal of Graphs and Combinatorics. Prof. Udgata has published more than 100 research papers in reputed international journals and conference proceedings. He has worked as Principal Investigator in many Government of India funded research projects, mainly for the development of wireless sensor network applications, network security related applications and the application of swarm intelligence techniques in the cognitive radio network domain.
Dr. Srinivas Sethi is Associate Professor and has been actively involved in teaching and research in Computer Science since 1997. He did his Ph.D. in the area of routing algorithms in mobile ad hoc networks and is continuing his work in wireless sensor networks, cognitive radio networks and cloud computing. He is a member of the editorial board of different journals and a program committee member for different international conferences and workshops. He is currently working as Faculty in the Department of Computer Science Engineering and Application at Indira Gandhi Institute of Technology Sarang, India, and has published more than 50 research papers in international journals and conference proceedings. He has completed 3 research projects funded by different funding agencies such as DST, AICTE and DRDO, and is continuing with 3 other projects funded by agencies such as AICTE and NPIU.
Dr. Satish N. Srirama was a Research Professor and Head of the Mobile & Cloud Lab at the Institute of Computer Science, University of Tartu, Estonia, and a Visiting Professor at the University of Hyderabad, India. Presently he is at the School of Computer and Information Sciences, University of Hyderabad, India. He received his Ph.D. in Computer Science from RWTH Aachen University, Germany. His current research focuses on cloud computing, mobile web services, mobile cloud, Internet of Things, fog computing, migrating scientific computing and enterprise applications to the cloud, and large-scale data analytics on the cloud. He is an IEEE senior member and Editor of the 50-year-old Wiley journal Software: Practice and Experience, and was Associate Editor of IEEE Transactions on Cloud Computing and a program committee member of several international conferences and workshops. Dr. Srirama has co-authored over 150 refereed scientific publications in international conferences and journals, and has successfully managed several national, international and enterprise collaborative research grants and projects.
Contributors
Pratik Acharya, Civil Engineering Department, Indira Gandhi Institute of Technology, Sarang, Dhenkanal, Odisha, India
Munir Ahmad, BIIT, PMAS Arid Agriculture University, Rawalpindi, Pakistan
Jawaz Alam, Department of Mechanical Engineering, VSSUT, Burla, Odisha, India
Abdul Aleem, CSE Department, MNNIT Allahabad, Prayagraj, India
P. Anitha Rajakumari, Delhi-NCR Campus, SRM Institute of Science and Technology, Ghaziabad, India
Shanmugam Anitha, Department of IT, Kongu Engineering College, Erode, India
Hemant Kumar Apat, National Institute of Technology, Rourkela, Odisha, India
A. Arun Prakash, Department of Electronics and Communication Engineering, Kongu Engineering College, Erode, India
K. Ashwini, Department of Computer Science and Engineering, Global Academy of Technology (VTU), Bengaluru, India
Atta-ur-Rahman, Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
Dhiren Kumar Behera, Department of Mechanical Engineering, Indira Gandhi Institute of Technology, Sarang, Odisha, India
Gopal Behera, Department of Computer Science & Engineering, Government College of Engineering Kalahandi, Bhawanipatna, Odisha, India
Ranjan K. Behera, Department of Electrical Engineering, Indian Institute of Technology Patna, Patna, India
Kunal Bhaisare, National Institute of Technology, Rourkela, Odisha, India
Ashok Kumar Bhoi, Department of Computer Science & Engineering, Government College of Engineering Kalahandi, Bhawanipatna, India
Ashutosh Bhoi, Department of Computer Science & Engineering, Government College of Engineering Kalahandi, Bhawanipatna, Odisha, India
Sourav Kumar Bhoi, Department of Computer Science and Engineering, PMEC, Berhampur, India
S. K. Bhuyan, Electrical Engineering Department, B.I.E.T, Bhadrak, India
Aparesh Prasad Biswal, Electrical Engineering, IGIT, Sarang, Dhenkanal, Odisha, India
Partha Sarathi Chakraborty, SRM Institute of Science & Technology, Chennai, India
Arun Avinash Chauhan, University of Hyderabad, Hyderabad, India
Lekhika Chettri, Network Reconstruction & Analysis (NetRA) Lab, Department of Computer Applications, Sikkim University, Gangtok, Sikkim, India
Bibhuti Bhusan Choudhury, Indira Gandhi Institute of Technology, Sarang, India
A. Christy Jeba Malar, Department of Information Technology, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India
Bikramaditya Das, Department of Electronics and Telecommunication Engineering, VSS University of Technology, Burla, Odisha, India
Jijnasee Dash, Department of Computer Science and Engineering, College of Engineering and Technology, Bhubaneswar, India
Padmanav Dash, Department of Mechanical Engineering, VSSUT, Burla, Odisha, India
Ritesh Dash, Electrical Engineering Department, Reva University, Bengaluru, India
Santanu Kumar Dash, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India
Sujata Dash, Department of Computer Application, North Orissa University, Baripada, Odisha, India
Divyajyoti Dehury, Department of CSEA, IGIT SARANG, Dhenkanal, India
M. Deva Priya, Department of Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India
Prasant Ranjan Dhal Assistant Professor, Department of Mechanical Engineering, Indira Gandhi Institute of Technology, Sarang, Dhenkanal, Odisha, India F. Femila Department of Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India Bibhu Prasad Ganthia Electrical Engineering, IGIT, Sarang, Dhenkanal, Odisha, India Sachin Garg CSE Department, MNNIT Allahabad, Prayagraj, India Tapotosh Ghosh Bangladesh University of Professionals, Dhaka, Bangladesh Manoj Madhava Gore CSE Department, MNNIT Allahabad, Prayagraj, India Mahendra Kumar Gourisaria School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India Kapil Goyal Indian Institute of Technology Guwahati, Guwahati, Assam, India Atulya Gupta Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ghaziabad, India G. M. Harshvardhan School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, India Md. Hasan Al Banna Bangladesh University of Professionals, Dhaka, Bangladesh Chandan Hegde Research Scholar, Department of Computer Science and Engineering, Global Academy of Technology (VTU), Bengaluru, India R. R. Hete Electrical Engineering Department, G H Raisoni University, Amravati, India Arunima Hota Department of Computer Science and Engineering, College of Engineering and Technology, Bhubaneswar, India Tahir Iqbal Faculty of Software Engineering, Northeastern University, Shenyang, Liaoning, China Arshdeep Kaur Jaggi SRM Institute of Science & Technology, Chennai, India Suvendra Kumar Jayasingh PG Department of Computer Application, North Orissa University Baripada, Baripada, Odisha, India Alok Kumar Jena Gandhi Institute of Education and Technology, Gunupur, India Om Prakash Jena Ravenshaw University, Cuttack, India M. Shamim Kaiser Institute of Information Technology, Jahangirnagar University, Savar, Dhaka, Bangladesh
Pankaj Kumar Kashyap Wireless Communication and Networking Research Lab, School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India Md Rizwan Khan Department of Electronics and Telecommunication Engineering, VSS University of Technology, Burla, Odisha, India Manoj Kumar Wireless Communication and Networking Research Lab, School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India Priyadarshi Biplab Kumar Mechanical Engineering Department, National Institute of Technology Hamirpur, Hamirpur, Himachal Pradesh, India
Rustom Kumar Indian Institute of Technology Kanpur, Kanpur, India Saroj Kumar Robotics Laboratory, Mechanical Engineering Department, National Institute of Technology Rourkela, Rourkela, Odisha, India Sushil Kumar Wireless Communication and Networking Research Lab, School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India Krishnasamy Lalitha Department of IT, Kongu Engineering College, Erode, India Umakanta Mahanta Department of Electrical Engineering, Indira Gandhi Institute of Technology, Dhenkanal, Odisha, India Rajendra Prasad Mahapatra Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ghaziabad, India Mufti Mahmud Nottingham Trent University, Clifton Campus, Nottingham, UK Prasenjit Maiti National Institute of Technology, Rourkela, Odisha, India Puspanjali Mallik Shailabala Women’s(AUTO) College, Cuttack, India Jibendu Kumar Mantri PG Department of Computer Application, North Orissa University Baripada, Baripada, Odisha, India S. K. Mishra Electrical Engineering Department, G H Raisoni University, Amravati, India S. N. Mishra Department of Computer Science, Engineering and Applications, Indira Gandhi Institute of Technology, Dhenkanal, Odisha, India Sarojananda Mishra Department of Computer Science Engineering and Application, Indira Gandhi Institute of Technology, Sarang, India Shreyas Mishra National Institute of Technology, Rourkela, Odisha, India Bhabesh Chandra Mohanta Department of Electrical Engineering, Indira Gandhi Institute of Technology, Dhenkanal, Odisha, India Ayog Mohanty Department of Computer Science and Engineering, ITER, Siksha ‘O’ Anusandhan (Deemed To Be University), Bhubaneswar, India
Subhadarshini Mohanty Department of Computer Science and Engineering, College of Engineering and Technology, Bhubaneswar, India Subasish Mohapatra Department of Computer Science and Engineering, College of Engineering and Technology, Bhubaneswar, India Mohebbanaaz Department of ECE, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Manoj Kumar Muni Robotics Laboratory, Mechanical Engineering Department, National Institute of Technology Rourkela, Rourkela, Odisha, India R. S. Murali Nath Professor, Department of Computer Science and Engineering, BVRIT HYDERABAD College of Engineering for Women, Hyderabad, Telangana, India G. Naga Satish Professor, Department of Computer Science and Engineering, BVRIT HYDERABAD College of Engineering for Women, Hyderabad, Telangana, India Tushar Kumar Nath Civil Engineering Department, Indira Gandhi Institute of Technology, Sarang, Dhenkanal, Odisha, India Rajendra Prasad Nayak Department of Computer Science and Engineering, GCEK, Bhawanipatna, India Ram Babu Nimma Civil Engineering Department, Indira Gandhi Institute of Technology, Sarang, Dhenkanal, Odisha, India Ashis Kumar Padhi Department of Computer Science and Engineering, ITER, Siksha ‘O’ Anusandhan (Deemed To Be University), Bhubaneswar, India Y. Padma Sai Dept of ECE, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Vangala Padmaja VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Sangita Pal Department of CSEA, IGIT SARANG, Dhenkanal, India Anup Kumar Panda Department of Electrical Engineering, National Institute of Technology, Rourkela, Odisha, India Madhusmita Panda Department of Electronics and Telecommunication Engineering, VSSUT, Burla, Odisha, India
Saroja Kanta Panda Department of Instrumentation and Electronics Engineering, College of Engineering and Technology Bhubaneswar, Bhubaneswar, India Sumanta Panda Department of Mechanical Engineering, VSSUT, Burla, Odisha, India
Bibhu Prasad Panigrahi Department of Electrical Engineering, Indira Gandhi Institute of Technology, Dhenkanal, Odisha, India Dayal R. Parhi Robotics Laboratory, Mechanical Engineering Department, National Institute of Technology Rourkela, Rourkela, Odisha, India Pritee Parwekar Delhi-NCR Campus, SRM Institute of Science and Technology, Ghaziabad, India Binaya Kumar Patra Department of Computer Science Engineering and Application, Indira Gandhi Institute of Technology, Sarang, India Krishna Chandra Patra Department of Mechanical Engineering, Indira Gandhi Institute of Technology, Sarang, Odisha, India Moumita Patra Indian Institute of Technology Guwahati, Guwahati, Assam, India Sanjay Kumar Patra Department of Computer Science Engineering and Application, Indira Gandhi Institute of Technology, Sarang, India Sudhansu Shekhar Patra School of Computer Applications, KIIT Deemed to be University, Bhubaneswar, India Pranaya Pournamashi Patro GIET University, Gunupur, Odisha, India S. Sam Peter Department of Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India Chinnasamy Poongodi Department of IT, Kongu Engineering College, Erode, India Rosalin Pradhan Electrical Engineering, IGIT, Sarang, Dhenkanal, Odisha, India Sipali Pradhan PG Department of Computer Application, North Orissa University Baripada, Baripada, Odisha, India Soumyaranjan Pradhan Electrical Engineering, IGIT, Sarang, Dhenkanal, Odisha, India Veeravelli Lakshmi Prasana Sreenidhi Institute of Science and Technology, Hyderabad, Telengana, India S. Preethi Department of Electronics and Communication Engineering, Kongu Engineering College, Erode, India Srusti Priyadarshini Department of Mechanical Engineering, VSSUT, Burla, Odisha, India Garg Priyanka VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Smruti Rekha Prusty Department of Electronics and Communication Engineering, Silicon Institute of Technology, Bhubaneswar, India
Pratap Sekhar Puhan Sreenidhi Institute of Science and Technology, Hyderabad, Telengana, India Ch. V. Raghavendran Professor, Department of Information Technology, Aditya College of Engineering and Technology, Surampalem, AP, India L. V. Rajani Kumari Department of ECE, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Pradyumna Kumar Ratha Sambalpur University Institute of Information Technology, Sambalpur University, Sambalpur, Odisha, India
Viraja Ravi Department of Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India Ashima Rout Department of ETC Engg, IGIT SARANG, Dhenkanal, India Swarup Roy Network Reconstruction & Analysis (NetRA) Lab, Department of Computer Applications, Sikkim University, Gangtok, Sikkim, India Bibhudatta Sahoo National Institute of Technology, Rourkela, Odisha, India Ramesh K. Sahoo Department CSEA, IGIT Sarang, Dhenkanal, India Satyabrata Sahoo Nalla Malla Reddy Engineering College, Hyderabad, Telengana, India
Sipra Sahoo Department of Computer Science and Engineering, ITER, Siksha ‘O’ Anusandhan (Deemed To Be University), Bhubaneswar, India Sushil Kumar Sahoo Indira Gandhi Institution of Technology, Sarang, India Anupama Sahu Department of Computer Science, Engineering and Applications, Indira Gandhi Institute of Technology, Dhenkanal, Odisha, India Bandita Sahu Assistant Professor, Department of Mechanical Engineering, Indira Gandhi Institute of Technology, Sarang, Dhenkanal, Odisha, India Chinmaya Sahu School of Mechanical Engineering, Vellore Institute of Technology Vellore, Vellore, Tamil Nadu, India Rajashree Sahu Electrical Engineering, IGIT, Sarang, Dhenkanal, Odisha, India Sushanta Kumar Sahu Department of Instrumentation and Electronics Engineering, College of Engineering and Technology Bhubaneswar, Bhubaneswar, India Nihar Ranjan Satapathy Department of Mathematics, Sambalpur University, Burla, India Soumya Ranjan Satpathy Electrical Engineering, IGIT, Sarang, Dhenkanal, Odisha, India
Rajiv Senapati Department of CSE, SRM University, Amaravati, Andhra Pradesh, India Rabi Narayan Sethi Department of Mechanical Engineering, Indira Gandhi Institute of Technology, Sarang, Odisha, India Srinivas Sethi Department of Computer Science Engineering and Applications, IGIT Sarang, Dhenkanal, India Ananya Sharma SRM Institute of Science & Technology, Chennai, India Nikhil Sharma SRM Institute of Science & Technology, Chennai, India Ridhiman Singh SRM Institute of Science & Technology, Chennai, India Swati Sneha Banasthali University, Rajasthan, India P. S. S. Sree Dhruti Department of ECE, VNRVJIET, Hyderabad, India Kazi Abu Taher Bangladesh University of Professionals, Dhaka, Bangladesh R. Thangarajan Department of Information Technology, Kongu Engineering College, Erode, India Abinash Tripathy Raghu Engineering College, Visakhapatnam, India L. N. Tripathy Electrical Engineering Department, C.E.T., Bhubaneswar, India Siba K. Udgata School of Computer and Information Sciences, University of Hyderabad, Hyderabad, India Duraisamy Vijay Anand Department of IT, Kongu Engineering College, Erode, India
An Approach for Heart Disease Prediction Using Machine Learning
Subasish Mohapatra, Jijnasee Dash, Subhadarshini Mohanty, and Arunima Hota
Abstract The heart is a vital organ located in the chest cavity. A sudden blockage of blood flow to the heart causes a heart attack. Many people die every year because heart disease is not diagnosed properly or predicted at an early stage. Modern lifestyles and a polluted atmosphere are the main causes of the growth in the mortality rate. According to WHO data, cardiovascular diseases (CVDs) are the leading cause of death worldwide, taking an estimated 17.9 million lives every year, which is approximately 31% of all deaths around the globe. Cardiovascular illness is a major issue in India irrespective of gender and age group. Accurate early prediction can therefore save many lives. In this paper, different machine learning classification approaches are applied to the early prediction of heart disease. The results indicate that the random forest classifier produces more accurate predictions than the other competing approaches. Such predictions can be a useful aid for doctors and for chronic patients suffering from heart disease.
Keywords Classification · Machine learning · Heart disease · Support vector machine · K-nearest neighbor · Random forest · XGBoost · Decision tree
S. Mohapatra (B) · J. Dash · S. Mohanty · A. Hota Department of Computer Science and Engineering, College of Engineering and Technology, Bhubaneswar, India e-mail: [email protected] J. Dash e-mail: [email protected] S. Mohanty e-mail: [email protected] A. Hota e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_1
1 Introduction
Any disorder that affects the normal functioning of the heart is known as heart disease. The term covers blood vessel diseases such as coronary artery problems, heart rhythm problems (arrhythmias), heart defects present from birth (congenital heart problems), and many more [1, 2]. Insufficient blood flow deprives the heart of the oxygen it requires and leaves other parts of the body short of oxygen as well. There are several types of heart disease, with symptoms that can differ between men and women. Many researchers have therefore tried to provide solutions for the prediction of heart disease, but accuracy has always been a major issue. In this paper, machine learning techniques are applied and the results of different algorithms are compared to obtain higher accuracy in predicting whether or not a person has heart disease. The attributes of the dataset are examined, and the performance of each model is evaluated using a confusion matrix. The paper is organized as follows: the next section contains the literature review, and the subsequent sections present the proposed model, model analysis, result discussion, conclusion, and future work, respectively.
2 Literature Survey
Dwivedi discussed various machine learning algorithms and assessed them on the receiver operating characteristic (ROC) curve; logistic regression recorded approximately 85% accuracy [1]. Haq et al. proposed a system that helps doctors diagnose heart patients easily. They discussed several classifiers and feature selection algorithms and validated their model with ROC and AUC, but the accuracy obtained after feature reduction and optimization was not up to the mark [2]. Manogaran et al. proposed a system in which wearable body sensors measure the blood pressure, glucose level, and heart rate of patients to predict heart illness using a linear regression model with 81% accuracy [3]. Mohan et al. proposed a method for heart disease prediction with 88.7% accuracy using the hybrid random forest with a linear model (HRFLM) [4]. Desai et al. used logistic regression (LR) and back-propagation neural network (BPNN) classification models with tenfold cross-validation to predict heart illness from the attributes of the Cleveland dataset, but did not achieve notable accuracy [5]. Kannan et al. compared the accuracy of four machine learning algorithms using the ROC curve for predicting heart illness from 14 attributes of the UCI cardiac datasets [6]. Alim et al. proposed a model to predict heart disease using machine learning algorithms. They focused on finding features by using correlation on the UCI vascular heart disease dataset for robust prediction and achieved an accuracy of 86.94% [7]. These reviews make clear that prediction accuracy is still a major concern in heart disease prediction. In this paper, therefore, different machine learning techniques, namely logistic regression, KNN, SVM, decision tree, random forest, and XGBoost classifiers, are discussed. Data exploration and model validation with hyperparameter tuning are carried out to obtain a more accurate prediction of the occurrence of heart disease in a person irrespective of age group and gender.
Fig. 1 Block diagram of the proposed model
3 Proposed Model
Figure 1 shows the steps of the proposed model: collecting the data, exploring and visualizing it, and pre-processing it, since the dataset may contain null values and categorical data must be converted into dummy variables. Features are selected by examining the correlation of each attribute with the target variable. Different machine learning techniques are then applied and their accuracy in predicting heart disease is examined. After performance evaluation, hyperparameter tuning is performed to improve the accuracy rate. Finally, the confusion matrix of each technique is presented for model validation.
4 Working of Model
The dataset used is the Cleveland dataset from the UCI repository. Table 1 shows the first few rows with all 14 attributes used in this work, and Table 2 describes each attribute.
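Table 1 Basic attributes of dataset
| | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 170 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |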
Table 2 Description of each attribute of dataset
| S. No. | Attribute | Description |
|---|---|---|
| 1 | age | age in years |
| 2 | sex | 1 = male; 0 = female |
| 3 | cp | chest pain type |
| 4 | trestbps | resting blood pressure (in mm Hg on admission to the hospital) |
| 5 | chol | serum cholesterol in mg/dl |
| 6 | fbs | fasting blood sugar > 120 mg/dl (1 = true; 0 = false) |
| 7 | restecg | resting electrocardiographic results |
| 8 | thalach | maximum heart rate achieved |
| 9 | exang | exercise induced angina (1 = yes; 0 = no) |
| 10 | oldpeak | ST depression induced by exercise relative to rest |
| 11 | slope | slope of the peak exercise ST segment |
| 12 | ca | number of major vessels (0–3) colored by fluoroscopy |
| 13 | thal | 3 = normal; 6 = fixed defect; 7 = reversible defect |
| 14 | target | have disease or not (1 = yes; 0 = no) |
4.1 Understanding Data of Dataset
The UCI heart disease dataset is used with 14 attributes: 13 features and the binary target (0 or 1) that is to be predicted. The features are the independent variables and the target is the dependent variable; that is, the medical attributes of a person are the independent variables, whereas whether that person has heart disease or not is the dependent variable. First, the required libraries are imported: Pandas for data analysis, NumPy for numerical operations, Matplotlib/Seaborn for plotting and visualization, and scikit-learn for machine learning. Then the dataset is loaded. The data is in comma-separated values (.csv) format [8, 9]. Pandas has built-in functions to load the data into a data frame and visualize it.
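The loading step can be sketched as follows. This is a minimal illustration rather than the authors' script: the five rows of Table 1 are embedded as CSV text so that the example is self-contained, and the names `CSV_TEXT`, `X`, and `y` are assumptions introduced here.

```python
import io

import pandas as pd

# The five rows of Table 1 as CSV text, so the example runs without a file.
# With the full dataset one would instead call pd.read_csv on the .csv file
# (the file name is not given in the paper).
CSV_TEXT = """age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
56,1,1,120,236,0,1,170,0,0.8,2,0,2,1
57,0,0,120,354,0,1,163,1,0.6,2,0,2,1
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Separate the 13 independent variables from the dependent variable.
X = df.drop(columns="target")
y = df["target"]

print(df.head())
print(X.shape, y.shape)
```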
4.2 Data Exploration and Visualization
Exploratory data analysis (EDA) is carried out to understand the data clearly: finding the data types, detecting missing values and deciding how to deal with them, and, most importantly, deciding which features to add or remove so as to get a clear picture of the dataset.
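A few Pandas calls cover the checks listed above. The sketch below uses a small made-up frame (including one artificial missing value) purely for illustration; the column subset and values are assumptions, not the paper's data.

```python
import numpy as np
import pandas as pd

# Small illustrative frame; the values and the missing entry are made up.
df = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 45],
    "sex": [1, 1, 0, 1, 0, 0],
    "chol": [233.0, 250.0, np.nan, 236.0, 354.0, 210.0],
    "target": [1, 1, 1, 0, 0, 1],
})

dtypes = df.dtypes                                    # data type of every column
missing = df.isnull().sum()                           # missing values per column
summary = df.describe()                               # count, mean, std, quartiles
class_counts = df["target"].value_counts()            # class balance of the target
by_sex = df.groupby("sex")["target"].value_counts()   # target counts grouped by sex

print(dtypes, missing, summary, class_counts, by_sex, sep="\n\n")
```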
4.3 Data Pre-processing
After the dataset has been explored, the categorical values are converted into dummy variables using the get_dummies method, and NaN values are handled. All values are then scaled before the machine learning models are trained.
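The three pre-processing steps can be sketched as below. The paper does not say how NaN values are handled or which scaler is used, so median imputation and `StandardScaler` are assumptions made for this example, as are the toy values.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy frame with one categorical column (cp) and one missing value.
df = pd.DataFrame({
    "age": [63.0, 37.0, 41.0, 56.0, 57.0],
    "cp": [3, 2, 1, 1, 0],
    "chol": [233.0, 250.0, np.nan, 236.0, 354.0],
    "target": [1, 1, 1, 1, 1],
})

# 1. Handle NaN values (assumption: fill with the column median).
df["chol"] = df["chol"].fillna(df["chol"].median())

# 2. Convert the categorical column into dummy variables.
df = pd.get_dummies(df, columns=["cp"], prefix="cp")

# 3. Scale the continuous columns to zero mean and unit variance.
#    In a real pipeline the scaler would be fitted on the training split only.
numeric_cols = ["age", "chol"]
scaler = StandardScaler()
df[numeric_cols] = scaler.fit_transform(df[numeric_cols])

print(df)
```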
4.4 Feature Selection
A correlation matrix is plotted to understand the relationships among the attributes, especially their relationship with the target [9]. Feature selection can also be done using F or AUC.
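Correlation-based selection can be sketched as follows. The data is synthetic, the column names (`strong`, `weak`, `inverse`) are hypothetical, and the cut-off of 0.3 is an arbitrary assumption; the paper does not state a threshold.

```python
import numpy as np
import pandas as pd

# Synthetic data: two columns related to the target and one unrelated column.
rng = np.random.default_rng(0)
n = 200
target = rng.integers(0, 2, size=n)
df = pd.DataFrame({
    "strong": target + rng.normal(0.0, 0.3, n),    # positively related
    "weak": rng.normal(0.0, 1.0, n),               # unrelated noise
    "inverse": -target + rng.normal(0.0, 0.5, n),  # negatively related
    "target": target,
})

corr = df.corr()                                   # full correlation matrix
target_corr = corr["target"].drop("target")        # correlation of each feature with target
ranked = target_corr.abs().sort_values(ascending=False)

THRESHOLD = 0.3                                    # hypothetical cut-off
selected = [name for name, r in ranked.items() if r >= THRESHOLD]

print(target_corr)
print("selected features:", selected)
```

Ranking by absolute value keeps features with a strong negative correlation, which are as informative as positively correlated ones.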
4.5 Classification
Before the classifiers are applied, the dataset is split carefully to avoid data imbalance. The data is divided into a training set and a test set in an 80:20 ratio. An integer value is assigned to the random state parameter so that the same split is generated at each execution. The models are trained on the training set and evaluated on the test set. Six machine learning algorithms are used for training and testing: logistic regression, KNN, SVM, decision tree, random forest, and XGBoost. Each algorithm is explained below.
Logistic Regression
LR is a supervised machine learning technique that works with labeled data and is used to predict the probability of occurrence of a target variable [10]. The target or dependent variable has binary values, 0 (No) or 1 (Yes). The output always follows a sigmoid curve, so the predicted probability lies between 0 and 1. The equation of LR is
\[ \log\left(\frac{y}{1-y}\right) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n \tag{1} \]
where y is the dependent variable (the predicted probability), x_1, ..., x_n are the independent variables, b_0 is the intercept, and b_1, b_2, ..., b_n are the slopes.
K-nearest Neighbors Classifier
KNN is a supervised machine learning technique that can solve both classification and regression problems. It assumes that similar things lie close to each other. Selecting the right value of K is the most important task. For data unseen by the model, different K values can be tested to reduce the error and obtain better predictions. First, the number K of neighbors is selected; then the Euclidean distance from the new data point to each training point is computed, and the K nearest neighbors are selected accordingly. The number of neighbors in each category is counted, and the new data point is assigned to the category with the most neighbors. The model is then ready. The classifier is imported directly from Sklearn. The Euclidean distance between two points P = (x_1, y_1) and Q = (x_2, y_2) is
\[ D(P, Q) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{2} \]
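The K-nearest neighbors procedure described above (choose K, compute Euclidean distances, count the neighbors per class, assign the majority class) can be written out in a few lines. This from-scratch version is only an illustration of the steps; the paper uses the Sklearn implementation, and the points and helper names below are hypothetical.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two points, as in Eq. (2), for any dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(train_X, train_y, query, k=3):
    """Assign `query` to the class most common among its k nearest neighbors."""
    distances = sorted(
        (euclidean(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature training points with binary labels.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_y = [0, 0, 0, 1, 1, 1]

print(euclidean((0.0, 0.0), (3.0, 4.0)))             # 5.0
print(knn_predict(train_X, train_y, (1.2, 1.4)))     # 0
print(knn_predict(train_X, train_y, (6.8, 7.0)))     # 1
```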
Support Vector Machine (SVM) Classifier
SVM is a popular supervised machine learning technique used mainly to solve classification as well as regression problems. Given labeled training data, it learns to discriminate between two distinct classes. A hyperplane is used to distinguish clearly between the classes. Although many hyperplanes can separate the classes, the goal is to find the one with the largest margin, that is, the greatest separation between the two classes, so that a new data point arriving in the future can be classified easily. The SVM classifier is imported directly from the Sklearn library.
Decision Tree Classifier
It is a simple supervised learning technique for solving classification problems. It breaks a dataset into smaller and smaller subsets, a process known as splitting, while the associated decision tree is developed incrementally. In the pruning process, the size of the tree is reduced. The final result is a tree with decision nodes and leaf nodes. A decision node has at least two branches, and a leaf node represents a classification or decision. The topmost decision node, which corresponds to the best predictor, is called the root node. A smaller tree helps avoid overfitting, so the smallest tree that yields a low cross-validated error is sought. The following equations are used for decision making.
\[ \text{Information Gain} = \text{Entropy}(S) - \left[(\text{Weighted Avg}) \times \text{Entropy}(\text{each feature})\right] \tag{3} \]
\[ \text{Gini Index} = 1 - \sum_{j} P_j^2 \tag{4} \]
\[ \text{Entropy}(S) = -P(\text{yes}) \log_2 P(\text{yes}) - P(\text{no}) \log_2 P(\text{no}) \tag{5} \]
where S is the set of all samples, P(yes) is the probability of yes, and P(no) is the probability of no. Information gain measures the change in entropy after a dataset is split on an attribute. The Gini index is a measure of impurity or purity used when building a decision tree in the CART algorithm. Entropy measures the randomness of the data.
Random Forest Classifier
It is based on a divide-and-conquer approach: the dataset is split randomly into samples, and a separate decision tree is generated on each random sample using feature selection indicators such as information gain, gain ratio, and Gini index. The forest is the collection of these trees. Each tree, built on an independent sample, produces its own prediction. For classification, the individual predictions are put to a vote, and the class with the most votes is chosen as the result.
XGBoost Classifier
It is a decision tree-based ensemble machine learning algorithm that uses a gradient boosting framework, designed mainly for speed and good performance. Ensemble learning combines different weak models into a powerful model that yields better output than any of them alone [11]. Bagging and boosting are the most popular ensemble methods used with decision trees.
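The impurity measures of Eqs. (3)–(5), which drive the splits in the decision tree and random forest classifiers, can be computed directly. The sketch below is illustrative only; the label lists are hypothetical, and the feature name `exang` is borrowed from Table 2 merely as an example of a binary attribute.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of class labels, Eq. (5), in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index of a list of class labels, Eq. (4)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, feature_values):
    """Entropy before the split minus the weighted entropy after it, Eq. (3)."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    weighted = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


# Hypothetical labels and one hypothetical binary feature.
target = [1, 1, 1, 1, 0, 0, 0, 0]
exang = [0, 0, 0, 1, 1, 1, 1, 0]

print(entropy(target))                   # 1.0 for a perfectly balanced binary target
print(gini(target))                      # 0.5
print(information_gain(target, exang))   # about 0.189
```

A feature that separated the classes perfectly would give an information gain equal to the full entropy of the target, here 1.0.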
4.6 Hyperparameter Tuning
GridSearchCV is used to select the most suitable parameters for a model: several parameter combinations are tested by cross-validation, and the best one is extracted and used in the targeted model to improve prediction accuracy.
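A grid search over K-nearest neighbors parameters might look like the sketch below. The synthetic data from `make_classification`, the parameter grid, the fivefold cross-validation, and the random seeds are all assumptions for illustration; the paper does not list the grids it searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in with the same shape as the dataset (303 rows, 13 features).
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=6, random_state=42
)

# 80:20 split with a fixed random state for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Hypothetical parameter grid for the K-nearest neighbors classifier.
param_grid = {
    "n_neighbors": [3, 5, 7, 9, 11],
    "weights": ["uniform", "distance"],
}

search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)  # accuracy of the refitted best model
print("best parameters:", search.best_params_)
print("cross-validated accuracy:", round(search.best_score_, 4))
print("test accuracy:", round(test_accuracy, 4))
```

Because the search only sees the training split, the test accuracy remains an independent estimate of how the tuned model generalizes.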
5 Simulation and Result Discussion
In Fig. 2, a target value of 1 represents a person having heart disease and a value of 0 represents a person without heart disease. The figure shows that 165 persons have heart disease and 138 do not, so the data is reasonably balanced. Female patients account for 31.68% of the records and male patients for 68.32%.
Fig. 2 Visualization plot of target versus count grouped by sex
Figure 3 shows the number of persons affected by heart disease by age group. Figure 4 shows whether a person has heart disease with respect to age and maximum heart rate. Figure 5 shows the correlation of the different attributes with the target variable: fbs and chol have the lowest correlation with the target, whereas all other attributes have a significant correlation with it. Table 3 compares the accuracy obtained by the different classifiers. The random forest classifier performs better than the others, with an accuracy rate of 86.89%. Table 4 compares the accuracy obtained by the classifiers after hyperparameter tuning. Prediction accuracy improves for logistic regression, K-nearest neighbors, and the support vector machine, but decreases for the decision tree, random forest, and XGBoost classifiers. The tuned K-nearest neighbor classifier performs best, with an accuracy rate of 88.52%.
Fig. 3 Visualization plot of age versus count
Fig. 4 Scatter plot of age versus maximum heart rate
Fig. 5 Correlation with respect to the target
Table 3 Comparison of accuracy of different models
| | Model name | Accuracy (%) |
|---|---|---|
| 0 | Logistic regression | 85.25 |
| 1 | K-nearest neighbors | 83.61 |
| 2 | Support vector machine | 83.61 |
| 3 | Decision tree classifier | 81.97 |
| 4 | Random forest classifier | 86.89 |
| 5 | XGBoost classifier | 83.61 |
Table 4 Comparison of accuracy of different models after hyperparameter tuning
| | Model name | Accuracy (%) |
|---|---|---|
| 0 | Tuned Logistic Regression | 86.89 |
| 1 | Tuned K-nearest neighbors | 88.52 |
| 2 | Tuned Support Vector Machine | 85.25 |
| 3 | Tuned Decision Tree Classifier | 80.33 |
| 4 | Tuned Random Forest Classifier | 85.25 |
| 5 | Tuned XGBoost Classifier | 80.33 |
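A comparison of this kind can be produced with a short loop over scikit-learn estimators. The sketch below runs on synthetic data, so its numbers will not match Tables 3 and 4; the default hyperparameters, the scaling pipelines, and the random seeds are assumptions, and XGBoost is left out because it lives in a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with the same shape as the dataset (303 rows, 13 features).
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Distance- and margin-based models are scaled inside a pipeline.
models = {
    "Logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "K-nearest neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Support vector machine": make_pipeline(StandardScaler(), SVC()),
    "Decision tree classifier": DecisionTreeClassifier(random_state=0),
    "Random forest classifier": RandomForestClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = 100 * accuracy_score(y_test, model.predict(X_test))

# Print the models from most to least accurate.
for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<26s}{acc:6.2f}")
```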
Fig. 6 Confusion matrix for each classifier
Figure 6 shows the confusion matrix for each classifier. Even before parameter tuning, all models perform well, with few false positive (FP) and false negative (FN) predictions; of these, the random forest classifier performs best. Figure 7 shows the confusion matrix for each tuned classifier. After parameter tuning, accuracy improves for several models, and all models still produce few false positives and false negatives. Of these, the tuned K-nearest neighbors classifier performs best.
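The link between a confusion matrix and the reported accuracy can be made concrete with a small sketch. The label lists are hypothetical. The last lines rest on an inference not stated in the paper: with 303 records and an 80:20 split, the test set would hold 61 records, in which case 86.89% and 88.52% correspond to 53 and 54 correct predictions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical true labels and predictions for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)              # (4, 1, 1, 4)
print(accuracy(*counts))   # 0.8

# Assumed test-set size of 61 records (inferred from 303 records split 80:20).
print(round(100 * 53 / 61, 2))   # 86.89
print(round(100 * 54 / 61, 2))   # 88.52
```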
Fig. 7 Confusion matrix for each tuned classifier
6 Conclusion
In this paper, different approaches with varying parameters have been compared for the early prediction of heart disease. Different classifiers were applied and evaluated. Random forest [11] performs best before hyperparameter tuning, with an accuracy rate of 86.89%, and the tuned K-nearest neighbors classifier [12] performs best after tuning, with an accuracy rate of 88.52%. Confusion matrix plots are used to visualize the performance of the proposed model. In the future, the accuracy rate can be enhanced with the help of real clinical data and more advanced classifiers. The comparison can also be extended to multiclass data for heart disease prediction, and a more reliable feature selection method can be used to eliminate non-relevant features.
References
1. Dwivedi, A.K.: Performance evaluation of different machine learning techniques for prediction of heart disease. Neural Computing and Applications 29(10), 685–693 (2018)
2. Haq, A.U., Li, J.P., Memon, M.H., Nazir, S., Sun, R.: A hybrid intelligent system framework for the prediction of heart disease using machine learning algorithms. Mobile Information Systems 2018 (2018)
3. Manogaran, G., Lopez, D.: Health data analytics using scalable logistic regression with stochastic gradient descent. Int. J. Adv. Intel. Paradigms 10(1–2), 118–132 (2018)
4. Mohan, K.S., Chandrasegar, T., Gautam, S.: Effective heart disease prediction using hybrid machine learning techniques. IEEE Access 7, 81542–81554 (2019)
5. Desai, S.D., Girardi, S., Narayankar, P., Pudakalakatti, N.R., Sulegaon, S.: Back-propagation neural network versus logistic regression in heart disease classification. In: Advanced Computing and Communication Technologies, Springer, Singapore, pp 133–144 (2019)
6. Kannan, R., Vasanthi, V.: Machine learning algorithms with ROC curve for predicting and diagnosing the heart disease. In: Soft Computing and Medical Bioinformatics, Springer, Singapore, pp 63–72 (2019)
7. Alim, M.A., Shamsheela, H., Yumna, F., Abdul, R.: Robust heart disease prediction: a novel approach based on significant feature and ensemble learning model. In: 2020 3rd International Conference on Computing, Mathematics and Engineering Technologies (iCoMET), IEEE, pp 1–5 (2020)
8. Beunza, J.J., Puertas, E., García-Ovejero, E., Villalba, G., Condes, E., Koleva, G., Hurtado, C., Landecho, M.F.: Comparison of machine learning algorithms for clinical event prediction (risk of coronary heart disease). J. Biomed. Inf. 97, 103–257 (2019)
9. Jagtap, A., Priya, M., Omkar, B., Harshali, R.: Heart disease prediction using machine learning. Int. J. Res. Eng. Sci. Manag. 2(2), 352–355 (2019)
10. Wright, V.: Machine Learning: Using the Logistic Regression Model to Predict Coronary Heart Disease (2019)
11. Panda, D., Dash, S.R.: Predictive system: comparison of classification techniques for effective prediction of heart disease. In: Smart Intelligent Computing and Applications, Springer, Singapore, pp 203–213 (2020)
12. Vaidya, N., Kandu, M., Yadav, R.K., Bharadwaj, N.: The working of various prediction techniques for heart diseases–a case study. IJRAR-Int. J. Res. Anal. Rev. (IJRAR) 7(1), 447–453 (2020)
Low-Cost Smart Solar DC Nano-Grid for Isolated Rural Electrification: Cyber-Physical System Design and Implementation Ranjan K. Behera, Swati Sneha, and Rustom Kumar
Abstract In this paper, a smart, controllable, distributed solar DC nano-grid is developed for distributed villages in India. Initially, the components of the proposed system are modeled individually and the controller structure is investigated. A communication protocol based on a cyber-physical system is developed for controlling and coordinating the different components of the proposed system. Each solar photovoltaic (SPV) power plant is rated at 1.5 kW. The SPV is connected through a DC-DC converter to a battery and a DC load. The battery is connected through a bidirectional DC-DC converter for bidirectional power flow. The operation of the DC-DC converters ensures maximum power point tracking (MPPT) under any environmental condition. The characteristics of the communication network associated with the proposed solar energy utilization are remotely monitored, and the proposed system with the solar interface is then studied in Simulink. The simulation results are verified by an experimental prototype in the laboratory.
Keywords DC nano grid · Cyber-physical system · Solar photo-voltaic · DC-DC converters
R. K. Behera (B)
Department of Electrical Engineering, Indian Institute of Technology Patna, Patna 801103, India
e-mail: [email protected]
S. Sneha
Banasthali University, Rajasthan 304022, India
R. Kumar
Indian Institute of Technology Kanpur, Kanpur, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_2

1 Introduction
In Bihar, most rural areas do not receive electric power throughout the day and night. The Government of India has now installed solar energy systems for these rural areas, which are remotely located and not well connected to the cities. Each house has a solar DC nano-grid system of 1.2 kW capacity that supplies essential loads such as a bulb, a fan, and mobile charging. Similarly, a nearby house also has a solar DC nano-grid system. All these solar systems are distributed around the village as shown in Fig. 1. Hence, it is very difficult for technical staff to manually monitor, house by house, the health of the solar system, the power flow, the solar availability, the storage status, and the load management. Load power information and management in a distributed solar power system are challenging when the generation capacity of each plant is limited. Hence, power distribution in DC distribution grids can be improved through coordinated cyber-physical control among the distributed solar nano-grids (NGs). This paper proposes the following methods of improving the operation, control, and monitoring of the system. Initially, a solar-based intelligent DC nano-grid is developed in the laboratory as shown in Fig. 2. The proposed nano-grid structure has two DC-DC converters, used for MPPT and for DC bus control [1–6]. Another bidirectional converter is used for storage purposes. MPPT control together with the droop control method is used to handle varying loads and to extract maximum power from the solar panel [6–9]. In addition, wireless nodes are used to control loads during peak-demand hours and to interconnect the smart devices connected to each load. They are used for managing and regulating the power flow of each load individually [1–11].

Fig. 1 Structure of the proposed rural electrification unit (blocks: PV cell, MPPT boost converter, DC bus, bidirectional DC/DC converter, battery, Raspberry Pi, Internet)

Fig. 2 Different modes of control (DC bus voltage levels 1.05 Vdc, Vdc, 0.95 Vdc, and 0.9 Vdc; modes I, II, III, and IV)
2 Brief Description of the Proposed Solar System
The proposed solar DC nano-grid is shown in Fig. 1. A DC-DC boost converter extracts maximum power from the solar panel using the MPPT algorithm, and a bidirectional DC-DC converter supplies power to the battery whenever extra power is present on the DC bus. If the DC bus needs power, the battery supplies it and keeps the DC bus voltage constant [10, 11]. This does not require communication between the controllers. By measuring the DC-link voltage, the system can operate in the different modes shown in Fig. 2. A logging and remote-control system is also integrated so that the system can be monitored from a remote location. The load varies with the season because a few appliances, e.g., fans, are used only in summer. From Tables 1 and 2, the nominal available energy per day is 140 * 9 = 1260 W h during the summer, whereas it is 140 * 5 = 700 W h during the winter. Similarly, the required load energy is 1006.5 W h in summer and 574.5 W h in winter. During day time, the peak load is 150.5 W, as depicted in Fig. 3; however, the SPV can supply only 140 W. Hence, some power sharing is needed between the solar panel and the storage element to supply the required power to the load with minimal disturbance.

Table 1 Source rating
| Items | Rating | Energy per day |
|---|---|---|
| Solar | 140 W | 140 * (9 h) |
| Battery | 30 AH | |

Table 2 Total load per day
| Items | Power rating | Consumed energy per day |
|---|---|---|
| DC fan | 24 W | 24 * (18 h) |
| Tube light | 18 W | 18 * (12 h) |
| Laptop | 65 W | 65 * (3 h) |
| Mobile phone | 3.5 W | 3.5 * (1 h) |
| TV (19 color) | 40 W | 40 * (4 h) |
| Total | 150.5 W | 1006.5 W h |
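The energy figures quoted in this section can be reproduced directly from Tables 1 and 2. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative only, the 5 h winter sun duration is taken from the "140 * 5 W h" figure in the text, and the assumption that the DC fan is the only summer-only load follows the statement that fans are used only in summer.

```python
# Values copied from Tables 1 and 2 (power in W, hours of use per day).
SOLAR_POWER_W = 140.0
SUN_HOURS = {"summer": 9.0, "winter": 5.0}

LOADS = {
    "DC fan": (24.0, 18.0),
    "Tube light": (18.0, 12.0),
    "Laptop": (65.0, 3.0),
    "Mobile phone": (3.5, 1.0),
    "TV": (40.0, 4.0),
}
SUMMER_ONLY = {"DC fan"}  # assumption: only the fan is dropped in winter


def daily_load_wh(season):
    """Sum power * hours over the appliances that run in the given season."""
    total = 0.0
    for name, (power_w, hours) in LOADS.items():
        if season == "winter" and name in SUMMER_ONLY:
            continue
        total += power_w * hours
    return total


def daily_solar_wh(season):
    """Nominal energy available from the SPV per day."""
    return SOLAR_POWER_W * SUN_HOURS[season]


# Peak load if every appliance is switched on at the same time.
peak_load_w = sum(power_w for power_w, _ in LOADS.values())

for season in ("summer", "winter"):
    print(season, daily_solar_wh(season), daily_load_wh(season))
print("peak load (W):", peak_load_w,
      "shortfall vs. SPV (W):", peak_load_w - SOLAR_POWER_W)
```

The 10.5 W gap between the peak load and the SPV rating is the part that the storage element has to cover.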
Fig. 3 One room load profile data (power in watts, 0 to 160 W, versus time of day from 6:00 AM to 4:00 AM; series: DC fan, tube light, laptop, mobile phone, TV, total load)
3 Cyber-Physical Energy System Modeling
3.1 Modeling of PV
The SPV is represented by the simple diode circuit given in Fig. 4:

$$ I = I_{sc} - I_0\left[\exp\left(\frac{q(V + I R_s)}{nkT}\right) - 1\right] - \frac{V + I R_s}{R_{sh}} \qquad (1) $$

where $I_0$ is the saturation current of the diode, $q$ is the elementary charge ($1.6 \times 10^{-19}$ C), $T$ is the cell temperature in kelvin, $k$ is a constant of value $1.38 \times 10^{-23}$ J/K, $I_{sc}$ is the current generated by the photoelectric effect, and $n$ is the ideality factor of the diode. $R_s$ and $R_{sh}$ are the series and shunt resistances shown in Fig. 4. The power-voltage characteristics of the PV cell are shown in Fig. 5, and the datasheet of the solar panel is given in Table 3.
Figure 6 shows the communications in the cyber-physical energy system (CPES) of the smart DC microgrid. The modeling can be designed as follows. S1–S4 represent smart meters and sensors, R1.1–R1.4 and R2.1–R2.4 represent the communication links, and DEG1 to DEG4 are the different distributed energy sources.

Fig. 4 SPV circuit (current source $I_{sc}$, shunt resistance $R_{sh}$, series resistance $R_s$, load $R_{Load}$ with terminal voltage $V$ and current $I$)
Fig. 5 PV characteristics
Table 3 Data specification of solar panel
| Parameter | Value |
|---|---|
| Configuration | Single glass laminated type with 36 numbers of solar cells in series configuration |
| Weight | 7.5 kg |
| Open-circuit voltage (V_oc) | 21.0 V |
| Short-circuit current (I_sc) | 4.6 Amp |
| Voltage at peak power point (V_m) | 16.4 V |
| Peak power (P_max) | 70 Wp or 75 Wp |
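Equation (1) is implicit in the current, because I appears on both sides. As an illustration only, the sketch below solves it by bisection using the short-circuit current and open-circuit voltage from Table 3. The series resistance, shunt resistance, effective ideality factor (36 cells times an assumed per-cell value of 1.3), and cell temperature are hypothetical values that are not given in the paper, and the saturation current is estimated from the open-circuit condition with the shunt loss neglected.

```python
import math

Q = 1.6e-19     # elementary charge (C), as quoted in the text
K_B = 1.38e-23  # constant k (J/K), as quoted in the text


def pv_current(v, i_sc=4.6, v_oc=21.0, n_eff=46.8, temp_k=298.15,
               r_s=0.2, r_sh=200.0):
    """Solve Eq. (1) for I at terminal voltage v, for 0 <= v <= v_oc.

    i_sc and v_oc come from Table 3; n_eff, temp_k, r_s and r_sh are
    hypothetical values chosen only to make the example concrete.
    """
    v_t = n_eff * K_B * temp_k / Q             # n k T / q for the series string
    # Assumption: I = 0 at V = V_oc with the shunt term neglected.
    i_0 = i_sc / (math.exp(v_oc / v_t) - 1.0)

    def residual(i):
        v_d = v + i * r_s                      # voltage across the diode branch
        return i_sc - i_0 * (math.exp(v_d / v_t) - 1.0) - v_d / r_sh - i

    # residual() decreases strictly with i, so bisection on a bracket works.
    lo, hi = -i_sc, i_sc
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for v in (0.0, 16.4, 21.0):
    i = pv_current(v)
    print(f"V = {v:5.1f} V  I = {i:6.3f} A  P = {v * i:6.2f} W")
```

Sweeping the voltage from 0 to the open-circuit value and plotting V times I gives a power-voltage curve of the kind referred to in Fig. 5; the exact shape depends on the assumed parameters.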
Fig. 6 Proposed CPES for DC microgrid (DC bus with sensors S1–S4, communication links R1.1–R1.4 and R2.1–R2.4, and distributed energy sources DEG 1–DEG 4)
3.2 MIMO System Model
We assume that there are $N_S$ sensors and $N_C$ controllers in the CPS, that the system dynamics are linear, and that there is no perturbation. According to the MIMO model, the system can be written as

$$ \dot{X}(t) = A X(t) + B U(t), \qquad Y(t) = C X(t) \qquad (2) $$

where $X$ is the system state, $U$ is the control input, $Y$ is the sensor observation, and $A$, $B$, and $C$ are constant matrices of appropriate dimensions.
3.3 Controller Structure
As a single routing mode is used here, the control law is given as

$$ U(t) = K Y(t) \qquad (3) $$

where $K$ is the feedback gain. The matrix $K$ has a special structure:

$$ k_{ij} \begin{cases} = 0, & \text{if there is no connection between sensor } j \text{ and controller } i \\ \neq 0, & \text{otherwise} \end{cases} \qquad (4) $$

If there is no relation between sensor $j$ and controller $i$, then $k_{ij} = 0$, which means that the control action $u_i$ does not depend on, and has no knowledge of, the sensor observation $y_j$. Consider the example in Fig. 6, which consists of four sensors and four controllers in the MIMO system. Now, let information from sensors 1 and 4 be sent to controller 1 and data from sensors 2 and 4 be sent to controllers 2 and 3; then we can write

$$ K = \begin{bmatrix} k_{11} & 0 & 0 & k_{14} \\ 0 & k_{22} & 0 & 0 \\ 0 & 0 & 0 & k_{34} \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad (5) $$

Substituting (3) in (2), the closed-loop system dynamics are as follows:

$$ \dot{X}(t) = A X(t) + B K C X(t) = \bar{A} X(t) \qquad (6) $$

where $\bar{A} = A + B K C$.
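To make the structured gain concrete, the sketch below builds a gain matrix with the nonzero pattern of Eq. (5) and forms the closed-loop matrix A + BKC of Eq. (6). It uses plain Python lists so that it stays self-contained. The numeric gain values, the diagonal plant matrix, and the identity choices for B and C are hypothetical and serve only as an example; the paper does not give numerical values.

```python
def matmul(p, q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def structured_gain(n_ctrl, n_sens, gains):
    """Build K with k_ij != 0 only where sensor j is linked to controller i."""
    K = [[0.0] * n_sens for _ in range(n_ctrl)]
    for (i, j), value in gains.items():
        K[i][j] = value
    return K


def closed_loop(A, B, K, C):
    """Return the closed-loop matrix A + B K C from Eq. (6)."""
    BKC = matmul(matmul(B, K), C)
    return [[A[i][j] + BKC[i][j] for j in range(len(A))]
            for i in range(len(A))]


# Nonzero pattern of Eq. (5), zero-based (controller, sensor) indices.
# The gain values themselves are hypothetical.
gains = {(0, 0): -0.5, (0, 3): -0.1, (1, 1): -0.4, (2, 3): -0.2}
K = structured_gain(4, 4, gains)

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
A = [[-1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # hypothetical plant
A_bar = closed_loop(A, identity, K, identity)
for row in A_bar:
    print(row)
```

With B and C set to the identity, the closed-loop matrix simply adds the gain pattern to A, which makes it easy to see which sensor observations influence which controller.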
3.4 Model of Communication Network
The communication network is either wired or wireless. There are $N_C$ controllers, $N_S$ sensors, and $N_r$ relay nodes. The three types of nodes are denoted by $\{n_c\}_{n_c = 1, \dots, N_C}$, $\{n_s\}_{n_s = 1, \dots, N_S}$, and $\{n_r\}_{n_r = 1, \dots, N_r}$, where the subscripts denote the type of node. If a wired network is used, the wired links determine the topology; in the case of a wireless network, the distances among the nodes determine the topology. Various routing techniques have evolved to establish proper communication between the sensors and controllers. To simplify the analysis, the following assumptions are made:
• Fluid traffic: The data from sensors to controllers is continuous. This is valid as the sampling rate is high and the quantization error is low. The channel is fast compared to the physical system dynamics.
• Bandwidth constraint: Each sensor data flow passing through a link between two different nodes, say (a, b), requires one unit of bandwidth. The bandwidth of the link, denoted by $W_{ab}$, is assumed to be an integer value, so that

$$ \sum_{n_s = 1}^{N_S} I(n_s, a, b) \le W_{ab} \qquad (7) $$

where $I(n_s, a, b) = 1$ if the data flow of sensor $n_s$ passes through link $ab$, and $I(n_s, a, b) = 0$ otherwise.
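The bandwidth constraint in Eq. (7) can be checked mechanically once a route is fixed for every sensor. The following is a minimal sketch under the assumption that each sensor's flow follows a single simple path (so it uses any link at most once); the node names, routes, and capacities are hypothetical and are not taken from the paper.

```python
from collections import Counter


def link_loads(routes):
    """Count, for every directed link (a, b), how many sensor flows use it."""
    loads = Counter()
    for path in routes.values():
        for a, b in zip(path, path[1:]):
            loads[(a, b)] += 1  # I(n_s, a, b) = 1 for this sensor and link
    return loads


def violated_links(routes, capacity):
    """Return the links whose flow count exceeds W_ab in Eq. (7)."""
    loads = link_loads(routes)
    return {link: n for link, n in loads.items()
            if n > capacity.get(link, 0)}


# Hypothetical topology: sensors S1 and S2 reach controller C1 via relay R1.
routes = {"S1": ["S1", "R1", "C1"], "S2": ["S2", "R1", "C1"]}
capacity = {("S1", "R1"): 1, ("S2", "R1"): 1, ("R1", "C1"): 1}
print(violated_links(routes, capacity))
```

In this example both flows share the relay-to-controller link, so a capacity of one unit on that link is exceeded; raising it to two units, or rerouting one sensor, satisfies the constraint.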
4 Results and Discussion
Figure 7 shows the voltage across the load, the voltage of the DC-link capacitor, the SPV current, and the battery current. When sunlight is present and the battery is not fully charged (Mode III), the boost converter operates at the maximum power point, and any extra power present at the DC bus charges the battery. During night time, or when solar power is unavailable, the battery supplies power to the load. Figure 8 shows the current profile of the battery with different loads. Initially there was no load, so the battery current was zero; when the load was suddenly increased, the battery current increased, and after some time the load was decreased, and so on. Figure 9 shows the voltage and current waveforms of the system during the operating mode transition (mode III to mode II). At first, the SPV was supplying power to the load and the battery storage. As soon as the solar power goes to zero, the system starts operating in mode II. In this mode, the load is carried by the battery. Figure 10 shows the existing solar plant at IIT Patna.
Figures 11, 12, and 13 show the online monitoring portal, where users can monitor the real-time data of the system and control the appliances connected to it. The monitoring page has five subsections. On the left side, there are four online switches to control the loads. In the middle, there are three online plots: the first for voltage, the second for current, and the last for the power of the system. On the right side is the battery charge indicator, which shows the charge status of the battery. Figure 14 shows the experimental setup of the intelligent DC nano-grid.
Fig. 7 Voltage across the load, DC-link voltage, SPV current, current in battery
Fig. 8 Voltage across the load, DC-link voltage, SPV current, current in battery w.r.t different load during night time
Fig. 9 Voltage across the load, DC-link voltage, SPV current, current in battery during transition from modes III to II
Fig. 10 Solar plant at IIT Patna
5 Conclusions
A DC microgrid system with a suitable low-cost communication protocol is presented in this paper for rural applications. The low-cost communication protocol with CPS is designed for an online information system with a suitable load and source management system. The integration of the battery and the grid-connected system will provide an effective, reliable, and durable DC microgrid system. The proposed DC microgrid is analyzed through a set of simulations and experiments for verification.

Fig. 11 Online monitoring portal

Fig. 12 Online monitoring portal with load disconnected
Fig. 13 Online monitoring portal with battery-load disconnected
Fig. 14 Experimental setup
References
1. Qian, C.: Solar Power Conversion: A System Solution to Alternative Energy Demand, Application Notes. Microsemi’s Power Products Group, Bend, USA (2010)
2. Bose, B.K., Szczeny, P.M., Steigerwald, R.L.: Microcomputer control of a residential photovoltaic power conditioning system. IEEE Trans. Ind. Appl. IA-21, 1182–1191 (1985)
3. Chiang, S.J., Chang, K.T., Yen, C.Y.: Residential energy storage system. IEEE Trans. Ind. Electron. 45, 385–394 (1998)
4. Sugimoto, H., Dong, H.: A new scheme for maximum photovoltaic power tracking control. In: Proceedings of IEEE PCC-Nagaoka ’97, pp 691–696 (1997)
5. Enslin, J.H.R., Wolf, M.S., Snyman, D.B., Sweigers, W.: Integrated photovoltaic maximum power point tracking converter. IEEE Trans. Ind. Electron. 44, 769–773 (1997)
6. Hua, C., Lin, J., Shen, C.: Implementation of a DSP-controlled photovoltaic system with peak power tracking. IEEE Trans. Ind. Electron. 45, 99–107 (1998)
7. Hussein, K.H., Muta, I., Hoshino, T., Osakada, M.: Maximum photovoltaic power tracking: an algorithm for rapidly changing atmospheric conditions. Proc. IEEE Gen. Transm. Distrib. 142(1), 59–64 (1995)
8. Lohner, A., Meyer, T., Nagel, A.: A new panel-integratable inverter concept for grid-connected photovoltaic systems. In: Proceedings of IEEE International Symposium of Industrial Electronics, pp 827–831 (1996)
9. Meinhardt, M., et al.: Miniaturized low-profile module integrated converter for photovoltaic applications with integrated magnetic components. In: Proceedings of IEEE Applications Power Electronics Conference Expo, pp 305–311 (1999)
10. Om, R., Yamashiro, S., Mazumder, R.K., Nakamura, K., Mitsui, K., Yamagishi, M., Okamura, M.: Design and performance evaluation of grid connected PV-ECS system with load leveling function. IEEE J. Trans. Power Energy 121-B, 1112–1119 (2001)
11. Behera, R.K.: Highly efficient solar energy harvesting system for Bihar green energy initiative, Manthan. Int. J. Sci. Res. Innov. 12, 25–28 (2011)
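Global Path Optimization of Humanoid NAO in Static Environment Using Prim’s Algorithm
Manoj Kumar Muni, Dayal R. Parhi, Priyadarshi Biplab Kumar, Chinmaya Sahu, Prasant Ranjan Dhal, and Saroj Kumar

Abstract This paper focuses on the navigation of a humanoid robot in a static environment cluttered with obstacles, avoiding collisions by using Prim’s algorithm. Prim’s algorithm is a minimum spanning tree (MST) method with a greedy approach that uses the concept of sets. It generates the MST by selecting the least weights from the weighted graph: it randomly forms disjoint sets and picks one least-weight edge from those remaining that is incident to a node, in order to form the tree. The same approach is repeated until all ‘n – 1’ edges of the tree are selected, and the resulting tree gives the path direction to the humanoid NAO. The developed algorithm is implemented on both simulation and experimental platforms to obtain the navigational results. The simulation and experimental results confirm the efficiency of the path planning strategy, as the percentage deviation of the navigational parameters is below 6%.
Keywords Humanoid NAO · Prim’s algorithm · V-REP · Simulation · Experiment · Probability plot

M. K. Muni (B) · D. R. Parhi · S. Kumar
Robotics Laboratory, Mechanical Engineering Department, National Institute of Technology Rourkela, Rourkela 769008, Odisha, India
e-mail: [email protected]
D. R. Parhi
e-mail: [email protected]
S. Kumar
e-mail: [email protected]
P. B. Kumar
Mechanical Engineering Department, National Institute of Technology Hamirpur, Hamirpur 177005, Himachal Pradesh, India
e-mail: [email protected]
C. Sahu
School of Mechanical Engineering, Vellore Institute of Technology Vellore, Vellore 632014, Tamil Nadu, India
P. R. Dhal
Mechanical Engineering Department, Indira Gandhi Institute of Technology, Sarang 759146, Odisha, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_3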
Abstract This paper focuses on navigation of a humanoid robot cluttered with obstacles, avoiding collisions in static environment using Prim’s algorithm. Prim’s algorithm is a minimum spanning tree (MST) method with greedy approach which uses the concept of sets. It generates the MST by selecting least weights from the weighted graph and randomly forms disjoint sets with picking one least weight edge from the ones remaining for creating node incident to form the tree. Similar approach repeats for selecting all ‘n – 1’ edges to the tree which is the path direction to humanoid NAO. The developed algorithm is implemented in both simulation and experimental platforms to obtain the navigational results. The simulation and experimental navigational results confirm the efficiency of the path planning strategy as the percentage of deviations of navigational parameters is below 6%. Keywords Humanoid NAO · Prim’s algorithm · V-REP · Simulation · Experiment · Probability plot M. K. Muni (B) · D. R. Parhi · S. Kumar Robotics Laboratory, Mechanical Engineering Department, National Institute of Technology Rourkela, Rourkela 769008, Odisha, India e-mail: [email protected] D. R. Parhi e-mail: [email protected] S. Kumar e-mail: [email protected] P. B. Kumar Mechanical Engineering Department, National Institute of Technology Hamirpur, Hamirpur 177005, Himachal Pradesh, India e-mail: [email protected] C. Sahu School of Mechanical Engineering, Vellore Institute of Technology Vellore, Vellore 632014, Tamil Nadu, India P. R. Dhal Mechanical Engineering Department, Indira Gandhi Institute of Technology, Sarang 759146, Odisha, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_3
25
26
M. K. Muni et al.
1 Introduction
Path planning is a challenging and key issue for researchers working on any kind of robot, such as mobile robots, industrial robots, bipeds, underwater robots, and humanoids. Nowadays, robots are widely used in military and medical applications. For a robot to succeed, its motion analysis must be accurate and optimal. In this paper, Prim’s algorithm is studied and applied to global optimal path planning of a humanoid robot in a static environment. Various researchers have used intelligent techniques such as neural networks, fuzzy systems, regression analysis, and the A* algorithm for the navigation of mobile robots and industrial robots. Very little research has been found on effective navigation of humanoids.
Deepak and Parhi [1] presented a particle swarm optimization-based navigational approach for mobile robots in various unknown environments. Mohanty and Parhi [2] proposed a new multiple adaptive neuro-fuzzy system for path planning of an autonomous mobile robot between two locations. Pandey and Parhi [3] designed a minimum rule-based Mamdani fuzzy controller in MATLAB for navigation of mobile robots in environments cluttered with obstacles. Kundu and Parhi [4] employed an evolutionary dynamic adaptive harmony search technique for motion planning of an underwater robot with obstacle avoidance and minimization of path length. Kumar et al. [5] combined two strategies, regression analysis and a genetic algorithm, for intelligent navigation of multiple humanoids in environments cluttered with obstacles. Kumar et al. [6] used classical regression analysis and a fuzzy computational intelligence approach for motion analysis of humanoid NAO in unknown environments. Fridovich-Keil et al. [7] employed a predictive model with a degree of confidence tied to its accuracy in predicting future motion, generating probabilistic motion predictions for robot motion planning. Sadhu et al.
[8] proposed Q-learning induced firefly algorithm (QFA) technique for global optimal path planning of robotic arm with balanced exploration and exploitation in real environment. Panda et al. [9] hybridized invasive weed population based technique with firefly algorithm to avoid premature convergence and static environment motion planning of multiple robots. Kumar et al. [10,11] designed an advanced regression controller and hybridized regression fuzzy controller for static and dynamic motion planning of humanoid robots in complex environments. Parhi and Kumar [12] employed novel virtual target shifting DAYKUN-BIP strategy with petri-net approach for smart and smooth path navigation of humanoids in unknown scenarios. Kumar and Parhi [13] proposed hybridized regression-genetic intelligent algorithm for motion analysis of humanoid robots in cluttered environments. Pandey and Parhi [14] applied behavior-based neural network (BBN) approach to mobile robots for goal search and planning of trajectory with obstacle avoidance. Park et al. [15] presented intention-aware online motion planning technique with predicted actions through temporal coherence for motion planning of high degree of freedom robots. Patle et al. [16] presented fuzzy-based decision controller with probability distribution for navigation of multiple mobile robots in dynamic environment. Chen et al. [17] proposed a novel bio-inspired neural dynamics approach for path planning of autonomous mobile robots in unknown
dynamic platforms. Kim and Yoon [18] applied a hybridized path planning technique based on sampling and optimization methods, with lazy PRM and batch informed trees, for trajectory optimization of robots. Wen et al. [19] carried out an active simultaneous localization and mapping (SLAM) approach with deep reinforcement learning for autonomous path planning of robots in complex environments. Zhong et al. [20] presented a hybridized A* algorithm and adaptive window strategy for hybrid motion planning of mobile robots in large-scale unknown environments. Sun et al. [21] carried out smooth and collision-free navigation of a Mecanum-wheeled robot by using an improved safe-connect RRT based on the Voronoi diagram with a cubic spline method. Moysis et al. [22] proposed a chaotic motion planning approach for covering environment terrain with unpredictable motion and utilized a logistic map modulo function to provide directions to autonomous mobile robots during navigation. Muni et al. [23–28] presented a rule-based technique, bacterial foraging optimization, a grey wolf optimization controller, and Sugeno fuzzy logic analysis for effective navigation of humanoid NAOs in complex environments. Kumar et al. [29–31] proposed fuzzified ant colony optimization, modified tabu search, and a hybridized sine-cosine and ant colony technique for control and optimal path search of mobile robots in complex environments. Kumar et al. [32–35] proposed several intelligent approaches for smooth navigation of humanoid robots.
From the literature survey, it is concluded that although much work has been done on robot navigation, the motion analysis problem has not yet been considered from a minimum spanning tree viewpoint. Therefore, in this research work, Prim’s algorithm is considered, which uses the concepts of edges and sets in forming the tree, thus providing a global optimal path to the humanoid during motion planning.
2 Prim’s Algorithm
The main advantage of Prim’s algorithm is that it forms the minimum spanning tree (MST) with a greedy approach [36], taking the global minimum obstacle distances during navigation between two locations, the source vertex and the target vertex. Whenever the sensor senses an obstacle in its path, the PA controller activates and the humanoid travels in the global optimal direction towards the target while avoiding obstacles.
Steps Considered for Scheming the PA Controller
1. The humanoid selects the source vertex and the target vertex.
2. The humanoid starts to move towards the target and, whenever it senses an obstacle, selects the vertex with the minimum distance to the target.
3. Obtain all vertices, with their distance values, that connect to the source vertex, find the vertex with the minimum distance among them, and connect it to the previous edge.
4. If the selected vertex forms a cycle with the maximum distance value, reject it, select the next least vertex, and connect the new edge.
5. Continue the above steps until the global minimum path traversal tree for humanoid navigation is obtained.
In the developed PA controller, the sensor outputs, i.e., the obstacle distances from the target (which are the edge distances from the source vertex in each iteration), play a significant role in selecting the vertices for global minimal motion analysis of humanoid robots in complex environments while avoiding trapping at local minima.
2.1 Pseudocode of Prim’s Algorithm
Figure 1 shows the pseudocode of the developed PA controller.

Input: adjacency matrix, adjmatrix[a][b] = weight of obstacle edge (a, b)
Initialize inMST[a] = false for all a
Initialize priority[a] = infinity for all a
priority[0] = 0
numVerticesAdded = 0
While (numVerticesAdded < numVertices)
    Vr = vertex with lowest priority that is not in MST
    inMST[Vr] = true
    numVerticesAdded = numVerticesAdded + 1
    For a = 0 to numVertices – 1
        If a ≠ Vr and inMST[a] = false and adjmatrix[Vr][a] > 0
            If priority[a] > adjmatrix[Vr][a]
                priority[a] = adjmatrix[Vr][a]
                predecessor[a] = Vr
            End if
        End if
    End for
End while
Tree matrix = adjacency matrix representation of the tree, built from the predecessor array as the collection of edges with the lowest weight values
Return tree matrix
Output: adjacency matrix representation of the MST to guide humanoid NAO for motion planning
Fig. 1 Pseudocode of developed PA controller
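The pseudocode in Fig. 1 maps closely onto a short adjacency-matrix implementation of Prim’s algorithm. The sketch below is an illustrative Python version, not the authors’ controller code: the function names and the five-vertex example graph are hypothetical, and it assumes a connected, undirected graph in which a weight of 0 means "no edge".

```python
import math


def prim_mst(adj):
    """Return the predecessor list of an MST of a connected, undirected graph.

    adj[a][b] > 0 is the weight of edge (a, b); 0 means there is no edge.
    """
    n = len(adj)
    in_mst = [False] * n
    priority = [math.inf] * n
    predecessor = [None] * n
    priority[0] = 0.0  # start growing the tree from vertex 0

    for _ in range(n):
        # Cheapest vertex that is not yet part of the tree.
        vr = min((v for v in range(n) if not in_mst[v]),
                 key=lambda v: priority[v])
        in_mst[vr] = True
        # Relax the edges leaving vr towards vertices still outside the tree.
        for a in range(n):
            w = adj[vr][a]
            if w > 0 and not in_mst[a] and w < priority[a]:
                priority[a] = w
                predecessor[a] = vr
    return predecessor


def tree_edges(adj, predecessor):
    """List the MST edges as (parent, child, weight) triples."""
    return [(p, v, adj[p][v]) for v, p in enumerate(predecessor)
            if p is not None]


# Hypothetical 5-vertex weighted graph (symmetric adjacency matrix).
graph = [
    [0, 2, 0, 6, 0],
    [2, 0, 3, 8, 5],
    [0, 3, 0, 0, 7],
    [6, 8, 0, 0, 9],
    [0, 5, 7, 9, 0],
]
edges = tree_edges(graph, prim_mst(graph))
print(edges, sum(w for _, _, w in edges))
```

The linear scan for the cheapest vertex gives O(n²) running time, which matches the adjacency-matrix formulation of the pseudocode; a binary heap would be the usual choice for sparse graphs.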
3 Implementation of Prim’s Algorithm for Simulation and Experiment Investigation
The developed PA controller is implemented on a humanoid H25 V4 NAO to test its effectiveness through simulation and experimental investigation. The NAO considered for the research analysis has 25 degrees of freedom and weighs 5.2 kg. A simulation arena (280 × 200 units) is created using the V-REP software, and the program is supplied in the Python language to obtain the simulation result. Similarly, under laboratory conditions, the same arena (280 × 200 units) is created to obtain the experimental result through the Choregraphe tool. Figures 2 and 3 show the simulation and experimental results obtained from the developed PA controller, with the path direction. Table 1 lists the path length covered and the path time taken by the humanoid during the simulation and experimental investigations, along with the percentage of deviation. From Table 1, the simulation and experimental results are close to each other, with an average deviation of 5.21% in path length covered and 5.82% in path time taken. The developed controller worked effectively in obtaining the global optimal path between the source and target locations within an acceptable range of error.

Fig. 2 Simulation result obtained from PA controller by humanoid NAO (panels a–f; labels: humanoid NAO, source, target)

Fig. 3 Experiment result obtained from PA controller by humanoid NAO (panels a–f; labels: humanoid NAO, source, target)

Table 1 Comparison of path length covered and path time taken between simulation and experimental results for humanoid NAO
| S. No. | Simulation: path length covered (cm) | Simulation: path time taken (s) | Experiment: path length covered (cm) | Experiment: path time taken (s) | % of deviation: path length covered | % of deviation: path time taken |
|---|---|---|---|---|---|---|
| 1 | 384.23 | 75.38 | 405.4 | 80.17 | 5.22 | 5.97 |
| 2 | 385.47 | 76.24 | 406.1 | 81.07 | 5.08 | 5.96 |
| 3 | 384.69 | 75.52 | 405.9 | 80.31 | 5.23 | 5.96 |
| 4 | 385.71 | 75.87 | 406.8 | 80.42 | 5.18 | 5.66 |
| 5 | 385.54 | 76.04 | 406.9 | 80.62 | 5.25 | 5.68 |
| 6 | 385.13 | 75.81 | 406.2 | 80.52 | 5.19 | 5.85 |
| 7 | 384.89 | 75.72 | 406.6 | 80.33 | 5.34 | 5.74 |
| 8 | 384.79 | 76.12 | 405.8 | 80.92 | 5.18 | 5.93 |
| 9 | 385.17 | 76.39 | 406.5 | 81.07 | 5.25 | 5.77 |
| 10 | 385.45 | 75.55 | 406.7 | 80.17 | 5.22 | 5.76 |
| Avg. | 385.1 | 75.86 | 406.2 | 80.56 | 5.21 | 5.82 |
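The paper does not state how the percentage of deviation in Table 1 is computed. The short sketch below uses the definition that appears consistent with the tabulated values, namely the difference between the experimental and simulated values taken relative to the experimental value; this formula is an inference from the data, not a statement from the authors.

```python
def percent_deviation(simulated, experimental):
    """Deviation of the simulated value relative to the experimental one, in %."""
    return (experimental - simulated) / experimental * 100.0


# First two rows of Table 1:
# (sim length in cm, sim time in s, exp length in cm, exp time in s)
rows = [
    (384.23, 75.38, 405.4, 80.17),
    (385.47, 76.24, 406.1, 81.07),
]
for sim_len, sim_t, exp_len, exp_t in rows:
    print(round(percent_deviation(sim_len, exp_len), 2),
          round(percent_deviation(sim_t, exp_t), 2))
```

Under this definition the two rows evaluate to 5.22 and 5.97, and 5.08 and 5.96, which matches the corresponding entries of Table 1.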
4 Probability Plot Between Simulation and Experimental Results
The simulation and experimental results obtained from the PA controller are analyzed using probability plots based on the normal distribution to test the effectiveness and efficiency of the developed controller. Figure 4 compares the probability plots of the simulation and experimental results. The notations used in the probability plots are PLC = Path Length Covered and PTT = Path Time Taken. The comparison indicates that the results obtained from the simulation and experimental platforms are in good agreement with each other, with a minimal amount of deviation.
Fig. 4 a, b Probability plot of simulation and experiment path length covered by humanoid NAO, c, d probability plot of simulation and experiment path time taken by humanoid NAO
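For readers who want to reproduce a normal probability plot of this kind, the sketch below computes the plot coordinates for the simulated path lengths in Table 1 using only the Python standard library. The paper does not say which plotting positions or software were used; Blom’s plotting position is assumed here purely for illustration, and the function name is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normal_probability_points(samples):
    """Pair each sorted sample with its theoretical normal quantile."""
    ordered = sorted(samples)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    points = []
    for rank, x in enumerate(ordered, start=1):
        p = (rank - 0.375) / (n + 0.25)  # Blom plotting position (assumed)
        points.append((fitted.inv_cdf(p), x))
    return points


# Simulated path length covered (cm), from Table 1.
sim_path_length = [384.23, 385.47, 384.69, 385.71, 385.54,
                   385.13, 384.89, 384.79, 385.17, 385.45]

for theoretical, observed in normal_probability_points(sim_path_length):
    print(f"{theoretical:8.3f}  {observed:8.2f}")
```

If the observed values fall close to a straight line against the theoretical quantiles, the data are consistent with a normal distribution; the same function can be applied to the experimental columns for comparison.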
5 Conclusions
In this research, a Prim’s algorithm controller is designed and developed for effective navigation of humanoid NAO in complex environments cluttered with different obstacles, using a minimum spanning tree approach. The developed controller is tested on both simulation and experimental platforms. The results show hassle-free navigation of the humanoid robot, with a percentage of deviation between simulation and experiment below 6%. The effectiveness of the developed PA controller is tested by probability plots drawn through the normal distribution for the simulation and experimental results. The probability plot comparisons depict the efficiency of the algorithm for motion planning. The developed controller can be used by any robot for motion planning in any environment containing obstacles of different shapes.
References 1. Deepak, B.B.V.L., Parhi, D.: PSO based path planner of an autonomous mobile robot. Open Comput. Sci. 2(2), 152–168 (2012) 2. Mohanty, P.K., Parhi, D.R.: Path planning strategy for mobile robot navigation using MANFIS controller. In: Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA), pp 353–361. Springer, Cham (2014) 3. Pandey, A., Parhi, D.R.: MATLAB simulation for mobile robot navigation with hurdles in cluttered environment using minimum rule based fuzzy logic controller. Procedia Technol. 14(1), 28–34 (2014) 4. Kundu, S., Parhi, D.R.: Navigation of underwater robot based on dynamically adaptive harmony search algorithm. Memetic Comput. 8(2), 125–146 (2016) 5. Kumar, A., Kumar, P.B., Parhi, D.R.: Intelligent navigation of humanoids in cluttered environments using regression analysis and genetic algorithm. Arab. J. Sci. Eng. 43(12), 7655–7678 (2018) 6. Kumar, P.B., Mohapatra, S., Parhi, D.R.: An intelligent navigation of humanoid NAO in the light of classical approach and computational intelligence. Comput. Anim. Virtual Worlds 30(2), e1858 (2019) 7. Fridovich-Keil, D., Bajcsy, A., Fisac, J.F., Herbert, S.L., Wang, S., Dragan, A.D., Tomlin, C.J.: Confidence-aware motion prediction for real-time collision avoidance. Int. J. Robot. Res. 39(2–3), 250–265 (2020) 8. Sadhu, A.K., Konar, A., Bhattacharjee, T., Das, S.: Synergism of firefly algorithm and Qlearning for robot arm path planning. Swarm Evol. Comput. 43, 50–68 (2018) 9. Panda, M.R., Dutta, S., Pradhan, S.: Hybridizing invasive weed optimization with firefly algorithm for multi-robot motion planning. Arab. J. Sci. Eng. 43(8), 4029–4039 (2018) 10. Kumar, P.B., Sahu, C., Parhi, D.R., Pandey, K.K., Chhotray, A.: Static and dynamic path planning of humanoids using an advanced regression controller. Scientia Iranica. Trans. B, Mech. Eng. 26(1), 375–393 (2019) 11. 
Kumar, P.B., Muni, M.K., Parhi, D.R.: Navigational analysis of multiple humanoids using a hybrid regression-fuzzy logic control approach in complex terrains. Appl. Soft Comput. 106088 (2020) 12. Parhi, D.R., Kumar, P.B.: Smart navigation of humanoid robots using DAYKUN-BIP virtual target displacement and petri-net strategy. Robotica 37(4), 626–640 (2019) 13. Kumar, P.B., Parhi, D.R.: Intelligent hybridization of regression technique with genetic algorithm for navigation of humanoids in complex environments. Robotica 38(4), 565–581 (2020)
14. Pandey, K.K., Parhi, D.R.: Trajectory planning and the target search by the mobile robot in an environment using a behavior-based neural network approach. Robotica 1–15 (2020) 15. Park, J.S., Park, C., Manocha, D.: I-Planner: Intention-aware motion planning using learningbased human motion prediction. Int. J. Robot. Res. 38(1), 23–39 (2019) 16. Patle, B.K., Parhi, D.R.K., Jagadeesh, A., Kashyap, S.K.: Application of probability to enhance the performance of fuzzy based mobile robot navigation. Appl. Soft Comput. 75, 265–283 (2019) 17. Chen, Y., Liang, J., Wang, Y., Pan, Q., Tan, J., Mao, J.: Autonomous mobile robot path planning in unknown dynamic environments using neural dynamics. Soft Comput. 1–17 (2020) 18. Kim, D., Yoon, S.E.: Simultaneous planning of sampling and optimization: study on lazy evaluation and configuration free space approximation for optimal motion planning algorithm. Autonom. Robots. 44(2), 165–181 (2020) 19. Wen, S., Zhao, Y., Yuan, X., Wang, Z., Zhang, D., Manfredi, L.: Path planning for active SLAM based on deep reinforcement learning under unknown environments. Intel. Service Robot. 1–10 (2020) 20. Zhong, X., Tian, J., Hu, H., Peng, X.: Hybrid path planning based on safe A* algorithm and adaptive window approach for mobile robot in large-scale dynamic environment. J. Intel. Robot. Syst. 1–13 (2020) 21. Sun, Y., Zhang, C., Sun, P., Liu, C.: Safe and smooth motion planning for Mecanum-Wheeled robot using improved RRT and cubic spline. Arab. J. Sci. Eng. 1–16 (2019) 22. Moysis, L., Petavratzis, E., Volos, C., Nistazakis, H., Stouboulos, I.: A chaotic path planning generator based on logistic map and modulo tactics. Robot. Autonom. Syst. 124, 103377 (2020) 23. Muni, M.K., Kumar, P.B., Parhi, D.R., Rath, A.K., Das, H.C., Chhotray, A., Pandey, K.K., Salony, K.: Path planning of a humanoid robot using rule-based technique. In: Advances in Mechanical Engineering, pp 1547–1554. Springer, Singapore (2020) 24. 
Muni, M.K., Parhi, D.R., Kumar, P.B.: Improved motion planning of humanoid robots using bacterial foraging optimization. Robotica 1–14 (2020) 25. Muni, M.K., Parhi, D.R., Kumar, P.B.: Implementation of grey wolf optimization controller for multiple humanoid navigation. Comput. Animation Virtual Worlds e1919 (2020) 26. Muni, M.K., Parhi, D.R., Kumar, P., Pandey, K.K., Kumar, S., Chhotray, A.: Sugeno Fuzzy Logic Analysis: Navigation of Multiple Humanoids in Complex Environments. Available at SSRN 3536839 (2020) 27. Muni, M.K., Parhi, D.R., Kumar, P.B., Rath, A.K.: Navigational analysis of multiple humanoids using a hybridized rule base-sugeno fuzzy controller. Int. J. Humanoid Robot 2050017 (2020) 28. Muni, M.K., Parhi, D.R., Kumar, P.B., Kumar, S.: Motion control of multiple humanoids using a hybridized prim’s algorithm-fuzzy controller. Soft Comput. 1–22 (2020) 29. Kumar, S., Pandey, K.K., Muni, M.K., Parhi, D.R.: Path planning of the mobile robot using fuzzified advanced ant colony optimization. In: Innovative Product Design and Intelligent Manufacturing Systems, pp 1043–1052. Springer, Singapore (2020) 30. Kumar, S., Muni, M.K., Pandey, K.K., Chhotray, A., Parhi, D.R.: Path Planning and Control of Mobile Robots Using Modified Tabu Search Algorithm in Complex Environment. Available at SSRN 3539922 (2020) 31. Kumar, S., Parhi, D.R., Muni, M.K., Pandey, K.K.: Optimal path search and control of mobile robot using hybridized sine-cosine algorithm and ant colony optimization technique. Indust. Robot 47(4):535–545 (2020) 32. Rath, A.K., Das, H.C., Parhi, D.R., Kumar, P.B.: Application of artificial neural network for control and navigation of humanoid robot. J. Mech. Eng. Sci. 12(2), 3529–3538 (2018) 33. Rawat, H., Parhi, D.R., Kumar, P.B., Pandey, K.K., Behera, A.K.: Analysis and investigation of Mamdani fuzzy for control and navigation of mobile robot and exploration of different AI techniques pertaining to robot navigation. 
In: Emerging Trends in Engineering, Science and Manufacturing,(ETESM-2018). IGIT, Sarang, India (2018) 34. Sahu, C., Parhi, D.R., Kumar, P.B.: An approach to optimize the path of humanoids using adaptive ant colony optimization. J. Bionic Eng. 15(4), 623–635 (2018)
35. Rath, A.K., Parhi, D.R., Das, H.C., Kumar, P.B.: Behaviour based navigational control of humanoid robot using genetic algorithm technique in cluttered environment. Model. Meas. Control A 91(1), 32–36 (2018) 36. Abhilasha, R.: Minimum cost spanning tree using prims algorithm. Int. J. Adv. Res. Comput. Sci. Manag. Stud. 1(1) (2013)
Weather Prediction Using Hybrid Soft Computing Models Suvendra Kumar Jayasingh, Jibendu Kumar Mantri, and Sipali Pradhan
Abstract Weather forecasting is the challenging task of predicting the state of the atmosphere at a future time for a specified location. Weather is a highly nonlinear phenomenon in which small changes in initial conditions can produce large differences in outcome, the so-called butterfly effect. Soft computing techniques are now capable of replacing conventional weather prediction methods. The proposed hybrid soft computing models are designed to exploit the strengths of their constituent techniques while suppressing their weaknesses. This work builds the hybrid models from the favourable properties of Support Vector Machine, Multi-Layer Perceptron and Fuzzy Logic, and trains them on weather data of Delhi to forecast the weather there. Keywords Support vector machine · Multi-layer perceptron · Fuzzy rule-based support vector machine · Fuzzy rule-based multi-layer perceptron
1 Introduction Predicting the weather is a very complex task for environmental researchers and the Meteorological Department. It was earlier accomplished by traditional statistical methods. Advances in computer science and technology have changed the way weather is predicted through the use of soft computing techniques, which are innovative ways to construct computationally intelligent systems that possess humanlike expertise in making predictions.
S. K. Jayasingh · J. K. Mantri (B) · S. Pradhan PG Department of Computer Application, North Orissa University Baripada, Baripada, Odisha, India e-mail: [email protected] S. K. Jayasingh e-mail: [email protected] S. Pradhan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_4
These soft computing techniques can be combined to form hybrid soft computing models. A hybrid soft computing model is built by combining more than one soft computing technique, such as Support Vector Machine, Multi-Layer Perceptron and Fuzzy Logic. Combining the properties of two techniques extends the range of capabilities and can yield better predictions. In this research work, we analyse five years of weather data for Delhi, the capital city of India, located at 28.64°N, 77.22°E, as a case study. The new hybrid soft computing models are trained on this data. Their performance is evaluated against the actual data of the Meteorological Department, and the results are promising.
2 Literature Survey Al-Matarneh et al. [1] used neural networks with fuzzy logic to develop weather forecasting models for temperature prediction. Amanullah et al. [2] used soft computing techniques such as artificial neural networks for weather prediction. Bautu et al. [3] studied the forecasting of meteorological data using soft computing methods. Biradar et al. [4] predicted weather using data mining. Ghosh et al. [5] used fuzzy logic for weather prediction. Gocken et al. [6] analysed daily average temperature using soft computing models. Hamidi et al. [7] made a comparative study of support vector machines and artificial neural networks for predicting precipitation in Iran. Honarbakhsh et al. [8] used soft computing techniques to predict evapotranspiration. Isa et al. [9] performed weather forecasting using a photovoltaic system and a neural network. Mala et al. [10] performed fuzzy rule-based classification of a heart database using fuzzy RDBMS. Bhardwaj et al. [11] implemented soft computing techniques for weather forecasting. Sharma et al. [12] used soft computing models for weather prediction. Sharma et al. [13] performed weather forecasting using soft computing and statistical techniques. Shah et al. [14] used a multi-layer architecture of soft computing models for weather prediction. Bojja et al. [15] designed an artificial intelligence system for weather prediction.
3 Fuzzy Logic Fuzzy logic-based techniques can be used for weather prediction. Weather events are intricate, uncertain, and vague, so fuzzy rules are defined to predict the weather parameters. In a crisp set, an object either belongs to the set or does not: membership is 1 for a member and 0 for a non-member. In a fuzzy set, the membership value can lie anywhere between 0 and 1: a value of 1 indicates that the object fully belongs to the set, 0 indicates that it does not belong, and a higher value implies a stronger degree of belonging. Fuzzy Logic (FL) resembles human reasoning: it imitates the way the human brain makes decisions, considering all possibilities between the digital values yes and no. The fuzzy set of one weather parameter, temperature, may be defined as below.

| Fuzzy label | Temperature |
| --- | --- |
| Very high | > 35 °C |
| High | 31–35 °C |
| Medium | 26–30 °C |
| Low | 21–25 °C |
| Very low | 0–20 °C |

A set of rules is defined using if … then conditions, which are used to decide the predicted value of the target weather parameter. Each reading is thereby assigned a fuzzy set and a membership value.
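The mapping from a numeric reading to a fuzzy label can be sketched as follows. This is a minimal illustration, assuming each band of the temperature fuzzy set above is treated as a crisp interval; the names `TEMPERATURE_BANDS` and `fuzzify_temperature` are hypothetical and do not come from the paper's code.

```python
# Upper bound of each band and its label, taken from the temperature
# fuzzy set listed above. Treating bands as crisp intervals is an
# assumption of this sketch.
TEMPERATURE_BANDS = [
    (20, "Very low"),
    (25, "Low"),
    (30, "Medium"),
    (35, "High"),
]


def fuzzify_temperature(value):
    """Return the linguistic label for a numeric temperature reading."""
    for upper_bound, label in TEMPERATURE_BANDS:
        if value <= upper_bound:
            return label
    return "Very high"  # anything above the last upper bound


# Temperature readings of 1-4 January 2015 from Table 1.
labels = [fuzzify_temperature(t) for t in (19, 17, 18, 18)]
print(labels)
```

Applied to the four temperature readings in Table 1, the function returns "Very low" each time, which agrees with the Temperature column of Table 2.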
4 Multi-layer Perceptron A Multi-layer Perceptron (MLP) is inspired by biological nervous systems. It consists of at least three layers: an input layer, a hidden layer, and an output layer. All nodes except the input nodes are neurons that use a non-linear activation function. The network is trained with backpropagation, a supervised learning technique. Its multiple layers and non-linear activation distinguish it from the linear perceptron and allow it to separate data that are not linearly separable. A three-layer MLP is shown in Fig. 1. Fig. 1 Multi-layer perceptron
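To illustrate why a hidden layer with a non-linear activation can separate data that a linear perceptron cannot, the sketch below runs the forward pass of a tiny 2-2-1 network on the XOR problem. The weights are hand-picked for illustration rather than learned by backpropagation, the output neuron is kept linear for simplicity, and the function names are hypothetical.

```python
def relu(z):
    """Non-linear activation used by the hidden neurons."""
    return max(0.0, z)


def mlp_xor(x1, x2):
    """Forward pass of a 2-2-1 MLP with hand-picked (not learned) weights."""
    # Hidden layer: two neurons, each a weighted sum followed by ReLU.
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    # Output layer: linear combination of the hidden activations.
    return 1.0 * h1 - 2.0 * h2


outputs = [mlp_xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```

XOR is not linearly separable, so no single-layer linear perceptron reproduces these four outputs; the hidden ReLU units make it possible.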
5 Support Vector Machine The Support Vector Machine (SVM) is a soft computing model that follows a supervised learning mechanism and is used here for classification. It uses the kernel trick and finds an optimal boundary among the possible outputs: it provides optimal hyperplanes that categorize the samples into their respective sets. It is a very useful tool for classifying and predicting the group to which a new sample belongs. The time series dataset is used to train the model, which then becomes capable of predicting new sample data. One hyperplane separates the sample space into two sections. The equation of the hyperplane is
\[ \mathbf{w} \cdot \mathbf{x} = 0 \]
In 2D, the hyperplane is a line y = ax + b, which can be rewritten as y − ax − b = 0. In vector notation,
\[ \mathbf{w} = \begin{pmatrix} -b \\ -a \\ 1 \end{pmatrix} \quad \text{and} \quad \mathbf{x} = \begin{pmatrix} 1 \\ x \\ y \end{pmatrix} \]
Taking the dot product of w and x and equating it to zero gives −b − ax + y = 0, which is the equation of the line. An example SVM is shown in Fig. 2. Fig. 2 Support vector dividing the two sets from each other
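The vector form of the line can be checked numerically. The sketch below, with the hypothetical helper `side_of_line` and an arbitrarily chosen line y = 2x + 1, evaluates w · x for points on, above, and below the line; the sign of the result indicates on which side of the hyperplane a sample falls.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def side_of_line(a, b, x, y):
    """Evaluate w . x for the line y = a*x + b, with w = (-b, -a, 1)
    and the augmented point (1, x, y)."""
    w = (-b, -a, 1.0)
    augmented_point = (1.0, x, y)
    return dot(w, augmented_point)


# Illustrative line y = 2x + 1 (values chosen for this example only).
on_line = side_of_line(2.0, 1.0, 1.0, 3.0)
above = side_of_line(2.0, 1.0, 1.0, 5.0)
below = side_of_line(2.0, 1.0, 1.0, 0.0)
print(on_line, above, below)
```

A zero result means the point lies on the line, a positive result means it lies above, and a negative result means it lies below.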
6 Experimental Set-up The events at a particular place can be predicted from the other atmospheric parameters at that place. The events may be rain, no rain, thunderstorm, fog, windy, etc. In our paper, we use the weather parameters dew, humidity, temperature, pressure, wind speed, and visibility as the predictors and the event as the target variable. The atmospheric data for these parameters are collected from the AccuWeather web portal, and the missing values in the time series dataset are refined. The hybrid soft computing models are implemented in Python following the block diagram shown in Fig. 3, and the comparison of the output with the meteorological data is done in Weka.
Algorithm for the proposed work
Input: Time series weather data
Output: Predicted weather event
1. Set target variable
2. Set predictors
3. Train the model
4. Test the result
5. Find accuracy 1
6. Apply fuzzification to get fuzzy database
7. Train the model with fuzzy database
8. Test the result
9. Find accuracy 2
10. If accuracy 2 is better than accuracy 1, then select the proposed hybrid model for prediction of event, else go to step 2
11. Input new set of parameters and predict the event.
The output of the proposed model is compared with the meteorological data on the basis of mean absolute error (MAE), root mean squared error (RMSE), relative absolute error (RAE), and root relative squared error (RRSE).
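The four error measures can be computed as in the sketch below. It uses the standard definitions, in which RAE and RRSE normalise the error by that of a baseline that always predicts the mean of the actual values; whether Weka's reported figures were scaled to percentages is not stated in the paper. The function name `error_metrics` and the sample numbers are invented for illustration.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE and RRSE for two equal-length numeric sequences.

    Assumes the actual values are not all identical (otherwise the
    baseline error used by RAE and RRSE is zero).
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Errors of a baseline that always predicts the mean of the actual values.
    abs_base = sum(abs(a - mean_actual) for a in actual)
    sq_base = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / abs_base,
        "RRSE": math.sqrt(sq_err / sq_base),
    }


# Toy numbers for illustration only (not taken from the paper's data).
metrics = error_metrics([1, 2, 3, 4], [1, 2, 3, 6])
print(metrics)
```

Because RAE and RRSE are relative to the spread of the target values, they can move differently from MAE and RMSE when the target is re-encoded, for example from numeric values to fuzzy labels.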
6.1 Data Five years weather data from 2011 to 2015 for different parameters such as temperature, dew point, humidity, sea level pressure, visibility, wind speed, and event are taken into consideration for the present study as presented in Table 1.
Fig. 3 Block diagram for the flow of activities of the proposed work (Weather Data → Refining the Data → Python Code to Implement the Models → Comparison of Soft Computing Models → Selection of best Soft Computing Model)
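Table 1 Numeric weather data

| Date (Jan 2015) | Temperature | Dew | Humidity | Pressure | Visibility | Wind speed | Events |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 19 | 10 | 62 | 1018 | 2 | 3 | No rain |
| 2 | 17 | 12 | 68 | 1020 | 3 | 7 | No rain |
| 3 | 18 | 15 | 51 | 1011 | 7 | 8 | No rain |
| 4 | 18 | 11 | 67 | 1014 | 2 | 2 | No rain |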
Table 2 Fuzzy weather data

| Date (Jan 2015) | Temperature | Dew | Humidity | Pressure | Visibility | Wind speed | Events |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Very low | Very low | Low | Very high | Low | Very low | No rain |
| 2 | Very low | Very low | Low | Very high | Low | Low | No rain |
| 3 | Very low | Very low | Very low | High | High | Low | No rain |
| 4 | Very low | Very low | Low | High | Low | Very low | No rain |
The numeric data are translated into equivalent fuzzy data which is shown in Table 2.
7 Fuzzy Rule-Based Support Vector Machine (FR-SVM) 7.1 Proposed Model The proposed Fuzzy Rule-based Support Vector Machine (FR-SVM) combines the positive features of fuzzy sets and the support vector machine. The model converts the crisp weather data to a fuzzy database using the fuzzy set rules, according to the membership function defined in the metadata of the fuzzy set. The fuzzy database is then used to train the support vector machine, which gives the predicted output. The proposed model is illustrated in Fig. 4.
7.2 Output Analysis The five years of weather data from 2011 to 2015 are used to train the support vector machine and the proposed FR-SVM. The outputs are compared on the basis of different statistical errors, as shown in Table 3.
Fig. 4 Proposed new hybrid fuzzy rule-based SVM

Table 3 Comparison between SVM and proposed model FR-SVM

| Year | Model | MAE | RMSE | RAE | RRSE |
| --- | --- | --- | --- | --- | --- |
| 2011 | SVM | 2.2744 | 2.8366 | 0.3995 | 0.4408 |
| 2011 | FR-SVM | 0.2583 | 0.3435 | 0.8224 | 0.8668 |
| 2012 | SVM | 2.2718 | 2.9155 | 0.3823 | 0.4317 |
| 2012 | FR-SVM | 0.2615 | 0.3483 | 0.8325 | 0.879 |
| 2013 | SVM | 2.3189 | 2.8902 | 0.3833 | 0.424 |
| 2013 | FR-SVM | 0.2633 | 0.351 | 0.8257 | 0.8789 |
| 2014 | SVM | 2.5377 | 3.1884 | 0.4249 | 0.4643 |
| 2014 | FR-SVM | 0.2614 | 0.348 | 0.8305 | 0.8773 |
| 2015 | SVM | 2.2306 | 2.9008 | 0.3766 | 0.435 |
| 2015 | FR-SVM | 0.2562 | 0.3402 | 0.8212 | 0.8615 |
Fig. 5 SVM versus FR-SVM: a graphical representation (MAE, RMSE, RAE, and RRSE for 2011–2015)
Fig. 6 Graphical assessment between SVM and FR-SVM on the basis of RMSE
The performance of SVM and the proposed FR-SVM is compared on the basis of MAE, RMSE, RAE, and RRSE in Fig. 5, and on the basis of RMSE alone in Fig. 6. The lower the value of RMSE, the better the model. FR-SVM has the lower RMSE in every year, so it is concluded that the proposed model FR-SVM provides a better result than SVM.
8 Fuzzy Rule-Based MLP (FR-MLP) 8.1 Proposed Model The proposed fuzzy rule-based multi-layer perceptron (FR-MLP) combines the features of fuzzy sets and the multi-layer perceptron. The model converts the crisp weather data to a fuzzy database using the fuzzy set rules, according to the membership function defined in the metadata of the fuzzy set. The fuzzy database is then used to train the multi-layer perceptron, which gives the predicted output. The new proposed hybrid model is illustrated in Fig. 7.
Fig. 7 New hybrid fuzzy rule-based MLP
8.2 Output Analysis The five years of weather data are used to train the multi-layer perceptron and the newly proposed FR-MLP. The outputs are compared on the basis of different statistical errors, as shown in Table 4. The performance of MLP and the proposed FR-MLP is compared on the basis of MAE, RMSE, RAE, and RRSE in Fig. 8, and on the basis of RMSE alone in Fig. 9. The lower the value of RMSE, the better the model. FR-MLP has the lower RMSE in every year, so it is concluded that FR-MLP is better than MLP.
Table 4 Comparison between MLP and proposed new model FR-MLP

| Year | Model | MAE | RMSE | RAE | RRSE |
| --- | --- | --- | --- | --- | --- |
| 2011 | MLP | 2.5443 | 3.2392 | 0.447 | 0.5034 |
| 2011 | FR-MLP | 0.1957 | 0.3761 | 0.6232 | 0.9492 |
| 2012 | MLP | 2.3341 | 3.0198 | 0.3927 | 0.4472 |
| 2012 | FR-MLP | 0.1865 | 0.3466 | 0.5936 | 0.8747 |
| 2013 | MLP | 2.3398 | 3.0515 | 0.3868 | 0.4476 |
| 2013 | FR-MLP | 0.2063 | 0.3882 | 0.6469 | 0.9722 |
| 2014 | MLP | 2.1913 | 2.7975 | 0.3669 | 0.4074 |
| 2014 | FR-MLP | 0.1846 | 0.3644 | 0.5864 | 0.9188 |
| 2015 | MLP | 2.6087 | 3.3992 | 0.4404 | 0.5098 |
| 2015 | FR-MLP | 0.1788 | 0.3604 | 0.5731 | 0.9126 |
Fig. 8 MLP versus FR-MLP (MAE, RMSE, RAE, and RRSE for 2011–2015)
Fig. 9 MLP versus FR-MLP based on RMSE
9 Result Analysis We now compare the performance of the two proposed models for prediction of time series weather parameters on the basis of MAE, RMSE, RAE, and RRSE. The comparison is shown in Table 5 and depicted graphically in Fig. 10.
Table 5 Comparison between FR-SVM and FR-MLP on the basis of MAE, RMSE, RAE, and RRSE

| Year | Model | MAE | RMSE | RAE | RRSE |
| --- | --- | --- | --- | --- | --- |
| 2011 | FR-SVM | 0.2583 | 0.3435 | 0.8224 | 0.8668 |
| 2011 | FR-MLP | 0.1957 | 0.3761 | 0.6232 | 0.9492 |
| 2012 | FR-SVM | 0.2615 | 0.3483 | 0.8325 | 0.879 |
| 2012 | FR-MLP | 0.1865 | 0.3466 | 0.5936 | 0.8747 |
| 2013 | FR-SVM | 0.2633 | 0.351 | 0.8257 | 0.8789 |
| 2013 | FR-MLP | 0.2063 | 0.3882 | 0.6469 | 0.9722 |
| 2014 | FR-SVM | 0.2614 | 0.348 | 0.8305 | 0.8773 |
| 2014 | FR-MLP | 0.1846 | 0.3644 | 0.5864 | 0.9188 |
| 2015 | FR-SVM | 0.2562 | 0.3402 | 0.8212 | 0.8615 |
| 2015 | FR-MLP | 0.1788 | 0.3604 | 0.5731 | 0.9126 |
Fig. 10 FR-SVM versus FR-MLP (MAE, RMSE, RAE, and RRSE for 2011–2015)
The performance of the two proposed models, FR-SVM and FR-MLP, is compared on the basis of Root Mean Squared Error (RMSE) in Fig. 11. The lower the value of RMSE, the better the model. According to Table 5, FR-SVM has the lower RMSE and RRSE in four of the five years (all except 2012), whereas FR-MLP has the lower MAE and RAE in every year. On the basis of RMSE, the performance of FR-SVM is therefore better.
Fig. 11 Comparison between FR-SVM and FR-MLP on the basis of RMSE
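The year-by-year comparison in Table 5 can be tallied mechanically. The sketch below copies the Table 5 values and counts, for each error measure, the number of years in which FR-SVM has the lower error; the helper `count_wins` is a hypothetical name introduced here.

```python
METRICS = ("MAE", "RMSE", "RAE", "RRSE")

# Values copied from Table 5 (year -> errors in METRICS order).
FR_SVM = {
    2011: (0.2583, 0.3435, 0.8224, 0.8668),
    2012: (0.2615, 0.3483, 0.8325, 0.879),
    2013: (0.2633, 0.351, 0.8257, 0.8789),
    2014: (0.2614, 0.348, 0.8305, 0.8773),
    2015: (0.2562, 0.3402, 0.8212, 0.8615),
}
FR_MLP = {
    2011: (0.1957, 0.3761, 0.6232, 0.9492),
    2012: (0.1865, 0.3466, 0.5936, 0.8747),
    2013: (0.2063, 0.3882, 0.6469, 0.9722),
    2014: (0.1846, 0.3644, 0.5864, 0.9188),
    2015: (0.1788, 0.3604, 0.5731, 0.9126),
}


def count_wins(model_a, model_b):
    """For each metric, count the years in which model_a has the lower error."""
    wins = dict.fromkeys(METRICS, 0)
    for year, values_a in model_a.items():
        for metric, a, b in zip(METRICS, values_a, model_b[year]):
            if a < b:
                wins[metric] += 1
    return wins


svm_wins = count_wins(FR_SVM, FR_MLP)
print(svm_wins)
```

With these values, FR-SVM has the lower RMSE and RRSE in four of the five years, while FR-MLP has the lower MAE and RAE in all five, so the ranking of the two models depends on which error measure is used.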
10 Conclusion and Future Scope In this paper, we attempted to design hybrid soft computing models to predict weather events such as rain, no rain, fog, thunderstorm, and windy conditions. The proposed models yield lower RMSE values than their constituent soft computing techniques, so the new hybrid models perform better than the techniques they are built from. Weather prediction using hybrid soft computing techniques gives promising results, with better accuracy and lower error rates than the simple soft computing methods. After analysing the two proposed hybrid techniques, trained on five years of atmospheric data, we conclude that FR-SVM performs better than FR-MLP in terms of RMSE in predicting the different weather events in Delhi. The approach can also be applied to stock market prediction, business prediction, agricultural crop prediction, sentiment analysis, etc. Selecting optimal parameters when designing the hybrid models may make the prediction more accurate.
References
1. Al-Matarneh, L., Sheta, A., Bani-Ahmad, S., Alshaer, J., Al-oqily, I.: Development of temperature-based weather forecasting models using neural networks and fuzzy logic. Int. J. Multim. Ubiquit. Eng. 9(12), 343–366 (2014)
2. Amanullah, M., Khanaa, V.K.: Application of soft computing techniques in weather forecasting: ANN approach. Int. J. Adv. Res. 2(1), 212–219 (2014)
3. Bautu, E., Barbulescu, A.: Forecasting meteorological time series using soft computing methods: an empirical study. Appl. Math. Inf. Sci. Int. J. 7(4), 1297–1306 (2013)
4. Biradar, P., Ansari, S., Paradhar, Y., Lohiya, S.: Weather prediction using data mining. Int. J. Eng. Dev. Res. 5(2), 213–214 (2017)
5. Ghosh, S., Dutta, A., Choudhury, S.R., Paul, G.: Weather prediction by the use of fuzzy logic. J. Mech. Cont. Math. Sci. 8(2), 1228–1241 (2014)
6. Gocken, M., Boru, A., Dosdogru, T., Berber, N.: Application of soft computing models to daily average temperature analysis. Int. J. Eng. Technol. 1(2), 56–64 (2015)
7. Hamidi, O., Poorolajal, J., Sadeghifar, M., Abbasi, H., Maryanaji, Z., Faridi, H.R., Tapak, L.: A comparative study of support vector machines and artificial neural networks for predicting precipitation in Iran. Theory Appl. Climatol. 119, 723–731 (2015)
8. Honarbakhsh, A., Dashpagerdi, M.M., Vagharfard, H.: Application of soft computing methods in predicting evapotranspiration. Open J. Geol. 3, 397–403 (2013)
9. Isa, I.S., Omar, S., Saad, Z., Noor, N.M., Osman, M.K.: Weather forecasting using photovoltaic system and neural network. In: Second International Conference on Computational Intelligence, Communication Systems and Networks, pp. 96–100 (2010)
10. Mala, I., Aktar, P., Ali, T.J., Zia, S.S.: Fuzzy rule based classification for heart database using fuzzy decision tree algorithm based on fuzzy RDBMS. World Appl. Sci. J. 28(9), 1331–1335 (2013)
11. Bhardwaj, R., Duhoon, V.: Weather forecasting using soft computing techniques. In: International Conference on Computing, Power and Communication Technologies (GUCON), Greater Noida, Uttar Pradesh, India, pp. 1111–1115 (2018). https://doi.org/10.1109/GUCON.2018.8675088
12. Sharma, A., Manoria, M.: A weather forecasting system using concept of soft computing: a new approach. In: International Conference on Advanced Computing and Communications, Surathkal, pp. 353–356 (2006). https://doi.org/10.1109/ADCOM.2006.4289915
13. Sharma, M., Mathew, L., Chatterji, S.: Weather forecasting using soft computing and statistical techniques. Int. J. Adv. Res. Electr. Electron. Instrum. Eng. 3(7), 11285–11290 (2014)
14. Shah, S.S.M., Meganathan, S., Kamali, A.: Soft computing research for weather prediction using multi layer architecture. Int. J. Eng. Adv. Technol. 8(6), 3779–3783 (2019)
15. Bojja, P., Sanam, N.: Design and development of artificial intelligence system for weather forecasting soft computing. ARPN J. Eng. Technol. 12(3), 685–689 (2017)
FindMoviez: A Movie Recommendation System Ashis Kumar Padhi, Ayog Mohanty, and Sipra Sahoo
Abstract Movie recommendation has become one of the most efficient ways of personalizing the user experience and connecting users with movies they might like. In this paper, FindMoviez, a movie recommendation engine, is introduced. It is based on a combination of two recommendation algorithms and implemented in a web application. Three sub-data sets (i.e., ratings, users, and metadata) of the well-known MovieLens data set have been used. Item–item collaborative filtering is combined with a genre-based method that uses the average weighted rating. These algorithms have been modified so that the user is always recommended movies: if one of the algorithms fails, the other comes into play, making the product more reliable. Thus, the user can rely on this product for genuine movie recommendations. Keywords Recommendation engine · Collaborative filtering · Item–item collaborative filtering · Average weighted rating
1 Introduction There is an abundance of data on the Internet, and the number of active Internet users is increasing rapidly [1]. With the Internet expanding faster than ever before, an ever-increasing explosion of information, and a flood of choices, users have a hard time selecting from what is available. Out of 7 billion people in the world, 4 billion were already online by October 2018 [2]. Be it a restaurant menu or the search for a good movie, there is so much to choose from that choosing becomes a hectic and monotonous task. Reaching the desired data quickly and easily is of paramount importance to the user [3]. Companies and organizations have understood this problem and deployed smart systems known as recommendation systems to guide their users to the required products. Recommendation has been one of the most heavily researched areas in the past, and it continues to be explored because it still has considerable scope for improvement and remains rich with open problems.
Nowadays, customers are spoilt for choice. Whenever you open a digital platform, be it Netflix or any other streaming website, you are presented with so many options that it becomes confusing which one to go for. To solve this problem, streaming websites have come up with large, advanced recommendation algorithms. These systems go through the user's profile and predict whether that user would like a specific product or not [4]. However, we have found some flaws in them and tried to tackle these in our recommendation engine, FindMoviez. Personalization and robustness are the main aspects that we are trying to achieve. Many recommendation systems have been developed that give accurate recommendations to end users according to their requirements. Of all the recommendation methods, collaborative filtering has been fairly successful and has stood the test of time, though not without modifications [5–7]. Still, existing systems do not consider new users or take their needs into account, which results in unwanted output for such users. To overcome this, we came up with a hybrid approach to recommendation. In addition, movies not seen by a user are usually treated either as zero-rated or as having the user's average rating; we fill these NaN values in a different way, which we discuss later in this paper.
A. K. Padhi (B) · A. Mohanty · S. Sahoo Department of Computer Science and Engineering, ITER, Siksha ‘O’ Anusandhan (Deemed To Be University), Bhubaneswar, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_5
2 Background and Related Literature 2.1 Recommendation System Recommendation systems are basically information filtering systems that filter out desired or vital data from the huge amount of data that is available at the disposal of the user. These systems come in very handy when there is information overload or data overload [8]. These systems take into account the user’s choices, requirements or past history of data, or information consumption [9].
2.2 Collaborative Filtering Collaborative filtering is inspired by an age-old method that we rely on when selecting any product: taking the opinion of other people who we think have tastes similar to ours [10]. Collaborative filtering finds users similar to the target user and uses their history of data or information consumption to filter useful information for the target user. Collaborative filtering methods are broadly classified into:
a. Model-based collaborative filtering
b. Memory-based collaborative filtering.
Model-based collaborative filtering takes into account a user–item relationship matrix. Based on the insights from this matrix, the model generates recommendations for the target user [11, 12]. Memory-based collaborative filtering is broadly classified into the following types:
a. User-based collaborative filtering
b. Item-based collaborative filtering.
Here, two recommendation techniques have been used:
a. Item-based collaborative filtering
b. Content-based filtering.
2.3 Item-Based Collaborative Filtering In this filtering technique, user variability is constant. This technique takes into account the list of items that the user has rated and finds similarity scores with respect to other items. Then, k-most similar items are recommended to the user (Fig. 1) [13]. Fig. 1 Formulae for item similarity
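The similarity formula of Fig. 1 is not reproduced in this text. Because the system later refers to a correlation matrix, one plausible choice is the Pearson correlation between the rating vectors of two items; the sketch below assumes that choice. The function names and the ratings are made up for illustration and are not taken from the MovieLens data.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation of two items over the users who rated both.

    Each argument maps user id -> rating. Returns 0.0 when the items share
    fewer than two raters or when either item has zero rating variance.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    xs = [ratings_a[u] for u in common]
    ys = [ratings_b[u] for u in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def most_similar(target, item_ratings, k=2):
    """Return the k items most correlated with `target`, best first."""
    scores = [
        (pearson(item_ratings[target], ratings), title)
        for title, ratings in item_ratings.items()
        if title != target
    ]
    scores.sort(reverse=True)
    return [title for _, title in scores[:k]]


# Made-up ratings (item -> {user: rating}), for illustration only.
item_ratings = {
    "Movie A": {"u1": 5, "u2": 3, "u3": 1},
    "Movie B": {"u1": 4, "u2": 3, "u3": 2},
    "Movie C": {"u1": 1, "u2": 3, "u3": 5},
}
top = most_similar("Movie A", item_ratings, k=1)
print(top)
```

In this toy data, Movie B is rated in the same direction as Movie A by every user and Movie C in the opposite direction, so Movie B is returned as the most similar item.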
Fig. 2 Formulae for weighted rating
2.4 Content-Based Filtering In this filtering technique, user is recommended items based on the preferences selected by the user. These preferences can be tags or types or metadata-related information [14]. Such type of filtering usually recommends products on the basis of the result from the comparison between the content and the user’s interest (Fig. 2).
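The weighted rating formula of Fig. 2 is not reproduced in this text. A commonly used form is the IMDb-style weighted rating, WR = v/(v + m) · R + m/(v + m) · C, where R is the movie's mean rating, v its number of votes, m a minimum-vote threshold, and C the mean rating across all movies. Whether FindMoviez uses exactly this form is an assumption; the sketch below, with hypothetical function names and made-up numbers, shows how such a score can rank movies within a genre.

```python
def weighted_rating(avg_rating, vote_count, min_votes, global_mean):
    """IMDb-style weighted rating: shrink a movie's mean rating toward the
    global mean, more strongly when the movie has few votes."""
    v, m = vote_count, min_votes
    return (v / (v + m)) * avg_rating + (m / (v + m)) * global_mean


def rank_genre(movies, genre, min_votes, global_mean):
    """Return titles of the given genre, highest weighted rating first.

    `movies` is a list of (title, genre, avg_rating, vote_count) tuples.
    """
    scored = [
        (weighted_rating(avg, votes, min_votes, global_mean), title)
        for title, g, avg, votes in movies
        if g == genre
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored]


# Made-up catalogue, for illustration only.
movies = [
    ("Movie A", "Drama", 9.0, 10),     # high mean, very few votes
    ("Movie B", "Drama", 8.0, 1000),   # slightly lower mean, many votes
    ("Movie C", "Comedy", 7.0, 500),
]
ranking = rank_genre(movies, "Drama", min_votes=100, global_mean=6.0)
print(ranking)
```

Movie B ranks above Movie A here even though its raw mean is lower, because the score discounts ratings supported by few votes.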
3 Methodology Most of us have experienced unwanted recommendations or no recommendation for our selected choices in any application at least once. This is caused due to narrow collection of input choices collected from the user. In our proposed methodology, we have addressed some issues in the existing system and tried to tackle the issues with a modified item-item collaborative filtering method and genre-based weighted average rating method.
3.1 Data Movie ratings can be collected directly from any websites which collect the realtime review of movies from the user or from open-source data sets available on the Internet. In our system, we have used a combination of multiple data sets taken from the Internet. We have taken movies, ratings, and movie metadata datasets from the famous MovieLens Dataset. We have used a combination of two smaller datasets movies (9742 × 3) and ratings (100,836 × 4) which combine to give us a data frame of (100,836 × 5) for collaborative filtering purposes due to limited computational capacity. For the genre-based algorithm, we use a movie metadata set which is a relatively bigger data set and which gets accessed in case our movie is not found in our correlation matrix from the two datasets mentioned above. The size of this movie metadata frame is of (45,466 × 24) for genre-based weighted rating algorithm.
3.2 Flowchart
3.3 Algorithm Here, we have used a combination of item–item collaborative filtering and modified content-based recommendation algorithms.
4 Experiment The simulation is carried out in Spyder, a scientific development environment, using the Python language. The system used has an i5 processor, 4 GB RAM, and the Windows operating system. We visualize the simulation via a web application built with the Flask framework in Python.
1. With the help of Flask, we first serve the home page (index.html), which takes the movie name, ratings, and genre from the user.
2. The application then checks whether the movie is present in the correlation matrix.
3. If the movie is present, it finds similar movies from the data using item–item collaborative filtering.
4. It sends the results back, and pos.html shows the users their recommendations.
5. If the movie is not available in the correlation matrix, we go to the metadata set and, using the genre, find similar movies with a content-based filtering method.
6. Finally, the recommendation is returned to the user with an appropriate message.
7. This ensures that the user always gets some kind of recommendation according to his or her preference.
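The fallback behaviour in the steps above can be sketched as a single function. This is a simplified illustration: plain dictionaries stand in for the correlation matrix and the metadata set, the web layer is omitted, and all names and sample titles are hypothetical.

```python
def recommend(movie, genre, similar_movies, genre_top_rated):
    """Return (recommendations, message), trying collaborative filtering first.

    similar_movies:  dict mapping a title in the correlation matrix to its
                     most similar titles (item-item collaborative filtering).
    genre_top_rated: dict mapping a genre to titles ranked by weighted rating.
    Both structures are hypothetical stand-ins for the paper's data frames.
    """
    if movie in similar_movies:
        return similar_movies[movie], "Based on movies similar to " + movie
    # Fallback: the title is missing from the correlation matrix.
    fallback = [t for t in genre_top_rated.get(genre, []) if t != movie]
    return fallback, "Based on top-rated " + genre + " movies"


# Made-up lookup tables, for illustration only.
similar_movies = {"Movie A": ["Movie B", "Movie C"]}
genre_top_rated = {"Drama": ["Movie D", "Movie E"]}

found = recommend("Movie A", "Drama", similar_movies, genre_top_rated)
missing = recommend("Movie Z", "Drama", similar_movies, genre_top_rated)
print(found)
print(missing)
```

The second call shows the genre-based path: the title is unknown to the collaborative filter, so the user still receives the top-rated titles of the chosen genre along with a message explaining their origin.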
5 Conclusion With this recommendation system, we tried to address the problem of people not getting proper suggestions for what they really want to see, while respecting their preferences. Entertainment holds a vital place in everyone's life, as it relaxes people from their busy routines, and failing to provide viewers with what they want would be unfair to them. We tried to fix this issue by understanding the user's mentality and mood and providing recommendations accordingly. The results show that item-based techniques hold the promise of allowing hybrid-based algorithms to scale to large data sets and at the same time produce high-quality recommendations [12, 15].
References
1. Garanayak, M., Mohanty, S.N., Jagadev, A.K., Sahoo, S.: Recommender system using item based collaborative filtering (CF) and K-means. Int. J. Knowl. Based Intell. Eng. Syst. 23, 93 (2019)
2. 2011 Internet Use Survey Summary Report, KISA, pp. 2–27 (2011)
3. Jeong, W.-h., Kim, S.-j., Park, D.-s., Kwak, J.: Performance improvement of a movie recommendation system based on personal propensity and secure collaborative filtering. J. Inf. Process. Syst. 9(1), 157 (2013)
4. Manvi, S.S., Nalini, N., Bhajantri, L.B.: Recommender system in ubiquitous commerce. In: International Conference on Electronics Computer Technology, IEEE, pp. 434–438 (2011)
5. Hill, W., Stead, L., Rosenstein, M., Furnas, G.: Recommending and evaluating choices in a virtual community of use. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 194–201 (1995)
6. Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., Riedl, J.: GroupLens: an open architecture for collaborative filtering of Netnews. In: Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work, Chapel Hill, NC, pp. 175–186 (2001)
7. Shardanand, U., Maes, P.: Social information filtering: algorithms for automating 'word of mouth'. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 210–217 (1995)
8. Konstan, J.A., Riedl, J.: Recommendation systems: from algorithms to user experience. User Model. User-Adapt. Interact. 22, 101–123 (2012)
9. Pan, C., Li, W.: Research paper recommendation with topic analysis. Comput. Des. Appl. IEEE 2, v4-246 (2010)
10. Ben Schafer, J., Frankowski, D., Herlocker, J., Sen, S.: Collaborative Filtering Recommender Systems, pp. V9.1-291. Department of Computer Science, University of Northern Iowa (2007)
11. McSherry, D.: Explaining the pros and cons of conclusions in CBR. In: European Conference on Case based Reasoning, pp. 317–330. Springer, Berlin (2004)
12. Balabanovic, M., Shoham, Y.: Fab: content-based, collaborative recommendation. Commun. ACM 40(3), 66–67 (1997)
13. Trans. Knowl. Data Eng. 17(6), 734–749 (2005)
14. Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D., Sartin, M.: Combining content-based and collaborative filters in an online newspaper. In: Proceedings of ACM SIGIR Workshop on Recommender Systems: Algorithms and Evaluation, Berkeley, CA (1999)
15. Sarwar, B., Karypis, G., Konstan, J., Riedl, J.: Item-Based Collaborative Filtering Recommendation Algorithms, pp. v3-288. GroupLens Research Group/Army HPC Research Center, Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455
16. Claypool, M., Gokhale, A., Miranda, T., Murnikov, P., Netes, D., Sartin, M.: Combining content-based and collaborative filters in an online newspaper. In: Proceedings of ACM SIGIR Workshop on Recommender Systems: Algorithms and Evaluation, Berkeley, CA (1999)
Active Filter with 2-Fuzzy Intelligent Controller: A Solution to Power Quality Problem Veeravelli Lakshmi Prasana, Pratap Sekhar Puhan, and Satyabrata Sahoo
Abstract This paper presents a controller based on a two-fuzzy-logic technique combined with hysteresis band current regulation. The controller is implemented in a shunt active filter to enhance the power quality of a system. First, the DC voltage component is fed to the first fuzzy logic controller (FLC) to obtain the in-phase current component; then the AC voltage is fed to the second FLC to obtain the quadrature component. The sum of the two current components gives the reference current. The reference current and the supply current are fed to the PWM controller, which generates the pulses that operate the proposed filter, and the power quality is thereby improved. To demonstrate the effectiveness of the controller, models under different conditions are simulated and verified.
Keywords Power quality · Shunt active filter · Fuzzy logic controller · 2-fuzzy technique · PWM technique
1 Introduction
Harmonic-producing loads in a power system cause adverse effects such as temperature rise in the transformer core, reduced system efficiency, and maloperation of relay circuits. These loads produce harmonics because of their non-linear characteristics. Arc furnaces, diodes, UPSs, drives, batteries, laptops, computers, etc. [1, 2] fall into the category of non-linear loads. Power quality can be enhanced by removing the harmonics using different types of techniques [1–9].

V. L. Prasana · P. S. Puhan (B) Sreenidhi Institute of Science and Technology, Hyderabad, Telengana, India e-mail: [email protected]
V. L. Prasana e-mail: [email protected]
S. Sahoo Nalla Malla Reddy Engineering College, Hyderabad, Telengana, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_6
Initially, passive restoring techniques were used to filter the harmonic content of the voltage and current. The low-pass filter and the isolation transformer are two important components for this purpose [10], but because a low-pass filter consists of an inductor and a capacitor, it creates a resonance effect in the system and also makes the filter larger [11, 12]. Proper and accurate operation of the active power filter (APF) depends on the control algorithm developed to estimate the base (reference) signal, and various control techniques have been discussed [11–17]. Another aspect of the filter is the magnitude of the capacitor voltage; many control algorithms for keeping this magnitude constant have already been discussed [15, 16]. Recently, power quality in distributed generation has attracted the attention of various researchers [5, 6, 10]. A battery-supported active filter has already been implemented [11, 13] to improve the performance of the filter, and its use in conjunction with instantaneous reactive power theory to mitigate harmonics is discussed in [12]. Techniques involving gain adjustment to enhance switching-signal performance are described in [14, 15]. An IGBT-based active filter with current control using two PI controllers in closed loop is discussed in [4–6, 18]. Soft computing techniques play an important role in the control action of the filters. FLC, neural network control (NNC), and FLC combined with NNC have been implemented in many papers [7–17], and real-time analysis of fuzzy logic controllers with 25 and 75 membership functions has been reported [11, 13]. A weight-updating algorithm for the neural controller has also been implemented in a neural network-based filter [13]. Indirect current control together with hysteresis and hysteresis-fuzzy controller techniques has been implemented by some researchers [11–15], and a phase-locked loop controller with PI and hysteresis control is discussed in [12, 14]. Comparative analyses of indirect current control with synchronous detection and neural controllers are given in [15, 16, 19, 20].
The paper is organized in five sections, including this introduction in Sect. 1. A brief description of the filter is given in Sect. 2, the controller implementation is discussed in Sect. 3, and Sects. 4 and 5 present the analysis of results and the conclusions, respectively.
2 Details of the Proposed Filter
The proposed filter is shown in Fig. 1. The active filter system is built around a standard three-phase IGBT-based voltage source inverter bridge [4, 18]. The inductor ($L_{ac}$), resistor ($R_{ac}$), and capacitor ($C_{dc}$) are the input components of the filter. The DC capacitor ($C_{dc}$) supplies the required voltage to the filter. A diode bridge rectifier with an inductor ($L$) and a resistor ($R$) is used as the non-linear load. This load is connected to a three-phase AC supply with line impedance $Z_L = R_L + jX_L$. The values of the system parameters are given in the Appendix.
Fig. 1 Proposed active filter
3 Developed Control Technique
The controller developed to generate the reference signals, which combines two fuzzy logic controllers with a hysteresis band current controller, is shown in Fig. 2. The distinctive feature of this work is that the reference current is obtained from two components: the in-phase component and the quadrature component. Once the reference signal has been estimated, it is compared with the source current signal and the error is fed to the current controller, which generates the switching signals from this error using the PWM technique.
3.1 Determination of In-Phase Component
The average DC voltage $V_{dca}(n)$ and its reference counterpart $V_{dc}^{*}(n)$ are fed to a summing point to obtain the error:

$$\Delta V_{dc}(n) = V_{dc}^{*}(n) - V_{dca}(n) \tag{1}$$
The error and the change in error are fed to FLC-1, which then performs the fuzzy inference. The error and the change in error are each partitioned into seven levels: negative large (NL), negative medium (NM), negative small (NS), zero (ZO), positive small (PS), positive medium (PM), and positive large (PL). The rules chosen to regulate the parameter are presented in Table 1. The output of the fuzzy logic controller is the amplitude $I_{ip}^{*}$ of the three-phase in-phase current component, which is split into three single-phase components with the aid of the three unit in-phase current vectors $u_{srp}$, $u_{syp}$, $u_{sbp}$:

$$i_{srp}^{*} = I_{ip}^{*} \cdot u_{srp}, \quad i_{syp}^{*} = I_{ip}^{*} \cdot u_{syp}, \quad i_{sbp}^{*} = I_{ip}^{*} \cdot u_{sbp} \tag{2}$$

The unit in-phase current vectors are

$$u_{srp} = \frac{V_{sr}}{V_{as}}, \quad u_{syp} = \frac{V_{sy}}{V_{as}}, \quad u_{sbp} = \frac{V_{sb}}{V_{as}} \tag{3}$$

Fig. 2 Technique to generate the switching signals
Table 1 Truth table of FLC-first
e/Δe   NL   NM   NS   ZO   PS   PM   PL
NL     NL   NM   NL   NL   NM   NS   ZO
NM     NL   NL   NL   NM   NS   ZO   PS
NS     NL   NL   NM   NS   ZO   PS   PM
ZO     NL   NM   NS   ZO   PS   PM   PL
PS     NM   NS   ZO   PS   PM   PL   PL
PM     NM   ZO   PS   PL   PL   PL   PL
PL     ZO   PS   PM   PL   PL   PL
where $V_{as}$ is the supply voltage amplitude, which is calculated as

$$V_{as} = \left[ 0.67 \left( V_{sr}^{2} + V_{sy}^{2} + V_{sb}^{2} \right) \right]^{1/2} \tag{4}$$
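As a rough numerical illustration of Eqs. (2) to (4), the sketch below computes the supply amplitude, the unit in-phase vectors, and the three in-phase reference currents from instantaneous phase voltages. The function names, the operating point, and the value used for the FLC-1 output amplitude are assumptions made for this example only; the 0.67 factor is the one printed in Eq. (4).

```python
import math

def supply_amplitude(v_sr, v_sy, v_sb):
    """Eq. (4): supply voltage amplitude from the instantaneous phase voltages."""
    return math.sqrt(0.67 * (v_sr ** 2 + v_sy ** 2 + v_sb ** 2))

def in_phase_references(i_ip, v_sr, v_sy, v_sb):
    """Eqs. (2)-(3): scale the unit in-phase vectors by the FLC-1 output amplitude."""
    v_as = supply_amplitude(v_sr, v_sy, v_sb)
    unit_vectors = (v_sr / v_as, v_sy / v_as, v_sb / v_as)
    return tuple(i_ip * u for u in unit_vectors)

# Hypothetical operating point: balanced supply, 40 V rms per phase, at 30 degrees.
V_PEAK = 40.0 * math.sqrt(2.0)
THETA = math.radians(30.0)
PHASES = [V_PEAK * math.sin(THETA - k * 2.0 * math.pi / 3.0) for k in range(3)]

# 5.0 A stands in for the amplitude that FLC-1 would deliver (assumed value).
REFS = in_phase_references(5.0, *PHASES)
```

For a balanced sinusoidal supply the sum of the squared phase voltages is constant, so `supply_amplitude` returns a value very close to the peak phase voltage at every instant, and the three reference currents sum to approximately zero.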
3.2 Determination of Quadrature Component
The error between the reference supply voltage amplitude $V_{as}^{*}(n)$ and the actual amplitude $V_{as}(n)$ is fed to the second fuzzy logic controller, with the change in error as its other input. The fuzzy rules are applied to these inputs, and the output is obtained by applying the AND operation between the two inputs, i.e., the error $\Delta V_{as}(n)$ and the change in error:

$$\Delta V_{as}(n) = V_{as}^{*}(n) - V_{as}(n) \tag{5}$$
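The fuzzification of the two inputs and the AND operation between them can be sketched as follows. The paper does not give the membership function shapes or ranges in the text, so the evenly spaced triangular sets on a normalised range of -1 to 1, the use of the minimum as the AND operator, and the function names are all assumptions of this example.

```python
LABELS = ["NL", "NM", "NS", "ZO", "PS", "PM", "PL"]
CENTERS = [-1.0, -2 / 3, -1 / 3, 0.0, 1 / 3, 2 / 3, 1.0]  # assumed, evenly spaced
HALF_WIDTH = 1 / 3  # assumed half-width of every triangular set

def fuzzify(x):
    """Return the membership degree of a normalised input in each of the seven sets."""
    x = max(-1.0, min(1.0, x))  # saturate inputs outside the assumed range
    return {
        label: max(0.0, 1.0 - abs(x - center) / HALF_WIDTH)
        for label, center in zip(LABELS, CENTERS)
    }

def firing_strengths(error, change_in_error):
    """AND (minimum) of the two input memberships for every rule that fires."""
    mu_e = fuzzify(error)
    mu_de = fuzzify(change_in_error)
    return {
        (label_e, label_de): min(deg_e, deg_de)
        for label_e, deg_e in mu_e.items() if deg_e > 0
        for label_de, deg_de in mu_de.items() if deg_de > 0
    }
```

Each key of the returned dictionary identifies one cell of the truth table (error label, change-in-error label); the consequent stored in that cell would then be weighted by the firing strength during defuzzification, which is not shown here.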
The truth table is formed in the same way as Table 1 and follows the same rules. The output of the fuzzy logic controller is the amplitude $I_{qd}^{*}$ of the three-phase quadrature current, which is converted to single-phase currents with the aid of the unit quadrature current vectors $u_{srd}$, $u_{syd}$, $u_{sbd}$:

$$i_{srd}^{*} = I_{qd}^{*} \cdot u_{srd}, \quad i_{syd}^{*} = I_{qd}^{*} \cdot u_{syd}, \quad i_{sbd}^{*} = I_{qd}^{*} \cdot u_{sbd} \tag{6}$$

The unit quadrature current vectors are

$$u_{srd} = \frac{-u_{syp} + u_{sbp}}{1.732}, \quad u_{syd} = \frac{\sqrt{3}\, u_{srp} + u_{syp} - u_{sbp}}{0.67}, \quad u_{sbd} = \frac{\sqrt{3}\, u_{srp} + u_{syp} - u_{sbp}}{0.67} \tag{7}$$
The reference current is obtained by adding the in-phase and quadrature currents:

$$i_{sr}^{*} = i_{srp}^{*} + i_{srd}^{*}, \quad i_{sy}^{*} = i_{syp}^{*} + i_{syd}^{*}, \quad i_{sb}^{*} = i_{sbp}^{*} + i_{sbd}^{*} \tag{8}$$
The PWM stage receives the total reference current and the supply current and generates the gating signals required by the inverter, which then operates to cancel the harmonics. Gate signal generation is the most important task carried out in this work, because it drives the switching action of the proposed inverter. The fuzzy editors, before and after updating, are shown in Figs. 3, 4, 5 and 6.
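The hysteresis band comparison that turns the current error into gating signals can be illustrated with a minimal sketch. The switching polarity (upper switch on when the supply current is too low), the band value, and the function names are assumptions for illustration; the paper does not list them.

```python
def hysteresis_switch(i_ref, i_actual, band, previous_state):
    """Return the upper-switch state (1 = on, 0 = off) for one inverter leg."""
    error = i_ref - i_actual
    if error > band:
        return 1  # current below the band: switch on (assumed polarity)
    if error < -band:
        return 0  # current above the band: switch off
    return previous_state  # inside the band: keep the previous state

def gate_signals(i_refs, i_actuals, band, previous_states):
    """Apply the hysteresis rule to the three phases r, y, b."""
    return [
        hysteresis_switch(ref, actual, band, state)
        for ref, actual, state in zip(i_refs, i_actuals, previous_states)
    ]
```

For example, `gate_signals([1.0, 0.0, -1.0], [0.5, 0.05, -0.5], 0.1, [0, 1, 1])` switches the first leg on, leaves the second leg unchanged because its error lies inside the band, and switches the third leg off.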
Fig. 3 FIS editor
Fig. 4 Membership function editor
Fig. 5 Updated membership function IS editor
Fig. 6 Updated rule editor
4 Results and Analysis with Simulation Method
In this work, a non-linear load is connected to the system, and it generates harmonics because of its non-linear characteristics. The designed filter is connected and injects a compensating current that is equal in magnitude and 180° out of phase. Steady-state and dynamic-state results are presented.
Fig. 7 Capacitor current at steady state
Fig. 8 Top: voltage across load; middle: source current; bottom: current through load, without filter
4.1 Steady State Analysis
In the steady-state analysis, the model is run without and with the filter. The capacitor current is shown in Fig. 7, and the load voltage, load current, and supply current before and after filtering are presented in Figs. 8 and 9. The FFT analyses of the source currents of Figs. 8 and 9, before and after the filter is connected, are presented in Figs. 10 and 11, respectively.
Fig. 9 Top: voltage across load; middle: source current; bottom: current through load, with filter

Fig. 10 Source current THD without filter

4.2 Dynamic State Analysis
In the dynamic-state analysis, the model is again run without and with the filter. The capacitor voltage in the dynamic state is shown in Fig. 12, and the load voltage, load current, and supply current before and after compensation are presented in Figs. 13 and 14. The FFT analysis of the source current after filtering is shown in Fig. 15, which indicates that the harmonic level is reduced from 26.1 to 1.77% in the dynamic state. The developed control technique not only minimizes the harmonic content but also raises the power factor to nearly 1 after compensation. The waveform used to assess the power factor is presented in Fig. 16, and the harmonic magnitudes are listed in Table 2.
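The THD figures quoted from the FFT analyses follow the usual definition: the root sum of squares of the harmonic amplitudes divided by the fundamental amplitude. The sketch below applies that definition to a synthetic waveform with a single-bin DFT; the waveform, the number of samples, and the function names are hypothetical and do not reproduce the simulated source current.

```python
import math

def harmonic_amplitude(samples, harmonic):
    """Amplitude of one harmonic from a single-bin DFT over exactly one cycle."""
    n = len(samples)
    w = 2.0 * math.pi * harmonic / n
    re = sum(s * math.cos(w * k) for k, s in enumerate(samples))
    im = sum(s * math.sin(w * k) for k, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def thd_percent(samples, max_harmonic=13):
    """THD in percent of the fundamental, using harmonics 2..max_harmonic."""
    fundamental = harmonic_amplitude(samples, 1)
    harmonics = [harmonic_amplitude(samples, h) for h in range(2, max_harmonic + 1)]
    return 100.0 * math.sqrt(sum(a * a for a in harmonics)) / fundamental

# Synthetic current (assumed): fundamental plus 20% fifth and 10% seventh harmonic.
N = 256
CURRENT = [
    math.sin(2 * math.pi * k / N)
    + 0.2 * math.sin(2 * math.pi * 5 * k / N)
    + 0.1 * math.sin(2 * math.pi * 7 * k / N)
    for k in range(N)
]
```

Because the samples cover exactly one fundamental cycle, each harmonic falls on its own DFT bin and `thd_percent(CURRENT)` equals 100·sqrt(0.2² + 0.1²) up to rounding error.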
Fig. 11 Source current THD with filter
Fig. 12 Capacitor current at dynamic state
Fig. 13 Top: voltage across load; middle: source current; bottom: current through load, without filter
Fig. 14 Top: voltage across load; middle: source current; bottom: current through load, with filter

Fig. 15 FFT analysis of source current in dynamic state

Fig. 16 Power factor analysis waveform

Table 2 Magnitude after filtering in steady state and dynamic state
Harmonic order   THD steady state (%)   THD dynamic state (%)
60 Hz (Fund)     100.00
3–180 Hz         0.03                   0.03
5–300 Hz         0.06                   0.07
7–420 Hz         0.09                   0.1
9–540 Hz         0.03                   0.04
11–660 Hz        0.01                   0.02
13–780 Hz        0.02                   0.02

Table 3 Magnitude of THD compared with the existing technique of [4]
Type of load used   Before compensation   After compensation (existing)   After compensation (enhanced)
Non-linear load     25.08                 4.65                            2.36
Dynamic load        24.27                 4.05                            1.56

5 Conclusions
The proposed shunt active filter and its control technique are implemented and verified through simulation under static and dynamic load conditions. The two fuzzy controllers, used in conjunction with the indirect current control technique, prove effective by reducing the harmonic level from 25.08 to 2.36% in the static condition and to 1.56% of the fundamental in the dynamic condition. Table 3 presents a comparative analysis with one existing technique. The results obtained with the enhanced technique are very encouraging and can be carried forward to real-time analysis. The power factor is also improved; hence, overall power quality is improved.
Appendix
S. No.   Parameters                   Value
1        Source voltage (rms/phase)   40 V
2        Frequency                    60 Hz
3        Bridge inductor              3.94 mH
4        Bridge resistor              0.1 Ω
5        DC capacitor                 650 µF
6        Line resistance              0.25 Ω
7        Line inductance              2.5 mH
8        Load resistance              10 Ω
9        Load inductance              5.0 mH
References
1. Akagi, H.: New trends in active filters for power conditioning. IEEE Trans. Ind. Appl. 32, 1312–1322 (1996)
2. Mohan, H.: A novel approach to minimize line current harmonics in the interfacing power electronics equipment with 3-phase utility systems. IEEE Trans. Power Deliv. 8, 1395–1401 (1993)
3. Akagi, H., Kanazawa, Y., Nabae, A.: Instantaneous reactive power compensators comprising switching devices without energy storage components. IEEE Trans. Ind. Appl. IA-20, 625–630 (1984)
4. Prasanna, L., Puhan, P.S.: PI controller based new shunt active power filter. J. Inf. Comput. Sci. 9(9), 16–23 (2019)
5. Dash, A., Ray, P.K.: Performance enhancement of PV-fed unified power quality conditioner for power quality improvement using JAYA optimized control philosophy. Arab. J. Sci. Eng. 44, 2115–2129 (2019)
6. Puhan, P.S., Ray, P.K., Panda, G.: Development of real time implementation of 5/5 rule based fuzzy logic controller shunt active power filter for power quality improvement, 17(6), 607–617 (2016)
7. Panda, G., Dash, S., Ray, P.K., Puhan, P.S.: Performance improvement of hysteresis current controller based three-phase shunt active power filter for harmonics elimination in a distribution system, ICAC3. Springer, Fr.CRCE, Mumbai (2013)
8. Dash, S., Ray, P.K.: Design and modeling of single-phase PV-UPQC scheme for power quality improvement utilizing a novel notch filter-based control algorithm: an experimental approach. Arab. J. Sci. Eng. 43, 3083–3102 (2018)
9. Puhan, P.S., Ray, P.K., Panda, G.: Real time harmonics estimation of distorted power system signal. Int. J. Electric. Power Energy Syst. 75, 91–98 (2016)
10. Puhan, P.S., Sandeep, S.D.: Real Time Neuro-Hysteresis Controller Implementation in Shunt Active Power Filter. ICETE. Springer, OU (2019)
11. Subjak, J.S., Mcquilkin, J.S.: Harmonics causes, effects, measurements, analysis: an update. IEEE Trans. Ind. Appl. 26, 1034–1042 (1990)
12. Singh, B., Al-Haddad, K., Chandra, A.: A review of active filters for power quality improvement. IEEE Trans. Ind. Electron. 46(5), 960–971 (1999)
13. Tey, L.H., So, P.L., Chu, C.H.: Neural network-controlled unified power quality conditioner for system harmonics compensation. In: Proceedings of IEEE/PES Transmission and Distribution Conference and Exhibition: Asia Pacific, vol. 2, pp. 1038–1043 (2002)
14. Puhan, P.S., Ray, P.K., Panda, G.: A comparative analysis of shunt active power filter and hybrid active power filter with different control technique applied for harmonic elimination in a single phase system. Int. J. Model. Identif. Control 24(1), 19–28 (2015)
15. Puhan, P.S., Ray, P.K., Panda, G.: A comparative analysis of artificial neural network and synchronous detection controller to improve power quality in single phase system. Int. J. Power Electron. Indersci. 9(4), 385–401 (2018)
16. Bimal, K.: An adaptive hysteresis band current control technique of a voltage fed PWM inverter for machine drives system. IEEE Trans. Ind. Electron. 37(5) (1990)
17. Ametani: Harmonic reduction in thyristor converters by harmonic current injection. IEEE Trans. Power Appl. Syst. 95(2), 441–449
18. Chandra, A., Singh, B., Haddad, K.AI.: An improved control algorithm of shunt active filter for voltage regulation, harmonic elimination, power factor correction, and balancing of non-linear loads. IEEE Trans. Power Electron. 15, 495–507 (2000)
19. Moran, L.A., Dixon, J.A., Wallac, R.R.: A three phase active power filter operating with a fixed switching frequency for reactive power and current harmonics compensation. IEEE Trans. Ind. Appl. 42, 402–408 (1995)
20. Gupta, N., Dubey, S.P., Sigh, S.P.: Neural network based active filter for power quality improvement. IEEE PES General Meeting, Providence, RI, USA (2010)
Analysis of Covid Confirmed and Death Cases Using Different ML Algorithms G. Naga Satish, Ch. V. Raghavendran, and R. S. Murali Nath
Abstract In recent years, machine learning has played a leading role in image recognition, spam filtering, natural language commands, product recommendation, and medical diagnosis. Machine learning algorithms are now also used to analyze diseases and to find relationships between confirmed cases and deaths. In this paper, we use the root mean square error (RMSE) to analyze COVID confirmed cases and deaths with linear regression, decision trees, and random forests, and we show that deaths are few compared with confirmed cases in our country, India.
Keywords Diseases · Linear · Random · Decision · Relationships · Analyze
1 Introduction
Coronaviruses (CoV) are a large family of viruses that cause illness ranging from the common cold to more severe diseases such as Middle East respiratory syndrome (MERS-CoV) and severe acute respiratory syndrome (SARS-CoV). A novel coronavirus (nCoV) is a new strain that has not previously been identified in humans. Coronaviruses are zoonotic, meaning they are transmitted between animals and people. Detailed investigations found that SARS-CoV was transmitted from civet cats to humans and MERS-CoV from dromedary camels to humans. Several known coronaviruses are circulating in animals that have not yet infected humans. Common signs of infection include respiratory symptoms, fever, cough, shortness of breath, and breathing difficulties. In more severe cases, infection can cause pneumonia, severe acute respiratory syndrome, kidney failure, and even death. Standard recommendations to prevent the spread of infection include regular hand washing, covering the mouth and nose when coughing and sneezing, and thoroughly cooking meat and eggs. Close contact with anyone showing symptoms of respiratory illness, such as coughing and sneezing, should be avoided.
Many factors could contribute to the growth of the mortality rate among COVID patients. The total number of cases can be divided into age groups, and it is observed that the death rate increases as patient age increases. Both the age and the health condition of the patient contribute to susceptibility. A person in their 30s with a very weak immune system can also be highly susceptible. Although this is exceptional, it is said that with age the lungs become less elastic and less resilient than before.

G. Naga Satish (B) · R. S. Murali Nath Professor, Department of Computer Science and Engineering, BVRIT HYDERABAD College of Engineering for Women, Hyderabad, Telangana, India e-mail: [email protected]
Ch. V. Raghavendran Professor, Department of Information Technology, Aditya College of Engineering and Technology, Surampalem, AP, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_7
2 Objective
Real-time identification of diseases and deficiency symptoms, and provision of a remedy, is not always possible. Many myths are circulating about COVID-19, the disease caused by the novel coronavirus, so it is important to know what is factual and what is not:
(1) One can protect oneself from COVID-19 by injecting, swallowing, bathing in, or rubbing onto the body bleach, disinfectants, or rubbing alcohol: FALSE
(2) A vaccine to cure COVID-19 is available: FALSE
(3) The virus was deliberately created: FALSE
(4) Older people are at higher risk from COVID-19 than young or middle-aged people, and the risk is far lower in children aged 0 to 9 years: TRUE
(5) Men are more susceptible to this virus than women: TRUE
Our objective is to show that deaths are few compared with confirmed cases. We provide the analysis using different machine learning algorithms.
3 Literature Review
COVID-19 has spread widely around the world. Healthcare stakeholders from different industries have prioritized the consequences of the epidemic. Market leaders have used advanced big data analytics and other technologies to project the impact of the virus, understand its nature, and protect those who are most susceptible to severe illness. People with weaker immune systems are also more likely to have heart disease, lung disease, diabetes mellitus, or kidney disease, which can weaken their body's ability to fight such an infectious disease. In many countries, they are more likely to be in institutional settings such as nursing homes, old-age homes, or retirement homes, or to live with family in a highly crowded area where the risk of infection is greater. Another challenge is that they may face isolation or mobility problems. Because they are secluded, they may not receive information about the disease, and even if they are aware of it, they may not know what to do. Above all, in such conditions they may be unable to obtain the food they need if supplies are out of stock and circumstances become more difficult. In many areas, the elderly are more likely to live in poverty, which can make it harder for them to obtain the groceries and other necessities they need to take care of themselves. Poverty presents a whole range of health-related challenges. In such situations, the virus has greater scope to spread and to increase in intensity.
4 Comparison of Different Algorithms
4.1 Linear Regression
Linear regression is used to find the relationship between confirmed cases and deaths. Simple linear regression is useful for finding the relationship between two continuous variables (Figs. 1 and 2). The core idea is to obtain the line that best fits the data, that is, the line for which the total prediction error over all data points is as small as possible (Figs. 3 and 4).
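The best-fit line and the RMSE used to score it can be written out directly from their definitions. The sketch below uses the closed-form least-squares solution for one predictor; the numbers are made up for illustration and are not the COVID data set analyzed in the paper, which reports its results from the notebook shown in Figs. 1 to 4.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Made-up illustrative values (not the paper's data).
CONFIRMED = [100, 200, 300, 400, 500]
DEATHS = [3, 5, 10, 11, 16]

SLOPE, INTERCEPT = fit_line(CONFIRMED, DEATHS)
PREDICTED = [SLOPE * c + INTERCEPT for c in CONFIRMED]
ERROR = rmse(DEATHS, PREDICTED)
```

The same `rmse` function can score any of the models compared later, which is what makes the RMSE values of the three algorithms directly comparable.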
Fig. 1 Viewing the data set
Fig. 2 Finding the linear regression
Fig. 3 Plotting graph confirmed versus deaths
Fig. 4 Root mean square error and score

4.2 Decision Trees
As the name suggests, a decision tree is a hierarchy that supports decision making. It can be used for both classification and regression and is an important predictive learning algorithm. Unlike other methods, it works intuitively, taking decisions one by one. Because it is non-parametric, it is fast and efficient. The pseudocode for constructing a decision tree is as follows:
1. Create a root node and assign all of the training data to it.
2. Select the best split attribute according to a chosen criterion.
3. Add a branch to the root node for each value of the split.
4. Divide the data into mutually exclusive subsets along the lines of the chosen split.
5. Repeat steps 2 and 3 for every leaf node until a stopping criterion is reached (Figs. 5, 6 and 7).
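The five steps above can be turned into a small regression tree. The sketch below makes several simplifying assumptions that are not stated in the paper: a single numeric feature, binary threshold splits, the sum of squared errors as the split criterion, and a depth limit plus a minimum leaf size as the stopping criterion. The function names and the toy data in the usage note are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared errors around the mean (the assumed split criterion)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(xs, ys, min_samples=2, depth=0, max_depth=3):
    """Grow a regression tree on one numeric feature; returns nested dicts."""
    # Stopping criterion (step 5): depth limit, too few samples, or one distinct x.
    if depth >= max_depth or len(ys) < 2 * min_samples or len(set(xs)) == 1:
        return {"value": mean(ys)}
    best = None
    for threshold in sorted(set(xs))[:-1]:  # step 2: try every candidate split
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if len(left) < min_samples or len(right) < min_samples:
            continue
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold)
    if best is None:
        return {"value": mean(ys)}
    threshold = best[1]
    # Steps 3-4: one branch per side of the split, with disjoint subsets of the data.
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    return {
        "threshold": threshold,
        "left": build_tree([x for x, _ in left], [y for _, y in left],
                           min_samples, depth + 1, max_depth),
        "right": build_tree([x for x, _ in right], [y for _, y in right],
                            min_samples, depth + 1, max_depth),
    }

def predict(tree, x):
    """Follow the branches until a leaf is reached."""
    while "value" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["value"]
```

With step-shaped toy data such as `xs = [1, 2, 3, 4, 5, 6, 7, 8]` and `ys = [1, 1, 1, 1, 9, 9, 9, 9]`, the root split lands at the step and `predict` returns the level of the matching side.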
Fig. 5 Plotting the graph
Fig. 6 Predicting and printing RMSE
4.3 Random Forests
Random forests are now widely used because the growth in computational power makes such intensive calculations practical. They should be used with care, because the model's output is like a black box. A random forest is a bootstrapping algorithm built on the decision tree model. The final prediction can simply be the mean of the individual tree predictions. Random forests give much more accurate predictions than simple regression models. In general, a large number of predictive variables and a large sample size give better results. The following points help to make a random forest model better:
1. Features with predictive power.
2. Low correlation between the trees in the forest.
3. The selected features and the chosen parameters, which have a strong impact on the model (Figs. 8 and 9).
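The bootstrapping and averaging described above can be sketched with a deliberately simple base learner. This is bagging of one-split trees on a single feature, assumed here for brevity; a full random forest also draws a random subset of features at each split, which cannot be shown with one feature. All names and parameter values are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Base learner: a one-split regression tree, as (threshold, left_mean, right_mean)."""
    overall = sum(ys) / len(ys)
    best = (float("inf"), None, overall, overall)
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
        cost = sum((y - mean_l) ** 2 for y in left) + sum((y - mean_r) ** 2 for y in right)
        if cost < best[0]:
            best = (cost, threshold, mean_l, mean_r)
    return best[1:]

def stump_predict(stump, x):
    threshold, mean_l, mean_r = stump
    # A stump without a threshold (all sampled x equal) predicts the overall mean.
    return mean_l if threshold is None or x <= threshold else mean_r

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def forest_predict(forest, x):
    """Final prediction: the mean of the individual tree predictions."""
    return sum(stump_predict(stump, x) for stump in forest) / len(forest)
```

Because every stump sees a different bootstrap sample, the individual predictions differ slightly, and averaging them is what reduces the variance relative to a single tree.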
5 Comparison of Algorithms
Table 1 compares linear regression, decision trees, and random forest. As the table shows, the root mean square error (RMSE) is lower for random forest than for the other algorithms.
Fig. 7 Decision tree based on values
Fig. 8 Feature scaling, training
Fig. 9 Evaluating the algorithm
Table 1 Comparison table
Model               RMSE
Linear regression   111.80
Decision trees      18.74
Random forest       12.14
6 Conclusion
The goal of every country is not only to be free of disease but also to ensure that all of its people are safe. By analyzing a dataset of COVID confirmed cases and deaths, we conclude that deaths are few compared with confirmed cases. The analysis can be extended to predict deaths or recoveries from the confirmed cases with respect to age and other factors.
How Good Are Classification Models in Handling Dynamic Intrusion Attacks in IoT? Lekhika Chettri and Swarup Roy
Abstract The Internet of things (IoT) is vulnerable to intrusion, which may lead to security threats in the IoT ecosystem. Because of its different architecture and protocol stack, a traditional intrusion detection system (IDS) does not work well for raising alarms during possible intrusions in IoT. Machine learning is one of the potential tools for effective intrusion detection. However, applying it in IoT may require customization to work with IoT traffic. The situation becomes worse when the attack patterns are not known a priori. To mislead an IDS, attackers frequently change their attack patterns. As a result, traditional machine learning methods usually fail to handle such dynamic intrusion effectively. In this work, we assess seven well-known classification models for their suitability for detecting novel/dynamic attacks in the IoT network. A system is most exposed when its detection system misclassifies a novel (unseen) attack as normal traffic. In our study, we assess this misclassification scenario for our candidate models. Our results reveal that random forest performs better in detecting seen IoT attacks, whereas SVM is superior in keeping a low rate of misclassifying dynamic attacks as regular traffic. Our investigation further concludes that the best IDS is not always the best detector for handling novel attacks.
Keywords Internet of things (IoT) · Dynamic attacks · Intrusion detection · Machine learning · Classification · Clustering · Security · Unseen attacks · Unknown attacks
L. Chettri · S. Roy (B) Network Reconstruction & Analysis (NetRA) Lab, Department of Computer Applications, Sikkim University, Gangtok, Sikkim, India e-mail: [email protected] L. Chettri e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_8
1 Introduction Over time, a large number of smart devices are getting connected to the Internet, forming a complex network, termed as the Internet of things (IoT). The essence of IoT, initially coined by Ashton [1] for supply chain management, gradually impacting our day-to-day life to offer a smarter digital lifestyle. One-way the smart devices making our life more comfortable; on the other hand, we are exposing ourselves towards growing cybersecurity threats. Though IoT devices have already deep-rooted its usage in our daily life, its security issues [2] and the vulnerability of the IoT devices are still questionable. Growing shreds of evidence of intrusions in IoT networks and devices poses fatal consequences, and it may even result in a threat to human life as well as the economy too [3–6]. A smart solution where a device is capable of selfdecision making may be a useful alternative for handling the above issues. Traditional machine learning-based solutions appear to be ineffective in handling such multiperspective intrusion attacks. Existing machine learning models majorly developed for the computer networks may not be suitable for IoT in its original form. It is because IoT is having its own set of protocols. There is no standard protocol stack for IoT yet, and the connected devices to the IoT network are immensely diverse. Besides, the application areas of IoT are relatively large, and some are severely fragile. Due to IoT data heterogeneity and the resource constraint nature of smart devices, it is always challenging to develop a capable machine learning-based solution. Simultaneously, attackers are becoming smarter as they mutate the signature of the know attacks or create new attacks frequently to bypass the intrusion detection tools easily. It is always challenging to handle priorly unseen attacks. Even sophisticated machine learning algorithms are limited in handling only the known attacks. 
When a network is exposed to novel attacks that alter regular traffic patterns in ways never seen before, the learning algorithm is forced to classify the new traffic into one of its predefined categories. The situation becomes worse if the existing detection system labels an unknown/dynamic¹ attack as normal because no prior attack signature exists. Zero-day and dynamic attacks are examples that have reportedly increased recently. With the alarming growth in the size of IoT, providing effective security against novel attacks has become a real challenge. In view of the recent trend of attacks on IoT networks, machine learning-based intrusion detection could prove to be a good combat strategy, but may not be the best. A plethora of work has addressed intrusion detection in traditional networks using machine learning. However, very little work has been done on detecting intrusions in IoT networks using machine learning [7, 8]. No attempt has been made to date to detect dynamic or previously unseen attacks in IoT networks.
1 The words novel, unseen, unknown, and dynamic are used interchangeably for an attack that has never been seen before.
How Good Are Classification Models in Handling Dynamic Intrusion …
Hence, it is crucial to evaluate traditional classification models to assess their effectiveness in detecting previously seen and unseen attacks in the IoT environment. In this work, we study the scenarios of seen and unseen intrusion attacks in IoT and assess the effectiveness of traditional classification techniques in detecting both types of attack. The rest of the paper is organized into four sections. Section 2 presents a brief overview of the scope of, and related work on, machine learning for intrusion detection in IoT. Section 3 discusses the suitability of the existing machine learning paradigm for handling unseen attacks. Section 4 provides the experimental analysis and results. We conclude the paper with closing remarks in Sect. 5.
2 Machine Learning in Intrusion Detection for IoT
An intrusion detection system (IDS) is an essential tool or software [9] for the protection of networks. The IDS monitors the traffic entering the network and alerts the system administrator when it detects a security violation. IoT devices form a heterogeneous network setup with very low accessibility, which makes it hard to design and apply specific security mechanisms. A traditional IDS may be applied in IoT by augmenting it with additional security layers specific to IoT. Recent trends show that cybercriminals keep evolving their attack patterns to evade detectors. A traditional IDS would therefore need regular updates, which are not feasible, especially for IoT networks. Machine learning-based solutions can be much more reliable and hassle-free than other contemporary tools, as they can adapt progressively to attack patterns. The rationale for using machine learning to build an effective intrusion detection technique is as follows.
• It helps to automate IDS devices by enhancing the analytic models so that they learn continuously from the available data.
• The evolved models can help produce reliable decisions without human intervention.
• If training covers the full possible scope of the data, the technique can yield near-perfect decisions.
Given the potential of machine learning techniques to enhance IDS performance, they appear to be a promising alternative for tackling security threats in IoT too. Very few works have so far been reported on tackling IoT intrusion attacks using a machine learning approach. Bostani and Sheikhan [10] employed a real-time hybrid approach, i.e. anomaly-based and specification-based intrusion detection modules. The proposed unsupervised model uses optimum-path forest clustering on incoming data packets. Two routing attacks, namely collector and selective attacks, are used for experimentation. The authors reported a true positive rate of 76.19% and a false positive rate of 5.92%, with a precision of 96.02%.
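The three figures quoted above are standard confusion-matrix ratios. As a minimal sketch of how such rates are computed, the helper below takes raw counts; the function name and the counts are hypothetical and are not taken from [10].

```python
def detection_rates(tp, fp, tn, fn):
    """Return (true positive rate, false positive rate, precision) as fractions.

    tp/fn count attack instances flagged/missed; fp/tn count normal
    instances flagged/passed. Empty denominators yield 0.0.
    """
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return tpr, fpr, precision


# Hypothetical counts, chosen only for illustration.
tpr, fpr, precision = detection_rates(tp=80, fp=5, tn=95, fn=20)
print(f"TPR={tpr:.2%}  FPR={fpr:.2%}  precision={precision:.2%}")
```

A high precision combined with a modest true positive rate, as in the reported result, means that few alarms are false but a sizeable share of attacks goes undetected.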
Table 1 Summary of different IDSs that use ML for IoT networks
| Sl. No. | Author | Method used | ML method | Application in IoT | Dynamic/unseen/unknown attack/traffic |
| 1 | Bostani and Sheikhan [10] | Real-time hybrid approach, i.e. anomaly and specification intrusion detection | Unsupervised optimum-path forest clustering | ✓ | ✗ |
| 2 | Diro and Chilamkurti [11] | – | Deep learning | ✓ | ✓ |
| 3 | Azmoodeh et al. [12] | Selection of sequence code classes as a resource | Deep learning | ✓ | ✗ |
Diro and Chilamkurti [11] proposed a novel deep learning-based approach to intrusion detection for IoT networks. The method is claimed to handle zero-day attacks and mutations of previously known cyber-attacks. It reported a promising outcome without dynamic attacks. However, none of the protocols particular to IoT was considered during solution modelling. Azmoodeh et al. [12] focused on military IoT environments and generated their own malware dataset. A deep learning approach is used for the selection of sequence code classes as a resource for classifying malware samples. The method is claimed to achieve an accuracy of 98%. A summary of all three methods is given in Table 1.
3 Machine Learning in Handling Dynamic Attacks
Machine learning techniques have been used extensively for several decades to improve network security, including malware detection, authentication, anti-jamming offloading, access control [14–20], and more. Traditional machine learning models usually categorize incoming traffic into one of the pre-trained classes. However, it is not always correct to assume that a detection system receives only inputs of a previously known kind. Now and then, a new attack tumbles into the network. By design, existing machine learning techniques treat such incoming traffic as normal or classify it into an existing attack class. Based on whether an attack appears during training, we categorize attacks into the following two types.
1. Seen Attack: If a learning model is trained with network traffic derived from attack or normal instances, i.e. the pattern or signature of the traffic is exposed to the model during training, we term it a seen attack. If the attack traffic does not change its definition or mutate from its previous definition during its lifetime, we term it a static attack.
2. Unseen Attack: In contrast, if traffic belongs to a family of attacks for which no definition exists, or the model has never seen any such traffic pattern during training, we term it an unseen attack. If an attack packet can mutate itself, or an already existing packet changes its attack definition, we term it a dynamic attack.
Handling unseen or dynamic intrusion attacks is still an open challenge, not only in IoT networks but in almost every type of communication network. Classical machine learning methods are compelled to categorize incoming traffic into one of the predefined classes irrespective of its unseen nature. In such scenarios, a false negative prediction may pose a severe threat to the target system. Next, we apply a few classical classification methods to IoT traffic and assess their qualitative behaviour in handling both seen and unseen IoT intrusion attacks.
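The forced categorization described above can be illustrated with a toy closed-set classifier. The sketch below uses a nearest-centroid rule on made-up two-dimensional features; the classifier, the feature values and the names are assumptions for illustration only and are not part of the study.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fit_centroids(samples):
    """samples maps each trained label to its list of feature tuples."""
    return {label: centroid(pts) for label, pts in samples.items()}


def predict(centroids, x):
    # A closed-set classifier can only answer with a label it was trained on.
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


# Hypothetical training data for the two trained classes.
train = {
    "normal": [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)],
    "attack": [(5.0, 5.0), (5.2, 4.8), (4.9, 5.1)],
}
model = fit_centroids(train)

# A point from an attack family never seen in training; it resembles
# neither class closely, yet it must receive one of the two labels.
unseen = (-3.0, -2.0)
label = predict(model, unseen)
print(label)
```

Here the unseen point happens to lie closer to the normal centroid, so it is admitted as normal traffic, which is exactly the false negative case the text warns about.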
4 Experimental Assessment
In the absence of real scenarios of unseen attacks, we assume a training and testing setup that creates a similar attack environment. We use a publicly available labelled IoT traffic dataset with known attacks. We apply a few classification methods to the dataset and report their performance in handling seen and unseen IoT attacks.
4.1 Simulating Unseen Attacks
We create the environment depicted in Fig. 1 to simulate both seen and unseen attacks. In the training phase (Fig. 1a), the learning model is trained with two types of DDoS attack, namely Ack and Junk, together with benign network traffic. The traffic is classified into two classes, i.e. normal and attack. During the testing phase (Fig. 1b), the model is given new attack traffic that was not used during training. The new traffic belongs to the DDoS-TCP attack, which is previously unknown to the model and hence is an unseen attack for it. Even though the novel attack traffic is unknown to the model, it is forcibly categorized into one of the two classes. Categorizing an unseen attack as an attack of an existing type has a relatively minor impact. Classifying it as benign, however, is far more damaging and can have fatal consequences. In the experiments, we study the behaviour of state-of-the-art classification models in handling unseen attacks.
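The setup of Fig. 1 can be sketched as a split in which whole attack families are withheld from training. The code below is a minimal illustration under stated assumptions: records are hypothetical `(features, family)` pairs, the 60/40 split and the function names are illustrative, and nothing here reproduces the authors' tooling.

```python
import random


def split_seen_unseen(records, seen_attacks, unseen_attacks,
                      test_fraction=0.4, seed=0):
    """Split records so that unseen attack families occur only in the test set.

    records: list of (features, family), where family is "benign" or an
    attack name. Benign traffic and seen attacks are shuffled and divided
    between train and test; unseen attacks are appended to the test set.
    """
    rng = random.Random(seed)
    seen = [r for r in records if r[1] == "benign" or r[1] in seen_attacks]
    unseen = [r for r in records if r[1] in unseen_attacks]
    rng.shuffle(seen)
    cut = round(len(seen) * (1 - test_fraction))
    return seen[:cut], seen[cut:] + unseen


def binary_label(family):
    """Map a traffic family to the two training classes used in the paper."""
    return "normal" if family == "benign" else "attack"


# Hypothetical toy records: ten per family with a single dummy feature.
records = [((float(i),), fam)
           for fam in ("benign", "ack", "junk", "tcp")
           for i in range(10)]
train, test = split_seen_unseen(records, {"ack", "junk"}, {"tcp"})
print(len(train), len(test))
```

Because the TCP family never reaches the training set, any label the model assigns to it at test time is necessarily one of the two trained classes.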
Fig. 1 Setup for simulating a virtual environment of seen and unseen attacks. a During training, the candidate model is trained with a limited number of attack samples. b Testing is performed after adding attack (TCP) samples that were absent during training
4.2 IoT Attack Dataset Used
We use the BoTIoT dataset, publicly available in the UCI repository.² BoTIoT comprises traffic samples belonging to DDoS attacks and benign traffic, generated using nine real-time IoT devices. It includes DDoS attacks such as ACK, SCAN, SYN, TCP, UDP, and UDP-Plain. BoTIoT was generated using two botnets, Gafgyt and Mirai. The motivation for selecting BoTIoT is that large-scale DDoS attacks exploiting IoT devices through Mirai have recently been reported [21–23]. Mirai is considered an exceptional type of botnet, and most DDoS attacks on IoT networks have been performed using it. The original dataset contains thousands of traffic records for each attack type under each botnet, amounting to millions of traffic records overall. For ease of experimentation, we draw a total of 11,000 entries by random sampling.
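Drawing a manageable subset by random sampling can be sketched as follows. The helper draws the same number of records from every class; the record layout, the per-class size and the function name are assumptions for illustration, since the exact sampling procedure is not given here.

```python
import random
from collections import defaultdict


def sample_per_class(records, per_class, seed=0):
    """Draw `per_class` records uniformly at random from every class label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for features, label in records:
        by_label[label].append((features, label))
    sample = []
    for label in sorted(by_label):
        # random.Random.sample raises ValueError if a class is too small.
        sample.extend(rng.sample(by_label[label], per_class))
    return sample


# Hypothetical toy records: twenty per class.
records = [((i, j), label)
           for j, label in enumerate(["benign", "ack", "junk"])
           for i in range(20)]
subset = sample_per_class(records, per_class=5)
print(len(subset))
```

Fixing the seed keeps the subset reproducible across runs, which matters when several classifiers are compared on the same sample.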
4.3 Classification Models Used
A total of seven classification models were selected for our experimentation. In selecting them, we chose different kinds of learning model, including probabilistic and non-probabilistic, generative and discriminative, linear and nonlinear, regression, decision tree, and distance-based models. K-nearest neighbour (KNN) [24, 25] is a simple, nonparametric, distance-based supervised learning algorithm. Naïve Bayes [26, 27] is a simple probabilistic classifier based on Bayes' rule. It has been widely used for network intrusion detection [13, 28]. Support vector machine (SVM) [29, 30] is a non-probabilistic maximum-margin classification model. It tries to find a dividing hyperplane that separates the classes with maximum margin. We also use decision tree (DT) [31] algorithms such as J48 and random tree. J48 is an optimized implementation of the C4.5 [31, 32] algorithm and builds a decision tree using the concept of information gain, or entropy, from information theory. Random forest [33] is an ensemble of decision trees, constructed by combining numerous decision trees to obtain a precise prediction model. Adaptive boosting (AdaBoost) [34] is an additive logistic regression model that can be used in conjunction with any other classifier to boost its performance. AdaBoost is considered to pair best with a decision tree. To avoid implementation bias, we use Weka 3.9.4³ to implement all the above classification algorithms.
2 https://archive.ics.uci.edu/ml/datasets/.
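To make the distance-based model concrete, the following is a minimal k-nearest-neighbour sketch with Euclidean distance and majority voting. It is not the Weka implementation used in the experiments; the training points and names are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, x, k=3):
    """Return the majority label among the k training points nearest to x.

    train: list of (feature_tuple, label); distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set.
train = [
    ((0.1, 0.2), "normal"), ((0.2, 0.1), "normal"), ((0.3, 0.3), "normal"),
    ((0.9, 0.8), "attack"), ((0.8, 0.9), "attack"), ((1.0, 1.0), "attack"),
]
print(knn_predict(train, (0.15, 0.15)))
print(knn_predict(train, (0.85, 0.90)))
```

Because the vote is taken over training labels only, KNN shares the closed-set limitation of the other models: a query from an unknown family still receives one of the trained labels.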
4.4 Results and Discussion
We evaluate the performance of different classifiers in detecting IoT attack traffic. Any classifier can assign an incoming instance to one of its predefined classes; however, performance varies with the learning methodology it adopts. In this section, we first assess how well the classifiers detect previously seen attacks. Traditional classifiers are unable to handle unseen attacks and wrongly assign them to one of the pre-existing classes. As stated earlier, in a growing attack landscape where novel attacks are introduced regularly, the consequences would be fatal if an unseen attack were classified as normal rather than being assigned, albeit wrongly, to some attack class. We assess such consequences when unseen attacks are handled by different classification models. Finally, we use cluster analysis to investigate possible reasons for the misclassification of unseen attacks as normal traffic.
4.4.1 Performance Analysis During Seen Attacks
We use 6600 and 4400 instances for training and testing, respectively. Using random sampling, we select an equal number of instances from each of the Ack, Junk, Scan, and Syn attack classes and from benign traffic. The overall performance of the different classifiers is listed in Table 2. For KNN, we select k = 3, the best value we observed after experimenting with a range of k values. Random forest shows the best performance, with 95% testing accuracy. Owing to the limited availability of attack types, we restrict our study to DDoS attacks. In reality, real-world attack types are many. Hence, the reported performance of the candidate classifiers indicates only a possible trend, and it is not appropriate to grade their merit or draw a conclusive remark from it.

Table 2 Performance of different candidate classifiers for handling seen traffic
| Sl. No. | Classifiers | Train accuracy (%) | Train error (%) | Test accuracy (%) | Test error (%) |
| 1 | KNN(3) | 99.13 | 0.86 | 87.54 | 12.45 |
| 2 | Naïve Bayes | 74.86 | 25.13 | 66.40 | 33.59 |
| 3 | Decision tree | 99.93 | 0.06 | 90.54 | 9.54 |
| 4 | Random forest | 100 | 0 | 95.09 | 4.90 |
| 5 | SVM | 85.86 | 14.13 | 76.11 | 23.81 |
| 6 | AdaBoost (KNN) | 99.13 | 0.86 | 87.54 | 12.45 |
| 7 | AdaBoost (decision tree) | 100 | 0 | 93.29 | 6.70 |

3 https://www.cs.waikato.ac.nz/ml/weka/.
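As a small worked illustration of the seen-attack comparison, the snippet below ranks the classifiers by the test accuracies printed in Table 2 and derives the complementary error. The accuracy values are copied from the table; the variable names are illustrative, and the computed complement may differ slightly from the printed error column.

```python
# Test accuracy (%) as printed in Table 2.
test_accuracy = {
    "KNN(3)": 87.54,
    "Naive Bayes": 66.40,
    "Decision tree": 90.54,
    "Random forest": 95.09,
    "SVM": 76.11,
    "AdaBoost (KNN)": 87.54,
    "AdaBoost (decision tree)": 93.29,
}

# Rank classifiers from highest to lowest test accuracy.
ranking = sorted(test_accuracy, key=test_accuracy.get, reverse=True)

# Error as the complement of accuracy, rounded to two decimals.
test_error = {name: round(100 - acc, 2) for name, acc in test_accuracy.items()}

for name in ranking:
    print(f"{name:26s} accuracy={test_accuracy[name]:6.2f}  error={test_error[name]:5.2f}")
```

The ranking covers seen traffic only and says nothing about how the same classifiers behave on attack families absent from training.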
4.4.2 Handling Unseen Attacks
We simulate the environment of novel or unseen attacks by considering four attack types, namely Ack, Junk, Scan, and Syn, together with normal traffic. We introduce three unseen attack classes, namely TCP, UDP, and UDP-Plain, during testing or inference. Of the 4400 instances in the test set, 1520 belong to unseen traffic. As discussed above, traditional classifiers are not built to handle instances of unseen classes during inference, so misclassification is inevitable. However, considering real-life scenarios, we assess how often they misclassify an attack as normal traffic and thereby allow it into the network. The more a classifier assigns unseen attacks to some attack class rather than to normal, the less vulnerable we consider it to unseen or novel attacks. From Fig. 2, it is evident that SVM is the most reliable in handling unseen attacks in comparison with the other classifiers, whereas random forest, the best performer in the seen-attack scenario, appears to be the most vulnerable. We observe that algorithms that perform well on seen traffic perform comparatively poorly on unseen traffic. The plots in Fig. 3 clearly show that a concerning number of instances are classified as normal traffic although they belong to an attack class. The plots do not highlight wrongly classified unseen attack instances, because traditional classifiers ignore all instances of classes unknown to them. If the unseen attack instances were plotted, the graphs would contain more square boxes, i.e. more errors. The unseen attack traffic is forcibly classified into one of the existing classes, including normal. Although SVM is reliable in terms of rarely misclassifying attacks as normal, it has a high false positive rate for seen traffic, as is evident from Fig. 3d. If we consider the average performance over both scenarios (Fig. 4), we observe that decision tree (DT), AdaBoost-DT, and SVM maintain a balanced performance in terms of low misclassification error (seen attacks) and low false negatives.

Fig. 2 Percentage of unseen attacks misclassified as normal traffic by different classifiers
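The quantity compared across classifiers for unseen attacks is the share of unseen-attack instances that end up labelled as normal. A minimal sketch of that computation is given below; the function name, the family names and the toy predictions are hypothetical and do not reproduce the reported results.

```python
def unseen_as_normal_rate(true_families, predictions, unseen_families):
    """Percentage of unseen-attack instances that were predicted as "normal"."""
    unseen_preds = [pred for fam, pred in zip(true_families, predictions)
                    if fam in unseen_families]
    if not unseen_preds:
        return 0.0
    missed = sum(1 for pred in unseen_preds if pred == "normal")
    return 100.0 * missed / len(unseen_preds)


# Hypothetical ground-truth families and classifier outputs.
families = ["tcp", "udp", "udp_plain", "tcp", "benign", "ack"]
predictions = ["normal", "attack", "normal", "attack", "normal", "attack"]
rate = unseen_as_normal_rate(families, predictions, {"tcp", "udp", "udp_plain"})
print(f"{rate:.1f}% of unseen attacks were admitted as normal traffic")
```

Under this measure, an unseen attack assigned to any trained attack class counts as contained, which matches the vulnerability criterion used in the text.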
4.4.3 Cluster Analysis of Traffic Patterns
Here, we investigate possible reasons for the misclassification of seen and unseen attacks by different classifiers. Cluster analysis may provide a hint. The idea is that if traffic samples derived from two different attack classes are similar and not very distinctive, they will fall into a single cluster. Similarly, if the signatures of normal and attack traffic share similar characteristics, they may form overlapping clusters, in which case the chance of misclassification is higher. If unseen attack traffic behaves similarly to normal traffic, there is a high possibility that the unseen attack will be misclassified as normal traffic. We use K-means clustering [35] for cluster analysis of the traffic instances. K-means is a clustering approach in which k represents the desired number of clusters. An instance is allocated to the cluster whose centroid is closest, i.e. to which it is most similar compared with the other clusters' centroids. We use Euclidean distance to compute the similarity between two traffic samples.

Fig. 3 Class distribution inferred by five different classifiers for unseen attacks: (a) KNN for k = 3, (b) Naive Bayes, (c) Decision Tree, (d) SVM, (e) AdaBoost. In the plots, the x-axis and the y-axis represent actual and predicted classes, respectively. X indicates correctly classified instances and a box indicates wrongly classified instances. Different class instances are shown in different colours

Fig. 4 Average performance (in terms of classification error) of different classifiers in seen and unseen scenarios

Fig. 5 Clustering analysis of different attack traffic with varying values of k: (a) distribution of attacks and normal traffic, showing that they share the same clusters, (b) distribution of traffic in five different clusters, (c) clusters of seven attack types and normal traffic, showing mixed traffic distributions

Initially, we select k = 2 to observe the separability between attack instances (of any type) and normal traffic. Figure 5a shows that it is difficult to segregate normal traffic from attack traffic, as they share overlapping similarities. However, cluster 0 contains a 33% share of attack traffic and cluster 1 includes 67% of normal traffic. Next, we consider k = 5 for the four types of attack and the normal traffic used in the seen-attack scenario (Fig. 5b). This may help us to investigate inter-class similarities in attack signatures. We observe that, apart from Junk attack traffic, it is challenging to differentiate the attack classes from one another and from normal traffic. Scan attack traffic aligns more closely with normal traffic, which makes detecting Scan attacks difficult and may lead to their misclassification as normal traffic. Finally, we perform clustering with k = 8, considering seven attack classes (including the three newly introduced attacks, UDP-Plain, UDP, and TCP) and normal traffic. Here, we observe how similar normal traffic is to the three newly introduced attacks, since such similarity would explain why new attacks are classified as normal traffic. It is also important to analyse whether the new attacks have any significant similarity to the existing attack classes. The more similar they are to existing attack classes, the lower the chance that a classifier labels an unseen attack as normal. From Fig. 5c, we observe that 31.8% of normal traffic is clustered with UDP traffic and 41.8% with UDP-Plain (both new attacks). From our experimental results, we may further conclude that traffic similarity is a potential reason for the misclassification of unseen traffic as normal traffic. The behaviour of the classification model may determine whether unseen attacks are misassigned to the normal class or to some other attack class.
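The cluster analysis described in this subsection can be sketched with a plain K-means loop using Euclidean distance, followed by a count of which traffic types share each cluster. The implementation, the seeded random initialization and the toy points below are assumptions for illustration; they do not reproduce the clustering tool or the data used in the study.

```python
import math
import random
from collections import Counter, defaultdict


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm with Euclidean distance; returns (assignment, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, new_assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment, centroids


def cluster_composition(assignment, labels):
    """Count how many instances of each traffic type fall into each cluster."""
    counts = defaultdict(Counter)
    for cluster, label in zip(assignment, labels):
        counts[cluster][label] += 1
    return {cluster: dict(counter) for cluster, counter in counts.items()}


# Hypothetical two-feature traffic: the unseen attack overlaps with normal.
data = [
    ((1.0, 1.0), "normal"), ((1.1, 0.9), "normal"), ((0.9, 1.2), "normal"),
    ((8.0, 8.0), "attack"), ((8.2, 7.9), "attack"), ((7.9, 8.1), "attack"),
    ((1.2, 1.1), "unseen"), ((0.8, 0.9), "unseen"),
]
points = [p for p, _ in data]
labels = [label for _, label in data]
assignment, centroids = kmeans(points, k=2)
print(cluster_composition(assignment, labels))
```

In this toy case the unseen points share a cluster with normal traffic, mirroring the overlap that the text offers as a possible reason for unseen attacks being labelled as normal.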
5 Conclusion
In this paper, we performed a unique study of how traditional classification models behave in handling seen and unseen IoT attack traffic. The unseen attack is a plausible scenario in which a pre-trained IDS for IoT networks is exposed to novel attacks. We used seven well-known classifiers for benchmarking. The experimental results revealed that no single model is equally useful in both scenarios. They further confirm that one should not rely on even the best-performing IDS to prevent future novel attacks. Unless the model is updated incrementally, it remains vulnerable to novel or zero-day attacks. This is further evidence that either the existing methods need to be upgraded or new methods must be proposed to address IoT network security, given the open-ended nature of the threats.
References
1. Ashton, K., et al.: That ‘internet of things’ thing. RFID J. 22(7), 97–114 (2009)
2. Borgia, E.: The internet of things vision: key features, applications and open issues. Comput. Commun. 54, 1–31 (2014)
3. Bamakan, S.M.H., Wang, H., Yingjie, T., Shi, Y.: An effective intrusion detection framework based on mclp/svm optimized by time-varying chaos particle swarm optimization. Neurocomputing 199, 90–102 (2016)
4. Farnaaz, N., Jabbar, M.: Random forest modeling for network intrusion detection system. Procedia Comput. Sci. 89(1), 213–217 (2016)
5. Singh, R., Kumar, H., Singla, R.: An intrusion detection system using network traffic profiling and online sequential extreme learning machine. Expert Syst. Appl. 42(22), 8609–8624 (2015)
6. Wang, H., Gu, J., Wang, S.: An effective intrusion detection framework based on SVM with feature augmentation. Knowl.-Based Syst. 136, 130–139 (2017)
7. da Costa, K.A., Papa, J.P., Lisboa, C.O., Munoz, R., de Albuquerque, V.H.C.: Internet of things: a survey on machine learning-based intrusion detection approaches. Comput. Netw. 151, 147–157 (2019)
8. Zarpelão, B.B., Miani, R.S., Kawakani, C.T., de Alvarenga, S.C.: A survey of intrusion detection in internet of things. J. Netw. Comput. Appl. 84, 25–37 (2017)
9. Meng, W.: Intrusion detection in the era of IoT: building trust via traffic filtering and sampling. Computer 51(7), 36–43 (2018)
10. Bostani, H., Sheikhan, M.: Hybrid of anomaly-based and specification-based ids for internet of things using unsupervised OPF based on mapreduce approach. Comput. Commun. 98, 52–71 (2017)
11. Diro, A.A., Chilamkurti, N.: Distributed attack detection scheme using deep learning approach for internet of things. Future Gener. Comput. Syst. 82, 761–768 (2018)
12. Azmoodeh, A., Dehghantanha, A., Choo, K.K.R.: Robust malware detection for internet of (battlefield) things devices using deep eigenspace learning. IEEE Trans. Sustain. Comput. 4(1), 88–95 (2018)
13. Mukherjee, S., Sharma, N.: Intrusion detection using Naive Bayes classifier with feature reduction. Procedia Technol. 4, 119–128 (2012)
14. Narudin, F.A., Feizollah, A., Anuar, N.B., Gani, A.: Evaluation of machine learning classifiers for mobile malware detection. Soft. Comput. 20(1), 343–357 (2016)
15. Ozay, M., Esnaola, I., Vural, F.T.Y., Kulkarni, S.R., Poor, H.V.: Machine learning methods for attack detection in the smart grid. IEEE Trans. Neural Netw. Learn. Syst. 27(8), 1773–1786 (2015)
16. Alsheikh, M.A., Lin, S., Niyato, D., Tan, H.P.: Machine learning in wireless sensor networks: algorithms, strategies, and applications. IEEE Commun. Surv. Tutor. 16(4), 1996–2018 (2014)
17. Branch, J.W., Giannella, C., Szymanski, B., Wolff, R., Kargupta, H.: In-network outlier detection in wireless sensor networks. Knowl. Inf. Syst. 34(1), 23–54 (2013)
18. Xiao, L., Li, Y., Han, G., Liu, G., Zhuang, W.: Phy-layer spoofing detection with reinforcement learning in wireless networks. IEEE Trans. Veh. Technol. 65(12), 10037–10047 (2016)
19. Xiao, L., Li, Y., Huang, X., Du, X.: Cloud-based malware detection game for mobile devices with offloading. IEEE Trans. Mob. Comput. 16(10), 2742–2750 (2017)
20. Xiao, L., Xie, C., Chen, T., Dai, H., Poor, H.V.: A mobile offloading game against smart attacks. IEEE Access 4, 2281–2291 (2016)
21. Bertino, E., Islam, N.: Botnets and internet of things security. Computer 50(2), 76–79 (2017)
22. Kolias, C., Kambourakis, G., Stavrou, A., Voas, J.: Ddos in the IoT: Mirai and other botnets. Computer 50(7), 80–84 (2017)
23. Raza, S., Wallgren, L., Voigt, T.: Svelte: real-time intrusion detection in the internet of things. Ad Hoc Netw. 11(8), 2661–2674 (2013)
24. Cover, T., Hart, P.: Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13(1), 21–27 (1967)
25. Jagadish, H.V., Ooi, B.C., Tan, K.L., Yu, C., Zhang, R.: idistance: an adaptive b+-tree based indexing method for nearest neighbor search. ACM Trans. Database Syst. (TODS) 30(2), 364–397 (2005)
26. McCallum, A., Nigam, K., et al.: A comparison of event models for Naive Bayes text classification. In: AAAI-98 Workshop on Learning for Text Categorization, vol. 752, pp. 41–48. Citeseer (1998)
27. Zhang, H.: The optimality of Naive Bayes. AA 1(2), 3 (2004)
28. Panda, M., Patra, M.R.: Network intrusion detection using Naive Bayes. Int. J. Comput. Sci. Netw. Secur. 7(12), 258–263 (2007)
29. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
30. Cristianini, N., Shawe-Taylor, J., et al.: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press (2000)
31. Kotsiantis, S.B.: Decision trees: a recent overview. Artif. Intell. Rev. 39(4), 261–283 (2013)
32. Loh, W.Y.: Classification and regression trees. Wiley Interdisc. Rev. Data Min. Knowl. Discov. 1(1), 14–23 (2011)
33. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
34. Freund, Y., Schapire, R., Abe, N.: A short introduction to boosting. Journal-Japanese Society For Artificial Intelligence 14(771–780), 1612 (1999)
35. Jumutc, V., Langone, R., Suykens, J.A.: Regularized and sparse stochastic k-means for distributed large-scale clustering. In: 2015 IEEE International Conference on Big Data (Big Data), pp. 2535–2540. IEEE (2015)
Sediment Rating Curve and Sediment Concentration Estimation for Mahanadi River
Pratik Acharya, Tushar Kumar Nath, and Ram Babu Nimma
Abstract Sediment rating curves for the Mahanadi River have rarely been produced, so sediment rating curves are estimated for three medium-to-large tributaries of the Mahanadi River. Various curve-fitting techniques are applied to estimate the sediment rating curve. The dataset does not follow a log-normal distribution because of biased sampling, so a log-transformed linear fit cannot be applied to this rating curve. The Levenberg–Marquardt algorithm, in linear and nonlinear form, is applied to find the coefficients of the model, and the nonlinear power model fitted with the Levenberg–Marquardt algorithm is found to give the most appropriate statistical solution to the problem. The long-term annual sediment load determined by the nonlinear power model is found to be consistent with previously published results.
Keywords Rating curve · Regression analysis · Sediment load · Mahanadi River
P. Acharya (B) · T. K. Nath · R. B. Nimma Civil Engineering Department, Indira Gandhi Institute of Technology, Sarang, Dhenkanal, Odisha 759146, India e-mail: [email protected] T. K. Nath e-mail: [email protected] R. B. Nimma e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_9
1 Introduction
The Mahanadi is a major seasonal river in the state of Odisha in eastern India. The confluence of two major rivers, the Sankh and the South Koel, leads to the formation of the Mahanadi River. It flows through the districts of Sundargarh, Deogarh, Angul, Dhenkanal, Cuttack, Jajpur and Kendrapara. It is located at 22° 15′ N and 84° 47′ E. The Sankh originates near the Jharkhand–Chhattisgarh border. The South Koel rises in Jharkhand, near Lohardaga. It lies on the other side of the watershed of the Damodar River. The sediment rating curve is widely used as an empirical technique for estimating sediment concentration [1]. The sediment load of a river is affected by climatic and land cover changes within its catchment [2]. Owing to population increase and rapid economic growth, human activities are seriously affecting the watershed and its land cover, which in turn affects sediment concentration in the river [3]. Sediment concentration is thus altered by land cover changes caused by human activity. The rating curve links two variables: discharge (Q), treated as the predictor variable, and sediment concentration (C), treated as the response variable [4]. The relationship between sediment concentration and discharge can be expressed as $C = aQ^{b}$, where a is the rating coefficient and b is the exponent. Another method of formulating the rating curve equation [5] uses a simple linear algorithm. Beck et al. (1995) noted that, because parameter estimation is based on observed data, these models suffer from problems associated with the uniqueness of the best fit. Sediment rating curves represent a linear or nonlinear relationship between sediment concentration and streamflow. The power function is the most common form, giving the average relationship between stream discharge (Q) and suspended sediment concentration (C). The quality of the model depends on the fitting method. The objective of this work is to test various sediment rating techniques on the observed datasets for three medium-to-large tributaries of the Mahanadi River. The assessment was carried out by regression analysis of the given datasets.
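Taking logarithms of the power-law rating curve shows why a log-linear fit is a natural candidate. This is a standard identity rather than a step stated by the authors, and it assumes Q > 0 and C > 0:

```latex
\[
C = aQ^{b}
\quad\Longrightarrow\quad
\ln C = \ln a + b\,\ln Q ,
\]
```

so a straight line fitted to the pairs (ln Q, ln C) has slope b and intercept ln a. Least squares in this transformed space minimizes errors in ln C rather than in C, which is one reason the two fits can disagree.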
2 River Mahanadi Basin
The Mahanadi is one of the major rivers of India; it flows towards the eastern side of the country and drains into the Bay of Bengal. The basin lies between latitudes 19° 21′ and 23° 35′ N and longitudes 80° 30′ and 86° 50′ E. The river is 850 km long, and its basin extends over an area of about 142,000 km². The main tributaries of the Mahanadi River are the Mand, Hasdeo and Seonath. The average annual rainfall in the Mahanadi River basin is 155 cm, and the basin falls within the tropical climate region. The geology of the basin consists mainly of three components, i.e., the Precambrian rocks of the Eastern Ghats; the limestones, shales and sandstones of Gondwana affinity; and the deposits of deltaic and littoral alluvium. The basin comprises different lithologies in various proportions: 34% granite, 7% khondalite, 15% charnockite, limestone and shale of lower Gondwana age, 22% sandstone and shale of upper Gondwana age, and 5% coastal alluvium. Deposits of iron ore, limestone, bauxite, dolomite, copper and lead are also found in the basin. The drainage basin map of the Mahanadi River is shown in Fig. 1 [6].
Sediment Rating Curve and Sediment Concentration …
Fig. 1 Drainage basin map of the Mahanadi River
3 Methods
Fitting and using a sediment rating curve is the best way to prepare an accurate model [7]. This study tests the existing techniques. The following methods are applied to the given datasets:
1. Linear regression analysis (linear fit)
2. Log-linear regression analysis
3. Nonlinear regression analysis
The performance of the Levenberg–Marquardt algorithm was tested and the coefficient values were computed.
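The log-linear and nonlinear fits listed above can be sketched for the power law C = aQ^b as follows. This is a minimal pure-Python illustration, assuming positive discharges and concentrations; the function names, the damping schedule and the synthetic noise-free data are assumptions and do not reproduce the authors' software or measurements.

```python
import math


def fit_log_linear(Q, C):
    """Ordinary least squares on ln C = ln a + b ln Q; returns (a, b)."""
    x = [math.log(q) for q in Q]
    y = [math.log(c) for c in C]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return math.exp(my - b * mx), b


def sse(Q, C, a, b):
    """Sum of squared errors of C = a * Q**b in the original units."""
    return sum((c - a * q ** b) ** 2 for q, c in zip(Q, C))


def fit_power_lm(Q, C, a, b, max_iter=200, lam=1e-3):
    """Levenberg-Marquardt fit of C = a * Q**b starting from (a, b)."""
    best = sse(Q, C, a, b)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r for the two parameters.
        jaa = jab = jbb = ga = gb = 0.0
        for q, c in zip(Q, C):
            qb = q ** b
            da, db = qb, a * qb * math.log(q)  # partial derivatives
            r = c - a * qb                     # residual
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * r
            gb += db * r
        # Damped 2x2 normal equations (Marquardt scaling of the diagonal).
        m11, m22 = jaa * (1.0 + lam), jbb * (1.0 + lam)
        det = m11 * m22 - jab * jab
        if det == 0.0:
            break
        step_a = (ga * m22 - gb * jab) / det
        step_b = (gb * m11 - ga * jab) / det
        try:
            trial = sse(Q, C, a + step_a, b + step_b)
        except OverflowError:
            trial = float("inf")
        if trial < best:
            # Accept the step and trust the Gauss-Newton direction more.
            a, b, best = a + step_a, b + step_b, trial
            lam /= 10.0
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
            if lam > 1e10:
                break
    return a, b


# Synthetic, noise-free data generated from a = 2.0 and b = 0.5.
Q = [10.0, 50.0, 100.0, 200.0, 400.0, 800.0]
C = [2.0 * q ** 0.5 for q in Q]
a_log, b_log = fit_log_linear(Q, C)
a_lm, b_lm = fit_power_lm(Q, C, a=1.5, b=0.55)
print(a_log, b_log, a_lm, b_lm)
```

On noise-free data both fits recover the same coefficients; with scattered field data they generally differ, because the log-linear fit weights errors in ln C while the nonlinear fit weights errors in C itself.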
4 Results
4.1 Inspection of Initial Data
Any fitting technique or regression model should start with an inspection of the data. Scatter plots of sediment concentration against discharge of the Mahanadi River were drawn for each gauging station.
Fig. 2 Scatter plot of sediment load concentration versus water discharge for the Mahanadi River at a Basantpur, b Kurubhata, c Tikarpara gauging station
The sediment distribution appears to be skewed to the left, and the scatter increases significantly when the discharge is between 15,000 and 30,000 m³/s. Suspended sediment concentration varies because a notable amount of wash load enters from the surrounding river basin and from channel bank material. The scatter plots in Fig. 2 show that the quality of the curve is poor except at the Kurubhata gauging station. At the Basantpur station, the scatter of the sediment concentration increases between 300 and 400 m³/s. At the Kurubhata station, the discharge and sediment concentration do not fluctuate randomly. At the Tikarpara station, the scatter of discharge and sediment concentration fluctuates in the discharge range of 500–600 m³/s. Thus, as the discharge increases, the sediment concentration also changes at the Tikarpara and Kurubhata stations, whereas at Basantpur the sediment concentration does not change even though the discharge increases. Bed erosion is observed when the discharge exceeds 20,000–30,000 cumecs at the Basantpur and Tikarpara gauging stations. At the Kurubhata gauging station, the dominant discharge for erosion is about 2000 cumecs. The estimation quality of a sediment rating curve depends mainly on the nature of the dataset and on the analysis used for estimation. The correlation coefficient is defined by
\[ R = \left(1 - \frac{\sigma_{y,x}^{2}}{\sigma_{y}^{2}}\right)^{1/2} \tag{1} \]
where \(\sigma_{y}\) is the standard deviation of y, given as
\[ \sigma_{y} = \left[\frac{\sum_{i=1}^{n} (y_i - y_m)^2}{n-1}\right]^{1/2} \]
and
\[ \sigma_{y,x} = \left[\frac{\sum_{i=1}^{n} (y_i - y_{ic})^2}{n-2}\right]^{1/2} \]
The \(y_i\) are the actual values of y, and the \(y_{ic}\) are the values computed from the correlation equation for the same value of x. The correlation coefficient may also be written as
\[ R^{2} = \frac{\sigma_{y}^{2} - \sigma_{y,x}^{2}}{\sigma_{y}^{2}} \tag{2} \]
where R² is called the coefficient of determination. Scatter plots of sediment concentration versus water discharge for the Mahanadi River at the Basantpur, Kurubhata and Tikarpara gauging stations are shown in Fig. 2. At the Basantpur gauging station, R² is 0.0317, which indicates a poor fit, that is, substantial scatter around the straight line. R² is 0.5734 at the Kurubhata gauging station and 0.203 at the Tikarpara gauging station. The scatter plots of discharge and sediment concentration do not show a nonlinear power correlation. The least squares method, which minimizes the sum of the squared deviations, is adopted to find the best linear function. The average discharge over the last 15 years is shown as a hydrograph in Fig. 3, which indicates how the flow rate varies with time. The hydrographs at the various gauging stations fluctuate heavily, so the removal of sediment also fluctuates with the river discharge. Erosion is greatest at the dominant discharge: as the discharge approaches its peak values, the erosion of sediment increases accordingly.
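Equations (1) and (2) can be evaluated directly from paired observed and computed values. The sketch below follows the definitions in the text, with divisors n − 1 and n − 2; the function name and the sample numbers are hypothetical and are not taken from the study.

```python
import math

def coefficient_of_determination(y_obs, y_calc):
    """Return R^2 from Eq. (2), using sigma_y and sigma_{y,x} as defined in the text."""
    n = len(y_obs)
    y_mean = sum(y_obs) / n
    # sigma_y squared: variance of the observations about their mean
    var_y = sum((y - y_mean) ** 2 for y in y_obs) / (n - 1)
    # sigma_{y,x} squared: variance of the observations about the fitted values
    var_yx = sum((y - yc) ** 2 for y, yc in zip(y_obs, y_calc)) / (n - 2)
    return (var_y - var_yx) / var_y

# Hypothetical observations and values computed from a fitted line
observed = [2.0, 4.1, 5.9, 8.2, 9.8]
computed = [2.0, 4.0, 6.0, 8.0, 10.0]
r_squared = coefficient_of_determination(observed, computed)
r = math.sqrt(max(r_squared, 0.0))  # Eq. (1); clipped because a poor fit can give a negative value
print(f"R^2 = {r_squared:.4f}, R = {r:.4f}")
```

Because of the n − 1 and n − 2 divisors, this value can differ slightly from the more common 1 − SSE/SST form, and it can become negative when the fit is worse than the mean.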
4.2 Regression Models
Sediment rating curves are fitted to the datasets, and the resulting equations are listed in Table 1. The ordinary least squares (OLS) method is used to obtain the linear fit, and the nonlinear models were constructed with and without additive constants by using the Levenberg–Marquardt (L-M) and simplex algorithms. The accuracy of the regression models cannot be assessed directly because extensive empirical observations are not available for the study period. The rating curve equations in Table 1 can be used to calculate the sediment concentration.
Fig. 3 Hydrograph (discharge in cumecs versus time) for the Mahanadi River at a Basantpur, b Kurubhata, c Tikarpara gauging station
Table 1 Equations of sediment rating curve for the Mahanadi River
Model name | Tikarpara | Basantpur | Kurubhata
Linear fit (OLS) | S = 0.0003Q | – | S = 0.00241Q
Linear fit (OLS) with additive constant | S = 0.00003733Q + 0.0594 | S = 0.00008599Q + 0.05375 | S = 0.00227Q + 0.0544
Nonlinear fit | S = 0.0022Q^0.5811 | S = 0.003029Q^0.608613 | S = 0.5Q^1.1026
L-M nonlinear power fit with constant | S = 0.00224Q^0.5802 | S = 0.8088 ln(Q) + 19.798 | S = 0.00006Q^2 + 165.17
Regression model performance is generally judged by eye; however, visual inspection cannot substitute for a statistical assessment of model efficiency. The nonlinear power fit with a constant is the most efficient model. In this case, the simple linear regression and log nonlinear models gave poor results. At the Basantpur gauging station, the sediment concentration is estimated with the linear fit (OLS) with additive constant given in Table 1. The estimated sediment concentration does not vary widely from the observed data at Basantpur. The average error in the estimated sediment concentration at Basantpur is ±13.22%, from the data in Table 2. For the Kurubhata gauging station, the sediment concentration is likewise estimated with the linear fit (OLS) with additive constant given in Table 1. The estimated sediment concentration shows large variation from the observed data at Kurubhata, and in some years the sediment concentration is underestimated. The average error in the estimated sediment concentration at Kurubhata is ±9.46%. Similarly, at the Tikarpara gauging station, the sediment concentration is estimated with the regression equation given in Table 1. The estimated sediment concentration does not vary widely from the observed data at Tikarpara, although in some years it is underestimated. The average error in the estimated sediment concentration at Tikarpara is ±9.17%.
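The error figures quoted above can be reproduced along the following lines. The sketch assumes that the percentage error is the absolute residual divided by the estimated concentration, which is an inference from the tabulated values in Table 2 rather than a definition given in the text; the function names are hypothetical.

```python
def percent_error(observed, estimated):
    """Absolute residual as a percentage of the estimated value (assumed definition)."""
    return abs(estimated - observed) / estimated * 100.0

def mean_percent_error(rows):
    """Average the percentage error over (observed, estimated) pairs."""
    return sum(percent_error(obs, est) for obs, est in rows) / len(rows)

# (observed, estimated) pairs in mg/l from the 1975 and 1976 rows of Table 2
rows = [(135.76, 149.35), (138.88, 149.35)]
for obs, est in rows:
    print(f"residual = {abs(est - obs):.2f} mg/l, error = {percent_error(obs, est):.2f}%")
print(f"mean error = {mean_percent_error(rows):.2f}%")
```

Averaging this quantity over all years of a station's record gives a single figure of the kind reported for each gauging station.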
4.3 Curve Fitting
The sediment rating curves were fitted to the datasets, and the regression models are given in Table 1. The ordinary least squares method was used for the linear fit, and the nonlinear models were fitted with the Levenberg–Marquardt algorithm. The collected dataset carries uncertainty from instrumental error. Before model efficiency was examined, the estimated sediment concentration was plotted against discharge; the data show a large scatter at the Basantpur gauging station. The uncertainty of the different regression models is assessed by comparing the coefficient of determination of the observed and the estimated sediment concentration. The coefficient of determination (R²) indicates the adequacy of the statistical model. At the Basantpur gauging station, R² was 0.012 before and 0.015 after regression analysis, which indicates uncertainty in the dataset. Because of this uncertainty, no definite model exists for the Basantpur gauging station. At the Kurubhata gauging station, R² improved considerably from 0.5734 to 0.8073, which clearly indicates that the regression model is most reliable for Kurubhata. At the Tikarpara gauging station, R² was 0.0824 before and improved to 0.5719 after regression analysis.
The statistical analysis of the examined methods is based on the ordinary least squares method and nonlinear algorithms. The estimated and observed values are compared by graphical assessment of the scatter plots. Model adequacy is generally checked by visual inspection; however, visual inspection should not replace a statistical examination of model efficiency. The least squares method was adopted for the linear fit, and Levenberg–Marquardt for the nonlinear fit. The efficiency of the regression models cannot be judged properly because extensive empirical observations are not available for the analysis. The sediment rating curve equations can be used to estimate the sediment concentration at the different gauging stations of the Mahanadi River. The estimated sediment concentration is plotted against discharge for the various gauging stations in Fig. 4. At the Basantpur gauging station, the data fit the nonlinear power model. The linear model with additive constant fits the dataset at the Kurubhata gauging station. The nonlinear logarithmic model fits the data at the Basantpur gauging station. Because the data were collected with a high degree of uncertainty, the best-fitting model can be selected by visual inspection.
Table 2 Comparison between observed and estimated sediment concentration, with error estimation, at Basantpur gauge station
Year | Sediment concentration observed (mg/l) | Sediment concentration estimated (mg/l) | Residual (mg/l) | % error
1973 | 118.55 | 149.36 | 30.80 | 20.62
1974 | 155.17 | 160.99 | 5.82 | 3.61
1975 | 135.76 | 149.35 | 13.59 | 9.10
1976 | 138.88 | 149.35 | 10.47 | 7.01
1977 | 164.88 | 180.42 | 15.53 | 8.61
1978 | 170.44 | 149.35 | 21.10 | 14.13
1979 | 116.39 | 149.35 | 32.96 | 22.07
1980 | 189.42 | 149.35 | 40.97 | 27.43
1981 | 159.24 | 149.34 | 9.90 | 6.63
1982 | 187.59 | 149.34 | 38.24 | 25.61
1983 | 177.50 | 149.34 | 28.26 | 18.92
1984 | 157.30 | 149.34 | 7.96 | 5.33
1985 | 129.39 | 149.34 | 19.95 | 13.36
1986 | 122.35 | 149.34 | 26.99 | 18.07
1987 | 135.61 | 149.34 | 13.73 | 9.19
1988 | 129.36 | 149.34 | 19.98 | 13.38
1989 | 138.25 | 149.34 | 11.09 | 7.42
1990 | 141.08 | 149.34 | 8.26 | 5.53
1991 | 132.56 | 149.34 | 16.78 | 11.23
1992 | 122.32 | 149.34 | 27.02 | 18.09
1993 | 122.27 | 149.33 | 27.07 | 18.13
1994 | 132.25 | 149.33 | 17.08 | 11.44
1995 | 120.32 | 149.33 | 29.01 | 19.43
2000 | 115.36 | 149.33 | 33.97 | 22.75
2001 | 121.32 | 149.33 | 28.01 | 18.76
2002 | 125.36 | 149.33 | 23.97 | 16.05
2003 | 122.38 | 149.33 | 26.95 | 18.05
2004 | 123.65 | 149.33 | 25.68 | 17.20
2005 | 129.34 | 149.33 | 19.99 | 13.39
2006 | 132.35 | 149.33 | 16.98 | 11.37
2007 | 136.35 | 149.33 | 12.98 | 8.69
2008 | 135.36 | 149.33 | 13.97 | 9.35
2009 | 134.63 | 149.33 | 14.70 | 9.84
2010 | 139.35 | 149.33 | 9.98 | 6.68
2011 | 135.62 | 149.32 | 13.70 | 9.18
2012 | 136.32 | 149.32 | 13.00 | 8.71
2013 | 132.56 | 149.32 | 16.76 | 11.23
2014 | 135.24 | 149.32 | 14.09 | 9.43
[Fig. 4 panels plot sediment concentration (gm/l) against discharge (cumecs) with three curves: nonlinear fit, linear fit (OLS) with additive constant, and L-M nonlinear power fit with constant]
Fig. 4 Scatter plot of estimated sediment concentration versus water discharge for the Mahanadi River at a Tikarpara, b Basantpur, c Kurubhata gauging stations
5 Conclusions
The sediment rating curve was evaluated by nonlinear regression analysis. At most of the gauging stations, the nonlinear model was found to be the statistically optimal model for the problem. The long-term parameters for calculating sediment concentration in the study area were obtained with the nonlinear regression model. The annual sediment concentration was estimated with the regression model at the various river gauging stations. The observed and estimated data do not differ widely at the various gauging stations. The nonlinear model was found to be the most efficient and consistent model at most of the gauging stations.
References
1. Asselman, N.E.M.: Fitting and interpretation of sediment rating curves. J. Hydrol. 234(3), 228–248 (2000)
2. Khanchoul, K., Boukhrissa, Z.E., Majour, H.: Statistical modelling of suspended sediment transport in the Cherf drainage basin, Algeria. Revue Marocaine des Sciences Agronomiques et Vétérinaires 1(1), 13–17 (2012)
3. Begam, S., Barbhuiya, A.K.: Estimation of sediment yield of Sonai River using sediment rating curve and geographic information system. Int. J. Adv. Res. 5(9), 994–1000 (2017)
4. Syvitski, J.P.: On the deposition of sediment within glacier-influenced fjords: oceanographic controls. Mar. Geol. 85(2–4), 301–329 (1989)
5. Ndomba, P.M., Mtalo, F.W., Killingtveit, Å.: Developing an excellent sediment rating curve from one hydrological year sampling programme data: Approach. J. Urban Environ. Eng. 2, 21–27 (2008)
6. Central Water Commission, Government of India. www.cwc.gov.in
7. Mount, N.J., Abrahart, R.J.: Load or concentration, logged or unlogged? Addressing ten years of uncertainty in neural network suspended sediment prediction. Hydrol. Process. 25(20), 3144–3157 (2011)
An Energy-Efficient Routing with Particle Swarm Optimization and Aggregate Data for IOT-Enabled Software-Defined Networks
Krishnasamy Lalitha, Chinnasamy Poongodi, Shanmugam Anitha, and Duraisamy Vijay Anand
Abstract In the IoT era, Software-Defined Wireless Sensor Networks play a crucial role. In these networks, many sensors are placed in hostile areas. Because the sensors are limited in computational capability and energy, and because it is hard to replace them once they run out of battery, there is an obvious need for an energy-efficient, software-defined routing system that makes wireless sensor networks easy to manage. This paper recommends integrating Fork and Join Adaptive Particle Swarm Optimization (FJAPSO) with data aggregation in existing Software-Defined Wireless Sensor Networks. The enhanced FJAPSO uses dual optimization techniques to find the optimal number of control nodes. In simulation, the enhanced FJAPSO shows a significant improvement over FJAPSO in optimizing the size of the data to be transmitted, which in turn increases the lifetime of the sensor network.
Keywords Sensor networks · Energy optimization · Routing · Data aggregation · Energy efficiency
1 Introduction
A wireless sensor network (WSN) consists of a large number of densely deployed low-power sensor devices with limited communication and processing capabilities. The demand for WSNs arises when a remote area has to be monitored without human intervention. The major constraints are limited battery and low processing power.
K. Lalitha (B) · C. Poongodi · S. Anitha · D. Vijay Anand, Department of IT, Kongu Engineering College, Erode, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_10
Fig. 1 Structure of WSN
These constraints pave the way for the Internet of Things (IoT), which can be deployed over large regions to sense, transmit, communicate and process data in real time [1, 2]. Although IoT is applied in applications such as smart health monitoring, smart homes, retail, security and surveillance, a WSN is still required to monitor harsh environmental conditions. However, controlling and monitoring a densely deployed WSN over a large geographical area in real time is very difficult. This issue is overcome by the Software-Defined Wireless Sensor Network (SDWSN), which enables auto-monitoring and control. A generic structure of a WSN is shown in Fig. 1. The randomly deployed sensor nodes are generally organized into groups called clusters. Each cluster consists of a set of sensor nodes, named cluster members, and a head, called the cluster head (CH). The data collected from a member are processed and transmitted through the CH to the station management node, or Sink [3].
1.1 Software-Defined Wireless Sensor Networks
The need for auto-monitoring led to the Software-Defined Wireless Sensor Network (SDWSN), which removes the monitoring and control limitations of the traditional WSN. In an SDWSN, the sensor nodes can reconfigure their network monitoring functions dynamically and handle failures based on real-time sensing. It applies a distributed clustering approach that decouples the control plane from the data plane. The programmable control offered by SDWSN paves the way for multi-functional support of virtualization-oriented technologies such as Fog and Cloud Computing [4]. The general operational structure of a Software-Defined WSN is given in Fig. 2. Software-defined sensors are deployed randomly over the selected region and grouped into sets, each with a control node that takes the cluster head role. The remaining nodes are cluster members dedicated to the tasks assigned to them. In an SDWSN, most of the battery-hungry tasks are carried out by the controller node. The controller takes routing decisions and can tune the transmission range of its cluster members to reduce the interference caused by long-range communication among members.
Fig. 2 Structure of SDWSN
1.2 PSO in Software-Defined WSN
SDWSN focuses on optimizing energy consumption by reducing the control information sent while transferring data to the Sink and by tuning the transmission range autonomously to optimize the energy consumed for routing. Routing algorithms have been implemented with a pre-defined set of clusters to optimize the energy consumption. The network must be scalable and elastic, which adds routing overhead [5, 6]. The trade-off between distance and energy persists for as long as the network is alive. A fitness function is proposed that considers the distance between the control head (CH) and the Sink. PSO has been used to obtain an optimized number of clusters with controlled routing. However, PSO is computationally expensive when it has to meet all the expectations of a sensor network [7–9]. Hence, a significant modification of PSO, Fork and Join Adaptive Particle Swarm Optimization (FJAPSO), was proposed to achieve energy-efficient routing with two-level optimization. In FJAPSO, each cluster particle is forked into a pre-defined number of sub-particles in every iteration, and the sub-particles are merged back into a parent particle with the best forked solution. The forked sub-particles concentrate on efficient routing, whereas the parent particles are dedicated to optimizing the number of clusters.
2 Proposed System
The sensors deployed in a traditional WSN are replaced with software-defined wireless sensor nodes, which bring automatic monitoring of network failures and dynamic reconfiguration of their functions based on real-time sensing. The number of clusters is adjusted automatically based on the deployment area and the number of sensors. The routing and network overhead must be controlled to optimize the energy consumption.
2.1 Objectives
Massive sensor deployment and exponential growth in the IoT domain create the need for automatically reconfigurable sensor networks that reduce network failures and optimize routing. There is always a trade-off between cluster size and the distance between members, and it can be treated as an NP-hard problem. Since batteries cannot be replaced in remotely deployed sensors, optimizing the data to be transmitted and the routing path is essential. The main objective is to develop an algorithm that combines FJAPSO with data aggregation, called Enhanced Fork and Join Adaptive Particle Swarm Optimization (EFJAPSO). The EFJAPSO approach should consume less energy and increase the lifespan of the network.
2.2 Proposed Methodology
To increase the life span of the sensors, an effective clustering and routing mechanism is needed. In this paper, an EFJAPSO algorithm with data aggregation is implemented to improve the life span and power efficiency of the sensors. EFJAPSO has two main goals: selecting the optimal control head nodes and optimizing the routing path. EFJAPSO works iteratively and forks each particle into a pre-defined number of sub-particles. A procedure is formulated that considers the distance between sensor nodes within a cluster and between clusters, so that well-organized cluster sizes reduce the energy spent on transmission. The collected data are aggregated and stored in the control head (CH). Finally, the data set is transmitted to the Sink using a data cube approach. The meta-heuristic algorithm (EFJAPSO) starts with a pre-defined number of randomly generated particles, each with a position and a velocity vector. The particle position is treated as a continuous random value. A fitness function is used to evaluate the fitness of a particle. The personal best and global best solutions are evaluated with the proposed fitness function. The entire process of Fork and Join Adaptive PSO with data aggregation is shown in Algorithm 1.
Algorithm 1: EFJAPSO
1: Begin
2: for each particle do
3: Initialize the position, velocity and dimension of the particle randomly
4:
5: Calculate the fitness function
6:
7: end for
8: Find the personal best and global best solution for each particle
9: Repeat
10: Particle = Particle + 1
11: Add the inertia weight (w) for each particle
12: for i = 1 to t do
13:
14: Mutate the jth particle with parent particle with fitness function
end for
end if
end for, where i ≠ j and 1 ≤ i, j ≤ k
Until the termination criteria are met
End
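The personal-best and global-best bookkeeping that Algorithm 1 builds on can be illustrated with a plain particle swarm loop. This is a generic PSO sketch, not EFJAPSO: it omits forking, joining and data aggregation, and the fitness function (`sphere`), the coefficients `c1` and `c2`, the inertia weight and the search bounds are all placeholder assumptions.

```python
import random

def sphere(position):
    """Placeholder fitness: sum of squares (lower is better)."""
    return sum(x * x for x in position)

def pso(fitness, dim=2, n_particles=30, iterations=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    # Random initial positions, zero initial velocities
    positions = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term plus attraction to personal and global bests
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (pbest[i][d] - positions[i][d])
                                    + c2 * r2 * (gbest[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            fit = fitness(positions[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history

best_position, best_fitness, history = pso(sphere)
print(best_fitness)
```

In a clustering setting the placeholder fitness would be replaced by one that scores a candidate set of control heads, for example by intra-cluster and head-to-Sink distances as the text describes.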
2.3 Energy Model
An energy model is selected with the aim of reducing the energy consumed by the operations carried out. The LEACH energy model is adopted here because the implementation assumes random, static deployment of sensor nodes.
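The energy model usually associated with LEACH is the first-order radio model, in which transmission cost grows with the square of the distance below a crossover distance and with its fourth power above it. The sketch below assumes that model and commonly quoted constants; the paper does not state its parameter values, so every number here is an assumption.

```python
import math

# Commonly quoted first-order radio constants (assumed, not stated in the paper)
E_ELEC = 50e-9       # J/bit, electronics energy for transmitting or receiving
EPS_FS = 10e-12      # J/bit/m^2, free-space amplifier energy
EPS_MP = 0.0013e-12  # J/bit/m^4, multipath amplifier energy
D0 = math.sqrt(EPS_FS / EPS_MP)  # crossover distance in metres

def tx_energy(bits, distance):
    """Energy in joules to transmit `bits` over `distance` metres."""
    if distance < D0:
        return bits * E_ELEC + bits * EPS_FS * distance ** 2
    return bits * E_ELEC + bits * EPS_MP * distance ** 4

def rx_energy(bits):
    """Energy in joules to receive `bits`."""
    return bits * E_ELEC

print(tx_energy(4000, 50.0), rx_energy(4000))
```

Under these assumptions, shortening the member-to-head distance pays off quickly, which is why cluster size and head placement dominate the energy budget.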
2.4 Network Model
The sensor nodes are deployed randomly in the topological region, and each software-defined sensor has a unique identifier. The nodes are quasi-stationary and can readjust their transmission range based on the distance to the control head. The SD members are dedicated to pre-defined tasks, whereas the SD control head (SDCH) gathers the required data within a stipulated time and routes it to the other SDCHs and the Sink. Nodes are assumed to be homogeneous and remain unattended after remote deployment. The Sink is powered by an external source.
3 Performance Evaluation
The performance analysis was conducted in Matlab (2016a) on a 7th-generation processor with 8 GB RAM to analyze the proposed EFJAPSO with various SDWSN parameters. To analyze the behavior of the sensor nodes, they were deployed randomly, and the number of sensing nodes was varied from 100 to 1000. The proposed EFJAPSO forks the particle in each iteration with a given k, which leads to a state with a nondeterministic fitness function. To find the best value of k, a series of tests was executed; the results are shown in Fig. 3. The results reveal that the test produces the best fitness value with less computation time. The behavior of EFJAPSO is measured by parameters such as the normalized fitness value (NFV) and the inertia weight w, as given in [10]. The resultant value includes the initial population and the iteration count. The maximum inertia weight is set to 0.92, as recommended for FJAPSO. The tuning is done with the proposed adaptive tuner, and the results are shown in Fig. 4. The results indicate that the adaptive tuner works effectively: as expected, ω varies according to the changes in the NFV. If the NFV is lower than its value in the previous iteration, the particles are not moving in the direction of the best solution, so ω is increased in the following iterations to conduct a global search. With the help of the NFV observations, the inertia weight and the iteration counter, the switching between local and global search is reduced, as shown in Fig. 5.
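The tuning rule described above can be sketched as a small update function. Only the upper limit of 0.92 and the rule of raising ω when the NFV drops come from the text; the lower bound, the step size and the choice to lower ω otherwise are hypothetical, and the function name `tune_inertia` is illustrative.

```python
W_MAX = 0.92   # maximum inertia weight stated in the text
W_MIN = 0.40   # hypothetical lower bound
STEP = 0.05    # hypothetical adjustment per iteration

def tune_inertia(w, nfv_now, nfv_prev):
    """Raise w when the NFV dropped (favour global search); lower it otherwise (assumed)."""
    if nfv_now < nfv_prev:
        return min(W_MAX, w + STEP)
    return max(W_MIN, w - STEP)

# Hypothetical NFV trace over five iterations
w = 0.7
nfv_trace = [0.50, 0.45, 0.48, 0.52, 0.51]
for prev, now in zip(nfv_trace, nfv_trace[1:]):
    w = tune_inertia(w, now, prev)
    print(f"NFV {prev:.2f} -> {now:.2f}, w = {w:.2f}")
```

Clamping ω keeps the swarm from diverging when the NFV falls for many consecutive iterations.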
Fig. 3 The behavior of EFJAPSO on computation time, k and fitness
Fig. 4 The behavior of adaptive tuner versus NFV
Fig. 5 The behavior of EFJAPSO on IP, fitness value, and iterations
Fig. 6 Number of packet data sent to CS versus round
Figure 6 portrays the performance of EFJAPSO on parameters such as the number of packets sent to the Sink, the dead node count and the total residual energy, and summarizes the effect of the number of rounds on each of them. The results show that the number of data packets transferred is directly proportional to the number of rounds. Even so, an aggregated data packet is not transmitted if it shows no deviation from the previous value. Hence, the graph is linear from rounds 1500 to 3500, and energy consumption is greatly reduced. The status of the nodes can also be gathered through this procedure. The alive nodes continue to transfer data packets. When most of the nodes become dysfunctional, the packet transfer drops to zero. As the number of rounds increases, the nodes begin to fail and their residual energy becomes zero. To tackle this disadvantage of FJAPSO, EFJAPSO aggregates several data packets and sends a concise digest to the Sink; the number of transmission rounds decreases, and as a result the residual energy of the nodes does not deplete as fast as in FJAPSO.
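The suppression rule described above, where an aggregated packet is sent only if it deviates from the previous value, can be sketched as follows. The aggregation function (a rounded mean), the tolerance parameter and the sample readings are assumptions for illustration; the paper's data cube approach is not reproduced here.

```python
def aggregate(readings):
    """Summarize a cluster's readings as a small digest (here: the rounded mean)."""
    return round(sum(readings) / len(readings), 1)

def packets_to_send(rounds_of_readings, tolerance=0.0):
    """Return the digests the control head would transmit.

    A digest is sent only when it deviates from the last transmitted digest
    by more than `tolerance`.
    """
    sent = []
    last = None
    for readings in rounds_of_readings:
        digest = aggregate(readings)
        if last is None or abs(digest - last) > tolerance:
            sent.append(digest)
            last = digest
    return sent

# Hypothetical sensor readings for four rounds in one cluster
rounds = [[20.0, 20.2, 19.8], [20.1, 19.9, 20.0], [21.0, 21.2, 20.8], [21.0, 21.0, 21.0]]
print(packets_to_send(rounds))
```

Here only two of the four rounds produce a transmission, which is the mechanism by which aggregation reduces the number of packets and slows the depletion of residual energy.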
4 Conclusion
Minimizing energy consumption in SDN is a challenging task. In this work, routing is handled by Enhanced Fork and Join Adaptive Particle Swarm Optimization (EFJAPSO) for SDWSN. EFJAPSO combines FJAPSO with a data aggregation function. It optimizes control head selection and extends the lifetime of the network. A fitness function is proposed for control head selection and for extending the lifetime of the SDWSN. To provide effective convergence, the inertia weight of EFJAPSO is tuned. The performance of EFJAPSO is analyzed by simulation. The performance study shows that EFJAPSO outperforms FJAPSO in terms of the number of packets transmitted and thus improves the network lifetime.
References
1. Kumar, N., Vidyarthi, D.P.: A green routing algorithm for IoT-enabled software defined wireless sensor network. IEEE Sens. J. 18(22), 9449–9460 (2018)
2. Huang, R., Chu, X., Zhang, J.: Energy-efficient monitoring in software defined wireless sensor networks using reinforcement learning. Int. J. Distrib. Sens. Networks (2015)
3. Misra, S., Bera, S., Achuthananda, M.P., Pal, S.K., Obaidat, M.S.: Situation-aware protocol switching in software-defined wireless sensor network systems. IEEE Syst. J. 12(3), 2353–2360 (2018)
4. Xiang, W., Wang, N., Zhou, Y.: An energy-efficient routing algorithm for software-defined wireless sensor networks. IEEE Sens. J. 16(20), 7373–7400 (2016)
5. Chaudhry, R., Tapaswi, S., Kumar, N.: Forwarding zone enabled PSO routing with network lifetime maximization in MANET. Appl. Intell. 48(9), 3053–3080 (2018)
6. Li, G., Guo, S., Yang, Y., Yang, Y.: Traffic load minimization in software defined wireless sensor networks. IEEE Internet Things J. 5(3), 1370–1378 (2018)
7. Zidong, H., Yufeng, L., Junyu, L.: Numerical improvement for the mechanical performance of bikes based on an intelligent PSO-ABC algorithm and WSN technology. IEEE Access, 32890–32898 (2018)
8. Tanima, B., Indrajit, B.: Dynamic PSO based fuzzy clustering algorithm for WSNs. In: IEEE Region 10 Conference (TENCON) (2019)
9. Amrit, M., Pratik, G., Ziwei, Y., Lixia, Y., Joel, J.P.C.: ADAI and adaptive PSO-based resource allocation for wireless sensor networks. IEEE Access, 131163–131171 (2019)
10. Kobo, H.I., Abu-Mahfouz, A.M., Hancke, G.P.: Fragmentation based distributed control system for software defined wireless sensor networks. IEEE Trans. Ind. Inf. (2018)
Design of IoT-Based Real-Time Video Surveillance System Using Raspberry Pi and Sensor Network
Saroja Kanta Panda and Sushanta Kumar Sahu
Abstract It is not possible to keep every secured place under watch by security guards at every moment. Closed-circuit television (CCTV) is extensively used in secured places such as multi-storey buildings, banks, cinema halls and commercial buildings like shopping malls. However, handling a thief in real time is very important for preventing theft and vandalism. This project builds an IoT-based real-time video surveillance system with password locking using a Raspberry Pi. The system requires a USB webcam, a Raspberry Pi 3B, a 4 × 4 keypad and a PIR sensor. When motion is detected, the Pi activates the webcam to capture an image; if anyone enters a wrong or random password, the processor sends an email or SMS alert to the registered email id or mobile number. When motion is detected by the passive infrared (PIR) sensor, the Raspberry Pi stores the image in the cloud through the SMTP mail server and sends it to the registered email address.
Keywords IoT · SMTP · Raspberry pi · E-mail · Motion detection · Video surveillance
1 Introduction
Today, security systems are important for many public and private sectors, such as banking, finance, shopping malls, multi-storey buildings and home security [1]. Video surveillance plays an important role in maintaining social security. Methods such as RFID [2], OTP-based, GSM-based and Bluetooth-based techniques can decrease crime rates, but real-time detection and capture of the thief is the vital point today. Here, an IoT-based door locking system [3] with password protection is proposed, because it is the first step for safety. The above systems provide security, but they cannot provide an instant alert on an authentication attempt. Many systems store the video and images in memory or on a cloud server [3]. RFID-based systems identify the person or object automatically but are quite expensive. IoT-based video surveillance systems consume more power under continuous use and require storage space for live-streamed video. The proposed system is a PIR motion detection system [4] that activates the webcam when a living being appears. At that time, an SMS is sent to the registered number and a notification mail is sent to the authority. The keypad locking system [5] allows only authorized people to enter the room. If a wrong password is entered, the Raspberry Pi sends the image of the unauthorized person.
S. K. Panda (B) · S. K. Sahu, Department of Instrumentation and Electronics Engineering, College of Engineering and Technology Bhubaneswar, Bhubaneswar, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_11
2 Related Work In the proposed system, the surveillance area is covered by a Logitech C310 webcam which is directly connected to the Raspberry Pi 3 through a USB port. If an unknown person tries to enter the password, the system captures the image when motion is detected by the PIR sensor; the processor then sends an email to the authorized address and an SMS to the registered mobile number. The password is set in the program code and can be accessed only by the authority. The photo is captured from the video using the frame difference technique of OpenCV, written in Python. The image captured at the time of motion detection is sent to the SMTP mail server. When the system is offline, the image is stored in the local memory of the Raspberry Pi. The IoT system provides a password-based door locking system, which is the first barrier against any unauthorized person. If multiple people come and try to enter the password without authorization, the system also captures their image from the video frame. The total system provides security and an IoT-based control feature. The system architecture includes the following hardware: Raspberry Pi, webcam, GSM module, MAX232 IC, 4 * 4 keypad, PIR sensor, 16 * 2 LCD and alarm. The architecture of the system is shown in Fig. 1. The Logitech C310 webcam is connected to the Raspberry Pi board directly through a USB port. MAX232 is a transceiver IC for the RS-232 hardware layer protocol. It has a pair of drivers and receivers which convert between TTL/CMOS voltage levels and RS-232 voltage levels. These voltage levels are used for the serial communication between the GSM module and the Raspberry Pi. When any human movement occurs in the area covered by the PIR sensor, the system activates the webcam. The webcam captures the image of the unauthorized person from the video frame and the system sends the image to the registered mail id. An SMS is sent to the registered mobile number at the same time. Using additional Raspberry Pi setups or through the IoT system, a larger surveillance area can be covered.
The Raspberry Pi setup automatically delivers the video data stream to the cloud server. Here, a C310 HD webcam is used, which is capable of 1280 * 720 (720p) high-definition video and still images, and which connects to the Raspberry Pi directly over Universal Serial Bus (USB).
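The frame difference technique mentioned above can be sketched in a few lines. The block below is only an illustration in plain Python on small nested lists; the function names, the thresholds (`pixel_threshold`, `min_changed`) and the sample frames are assumptions, not values from the paper. With OpenCV the same per-pixel difference is commonly obtained with `cv2.absdiff` followed by `cv2.threshold`.

```python
def frame_difference(prev, curr):
    """Absolute per-pixel difference of two equally sized grayscale frames."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def motion_detected(prev, curr, pixel_threshold=25, min_changed=3):
    """Report motion when enough pixels changed by more than pixel_threshold."""
    diff = frame_difference(prev, curr)
    changed = sum(1 for row in diff for value in row if value > pixel_threshold)
    return changed >= min_changed


# Two hypothetical 4 x 4 grayscale frames (values 0..255).
background = [[10, 10, 10, 10] for _ in range(4)]
with_person = [row[:] for row in background]
for r in (1, 2):
    for c in (1, 2):
        with_person[r][c] = 200  # a bright object enters the scene

print(motion_detected(background, background))   # False
print(motion_detected(background, with_person))  # True
```

Both thresholds trade sensitivity against false alarms, which is the dependence on the threshold value that the conclusion of the paper refers to.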
Design of IoT-Based Real-Time Video Surveillance System …
Fig. 1 Block diagram of proposed system (blocks: Power Supply, Webcam, Raspberry Pi 3, Cloud, Alarm, 4 * 4 Keypad, MAX232, GSM Modem, Email, Digital PIR Sensor, SMS)
2.1 Raspberry Pi 3 Model B It is a single-board, credit-card-sized computer developed by the Raspberry Pi Foundation. The proposed system uses a Raspberry Pi 3B model, which offers the following key features (Fig. 2).
• Quad-core Broadcom BCM2837 64-bit ARM Cortex-A53 clocked at 1.2 GHz
• 400 MHz VideoCore IV multimedia
• 1 GB SDRAM (i.e. 900 MHz)
• 4 USB ports
• Micro SD port for loading operating system and storing data
• 10/100 Mbps Ethernet and 802.11n wireless LAN network
• 17 GPIO channels
• Bluetooth 4.1 supported
• 5 V power source via micro USB or GPIO header
• Audio/video output
• Full-size HDMI
• Onboard camera/LCD display pin
Fig. 2 Raspberry pi 3 Model B and pin description of raspberry pi 3 (Source Google)
2.2 PIR Sensor PIR stands for “passive infrared”. A PIR sensor is an electronic sensor used for motion detection. It works on the principle that a moving warm object emits infrared (heat) radiation, and that the amount emitted is related to the heat the object produces. The sensor detects changes in the infrared radiation level in its surroundings; for example, when a person is detected by the PIR sensor, the system immediately turns on the webcam and follows the programmed instructions. Because the webcam is active only when needed, this reduces power consumption and extends the life of the system (Fig. 3).
Fig. 3 Working of PIR sensor
2.3 Keypad 4 * 4 A 4 * 4 matrix keypad consists of keys arranged in rows and columns, with eight terminal points. The 16 keys carry the digits 0–9, the four letters A to D and the two symbols * and #. The maximum rating is 24 V for each segment. The eight terminals (4 for rows and 4 for columns) connect the keypad to other devices. The system implements a password-protected door lock in which the password can be numeric or alphanumeric, and which admits only the authenticated person (Fig. 4).
Fig. 4 Keypad of structure 4 * 4
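How a row and column matrix yields a key can be illustrated with a small scanning routine. This is a hedged sketch: the key layout in `KEYS` is the arrangement found on many 4 * 4 membrane keypads and is assumed here, and `read_columns` is a hypothetical stand-in for the GPIO access, simulated by `fake_reader`.

```python
# Assumed key layout of a common 4 x 4 membrane keypad (not given in the paper).
KEYS = [
    ["1", "2", "3", "A"],
    ["4", "5", "6", "B"],
    ["7", "8", "9", "C"],
    ["*", "0", "#", "D"],
]


def scan_keypad(read_columns):
    """Drive each row in turn and return the first pressed key, or None.

    read_columns(row) stands in for the GPIO access: it must return a list of
    four booleans telling which column lines are active while `row` is driven.
    """
    for row in range(4):
        columns = read_columns(row)
        for col, active in enumerate(columns):
            if active:
                return KEYS[row][col]
    return None


def check_password(entered, stored):
    """Compare the entered key sequence with the stored password."""
    return "".join(entered) == stored


def fake_reader(pressed_row, pressed_col):
    """Build a read_columns function simulating one pressed key."""
    def read_columns(row):
        return [row == pressed_row and col == pressed_col for col in range(4)]
    return read_columns


print(scan_keypad(fake_reader(2, 1)))                # 8
print(check_password(["1", "2", "A", "#"], "12A#"))  # True
```

On real hardware the row lines would be driven and the column lines read through the GPIO pins, with debouncing added around each key press.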
2.4 SMTP Server SMTP, the Simple Mail Transfer Protocol, is a communication protocol for electronic mail transmission. In this application, the SMTP server is used to send outgoing mail from the email sender to the receivers. An SMTP server accepts mail, processes it and relays it to another server. It sends the outgoing email from an authenticated account, which protects the account from unauthorized use. It also reports whether the email could be delivered or not.
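A minimal sketch of the email step, using only the Python standard library. The addresses, the server name in the comment and the image bytes are placeholders, and the function name `build_alert_email` is an assumption for illustration; the paper does not list its mail code.

```python
from email.message import EmailMessage


def build_alert_email(sender, recipient, image_bytes, filename="intruder.jpg"):
    """Create an alert mail with a JPEG attachment (no network access here)."""
    msg = EmailMessage()
    msg["Subject"] = "Security alert: wrong password entered"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("Motion was detected and a wrong password was entered.")
    msg.add_attachment(image_bytes, maintype="image", subtype="jpeg",
                       filename=filename)
    return msg


# Placeholder addresses and image bytes, for illustration only.
alert = build_alert_email("pi@example.com", "owner@example.com",
                          b"\xff\xd8\xff\xd9")
print(alert["Subject"])
print(alert.is_multipart())

# Sending would use the standard library as well, for example:
#   import smtplib
#   with smtplib.SMTP("smtp.example.com", 587) as server:
#       server.starttls()
#       server.login(user, password)
#       server.send_message(alert)
```

Keeping message construction separate from sending makes it possible to store the message locally when the system is offline and send it later.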
2.5 GSM Module The GSM module provides communication between a mobile device and a GSM or GPRS network. The microcontroller of the GSM module allows wireless communication with other devices and modules. GSM modules are used in many wireless systems such as security systems, medical devices, GPS tracking systems, home automation and e-commerce.
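GSM modems are usually driven over the serial line with AT commands. The sketch below only builds the byte sequence for a text-mode SMS, assuming the modem follows the standard AT command set for SMS (`AT+CMGF`, `AT+CMGS`); the phone number is a placeholder, and a real program would write each chunk to the serial port and wait for the modem's reply before sending the next one.

```python
CTRL_Z = b"\x1a"  # terminates the message body in text mode


def sms_command_sequence(number, text):
    """Return the byte strings to write to the modem, in order."""
    return [
        b"AT\r",                                         # check that the modem answers
        b"AT+CMGF=1\r",                                  # select SMS text mode
        b'AT+CMGS="' + number.encode("ascii") + b'"\r',  # destination number
        text.encode("ascii") + CTRL_Z,                   # message body, then Ctrl+Z
    ]


# Placeholder number; the real one is the registered mobile number.
for chunk in sms_command_sequence("+10000000000", "Alert: wrong password"):
    print(chunk)
```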
2.6 Webcam A webcam is a small camera that streams real-time video through a processor or computer network. It is a low-cost, highly flexible digital camera used to broadcast video images in real time. The Raspberry Pi processes the information captured by the webcam. Webcams are mostly used for security surveillance, computer vision and video broadcasting. The system uses the Logitech C310 webcam to capture images and video of an unauthorized person. The Logitech C310 offers 5 MP snapshots, noise reduction, automatic light correction and fixed focus with a 60-degree field of view.
2.7 16 * 2 LCD A liquid crystal display (LCD) is an electronic display module used in various circuits and devices such as computers, mobile phones and calculators. It is a low-power alternative to multi-segment light-emitting diode displays. The system uses a 16 * 2 LCD, which has 16 columns and two rows. It is an alphanumeric display which shows numbers, symbols and letters. It requires 4.7–5.3 V and draws 1 mA without backlight. It works in two modes (4 bits and 8 bits), and each character is built in a 5 * 8 pixel box. It guides the user by displaying messages.
3 Motion Detection The system uses a PIR sensor at the entry point of the door for motion detection; the sensor detects changes in the infrared signal. The change depends on the temperature and surface characteristics of the objects that move in front of the sensor (Fig. 5).
Fig. 5 Motion detection flowchart
Flowchart blocks: Start; Setup Initialization; Enter Password; If The Password Correct?; Initialize PIR Sensor; Motion Detects?; Capture Image & Video; Send Image to Email Cloud; Send Notification; End
Fig. 6 Experimental setup for password protection
4 Implementation The system implements the motion detection algorithm in Python. Here, the system uses OpenCV to capture the image of an object or person with the help of the webcam. The Raspberry Pi controls all the hardware and software on a single platform. It sends the SMS through the GSM modem and the email through the SMTP mail server to the registered mail id (Fig. 6). The image captured by the webcam is sent to the SMTP cloud server when a wrong password is entered by an unauthorized person. Motion detection helps to reduce power consumption and requires less memory space. It is also a low-cost IoT system suitable for all kinds of secured places.
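The overall decision logic can be summarised in one function. This is one possible reading of the description, not the authors' program: the branch order, the return labels and the callables `capture`, `send_email` and `send_sms` are all assumptions that stand in for the webcam, SMTP and GSM code.

```python
def handle_attempt(entered, stored, motion, capture, send_email, send_sms):
    """Decide what to do for one access attempt.

    capture, send_email and send_sms are callables supplied by the caller;
    they stand in for the webcam, SMTP and GSM code.
    Returns "granted", "alert" or "idle".
    """
    if not motion:
        return "idle"              # camera stays off, saving power
    if entered == stored:
        return "granted"
    image = capture()              # wrong password while someone is present
    send_email(image)
    send_sms("Wrong password entered")
    return "alert"


log = []
result = handle_attempt(
    entered="0000", stored="12A#", motion=True,
    capture=lambda: b"jpeg-bytes",
    send_email=lambda img: log.append(("email", img)),
    send_sms=lambda text: log.append(("sms", text)),
)
print(result, log)
```

Passing the hardware actions in as callables keeps the logic testable on a desktop machine without a camera or modem attached.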
5 Experimental Results The experimental results show motion detection and image capture at the time a wrong password is entered. The system works over the internet to send the email and SMS alerts, which are controlled by the IoT processor, the Raspberry Pi. No human intervention is needed for any operation, so it is a fully automatic real-time system. Operation and monitoring are available only to the authorized person whose email and mobile number are registered on this platform. Figures 7 and 8 show wrong password detection and the photo captured using computer vision.
Fig. 7 Screen shot of email with photo of intruder
Fig. 8 Screen shot of SMS alert
6 Conclusion and Future Work The IoT-based smart door lock with a real-time video surveillance system is a consumer-oriented device. It is a smart, flexible system that can replace many conventional locking systems. The system is a powerful monitoring tool that reveals an invader’s behaviour. It is a low-power device that turns on the camera only when a trigger occurs. The Raspberry Pi allows more sensors and hardware to be embedded in the system. The processor has enough computation power to connect multiple devices and sensors. All operations are fully automatic, which reduces manpower. The system can also be used in various secured places such as bank locker rooms, ATM centres and storerooms. It stores the data and sends it to the SMTP server, which is intended to prevent hacking of the system. The real-time monitoring system is built as a complete system for practical implementation. In future, we plan to use OpenCV in a better way to improve the detection algorithm, because it mainly depends on the threshold value. This means improving the performance of the algorithm under all conditions and determining whether a person is authenticated or not.
References
1. Bhatkule, A.V., Shinde, U.B., Zanwar, S.R.: Home based security control system using raspberry pi and GSM. Int. J. Innovative Res. Comput. Commun. Eng. 4(9), 16259–16264 (2016)
2. Lin, C., Tang, Y.: Research and design of the intelligent surveillance system based on DirectShow and OpenCV. In: 2011 International Conference on Consumer Electronics, Communications and Networks (CECNet), XianNing, 2011, pp. 4307–4310. https://doi.org/10.1109/cecnet.2011.5768334
3. Jyothi, S.N., Vardhan, K.V.: Design and implementation of real time security surveillance system using IoT. In: 2016 International Conference on Communication and Electronics Systems (ICCES), Coimbatore, 2016, pp. 1–5. https://doi.org/10.1109/cesys.2016.7890003
4. Sruthy, S., George, S.N.: WiFi enabled home security surveillance system using Raspberry Pi and IoT module. In: 2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES), Kollam, 2017, pp. 1–6. https://doi.org/10.1109/spices.2017.8091320
5. Murugan, K.H.S., Jacintha, V., Shifani, S.A.: Security system using raspberry Pi. In: 2017 Third International Conference on Science Technology Engineering & Management (ICONSTEM), Chennai, 2017, pp. 863–864. https://doi.org/10.1109/iconstem.2017.8261326
Multi-agent System of Autonomous Underwater Vehicles in Octagon Formation Madhusmita Panda and Bikramaditya Das
Abstract In this paper, a formation control problem of eight autonomous underwater vehicles (AUVs) is addressed using the multi-agent system (MAS) concept. The proposed MAS of AUVs consists of a virtual leader and eight follower AUVs. Each follower AUV represents an agent connected by a communication network, and full communication without delay is assumed. The formation control of the multi-AUV system deals with controlling the positions and heading angles of the AUVs using Jacobian theory and methods of geometric reduction to achieve an octagonal shape. The underwater environment is modelled as a two-dimensional grid. The AUV motion dynamics are modelled assuming 3 degrees of freedom, neglecting heave and sway motions. The surge and yaw inputs are used as control inputs for the controller. The proposed controller maintains the octagonal geometry while approaching the target. This research can be helpful in solving formation control problems for applications such as oceanographic survey. Keywords Autonomous underwater vehicle (AUV) · Follower · Formation · Jacobi theory · Leader · Multi-agent system (MAS) · Octagon
1 Introduction Oceans cover about 70% of the earth's surface. Thus, the impact of the ocean on human society is undeniable and has attracted the attention of researchers for decades [1]. Global warming is a current threat to human society, and the ocean is considered a major contributor to it. Oceanic research involving autonomous underwater vehicles (AUVs) plays an important role in exploring the undersea environment and helps in finding causes of and solutions to such problems [2, 3]. An AUV is a self-sustained submersible equipped with its own DC power source and on-board controller [4]. The underwater world is vast, and covering even a finite area requires multiple AUVs working as a team [5]. The laws governing coordination and control of this team of AUVs are known as “cooperative control”. Cooperative control can be further divided into “formation control” and “flocking control” [6]. Oceanic research involving multiple AUVs needs to cover widely separated areas for gathering information while maintaining coordination among the members of the team [7]. Formation control of multiple AUVs, which maintains the relative positions and orientations of the team of vehicles while approaching the target, is preferred for surveying larger areas [8]. Control involving formation needs a complicated controller design to maintain “inter-vehicular communication” and to avoid collisions among the vehicles and with other obstacles [9]. The dynamic underwater environment [10] and the unavailability of “Global Positioning System (GPS)” signals [11] make the formation control task quite challenging. Thus, designing less complex controllers for formation control has recently been gaining research interest. Most of the methods for formation control of multiple AUVs advocate a leader-follower structure [2–9], where follower vehicles trace the trajectory of a leader vehicle. The MAS can be considered as a solution to the coordination problem for multi-AUV formation [12]. In a MAS, each agent estimates the positions of its neighbour agents while moving towards its destination. Szymak et al. [13] developed a control system architecture for a MAS consisting of AUVs employed in underwater survey. In the formation control problem for a MAS of AUVs, the leader AUV decides the direction of advancement of the formation while the follower AUVs maintain the required angle of orientation and position with respect to the leader. Yang et al. [14] suggested employing the Jacobi transform and geometric reduction techniques to separately design shape, motion and AUV orientation controllers. This research applies the method proposed by Yang et al. [15] to control an octagonal formation of eight AUVs following the leader-follower approach with a virtual leader. The leader and follower AUVs have synchronized clocks and update their control inputs at the same instant of time, thus nullifying the effect of communication delay [15]. An octagonal shape efficiently covers a wider space in comparison to its counterparts. In this paper, we attempt to control an octagonal formation of the proposed multi-agent system of AUVs using Jacobian theory. This paper comprises the following sections: Sect. 2 briefly introduces the AUV dynamics and explains the concept of formation control, Sect. 3 discusses the problem formulation, Sect. 4 discusses octagonal formation control, Sect. 5 includes the analysis of results, and finally, Sect. 6 concludes the paper.
M. Panda (B) · B. Das Department of Electronics and Telecommunication Engineering, VSSUT, Burla, Odisha, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_12
2 Review of AUV Dynamics 2.1 Dynamics of AUV An AUV has 6 degrees of freedom (DOF) [16]. Assuming only horizontal motion of the AUV at a fixed depth from the sea surface with negligible roll and pitch, only 3 DOF need to be considered. The DOF considered are surge, sway and yaw, given by $\eta = [x, y, \phi]^T$ [15]. Here, $x$ and $y$ denote the position coordinates of the AUV in the xy-plane and $\phi$ is the heading angle. It is assumed that the hydrodynamic damping forces and moment are linear. The motion kinematics are defined as follows [17]
$$\dot{\eta} = R_b^I(\phi)\, v \tag{1}$$
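The above equation represents the position and orientation transformation. Here, $R_b^I(\phi)$ is the rotation matrix that gives the relation between the body frame (b) and the inertial frame (I) [16], defined as
$$R_b^I(\phi) = \begin{bmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{2}$$
The linear motion of the AUV can be described by
$$M\dot{v} + Q(v)v + D(v)v = \tau \tag{3}$$
where
$$M = \begin{bmatrix} m - X_{\dot u} & 0 & 0 \\ 0 & m - Y_{\dot v} & 0 \\ 0 & 0 & I_z \end{bmatrix}, \quad D(v) = \begin{bmatrix} -X_u & 0 & 0 \\ 0 & -Y_v & 0 \\ 0 & 0 & Q_r \end{bmatrix} \tag{4}$$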
$M$ represents the inertia mass matrix and $Q(v)$ denotes the Coriolis matrix. $D(v)$ is the hydrodynamic damping matrix. $\tau$ is the control input vector comprising the forces applied in the xy-plane and the torque about the z-axis. The model parameters are $X_u, Y_v, X_{\dot u}, Y_{\dot v}, Q_r, I_z$. Assuming $S = [x, y]^T$ and $\alpha = [u, v]^T$, Eq. (1) is described as in [8]
$$\dot S = R(\phi)\alpha \quad \text{and} \quad \dot\phi = r \tag{5}$$
Equation (5) represents vehicle orientation, where
$$R(\phi) = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} \tag{6}$$
$R(\phi)$ satisfies
$$R^T(\phi)R(\phi) = I \ \text{for all } \phi, \quad \text{and} \quad \dot R(\phi) = R(\phi)S(\dot\phi) \tag{7}$$
Here, $S(\dot\phi)$ is the skew-symmetric matrix given by
$$S(\dot\phi) = \begin{bmatrix} 0 & -\dot\phi \\ \dot\phi & 0 \end{bmatrix} \tag{8}$$
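The two properties in Eq. (7) can be checked numerically. The sketch below uses plain Python; the test angle, the yaw rate and the step `h` are arbitrary values chosen for illustration, and the time derivative of $R$ is estimated by a central finite difference.

```python
import math


def rot(phi):
    """Planar rotation matrix R(phi) of Eq. (6)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s], [s, c]]


def skew(rate):
    """Skew-symmetric matrix S(phi_dot) of Eq. (8)."""
    return [[0.0, -rate], [rate, 0.0]]


def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


phi, rate, h = 0.7, 0.3, 1e-6   # arbitrary test values

# R^T R should be the identity.
identity = matmul(transpose(rot(phi)), rot(phi))

# Finite-difference estimate of dR/dt when phi changes at `rate` rad/s.
r_plus, r_minus = rot(phi + rate * h), rot(phi - rate * h)
r_dot = [[(r_plus[i][j] - r_minus[i][j]) / (2 * h) for j in range(2)]
         for i in range(2)]
rs = matmul(rot(phi), skew(rate))

max_gap = max(abs(r_dot[i][j] - rs[i][j]) for i in range(2) for j in range(2))
print(identity)
print(max_gap < 1e-6)
```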
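If damping $D(v)$ is zero, then Eq. (3) can be represented as
$$M_1\dot\alpha + Q_r\alpha = \tau \tag{9}$$
$$\dot r = -\frac{X_{\dot u} - Y_{\dot v}}{I_z}\,uv + \frac{Q_r}{I_z}\,r + \frac{\tau_\phi}{I_z} \tag{10}$$
where
$$M_1 = \begin{bmatrix} m - X_{\dot u} & 0 \\ 0 & m - Y_{\dot v} \end{bmatrix}, \quad Q_r = \begin{bmatrix} -X_u & (-m + Y_{\dot v})\,r \\ (m - X_{\dot u})\,r & -Y_v \end{bmatrix} \quad \text{and} \quad \tau = \begin{bmatrix} \tau_x \\ \tau_y \end{bmatrix} \tag{11}$$
By modifying Eq. (9), we get
$$\dot\alpha = M_1^{-1}(\tau - Q_r\alpha) \tag{12}$$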
Differentiating both sides of Eq. (5), we get
$$\ddot S = \dot R(\phi)\alpha + R(\phi)\dot\alpha = R(\phi)S(\dot\phi)\alpha + R(\phi)\dot\alpha \tag{13}$$
Putting Eqs. (7) and (12) in Eq. (13), and using $\alpha = R^{-1}(\phi)\dot S$ from Eq. (5),
$$\ddot S = R(\phi)\left[S(\dot\phi) - M_1^{-1}Q_r\right]R^{-1}(\phi)\,\dot S + R(\phi)M_1^{-1}\tau \tag{14}$$
If
$$E(\phi, r) = R(\phi)\left[S(\dot\phi) - M_1^{-1}Q_r\right]R^{-1}(\phi) \quad \text{and} \quad T(\phi) = R(\phi)M_1^{-1} \tag{15}$$
then Eq. (14) can be rewritten as
$$\ddot S = E(\phi, r)\,\dot S + T(\phi)\,\tau \tag{16}$$
Equations (5), (10) and (16) are non-linear equations in the state variables $(\phi, r, S, \dot S)$ with control inputs $\tau$ and $\tau_\phi$.
2.2 Formation Dynamics Here, the AUVs in the formation are modelled as a deformable body described by Jacobi vectors [8]. Consider $N$ AUVs whose positions are described by $S_k = [x_k, y_k]^T$ for $k = 1, 2, \ldots, N$. Let $j_k$ $(k = 1, 2, \ldots, N-1)$ be the Jacobi vectors and $z_c$ the centre of formation. Then, the Jacobi transformation is defined by
$$[j_1, j_2, \ldots, j_{N-1}, z_c]^T = \Phi\,[S_1, S_2, \ldots, S_N]^T \tag{17}$$
where $\Phi$ is the linear transformation that is experimentally determined based on the desired shape. The formation centre is defined by
$$z_c = \frac{1}{N}\sum_{k=1}^{N} S_k \tag{18}$$
Equation (17) can also be written as
$$[S_1, S_2, \ldots, S_N]^T = \Phi^{-1}\,[j_1, j_2, \ldots, j_{N-1}, z_c]^T \tag{19}$$
To derive the dynamic equations of the Jacobi vectors and the formation centre vector, we take the second-order derivative of the above equation:
$$\begin{aligned}
[\ddot S_1, \ddot S_2, \ldots, \ddot S_N]^T &= \Phi^{-1}\,[\ddot j_1, \ddot j_2, \ldots, \ddot j_{N-1}, \ddot z_c]^T \\
\Rightarrow\ [\ddot j_1, \ddot j_2, \ldots, \ddot j_{N-1}, \ddot z_c]^T &= \Phi\,[\ddot S_1, \ddot S_2, \ldots, \ddot S_N]^T \\
&= \Phi\,[E(\phi_1, r_1)\dot S_1 + T(\phi_1)\tau_1,\ \ldots,\ E(\phi_N, r_N)\dot S_N + T(\phi_N)\tau_N]^T \\
&= \Phi \begin{bmatrix} E(\phi_1, r_1) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & E(\phi_N, r_N) \end{bmatrix} \Phi^{-1} \begin{bmatrix} \dot j_1 \\ \vdots \\ \dot j_{N-1} \\ \dot z_c \end{bmatrix} + \Phi \begin{bmatrix} T(\phi_1)\tau_1 \\ \vdots \\ T(\phi_N)\tau_N \end{bmatrix}
\end{aligned} \tag{20}$$
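The transformation of Eqs. (17) to (19) can be illustrated with a deliberately simple choice of $\Phi$. The sketch below assumes that each Jacobi-type vector is the difference between consecutive positions and that the last row of $\Phi$ is the average of Eq. (18); this choice is hypothetical and is not the shape-dependent $\Phi$ used in the paper. The positions are the initial positions listed in the simulation section.

```python
def jacobi_transform(positions):
    """Map N planar positions to N-1 difference vectors and the centre.

    Assumed choice of Phi: j_k = S_{k+1} - S_k, last row = average (Eq. 18).
    """
    n = len(positions)
    diffs = [(positions[k + 1][0] - positions[k][0],
              positions[k + 1][1] - positions[k][1]) for k in range(n - 1)]
    centre = (sum(p[0] for p in positions) / n,
              sum(p[1] for p in positions) / n)
    return diffs, centre


def inverse_transform(diffs, centre):
    """Recover the positions, i.e. apply Phi^{-1} as in Eq. (19)."""
    n = len(diffs) + 1
    # Offsets of each vehicle relative to the first one (cumulative sums).
    offsets = [(0.0, 0.0)]
    for dx, dy in diffs:
        offsets.append((offsets[-1][0] + dx, offsets[-1][1] + dy))
    mean_x = sum(o[0] for o in offsets) / n
    mean_y = sum(o[1] for o in offsets) / n
    first = (centre[0] - mean_x, centre[1] - mean_y)
    return [(first[0] + ox, first[1] + oy) for ox, oy in offsets]


# Initial positions [x, y] of the eight AUVs from the simulation section.
start = [(0, -40), (0, 0), (0, -10), (0, 20), (0, 1.20), (0, 10), (0, -30), (0, 15)]
j, z_c = jacobi_transform(start)
print(z_c)
print(inverse_transform(j, z_c))
```

Because the shape vectors and the centre are separate outputs, a controller can act on the shape without moving the centre, which is the decoupling idea behind the formation dynamics.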
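The block-diagonal matrix $E$ is the important parameter that determines whether the formation shape dynamics and the formation centre dynamics are decoupled or not. The formation shape dynamics are described by the Jacobi shape vectors $j_k$, and the dynamics of the centre of formation are described by the centre vector $z_c$. $E$ is represented as
$$E(\phi_k, r_k) = R(\phi_k)\left[S(r_k) - M_1^{-1}Q_{r_k}\right]R^{-1}(\phi_k) = \begin{bmatrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{bmatrix} \tag{21}$$
where
$$\begin{aligned}
e_{11} &= \frac{X_u}{m - X_{\dot u}}\cos^2\phi_k + \left(\frac{m - X_{\dot u}}{m - Y_{\dot v}} - \frac{m - Y_{\dot v}}{m - X_{\dot u}}\right) r_k \sin\phi_k\cos\phi_k + \frac{Y_v}{m - Y_{\dot v}}\sin^2\phi_k \\
e_{12} &= \left(\frac{X_u}{m - X_{\dot u}} - \frac{Y_v}{m - Y_{\dot v}}\right)\sin\phi_k\cos\phi_k + \frac{m - X_{\dot u}}{m - Y_{\dot v}}\,r_k\sin^2\phi_k + \frac{m - Y_{\dot v}}{m - X_{\dot u}}\,r_k\cos^2\phi_k - r_k \\
e_{21} &= \left(\frac{X_u}{m - X_{\dot u}} - \frac{Y_v}{m - Y_{\dot v}}\right)\sin\phi_k\cos\phi_k - \frac{m - X_{\dot u}}{m - Y_{\dot v}}\,r_k\cos^2\phi_k - \frac{m - Y_{\dot v}}{m - X_{\dot u}}\,r_k\sin^2\phi_k + r_k \\
e_{22} &= \frac{X_u}{m - X_{\dot u}}\sin^2\phi_k + \left(\frac{m - Y_{\dot v}}{m - X_{\dot u}} - \frac{m - X_{\dot u}}{m - Y_{\dot v}}\right) r_k \sin\phi_k\cos\phi_k + \frac{Y_v}{m - Y_{\dot v}}\cos^2\phi_k
\end{aligned} \tag{22}$$
$E(\phi_k, r_k)$ is negligible for all $k = 1, 2, \ldots, N$ when drag forces are neglected, which leads to decoupling of the shape and centre dynamics of the formation. If the drag forces are considered, the dynamics are not decoupled.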
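The entries of $E$ can be cross-checked numerically by building $E = R\,[S(r) - M_1^{-1}Q_r]\,R^T$ directly from the 2 x 2 definitions. The sketch below is an illustration only; all parameter values are hypothetical. It also shows that $E$ collapses to a multiple of the identity when $X_{\dot u} = Y_{\dot v}$ and $X_u = Y_v$, and compares the off-diagonal entry $e_{21}$ with its expanded form.

```python
import math


def e_matrix(phi, r, m, x_u, y_v, x_ud, y_vd):
    """E = R (S(r) - M1^{-1} Q_r) R^T, built from the 2 x 2 definitions."""
    a, c = m - x_ud, m - y_vd             # diagonal entries of M1
    b11, b12 = x_u / a, -r + (c / a) * r  # entries of S(r) - M1^{-1} Q_r
    b21, b22 = r - (a / c) * r, y_v / c
    co, si = math.cos(phi), math.sin(phi)
    rb = [[co * b11 - si * b21, co * b12 - si * b22],
          [si * b11 + co * b21, si * b12 + co * b22]]
    # Multiply by R^T = [[co, si], [-si, co]].
    return [[rb[0][0] * co - rb[0][1] * si, rb[0][0] * si + rb[0][1] * co],
            [rb[1][0] * co - rb[1][1] * si, rb[1][0] * si + rb[1][1] * co]]


def e21_closed(phi, r, m, x_u, y_v, x_ud, y_vd):
    """Expanded form of the entry e21."""
    a, c = m - x_ud, m - y_vd
    co, si = math.cos(phi), math.sin(phi)
    return ((x_u / a - y_v / c) * si * co
            - (a / c) * r * co ** 2 - (c / a) * r * si ** 2 + r)


# Hypothetical parameter values, chosen only to exercise the formulas.
equal = e_matrix(0.9, 0.4, 30.0, -8.0, -8.0, -2.0, -2.0)
general = e_matrix(0.9, 0.4, 30.0, -8.0, -12.0, -2.0, -5.0)
print(equal)  # close to (X_u / (m - X_udot)) * I = -0.25 * I
print(abs(general[1][0] - e21_closed(0.9, 0.4, 30.0, -8.0, -12.0, -2.0, -5.0)) < 1e-12)
```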
2.3 Communication Consensus The research work assumes a bidirectional flow of information among the AUVs, represented by an undirected graph. The difference between the desired position and the actual position of an AUV at any instant of time is the synchronization error. The follower AUVs try to follow the path of the leader AUV by exchanging the synchronization error information with their neighbours in the absence of velocity measurements. Assuming that at any instant $t$ an AUV is always connected to its neighbour AUV, an incidence matrix $(\alpha_{kl})$ can be defined as
$$\alpha_{kl} = \begin{cases} +1 & \text{if the } k\text{th AUV is at the left of the } l\text{th AUV} \\ -1 & \text{if the } k\text{th AUV is at the right of the } l\text{th AUV} \end{cases} \tag{23}$$
The associated adjacency matrix is defined in graph theory as $[E_{kl}] \in \mathbb{R}^{N \times N}$. Two members $k$ and $l$ are neighbour AUVs if they can access the synchronization error $|\theta_k - \theta_l|$. It is assumed that all the follower AUVs are in full communication with the leader [4, 18] and follow the leader such that
$$\lim_{t \to \infty} |\theta_k - \theta_l| = 0, \quad \forall\, k \in (1, 2, \ldots, N) \tag{24}$$
So, the distributed consensus tracking law for $\theta_k$ may be defined as [4, 18]
$$\dot\theta_k = -\beta \sum_{l \in \bar N_k(t)} E_{kl}(\theta_k - \theta_l) - \delta_c\,\mathrm{sgn}\!\left[\sum_{l \in \bar N_k(t)} E_{kl}(\theta_k - \theta_l)\right] \tag{25}$$
Here, $\bar N_k(t) = \{1, 2, \ldots\}$ denotes the neighbour set of follower $k$ in the team consisting of the $N$ followers and the virtual leader. $\beta$ is a nonnegative constant and $\delta_c$ is a positive constant. $\mathrm{sgn}[\cdot]$ represents the signum function. $E_{kl}$ for $k, l = 1, 2, \ldots, m$ is a positive constant. For a switching network topology, it is assumed that $l \in \bar N_k(t)$ if $|\theta_k - \theta_l| < R$ at time $t$, and $l \notin \bar N_k(t)$ otherwise. Here, $k = 1, 2, \ldots, m$, $l = 0, 1, 2, \ldots, m$, and $R$ denotes the communication sensing radius of the AUV. As the undirected graph is connected, at least one value of $E_{k0}$ is nonzero. $E_{k0}$ is a positive constant if the virtual leader’s position is available to follower $k$. The stability analysis of the above law is given in [19].
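The behaviour of the tracking law in Eq. (25) can be sketched with a simple Euler simulation. The block below assumes a fixed topology with full communication ($E_{kl} = 1$ for every pair, including the leader), a constant leader state, and hypothetical gains, step size and initial values; it is an illustration, not the simulation used in the paper.

```python
def sign(x):
    """Signum function returning -1, 0 or 1."""
    return (x > 0) - (x < 0)


def simulate_consensus(theta, leader, beta=1.0, delta_c=0.1, dt=0.01, steps=2000):
    """Euler integration of Eq. (25) with E_kl = 1 for every pair.

    theta is the list of follower states; the virtual leader (index 0 in the
    paper) is held constant at `leader`.
    """
    theta = list(theta)
    for _ in range(steps):
        everyone = [leader] + theta
        new_theta = []
        for k, th in enumerate(theta):
            # Sum of differences to the leader and to every other follower.
            s = sum(th - other for idx, other in enumerate(everyone) if idx != k + 1)
            new_theta.append(th + dt * (-beta * s - delta_c * sign(s)))
        theta = new_theta
    return theta


final = simulate_consensus([0.0, 2.5, -1.0], leader=1.0)
print([round(abs(t - 1.0), 3) for t in final])
```

With a discrete time step the signum term causes a small residual chattering around the leader value, so the printed errors are small rather than exactly zero.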
3 Problem Formulation Controlling a MAS of AUVs employed in an underwater operation with a specific target needs estimation of location, a collision-free path and proper communication among the members of the team. In the presence of full communication, position and velocity measurements, or at least one of the two, must be available to all the AUVs to maintain coordination. In the absence of GPS signals, the position and velocity information may be gathered using sensors such as a “Doppler velocity log” (DVL) and an “Inertial Navigation System” (INS). GPS signals, being radio waves, cannot travel through the underwater channel. Thus, acoustic signals are the only alternative available for underwater communication. However, acoustic signal propagation is susceptible to environmental disturbances, “path loss”, “multipath fading” and “Doppler spread”, with an increase in the probability of error. In this research, we attempt to control and maintain an octagonal formation structure where eight AUVs follow a virtual leader. It is assumed that the follower AUVs are aware of the position and velocity of the other AUVs, as full communication prevails. The signal flow in the formation control of the MAS of AUVs is shown in Fig. 1. The follower AUVs trace the virtual leader, as the leader path information is available to all the members of the team. The formation control problem (FCP) is modelled to accomplish the following tasks:
• Deciding the strategy for formation control
• Maintaining the octagonal shape
• Shifting between formations to avoid collision
• Availing full communication among the team members.
Fig. 1 Control signal flow model of MAS of AUVs in the formation (blocks: Desired Formation Path; Formation Controller of Leader AUV; Path Parameters; Leader AUV; Current position; Data Received Through Acoustic Underwater Medium; Position and Orientation Error Calculation; Agent-AUV; Formation Controller of Follower AUV; Current position)
4 Octagonal Formation Control of AUV The octagonal formation shape and navigation controllers are designed using linear state feedback [8]. The control inputs are surge force and heading control, and it is assumed that the sway forces are negligible. This section is divided into the four parts listed below:
• Dynamics of AUV formation
• Octagonal shape controller
• Formation angle controller
• Observer modelling
4.1 Dynamics of AUV Formation The AUV’s hydrodynamic parameters satisfy $Y_{\dot v} = X_{\dot u}$ and $Y_v = X_u$ [8]. Hence, we get $M_1^{-1} = \begin{bmatrix} b & 0 \\ 0 & b \end{bmatrix}$, where $b = \frac{1}{m - X_{\dot u}} = \frac{1}{m - Y_{\dot v}}$. So
$$E(\phi_k, r_k) = \begin{bmatrix} bX_u & 0 \\ 0 & bX_u \end{bmatrix} = bX_u I_2 \tag{26}$$
$E(\phi_k, r_k)$ is a constant $2 \times 2$ diagonal matrix. For $N$ AUVs, $E = bX_u I_{2N}$. Thus, the formation system matrix of the dynamic equation is a constant diagonal matrix defined by $\Phi E \Phi^{-1} = bX_u I_{2N}$. Now defining
$$H = [h_1, h_2, \ldots, h_c]^T = \Phi U \tag{27}$$
where $U = [T(\phi_1)\tau_1, \ldots, T(\phi_N)\tau_N]^T$ stacks the input terms of Eq. (20), the simplified equations for the formation shape and centre motion subsystems are [8]
$$\ddot j_k = bX_u I_2\,\dot j_k + h_k, \quad k = 1, 2, \ldots, N-1; \qquad \ddot z_c = bX_u I_2\,\dot z_c + h_c \tag{28}$$
The angular motion dynamics of the AUV can be approximated by the linear equations
$$\dot\phi = r \quad \text{and} \quad \dot r = \frac{Q_r}{I_z}\,r + \frac{1}{I_z}\,\tau_\phi \tag{29}$$
4.2 Octagonal Shape Controller Equation (28) represents the linear formation shape subsystems. The control force can be obtained with the state feedback method, considering an augmented state vector $\bar X_k = [j_k, \dot j_k]^T$ given as
$$\dot{\bar X}_k = \bar A \bar X_k + \bar B h_k \tag{30}$$
where
$$\bar A = \begin{bmatrix} 0 & I_2 \\ 0 & bX_u I_2 \end{bmatrix}, \quad \bar B = \begin{bmatrix} 0 \\ I_2 \end{bmatrix} \tag{31}$$
We can define the tracking error as the vector $E_{tk} = \bar X_k - \bar X_{kd}$. Here, $\bar X_{kd}$ is the desired formation to be obtained. The state equation for $E_{tk}$ is
$$\dot E_{tk} = \bar A E_{tk} + \bar A \bar X_{kd} + \bar B h_k - \dot{\bar X}_{kd} \tag{32}$$
The controller that maintains the motion of the formation centre can be modelled similarly to the shape controller. The control force $h_c$ for the formation centre while tracking a desired path $z_{cd}(t)$ can be expressed by
$$h_c = \ddot z_{cd} - bX_u\,\dot z_{cd} - g_{1z}(z_c - z_{cd}) - g_{2z}(\dot z_c - \dot z_{cd}) \tag{33}$$
where $g_{1z}, g_{2z} > 0$ are controller gains [8].
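The effect of the centre controller in Eq. (33) can be seen in a one-axis Euler simulation of the centre dynamics of Eq. (28). The values of $bX_u$, the gains, the step size and the constant target are hypothetical, chosen only so that the closed loop is stable; the paper takes its parameters from [4].

```python
def simulate_centre(target, b_xu=-0.5, g1=1.0, g2=2.0, dt=0.01, steps=2000):
    """Track a constant target with Eq. (33) applied to one axis of Eq. (28)."""
    z, z_dot = 0.0, 0.0
    for _ in range(steps):
        # Constant target: its first and second derivatives are zero.
        h_c = 0.0 - b_xu * 0.0 - g1 * (z - target) - g2 * (z_dot - 0.0)
        z_ddot = b_xu * z_dot + h_c        # centre dynamics, Eq. (28)
        z += dt * z_dot
        z_dot += dt * z_ddot
    return z


print(abs(simulate_centre(10.0) - 10.0) < 0.01)
```

With these values the tracking error obeys a stable second-order equation, so the centre settles on the target; larger gains shorten the settling time at the cost of larger control forces.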
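4.3 Formation Angle Controller Assuming that the AUVs are controlled by surge force and heading angle, neglecting sway motion, i.e. $\tau_y = 0$ [8], we can define
$$T(\phi_k)\tau_k = \begin{bmatrix} u_{k1} \\ u_{k2} \end{bmatrix} \tag{34}$$
where
$$u_{k1} = b\,\tau_{kx}\cos\phi_k, \quad u_{k2} = b\,\tau_{kx}\sin\phi_k \tag{35}$$
After obtaining the control forces $h_k$ and $h_c$, the desired value of $U = \Phi^{-1}H$ can be computed. The force for surge motion $\tau_{kx}$ is given by
$$\tau_{kx} = \frac{u_{k1}}{b\cos\phi_k} = \frac{u_{k2}}{b\sin\phi_k} \tag{36}$$
Now, the desired yaw angle $\phi_{kd}$ is given by
$$\phi_{kd} = \operatorname{atan2}(u_{k2}, u_{k1}) \tag{37}$$
Each pair $(u_{k1}, u_{k2})$ corresponds to a unique surge force and heading angle [8]. The angle controller for the formation can be designed to achieve $\phi_k \to \phi_{kd}$ as $t \to \infty$. The equation of angular motion for the $k$th AUV is given by
$$\begin{bmatrix} \dot\phi_k \\ \dot r_k \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & \frac{Q_r}{I_z} \end{bmatrix} \begin{bmatrix} \phi_k \\ r_k \end{bmatrix} + \begin{bmatrix} 0 \\ \frac{1}{I_z} \end{bmatrix} \tau_{\phi k} \tag{38}$$
The linear state feedback controller for the yaw moment is given by [8] as
$$\tau_{\phi k} = I_z\left[\ddot\phi_{kd} - \frac{Q_r}{I_z}\,\dot\phi_{kd} - g_1^{\phi_k}(\phi_k - \phi_{kd}) - g_2^{\phi_k}(r_k - \dot\phi_{kd})\right] \tag{39}$$
where $g_1^{\phi_k}, g_2^{\phi_k} > 0$ are controller gains.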
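Recovering the surge force and heading from the pair $(u_{k1}, u_{k2})$ of Eq. (35) can be sketched as follows. Instead of dividing by $\cos\phi_k$ or $\sin\phi_k$ as in Eq. (36), which is ill-conditioned when either is near zero, this sketch uses the magnitude of the pair; that is an equivalent form under the assumption of a non-negative surge force. The value of `b` is hypothetical.

```python
import math


def surge_and_heading(u1, u2, b):
    """Return (tau_x, phi) such that u1 = b*tau_x*cos(phi), u2 = b*tau_x*sin(phi)."""
    phi = math.atan2(u2, u1)            # desired heading
    tau_x = math.hypot(u1, u2) / b      # non-negative surge force
    return tau_x, phi


b = 1.0 / 32.0                          # hypothetical 1 / (m - X_udot)
tau_x, phi = surge_and_heading(0.0, 0.5, b)
print(tau_x, phi)
```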
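4.4 Observer Modelling To compute $\tau_{\phi k}$, $\dot\phi_{kd}$ and $\ddot\phi_{kd}$ are needed [8]. As only $\phi_{kd}$ is available, a state observer is required to approximate $\dot\phi_{kd}$ and $\ddot\phi_{kd}$ from it. For this purpose, the following model can be defined as in [8]
$$\dot X_{\phi k} = A_\phi X_{\phi k} + B_\phi\,\omega_k, \qquad \lambda_k = C_\phi X_{\phi k} \tag{40}$$
where $\omega_k$ indicates Gaussian noise, $\lambda_k$ represents the output variable, and
$$X_{\phi k} = \begin{bmatrix} \phi_{kd} \\ r_{kd} \\ z_{kd} \end{bmatrix} = \begin{bmatrix} \phi_{kd} \\ \dot\phi_{kd} \\ \ddot\phi_{kd} \end{bmatrix}, \quad A_\phi = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad B_\phi = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad C_\phi = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \tag{41}$$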
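A state observer can reconstruct $\dot\phi_{kd}$ and $\ddot\phi_{kd}$ from $\phi_{kd}$ only if the pair $(A_\phi, C_\phi)$ is observable. The short check below stacks $C_\phi$, $C_\phi A_\phi$ and $C_\phi A_\phi^2$; the helper names are illustrative and not from the paper.

```python
A_PHI = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
C_PHI = [[1, 0, 0]]


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def observability_matrix(a, c):
    """Stack C, CA, CA^2, ... for an n-state system."""
    rows, block = [], c
    for _ in range(len(a)):
        rows.extend(block)
        block = matmul(block, a)
    return rows


print(observability_matrix(A_PHI, C_PHI))
```

The result is the 3 x 3 identity matrix, which has full rank, so all three states can be estimated from the measured heading reference alone.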
5 Simulation Set up and Result Analysis The AUV parameters used for the simulation are the experimental values taken from [4], and MATLAB is used as the simulation platform. The initial positions [x, y] of the AUVs are selected as: [0,−40], [0,0], [0,−10], [0,20], [0, 1.20], [0, 10], [0,−30], [0,15]. The octagonal formation consisting of eight AUVs is shown in Fig. 2. Figures 3 and 4 show the variation of the surge velocities and yaw angles of the AUVs during motion as they approach the destination. The surge velocity and yaw angle of an AUV become stable once it attains the desired velocity and orientation, as indicated by the parallel lines in Figs. 3 and 4.
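For reference, the target positions of a regular octagon around a formation centre can be generated as below. The centre, radius and heading are hypothetical values for illustration; the paper does not state the dimensions of its octagon.

```python
import math


def octagon_vertices(centre, radius, heading=0.0):
    """Vertices of a regular octagon around `centre`, rotated by `heading`."""
    cx, cy = centre
    return [(cx + radius * math.cos(heading + k * math.pi / 4),
             cy + radius * math.sin(heading + k * math.pi / 4))
            for k in range(8)]


targets = octagon_vertices((50.0, 0.0), 10.0)   # hypothetical centre and radius
sides = [math.dist(targets[k], targets[(k + 1) % 8]) for k in range(8)]
print(max(sides) - min(sides) < 1e-9)
```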
Fig. 2 Octagonal formation in 2-D map
Fig. 3 Surge velocity variation of AUVs in octagonal formation in 2-D map
Fig. 4 Yaw angle variation of AUVs in octagonal formation in 2-D map
6 Conclusion In this research, an attempt is made to control a MAS of eight AUVs in an octagonal formation. The participating AUVs follow a leader-follower topology assuming full communication without delay. The MAS formation controller uses Jacobian theory and methods of geometric reduction to minimize complexity. The AUV motion dynamics are modelled assuming 3 DOF, neglecting heave and sway motions. The simulation results for the octagonal shape formation, the surge velocities and the yaw angles have been obtained. It has been verified that the formation maintains the octagonal geometry by controlling the surge velocity and orientation of all the AUVs while approaching the target. There is no variation in surge velocity and yaw angle for the AUVs that have already attained their required position and heading angle, whereas the surge velocities and yaw angles of the other AUVs vary until each AUV attains its desired position to form the octagon. The absence of variation is represented by lines parallel to the X-axis. This research assumes full communication without delay. In future, this research may be extended to consider the communication delay.
Acknowledgements We acknowledge the help and facilities provided by the department of “Electronics and Telecommunication Engineering” and “TEQIP-III cell of Veer Surendra Sai University of Technology, Burla, Odisha, India”.
References
1. Panda, M., Das, B., Subudhi, B., Pati, B.B.: A comprehensive review of path planning algorithms for autonomous underwater vehicles. Int. J. Autom. Comput. 17, 321–352 (2020)
2. Panda, M., Das, B.: Grey wolf optimizer and its applications: a survey. In: Proceedings of the Third International Conference on Microelectronics, Computing and Communication Systems, pp. 179–194. Springer (2019)
3. Panda, M., Das, B., Pati, B.B.: Grey wolf optimization for global path planning of autonomous underwater vehicle. In: Proceedings of the Third International Conference on Advanced Informatics for Computing Research (ICAICR’19), pp. 1–6. ACM Press, Shimla, India (2019)
4. Das, B., Subudhi, B., Bhusan Pati, B.: Adaptive sliding mode formation control of multiple underwater robots. Arch. Control Sci. 24, 515–543 (2014)
5. Panda, M., Das, B., Pati, B.B.: A hybrid approach for path planning of multiple AUVs. In: Sharma, R., Mishra, M., Nayak, J., Naik, B., Pelusi, D. (eds.) Innovation in Electrical Power Engineering, Communication, and Computing Technology, pp. 327–338. Springer, Singapore (2020)
6. Das, B., Subudhi, B., Pati, B.B.: Cooperative formation control of autonomous underwater vehicles: an overview. Int. J. Autom. Comput. 13, 199–225 (2016)
7. Panda, M., Das, B., Pati, B.B.: Global path planning for multiple AUVs using GWO. Arch. Control Sci. 30, 77–100 (2020)
8. Das, B., Subudhi, B., Pati, B.B.: Employing nonlinear observer for formation control of AUVs under communication constraints. Int. J. Intell. Unmanned Syst. 3, 122–155 (2015)
9. Millan, P., Orihuela, L., Jurado, I., Rubio, F.R.: Formation control of autonomous underwater vehicles subject to communication delays. IEEE Trans. Control Syst. Technol. 22, 770–777 (2014)
10. Das, B., Subudhi, B., Pati, B.B.: Co-operative control of a team of autonomous underwater vehicles in an obstacle-rich environment. J. Mar. Eng. Technol. 15, 135–151 (2016)
11. Das, B., Subudhi, B., Pati, B.B.: Co-operative control coordination of a team of underwater vehicles with communication constraints. Trans. Inst. Meas. Control 38, 463–481 (2016)
12. Das, B., Subudhi, B., Pati, B.B.: Formation control of underwater vehicles using Multi Agent System. Arch. Control Sci. 30 (2020)
13. Szymak, P., Praczyk, T.: Control systems of underwater vehicles in multi-agent system of underwater inspection. In: WSEAS International Conference. Proceedings. Mathematics and Computers in Science and Engineering, pp. 153–156. World Scientific and Engineering Academy and Society (2009)
14. Yang, H., Zhang, F.: Geometric formation control for autonomous underwater vehicles. In: 2010 IEEE International Conference on Robotics and Automation, pp. 4288–4293. IEEE (2010)
15. Ul’Yanov, S., Maksimkin, N.: Software toolbox for analysis and design of nonlinear control systems and its application to multi-AUV path-following control. In: 2017 40th International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2017, Proceedings, pp. 1032–1037. Institute of Electrical and Electronics Engineers Inc. (2017)
16. Fossen, T.I.: Guidance and Control of Ocean Vehicles (Fossen, T.I., ed.), 1st ed, pp. 6–54. British Library, Trondheim (1994)
17. Yang, H., Zhang, F.: Geometric formation control for autonomous underwater vehicles. In: Proceedings, IEEE International Conference on Robotics and Automation, pp. 4288–4293 (2010)
18. Panda, M., Das, B., Subudhi, B., Pati, B.B.: Adaptive fuzzy sliding mode formation controller for autonomous underwater vehicles with variable payload. Int. J. Intell. Unmanned Syst. (ahead-of-print) (2020)
19. Cao, Y., Ren, W.: Distributed coordinated tracking with reduced interaction via a variable structure approach. IEEE Trans. Automat. Contr. 57, 33–48 (2012)
Fuzzy Q-Reinforcement Learning-Based Energy Optimization in IoT Network
Manoj Kumar, Pankaj Kumar Kashyap, and Sushil Kumar
Abstract Motivated by growing environmental concerns (the effect of greenhouse gases) coupled with the increasing cost of energy, green computing emerges as a promising solution for energy-limited IoT networks. An IoT network consists of smart sensors with limited battery power that connect over a wireless network to transmit data. Energy harvested from the environment by a sensor node reduces carbon emission and continuously recharges its battery; the sensors use this harvested energy for their operation, which extends the lifetime of the IoT network. In this paper, a fuzzy Q-reinforcement learning (FQRL) scheme that combines fuzzy logic and model-free Q-learning is presented to optimize the energy consumption in perpetual operations of IoT nodes. The optimization of energy consumption is subject to the adaptive duty cycle exercised by the smart sensors. The learning agent of FQRL updates the If-Then rules of the fuzzy controller according to the reward it receives through interaction with the environment. The agent is rewarded for a good action (the firing strength of the rule increases) and punished for a bad action (the firing strength of the rule decreases), subject to maintaining the energy neutrality condition. Finally, simulation results show that the proposed FQRL outperforms the compared algorithms in terms of duty cycle and residual energy after perpetual operation. That is, FQRL lets smart sensors achieve a better battery charging status and is suitable for energy-harvesting IoT networks.
Keywords Fuzzy energy harvesting · Fuzzy inference system · Q-learning · Duty cycle · Power optimization · Energy neutrality condition
M. Kumar (corresponding author) · P. K. Kashyap · S. Kumar
Wireless Communication and Networking Research Lab, School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi 110067, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_13
1 Introduction
The Internet of things (IoT) has a wide range of applications in all areas of human life, such as healthcare facilities, business organizations, military operations, vehicular networks and industrial organizations, where smart devices receive and transfer information over a wireless network known as the IoT network [1, 2]. These smart devices generate an extensive amount of data and continuously upload it to the sink node or base station through the wireless network [3]. The tiny smart sensors are powered by small batteries and designed to perform complex computation, which causes them to exhaust their energy and die quickly [4, 5]. Models developed with the evolution of cyberspace, mobile edge computing and cloud services do not meet this requirement, because smart sensors work individually rather than in a cooperative mode [6]. Each sensor senses and processes the data and transfers it to the sink node on its own. There is therefore a need for a novel model that optimizes the energy used in the operations of individual sensors.
The energy harvesting (EH) approach enables smart sensors to carry out their perpetual operation without worrying about energy, because it can refill the battery an unlimited number of times as long as the hardware is intact. EH is of two types. In the first, sensors receive constant energy from a grid station. In the second, sensors harvest energy from the ambient environment: a solar panel installed on the sensor node, a wind turbine, vibration (kinetic motion of humans) or radio frequency signals emitted by other sensors [7]. Static energy harvested from grid stations comes from fossil fuels, whose use produces greenhouse gases that pollute the environment. In contrast, the amount of energy that smart sensors harvest from the environment depends on weather conditions; it is therefore dynamic in nature, but free from greenhouse gas emissions.
Energy neutral operation must be maintained by smart sensors to avoid power failure; i.e., the energy consumed must be less than or equal to the energy harvested [8]. The primary concern of the smart sensors is to use the harvested energy efficiently in perpetual operation. In [9], an energy neutrality condition (ENC)-based theorem was proposed for sensing and transferring data to the sink node efficiently. The same authors further added an adaptive duty cycle (ADC) feature to predict the amount of harvested energy and control the duty cycle of the sensors. The duty cycle controls the sleep/wakeup periods of the sensors.
To cope with the dynamic nature of energy harvesting by smart sensors, intelligent machine learning techniques such as fuzzy logic and reinforcement learning (RL) are applied to IoT networks. The learning agent interacts with the environment under the ENC and learns a strategy for controlling the duty cycle that maintains the perpetual operation of the smart sensors. Fuzzy logic mathematically imitates the human way of learning and thinking in a fuzzy inference system (FIS) database in the form of If-Then rules, and the result is deduced from the fuzzy rule set [10]. The real difficulties with fuzzy logic are determining the number of parameters of the fuzzy rules and generalizing the rules [11]. The capability of learning and generalizing rules is added to the FIS either by supervised learning or by RL. Not only are the parameters of the fuzzy rules tuned, but the FIS is also generalized by itself. An RL algorithm, the model-free Q-learning technique, has been applied to the FIS to update the rules based on the reward (the Q-value of state-action pairs) [12]. The Q-learning agent is rewarded for taking a good action in a state and punished for taking a bad one. This hybrid scheme is known as fuzzy Q-reinforcement learning (FQRL). It provides a generalized state space in which an optimal policy selects the action that yields the maximum expected discounted reward.
In this paper, we present a dynamic fuzzy Q-reinforcement learning scheme that optimizes the energy consumption in perpetual operations of IoT nodes by dynamically updating the FIS rule database. The optimization of energy consumption is subject to the adaptive duty cycle exercised by the smart sensors. The main contributions of the presented algorithm are as follows:
1. Firstly, we define the system and energy model of IoT networks.
2. Secondly, a hybrid model, FIS with Q-reinforcement learning (FIS-QRL), is presented for the power management of sensors. The Q-learning agent updates the firing strength of each If-Then rule of the FIS based on the reward, with the rule as the state and the sensing power as the action.
3. Finally, simulations are performed to show the effectiveness of the presented scheme in terms of energy consumption and EH with respect to state-of-the-art algorithms.
The rest of the paper is organized as follows. Section 2 reviews recent fuzzy Q-learning-based techniques for IoT networks. Section 3 explains the proposed fuzzy Q-reinforcement learning scheme for IoT networks. Section 4 analyzes the performance of the presented work. Finally, the conclusion and future scope of the work are given in Sect. 5.
2 Related Works
In [13], the authors proposed an energy-efficient sleep/wakeup mechanism for device-to-device communication, based on a fuzzy Q-learning algorithm, in heterogeneous cellular networks subject to maintaining quality of service. As the battery-powered sensors and base stations used in the network have limited energy capacity, this ultimately does not prolong the lifetime of the network and its perpetual operation. In [14], the authors proposed dynamic energy management for EH sensors in wireless systems using fuzzy logic with Q-learning. The learning agent decides the duty cycle period, taking the harvested energy and the sensing power as input. They consider only the power consumed in sensing and neglect the energy consumed in data transfer, although transferring data from smart sensors to the base station (sink node) consumes more energy than sensing it. The authors also do not clearly describe how the dynamic fuzzy RL model updates the If-Then rules when dynamic crisp input parameters produce the output. In [15], the authors optimize the electricity expenditure of the battery with an RL technique in order to improve the lifetime of an EH cellular network. Firstly, they propose an approach that minimizes the battery size to overcome the battery aging problem; secondly, an energy controller is designed using reinforcement learning to exercise the perpetual operation of sensors. In [16, 17], the authors consider the battery sizing problem with various renewable sources using integer linear programming. In [18], the authors use Markov chains to derive an energy management policy that addresses the cycle aging problem of the battery. In [19], an online ARIMA model-based learning algorithm is proposed to minimize energy consumption in energy harvesting wireless sensor networks. Although these algorithms are easy to implement, they are not very responsive in reducing the energy consumption of smart sensors.
The above-mentioned energy management algorithms do not simultaneously optimize the duty cycle and the sensing power of smart sensors. Some of them are complex, require prior information on residual energy, EH and battery state of charge, and fit only a specific environment. The others are much simpler but are not suitable for the dynamic nature of EH. Overall, the duty cycle exercised for perpetual operation is not optimal, and the network lifetime ultimately decreases. In this regard, we present a model-free online Q-learning algorithm combined with fuzzy logic that jointly captures the adaptive duty cycle (sensing power) and the residual energy of smart sensors to optimize energy consumption.
3 Fuzzy Q-Reinforcement Learning (FQRL) Scheme
In this section, the proposed fuzzy Q-reinforcement learning (FQRL) scheme is presented in detail. It consists of a system and energy model, which covers the basic network formation assumptions and the renewable-energy-oriented model, and a hybrid model, FIS with Q-reinforcement learning (FIS-QRL), for the power management of sensors.
3.1 System and Energy Model
Solar-powered EH smart sensors (IoT nodes) are randomly dispersed in the sensing field to monitor the environment. The aim of this paper is to optimize the energy consumption in perpetual operations of IoT nodes while maintaining the ENC; i.e., the energy consumed by IoT nodes in perpetual operation must be less than or equal to the harvested energy. The IoT nodes work in an asynchronous sleep/wakeup mode, which removes the need to synchronize node clocks in a dynamic scenario, a favorable condition for IoT networks. In sleep mode, IoT nodes are in a low-power state in which they can only sense the environment; in wakeup mode, they can also transfer data to the sink node. The duty cycle is defined as the time period during which IoT nodes are awake for sensing and communication with the sink.
The IoT node harvests energy $e^h(t)$ from the sun during time slot $t$ and stores it in its battery of maximum capacity $B^{max}$; this energy is used for the perpetual operations of the node in the next time slot $(t + 1)$. The harvested energy $e^h(t)$ is stochastic and follows an independent and identically distributed random process; i.e., the amount of harvested energy differs between time slots but is constant within one time slot [20]. The battery is assumed to perform ideally; that is, no energy is lost in storing the harvested energy or in communication with other nodes. The energy consumed in sensing the environment is $e^s(t)$, which is determined by the duty cycle. The energy $e^f(t)$ consumed in transmitting a $b$-bit packet over distance $d$ in a free-space or multipath fading channel follows the first-order radio model [10]:
$$e^f(t) = e_{\text{transceiver circuit}} + e_{\text{amplifying circuit}} = \begin{cases} b\,e_{trans} + b\,e_{fs}\,d^2, & \text{if } d < d_o \\ b\,e_{trans} + b\,e_{mp}\,d^4, & \text{if } d \ge d_o \end{cases} \tag{1}$$
where $e_{trans}$ is the energy consumed by the transceiver circuit in transmitting a single bit of information, and $e_{fs}$ and $e_{mp}$ are the constant energy consumption factors of the amplifying circuit in free space ($d < d_o$) and multipath ($d \ge d_o$), respectively. The threshold distance $d_o = \sqrt{e_{fs}/e_{mp}}$ determines which model is used for transmission over distance $d$. Thus, the total energy consumption $e^c(t)$ in both sensing and transferring the data to the sink node is calculated as follows:
$$e^c(t) = e^s(t) + e^f(t) \tag{2}$$
The residual energy $e^r(t)$ of the IoT node at the beginning of time slot $t$ is evaluated as follows:
$$e^r(t) = \min\left\{B^{max},\; e^r(t-1) + e^h(t-1) - e^c(t)\right\} \tag{3}$$
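The energy model of Eqs. (1) to (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the radio constants `E_TRANS`, `E_FS` and `E_MP`, the packet size, the distance, the sensing energy and the battery values are assumed for the example and are not taken from the paper.

```python
import math

def transmit_energy(bits, d, e_trans, e_fs, e_mp):
    """Eq. (1): energy to send `bits` bits over distance d (first-order radio model)."""
    d_o = math.sqrt(e_fs / e_mp)                      # threshold distance
    if d < d_o:
        return bits * e_trans + bits * e_fs * d ** 2  # free-space branch
    return bits * e_trans + bits * e_mp * d ** 4      # multipath branch

def residual_energy(e_r_prev, e_h_prev, e_c, b_max):
    """Eq. (3): battery level at the start of slot t, capped at B_max."""
    return min(b_max, e_r_prev + e_h_prev - e_c)

# Assumed, illustrative radio constants (not given in the paper)
E_TRANS = 50e-9      # J per bit
E_FS = 10e-12        # J per bit per m^2
E_MP = 0.0013e-12    # J per bit per m^4

e_f = transmit_energy(bits=4000, d=50.0, e_trans=E_TRANS, e_fs=E_FS, e_mp=E_MP)
e_c = 0.002 + e_f    # Eq. (2) with an assumed sensing energy of 2 mJ
e_r = residual_energy(e_r_prev=10.0, e_h_prev=0.5, e_c=e_c, b_max=10.2)
```

With these assumed constants the threshold distance is roughly 88 m, so the 50 m link uses the free-space branch, and the battery update saturates at `b_max` because the harvested energy exceeds the consumption.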
3.2 Fuzzy Inference System with Q-Reinforcement Learning (FIS-QRL)
The proposed hybrid model builds a foundation for updating the rules of the FIS by employing a Q-learning agent that interacts with the environment to handle the stochastic nature of EH, as shown in Fig. 1. Power flow and control signals are shown by dashed and solid lines, respectively. The inputs to the FIS are the harvested energy $e^h(t)$ and the residual energy $e^r(t)$ of the IoT node. The linguistic variables of the Mamdani-based FIS are $e^h(t)$ = {little, medium, more} and $e^r(t)$ = {poor, fair, good} for the inputs and $e^s(t)$ = {lowest, fair, medium, good, high} for the output sensing energy. Trapezoidal membership functions are used for the terms little and more of $e^h(t)$ and poor and good of $e^r(t)$ (see Fig. 2), and triangular membership functions are used for the remaining terms; the parameters enter the FIS as crisp inputs. Initially, the fuzzy rules based on the linguistic variables are fed into the rule database as shown in Table 1, which is then updated by the Q-learning agent upon reward.
Fig. 1 Hybrid model of FIS-QRL
Fig. 2 Linguistic variable membership functions (three panels, Residual_Energy with terms poor, fair, good; Harvested_Energy with terms little, medium, more; Sensing_Power with terms lowest, fair, medium, good, high; each plots the degree of membership over the normalized range 0 to 1)
Table 1 Fuzzy rule set
Rule (r^i)   If e^h    If e^r   Then e^s(r^i)   Q-value
1            little    poor     lowest          q1
2            little    fair     lowest          q2
3            little    good     fair            q3
4            medium    poor     lowest          q4
5            medium    fair     fair            q5
6            medium    good     medium          q6
7            more      poor     fair            q7
8            more      fair     good            q8
9            more      good     high            q9
The FIS computes the duty cycle of the IoT node in four steps. In the first step, the membership degree of each crisp input is obtained from its intersection point with the membership functions. In the second step, using the fuzzy rule set (If-Then), the firing strength $\phi(r^i)$ of each rule is determined by applying the fuzzy AND operator to the membership degrees of the crisp inputs. The firing strength $\phi(r^i)$ is used as the weighting factor for the reward of the Q-learning agent. In the third step, the outputs generated by the nine rules are aggregated with the fuzzy OR operator into a single fuzzy sensing power $e^s(r^i)$, and the Q-value $q^i$ is transformed into an output probability $p(r^i)$, the conclusion of each rule $r^i$, by the softmax operation as follows:
$$p(r^i) = \frac{e^{Q(r^i,\, e^s(r^i))}}{\sum_{n=0}^{N} e^{Q(r^n,\, e^s(r^n))}}, \quad \forall r^i \tag{4}$$
where $N$ represents the total number of outputs generated by the rules, and $Q(r^i, e^s(r^i))$ denotes the Q-value (the reward of the state-action pair) of Q-learning, in which the rule $r^i$ is the discrete state and the sensing power $e^s(r^i)$ associated with rule $r^i$ is the discrete action. Q-learning uses the $\varepsilon$-greedy policy ($0 < \varepsilon < 1$) as its exploration and exploitation strategy. In the current state $r^i_t$, the learning agent selects with high probability $(1 - \varepsilon)$ the optimal action, i.e., the one that maximizes the expected Q-value for the current learning step $t$, and moves to the next state $r^i_{t+1}$ with reward $r^i_{t+1}$. With low probability $\varepsilon$, the agent selects a random action, which encourages exploration. The Q-value of rule $r^i$ is updated by the one-step Bellman equation as follows:
$$Q\left(r^i_t, e^s_t(r^i_t)\right) \leftarrow (1 - \alpha)\, Q\left(r^i_t, e^s_t(r^i_t)\right) + \alpha \left[ r^i_{t+1} + \gamma \max_{e^s_{t+1}} Q\left(r^i_{t+1}, e^s_{t+1}(r^i_{t+1})\right) \right] \tag{5}$$
where $\alpha \in [0, 1]$, $\gamma \in [0, 1]$ and $r^i_{t+1}$ represent the learning rate, the discount factor and the reward during time slot $t$, respectively. The reward $r^i_{t+1}$ of the Q-learning agent is the update of the firing strength $\phi(r^i)$ of the rule, based on the distance function of the ENC. In the fourth step, a defuzzification process (center of area, or centroid, method) aggregates the output, and the duty cycle of the IoT node is determined as the weighted average of the firing strengths as follows:
$$d^c(t) = \frac{\sum_{m=1}^{M} \phi(r^m)\, e^s(r^m)}{\sum_{m=1}^{M} \phi(r^m)} \tag{6}$$
where $M$ represents the total number (9) of If-Then rules stored in the FIS database. After the duty cycle $d^c(t)$ of the IoT node is determined by Eq. (6), the optimal sensing power is allocated to the node. The total energy consumption of the sensing and transmission operations is then calculated by Eq. (2). The total energy consumption $e^c(t)$ is compared with the harvested energy $e^h(t)$ of the IoT node to check whether the ENC is met. The reward is updated on the basis of the distance from the ENC, $D^{ENC}$, given as
$$D^{ENC} = \frac{e^h(t) \ast e^h_{max} - d^c(t) \ast e^c(t)}{e^h_{avg}} \tag{7}$$
where $e^h_{max}$ and $e^h_{avg}$ represent the maximum and the average energy harvested from the solar panel by the IoT nodes. The duty cycle $d^c(t)$ and the harvested energy $e^h(t)$ in Eq. (7) are percentages, and $e^c(t)$ is the total energy consumption at 100% duty cycle. If $D^{ENC}$ is negative, the energy consumption exceeds the harvested energy and the learning agent is punished with a negative reward in the current learning step; otherwise, it is rewarded. Thus, the Q-learning agent is responsible for estimating the duty cycle and updating the rules of the FIS database, which provides the sensing power and the time span of the sleep and wakeup modes subject to the ENC, from which the energy consumption of the IoT node is evaluated. In the rewarding process, the expected battery reference level $e^{neut}$ (energy neutrality reference) of the IoT node is pre-determined with a tolerance factor $\omega$. If the difference between $e^r$ and $e^{neut}$ is within the tolerance factor, i.e., $|e^r - e^{neut}| \le \omega$, the reward value (the update of the firing strength of the rule) is given as follows:
$$r^i = \phi(r^i) \ast \begin{cases} -z, & \text{if } -2 < D^{ENC} < -1 \\ z, & \text{if } -1 < D^{ENC} < 1 \\ \ldots & \text{[third case illegible in the source]} \end{cases} \tag{8}$$
If $|e^r - e^{neut}| > \omega$, an alternative reward is received by the learning agent as follows:
$$r^i = \phi(r^i) \ast \begin{cases} -0.5 \ast z, & \text{if } D^{ENC} < -1 \\ -z, & \text{if } -1 < D^{ENC} < 1 \\ z, & \text{if } D^{ENC} > 1 \end{cases} \tag{9}$$
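The learning side of the scheme (Eqs. (4), (5), (7) and (9)) can be illustrated with a minimal Python sketch. Several points are assumptions made only for this example: Eq. (7) is read as the harvested-minus-consumed energy normalised by the average harvest, inputs are given as fractions rather than percentages, the reward magnitude `z`, the learning rate `alpha` and the discount factor `gamma` are arbitrary, and the boundary values $D^{ENC} = \pm 1$, which Eq. (9) leaves open, are assigned to the middle case.

```python
import math

def softmax(q_values):
    """Eq. (4): turn per-rule Q-values into probabilities."""
    m = max(q_values)                           # subtract the max for numerical stability
    exps = [math.exp(q - m) for q in q_values]
    total = sum(exps)
    return [x / total for x in exps]

def q_update(q_old, reward, q_next_max, alpha=0.1, gamma=0.9):
    """Eq. (5): one-step update of a rule's Q-value (alpha, gamma assumed)."""
    return (1 - alpha) * q_old + alpha * (reward + gamma * q_next_max)

def distance_from_enc(e_h, e_h_max, e_h_avg, d_c, e_c):
    """Eq. (7), read here as (harvested - consumed) / average harvest (assumed reading)."""
    return (e_h * e_h_max - d_c * e_c) / e_h_avg

def reward_outside_tolerance(phi, d_enc, z=1.0):
    """Eq. (9): reward when |e_r - e_neut| > omega; z = 1.0 is an assumed magnitude."""
    if d_enc < -1:
        return phi * (-0.5 * z)
    if d_enc > 1:
        return phi * z
    return phi * (-z)                           # middle case, boundaries included by assumption

probs = softmax([0.0, 1.0])
d_enc = distance_from_enc(e_h=0.6, e_h_max=1.5, e_h_avg=0.5, d_c=0.7, e_c=0.564)
reward = reward_outside_tolerance(phi=0.8, d_enc=d_enc)
q_new = q_update(q_old=0.0, reward=reward, q_next_max=0.0)
```

In this example the node harvests more than it consumes, so the distance is slightly above 1 and the rule with firing strength 0.8 receives a positive reward, which in turn raises its Q-value.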
Algorithm 1 Fuzzy Q-reinforcement learning (FQRL)
1. Begin
2. Input: …
3. Set …
4. For …
5. // Fuzzy logic system //
6. Fuzzification (obtain the membership degree for the crisp input of each rule)
7. FIS engine calculates the firing strength of each rule
8. Transform the conclusion into a probability by the softmax operation using Eq. (4)
9. Defuzzification step: evaluate $d^c(t)$ using Eq. (6)
10. Perform the sensing and data transmission operations
// Q-learning agent updating rule //
11. Set State …
12. While (convergence == False) do
13. Select an action (sensing power) using the $\varepsilon$-greedy policy
14. Obtain …
15. Calculate the distance from the ENC using Eq. (7)
16. Evaluate the reward on the basis of $D^{ENC}$ using Eqs. (8) and (9)
17. Estimate … for the next time slot
18. Update the Q-value using Eq. (5)
19. END while
20. END for
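The fuzzy-logic part of Algorithm 1 (fuzzification, firing strength and the weighted average of Eq. (6)) might look as follows in Python. The rule base is the one in Table 1, but the membership breakpoints and the crisp values in `E_S` that stand in for the output terms are assumptions chosen for illustration, and the weighted average over singletons is a simplification of full Mamdani centroid defuzzification.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on (a, b), equals 1 on [b, c], falls on (c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

def triangle(x, a, b, c):
    """Triangular membership as a trapezoid with a single peak at b."""
    return trapezoid(x, a, b, b, c)

# Assumed breakpoints on the normalised range [0, 1]
LOW = lambda x: trapezoid(x, 0.0, 0.0, 0.2, 0.4)
MID = lambda x: triangle(x, 0.2, 0.5, 0.8)
HIGH = lambda x: trapezoid(x, 0.6, 0.8, 1.0, 1.0)
E_H = {"little": LOW, "medium": MID, "more": HIGH}   # harvested energy terms
E_R = {"poor": LOW, "fair": MID, "good": HIGH}       # residual energy terms

# Assumed crisp representatives of the output terms
E_S = {"lowest": 0.0, "fair": 0.25, "medium": 0.5, "good": 0.75, "high": 1.0}

# Rule base of Table 1: (e_h term, e_r term, e_s term)
RULES = [
    ("little", "poor", "lowest"), ("little", "fair", "lowest"), ("little", "good", "fair"),
    ("medium", "poor", "lowest"), ("medium", "fair", "fair"), ("medium", "good", "medium"),
    ("more", "poor", "fair"), ("more", "fair", "good"), ("more", "good", "high"),
]

def duty_cycle(e_h, e_r):
    """Eq. (6): firing-strength-weighted average of the rule outputs."""
    strengths = [min(E_H[h](e_h), E_R[r](e_r)) for h, r, _ in RULES]  # fuzzy AND = min
    total = sum(strengths)
    if total == 0.0:
        return 0.0
    weighted = sum(phi * E_S[out] for phi, (_, _, out) in zip(strengths, RULES))
    return weighted / total
```

For instance, `duty_cycle(0.5, 0.5)` fires only rule 5 (medium, fair) and returns the value assigned to "fair", while an input such as `duty_cycle(0.3, 0.5)` blends rules 2 and 5 according to their firing strengths.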
3.3 Complexity Analysis
The time complexity of the presented FQRL algorithm has two parts. (i) The FIS engine operations (lines 6 to 10) take time equal to the number of rules, i.e., $O(\omega^v)$, where $\omega$ is the maximum number of linguistic terms of each parameter and $v$ is the number of parameters; in our case, $\omega(\{\text{little, medium, more}\}) = 3$ and $v(e^h, e^r) = 2$, i.e., $O(3^2) = O(9)$, which is constant time. (ii) The Q-learning updating (lines 11 to 18) depends on the size of the state space $n$ and is $O(n^3)$ in the worst case. Thus, the overall time complexity is $O\left(t(\omega^v + n^3)\right)$, where $t$ is the total number of time slots.
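The rule count behind the $O(\omega^v)$ term can be checked by enumerating every combination of input terms. This is a small sketch; the variable names are chosen for the example only.

```python
from itertools import product

E_H_TERMS = ("little", "medium", "more")   # omega = 3 terms for harvested energy
E_R_TERMS = ("poor", "fair", "good")       # omega = 3 terms for residual energy

# One rule antecedent per combination of terms: omega ** v = 3 ** 2
antecedents = list(product(E_H_TERMS, E_R_TERMS))
rule_count = len(antecedents)
```

The enumeration yields the nine antecedents of Table 1 in the same order, starting with (little, poor) and ending with (more, good).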
4 Simulation Result and Discussion
To evaluate the performance of the proposed FQRL scheme, we use the MPR2400CA [21] mote as the smart sensor and two OS08A10 [22] VGA image sensors with four frame rates (sensing rates), corresponding to the four actions of the Q-learning agent. The battery and the solar panel used to supply power and harvest solar energy are listed in Table 2. The Q-learning agent takes four discrete actions, Actions = {A1, A2, A3, A4}: both VGA sensors active at the maximum sensing rate, both active at half of the sensing rate, only one of them active, and both in sleep mode, respectively. The total power consumption $e^c$ (sensing plus processing and communication) of the smart sensor in wakeup mode for the four actions is {564 mW, 324 mW, 204 mW, 0.1 mW}, respectively. The battery state of charge at the referential energy neutral point ($e^{neut}$) is 70% of the battery capacity, with a tolerance factor of 8%. The solar energy data for May and June 2019 released by ISO California [23] is used to recharge the battery of the smart sensor, and the residual energy and duty cycle are computed on that basis in the experiments.
Table 2 Device parameters
Device        Product                                              Specifications
Sensor node   MPR2400CA                                            Active: 84 mW (CPU processing 30 mW + communication 54 mW); Sleep: 54 μW
VGA sensor    OS08A10 × 2                                          A1 (100% sensing rate): 480 mW; A2 (50% sensing rate): 240 mW; A3 (only one active): 120 mW; A4 (both in sleep mode): 80 μW
Battery       Rechargeable LiMH × 3 (series connection)            B^max (maximum capacity): 5000 mAh; Operating voltage: 3.6 V
Solar panel   ISO (May, June 2019 data, California solar energy)   Maximum power: 1.5 W; Maximum voltage: 19 V; Maximum current: 90 mA
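The per-action power figures quoted above follow from adding the node power to the image-sensor power in Table 2. The short Python sketch below reproduces that arithmetic; treating A4 as "node asleep plus both sensors asleep" is an assumption made here, which gives 0.134 mW and agrees with the quoted 0.1 mW only after rounding.

```python
NODE_ACTIVE_MW = 84.0    # MPR2400CA active: CPU 30 mW + communication 54 mW
NODE_SLEEP_MW = 0.054    # MPR2400CA sleep: 54 uW
SENSOR_MW = {"A1": 480.0, "A2": 240.0, "A3": 120.0, "A4": 0.080}  # OS08A10 x 2

def total_power_mw(action):
    """Node power plus image-sensor power for one action (A4: node assumed asleep)."""
    node = NODE_SLEEP_MW if action == "A4" else NODE_ACTIVE_MW
    return node + SENSOR_MW[action]

totals = {action: total_power_mw(action) for action in SENSOR_MW}
```

The resulting dictionary maps A1, A2 and A3 to 564, 324 and 204 mW, matching the values used in the simulation.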
4.1 Residual Energy Over a Month
To check the residual energy after perpetual operation and energy harvesting in May and June 2019, three levels of initial battery charge are considered, 20% (1000 mAh), 50% (2500 mAh) and 80% (4000 mAh) of the maximum battery capacity, with the referential energy neutral point at 70% (3500 mAh). The energy harvested in a day (24 h) is sampled and normalized by the peak power. Figure 3a and b show that, for the proposed FQRL algorithm, the residual energy $e^r$ initially oscillates and then converges to the reference energy neutral point $e^{neut}$ within 1.5 weeks (10 days) in May and 1 week (7 days) in June 2019, respectively. Figure 3c and d compare the residual energy of FQRL with that of the state-of-the-art algorithms RL and adaptive duty cycle (ADC) for an initial battery charge of 50% (2500 mAh). Figure 3c shows that $e^r$ of the ADC and RL algorithms is unstable and lies far outside (about two times) the tolerance factor (8%, 400 mAh) of the energy neutral point (70%, 3500 mAh). The residual energy of the battery must stay between 3100 and 3900 mAh to satisfy energy neutrality. The proposed FQRL, however, falls below the tolerance band only once, learns quickly (within 7 days) and maintains 70–80% around the energy neutrality point for the rest of the month. Figure 3d shows that all three algorithms settle around the energy neutrality point, but the RL-based algorithm learns slowly and is still in its learning phase on the 20th day of June. The proposed FQRL maintains 70–76% of the full battery capacity, so perpetual operation can be carried out more reliably.
Fig. 3 a Residual energy over month May. b Residual energy over month June (panels a to d plot Residual Energy (%) against Number of days; legends in a and b: Initial energy = 20%, 50%, 80% and Energy neutrality = 70%; legends in c and d: FQRL, RL, ADC and Energy neutrality = 70%)
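The 3100 to 3900 mAh band quoted above follows directly from the 70% neutral point and the 8% tolerance applied to the 5000 mAh battery. A small Python sketch of this check is given below; the function names are hypothetical, and integer percentages are used to keep the arithmetic exact.

```python
def tolerance_band_mah(b_max_mah=5000, e_neut_pct=70, omega_pct=8):
    """Return the (low, high) residual-energy band around the energy neutral point."""
    centre = b_max_mah * e_neut_pct // 100       # 70% of 5000 mAh
    half_width = b_max_mah * omega_pct // 100    # 8% of 5000 mAh
    return centre - half_width, centre + half_width

def within_band(e_r_mah):
    """True if the residual energy satisfies |e_r - e_neut| <= omega."""
    low, high = tolerance_band_mah()
    return low <= e_r_mah <= high

band = tolerance_band_mah()
```

An initial charge of 2500 mAh (50%) starts outside this band, which is why the learning agent first has to raise the residual energy before it can hold it near the neutral point.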
4.2 Duty Cycle Over Month
Figure 4a and b show the performance of the sensors under the learning agent of the proposed FQRL algorithm, i.e., the exercised duty cycle (0, 40, 70 and 100%) with respect to the residual energy stored during May and June 2019. For a low initial battery charge of 20% (1000 mAh), the duty cycle of the sensors is cut down (lower counts) to raise the residual energy and achieve the ENC, whereas for a higher battery charge of 80% (4000 mAh), the duty cycle count is increased to bring the residual energy to the referential energy neutral point. Thus, FQRL is able to learn quickly, adjust the duty cycle and, within 8 days, maintain the residual energy above 65% for sustainable operation, even for the low initial battery condition of 20%. Figure 4c and d compare the exercised duty cycle of the proposed algorithm with that of the state-of-the-art algorithms over the month. For an initial battery charge of 50% (2500 mAh), the exercised duty cycle of FQRL increases over the month while maintaining a higher residual energy than the state-of-the-art algorithms. The exercised duty cycle of the ADC algorithm is unstable because it does not maintain the residual energy properly. These results indicate that the proposed FQRL algorithm uses the harvested energy efficiently: it achieves lower energy consumption and keeps the residual energy at the energy neutrality point in the long term, with a higher duty cycle than the state-of-the-art algorithms. This is because the Q-learning agent quickly updates the firing strength of the rules (i.e., the membership degree of the input parameters) upon receiving a reward.
Fig. 4 a Expected duty count over duty cycle May. b Expected duty count over duty cycle June. c Duty cycle over month May. d Duty cycle over month June (panels a and b plot Count against Duty cycle (%) at 0, 40, 70 and 100%, grouped by initial battery level; panels c and d plot Duty cycle (%) against Number of days for FQRL, RL and ADC)
Table 3 Statistical analysis for 60 days
Metrics (%)   FQRL   RL     ADC
μ(e^r)        68.2   72.3   67.3
μ(d^c)        66.6   55.2   62.3
σ(e^r)        2.10   6.72   2.80
σ(d^c)        2.17   7.82   4.21
Fig. 5 RMSD over month May–June (RMSD (%) plotted against Number of days, 10 to 60, for FQRL, RL and ADC)
4.3 Root Mean Square Deviation (RMSD) of Referential Energy Neutrality Point Over Month
The mean values $\mu(e^r)$, $\mu(d^c)$ and standard deviations $\sigma(e^r)$, $\sigma(d^c)$ of the residual energy and duty cycle indicate, for each algorithm, the sustainability of operation and the stability around $e^{neut}$. Table 3 shows that FQRL provides the best expected duty cycle ($\mu(d^c)$: 66.6%) and a residual energy ($\mu(e^r)$: 68.2%) close to the energy neutrality point ($e^{neut}$: 70%). FQRL also achieves the smallest standard deviations $\sigma(e^r)$ and $\sigma(d^c)$, about 2.10% and 2.17%, respectively. Figure 5 shows that the RMSD from $e^{neut}$ of the RL-based algorithm decreases rapidly as the number of energy harvesting days increases, whereas the proposed FQRL algorithm maintains its residual energy close to $e^{neut}$ as the number of days increases. Overall, the statistical and graphical analysis indicates that the proposed FQRL algorithm is robust and provides an optimal duty cycle, even at low battery capacity, under the energy neutrality point for both data sensing and communication.
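The statistics reported in Table 3 and Fig. 5 can be computed from a daily residual-energy series as in the following Python sketch. The sample series is hypothetical and serves only to show the calculation; the RMSD is taken with respect to the reference point $e^{neut}$ = 70%, and the population standard deviation is used.

```python
import math
import statistics

def rmsd(values, reference):
    """Root mean square deviation of a series from a fixed reference value."""
    return math.sqrt(sum((v - reference) ** 2 for v in values) / len(values))

E_NEUT = 70.0                                    # energy neutrality point in percent
residual = [68.0, 72.0, 70.0, 66.0, 74.0]        # hypothetical daily e_r values in percent

mu = statistics.fmean(residual)                  # mean, as in mu(e_r)
sigma = statistics.pstdev(residual)              # population standard deviation, as in sigma(e_r)
deviation = rmsd(residual, E_NEUT)               # RMSD from e_neut
```

Because the RMSD satisfies $\text{RMSD}^2 = \sigma^2 + (\mu - e^{neut})^2$, it coincides with the standard deviation when the mean sits exactly on the neutral point, as in this sample, and grows when the series is either noisy or offset from $e^{neut}$.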
5 Conclusion and Future Scope
In this paper, a hybrid model-based algorithm, FQRL, is proposed to optimize the energy consumption in EH IoT networks. The presented FQRL algorithm updates the rules of the FIS database based on the reward the QRL agent receives for taking an action. It optimizes the duty cycle, which ultimately reduces the energy consumption of the IoT node and extends the IoT network lifetime. Simulation results show that the residual energy of the smart sensors stays very close to the energy neutrality point, with a low RMSD, and that the duty cycle of the smart sensors is approximately 75–85% of the time in comparison with state-of-the-art algorithms. Thus, the presented algorithm adapts to dynamically harvested energy in real time and controls the duty cycle of the smart sensors. As future work, we plan to optimize the total energy consumption between the source IoT node and the sink node in multipath-communication IoT networks. In such a scenario, relay nodes will be able to harvest energy and forward the signal to the destination to balance the traffic load.
A Circumstantial Methodological Analysis of Recent Studies on NLP-driven Test Automation Approaches Atulya Gupta and Rajendra Prasad Mahapatra
Abstract Test generation is advancing from manual testing toward test automation. With new challenges emerging and legacy challenges persisting, there is a pressing need to make test creation more responsive and less effortful. Natural language processing (NLP), applicable across many domains, is being rapidly adopted by software testing researchers to automate such activities. Such efforts could bring a prominent paradigm shift to the conventional, non-automated creation of test cases from requirement specifications. To explore how NLP can be employed to assist software testing, this paper presents a detailed methodological investigation of some recent research studies. This knowledge will help practitioners understand how NLP is being applied in the testing domain and what role each associated term plays.
Keywords NLP · Test automation · Natural language processing · Test cases · Test case generation
A. Gupta (B) · R. P. Mahapatra Department of Computer Science and Engineering, SRM Institute of Science and Technology, Delhi-NCR Campus, Ghaziabad, India e-mail: [email protected] R. P. Mahapatra e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_14

1 Introduction
Machines increasingly dominate today's world of technology, and their functioning is controlled by the software that powers them. Testing helps ensure that these machines behave exactly as humans intend. Software testing, in general, is a procedure carried out by testers to identify and resolve faults and to gauge the functionality of a software application, in order to ensure that the developed software fulfills its requirements. This activity is effort-intensive [1] and can be carried out either manually (i.e., requiring human intervention), termed manual testing, or within an automated framework (test automation). Most phases of software testing are conducted manually, including the test case design phase, in which a set of test cases (a test suite) is derived, and sometimes documented, in a non-automated way from written requirements that often exist in natural language (NL) [1, 2]. Given time and cost constraints and the need for high reliability and accuracy in the developed software, a manual strategy for deriving test cases from NL requirements is inadequate, and automation is therefore widely needed. With advances in research, NLP is gaining attention and is rapidly being applied to automate multiple software evolution activities.
1.1 Definition and Terminologies Associated with NLP
Natural language processing is a field of artificial intelligence that aims to make interactions between humans and machines (computers) as uncomplicated and as close to NL as possible. It can also be understood as the intersection of artificial intelligence and computational linguistics, catering to users who lack the time to learn or master new languages, thereby lowering the learning curve. The two important subdivisions of NLP are (1) natural language understanding (NLU, or linguistics) and (2) natural language generation (NLG) [3]. The former examines language in terms of context, form and meaning, while the latter is the process of constructing meaningful phrases and sentences in NL. Machine translation, named entity recognition (NER), part-of-speech (POS) tagging and discourse analysis are some of the systematically investigated NLP tasks. Machine translation refers to the automated conversion of text from one human language to another [3]. NER is typically the initial step toward information extraction (IE); it attempts to recognize the named entities in a text and classify them into predefined classes such as locations and names of persons [4]. Information extraction is one of the many areas in which NLP can be applied; it identifies phrases of interest within textual data. In many cases (such as the extraction of entities, e.g., names and dates, from applications), information extraction is a robust way of summarizing information pertinent to a user's needs. POS tagging assigns a part of speech, such as verb, noun, pronoun or adjective, to each word in the text [4]. Discourse analysis is the task of discerning the discourse structure of connected text. Of these tasks, machine translation is one of the few with a direct real-world application (e.g., Google Translate) [1, 3].
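To make the tasks described above concrete, the following minimal sketch tokenizes a sentence, assigns part-of-speech tags from a tiny hand-written lexicon, and marks named entities with a gazetteer lookup. The lexicon, the gazetteer, the tag names and the function names are illustrative assumptions; practical systems rely on trained statistical or neural models rather than lookup tables.

```python
import re

# Hypothetical toy lexicon: word -> part-of-speech tag (an assumption, not a trained model).
POS_LEXICON = {
    "the": "DET", "a": "DET",
    "user": "NOUN", "password": "NOUN", "system": "NOUN",
    "enters": "VERB", "validates": "VERB", "visits": "VERB",
    "valid": "ADJ",
}

# Hypothetical gazetteer for named entity recognition: surface form -> entity class.
GAZETTEER = {"alice": "PERSON", "delhi": "LOCATION"}


def tokenize(text):
    """Split text into alphabetic word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z]+", text)


def pos_tag(tokens):
    """Tag tokens by lexicon lookup; unknown capitalized words become PROPN, others X."""
    tagged = []
    for token in tokens:
        tag = POS_LEXICON.get(token.lower())
        if tag is None:
            tag = "PROPN" if token[0].isupper() else "X"
        tagged.append((token, tag))
    return tagged


def recognize_entities(tokens):
    """Return (token, entity class) pairs for tokens found in the gazetteer."""
    return [(t, GAZETTEER[t.lower()]) for t in tokens if t.lower() in GAZETTEER]


tokens = tokenize("Alice visits Delhi.")
print(pos_tag(tokens))
print(recognize_entities(tokens))
```

The lookup tables keep the example self-contained; the point is only to show what kind of output each task produces for a requirement-like sentence.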
This introduction (Sect. 1) gives a brief overview of some common NLP terms (Fig. 1) to improve the readability of the paper and to help readers grasp NLP concepts and their use in software testing while reading the remainder of the paper. The subsequent sections cover the research studies that use NLP in software testing (Sect. 2), a comparative analysis of these studies (Sect. 3), and the conclusion (Sect. 4).
2 Literature Review
Automatic generation of test cases brings a wide range of benefits: it saves the time that exhaustive testing procedures consume and the effort that test engineers expend, it helps ensure that the test cases properly cover all requirements, and it eases cost constraints. The study in [4] focused mainly on the following issues:
1. To support test automation in modern industrial systems, the behavioral models deployed need to be sufficiently precise and comprehensive. As a result, these models tend to become complex and expensive and hence are not necessarily part of the development process.
2. Many techniques in the literature require manual correction of the test models generated from natural language requirements before test automation is possible, which creates scalability problems.
3. Approaches that generate test cases directly from requirements written in NL [5–7] do not produce executable test cases and thus often need manual intervention to provide test input data.
4. Where test cases generated directly from NL requirement specifications are executable and include test input data [8], the requirement specifications must be written in a controlled natural language (CNL). A CNL, or restricted NL, is obtained by limiting vocabulary and grammar, thereby reducing or eliminating complexity and ambiguity when NLP-based techniques extract test cases from NL requirements.
Given these issues, the study [4] examined the problems of generating acceptance test cases from NL requirements and proposed an automated framework that creates executable system-level acceptance test cases from NL requirements without relying on additional behavioral modeling.
The automated approach is called 'Use Case Modeling for System-level, Acceptance Tests Generation' (UMTG). It produces executable system-level test cases, including test data, by using the behavioral information in use case specifications supplemented with a domain model. Consistent with the aim of the study, the researchers used NLP to avoid behavioral modeling (i.e., sequence or activity diagrams) in favor of a more analyzable and structured form of use case specification known as restricted use case modeling (RUCM). Details about the applicability of RUCM can be found in the literature (e.g., [9, 10]). In brief, the approach seeks a balance among several objectives: use cases readable by all stakeholders, minimal modeling, and sufficient detail for the automated creation of acceptance test cases. UMTG uses NLP to construct use case test models (UCTMs) from RUCM specifications. To derive use case scenarios from the UCTMs, UMTG relies on a branch coverage criterion augmented with def-use and subtype coverage criteria. During NLP, a list of textual pre-, post- and guard conditions is extracted from each use case specification. These conditions enable UMTG to determine the constraints that test inputs must satisfy to cover a test scenario [4]. UMTG automatically translates the extracted NL conditions into constraints in the Object Constraint Language (OCL), which expresses the conditions in terms of the entities in the domain model. To generate the OCL constraints, UMTG exploits an advanced NLP technique, semantic role labeling (SRL), combined with semantic similarity detection; it employs CNP for SRL. The generated OCL constraints are then used to create test input data automatically through constraint solving with the Alloy Analyzer. The methodology adopted in [4] is a systematic 10-step process, outlined in Fig. 2. The researchers note that steps 3 and 4 (Fig. 2) are iterated until the domain model is complete. The significance of this methodology lies in its embedded NLP pipeline and in the components built to process use case specifications in RUCM [4].

Fig. 1 Basic terminologies in NLP

Improper examination of functional requirements may sometimes cause significant test cases to be missed.
The authors of [11] concentrated on the same issue and presented a methodology for the automated inspection of the software requirement specification (SRS) to extract test cases. The authors rely heavily on a keyword-driven approach for test case extraction. The methodology is quite basic and comprises three phases (Fig. 3) for automated test case generation. The functioning of the entire methodology [11] is explained through an algorithm that details phase 2, in which conjunctive sentences containing the keywords 'if' and 'then' are extracted. The test case table contains several fields, which are filled as follows: nouns extracted from the conjunctive sentences act as test data, the statement linked with 'if' acts as the test step, and the statement associated with 'then' acts as the expected result. The actual result is compared with the expected result to update the pass/fail status of each test case. The algorithm iterates until no scenario or conjunctive statement in the requirement document is left unexamined. The authors validate their methodology with a motivating example and conclude that the automated approach ensures maximum coverage of the requirements described in the SRS.
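The keyword-based extraction described above can be sketched as follows. This is a simplified illustration and not the algorithm of [11]: it uses a regular expression in place of a full parser, it takes test data from a hypothetical list of known nouns rather than from a POS tagger, and all function and field names are assumptions.

```python
import re

# Matches conjunctive requirement sentences of the form "If <condition>, then <result>."
IF_THEN = re.compile(r"\bif\b(?P<step>.+?)\bthen\b(?P<expected>.+)", re.IGNORECASE)

# Hypothetical noun vocabulary standing in for POS-based noun extraction (assumption).
KNOWN_NOUNS = {"user", "password", "account", "message"}


def split_sentences(document):
    """Naive sentence splitter on terminal punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def clean(fragment):
    """Trim whitespace and surrounding punctuation from a sentence fragment."""
    return fragment.strip(" ,.;")


def extract_test_cases(document):
    """Build one test case row per sentence that contains both 'if' and 'then'."""
    rows = []
    for sentence in split_sentences(document):
        match = IF_THEN.search(sentence)
        if match is None:
            continue  # not a conjunctive if/then sentence
        words = re.findall(r"[a-z]+", sentence.lower())
        rows.append({
            "test_step": clean(match.group("step")),
            "expected_result": clean(match.group("expected")),
            "test_data": sorted(set(words) & KNOWN_NOUNS),
            "status": "not run",  # to be updated after comparing actual and expected results
        })
    return rows


srs = ("The system shall store user profiles. "
       "If the user enters a wrong password, then the system displays an error message.")
for row in extract_test_cases(srs):
    print(row)
```

Only the second sentence of the sample text yields a row, which mirrors the restriction to conjunctive statements: requirements phrased in any other form are skipped.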
Fig. 2 Block representation of UMTG approach
Fig. 3 Block representation of phases embedded in the methodology
The author of [12] addressed the manual creation of test cases in a setting where most industrial projects capture specifications from business stakeholders in the form of user stories. The study investigated a practical alternative to manual test case creation: automating test case generation within an Agile software development workflow by using NL user stories and acceptance criteria. Because the information in user stories alone is inadequate for NLP-based test case generation, two new input parameters (a dictionary and a test scenario description) were introduced. To demonstrate practicability, a tool was presented that uses NLP techniques to create functional test cases automatically from test scenarios that do not rely on formal structures. The author nevertheless provided a set of guidelines for writing test scenario descriptions, to avoid anything that would degrade the quality of the generated test cases. The methodology [12] proceeds as follows:
1. The test scenario description and user stories are processed as inputs by an NL parser.
2. The dependencies obtained from the parser are used to create a UML activity diagram that shows the flow of functionality.
3. An activity graph, in which every activity is represented as a node, is generated from the activity diagram.
4. A depth-first search (DFS) algorithm traverses the activity graph from the initial to the final activity state to discover all possible valid paths.
5. Functional test cases are then generated from these test paths (the final outcome).
For the system design of the test case creation tool, the inputs are:
(1) Acceptance criteria (conditions in the Given/When/Then format for determining that the product meets the overall requirements stated by customers).
(2) User stories (in the format As a [Actor]/I want [functionality]/So that [business value]).
(3) A dictionary of keywords (those most commonly used in user stories) with their related test steps.
(4) A test scenario description (which describes the series of interactions between system and user to attain a specific goal).
After the inputs are fed to the system and processed by means of NLP, frames act as transient storage of data to help construct activities.
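The path discovery in step 4 of the methodology above can be sketched as a depth-first enumeration of all simple paths between the initial and final activity. The graph contents and the function name are hypothetical, and the sketch is an illustration of the general technique rather than the implementation used in [12].

```python
def all_paths(graph, start, end):
    """Enumerate every simple path from start to end with an iterative depth-first search."""
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            paths.append(path)
            continue
        # Push successors in reverse so they are explored in their listed order.
        for successor in reversed(graph.get(node, [])):
            if successor not in path:  # skip nodes already on this path (no loops)
                stack.append((successor, path + [successor]))
    return paths


# Hypothetical activity graph for a login scenario (assumption, for illustration only).
activity_graph = {
    "start": ["enter credentials"],
    "enter credentials": ["show dashboard", "show error"],
    "show dashboard": ["end"],
    "show error": ["end"],
}

for number, path in enumerate(all_paths(activity_graph, "start", "end"), start=1):
    print(f"Test case {number}: " + " -> ".join(path))
```

Each returned path corresponds to one candidate functional test case. The sketch deliberately ignores cycles, so activities that loop back to earlier ones would need additional handling.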
The kind of frame representation used by the author in this study had already been applied in [13]. The author [12] validated the accuracy of the tool with data collected through experiments. The experimental setup evaluated the utility and performance of the tool by first recording various parameters during manual generation of test cases and then during automated generation with the proposed tool. The test cases generated manually and automatically were saved for post-survey analysis. According to the discussion and results section, the preliminary analysis gave indications on aspects such as whether the test cases provided good coverage and whether the tool generated high-quality, precise test cases.

Surveys show that 79% of requirements were reported to be written in free-flowing NL, and generating test cases from such requirements, which often follow no defined structure, is a daunting challenge [14]. Careful understanding of high-level requirement sentences in a functional requirement document (FRD) is vital in large software projects, especially those involving offshore or distributed development and testing groups. More specifically, the understanding of requirements should be consistent across groups or teams, and automated analysis of requirements helps attain such consistency. With this aim, the authors of [14] proposed a tool called 'Litmus' for the automated creation of functional test cases from requirements written in English (NL) in FRDs. The approach proposed in [14] is formulated in five steps, which describe the working of Litmus. The tool considers each requirement one sentence at a time and begins processing with an entity extractor. The methodology does not impose any restriction on the structure of the sentence. The authors' main attention was on studies dealing with unconstrained NL (e.g., [15]). In the first step, the model [14] identifies and extracts all entities from the requirement document. The tool uses the Link Grammar (LG) parser [16] to parse every requirement sentence, and each group of words labeled NP (noun phrase) is marked as an entity. The LG dictionary is modified as the approach requires. The automatically recognized entities are presented to users for validation. Including heuristics and user validation helps increase the accuracy of the tool. In the second step, the model identifies testable sentences. Each sentence is parsed individually, again using LG together with the entity information, to reveal the links in the sentence. A sentence is tagged 'testable' or 'not testable' depending on whether it can be linked successfully. To be 'testable', a sentence needs to contain a subject, an action and an optional object. The next step splits compound and complex sentences into simple sentences. A testable requirement that is found to be a compound sentence is simplified, i.e., split into a group of simple sentences.
This step is accompanied by a substantial set of rules and related actions for simplifying complex and compound sentences. In the next phase, test intents are generated from each simplified sentence; a test intent is described as a group of words connected by a path of links in the LG parse and bounded by NPs. The constituent parts of a test intent are a subject, an action and an optional object. The formation of positive and negative test cases is the final phase of the tool. The test intents generated in the previous step are grouped and arranged in temporal sequence to create positive test cases. These test cases verify the expected behavior of the system. Negative test cases are needed to verify the system's behavior under exceptional circumstances and are generated, wherever applicable, using boundary value analysis or other techniques. The logical inverse of the conditions considered when generating a positive test is required and is obtained via a logical negation operation, neg( ). The authors validated the accuracy of the tool (approx. 77%) by applying it to different domains, namely pharma, IT and telecom [14].
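The two ideas behind negative test cases mentioned above, logical negation of a condition and boundary value analysis, can be illustrated with a short sketch. The condition format (a field, an operator and a value), the sample requirement and the function names are assumptions made for illustration; they are not taken from the Litmus tool.

```python
# Logical inverse of each comparison operator, playing the role of neg().
NEGATED = {"<": ">=", "<=": ">", ">": "<=", ">=": "<", "==": "!=", "!=": "=="}


def neg(condition):
    """Return the logical inverse of a (field, operator, value) condition."""
    field, operator, value = condition
    return (field, NEGATED[operator], value)


def boundary_values(low, high):
    """Classic boundary value analysis for an inclusive integer range [low, high]."""
    return {
        "valid": [low, low + 1, high - 1, high],
        "invalid": [low - 1, high + 1],
    }


# Hypothetical requirement: "The password length must be at least 8 and at most 16."
positive_conditions = [("password_length", ">=", 8), ("password_length", "<=", 16)]
negative_conditions = [neg(c) for c in positive_conditions]

print(negative_conditions)
print(boundary_values(8, 16))
```

The negated conditions describe the inputs a negative test should supply, while the invalid boundary values give concrete candidates for those inputs.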
3 Comparative Analysis
The goal of this section is to highlight the various NLP techniques and approaches used to assist software testing. It examines the methodologies of the studies reviewed in Sect. 2, along with their advantages and disadvantages, in a simple tabular structure (Table 1).
4 Conclusion
Reviewing these studies, we found that as applications grow more complex day by day, development processes have accelerated. To keep pace with this increasing complexity, test automation based on NLP approaches should prove efficient. An in-depth theoretical examination of the studies in the literature review revealed shortcomings with respect to certain parameters, such as full automation (removal of human inspection), a suitable methodology for handling diverse formats of functional requirements, and system intelligence during test case generation in an automated tool framework. In future, these parameters could be addressed, while also exploiting the advantages of NLP, as it can further be used to write the test steps of test cases directly in NL without resorting to programming languages for test scripts. This would not only produce legible test cases but also make them easier for all stakeholders to understand.
Table 1 Tabular representation of major details from reviewed studies

I. Automatic Generation of Acceptance Test Cases from Use Case Specifications: an NLP-based Approach [4]
Year of publishing: 2020
Type of NLP techniques/parser used in the mentioned studies: Methodology relied on 5 NLP tactics: 1. Tokenization 2. NER 3. POS tagging 4. SRL 5. Semantic Similarity Detection
Advantages:
1. Industrial case studies mentioned in the methodology stated the effectiveness of the automated approach in terms of generating up to 95% and 99% correct OCL constraints, respectively
2. Lessened the effort factor, as compared to one required in manual test case generation process
3. Dependence of automated approach on modern advances in NLP makes the whole system robust
Disadvantages:
UMTG is still not free from human intervention, as along with computation time, this approach requires hours of human labor that involves mapping tables and writing of constraints

II. Constructing Test Cases Using Natural Language Processing [11]
Year of publishing: 2017
Type of NLP techniques/parser used in the mentioned studies: Not stated, as algorithm is designed based on keyword notion
Advantages:
Novel yet simplified approach for auto-inspection of requirements stated in NL in FRDs which partially or to some extent could entirely diminish the human intervention in testing procedures
Disadvantages:
1. The system proposed in this study does not consider functional requirement of any format, instead of that, requirements in the form of conjunctive statements are utilized only, hence restricting the working of this approach
2. Overall system's accuracy could be increased by identifying significant test cases for future and then uplifting their priorities

III. Automatic Generation of Test Cases for Agile using Natural Language Processing [12]
Year of publishing: 2017
Type of NLP techniques/parser used in the mentioned studies: Input fed to the test case generation tool is preprocessed in two phases, i.e., Lexical Analysis and Syntactic Analysis. Stanford Parser and POS tagger were employed for this. Lexical analysis comprises: 1. Tokenization 2. Splitting of sentence 3. POS tagger. Stanford Parser performed syntactic analysis for the parsing of lexically analyzed text. This phase constitutes: 1. Dependency Parsing 2. Voice Classification. In cases where there are numerous user stories per test scenario explanation, three methods were employed, i.e., 1. Basic String Matching 2. Lemmatization 3. Synonym extraction using WordNet
Advantages:
1. Lessened the effort needed while generation of test cases by 31%
2. Enhanced test coverage of the desiderata by 23%
3. Reduction in time by 61% and for effort by 87% was obtained, in case of generating test cases for often varied stories of a feature
4. Author stated this approach to be the first endeavor in replacing manual with automated test case creation in Agile environment
Disadvantages:
1. No support for looping back to past activities or for concurrent activities or parallel course of activities in activity diagram and could be enhanced
2. Writing format of scenario description is not simplified and could be done by including numerous If-Else constructs to attain looping and concurrency
3. Grammatically accurate sentences were not continually generated by frames and hence could be updated with the aid of linguistic specialist
4. Test case quality, utility and performance of the tool could be improved, in case if understandability in terms of semantics or context is supported by the tool
5. The tool itself is not an entire testing solution for Agile software evolution and could be made complete by integrating it into test management software

IV. Litmus: Generation of Test Cases from Functional Requirements in Natural Language [14]
Year of publishing: 2012
Type of NLP techniques/parser used in the mentioned studies: Utilization of Link Grammar Parser for examining the free form English
Advantages:
1. Inclusion of test intents results in finer test granularity which in turn aids in ensuring preferable test coverage
2. Accuracy level of tool stated as to be high, i.e., 84% wherein accuracy is measured as the capability of linking grammatically right sentences in case of LG, and for Litmus tool, it is the percentage of the number of right test cases to the total amount of test case created
3. Applicability of tool to business analyst
4. Able to recognize omitted test cases by human analyst
Disadvantages:
1. Failed in few cases for generating correct test cases due to unintelligible, incomplete and not comprehensible test intents
2. Despite of high accuracy, in some cases, tool showed up only 56% of accuracy. Reason behind this low accuracy revealed, set of small sentences across the document, which, when not handled correctly, results in multiple identical errors
3. Inadequacy in ranking mechanism for few cases since the LG parser utilized in the tool is designed to support links of small length
4. Accuracy levels of the methodology could be improved by constructing domain specific semantics and a robust set of language
References
1. Garousi, V., Bauer, S., Felderer, M.: NLP-assisted software testing: a systematic mapping of the literature. Inf. Softw. Technol. 126, Article 106321 (2020)
2. Belsare, D., Bhate, M.: A review of NLP oriented automated test case generation framework in testing. Int. J. Future Gener. Commun. Network. 13(1), 593–596 (2020)
3. Khurana, D., Koli, A., Khatter, K., Singh, S.: Natural Language Processing: State of The Art, Current Trends and Challenges. arXiv (2017)
4. Wang, C., Pastore, F., Goknil, A., Briand, L.C.: Automatic generation of acceptance test cases from use case specifications: an NLP-based approach. IEEE Trans. Softw. Eng. (2020)
5. Zhang, M., Yue, T., Ali, S., Zhang, H., Wu, J.: A systematic approach to automatically derive test cases from use cases specified in restricted natural languages. In: Amyot, D., Fonseca i Casas, P., Mussbacher, G. (eds.) System Analysis and Modeling: Models and Reusability. SAM 2014. Lecture Notes in Computer Science, vol. 8769, pp. 142–157. Springer, Cham (2014)
6. Sarmiento, E., et al.: Test scenario generation from natural language requirements descriptions based on Petri-Nets. Electron. Notes Theoret. Comput. Sci. 329, 123–148 (2016)
7. Sarmiento, E., et al.: C&L: generating model based test cases from natural language requirements descriptions. In: RET 2014, pp. 32–38. IEEE, Karlskrona, Sweden (2014)
8. Carvalho, G., Falcão, D., Barros, F., Sampaio, A., Mota, A., Motta, L., Blackburn, M.: NAT2TEST SCR: Test case generation from natural language requirements based on SCR specifications. Sci. Comput. Program. 95, 275–297 (2014)
9. Yue, T., Ali, S., Zhang, M.: RTCM: a natural language based, automated, and practical test case generation framework. In: ISSTA 2015, pp. 397–408 (2015)
10. Mai, P.X., Pastore, F., Goknil, A., Briand, L.C.: A natural language programming approach for requirements-based security testing. In: ISSRE 2018, pp. 58–69. IEEE, Memphis, TN, USA (2018)
11. Ansari, A., Fatima, A.S., Shagufta, M.B., Tehreem, S.: Constructing test cases using natural language processing. In: 2017 Third International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), pp. 95–99. IEEE, Chennai, India (2017)
12. Rane, P.P.: Automatic generation of test cases for agile using natural language processing. MSc thesis, Virginia Tech. University (2017)
13. Bhatia, J., Sharma, R., Biswas, K.K., Ghaisas, S.: Using grammatical knowledge patterns for structuring requirements specifications. In: 2013 3rd International Workshop on Requirements Patterns (RePa), pp. 31–34. IEEE, Rio de Janeiro, Brazil (2013)
14. Dwarakanath, A., Sengupta, S.: Litmus: generation of test cases from functional requirements in natural language. In: Bouma, G., Ittoo, A., Métais, E., Wortmann, H. (eds.) Natural Language Processing and Information Systems. NLDB 2012. Lecture Notes in Computer Science, vol. 7337, pp. 58–69. Springer, Berlin, Heidelberg (2012)
15. Sinha, A., Sutton, S.M., Paradkar, A.M.: Text2Test: automated inspection of natural language use cases. In: Third International Conference on Software Testing, Verification and Validation, pp. 155–164 (2010)
16. Sleator, D.D.K., Temperley, D.: Parsing English with a link grammar. In: Third International Workshop on Parsing Technologies (1993)
Plant Disease Recognition from Leaf Images Using Convolutional Neural Network S. Preethi, A. Arun Prakash, and R. Thangarajan
Abstract Automated recognition of plant diseases from leaf images is important in agriculture. To this end, a convolutional neural network (ConvNet)-based classifier is proposed in this paper. Existing methods retrain general CNN architectures such as AlexNet and VGGNet for disease recognition, whereas the proposed CNN architecture contains only a few convolution layers compared with the existing CNN models. The number of learnable parameters in the existing networks is 30 times higher than in the proposed model. The proposed ConvNet is trained on a subset of the publicly available dataset that contains 24,249 images of diseased and healthy leaves of crops cultivated in India. The image samples in the dataset are modified by thresholding the original images to remove the background. The rectified linear unit (ReLU) activation function is used across all layers. The sparse categorical cross-entropy loss function is used with the Adam optimization algorithm to tune the network parameters by minimizing the loss. The proposed model achieved a training accuracy of 96.39%, a validation accuracy of 90.97%, and a testing accuracy of 90.59%. Its performance is comparable to that of the existing methods, but with a reduced number of parameters. Owing to this large reduction in model parameters, the proposed model could be deployed on resource-constrained edge computing devices for real-time processing.
Keywords Plant disease classification · Convolutional neural networks · AI in agriculture
S. Preethi (B) · A. Arun Prakash
Department of Electronics and Communication Engineering, Kongu Engineering College, Erode, India
R. Thangarajan
Department of Information Technology, Kongu Engineering College, Erode, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_15
1 Introduction
It is well known that the rapid growth of the global population makes it challenging to balance the demand and supply of food grains. The majority of pre-harvest losses in forecasted crop yield are due to plant diseases that are left untreated, which makes disease control an important factor in food security. It is estimated [1] that infectious diseases reduce crop yield by 40% on average. Therefore, preventing the spread of a disease at an early stage by recognizing its type is crucial to avert large losses in production.
There has been a continuing effort to detect and classify the disease affecting a plant using image processing techniques. However, most of those algorithms used traditional image processing steps such as preprocessing, finding edges and contours, and feature selection and extraction, with the result finally fed to a classifier such as a support vector machine (SVM) [2]. A notable drawback of traditional classifiers is that their performance degrades as the dimension of the input image (and therefore the dimension of the features extracted from it) increases beyond a certain threshold. They were therefore limited to small-scale classification problems. Following the demonstration in 2012 by AlexNet [3] of a convolutional neural network (CNN) on the large-scale problem of recognizing a thousand object classes, the potential of CNNs has been tapped for a wide variety of problems in computer vision. Notable and widely used ConvNets include VGG net [4], ResNet [5], and U-Net [6] for medical image segmentation, along with many others tailored to particular applications. Plant disease classification is no exception. Readers of this paper are expected to have some background knowledge of deep learning.
The paper is organized as follows. Section 2 gives a quick review of related work that used traditional image processing approaches and summarizes a few works that applied recent CNN architectures to the disease recognition task. Section 3 describes the proposed architecture. Finally, the performance of the proposed architecture is compared with that of the existing architectures in the results section.
2 Related Work
There are numerous research papers on plant disease classification developed for a specific crop, for example, rice [7]. A detailed survey of such works, with the types of crops considered and the classifiers used, is given in [8] and is not reproduced here. The rest of this section discusses recent advances in using ConvNets for plant disease classification from leaf images.
The first use of ConvNets for plant disease classification was proposed by Sladojevic et al. [9]. The authors considered thirteen different types of diseases and trained the model on a private dataset, starting from weights pre-trained on ImageNet, using the Caffe framework. Since the dataset had a limited number of images per class and was not made available to the public, their results could not be compared with other work. Subsequently, in the same year, Mohanty et al. evaluated two popular CNN architectures, AlexNet [3] and GoogLeNet [10], on the publicly available PlantVillage dataset [11]. They trained the CNN models to classify 26 diseases for 14 crops with varying distributions of images for training and testing. Later, in 2018, Ferentinos [12] evaluated the commonly used VGG net [4] in addition to AlexNet and GoogLeNet on the combined publicly available dataset. That work created 56 classes by treating the type of plant and its disease as a pair, that is, the output of the classifier is a pair: [plant, disease]. Recently, Darwish et al. [13] applied the pre-trained CNN networks VGG16 and VGG19 to an intra-class classification of diseases specific to maize.
3 Experimental Setup
The first step in developing any supervised learning model is to collect training samples with their respective ground truth labels. The data samples and the correctness of their labels play a major role in determining the accuracy of the classifier. Any inadvertent error in labeling the samples reduces performance, and in such a case it is difficult to debug the cause of the degradation because numerous factors could potentially affect the performance of the classifier. In order to train and test a supervised model, it is also necessary to choose a framework that provides all the required components of the model. Keeping this in mind, this section is divided into two subsections: 3.1 Data Set and 3.2 Proposed Model.
3.1 Data Set
A publicly available dataset [11] has been considered for the study. The dataset contains nearly 50,000 images spanning 14 crop species and 26 types of diseases (17 fungal, 4 bacterial, 2 viral, 2 molds, 1 mite). However, only 8 of the 14 crop species (those suitable for the Indian subcontinent) were considered for the study. Table 1 lists the crop species considered and their respective class labels (the names of the pathogens that cause the diseases). A background subtracted dataset was derived from the original dataset by applying a simple thresholding operation to the original images. The background subtracted images were then used for training and testing the model to evaluate the performance of the classifier. Figure 1 shows a sample from the set of original images and its corresponding background subtracted image. Figure 2 shows a few samples of the training images used to train the classifier and their respective class numbers (ground truth) with reference to Table 1.

Table 1 Name of plant and its respective label considered for the study
S. No.   Plant         Class. Disease name
1        Cherry        1. Podosphaera spp.; 2. Healthy
2        Corn          3. Cercospora zeaemaydis; 4. Puccinia sorghi; 5. Exserohilum turcicum; 6. Healthy
3        Grape         7. Guignardia bidwelli; 8. Phaeomoriella spp.; 9. Pseudocercospora; 10. Healthy
4        Orange        11. Candidatus; 12. Healthy
5        Bell Pepper   13. Xanthomonas campestris; 14. Healthy
6        Potato        15. Phytophthora infestans; 16. Healthy
7        Soyabean      17. Healthy
8        Tomato        18. Alternaria solani; 19. Septoria lycopersici; 20. Corynespora cassiicola; 21. Fulvia fulva; 22. Xanthomonas; 23. Vesicatoria; 24. Yellow Leaf Curl Virus; 25. Mosaic; 26. Tetranychus; 27. Healthy

Fig. 1 a Original and b background subtracted image
Fig. 2 Training images with respective class labels
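The background subtraction step described in Sect. 3.1 can be illustrated with a small sketch. The paper states only that a simple thresholding operation was applied; the rule used here (a pixel whose mean RGB intensity exceeds a threshold is treated as background), the default threshold, and the function name `subtract_background` are assumptions made for illustration.

```python
def subtract_background(image, threshold=128):
    """Set pixels whose mean RGB intensity exceeds `threshold` to black.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    The "bright pixel = background" rule and the default threshold are
    illustrative assumptions, not the authors' documented procedure.
    """
    result = []
    for row in image:
        new_row = []
        for r, g, b in row:
            intensity = (r + g + b) / 3
            # Keep dark (leaf) pixels unchanged, blank out bright ones.
            new_row.append((0, 0, 0) if intensity > threshold else (r, g, b))
        result.append(new_row)
    return result


# Tiny 2 x 2 example: two leaf-coloured pixels and two bright background pixels.
leaf = (40, 120, 30)
background = (200, 200, 200)
sample = [[background, leaf], [leaf, background]]
masked = subtract_background(sample)
```

In practice the threshold and the colour channel used would have to be tuned on the actual images; a fixed global threshold is the simplest possible choice.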
3.2 Proposed Model
Unlike previous approaches, where a few existing pre-trained models were fine-tuned by modifying their final layers, we propose a novel CNN architecture tailored to plant disease classification on edge computing devices. The proposed architecture is shown in Fig. 3, along with the dimensions of each layer. The architecture consists of three convolution layers, each followed by a max-pooling layer to reduce the spatial dimension and dropout for regularization, followed by three fully connected layers. The rectified linear unit (ReLU) activation function was used to avoid the vanishing gradient problem that may occur during back-propagation. To prevent the network from over-fitting to the training data, dropout [14] regularization was applied throughout the network. The total number of parameters of the network is only about 2 million, compared to 62 million for AlexNet and 138 million for VGG net. This in turn reduces the number of computations to a large extent.
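To make the parameter comparison concrete, the following sketch counts the trainable parameters of a network with three convolution layers (each followed by 2 × 2 max-pooling) and three fully connected layers. The filter counts, kernel size and dense-layer widths are hypothetical values chosen so that the total lands near 2 million; the actual layer dimensions are those given in Fig. 3 and are not reproduced here. Only the 100 × 100 × 3 input size and the 27 output classes come from the text.

```python
def conv2d_params(kernel, in_channels, out_channels):
    # Weights of a kernel x kernel convolution plus one bias per filter.
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_features, out_features):
    # Weight matrix plus one bias per output unit.
    return in_features * out_features + out_features


def count_parameters(input_shape=(100, 100, 3),
                     conv_filters=(32, 64, 128),   # hypothetical filter counts
                     dense_units=(100, 64, 27),    # hypothetical widths; 27 classes
                     kernel=3):
    height, width, channels = input_shape
    total = 0
    for filters in conv_filters:
        total += conv2d_params(kernel, channels, filters)
        channels = filters
        # 'Same' padding is assumed, so only the 2 x 2 pooling halves the size.
        height, width = height // 2, width // 2
    features = height * width * channels
    for units in dense_units:
        total += dense_params(features, units)
        features = units
    return total


total = count_parameters()
alexnet_ratio = 62e6 / 2e6   # parameter ratio quoted against AlexNet
vgg_ratio = 138e6 / 2e6      # parameter ratio quoted against VGG net
```

With these assumed sizes most of the parameters sit in the first fully connected layer, which is typical for small CNNs that flatten a feature map directly into dense layers.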
Fig. 3 Proposed CNN architecture
4 Results
After experimenting with various CNN architectures, it was observed that a model trained with background subtracted images produces significantly higher accuracy than one trained with images without background subtraction. Therefore, all the results discussed are for the background subtracted images. The samples in the dataset are divided into training (80%), validation (10%), and testing (10%) sets. The samples have varying spatial dimensions; therefore, all samples are resized to a fixed size of 100 × 100 × 3. The network weights were initialized using the Xavier initialization method. The input samples were divided into batches of size 256 and fed into the network for forward propagation. The sparse categorical cross entropy loss was used as the objective function to measure the performance of the network. The weights of the network were optimized using the Adam [15] optimization algorithm. The epoch versus accuracy and epoch versus loss graphs are shown in Fig. 4 (each epoch consists of 95 iterations). It can be observed from Fig. 4 that the model achieved a training accuracy of about 96.39% and a validation accuracy of about 90% (with the learning rate set to 0.00085 and a dropout probability of 50%). Subsequently, the model with the trained weights was evaluated on the test dataset, where it achieved an accuracy of about 90.5%.
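For readers less familiar with the loss named above, the sketch below computes the sparse categorical cross entropy for a batch: labels are integer class indices (here 0 to 26 for the 27 classes of Table 1) rather than one-hot vectors, and the loss is the mean negative log-probability assigned to the true class. This is a plain-Python illustration of the standard definition, not the framework code used by the authors; the function names are illustrative.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(batch_logits, labels):
    """Mean of -log p(true class) over the batch; labels are integer indices."""
    losses = []
    for logits, label in zip(batch_logits, labels):
        probs = softmax(logits)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)


num_classes = 27
# An untrained network that outputs identical scores for every class.
uniform_loss = sparse_categorical_cross_entropy([[0.0] * num_classes], [3])

# A confident, correct prediction for class 3 gives a much smaller loss.
confident = [0.0] * num_classes
confident[3] = 10.0
confident_loss = sparse_categorical_cross_entropy([confident], [3])
```

A uniform prediction over 27 classes gives a loss of ln 27, which is a useful sanity check for the first training iterations.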
5 Conclusion
A ConvNet architecture for image-based plant disease classification was proposed, and the performance of the model was evaluated on a publicly available dataset. The proposed model contains only 2 million trainable parameters, against 62 million and 138 million parameters for the existing models. Therefore, the computational requirement was reduced by a factor of roughly 30 to 60, but with a significant reduction in accuracy compared to the existing models. Further, the proposed model is more suitable for implementation on resource-constrained edge computing devices. A tradeoff has to be made between the required accuracy and real-time performance.

Fig. 4 Accuracy and loss a for training set, b for validation set
References
1. Flood, J.: The importance of plant health to food security. Food Sec. 2, 215–231 (2010)
2. Rumpf, T., et al.: Early detection and classification of plant disease with support vector machines based on hyperspectral reflectance. Comput. Electron. Agric. 74, 91–99 (2010)
3. Krizhevsky, A., et al.: ImageNet classification with deep convolutional neural networks. Neural Inform. Proc. Syst. 1, 1097–1105 (2012)
4. Simonyan, K., et al.: Very deep convolutional networks for large-scale image recognition. https://arxiv.org/abs/1409.1556 (2014)
5. He, K., et al.: Deep residual learning for image recognition. https://arxiv.org/abs/1512.03385 (2015)
6. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds.) Medical Image Computing and Computer-Assisted Intervention, MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015)
7. Phadikar, S., Sil, J., Das, A.K.: Rice diseases classification using feature selection and rule generation techniques. Comput. Electron. Agric. 90, 76–85 (2013)
8. Kaur, S., Pandey, S., Goel, S.: Plants disease identification and classification through leaf images: a survey. Arch. Comput. Methods Eng. 26, 507–530 (2019)
9. Sladojevic, S., et al.: Deep neural networks based recognition of plant disease by leaf image classification. Comput. Intell. Neurosci. 2016, 1–11 (2016)
10. Szegedy, C., et al.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, Boston, MA (2015)
11. Hughes, D.P., et al.: An open access repository of images on plant health to enable the development of mobile disease diagnosis. https://arxiv.org/abs/1511.08060 (2015)
12. Ferentinos, K.P.: Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 145, 311–318 (2018)
13. Darwish, A., et al.: An optimized model based on convolutional neural networks and orthogonal learning particle swarm optimization algorithm for plant diseases diagnosis. Swarm Evol. Comput. 52, 616–618 (2020)
14. Srivastava, N., et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)
15. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (ICLR) (2015)
Optimum Design of Profile Modified Spur Gear Using PSO Jawaz Alam, Srusti Priyadarshini, Sumanta Panda, and Padmanav Dash
Abstract In this article, a profile modification approach is adopted to design a spur gear set for minimization of contact stress and optimization of weight. An altered tooth sum method is used to reduce the Hertzian contact stress along the path of contact. Three case studies are performed for a spur gear set with a specific center distance and a tooth sum of 90 (+5). A multi-objective optimization problem is formulated with the contact stresses along the path of contact and the weight of the gear set as design objectives. This nonlinear constrained optimization problem is solved by means of the particle swarm optimization algorithm. Six design variables related to gear geometry and material property are used in this optimization, and all the constraints are satisfied at the optimal solution. The gear and pinion surface temperatures are below the flash temperature, indicating protection against scoring wear. A higher value of the AGMA scoring index is achieved with this optimum design approach. Specifically, a lighter gear with less contact stress and adequate scoring resistance is reported in this study. Promising results in terms of objective function values and computational time (CPU time) are observed. Furthermore, a CAD model is developed from the optimized design parameters to check the geometric interference and practical feasibility of the design.
Keywords Spur gears · Contact ratio · Profile shift · Contact stress · Scoring
J. Alam (B) · S. Priyadarshini · S. Panda · P. Dash
Department of Mechanical Engineering, VSSUT, Burla, Odisha 768018, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_16

1 Introduction
The optimum design of a spur gear with low contact stress along the path of contact, low weight and excellent wear resistance is a very cumbersome task. It is therefore necessary to improve the design of spur gears by considering different aspects such as gear contact stress, vibration level, profile modification and adequate lubrication. A considerable amount of research has been done to improve the design of spur gears. In a recent study, Sachidananda et al. [1] reported one of the important approaches to reducing the contact stresses in gears by using addendum modification. In that study, SEM images of the gear were analyzed to study the morphology of gear wear. The optimum weight design (OWD) of a spur gear train was introduced in 1998 by Yokota et al. [2], who solved it with an improved genetic algorithm. Later, this OWD problem was investigated again by Savsani et al. [3] using two advanced optimization algorithms, particle swarm optimization (PSO) and simulated annealing (SA), to determine an optimum set of design variables. Furthermore, Panda et al. [4] solved this OWD problem using an evolutionary algorithm, differential evolution (DE), and performed FEA on the optimized gear set to predict the most critically stressed region. In a similar study, Rao [5] considered the two design cases proposed in [3] and investigated them using another optimization algorithm, teaching learning-based optimization (TLBO). Lately, Maputi and Arora [6] have used multi-objective teaching learning-based optimization (MOTLBO) for the same OWD problem, extended to include power output maximization.
The literature review reveals that no work on the simultaneous optimization of weight and contact stress along the path of contact on a multi-objective basis has been reported. In this study, an attempt is therefore made to optimize the weight of a spur gear set together with the contact stress along the path of contact. Constraints related to bending fatigue, surface fatigue, scoring, etc. are imposed on the optimization problem to prevent failure of the spur gear set by these modes. A CAD model is also developed using the optimized geometry of the spur gear to check for geometric interference.
2 Proposed Approach and Formulations
This research work on gear design is mainly focused on the dimensional design of gear drives and on reducing the contact stresses along the path of contact. In a compact gear box design, the weight of the spur gear drive plays a pivotal role. In this work, the weight and the contact stress along the path of contact are considered as design objectives on a multi-objective basis. The OWD problem proposed by Yokota et al. [2] and studied by Savsani et al. [3] and Panda et al. [4] is used to formulate the weight optimization from the geometric features of the gear. Sachidananda et al. [1], in turn, used profile modification techniques to reduce the contact stress along the path of contact for a gear set with a gear ratio of 1:1. The main objective of this research work is to minimize the weight of a single-stage spur gear train and to reduce the contact stress of the gear set, subject to 10 nonlinear constraints involving mixed integer design variables.
The PSO algorithm is a population-based optimization algorithm first presented by Kennedy and Eberhart [7]. PSO is based on changing the velocity of (i.e., accelerating) each particle toward its 'pBest' and 'gBest' locations (global version of PSO) at each step. The acceleration is weighted by a random term, with separate random numbers generated for the acceleration toward the 'pBest' and 'gBest' locations. The velocity and position updates of the particles are performed as per Eqs. (1) and (2). The detailed steps of the PSO algorithm are reported in [7].

$$V_{i+1}[\,] = V_i[\,] + C_1 \cdot \mathrm{rand}() \cdot \left(pBest[\,] - Current_i[\,]\right) + C_2 \cdot \mathrm{rand}() \cdot \left(gBest[\,] - Current_i[\,]\right) \tag{1}$$

$$Current_{i+1}[\,] = Current_i[\,] + V_{i+1}[\,] \tag{2}$$
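The update rules in Eqs. (1) and (2) can be sketched in a few lines. This is a minimal illustration on a generic objective, not the authors' implementation: the swarm size, iteration count, the values C1 = C2 = 2, the seed, and in particular the velocity clamp `v_max` are assumptions (Eq. (1) has no inertia or damping term, so some bound on the velocity is commonly added to keep the swarm from diverging). The gear constraints are not modeled here.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 c1=2.0, c2=2.0, seed=0):
    """Global-best PSO following Eqs. (1) and (2); returns (gBest, value, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Assumed velocity clamp: at most one search-range width per step.
    v_max = [hi - lo for lo, hi in bounds]
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_best_val = [objective(p) for p in pos]
    g_idx = min(range(n_particles), key=lambda k: p_best_val[k])
    g_best, g_best_val = p_best[g_idx][:], p_best_val[g_idx]
    history = [g_best_val]
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(dim):
                # Eq. (1): separate random numbers for the pBest and gBest terms.
                v = (vel[k][d]
                     + c1 * rng.random() * (p_best[k][d] - pos[k][d])
                     + c2 * rng.random() * (g_best[d] - pos[k][d]))
                vel[k][d] = max(-v_max[d], min(v_max[d], v))
                # Eq. (2): move the particle with its new velocity.
                pos[k][d] += vel[k][d]
            val = objective(pos[k])
            if val < p_best_val[k]:
                p_best[k], p_best_val[k] = pos[k][:], val
                if val < g_best_val:
                    g_best, g_best_val = pos[k][:], val
        history.append(g_best_val)
    return g_best, g_best_val, history


def sphere(x):
    # Simple stand-in objective; the paper's objective is Eq. (3).
    return sum(t * t for t in x)


best, best_val, history = pso_minimize(sphere, [(-5.0, 5.0)] * 3, seed=1)
```

The returned `history` holds the best objective value after each iteration, which is the kind of curve plotted in the convergence figures of this paper.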
The basic spur gear geometry is illustrated in Fig. 1. The nonlinear multi-objective function for the spur gear train weight optimization and minimum contact stress problem is given in Eq. (3):

$$\mathrm{Min}\; F(x) = w_1 \times \frac{W(x)}{W_0(x)} + w_2 \times \frac{\sigma(x)}{\sigma_0(x)} \tag{3}$$

where $w_1$ and $w_2$ are the weight constants such that $w_1 + w_2 = 1$, $W(x)$ is the weight of the gear, $W_0(x)$ is the weight of the gear when it is optimized as a single objective, $\sigma(x)$ is the contact stress, and $\sigma_0(x)$ is the contact stress when it is optimized as a single objective.
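As a small illustration of Eq. (3), the function below combines the two normalized objectives into one scalar. The function name, the default weights of 0.5 and the numbers in the usage lines are hypothetical; the paper does not state the weight constants it used.

```python
def combined_objective(weight, stress, weight_ref, stress_ref, w1=0.5, w2=0.5):
    """Weighted sum of Eq. (3): each objective is divided by its single-objective optimum."""
    if abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("w1 + w2 must equal 1")
    return w1 * weight / weight_ref + w2 * stress / stress_ref


# Hypothetical values: a design 10% heavier than the single-objective optimum
# weight, with contact stress equal to the single-objective optimum stress.
f_same = combined_objective(1500.0, 900.0, 1500.0, 900.0)
f_heavier = combined_objective(1650.0, 900.0, 1500.0, 900.0)
```

Normalizing by the single-objective optima makes both terms dimensionless and of order one, so that neither the weight (in grams) nor the stress (in MPa) dominates purely because of its units.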
Fig. 1 Basic single-stage spur gear geometry [3]
On the basis of operating requirements, different objective functions for a spur gear may be proposed. One of the most important is the minimum contact stress along the path of contact. Under normal operating conditions, pitting and scoring are the main modes of failure in spur gears. The contact stress is formulated as reported in [1]:

$$(\sigma_{\max})_X = con \cdot \frac{Q_X}{(RTR)_X} \tag{4}$$

where $con = 0.35E$, $Q$ is the load, $X$ is the profile shift, $(RTR)_X$ is the radius of curvature at two different points, and $E$ is Young's modulus of elasticity.

$$(RTR)_X = \frac{T_1X \,\left(T_1T_2 - T_1X\right)}{T_1T_2} \tag{5}$$

The profile shift $X$ can be calculated as

$$X = \frac{Z_e \left(\mathrm{inv}\,\varphi_w - \mathrm{inv}\,\varphi\right)}{2 \tan\varphi} \tag{6}$$

where $\varphi_w$ is the altered pressure angle, termed the working pressure angle, and $Z_e$ is the altered tooth sum.

$$\sigma_x = \frac{\sigma_A}{\sigma_{A0}} + \frac{\sigma_B}{\sigma_{B0}} + \frac{\sigma_{B2}}{\sigma_{B20}} + \frac{\sigma_C}{\sigma_{C0}} + \frac{\sigma_D}{\sigma_{D0}} + \frac{\sigma_{D2}}{\sigma_{D20}} + \frac{\sigma_E}{\sigma_{E0}} \tag{7}$$

where $\sigma_A, \sigma_B, \sigma_{B2}, \sigma_C, \sigma_D, \sigma_{D2}, \sigma_E$ are the values of contact stress at points A, B, B2, C, D, D2 and E, respectively, and $\sigma_{A0}, \sigma_{B0}, \sigma_{B20}, \sigma_{C0}, \sigma_{D0}, \sigma_{D20}, \sigma_{E0}$ are the values of contact stress at the corresponding points when it is optimized as a single objective. The basic spur gear geometry considered for the weight optimization is illustrated in Fig. 1, as reported in [2]. The weight objective is formulated as reported in [3]:

$$\mathrm{Weight} = F(x) = \frac{\pi\rho}{4000}\left[\, b\,m^2 Z_1^2 \left(1 + a^2\right) - \left(D_i^2 - d_0^2\right)\left(l - b_w\right) - n\,d_p^2\,b_w - \left(d_1^2 + d_2^2\right) b \,\right] \tag{8}$$
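Equations (5) and (6) are simple enough to evaluate directly. The sketch below assumes the standard involute function inv φ = tan φ − φ and obtains the working pressure angle from the usual relation between the standard and the operating center distance, a cos φ_w = (m Z_e / 2) cos φ; that relation and the 20° pressure angle used in the example calls are assumptions, since the paper does not state them. The output is not claimed to reproduce the individual shifts X1 and X2 tabulated in the paper.

```python
import math


def involute(angle_rad):
    # Standard involute function: inv(phi) = tan(phi) - phi.
    return math.tan(angle_rad) - angle_rad


def working_pressure_angle(module, tooth_sum, center_distance, pressure_angle_deg):
    # Assumed relation: a * cos(phi_w) = (m * Ze / 2) * cos(phi).
    phi = math.radians(pressure_angle_deg)
    standard_center = module * tooth_sum / 2.0
    return math.acos(standard_center * math.cos(phi) / center_distance)


def profile_shift(module, tooth_sum, center_distance, pressure_angle_deg):
    """Eq. (6): X = Ze * (inv(phi_w) - inv(phi)) / (2 * tan(phi))."""
    phi = math.radians(pressure_angle_deg)
    phi_w = working_pressure_angle(module, tooth_sum, center_distance,
                                   pressure_angle_deg)
    return tooth_sum * (involute(phi_w) - involute(phi)) / (2.0 * math.tan(phi))


def relative_radius(t1x, t1t2):
    """Eq. (5): (RTR)_X = T1X * (T1T2 - T1X) / T1T2."""
    return t1x * (t1t2 - t1x) / t1t2


# Module 2 and center distance 90 as in the paper; 20 degrees is illustrative.
x_sum_90 = profile_shift(2.0, 90, 90.0, 20.0)   # tooth sum equal to the standard value
x_sum_95 = profile_shift(2.0, 95, 90.0, 20.0)   # positively altered tooth sum
```

For a tooth sum of 90 the working and standard pressure angles coincide and the total shift vanishes, which agrees with the equal and opposite shifts X1 = 1.1874 and X2 = −1.1874 reported for the profile modified 90-tooth case. For a tooth sum of 95 on the same center distance the total shift comes out negative.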
The design variables considered in this optimization problem are the same as those reported in [4], i.e., module, hardness of the gear material, shaft diameters of the pinion and gear, face width and numbers of teeth on the pinion and gear. In this study, a gear ratio of 4:1 is considered for the spur gear set. Three main types of constraints are used in this optimization problem, i.e., design, geometric and control parameter constraints, with the formulations reported in [4].
3 Result and Discussion
In this study, addendum modification is performed with the standard tooth sum of 90 as well as with a positively altered tooth sum (+5), i.e., a tooth sum of 95. The profile modification approach is adopted to reduce the contact stress at the points A, B, B2, C, D, D2 and E along the path of contact (see Fig. 2). The gear ratio is assumed to be 4:1. The weight and the contact stress along the path of contact are optimized for all cases, so that a lightweight gear with fairly low contact stress along the path of contact can be designed. The optimized geometry and material property obtained through the PSO algorithm for the standard and the other gear sets are reported in Table 1. It is observed from Table 2 that the weight of the profile modified gear set for tooth sum 90 is 4.976% more than that of the standard gear set. Similarly, the optimum weight of the positively altered, profile modified tooth sum of 95 is 27.906% greater than that of the standard tooth sum; however, the maximum contact stresses are lower than for the standard tooth sum. The weight of the optimized gear set obtained in this study is less than the results reported in [4] for all cases except the altered tooth sum of 95. It is also observed that the face width of the profile modified gear sets is smaller than that of the standard tooth sum while still satisfying the scoring constraint, which points toward resistance against scoring wear [8]. However, Table 2 reveals that the contact stress along the path of contact is lower than for the standard gear except at points C and E for the 90 tooth sum with profile modification and at points C, D and E for the gear set with a tooth sum of 95. This is due to T1C > T1B and T1C < T1E, a condition which shows that C is in single-pair mesh. Further, it can be predicted that wear near the pitch line will be greater in the profile modified 90 and the 95 gear sets.
The contact ratio of the modified teeth approaches 1.2 and 0.9401 for the tooth sum of 90 with profile modification and the tooth sum of 95, respectively. One of the significant achievements of this study is that all the imposed constraints are satisfied at the optimal solution point. It is also found that the hardness values of the profile modified gear sets are close to each other.

Fig. 2 Line of contact from beginning to end [1]

Table 1 Optimum dimensions of spur gear sets
Ze            90 (X1 = 0, X2 = 0)   90 (X1 = 1.1874, X2 = −1.1874)   95 (X1 = 0.800, X2 = −2.9455)
b             21.3241               20.5                              20.5
D1            20                    20                                20
D2            36.4                  36.4                              36.4
m             2                     2                                 2
a             90                    90                                90
R11           18                    18                                19
R12           72                    72                                76
Ra1           20                    22                                21.88
Ra2           74                    76                                71.4
Rd1           15.5                  17.5                              18.1
Rd2           69.5                  67.5                              67.62
Rb1           16.31                 16.31                             17.21
Rb2           65.25                 65.25                             68.88
Cont. Ratio   1.4809                1.1947                            0.9401
Hardness      357.4920              343.9871                          327.2928

Table 2 Comparisons of contact stress in the gear sets
Ze            90 (X1 = 0, X2 = 0)   90 (X1 = 1.1874, X2 = −1.1874)   95 (X1 = 0.800, X2 = −2.9455)
σA            877.5740              621.6522                          660.8847
σB            682.4049              604.7149                          665.1639
σB2           1147.3                948.9034                          920.133
σC            888.4024              906.1738                          1044.0
σD            854.40                821.4437                          953.5499
σD2           984.0831              795.2336                          757.4041
σE            345.3264              427.9347                          593.5618
Weight        1487.2                1561.00                           1902.22
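The percentage increases in weight quoted in this section can be checked from the weights in Table 2. The short sketch below does this arithmetic; the function name is illustrative. The 27.906% figure follows directly from the tabulated values, while the 4.976% figure is reproduced when the standard weight is taken as the rounded 1487 g quoted in the convergence study (the tabulated 1487.2 g gives about 4.96%).

```python
def percent_increase(value, baseline):
    # Relative increase of `value` over `baseline`, in percent.
    return 100.0 * (value - baseline) / baseline


standard, modified_90, altered_95 = 1487.2, 1561.00, 1902.22  # weights from Table 2

increase_95 = percent_increase(altered_95, standard)
increase_90_table = percent_increase(modified_90, standard)
increase_90_rounded = percent_increase(modified_90, 1487.0)
```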
4 Convergence Study
Case 1: Standard tooth sum (Ze = 90)
The convergence characteristics of the contact stresses along the path of contact are depicted in Fig. 3. It can be clearly observed from Fig. 3 that the contact stress at point A (σA) shows an initial fluctuation and finally converges to a constant value of 877.574 MPa after the 39th iteration. Similarly, the convergence plots for the contact stress at points B (σB), B2 (σB2), C (σC), D (σD), D2 (σD2) and E (σE) follow the same pattern and converge to stable values of 682.404, 1147.300, 888.402, 854.400, 984.083 and 345.326 MPa, respectively. Figure 4 shows the optimized weight of the gear set; the weight converges quickly to a value of 1487 g after the 39th iteration. Figure 5 shows the convergence characteristics of some of the important parameters: face width (b), diameters of the pinion shaft (D1) and gear shaft (D2), required hardness of the material (H) and contact ratio (CR). After an initial fluctuation, the face width converges to a value of 21.3241 mm, while the diameters of the pinion shaft and gear shaft converge to 20 and 36.4 mm, respectively. The hardness of the material initially shows variation, but after the 41st iteration it starts converging and finally reaches a value of 357.4920 BHN. The contact ratio is constant in this case at 1.4809. Figure 6 illustrates the convergence of the profile shift factor, which shows no variation in this case because no addendum modification is done. The values of T1C and T2C are also constant throughout the iterations.

Fig. 3 Convergence of contact stresses along path of contact for spur gear sets (panels CSA, CSB, CSB2, CSC, CSD, CSD2 and CSE versus no. of iteration; series: tooth sum 90 standard, tooth sum 90 modified, tooth sum 95)
Fig. 4 Convergence of weight for spur gear sets (weight of gear set versus no. of iteration; series: tooth sum 90 standard, tooth sum 90 modified, tooth sum 95)
Fig. 5 Convergence of important parameters (b, d1, d2, H, CR) for spur gear sets (panels: face width, pinion shaft diameter, gear shaft diameter, hardness and contact ratio versus no. of iteration)
Fig. 6 Convergence of profile shift factor and T1C, T2C for spur gear sets (panels: x1, x2, T1C and T2C versus no. of iteration)

Case 2: Tooth sum (Ze = 90) with profile shift
The convergence characteristics of the contact stress along the path of contact are depicted in Fig. 3. The contact stress at point A (σA) shows an initial fluctuation and, after the 89th iteration, converges to a constant value of 621.652 MPa. Similarly, the convergence plots for the contact stress at points B (σB), B2 (σB2), C (σC), D (σD), D2 (σD2) and E (σE) follow the same pattern and converge to stable values of 604.714, 948.903, 906.173, 821.443, 795.233 and 427.934 MPa, respectively. Figure 4 shows the optimized weight of the gear set; the weight converges to a value of 1561 g after the 89th iteration. Figure 5 shows the convergence characteristics of the face width (b), the diameters of the pinion shaft (D1) and gear shaft (D2), the required hardness of the material (H) and the contact ratio (CR). After an initial fluctuation, the face width converges to a value of 20.5 mm, while the diameters of the pinion shaft and gear shaft converge to 20 mm and 36.4 mm, respectively. The hardness of the material initially shows variation, but after the 89th iteration it starts converging and finally reaches a value of 343.9871 BHN. The contact ratio fluctuates initially around a value of 1.2 and ultimately converges to 1.1947. Figure 6 illustrates the convergence of the profile shift factors of the pinion (x1) and gear (x2), which vary initially and finally converge to 1.1874 and −1.1874, respectively. The values of T1C and T2C show some initial fluctuation and finally converge to 0.79 and 1.2775, respectively.

Case 3: Positive altered tooth sum (Ze = 95) with profile shift
The convergence characteristics of the contact stress along the path of contact are depicted in Fig. 3. The contact stress at point A (σA) shows an initial fluctuation and, after the 3rd iteration, converges to a constant value of 660.8847 MPa. Similarly, the convergence plots for the contact stress at points B (σB), B2 (σB2), C (σC), D (σD), D2 (σD2) and E (σE) follow the same pattern and converge to stable values of 665.1639, 920.133, 1044.0, 953.5499, 757.4041 and 593.5618 MPa, respectively. Figure 4 shows the optimized weight of the gear set; the weight converges quickly to a value of 1902.22 g after the 3rd iteration. Figure 5 shows the convergence characteristics of the face width (b), the diameters of the pinion shaft (D1) and gear shaft (D2), the required hardness of the material (H) and the contact ratio (CR). After an initial fluctuation, the face width converges to a value of 20.5 mm, while the diameters of the pinion and gear shafts converge to 20 mm and 36.4 mm, respectively. The hardness of the material initially shows variation and then converges to a value of 327.2928 BHN after the 4th iteration. The contact ratio fluctuates initially around a value of 1 and ultimately converges to 0.9401.
Figure 6 illustrates the convergence of the profile shift factors of the pinion (x1) and gear (x2), which vary initially and finally converge to 0.8 and −2.9455, respectively. The values of T1C and T2C fluctuate initially and finally converge to 0.7073 and 1.5774, respectively.
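Statements such as "converges to a constant value after the 39th iteration" can be read off a recorded convergence history programmatically. The sketch below is one simple way to do so, assuming the best value found so far is logged once per iteration; the function name, the tolerance and the sample history are hypothetical and are not taken from the paper's data.

```python
def settling_iteration(history, tol=1e-6):
    """Return the first index from which every value stays within `tol` of the final value."""
    final = history[-1]
    last_outside = -1
    for index, value in enumerate(history):
        if abs(value - final) > tol:
            last_outside = index
    return last_outside + 1


# Hypothetical weight history (g): fluctuates, then stays constant from index 3.
sample_history = [2050.0, 1720.0, 1510.0, 1487.0, 1487.0, 1487.0]
settled_at = settling_iteration(sample_history)
```

A tolerance is needed because a stochastic optimizer rarely repeats a floating-point value exactly; the appropriate value depends on the units and precision of the logged quantity.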
5 Conclusion
In this study, the weight and the contact stress along the path of contact are optimized through addendum modification, both with the standard tooth sum of 90 and with a positively altered (+5) tooth sum of 95. The Hertzian contact stress is estimated at seven points A, B, B2, C, D, D2, and E along the path of contact for standard and profile-modified gearing using the PSO algorithm. From the computation and analysis, the following conclusions are drawn.
• The optimized weight obtained using the PSO algorithm shows significant improvement over the results obtained by Panda et al. [4] using a differential evolution (DE) algorithm.
• For the profile-modified tooth sum of 90, the profile shift factors of the pinion and gear are equal in magnitude, with that of the gear being negative. For the positively altered (+5) tooth sum of 95, a positive profile shift for the pinion and a negative profile shift for the gear are preferred.
• Contact stress is reduced at the lowest point of single tooth contact (LPSTC) for both the modified tooth sum of 90 and the positively altered tooth sum of 95.
• Minimum contact stress is obtained at the end point of contact (E); this is due to the maximum radius of curvature.
• The required hardness of the 90 and 95 profile-modified gear sets is lower, indicating that a low-cost material can be used to prevent failure of the gear set against scoring.
• The optimization result reveals that the face width of the optimized gear set with profile modification is smaller, pointing toward better resistance against scoring wear.
• Through profile modification, the contact point can be shifted along the path of contact, i.e., optimum addendum modification results in a shift of the contact point from single-pair mesh to two-pair mesh.
Acknowledgements The authors deeply acknowledge the financial support provided by TEQIP-3 and the computational facilities provided in the CAD laboratory of the Dept. of Mechanical Engineering, VSSUT, Burla.
References
1. Sachidananda, H.K., Raghunandana, K., Gonsalvis, J.: Design of spur gears using profile modification. Tribol. Trans. 58, 736–744 (2015)
2. Yokota, T., Taguchi, T., Mitsuo, G.: A solution method for optimal weight design problem of the gear using genetic algorithms. J. Mech. Mach Theory 35, 523–526 (1998)
3. Savsani, V., Rao, R.V., Vakharia, D.P.: Optimal weight design of a gear train using particle swarm optimization and simulated annealing algorithms. Mech. Mach. Theory 45, 531–541 (2010). https://doi.org/10.1016/j.mechmachtheory.2009.10.010
4. Panda, S., Biswal, B., Jena, S., Mishra, D.: An approach to weight optimization of a spur gear. Proc. Inst. Mech. Eng. [J], J. Eng. Tribol. 231(2), 189–202 (2016). https://doi.org/10.1177/1350650116650343
5. Rao, R.V.: Teaching learning based optimization algorithm. https://doi.org/10.1007/978-3-319-22732-0 (2016)
6. Maputi, E.S., Arora, R.: Multi-objective spur gear design using teaching learning-based optimization and decision-making techniques. Cogent Eng. 6, 1665396 (2019). https://doi.org/10.1080/23311916.2019.1665396
7. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of ICNN'95, International Conference on Neural Networks. https://doi.org/10.1109/icnn.1995.488968
8. Winter, H., Michaelis, K.: Scoring load capacity of gears lubricated with EP-oils. AGMA Paper 219.17 (1983)
Benchmark of Unsupervised Machine Learning Algorithms for Condition Monitoring
Krishna Chandra Patra, Rabi Narayan Sethi, and Dhiren Kumar Behera
Abstract Predictive maintenance and condition-based monitoring techniques are used to monitor the health of bearings, pumps, turbine rotors, gearboxes, etc. They use ideas from data mining, statistical analysis, and machine learning to predict early faults of mechanical components accurately and to calculate the remaining useful life. This paper is about condition-based health monitoring of heavy engineering equipment and its predictive maintenance. Data are gathered from the bearing of our experimental setup and analyzed using unsupervised learning; the type of failure and the remaining useful life should be determined in order to plan the maintenance of a machine. We take the data collected from the bearing and fit different unsupervised learning algorithms, namely the Gaussian mixture model and clustering techniques, to check their performance, accuracy, and robustness. In conclusion, we propose a methodology to benchmark different algorithms and select the best one.
Keywords Predictive maintenance · Machine learning · Deep learning · Unsupervised learning · Fuzzy C-means (FCM) clustering · Gaussian mixture model
K. C. Patra (B) · R. N. Sethi · D. K. Behera
Department of Mechanical Engineering, Indira Gandhi Institute of Technology, Sarang, Odisha 759146, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_17

1 Introduction
Historically, maintenance of machine equipment was based on trial and error or on experience. In recent decades, predictive maintenance has gained significant prominence in the field of fault detection [1]. This is due to the advancement of Internet of Things (IoT) gadgets and connected devices, and to advances in computing that make it possible to handle big data sets. To minimize maintenance cost, different methodologies and algorithms have been proposed. One approach is the watchdog agent, which consists of related machine learning techniques [2, 3]; related techniques include SIMAP [4], the OSA-CBM distributed embedded condition monitoring system [5], and collective predictive maintenance frameworks [6]. Technologies such as the IoT and big data allow us to process various parameters of the system, for example acoustics, vibration amplitudes, heat, stress, viscosity of fluid flow, mass/volume flow rate, and many more. These parameters are used for fault detection and for prediction of remaining useful life. This is possible due to the various algorithms available in different domains.
Deep learning is a subset of machine learning, and machine learning is a subset of artificial intelligence. Machine learning is a programming approach in which the machine learns with little or no extra assistance. Many problems, such as image processing and big data analysis, can be solved by machine learning techniques. Machine learning is divided into three main categories. In supervised learning, predictor and response variables play an important role in fault detection, and classification techniques are used. In unsupervised learning, models are built with clustering techniques, without labeled response variables. In reinforcement learning, the program learns actions and their consequences by interacting with its environment. In this study, we mainly focus on unsupervised learning and, within it, on clustering techniques, where the observations are grouped either into a user-defined number of clusters or into clusters built by the model based on category, distance, density, and model characteristics of the variables. Vibration data of a bearing are used. Selection of features and extraction of data are illustrated in the subsequent sections. The Python and R programming languages provide many tools for supervised and unsupervised machine learning. R is an open-source statistical programming language developed by Ross Ihaka and Robert Gentleman in August 1993. In this research, we employ the R programming language using RStudio.
2 Literature Review
Predictive maintenance has wide applicability, as it is used to reduce the cost of production and to survive in a competitive market. Business analytics views predictive maintenance from three perspectives: (i) predictive analysis, (ii) descriptive analysis, and (iii) prescriptive analysis [7]. In descriptive analysis, past data are summarized in charts to predict failure, whereas predictive analysis is an extension of descriptive analysis in which historical data are analyzed to determine faults. The type of failure and the remaining useful life should be determined to plan the maintenance of a machine. Prescriptive analysis is an overall process that investigates minimization and maximization of the objective. This paper focuses on both predictive and descriptive analysis to predict failure. Predictive maintenance is widely used in applications such as (i) fault detection of bearings and gear drives, (ii) railroad monitoring and vehicle maintenance [8], (iii) aircraft maintenance [9], (iv) the thermal/mechanical energy industry, and many more. In predictive maintenance, analysis of past data for fault detection is widely accepted in industry. Catastrophic incidents can be avoided if the failure of machine equipment is predicted early. In recent studies, predictive maintenance approaches are classified into quantitative and qualitative model approaches [10]. One of the oldest techniques for fault detection is principal component analysis (PCA), with variants such as T² and Q statistics [11], kernel PCA [12], and MNN MDPCA using multi-scale neighborhood normalization [13]. Clustering associated with PCA is a more recent development, using K-means [14], fuzzy C-means [15], hierarchical clustering [16], neural networks, neural net clustering algorithms and subtractive clustering [17], the Gaussian mixture model [18], modified rank order clustering (MROC) [1], and selection of the best algorithm [19].
3 Machine Diagnosis
Fault detection is an important part of machine diagnosis. It characterizes irregular behavior of machine equipment: if a monitored parameter deviates from its standard value, the deviation is called a machine fault. This paper discusses different algorithms, namely principal component analysis with the T² statistic, the squared prediction error (SPE) of the Q statistic, K-means, and the Gaussian mixture model (GMM), for fault prediction, and benchmarks the models for vibration condition monitoring.
3.1 Data Supply
Accelerometer signals from a vibration setup are commonly used for fault detection. In our case, we collected the data from the bearing of our experimental setup at a sampling frequency of 100 Hz. The data were collected at four different stages of the machine: (1) bearing with no load, (2) bearing with load, (3) bearing with a cutout in the outer race, and (4) bearing with a cutout in the inner race. We use the summary function in R to see the minimum, maximum, median, and mean of each variable. The time series plots of the mean values of the data are shown in Fig. 1. In this paper, we examine how different algorithms help us to detect the failure earlier.
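The per-variable summary described above can be illustrated outside R as well. The following is a minimal Python sketch using only the standard library; the function name `summarize` and the sample values are hypothetical and are not the paper's data, and the sketch covers only the four statistics named in the text.

```python
import statistics

def summarize(values):
    """Return the minimum, maximum, median, and mean of one variable."""
    return {
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
    }

# Hypothetical per-window mean accelerations in g (illustrative only).
x_axis_mean = [0.02, -0.01, 0.04, 0.03, -0.02]
stats = summarize(x_axis_mean)
print(stats)
```

Applying the same function to each axis gives a quick check for outliers or sensor offsets before any model is fitted.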
3.2 Machine Diagnosis Using PCA
If proper features of the variables are not selected, a significant error results in the final selection of the model. Principal component analysis (PCA) is a mathematical algorithm that transforms correlated variables into uncorrelated principal components.
Fig. 1 Data plot of bearing vibration data (panels: X-axis mean, Y-axis mean, Z-axis mean, and T mean; vertical axis in G's, horizontal axis in observations)
It removes directions with (near-)zero variance while retaining the similarities and differences in the data [20]. The algorithm reduces the dimensionality of the data while retaining most of the variation [21].
Algorithm
Step 1: Select the data set as a matrix [Z]
$[Z]_{m \times n}$ (1)
where the matrix [Z] has m rows and n columns.
Step 2: Subtract the mean of each variable from the corresponding elements of [Z]
$[Z]_n - \bar{Z}_n$ (2)
Step 3: Compute the covariance matrix of the centered data, of order n × n
$[C]_{n \times n}$ (3)
Step 4: Calculate the eigenvalues and eigenvectors of the covariance matrix using
$\left([C]_{n \times n} - \lambda [I]_{n \times n}\right)\{Z\}_{n \times 1} = \{0\}$ (4)
Step 5: Assemble the eigenvectors into a matrix
$[X]_{n \times n} = [\{Z_1\}\{Z_2\}\{Z_3\} \dots \{Z_n\}]$ (5)
Step 6: Store the eigenvalues in the diagonal elements of
$[\lambda]_{n \times n}$ (6)
$[X]_{n \times n}$ represents the loading vectors and $\lambda$ the eigenvalues of the principal components.
Step 7: Choose the r top-ranked eigenvalues
$[\lambda]_{r \times r}$ (7)
Step 8: Keep the corresponding r eigenvectors
$[X]_{n \times r} = [\{Z_1\}\{Z_2\}\{Z_3\} \dots \{Z_r\}]$ (8)
Step 9: Evaluate the principal component matrix $[V]_{m \times r}$
$[V]_{m \times r} = [Z]_{m \times n}[X]_{n \times r}$ (9)
The summary of the PCA indicates that the first two principal components account for 95.65% of the variance. A scree plot, which plots the eigenvalues against the principal components, can be produced with an R library and is shown in Fig. 2. From the PCA summary and the scree plot, we conclude that the first two components capture most of the variation compared with the remaining components.
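The PCA steps can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' R code: it keeps only the leading component (r = 1), finds it by power iteration rather than a full eigendecomposition, and uses a small hypothetical data matrix whose two columns are perfectly correlated. All function names are assumptions made for the example.

```python
import math

def center(Z):
    """Step 2: subtract each column mean from the m x n data matrix Z."""
    m, n = len(Z), len(Z[0])
    means = [sum(row[j] for row in Z) / m for j in range(n)]
    return [[row[j] - means[j] for j in range(n)] for row in Z]

def covariance(Zc):
    """Step 3: n x n sample covariance matrix of the centered data."""
    m, n = len(Zc), len(Zc[0])
    return [[sum(r[i] * r[j] for r in Zc) / (m - 1) for j in range(n)]
            for i in range(n)]

def top_eigenpair(C, iters=200):
    """Steps 4-8 for r = 1: power iteration for the largest eigenpair."""
    n = len(C)
    x = [1.0] * n
    for _ in range(iters):
        y = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    lam = sum(x[i] * sum(C[i][j] * x[j] for j in range(n)) for i in range(n))
    return lam, x

# Hypothetical data: the second column is exactly twice the first.
Z = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
Zc = center(Z)
C = covariance(Zc)
lam, vec = top_eigenpair(C)
# Share of the total variance (trace of C) explained by the first component.
explained = lam / sum(C[i][i] for i in range(len(C)))
# Step 9 for r = 1: scores of each observation on the first component.
scores = [sum(z * v for z, v in zip(row, vec)) for row in Zc]
print(explained, vec)
```

Because the two columns are perfectly correlated, the first component explains essentially all of the variance here; on real vibration data the explained share would be read off in the same way for each retained component.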
Fig. 2 Scree plot and summary of PCA used to determine principal components

3.3 T² and SPE Statistical Analysis
The T² statistic is a multivariate statistical measure. For a data observation z, it is calculated as [22]
$T^2 = \sum_{j=1}^{l} \frac{t_j^2}{\lambda_j}$ (10)
where $t_j$ is the score of the observation on the j-th principal component and $\lambda_j$ is the corresponding eigenvalue. The F-distribution provides an upper confidence limit for $T^2$:
$T^2_{l,n,\alpha} = \frac{l(n-1)}{n-l} F_{l,\,n-l,\,\alpha}$ (11)
where n is the number of samples, α is the level of significance, and l is the number of principal components [23]. Observations whose statistic exceeds this threshold are flagged, which defines a control limit for the records. The T² results for the vibration records are shown in Fig. 3. Based on the T² statistics graph in Fig. 4, we conclude that the faults are detected before observation 67. Early detection therefore helps us to monitor the vibration setup and take predictive action accordingly. Similarly, the squared prediction error (SPE), based on the Q statistic, detects the anomalies, as shown in Fig. 4.
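To make the two monitoring statistics concrete, the sketch below computes T² from principal component scores, the control limit from a supplied F quantile, and the SPE as the squared residual after projection onto the retained loading vectors. It is a hedged illustration: the function names, the numeric inputs, and the F quantile of 3.09 are placeholders chosen for the example, and the loading vectors are assumed to be orthonormal.

```python
def hotelling_t2(scores, eigenvalues):
    """T^2: sum of squared scores, each scaled by its eigenvalue."""
    return sum(t * t / lam for t, lam in zip(scores, eigenvalues))

def t2_limit(l, n, f_quantile):
    """Control limit l(n-1)/(n-l) * F, with the F quantile supplied by the caller."""
    return l * (n - 1) / (n - l) * f_quantile

def spe(z_centered, loadings):
    """Squared prediction error: squared norm of the part of the observation
    not explained by the retained (orthonormal) loading vectors."""
    residual = list(z_centered)
    for p in loadings:
        t = sum(zi * pi for zi, pi in zip(z_centered, p))
        residual = [ri - t * pi for ri, pi in zip(residual, p)]
    return sum(r * r for r in residual)

# Hypothetical scores and eigenvalues for two retained components.
t2 = hotelling_t2([2.0, 1.0], [4.0, 0.5])
# Hypothetical limit for l = 2 components and n = 100 samples;
# 3.09 is a placeholder F quantile, not a value from the paper.
limit = t2_limit(2, 100, 3.09)
is_fault = t2 > limit
# Hypothetical centered observation with one retained loading vector.
q = spe([3.0, 4.0], [[1.0, 0.0]])
print(t2, limit, is_fault, q)
```

T² measures how far an observation lies from the center within the retained subspace, while SPE measures how much of it falls outside that subspace, which is why the two charts can flag different observations.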
Fig. 3 T² statistics results for vibration dataset
Fig. 4 PCA T² and PCA SPE for vibration datasets
Fig. 5 The optimal number of clusters using elbow method
3.4 Optimal Number of Clusters for Cluster Analysis
In unsupervised learning, different clustering techniques are used, such as hierarchical clustering, K-means, and C-means clustering. Clustering methods may be hierarchical, iterative, density-based, metasearch-controlled, or stochastic. In this paper, we discuss the K-means and C-means clustering techniques. Before applying them, we must find the optimal number of clusters. Many techniques for this are available in R, such as the elbow method, the gap statistic method, and the silhouette method. The NbClust [24] and factoextra libraries support the elbow method in R. In the elbow plot shown in Fig. 5, the curve bends at 3 clusters, and beyond that the decrease in distance is insignificant. So, the optimal number of clusters for our system is three. Among these three clusters, one can represent the faulty condition, another the warning condition, and the third the normal condition. How we can verify that three clusters are optimal for good predictions is discussed in the later sections on cluster analysis.
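The elbow method can be illustrated with a small self-contained Python sketch. It is not the NbClust or factoextra implementation: it runs a basic K-means with a deterministic initialization (an assumption made so the example is reproducible) on hypothetical two-dimensional points forming three tight groups, and records the within-cluster sum of squares (WCSS) for k = 1 to 4.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def kmeans_wcss(points, k, iters=100):
    """Run basic K-means and return the within-cluster sum of squares."""
    ordered = sorted(points)
    # Deterministic start: k points spread evenly through the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[nearest].append(p)
        updated = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sum(dist2(p, centers[i]) for i, c in enumerate(clusters) for p in c)

# Hypothetical points forming three tight groups (not the bearing data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0),
          (10.0, 0.0), (10.0, 1.0), (11.0, 0.0)]
wcss = {k: kmeans_wcss(points, k) for k in range(1, 5)}
for k in sorted(wcss):
    print(k, round(wcss[k], 3))
```

The WCSS drops sharply up to k = 3 and only marginally afterwards, which is the bend that the elbow plot makes visible.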
3.5 K-Means and C-Means Clustering
K-means is an important unsupervised clustering algorithm. It divides the dataset into a predetermined number of clusters based on Euclidean distance. The results for K-means and K-means with PCA are plotted in Fig. 6. The K-means summary reports a value of 90.2% for the within-cluster sum of squares by cluster. C-means is a clustering technique in which each recorded point belongs to every cluster to some degree. Bezdek developed fuzzy C-means [13]. It has wide application in engineering, astronomy, agriculture, medical diagnosis, image processing [13], chemistry, and shape tracking in target analysis [25]. It is also useful for overcoming overfitting in clustering. From the summaries of K-means and C-means clustering, we observe that the data points are classified into 3 clusters of size 62, 71, and 229.
Fig. 6 K-means and fuzzy C-means (FCM) clustering for fault detection (panels: K-means Result, K-means PCA Result, C-means PCA Result, GMM Result, GMM PCA Result; axes aT_mean versus az_mean, and PC 1 versus PC 2 for the PCA panels)
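The idea that each point belongs to every cluster to some degree can be shown with the standard fuzzy C-means membership formula, $u_i = 1 / \sum_j (d_i / d_j)^{2/(m-1)}$, where $d_i$ is the distance of the point to center i and m is the fuzzifier. The sketch below is illustrative only; the function name, the fuzzifier m = 2, and the distances are assumptions for the example.

```python
def fcm_memberships(distances, m=2.0):
    """Membership of one point in each cluster, from its distances to the centers."""
    if any(d == 0.0 for d in distances):
        # A point that coincides with a center belongs fully to that center.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in distances)
            for d_i in distances]

# Hypothetical point at distance 1 from one center and 3 from another.
u = fcm_memberships([1.0, 3.0])
print(u)
```

The memberships always sum to one, and the closer center receives the larger share; assigning each point to its highest membership recovers a hard clustering comparable to K-means.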
3.6 Gaussian Model Clustering
The Gaussian mixture model (GMM) is a model-based clustering technique in which the data are modeled as coming from one of several groups. The groups may differ from one another, but the points recorded from the same group are modeled by a Gaussian distribution [24]. The expectation maximization (EM) algorithm is well suited to fitting a Gaussian mixture model. It is an iterative procedure: initial random estimates start the algorithm, and the estimates are updated at each iteration until convergence. Some initial parameters are set to start the iteration, and the E-step and M-step are then alternated. We fit two models, GMM with PCA and GMM without PCA, and plot the results as shown in Fig. 7. From the plot, we observe that GMM is capable of detecting anomalies and can also classify the data into two groups: file 1 and file 2 fall into one state, and file 3 and file 4 fall into another state. So, the GMM model predicts the fault more precisely than the other clustering techniques.
Fig. 7 Classification based on Gaussian finite mixture model (GMM) fitted by EM algorithm
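The alternation of E-step and M-step can be sketched for the simplest case, a two-component mixture in one dimension. This is a minimal illustration under stated assumptions, not the R model-based clustering used in the paper: the initial means are set to the data minimum and maximum instead of random estimates, the number of iterations is fixed, and the data are hypothetical.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a univariate Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, iters=50):
    """Fit a two-component 1-D Gaussian mixture by expectation maximization."""
    mu = [min(data), max(data)]   # deterministic start (assumption of this sketch)
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(spread, 1e-6)   # floor avoids a collapsing component
            weight[k] = nk / len(data)
    return mu, var, weight, resp

# Hypothetical readings from two operating states (not the bearing data).
data = [-0.5, 0.0, 0.5, 9.5, 10.0, 10.5]
mu, var, weight, resp = em_gmm_1d(data)
labels = [0 if r[0] >= r[1] else 1 for r in resp]
print(mu, weight, labels)
```

The responsibilities computed in the E-step are soft assignments; taking the larger one per point gives the state labels, analogous to grouping the files into two states.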
4 Results and Discussion
The GMM PCA plot suggests that the data can be hypothesized as four states, two of them healthy and two unhealthy. Using PCA and T² statistics, we were able to fit our hypothetical states and to predict the fault before observation 67. When we went on to fit different clustering algorithms, we obtained a much clearer picture of the data than with T² statistics. To obtain the optimal number of clusters, we used the elbow method and the NbClust package. With that number of clusters, fitting the data with K-means and C-means clustering gave identical results. Based on our prior knowledge, we classified the clusters as a healthy state, a warning state, and a faulty state. In our final model, we used the EM algorithm for the Gaussian finite mixture model. The summary of GMM PCA, obtained with the summary function in R, shows a total of 10 components. On closer examination of these ten components, we observed that they overlap with each other. In-depth examination gives a pattern similar to the one obtained in the previous cluster analysis.
5 Conclusion
This work primarily focused on benchmarking various unsupervised machine learning algorithms for early fault detection. In our analysis, T² statistics provide better outcomes than the GMM technique, and no hypothesis is needed about the association between clusters and states. The method is so versatile that anyone with minimal domain knowledge can predict faults or anomalies, in contrast to the other techniques. Clustering analysis requires some knowledge of the records in order to label a cluster as healthy, warning, or faulty. Still, it is a useful tool at various levels of operation where the T² statistic fails. When machine maintenance is costly, clustering analysis is a better alternative for continuous monitoring until a significant level is reached. Early fault detection is the preliminary stage of predictive maintenance. This work is currently carried out for vibration records, but the data can be obtained from other physical quantities. It would be interesting to examine the accuracy for a large data set and for various failure records. In summary, we conclude that all algorithms give more or less similar results. Hence, if our aim is only to detect faults, then the T² statistic is an excellent technique. However, if fault detection needs to be carried out at distinct stages, then clustering techniques are the better choice.
References
1. Amruthnath, N., Gupta, T.: Fault class prediction in unsupervised learning using model-based clustering approach. In: Information and Computer Technologies (ICICT), 2018 International Conference (2018)
2. Lee, J., Kao, H.-A., Yang, S.: Service innovation and smart analytics for industry 4.0 and big data environment. Procedia CIRP 16, 3–8 (2014)
3. Djurdjanovic, D., Lee, J., Ni, J.: Watchdog agent, an infotronics-based prognostics approach for product performance degradation assessment and prediction. Adv. Eng. Inf. 17(3–4), 109–125 (2003)
4. Garcia, E., Guyennet, H., Lapayre, J.-C., Zerhouni, N.: A new industrial cooperative telemaintenance platform. Comput. Ind. Eng. 46(4), 851–864 (2004)
5. Sreenuch, T., Tsourdos, A., Jennions, I.K.: Distributed embedded condition monitoring systems based on OSA-CBM standard. Comput. Stand. Interfaces 35(2), 238–246 (2013)
6. Groba, C., Cech, S., Rosenthal, F., Gossling, A.: Architecture of the predictive maintenance framework. In: 6th International Conference on Computer Information Systems and Industrial Management Applications, 2007, IEEE
7. Evans, J.R., Lindner, C.H.: Business Analytics: The Next Frontier for Decision Sciences. College of Business, University of Cincinnati, Decision Science Institute
8. Rögnvaldsson, T., Byttner, S., Prytz, R., Nowaczyk, S., Svensson, M.: Wisdom of Crowds for Self-organized Intelligent Monitoring of Vehicle Fleets. IEEE (2014)
9. Samaranayake, P., Kiridena, S.: Aircraft maintenance planning and scheduling: an integrated framework. J. Qual. Maintenance Eng. 18(4), 432–453 (2012)
10. Venkatasubramanian, V., Rengaswamy, R., Yin, K., Kavuri, S.N.: A review of process fault detection and diagnosis: Part I: Quantitative model-based methods. Comput. Chem. Eng. 27(3), 293–311 (2003)
11. Bakdi, A., Kouadri, A., Bensmail, A.: Fault detection and diagnosis in a cement rotary kiln using PCA with EWMA based adaptive threshold monitoring scheme. Control Eng. Pract. 66, 64–75 (2017)
12. Yang, J., Chen, Y., Sun, Z.: A real-time fault detection and isolation strategy for gas sensor arrays. In: Instrumentation and Measurement Technology Conference (I2MTC), 2017 IEEE International, 22–25 May 2017. https://doi.org/10.1109/i2mtc.2017.7969906
13. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York (1981)
14. Yiakopoulos, C.T., Gryllias, K.C., Antoniadis, I.A.: Rolling element bearing fault detection in industrial environments based on a K-means clustering approach. Expert Syst. Appl. 38(3), 2888–2911 (2011)
15. Bezdek, J.C., Ehrlich, R., Full, W.: FCM: the fuzzy c-means clustering algorithm. Comput. Geosci. 10(2–3), 191–203 (1984)
16. Borgatti, S.P.: How to explain hierarchical clustering. Connections 17(2), 78–80. Copyright 1994 INSNA (1994)
17. Du, Z., Fan, B., Jin, X., Chi, J.: Fault detection and diagnosis for buildings and HVAC systems using combined neural networks and subtractive clustering analysis. Building Environ. 73, 1–11 (2014)
18. Goldberger, J., Roweis, S.: Hierarchical Clustering of a Mixture Model. In: Neural Information Processing Systems Conference
19. Amruthnath, N., Gupta, T.: Modified rank order clustering algorithm approach by including manufacturing data. In: 4th IFAC International Conference on Intelligent Control and Automation Sciences, Reims, France, June 1–3, 2016
20. Smith, L.I.: A tutorial on Principal Components Analysis, pp 2–8, February 26, 2002
21. Jolliffe, I.T.: Principal Component Analysis. Springer, New York (2002)
22. Hotelling, H.: Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 417–441 (1933)
23. Villegas, T., Fuente, M.J., Rodríguez, M.: Principal component analysis for fault detection and diagnosis. Experience with a pilot plant. In: Advances in Computational Intelligence, Man-Machine Systems and Cybernetics. ISBN: 978-960-474-257-8
24. Charrad, M., Ghazzali, N., Boiteau, V., Niknafs, A.: NbClust: an R package for determining the relevant number of clusters in a data set. J. Stat. Softw. 61(6), 1–36 (2014). http://www.jstatsoft.org/v61/i06/
25. Yong, Y., Chongxun, Z., Pan, L.: A novel fuzzy C-means clustering algorithm for image thresholding. Meas. Sci. Rev. 4(1) (2004)
Investigation of the Efficiency for Fuzzy Logic-Based MPPT Algorithm Dedicated for Standalone Low-Cost PV Systems Garg Priyanka, Santanu Kumar Dash, and Vangala Padmaja
Abstract This paper studies various maximum power point tracking (MPPT) algorithms and analyzes their efficiency for standalone solar systems. MPPT plays a major role in increasing the efficiency with which maximum power is tracked. Various MPPT algorithms, such as perturb and observe, incremental conductance, and particle swarm optimization, have been implemented to enhance the performance of standalone solar systems. However, all these conventional methods exhibit perturbations under dynamic environmental conditions. Therefore, this paper presents a fuzzy logic control technique for extracting maximum power from a photovoltaic cell using a boost converter under dynamic environmental conditions. The fuzzy-MPPT technique has been implemented for a standalone photovoltaic system. The system has been developed in the Simulink environment and the results are analyzed. The simulation results validate its efficiency over other conventional algorithms.
Keywords Fuzzy logic control (FLC) · Fuzzy inference system (FIS) · Photovoltaic (PV) · Maximum power point (MPP) · DC–DC boost converter · Perturb and observe (P&O) · Incremental conductance (INC)
G. Priyanka (B) · S. K. Dash · V. Padmaja
VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India
e-mail: [email protected]
S. K. Dash e-mail: [email protected]
V. Padmaja e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_18

1 Introduction
As pollution across the world rises gradually due to harmful gases such as CO2 from fossil fuels, the demand for green energy from renewable resources is increasing. Solar energy is a renewable resource that is abundant and freely available in nature. Because of the heavy use of electricity, it is necessary to overcome energy crises, and as people's daily needs increase, solar energy plays a key role. With growing automation, producing sufficient energy efficiently is a challenging task; consequently, there is increasing demand for photovoltaic cells that generate maximum power efficiently, which is stored in batteries for later use. Under steady environmental conditions, there is a certain operating point at which the output power of the PV cell is maximum; it is called the maximum power point (MPP) [1]. The voltage and current are affected as temperature and irradiance vary. The aim of employing an MPPT controller in a PV module is to maximize the output power with the boost converter [2]. Various MPPT techniques, conventional and unconventional, are employed with DC boost converters to extract the maximum power from the PV cell and store it in batteries [3]. Fuzzy logic is a technique that mirrors the way humans make decisions and applies it to systems; it considers all the possible values between yes and no in decision making. The idea of a fuzzy logic controller is that it controls a machine and gives acceptable reasoning in cases where accurate results are not available [4]. The benefits of FLC are that its structure is easily understandable and widely used for commercial and practical purposes; it helps in dealing with uncertainty; it is generally robust, as no precise input is needed; a failure of the feedback sensor can be programmed into the controller as a situation to handle; and it can easily be altered to enhance the system [5].
2 Literature Survey
Abbod et al. [6] describe a novel MPPT method that overcomes issues of fuzzy control such as long processing time and enhances the tracking speed compared with other conventional algorithms. Woonki et al. [7] present a fuzzy technique, describe issues due to changing irradiation, and use various sliding algorithms for maximum power tracking; the performance is high in comparison with other conventional methods. Wen et al. [8] describe the VUFLC technique for MPPT of a PV system and compare results such as dynamic response, static error, and system complexity with other algorithms. Abdullah et al. [9] suggest a PSIM system with fuzzy control and a SIM coupler, keeping irradiance and temperature constant alternately, and track the performance under various scenarios. Yang [10] proposes an FLC procedure, describes various scenarios for tracking the MPP, and compares the performance with PSO. Ramasamy et al. [11] suggest a fuzzy method with two approaches, VSC and CPI, and compare the results with unconventional methods. Joshua [12] describes a fuzzy technique; performance and convergence speed metrics are compared using various solar panels in LabVIEW.
3 System Description
The photovoltaic effect is the process by which a photovoltaic (PV) cell converts light energy into electricity; it is an energy harvesting technology. A PV module consists of a large number of PV cells connected together [13]. A single solar cell cannot provide the required power, which is why solar cells and modules are combined to increase the output power. They are wired in parallel to produce a higher current and in series to produce a higher voltage [14]. To increase both voltage and current, the cells of a photovoltaic module are connected in series and in parallel [15]. The model of a solar cell is shown in Fig. 1.
Fig. 1 PV equivalent circuit (photocurrent source $i_{ph}$, diode current $i_D$, shunt resistance $R_p$ with current $i_{Rp}$, series resistance $R_s$ with current $i_{Rs}$, load current $i_L$, terminal voltage $v$)
From the equivalent circuit,
$i_{ph} = i_D + i_{Rp} + i$ (1)
$i_{Rp} = \frac{v + i R_s}{R_p}$ (2)
Solving the above equations gives
$i = i_{ph} - i_D - \frac{v + i R_s}{R_p}$ (3)
The diode current $i_D$ is
$i_D = I_0 \left[\exp\left(\frac{v + i R_s}{n V_T}\right) - 1\right]$ (4)
with n = 2 for silicon. The thermal voltage, in terms of the temperature and the Boltzmann constant, is
$V_T = \frac{K T}{q}$
The reverse saturation current $I_0$ is
$I_0 = K T^m \exp\left(\frac{-V_{GO}}{n V_T}\right)$ (5)
Combining the above equations, the current of the equivalent circuit is
$i = i_{ph} - I_0 \left[\exp\left(\frac{v + i R_s}{n V_T}\right) - 1\right] - \frac{v + i R_s}{R_p}$ (6)
where $i_{ph}$ = photocurrent, $i_D$ = diode current, $i_{Rp}$ = shunt resistance current, $i_{Rs}$ = series resistance current, $V_T$ = thermal voltage, q = electron charge, T = temperature (kelvin), K = Boltzmann constant.
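Because the current i appears on both sides of the final single-diode relation, it has to be solved numerically for each terminal voltage. The sketch below does this by bisection in Python. It is an illustration under stated assumptions: the parameter values (photocurrent, saturation current, resistances, temperature) are hypothetical defaults rather than values from the paper, the function name is invented for the example, and the search interval is only suitable for voltages within the normal operating range of the modeled cell.

```python
import math

K_BOLTZMANN = 1.380649e-23     # Boltzmann constant in J/K
Q_ELECTRON = 1.602176634e-19   # electron charge in C

def pv_current(v, i_ph=5.0, i_0=1e-9, n=2.0, temp_k=298.15, r_s=0.01, r_p=100.0):
    """Solve the implicit single-diode equation for the current at voltage v."""
    v_t = K_BOLTZMANN * temp_k / Q_ELECTRON   # thermal voltage KT/q

    def residual(i):
        v_d = v + i * r_s
        return i_ph - i_0 * (math.exp(v_d / (n * v_t)) - 1.0) - v_d / r_p - i

    # The residual decreases monotonically in i, so bisection is sufficient.
    lo, hi = -1.0, i_ph + 1.0
    if residual(lo) < 0.0 or residual(hi) > 0.0:
        raise ValueError("root not bracketed; adjust the search interval")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sweep a few hypothetical voltages and report current and power.
for volts in (0.0, 0.5, 1.0):
    amps = pv_current(volts)
    print(volts, round(amps, 4), round(volts * amps, 4))
```

Sweeping the voltage in this way traces the I-V and P-V curves, and the maximum of the power column is the point that an MPPT controller tries to hold.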
4 Block Diagram of MPPT Using Fuzzy Logic Control
The block diagram for tracking maximum power from a solar cell with fuzzy logic is shown in Fig. 2. It consists of a PV cell, a boost converter, a PWM block, and a fuzzy logic controller. The fuzzy logic block consists of different stages: fuzzification, fuzzy inference engine, and defuzzification.
Fig. 2 Block diagram of a MPPT using fuzzy logic (blocks: PV cell, boost converter, load, PWM, generation of input variables from V and I, and fuzzy logic controller with fuzzification, fuzzy rules box, inference engine, and defuzzification; controller output D)
Fig. 3 Boost converter device circuit (components: Vin, L, S, D, C, R)
4.1 Boost Converter
As the name suggests, a boost converter is a DC–DC converter that steps up the voltage from input to output: the output voltage is greater than the input voltage, and the output current is correspondingly lower than the supply current. It has storage elements, an inductor and a capacitor, and also contains a diode and a transistor. The inductor used in a boost converter should be able to withstand high currents. To reduce the ripple voltage, a capacitor is placed across the load. Maximum power can be transferred to the load through the DC–DC converter by varying the duty cycle of the IGBT switch. The duty cycle is set by the MPPT controller so as to reach maximum power [16] (Fig. 3).
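The effect of the duty cycle can be made concrete with the textbook relation for an ideal boost converter in continuous conduction, $V_{out} = V_{in} / (1 - D)$, with the current scaled by $(1 - D)$ so that power is conserved. This relation is not stated in the paper and ignores losses; the function names and numbers below are assumptions used only for illustration.

```python
def boost_output_voltage(v_in, duty):
    """Ideal boost converter: V_out = V_in / (1 - D), lossless, continuous conduction."""
    if not 0.0 <= duty < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= D < 1")
    return v_in / (1.0 - duty)

def boost_output_current(i_in, duty):
    """Ideal boost converter: I_out = I_in * (1 - D), so input power equals output power."""
    if not 0.0 <= duty < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= D < 1")
    return i_in * (1.0 - duty)

# Hypothetical operating point: 12 V and 2 A at the input, duty cycle 0.5.
v_out = boost_output_voltage(12.0, 0.5)
i_out = boost_output_current(2.0, 0.5)
print(v_out, i_out)
```

Changing D changes the load seen by the PV cell, which is how the MPPT controller moves the operating point toward maximum power.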
4.2 Fuzzy Logic Controller
4.2.1 Fuzzification
A fuzzy rule-based system uses fuzzification and inference to evaluate linguistic if-then rules. The evaluation produces fuzzy consequences, which are then transformed into the output: defuzzification converts the fuzzy outcomes into a crisp value [17].
4.2.2 Defuzzification
The inference on the fuzzified inputs yields a fuzzy output, which is converted into a single crisp value through defuzzification. In the FLC, the defuzzified value represents the control action used to drive the process.
4.2.3 Rule-Based System
The proposed system uses a fuzzy rule system based on if-then rules of the form: if antecedent, then consequent. In general, the rules are in canonical form. Any linguistic variable can appear in three general forms of statement (assignment, conditional and unconditional), and the canonical rules are formed from these. The PWM modulator generates a duty cycle that drives the switching element of the boost converter.
5 Implementation
With conventional methods such as perturb and observe and incremental conductance, the operating point oscillates around the maximum power point; FLC is used to avoid these tracking issues [18]. When solar irradiation falls on the panel at a given temperature, the photovoltaic effect produces current and voltage [19]. Power is calculated from the measured current and voltage, and the fuzzy input variables are then generated; here, the inputs are voltage and power. To track the maximum power of the photovoltaic cell with fuzzy logic in MATLAB [20], the fuzzy controller requires a FIS file, which can be generated with the fuzzy block by following the flow chart in Fig. 4. The input and output membership functions are specified accordingly. The inputs are Vn and Pn and the output is D [21], as shown in Fig. 5. The linguistic variables used in the input and output membership functions are negative big (NB), negative medium (NM), negative small (NS), zero error (ZE), positive small (PS), positive medium (PM) and positive big (PB). The fuzzy set variables are Vn, Pn and D [22, 23]. The input membership functions Vn and Pn range from −5 to 5, and their plots are given in Figs. 6 and 7. The output membership function is the duty cycle D, which takes values from 0.1 to 1; it is presented in Fig. 8. The fuzzy rules form an if-then rule base that applies operators to the fuzzy inputs and the fuzzy output; the forty-nine rules are listed in Table 1. The relationship between the inputs Vn, Pn and the output D can be observed graphically in the XYZ, XZ and YZ planes in Figs. 9, 10 and 11.
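The fuzzification, rule evaluation and defuzzification steps described above can be sketched in a few lines of Python. This is a deliberately reduced illustration, not the paper's FIS file: it assumes triangular membership functions, only three sets per variable instead of seven, a hypothetical nine-rule base instead of the forty-nine rules, min for AND, max for aggregation and centroid defuzzification. Only the ranges of Vn, Pn (−5 to 5) and D (0.1 to 1) and the use of Mamdani inference come from the text.

```python
V_MIN, V_MAX = -5.0, 5.0   # range of the inputs Vn and Pn stated in the text
D_MIN, D_MAX = 0.1, 1.0    # range of the duty cycle D stated in the text


def tri(x, a, b, c):
    """Triangular membership: 0 at the feet a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed input sets (used for both Vn and Pn): negative, zero, positive.
IN_SETS = {"N": (-10.0, -5.0, 0.0), "ZE": (-5.0, 0.0, 5.0), "P": (0.0, 5.0, 10.0)}
# Assumed output sets for the duty cycle D.
OUT_SETS = {"LOW": (-0.35, 0.1, 0.55), "MID": (0.1, 0.55, 1.0), "HIGH": (0.55, 1.0, 1.45)}
# Hypothetical rule base: (Vn label, Pn label) -> D label.
RULES = {
    ("N", "N"): "HIGH", ("N", "ZE"): "MID", ("N", "P"): "LOW",
    ("ZE", "N"): "MID", ("ZE", "ZE"): "MID", ("ZE", "P"): "MID",
    ("P", "N"): "LOW", ("P", "ZE"): "MID", ("P", "P"): "HIGH",
}


def mamdani_duty(vn, pn, samples=181):
    """Return a crisp duty cycle from crisp Vn and Pn (Mamdani, centroid)."""
    vn = max(V_MIN, min(V_MAX, vn))
    pn = max(V_MIN, min(V_MAX, pn))
    # Fuzzification: degree of membership of each input in each set.
    mu_v = {name: tri(vn, *abc) for name, abc in IN_SETS.items()}
    mu_p = {name: tri(pn, *abc) for name, abc in IN_SETS.items()}
    # Inference: rule strength = min of antecedents, combined per output by max.
    strength = {name: 0.0 for name in OUT_SETS}
    for (v_label, p_label), d_label in RULES.items():
        w = min(mu_v[v_label], mu_p[p_label])
        strength[d_label] = max(strength[d_label], w)
    # Defuzzification: centroid of the clipped, aggregated output sets.
    num = den = 0.0
    for k in range(samples):
        d = D_MIN + (D_MAX - D_MIN) * k / (samples - 1)
        mu = max(min(strength[name], tri(d, *abc)) for name, abc in OUT_SETS.items())
        num += d * mu
        den += mu
    return num / den if den > 0 else (D_MIN + D_MAX) / 2


print(mamdani_duty(0.0, 0.0))
print(mamdani_duty(-5.0, -5.0))
```

With both inputs at zero only the "MID" rule fires, so the centroid sits in the middle of the duty-cycle range; moving the inputs towards the edges shifts the crisp output towards the "LOW" or "HIGH" sets.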
6 Results and Discussions
The MATLAB Simulink model for tracking the maximum power with fuzzy logic comprises a PV module, a fuzzy logic controller and a boost converter.
Fig. 4 Flow chart of MPPT using fuzzy logic (start → acquisition of current and voltage data from the solar panel → calculate the power Pn = In × Vn → fuzzy input variables are generated → input membership functions → fuzzification → fuzzy rules → Mamdani method used in fuzzy inference → output membership function → defuzzification → DC-DC boost converter → stop)
The model is simulated at a constant temperature of 25 °C and at irradiances of 1000 W/m2, 800 W/m2 and 500 W/m2.
(A) Photovoltaic module simulation
For tracking maximum power by fuzzy logic in MATLAB, the solar cell used is a Trina Solar 250. The simulation is carried out for the PV module with temperature
Fig. 5 MPPT using fuzzy system
Fig. 6 Membership function for Vn
Fig. 7 Membership function for Pn
Fig. 8 Membership function plot of duty cycle D
Table 1 Fuzzy rules for MPPT
        Pn
Vn      NB   NM   NS   ZE   PS   PM   PB
NB      ZE   ZE   ZE   NB   NB   NB   NB
NM      ZE   ZE   ZE   NM   NM   NM   NS
NS      NM   ZE   ZE   ZE   NS   NS   NM
ZE      ZE   NM   NS   ZE   ZE   PM   ZE
PS      PS   PE   PM   ZE   ZE   ZE   PS
PM      PM   PS   PB   ZE   ZE   PM   ZE
PB      PB   PB   PB   ZE   ZE   ZE   ZE
at 25 °C and irradiance of 1000 W/m2, 800 W/m2 and 500 W/m2. The PV and IV characteristics of the photovoltaic module at 25 °C and irradiances of 1000 W/m2, 800 W/m2 and 500 W/m2 are shown in Figs. 12 and 13.
(B) Simulation using fuzzy logic
The voltage and current generated by the PV module are taken as input, and the fuzzy controller produces the duty cycle that drives the IGBT of the converter so that maximum power is delivered to the load. Figure 14 shows the input voltage from the PV module before it is supplied to the fuzzy logic controller and boost converter; at 25 °C and 1000 W/m2 irradiance, the input voltage is 28 V over a simulation time of t = 1 s. The X-axis represents time and the Y-axis represents voltage. Figure 15 shows the output voltage at the load after the boost converter with the fuzzy logic controller, using the different input and output membership functions; with duty cycle D = 0.5, the output voltage is 42 V at 25 °C and 1000 W/m2 irradiance over a simulation time of t = 1 s. The X-axis represents time and the Y-axis represents voltage.
Fig. 9 Fuzzy rules in surface
Fig. 10 Fuzzy rules for Vn, D in surface
Fig. 11 Fuzzy rules for Pn and D in surface
Fig. 12 VI characteristics of a PV cell
Fig. 13 PV characteristics of a PV cell
Fig. 14 Input voltage before MPPT
Fig. 15 Output voltage after MPPT
7 Conclusion
In this paper, a fuzzy logic technique for MPPT has been implemented to track the maximum power of a photovoltaic module using a boost converter. Under dynamic environmental conditions, the performance of standalone PV systems controlled by conventional controllers decreases. Therefore, the fuzzy-MPPT combination has been implemented to enhance the performance and efficiency of the system. The PV and IV characteristics of a PV module at a temperature of 25 °C and irradiance levels of 1000 W/m2, 800 W/m2 and 500 W/m2 have been observed, and the output voltage has been boosted from 28 to 42 V using the boost converter with the fuzzy-MPPT method.
References
1. Al-Majidi, S.D.: Design and implementation of reconfigurable MPPT fuzzy controller for photovoltaic systems. Ain Shams Eng. J., 1–10, October (2019)
2. Shajith Ali, U.: Z-source DC-DC converter with fuzzy logic MPPT control for photovoltaic applications. In: 5th International Conference on Advances in Energy Research, ICAER 2015, pp. 163–170, 15–17 December 2015, Mumbai, India
3. Dash, S.K., Ray, P.K., Panda, G.: DS1103 real-time operation and control of Photovoltaic fed unified power quality conditioner. In: IEEE Conference TENCON, pp. 3424–3428, Nov 22, 2016, Singapore (2016)
4. Bounechbaa, H., Bouzida, A., Nabtib, K., Benallab, H.: Comparison of perturb & observe and fuzzy logic in maximum power point tracker for PV systems. In: The International Conference on Technologies and Materials for Renewable Energy, Environment and Sustainability, pp. 677–684 (2014)
5. Srinivas, Ch.L.S., Sreeraj, E.S.: A maximum power point tracking technique based on ripple correlation control for single phase photovoltaic system with fuzzy logic controller. In: 5th International Conference on Advances in Energy Research, ICAER 2015, pp. 69–77, 15–17 December 2015, Mumbai, India
6. Abbod, M.F.: A novel maximum power point tracking technique based on fuzzy logic for photovoltaic systems. Int. J. Hydrogen Energy 43(31), 14158–14171 (2018)
7. Na, W., Che, P., Kim, J.: An improvement of a fuzzy logic-controlled maximum power point tracking algorithm for photovoltaic applications. Appl. Sci. J. 7(4) (2017)
8. Wang, Y., Yang, Y., Fang, G., Zhang, B., Wen, H., Tang, H., Fu, L., Chen, X.: An advanced maximum power point tracking method for photovoltaic systems by using variable universe fuzzy logic control considering temperature variability. Electron. J., MDPI (2018)
9. Yaqin, E.N., Abdullah, A.G., Hakim, D.L., Nandiyanto, A.B.D.: MPPT based on fuzzy logic controller for photovoltaic system using PSIM and simulink. In: The 2nd Annual Applied Science and Engineering Conference (AASEC 2017), vol. 288, 24 August 2017, Bandung, Indonesia
10. Sun, Z., Yang, Z.: Improved maximum power point tracking algorithm with cuk converter for PV systems. In: The 6th International Conference on Renewable Power Generation (RPG), vol. 2017(13), pp. 1676–1681, Oct 2017
11. Reddy, D., Ramasamy, S.: Fuzzy logic MPPT controller based three phase grid tied solar PV system with improved CPI voltage. In: International Conference on Innovations in Power and Advanced Computing Technologies (2017)
12. Praful Raj, M., Joshua, A.M.: Design implementation and performance analysis of labview based Fuzzy Logic MPPT controller for standalone PV systems. In: IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (2017)
13. Bendiba, B., Belmilia, H., Krimb, F.: A survey of the most used MPPT methods: conventional and advanced algorithms applied for photovoltaic systems. Renew. Sustain. Energy Rev. 45, 637–648 (2015)
14. Dash, S.K., Ray, P.K.: Design and analysis of grid connected photovoltaic fed unified power quality conditioner. Int. J. Emerg. Electric Power Syst. 17, 301–310 (2016)
15. Mohantya, P., Bhuvaneswarib, G., Balasubramanianb, R., Dhaliwalb, N.K.: MATLAB based modeling to study the performance of different MPPT techniques used for solar PV system under various operating conditions. Renew. Sustain. Energy Rev. J. 581–593 (2014)
16. Danandeh, M.A., Mousavi G, S.M.: A new architecture of INC-fuzzy hybrid method for tracking maximum power point in PV cells. Solar Energy 171, 692–703 (2018)
17. Mahamudul, H., Saad, M., Henk, M.I.: Photovoltaic system modeling with fuzzy logic based maximum power point tracking algorithm. Int. J. Photo Energy 2013, 10 pages (2013)
18. Arora, A., Gaur, P.: AI based MPPT methods for grid connected PV systems under non linear changing solar irradiation. In: International Conference on Advances in Computer Engineering and Applications (ICACEA), 23 July 2015, Ghaziabad, India
19. Dash, S.K., Ray, P.K.: Investigation on the performance of PV-UPQC under distorted current and voltage conditions. In: 5th International Conference on Renewable Energy: Generation and Applications (ICREGA), 16 April 2018, Al Ain, United Arab Emirates
20. Samal, S., Barik, P.K.: Extraction of maximum power from a solar PV system using fuzzy controller based MPPT technique. In: IEEE International Conference on Technologies for Smart-City Energy Security and Power (ICSESP-2018), March 28–30, 2018, Bhubaneswar, India
21. Algarín, C.R., Giraldo, J.T., Alvarez, O.R.: Fuzzy logic based MPPT controller for a PV system. Electron. J., MDPI (2017)
22. Bendiba, B., Krimb, F., Belmilia, H., Almi, M.F.: Advanced fuzzy MPPT controller for a stand-alone PV system. In: The International Conference on Technologies and Materials for Renewable Energy, Environment and Sustainability, TMREES14, pp. 383–392, 2014, Algeria
23. Othman, A.M., El-arinia, M.M.M., Ghitas, A., Fathy, A.: Realworld maximum power point tracking simulation of PV system based on fuzzy logic control. National Research Institute of Astronomy and Geophysics (NRIAG) J. Astron. Geophys. pp. 186–194, Egypt
Distributed Channel Assignment in Cognitive-Radio Enabled Internet of Vehicles
Kapil Goyal and Moumita Patra
Abstract Internet of Vehicles (IoV) is an evolving and appealing technology which enables vehicles to communicate with each other, with roadside infrastructure, and with pedestrians' handheld devices. The communication between these agents helps to develop and execute on-road safety applications, traffic efficiency applications, and infotainment applications. However, the spectrum allocated under the current vehicular communication standard, IEEE 802.11p, is not enough to meet the growing demands of the increasing number of vehicular users. To this end, cognitive radio technology and cooperative spectrum sensing have been proposed as possible solutions for overcoming these challenges. They enable vehicular users to efficiently use the licensed spectrum while ensuring the promised Quality of Service (QoS) for licensed users. However, it is necessary to have an efficient channel assignment strategy in order to avoid collisions between transmissions. In this work, we propose a distributed channel assignment algorithm to ensure efficient utilization of channels and improve the network throughput. We perform simulations which show that our proposed algorithm gives better throughput, packet delivery ratio and channel utilization.
Keywords Internet of vehicles · Cognitive radio · Channel assignment
K. Goyal · M. Patra (B)
Indian Institute of Technology Guwahati, Guwahati, Assam 781039, India
e-mail: [email protected]
K. Goyal
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_19

1 Introduction
With the increase in urban population and development, there is a rapid increase in the number of vehicles on the road. As the number of vehicles is increasing, there is also a growing need for on-road safety and infotainment applications. To provide a safe and comfortable experience to vehicular users, vehicles should be able to communicate with each other and with other electronic devices. Internet of Vehicles (IoV) is an area of research that enables vehicles to communicate with each other, with roadside infrastructure units, and with pedestrians' handheld devices. This communication is done by using a wide range of wireless spectrum and wireless networks such as WiFi, cellular networks, and TV white space [1]. Communication between vehicles is done using Wireless Access for Vehicular Environment (WAVE), which uses the IEEE 802.11p standard [2]. This standard provides multi-channel operations with the help of a control channel (CCH) and service channels (SCHs). These channels are allocated 75 MHz of spectrum in the 5.9 GHz band and are synchronized at specific intervals. However, this allocated spectrum band is not enough to accommodate the increasing demands of the users [2, 3]. Cognitive Radio (CR) technology is an evolving research area that allows dynamic spectrum access in wireless networks [4]. It allows unlicensed users to opportunistically use the available portion of licensed spectrum and improve application performance. If a channel is being used by licensed users, known as Primary Users (PUs), the channel is detected as busy; if it is not being used by a PU, it is detected as available. Such channels can be used by unlicensed users, known as Secondary Users (SUs). This sensed information helps SUs to detect free channels which they can use for transmitting their data [4]. However, the challenge is to efficiently assign these available channels to the SUs such that minimum collision is incurred [5–7]. In [8], the authors propose a centralized channel assignment algorithm which is executed by a roadside unit (RSU). In [9], the authors have proposed a centralized channel assignment technique which can be formulated as a resource matching problem. They have assumed that information about available channels at different locations is available in a geolocation database which can be accessed by the RSU.
In a highly mobile scenario with vehicles, a centralized system adds delay, and since vehicular applications are delay-constrained, it may become difficult to serve them within their delay bounds. In [10], the authors have proposed a distributed algorithm for channel allocation which makes use of the maximum transmission power of vehicles that are in the same cluster and the bandwidth and interference of the available channels. However, the authors have assumed clusters of vehicles and that each vehicle in a cluster always knows the available channels, their bandwidth, and their interference level. In [5, 6], the authors have proposed a cooperative spectrum sensing technique which defines three states of the spectrum, i.e., idle, busy by PU, and busy by SU. If a channel is sensed idle, SUs can transmit in that channel; if a channel is sensed occupied by a PU, the SUs refrain from using that channel; and if a channel is sensed occupied by an SU, the other SUs can contend for the channel. Channel allocation to SUs is, however, done randomly. In [7], the authors have proposed to sense PU channels when vehicles are in backoff, in order to get more sensed information. However, here too, idle PU channels are assigned randomly. In this work, we propose efficient channel assignment algorithms for a cognitive-radio enabled IoV scenario. We have proposed a channel assignment algorithm for a scenario with lossless channel characteristics and have used the Hungarian matching algorithm for assignment in a lossy channel scenario. We have performed simulations which show that our proposed approaches perform better than random channel assignment to vehicular SUs.
2 System Model
2.1 Scenario
Our system model consists of a Two-Dimensional (2-D) city scenario with roads having multiple lanes and intersections. The road is logically divided into segments of length l. Vehicles have bi-directional movement. The speed of vehicles follows a uniform distribution. Vehicles can communicate with each other with the help of wireless devices called On Board Units (OBUs). Vehicles use the IEEE 802.11p standard for communication with each other. They are also equipped with devices that allow them to sense whether a channel in the UHF-TV band is free and to transmit their data in such a free channel, if available.
2.2 Radio Channel Model
We have considered that the IEEE 802.11p standard multi-channel model is used for communication between vehicles, similar to the works in [5, 6]. As shown in Fig. 1, there is one Control CHannel (CCH) and four Service CHannels (SCH) in the 5.9 GHz spectrum. OBUs switch between CCH and SCH for transmission of data. The Synchronization Interval (SI) is of 100 ms duration, which is further divided into equal intervals of SCH and CCH. Sensing of primary channels is done in the SCH interval, and it is done in alternate SCH intervals. Vehicles follow Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) for channel access, and if the channel is sensed busy, vehicles go to back-off by uniformly selecting a value from the contention window as the back-off counter. The TV white space consists of eight PU channels. These channels can be used by vehicular SUs if there is no PU activity, i.e., the channel is sensed idle.
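The timing described above can be sketched as a small helper that maps a time instant to the current interval. The 100 ms synchronization interval and its equal split come from the text; the ordering (CCH interval first, then SCH) and the choice of sensing in the even-numbered synchronization intervals are assumptions made only for this illustration, as are the function and constant names.

```python
SYNC_INTERVAL_MS = 100                    # synchronization interval from the text
CCH_INTERVAL_MS = SYNC_INTERVAL_MS // 2   # equal CCH and SCH intervals


def interval_at(t_ms):
    """Return (channel_type, sensing) for a time t_ms measured from the first SI.

    Assumptions: each SI starts with the CCH interval, and PU sensing takes
    place in the SCH interval of every second SI (the even-numbered ones).
    """
    sync_index, offset = divmod(t_ms, SYNC_INTERVAL_MS)
    if offset < CCH_INTERVAL_MS:
        return ("CCH", False)
    return ("SCH", sync_index % 2 == 0)


for t in (10, 60, 160, 260):
    print(t, interval_at(t))
```

Under these assumptions a vehicle senses the primary channels during 50 ms out of every 200 ms, which is the information the channel assignment procedures later rely on.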
Fig. 1 IEEE 802.11p channel switching [2]
2.3 Cooperative Spectrum Sensing
Vehicles perform spectrum sensing with the help of devices which allow them to detect the presence or absence of licensed users (PUs) and identify the spectrum available for use. Many factors, such as multipath fading, shadowing, and receiver uncertainty, can significantly compromise the detection performance in spectrum sensing. To address this, cooperative spectrum sensing can be used, where vehicular SUs cooperate and share their sensing results with other vehicles [4, 5]. In our system scenario too, vehicular SUs use cooperative spectrum sensing to combat issues related to lower detection performance.
3 Problem Formulation
As mentioned in Sect. 2, the road is divided into segments of length l. We assume that there are N vehicles in each segment and that K primary channels are sensed idle for transmission. Time T is divided into slots of duration τ. Our objective in this work is to assign idle PU channels to vehicular SUs in such a way that the throughput of the network is maximized. Let $\Theta_{ijt}$ be the throughput achieved from the vehicle-channel pair {i, j} in time slot t. The objective function strives for maximization of overall throughput:

$$\max \sum_{t=1}^{T} \sum_{i,j} \Theta_{ijt}\, x_{ijt} \qquad (1)$$

such that,

$$\sum_{i=1}^{N} x_{ijt} \le 1 \quad \forall\, j \in \{1, 2, 3, \ldots, K\},\ t \in \{1, 2, \ldots, T\} \qquad (2)$$

$$\sum_{j=1}^{K} x_{ijt} \le 1 \quad \forall\, i \in \{1, 2, 3, \ldots, N\},\ t \in \{1, 2, \ldots, T\} \qquad (3)$$

$$x_{ijt} = \begin{cases} 1, & \text{if channel } j \text{ is assigned to vehicle } i \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$
Equation (2) represents the constraint which makes sure that, for each jth channel, at most one vehicle from the segment is assigned at a time. This constraint alone is not sufficient, because we also need to make sure that a single vehicle is not assigned more than one channel. This is ensured by Eq. (3), which states that the ith vehicle is assigned to at most one channel. The third constraint, represented by Eq. (4), makes sure that $x_{ijt}$ can only take the values 0 or 1.
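For a single time slot, the formulation in Eqs. (1) to (4) can be checked on a toy instance by exhaustive search. The sketch below is only an illustration of the constraints, not the paper's algorithm: the throughput values are made up, the function name is hypothetical, and brute force is practical only for a handful of vehicles and channels.

```python
from itertools import permutations


def best_assignment(theta):
    """Maximise the sum of theta[i][j] * x[i][j] for one time slot.

    theta[i][j] is the throughput if vehicle i transmits on channel j.
    Each vehicle gets at most one channel (Eq. 3), each channel at most one
    vehicle (Eq. 2), and every x[i][j] is 0 or 1 (Eq. 4).
    Returns (total_throughput, {vehicle_index: channel_index}).
    """
    n = len(theta)
    k = len(theta[0]) if n else 0
    # None stands for "no channel", so a vehicle may stay unassigned.
    slots = list(range(k)) + [None] * n
    best_total, best_map = 0.0, {}
    for choice in permutations(slots, n):
        total = sum(theta[i][j] for i, j in enumerate(choice) if j is not None)
        if total > best_total:
            best_total = total
            best_map = {i: j for i, j in enumerate(choice) if j is not None}
    return best_total, best_map


# Made-up throughputs for N = 3 vehicles and K = 2 idle channels.
theta = [[3.0, 1.0],
         [2.0, 4.0],
         [5.0, 2.0]]
total, mapping = best_assignment(theta)
print(total, mapping)
```

Because there are more vehicles than channels in this example, one vehicle is left without a channel in the slot, which is exactly the situation the two proposed procedures are designed to handle.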
4 Proposed Methodology
We have modeled the problem of assigning idle PU channels to vehicular SUs as a maximal matching problem. In order to assign channels, we have made a few assumptions, and we present the assignment for two cases.
4.1 Case 1: Ideal Channel Characteristics
In this case, we assume that all the channels have the same characteristics and are lossless. To assign channels efficiently and achieve the objective in Eq. (1), it is then only required to assign channels to vehicles in such a way that contention among vehicles for channels is reduced. We have proposed a heuristic method to assign idle channels to vehicular SUs. We assign priorities to vehicles depending on the type of application being executed. Applications are prioritized according to their delay constraints: those with strict delay constraints are given higher priority. Idle channels are assigned to vehicles as per their priority. If the number of vehicles is more than the number of available channels, the first K vehicles are assigned the K channels, the next K vehicles are assigned the channels once the first K vehicles complete their transmission, and so on.
Algorithm
The proposed algorithm, given in Algorithm 1, is executed by every vehicle which needs a PU channel to transmit data. It is executed by every such vehicle after a CCH interval completes and the next SCH interval starts. Each vehicle calls the procedure and tries to assign itself a channel based on the PU channel information it gained through cooperative spectrum sensing.

Algorithm 1 Channel assignment procedure
INPUT
  channels ← Available PU channels
  vehicles ← Vehicles in a segment
K ← size(channels)
N ← size(vehicles)
index ← vehicles.getIndex(this vehicle)
if index ≥ K then
  this.vehicle.haltCounter ← ⌊index/K⌋
  this.vehicle.potentialChannel ← channels[index (mod K)]
else
  this.vehicle.haltCounter ← 0
  this.vehicle.transmissionChannel ← channels[index]
end if
The input to the algorithm is the set of available PU channels and the vehicles which intend to transmit data over PU channels. The vehicles are sorted as per their priority and are indexed accordingly. If the number of vehicles is more than the number of available channels, then vehicles whose index is greater than or equal to the number of available channels are not assigned a channel for transmission in the given time slot. In order to ensure that a channel is assigned to a vehicle as soon as it is free, we assign a potential channel to each such vehicle, which it may use once the previous vehicles using the same channel complete their transmission. After the first set of K vehicles, the next set of K vehicles is assigned the idle PU channels as per their priority. This is achieved with the help of the variable halt counter, as shown in the algorithm. It denotes the number of vehicles which will use the same PU channel before this vehicle, as per their priority. The halt counter of a vehicle is decreased when it receives a signal from the other vehicles that are transmitting data over the same potential channel. A vehicle transmits over a channel only if its halt counter is zero. An example is given below. In the example given in Fig. 2, we have 7 vehicles in a segment and 3 available PU channels. V1, V2 and V3 are assigned channels C1, C2 and C3, respectively, for transmission. The other vehicles are assigned a potential channel based on their index (e.g., C2 as the potential channel for V5). Once V1, V2 and V3 complete their transmission, they broadcast to the other vehicles that they can decrease their halt counters. Thus, V4, V5 and V6 change their halt counters to 0 and V7 changes its halt counter to 1. Then V4, V5 and V6 contend for their respective channels, and the process continues.

Fig. 2 Example: channel assignment procedure

Table 1 Simulation parameters
Parameter                    Values
Number of vehicles           {100, 200, …, 500}
Speed of vehicles            30–90 km/h
Road length                  15 km
OBU transmission range       300 m
Number of PU channels, K     8

Simulation Results and Discussion
We have performed simulations for our proposed approach and have compared the results with those of the scenario in [5, 6], where the assignment of PU channels to vehicular SUs is done randomly. We have used an internally developed discrete-event simulator based on Java for performing the simulations. The simulation parameters are given in Table 1. Packet delivery ratio (PDR) is defined as the ratio of the number of packets received to the total number of packets generated. Our proposed method performs better in terms of PDR, delay, throughput, and channel utility, as can be seen from Fig. 3a–d, respectively. This is because, in the basic approach, channels are assigned randomly, which may lead to collisions when more than one vehicle is assigned the same PU channel. In contrast, our proposed algorithm reduces the number of collisions and the contention for obtaining PU channels by ensuring that no two vehicles are assigned the same PU channel at the same time and that a vehicle is assigned to a channel only after transmission by the previously assigned vehicle is completed.
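The index-based assignment of Algorithm 1 and the seven-vehicle example can be reproduced with a short Python sketch. The dictionary layout and the function names `assign` and `round_complete` are illustrative choices, not part of the paper; the completion broadcast is modelled as a single function call, and a vehicle whose halt counter reaches zero is simply given its potential channel rather than contending for it.

```python
def assign(index, channels):
    """Algorithm 1 for the vehicle at position `index` in the priority order."""
    k = len(channels)
    if index >= k:
        return {"halt_counter": index // k,
                "potential_channel": channels[index % k],
                "transmission_channel": None}
    return {"halt_counter": 0,
            "potential_channel": None,
            "transmission_channel": channels[index]}


def round_complete(state):
    """Model the broadcast sent when the transmitting vehicles finish."""
    # Vehicles that were transmitting release their channel.
    for s in state.values():
        if s["halt_counter"] == 0 and s["transmission_channel"] is not None:
            s["transmission_channel"] = None
    # Waiting vehicles decrease their halt counter; at zero they may transmit.
    for s in state.values():
        if s["halt_counter"] > 0:
            s["halt_counter"] -= 1
            if s["halt_counter"] == 0:
                s["transmission_channel"] = s["potential_channel"]


channels = ["C1", "C2", "C3"]
vehicles = ["V1", "V2", "V3", "V4", "V5", "V6", "V7"]  # already sorted by priority
state = {v: assign(i, channels) for i, v in enumerate(vehicles)}
before = {v: (s["halt_counter"], s["potential_channel"]) for v, s in state.items()}

round_complete(state)
print(before)
print({v: (s["halt_counter"], s["transmission_channel"]) for v, s in state.items()})
```

Running the sketch gives V5 the potential channel C2 and V7 an initial halt counter of 2; after the first round V4, V5 and V6 hold counters of 0 and V7 a counter of 1, matching the example in the text.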
4.2 Case 2: Non-ideal Channel Characteristics
In this case, we consider the scenario with variable data rates of PU channels and variable packet generation rates for vehicles. Here, our problem can be modeled as a Hungarian matching problem. The Hungarian matching algorithm, also known as the Kuhn-Munkres algorithm, can be used to find maximum-weight (or minimum-weight) matchings in a bipartite graph, a task sometimes known as the assignment problem. Our problem can be represented as a bipartite graph where one set of vertices consists of the vehicular SUs which want to transmit their data over idle PU channels and the other set of vertices consists of the idle PU channels through which transmission can be done. Edges between the two sets of nodes are assigned weights.
Fig. 3 Comparison of proposed algorithm with basic random channel allocation method: (a) PDR versus number of vehicles, (b) average delay versus number of vehicles, (c) throughput versus number of vehicles, (d) channel utility versus number of vehicles
We define $w_{ijt}$ as a weight parameter for edges between vehicles and PU channels. We assume that each packet has a size p. The packet generation rate is $\lambda_{it}$ for vehicle i and the data rate is $\mu_{jt}$ for channel j at time t.

$$w_{ijt} = \frac{\mu_{jt}}{\lambda_{it}} \qquad (5)$$
Algorithm
We create a cost matrix whose entries are $w_{ijt}$, and the objective is the maximization of the total cost. The idea behind maximizing the cost is to increase the total number of packets transmitted in the current time slot, thereby increasing the overall throughput. Since the Hungarian matching algorithm requires the cost matrix to be square, we fill the remaining entries with dummy channels while ensuring that their weights are such that they are not selected by vehicles for transmission. This does not affect the final solution. Our proposed algorithm is given below.

Algorithm 2 Channel assignment procedure
INPUT
  channels ← Available PU channels
  vehicles ← Vehicles in a segment
index ← vehicles.getIndex(this vehicle)
costMatrix ← constructCostMatrix(channels, vehicles)
matching ← executeHungarianMatching(costMatrix)
if this vehicle is assigned any channel then
  this.vehicle.transmissionChannel ← channels[matching[index]]
end if

In Algorithm 2, each vehicle gets information about the available PU channels through cooperative spectrum sensing. Each vehicle also gets information about neighbouring vehicles in its own segment via periodic updates. The cost matrix is calculated using the aforementioned method, and the Hungarian matching algorithm is applied to obtain an optimal matching.

Fig. 4 Comparison of proposed Hungarian matching algorithm with basic random channel allocation method: (a) PDR vs. number of vehicles, (b) average delay vs. number of vehicles, (c) throughput vs. number of vehicles, (d) channel utility vs. number of vehicles

Simulation Results and Discussion
We have performed simulations with our proposed approach using the same simulation scenario and parameters as in Sect. 4.1. Figure 4a represents the variation in PDR with the number of vehicles in the scenario. It can be observed that the PDR decreases as the number of vehicles increases. This is because an increase in vehicles leads to increased contention for idle PU channels, resulting in added delay and reduced PDR. It can also be noted that our proposed algorithm performs better than the basic random channel allocation method because no two vehicles are assigned the same PU channel at a time, which reduces collisions. For the same reason, our proposed algorithm performs better in terms of average delay, throughput, and channel utilization, as shown in Fig. 4b–d, respectively.
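The matching step of Algorithm 2 can be illustrated on a small instance. In the sketch below the weights follow Eq. (5) and the matrix is padded to a square with dummy entries, as the text describes. Two things are assumptions of this illustration: the dummy weight of 0, and the use of exhaustive search over permutations as a stand-in for the Hungarian (Kuhn-Munkres) algorithm. The stand-in returns the same maximum-weight matching on small inputs but does not scale. All function names and the rates are hypothetical.

```python
from itertools import permutations


def weight_matrix(mu, lam):
    """w[i][j] = mu[j] / lam[i], following Eq. (5)."""
    return [[mu_j / lam_i for mu_j in mu] for lam_i in lam]


def pad_square(w, dummy=0.0):
    """Pad with dummy rows or columns so that the matrix becomes square."""
    n_rows, n_cols = len(w), len(w[0])
    size = max(n_rows, n_cols)
    padded = [row + [dummy] * (size - n_cols) for row in w]
    padded += [[dummy] * size for _ in range(size - n_rows)]
    return padded


def max_weight_matching(w):
    """Exhaustive stand-in for the Hungarian algorithm (small inputs only).

    Returns {vehicle_index: channel_index}, omitting dummy assignments.
    """
    n_vehicles, n_channels = len(w), len(w[0])
    sq = pad_square(w)
    size = len(sq)
    best_total, best_perm = -1.0, None
    for perm in permutations(range(size)):
        total = sum(sq[i][perm[i]] for i in range(size))
        if total > best_total:
            best_total, best_perm = total, perm
    return {i: best_perm[i] for i in range(n_vehicles) if best_perm[i] < n_channels}


mu = [6.0, 3.0]        # made-up data rates of two idle PU channels
lam = [1.0, 2.0, 3.0]  # made-up packet generation rates of three vehicles
matching = max_weight_matching(weight_matrix(mu, lam))
print(matching)
```

Here the third vehicle is matched to the dummy channel and therefore does not transmit in this slot. For realistic sizes, a polynomial-time solver such as SciPy's `linear_sum_assignment` (with `maximize=True`) would be the natural replacement for the exhaustive search.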
5 Conclusion
In this work, we have proposed distributed channel assignment algorithms for ideal as well as non-ideal channels in a cognitive-radio assisted Internet of Vehicles scenario. Our proposed approach reduces the number of collisions that occur when two vehicles contend for the same channel. We have also ensured that when a vehicle completes its transmission, it signals to the other vehicles that they can use the channel. The simulation results show that the proposed algorithms improve packet delivery ratio, average packet delay, network throughput, and channel utilization when compared to the scenario where channels are assigned in a random manner.
Acknowledgements This research work was supported in part by the Department of Science and Technology (DST), Government of India vide project grant ECR/2018/000917.
References
1. Kaiwartya, O., Abdullah, A.H., Cao, Y., Altameem, A., Prasad, M., Lin, C., Liu, X.: Internet of vehicles: motivation, layered architecture, network model, challenges, and future aspects. IEEE Access 4, 5356–5373 (2016)
2. Karagiannis, G., Altintas, O., Ekici, E., Heijenk, G., Jarupan, B., Lin, K., Weil, T.: Vehicular networking: a survey and tutorial on requirements, architectures, challenges, standards and solutions. IEEE Commun. Surv. Tutor. 13(4), 584–616 (2011)
3. Hartenstein, H., Laberteaux, L.P.: A tutorial survey on vehicular ad hoc networks. IEEE Commun. Mag. 46(6), 164–171 (2008)
4. Akyildiz, I.F., Lo, B.F., Balakrishnan, R.: Cooperative spectrum sensing in cognitive radio networks: a survey. Phys. Commun. 4(1), 40–62 (2011)
5. Eze, J., Zhang, S., Liu, E., Eze, E.: Cognitive radio-enabled internet of vehicles: a cooperative spectrum sensing and allocation for vehicular communication. IET Netw. 7(4), 190–199 (2018)
6. Eze, J., Zhang, S., Liu, E., Chinedum, E.E., Yu, H.Q.: Cognitive radio aided internet of vehicles (iovs) for improved spectrum resource allocation. In: 2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing, pp. 2346–2352 (2015)
7. Aruna, R., Patra, M.: A judicious spectrum sensing technique in cognitive radio assisted internet of vehicles. In: IEEE International Conference on Advanced Networks and Telecommunications Systems (2019)
8. Sohan, T.A., Haque, H.H., Hasan, A., Islam, J., Islam, A.A.: A graph coloring based dynamic channel assignment algorithm for cognitive radio vehicular ad hoc networks. In: 2016 International Conference on Networking Systems and Security (NSysS), pp. 1–8 (2016)
9. Chen, J., Liu, B., Zhou, H., Wu, Y., Gui, L.: When vehicles meet tv white space: a qos guaranteed dynamic spectrum access approach for vanet. In: 2014 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, pp. 1–6 (2014)
10. Li, X., Zekavat, S.A.: Distributed channel assignment in cognitive radio networks. In: Proceedings of the 2009 International Conference on Wireless Communications and Mobile Computing: Connecting the World Wirelessly, IWCMC ’09, pp. 989–993. ACM, New York (2009)
Load Reduction Using Temporal Modeling and Prediction in Periodic Sensor Networks Arun Avinash Chauhan and Siba K. Udgata
Abstract Wireless Sensor Networks (WSNs) operate in an energy constrained environment, and judicious use of the limited battery of sensor nodes is a priority. Load reduction aids prudent battery usage by reducing the amount of data transmitted across the network without loss of the underlying information, thereby increasing the network lifetime. This paper presents a load reduction technique in which we learn patterns in temporal data and create an adaptive prediction model based on the M5P algorithm in the WEKA toolkit. The model predicts measurements, and sensor nodes send data to the sink only when the sensor measurements do not agree with the predictions. This brings down the amount of data transmitted, leading to reduced communication and energy consumption. Preliminary results indicate a 70% reduction in data transmission across the network, demonstrating the efficacy of temporal modeling in reducing the amount of data sent, consequently saving energy and improving the network lifetime.
Keywords Sensor networks · Load reduction · Data reduction · Efficient information transfer · Temporal modeling · Machine learning
A. A. Chauhan (B) · S. K. Udgata, University of Hyderabad, Hyderabad 500046, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_20
1 Introduction
Humankind is forever on a quest to better lives. Computing helps in this goal, and with the advent of the Internet of Things (IoT) it becomes more pervasive with each passing day. Wireless Sensor Networks (WSNs) are an important subdomain of the IoT. WSNs find use in home, commercial, environmental and healthcare applications [1] and are a major research interest. In almost all of the applications,
they deal with collecting data and sending it to a centralized or distributed destination for further analysis. Data is at the crux of everything in today’s world. Sensor nodes measure data and send it across the network to the sink, which is a base station or computer where the data is further processed or used as is. Each sensor node involved in transmitting or forwarding data uses a radio, and the radio module consumes the most energy in a sensor node. Transmission of a single data bit usually requires 30,000 times more energy than processing a single data bit [2]. In an energy constrained environment, energy comes at a premium; unless properly managed, data transmission can drain battery power, causing sensor nodes to run out of energy and drop off the network.
The goal of load reduction is to reduce the amount of data transferred from sensor nodes to the sink node without loss of the underlying information the data is intended to convey at the destination. The general approach to load reduction is data aggregation, which in most cases leverages data similarity. These approaches overlap the routes taken by data packets from sensor nodes to the sink and aggregate the packets in-network while routing them over the common path. Our approach, on the other hand, works at the sensor nodes, which are the source of the data. Instead of generating data packets and aggregating them, we reduce the number of data packets generated and thus reduce the network load at the source. We do this with the help of temporal models residing at the individual sensor nodes and the sink. The model at an individual sensor node sends data only when the predicted data is not similar to the sensed data. The sink uses its copy of the temporal model to recreate data whenever a particular sensor node does not send data. This way, the amount of data to be communicated is reduced, decreasing the energy consumed and consequently increasing the network lifetime.
In the section that follows, we give a background on load reduction and related work. Section 3 presents our proposed technique. Section 4 presents preliminary results and the corresponding discussion, before we conclude in Sect. 5.
2 Background
Many approaches to data reduction or aggregation take advantage of the redundancy inherent in data to aggregate similar data when data packets from sensor nodes are routed over common paths to the sink. This is called in-network aggregation. Routing and network topology play an important role in in-network aggregation. Consequently, approaches to data aggregation are categorized on the basis of topology as structured, unstructured or hybrid. The classification, distinction and description of these approaches are succinctly dealt with in [3]. Structured aggregation usually involves tree based or cluster based approaches. In the tree based approaches, there is a predefined path, which can be either static or adaptive, from the sensor nodes to the sink. The topology resembles an inverted
tree structure and hence the name. In the cluster based approach, the network is organised hierarchically into a set of sub-networks called clusters. Each cluster has a representative node called the Cluster Head (CH). Data from sensor nodes within a cluster is sent to the corresponding CH, where it is aggregated and routed among the CHs towards the sink. Unstructured approaches, on the other hand, do not involve a routing structure, and the aggregation algorithms are distributed and local, as seen in techniques like flooding [3]. Unstructured approaches are useful and robust when the underlying network is lossy, whereas structured approaches are efficient when the underlying network is lossless. A third category, called the Hybrid approach, is inspired by the best of both structured and unstructured approaches. Tributary Delta [4] is a Hybrid approach which uses both a tree (tributary) and flooding (delta). In areas of lossless transmission, it uses a tree, and in areas of lossy transmission, it uses flooding.
A majority of the approaches to load reduction use data aggregation to reduce redundant data and bring down network load. All these approaches reduce the load as it traverses the network. Our approach aims to reduce the amount of data at its origin, that is, at the sensor nodes. The reason for this approach is twofold: first, the energy spent in transmitting data from the sensor node to the first routing tree node or CH is not trivial, and second, our philosophy is to reduce the load at the source itself. We use temporal models to reduce the data to be transmitted from the sensor nodes towards the sink. Forecasting using models has been envisaged to aggregate data, to approximately monitor networks, and to detect anomalies, among others [5]. One of the first attempts to use a forecasting model to aggregate data was by Li and Wang [6]. They proposed a data aggregation scheme for time series data using the ARIMA model. The results were promising, and since then there has been further work on using prediction models for data aggregation [7]. These prediction models are usually housed at the sensor nodes to begin with and are static. Some are adaptive in nature [8], but since they are built at the sensor nodes, they are invariably simplistic and work only for simple data sets. Recently, Cheng et al. [9] proposed a powerful multi-step method to predict sensory data, using a 1-D CNN and Bi-LSTM to extract abstract features based on correlations among sensor node measurements. The models in [9] are static and created offline, and the work does not describe the complete WSN use case for these models.
We propose a scheme to reduce load at the sensor nodes using temporal models, but with an essential difference: the models are adaptive, and whenever a model needs to be recreated, this is done at the sink. The sink node has much more powerful processing capabilities and a theoretically unlimited battery. This gives us the flexibility to create stronger temporal models. The models, whenever recreated at the sink, are distributed to the sensor nodes in the network, saving individual sensor nodes the task of model recreation. The entire operation is sink driven. Preliminary results suggest that our model is able to predict data within an acceptable range of accuracy, thereby reducing transmissions from sensor nodes to the sink by 70%.
3 Proposed Technique
3.1 Sensor Environment
The operating environment consists of sensor nodes randomly deployed over an area that needs to be monitored. We consider the sensor nodes to form a Periodic Wireless Sensor Network (PWSN). In a PWSN, sensor nodes send data packets towards the sink once every time period. A time period is a fixed interval of time that is divided into t time slots. Once every time period, each sensor node sends a data packet containing t measurements to the sink node. The major actors are:
1. Sensor nodes: Devices that sense and measure the meteorological data that needs to be tracked. These sensor nodes also house the temporal model.
2. Sink node: The device where data from the entire network is received for further action by the stakeholders interested in the data. It houses the temporal models corresponding to each sensor node in the network.
3. Temporal model: Given the current date and time, the temporal model at each sensor node outputs the temperature expected for that date and time. The temporal model at the sink takes additional data, i.e., the sensor node location, to output the expected temperature for a particular sensor node at that date and time.
The data measured by the sensor nodes comes from the five cities data set provided by [10], which is a real world data set. It consists of meteorological data measured in five Chinese cities over a period ranging from January 1st, 2010 to December 31st, 2015. It consists of 16 features and 264,270 instances. From the meteorological data, we choose to work with outdoor air temperature measurements.
3.2 Methodology
Load Reduction
Our technique consists of two phases. In the first phase, sensor nodes measure data once every time slot in a time period. The measurement is saved in a set of measurements. The sensor node then refers to its temporal model, which predicts a measurement for the current time slot. This prediction is saved in a set of predictions. Each measurement is compared against the corresponding prediction for that time slot. Should the measurement and the prediction disagree, a flag is raised. At the end of the time period, if the flag is raised, the measurement set is sent to the sink. If the flag is not raised, meaning the measurements and predictions agree, a notification message is sent instead.
The sink node is the primary actor in the second phase. The sink node receives data packets or notification packets from each sensor node at the end of every time period. If it receives a notification packet from a sensor node, it understands that the temporal model at that sensor node is predicting within acceptable limits of accuracy and uses the temporal model at its end to reconstruct the measurements. If instead it receives a data packet, it realizes that the temporal model for that particular sensor node is not performing optimally and uses the received measurements. Details of both phases follow.
Data Reduction at Sensor Nodes
Sensor nodes carry sensors to measure air temperature. Every sensor node in the network also holds a temporal model that gives an estimate, or predicted value, of the outdoor air temperature at a particular time slot (date and time) of operation. Once every time slot, the sensor measures a value that is saved in the set of measurements, and the temporal model outputs a prediction that is saved in the set of estimates. The measurement in that time slot is then compared with the corresponding estimate. If the two differ by more than a threshold, which we call the slot difference threshold, a flag is set. We call this flag the Slot difference flag. The slot difference threshold is set as per data and application requirements. Once the time period elapses, the sum of the measurements in the measurement set is compared against the sum of the estimates in the estimate set. If the two differ by more than a predefined threshold, a second flag is set. We call this flag the Period difference flag. If either flag is set, the set of measurements is sent in a data packet towards the sink. If neither flag is set, a notification is sent to the sink in a control packet.
The Period difference flag tracks the drift of the measurements away from the predictions. Individually, measurements might each be similar to their predictions, but this flag identifies a consistent difference (albeit under the slot difference threshold), so that the measurements are sent to the sink and the model is consequently fine-tuned to perform better. Algorithm 1 describes the process at the sensor nodes.
Algorithm 1: Data reduction at sensor nodes
Input: Measurement from sensor, prediction from model.
Measure set = ∅, Prediction set = ∅;
while time slot do
    Measure set += measurement, Prediction set += prediction;
    if |measurement − prediction| > Slot difference threshold then
        Slot difference flag = true;
    end
    if current time period ends then
        if |sum(Measure set) − sum(Prediction set)| > Period difference threshold then
            Period difference flag = true;
        end
        if Slot difference flag == true or Period difference flag == true then
            Add Measure set to data packet;
            Send data packet to sink;
        else
            Send notification packet to sink;
        end
        Measure set = ∅, Prediction set = ∅, Flags = false;
    end
end
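The sensor-side procedure can be illustrated with a minimal Python sketch. This is not the authors' implementation (which was written in R); the class name, the field names, the packet representation and the constant stand-in model are all hypothetical, and the two thresholds are assumed to be compared against absolute differences.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Packet = Tuple[str, List[float]]  # ("data", measurements) or ("notification", [])


@dataclass
class SensorNode:
    """Sensor-side logic of Algorithm 1; all names here are illustrative."""
    predict: Callable[[int], float]   # stand-in temporal model: slot index -> estimate
    slot_threshold: float             # slot difference threshold
    period_threshold: float           # threshold on the difference of the two sums
    slots_per_period: int             # t, the number of time slots per period
    measurements: List[float] = field(default_factory=list)
    predictions: List[float] = field(default_factory=list)
    slot_flag: bool = False

    def on_time_slot(self, slot: int, measurement: float) -> Optional[Packet]:
        """Process one slot; return a packet when the period ends, else None."""
        prediction = self.predict(slot)
        self.measurements.append(measurement)
        self.predictions.append(prediction)
        # Slot difference flag: one measurement disagrees with its prediction.
        if abs(measurement - prediction) > self.slot_threshold:
            self.slot_flag = True
        if len(self.measurements) < self.slots_per_period:
            return None
        # Period difference flag: small but consistent drift over the period.
        drift = abs(sum(self.measurements) - sum(self.predictions))
        period_flag = drift > self.period_threshold
        if self.slot_flag or period_flag:
            packet: Packet = ("data", list(self.measurements))
        else:
            packet = ("notification", [])
        # Reset the sets and flags for the next time period.
        self.measurements.clear()
        self.predictions.clear()
        self.slot_flag = False
        return packet


def run_period(node: SensorNode, readings: List[float]) -> Optional[Packet]:
    """Feed one period of readings to the node and return the final packet."""
    packet = None
    for slot, value in enumerate(readings):
        packet = node.on_time_slot(slot, value)
    return packet
```

With a model that always predicts 20.0, a slot threshold of 1.5 and a period threshold of 3.0, a period of readings close to 20.0 yields a notification packet, whereas four readings of 21.0 stay under the slot threshold but trip the period flag and are sent as a data packet.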
Data Recovery at Sink Node
For every sensor node, at each time slot, the sink uses the temporal model to get an estimate of the temperature at that instant and saves the estimate in a set of estimates for that time period. We call this set ‘Measures’. There is a separate Measures set for every sensor node. Whenever a time period elapses and the sink receives a notification packet, it uses the Measures set it has for that sensor node. If instead it receives a data packet, it overwrites the contents of the Measures set with the data received. Should the sink node receive a data packet from a sensor node for a certain number of consecutive time periods, it updates the temporal model and sends the updated temporal model back to that sensor node. We call this number the model deteriorate number and set it to 5. This number helps distinguish between anomalous situations and the general operation of a model, and is set as per data and application requirements. Algorithm 2 describes the process at the sink.
Algorithm 2: Data recovery at sink
Input: Data packet with set [Mt], t = 1, 2, ..., k, from the corresponding sensor node, or a notification packet.
For each sensor node in the network, do the following:
while time slot do
    Measure set += current prediction;
    if current time period ends then
        if Data packet arrives then
            Extract Measure set from data packet;
            Update local Measure set;
            Forward Measure set to application;
            Increment Model Deteriorate counter;
        else
            Forward Measure set to application;
        end
        if Model Deteriorate counter == 5 then
            Model Deteriorate counter = 0;
            Update temporal model;
            Send temporal model params to corresponding sensor node;
        end
    end
end
Temporal Model
As is clear by now, the success of our approach depends on the efficiency of the temporal models. We use a temporal model which uses the current date and time, along with the spatial coordinates of the sensor node, as its predictor variables, and the temperature as the output variable.
The temporal models were developed using machine learning libraries in the R language. The M5P regression tree is available in the RWeka library [11]. M5P generates model trees using an enhanced form of the M5 algorithm of Quinlan [12]. We use records from the first two years of the data set for training our model, and the rest of the data set (the remaining four years) to test the model. The data is pre-processed, and occasional NA values are replaced with the last measured value. Data in the PWSNs
is time series data and has inherent seasonal features. We convert the seasonality in the data into features for the tree using Fourier transforms. Once the data is pre-processed and the seasonal features are converted to sine and cosine features, the tree is built using functions provided by RWeka. Our temporal modeling technique is more robust to the vagaries of data because we break up the time series, in this case the temperature, into its constituent components, namely seasonality, trend and remainder. We then apply the M5P regression tree to the seasonal and remainder components, and separately to the trend component. The outputs of both regression trees are combined to get the overall temperature prediction.
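One common way to turn seasonality into sine and cosine features is sketched below in Python. The paper does not state which periods or how many harmonics were used, so the daily and yearly periods, the default of two harmonics, and the function name `fourier_features` are assumptions made for illustration only.

```python
import math
from datetime import datetime
from typing import List


def fourier_features(ts: datetime, harmonics: int = 2) -> List[float]:
    """Encode the daily and yearly position of a timestamp as sine/cosine pairs.

    Assumed periods: 24 hours and 365.25 days. For each harmonic k the
    output holds [sin, cos] for the daily cycle, then [sin, cos] for the
    yearly cycle.
    """
    hour_frac = (ts.hour + ts.minute / 60.0) / 24.0        # position within the day
    year_frac = (ts.timetuple().tm_yday - 1) / 365.25      # position within the year
    feats: List[float] = []
    for k in range(1, harmonics + 1):
        for frac in (hour_frac, year_frac):
            angle = 2.0 * math.pi * k * frac
            feats.extend([math.sin(angle), math.cos(angle)])
    return feats


# Example: four features per harmonic, so eight values for two harmonics.
example = fourier_features(datetime(2012, 7, 15, 14, 30))
```

Features of this kind let a tree learner see that 23:00 and 01:00, or late December and early January, are close to each other, which a raw hour or day-of-year number would not convey.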
3.3 Implementation
We tested our hypothesis using RStudio, an integrated development environment for R. The creation of the temporal models, the transfer of reduced data across the network, and the update of models from the sink to the sensor nodes were all coded as R script files in RStudio.
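For illustration, the sink-side bookkeeping of Algorithm 2 can be sketched in Python as follows. The authors' code is in R and is not reproduced here; the class and field names are hypothetical, model retraining is replaced by a counter, and resetting the deteriorate counter on a notification packet is an assumption drawn from the word "consecutive" in the description rather than from Algorithm 2 itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SinkState:
    """Per-sensor-node state kept at the sink; all names are illustrative."""
    predict: Callable[[int], float]   # sink copy of the node's temporal model
    deteriorate_limit: int = 5        # the "model deteriorate number"
    estimates: List[float] = field(default_factory=list)
    deteriorate_counter: int = 0
    model_updates: int = 0            # stands in for retraining and redistribution

    def on_time_slot(self, slot: int) -> None:
        """Save the model's estimate for the current time slot."""
        self.estimates.append(self.predict(slot))

    def on_period_end(self, data_packet: Optional[List[float]]) -> List[float]:
        """Return the values forwarded to the application for this period.

        data_packet is the received measurement set, or None when only a
        notification packet arrived.
        """
        if data_packet is not None:
            recovered = list(data_packet)     # overwrite estimates with real data
            self.deteriorate_counter += 1
        else:
            recovered = list(self.estimates)  # model was accurate: use estimates
            self.deteriorate_counter = 0      # assumption: only consecutive misses count
        if self.deteriorate_counter == self.deteriorate_limit:
            self.deteriorate_counter = 0
            self.model_updates += 1           # placeholder: update model, send params
        self.estimates.clear()
        return recovered
```

In a full system the placeholder line would retrain the temporal model at the sink and send its parameters to the corresponding sensor node.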
4 Results and Discussion
We evaluated our temporal model using the following metrics:
• Root Mean Square Error (RMSE)
• Rsquared value
• Mean Absolute Error (MAE)
Moreover, results are visualized to compare the performance of the model against actual data. The gains in the PWSN from using the temporal model are evaluated by computing the average percentage reduction in data transmitted from sensor nodes to the sink node.
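The three metrics and the reduction figure can be written down as a short Python sketch. The definitions of RMSE and MAE are standard; the Rsquared shown here is the coefficient of determination, which may differ from the variant reported by the authors' R tooling, and counting the reduction as the share of periods in which a notification replaced a data packet is an assumption, since the paper does not give the exact formula. All function names are illustrative.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def transmission_reduction(data_packets_sent: int, total_periods: int) -> float:
    """Percentage of periods in which no data packet had to be sent."""
    return 100.0 * (total_periods - data_packets_sent) / total_periods
```

Under this counting, a node that sends data packets in 30 out of 100 time periods shows a 70% reduction; the small notification packets are ignored in this simplified measure.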
4.1 Air Temperature and Temporal Model
Table 1a shows the RMSE, Rsquared and MAE values for our temporal model for a one year forecast period. Note that we also break down the performance metrics by the individual components of the temperature data. We compared our work against [7], which exhaustively investigated three machine learning algorithms for air pollution monitoring. Our RMSE values are an improvement over [7], which obtained RMSE values of 5.8, 6.4 and 16.4 for M5P, SVM and ANN based models respectively for ozone data. Their data set had a variance of 44.89 whereas our data set had a variance of 63.26, making predictions more challenging in our case. The ML model parameters used by [7] are listed in Table 1b. We used the same parameters for our M5P algorithm.

Table 1 Temporal model performance and ML model parameters
(a) Temporal model performance metrics
Forecast type       RMSE    Rsquared    MAE
Overall             3.32    0.8312      2.55
Seasonal            2.94    0.8533      2.19
Trend               0.95    0.0056      0.87
Trend (non tree)    –       0.7713      –
Remainder           2.67    0.0013      1.851
(b) ML model parameters used by us and [7]
Alg    Parameter                 Value
M5P    Instances (leaf nodes)    4
M5P    Pruning                   Allowed
M5P    Smoothing                 Allowed
SVM    Kernel type               Dot
ANN    Hidden layers             1
ANN    Nodes (Hidden layer)      Attributes

Coming back to our results, it can be seen that Rsquared on the trend component is poor. This is because trees do not adapt well to change or concept drift. If the trend component is instead modeled with linear approximations, the Rsquared value improves immediately (the non-tree trend row in Table 1a). Rsquared values are expected to be relatively low on the remainder as it is stationary. Figure 1 plots the short term forecast, where the training data is in blue and the one day forecast is in orange. As can be seen, the test values and the forecast values run close to each other. Figure 2 plots the long term forecast, where again we have two years of training data, but this time followed by a forecast for the entire year. It should be noted that the short term forecast is more accurate than the long term forecast. One can also notice that the long term forecast deviates slightly from the measurements in January, February and March but then improves. This is attributed to faster trend changes in those months of the year 2012. Note that the long term forecast performance was measured for a single temporal model as a proof of concept. The performance of long term forecasts is expected to improve once the temporal model is made adaptive and uses a sliding window of historic measurements from training data, recent predictions made by the temporal model at the sink, and measurements received at the sink.
4.2 Temporal Model and Load Reduction
For a similarity threshold of 1.5 °C between a prediction and a measurement, we observed a 70% reduction in data transmitted from sensor nodes to the sink node for the five cities data set.
Fig. 1 Short term forecast—one day
Fig. 2 Long term forecast—one year
5 Conclusions
In this paper, we proposed an approach to load reduction inspired by concepts from machine learning. We proposed using temporal modeling and prediction to reduce the load at the source in a PWSN. Preliminary results suggest that temporal models are accurate enough in predicting meteorological data such as air temperature, even in an outdoor environment, where the measurements vary a lot more than in an indoor
environment. Going ahead, we need to test our approach exhaustively. Additionally, we need to improve model accuracy on individual components, specifically the trend and remainder components. We intend to tweak M5P and use it in combination with linear approximation sub-models to improve trend and remainder predictions. In the future, our approach can be further improved by incorporating spatial modeling, with which a mismatch between prediction and measurement can be verified to be an actual mismatch and not a sensor node measurement error. We also intend to increase network lifetime even further by introducing the concept of virtual sensor nodes. Virtual sensor nodes are temporal models that approximate the behavior of physical sensor nodes that have run out of battery. The idea is based on the fact that we can model the measurements made by sensor nodes using temporal models. The virtual sensor node then provides measurements long after the corresponding physical sensor node goes down, thereby increasing network lifetime.
References
1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: a survey. Comput. Netw. 38(4) (2002). https://doi.org/10.1016/S1389-1286(01)00302-4
2. Raghunathan, V., Schurgers, C., Park, S., Srivastava, M.B.: Energy-aware wireless microsensor network. IEEE Sig. Process. Mag. 19(2), 40–50 (2002). https://doi.org/10.1109/79.985679
3. Jesus, P., Baquero, C., Almeida, P.S.: A survey of distributed data aggregation algorithms. IEEE Commun. Surv. Tutor. 17(1), 381–404 (2014). https://doi.org/10.1109/COMST.2014.2354398
4. Chitnis, L., Dobra, A., Ranka, S.: Aggregation methods for large-scale sensor networks. ACM Trans. Sens. Netw. 4(2), 9:1–9:36 (2008). https://doi.org/10.1145/1340771.1340775
5. Kim, J.J., Shin, K.G.: Energy-efficient self-adapting online linear forecasting for wireless sensor network applications. In: IEEE International Conference on Mobile Adhoc and Sensor Systems (2005). https://doi.org/10.1109/MAHSS.2005.1542822
6. Li, G., Wang, Y.: Automatic ARIMA modeling-based data aggregation scheme in wireless sensor networks. EURASIP J. Wirel. Commun. Netw. 85 (2013). link.springer.com/content/pdf/10.1186/1687-1499-2013-85.pdf
7. Shaban, K.B., Kadri, A., Rezk, E.: Urban air pollution monitoring system with forecasting models. IEEE Sens. J. 16(8), 2598–2606 (2016). https://doi.org/10.1109/JSEN.2016.2514378
8. Mollanoori, M., Hormati, M.M., Charkari, N.M.: An online prediction framework for sensor networks. In: 16th Iranian Conference on Electrical Engineering (2008). citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.148.8696&rep=rep1&type=pdf
9. Cheng, H., Xie, Z., Shi, Y., Xiong, N.: Multi-step data prediction in wireless sensor networks based on one-dimensional CNN and bidirectional LSTM. IEEE Access 7, 117883–117896 (2019). https://doi.org/10.1109/ACCESS.2019.2937098
10. Liang, X., Li, S., Zhang, S., Huang, H., Chen, S.X.: PM2.5 data reliability, consistency, and air quality assessment in five Chinese cities. J. Geophys. Res. Atmos. 121 (2016). https://doi.org/10.1002/2016JD024877
11. Hornik, K., Buchta, C., Zeileis, A.: Open-source machine learning: R meets Weka. Comput. Stat. 24(2), 225–232 (2009). https://doi.org/10.1007/s00180-008-0119-7
12. Quinlan, J.R.: Learning with continuous classes. In: Adams, Sterling (eds.) Proceedings AI ’92, pp. 343–348 (1992). https://sci2s.ugr.es/keel/pdf/algorithm/congreso/1992-Quinlan-AI.pdf
Direct Torque Control of Mathematically Modeled Induction Motor Drive Using PI-Type-I Fuzzy Logic Controller and Sliding Mode Controller
Soumya Ranjan Satpathy, Soumyaranjan Pradhan, Rosalin Pradhan, Rajashree Sahu, Aparesh Prasad Biswal, and Bibhu Prasad Ganthia
Abstract This research introduces a Type-I Fuzzy Logic control technique, used in association with a conventional PI controller and a sliding mode control strategy, for direct torque control of an induction motor drive. The technique regulates rapid variations in motor speed towards the reference value for smooth operation. Here, the Fuzzy Logic controller tunes the PI control gains to obtain a faster response towards steady state. The sliding mode controller, working in association with the PI controller, removes uncertainties due to sudden variations in motor speed. This combined controller helps achieve full control of torque with switching converters and gives accurate outputs with respect to the reference parameters. The model is designed in MATLAB. Results indicate that the conventional PI controller reaches steady state quickly under normal operating conditions, but the proposed PI-FLC-SMC technique is more effective, faster under variations, and more stable. ITAE values for the different operating conditions are presented in this paper, and the results are compared on the basis of the Simulink results.
Keywords IM drive · DTC · PI controller · FLC · SMC · ITAE
S. R. Satpathy (B) · S. Pradhan · R. Pradhan · R. Sahu · A. P. Biswal · B. P. Ganthia, Electrical Engineering, IGIT, Sarang, Dhenkanal, Odisha, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_21
1 Introduction
Speed controller design significantly affects the performance of an electric drive [1]. PI speed controllers are widely used in industrial applications because of their simple structure. However, since system parameters, model uncertainties, non-linear dynamics, and external disturbances vary constantly, fixed-gain PI controllers are often unable to deliver the required performance. It is therefore desirable to adjust the controller parameters continually when high performance is expected from the drive system [2]. A Fuzzy Logic control technique can be added to tune the PI controller gains to ensure maximum control performance under nominal operating conditions [3, 4]. Another approach to the issue is to replace the PI controller
Fig. 1 Modern variable speed IM drive system
entirely with adaptive control schemes, for example, self-tuning and artificial intelligence techniques, model reference adaptive control (MRAC), and sliding mode control (SMC) [5]. Among these designs, sliding mode control has shown robustness to inaccuracies in motor parameters and in dynamic modeling, insensitivity to external load disturbances, stability, and rapid dynamic response [6, 7]. It is therefore found to be very efficient in regulating electric drive systems [8].
2 Induction Motor (IM) Drive
The IM is perhaps the most commonly used AC motor, as it offers many advantages over other motors. Two forms of 3-phase IM exist: the wound-rotor IM and the squirrel-cage IM [1]. Both machines are identical from an electrical point of view, except that the latter has rotor windings that are permanently short-circuited inside the machine. In a wound-rotor machine, the 3-phase rotor winding terminals are externally accessible [2]. This work discusses only the speed control of the squirrel-cage IM. Figure 1 displays the block diagram of a typical modern variable speed IM drive system. The IM is coupled to the load either directly or indirectly (through gears). The power converter regulates the flow of power from the AC supply to the motor by controlling the power semiconductor switches [9, 10]. Control of such a drive is complex. The primary reasons for this complexity are the variable-frequency, harmonically optimized converter power supplies, the complicated dynamics of AC machines, variations in machine parameters, and the difficulty of processing feedback signals in the presence of harmonics. IMs are nevertheless preferred over their counterparts because they are robust, reliable, and nearly maintenance-free in operation [11, 12].
3 Direct Torque Control Technique
This switching-based approach has been described as simple and feasible for meeting those needs. DTC is one of the most effective and efficient methods
Fig. 2 Basic block diagram of conventional DTC scheme
Fig. 3 Hysteresis torque and flux controller block diagram
for controlling induction motors [7, 13]. The approach relies on the torque difference and on monitoring the stator flux, and is now one of the most commonly used control methods for regulating torque and flux efficiently. Figure 2 displays the typical block diagram of the conventional DTC scheme. The switching look-up table is constructed according to the switching vectors [14, 15].
The block diagram of the DTC arrangement is displayed in Fig. 3 above, along with the frame of reference for the stator and rotor axes [12]. The reference torque can be generated by a PI controller from the difference between the reference speed and the instantaneous speed. Selection of this reference speed improves the dynamic response of the torque and flux control [1, 11, 16–18].
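The role of the PI speed loop can be illustrated with a minimal discrete-time Python sketch. It is not the authors' MATLAB/Simulink model: the class name, the gains, the torque limit and the simple anti-windup rule (freezing the integrator while the output is saturated) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PISpeedController:
    """Discrete PI controller mapping a speed error to a torque reference."""
    kp: float                 # proportional gain (hypothetical value chosen by the user)
    ki: float                 # integral gain (hypothetical value chosen by the user)
    torque_limit: float       # symmetric saturation of the torque reference
    integral: float = 0.0     # accumulated speed error

    def step(self, speed_ref: float, speed_meas: float, dt: float) -> float:
        """Return the torque reference for one sampling interval of length dt."""
        error = speed_ref - speed_meas
        candidate = self.integral + error * dt
        unsaturated = self.kp * error + self.ki * candidate
        torque = max(-self.torque_limit, min(self.torque_limit, unsaturated))
        # Anti-windup (an added assumption): keep the old integral while saturated.
        if torque == unsaturated:
            self.integral = candidate
        return torque


# Example: a 10 rad/s speed error with kp = 0.5, ki = 2.0 and dt = 10 ms.
controller = PISpeedController(kp=0.5, ki=2.0, torque_limit=10.0)
torque_ref = controller.step(speed_ref=100.0, speed_meas=90.0, dt=0.01)
```

In a DTC drive, a torque reference of this kind would feed the torque hysteresis comparator; a fuzzy tuner, as proposed in this paper, would adjust `kp` and `ki` online instead of keeping them fixed.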
4 Type-I Fuzzy Logic Control Technique
The Type-I Fuzzy Logic technique is rule based and is effective in improving the steady state and transient response [19]. This adaptive technique is effective for controlling speed and torque during starting and running of the induction motor drive. Figure 4 presents the Type-I FLC architecture. Figure 5 shows the rules of the fuzzy technique, and Figs. 6 and 7 present the fuzzy rules for speed and electromagnetic torque, respectively [18, 20].
Fig. 4 FLC architecture
Fig. 5 FLC rule set
Fig. 6 FLC rule set for rotor speed
Fig. 7 FLC rule set for electromagnetic torque
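To make the rule-based idea concrete, the following is a minimal sketch of a Type-I fuzzy controller with two normalized inputs (error and change in error). The three labels, the triangular membership functions, the 3 × 3 rule table and the weighted-average defuzzification are all assumptions chosen for illustration; they are not the rule sets of Figs. 5, 6 and 7.

```python
LABELS = ("N", "Z", "P")  # negative, zero, positive (hypothetical labels)
SETS = {"N": (-2.0, -1.0, 0.0), "Z": (-1.0, 0.0, 1.0), "P": (0.0, 1.0, 2.0)}
CENTERS = {"N": -1.0, "Z": 0.0, "P": 1.0}  # singleton output values

# Hypothetical rule table: (error label, change-in-error label) -> output label.
RULES = {
    ("N", "N"): "N", ("N", "Z"): "N", ("N", "P"): "Z",
    ("Z", "N"): "N", ("Z", "Z"): "Z", ("Z", "P"): "P",
    ("P", "N"): "Z", ("P", "Z"): "P", ("P", "P"): "P",
}


def tri(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def clamp(x):
    """Limit a normalized input to the universe of discourse [-1, 1]."""
    return max(-1.0, min(1.0, x))


def fuzzy_output(error, d_error):
    """Fire all rules (min for AND) and defuzzify by weighted average."""
    e, de = clamp(error), clamp(d_error)
    num = den = 0.0
    for le in LABELS:
        for lde in LABELS:
            weight = min(tri(e, *SETS[le]), tri(de, *SETS[lde]))
            num += weight * CENTERS[RULES[(le, lde)]]
            den += weight
    return num / den
```

With these assumed sets, `fuzzy_output(0.5, 0.0)` fires the rules (Z, Z) and (P, Z) equally and returns 0.5; in a drive, the normalized output would then be scaled to a torque or gain correction.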
5 Sliding Mode Controller
This technique is widely recognized for its robustness against internal uncertainties, parameter variations caused by nonlinearity and dynamics that were excluded from the model. In this study, the SMC procedure is used to enhance DTC for IMs. Such strategies improve steady-state performance while preserving the transient-state benefits of DTC, and the torque and flux remain robust against variations in system parameters. This study presents models and experimental findings to demonstrate the feasibility of the proposed strategy. Figure 8 illustrates the sliding mode concept in phasor diagram form, showing the desired value and the reaching mode towards stability.
Fig. 8 Sliding mode control phasor
6 Simulink Model and Result
Figure 9 shows the FLC-SMC-DTC framework for direct torque and flux control, based on the sliding mode, of a speed-controlled induction motor. The model is arranged in cascade for electromagnetic torque control, squared flux norm control and speed control. Sliding mode control and genetic algorithms are introduced into the control structure to adjust the torque, flux and speed (Figs. 10, 11, 12, 13, 14, 15 and 16).
Fig. 9 Block diagram of proposed technique
Fig. 10 d-axis voltage of DTC scheme of IM using PI-FLC-SMC with Nonlinearity
Fig. 11 q-axis voltage of DTC scheme of IM using PI-FLC-SMC with Nonlinearity
Fig. 12 Speed characteristics for DTC-PI-FLC-SMC technique
Fig. 13 Speed characteristics for DTC-PI-FLC-SMC technique
Fig. 14 Speed characteristics for DTC-PI-FLC-SMC technique
The simulation results are presented as follows: Fig. 10 shows the d-axis voltage of the DTC scheme of the IM using PI-FLC-SMC with nonlinearity, Fig. 11 shows the corresponding q-axis voltage, and Figs. 12, 13, 14, 15 and 16 show the speed characteristics of the IM drive for the DTC-PI-FLC-SMC technique. The Type-I-FLC-SMC scheme combines the Type-I FLC and SMC methods to mitigate the chattering phenomenon. It ensures stability, improves IMD performance and provides robustness against parameter uncertainty. SMC design involves selecting a sliding surface and a control law. To reduce chattering, a boundary layer is usually introduced around the sliding surface in the control law.
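The effect of the boundary layer can be illustrated with a small numerical sketch. It assumes a hypothetical first-order sliding dynamic ds/dt = u + d with a constant disturbance d, and compares the ideal discontinuous law u = -K sign(s) with the saturated law u = -K sat(s/φ). The gain, boundary width, disturbance and step size are illustrative values, not parameters from the paper.

```python
def sat(x):
    """Saturation function: linear inside [-1, 1], clipped outside."""
    return max(-1.0, min(1.0, x))


def switching_control(s, gain, boundary=None):
    """Switching term of the control law for sliding variable s.

    boundary=None gives the ideal law -gain*sign(s); otherwise the sign
    function is replaced by sat(s/boundary) inside the boundary layer.
    """
    if boundary is None:
        if s > 0:
            return -gain
        if s < 0:
            return gain
        return 0.0
    return -gain * sat(s / boundary)


def simulate(boundary, steps=500, dt=0.01, gain=1.0, disturbance=0.2, s0=0.5):
    """Euler simulation of ds/dt = u + disturbance; counts control sign flips."""
    s = s0
    previous_u = None
    flips = 0
    for _ in range(steps):
        u = switching_control(s, gain, boundary)
        if previous_u is not None and u * previous_u < 0:
            flips += 1  # the control jumped from one polarity to the other
        previous_u = u
        s += dt * (u + disturbance)
    return s, flips


final_sign, flips_sign = simulate(boundary=None)
final_sat, flips_sat = simulate(boundary=0.05)
```

In this toy setting the discontinuous law keeps flipping polarity once the surface is reached, while the saturated law settles without flipping, at the cost of a small steady offset of s inside the layer. That trade-off is the usual reason for choosing a narrow boundary layer.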
7 Result Analysis and Discussion
The cost function used to evaluate the individuals of every generation is chosen as the integral of time-weighted absolute error (ITAE). It can be written as:

$$\mathrm{ITAE} = \int_{0}^{t} t\,\lvert e_{xp}(t)\rvert\,dt$$
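As a worked illustration, ITAE can be approximated from sampled data with the trapezoidal rule. The sketch below assumes the error signal is available as a list of samples with matching time stamps; the function name and the test signal are hypothetical.

```python
import math


def itae(times, errors):
    """Trapezoidal approximation of the integral of t*|e(t)| over the samples."""
    total = 0.0
    for i in range(1, len(times)):
        f_prev = times[i - 1] * abs(errors[i - 1])
        f_curr = times[i] * abs(errors[i])
        total += 0.5 * (f_prev + f_curr) * (times[i] - times[i - 1])
    return total


# Hypothetical decaying error e(t) = exp(-t), sampled every millisecond for 5 s.
sample_times = [i * 0.001 for i in range(5001)]
sample_errors = [math.exp(-t) for t in sample_times]
sample_itae = itae(sample_times, sample_errors)
```

For this test signal the integral has the closed form 1 - (1 + T)e^{-T} for a horizon T, which gives a convenient check of the numerical value. Because of the time weighting, errors that persist late in the response are penalized more heavily than the initial transient.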
Fig. 15 Speed characteristics for DTC-PI-FLC-SMC technique
During the search process, the FLC looks for the optimum values of the conventional PI controller gains that minimize the ITAE cost function. This function serves as the evolution criterion for the FLC and accounts for both positive and negative errors. Table 1 compares the controllers at different operating conditions. Table 2 gives the results for different speed and torque variations obtained with the different controllers and optimizing techniques, and Table 3 compares the ripple content.
8 Conclusion
This paper presents the implementation of a Type-I fuzzy logic controller combined with the sliding mode control technique and a conventional PI controller for the direct torque control of induction motor drive (IMD) systems. The conventional controller is compared with the proposed PI-FLC-SMC control technique, which gives superior performance under speed variations and a faster response towards steady-state operation than the traditional DTC techniques. In the DTC drive, the electromagnetic torque command is based on the Lyapunov
Fig. 16 Speed characteristics for DTC-PI-FLC-SMC technique

Table 1 Comparison of controllers with proposed technique at various operating conditions (performance index: ITAE)

| Operating conditions | DTC (traditional) | DTC-PI-FLC technique | DTC-PI-FLC-SMC (proposed technique) |
|---|---|---|---|
| Reference speed of 120 rad/s at nominal parameters; phase load torque applied at t = 2 sec from 30 to 90% rated load | 0.08165 | 0.0882 | 0.0930 |
| Reference speed of 60 rad/s and rated load torque of 30% | 0.32 | 0.30 | 0.281 |
| Reference speed 120 rad/s at nominal parameters and load torque rated 30% | 0.01182 | 0.01088 | 0.051 |
| Speed 60 rad/s at rated load torque of 30% | 0.6061 | 0.6005 | 0.566 |
14.30
0.93
2.91
0.18
Kp
Max overshoot %
Settling time
0.17
13.74
0.18
9.65
551.05
0.12
3.12
1.09
20.42
520.34
Settling Time
Max Overshoot %
Kp
Ki
Torque
511.56
Peak speed
Ki
DTC (traditional) PI-SVM DTC PI-FLC-SMC-DTC Parameters technique
Parameters
Table 2 Comparison between controllers at different speed reversal and different load torque
0.12
16.6896
1.2
2.2
1105.2
0.09
3.1453
1.10
9.95
1008.3
0.0095
2.565
2.05
30.11
1007.65
DTC (traditional) PI-SVM DTC PI-FLC-SMC-DTC technique
Table 3 Comparison of ripple content between controllers at 1400 rpm

| Parameters | DTC (traditional) | PI-SVM DTC | PI-FLC-SMC-DTC technique |
|---|---|---|---|
| Flux ripple (Wb) | 0.024 | 0.012 | 0.007 |
| Torque ripple (Nm) | 12.2 | 6.1 | 3.2 |
| Steady state error (%) | 0.0118 | 0.0046 | 0.0042 |
| Settling time of speed (Sec) | 0.640 | 0.635 | 0.590 |
principle for heuristic control. PI-FLC demonstrates better performance under nominal operating conditions, while FLC-SMC shows robustness against variations in stator resistance and inertia. The model provides more efficient and smoother operation, with faster settling to steady state. In future, more adaptive techniques can be implemented for better steady-state operation under tighter constraints.
References
1. Ganthia, B.P., Rana, P.K., Pattanaik, S.A.: Space vector pulse width modulation fed direct torque control of induction motor drive using matlab-simulink. In: Proceedings of the 3rd International Conference on Electrical, Electronics, Engineering Trends, Communication, Optimization and Sciences (EEECOS), Tadepalligudem, India, 1–2 June 2016
2. El Ouanjli, N., Derouich, A., El Ghzizal, A., Chebabhi, A., Taoussi, M.: A comparative study between FOC and DTC controls of the doubly fed induction motor (DFIM). In: International Conference on Electrical and Information Technologies. IEEE (2017)
3. Ammarn, A., Bourek, A., Benakcha, A.: Nonlinear SVM-DTC for induction motor drive using input-output feedback linearization and high order sliding mode control. ISA Trans. 67, 428–442 (2017)
4. Wang, F., Zhang, Z., Mei, X., Rodríguez, J., Kennel, R.: Advanced control strategies of induction machine: field oriented control, direct torque control and model predictive control. Energies 11(1), 120 (2018)
5. Ganthia, B., Sahu, S., Biswal, S., Abhisekh, A., Kumar Barik, S.: Genetic algorithm based direct torque control of VSI fed induction motor drive using MATLAB simulation. Int. J. Adv. Trends Comput. Sci. Eng. 8(5), 2359–2369 (2019)
6. Ali, M.M., Xu, W., Elmorshedy, M.F., Liu, Y., Allam, S.M., Dong, M.: Sliding mode speed regulation of linear induction motors based on direct thrust control with space-vector modulation strategy. In: 2019 22nd International Conference on Electrical Machines and Systems (ICEMS), Harbin, China, pp. 1–6 (2019)
7. Jnayah, S., Khedher, A.: Sensorless direct torque control of induction motor using sliding mode flux observer. In: 2019 19th International Conference on Sciences and Techniques of Automatic Control and Computer Engineering (STA), Sousse, Tunisia, pp. 536–541 (2019)
8. Krim, S., Gdaim, S., Mtibaa, A., Faouzi Mimouni, M.: FPGA-based real-time implementation of a direct torque control with second-order sliding mode control and input–output feedback linearisation for an induction motor drive. IET Electric Power Applications 14(3), 480–491 (2020)
9. Ganthia, B.P., Pritam, A., Rout, K., Singhsamant, S., Nayak, J.: Study of AGC in two-area hydro-thermal power system. In: Garg, A., Bhoi, A., Sanjeevikumar, P., Kamani, K. (eds.) Advances in Power Systems and Energy Management. Lecture Notes in Electrical Engineering, vol. 436. Springer, Singapore (2018)
10. Ganthia, B.P., Rana, P.K., Patra, T., Pradhan, R., Sahu, R.: Design and analysis of gravitational search algorithm based TCSC controller in power system. In: Proceedings Materials Today, vol. 5, Issue 1, Part 1, pp. 841–847 (2018). ISSN 2214-7853. https://doi.org/10.1016/j.matpr.2017.11.155
11. Ganthia, B.P., Pradhan, R., Sahu, R., Pati, A.K.: Artificial ant colony optimized direct torque control of mathematically modelled induction motor drive using pi and sliding mode controller. In: Kumar, J., Jena, P. (eds.) Recent Advances in Power Electronics and Drives. Lecture Notes in Electrical Engineering, vol. 707. Springer, Singapore (2021). https://doi.org/10.1007/978-981-15-8586-9_35
12. Pragati, A., Ganthia, B.P., Panigrahi, B.P.: Genetic algorithm optimized direct torque control of mathematically modeled induction motor drive using pi and sliding mode controller. In: Kumar, J., Jena, P. (eds.) Recent Advances in Power Electronics and Drives. Lecture Notes in Electrical Engineering, vol. 707. Springer, Singapore. https://doi.org/10.1007/978-981-15-8586-9_32
13. Prasad, G.B., Krishna, R.: Deregulated power system based study of AGC using PID and fuzzy logic controller. Int. J. Adv. Res. 4, 847–855 (2016). www.journalijar.co
14. Ganthia, B.P., Abhisikta, A., Pradhan, D., Pradhan, A.: A variable structured TCSC controller for power system stability enhancement. In: Proceedings Materials Today, vol. 5, Issue 1, Part 1, pp. 665–672 (2018). ISSN 2214-7853. https://doi.org/10.1016/j.matpr.2017.11.131
15. Ganthia, B.P., Mohanty, S., Rana, P.K., Sahu, P.K.: Compensation of voltage sag using dvr with pi controller. In: International Conference Electrical Electronics and Optimization Techniques (ICEEOT), pp. 2138–2142 (2016)
16. Ganthia, B.P., Barik, S.K.: Steady-state and dynamic comparative analysis of PI and fuzzy logic controller in stator voltage oriented controlled DFIG fed wind energy conversion system. J. Inst. Eng. India Ser. B 101, 273–286 (2020). https://doi.org/10.1007/s40031-020-00455-8
17. Ganthia, B.P., Barik, S.K., Nayak, B.: Shunt connected FACTS devices for LVRT capability enhancement in WECS. Eng. Technol. Appl. Sci. Res. 10(3), 5819–5823
18. Ganthia, B.P.: Application of hybrid facts devices in DFIG based wind energy system for LVRT capability enhancements. J. Mech. Cont. Math. Sci. 15(6), 139–157 (2020). https://doi.org/10.26782/jmcms.2020.06.00012
19. Ganthia, B.P., Pradhan, R., Das, S., Ganthia, S.: Analytical study of MPPT based PV system using fuzzy logic controller. In: 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS), Chennai, India, pp. 3266–3269 (2017)
20. Ganthia, B.P.: Application of hybrid facts devices in DFIG based wind energy system for LVRT capability enhancements. J. Mech. Cont. Math. Sci. 15(6), 245–256 (2020). https://doi.org/10.26782/jmcms.2020.06.00019
Measuring the Performance of a Model Semantic Knowledge-Base for Automation of Commonsense Reasoning
Chandan Hegde and K. Ashwini
Abstract Commonsense reasoning is a subfield of artificial intelligence that deals with the ability of a computer to imitate mundane decision making. Although the field has existed for decades, very little has been contributed to its development. The lack of a well-defined methodology and of computing facilities for implementing commonsense reasoning has always been an obstacle in the development process. Semantic networks can be used to conceptualize and implement a part of commonsense reasoning. This paper presents a study on performance measurement and analysis of one such model semantic network used to build a commonsense knowledge-base. Various categories of performance measures are presented to analyze the practicality of such models for automation of commonsense reasoning. Overall, the analysis is intended to present a practical feasibility study of a model semantic network by considering the characteristics of a commonsense knowledge-base.
Keywords Commonsense reasoning · Artificial intelligence · Knowledge-base · Semantic networks · Inference
C. Hegde (B) Research Scholar, Department of Computer Science and Engineering, Global Academy of Technology (VTU), Bengaluru, India
K. Ashwini Department of Computer Science and Engineering, Global Academy of Technology (VTU), Bengaluru, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_22

1 Introduction
Commonsense Reasoning (CR) is one among many subfields of Artificial Intelligence (AI) that deals with the simulation of human actions and decision making using commonsense knowledge [1]. Since its introduction, the field has barely made progress due to the complexity involved in computational implementation as well
C. Hegde and K. Ashwini
as the methodology. However, there is plenty of scope for improvement now that modern computing is enriched by artificial intelligence and other computing advances. Very little has been contributed to the development of this type of reasoning because of knowledge-base related issues. Despite substantial growth in artificial intelligence and in the understanding of knowledge-base development, the complete automation of commonsense reasoning remains a mere theory for several reasons. One reason is the inability of computers to implement automated reasoning to the full extent. Another is the absence of a knowledge-base dedicated to commonsense knowledge. Humans are gifted with the ability to make assumptions about the matters around them. For instance, a free-falling object or the impact caused by a heavy metal object after a collision is always subject to assumptions. A computing machine cannot derive assumptions in such scenarios. In that sense, computing agents are far from having materialistic knowledge. Among the several challenges involved in automating commonsense reasoning, drawing inferences from a given well-defined knowledge repository stands out. A knowledge-base intended to represent commonsense knowledge differs from a regular corpus in two ways. First, CR does not make use of deductive reasoning, whereas most reasoning methods depend on it. Evidence plays a very small role in commonsense reasoning; CR makes use of observations instead. What makes CR more complex is the fact that the observations are subject to periodic changes. Because of this dynamic characteristic, a CR knowledge-base requires constant updates throughout its existence. Second, to be able to draw inferences a computer must work with critical elements of commonsense knowledge such as time, objects, physics and rules of interaction. The first challenge looks attainable due to the advancement in computational sciences.
However, the second challenge calls for a novel method to interpret the physical aspects in terms of computational linguistics. In the process of conceptualizing commonsense reasoning, the first task will always be to build a knowledge-base that can aid reasoning. Several efforts have been made in the past to support commonsense reasoning, but very few have proposed a dedicated knowledge-base to enhance a system's ability to reason. One earlier effort built such a network through perception [2, 3]. Once the knowledge representation is implemented, the inference engine can be systematized. Among all the available tools for building a knowledge-base, semantic networks stand out for their simplicity and effectiveness. Semantic networks are used in many reasoning models, which demonstrates their efficiency as knowledge repositories [4]. A semantic network that can store knowledge about how objects look and feel can benefit knowledge building for commonsense reasoning. The subsequent sections of this paper discuss these matters in detail and analyze the performance characteristics of semantic networks in conceptualizing commonsense reasoning. Four important measures, namely sparsity, cluster size, power-law degree distribution and connectivity in semantic networks, are discussed in this paper. The following section describes a few properties in that respect.
Measuring the Performance of a Model Semantic …
2 Basic Concepts: Constitution of a Knowledge-Base
A semantic network designed to hold commonsense knowledge must store several types of information. Information about mundane activities, elementary knowledge of objects and the basic rules of interaction between humans and objects are a few examples. To implement complete automation of CR on such repositories, the information is expected to be stored at the object level [5]. Advanced data structures, graphical representations and frames are a few other options for storing knowledge in a manner similar to a semantic network [6]. The following subsections list a few properties of semantic networks with respect to representing commonsense knowledge.
2.1 Atomicity
Representation of data can be brought down to the root level when the information is constituted in object form. Atomicity is the property of data having an indivisible form of representation. Representation of commonsense knowledge can benefit hugely from atomicity. Because CR is inductive in nature, information in atomic form can aid reasoning to its full extent. Further, data derived from object notation tends to have better clarity. Sparsity is the practical measure used to identify atomicity among the nodes of a semantic network.
2.2 Inheritance
Semantic networks implement inheritance between object notations by default because of their taxonomic structure. Inductive reasoning always benefits from a taxonomic representation of information, which makes deriving conclusions much easier. An inherited structure can therefore lead to better conclusions in CR.
2.3 Classification
Objects with similar properties can be grouped together to form a class. Classification of objects reduces the effort required to associate one object with another to derive a conclusion. The complexity of the predicted outcome can also be minimized by classifying information into clusters. Clustering is the practical notion of classification in semantic networks.
2.4 Association
A semantic network can yield better results when the relations among its nodes or objects are well defined. Inferencing benefits largely from the association of nodes with each other. However, the complexity of the network can increase with greater associativity. As mentioned before, the objects in a semantic network intended to represent commonsense knowledge are expected to change their state because of the dynamic nature of the information itself. This frequently changes the associations between the nodes. To show that the proposed semantic network exhibits the above-mentioned properties, a network was built using ConceptNet, one of the existing Python libraries [7]. The details are described in the following section.
3 Performance Measures for a Model Semantic Network
The experimental phase of this study uses a semantic network built with the help of ConceptNet. The details of the network are given in Table 1. The variables used to build the network are explained in the subsequent paragraphs. The four performance measures evaluated on this network are defined in the following subsections. In realistic conditions, a knowledge-base in the form of a semantic network is bound to have nodes added dynamically. The network defined for this study can accommodate such dynamic nodes with minimal effect on its performance.
Table 1 Summary statistics of model semantic network

| Variables | Statistics (types) |
|---|---|
| Number of nodes | 108,256 (words); 48,787 (classes) |
| Average number of edges | 4.0 (words); 1.6 (classes) |
| Average shortest path length | 8.56 |
| Diameter of the network | 24 |
3.1 Sparsity
In this ConceptNet model, the nodes are categorized into words and classes. Sparsity in a semantic network is assessed from the percentage of actual connections between nodes out of the total number of possible connections; the lower this percentage, the sparser the network. Table 1 lists the average number of edges, which is the average number of connections per node. The values of 4.0 and 1.6 for words and classes, respectively, indicate that the network is very sparse.
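The percentage described above can be worked out directly from the figures in Table 1. The sketch below assumes an undirected network in which the "average number of edges" is the average degree of a node; that interpretation and the function name are assumptions made for illustration.

```python
def connection_ratio(num_nodes, avg_degree):
    """Fraction of possible node pairs that are actually connected.

    Assumes an undirected graph, so edges = n * k / 2 and the number of
    possible connections is n * (n - 1) / 2.
    """
    edges = num_nodes * avg_degree / 2
    possible = num_nodes * (num_nodes - 1) / 2
    return edges / possible


# Figures for word nodes taken from Table 1.
word_ratio = connection_ratio(108_256, 4.0)
word_ratio_percent = 100 * word_ratio
```

Under these assumptions the ratio simplifies to k/(n - 1), which for the word nodes is 4.0/108,255, well below one hundredth of a percent. This is consistent with the description of the network as very sparse.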
3.2 Cluster Size
Cluster size indicates the number of nodes associated with each other by any means. It is a measure of the association property among the existing nodes. Cluster size also reflects the number of class members. The number of clusters in the semantic network is directly related to the number of classes in the vicinity.
3.3 Power-Law Degree Distribution
The degree distribution P(k) represents the probability that a randomly chosen word will have k associated neighbors. The degree distribution follows a power law:

$$P(k) \approx k^{-\gamma} \qquad (1)$$

where γ is typically between 2 and 4 [8]. When the degree distribution is inconsistent for a chosen set of nodes, it indicates an improper distribution of connectivity, which results in bigger networks. We expect the network to show a consistent degree distribution for a chosen set of nodes to avoid complex network formation.
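One simple way to check Eq. (1) on a given network is to compute the empirical degree distribution and fit a straight line to it on log-log axes, since log P(k) ≈ -γ log k. The sketch below does this with ordinary least squares. It is a rough illustration only: the function names are hypothetical, and maximum-likelihood estimators are generally considered more reliable than log-log regression for real data.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Empirical P(k): the fraction of nodes that have exactly k neighbors."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: count / total for k, count in counts.items()}


def fit_exponent(distribution):
    """Estimate gamma as minus the least-squares slope of log P(k) vs log k."""
    points = [(math.log(k), math.log(p))
              for k, p in distribution.items() if k > 0 and p > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return -sxy / sxx


# Synthetic check: an exact power law with exponent 3 should be recovered.
synthetic = {k: k ** -3.0 for k in range(1, 50)}
estimated_gamma = fit_exponent(synthetic)
```

On a real network, `degree_distribution` would be fed the list of node degrees (for example, the neighbor counts of the word nodes), and an estimate within the range quoted above would be consistent with Eq. (1).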
3.4 Connectivity
Connectivity defines the ease of reaching and traversing nodes throughout the network. In most cases connectivity is inversely related to sparsity: the denser the network, the better the connectivity. Despite being very sparse, our model network excels in connectivity owing to the presence of classes and their associations. Overall, the connectivity of a knowledge-representing semantic network is a measure of how efficiently the network supports reasoning. The semantic network studied in this paper
bears a statistical feature that is not confined to networks for knowledge representation. It is generic to all semantic networks that support reasoning. Several factors affect the performance of a semantic knowledge representation. The performance deficiency caused by node addition is not covered in the experiment because of the uncertainty involved. A CR model can only be accepted if its truth value is re-evaluated after each inference is drawn [9]. Assumptions, conclusions and even updates to the knowledge-base are subject to an evaluation process. The term uncertainty is used for such derived data that is subject to truthfulness evaluation. When it comes to representing uncertainty, probabilistic models have proven to be effective [10]. One peculiar behavior of CR is that the derived conclusions are also subject to a measure of belief. A well-defined knowledge-base can reduce the amount of uncertainty involved in commonsense derivation.
4 Results and Analysis
The first set of outcomes of the experiment concerns the sparsity of the network. Figure 1 compares the number of connections in the network at different values of n (the number of nodes, or words in this case). The linear growth in the number of connections in the model semantic network shows that sparsity can be maintained even under exponential growth in the number of nodes. This gives us an edge over the problem of dynamic addition of nodes. Whether the network maintains its integrity after nodes are added can be assessed by examining the power-law degree distribution, which is shown in Fig. 2. The plot is given only for the word nodes.
Fig. 1 The linear growth of sparsity
Fig. 2 Degree distribution of a random node
Classes have not been included since they may cause redundant calculations after the word node estimation. The degree distribution for word nodes shows that the probability of a node having a given number of neighbors decreases as the number of nodes increases. Although this may seem unexpected behavior for a network, an increase in the number of word nodes has no direct relation to the number of associations. Existing classes may prevent new nodes from forming new associations, which is a huge boost from the perspective of network performance. One major concern about these results is the average length of the shortest paths between nodes. The model network, consisting of 108,256 word nodes, has an average shortest path length of 8.56. This figure was obtained with the Shortest Path First algorithm [11], which was chosen because the edges carry no negative weights. The average length may seem small at first, but it is expected to increase as the network grows, causing delays in inferencing. Finally, Fig. 3 gives a visualization of the growing semantic network model with the number of nodes set to 150. The distinct shades of the nodes represent the time at which nodes were added. In the visualization, t1 is the time interval during which the initial nodes were added, t2 is an intermediate interval during which the next set of nodes was added, and t3 is the interval that saw the rest of the word node additions. There is more to analyze than the image shows. The addition of nodes comes with the overhead of computing the classes to which they belong and their associations with all related nodes. Surprisingly, there is a simple yet effective solution to the problem of the increasing number of associations caused by adding word nodes. The solution lies not in the design of the semantic network but in an effective inference engine.
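The average shortest path length and the diameter reported in Table 1 can be computed along the following lines. The sketch uses Dijkstra's algorithm, which is what "Shortest Path First" usually refers to and which requires non-negative edge weights. The adjacency-dictionary format, the function names and the four-word example graph are hypothetical and are not the authors' implementation.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's algorithm; graph maps node -> {neighbor: non-negative weight}."""
    distance = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > distance[node]:
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < distance.get(neighbor, float("inf")):
                distance[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return distance


def path_statistics(graph):
    """Average shortest path length over reachable pairs, and the diameter."""
    total = 0.0
    pairs = 0
    diameter = 0.0
    for source in graph:
        for target, d in shortest_paths(graph, source).items():
            if target != source:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return total / pairs, diameter


# Hypothetical chain of four word nodes with unit edge weights.
chain = {
    "cup": {"tea": 1.0},
    "tea": {"cup": 1.0, "drink": 1.0},
    "drink": {"tea": 1.0, "water": 1.0},
    "water": {"drink": 1.0},
}
average_length, diameter = path_statistics(chain)
```

Running one search per node is affordable for a small example like this one; for a network of the size in Table 1, sampling source nodes would be a reasonable way to keep the cost down.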
Fig. 3 Visualization of ConceptNet undirected graph: a total of 150 nodes are represented at various time intervals t1, t2 and t3
5 Revisiting the Problem of Inferencing
The introductory part of this paper mentioned the role of inferencing in the automation of commonsense reasoning. Having looked into the performance aspects of knowledge representation, it is time to revisit inferencing with the findings of the previous section.
5.1 An Inference Engine
Drawing inferences in the automation of CR differs from other inferencing methods in two ways: first, in the need for a well-defined knowledge-base to support the reasoning; second, in the algorithm used to derive decisions from such a knowledge-base. The knowledge-base is usually a mixture of curated structured data and real-time data. Conceptualizing commonsense reasoning with an additional commonsense knowledge-base seems more adequate. A monotonic inference method is not ideal because validating a drawn conclusion against the commonsense knowledge-base may change the original course of inference and call for a secondary inference to substantiate the truth of the earlier conclusion. Reasoning that contains such a substantiating secondary inference is called plausible reasoning [12]. In plausible reasoning the correctness of the outcome is uncertain. Although many inference
methods have been used in the study of CR in the past, the use of the two aforementioned methods seems more suitable because of factors such as the existence of uncertainty. One popular area of study is the field of expert systems. However, not all expert system studies resemble the application of commonsense reasoning. A goal-driven approach, or backward chaining, is not suitable on its own for commonsense reasoning because the conclusions drawn are uncertain. On the other hand, reasoning cannot be completely data-driven either. A formulated hypothesis is required to trigger backward chaining, whereas a detailed dataset can elicit forward chaining [13]. Inference in commonsense reasoning therefore follows forward chaining with the primary knowledge-base, arrives at a certain conclusion, and then re-evaluates its correctness using backward chaining with the secondary commonsense knowledge. What is currently absent is a well-defined association between the objects of the knowledge-base and the secondary commonsense knowledge repository.
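The two-stage idea described above (forward chaining on the primary knowledge-base, then backward chaining against the secondary commonsense knowledge) can be sketched as follows. The rule format, the function names and the toy facts about a dropped glass are all hypothetical; the sketch shows the control flow only and is not the inference engine proposed in the paper.

```python
def forward_chain(facts, rules):
    """Data-driven inference: apply rules until no new conclusion appears.

    Each rule is a pair (premises, conclusion) over plain string symbols.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and set(premises) <= derived:
                derived.add(conclusion)
                changed = True
    return derived


def backward_chain(goal, facts, rules, visited=frozenset()):
    """Goal-driven check: can the goal be supported by the given facts and rules?"""
    if goal in facts:
        return True
    if goal in visited:
        return False  # avoid looping on circular rules
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, visited | {goal}) for p in premises
        ):
            return True
    return False


# Hypothetical primary knowledge-base (observations and domain rules).
primary_facts = {"glass", "dropped"}
primary_rules = [
    (("glass", "dropped"), "falls"),
    (("falls",), "breaks"),
    (("falls",), "bounces"),
]

# Hypothetical secondary commonsense knowledge used for validation.
commonsense_facts = {"gravity", "glass_is_fragile"}
commonsense_rules = [
    (("gravity",), "falls"),
    (("falls", "glass_is_fragile"), "breaks"),
]

candidates = forward_chain(primary_facts, primary_rules) - primary_facts
plausible = {c for c in candidates
             if backward_chain(c, commonsense_facts, commonsense_rules)}
```

In this toy example forward chaining proposes three conclusions, and only those that the commonsense rules can support are retained as plausible. The result depends entirely on how symbols in the two repositories are matched, which is the association the text identifies as missing.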
5.2 Systematizing Commonsense Reasoning
Modern-day computers are equipped with massive computing infrastructures. However, there is no state-of-the-art framework that can exhibit a good simulation of commonsense reasoning. The challenges in systematizing the automation of commonsense reasoning are as follows.
Problem Definition The definition of commonsense reasoning itself is abstract. Implementing the automation of commonsense reasoning requires systematizing the knowledge of physical objects, their material properties and the basic rules of physical interaction, yet the definition is far from including these. From an implementation perspective, these properties are not readily available.
Plausible Reasoning According to Ernest Davis, one of the leading active researchers in the field of commonsense reasoning, the conclusions drawn in CR are always plausible [5]. It is almost impossible to derive an acceptable conclusion even after a substantial amount of input has been provided. In addition, very few successful reasoning models exist that are similar in operation.
Size of Data A vast array of data is required for the inference engine to apply reasoning: the larger the data, the better the results. This is because information needs to be represented in a detailed manner, which requires large repositories.
Knowledge Representation The knowledge representation part of CR has no specific format of structured data. Moreover, abstraction is difficult to attain when a group of objects is described in detail. Data is more likely to be unstructured in CR because of the size and dimensions of the data involved. Recently, IBM Watson, a machine capable of working on unstructured data and giving analytic solutions, proved that inference can be improved [14].
6 Conclusion
This work has found that measuring and analyzing sparsity, degree distribution, cluster size and connectivity for a model semantic network yields a few important observations. First, exponential growth in the number of word nodes has a very small effect on the sparsity of the network. A sparse semantic network lends itself to efficient access and association, reducing the time complexity of reasoning. Second, the power-law degree distribution decreases as the number of connections between the word nodes increases. A low degree distribution value for a node means a low probability that a chosen word node will have a large number of neighbours. Every time a new node is inserted, the number of newly created associations is kept to a minimum. This again reduces the amount of computation when the network is scaled. Third, the term cluster in our model network is directly associated with a class node and its members. The size of such a cluster is the count of members of the class. The performance of an automated commonsense reasoning model largely depends on a knowledge-base comprising materialistic knowledge. The computing community has shown that knowledge corpora can be bound by performance because of their immense size. The above observations allow us to reiterate that semantic networks can adequately host knowledge corpora with minimal performance issues. An effective inference technique can catalyze the process of reasoning. By considering all the characteristics of an effective inference engine, an effort can be made to make inferencing practical with a semantic network as the knowledge-base. The future of automation of commonsense reasoning depends on the establishment of a very strong knowledge-base. A state-of-the-art commonsense knowledge-base can benefit various applications involving computational intelligence. It can be used to refine the computational ability of a machine in natural language processing, deep learning, data analytics and many other subfields of computer science. It can also be used as a tool for building complex semantic networks for storing rich information.
References
1. Davis, E., Marcus, G.: Commonsense reasoning and commonsense knowledge in artificial intelligence. Commun. ACM 58(9), 92–10 (2015)
2. De Smedt, T., De Bleser, F., Van Asch, V., Nijs, L., Daelemans, W.: Gravital: natural language processing for computer graphics. In: Creativity and the Agile Mind: A Multi-Disciplinary Study of a Multi-Faceted Phenomenon, pp. 81–98. Mouton, Berlin (2013)
3. De Smedt, T.: Modeling Creativity: Case Studies in Python, pp. 78–96. University Press Antwerp (2013)
4. Lim, S., Tucker, C.S., Jablokow, K., Pursel, B.: A semantic network model for measuring engagement and performance in online learning platforms. Comput. Appl. Eng. Educ. 26(5), 1481–1492 (2018)
5. Davis, E.: Representation of Commonsense Knowledge. Morgan Kaufmann (2014)
6. Liu, H., Singh, P.: ConceptNet-a practical commonsense reasoning tool-kit. BT Technol. J. 22(4), 211–226 (2004)
Measuring the Performance of a Model Semantic …
7. Chandan, H., Ashwini, K.: Automation of commonsense reasoning: a study on feasible knowledge representation techniques. Int. J. Adv. Res. Comput. Sci. 8(9), 601–604 (2017)
8. Akama, H., Mikaye, M., Jung, J., Murphy, B.: Using graph components derived from an associative concept dictionary to predict fMRI neural activation patterns that represent the meaning of nouns. PLoS ONE 10(4) (2015)
9. Halpern, J.Y.: Reasoning About Uncertainty. MIT Press (2017)
10. Shachter, R.D., Kanal, L.N., Henrion, M., Lemmer, J.F.: Uncertainty in Artificial Intelligence, vol. 5(10). Elsevier (2017)
11. Chen, B.Y., Chen, X.W., Chen, H.P., Lam, W.H.: Efficient algorithm for finding k shortest paths based on re-optimization technique. Transp. Res. Part E Logist. Transp. Rev.: 101819 (2019)
12. Allen, R.J.: The nature of juridical proof: probability as a tool in plausible reasoning. Int. J. Evid. Proof 21(1–2), 133–142 (2017)
13. Al-Ajlan, A.: The comparison between forward and backward chaining. Int. J. Mach. Learn. Comput. 5(2), 106–113 (2015)
14. Chen, Y., Elenee-Argentinis, J.D., Weber, G.: IBM Watson: how cognitive computing can be applied to big data challenges in life sciences research. Clin. Therapeut. 38(4), 688–701 (2016)
COVID-19 Detection and Prediction Using Chest X-Ray Images Shreyas Mishra
Abstract Coronavirus Disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Widespread testing is hampered by a lack of test kits and by their questionable accuracy. Applying deep learning algorithms to chest x-ray images can be a faster and more accurate method of testing. With conventional testing methods, people wait a few hours for their results, and sometimes longer depending on the backlog and location of the testing site. This paper introduces a novel deep learning algorithm which can predict the presence of the virus with a very high degree of accuracy. The proposed model achieves an accuracy of about 98% on the test set. This enables quick detection of the infection and saves a lot of time between testing and diagnosis. It can be highly instrumental in saving the life of a person who displays no signs of the virus, as well as in preventing further spread of the virus. Keywords COVID-19 · Deep learning · Chest X-ray · Convolution neural network
1 Introduction The novel coronavirus disease, first seen in Wuhan, China in December 2019, has spread rapidly to other parts of the world, and by mid-August 2020, about 21 million people had been infected and about 760,000 people had died. As of now, no vaccine or cure exists, and the only ways to prevent the disease are reduced social contact (social distancing), wearing masks, etc. This creates a need for early diagnosis at the first sign of any symptoms, so that the individual and the people he or she may have come in contact with can be quarantined. Doctors are now also cautioning about asymptomatic patients, who are infected and show no symptoms but continue to transmit the virus. This results in increasing death rates because, by the time the infection is detected, its severity is already high.
S. Mishra (B) National Institute of Technology, Rourkela, Odisha 769008, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_23
Fig. 1 A sample of COVID-19 positive images
Conventional testing methods have been developed by several firms and are not always accurate. This creates a need for alternative testing methods in which there is no risk of infection and the accuracy is high. This is where deep learning comes into play. Chest x-ray, or chest radiography, is very commonly ordered by clinicians. It is normally used for cough, chest pain, shortness of breath, pneumonia and similar conditions. The entire chest x-ray examination can be completed within 15–20 min. The x-ray images can also be used for quick detection of COVID-19 in the patient. In this way, doctors need not examine the x-ray image themselves, as the image can be scanned and passed to software that runs our trained model. There are not many public datasets on COVID-19, and the few that are available do not contain many COVID-19 positive samples. In this paper, we have used the public dataset available on GitHub [1]. We selected the posteroanterior (PA) chest x-ray view images from this dataset, which resulted in 303 images of COVID-19 positive samples. Figure 1 shows two sample images from these 303 images. COVID-19 negative images were obtained from the normal folder of Kaggle's pneumonia dataset [2]. Figure 2 shows two sample COVID-19 negative images. An equal number of positive and negative samples was used to avoid class imbalance. The entire project was done on Google Colab, which supports downloading datasets from sources on the Internet and provides free GPU use.
2 Literature Study It was found in previous studies that patients present abnormalities in chest radiography images that are characteristic of those infected with COVID-19 [3–6]. There are many papers which present encouraging results in terms of accuracy when using
Fig. 2 A sample of COVID-19 negative images
chest x-ray images [7–10]. Few COVID-19 chest x-ray datasets are publicly available, although some can be found on platforms such as Kaggle and GitHub. Minaee et al. [11] use pre-trained deep neural networks on the COVID-Xray-5k dataset. They used transfer learning with popular architectures such as ResNet, DenseNet and SqueezeNet, fine-tuned on the training data. They fine-tuned the last layer of the CNN and used the pre-trained weights as a feature extractor. They started from the weights of models trained on the ImageNet [12] dataset and trained their Deep-COVID model on top of those weights. They achieved a sensitivity of around 97% (5% tolerance) and a specificity of around 90% using the Adam optimizer (a stochastic gradient descent variant) with a cross-entropy loss function, without any regularization. They obtained the best predictions with the SqueezeNet model. Wang and Wong [13] compared previously defined CNN architectures such as VGG16 and ResNet-50 with their proposed COVID-Net, which gave them an accuracy of 93.3%. They made use of a novel lightweight residual design pattern consisting of multiple stages. First, 1 × 1 convolutions project features to a lower dimension and then expand them to a higher dimension. Then, 3 × 3 convolutions are used to reduce computational complexity. After this, the previous stages are repeated to obtain the final features. They also drew on a study of machine-centric strategies for quantifying the performance of explainability algorithms [14]. It is important to note that all of these papers used small datasets comprising a few hundred COVID positive samples, similar to our proposed methodology.
3 Proposed Methodology In this paper, we introduce a novel deep learning architecture, titled Deep COVID-Net, which can be used for swift prediction of COVID-19 in a patient. Before defining the model, we have to gather the data, pre-process it and collate it into specific folders. We make use of the Keras library, an open-source neural network library written in Python. It needs to run on a backend such as Theano or TensorFlow; in this project, we run it on TensorFlow. First, the dataset of COVID positive samples is imported from the GitHub repository [1] using the clone command. Using the metadata file in the cloned repo, the COVID samples of the PA type are copied into a 'COVID' directory. The dataset of COVID negative samples is obtained from the Kaggle pneumonia chest x-ray dataset [2]. The number of positive samples was counted, and an equal number of negative samples was copied into the 'NORMAL' directory. All the images were split randomly into three sets: training, validation and testing. The training set contained 436 images, the validation set 122 images and the testing set 48 images in total. A data augmentation technique was used to apply a series of random transformations, such as shearing and rotation, to each batch of data; the transformed batch then replaces the original one. This applies random jitters and perturbations to the data while ensuring the class labels are unchanged. This is made possible by the ImageDataGenerator class of the Keras pre-processing module. The images are in the form of 3-dimensional NumPy arrays, which are rescaled by dividing each pixel value by 255 so that each value lies between 0 and 1. The generator has a class_indices attribute, which gives the index assigned to each class. This keeps the numbering clear so that there is no confusion when making the sensitivity and specificity calculations later. Finally, each image is resized to 224 × 224.
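The random split and pixel rescaling described above can be sketched as follows. This is a minimal illustration in plain Python, not the author's code: the file names are hypothetical placeholders, the helper names `split_dataset` and `rescale` are assumptions, and the seed is arbitrary. Only the set sizes (436, 122, 48) and the division by 255 come from the text.

```python
import random


def split_dataset(paths, sizes=(436, 122, 48), seed=0):
    """Shuffle the image paths and cut them into train/validation/test sets."""
    if sum(sizes) != len(paths):
        raise ValueError("split sizes must add up to the number of images")
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)  # seed is arbitrary (assumption)
    n_train, n_val, _ = sizes
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def rescale(pixels):
    """Map 8-bit pixel intensities from [0, 255] to [0, 1]."""
    return [p / 255.0 for p in pixels]


# Hypothetical file names: 303 positive and 303 negative images.
paths = [f"COVID/img_{i:03d}.png" for i in range(303)]
paths += [f"NORMAL/img_{i:03d}.png" for i in range(303)]

train, val, test = split_dataset(paths)
print(len(train), len(val), len(test))
print(rescale([0, 128, 255]))
```

In a Keras pipeline the same rescaling would normally be delegated to the data generator rather than done by hand; the sketch only makes the arithmetic explicit.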
3.1 Proposed Model The model introduced in this paper is less complex than the pre-trained networks used in the papers discussed in the literature study, yet it produces a higher accuracy score. Figure 3 shows the architecture of our Deep COVID-Net model. The Deep COVID-Net model proposed in this paper is a sequential model which takes 224 × 224 images as input to a convolution layer with 32 filters and a 3 × 3 kernel size. It is connected to another conv layer with 64 filters, which is connected to a max pooling layer with a 2 × 2 pool size. Three more blocks, each consisting of a conv layer followed by a max pooling layer, are connected in succession. The number of filters in each conv
Fig. 3 Architecture of the deep COVID-Net model
layer are 64, 128 and 128, in that order. The pool size and kernel size are the same for all layers. The feature maps are then flattened and connected to a fully connected (dense) layer with 64 neurons. Dropout regularization is applied to randomly drop neurons and avoid overfitting on the training data. The next dense layer is the output layer of the model. The conv layers and the hidden dense layer use the ReLU activation function.
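To make the layer description concrete, the sketch below traces the feature-map shape and the number of trainable parameters through the stack. It is an illustration under stated assumptions rather than the published implementation: it assumes three-channel input, stride-1 convolutions without padding, and a single output unit, none of which are stated in the text. Dropout is omitted because it changes neither shapes nor parameter counts. The function names are hypothetical.

```python
def conv2d(shape, filters, kernel=3):
    """Shape and parameter count after a stride-1, unpadded convolution."""
    h, w, channels = shape
    out_shape = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return out_shape, params


def max_pool(shape, pool=2):
    """Shape after non-overlapping max pooling (no trainable parameters)."""
    h, w, channels = shape
    return (h // pool, w // pool, channels)


def trace_deep_covid_net(input_shape=(224, 224, 3)):
    # Filters per block as described in the text: (32, 64), then 64, 128, 128,
    # each block ending in a 2 x 2 max pooling layer.
    conv_blocks = [[32, 64], [64], [128], [128]]
    rows, shape, total = [], input_shape, 0
    for block in conv_blocks:
        for filters in block:
            shape, params = conv2d(shape, filters)
            rows.append((f"conv_{filters}", shape, params))
            total += params
        shape = max_pool(shape)
        rows.append(("max_pool", shape, 0))
    flat = shape[0] * shape[1] * shape[2]
    rows.append(("flatten", (flat,), 0))
    # Hidden dense layer with 64 neurons, then one output unit (assumption).
    for name, units_in, units_out in [("dense_64", flat, 64), ("output", 64, 1)]:
        params = units_in * units_out + units_out
        rows.append((name, (units_out,), params))
        total += params
    return rows, total


rows, total = trace_deep_covid_net()
for name, shape, params in rows:
    print(f"{name:10s} {str(shape):18s} {params}")
print("total trainable parameters:", total)
```

Under these assumptions most of the parameters sit in the first dense layer, which is one reason dropout is a natural choice at that point in the network.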
3.2 Training the Model After the necessary libraries were imported and the model was built, it was compiled. The dropout layers which can be seen in the architecture are added to prevent the model from overfitting the training data. The dense layer is the fully connected layer at the end of the neural network. The sigmoid activation function is used at the output, which returns the class probability for a test image. The binary cross-entropy loss is used with the Adam optimizer, and the metric calculated is accuracy. The model was then fit to the images with the model.fit_generator function, with 8 steps per epoch, 2 validation steps and 10 epochs. Increasing the number of epochs further would lead to decreasing accuracy.
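The two quantities named above, the sigmoid output and the binary cross-entropy loss, can be written out directly. The following is a small plain-Python sketch for illustration; the clipping constant `eps` is an assumption added for numerical safety, and the example labels and logits are made up. The step and epoch counts are the ones given in the text.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0); eps is an assumption
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Made-up labels and raw scores, for illustration only.
labels = [1, 0, 1, 0]
logits = [2.0, -1.5, 0.3, -0.2]
probs = [sigmoid(z) for z in logits]
print(binary_cross_entropy(labels, probs))

# Training schedule stated in the text: 8 steps per epoch for 10 epochs.
steps_per_epoch, epochs = 8, 10
total_updates = steps_per_epoch * epochs
print("weight updates over training:", total_updates)
```

A confident wrong prediction is penalised much more heavily than an uncertain one, which is why this loss pairs naturally with a sigmoid output for two-class problems.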
4 Results Table 1 compares our model with the two models mentioned in the literature study, Deep-COVID [11] and COVID-Net [13]. As can be seen in the classification report, we have an accuracy of about 98% on an unbiased test set, which is higher than the accuracy of the other models. The classification report is shown in Table 2. It summarizes the precision, recall and F1-score values on the test set. These are important performance metrics to take into account apart from accuracy. Precision is the ratio of correctly predicted positive observations to the total predicted positive observations. Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class.

Table 1 Comparison of the 3 methodologies
Model              Accuracy
Deep-COVID [11]    0.96
COVID-Net [13]     0.933
Deep COVID-Net     0.98
Table 2 Classification report
Parameter      Precision  Recall  F1-score  Support
0              0.96       1.00    0.98      26
1              1.00       0.95    0.98      22
Accuracy                          0.98      48
Macro avg      0.98       0.98    0.98      48
Weighted avg   0.98       0.98    0.98      48
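The per-class values in Table 2 can be reproduced from a 2 × 2 matrix of counts. The sketch below is illustrative only: the matrix is an assumption inferred from the supports (26 and 22) and the rounded scores in Table 2, with rows as true classes and columns as predicted classes, and the function name is hypothetical.

```python
def per_class_report(confusion):
    """Precision, recall, F1 and support for each class.

    confusion[t][p] is the number of samples of true class t predicted as p.
    """
    n = len(confusion)
    report = {}
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[t][k] for t in range(n)) - tp  # predicted k, truly other
        fn = sum(confusion[k]) - tp                       # truly k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[k] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": tp + fn}
    return report


# Assumed counts consistent with Table 2 (rows: true class, columns: predicted).
confusion = [[26, 0],
             [1, 21]]
report = per_class_report(confusion)
accuracy = sum(confusion[k][k] for k in range(2)) / sum(map(sum, confusion))

for k, row in report.items():
    print(k, {name: round(value, 2) for name, value in row.items()})
print("accuracy", round(accuracy, 2))
```

With a test set of only 48 images, a single misclassified sample moves each score by about two percentage points, so the values should be read with that granularity in mind.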
Fig. 4 (Left) accuracy plot and (Right) loss plot
The reason behind the higher accuracy of Deep COVID-Net, as shown in Tables 1 and 2, in comparison to other models lies in the differences in architecture and other hyperparameters. This paper avoids using pre-trained weights on CNN models like ResNet and DenseNet, which are very deep and introduce the problem of overfitting; moreover, in this case the source and target domains are not similar. This paper also uses a completely balanced dataset, which avoids biasing the model towards one class. Figure 4 shows the accuracy and loss plots for both the training and validation sets. It can be seen from Fig. 4 that the model started with very low accuracy; with each epoch, i.e. each pass over the entire training set, the weights are updated further and the model becomes more accurate. As the accuracy increases, the loss decreases.
4.1 Confusion Matrix and ROC Curve A confusion matrix is usually regarded as a good way to describe the performance of a model on a set of data whose true values are known. The confusion matrix obtained on the test data gives results in line with the model accuracy. It is worth reporting because accuracy, although a very intuitive performance measure, is informative only for symmetric datasets, where the values of FP and FN are almost identical. Figure 5 shows the confusion matrix on the unbiased test set. Figure 6 shows the receiver operating characteristic (ROC) curve. As can be seen in the confusion matrix, the number of true positive predictions is 26, and the number of true negative predictions is 21. The number of false positive predictions is 0, while the number of false negative predictions is 1. This supports the reliability of the model, since this confusion matrix reflects an unbiased dataset. The numbers of true positives and true negatives in a prediction model should always be high, which means that almost all the predictions are correct. The ROC curve shows the diagnostic ability of the model. It plots the true positive rate against the false positive rate at different threshold settings of the classification
Fig. 5 Confusion matrix for COVID-19 detection
Fig. 6 ROC curve
process. The area under the ROC curve is called the AUC score. The obtained AUC score is 0.9772. The AUC is a measure of the separability of the classes. As the very high score shows, there is a clear distinction between the two classes: the model predicts COVID-19 negative cases as negative and COVID-19 positive cases as positive in almost all cases. Mathematically, we can calculate the sensitivity and specificity of the model using the formulae: Sensitivity = (True Positive)/(True Positive + False Negative). Specificity = (True Negative)/(True Negative + False Positive). According to the above formulae, the sensitivity obtained is 26/27 ≈ 0.963 and the specificity obtained is 21/21 = 1.
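The formulae above, together with the AUC, can be illustrated with a few lines of plain Python. The counts passed to `sensitivity` and `specificity` are the ones quoted in the text; the scores used for the AUC are made-up toy values, since the model's per-image probabilities are not given. The AUC is computed here through its rank interpretation, the fraction of positive/negative pairs in which the positive sample receives the higher score, which equals the area under the ROC curve.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives that are detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives that are cleared."""
    return tn / (tn + fp)


def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Counts quoted in the text: TP = 26, TN = 21, FP = 0, FN = 1.
print("sensitivity:", sensitivity(tp=26, fn=1))
print("specificity:", specificity(tn=21, fp=0))

# Toy scores for illustration only (not the model's outputs).
toy_pos = [0.9, 0.8, 0.4]
toy_neg = [0.7, 0.3, 0.2]
print("toy AUC:", auc(toy_pos, toy_neg))
```

An AUC of 1 would mean that every positive image is scored above every negative one, regardless of where the decision threshold is placed.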
5 Conclusion In this paper, we have introduced the Deep COVID-Net model, a novel deep learning CNN architecture which can predict whether a patient is COVID-19 positive or not based on open-source, publicly available chest x-ray images. The dataset contained 303 COVID positive images, and the same number of COVID negative images was used from the Kaggle dataset. We achieved an accuracy of 98%, which is higher than that of previously published research works. The model was also used for batch prediction of chest x-ray images obtained from another source, with similar levels of accuracy. We can improve on this method by adding more training images. In the future, adding more information to each chest x-ray image, such as patient medical data, can also help to increase the reliability of this model. This method uses only chest x-ray images, but more models are currently being researched extensively which can predict the possibility of COVID infection in a person based on previous medical history combined with the occurrence of symptoms. This can be a widely sought-after research opportunity in the near future, along with predicting the severity of infection and the duration of medical care needed. Another area of research can be AI-enabled solutions, where a thermal sensor detects the body temperature of a patient and the system, using the patient's medical history along with chest x-ray images, computes the probability of infection in the patient as well as the occurrence of disease. This can take place entirely by machine, without a human operator.
References
1. ieee8023/covid-chestxray-dataset. https://github.com/ieee8023/covid-chestxray-dataset. Last accessed 14 July 2020
2. Chest X-Ray Images (Pneumonia) | Kaggle. https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia. Last accessed 14 July 2020
3. Ng, M.Y., Lee, E.Y., Yang, J., Yang, F., Li, X., Wang, H., Lui, M.M.S., Lo, C.S.Y., Leung, B., Khong, P.L., Hui, C.K.M.: Imaging profile of the COVID-19 infection: radiologic findings and literature review. Radiology: Cardiothoracic Imaging 2(1), e200034 (2020)
4. Huang, C., Wang, Y., Li, X., Ren, L., Zhao, J., Hu, Y., Zhang, L., Fan, G., Xu, J., Gu, X., Cheng, Z.: Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet 395(10223), 497–506 (2020)
5. Guan, W.J., Ni, Z.Y., Hu, Y., Liang, W.H., Ou, C.Q., He, J.X., Liu, L., Shan, H., Lei, C.L., Hui, D.S., Du, B.: Clinical characteristics of coronavirus disease 2019 in China. N. Engl. J. Med. 382(18), 1708–1720 (2020)
6. Jacobi, A., Chung, M., Bernheim, A., Eber, C.: Portable chest X-ray in coronavirus disease-19 (COVID-19): A pictorial review. Clin. Imaging 64, 35–42 (2020)
7. Gozes, O., Frid-Adar, M., Greenspan, H., Browning, P.D., Zhang, H., Ji, W., Bernheim, A., Siegel, E.: Rapid AI development cycle for the coronavirus (COVID-19) pandemic: Initial results for automated detection and patient monitoring using deep learning CT image analysis. arXiv preprint arXiv:2003.05037 (2020)
8. Butt, C., Gill, J., Chun, D., Babu, B.A.: Deep learning system to screen coronavirus disease 2019 pneumonia. Applied Intelligence 22, 1–7 (2020)
9. Li, L., Qin, L., Xu, Z., Yin, Y., Wang, X., Kong, B., Bai, J., Lu, Y., Fang, Z., Song, Q., Cao, K.: Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT. Radiology (2020). https://doi.org/10.1148/radiol.2020200905
10. Shi, F., Xia, L., Shan, F., Wu, D., Wei, Y., Yuan, H., Jiang, H., Gao, Y., Sui, H., Shen, D.: Large-scale screening of COVID-19 from community acquired pneumonia using infection size-aware classification. arXiv preprint arXiv:2003.09860 (2020)
11. Minaee, S., Kafieh, R., Sonka, M., Yazdani, S., Soufi, G.J.: Deep-COVID: Predicting COVID-19 from chest x-ray images using deep transfer learning. arXiv preprint arXiv:2004.09363 (2020)
12. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: A large-scale hierarchical image database. In: IEEE conference on computer vision and pattern recognition, pp. 248–255 (2009)
13. Wang, L., Wong, A.: COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest X-Ray Images. arXiv preprint arXiv:2003.09871 (2020)
14. Qiu Lin, Z., Javad Shafiee, M., Bochkarev, S., St Jules, M., Wang, X.Y., Wong, A.: Do explanations reflect decisions? A machine-centric strategy to quantify the performance of explainability algorithms. arXiv, pp. arXiv-1910 (2019)
Automated Precision Irrigation System Using Machine Learning and IoT Ashutosh Bhoi, Rajendra Prasad Nayak, Sourav Kumar Bhoi, and Srinivas Sethi
Abstract Water is considered to be the most precious natural resource for agriculture in the twenty-first century. To avoid water scarcity, we must use it precisely. For this task, a smart irrigation recommendation system is the need of the hour. In this era of automation, technologies like machine learning and IoT can be used to build a smart irrigation recommendation system for efficient water usage with nominal human intervention. Here, we propose an IoT-based irrigation framework with machine intelligence. The intelligence is incorporated through various machine learning based regression and classification models. To make our proposed system more robust, we have integrated forecasted weather data using available APIs. We use our own collected sensor data along with the NIT Raipur dataset to validate the effectiveness of this system. From all the experimentation, it is found that the proposed system trained with support vector regression (SVR) along with the KNN classifier is very effective for this challenging task. Keywords Smart irrigation · IoT · Machine intelligence · SVR · KNN classifier · Weather forecasting
Supported by organization x. A. Bhoi (B) · R. P. Nayak Department of Computer Science & Engineering, GCEK, Bhawanipatna, India e-mail: [email protected] R. P. Nayak e-mail: [email protected] S. K. Bhoi Department of Computer Science & Engineering, PMEC, Berhampur, India e-mail: [email protected] S. Sethi Department of Computer Science & Applications, IGIT, Sarang, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_24
1 Introduction Due to environmental degradation, natural resources like water are being depleted throughout the world. In a country like India, irrigation consumes the lion's share of water. Crop irrigation is the most prominent component in regulating plant yield. It depends on numerous climatic parameters, such as air temperature, humidity, soil wetness and soil temperature, which affect the growth of plants. Traditional methods of irrigation consume a huge amount of water, much of it wasted. Improving farmers' productivity relies largely on personal supervision, which is a tedious task. It also requires domain expertise, which is very difficult to obtain these days. The state-of-the-art irrigation recommendation systems are not fruitful for many reasons [4]. Here, our goal is to develop an intelligent irrigation recommendation system for optimal water usage with minimal intervention by the farmer. The system must be robust and flexible enough to adapt to regional climatic circumstances, based on which it can precisely recommend whether to irrigate or not. Simple but efficient techniques like IoT and machine learning may offer an effective solution for this challenging task. The literature review is presented in Sect. 2. The proposed methodology is detailed in Sect. 3. The results and discussion are described in Sect. 4. Finally, Sect. 5 concludes our work along with the outcomes and future scope for improvement.
2 Literature Review In a country like India, the majority of the population depends upon agriculture. The agriculture sector consumes a lot of water, much of which is wasted. To reduce the wastage of water and human effort, we require a smart and effective irrigation system that maintains maximal productivity. Many researchers have attempted to find a feasible solution for this task. As communication is the backbone of any smart irrigation system, wireless sensor network (WSN) based systems are suggested by many researchers [11, 13]. The use of GPRS communication as a gateway between the WSN and the internet is proposed for smooth monitoring of the system [5]. In [1], the authors suggested a WSN-enabled soil moisture controller, which regulates the water supply by comparing the measured soil moisture value with a predefined threshold value. All field parameters are validated at regular intervals to determine the amount of water required in the soil for an efficient irrigation process. A real-time system named Smart Photovoltaic Irrigation Manager (SPIM) is proposed to synchronize the available photovoltaic power with the energy needed to pump the irrigation requirements of various parts of irrigation networks [3]. In [6], the authors developed an automated irrigation system with the help of a humidity monitoring system. A multi-level farming framework is suggested for cities with little cultivation space [12]. Local nodes are deployed at each level
with their individual local decision-making capabilities, and their decisions are propagated to a centralized node connected to the cloud server. A model based on Internet of Things (IoT) and embedded technologies is designed for automatic irrigation of plants [10]. Various soil and environmental parameters obtained from sensors are incorporated in the proposed model for efficient water usage. GSM, Bluetooth and cloud technologies are also incorporated by the authors in their model. In recent times, machine learning and the cloud have been considered effective tools for improving these state-of-the-art systems [9]. A cloud and IoT based framework is implemented for smart agricultural services [7]. It reduces the farmer's intervention in irrigation. An alert system for automated irrigation is developed with the help of multihop wireless LANs [8]. A servo motor, NodeMCU boards and various sensors are connected through a robust wireless network to upload the data to the cloud. The Message Queuing Telemetry Transport (MQTT) protocol is enabled on a hybrid cloud and Internet of Things based model for smart irrigation in agriculture [15]. In [2], the authors analysed various sensor, soil and environmental parameters to support the automated irrigation task. In [14], the authors proposed integrating a distributed WSN with machine learning as a better solution to over-irrigation and soil erosion issues. However, these works are limited and may be improved further for significant gains in the precision irrigation task.
3 Proposed Methodology The proposed model for a feasible solution to this demanding and challenging task is described in this section. The framework of our suggested system is given in Fig. 1. The components and their working are described as follows. The system first gathers field and local environmental information using various sensors. Sensors for soil moisture (EC-1258), soil temperature (DS18B20), and air temperature and humidity (DHT11) are used to collect the relevant parameters. These sensors were chosen based on the advice of local agriculture experts and the available literature, and the specific sensor models were selected based on the literature and their reviews on the Web. The sensor readings are taken twice a day and forwarded to cloud storage with the help of a micro-controller device. The average of the two readings is calculated and stored as the final sample for that day. To rule out interdependency among the parameters used, we calculated the simple but effective Pearson correlation between them and found that no strong correlation exists (no coefficient exceeds 0.75). Here, an Arduino is used as the micro-controller due to its wide range of capabilities and energy efficiency. An Android application is also designed to interact with the Arduino for all activities. It takes input from the user (farmer) regarding authentication and crop-related details, and displays all useful information to the user.
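The correlation check described above can be sketched in a few lines. This is an illustrative example, not the authors' code: the sensor readings are made-up values, and the helper names `pearson` and `strongly_correlated_pairs` are assumptions. Only the use of the Pearson coefficient and the 0.75 threshold come from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def strongly_correlated_pairs(columns, threshold=0.75):
    """Return the parameter pairs whose |r| exceeds the threshold."""
    names = list(columns)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) > threshold:
                pairs.append((first, second, r))
    return pairs


# Made-up daily averages, for illustration only.
readings = {
    "soil_moisture": [31.0, 28.5, 35.2, 30.1, 27.4, 33.8],
    "soil_temperature": [24.1, 26.0, 23.5, 24.8, 25.2, 25.9],
    "air_temperature": [29.5, 30.2, 31.0, 28.4, 32.1, 29.0],
    "humidity": [61.0, 58.0, 66.0, 70.0, 55.0, 63.0],
}
for first, second, r in strongly_correlated_pairs(readings):
    print(f"{first} vs {second}: r = {r:.2f}")
```

Any pair reported by such a check would be a candidate for dropping one of the two parameters before training.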
Fig. 1 Solution framework
For any precision irrigation system, forecasted weather data are essential. Here, we integrate zonal forecasted weather data to make our system robust and reliable. Weather APIs are used to collect this information from the India Meteorological Department (IMD) site. The parameters considered from the forecasted weather data are atmospheric pressure, precipitation, solar radiation and wind speed. All the stored data are fed to a machine learning based analyzer to generate the irrigation decision. Based on the decision, the handler sends irrigation recommendations to the user, who then operates the pump through his or her mobile application. In this way, our proposed system reduces the wastage of water and farmer intervention.

Machine Learning Analyzer The machine learning analyzer is the core component of this system. We apply the support vector regression (SVR) algorithm to the stored field and environmental data to predict the future values of these parameters. SVR can capture non-linear relationships and is reasonably robust to outliers. The forecasted weather data, merged with the predicted soil and local environmental data, are fed to the classification model. This model then categorizes each merged data sample according to whether irrigation is required or not in that time interval. In this way, machine learning assists the farmer with near-accurate suggestions for irrigation in the crop field. The accuracy may improve further with experience, as the ML models are trained with more data samples and feedback from the user.
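The classification stage of the analyzer can be illustrated with a small k-nearest-neighbours sketch. Everything specific here is an assumption made for illustration: the two features (a predicted soil moisture value standing in for the regression output, and a forecast precipitation value), the training samples, the labels and the choice of k = 3. The regression stage is not reproduced.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical merged samples: (predicted soil moisture %, forecast rain in mm).
train_x = [(12, 0), (15, 1), (18, 0), (40, 5), (45, 12), (38, 20)]
train_y = ["irrigate", "irrigate", "irrigate", "skip", "skip", "skip"]

print(knn_predict(train_x, train_y, (14, 0)))   # dry soil, no rain expected
print(knn_predict(train_x, train_y, (42, 10)))  # wet soil, rain expected
```

Because the vote is distance based, features on very different scales would normally be standardized first; that step is left out here to keep the sketch short.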
4 Results and Discussion
Our own collected sensor data, along with the NIT Raipur crop dataset, are used to validate the effectiveness of the proposed automated irrigation system. We gathered sensor data such as soil moisture, soil temperature, air temperature, humidity, and UV radiation from the field and its surroundings. Each stored daily reading is the average of all readings taken at periodic intervals on that day. We collected 150 samples from our simulation area; the NIT Raipur crop dataset contains 501 samples. The forecasted weather data for the corresponding dates are accessed from the IMD site. As classification is the benchmarking task used to evaluate the system, fivefold cross validation is used for better generalization. The effectiveness of the proposed model is measured with widely used metrics: accuracy (A), precision, recall, and F1-measure. Mathematically, these performance measures are defined as follows:

$$\mathrm{Precision}\ (Pr) = \frac{T_p}{T_p + F_p} \tag{1}$$

$$\mathrm{Recall}\ (Rc) = \frac{T_p}{T_p + F_n} \tag{2}$$

$$F1\text{-measure}\ (F1) = \frac{2 \cdot Pr \cdot Rc}{Pr + Rc} \tag{3}$$
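The three measures above can be computed directly from the confusion counts. The following is a minimal sketch; the function name and the example counts are hypothetical and only serve to illustrate the formulas.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw confusion counts."""
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts: 40 true positives, 10 false positives, 10 false negatives.
pr, rc, f1 = precision_recall_f1(tp=40, fp=10, fn=10)
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two and is pulled toward the smaller value.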
where $F_p$, $T_p$, and $F_n$ denote false positives, true positives, and false negatives, respectively. The results of the various combinations of machine learning models for classifying whether irrigation is required or not are given in Table 1. Across all the experiments, the performance of our system is quite satisfactory for this automation task. The performance may improve further as new data are collected over time. User feedback will further fine-tune the system, since a few wrong suggestions may occur at the beginning due to misclassification. The experimental results show that the support vector regressor (SVR) combined with the KNN classifier outperforms the other model combinations on both datasets. Of the two datasets, the NIT Raipur crop dataset does better, which may be due to the larger number of samples in it. When we ran our system without ‘crop type’ and ‘crop days’, we found that these two parameters contribute nominally to the results. Hence, we plan to include these two parameters in our dataset next time. At this stage, the performance improvement of our proposed model is nominal. We are hopeful that it will improve further with more training samples in the future.
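The fivefold cross validation used for these results can be sketched as a simple index split. This is a minimal illustration with contiguous folds; the helper name `k_fold_indices` is hypothetical, and whether the samples were shuffled before folding is not stated in the text.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# 150 is the number of samples reported for the authors' own sensor dataset.
folds = list(k_fold_indices(150, k=5))
```

Each sample appears in exactly one test fold, so averaging the metric over the five folds uses every sample once for evaluation.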
Table 1 Recommendation for irrigation required or not required
Dataset | ML model (regressor + classifier) | Fivefold cross validation: A | P | R | F1 | 70:30 ratio of training and testing: A | P | R | F1
Our sensor collected data | ANN+NB | 82.67 | 82.35 | 79.98 | 81.15 | 82.46 | 82.19 | 79.86 | 81.01
Our sensor collected data | SVR+NB | 83.12 | 82.84 | 80.52 | 81.66 | 82.94 | 82.70 | 80.41 | 81.54
Our sensor collected data | ANN+KNN | 85.27 | 84.96 | 82.67 | 83.80 | 85.12 | 84.83 | 82.56 | 83.68
Our sensor collected data | SVR+KNN | 85.89 | 85.77 | 83.04 | 84.38 | 85.78 | 85.67 | 82.95 | 84.29
NIT Raipur crop dataset | ANN+NB | 83.78 | 83.42 | 81.20 | 82.30 | 82.59 | 83.29 | 81.10 | 82.18
NIT Raipur crop dataset | SVR+NB | 84.06 | 83.74 | 81.59 | 82.65 | 83.90 | 83.62 | 81.51 | 82.55
NIT Raipur crop dataset | ANN+KNN | 84.94 | 84.69 | 82.58 | 83.62 | 84.81 | 84.60 | 82.50 | 83.54
NIT Raipur crop dataset | SVR+KNN | 85.47 | 85.25 | 83.17 | 84.20 | 85.38 | 85.18 | 83.11 | 84.13
280 A. Bhoi et al.
5 Conclusion and Future Work
In this work, we have proposed a prototype of an automated precision irrigation system that requires minimal human intervention and uses water efficiently. The first task in the solution pipeline of the proposed intelligent system is the regression of soil and environmental attributes. Next, forecasted weather parameters are merged with these predicted values to minimize irrigation faults due to impending rainfall. Finally, an efficient classifier categorizes the fused set of attributes to decide whether irrigation is required or not. Based on this result, the system suggests the next irrigation to the farmer, who may accept or reject it. If the user rejects the suggestion, feedback is sent to the system, which subsequently updates and fine-tunes the model. The experimental results indicate that the SVR- and KNN-based model outperforms the other models on both datasets. The system is also expected to perform better in the future with more samples and fine-tuning from feedback.
Acknowledgements This work was supported and funded by the Collaborative Research Scheme (CRS) of the National Project Implementation Unit (NPIU), MHRD, Government of India. The authors wish to thank the Department of Computer Application, NIT Raipur, for making the Crops dataset available.
References
1. Bhanu, B.B., Hussain, M.A., Ande, P.: Monitoring of soil parameters for effective irrigation using wireless sensor networks. In: 2014 Sixth International Conference on Advanced Computing (ICoAC), pp. 211–215. IEEE (2014)
2. Bhanu, K., Mahadevaswamy, H., Jasmine, H.: IoT based smart system for enhanced irrigation in agriculture. In: 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), pp. 760–765. IEEE (2020)
3. García, A.M., García, I.F., Poyato, E.C., Barrios, P.M., Díaz, J.R.: Coupling irrigation scheduling with solar energy production in a smart irrigation management system. J. Clean. Prod. 175, 670–682 (2018)
4. Goldstein, A., Fink, L., Meitin, A., Bohadana, S., Lutenberg, O., Ravid, G.: Applying machine learning on sensor data for irrigation recommendations: revealing the agronomist’s tacit knowledge. Precis. Agric. 19(3), 421–444 (2018)
5. Gutiérrez, J., Villa-Medina, J.F., Nieto-Garibay, A., Porta-Gándara, M.Á.: Automated irrigation system using a wireless sensor network and GPRS module. IEEE Trans. Instrum. Meas. 63(1), 166–176 (2013)
6. Kamelia, L., Ramdhani, M.A., Faroqi, A., Rifadiapriyana, V.: Implementation of automation system for humidity monitoring and irrigation system. In: IOP Conference Series: Materials Science and Engineering, vol. 288, p. 012092 (2018)
7. Koduru, S., Padala, V.P.R., Padala, P.: Smart irrigation system using cloud and internet of things. In: Proceedings of 2nd International Conference on Communication, Computing and Networking, pp. 195–203. Springer (2019)
8. Lalitha, C., Aditya, M., Panda, M.: Smart irrigation alert system using multihop wireless local area networks. In: International Conference on Inventive Computation Technologies, pp. 115–122. Springer (2019)
9. Mekonnen, Y., Namuduri, S., Burton, L., Sarwat, A., Bhansali, S.: Machine learning techniques in wireless sensor network based precision agriculture. J. Electrochem. Soc. 167(3), 037522 (2019)
10. Monica, M., Yeshika, B., Abhishek, G., Sanjay, H., Dasiga, S.: IoT based control and automation of smart irrigation system: an automated irrigation system using sensors, GSM, Bluetooth and cloud technology. In: 2017 International Conference on Recent Innovations in Signal processing and Embedded Systems (RISE), pp. 601–607. IEEE (2017)
11. Sales, N., Remédios, O., Arsenio, A.: Wireless sensor and actuator system for smart irrigation on the cloud. In: 2015 IEEE 2nd World Forum on Internet of Things (WF-IoT), pp. 693–698. IEEE (2015)
12. Salvi, S., Jain, S.F., Sanjay, H., Harshita, T., Farhana, M., Jain, N., Suhas, M.: Cloud based data analysis and monitoring of smart multi-level irrigation system using IoT. In: 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), pp. 752–757. IEEE (2017)
13. Savić, T., Radonjić, M.: WSN architecture for smart irrigation system. In: 2018 23rd International Scientific-Professional Conference on Information Technology (IT), pp. 1–4. IEEE (2018)
14. Vij, A., Vijendra, S., Jain, A., Bajaj, S., Bassi, A., Sharma, A.: IoT and machine learning approaches for automation of farm irrigation system. Procedia Comput. Sci. 167, 1250–1257 (2020)
15. Vijayaraghavan, V., et al.: IoT and cloud hinged smart irrigation system for urban and rural farmers employing MQTT protocol. In: 2020 5th International Conference on Devices, Circuits and Systems (ICDCS), pp. 71–75. IEEE (2020)
UHWSF: Univariate Holt Winter’s Based Store Sales Forecasting Gopal Behera, Ashok Kumar Bhoi, and Ashutosh Bhoi
Abstract Sales are the lifeline of any retail store, so sales forecasting plays an important role for a retail company: every retailer wants to estimate its sales in advance so that its business runs successfully. Traditional methods such as the Auto-Regressive Integrated Moving Average (ARIMA) are commonly used for sales forecasting, and the resulting forecasts inform business decisions. In this article, Holt-Winter’s (HW) techniques are proposed to predict weekly sales for a retailer. We use the publicly available, real-world Walmart sales dataset. Finally, the baseline and proposed models are compared, and the proposed method is found to predict sales more accurately and efficiently than the traditional method.
Keywords ARIMA · Holt-Winters · Sales forecasting · Statistic · Accuracy
G. Behera (B) · A. K. Bhoi · A. Bhoi
Department of Computer Science & Engineering, Government College of Engineering Kalahandi, Bhawanipatna, Odisha, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_25

1 Introduction
The success and growth of any online or offline retail company depends on its weekly, monthly, and yearly sales of items. A wrong prediction leads to overstocked inventories or stock-outs, and the company incurs losses. To avoid these losses, retailers use accurate models to forecast sales according to customer demand. At the same time, a company faces
various challenges in obtaining an accurate forecast, which is essential. One challenge is estimating future customer demand so that products can be stocked ahead of time; in addition, factors such as festivals, holidays, public events, and location also affect future sales [13]. For traditional sales prediction, statistical methods such as the Box and Jenkins model, exponential smoothing, regression models, and ARIMA are often applied. Xia and Wong [16] distinguish between modern heuristic methods and classical methods. ARIMA is a linear model and cannot handle the asymmetric behavior of the real-world data [4, 12] of a retail store. Heuristic techniques, on the other hand, are able to handle these challenges. In this work, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are used as evaluation metrics. For evaluation purposes, we use the Walmart store sales dataset. Walmart is a worldwide retail company with outlets across the world. The dataset contains data from 45 stores for the years 2010 to 2012, and each store contains several departments. Some external data, such as the unemployment rate, CPI (Consumer Price Index), and fuel prices of the region of each store, are also available and may affect the weekly sales of a store. In this article, we propose the Holt-Winter’s model for a univariate time-series dataset and find that it performs significantly better than the traditional ARIMA. The remainder of the article is organized as follows: previous work is reviewed in Sect. 2, the proposed work is discussed in Sect. 3, and the implementation with experimental results and the conclusion are presented in Sects. 4 and 5, respectively.
2 Previous Work
Many researchers have done significant work on improving the accuracy of sales forecasting. The main approaches are based on traditional statistical models, neural networks, and machine learning ensemble techniques. As ARIMA is closest to this work, this article briefly discusses it. ARIMA is a statistical model, and most researchers have used it as a baseline for forecasting the sales of a retail company. Müller-Navarra et al. [10], Pavlyshenko [11] and Gurnani et al. [6] used ARIMA as a baseline for predicting sales in 2015, 2016 and 2017, respectively. In 2019, Bandara et al. [1] studied ARIMA and found that it cannot handle the multivariate domain and performs poorly when handling trend and seasonality. Wu et al. [15] and Xie [16] used two variants of ARIMA for sales prediction, namely SARIMA (Seasonal ARIMA) and vectorized ARIMA (VARIMA). Gurnani et al. [6] developed ARIMA with regressors for analyzing linearity in time-series data, but it fails to handle non-linearity in the dataset [17]. Although machine learning (ML) techniques are not designed for time-series forecasting, they are able to tackle both non-linear and linear tasks on time-series features [4]. Most researchers have used the Rossmann dataset for predicting sales. Behera and Nain [2] discussed various ML techniques and presented a comparative analysis of them. An optimization-based technique for predicting future sales for Big Mart is shown in [3]. Hansen and Doornik [4] analysis forecasting
of sales on Rossmann data using SVM, linear regression (LR) and softmax regression; found that SVM. Forecasting of sales using a Frequency Domain Regressor and SVM was explored by Lin et al. [9]. Pavlyshenko [11] discussed a case study on time-series sales forecasting using machine learning models.
3 Proposed Work
In this work, we propose Holt-Winter’s (HW) additive and multiplicative models for predicting the sales of a retail store such as Walmart. We consider a univariate time-series dataset for evaluation and assume that the store predicts sales on a weekly basis, taking into account whether the week contains a holiday or public event, as well as the location and size of the store.
3.1 Holt-Winter’s
HW is an extended version of Holt’s method [7, 14] that is able to handle seasonality. The HW seasonal technique consists of a forecast equation and three smoothing equations, one each for the level $l_t$, the trend $b_t$, and the seasonal component $s_t$. These equations take $\alpha$, $\beta^{*}$ and $\gamma$ as smoothing parameters. HW is further categorized into additive and multiplicative variants [5, 8]. The additive variant is preferred when the seasonal component is fixed throughout the series and is defined in Eqs. (1)–(4):

$$\hat{y}_{t+h|t} = l_t + h b_t + s_{t+h-m(k+1)} \tag{1}$$

$$l_t = \alpha (y_t - s_{t-m}) + (1 - \alpha)(l_{t-1} + b_{t-1}) \tag{2}$$

$$b_t = \beta^{*} (l_t - l_{t-1}) + (1 - \beta^{*}) b_{t-1} \tag{3}$$

$$s_t = \gamma (y_t - l_{t-1} - b_{t-1}) + (1 - \gamma) s_{t-m} \tag{4}$$

where $k$ is the integer part of $(h - 1)/m$, which ensures that the seasonal indices used for forecasting come from the final year of the sample; $m$ is the seasonal frequency, that is, the number of seasons in a year; and $l_t$ and $b_t$ are the estimated level and slope (trend) of the series at time $t$. The parameters are restricted to $0 \le \alpha \le 1$ and $0 \le \beta^{*} \le 1$.
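The additive recursions can be sketched in a few lines of plain Python. This is only an illustration, not the implementation used in the experiments: the initialisation of level, trend and seasonal indices from the first two seasons is an assumption of this sketch, the function name is hypothetical, and the trend update uses the difference $l_t - l_{t-1}$ as in the standard formulation and in Eq. (7).

```python
def holt_winters_additive(y, m, alpha, beta, gamma, h):
    """Fit additive Holt-Winters to y and return an h-step-ahead forecast list."""
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    # Simple initialisation (an assumption of this sketch, not from the paper).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]  # season[t % m] holds s_{t-m}

    for t in range(m, len(y)):
        s_prev = season[t % m]
        prev_level, prev_trend = level, trend
        # Level update, Eq. (2).
        level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        # Trend update, Eq. (3), with the level difference.
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # Seasonal update, Eq. (4).
        season[t % m] = gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * s_prev

    n = len(y)
    # Forecast, Eq. (1): reuse the latest seasonal index of each future period.
    return [level + step * trend + season[(n + step - 1) % m]
            for step in range(1, h + 1)]

# A purely seasonal toy series with period 4 and no trend.
forecast = holt_winters_additive([10, 20, 30, 40] * 3, m=4,
                                 alpha=0.3, beta=0.1, gamma=0.2, h=4)
```

For this toy series the seasonal pattern is reproduced in the forecast, because the initial estimates already match the data and the updates leave them unchanged.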
The HW multiplicative components are defined in Eqs. (5)–(8); this variant is suitable when the seasonal variations change proportionally to the level of the series.

$$\hat{y}_{t+h|t} = (l_t + h b_t)\, s_{t+h-m(k+1)} \tag{5}$$

$$l_t = \alpha \frac{y_t}{s_{t-m}} + (1 - \alpha)(l_{t-1} + b_{t-1}) \tag{6}$$

$$b_t = \beta^{*} (l_t - l_{t-1}) + (1 - \beta^{*}) b_{t-1} \tag{7}$$

$$s_t = \gamma \frac{y_t}{l_{t-1} + b_{t-1}} + (1 - \gamma) s_{t-m} \tag{8}$$
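A corresponding sketch for the multiplicative variant follows. As before, the initialisation from the first two seasons and the function name are assumptions made for illustration; the series is assumed to be strictly positive so that the divisions in Eqs. (6) and (8) are well defined.

```python
def holt_winters_multiplicative(y, m, alpha, beta, gamma, h):
    """Fit multiplicative Holt-Winters to y and return an h-step-ahead forecast list."""
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    if min(y) <= 0:
        raise ValueError("multiplicative seasonality needs strictly positive data")
    # Simple initialisation (an assumption of this sketch, not from the paper).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] / level for i in range(m)]  # season[t % m] holds s_{t-m}

    for t in range(m, len(y)):
        s_prev = season[t % m]
        prev_level, prev_trend = level, trend
        # Level update, Eq. (6): deseasonalise by division.
        level = alpha * y[t] / s_prev + (1 - alpha) * (prev_level + prev_trend)
        # Trend update, Eq. (7).
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # Seasonal update, Eq. (8): ratio of observation to expected level.
        season[t % m] = gamma * y[t] / (prev_level + prev_trend) + (1 - gamma) * s_prev

    n = len(y)
    # Forecast, Eq. (5): scale the trended level by the seasonal factor.
    return [(level + step * trend) * season[(n + step - 1) % m]
            for step in range(1, h + 1)]

forecast = holt_winters_multiplicative([10, 20, 30, 40] * 3, m=4,
                                       alpha=0.3, beta=0.1, gamma=0.2, h=4)
```

The only structural difference from the additive form is that the seasonal component enters as a factor rather than an offset, which is why it suits series whose seasonal swings grow with the level.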
4 Implementation
The experiment is set up in a Python environment with Jupyter Notebook, and real-world time-series data are used to evaluate the model. The dataset is described in detail in Sect. 4.1.
4.1 Dataset
We collected the real-world Walmart dataset (see footnote 1) for our work. It is a univariate time-series dataset containing data from 45 Walmart stores for the period 2010–12, where each store has various departments. There are also several external variables, such as the fuel price of the region of each store, the CPI (Consumer Price Index), and the unemployment rate, which help the analysis because they may affect the sales of a store. Table 1 gives details of the stores, and Tables 2 and 3 give details of the external data and sales, respectively.
4.2 Results and Discussion
Two accuracy metrics are used to evaluate the model, namely RMSE and MAE [3]. The lower the RMSE and MAE, the better the model. RMSE is defined in Eq. (9).

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (y_t - f_t)^2} \tag{9}$$

1 https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/data.
Table 1 Description of a store in Walmart
Attribute name | Description
Store | Number of stores (1–45)
Type | Type of store (A, B, C)
Size | Number of products contained in a particular store (34,000–210,000)

Table 2 Description of external features of Walmart
Attribute name | Description
Temperature | Temperature during a week
Fuel_price | Fuel price in a particular region
Mark_down1:5 | Shows type of markdown and what quantity was available
CPI | Consumer price index of a week
Unemployment | Week-wise unemployment rate

Table 3 Description of sales of Walmart
Attribute name | Description
Date | Date of a week when observation was taken
Weekly_sales | Weekly sales recorded
Dept | Number of department (1–99)
IsHoliday | Holiday week or not (value either 0 or 1)
where $y_t$ is the original value, $f_t$ is the forecasted value, and $N$ is the number of samples. Similarly, MAE is defined in Eq. (10).

$$\mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} |y_t - f_t| \tag{10}$$
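Both error measures follow directly from Eqs. (9) and (10). The sketch below uses made-up values purely to show the arithmetic; the function names are hypothetical.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error, Eq. (9)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean absolute error, Eq. (10)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)

# Made-up values: the errors are -10, 10, 0 and -20.
actual = [100, 200, 300, 400]
forecast = [110, 190, 300, 420]
error_rmse = rmse(actual, forecast)  # sqrt(600 / 4)
error_mae = mae(actual, forecast)    # 40 / 4
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few forecasts are far off, since squaring weights large errors more heavily.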
Before being fed into the model, the data go through an Exploratory Data Analysis (EDA) process to visualize their behavior and to check whether the features of the original dataset are in the correct form. Figure 1 indicates that, of the three store types, type A stores are the largest in terms of the number of products they contain and type C stores are the smallest. Figure 2 shows that department number 92 produces the maximum weekly sales compared with the other departments, regardless of whether the week is a holiday week. Figure 3 further shows that weekly sales are higher in holiday weeks than in non-holiday weeks. The correlation among the attributes is shown in Fig. 4, from which it is found that the cost of goods increases as the CPI increases, and this causes a reduction in sales. In the correlation matrix, a negative value indicates an inverse association between two features and a positive value a direct one; the strength of the association is given by the magnitude of the value. The time-series dataset is
Fig. 1 Store type versus size: type A stores contain the maximum number of products, whereas type C stores contain the fewest
Fig. 2 Weekly sales versus department: weekly sales of each department; department 92 shows the maximum weekly sales
decomposed into trend, seasonality and residuals. Figure 5 shows that the trend of weekly sales first decreases and then rises. In the seasonal component, the weekly sales patterns repeat, and weekly sales are very high in November and December. For evaluation, the dataset is divided into training and testing sets in the proportion 80:20. We then applied both ARIMA and the proposed method and found that the proposed model provides better forecasts than the baseline. Figure 6 shows that, for the baseline model, the forecast of weekly sales does not fit the test values well. In contrast, the prediction curve of the proposed HW additive model fits the test values and hence gives an accurate prediction of weekly sales, as shown in Fig. 7. Similarly, the second variant of the proposed model, the multiplicative model, is shown in Fig. 8, which gives more accurate
Fig. 3 Weekly sales versus IsHoliday: weekly sales are higher when a week is a holiday week
Fig. 4 Correlation among features in the dataset: negative values indicate an inverse relation and positive values a direct relation between attributes; the magnitude indicates the strength
Fig. 5 Trend, seasonality and residuals present in the dataset: in the trend, weekly sales first decrease and then rise; in the seasonality, the weekly sales patterns repeat
Fig. 6 Weekly sales prediction using ARIMA: the prediction does not fit the test data well
Fig. 7 Weekly sales prediction using the proposed model (Holt-Winter’s additive): the prediction fits the test data well
Fig. 8 Prediction of weekly sales with seasonal variation using Holt-Winter’s multiplicative model

Table 4 Comparison of accuracy metrics of base model with proposed model
Accuracy metrics | Base model: ARIMA | Proposed model: Holt-Winter’s additive model | Proposed model: Holt-Winter’s multiplicative model
RMSE | 990.52 | 811.76 | 600.54
MAE | 775.46 | 601.06 | 415.06
prediction than the additive model shown in Fig. 7. The accuracy of the proposed model and the base model is compared in Table 4, which shows that the proposed model is better than the base model for time-series data containing both trend and seasonality.
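The 80:20 split used for the evaluation can be sketched as follows. The sketch assumes the split is chronological, with the most recent weeks held out for testing; the text does not state this explicitly, and the function name and placeholder values are hypothetical.

```python
def chronological_split(series, train_fraction=0.8):
    """Split an ordered series into train and test parts without shuffling."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    # The first part is used for fitting, the remaining tail for testing.
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

# 100 placeholder values standing in for weekly sales; not real data.
weekly_sales = list(range(100))
train, test = chronological_split(weekly_sales, train_fraction=0.8)
```

Keeping the order intact matters for forecasting: shuffling before the split would let the model see observations that lie after the ones it is asked to predict.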
5 Conclusion and Future Scope
From the data visualization, we conclude that weekly sales depend largely on external data such as IsHoliday, CPI, and location. The analysis also shows that sales are highest in weeks that contain a holiday. In this work, two variants of the Holt-Winter’s model are used for weekly sales forecasting and applied to a time-series dataset. The HW additive model is better for time-series data when the seasonal component is fixed throughout the series, and the second variant, the Holt-Winter’s multiplicative model, is suitable when the seasonal variations change proportionally to the level of the series. With both variants of the HW model, the prediction curve fits the test data significantly better than with the baseline ARIMA model, whose prediction curve does not fit the test data well. Similarly, in terms of the accuracy metrics, both variants of the proposed model are better than ARIMA. Furthermore, in the trend component the weekly sales first decrease and then rise, and in the seasonal component the weekly sales follow a repeating pattern, with the maximum in November and December. It is also found that department number 92 has higher weekly
sales than the other departments, and from the correlation analysis it is concluded that weekly sales are lower when the CPI is higher. In the future, the smoothing parameters α, β, and γ can be tuned using hyper-parameter optimization techniques to enhance performance and to allow a more comprehensive analysis. In addition, experiments can be carried out on more datasets.
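One simple way to tune the smoothing parameters is an exhaustive grid search that minimises the one-step-ahead squared error of the additive model. The sketch below is only one possible approach, not a method reported in this work; the grid values, the initialisation, and both function names are assumptions.

```python
from itertools import product

def additive_hw_sse(y, m, alpha, beta, gamma):
    """Sum of squared one-step-ahead errors of additive Holt-Winters on y."""
    # Same simple initialisation from the first two seasons (an assumption).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]
    sse = 0.0
    for t in range(m, len(y)):
        s_prev = season[t % m]
        prediction = level + trend + s_prev  # forecast made before seeing y[t]
        sse += (y[t] - prediction) ** 2
        prev_level, prev_trend = level, trend
        level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        season[t % m] = gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * s_prev
    return sse

def grid_search(y, m, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Return (best_sse, (alpha, beta, gamma)) over all grid combinations."""
    return min(
        (additive_hw_sse(y, m, a, b, g), (a, b, g))
        for a, b, g in product(grid, repeat=3)
    )

# Toy series: seasonal pattern of period 4 on top of a slow upward trend.
series = [2 * (i // 4) + [10, 20, 30, 40][i % 4] for i in range(24)]
best_sse, best_params = grid_search(series, m=4)
```

In practice the error would be measured on a held-out validation window rather than on the fitting data, so that the chosen parameters are not tuned to noise.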
References
1. Bandara, K., Shi, P., Bergmeir, C., Hewamalage, H., Tran, Q., Seaman, B.: Sales demand forecast in e-commerce using a long short-term memory neural network methodology. In: International Conference on Neural Information Processing, pp. 462–474. Springer (2019)
2. Behera, G., Nain, N.: A comparative study of big mart sales prediction. In: International Conference on Computer Vision and Image Processing, pp. 421–432. Springer (2019)
3. Behera, G., Nain, N.: Grid search optimization (GSO) based future sales prediction for big mart. In: 2019 15th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), pp. 172–178. IEEE (2019)
4. Doornik, J.A., Hansen, H.: A practical test for univariate and multivariate normality. Technical report, Discussion paper, Nuffield College (1994)
5. Grubb, H., Mason, A.: Long lead-time forecasting of UK air passengers by Holt-Winters methods with damped trend. Int. J. Forecast. 17(1), 71–82 (2001)
6. Gurnani, M., Korke, Y., Shah, P., Udmale, S., Sambhe, V., Bhirud, S.: Forecasting of sales by using fusion of machine learning techniques. In: 2017 International Conference on Data Management, Analytics and Innovation (ICDMAI), pp. 93–101. IEEE (2017)
7. Holt, C.C.: Forecasting seasonals and trends by exponentially weighted moving averages. Int. J. Forecast. 20(1), 5–10 (2004)
8. Kotsialos, A., Papageorgiou, M., Poulimenos, A.: Holt-Winters and neural-network methods for medium-term sales forecasting. IFAC Proc. Vol. 38(1), 133–138 (2005)
9. Lin, S., Yu, E., Guo, X.: Forecasting Rossmann Store Leading 6-month Sales
10. Müller-Navarra, M., Lessmann, S., Voß, S.: Sales forecasting with partial recurrent neural networks: Empirical insights and benchmarking results. In: 2015 48th Hawaii International Conference on System Sciences, pp. 1108–1116. IEEE (2015)
11. Pavlyshenko, B.M.: Machine-learning models for sales time series forecasting. Data 4(1), 15 (2019)
12. Wheelwright, S., Makridakis, S., Hyndman, R.J.: Forecasting: Methods and Applications. Wiley, New York (1998)
13. Williams, G.C., Ainsworth, D., Jones, K.: Implementing SAP ERP Sales & Distribution. McGraw-Hill (2008)
14. Winters, P.R.: Forecasting sales by exponentially weighted moving averages. Manage. Sci. 6(3), 324–342 (1960)
15. Wu, L., Yan, J.Y., Fan, Y.J.: Data mining algorithms and statistical analysis for sales data forecast. In: 2012 Fifth International Joint Conference on Computational Sciences and Optimization, pp. 577–581. IEEE (2012)
16. Xie, X., Ding, J., Hu, G.: Forecasting the retail sales of China’s catering industry using support vector machines. In: 2008 7th World Congress on Intelligent Control and Automation, pp. 4458–4462. IEEE (2008)
17. Zhang, G.P.: Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50, 159–175 (2003)
A Nature-Inspired-Based Multi-objective Service Placement in Fog Computing Environment Hemant Kumar Apat , Kunal Bhaisare, Bibhudatta Sahoo , and Prasenjit Maiti
Abstract Over the last few years, the Internet of Things (IoT) has become one of the most popular technologies, and together with the emergence of 5G it enables new interactions between things and humans that enhance the quality of life. With the rapid development of IoT applications, connected devices generate an extraordinary volume and variety of data that must be processed at centralized cloud data centers. The ever-increasing demand for computation resources at these data centers inevitably affects the Quality of Service (QoS). Fog computing moves the computational load to the edge of the network by introducing a middle layer of multiple heterogeneous fog devices that process IoT applications. Processing data at the fog layer reduces response time and bandwidth cost while meeting QoS requirements. Because IoT applications are heterogeneous and dynamic, proper application placement is key to enhancing overall system performance. To fully utilize a distributed fog computing architecture, a large-scale IoT application can be decomposed into dependent and independent services; deploying those services in an orderly way onto the available virtualized fog nodes, while satisfying the constraints and the Service-Level Agreement (SLA), may increase efficiency and performance. In this work, we study the application placement problem in the fog computing environment, which is a well-known NP-complete problem. We review the deterministic and non-deterministic approaches proposed in the literature for optimal service placement under single and multiple objectives. We propose a genetic-algorithm-based meta-heuristic technique for multi-objective service placement and compare it with random application placement. Evaluation results show that our proposal outperforms the random placement policy.
Supported by National Institute of Technology, Rourkela, Odisha, India.
H. K. Apat (B) · K. Bhaisare · B. Sahoo · P. Maiti
National Institute of Technology, Rourkela, Odisha 769008, India
URL: http://www.nitrkl.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_26
Keywords Fog computing · Application placement · Network resource optimization · Energy consumption · Internet of Things (IoT) · Response time
1 Introduction
Connected devices are changing the way we live and work. It was expected that by 2020 there would be over 50 billion Internet of Things (IoT) devices, many running real-time, latency-sensitive applications and generating an unprecedented volume and variety of data [1]. Currently, data are stored and processed in cloud data centers, which are geographically centralized [2]. A large number of highly distributed devices generates an enormous amount of data, which causes network congestion and unacceptably high latency in service delivery. Traditional cloud-only architectures cannot sustain the projected data velocity and volume requirements of the IoT. To overcome these obstacles, Cisco introduced a new paradigm called fog computing [2] in 2012.
We first give a brief overview of cloud computing. The traditional cloud computing paradigm offers different service models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). The IaaS model outsources storage, network, and memory resources to support Internet applications; examples of IaaS providers are Amazon Web Services and Rackspace. These services have been widely adopted to support IoT-enabled applications such as smart cities, smart health care, smart traffic, and augmented-reality online games, as presented in [3–5]. However, the centralized nature of the cloud service architecture increases the frequency of communication between user devices and the data center, and the large geographical distance to the cloud data center limits applications that require real-time responses. Fog computing has emerged as a viable solution to support geographically distributed, QoS-aware, and latency-sensitive applications [3, 4, 6, 7]. It extends cloud computing by providing computation, communication, and storage facilities toward the edge of the network in an efficient and timely manner.
In addition to providing data, computation, storage, and application services similar to the cloud, fog is decentralized and capable of processing a large amount of low-latency data at the fog layer, while delay-tolerant applications are still processed by the cloud data center. The first more formal definition of fog computing, by Bonomi et al. [3], states that “Fog computing is a highly virtualized platform that provides compute, storage and networking services between end devices and traditional Cloud Computing Data Centers, typically, but not exclusively located at the edge of the network.” As the definition indicates, fog computing does not necessarily replace cloud computing but complements it. Fog computing fills the current gap between cloud and things by distributing computing, control, storage, networking, and communication functions closer to end-user devices. Essentially, it eliminates or vastly reduces networking loads and latency. The devices used to fill this gap are termed fog nodes or fog devices. A fog node can be any computing device that possesses some computational capability, such as a personal computer, tablet, or mobile phone.
2 Related Work
In this section, we first present a brief overview of the service placement problem in the fog computing paradigm as defined by different authors and then discuss the methods they adopted. From the literature review, we found that most service placement problems addressed so far are based on single-objective optimization of performance metrics, e.g., network usage, cost, delay minimization, energy minimization and QoS maximization [7–13]. We found very few papers that deal with multi-objective optimization. We selected some of the best single-objective and multi-objective papers and analyzed them comparatively.
Mahmud et al. [11] proposed an algorithm that aims to reduce the latency of services by properly placing them on devices distributed over the network. The algorithm identifies nearby idle fog devices and forwards services from the current devices to idle ones to reduce processing time. Its effectiveness was evaluated using allotment time and acceptance percentage.
Mahmud et al. [14] developed a context-aware algorithm based on a fuzzy logic model. The algorithm categorizes different services and places them on different fog resources to streamline their execution.
Taneja et al. [15] proposed an algorithm for deciding where to forward services in a fog-cloud architecture. The algorithm performs two tasks: mapping devices to services, and looking up suitable devices on which to place services.
Minh et al. [16] proposed, for a multi-level fog environment, an algorithm that optimizes service decentralization using context-aware information such as location, time, and Quality of Service (QoS).
Abrol et al. [17] proposed a cloud architecture that includes cloud management tools and a resource placement module, together with a novel social spider algorithm that addresses the premature convergence to the global best faced by other swarm-intelligence resource placement algorithms in the cloud PaaS layer. The proposed architecture, the new dynamic meta-heuristic, and the resource placement module can be used to improve resource usage in the PaaS layer.
Mseddi et al. [18] proposed an algorithm for efficient resource management that is capable of delivering scalable resources in a dynamic fog environment. The authors also discussed the resource provisioning problem, formulated it as an Integer Linear Programming problem, and discussed different heuristic and greedy algorithms.
H. K. Apat et al.
Fig. 1 Fog computing architecture
Mishra et al. [19] formulated resource allocation as a bi-objective optimization problem to analyse the trade-off between two conflicting parameters. Three meta-heuristic algorithms were applied, and their results were compared to find the best among them.
3 Fog Computing Architecture
We have adopted the fog computing architecture proposed by Bonomi et al. [2]. It consists of three layers: the IoT (end-device) layer, the fog layer, and the cloud layer.
A Nature-Inspired-Based Multi-objective Service Placement …
• End-Device Layer: This is the bottom layer of the architecture, nearest to the physical domain. It includes a variety of devices such as mobile phones, laptops, smart grids, smart cards, and smart home appliances. These devices are responsible for sensing physical entities or events, and for processing and transmitting the sensed data to the higher layer.
• Fog Layer: This layer consists of all fog devices. Fog devices include routers, switches, and other network devices that incorporate some intelligence. Terminal devices can easily connect to any fog device in the nearest network. Fog devices are capable of processing, storage, and analytical functions.
• Cloud Layer: This layer consists of distant cloud data centers, powerful servers, and devices [15]. It manages the underlying layers, and its powerful devices facilitate efficient management.
4 System Model
4.1 IoT Application
An IoT application consists of several dependent and independent services [18], which execute on virtualized fog nodes and interact with each other by sending messages to complete the goal of the application. Throughout this paper, IoT applications are referred to as applications, and each service is placed on a fog node in the fog environment. In this paper, all services are assumed to be independent so as to exploit the distributed resources of the fog environment. Let $S$ be the set of all services to be deployed in the fog environment:
$$S = \{S_0, S_1, S_2, \ldots, S_m\}$$
A service is characterized by its demand for computational resources, namely CPU $R_c$, RAM $R_r$, and storage $R_s$, and by its type. Each service also has a user-defined deadline $D_{S_i}$, which specifies the maximum time allowed for its execution. Services are initiated by user requests, which are directed to the fog controllers described in Sect. 4.3. These requests have the following parameters, which are used when placing tasks onto devices in the fog environment or onto the cloud.
4.2 Types of Request
• Computational requirements (CPU, RAM, Storage): $R^{r}_{S_i}$ denotes the request of service $S_i$ for resource $r \in R$, where $R$ is the set of resources considered in this model (CPU, RAM, Storage).
• Deadline of service: $D_{S_i}$ denotes the maximum time allowed for responding to the service request.
4.3 Fog Network Layer
A fog computing system can be modeled as an undirected graph $G = (V, E)$, where $V$ represents the set of fog nodes and $E$ is the set of communication links between fog nodes and fog control nodes ($F$).
4.4 Cloud Network Layer
The cloud system has enough memory and storage capacity to process latency-tolerant applications.
5 Problem Statement
5.1 Optimization Problem
Service placement can be defined mathematically as a mapping function $f : S \to (F \cup P)$, where the domain is the set of services and the co-domain is the set of resources available in the fog and the cloud. This problem admits many solutions: if $m$ services have to be mapped onto $n$ resources ($m > n$), each service can be placed on any of the $n$ resources, so there are $n^m$ possible combinations, and evaluating all of them requires exponential time. We therefore propose a meta-heuristic technique to solve the service placement problem. We have to choose the proper range from the co-domain such that the specified objectives and constraints are satisfied. In our model, we assume that this mapping can be many-to-many (M:M). Given a set of service requests $S = \{S_0, S_1, S_2, \ldots, S_m\}$, the set $D = \{d_0, d_1, d_2, \ldots, d_N\}$ of all devices in a fog cluster, including all fog nodes and fog controller nodes, and weights $\alpha$, $\beta$, $\gamma$ for the importance of makespan, energy usage, and cost, we need to find an optimal mapping between services and devices such that:
1. Service requests have minimum makespan time
2. Energy consumption due to service execution is minimum
3. Cost of execution is minimum.
that is,
$$\min \sum_{s \in S} F(s) \tag{1}$$
$$F(s) = \sum_{d \in D} x_{sd}\,\bigl(\alpha f_m(s,d) + \beta f_e(s,d) + \gamma f_c(s,d)\bigr) \tag{2}$$
subject to the following constraints. First, each service $S_i$ has to be placed on exactly one device, either a fog node or the cloud:
$$\sum_{j=0}^{N} x_{sj} = 1, \quad \text{for all } s \in S \tag{3}$$
Here $x_{sj}$ is a binary variable in our model that specifies whether service $s$ is placed on device $j$ or not. Second, the total makespan time of a service should not exceed the service deadline:
$$\sum_{d \in D} x_{S_i,d} \cdot ms(S_i, d) \le D_{S_i} \tag{4}$$
where $f_m(s,d)$, $f_e(s,d)$, and $f_c(s,d)$ are the normalized makespan, energy consumption, and cost, respectively, given in Eqs. (5), (11), and (9), subject to constraints (3), (4), and ??.
$$f_m(s,d) = \frac{ms(s,d) - \min(MS_s)}{\max(MS_s) - \min(MS_s)} \tag{5}$$
where $MS_s = \{ms(i,j) \mid i = s,\ j \in D\}$ and
$$ms(i,j) = t_{ij}^{proc} + t_{ij}^{comm} \tag{6}$$
$$t_{ij}^{proc} = \frac{X_i}{Y_i} \tag{7}$$
$$t_{ij}^{comm} = \frac{O_i}{BW(src_i, j)} + \frac{I_i}{BW(j, dst_i)} \tag{8}$$
where $t_{ij}^{proc}$ is the processing time required to execute service $i$ ($S_i$) on device $j$ ($d_j$). Similarly, $t_{ij}^{comm}$ is the communication time required to send and receive the input and output
of service $i$ ($S_i$) to and from device $j$ ($d_j$). Processing time and communication time are calculated using Eqs. (7) and (8), respectively.
$$f_c(s,d) = \frac{c(s,d) - \min(C_s)}{\max(C_s) - \min(C_s)} \tag{9}$$
where $C_s = \{c(i,j) \mid i = s,\ j \in D\}$ and
$$c(i,j) = \eta_j^{cost} \cdot t_{ij}^{proc} \tag{10}$$
where $\eta_j^{cost}$ is the money charged per unit time for execution on device $j$.
$$f_e(s,d) = \frac{ec(s,d) - \min(EC_s)}{\max(EC_s) - \min(EC_s)} \tag{11}$$
where $EC_s = \{ec(i,j) \mid i = s,\ j \in D\}$ and
$$ec(i,j) = \eta_j^{energy} \cdot W_i \tag{12}$$
where $\eta_j^{energy}$ is the energy consumed in watts for the execution of service $i$ on device $j$ (Table 1).
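The weighted objective of Eq. (2) and the deadline constraint of Eq. (4) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the table layout `table[s][d]`, the sample numbers, and the guard that returns 0 when all values in a row are equal are assumptions made for the example.

```python
def normalize(value, values):
    """Min-max normalisation in the style of Eqs. (5), (9), (11).

    Assumption: when all values are equal the normalised value is taken as 0.
    """
    lo, hi = min(values), max(values)
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def fitness(placement, ms, cost, energy, alpha, beta, gamma):
    """Sum of Eq. (2) over all services; placement[s] is the device of service s."""
    total = 0.0
    for s, d in enumerate(placement):
        f_m = normalize(ms[s][d], ms[s])
        f_c = normalize(cost[s][d], cost[s])
        f_e = normalize(energy[s][d], energy[s])
        total += alpha * f_m + beta * f_e + gamma * f_c
    return total


def meets_deadlines(placement, ms, deadline):
    """Eq. (4): the makespan of every placed service must not exceed its deadline."""
    return all(ms[s][d] <= deadline[s] for s, d in enumerate(placement))


# Hypothetical tables for 2 services (rows) and 3 devices (columns).
ms = [[4.0, 2.0, 6.0], [3.0, 9.0, 5.0]]
cost = [[1.0, 3.0, 2.0], [2.0, 1.0, 4.0]]
energy = [[5.0, 5.0, 5.0], [1.0, 2.0, 3.0]]
placement = [1, 0]  # service 0 on device 1, service 1 on device 0

score = fitness(placement, ms, cost, energy, alpha=0.5, beta=0.3, gamma=0.2)
feasible = meets_deadlines(placement, ms, deadline=[3.0, 3.0])
```

Representing a placement as a list with one device index per service enforces constraint (3) by construction, since each service has exactly one entry.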
Table 1 Notation used in algorithms

| Notation | Description |
| --- | --- |
| R | Requirements of the task |
| C | Device capacity |
| D | List of all device ids |
| A | Allotment list |
| BW | Bandwidth of connection |
| F | Fitness of population P |
| gen_num | Number of generations |
| pop_num | Population count |
| b | List of all tasks |
6 Proposed Algorithms
Algorithm 1 Random allotment
for i = 0 to size(b) − 1 do
    A[i] = one random device id from D
    while C[A[i]]["ram"] < R[i]["ram"] or C[A[i]]["storage"] < R[i]["storage"] do
        A[i] = one random device id from D
    end while
end for
return A
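Algorithm 1 can be sketched in Python as below. The sketch assumes that a device is redrawn whenever its RAM or storage capacity is smaller than the task's requirement; the `max_tries` limit, the error raised when no device fits, and the sample data are additions for the example and are not part of the paper.

```python
import random


def random_allotment(requirements, capacity, device_ids, rng=random, max_tries=1000):
    """Assign each task a random device that has enough RAM and storage."""
    allotment = []
    for req in requirements:
        for _ in range(max_tries):
            dev = rng.choice(device_ids)
            cap = capacity[dev]
            if cap["ram"] >= req["ram"] and cap["storage"] >= req["storage"]:
                allotment.append(dev)
                break
        else:
            # Assumption: give up instead of looping forever when nothing fits.
            raise ValueError("no feasible device found for a task")
    return allotment


# Hypothetical requirements (R) and capacities (C).
R = [{"ram": 2, "storage": 10}, {"ram": 8, "storage": 50}]
C = {"f0": {"ram": 4, "storage": 20}, "f1": {"ram": 16, "storage": 100}}
A = random_allotment(R, C, ["f0", "f1"], rng=random.Random(0))
```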
Algorithm 2 Genetic Allotment
P = generate_population(pop_num, b)
F = calculate_fitness(P, R, C)
P, F = sort(P, F)
for i = 0 → gen_num do
    NP = crossover(P, F)
    P = append(P, NP)
    P = mutate(P)
    F = calculate_fitness(P, R, C)
    P, F = sort(P, F)
    P = P[: pop_num]
    F = F[: pop_num]
end for
return P[0]
Algorithm 3 Population generation
P = []
for i = 0 → pop_num do
    p = []
    for j = 0 → size(b) do
        c = random choice from D
        p.append(c)
    end for
    P.append(p)
end for
return P
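Algorithms 2 and 3 can be combined into a short Python sketch. The paper does not specify the crossover and mutation operators, so the one-point crossover between parents from the fitter half and the per-gene random reassignment used here are assumptions. The sketch also mutates only the children rather than the whole population, so that the best placement found so far is never lost; this is a deliberate deviation from the listing. Lower fitness is treated as better, and `pop_num` is assumed to be at least 2.

```python
import random


def generate_population(pop_num, num_tasks, device_ids, rng):
    """Algorithm 3: pop_num random placements, one device id per task."""
    return [[rng.choice(device_ids) for _ in range(num_tasks)]
            for _ in range(pop_num)]


def genetic_allotment(num_tasks, device_ids, fitness, pop_num=20, gen_num=50,
                      mutation_rate=0.1, seed=0):
    """Algorithm 2 with assumed operators; returns the best placement found."""
    rng = random.Random(seed)
    population = generate_population(pop_num, num_tasks, device_ids, rng)
    population.sort(key=fitness)
    for _ in range(gen_num):
        children = []
        for _ in range(pop_num):
            # Assumed crossover: two parents from the fitter half, one cut point.
            a, b = rng.sample(population[:max(2, pop_num // 2)], 2)
            cut = rng.randrange(1, num_tasks) if num_tasks > 1 else 0
            children.append(a[:cut] + b[cut:])
        # Assumed mutation: reassign a task to a random device (children only).
        for child in children:
            for j in range(num_tasks):
                if rng.random() < mutation_rate:
                    child[j] = rng.choice(device_ids)
        # Keep the pop_num fittest individuals of parents plus children.
        population = sorted(population + children, key=fitness)[:pop_num]
    return population[0]


# Toy usage: device ids double as costs, so the sum of a placement is its fitness.
best = genetic_allotment(num_tasks=5, device_ids=[0, 1, 2], fitness=sum)
```

In practice the `fitness` argument would be a function that evaluates a placement with the weighted objective of Eq. (2).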
7 Simulation and Results
All the algorithms are simulated using YAFS, a simulation tool written in Python. For each simulation, a random network is created, and task launch times are initialized identically for all algorithms so that the algorithms can be compared fairly (Figs. 2, 3, 4 and 5).
Fig. 2 Latency reduction
Fig. 3 Energy reduction
Fig. 4 Cost reduction
Fig. 5 Overall performance of different algorithms
8 Conclusion and Future Work
In this work, we have used a multi-objective optimization method to solve the service placement problem in fog computing with three metrics considered simultaneously: makespan, energy, and cost. Our proposed genetic algorithm-based placement plan performs better than the random approach. In future work, we would like to explore more meta-heuristics for comparison and to consider Pareto-front solutions for conflicting parameters.
References
1. Mohan, N., Kangasharju, J.: Edge-fog cloud: a distributed cloud for internet of things computations. In: 2016 Cloudification of the Internet of Things, CIoT 2016 (2017)
2. Bonomi, F., Milito, R., Zhu, J., Addepalli, S.: Fog computing and its role in the Internet of Things. In: Proceedings 1st Ed. MCC Workshop Mobile Cloud Computing, pp. 13–16 (2012)
3. Deng, R., Lu, R., Lai, C., Luan, T.H., Liang, H.: Optimal workload allocation in fog-cloud computing toward balanced delay and power consumption. IEEE Internet Things J. 3, 1171–1181 (2016)
4. Bitam, S., Zeadally, S., Mellouk, A.: Fog computing job scheduling optimization based on bees swarm. Enterprise Information Systems, pp. 1–25 (2017)
5. Azizi, S., Khosroabadi, F., Shojafar, M.: A priority-based service placement policy for fog-cloud computing systems. Comput. Methods Differen. Equ. 7(4) (Special Issue), pp. 521–534 (2019)
6. Toor, A., ul Islam, S., Ahmed, G., Jabbar, S., Khalid, S., Sharif, A.M.: Energy efficient edge-of-things. EURASIP J. Wirel. Commun. Netw. 8 (2019)
7. Gupta, H., Vahid Dastjerdi, A., Ghosh, S.K., Buyya, R.: iFogSim: a toolkit for modeling and simulation of resource management techniques in the internet of things, edge and fog computing environments. Softw. Pract. Exp. 47(9), 1275–1296 (2017)
8. Yousefpour, A., Ishigaki, G., Jue, J.P.: Fog computing: towards minimizing delay in the internet of things. In: IEEE International Conference on Edge Computing (EDGE), pp. 17–24. IEEE (2017)
9. Taneja, M., Davy, A.: Resource aware placement of IoT application modules in fog-cloud computing paradigm. In: IFIP/IEEE Symposium on Integrated Network and Service Management (IM), pp. 1222–1228. IEEE (2017)
10. Skarlat, O., Nardelli, M., Schulte, S., Borkowski, M., Leitner, P.: Optimized IoT service placement in the fog. SOCA 11(4), 427–443 (2017)
11. Mahmud, R., Ramamohanarao, K., Buyya, R.: Latency-aware application module management for fog computing environments. ACM Trans. Internet Technol. (TOIT) 19(1), 1–21 (2018)
12. Mahmud, R., Ramamohanarao, K., Buyya, R.: Edge affinity-based management of applications in fog computing environments. In: Proceedings of the 12th IEEE/ACM International Conference on Utility and Cloud Computing, pp. 61–70. ACM (2019)
13. Mahmud, R., Srirama, S.N., Ramamohanarao, K., Buyya, R.: Quality of experience (QoE)-aware placement of applications in fog computing environments. J. Parallel Distrib. Comput. 132, 190–203 (2019)
14. Mahmud, R., et al.: Quality of experience (QoE)-aware placement of applications in fog computing environments. J. Parallel Distrib. Comput. 132, 190–203 (2019)
15. Taneja, M., Davy, A.: Resource aware placement of IoT application modules in fog-cloud computing paradigm. In: 2017 IFIP/IEEE Symposium on Integrated Network and Service Management (IM). IEEE (2017)
16. Minh, Q.T., et al.: Toward service placement on Fog computing landscape. In: 2017 4th NAFOSTED Conference on Information and Computer Science. IEEE (2017)
17. Abrol, P., Gupta, S., Singh, S.: Nature-Inspired Metaheuristics in Cloud: A Review, pp. 13–34. Singapore, ICT Systems and Sustainability. Springer (2020)
18. Mseddi, A., et al.: Joint container placement and task provisioning in dynamic fog computing. IEEE Internet Things J. 6(6), 10028–10040 (2019)
19. Mishra, S.K., et al.: Sustainable service allocation using a metaheuristic technique in a fog server for industrial applications. IEEE Trans. Ind. Inform. 14(10), 4497–4506 (2018)
Advanced Binary Matrix-Based Frequent Pattern Mining Algorithm
Pranaya Pournamashi Patro and Rajiv Senapati
Abstract Frequent pattern mining (FPM) is one of the most important areas in the field of data mining. Several FPM algorithms have been proposed in the literature. In most approaches, the data set is scanned repeatedly in almost every step of the algorithm, which leads to high time complexity, so these algorithms may not be suitable for processing huge amounts of data. Hence, a novel FPM algorithm is proposed in this paper that improves efficiency by decreasing the time complexity compared to the classical frequent pattern mining algorithm. The proposed FPM algorithm converts the real-world data set into a binary matrix in a single scan; a join operation is then performed to obtain the candidate itemsets, and an AND operation is performed on the candidates to obtain the frequent itemsets. Furthermore, interesting association rules can be derived using the proposed algorithm.
Keywords Binary matrix · Data mining · FPM · Itemset
P. P. Patro, GIET University, Gunupur, Odisha 765022, India
R. Senapati (B), Department of CSE, SRM University, Amaravati, Andhra Pradesh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_27

1 Introduction
Data is the heart of business, and with recent progress in technology, a huge amount of data is generated within seconds. Extracting knowledge from these databases is an important data mining task. Frequent pattern mining is one of the most interesting areas and is widely used in data analysis, target marketing, recommendation systems, network optimization, medical diagnosis, and so on. Frequent pattern mining generates the itemsets and sequences that appear frequently in a data set. The FPM algorithm was proposed in 1993 by Agrawal [1]. A frequent pattern is defined using a minimum support threshold criterion. Let $I = \{i_1, i_2, \ldots, i_n\}$ be a set of items and $D = \{t_1, t_2, \ldots, t_m\}$ be the transaction data set, where each transaction $t_i$ ($i \in [1, m]$) is a subset of $I$. The support of an itemset is the number of occurrences of that itemset in $D$. If an itemset satisfies the minimum support count threshold, it is called a frequent itemset. FPM has been well studied in the literature. However, a major issue associated with FPM is its computation cost, which is due to the following.
• Repeated scanning of the transaction data set in every iteration of the algorithm.
• Generation of a large number of candidate itemsets by repeatedly scanning the data set.
Therefore, in this work, we propose a novel binary matrix-based FPM algorithm to find interesting frequent patterns. The main contributions of the paper are summarized as follows.
• We propose a binary matrix-based FPM algorithm to reduce complexity and increase efficiency.
• Unlike the Apriori algorithm, our proposed algorithm scans the data set only once to generate frequent itemsets and sequences.
• Further, the proposed algorithm can be used to find interesting associations.
The rest of this paper is organized as follows. Section 2 reviews previous work related to FPM algorithms. Section 3 presents our proposed binary matrix-based FPM algorithm for finding interesting frequent itemsets. Section 4 presents the numerical results. Finally, Sect. 5 concludes the paper.
2 Related Work
Frequent pattern mining is one of the most important and popular data mining techniques. Several algorithms for mining frequent patterns have been proposed, but there is still scope to reduce the time complexity of finding frequent patterns. The most famous algorithm for frequent pattern mining is the Apriori algorithm, proposed in [1]. In [2], an improved Apriori algorithm is proposed that scans only the required transactions in order to reduce the time complexity. In [3], lattice traversal techniques are proposed to identify all the long frequent itemsets and their subsets in minimal computational time. An advanced algorithm is proposed in [4], which uses small co-occurrence frequent itemset (COFI) trees to reduce the memory requirements of mining frequent itemsets. In [5], a maximal frequent itemset algorithm (MAFIA) is proposed that generates frequent itemsets by reducing the search space. A parallel disk-based approach is discussed in [6], which increases the efficiency and scalability of mining on large data sets.
The work reported in [7] uses a transaction weight concept to reduce the computational time as well as the number of transactions in the process of mining frequent itemsets. In [8], MapReduce and the Hadoop environment are used to propose the MH-Apriori algorithm, which reduces both the time and space complexity of generating frequent itemsets. An advanced algorithm based on the transaction weight concept is introduced in [9], which reduces computational time by reducing the number of transactions. An algorithm based on a compressed data set is described in [10]; it uses mark transaction compression to compress the data set before generating frequent itemsets, which reduces the time complexity of the classical algorithm. An improved compress matrix (ICM) algorithm is discussed in [11]. It compresses the matrix by eliminating redundant terms and deleting unnecessary itemsets. In [12], an optimization algorithm based on a bit set matrix is proposed, which scans the data set only once to generate the bit set matrix. The work reported in [13] optimizes the pruning and transaction reduction techniques to decrease the number of scans needed to generate candidate itemsets. In [14], an improved Apriori algorithm is proposed that uses the direct hashing and pruning technique. In [15], an improved Apriori algorithm based on a heuristic association rule mining (ARM) approach is proposed, which reduces computational time and enhances the performance of the algorithm. These studies show that extensive work has been done on frequent pattern mining. In real-time scenarios with huge volumes of data, however, finding frequent patterns while generating few candidates remains a challenging task. Therefore, in this paper, a novel FPM algorithm is proposed that improves efficiency by decreasing the time complexity compared to the classical frequent pattern mining algorithm.
3 Binary Matrix-Based FPM Algorithm
In this section, we present our proposed FPM algorithm. The algorithm is composed of three phases. In the first phase, it scans the data set and converts it into a binary matrix $P_{m \times n}$, where $m$ is the number of transactions and $n$ is the number of items. An entry is 1 if the item is present in the corresponding transaction and 0 otherwise. This phase also finds the frequent 1-itemsets and prunes the columns of the matrix that do not meet the minimum support threshold. In the second phase, a join operation is performed on the binary matrix to generate candidate itemsets. In the third phase, an AND operation is carried out to find the frequent itemsets, and the columns below the minimum support threshold are pruned. The process is repeated until all columns have been pruned. Finally, the itemsets whose frequency is greater than or equal to the minimum support threshold are the frequent itemsets. The proposed FPM algorithm thus generates frequent itemsets over several iterations using a binary matrix. The detailed procedure is presented in Algorithm 1.
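The first phase can be sketched in Python as follows. This is an illustrative sketch only: the helper names are hypothetical, the three transactions are the first three rows of the data set used in Sect. 4, and the threshold of 2 is chosen just for this small example.

```python
def to_binary_matrix(transactions, items):
    """Return P where P[i][j] is 1 if transaction i contains item j, else 0."""
    return [[1 if item in t else 0 for item in items] for t in transactions]


def column_supports(matrix, items):
    """Support count of each item, obtained as the sum of its column."""
    return {item: sum(row[j] for row in matrix) for j, item in enumerate(items)}


items = ["I1", "I2", "I3", "I4", "I5", "I6"]
transactions = [
    {"I1", "I2", "I3", "I6"},
    {"I1", "I2", "I3", "I5", "I6"},
    {"I2", "I3", "I5", "I6"},
]

P = to_binary_matrix(transactions, items)
supports = column_supports(P, items)

min_support = 2  # illustrative threshold for this three-row example
frequent_items = [item for item in items if supports[item] >= min_support]
```

After this single pass the transactions are no longer needed; all later support counts can be read from the columns of `P`.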
Algorithm 1 Binary Matrix based FPM Algorithm
Input: D, Sc_min
Output: L
Scan D.
Construct a matrix P_{m×n}.
for each i, i ≤ m
    for each j, j ≤ n
        if the ith transaction contains the jth item then
            p[i][j] = 1
        else
            p[i][j] = 0
        end if
    end for
end for
for each j, j ≤ n
    sup = 0
    for each i, i ≤ m
        sup = sup + p[i][j]
    end for
    if sup ≥ Sc_min
        add item j to L1
    else
        drop column j from p
    end if
end for
for each k, k ≥ 2 and L_{k−1} ≠ ∅
    C_k = join(L_{k−1})
    for each t, t ∈ C_k
        supp = AndOperation(t)
        if supp ≥ Sc_min
            add t to L_k
        end if
    end for
end for
L = L1 ∪ L2 ∪ ... ∪ Lk
Exit
Algorithm 1 is executed as follows.
• In step 1, the data set D is scanned.
• In step 2, a binary matrix $P_{m \times n}$ is created, where m and n represent the numbers of transactions and items, respectively.
• In step 3, for each item j that belongs to transaction i, 1 is inserted at location P[i][j]; otherwise 0 is inserted.
• In step 4, for each item j, the values of column j are added over all transactions i to obtain the support count, which is stored in a variable sup. The frequent 1-itemsets are found by comparing each sup value with the minimum support count (i.e., $Sc_{min}$): if sup is greater than or equal to $Sc_{min}$, the item is added to the frequent 1-itemsets (i.e., $L_1$); otherwise its column is pruned.
Algorithm 2 join()
Input: L_{k−1}
Output: C_k
C_k = ∅
for each itemset l1 ∈ L_{k−1}
    for each itemset l2 ∈ L_{k−1}
        if |l1 ∪ l2| = k
            C_k = C_k ∪ {l1 ∪ l2}
        end if
    end for
end for
return C_k
Algorithm 3 AndOperation()
Input: t, t ∈ C_k
Output: sup1
Let sup1 = 0
for each r, r ≤ m
    bit = 1
    for each i, i ∈ t
        bit = bit AND p[r][i]
    end for
    sup1 = sup1 + bit
end for
return sup1
• In step 5, for each k ≥ 2, join() is called with the frequent (k−1)-itemsets (i.e., $L_{k-1}$) as its argument and returns the candidate k-itemsets (i.e., $C_k$). Each itemset t of $C_k$ is passed to AndOperation() to calculate its support count (i.e., supp), which is then compared with the minimum support count (i.e., $Sc_{min}$). If supp is greater than or equal to $Sc_{min}$, t is added to the list of frequent k-itemsets (i.e., $L_k$); otherwise it is pruned. This step continues until $L_{k-1} = \emptyset$.
• In step 6, the complete set of frequent itemsets (i.e., L) is obtained by combining $L_1, L_2, \ldots, L_k$.
• In step 7, the algorithm terminates.
The join() operation, which generates the candidate itemsets, is presented in Algorithm 2. The AndOperation(), which evaluates the support count of an individual itemset, is presented in Algorithm 3.
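The steps above can be sketched end to end in Python. This is one possible reading rather than the authors' code. Each column of the binary matrix is stored as an integer bitmask (bit i is set when transaction i contains the item), so the AND operation becomes a single `&`. The join rule used here, combining two frequent (k−1)-itemsets whenever their union has exactly k items, is an assumption inferred from the worked example in Sect. 4. The function name and the toy data are hypothetical.

```python
from itertools import combinations


def mine_frequent_itemsets(transactions, min_support):
    """Binary-matrix FPM sketch: one scan builds the columns, then join + AND."""
    # Single scan: bit i of column[item] is 1 when transaction i contains item.
    column = {}
    for i, t in enumerate(transactions):
        for item in t:
            column[item] = column.get(item, 0) | (1 << i)

    def support(mask):
        # Number of 1 bits = number of transactions containing the itemset.
        return bin(mask).count("1")

    # Frequent 1-itemsets: keep only columns that reach the threshold.
    level = {frozenset([item]): mask for item, mask in column.items()
             if support(mask) >= min_support}
    frequent = {}
    k = 2
    while level:
        frequent.update({s: support(m) for s, m in level.items()})
        next_level = {}
        for (a, mask_a), (b, mask_b) in combinations(level.items(), 2):
            candidate = a | b                      # join
            if len(candidate) == k and candidate not in next_level:
                mask = mask_a & mask_b             # AND of the stored columns
                if support(mask) >= min_support:
                    next_level[candidate] = mask
        level, k = next_level, k + 1
    return frequent


# Toy data (hypothetical): three items spread over four transactions.
toy = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b", "c"}]
toy_result = mine_frequent_itemsets(toy, min_support=2)
```

Because the column of a k-itemset is the AND of the columns of the two (k−1)-itemsets it was joined from, the original transactions are read only once, in the first loop.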
4 Result and Discussion
A transaction data set D is considered in this work, which consists of ten transactions (i.e., {T1, T2, T3, T4, T5, T6, T7, T8, T9, T10}) and six items (i.e., {I1, I2, I3, I4, I5, I6}), as presented in Table 1. Let the minimum support count be 4. First, the FPM algorithm scans the transaction data set once in order to convert it into a binary matrix (i.e., $P_{m \times n}$). Each row of the binary matrix corresponds to a transaction, and each column corresponds to an item. In the binary matrix, 1 indicates the presence of an item in a particular transaction, and 0 indicates its absence. For example, in the transaction data set D, I1 is present in transaction T1, so we insert 1 at location P[T1][I1]; I4 is not present in transaction T1, so we insert 0 at location P[T1][I4]. The binary matrix constructed in this way is presented in Table 1. The binary matrix generated from D is taken as the candidate 1-itemset matrix. To identify the frequent 1-itemsets, we add up each column of the candidate 1-itemset matrix, which gives the support count of the corresponding item. For example, in Table 2, all entries in each column are added to calculate the support count (i.e., sup). The frequent 1-itemsets are found by comparing the support count of each item with the minimum support count: if the support count of an item is less than the minimum support count, the item is pruned, and the remaining items are the frequent 1-itemsets. For example, here the minimum support count is 4 and the support count of item I4 is 3, which is less than the minimum support count, so I4 is pruned and L1 = {I1, I2, I3, I5, I6} are the frequent 1-itemsets. The candidate 1-itemsets and frequent 1-itemsets with their transactions are listed in Table 2.
Table 1 Transaction data set D and its binary representation

| TID | Items | I1 | I2 | I3 | I4 | I5 | I6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | I1, I2, I3, I6 | 1 | 1 | 1 | 0 | 0 | 1 |
| T2 | I1, I2, I3, I5, I6 | 1 | 1 | 1 | 0 | 1 | 1 |
| T3 | I2, I3, I5, I6 | 0 | 1 | 1 | 0 | 1 | 1 |
| T4 | I2, I4, I5, I6 | 0 | 1 | 0 | 1 | 1 | 1 |
| T5 | I1, I3, I5, I6 | 1 | 0 | 1 | 0 | 1 | 1 |
| T6 | I2, I5, I6 | 0 | 1 | 0 | 0 | 1 | 1 |
| T7 | I1, I2, I3, I5, I6 | 1 | 1 | 1 | 0 | 1 | 1 |
| T8 | I1, I2, I4, I6 | 1 | 1 | 0 | 1 | 0 | 1 |
| T9 | I2, I3, I4, I5, I6 | 0 | 1 | 1 | 1 | 1 | 1 |
| T10 | I2, I3, I5, I6 | 0 | 1 | 1 | 0 | 1 | 1 |
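Table 2 Candidate and frequent 1-itemset matrix

Candidate 1-itemset matrix

| TID | I1 | I2 | I3 | I4 | I5 | I6 |
| --- | --- | --- | --- | --- | --- | --- |
| T1 | 1 | 1 | 1 | 0 | 0 | 1 |
| T2 | 1 | 1 | 1 | 0 | 1 | 1 |
| T3 | 0 | 1 | 1 | 0 | 1 | 1 |
| T4 | 0 | 1 | 0 | 1 | 1 | 1 |
| T5 | 1 | 0 | 1 | 0 | 1 | 1 |
| T6 | 0 | 1 | 0 | 0 | 1 | 1 |
| T7 | 1 | 1 | 1 | 0 | 1 | 1 |
| T8 | 1 | 1 | 0 | 1 | 0 | 1 |
| T9 | 0 | 1 | 1 | 1 | 1 | 1 |
| T10 | 0 | 1 | 1 | 0 | 1 | 1 |
| sup | 5 | 9 | 7 | 3 | 8 | 10 |

Frequent 1-itemset matrix

| TID | I1 | I2 | I3 | I5 | I6 |
| --- | --- | --- | --- | --- | --- |
| T1 | 1 | 1 | 1 | 0 | 1 |
| T2 | 1 | 1 | 1 | 1 | 1 |
| T3 | 0 | 1 | 1 | 1 | 1 |
| T4 | 0 | 1 | 0 | 1 | 1 |
| T5 | 1 | 0 | 1 | 1 | 1 |
| T6 | 0 | 1 | 0 | 1 | 1 |
| T7 | 1 | 1 | 1 | 1 | 1 |
| T8 | 1 | 1 | 0 | 0 | 1 |
| T9 | 0 | 1 | 1 | 1 | 1 |
| T10 | 0 | 1 | 1 | 1 | 1 |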
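Table 3 Candidate 2-itemset matrix

| TID | I1 I2 | I1 I3 | I1 I5 | I1 I6 | I2 I3 | I2 I5 | I2 I6 | I3 I5 | I3 I6 | I5 I6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 |
| T2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| T3 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| T4 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
| T5 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 |
| T6 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
| T7 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| T8 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| T9 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| T10 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |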
In order to find the candidate 2-itemsets, a join operation is performed between the items of the frequent 1-itemsets $L_1$. The two items that take part in the join operation must not have any common element. The candidate 2-itemsets, i.e., {(I1 I2), (I1 I3), (I1 I5), (I1 I6), (I2 I3), (I2 I5), (I2 I6), (I3 I5), (I3 I6), (I5 I6)}, with their transactions are listed in Table 3. The frequent 2-itemsets are generated by performing an AND operation between the columns of the two items in each candidate 2-itemset. The AND operation for the candidate itemsets (I1 I2), (I1 I3), (I1 I5), (I1 I6), (I2 I3), (I2 I5), (I2 I6), (I3 I5), (I3 I6), (I5 I6) is shown below, where each term is the product of the two column entries for one transaction (T1 to T10).
(I1.I2) = [(1.1) + (1.1) + (0.1) + (0.1) + (1.0) + (0.1) + (1.1) + (1.1) + (0.1) + (0.1)] = 4
(I1.I3) = [(1.1) + (1.1) + (0.1) + (0.0) + (1.1) + (0.0) + (1.1) + (1.0) + (0.1) + (0.1)] = 4
(I1.I5) = [(1.0) + (1.1) + (0.1) + (0.1) + (1.1) + (0.1) + (1.1) + (1.0) + (0.1) + (0.1)] = 3
(I1.I6) = [(1.1) + (1.1) + (0.1) + (0.1) + (1.1) + (0.1) + (1.1) + (1.1) + (0.1) + (0.1)] = 5
(I2.I3) = [(1.1) + (1.1) + (1.1) + (1.0) + (0.1) + (1.0) + (1.1) + (1.0) + (1.1) + (1.1)] = 6
(I2.I5) = [(1.0) + (1.1) + (1.1) + (1.1) + (0.1) + (1.1) + (1.1) + (1.0) + (1.1) + (1.1)] = 7
(I2.I6) = [(1.1) + (1.1) + (1.1) + (1.1) + (0.1) + (1.1) + (1.1) + (1.1) + (1.1) + (1.1)] = 9
(I3.I5) = [(1.0) + (1.1) + (1.1) + (0.1) + (1.1) + (0.1) + (1.1) + (0.0) + (1.1) + (1.1)] = 6
(I3.I6) = [(1.1) + (1.1) + (1.1) + (0.1) + (1.1) + (0.1) + (1.1) + (0.1) + (1.1) + (1.1)] = 7
(I5.I6) = [(0.1) + (1.1) + (1.1) + (1.1) + (1.1) + (1.1) + (1.1) + (0.1) + (1.1) + (1.1)] = 8
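The ten sums above can be checked mechanically. The following Python sketch takes the columns of the frequent items from Table 1 and computes the support of every pair as the sum of element-wise ANDs; the variable names are illustrative.

```python
from itertools import combinations

# Columns of the binary matrix in Table 1 (rows T1..T10), frequent items only.
columns = {
    "I1": [1, 1, 0, 0, 1, 0, 1, 1, 0, 0],
    "I2": [1, 1, 1, 1, 0, 1, 1, 1, 1, 1],
    "I3": [1, 1, 1, 0, 1, 0, 1, 0, 1, 1],
    "I5": [0, 1, 1, 1, 1, 1, 1, 0, 1, 1],
    "I6": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
}

# Support of a pair = number of transactions in which both columns hold 1.
pair_support = {
    (a, b): sum(x & y for x, y in zip(columns[a], columns[b]))
    for a, b in combinations(columns, 2)
}

min_support = 4
frequent_pairs = [pair for pair, sup in pair_support.items() if sup >= min_support]
```

With a minimum support count of 4, only the pair (I1, I5) falls below the threshold.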
The support count of each itemset is compared with the minimum support count to obtain the frequent 2-itemsets. For example, in Table 3, the support count of (I1 I5) is less than the minimum support count, so the itemset (I1 I5) is pruned, and L2 = {(I1 I2), (I1 I3), (I1 I6), (I2 I3), (I2 I5), (I2 I6), (I3 I5), (I3 I6), (I5 I6)} is the set of frequent 2-itemsets. The frequent 2-itemsets with their transactions are listed in Table 4. To generate the candidate 3-itemsets, a join operation is performed between the frequent 2-itemsets of $L_2$. Two itemsets are joined when they share exactly one item, so that their union contains three items. The AND operation between the joined itemsets is shown below.
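Table 4 Frequent 2-itemset matrix

| TID | I1 I2 | I1 I3 | I1 I6 | I2 I3 | I2 I5 | I2 I6 | I3 I5 | I3 I6 | I5 I6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 |
| T2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| T3 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| T4 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
| T5 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 |
| T6 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
| T7 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| T8 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| T9 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| T10 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |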
(I1I2).(I1I3) = [(1.1) + (1.1) + (0.0) + (0.0) + (0.1) + (0.0) + (1.1) + (1.0) + (0.0) + (0.0)] = 3
(I1I2).(I2I5) = [(1.0) + (1.1) + (0.1) + (0.1) + (0.0) + (0.1) + (1.1) + (1.0) + (0.1) + (0.1)] = 2
(I1I2).(I2I6) = [(1.1) + (1.1) + (0.1) + (0.1) + (0.0) + (0.1) + (1.1) + (1.1) + (0.1) + (0.1)] = 4
(I1I3).(I3I5) = [(1.0) + (1.1) + (0.1) + (0.0) + (1.1) + (0.0) + (1.1) + (0.0) + (0.1) + (0.1)] = 3
(I1I3).(I3I6) = [(1.1) + (1.1) + (0.1) + (0.0) + (1.1) + (0.0) + (1.1) + (0.0) + (0.1) + (0.1)] = 4
(I1I6).(I5I6) = [(1.0) + (1.1) + (0.1) + (0.1) + (1.1) + (0.1) + (1.1) + (1.0) + (0.1) + (0.1)] = 3
(I2I3).(I3I5) = [(1.0) + (1.1) + (1.1) + (0.0) + (0.1) + (0.0) + (1.1) + (0.0) + (1.1) + (1.1)] = 5
(I2I3).(I3I6) = [(1.1) + (1.1) + (1.1) + (0.0) + (0.1) + (0.0) + (1.1) + (0.0) + (1.1) + (1.1)] = 6
(I2I5).(I5I6) = [(0.0) + (1.1) + (1.1) + (1.1) + (0.1) + (1.1) + (1.1) + (0.0) + (1.1) + (1.1)] = 7
(I3I5).(I5I6) = [(0.0) + (1.1) + (1.1) + (0.1) + (1.1) + (0.1) + (1.1) + (0.0) + (1.1) + (1.1)] = 6
The support count of each itemset is compared with the minimum support count to obtain the frequent 3-itemsets. For example, in Table 5, the support counts of (I1 I2 I3), (I1 I2 I5), (I1 I3 I5), and (I1 I5 I6) are less than the minimum support count, so these itemsets are pruned, and L3 = {(I1 I2 I6), (I1 I3 I6), (I2 I3 I5), (I2 I3 I6), (I2 I5 I6), (I3 I5 I6)} is the list of frequent 3-itemsets. The frequent 3-itemsets with their transactions are listed in Table 6. To find the candidate 4-itemsets, a join operation is performed between the frequent 3-itemsets. Two itemsets are joined when they share exactly two items, so that their union contains four items. The AND operation between the joined itemsets is shown below.
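Table 5 Candidate-3 itemset matrix

| TID | I1 I2 I3 | I1 I2 I5 | I1 I2 I6 | I1 I3 I5 | I1 I3 I6 | I1 I5 I6 | I2 I3 I5 | I2 I3 I6 | I2 I5 I6 | I3 I5 I6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| T1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| T2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| T3 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| T4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| T5 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 |
| T6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| T7 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| T8 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| T9 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| T10 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| Supp | 3 | 2 | 4 | 3 | 4 | 3 | 5 | 6 | 7 | 6 |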
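Table 6 Frequent-3 itemset matrix

| TID | I1 I2 I6 | I1 I3 I6 | I2 I3 I5 | I2 I3 I6 | I2 I5 I6 | I3 I5 I6 |
| --- | --- | --- | --- | --- | --- | --- |
| T1 | 1 | 1 | 0 | 1 | 0 | 0 |
| T2 | 1 | 1 | 1 | 1 | 1 | 1 |
| T3 | 0 | 0 | 1 | 1 | 1 | 1 |
| T4 | 0 | 0 | 0 | 0 | 1 | 0 |
| T5 | 0 | 1 | 0 | 0 | 0 | 1 |
| T6 | 0 | 0 | 0 | 0 | 1 | 0 |
| T7 | 1 | 1 | 1 | 1 | 1 | 1 |
| T8 | 1 | 0 | 0 | 0 | 0 | 0 |
| T9 | 0 | 0 | 1 | 1 | 1 | 1 |
| T10 | 0 | 0 | 1 | 1 | 1 | 1 |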
Table 7 Candidate and frequent 4-itemset matrix
Candidate 4-itemsets
Itemset        T1  T2  T3  T4  T5  T6  T7  T8  T9  T10  Sup
I1 I2 I3 I6    1   1   0   0   0   0   1   0   0   0    3
I1 I2 I5 I6    0   1   0   0   0   0   1   0   0   0    2
I1 I3 I5 I6    0   1   0   0   1   0   1   0   0   0    3
I2 I3 I5 I6    0   1   1   0   0   0   1   0   1   1    5
Frequent 4-itemsets
Itemset        T1  T2  T3  T4  T5  T6  T7  T8  T9  T10
I2 I3 I5 I6    0   1   1   0   0   0   1   0   1   1
(I1 I2 I3 I6) = (I1 I2 I6).(I1 I3 I6) = [(1.1) + (1.1) + (0.0) + (0.0) + (0.1) + (0.0) + (1.1) + (1.0) + (0.0) + (0.0)] = 3
(I1 I2 I5 I6) = (I1 I2 I6).(I2 I5 I6) = [(1.0) + (1.1) + (0.1) + (0.1) + (0.0) + (0.1) + (1.1) + (1.0) + (0.1) + (0.1)] = 2
(I1 I3 I5 I6) = (I1 I3 I6).(I3 I5 I6) = [(1.0) + (1.1) + (0.1) + (0.0) + (1.1) + (0.0) + (1.1) + (0.0) + (0.1) + (0.1)] = 3
(I2 I3 I5 I6) = (I2 I3 I6).(I2 I5 I6) = [(1.0) + (1.1) + (1.1) + (0.1) + (0.0) + (0.1) + (1.1) + (0.0) + (1.1) + (1.1)] = 5
The support count of each itemset is compared with the minimum support count to obtain the frequent 4-itemsets. For example, in Table 7, the support counts of (I1 I2 I3 I6), (I1 I2 I5 I6), and (I1 I3 I5 I6) are less than the minimum support count, so these itemsets are pruned, and L4 = {(I2 I3 I5 I6)} is added to the list of frequent 4-itemsets. Table 7 represents both the candidate 4-itemset matrix and the frequent 4-itemset matrix. No further frequent itemsets can be found, because a candidate 5-itemset cannot be created from a single frequent 4-itemset. Finally, the set of frequent itemsets is L = {I1, I2, I3, I5, I6, (I1 I2), (I1 I3), (I1 I6), (I2 I3), (I2 I5), (I2 I6), (I3 I5), (I3 I6), (I5 I6), (I1 I2 I6), (I1 I3 I6), (I2 I3 I5), (I2 I3 I6), (I2 I5 I6), (I3 I5 I6), (I2 I3 I5 I6)}.
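The join and AND steps of this last level can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the columns are copied from Table 6, the minimum support count of 4 is inferred from which itemsets are pruned in Tables 5 and 7, and the names `FREQUENT_3`, `MIN_SUPPORT`, and `join_and_count` are hypothetical. The join used here is a simplified one that pairs any two 3-itemsets whose union has exactly four items.

```python
from itertools import combinations

# Columns of the frequent 3-itemset matrix (Table 6): one bit per transaction T1..T10.
FREQUENT_3 = {
    ("I1", "I2", "I6"): [1, 1, 0, 0, 0, 0, 1, 1, 0, 0],
    ("I1", "I3", "I6"): [1, 1, 0, 0, 1, 0, 1, 0, 0, 0],
    ("I2", "I3", "I5"): [0, 1, 1, 0, 0, 0, 1, 0, 1, 1],
    ("I2", "I3", "I6"): [1, 1, 1, 0, 0, 0, 1, 0, 1, 1],
    ("I2", "I5", "I6"): [0, 1, 1, 1, 0, 1, 1, 0, 1, 1],
    ("I3", "I5", "I6"): [0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
}
MIN_SUPPORT = 4  # assumption: inferred from the pruning shown in Tables 5 and 7


def join_and_count(frequent, k):
    """Join pairs of frequent (k-1)-itemsets into k-itemsets and AND their columns."""
    candidates = {}
    for (items_a, col_a), (items_b, col_b) in combinations(frequent.items(), 2):
        union = tuple(sorted(set(items_a) | set(items_b)))
        if len(union) != k or union in candidates:
            continue  # not a k-itemset, or this candidate was already generated
        # Bitwise AND per transaction: 1 only if both itemsets occur in it.
        candidates[union] = [x & y for x, y in zip(col_a, col_b)]
    return candidates


candidates_4 = join_and_count(FREQUENT_3, 4)
support_4 = {items: sum(column) for items, column in candidates_4.items()}
frequent_4 = {items for items, count in support_4.items() if count >= MIN_SUPPORT}
print(support_4)
print(frequent_4)
```

Because every candidate column is derived from columns already in memory, the original transaction data set is never rescanned at this level, which is the property the algorithm relies on.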
5 Conclusion
In this paper, we have introduced a new FPM algorithm for computing frequent itemsets. An experiment is carried out on a transaction data set D, which contains transactions and items. The proposed algorithm transforms the transaction data set into a binary matrix, after which no further scanning of the data set is required. The Join and AndOperation functions are then applied to find the frequent itemsets. The proposed FPM scales well because it reduces the number of scans: converting the transaction data set D into a binary matrix improves performance. Compared with the classical FPM algorithm, the proposed FPM reduces complexity and increases efficiency by scanning the data set only once.
Sentiment Analysis Using Semi Supervised Machine Learning Technique
Abinash Tripathy and Alok Kumar Jena
Abstract The sentiments of users are expressed as views or comments in favor of or against an item such as a product or a movie. These reviews may be labeled or unlabeled. Labeled reviews are easier to process than unlabeled ones. Using a semi-supervised machine learning technique, the unlabeled reviews can be labeled: with the help of a small amount of labeled reviews, a large volume of unlabeled reviews can be labeled. In this paper, a step-by-step approach is adopted to label the unlabeled dataset. To perform this task, the Support Vector Machine (SVM) technique is used. To assess the results at each step, the performance of the technique is evaluated using parameters such as precision, recall, and accuracy, so that the overall process can move forward.
Keywords Semi supervised machine learning technique · Labeled review · Unlabeled reviews · Support vector machine
A. Tripathy (B) Raghu Engineering College, Visakhapatnam, India e-mail: [email protected]
A. K. Jena Gandhi Institute of Education and Technology, Gunupur, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_28
1 Introduction
With the increase in online trading sites, new customers often rely on past reviews provided by users about the products of their choice, and based on those reviews they decide whether or not to continue with the trade. These reviews are mostly unstructured and in text format, so they need to be processed properly to obtain meaningful information. The reviews can be labeled into two polarity groups, i.e., positive and negative. Once the reviews are labeled, processing them is quite easy. In reality, however, the reviews are not always labeled, and labeling them is a difficult task. Broadly speaking, the job of sentiment analysis is to analyze users' opinions or comments on an organization, the products it develops, and its attributes, to obtain meaningful information that can be used for further analysis [1]. Sentiment analysis can be carried out at different levels of granularity, and the processing of the reviews differs with the granularity level. According to Feldman, the granularity levels are as follows [2]:
• Document level sentiment analysis: This process considers the whole review as a single unit for processing and classifies it into one of the polarity groups.
• Sentence level sentiment analysis: This approach considers each sentence as a single unit, then combines the sentiments of the sentences in the document into a single unit and expresses the sentiment of the document as one of the polarity groups.
• Aspect level sentiment analysis: When the sentiment for an item or product is analyzed, different aspects come into the picture. For example, in the sentiment analysis of a TV, aspects such as size, type, sound quality, and features are considered. In this type of sentiment analysis, the reviews are separated based on these aspects, and each review is then considered as a document for analysis.
In this paper, document level sentiment classification is carried out on a dataset used frequently by different authors, i.e., the movie review polarity dataset [3]. Machine learning techniques can be classified into mainly three types. The most commonly used machine learning techniques are as follows:
• Supervised learning: This type of learning works on a dataset labeled with polarity values such as positive or negative. Using different machine learning techniques, the polarity of the reviews is predicted and then compared with the existing polarity. If the actual polarity and the predicted polarity of the reviews are the same, the accuracy of the system increases [4]. In supervised learning, the main task of the system is to classify the reviews into different polarity levels.
• Unsupervised learning: This type of learning works on unlabeled reviews, so it cannot classify the reviews. Instead, based on similar characteristics or properties of the reviews, this learning process groups similar reviews into a cluster [5]. In the supervised and unsupervised approaches, the number of classes and the number of clusters are the same.
• Semi-supervised learning: This approach is comparatively new compared with the previous two approaches. Labeled reviews are far less available than unlabeled ones. Thus, this approach attempts to label the unlabeled reviews with the help of a small amount of labeled data [6].
The present study deals with the semi-supervised learning approach.
The paper is organized as follows: Sect. 2 discusses the literature survey on the semi-supervised approach. Section 3 provides information about the methodologies used. Section 4 presents the proposed approach and result analysis. Section 5 concludes the paper.
2 Literature Survey
To address the rating inference task in sentiment analysis, Goldberg and Zhu have suggested a graph-based semi-supervised technique. They generated graphs from both unlabeled and labeled reviews. From these graphs they derived assumptions that suggest a rating function for each graph using the principle of optimization [7]. Sindhwani and Melville have collected lexical information from unlabeled reviews using prediction algorithms [8]. Based on this information, they generated a bipartite graph for the documents and the words in them. For the analysis, they considered the words with higher sentiment values, and they analyzed a large unlabeled dataset. Anand and Naorem have suggested a semi-supervised, aspect-based approach to analyze movie reviews [9]. They adopted a two-phase approach. First, they discarded the labels of the reviews and performed binary classification. Second, they considered the aspects of the words based on manually defined clues and rules. Silva et al. have presented a survey of semi-supervised approaches to tweet sentiment analysis [10]. They described three types of approach: topic based, i.e., based on the topic on which the reviews are presented; graph based, i.e., a graph is generated from the reviews; and wrapper based, which consists of both co-training and self-learning from the reviews. Khan et al. have proposed a novel approach, semi-supervised feature weighting and intelligent model selection (SWIMS), to obtain the weights of the features present in the reviews, which they collected from SentiWordNet [11]. Their approach is mainly based on lexical information collected from the reviews; finally, they use the SVM technique together with the feature weights to classify the reviews. Zhai et al. have proposed a framework of self-supervised semi-supervised learning and used it to derive two novel semi-supervised classification methods [12]. They compared their proposed approach with baselines and existing approaches and found that the results obtained are better than those of the older ones. Zhao et al. have extended the broad learning system used in the supervised approach to a semi-supervised broad learning approach [13]. They extracted features from labeled and unlabeled data and combined them with a manifold regularization framework to construct a Laplacian matrix. To generate the objective function, they combined the features collected from the reviews, the Laplacian matrix, and the node enhancement principle, which solved the issues related to semi-supervised classification. Van Engelen and Hoos have presented an overview of semi-supervised learning and a new taxonomy for semi-supervised approaches, which differs from the primary objective approach and the unlabeled data processing system [14]. Using this approach, they have tried to resolve the issue of performance degradation caused by unlabeled data.
3 Methodology Used
3.1 Types of Sentiment Classification
Depending on the number of classes, sentiment classification may be categorized into binary classification or multi-class classification [15].
• Binary classification: In this type of classification, the reviews are classified into two classes, namely positive and negative. This type of information is mainly preferred when one wants to know whether customers prefer the product or not.
• Multi-class classification: This type of classification is adopted when a ranking is provided for a product. In this case, more than two classes are considered, such as below average, average, good, very good, and excellent.
3.2 Dataset Used
In this study, the IMDb dataset is considered for analysis, as it has both labeled and unlabeled data available, which helps to perform semi-supervised classification. The details of the dataset are as follows:
aclIMDb dataset: The acl Internet Movie Database (IMDb) dataset is freely available and has been used for sentiment analysis by different authors [16]. The dataset has separate sets of reviews for training and testing. The training and testing sets each have 12,500 positive and 12,500 negative reviews. The semi-supervised approach also needs unlabeled reviews, and the dataset includes a collection of 50,000 of them.
3.3 Machine Learning Technique Used
In this paper, the Support Vector Machine (SVM) classifier is used for classification instead of other classifiers, as most authors have found that SVM shows better results than other machine learning techniques [15].
Support Vector Machine Classifier (SVM): This classifier is one of the most commonly used classifiers for sentiment analysis. In this method, the decision boundaries are hyperplanes that separate one kind of review from the others. In the case of binary classification, the hyperplane attempts to keep the separation between the vectors generated from the documents as large as possible. The SVM solves the optimization problem below for a training set of labeled pairs $(a_i, b_i)$, $i = 1, 2, \ldots, l$, where $a_i \in \mathbb{R}^n$ and $b \in \{1, -1\}^l$ [17]:
\[
\min_{w,\, c,\, \gamma} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \gamma_i
\quad \text{subject to} \quad b_i \left( w^T \xi(a_i) + c \right) \ge 1 - \gamma_i, \quad \gamma_i \ge 0. \tag{1}
\]
Here $w$ is the weight vector, $c$ is the bias, $\gamma_i$ are the slack variables, $C > 0$ is the penalty parameter, and the training vector $a_i$ is mapped to a higher dimensional space by $\xi$. Like other machine learning algorithms, SVM does not accept reviews in text format. The text reviews are transformed into numeric vectors using different approaches and given as input to SVM for classification. After the transformation, a scaling step keeps the vector values within the range [0, 1].
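To make the terms of Eq. (1) concrete, the following minimal sketch evaluates the objective for a given weight vector `w` and bias `c`. It assumes a linear SVM, i.e., the mapping ξ is taken to be the identity, and it uses hypothetical toy data; the function name `svm_primal_objective` is illustrative and does not come from the paper or from any library. Each slack γ_i is the smallest non-negative value that satisfies the constraint for sample i.

```python
def svm_primal_objective(w, c, samples, C):
    """Value of the soft-margin objective in Eq. (1) for a linear SVM.

    samples is a list of (a_i, b_i) pairs with b_i in {+1, -1}.
    Assumption: the mapping xi is the identity (linear kernel).
    """
    slacks = []
    for a, b in samples:
        margin = b * (sum(wj * aj for wj, aj in zip(w, a)) + c)
        # Smallest gamma_i >= 0 with b_i (w^T a_i + c) >= 1 - gamma_i.
        slacks.append(max(0.0, 1.0 - margin))
    objective = 0.5 * sum(wj * wj for wj in w) + C * sum(slacks)
    return objective, slacks


# Hypothetical toy data: two well-separated points and one inside the margin.
toy_samples = [([2.0, 0.0], 1), ([-2.0, 0.0], -1), ([0.5, 0.0], 1)]
objective, slacks = svm_primal_objective(w=[1.0, 0.0], c=0.0, samples=toy_samples, C=1.0)
print(objective, slacks)
```

Only the third point, which lies inside the margin, contributes a non-zero slack; a larger `C` would penalize such points more heavily relative to the margin term.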
3.4 Matrix Transformation of Text Reviews
The reviews provided by reviewers are textual in nature. These reviews are transformed into a suitable matrix format, which the machine learning techniques take as input. The following two functions are mainly used to convert the text reviews into matrix format:
1. CountVectorizer (CV) function: This function transforms the reviews into matrix form [18]. As used here, it does not consider the frequency of the words, i.e., how often each word occurs, so the transformed matrix is a sparse matrix containing only 1 and 0.
2. Term Frequency-Inverse Document Frequency (TF-IDF) function: Along with the occurrence of the words, their frequency also plays an important role in assigning a sentiment value to a text review [18]. The term frequency is the frequency of occurrence of a word in a particular review, and the inverse document frequency is derived from the occurrence of the word across the whole dataset.
In this paper, the TF-IDF function is used for analysis, as the frequency of occurrence of the words also plays an important role in sentiment classification.
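As an illustration of the TF-IDF idea, the sketch below computes the textbook weight tf × log(N / df) for a few hypothetical toy reviews in plain Python. It is not the function used in the paper: library implementations such as scikit-learn's typically add smoothing and normalization, so their numbers will differ. The name `tf_idf_matrix` is illustrative.

```python
import math
from collections import Counter


def tf_idf_matrix(reviews):
    """Return (vocabulary, rows); rows[i][j] is the weight of term j in review i."""
    tokenized = [review.lower().split() for review in reviews]
    n_reviews = len(tokenized)
    # Document frequency: number of reviews containing each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vocabulary = sorted(doc_freq)
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([
            # Term frequency in this review times inverse document frequency.
            (counts[term] / len(tokens)) * math.log(n_reviews / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, rows


reviews = ["good movie", "bad movie", "great movie plot"]  # hypothetical toy reviews
vocabulary, rows = tf_idf_matrix(reviews)
for review, row in zip(reviews, rows):
    print(review, [round(weight, 3) for weight in row])
```

A word such as "movie", which occurs in every review, receives a weight of zero, while words that distinguish one review from the others receive positive weights.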
3.5 Performance Evaluation Parameters
The performance of a machine learning technique needs to be checked to know whether it provides a suitable result. The confusion matrix helps to perform this task. This matrix is a special type of contingency table that helps to analyze the result obtained by a machine learning technique. It provides information such as true positive, true negative, false positive, and false negative [15]. These parameters are counts: if the predicted polarity of a review matches its actual polarity (the polarity of the reviews is known beforehand), a true count increases; otherwise a false count increases. If the actual polarity is positive and the predicted polarity is also positive, the true positive count increases; if the actual polarity is negative but the predicted polarity is positive, the false positive count increases. The confusion matrix is shown in Table 1.
Table 1 Confusion matrix
            Correct labels
            Positive               Negative
Positive    TP (True positive)     FP (False positive)
Negative    FN (False negative)    TN (True negative)
Based on the information obtained from the confusion matrix, the following parameters are computed to evaluate the performance of a particular machine learning technique.
1. Precision: This parameter checks how exactly the classifier predicted the polarity of a review. Precision is calculated for both negative and positive reviews. For the positive class, it is the ratio of correctly predicted positive reviews to all reviews predicted as positive:
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{2} \]
2. Recall: This parameter measures how complete the classifier is, i.e., how many of the reviews of a given polarity it finds. Like precision, recall is calculated for both positive and negative reviews. For the positive class, it is the ratio of correctly predicted positive reviews to the reviews that are actually positive:
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{3} \]
3. Accuracy: Of the three parameters, accuracy is the one most commonly used by authors to check the performance of a system. It is the ratio of the number of reviews whose predicted and actual polarity values are the same to the total number of reviews considered for analysis:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4} \]
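Equations (2) to (4) can be checked with a short Python sketch. The counts below are taken from Table 2 and read according to the layout of Table 1; that reading is an assumption, and the function name `evaluate` is illustrative.

```python
def evaluate(tp, fp, fn, tn):
    """Precision, recall, and accuracy for the positive class, following Eqs. (2)-(4)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, accuracy


# Counts from Table 2, assuming the cell layout of Table 1.
precision, recall, accuracy = evaluate(tp=826, fp=174, fn=162, tn=838)
print(round(precision, 2), round(recall, 2), round(accuracy, 3))
```

Under this reading the rounded precision and recall agree with the positive-class values reported in Table 2, and the accuracy is close to the reported figure.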
4 Proposed Approach
In this study, the semi-supervised self-learning approach is applied to the IMDb dataset to label the unlabeled reviews. The dataset is first preprocessed to remove unnecessary information and then transformed to matrix form using the TF-IDF function. Using SVM, classification is carried out on the labeled dataset and the result is noted. Then, a set of reviews is taken from the unlabeled reviews and the same classification process is carried out on it to obtain predicted labels. Based on the predicted polarity, each review is added to the dataset of the same polarity, i.e., a review predicted as positive is added to the positive dataset, and similarly for negative reviews. Then another set is selected from the unlabeled dataset and the process continues. If the accuracy of the system increases or remains stable after adding the predicted reviews to the labeled dataset, the addition is kept; otherwise the selected set is returned to the unlabeled review set. This process continues until all the reviews in the unlabeled dataset are labeled. The process is illustrated in Fig. 1.
[Figure: flowchart. Dataset; preprocessing and transformation to matrix; classify labeled reviews using SVM; select a set from the unlabeled dataset; classify unlabeled reviews using SVM; if accuracy increases, add the reviews to the labeled dataset based on predicted polarity, otherwise add them back to the unlabeled dataset; if all unlabeled reviews are labeled, stop, otherwise select another set.]
Fig. 1 Proposed approach for semi-supervised classification
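The accept-or-reject loop of the proposed approach can be sketched as follows. This is a simplified illustration, not the authors' code: it makes a single pass over the unlabeled pool and returns rejected batches instead of retrying them, it measures accuracy on a hypothetical held-out labeled set, and it uses a one-dimensional threshold classifier as a stand-in for the SVM. All names (`self_train`, `fit_threshold`, `VALIDATION`) are illustrative.

```python
def self_train(labeled, unlabeled, batch_size, fit, score):
    """One pass of accept-or-reject self-training.

    labeled   -- list of (features, label) pairs
    unlabeled -- list of features without labels
    fit       -- hypothetical callable: labeled pairs -> predict(features) function
    score     -- hypothetical callable: predict function -> accuracy on held-out data
    """
    model = fit(labeled)
    best = score(model)
    rejected = []
    for start in range(0, len(unlabeled), batch_size):
        batch = unlabeled[start:start + batch_size]
        pseudo = [(x, model(x)) for x in batch]  # label the batch with the current model
        candidate = fit(labeled + pseudo)        # retrain with the pseudo-labeled batch
        candidate_score = score(candidate)
        if candidate_score >= best:              # accuracy rose or stayed stable: keep
            labeled, model, best = labeled + pseudo, candidate, candidate_score
        else:                                    # accuracy fell: return batch to the pool
            rejected.extend(batch)
    return labeled, rejected, best


# Toy stand-in for the SVM: a 1-D threshold halfway between the two class means.
def fit_threshold(pairs):
    pos = [x for x, y in pairs if y == 1]
    neg = [x for x, y in pairs if y == -1]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else -1


VALIDATION = [(1.2, -1), (2.8, 1)]  # hypothetical held-out labeled data


def accuracy_on_validation(predict):
    return sum(predict(x) == y for x, y in VALIDATION) / len(VALIDATION)


final_labeled, rejected, best = self_train(
    labeled=[(0.0, -1), (4.0, 1)],
    unlabeled=[1.0, 3.0, 0.5, 3.5],
    batch_size=2,
    fit=fit_threshold,
    score=accuracy_on_validation,
)
print(len(final_labeled), rejected, best)
```

Returning rejected batches rather than looping until the pool is empty avoids an endless loop when a batch can never be accepted; how such batches should be retried is left open here.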
The detailed description of the approach is as follows:
Step 1. In this study, the IMDb movie review dataset is considered for analysis [16].
Step 2. The text reviews may contain unwanted or absurd information that does not play any role in sentiment analysis and thus must not be part of further analysis. The unwanted information is as follows:
• Stop words: Some English words, such as 'the', 'and', and 'of', occur many times in a text but have no impact on its sentiment value. These stop words are therefore searched for and removed from the reviews.
• Numeric and special characters: Reviewers use numbers and special characters in the reviews to give more stress to a topic. These need to be removed, as they can create confusion during the transformation into a matrix and have no role in sentiment analysis.
• Stemming and lowering the case: Verbs in particular can take many different forms in a text. There is a root form, and the others are derived from it. For example, words like go, going, goes, and went all have the common root word 'go', so the word 'go' is used for all of them [17]. All the words in a text must also be in the same case, otherwise there might be confusion. Thus, in this paper, all the text in the reviews is converted to lower case for ease of analysis.
After preprocessing the documents, the next task is to convert the processed reviews into matrix format. In this paper, the TF-IDF function is used for the transformation, as the frequency of a word plays an important role in sentiment analysis.
Step 3. After the reviews are transformed to matrix format, the matrix is given as input to the SVM for analysis. In this case, 1000 positive and 1000 negative labeled reviews are considered. The performance evaluation parameters obtained are shown in Table 2.
Step 4. The next step is to select a set of reviews from the unlabeled reviews.
Then, SVM classification is run to obtain the predicted polarity, and the reviews are added to the labeled reviews according to that predicted polarity. In this case, 10,000 reviews are arbitrarily selected from the unlabeled dataset and the SVM classification algorithm is run on them. Of these 10,000 reviews, 5647 are predicted as positive and 4353 as negative. These reviews are then added to the labeled dataset, and SVM is run again to check whether the result improves. The parameters obtained after the addition are shown in Table 3. It is observed from Table 3 that the accuracy of the system has increased, and thus the labels assigned to the reviews can be assumed to be true. After the successful addition of reviews, the dataset size increased to 6647 positive reviews and 5353 negative reviews. This process is carried out again with another 10,000 unlabeled reviews. After the SVM run, 4567 reviews are predicted as positive and 5433 as negative. These reviews are then added to the labeled dataset, and SVM is run again to check whether the result improves. The confusion matrix and evaluation parameters obtained after this addition are shown in Table 4. From Table 4, it is found that the accuracy is lower than that of Table 3. In such a case, the collected set of reviews is discarded and returned to the unlabeled dataset.
Step 5. In this step, it is checked whether all the reviews from the unlabeled dataset are labeled. If all the reviews are labeled, the process stops; otherwise another set is selected from the unlabeled dataset and the classification process continues.
Table 2 Evaluation parameters for SVM classifier on labeled reviews
            Confusion matrix           Evaluation parameter
            Correct label
            Positive    Negative       Precision    Recall    F-measure
Positive    826         174            0.83         0.84      0.83
Negative    162         838            0.84         0.83      0.83
Accuracy: 83.228
Table 3 Parameter obtained after addition of first 10,000 reviews
            Confusion matrix           Evaluation parameter
            Correct label
            Positive    Negative       Precision    Recall    F-measure
Positive    5010        1636           0.75         0.9       0.82
Negative    466         4886           0.91         0.79      0.85
Accuracy: 83.336
Table 4 Parameter obtained after addition of second set of 10,000 reviews
            Confusion matrix           Evaluation parameter
            Correct label
            Positive    Negative       Precision    Recall    F-measure
Positive    9244        1970           0.82         0.83      0.83
Negative    1784        9002           0.83         0.83      0.83
Accuracy: 82.94
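The preprocessing described in Step 2 can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: the stop-word set is a small hypothetical subset rather than the list used in the paper, and stemming is omitted because it needs a stemmer that the standard library does not provide.

```python
import re

# Illustrative subset only; a real stop-word list is much longer.
STOP_WORDS = {"the", "and", "of", "a", "is", "to", "in", "it"}


def preprocess(review):
    """Lower-case a review, drop digits and special characters, remove stop words."""
    text = review.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    return [token for token in text.split() if token not in STOP_WORDS]


print(preprocess("The plot of Movie #2 is GREAT!!!"))
```

The resulting token lists can then be joined back into strings and passed to whichever TF-IDF transformation is used in Step 2.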
5 Conclusion and Future Work
In this paper, an attempt is made to label unlabeled reviews. A small number of labeled reviews is available, and with their help a larger volume of unlabeled reviews is labeled. The process is iterative: a set of unlabeled reviews is selected and passed through a classification process. Based on the predicted labels, these reviews are added to the labeled dataset.
To check whether the addition is successful, classification is performed once again. The accuracy obtained must be the same as or greater than the previous one. If this is not the case, the selected set is returned to the unlabeled dataset. This process continues until all unlabeled reviews are labeled. The accuracy achieved by using SVM on the labeled reviews is 83.228; when the first set of unlabeled reviews is analyzed and added to the dataset, the accuracy increases to 83.336. Another set of unlabeled reviews is then considered, but in this case the accuracy of the system goes down to 82.94, so the set is not kept and is added back to the unlabeled reviews. In this paper, the labeled and unlabeled reviews are assumed to come from the same set of authors or reviewers. This is because the unlabeled reviews are processed based on the labeled reviews, and as writing style varies from author to author, the process may not give a proper result if the labeled and unlabeled reviews come from different authors. Thus, the process must be updated to work in a generalized environment.
References
1. Liu, B.: Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies 5(1), 1–167 (2012)
2. Feldman, R.: Techniques and applications for sentiment analysis. Communications of the ACM 56(4), 82–89 (2013)
3. Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, pp. 271–276. Association for Computational Linguistics (2004)
4. Gautam, G., Yadav, D.: Sentiment analysis of twitter data using machine learning approaches and semantic analysis. In: 2014 Seventh International Conference on Contemporary Computing (IC3), pp. 437–442. IEEE (2014)
5. Hastie, T., Tibshirani, R., Friedman, J.: Unsupervised learning. Springer (2009)
6. Hady, M.F.A., Schwenker, F.: Semi-supervised learning. In: Handbook on Neural Information Processing, pp. 215–239. Springer (2013)
7. Goldberg, A.B., Zhu, X.: Seeing stars when there aren't many stars: graph-based semi-supervised learning for sentiment categorization. In: Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing, pp. 45–52. Association for Computational Linguistics, Stroudsburg, PA, USA (2006)
8. Sindhwani, V., Melville, P.: Document-word co-regularization for semi-supervised sentiment analysis. In: Proceedings of the Eighth IEEE International Conference on Data Mining, pp. 1025–1030. IEEE Press, Italy (2008)
9. Anand, D., Naorem, D.: Semi-supervised aspect based sentiment analysis for movies using review filtering. Procedia Computer Science 84, 86–93 (2016)
10. Silva, N.F.F.D., Coletta, L.F., Hruschka, E.R.: A survey and comparative study of tweet sentiment analysis via semi-supervised learning. ACM Computing Surveys (CSUR) 49(1), 15–25 (2016)
11. Khan, F.H., Qamar, U., Bashir, S.: SWIMS: semi-supervised subjective feature weighting and intelligent model selection for sentiment analysis. Knowledge-Based Systems 100, 97–111 (2016)
12. Zhai, X., Oliver, A., Kolesnikov, A., Beyer, L.: S4L: self-supervised semi-supervised learning. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1476–1485 (2019)
13. Zhao, H., Zheng, J., Deng, W., Song, Y.: Semi-supervised broad learning system based on manifold regularization and broad network. IEEE Transactions on Circuits and Systems 67(3), 983–994 (2020)
14. Van Engelen, J.E., Hoos, H.H.: A survey on semi-supervised learning. Machine Learning 109(2), 373–440 (2020)
15. Tripathy, A., Agrawal, A., Rath, S.K.: Classification of sentiment reviews using n-gram machine learning approach. Expert Systems with Applications 57, 117–126 (2016)
16. Maas, A.L., Daly, R.E., Pham, P.T., Huang, D., Ng, A.Y., Potts, C.: Learning word vectors for sentiment analysis. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 142–150. Association for Computational Linguistics (2011)
17. Tripathy, A., Anand, A., Rath, S.K.: Document-level sentiment classification using hybrid machine learning approach. Knowledge and Information Systems 53(3), 805–831 (2017)
18. Garreta, R., Moncecchi, G.: Learning scikit-learn: machine learning in Python. Packt Publishing Ltd (2013)
Unconstrained Optimization Technique in Wireless Sensor Network for Energy Efficient Clustering Binaya Kumar Patra, Sarojananda Mishra, and Sanjay Kumar Patra
Abstract Emerging technologies increasingly rely on large-scale wireless communication networks built from tiny, multifunctional sensor nodes with minimal power. A sensor node has limited energy and cannot operate for many years. It usually cannot be recharged, because sensor nodes are mostly scattered in remote environments and harsh conditions such as dense forests, battlefields, and deserts. Many energy-efficient technologies are emerging, but node energy remains a scarce resource when designing a wireless sensor network, as does the packet transmission distance between a node and the base station. Improving energy efficiency is one of the key challenges in sensor networks. Many researchers in this field try to address energy efficiency with clustering approaches, but most of these approaches have not been mathematically proven. To fill this gap, we present a method with a mathematical proof based on the unconstrained optimization technique of multivariable calculus. This approach helps reduce energy consumption and routing issues in sensor networks. The nodes are assumed to be distributed in a multidimensional space. Keywords Cluster · Sensor · Gateways · Eigenvalue · Symmetric matrix · Hessian
1 Introduction A wireless sensor network (WSN) usually consists of a large number of randomly deployed nodes equipped with tiny sensors. These nodes sense data, process it, and communicate with each other and with other devices. B. K. Patra (B) · S. Mishra · S. K. Patra Department of Computer Science Engineering and Application, Indira Gandhi Institute of Technology, Sarang, India e-mail: [email protected] S. Mishra e-mail: [email protected] S. K. Patra e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_29
WSNs are widely used in defense, manufacturing, healthcare, weather forecasting, and remote sensing. In a sensor network, data packets are sent to a base station. WSNs are a major component of Internet of Things (IoT)-based devices [1]. A sensor node consumes most of its power in four tasks: data aggregation, data sending, data receiving, and data processing. For this reason, a sensor node must manage its memory usage, CPU power, and energy efficiently, which extends the network lifetime and improves productivity. Designing an energy-efficient routing protocol is a major challenge in WSNs. Sensors are densely deployed and are restricted in energy and processing capacity. Once deployed, they are difficult to replace because most of them are scattered in remote and harsh environments. Sensing and network management operations all draw on the limited sensor energy, so the protocol must be energy efficient. Among energy-efficient hierarchical routing protocols, the clustering approach has been widely used in the past few years. In this approach, the entire network is an assembly of clusters. Each cluster contains a cluster head (CH), which serves as an intermediary between the cluster and the base station (BS). In general, the node with the most energy is chosen as the cluster head. The CH gathers data, processes it, and delivers the processed information to the BS in a manner that minimizes the total energy utilization of the sensor network [4, 5]. The cluster head role is rotated periodically among the cluster members. In many applications, sensor nodes are physically scattered with a certain spacing between them, but the majority of past energy-efficient algorithms have not taken distance as a parameter when selecting the cluster head.
In this paper, we formulate the selection of the cluster head (CH) as a mathematical problem, using the unconstrained optimization technique of multivariable calculus together with the distance between the sensors.
2 Literature Review Various surveys of WSNs have been carried out [1, 2]. The routing protocol approaches they cover include the following:
• QoS-based protocol
• Data-driven approach
• Mobility-based approach
• Hierarchical protocol
• Location-based approach
• Bio-inspired algorithm
Some authors have applied a clustering approach for energy efficiency [3]. LEACH is the most basic clustering protocol. In LEACH, each round of communication consists of two phases: a setup phase and a steady-state phase. The setup phase elects the cluster head (CH). In the steady-state phase, nodes exchange information with their own cluster head in a single hop. Cluster heads aggregate the data, process it, and transmit packets directly to the base station (BS), so node information reaches the base station in every round. An energy-efficient ring clustering technique, named power-efficient congregation in wireless sensor systems, is described in [4]. In [5], Razaque et al. proposed a combination of LEACH and PEGASIS called P-LEACH, which improves the energy efficiency of routing. It is a cluster-based chain protocol that selects the sensors with the highest energy as CHs and forms a chain of CHs to send packets to the BS. Although this approach extends the network lifetime, it skips various useful CH selections because each CH sends only to its nearest neighbor node. In cluster-based protocols, the cluster heads, or Gateways, play an important and challenging role in maximizing the network lifetime, so load balancing of Gateways in a WSN is a crucial issue. Some bio-inspired algorithms address this issue; an improved version of the shuffled frog leaping algorithm (SFLA) is one of them [6]. Its authors proposed a new fitness function to improve the quality of the solutions produced by SFLA. In [7], Cheng et al. modeled EAACA, an energy-aware ant colony algorithm for WSN routing that improves on ACO to find the best path to the sink node. Specifically, the next-hop choice takes three parameters into account: the distance to the base station, the residual energy of the nearest sensor, and the average energy of the routing path. The protocol sends forward ants to set up a suitable route to the destination node and then creates the reply message by reversing the ants' movement. A time synchronization algorithm is proposed in [8] to extend the network lifetime of WSNs. Cross-layer routing is used in vehicular ad hoc networks (VANETs). In VANETs, the vehicular environment has dynamic characteristics.
Routing is therefore a major challenge, and a cross-layer approach is needed to optimize routing choices and achieve better network performance. The optimization of parameters for information exchange takes place at the physical and MAC layers [9, 10]. In a different form of cross-layer technique, the layers are kept together at modeling time, and the network state is maintained by different algorithms in the various layers [11]. Likewise, the MAC and routing processes can work together in a cross-layer mode to improve energy efficiency and minimize delay while moving away from strictly layered communication. This approach decreases source-to-destination packet delay and packet loss [12]. In [13], the authors proposed a probability density function (PDF) to control network lifetime and developed a node deployment algorithm to avoid energy issues. Node density is taken as the major parameter, and intrinsic characteristics such as the mean and covariance are derived from the PDF. In [14], Lee and Kao proposed a hybrid three-layered architecture of centralized gridding and distributed clustering, in which grid heads are determined in a centralized way and cluster heads (CH) in a distributed way.
3 Proposed Approach Most researchers have focused on algorithmic or bio-inspired approaches to energy optimization. Here, the approach is a mathematical one. The sensors lie in a multidimensional space. We assume that the sensor nodes are scattered arbitrarily and are static. After deployment, each sensor node communicates with the nearest Gateway. Two sensor nodes can exchange information only when they are within communication range of each other (inter-cluster) (Fig. 1).
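The deployment assumption above (each static node talks to its nearest Gateway) can be illustrated with a minimal sketch. The function name, the coordinates, and the use of a single `comm_range` threshold between a node and a Gateway are assumptions made for illustration only; the paper does not specify an implementation.

```python
import math

def assign_to_nearest_gateway(nodes, gateways, comm_range):
    """Map each node index to the index of its nearest in-range gateway.

    A node with no gateway within comm_range is mapped to None.
    All names and the range check are illustrative assumptions.
    """
    assignment = {}
    for i, node in enumerate(nodes):
        best, best_dist = None, float("inf")
        for g, gateway in enumerate(gateways):
            d = math.dist(node, gateway)  # Euclidean distance
            if d <= comm_range and d < best_dist:
                best, best_dist = g, d
        assignment[i] = best
    return assignment

# Hypothetical static deployment in a 2-D field.
nodes = [(0.0, 0.0), (1.0, 1.0), (9.0, 9.0), (50.0, 50.0)]
gateways = [(0.5, 0.5), (10.0, 10.0)]
assignment = assign_to_nearest_gateway(nodes, gateways, comm_range=5.0)
print(assignment)
```

The same function works unchanged in higher dimensions, since `math.dist` accepts points of any equal length.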
3.1 Proposed Mathematical Approach Assume that, in an M-dimensional pattern space, a set of n vectors $x_j$, $j = 1, \ldots, n$, is partitioned into c cluster groups $G_i$, $i = 1, \ldots, c$, and that a cluster center $c_i$ is selected in each group such that an objective function of a distance measure is minimized. When the squared Euclidean distance is chosen as the distance measure between a vector $x_k$ in group $G_i$ and the corresponding cluster center $c_i$, the cost function can be written as

$$J = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \left( \sum_{k,\, x_k \in G_i} \lVert x_k - c_i \rVert^2 \right) \tag{1}$$

Here $J_i = \sum_{k,\, x_k \in G_i} \lVert x_k - c_i \rVert^2$ is the cost function within cluster group i, so the value of $J_i$ depends on the location of $c_i$ and on the geometric properties of $G_i$. A generic distance measure $d(x_k, c_i)$ can also be used between a vector $x_k$ and the center of cluster group i. The corresponding cost function is defined below.

$$J = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \left( \sum_{k,\, x_k \in G_i} d(x_k, c_i) \right) \tag{2}$$
For simplicity, the Euclidean distance is used as the distance measure, and the total cost is the function defined in Eq. (1). To fix the cluster center $c_i$, the sum of squared Euclidean distances must be minimized. This can be done with second-order partial derivatives, which is not possible in general for a generic distance cost function. Equation (1) is a second-order polynomial in the cluster centers, so minimizing it is an unconstrained optimization problem in several variables. Finding the cluster center is equivalent to finding a local minimum of this multivariable polynomial, for which the second-order partial derivatives must be considered. By differentiating Eq. (1), we get
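$$\frac{\partial J(G)}{\partial c_i} = \sum_{k=1}^{n} 2\,(x_k - c_i)(-1) \quad \text{for } 1 \le i \le M$$

Here n denotes the number of vectors in the cluster G under consideration, and the derivative is taken with respect to each of the M coordinates of its center. Setting it to zero, we get

$$\sum_{k=1}^{n} x_k = n\,c_i \quad \text{for } 1 \le i \le M,$$

$$G_i^{0} = \frac{1}{n} \sum_{k=1}^{n} x_k \quad \text{for } 1 \le i \le M.$$

Thus the optimal center $c_i$ that minimizes Eq. (1) is the mean of all node vectors in cluster i:

$$c_i = \frac{1}{|G_i|} \sum_{k,\, x_k \in G_i} x_k$$

where $|G_i|$ is the size of $G_i$, that is, the number of node vectors it contains. To confirm that the center is a global minimum point, we have to compute the second-order partial derivatives, which form the Hessian matrix:

$$\frac{\partial^2 J(G)}{\partial c_i^2} = \frac{\partial}{\partial c_i} \sum_{k=1}^{n} \left[ -2\,(x_k - c_i) \right] = \sum_{k=1}^{n} 2 = 2n$$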
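The stationarity condition can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical cluster of 20 randomly placed nodes in three dimensions (neither is part of the original paper). It evaluates the gradient of the squared-distance cost at the cluster mean and compares the cost at the mean against the cost at randomly perturbed centers.

```python
import numpy as np

def cluster_cost(points, center):
    """Sum of squared Euclidean distances from each point to the center."""
    diffs = points - center
    return float(np.sum(diffs * diffs))

rng = np.random.default_rng(0)
# Hypothetical cluster: 20 nodes in a 3-dimensional space (assumed values).
points = rng.uniform(0.0, 100.0, size=(20, 3))

# Candidate optimum: the mean of the node vectors in the cluster.
center = points.mean(axis=0)

# Gradient of the cost with respect to the center, evaluated at the mean.
gradient = (-2.0 * (points - center)).sum(axis=0)

best_cost = cluster_cost(points, center)
# Move the center in random directions; the cost should not decrease.
perturbed_costs = [
    cluster_cost(points, center + rng.normal(scale=1.0, size=3))
    for _ in range(100)
]
print(gradient, best_cost, min(perturbed_costs))
```

The gradient vanishes at the mean up to rounding error, and every perturbed center yields a cost at least as large, which is what the derivation predicts for a convex quadratic cost.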
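Recall that if $f: \mathbb{R}^p \to \mathbb{R}^m$ and $g: \mathbb{R}^n \to \mathbb{R}^p$, the composition of f and g is the function $f \circ g$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ defined by $(f \circ g)(x) = f(g(x))$. For a function $f: \mathbb{R}^n \to \mathbb{R}$, the gradient of f is the vector of its first partial derivatives, and the Hessian of f is the matrix of its second-order partial derivatives. The Hessian matrix is symmetric if the second-order partial derivatives are continuous. Here the Hessian matrix is positive definite: its eigenvalues are all positive, and its eigenvectors are orthogonal.

$$\nabla f = \begin{pmatrix} \dfrac{\partial f}{\partial x_1} \\ \dfrac{\partial f}{\partial x_2} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{pmatrix} \quad \text{and} \quad \nabla^2 f = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{pmatrix}$$

3.2 Diagonalizing a Symmetric Matrix Theorem If A is symmetric, its eigenvectors are orthogonal, i.e., $x^{(k)} \cdot x^{(j)} = 0$ for $k \neq j$. Proof: Consider $x^{(k)T} A x^{(j)} - x^{(j)T} A x^{(k)}$.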
First, it is a scalar, since $x^T$, $A$, and $x$ are, respectively, $1 \times n$, $n \times n$, and $n \times 1$. Since a scalar can be thought of as a $1 \times 1$ matrix, it is equal to its own transpose. Explicitly taking its transpose and using $A^T = A$, we find that this scalar is minus itself and is therefore zero:

$$x^{(k)T} A x^{(j)} - x^{(j)T} A x^{(k)} = \left( x^{(k)T} A x^{(j)} - x^{(j)T} A x^{(k)} \right)^T = -\left( x^{(k)T} A x^{(j)} - x^{(j)T} A x^{(k)} \right)$$

$$\Rightarrow \; x^{(k)T} A x^{(j)} - x^{(j)T} A x^{(k)} = 0.$$

Now we can use the eigenvalue equation $A x^{(j)} = \lambda^{(j)} x^{(j)}$ in

$$0 = x^{(k)T} A x^{(j)} - x^{(j)T} A x^{(k)}$$

and the fact that $a^T b = b^T a = a \cdot b$ to give

$$0 = \left( \lambda^{(j)} - \lambda^{(k)} \right) x^{(k)} \cdot x^{(j)}.$$

Either the eigenvalues are different, in which case we must have $x^{(k)} \cdot x^{(j)} = 0$, or they are the same, in which case the eigenvectors span a common eigenspace and can be chosen to be orthogonal. Either way, the eigenvectors can be taken to be orthogonal. If the eigenvectors are also chosen to have unit length, $x^{(k)} \cdot x^{(k)} = 1$, then the matrix C constructed from the eigenvectors is orthogonal, i.e., $C^T = C^{-1}$. Thus, for a symmetric matrix A, $C^T A C$ is diagonal. Hence, the Hessian matrix is a diagonal, positive definite matrix, since $n > 0$. A matrix is called diagonal if its non-zero entries lie only on the main diagonal.

$$\begin{bmatrix} 2n & 0 & 0 \\ 0 & 2n & 0 \\ 0 & 0 & 2n \end{bmatrix}$$
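As a numerical cross-check of the diagonal Hessian, the sketch below approximates the Hessian of the squared-distance cost with central finite differences and computes its eigenvalues. It assumes NumPy, and the five node positions, the step size `h`, and the helper names are hypothetical choices made for illustration; the paper itself gives only the analytical result.

```python
import numpy as np

def cluster_cost(center, points):
    """Sum of squared Euclidean distances from the points to the center."""
    return float(np.sum((points - center) ** 2))

def numerical_hessian(f, x, h=0.1):
    """Central finite-difference Hessian of a scalar function f at point x."""
    m = x.size
    hess = np.zeros((m, m))
    for a in range(m):
        for b in range(m):
            ea = np.zeros(m)
            eb = np.zeros(m)
            ea[a] = h
            eb[b] = h
            hess[a, b] = (
                f(x + ea + eb) - f(x + ea - eb) - f(x - ea + eb) + f(x - ea - eb)
            ) / (4.0 * h * h)
    return hess

# Five hypothetical node positions in a 3-dimensional space (n = 5, M = 3).
points = np.array([
    [1.0, 2.0, 0.5],
    [2.0, 0.0, 1.5],
    [0.0, 1.0, 2.0],
    [3.0, 3.0, 1.0],
    [4.0, 1.0, 0.0],
])
n = len(points)
center = points.mean(axis=0)

hessian = numerical_hessian(lambda c: cluster_cost(c, points), center)
eigenvalues = np.linalg.eigvalsh(hessian)  # eigenvalues of a symmetric matrix
print(np.round(hessian, 6))
print(eigenvalues)
```

Because the cost is an exact quadratic, the finite-difference result agrees with the analytical Hessian up to rounding: a diagonal matrix with 2n on the diagonal and strictly positive eigenvalues.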
4 Conclusion and Upcoming Study In this work, we have proposed a mathematical approach to improving network lifetime. The procedure guides the proper placement of the cluster center (Gateway). Multivariable calculus and the Hessian matrix are used to find the Gateway position that improves energy efficiency. Finding the cluster center is equivalent to finding the optimum of a second-order polynomial. The Hessian matrix of this polynomial is symmetric, has positive eigenvalues, and is diagonal. Because the Hessian is diagonal with equal positive entries, the stationary point, the cluster mean, is the optimum. Placing the Gateway at this location minimizes the distance measure, taken as the sum of squared Euclidean distances from the sources to the base station, and thereby reduces the energy needed for information exchange.
Fig. 1 WSN clustering with Gateway
Acknowledgements The authors are thankful to the organizing chair and reviewers for selecting this paper for presentation and publication at the conference ICMIB 2020. This encourages us to extend our research in the future. The corresponding author, Mr. Binaya Kumar Patra, extends his heartfelt regards to his supervisor Prof. S. N. Mishra and to Dr. Sanjay Kumar Patra for their valuable guidance and suggestions throughout the research work and the writing of this paper.
References
1. Misra, S., Kumar, R.: A literature survey on various clustering approaches in wireless sensor network. In: 2nd International Conference on Communication Control and Intelligent Systems (CCIS), pp. 18–22 (2016)
2. Fei, Z., Li, B., Yang, S., Xing, C., Chen, H., Hanzo, L.: A survey on multi-objective optimisation in wireless sensor networks: metrics, algorithms and open problems. IEEE Commun. Surv. Tutorials, pp. 1–38 (2016)
3. Heinzelman, W., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication protocol for wireless microsensor networks. In: Proceedings of the Hawaii Conference on System Sciences, pp. 1–10 (2000)
4. Zhang, W., Li, L., Han, G., Zhang, L.: E2HR: an energy-efficient heterogeneous ring clustering routing protocol for wireless sensor network. IEEE Access 5, 1702–1713 (2017)
5. Razaque, A., Abdulgader, M., Joshi, C., Amsaad, F., Chauhan, M.: P-LEACH: energy efficient routing protocol for wireless sensor networks. In: IEEE Long Island Systems, Applications and Technology Conference (LISAT), pp. 1–5 (2016)
6. Edla, D.R., Lipare, A., Cheruku, R., Kuppilli, V.: An efficient load balancing of gateways using improved shuffled frog leaping algorithm and novel fitness function for WSNs. IEEE Sensors J. 17(20), 6724–6733 (2017)
7. Cheng, D., Xun, Y., Zhou, T., Li, W.: An energy aware ant colony algorithm for the routing of wireless sensor networks. In: Chen, R. (ed.) Intelligent Computing and Information Science. Communications in Computer and Information Science, pp. 395–401. Springer, Berlin (2011)
8. Xia, T., He, S.: New energy-efficient time synchronization algorithm design for wireless sensor network. In: 32nd Youth Academic Annual Conference of Chinese Association of Automation (YAC), Hefei, pp. 490–495 (2017)
9. Awang, A., Husain, K., Kamel, N., Aïssa, S.: Routing in vehicular ad-hoc networks: a survey on single- and cross-layer design techniques, and perspectives. IEEE Access 5, 9497–9517 (2017)
10. Jun, L.: A cross-layer routing optimization method in wireless mesh network. In: 4th IEEE International Conference on Software Engineering and Service Science (ICSESS), pp. 357–360 (2013)
11. Conti, M., Maselli, G., Turi, G., Giordano, S.: Cross-layering in mobile ad hoc network design. Computer 37(2), 48–51 (2004)
12. Ruzzelli, A.G., O'Hare, G.M., Jurdak, R.: MERLIN: cross-layer integration of MAC and routing for low duty-cycle sensor networks. Ad Hoc Networks 6(8), 1238–1257 (2008)
13. Halder, S., DasBit, S.: Design of a probability density function targeting energy-efficient node deployment in wireless sensor networks. IEEE Trans. Netw. Service Manage. 11(2), 204–219 (2014)
14. Lee, J.S., Kao, T.Y.: An improved three-layer low-energy adaptive clustering hierarchy for wireless sensor networks. IEEE Internet Things J. 3(6), 951–958 (2016)
A Smartphone App Based Model for Classification of Users and Reviews (A Case Study for Tourism Application) Ramesh K. Sahoo, Srinivas Sethi, and Siba K. Udgata
Abstract Classification of the reviews provided by users plays a vital role in many real-world applications. Many industries today depend on customer or user reviews for planning their business and providing better customer care services. Classifying reviews helps validate the objectives of the organization, and evaluations and follow-up goals can be determined from positive or negative reviews. This paper proposes a model that classifies user or customer reviews using the ratings provided by the users or reviewers. In the proposed model, users give feedback on a location through an Android app, and the feedback is stored on a cloud platform. This real-time dataset is used for tourism applications in the proposed work. The algorithm classifies a review as either an honest review or a fake review. It also classifies the users as honest, suspicious, or malicious. Only feedback classified as honest and given by honest users is presented as authentic information to other users during the search operation. Keywords Classification · Feedback · Reviews · Rating · Tourism · Smartphone app
1 Introduction Tourism is currently one of the growing businesses in the world. Traveling from place to place for relaxation, enjoyment, business meetings, and spending quality time with family in outdoor locations has become a norm. Because of hectic lifestyles, most people experience a lot of stress, which often leads to health-related issues. That is why people typically love and plan to visit outdoor locations to take some time off. R. K. Sahoo (B) · S. Sethi Department CSEA, IGIT Sarang, Dhenkanal, India e-mail: [email protected] S. K. Udgata School of Computer and Information Sciences, University of Hyderabad, Hyderabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_30
People find a suitable location by going through different websites for tourist places and trip advisories. These websites provide a lot of information about tourist places based on reviews from people who give feedback after visiting the sites. This information is not always reliable because of fake reviews provided by malicious persons. Reviews given by people after visiting tourist places are generally assumed to be correct, and information derived from them is passed on to prospective visitors on request, so it should be accurate. However, a user may provide a correct or a fake review, depending on their intention. Therefore, it is essential to analyze feedback to identify fake and accurate reviews and to determine whether the users who give reviews are honest or malicious. Only information based on reviews from legitimate users is considered reliable and trustworthy. Detection of fake reviews is a challenging job. Various techniques, such as machine learning, sentiment analysis, classification, and convolutional neural networks, are used to process text for the detection of fake reviews. Reviews and user activity are also analyzed to detect fake reviews and malicious users. In this paper, the proposed model focuses on identifying correct and fake reviews. It also classifies users as honest, suspicious, or malicious. Only reviews obtained from honest users are presented as information to people who want to visit places. In the proposed model, feedback is collected from different users through an Android app developed exclusively for this purpose. The feedback is stored in the cloud and analyzed using MATLAB to classify users and detect honest reviews and honest users. The rest of the paper is organized as follows. The background and related work are discussed in Sect. 2, followed by the proposed methodology in Sect. 3. Section 4 analyzes the experimental results, and conclusions and future scope are discussed in Sect. 5.
2 Background In this section, we discuss the background of the paper from different related research articles. Authors in [1] developed a fake review detection framework for detecting fake reviews to extract and characterize features of reviews based on user identity and review. Authors also emphasize that only analyzing textual reviews may not work and advised inclusion of social, personal, review activity, and user trust for analysis of reviews. In [2], the authors discussed a model to detect an outlier review based on review records such as reviews and comments for the product using data from the Amazon, China dataset. Instead of only analyzing the current reviews, it also considers the past reviews of products obtained from various sources to detect fake reviews.
A method for comprehensive analysis of content-based classification, using various settings and learning algorithms on different datasets for fake review detection, is described in [3]. The authors also focus on the performance of the classification method in real-world scenarios and on its adaptation over time. A machine learning model in [4] uses various machine learning algorithms to assess the trustworthiness of reviews in Yelp datasets. The authors compare the algorithms and suggest that the XGBoost classification technique works best for the model. Another method, proposed in [5], obtains features based on Latent Dirichlet Allocation from the Yelp dataset and applies linguistic features to distinguish fake from real reviews. Semantic analysis, a decision table, and information gain are used in [6] to identify honest and reliable reviews and filter out malicious ones, using web data on various products and reviewer information. Geolocation-based account detection, a manual fake review detection model, is presented in [7]. The authors used the AdaBoost model with a long short-term memory (LSTM) neural network to analyze a reviewer's account and geolocation information and thereby determine the trustworthiness of users and the reliability of their reviews. Adding the geolocation of users to fake review detection systems enhances the model's accuracy. An ensemble learning model that combines four different learning models, namely embedding LSTM, depth LSTM, LIWC CNN, and N-gram CNN, is proposed in [8] to detect fake news. The Self-Adaptive Harmony Search (SAHS) algorithm determines the optimum weights of the ensemble to enhance the accuracy of fake news detection, and the model reaches 99.4% accuracy. A deep convolutional neural network (FNDNet) [9] has also been proposed recently to detect fake news.
Using multiple hidden layers of the deep neural network, the proposed model automatically discovers various discriminatory features used for the classification of fake news. Authors in [10] discussed linguistic characteristics such as affective, cognitive, social, and perceptual of the reviewer’s psychological process. They established a relationship with fake reviews by understanding the effect of time distance and the reviewer’s location on these reviews. A mathematical model has been developed for trust management in secure spectrum sensing in cognitive radio network [11].
3 Methodology 3.1 Proposed Model An android smartphone App has been developed that will work as a digital tourist guide. The proposed system is aimed at the following tasks. (1) Collecting data from individuals (volunteers) who visited a specific place using the app. (2) Storing and analyzing data in the cloud and estimate the reputation of a location based on feedback and rating.
(3) Classification of actual data and false data by finding the trustworthiness of users. Only genuine and valid data are considered for building the model, and the rest are discarded.
3.2 Proposed Architecture In the proposed system, the user (volunteer) will install and register with the app by giving details such as mobile number, email-id, name, gender, profession, date of birth, and address. Data will be collected from various registered users about their visited tourist places through the mobile app in both offsite or onsite mode. Data collected from users will be stored in a cloud database. Feedback data will be analyzed to detect fake reviews, and as per the reliability of reviews given by users, it will estimate the honesty level of users. Any user can get information about any tourist places in the app. The information about tourist places will be given to users by analyzing reviews obtained from honest users only.
3.3 Data Collection Framework Data will be collected from registered users as feedback about the visited tourist place in onsite or offsite mode. In onsite mode, the user is available in that place for which he is giving feedback. In offsite mode, the user will provide feedback about any previously visited or known site from anywhere. In feedback, USER_ID, location coordinates in latitude and longitude, address, language used for communication, message about that place, and rating in the range of 1–5 are collected. The feedback also contains features namely, like, dislike, average, and rectification required, which is obtained as information about the visited place through the app. This information is stored in a cloud where it will be analyzed to detect fake reviews and estimate users’ honesty level.
3.4 Data Pre-Processing and Analysis Data stored in the cloud is organized as comma-separated values in a text file. This data is analyzed using MATLAB.
3.5 Estimation of the Average Rating of Tourist Places The maximum likelihood estimation technique is used to determine the average rating of a tourist place, which is treated as the correct rating of that place. The total rating of the place is given by

$$R_l = \sum_{u=1}^{n} L_u \tag{1}$$

$$E_l = \frac{R_l}{n} \tag{2}$$

In Eq. 1, u indexes the n users who rated the location, and $L_u$ is the rating given by user u for the location. $R_l$ is the total rating provided by all users for the location. In Eq. 2, $E_l$ is the estimated rating of the location, obtained as the mean of the individual ratings, that is, $R_l$ divided by n.
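The two equations above amount to a plain arithmetic mean. A minimal sketch follows; the function name and the sample ratings are hypothetical and not taken from the paper's dataset.

```python
def estimated_rating(ratings):
    """Return E_l, the mean of the ratings given for one location.

    ratings: list of ratings in the range 1-5, one per review (assumed input).
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    total = sum(ratings)          # R_l in Eq. 1
    return total / len(ratings)   # E_l in Eq. 2

# Hypothetical ratings for a single tourist place.
ratings_for_location = [4, 5, 3, 4, 4]
e_l = estimated_rating(ratings_for_location)
print(e_l)
```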
3.6 The Reliability Level of Users The rating given by a user for a location is checked against the estimated rating of that location. If the user's rating lies within the range set around the estimated rating, the review is considered a correct (honest) review; otherwise, it is marked as a fake review. A correct review increases the count of correct reviews by 1, and a fake review increases the count of fake reviews by 1. The reliability level of a user is determined from these two counts as follows:

$$R_s = \begin{cases} 1 & (E_l - 1) \le U_r \le (E_l + 1) \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

In Eq. 3, $E_l$ is the estimated rating of location l, $U_r$ is the rating given by the user in the review, and $R_s$ is the status of the review provided by the user for location l. If $U_r$ lies in the range $(E_l - 1)$ to $(E_l + 1)$, then $R_s = 1$ and the review is correct; otherwise, $R_s = 0$ and the review is fake. If $R_s = 1$, the number of correct reviews (NCR) increases by 1; otherwise, the number of wrong reviews (NWR) increases by 1. These counts are used to calculate $R_u$, the reliability level of the user:

$$R_u = \frac{NCR}{NCR + NWR} \tag{4}$$
In Eq. 4, Ru represents the Reliability level of users derived using NCR and NWR.
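The review status and reliability computation can be sketched as follows. The function names, the data layout (a list of `(location, rating)` pairs and a dictionary of estimated ratings), and the choice to return 0.0 for a user with no reviews are assumptions for illustration; the paper's own analysis was carried out in MATLAB and is not reproduced here.

```python
def review_status(user_rating, estimated, tolerance=1.0):
    """Return 1 if the rating lies within estimated +/- tolerance, else 0 (Eq. 3)."""
    return 1 if (estimated - tolerance) <= user_rating <= (estimated + tolerance) else 0

def reliability_level(user_reviews, estimated_ratings):
    """Return R_u = NCR / (NCR + NWR) for one user (Eq. 4).

    user_reviews: list of (location_id, rating) pairs given by the user.
    estimated_ratings: dict mapping location_id to its estimated rating E_l.
    """
    ncr = 0  # number of correct reviews
    nwr = 0  # number of wrong (fake) reviews
    for location, rating in user_reviews:
        if review_status(rating, estimated_ratings[location]) == 1:
            ncr += 1
        else:
            nwr += 1
    total = ncr + nwr
    return ncr / total if total else 0.0  # assumption: no reviews gives 0.0

# Hypothetical data: two locations and four reviews from one user.
estimated = {"A": 4.0, "B": 2.5}
reviews = [("A", 5), ("A", 1), ("B", 3), ("B", 2)]
r_u = reliability_level(reviews, estimated)
print(r_u)
```

In this example three of the four ratings fall within one point of the estimated rating, so the user's reliability level is 3/4.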
3.7 The Activeness of the User If a user gives a review for a certain location, the number of participations (locations for which the user took part in the review process) increases by 1; otherwise, the number of non-participations increases by 1. The activeness of the user is determined from these two counts.

$$U_p = \begin{cases} 1 & \text{user participated in the review process for the location} \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

In Eq. 5, $U_p$ is the participation status of the user in the review process for a given location. If $U_p = 1$, the number of participations ($N_p$) increases by 1; otherwise, the number of non-participations ($N_a$) increases by 1. In Eq. 6, $A_u$ is the activeness of the user, calculated from $N_p$ and $N_a$ as follows:

$$A_u = \frac{N_p}{N_p + N_a} \tag{6}$$
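A minimal sketch of the activeness computation is given below. It assumes that participation is counted once per location, which is one reading of Eq. 5; the function name and the sample location identifiers are hypothetical.

```python
def activeness(reviewed_locations, all_locations):
    """Return A_u = N_p / (N_p + N_a) for one user (Eq. 6).

    reviewed_locations: set of location ids the user has reviewed.
    all_locations: list of all location ids in the system.
    Counting participation per location is an assumption of this sketch.
    """
    n_p = sum(1 for loc in all_locations if loc in reviewed_locations)  # U_p = 1
    n_a = len(all_locations) - n_p                                      # U_p = 0
    total = n_p + n_a
    return n_p / total if total else 0.0

# Hypothetical system with five locations; the user reviewed two of them.
all_locations = ["A", "B", "C", "D", "E"]
reviewed = {"A", "C"}
a_u = activeness(reviewed, all_locations)
print(a_u)
```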
3.8 Incentives of the User Incentives are additional benefits that a user earns by being consistently involved in the process and providing correct reviews. A user can give multiple reviews for a location over time. Therefore, the location-wise suspicious level of the user is determined first. The mean of all location-wise suspicious levels of the user is then used to obtain the user's incentive level, as per the equations below.

$$R_{sL} = \begin{cases} 1 & (E_l - 1) \le R_{uL} \le (E_l + 1) \\ 0 & \text{otherwise} \end{cases} \tag{7}$$
In Eq. 7, Rs_L represents the status of the review given by a user for location L, E_l is the estimated rating of location L determined using Eq. 2, and Ru_L is the actual rating provided by the user for location L. If Ru_L lies between E_l − 1 and E_l + 1, then Rs_L is 1; otherwise it is 0. If Rs_L = 1, the location-wise number of correct reviews (NCR_L) is incremented by 1; otherwise, the location-wise number of wrong reviews (NWR_L) is incremented by 1.
\[ Su_L = \frac{NWR_L}{NCR_L + NWR_L} \tag{8} \]
In Eq. 8, Su_L represents the location-wise suspicious level of the user, determined using NWR_L and NCR_L.
\[ S = \sum_{L=1}^{n} Su_L \tag{9} \]
In Eq. 9, the total suspicious level (S) is the summation of the location-wise suspicious levels over n locations.
\[ S_u = \frac{S}{n} \tag{10} \]
In Eq. 10, S_u represents the suspicious level of the user, obtained as the mean of the location-wise suspicious levels.
\[ I_u = 1 - S_u \tag{11} \]
In Eq. 11, I_u represents the incentive given to a user as per the user's performance in the review process.
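The chain from Eq. 7 to Eq. 11 can be illustrated with a minimal Python sketch. The function names, the dictionary layout of the reviews and the sample ratings are assumptions made for illustration only; the sketch also assumes that every location has at least one review.

```python
def review_status(user_rating, estimated_rating):
    """Eq. 7: 1 if the rating lies within +/-1 of the estimated rating, else 0."""
    low, high = estimated_rating - 1, estimated_rating + 1
    return 1 if low <= user_rating <= high else 0


def incentive_level(reviews):
    """reviews maps a location to a list of (user_rating, estimated_rating) pairs.

    Hypothetical data layout; each location is assumed to hold >= 1 review.
    """
    suspicious_levels = []
    for ratings in reviews.values():
        correct = sum(review_status(r, e) for r, e in ratings)   # NCR_L
        wrong = len(ratings) - correct                            # NWR_L
        suspicious_levels.append(wrong / (correct + wrong))       # Eq. 8
    s_u = sum(suspicious_levels) / len(suspicious_levels)         # Eqs. 9 and 10
    return 1 - s_u                                                # Eq. 11


# Illustrative (made-up) ratings: one wrong review out of two at location "L1".
example = {"L1": [(4, 4.2), (1, 4.2)], "L2": [(3, 3.5)]}
incentive = incentive_level(example)
```

With these sample values the location-wise suspicious levels are 0.5 and 0, so the incentive level evaluates to 0.75.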
3.9 Determine Honest Level of User
Three weight values are assigned to the user's performance in Reliability, Activeness, and Incentive, as listed in the table below.

| Component   | Weight variable | Value |
|-------------|-----------------|-------|
| Reliability | W1              | 0.5   |
| Activeness  | W2              | 0.2   |
| Incentive   | W3              | 0.3   |

The honest level (H_u) of a user is determined using these weight values and the performance metrics Reliability, Activeness, and Incentive, as given in Eq. 12.
\[ H_u = W_1 R_u + W_2 A_u + W_3 I_u \tag{12} \]

3.9.1 Classification of the User
Users are classified as Honest, Suspicious, or Malicious as per the honesty level (H_u).

| Honesty level (H_u) | Type of user |
|---------------------|--------------|
| > 0.7               | Honest       |
| 0.4 to 0.7          | Suspicious   |
| < 0.4               | Malicious    |
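As a minimal sketch of Eq. 12 and the classification thresholds above, the following Python fragment combines the three metrics and maps the result to a user type. The function names and the sample metric values are illustrative assumptions; the weights and thresholds are those listed in the tables.

```python
W1, W2, W3 = 0.5, 0.2, 0.3  # weights for Reliability, Activeness, Incentive


def honesty_level(reliability, activeness, incentive):
    """Eq. 12: weighted sum of the three performance metrics."""
    return W1 * reliability + W2 * activeness + W3 * incentive


def classify_user(h_u):
    """Map the honesty level to a user type using the thresholds above."""
    if h_u > 0.7:
        return "Honest"
    if h_u >= 0.4:
        return "Suspicious"
    return "Malicious"


# Made-up metric values for illustration only.
h = honesty_level(reliability=0.9, activeness=0.8, incentive=0.75)
user_type = classify_user(h)
```

For the sample values the honesty level is about 0.835, which falls in the Honest range.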
In the developed app, any user can search for a tourist place before visiting it. The app provides an estimated rating of that place derived from the ratings given by Honest users only. The app also provides information such as the communicative language used, the messages given by Honest users about that place, and the feedback given by Honest users. This information helps a user form an idea about the location they want to visit. Any user can search for a location by Rating, Feedback, and Communicative language (Fig. 1).
Fig. 1 The proposed architecture of the system
4 Experimental Results and Discussions
This section discusses the results and their analysis. In Fig. 2, the rating axis represents the estimated rating of a location in the range 1–5, and the location axis represents unique locations numbered from 1 to 42. The figure shows the estimated rating of every location. The estimated rating is treated as the actual rating of the location and is calculated from the feedback or reviews given by visitors for particular places. In Fig. 3, the reliability level represents the reliability of users determined from their given reviews. Unique user IDs from 1 to 9 are used in our analysis. Reliability reflects how reliable a user is as per his/her reviews. It
Fig. 2 Estimated rating of location
Fig. 3 Reliability level of users
has been observed that user reliability is directly proportional to the number of correct reviews or feedback and inversely proportional to the number of fake reviews: as the number of correct reviews given by a user increases, the reliability level of the user also increases. In Fig. 4, the activeness level represents the activeness of users in giving reviews of locations. Users are identified by unique user IDs from 1 to 9, as in Fig. 3. The figure shows the activeness of the various users in providing reviews for multiple locations. The activeness level of a user is directly proportional to the user's participation in the review process. In Fig. 5, the incentive level represents the incentives given to the nine users, numbered from 1 to 9. It reflects the incentive
Fig. 4 The activeness of various users
Fig. 5 Incentive level of various users
Fig. 6 Honesty level of users
given to users as per their Reliability and Activeness in the review process. Higher incentives are given to users who provide correct reviews. The amount of incentive is directly proportional to the number of correct reviews given by a user and inversely proportional to the number of fake reviews. In Fig. 6, the honesty level represents the honesty of users in the range 0–1 for the unique users numbered from 1 to 9. The honesty of a user depends on the user's performance in terms of Reliability, Activeness, and the Incentive assigned to the user. The honesty level classifies users into three types: honest, suspicious, and malicious. Here, green represents an Honest user, yellow represents a Suspicious user, and red represents a Malicious user. For user ID 4, the honesty level is zero and its bar is thus not visible.
5 Conclusions and Future Scope
In this paper, a novel idea is presented for classifying both reviews and reviewers. We proposed a model that classifies reviewers as honest, suspicious, or malicious, and reviews as fake or real. Reviews are more reliable if honest reviewers upload them. The estimated rating of each location is determined by analyzing the feedback uploaded by various users. User reliability is determined by comparing the ratings given by the users with the estimated rating of the location. The activeness of users in the review process is also determined, along with the reliability of users. As per the reliability and activeness of users, incentives can be given to users with consistent participation. The honesty level of users is determined using reliability, activeness, and the incentives assigned to users. Feedback or reviews given by honest users are considered honest or real reviews.
In the future, we propose to extend this work for other applications. We also propose to work on advanced algorithms, including machine learning algorithms for classification of the reviews and users.
Classification of Arrhythmia Beats Using Optimized K-Nearest Neighbor Classifier
Mohebbanaaz, L. V. Rajani Kumari, and Y. Padma Sai
Abstract Artificial intelligence related technologies are outperforming present-day screening methods in the medical field. The classification of ECG beats to detect cardiac arrhythmia is of great significance in the medical field. The K-Nearest Neighbor (KNN) algorithm is a supervised instance-based learning algorithm. It is one of the most popular nonparametric algorithms in data mining and statistics because of its simplicity and substantial classification performance. However, classification using the KNN algorithm becomes complicated when the sample size and the number of feature attributes are large, which may reduce the performance of the KNN classifier. For classifying arrhythmia beats, the MIT-BIH arrhythmia database is considered. An Optimized K-Nearest Neighbor (O-KNN) classifier is proposed in this paper, and simulation results are compared with the traditional KNN algorithm. The traditional KNN algorithm gives an accuracy of 96.98%. After optimizing the hyperparameters, the accuracy of the O-KNN classifier reaches 99.03%. The experimental results show that the proposed algorithm improves the classification accuracy of the KNN classifier in processing large data sets.
Keywords Arrhythmia classification · K-nearest neighbor (KNN) · Optimized K-nearest neighbor (O-KNN)
Mohebbanaaz (✉) · L. V. Rajani Kumari · Y. Padma Sai
Department of ECE, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_31

1 Introduction
Various methods have been proposed by specialists to identify and classify arrhythmia [1, 2]. The majority of them have used the MIT-BIH database for ECG signals. In general, a model first learns the information from training samples and then classifies or predicts test samples [3, 4]. Noise or interference can cause heartbeats to be mislabeled or misinterpreted [5], so noise has to be removed before extracting features. KNN is different from these model-based methods [6]. The KNN classifier is model free [7] and does not have a training phase. It directly classifies test samples by comparing them with the training samples using a distance measure. Depending on the K value, the classifier obtains the K nearest neighbors [8], and the class is assigned according to the majority label among these K neighbors. Because of its simplicity and robustness [9], the KNN classifier is a very popular method in the medical field and in statistics [3, 10]. As the available medical and health care data [11] is large, the KNN algorithm has problems due to imbalanced datasets and slow processing caused by the large data size. To overcome this type of problem, optimizing the hyperparameters is very much needed. The data samples around a specific data point can be considered, and weights can be assigned to each category according to location, which improves the classifier performance. Considering the distribution of the classes and the time lag also improves the accuracy of the KNN classifier [12]. This paper proposes an optimized KNN algorithm (O-KNN) obtained by optimizing the hyperparameters. The contribution of our work is as follows. First, an experimental study was conducted on the MIT-BIH database by extracting features and classifying ECG beats using the KNN algorithm. Second, an optimized KNN algorithm is proposed by optimizing all possible hyperparameters. The accuracy of the classifier in processing large data sets improves after optimizing all hyperparameters. This optimized classifier is compared with the KNN classifier.
2 Methodology
ECG beats are extracted from the MIT-BIH database [13]. It contains 48 records, each with a signal of 30 min duration [14]. A total of 86,825 beats are extracted using the dynamic plosion index [15], of which 78,145 beats are used for training and 8680 beats are used for testing. Among the 78,145 training beats, 56,167 are Normal Sinus Rhythm (NR), 7249 are Left Bundle Branch Block (LB), 5398 are Right Bundle Branch Block (RB), 4847 are Premature Ventricular Contraction (PV), 1224 are Atrial Premature beats (AP) and 3260 are Paced beats (PB). Among the 8680 test beats, 6241 are NR, 805 are LB, 599 are RB, 538 are PV, 136 are AP and 361 are PB.
2.1 K-Nearest Neighbor (KNN) Classifier
The KNN classifier stores all training data and classifies new data based on a similarity measure. Various distance measures can be used, such as cityblock, chebychev, correlation, cosine, Euclidean, hamming, jaccard, mahalanobis, minkowski, seuclidean and spearman [16]. The optimum value of K is chosen by inspecting the training data. Cross-validation uses an independent dataset to validate the K value.

KNN Algorithm
• Inputs: set of training data points {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)}, where x_i ∈ X, a set of instances, and y_i ∈ Y = {0, 1, 2, ..., k}, the k labels to be predicted.
• Select the distance measure to be used.
• Initialize the number of neighbors, i.e. the K value.
• For each test data point t_i, calculate the distance to the training data points {x_1, x_2, ..., x_N} using the selected distance measure.
• Using the minimum distance criterion and the majority rule, assign the label of the training data points to the test data point.
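The steps listed above can be sketched in plain Python as follows. This is an illustrative sketch, not the MATLAB implementation used in the paper: the function names, the Euclidean default and the tiny two-feature data set are assumptions chosen only to make the example self-contained.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3, distance=euclidean):
    """Label a query point by majority vote among its k nearest training points."""
    # Distance from the query to every training point, sorted in ascending order.
    neighbors = sorted(
        ((distance(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    # Majority rule over the labels of the k closest points.
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Made-up two-dimensional features; labels reuse the beat names for readability.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["NR", "NR", "NR", "PV", "PV", "PV"]
label_a = knn_predict(train_x, train_y, (0.2, 0.1), k=3)
label_b = knn_predict(train_x, train_y, (5.5, 5.5), k=3)
```

Because there is no training phase, every prediction scans the whole training set, which is the source of the slow processing on large data sets mentioned in the introduction.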
2.2 Optimized K-Nearest Neighbor (O-KNN) Classifier
The KNN classifier directly calculates the distance between the test data set and the training data set. Sometimes the attributes in the feature set do not have the same measurement scales. This causes the classification to depend on a single attribute rather than on all attributes. Standardizing the data can overcome this drawback. Hyperparameters such as the distance measure and the number of neighbors need to be specified. Optimizing the KNN classifier gives the best hyperparameter values. The following hyperparameters are considered for optimizing the KNN classifier.

Number of Neighbors: This specifies the number of neighbors to be considered when classifying a sample. It can be in the range [1, N/2], where N is the number of observations.

Distance: Various distance measures can be selected for classification. Optimization selects the distance measure that gives the highest accuracy among 'cityblock', 'chebychev', 'correlation', 'cosine', 'euclidean', 'hamming', 'jaccard', 'mahalanobis', 'minkowski', 'seuclidean', and 'spearman'.

Distance weight: The distance weight can be 'equal', 'inverse', or 'squaredinverse'. If d is the distance, 'equal' applies no weighting, 'inverse' uses 1/d as the weight, and 'squaredinverse' uses 1/d^2 as the weight.

Standardize: When there is a mixture of values with different scales, standardization is applied. Standardization transforms the data to have zero mean and unit standard deviation. When standardization is done, the value is noted as true; otherwise it is noted as false.

Among these hyperparameters, the number of neighbors and the distance measure have the larger impact on developing an optimized model. Hence these two hyperparameters are the ones mostly considered in optimization. In our work we have considered all four hyperparameters.

Optimized KNN Algorithm
• Inputs: set of training data points {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)}, where x_i ∈ X, a set of instances, and y_i ∈ Y = {0, 1, 2, ..., k}, the k labels to be predicted.
• Optimize the hyperparameters.
• Generate the objective function model.
• For each test data point t_i, calculate the distance to the training data points {x_1, x_2, ..., x_N} using the selected distance measure.
• Based on the best hyperparameter values, assign the label of the training data points to the test data point.
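Two of the hyperparameters described above, Standardize and Distance weight, can be illustrated with a short Python sketch. The function names are hypothetical, the sample standard deviation is an assumption about how the scaling is computed, and the weighting function assumes a strictly positive distance.

```python
import statistics


def standardize(column):
    """Rescale one feature column to zero mean and unit (sample) standard deviation."""
    mean = statistics.mean(column)
    std = statistics.stdev(column)  # assumption: sample standard deviation
    return [(value - mean) / std for value in column]


def distance_weight(d, scheme="equal"):
    """Vote weight of a neighbor at distance d > 0 under the three schemes."""
    if scheme == "equal":
        return 1.0            # every neighbor counts the same
    if scheme == "inverse":
        return 1.0 / d
    if scheme == "squaredinverse":
        return 1.0 / d ** 2
    raise ValueError(f"unknown weighting scheme: {scheme}")


scaled = standardize([1.0, 2.0, 3.0])
weights = [distance_weight(2.0, s) for s in ("equal", "inverse", "squaredinverse")]
```

With the weighted schemes, closer neighbors contribute more to the vote, so a small number of very near neighbors can outvote a larger number of distant ones.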
3 Results and Discussion
This section discusses the simulated results of the proposed methods. Simulation is performed using MATLAB R2019b on a system with an Intel Core i7 processor running at 3.7 GHz, 16 GB RAM and a 4 GB NVIDIA GeForce GTX graphics card.
3.1 K-Nearest Neighbor (KNN) Classifier
The confusion matrix of the KNN classifier is given in Table 1. The KNN classifier accuracy is 96.98% and the elapsed time is 41.28 s. The performance metrics positive predictive value (PPV), true positive rate (TPR), true negative rate (TNR) and F1 score are evaluated as shown in Table 2. The error rate plotted against the K value is shown in Fig. 1. It has been observed that the error rate decreases as the value of K is increased. The receiver operating characteristic (ROC) curve of the KNN classifier for each class is shown in Fig. 2. The ROC curve shows the performance of the classifier at different thresholds. The area under the curve (AUC) of Fig. 2 is given in Table 3. The AUC shows how well the classifier distinguishes between the different types of beats.

Table 1 Confusion matrix of K-Nearest Neighbor (KNN) classifier (rows: actual class; columns: predicted class)

|    | NR   | LB  | RB  | PV  | AP | PB  |
|----|------|-----|-----|-----|----|-----|
| NR | 6117 | 1   | 113 | 1   | 9  | 0   |
| LB | 0    | 805 | 0   | 0   | 0  | 0   |
| RB | 1    | 0   | 592 | 1   | 5  | 0   |
| PV | 67   | 0   | 0   | 471 | 0  | 0   |
| AP | 49   | 15  | 0   | 0   | 72 | 0   |
| PB | 0    | 0   | 0   | 0   | 0  | 361 |

Table 2 Performance metrics of K-Nearest Neighbor (KNN) classifier

|         | PPV (%) | TPR (%) | TNR (%) | F1 score (%) |
|---------|---------|---------|---------|--------------|
| NR      | 98.12   | 98.01   | 95.20   | 98.06        |
| LB      | 98.05   | 100     | 99.79   | 99.01        |
| RB      | 83.97   | 98.83   | 98.60   | 90.79        |
| PV      | 99.57   | 87.54   | 99.97   | 93.17        |
| AP      | 83.72   | 52.94   | 100     | 64.86        |
| PB      | 100     | 100     | 99.83   | 100          |
| Average | 93.90   | 89.55   | 98.89   | 90.98        |
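The per-class metrics can be derived from the confusion matrix with a few lines of Python. The sketch below uses the counts of Table 1 and assumes that rows are actual classes and columns are predicted classes; the function and variable names are illustrative.

```python
LABELS = ["NR", "LB", "RB", "PV", "AP", "PB"]
# Counts from Table 1 (assumed layout: rows = actual class, columns = predicted class).
CM = [
    [6117, 1, 113, 1, 9, 0],
    [0, 805, 0, 0, 0, 0],
    [1, 0, 592, 1, 5, 0],
    [67, 0, 0, 471, 0, 0],
    [49, 15, 0, 0, 72, 0],
    [0, 0, 0, 0, 0, 361],
]


def class_metrics(cm, i):
    """Return (PPV, TPR, TNR, F1) of class i, treating it as the positive class."""
    total = sum(map(sum, cm))
    tp = cm[i][i]
    fn = sum(cm[i]) - tp                    # actual i, predicted as another class
    fp = sum(row[i] for row in cm) - tp     # another class, predicted as i
    tn = total - tp - fn - fp
    ppv = tp / (tp + fp)
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    f1 = 2 * ppv * tpr / (ppv + tpr)
    return ppv, tpr, tnr, f1


accuracy = sum(CM[i][i] for i in range(len(CM))) / sum(map(sum, CM))
nr_ppv, nr_tpr, nr_tnr, nr_f1 = class_metrics(CM, LABELS.index("NR"))
```

The overall accuracy is the sum of the diagonal divided by the total number of test beats (8418 of 8680), which reproduces the reported 96.98%.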
Fig. 1 Error rate versus K-value of KNN classifier
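Fig. 2 Receiver operating characteristic (ROC) curves of the KNN classifier

Table 3 AUC of KNN classifier

| Class | NR   | LB   | RB   | PV   | AP   | PB   |
|-------|------|------|------|------|------|------|
| AUC   | 0.97 | 0.99 | 0.98 | 0.96 | 0.83 | 1.00 |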
Table 4 Step by step evaluation for optimizing hyperparameters

| Eval | Result | Objective | Objective runtime | NumNeighbors | Distance   | Distance weight | Standardize |
|------|--------|-----------|-------------------|--------------|------------|-----------------|-------------|
| 1    | Best   | 0.14      | 25.5              | 22           | Hamming    | Equal           | true        |
| 2    | Best   | 0.063     | 26.6              | 1            | Spearman   | Inverse         | true        |
| 3    | Best   | 0.056     | 30.0              | 29           | Seuclidean | Equal           | false       |
| 4    | Accept | 0.106     | 33.2              | 787          | Chebychev  | Inverse         | false       |
| 5    | Accept | 0.116     | 39.8              | 1128         | Seuclidean | Equal           | false       |
| 6    | Accept | 0.059     | 31.0              | 35           | Seuclidean | Equal           | false       |
| 7    | Best   | 0.049     | 51.7              | 2            | Seuclidean | Equal           | false       |
| 8    | Accept | 0.072     | 65.0              | 133          | Spearman   | Inverse         | true        |
| 9    | Accept | 0.144     | 42.4              | 3            | Spearman   | Inverse         | false       |
| 10   | Best   | 0.044     | 42.1              | 6            | Seuclidean | Equal           | true        |

Optimization completed. MaxObjectiveEvaluations of 10 reached. Total function evaluations: 10. Total elapsed time: 40.28 s. Total objective function evaluation time: 35.26 s.
3.2 Optimized K-Nearest Neighbor (O-KNN) Classifier
Optimizing the hyperparameters enhances the performance of the classifier. Table 4 gives the step by step evaluation of the hyperparameters. Figure 3 shows the relationship between the number of function evaluations and the minimum objective.

Best observed feasible point:

| NumNeighbors | Distance   | Distance weight | Standardize |
|--------------|------------|-----------------|-------------|
| 6            | seuclidean | equal           | true        |

Value of objective function observed = 0.040
Value of objective function estimated = 0.039
Function evaluation time = 40.167 s

Best estimated feasible point (according to model):

| NumNeighbors | Distance   | Distance weight | Standardize |
|--------------|------------|-----------------|-------------|
| 6            | seuclidean | equal           | true        |

Objective function value estimated = 0.039
Function evaluation time estimated = 40.143 s
Fig. 3 Relationship between min. objective and number of functions evaluated
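Table 5 Confusion matrix of optimized K-Nearest Neighbor (O-KNN) classifier (rows: actual class; columns: predicted class)

|    | NR   | LB  | RB  | PV  | AP  | PB  |
|----|------|-----|-----|-----|-----|-----|
| NR | 6231 | 0   | 3   | 2   | 5   | 0   |
| LB | 0    | 805 | 0   | 0   | 0   | 0   |
| RB | 0    | 0   | 599 | 0   | 0   | 0   |
| PV | 43   | 0   | 0   | 495 | 0   | 0   |
| AP | 33   | 0   | 0   | 0   | 103 | 0   |
| PB | 0    | 0   | 0   | 0   | 0   | 361 |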
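Table 6 Performance metrics of optimized K-Nearest Neighbor (O-KNN) classifier

|         | PPV (%) | TPR (%) | TNR (%) | F1 score (%) |
|---------|---------|---------|---------|--------------|
| NR      | 98.79   | 99.84   | 96.88   | 99.31        |
| LB      | 100     | 100     | 100     | 100          |
| RB      | 99.50   | 92.07   | 99.96   | 99.75        |
| PV      | 99.59   | 92.00   | 99.97   | 95.65        |
| AP      | 95.37   | 75.73   | 100     | 84.42        |
| PB      | 100     | 100     | 99.94   | 100          |
| Average | 98.87   | 93.27   | 99.45   | 96.52        |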
The confusion matrix of the optimized KNN classifier is given in Table 5. The overall accuracy of the optimized KNN classifier is 99.03% and the elapsed time is 40.28 s. The performance metrics positive predictive value (PPV), true positive rate (TPR), true negative rate (TNR) and F1 score are shown in Table 6. The error rate plotted against the K value is shown in Fig. 4. It has been observed that the error rate decreases as the value of K is increased. The receiver operating characteristic (ROC) curve of the optimized KNN classifier for each class is shown in Fig. 5. The ROC curve shows the performance of the classifier at different thresholds. The area under the curve (AUC) of Fig. 5 is given in Table 7. The AUC shows how well the classifier distinguishes between the different types of beats.
3.3 Comparison of K-Nearest Neighbor (KNN) and Optimized K-Nearest Neighbor (O-KNN) Classifiers
The performance parameters of the KNN classifier and the optimized KNN classifier are compared in Fig. 6. As shown in Table 8, the KNN classifier gives 96.98% accuracy. Optimizing the hyperparameters improves the accuracy from 96.98 to 99.03%.
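Fig. 4 Error rate versus K-value of optimized KNN classifier

Fig. 5 Receiver operating characteristic (ROC) curves of the optimized KNN classifier

Table 7 AUC of optimized KNN classifier

| Class | NR   | LB  | RB  | PV   | AP   | PB   |
|-------|------|-----|-----|------|------|------|
| AUC   | 0.99 | 1.0 | 1.0 | 0.99 | 0.89 | 1.00 |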
Fig. 6 Comparison of classifiers (bar chart, percentage scale 0–100: average PPV, average TPR, average TNR, average F1 score and overall accuracy for KNN and O-KNN)
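Table 8 Comparison of performance parameters of classifiers

| Classifier | Average PPV (%) | Average TPR (%) | Average TNR (%) | Average F1 score (%) | Overall accuracy (%) |
|------------|-----------------|-----------------|-----------------|----------------------|----------------------|
| KNN        | 93.90           | 89.55           | 98.89           | 90.98                | 96.98                |
| O-KNN      | 98.87           | 93.27           | 99.45           | 96.52                | 99.03                |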
4 Conclusion
In this paper, an Optimized K-Nearest Neighbor (O-KNN) classifier is proposed to overcome the limitations of the KNN algorithm. Optimization reduces the running time of the KNN algorithm and improves its accuracy. In addition, medical datasets have a relatively high missing rate, which impacts classification results. The KNN classifier gives an accuracy of 96.98% with an elapsed time of 41.28 s, which is within acceptable limits. Optimization improves the accuracy to 99.03% with an elapsed time of 40.28 s.
References
1. Mohebbanaaz, Sai Y.P., Rajani Kumari, L.: A review on Arrhythmia classification using ECG signals. In: 2020 IEEE International Students' Conference on Electrical, Electronics and Computer Science (SCEECS), Bhopal, India, pp. 1–6 (2020). https://doi.org/10.1109/SCEECS48394.2020.9
2. Rajani Kumari, L.V., Sai, Y.P., Balaji, N.: ECG signal preprocessing based on empirical mode decomposition. Int. Conf. Microelectron. Electromagn. Telecommun. 673–679 (2015)
3. Zhu, W., Chen, X., Wang, Y., Wang, L.: Arrhythmia recognition and classification using ECG morphology and segment feature analysis. IEEE/ACM Trans. Comput. Biol. Bioinf. 16(1), 131–138 (2019)
4. de Oliveira, L.S.C., Andreao, R.V., Sarcinelli Filho, M.: Bayesian network with decision threshold for heart beat classification. IEEE Latin America Transactions 14(3), 1103–1108 (2016)
5. Nurmaini, S., Darmawahyuni, A., Noviar, A., Mukti, S., Naufal Rachmatullah, M., Firdaus, F., Tutuko, B.: Deep learning-based stacked denoising and autoencoder for ECG heartbeat classification. Electronics 9. https://doi.org/10.3390/electronics9010135
6. Yazdani, S., Fallet, S., Vesin, J.: A novel short-term event extraction algorithm for biomedical signals. IEEE Trans. Biomed. Eng. 65(4), 754–762 (2018). ISSN 0018-9294
7. Bhoi, S.K., et al.: FallDS-IoT: a fall detection system for elderly healthcare based on IoT data analytics. In: 2018 International Conference on Information Technology (ICIT), Bhubaneswar, India, pp. 155–160 (2018). https://doi.org/10.1109/ICIT.2018.00041
8. Zhang, S., Li, X., Zong, M., Zhu, X., Cheng, D.: Learning k for kNN classification. ACM Trans. Intell. Syst. Technol. 8(3), 1–19 (2017)
9. Chen, Z.: Identification of Android malicious behaviors based on k nearest neighbor algorithm and least squares support vector machine. J. Jilin Univ. 53(4), 720–724 (2015)
10. He, R., et al.: Automatic cardiac Arrhythmia classification using combination of deep residual network and bidirectional LSTM. IEEE Access 7, 102119–102135 (2019)
11. Wang, P., Hou, B., Shao, S., Yan, R.: ECG Arrhythmias detection using auxiliary classifier generative adversarial network and residual network. IEEE Access 7, 100910–100922 (2019)
12. Mohebbanaaz, Sirisha, M., Rajiv, K.: Random walk with clustering for image segmentation. https://doi.org/10.1007/978-3-030-24318-0_1
13. Moody, G.B., Mark, R.G.: The impact of the MIT-BIH Arrhythmia Database. IEEE Eng. Med. Biol. 20(3), 45–50 (May–June 2001). (PMID: 11446209)
14. Rajani Kumari, L.V., Sai, Y.P., Balaji, N., Viswada: FPGA based Arrhythmia detection. Procedia Comput. Sci. 57, 970–979 (2015)
15. Rajani Kumari, L.V., Sai, Y.P., Balaji, N.: Performance evaluation of neural networks and adaptive neuro fuzzy inference system for classification of Cardiac Arrhythmia. Int. J. Eng. Technol. 7, 250–253 (2018)
16. Zhang, S., Li, X., Zong, M., Zhu, X., Wang, R.: Efficient kNN classification with different numbers of nearest neighbors. IEEE Trans. Neural Netw. Learn. Syst. 29(5), 1774–1785 (May 2018). https://doi.org/10.1109/TNNLS.2017.2673241
A Comparative Analysis of Fuzzy Logic-Based DTC and ST-DTC Using Three-Level Inverter for Torque Ripple Reduction
Umakanta Mahanta, Bhabesh Chandra Mohanta, Bibhu Prasad Panigrahi, and Anup Kumar Panda

Abstract In this paper, fuzzy logic is implemented for direct torque control (DTC) of a three-phase induction motor using two- and three-level inverters, and a comparative study is made with conventional switching table-based DTC (ST-DTC). The d-q model equations in the stationary reference frame are considered for simulation of the three-phase induction motor, to which power is supplied by a three-level inverter controlled with fuzzy logic. In a three-phase three-level (3P-3L) inverter the number of switching states is 27, compared with only 8 in a three-phase two-level inverter. With the increase in switching vectors, it is possible to define a greater number of sectors and different sets of switching states depending on the type of loading. By selecting proper input and output membership functions and rules, a fuzzy inference system is generated to trigger the switches of the inverter. The results show that the settling and rise times are comparable with conventional DTC; however, a remarkable reduction of torque ripple (5.2583% in the three-level and 4.7577% in the two-level inverter) is observed. The current ripple is also reduced, by 4.006% in the three-level and 3.734% in the two-level inverter, in the case of fuzzy logic-based DTC (Fuzzy-DTC).
Keywords Multi-level inverter · Induction motor · Fuzzy logic · DTC · Switching table · Torque ripple · Current ripple
U. Mahanta (✉) · B. C. Mohanta · B. P. Panigrahi
Department of Electrical Engineering, Indira Gandhi Institute of Technology, Sarang, Dhenkanal, Odisha, India
A. K. Panda
Department of Electrical Engineering, National Institute of Technology, Rourkela, Odisha, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_32
1 Introduction
The induction motor is regarded as the workhorse of industry because of its self-starting capability, smaller size and weight, high efficiency, low maintenance cost, and simple construction. Fast torque control is equally essential during load changes. For a fast dynamic response, DTC is becoming a powerful technique. The major drawback of conventional DTC is its higher torque ripple, which depends on the setting of the torque band. When applying switching table-based DTC with three-level inverters, the dv/dt stress and the balancing of the DC link capacitor voltages should be considered [1, 2]. DTC with intelligent control such as sliding mode control (SMC), fuzzy logic (FL), and artificial neural network (ANN)-based control can be used for torque ripple reduction [3–5]. SVM-based DTC using the dwell time ratio is also preferred for torque ripple reduction [6, 7]. Higher level inverters are preferred for high-power and low-voltage applications. Additional advantages of choosing a higher level inverter are low dv/dt stress on the switches and low harmonic distortion of voltage and current [8]. The increased number of switching states in a higher level inverter gives an added advantage in selecting the switching state for DTC application. Due to the increase in switching states and sectors, better torque control is possible during sudden load changes. However, the increased number of switches leads to higher switching losses [9]. Fuzzy rule-based control offers a simpler way of applying the switching table than switching table-based DTC (ST-DTC). By proper selection of the membership functions (MF) of flux error, sector and torque error, outputs are created which are fed directly to the inverter. In this paper, a three-phase induction motor model is developed in d-q form from its stationary frame dynamic equations with the help of MATLAB/Simulink.
A detailed comparative analysis is carried out between ST-DTC using a three-level inverter and fuzzy logic-based DTC using a two-level (2L-Fuzzy) and a three-level inverter (3L-Fuzzy) for different load conditions. With 3P-3L inverters, the number of switching states increases to 27, out of which 12 effective switching states are selected with 12 sectors to prepare the switching table used for the fuzzy inference system (FIS). The analysis shows that fuzzy logic-based DTC with a three-level inverter gives the added advantage of selecting the active vector for torque ripple reduction. The current ripple is also reduced in the case of the three-level inverter.
2 Mathematical Model of Three Phase Induction Motor and Power Supply
The three-phase induction motor is modeled in d-q form using its stationary frame dynamic equations (1), which are shown below [10].
\[ V_{qs}^{s} = R_s i_{qs}^{s} + \frac{d\psi_{qs}^{s}}{dt} \tag{1a} \]
\[ V_{ds}^{s} = R_s i_{ds}^{s} + \frac{d\psi_{ds}^{s}}{dt} \tag{1b} \]
\[ V_{qr}^{s} = 0 = R_r i_{qr}^{s} + \frac{d\psi_{qr}^{s}}{dt} - \omega_r \psi_{dr}^{s} \tag{1c} \]
\[ V_{dr}^{s} = 0 = R_r i_{dr}^{s} + \frac{d\psi_{dr}^{s}}{dt} + \omega_r \psi_{qr}^{s} \tag{1d} \]
where \psi_{qs}^{s}, \psi_{ds}^{s}, \psi_{qr}^{s}, and \psi_{dr}^{s} are the flux linkages; V_{qs}^{s} and V_{ds}^{s} are the stator voltages (d–q); V_{qr}^{s} and V_{dr}^{s} are the rotor voltages (d–q); R_s is the stator resistance and R_r is the rotor resistance; \omega_r is the rotor speed (rad/s). The developed torque (T_e) and the speed can be calculated using Eqs. (2) and (3), respectively.
\[ T_e = \frac{3}{2} \times \frac{P}{2} \times \left( \psi_{ds}^{s} i_{qs}^{s} - \psi_{qs}^{s} i_{ds}^{s} \right) \tag{2} \]
\[ (T_e - T_l) = \frac{2}{P} \times J \frac{d\omega_r}{dt} + B\omega_r \tag{3} \]
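Equations (2) and (3) can be evaluated numerically as in the following Python sketch. It assumes the usual meanings of the symbols that the text leaves undefined (P as the number of poles, J as the moment of inertia, B as the friction coefficient and T_l as the load torque) and uses a simple forward-Euler step; the function names and the sample values are illustrative and are not taken from the paper's Simulink model.

```python
def electromagnetic_torque(psi_ds, psi_qs, i_ds, i_qs, poles):
    """Eq. (2): developed torque from stator flux linkages and stator currents."""
    return 1.5 * (poles / 2) * (psi_ds * i_qs - psi_qs * i_ds)


def speed_step(omega_r, t_e, t_l, poles, inertia, friction, dt):
    """One forward-Euler step of Eq. (3), solved for d(omega_r)/dt."""
    d_omega = (poles / (2 * inertia)) * (t_e - t_l - friction * omega_r)
    return omega_r + d_omega * dt


# Made-up operating point, for illustration only.
torque = electromagnetic_torque(psi_ds=1.0, psi_qs=0.0, i_ds=0.0, i_qs=2.0, poles=4)
omega_next = speed_step(omega_r=0.0, t_e=torque, t_l=1.0, poles=4,
                        inertia=0.5, friction=0.0, dt=0.01)
```

For this operating point the torque is 6.0 and the speed increases by 0.2 rad/s over the 10 ms step.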
The basic diagram of a three-phase three-level inverter is shown in Fig. 1. It has 27 switching states, of which only 12 are selected for DTC application. These 12 switching voltage vectors are uniformly distributed in space with an angle of 30° between two consecutive vectors, as shown in Fig. 2. The switching conditions of the three-level inverter are described in Table 1. To convert the line output voltages of the inverter to phase voltages for the machine input, Eqs. (4) are used.
Fig. 1 Three-phase three-level inverter (DC link Vdc split by capacitors C1 and C2; switches Sa1–Sa4, Sb1–Sb4, Sc1–Sc4; outputs Va, Vb, Vc feeding the three-phase induction motor)
Fig. 2 a Switching vectors for all states b selected switching states for DTC applications
Table 1 Inverter pole voltages of leg-a

| Leg state | Inverter pole voltage | Sa1 | Sa2 | Sa3 | Sa4 |
|---|---|---|---|---|---|
| 2 | Vdc/2 | ON | ON | OFF | OFF |
| 1 | 0 | OFF | ON | ON | OFF |
| 0 | −Vdc/2 | OFF | OFF | ON | ON |
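$$V_{as} = \frac{1}{3}\left(V_{ab} - V_{ca}\right) \tag{4a}$$

$$V_{bs} = \frac{1}{3}\left(V_{bc} - V_{ab}\right) \tag{4b}$$

$$V_{cs} = \frac{1}{3}\left(V_{ca} - V_{bc}\right) \tag{4c}$$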
After balancing the mmf of the three-phase induction motor, the three-phase voltages are converted to two-phase quantities by using Eqs. (5).

$$V_{qs} = V_{as} \tag{5a}$$

$$V_{ds} = \frac{1}{\sqrt{3}}\left(V_{cs} - V_{bs}\right) \tag{5b}$$
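The two conversions in Eqs. (4) and (5) can be checked numerically with a short sketch. The function names below are illustrative, and the sample voltages are arbitrary values chosen so that the three phase voltages sum to zero.

```python
import math


def line_to_phase(v_ab, v_bc, v_ca):
    """Eqs. (4a)-(4c): phase voltages from line-to-line voltages."""
    v_as = (v_ab - v_ca) / 3.0
    v_bs = (v_bc - v_ab) / 3.0
    v_cs = (v_ca - v_bc) / 3.0
    return v_as, v_bs, v_cs


def phase_to_dq(v_as, v_bs, v_cs):
    """Eqs. (5a)-(5b): stationary-frame q- and d-axis voltages."""
    v_qs = v_as
    v_ds = (v_cs - v_bs) / math.sqrt(3.0)
    return v_qs, v_ds


# Arbitrary balanced example: va = 1.0, vb = vc = -0.5 (per unit),
# so vab = 1.5, vbc = 0.0, vca = -1.5.
phases = line_to_phase(1.5, 0.0, -1.5)
v_qs, v_ds = phase_to_dq(*phases)
```

For a set of phase voltages that sums to zero, Eqs. (4) return the original phase voltages, which is what the example reproduces.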
Table 2 gives the switching table of the three-level inverter for DTC with a five-level torque comparator and a two-level flux comparator, whose logic is given in (6).
Fig. 3 Fuzzy rule-based DTC application to three-phase induction motor
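Table 2 Switching table (columns give the torque error level Te under each flux error F− or F+)

| Sector | F−: PL | F−: PS | F−: Z | F−: NS | F−: NL | F+: PL | F+: PS | F+: Z | F+: NS | F+: NL |
|---|---|---|---|---|---|---|---|---|---|---|
| S1 | 201 | 202 | 000 | 002 | 012 | 210 | 220 | 000 | 020 | 021 |
| S2 | 200 | 201 | 000 | 102 | 002 | 220 | 120 | 000 | 021 | 022 |
| S3 | 210 | 200 | 000 | 202 | 102 | 120 | 020 | 000 | 022 | 012 |
| S4 | 220 | 210 | 000 | 201 | 202 | 020 | 021 | 000 | 012 | 002 |
| S5 | 120 | 220 | 000 | 200 | 201 | 021 | 022 | 000 | 002 | 102 |
| S6 | 020 | 120 | 000 | 210 | 200 | 022 | 012 | 000 | 102 | 202 |
| S7 | 021 | 020 | 000 | 220 | 210 | 012 | 002 | 000 | 202 | 201 |
| S8 | 022 | 021 | 000 | 120 | 220 | 002 | 102 | 000 | 201 | 200 |
| S9 | 012 | 022 | 000 | 020 | 120 | 102 | 202 | 000 | 200 | 210 |
| S10 | 002 | 012 | 000 | 021 | 020 | 202 | 201 | 000 | 210 | 220 |
| S11 | 102 | 002 | 000 | 022 | 021 | 201 | 200 | 000 | 220 | 120 |
| S12 | 202 | 102 | 000 | 012 | 022 | 200 | 210 | 000 | 120 | 020 |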
HTe = PL, for a large positive torque error
HTe = PS, for a small positive torque error
HTe = Z, for no change in torque
HTe = NS, for a small negative torque error
HTe = NL, for a large negative torque error
F+ = positive flux error
F− = negative flux error (6)
Fig. 4 a MF for torque error b MF for flux error c MF for sector d MF for output
Figure 3 describes the simulation arrangement of fuzzy logic-based DTC of a three-phase induction motor. In this model, the input line voltage of the three-phase induction motor is first converted to phase voltage and fed to the induction motor model. The reference torque and flux are set according to the rated values. The torque error and flux error are calculated by comparing the actual torque and flux with their respective reference values. The estimated torque error, the estimated flux error, and the sector are given as inputs to the FIS. The fuzzy rules are formed from the switching table (Table 2) to obtain the desired switching state.
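Before the rules are fuzzified, the switching table can be read as a crisp lookup: a torque error level, a flux error sign, and a sector select one switching state. The sketch below shows only that crisp lookup, not the fuzzy inference itself. The comparator thresholds `small` and `large` are hypothetical placeholders (the paper does not list them), and the column order follows Table 2 as printed, with the F− columns first.

```python
TORQUE_LEVELS = ("PL", "PS", "Z", "NS", "NL")

# Rows S1..S12 of Table 2: five F- entries followed by five F+ entries.
SWITCHING_TABLE = {
    1: "201 202 000 002 012 210 220 000 020 021".split(),
    2: "200 201 000 102 002 220 120 000 021 022".split(),
    3: "210 200 000 202 102 120 020 000 022 012".split(),
    4: "220 210 000 201 202 020 021 000 012 002".split(),
    5: "120 220 000 200 201 021 022 000 002 102".split(),
    6: "020 120 000 210 200 022 012 000 102 202".split(),
    7: "021 020 000 220 210 012 002 000 202 201".split(),
    8: "022 021 000 120 220 002 102 000 201 200".split(),
    9: "012 022 000 020 120 102 202 000 200 210".split(),
    10: "002 012 000 021 020 202 201 000 210 220".split(),
    11: "102 002 000 022 021 201 200 000 220 120".split(),
    12: "202 102 000 012 022 200 210 000 120 020".split(),
}


def torque_level(error, small, large):
    """Five-level comparator; the thresholds are hypothetical placeholders."""
    if error > large:
        return "PL"
    if error > small:
        return "PS"
    if error >= -small:
        return "Z"
    if error >= -large:
        return "NS"
    return "NL"


def switching_state(torque_error, flux_error, sector, small=0.05, large=0.5):
    """Crisp lookup in Table 2 for a sector number between 1 and 12."""
    column = TORQUE_LEVELS.index(torque_level(torque_error, small, large))
    if flux_error >= 0:          # F+ occupies the last five columns
        column += 5
    return SWITCHING_TABLE[sector][column]
```

For example, a large positive torque error with a negative flux error in sector 1 selects the first entry of row S1, and a zero torque error always selects the state 000.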
3 Fuzzy Implementation

To implement fuzzy rule-based DTC, three variables, namely torque error, flux error, and sector, are chosen as inputs to the FIS. The membership functions (MFs) are chosen according to the error values and tolerance bands. Because a five-level torque comparator is used, five membership functions are selected for the torque error, of which three are triangular and two are trapezoidal, as shown in Fig. 4a. For the flux, a two-level comparator is used, for which two trapezoidal membership functions are chosen, as shown in Fig. 4b. For the sector calculation, triangular MFs are selected, as shown in Fig. 4c. Figure 4d shows the output MFs for one switch of the inverter.
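A minimal sketch of the two membership-function shapes mentioned above is given here, together with a five-set fuzzification of a normalised torque error (three triangular sets in the middle, two trapezoidal shoulders at the ends). The breakpoints are hypothetical and chosen only for illustration; the actual values are those plotted in Fig. 4a, which are not listed in the text.

```python
INF = float("inf")


def triangular(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal MF with feet at a and d and plateau on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify_torque_error(e):
    """Membership degrees of a normalised torque error (hypothetical breakpoints)."""
    return {
        "NL": trapezoidal(e, -INF, -INF, -1.0, -0.5),
        "NS": triangular(e, -1.0, -0.5, 0.0),
        "Z": triangular(e, -0.5, 0.0, 0.5),
        "PS": triangular(e, 0.0, 0.5, 1.0),
        "PL": trapezoidal(e, 0.5, 1.0, INF, INF),
    }


degrees = fuzzify_torque_error(0.25)
```

With these breakpoints an error of 0.25 belongs equally to Z and PS, which is the kind of overlap that lets the FIS blend neighbouring rules instead of switching abruptly between comparator levels.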
Fig. 5 Torque versus time of a 3P-3L ST-DTC b 3P-3L Fuzzy-DTC c 3P-2L Fuzzy-DTC
4 Simulation Results

The specifications of the three-phase induction machine model used to implement DTC are given in Table 3 [11]. A reference torque is applied whose value is decreased at 0.4 s and increased again at 0.8 s. The torque error is compared in five levels to take advantage of the larger number of switching states. The same machine model is used for 3P-3L ST-DTC, 3P-2L Fuzzy-DTC, and 3P-3L Fuzzy-DTC for a comparative study. Figure 5 shows the torque versus time characteristics of 3P-3L ST-DTC, 3P-2L Fuzzy-DTC, and 3P-3L Fuzzy-DTC. Figure 6 shows the transient response, and the time taken by each method to reach the reference value is given in Table 4. The time taken by the generated torque to fall from 100 to 30% of the rated value at 0.4 s and to rise from 30% back to the rated value at 0.8 s is also given in Table 4. From the table, it is observed that the torque response parameters are nearly the same. However, the torque and current ripples are markedly lower with Fuzzy-DTC. Figure 7 shows the torque ripple for all the methods. The torque ripple band (Tmax − Tmin) for 3P-3L ST-DTC, 3P-3L Fuzzy-DTC, and 3P-2L Fuzzy-DTC is 0.3179 Nm, 0.05813 Nm, and 0.08286 Nm, respectively. The percentage torque ripples [(Tmax − Tmin)/Tref × 100] are 6.435%, 1.1767%, and 1.6773%, respectively.
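The percentage figures can be reproduced from the ripple bands with the formula quoted above. The sketch below does this for the three torque values, using the 4.94 Nm reference torque; the function and variable names are illustrative.

```python
def ripple_percent(band, reference):
    """Percentage ripple: (max - min) / reference * 100, with band = max - min."""
    return band / reference * 100.0


T_REF = 4.94  # reference torque in Nm

torque_bands = {
    "3P-3L ST-DTC": 0.3179,
    "3P-3L Fuzzy-DTC": 0.05813,
    "3P-2L Fuzzy-DTC": 0.08286,
}
torque_ripple = {name: ripple_percent(band, T_REF)
                 for name, band in torque_bands.items()}
```

The three results agree with the quoted 6.435%, 1.1767%, and 1.6773% to within rounding.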
Fig. 6 Transient response during a starting b step fall c step rise
Figure 8 shows the phase current versus time for all the methods. The peak-to-peak current ripples are 0.1451 A, 0.0247 A, and 0.0329 A for 3P-3L ST-DTC, 3P-3L Fuzzy-DTC, and 3P-2L Fuzzy-DTC, respectively. Fuzzy rule-based DTC therefore gives less current ripple than ST-DTC, while the results of 2L-Fuzzy and 3L-Fuzzy are comparable. All the methods have nearly the same flux ripple. Figure 9 shows the stator flux trajectory for all the methods. The flux trajectories are circular in every case. However, Fuzzy-DTC follows a smoother path than ST-DTC because of the lower ripple in the stator current. The flux trajectories of 2L-Fuzzy and 3L-Fuzzy are nearly the same. The stator d–q current trajectories for 3P-3L ST-DTC, 3P-3L Fuzzy-DTC, and 3P-2L Fuzzy-DTC are shown in Fig. 10. 3L-Fuzzy performs better than the others, but the results of 2L-Fuzzy are comparable to those of 3L-Fuzzy. The rotor flux trajectories are approximately the same for all the methods.
Fig. 7 Torque ripple of a 3P-3L ST-DTC b 3P-3L Fuzzy-DTC c 3P-2L Fuzzy-DTC

Fig. 8 Current ripple of a 3P-3L ST-DTC b 3P-3L Fuzzy-DTC c 3P-2L Fuzzy-DTC

Fig. 9 Stator flux trajectory of a 3P-3L ST-DTC b 3P-3L Fuzzy-DTC c 3P-2L Fuzzy-DTC

5 Conclusions

In this paper, a comparative analysis of fuzzy logic-based DTC of a three-phase induction motor using two- and three-level inverters is carried out, and the results are compared with conventional DTC. The rated torque (4.94 Nm) is selected as the reference torque. The time taken by the generated torque to reach the reference value is approximately 0.009 s for all the methods. There is no significant difference in the rise time and fall time of the torque between the methods. However, there is a significant reduction in torque and current ripple. The torque ripple of ST-DTC is 6.435%, whereas 2L-Fuzzy and 3L-Fuzzy have torque ripples of 1.6773% and 1.1767%, respectively. Similarly, the current ripple of ST-DTC is 4.83%, whereas 2L-Fuzzy and 3L-Fuzzy have current ripples of 1.096% and 0.8233%, respectively. The results of fuzzy rule-based DTC using a three-level inverter are better than those using a two-level inverter.
Fig. 10 Stator current trajectory of a 3P-3L ST-DTC b 3P-3L Fuzzy-DTC c 3P-2L Fuzzy-DTC

Table 3 Motor parameters

| Parameters | Values |
|---|---|
| Output power | 0.75 kW |
| Connection | Star |
| Current | 3 A |
| Rated torque | 4.947 Nm |
| Rated speed | 1440 rpm |
| Rs | 5.8 Ω |
| Rr | 4.3 Ω |
| M | 0.24 H |
| Ls = Lr | 0.26 H |
| J | 0.0088 kg m² |
| Damping coefficient B | 0.003 Nm s/rad |
Table 4 Output results

| Parameters | 3L ST-DTC | 3L Fuzzy-DTC | 2L Fuzzy-DTC | Remarks |
|---|---|---|---|---|
| Reference torque | 4.94 Nm | 4.94 Nm | 4.94 Nm | Rated torque |
| Rise of torque at start | 0.008755 s | 0.00910 s | 0.008952 s | Approximately equal |
| Fall of Te to follow Tref (100 to 30%) | 0.000383 s | 0.00036799 s | 0.0004721 s | Approximately equal |
| Rise of Te to follow Tref (30 to 100%) | 0.000645 s | 0.0006499 s | 0.000540351 s | Approximately equal |
| Torque ripple, Nm (%) | 0.3179 (6.435%) | 0.05813 (1.1767%) | 0.08286 (1.6773%) | Minimum in 3L Fuzzy-DTC |
| Current ripple, A (%) | 0.1451 (4.83%) | 0.0247 (0.8233%) | 0.0329 (1.096%) | Minimum in 3L Fuzzy-DTC |
References

1. Renge, M.M., Suryawanshi, H.M.: Five-level diode clamped inverter to eliminate common mode voltage and reduce dv/dt in medium voltage rating induction motor drives. IEEE Trans. Power Electron. 23(4), 1598–1607 (2008)
2. Payami, S., Behera, R.K., Iqbal, A.: DTC of three level inverter fed five phase induction motor drive with novel neutral point voltage balancing scheme. IEEET Power Electron. vol. 2, no 2 (2018)
3. Kirankumar, B., Reddy, Y.V.S., Vijayakumar, M.: Multilevel inverter with space vector modulation: Intelligence direct torque control of induction motor. IET Power Electron. 10(10), 1129–1137 (2017)
4. Rao, V.M.V., Kumar, A.A.: Artificial neural network and adaptive neuro fuzzy control of direct torque control of induction motor for speed and torque ripple control. In: IEEE International Conference on Trends in Electronics and Informatics (ICOEI), pp. 1416–1422 (2018)
5. Bouhoune, K., Yazid, K., Boucherit, M.S., Menaa, M.: Fuzzy logic-based direct torque control for induction machine drive. In: IEEE Mediterranean Conference on Control and Automation (MED), pp. 577–582 (2017)
6. Ouanjli, N.E., Derouich, A., Ghzizal, A.E., Motahhir, S., Chebabhi, A., Mourabit, Y.E., Taoussi, M.: Modern improvement techniques of direct torque control for induction motor drives: a review. In: Protection and Control of Modern Power Systems, vol. 4, p. 11. Springer (2019)
7. Mahanta, U., Patnaik, D., Panigrahi, B.P., Panda, A.K.: Dynamic modelling and simulation of SVM DTC of five phase induction motor. In: IEEE International Conference, ICEPE (2015)
8. Kouro, S., Malinowski, M., Gopakumar, K., Pou, J., Franquelo, L.G., Wu, B., et al.: Recent advances and industrial applications of multi level converters. IEEE Trans. Ind. Electron. 57(8), 2553–2580 (2010)
9. Mahanta, U., Panigrahi, B.P., Panda, A.K.: Performance analysis of switching table-based DTC 5-phase induction motor with 3-level inverter. J. Instit. Eng. (India): Series B 100(6), 599–607 (2019). Springer
10. Bose, B.K.: Modern Power Electronics and AC Drives. Fourth Impression, Pearson Education (2007)
11. Panigrahi, B.P., Prasad, D., Sengupta, S.: A simple hardware realisation of switching table based direct torque control of induction motor. Electric Power Systems Research (Elsevier), vol. 77, pp. 181–190 (2007)
An Inference Engine Integrated with Health Parameters for Medical Web Platform

P. S. S. Sree Dhruti, L. V. Rajani Kumari, and Y. Padma Sai
Abstract Technology is advancing at an unprecedented rate, with developments in almost every gadget, particularly smartphones. This project aims to use smartphones to measure some of the most important vitals of the human body: heart rate, breathing rate, blood pressure, oxygen level in blood, and heart rate variability. In this approach, a person places his or her right and left fingertips on the smartphone camera lens to record 10-s video signals, which are then analyzed to measure the parameters. Thus, no external equipment such as sensors or electrodes is required. The performance has been evaluated on 50 subjects of different age groups. Experimental results show that heart rate and blood pressure can be estimated effectively using this approach, with average error rates of 2.5% and 2.3%, respectively.

Keywords Photoplethysmography · Video · Heart rate · Breathing rate · Heart rate variability · Oxygen level in blood · Blood pressure
1 Introduction

Health is an important asset for a human being. At present, people's health is deteriorating while the cost of healthcare is increasing, and this affects the majority of the population. As per a 2018 survey, health costs in India are to rise at double the rate. Many people who really need medical attention, i.e. access to good healthcare, face a lot of difficulty. For a person in a rural area, travelling for miles to get tests performed, such as ECG for measuring heart rate or spirometry for determining lung capacity, is inefficient in terms of both time and cost and is a cause of discomfort.

P. S. S. Sree Dhruti (B) · L. V. Rajani Kumari · Y. Padma Sai
Department of ECE, VNRVJIET, Hyderabad, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_33
Proposed approaches include usage of wearables and sensors [1–3], so that measured vitals can be communicated directly to the user’s mobile phone [4–6]. But there is an overhead of carrying them. A smartphone can be used for the same purpose as it is portable and more comfortable [7]. Hence, our proposed system is noninvasive, hassle-free and effective both in terms of time and cost as it solely depends on a smartphone for signal acquisition, thus eliminating the need for any external equipment. This paper is divided into the following sections: Sect. 2 describes the proposed methodology. Section 3 explains the implementation of the proposed method and the experimental results obtained. Finally, Sect. 4 provides the conclusions and future scope of this system.
2 Proposed System

Figure 1 depicts the block diagram of the proposed system. In the first step, video signals of the left and right fingertips are acquired using the camera lens of a smartphone [8, 9]. The signal to be processed is the brightness of the skin over time, which varies with the blood flow in the tissues. In the second step, this brightness vector, which is itself the PPG signal, is extracted from the video, preprocessed, and analyzed to calculate the health parameters. In the third step, the results are displayed in the mobile application and uploaded to a database for future reference. Finally, in the fourth step, all of the previous readings can be retrieved from the database into the mobile application. These steps are explained in detail as follows:
Fig. 1 Block diagram of the proposed system
2.1 Video Signal Acquisition

Videos are captured when the user places his/her fingertip on the camera lens and presses it gently. By trial and error, it has been observed that a 10-s video under proper lighting conditions (either the phone's flashlight turned on or sufficient natural light) is required for accurate results. The dataset has been acquired taking into account age, fitness level, and the model of the mobile phone, since the phone model plays an important role in determining the sampling rate of the signal.
2.2 Computation of Brightness of Signal and Filtering

As blood flows through the fingertips, there are subtle variations in brightness that are not visible to the naked eye but can be observed by processing the video. Hence, the entire video signal is divided into frames, the red color channel is extracted from each frame, and its average brightness value is calculated. The resulting brightness vector is the PPG signal. The number of frames and the sampling rate are also calculated, as they are needed to estimate the various health parameters. The raw PPG signal is then band-pass filtered with cut-off frequencies of 0.5 Hz and 5 Hz. The filtered PPG signal is provided as input to the subsequent algorithms that estimate the health parameters.
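The extraction and filtering steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' MATLAB implementation: frames are assumed to be nested lists of (r, g, b) pixels, and because the paper does not name the filter type, a single second-order band-pass section (a standard bilinear-transform biquad) is used here as a stand-in, so its band edges only approximate 0.5 Hz and 5 Hz. All function names are illustrative.

```python
import math


def mean_red(frame):
    """Mean red value of one frame, given as rows of (r, g, b) pixels."""
    reds = [pixel[0] for row in frame for pixel in row]
    return sum(reds) / len(reds)


def ppg_from_frames(frames):
    """Brightness vector (raw PPG): one mean red value per frame."""
    return [mean_red(frame) for frame in frames]


def bandpass(signal, fs, low=0.5, high=5.0):
    """Second-order band-pass biquad; fs must be greater than 2 * high."""
    f0 = math.sqrt(low * high)        # geometric centre of the pass band
    q = f0 / (high - low)             # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # the b1 coefficient is zero
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


# Synthetic 10-s recording at an assumed 30 frames/s: a constant brightness
# of 100 plus a 1.5 Hz pulsation standing in for the heartbeat.
fs = 30
raw = [100.0 + math.sin(2.0 * math.pi * 1.5 * k / fs) for k in range(10 * fs)]
filtered = bandpass(raw, fs)
```

After the start-up transient, the constant brightness offset is removed while the 1.5 Hz component passes almost unchanged, which is the behaviour the subsequent peak detection relies on.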
2.3 Estimating the Health Parameters

Heart Rate and Heart Rate Variability

Heart rate (HR) is defined as the number of heart beats per minute [10–13]. Figure 2 describes the entire process. Peaks are identified in the filtered PPG signal, and the following formulae are then used to calculate the heart rate:

$$P = n / fps \tag{1}$$

$$Q = P / 60 \tag{2}$$

$$HR = number\_of\_peaks / Q \tag{3}$$

Here n is the length of the input signal (i.e. the number of frames), fps is the frame rate, and HR is the computed heart rate. P is therefore the recording duration in seconds and Q the same duration in minutes.

Fig. 2 Estimation of heart rate (flow: acquiring PPG signal, peak detection, calculating heart rate)

Heart rate variability (HRV) is the variation observed between the heartbeats in the heart signal [14]. As depicted in Fig. 3, peak detection is applied to the extracted PPG signal. Based on the locations of the peaks (corresponding to the beats in the signal), successive differences are calculated. The final HRV value is the average of all of those values.

Fig. 3 Estimation of heart rate variability (flow: acquiring PPG signal, peak detection, calculating the time difference between successive peak locations, calculating HRV)

Breathing Rate and SpO2

Breathing rate (BR) is defined as the number of breaths per minute. As shown in Fig. 4, peaks are first identified in the filtered PPG signal to obtain the envelope of the signal. Peak detection is then applied again to this envelope. T is the time period between two consecutive peaks of the envelope. The instantaneous respiratory rate is calculated as 60/T breaths/min. The final breathing rate is the average of all instantaneous values [15].

Fig. 4 Estimation of breathing rate (flow: acquiring PPG signal, peak detection to identify the envelope of the signal, applying peak detection to the envelope, calculating instantaneous breathing rates, calculating breathing rate)

Oxygen saturation level (SpO2) is defined as the level of oxygen in the blood [16]. Figure 5 explains the process of estimation. The acquired video signal is divided into frames, and the red and blue color components are separated. The mean and standard deviation are calculated for both the red and blue components. Let mr and sdr be the mean and standard deviation of the red component, and let mb and sdb be the mean and standard deviation of the blue component. Conventional pulse oximetry calculates SpO2 from two wavelengths, red light and IR light [17]; here the red and blue components are used in their place, and SpO2 is estimated as

$$SpO_2 = A - B\,\frac{AC_{red}/DC_{red}}{AC_{blue}/DC_{blue}} \tag{4}$$

where mr is the red DC value, sdr is the red AC value, mb is the blue DC value, and sdb is the blue AC value. A and B are constants obtained from the actual SpO2 graph. Here A is taken as 100 and B as 5.

Fig. 5 Estimation of SpO2 (flow: acquiring video signal, converting into frames, separating red and blue components, calculating mean and standard deviation for each colour, calculating SpO2)

Blood Pressure

Blood pressure is the pressure exerted by the blood against the vessel walls. It is expressed as "Systolic Pressure (SBP)/Diastolic Pressure (DBP)" in mmHg. As shown in Fig. 6, PPG signals are acquired from the left and right fingertips. Peak detection is applied to locate the peaks. Pulse transit time (PTT) is the time difference, or lag, between the two signals. Once PTT has been estimated, it is input, along with the estimated HR, to a machine learning model (using linear regression) to calculate the systolic (SBP) and diastolic (DBP) pressure values. We have used an existing machine learning model governed by the following equations [18]:

$$SBP = 184.3 - (1.329 \times HR) + (0.0848 \times td) \tag{5}$$

$$DBP = 55.96 - (0.02912 \times HR) + (0.02302 \times td) \tag{6}$$

$$td = HR_{ms} - PTT \tag{7}$$

where td is the time delay, HRms is the heart rate in milliseconds, and PTT is the pulse transit time.

Fig. 6 Estimation of blood pressure (flow: acquiring PPG signals from left and right fingertips, peak detection, calculating PTT, inputting HR and PTT to the machine learning model, calculating SBP and DBP)
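Equations (1) to (7) can be exercised with a short sketch. Several points are assumptions made only for illustration: the peak detector is a plain strict-local-maximum test rather than the detector used in the paper, "heart rate in milliseconds" is interpreted as the beat period 60000/HR, and the PTT value in the example is arbitrary. The function names are illustrative as well.

```python
import math


def find_peaks(signal):
    """Indices of strict local maxima (a deliberately simple peak detector)."""
    return [k for k in range(1, len(signal) - 1)
            if signal[k - 1] < signal[k] > signal[k + 1]]


def heart_rate(num_peaks, n_frames, fps):
    """Eqs. (1)-(3): beats per minute from the peak count."""
    p = n_frames / fps        # Eq. (1): recording duration in seconds
    q = p / 60.0              # Eq. (2): recording duration in minutes
    return num_peaks / q      # Eq. (3)


def spo2(mr, sdr, mb, sdb, a=100.0, b=5.0):
    """Eq. (4) with AC = standard deviation and DC = mean of each colour."""
    return a - b * ((sdr / mr) / (sdb / mb))


def blood_pressure(hr, ptt_ms):
    """Eqs. (5)-(7); HRms is taken as 60000 / HR, which is an assumption."""
    hr_ms = 60000.0 / hr
    td = hr_ms - ptt_ms                              # Eq. (7)
    sbp = 184.3 - (1.329 * hr) + (0.0848 * td)       # Eq. (5)
    dbp = 55.96 - (0.02912 * hr) + (0.02302 * td)    # Eq. (6)
    return sbp, dbp


# Synthetic 10-s signal at an assumed 30 frames/s with one beat every 20 frames.
signal = [math.sin(math.pi * k / 10.0) for k in range(300)]
hr = heart_rate(len(find_peaks(signal)), n_frames=len(signal), fps=30)
sbp, dbp = blood_pressure(75.0, ptt_ms=200.0)
```

The synthetic signal contains 15 peaks in 10 s, so the heart-rate formula returns 90 beats/min; the other two functions are direct transcriptions of Eq. (4) and Eqs. (5) to (7).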
3 Implementation and Results

The algorithms discussed above for estimating the health parameters have been implemented in MATLAB R2019b. An Android application has been developed in Android Studio to acquire the video signals in real time and display the results. On opening the app for the first time, the user is greeted with a welcome slider that explains the benefits of the app and its usage, as shown in Fig. 7a. The user has to log in through Google sign-in, as shown in Fig. 7b. Basic details then have to be filled in, as depicted in Fig. 7c. When 'Next' is clicked, the video recording page opens. After a 10-s video of the right fingertip has been recorded, a timer runs while the signal is processed, and the user is then prompted to record a video of the left fingertip in the same way, as shown in Fig. 7d. During processing, the extracted raw PPG signal is filtered and the resulting signal is analyzed to estimate the vitals, as shown in Fig. 7e. Finally, the results are shown. Clicking 'Upload' saves the results in the database for future reference. All of the details pertaining to the user (personal details and readings) are stored in a Firebase database. Previous readings can be checked by clicking on 'My records', as depicted in Fig. 7f. The condition of the subject can be analyzed based on the flowchart shown in Fig. 8. When HR is between 60 and 100 beats/min, HRV is between 20 and 200 ms, BR is between 12 and 20 breaths/min, SpO2 is between 90 and 100%, SBP is between 115 and 140 mmHg, and DBP is between 70 and 90 mmHg, the subject is categorized as normal; otherwise, the subject is categorized as abnormal. The developed system was used to obtain readings from 50 subjects of different age groups. Table 1 lists the readings of five of these subjects, along with a comparison of their obtained and clinical heart rate and blood pressure values.
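The normal/abnormal decision described above amounts to a range check on six values. A minimal sketch follows; it assumes the range limits are inclusive, which the text does not state explicitly, and the names are illustrative.

```python
# Normal ranges quoted in the text; inclusive bounds are an assumption.
NORMAL_RANGES = {
    "hr": (60, 100),     # beats/min
    "hrv": (20, 200),    # ms
    "br": (12, 20),      # breaths/min
    "spo2": (90, 100),   # %
    "sbp": (115, 140),   # mmHg
    "dbp": (70, 90),     # mmHg
}


def classify(readings):
    """Return 'normal' only if every reading lies within its range."""
    for name, (low, high) in NORMAL_RANGES.items():
        if not low <= readings[name] <= high:
            return "abnormal"
    return "normal"


example = {"hr": 72, "hrv": 45, "br": 16, "spo2": 97, "sbp": 120, "dbp": 80}
result = classify(example)
```

A single out-of-range value is enough to flag the subject as abnormal, which mirrors the all-conditions-must-hold wording of the text.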
Fig. 7 Mobile app screenshots
4 Conclusion and Future Scope Based on the current situation where people are facing problems in monitoring health parameters, the proposed idea is to provide a comprehensive mobile application that measures some of the important health parameters which is effective both in terms of time and cost. Based on the obtained results, it is observed that the readings are very
If (60 Waar). Manufacturing cost turns out to be the most important factor with the weight of 0.3657 from the stakeholder perspective. Secondly, the application of add-on features that we introduced in this work plays a vital role in the smart robotic power wheelchair with a weight of 0.2538, followed by the safety criteria with 0.156 weight. However, user power efficiency scores 0.1108 weight. The other three criteria have low weights. But for both methods, the ranking of criteria is nearly the same with a slight difference in weight values. Future work should be focused on adding new criteria for prototype design of wheelchair and other multi-criteria decision-making approaches should be carried out to confirm the highest intensity weight factor to achieve the desired need of disabled people.
References 1. WHO: World Report on Disability 2011, World Health Organisation (2011) 2. Sorrento, G.U., Archambault, P.S., Routhier, F., Dessureault, D., Boissy, P.: Assessment of joystick control during the performance of powered wheelchair driving tasks. J. Neuroeng.
A Fuzzy AHP Approach to Evaluate the Strategic …
463
Rehabil. 8(1), 31 (2011) 3. Rofer, T., Mandel, C., Laue, T.: Controlling an automated wheelchair via joystick/headjoystick supported by smart driving assistance. In: 2009 IEEE International Conference on Rehabilitation Robotics, pp. 743–748. IEEE (2009) 4. Abdulghani, M.M., Al-Aubidy, K.M., Ali, M.M., Hamarsheh, Q.J.: Wheelchair neuro fuzzy control and tracking system based on voice recognition. Sensors 20(10), 2872 (2020) 5. Pálsdóttir, Á.A., Dosen, S., Mohammadi, M., Andreasen, L., Struijk, N.S.: Remote tongue based control of a wheelchair mounted assistive robotic arm–a proof of concept study. In: 2019 IEEE International Conference on Mechatronics and Automation (ICMA), pp. 1300–1304. IEEE (2019) 6. Severin, I.C., Dobrea, D.M., Dobrea, M. C.: Head gesture recognition using a 6DOF inertial IMU. Int. J. Comput. Commun. Control 15(3) (2020) 7. Penkert, H., Baron, J.C., Madaus, K., Huber, W., Berthele, A.: Assessment of a novel, smartglass-based control device for electrically powered wheelchairs. Disabil. Rehabil. Assistive Technol. 1–5 (2019) 8. Apu, M.A.R., Fahad, I., Fattah, S.A., Shahnaz, C.: Eye blink controlled low cost smart wheel chair aiding disabled people. In: 2019 IEEE R10 Humanitarian Technology Conference (R10HTC)(47129), pp. 99–103. IEEE (2019) 9. Campeau-Vallerand, C., Michaud, F., Routhier, F., Archambault, P.S., Létourneau, D., GélinasBronsard, D., Auger, C.: Development of a web-based monitoring system for power tilt-in-space wheelchairs: formative evaluation. JMIR Rehabil. Assistive Technol. 6(2), e13560 (2019) 10. Jayakody, A., Nawarathna, A., Wijesinghe, I., Liyanage, S., Dissanayake, J.: Smart wheelchair to facilitate disabled individuals. In: 2019 International Conference on Advancements in Computing (ICAC), pp. 249–254. IEEE (2019) 11. DSouza, D.J., Srivastava, S., Prithika, R., Sahana Rai, A.N.: IoT based smart sensing wheelchair to assist in healthcare. Methods 6(06) (2019) 12. 
Palankar, M., De Laurentis, K.J., Alqasemi, R., Veras, E., Dubey, R., Arbel, Y., Donchin, E.: Control of a 9-DoF wheelchair-mounted robotic arm system using a P300 brain computer interface: initial experiments. In: 2008 IEEE International Conference on Robotics and Biomimetics, pp. 348–353. IEEE (2009) 13. Tseng, TH., Liang-Rui, C., Yu-Jia, Z., Bo-Rui, X., Jin-An, L.: Battery management system for 24-V battery-powered electric wheelchair. Proc. Eng. Technol. Innov. 10, 29 (2018) 14. Chu, J.U., Moon, IH., Choi, G.W., Ryu, J.C., Mun, M.S.: Design of BLDC motor controller for electric power wheelchair. In: Proceedings of the IEEE International Conference on Mechatronics, 2004. ICM’04., pp. 92–7e. IEEE (2004) 15. Mistarihi, M.Z., Okour, R.A., Mumani, A.A.: An integration of a QFD model with FuzzyANP approach for determining the importance weights for engineering characteristics of the proposed wheelchair design. Appl. Soft Comput. 90, 106136 (2020) 16. Sutradhar, A., Sunny, M.S.H., Mandal, M., Ahmed, R.: Design and construction of an automatic electric wheelchair: an economic approach for Bangladesh. In: 2017 3rd International Conference on Electrical Information and Communication Technology (EICT), pp. 1–5. IEEE (2017) 17. Galkin, I., Podgornovs, A., Blinov, A., Vitols, K., Vorobyov, M., Kosenko, R.: Considerations regarding the concept of cost-effective power-assist wheelchair subsystems: (Case Study and Initial Evaluation). Electr. Control Commun. Eng. 14(1), 71–80 (2018) 18. Immanuel, C.R., Paul, B.P., Gnanaraj, S.D., Sam Paul, P., Thomas, T.K.: Design and development of cost-effective motorised wheelchair. In: Proceedings of 6th International & 27th All India Manufacturing Technology, Design and Research Conference (AIMTDR-2016) College of Engineering, Pune, Maharashtra, INDIA (2016) 19. Nwaoha, T.C., Ashiedu, F.I.: Engineering judgment in wheelchair design criteria: an analytical hierarchy process (AHP) approach. J. Sustain. Technol. 6(2), 32–42 (2015) 20. 
Knofius, N., van der Heijden, M.C., Zijm, W.H.M.: Consolidating spare parts for asset maintenance with additive manufacturing. Int. J. Prod. Econ. 208, 269–280 (2019)
464
S. K. Sahoo and B. B. Choudhury
21. Ma, C., Li, W., Li, Q., Gravina, R., Yang, Y., Fortino, G.: An embedded risk prediction system for wheelchair safety driving. In: Advances in Body Area Networks I, pp. 149–163. Springer, Cham (2019) 22. Park, B., Roh, C., Kim, J.: A study on driving safety evaluation criteria of personal mobility. J. Korea Inst. Intell. Transp. Syst. 17(5), 1–13 (2018) 23. Al-Okby, M.F.R., Neubert, S., Stoll, N., Thurow, K.: Low-cost, flexible, and reliable hybrid wheelchair controller for patients with tetraplegia. In: 2019 IEEE International Conference on Cyborg and Bionic Systems (CBS), pp. 177–183. IEEE (2019) 24. Salimi, Z., Ferguson-Pell, M.: Investigating the reliability and validity of three novel virtual reality environments with different approaches to simulate wheelchair maneuvers. IEEE Trans. Neural Syst. Rehabil. Eng. 27(3), 514–522 (2019) 25. Andrews, A.W., Vallabhajosula, S., Ramsey, C., Smith, M., Lane, M.H.: Reliability and normative values of the wheelchair propulsion test: a preliminary investigation. NeuroRehabilitation 45(2), 229–237 (2019) 26. Craciunescu, M., Baicu, D., Circiumaru, M., Mocanu, S., Dobrescu, R.: Towards the development of autonomous wheelchair. In: 2019 22nd International Conference on Control Systems and Computer Science (CSCS), pp. 552–557. IEEE (2019) 27. Somwanshi, D., Bundele, M.: Obstacle detection approach for robotic wheelchair navigation. In: International Conference on Artificial Intelligence: Advances and Applications 2019, pp. 261–268. Springer, Singapore (2020) 28. Saaty, T.L.: A scaling method for priorities in hierarchical structures. J. Math. Psychol. 15(3), 234–281 (1977) 29. Buckley, J.J., Uppuluri, V.R.: Fuzzy hierarchical analysis. In: Uncertainty in Risk Assessment, Risk Management, and Decision Making, pp. 389–401. Springer, Boston, MA (1987)
An Earthquake Prediction System for Bangladesh Using Deep Long Short-Term Memory Architecture

Md. Hasan Al Banna, Tapotosh Ghosh, Kazi Abu Taher, M. Shamim Kaiser, and Mufti Mahmud
Abstract An earthquake is a natural catastrophe and one of the most significant causes of structural and financial damage, along with the death of many people. Predicting an earthquake at least a month ahead of the event may reduce the death toll and financial loss. Bangladesh is in an active seismic region, where many earthquakes of small and medium magnitude occur almost every year. Several scientists have predicted that there is a good chance of a high-energy earthquake in this region in the near future. In this work, we propose a long short-term memory (LSTM)-based architecture for predicting whether an earthquake will occur in Bangladesh in the following month. After hyperparameter tuning, an architecture of two LSTM layers with 200 and 100 neurons, respectively, along with L1 and L2 regularization, was found to be the most efficient. The activation function of the LSTM layers in the proposed architecture is tanh. The proposed LSTM architecture achieved 70.67% accuracy with 64.78% sensitivity and 75.94% specificity in earthquake prediction for this region.

Keywords Earthquake · Prediction · Optimization · LSTM
Md. Hasan Al Banna and Tapotosh Ghosh contributed equally.

Md. Hasan Al Banna · T. Ghosh · K. A. Taher
Bangladesh University of Professionals, Dhaka, Bangladesh

M. S. Kaiser (B)
Institute of Information Technology, Jahangirnagar University, Savar, Dhaka 1342, Bangladesh

M. Mahmud
Nottingham Trent University, Clifton Campus, Nottingham NG11 8NS, UK

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_41
1 Introduction

An earthquake is a natural disaster caused by the movement of the earth's tectonic plates as a result of a substantial release of internal energy. This phenomenon is common in places where geological faults are situated and massive amounts of rock move against one another in a narrow space [1]. The center of the earth is full of liquid magma, and the earth's surface floats on it. The temperature of the earth's core is very high, and sometimes its energy needs to be released. This energy is released through the fault lines of the earth, which are the areas between the large pieces of the earth's crust called tectonic plates. The tremor accompanying this energy release is called an earthquake. The impact of an earthquake does not end when the tremors stop, since it leads to infrastructural damage and to another natural phenomenon called a tsunami. A tsunami occurs when a high-magnitude earthquake happens on the ocean bed, and strong waves as high as 100 m sweep over the land near the shore. From 1998 to 2017, about 750,000 people died worldwide because of earthquakes, and 125 million people were affected by this natural disaster [2]. Bangladesh is a country in South Asia with 147,570 km² of land, where 180 million people live. It lies between 20.35° and 26.75° north latitude and between 88.03° and 92.75° east longitude. The country sits on three moving tectonic plates: the Indian plate, the Burmese plate, and the Eurasian plate. Five fault lines run through the country. In terms of the risk of damage, Bangladesh is ranked 5th in the world [3]. An earthquake of magnitude 7.5 on the Richter scale could take 88,000 lives, demolish 72,000 buildings, and cause a loss of 1075 million dollars in Dhaka city, the capital of Bangladesh [4]. Nevertheless, an accurate earthquake prediction system can help reduce the impact.
Different methods have been suggested for earthquake prediction. Geller investigated the scientific quality of 100 years of earthquake research [5]. Jiang et al. [6] used a support vector machine (SVM) algorithm to predict the highest earthquake magnitude in China based on some earthquake precursors. Asim et al. [7] performed earthquake prediction in Chile, Hindukush, and Southern California using a hybrid model of support vector regression (SVR) and Levenberg-Marquardt backpropagation (LM-BP). To predict earthquakes in the Himalayan region, Narayanakumar et al. [8] proposed a BP-based neural network (NN). Hu et al. [9] proposed an earthquake prediction architecture with logistic regression (LR) and BP. For earthquake prediction in the Chinese region, they considered 5 seismic and 2 time-varying parameters. The error calculated by the LR was used along with the BP output to generalize the prediction. Maya and Yu [10] proposed using transfer learning and meta-learning to improve the accuracy of earthquake prediction. Li et al. [11] compared the performance of different machine learning (ML) algorithms for predicting the arrival time of seismic waves. They used the short time average over long time average (STA/LTA) algorithm to detect triggering events and evaluated the approach with SVM, random forest (RF), and decision tree (DT) algorithms. Asim et al. [12] evaluated the performance of SVM, RF, and artificial neural network (ANN) models in Cyprus.
An Earthquake Prediction System for Bangladesh …
RF performed best for lower magnitude events, while SVM was accurate for mid-level earthquakes. Karimzadeh et al. [13] predicted aftershocks of earthquakes using the stress change and stress distribution. They used SVM, RF, K-nearest neighbors (KNN), and Naive Bayes (NB) algorithms to predict the location of the earthquake, and NB was the best performer. Bio-inspired algorithms tend to work well for optimizing NN parameters. Majhi et al. [14] used a functional link ANN, which has no hidden layers, to calculate the mean magnitude of the previous 100 events before every month. Hajikhodaverdikhan et al. [15] predicted the number of earthquakes and the mean magnitude of the following month in Iran using a PSO-SVR architecture, in which the hyperparameters of the SVR (C, ε, kernel scale) were optimized to produce better performance. Li and Liu [16] used PSO to optimize BP.
Because earthquakes exhibit long-term correlation, investigating prior spatio-temporal data can reveal new information about them. Wang et al. [17] proposed using spatio-temporal features to predict earthquakes in China with a long short-term memory (LSTM) network. Kishore et al. [18] compared the performance of a feed-forward neural network (FFNN) and LSTM for predicting the trend of earthquakes. Earthquake prediction needs a model that can analyze a long sequence of seismic data to discover a pattern. A recurrent neural network (RNN) has a limited capability for this, as it can consider the previous state of the data. However, it suffers from the vanishing and exploding gradient problems. LSTM improves on the RNN: it can deal with long sequences and produces good results.
The main contributions of this paper are as follows:
• An LSTM model for predicting an earthquake in Bangladesh in the following month is proposed.
• G-R seismicity indicators, calculated for each month, were used as features for this study [19].
• Hyperparameter tuning and regularization techniques were adopted to improve the performance of the model.
• The proposed model was compared with logistic regression (LR), SVM, RF, and ANN.
To the best of the authors' knowledge, no research has been done on predicting earthquakes in advance with Bangladesh as the study region. The next section introduces the basic concepts of LSTM. Section 3 discusses the methodology, and Sect. 4 presents the results and discussion. Section 5 provides conclusions and future work.
2 Basics of Long Short-Term Memory (LSTM)
LSTM is an extension of the RNN designed for situations in which the distance between previous and current information is large. The architecture consists mainly of three gates: the input gate, the forget gate, and the output gate. The cell state is the memory of the network, which can carry important information along the sequence [20]. Figure 1 illustrates the generic cell structure of LSTM.

Fig. 1 Generic LSTM cell structure. The LSTM cell consists of a forget gate, an input gate, and an output gate. The cell state holds the memory of the network

In the LSTM cell, the current input $x_t$ and the hidden state from the previous step $h_{t-1}$ pass through a sigmoid activation function that marks the importance of the information. Information that receives a value close to 0 is forgotten by the forget gate, and information with a value close to 1 is propagated. The output of the forget gate is calculated as

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f),$$

where $W_f$ is the weight matrix of the forget gate and $b_f$ is its bias.
The input gate is designed to update the cell state, which is the memory of the LSTM architecture. The previous information and the current input pass through a sigmoid and a tanh activation function. The outputs of these two activations are multiplied together: the sigmoid output, which lies between 0 and 1, decides how much of each candidate value is written to the cell state. If $\tilde{C}_t$ is the output of the tanh activation and $i_t$ is the output of the sigmoid function, then

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i),$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C),$$

where $W_i$ and $W_C$ are the weights of the input gate and the cell state, and $b_i$ and $b_C$ are the corresponding biases. The cell state is updated through a point-wise addition with the output of the input gate. If $C_{t-1}$ denotes the previous state information and $C_t$ the current state information, the operation in the cell state can be expressed as

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t,$$

where $\odot$ denotes point-wise multiplication.
The output gate determines the next hidden state. The inputs are propagated through a sigmoid activation function, which provides an output between 0 and 1. The cell state information also passes through a tanh activation function, and the result is multiplied by the sigmoid output. This multiplication decides which information is carried in the hidden state. If $o_t$ is the sigmoid output of this gate and $h_t$ is the output of the gate, then

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o),$$
$$h_t = o_t \odot \tanh(C_t),$$

where $W_o$ is the weight matrix of the output gate and $b_o$ is its bias. In this way, LSTM stores information about long sequences.
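The gate equations above can be traced step by step in code. The following is a minimal sketch of a single LSTM time step in plain Python (standard library only). It is not the authors' implementation: the parameter dictionary keys (`Wf`, `bf`, and so on) and the helper names are illustrative assumptions, and a practical model would rely on a deep learning library instead.

```python
import math


def sigmoid(z):
    """Logistic function used by the forget, input, and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W . v + b for a weight matrix W (list of rows) and vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; `params` holds hypothetical keys Wf, bf, Wi, bi, Wc, bc, Wo, bo."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["Wf"], z, params["bf"])]    # forget gate
    i_t = [sigmoid(a) for a in affine(params["Wi"], z, params["bi"])]    # input gate
    c_hat = [math.tanh(a) for a in affine(params["Wc"], z, params["bc"])]  # candidate state
    o_t = [sigmoid(a) for a in affine(params["Wo"], z, params["bo"])]    # output gate
    # C_t = f_t * C_{t-1} + i_t * C~_t (point-wise)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # h_t = o_t * tanh(C_t) (point-wise)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Tiny example: hidden size 1, input size 1, all weights and biases zero.
zero_params = {k: [[0.0, 0.0]] for k in ("Wf", "Wi", "Wc", "Wo")}
zero_params.update({k: [0.0] for k in ("bf", "bi", "bc", "bo")})
h_t, c_t = lstm_step([0.3], [0.1], [2.0], zero_params)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved. This makes the example easy to check by hand against the equations.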
3 Methodology
3.1 Dataset Preparation
In this work, we collected a dataset of seismic events near Bangladesh between May 1973 and February 2020 from the United States Geological Survey (USGS) website [21]. The dataset contains 1763 events. We tried to predict earthquakes in the following month. Eight seismic parameters [19] were calculated for each month, with the previous 50 events considered in each calculation: the T-value ($T$), the mean magnitude ($M_{mean}$), the rate of square root of seismic energy ($E$), the b-value ($b$), the mean time between characteristic events ($M_{char}$), the η-value ($\eta$), the magnitude deficit ($\bar{M}$), and the coefficient of variation from the mean time ($M_t$). If the magnitude of any seismic event in a month was higher than 4.7, the vector of seismic parameters was labeled 1, meaning that there was a significant earthquake that month. Otherwise, the row was labeled 0. After calculating all the features, we handled the null values in the dataset by erasing the affected rows. This left 495 rows (257 ones, 238 zeros). We divided the dataset into two parts: 345 records (70%) were used for training, and 150 records (30%) were kept isolated for testing. Figure 2 shows the study region of this research work.
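The windowing and labeling rules described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, only the mean magnitude is shown out of the eight indicators, and returning `None` when fewer than 50 earlier events exist is an assumed way of producing the null rows that are later dropped.

```python
WINDOW = 50      # number of preceding events per feature vector (from the text)
THRESHOLD = 4.7  # a month is labeled 1 if any event exceeds this magnitude (from the text)


def mean_magnitude(magnitudes, end, window=WINDOW):
    """Mean magnitude of the `window` events that precede index `end`.

    Returns None when fewer than `window` earlier events exist (assumed null row).
    """
    if end < window:
        return None
    recent = magnitudes[end - window:end]
    return sum(recent) / window


def month_label(month_magnitudes, threshold=THRESHOLD):
    """Return 1 if any event of the month is strictly above the threshold, else 0."""
    return int(any(m > threshold for m in month_magnitudes))


# Made-up catalog: fifty events of magnitude 4.0 followed by one of magnitude 5.0.
catalog = [4.0] * 50 + [5.0]
feature = mean_magnitude(catalog, end=50)  # uses events 0..49
label = month_label([4.2, 4.8])            # one event above 4.7
```

The comparison is strict because the text says "higher than 4.7", so a month whose largest event is exactly 4.7 would be labeled 0.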
Fig. 2 Study region of the research work. Red dots denote the centers of the seismic events. Most of the seismic events close to Bangladesh are centered in India and Myanmar
3.2 Proposed Model
In this work, we take 8 seismic parameters as input. The input passes through two LSTM layers, the second of which is bidirectional, followed by a flatten layer. The output tensor of these layers is then propagated through a final dense layer with 2 neurons, which provides the output. Figure 3 describes the proposed LSTM architecture.

Fig. 3 Proposed LSTM architecture for earthquake prediction in Bangladesh. It consists of an LSTM layer, a bidirectional LSTM layer, and finally a dense layer that provides the prediction output

We optimized the hyperparameters through rigorous tuning of the number of neurons and the activation function of the LSTM layers. We tried 16 to 400 neurons in the first layer and 8 to 200 in the second, as performance decreases beyond those values. We evaluated the LSTM architecture with the tanh and rectified linear unit (ReLU) activation functions in the LSTM layers. We also tried a softmax activation in the final dense layer. Finally, we added L1 and L2 regularization (L1 = 1 × 10⁻⁵, L2 = 1 × 10⁻⁴) to optimize the performance of the proposed model. L1 regularization, also known as Lasso regularization, adds the absolute values of the weights to the loss function as a penalty term. L2 regularization, also known as ridge regularization, adds the squared values of the weights as the penalty term. The proposed model is a 2-layer LSTM architecture with 200 and 100 neurons and the tanh activation function, which was selected by trying different combinations of hyperparameters. L1 and L2 regularization are also used in this model.
We compared the model with SVM, RF, LR, and ANN models to evaluate its performance. SVM works by creating a hyperplane between two classes such that the hyperplane remains at the highest possible distance from the nearest data points of both classes [22]. In this work, we trained an SVM model with an RBF kernel and the regularization parameter C set to 1. LR separates data points by creating a decision boundary using the logistic function [23]. We trained the LR model with an L2 penalty. RF is a decision tree based classifier in which multiple decision trees decide together to solve a classification problem [24]. In this work, we set the number of trees in the random forest to 100 and selected the Gini criterion. The artificial neural network is designed to work like a human brain [25]. It is built from a set of connected neurons, each with its own bias and weights. In this work, the ANN has an input layer that takes the 8 attributes of seismic events, followed by two consecutive dense layers with 30 and 60 neurons and the tanh activation function. A flatten layer is attached after that. Finally, a dense layer with 2 neurons and a sigmoid activation function serves as the output layer. Training was performed with a learning rate of 0.001, the Adam optimizer, and 3000 epochs in total.
The model was then tested with the unseen testing dataset.
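To make the two penalty terms concrete, the following minimal sketch computes the combined L1 and L2 penalty for a list of weights, using the coefficients reported above. The function name and the example weights are made up for illustration; deep learning libraries compute this term internally, and how the authors configured it in their framework is not stated in the text.

```python
L1_COEFF = 1e-5  # L1 coefficient reported in the text
L2_COEFF = 1e-4  # L2 coefficient reported in the text


def l1_l2_penalty(weights, l1=L1_COEFF, l2=L2_COEFF):
    """Penalty added to the loss: l1 * sum(|w|) + l2 * sum(w^2)."""
    l1_term = l1 * sum(abs(w) for w in weights)  # Lasso part
    l2_term = l2 * sum(w * w for w in weights)   # ridge part
    return l1_term + l2_term


example_weights = [1.0, -2.0, 3.0]  # illustrative values only
penalty = l1_l2_penalty(example_weights)
```

For these example weights the L1 part is 1e-5 × 6 and the L2 part is 1e-4 × 14. With these coefficients the squared term grows faster for large weights, while the absolute-value term keeps a constant pull toward zero for small ones.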
4 Result Analysis
In this section, we analyze the results obtained in this study. To evaluate the models, we used the true positive value ($T_p$), false positive value ($F_p$), true negative value ($T_n$), false negative value ($F_n$), sensitivity ($S_n$), specificity ($S_p$), positive predictive value ($P_1$), negative predictive value ($P_0$), accuracy, and unweighted average recall (UAR) [26]. An in-depth discussion of the results is presented in the later portion of this section.
4.1 Earthquake Prediction in Bangladesh
For Bangladesh, data from May 1973 to February 2019 were used for the latitude range 18.11° to 27.11° and the longitude range 87.19° to 95.36°. We used hyperparameter tuning to find the best model for predicting earthquakes in Bangladesh. The best model was a 2-layer LSTM model with 200 and 100 neurons and the tanh activation function. L1 and L2 regularization were used for this model. The models are evaluated based on accuracy and UAR. Accuracy is the ratio of the correctly predicted events to all evaluated events, as given by Eq. (1).

$$\text{Accuracy} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n} \tag{1}$$

UAR, the mean of the recall values of the two classes, is a measure of the generalization capability of the model. It is given by Eq. (2).

$$\text{UAR} = \frac{S_n + S_p}{2} \tag{2}$$

The proposed model produced its highest accuracy at 4000 epochs. Table 1 shows the 5 best models for earthquake prediction. Our proposed model is therefore a 2-layer LSTM model with 200 and 100 neurons, respectively, in which the second layer is bidirectional. The bidirectional layer has L1 and L2 regularization. This model performs about 5% better in relative accuracy than the second best model, which does not have regularization (70.67% versus 67.33%). On the training data, the model reached 99.18% accuracy. $S_n$ and $S_p$ were both near 99%, and $P_0$ and $P_1$ also converged to 99%. The unweighted average recall was 0.9915. We can therefore say that the model was well trained on the data for Bangladesh. Figure 4 shows the confusion matrices of the trained model for the training and testing samples.
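The metrics defined above follow directly from the four confusion-matrix counts. The sketch below computes them for the test-set counts that Table 2 reports for the proposed LSTM ($T_n$ = 60, $F_p$ = 19, $F_n$ = 25, $T_p$ = 46). The function name and dictionary keys are illustrative, and the formulas for $P_1$ and $P_0$ are the standard definitions of positive and negative predictive value, which the text itself does not spell out.

```python
def evaluate(tn, fp, fn, tp):
    """Return the evaluation metrics used in this section from confusion-matrix counts."""
    sn = tp / (tp + fn)  # sensitivity (recall of class 1)
    sp = tn / (tn + fp)  # specificity (recall of class 0)
    p1 = tp / (tp + fp)  # positive predictive value (standard definition)
    p0 = tn / (tn + fn)  # negative predictive value (standard definition)
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # Eq. (1)
    uar = (sn + sp) / 2                         # Eq. (2)
    return {"Sn": sn, "Sp": sp, "P0": p0, "P1": p1, "accuracy": accuracy, "UAR": uar}


# Counts reported for the proposed LSTM on the 150 test records (Table 2).
lstm_metrics = evaluate(tn=60, fp=19, fn=25, tp=46)
```

Because the test set is close to balanced (71 positive and 79 negative records), accuracy and UAR come out near each other here. On a strongly imbalanced set the two would diverge, which is the reason for reporting both.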
Table 1 Best performing tuned LSTM models for earthquake prediction in Bangladesh
| No. of neurons (1st layer) | No. of neurons (2nd layer) | Activation function | Regularization | Sn | Sp | Accuracy (%) | UAR |
|---|---|---|---|---|---|---|---|
| 200 | 100 | tanh | yes | 0.6478 | 0.7594 | 70.67 | 0.7036 |
| 200 | 100 | tanh | no | 0.6901 | 0.6582 | 67.33 | 0.67415 |
| 150 | 100 | tanh | no | 0.6478 | 0.6455 | 64.66 | 0.64665 |
| 200 | 100 | ReLU | no | 0.676 | 0.6202 | 64.66 | 0.6481 |
| 300 | 150 | tanh | no | 0.6338 | 0.6582 | 64.66 | 0.646 |
Fig. 4 Confusion matrix of the earthquake prediction. a Confusion matrix generated during prediction of the training set records by the proposed LSTM architecture. b Confusion matrix generated during prediction of the test set records by the proposed LSTM architecture. In both of the cases, 0 denotes no earthquake and 1 denotes earthquake event
The proposed model was then compared with other models for predicting earthquakes in Bangladesh. RF produced the highest sensitivity, 0.7183. The ANN and the proposed LSTM model also provided good sensitivity; the $S_n$ of the proposed model was 0.6479. However, the specificity of the other models was poor. The best specificity, 0.7595, was provided by the proposed LSTM model. The nearest $S_p$, 0.6835, was achieved by the LR model. The highest $P_0$ and $P_1$ values, 0.7058 and 0.7077, were also provided by the LSTM model. The UAR of the proposed model was 0.7037, which is better than that of all the other models. The closest value reached by any other model was 0.6123, obtained with RF. Table 2 shows the detailed performance of the proposed model on unseen test data and the comparison with the other models. The proposed model outperformed the other models on almost all evaluation metrics. The average of $S_n$, $S_p$, $P_0$, and $P_1$ for the LSTM model was 0.7052, which is also the highest value among all the models.
Table 2 Comparison of different machine learning models for earthquake prediction in Bangladesh
| Model | Tn | Fp | Fn | Tp | Sn | Sp | P0 | P1 |
|---|---|---|---|---|---|---|---|---|
| SVM | 36 | 43 | 29 | 42 | 0.5915 | 0.4556 | 0.5538 | 0.4941 |
| LR | 54 | 25 | 55 | 16 | 0.2253 | 0.6835 | 0.4954 | 0.3902 |
| RF | 40 | 39 | 20 | 51 | 0.7183 | 0.5063 | 0.6666 | 0.5666 |
| ANN | 27 | 52 | 21 | 50 | 0.7042 | 0.3417 | 0.5625 | 0.4901 |
| LSTM | 60 | 19 | 25 | 46 | 0.6479 | 0.7595 | 0.7058 | 0.7077 |
4.2 Discussion on the Achieved Result
A model should not be evaluated on accuracy alone, because accuracy can be misleading for imbalanced data. In this study, we therefore also used UAR and the average of $S_n$, $S_p$, $P_0$, and $P_1$ as two further evaluation parameters. In terms of accuracy, the proposed model is about 16.5% better in relative terms than the closest model, which is RF (70.67% versus 60.66%). After the proposed model, the RF model provided better results than the other models. Surprisingly, the ANN-based model did not perform well on this study area; its performance was close to that of the SVM model. The LR model performed poorly, with an accuracy of 46.66%, a UAR of 0.4544, and an average score of 0.4486. In terms of UAR, RF achieves a score of 0.6123, and the proposed model is 14.92% better than RF. The average score of RF was 0.61445, and the average score of the proposed model is 14.77% higher. Figure 5 presents the comparison between SVM, LR, RF, ANN, and the proposed LSTM model in terms of accuracy, UAR, and average performance. The regularization in the proposed LSTM model helped considerably in generalizing predictions to unseen data.

Fig. 5 Comparison of different models for earthquake prediction in Bangladesh. Proposed LSTM outperformed all the other models considering three different evaluation parameters (accuracy, UAR, and average). Bar chart, x-axis: Models, y-axis: Performance. Bar values (accuracy, UAR, average): SVM 0.52, 0.5235, 0.5238; LR 0.4666, 0.4544, 0.4486; RF 0.6066, 0.6123, 0.6145; ANN 0.5133, 0.523, 0.5246; proposed LSTM 0.7067, 0.7037, 0.7052
5 Conclusions
An earthquake is a random natural phenomenon that is very hard to predict. However, seismicity indicators can be calculated from the earthquake catalog, and these help in the prediction procedure. Historically, Bangladesh has seen high-magnitude earthquakes. Because it lies on the boundary of three major tectonic plates, the country faces a major earthquake threat. Our proposed model can predict an earthquake in Bangladesh in the following month with good accuracy, and the sensitivity and specificity of the model are also good. Model tuning and hyperparameter optimization increased the performance of the model, and regularization was used to achieve good results on unseen data. We compared the proposed model with several other machine learning models to see the difference in performance. Although the proposed model produces good results, its performance can still be increased. A deeper and more complex model, along with transfer learning, may improve the performance. In the future, we will update the model to improve its efficiency and incorporate location and exact time prediction for the region of Bangladesh.
Acknowledgements This study was supported by the research funds of the Information and Communication Technology division of the Government of the People's Republic of Bangladesh in 2019-2020.
References
1. Absar, N., Shoma, S.N., Chowdhury, A.A.: Estimating the occurrence probability of earthquake in Bangladesh. Int. J. Sci. Eng. Res. 8(2) (2017)
2. Mizutori, M., Guha-Sapir, D.: Economic losses, poverty and disasters 1998–2017. United Nations Office for Disaster Risk Reduction (2017)
3. Zaman, Md., Sifty, A., Rakhine, S., Md. Abdul, A., Amin, R.: Earthquake risks in Bangladesh and evaluation of awareness among the university students. J. Earth Sci. Clim. Change 9(7) (2018)
4. Rahman, M., Paul, S., Biswas, K.: Earthquake and Dhaka city-an approach to manage the impact. J. Sci. Found. 9(1–2), 65–75 (2011)
5. Geller, R.J.: Earthquake prediction: a critical review. Geophys. J. Int. 131(3), 425–450 (1997)
6. Jiang, C., Wei, X., Cui, X., You, D.: Application of support vector machine to synthetic earthquake prediction. Earthq. Sci. 22(3), 315–320 (2009)
7. Asim, K.M., Idris, A., Iqbal, T., Martínez-Álvarez, F.: Earthquake prediction model using support vector regressor and hybrid neural networks. PLoS ONE 13(7) (2018)
8. Narayanakumar, S., Raja, K.: A BP artificial neural network model for earthquake magnitude prediction in Himalayas, India. Circ. Syst. 7(11), 3456–3468 (2016)
9. Hu, W.S., Nie, H.L., Wang, H.: Applied research of BP neural network in earthquake prediction. In: Applied Mechanics and Materials, vol. 204, pp. 2449–2454. Trans Tech Publ (2012)
10. Maya, M., Yu, W.: Short-term prediction of the earthquake through neural networks and meta-learning. In: 2019 16th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE), pp. 1–6. IEEE (2019)
11. Li, W., Narvekar, N., Nakshatra, N., Raut, N., Sirkeci, B., Gao, J.: Seismic data classification using machine learning. In: 2018 IEEE Fourth International Conference on Big Data Computing Service and Applications (BigDataService), pp. 56–63. IEEE (2018)
12. Asim, K.M., Moustafa, S.S., Niaz, I.A., Elawadi, E.A., Iqbal, T., Martínez-Álvarez, F.: Seismicity analysis and machine learning models for short-term low magnitude seismic activity predictions in Cyprus. Soil Dyn. Earthq. Eng. 130, 105932 (2020)
13. Karimzadeh, S., Matsuoka, M., Kuang, J., Ge, L.: Spatial prediction of aftershocks triggered by a major earthquake: a binary machine learning perspective. ISPRS Int. J. Geo-Inform. 8(10), 462 (2019)
14. Majhi, S.K., Hossain, S.S., Padhi, T.: MFOFLANN: moth flame optimized functional link artificial neural network for prediction of earthquake magnitude. Evol. Syst. 11(1), 45–63 (2020)
15. Hajikhodaverdikhan, P., Nazari, M., Mohsenizadeh, M., Shamshirband, S., Chau, K.W.: Earthquake prediction with meteorological data by particle filter-based support vector regression. Eng. Appl. Comput. Fluid Mech. 12(1), 679–688 (2018)
16. Li, C., Liu, X.: An improved PSO-BP neural network and its application to earthquake prediction. In: 2016 Chinese Control and Decision Conference (CCDC), pp. 3434–3438. IEEE (2016)
17. Wang, Q., Guo, Y., Yu, L., Li, P.: Earthquake prediction based on spatio-temporal data mining: an LSTM network approach. IEEE Trans. Emerg. Top. Comput. (2017)
18. Bhandarkar, T., Vardaan, K., Satish, N., Sridhar, S., Sivakumar, R., Ghosh, S.: Earthquake trend prediction using long short-term memory RNN. Int. J. Electr. Comput. Eng. 9(2), 1304 (2019)
19. Panakkat, A., Adeli, H.: Neural network models for earthquake magnitude prediction using multiple seismicity indicators. Int. J. Neural Syst. 17(01), 13–33 (2007)
20. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
21. Search earthquake catalog. https://earthquake.usgs.gov/earthquakes/search/ (2020). Accessed on July 14, 2020
22. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
23. Cox, D.R.: The regression analysis of binary sequences. J. Roy. Stat. Soc.: Ser. B (Methodol.) 20(2), 215–232 (1958)
24. Ho, T.K.: Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition, vol. 1, pp. 278–282. IEEE (1995)
25. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)
26. Reyes, J., Morales-Esteban, A., Martínez-Álvarez, F.: Neural networks to predict earthquakes in Chile. Appl. Soft Comput. 13(2), 1314–1328 (2013)
Offline Odia Handwritten Characters Recognition Using WEKA Environment Anupama Sahu and S. N. Mishra
Abstract Optical character recognition (OCR) is an image analysis technique in which scanned digital images containing handwritten or machine-printed script are given as input to a system that converts them into an editable, machine-readable text format. The development of OCR for regional scripts such as Odia, Telugu and Bengali is currently an active research field. Odia in particular is a great challenge for OCR developers because of the various categories of characters in its alphabet. In addition, different Odia letters have to be combined, and many characters have similar roundish loops. In this paper, WEKA software is used to build a classification model for offline character recognition. The work is an attempt toward the development of a novel algorithm for classifying offline handwritten Odia characters using Naive Bayes and decision table classifiers in the WEKA environment.
Keywords OCR · Classification · Naive Bayes · Decision table · Machine learning
A. Sahu (B) · S. N. Mishra, Department of Computer Science, Engineering and Applications, Indira Gandhi Institute of Technology, Sarang, Dhenkanal, Odisha, India. e-mail: [email protected]; S. N. Mishra e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_42

1 Introduction
Odia is one of the regional languages of India. It is the main medium of communication in Odisha, spoken by more than 46 million (4.6 crore) people, or 75% of the population (2020 census). Odia is the language used for all official purposes of the State of Odisha. The Odia script has about 269 symbols (11 vowels, 38 consonants, 10 digits and 210 conjuncts). Of these 269, around 90 characters are very difficult to distinguish and recognize because of their complex shapes. In Odia script, all matras are comparatively small. When Odia is written, the base line may drop where the usual characters and matras are used. Sometimes, the usual characters are joined with the matras, which creates composite modifiers.
Optical character recognition is a promising technology for converting written text into digital form. OCR is a common procedure for digitizing printed texts so that they can be electronically edited and displayed online. Odia is a classical language of the Indian subcontinent that is used by more than 46 million people. Figure 1 shows the preprocessing techniques, followed by segmentation, feature extraction and classification. Using these techniques, better output can be obtained at each individual stage. Various methods are available for each stage, such as neural networks, fuzzy logic and machine learning. An OCR system is based on the key steps shown in Fig. 1.

Fig. 1 Basic steps of OCR system (blocks: input image; preprocessing with noise removal and normalization; segmentation; feature extraction; classification; recognized character; output image)

Recognition of handwritten Odia is mostly classified into offline recognition and online recognition, as shown in Fig. 2. In offline recognition, the text is digitized into an image, which is then converted into letter codes that can be used within the computer and text-processing applications. In online recognition, the recognition process works on the data stream that comes from a transducer while the person is writing. For online character recognition, the data are collected through smart phones, tablets, etc. The rest of the paper is arranged as follows: Sect. 2 gives the background, Sect. 3 presents the proposed model, Sect. 4 describes the simulation work, and Sect. 5 concludes the paper.
Fig. 2 Types of character recognition (diagram nodes: character recognition; online; offline; handwritten script; recognition; verification; single character; printed; handwritten)
2 Background
Munish Kumar et al. [1] proposed different classification methods using the WEKA environment. They used two feature extraction methods, parabola curve fitting and power curve fitting. For recognition, the authors collected 3500 isolated offline handwritten Gurumukhi characters written by 100 distinct writers and used 60% of the data for training and 40% for testing. They achieved an accuracy of about 82.92% with parabola curve fitting and 82.86% with power curve fitting. In [2], the authors presented a stochastic method to localize the regions of car license plates using the standard set by the Malaysian Road Transport Department (MRT). To locate the license plates, the front and rear views of the car images are first preprocessed. Features are then extracted using the connected components of the license plate. A Naive Bayes classifier was used for localizing the license plates, and with that technique they achieved an accuracy of about 0.98%. In [3], the authors used a Naive Bayes classifier for the segmentation of text components. Simple procedures are used to generate a large collection of data sets in the training stage. A collection of manuscript and printed Persian and English images that had been manually separated was used for training. Proper post-processing is applied to improve the segmentation results.
In [4], the authors created a handy software tool for identifying characters efficiently. Their preprocessing stage enhances the data images for computational processing. First, the input image is converted into a grayscale image, and the grayscale image is then converted into a binary image. Morphological operations are used in the preprocessing stage. The images are then converted into comma separated files, which can be used as training and testing data sets in the WEKA environment. In [5], the authors used an in-node microprocessor-based approach to vehicle classification. This technique analyzes and determines the types of vehicles passing over a three-axis magnetometer sensor. They used the J48 classification algorithm in WEKA for vehicle classification and achieved an accuracy of about 100%. In [6], median filtering is applied to the input characters, and a normalization method is used over the characters to remove border edge pixel points. In the feature extraction stage, each character is first divided into a 3 × 3 grid, and the correlation with the centroid is assessed for all nine zones.
In [7], the authors proposed skew correction of handwritten Chinese characters based on two models: a four-direction classification model and a 181-angle classification model. Both models were constructed on the basis of a residual neural network (ResNet). They achieved an accuracy of about 98.4% with the four-direction classification model when evaluated on the CASIA-HWDB 1.1 data set. In [8], the authors proposed an artificial neural network for the recognition of English alphanumeric characters. The OCR system is split into two sections, a training section and a recognition section. Both sections involve image acquisition, preprocessing and feature extraction. They trained their network with the proposed training algorithm, tested more than ten samples per character, and obtained a correctness of about 99% for numeric digits, 97% for capital letters and 96% for small letters. In [9], the authors proposed a deep neural network in which Inception V3 is used to train on and recognize noisy characters. They collected around 53,342 noisy character images from different receipts and newspapers. After training on those noisy characters, the recognition results showed that the deep neural network achieved better accuracy on poor-quality character images. In [10], the authors proposed a deep learning method for character recognition based on an artificial neural network. Deep learning provides suitable features for recognizing character images. They achieved an accuracy of nearly 78.6% with the deep learning method.
In [11], the authors applied the water reservoir technique to the segmentation and recognition of touching characters in handwritten Gurumukhi. With this method, they achieved an accuracy of nearly 93.51% for character segmentation. In [12], the authors proposed a neural network for the identification of Gujarati numerals. A multilayered feed forward neural network was used for the classification of handwritten Gujarati numerals. They achieved an accuracy of about 82%. In [13], the authors applied the water reservoir technique to segment handwritten text into individual characters. For character segmentation, they first identified the isolated and touching characters in the text. The touching characters of a word were then segmented using structural, topological and water reservoir-based features. They reported an accuracy of 96.7%. Table 1 summarizes these techniques and results.
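The 3 × 3 zoning step mentioned for [6] can be illustrated with a generic zone-centroid feature extractor. The sketch below is a common zoning scheme written under stated assumptions, not the exact feature definition of [6]: the function name is hypothetical, the character image is assumed to be a binary grid in which 1 marks an ink pixel, and empty zones are reported as `None`.

```python
def zone_centroids(image, grid=3):
    """Split a binary image into grid x grid zones and return each zone's ink centroid.

    `image` is a list of rows of 0/1 pixels (1 = ink). Each feature is the mean
    (row, column) of the ink pixels in the zone, or None for an empty zone.
    """
    height, width = len(image), len(image[0])
    features = []
    for zr in range(grid):
        for zc in range(grid):
            # Integer zone boundaries; the zones tile the image without overlap.
            r0, r1 = zr * height // grid, (zr + 1) * height // grid
            c0, c1 = zc * width // grid, (zc + 1) * width // grid
            ink = [(r, c) for r in range(r0, r1) for c in range(c0, c1) if image[r][c]]
            if ink:
                mean_r = sum(r for r, _ in ink) / len(ink)
                mean_c = sum(c for _, c in ink) / len(ink)
                features.append((mean_r, mean_c))
            else:
                features.append(None)
    return features


# Made-up 6 x 6 image with ink in the top-left and bottom-right corners only.
sample = [[0] * 6 for _ in range(6)]
sample[0][0] = 1
sample[5][5] = 1
zones = zone_centroids(sample)
```

The nine resulting features could be written to a comma separated file, one row per character, which is the kind of input the WEKA environment accepts.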
Table 1 Comparison between the techniques and results reported by various authors

Classification and OCR technique | Applied technology | Result | Authors | Year of publication
--- | --- | --- | --- | ---
Feature extraction techniques | 1. Parabola curve fitting, 2. Power curve fitting | Achieved detection accuracy of 82.92% with parabola curve fitting and 82.86% with power curve fitting | Munish Kumar et al. [1] | 2014
Features classification technique | Naive Bayes classifiers | Achieved accuracy about 0.98% | Alfian Abdul Halin et al. [2] | 2013
Features classification technique | Naive Bayes classifiers | To improve the segmentation results, a proper post-processing technique is applied | A. M. Bidgoli et al. [3] | 2010
J48 classification technique | In-node microprocessor method | Achieved accuracy about 100% | Kyle Ying et al. [5] | 2015
Feature extraction techniques | Median filter | Removal of border edge pixel points | N. Prameela et al. [6] | 2017
ResNet | Four-direction classification model and 181-angle classification model | Achieved accuracy about 98.4% with the four-direction classification model | Zetao Huang et al. [7] | 2019
Artificial neural network | Feed forward neural network | Achieved correctness about 99% for numeric digits, 97% for capital letters and 96% for small letters | Shyla Afroge et al. [8] | 2016
Deep neural network | Inception V3 | The deep neural network performed better on bad-quality character images | T. C. Wei et al. [9] | 2018
Neural network | Deep learning | Achieved accuracy about 78.6% | Dhara S. Joshi et al. [10] | 2018
Features extraction technique | Water reservoir method | Achieved accuracy about 93.51% | Munish Kumar et al. [11] | 2014
Neural network | Multilayer feed forward network | Achieved accuracy about 82% | Apurva A. Desai et al. [12] | 2010
Feature extraction | Structural method and water reservoir method | Achieved accuracy about 96.7% | N. Tripathy et al. [13] | 2006
Fig. 3 Proposed architecture of classification technique in WEKA (blocks: input image, preprocessor, classification technique with Naive Bayes and decision table, output image)
3 Proposed Model

WEKA is open-source software that provides tools for data preprocessing, implementations of several machine learning algorithms and visualization tools, which together help in developing machine learning methods and applying them to real-world data. WEKA contains tools for classification, data preprocessing, clustering, association rules, regression and visualization. This work is therefore directed toward two classification approaches for offline handwritten Odia character recognition, using Naive Bayes and decision table in the WEKA environment. After preprocessing, the image is classified using these two techniques, as shown in Fig. 3. In the image acquisition step, 240 images with a fixed size of 60 × 60 are collected from 20 different writers and kept in a folder. WEKA's data preprocessing tools read the images through files with the extension .arff. In the preprocessing step, noise is eliminated and the images are labeled. In the classification step, two ML algorithms are applied: Naive Bayes and decision table.
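The conversion of labeled images into a file that WEKA can read is not detailed in the text. The following is a minimal sketch of one way to do it, assuming each image has already been flattened into a list of pixel values; the function name `to_arff`, the attribute names `p0, p1, ...` and the example class labels are hypothetical and not taken from the paper.

```python
def to_arff(relation, pixel_rows, labels):
    """Serialize flattened images and their class labels as ARFF text.

    pixel_rows: list of equal-length lists of numeric pixel values.
    labels: list of class names (strings), one per image.
    """
    if len(pixel_rows) != len(labels):
        raise ValueError("one label per image is required")
    n_pixels = len(pixel_rows[0])
    classes = sorted(set(labels))

    # Header: one numeric attribute per pixel, then the nominal class.
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute p{i} numeric" for i in range(n_pixels)]
    lines.append("@attribute class {" + ",".join(classes) + "}")
    lines += ["", "@data"]

    # Data section: comma-separated pixel values followed by the label.
    for row, label in zip(pixel_rows, labels):
        if len(row) != n_pixels:
            raise ValueError("all images must have the same number of pixels")
        lines.append(",".join(str(v) for v in row) + "," + label)
    return "\n".join(lines) + "\n"


# Tiny 2 x 2 stand-ins for the 60 x 60 images (which would give 3600 attributes).
demo = to_arff("odia_vowels",
               [[0, 255, 255, 0], [255, 0, 0, 255]],
               ["a", "aa"])
```

Writing `demo` to a file with the `.arff` extension would give something WEKA's Explorer can open; whether the authors used raw pixels or derived features as attributes is not stated in the source.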
4 Simulation and Result Discussion

To verify their effectiveness, the two techniques are compared using the results reported by the WEKA-based classification methods. In this paper, only Swara Barna characters are considered. Initially, 240 samples of isolated offline Odia handwritten characters were collected from 20 different writers, and the machine learning algorithms were applied to classify them. Figure 4 compares the classification results of Naive Bayes and decision table. For Naive Bayes, out of a total of 240 instances, 196 were correctly classified and 44 were incorrectly classified, giving an accuracy of about 81%. For the decision table, 207 instances were correctly classified and 33 were incorrectly classified, giving an accuracy of about 86%. The kappa statistic is used to compare the Naive Bayes and decision table classifiers; it corrects the observed agreement for instances that may have been classified correctly by chance.

Fig. 4 Comparison of accuracy in Naive Bayes and decision table classifier technique

Figure 5 shows the weighted average measures achieved by the Naive Bayes and decision table classifiers. The true positive rate and the false positive rate are the basic measurements used for the plot. The true positive rate is the proportion of instances of a class that the model correctly assigns to that class, whereas the false positive rate is the proportion of instances of other classes that the model incorrectly assigns to that class. The weighted average true positive rate is about 81% for Naive Bayes and 86% for decision table. By class, the weighted averages for the Naive Bayes classifier are 85% precision, 81% recall and 81% F-measure. For the decision table classifier, they are 89% precision, 86% recall and 86% F-measure.

Fig. 5 Weighted average precision of Naive Bayes and decision table classification technique
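The reported accuracies follow directly from the instance counts, and the kappa statistic can be computed from a confusion matrix. The sketch below illustrates both calculations; the helper names are hypothetical, and since the paper does not publish its confusion matrices, the matrix used for kappa is made-up illustrative data, not a result from the study.

```python
def accuracy(correct, incorrect):
    """Fraction of correctly classified instances."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no instances")
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: actual, cols: predicted)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / total ** 2
    return (observed - expected) / (1 - expected)


# Counts reported in the text.
nb_accuracy = accuracy(196, 44)   # Naive Bayes, 196 of 240 correct
dt_accuracy = accuracy(207, 33)   # decision table, 207 of 240 correct

# Hypothetical two-class confusion matrix, for illustration only.
example_kappa = cohen_kappa([[20, 5], [10, 15]])
```

Here 196/240 is roughly 0.817 and 207/240 is roughly 0.863, which the paper reports as about 81% and 86%.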
5 Conclusion and Future Study

This paper has demonstrated the effectiveness of classification techniques in WEKA for recognition of offline handwritten Odia characters. Prepared data sets were used: around 240 samples of isolated offline handwritten Odia characters collected from 20 different writers. Comparing the results of the two techniques, the decision table achieved the higher accuracy, about 86%, against about 81% for Naive Bayes. Future work can extend the study to other classification techniques for Odia matra using a larger amount of data.
References
1. Kumar, M., Jindal, M.K., Sharma, R.K.: WEKA-based classification techniques for offline handwritten Gurmukhi character recognition. In: Second International Conference on Soft Computing. Springer, India (2014)
2. Halin, A., Sharef, M., Nurfadhlina, A., Jantan, A.H., Abdullah, L.N.: License plate localization using a Naive Bayes classifier. In: IEEE International Conference on Signal and Image Processing Application (2013)
3. Bidgoli, A.M., Boraghi, M.: A language independent text segmentation technique based on Naive Bayes classifier. IEEE (2010)
4. Lavanya, K., Shaurya, B., Tank, P., Jain, S.: Handwritten digit recognition using Hoeffding tree, decision tree and random forests: a comparative approach. In: International Conference on Computational Intelligence in Data Science (ICCIDS). IEEE (2017)
5. Ying, K., Ameri, A., Trivedi, A., Ravindra, D.P., Mozumdar, D.M.: Decision tree-based machine learning algorithm for in-node vehicle classification. IEEE (2015)
6. Prameela, N., Anjusha, P., Karthik, R.: Off-line Telugu handwritten characters recognition using OCR. IEEE, ICECA (2017)
7. Huang, Z., Zhang, Q.: Skew correction of handwritten Chinese character based on ResNet. In: IEEE 2019 International Conference on HPBD&IS (2019)
8. Afroge, S., Ahmed, B., Mahmud, F.: Optical character recognition using back propagation neural network. IEEE, ICECTE (2016)
9. Wei, T.C., Sheikh, U.U., Rahman, A.A.-H.A.: Improved optical character recognition with deep neural network. IEEE (2018)
10. Joshi, D.S., Risodkar, Y.R.: Deep learning based Gujarati handwritten character recognition. IEEE, ICACCT (2018)
11. Kumar, M., Jindal, M.K., Sharma, R.K.: Segmentation of isolated and touching characters in offline handwritten Gurumukhi script recognition. J. Inf. Technol. Comput. Sci. (2014)
12. Desai, A.A.: Gujarati handwritten numeral optical character reorganization through neural network. Pattern Recognition. Elsevier (2010)
13. Tripathy, N., Pal, U.: Handwriting segmentation of unconstrained Oriya text. CVPRU ISI 31(6):755–769 (2006)
Neural Network-Based Receiver in MIMO-OFDM System for Multiuser Detection in UWA Communication

Md Rizwan Khan and Bikramaditya Das
Abstract High quality of service and a high transmission rate are demanded of future underwater acoustic (UWA) communication, and both can be achieved by implementing a multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) system. However, the MIMO-OFDM system suffers from multiaccess interference (MAI) at the receiver due to interference from co-channel users. Therefore, a multiuser detection (MUD) technique is needed at the receiver of the MIMO-OFDM system to suppress the effect of MAI. The novelty of this research is that MUD is achieved using a multilayer perceptron (MLP)-based neural network (NN) detector at the receiver of the MIMO-OFDM system in UWA communication. The MLPNN detector achieves MUD through the adaptation of the NN weights and bias weights by the backpropagation (BP) algorithm. The transceiver model of the underwater MIMO-OFDM system is implemented using the BELLHOP simulation system. The bit error rate (BER) performance of the MLPNN detector is analysed and compared with that of existing detectors (matched filter (MF) detector, decorrelating detector (DD), minimum mean square error (MMSE) detector, and multistage conventional parallel interference cancellation (PIC) detector) in the UWA network. The proposed MLPNN detector outperforms the existing detectors in the BER analysis.

Keywords Underwater acoustic (UWA) · Multiple-input multiple-output (MIMO) · Orthogonal frequency division multiplexing (OFDM) · Multiuser detection (MUD) · Multiaccess interference (MAI) · Multilayer perceptron neural network (MLPNN)
Md R. Khan · B. Das (B) Department of Electronics and Telecommunication Engineering, VSS University of Technology, Burla, Odisha, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_43
1 Introduction

High quality of service and a high-speed transmission rate underwater are demanded by many military and civilian applications, data acquisition systems, the exchange of data for environmental monitoring in underwater wireless sensor networks and underwater security surveillance systems [1]. However, limited bandwidth due to the low frequency of propagation, intersymbol interference (ISI) due to large multipath spread [2], and MAI due to interference from co-channel users in the multiuser environment are challenging problems in UWA communication. MIMO-OFDM is considered to fulfil the ever-increasing demand for bandwidth, efficiency, and performance in UWA communication. OFDM, when implemented with the MIMO system, diminishes the effect of ISI and converts the frequency-selective channel into a flat fading channel in an underwater environment. MUD is needed at the receiver of the MIMO-OFDM system to enhance the system performance. An optimal error probability performance is obtained in [3] using a maximum-likelihood (ML) detector in the multiuser environment, but its complexity increases exponentially with the number of users. Therefore, to reduce computational complexity and improve performance, research has turned to suboptimal multiuser detectors. In this research, MUD is achieved at the receiver of the MIMO-OFDM system using an MLPNN detector in which the network weights and bias weights are updated using the BP algorithm.

The rest of this paper is organized as follows. Section 2 gives a brief analysis of suboptimal multiuser detectors for MUD. Section 3 presents the problem formulation along with the contribution. Section 4 presents the MLPNN-based MUD. Section 5 discusses the results to illustrate the effectiveness of the MLPNN detector over conventional detection schemes. Concluding remarks are drawn in Sect. 6.
2 Background Research

This section reviews the suboptimal multiuser detectors that have been used for MUD. Suboptimal detectors are categorized into linear and nonlinear multiuser detectors. In a nonlinear multiuser detector, an estimate of the interference is generated and then removed from the received signal before detection. Parallel interference cancellation (PIC) [4], successive interference cancellation (SIC) [5, 6], and recursive successive interference cancellation (RSIC) [7] are suboptimal multiuser detectors implemented for MUD using a DS-CDMA system. Further, computationally efficient neural networks have been used as nonlinear suboptimal multiuser detectors in wireless communication. An ANN-based detector (ANND) trained using the LM algorithm is proposed in [8]; it approaches the ML detector with lower complexity for symbol detection in a MIMO-OFDM system with low-resolution analog-to-digital converters (LRADCs). A convolutional neural network-based likelihood ascent search (CNNLAS) detection algorithm is proposed in [9] for better BER performance with lower computational complexity per symbol in massive MIMO systems. In [10], traditional least-squares (LS) estimation and blanking are combined with an MLP to provide a better symbol error rate (SER) for symbol detection in UWA communication. A deep neural network (DNN) is used in [11] to recover the transmitted symbols directly after sufficient training, and the deep learning-based receiver provides a consistent improvement in performance compared to the traditional UWA OFDM receiver. The use of NNs for MUD in the MIMO-OFDM system in UWA communication is still limited. Therefore, it is promising to explore the BER performance of an NN for the MIMO-OFDM system in the UWA network.
3 Problem Formulation and Contribution of This Study

Consider an underwater MIMO-OFDM system shared by S users, indexed by $s \in \{1, 2, 3, \ldots, S\}$, and equipped with S transmitting hydrophone antennas and L receiving hydrophone antennas, as shown in Figs. 1 and 2. The frequency band is divided into M subcarriers, indexed by $m \in \{1, 2, 3, \ldots, M\}$. The channel matrix, which is the MIMO channel impulse response for the S users, is represented as

\[
H_{s,m} =
\begin{bmatrix}
h^{s,m}_{1,1} & h^{s,m}_{1,2} & \cdots & h^{s,m}_{1,S} \\
h^{s,m}_{2,1} & h^{s,m}_{2,2} & \cdots & h^{s,m}_{2,S} \\
\vdots & \vdots & \ddots & \vdots \\
h^{s,m}_{L,1} & h^{s,m}_{L,2} & \cdots & h^{s,m}_{L,S}
\end{bmatrix}
\tag{1}
\]

where $h^{s,m}_{L,s}$ represents the channel gain from the sth transmitting antenna to the Lth receiving antenna on subcarrier m.
Fig. 1 Block diagram of transmitter structure in underwater using MIMO-OFDM system
Fig. 2 Block diagram of receiver structure in underwater using MIMO-OFDM system
The underwater channel response is considered as a MIMO channel impulse response, which is decomposed into a number of parallel spatial channels using singular value decomposition (SVD) [12]. The channel matrix H is therefore decomposed using SVD as in [13] and is represented as

\[
H = U D V^{H} = \sum_{q=1}^{\operatorname{rank}(H)} u_q d_q v_q^{H}
\tag{2}
\]

In Eq. (2), $u_q$ and $v_q$ represent the left and right singular vectors, respectively, and the columns of U and V are orthonormal. D is the diagonal matrix containing the singular values $d_q$, which are nonnegative and arranged in descending order of magnitude. The superscript H denotes the Hermitian (conjugate) transpose. The number of nonzero singular values determines the rank of the matrix. The received signal on subcarrier m for an underwater MIMO-OFDM system is represented as

\[
r_m = \sum_{s=1}^{S} u_{s,m} d_{s,m} \sqrt{p_{s,m}}\, x_{s,m} + \eta_m = U_m D_m \sqrt{P_m}\, x_m + \eta_m
\tag{3}
\]

where $r_m = [r_{1,m}, r_{2,m}, \ldots, r_{L,m}]^{T}$ defines the signals received from the L antennas on subcarrier m, and $\eta_m$ is additive white Gaussian noise (AWGN) with variance $\sigma^2$. $x_m = [x_{1,m}, x_{2,m}, \ldots, x_{S,m}]^{T}$ represents the symbol sequences after modulation. $U_m = [u_{1,m}, u_{2,m}, \ldots, u_{S,m}]$ collects the left singular vectors from the SVD. $D_m = \operatorname{diag}(d_{1,m}, d_{2,m}, \ldots, d_{S,m})$ contains the nonnegative singular values. $\sqrt{P_m} = \operatorname{diag}(\sqrt{p_{1,m}}, \sqrt{p_{2,m}}, \ldots, \sqrt{p_{S,m}})$ represents the transmit power levels.

MUD at the receiver has been performed using the MF detector and linear MUD techniques such as the DD and the MMSE detector. The outputs of the MF detector, DD, and MMSE detector are mathematically expressed as:

\[
y_m^{MF} = U_m^{H} r_m
\tag{4}
\]

\[
y_m^{DD} = R_m^{-1} y_m^{MF}
\tag{5}
\]

\[
y_m^{MMSE} = \left( R_m + \sigma^2 \left( D_m \sqrt{P_m} \right)^{-2} \right)^{-1} y_m^{MF}
\tag{6}
\]

where $R_m = U_m^{H} U_m$ and $y_m^{MF}$ is the output of the MF detector. The output of a conventional MF detector consists of the desired signal and MAI from co-channel users. The DD minimizes the effect of MAI but enhances the noise term. Hence, the MMSE detector is used to suppress both MAI and noise. To suppress MAI further, an MLPNN detector is used in this research, as shown in Fig. 2.

The contributions of the research are summarized as follows:
• The MIMO-OFDM system is implemented in UWA communication using the BELLHOP simulation system, in which the underwater channel response is decomposed using singular value decomposition (SVD).
• An MLPNN detector is used at the receiver of the MIMO-OFDM system for MUD through successful MAI cancellation. In the proposed scheme, the network weights and bias weights of the MLPNN detector are updated using the BP algorithm.
• The BER performance of the MLPNN detector is evaluated for the MIMO-OFDM system and compared with existing linear MUD techniques.
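The three linear detectors of Eqs. (4) to (6) can be illustrated numerically. The following is a minimal NumPy sketch for a single subcarrier, not the authors' BELLHOP/MATLAB implementation; the function name `linear_detectors`, the random unit-norm columns standing in for $U_m$, and the chosen singular values and powers are all assumptions made for illustration.

```python
import numpy as np


def linear_detectors(U, d, p, r, sigma2):
    """Return the MF, DD and MMSE outputs of Eqs. (4)-(6) for one subcarrier."""
    y_mf = U.conj().T @ r                      # Eq. (4): matched filter U^H r
    R = U.conj().T @ U                         # R_m = U_m^H U_m
    y_dd = np.linalg.solve(R, y_mf)            # Eq. (5): R^{-1} y_MF
    a = d * np.sqrt(p)                         # diagonal of D_m sqrt(P_m)
    reg = sigma2 * np.diag(1.0 / a ** 2)       # sigma^2 (D_m sqrt(P_m))^{-2}
    y_mmse = np.linalg.solve(R + reg, y_mf)    # Eq. (6)
    return y_mf, y_dd, y_mmse


rng = np.random.default_rng(0)
L, S = 8, 4                                    # hypothetical antenna/user counts

# Assumed stand-in for U_m: one unit-norm complex column per user.
U = rng.standard_normal((L, S)) + 1j * rng.standard_normal((L, S))
U /= np.linalg.norm(U, axis=0)

d = np.array([2.0, 1.5, 1.0, 0.5])             # assumed singular values
p = np.ones(S)                                 # assumed unit transmit powers
x = rng.choice([-1.0, 1.0], size=S)            # BPSK symbols

# Noise-free received vector following Eq. (3).
r = U @ (d * np.sqrt(p) * x)

y_mf, y_dd, y_mmse = linear_detectors(U, d, p, r, sigma2=0.0)
x_hat = np.sign(y_dd.real)                     # hard BPSK decisions from the DD
```

In this noise-free case the DD output equals $D_m \sqrt{P_m} x_m$, so the hard decisions recover the transmitted symbols, while the MF output still contains the cross-terms of $R_m$, which is the MAI the text describes. With `sigma2=0.0` the MMSE output coincides with the DD output.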
4 MLPNN Based MUD

In the MLPNN detector, the BP algorithm is used to train the feed forward MLP network. The MLP is trained by supervised learning with the backpropagation algorithm. The learning algorithm adjusts the synaptic weights so that the network output comes close to the desired (target) outputs. Figure 3 shows the structure of the MLPNN detector, which is implemented at the receiver of the underwater MIMO-OFDM system for MUD. The network parameters are defined as:

$r_{L,m}$: the Lth data received from the Lth antenna at the input layer
$w^{1}_{ij}$: weight between the ith input node and the jth hidden node
$b^{1}_{L,m}$: bias weight of the hidden neurons
$w^{2}_{jk}$: weight between the jth hidden node and the kth output node
$b^{2}_{L,m}$: bias weight of the output node
$f(\cdot)$: activation function of the MLP network

Fig. 3 MUD using MLPNN detector at the receiver of the MIMO-OFDM system

The resultant signal from each hidden node and output node is represented as:

\[
z_{L,m} = f\left( \sum_{L=1}^{L} w^{1}_{ij}\, r_{L,m} + b^{1}_{L,m} \right)
\tag{7}
\]

\[
x_{L,m} = f\left( \sum_{L=1}^{L} w^{2}_{jk}\, z_{L,m} + b^{2}_{L,m} \right)
\tag{8}
\]

where $i, j, k \in \{1, 2, 3, \ldots, L\}$ and $m \in \{1, 2, 3, \ldots, M\}$. The rules to change the network weights are given as:
\[
\Delta w^{2}_{jk} = \sigma\, \delta_{k,m}\, z_{L,m}
\tag{9}
\]

\[
\Delta w^{1}_{ij} = \sigma\, \delta_{j,m}\, r_{L,m}
\tag{10}
\]
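$\sigma$ is the learning rate. The rules to change the network bias weights at the hidden nodes and the output nodes are given as:

\[
\Delta b^{1}_{L,m} = \sigma\, \delta_{j,m}
\tag{11}
\]

\[
\Delta b^{2}_{L,m} = \sigma\, \delta_{k,m}
\tag{12}
\]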
The error gradients at the output of the network and at the hidden neurons are represented, respectively, as:

\[
\delta_{k,m} = (t_{k,m} - x_{L,m})\, f'(x_{L,m})
\tag{13}
\]

\[
\delta_{j,m} = \sum_{k=1}^{L} \delta_{k,m}\, w^{2}_{jk}\, f'(z_{L,m})
\tag{14}
\]

\[
e_{L,m} = (t_{k,m} - x_{L,m})
\tag{15}
\]
In each training step, the weights of the MLPNN detector are updated using the BP algorithm. The update is mathematically given as

\[
w^{1}_{ij} = w^{1}_{ij} + \Delta w^{1}_{ij}
\tag{16}
\]

\[
w^{2}_{jk} = w^{2}_{jk} + \Delta w^{2}_{jk}
\tag{17}
\]

\[
b^{1}_{L,m} = b^{1}_{L,m} + \Delta b^{1}_{L,m}
\tag{18}
\]

\[
b^{2}_{L,m} = b^{2}_{L,m} + \Delta b^{2}_{L,m}
\tag{19}
\]
The activation function is the bipolar sigmoid, represented as:

\[
f(c) = \frac{1 - e^{-c}}{1 + e^{-c}}
\tag{20}
\]
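As a small illustration of Eq. (20), the sketch below implements the bipolar sigmoid and its derivative, which is needed for the error gradients of Eqs. (13) and (14). The derivative identity $f'(c) = \tfrac{1}{2}\,(1 - f(c)^2)$ is a standard property of this function rather than something stated in the paper, and the function names are hypothetical.

```python
import math


def bipolar_sigmoid(c):
    """Bipolar sigmoid of Eq. (20); output lies in (-1, 1)."""
    return (1.0 - math.exp(-c)) / (1.0 + math.exp(-c))


def bipolar_sigmoid_derivative(c):
    """Derivative f'(c) = 0.5 * (1 - f(c)^2)."""
    y = bipolar_sigmoid(c)
    return 0.5 * (1.0 - y * y)
```

The function equals `math.tanh(c / 2)`, which is a numerically safer way to evaluate it for large negative `c`, where `math.exp(-c)` would overflow.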
Algorithm 1 Algorithm for MUD using BP algorithm in MLPNN detector
Step 1: Initialize the weights and learning rate
Step 2: Compute the resultant signal from each hidden node and output node using (7) and (8)
Step 3: Compute the error at the output of the network using (15)
Step 4: Compute the error gradient at the output of the network and at the hidden neurons using (13) and (14)
Step 5: Solve for the network weight and bias weight changes using (9), (10), (11), and (12)
Step 6: Update the values of the network weights and bias weights using (16), (17), (18), and (19)
Step 7: Compute the error at the output of the network using (15). If the error is smaller than the one computed in Step 3, then reduce the learning rate; else increase the learning rate. Go to Step 4.
Step 8: End
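Steps 1 to 6 of Algorithm 1 can be sketched as a single-hidden-layer network trained sample by sample. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the class name `MLPDetector`, the weight initialization range, the fixed learning rate, and the synthetic training data (BPSK symbols mixed by an assumed correlation matrix plus noise) are all hypothetical, the derivative in Eqs. (13) and (14) is evaluated through the node outputs, and the learning-rate adaptation of Step 7 is omitted.

```python
import numpy as np


def f(c):
    """Bipolar sigmoid of Eq. (20), written as tanh(c / 2)."""
    return np.tanh(c / 2.0)


def f_prime(y):
    """Derivative of the bipolar sigmoid expressed through its output y."""
    return 0.5 * (1.0 - y ** 2)


class MLPDetector:
    def __init__(self, n, rng, lr=0.05):
        # Step 1: initialize weights and learning rate (assumed ranges).
        self.W1 = rng.uniform(-0.5, 0.5, (n, n))
        self.b1 = np.zeros(n)
        self.W2 = rng.uniform(-0.5, 0.5, (n, n))
        self.b2 = np.zeros(n)
        self.lr = lr

    def forward(self, r):
        z = f(self.W1 @ r + self.b1)            # Eq. (7): hidden outputs
        x = f(self.W2 @ z + self.b2)            # Eq. (8): network outputs
        return z, x

    def train_step(self, r, t):
        z, x = self.forward(r)                  # Step 2
        e = t - x                               # Step 3, Eq. (15)
        delta_k = e * f_prime(x)                # Step 4, Eq. (13)
        delta_j = (self.W2.T @ delta_k) * f_prime(z)   # Eq. (14)
        # Steps 5 and 6: Eqs. (9)-(12) applied through Eqs. (16)-(19).
        self.W2 += self.lr * np.outer(delta_k, z)
        self.W1 += self.lr * np.outer(delta_j, r)
        self.b2 += self.lr * delta_k
        self.b1 += self.lr * delta_j

    def mse(self, inputs, targets):
        outputs = np.array([self.forward(r)[1] for r in inputs])
        return float(np.mean((targets - outputs) ** 2))


rng = np.random.default_rng(1)
S, n_train = 4, 300                              # 300 training bits per user
R = np.full((S, S), 0.3) + 0.7 * np.eye(S)       # assumed user correlation
targets = rng.choice([-1.0, 1.0], size=(n_train, S))
inputs = targets @ R + 0.1 * rng.standard_normal((n_train, S))

net = MLPDetector(S, rng)
mse_before = net.mse(inputs, targets)
for _ in range(20):                              # assumed number of epochs
    for r, t in zip(inputs, targets):
        net.train_step(r, t)
mse_after = net.mse(inputs, targets)
```

The input, hidden and output layers all have S nodes, matching the configuration listed in the simulation parameters. Comparing `mse_before` and `mse_after` shows whether training reduced the error on this synthetic data; it says nothing about BER in a real UWA channel.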
5 Result Discussion

5.1 Simulation Parameters

An underwater network is set up using the BELLHOP simulation system, and the data analysis is performed using MATLAB. The overall simulation parameters are summarized in Table 1.
Table 1 Simulation parameters

Parameter | Value
--- | ---
No. of subcarriers (M) | 64
No. of bits per OFDM symbol interval | 64
Modulation type | BPSK
Centre frequency | 32 kHz
Channel type | AWGN channel
Transmission distance | 200 m
Depth from water surface | 500 m
Subcarrier spacing | 37.2 Hz
Number of input elements of NN | S (number of users)
Number of output nodes of NN | S (number of users)
Number of hidden layers in MLP-BP model | 1
Number of neurons in hidden layer | S (number of users)
Activation function of NN | Bipolar sigmoid
Training algorithm of NN | BP algorithm
Length of training | 300 bits per user

5.2 Simulation Result

Simulations are carried out to demonstrate the performance of the MLPNN detector implemented at the receiver of the MIMO-OFDM system in UWA communication. In the first set of simulations, the BER performance of the MLPNN detector is evaluated as a function of SNR (Fig. 4); in the second set, it is evaluated as a function of the number of users (Fig. 5). In Fig. 4, the BER performance as a function of SNR is evaluated for eight users in an AWGN channel, with the assumption that the channel is estimated perfectly. Figure 4 shows that the MLPNN detector gives better BER performance for MUD than the MF detector, DD, MMSE detector, and the existing multistage conventional PIC detector [7]. If the target BER is set at $10^{-3}$, the MLPNN detector achieves it at an SNR of 25 dB, whereas the DD and MMSE detector achieve it at SNRs of 35 and 32 dB, respectively. The MLPNN detector performs better than the other conventional detectors and the existing multistage conventional PIC detector in the BER analysis because it reduces the error gradient through proper weight adaptation in the BP algorithm.

Fig. 4 BER performance as a function of SNR

Figure 5 gives the BER performance as a function of the number of users at a channel SNR of 10 dB. The BER increases as the number of users increases, because the MAI at the receiver of the MIMO-OFDM system in the UWA network grows with the number of users. Even though the BER increases with the number of users, the MLPNN detector still outperforms the other detectors.

Fig. 5 BER performance as a function of number of users
6 Conclusion

This paper investigated an MLPNN detector at the receiver of the MIMO-OFDM system to obtain the desired signal from the noisy raw data received in an underwater environment. MUD is achieved using the MLPNN detector at the receiver of the MIMO-OFDM system in the UWA network, with the detector's weights updated by the BP algorithm. The BER performance of the MLPNN detector as a function of SNR and of the number of users is obtained and compared with the existing MF detector, DD, MMSE detector, and multistage conventional PIC detector.
References
1. Qu, F., Wang, Z., Yang, L., Wu, Z.: A journey toward modeling and resolving doppler in underwater acoustic communications. IEEE Commun. Mag. 54(2), 49–55 (2016)
2. Khan, M.R., Das, B., Pati, B.B.: Channel estimation strategies for underwater acoustic (UWA) communication: an overview. J. Franklin Inst. 357(11), 7229–7265 (2020)
3. Verdú, S.: Minimum probability of error for asynchronous Gaussian multiple-access channels. IEEE Trans. Inform. Theor. 32(1), 85–96 (1986)
4. Yan, T.C.: Spatially multiplexed CDMA multiuser underwater acoustic communications. IEEE J. Ocean Eng. 41(1), 217–231 (2016)
5. Cho, S.E., Song, H.C., Hodgkiss, W.S.: Successive interference cancellation for underwater acoustic communications. IEEE J. Oceanic Eng. 36(4), 490–501 (2011)
6. Ma, L., Zhou, S., Qiao, G., Liu, S., Zhou, F.: Superposition coding for downlink underwater acoustic OFDM. IEEE J. Ocean. Eng. 42(1), 175–187 (2017)
7. Yeo, H.K., Sharif, B.S., Adams, A.E., Hinton, O.R.: Implementation of multiuser detection strategies for coherent underwater acoustic communication. IEEE J. Oceanic Eng. 27, 17–27 (2002)
8. Shabnam, R., Sofiene, A.: ANN-based detection in MIMO-OFDM systems with low-resolution ADCs. Signal Processing (eess.SP). arXiv: 2001.11643 (2020)
9. Li, L., Hou, H., Meng, W.: Convolutional neural network based detection algorithm for uplink multiuser massive MIMO systems. IEEE Access 8, 64250–64265 (2020)
10. Chen, Z., He, Z., Niu, K., et al.: Neural network-based symbol detection in high-speed OFDM underwater acoustic communication. In: 2018 10th International Conference on Wireless Communications and Signal Processing (WCSP), pp. 1–5 (2018)
11. Zhang, Y., Li, J., Zakharov, Y., Li, X., Li, J.: Deep learning based underwater acoustic OFDM communications. Appl. Acoust. 154, 53–58 (2019)
12. Raleigh, G.G., Cioffi, J.M.: Spatio-temporal coding for wireless communications. IEEE Trans. Commun. 46, 357–366 (1998)
13. Zhang, Y.J., Letaief, K.B.: An efficient resource allocation scheme for spatial multiuser access in MIMO/OFDM systems. IEEE Trans. Commun. 53, 107–116 (2005)
Employing Deep Neural Network for Early Prediction of Students’ Performance

Sachin Garg, Abdul Aleem, and Manoj Madhava Gore
Abstract Educational institutions aim to deliver quality education and motivate students to perform better in academic examinations. Early prediction of students’ performance helps to identify low-performing students who may fail in exams, allowing institutions to help such students perform better. Traditional machine learning methods utilize the academic attributes of students to predict their academic performance, and the accuracy of this prediction is crucial. This article employs a deep neural network (DNN), a contemporary deep learning technique, to predict students’ academic performance. The dataset utilized for prediction is prepared from the academic attributes of students. The model is compared with prominent machine learning methods, viz. support vector machine (SVM), naïve Bayes (NB), k-nearest neighbors (kNN), decision trees (DT), random forest (RF), and artificial neural networks (ANN). The results show that the proposed model obtains 98% accuracy, which is better than the accuracy of the other compared methods.

Keywords Machine learning · Deep neural network (DNN) · Educational data mining (EDM)
Sachin Garg: This work was done as a part of M. Tech. Thesis [1] during his stay at MNNIT Allahabad as a Master’s student.
S. Garg · A. Aleem (B) · M. M. Gore CSE Department, MNNIT Allahabad, Prayagraj, India e-mail: [email protected] S. Garg e-mail: [email protected] M. M. Gore e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_44

1 Introduction

Students’ academic performance plays a significant role in shaping their careers and affects their alma mater’s reputation. Predicting the academic performance of students with accuracy is valuable to students, teachers, and institutions, and the outcome of the prediction could transform the future progress of students. Accurate prediction of students’ performance is significant and challenging because it requires a large amount of educational data, which is usually not readily available. An essential step is to explore the educational data to determine the significant parameters that affect students’ performance [2]. The explored information helps academic planners make policies that improve students’ performance and decrease the failure rate. Early prediction of students’ performance before final examinations helps institutions take precautionary steps for students who are predicted to be low performers. Teachers may assist such students in improving their learning and support them to achieve better grades [3].

Educational data mining (EDM) is an important research area that applies data mining techniques to educational data [4]. EDM plays an essential role in gathering and pre-processing students’ data and in developing new models to explore and analyze the data. The application of data mining algorithms to educational data yields new information. Machine learning is an emerging area in EDM that processes educational data through various techniques. In the educational domain, machine learning techniques are used to assess and forecast students’ performance in examinations. The predicted performance helps teachers pay extra attention to students who are predicted to perform poorly, and extra classes could be planned for students predicted to perform low or moderately. Classification, a prominent machine learning technique, arranges data into categories or classes according to the properties of the data elements. A classification model, also known as a classifier, predicts the class or category of the data elements.
In the educational environment, teachers classify their students according to their knowledge. The classes are labeled as unsatisfactory, satisfactory, good, very good, and excellent. EDM literature has reported numerous machine learning techniques for classification, such as kNN, ANN, SVM, NB, DT, RF, logistic regression, etc. DNN is one of the emerging techniques in machine learning and has rarely been employed to predict students’ performance. DNN is a contemporary technique that can be used to process educational data. A DNN belongs to the family of neural networks and consists of a stack of layers. The first layer, the input layer, feeds data into the DNN. The output of the DNN is received through the output layer, which is the last layer. Between these two layers, the DNN model has an arbitrary number of hidden layers. Each layer is formed of neurons that resemble the neurons of the human brain. A consequence of arranging neurons in layers is that a DNN can process data hierarchically. Equation 1 shows the output Y of a neuron with activation function f, independent input variables x1, x2, x3, … and weights w1, w2, w3, ….

\[
Y = f\left( \sum_{i=0}^{N} w_i x_i \right)
\tag{1}
\]
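Equation 1 can be illustrated with a few lines of code. The sketch below computes the output of a single neuron for a given activation function; the function names, the example weights and inputs, and the choice of the logistic function as activation are assumptions for illustration, since the text does not specify them.

```python
import math


def neuron_output(weights, inputs, activation):
    """Apply the activation to the weighted sum of inputs, as in Eq. 1."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)


def logistic(c):
    """Logistic (sigmoid) activation, one common choice for f."""
    return 1.0 / (1.0 + math.exp(-c))


# Identity activation makes the weighted sum itself visible: 1*3 + 2*4.
y_linear = neuron_output([1.0, 2.0], [3.0, 4.0], lambda c: c)

# Same rule with a logistic activation on three hypothetical inputs.
y_sigmoid = neuron_output([0.5, -0.25, 0.1], [1.0, 2.0, 3.0], logistic)
```

A layer of a DNN applies this rule once per neuron, and stacking layers feeds each layer's outputs to the next as inputs.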
This article employs a DNN in a prediction model that predicts students’ grades for the subject “data structure.” Students at MNNIT Allahabad study “data structure” in their third semester of engineering. It is one of the most widely studied subjects across many branches of engineering, so a fair number of samples is obtained for the experiments. The grades are predicted for the final examination of the subject, which is conducted at the end of the semester. The article also compares the accuracy of prediction with traditional machine learning methods, such as kNN, ANN, SVM, NB, RF, and DT. The research objective of this article is to propose a DNN-based classifier that is more accurate than traditional classifiers based on other machine learning techniques. Moreover, early prediction of students’ grades helps teachers pay more attention to low-performing students.

This article is organized in five sections. Section 1 introduced the problem to be addressed and the probable solutions. Section 2 discusses the relevant research on grade prediction in the EDM literature. Section 3 describes the proposed work for the prediction of students’ grades. Section 4 elaborates on the experimental outcomes and provides a comparative analysis with other traditional machine learning methods. Lastly, Sect. 5 presents the conclusion and future work.
2 Related Work
Deep learning has found many applications in the medical field [5, 6]. However, its usage in EDM literature for the prediction of students’ performance has been limited. The goal that matters is the accuracy of the prediction. Grade prediction with classifiers remains an active subject of study [7, 8]. Neural networks have been employed for prediction in quite a few articles [9–12]. However, the highest accuracy achieved is 84%, and the developed neural network model was computationally slow compared to other classification models. Amrieh et al. [13] employed ANN, NB, and DT for predicting students’ performance. The features utilized for prediction were based on demography, academic background, parent participation in the learning process, and student’s behavior. The highest accuracy achieved was 79%, utilizing all the features with the ANN; accuracy dropped to 57% when the behavioral features were excluded. Recent advancements in neural networks have helped to achieve better accuracy. Ramesh et al. [14] employed a multilayer perceptron neural network to predict students’ grades with an accuracy of 72.38%. Generalized regression neural networks [15] predicted students’ academic performance with 75% accuracy. Mondal and Mukherjee [16] predicted students’ academic performance using a recurrent neural network model with an accuracy of 85%. Another recurrent neural network model [17] improved the accuracy to 90%. A multilayer ANN architecture [18] predicted students’ performance with an accuracy of 96.2%. Deep learning has also been utilized for predicting the grades of students. Fok et al. [19] employed deep learning with TensorFlow to predict
S. Garg et al.
students’ performance. The authors also forecasted the future university program most suitable for students. The proposed model examined both traditional academic performance and non-academic performance for prediction and achieved a highest accuracy of 91%. Bendangnuksung [20] predicted students’ performance by employing a deep neural network. The authors utilized Kalboard 360, a learning management system, to prepare an educational dataset containing academic data of 500 students. The proposed model achieved 84.3% accuracy, which is low considering that the prediction technique applied is a DNN. Hence, a DNN with a better configuration is needed for better results. The related work established that neural networks are the most reliable technique for predicting students’ performance. Most of the related work discussed had an accuracy of less than 90%; only one research work achieved an accuracy of 96.2%. There is therefore a need to predict students’ performance with high accuracy using a neural network. This article utilizes a DNN with a better configuration to increase accuracy and, thus, reliability.
3 Proposed Work
This article proposes a DNN model with a better configuration for improved prediction accuracy. The proposed model predicts the grades of third-semester engineering students for the subject “data structure.” The process diagram of the proposal is shown in Fig. 1. The procedure comprises six steps, namely data collection, data pre-processing, configuring the DNN model, training the proposed DNN model, testing the DNN model, and evaluation of results. This section details the dataset preparation, pre-processing methods, configuration of the DNN model, and the experimental methodology. The next section discusses the results.
Fig. 1 System flowchart for the proposed DNN model
3.1 Dataset Preparation and Pre-processing
The academic data of students studying “data structure” was placed in a dataset named “student.” The dataset mainly captured students’ performance, which is computed through the assessment of various academic activities. The academic performance parameters include Assignment_Marks, Attendance_Score, Internal_Assessment_Marks, Viva_Marks, Quiz_Marks, Mid_Semester_Marks, and Final_Grade obtained by students. Table 1 lists all the attributes in the “student” dataset along with their domains. The data pre-processing step cleans the data, handles missing values, and computes the attribute Final_Grade from the students’ obtained marks. Table 2 shows the computation of grades corresponding to the marks obtained by students. In total, 956 students in 12 batches studied “data structures” at MNNIT Allahabad, India, during the odd semesters of the academic years 2018–19 and 2019–20. Four different faculty members taught the subject. All the attributes except Quiz_Marks in the “student” dataset were collected directly from the faculty teaching “data structures.” The attribute Quiz_Marks was computed through the confidence-based evaluation of MCQ-based quizzes using C-BEAM [21]. Thirty MCQ quizzes were organized for the subject, each containing 20 questions of three marks each. The evaluation scheme C-BEAM is a negative marking scheme that awards or deducts marks based on the correctness of responses

Table 1 Attributes of “student” dataset
Attribute name               Domain
Student_Id                   Alphanumeric values
Degree                       (Bachelors, Masters, Diploma)
Assignment_Marks             Real values
Attendance_Score             Real values
Internal_Assessment_Marks    Real values
Viva_Marks                   Real values
Quiz_Marks                   Real values
Mid_Semester_Marks           Real values
Final_Grade                  (Grade classes)

Table 2 Classification of students’ performance
Marks range    Classes    Class label
1 − Z , yt , ∀t ∈ [1, . . . , M]
(9)
xtr s ⇐ 1 + Z .(1 − yt ), ∀t ∈ [1, . . . , M]
(10)
t=1 r =1 L N r =1 s=1 L N r =1 s=1
$$x_{trs} \in \{0, 1\}, \quad y_t \in \{0, 1\} \qquad (11)$$
The cost measure can be optimized by Eq. (12), subject to the same constraints given in Eqs. (4)–(6).
$$\mathrm{OF}_2 : \sum_{t=1}^{M} \sum_{r=1}^{L} C_{tr} \sum_{s=1}^{N} x_{trs} \qquad (12)$$
$C_{tr}$ is defined as the price per hour of running a virtual machine with instance type $I_r$ (large, medium or small) on physical machine $S_t$; the total running cost of a physical server is the sum over all the instances running on that server. The MOP can be converted to an SOP by defining a new function OF3 that assigns two weights, w1 and w2, to OF1 and OF2 and optimizes both functions simultaneously.
$$\mathrm{OF}_3 = w_1 * \mathrm{OF}_1 + w_2 * \mathrm{OF}_2; \quad w_1 + w_2 = 1; \quad 0 \le w_1, w_2 \le 1 \qquad (13)$$
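The weighted-sum conversion in Eq. (13) can be sketched as follows. This is a minimal illustration: the function name is hypothetical, and the values of OF1 and OF2 are assumed to have been computed beforehand from their own definitions.

```python
def weighted_objective(of1, of2, w1):
    """Combine two objective values as in Eq. (13), with w2 = 1 - w1."""
    if not 0.0 <= w1 <= 1.0:
        raise ValueError("w1 must lie in [0, 1]")
    w2 = 1.0 - w1  # enforces w1 + w2 = 1
    return w1 * of1 + w2 * of2


# Illustrative numbers only: 0.25 * 10.0 + 0.75 * 20.0 = 17.5
combined = weighted_objective(10.0, 20.0, 0.25)
```

Choosing w1 close to 1 favours OF1, and choosing it close to 0 favours OF2. In practice the two objectives may need to be scaled to comparable ranges before weighting, which the text does not discuss.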
4 VM Consolidation Based on Cost and Energy-Aware Algorithm
The described optimization problem is NP-hard. The solution space can therefore be reduced by using a heuristic algorithm, which finds a good solution in less time. Our proposed GA-based meta-heuristic technique for VM placement is called cost and energy-aware virtual machine placement (CEAVP). Figure 3 shows a VM allocation solution with six VMs, three instance types (small, medium and large) and four physical servers. The VM allocation problem has two stages. The instance selection depends on the requirement of
S. S. Patra et al.
Fig. 3 Example of VM placement problem and the encoding to server (instance types: small, medium, large; VMs: VM1–VM6; PMs: PM1–PM4; Stage 1 is encoded as Segment 1: 6 1 8 5 4 3 7 2, and Stage 2 as Segment 2: 5 9 1 3 7 2 4 6 8)
the software the virtual machine runs, and in the next stage, suitable physical machines are selected for allocating the virtual machines. Six basic steps have to be considered when implementing the GA.
4.1 Steps of GA for CEAVP Model
4.1.1 Encoding Scheme
Every chromosome has length 2N + L + M − 2 and consists of two segments, one per stage. The first stage is used for instance allocation, and in the second stage, each VM is assigned to one physical machine. The first segment has length N + L − 1; its genes are integers in [1, N + L − 1], and the numbers greater than N are taken as delimiters. The second segment has length N + M − 1 and shows which VM is allocated to which physical machine.
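The delimiter-based encoding can be sketched as a small decoder. The function name is hypothetical, and the sketch assumes that the second segment uses the same delimiter convention as the first (numbers greater than N separate the groups), which the text states explicitly only for the first segment. The example genes are those shown for Fig. 3, with N = 6 VMs, L = 3 instance types and M = 4 physical machines.

```python
def decode_segment(genes, n_vms):
    """Split a segment into groups of VM indices.

    Genes greater than n_vms act as delimiters: each delimiter closes the
    current group and opens the next one.
    """
    groups = [[]]
    for gene in genes:
        if gene > n_vms:
            groups.append([])       # delimiter: start the next group
        else:
            groups[-1].append(gene)  # VM index: add to the current group
    return groups


N = 6
segment_1 = [6, 1, 8, 5, 4, 3, 7, 2]     # length N + L - 1 = 8
segment_2 = [5, 9, 1, 3, 7, 2, 4, 6, 8]  # length N + M - 1 = 9

vms_per_instance_type = decode_segment(segment_1, N)
vms_per_machine = decode_segment(segment_2, N)
```

Under these assumptions the first segment yields three groups (one per instance type) and the second yields four groups (one per physical machine), where an empty group means that no VM is placed on that machine.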
Minimizing Energy and Cost Through VM Placement …
4.1.2 Initialization Step
The initial population is generated randomly, subject to Eqs. (3) and (7). To reduce the computation time, the range of each gene is predetermined as described in Sect. 4.1.1.
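4.1.3 Evaluation Function
In GA, every chromosome in the population has to be evaluated. The fitness function defined in Eq. (18) evaluates the power consumed and cost of placing the VMs depending on the data obtained by the chromosome.
$$V_t^{CPU} = \max\left( \frac{\sum_{r=1}^{L} \sum_{s=1}^{N} R_r^{CPU} x_{trs}}{y_t * C_t^{CPU}} \right), \quad \forall t \in [1, \dots, M] \qquad (14)$$
$$V_t^{Memory} = \max\left( \frac{\sum_{r=1}^{L} \sum_{s=1}^{N} R_r^{Memory} x_{trs}}{y_t * C_t^{Memory}} \right), \quad \forall t \in [1, \dots, M] \qquad (15)$$
$$V_t^{Disk} = \max\left( \frac{\sum_{r=1}^{L} \sum_{s=1}^{N} R_r^{Disk} x_{trs}}{y_t * C_t^{Disk}} \right), \quad \forall t \in [1, \dots, M] \qquad (16)$$
$$V_T = \sum_{t=1}^{M} \left( V_t^{CPU} + V_t^{Memory} + V_t^{Disk} \right) \qquad (17)$$
$$\mathrm{OF} = \mathrm{OF}_3 + (1 + \beta * V_T) \qquad (18)$$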
β is the control argument for the penalty function. Equations (15)–(17) give the penalties imposed in the excess use of CPU, disk resources and memory for the ith physical server.
4.1.4 Selection Strategy
The goal of this step is to select the best fitted chromosomes and reject the other chromosomes.
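The text does not name a particular selection scheme. As a hedged sketch, plain truncation selection keeps the best-ranked chromosomes and discards the rest; the function name, the `keep` parameter and the assumption that a lower fitness value is better (the objective is a cost to be minimized) are all illustrative choices, not the authors' stated method.

```python
def select_best(population, fitness, keep):
    """Keep the `keep` chromosomes with the lowest fitness value."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    ranked = sorted(population, key=fitness)  # ascending: lowest cost first
    return ranked[:keep]


# Toy example: each chromosome is a list and its fitness is its first gene.
survivors = select_best([[3, 0], [1, 0], [2, 0]], lambda c: c[0], keep=2)
```

Other schemes, such as roulette-wheel or tournament selection, fit the same description and could be substituted without changing the rest of the algorithm.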
4.1.5 Crossover Operator
In the crossover step, new individuals are produced from two parents. The crossover operation is shown in Fig. 4.
Fig. 4 Example of a crossover operation (Segment 1 and Segment 2 of Parent 1, Parent 2, Offspring 1 and Offspring 2; Segment 1 of Parent 1: 6 1 8 5 4 3 7 2, Parent 2: 8 2 6 3 1 7 5 4, Offspring 1: 8 2 6 5 4 3 7 1, Offspring 2: 6 1 8 3 2 7 5 4)
4.1.6 Mutation Operator
This operator changes the values of genes and is applied according to the mutation percentage. As shown in Fig. 5, a parent is first chosen randomly from the population, and two genes are selected in each segment. Exchanging the values of the two genes generates a new individual.
Fig. 5 Example of mutation operation (Segment 1: parent 6 1 8 5 4 3 7 2, offspring 6 1 3 5 4 8 7 2; Segment 2: parent 5 9 1 3 7 2 4 6 8, offspring 5 9 8 3 7 2 4 6 1)
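The swap shown in Fig. 5 can be sketched as below. The function names are hypothetical, and choosing the two positions uniformly at random is an assumption; the text only says that gene values are exchanged.

```python
import random


def swap_genes(segment, i, j):
    """Return a copy of the segment with the genes at positions i and j exchanged."""
    child = list(segment)
    child[i], child[j] = child[j], child[i]
    return child


def mutate_segment(segment, rng=random):
    """Swap two distinct, randomly chosen positions (random choice is an assumption)."""
    i, j = rng.sample(range(len(segment)), 2)
    return swap_genes(segment, i, j)


# Reproduces the Segment 1 example of Fig. 5: positions 2 and 5 (0-based) are swapped.
parent = [6, 1, 8, 5, 4, 3, 7, 2]
offspring = swap_genes(parent, 2, 5)
```

Because a swap only permutes existing genes, the offspring keeps the same set of VM indices and delimiters as the parent, so the encoding remains valid.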
4.2 The Proposed Algorithm: CEAVP
Algorithm 1: Algorithm CEAVP
Purpose: This algorithm generates the VM allocation list by taking the virtual machine list (VMList), physical machine list (PMList) and instance type list (TypeList)
1. Initialize the population size (PSize), crossover percentage (CP), mutation percentage (MP) and termination condition (T)
2.
Fig. 3 Proposed flow chart of the protection scheme (circuit-1: DSCE above +Th1, fault is at 50% distance in circuit-1 from B-1; DSCE below −Th1, fault is at 50% distance in circuit-1 from B-2; trip. Circuit-2: DSCE above +Th2, fault is at 50% distance in circuit-2 from B-1; DSCE below −Th2, fault is at 50% distance in circuit-2 from B-2; trip. DSCE between −Th1 and +Th1, and between −Th2 and +Th2: external fault)
case of circuit-2, a threshold value of +5 signifies a faulty zone at fifty percent distance from B-1 in circuit-2, and a Th2 value of −5 signifies a faulty zone at fifty percent distance from B-2 in the circuit-2 line. The threshold values (Th1 and Th2 for circuit-1 and circuit-2) are chosen by simulating different fault types in different faulty zones (Fig. 3).
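The threshold logic of the flow chart can be sketched as follows. This is a simplified illustration, not the authors' relay code: the function names and the returned labels are hypothetical, the DSCE values are assumed to be already computed per phase, and strict inequalities are assumed at the threshold.

```python
def phase_status(dsce, threshold):
    """Compare one phase's DSCE value with the +/- threshold band."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if dsce > threshold:
        return "fault from B-1 side"
    if dsce < -threshold:
        return "fault from B-2 side"
    return "within band"


def classify(dsce_by_phase, threshold):
    """Trip if any phase leaves the band; otherwise treat the event as external."""
    status = {p: phase_status(v, threshold) for p, v in dsce_by_phase.items()}
    faulty = sorted(p for p, s in status.items() if s != "within band")
    return {"trip": bool(faulty), "faulty_phases": faulty, "status": status}


# Illustrative per-phase values with Th1 = 20 (circuit-1, with STATCOM).
result = classify({"A": 1.34, "B": 96.46, "C": 1.80}, threshold=20.0)
```

With these inputs only phase B leaves the ±20 band, so the sketch reports a trip with B as the faulty phase; if all three phases stay inside the band, no trip is issued, which corresponds to the external fault branch.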
4 Simulation Result and Discussion
SIMULINK (R2010) is used to construct the prototype model of the relaying scheme. The fault simulations are carried out in MATLAB with the fault applied at 0.3 s (the 600th sample) for different shunt fault types. Simulations are prepared and presented using MATLAB Simulink and differential coding for both internal and external fault detection, with parameter variations including normal source impedance (NSI), fault resistance (FR) from 1 to 100 Ω, a 30% increase of NSI and phase reversal. The STATCOM is used here to observe the fault
S. K. Mishra et al.
detection performance by varying the magnitude of voltage and the phase angle. Differential relaying introduces no timing error in fault detection, because the differential current signals are obtained through CTs placed at both ends of the line and synchronized by the GPS system. The faulty current signals are extracted from the Bus-1 and Bus-2 sides through the CTs at the same time. The synchronization error measured while extracting the current signals is very small, of the order of microseconds. For that reason, the STATCOM does not affect fault detection or the calculation of fault detection time. A distance relay with a STATCOM, however, is adversely affected in fault time calculation, fault detection and fault type identification, because the distance relaying scheme is based on the change of apparent impedance of the line. Errors therefore arise from overreaching or underreaching, which leads to relay maloperation in the line. The STATCOM is used to enhance power transmission capacity, to control voltage and frequency, and to improve the stability of the line under normal operating and transient conditions. The differential relaying concept is not related to power compensation: fault time calculation and fault detection in any phase do not depend on changes in the STATCOM, since they do not depend on any change of line impedance. During a fault, the STATCOM does change the current and voltage, but that change does not enter our calculation, as the current signals are extracted from the CTs placed at both ends of the line through the GPS system. Hence, there should be no correlation between reactive power injection and fault detection or fault time calculation in any of the fault cases discussed in the simulation result section.
In all the fault cases studied, the detection time remains within one cycle (20 ms). In addition, the study covers the classification of different shunt faults occurring at any part of the double line (circuit-1 and circuit-2). The critical fault study includes the variation of different transmission line parameters, such as fault angle, fault resistance and source impedance, as well as reversal of the power flow of the double line, treating the sending end as the receiving-end substation. In all such conditions, the algorithm detects double circuit line faults accurately and reliably and registers the different types of fault pattern.
4.1 Variation of Fr
Figure 4 depicts an L-G (B-G) fault at 150 km from the SE substation B-1 in circuit-1 of the double line, with Fr = 1 Ω and FA = 0°. The fault occurs in the single phase B of circuit-1 (B-G), at a distance of 150 km (50% of the total length of 300 km) from B-1. Here, the threshold value is +20 (Th1 = 20) because of the presence of the STATCOM (circuit-1), whereas the threshold value is quite low, +5 (Th2 = 5), without the STATCOM (circuit-2). The differential spectral content energy of the B phase (DSCE_B) rises gradually and touches the threshold line (Th1 = 20) with a fault detection time of 8 ms, within one cycle. However, the differential spectral content energies (DSCE_A and DSCE_C) of
Fault Detection in Differential-Based STATCOM Compensated …
Fig. 4 L-G (B-G) fault in circuit-1 of double line at 150 km from SE of B-1, Fr = 1 Ω and FA = 0° (DSCE magnitudes of phases A, B and C versus sample number; Th = 20)
the A and C phases fail to touch the set Th1 line. This signifies that phase B of circuit-1 is the faulty phase and that phases A and C are healthy. Figure 5 depicts an AB-G fault at 50% distance from B-2 in circuit-2 (without STATCOM). DSCE_A and DSCE_B both grow in the negative direction, because the fault is on the receiving-end (B-2) side, and touch the negative Th2 value of −5. Thus, in both cases, the fault is detected once the threshold line is touched, whereas the C phase does not touch the threshold line. Figure 6 depicts the ABC-G fault at 150 km from B-1 (circuit-1) of the double line including the STATCOM, with Fr = 100 Ω. The fault in all three phases is also successfully detected, as the DSCE magnitudes of the three phases rise and touch the Th value even at the higher fault resistance of Fr = 100 Ω. Similarly, a comparison of the B phase in circuit-1 and circuit-2 is depicted in Fig. 7, where the B-phase faults are denoted B1-G and B2-G for circuit-1 and circuit-2, respectively. The B-phase fault occurs at 150 km from B-1 in circuit-1 (with STATCOM) and in circuit-2 (without STATCOM), at Fr = 1 Ω and
Fig. 5 LL-G (AB-G) fault from B-2 in circuit-2 of double line, Fr = 1 Ω and FA = 0° (DSCE magnitudes of phases A, B and C versus sample number)
Fig. 6 LLL-G (ABC-G) fault from B-1 of double line including STATCOM (circuit-1), Fr = 100 Ω; FA = 0° (DSCE magnitudes of phases A, B and C versus sample number; Th = 20)
Fig. 7 L1-G (B1-G) in circuit-1 and L2-G (B2-G) in circuit-2 fault from Bus-1 of double line with STATCOM and without STATCOM, Fr = 1 Ω and FA = 0° (DSCE magnitude versus sample number)
FA = 0°. The B-phase faults in both circuits are detected. However, the fault detection time in circuit-1 is less than that in circuit-2, and in both cases it is within one cycle. Two conditions are used in all the tables below to demonstrate the effectiveness of fault detection. Condition-I represents a fault at 150 km from bus B-1 in circuit-1 of the double line (before the STATCOM mid-position), and Condition-II represents a fault at 150 km from bus B-2 in circuit-1 of the double line (after the STATCOM mid-position). Table 1 lists the DSCE energy content of circuit-1 for Fr = 1 Ω and FIA = 0°. In all cases, values whose magnitude exceeds the threshold represent a faulty phase; otherwise, the phase is healthy. Fault angle variation is an important parameter in fault detection studies. Figure 8 depicts an LL-G (AB-G) fault at 150 km from bus B-1 in circuit-1 of the double circuit line, with FA = 45°. The fault is detected within one cycle, but the DSCE energy content varies,
Table 1 Various DSCE energy content of circuit-1, Fr = 1 Ω and FIA = 0°
Fault case    Condition-I (AG, BG, CG)    Condition-II (AG, BG, CG)
BG            1.34, 96.46, 1.80           −1.78, −112.5, 1.84
BC            0.6299, 68.8, 64.0          1.695, −92.2, −98.95
CA            92.7, 1.5, 114.6            −139.0, −1.2, −135.6
ABCG          93.8, 93.2, 94.5            −94.3, −95.3, −96.3

Fig. 8 LL-G (AB-G) fault of a double line in circuit-1, 150 km from bus B-1, Fr = 1 Ω, FA = 45° (DSCE magnitudes of phases A, B and C versus sample number; Th = 20)
for the different faulty phases, as depicted in the figure. Figure 9 depicts L-G (A-G) fault detection at 150 km from bus-1, FA = 90°, with STATCOM. In both figures the fault is detected at the respective FA value, but the fault detection time at FA = 90° is longer than at FA = 45°. Figure 10 shows the corresponding case for circuit-2 (without STATCOM): for a fault at 150 km from B-1 with FA = 0°, the DSCE has a higher amplitude and touches the +5 line. Figure 11 depicts the AB-G fault in circuit-2 at 150 km
Fig. 9 L-G (A-G) fault of a double line, STATCOM compensated (circuit-1), 150 km from bus B-1, FA = 90° (DSCE magnitudes of phases A, B and C versus sample number)
Fig. 10 LL-G (AB-G) fault of a double line, without STATCOM compensated line (circuit-2), 150 km from bus B-1, FA = 0° (DSCE magnitudes of phases A, B and C versus sample number)
Fig. 11 LL-G (AB-G) fault of a double line, without STATCOM compensated line (circuit-2), 150 km from bus B-2, FA = 0° (DSCE magnitudes of phases A, B and C versus sample number)
from B-2 with FA = 0°; the fault is detected when the DSCE of the A and B phases touches Th at −5. Table 2 lists the DSCE energy at Fr = 50 Ω, and Table 3 at Fr = 100 Ω, for circuit-1 with FA = 0°. Energy values whose magnitude exceeds the threshold signify the faulty phase.

Table 2 Various DSCE energy content of circuit-1, Fr = 50 Ω and FA = 0°
Fault case    Condition-I (AG, BG, CG)    Condition-II (AG, BG, CG)
BG            0.38, 80.45, −1.59          −1.780, −48.0, −1.674
CG            1.598, 1.259, 79.7          −1.759, −1.557, −48.9
BC            −0.361, 89.3, 83.0          −1.275, −31.5, −65.3
ABCG          98.2, 98.6, 98.0            −41.6, −43.9, −42.9
Table 3 Various DSCE energy content of circuit-1, Fr = 100 Ω and FA = 0°
Fault case
Condition-I AG
AG
98.4
BG
101.9
ABCG
CG
99.132
AG
BG
CG −0.6
1.682
−93.753
−1.58
110.8
1.781
−1.759
−124.4
−1.9
102.1
2.051
−99.152
−98.132
−0.612
98.141
−99.613
−98.142
−99.571
1.987
1.50
AB-G
Condition-II BG
98.203
Fig. 12 LL-G (AB-G) fault of a double line, without STATCOM compensated line (circuit-2), 150 km from bus B-2, FA = 0° (DSCE magnitudes of phases A, B and C versus sample number)
4.2 Variation of SI
Variation of the source impedance (SI) is also considered for fault detection in the double circuit line. The SI is increased here by up to 30%. Figure 12 depicts a double-phase fault (AB-G) in circuit-2 at 50% distance (150 km) from B-2, with Fr = 1 Ω, SI increased by 30% and FA = 0°. The DSCE magnitudes of the A and B phases reach −5 in the downward direction, and the fault is detected. The line without STATCOM compensation has a lower threshold value than the STATCOM compensated line. In Fig. 13, an ABC-G fault is detected in the STATCOM compensated line of the double circuit line at 150 km from bus-1, with SI increased by 50%.
4.3 Variation of Phase Reversing
Phase reversal is considered in the fault detection study because faults may occur with power flowing in either direction. Reverse current flow occurs in the transmission line when the voltage phase angle is reversed. This case is therefore also simulated. Figure 14 depicts the ABC-G fault at 150 km from bus B-1 in circuit-1 of the double line
Fig. 13 LLL-G (ABC phase-G) fault in double circuit line of circuit-1 (STATCOM line), 150 km from bus B-1, Fr = 1 Ω, SI = 50% increase of NSI (DSCE magnitudes of phases A, B and C versus sample number)
Fig. 14 ABC-G fault, 150 km from bus B-1, Fr = 1 Ω, reversing the phase (DSCE magnitudes of phases A, B and C versus sample number; Th = 20)
(with STATCOM). The ABC-G fault is successfully detected even when the phase angle is reversed. This suggests that a fault in the line can be detected under such adverse conditions.
4.4 External Fault
Many published papers do not address the external fault zone. It is an exceptional case, in which a fault may wrongly be detected. To examine such a case, an ABC-G fault is considered. Here, the fault occurs in the section between bus-2 and bus-3 of the system depicted in Fig. 2. Figure 15 depicts the ABC-G fault with Fr = 1 Ω, NSI, and FA = 0° at an external end. In the external fault condition, the DSCE magnitudes of the A, B and C phases do not reach the set value of Th1, so none of the phases touches the Th1 line at either +Th1 or −Th1. This signifies that the fault lies in an external zone, as per condition-3 mentioned earlier.
Fig. 15 ABC-G fault, Fr = 1 Ω, NSI, and FA = 0° at an external end (DSCE magnitudes of phases A, B and C versus sample number)
5 Conclusion
The DWT- and DFT-based differential protection scheme is an effective scheme that detects faults in a STATCOM compensated double circuit transmission line. The scheme is successfully applied under different fault conditions, at different locations and in different circuits, and is validated for internal and external fault detection within 20 ms (one cycle). The parameters varied are Fr, FA, power reversal, and SI. The scheme is tested in circuit-1 and circuit-2 for fault detection, and high accuracy is achieved, as no misdetection is observed.
References 1. Hingorani, N.G., Gyugyi, L.: Understanding FACTS Concepts and Technology of Flexible AC Transmission Systems. IEEE Press, New York (2000) 2. Zhou, X., Wang, H., Aggarwal, R K., Beaumont, P.: The impact of STATCOM on Distance Relay. In: 15th PSCC Liege, 22–26 Aug 2005 3. EI-Arroudi, K., Joos, G., Mc Gills, D.T.: Operation of impedance protection relays with STATCOM. IEEE Trans. Power Deliv. 17, 381–387 (2002) 4. Tripathy, L.N., Samantaray, S.R., Jena, M.K., Mishra, S.K.: Fast discrete S-transform based differential relaying scheme for UPFC compensated parallel line. In: IEEE Conference on Electrical Electronics Signals Communication and Optimization (EESCO) (2015) 5. This work was supported by: All India Council for Technical Education, New Delhi (File No .8-11/RIFD/RPS/POLICY-1/2016-17) 6. Aker, E.E., Lutfi, M., Aris, I., Wahab, N.I.A., Hizam, H., Emmanuel, O.: Adverse impact of STATCOM on the performance of distance relay. Indonesian J. Electr. Eng. Comput. Sci. 6, 528–536 (2017) 7. Alsammak, A.N., Janderma, S.A.: Enhancement effects of the STATCOM on the distance relay protection. Int. J. Comput. Appl. 182, 10–14 (2019) 8. Mishra, S.K., Tripathy, L.N., Swain, S.C.: A DWT based STATCOM compensated transmission line. Int. J. Eng. Technol. 7, 61–64 (2018) 9. Mishra, S.K., Tripathy, L.N., Swain, S.C.: FDST approach STATCOM integrated double circuit transmission line. In: IEEE International Conference on DeviC-2017, 23–25th Mar., Government College of Engineering, Kalyani, W. Bengal, India (2017)
10. Tripathy, L.N., Samantaray, S.R., Dash, P.K.: A fast time–frequency transform based differential relaying scheme for UPFC based double-circuit transmission line. Int. J. Electr. Power Energy Syst. (2016) 11. Mishra, S.K., Tripathy, L.N., Swain, S.C.: DWT approach based differential relaying scheme for single circuit and double circuit transmission line protection including STATCOM. Ain Sham Eng. J. 10, 93–102 (2019) 12. Mishra, S.K.: A Neuro-wavelet Approach for the performance improvement in SVC integrated Wind-Fed transmission line. Ain Sham Eng. J. 10, 599–611 (2019) 13. Mishra, S.K., Tripathy, L.N.: A critical fault detection analysis & fault time in an UPFC transmission line. Protect. Control Modern Power Syst. 4, 1–10 (2019) 14. Mishra, S.K., Tripathy, L.N.: A novel relaying approach of combined DWT and ANN based relaying scheme in an UPFC integrated wind fed transmission line. Int. J. Comput. Syst. Eng. Inter Science, Accepted in 2019 (In Press) 15. Mishra, S.K., Tripathy, L.N., Swain, S.C.: A DWT based differential relaying scheme of a STATCOM integrated wind fed transmission line. Int. J. Renew. Energy Res. 8, 476–487 (2018) 16. Mishra, S.K., Swain, S.C., Tripathy, L.N.: A Time-Frequency transform based fault detection & classification of STATCOM integrated single circuit transmission. Int. J. Power Electr. Drives Syst. 8, 1804–1813 (2017) 17. Santoso, S., Powers, E.J., Grady, W.M., Hofmann, P.: Power quality assessment via wavelet transform analysis. IEEE Trans. Power Deliv. 11, 924–930 (1996) 18. Gouda, A.M., Salama, M.M.A., Sultan, M.R., Chikhani, A.Y.: Power quality detection and classification using wavelet multi-resolution signal decomposition. IEEE Trans. Power Deliv. 14 (1999) 19. Tripathy, L.N., Jena, M.K., Samantaray, S.R.: Differential relaying scheme for tapped transmission line connecting UPFC and wind farm. Int. J. Electr. Power Energy Syst. 60, 245–257 (2014) 20. 
Jurado, F., Saenz, J.R.: Comparison between discrete STFT and Wavelet for analysis of power quality events. Electr. Power Syst. Res. 62, 183–190 (2002) 21. Mishra, S.K, Tripathy, L.N.: A novel relaying approach for performance enhancement in a STATCOM integrated wind-fed transmission line using single terminal measurement. Iranian J. Sci. Technol. Trans. Electr. Eng. (In Press)
Optimizing the Valid Transaction Using Reinforcement Learning-Based Blockchain Ecosystem in WSN P. Anitha Rajakumari and Pritee Parwekar
Abstract In virtual transactions, blockchain technology is a new and effective transmission technology that integrates decentralized and distributed databases. Blockchain allows transparency across the whole chain of data and stores all transactions in a ledger over four building blocks. However, to the best of our knowledge, no techniques are available to optimize the transactions occurring between the source and destination of a blockchain ecosystem. Here, the transactions are transmitted and received through self-generated addresses or digital identities that serve an anonymous user. This paper addresses the optimal distribution of blockchain transactions in wireless sensor networks (WSNs) using Reinforcement Learning (RIL). RIL helps in transferring the value of transactions from the source to the destination address and in recording the public transaction history. RIL also helps in carefully verifying the transactions, since a transaction is either left unaltered or rejected after verification. Further, the experimental results validate the blockchain transactions in terms of transmission rate, packet delivery ratio, etc.
Keywords Blockchain · Reinforcement learning · Wireless sensor network
1 Introduction The blockchain network (BCN) consists of a series of blockchains having a chain of blocks with valid information transacted between the blocks [1–4]. An additional block with blockchain is added, which represents the entire transaction ledger carried out in each block. To validate each block, a cryptographic algorithm is used, since each block has a parent hash value, time stamp, nonce, and random number to verify the hash value [5–7].
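The parent-hash linkage described above can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 over a JSON serialization is chosen here for convenience, and the field and function names are hypothetical; the text does not specify the hash function or the block layout.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's full content (SHA-256 over sorted-key JSON; an assumption)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_block(parent_hash, timestamp, nonce, transactions):
    """Build a block holding the fields named in the text."""
    return {
        "parent_hash": parent_hash,
        "timestamp": timestamp,
        "nonce": nonce,
        "transactions": list(transactions),
    }


def chain_is_valid(chain):
    """Check that every block stores the hash of the block before it."""
    for parent, child in zip(chain, chain[1:]):
        if child["parent_hash"] != block_hash(parent):
            return False
    return True


genesis = make_block("0" * 64, timestamp=0, nonce=0, transactions=["tx0"])
block_1 = make_block(block_hash(genesis), timestamp=1, nonce=7, transactions=["tx1"])
valid_before = chain_is_valid([genesis, block_1])

genesis["transactions"] = ["forged"]  # tampering changes the genesis hash
valid_after = chain_is_valid([genesis, block_1])
```

Because each block stores the hash of its parent, changing any earlier block breaks the link to its successor, which is why validation can be traced back to the first block.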
P. Anitha Rajakumari · P. Parwekar (B) Delhi-NCR Campus, SRM Institute of Science and Technology, Ghaziabad, India e-mail: [email protected]; [email protected] P. Anitha Rajakumari e-mail: [email protected]; [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_48
P. Anitha Rajakumari and P. Parwekar
Therefore, the validity of data in the blockchain is traced back through the first block. A block's hash value is special in that it changes automatically with any update to the block, which prevents data from being altered secretly. A new block is added to the chain if the nodes deem the block and its transactions valid [8]. WSNs collect sensing data using mobile phone-borne sensors or any IoT device to save costs [9]. CSN’s main problem is privacy leakage, which leads to fewer users taking part and to fake data being uploaded by users protecting their personal data. WSN nodes and mobile devices have limited resources. Running a blockchain network within intelligent sensor networks is impractical, as the sensors are resource-restricted devices with insufficient computing capacity to run PoW consensus algorithms. Moreover, the blockchain network requires a permanent connection among nodes, which conflicts with energy efficiency. Therefore, a blockchain without PoW that allows partial connectivity while withstanding errors and adverse activities is needed [10]. In [9], the experimental evaluation is based on a small number of samples, so the results can be one-sided, and privacy protection can be enhanced by improving the algorithm. Security analysis and protection against malicious nodes are not provided in [10]. The proposed method can be extended to other topological models with the integration of a Merkle tree. The blockchain structure for successful transactions is given in Fig. 1. In a distributed blockchain, a routing node acquires routing information in an open and secure manner from the network, including but not limited to its neighboring routing nodes. Routing efficiency can be improved if proper use is made of this routing knowledge.
In this view, the present study uses a distribution of a blockchain transactions in an optimal way, and hence, the authors use reinforcement learning to improve the distribution of transactions. RIL ensures that the transactions take place between the source and destination in an optimal way. The transactions occur based on the present and previous transactions, where the transactions are carefully included/altered/rejected after verification.
Fig. 1 Blockchain structure (a genesis block followed by blocks i, each holding a hash, a timestamp, a nonce, and transactions Tx1, …)
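The block layout of Fig. 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the field name `prev_hash`, the use of SHA-256 over a JSON encoding, and the helper names `make_block`, `block_hash`, and `chain_is_valid` are all assumptions made for illustration. The sketch only shows why altering an earlier block is detectable: the stored hash of the predecessor no longer matches.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's full contents (assumed encoding: sorted-key JSON, SHA-256)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_block(prev_hash, timestamp, nonce, transactions):
    """Build a block holding the fields named in Fig. 1 (hypothetical layout)."""
    return {
        "prev_hash": prev_hash,        # hash of the preceding block
        "timestamp": timestamp,
        "nonce": nonce,
        "transactions": list(transactions),
    }


def chain_is_valid(chain):
    """Each block must store the hash of the block immediately before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


genesis = make_block("0" * 64, timestamp=0, nonce=0, transactions=["Tx1"])
block_1 = make_block(block_hash(genesis), timestamp=1, nonce=7, transactions=["Tx1", "Tx2"])
chain = [genesis, block_1]
```

Appending a forged transaction to `genesis` after `block_1` has been built changes the genesis hash, so `chain_is_valid(chain)` then returns `False`.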
The main contribution of the paper is a state-of-the-art modification of existing transactions using reinforcement learning. RIL ensures that the transactions between blocks take place in a non-redundant way so that the transactions are carried out optimally. The outline of the paper is as follows: Sect. 2 presents the proposed method. Section 3 describes the routing algorithm. Sections 4 and 5 discuss RIL routing and the RIL routing algorithm. Section 6 compares the results of the proposed and existing methods. Section 7 concludes the work.
2 Proposed Method
The robustness and trustworthiness of routing in a WSN are improved with a blockchain mechanism and token transactions. The blockchain handles the token transactions and records the sensor node information. The details are given in Fig. 2, where the framework consists of a routing network (RN) and a blockchain network (BCN) with three different entities, namely the server (S), routing (R) nodes, and terminal devices. The RN in Fig. 2 has routing nodes and terminals, where each routing node (R) is connected with multiple terminals via LAN. The R node is responsible for the transmission and reception of packets between sensor nodes or between a sensor node and a terminal.
Fig. 2 Blockchain-based routing scheme (server, BCN, RIL, source terminal, routing nodes R(i) and Rπ(j) in the WSN, destination terminal)
3 Routing Algorithm
The routing procedure is given here. The data packets are transmitted from the source to the destination node or terminal via multiple routing nodes, say R(i). The routing nodes R(i) are responsible for selecting the immediate hops Rπ(j) via the RIL routing policy. RIL continuously queries the blockchain network to collect routing state information. After the packets have been forwarded hop by hop, the target node R(t) receives the data packets and forwards them to the destination terminal. The routing nodes R(i) release a token agreement for the generation of tokens by providing the associated variables on the BCN, which include:
• Releaser address on the BC,
• Token name,
• Empty table that maps each routing node (R) to its token balance (BR),
• Token supply,
• Destination node R(t).
Upon the release of the token contract, the token agreement automatically generates the corresponding tokens for the releaser. The routing nodes transact the tokens via two functions, namely transfer and confirm, both of which belong to the token agreement and are triggered to transact tokens. The steps of routing are given below:
Step 1: The packets are initialized in routing nodes R(i) and then in immediate hops Rπ(j) as p and q, respectively.
Step 2: Initially, the routing nodes R(i) send data packets to the immediate hops Rπ(j), which return an acknowledgment (ACK) to confirm the reception of packets and the token balance.
Step 3: Routing node R(i) performs scheduling using routing information from neighboring nodes.
Step 4: The authenticity of the next node is checked using its token balance. If the node is malicious, routing through it is denied; otherwise, routing proceeds.
Step 5: Once the packet is transmitted, the routing node R(i) triggers the function transfer on the token agreement to indicate the change in routing state based on the tokens sent to the BCN.
Step 6: The amount of tokens is then released on the token agreement.
Step 7: Meanwhile, the immediate hops Rπ(j) trigger the function confirm on the token agreement in order to confirm the total number of packets received at the BCN.
Step 8: Finally, the token agreement checks the number of packets received against the amount of tokens. If the tokens match the transactions, then the transactions are valid.
Step 9: The consensus algorithm in the server node confirms the entire set of token transactions.
Step 10: On the other hand, failed transactions are canceled and discarded in the BCN without affecting further routing.
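The interplay of the transfer and confirm functions in Steps 5 to 8 and 10 can be sketched as follows. This is a toy in-memory model under stated assumptions, not the token agreement used in the paper: the class name `TokenAgreement`, its field names, and the rule of one token per packet are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TokenAgreement:
    """Toy stand-in for the token agreement released on the BCN (hypothetical)."""
    releaser: str                                   # releaser address on the BC
    token_name: str
    supply: int                                     # total token supply
    destination: str                                # destination node R(t)
    balances: dict = field(default_factory=dict)    # routing node -> token balance
    pending: dict = field(default_factory=dict)     # (sender, receiver) -> tokens sent

    def __post_init__(self):
        # Assumption: the whole supply is generated for the releaser.
        self.balances[self.releaser] = self.supply

    def transfer(self, sender, receiver, packets):
        """Triggered by R(i) after sending packets (assumed: one token per packet)."""
        if self.balances.get(sender, 0) < packets:
            raise ValueError("insufficient token balance")
        self.pending[(sender, receiver)] = packets

    def confirm(self, sender, receiver, received):
        """Triggered by the next hop; valid only if packets and tokens match."""
        sent = self.pending.pop((sender, receiver), None)
        if sent is None or sent != received:
            return False                            # failed transaction is discarded
        self.balances[sender] -= sent
        self.balances[receiver] = self.balances.get(receiver, 0) + sent
        return True


agreement = TokenAgreement(releaser="R1", token_name="RouteToken", supply=100, destination="R9")
agreement.transfer("R1", "R2", packets=5)
valid = agreement.confirm("R1", "R2", received=5)
```

A mismatch between the confirmed packet count and the transferred tokens leaves all balances unchanged, which mirrors Step 10.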
4 RIL Routing
Reinforcement learning (RIL) is an area of machine learning concerned with how an agent takes actions in an environment in order to maximize a cumulative reward. It is one of the basic machine learning paradigms, alongside supervised and unsupervised learning. Reinforcement learning differs from supervised learning in that it does not require labeled input/output pairs, and it does not require sub-optimal actions to be explicitly corrected. Instead, the focus is on finding a balance between exploration of unexplored territory and exploitation of current knowledge. The output depends on the state of the current input, and the next input depends on the output of the previous input. It offers better performance in terms of delay, throughput, and time stamp, and it sustains this behavior over a long period of time.
5 RIL Routing Algorithm
The RIL-based routing uses global, trusted, and dynamic routing information from the BCN. The learning model obtains the following information from the BCN: the token transaction time stamp ti, the token transfer amount ni for Rπ(j), the remaining tokens, and the address array At. The routing environment is the environment (E) in RIL; the routing node R(i) updates the routing state, and decisions are made based on actions. The state x is the current position of the packet, and the decision determines where the packet is transferred. The RIL has a total of n states for n routing nodes. An action a is taken based on policy π(x) from the present state x. Finally, the action forwards the packet to the immediate hop. In this way, the packets are transmitted via hops between the sensor nodes.
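The state, action, and policy described above can be made concrete with a small sketch. The text does not say which RIL algorithm is used, so tabular Q-learning with an epsilon-greedy policy is assumed here purely for illustration; the function names, the reward signal, and the parameters `alpha`, `gamma`, and `epsilon` are hypothetical.

```python
import random


def choose_next_hop(q, state, neighbors, epsilon=0.1, rng=random):
    """Policy pi(x): pick the immediate hop for a packet located at `state`."""
    if rng.random() < epsilon:
        return rng.choice(neighbors)                              # explore
    return max(neighbors, key=lambda a: q.get((state, a), 0.0))   # exploit


def update_q(q, state, action, reward, next_neighbors, alpha=0.5, gamma=0.9):
    """One Q-learning update after forwarding a packet from `state` to `action`.

    The next state equals the chosen hop, because the state is the packet position.
    """
    best_next = max((q.get((action, a), 0.0) for a in next_neighbors), default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


q_table = {}
# Assumed reward: higher when a confirmed token transaction follows the hop.
update_q(q_table, "R1", "R2", reward=1.0, next_neighbors=["R3"])
hop = choose_next_hop(q_table, "R1", ["R2", "R3"], epsilon=0.0)
```

With `epsilon=0.0` the policy is purely greedy, so the hop with the highest learned value is always selected.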
6 Results
The simulation is carried out with virtual servers for updating the blockchain transactions. RIL obtains the routing information from public transactions. The proposed method is compared with the existing BC mechanism without RIL in a WSN. The methods are evaluated in terms of average network delay and throughput.
Reinforcement algorithm for finding the minimum path. The algorithm generates a shortest path tree (SPT) from a given source. It maintains two sets: one contains the vertices already included in the shortest path tree, and the other contains the vertices not yet included. At every step, the algorithm finds a vertex in the second set (not yet included) that has the minimum distance from the source.
Value-based: A value-based reinforcement learning method seeks to maximize a value function V(s). In this method, the agent expects a long-term return from the current state under policy π.
Policy-based: A policy-based RL method seeks a policy such that the action performed in every state helps to gain the maximum reward in the future.
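The shortest path tree construction described above (two vertex sets, repeatedly picking the not-yet-included vertex closest to the source) matches Dijkstra's algorithm. The sketch below is an illustration under that reading, not code from the paper; the graph encoding as a dictionary of neighbor weights and the function name `shortest_path_tree` are assumptions, and edge weights are assumed non-negative.

```python
import heapq


def shortest_path_tree(graph, source):
    """Return (dist, parent) for the shortest path tree rooted at `source`.

    graph: dict mapping each vertex to a dict {neighbor: non-negative weight}.
    """
    dist = {source: 0.0}
    parent = {source: None}
    included = set()                      # vertices already in the tree
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)        # closest vertex not yet included
        if u in included:
            continue                      # stale heap entry
        included.add(u)
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


network = {"A": {"B": 1.0, "C": 4.0}, "B": {"C": 2.0}, "C": {}}
dist, parent = shortest_path_tree(network, "A")
```

In this small example the route to `C` goes through `B`, because 1.0 + 2.0 is shorter than the direct edge of weight 4.0.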
Fig. 3 Average network delay
Fig. 4 Network throughput
The average network delay for varying network density is presented in Fig. 3, and the network throughput is presented in Fig. 4. The results show that the proposed method has lower delay than the ANN and DNN models. The average packet delivery results show reduced delay for the present RIL compared with the existing BCN, and the throughput results show increased throughput for the present RIL compared with the existing BCN.
7 Conclusions
In this paper, blockchain transactions are distributed in an optimal way by using reinforcement learning to improve the distribution of transactions. RIL ensures that the transactions take place between the source and destination in an optimal way. The transactions are processed based on the present and previous transactions, and each transaction is included, altered, or rejected after verification. Packets are represented via the BC, and the BCN then acquires routing transactions through valid confirmation of blockchain transactions from the server. RIL adapts well to the BCN, selecting optimal routing paths and avoiding paths through malicious nodes. The simulation validates the method, showing reduced latency and increased throughput compared with the existing BCN.
References
1. Azaria, A., Ekblaw, A., Vieira, T., Lippman, A.: Medrec: using blockchain for medical data access and permission management. In: 2016 2nd International Conference on Open and Big Data (OBD), pp. 25–30. IEEE (2016)
2. Biryukov, A., Khovratovich, D., Pustogarov, I.: Deanonymisation of clients in Bitcoin P2P network. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 15–29. ACM (2014)
3. Bos, J.W., Halderman, J.A., Heninger, N., Moore, J., Naehrig, M., Wustrow, E.: Elliptic curve cryptography in practice. In: International Conference on Financial Cryptography and Data Security, pp. 157–175. Springer, Berlin, Heidelberg (2014)
4. Kuo, T.T., Ohno-Machado, L.: Modelchain: decentralized privacy-preserving healthcare predictive modeling framework on private blockchain networks (2018). arXiv preprint arXiv:1802.01746
5. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008). Available at https://bitcoin.org/bitcoin.pdf
6. Puthal, D., Malik, N., Mohanty, S.P., Kougianos, E., Yang, C.: The blockchain as a decentralized security framework [future directions]. IEEE Consum. Electron. Mag. 7(2), 18–21 (2018)
7. Salma, B.U., Lawrence, A.A.: Improved group key management region based cluster protocol in cloud. Cluster Comput. 1–13 (2017)
8. Nofer, M., Gomber, P., Hinz, O., Schiereck, D.: Blockchain. Bus. Inform. Syst. Eng. 59(3), 183–187 (2017)
9. Jia, B., et al.: A blockchain-based location privacy protection incentive mechanism in crowd sensing networks. Sensors 18(11), 3894 (2018)
10. Kushch, S., Prieto-Castrillo, F.: A rolling blockchain for a dynamic WSNs in a smart city (2018). arXiv preprint arXiv:1806.11399
Wi-Fi Fingerprint Localization Based on Multi-output Least Square Support Vector Regression
A. Christy Jeba Malar, M. Deva Priya, F. Femila, S. Sam Peter, and Viraja Ravi
Abstract Estimating the location of a movable object is highly necessary for providing context-aware services in an indoor environment. As the Global Positioning System (GPS) is not appropriate for indoor positioning, the Wireless Local Area Network (WLAN) seems to be the choice due to its ubiquitous nature. The localization task based on wireless signals involves several challenges. This paper proposes a cost-effective Wi-Fi-based location estimation and navigation architecture which employs the existing IEEE 802.11 infrastructure for facilitating indoor positioning, providing business solutions, monitoring health care and guiding navigation. A statistical regression model is built on the recorded Received Signal Strength (RSS) dataset using Multi-output Least Square Support Vector Machine (M-LS-SVM) regression, which infers the location of a mobile device. The information from the radio map helps in improving the performance. The proposed M-LS-SVM technique is compared with various regression models for different kernels.
Keywords Fingerprint localization · Indoor positioning · Support vector machine regression · Global Positioning System
A. Christy Jeba Malar Department of Information Technology, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India e-mail: [email protected] M. Deva Priya (B) · F. Femila · S. S. Peter · V. Ravi Department of Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India e-mail: [email protected] F. Femila e-mail: [email protected] S. S. Peter e-mail: [email protected] V. Ravi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5_49
1 Introduction
Location-based services support context-aware applications. They rely on accurate location computing technologies. The satellite-based Global Positioning System (GPS) finds the geographical position of the mobile user. This has driven different context-aware services such as navigation systems, business advertisements, surveillance, military equipment tracking and many more. GPS localization can be effectively applied in an outdoor environment. However, it is challenging to interpret GPS signals indoors as they are attenuated by obstacles. An Indoor Positioning System (IPS) helps in determining the precise positions of mobile devices in closed environments such as universities, hospitals, airports and shopping malls. Indoor localization techniques such as infrared rays, pedestrian dead reckoning [1], Bluetooth, magnetic localization, optical navigation systems, ultrasound solutions, WLAN, ZigBee, ultra-wide band, inertial measurement units, biosensors and hybrid solutions have been proposed with the aim of providing indoor navigation, emergency services, object tracking and other value-added services. In many cases, these systems suffer from the constraint that dedicated hardware must be installed to support localization sensors. For example, Radio-Frequency IDentification (RFID) tags are to be installed for RFID-based localization. These techniques are based on distance measurements such as Time of Arrival (ToA), Angle of Arrival (AoA) [2], Received Signal Strength Indicator (RSSI) and Time Difference of Arrival (TDoA). Due to the multifaceted nature of indoor environments, wireless signal strength-based localization is always subject to challenges like unpredictability of signal propagation, Non-Line-of-Sight (NLoS), signal attenuation by obstacles and interference with devices in the same frequency.
This unpredictability makes it difficult to design an efficient model for measuring the signal strength and computing the positions. In this paper, an IPS based on widely deployed indoor Wi-Fi systems and RSSI from wireless Access Points (APs) is propounded. The indoor localization problem is solved using statistical learning algorithms. Multi-output Least Square Support Vector Machine (M-LS-SVM) is applied on the fingerprint radio map.
2 Related Work
In this section, diverse localization methods propounded by various authors are discussed. Probabilistic techniques [3] consider the signal strength at the Reference Points (RPs) during offline building of the radio map. In the online phase, probabilistic techniques are involved in the estimation of the Region of Interest (RoI). Internet of Things (IoT)-based indoor positioning is a developing technique [4]. Machine Learning (ML) and statistical models such as SVM, neural networks, Naive Bayes, Smallest M-vertex Polygon (SMP), and compressive sensing and affinity propagation [3] have been applied on the fingerprint radio map for indoor positioning to obtain optimized positioning accuracy. The non-linear behavior of the environmental RSS makes the measurement at RPs from Wi-Fi APs reliant on the environmental obstacles. Shi et al. [5] have proposed an RSSI-based positioning scheme for large-scale applications. The authors have employed tri-partition RSSI classification and its tracing algorithm as an RSSI filter. In empirical data modeling, the model is derived from the existing data using induction, from which the responses of an unobserved system may be deduced. Ultimately, the quantity and quality of the observations govern the performance of the model. Due to the observational nature of the process, the data that is recorded is finite and sampled. In general, this sampling is not uniform. Further, as the features of the application are of high dimensions, the collected data forms a sparse distribution in the input space. Traditional Neural Network (NN) approaches suffer from multiple local minima and from generalization issues such as overfitting of models. Data becomes highly separable when it is mapped into a feature space with high dimensions. This property suits support vector classification, regression and kernel-based learning methods [6]. As already mentioned, SVM is a statistical ML tool used for classification and regression [7]. SVM is an effective ML method with features like high fitting accuracy, few parameters and global optimality. Bisio et al. [8] have presented a positioning scheme called Smart P-FP, wherein they have compared the traditional and smart P-FP over smartphones. Kim et al. [9] have used Deep Neural Networks (DNNs) for the classification of building/floor and floor-based position estimation. They have used a stacked auto-encoder for reducing the dimensions of the feature space and a feed-forward classifier for multi-label classification, using which the Wi-Fi fingerprinting-based multi-building/floor indoor positioning system is built.
SVM comprises both Support Vector Classification (SVC) and Support Vector Regression (SVR). SVC is best suited for applications with categorical data, whereas SVR is used to handle problems with continuous data as the response variable. SVM is generally used for classification, but it has recently been extended to address regression problems as well. In the test phase, non-linear mappings are applied with low computational overhead. The SVM formulation is based on the kernel functions chosen during the training phase. Kernel selection is significant: a linear kernel yields a linear mapping, and a non-linear kernel yields a non-linear mapping. SVM (with Gaussian kernel) and statistical learning schemes like k-Nearest Neighbor (kNN) are applied for localization. SVM schemes are designed to perform regression, one for each spatial component. Results indicate that SVM outperforms other schemes in classification and provides similar performance in regression. Multi-variate regression aims to design a predictive model that maps a multi-variate input space to a multi-variate output space. Kernel PLS regression and Multi-output Support Vector Regression (MSVR) [10] are familiar regression models. It is difficult to directly generalize multi-label classification methods to complement regression ones. Cluster-based node positioning and multi-floor cluster-based positioning in a cloud environment can also be combined with SVR for improving the positioning accuracy [11, 12].
3 Proposed Multi-output LS-SVM (M-LS-SVM)
The proposed system deals with finding a suitable model for RSS fingerprints (training data) recorded at Reference Points (RPs) in a 2D grid environment. Accuracy is improved by taking the average RSS observed at four different angles at varying time intervals. The proposed Multi-output LS-SVM (M-LS-SVM) model predicts the position of an unknown Mobile Station (MS). A single M-LS-SVM learning model is sufficient to predict the (x, y) coordinates of an unknown position P(x, y), whereas in LS-SVM, two learning models are necessary to determine the (x, y) coordinates separately.
3.1 Radio Map Construction
Locations in WLANs are usually determined in two stages, namely the offline training stage and the online stage. In the offline training phase, the RSSs from the APs are recorded at various RPs. The RPs are fixed based on the requirement before commencing the collection of RSS at them. As the RSS fingerprint database is based on site calibration, it involves time and labor and is susceptible to environmental changes. Nowadays, crowdsourcing is used to update the database [13] so as to circumvent site calibration, wherein mobile users update their positions on a server in the offline stage. In the online phase, the system matches the RSS vector received from the APs against the patterns in the fingerprint radio map to find the user location. Searching algorithms are categorized into deterministic and probabilistic algorithms based on the fingerprint parameter types. The RSS values are recorded at different time intervals at four different angles at the fixed points in the RoI. The final RSS value is taken as the average. Averaging reduces the effect of environmental changes on the RSS values and yields a more accurate measurement. The location prediction model captures the association between the position of the Mobile Station (MS) and the RSS value recorded from each Wi-Fi AP (Fig. 1a). Figure 1b shows the RPs in the deployment area. Wi-Fi APs are deployed across the campus. A grid space with RPs at known locations is formulated. More than 75 RPs are identified at regular intervals of 1.5 m. Blue dots represent the labeled reference positions in the testing environment. The proposed M-LS-SVM performs non-linear mapping on the training set, so that the input set is mapped to a higher dimensional feature space, thereby providing accurate prediction.
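The averaging step used to build the radio map can be sketched as follows. This is a minimal illustration assuming NumPy; the input layout (one array of shape angles × readings × APs per RP) and the function name `build_radio_map` are assumptions, not the authors' data format.

```python
import numpy as np


def build_radio_map(samples):
    """Average RSS readings over all angles and time samples at each RP.

    samples: dict mapping an RP position (x, y) to an array of shape
             (n_angles, n_readings, n_aps) holding RSS values in dBm.
    Returns (positions, fingerprints) with one averaged RSS vector per RP.
    """
    keys = sorted(samples)                           # fixed, reproducible RP order
    positions = np.array(keys, dtype=float)          # shape (n_rps, 2)
    fingerprints = np.array(
        [np.asarray(samples[k], dtype=float).mean(axis=(0, 1)) for k in keys]
    )                                                # shape (n_rps, n_aps)
    return positions, fingerprints


# Two RPs, four angles, three readings per angle, two APs (made-up values).
readings = {
    (0.0, 0.0): np.full((4, 3, 2), -60.0),
    (1.5, 0.0): np.stack([np.full((3, 2), v) for v in (-50.0, -52.0, -54.0, -56.0)]),
}
positions, fingerprints = build_radio_map(readings)
```

The resulting `fingerprints` rows are the inputs of the regression model and the `positions` rows are its two-dimensional targets.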
Fig. 1 a System architecture. b Reference points in the deployment area

3.2 Support Vector Regression (SVR)
An 'n × d' matrix is formed for a set of 'n' observations with 'd' components each. The data points are $\{x_i, y_i\}$, $i = 1, 2, \ldots, n$, where
$x_i \in \mathbb{R}^m$ is the input data,
$y_i \in \mathbb{R}^l$ is the observable output on the set of independent observations, and
'n' is the size of the training dataset.
F(x, w) is the family of functions parameterized by 'w' as given in Eq. (1).
$$F = \{\, f(x, w),\ w \in \lambda \mid f : \mathbb{R}^m \rightarrow \mathbb{R}^l \,\} \tag{1}$$
where 'λ' is the set of parameters. If the observations are found to be linear, then the regression function is given by,
$$f(x, w) = w \cdot x + b \tag{2}$$
In this problem, the input observations and output points are expected to be non-linear. The SVR method maps each observation onto a feature space with high dimensions by using a non-linear function 'ϕ' and performs linear regression so that the original non-linear effect is obtained.
$$f(x, w) = w \cdot \phi(x) + b \tag{3}$$
Thus, the regression problem can be solved using the following optimization problem.
$$\min\ \frac{1}{2}\lVert w \rVert^2 + \gamma \sum_{i=1}^{l} \left(\xi_i + \xi_i^*\right)$$
The constraint is shown in the following equation.
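$$\left.\begin{aligned} w \cdot \phi(x_i) + b - y_i &\le \xi_i^* + \varepsilon \\ y_i - w \cdot \phi(x_i) - b &\le \xi_i + \varepsilon \\ \xi_i,\ \xi_i^* &\ge 0 \\ i &= 1, 2, \ldots, n \end{aligned}\right\} \tag{4}$$
where 'ξ_i, ξ_i^*' are slack variables signifying the upper and lower constraints of the output. The Lagrange multiplier method is well suited to solving such constrained quadratic problems.
$$\begin{aligned} L(w, b, \alpha, \alpha^*) ={} & \frac{1}{2}\lVert w \rVert^2 + \gamma \sum_{i=1}^{l} \left(\xi_i + \xi_i^*\right) - \sum_{i=1}^{l} \alpha_i \left(\xi_i + \varepsilon - y_i + w \cdot \phi(x_i) + b\right) \\ & - \sum_{i=1}^{l} \alpha_i^* \left(\xi_i^* + \varepsilon + y_i - w \cdot \phi(x_i) - b\right) - \sum_{i=1}^{l} \eta_i \left(\xi_i + \xi_i^*\right) \end{aligned} \tag{5}$$
where
$$\alpha_i,\ \alpha_i^* \ge 0 \quad (i = 1, 2, \ldots, l) \tag{6}$$
The Karush–Kuhn–Tucker (KKT) conditions are applied for obtaining the optimality conditions.
$$w = \sum_{i=1}^{l} \left(\alpha_i - \alpha_i^*\right) \phi(x_i), \quad i = 1, 2, \ldots, n \tag{7}$$
The kernel function 'ψ(x_i, x_j)' is equal to the inner product of 'ϕ(x_i)' and 'ϕ(x_j)', i.e.,
$$\psi(x_i, x_j) = \phi(x_i) \cdot \phi(x_j) \tag{8}$$
The final non-linear SVR function is of the form,
$$f(x) = \sum_{x_i \in s} \left(\alpha_i - \alpha_i^*\right) \psi(x, x_i) + b \tag{9}$$
where 's' is the set of support vectors for the given set.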
3.3 M-LS-SVM for Fingerprint Radio Map
The position determination problem of a mobile user can be formulated as a multi-output regression problem over the fingerprint radio map. Let
$$y = \left(y_{i,j}\right) \in \mathbb{R}^{l \times m} \tag{10}$$
Here, 'y' is the user position in the indoor environment represented as a two-dimensional variable. Hence, m = {1, 2} is the number of dimensions in the output variable. Given a set of observations, multi-output regression strives to produce an output vector ($y \in \mathbb{R}^m$) from the given input data points ($x \in \mathbb{R}^d$). M-LS-SVM solves the problem by finding,
$$b = (b_1, b_2, \ldots, b_l)^T \in \mathbb{R}^l \tag{11}$$
Since classical SVM is one dimensional (1D), it involves two independent learning processes to find the user positions in (x, y) coordinates ($y_{1l}, y_{2l}$). To overcome this, multi-output regression is employed to predict multiple outputs in a single learning process.
$$y = [y_1, y_2, y_3, \ldots, y_l] \tag{12}$$
$$y = \begin{bmatrix} y_{11} & \cdots & y_{1l} \\ \vdots & \ddots & \vdots \\ y_{n1} & \cdots & y_{nl} \end{bmatrix} \tag{13}$$
where 'y_{il}' signifies the output variable 'y_i' for instance 'l'. The minimization function is given in Eq. (14).
$$\min_{w \in \mathbb{R}^{m \times n},\, e}\ \zeta(w, e) = \frac{1}{2} \sum_{j=1}^{l} w_j^T w_j + \frac{\gamma}{2} \sum_{j=1}^{l} \sum_{i=1}^{m} e_{ij}^2 \tag{14}$$
where γ > 0. The Lagrange function is given by,
$$L(w, b, e, \alpha) = \zeta(w, e) - \sum_{j=1}^{l} \sum_{i=1}^{m} \alpha_{ij} \left( w_j^T \cdot \phi(x_i) + b_j + e_{ij} - y_{ij} \right) \tag{15}$$
According to the Karush–Kuhn–Tucker (KKT) conditions of optimality,
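$$\begin{aligned} \frac{\partial L}{\partial w_j} = 0 &\Rightarrow w_j = \sum_{i=1}^{m} \alpha_{ij}\, \phi(x_i) \\ \frac{\partial L}{\partial b_j} = 0 &\Rightarrow \sum_{i=1}^{m} \alpha_{ij} = 0 \\ \frac{\partial L}{\partial e_{ij}} = 0 &\Rightarrow \alpha_{ij} = \gamma\, e_{ij} \\ \frac{\partial L}{\partial \alpha_{ij}} = 0 &\Rightarrow w_j^T \cdot \phi(x_i) + b_j + e_{ij} = y_{ij} \end{aligned} \tag{16}$$
After eliminating 'w_j' and 'e_{ij}', the linear equation is given by,
$$\begin{bmatrix} 0 & 1_l^T \\ 1_l & H \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \tag{17}$$
where,
$$H = k + \gamma^{-1} I_l \in \mathbb{R}^{l \times m} \tag{18}$$
'k' is defined by the items given in Eq. (19).
$$k_{ij} = \phi(x_i)^T \phi(x_j) = k(x_i, x_j) \tag{19}$$
$$b = (b_1, b_2, \ldots, b_l) \tag{20}$$
$$\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_l) \tag{21}$$
$$y = (y_1, y_2, \ldots, y_l) \tag{22}$$
The M-LS-SVM function is determined as follows.
$$y_j(x) = \sum_{i=1}^{m} \alpha_{ij}\, k(x, x_i) + b_j \tag{23}$$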
where j = 1, 2, 3, …, m. In this paper, the problem of finding optimized locations is considered to be non-linear. Applying the Lagrange dual formulation to the linear model allows it to be transformed into a non-linear model. The non-linear model depends on kernel functions to map input data to a new feature space. The Radial Basis Function (RBF) with a Gaussian of the form given below is found to be the best kernel function.
$$K(x, x') = \exp\left(-\frac{\lVert x - x' \rVert^2}{2\sigma^2}\right) \tag{24}$$
RBFs involve methods for finding the subset of centers. Here, 'σ' is a user-defined parameter. In SVM, subset selection implicitly contributes to the Gaussian function. As an alternative to the Wi-Fi positioning algorithm, SVM-based classification [14, 15] and multi-label classification [16] are involved in the formation of position clusters.
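The training step implied by the linear system of Eq. (17) with the RBF kernel of Eq. (24) can be sketched in a few lines. This is a simplified illustration assuming NumPy, in which all output dimensions share one kernel matrix and are solved together as multiple right-hand sides; it is not the authors' MATLAB implementation, and the function names and the default values of `gamma` and `sigma` are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def fit_mlssvm(x, y, gamma=10.0, sigma=1.0):
    """Solve [[0, 1^T], [1, H]] [b; alpha] = [0; y] with H = K + I / gamma."""
    n = x.shape[0]
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0
    system[1:, 0] = 1.0
    system[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(n) / gamma
    rhs = np.vstack([np.zeros((1, y.shape[1])), y])
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]       # b: (outputs,), alpha: (n, outputs)


def predict_mlssvm(x_train, b, alpha, x_new, sigma=1.0):
    """Evaluate y_j(x) = sum_i alpha_ij k(x, x_i) + b_j for every output j."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b


# Made-up fingerprints (RSS in dBm from two APs) and their (x, y) positions.
rss = np.array([[-60.0, -70.0], [-50.0, -65.0], [-40.0, -80.0], [-45.0, -55.0]])
xy = np.array([[0.0, 0.0], [1.5, 0.0], [0.0, 1.5], [1.5, 1.5]])
b, alpha = fit_mlssvm(rss, xy, gamma=1e6, sigma=5.0)
estimate = predict_mlssvm(rss, b, alpha, rss, sigma=5.0)
```

A single call returns both coordinates, which is the practical benefit over training one model per coordinate; with a large `gamma` the model reproduces the training positions almost exactly.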
4 Experimental Setup and Results Discussion
The system covers an area of 200 m² with 15 fixed APs. The number of APs is also varied to analyze the performance. At each RP, 150 signal strength values are recorded at a time interval of 120 ms at four angles: 0°, 90°, 180° and 270°. The accuracy of the proposed system is compared with SVM and M-SVM. M-LS-SVM is implemented for different kernels using MATLAB. Table 1 shows the simulation parameters. Table 2 shows the fingerprint database collected at known RPs at regular intervals at four angles. This radio map is considered as the training set in empirical data modeling. An induction process is used to build a model from the existing data, from which the responses of the system for the observed predictor variables are deduced. To predict the performance, the average relative error is computed using the following equation.
$$\delta = \frac{1}{l} \sum_{i=1}^{l} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{25}$$
The correlation coefficient is given by,
$$R = \frac{\sum_{i=1}^{l} \left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{l} \left(y_i - \bar{y}\right)^2 \sum_{i=1}^{l} \left(\hat{y}_i - \bar{\hat{y}}\right)^2}} \tag{26}$$
where '$y_i$, $\hat{y}_i$' are the actual and the predicted locations, and '$\bar{y}$, $\bar{\hat{y}}$' are the actual and the predicted averages of the output locations from the observations.

Table 1 Simulation parameters
| Simulation parameter | Value used for simulation |
| --- | --- |
| Number of access points | 15 |
| Room size | 200 × 200 m² |
| Angle of measurement | 0°, 90°, 180° and 270° |
| Reference points | 150 |
| Time interval of measurement | 120 ms |
| Space between reference points | 1.5 m |
| Dimension taken | Two-dimensional space (x, y) |

Table 2 Fingerprint radio map
| Position | Direction | RSS from AP1 | RSS from AP2 | RSS from APn | Average RSS |
| --- | --- | --- | --- | --- | --- |
| P1(x1, y1) | 0° | −56 | −66 | −69 | −68 |
| P1(x1, y1) | 90° | −61 | −64 | −72 | −65 |
| P1(x1, y1) | 180° | −59 | −61 | −68 | −62 |
| P1(x1, y1) | 270° | −56 | −60 | −71 | −62 |
| P2(x2, y2) | 0° | −38 | −55 | −59 | −47 |
| P2(x2, y2) | 90° | −41 | −54 | −48 | −45 |
| P2(x2, y2) | 180° | −26 | −44 | −54 | −43 |
| P2(x2, y2) | 270° | −32 | −42 | −54 | −49 |

SVM, M-SVM and M-LS-SVM are directly compared by varying the number of APs. In all the cases, the RBF kernel is used for optimization. The correlation coefficient for the output variables is plotted in Fig. 2a. Figure 2b shows the probability distribution of the error distance for different kernels. The RBF kernel offers higher accuracy. From Fig. 3a, it is seen that M-LS-SVM forms a learning model that accurately maps the RSSI to the location coordinates. From the literature, it is evident that M-LS-SVM usually performs better than the other probabilistic and statistical models. In general, an increased number of APs and RPs yields better accuracy. Different regression algorithms are tested by varying the number of APs. The Mean Square Error (MSE) reduces with the increase in the number of APs. Figure 3b shows the MSE for different SVM algorithms. The accuracy is tested by varying the training data size. From Fig. 4, it is seen that the average error rate reduces with an increase in the number of training samples and becomes constant after a certain level.

Fig. 2 a Correlation coefficient for output variables. b Cumulative probability distribution for different kernels
Fig. 3 a Cumulative error probability distributions for different mechanisms with RBF kernel. b Mean square error for varying number of APs
Fig. 4 Average mean square error for different datasets
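The two evaluation measures of Eqs. (25) and (26) can be computed as in the sketch below. It assumes NumPy, reads the relative error as an absolute value, and reads Eq. (26) as the usual Pearson correlation with a square root in the denominator; both readings are assumptions, since the printed formulas are not fully legible. The function names are hypothetical.

```python
import numpy as np


def average_relative_error(y_true, y_pred):
    """Mean of |y_i - y_hat_i| / |y_i| over all samples (y_true must be non-zero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))


def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    dev_true = y_true - y_true.mean()
    dev_pred = y_pred - y_pred.mean()
    denom = np.sqrt((dev_true ** 2).sum() * (dev_pred ** 2).sum())
    return float((dev_true * dev_pred).sum() / denom)


actual = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.4]
delta = average_relative_error(actual, predicted)
r = correlation_coefficient(actual, predicted)
```

Each coordinate of the predicted position can be scored separately with these functions, for example the x values and the y values of a test set.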
5 Conclusion
Regression is best suited for continuous data. Hence, LS-SVM and Multi-output LS-SVM seem to be preferable for predicting multi-output variables. From the simulation outcomes, it is evident that M-LS-SVM is capable of reconstructing locations accurately even for noisy input. Prior information available in the training set improves the localization performance. In M-LS-SVM, a single learning process is sufficient to construct the (x, y) coordinates for 2D positioning. It reduces the overhead of building two different learning models to determine the (x, y) coordinates separately, as in classical LS-SVM. The proposed M-LS-SVM is compared with the classical SVM and LS-SVM methods of statistical localization. It is applied with different types of kernels, and it is seen that the RBF kernel yields the best results. The same technique can also be applied to Bluetooth devices or any other mobile devices, provided the signals are received at pre-fixed RPs.
References
1. Chen, W., Chen, R., Chen, Y., Kuusniemi, H., Wang, J.: An effective pedestrian dead reckoning algorithm using a unified heading error model. In: IEEE/ION Position, Location and Navigation Symposium, pp. 340–347 (2010)
2. Kułakowski, P., Vales-Alonso, J., Egea-López, E., Ludwin, W., García-Haro, J.: Angle-of-arrival localization based on antenna arrays for wireless sensor networks. Comput. Electr. Eng. 36(6), 1181–1186 (2010)
3. Feng, C., Au, W.S.A., Valaee, S., Tan, Z.: Received-signal-strength-based indoor positioning using compressive sensing. IEEE Trans. Mob. Comput. 11(12), 1983–1993 (2011)
4. Huang, H., Zhou, J., Li, W., Zhang, J., Zhang, X., Hou, G.: Wearable indoor localisation approach in Internet of Things. IET Networks 5(5), 122–126 (2016)
5. Shi, Y., Shi, W., Liu, X., Xiao, X.: An RSSI classification and tracing algorithm to improve trilateration-based positioning. Sensors 20(15), 4244 (2020)
6. Figuera, C., Rojo-Álvarez, J.L., Wilby, M., Mora-Jiménez, I., Caamaño, A.J.: Advanced support vector machines for 802.11 indoor location. Signal Process. 92(9), 2126–2136 (2012)
7. Vapnik, V.: The nature of statistical learning theory. Springer Science & Business Media, Berlin (2013)
8. Bisio, I., Lavagetto, F., Marchese, M., Sciarrone, A.: Smart probabilistic fingerprinting for WiFi-based indoor positioning with mobile devices. Pervasive Mob. Comput. 31, 107–123 (2016)
9. Kim, K.S., Lee, S., Huang, K.: A scalable deep neural network architecture for multi-building and multi-floor indoor localization based on Wi-Fi fingerprinting. Big Data Anal. 3(1), 4 (2018)
10. Christy Jeba Malar, A., Siddique Ibrahim, S.P., Deva Priya, M.: A novel cluster based scheme for node positioning in indoor environment. Int. J. Eng. Adv. Technol. 8, 6S, 79–83 (2019)
11. Christy Jeba Malar, A., Deva Priya, M., Sengathir, J., Kiruthiga, N., Anitha, R., Sangeetha, T.: An intelligent multi-floor indoor positioning system for cloud-based environment. Int. J. Comput. Appl. 1–8 (2019)
12. Tuia, D., Verrelst, J., Alonso, L., Pérez-Cruz, F., Camps-Valls, G.: Multioutput support vector regression for remote sensing biophysical parameter estimation. IEEE Geosci. Remote Sens. Lett. 8(4), 804–808 (2011)
13. Chen, W., Wang, W., Li, Q., Chang, Q., Hou, H.: A crowd-sourcing indoor localization algorithm via optical camera on a smartphone assisted by wi-fi fingerprint RSSI. Sensors 16(3), 410 (2016)
14. Lee, C.W., Lin, T.N., Fang, S.H., Chou, Y.C.: A novel clustering-based approach of indoor location fingerprinting. In: 24th IEEE Annual International Symposium on Personal, Indoor, and Mobile Radio Communications, pp. 3191–3196 (2013)
15. Brunato, M., Battiti, R.: Statistical learning theory for location fingerprinting in wireless LANs. Comput. Netw. 47(6), 825–845 (2005)
16. Tran, D.A., Nguyen, T.: Localization in wireless sensor networks based on support vector machines. IEEE Trans. Parallel Distrib. Syst. 19(7), 981–994 (2008)
Author Index
A Acharya, Pratik, 95 Ahmad, Munir, 523 Alam, Jawaz, 177 Aleem, Abdul, 497 Anitha Rajakumari, P., 551 Anitha, Shanmugam, 105 Apat, Hemant Kumar, 293 Arun Prakash, A., 169 Ashwini, K., 253 Atta-ur-Rahman, 523
B Behera, Dhiren Kumar, 189 Behera, Gopal, 283 Behera, Ranjan K., 13 Bhaisare, Kunal, 293 Bhoi, Ashok Kumar, 283 Bhoi, Ashutosh, 275, 283 Bhoi, Sourav Kumar, 275, 413 Bhuyan, S. K., 535 Biswal, Aparesh Prasad, 239
C Chakraborty, Partha Sarathi, 383 Chauhan, Arun Avinash, 229 Chettri, Lekhika, 1, 81 Choudhury, Bibhuti Bhusan, 451 Christy Jeba Malar, A., 561
D Das, Bikramaditya, 125, 485 Dash, Jijnasee, 1 Dash, Padmanav, 177 Dash, Ritesh, 535 Dash, Santanu Kumar, 201 Dash, Sujata, 523 Dehury, Divyajyoti, 403 Deva Priya, M., 561 Dhal, Prasant Ranjan, 25, 439
F Femila, F., 561
G Ganthia, Bibhu Prasad, 239 Garg, Sachin, 497 Ghosh, Tapotosh, 465 Gore, Manoj Madhava, 497 Gourisaria, Mahendra Kumar, 509 Goyal, Kapil, 217 Gupta, Atulya, 155
H Harshvardhan, G. M., 509 Hasan Al Banna, Md., 465 Hegde, Chandan, 253 Hete, R. R., 535 Hota, Arunima, 1
I Iqbal, Tahir, 523
J Jaggi, Arshdeep Kaur, 383
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. K. Udgata et al. (eds.), Intelligent Systems, Lecture Notes in Networks and Systems 185, https://doi.org/10.1007/978-981-33-6081-5
Jayasingh, Suvendra Kumar, 35 Jena, Alok Kumar, 317 Jena, Om Prakash, 393
K Kaiser, M. Shamim, 465 Kashyap, Pankaj Kumar, 139 Khan, Md Rizwan, 485 Kumar, Manoj, 139 Kumar, Priyadarshi Biplab, 25 Kumar, Rustom, 13 Kumar, Saroj, 25 Kumar, Sushil, 139
L Lalitha, Krishnasamy, 105
M Mahanta, Umakanta, 361 Mahapatra, Rajendra Prasad, 155 Mahmud, Mufti, 465 Maiti, Prasenjit, 293 Mallik, Puspanjali, 393 Mantri, Jibendu Kumar, 35 Mishra, Sarojananda, 329 Mishra, Shreyas, 265 Mishra, S. K., 535 Mishra, S. N., 477 Mohanta, Bhabesh Chandra, 361 Mohanty, Ayog, 49 Mohanty, Subhadarshini, 1 Mohapatra, Subasish, 1 Mohebbanaaz, 349 Muni, Manoj Kumar, 25 Murali Nath, R. S., 73
N Naga Satish, G., 73 Nath, Tushar Kumar, 95 Nayak, Rajendra Prasad, 275, 413 Nimma, Ram Babu, 95
P Padhi, Ashis Kumar, 49 Padmaja, Vangala, 201 Padma Sai, Y., 349, 373 Pal, Sangita, 403 Panda, Anup Kumar, 361 Panda, Madhusmita, 125 Panda, Saroja Kanta, 115 Panda, Sumanta, 177 Panigrahi, Bibhu Prasad, 361 Parhi, Dayal R., 25 Parwekar, Pritee, 551 Patra, Binaya Kumar, 329 Patra, Krishna Chandra, 189 Patra, Moumita, 217 Patra, Sanjay Kumar, 329 Patra, Sudhansu Shekhar, 509 Patro, Pranaya Pournamashi, 305 Peter, S. Sam, 561 Poongodi, Chinnasamy, 105 Pradhan, Rosalin, 239 Pradhan, Sipali, 35 Pradhan, Soumyaranjan, 239 Prasana, Veeravelli Lakshmi, 59 Preethi, S., 169 Priyadarshini, Srusti, 177 Priyanka, Garg, 201 Prusty, Smruti Rekha, 509 Puhan, Pratap Sekhar, 59
R Raghavendran, Ch. V., 73 Rajani Kumari, L. V., 349, 373 Ratha, Pradyumna Kumar, 425 Ravi, Viraja, 561 Rout, Ashima, 403 Roy, Swarup, 1, 81
S Sahoo, Bibhudatta, 293 Sahoo, Ramesh K., 337, 403 Sahoo, Satyabrata, 59 Sahoo, Sipra, 49 Sahoo, Sushil Kumar, 451 Sahu, Anupama, 477 Sahu, Bandita, 439 Sahu, Chinmaya, 25 Sahu, Rajashree, 239 Sahu, Sushanta Kumar, 115 Satapathy, Nihar Ranjan, 425 Satpathy, Soumya Ranjan, 239 Senapati, Rajiv, 305 Sethi, Rabi Narayan, 189 Sethi, Srinivas, 275, 337, 413 Sharma, Ananya, 383 Sharma, Nikhil, 383 Singh, Ridhiman, 383 Sneha, Swati, 13 Sree Dhruti, P. S. S., 373
T Taher, Kazi Abu, 465 Thangarajan, R., 169 Tripathy, Abinash, 317 Tripathy, L. N., 535
U Udgata, Siba K., 229, 337, 425
V Vijay Anand, Duraisamy, 105