English Pages 1220 [1158] Year 2021
Lecture Notes in Networks and Systems 190
M. Shamim Kaiser · Juanying Xie · Vijay Singh Rathore Editors
Information and Communication Technology for Competitive Strategies (ICTCS 2020) Intelligent Strategies for ICT
Lecture Notes in Networks and Systems Volume 190
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/15179
Editors M. Shamim Kaiser Jahangirnagar University Dhaka, Bangladesh
Juanying Xie Shaanxi Normal University Xi’an, China
Vijay Singh Rathore IIS Deemed to be University Jaipur, Rajasthan, India
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-16-0881-0 ISBN 978-981-16-0882-7 (eBook) https://doi.org/10.1007/978-981-16-0882-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
The Fifth International Conference on Information and Communication Technology for Competitive Strategies (ICTCS 2020) targets the state of the art as well as emerging topics pertaining to information and communication technologies (ICTs) and effective strategies for their implementation in engineering and intelligent applications. The conference is anticipated to attract a large number of high-quality submissions; stimulate cutting-edge research discussions among academic pioneering researchers, scientists, industrial engineers and students from all around the world; provide a forum for researchers to propose new technologies, share their experiences and discuss future solutions for the design of infrastructure for ICT; provide a common platform for academic pioneering researchers, scientists, engineers and students to share their views and achievements; enrich technocrats and academicians by presenting their innovative and constructive ideas; and focus on innovative issues at the international level by bringing together experts from different countries. The conference was held during December 11–12, 2020, digitally on Zoom and was organized by Global Knowledge Research Foundation. Research submissions in various advanced technology areas were received, and after a rigorous peer review process with the help of the program committee members and external reviewers, 211 papers were accepted with an acceptance rate of 21%. All 211 papers of the conference are accommodated in 2 volumes, and the papers in the book comprise authors from 15 countries. This event’s success was possible only with the help and support of our team and organizations. With immense pleasure and honor, we would like to express our sincere thanks to the authors for their remarkable contributions, all the technical program committee members for their time and expertise in reviewing the papers within a very tight schedule and the publisher Springer for their professional help.
We are overwhelmed by our distinguished scholars and appreciate them for accepting our invitation to join us through the virtual platform, deliver keynote speeches and chair the technical sessions that analyzed the research work presented by the researchers. Most importantly, we are also grateful to our local support team for their hard work for the conference. This series has already been made a continuous series which will be hosted at different locations every year.
M. Shamim Kaiser, Dhaka, Bangladesh
Juanying Xie, Xi’an, China
Vijay Singh Rathore, Rajasthan, India
Contents
An Assessment of Internet Services Usage Among Postgraduate Students in the Nigerian Defence Academy . . . 1
Angela Adanna Amaefule and Francisca Nonyelum Ogwueleka
Smart Attendance Monitoring System Using Local Binary Pattern Histogram . . . 35
C. Meghana, M. Himaja, and M. Rajesh
An Energy-Efficient Wireless Sensor Deployment for Lifetime Maximization by Optimizing Through Improved Particle Swarm Optimization . . . 49
T. Venkateswarao and B. Sreevidya
Improved Nutrition Management in Maize by Analyzing Leaf Images . . . 65
Prashant Narayankar and Priyadarshini Patil
Modified LEACH-B Protocol for Energy-Aware Wireless Sensor Network for Efficient Network Lifetime . . . 75
Fahrin Rahman, Maruf Hossain, Md. Sabbir Hasan Sohag, Sejuti Zaman, and Mohammad Rakibul Islam
Web Content Authentication: A Machine Learning Approach to Identify Fake and Authentic Web Pages on Internet . . . 85
Jayakrishnan Ashok and Pankaj Badoni
Classification of Brain Tumor by Convolution Neural Networks . . . 105
Madhuri Pallod and M. V. Vaidya
Automated Multiple Face Recognition Using Deep Learning for Security and Surveillance Applications . . . 113
Nidhi Chand, Nagaratna, Prema Balagannavar, B. J. Darshini, and H. T. Madan
An App-Based IoT-NFC Controlled Remote Access Security Through Cryptographic Algorithm . . . 125
Md. Abbas Ali Khan, Mohammad Hanif Ali, A. K. M. Fazlul Haque, Chandan Debnath, Md. Ismail Jabiullah, and Md. Riazur Rahman
Pneumonia Detection Using X-ray Images and Deep Learning . . . 141
Chinmay Khamkar, Manav Shah, Samip Kalyani, and Kiran Bhowmick
Autonomous Sailing Boat . . . 153
R. Divya, N. Inchara, Zainab A. Muskaan, Prasad B. Honnavalli, and B. R. Charanraj
A Proposed Method to Improve Efficiency in IPv6 Network Using Machine Learning Algorithms: An Overview . . . 165
Reema Roychaudhary and Rekha Shahapurkar
Byte Shrinking Approach for Lossy Image Compression . . . 175
Tanuja R. Patil and Vishwanath P. Baligar
Image Encryption Using Matrix Attribute Value Techniques . . . 185
D. Saravanan and S. Vaithyasubramanain
Weather Prediction Based on Seasonal Parameters Using Machine Learning . . . 195
Manish Mandal, Abdul Qadir Zakir, and Suresh Sankaranarayanan
Analysis of XML Data Integrity Using Multiple Digest Schemes . . . 203
Jatin Arora and K. R. Ramkumar
Enhanced Safety and Security Features for CAN Communication in Automotives . . . 215
Venkatesh Mane and Nalini C. Iyer
Fashion Classification and Object Detection Using CNN . . . 227
Shradha Itkare and Arati Manjaramkar
An Efficient Data Mining Algorithm for Crop Yield Prediction . . . 237
H. V. Chaitra, Ramachandra, Chandani Sah, Saahithi Pradhan, Soundarya Kuralla, and Vanitha Sree
Cognitive Study of Data Mining Techniques in Educational Data Mining for Higher Education . . . 247
Pratiksha Kanwar and Monika Rathore
Python GUI for Language Identification in Real-Time Using FFNN and MFCC Features . . . 259
Manjunath Sajjan, Mallamma V. Reddy, and M. Hanumanthappa
Traffic Management System Based on Density Prediction Using Maching Learning . . . 269
Suresh Sankaranarayanan, Sumeet Omalur, Sarthak Gupta, Tanya Mishra, and Swasti Sumedha Tiwari
Implementation of Optimized VLSI Architecture for Montgomery Multiplication Algorithm . . . 277
Arun Thomas, Anu Chalil, and K. N. Sreehari
Artificial Intelligence in Healthcare . . . 287
Saumya Roy, Archana Singh, and Chetna Choudhary
Performance Improvement of Lossy Image Compression Based on Polynomial Curve Fitting and Vector Quantization . . . 297
Shaimaa Othman, Amr Mohamed, Abdelatief Abouali, and Zaki Nossair
Adversarial Deep Learning Attacks—A Review . . . 311
Ganesh B. Ingle and Milind V. Kulkarni
Enhancing Security of Cloud Platform with Cloud Access Security Broker . . . 325
Shahnawaz Ahmad, Shabana Mehfuz, and Javed Beg
Machine Learning Based Quality Prediction of Greywater: A Review . . . 337
Samir Sadik Shaikh and Rekha Shahapurkar
Blueprint of Blockchain for Land Registry Management in India . . . 349
Ganesh Khadanaga and Kamal Jain
Dynamic Time Slice Task Scheduling in Cloud Computing . . . 359
Linz Tom and V. R. Bindu
A Comprehensive Survey of Existing Researches on NOMA-Based Integrated Satellite-Terrestrial Networks for 5G . . . 369
Joel S. Biyoghe and Vipin Balyan
Global Random Walk for the Prediction of MiRNA Disease Association Using Heterogeneous Networks . . . 379
J. R. Rashmi and Lalitha Rangarajan
IoT Past, Present, and Future a Literary Survey . . . 393
Md. Faridul Islam Suny, Md. Monjourur Roshed Fahim, Mushfiqur Rahman, Nishat Tasnim Newaz, and Tajim Md. Niamat Ullah Akhund
Use of Classification Approaches for Takri Text Challenges . . . 403
Shikha Magotra, Baijnath Kaushik, and Ajay Kaul
Dynamically Adaptive Cell Clustering in 5G Networks . . . 411
K. Sai Tejaswini, Jayashree Balaji, and B. Prem Kumar
Recommendation System Based on Machine Learning and Deep Learning in Varied Perspectives: A Systematic Review . . . 419
T. B. Lalitha and P. S. Sreeja
SON Coordination with Priorities for Separation of Parameter Regulation . . . 433
Pigilam Swetha, Jayashree Balaji, and B. Prem Kumar
Digit Image Recognition Using an Ensemble of One-Versus-All Deep Network Classifiers . . . 445
Abdul Mueed Hafiz and Mahmoud Hassaballah
Recent Developments, Challenges, and Future Scope of Voice Activity Detection Schemes—A Review . . . 457
Shilpa Sharma, Punam Rattan, and Anurag Sharma
A Deep Learning Approach Toward Determining the Effects of News Trust Factor Based on Source Polarity . . . 465
Ayan Mukherjee and Ratnakirti Roy
A Pilot Study in Software-Defined Networking Using Wireshark for Analyzing Network Parameters to Detect DDoS Attacks . . . 475
Josy Elsa Varghese and Balachandra Muniyal
An Interactive Tool for Designing End-To-End Secure Workflows . . . 489
Ravi Kanth Kotha, N. V. Narendra Kumar, T. Ramakrishnudu, Shruti Purohit, and Harika Nalam
Investigation of Methodologies of Food Volume Estimation and Dataset for Image-Based Dietary Assessment . . . 499
Prachi Kadam, Nayana Petkar, and Shraddha Phansalkar
A Compact Wideband Patch Antenna with Defected Ground for Satellite Communication . . . 513
Aditya Kumar Singh, Sweta Singh, Amrees Pandey, Shweta Singh, and Rajeev Singh
Deep Learning for Natural Language Processing . . . 523
Rachana Patel and Sanskruti Patel
College Project Preservation and Emulation Using Containerization Over Private Cloud . . . 535
Yameen Ajani, Krish Mangalorkar, Yohann Nadar, Mahendra Mehra, and Dhananjay Kalbande
Intellectual Property Rights Management Using Blockchain . . . 545
Vidhi Rambhia, Vruddhi Mehta, Ruchi Mehta, Riya Shah, and Dhiren Patel
A Salient Binary Coding Scheme for Face and Expression Recognition from Facial Images . . . 553
Adiraju Prasanth Rao, Bollipelly PruthviRaj Goud, and D. Lakshmi Padmaja
Low-Cost Smartphone-Controlled Remote Sensing IoT Robot . . . 569
Tajim Md. Niamat Ullah Akhund, Nishat Tasnim Newaz, Md. Rakib Hossain, and M. Shamim Kaiser
Thermal Image Processing and Analysis for Surveillance UAVs . . . 577
Aasish Tammana, M. P. Amogh, B. Gagan, M. Anuradha, and H. R. Vanamala
Ensemble Methods for Scientific Data—A Comparative Study . . . 587
D. Lakshmi Padmaja, G. Surya Deepak, G. K. Sriharsha, and G. N. V. Ramana Rao
An Efficacious E-voting Mechanism using Blockchain to Preserve Data Integrity in Fog Nodes . . . 597
K. Uma Maheswari, S. Mary Saira Bhanu, and S. Nickolas
Learning Analytics: A Literature Review and its Challenges . . . 607
Nisha, Archana Singhal, and Sunil Kumar Muttoo
Usage of ICT in Engineering Applications . . . 619
A. Arun Kumar, M. Shankar Lingam, and Smita Vempati
ICT for Good Governance: Evidence from Development Perspective . . . 627
G. S. Raghavendra, M. Shankar Lingam, and J. Vanishree
Low-Power DSME-Based Communication and On-Board Processing in UAV for Smart Agriculture . . . 639
Nikumani Choudhury and Anakhi Hazarika
Lane Change Assistance Using LiDAR for Autonomous Vehicles . . . 649
H. M. Gireesha, Prabha C. Nissimgoudar, Venkatesh Mane, and Nalini C. Iyer
A Systematic Review of Video Analytics Using Machine Learning and Deep Learning—A Survey . . . 659
Prashant Narayankar and Vishwanath P. Baligar
Spark-Based Sentiment Analysis of Tweets Using Machine Learning Algorithms . . . 667
D. K. Chandrashekar, K. C. Srikantaiah, and K. R. Venugopal
Product Category Recommendation System Using Markov Model . . . 677
Krittaya Sivakriskul and Tanasanee Phienthrakul
Low Code Development Platform for Digital Transformation . . . 689
Vaishali S. Phalake and Shashank D. Joshi
The Analogy of Haar Cascade and HOG Approaches for Facial Emotion Recognition . . . 699
M. Aishwarya and N. Neelima
Testing of FPGA Input/Output Pins Using BIST . . . 709
S. Gurusharan, R. Rahul Adhithya, S. Sri Harish, and J. P. Anita
Image Enhancement Using GAN (A Re-Modeling of SR-GAN for Noise Reduction) . . . 721
P. Vamsi Kiran Reddy and V. V. Sajith Variyar
Load Balancing and Its Challenges in Cloud Computing: A Review . . . 731
Harleen Kaur and Kiranbir Kaur
Predicting the Stock Markets Using Neural Network with Auxiliary Input . . . 743
Sakshi and Sreyan Ghosh
Data Accountability and Security Enhancement in Remote Healthcare System Using BaaS . . . 753
Apoorva Kulkarni, Sonali Patil, and Rohini Pise
Cloud Computing: An Analysis of Authentication Methods Over Internet . . . 765
Ankush Kudale and S. Hemalatha
Transforming India’s Social Sector Using Ontology-Based Tantra Framework . . . 775
Shreekanth M. Prabhu and Natarajan Subramaniam
Design of Low Power and High-Speed 6-Transistors Adiabatic Full Adder . . . 795
Minakshi Sanadhya and Devendra Kumar Sharma
Snappy Wheelchair: An IoT-Based Flex Controlled Robotic Wheel Chair for Disabled People . . . 803
Tajim Md. Niamat Ullah Akhund, Gunjon Roy, Animesh Adhikary, Md. Ashraful Alam, Nishat Tasnim Newaz, Masud Rana Rashel, and Mohammad Abu Yousuf
Speed Limit Control and Wildlife Protection . . . 813
Venkatesh Mane, Ashwini Kamate, and Nalini C. Iyer
Digital Payments: An Indian Perspective . . . 821
Gayatri Doctor and Shravan Engineer
Race Recognition Using CNN Architecture . . . 829
R. Rushali and Abdul Jhummarwala
Private Cloud Configuration Using Amazon Web Services . . . 839
Ashwini S. Mane and Bharati S. Ainapure
Application of Python Programming and Its Future . . . 849
Yash Bhatt and Prajakta Pahade
A Survey on Metaheuristics-Based Task Scheduling . . . 859
Arzoo and Anil Kumar
Smart-Bin: IoT-Based Real-Time Garbage Monitoring System for Smart Cities . . . 871
Raghavendra S. Shekhawat and Deepak Uniyal
Optimization in Artificial Intelligence-Based Devices and Algorithms for Efficient Training: A Survey . . . 881
Priyadarshini Patil and S. M. Meena
Improving the Cyber Security by Applying the Maturity Model Case: Electricity Industry of Iran . . . 891
Mohammad Ebrahimnejad Shalmani and Mehran Sepehri
Encryption and Decryption Scheme for IoT Communications Using Multilevel Encryption . . . 899
Sanchit Sindhwani, Sabha Sheikh, Saiyed Umer, and Ranjeet Kumar Rout
Susceptibility of Changes in Public Sector Banks with Special Emphasis on Some Newly Merged Banks: Focus on Cash Flow Risk . . . 907
Anita Nandi and Abhijit Dutta
Fusion of Face Recognition and Number Plate Detection for Automatic Gate Opening System . . . 919
H. M. Gireesha, Prabha C. Nissimgoudar, and Nalini C. Iyer
Study of Text-to-Speech (TTS) Conversion for Indic Languages . . . 929
Vishal Narvani and Harshal Arolkar
Wireless Sensor Network Based Architecture for Agriculture Farm Protection in India . . . 937
Dinesh Kumar Kalal and Ankit Bhavsar
Comparative Analysis of Volume Rendering Techniques on Craniofacial CT Images . . . 945
Jithy Lijo, Tripty Singh, Venkatraman Bhat, and Moni Abraham Kuriakose
Implementation of Different MAC Protocols for IoT . . . 959
S. Santhameena, J. Manikandan, and P. Priyanka
Analysis of Diseases in Plant’s Leaves Using Deep Learning Techniques . . . 973
Abhilasha, Vaibhav Vyas, Vijay Singh Rathore, and Neelam Chaplot
On Statistical Tools in Finding Equitable Antimagic Labeling of Complete Graphs . . . 985
Antony Puthussery, I. Sahul Hamid, and Xavier Chelladurai
Identifying the Alterations in the Microbiome Using Classification and Clustering Analysis: A Path Towards Microbiome Bio-Tech Innovations . . . 997
Hitesh Vijan and Prashant R. Kharote
Segmentation of Brain Tumor from MR Images Using SegX-Net an Hybrid Approach . . . 1007
M. Ravikumar and B. J. Shivaprasad
Digital India eGovernance Initiative for Tribal Empowerment: Performance Dashboard of the Ministry of Tribal Affairs . . . 1017
Naval Jit Kapoor, Ashutosh Prasad Maurya, Raghu Raman, Kumar Govind, and Prema Nedungadi
A Review on the Contribution of IoT in Various Domains of Supply Chain Industry . . . 1029
Ramesh Shahabade
Optimization-Based Boosting Feature Selection Method for Water Quality Classification . . . 1041
M. Durairaj and T. Suresh
Data Analytics Implementations in Indian e-Governance . . . 1051
Prashant Kumar Mittal, Anjali Dhingra, and Ashutosh Prasad Maurya
Image Fusion in Multi-view Videos Using SURF Algorithm . . . 1061
B. L. Yashika and Vinod B. Durdi
Multi-device Login Monitoring for Google Meet Using Path Compressed Double-Trie and User Location . . . 1073
Anish Patil, Ankur Singh, and Neha Chauhan
Analysis of Various Boosting Algorithms Used for Detection of Fraudulent Credit Card Transactions . . . 1083
Kaneez Zainab, Namrata Dhanda, and Qamar Abbas
ICT Tools for Fishermen Assistance in India . . . 1093
Sandhya Kiran and Anusha D. Shetti
Mood Analysis of Bengali Songs Using Deep Neural Networks . . . 1103
Devjyoti Nath and Shanta Phani
Human Voice Sentiment Identification Using Support Vector Machine . . . 1115
Preeti Chawaj and S. R. Khot
Statistical Analysis of Thermal Image Features to Discriminate Breast Abnormalities . . . 1129
Aayesha Hakim and R. N. Awale
MIMO Systems in Wireless Communications: State of the Art . . . 1141
Mehak Saini and Surender K. Grewal
Hydrogen by Process of Water Electrolysis for Power Generation and a Review of Fuel Cell Technologies . . . 1155
Dipen S. Patel and Rajab Challoo
Iris Presentation Attack Detection for Mobile Devices . . . 1165
Meenakshi Choudhary, Vivek Tiwari, and U. Venkanna
Enhanced Performance of Novel Patch Antenna Sub-array Design for Use in L-Band Ground Station Receiver Terminals Linked to Aerospace Platforms . . . 1175
T. A. Ajithlal, R. Gandhiraj, G. A. Shanmugha Sundaram, and K. A. Pradeep Kumar
Author Index . . . 1185
Editors and Contributors
About the Editors
Dr. M. Shamim Kaiser is currently working as Professor at the Institute of Information Technology of Jahangirnagar University, Savar, Dhaka-1342, Bangladesh. He received his bachelor’s and master’s degrees in Applied Physics Electronics and Communication Engineering from the University of Dhaka, Bangladesh, in 2002 and 2004, respectively, and the Ph.D. degree in Telecommunication Engineering from the Asian Institute of Technology (AIT) Pathumthani, Thailand, in 2010. His current research interests include data analytics, machine learning, wireless network and signal processing, cognitive radio network, big data and cyber security, and renewable energy. He has authored more than 100 papers in different peer-reviewed journals and conferences. He is Associate Editor of the IEEE Access Journal, and Guest Editor of Brain Informatics Journal and Cognitive Computation Journal. Dr. Kaiser is a life member of the Bangladesh Electronic Society and the Bangladesh Physical Society. He is also a senior member of IEEE, USA, and IEICE, Japan, and an active volunteer of the IEEE Bangladesh Section. He is Founding Chapter Chair of the IEEE Bangladesh Section Computer Society Chapter.
Dr. Juanying Xie is currently Professor with Shaanxi Normal University, Xi’an, China. He has been Full Professor at the School of Computer Science of Shaanxi Normal University in PR China. His research interests include machine learning, data mining, and biomedical data analysis. He has published around 50 research papers and two monographs. He has been Associate Editor of Health Information Science and Systems. He has been a program committee member of several international conferences, such as the International Conference on Health Information Science. He has been a senior member of China Computer Federation (CCF), a member of Chinese Association for Artificial Intelligence (CAAI), a member of Artificial Intelligence and Pattern Recognition Committee of CCF, and a member of Machine Learning Committee of CAAI, etc. He has been a peer reviewer for many journals, such as Information Sciences and IEEE Transactions on Cybernetics. He was awarded Ph.D. in signal and information processing from Xidian University. He cooperated with Prof. Xiaohui Liu at Brunel University in UK from 2010 to 2011 in machine learning and gene selection research. He received an engineering master’s degree in the application technology of computers at Xidian University and a bachelor’s degree of science in computer science at Shanxi Normal University.
Dr. Vijay Singh Rathore is presently working as Professor in the Department of CS and IT, IIS (Deemed to be) University, Jaipur (India). He received his Ph.D. from the University of Rajasthan and has 20 years of teaching experience. He is Secretary of the ACM Jaipur Chapter and Past Chairman of the CSI Jaipur Chapter. He has two published patents, has supervised Ph.D. scholars (awarded: 16, under supervision: 7), and has published 80+ research papers and 10+ books. He handles international affairs at The IIS University, Jaipur. His research areas are Internet security, cloud computing, big data, and IoT.
Contributors
Qamar Abbas, Ambalika Institute of Technology and Management, Lucknow, India
Abhilasha, Banasthali Vidhyapeeth, Vanasthali, Rajastan, India
Abdelatief Abouali, Faculty of Computer Science, El-Shorouk Academy, Cairo, Egypt
Mohammad Abu Yousuf, Institute of Information Technology, Jahangirnagar University, Savar, Dhaka, Bangladesh
Animesh Adhikary, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Shahnawaz Ahmad, Department of Electrical Engineering, Jamia Millia Islamia, New Delhi, India
Bharati S. Ainapure, Vishwakarma University, Pune, India
M. Aishwarya, Department of Electronics and Communication Engineering, Amrita School of Engineering Bengaluru, Amrita Vishwa Vidyapeetham, Bengaluru, India
Yameen Ajani, Fr. Conceicao Rodrigues College of Engineering, University of Mumbai, Mumbai, Maharashtra, India
T. A. Ajithlal, SIERS Research Laboratory, Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
Editors and Contributors
Tajim Md. Niamat Ullah Akhund Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh; Institute of Information Technology, Jahangirnagar University, Dhaka, Bangladesh Mohammad Hanif Ali Jahangirnagar University, Dhaka, Bangladesh Angela Adanna Amaefule Department of Cyber Security, Nigerian Defence Academy, Kaduna, Nigeria M. P. Amogh PES University, Bengaluru, Karnataka, India J. P. Anita Department of Electronics and Communication Engineering, Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, Coimbatore, India M. Anuradha PES University, Bengaluru, Karnataka, India Harshal Arolkar GLS University, Ahmedabad, India Jatin Arora Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India Arzoo Guru Nanak Dev University, Amritsar, India Jayakrishnan Ashok University of Petroleum and Energy Studies, Dehradun, India Md. Ashraful Alam Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh R. N. Awale Veermata Jijabai Technological Institute, Mumbai, India Pankaj Badoni University of Petroleum and Energy Studies, Dehradun, India Prema Balagannavar REVA University, Kattigenahalli, Yelahanka, Bengaluru, India Jayashree Balaji Smart World and Communications, Larsen & Toubro Constructions, Chennai, India Vishwanath P. Baligar School of Computer Science and Engineering, KLE Technological University, Hubballi, India Vipin Balyan Cape Peninsula University of Technology, Cape Town, South Africa Javed Beg Oracle, Bengaluru, India Venkatraman Bhat Narayana Health, Bengaluru, India Yash Bhatt Computer Science and Engineering, Prof Ram Meghe College of Engineering & Management, Amravati, Maharashtra, India Ankit Bhavsar GLS University, Ahmedabad, India Kiran Bhowmick Dwarkadas J, Sanghvi College of Engineering, Mumbai, India
V. R. Bindu School of Computer Sciences, Mahatma Gandhi University, Kottayam, Kerala, India Joel S. Biyoghe Cape Peninsula University of Technology, Cape Town, South Africa H. V. Chaitra Nitte Meenakshi Institute of Technology, Bangalore, Karnataka, India Anu Chalil Department of Electronics and Communication Engineering, Amrita Viswa Vidhyapeetham, Vallikavu, India Rajab Challoo EECS Department, Texas A&M University-Kingsville, Kingsville, TX, USA Nidhi Chand REVA University, Kattigenahalli, Yelahanka, Bengaluru, India D. K. Chandrashekar Department of CSE, SJB Institute of Technology, Bangalore, Karnataka, India Neelam Chaplot Poornima College of Engineering, Jaipur, Rajastan, India B. R. Charanraj PES University, Bangalore, India Neha Chauhan Department of Information Technology, National Institute of Technology Karnataka, Surathkal, India Preeti Chawaj Department of Electronics and Telecommunication Engineering, D.Y. Patil College of Engineering and Technology, Kolhapur, India Xavier Chelladurai CHRIST (Deemed to be University), Bengaluru, India Chetna Choudhary ASET, Amity University, Noida, Uttar Pradesh, India Meenakshi Choudhary IIIT Naya Raipur, Naya Raipur, Chhattisgarh, India Nikumani Choudhury BITS Pilani, Hyderabad, India B. J. Darshini REVA University, Kattigenahalli, Yelahanka, Bengaluru, India Chandan Debnath National University, Gagipur, Bangladesh Namrata Dhanda Amity University, Lucknow Campus, Lucknow, India Anjali Dhingra National Informatics Centre Services Incorporated, New Delhi, India R. Divya PES University, Bangalore, India Gayatri Doctor Faculty of Management, CEPT University, Ahmedabad, India M. Durairaj School of Computer Science and Engineering, Bharathidasan University, Trichy, Tamil Nadu, India Vinod B. Durdi Dayananda Sagar College of Engineering, Bangalore, India
Abhijit Dutta Department of Commerce, Sikkim University, Gangtok, India Shravan Engineer Faculty of Management, CEPT University, Ahmedabad, India Md. Faridul Islam Suny Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh B. Gagan PES University, Bengaluru, Karnataka, India R. Gandhiraj SIERS Research Laboratory, Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Sreyan Ghosh Computer Science and Engineering, Christ University, Bangalore, India H. M. Gireesha KLE Tchnological University, Vidyanagar, Hubballi, Karnataka, India Kumar Govind United Nations Development Programme, New Delhi, India Surender K. Grewal D.C.R.U.S.T, Murthal, Haryana, India Sarthak Gupta Department of Information Technology, SRM Institute of Science and Technology, Chennai, India S. Gurusharan Department of Electronics and Communication Engineering, Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, Coimbatore, India Abdul Mueed Hafiz Department of ECE, Institute of Technology, University of Kashmir, Srinagar, J&K, India Aayesha Hakim Veermata Jijabai Technological Institute, Mumbai, India M. Hanumanthappa Department of Computer Science and Applications, Bangalore University, Bangalore, India A. K. M. Fazlul Haque Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh Mahmoud Hassaballah Department of Computer Science, Faculty of Computers and Information, South Valley University, Qena, Egypt Anakhi Hazarika Indian Institute of Information Technology Guwahati, Guwahati, India S. Hemalatha Karpagam Academy of Higher Education, Coimbatore, India M. Himaja Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India Prasad B. Honnavalli PES University, Bangalore, India Maruf Hossain University of Dhaka, Dhaka, Bangladesh
N. Inchara PES University, Bangalore, India Ganesh B. Ingle Vishwakarma University, Pune, India Mohammad Rakibul Islam Islamic University of Technology, Dhaka, Bangladesh Shradha Itkare Department of Information Technology, Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, Maharashtra, India Nalini C. Iyer KLE Tchnological University, Vidyanagar, Hubballi, Karnataka, India Md. Ismail Jabiullah Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh Kamal Jain IIT Roorkee, Roorkee, India Abdul Jhummarwala Bhaskaracharya National Institute for Space Applications and Geo-Informatics, Gandhinagar, Gujarat, India Shashank D. Joshi College of Engineering, Bharati Vidyapeeth (Deemed to be) University, Pune, India Prachi Kadam Symbiosis Institute of Technology (SIT), Symbiosis International Deemed University, Pune, India Dinesh Kumar Kalal GLS University, Ahmedabad, India Dhananjay Kalbande Sardar Patel Institute of Technology, University of Mumbai, Mumbai, Maharashtra, India Samip Kalyani Dwarkadas J, Sanghvi College of Engineering, Mumbai, India Ashwini Kamate KLE Technological UniversityHubballi, Hubballi, India Pratiksha Kanwar Rajasthan Technical University, Kota, India Naval Jit Kapoor Ministry of Tribal Affairs, Government of India, New Delhi, India Ajay Kaul SoCSE, SMVDU, Katra, J&K, India Harleen Kaur Guru Nanak Dev University, Amritsar, India Kiranbir Kaur Guru Nanak Dev University, Amritsar, India Baijnath Kaushik SoCSE, SMVDU, Katra, J&K, India Ganesh Khadanaga NIC, New Delhi, India Chinmay Khamkar Dwarkadas J, Sanghvi College of Engineering, Mumbai, India Md. Abbas Ali Khan Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh; Jahangirnagar University, Dhaka, Bangladesh
Prashant R. Kharote Mukesh Patel School of Technology Management and Engineering, NMIMS University, Mumbai, India S. R. Khot Department of Electronics and Telecommunication Engineering, D.Y. Patil College of Engineering and Technology, Kolhapur, India Sandhya Kiran Karnatak University Post Graduation Center, Karwar, Karnataka, India Ravi Kanth Kotha IDRBT, Hyderabad, India; NIT, Warangal, India Ankush Kudale Karpagam Academy of Higher Education, Coimbatore, India Apoorva Kulkarni Department of Information Technology, Pimpri Chinchwad College of Engineering, Pune, India Milind V. Kulkarni Science & Technology, Vishwakarma University, Pune, India A. Arun Kumar Department of CSE, Balaji Institute of Technology and Science, Warangal, India Anil Kumar Guru Nanak Dev University, Amritsar, India Soundarya Kuralla Nitte Meenakshi Institute of Technology, Bangalore, Karnataka, India Moni Abraham Kuriakose Cochin Cancer Research Institute, Kochi, Kerala, India D. Lakshmi Padmaja Department of Information Technology, Anurag University, Venkatapur, India T. B. Lalitha Department of Computer Application, Hindustan Institute of Technology and Science, Chennai, India Jithy Lijo Christ Academy Institute for Advanced Studies, Bengaluru, India M. Shankar Lingam NIRDPR, Hyderabad and University of Mysore, Mysore, India H. T. Madan REVA University, Kattigenahalli, Yelahanka, Bengaluru, India Shikha Magotra SoCSE, SMVDU, Katra, J&K, India Manish Mandal Department of Information Technology, SRM Institute of Science and Technology, Chennai, India Ashwini S. Mane Vishwakarma University, Pune, India Venkatesh Mane KLE Tchnological University, Vidyanagar, Hubballi, Karnataka, India Krish Mangalorkar Fr. Conceicao Rodrigues College of Engineering, University of Mumbai, Mumbai, Maharashtra, India
J. Manikandan PES University, Bangalore, India Arati Manjaramkar Department of Information Technology, Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, Maharashtra, India S. Mary Saira Bhanu National Institute of Technology, Tiruchirappalli, Tamil Nadu, India Ashutosh Prasad Maurya National Informatics Centre Services Incorporated, New Delhi, India S. M. Meena School of Computer Science, KLE Technological University, Hubbali, Karnataka, India C. Meghana Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India Shabana Mehfuz Oracle, Bengaluru, India Mahendra Mehra Fr. Conceicao Rodrigues College of Engineering, University of Mumbai, Mumbai, Maharashtra, India Ruchi Mehta VJTI, Mumbai, India Vruddhi Mehta VJTI, Mumbai, India Tanya Mishra Department of Information Technology, SRM Institute of Science and Technology, Chennai, India Prashant Kumar Mittal National Informatics Centre Services Incorporated, New Delhi, India Amr Mohamed Faculty of Engineering, Helwan University, Cairo, Egypt Md. Monjourur Roshed Fahim Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
Ayan Mukherjee Areteans Tech, Kolkata, India Balachandra Muniyal Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal University, Manipal, India Zainab A. Muskaan PES University, Bangalore, India Sunil Kumar Muttoo Department of Computer Science, University of Delhi, Delhi, India Yohann Nadar Fr. Conceicao Rodrigues College of Engineering, University of Mumbai, Mumbai, Maharashtra, India Nagaratna REVA University, Kattigenahalli, Yelahanka, Bengaluru, India Harika Nalam Gavs Technologies, Chennai, India
Anita Nandi Academy of Professional Courses, Dr B C Roy Engineering College, Durgapur, West Bengal, India Prashant Narayankar School of Computer Science and Engineering, KLE Technological University, Hubballi, India N. V. Narendra Kumar IDRBT, Hyderabad, India Vishal Narvani GLS University, Ahmedabad, India Devjyoti Nath Bengal Institute of Technology, Kolkata, West Bengal, India Prema Nedungadi AmritaCREATE, Amrita Vishwa Vidyapeetham, Amritapuri, India; Department of Computer Science, Amrita Vishwa Vidyapeetham, Amritapuri, India N. Neelima Department of Electronics and Communication Engineering, Amrita School of Engineering Bengaluru, Amrita Vishwa Vidyapeetham, Bengaluru, India Nishat Tasnim Newaz Institute of Information Technology, Jahangirnagar University, Savar, Dhaka, Bangladesh
S. Nickolas National Institute of Technology, Tiruchirappalli, Tamil Nadu, India Nisha Department of Computer Science, University of Delhi, Delhi, India Prabha C. Nissimgoudar KLE Tchnological University, Vidyanagar, Hubballi, Karnataka, India Zaki Nossair Faculty of Engineering, Helwan University, Cairo, Egypt Francisca Nonyelum Ogwueleka Department of Cyber Security, Nigerian Defence Academy, Kaduna, Nigeria Sumeet Omalur Department of Information Technology, SRM Institute of Science and Technology, Chennai, India Shaimaa Othman Faculty of Engineering, Helwan University, Cairo, Egypt; Faculty of Computer Science, El-Shorouk Academy, Cairo, Egypt D. Lakshmi Padmaja Anurag Group of Institutions, Ghatkesar, India Prajakta Pahade Computer Science and Engineering, Prof Ram Meghe College of Engineering & Management, Amravati, Maharashtra, India Madhuri Pallod Department of Information Technology, SGGS Institute of Engineering & Technology, Nanded, Maharashtra, India Amrees Pandey Department of Electronics and Communication, University of Allahabad, Prayagraj, India Dhiren Patel VJTI, Mumbai, India Dipen S. Patel EECS Department, Texas A&M University-Kingsville, Kingsville, TX, USA
Rachana Patel Faculty of Computer Science and Applications, Charotar University of Science and Technology, Changa, Gujarat, India Sanskruti Patel Faculty of Computer Science and Applications, Charotar University of Science and Technology, Changa, Gujarat, India Anish Patil Department of Information Technology, National Institute of Technology Karnataka, Surathkal, India Priyadarshini Patil School of Computer Science, KLE Technological University, Hubbali, Karnataka, India Sonali Patil Department of Information Technology, Pimpri Chinchwad College of Engineering, Pune, India Tanuja R. Patil K.L.E. Technological University, Hubballi, India Nayana Petkar Symbiosis Institute of Technology (SIT), Symbiosis International Deemed University, Pune, India Vaishali S. Phalake College of Engineering, Bharati Vidyapeeth (Deemed to be) University, Pune, India Shanta Phani Bengal Institute of Technology, Kolkata, West Bengal, India Shraddha Phansalkar Symbiosis Institute of Technology (SIT), Symbiosis International Deemed University, Pune, India Tanasanee Phienthrakul Department of Computer Engineering, Faculty of Engineering, Mahidol University, NakhonPathom, Thailand Rohini Pise Department of Information Technology, Pimpri Chinchwad College of Engineering, Pune, India Shreekanth M. Prabhu CMR Institute of Technology, Bengaluru, India K. A. Pradeep Kumar SIERS Research Laboratory, Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Saahithi Pradhan Nitte Meenakshi Institute of Technology, Bangalore, Karnataka, India B. Prem Kumar Smart World and Communications, Larsen & Toubro Constructions, Chennai, India
P. Priyanka PES University, Bangalore, India Bollipelly PruthviRaj Goud Department of Information Technology, Anurag University, Venkatapur, India Shruti Purohit Amazon, Bengaluru, India Antony Puthussery CHRIST (Deemed to be University), Bengaluru, India
G. S. Raghavendra CMS, Jain University, Bengaluru, India Fahrin Rahman Islamic University of Technology, Dhaka, Bangladesh Md. Riazur Rahman Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh Mushfiqur Rahman Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh R. Rahul Adhithya Department of Electronics and Communication Engineering, Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, Coimbatore, India M. Rajesh Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India Md. Rakib Hossain Department of CSE, Daffodil International University, Dhaka, Bangladesh Ramachandra Nitte Meenakshi Institute of Technology, Bangalore, Karnataka, India T. Ramakrishnudu NIT, Warangal, India Raghu Raman AmritaCREATE, Amrita Vishwa Vidyapeetham, Amritapuri, India G. N. V. Ramana Rao Wipro Ltd, Hyderabad, India Vidhi Rambhia VJTI, Mumbai, India K. R. Ramkumar Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India Masud Rana Rashel Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh; University of Evora, Evora, Portugal Lalitha Rangarajan Department of Studies in Computer Science, University of Mysore, Mysore, India Adiraju Prasanth Rao Department of Information Technology, Anurag University, Venkatapur, India
J. R. Rashmi Department of Studies in Computer Science, University of Mysore, Mysore, India Monika Rathore Rajasthan Technical University, Kota, India Vijay Singh Rathore IIS University, Jaipur, Rajastan, India Punam Rattan Department of Computer Application, CT University, Ludhiana, India
M. Ravikumar Department of Computer Science, Kuvempu University, Shimoga, Karnataka, India Mallamma V. Reddy Rani Channamma University Belagavi, Godihal, Karnataka, India Ranjeet Kumar Rout Department of Computer Science and Engineering, National Institute of Technology Srinagar, Jammu and Kashmir, India Gunjon Roy Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh Ratnakirti Roy Dr. B. C. Roy Engineering College Academy of Professional Courses, Durgapur, India Saumya Roy ASET, Amity University, Noida, Uttar Pradesh, India Reema Roychaudhary St. Vincent Pallotti College of Engineering and Technology, Nagpur, Maharashtra, India; Oriental University, Indore, Madhya Pradesh, India R. Rushali National Institute of Technology, Aizawl, Mizoram, India Md. Sabbir Hasan Sohag Islamic University of Technology, Dhaka, Bangladesh Chandani Sah Nitte Meenakshi Institute of Technology, Bangalore, Karnataka, India I. Sahul Hamid The Madura College, Madurai, India K. Sai Tejaswini Smart World and Communications, Larsen & Toubro Constructions, Chennai, India
Mehak Saini D.C.R.U.S.T, Murthal, Haryana, India V. V. Sajith Variyar Department of Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore, India Manjunath Sajjan Rani Channamma University Belagavi, Godihal, Karnataka, India Sakshi Computer Science and Engineering, Siddaganga Institute of Technology, Tumkur, India Minakshi Sanadhya Department of Electronics and Communication Engineering, SRM Institute of Science and Technology NCR Campus, Ghaziabad, India Suresh Sankaranarayanan Department of Information Technology, SRM Institute of Science and Technology, Chennai, India S. Santhameena PES University, Bangalore, India
D. Saravanan Faculty of Operations & IT, ICFAI Business School (IBS), The ICFAI Foundation for Higher Education (IFHE), (Deemed to be university u/s 3 of the UGC Act 1956), Hyderabad, India Mehran Sepehri Sharif University of Technology, Tehran, Iran Manav Shah Dwarkadas J, Sanghvi College of Engineering, Mumbai, India Riya Shah VJTI, Mumbai, India Ramesh Shahabade Terna Engineering College, Nerul, Navi Mumbai, Maharashtra, India
Rekha Shahapurkar Oriental University, Indore, Madhya Pradesh, India; Computer Science and Engineering Department, Oriental University, Indore, India Samir Sadik Shaikh Computer Science and Engineering Department, Oriental University, Indore, India Mohammad Ebrahimnejad Shalmani Ministry of Energy, Tehran, Iran M. Shamim Kaiser IIT, Jahangirnagar University, Dhaka, Bangladesh M. Shankar Lingam University of Mysore, Mysore, India; NIRDPR, Hyderabad, India G. A. Shanmugha Sundaram SIERS Research Laboratory, Department of Electronics and Communication Engineering, Center for Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India Anurag Sharma Department of Computer Science and Engineering, GNA University, Phagwara, India Devendra Kumar Sharma Department of Electronics and Communication Engineering, SRM Institute of Science and Technology NCR Campus, Ghaziabad, India Shilpa Sharma Department of CSE, CT University, Ludhiana, India Sabha Sheikh Department of Computer Science and Engineering, National Institute of Technology Srinagar, Jammu and Kashmir, India Raghavendra S. Shekhawat Graphic Era University, Dehradun, India Anusha D. Shetti Government of Arts and Science College, Karwar, Karnataka, India B. J. Shivaprasad Department of Computer Science, Kuvempu University, Shimoga, Karnataka, India Sanchit Sindhwani Department of Computer Science and Engineering, Dr B R Ambedkar National Institute of Technology, Jalandhar, India
Aditya Kumar Singh Department of Electronics and Communication, University of Allahabad, Prayagraj, India Ankur Singh Department of Information Technology, National Institute of Technology Karnataka, Surathkal, India Archana Singh ASET, Amity University, Noida, Uttar Pradesh, India Rajeev Singh Department of Electronics and Communication, University of Allahabad, Prayagraj, India Shweta Singh Department of Electronics and Communication, IIT(ISM) Dhanbad, Dhanbad, India Sweta Singh Department of Electronics and Communication, University of Allahabad, Prayagraj, India Tripty Singh Amrita School of Engineering, Bengaluru, India Archana Singhal IP College, University of Delhi, Delhi, India Krittaya Sivakriskul Department of Computer Engineering, Faculty of Engineering, Mahidol University, NakhonPathom, Thailand
Vanitha Sree Nitte Meenakshi Institute of Technology, Bangalore, Karnataka, India K. N. Sreehari Department of Electronics and Communication Engineering, Amrita Viswa Vidhyapeetham, Vallikavu, India P. S. Sreeja Department of Computer Application, Hindustan Institute of Technology and Science, Chennai, India B. Sreevidya Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India S. Sri Harish Department of Electronics and Communication Engineering, Amrita School of Engineering, Coimbatore, Amrita Vishwa Vidyapeetham, Coimbatore, India G. K. Sriharsha Cognizant Technology Solutions Pvt. Ltd, Bengaluru, India K. C. Srikantaiah Department of CSE, SJB Institute of Technology, Bangalore, Karnataka, India Natarajan Subramaniam PES University, Bangalore, India T. Suresh School of Computer Science and Engineering, Bharathidasan University, Trichy, Tamil Nadu, India G. Surya Deepak Anurag Group of Institutions, Ghatkesar, India Pigilam Swetha Smart World and Communications, Larsen & Toubro Constructions, Chennai, India Aasish Tammana PES University, Bengaluru, Karnataka, India
Arun Thomas Department of Electronics and Communication Engineering, Amrita Viswa Vidhyapeetham, Vallikavu, India Swasti Sumedha Tiwari Department of Information Technology, SRM Institute of Science and Technology, Chennai, India Vivek Tiwari IIIT Naya Raipur, Naya Raipur, Chhattisgarh, India Linz Tom Department of Computer Science, Assumption College, Changanacherry, Kerala, India
K. Uma Maheswari National Institute of Technology, Tiruchirappalli, Tamil Nadu, India Saiyed Umer Department of Computer Science and Engineering, Aliah University, Kolkata, West Bengal, India Deepak Uniyal Graphic Era University, Dehradun, India M. V. Vaidya Department of Information Technology, SGGS Institute of Engineering & Technology, Nanded, Maharashtra, India S. Vaithyasubramanain PG and Research Department of Mathematics, D. G. Vaishnav College, Chennai, India P. Vamsi Kiran Reddy Department of Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore, India
H. R. Vanamala PES University, Bengaluru, Karnataka, India J. Vanishree NIRDPR, Hyderabad, India Josy Elsa Varghese Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal University, Manipal, India Smita Vempati University of Mysore, Mysore, India U. Venkanna IIIT Naya Raipur, Naya Raipur, Chhattisgarh, India T. Venkateswarao Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India K. R. Venugopal Bangalore University, Bangalore, Karnataka, India Hitesh Vijan Mukesh Patel School of Technology Management and Engineering, NMIMS University, Mumbai, India Vaibhav Vyas Banasthali Vidhyapeeth, Vanasthali, Rajastan, India B. L. Yashika Dayananda Sagar College of Engineering, Bangalore, India Kaneez Zainab Amity University, Lucknow Campus, Lucknow, India
Abdul Qadir Zakir Department of Information Technology, SRM Institute of Science and Technology, Chennai, India Sejuti Zaman University of Dhaka, Dhaka, Bangladesh
An Assessment of Internet Services Usage Among Postgraduate Students in the Nigerian Defence Academy
Angela Adanna Amaefule and Francisca Nonyelum Ogwueleka
Abstract This paper presents an assessment of Internet services usage among postgraduate students in the Nigerian Defence Academy (NDA). The study aimed to determine the state of Internet services usage among postgraduate scholars in the institution. The Internet and Internet services are defined, and the Nigerian Defence Academy Postgraduate School and its postgraduate students are discussed. A literature review of related works was carried out, and quantitative research was performed using a questionnaire to investigate Internet services usage by postgraduate students in the academy. Inferences drawn from the analysis showed that the students use Internet services in NDA, are aware of Internet services and have confidence in the accuracy of information on the Internet. The students normally use cybercafés and department lecture rooms for Internet services. However, they are of the view that no lecture notes are available on the academy portal. The challenges encountered are download delays, the high cost of Internet usage, lack of login credentials to the academy network, power outages, inaccessibility of some websites and insufficient bandwidth capacity. Academy management should organize training for postgraduate (PG) students on the use of the Internet. The PG students should be encouraged to access the academy website for information, use their emails for communication and use the academy library databases for research work. Internet bandwidth capacity should be increased, and login credentials should be provided for every PG student to enable them to access the NDA network in order to use the Internet facilities.
Keywords Assessment · Internet services · Internet services usage · Nigerian Defence Academy · Postgraduate students · Internet service problems · Internet devices · Availability
A. A. Amaefule (B) · F. N. Ogwueleka
Department of Cyber Security, Nigerian Defence Academy, Kaduna, Nigeria
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_1
1 Introduction
The Internet is a worldwide system of interconnected computer networks that uses the Transmission Control Protocol/Internet Protocol (TCP/IP) suite to connect millions of devices across the globe. It carries a large volume of data and applications, such as interlinked Web documents and software, a framework to support email, and distributed computers to enhance phone calls and shared documents [1]. The Internet is also described as a global connection of computers of different sizes, capacities and functionalities: an enormous connection of networked computers around the world, resulting in a network of networks [2]. With the constant development of the Internet, the number of users of the technology and the extent of its usage have grown together. Internet use expanded across the world from 1995 to 2000. Recently, about fifty-five (55) percent of North Americans are online. Owing to the large demand for Internet services, Internet access has become part of daily operations [3]. Nowadays, the majority of individuals benefit from the use of the Internet. The technology has helped to boost people’s work in most disciplines. Information technology (IT) has contributed immensely to the development of long-lasting projects with great features and improved connectivity in business, work and education. The Internet has helped to improve sound reasoning with up-to-date information on patients’ health status, and many details on various health issues are available on health-related websites [4]. The origin of the Internet can be traced to the creation of the Advanced Research Projects Agency Network (ARPANET), an interconnection of computers established by the U.S. Department of Defense in 1969. Today, the Internet connects thousands of computer systems all over the world in a non-structured way. It is the result of linking together media, computers and telecommunication devices [5].
It is known as one of the contributors to the advancement and development of the world economy. Formerly, it helped the government to solve vital policy issues. Despite its great importance to the fourth industrial revolution, more than four (4) billion people are not connected to the Web. Nigeria recorded 54% Internet usage in the early months of 2018, with 30% usage less than the broadband penetration stated by the federal government in 2018. Public Internet access providers (PIAPs) provided free Internet access at sensitive locations as a means of assisting the government in encouraging the masses to participate in the worldwide electronic economy [6]. Internet facilities give a user the privilege of obtaining a large volume of records within a short period. There are four categories of Internet services, namely communication, information retrieval, Web and World Wide Web services [7]. Many real-world services are provided on the Internet, such as online banking, job search, airline ticketing and schedules, visa and international passport applications, hotel booking, etc. [8]. Some of the uses and benefits of the Internet are seen in worldwide publication, circulation of information, instantaneous dialogue, propagation channels and streaming multimedia playback such as streaming of videos, graphics, photographs,
images and texts. Academics use it for research, lecturing and teaching, for disseminating information and for collective communication among people irrespective of distance [9]. The Internet helps to improve education through the use of database-stored data that is retrievable through electronic interfaces [1]. Many people use it to work from home, monitor and transact business, send and receive mail electronically, chat with friends and relatives, receive lectures online, download textbooks, get directions to places and view places through webcams. Internet education allows flexible time for learning, so that people with limited means can work and study at the same time. It helps students to feel a sense of group dynamics and motivation in their learning [10].
Postgraduate (PG) students in the Nigerian Defence Academy Postgraduate School (NDAPGS) do not participate in most of the academic events organized by the academy or the PG School, e.g. orientation of newly admitted students, workshops, and departmental and faculty seminars for students. This could be because the students do not access the individual email account created for every admitted PG student, or because they do not regularly access the academy website, the platform where news of upcoming events is posted for the awareness of the academy community. Again, some students prefer the manual method of semester course registration. They fill in and submit the completed manual copies to the department and the NDAPGS rather than using the automated course registration system, the online course registration portal designed for the NDAPGS to enable students to register all the courses offered in the first or second semester.
Also, the academy library databases are located at both the old and the permanent site of the Nigerian Defence Academy (NDA) to give PG students access to research materials from the databases to which the academy management subscribes on a yearly basis. Instead of making use of the academy databases in the library, most PG students prefer to use their smartphones to access free databases. These challenges encountered by the students motivated this study, an assessment of Internet services usage among PG students in NDA. The NDAPGS was established to develop in the graduate student the spirit of inquiry through training in research, in an atmosphere of intellectual independence and individual creativity combined with a strong sense of teamwork. The NDAPGS runs both full-time and part-time programmes. The Internet services in the NDA are provided and supported by the Directorate of Information and Communications Technology (DICT). The directorate manages some applications, software and portals under the academy domain name, all of which are resident on the academy website. The Internet services available are emails for PG students, the NDA website, the online registration portal, the World Wide Web for carrying out research work, assignments, seminar presentations and theses, the online application portal for postgraduate admissions, academy library databases, the E-library, etc. [11]. The PG students have student email addresses with the NDA domain name @nda.edu.ng. The PG School management communicates with new and existing PG students through their individual email. The students use their email for correspondence with the NDAPGS management, the lecturers, fellow students and outside bodies for the acquisition of and access to research materials. The academy website is provided for the PG students
A. A. Amaefule and F. N. Ogwueleka
where they can access the website, read news and upcoming events, watch videos of recent or past academy events and obtain adequate information about the NDAPGS and the academy at large. Every PG student has a matriculation number. Students can access the online portal through the academy website, using the email addresses created for them under the academy domain name to log in. The online registration portal allows every student to register their courses for both the first and second semesters irrespective of their location. The academy library subscribes to some online databases from which each PG student can access books, conference papers, articles, journals, magazines and other relevant materials using the login credentials provided by the library information technology (IT) staff. Other uses of the Internet are research work, online learning, tutorials, online purchase of application forms, publication of articles and advertisements. This study is to assess the level of usage of Internet services among PG students in NDA. The information from the research will help the management of NDA and the NDAPGS to improve the existing Internet services by increasing the bandwidth capacity, providing login credentials for easy Internet access and organizing an orientation programme on Internet services and usage for newly admitted PG students. The subsequent sections are the literature review, research methodology, findings and analysis, summary, conclusions and recommendations.
1.1 Aim
The aim is to assess Internet services usage among postgraduate students in the Nigerian Defence Academy.
1.2 Objectives
The following are the objectives of the study:
• To investigate the strength of Internet service usage by PG students.
• To investigate the rate of awareness of Internet service usage by PG students.
• To investigate where PG students access Internet services in NDA.
• To investigate the Internet services available in NDA.
• To investigate the Internet services used by PG students in NDA.
• To investigate the locations used to access Internet services by PG students in NDA.
• To investigate the students’ experience in the use of Internet services.
• To investigate devices used to access Internet services in NDA.
• To investigate the problems of Internet service usage by PG students.
2 Literature Review
A great deal of information is available on the Internet for any subject area in which one specializes [10]. In many universities and institutions of higher learning, postgraduate (PG) students can obtain research publications for their research work through the institutional website. The Internet plays a vital role in higher institutions for teaching, learning and carrying out research studies [8, 10]. Almost all students’ education requires the Internet. The Internet has made it possible to study online. Today, there are many virtual universities, where students attend classes by sitting at a computer and accessing a particular university’s website and the video segments for each topic, so that they can study at home [10]. Chat programmes such as Yahoo Messenger or Microsoft Messenger enable students to hold a conversation with research colleagues or collaborators or to write messages to them. Some websites like Gmail and Facebook allow users to chat within the browser as well [8, 10, 12]. Voice over Internet protocol provides telephone service through an Internet connection and can help researchers to perform video conferencing, e.g. Skype and Facebook video calling. These services are free, can replace landlines and save time. The survey [9] aimed to evaluate the provision and usage of Internet facilities for PG students in federal universities of Southwest Nigeria. To attain this objective, the researcher used questionnaires, which were distributed to two hundred and seventy-one (271) respondents: ninety-one from the University of Lagos, ninety from the University of Ibadan and ninety from Obafemi Awolowo University, Ile-Ife. These three were the only universities in Southwest Nigeria offering education courses at the PG level, and each of the schools had a network and an Internet connection.
Results were presented in frequency and percentage tables, which showed that Internet facilities were provided more at the laboratories than at the hostels. In terms of accessibility of Internet facilities, the highest was the university Internet library, followed by the faculty Internet library, the library Internet library, the PG office and, least of all, the hostel Internet library. In terms of usability of Internet services, the highest was search engines, followed by email, the World Wide Web, instant messaging, file transfer protocol, Internet video, Skype, Internet audio, telnet, gopher, scientific and satellite imaging and, least of all, newsgroups. However, lack of search skills, difficulties in navigation and the need to filter search results were problems faced by the PG students [9]. In the study [1], the researcher stated that the Internet carries a wide range of data resources and management. The study aimed to determine the use of the Internet among PG students of K. M. Shah Dental College and Hospital. A cross-sectional questionnaire was used to investigate awareness and use of the Internet by the PG students. Of the respondents, 44.35% were female and 55.65% were male. Also, it was discovered that 99.1% utilized the Internet, 84.3% made use of the university Wi-Fi, 80.9% made use of bookmark, 81.7% used Google
drive, while 27% of them used the Web bloggers. Of all the participants, 40% had a challenge using the Internet. Students used the World Wide Web, email, WWW-based medical databases and databases of dental publications. Also, it was discovered that PG students did not have enough knowledge of the use of the Internet. Besides, 6.1% of the students did not rely on the accuracy of the information, and 0.9% did not rely on the accuracy of dentistry information on the Internet. Although most of the students used the Internet, there were many barriers during literature searches, such as availability of the Internet, cost of utilization, time, low speed and virus infections. Lastly, the researcher reported that 98.2% of students were of the view that lecture notes should be placed on the university website [1]. The survey [8] investigated the prospects and challenges of Internet usage by PG students in the Faculty of Social and Management Sciences, Olabisi Onabanjo University, Ago-Iwoye. About one thousand two hundred (1200) copies of the questionnaire were distributed, but only one thousand (1000) were collected and analysed. It was observed that the majority of the students were between 15 and 25 years of age, and that female respondents outnumbered male respondents. In terms of Internet awareness in the school, about seventy-four percent (74%) of the students were aware of the facility. In terms of Internet access, about thirty-seven percent (37%) of the respondents accessed the Internet for entertainment, twenty-nine percent (29%) for their research work and nineteen percent (19%) for chatting and sending mail messages. About thirty-seven percent (37%) of the students had five or more years of experience in the use of the Internet. However, it was observed that some students did not use the Internet despite being aware of it.
Also, it was discovered that some of them encountered issues such as slow Internet server speed and Web pages that were too long; thirty percent (30%) of the respondents had issues getting the appropriate information at the right time, and security and privacy issues contributed negatively to the use of Internet resources [8]. The study [10] centred on Internet usage by PG students of Gulbarga University. The work aimed to consider the rate and pattern of the students’ use of the Internet in the university. A sample size of 100 was adopted. The study sought to discover the state of mind of the students and the level they attain in the use of the Internet in information technology (IT). It also sought to establish a relationship between students’ academic performance and the usage of the Internet [10]. In terms of access to the Internet, out of one hundred respondents, forty-eight (48) percent accessed the Internet using university computer laboratories, twenty-eight (28) percent using smartphones, fifteen (15) percent using a cybercafe and, lastly, nine (9) percent using their own computer or laptop [10]. Regarding the time spent browsing the Internet, out of one hundred (100) respondents, forty-eight (48) spent 2–3 h, nineteen (19) spent more than 5 h, eighteen (18) spent less than 2 h and, lastly, fifteen (15) spent 4–5 h. For the content viewed on the Internet, the highest number recorded was sixty-two (62) who viewed social networking, followed by fifty-six (56) who viewed academic notes, forty-nine (49) who used chat rooms and made video calls,
followed by forty-six (46) who viewed online journals, forty-three (43) who viewed email, forty-one (41) who viewed online forums and, least of all, twenty-two (22) who viewed online games [10]. In terms of reasons for using the Internet, out of one hundred respondents, the highest number recorded was eighty-eight (88) respondents who used it to prepare class assignments, followed by seventy-two (72) for entertainment, fifty-three (53) to update knowledge, fifty-one (51) to prepare for examinations, thirty-seven (37) to read news, thirty-one (31) for other reasons, twenty-seven (27) to download software, twenty-three (23) for communication and, least of all, eight (8) to purchase items. Regarding the effect of Internet use on academic performance, out of one hundred respondents, sixty-eight (68) believed that their academic performance had improved, twenty-three (23) were neutral, that is, in their opinion academic performance had neither improved nor failed to improve, and, lastly, nine (9) were of the opinion that it had not improved [10]. In the survey [13], the researcher opined that the use of IT had helped to improve the research work carried out by PG students in the library system services of the Francis Sulemanu Idachaba Library, University of Agriculture, Makurdi; a descriptive survey method was used for the study. A questionnaire was administered to a sample of 270 PG students from a general school population of 2000 PG students. The results showed that 37.78% had been trained internally by an external resource person and 44.44% had been trained on Internet skills. The students who used the Internet daily were 22%.
Students whose digital library skills had improved their studies made up 87.41%. Again, 51.11% used Internet facilities outside the university for research. Despite the advantages experienced, there were some challenges, such as delays in downloading and a lack of enough computers. In conclusion, it was discovered that the Internet helped to do away with manual library services, and 94% of the students were pleased with the Internet facilities. As a result of the problems arising from the Internet facilities, 92.96% were of the opinion that PG students should be given adequate and proper instruction in the use of the library. Again, it was recommended that librarians should endeavour to make students patronize the e-resources through the use of IT. There should be an uninterrupted power supply. Furthermore, the speed of access to e-resources for the PG students should be improved. Working computers should be made available, and a sensitization programme on the availability of resources should be created. It was stated that easy Internet access is directly proportional to students’ skills [13]. The study [14] stated that the Internet is a technology that has helped both staff and PG students to carry out research work effectively. The researcher aimed to discover the purpose and types of Internet resources used by the academic staff and PG students of the Ahmadu Bello University, Zaria. The concept and reason for the use of the Internet and the type of Internet resources used by the academic staff and the PG students were
stated. A survey method was used to carry out the study on both academic staff and PG students in the twelve faculties. Out of 6155 PG students and 1577 academic staff, a sample of 1232 (20%) PG students and 316 (20%) academic staff was selected. In the paper, e-journals and email were the resources most employed by both academic staff and PG students. The respondents stated that learning, research, communication and social networking were the main reasons for using the Internet. The study revealed differences in the type of Internet resource used and the reason for using the Internet in the university. It was deduced that the Internet was relevant for research and learning. Despite the benefits derived from the Internet, there were problems with inadequate and low connectivity, no teleconferencing equipment and small bandwidth. Owing to these challenges, it was recommended that the Internet should be provided at the departments, faculties and common rooms to give staff and students easy access. The bandwidth capacity should be increased, with a policy to protect the technology from being mishandled by users. Also, the university should ensure that the students make effective use of the Internet by providing easy access to the facilities and ensuring that adequate teleconferencing equipment is available, so that students can easily collaborate, communicate and disseminate information among themselves and with other professionals through the network system. The Internet resources available were newsgroups, emails, directories, e-journals, databanks, databases, e-books, etc. It was discovered from the analysis that emails, e-journals and databases were the most used resources, while the least used were usenet/newsgroups. The reasons for the use of the Internet by most students were research, learning, communication and social networking. The challenges encountered were slow speed and little information [14].
The survey [15] opined that the Internet has effected a positive change in the socio-economic life of the people. The researcher stated that the Internet has also effected positive change in the business environment, socio-political areas, the behaviour of people and their cultural values. The author aimed to carry out a survey that would identify and understand the profiles of students who use the Internet in one Romanian faculty. The study examined the students’ Internet usage and the time spent on the Internet. It established the students’ access to the Internet, their most preferred services and how these can be used effectively in the teaching process. Again, the students’ skill and zeal for Internet use were reported. It was discovered that there was a need to update the academic curriculum. The respondents were bachelor and master students of the Faculty of Accounting and Management Information Systems in the University of Economic Studies in Bucharest. Questionnaires were distributed to 119 students, of whom 84 were bachelor students and 35 were master students. All the students had experience in the use of the Internet. Analysis of the data was performed using eight hypotheses. The author encouraged the development of Internet-based learning tools and confirmed the students’ expectations regarding the successful development and implementation of technological developments in the field. The research findings could help improve the academic curriculum in the computing and security fields [15].
The study [16] was on the use of library resources and services by PG students of Babcock University. One hundred (100) copies of the questionnaire were used, and 76% were returned. It was discovered that most students often used the library for research work but did not make effective use of the resources provided. The students preferred to use many materials from the Internet, but the problem was inadequate time. In terms of the use of resources in the library, they were moderately contented. Some recommendations were made: enough services should be offered to the students, at least on a full-day basis. The school should provide more Internet outlets for the PG students at the library. Also, it was suggested that Internet facilities be made available all over the campus for easy access by the students. The librarian should run a sensitization programme for the students on the available services. Again, a help desk should be provided to receive complaints from the students and thereby help the school management to remedy the issues faced [16]. The survey [17] was on the time spent on the Internet for both academic and non-academic reasons. A survey of 1675 randomly selected students was performed using a questionnaire across five areas, namely social sciences, sciences, engineering, agriculture and computer sciences. It was discovered that the students used the Internet for 4.48 h per day. The results showed that computer science students spent 5.61 h per day using the Internet, more than students in the other areas of study. Social sciences, agriculture and computer sciences students were seen to use the Internet for academic research more than other students. Also, it was discovered that, out of the large amount of time spent on the Internet by the students, less time was spent on academic research work.
For students from the social sciences, the correlation between the time used for online activities and the time for academic research was positive, although low. On the contrary, for the science students, the correlation between the time spent online and the time for academic work was slightly negative, while for agriculture, engineering and computer sciences, there was no correlation between the time spent online and the time for academic work. It was observed that most applications developed these days are meant for socializing with friends and having fun rather than for academic or educational use, and as such students in higher education should make effective use of the time they spend on the Internet for their academic work. Students who spend longer on the Internet do not engage more in their academic work than students who do not always use the Internet. It was seen that whether students use the Internet for academic or social reasons is a matter of individual choice. It was recommended that a good amount of the time spent on the Internet should be for academic purposes rather than social activities. Again, the Internet should serve as a medium of interaction between the students and the lecturers. According to the research findings, using the Internet for academic work brings about optimal academic results [17]. The study [18] focused on the use of the Internet for scientific research by master and PhD students of the University of Khartoum in the year 2014. Out of 3189 students, a random sample of 441 students was drawn from four assemblies. A questionnaire was used for data collection, and SPSS software was used to obtain the
percentage, mean, standard deviation, one-group t-test and ANOVA. From the findings, most students used the Internet for their research. The averages for the four assemblies were 65.3 for basic engineering, 65 for medical and health studies, 62.73 for humanities and educational studies, and 61.75 for agricultural and veterinary studies. The Internet helped to improve the students’ research knowledge. There was no difference in the rate of Internet usage by the students irrespective of the type of degree. It was found that, to obtain efficiency, the school has to consider the fit between the research expectations of the students and the capacity of the Internet to meet the demand for research work. Where there is such a fit, the students will regard the Internet as of great use and be contented, and this will encourage the optimal use of the Internet for scientific research work. Again, there was a need for the students to strive more in the use of the Internet to advance their knowledge in obtaining information in all kinds of digital forms. However, the challenges faced were difficulty in getting materials written in English from the right websites for scientific work, as well as email and some technical issues [18]. The survey [19] was on the use and awareness of electronic resources by the SC/ST researchers and PG students of the colleges in Salem District affiliated with Periyar University. An empirical survey was carried out using 700 structured questionnaires. There were primary and secondary data and in-depth communication with a sample group. The primary data was collected using random sampling. Tests such as ANOVA, regression analysis, percentage, frequency and the chi-square test showed that many researchers and PG students regularly used the e-resources for their work and exams. A good number of students obtained the materials by downloading PDF files.
The rate of use of the e-resources was low compared to the amount spent in setting up the facilities. As a result, there was a need for awareness of the resources and for training to be organized for the students. It was discovered that enough e-resources were available, but the equipment and devices to aid their effective use must be obtained and kept up to date. The challenges encountered were poor network, unnecessary adverts, interruption of power supply, time consumption and licensing issues [19]. The study [20] observed how electronic resources are used by the PG students in the University of Cape Coast. The objectives were to discover the awareness of the digitalized materials, the frequency of use of these materials, the level of computer knowledge and the problems faced by the students. A questionnaire was used on a sample of 275, and SPSS software was used to compute the frequencies and percentages. It was discovered that the majority of the students were aware of the digitalized materials. The students mostly used Google Scholar and databases from the Web to carry out their research work rather than the library database. The major challenge identified was poor Internet connection; others were inadequate advertising, no proper training, lack of login credentials, insufficient computers, interruption of power supply and little knowledge of information acquisition, all of which caused the students to rely fully on the library staff for information. As a result, the rate of access to and usage of the library for digitalized materials was adversely affected. It was suggested that to assuage these problems, the
library authority should provide adequate facilities that will aid effective use of the e-resources [20]. The survey [21] sought to determine the purpose and kinds of resources provided through the Internet for academic staff and PG students in Ahmadu Bello University, Zaria. In the study, a sampling method was used with sample sizes of 1232 and 316 for PG students and academic staff, respectively, with a questionnaire for data collection. The data was analysed to obtain frequency and percentage tables, and a t-test was performed. It was discovered that electronic mail and journals were used regularly by the staff and the students. The reasons for use of the Internet were studies, research, messages, interactions and socialization. Again, there was a significant difference in the kinds of Internet facilities used and the reasons for using the Internet. Although both the staff and students were interested in the Internet facilities, they were saddled with problems such as inactive and inadequate Internet with little bandwidth capacity and no teleconferencing equipment, which tended to affect the output of their research work. It was recommended that Internet facilities with enough bandwidth capacity should be made available at every location in the school for easy access, together with teleconferencing devices to improve research with colleagues in and outside the school [21]. The study [22] was on the use and understanding of the Internet by students of three different disciplines in the university, namely business studies, science and arts. A random sampling method was deployed using a questionnaire for 150 students, 50 from each discipline. Again, SPSS software was used to analyse the data as frequency and percentage distributions. It was discovered that the rate of use of the Internet was 100, 92 and 90% for business studies, science and arts, respectively.
The Internet was seen by the students as a means of retrieving information for studies and research. Owing to under-use of the Internet facility, it was recommended that the science and arts students put more effort into their use of the Internet. Also, all the students should improve their rate of access to the Internet. Again, in order to enhance the students’ study outputs, the Internet should be upgraded to a standard that accommodates the high demand, magnitude and speed of requests for research materials [22]. The survey [23] was carried out using a questionnaire for 6000 students drawn from five universities and two polytechnics by random sampling. Of the total number distributed, only 5000 questionnaires were recovered. Descriptive analysis was performed on the data. It was discovered that most of the students were knowledgeable in the use of computers and as such used the Internet. They normally went to the cybercafé to access the Internet for their study materials. Most of the students gave an average response that the Internet is solely for academic work. It was stated that the students always used electronic books and journals for their studies, which helped them to prepare for their academic assessments and evaluations. However, they had problems such as interruption of power supply, sluggish Internet connections, insufficient systems to access the Internet and too much demand and traffic from the students. To resolve these issues, some recommendations were made: an approved cybercafe should be allowed to run services within the school premises. Students should be given assignments to be carried out via the Internet
so as to boost their rate of Internet use. Students should be trained on the use of the Internet to help improve their Internet skills for research work. Students should be provided with support from the information and communication technology (ICT) staff. A good number of systems should be acquired for students’ research work [23]. In the study [24], the researcher investigated the use of Internet search engines by staff and students for academic studies. Out of a population of 2000 students and 300 staff, a random sample of 250 students and 40 staff was drawn. Of the 290 questionnaires, 230 were recovered and analysed using statistical software to obtain the diagrams, frequencies and percentage distributions. It was discovered that both staff and students were aware of search engines and materials available on the Internet for academic purposes, and that both parties had gradually embraced the use of search engines for their academic work. However, they had problems with lack of skills and with bandwidth capacity too small for the bulky research work to be carried out. Some recommendations were made. Librarians, in carrying out their duties, should not be intimidated by the idea that the Internet has come to replace their jobs; rather, they should encourage the school management to conduct intensive training to enhance the students’ proficiency in search skills for effective academic work. Also, the Internet bandwidth should be upgraded to a reasonable capacity to serve the university’s academic needs. It was recommended that training on the use of the Internet and Internet course modules should be introduced for everyone in the school to help improve their skills and proficiency in academic research [24]. The survey [25] investigated the ways in which the Internet is being used by the students of the university.
A descriptive survey was employed, using a questionnaire administered to a random sample of 200 students from four faculties. It was discovered that almost all students used the Internet frequently. 51.6% could access the Internet from their apartments or school accommodation. 21.1% made use of the library for research. It was seen that 91.1% of students used their mobile phones to access the Internet. 49% of students trained on the Internet could work effectively. 45.8% were not knowledgeable in it, so they made all kinds of efforts to achieve positive results. 78.4% used it for class work, 76.3% for entertainment and 73.2% to communicate with one another. The services and materials most used by the students were social media, search engines, electronic newspapers and electronic books. 53.2% could search for and obtain their data; 28.9% could use some logic. The students were of the opinion that the Internet had really helped them to improve their studies. 36% said that they were somewhat satisfied with the Internet. The problems they had were low bandwidth capacity and difficulty in getting the materials they needed because they were not knowledgeable in the use of the facilities. Some recommendations were made, such as providing adequate equipment and devices to facilitate prompt information finding and improving the students’ skills. It was opined that the students used the Internet for their studies, for interaction among themselves and for fun in their leisure time [25].
An Assessment of Internet Services Usage …
3 Research Methodology
The methodology was guided by the aim of the research, which is to assess Internet services usage among postgraduate (PG) students in NDA. To achieve this aim, the following objectives were set: to investigate the strength of Internet service usage by PG students; to investigate the rate of awareness of Internet service usage by PG students; to investigate where PG students access Internet services in NDA; to investigate the Internet services available in NDA; to investigate the Internet services used by PG students in NDA; to investigate the locations used to access Internet services by PG students in NDA; to investigate the students’ experience in the use of Internet services; to investigate the devices used to access Internet services in NDA, since PG students use different devices for this purpose; and to investigate the problems of Internet service usage by PG students, since the students face a number of challenges while accessing Internet services for academic studies. To attain these objectives, a quantitative method was employed, with a questionnaire as the means of data collection. The questionnaire used in the study is an enhancement of the questionnaire used in [9]. The study population was the NDA postgraduate school (NDAPGS), and the sample size was 120 students. Respondents were restricted to PG students currently in session in NDAPGS. The questionnaire link was distributed through WhatsApp: it was sent to the PG programme coordinators of all departments, who forwarded it to the WhatsApp groups of their respective PG students.
There were two hundred and eight (208) responses, but eighty-eight (88) were discarded because the form was improperly filled or the answers were incomplete, while one hundred and twenty (120) fully completed forms were downloaded into Microsoft Excel in comma-separated values (CSV) file format. The research design was quantitative; a survey was carried out using a closed-ended questionnaire. The purpose of the questionnaire was to investigate the positive and negative factors associated with the use of Internet services by the postgraduate students of the academy. The researcher designed the questionnaire using Google Forms, and the link generated online was distributed to the WhatsApp groups of the different PG programmes. After the PG students had completed the forms, the data was downloaded into a Microsoft Excel worksheet and then exported into IBM SPSS Statistics version 23, where it was analysed using descriptive statistics; the results were presented in frequency and percentage tables and illustrated with bar charts. The questionnaire comprised closed-ended questions in three sections: four questions in Sect. 1 (demographic information), twenty-six questions in Sect. 2 and five questions in Sect. 3, making a total of thirty-five questions on each form. The questionnaire was administered to a sample of one hundred and twenty respondents, all PG students of NDAPGS, including full-time and part-time students on postgraduate diploma, academic masters, professional masters and doctor of philosophy programmes.
A. A. Amaefule and F. N. Ogwueleka
4 Findings and Analysis
Data obtained from the questionnaire was analysed using IBM SPSS Statistics version 23. Results were generated using descriptive statistics and presented in frequency and percentage tables, and each result was also illustrated with a bar chart. A total of two hundred and eight (208) responses were received. However, some respondents skipped key questions; in particular, some students answered all the questions in Sects. 1 and 2 but failed to answer some or all of the questions in Sect. 3. These eighty-eight (88) incomplete responses were treated as invalid and discarded, and the remaining one hundred and twenty (120) valid responses were used for the analysis of Internet services usage among postgraduate students of the academy. Descriptive statistics were computed for each of the thirty-five (35) questions administered to the postgraduate students.
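As an illustration of the frequency and percentage tables described above, the following minimal Python sketch recomputes the age distribution percentages from the counts reported in Sect. 4.1. It is not the SPSS procedure used in the study; the function name `frequency_table` is an assumption introduced here for illustration only.

```python
def frequency_table(counts):
    """Return (label, frequency, percentage) rows; percentages rounded to one decimal."""
    total = sum(counts.values())
    return [(label, n, round(100 * n / total, 1)) for label, n in counts.items()]


# Age-range counts as reported in Sect. 4.1 (120 valid responses).
age_counts = {
    "25-30": 61,
    "31-35": 24,
    "36-40": 16,
    "41-45": 7,
    "46-50": 10,
    "Not specified": 2,
}

age_table = frequency_table(age_counts)
for label, n, pct in age_table:
    print(f"{label:<14}{n:>4}{pct:>7.1f}")
```

The same helper could be applied to any of the single-choice questions, since each frequency table in this section is a count per category divided by the number of valid responses.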
4.1 Demographic Information Result
Table 1 presents the demographic information of the respondents. Out of one hundred and twenty (120) respondents, sixty-one (61) were in the 25–30 age range (50.8%), twenty-four (24) in the 31–35 range (20.0%), sixteen (16) in the 36–40 range (13.3%), seven (7) in the 41–45 range (5.8%) and ten (10) in the 46–50 range (8.3%), while two (2) students (1.7%) did not specify their age, as shown in Fig. 1. For sex distribution, thirty-one (31) respondents were female (25.8%) and eighty-nine (89) were male (74.2%), as shown in Fig. 2. For programme mode, seventy-five (75) were full-time students (62.5%), while forty-five (45) were part-time students (37.5%), as shown in Fig. 3. For programme type, fifty-one (51) were academic masters students (42.5%), eleven (11) were PGD students (9.2%), nineteen (19) were PhD students (15.8%) and thirty-nine (39) were professional masters students (32.5%), as shown in Fig. 4. All the results are presented in frequency and percentage tables and in bar charts.

Table 1 Demographic information
| Demographic information | Description | Frequency | Percentage |
|---|---|---|---|
| Age | 25–30 | 61 | 50.8 |
| | 31–35 | 24 | 20.0 |
| | 36–40 | 16 | 13.3 |
| | 41–45 | 7 | 5.8 |
| | 46–50 | 10 | 8.3 |
| Sex | Female | 31 | 25.8 |
| | Male | 89 | 74.2 |
| Programme | Full-time | 75 | 62.5 |
| | Part-time | 45 | 37.5 |
| Level | Academic masters | 51 | 42.5 |
| | PGD | 11 | 9.2 |
| | PhD | 19 | 15.8 |
| | Professional masters | 39 | 32.5 |

Fig. 1 Age distribution
Fig. 2 Sex distribution
Fig. 3 Programme mode distribution
Fig. 4 Level distribution
4.2 Usability and Awareness of Internet Services by PG Students
Section 2 of the questionnaire relates to Internet services in NDA PG School. Table 2 covers the use of Internet services, awareness of them and confidence in Internet information among PG students in NDA. Ninety-five (95) students use Internet services, while twenty-five (25) do not, as shown in Fig. 5. Sixty (60) students are aware of Internet services in NDA PG School, while fifty-nine (59) are not, as shown in Fig. 6. Ninety-one (91) students have confidence in the accuracy of information on the Internet, while twenty-nine (29) do not, as shown in Fig. 8. Twenty-two (22) students are of the view that lecture notes are available on the academy postgraduate portal, while ninety-eight (98) stated that they are not, as shown in Fig. 9. From Table 2, most postgraduate students use Internet services and have confidence in the accuracy of information on the Internet, and a narrow majority are aware of Internet services in NDA PG School.

Table 2 Usability and awareness of Internet services by PG students
| S. No. | Items | Yes | No |
|---|---|---|---|
| 1 | Do you use Internet services in NDA PG School? | 95 (79.2%) | 25 (20.8%) |
| 2 | Are you aware of Internet services in NDA PG School? | 60 (50.0%) | 59 (49.2%) |
| 3 | Do you have confidence in the accuracy of information on the Internet? | 91 (75.8%) | 29 (24.2%) |
| 4 | Are there lecture notes available on the academy postgraduate portal? | 22 (18.3%) | 98 (81.7%) |
Fig. 5 Use of Internet service in NDA
Fig. 6 Awareness of Internet services
Fig. 7 Experience in the use of Internet
However, they are of the opinion that there is no lecture note on the academy portal (Fig. 9).
4.3 Experience in the Use of Internet Services
Table 3 reports responses to the question “How long is your experience in the use of Internet?” Thirty (30) students have no experience, twenty-eight (28) have one to five (1–5) years of experience, twenty-eight (28) have six to ten (6–10) years, eleven (11) have eleven to fifteen (11–15) years, and twenty-three (23) have sixteen or more years, as shown in Fig. 7. Therefore, students with no experience or with up to ten years of experience (86 in total) outnumber those with eleven or more years of experience (34 in total).
Fig. 8 Confidence of Internet information
Fig. 9 Availability of lecture note on NDA portal
Table 3 How long is your experience in the use of Internet
| S. No. | Items | 0 | 1–5 | 6–10 | 11–15 | 16 and above |
|---|---|---|---|---|---|---|
| 1 | How long is your experience in the use of Internet? | 30 (25.0%) | 28 (23.3%) | 28 (23.3%) | 11 (9.2%) | 23 (19.2%) |
4.4 Campus Where Internet Services Are Accessed
Table 4 shows the campus at which students access Internet services. Twenty-three (23) students access Internet services at the permanent site, while ninety-six (96) do so at the old site. This implies that more students access Internet services at the old site than at the permanent site.
Table 4 Which campus do you access NDA Internet service
| S. No. | Campus | Frequency | Percentage (%) |
|---|---|---|---|
| 1 | Permanent site | 23 | 19.2 |
| 2 | Old site | 96 | 80.0 |
Table 5 Which laboratory do you normally use in NDA
| S. No. | Laboratory | Frequency | Percentage (%) |
|---|---|---|---|
| 1 | Permanent site directorate of ICT computer laboratory | 8 | 6.7 |
| 2 | Permanent site academy library computer laboratory | 9 | 7.5 |
| 3 | Old site directorate of ICT computer laboratory | 41 | 34.2 |
| 4 | Old site PG Library computer laboratory | 26 | 21.7 |
| 5 | PG school computer laboratory | 34 | 28.3 |
4.5 Laboratory Used for Internet Services
Table 5 shows the laboratory that students normally use in NDA. Eight (8) students use the permanent site directorate of ICT computer laboratory, nine (9) the permanent site academy library computer laboratory, forty-one (41) the old site directorate of ICT computer laboratory, twenty-six (26) the old site PG Library computer laboratory and thirty-four (34) the PG School computer laboratory. This implies that more students use the old site directorate of ICT computer laboratory and the PG School computer laboratory, while few students use the permanent site directorate of ICT computer laboratory and the permanent site academy library computer laboratory.
4.6 Access Point Used to Connect Internet Services
Table 6 shows the access point to which students connect for Internet services in NDA. Fourteen (14) students connect through the permanent site directorate of ICT access point, six (6) through the permanent site academy library access point, thirty-eight (38) through the old site directorate of ICT access point, thirty-one (31) through the old site PG Library access point and twenty-nine (29) through the PG School hall access point, as shown in Fig. 12. This implies that more students connect to Internet services through the old site directorate of ICT access point, the old site PG Library access point and the PG School hall access point, while few students use the permanent site directorate of ICT access point and the permanent site academy library access point.

Table 6 Which access point do you connect for Internet services
| S. No. | Access point | Frequency | Percentage (%) |
|---|---|---|---|
| 1 | Permanent site directorate of ICT access point | 14 | 11.7 |
| 2 | Permanent site academy library access point | 6 | 5.0 |
| 3 | Old site directorate of ICT access point | 38 | 31.7 |
| 4 | Old site PG Library access point | 31 | 25.8 |
| 5 | PG school hall access point | 29 | 24.2 |

Table 7 Where do you normally access Internet services
| S. No. | Location of Internet service | Frequency | Percentage (%) |
|---|---|---|---|
| 1 | Department lecture room | 24 | 20.0 |
| 2 | PG Hall | 19 | 15.8 |
| 3 | Cybercafé | 47 | 39.2 |
| 4 | Library computer laboratory | 14 | 11.7 |
| 5 | Directorate of ICT laboratory permanent/new site | 16 | 13.3 |
4.7 Location of Access of Internet Services
Table 7 shows the locations where Internet services are accessed in NDA. Out of the one hundred and twenty respondents, the highest number, forty-seven (47), access Internet services at the cybercafe, followed by the department lecture room with twenty-four (24), the PG Hall with nineteen (19), the directorate of ICT laboratory (permanent/new site) with sixteen (16) and, lowest, the library computer laboratory with fourteen (14), as shown in Fig. 10. This implies that more students access Internet services at the cybercafe and the department lecture room, while fewer students do so at the PG Hall, the library computer laboratory and the directorate of ICT laboratory (permanent/new site).
4.8 Device Used to Access Internet Services
Table 8 shows the devices used to access Internet services in NDA. Twenty-five (25) students use a personal computer/laptop, fifty-five (55) a smart phone, seven (7) a cybercafe computer, eight (8) a directorate of ICT computer laboratory computer and five (5) a PG School computer laboratory computer, as shown in Fig. 11. This implies that more students use a smart phone or their own computer/laptop, while very few use a cybercafe computer, the directorate of ICT computer laboratory or the PG School computer laboratory (Fig. 12).
Fig. 10 Location where Internet services are accessed
Table 8 What device do you use to access Internet services
| S. No. | Device used to access Internet | Frequency | Percentage (%) |
|---|---|---|---|
| 1 | Personal computer/laptop | 25 | 20.8 |
| 2 | Smart phone | 55 | 45.8 |
| 3 | Cybercafe computer | 7 | 5.8 |
| 4 | Directorate of ICT computer laboratory | 8 | 6.7 |
| 5 | PG school computer laboratory | 5 | 4.2 |
4.9 Strength of Internet Service Usage
Table 9 shows how often Internet services are used in NDA. Combining the “very often” and “often” responses, thirty-four (34) students often use email services, while fifty-five (55) never use them. Thirty-nine (39) students often use website services, while thirty-six (36) never use them. Fifty-eight (58) students often use online portal registration services, while fifteen (15) never use them. Sixteen (16) students often use the academy library databases, while seventy-nine (79) never use them. This implies that more postgraduate students often use the online portal registration services than any other service, while very few use the academy library databases. Using a Likert scale with weights 1, 2, 3 and 4 gives a hypothesized mean of 2.5 (i.e., [1 + 2 + 3 + 4]/4), which is used as the basis for comparison with the variables under consideration. In Table 10, the weighted mean for email usage is 1.93, which is less than the hypothesized mean of 2.5; the decision is therefore “disagree”, indicating that the majority of the students do not often use the email service. Only online portal registration, with a weighted mean of 2.52, exceeds the hypothesized mean. Overall, the grand mean is 2.02, which is less than 2.5; we therefore conclude that the students, in general, disagreed that they often use Internet services in NDA postgraduate school.

Fig. 11 Device used to access Internet
Fig. 12 Access point used to connect Internet

Table 9 How often do you use the following Internet services in NDAPGS
| S. No. | Internet services | Very often | Often | Seldom | Never |
|---|---|---|---|---|---|
| 1 | Email | 13 (10.8%) | 21 (17.5%) | 31 (25.8%) | 55 (45.8%) |
| 2 | Website | 10 (8.3%) | 29 (24.2%) | 45 (37.5%) | 36 (30.0%) |
| 3 | Online portal registration | 19 (15.8%) | 39 (32.5%) | 47 (39.2%) | 15 (12.5%) |
| 4 | Academy library databases | 4 (3.3%) | 12 (10.0%) | 24 (20.0%) | 79 (65.8%) |

Table 10 How often is the usage of Internet services
| S. No. | Internet services | Very often (4) | Often (3) | Seldom (2) | Never (1) | Weighted mean | S.D. | Decision |
|---|---|---|---|---|---|---|---|---|
| 1 | Email | 13 | 21 | 31 | 55 | 1.93 | 1.03 | Disagree |
| 2 | Website | 10 | 29 | 45 | 36 | 2.12 | 0.93 | Disagree |
| 3 | Online portal registration | 19 | 39 | 47 | 15 | 2.52 | 0.92 | Agree |
| 4 | Academy library databases | 4 | 12 | 24 | 79 | 1.50 | 0.81 | Disagree |
| | Grand total | | | | | 8.07 | 3.69 | |
| | Grand mean | | | | | 2.02 | 0.92 | Disagree |

Table 11 Chi-square tests of how often the usage of Internet
| | Value | df | Asymptotic significance (2-sided) |
|---|---|---|---|
| Pearson chi-square | 84.303a | 9 | 0.000 |
| Likelihood ratio | 89.545 | 9 | 0.000 |
| Linear-by-linear association | 4.598 | 1 | 0.032 |
| No. of valid cases | 479 | | |
a 0 cells (0.0%) have expected count less than 5. The minimum expected count is 11.43.

Furthermore, from Table 11, the chi-square value is 84.303 at nine degrees of freedom and the P-value is approximately 0.000. We therefore conclude that there is a significant difference in how often the different Internet services are used in NDA.
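The Pearson chi-square statistic in Table 11 can be cross-checked directly from the counts in Table 9. The sketch below is a plain Python illustration, not the SPSS procedure used in the study, and it assumes that the test was a test of independence on the 4 × 4 table of services by response category. The function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic, degrees of freedom and expected counts
    for an r x c contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof, expected


# Observed counts from Table 9: rows are services,
# columns are very often, often, seldom, never.
usage = [
    [13, 21, 31, 55],  # Email
    [10, 29, 45, 36],  # Website
    [19, 39, 47, 15],  # Online portal registration
    [4, 12, 24, 79],   # Academy library databases
]

chi2, dof, expected = chi_square_independence(usage)
min_expected = min(min(row) for row in expected)
print(f"chi-square = {chi2:.2f}, df = {dof}, minimum expected count = {min_expected:.2f}")
```

Under that assumption, the computed statistic, the degrees of freedom, the number of valid cases (479) and the minimum expected count agree with the values in Table 11 to within rounding.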
Table 12 Internet services available for PG students
| S. No. | Internet services | Strongly agree | Agree | Neutral | Disagree | Strongly disagree |
|---|---|---|---|---|---|---|
| 1 | Email | 29 (24.2%) | 34 (28.3%) | 28 (23.3%) | 18 (15.0%) | 10 (8.3%) |
| 2 | Website | 28 (23.3%) | 39 (32.5%) | 29 (24.2%) | 17 (14.2%) | 7 (5.8%) |
| 3 | Online portal registration | 37 (30.8%) | 50 (41.7%) | 20 (16.7%) | 9 (7.5%) | 3 (2.5%) |
| 4 | Academy library databases | 14 (11.7%) | 21 (17.5%) | 39 (32.5%) | 30 (25.0%) | 15 (12.5%) |
4.10 Availability of Internet Services
Table 12 shows the Internet services available to postgraduate students. Combining “strongly agree” with “agree” and “strongly disagree” with “disagree”, sixty-three (63) students agreed that email services are available, while twenty-eight (28) disagreed. Sixty-seven (67) students agreed that website services are available, while twenty-four (24) disagreed. Eighty-seven (87) students agreed that online portal registration services are available, while twelve (12) disagreed. Thirty-five (35) students agreed that academy library database services are available, while forty-five (45) disagreed. This implies that more postgraduate students agreed that online portal registration services are available to them than any other service, while more of them disagreed than agreed that the academy library databases are available (Table 12). Using a Likert scale with weights 1, 2, 3, 4 and 5 gives a hypothesized mean of 3.0 (i.e., [1 + 2 + 3 + 4 + 5]/5), which is used as the basis for comparison with the variables under consideration. In Table 13, the weighted means for email (3.45), website (3.53) and online portal registration (3.92) are all greater than the hypothesized mean of 3.0, which indicates that the majority of the students agree that email, website and online portal registration services are available; the academy library databases, with a weighted mean of 2.91, fall below it. Overall, the grand mean is 3.45, which is greater than 3.0. We therefore conclude that the PG students agreed on the availability of Internet services in NDA postgraduate school.
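The weighted means and decisions discussed above can be reproduced from the response counts in Table 12. The following Python sketch is illustrative only; the helper names `weighted_mean` and `decision` are assumptions, and the rule that a mean exactly equal to the hypothesized mean counts as “Disagree” is also an assumption, since the text does not state how ties are treated.

```python
def weighted_mean(counts, weights):
    """Likert weighted mean: sum(weight * count) / number of responses."""
    return sum(w * c for w, c in zip(weights, counts)) / sum(counts)


def decision(mean, hypothesized_mean):
    """'Agree' above the hypothesized mean, otherwise 'Disagree' (tie rule assumed)."""
    return "Agree" if mean > hypothesized_mean else "Disagree"


weights = [5, 4, 3, 2, 1]  # strongly agree ... strongly disagree
hypothesized = sum(weights) / len(weights)  # (1 + 2 + 3 + 4 + 5) / 5

# Response counts from Table 12, ordered strongly agree to strongly disagree.
availability = {
    "Email": [29, 34, 28, 18, 10],
    "Website": [28, 39, 29, 17, 7],
    "Online portal registration": [37, 50, 20, 9, 3],
    "Academy library databases": [14, 21, 39, 30, 15],
}

results = {}
for service, counts in availability.items():
    mean = weighted_mean(counts, weights)
    results[service] = (round(mean, 2), decision(mean, hypothesized))

grand_mean = round(
    sum(weighted_mean(c, weights) for c in availability.values()) / len(availability), 2
)

for service, (mean, label) in results.items():
    print(f"{service:<28}{mean:>6.2f}  {label}")
print(f"{'Grand mean':<28}{grand_mean:>6.2f}  {decision(grand_mean, hypothesized)}")
```

Each mean is divided by the number of students who actually answered that item, which is why rows with 119 rather than 120 responses still match the reported values.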
Table 13 Internet services available for PG students
| S. No. | Internet services | Strongly agree (5) | Agree (4) | Neutral (3) | Disagree (2) | Strongly disagree (1) | Mean | S.D. | Decision |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Email | 29 | 34 | 28 | 18 | 10 | 3.45 | 1.25 | Agree |
| 2 | Website | 28 | 39 | 29 | 17 | 7 | 3.53 | 1.17 | Agree |
| 3 | Online portal registration | 37 | 50 | 20 | 9 | 3 | 3.92 | 1.00 | Agree |
| 4 | Academy library databases | 14 | 21 | 39 | 30 | 15 | 2.91 | 1.19 | Disagree |
| | Grand total | | | | | | 13.81 | 4.61 | |
| | Grand mean | | | | | | 3.45 | 1.15 | Agree |

Table 14 Chi-square test on the availability of Internet facilities
| | Value | df | Asymptotic significance (2-sided) |
|---|---|---|---|
| Pearson chi-square | 49.504a | 12 | 0.000 |
| Likelihood ratio | 51.420 | 12 | 0.000 |
| Linear-by-linear association | 6.459 | 1 | 0.011 |
| No. of valid cases | 477 | | |
a 0 cells (0.0%) have expected count less than 5. The minimum expected count is 8.73.

Moreover, from Table 14, the chi-square value is 49.504 at 12 degrees of freedom and the P-value is approximately 0.000. We therefore conclude that there is a significant association between the availability of the Internet services and students’ opinions in NDA.

4.11 Problems of Internet Services
Table 15 depicts the problems encountered by postgraduate students. Combining the “strongly agree” and “agree” responses of the one hundred and twenty (120) respondents, the most frequently reported problems were power outage, which sixty-eight (68) students agreed was a problem; inaccessibility of some websites, sixty-four (64); insufficient bandwidth capacity, fifty-nine (59); lack of login credentials to the academy network, fifty-seven (57); download delay, fifty-six (56); and high cost of Internet usage, fifty-one (51). The least frequently reported problems were absence of search skills, which thirty-two (32) students agreed was a problem; overload of information, thirty-five (35); information credibility and difficulties navigating websites, forty-one (41) each; and filtering results from search, forty-seven (47). This implies that the problems encountered at a high rate by postgraduate students are power outage, inaccessibility of some websites, insufficient bandwidth capacity, lack of login credentials to the academy network, download delay and high cost of Internet usage, while the problems encountered at a low rate are lack of search skills, information overload, credibility of information, difficulties in navigating websites and filtering results from search. Table 16 shows that the students agree with all the items except item 5: they disagreed that absence of search skills is one of the problems they face in using the Internet.

Table 15 Problems encountered by PG students in the use of NDA Internet services
| S. No. | Problems | Strongly agree | Agree | Neutral | Disagree | Strongly disagree |
|---|---|---|---|---|---|---|
| 1 | Overload of information | 10 (8.3%) | 25 (20.8%) | 52 (43.3%) | 25 (20.8%) | 8 (6.7%) |
| 2 | Filtering results from search | 9 (7.5%) | 38 (31.7%) | 54 (45.0%) | 13 (10.8%) | 4 (3.3%) |
| 3 | Delay in download | 21 (17.5%) | 35 (29.2%) | 45 (37.5%) | 13 (10.8%) | 6 (5.0%) |
| 4 | Information credibility problem | 15 (12.5%) | 26 (21.7%) | 53 (44.2%) | 20 (16.7%) | 4 (3.3%) |
| 5 | Absence of search skills | 9 (7.5%) | 23 (19.2%) | 49 (40.8%) | 30 (25.0%) | 9 (7.5%) |
| 6 | Expensive Internet usage | 21 (17.5%) | 30 (25.0%) | 40 (33.3%) | 25 (20.8%) | 4 (3.3%) |
| 7 | No network login credential | 26 (21.7%) | 31 (25.8%) | 45 (37.5%) | 12 (10.0%) | 6 (5.0%) |
| 8 | Power outages | 25 (20.8%) | 43 (35.8%) | 36 (30.0%) | 12 (10.0%) | 4 (3.3%) |
| 9 | Websites inaccessibility | 16 (13.3%) | 48 (40.0%) | 39 (32.5%) | 15 (12.5%) | 2 (1.7%) |
| 10 | Difficulties navigating website | 13 (10.8%) | 28 (23.3%) | 57 (47.5%) | 18 (15.0%) | 4 (3.3%) |
| 11 | Insufficient bandwidth capacity | 21 (17.5%) | 38 (31.7%) | 52 (43.3%) | 7 (5.8%) | 2 (1.7%) |
Table 16 Problems encountered by postgraduate students in the use of NDA Internet services
| S. No. | Problems | Strongly agree | Agree | Neutral | Disagree | Strongly disagree | Mean | S.D. | Decision |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Overload of information | 10 | 25 | 52 | 25 | 8 | 3.03 | 1.01 | Agree |
| 2 | Filtering results from search | 9 | 38 | 54 | 13 | 4 | 3.30 | 0.89 | Agree |
| 3 | Delay in download | 21 | 35 | 45 | 13 | 6 | 3.43 | 1.06 | Agree |
| 4 | Information credibility problem | 15 | 26 | 53 | 20 | 4 | 3.24 | 0.99 | Agree |
| 5 | Absence of search skills | 9 | 23 | 49 | 30 | 9 | 2.94 | 1.02 | Disagree |
| 6 | Expensive Internet usage | 21 | 30 | 40 | 25 | 4 | 3.33 | 1.09 | Agree |
| 7 | No network login credential | 26 | 31 | 45 | 12 | 6 | 3.49 | 1.09 | Agree |
| 8 | Power outages | 25 | 43 | 36 | 12 | 4 | 3.61 | 1.03 | Agree |
| 9 | Websites inaccessibility | 16 | 48 | 39 | 15 | 2 | 3.51 | 0.93 | Agree |
| 10 | Difficulties navigating website | 13 | 28 | 57 | 18 | 4 | 3.23 | 0.95 | Agree |
| 11 | Insufficient bandwidth capacity | 21 | 38 | 52 | 7 | 2 | 3.58 | 0.90 | Agree |
| | Grand total | | | | | | 36.69 | 10.96 | |
| | Grand mean | | | | | | 3.34 | 0.996 | Agree |

Table 17 Chi-square tests of association on the problems encountered in the use of NDA Internet facilities
| | Value | df | Asymptotic significance (2-sided) |
|---|---|---|---|
| Pearson chi-square | 90.808a | 40 | 0.000 |
| Likelihood ratio | 91.129 | 40 | 0.000 |
| Linear-by-linear association | 16.607 | 1 | 0.000 |
| No. of valid cases | 1316 | | |
a 11 cells (20.0%) have expected count less than 5. The minimum expected count is 4.75.

Moreover, from Table 17, the chi-square value is 90.808 at 40 degrees of freedom, and the P-value is approximately 0.000. We therefore conclude that there is a significant association between the problems encountered by the students in the use of NDA Internet facilities and the students’ experience of them.

5 Summary
The largest group of students, 61, falls within the 25–30 age range. There are more male students (89) than female students (31), and more full-time students (75) than part-time students (45). Academic masters students (51) outnumber students at every other level. Ninety-five students use Internet services in NDA, compared with 25 who do not. Slightly more students are aware of Internet services in NDA (60) than are not (59). The PG students who have confidence in information on the Internet (91) outnumber those who do not (29). The students who are of the view that lecture notes are not available on the academy postgraduate portal (98) outnumber the 22 who stated that they are available. Thirty students have no experience in the use of the Internet, which is more than the 28 students in each of the 1–5 and 6–10 year groups, while the 11–15 and 16-and-above year groups have 11 and 23 students, respectively. Ninety-six students access the Internet at the old site, compared with 23 at the permanent site. The old site directorate of ICT computer laboratory is used by 41 students, the highest of all the laboratories in NDA. The old site directorate of ICT access point is used by 38 PG students, the highest of all the access points. The cybercafe is used by 47 students, the highest of all the locations in NDA. Smart phones are used by 55 students, the highest of all the devices. Online portal registration is often used by 58 students, the highest of all the Internet services, while very few students (16) often use the academy library databases. Generally, students disagreed that they often use Internet services in NDA, and there is a significant difference in how often the different Internet services are used. Eighty-seven students agreed that online portal registration is available to them, more than for any other service, while more PG students disagreed (45) than agreed (35) that the academy library databases are available to them. Students agreed on the availability of Internet services in NDA postgraduate school. There is a significant association between the availability of the Internet services and students’ opinions in NDA. There is also a significant association between the problems encountered by the students in the use of NDA Internet facilities and the students’ experience of them.
6 Conclusions
This study assessed Internet services usage by PG students. Most students use the Internet, and a narrow majority are aware of Internet services in NDA. Students with no experience in the use of the Internet outnumber those in each of the 1–5 and 6–10 year groups, while the 11–15 and 16-and-above year groups are smaller still. Students access the Internet at the old site more than at the permanent site. The cybercafe is the most common location for Internet access, and the smart phone is the most widely used device. Students use NDA online portal registration more than any other Internet service, while very few use the academy library databases. The grand mean decision on how often Internet services are used was “disagree”, which means that, in general, students disagreed that they often use Internet services in NDA. There is a significant difference in how often the different Internet services are used in NDA. In terms of the Internet services available in NDA, more students agreed that online portal registration is available than agreed for any other service, while more PG students disagreed that the academy library databases are available than disagreed for any other service. The grand mean decision on the Internet services available for PG students was “agree”, which means that, in general, students agreed on the availability of Internet services in NDA. There is a significant association between the availability of the Internet services and students’ opinions in NDA. There is also a significant association between the problems encountered by the students in the use of NDA Internet facilities and the students’ experience of them. It is suggested that the academy management should organize training for PG students on the use of the Internet. The PG students should be encouraged to use the academy library databases for their research work. They should also be advised to use their emails for communication among themselves and with the management, and to access the academy website for upcoming events and news. The Internet bandwidth capacity should be increased to accommodate more PG students for their research work, so that reliance on cybercafes in NDA is reduced. Login credentials should be given to every PG student to enable them to access the NDA network and use the Internet facilities. Future work should assess both undergraduate and postgraduate students’ use of Internet services, to enable the academy to provide adequate facilities and services for all students so that they can be highly productive in their research work.
7 Recommendations
The following are the recommendations:
• The academy management should organize training for PG students on the use of the Internet.
• The PG students should be encouraged to use the academy library databases for their research work.
• The management should advise PG students to use their emails for communication among themselves, with the management and with outside bodies.
• The PG students should be encouraged to access the academy website regularly for upcoming events and news.
• Internet bandwidth capacity should be increased to accommodate more PG students for their research work, so that reliance on cybercafes in NDA is reduced.
• Login credentials should be provided for every PG student to enable them to access the NDA network and use the Internet facilities.
References
1. A. Deshpande, N.H. Joshi, K.S. Poonacha, B. Dave, K. Naik, D. Mehta, Awareness and use of internet among postgraduate students of K.M. Shah Dental College and Hospital. Res. Rev. J. Med. Sci. Technol. (RRJoMST) 5(1), 32–37 (2016)
2. S. Idris, J. Dauda, Awareness and utilization of the internet resources and services for academic activities by the Academics of Tertiary Institutions in Adamawa State, Nigeria. Int. J. Knowl. Content Develop Technol 9(2), 7–31 (2019)
3. B. Wellman, A. Quan-Haase, J. Boase, W. Chen, The Internet in Everyday Life. ResearchGate, pp. 1–18 (2002). https://www.researchgate.net/publication/2552018_The_Internet_in_Everyday_Life, last accessed 2019/10/3
4. H. Armanul, Social and Academic Use of the Internet by Graduate Students in Finland and Bangladesh: A Comparative Study. University of Tampere, School of Information Sciences, Information Studies and Interactive Media, Master’s Thesis, pp. 1–4 (2015)
5. G. Singh, R. Pant, Use of internet for research and educational activities by research scholars: a study of D.S.B. Campus of Kumaun University Nainital. Int. J. Eng. Manage. Sci. (IJEMS) 4(2), 193–198 (2013)
6. NITDA Homepage, Framework and Guidelines for Public Internet Access (PIA). National Information Technology Development Agency, pp. 1–14. https://nitda.gov.ng/wp-content/uploads/2019/01/Framework-and-Guidelines-for-Public-Internet-Access-PIA.pdf, last accessed 2019/10/8
7. Tutorial Point Homepage, Internet Services. Internet Technologies Tutorial. https://www.tutorialspoint.com/internet_technologies/internet_services.htm, last accessed 2019/09/30
8. R.A. Okunlaya, O.I. Amusa, E.K. Ogunlana, Prospects and challenges of internet use among the postgraduate students of social and management sciences in Olabisi Onabanjo University, Nigeria. Inf. Knowl. Manage. 5(5), 128–134 (2015)
9. M.V. Adegbija, O.O. Bola, O.M. Ogunsola, Availability and utilization of internet facilities by postgraduate students in Federal Universities of Southwest, Nigeria. Int. J. Comput. Appl. 1(2), 172–178 (2012)
10. V. Meti, A study on the internet usage pattern of postgraduate students of Gulbarga University. J. Mass Commun. Journalism 4(3), 1–3 (2014)
11. NDA Homepage. https://www.nda.edu.ng/#home/index, last accessed 2019/09/30
12. GCF Homepage, Internet: What is Internet? GCF Learnfree.org, Goodwill Community Foundation Inc. https://www.just.edu.jo/~mqais/cis99/PDF/Internet.pdf, last accessed 2019/09/30
13. J. Aba, B. Kwaghga, O.O. Ogban, E.M. Umogbai, The use of internet services by postgraduate students for research in Francis Idachaba Library, University of Agriculture Makurdi. IOSR J. Res. Method Educ. (IOSR-JRME) 5(1), 15–23 (2015)
14. Z. Mohammed, A. Aliyu, The use of internet by the academic staff and postgraduate students in Ahmadu Bello University, Zaria. ResearchGate, 1–20 (2014). https://www.researchgate.net/publication/264200189_THE_USE_OF_INTERNET_BY_THE_ACADEMIC_STAFF_AND_POSTGRADUATE_STUDENTS_IN_AHMADU_BELLO_UNIVERSITY_ZARIA, last accessed 2019/10/28
15. V. Stanciua, A. Tincaa, A critical look on the student’s Internet use - an empirical study. Account. Manage. Inf. Syst. 13(4), 739–754 (2014)
16. F.N. Onifade, S.U. Ogbuiyi, S.U. Omeluzor, Library resources and service utilization by postgraduate students in a Nigerian private university. Int. J. Libr. Inf. Sci. 5(9), 289–294 (2013). https://doi.org/10.5897/IJLIS2012.054
17. F. Ahmad, M.A. Fauzi, H. Wan, H. Wan, H.N. Mokhtar, Use of internet for academic purposes among students in Malaysian Institutions of Higher Education. TOJET Turkish Online J. Educ. Technol. 13(1), 232–241 (2014)
18. I.K. Esam, H. Al, Perspectives of using internet on the scientific research among the postgraduate students at the University of Khartoum-Sudan. World J. Educ. 5(5), 11–20 (2015). https://doi.org/10.5430/wje.v5n5p11
19. E.S. Kavitha, A study on knowledge and usage of electronic resources by the SC/ST research scholars and PG students among Periyar University Affiliated Colleges. Libr. Philos. Pract. (E-J.), 1–18 (2018). https://digitalcommons.unl.edu/libphilprac
20. E. Ankrah, D. Atuase, The use of electronic resources postgraduate students of the University of Cape Coast. Libr. Philos. Pract. (e-J.) Libr. Univ. Nebraska-Lincoln, 1–37 (2018). https://digitalcommons.unl.edu/libphilprac/1632
21. M. Zakari, A. Abdulkadir, The use of internet by the academic staff and postgraduate students in Ahmadu Bello University, Zaria. ResearchGate, 1–20 (2014). https://doi.org/10.13140/2.1.2079.8083
22. H. Akram, R. Habibur, Comparative study of internet usage among university students: a study of the University of Dhaka, Bangladesh. Euro. Sci. J. 13(34), 134–150 (2017). https://doi.org/ 10.19044/esj.2017.v13n34p134 23. O. Ivwighreghweta, A.M. Igere, Impact of the internet on academic performance of students in Tertiary Institutions in Nigeria. J. Inf. Knowl. Manage. 5(2), 47–56 (2014) 24. O.S. Ozonuwe, H.O. Nwaogu, G. Ifijeh, M. Fagbohun, An assessment of the use of internet search engines in an academic environment. Int. J. Libr. Sci. 16(2), 1–11 (2018) 25. O.M. Bankole, G. Adio, Pattern of usage of internet among students of Federal University Oye-Ekiti, Ekiti State, Nigeria. Libr. Philos. Pract. (e-J.), 1–31 (2018). https://digitalcommons. unl.edu/libphilprac/1887
Smart Attendance Monitoring System Using Local Binary Pattern Histogram C. Meghana, M. Himaja, and M. Rajesh
Abstract Marking student attendance in a classroom is one of the most important activities carried out by teachers, both to confirm the physical presence of students and for record keeping. The traditional process of taking attendance is time consuming and error prone, and it allows students to mark proxy attendance. To eliminate or reduce these disadvantages, this paper proposes a scheme in which attendance is monitored through face detection and recognition. This removes the time a teacher needs to read out student names and the chance of students marking proxies. The system uses a face recognition algorithm: a camera captures images of the students present in the class, and these are compared with the images stored in the database. The key challenges are the quality of the captured images and the accuracy of the face recognition algorithm. Experiments show that the proposed system achieves more than 90% accuracy when implemented with a database of 493 images and tested against a class of 54 students. The accuracy can be improved with a better quality camera, better lighting conditions, and a more accurate face recognition algorithm.
Keywords Face recognition · Biometric attendance · SURF algorithm · SVM algorithm · Local binary pattern histogram
C. Meghana (B) · M. Himaja · M. Rajesh
Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
M. Rajesh e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_2

1 Introduction
In the present educational system, the day-to-day classroom attendance of a student plays a significant part in assessing the student's performance and in monitoring quality. To reduce the heavy work of recording attendance manually, many institutions use automatic or smart attendance monitoring systems. A smart attendance monitoring system
is implemented by making use of biometrics. The face is one biometric that can be used to mark attendance efficiently. One way to implement a smart attendance monitoring system is to combine Web technology, database technology, and machine learning. Web technology is the technology through which computers communicate with one another using markup languages and various packages. It lets us interact through Web sites built with HTML and CSS. Database technology stores, organizes, and processes data so that users can easily access, update, or manage it. Databases come in many kinds, shapes, and sizes. The Firebase database is a real-time NoSQL database hosted in the cloud. It stores data in JSON format and allows real-time changes to the data. Firebase enables users to sync, store, and query data at a global level. JSON (JavaScript Object Notation) is a simple, language-independent format for storing and exchanging data. Machine learning deals with algorithms and statistical models that carry out a particular task without explicit instructions, relying instead on patterns and inference. These techniques come under artificial intelligence. Face recognition is a computer-based application that can detect, track, identify, and verify faces in a captured image or video. There are several approaches to face recognition, but it is mostly implemented by comparing the facial features of the captured image with those of images already present in the database. The traditional method of taking attendance followed in many institutions, either reading out names or signing on paper, is very time consuming and leaves room for proxies. Some institutions have used RFID card readers for marking attendance.
With RFID card readers, each student is given an RFID card corresponding to their identity. In this case, the card may be lost, it may be misused by unauthorized users to mark attendance, and the system is costly. In this paper, the attendance of a student is marked by capturing the image of the student using a Webcam; the face_recognition library is used to detect and recognize faces, and Python functions mark the attendance by storing the data in the Firebase database. Since the Webcam already present in the system is used to detect faces, the implementation cost is very low, and the system is easy to implement, efficient, accurate, and fast.
2 Related Works
Several automated attendance management systems have been proposed. Some of these models are described below. Mahalin et al. [1] proposed a biometric-based attendance system in which the fingerprint of the student is captured using a portable device. Although this method removed the drawbacks of the manual attendance system, it introduced a new problem. This method was found to be a distraction for both
students and teachers, as the portable device is used during lecture hours [1].
Lim et al. [2] proposed an automated attendance system using radio frequency identification (RFID) tags. In this method, every student is given a unique identification card (UID) which has to be swiped in a machine that is connected to another machine containing the student database. It is a well-organized system, but it has the drawback that anyone can swipe the card on behalf of a student who is not attending the class. In this way, the chances of proxies are high [2].
Bhattacharya et al. [3] proposed a system for managing the attendance of students using face recognition technology. The face recognition algorithm is obtained directly from the dlib library, so that each student's face can be tracked from frame to frame. Marking attendance requires detecting the face, and the quality of each detected face is judged using parameters such as pose estimation, sharpness, image size, and brightness. In pose estimation, the roll and pitch are adjusted during face log generation, so the only remaining concern is the yaw angle, which is calculated using the formula yaw = abs(arctan2(y2 − y1, x2 − x1)). Because the faces are constantly moving, blurry images are likely; to overcome this problem, face quality assessment uses the variance of the image Laplacian. As the size of the face becomes smaller, the position of the eye corners is also included in the face quality assessment.
Mohana et al. [4] proposed a system with a two-step process:
• First, it detects all the regions that contain human skin.
• Second, it extracts from these regions the information that may indicate the accurate position of the face.
To detect the skin, a filter is used which relies on texture and color information.
Face detection then takes place on the grayscale image in which human skin has been detected. A combination of thresholding and morphology is used to extract the face. The SURF algorithm is used for matching: it detects interest points in the trained images and compares them with those of the new image. To minimize errors, an SVM algorithm is used.
Borra et al. [5] proposed an automated attendance system using face recognition techniques. In a biometric-based system [8] that used fingerprints, scanning the fingerprints of each and every student was found to be a difficult and time-consuming task. The RFID-based attendance system was also not found effective, as demanding a large number of unique identity cards (UID) makes it somewhat expensive, and it cannot identify the person swiping the card, which increases the possibility of proxies [5].
The drawbacks of the models mentioned above are largely removed in face recognition-based attendance systems [9–11]. The face is the most important part of the human body for identification, and we personally distinguish people by looking at their faces. The advantage of this approach is that students can mark their attendance simply by looking at a camera, which can be placed anywhere in the class.
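The two face quality measures attributed to Bhattacharya et al. [3] earlier in this section, the yaw angle formula and the variance of the image Laplacian, can be illustrated with a minimal pure-Python sketch. It is not the cited authors' implementation: the text does not say which two points (x1, y1) and (x2, y2) are used, so they are treated here as two arbitrary facial landmark points, and the 4-neighbour Laplacian kernel and the toy images are assumptions made for illustration.

```python
import math


def yaw_angle(x1, y1, x2, y2):
    """Yaw estimate as quoted in the text: abs(arctan2(y2 - y1, x2 - x1)).

    The result is in radians. Which landmarks the two points are is an
    assumption; the text does not specify them.
    """
    return abs(math.atan2(y2 - y1, x2 - x1))


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the interior pixels.

    `gray` is a 2-D grayscale image given as a list of rows. A low value
    suggests a blurry (low-detail) image, a high value a sharp one.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            responses.append(
                gray[r - 1][c] + gray[r + 1][c]
                + gray[r][c - 1] + gray[r][c + 1]
                - 4 * gray[r][c]
            )
    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


# Toy images (hypothetical data): a flat patch and a patch with a sharp line.
flat = [[10] * 5 for _ in range(5)]
edgy = [[0, 0, 255, 0, 0] for _ in range(5)]
```

In a sketch like this, a frame would be kept for recognition only if its Laplacian variance exceeds some threshold; the threshold itself would have to be chosen experimentally.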
3 Implementation
First, the images of the students are collected and stored in a folder with unique picture ids, and the email ids of the parents are taken as input in the form of a list. We have used the face_recognition library for detecting and recognizing faces. It is used to identify all the faces in a given image, identify facial features, extract the locations of all faces, and generate face encoding vectors of 128 values. Based on these encodings, we can measure the similarity between two faces.
3.1 Image Capturing
The image of the person is captured using the Webcam of the computer, which is accessed through the OpenCV (Open Source Computer Vision) library. OpenCV is available for Python and is used here to capture an image.
3.2 Face Locating
In this process, the exact location of each face is found. The input is a normal image, and the output is the locations of all the faces in it, obtained using the function calls below (Fig. 1).
    import face_recognition
    image = face_recognition.load_image_file("image.jpg")
    face_locations = face_recognition.face_locations(image)
Fig. 1 Face locating process
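As an illustration of how the output of the face locating step can be used, the sketch below works on the (top, right, bottom, left) tuples that face_recognition.face_locations() returns. To stay runnable without a camera or the library, it uses a hard-coded location and a tiny nested-list image; both are hypothetical stand-ins, and the helper names box_size and crop are not part of the library.

```python
def box_size(location):
    """Return (width, height) of a face box given as (top, right, bottom, left)."""
    top, right, bottom, left = location
    return right - left, bottom - top


def crop(image, location):
    """Cut the face region out of an image stored as a list of pixel rows."""
    top, right, bottom, left = location
    return [row[left:right] for row in image[top:bottom]]


# Hypothetical stand-ins for a real image and a real face_locations() result.
image = [[r * 10 + c for c in range(6)] for r in range(5)]
sample_locations = [(1, 4, 3, 2)]

face = crop(image, sample_locations[0])
width, height = box_size(sample_locations[0])
```

With a real image loaded as a NumPy array, the same crop is usually written as image[top:bottom, left:right].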
Fig. 2 Procedure of face detection
3.3 Face Detection
Face detection is the process of detecting human faces in a picture, that is, locating the facial part of the image and distinguishing it from all other patterns in the image. The basic idea is to extract the features of the face for use in different applications. In our system, face detection plays an important role as the initial step toward face recognition. The detection process faces many complications, such as lighting, the distance of the target person from the camera, the surroundings, and more. Several methods are used for face detection, such as the Viola and Jones face detection algorithm, local binary patterns, the AdaBoost adaptive algorithm, and SNoW classifier methods. In this project, we use the local binary pattern histogram (LBPH) [6] (Fig. 2).
3.4 Face Encoding
Face encoding is the process of differentiating one face from another. Comparing a captured face directly with each and every stored image would be a very long and time-consuming process. A reliable solution is to take measurements of each face; we produce 128 measurements per face. The training procedure takes three facial images as input at a time:
• A training face image of a known person is loaded.
• Another picture of the same known person is loaded.
• A picture of a totally distinct person is loaded.
After this procedure is repeated millions of times for millions of images of thousands of people, the network is trained, and it gives very close measurements for at least ten different images of the person in front of the camera (Fig. 3).
Fig. 3 Process of face encoding
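The three-image training step described above resembles what is commonly called a triplet objective: the measurements of two images of the same person are pulled together while those of a different person are pushed away. The sketch below shows that idea only; the text does not state which loss the underlying network was trained with, so the function triplet_loss, the plain (unsquared) Euclidean distance, and the margin value are all assumptions, and the 2-value vectors stand in for real 128-value encodings.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long measurement vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero when the same-person pair is closer than the different-person
    pair by at least `margin`; positive otherwise (assumed formulation)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-value "encodings" (hypothetical data).
known_person = [0.0, 0.0]
same_person = [0.0, 0.1]
other_person = [1.0, 0.0]

good = triplet_loss(known_person, same_person, other_person)
bad = triplet_loss(known_person, other_person, same_person)
```

A training loop would adjust the network so that this value approaches zero across many such triplets.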
3.5 Local Binary Pattern Histogram
The local binary pattern histogram is one of the most popular and frequently used methods for extracting facial features from an input image (Fig. 4). It is a simple and efficient texture operator that labels every pixel of an image by thresholding its neighboring pixels against the pixel's own value and gives the result as a binary number. This process was introduced by Ojala et al. [7].
Fig. 4 LBPH based face recognition flowchart
This method divides the image into many small sections from which the facial features are extracted in order to describe the shape of the stored image. The original local binary pattern (LBP) operator was calculated using a 3 × 3 kernel. The decimal value of the LBP code was calculated using the formula:

$$\mathrm{LBP}(x_c, y_c) = \sum_{n=0}^{7} f(i_n - i_c)\, 2^{n} \tag{1}$$

where, in Eq. (1),
$i_c$ is the gray intensity of the central pixel,
$n$ indexes the available sample points (the eight neighbors in the 3 × 3 kernel), and
$i_n$ is the gray intensity of the $n$-th surrounding pixel.
The function $f(x)$ is defined in the following way:

$$f(x) = \begin{cases} 0, & \text{if } x < 0 \\ 1, & \text{if } x \ge 0 \end{cases} \tag{2}$$

Fig. 5 Three levels of facial description [6]
Later, the operator was extended to a 64-region representation in which the image is arranged as an 8 × 8 grid [6]. The method works by dividing the image into several subareas, each described by its binary patterns; dividing the image in 8 × 8 form gives 64 subareas. These patterns describe the pixels surrounding each location in the region. The features obtained from each subarea are joined to form a single histogram, which represents the image (Fig. 5).
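The LBP code and the joined histogram described in this section can be sketched in pure Python as follows. This is an illustrative reading of the text, not the authors' code: the clockwise neighbour ordering, the choice to treat a zero difference as 1 (the usual LBP convention), and the function names are assumptions.

```python
def lbp_code(patch):
    """Decimal LBP code of the centre pixel of a 3 x 3 patch (list of rows)."""
    center = patch[1][1]
    # Neighbours taken clockwise from the top-left corner (assumed ordering).
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2], patch[1][2],
        patch[2][2], patch[2][1], patch[2][0], patch[1][0],
    ]
    return sum((1 if i_n - center >= 0 else 0) << n
               for n, i_n in enumerate(neighbours))


def lbp_histogram(gray):
    """256-bin histogram of LBP codes over the interior pixels of a region."""
    hist = [0] * 256
    if len(gray) < 3 or len(gray[0]) < 3:
        return hist  # region too small to hold a 3 x 3 patch
    for r in range(1, len(gray) - 1):
        for c in range(1, len(gray[0]) - 1):
            patch = [row[c - 1:c + 2] for row in gray[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


def lbph_descriptor(gray, grid=8):
    """Split the image into grid x grid subareas and join their histograms."""
    height, width = len(gray), len(gray[0])
    descriptor = []
    for gr in range(grid):
        for gc in range(grid):
            cell = [row[gc * width // grid:(gc + 1) * width // grid]
                    for row in gray[gr * height // grid:(gr + 1) * height // grid]]
            descriptor.extend(lbp_histogram(cell))
    return descriptor


# Hypothetical 24 x 24 test image, so that every subarea is exactly 3 x 3.
image = [[(r * 7 + c * 13) % 256 for c in range(24)] for r in range(24)]
descriptor = lbph_descriptor(image)
```

Two faces can then be compared by measuring the distance between their descriptors; which distance the system uses is not stated in the text.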
3.6 Face Recognition
Facial recognition is a technique for identifying the face of a student captured by the Webcam by comparing it with all the faces present in the database. After face recognition, the name corresponding to the captured image is obtained. It is performed using two functions from the face_recognition library:
• compare_faces()
• face_distance()
compare_faces() This function recognizes the captured face only when it matches one of the faces in the database within a tolerance, which is a maximum distance between encodings of 0.6 by default. The faces are not compared directly; their encodings are compared. The faces of the students stored in the database are encoded into 128-dimensional vectors and stored in a list (know_faceencodings), and every image captured and detected by the Webcam is also encoded (encoded_image). The list (know_faceencodings) and the encoded image are passed as parameters to the
Fig. 6 Recognizing the name using compare_faces() (panels: Input, Output)
compare_faces() function. A tolerance can also be passed as a parameter; it is the threshold on the distance between the encoding of the captured face and that of a face in the database, and a lower value makes the match stricter. If no tolerance is passed, the default value of 0.6 is used. If the detected face matches any of the faces in the database within the tolerance, the name of the student is obtained from the index of the matching face encoding, since the names of the students are stored in a list with the same indexes as the encodings.
    face_recognition.compare_faces(know_faceencodings, encoded_image)
Figure 6 shows an image given as input to compare_faces(); the encoding of this face matches the encoding of the face with the name “Michael Jackson”. If the detected face cannot be recognized by compare_faces(), it can be recognized using face_distance().
face_distance() This function finds the Euclidean distance from the captured face to each of the faces stored in the database. The distance is calculated between the face encodings, not the faces directly. As before, the stored faces are encoded into 128-dimensional vectors held in the list know_faceencodings, and every captured and detected image is also encoded (encoded_image). The list and the encoded image are passed as parameters to face_distance().
    face_recognition.face_distance(know_faceencodings, encoded_image)
face_distance() returns the list of Euclidean distances from the encoded image to all the face encodings stored in the database. The encoded face is matched with the face at the index of the minimum Euclidean distance, and the name of the student is obtained from that index. If the minimum Euclidean distance is greater than 0.5, the face does not match any face in the database, and the name of the student is marked as unknown.
Figure 7 shows an image given as input to face_distance(); the encoding of this face matches the encoding of the face with the name “Virat Kohli”. Figure 8 shows an input image whose encoding does not match any of the encodings of the faces in the database, so the name of the student is marked as unknown.
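The nearest-encoding lookup described in this section can be sketched without the face_recognition library as follows. The sketch mirrors the described logic (Euclidean distance to every stored encoding, name taken from the index of the minimum, "unknown" above 0.5), but the function names, the toy 3-value encodings, and the student names are hypothetical; real encodings have 128 values.

```python
import math


def face_distances(known_encodings, encoding):
    """Euclidean distance from one encoding to every stored encoding."""
    return [math.sqrt(sum((k - e) ** 2 for k, e in zip(known, encoding)))
            for known in known_encodings]


def identify(known_encodings, known_names, encoding, threshold=0.5):
    """Name at the index of the smallest distance, or "unknown" when even
    the smallest distance is greater than the threshold."""
    distances = face_distances(known_encodings, encoding)
    best = min(range(len(distances)), key=distances.__getitem__)
    return known_names[best] if distances[best] <= threshold else "unknown"


# Toy database (hypothetical data): names share indexes with encodings.
know_faceencodings = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
known_names = ["student_a", "student_b"]

match = identify(know_faceencodings, known_names, [0.1, 0.0, 0.0])
stranger = identify(know_faceencodings, known_names, [5.0, 5.0, 5.0])
```

Keeping names and encodings in two parallel lists, as the text describes, makes the index of the best distance directly usable as the index of the name.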
Fig. 7 Recognizing the name using face_distance() (panels: Input, Output)
Fig. 8 Marking unknown for the faces that are not present in the database (panels: Input, Output)
3.7 Attendance Marking
Attendance is marked for those students whose faces are captured, detected, and recognized. It is marked by storing the name of the student (of the recognized face) and the status “present” (since the face is detected) in the Firebase database using the put() function, with the date acting as the root. The same data, the name of the student and the status “present”, is also added as a row to the Excel sheet using the append() function. The attendance of absentees is marked similarly. The list of absentees is obtained by subtracting the list of names of students whose faces are recognized from the list of names of all students in the database. Rows with the name of the student and the status “absent” are then appended to the Excel sheet.
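The bookkeeping described above, deriving the absentees by subtraction and building the rows and the date-rooted record, can be sketched with the standard library alone. This is an assumption-laden stand-in: a CSV string replaces the Excel sheet, a nested dictionary replaces the Firebase write, and the function names and student names are hypothetical.

```python
import csv
import io
from datetime import date


def build_attendance(all_students, recognized):
    """Return (rows, absentees): present students first, then absentees."""
    present = [name for name in all_students if name in recognized]
    absentees = [name for name in all_students if name not in recognized]
    rows = [(name, "present") for name in present]
    rows += [(name, "absent") for name in absentees]
    return rows, absentees


def to_record(rows, day):
    """Nested record with the date as root, as the text describes."""
    return {day.isoformat(): {name: status for name, status in rows}}


def to_csv(rows):
    """Two-column sheet (Student Name, Status) rendered as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Student Name", "Status"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical class list and recognition result.
rows, absentees = build_attendance(["a", "b", "c"], {"b"})
record = to_record(rows, date(2020, 1, 15))
sheet = to_csv(rows)
```

The absentee list produced here is what the email step in the next section would iterate over.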
3.8 Sending Email
An email is sent to the mail id of each absentee's parent, which is already present in the database. The classes of the email.mime package are used for creating the email.
Creating Email MIMEMultipart, a class in the email.mime package, is used to create the email. This class creates a message object that contains From, To, and Subject as attributes. From is the sender address, which is [email protected]. To is the
receiver's (absentee's parent's) address, and Subject is the subject of the email. MIMEText, another class in the email.mime package, is used to attach the body of the email.
Sending Email The smtplib module of Python is used to send an email to the ward's parents, using the sendmail() function. First, an SMTP session is created by passing the server address and port number:
    s = smtplib.SMTP('smtp.gmail.com', 587)
The TLS (Transport Layer Security) connection is started:
    s.starttls()
Authentication is done by passing the email address and password:
    s.login(fromaddr, password)
Then, the email is sent with sendmail() by passing the from address, the to address, and the email as parameters:
    s.sendmail(fromaddr, toaddr, Email)
After sending the email, the SMTP session must be terminated:
    s.quit()
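The message creation and sending steps of this section can be put together in one hedged sketch using only the standard library. The addresses and the student name are placeholders, the body wording is an assumption based on the message described in the Results section, and send_email is defined but not called here so that the example runs without network access or credentials.

```python
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_absence_email(fromaddr, toaddr, student_name):
    """Create the message object with From, To, Subject and a text body."""
    msg = MIMEMultipart()
    msg["From"] = fromaddr
    msg["To"] = toaddr
    msg["Subject"] = "Absent from class"
    body = f"{student_name} is absent from class. Please take action."
    msg.attach(MIMEText(body, "plain"))
    return msg


def send_email(msg, password, host="smtp.gmail.com", port=587):
    """Open an SMTP session, upgrade it to TLS, log in and send the message.

    The password is passed in by the caller rather than written in the code.
    The with-block closes the session, which replaces an explicit quit().
    """
    with smtplib.SMTP(host, port) as session:
        session.starttls()
        session.login(msg["From"], password)
        session.sendmail(msg["From"], msg["To"], msg.as_string())


# Placeholder addresses and name (hypothetical data); nothing is sent here.
msg = build_absence_email("teacher@example.com", "parent@example.com", "Student A")
```

Separating message construction from sending keeps the part that needs a network connection small and makes the message itself easy to inspect.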
4 Results
The smart attendance monitoring system using face recognition is simple and works efficiently. We were able to detect the student faces using a Webcam, recognize those faces, mark the attendance of the students present in the class in an Excel sheet, and send an email to the absentees' parents.
4.1 GUI of the System
A graphical user interface (GUI) is a system of interactive visual components for computer software. Figure 9 shows the front end of the smart attendance monitoring system. It is built using the Tkinter library, a Python interface for developing GUIs. It consists of a textbox which takes the subject name as input. When a valid subject name is entered, the Webcam opens to detect the student faces. If an invalid subject name is entered, an error message is shown.
4.2 Face Recognition
Figure 10 shows how the system recognizes the students who are present in the class. When the Webcam opens, it detects all the faces and compares the detected images with the images in the database. The faces which are
Fig. 9 GUI of the system
Fig. 10 Recognizing the students who are there in the class
not in our database are marked as unknown, and the remaining faces are marked with corresponding student names. Once the recognition is done, attendance is marked for the students.
4.3 Attendance List
Figure 11 shows the Excel sheet generated automatically by the system. It consists of two columns: Student Name and Status. The Status column tells us whether a student is
Fig. 11 Attendance list generated in excel sheet
present or absent. Once attendance has been marked, an email is sent to the absentees’ parents.
4.4 Absentees Email
An email is sent to the absentees' parents with “absent from class” as the subject and a message stating that their child is absent from the class and asking them to take action. Once all the emails are sent, a success message is displayed.
5 Conclusion
The smart attendance monitoring system using face recognition is a good model for marking the attendance of students in a classroom. The system has numerous advantages, such as reducing the possibility of false attendance and reducing manual work. Today, a large number of devices use biometrics-based attendance, yet facial recognition appears to be a feasible solution because of its high accuracy and minimal human intervention. The system mainly focuses on offering a significant degree of security. Thus, a highly efficient attendance system for classroom attendance needs to be developed that can perform recognition on multiple persons at a single instance. Also, there is no necessity
of any special hardware for its implementation. A camera/Webcam to detect faces, a desktop, and database servers are sufficient for building the smart attendance system.
References
1. R.A. Rashid, N.H. Mahalin, M.A. Sarijari, A.A. Abdul Aziz, Security system using biometric technology: design and implementation of voice recognition system (VRS), in International Conference on Computer and Communication Engineering (2008), pp. 898–902
2. T. Lim, S. Lim, M. Mansor, RFID based attendance system. IEEE Symp. Ind. Electron. Appl. 2, 778–782 (2009)
3. S. Bhattacharya, G.S. Nainala, P. Das, A. Routray, Smart attendance monitoring system (SAMS): a face recognition based attendance system for classroom environment, in 2018 IEEE 18th International Conference on Advanced Learning Technologies (ICALT) (Mumbai, 2018), pp. 358–360
4. H.S. Mohana, U. Mahanthesha, Smart digital monitoring for attendance system, in 2018 International Conference on Recent Innovations in Electrical, Electronics & Communication Engineering (ICRIEECE) (Bhubaneswar, India, 2018), pp. 612–616
5. T. Ojala, M. Pietikainen, D. Harwood, A comparative study of texture measures with classification based on feature distributions. J. Intell. Learn. Syst. Appl. Pattern Recogn. 29 (1996)
6. T.F. Pereira, M.A. Angeloni, F.O. Simões, J.E.C. Silva, Video-Based Face Verification with Local Binary Patterns and SVM Using GMM Supervectors (Springer, 2012), pp. 240–252
7. H. Hassani, X. Huang, E. Silva, Digitalization and big data mining in banking. Big Data Cogn. Comput. 1–13, 2, 18 (2018)
8. P. Surekha, S. Sumathi, A soft computing based online face recognition system using a robot as an intelligent surveillance agent, in National Conference on Soft Computing Techniques for Engineering Applications (Rourkela, 2006)
9. S. Anand, K. Bijlani, S. Suresh, P. Praphul, Attendance monitoring in classroom using smartphone & Wi-Fi fingerprinting, in IEEE 8th International Conference on Technology for Education, T4E 2016 (Institute of Electrical and Electronics Engineers Inc., 2017), pp. 62–67
10. K. Nimmy, M. Sethumadhavan, Biometric authentication via facial recognition, in International Conference on Computer and Communication Engineering (2014)
11. G. Mahesh, K.R. Jayahari, K. Bijlani, A smart phone integrated smart classroom, in 2016 10th International Conference on Next-Generation Mobile Applications, Security and Technologies (NGMAST) (Cardiff, 2016), pp. 88–93
An Energy-Efficient Wireless Sensor Deployment for Lifetime Maximization by Optimizing Through Improved Particle Swarm Optimization T. Venkateswarao and B. Sreevidya
Abstract At present, the main challenge in wireless sensor networks (WSNs) is the energy consumption of the sensor nodes. To address this issue, many researchers have proposed protocols that improve the network lifetime while consuming less energy at the sensor nodes. Although energy consumption can be reduced with many existing protocols, this paper uses a new technique, an advanced particle swarm optimization algorithm, to select the cluster head (CH). In addition, the MLE distance variance method is used to detect unwanted or malicious nodes and stop their communication with any node, which further reduces energy consumption. If the CH dies unexpectedly, the D-CH is activated and works as the cluster head. Finally, we conclude that the proposed protocol gives better accuracy than the LEACH and PSO protocols, as well as a better lifetime, PDR, and throughput.
Keywords Particle swarm optimization (PSO) · Low-energy adaptive clustering hierarchy (LEACH) · Energy consumption · Network lifetime
T. Venkateswarao (B) · B. Sreevidya
Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Bengaluru, India
B. Sreevidya e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_3

1 Introduction
Wireless Sensor Network (WSN): A network in which the nodes and the sink (BS) have no wired connection between them is known as a WSN; it communicates data such as temperature, pressure, and other environmental readings through the sensor nodes.
Particle Swarm Optimization: This is an optimization technique that can be used to find the optimized path between the nodes in the network and
establish a better path from the node to the CH and the BS. It reduces the problem to finding the best result according to a measurement.
Particle: A small localized object that has several physical or chemical properties, e.g., mass, volume, etc.
Swarm: A group of things that travel together from one place to another, e.g., a crowd or a flock; here, it is a collection of nodes.
Optimization: Optimization means reducing a large amount of energy or a problem.
Cluster Head: The cluster head is a central element of the network: among the nodes present in the network or cluster, one is selected as the main node, known as the cluster head (CH), and it communicates the data from the cluster members to the CH, from CH to CH, and from CH to BS.
In a WSN, every node is both a sensor and a router, and its computing ability, storage space [1], and communication capability are limited. Many wireless sensor applications deploy sensors in harsh environments, where failed nodes cannot be replaced [2] or replacement is expensive, so in many situations the WSN nodes operate for a long period of time without a battery change. Energy usage is therefore the main concern in WSNs today, and different protocols such as LEACH, PSO, SEP, and ACO have been used to reduce the energy consumption of the sensor nodes. Here, we implement a new protocol: improved particle swarm optimization (IPSO) is used to select the cluster head (CH), and MLE is used to detect malicious nodes.
First, we deploy the sensor nodes in the network simulator. The distance between each pair of sensor nodes is then computed from their X, Y coordinates using mathematical formulae 1 and 2. After calculating the distances, we use relay nodes to distribute energy equally among the sensor nodes; the relay nodes balance the energy load [3] of each cluster head. The benefit of using relay nodes in the network is that the cluster head requires no additional energy, so the fitness function can be built from the remaining energy and the distance between the cluster head and the sink.
The remaining sections of this paper are organized as follows. Section 2 describes the approach used in this paper and the clustering protocol, Sect. 3 presents the optimization of the solution, and Sect. 4 presents the improved protocol. In Sect. 5, we analyze the energy consumption of the advanced approach, and finally, the experimental results and the conclusions are provided in Sect. 6.
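The distance computation from X, Y coordinates and the idea of a fitness built from remaining energy and distance to the sink, both mentioned in this introduction, can be illustrated with a small sketch. The paper's own formulae are not reproduced in this part of the text, so everything beyond the Euclidean distance is an assumption: the weighted form of fitness, the weights, the node dictionary layout, and the function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def fitness(residual_energy, dist_to_sink, w_energy=0.5, w_dist=0.5):
    """Assumed fitness: higher for more remaining energy and for a
    shorter distance to the sink. Not the paper's formula."""
    return w_energy * residual_energy + w_dist / (1.0 + dist_to_sink)


def pick_cluster_head(nodes, sink):
    """Choose the node with the highest fitness as cluster head."""
    return max(nodes, key=lambda n: fitness(n["energy"], distance(n["pos"], sink)))


# Hypothetical deployment: two nodes and a sink position.
nodes = [
    {"id": 1, "pos": (0.0, 0.0), "energy": 0.9},
    {"id": 2, "pos": (10.0, 10.0), "energy": 0.2},
]
sink = (1.0, 1.0)
head = pick_cluster_head(nodes, sink)
```

In a particle swarm setting, a function of this kind would be evaluated for each candidate solution; the greedy max() above only illustrates how the two quantities could be combined.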
An Energy-Efficient Wireless Sensor Deployment …
2 Related Works
Our literature survey focuses on maximizing network lifetime with different routing protocols that improve the energy usage of WSNs. The main problem currently faced in wireless networks is the power maintenance of the nodes: sensor nodes are deployed once and cannot be recharged, so energy problems arise in the network. In [1], the author uses a genetic algorithm (GA) to deploy a minimum number of nodes in a 2D space and reports the best accuracy with the GA. The GA is used to set the spacing between sensor nodes and to minimize the distance from the CH to the BS, but the sensor nodes are connected by a path to the base station and only a small number of nodes is considered, so the lifetime of the sensor nodes is not increased as much as with our protocol. In [2], the author uses the PSO algorithm to select fewer cluster heads and to minimize the power usage of the sensor nodes. The comparison there is with the basic LEACH-C routing protocol, and the results are not better than those of our protocol, because the basic PSO [4] considers only the distance between nodes when selecting the cluster head. The authors of [3, 5] also use a PSO approach that optimizes the cluster heads and decreases the power usage of the sensor nodes. Lifetime maximization, the most important issue in WSNs, is addressed in [6, 7] with the ant colony optimization (ACO) technique, which is used to select the cluster head (CH) and to find the shortest route from the sender to the destination (BS). The author uses a diffusion protocol that finds the remaining power of the sensor nodes, and the results are shown using MATLAB.
The author of [8] notes that efficient communication is becoming a major problem in WSNs, alongside lifetime maximization. Both are addressed with a hierarchical routing protocol based on a dynamic clustering procedure (HRP-DCP), which forms the clusters and selects the cluster heads and which is reported to give the best results for lifetime maximization. The author of [9] describes a hybrid clustering approach that identifies and selects the cluster head (CH). When the energy of the cluster head is exhausted, it indirectly informs the cluster nodes so that clustering is redone in the upcoming rounds, and better results are reported than with other protocols. The author of [3] explains the hot spot problem in WSNs, which causes sensor nodes to die early, and proposes grid-based clustering (GBC) to reduce energy consumption; the CH is selected using the GBC protocol, and the relation between clusters of different sizes is studied, with better results at the end. The author of [10] uses the LEACH algorithm for cluster head election; the improved LEACH-TLCH selects two CHs at a time to optimize the power utilization of the sensor nodes in the network.
T. Venkateswarao and B. Sreevidya
The author of [5] applies a PSO approach with a mobile sink, which is an NP-hard problem. PSO is used to select the rendezvous points and the optimal node among all the nodes in the cluster, and a weight method [11] is computed for the proposed model from the number of data packets received from the remaining nodes. The improved algorithm is compared with WRPBS and gives better throughput and lifetime. The authors of [12, 13] consider heterogeneous networks in which the stable election protocol (SEP) optimizes communication within the cluster and BP is used to fuse the data taken from the CH into a group. The authors of [14, 15] explain clustering mechanisms for ad hoc networks and multi-level clustering.
3 Model Assertion
3.1 Network Model
The sensor nodes and the sensor network must satisfy the following assumptions.
1. The starting power of every sensor node is equal at the time of deployment.
2. Every sensor node has the same capability, whether it is sensing or communicating with the BS through the cluster.
3. Each sensor node is numbered based on its location.
4. Every sensor node has the same transmission levels.
5. The sensed data is highly correlated, so the cluster head summarizes the data from the cluster's data packets.
6. The base station is externally powered.
3.2 Energy Model
This paper uses a predefined energy model that accounts for path loss. The energy required to transmit a k-bit packet over a distance d, the distance between source and destination, is
\[ E_{TX}(k,d) = \begin{cases} k \times E_{elec} + k \times E_{fs} \times d^{2}, & \text{if } d \le d_0 \\ k \times E_{elec} + k \times E_{mp} \times d^{4}, & \text{if } d > d_0 \end{cases} \tag{1} \]
Here, $E_{TX}$ is the transmission energy, $E_{elec}$ is the energy dissipated per bit by the transmitter or receiver electronics, $E_{fs}$ and $E_{mp}$ are the amplifier coefficients of the free-space and multipath models, and d is the distance between the two nodes.
$d_0$ is the threshold communication distance (TCD), calculated as
\[ d_0 = \sqrt{\frac{E_{fs}}{E_{mp}}} \tag{2} \]
The radio energy required to receive a k-bit message is
\[ E_{RX}(k) = k \times E_{elec} \tag{3} \]
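The energy model of Eqs. (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator code: the numeric constants are commonly used first-order radio model values and are assumptions, since the text does not list its parameter values.

```python
import math

# Assumed illustrative constants (not given in the paper).
E_ELEC = 50e-9      # J/bit, transmitter/receiver electronics
E_FS = 10e-12       # J/bit/m^2, free-space amplifier coefficient
E_MP = 0.0013e-12   # J/bit/m^4, multipath amplifier coefficient


def threshold_distance(e_fs=E_FS, e_mp=E_MP):
    """Eq. (2): d0 = sqrt(E_fs / E_mp)."""
    return math.sqrt(e_fs / e_mp)


def tx_energy(k, d):
    """Eq. (1): energy to transmit a k-bit packet over distance d."""
    if d <= threshold_distance():
        return k * E_ELEC + k * E_FS * d ** 2
    return k * E_ELEC + k * E_MP * d ** 4


def rx_energy(k):
    """Eq. (3): energy to receive a k-bit packet."""
    return k * E_ELEC
```

With these assumed constants the threshold distance is roughly 88 m, so a 50 m hop uses the free-space branch and a 100 m hop uses the multipath branch.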
4 Clustering Protocol
This section describes the clustering procedure, in which the data sensed by a node is transferred to the cluster head (CH), which communicates with the BS on behalf of the remaining sensor nodes. The clustering scheme proposed in this section groups sensor nodes based on their neighbor nodes and their energy levels.
4.1 Cluster Head Selection
Initially, we consider M sensor nodes deployed in the simulated network field and denote the set of cluster heads as CH = {CH_1, CH_2, CH_3, …, CH_n}. Within a cluster, the cluster head is responsible for communicating the data of its members and of the relay node assigned to it. The destination (BS) elects the cluster heads based on the neighbor nodes of the sensor nodes and the distance between each node and the destination. The fitness function is
\[ F_{CH} = \alpha \times R^{CH}_{energy} + (1-\alpha) \times R^{CH}_{location} \tag{4} \]
In Eq. (4), $F_{CH}$ has two parts, and α weights the contributions of $R^{CH}_{energy}$ and $R^{CH}_{location}$ to the fitness function. $R^{CH}_{energy}$ is the ratio of the mean remaining energy of the cluster heads to the mean remaining energy of the non-cluster-head nodes in the present round:
\[ R^{CH}_{energy} = \frac{\bar{E}_{CH}}{\bar{E}_{NCH}} = \frac{\sum_{\forall node_j \in CH} E^{res}_{CH}(j) / |CH|}{\sum_{\forall node_i \in NCH} E^{res}_{NCH}(i) / |NCH|} \tag{5} \]
where NCH denotes the set of non-cluster-head nodes.
$\bar{E}_{CH}$ is the mean remaining energy of the cluster heads, and $\bar{E}_{NCH}$ is the mean remaining energy of the non-CH nodes. Maximizing $R^{CH}_{energy}$ selects cluster heads according to the energy level of each node. $R^{CH}_{location}$ is the ratio of the mean distance from the non-cluster-head nodes to the BS to the mean distance from the cluster heads to the BS:
\[ R^{CH}_{location} = \frac{\bar{D}_{NCH}}{\bar{D}_{CH}} = \frac{\sum_{\forall node_i \in NCH} d(node_i, BS) / |NCH|}{\sum_{\forall node_j \in CH} d(node_j, BS) / |CH|} \tag{6} \]
Here, d(node_i, BS) is the distance between node i and the BS. The sensor nodes are battery powered, and recharging the battery is not possible. Every node has the same amount of energy when the network starts. The cluster head selection is shown in Algorithm 1.
Algorithm 1 Cluster Head (CH) Selection Procedure
Start
Repeat
    For i = 1 to N do
        Compute the energy and distance terms through (5) and (6)
    End for
    Update the cluster head through the improved PSO algorithm
Until the maximum number of iterations is reached
Finalize the selected node as 'CH'
Exit
If a node has high remaining battery power and is very close to the base station, it is elected as the cluster head without considering any other parameters.
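As an illustration of Eqs. (4) to (6), the following Python sketch evaluates the cluster head fitness for one candidate set of cluster heads. The data layout (a dictionary of `(x, y, residual_energy)` tuples), the function names, and the default `alpha = 0.5` are assumptions made for the example; the paper does not state the value of α.

```python
import math


def dist(a, b):
    """Euclidean distance between the (x, y) parts of two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def ch_fitness(nodes, ch_ids, bs, alpha=0.5):
    """F_CH of Eq. (4) for a candidate cluster head set.

    nodes  : dict id -> (x, y, residual_energy)   (hypothetical layout)
    ch_ids : set of ids proposed as cluster heads
    bs     : (x, y) position of the base station
    alpha  : weight between the two ratios (assumed value)
    """
    ch = [n for i, n in nodes.items() if i in ch_ids]
    non_ch = [n for i, n in nodes.items() if i not in ch_ids]
    # Eq. (5): mean CH energy over mean non-CH energy.
    r_energy = mean(n[2] for n in ch) / mean(n[2] for n in non_ch)
    # Eq. (6): mean non-CH distance to BS over mean CH distance to BS.
    r_location = mean(dist(n, bs) for n in non_ch) / mean(dist(n, bs) for n in ch)
    return alpha * r_energy + (1 - alpha) * r_location
```

A candidate set whose heads have more residual energy and lie closer to the BS scores higher, which matches the remark that such nodes are preferred as cluster heads.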
4.2 Relay Node Selection
The relay node assists the cluster head and acts as a safety node for it. A relay node must satisfy two conditions. First, like the cluster head, it must have a high energy level, because it handles a large amount of data. Second, it must lie on a short path from the cluster to the base station, because reducing the transmission distance saves energy in a WSN. The relay nodes are denoted RL = {RL_1, RL_2, RL_3, …, RL_n}. The fitness function is
\[ F_{RL} = \beta \times R^{RL}_{energy} + (1-\beta) \times R^{RL}_{location} \tag{7} \]
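$R^{RL}_{energy}$ is as follows:
\[ R^{RL}_{energy} = \frac{\bar{E}_{RL}}{\bar{E}_{CO}} = \frac{\sum_{\forall node_z \in RL} E^{res}_{RL}(z) / |RL|}{\sum_{\forall node_k \in CO} E^{res}_{CO}(k) / |CO|} \tag{8} \]
$R^{RL}_{location}$ is as follows:
\[ R^{RL}_{location} = \frac{\bar{L}_{CO}}{\bar{L}_{RL}} = \frac{\sum_{\forall node_k \in CO} \{ d(node_k, BS) + d(node_k, CH_j) \} / |CO|}{\sum_{\forall node_z \in RL} \{ d(RL_z, BS) + d(RL_z, CH_j) \} / |RL|} \tag{9} \]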
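The relay fitness of Eqs. (7) to (9) differs from the cluster head fitness mainly in its location term, which sums the distance to the BS and the distance to the cluster head CH_j. A minimal Python sketch under assumed conventions follows: nodes are `(x, y, residual_energy)` tuples, CO is taken to be the set of common (ordinary) nodes, and `beta = 0.5` is a placeholder because the paper gives no value.

```python
import math


def dist(a, b):
    """Euclidean distance between the (x, y) parts of two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def relay_fitness(relays, commons, ch_pos, bs, beta=0.5):
    """F_RL of Eq. (7) for candidate relays within one cluster.

    relays, commons : lists of (x, y, residual_energy) tuples
    ch_pos, bs      : (x, y) of the cluster head CH_j and the base station
    beta            : weight between the two ratios (assumed value)
    """
    def mean_energy(group):
        return sum(n[2] for n in group) / len(group)

    def mean_path(group):
        # Path length used in Eq. (9): node -> BS plus node -> CH_j.
        return sum(dist(n, bs) + dist(n, ch_pos) for n in group) / len(group)

    r_energy = mean_energy(relays) / mean_energy(commons)   # Eq. (8)
    r_location = mean_path(commons) / mean_path(relays)     # Eq. (9)
    return beta * r_energy + (1 - beta) * r_location
```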
4.3 Clustering Formation
The clustering formation phase has two sections.
(i) Grouping section. After the groups are located, the cluster heads are formed with the improved protocol based on the distance between the sensor nodes. A sample message is sent using the CSMA (MAC) protocol to test the distance from each sensor to the base station. Based on the packet delivery to the BS, the clusters are selected, and the cluster head sends the data to the BS.
(ii) Data communication section. After the cluster head is selected, its data is sent to the corresponding relay node and from there to the BS. The cluster head must be ready at any time to carry information from the sender to the destination, and the base station (BS) acts as the destination.
The entire process requires communication between the normal nodes and the cluster heads. The next section explains the improved algorithm, the steps to select the clusters and the CH, and, using mathematical expressions, the selection of the DCH.
5 IPSO Algorithm for Cluster Head Update
Particle swarm optimization (PSO) is a stochastic algorithm that solves optimization problems iteratively and finds the best solution by itself. It can find the shortest routing path [10] in a WSN. The main purpose of PSO here is to extend the lifetime of the sensor nodes and of the network, so after group formation the cluster head is selected with the improved PSO (Fig. 1). The algorithm has five steps.
A. Initializing the protocol parameters
Each particle has a velocity and a position, described as
\[ V_i = (v_1, v_2, v_3, \ldots, v_n) \]
Fig. 1 IPSO algorithm
\[ P_i = (p_1, p_2, p_3, \ldots, p_n) \]
where i represents the current state.
B. Computation of fitness values
Each particle searches a d-dimensional space, and its fitness value is computed from Eqs. (1) and (2). Each particle (node) is evaluated iteratively, and two solutions are formed, pbest and gbest:
\[ p_i = [p_{i1}, p_{i2}, \ldots, p_{id}] \quad (pbest) \]
\[ p_g = [p_{g1}, p_{g2}, \ldots, p_{gd}] \quad (gbest) \]
C. Update the position and velocity of the sensor node
The position of the node is updated as
\[ x_{ij}^{k+1} = x_{ij}^{k} + v_{ij}^{k+1} \tag{10} \]
and the velocity of the particle is computed as
\[ v_{ij}^{k+1} = w v_{ij}^{k} + c_1 r_1 (p_{ij}^{k} - x_{ij}^{k}) + c_2 r_2 (p_{gj}^{k} - x_{ij}^{k}) \tag{11} \]
where w is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$ and $r_2$ are random numbers in [0, 1].
D. Inertia weight updating
The inertia weight is varied because trapping at local optima can increase energy consumption, so the improved PSO uses
\[ w = (w_{max} - w_{min}) \times \frac{Iteration_{max} - Iteration_i}{Iteration_{max}} + w_{min} \tag{12} \]
Here, $w_{max}$ and $w_{min}$ are the maximum and minimum inertia weights, set to 0.9 and 0.4, respectively.
E. Repeat from step C until the termination condition is reached
When the iterations reach the termination state, the best solution is selected, and the cluster head is elected automatically.
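Steps A to E can be sketched as a small, generic PSO loop in Python. This is an illustrative sketch and not the authors' NS2 implementation: it minimizes a stand-in objective (the paper's fitness functions would be plugged in, negated if they are to be maximized), and the swarm size, bounds, `c1 = c2 = 2.0`, and the clamping of positions to the bounds are all assumptions. Only the inertia limits 0.9 and 0.4 come from the text.

```python
import random

W_MAX, W_MIN = 0.9, 0.4  # inertia limits stated in the text


def inertia(iteration, max_iter):
    """Eq. (12): linearly decreasing inertia weight."""
    return (W_MAX - W_MIN) * (max_iter - iteration) / max_iter + W_MIN


def pso(fitness, dim, n_particles=20, max_iter=100,
        bounds=(-10.0, 10.0), c1=2.0, c2=2.0, seed=0):
    """Minimize `fitness` over `dim` dimensions; returns (gbest, value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step A: initialize positions and velocities.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    # Step B: initial fitness, pbest and gbest.
    pbest = [p[:] for p in x]
    pbest_val = [fitness(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for it in range(max_iter):           # Step E: iterate until termination
        w = inertia(it, max_iter)        # Step D: update inertia weight
        for i in range(n_particles):
            for j in range(dim):         # Step C: Eqs. (11) and (10)
                r1, r2 = rng.random(), rng.random()
                v[i][j] = (w * v[i][j]
                           + c1 * r1 * (pbest[i][j] - x[i][j])
                           + c2 * r2 * (gbest[j] - x[i][j]))
                # Clamping to the bounds is an added assumption.
                x[i][j] = min(hi, max(lo, x[i][j] + v[i][j]))
            val = fitness(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val
```

For example, `pso(lambda p: sum(c * c for c in p), dim=2)` searches for the minimum of a simple quadratic; because gbest is only replaced by better values, the returned value never exceeds the best initial fitness.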
6 Analysis of Energy Consumption
Reducing energy consumption is the main goal of this paper, and it is achieved with the improved PSO algorithm. The N sensors are deployed in an M × M region and divided into n clusters. Every cluster has a relay node [7] for balancing energy among the nodes. The average number of nodes in a cluster is N/n, the area of each cluster is $M^2/n$, and the node density is uniform throughout, defined as
\[ \rho = \frac{1}{M^2/n} = \frac{n}{M^2} \tag{13} \]
The expected value of $d^2$ is
\[ E[d^2] = \frac{M^2}{2n\pi} \tag{14} \]
The energy used by a common node is
\[ E_{CO} = (1 - p_s)\,[E_{TX}(k,d) + E_{RX}(k)] + p_s E_s = (1 - p_s)\left[ k E_{elec} + k E_{fs} \times \frac{M^2}{2n\pi} + k E_{elec} \right] + p_s E_s \tag{15} \]
The energy used by the relay node is
\[ E_{RL} = (1 - p_s)\,[E_{TX}(k,d) + E_{RX}(k)] + p_s E_s = (1 - p_s)\left[ k E_{elec} + k E_{mp} d_{toBS}^{4} + k E_{elec} \right] + p_s E_s \tag{16} \]
The energy used by the cluster head is
\[ E_{CH} = E_{TX}(k,d) + \left( \frac{N}{n} - 2 \right) E_{RX}(k) + \frac{N}{n} k E_{DA} = k E_{elec} + k E_{fs} \frac{M^2}{2n\pi} + \left( \frac{N}{n} - 2 \right) k E_{elec} + \frac{N}{n} k E_{DA} \tag{17} \]
The energy used within a cluster is therefore
\[ E_{cluster} = E_{CH} + E_{RL} + \left( \frac{N}{n} - 2 \right) E_{CO} \tag{18} \]
and the total energy consumed by the sensor network is
\[ E_{total} = n E_{cluster} \tag{19} \]
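Equations (14) to (19) can be chained into one function that returns the network energy per round. The Python sketch below is only an illustration: the radio constants and the aggregation energy `E_DA` are assumed values, and because the text does not define `p_s` and `E_s`, they are exposed as plain parameters (`p_s`, `e_s`) with defaults of zero.

```python
import math

# Assumed illustrative constants (not given in the paper).
E_ELEC = 50e-9      # J/bit, electronics
E_FS = 10e-12       # J/bit/m^2, free-space amplifier
E_MP = 0.0013e-12   # J/bit/m^4, multipath amplifier
E_DA = 5e-9         # J/bit, data aggregation


def total_energy(N, n, M, k, d_to_bs, p_s=0.0, e_s=0.0):
    """E_total of Eq. (19) for N nodes in n clusters on an M x M field."""
    d2 = M ** 2 / (2 * n * math.pi)                 # Eq. (14)
    e_tx_intra = k * E_ELEC + k * E_FS * d2         # free-space hop inside a cluster
    e_rx = k * E_ELEC
    e_co = (1 - p_s) * (e_tx_intra + e_rx) + p_s * e_s              # Eq. (15)
    e_rl = ((1 - p_s) * (k * E_ELEC + k * E_MP * d_to_bs ** 4 + k * E_ELEC)
            + p_s * e_s)                                            # Eq. (16)
    e_ch = e_tx_intra + (N / n - 2) * e_rx + (N / n) * k * E_DA     # Eq. (17)
    e_cluster = e_ch + e_rl + (N / n - 2) * e_co                    # Eq. (18)
    return n * e_cluster                                            # Eq. (19)
```

Evaluating the function for several values of n shows how the number of clusters trades intra-cluster distance against the per-cluster overhead of the cluster head and relay node.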
7 Simulation and Results
This section explains the overall execution of the protocol in the NS2 simulator. The sensor nodes are deployed in the simulated network, and the cluster head is selected with the newly implemented IPSO algorithm, which is the main contribution of this paper. In case the cluster head dies, we add a further feature, the DCH (deputed, or additional, cluster head), which then acts as the cluster head. The removal of malicious nodes is also covered in the simulation. Finally, the energy consumption of the proposed protocol is lower than that of the basic LEACH protocol.
1. Code for the cluster head (CH) selection: this code selects the cluster head using the improved algorithm functions (Fig. 2).
2. Data collection and transmission: this code performs the data collection and transmission (Fig. 3).
Results
A. Energy utilization of the existing approach and implemented approach
The energy utilization of the proposed protocol is lower than that of the existing LEACH protocol. LEACH is a basic routing protocol that selects the cluster head based on a single parameter, energy, whereas the proposed protocol considers energy, the distance from the base station to the cluster, and inertia (Fig. 4).
Fig. 2 Group main selection
Fig. 3 Data collection
Fig. 4 Energy consumption
B. Throughput of the existing approach and implemented approach
The throughput of the proposed protocol is better than that of the existing protocol. LEACH uses only one parameter for CH selection, so its throughput is lower than that of IPSO (Fig. 5).
C. PDR of the existing approach and implemented approach
The packet delivery ratio (PDR) of the existing protocol is lower than that of the proposed protocol. LEACH consumes more energy to communicate, so its cluster heads are more likely to fail, which lowers its PDR (Fig. 6).
Fig. 5 Throughput
Fig. 6 PDR
Fig. 7 Comparison bar chart
Fig. 8 Comparison graph
D. Comparison graph of proposed and existing protocol
The graphs compare the lifetime, throughput, and packet delivery ratio (Figs. 7 and 8).
8 Conclusion
We conclude that the proposed approach minimizes the energy utilization of the sensor nodes, which is the main concern in wireless sensor networks, with CH selection based on IPSO. A DCH mechanism provides an additional cluster head that is activated when the main cluster head dies, and a relay node procedure is also used. Fake or malicious nodes can be detected, and communication with them is stopped; energy usage is further reduced using the bias and variance concept. Finally, the simulation results show that the improved protocol gives better lifetime and throughput than the existing protocol.
References
1. Y. Chen, X. Xu, Y. Wang, Wireless sensor network energy efficient coverage method based on intelligent optimization algorithm. Discrete Continuous Dyn. Syst. Ser. 12(4, 5) (2019)
2. K. Bennani, D. El Ghanami, Particle swarm optimization-based clustering in wireless sensor networks: the effectiveness of distance altering, in 2012 IEEE International Conference on Complex Systems (ICCS) (Agadir, 2012)
3. B. Sreevidya, M. Rajesh, Design and performance evaluation of an efficient multiple access protocol for virtual cellular networks, in International Conference on Computer Networks and Communication Technologies (ICCNCT), 2018
4. S.K. Singh, P. Kumar, J.P. Singh, An energy efficient protocol to mitigate hot spot problem using unequal clustering in WSN. Wireless Pers. Commun. 101, 799–827 (2018)
5. O. Younis, S. Fahmy, HEED: a hybrid, energy-efficient, distributed clustering approach for ad hoc sensor networks. IEEE Trans. Mobile Comput. 3(4), 366–379 (2004)
6. R. Arya, S.C. Sharma, Energy estimation of sensor nodes using optimization in wireless sensor network, in IEEE International Conference on Computer, Communication and Control (IC42015), 2015
7. M.V. Ramesh, Design, development, and deployment of a wireless sensor network for detection of landslides. Ad Hoc Netw. 2–18 (2014)
8. B. Sreevidya, M. Rajesh, Enhanced energy optimized cluster based on demand routing protocol for wireless sensor networks, in International Conference on Advances in Computing, Communications & Informatics (ICACCI'17) (Manipal University, Karnataka, 2017)
9. H. Aoudia, Y. Touati, A.A. Cherif, Energy optimization mechanism in wireless sensor, in International Conference on MOBILe Wireless MiddleWARE, Systems and Applications Networks, 2013
10. J. Wang, Y.Q. Cao, B. Li, H. Kim, S. Lee, Particle swarm optimization-based clustering algorithm with mobile sink for WSNS. Future Gener. Comput. Syst. 76, 452–457 (2017)
11. M. Rajesh, A. George, T.S.B. Sudarshan, Energy efficient deployment of wireless sensor network by multiple mobile robots, in International Conference on Computing and Network Communications (CoCoNet), 2015
12. S. Mini, S.K. Udgata, S.L. Sabat, Sensor deployment and scheduling for target coverage problem in wireless sensor networks. IEEE Sens. J. 14(3), 636–644 (2014)
13. V. Kavitha, K. Ganapathy, Efficient and optimal routing using ant colony optimization mechanism for wireless sensor networks. Period. Eng. Nat. Sci. 6, 171–181 (2018)
14. J. Wang, C. Ju, Y. Gao, A.K. Sangaiah, G.-J. Kim, A PSO based energy efficient coverage control algorithm for wireless sensor networks. Comput. Mater. Cont. 56, 433–446 (2018)
15. O.A. Amodu, R.A. Mahmood, Impact of the energy-based and location-based LEACH secondary cluster aggregation on WSN lifetime. Wireless Netw. 24, 1379–1402 (2018)
Improved Nutrition Management in Maize by Analyzing Leaf Images
Prashant Narayankar and Priyadarshini Patil
Abstract Identifying nutritional deficiency for efficient nutrient profile management in plants is one of the most critical research areas in automated precision agriculture. Managing nutrients such as nitrogen (N), potassium (K), and phosphorus (P) is a critical agronomic practice for attaining higher yield. In this paper, we propose an efficient automated nutrition management system for maize crops. The work deals with automatically identifying whether a maize crop suffers from nitrogen deficiency by analyzing its leaf image and, if so, detecting the type of deficiency so that it can be treated effectively. We develop a real-time image processing system using EmguCV, in which a given leaf image is first filtered to obtain the low-frequency leaf texture. A high pass filter is then applied to the image to detect outliers in the leaf texture. The outlier region is processed by histogram thresholding to determine whether the leaf has any deficiency. We use statistical features obtained from color histograms to classify the leaf deficiency.
Keywords Nitrogen (N) · Potassium (K) · Phosphorous (P) · Colour histogram · Nutritional deficiency · K-means clustering · Colour segmentation
1 Introduction
The world population is growing rapidly, and meeting everyone's agricultural needs is very challenging. As the population grows, a considerable rise in crop production is necessary to meet food needs. However, agriculture is often constrained by environmental conditions such as drought, high temperature, and nutrient deficiency. In such a situation, technological solutions are required to manage agriculture more efficiently. Automated precision agriculture is the need of the hour.
P. Narayankar (corresponding author) · P. Patil, School of Computer Science and Engineering, KLE Technological University, Hubballi 580031, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_4
P. Narayankar and P. Patil
Sensors and software that analyze each stage of crop development and trigger appropriate steps are needed to obtain optimal yield. Nutrition management is a vital part of precision agriculture and has to be done effectively to get maximum yield. Usually, crops are fertilized periodically. However, a scientific assessment of the nutrition profile, so that optimal amounts of the required nutrients are applied, is the key to better plant growth and avoids both under- and over-application of fertilizer. Applying excess fertilizer without understanding the nutritional requirements of a plant may in turn harm soil quality by altering salt concentration and hurting the soil microorganisms that maintain soil health. In this paper, the focus is on intelligent nutrition management in maize crops grown in the North Karnataka region of India. Nutrients applied to crops as fertilizer can be classified into two types: macronutrients, needed in significant amounts, and micronutrients, needed in smaller amounts [1]. Macronutrients include nitrogen, potassium, phosphorus, sulphur, calcium, and magnesium. Micronutrients include iron, boron, zinc, and copper [2]. Both are absorbed from the soil by the roots. These nutrients play a vital role in plant growth and metabolism. Figure 1 shows images of maize leaves with symptoms of various macro- and micronutrient deficiencies, along with their causes and effects. In maize, the required macronutrients are nitrogen, phosphorus, and potassium. Nitrogen (N) is an important nutrient essential for the healthy growth of plants. In maize, nitrogen deficiency first appears as yellowish patches on the leaf tip. As the leaves become older, the yellowish patch progresses towards the midrib. The condition is aggravated by cold or dry soil, poor fertilization, and leaching. Phosphorus (P) deficiency starts showing up in young maize plants. It appears as reddish to reddish-purple patches near the tips and progresses towards the margin in older leaves.
If phosphorus deficiency persists, plant growth is slow and the plant stays small. Phosphorus deficiency symptoms are visible while the plant is young; once the plant has grown to three feet or more, the symptoms do not appear even if a deficiency exists. Hence, phosphorus deficiency should be managed adequately at early stages to support optimal plant growth. It usually occurs in saturated, very wet, or very dry soil.
Fig. 1 Macronutrients deficiency in maize leaf (panels: Nitrogen Deficiency, Phosphorous Deficiency, Potassium Deficiency)
Improved Nutrition Management in Maize …
Potassium (K) deficiency is seen in the 4th to 6th week as yellowing and decay of the lower leaves, starting at the leaf margins. If the deficiency is not treated, it spreads from older leaves to younger leaves. Potassium-deficient maize has weak stalks, leading to late growth. Potassium deficiency occurs in wet soil, sandy soil, and dry conditions. The paper is organized as follows. Section 1 introduces nutrition management in maize and the nutrients that affect maize leaves, together with the motivation and application work. Section 2 describes significant advances in the literature related to our work. Section 3 discusses the methodology and the proposed work, with the logical flow of the computation. Section 4 describes the implementation, the algorithm, and the classification of the images considered. Section 5 discusses the results by comparing normal and abnormal images and their histograms. Section 6 concludes the paper. Finally, we acknowledge our mentor for supporting this work.
1.1 Motivation
Plant nutrition management is critical for optimal plant growth and yield; if it is neglected, the quality and quantity of the crop suffer. Nutrient deficiency is caused by environmental effects, poor sunlight, soil health and type, rainfall variations, and pollution. Deficiency symptoms appear mainly on the leaf, foliage, and stem parts of the plant and lead to growth retardation and yield loss, and hence to revenue loss for farmers. This motivated us to build a system that detects nutrient deficiency in crops by analyzing plant images, so that nutrition can be managed more effectively, and to address the problem with a software approach. In this paper, we work on nitrogen deficiency management in maize.
1.2 Application Work
Traditionally, evaluation was done through manual inspection by experts, which is tedious, strenuous, expensive, and subjective, and requires regular expert intervention, making the process unreliable. This calls for computing solutions. Vision-based analysis and image processing solutions, which analyze the symptoms by characterizing the shape, color, and texture of plant images, are effective [3]. Vision-based solutions use multiple techniques for problems such as recognition, segmentation, and classification, based on the features selected from the input images. Image processing solutions involve histogram computation, grayscale comparison, and many other techniques that depend on image features. In our problem, the deficiency symptoms appear mainly as visual variations in the color and texture of leaves, stems, and root parts. Hence, by analyzing images of the plant, we can identify and categorize the deficiencies effectively, and we have chosen an image processing solution for deficiency detection. Image processing has also worked effectively for applications such as online classification of leaf products, ranging from complex vision-guided robotic routine inspection to control [4].
2 Literature Review
Table 1 describes the papers relevant to our work.
Table 1 Literature review
| Paper title and authors | Insights |
| Sridevy et al. [1]. Nitrogen and potassium deficiency identification in maize by image mining, spectral and true color response. Indian Journal of Plant Physiology | This paper focuses on nitrogen and potassium deficiency in maize and identifies the symptoms through image mining, hyperspectral, and true color responses. It also provides a study to determine effective spectral ranges and significant component images of RGB intensities |
| Bankar et al. [6]. Plant disease detection techniques using canny edge detection and color histogram in image processing | This paper presents a method for identifying plant disease based on color, edge detection, and histogram matching |
| Rathod et al. [4]. Image processing techniques for the detection of leaf diseases | This paper surveys various image processing methods used for leaf disease detection |
| Al-Hiary et al. [5]. Fast and accurate detection and classification of plant diseases | This paper deals with (1) using k-means clustering to detect the infected object, (2) extracting texture features from the color co-occurrence of the identified objects, and (3) using KNN to classify the disease type |
| Patil and Sunag. "Analysis of image retrieval techniques based on content," 2015 IEEE international advance computing conference, pp. 958–962 | This paper deals with efficient feature extraction using color and texture |
| Jun and Wang [7]. Image thresholding using weighted parzen-window estimation. J. Applied Sci. 8: 772–779 | This paper describes image thresholding and segmentation techniques. Image comparison is made using Parzen-window estimation |
Fig. 2 Proposed system
3 Methodology
Our proposed system and its workflow are shown in Fig. 2. The collected images, both nitrogen-deficient and normal, are stored in a data repository, the history dataset. The collected images are then processed using feature extraction and comparison analysis. After pre-processing, we identify nitrogen deficiency and its intensity by segmenting the leaf with RGB colors and clustering the leaf image with the k-means clustering algorithm [5]. Image segmentation is the basic step in pattern analysis. It divides an image into homogeneous regions. Histogram thresholding segmentation assumes that an image is composed of regions with different grey ranges and separates its histogram into several peaks, each corresponding to one region [6, 7].
3.1 Color Segmentation with Filtering
Color segmentation is done using two filters, a high pass filter and a low pass filter. The high pass filter detects edges; the low pass filter detects smooth regions. We use a Gaussian high pass filter, which is efficient.
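The two filters can be illustrated with a short NumPy sketch. It is not the EmguCV code used in this work: it operates on a single grayscale channel, builds the Gaussian low pass filter as a separable convolution, and takes the high pass response as the image minus its low pass version. The function names and the default `sigma = 2.0` are assumptions made for the example.

```python
import numpy as np


def gaussian_kernel1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_low_pass(img, sigma=2.0):
    """Smooth a 2-D grayscale image with a separable Gaussian filter."""
    radius = int(3 * sigma)
    k = gaussian_kernel1d(sigma, radius)
    padded = np.pad(img.astype(float), radius, mode="reflect")
    # Filter along rows, then along columns; "valid" removes the padding.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)


def gaussian_high_pass(img, sigma=2.0):
    """High pass response: the image minus its low pass (smooth) part."""
    return img.astype(float) - gaussian_low_pass(img, sigma)
```

On a uniform region the high pass output is close to zero, while edges and abrupt texture changes give large absolute responses, which is what makes outlier regions stand out.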
4 Implementation
4.1 Approach of the Algorithm Using k-Means Clustering and Finding k
We use the k-means algorithm. Its steps are:
• First, cluster the pixel information of the input image pixel-wise in HSV color space [8].
• Next, cluster the pixels by intensity [9].
• Then, partition the image with the k-means algorithm based on the intensity values.
• Recalculate the cluster means and reassign the pixels until convergence.
• Finally, label each pixel with the cluster to which it belongs.
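The intensity-based part of the steps above can be sketched as a one-dimensional k-means in NumPy. This is a simplified illustration rather than the system's implementation: it clusters a single intensity channel only, and the quantile-based initialization and the name `kmeans_1d` are assumptions.

```python
import numpy as np


def kmeans_1d(values, k=2, max_iter=100):
    """Cluster pixel intensities into k groups; returns (labels, centers)."""
    values = np.asarray(values, dtype=float).ravel()
    # Deterministic initialization at evenly spaced quantiles (an assumption).
    centers = np.quantile(values, np.linspace(0, 1, k + 2)[1:-1])
    for _ in range(max_iter):
        # Assignment step: each pixel goes to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Update step: recompute each center as the mean of its pixels;
        # an empty cluster keeps its previous center.
        new_centers = np.array([
            values[labels == c].mean() if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

The flat label array can be reshaped to the image shape with `labels.reshape(img.shape)` to obtain the segmented image.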
In the end, the proposed system compares normal and abnormal leaves based on their image histograms. The features of a normal image are stored as follows: the system converts the segmented image, and the histogram is plotted from the segmented image. All the features of the normal image are then stored and compared with those of the abnormal image to describe the nutritional deficiency in maize leaves [10, 11].
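The histogram comparison can be sketched as follows. The text does not define how the variation between histograms is measured, so the metric below (the sum of absolute differences between normalized per-channel histograms) is a hypothetical choice for illustration, as are the function names.

```python
import numpy as np


def channel_histograms(img, bins=256):
    """Normalized histograms of an H x W x 3 uint8 RGB image, per channel."""
    hists = {}
    for idx, name in enumerate(("red", "green", "blue")):
        counts, _ = np.histogram(img[..., idx], bins=bins, range=(0, 256))
        hists[name] = counts / counts.sum()
    return hists


def histogram_variation(reference_img, test_img):
    """Per-channel L1 distance between histograms (assumed metric)."""
    ref = channel_histograms(reference_img)
    test = channel_histograms(test_img)
    return {c: float(np.abs(ref[c] - test[c]).sum()) for c in ref}
```

In this sketch, a stored normal leaf would serve as `reference_img`; a yellowing leaf raises the red channel relative to a green one and therefore produces a large red-channel variation.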
5 Results and Discussion
Nitrogen deficiency is generally observed as a color change in maize leaves. We have experimented with identifying nitrogen deficiency in maize leaves based on image segmentation and k-means clustering. Segmentation helps us to identify abnormalities in the leaf, and k-means helps in clustering the images. Figures 3 and 4 show the segmentation of normal and defective leaves, with histogram comparison after applying k-means clustering. Thresholding is done using the k-means value and is hence efficient. Figure 5 shows the comparison of a normal and a deficient leaf, in which the histogram variations are clearly visible. Our study shows that comparing normal and abnormal leaf images results in a variation of the RGB histograms, which is used to classify a leaf as normal or deficient. Table 2
Fig. 3 Segmented normal leaf
Fig. 4 Segmented defect leaf
Fig. 5 Comparison of normal and abnormal leaf

Table 2 Histogram variations for the above test image
Histogram | Red | Green | Blue | Yellow
Variation | 285.82 | 254.27 | 263.68 | 256.80
shows these variations. The variations further define the type and level of deficiency that occurred in the maize leaf. In experiments on a dataset of 330 images, of which 109 are nitrogen-deficient, we obtained an accuracy of 88% in detecting nitrogen-deficient leaf images.
6 Conclusion
Computer-aided techniques are increasingly used as assistive technology for agriculture. Nitrogen deficiency detection is an essential step in maize crop growth and protection. Early detection of nitrogen deficiency can help farmers take crucial measures, such as applying fertilizers in time, which keeps the damage from the deficiency from aggravating and hence prevents plant growth from being hampered. In this work, we have used an EmguCV-based technique for detecting leaf deficiency symptoms, which is computationally efficient and yet highly accurate. The use of EmguCV allows the project to be ported onto mobile platforms. We have used a flat-file database for ease of access. Our technique has given an accuracy of 88%. We have chosen image processing over heavy computation models because most of the symptoms are visual, which made our system fast. The results show that the misdetection rate for nitrogen deficiency is very low. There is scope for further improvement by implementing detection of potassium and sodium deficiency, which are also significant macronutrients. Further classification upon detecting the type of deficiency, together with suggestions of appropriate corrective measures to farmers, can be offered as a real-time mobile application.
Acknowledgements We are grateful to Dr. Mukanagouda S. Patil, Head and Professor, Department of Plant Pathology, University of Agricultural Sciences, Dharwad, India. The authors are obliged to Prof. M. S. Patil for his guidance, help in acquiring the image dataset, and insights on detecting deficiency based on symptoms.
References
1. S. Sridevy, A.S. Vijendran, R. Jagadeeswaran, M. Djanaguiraman, Nitrogen and potassium deficiency identification in maize by image mining, spectral and true color response. Indian J. Plant Physiol. (2018)
2. J. Sawyer, Integrated Pest Management (Department of Agronomy, Nutrient Deficiencies, and Application Injuries in Field Crops. Iowa State University, July 2004)
3. P. Patil, B. Sunag, Analysis of image retrieval techniques based on content, in 2015 IEEE International Advance Computing Conference (IACC) (Banglore, 2015), pp. 958–962. https://doi.org/10.1109/IADCC.2015.7154846
4. A.N. Rathod, B. Tanawal, V. Shah, Image Processing Techniques for the Detection of Leaf Diseases, 2013
5. H. Al-Hiary, S. Bani-Ahmad, M. Reyalat, M. Braik, Z. ALRahamneh, Fast and Accurate Detection and Classification of Plant Diseases, 2010
6. S. Bankar, P. Kadam, S. Deokule, Plant Disease Detection Techniques using Canny Edge Detection and Color Histogram in Image Processing, 2012
7. W. Jun, S. Wang, Image thresholding using weighted parzen-window estimation. J. Appl. Sci. 8, 772–779 (2008)
8. H.D. Marathe, P.N. Kothe, Leaf disease detection using image processing techniques. Int. J. Eng. Res. Technol. (IJERT) 2(3) (2013). ISSN: 2278-0181
9. S.A. Ali, N. Suleiman, A. Mustapha, N. Mustapha, K-means clustering to improve the accuracy of decision tree response classification. Inf. Technol. J. 8, 1256–1262 (2009)
10. P. Patil, N. Yaligar, S.M. Meena, Comparision of Performance of Classifiers - SVM, RF and ANN in Potato Blight Disease Detection Using Leaf Images, 2017 IEEE (ICCIC) (Coimbatore, 2017), pp. 1–5
11. M.M. Raikar, M.S.M. Chaitra Kuchanur, S. Girraddi, P. Benagi, Classification and grading of okra-ladies finger using deep learning. Proc. Comput. Sci. 171, 2380–2389 (2020)
Modified LEACH-B Protocol for Energy-Aware Wireless Sensor Network for Efficient Network Lifetime Fahrin Rahman, Maruf Hossain, Md. Sabbir Hasan Sohag, Sejuti Zaman, and Mohammad Rakibul Islam
Abstract A networking system allows users to share information and requires wireless networking protocols that are lifetime-efficient, energy-efficient, and low in latency. Wireless sensor networks cover a wide range of applications and provide remote monitoring using networked microsensors. During monitoring, these networks suffer from inadequate information, which is rarely seen in conventionally wired computing systems. An application-specific algorithmic architecture is therefore preferable to the traditional layered approach. Low-energy adaptive clustering hierarchy (LEACH) is one of the modern protocols for remote sensor networks; it unites the notions of media access and energy-efficient cluster-based routing to gain the best performance in terms of system lifetime. To reduce the information loss caused by the data aggregation property of LEACH-B, a modified version of LEACH-B is proposed and simulated to ensure optimization and more robustness.
Keywords Networking · Clustering · Wireless sensor network · Lifetime
F. Rahman (B) · Md. Sabbir Hasan Sohag · M. R. Islam
Islamic University of Technology, Dhaka, Bangladesh
e-mail: [email protected]
M. Hossain · S. Zaman
University of Dhaka, Dhaka, Bangladesh
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_5

1 Introduction
A wireless sensor network is a network covering a particular region that uses wireless data connections between network nodes, generally implemented and administered using radio communication [1]. A node's strength, autonomy, independence, and similar properties are affected by the DC power supply of the sensor nodes, which is not sufficient to ensure a healthy network lifetime and autonomous operation. If a node runs out of power, the whole network is affected, which creates the "energy hole problem." So, some analysis is needed of changes to network formation protocols that can ensure robustness [2]. A popular routing protocol is low-energy adaptive clustering hierarchy (LEACH). It uses clusters in a wireless sensor network, dissipates little energy, and keeps the response time low without compromising application-specific quality, thereby extending the lifetime by reducing power consumption. A huge number of small, power-limited sensor nodes co-operate to communicate the wirelessly gathered data to a base station called the 'sink.' Thus, the robustness of the network is affected, and this can degrade its lifetime. The energy spent on gathering, processing, and transmitting data must therefore be considered to ensure better energy consumption. LEACH needs modification to better balance the nodes' residual energy and to minimize its consumption by electing a number of cluster heads per round that is approximately optimal [2]. In this paper, Sects. 2 and 3 present the theoretical analysis of LEACH protocols, and Sect. 4 describes the proposed modified version of LEACH.
2 Clustering Algorithm
2.1 Original LEACH Algorithm
LEACH is the first hierarchical routing algorithm and a milestone among clustering routing protocols; it uses single-hop routing. The operation of the LEACH protocol is divided into two phases: the setup phase and the steady-state phase. In the setup phase, clusters are configured and time division multiple access (TDMA) slots are allocated. The threshold value T(n) is computed using Eq. (1).

$$T(n) = \begin{cases} \dfrac{P}{1 - P \times \left(r \bmod \frac{1}{P}\right)}, & \text{if } n \in G \\ 0, & \text{otherwise} \end{cases} \tag{1}$$
In this equation, P is the desired percentage of sensor nodes that become cluster heads, r is the current round, and G is the set of sensor nodes that have not been selected as cluster heads in the previous 1/P rounds. After the cluster heads (CHs) are selected, each sensor node decides which CH to join based on the strength of the received signal and sends a join message to that CH. In the steady-state phase, data are transmitted from the cluster heads to the sink. However, the robustness of the network is sometimes affected because LEACH gives every node, whether of low or high energy, an equal chance of becoming a cluster head [3]. A threshold value is used to elect a sensor node as cluster head. When a bulk of information has to be handled and passed across the nodes of the network, some modifications may be needed to enhance efficiency [4]. Every node can transmit directly to the CH as well as to the sink in single-hop routing; hence, LEACH is suited to compact networks. Energy consumption can be decreased by the "dynamic clustering" algorithm [5]. An optimal number of cluster heads cannot be guaranteed because cluster head selection is a stochastic process. Nodes with high and low residual energy are treated alike. Hence, nodes with low remaining energy can be picked as cluster heads, and these nodes will die first.
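The threshold of Eq. (1) and the resulting stochastic election can be sketched as follows. This is a minimal illustration, assuming that 1/P is an integer and that each eligible node draws a uniform random number and becomes a cluster head when the draw falls below T(n), as in the usual description of LEACH. The function names are hypothetical.

```python
import random

def leach_threshold(p, r, in_g):
    """Eq. (1): threshold T(n) in round r; in_g is True when node n belongs to G."""
    if not in_g:
        return 0.0
    period = round(1 / p)  # 1/P rounds form one rotation cycle (assumed integer)
    return p / (1 - p * (r % period))

def elect_cluster_heads(node_ids, eligible, p, r, rng=random):
    """Each node draws a number in [0, 1) and becomes CH if it is below T(n)."""
    return [n for n in node_ids
            if rng.random() < leach_threshold(p, r, n in eligible)]
```

With P = 0.1 the threshold rises from 0.1 in the first round of a cycle to 1 in the last, so every node still in G is elected before the cycle restarts. Note that residual energy does not appear anywhere in the computation, which is the weakness discussed above.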
2.2 LEACH-B
A cluster head consumes much more energy than a non-cluster-head (non-CH) node, so the nodes need to take turns at being CH. LEACH distributes this task uniformly, and if a CH dies, the whole cluster becomes useless. An improved LEACH protocol, called 'balanced LEACH' or 'LEACH-B', uses a distributed algorithm in which nodes make autonomous decisions without any centralized control, for an improved lifetime. The time interval is calculated as

$$t = \frac{k}{E} \tag{2}$$
where k is a factor and E is the residual energy of the sensor node. The more energy a node has, the shorter its time interval. This algorithm is a kind of distributed algorithm [6]. If there are $N_C$ clusters, then on average there are $N_{TOT}/N_C$ nodes per cluster, with one cluster head and $N_{TOT}/N_C - 1$ non-cluster-head nodes. Cluster head selection is performed on the basis of $t_i$ in the current round. If $C_P(t_i) = 0$, node p has already been selected as the CH. LEACH-B also uses a threshold function $T_P(t_i)$, like typical LEACH, but with a different form of equation. The threshold function of LEACH-B is

$$T_P(t_i) = \begin{cases} \dfrac{N_C}{N_{TOT} - N_C \left(r \bmod \frac{N_{TOT}}{N_C}\right)}, & C_P(t_i) = 1 \\ 0, & C_P(t_i) = 0 \end{cases} \tag{3}$$
Each cluster head dissipates energy in retransmitting the packets of the other sensor nodes to the ultimate receiver, transmitting its own packet, and transmitting the broadcast packet. Node p chooses a random number between 0 and 1 and compares it with the threshold value $T_P$ to decide whether to become a cluster head. When a node is selected as CH, it sends a notification to the other nodes. Non-CH nodes join the nearest cluster head at the lowest energy cost. Finally, according to the TDMA schedule, the CH sends the data to the base station (BS). In this algorithm, if there are a clusters, on average b/a nodes remain per cluster, where b is the maximum number of nodes [6].
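Equations (2) and (3) can be written out as a short sketch. The function names are hypothetical, and the code assumes that the number of clusters divides the total number of nodes so that the rotation cycle is a whole number of rounds.

```python
def timer_interval(k, residual_energy):
    """Eq. (2): t = k / E, so nodes with more residual energy wait less."""
    return k / residual_energy

def leach_b_threshold(n_c, n_tot, r, c_p):
    """Eq. (3): threshold T_P(t_i); c_p is the indicator C_P(t_i) (1 or 0)."""
    if c_p == 0:
        return 0.0
    cycle = n_tot // n_c  # rounds per rotation, assuming n_c divides n_tot
    return n_c / (n_tot - n_c * (r % cycle))
```

For 100 nodes and 5 clusters the threshold starts at 0.05 and reaches 1 in the last round of the 20-round cycle. A node holding twice the residual energy of another obtains half the time interval, which gives it priority.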
3 Other Multi-hop LEACH Algorithms
3.1 Cognitive LEACH (CogLEACH) Algorithm
In a spectrum-aware algorithm for an ad hoc cognitive network, the solution is based on finding the optimal bipartite graph at each node subject to a set of constraints. Clusters are formed by maximizing the sum of the number of common idle channels and the number of sensor nodes within the cluster, by maximizing the product of the same two parameters, or by maximizing the number of nodes within the cluster given a threshold on the number of common idle channels. The network consists of N nodes [7]. In the following Eq. (4), P indicates the desired percentage of sensor nodes that become cluster heads among all nodes. Let $P_i(t) = \min\left(\frac{k}{\alpha}\,\frac{c_i}{m},\, 1\right)$, where m denotes the total number of channels in the used band and $c_i$ indicates the number of idle channels available to node i. The expected number of cluster heads is

$$E\{\#\text{CHs}\} = \sum_{i=1}^{N} P_i(t) \tag{4}$$

Setting this expectation equal to k gives

$$\frac{k}{\alpha}\,\frac{c_1 + c_2 + \cdots + c_N}{m} = k \tag{5}$$

That leads to

$$m = \frac{c_1 + c_2 + \cdots + c_N}{\alpha} \tag{6}$$

Thus,

$$P_i(t) = \min\left(k\,\frac{c_i}{\sum_{j=1}^{N} c_j},\ 1\right) \tag{7}$$
3.2 Weighted Spanning Tree LEACH (WST-LEACH) Algorithm
WST-LEACH organizes the cluster heads (CHs) into a weighted spanning tree. A CH broadcasts an "announce" message after it is selected, so that every other node in the network knows that it has been selected for the current round and chooses the nearest cluster head to join. Each non-cluster-head node sends a "JOIN" message to its cluster head; on receiving a "JOIN" message, the cluster head replies with an "acknowledgment" message and assigns a time-division multiplexing slot. Cluster heads receive data from their cluster members and then transmit these along the weighted spanning tree to the sink. For cluster head selection in WST-LEACH, the threshold equation is

$$T(n) = \frac{P}{1 - P \ast \left(r \bmod \frac{1}{P}\right)} \ast \left[ w_1 \ast \frac{S(n).E}{E_o} + w_2 \ast \frac{S(n).Nb}{p \ast N} + w_3 \ast \frac{1}{S(n).ToBs} \right] \tag{8}$$

In the above equation, S(n).E is the residual energy of node n; $E_o$ is the initial energy; N is the total number of nodes; S(n).Nb is the number of neighbors of node n within a radius R; S(n).ToBs is the distance between node n and the BS; and $w_1$, $w_2$, $w_3$ are coefficients. The selection process of WST-LEACH is similar to that of LEACH [8].
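The weighted threshold of Eq. (8) can be sketched as follows. The function and argument names are hypothetical, the pairing of the coefficients with the three terms follows the order in which the quantities are defined above and is an assumption, and 1/P is assumed to be an integer.

```python
def wst_leach_threshold(p, r, energy, e0, neighbors, n_total, dist_to_bs,
                        w1, w2, w3):
    """Eq. (8): LEACH threshold scaled by energy, neighbor, and distance terms."""
    base = p / (1 - p * (r % round(1 / p)))
    weight = (w1 * energy / e0                   # share of initial energy left
              + w2 * neighbors / (p * n_total)   # neighbor count vs. expected size
              + w3 * 1.0 / dist_to_bs)           # closer to the BS scores higher
    return base * weight
```

With $w_1 = 1$ and $w_2 = w_3 = 0$, a node at full energy recovers the plain LEACH threshold, and a node at half energy obtains half of it.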
4 Proposed Modified LEACH Algorithm
Data aggregation is the process in which gathered information is expressed in summary form. It is considered a disadvantage of LEACH-B, because it is difficult to integrate and gather data from each node in a meaningful way; the set of cluster heads should therefore be chosen in a standardized way. Thus, in this paper, the threshold equation takes a changed form, so that the possibility of collision is reduced and network robustness is improved. MATLAB simulations are carried out to plot the number of alive nodes versus the number of rounds, which indicates the network lifetime, that is, the round in which the first node dies [9]. Modified LEACH-B uses a threshold function like LEACH-B but in a different way. The threshold equation is

$$T_P(t_i) = \begin{cases} \dfrac{N_C}{N_{TOT}\,\operatorname{mod}(r, N_C)}, & n \in G \\ 0, & \text{otherwise} \end{cases} \tag{9}$$
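The lifetime measure described above, the round in which the first node dies, can be read from an alive-node curve as in the following sketch. The function name and the input layout (one alive-node count per round, with index 0 holding the initial count) are assumptions for illustration; the sketch is in Python rather than MATLAB and is not the simulation code used in this paper.

```python
def first_death_round(alive_per_round):
    """Return the first round index at which fewer nodes are alive than at the start."""
    initial = alive_per_round[0]
    for rnd, alive in enumerate(alive_per_round):
        if alive < initial:
            return rnd
    return None  # no node died within the simulated rounds
```

Applied to each protocol's curve in turn, this yields the per-protocol figures of the kind compared in the paragraphs that follow.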
The MATLAB curves below compare the above-mentioned protocols. In Fig. 1, for the LEACH protocol, nodes begin to die after 935 rounds. Nodes of the 'balanced LEACH' protocol begin to die after round 1158, nodes of the 'CogLEACH' protocol after round 1018, and nodes of the 'WST-LEACH' protocol after round 1415. In the case of the 'modified LEACH-B' protocol, nodes begin to die a few rounds after those of LEACH-B, CogLEACH, and WST-LEACH, which represents a comparatively better lifetime.

Fig. 1 No. of alive nodes with respect to number of rounds

As shown in Fig. 2, nodes of the original LEACH protocol begin to die after 935 rounds, nodes of the LEACH-B protocol after round 1158, nodes of the CogLEACH protocol after round 1018, and nodes of the WST-LEACH protocol after round 1741. Nodes of the 'modified LEACH-B' protocol begin to die a few rounds after those of the LEACH-B, CogLEACH, and WST-LEACH protocols.

Fig. 2 No. of dead nodes with respect to number of rounds (legend: LEACH-B, original LEACH, proposed LEACH, CogLEACH, WST-LEACH; x-axis: no. of rounds; y-axis: no. of dead nodes)

Figure 3 shows the curves of advanced dead nodes, that is, nodes supplied with more initial energy than the normal nodes, for the original LEACH, LEACH-B, CogLEACH, WST-LEACH, and modified LEACH-B protocols. In the case of LEACH, nodes begin to die after 1690 rounds. The energy dissipation varies from 0 to around 10. The energy consumption of LEACH-B is the lowest among all mentioned protocols.
Fig. 3 No. of advanced dead nodes with respect to number of rounds (legend: LEACH-B, original LEACH, proposed LEACH, CogLEACH, WST-LEACH; x-axis: no. of rounds; y-axis: no. of advanced dead nodes; data cursor at X: 3090, Y: 0)
As shown in Fig. 4, normal nodes of LEACH, LEACH-B, CogLEACH, and WST-LEACH begin to die after 935, 1141, 1003, and 1451 rounds, respectively. For the 'modified LEACH-B' protocol, normal nodes begin to die after 1724 rounds. Thus, the network lifetime increases in the case of the 'modified LEACH-B' protocol. Figure 5 shows the energy dissipated by normal nodes versus the number of rounds for LEACH, LEACH-B, CogLEACH, WST-LEACH, and 'modified LEACH-B'. The energy level reaches zero after 1261, 1295, 1349, 1758, and 3473 rounds for LEACH, LEACH-B, CogLEACH, WST-LEACH, and modified LEACH-B, respectively. This shows that 'modified LEACH-B' performs best among all.
Fig. 4 No. of normal dead nodes with respect to number of rounds (legend: LEACH-B, original LEACH, proposed LEACH, CogLEACH, WST-LEACH; x-axis: no. of rounds; y-axis: no. of normal dead nodes)
Fig. 5 Energy for normal nodes with respect to number of rounds (legend: LEACH-B, original LEACH, proposed LEACH, CogLEACH, WST-LEACH; x-axis: no. of rounds; y-axis: energy for normal node)
In Fig. 6, the curves represent the energy dissipated by advanced nodes versus the number of rounds. After approximately 4091 rounds, the energy level reaches zero in the case of the LEACH protocol. The LEACH-B, CogLEACH, WST-LEACH, and 'modified LEACH-B' protocols perform better, and their energy consumption curves tend slowly toward the lower level (Table 1).
Fig. 6 Energy for advanced nodes with respect to number of rounds (legend: LEACH-B, original LEACH, proposed LEACH, CogLEACH, WST-LEACH; x-axis: no. of rounds; y-axis: energy for advanced node)
Table 1 Comparative analysis of above-described protocols
LEACH and LEACH derivatives | Year | Clustering | Overhead | Complexity | Network type | Network lifetime
LEACH | May 2000 | Distributed | High | Low | Moderate | Low
LEACH-B | April 2003 | Distributed | High | Moderate | Moderate | Moderate
CogLEACH | October 2014 | Distributed | High | Moderate | Moderate | Moderate
WST-LEACH | May 2010 | Distributed | High | High | High | High
Modified LEACH-B | October 2018 | Distributed | High | Low | Moderate | Very high
5 Conclusion
Network lifetime can be increased if the LEACH protocol is modified in an energy-efficient way [10]. It is important to consider ease of deployment because it is the basis of WSN applications in storage environment monitoring systems. In addition, location determination in the network is an important and fundamental problem that needs future attention from researchers. Energy constraints like these should be considered when choosing a suitable protocol. The cluster head selection algorithm and data aggregation of LEACH reduce the amount of communication. In LEACH, an unbalanced distribution of cluster heads increases the total energy dissipation. LEACH has many defects, so an improved protocol called LEACH-B can be implemented. In this paper, building on the LEACH-B protocol, a modification called 'modified LEACH-B' has been proposed. The simulation results show that 'modified LEACH-B' provides better energy efficiency and longer network lifetime than LEACH-B.
References
1. S. Fakher, M. Shokair, M.I. Moawad, K. Sharshar, The main effective parameters on wireless sensor network performance. Int. J. Sci. Res. Publ. 5(6) (2015)
2. U.M. Pešović, J.J. Mohorko, K. Benkič, Ž.F. Čučej, Single-hop versus multi-hop – energy efficiency analysis in wireless sensor networks, in Proceedings of Telekomunikacioni forum TELFOR (Srbija, Beograd, November 23–25, 2010), pp. 471–474
3. P.G. Samta, Energy efficient cluster based leach protocol using WSN. Int. J. Res. Appl. Sci. Eng. Technol. (IJRASET) 5(I) (2017)
4. D. Mahmood, N. Javaid, S. Mahmood, S. Qureshi, A.M. Memon, T. Zaman, MODLEACH: a variant of LEACH for WSNs, in Proceedings of 2013 Eighth International Conference on Broadband and Wireless Computing, Communication and Applications (France 2013), pp. 158–163
5. H. Kaur, N. Kaur, S. Waraich, Comparative analysis of clustering protocols for wireless sensor networks. Int. J. Comput. Appl. 115(1), 0975–8887 (2015)
6. M. Tong, M. Tang, LEACH-B: an improved LEACH protocol for wireless sensor network, in Proceedings of 6th International Conference on Wireless Communications Networking and Mobile Computing (WiCOM), 2010, pp. 1–4
7. R.M. Eletreby, H.M. Elsayed, M.M. Khairy, CogLEACH: a spectrum aware clustering protocol for cognitive radio sensor networks, in 2014 9th International Conference on Cognitive Radio Oriented Wireless Networks and Communications (CROWNCOM), 2014, pp. 1–6
8. H. Zhang, P. Chen, S. Gong, Weighted spanning tree clustering routing algorithm based on LEACH, vol. 2, in Proceedings of the 2nd International Conference on Future Computer and Communication (ICFCC) (May 2010), pp. 223–227
9. K. Singla, D. Kaur, A review on clustering approaches for wireless sensor network using mobile sink. IJEDR 4(2), 266–269 (2016)
10. Y. Kim, B.-J. Song, J. Ju, Implementing a prototype system for power facility management using RFID/WSN, in Proceedings of the IADIS International Conference, 2006, pp. 70–75
Web Content Authentication: A Machine Learning Approach to Identify Fake and Authentic Web Pages on Internet Jayakrishnan Ashok and Pankaj Badoni
Abstract The Internet is evolving very fast, and its user base is growing enormously. With the advent of low-cost devices and connectivity, it has become a common medium for communication. Various social media platforms and blogging Web sites are prevalent on the Internet today. In the absence of a system to validate the authenticity of content, this has allowed fake news, unreliable Web content, and the like to prosper. Also, hackers and unauthentic authors take advantage of the ranking systems of popular search engines by modifying content and other relevant factors so as to capture the first position in the search results. This paper introduces certain theoretical concepts that can be used to identify authentic Web sites by assigning a trust factor to Web content. With these factors modeled as input to a machine learning algorithm that outputs a trust factor for the Web site, it becomes easy for Web surfers to pick the right Web site from those displayed by search engines. This paper also contributes ideas for creating a new dataset that can be used to train different ML models to identify unauthentic Web sites and Web blogs.
Keywords Machine learning · Search engine · Authentic Web content · Authenticity · Internet · Confidence score · Information retrieval · Correlated data
J. Ashok (B) · P. Badoni
University of Petroleum and Energy Studies, Dehradun, India
P. Badoni e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_6

1 Introduction
The Internet of 10 years ago looked very different from the way it is used at present. Its use has changed drastically since its initial days because of the advent of different technologies that support it. The World Wide Web (WWW) has been growing at an exponential rate ever since. The Web can be considered an entity that changes regularly and dynamically, which makes it difficult to build a classification model that fits many different Web pages [1]. There are more than 1.87 billion pages accessible on the World Wide Web, with more than 3.905 billion Internet users [2]. The huge number of Internet users gave rise to different kinds of social media platforms, which became a medium for expressing ourselves virtually. With the advent of such platforms, the information available about a topic became exorbitant, and it is becoming impossible to process each item manually and pick the most reliable. Also, certain users modify their content and other aspects that determine the scoring factors used by popular search engines to rank Web sites, so that their content is ranked higher. Web surfers are completely reliant on the results from these search engines. Search engines use different types of ranking algorithms and methodologies to order the retrieved pages mined from the Web [3]. Even after seeing the result page, users are at times confused about which content to refer to. The main aim of this research is to find methodologies that an automated system can use to validate a Web site and its content, so as to output a confidence score that acts as a trust factor for users to identify authentic Web content. This project can be materialized in different ways, but the methodology discussed in this paper is to create an extension/add-on for the browser, so that the user has access to the validation system for the entire time spent online. Also, the parameters discussed in Sect. 9 can be used to create a new labeled dataset, which can be used for training AI/ML models in the future. The rest of the paper is organized as follows. Section 2 presents relevant background information. Section 3 examines existing research.
Section 4 specifies our proposed solution, which is subsequently evaluated in Sect. 5. Section 6 discusses our findings from the evaluation. Results are discussed in Sect. 7, and the paper concludes in Sect. 8. And finally, future work is presented in Sect. 9.
2 Background This section provides information of the existing scenario and the major concerns that acted as the root cause of this research.
2.1 World Wide Web
The World Wide Web (WWW) can be thought of as an information space holding a huge collection of Web pages and other Web documents, each of which has a uniform resource locator (URL), which are interlinked using hyperlinks and accessed over the Internet [4]. It has become a common tool for interaction and information exchange between billions of people around the world [2, 5, 6]. The Web pages in the WWW are documents formatted in hypertext markup language (HTML), a format used by Web browsers to render information. Other technologies, such as cascading style sheets and JavaScript, can be used to style Web pages and add functionality that makes them dynamic.
2.2 Search Engines Search engine is the first option for all the Web surfers to find content from the Web. Search engines use crawlers, which are programs that act as computer robots [7], which float around the Web and collect information from the Web, which creates and updates documents for easy retrieval [8]. The retrieval process is known as information retrieval. It consists of processing unstructured data, usually text that satisfies the information needed from a large collection of documents [9]. A search engine has a series of programs that determines how the information in the dataset will be represented in the result page [8–12]. The Web surfer inputs some keywords, and the output is presented in form of a ranked list, decided by the engine itself, wherein it estimates the relevance of each Web site indexed in reference to the query fired [13].
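The idea of returning a ranked list for a keyword query can be illustrated with a deliberately simple sketch. It scores each indexed page by how often the query keywords occur in its text; this term-count score is an assumption chosen for illustration and is far simpler than the ranking algorithms used by real search engines. The function name and the dictionary layout are hypothetical.

```python
from collections import Counter

def rank_pages(query, pages):
    """Rank page ids by how often the query keywords occur in each page's text."""
    terms = query.lower().split()
    scores = {}
    for page_id, text in pages.items():
        counts = Counter(text.lower().split())
        scores[page_id] = sum(counts[t] for t in terms)
    # Highest score first; pages with equal scores keep their original order.
    return sorted(scores, key=lambda pid: scores[pid], reverse=True)
```

Because the score depends only on the page's own text, an author can raise a page's position simply by repeating popular keywords, which is the kind of manipulation that motivates a separate trust factor.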
2.3 Fake News
Fake news can be placed under the category of yellow journalism: news or facts that are propagated without relevant backing of facts or data and are purely a crude, exaggerated view of the information [14]. Fake news largely represents particularly grievous and inaccurate data propagated through various social media platforms [15]. Fake news or rumors are used to deceive the target audience into believing something different from reality. The unauthentic creator studies many psychological concepts and applies them in the Web site or Web blog, so that a large population is led to believe it without knowing the truth. This can be done by presenting the information in an altered way, or by representing it in a manner that conveys bogus information while meaning something else [16]. There can be different motives behind it, mainly relating to anything from which a gain can be made, monetarily or otherwise.
2.4 Machine Learning
Arthur Lee Samuel, who is considered a pioneer in the field of computer gaming and artificial intelligence, defined machine learning in 1959 as a subfield of computer science that gives "computers the ability to learn without being explicitly programmed" [17]. Machine learning can also be considered automated data analysis and data prediction in this era of big data. Its main tasks are detecting patterns in data and building mathematical models that predict future outcomes based on the patterns unveiled [18]. The subject is highly useful at present because of the level of interaction that humans have with technology. This interaction has reached a level where analyzing data and predicting the future have themselves become an industry. The field of machine learning evolved from research in pattern recognition and computational learning theory in artificial intelligence [19].
2.5 Deep Learning
Deep learning can be considered an advanced subcategory of machine learning that increases the overall intelligence of the system. It is also termed deep structured learning or deep machine learning [20]. The basic unit of a deep neural network is an artificial neuron, which is designed on a structure similar to that of human brain neurons. The neurons fire, or trigger, when the output generated by them is above a defined threshold. The deep learning model [21]:
2.5.1 Has a cascaded architecture, each layer having many artificial neurons. The number of neurons can change and is not a defined constant. The layers do feature extraction and transformation of the input to learn complex relationships.
2.5.2 Has many layers, where each layer takes its input from the output of the previous layer in the architecture. The presence of many layers between the input and the output layer allows the model to use multiple processing layers, in which linear as well as nonlinear transformations can be done.
2.5.3 Can learn multiple levels of representation of the input data that correspond to different levels of abstraction. These levels form a hierarchy of concepts.
2.5.4 Has applications that include pattern analysis in data, which comes under unsupervised learning, and classification of data, which is considered supervised learning.
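The two ideas above, a neuron that fires when its output exceeds a threshold and a cascade of layers in which each layer consumes the output of the previous one, can be sketched in a few lines of NumPy. The function names, the choice of ReLU as the nonlinear transformation, and the layer representation as (weights, bias) pairs are assumptions made for illustration.

```python
import numpy as np

def threshold_neuron(inputs, weights, bias, threshold=0.0):
    """Fire (return 1) only when the weighted sum exceeds the threshold."""
    return int(np.dot(inputs, weights) + bias > threshold)

def forward(x, layers):
    """Pass x through a stack of (weights, bias) layers with ReLU activations."""
    for weights, bias in layers:
        # Linear transformation followed by a nonlinear one.
        x = np.maximum(0.0, x @ weights + bias)
    return x
```

In `forward`, the number of neurons per layer is set only by the shapes of the weight matrices, so it can differ from layer to layer.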
3 Related Work
Machine learning is used extensively in fields such as healthcare [22, 23], malware detection [24], intrusion detection [25], image analysis, and marketing, as well as for information retrieval and data mining. This section discusses the most notable works. Mitchell posed classification of data as a supervised learning process in which a classifier is trained on labeled data and then used to predict the class of future samples [26]. Qi and Davison divided the classification of Web pages into subject classification, functional classification, sentiment classification, and others [27]. The others they mentioned included genre classification and search engine spam classification. Chekuri et al. aimed to increase the precision of Web search by studying automatic methods for classifying Web pages; their work applied statistical methodology [28]. Chen and Dumais [29] organized search results into a predefined hierarchical structure and found that users were more satisfied with this presentation. Käki [30] conducted a similar study that presented search results in a categorized view. The experiments indicated that users were satisfied in this case as well.

There have been many attempts to detect fake news about trending topics, which is published to draw maximum traffic to a Web site, thereby increasing its hit rate and placing it among the top results of search engines. Lazer et al. [31] described fake news as “… information that mimics the output of the news media in form, but not in organizational process or intent—e.g., lacking editorial norms and processes to weed out the untrue in favor of the true. Fake news is thus a sub-genre of the broader category of misinformation—of incorrect information about the state of the world.” Pennycook et al. [32] studied fake news and noted that low-level heuristics play an important role in judging the accuracy of fabricated news. They pointed out three major pieces of evidence. (1) The viewers forgot having seen the fake news repeatedly. (2) Political fake news about parties collided with their ideology.
(3) Participants were warned that the fake news had been disputed by third-party checkers. Pennycook and Rand reported three studies on the relationship between the propensity to think analytically and receptivity to fake news [15]. Study 1 indicated a negative correlation between Cognitive Reflection Test (CRT) performance and perceptions of fake news accuracy. Study 2 investigated whether analytical thinking is related to overall skepticism toward news headlines rather than to fake content specifically. Study 3 examined the influence of political ideology on media truth discernment.

A great deal of research has been done on text analysis and sentiment classification. Pang et al. [33] compared standard machine learning techniques with human-produced baselines. They concluded that SVM, Naive Bayes, and maximum entropy classification did not perform as well on sentiment classification as on traditional topic-based categorization. Jurafsky and Martin [34] gave a complete description of Naive Bayes as used for text categorization, sentiment analysis, and spam detection. Fetterly et al. [35] proposed that spam Web pages can be identified through statistical analysis of properties such as linkage structure, page content, and page evolution.

Malicious URLs can be detected with machine learning and deep learning. Vanhoenshoven et al. [36] addressed the detection of malicious URLs as a binary classification problem using Naive Bayes, support vector machines, multi-layer perceptron, random forest, K-nearest neighbors, and decision trees. Their findings indicated that the multi-layer perceptron and random forest gave the highest accuracy among the algorithms under consideration. Ma et al. [37] studied how to categorize URLs in a live system using lexical and host-based features of the URLs. They used a labeled dataset from a Web mail provider and concentrated on online algorithms such as perceptron and logistic regression with stochastic gradient descent. Lee and Kim [38] proposed a method to identify malicious tweets containing URLs that lead to suspicious pages. They trained statistical classifiers on features such as the lexical features of URLs, URL redirection, and HTML content.

Clickbait is another method of misleading users into visiting unwanted sites. Kotenko et al. [39] proposed a theoretical approach to blocking inappropriate content or Web sites. They stated that features such as text information, the textual content of each HTML tag, the URLs in the Web site, and the metadata available about the Web site can be used to train models (SVM, decision tree, and NB) for binary classification. Salvetti and Nicolov [40] created a machine learning model based on n-grams of URLs to analyze them and filter out splog URLs, using a Naive Bayes classifier. Their model had an accuracy of 78% and a precision of 93.3%. In another study, Kotenko et al. [41] justified the need for protection against information in society and for an automated system that provides it. They discussed efficient indicators that such a system can use, together with its design, implementation, and evaluation. The system is built on a customizable three-level architecture in which the algorithms and trade-offs at each level can be adjusted without affecting the others.
4 Problem Formulation
A system that validates the URLs a Web surfer wishes to explore could save time and effort and help identify reliable content. No system available today validates the results returned by search engines, and surfers have become purely dependent on those results for identifying content on the Web. An ideal solution for pointing out reliable content would be a system dynamic enough to adapt and learn to identify such content, rather than one maintained through hardcoded rules. This paper focuses on building such a system, which can act as a wingman to Web surfers. Several machine learning models with different hyperparameters can be combined to create it. This paper, however, focuses on building a model based on the results of the experiments conducted throughout the research and on contributing a new dataset that can be used to build further intelligent solutions to the same problem.
5 Proposed Methodology
The methodology proposed in this paper is theoretical in concept. Since an accurate recommender system can be achieved in multiple ways, the approach discussed here is just one solution. While building a recommender system, multiple features can be considered in determining whether a Web page is authentic. The features below were considered good parameters for evaluating a Web site:
A. The Protocol Used: HTTP/HTTPS plays an important role in determining the quality of a Web site. Since HTTPS is considered a secure protocol, its use implies that the Web site owner has made some investment in security. Spam/fake Web sites typically do not use HTTPS. The entire URL is therefore analyzed to verify the protocol used by the Web site.
B. Sentimental Analysis of the Content: Numerous analyses can be performed on the content displayed by a Web site, e.g., identifying whether the content contains abusive language, false promotions, grammar mistakes, etc.
C. Domain Name: The domain name is considered a potential factor. Categorizing domains into higher (e.g., org, edu, etc.), medium (e.g., country-wise domain names), and lower (e.g., com, net, etc.) priority classes can serve as a good parameter for segregating authentic Web sites.
D. Maliciousness of URL: Certain URL patterns and domain names can be traced back to malicious Web pages, which are pre-classified and published as public datasets.
E. Clickbaity: Clickbait is a common method used by Web content developers to attract users to view a Web page. Such sentences and words are generally found on Web sites that sell or advertise a product. These Web sites tend to be less authentic in terms of the information they convey.
F. Web Structure: The structure of a Web site can be used to estimate its authenticity. A properly designed Web site, with a good layout and a UI that users can easily understand and navigate, tends to be authentic, while a messy structure conveys how little effort was put into the design and, in most cases, indicates a less authentic site.
G. Malicious Scripts: Malicious scripts written in JavaScript can steal information from visitors, such as cookie data and browsing history. The intent of such Web sites is merely to gather information for their own use.
H. Author: The reputation of the author plays an important role in the credibility and quality of the content published on a Web site. Content by a reputed author who is considered a pioneer, comes from a reputed institution, or has a good track record in the domain of the content can be considered authentic.
I. Web Trackers: Web trackers are used to identify the activities of Web surfers. Web trackers deployed by malicious Web sites or untrusted entities can be harmful to users. Web sites on which such trackers are deployed can thus be considered less authentic.
J. Dwell Time: The amount of time each user spends on a Web site can be directly related to the quality of the content. A higher dwell time conveys that the content is of interest and of good quality.
K. Pop-Ups/Advertisements: Frequent pop-ups and advertisements can be considered a nuisance and are used to sell products. Web sites hosting them are of poor quality in terms of content and tend to be less authentic.
There is no available dataset on which a machine learning model can be directly trained, and this absence is a serious showstopper. One way to build such a recommender system is through unsupervised learning or reinforcement learning methods. Another is to deploy a semi-trained model and use a crowd scoring technique to re-train it. This approach is used in many studies and can be considered viable for this problem. Its major drawback is that the output received from the crowd cannot be relied upon entirely, because participants may not respond with the most appropriate answer or may not be dedicated to the task. An application that provides a score for each Web site displayed to users can be built as a browser plugin connected to the machine learning model powering the recommender. Pages that the recommender can score are highlighted using scripts, and the score of each Web site is displayed next to it. In this way, the Web surfer can review the rating of each Web site and decide which one to browse. The architecture currently implemented for this research is shown in Fig. 1.
Fig. 1 Proposed architecture
6 Experimental Observations
Using the proposed architecture, the performance of various algorithms was analyzed on different datasets that were considered reliable sources for some of the parameters or features stated above. The following subsections elaborate on the specific parameters that were utilized.
6.1 Malicious Link
The initial approach was to train a Naive Bayes model on a dataset obtained from Kaggle and to evaluate its accuracy and performance. Naive Bayes calculates the conditional probability that an event with feature vector $x_1, x_2, \ldots, x_n$ belongs to a particular class $C_i$. The algorithm assumes that the features are independent of each other given the class. By Bayes' theorem,

\[
P(C_i \mid x_1, x_2, \ldots, x_n) = \frac{P(x_1, x_2, \ldots, x_n \mid C_i)\, P(C_i)}{P(x_1, x_2, \ldots, x_n)} \quad \text{for } 1 \le i \le k,
\]

and under the independence assumption,

\[
P(C_i \mid x_1, x_2, \ldots, x_n) = \left( \prod_{j=1}^{n} P(x_j \mid C_i) \right) \frac{P(C_i)}{P(x_1, x_2, \ldots, x_n)} \quad \text{for } 1 \le i \le k.
\]

This equation can be further elaborated as shown below:

\[
\begin{aligned}
P(x_1, x_2, \ldots, x_n, C_i) &= P(x_1 \mid x_2, \ldots, x_n, C_i)\, P(x_2, \ldots, x_n, C_i) \\
&= P(x_1 \mid x_2, \ldots, x_n, C_i)\, P(x_2 \mid x_3, \ldots, x_n, C_i)\, P(x_3, \ldots, x_n, C_i) \\
&= \cdots \\
&= P(x_1 \mid x_2, \ldots, x_n, C_i)\, P(x_2 \mid x_3, \ldots, x_n, C_i) \cdots P(x_{n-1} \mid x_n, C_i)\, P(x_n \mid C_i)\, P(C_i)
\end{aligned}
\]

The term $P(x_j \mid x_{j+1}, \ldots, x_n, C_i)$ becomes $P(x_j \mid C_i)$ because of the assumption that the features are independent. Naive Bayes was implemented in Python using the default settings of the Naive Bayes classifier. The training report is given in Table 1. The model achieved an f1-score of 0.97, which indicates that it can reliably classify the URLs found on the Web.
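The factorization above can be made concrete with a small sketch. This is not the authors' implementation, which used a library classifier with default settings. It is a minimal from-scratch multinomial Naive Bayes over URL tokens, and the tokenizer, the class name `NaiveBayes`, the labels, and the four training URLs are all hypothetical. The evidence term $P(x_1, \ldots, x_n)$ is dropped because it is the same for every class, and log-probabilities are summed to avoid numerical underflow.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (an assumed feature choice)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


class NaiveBayes:
    def fit(self, urls, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for url, label in zip(urls, labels):
            self.token_counts[label].update(tokenize(url))
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def log_posterior(self, url, label):
        """log P(C_i) + sum_j log P(x_j | C_i), up to the constant evidence term."""
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total)
        denom = sum(self.token_counts[label].values()) + len(self.vocab)
        for tok in tokenize(url):
            if tok in self.vocab:  # tokens never seen in training are skipped
                # Laplace (add-one) smoothing keeps unseen pairs from zeroing the product
                score += math.log((self.token_counts[label][tok] + 1) / denom)
        return score

    def predict(self, url):
        return max(self.class_counts, key=lambda c: self.log_posterior(url, c))


# Hypothetical toy data, for illustration only.
train_urls = [
    "http://free-prize-login.example/verify",
    "http://secure-login-update.example/account",
    "https://university.example/courses",
    "https://library.example/catalog/search",
]
train_labels = ["malicious", "malicious", "benign", "benign"]

model = NaiveBayes().fit(train_urls, train_labels)
print(model.predict("http://login-verify.example/prize"))
print(model.predict("https://library.example/courses"))
```

With only four training URLs the predictions merely demonstrate the mechanics; the accuracy reported in Table 1 comes from the much larger Kaggle dataset.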
Table 1 Result—Naive Bayes
Class        Precision   Recall   f1-score   Support
0            0.94        0.90     0.92       25,030
1            0.98        0.99     0.98       113,724
Avg./Total   0.97        0.97     0.97       138,754
Table 2 Result—XGB classifier
Class           Precision   Recall   f1-score   Support
0               0.92        0.69     0.79       24,781
1               0.94        0.99     0.96       113,973
Micro avg.      0.93        0.93     0.93       138,754
Macro avg.      0.93        0.84     0.88       138,754
Weighted avg.   0.93        0.93     0.93       138,754
Secondly, an XGB classifier was implemented in Python for the malicious-link dataset with the parameters n_estimators = 200, learning_rate = 0.1, gamma = 0.8, max_depth = 8, and objective = 'reg:squarederror'. The training report is given in Table 2. The f1-score of the XGB classifier was 0.93, lower than the 0.97 achieved by Naive Bayes. In the interest of the best classification capability, Naive Bayes was chosen to be part of the application.
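The f1-scores in Tables 1 and 2 follow from the reported precision and recall through the harmonic mean $F_1 = 2PR/(P+R)$, and the weighted average weights each class by its support. The short check below recomputes them from the table values; the helper names `f1` and `weighted_avg` are illustrative and not taken from the paper's code.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def weighted_avg(values, supports):
    """Average of per-class values, weighted by the number of samples per class."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)


# (precision, recall, support) per class, copied from Tables 1 and 2.
naive_bayes = {"0": (0.94, 0.90, 25030), "1": (0.98, 0.99, 113724)}
xgb = {"0": (0.92, 0.69, 24781), "1": (0.94, 0.99, 113973)}

for name, report in (("Naive Bayes", naive_bayes), ("XGB", xgb)):
    scores = [f1(p, r) for p, r, _ in report.values()]
    supports = [s for _, _, s in report.values()]
    per_class = [round(s, 2) for s in scores]
    print(name, per_class, round(weighted_avg(scores, supports), 2))
```

The recomputed values agree with the tables. They also show where the gap comes from: XGB's class 0 recall of 0.69 pulls its class 0 f1-score down to 0.79, while class 1 dominates the support and keeps the weighted average at 0.93.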
6.2 Clickbaity
A convolutional neural network was implemented using the TensorFlow library on a test dataset curated for understanding clickbait sentences. The dataset was compiled by analyzing headlines from sources such as BuzzFeed, Newsweek, The Times of India, The Guardian, The Economist, TechCrunch, and The Wall Street Journal. Training was configured for twenty epochs. After the tenth epoch, the loss and accuracy flattened out, so the execution of further epochs was stopped, as shown in Fig. 2.
Fig. 2 Training result of neural network
The test achieved a validation accuracy of ~0.88 which proved to be satisfactory for the research, and this model is incorporated into the application.
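Stopping once the loss curve flattens can be expressed as a simple patience rule. The sketch below is an assumption about how such a rule might look, not the procedure the authors used: the function name, the `patience` and `min_delta` parameters, and the loss values are hypothetical.

```python
def first_plateau_epoch(losses, patience=3, min_delta=0.01):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the loss has failed to improve on the best value
    seen so far by more than `min_delta` for `patience` consecutive epochs.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(losses, start=1):
        if best - loss > min_delta:
            best = loss   # meaningful improvement: remember it and reset the counter
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Hypothetical per-epoch validation losses.
history = [0.9, 0.6, 0.4, 0.3, 0.25, 0.25, 0.25, 0.25]
print(first_plateau_epoch(history))
```

Keras offers the same policy through the `tf.keras.callbacks.EarlyStopping` callback, which takes `monitor`, `patience`, and `min_delta` arguments; whether the original experiment used it is not stated.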
6.3 Sentiment Analysis of Content
The sentiment of the content of every page is evaluated to understand its context. The results from the three best machine learning model implementations are given below. The dataset used to train the models was taken from Kaggle, and the training result is shown in Fig. 3.

6.3.1 Naive Bayes
The Naive Bayes algorithm was used to train a model that classifies any given sentence or paragraph into a positive or negative class. This information is useful for analyzing the nature of the content. Certain Web sites may even manipulate their metadata so that search engines index them for topics they only pretend to cover, while the actual content differs from what is expected. The accuracy attained with the Naive Bayes model on this dataset was approximately 73%. The same dataset was also tested with the multinomial and Bernoulli classifiers.

6.3.2 Multinomial Naïve Bayes
The accuracy attained using the MNB classifier was close to 77%.
Fig. 3 Training result of Naive Bayes
Table 3 Domain segregation
Level     Domains
Level 1   org, gov, mil, edu, int
Level 2   ac, ad, ae, af, ag, ai, al, am, an, ao, aq, ar, as, at, au, aw, ax, az, ba, bb, bd, be, bf, bg, bh, bi, bj, bl, bm, bn, bo, bq, br, bs, bt, bv, bw, by, bz, ca, cc, cd, cf, cg, ch, ci, ck, cl, cm, cn, co, cr, cu, cv, cw, cx, cy, cz, de, dj, dk, dm, do, dz, ec, ee, eg
Level 3   com, co, net
6.3.3 Bernoulli Naïve Bayes
The Bernoulli NB classifier gave an accuracy of 78% on the same dataset. This maximum accuracy of ~0.78 compared favorably with the 68% of the XGB classifier trained on the same dataset. Since Naive Bayes outperformed XGB on this dataset, it was chosen to be part of the application.
6.4 Security Level Used by Web sites
The security or encryption level used by a Web site is considered an important factor. URLs using HTTPS were given higher priority than URLs using HTTP.

6.5 Domain Name
The domain names considered were segregated as given in Table 3. Domains such as org and gov were given high priority, i.e., a high score that carries more weight during training; domains such as ac, ad, je, and nl were given medium priority; and com, co, and net were given the lowest priority.
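The protocol and domain rules of Sects. 6.4 and 6.5 can be sketched with the Python standard library. The function below is illustrative only: its name, the returned dictionary, and the rule that treats any remaining two-letter top-level domain as a country code are assumptions, not details taken from the paper. Since `co` appears in both Level 2 and Level 3 of Table 3, the sketch follows the text of Sect. 6.5 and checks the lowest-priority set first.

```python
from urllib.parse import urlparse

HIGH = {"org", "gov", "mil", "edu", "int"}   # Level 1 in Table 3
LOW = {"com", "co", "net"}                   # Level 3 in Table 3


def url_features(url):
    """Extract the protocol flag and the domain priority tier of a URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    if tld in HIGH:
        tier = "high"
    elif tld in LOW:
        tier = "low"
    elif len(tld) == 2:
        tier = "med"       # assumption: remaining two-letter TLDs are country codes
    else:
        tier = "unknown"   # TLDs not covered by Table 3
    return {"https": parsed.scheme == "https", "tld": tld, "tier": tier}


for url in ("https://www.example.org/page",
            "http://example.de/index.html",
            "http://shop.example.com/offer"):
    print(url, url_features(url))
```

The `https` flag and the tier map directly onto the Http/Https and High/Med/Low columns of the one-hot representation in Table 4.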
7 Results
The results were obtained by combining the outputs of the contributing factors in a single module, using for each factor the machine learning model that produced the best accuracy. The output of each contributing factor is considered when evaluating any Web site/URL found on the Web. A sample module was set up on Mac OS X with a pluggable interface implemented as a Mozilla Firefox extension. Python 3.5 and Django 1.10 were used to build a RESTful Web application that receives the URLs to be evaluated from the extension. There are various ways to combine the results from the multiple factors discussed above. We implemented Pearson's correlation to understand how correlated the values of the input features discussed in the earlier section are. URLs were ranked from the most highly correlated, regarded as the most reliable, down to those with low or negative correlation values, regarded as the less reliable Web sites/URLs [42].

Pearson's correlation algorithm
Pearson's correlation is used to understand the relationship between variables in a set. The possible research hypotheses are that the variables show a positive linear relationship, a negative linear relationship, or no linear relationship at all. Correlation cannot be used to test an attributive research hypothesis, but in a true experiment it can be used to test a causal hypothesis. Correlation can always be used to test an associative hypothesis. The value lies between −1 and 1. The most highly related variables are of interest to this research. Pearson's correlation coefficient applied to a population is generally denoted by ρ (rho). Given X and Y, ρ can be written as shown below:

\[
\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \tag{1}
\]

where
cov   is the covariance of X and Y,
σ_X   is the standard deviation of X,
σ_Y   is the standard deviation of Y.

ρ can also be expressed in terms of means and expectations, since

\[
\operatorname{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] \tag{2}
\]

By (1) and (2), we can rewrite ρ as:

\[
\rho_{X,Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y} \tag{3}
\]

where σ_X, σ_Y, and E are as defined above, and μ_X and μ_Y are the means of X and Y, respectively. The formula for ρ can be further expressed as

\[
\rho_{X,Y} = \frac{E[XY] - E[X]E[Y]}{\sqrt{E[X^2] - (E[X])^2}\,\sqrt{E[Y^2] - (E[Y])^2}} \tag{4}
\]

where

\[
\mu_X = E[X]
\]
\[
\mu_Y = E[Y]
\]
\[
\sigma_X^2 = E[(X - E[X])^2] = E[X^2] - (E[X])^2
\]
\[
\sigma_Y^2 = E[(Y - E[Y])^2] = E[Y^2] - (E[Y])^2
\]
\[
E[(X - \mu_X)(Y - \mu_Y)] = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]
\]

The input features discussed in Sect. 6 were converted into numeric form as given in Table 4, so that Pearson's correlation can be calculated for each Web site/URL; the respective scores are plotted in Fig. 4. For this implementation, we targeted a search engine and prepared a score for each URL returned by the engine, so that the user can decide which one to browse. The score was calculated with Pearson's correlation formula as described above, using the values from the input table. One-hot encoding, which represents each category of values as a vector, was used for the HTTP/HTTPS and domain name fields. Clickbait is represented as 1 for URLs whose clickbait percentage is above 75% and as 0 otherwise. For sentiment analysis, 1 represents content classified as negative and 0 represents good content. Maliciousness of each URL is represented as 1 for malicious and 0 for non-malicious.

Table 4 Converted parameter representation
      Http   Https   High   Med   Low   Sentimental   Malicious   Clickbaity
0     0      1       1      0     0     0             1           0
1     0      1       1      0     0     0             0           0
2     0      1       1      0     0     0             0           1
3     0      1       1      0     0     0             1           1
4     0      1       1      0     0     0             0           0
5     1      0       0      0     1     0             1           0
6     0      1       0      0     1     0             1           0
7     0      1       0      0     1     0             1           0
8     0      1       0      0     1     0             1           0
9     0      1       0      0     1     0             1           0
10    1      0       0      0     1     0             1           0
11    0      1       0      0     1     0             1           0
12    0      1       1      0     0     0             0           0
13    0      1       0      0     1     0             1           0
14    0      1       0      0     1     0             1           0
15    0      1       0      0     1     0             0           0
16    0      1       1      0     0     0             0           0
17    1      0       0      0     1     0             0           0
18    0      1       0      1     0     0             1           0

Fig. 4 Score calculated from the numerical representation

The final application of Pearson's correlation to this set gave the result shown in Fig. 4. The scores calculated using this formula are displayed to the end user in a way that lets users decide from their own perspective. For each URL, the decision-making attributes calculated by the different models were passed through Pearson's algorithm. URLs that resulted in a positive correlation were shown with a green div box next to the URL, and those with a negative correlation with a red div box. The output of a sample query made using the application is shown in Fig. 5.

Fig. 5 Representation of output

Green values represent a positive correlation among the attributes and indicate that the parameters used are linearly related to each other, so these URLs can be considered more reliable than the others. URLs whose scores are marked red have a negative correlation, which indicates that the parameters are not linearly related and that a change in the behavior of one can contradict another. They can thus be considered less reliable. Manual observation of random sample URLs in this research supported this: the URLs whose parameters were linearly correlated were the ones that showed more reliable or authentic Web content.
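Eq. (4) translates directly into code. The sketch below is not the authors' implementation (their code is linked in [42]); the function names are illustrative, and pairing the Low and Malicious columns of Table 4 is only an example of computing the coefficient between two feature columns, not the per-URL scoring procedure itself. The green/red rule follows the description above, with a score of exactly zero treated as red by assumption.

```python
import math


def pearson(xs, ys):
    """Pearson's rho via Eq. (4): (E[XY] - E[X]E[Y]) / (sigma_X * sigma_Y)."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    e_x = sum(xs) / n
    e_y = sum(ys) / n
    e_xy = sum(x * y for x, y in zip(xs, ys)) / n
    var_x = sum(x * x for x in xs) / n - e_x ** 2   # E[X^2] - (E[X])^2
    var_y = sum(y * y for y in ys) / n - e_y ** 2   # E[Y^2] - (E[Y])^2
    if var_x <= 0 or var_y <= 0:
        raise ValueError("correlation is undefined for a constant variable")
    return (e_xy - e_x * e_y) / math.sqrt(var_x * var_y)


def label(score):
    """Positive correlation is shown green, anything else red (zero is an assumption)."""
    return "green" if score > 0 else "red"


# Low and Malicious columns of Table 4, rows 0 to 18.
LOW = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0]
MALICIOUS = [1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1]

rho = pearson(LOW, MALICIOUS)
print(round(rho, 3), label(rho))
```

A constant column has zero variance, so the denominator of Eq. (4) vanishes and the coefficient is undefined. This applies to the Sentimental column of Table 4, which is 0 in every row, and the sketch raises an error in that case rather than returning a score.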
8 Conclusion
This paper has shown that Web sites can be tagged as reliable or non-reliable by considering parameters such as domain, maliciousness of the URL, security used, and sentiment analysis of the content. Using these factors, some sample URLs were tested and labeled. These were in turn used to test other URLs, which validated the model's performance. The performance of the system must nevertheless be improved by gathering more data and re-training the current machine learning models. With the current architecture, the labeling of URLs was found to be 60–70% accurate on manual verification. The system is slow in processing results and may take 3–5 min to show the final scores. Pearson's algorithm scores URLs based on the relation between the parameters taken as a base, and no specific pattern was observed between the input parameters and the results. Better algorithms can be designed to take this input and interpret the parameters. Together with the new methods suggested in Sect. 9 of this paper, they might produce output that is more reliable and accurate than what is observed now. The topic is also inherently subjective: a Web site or Web content found satisfactory by one person can be offensive or unethical to another. A perfect artificial system for predicting the authenticity of Web content cannot be created, but a model that scores content in a general way can be built through rigorous training with abundant data.
9 Future work
Training data should be built through methodologies such as crowdsourcing or manual data creation, on top of which many trials are to be performed with ML/DL models. Targeting a wider audience that is willing to participate in building the dataset can be considered, and seeking data from Internet giants is also a viable option. Further, more algorithms are to be used for understanding the parameters and scoring each URL. The application developed can be applied to a widely available dataset of URLs, such as Common Crawl [43], to label a section of the dataset that can be reused for training. The dataset created in this way for millions of Web sites/URLs can be used for further labeling and for training different AI models, which predict the classes of the remaining dataset and can then be retrained. This method promises a stable system that can rate the authenticity of unseen Web sites/URLs available on the Internet today. A further change in architecture is possible using the latest update of TensorFlow.js, where a lite version of the TensorFlow model created for this research could be deployed at the user end through the plugin or extension. Integrating this architecture with the method discussed above, an implementation can be materialized as shown in Fig. 6.

Fig. 6 Implementation using TensorFlowJS

When the confidence score for a URL does not pass the decided threshold, the URL can be flagged and sent back to the server for further investigation; the model can then be re-trained and the updated weights sent back on a regular basis. This removes the server calls for scoring, thereby reducing communication latency, so that outputs can be obtained as early as possible. Deeper research on this system and the methodologies discussed will yield much better insights into identifying the authenticity of Web content.

Acknowledgements We sincerely thank the University of Petroleum and Energy Studies for providing the necessary facilities and backups needed for this research.
References 1. T.M. Mahmoud, T. Abd-El-Hafeez, D.T Nour El-Deen, Br. J. Appl. Sci. Technol. 18(6), 1–14 (2016). https://doi.org/10.9734/bjast/2016/30376 2. “Internet Users.” Number of Internet Users—Internet Live Stats. www.internetlivestats.com/ internet-users/ 3. M.S. Amin, S. Kabir, R. Kabir, A score based web page ranking algorithm. Int. J. Comput. Appl. 110(12), 0975–8887 (2015) 4. T. Berners-Lee, R. Cailliau, J.-F. Groff, B. Pollermann, World-wide web: the information universe. Internet Res. 20(4), 461–471 (2010). https://doi.org/10.1108/10662241011059471 5. World Wide Web, Timeline Pew Research Center: Internet, Science and Technology, 11 Mar 2014. www.pewinternetorg/2014/03/11/world-wide-web-timeline 6. C. Dewey, 36 ways the web has changed us, in The Washington Post (WP Company, 12 Mar. 2014). www.washingtonpost.com/news/style-blog/wp/2014/03/12/36-ways-the-webhas-changed-us/ 7. S. Brin, L. Page, The anatomy of a large-scale hypertextual web search engine, in Proceedings the Seventh World Wide Web Conference (Brisbane, Australia) 8. G. Pant, et al., Search Engine-Crawler Symbiosis: Adapting to Community Interests (The University of Iowa, USA) 9. D.C. Manning et al., Introduction to Information Retrieval (Cambridge University Press, New York, 2008) 10. S. Mizzaro, How Manyçc Relevances in Information Retrieval (University of Udine, Italy) 11. Search Kit Programming Guide (Apple Inc, Canada, 2005)
102
J. Ashok and P. Badoni
12. J. Barker, What Makes a Search Engine Good? Available https://www.lib.berkeley.edu/Teachi ngLib/Guides/Internet/SrchEngCriteria.pdf (2003) 13. A. Bifet, et al., An Analysis of Factors Used in Search Engine Ranking (Technical University of Catalonia, 2005) 14. Oxford English Dictionary 15. G. Pennycook, D.G. Rand, Who falls for fake news? The roles of analytic thinking, motivated reasoning, political ideology, and bullshit receptivity. SSRN Electron. J. (2017). https://doi. org/10.2139/ssrn.3023545 16. A. Cairo, Graphics lies, misleading visuals: Reflections on the challenges and pitfalls of evidence-driven visual communication, New Challenges for Data Design (Springer-Verlag London Ltd., 2015), pp. 103–116. https://doi.org/10.1007/978-1-4471-6596-5_5 17. A. Munoz, Machine Learning and Optimization [Online]. Retrieved 2017-06-01 from https:// www.cims.nyu.edu/~munoz/files/ml_optimization.pdf 18. K.P. Murphy, Machine Learning: A Probabilistic Perspective (The MIT Press, 2012). ISBN: 0262018020 9780262018029 19. R. Kohavi, F. Provost, Glossary of terms. Mach. Learn. 30, 271–274 (1998) 20. P. Ongsulee, Artificial intelligence, machine learning and deep learning, in 2017 15th International Conference on ICT and Knowledge Engineering (ICT&KE), pp. 1–6. https://doi.org/10. 1109/ICTKE.2017.8259629 21. L. Deng, D. Yu, Deep learning: methods and applications (PDF). Found. Trends Signal Process. 7(3–4), 199–200 (2014). https://doi.org/10.1561/2000000039 22. Y.-D. Zhang, G. Zhao, J. Sun, X. Wu, Z.-H. Wang, H.-M. Liu, V.V. Govindaraj, T. Zhan, J. Li, Smart pathological brain detection by synthetic minority oversampling technique extreme learning machine and Jaya algorithm. Multimedia Tools Appl. 2017. ISSN 1380-7501 23. Z. Liang, G. Zhang, J.X. Huang, Q.V. Hu, Deep learning for health- care decision making with EMRs, in Proceedings of IEEE International Conference on Bioinformatics and Biomedical (Nov 2014), pp. 556–559 24. M.A. Ali, D. Svetinovic, Z. Aung, S. 
Lukman, Malware detection in android mobile platform using machine learning algorithms, in 2017 International Conference on Infocom Technologies and Unmanned Systems (Trends and Future Directions) (ICTUS). https://doi.org/10.1109/ ICTUS.2017.8286109 25. N. Shone, T.N. Ngoc, V.D. Phai, Q. Shi, A deep learning approach to network intrusion detection. IEEE Trans. Emerg. Topics Comput. Intell. 2(1), 41–50 (2018). https://doi.org/10.1109/ tetci.2017.2772792 26. T.M. Mitchell, Machine Learning (McGraw-Hill, New York, NY, 1997) 27. X. Qi, B.D. Davison, Web page classification: features and algorithms. ACM Comput. Surv. (CSUR) 41(2), 12 (2009) 28. C. Chekuri, M. Goldwasser, P. Raghavan, E. Upfal, Web search using automated classification, in Proceedings of the Sixth International World Wide Web Conference, 1997 29. H. Chen, S. Dumais, Bringing order to the web: automatically categorizing search results, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM Press, New York, NY, 2000), pp. 145–152 30. M. Käki, Findex: search result categories help users when document ranking fails, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI) (ACM Press, New York, NY, 2005), pp. 131–140 31. D. Lazer, M. Baum, J. Benkler, A. Berinsky, K. Greenhill, M. Metzger, B. Nyhan, G. Pennycook, D. Rothschild, M. Schudson, S.A. Sloman, C.R. Sunstein, E.A. Thorson, D.J. Watts, J. Zittrain, The science of fake news, 2017 32. G. Pennycook, A perspective on the theoretical foundation of dual-process models, in Dual Process Theory 2.0, ed. by W. De Neys (Psychology Press, 2017), p. 34 33. B. Pang, et al., Thumbs up? Sentiment classification using machine learning techniques, in Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing—EMNLP’02, July 2002, pp. 79–86. https://doi.org/10.3115/1118693.1118704
34. R. Klabunde, D. Jurafsky, J.H. Martin, Speech and language processing. Zeitschrift Für Sprachwissenschaft 21(1) (2002). https://doi.org/10.1515/zfsw.2002.21.1.134 35. D. Fetterly, et al., Spam, damn spam, and statistics: Using statistical analysis to locate spam web pages, in Proceedings of the 7th International Workshop on the Web and Databases Colocated with ACM SIGMOD/PODS 2004—WebDB’04, June 2004. https://doi.org/10.1145/1017074. 1017077 36. F. Vanhoenshoven, et al., Detecting malicious URLs using machine learning techniques, in 2016 IEEE Symposium Series on Computational Intelligence (SSCI), 2016. https://doi.org/10. 1109/ssci.2016.7850079 37. J. Ma, et al., Identifying suspicious URLs, in Proceedings of the 26th Annual International Conference on Machine Learning—ICML’09, 2009. https://doi.org/10.1145/1553374.1553462 38. S. Lee, J. Kim, WarningBird: a near real-time detection system for suspicious URLs in Twitter stream. IEEE Trans. Dependable Secure Comput. 10(3), 183–195 (2013). https://doi.org/10. 1109/tdsc.2013.3 39. I. Kotenko, et al., Evaluation of text classification techniques for inappropriate web content blocking. In 2015 IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), 2015. https://doi.org/ 10.1109/idaacs.2015.7340769 40. F. Salvetti, N. Nicolov, Weblog classification for fast Splog filtering, in Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers on XX—NAACL’06, 2006. https://doi.org/10.3115/1614049.1614084 41. I. Kotenko, et al., Protection against information in ESociety: Using data mining methods to counteract unwanted and malicious data, in SpringerLink (Springer, 21 June 2017). www.link. springer.com/chapter/10.1007/978-3-319-69784-0_15 42. Link to GitHub where code is hosted. https://github.com/Jkrish1011/Web-Content-Authentic ator 43. https://commoncrawl.org/
Classification of Brain Tumor by Convolution Neural Networks Madhuri Pallod and M. V. Vaidya
Abstract Brain tumors are a serious disease that, at the highest grade, leads to a very short life expectancy. Treatment planning is therefore a critical step toward improving patients' quality of life. Imaging methods such as computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound scanning are commonly recommended for evaluating tumors of the brain, lung, liver, breast, and prostate. In this research, MRI images are used specifically to recognize tumors inside the brain. However, the enormous amount of MRI data produced in a given time prevents manual classification into tumor and non-tumor, and manual inspection gives precise quantitative measurements only for a limited number of images. Accordingly, to reduce mortality rates, a trustworthy and automatic classification scheme is important. Automatic classification of brain tumors is an extremely challenging task because of the large spatial and structural variability of the region containing the tumor. In the proposed research, automatic brain tumor identification is done using Convolution Neural Networks (CNN). The deeper architecture is achieved by using small kernels, and the neuron weights are kept small. Experimental results show that the CNN achieves 97.5% accuracy with low complexity relative to the other state-of-the-art approaches.
Keywords Neural networks · Brain tumor · CNN · Classification
M. Pallod (B) · M. V. Vaidya, Department of Information Technology, SGGS Institute of Engineering & Technology, Nanded, Maharashtra 431606, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_7

1 Introduction
The brain is one of the main organs of the human body and consists of billions of cells. A tumor is a group of abnormal cells created by unregulated cell division. Brain tumors are split into two groups: low-grade (grade 1 and grade 2), known as benign tumors, and high-grade (grade 3 and grade 4), known as malignant tumors. Benign tumors are non-cancerous and do not spread to other parts of the brain, whereas a malignant tumor is cancerous, has undefined borders, and can easily spread to other regions of the brain, which leads to death [1]. MRI is used to identify tumors and to model their progression, and MRI images are the primary input for diagnosis and therapy planning once a tumor is detected. Among medical images, an MRI image carries more detail about brain anatomy and brain tissue than an ultrasound or CT image. For automated brain tumor identification and type classification, researchers therefore work with brain MRI images, which can be scanned and displayed on screen. Traditional methods for brain tumor detection were Neural Networks (NN) and Support Vector Machine (SVM) [2]. Nowadays, deep learning (DL) models set the trend in machine learning, since a deep architecture can represent complex relationships efficiently without needing as many nodes as shallow architectures such as K-nearest neighbor (KNN) and support vector machine (SVM). As a result, they soon became the state of the art in health informatics areas such as medical image processing, medical informatics, and bioinformatics.
2 Previous Work
In [3], fuzzy C-means (FCM) segmentation is used to separate the tumor region of the brain from the non-tumor region. Wavelet features are extracted using a multilevel Discrete Wavelet Transform (DWT), and a Deep Neural Network (DNN) is finally applied to classify brain tumors with high accuracy. The technique is compared with the KNN, Linear Discriminant Analysis (LDA), and Sequential Minimal Optimization (SMO) classification methods. DNN-based brain tumor classification reaches an accuracy of 96.97%, but its complexity is very high and its efficiency is very low.

In [4], a novel biophysio-mechanical tumor growth simulation is presented to analyze the growth of a patient's tumor step by step. It is applied to gliomas with individual margins and to solid tumors in order to capture the significant tumor mass effect. The tumor growth model is formed by combining discrete and continuous methods. The proposed scheme makes it possible to implicitly segment tumor-bearing brain images based on atlas-based registration. The technique is used primarily for brain tumor segmentation, and its computation time is very high.

In [5], multi-fractal (MultiFD) feature extraction and an enhanced AdaBoost classification scheme are used to detect and segment brain tumors. The MultiFD feature extraction scheme extracts the texture of the brain tumor tissue, and the enhanced AdaBoost classification methods assess whether a given brain tissue is tumor or not. The complexity is considerable.

In [6], the Local Independent Projection-based Classification (LIPC) method is used to classify the brain voxels. The path feature is also extracted in this method, so explicit regularization is not required in LIPC. The accuracy is modest.

In [7], a seeded tumor segmentation method based on a new Cellular Automata (CA) technique is introduced and compared with graph-cut-based segmentation. Seed selection and the volume of interest (VOI) are computed to segment brain tumors effectively, and tumor-cut segmentation is also integrated into this work. The complexity is low, but so is the accuracy.

In [8], a modern brain tumor segmentation approach, often referred to as a multimodal brain tumor segmentation scheme, is adopted. It combines separate segmentation algorithms to achieve higher performance than the existing methods, but its complexity is high.

A survey of brain tumor segmentation is given in [9]. It discusses different segmentation methods, for example region-based segmentation, threshold-based segmentation, fuzzy C-means segmentation, atlas-based segmentation, Markov Random Field (MRF) segmentation, the deformable model, and the geometric deformable model, and assesses the performance, robustness, and validity of all of them. The GANNIGMAC, Decision Tree, and wrapper-based Bagging C approaches are used to obtain the decision rules. The decision rules are also simplified by using a hybrid feature set that combines (GANNIGMAC + MRMR C + Bagging C + Decision Tree).

In [10], fuzzy control theory is used for the segmentation and classification of brain tumors. The Fuzzy Inference System (FIS) is one of the techniques commonly used for brain segmentation. Supervised classification is used to establish the membership functions of the fuzzy controller. The efficiency is high, but the accuracy is low.

In [11], adaptive histogram equalization is used to enhance image contrast. FCM-based segmentation is then performed to separate the tumor from the entire brain image, after which Gabor features are extracted to filter out the abnormal brain cells. The complexity is considerable and the accuracy is low.

In this work, a novel automatic classification of brain tumors is performed using convolution neural networks.
3 Proposed Methodology
The neural network (NN) architecture and its implementation are modeled on the human brain. Tasks such as vector quantization, inference, data clustering, pattern identification, and classification can be performed using a neural network. Depending on their interconnections, NNs are split into three groups: feed-forward, feedback, and recurrent networks. Feed-forward NNs are further subdivided into single-layer and multi-layer networks. A single-layer network has no hidden layer; it contains only an input layer and an output layer. A multi-layer network contains input, hidden, and output layers. A recurrent network is a feedback network based on a closed loop.

A convolution neural network (CNN) is composed of several layers: an input layer, a convolutional layer, a rectified linear unit (ReLU) layer, a max pooling layer, and finally a fully connected layer. A CNN transforms a 3D input volume (length, width, height) into a 3D output volume. In the convolution layer, the input image is divided into small regions. In the ReLU layer, an activation function is applied to each input element. The pooling layers are optional. The final, fully connected layer is used to generate the class scores.
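The layer sequence described above can be made concrete with a very small forward pass. The following is only a sketch in plain Python, not the authors' MATLAB implementation: the 6 × 6 input, the 3 × 3 edge kernel, and the fully connected weights are all hypothetical values chosen so that the arithmetic can be followed by hand.

```python
# Toy forward pass: convolution -> ReLU -> max pooling -> fully connected.
# All sizes and weights are illustrative assumptions, not values from the chapter.

def conv2d(image, kernel):
    """Valid (no padding) sliding-window product-sum of a 2D list with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Element-wise activation: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping size x size max pooling (leftover rows/columns are dropped)."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def fully_connected(fmap, weights, biases):
    """Flatten the feature map and compute one score per class."""
    flat = [v for row in fmap for v in row]
    return [sum(w * x for w, x in zip(ws, flat)) + b
            for ws, b in zip(weights, biases)]


# A 6 x 6 "image" with a bright vertical stripe in the middle.
image = [[0, 0, 1, 1, 0, 0] for _ in range(6)]
# 3 x 3 kernel that responds to a dark-to-bright transition from left to right.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

conv_out = conv2d(image, kernel)      # 4 x 4 feature map
act_out = relu(conv_out)              # negative responses removed
pooled = max_pool(act_out, size=2)    # 2 x 2 after pooling
weights = [[1.0, 0.0, 1.0, 0.0],      # hypothetical weights for class 0
           [0.0, 1.0, 0.0, 1.0]]      # hypothetical weights for class 1
biases = [0.0, 0.0]
scores = fully_connected(pooled, weights, biases)
print(conv_out[0], pooled, scores)
```

In a trained network the kernel and the fully connected weights are learned rather than fixed by hand; the sketch only shows how the volume shrinks from 6 × 6 to 4 × 4 to 2 × 2 before the class scores are produced.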
Figure 1 shows the block diagram of CNN-based brain tumor classification. It is divided into two phases, training and testing. The brain image dataset is the publicly available Brain Tumor Image Segmentation Benchmark (BRATS), which contains both cancerous and non-cancerous images. During the training phase, preprocessing, feature extraction, and classification with a loss function are carried out to build a predictive model. Preprocessing resizes the images to a common size. Finally, the CNN is used to identify brain tumors automatically.

Fig. 1 Brain tumor classification using CNN block diagram

ImageNet is used as the pretrained model. If a network is trained from the first layer, every layer has to be trained, which takes a long time. To avoid this time complexity problem, a pretrained model is used to classify the BRATS images, and with the proposed approach only the last layer is trained, using MATLAB software. Because not all training stages shown in the block diagram are needed, the computation time of the proposed approach is low while its brain tumor classification performance is high.

The loss function is minimized using a gradient descent algorithm. A score function maps the raw image pixels to class scores. The loss function measures the quality of a particular set of parameters, based on how well the induced scores agree with the ground truth labels in the training data. Computing the loss function is very important for improving accuracy: the loss is high when the accuracy is low, and vice versa. The gradient of the loss function is evaluated repeatedly to carry out gradient descent.

CNN classification works as follows:
1. A convolutional filter is applied at the first layer.
2. The convolution filter is smoothed, which reduces the sensitivity of the filter.
3. The activation layer controls the transfer of the signal from one layer to the next.
4. A rectified linear unit (ReLU) is used to speed up the training period.
5. The neurons in each layer are connected to every neuron in the next layer.
6. During training, a loss layer is added at the end to give feedback to the neural network.
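The interplay of score function, loss function, and gradient descent described in this section can be sketched on a toy problem. The example below assumes a linear score function and a softmax cross-entropy loss, which the chapter does not specify; the four-pixel "images", the labels, and the learning rate are hypothetical.

```python
import math

# Sketch: linear score function + softmax cross-entropy loss + gradient descent.
# The loss type, data, and learning rate are assumptions made for illustration.

def scores(W, b, x):
    """Score function: maps raw pixel values to one score per class."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b_k
            for w, b_k in zip(W, b)]


def softmax(s):
    m = max(s)                                  # subtract max for numerical stability
    exps = [math.exp(v - m) for v in s]
    total = sum(exps)
    return [e / total for e in exps]


def loss_and_grads(W, b, data):
    """Mean cross-entropy loss and its gradients with respect to W and b."""
    n_classes, n_feats, n = len(W), len(W[0]), len(data)
    gW = [[0.0] * n_feats for _ in range(n_classes)]
    gb = [0.0] * n_classes
    loss = 0.0
    for x, label in data:
        p = softmax(scores(W, b, x))
        loss -= math.log(p[label])
        for k in range(n_classes):
            d = p[k] - (1.0 if k == label else 0.0)
            gb[k] += d / n
            for j in range(n_feats):
                gW[k][j] += d * x[j] / n
    return loss / n, gW, gb


def accuracy(W, b, data):
    hits = 0
    for x, label in data:
        s = scores(W, b, x)
        hits += s.index(max(s)) == label
    return hits / len(data)


# Hypothetical data: label 1 = tumor, label 0 = non-tumor.
data = [
    ([0.9, 0.8, 0.1, 0.2], 1),
    ([0.8, 0.9, 0.2, 0.1], 1),
    ([0.1, 0.2, 0.9, 0.8], 0),
    ([0.2, 0.1, 0.8, 0.9], 0),
]
W = [[0.0] * 4 for _ in range(2)]
b = [0.0, 0.0]
learning_rate = 0.5
history = []
for step in range(100):
    loss, gW, gb = loss_and_grads(W, b, data)
    history.append(loss)
    # Gradient descent update: move the parameters against the gradient.
    W = [[w - learning_rate * g for w, g in zip(w_row, g_row)]
         for w_row, g_row in zip(W, gW)]
    b = [b_k - learning_rate * g for b_k, g in zip(b, gb)]
final_accuracy = accuracy(W, b, data)
print(history[0], history[-1], final_accuracy)
```

The recorded loss falls as the parameters improve while the accuracy on the toy data rises, which is the inverse relation between loss and accuracy noted above.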
4 Results
Our dataset includes tumor and non-tumor MRI images collected from several online sources. Radiopaedia [12] contains real patient cases; the tumor images were gathered from Radiopaedia and from the Brain Tumor Image Segmentation Benchmark (BRATS) 2015 dataset [13]. In this study, a convolution neural network is used to detect brain tumors automatically. Simulations were performed using MATLAB. The accuracy is measured and compared with that of the other state-of-the-art methods. Training accuracy, validation accuracy, and validation loss are analyzed to determine the efficiency of the proposed brain tumor classification scheme. SVM-based classification needs the output of a separate feature extraction step as its input, and its result is then reported in terms of accuracy; the computation time of SVM-based tumor and non-tumor detection is high and its accuracy is low. CNN-based classification does not require separate feature extraction steps, because the CNN itself learns the feature values. Figure 2 shows classified tumor and non-tumor brain images. Computation time and complexity are therefore low, and accuracy is high. The accuracy of brain tumor classification is shown in Fig. 3. Compared with a normal image, a tumor image receives the highest score, which helps to classify it. Finally, the CNN classifies each image as cancerous or non-cancerous.
Fig. 2 Classified results based on CNN (panels: Brain Tumor Image; Brain Non-Tumor Image)
Fig. 3 Comparison of classification of brain tumor

5 Conclusion
The main goal of the proposed work is automatic brain tumor classification with high performance and low complexity. Traditional brain tumor classification uses fuzzy C-means (FCM) based segmentation, texture and shape feature extraction, and SVM- and DNN-based classification; its computation time is high and its accuracy is low. In the proposed scheme, classification based on convolution neural networks is introduced in order to increase accuracy and reduce computation time. The classification results are given as abnormal (tumor) or normal brain images. CNN is a deep learning method that consists of a sequence of feed-forward layers, and it is implemented in MATLAB. The ImageNet database, one of the pretrained models, is used for classification, so training is performed only for the final layer. The CNN also extracts the raw pixel values together with the depth, width, and height features. Finally, a gradient-descent-based loss function is applied to achieve high accuracy. The training accuracy, validation accuracy, and validation loss are measured. The training accuracy is 97.5%. Likewise, the validation accuracy is high and the validation loss is very low.
References
1. J. Zhang et al., Brain tumor segmentation based on refined fully convolutional neural networks with a hierarchical dice loss. Cornell Univ. Libr. Comput. Vis. Pattern Recogn. (2018)
2. S. Pereira et al., Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Trans. Med. Imaging (2016)
3. H. Mohsen et al., Classification using deep learning neural networks for brain tumors. Future Comput. Inform. 1–4 (2017)
4. S. Bauer et al., Multiscale modeling for image analysis of brain tumor studies. IEEE Trans. Biomed. Eng. 59(1) (2012)
5. A. Islam et al., Multi-fractal Texture Estimation for Detection and Segmentation of Brain Tumors (IEEE, 2013)
6. M. Huang et al., Brain Tumor Segmentation Based on Local Independent Projection-based Classification (IEEE Transactions on Biomedical Engineering, IEEE, 2013)
7. A. Hamamci et al., Tumor-cut: segmentation of brain tumors on contrast enhanced MR images for radiosurgery applications. IEEE Trans. Med. Imaging 31(3) (2012)
8. B.H. Menze et al., The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imaging (2014)
9. J. Liu et al., A survey of MRI-based brain tumor segmentation methods. TSINGHUA Sci. Technol. 19(6) (2011)
10. R. Karuppathal, V. Palanisamy, Fuzzy based automatic detection and classification approach for MRI-brain tumor. ARPN J. Eng. Appl. Sci. 9(12) (2014)
11. Janani, P. Meena, Image segmentation for tumor detection using fuzzy inference system. Int. J. Comput. Sci. Mobile Comput. 2(5), 244–248 (2013)
12. Radiopaedia. https://radiopedia.org
13. BRATS 2015. https://www.smir.ch/BRATS/Start2015
Automated Multiple Face Recognition Using Deep Learning for Security and Surveillance Applications Nidhi Chand, Nagaratna, Prema Balagannavar, B. J. Darshini, and H. T. Madan
Abstract Face recognition is a difficult task because of the large number of in-the-wild datasets. Deep learning has been a research focus in recent years, and because of its strong performance it is widely used in the area of pattern recognition. The deep learning structure considered here is composed of a set of carefully designed CNNs. Deep learning has provided effective solutions in terms of recognition performance. In this paper, our purpose is to study deep-learning-based face recognition under conditions such as varying head poses, difficult illumination, and faulty localization of facial features, and to assess the accuracy obtained with deep learning. We use OpenCV and Haar cascades for detecting faces, eyes, and smiles. The LBPH face recognizer is used for training on the data and for recognizing faces. A convolution neural network (CNN) is used for accurate extraction of facial features.
Keywords CNN · Face recognition · Deep learning · LBPH · OpenCV · Haar cascade
N. Chand (B) · Nagaratna · P. Balagannavar · B. J. Darshini · H. T. Madan, REVA University, Kattigenahalli, Yelahanka, Bengaluru 560064, India
H. T. Madan, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_8

1 Introduction
Over the past two decades, face recognition has been studied extensively because of its many possible applications in biometrics and information security, and countless algorithms have been proposed to recognize human faces in images or videos. We live in a digital world and expect everything to be digital, so face recognition is now a trending topic. Facial recognition has become a standard feature of nearly every personal device, and nearly every security organization relies on it in one way or another. As the number of face recognition applications has grown, research on the topic has increased enormously. It is used all around the world: for security purposes in CCTV systems; for identifying people in cases of theft, murder, etc.; in online payments; in mobile applications such as Face Lock, Facebook, and Snapchat; in healthcare centers to secure patients' documents; for biometric attendance in schools, colleges, and offices; and in law enforcement, command administration, risk inspection, escape run, surveillance, fugitive hunts, smart cards, ATMs, driving licenses, and passports, to name a few. Many researchers have worked on this problem using many different techniques. Face recognition still faces many problems, such as poor accuracy, inadequate feature extraction, and unreliable face detection and recognition; in particular, systems struggle with twins and cannot detect and recognize a face when it is turned at different angles. Deep learning can overcome these problems because deep learning models are able to learn the relevant features by themselves.
1.1 Face Recognition Using Deep Learning
Deep learning is a subset of machine learning. It is an AI function that mimics the workings of the human brain in processing data for use in detecting objects, recognizing speech, translating languages, and making decisions. Deep learning AI is able to learn without human supervision, drawing from data that is both unstructured and unlabeled. It can be used to detect fraud or money laundering, among other functions. Deep learning makes sense of huge amounts of unstructured data that would normally take humans decades to understand and process. Deep learning is a class of machine learning algorithms that uses several layers to successively extract higher-level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify concepts relevant to a human, such as digits, letters, or faces (Fig. 1).
Fig. 1 Working of CNN in deep learning
2 Literature Survey
Early on, many researchers worked on face recognition, but they identified a person's face only in still images: they stored a large number of images of each person and detected the face in an image against this stored data. This paper concentrates on recognizing a face in both images and videos. Previous studies worked on detecting individual parts of the face, such as the jawline, nose, mouth, and eyes, and described the face model by comparing the detected features and the relations among them. Initially, Bledsoe [1] and Kanade [2] worked on several automatic and semi-automatic face recognition schemes, which they designed and grouped based on the distances between face points and the ratios among them. Geoffrey Hinton used sparse restricted Boltzmann machines [3] and the "Deep belief network" [4] for shared representation learning, cross-modal learning, etc. P. Ekman and W. V. Friesen's "Facial action coding system" [5] dealt with facial behavior, for which two conceptual approaches were classified.

Face recognition is a highly demanding task because of its complex nature. Face detection is a digital technology that determines the position and dimensions of a human face in an arbitrary image. Similar techniques can also detect other objects such as trees, cars, buildings, and bodies; this more general task is called object class detection, and its challenge is to determine the position and dimensions of the object in an image. Face localization, the process of locating the facial parts in a given image, is also needed.

"Face recognition from video using generalized mean deep learning neural" is the algorithm of Sharma and Yadav [6]. They detect faces in video using deep neural networks and report that their algorithm recognizes faces very accurately. "Deep learning networks for face detection" is the algorithm of Ye et al. [7]. They propose face recognition using deep learning and address the challenge of detecting faces accurately in unconstrained situations. "An incremental face recognition system based on deep learning" is proposed by Li et al. [8]; it is based on deep learning techniques and implemented with the FaceNet network architecture, and it is designed to be updated and upgraded as new data arrive. "Facial detection with expression recognition using artificial neural networks" is proposed by Owayjal et al. [9]; it is an automated vision system implemented in MATLAB, in which an artificial neural network (ANN) with a multi-layer perceptron is used. The system can crop and categorize several faces placed in the same image, including on non-white backgrounds (Table 1).
Table 1 Literature survey of face recognition using deep learning
S. No. | Paper title | Year | Methodology | Training dataset
1 | Face recognition from video using generalized mean deep learning neural | 2016 | Generalized mean method | PaSC and YouTube dataset
2 | Deep learning network for face detection | 2015 | Multi-layer non-linear mapping | –
3 | An incremental face recognition system based on deep learning | 2017 | S-DDL using SVM algorithm | FaceNet dataset training
4 | Face detection with expression recognition using artificial neural networks | 2016 | MATLAB, ANN with backpropagation | –
5 | Face recognition on system in Android using neural networks | 2015 | Feed-forward algorithm, backpropagation algorithm | –
3 System Architecture
In this work, face identification follows a fixed sequence of steps. First, when a person's face is detected, a frame is formed around it. Second, facial features are extracted with the help of a CNN; the extracted features are analyzed by the deep learning architecture with care for accuracy, and the derived feature details are stored in the system and classified accordingly (Figs. 2 and 3).

Fig. 2 Steps in face recognition
Fig. 3 Proposed system architecture

How do humans recognize people? When we meet a person for the first time, we usually observe facial features such as the eyes, nose, mouth, eyebrows, and jawline, so the next time we meet that person we recognize them by remembering these unique facial features. Recognizing faces is very easy for humans, but it is very difficult for a computer system, so the system has to be trained. Training and recognition have become easy with deep learning CNNs, OpenCV, and the LBPH face recognizer. Face recognition takes three steps:
1. Data gathering: Gather face data of the persons to be identified, taking great care over the facial features required for identification.
2. Train the recognizer: Feed the face data and the corresponding name of each face to the recognizer so that it can learn.
3. Recognition: Provide different faces of those people and check whether the face recognizer that was just trained identifies them.
3.1 Haar cascade
A Haar cascade makes it easy to identify facial features. Haar features are the key components of the classifier, and with the classifier's help the features of a face in a given image can be detected efficiently. The value of each feature is computed in the following steps. First, the sum of the pixels under the black rectangle and the sum of the pixels under the white rectangle are computed. The difference between these two sums then gives a single numerical value, which also supports detection in images at various angles and alignments. To detect a face, the image is scanned for these features starting from the top-left corner and moving across the complete image. This scan is repeated many times; each pass gives partial results that are carried into the next round, and the overall result is compiled once all the features have been evaluated. The accumulated result reaches the desired precision only when dealing with monochrome (grayscale) images.

Training Face Recognizer
There are three ways to train the recognizer, each with its own OpenCV constructor:
1. Local binary patterns histograms (LBPH): cv2.face.createLBPHFaceRecognizer()
2. Fisherfaces: cv2.face.createFisherFaceRecognizer()
3. Eigenfaces: cv2.face.createEigenFaceRecognizer()
In more recent OpenCV releases these constructors are named cv2.face.LBPHFaceRecognizer_create(), cv2.face.FisherFaceRecognizer_create(), and cv2.face.EigenFaceRecognizer_create().
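The Haar feature computation of Sect. 3.1 (difference between the pixel sums under the black and white rectangles, scanned from the top-left corner) can be illustrated in a few lines. The sketch below is not OpenCV's cascade classifier: it assumes a single two-rectangle feature, uses an integral image (the usual trick for fast rectangle sums, which the text does not mention), and runs on a hypothetical 4 × 6 grayscale patch.

```python
# Sketch of one Haar-like feature evaluated over a grayscale image.
# The feature layout, window size, and pixel values are illustrative assumptions.

def integral_image(img):
    """ii[r][c] holds the sum of img over rows < r and columns < c."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Edge feature: left half is the white rectangle, right half the black one.

    Returns sum(black) - sum(white).
    """
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    black = rect_sum(ii, top, left + half, height, half)
    return black - white


def scan(img, height, width):
    """Slide the feature window from the top-left corner over the whole image
    and return (value, top, left) for the strongest response."""
    ii = integral_image(img)
    best = None
    for top in range(len(img) - height + 1):
        for left in range(len(img[0]) - width + 1):
            value = two_rect_feature(ii, top, left, height, width)
            if best is None or abs(value) > abs(best[0]):
                best = (value, top, left)
    return best


# Dark region on the left, bright region on the right.
gray = [[10, 10, 10, 200, 200, 200] for _ in range(4)]
ii = integral_image(gray)
total = rect_sum(ii, 0, 0, 4, 6)
best = scan(gray, height=2, width=4)
print(total, best)
```

The strongest response is found where the window straddles the dark-to-bright boundary. A real cascade evaluates many such features at several scales and combines them through trained stages.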
The Eigenface recognizer is based on the observation that not every part of a face is equally useful for recognition. We humans recognize a person by distinctive features such as the eyes, mouth, nose, jawline, and eyebrows; that is, we concentrate on the areas of the face that differ most from other faces. For instance, there is a noticeable change from the eyes to the nose, and similarly from the nose to the mouth. Comparing faces in these regions captures the largest variation among faces, which helps in distinguishing one face from another. The Eigenface recognizer recognizes faces in this way. It considers the training images of all the people as a whole, extracts the relevant and useful components, and discards the rest. These important features are called principal components. Figure 4 shows the variance extracted from a list of faces.
Fig. 4 Eigenfaces recognition
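The idea of extracting principal components from all training images taken together can be sketched without any library. The example below is a strong simplification and not OpenCV's Eigenface recognizer: it assumes faces flattened to four pixels, keeps only the first principal component, and finds it by power iteration; the person names and pixel values are hypothetical.

```python
# Sketch of the Eigenfaces idea with a single principal component.
# Data, names, and the power-iteration approach are illustrative assumptions.

def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def first_principal_component(vectors, iterations=200):
    """Return (mean, unit direction of largest variance) using power iteration."""
    mean = mean_vector(vectors)
    centered = [[x - m for x, m in zip(v, mean)] for v in vectors]
    dim = len(mean)
    # Uneven start vector, so it is unlikely to be orthogonal to the answer.
    direction = [1.0 / (i + 1) for i in range(dim)]
    for _ in range(iterations):
        # Multiply by the (unnormalized) covariance matrix without forming it.
        new = [0.0] * dim
        for x in centered:
            dot = sum(a * b for a, b in zip(x, direction))
            for i in range(dim):
                new[i] += dot * x[i]
        norm = sum(v * v for v in new) ** 0.5
        if norm == 0.0:
            break
        direction = [v / norm for v in new]
    return mean, direction


def project(vector, mean, direction):
    """Coordinate of a face along the principal component."""
    return sum((x - m) * d for x, m, d in zip(vector, mean, direction))


# All training images of all people are pooled before the component is extracted.
training = [
    ("person_a", [200, 190, 60, 50]),
    ("person_a", [210, 200, 55, 45]),
    ("person_b", [60, 50, 200, 190]),
    ("person_b", [55, 45, 210, 200]),
]
mean, component = first_principal_component([img for _, img in training])
gallery = [(name, project(img, mean, component)) for name, img in training]


def recognize(image):
    """Return the name whose stored projection is closest to the new image."""
    value = project(image, mean, component)
    return min(gallery, key=lambda item: abs(item[1] - value))[0]


print(component, recognize([205, 195, 58, 48]))
```

A practical recognizer keeps many components and works on full-size images; the sketch only shows that the component is learned from the pooled training set and that recognition is a nearest-neighbor search in the projected space.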
Fig. 5 Fisherface recognition
The Fisherface recognizer is an improved version of Eigenfaces. The Eigenface recognizer considers all the training images as a whole and extracts the principal components from all of them at once. In doing so, it does not focus on the features that discriminate one individual's face from another's; it concentrates on the features that represent the faces of the whole training set (Fig. 5). The Fisherface recognizer adapts Eigenfaces so that features are extracted from the faces of each person separately rather than from the whole set. Even if one person's images have large illumination changes, the feature extraction for the other people is not affected. The Fisherface algorithm extracts the principal components that discriminate one individual from another, and it ensures that the features of one person do not dominate over those of the others. Figure 6 illustrates how the principal components of the Fisherface algorithm work. Note that although Fisherfaces prevents the features of one individual from becoming dominant, it still treats illumination changes as a useful feature. Illumination variation is not a useful feature to extract, because it is not part of the actual face.

Local binary patterns histogram (LBPH) face recognizer
The LBPH face recognizer improves on both of the above algorithms and avoids the following drawback:
• Both Eigenfaces and Fisherfaces are affected by the illumination in the environment.
The LBPH face recognizer process, described next, shows how this drawback is overcome.
Fig. 6 Working of Fisherface recognizer principal components
The LBPH Face Recognizer Process
Let us see how the histogram is built. Take a 3 × 3 window and move it across the image. At each position, compare the center pixel with each of its surrounding pixels. Assign 1 to every neighbor whose intensity is greater than or equal to that of the center pixel and 0 to every other neighbor. Read these 0 and 1 values in the 3 × 3 window in clockwise order to obtain a binary pattern. Repeating this procedure over the entire image yields a list of local binary patterns (Fig. 7).
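The thresholding step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the OpenCV implementation: the function names `lbp_code` and `lbp_image` are hypothetical, the image is assumed to be a list of lists of grey-level intensities, and the clockwise reading is assumed to start at the top-left neighbor, which the text does not specify.

```python
def lbp_code(window):
    """Return the local binary pattern of a 3x3 window as an integer."""
    center = window[1][1]
    # Neighbour coordinates in clockwise order, starting at the top-left pixel
    # (the starting point is an assumption of this sketch).
    clockwise = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    # A neighbour contributes 1 if its intensity is >= the centre, else 0.
    bits = ["1" if window[r][c] >= center else "0" for r, c in clockwise]
    return int("".join(bits), 2)


def lbp_image(image):
    """Apply lbp_code to every interior pixel of a 2-D list of intensities."""
    rows, cols = len(image), len(image[0])
    return [
        [lbp_code([row[c - 1:c + 2] for row in image[r - 1:r + 2]])
         for c in range(1, cols - 1)]
        for r in range(1, rows - 1)
    ]
```

For the window `[[5, 9, 1], [4, 6, 7], [2, 3, 8]]` the center is 6, the clockwise bits are `01011000`, and `lbp_code` returns 88.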
Fig. 7 Working of LBPH face recognizer process
Fig. 8 Histogram of each image
Each local binary pattern is a binary number. Local binary patterns are used in both face detection and recognition. Once the list of local binary patterns has been created, each pattern is converted into a decimal number, and a histogram of these values is built for each image. A sample histogram of an image is shown in Fig. 8. In the end, there is one histogram for each face in the training dataset. That means that if the training dataset contains 100 images, the LBPH recognizer extracts 100 histograms during training and stores them for later recognition. The procedure also keeps track of which histogram belongs to which person. During recognition, the process is as follows: whenever the recognizer is given a new image, it generates a histogram for that image and compares it with the previously stored ones to find the best match. Figure 9 shows a set of faces and their respective LBPH images; the local binary pattern faces are not affected by changes in lighting conditions. For these reasons, the LBPH face recognizer performs better than the other two, which is why we use the OpenCV LBPH face recognizer for training faces. The face recognition algorithm proceeds as follows. Firstly, it reads the training images of each person along with their identity. Secondly, it detects faces in each image and assigns each detected face the numbered label of the person it belongs to.
Fig. 9 LBPH recognizes faces in any light condition
Thirdly, it trains the OpenCV LBPH recognizer by feeding it the data prepared in the previous steps. When a new face is presented in front of the camera, the system preprocesses the image, extracts the features of facial landmarks such as the eyes, nose, mouth, eyebrows, and jawline, and then tries to match them with the stored data. If there is a match, it shows the name of the person. We use Haar cascades for the detection of facial features such as the eyes and smile.
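The histogram and matching steps described above can be sketched as follows. This is an illustrative simplification under stated assumptions: one global histogram is built per image, the chi-square distance is chosen as the comparison measure because the text does not name one, and the function names `lbp_histogram`, `chi_square` and `best_match` are hypothetical rather than part of OpenCV.

```python
def lbp_histogram(codes, bins=256):
    """Count how often each LBP code (0..255) occurs, normalised to sum to 1."""
    hist = [0.0] * bins
    for code in codes:
        hist[code] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms (smaller means more similar)."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))


def best_match(query_hist, stored):
    """Return the label whose stored histogram is closest to the query.

    `stored` is a list of (label, histogram) pairs, one per training image,
    which mirrors the bookkeeping of which histogram belongs to which person.
    """
    return min(stored, key=lambda item: chi_square(query_hist, item[1]))[0]
```

A typical use would build `stored` once from the training images and then call `best_match` with the histogram of each newly captured face.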
4 Results
When the algorithm runs, firstly, a new image captured through the webcam is given to the recognizer. Secondly, the recognizer creates a histogram of that new image. Thirdly, it compares the new histogram with the histograms stored during training. Finally, it finds the best match and identifies the person by tagging the person's name at the top-left corner of the detection box. Using a CNN, facial features such as the eyes, nose, mouth, jawline, and eyebrows can be extracted more efficiently, and using Haar cascades, faces can be detected more efficiently. Increasing the number of layers of the neural network is expected to reduce the problem of recognizing twins. As shown in Fig. 10, the system detects a face, recognizes the person's name, and detects the eyes and a smile.
Fig. 10 Recognition result obtained with the proposed system
5 Conclusion
By using deep learning algorithms, we obtain good results for face recognition. The system first detects faces in a given input image with the help of Haar cascades; the detected faces act as input to the face tagging stage, which outputs each face tagged with the corresponding name. Whenever a face is detected, it is marked by a detection box. Once the face is detected, it is compared with the training images provided; if it matches, the system shows the person's name at the top-left corner of the detection box. We extend the concept of face recognition using deep learning to detect faces automatically and tag the person's name accurately. For higher accuracy, more input images of each person should be provided: as the number of images increases, the accuracy also increases.
6 Future Work
Governments across the world are ready to invest resources in facial recognition technology; the USA and China currently lead the face recognition market. The technology is expected to advance and to generate substantial revenue in the coming years. Surveillance and security are the major industries that will be most strongly influenced by the technology. Schools, colleges, and even healthcare institutions are also planning to implement facial recognition technology on their premises for better management. The sophisticated technology used in facial recognition is also making its way into the robotics industry.
References
1. W.W. Bledsoe, The model method in facial recognition, in Technical Report PRI:15 (Panoramic Research Inc., Palo Alto, CA, 1964)
2. T. Kanade, Computer Recognition of Human Faces (Birkhauser, Basel, Switzerland, and Stuttgart, Germany, 1973)
3. G. Hinton, A Practical Guide to Training Restricted Boltzmann Machines
4. G. Hinton, Deep belief networks. Scholarpedia 4(5), 5947 (2009)
5. P. Ekman, W.V. Friesen, Facial Action Coding System (1977)
6. P. Sharma, R.N. Yadav, K.V. Arya, Face recognition from video using generalized mean deep learning neural network, in 2016 4th International Symposium on Computational and Business Intelligence
7. X. Ye, X. Chen, H. Chen, Y. Gu, Q. Lv, Deep Learning Network for Face Detection (IEEE, 2015). 978-1-4673-7005-9
8. L. Li, Z. Jun, J. Fei, S. Li, An Incremental Face Recognition System Based on Deep Learning (IEEE). https://doi.org/10.23919/MVA.2017.7986845
9. M. Owayjan, R. Achkar, M. Iskandar, Facial Detection with Expression Recognition using Artificial Neural Network (IEEE). https://doi.org/10.1109/MECBME.2016.7745421
An App-Based IoT-NFC Controlled Remote Access Security Through Cryptographic Algorithm
Md. Abbas Ali Khan, Mohammad Hanif Ali, A. K. M. Fazlul Haque, Chandan Debnath, Md. Ismail Jabiullah, and Md. Riazur Rahman
Abstract In the twenty-first century, people live surrounded by technology, and most systems are operated automatically or through remote access control. Sensor technology already plays a vital role in controlling smart homes, smart offices and similar environments, and it is about to extend beyond these to the smart city. Remote access control is part of this leading technology. An app-based remote access control framework adds an extra layer of security, making the technology more convenient and secure, and illustrates the usability of a person along with an authentication system for the executive. NFC is used as the communication technology, and a microcontroller camera is used for detection. The authentication process is driven through a smartphone application over the IoT framework. The objective of this paper is to ensure the security of remote access control, notification to the comer and the admin, accessibility, usability and permissibility to enter the premises. In order to maintain the integrity and the confidentiality of data, cryptographic techniques such as 512-bit hash functions are considered
Md. A. A. Khan (B) · A. K. M. F. Haque · Md. I. Jabiullah · Md. R. Rahman
Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
e-mail: [email protected]
A. K. M. F. Haque e-mail: [email protected]
Md. I. Jabiullah e-mail: [email protected]
Md. R. Rahman e-mail: [email protected]
Md. A. A. Khan · M. H. Ali
Jahangirnagar University, Dhaka, Bangladesh
e-mail: [email protected]
C. Debnath
National University, Gagipur, Bangladesh
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_9
and the hashed data are then encrypted with AES-192. An additional contribution of this paper is measuring the performance of an employee.
Keywords Cryptography · NFC · IoT · NFC tag · NFC device
1 Introduction
IoT-controlled remote access security systems currently perform very well in ensuring security. Among short-range communication technologies, NFC provides more secure communication than Wi-Fi, ZigBee and Bluetooth [1]. NFC-IoT devices can implement a security system together with a cryptographic process. The Internet of Things (IoT) is the extension of Internet connectivity into physical devices and everyday objects. These objects carry unique identifiers (UIDs) and can be remotely monitored and controlled; applications include remote health monitoring [2]. Because of inexpensive processors and wireless networks, it is possible to turn anything, from a pill to a plane to a self-driving vehicle, into part of the IoT [3]. Near Field Communication (NFC), standardized in ISO/IEC 14443 and ISO/IEC 18000-3, consumes less power than other wireless protocols. This technology enables the user to send radio data to a designated receiver and operates at a frequency of 13.56 MHz [4]. It is designed to be a secure form of data exchange, and an NFC device is capable of being both an NFC reader and an NFC tag [5]. In today's globalized world, networks are used for sharing and conveying information [6]. Because of strong interest, researchers are trying to secure the data of IoT devices; although much work has already been done, development is a continuous process. In this system, IoT is used to control, monitor and access the door lock over the Internet by using a smartphone [7]. Hackers and sniffers are always trying to alter or intercept data. In this context, we introduce a cryptographic algorithm to protect the network against unauthorized access [8]. Most researchers working on existing door lock systems use biometrics, RFID, etc. However, the current systems fall short of providing a complete solution.
The objective is to develop an app-based remote access control security framework based on NFC and IoT. The equipment required to build the framework includes an ESP8266, a Raspberry Pi, a display and an RFID module; a full breakdown is given in the section “Required Component.” The framework uses NFC and the IoT platform for its entire procedure, and a cryptographic algorithm is used for client–server communication authentication [9]. This remote access door lock system can be used on any premises, but some additional features are provided with the help of the Pi server for special purposes. For instance, if this remote access control (RAC) is used in an office, the office can use the additional features and benefit from them. The additional part of this paper is therefore to measure the performance of an employee. We claim this as an additional contribution because, to our knowledge, no researcher has previously done such work with a door lock system. The rest of the paper is organized as follows. Section 2 describes the previous studies, whereas Sects. 3, 4 and 5 describe the hardware details, the detailed working procedure, and the performance measurement and time count, respectively. Sections 6 and 7 present the RAC app architecture and the results and discussion, respectively, before the conclusion.
2 Previous Studies
IoT-NFC control, i.e., remote access security-based door access permission systems, can be divided into three domains, namely fingerprint, face recognition and RFID. A large number of researchers restrict their work to fingerprint, face recognition or RFID alone, whereas some researchers incorporate GSM and IoT along with image detection. To our knowledge, no work has been done on remote access security control based on IoT-NFC in which an admin has the authority to permit or reject access. Some other researchers have implemented remote access through SMS, email and Bluetooth devices. Sahani et al. (2015) presented a system that authenticates and gives access to the user using ZigBee and face recognition and sends an email or alert message to the owner. Its main communication protocols are ZigBee and the GSM network [10]. Although this is an early work, it uses only short-range wireless communication together with the mobile network, and the data communication range of ZigBee is not more than 70 m. Ahtsham et al. (2019) present a secure door lock based on a cryptographic algorithm, in which mobile devices communicate via wireless Internet and an Android application [11]. The researchers apply AES-128 rather than AES-192 or AES-256, although the latter two are more secure than the first. Ha et al. (2015) describe a system in which a user or visitor opens the security door using a separate key; when the key does not match, the system automatically takes a picture of the stranger and transfers it to the end user's mobile phone, which is connected to the system through wireless communication [4]. Sowjanya et al. (2016) used fingerprints for authentication, and their door lock system can also detect gas leakage, temperature, and fire.
Apart from this, all the data are stored on the IoT server, and the whole communication process is based on wireless Internet [12]. The limitation of this system is that it cannot cope with a physical problem of the user that affects the fingerprint. Nag et al. (2018) state that a user can open the door under two conditions. If the user is known, the system automatically detects him or her by facial recognition; if the user is new or unknown, a specific remote user can grant access through a social networking chat application. All the components are connected with each other via wireless Internet [13]. Nag et al. do not explain how the system determines whether a newcomer is registered or not. Nath et al. (2016) installed an RFID reader on the door so that a user can enter the room by using an authorized RFID card. The main communication protocol is RFID; the device is connected to the server through a wired connection, and the server is connected to the Internet for the web view [14]. Muhammad et al. (2019) showed a door lock system that connects to a mobile device via Bluetooth, with the user receiving a notification for key sharing. Doh et al. (2015) present a door lock system using the Internet of Things with authentication by image identification [15]. This system has some limitations: only recorded images are recognized, which means that an unregistered user cannot open the door even if he or she has the right to enter, and no authentication algorithm is used, although there is an SMS process. Abbas et al. describe NFC Link for collecting or accessing stored data by using an NFC device [16]. They note that an NFC tag has a very low storage capacity and that an NFC reader can read the data within 4–10 cm. More importantly, security issues arise when building an IoT-based device. To address these security issues, we consider cryptographic techniques [8]. In this paper, AES-192 is used to encrypt the plain text together with the SHA-512 hash function [17]. Encryption-based remote access control is therefore more secure when the pin code or any other information is transferred over the Internet [18]. Overall, all of the aforementioned frameworks have been implemented; they work in the same way, with some variation, and serve the same purpose. Many systems exist that use this technology, but they have almost the same specifications. In this paper, an IoT-NFC-based door lock and user access control system with a special feature is discussed. It can be used not only as a door lock but also as a periodic access control system; that is, the super admin or the authority can grant or deny any user access to a specific room at a specific time. In this way, the authority can maintain privacy or arrange a confidential meeting with specific employees.
All the access control data are stored on the IoT server with a time schedule, so the authority can also keep an eye on the time schedule of employees for tracking. Table 1 lists the cryptographic algorithms considered.

Table 1 Encryption algorithms
Algorithm    Key size (bits)    Block size (bits)    Rounds    Cipher type
RSA          n = p*q            Variable             1         Asymmetric
DES-56       56                 64                   16        Symmetric
3DES-56      168                64                   48        Symmetric
AES-128      128                128                  10        Symmetric
AES-192      192                128                  12        Symmetric
AES-256      256                128                  14        Symmetric
3 System Architecture and Hardware
3.1 Required Component
The following components are required for the project.
1. ESP-8266/32 (3 pcs)
2. Raspberry Pi (1 pc)
3. OLED display (1 pc)
4. Keypad (1 pc)
5. ESP32 camera (1 pc)
6. Door bell (1 pc)
7. ESP32 lock (1 pc)
8. Voltage controller (2 pcs)
9. Raspberry Pi adapter (1 pc)
10. RFID card
11. Database
12. Android app
4 Detail Working Procedure
4.1 RAC Configuration and Working Procedure
Figure 1 illustrates the system architecture of the access control machine (ACM). Firstly, all the accessory devices, such as the ESP cam, RFID, keypad and display, are connected to the ESP8266/32 microcontroller, which controls these devices and manages their data transfer based on priority level. Secondly, the ESP8266/32 is connected to the Raspberry Pi server, which generates an access request and notifies the administrator. The system provides both registered and non-registered access. A database (DB API) is therefore connected to the Raspberry Pi server, and a code generator browser works with the Pi server and the smartphone. All information is exchanged through the Pi server and the ESP8266/32 microcontroller, whereas the doorbell, keypad, RFID (which contains the information of the registered user) and ESP cam work via the ESP8266/32. The display and the ESP lock become active when the ESP8266/32 passes information to them. A registered user holds a 4-bit code from the registration period, whereas the remaining 3-bit code is generated automatically by the Pi server when the NFC card is punched, once permission is granted by the admin. A registered user must know his or her 4-bit code. In contrast, a non-registered user has to ring the bell; his or her image is immediately captured by the ESP cam and sent to the admin's smartphone along with an active notification tab. If the admin permits the person, the admin taps the notification tab and a 7-bit code is produced on the screen of the access control machine (ACM); otherwise, the person receives a disallowed message. Moreover, all the login information and images are stored on the online DB API server so that the administrator can check the access history if needed. To secure confidential information, the RAC system computes a SHA-512 hash value, which is then encrypted with the AES-192 symmetric encryption algorithm [19].
Fig. 1 System architecture
4.2 RAC Application
Figure 2 shows the flowchart of the proposed framework. The process starts when the remote access control (RAC) system reads an RFID card from a user and checks whether the user is enrolled. If the user is registered, the system generates a pin code that is shown on the display; if the user is not registered, he or she must press the doorbell. The doorbell generates and sends a notification to the admin's phone, and at the same time the system automatically captures an image of the person. The admin then decides whether to give him or her permission to enter the room. For a registered user, after the 3-bit pin code is generated on the display, the user enters his or her personal 4-bit code, which is already known to the registered user. After the pin code is entered, the system checks for authentication: after decryption, the hash value is checked, and if it matches the existing hash value, the door opens [20]; otherwise, the user gets three more chances to enter the correct pin code. If the registered user exceeds these three attempts, the system restarts from its first stage.
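The registered-user branch of this flow can be sketched as below. It is a simplified illustration rather than the authors' firmware: the names `generate_code`, `pin_digest` and `try_unlock` are hypothetical, the codes are treated as strings of binary symbols, the generated part is assumed to precede the personal part, "three more chances" is read as one initial attempt plus three retries, and the AES-192 decryption step and the non-registered doorbell branch are left out.

```python
import hashlib
import hmac
import secrets

MAX_RETRIES = 3  # "three more chances" after the first failed attempt


def generate_code(length=3, alphabet="01"):
    """Generate the one-time part of the pin shown on the display."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


def pin_digest(generated, personal):
    """SHA-512 digest of the full 7-symbol pin (generated part + personal part)."""
    return hashlib.sha512((generated + personal).encode("utf-8")).digest()


def try_unlock(expected_digest, generated, attempts):
    """Return 'open' if any allowed attempt matches, otherwise 'restart'.

    `attempts` is the sequence of personal codes the user types in; only the
    first 1 + MAX_RETRIES entries are considered.
    """
    for personal in list(attempts)[: 1 + MAX_RETRIES]:
        candidate = pin_digest(generated, personal)
        # Constant-time comparison of the two hash values.
        if hmac.compare_digest(candidate, expected_digest):
            return "open"
    return "restart"
```

Returning "restart" corresponds to the system going back to its first stage once the allowed attempts are used up.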
Fig. 2 Flowchart of the system (security purpose)
4.3 Data Cryptography Technique
Two security steps are applied during data transmission over the network. Firstly, a hash function is used to preserve data integrity and to guard against replay attacks; secondly, encryption is applied to maintain data confidentiality. RAC collects information from its surroundings into a data array and generates a fixed-length value by applying the hash function. AES-192 is then applied to encrypt the data after hashing is complete. The two cryptographic techniques involved are thus hashing and encryption.
Hashing: RAC holds the information of the admin and the registered users, along with that of a non-registered user when he or she rings the bell. To obtain the hash value from the plain text, the input data are first divided into fixed-size blocks, and padding is used to fill the last block; for a given algorithm, the block size is the same in all cases. In this paper, SHA-512 is used for hashing; it processes the message in 1024-bit blocks and produces a 512-bit digest [21]. As part of the padding, a 128-bit encoding of the message length is appended to complete the final block. The working state is then initialized and 80 rounds are processed for each block to create the 512-bit output. After that, the final hash value can be encrypted.
Encryption and decryption: encryption with the AES-192 algorithm starts immediately after the final hash value is obtained. Although the key size is 192 bits (24 bytes), the block size is 128 bits (16 bytes) [21]. Each 128-bit block of the hash text therefore goes through 12 rounds, and the number of possible keys is 2^192 ≈ 6.2 × 10^57, which makes a brute-force attack impractical [22]. In each round, a fixed-size sub-key, built from 32-bit words, is derived from the main key [23]. Finally, the encryption is performed on the 512-bit fixed-length hash value.
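The sizes quoted in this section can be checked with a few lines of Python. This is only an arithmetic check using the standard `hashlib` module, not an implementation of the RAC protocol; the sample payload is a hypothetical placeholder.

```python
import hashlib

KEY_BITS = 192          # AES-192 key size
AES_BLOCK_BITS = 128    # AES block size (the same for every AES key size)

# Number of possible 192-bit keys, about 6.28e+57.
key_space = 2 ** KEY_BITS
print(f"{key_space:.2e}")

# SHA-512 parameters as reported by hashlib (sizes are given in bytes).
sha = hashlib.sha512(b"RAC status")   # hypothetical sample payload
digest_bits = sha.digest_size * 8     # length of the hash value
block_bits = sha.block_size * 8       # length of one message block

# The digest is encrypted as a whole number of AES blocks.
aes_blocks_for_digest = digest_bits // AES_BLOCK_BITS
```

The digest length is 512 bits, the SHA-512 message block is 1024 bits, and the 512-bit digest fills exactly four 128-bit AES blocks.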
4.4 RAC Condition Information
RAC status information is sent to the app on the smartphone. The user information from the RFID card and the RAC status are passed through the hash function, which creates a final hash value of fixed length. The hash value is then encrypted with the AES-192 algorithm. The admin receives the encrypted data on the smartphone through the Pi server. On the admin side, the encrypted data are decrypted by using the symmetric key. After decryption, the fixed-length hash value is separated and checked against the existing hash value. If they match, the RAC app information is updated. The system uses the same encryption and decryption procedure for the admin's permission or rejection of the user. Figure 3 shows the sending and receiving of information between the RAC and the phone using the hash function with the help of the symmetric key.
Fig. 3 Sending and receiving information from RAC to phone using hash function with help of symmetric key
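The integrity check in this exchange can be sketched with the standard library alone. The packet layout (message followed by its 64-byte SHA-512 digest) and the names `pack` and `unpack` are assumptions made for illustration; the AES-192 encryption and decryption that would wrap the packet in the described system is omitted here because Python's standard library does not provide AES.

```python
import hashlib
import hmac

DIGEST_LEN = hashlib.sha512().digest_size  # 64 bytes


def pack(message: bytes) -> bytes:
    """Sender side: append the SHA-512 digest to the message."""
    return message + hashlib.sha512(message).digest()


def unpack(packet: bytes):
    """Receiver side: split off the digest and verify it.

    Returns the message if the digest matches, otherwise None.
    """
    if len(packet) < DIGEST_LEN:
        return None
    message, received = packet[:-DIGEST_LEN], packet[-DIGEST_LEN:]
    expected = hashlib.sha512(message).digest()
    # Constant-time comparison of received and recomputed hash values.
    return message if hmac.compare_digest(received, expected) else None
```

On the admin side, a `None` result would mean the hash values do not match, so the RAC app information is not updated.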
Fig. 4 RAC database
Figure 4 shows the user status through the database. A non-registered user can access the database with the admin's permission, and a registered user can access it directly for authentication and other purposes. Usability and access information for any user can be obtained from the database. With a Pi server database, the data of the smart door lock system and home automation can be analyzed easily to obtain user information [24].
5 Performance Measurement and Time Count
5.1 Data Access and Storage
Figure 5 shows the Android application page layout for both the admin and the comer (registered user), as well as the relation between admin and comer through the database. The purpose of this system is to measure the performance of the admin based on the completion of the assigned task. Firstly, when a comer tries to enter the room, he or she must swipe his or her NFC ID card as a registered user. Once the admin accepts the request, the comer gets access to enter the room and the system starts a time count. Then, the comer selects the work category from the option list shown on the smartphone. On the admin side, the same option is shown, the “work in progress” tab becomes active, and an inactive “DONE” button is also visible. Once the task is done, the comer submits it through the Android app by pressing the “SUBMIT” button. The admin's “DONE” button becomes active only when the comer has submitted his or her task through the “SUBMIT” button. As soon as the admin presses the “DONE” button, the process is complete and the system automatically stops the time count. Figure 6 describes the whole process between admin and comer. This procedure helps to determine the performance of a key person in an organization, that is, how fast a task can be completed, and in this way a person can be evaluated. The system also helps an organization to monitor its activities remotely. To our knowledge, performance measurement and time count have not been addressed in previous work on door lock systems.
Fig. 5 Relation among admin and comer
5.2 Time Count
Once the comer's request is accepted, the process starts and the system automatically records the time; the count ends when the system receives the completion notification from both the comer and the admin. The time count is generated by the Pi server. The flowchart in Fig. 6 describes the task processing between the admin and the comer.
Fig. 6 Flowchart of time count and task assignment
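The time count logic can be summarized as a small state holder. This is a hypothetical sketch, not the code running on the Pi server: the class name `TaskTimer` and its methods are invented for illustration, and timestamps are passed in explicitly (for example, seconds from a clock) so that the behavior is easy to follow.

```python
class TaskTimer:
    """Tracks one task from access approval to the admin's DONE."""

    def __init__(self):
        self.started_at = None
        self.submitted = False
        self.stopped_at = None

    def access_accepted(self, now):
        """Admin accepted the comer's request: start the time count."""
        self.started_at = now

    def submit(self):
        """Comer presses SUBMIT; only valid once the count has started."""
        if self.started_at is None:
            raise RuntimeError("time count has not started")
        self.submitted = True

    def done_enabled(self):
        """The admin's DONE button is active only after SUBMIT."""
        return self.submitted and self.stopped_at is None

    def done(self, now):
        """Admin presses DONE: stop the count and return the elapsed time."""
        if not self.done_enabled():
            raise RuntimeError("DONE is inactive until the comer submits")
        self.stopped_at = now
        return self.stopped_at - self.started_at
```

The elapsed time returned by `done` is the quantity that would be stored for evaluating how quickly a task was completed.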
6 RAC App Architecture
Since this is an app-based remote access control security system, an Android app has been developed to interact with the RAC system, and all activities are managed through it. Figure 7 depicts the system structure. In step one, users start their access via the login option: the user name and password are verified against the database before the user can enter the main menu. In step two, the user accesses the main menu, from which he or she can use and communicate with the RAC architecture. The user then touches the NFC card to the ESP8266/32, and a notification message is received by the admin/room owner. If the admin/room owner wishes to accept, he or she presses “Done”; otherwise “Reject.” At this point, the user types the newly generated 3-bit code along with the 4-bit code generated during user registration. The door now opens for the user to enter the room; after that, the user selects a task from the task list and also indicates the assigned person. When the user is notified of task completion, the user taps the submit tab and the “Done” button immediately becomes active. Finally, the admin/room owner presses the “Done” button to complete the process. Throughout, the plaintext and the fixed-length hash value are encrypted and decrypted with AES-192.
Fig. 7 RAC app system architecture
7 Results and Discussion
7.1 Security, Notification and Permissibility
Through this method, security is ensured by means of codes on both the admin and the user side. No one can be permitted to enter the premises without the admin being notified, wherever the admin is located, and all the information is stored on the Pi server. The system is transparent because it combines the 4-bit code assigned during user registration with a newly generated 3-bit code produced by the system once the RFID card is touched to it. Permission control is exercised only by the admin, who can disallow any registered or non-registered person from anywhere in the world. More importantly, the whole process is protected by cryptographic techniques. We thus ensure two types of security: (i) the 7-bit pin code and (ii) SHA-512 hashing and AES-192 encryption, which run over the network to block unauthorized users and keep the data secure.
7.2 Accessibility and Usability
This part concerns the performance of an employee based on the time he or she takes to complete tasks. Department-wise clustering also indicates the usability of a person in an organization. The question is how many times a department comes to the key person in a week and how fast he or she can complete the assigned task. In this way, all the employees can be evaluated automatically. The 7-bit code generation process and the cryptographic technique leave little room for falsification. The procedure for measuring the performance of an employee is discussed in detail in the section “Data Access and Storage.” The accessibility of a person to the admin or to other employees reflects the usability of that person in the organization.
8 Conclusion
This research builds on three technologies: cryptography, IoT and NFC. Here, IoT and NFC provide online API user registration and access control, respectively. The main outcome of this research is remote access control through IoT-NFC, with AES-192 and SHA-512 applied to ensure the confidentiality and integrity of the data. In addition, the system measures the performance of an employee. From the above discussion, we conclude that door access security, data security and user evaluation are performed through an app that only authenticated users can access, which makes the system highly secure. Application engineers can apply this method in organizations for real-life implementation, and developers can use the algorithms and techniques based on this analysis. As future work, we plan to develop a more interactive and convenient system for the user with a high level of security, especially with regard to performance measurement, by utilizing technologies such as image processing or machine learning to predict the usability of an employee.
Pneumonia Detection Using X-ray Images and Deep Learning Chinmay Khamkar, Manav Shah, Samip Kalyani, and Kiran Bhowmick
Abstract Pneumonia is a lung infection that reduces the ability to breathe in enough oxygen and has symptoms such as chest pain and fever. In order to reduce premature deaths in children, early detection of pneumonia is said to be vital. We improve upon the accuracy achieved by previous attempts to detect pneumonia that utilized various architectures. In this paper, a MobileNet-based architecture has been implemented in a convolutional neural network that takes an X-ray image as input and classifies whether or not a patient suffers from pneumonia. This implementation achieved high accuracy and recall percentages. Keywords Pneumonia detection · X-rays · MobileNet · Convolutional neural network · Deep learning
C. Khamkar (B) · M. Shah · S. Kalyani · K. Bhowmick, Dwarkadas J. Sanghvi College of Engineering, Mumbai, India
K. Bhowmick e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_10
1 Introduction Pneumonia is a disease that causes inflammation in the lungs and is caused by bacteria, fungi or viruses [1]. People of all ages, especially those who smoke or have a relatively weak immune system, can be infected. Symptoms of pneumonia include shortness of breath, weakness, shaking chills and fever. In the USA [2], one million patients suffer from it each year and about fifty thousand succumb to it. Pneumonia is also the eighth leading cause of death, with the elderly and infants being highly susceptible to it. Pneumonia can be detected by pulse oximetry, bronchoscopy, blood tests or chest X-rays. Radiologists often use chest X-rays to diagnose pneumonia in patients. However, this requires expertise, and the lack of radiologists, especially in rural places, might hamper early detection and recovery. With recent advancements in deep learning, models based on convolutional neural networks have been successfully used for image classification in medical diagnosis, for example to detect skin cancer [3] and classify diabetic retinopathy [4]. Although other deep learning models to detect pneumonia exist, the implementation in this paper utilizes a MobileNet-based neural network and classifies with high accuracy whether a given X-ray image corresponds to a patient suffering from pneumonia or a normal patient. This model will help reduce the burden on radiologists, whose expertise could then be used in more demanding cases. The implementation will also increase the chances of early detection of the infection, thus resulting in a lower mortality rate in children below the age of 5. The main motivation of this study was to assist professionals with early-stage detection of the disease and prevent further complications. The paper is organized as follows: Part 2 consists of related work, Part 3 explains the functioning of a CNN model, Part 4 introduces our proposed model, Part 5 is the experimentation section, Part 6 describes the results and, finally, Part 7 is the conclusion.
2 Related Work Advancements in deep learning have made early-stage detection possible for diseases such as skin cancer, diabetic retinopathy and tuberculosis. Deep learning has also obtained successful results in problems like Alzheimer's disease diagnosis and classification, interstitial lung disease detection and even diagnosis of celiac disease [4–6]. This section looks at some deep learning approaches used for pneumonia detection using X-ray imaging. Stephen et al. [7] used a CNN-based approach to detect pneumonia from X-ray images. Standard data augmentation techniques such as rescaling, rotation, zoom and horizontal flip were used to tackle the problem of overfitting. The model consists of two major parts, a feature extractor and a classifier. It uses four 3 × 3 convolutional layers with 32, 64, 128 and 128 filters, 2 × 2 max pooling and the ReLU activation function. The model was trained for 100 epochs and achieved a training accuracy of 0.95 and a validation accuracy of 0.93. The process was repeated for images of different sizes to get better accuracy. The lowest training accuracy, 0.93, was obtained for an image size of 100, and the best, 0.956, for an image size of 300. The best validation accuracy, 0.937, was obtained for an image size of 200. This method is efficient when the dataset is small, and the accuracy of the model can be improved further with a larger dataset. Acharya et al. [8] proposed an early-stage pneumonia detection model using deep learning. Their main motivation was reducing the ill effects of this disease in children, especially those under the age of 5. In their implementation, radiological images are fed to a neural network which classifies whether or not a patient has a particular kind of pneumonia, namely bacterial or viral. Using a deep Siamese architecture, they computed a similarity index between any two given images.
First, they compare an image with a normal, already trained image, and then they discern the visibility of the left and right segments from the X-ray. They implemented a multilayer CNN model using PyTorch and achieved an overall accuracy of 89%. Chouhan et al. [9] proposed a novel ensemble architecture of five neural networks for detecting pneumonia. The individual models were first pretrained on the ImageNet dataset and then combined into a large ensemble-based model. The neural network architectures used in this ensemble are AlexNet, DenseNet121, InceptionV3, ResNet18 and GoogLeNet. Such deep architectures are useful for feature extraction. Another advantage of this method is that, since the individual models are already pretrained on ImageNet, a smaller dataset of pneumonia X-rays can be used for training the ensemble and still achieve promising results. Thus, this approach is useful in scenarios where diagnostic data are scarce. Pasa et al. [10] proposed a simple custom architecture of five convolutional blocks. Each block contains two 3 × 3 convolutions with ReLU activation followed by a max pooling operation. The five blocks are then connected to a final softmax layer. Such a simple CNN architecture helps avoid overfitting when the dataset is small. Ayan et al. [11] used transfer learning on VGG16 and Xception models for diagnosing pneumonia from radiological images, with an accuracy of up to 87% after extensive hyperparameter tuning.
3 Convolutional Neural Network This paper proposes a CNN-based approach to detect pneumonia using a chest X-ray dataset. CNNs are used for image classification because of their high accuracy. The first CNN model was proposed by Yann LeCun [12] in 1989. A CNN follows a hierarchical, funnel-like structure that ends in an output layer fully connected to all the neurons of the preceding layer. It consists of layers such as convolution, pooling, fully connected (dense) and dropout layers.
3.1 Convolution Layer This layer is the first layer in the CNN architecture. It takes the input image as pixel values, and the size of the convolution layer depends upon the dimensions of the image. The image is represented as a three-dimensional matrix of pixel values, which is then convolved with a chosen filter matrix. The output of this operation is the feature map of that layer. A number of convolution layers can be used in this architecture. Depending on its values, the filter matrix can perform different operations on the image, such as edge detection, blurring or sharpening. An activation function is associated with this layer. Common activation functions are sigmoid, tanh and ReLU. The purpose of the ReLU (rectified linear unit) function is to introduce nonlinearity into the neural network. The main role of an activation function is to decide whether a neuron fires, given its weighted input plus bias. The ReLU function [13] is defined by Eq. (1); it converts all negative values to zero (Fig. 1).
$$y = \max(0, x) \tag{1}$$
Fig. 1 ReLU function
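Equation (1) can be illustrated with a short sketch in plain Python. The helper names `relu` and `relu_map` and the sample values are illustrative only and do not come from the paper.

```python
def relu(x: float) -> float:
    """Rectified linear unit: y = max(0, x)."""
    return max(0.0, x)

def relu_map(feature_map):
    """Apply ReLU element-wise to a 2D feature map given as nested lists."""
    return [[relu(v) for v in row] for row in feature_map]

print(relu_map([[-2.0, 0.5], [3.0, -0.1]]))  # [[0.0, 0.5], [3.0, 0.0]]
```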
3.2 Pooling Layer A pooling layer is used when the feature map carries a large number of parameters and only a few are required to achieve good accuracy. Pooling retains the important features of the map while reducing its dimensionality; this process is called subsampling or downsampling. There are three types of pooling, namely max pooling, average pooling and sum pooling. In max pooling, the largest value in each window of the feature map is kept. The pooling layer also helps to reduce overfitting. An example of a max pooling layer is shown in Fig. 2.
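A minimal sketch of 2 × 2 max pooling with stride 2 follows, written in plain Python on nested lists. The function name and the sample feature map are made up for illustration, and dropping an odd trailing row or column is an assumption about edge handling.

```python
def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2 on a 2D list; odd trailing rows/columns are dropped."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled.append([
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ])
    return pooled

fmap = [[1, 3, 2, 4],
        [5, 6, 7, 8],
        [3, 2, 1, 0],
        [1, 2, 3, 4]]
print(max_pool_2x2(fmap))  # [[6, 8], [3, 4]]
```

Each output value summarizes four inputs, so the feature map shrinks to a quarter of its original area.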
Fig. 2 Max pooling
3.3 Fully Connected Layer The fully connected layer flattens the output of the previous layers into a single vector, which is fed into the next layer. Each node of one layer is connected to every node of the next, forming a dense network. The output vector of the fully connected layer is used to classify the images according to the labels on which the network was trained. The fully connected layer is described by Eqs. (2) and (3):
$$h_i^{in} = I * W_i + B_i \tag{2}$$
$$h_i^{out} = F_i\left(h_i^{in}\right) \tag{3}$$
where $i$ is the index of the layer, $h_i^{in}$ is the input for the $i$th layer, $I$ is the input matrix, $W_i$ is the weight of the neurons for that layer, $B_i$ is the bias of that layer, $h_i^{out}$ is the output and $F_i$ is the activation function for that layer.
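Equations (2) and (3) can be traced with a small sketch in plain Python. The function name `dense_forward`, the weight values and the choice of ReLU as the activation are illustrative assumptions, not values from the paper.

```python
def dense_forward(inputs, weights, biases, activation):
    """Compute h_out = F(I * W + B) for one fully connected layer.

    inputs:  list of n input values (the flattened vector I)
    weights: n x m nested list (W), one row per input
    biases:  list of m values (B)
    """
    h_in = [
        sum(inputs[k] * weights[k][j] for k in range(len(inputs))) + biases[j]
        for j in range(len(biases))
    ]
    return [activation(v) for v in h_in]

relu = lambda v: max(0.0, v)
out = dense_forward([1.0, 2.0], [[0.5, -1.0], [0.25, 0.5]], [0.0, -1.0], relu)
print(out)  # [1.0, 0.0]
```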
3.4 Dropout Layer Dropout layers are mainly used to tackle the problem of overfitting. Overfitting typically occurs when the dataset is small, so the network memorizes the training examples instead of learning features that generalize. As a result, the model gives very high accuracy on the training data, close to 99%, but performs poorly on test data. Dropout layers are used to obtain a more generalized model. A dropout layer randomly drops some nodes from the network during training, so that the layer behaves like one with a different number of nodes and different connectivity to the layer before it. Adding dropout layers makes the training process noisy and lowers the training accuracy, but this is crucial for making the model generalize (Fig. 3).
Fig. 3 Basic CNN architecture [14]
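The random dropping of nodes described in Sect. 3.4 can be sketched as follows. The function name is hypothetical, and rescaling the surviving values by 1 / (1 - rate) ("inverted dropout") is a common convention assumed here rather than something the text states.

```python
import random

def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate).

    The rescaling ("inverted dropout") is an assumption, not stated in the text.
    """
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
dropped = dropout([1.0] * 10, rate=0.2, rng=rng)
print(dropped)
```

At inference time the layer is normally bypassed, which is why the rescaling during training keeps the expected activation unchanged.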
4 Proposed Model In this section, the architecture of the proposed model is explained in detail. The model was implemented on the Google Colaboratory [15] platform, which provides access to high-speed computation for training machine learning models (Fig. 4).
4.1 Preprocessing Data preprocessing and augmentation play an important role in increasing the accuracy of the model. Augmentation creates multiple copies of the same image with altered attributes such as rotation and flip. This increases the size of the data and ensures that the model is fed a variety of images rather than the same images again and again, which helps avoid overfitting. The images are rescaled by a factor of 1/255: normal RGB images have color values in the range 0 to 255, which are computationally heavy for the model, so they are scaled down to the range 0 to 1. A zoom range of 0.3, which randomly zooms the image by up to 30%, is also applied. The images are also flipped vertically to create additional copies in the dataset (Figs. 5 and 6; Table 1).
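Two of the augmentation steps above, rescaling and vertical flip, can be sketched in plain Python on a tiny grayscale image. The function names and the 2 × 2 image are made up for illustration; the random zoom is omitted because it needs image interpolation, and the paper does not say which library performed these steps.

```python
def rescale(image, factor=1.0 / 255):
    """Scale every pixel of a 2D grayscale image (nested lists) by `factor`."""
    return [[pixel * factor for pixel in row] for row in image]

def vertical_flip(image):
    """Flip the image top-to-bottom by reversing the row order."""
    return image[::-1]

image = [[0, 255], [51, 102]]  # made-up 2 x 2 image
augmented = vertical_flip(rescale(image))
print(augmented)
```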
Fig. 4 Proposed CNN architecture
Fig. 5 X-ray images without pneumonia
Fig. 6 X-ray images with pneumonia
Table 1 Values of data augmentation
Parameter        Value
Rescale          1/255
Zoom range       0.3
Vertical flip    True
4.2 Model The CNN architecture of the proposed model is shown in Fig. 4. The model is divided into two main parts, the feature extractor and the classifier. The sigmoid function is used for the final classification of the image. In the feature extraction step, the features of each layer are passed on to the next layer. The model consists of five convolution blocks with max pooling and dropout layers, followed by a flatten layer and dense layers. These layers function as described in Sect. 3.
• The first block consists of two 2D convolutional layers with 16 filters each and ReLU as the activation function. The kernel size for all layers in the model is 3 × 3, and ReLU is used as the activation function throughout. The second convolutional layer is followed by a max pooling layer of size 2 × 2.
• The second and third blocks follow a similar architecture. Each contains two separable convolutional layers, with 32 filters in the second block and 64 in the third, followed by a batch normalization layer. Batch normalization is used to speed up learning: if some features range from 0 to 1 and others from 0 to 1000, normalizing them distributes them evenly and in turn speeds up computation. It also helps to reduce overfitting to some extent, as it makes the model more general. It is followed by a max pooling layer of size 2 × 2.
• The fourth and fifth blocks follow the same architecture as the previous two, with one extra layer at the end. Their separable convolutional layers have 128 and 256 filters, respectively. The additional layer at the end is a dropout layer with a dropout rate of 0.2, which means that the model randomly drops 20% of the nodes between the layers.
• In the next step, the classifier, the output of the last convolution block is flattened into a single one-dimensional vector. The flattened vector passes through three dense layers with 512, 128 and 64 units, respectively, each with ReLU as its activation function. Each dense layer is followed by a dropout layer; the three dropout layers have dropout rates of 0.7, 0.5 and 0.3, respectively. Again, the dropout layers are used to improve generalization by randomly dropping nodes from the fully connected layers.
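To get a feel for the block structure described above, the sketch below traces the spatial size of the feature map through five 2 × 2 max pooling steps and the length of the flattened vector. It assumes that the convolutions preserve the spatial size ('same' padding) and that pooling floors odd sizes; neither detail is stated in the paper, and the function name is hypothetical.

```python
def trace_spatial_size(input_size, num_blocks=5, pool=2):
    """Spatial side length after each block, assuming 'same'-padded
    convolutions and a pool x pool max pooling that floors odd sizes."""
    sizes = [input_size]
    for _ in range(num_blocks):
        sizes.append(sizes[-1] // pool)
    return sizes

FILTERS = [16, 32, 64, 128, 256]  # filters per block, as listed in the text

for side in (150, 125, 100):
    sizes = trace_spatial_size(side)
    flat = sizes[-1] * sizes[-1] * FILTERS[-1]
    print(side, sizes, flat)
```

Under these assumptions, a 150 × 150 input shrinks to 4 × 4 × 256 = 4096 values before the first dense layer, while 125 × 125 and 100 × 100 inputs both shrink to 3 × 3 × 256 = 2304.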
4.3 Output Finally, the output of the model is fed into a classifier with sigmoid as its activation function, as shown in Fig. 4. Since the desired output of this study is to predict whether a person has pneumonia (1) or is normal (0), the sigmoid function is a natural choice for this binary classification. The sigmoid function is given by Eq. (4).
$$f(x) = \frac{1}{1 + e^{-x}} \tag{4}$$
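Equation (4) and the resulting binary decision can be sketched as follows. The 0.5 decision threshold and the function names are assumptions for illustration; the paper does not state the threshold it used.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_label(logit: float, threshold: float = 0.5) -> int:
    """Return 1 (pneumonia) if the sigmoid output reaches the threshold, else 0 (normal)."""
    return 1 if sigmoid(logit) >= threshold else 0

print(sigmoid(0.0))         # 0.5
print(predict_label(2.0))   # 1
print(predict_label(-2.0))  # 0
```

Lowering the threshold would trade precision for recall, which matters here because false negatives are the costlier error.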
5 Experimentation 5.1 Dataset The dataset [16] contains a total of 5856 images. The chest X-ray images were selected from retrospective cohorts of pediatric patients from Guangzhou Women and Children's Medical Center [17]. The images are divided into their respective classes (normal/pneumonia): the training set has 1341 normal and 3875 pneumonia images, the validation set has 8 normal and 8 pneumonia images, and the test set has 234 normal and 390 pneumonia images (Table 2).
Table 2 Distribution of dataset
Split         Normal    Pneumonia
Train         1341      3875
Test          234       390
Validation    8         8
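As a quick arithmetic check, the per-split counts stated in Sect. 5.1 can be summed to confirm the reported total of 5856 images and to show the class imbalance in the training set. The dictionary layout is only an illustrative way of holding the numbers.

```python
# Image counts per split and class, as stated in the text of Sect. 5.1.
counts = {
    "train":      {"normal": 1341, "pneumonia": 3875},
    "validation": {"normal": 8,    "pneumonia": 8},
    "test":       {"normal": 234,  "pneumonia": 390},
}

total = sum(sum(split.values()) for split in counts.values())
print(total)  # 5856

train = counts["train"]
pneumonia_share = train["pneumonia"] / (train["normal"] + train["pneumonia"])
print(round(pneumonia_share, 3))  # 0.743
```

Roughly three out of four training images are pneumonia cases, so accuracy alone can be misleading and recall and precision are worth reporting alongside it.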
5.2 Hyperparameter Tuning For this model, the number of epochs was set to 10 because the training and validation accuracies were observed to peak at this value and decrease thereafter. A batch size of 32 was found appropriate for this dataset. The input dimensions were tuned by training the model on image sizes ranging from 100 to 150, and the model worked best with an image size of 125. These results are analyzed further in the results section.
5.3 Novel CNN Approach The study was extended by implementing pneumonia detection with the VGG16 model [18], keeping the hyperparameters the same as for the proposed model. VGG16 required considerably more computational power because it has more layers, and its accuracy was lower overall: the train, validation and test accuracies were 74.29%, 62.34% and 62.5%, respectively. The gap between training and test accuracy indicates overfitting. Overall, VGG16 produced poorer results than the proposed model.
6 Results The model was trained for 10 epochs with extensive hyperparameter tuning, and the best results are reported. The experiment was conducted for three different image sizes; the detailed analysis of the model behavior for each size is shown in Table 3. The two main goals of this experiment were to achieve a good test accuracy and to reduce the number of false negative cases, since predicting that a person does not have pneumonia when the patient actually has it would have serious repercussions. For a CNN model to work, the input images must have a constant size. Training was performed for three different image dimensions (Figs. 7, 8 and 9).
• For image size 150, train and test accuracies of 94.94% and 90.224% were achieved, respectively. The number of false negative cases was 12 out of a total of 624 test images, which is roughly 2% of the total images; consequently, the recall value for this set was 96.92%.
• With image size 125, the train and test accuracies were 95.11% and 91.185%, respectively. The false negative cases were 10, or 1.6% of the total test images, and the recall value was 97.43%. The best score for the model was achieved under these settings.
• Lastly, a model with image size 100 was trained, and it achieved train and test accuracies of 95.28% and 91.026%. The false negative cases were 17, or 2.7% of the total images, and the corresponding recall value was 95.64%.
Table 3 Model performance for different image sizes
Image size   Train accuracy (%)   Test accuracy (%)   Recall (%)   Precision (%)   False negative cases
150          94.94                90.224              96.92        96.52           12 (2%)
125          95.11                91.185              97.43        89.41           10 (1.6%)
100          95.28                91.026              95.64        90.53           17 (2.7%)
Fig. 7 Model performance and confusion matrix for image size 150
Fig. 8 Model performance and confusion matrix for image size 125
Fig. 9 Model performance and confusion matrix for image size 100
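The reported metrics can be related to confusion-matrix counts with a short sketch. For image size 125, the test set has 390 pneumonia and 234 normal images and 10 false negatives are reported; the false positive count of 45 is not given in the text and is inferred here from the 91.185% accuracy (569 of 624 correct), so it should be read as an assumption.

```python
def metrics(tp, fp, fn, tn):
    """Return accuracy, recall and precision (as percentages) from confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "recall": 100.0 * tp / (tp + fn),
        "precision": 100.0 * tp / (tp + fp),
    }

# Image size 125: 390 pneumonia and 234 normal test images, 10 false negatives.
# fp = 45 is inferred from the reported 91.185% accuracy (569 of 624 correct).
m = metrics(tp=380, fp=45, fn=10, tn=189)
print({k: round(v, 2) for k, v in m.items()})
```

These counts reproduce the tabulated recall and precision for size 125 to within rounding. Applying the same arithmetic to the size 150 row (12 false negatives, 90.224% accuracy) gives a precision of about 88.5%, so the tabulated 96.52 for that row may be a typographical error.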
7 Conclusion In this paper, a CNN-based architecture for pneumonia diagnosis is presented. The specific layers implemented in the architecture are those of the MobileNet [19] architecture. This implementation achieves an accuracy of 91.185% and a recall of 97.43%, which shows the merits of a MobileNet-based CNN model. The separable convolution used in the architecture performs a depthwise spatial convolution followed by a pointwise convolution. Because the convolution is performed separately in the spatial and channel domains, the computational cost per convolutional block is lower than that of a standard convolution, giving higher computational speed. Further, this implementation could be improved by utilizing pretrained architectures and a larger dataset. The approach could also be applied to detect disorders in other internal organs.
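The cost saving of a depthwise separable convolution can be made concrete by counting weights. The sketch below compares a standard 3 × 3 convolution with its separable counterpart for 64 input and 128 output channels, which matches the step into the fourth block; biases are ignored and the function names are illustrative.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out

def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise (kernel x kernel per channel) plus pointwise (1 x 1) convolution."""
    return kernel * kernel * c_in + c_in * c_out

std = standard_conv_params(3, 64, 128)
sep = separable_conv_params(3, 64, 128)
print(std, sep, round(std / sep, 1))  # 73728 8768 8.4
```

The number of multiply-accumulate operations per output position scales the same way, which is where the speed advantage comes from.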
References
1. Johns Hopkins Medicine, What causes pneumonia (2019, November). www.hopkinsmedicine.org/health/conditions-and-diseases/pneumonia. Last accessed 26 Aug 2020
2. Centers for Disease Control and Prevention, Pneumonia Can Be Prevented. https://www.cdc.gov/pneumonia/prevention.html. Last accessed 26 Aug 2020
3. H. Younis, M.H. Bhatti, M. Azeem, Classification of skin cancer dermoscopy images using transfer learning, in 2019 15th International Conference on Emerging Technologies (ICET) (Peshawar, Pakistan, 2019), pp. 1–4. https://doi.org/10.1109/ICET48972.2019.8994508
4. P. Hattikatti, Texture based interstitial lung disease detection using convolutional neural network, in 2017 International Conference on Big Data, IoT and Data Science (BID) (Pune, 2017), pp. 18–22. https://doi.org/10.1109/BID.2017.8336567
5. A. Farooq, S. Anwar, M. Awais, S. Rehman, A deep CNN based multi-class classification of Alzheimer’s disease using MRI, in 2017 IEEE International Conference on Imaging Systems and Techniques (IST) (Beijing, 2017), pp. 1–6. https://doi.org/10.1109/IST.2017.8261460
6. G. Wimmer, A. Vécsei, A. Uhl, CNN transfer learning for the automated diagnosis of celiac disease, in 2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA) (Oulu, 2016), pp. 1–6. https://doi.org/10.1109/IPTA.2016.7821020
7. O. Stephen et al., An efficient deep learning approach to pneumonia classification in healthcare. J. Healthcare Eng. 2019 (2019)
8. A.K. Acharya, R.A. Satapathy, Deep learning based approach towards the automatic diagnosis of pneumonia from chest radio-graphs. Biomed. Pharmacol. J. 13(1) (2020)
9. V. Chouhan et al., A novel transfer learning-based approach for pneumonia detection in chest X-ray images. Appl. Sci. 10(2), 559 (2020)
10. F. Pasa, V. Golkov, F. Pfeiffer et al., Efficient deep network architectures for fast chest x-ray tuberculosis screening and visualization. Sci. Rep. 9, 6268 (2019)
11. E. Ayan, H.M. Ünver, Diagnosis of pneumonia from chest X-ray images using deep learning, in 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT) (Istanbul, Turkey, 2019), pp. 1–5. https://doi.org/10.1109/EBBT.2019.8741582
12. Y. Lecun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998, November). https://doi.org/10.1109/5.726791
13. D. Liu, A guide to ReLU. https://medium.com/@danqing/a-practical-guide-to-relu-b83ca804f1f7. Last accessed 27 Aug 2020
14. R. Prabhu, Understanding of CNN - deep learning (2018, March). https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148. Last accessed 27 Aug 2020
15. E. Bisong, Google colaboratory, in Building Machine Learning and Deep Learning Models on Google Cloud Platform (Apress, Berkeley, CA, 2019)
16. P. Mooney, Chest-xray-pneumonia, Version 2. https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia. Last accessed 26 Aug 2020
17. D. Kermany, K. Zhang, M. Goldbaum, Labeled optical coherence tomography (OCT) and chest X-ray images for classification. Mendeley Data 2 (2018)
18. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
19. H.-Y. Chen, C.-Y. Su, An enhanced hybrid MobileNet, in 2018 9th International Conference on Awareness Science and Technology (iCAST) (IEEE, 2018)
Autonomous Sailing Boat R. Divya, N. Inchara, Zainab A. Muskaan, Prasad B. Honnavalli, and B. R. Charanraj
Abstract Among autonomous surface vehicles, robotic sailing is a promising technology for long-haul missions and semi-persistent presence in the oceans. Energy autonomy of these vehicles is achieved through renewable solar and wind power sources. The main contribution of this paper is the design of an autonomous sailing robot that explores the water surface and sends live video footage to the base station, together with the identification of sailing behavior, human–robot interaction, robot control, position estimation and obstacle detection. Key applications for this vessel are the monitoring of the nation’s ocean boundary and of fishermen’s activity. The robotic vehicle works both automatically and manually; it detects moving objects in the water, captures footage and sends the information to the base system, which helps to keep track of fishermen who cross the nation’s boundary unknowingly. An Arduino processor is interfaced with a wireless Raspberry Pi camera. Servo motors actuate the boat and provide movement in all directions. Keywords Sailboat · Robot · Sailing robot · Autonomous sailing robot · Obstacle detection · Robotic vehicle · Servo motors · Live video streaming · Raspberry Pi · Arduino Uno · Solar panel · Bluetooth · Ultrasonic sensors and batteries
R. Divya (B) · N. Inchara · Z. A. Muskaan · P. B. Honnavalli · B. R. Charanraj, PES University, Bangalore, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_11
1 Introduction The Internet of Things (IoT) is one of the fastest-growing domains of the current century. The central principle of IoT is to enable devices to communicate with one another and use the data collected by each other, building an ecosystem that is autonomous and self-sustaining. With the ever-increasing number of devices that can communicate with one another, an architecture must be designed to bridge the gap between current systems and the upcoming smart infrastructure. By making use of the Internet of Things, an autonomous sailing boat is developed with the following features:
1. Autonomous sailing
2. Obstacle avoidance
3. Live video streaming
4. Power supply through solar panel.
Autonomous sailing of robotic boats in changing and unknown environments is a difficult task that needs well-matched hardware and software architectures. The hardware comprises the hull, Arduino microcontroller, motor driver, Bluetooth module, buzzer, rechargeable battery, actuators, DC servo motors, Raspberry Pi camera, solar panel and ultrasonic sensors. The software comprises the Arduino Bluetooth controller app and the Arduino programming language and software (IDE). The architecture is shown in Fig. 1. The main part of the boat is the hull, the watertight body of a ship or boat. The hull is built from fiberglass, which is a good and also the cheapest choice of material. It provides a rigid surface and enough tensile strength to bear the payload of the other components of the boat. Programs are loaded onto the Arduino Uno R3 board from a personal computer. They are written in embedded C and uploaded using the Arduino programming language and software (IDE). The boat has three HC-SR04 ultrasonic (US) sensors. The HC-SR04 is a four-pin module whose pins are Vcc, trigger, echo and ground. The sensor is used for measuring distance or sensing objects. If a situation that requires an alert occurs, the buzzer sounds and the information is sent.
Fig. 1 Sailing boat architecture
Autonomous Sailing Boat
The propeller rotates clockwise to propel the boat forward. The propellers are connected to the DC servo motors, which provide the mechanical force to steer the vessel. As this project includes live video streaming, a Raspberry Pi camera is used. The camera board connects directly to the CSI connector on the Raspberry Pi. With the help of Ustream software, it can deliver clear 1080p HD video at 30 fps. The boat can be controlled manually through the HC-05 Bluetooth module, which can operate as either master or slave. The role of the module can be configured only through AT commands. Commands are given through an Android application called Bluetooth terminal HC-05. Here, the user enters an ASCII value that moves the boat forward, in reverse, left, or right accordingly. A battery fixed to the boat is the main power source for all the components. It is a 12 V lead acid battery with a capacity of 1.3 A. As a renewable energy source, solar electric systems are used because they have relatively low maintenance requirements and a long life. Once the battery is drained, the boat draws power from the solar panels.
2 Literature Survey A sailing robot boat capable of station keeping in all wind and sea conditions is described in [1]. It is a good example showing that autonomous robots can be built and controlled using low-cost components and low complexity. The prospects for long-term operation are also discussed and considered feasible in the future. The design process of a sailing robot is presented in [2], in which the robot was built to withstand bad weather and harsh environments. Initially, the control modes are basic, but they can be improved after further experiments. Route design for obstacle detection and avoidance using an omni-directional camera is presented in [3]. The camera provides panoramic vision, which is used to determine the position, shape, and size of the detected obstacles. The control mechanism and navigation, the most pressing issues in autonomous sailing robots, are addressed in [4]; this robot is well developed and well researched. The complete control mechanism was developed and tested on mobile robot control, with good performance. The controllers and DC motors are precisely designed. The complete hardware and software architecture of an autonomous sailing robot is presented in [5], in which a set of sensors and propellers is combined with wind and solar energy harvesting. This achieves the difficult task of giving water surface robots autonomous features so that they can take decisions in dangerous navigation conditions.
Fig. 2 Overview of proposed system
3 Proposed System The existing system uses ARM processors as the main processors and IR sensors with a camera. In the proposed system (Fig. 2), an Arduino Uno R3 is used as the main processor, together with a camera that provides clear video, ultrasonic sensors, a solar panel, and a buzzer.
4 Methodology Figure 3 shows the flow diagram of the manual operation of the project. A Bluetooth module is used for manually controlling the boat. On the hardware side, it is connected to the Arduino Uno R3 board, which in turn is connected to the motors and ultrasonic sensors. On the software side, the Bluetooth module connects to an Android application. The application shows a list of Bluetooth devices, from which the user selects the boat's Bluetooth device. To control the movement of the boat, ASCII values must be entered in the application: 1 for forward, 2 for left, 3 for right, 4 for reverse, and 0 for stopping the boat. When the ultrasonic sensors detect an obstacle, the buzzer starts ringing, and the user can then change the direction of the boat manually.
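The manual command scheme described above (1 forward, 2 left, 3 right, 4 reverse, 0 stop) amounts to a small lookup table. The following Python sketch illustrates that mapping only; the `Motion` enum and `decode_command` function are hypothetical names, and treating any unrecognized byte as a stop command is a fail-safe assumption added here, not something the text specifies.

```python
from enum import Enum


class Motion(Enum):
    STOP = "stop"
    FORWARD = "forward"
    LEFT = "left"
    RIGHT = "right"
    REVERSE = "reverse"


# Command characters as listed in the Methodology section.
COMMANDS = {
    "0": Motion.STOP,
    "1": Motion.FORWARD,
    "2": Motion.LEFT,
    "3": Motion.RIGHT,
    "4": Motion.REVERSE,
}


def decode_command(byte_value: int) -> Motion:
    """Map one received byte (0..255) to a motion.

    Assumption: any byte outside '0'..'4' is treated as STOP.
    """
    return COMMANDS.get(chr(byte_value), Motion.STOP)
```

On the boat itself this lookup would sit in the microcontroller loop that reads the Bluetooth serial link; the sketch leaves the serial I/O out so that the mapping can be checked in isolation.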
Fig. 3 Flow diagram
5 Algorithm Development Figure 4 shows the programming sequence for the automatic operation of the boat. Initially, the boat is in the idle state. If power is not supplied, it remains idle; otherwise, the boat starts to move forward. If any obstacle is detected, the buzzer starts ringing. If an obstacle is detected in front of, to the left of, or to the right of the boat, the boat changes its direction toward wherever the path is clear. If the boat has reached a dead end, it stops, reverses, and then changes direction to the left or right. After changing direction, the boat moves forward. If another obstacle is detected, it follows the same procedure; if not, it keeps moving forward in the same direction until it reaches the destination.
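The decision sequence of this section can be sketched as a single function that takes the three sensor distances and returns the buzzer state and the next actions. This is a minimal illustration under stated assumptions, not the authors' firmware: the function name is hypothetical, the 3 cm default threshold is the value quoted in the experimental results, and the tie-breaking choices (prefer right when both sides are clear, turn left after reversing from a dead end) are assumptions, since the text only says "left or right".

```python
from typing import List, Tuple


def next_actions(front_cm: float, left_cm: float, right_cm: float,
                 threshold_cm: float = 3.0) -> Tuple[bool, List[str]]:
    """Return (buzzer_on, actions) for one pass of the avoidance logic."""
    front_blocked = front_cm < threshold_cm
    left_blocked = left_cm < threshold_cm
    right_blocked = right_cm < threshold_cm

    # The buzzer rings whenever any sensor reports an obstacle.
    buzzer_on = front_blocked or left_blocked or right_blocked

    if not front_blocked:
        actions = ["forward"]
    elif not right_blocked:
        # Assumption: prefer the right side when it is clear.
        actions = ["right", "forward"]
    elif not left_blocked:
        actions = ["left", "forward"]
    else:
        # Dead end: stop, reverse, then turn (left chosen arbitrarily here).
        actions = ["stop", "reverse", "left", "forward"]
    return buzzer_on, actions
```

Keeping the decision separate from motor control makes the rules easy to check against the flow chart before they are ported to the microcontroller.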
6 System Architecture In the architecture shown in Fig. 5, the board in the middle is the main component. The inputs are three ultrasonic sensors for obstacle detection, a Raspberry Pi camera for live video footage, a solar panel as the power source, the Arduino software for loading programs, and an Android application for controlling the movement of the boat. The outputs are the motors with actuators, which propel the boat and are connected to the board.
7 Overview of the Boat Figure 6 shows the overview of the boat.
Fig. 4 Flow chart
Fig. 5 System architecture
Fig. 6 Overview of the boat
8 Applications
• Boundary detection
• Port detection
• Fisherman safety
• Intrusion detection.
9 Design, Risks, Assumptions, and Dependencies Risks: Operating autonomous vehicles on a water body raises many legal issues, most of which concern collisions with other vessels. The International Regulations for Preventing Collisions at Sea (COLREGs, 1972), published by the International Maritime Organization, set rules for all vessels at sea. There can also be security risks such as loss of information, malfunction of components, and hardware breakdown. Birds may damage the components and wires on the boat. Assumptions: The boat is unsinkable. It always detects obstacles and returns the information. Dependencies: Power supply from the solar rechargeable battery.
10 Experimental Results Tests were carried out on a nearby water body in Bangalore, India, using our sailing boat model (see Figs. 7, 8, 9, 10 and 11). The main purpose of testing was to check the robot's movement and control mechanism. For this purpose, the sailing boat works in two modes: (1) manually, where the boat is steered from Point A to Point B using an Android application called Bluetooth terminal HC-05, and (2) automatically. The robot captures information and sends it to the base station on land. In Fig. 8, the boat detects the obstacle in front of it with the help of the ultrasonic sensors. In Fig. 9, the boat detects obstacles in front and on the left side, changes its direction to the right, and continues to move straight ahead. In Fig. 10, when an obstacle is detected at less than 3 cm, the boat stops for 10 ms, reverses, and then changes its direction toward wherever the path is clear. Figure 11 shows the live video footage. To obtain it, virtual network computing (VNC) is installed first; VNC allows the interface of one computer to be controlled remotely from another computer. Here, the desktop of the Raspberry Pi can be seen inside a window on one's own computer and controlled as though the user were working on the Raspberry Pi itself. First, we connect the Raspberry Pi to the PC through Wi-Fi and find the IP address of the Raspberry Pi (192.168.43.60) using a terminal. We then sign in to VNC with the user ID (Pi) and password (Raspberry), which opens a virtual desktop. By giving the required commands, the user can interact with the Raspberry Pi. Using IBM Cloud Video (formerly Ustream), a streaming platform for video hosting, users can watch the live video stream.
Fig. 7 Moving forward
11 Conclusion In this project, we have presented a universal architecture for a sailing robotic vehicle that can be controlled both automatically and manually on the surface of the water. The robotic vehicle captures information and sends it back to a personal computer on land, and good performance is obtained from low-cost equipment with low complexity. This project is designed with the strong belief that sailing robotic vehicles can also be used for research and ocean measurement. Projects like this help in studying aspects of the ocean. The current limitations of the presented autonomous sailing boat include security-related issues, where the connection may be lost in the middle of the open sea, and pest problems such as locust infestation, birds, and other animals, as well as unpredictable situations at sea. A bigger model of the same project can be built for larger research fields such as ocean mining, metal detection, and sonar applications.
Fig. 8 Detecting the obstacle
Fig. 9 Detected obstacle in the front, left side, and changed the direction to the right
Fig. 10 Reverse back
Fig. 11 Live footage using Raspberry Pi camera
Acknowledgements The authors would like to thank the faculty Prof. Prasad B Honnavalli and Assistant Prof. Charanraj BR for helping and guiding us throughout the project.
References 1. M. Neal, A hardware proof of concept of a sailing robot for ocean observation (2006) 2. M. Naveau, C. Anthierens, E. Pauly, P. Courmontagne, MARIUS Project: design of a sailing robot for oceanographic missions (2013) 3. V. Guo, M. Romero, S.-H. Ieng, F. Plumet, R. Benosman, B. Gas, Reactive path planning for autonomous sailboat using an omni-directional camera for obstacle detection (2011) 4. L. Zhi, M. Xuesong, Navigation and control system of mobile robot based on ROS 5. F. Plumet, C. Pêtrès, M.-A. Romero-Ramirez, B. Gas, S.-H. Ieng, Toward an autonomous sailing boat (2014)
A Proposed Method to Improve Efficiency in IPv6 Network Using Machine Learning Algorithms: An Overview Reema Roychaudhary and Rekha Shahapurkar
Abstract The current version of the Internet Protocol for addressing networking devices is IPv6. It is also known as a classless addressing scheme that identifies computing machines across the web so that they can be located. Classification and clustering algorithms from machine learning (ML) are among the mechanisms used to improve its efficiency. This paper surveys some of the methods used before the adoption of machine learning approaches. The proposed method makes use of semi-supervised or reinforcement ML algorithms. Thus, the intention of this paper is to study the future of IPv6 addressing in light of the various mechanisms proposed by past researchers to minimize delay, and to propose a model that uses a reinforcement or semi-supervised ML algorithm to improve addressing performance on real-time data. Keywords IPv4 address · IPv6 address · ICMPv6 · DHCPv6 · Clustering · Classification · Reinforcement · Semi-supervised · Supervised · Unsupervised
1 Introduction The Internet Protocol (IP) version presently used in networks such as Local Area Networks (LAN) and Wide Area Networks (WAN) is IP version 4 (IPv4). Internet Protocol version 6 (IPv6) is the current version of the network layer protocol and is also known as a classless addressing scheme that identifies devices across the Internet so that they can be linked. The earlier version, IPv4, uses a 32-bit address format to address billions of computing machines around the world, which was once thought to be enough. However, the growth of the Internet, desktop PCs, smart gadgets, and now Internet of Things (IoT) smart devices shows that the world needs a greater number of addresses. IPv6 uses a 128-bit (16-byte, or 16-octet) addressing scheme, as shown in Fig. 1, that can supply $2^{128}$ unique IP addresses, enough for each and every device or node connected to the Internet. The class structure of IPv4 is removed in IPv6 to counter the address depletion problem. Thus, in the classless addressing scheme, variable-length blocks are used instead of classes. An IPv4 address consists of four groups of one- to three-digit numbers; an IPv6 address consists of eight groups of four hexadecimal digits, each group two bytes long, separated by colons (:). This is shown in Fig. 2.
R. Roychaudhary (B) St. Vincent Pallotti College of Engineering and Technology, Nagpur, Maharashtra, India R. Roychaudhary · R. Shahapurkar Oriental University, Indore, Madhya Pradesh, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_12
Fig. 1 IPv6 address format
Fig. 2 Colon hexadecimal notation
The various improvements of using IPv6 are as follows:
1. Long address range.
2. Autoconfiguration.
3. New version of protocol header structure.
4. Refined support for options and extensions.
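The 128-bit address size and the colon hexadecimal notation discussed in this section can be checked with Python's standard `ipaddress` module. The short sketch below is an illustration only; the sample address is taken from the 2001:db8::/32 documentation prefix and is an assumption chosen for the example, not an address from the paper.

```python
import ipaddress

# Size of the IPv6 address space: 128 bits.
TOTAL_IPV6_ADDRESSES = 2 ** 128

# Sample address from the documentation prefix (illustrative assumption).
addr = ipaddress.IPv6Address("2001:0db8:0000:0000:0000:ff00:0042:8329")

# Full form: eight groups of four hexadecimal digits separated by colons.
full_form = addr.exploded
groups = full_form.split(":")

# Shortened form: leading zeros dropped and the longest zero run written as "::".
short_form = addr.compressed

# Raw address: 16 bytes (octets).
raw_bytes = addr.packed
```

Here `full_form` keeps all eight four-digit groups, while `short_form` is the abbreviated spelling of the same 16-byte value.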
Networking and distributed computing systems are among the most important areas that provide efficient computational resources for machine learning. Existing research is limited to the use of conventional machine learning algorithms based on prediction and classification. Machine learning is well suited to networking because of its capabilities for intrusion detection and performance prediction, and it can benefit network design and optimization. It also creates new possibilities for building prototype designs through a well-balanced training method [1]. Specifically, machine learning (ML) approaches are relevant to IPv6 networks for several reasons. Firstly, ML has well-established capabilities. Secondly, classification and prediction mechanisms play essential roles in network problems such as detecting attacks from the outside world and forecasting efficiency. Thirdly, ML can help in making decisions that improve other network parameters according to the present state of the environment. Many network problems involve interaction with complex system environments, and it is not easy to construct accurate models of such complicated systems. ML can estimate a model of these systems with a measurable precision.
Fig. 3 Supervised learning
Thus, the major intention of this paper is to survey the future of IPv6 addressing based on the various mechanisms discussed, with the aim of reducing delay and improving the performance of the addressing scheme. We propose a model based on a reinforcement-based machine learning mechanism, or alternatively one that combines the benefits of various machine learning algorithms, to form a new IPv6 addressing scheme.
2 Types of Machine Learning Algorithms Machine learning (ML) can be formally defined as a software program that learns from experience on some set of tasks, such that its measured performance on those tasks improves with the experience gained. ML is broadly divided into three types:
2.1 Supervised Learning It works with labeled data (the supervisor) to train the algorithm. It is used to predict the next value and is hence also known as a task-driven method. A set of inputs and the corresponding desired output values is fed to the algorithm, and the trained model then produces the desired output for new inputs. Figure 3 illustrates this description.
2.2 Unsupervised Learning It works with unlabeled data and thus focuses on grouping the raw data into different clusters. As it learns on its own, it is also known as a data-driven method. A set of inputs is given to the algorithm, which processes it and produces output based on its own interpretation of the data. Figure 4 illustrates this description.
Fig. 4 Unsupervised learning
2.3 Reinforcement Learning It is also known as a trial-and-error method, where no labeled inputs or outputs are available and the machine needs to learn on its own. It works with three main components: agent, environment, and action. The main aim is to increase the rewards and reduce the punishment during the learning phase. Reinforcement learning (RL) has a positive impact on the system because its feedback strategy lets the agent change its state and move forward, which is similar to the human learning process. Figure 5 illustrates the description. Thus, in RL, there is no supervision; instead, the agent learns from experience or past states ($S_{t-1}$), performs an action ($A_t$) from the set provided by the environment, and on positive results gains a reward ($R_t$) and a new state ($S_{t+1}$). On negative results, punishment is given. A Markov decision process (MDP) [2] is used to formulate any RL problem in mathematical terms.
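The state, action, and reward cycle described above can be made concrete with a tabular Q-learning update. Q-learning is one common RL algorithm and is used here purely as an example; the paper does not name a specific algorithm, and the function name, the learning rate `alpha`, and the discount factor `gamma` are assumptions introduced for this sketch.

```python
from typing import Dict, Hashable, Iterable, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: Iterable[Hashable],
             alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Apply one Q-learning step and return the updated value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    Unseen state-action pairs are treated as 0.0 (an assumption).
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old_value = q.get((state, action), 0.0)
    new_value = old_value + alpha * (reward + gamma * best_next - old_value)
    q[(state, action)] = new_value
    return new_value
```

A positive reward raises the stored value of the action taken, and a negative reward (punishment) lowers it, which is the feedback loop the section describes.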
Fig. 5 Reinforcement learning cycle
3 Background and Related Work In [2] and [3], the authors' goal was to improve packet delivery, routing cost ratio, and average end-to-end delay by using a three-dimensional routing decision cellular address (RDCA) mechanism while carrying out a survey with IoT devices. A paper from 2018 [1] presented a qualitative assessment of IPv6 for the Industrial IoT (IIoT). Nasin Sadat Zarif et al. [4] proposed a new approach to limit the computational overhead and complicated operations involved in the Electronic Product Code (EPC) and Object Name System (ONS) IPv6 addressing methods. Another line of work concerns network security: the authors of [5] and [6] aim to minimize network reconnaissance attacks by using the available global unicast IPv6 addressing method. Raja Kumar Murugesan et al. [7] presented, in 2015, a survey of practical issues in IPv6 network technology. A paper from 2008 [8] focused on improving the performance of IPv6 packet transmission using a direct mapping method. In [9], the authors examined and reduced the signaling cost problem using the network mobility (NEMO) support protocol for mobile IoT devices; future work can extend the analytical model using ML tools and techniques. Thibaut Stimpfling et al. [9] proposed a scalable high-performance IPv6 (SHIP) lookup design that supports high bandwidth and low lookup latency as IP networks scale. Similar work has also been proposed in [10]. In [11], the authors worked on several methods to detect active network nodes in the IPv6 address space for optimization purposes. A paper from 2017 [12] recommended a proxy-based method that can reduce the cost of IPv4/IPv6 interoperability services and network maintenance without increasing the complexity of the network.
The author of [13] carried out an evaluation of DHCP version 4 and DHCP version 6 servers with dual-stack and IPv4-over-IPv6 systems. Similar work was carried out in [14] for an IPTV box for IoT devices. A 2013 paper [15] investigates the handover delay for various multimedia applications in an IEEE 802.16 network using DHCPv6. Liu et al. [15] consider a technique to generate and manage IPv6 addresses according to the needs of the network. This approach uses different IPv6 address generation schemes to suit the specific requirements of different scenarios. As future work, there is a need to design and implement the automatic generation of new IP address schemes according to network requirements. In 2017, Amjed Sid Ahmed Mohamed Sid Ahmed et al. [16] surveyed the neighbor discovery protocol (NDP). The findings of the paper [17] point to other open challenges of NDP attacks that remain to be studied.
The paper [17] proposes the deployment of power-free wireless sensor networks (WSNs) for many IoT applications, exchanging information between nodes on the Internet and power-free WSN nodes using the 6LoWPAN protocol. The future work involves the use of LAID for other kinds of power-free wireless networks. Ali El Ksimi et al. [18] presented an algorithm to optimize the security of IPv6 duplicate address detection (DAD) for unicast addresses, compared with NDP. A paper from 2018 [19] proposed a method known as DNS backscatter; the author concluded that IPv6 is much harder to monitor and that DNS backscatter will therefore be a necessary tool for observing network-related activity in IPv6. Carpene et al. [20] evaluated the performance of ML algorithms in classifying IPv6 interface identifier (IID) address construction. Future research will continue to improve the classification training times for ANN scenarios. In [21, 22], the authors discussed intrusion detection systems (IDS) for detecting possible attacks in a network. In 2017, Mowei Wang et al. [23] provided a survey of the current advances in machine learning technology for the networking and distributed computing domain. The paper [24] proposes a technique that uses the naïve Bayesian classifier and the C4.5 decision tree to detect hidden channels in IPv6 networks. In the same context, Bhanu Vrat et al. [25] highlighted the behavior patterns of a number of attacks in IPv4 and IPv6 networks using machine learning methods. In 2018, a paper [26] showed that IPv6 addresses can be clustered into six distinct addressing schemes using an entropy clustering method. This paper acts as a starting point for studying the future of IPv6 addressing in light of the various challenges. Triveni Pujari et al. [2] proposed a homogeneous clustering MLA that is used to identify distinct types of packet flow in a computer network.
4 Proposed Methodology In the present network addressing scenario, IPv6 addressing is carried out with the help of clustering and classification algorithms. We will combine the benefits of supervised and unsupervised machine learning algorithms, an approach known as semi-supervised machine learning (S-SML), to form a new IPv6 addressing scheme. Figure 6 shows the proposed diagram. Initially, we will define the IPv6 addressing in our network using a simulation tool. Then, based on an ML algorithm (reinforcement), addresses will be assigned to the nodes virtually. This addressing will be checked for efficiency, and if the efficiency is below a certain threshold, re-addressing will be done. The process will continue until adequate efficiency is obtained. Only then will the virtual addressing be assigned to the actual network, and the performance will be monitored. In case of a performance drop, the network will be re-addressed.
Fig. 6 Proposed diagram
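The assign, evaluate, and re-address loop of the proposed methodology can be outlined in a few lines. The sketch below is only a structural illustration under stated assumptions: `propose` stands in for the ML-driven virtual addressing step and `efficiency` for the efficiency check, both hypothetical callables that the paper has not yet specified, and the `max_rounds` cap is an added safeguard that is not part of the described process.

```python
from typing import Callable, Dict, List, Optional

Assignment = Dict[str, str]  # node name -> IPv6 address (as text)


def assign_until_efficient(nodes: List[str],
                           propose: Callable[[List[str], int], Assignment],
                           efficiency: Callable[[Assignment], float],
                           threshold: float,
                           max_rounds: int = 10) -> Optional[Assignment]:
    """Repeat virtual addressing until the efficiency reaches the threshold.

    Returns the first assignment that is good enough, or None if none is
    found within max_rounds (the cap is an assumption of this sketch).
    """
    for round_no in range(max_rounds):
        candidate = propose(nodes, round_no)      # virtual addressing step
        if efficiency(candidate) >= threshold:    # efficiency check
            return candidate                      # ready for the real network
        # Otherwise fall through and re-address in the next round.
    return None
```

Monitoring the live network and triggering re-addressing on a performance drop would call the same function again with the current node list.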
5 Expected Outcome The machine learning part will be designed and proposed in our work; it is yet to be modeled. Figure 7 shows the expected outcome.
6 Conclusion The main objective is to design a system that reduces the delay during address assignment for new nodes in an IPv6 network. The proposed system will help to improve the quality of standard addressing based on previous learning. It is expected to improve the overall effectiveness of IPv6 addressing for small, medium, and large-scale networks.
Fig. 7 Outcome of the proposed work using machine learning algorithm
References 1. Y. Tian, R. Miao, W. Chen, Y. Wang, Heterogeneous IoTs routing strategy based on cellular address, in IEEE International Conference on Smart Internet of Things (2018), pp. 64–69 2. N. Xu, Understanding the reinforcement learning, in CCEAI (IOP Publishing Ltd, China, 2019) 3. G. Kumar, A survey of IPv6 addressing schemes for Internet of Things. Int. J. Hyperconnectivity Internet of Things 2(2) (2018) 4. H. Najafi, M. Imani, N.S. Zarif, A new hybrid method of IPv6 addressing in the Internet of Things, in International Conference of Electrical & Computer Engineering in Iranian Journals & Conference Proceeding (2018) 5. S. Rukmanidevi, M. Hemalatha, Real time prefix matching based IP lookup and update mechanism for efficient routing in networks. J. Ambient Intell. Humanized Comput (Springer, 2019, December) 6. R.C. Mahesh, M. Bundele, A review on implementation issues in IPv6 network technology. Intern. J. Eng. Res. Gen. Sci. 3(6), 800–809 (2015 November–December) 7. R. Budiarto, S. Ramadas, R.K. Murugesan, Performance Improvement of IPv6 Packet Transmission Through Address Resolution Using Direct Mapping (IEEE, 2008) 8. P. Herber, B. Feldner, A Qualitative Evaluation of IPv6 for the Industrial Internet of Things (Elsevier, 2018), pp. 377–384 9. N. Belanger, J.M.P. Langlois, Y. Savaria, T. Stimpfling, SHIP: A Scalable High-Performance IPv6 Lookup Algorithm that Exploits Prefix Characteristics (IEEE, 2017) 10. D.A. Moskvin, T.D. Ovasapyan D.V. Ivanov, Approaches to detecting active network nodes in IPv6 address space. Autom. Control Comput. Sci. 51(8), 902–906 (2017) 11. Y.-S. Chen, S.-Y. Liao, A framework for supporting application level interoperability between IP4 and IPv6. Adv. Intell. Inf. Hiding Multimedia Signal Process. Smart Innov. Syst. Technol. 3(3), 271–278 (2017) 12. C.Y. Wu, W.Y. Lin, T. Te Lu, Comparison of IPv4-over-IPv6 (4over6) and dual stack technologies in dynamic configuration for IPv4v6 address. Adv. Intell. Inf. 
Hiding Multimedia Signal Process. Smart Innov. Syst. Technol. 259–269 (2017)
13. C.-H. Lin, J.-L. Hu, C. L. Chang, W.Y. Lin, H.-K. Su, Design and Implementation of an IPv4/IPv6 DualStack automatic service discovery mechanism for an access control system. Adv. Intell. Inf. Hiding Multimedia Signal Process. Smart Innov. Syst. Technol. 279–284 (2017) 14. J. Wozniak, K. Nowicki, T. Mrugalski, Dynamic host configuration protocol for IPv6 improvements for mobile nodes, telecommunication systems (2013), pp. 1021–1031 15. H.E. Lin, G. Ren, Y. Liu, GAGMS: A Requirement-Driven General Address Generation and Management System, vol. 61 (Science China Press and Springer-Verlag GmbH Germany, Part of Springer Nature, 2018) 16. R. Hassan, N.E. Othman, A.S.A.M.S. Ahmed, IPv6 neighbor discovery protocol specifications, threats and countermeasures: a survey. IEEE Access 5, 18187–18210 (2017) 17. S.-W. Qiu, K.-K. Chi, Y. Michael Fang, Y.-H. Zhu, Latency aware IPv6 packet delivery scheme over IEEE 802.15.4 based battery-free wireless sensor networks. IEEE Trans. Mobile Comput. 14(8), 1–14 (2016) 18. A. El Ksimi, C. Leghris, Towards a new algorithm to optimize IPv6 neighbor discovery security for small objects networks. Secur. Commun. Netw. 18 (Article ID 1816462), 11 p (2018) 19. J. Heidemann, K. Fukuda, Who knocks at the IPv6 door? Detecting IPv6 scanning, in Internet Measurement Conference (Boston, USA, 2018), 7 p 20. M. Johnstine, A.J. Woodward, C. Carpene, The effectiveness of classification algorithms on IPv6 IID construction. Int. J. Auton. Adaptive Commun. Syst. 10 (2017) 21. B. Belaton, M. Anbar, A.K. Alabsi, A.K. Al-Ani, O.E. Elejla, Comparison of classification algorithms on ICMPv6-based DDoS attacks detection, in 5th ICCST 2018 (Kota Kinabalu, Malaysia, 2019), pp. 347–357 22. Y. Cui, X. Wang, S. Xiao, J. Jiang, M. Wang, Machine Learning for Networking: Workflow, Advances and Opportunities (IEEE Network, 2017), pp. 1–8 23. X. Ma, E. Peytchev, A. 
Salih, Detection and classification of covert channels in IPv6 using enhanced machine learning, in Proceedings of The International Conference on Computer Technology and Information Systems (2015) 24. N. Aggarwal, S. Venkatesan, B. Vrat, Anomaly detection in IPv4 and IPv6 networks using machine learning, in IEEE INDICON 2015 (2015) 25. Q. Scheitle, P. Foremski, Q. Lone, M. Korczyński, S.D. Strowes, L. Hendriks, O. Gasser, Clusters in the Expanse: Understanding and Unbiasing IPv6 Hitlists, in Internet Measurement Conference (IMC’18) (Boston, MA, USA. ACM, New York, NY, USA, 2018), p. 15 26. B.G. Nalinakshi, D. Jayaramaiah, T. Pujari, Flow-based network traffic classification using clustering technique with MLA approach. Int. J. Res. Appl. Sci. Eng. Technol. (IJRASET) 5, 1855–1862 (2017) 27. I. You, C. Xu, H. Zhang, J. Guan, The PMIPv6-Based Group Binding Update for IoT Devices, Mobile Information Systems (Hindawi Publishing Corporation, 2016), p. 8 28. S.A. Abdullah, SEUI-64, bits an IPv6 addressing strategy to mitigate reconnaissance attacks. Eng. Sci. Technol. Int. J. 22, 667–672 (2019) 29. R.S.M. Alowaid, J.-G. Choi, M. Gohar, H.A.M. Alrubaish, Distributed group-based mobility management scheme in wireless body area networks. Hindawi Wirel. Commun. Mobile Comput. 11
Byte Shrinking Approach for Lossy Image Compression Tanuja R. Patil and Vishwanath P. Baligar
Abstract Digital image storage and transmission is a challenging task nowadays. According to statistics, an average of 1.8 billion images are transmitted daily. Hence, image compression is indispensable. Here, we discuss a novel lossy image compression approach in which the quality of the reconstructed image is nevertheless high. The method reduces the number of bytes by counting the number of ‘1’s in each bitplane of three adjacent pixels. This count value is stored, and Huffman compression is applied afterwards. Reconstruction is done by an energy distribution method. This gives better results in terms of quality at lower PSNR values compared to the JPEG algorithm. We also use another metric to measure the quality of the image, which is discussed in detail. Keywords Lossy image compression · Comparison with JPEG · Quality of image compression · Byte reduction · Quality metric for image compression
1 Introduction An image is an artifact that represents visual perception. A digital image is stored as pixels, each with a numerical value representing light intensity. As the amount of image data stored and transmitted is huge, the eventual solution is image compression. Image compression is the process of reducing irrelevant and redundant image data so that an image can be stored or transmitted without degrading its quality to an unacceptable level. Depending on how redundant data are reduced, image compression is of two types: lossless and lossy. Lossless compression is needed for applications such as medical imaging and satellite imagery. Huffman coding, LZW coding, run length coding, and arithmetic coding are some of the popular lossless techniques. Recent approaches use prediction-based techniques to improve the compression ratio. Lossy compression results in the loss of some data and can be used where the size of the image is critical, such as in multimedia and GIS. These techniques may achieve a high compression ratio, but the quality of the image might be reduced. Hence, work is required on improving the quality of reconstructed images in lossy compression [1–3].
T. R. Patil (B) · V. P. Baligar K.L.E. Technological University, Hubballi, India e-mail: [email protected] V. P. Baligar e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_13
In this paper, we present a lossy approach for image compression of grayscale images. Section 2 is a literature survey, which gives an overview of some image compression techniques. In Sect. 3, we discuss our proposed algorithm. Section 4 explains the compression and decompression algorithms of the byte shrink approach. In Sect. 5, we discuss the performance metric; in Sect. 6, we discuss the results; and Sect. 7 compares our algorithm with the JPEG algorithm. Section 8 gives the conclusion.
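The counting step summarized in the abstract (the number of ‘1’s in each bitplane of three adjacent pixels) can be illustrated as follows. This is a minimal sketch based only on that one-sentence description, not the authors' implementation: the function names are hypothetical, and packing the eight counts (each between 0 and 3, so two bits each) into two bytes is an assumption made here to show how three pixel bytes could shrink, since the storage layout before Huffman coding is not given in this part of the text.

```python
from typing import List


def bitplane_counts(p0: int, p1: int, p2: int) -> List[int]:
    """Count the pixels with a '1' in each bit plane, most significant first."""
    for p in (p0, p1, p2):
        if not 0 <= p <= 255:
            raise ValueError("pixels must be 8-bit grayscale values")
    return [((p0 >> b) & 1) + ((p1 >> b) & 1) + ((p2 >> b) & 1)
            for b in range(7, -1, -1)]


def pack_counts(counts: List[int]) -> bytes:
    """Pack eight counts (0..3 each) into two bytes.

    Assumption of this sketch: two bits per count, most significant plane first.
    """
    if len(counts) != 8 or any(not 0 <= c <= 3 for c in counts):
        raise ValueError("expected eight counts in the range 0..3")
    value = 0
    for c in counts:
        value = (value << 2) | c
    return value.to_bytes(2, "big")
```

Because only the counts are kept, the positions of the ‘1’s among the three pixels are lost, which is why the scheme is lossy and needs a reconstruction step.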
2 Review of Literature
Image compression is either lossy or lossless. We discuss some lossy and lossless image compression techniques and their outcomes. E. Kannan and G. Murugan present a near-lossless image compression algorithm based on the Bayer format image and suitable for hardware design. The algorithm provides lossless compression for the region of interest (ROI) and high-quality compression for other regions. It achieves a low average compression rate (2.12 bits/pixel) with high image quality (larger than 53.11 dB) for endoscopic images [4]. Vishwanath P. Baligar, L. M. Patnaik and G. R. Nagabhushana discuss a lossless image compression scheme using linear predictors. The prediction for each pixel is formed using a set of least-square-based linear prediction coefficients of the block to which the current pixel belongs. This is adaptive prediction based on Bayesian model averaging to improve the quality of the predictor. The compression performance of the proposed method is superior to that of the Joint Photographic Experts Group's JPEG-LS [5]. Mohamed Uvaze Ahamed Ayoobkhan, Eswaran Chikkannan, Kannan Ramakrishnan and Saravana Balaji Balasubramanian describe a lossless image compression technique using prediction errors. To achieve better compression performance, a novel classifier that makes use of wavelet and Fourier descriptor features is employed, and an artificial neural network (ANN) is used as the predictor [6]. According to Sinisa Ilic, Mile Petrovic, Branimir Jaksic and Petar Spalevic, at low bit rates the compression methodology used in JPEG introduces noise effects: contour-like structures appear, which degrade the visual quality [7]. R. P. Huilgol, T. R. Patil and V. P. Baligar discuss two lossless compression techniques, which use the JPEG-LS prediction technique together with seed number construction and equation-solving methods to improve the compression ratio [8, 9].
T. R. Patil, V. P. Baligar and R. P. Huilgol discuss a lossy compression algorithm. They show that, with the surrounding pixels method, the number of correct pixels in the reconstructed image increases at low PSNR levels, which reduces the blur effects that may arise in JPEG at the same PSNR values [10]. From the literature review, we understand that among lossy image compression methods, standard JPEG has some adverse effects at low PSNR values, and that work is being done to overcome its contour effects. We propose a new method, called the byte shrink approach, by which quality can be improved at low PSNR values.
3 Byte Shrinking Approach
In this section, we describe the 'byte shrink' approach for grayscale images. The approach converts the data of three pixels into two bytes; further compression is then achieved by Huffman coding. The image is processed in raster-scan order, and three adjacent pixels are considered at a time. For these three pixels, the number of '1's present in each bitplane is counted. The eight count values form 2 bytes of data, which are stored in a file. The file values are then Huffman compressed and saved, which forms the 'compressed' file. Reconstruction is done using a concept of energy distribution, which is explained in the Methodology section.
4 Methodology
Here, we describe the compression and decompression algorithms with examples taken from a sample of the Lena image. For our work, we have used the standard set of grayscale images of size 512 × 512.
4.1 Algorithm Used for Compression
(1) Input the grayscale image pixel intensity values.
(2) Scan the pixels in raster-scan order, and consider three adjacent pixels X[i, j], X[i, j+1] and X[i, j+2].
(3) Count the number of 1s in each bitplane across these three pixels.
(4) Write the decimal value of this count for each bitplane. The decimal value will be '0', '1', '2' or '3'.
(5) Write the 2-bit binary value for the above decimal value, i.e., '00', '01', '10', '11', respectively.
(6) There are eight bitplanes for a pixel. To represent the count value of one bitplane in binary, 2 bits are required. Hence, for 8 bitplanes, 16 bits are required, i.e., 2 bytes. The lower 4 bitplanes are represented by one byte (LSByte) and the upper 4 bitplanes by another byte (MSByte). Thus, three bytes are converted into two bytes.
(7) Write the MSByte and LSByte to a file. Later, the file values are Huffman coded and saved as the Huffman-compressed file. Let us call this file 'Byteshrink.'
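The counting and packing in steps (3) to (6) can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `ByteShrinkPack` and the method `pack` are hypothetical, and the Huffman stage of step (7) is left out.

```java
// Sketch only: ByteShrinkPack and pack(...) are illustrative names, not from the paper.
final class ByteShrinkPack {

    /** Packs three 8-bit pixels into {MSByte, LSByte} holding the per-bitplane counts of 1s. */
    static int[] pack(int p, int q, int r) {
        int packed = 0; // 16 bits in total: two bits per bitplane, bitplane b7 first
        for (int n = 7; n >= 0; n--) {
            // How many of the three pixels have bit n set (0..3).
            int count = ((p >> n) & 1) + ((q >> n) & 1) + ((r >> n) & 1);
            packed = (packed << 2) | count;
        }
        // Upper byte covers b7..b4, lower byte covers b3..b0.
        return new int[] { (packed >> 8) & 0xFF, packed & 0xFF };
    }

    public static void main(String[] args) {
        int[] bytes = pack(162, 163, 157);
        System.out.println("MSByte = " + bytes[0] + ", LSByte = " + bytes[1]);
    }
}
```

For the pixels 162, 163 and 157, the counts for b7 to b0 are 3, 0, 2, 1, 1, 1, 2, 2, so the two bytes are 11001001 and 01011010 in binary, i.e., 201 and 90 in decimal.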
4.2 Algorithm Used for Decompression
Decompression is done as shown below. Here the input file is the 'Byteshrink' file.
(1) Huffman decompression is done to get the MSByte and LSByte values.
(2) Create an array for image reconstruction.
(3) Read two bits at a time to get the count value of one bitplane from the MSByte.
(4) Repeat the above step four times to get 4 count values. Convert each binary value to decimal.
(5) Similarly, repeat steps 3–4 to get 4 count values from the LSByte.
(6) Reconstruct three pixel values from the 2 bytes using the above count values, as described below.
For each bitplane of the three pixels, we use the steps below, and they are repeated for all bitplanes. Initially, the energy to be allotted to each pixel is calculated. If the count value is '0' or '3', there is no need to calculate energy. If the count value is '1' or '2', the calculation is done as follows.
\[ \text{Energy} = \sum_{n=0}^{N} 2^{n} \times C_n \quad (C_n \neq 0,\ C_n \neq 3) \tag{1} \]
where N = 7 and C_n is the count value (either '1' or '2') for the nth bitplane. This energy has to be distributed among the three pixels p, q, r.
(1) If the count value is '3', allot '1' to the particular bitplane of all pixels.
(2) If the count value is '0', allot '0' to the particular bitplane of all pixels.
(3) If the count value is '1' or '2', allot '1' to one (or two) of p, q, r, choosing the pixels whose remaining energy is not less than the energy (weight) of the particular bitplane.
(4) Repeat the allotment of '1's as per steps 1–3 for all bitplanes. After each allotment to a bitplane, recalculate the energy and distribute '1's accordingly.
(5) Once the allotment of '1's is over for all 8 bitplanes, convert the binary value of each pixel to a decimal value.
This is how the pixel intensity values for 3 pixels are allotted, and thus three pixels are reconstructed from 2 bytes. This is explained in detail with respect to a sample set.
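The first part of decompression, recovering the eight count values from the two bytes and evaluating Eq. (1), can be sketched as below. The class `ByteShrinkUnpack` and its methods are hypothetical names chosen for illustration, and the bit layout (b7 in the top two bits of the MSByte) is an assumption that follows the description of the compression steps.

```java
// Sketch only: names and bit layout are assumptions made for illustration.
final class ByteShrinkUnpack {

    /** Returns the eight bitplane counts; index n holds the count for bitplane n (b0..b7). */
    static int[] counts(int msByte, int lsByte) {
        int packed = ((msByte & 0xFF) << 8) | (lsByte & 0xFF);
        int[] c = new int[8];
        for (int n = 0; n < 8; n++) {
            c[n] = (packed >> (2 * n)) & 0b11; // two bits per bitplane
        }
        return c;
    }

    /** Eq. (1): sum of 2^n * C_n over the bitplanes whose count is 1 or 2. */
    static int energy(int[] c) {
        int e = 0;
        for (int n = 0; n < 8; n++) {
            if (c[n] == 1 || c[n] == 2) {
                e += (1 << n) * c[n];
            }
        }
        return e;
    }

    public static void main(String[] args) {
        int[] c = counts(201, 90); // 11001001 and 01011010 in binary
        int e = energy(c);
        System.out.println("energy = " + e + ", share per pixel = " + e / 3
                + ", remainder = " + e % 3);
    }
}
```

With the bytes 201 and 90, the counts for b7 to b0 come out as 3, 0, 2, 1, 1, 1, 2, 2, and the energy is 64 + 16 + 8 + 4 + 4 + 2 = 98, which gives a share of 32 per pixel and a remainder of 2.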
Examples
We have applied this approach to grayscale images of size 512 × 512. Here we discuss the algorithm with an example of a sample set of the Lena image, shown in Fig. 1.
162  163  157  159  162  164
160  157  158  159  162  163
161  158  157  161  163  164
Fig. 1 Sample set 1 of Lena image
Bitplane      b7  b6  b5  b4  b3  b2  b1  b0
Count value   3   0   2   1   1   1   2   2
Fig. 2 Count-value file for sample set of Lena image
Calculations for compression over sample set 1:
We show the calculations for the first three pixel intensity values in the sample set shown in Fig. 1, which are 162, 163 and 157.
Calculation of two bytes from three pixel values:
(i) Convert the three pixel intensity values, say 'p', 'q', 'r', into binary: p = 162 = 10100010, q = 163 = 10100011, r = 157 = 10011101.
(ii) Count the number of '1's in each bitplane across all 3 pixels and store the counts. The count values are as shown in Fig. 2.
(iii) Assign a 2-bit binary code to each count value, i.e., 0 = '00', 1 = '01', 2 = '10', 3 = '11'.
(iv) The lower 4 bitplane count values are combined into 1 byte, the LSByte. The count values for b3, b2, b1, b0 are 1, 1, 2, 2; assigning the 2-bit codes to these values gives 01011010 (binary) = 90 (decimal). Thus the LSByte, 90, is stored in the 'Byteshrink' file.
(v) Similarly, the upper 4 bitplane count values are combined into 1 byte, the MSByte. The count values for b7, b6, b5, b4 are 3, 0, 2, 1; assigning the 2-bit codes to these values gives 11001001 (binary) = 201 (decimal). Thus the MSByte, 201, is stored in the 'Byteshrink' file.
(vi) Repeat steps (i)–(v) for the whole image, considering three adjacent pixels at a time.
(vii) These values are further Huffman compressed and stored as a compressed file.
Reconstruction procedure:
1. Huffman decode the 'Byteshrink' file.
2. Read the MSByte and LSByte, and reconstruct three pixel values from these two bytes as follows. The MSByte read from the file is 201 and the LSByte is 90.
3. Convert them into binary: MSByte = 201 = 11001001 and LSByte = 90 = 01011010. Get the count values by reading two bits at a time. For the MSByte, '11', '00', '10', '01' = 3, 0, 2, 1, and for the LSByte, '01', '01', '10', '10' = 1, 1, 2, 2.
4. Now, from these 4 + 4 = 8 count values, reconstruct the 8 bitplanes of the three pixels 'p', 'q', 'r'.
(i) Write the MSByte and LSByte count values as 3, 0, 2, 1, 1, 1, 2, 2.
(ii) Calculate the energy. Wherever the count value is 0 or 3, skip the calculation; for 1 or 2, calculate the energy using Eq. 1.
\[ \text{Energy} = 2 \times 2^{5} + 1 \times 2^{4} + 1 \times 2^{3} + 1 \times 2^{2} + 2 \times 2^{1} + 2 \times 2^{0} = 98, \quad \text{remainder} = 98 \bmod 3 = 2 \]
(iii) Calculate the energy of all 3 pixels by dividing the total energy among them: Enp = Enq = Enr = 98/3 = 32 (integer division).
(iv) Distribute '1's and '0's over the bitplanes of all three pixels as shown below.
(a) Bitplane 7 count value = 3. Allot '1' to bitplane 7 of all three pixels.
(b) Bitplane 6 count value = 0. Allot '0' to bitplane 6 of all three pixels.
(c) Bitplane 5 count value = 2. The bitplane 5 energy value (weighted binary value) is 32 and Enp = Enq = 32. Since the count value is '2' and Enp = Enq = bitplane 5 value, allot '1' to bitplane 5 of pixels 'p' and 'q'. Recalculate the energy of each pixel: Enp = 32 − 32 = 0, Enq = 32 − 32 = 0, Enr = 32, so Energy = Enp + Enq + Enr = 0 + 0 + 32 = 32.
(d) Bitplane 4 count value = 1. Enp = Enq = 0, Enr = 32 and the bitplane 4 energy value is 16. Hence allot '1' to pixel 'r'. Recalculate the energy: Enr = 32 − 16 = 16.
(e) Bitplane 3 count value = 1. Enr = 16 and the bitplane 3 energy value is 8. Hence allot '1' to pixel 'r'. Recalculate the energy: Enr = 16 − 8 = 8.
(f) Bitplane 2 count value = 1. Enr = 8 and the bitplane 2 energy value is 4. Hence allot '1' to pixel 'r'. Recalculate the energy: Enr = 8 − 4 = 4.
(g) Bitplane 1 count value = 2. The bitplane 1 value is 2 and Enr = 4, but now we have to allot '1's to 2 pixels. Allot '1' to pixel 'r' and forcibly allot '1' to one of the other two, i.e., either 'p' or 'q'. Suppose we allot it to pixel 'p'; the energy is then recalculated as Enp = 0 − 2 = −2, Enq = 0, Enr = 4 − 2 = 2, so Energy = Enp + Enq + Enr = −2 + 0 + 2 = 0.
(h) Bitplane 0 count value = 2. Allot '1's to any two pixels. The energy has become 0, so check the remainder. It is 2 and the bitplane 0 value is 1; hence allot '1's to any two of the pixels, i.e., 'p' and 'q'.
The '1's and '0's allotted to all bitplanes of the three pixels as per the above steps are shown in Fig. 3, and the reconstructed values are p = 10100011 = 163, q = 10100011 = 163, r = 10011100 = 156. The energy of the original pixels is 162 + 163 + 157 = 482. The energy of the reconstructed pixels is 163 + 163 + 156 = 482. Here, we see that the energy of the reconstructed pixels is the same as that of the original pixels.
Bitplane   b7  b6  b5  b4  b3  b2  b1  b0
p          1   0   1   0   0   0   1   1
q          1   0   1   0   0   0   0   0
r          1   0   0   1   1   1   1   1
Fig. 3 Reconstructed pixels
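One way to turn the energy-distribution procedure into code is the greedy sketch below. It is an interpretation under stated assumptions, not the authors' exact procedure: the class `ByteShrinkReconstruct` is a hypothetical name, the remainder is handed out up front one unit per pixel, and ties are broken in the order p, q, r. Because the paper allows a free choice when several pixels qualify, the individual pixel values produced here can differ from those in the worked example.

```java
// Sketch only: tie-breaking and remainder handling are assumptions, not taken from the paper.
final class ByteShrinkReconstruct {

    /** counts[n] is the number of 1s in bitplane n (0..3); returns the pixels {p, q, r}. */
    static int[] reconstruct(int[] counts) {
        int energy = 0;
        for (int n = 0; n < 8; n++) {
            if (counts[n] == 1 || counts[n] == 2) {
                energy += (1 << n) * counts[n];
            }
        }
        // Equal shares; the remainder is handed out one unit per pixel (assumption).
        int[] budget = new int[3];
        for (int i = 0; i < 3; i++) {
            budget[i] = energy / 3 + (i < energy % 3 ? 1 : 0);
        }
        int[] pixel = new int[3];
        for (int n = 7; n >= 0; n--) {
            if (counts[n] == 3) {
                // All three pixels get a 1; these planes are not part of the energy.
                for (int i = 0; i < 3; i++) {
                    pixel[i] |= 1 << n;
                }
                continue;
            }
            // For count 0 this loop does nothing, leaving the bit clear in all pixels.
            for (int k = 0; k < counts[n]; k++) {
                // Choose the pixel with the largest remaining budget that lacks bit n.
                int best = -1;
                for (int i = 0; i < 3; i++) {
                    boolean free = (pixel[i] & (1 << n)) == 0;
                    if (free && (best < 0 || budget[i] > budget[best])) {
                        best = i;
                    }
                }
                pixel[best] |= 1 << n;
                budget[best] -= 1 << n;
            }
        }
        return pixel;
    }

    public static void main(String[] args) {
        int[] counts = {2, 2, 1, 1, 1, 2, 0, 3}; // b0..b7 for the worked example
        int[] px = reconstruct(counts);
        System.out.println(px[0] + " " + px[1] + " " + px[2]
                + " sum = " + (px[0] + px[1] + px[2]));
    }
}
```

Whatever tie-breaking rule is used, every bitplane keeps its original number of 1s, so the sum of the three reconstructed pixels equals the sum of the originals (482 in the worked example). Only the split of that sum among p, q and r depends on the allotment rule.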
5 Quality Assessment of Reconstructed Images
The usual performance metrics for assessing the quality of reconstructed images are mean square error (MSE) and peak signal-to-noise ratio (PSNR). Here we also use another metric, called the correctness ratio, which tells us how many pixels are exactly equal to the original pixel values. The correctness ratio is defined as below.
\[ \text{Co.R.} = \frac{\text{Number of pixels in the reconstructed image with the same value as in the original}}{\text{Total number of pixels in the original image}} \]
When we measure the performance of the image reconstructed by the proposed method, we get more exact pixels than with JPEG at the same PSNR values, as shown in Sect. 7. Because the number of exact-value pixels in the reconstructed image is larger, the correctness ratio also increases. As a result, the reconstructed images are closer to the original, and the contours that may appear after reconstruction are reduced.
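The metrics in this section can be computed with a few lines of Java. The sketch below is illustrative only: the class `QualityMetrics` and its method names are hypothetical, images are assumed to be flattened into `int` arrays of 8-bit values, and PSNR uses the standard definition for a peak value of 255.

```java
// Sketch only: QualityMetrics is an illustrative name; images are flat arrays of 8-bit values.
final class QualityMetrics {

    /** Fraction of pixels whose reconstructed value equals the original value. */
    static double correctnessRatio(int[] original, int[] reconstructed) {
        int correct = 0;
        for (int i = 0; i < original.length; i++) {
            if (original[i] == reconstructed[i]) {
                correct++;
            }
        }
        return (double) correct / original.length;
    }

    /** PSNR in dB for 8-bit images: 10 * log10(255^2 / MSE); infinite when MSE is 0. */
    static double psnr(int[] original, int[] reconstructed) {
        double sumSquaredError = 0;
        for (int i = 0; i < original.length; i++) {
            double d = original[i] - reconstructed[i];
            sumSquaredError += d * d;
        }
        double mse = sumSquaredError / original.length;
        return 10.0 * Math.log10(255.0 * 255.0 / mse);
    }

    public static void main(String[] args) {
        int[] original = {162, 163, 157};
        int[] reconstructed = {163, 163, 156};
        System.out.println("Co.R. = " + correctnessRatio(original, reconstructed));
        System.out.println("PSNR  = " + psnr(original, reconstructed) + " dB");
    }
}
```

For the three-pixel example, one pixel out of three is exact, so the correctness ratio is 1/3 even though the sum of the pixel values is preserved.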
6 Results
We obtained the following results when we applied the proposed algorithm to the standard set of images, as shown in Fig. 4. The reconstructed images appear close to the originals. The original images are on the left, and the reconstructed images are on the right.
Fig. 4 Standard set of original images and reconstructed images
Table 1 Comparison of number of correct pixels and correctness ratio
Input file     PSNR (dB)   Correct pixels (JPEG)   Correct pixels (proposed)   Co.R. (JPEG)   Co.R. (proposed)
Lena           34.15       34,454                  64,348                      0.18           0.24
Baboon         31.4        14,322                  35,436                      0.05           0.13
Barbara        32.8        15,470                  53,144                      0.05           0.20
Airplane       35.1        28,336                  72,150                      0.1            0.28
Aya_matsuura   33.8        24,132                  71,503                      0.09           0.27
Pepper         34.8        21,580                  61,608                      0.08           0.24
7 Comparison of Results
This section compares the proposed algorithm with lossy JPEG. For both algorithms, we compute the number of pixels in the reconstructed image that are exactly the same as in the original image, as shown in Table 1.
8 Conclusion
The 'Byte shrink' approach is an innovative and computationally light method for image compression. The performance metric used, the 'correctness ratio,' gives the fraction of pixels that have the same values as in the original. JPEG suffers from contour effects at low PSNR values, which degrade the quality of the reconstructed image. In the proposed technique, we get exact results wherever the count of '1's in a bitplane is '0' or '3'. For count values '1' and '2', exact results may not be possible, but the sum of the pixel intensity values of the three pixels remains the same as in the original, so pixel values may be the same as the original or may be exchanged among the three pixels only. When measured using our new metric, the proposed method gives a count of correct pixels that is far higher than that of JPEG at the same PSNR values. Hence, we can say that the quality of the reconstructed image is improved with this approach, and the compression ratio achieved is around 2.5 for standard datasets.
References
1. R. Kaur, P. Choudhary, A review of image compression techniques. Int. J. Comput. Appl. 142(1), 0975–8887 (2016, May)
2. A.A. Anju, Performance analysis of image compression technique. Int. J. Recent Res. Aspects 3(2) (2016). ISSN 2349-7688
3. R.C. Gonzalez, R.E. Woods, Digital Image Processing, 4th edn. Pearson Prentice (2015, March)
4. E. Kannan, G. Murugan, Lossless image compression algorithm for transmitting over low bandwidth line. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 2(2) (2012)
5. V.P. Baligar, L.M. Patnaik, G.R. Nagabhushan, High compression and low order linear predictor for lossless coding of grayscale images. Image Vis. Comput. 21, 543–550 (2003). www.elsevier.com
6. M.U.A. Ayoobkhan, E. Chikkannan, K. Ramakrishnan, S.B. Balasubramanian, Prediction-based lossless image compression, in Proceedings of the International Conference on ISMAC in Computational Vision and Bio-Engineering 2018 (ISMAC-CVB), pp. 1749–1761
7. S. Ilic, M. Petrovic, B. Jaksic, P. Spalevic, L. Lazic, M. Milosevic, Experimental analysis of picture quality after compression by different methods. Przegląd Elektrotechniczny, R. 89 NR 11/2013. ISSN 0033-2097
8. R.P. Huilgol, V.P. Baligar, T.R. Patil, Lossless image compression using seed number and JPEG-LS prediction technique, in Conference proceedings-Punecon 2018
9. R.P. Huilgol, V.P. Baligar, T.R. Patil, Lossless image compression using proposed equations and JPEG-LS prediction technique, in 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET) (Kottayam, India, 2018), pp. 1–6. https://doi.org/10.1109/ICCSDET.2018.8821065
10. T.R. Patil, V.P. Baligar, R.P. Huilgol, Low PSNR high fidelity image compression using surrounding pixels, in 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET) (Kottayam, India, 2018), pp. 1–6. https://doi.org/10.1109/ICCSDET.2018.8821082
Image Encryption Using Matrix Attribute Value Techniques D. Saravanan and S. Vaithyasubramanain
Abstract In this technological world, encoding images is an important way to prevent monitoring or hacking by third parties. Data protection becomes essential because communication occurs frequently over open networks. This paper focuses on binary model encryption using a form of stream ciphering. As technology develops, converting images into an encoded format becomes essential in any type of network. It helps to protect data from hackers who try to steal information. Confidentiality is particularly important for the safe dissemination of media over IP networks. While image compression reduces the bandwidth, it is not safe to transmit compressed images alone. It is equally important to encrypt the images before they are transmitted over a network. Investigation shows that the suggested methodology generates better results than the current procedures.
Keywords Image encoding · Secret code · Encryption · Image conversion · Binary code conversion · Stream ciphering
D. Saravanan, Faculty of Operations & IT, ICFAI Business School (IBS), The ICFAI Foundation for Higher Education (IFHE) (Deemed to be university u/s 3 of the UGC Act 1956), Hyderabad, India
S. Vaithyasubramanain (B), PG and Research Department of Mathematics, D. G. Vaishnav College, Chennai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_14
1 Introduction
The binary model is used in many places to analyze and retrieve content, and in various domains to extract the needed information based on a user query. Binary models are made up of black and white images whose pixels are quantized to two values, typically zero and one; each value is rendered as a color, with zero representing one color and one representing the other. The advantage of the binary model is that it occupies less memory and produces efficient results [1]. Today, a hacker can easily steal data on any type of network, so privacy is particularly important for the dissemination of protected media over IP networks. While compression of the image reduces the bandwidth, it is not safe to transmit compressed images alone. It is also important to encrypt the images before they are transmitted [2]. Mathematically, the generated cipher image cannot be broken. The benefit of using a stream cipher is that, as opposed to block ciphers, the execution speed is higher and the hardware complexity is lower. In the proposed model, the image information is converted into unreadable attributes. These attributes are selected from a constructed matrix, which keeps the information highly secure. Any binary model is first converted using this approach [3, 4]. After conversion, the converted attributes are exchanged among the trusted users, so the information is exchanged in a highly secure way. The existing system and literature survey are covered in Sect. 2, and the proposed model is described in Sect. 3. Section 4 describes the experimental setup. The encryption process and the corresponding source code and procedures are described in Sect. 5. Sections 6 and 7 describe the reconstruction process and the generation of matrix attribute values, along with experimental outcomes.
2 Existing System and Literature Survey
Different methods already exist for binary model encryption. However, these methods suffer from a drawback in the number of keys to be stored and distributed. A novel cryptosystem for binary model encryption has been proposed earlier. In this method, a stream cipher cryptosystem is used whose cryptographically secure pseudorandom bit generator is a Boolean hybrid cellular automaton. The downside is that 256 cells form the secret key of the cryptosystem, in addition to the characteristic vector that is distributed to the recipient. Therefore, 512 keys must be distributed to the receiver in total [5, 6]. Dang et al. observe that, with the development of technology, images are used for various applications in every field, and that image processing and image retrieval are emerging fields [7]. Images accessed through the internet are at high risk; especially when images are shared in the public domain, image reconstruction and modification are possible. When transferring an image through a network, it is necessary to protect the transmitted image and to conserve network bandwidth. The authors propose a novel scheme for data transmission using the discrete wavelet transform, and the experimental output verifies that the technique works well for image transmission. Kashwan et al. propose a technique for transmitting a secure image using the discrete wavelet transform combined with the Advanced Encryption Standard. This model transmits the image at high speed even at low power [8]. Zhu et al. propose encrypting the input image using chaos. Nine chaotic sequences are used in total to convert the image, of which one is used for generating key values and six are used to reassemble the input image. The method works well with the logistic map and produces a strong multi-chaos encryption mechanism [9]. Xin et al. propose a nonlinear pseudorandom number generator for image encryption. In this technique, the values generated by the generator are compared with a binary series. The experimental outcome verifies that the generated binary series has high sensitivity [10].
Fig. 1 Image encryption: proposed architecture diagram (blocks: Original Image; Stack Image; Encode the Binary Stack; Convert the Encoded Image to Plaintext Format; Enter the Key; Decoding; Decode the Image Using Key)
3 Proposed System
The proposed model is shown in Fig. 1. In the proposed model, matrix-based attribute values are used for both the conversion and de-conversion operations. In the first step, the color image is converted into a binary model, and from this binary model we construct the matrix attribute values. For every binary value, i.e., for each one and zero, the corresponding attribute is selected from the table, and those values are used for the conversion process. In this way, the image is first converted into an unreadable format using the matrix technique. The generated key value is always less than the binary model values being converted. After this process is complete, a matrix attribute table is created for the binary values; only these values are used for encrypting the given input into an unreadable format. An attribute graph is then created for each value in the matrix. The reverse process is applied to obtain the original image. In both cases, the same keys are used. This technique is more powerful than the existing techniques because, even if hackers steal the data, it is very hard for them to identify the attribute values.
4 Experimental Setup
The proposed work has the following modules: loading an image; converting the color image model into a binary model; encryption; and decryption.
Loading images: In this module, the image to be processed is loaded from the local drive. The user picks a picture from a folder, and the selected picture is shown on the form. The image can be a color image with an extension such as .jpeg, .png, .bmp or .gif. This process is shown in Figs. 5, 6 and 7.
Conversion of color image model into binary model: In step 2, the given input color model is first converted into a binary model. This process is shown in Fig. 8. The binary model is the simplest image form; it has two colors, represented by the two values zero and one. The benefits of the binary model are fast acquisition and low storage. The time required for encryption is also reduced by converting the color image to a binary image.
Encryption module: In this module, the given input binary model is converted into an unreadable format. The output of this process is shown in Fig. 9. In the proposed technique, we first construct the matrix attribute. Using these attribute values, the given input is converted, and the result is circulated to the other users.
Decryption module: In this module, the user gets back the original model using the decryption function. This is the reverse of the above operation. It uses the same distributed matrix of values generated in the previous module. The output of this process is shown in Fig. 10.
5 Encryption Process
The proposed model for converting the input image into unreadable attributes is shown in Fig. 2. In the proposed model, the given information is converted into attribute format, and these attribute values are used for the conversion process. To maintain high security, the values are not reused: once generated, attributes are never used again, which helps to maintain confidentiality. During conversion, the attributes corresponding to the input binary model are identified and tabulated in matrix format, and these attribute values are used for the conversion. As a result, the original values are stored safely; even if a hacker steals the binary values, they are unable to read the content. During the conversion process, the attribute length is taken into account, because the size of the attribute is always less than that of the original values. If the value exceeds this limit, the default attributes are applied for the conversion. After the matrix attribute format has been generated, the same information is circulated among the trusted users. In the model shown in Fig. 2, the binary model is first converted into an unreadable matrix attribute. After conversion, the clues used in the conversion are identified and transferred among the users. The clue value is used in both cases, i.e., for converting the binary model into unreadable form and for reconstructing the original values. In this way, the given image is converted successfully.
Fig. 2 Model of an encryption system (blocks: Proposed Algorithm; Convert Original Image into Unintelligible Format; Enter Key Value; Image Converted as Plain Text Format; Key Stream Process; Unintelligible Binary Image)
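5.1 Source Code to Convert Binary Image
    // Average the red, green and blue channels of the current pixel (w, h).
    int averageColor = (color.getRed() + color.getGreen() + color.getBlue()) / 3;
    // Set the pixel value to either black or white.
    if (averageColor >= 100)
        cat.setRGB(w, h, white.getRGB());
    else
        cat.setRGB(w, h, black.getRGB());
    // After all pixels have been processed, return the binary image.
    return cat;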
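For readers who want to run the thresholding step on its own, the fragment above can be wrapped in a small self-contained class. This is a sketch under assumptions: the class name `Binarizer`, the method `toBinary` and the explicit loops are illustrative additions, while the averaging rule and the threshold of 100 follow the fragment. It uses only `java.awt.Color` and `java.awt.image.BufferedImage` from the Java standard library.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

// Sketch only: Binarizer and toBinary(...) are illustrative names, not from the paper.
final class Binarizer {

    /** Returns a black-and-white copy: white where the channel average is >= threshold. */
    static BufferedImage toBinary(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int h = 0; h < src.getHeight(); h++) {
            for (int w = 0; w < src.getWidth(); w++) {
                Color color = new Color(src.getRGB(w, h));
                int averageColor = (color.getRed() + color.getGreen() + color.getBlue()) / 3;
                out.setRGB(w, h, averageColor >= threshold
                        ? Color.WHITE.getRGB()
                        : Color.BLACK.getRGB());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, new Color(200, 180, 160).getRGB()); // average 180, above threshold
        img.setRGB(1, 0, new Color(30, 20, 10).getRGB());    // average 20, below threshold
        BufferedImage bin = toBinary(img, 100);
        System.out.println(bin.getRGB(0, 0) == Color.WHITE.getRGB());
        System.out.println(bin.getRGB(1, 0) == Color.BLACK.getRGB());
    }
}
```

A fixed threshold is the simplest choice and matches the fragment; an adaptive threshold would be a different technique and is not what the paper describes.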
5.2 Procedure for Encryption
    // Number of letters available for non-black pixels (at most 25).
    if (imgw < 25)
        colposition = imgw;
    else
        colposition = 25;
    for (a1 = 1; a1 < imgw; a1++) {
        for (b1 = 1; b1 < imgh; b1++) {
            Color color = new Color(b.getRGB(a1, b1));
            int averageColor = (color.getRed() + color.getGreen() + color.getBlue()) / 3;
            if (averageColor == 0) {
                p1 = 'a';                                  // black pixel
            } else {
                j = i1 % colposition;                      // i1, j1 and avg are defined outside this excerpt
                if (j == 0)
                    p1 = (char) (j + (97 + colposition));
                else
                    p1 = (char) (j + 97);
            }
            b.setRGB(j1, i1, avg.getRGB());
        }
    }
6 Reconstructing Process
The proposed model for reconstructing the original image is shown in Fig. 3. The user applies the same set of clue values generated during conversion: the clues are generated when the input binary model is converted, and the same clues are used for reconstruction. Every attribute value is stored in the matrix, and these attribute values are used for reconstructing the original image. The user first creates a matrix attribute and then generates the clue values. The generated clue values are always less than the original attribute values, and the information is generated and circulated on this basis. If a value is higher than the original value, default values are considered (Figs. 4, 5, 6, 7, 8, 9 and 10).
Fig. 3 Model of a decryption system (blocks: Unreadable Binary Model; Unreadable Binary Clues; Matrix Attribute Values for the Converted Binary Model; Reconstruct Model; Reconstructor; Reconstructed Image)
Fig. 4 User authentication for encryption
Fig. 5 Wizard for loading image
Fig. 6 Select an image to load
Fig. 7 Loaded image
Fig. 8 Color image is converted into binary image
Fig. 9 Encrypted images
Fig. 10 Reconstruct the original image process
6.1 Code of Reconstruction Process
    try {
        output("String in decryption:" + originalValue);
    } catch (Exception e) {
        output("Exception in read File" + e);
    }

    Image decode() throws Exception {
        output("Length of character:" + span);
        output("Height and width is " + ac + " " + at);
        for (k1 = 1; k1 < it; k1++) {
            for (l1 = 1; l1 < ice; l1++) {
                if (k < span) {
                    if (c[k] == 'a') {
                        // The letter 'a' marks a black pixel.
                        dc.setRGB(k1, l1, Color.black.getRGB());
                        k = j + 1;
                    } else {
                        // Any other letter marks a white pixel.
                        dc.setRGB(k1, l1, Color.white.getRGB());
                    }
                }
            }
        }
        // The remainder of the method is not shown in the source.
    }
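Read together, the encryption and reconstruction fragments suggest a per-pixel letter mapping: a black pixel becomes 'a', and any other pixel becomes a letter derived from an index modulo `colposition`. The sketch below is one interpretation under stated assumptions, not the authors' code: the class `AttributeCodec` is a hypothetical name, the image is held as a `boolean` matrix in row-major order, and the row index plays the role of the undefined variable `i1`. It only illustrates the round trip and says nothing about the strength of the scheme.

```java
// Sketch only: an interpretation of the fragments; names and indexing are assumptions.
final class AttributeCodec {

    /** Encodes a binary image (true = black) as one letter per pixel, row by row. */
    static String encode(boolean[][] black) {
        int width = black[0].length;
        int colPosition = Math.min(width, 25); // as in the fragment: at most 25
        StringBuilder text = new StringBuilder();
        for (int row = 0; row < black.length; row++) {
            for (int col = 0; col < width; col++) {
                if (black[row][col]) {
                    text.append('a'); // black pixel
                } else {
                    int j = row % colPosition;
                    // Never yields 'a', so white pixels stay distinguishable from black ones.
                    text.append((char) (j == 0 ? 97 + colPosition : 97 + j));
                }
            }
        }
        return text.toString();
    }

    /** Decodes the letters back into a binary image: 'a' is black, anything else is white. */
    static boolean[][] decode(String text, int rows, int cols) {
        boolean[][] black = new boolean[rows][cols];
        for (int k = 0; k < text.length(); k++) {
            black[k / cols][k % cols] = text.charAt(k) == 'a';
        }
        return black;
    }

    public static void main(String[] args) {
        boolean[][] image = { { true, false }, { false, true } };
        String text = encode(image);
        System.out.println(text);
        System.out.println(java.util.Arrays.deepEquals(image, decode(text, 2, 2)));
    }
}
```

Because the decoder only tests for 'a', the choice of letter for white pixels does not affect reconstruction; in this reading it only varies the appearance of the encoded text.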
7 Generation of Matrix Attribute Value
In the binary image model, each picture element is visited and its value is replaced by the corresponding calculated attribute value. In any binary model, the image values are either one or zero. Every value is represented in a tree structure: a traversal to the left of the tree always represents the value one, and a traversal to the right always represents zero. For example, if an image value is represented by "bacha", the corresponding ones and zeros are converted into that text format. These values are obtained by traversing the tree left and right.
8 Conclusion and Future Scope
In any type of network, it is important to encrypt information; otherwise, there is a high chance of message leakage. In this paper, we discussed a solution for transmitting converted information in the form of a binary model. For generating the clues and for transmitting the information among the users, we proposed a matrix-based attribute technique. For every binary model, the corresponding attributes are selected from the matrix. The model is very specific, and the key stream length is also taken into consideration. The same key values are used for converting the binary model into an unreadable format and for reconstructing the original image. For converting the input images into unreadable attributes, the attribute values are tabulated for further processing, which reduces the burden on the user. Even if these values are stolen by hackers, it is very hard to predict the corresponding attribute value in the matrix. The proposed technique works well for any type of binary model, i.e., the size of the image is not constrained. In future, the time taken to convert the color image into a binary model and to generate the matrix attributes may be reduced by using a different procedure.
References
1. N.K. Sreelaja, G.A. Vijayalakshmi Pai, Stream cipher for binary image encryption using ant colony optimization based key generation. Appl. Soft Comput. 12(9), 2879–2895 (2012)
2. L. Krikor, S. Baba, T. Arif, Z. Shaaban, Image encryption using DCT and stream cipher. Eur. J. Sci. Res. 48–58 (2009)
3. H.G. Kim, J.K. Han, S. Cho, An efficient implementation of RC4 cipher for encrypting multimedia files on mobile devices, in SAC ’07 Proceedings of the ACM Symposium on Applied Computing (2007), pp. 1171–1175
4. H.E.H. Ahmed, H.M. Kalash, O.S.F. Allah, An efficient chaos-based feedback stream cipher (ECBFSC) for image encryption, SITIS (2006), pp. 110–121
5. M. Dorigo, T. Stutzle, Ant Colony Optimization (New Delhi, Prentice Hall of India Private Limited, 2005), pp. 37–38
6. P. Fei, S.S. Qiu, L. Min, An image encryption algorithm based on mixed chaotic dynamic systems and external keys, in IEEE International Conference in Communication Circuits & Systems (2005), pp. 1135–1139
7. P.P. Dang, P.M. Chau, Image encryption for secure internet multimedia applications. IEEE Trans. Consum. Electron. 46(2), 395–403 (2000)
8. K.R. Kashwan, K.A. Dattathreya, Improved serial 2D-DWT processor for advanced encryption standard, in 2015 IEEE International Conference on Computer Graphics Vision and Information Security (CGVIS) (2015), pp. 209–213
9. A. Zhu, L. Li, Improving for chaotic image encryption algorithm based on logistic map, in 2010 International Conference on Environmental Science and Information Application Technology (ESIAT), vol. 3 (2010), pp. 211–214
10. H. Xin, Z. Shujing, C. Weibin, J. Chongjun, An image encryption base on non-linear pseudorandom number generator, in 2010 International Conference on Computer Application and System Modeling (ICCASM), vol. 9 (2010), pp. 238–241
Weather Prediction Based on Seasonal Parameters Using Machine Learning Manish Mandal, Abdul Qadir Zakir, and Suresh Sankaranarayanan
Abstract Information and Communication Technologies are used by the Meteorological Department for predicting the weather using remote sensing technologies. Literature studies have shown that different machine learning techniques, including ANN, have been applied to forecast weather parameters such as temperature, humidity, rainfall, pressure, sunshine and radiation. However, none of this work attempts to predict the different weather conditions for a day based on a wide range of weather parameters rather than just a few. There has also been no attempt to employ a deep learning algorithm for predicting the different weather parameters for a day. With the emergence of deep learning as the state of the art, we propose to predict weather conditions with a deep learning model and compare it with traditional machine learning models.
Keywords ANN · Naïve Bayes · Decision trees · Deep learning
M. Mandal · A. Q. Zakir · S. Sankaranarayanan (B) Department of Information Technology, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur Campus, Chennai, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_15
1 Introduction
Present weather prediction models rely largely on complex physical models and require supercomputers, since they solve complicated equations that govern how the state of a fluid changes with time. Research has been done on employing Information and Communication Technologies (ICT) [1, 2] for forecasting the weather and monitoring the climate, and for informing the audience through media such as television, radio and, more recently, mobile technologies. These studies have shown the relevance of ICT to weather forecasting. With the advent of machine learning, work has been reported on applying machine learning algorithms to predict the temperature of the next day at any particular hour; the maximum and minimum temperature for 7 days based on two days of input; the future weather day-wise, month-wise or year-wise based on “maximum temperature, minimum temperature, humidity and wind speed”; and, using a Multilayer Perceptron Neural Network, the dry air temperature at intervals of three hours, for a total of 8 predictions per day [3–11]. However, none of these systems predicts the different weather conditions of the upcoming day or week based on varied weather parameters. Such a prediction would give a clear indication of what the weather conditions will be today or in the upcoming week, which helps in planning many activities, rather than being restricted to a few conditions. We therefore propose a deep learning-based model for predicting 37 different weather conditions based on 14 different weather parameters. Our deep learning-based model is also compared with other machine learning algorithms such as “Naïve Bayes, KNN, Random Forest” in terms of error and accuracy. The rest of the paper is organized as follows. The Literature Review discusses work carried out on employing ICT and machine learning in weather prediction. The Proposed System section presents the proposed work, followed by the theoretical background of the machine learning algorithms used in our system. The Results and Discussion section compares the machine learning algorithms for weather prediction on the dataset in terms of accuracy and other metrics. The final section presents the conclusion and future work.
2 Literature Review
This section reviews the literature on methods, including machine learning approaches, for predicting the weather.
The authors of [1] discuss changes in weather conditions, how their signs can be predicted, and their possible impact. They also discuss the link between ICT and weather forecasting.
The authors of [2] assessed the status of weather and climate knowledge among farmers, who mentioned indigenous indicators rather than short-term forecasts or predictions. The paper did not address climate change. In this study, decision making was possible based on traditional knowledge learnt from farmers.
The authors of [3] used a Random Forest Regression model capable of predicting the temperature of the next day at any given hour, where each record consists of temperature, humidity, wind direction, atmospheric pressure, condition, etc.
The authors of [4] presented a method for predicting the maximum and minimum temperature for 7 days based on two days of input. Both linear regression and a variation of a functional linear regression model were used. The model performed better when forecasting later days or longer time scales.
The authors of [5] applied a decision tree algorithm to meteorological data collected from different cities between 2012 and 2015 to predict the future weather day-wise, month-wise or year-wise. Parameters such as “maximum temperature, minimum temperature, humidity and wind speed” were used.
A hybrid machine learning model consisting of ANN, Decision Trees and Gaussian Process Modeling (GPM) was developed by the authors of [6]. This resulted in overlaying statistical limitations among key weather variables.
The authors of [7] proposed a sliding window technique that predicts a day’s weather condition. Weather data for the previous seven days is taken along with the fortnight’s weather conditions of past years. “Maximum temperature, minimum temperature, Humidity, and Rainfall” are the parameters they worked with. The technique failed to provide proper results for the months of seasonal change, where conditions are highly unpredictable. The result could be altered by changing the size of the window.
The authors of [8] employed Support Vector Machines to predict the temperature at a particular location. Structural Risk Minimization (SRM) was employed in SVM regression, which avoided the overfitting problem and performed better than traditional techniques.
The authors of [9] worked with multi-layer perceptron (MLP) neural networks, trained and tested on ten years of meteorological data (1996–2006). Seven weather variables were used: “dry temperature, wet temperature, wind speed, humidity, pressure, sunshine, and radiation”. Inputs were normalized, and the dry air temperature was predicted at intervals of three hours, for a total of 8 predictions per day.
3 Proposed System
Weather plays an important role in our day-to-day life. The limitation of the systems mentioned above is that they predict individual weather parameters such as humidity, temperature, rainfall, pressure, sunshine and radiation, which are the most basic ones. The meteorological department likewise predicts the weather condition for a particular season, such as rainy, cyclone, summer, winter and so forth. None of these systems attempts to predict different weather conditions for the upcoming day or week based on different weather parameters irrespective of season. Such predictions could benefit travelers and businesses in planning activities such as organizing events, going out, hill climbing, swimming, construction and gardening. In addition, no advanced machine learning algorithm such as deep learning has been applied to predicting different weather conditions. We therefore propose to predict 37 different weather conditions for a day or week, as appropriate, based on 14 varied weather parameters by employing deep learning, and to compare it with other machine learning algorithms such as “Naïve Bayes, KNN, Decision Tree, Random Forest” in terms of error and accuracy. The algorithms we implemented are explained below.
3.1 Naïve Bayes Algorithm
Naïve Bayes is a classification algorithm that works on the principle of probability. Bayes’ theorem [12] gives the probability of an event occurring given that another event has already occurred. The mathematical model is given below:
\[ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \tag{1} \]
Here, the probability of event A is computed given that event B is true. Event B is termed the evidence; in classification, the evidence is the attribute value of an unknown instance. P(A) is the prior probability of A, i.e., the probability of the event before the evidence is seen. P(A|B) is the a posteriori probability of A, i.e., the probability of the event after the evidence is seen.
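As a small worked instance of Eq. (1), the following Python sketch computes a posterior from a prior and a likelihood. The events and the probability values are made up for illustration and do not come from the Delhi dataset; the evidence P(B) is obtained from the law of total probability.

```python
def posterior(prior_a, likelihood_b_given_a, evidence_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / evidence_b


# Hypothetical numbers: A = "rain", B = "sky is cloudy".
p_rain = 0.2                  # prior P(A)
p_cloudy_given_rain = 0.9     # likelihood P(B|A)
p_cloudy_given_dry = 0.3      # P(B | not A)

# Evidence P(B) by the law of total probability.
p_cloudy = (p_cloudy_given_rain * p_rain
            + p_cloudy_given_dry * (1 - p_rain))

p_rain_given_cloudy = posterior(p_rain, p_cloudy_given_rain, p_cloudy)
print(round(p_rain_given_cloudy, 4))  # 0.18 / 0.42, about 0.4286
```

Seeing the evidence raises the probability of rain from the prior 0.2 to about 0.43. A Naïve Bayes classifier applies the same rule per class, multiplying the likelihoods of the individual attributes under the assumption that they are conditionally independent.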
3.2 K-Nearest Neighbors
K-Nearest Neighbors (KNN) is one of the most widely used non-parametric algorithms for classification and regression. In nearest-neighbor methods, we find a fixed number of training samples that are closest in distance to the new point and predict the label from them. The number of neighbors is user-defined. The distance can be any metric; Euclidean distance is the most common choice. KNN simply stores the instances of the training data. Classification is performed by a majority vote of the nearest neighbors of each point [12]; for regression, the output is the mean of the values of the neighbors. The major problem in the “KNN” algorithm is choosing the value of “K”. A further downside is the computational cost of finding the nearest neighbors of each sample.
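The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the two-feature training points (temperature, humidity) and their labels are invented, and a brute-force search over all training samples is used.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest(train, query, k):
    """Return the k training items closest to the query (brute force)."""
    return sorted(train, key=lambda item: euclidean(item[0], query))[:k]


def knn_classify(train, query, k=3):
    """Majority vote among the labels of the k nearest neighbors."""
    votes = Counter(label for _, label in nearest(train, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Mean of the target values of the k nearest neighbors."""
    neighbors = nearest(train, query, k)
    return sum(value for _, value in neighbors) / len(neighbors)


# Hypothetical samples: (temperature in deg C, humidity in %) -> condition.
weather = [((30, 40), "sunny"), ((32, 35), "sunny"),
           ((22, 90), "rain"), ((21, 85), "rain"), ((25, 70), "haze")]
label = knn_classify(weather, (23, 88), k=3)

# Hypothetical one-feature regression data.
series = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
estimate = knn_regress(series, (2.0,), k=3)
```

The sort over the whole training set on every query is what makes the neighbor search expensive for large datasets; tree-based or approximate indexes are the usual remedy.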
3.3 Decision Tree
A decision tree [12] is a tree-like diagram in which each internal node represents a test on an attribute. The paths from the root to the leaf nodes represent the classification rules. A decision tree comprises three types of nodes:
• “Decision nodes”, which are represented by “squares”.
• “Chance nodes”, which are represented by “circles”.
• “End nodes”, which are represented by “triangles”.
Decision trees are often used for both classification and regression problems. A decision tree mimics the way humans reason through a sequence of decisions, so it is easy to understand and interpret.
3.4 Random Forest
Random Forest [12] evolved from decision trees, which serve as both classification and regression models and are known as CART (classification and regression trees). In Random Forest, every decision tree classifies the input data; the forest collects these classifications and chooses the prediction with the most votes as the result. Regression is also possible with Random Forest, in which case the model is an ensemble of regression trees.
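The voting step can be sketched as follows. This Python fragment shows only the aggregation of tree outputs; the three hand-written rules stand in for trained decision trees and are purely hypothetical, as are the feature names. A real Random Forest would additionally train each tree on a bootstrap sample with random feature subsets.

```python
from collections import Counter


def forest_classify(trees, x):
    """Return the class predicted by the most trees (majority vote)."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


def forest_regress(trees, x):
    """Return the average of the predictions of the regression trees."""
    predictions = [tree(x) for tree in trees]
    return sum(predictions) / len(predictions)


# Hypothetical stand-ins for trained classification trees.
classifiers = [
    lambda x: "rain" if x["humidity"] > 80 else "clear",
    lambda x: "rain" if x["pressure"] < 1005 else "clear",
    lambda x: "fog" if x["visibility"] < 1 else "clear",
]
sample = {"humidity": 90, "pressure": 1000, "visibility": 5}
condition = forest_classify(classifiers, sample)   # two of three vote "rain"

# Hypothetical stand-ins for trained regression trees.
regressors = [lambda x: 20.0, lambda x: 22.0, lambda x: 24.0]
temperature = forest_regress(regressors, sample)
```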
3.5 Deep Neural Network
Neural networks are computing systems that are loosely modeled on biological neural networks. They consist of an input layer, an output layer, and hidden layers, and are trained using backpropagation. In a neural network, a weight is attached to each connection between neurons. The larger the weight, the more importance is given to the corresponding input of the neuron. Next is the bias, which can also be thought of as a weight. Neurons that are not part of the input layer have a bias, which carries a value just as the weights do.
\[ f\left(b + \sum_{i=1}^{n} x_i w_i\right) \tag{2} \]
After aggregating all the inputs, the neuron makes a decision and returns an output. This process is called activation. We represent it as f(z), where z is the aggregation of all the inputs, i.e., the bias b plus the weighted sum of the inputs x_i with weights w_i. “Multilayer perceptrons (MLPs)” with two hidden layers form a four-layer arrangement [12]. In “gradient descent”, the weights and biases are updated irrespective of the activation function used. Computing the gradient of the cost function is the expensive part of training a neural network; “Error Backpropagation” is the solution for computing this gradient. In deep learning, one main technique is the Deep Neural Network (DNN), which has multiple hidden layers. A DNN is a composition of linear functions and non-linear activation functions [12, 13]. We propose using a “Deep Neural Network”. The layers are hierarchical, with the early layers learning general features such as edges and the later layers learning more data-specific features. Figure 1 shows the Deep Neural Network structure.
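To make the neuron computation of Eq. (2) concrete, here is a minimal forward-pass sketch in plain Python. The layer sizes, weights and inputs are arbitrary illustrative values; ReLU and softmax are used because the paper reports them for the hidden and output layers, but the sketch does not reproduce the 14-input, 5-hidden-layer network or its training.

```python
import math


def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, z)


def identity(z):
    return z


def neuron(inputs, weights, bias, activation):
    """Eq. (2): f(b + sum_i x_i * w_i)."""
    z = bias + sum(x * w for x, w in zip(inputs, weights))
    return activation(z)


def dense(inputs, weight_rows, biases, activation):
    """A layer is a list of neurons that all see the same inputs."""
    return [neuron(inputs, w, b, activation)
            for w, b in zip(weight_rows, biases)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Arbitrary two-input example with one hidden layer and two output classes.
x = [1.0, 2.0]
hidden = dense(x, [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0], relu)
logits = dense(hidden, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], identity)
probs = softmax(logits)
predicted_class = probs.index(max(probs))
```

Stacking more `dense` calls between the input and the softmax gives a deeper network; training would adjust the weights and biases by gradient descent with backpropagation, which this sketch omits.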
Fig. 1 DNN structure
4 Results and Discussion
The main focus of the work is predicting different weather conditions such as haze, rain, thunderstorm and various other conditions, adding up to 38, based on 14 weather parameters for a particular day or week using a Deep Neural Network. For our research, the weather data of Delhi [14] for a duration of 20 years (1996–2017) is used. The dataset includes hourly readings of 14 parameters, namely humidity, pressure, wind speed, wind direction, fog, hail, rain, snow, temperature, thunder, vism, tornado, dew and wind direction (degrees).
4.1 Performance of Machine Learning Algorithms
For the Deep Neural Network, we used the 14 weather parameters as input nodes. The number of hidden layers is 5, and the number of epochs is 500. ReLU activation was used in the hidden layers and SoftMax at the output layer. For prediction, the machine learning algorithms “Naïve Bayes, KNN, Decision Tree, Random Forest and Deep Neural Networks” were employed and their prediction accuracy computed. The prediction accuracy and other related metrics for each algorithm are tabulated in Table 1.

Table 1 Comparative analysis of algorithms
Algorithm | Accuracy (%) | F1 score (weighted) | F1 score (micro) | Recall score (weighted) | Recall score (micro) | Precision score (weighted) | Precision score (micro)
Naïve Bayes | 28.11 | 0.22 | 0.28 | 28.10 | 28.10 | 48.23 | 28.10
KNN | 65.56 | 0.60 | 0.65 | 65.56 | 65.56 | 57.97 | 65.56
Decision tree | 72.99 | 0.70 | 0.73 | 72.99 | 72.99 | 69.15 | 72.99
Random forest | 74.07 | 0.71 | 0.74 | 74.07 | 74.07 | 71.18 | 74.07
Deep Neural Network | 83.54 | 0.81 | 0.83 | 83.54 | 83.54 | 80.64 | 83.54

From the analysis of the different machine learning algorithms, it is clear that recall and micro precision are highest for DNN at 83.54, with a weighted precision of 80.64, compared with Naïve Bayes, KNN, Decision Tree and Random Forest. This indicates that DNN produces the fewest inaccurate results, i.e., “false positives” and “false negatives”. The weighted and micro “F1 Score” for DNN are 0.81 and 0.83, which outperform the other machine learning algorithms. Finally, the prediction accuracy is highest for DNN at 83.54%. This indicates that, among the algorithms compared, the Deep Neural Network gives the best predictions with the fewest inaccurate ones.
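The difference between the “micro” and “weighted” variants reported in Table 1 can be illustrated with a short Python sketch. The six labels below are invented for illustration and are not taken from the Delhi dataset; the function names are likewise hypothetical. Micro averaging pools true positives, false positives and false negatives over all classes, whereas weighted averaging computes each metric per class and weights it by the number of true samples of that class.

```python
from collections import Counter


def class_counts(y_true, y_pred):
    """Per-class true positives, false positives and false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1    # predicted class p, but it was wrong
            fn[t] += 1    # true class t was missed
    return tp, fp, fn


def ratio(a, b):
    return a / b if b else 0.0


def micro_scores(y_true, y_pred):
    """Pool the counts over all classes, then compute the metrics once."""
    tp, fp, fn = class_counts(y_true, y_pred)
    t, p, n = sum(tp.values()), sum(fp.values()), sum(fn.values())
    precision, recall = ratio(t, t + p), ratio(t, t + n)
    return precision, recall, ratio(2 * precision * recall, precision + recall)


def weighted_scores(y_true, y_pred):
    """Compute the metrics per class, weighted by class support."""
    tp, fp, fn = class_counts(y_true, y_pred)
    support, total = Counter(y_true), len(y_true)
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = ratio(tp[label], tp[label] + fp[label])
        r = ratio(tp[label], tp[label] + fn[label])
        w = count / total
        precision += w * p
        recall += w * r
        f1 += w * ratio(2 * p * r, p + r)
    return precision, recall, f1


y_true = ["haze", "haze", "haze", "haze", "rain", "rain"]
y_pred = ["haze", "haze", "rain", "rain", "rain", "rain"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
micro = micro_scores(y_true, y_pred)
weighted = weighted_scores(y_true, y_pred)
```

In single-label multiclass classification, micro precision, micro recall and weighted recall all reduce to the accuracy, which is consistent with the identical columns in Table 1; only the weighted precision and weighted F1 carry additional information.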
5 Conclusion and Future Work
With the advent of machine learning techniques, a good amount of work has been carried out on predicting weather parameters such as temperature, humidity, rainfall, pressure, sunshine and radiation from historical data. No prior work predicts varied weather conditions for a day to support planning of activities, which vary from individual to individual. There has also been no attempt to employ a deep learning algorithm for predicting the different weather parameters for a day. We have therefore proposed deep learning for predicting 38 different weather conditions for a day based on 14 different weather parameters and compared it with other machine learning algorithms such as “Naïve Bayes, KNN, Decision Tree and Random Forest”. The performance of the algorithms was compared in terms of Precision, Recall, F1 Score and accuracy using the weather dataset of Delhi for 20 years. The Deep Neural Network outperformed the other algorithms in weather prediction. In future, our system can be expanded by collecting weather data from various parts of a city, which would increase the number of local features in the training dataset. This data, along with the weather station data, is expected to improve the performance of our models and result in better weather prediction.
References
1. U. Onu Fergus, L. Nwagbo Chioma, ICT: A Cornerstone for effective weather forecasting (2017). Available from https://ijcat.com/archives/volume6/issue3/ijcatr06031001.pdf
2. G. Zuma-Netshiukhwi, K. Stigter, S. Walker, Use of traditional weather/climate knowledge by farmers in the South-Western Free State of South Africa: agrometeorological learning by scientists (2003). Available from https://pdfs.semanticscholar.org/8d58/3f376280697196a34cce5058f79b39246cf3.pdf?_ga=2.267291150.1444242047.1582427632-1592015880.1573278538
3. A.H.M. Jakaria, M. Hossain, M. Rahman, Smart weather forecasting using machine learning: a case study in Tennessee (2018). Available from https://www.researchgate.net/publication/330369173_Smart_Weather_Forecasting_Using_Machine_Learning_A_Case_Study_in_Tennessee
4. M. Holmstrom, L. Dylan, V. Christopher, Machine learning applied to weather forecasting (2016). Available from https://cs229.stanford.edu/proj2016/report/HolmstromLiuVo-MachineLearningAppliedToWeatherForecasting-report.pdf
5. S.B. Siddharth, G.H. Roopa, Weather prediction based on decision tree algorithm using data mining techniques (2016). Available from https://www.ijarcce.com/upload/2016/may-16/IJARCCE%20114.pdf
6. G. Aditya, K. Ashish, H. Eric, A deep hybrid model for weather forecasting (2015). Available from https://aditya-grover.github.io/files/publications/kdd15.pdf
7. K. Piyush, S.B. Sarabjeet, Weather forecasting using sliding window algorithm (2013). Available from https://www.hindawi.com/journals/isrn/2013/156540
8. Y. Radhika, M. Shashi, Atmospheric temperature prediction using support vector machines (2009). Retrieved from https://www.ijcte.org/papers/009.pdf
9. M. Hayati, Z. Mohebi, Application of artificial neural networks for temperature forecasting (2007). Retrieved from https://www.researchgate.net/publication/292759704_Temperature_forecasting_based_on_neural_network_approach
10. E.B. Abrahamsen, O.B. Brastein, B. Lie, Machine learning in python for weather forecast based on freely available weather data (2018). Available from https://www.ep.liu.se/ecp/153/024/ecp18153024.pdf
11. S. Jitcha, S.S. Sridhar, Weather prediction for Indian location using machine learning (2018). Retrieved from https://acadpubl.eu/hub/2018-118-22/articles/22c/59.pdf
12. A. Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow (O’Reilly, USA, 2018)
13. Building Deep Learning Model using Keras (2018). Available from https://towardsdatascience.com/building-a-deep-learningmodel-using-keras1548ca149d37
14. Weather Data. Available from https://www.kaggle.com/jonathanbouchet/new-delhi-20-yearsof-weather-data
Analysis of XML Data Integrity Using Multiple Digest Schemes
Jatin Arora and K. R. Ramkumar
Abstract The Internet is a vast collection of documents that are constantly evolving. Most of the data on the Internet is available through structured or semi-structured documents. XML is one of the semi-structured data exchange formats used over the web. Therefore, understanding and detecting the changes occurring in XML documents becomes crucial. Owing to the tree-structured and self-describing nature of XML documents, many change detection algorithms have been proposed. Most of these algorithms compare the entire XML trees of two versions of a document to identify and locate the changes. In this paper, we propose an algorithm that uses a multiple digest scheme to reduce the search space for locating changes in an XML document. The results obtained show that the approach does not create extra computational overhead for the processing of the XML document.
Keywords Change detection · Multiple digests · Data integrity · Hash algorithms · Webpages · Structured data
J. Arora (B) · K. R. Ramkumar Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, India e-mail: [email protected] K. R. Ramkumar e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. S. Kaiser et al. (eds.), Information and Communication Technology for Competitive Strategies (ICTCS 2020), Lecture Notes in Networks and Systems 190, https://doi.org/10.1007/978-981-16-0882-7_16
1 Introduction
Extensible Markup Language (XML) has become the most common data exchange format on the web. The wide use of XML documents makes it necessary to protect them from unauthorized access and modification. A set of security mechanisms such as encryption, decryption, digital signatures, and validation is applied to XML documents [1], but these mechanisms do not support the identification of changes in XML documents. To identify changes, various change detection algorithms have been developed, depending on the application in which XML is used. These algorithms identify what kinds of changes have occurred in the documents: presentation change, behavioral change, content change, or context change [2, 3]. In this paper, we propose a change detection algorithm that identifies content changes. For this, we need to address three questions. The first is how to detect the changes that have occurred in an XML document. The second is how to localize those changes. The third is what computational overhead the approach incurs, as measured experimentally [4].
To address the first question, we propose an algorithm called multiple digests, which performs a logical division of the XML document and creates a digest for each subdivision. The next step is to construct an XOR tree by performing XOR operations on all the digests created earlier. The XOR operation allows digest values to be combined hierarchically into a single digest value, which represents the signature of the entire XML document [5]. To localize changes, the algorithm compares the hierarchically organized XOR values, traversing only the XOR values that do not match. It continues searching the subtree until the particular division of the XML document that changed is identified. After localizing the changes in the new version of an XML document, we need to find the computational cost incurred in locating them. For this, we tested the algorithm using XML files of different sizes, multiple hash algorithms, and a varied number of logical divisions [6].
This automatic change detection technique for semi-structured documents identifies changes and helps in distinguishing two different versions of a document. The integrity of the XML document can also be checked easily, based on which the user may either accept or reject the document [7]. Change detection also helps in patching the differences to make both versions of the XML document identical.
The rest of the paper is structured as follows. Section 2 discusses the related work. Section 3 presents the multiple digest approach for detecting changes in XML documents. Section 4 describes the experimental evaluations. Section 5 discusses the obtained results. Finally, Sect. 6 presents our conclusions and future work.
2 Related Work
Earlier change detection focused only on finding the differences between plain text files, and the Diff utility is one of the most popular examples. It is based on the Longest Common Subsequence (LCS) algorithm, which is suitable for plain text files. However, it cannot be used for structured or semi-structured documents, in which the position of data is determined by the hierarchical structure of the document rather than by newlines or tab spacing [8]. XML data is stored in a tree structure based on the Document Object Model (DOM), and data can be accessed only through the root node. An XML document can be mapped to an ordered tree or an unordered tree. Our algorithm assumes an ordered XML tree. The differences calculated by change detection algorithms are known as an edit script or delta [9]. The script consists of operations such as insert, update, delete, or move, which account for the changes made to a document. The basic idea is to traverse all the nodes of the trees of the two versions of an XML document and then produce the edit script [10]. Several variations of diff algorithms, with different detection techniques, are used to detect changes, such as X-Diff, XyDiff, MH-Diff, DiffXML, and many more.
X-Diff detects changes between two versions of unordered XML documents by considering the path signature of the nodes. It generates a minimum edit script consisting of three basic operations, i.e., insert, update, and delete [11]. This minimum edit script can be applied to the tree of a given XML document to produce the other tree. X-Diff produces an optimal delta script, but its time complexity is O(n^2).
The XyDiff algorithm detects the changes between two XML documents by detecting the subtrees that have no potential difference [12]. For this, it calculates the signature of the leaf nodes and assigns a unique identifier to each node. It then traverses up the tree to match the entire tree based on node signatures. It calculates the differences by matching the trees and identifies the appropriate insert, delete, update, and additional move operations that can be applied to make the documents similar. It achieves a complexity of O(n log n) but does not always produce an optimal result [13].
MH-Diff detects meaningful changes in hierarchically structured data represented as unordered trees. This algorithm implements subtree copy and subtree move operations in addition to insert, update, and delete operations. It transforms the task into an edge cover problem, compares the two trees, and forms a minimum-cost edit script that transforms the original tree into the modified tree with a time complexity of O(n^2 log n) [14].
The DiffXML algorithm works by constructing a DOM tree of the XML files and storing the values as well as the paths of the nodes in a relational data model. The differences between the two XML documents are obtained from the node values stored in this relational model. It generates an edit script that contains not only the fundamental insert, update, and delete operations but also the move operation. However, it suffers from space complexity issues and does not produce an optimal edit script in some cases [15]. It uses a tree matching algorithm to compare both ordered and unordered XML trees [16]. Delta XML provides an edit script consisting of insert, delete, and update operations. If the number of differences between two versions of an XML document increases, the performance of Delta XML decreases [17].
Related work shows that all these change detection techniques use the signature method to find the differences [18]. Our proposed algorithm differs from the existing algorithms in two ways. First, it creates a combined signature over a set of nodes, which helps reduce the search space. Second, it localizes the portion of the XML file where data modifications occurred. The localization of changed data is achieved by comparing the XOR values organized in a hierarchical manner.
3 Multiple Digest Approach
In this section, we discuss in detail the change detection algorithm that helps in identifying the exact location of data corruption. The algorithm starts by dividing the entire XML document into subdocuments. The number of subdocuments to be generated is determined by the size of the file and equals 2^N, where N is a positive integer. The XML document is accessed via the DOM tree model [19]. The next step is to create a digest of each subdocument using one of the available hash functions such as MD5, SHA-0, SHA-1, SHA-2, and SHA-3 [20–22]. After the creation of multiple digests, these digest values are combined using XOR operations to create the final XOR, as shown in Fig. 1. The figure represents the hierarchical structure formed after combining all the digest values using XOR operations [5]. XOR operations are used to keep the signature at a limited length. The block diagram of signature creation is shown in Fig. 2. Each subdocument created by the sender is passed to the hash creation module, where the digest of each subdocument is created. The XOR operation is then applied to the digest values and the final digest value is computed. This resultant value is encrypted with the sender’s private key, and the signature value is inserted into the document. The process of verification of the signed document is shown in Fig. 3. The receiver of the document verifies the XML signature before using its content. Verification is carried out by calculating the digest of the document in the same manner as calculated by the sender.
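The digest and XOR hierarchy described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the subdocuments are given directly as byte strings instead of being derived from a DOM tree, SHA-256 is chosen arbitrarily from the SHA-2 family, the number of blocks is assumed to be a power of two, and the private-key encryption of the final value is omitted. All function names are hypothetical.

```python
import hashlib


def digest(block: bytes) -> bytes:
    """Digest of one subdocument (SHA-256 chosen for illustration)."""
    return hashlib.sha256(block).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length digests."""
    return bytes(x ^ y for x, y in zip(a, b))


def build_xor_tree(blocks):
    """Return the tree as levels: levels[0] holds the leaf digests and
    levels[-1] holds the single final XOR value."""
    level = [digest(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [xor_bytes(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def changed_blocks(old_levels, new_levels):
    """Descend from the root, following only nodes whose values differ,
    and return the indices of the subdocuments that changed."""
    top = len(old_levels) - 1
    if old_levels[top][0] == new_levels[top][0]:
        return []
    frontier = [0]
    for depth in range(top - 1, -1, -1):
        frontier = [child
                    for idx in frontier
                    for child in (2 * idx, 2 * idx + 1)
                    if old_levels[depth][child] != new_levels[depth][child]]
    return frontier


# Hypothetical subdocuments (2^2 = 4 divisions).
original = [b"<country>A</country>", b"<country>B</country>",
            b"<country>C</country>", b"<country>D</country>"]
modified = list(original)
modified[2] = b"<country>X</country>"

old_tree = build_xor_tree(original)
new_tree = build_xor_tree(modified)
changed = changed_blocks(old_tree, new_tree)
```

Only the mismatching branch is followed, so a single changed subdocument is found after comparing a logarithmic number of nodes. One property of this sketch worth noting: because XOR is commutative, reordering subdocuments leaves the final XOR unchanged, so a plain XOR combination would not reveal a pure reordering.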
Fig. 1 Hierarchical structure of digests and XOR
Fig. 2 Creation of XML signature using multiple digests
Fig. 3 Verification of signature
4 Experimentation
For experimental purposes, an XML document containing country details is considered, in which each country’s GDP data for different years is present, as shown below.
1 2008 141,100 < /GDP>