Advances in Intelligent Systems and Computing 1254
Raghvendra Kumar · Nguyen Ho Quang · Vijender Kumar Solanki · Manuel Cardona · Prasant Kumar Pattnaik, Editors
Research in Intelligent and Computing in Engineering Select Proceedings of RICE 2020
Advances in Intelligent Systems and Computing Volume 1254
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. Indexed by SCOPUS, DBLP, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/11156
Editors
Raghvendra Kumar, Department of Computer Science and Engineering, GIET University, Gunupur, Odisha, India
Nguyen Ho Quang, Institute of Engineering and Technology, Thu Dau Mot University, Binh Duong, Vietnam
Vijender Kumar Solanki, CMR Institute of Technology, Hyderabad, Telangana, India
Manuel Cardona, Universidad Don Bosco, Antiguo Cuscatlán, El Salvador
Prasant Kumar Pattnaik, KIIT Deemed to be University, Bhubaneswar, Odisha, India
ISSN 2194-5357    ISSN 2194-5365 (electronic)
Advances in Intelligent Systems and Computing
ISBN 978-981-15-7526-6    ISBN 978-981-15-7527-3 (eBook)
https://doi.org/10.1007/978-981-15-7527-3

© Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Contents
Fuzzy Logic Simulation for Classifying Abrasive Wheels in Pendulum Grinding . . . 1
Van Canh Nguyen, Van Chi Hoang, Vanthan Nguyen, and Ngoc Hue Nguyen

Enhancing Security and Trust of IoT Devices–Internet of Secured Things (IoST) . . . 15
B. Rohini, Asha Rashmi Nayak, and N. Mohankumar

Detection of DoS, DDoS Attacks in Software-Defined Networking . . . 25
Dinh-Tu Truong, Khanh-Dang Tran, Quoc-Binh Nguyen, and Dac-Tot Tran
Functional Encryption of Floating-Point ALU Using Weighted Logic Locking for Enhanced Security . . . 37
Bihag Rajeev and N. Mohankumar
Nonnegative Feature Learning by Regularized Nonnegative Matrix Factorization . . . 47
Viet-Hang Duong, Manh-Quan Bui, and Jia-Ching Wang
MongoDB Versus MySQL: A Comparative Study of Two Python Login Systems Based on Data Fetching Time . . . 57
Shrikant Patel, Sanjay Kumar, Sandhya Katiyar, Raju Shanmugam, and Rahul Chaudhary
Industrial LoRaWAN Network for Danang City: Solution for Long-Range and Low-Power IoT Applications . . . 65
Thanh Dinh Ngo, Fabien Ferrero, Vinh Quang Doan, and Tuan Van Pham
MPPT Design Using the Hybrid Method for the PV System Under Partial Shading Conditions . . . 77
Sy Ngo, Chian-Song Chiu, and Phuong-Tra Nguyen
Comparison of Several Data Representations for a Single-Channel Optic System with 10 Gbps Bit Rate . . . 89
Ihsan A. Alshimaysawe, Hayder Fadhil Abdulsada, Saif H. Abdulwahed, Mohannad A. M. Al-Ja'afari, and Ameer H. Ali

Classification of Gait Patterns Using Overlapping Time Displacement of Batchwise Video Subclips . . . 99
Khang Nguyen, Jiawei Chee, Chong Wee Soh, Ngoc-Son Hoang, Jeong-Hoon Lim, Binh P. Nguyen, Chee-Kong Chui, and Matthew Chin Heng Chua
A Comparative Study to Classify Cumulonimbus Cloud Using Pre-trained CNN . . . 113
Sitikantha Chattopadhyay, Souvik Pal, Pinaki Pratim Acharjya, and Sonali Bhattacharyya

Improved Electric Wheelchair Controlled by Head Motion . . . 121
He-Thong Bui, Le-Van Nguyen, Thanh-Nghi Ngo, Tuan-Sinh V. Nguyen, Anh-Ngoc T. Ho, and Qui-Tra Phan

Innovation, Entrepreneurship and Sustainability of Business Through Techno-Social Ecosystem–Indian Scene . . . 131
Arindam Chakrabarty and Tenzing Norbu

Similar Image Retrieval Based on Grey-Level Co-Occurrence Matrix and Hu Invariants Moments Using Parallel Computing . . . 141
Beshaier Ali Abdulla, Yossra Hussian Ali, and Nuha Jameel Ibrahim

Abstraction of 'C' Program from Algorithm . . . 157
R. N. Kulkarni and S. Shenaz Begum

Comparative Analysis of Machine Learning Algorithms for Hybrid Sources of Textual Data: In Development of Domain Adaptable Sentiment Analysis Model . . . 163
Vaishali Arya and Rashmi Agrawal

Improving Extreme Learning Machine Accuracy Utilizing Genetic Algorithm for Intrusion Detection Purposes . . . 171
Ahmed J. Obaid, Kareem Abbas Alghurabi, Salah A. K. Albermany, and Shubham Sharma

Study on Model Reduction Algorithm Based on Schur Analysis . . . 179
Ngoc Kien Vu, Hong Quang Nguyen, Kien Trung Ngo, and Phuong Nam Dao

Evaluation of LMT and DNN Algorithms in Software Defect Prediction for Open-Source Software . . . 189
Sundos Abdulameer Alazawi and Mohammed Najim Al Salam
Importance of the Considering Bottleneck Intermediate Node During the Intrusion Detection in MANET . . . 205
Towheed Sultana, Arshad Ahmad Khan Mohammad, and Nikhil Gupta

Street Lamp Monitoring Using IoT Based on Node-Red . . . 215
Juan Carlos Durón, Sebastián Gutiérrez, Manuel Cardona, and Vijender Kumar Solanki

Fetal Heart Rate Monitoring Using Photoplethysmography Techniques . . . 223
Ibrahim Patel and Thopucherla Mishwasree

Sentiment Analysis Combination in Terrorist Detection on Twitter: A Brief Survey of Approaches and Techniques . . . 231
Esraa Najjar and Salam Al-augby

Application of MicroStation and Tk-Tool in Assessing the Current Status and the Change of Agricultural Land in Ben Cat Town from 2014 to 2019 . . . 241
Dang Trung Thanh, Nguyen Huynh Anh Tuyet, Vo Quang Minh, and Pham Thanh Vu

Implementing a Proposal System for Video Streaming Over MANET Using NS3, LXC and VLC . . . 255
Naseer Abdulhussein Jabbar and Aymen Dawood Salman

A User Specific APDS for Smart City Applications . . . 267
Goutam Kumar Sahoo, Prasanta Kumar Pradhan, Santos Kumar Das, and Poonam Singh

A Comparative Study of Machine Learning Algorithms in Predicting the Behavior of Truss Structures . . . 279
Tran-Hieu Nguyen and Anh-Tuan Vu

Explosive Gas Detection and Alert System Using Internet of Things . . . 291
Manuel Cardona, Daniel Marroquín, Boris Salazar, Ivania Gómez, and Sebastián Gutiérrez

Techno-entrepreneurship in India Through a Strategic Lens . . . 299
Arindam Chakrabarty, Mudang Tagiya, and Shyamalee Sinha

Online Examination System (Electronic Learning) . . . 309
Sarah Ali Abdullah, Tariq Adnan Fadil, and Noor Ahmed

Construction of a Backstepping Controller for Controlling the Depth Motion of an Automatic Underwater Vehicle . . . 325
Nguyen Quang Vinh and Nguyen Duc Thanh

Image Segmentation Algorithm Based on Statistical Properties . . . 333
Mohammed J. Alwazzan, Alaa O. Alkhfagi, and Ashraf M. Alattar
Facial Expressions Recognition System Using FPGA-Based Convolutional Neural Network . . . 341
Tai Nguyen Duong Phuc, Nam Nguyen Nhut, Nguyen Trinh, and Linh Tran

Sentiment Analysis Model and Its Role in Determining Social Media's Influence on Decision Making . . . 353
Praneet Amul Akash Cherukuri, Nguyen Thi Dieu Linh, Shreya Indukuri, and Sreecharan Nuthi

Job Recommendation: An Approach to Match Job-Seeker's Interest with Enterprise's Requirement . . . 361
Ngoc-Trung-Kien Ho, Hung Ho-Dac, and Tuan-Anh Le

Critical Research on the Novel Progressive, JOKER an Opportunistic Routing Protocol Technology for Enhancing the Network Performance for Multimedia Communications . . . 369
Ahmed J. Obaid

Analysis of Ransomware, Methodologies Used by Attackers and Mitigation Techniques . . . 379
Shivaram Jimada, Thi Dieu Linh Nguyen, Jahanavi Sanda, and Sai Kiran Vududala

An Approach to Human Resource Demand Forecasting Based on Machine Learning Techniques . . . 389
Kim-Son Nguyen, Ho-Dac Hung, Van-Tai Tran, and Tuan-Anh Le

Malware Clustering Using Integer Linear Programming . . . 397
Nibras A. Alkhaykanee and Salah A. K. Albermany

Forward and Inverse Kinematics Analysis of a Spatial Three-Segment Continuum Robot . . . 407
Chu Anh My, Duong Xuan Bien, and Le Chi Hieu

Assessment of Annual Average Temperature by Clustering Algorithm . . . 419
Arindam Roy, Dharmpal Singh, Sudipta Sahana, Pranati Rakshit, and Souvik Pal

Hand Recognition System Based on Invariant Moments Features . . . 429
Sundos Abdulameer Alazawi, Rawsam Abduladheem Hasan, and Abbas Abdulazeez Abdulhameed

A Fuzzy-Based Multi-agent Framework for E-commerce Recommender Systems . . . 441
S. Gopal Krishna Patro, Brojo Kishore Mishra, Sanjaya Kumar Panda, and Raghvendra Kumar
Improved Doppler Radar Target Detection Method Using Quad Code Sequence Generation . . . 453
Raj Kumar D. Bhure and K. Manjunathachari

Genetic Algorithm-Aided Received Signal Strength Maximization in NLOS Visible Light Communication . . . 467
Tahesin Samira Delwar, Anindya Jana, Prasanta Kumar Pradhan, Abrar Siddique, and Jee-Youl Ryu

Implementing Web Testing System Depending on Performance Testing Using Load Testing Method . . . 475
Sawsan Sahib Abed and al-Hakam Hikmat Omar

Content-Based Caching Strategy for Ubiquitous Healthcare Application in WBANs . . . 491
Dong Van Doan, Sang Quang Nguyen, and Tuan Phu Van

Design and Simulation of High-Frequency Actuator and Sensor Based on NEMS Resonator . . . 501
Tiba Saad Mohammed, Qais Al-Gayem, and Saad S. Hreshee

Fast and Accurate Estimation of Building Cost Using Machine Learning . . . 515
T. Q. D. Pham, Nguyen Ho Quang, N. D. Vo, V. S. Bui, and V. X. Tran

A 24 GHz Wide-Tuning-Range CMOS Digitally Controlled Oscillator for Automotive Radar . . . 527
Abrar Siddique, Tahesin Samira Delwar, Anindya Jana, and Jee Youl Ryu

Output Feedback Adaptive Reinforcement Learning Tracking Control for Wheel Inverted Pendulum Systems . . . 535
Anh Duc Hoang, Hong Quang Nguyen, Phuong Nam Dao, and Tien Hoang Nguyen

Facial Expression Recognition with CNN-LSTM . . . 549
Bui Thanh Hung and Le Minh Tien

A Fast Reading of European License Plates Recognition from Monochrome Image Using MONNA . . . 561
Maad M. Mijwil

Implementing Authentication System Depending on Face Recognition Using Anthropometric Model . . . 571
Ahmed Shihab Ahmed

Process Parameter Optimization for Incremental Forming of Aluminum Alloy 5052-H32 Sheets Using Back-Propagation Neural Network . . . 585
Quoc Tuan Pham, Nguyen Ho Quang, Van-Xuan Tran, Xiao Xiao, Jin Jae Kim, and Young Suk Kim
Classification of Parkinson's Disease-Associated Gait Patterns . . . 595
Khang Nguyen, Jeff Gan Ming Rui, Binh P. Nguyen, Matthew Chin Heng Chua, and Youheng Ou Yang

Assessment of Structural Integrity Using Machine Learning . . . 607
T.-V. Hoang, V.-X. Tran, V. D. Nguyen, Nguyen Ho Quang, and V. S. Bui

A Load Balancing VMs Migration Approach for Multi-tier Application in Cloud Computing Based on Fuzzy Set and Q-Learning Algorithm . . . 617
Khiet T. Bui, Linh V. Nguyen, Tai V. Tran, Tran-Vu Pham, and Hung C. Tran

Tracking Control of Parallel Robot Manipulators Using RBF Neural Network . . . 629
Vu Le Huy and Nguyen Dinh Dzung

An IoT-Based Air Quality Monitoring with Deep Learning Model System . . . 643
Harshit Srivastava, Kailash Bansal, Santos Kumar Das, and Santanu Sarkar

Using Feature Selection Based on Multi-view for Rice Seed Images Classification . . . 651
Dzi Lam Tran Tuan and Vinh Truong Hoang

Smarter Pills: Low-Cost Embedded Device to Elders . . . 663
José Elías Romo, Sebastián Gutiérrez, Pedro Manuel Rodrigo, Manuel Cardona, and Vijender Kumar Solanki

Security and Privacy in IOT . . . 673
Devender Bhushan and Rashmi Agrawal

Cross-Domain Using Composing of Selected DCT Coefficients Strategy with Quantization Tables for Reversible Data Hiding in JPEG Image . . . 681
Pham Quang Huy, Ta Minh Thanh, Le Danh Tai, and Pham Van Toan

Empirical Analysis of Routing Protocols in Opportunistic Network . . . 695
Renu Dalal and Manju Khari

Fuzzy Lexicon-Based Approach for Sentiment Analysis of Blog and Microblog Text . . . 705
Srishti Sharma and Vaishali Kalra

Face Recognition Using Hybrid HOG-CNN Approach . . . 715
Bui Thanh Hung

Empirical Analysis of Energy-Efficient Hybrid Protocol Under Black Hole Attack in MANETs . . . 725
Priyanka Singh and Manju Khari
An Empirical Study on Usability and Security of E-Commerce Websites . . . 735
Biresh Kumar and Sharmistha Roy

Extracting Relevant Social Geo-Tagged Photos for Points of Interest . . . 747
Thanh-Hieu Bui and Tat-Bao-Thien Nguyen

Smart Wheelchair Remotely Controlled by Hand Gestures . . . 757
Hemlata Sharma and Nidhi Mathur

Robust Compensation Fault-Tolerant Control Based on Sensor Fault Estimation Using Augmented System for DC Motor . . . 769
Tan Van Nguyen, Nguyen Ho Quang, and Cheolkeun Ha

Monitoring Food Quality in Supply Chain Logistics . . . 781
Sushant Kumar and Saurabh Mukherjee

Applications of Virtual Reality in a Cloud-Based Learning Environment: A Review . . . 787
Nikhil S. Kaundanya and Manju Khari

Different Platforms Using Internet of Things for Monitoring and Control Temperature . . . 795
Sebastián Gutiérrez, Rafael Rocha, Emmanuel Estrada, David Rendón, Gemma Eslava, Luis Aguilera, Pedro Manuel Rodrigo, and Vijender Kumar Solanki

Internet of Things System Using the Raspberry Pi to Monitor a Small-Scale Server Room . . . 805
Thien M. Nguyen, Phuc G. Tran, Phuoc V. Dang, Huy H. T. Le, and Nhu Q. Tran

A Survey on Hybrid Intrusion Detection Techniques . . . 815
Nitesh Singh Bhati and Manju Khari

Topic-Guided RNN Model for Vietnamese Text Generation . . . 827
Dinh-Hong Vu and Anh-Cuong Le

Automated Currency Recognition Using Neural Networks . . . 835
Abhishek Jain, Paras Jain, and Vikas Tripathi

An Application of Vision Systems for the Inspection of Two-Dimensional Entities in a Plane . . . 843
Van Thao Le, Quang Huy Hoang, Duc Manh Dinh, and Yann Quinsat

An Efficient Image-Based Skin Cancer Classification Framework Using Neural Network . . . 851
Tejasvi Ghanshala, Vikas Tripathi, and Bhaskar Pant
Inverse Kinematics Analysis of Welding Robot IRB 1520ID Using Algorithm for Adjusting the Increments of Generalized Vector . . . 859
Chu Anh My, Duong Xuan Bien, and Le Chi Hieu

Recent Trends in Big Data Ingestion Tools: A Study . . . 873
Garima Sharma, Vikas Tripathi, and Awadhesh Srivastava

Optimization of the Feed Rate of Six-DOFs Robot in a Parametric Domain Based on Kinematics Modeling . . . 883
Chu Anh My, Duong Xuan Bien, and Le Chi Hieu

Cold Start Problem Resolution Using Bayes Theorem . . . 893
Deepika Gupta, Ankita Nainwal, and Bhaskar Pant

Probabilistic Model Using Bayes Theorem Research Paper Recommender System . . . 901
Ankita Nainwal, Deepika Gupta, and Bhaskar Pant

The Role of Big Data Analytics and AI in Smart Manufacturing: An Overview . . . 911
Chu Anh My

Literature Review: Real Time Water Quality Monitoring and Management . . . 923
Deepika Gupta, Ankita Nainwal, and Bhaskar Pant

Development of a Stimulated Model of Smart Manufacturing Using the IoT and Industrial Robot Integrated Production Line . . . 931
Minh D. Tran, Toan H. Tran, Diem T. H. Vu, Thang C. Nguyen, Vi H. Nguyen, and Thanh T. Tran

Reinforcement Learning Based Adaptive Optimal Strategy in Robotic Control Systems . . . 941
Phuong Nam Dao and Hong Quang Nguyen

Energy Efficient Cluster Head Selection for Wireless Sensor Network Using Fuzzy Logic . . . 953
Kusum Lata Jain, Shivani Gupta, and Smarnika Mohapatra

Sliding Mode Control Based Output Feedback Adaptive Dynamic Programming Algorithm for Uncertain Wheel Inverted Pendulum Systems . . . 963
Phuong-Nam Dao and Hong-Quang Nguyen

Region-Based Space Filling Curve for Medical Image Scanning . . . 973
Ashan Eranga Kudabalage, Le Van Dang, Leran Du, and Yumi Ueda

Flexible Convolution in Scattering Transform and Neural Network . . . 983
Dinh-Thuan Dang
About the Editors
Dr. Raghvendra Kumar is working as Associate Professor in the Computer Science and Engineering Department at GIET University, India. He received his B.Tech, M.Tech and Ph.D. in Computer Science and Engineering in India, and a postdoctoral fellowship from the Institute of Information Technology, Virtual Reality and Multimedia, Vietnam. He serves as series editor of Internet of Everything (IoE): Security and Privacy Paradigm and Green Engineering and Technology: Concepts and Applications, published by CRC Press, Taylor & Francis Group, USA, and of Bio-Medical Engineering: Techniques and Applications, published by Apple Academic Press, CRC Press, Taylor & Francis Group, USA. He also serves as acquisition editor for Computer Science for Apple Academic Press, CRC Press, Taylor & Francis Group, USA. He has published a number of research papers in international journals (SCI/SCIE/ESCI/Scopus) and conferences, including IEEE and Springer, and has served as organizing chair (RICE 2019, 2020), volume editor (RICE 2018), keynote speaker, session chair, co-chair, publicity chair, publication chair, advisory board member and technical program committee member of many international and national conferences, as well as guest editor of many special issues of reputed journals (indexed by Scopus, ESCI, SCI). He has also published 13 chapters in edited books published by IGI Global, Springer and Elsevier. His research areas are computer networks, data mining, cloud computing and secure multiparty computation, theory of computer science and design of algorithms. He has authored and edited 23 computer science books in the fields of the Internet of Things, data mining, biomedical engineering, big data and robotics, with IGI Global, USA, IOS Press, Netherlands, Springer, Elsevier and CRC Press, USA.

Dr. Nguyen Ho Quang is currently Director of the Institute of Engineering and Technology at Thu Dau Mot University (TDMU), Vietnam. He holds a bachelor's degree in Mechanical Engineering from the University of Science and Technology, Da Nang University, Vietnam (2005). He obtained his M.Sc. degree in Mechanical Engineering from the Thai Nguyen University of Technology, Vietnam, in 2010. In 2017, he received his Ph.D. degree in Biomechanics and Computational Mechanics from the University of Technology of Compiègne (UTC), Sorbonne University,
France. He has also completed a postdoctoral fellowship at the Biomechanics and Bioengineering Laboratory, UTC, France. His main research interests relate to computational biomechanics and clinical applications, including biomechanics of the musculoskeletal system, patient-specific biomechanical modelling derived from medical images, and rehabilitation engineering.

Dr. Vijender Kumar Solanki, Ph.D., is an Associate Professor in Computer Science & Engineering at CMR Institute of Technology (Autonomous), Hyderabad, TS, India. He has more than 14 years of academic experience in network security, IoT, big data, smart cities and IT. Prior to his current role, he was associated with the Apeejay Institute of Technology, Greater Noida, UP; KSRCE (Autonomous) Institution, Tamil Nadu, India; and the Institute of Technology & Science, Ghaziabad, UP, India. He is a member of ACM and a Senior Member of IEEE. He attended an orientation program at the UGC Academic Staff College, University of Kerala, Thiruvananthapuram, Kerala, and a refresher course at the Indian Institute of Information Technology, Allahabad, UP, India. He has authored or co-authored more than 60 research articles published in various journals, books and conference proceedings. He has edited or co-edited 14 books and conference proceedings in the area of soft computing. He received his Ph.D. in Computer Science and Engineering from Anna University, Chennai, India, in 2017, his ME and MCA from Maharishi Dayanand University, Rohtak, Haryana, India, in 2007 and 2004, respectively, and a bachelor's degree in Science from JLN Government College, Faridabad, Haryana, India, in 2001. He is the book series editor of Internet of Everything (IoE): Security and Privacy Paradigm; Artificial Intelligence (AI): Elementary to Advanced Practices; IT, Management & Operations Research Practices; and Computational Intelligence and Management Science Paradigm (Focus Series), all with CRC Press, Taylor & Francis Group, USA. He is Editor-in-Chief of the International Journal of Machine Learning and Networked Collaborative Engineering (IJMLNCE), ISSN 2581-3242, and of the International Journal of Hyperconnectivity and the Internet of Things (IJHIoT), ISSN 2473-4365, IGI Global, USA; co-editor of the Ingenieria Solidaria Journal, ISSN 2357-6014; and Associate Editor of the International Journal of Information Retrieval Research (IJIRR), IGI Global, USA, ISSN 2155-6377, E-ISSN 2155-6385. He has been a guest editor with IGI Global, USA, Inderscience and many more publishers. He can be contacted at [email protected]

Dr. Manuel Cardona, Ph.D., received the B.S. degree in Electrical Engineering in El Salvador in 2004 and the Master's degree in Automation and Robotics from Universidad Politécnica de Madrid, Madrid, Spain, in 2008. From 2007 to 2008 and in 2011, he was a Research Assistant with the Robotics and Intelligent Machines Research Group at Universidad Politécnica de Madrid, Spain. He has a postgraduate degree in Scientific Research and a postgraduate degree in
Innovation Management. In 2020, he received the Ph.D. (Cum Laude) in Automation and Robotics from Universidad Politécnica de Madrid, Spain. Currently, he is the research director and the director of the Robotics and Intelligent Machines research group and the Computer Vision research group at Universidad Don Bosco (UDB), El Salvador. His research interests include rehabilitation robotics, biomechanics, kinematics and dynamics of serial and parallel robots, embedded systems, vision and artificial intelligence, and applications of robotic systems. He has authored or co-authored more than 40 research articles published in various journals, books and conference proceedings. He has also edited 3 books with Springer in the area of robotics. He is an Associate Editor of the International Journal of Machine Learning and Networked Collaborative Engineering (IJMLNCE), ISSN 2581-3242. He is an IEEE Senior Member and belongs to the Robotics and Automation Society (RAS), the Aerospace and Electronic Systems Society (AESS) and the Education Society (EdSoc). He is the IEEE RAS and AESS Student Branch Chapter Advisor and Student Branch Mentor at Universidad Don Bosco, and the Vice-chair of the IEEE El Salvador Section.

Dr. Prasant Kumar Pattnaik, Ph.D. (Computer Science), Fellow IETE, Senior Member IEEE, is a Professor at the School of Computer Engineering, KIIT Deemed University, Bhubaneswar, India. He has more than a decade of teaching and research experience. Dr. Pattnaik has published numerous research papers in peer-reviewed international journals and conferences and has published many edited book volumes. His areas of interest include mobile computing, cloud computing, cyber security, intelligent systems and brain–computer interfaces. He is an Associate Editor of the Journal of Intelligent & Fuzzy Systems.
Fuzzy Logic Simulation for Classifying Abrasive Wheels in Pendulum Grinding Van Canh Nguyen, Van Chi Hoang, Vanthan Nguyen, and Ngoc Hue Nguyen
Abstract Simulation results obtained in a fuzzy logic environment can be used to classify abrasive wheels with linguistic estimates and to solve local grinding problems, such as selecting the base elements of a wheel characteristic so as to minimize (or maximize) a single surface topography parameter. The requirements of an integrated assessment of the topography of high-speed tool steel W9Mo4Co8 are met to the greatest extent by the abrasive wheels 25AF46K10V5-PO3, 5SG60K12VXP and 5NQ46I6VS3, which received the linguistic rating "good" and the greatest values of the desirability function.

Keywords Fuzzy logic · Desirability function · Mamdani algorithm · Surface quality · Grinding
V. C. Nguyen (B) · V. Nguyen (B)
Ngo Quyen University, Binh Duong, Viet Nam
e-mail: [email protected]

V. C. Hoang
National Laboratory of Information Security, Ha Noi, Viet Nam

N. H. Nguyen
Thu Dau Mot University, Binh Duong, Viet Nam

© Springer Nature Singapore Pte Ltd. 2021
R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_1

1 Introduction

The classification problem arises when a researcher who takes a number of measurements of a certain variable needs to assign the observed object to one of several categories. The category cannot be determined directly, so the researcher is forced to rely on these observations. Several hundred factories currently supply abrasive tools to the global market [1, 2]. All abrasive wheels are classified according to various criteria: shape, geometric dimensions, type and brand of abrasive material, grain size, hardness, structure, bond, accuracy and unbalance classes [3–6]. However, neither previous studies nor normative documents offer an integrated classification of the cutting abilities of abrasive tools over the whole complex of output process variables. The difficulty of controlling the grinding process is connected with the fact that there are
many estimates of the surface topography of the HSCPs, but no single estimate that allows the treated surface to be evaluated simultaneously on a set of parameters, among them microgeometry, shape deviations and mechanical properties, while taking into account the service purpose of the parts. Many of these variables are nonlinear, interrelated and difficult to evaluate qualitatively with high accuracy. Physical models of them therefore cannot be obtained, and their experimental analogs cannot be exhaustive and have limited applicability [7]. One approach that can solve this problem is the application of the theory of fuzzy logic [8, 9]. By introducing the notion of weighted membership of an object in a set, fuzzy logic offers a flexible apparatus for a formal description of this kind of situation. The question is then not whether the object belongs to the class, but how much the object belongs to a particular class. This feature of fuzzy set theory also opens up new possibilities in classification problems; in contrast to traditional statistical methods, it allows more adaptable representations of the structure of the initial data [10–12]. The idea and basic principles of applying the apparatus of fuzzy set theory to the solution of classification problems were put forward in the thesis of Bellman et al. [13]. The first systematic study devoted to the problem of using fuzzy mathematics to solve problems of pattern recognition was the thesis of Bezdek [14], after which a wide variety of fuzzy recognition methods, including fuzzy methods of automatic classification, were proposed. Various aspects of fuzzy classification have been studied and developed in detail in a number of papers [15–17]. The purpose of this study is to find, with the involvement of fuzzy logic, the wheels that have the best cutting capacities in the formation of the surface topography of the HSCPs W9Mo4Co8, which should ultimately lead to an improvement in the quality of the assembly of the cutting tool and in its durability.
2 Methods of Research

2.1 Conditions of Experiment

The experiments were conducted with the periphery of the abrasive wheels: wheels of standard porosity, most often of the 6th structure, and highly porous wheels of the 10th–12th structures. The removal of the operating allowance corresponded to a pendulum scheme without standard dressing at the end of the grinding cycle. The unchanged conditions adopted are those of [1]. The characteristics of the wheels [3–6, 18], l = 1, …, 16, contain the following information:
• High-porous wheels produced by Norton (l = 1–4): 1–5SG46K12VXP, 2–5SG46I12VXP, 3–5SG60K12VXP, 4–TGX80I12VCF5 (Altos),
• High-porous wheels of Russian production (l = 5–11): 5–25AF46M12V5-PO, 6–25AF46M12V5-PO3, 7–25AF46M10V5-PO, 8–25AF46M10V5-PO3, 9–25AF46K10V5-PO3, 10–25AF60M10V5-PO, 11–25AF46L10V5-KF35,
• Standard-porous wheels of different manufacturers (l = 12–16): 12–5A46L10VAX, 13–EKE46K3V (corporation Abrasives Corundum, Germany); 14–5NQ46I6VS3 (Norton Vitrium); 15–92A/25AF46L6V20, 16–34AF60K6V5 (Russia).
The roughness of the surfaces of the HSCPs was measured using a system based on the profilograph gauge 252 produced by "Kalibr" [19]. According to [20], the macrogeometry is represented by two indicators of deviation from flatness: EFE_max (y14), the main and auxiliary deviation, and EFE_a (y15), the arithmetic mean deviation. The method of their determination is given in [21]. The microhardness HV (y16) is measured according to [1, 22].
2.2 Statistical Methods for Interpretation of Experimental Data

The cutting capacities of the wheels are considered random variables, which justifies probability-theoretic approaches to the interpretation of the observations [1]. The labor intensity of the statistical computations is reduced by the use of software products, in particular Statistica 6.1.478. Analysis of the experimental data
$$y_{ljv}, \quad l = \overline{1;16}, \; j = \overline{1;6}, \; v = \overline{1;30}, \qquad (1)$$
can be carried out with the involvement of parametric and nonparametric (in particular, rank) statistical methods. Each of them has "its own field" [23] of effective application in technical problems. For the parametric method, it is necessary that all observations (1) possess the properties of homoscedasticity (synonym: homogeneity of deviation variances) and normality of distribution. Otherwise, the exact criteria of this method lose their reliability and can lead to incorrect statistical decisions. In such a situation, it is more expedient to use rank statistics, which are not tied to any family of distributions and do not rely on its properties. The characteristics of the one-dimensional frequency distribution of (1) are provided by [23–25]:
• measures of position (reference values): the averages $\bar{y}_{lj} = y_{lj\bullet}$ (2) and the medians $\tilde{y}_{lj}$ (3);
• measures of dispersion (precision): the standard deviations $SD_{lj}$ (4), the ranges $R_{lj} = (y_{\max} - y_{\min})_{lj}$ (5) and the quartile widths $QW_{lj} = (y_{0.75} - y_{0.25})_{lj}$ (6).
The choice of statistical methods is described in [25]. Throughout this study, we confine ourselves to stating that the procedure for interpreting (1) was reduced to one stage: a one-dimensional analysis of variance for detecting a significant difference between the levels of the position measures, without a search for their nominal values.
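For illustration, the position and dispersion estimates (3) and (6) and the rank-based one-way analysis can be reproduced directly in MATLAB, the environment used below for the fuzzy simulation. This is a minimal sketch under the assumption that the 30 replicate measurements of one quality parameter for the 16 wheels are stored column-wise in a hypothetical 30 × 16 matrix Y; kruskalwallis is named here as the standard rank analog of one-way analysis of variance, not as the exact routine used in [25]:

% Nonparametric measures of position (3) and dispersion (6)
med = median(Y);    % medians, one per wheel, Eq. (3)
qw  = iqr(Y);       % quartile widths (y0.75 - y0.25), Eq. (6)

% Rank-based one-way analysis: do the position measures of the 16 wheels differ?
p = kruskalwallis(Y, [], 'off');   % columns of Y are treated as groups
if p <= 0.05
    disp('Significant difference between the position measures of the wheels');
end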
2.3 Mathematical Model of Object Classification Using Fuzzy Logic Simulation

The model is based on the theories of fuzzy sets and fuzzy logic. The author of [10] proposed for this purpose a mathematical model in the form of the "and/or" graph G = (A, μ_A, C, μ_f, L, Y, S, D), presented in Fig. 1. The object is described by a finite set of attributes A = {a_1, …, a_m}. Each attribute a_j, j = 1, …, m, is put
Fig. 1 Model of the system of fuzzy classification objects [10]
into correspondence with the set e_j of its crisp values and with the set of linguistic terms T_{ij} = {T_{1j}, …, T_{m_i j}}, where m_i is the number of terms of the attribute a_j. Each linguistic term T_{ij} is estimated by a membership function μ_{T_{ij}}(e_j) on the universal set. The observed objects belong to classes (f_1, …, f_k), each of which corresponds to an output linguistic variable (output linguistic term) described by a membership function μ_{f_k}. Here R_τ ∈ R is a fuzzy rule; 1 ≤ τ ≤ n is the index of the rule R_τ; a_j ∈ A is an attribute of the object; μ^τ_{T_{ij}}(e_j) ∈ μ_A is the value of the linguistic term of the attribute; C_τ ∈ C is the degree of truth of the condition; L_τ ∈ L is the degree of truth of the particular solution (subclause) of the rule R_τ; f is the class of objects; μ^τ_{f_k} ∈ μ_f is the object class membership function; Y_τ ∈ Y is the membership function of the subclause of the rule R_τ; S is the conclusion of the fuzzy classification system; and d ∈ D is the output value which evaluates the class of objects. The output value d of the fuzzy classification system is determined by the expression [10]:

$$d = \frac{\sum_{u=1}^{s} z_u S_u(z_u)}{\sum_{u=1}^{s} S_u(z_u)}, \qquad (7)$$
where s is the number of membership functions of the output term f_{k_i} in the rule base; z_u is the output value of the system corresponding to the rule with membership function S_u; and S_u(z_u) is the value of the membership function of the term f_{k_i}, equal to the degree of truth of the subclause of that rule. Equation (7) represents the desirability function proposed by Harrington [26]. The fuzzy logic simulation was executed in the MATLAB R2013a environment using the dedicated Fuzzy Logic Toolbox extension package. It has a simple and well-thought-out interface which makes it easy to design and diagnose fuzzy models [27].
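Equation (7) is simply a weighted average of the rule outputs, with the truth degrees of the subclauses as weights. A minimal sketch of this defuzzification step, assuming the vectors z (the outputs z_u) and S (the truth degrees S_u(z_u)) have already been produced by the rule base:

% Weighted-average defuzzification, Eq. (7)
function d = desirability(z, S)
d = sum(z .* S) / sum(S);   % d = sum(z_u*S_u(z_u)) / sum(S_u(z_u))
end

% Example: three fired rules with outputs 0.2, 0.5, 0.8 and truth
% degrees 0.1, 0.6, 0.3 give d = 0.56
d = desirability([0.2 0.5 0.8], [0.1 0.6 0.3]);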
3 Research Results and Discussion

The observations (1) were tested for homogeneity of variances over the sets l = 1, …, 16 according to the criteria (q = 1, 2, 3): 1—Levene, 2—Brown–Forsythe, 3—Hartley, Cochran and Bartlett, which the statistics program pools into one group. The dispersions of the observations in (1) are considered homogeneous at a significance level α_q ≤ 0.05, which in theoretical statistics corresponds to the adoption of the null hypothesis (H0). It was revealed that H0 here corresponds to an event of practically certain probability ρ = 1 (α_q = 0). The distribution laws of the observations were analyzed with the Shapiro–Wilk statistic: at a reliability level α_lj > 0.5, H0 is accepted. The distributions were tested for all
indicators separately for each abrasive tool; thus, 96 situations were analyzed in total. H0 was accepted in 12 of the 96 possible situations. It should be noted that the same experimental data, obtained under the same conditions but in a different order of implementation, often yield different estimates of their deviation from the normal distribution. In connection with the foregoing, it was accepted that nonparametric statistics serve as the "field" for the interpretation of (1). The results of the experiment, i.e., the medians (3) and quartile widths (6) after statistical processing (Table 1), are considered the input variables for modeling the topography of the surface of the HSCPs W9Mo4Co8 in the fuzzy logic simulation.

As Table 1 shows, choosing the optimal abrasive tool characteristic by the parameter Ra1 makes it possible to reduce the number of transitions in the developed grinding operation, or to increase its productivity while maintaining the roughness. Under similar conditions, the cutting capacities of the wheels increased the accuracy of the HSCP form by one TFE grade [28], and the microhardness of the surface increased by 1.6–1.8 times; in both cases, the best results were shown by the wheels l = 5 and 7. Even taking only the medians into account, it is difficult to choose a wheel that would ensure the best state of the topography of the HSCP surface, and if QW is included in the analysis, this choice cannot be made on the basis of statistical methods alone. Fuzzy logic is involved for this purpose. The attributes Ra1, Rmax1, Sm2, EFEmax, EFEa and HV, in the form of the estimates (3) and (6), are fed to the input of the system (Fig. 1) as attributes a_lj, l = 1, …, 16, j = 1, …, 6, and the output is the classification of the abrasive tools according to the state of the surface topography of the HSCPs. The implementation of the fuzzy logic is carried out in two successive stages:

• differential classification of the wheels under the requirement of decreasing (3) and (6) for all surface quality parameters except HV, for which (3) should increase while (6) decreases (the basis for increasing HV is the reduction of burns [29]);
• integral classification of the wheels by a complex estimate of the topography of the HSCP surface under the above requirements on (3) and (6).

At the first stage of the research, a fuzzy model was created (Fig. 2) which includes three variables: two inputs (the estimates (3) and (6) of one parameter, j = 1, …, 6) and one output (membership function), for each fixed l = 1, …, 16. Each of the variables is represented by three membership functions entering (7), depending on the corresponding event during fuzzification [16, 26]. In this study, the Mamdani algorithm is used, and the following membership functions are chosen for the terms: "Good"—z-like (zmf), "Middle"—pi-like (pimf) and "Bad"—z-like (zmf). The ranges and parameters of the inputs are shown in Table 2. For the output variable "Output", membership functions for three categories of quality of the ground parts are created—"high", "normal" and "bad"—with the corresponding "zmf" and "pimf" terms. Their scales over the desirability function d ∈ [0; 1] are: d ∈ [0.1; 0.4]—bad; d ∈ [0.1; 0.4; 0.6; 1.0]—normal; d ∈ [0.6; 1.0]—high, reflecting increasing cutting capacity of the wheels.
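In code, this first-stage model can be sketched with the classic (R2013a-era) Fuzzy Logic Toolbox functions; the fragment below builds the two-input Mamdani system for the parameter Ra1, taking the ranges and term parameters from the Ra1 rows of Table 2. The variable names are illustrative, and the 'Bad' terms are written as rising s-curves (smf), which matches the term boundaries given in Table 2 although the paper labels both end terms "zmf":

fis = newfis('Ra1class');   % Mamdani system by default

% Input 1: median estimate (3) of Ra1, range and terms per Table 2
fis = addvar(fis, 'input', 'Ra1_median', [0.04 0.13]);
fis = addmf(fis, 'input', 1, 'Good',   'zmf',  [0.04 0.08]);
fis = addmf(fis, 'input', 1, 'Middle', 'pimf', [0.04 0.08 0.10 0.13]);
fis = addmf(fis, 'input', 1, 'Bad',    'smf',  [0.10 0.13]);

% Input 2: quartile-width estimate (6) of Ra1
fis = addvar(fis, 'input', 'Ra1_QW', [0.01 0.15]);
fis = addmf(fis, 'input', 2, 'Good',   'zmf',  [0.01 0.07]);
fis = addmf(fis, 'input', 2, 'Middle', 'pimf', [0.01 0.07 0.10 0.15]);
fis = addmf(fis, 'input', 2, 'Bad',    'smf',  [0.10 0.15]);

% Output: desirability d on [0, 1] with the three categories given above
fis = addvar(fis, 'output', 'd', [0 1]);
fis = addmf(fis, 'output', 1, 'Low',    'zmf',  [0.1 0.4]);
fis = addmf(fis, 'output', 1, 'Normal', 'pimf', [0.1 0.4 0.6 1.0]);
fis = addmf(fis, 'output', 1, 'High',   'smf',  [0.6 1.0]);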
Table 1 Input variables for the fuzzy logic simulation of cutting capacities of wheels (for each parameter: median (3) and quartile width (6))

Wheel l | Ra1, µm (3) | (6) | Rmax1, µm (3) | (6) | Sm2, µm (3) | (6) | EFEmax, µm (3) | (6) | EFEa, µm (3) | (6) | HV, MPa (3) | (6)
1 | 0.047 | 0.017 | 0.307 | 0.129 | 57.05 | 17.97 | 15.00 | 3.00 | 11.46 | 1.90 | 7747.19 | 615.91
2 | 0.057 | 0.018 | 0.353 | 0.138 | 50.30 | 18.88 | 18.00 | 2.75 | 12.00 | 3.54 | 7918.75 | 620.63
3 | 0.065 | 0.020 | 0.347 | 0.126 | 53.29 | 32.20 | 14.00 | 5.25 | 8.75 | 2.13 | 8033.13 | 486.09
4 | 0.049 | 0.019 | 0.367 | 0.111 | 48.96 | 21.87 | 16.00 | 6.25 | 10.25 | 5.88 | 6002.68 | 1424.52
5 | 0.060 | 0.030 | 0.463 | 0.204 | 79.28 | 33.17 | 12.00 | 2.75 | 7.75 | 1.51 | 9445.00 | 896.80
6 | 0.056 | 0.020 | 0.510 | 0.143 | 50.59 | 22.88 | 18.00 | 4.75 | 10.42 | 2.35 | 7861.56 | 817.97
7 | 0.120 | 0.040 | 0.550 | 0.249 | 67.78 | 33.21 | 12.50 | 3.00 | 7.75 | 1.98 | 10553.73 | 1294.39
8 | 0.080 | 0.133 | 0.675 | 0.709 | 73.15 | 32.86 | 16.50 | 4.75 | 9.75 | 3.69 | 9092.63 | 811.40
9 | 0.079 | 0.021 | 0.511 | 0.133 | 65.12 | 22.32 | 12.00 | 1.00 | 9.17 | 1.88 | 8376.25 | 714.85
10 | 0.095 | 0.030 | 0.550 | 0.222 | 68.53 | 32.76 | 14.75 | 6.88 | 8.17 | 4.97 | 7487.75 | 501.53
11 | 0.080 | 0.150 | 0.820 | 0.923 | 66.22 | 35.69 | 18.00 | 6.00 | 11.54 | 3.60 | 7416.25 | 776.81
12 | 0.080 | 0.015 | 0.363 | 0.071 | 56.22 | 14.00 | 16.00 | 2.75 | 11.42 | 2.50 | 5824.95 | 734.35
13 | 0.070 | 0.012 | 0.408 | 0.100 | 60.19 | 19.81 | 15.00 | 1.00 | 9.38 | 1.25 | 8568.03 | 1654.71
14 | 0.055 | 0.023 | 0.329 | 0.153 | 50.64 | 9.72 | 15.00 | 1.75 | 11.29 | 0.83 | 8895.90 | 882.68
15 | 0.048 | 0.025 | 0.395 | 0.109 | 42.18 | 13.86 | 15.00 | 1.00 | 11.17 | 2.88 | 7190.38 | 933.58
16 | 0.050 | 0.014 | 0.347 | 0.088 | 46.22 | 13.30 | 17.00 | 3.75 | 12.46 | 2.69 | 6863.94 | 596.74
Fig. 2 Fuzzy logic simulation in the classification of wheels for each parameter

Table 2 Parameter input in the evaluation of cutting capacities of wheels by each parameter

Parameter | Equations (3), (6) | Range | Good | Middle | Bad
Ra1, µm | (3) | [0.04; 0.13] | [0.04; 0.08] | [0.04; 0.08; 0.1; 0.13] | [0.1; 0.13]
 | (6) | [0.01; 0.15] | [0.01; 0.07] | [0.01; 0.07; 0.10; 0.15] | [0.10; 0.15]
Rmax1, µm | (3) | [0.1; 1] | [0.1; 0.5] | [0.2; 0.5; 0.7; 1] | [0.7; 1]
 | (6) | [0.01; 1] | [0.01; 0.5] | [0.01; 0.5; 0.7; 1] | [0.7; 1]
Sm2, µm | (3) | [35; 85] | [35; 55] | [35; 55; 65; 85] | [65; 85]
 | (6) | [5; 40] | [5; 20] | [5; 20; 25; 40] | [25; 40]
EFEmax, µm | (3) | [8; 22] | [8; 14] | [8; 14; 16; 22] | [16; 22]
 | (6) | [0.5; 10] | [0.5; 5] | [0.5; 5; 6; 10] | [6; 10]
EFEa, µm | (3) | [5; 15] | [5; 9] | [7; 9; 11; 15] | [11; 15]
 | (6) | [0.25; 10] | [0.25; 5] | [0.25; 5; 7; 10] | [7; 10]
HV, MPa | (3) | [5800; 11000] | [8000; 11000] | [5800; 7200; 8000; 11000] | [1200; 1600]
 | (6) | [486; 1655] | [486; 1000] | [5800; 7200; 8000; 11000] | [1200; 1600]
Table 3 Fuzzy rules for each parameter of the quality of the ground HSCPs

Rule | Equation (3) | Equation (6) | Output
1 | Good | Good | High
2 | Good | Middle | High
3 | Good | Bad | Normal
4 | Middle | Good | High
5 | Middle | Middle | Normal
6 | Middle | Bad | Low
7 | Bad | Good | Normal
8 | Bad | Middle | Low
9 | Bad | Bad | Low
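Continuing the sketch started after Table 2 (term order Good = 1, Middle = 2, Bad = 3 on the inputs and Low = 1, Normal = 2, High = 3 on the output), the nine rules of Table 3 can be encoded numerically in the classic [input1 input2 output weight AND/OR] format and the system evaluated on the Table 1 estimates of a wheel:

rules = [ 1 1 3 1 1;   % rule 1: Good,   Good   -> High
          1 2 3 1 1;   % rule 2: Good,   Middle -> High
          1 3 2 1 1;   % rule 3: Good,   Bad    -> Normal
          2 1 3 1 1;   % rule 4: Middle, Good   -> High
          2 2 2 1 1;   % rule 5: Middle, Middle -> Normal
          2 3 1 1 1;   % rule 6: Middle, Bad    -> Low
          3 1 2 1 1;   % rule 7: Bad,    Good   -> Normal
          3 2 1 1 1;   % rule 8: Bad,    Middle -> Low
          3 3 1 1 1 ]; % rule 9: Bad,    Bad    -> Low
fis = addrule(fis, rules);

% Differential estimate for wheel l = 13 on Ra1 (median 0.070, QW 0.012,
% Table 1); the value reported in Table 4 is d_13,1 = 0.860
d13 = evalfis([0.070 0.012], fis);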
After creating the membership functions, the rules of fuzzy reasoning (Table 3) were entered into the developed system through the Rule Editor interface of the Fuzzy Logic Toolbox. Its graphical format reveals all the implications and allows the output variable "Output" to be predicted: each change of a part property is displayed in the rule viewer, which changes the output accordingly. As noted earlier, an increase in the desirability function raises the probability of the desired event, which under the conditions of this study means growth of the cutting capacities of the wheels and improvement of the quality of the surface of the HSCPs. The joint accounting of the estimates (3) and (6) yielded the attributes a_lj, l = 1, …, 16, j = 1, …, 6 (Table 4) and significantly affected the priorities between the wheels l = 1, …, 16 that had previously been established from the reference values (Table 1).
Table 4 Results of fuzzy logic simulation on the differential estimation of cutting capacities of abrasive wheels

Wheel l | d_l1 (Ra1) | d_l2 (Rmax1) | d_l3 (Sm2) | d_l4 (EFEmax) | d_l5 (EFEa) | d_l6 (HV)
1 | 0.843 | 0.701 | 0.501 | 0.578 | 0.661 | 0.653
2 | 0.829 | 0.716 | 0.506 | 0.557 | 0.506 | 0.648
3 | 0.807 | 0.733 | 0.442 | 0.528 | 0.629 | 0.826
4 | 0.805 | 0.765 | 0.513 | 0.527 | 0.500 | 0.202
5 | 0.673 | 0.628 | 0.324 | 0.602 | 0.722 | 0.559
6 | 0.809 | 0.722 | 0.505 | 0.409 | 0.599 | 0.513
7 | 0.568 | 0.574 | 0.407 | 0.578 | 0.650 | 0.746
8 | 0.332 | 0.500 | 0.420 | 0.526 | 0.510 | 0.523
9 | 0.795 | 0.790 | 0.500 | 0.857 | 0.666 | 0.556
10 | 0.673 | 0.605 | 0.423 | 0.516 | 0.504 | 0.806
11 | 0.121 | 0.287 | 0.291 | 0.497 | 0.511 | 0.525
12 | 0.848 | 0.828 | 0.531 | 0.602 | 0.581 | 0.401
13 | 0.860 | 0.793 | 0.500 | 0.856 | 0.773 | 0.210
14 | 0.767 | 0.681 | 0.687 | 0.745 | 0.840 | 0.513
15 | 0.724 | 0.776 | 0.644 | 0.856 | 0.545 | 0.501
16 | 0.855 | 0.796 | 0.545 | 0.533 | 0.532 | 0.661
For this purpose, created model consists of seven variables: six inputs and one output (Fig. 3). The membership functions for each input variable correspond to the output variables obtained in solving the first simulation problem (Table 4). They are represented by a numerical range [0; 1], and their breakdown into three classes is reflected bad, middle, good: d ∈ [0.1; 0.5]—bad; d ∈ [0.1; 0.5; 0.5; 0.9]—middle; and d ∈ [0.5; 0.9]—good. The membership function, for outputting variable “classification”, is represented by four classes of quality polished parts: “low”, “normal”, “good”, “high” and is displayed as “trimf”. Classification is characterized by the range [0; 1] and is represented by scale and desirability function in Table 5. The rules of fuzzy reasoning for a developed system include l = 3 × 3 × 3 × 3 × 3 × 3 = 729 possible combinations of output parameters. Partly, they are presented in Table 6. It is established that the most important quality property parameters are the following: Ra1 , EFE max , HV. If they are rated “good”, some parameter estimates: Rmax1 , S m2 and EFE a always have a desirability scale “good, high”. The results of the classification of abrasive wheels are based on comprehensive evaluation of quality polished surface of the HSCPs W9Mo4Co8 which is presented in Table 7. Among the outputs, there was no “high” evaluation of the cutting capacities of wheels since desirability functions, on the one hand for roughness and accuracy of the shape of microhardness, and on the other hand, differ (Table 4). If wheels are with a “good” assessment, it is worthwhile to distinguish two subclasses: the first
Fig. 3 Fuzzy logic system for the integrated classification of wheels
Table 5 Parameter "classification" in the search for the best wheel by the integrated evaluation of the quality of the ground plates

Valuation type | Output parameters
Linguistic | Low | Normal | Good | High
Numeral d_l• | [0; 0.37] | [0.37; 0.63] | [0.63; 0.80] | [0.8; 1.0]
one with dl• ∈ [0.7; 0.8] and the second dl• ∈ [0.6; 0.7]. The first group includes three instruments which are represented by desirability functions: for the wheel 25AF46K10V5-PO3 (l = 9), d9• = 0.735; and d3;14• = 0.724, l = 3; 14—for wheels 5SG60K12VXP, 5NQ46I6VS3. The second subclass also forms three tools: d13• = 0.623—a wheel cast from monocorundum EKE (l = 13), d15• = 0.669—a wheel with mixed grains 92A/25A (l = 15) and d1• = 0.654—high-porous wheels (l = 1). Integral assessments of the cutting capacities of wheels revealed new leaders in ensuring quality surface of the HSCPs W9Mo4Co8. In particular, the abrasive tools l = 9, 3, 14 which are included in the first subclass with the integral evaluation of “good” at the first stage of modeling in the fuzzy logic environment showed themselves on the best side: l = 9—twice in the parameters Rmax1 , EFE max ; Rmax1 , EFE max ; l = 3—once by HV; l = 14—twice by S m2 , EFE a , i.e., by one—two parameters from possible variables j ∈ [1; 6]. The correct choice of abrasive tool characteristics is important when robust design of grinding operations is performed by the
Table 6 A selective representation of the rules of fuzzy reasoning (linguistic valuation of the parameters)

Rule | Ra1 | Rmax1 | Sm2 | EFEmax | EFEa | HV | Output
17 | Bad | Bad | Bad | Middle | Good | Middle | Normal
18 | Bad | Bad | Bad | Middle | Good | Good | Normal
19 | Bad | Bad | Bad | Good | Bad | Bad | Low
20 | Bad | Bad | Bad | Good | Bad | Middle | Low
… | … | … | … | … | … | … | …
322 | Middle | Bad | Good | Good | Good | Bad | Normal
323 | Middle | Bad | Good | Good | Good | Middle | Good
324 | Middle | Bad | Good | Good | Good | Good | Good
325 | Middle | Middle | Bad | Bad | Bad | Bad | Low
… | … | … | … | … | … | … | …
726 | Good | Good | Good | Good | Middle | Good | High
727 | Good | Good | Good | Good | Good | Bad | Good
728 | Good | Good | Good | Good | Good | Middle | High
729 | Good | Good | Good | Good | Good | Good | High
Table 7 Influence of the characteristics of the wheels on the complex evaluation of the surface quality of the HSCPs

Wheel (l = 1, …, 16) | d_l• | Output (class)
5SG46K12VXP (1) | 0.654 | Good
5SG46I12VXP (2) | 0.576 | Normal
5SG60K12VXP (3) | 0.724 | Good
TGX80I12VCF5 (4) | 0.504 | Normal
25AF46M12V5-PO (5) | 0.599 | Normal
25AF46M12V5-PO3 (6) | 0.534 | Normal
25AF46M10V5-PO (7) | 0.554 | Normal
25AF46M10V5-PO3 (8) | 0.502 | Normal
25AF46K10V5-PO3 (9) | 0.735 | Good
25AF60M10V5-PO (10) | 0.569 | Normal
25AF46L10V5-KF35 (11) | 0.306 | Low
5A46L10VAX (12) | 0.565 | Normal
EKE46K3V (13) | 0.683 | Good
5NQ46I6VS3 (14) | 0.724 | Good
92A/25AF46L6V20 (15) | 0.669 | Good
34AF60K6V5 (16) | 0.571 | Normal
best (basic) tool. This approach allows optimizing all objective functions with the greatest efficiency. Using fuzzy logic simulation for evaluating cutting abilities of abrasive wheel is innovative. Basically, this approach solves particular problems which exclude the possibility of obtaining standard parts and satisfy all requirements. However, it is practically not used in mechanical engineering, although it can reduce the complexity of developing standards for cutting conditions and technological recommendations for milling, turning, drilling processes, etc.
4 Conclusions In conditions of violations of the normality of distributions of the experimental data, it was justified, for the realization of fuzzy logic, to use nonparametric estimates of position and dispersion measures, namely medians and quartile latitudes. Fuzzy logic simulation together with statistical processing of observations proved to be an effective tool for classifying the cutting capacities of abrasive wheels. Modeling of fuzzy logic under the conditions of an integral assessment of the surface state has classified the cutting capacities of the abrasive instruments into three groups with linguistic estimates: good (l = 1; 3; 9; 13; 14; 15); normal (l = 2; 4; 5; 6; 7; 8; 10; 12; 16); and low (l = 11). Wheels l = 9; 3; 14 provide the best topography of the surface of the HSCPs along the whole range of quality indicators investigated, first of all. They are recommended when designing the grinding operations of the HSCPs W9Mo4Co8. The results of modeling at the first stage of the study can be used to solve local problems of grinding, for example, to select the grain, structure and hardness of wheels so as to minimize one of the quality parameters, taking into account both position and dispersion measures. Modeling in the fuzzy logic environment also made it possible to identify additional reserves, which are predicted without taking into account possible correlations between the individual components of the wheel characteristics: 5SG60K12VXP and 25AF60K12V5-PO.
Enhancing Security and Trust of IoT Devices–Internet of Secured Things (IoST) B. Rohini, Asha Rashmi Nayak, and N. Mohankumar
Abstract Internet of things has infiltrated all corners of the world by providing distributed intelligence to all objects, making them smart. From small wearables to smart homes to smart cities, IoT has proven to be a substantial part of all facets of life. The safety of the data collected from IoT devices that can make a whole city smarter is of utmost importance. On such a large scale, one malignant intrusion can lead to the leakage of thousands of items of public data, thereby compromising privacy. On an international level, this leads to various complications in military affairs, relations among nations, the suppression of terrorism, etc. Most current methods of securing IoT devices deal with the network and application layers. This paper presents a simple yet very efficient method of securing IoT devices in the physical layer before a device hits the market. These devices are called Internet of Secured Things (IoST) devices. The method also takes into consideration the importance of variations in timing, power and area. Keywords Internet of Secured Things (IoST) · Trojan detection · Hardware Trojan · Toggle count · Cloud platform · Hardware security · Design for security · Internet of things (IoT)
1 Introduction The flourish in the number of Internet of things (IoT) devices that connect the physical world together has had profound effects on society. To name a few, smart homes, smart automation and smart agriculture are some of the domains in which
IoT has infiltrated. Data from Juniper Research reveal that IoT-connected devices will number 38.5 billion in 2020, up from 13.4 billion in 2015: a rise of over 285%. With such a huge increase, IoT-based circuits have become an essential part of devices produced for the military, banking, healthcare systems, etc. These are security-sensitive fields, and hence it becomes very important to be sure that there are no malicious intrusions in the hardware. In today’s era of outsourcing, hardware security has become a significant issue. Most companies outsource chip fabrication in order to increase their revenue and produce cheaper products than their competitors. The passing of the job of IC fabrication to low-cost foundries has made these ICs vulnerable to intrusion. An adversary might include a Trojan in the circuit to leak confidential information or to sabotage the system at a later time. Trojans present in hardware are difficult to detect during testing time. One of the reasons is the small size of the ICs and their system complexity, which makes it impossible to track any Trojan by physical inspection. Destructive reverse engineering only renders the detection procedure expensive. In this paper, we propose a method to secure IoT devices from malicious Trojans using toggle count, thereby enhancing the security of the hardware. We select a few nodes of a golden circuit (without any Trojan) and calculate the toggle count at those particular nodes. We do this for a particular input sequence and obtain the corresponding output sequence. The toggle count of the nodes under consideration is then saved on a cloud platform. To detect intrusions in a circuit, the input sequence is passed to the circuit under inspection and the toggle count at the same nodes is calculated. This is then compared with the one stored in the cloud. If the values match, it is a golden IC; else, it is infected. The devices that have passed this test are called Internet of Secured Things (IoST) devices. Such IC testing offers the following advantages:
• A nondestructive method of IC testing.
• On-the-fly testing is possible.
The following section is a literature survey which enlists the advantages and drawbacks of the Trojan detection techniques that have been used in the past.
2 Literature Survey The paper by Salmani et al. deals with the difficulties related to triggering a Trojan circuit. A Trojan is activated by very rare stimuli so that it does not get detected easily, and consequently its transition probability is very low. A dummy flip-flop insertion method has been introduced which increases the transition probability of the Trojan circuit. The dummy flip-flop does not change the functionality of the circuit in any way. Timing has been analyzed to reduce the detection time of the Trojan circuit, which has been achieved by increasing the switching activity [1]. Side-channel analysis is used to identify the infected region of a circuit. It is done in two phases, namely region-based partition and relative toggle magnification. In the first phase, the circuit
is divided into regions with gates as the centers. After the circuit has been divided into regions, each region is activated with a small stimulus. The difference in toggle count between the infected circuit and a good circuit is maximized. This increases the power consumption of the infected circuit and helps in detection [2]. A Trojan taxonomy that can be used in conjunction with Trojan detection schemes is analyzed. Trojans can be classified according to their physical and activation characteristics. According to its physical characteristics, a Trojan can be distinguished with respect to its type, size, distribution and structure. According to its activation characteristics, a Trojan can be either externally or internally activated [3]. Cloud computing refers to the availability of computer system resources such as data storage and computing power without active management by the user. It enables users to access systems using Web browsers regardless of the location or device used. A breach in the hardware can be detected by comparing the intrusion-free results stored in the cloud with the results obtained [4]. Cloud computing is a convenient method of storing data, but there are some concerns regarding the security of data that is backed up into the cloud. Once data is sent to a cloud, it is controlled by a third party. Data from various sources is sent to the same cloud, due to which data breaching is a threat. Sometimes, the cloud service provider may corrupt the data. The most common way of protecting the data is encrypting it. Various factors should be considered regarding the confidentiality, integrity and availability of data [5]. The power consumption of a digital circuit is an important parameter. As dynamic power consumption is data-dependent, transition probability is discussed. Increasing the transition probability of nets in an integrated circuit helps reveal hardware Trojans [6]. A technique with three levels of security (signature generation, encryption and bitstream sequencing for the DUT), well suited to IoT nodes, is proposed in Noor et al. [7], Reddy et al. [8] and Atzori et al. [9]. A hardware-based malicious activity monitor using a hybrid voting algorithm is proposed by Koneru et al.; the main drawback of the technique proposed in that paper [10] is the requirement of redundant DUTs.
3 Proposal Internet of things (IoT) is revolutionizing the electronics industry. With inadequate authentication and authorization, safeguarding a device becomes a huge predicament. Insecure Web and cloud interfaces become vulnerabilities that may serve as an attack vector on an IoT system at the application layer. Commercial IoT products are particularly vulnerable because one compromise can have detrimental consequences. In the physical layer itself, hardware can contain malicious intrusions (Trojans) that could leak information, change the functionality of the circuit, etc. Encryption, authentication, etc., cannot mitigate security issues in IoT if the problem lies in the hardware itself.
Fig. 1 Combinationally triggered Trojan
Trojan trigger and Trojan payload are the two main terms used in hardware Trojan detection. The Trojan trigger is the activation mechanism, the sequence of digital stimuli that causes the Trojan to become active and carry out its disruptive function, while the Trojan payload is the part of the circuit that is affected. The trigger mechanism can be either analog or digital. Digitally triggered Trojans are classified into combinational and sequential types. Figure 1 shows an example of a combinationally triggered Trojan [11]. For A = 0, B = 1; A = 1, B = 0; and A = 1, B = 1, the values at C and Cmodified are equal. When A = 0 and B = 0, C and Cmodified are not equal anymore. In this section, we outline the approach taken by us to secure IoT devices by detecting such a digitally triggered Trojan belonging to the functional class, which could be externally triggered through some logic test sequence given by the user. We name this approach IoST—Internet of Secured Things. The focus is on “time zero” detection, i.e., a noninvasive method applied during the testing phase of each IC. The standard VLSI fault detection tool utilizes automatic test pattern generation (ATPG). The main challenge in such a logic-testing-based approach is the exhaustive nature of identifying all possible test vectors to detect HTs, which is computationally infeasible. Thus, we make use of the toggle count, i.e., the switching activity of the gates. Assuming we initially have a golden IC, i.e., a Trojan-free manufactured IC or functional model, we first identify a set of nodes of the IC that may be targets for the attacker to attach an HT. The toggle count of all such nodes is computed and stored in tabular form. This table is then pushed into a secure cloud platform and is utilized during the testing phase of each and every IC. Any deviation in the toggle count of the nodes when compared to the data in the cloud indicates that the IC has been tampered with. As an example, Fig. 2 shows the combinational benchmark circuit ISCAS 85 C17. The toggle count at specific nodes is pushed into a secure cloud platform. Now, consider the same C17 circuit with a Trojan added to it at node 8, which is shown in Fig. 3. The Trojan T1 added to the circuit has the characteristic “stuck at 1.”
Fig. 2 ISCAS 85 C17
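As a rough illustration of the golden-reference step, the following Python sketch simulates C17 under the standard ISCAS-85 connectivity and counts transitions at every internal and output net. The input patterns here are random placeholders, not the paper's actual test sequence, and all names are our own.

```python
import random

def nand(a, b):
    return 1 - (a & b)

def c17(n1, n2, n3, n6, n7):
    # Standard ISCAS-85 C17 netlist: six NAND gates.
    n10 = nand(n1, n3)
    n11 = nand(n3, n6)
    n16 = nand(n2, n11)
    n19 = nand(n11, n7)
    n22 = nand(n10, n16)
    n23 = nand(n16, n19)
    return {10: n10, 11: n11, 16: n16, 19: n19, 22: n22, 23: n23}

def toggle_counts(patterns):
    # Count logic transitions at every observed net across the sequence.
    counts = {net: 0 for net in (10, 11, 16, 19, 22, 23)}
    prev = None
    for p in patterns:
        nets = c17(*p)
        if prev is not None:
            for net, v in nets.items():
                counts[net] += int(v != prev[net])
        prev = nets
    return counts

random.seed(0)
patterns = [tuple(random.randint(0, 1) for _ in range(5)) for _ in range(64)]
golden = toggle_counts(patterns)   # reference table to be stored in the cloud
print(golden)
```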
Fig. 3 ISCAS 85 C17 with Trojan T1
Table 1 Toggle count of the C17 benchmark circuit

Nodes | Toggle count without T1 | Toggle count with T1
6     | 1                       | 1
7     | 3                       | 3
8     | 12                      | 12
9     | 24                      | 24
10    | 9                       | 1
11    | 12                      | 24
The functionality is not that of a C17 circuit anymore due to the addition of an unwanted Trojan at node 8. Table 1 shows the toggle count of the C17 circuit with and without the Trojan. This is a simple example of a circuit that has been tampered with due to the HT present. There are other Trojans, such as T2 (“stuck at 0”) and T3 (Trojans that complement the input). During the testing phase of such a tampered IC, the toggle count obtained is compared with the ones in the cloud. As the toggle counts do not match, we can conclude the presence of an HT. Figure 4 shows an overview of the proposed model. If the device is found to be compromised, the nodes whose toggle count differs from that of the golden IC can be identified. Such a node can be traced back to the input to find the exact location of the Trojan. The primary service provided by cloud computing is cloud storage. The service provider hosts the data of the user on their servers, and the user accesses it from these servers. Personal, public, private and hybrid are the four main categories of cloud storage. Whatever the category, the important characteristics of a secure cloud platform include encryption, confidentiality, security and privacy, and integrity of the data. Even though having a personalized server is encouraged, it is not always feasible. Hence, measures such as a unique password that cannot be reset to any factory setting, complete knowledge about how and where the data is stored, whether and how that data is encrypted, and who has access to the data and the keys can be taken. These safety measures will make a cloud platform safe and secure.
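Continuing the sketch above, the comparison step might look as follows. The cloud lookup is abstracted as an in-memory dictionary, and the stuck-at-1 fault is injected on net 16 of the standard netlist purely for illustration (the paper places T1 at node 8 of its own numbering); `c17`, `nand`, `patterns` and `golden` come from the earlier sketch.

```python
def check_device(measured, golden):
    # Any net whose toggle count deviates from the cloud reference
    # points toward the Trojan site.
    suspects = [net for net in golden if measured.get(net) != golden[net]]
    return ("golden IC", []) if not suspects else ("infected", suspects)

def c17_trojan(n1, n2, n3, n6, n7):
    nets = c17(n1, n2, n3, n6, n7)
    nets[16] = 1                             # stuck-at-1 injected by the Trojan
    nets[22] = nand(nets[10], nets[16])      # recompute downstream nets
    nets[23] = nand(nets[16], nets[19])
    return nets

def toggle_counts_with(circuit, patterns):
    counts, prev = {net: 0 for net in (10, 11, 16, 19, 22, 23)}, None
    for p in patterns:
        nets = circuit(*p)
        if prev is not None:
            for net, v in nets.items():
                counts[net] += int(v != prev[net])
        prev = nets
    return counts

verdict, nodes = check_device(toggle_counts_with(c17_trojan, patterns), golden)
print(verdict, nodes)   # e.g. ('infected', [16, 22, 23])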
Fig. 4 Overview of proposed model
4 Results and Discussion The results obtained from testing the ISCAS 85 benchmark circuits are consolidated in this section. In a similar way, IoT modules can be tested for malicious activity. A certain input sequence is applied to the gates, and the corresponding toggle counts of the nodes under inspection have been plotted as bar graphs. For each node, the following toggle counts have been taken into consideration (Fig. 5):
• Without Trojan—golden reference (blue)
• With Trojan T1—orange
• With Trojan T2—gray
• With Trojan T3—yellow.
As evident from the graphs, the toggle count varies greatly with the inclusion of different types of Trojans in multiple nodes. Figure 6 shows the percentage difference between the toggle counts of the IC with the different Trojans T1, T2 and T3 mentioned before. Figure 7 shows the execution time taken by each of the circuits without and with different types of Trojans. Execution time is the time taken from the application of input patterns to the circuit until the result indication. HT4 is the execution time taken by a circuit which has more than one Trojan present in it.
Fig. 5 Toggle count of various nodes for different benchmark circuits
Fig. 6 Percentage difference in toggle counts
Fig. 7 Execution time
The execution time was tested using a Wi-Fi connection with a 0.83–4.37 Mbps download speed and a 1.02–3.77 Mbps upload speed. It was tested on a machine with an 8th-generation Intel i7 core and 8 GB RAM. For a guaranteed and almost unvarying Wi-Fi speed, execution time can also be utilized as a parameter to test circuits for harmful intrusions. Figure 8 gives information about the leakage power of the ISCAS 85 circuits with and without Trojans. Figure 9 shows the area analysis of the ISCAS 85 circuits with and without Trojans. Even though the variations are minute, a range can be set for the golden IC. If the chip does not fall into this range, more trials can be performed on the circuit to analyze the Trojan presence. The variations highlighted in Figs. 8 and 9 validate our claim of HT detection.
Fig. 8 Leakage power
Fig. 9 Area analysis
5 Conclusion IoT is a gigantic network of interconnected devices that might come with malicious intrusions that compromise the functionality, privacy and safety of the device and its users. Thus, securing IoT devices from hardware Trojans becomes of utmost importance. A simple yet effective technique for making IoT devices reliable is presented in this paper. We have coined the term “IoST—Internet of Secured Things” to differentiate such devices from common IoT devices that have not been under such rigorous testing to prove their reliability. The toggle count of certain nodes for a given input sequence is found, and it is used to identify the presence of a malicious threat in the circuit. Simulation results show that the possibility of identifying Trojans using toggle count increases with an increase in the number of input patterns and in the number of nodes under consideration. “Internet of Secured Things (IoST)” should be the buzzword from now on in avenues such as smart cities, industrial IoT and Industry 4.0.
References
1. Salmani H, Tehranipoor M, Plusquellic J (2012) A novel technique for improving hardware trojan detection and reducing trojan activation time. IEEE Trans Very Large Scale Integr (VLSI) Syst 20(1):112–125
2. Banga M, Hsiao MS (2008) A region based approach for the identification of hardware Trojans. In: 2008 IEEE international workshop on hardware-oriented security and trust, Anaheim, CA, pp 40–47
3. Wang X, Tehranipoor M, Plusquellic J (2008) Detecting malicious inclusions in secure hardware: challenges and solutions, pp 15–19. https://doi.org/10.1109/hst.2008.4559039
4. Hayes B (2008) Cloud computing. Commun ACM 51(7):9–11. https://doi.org/10.1145/1364782.1364786
5. Vurukonda N, Rao BT (2016) A study on data storage security issues in cloud computing. Procedia Comput Sci 92:128–135. https://doi.org/10.1016/j.procs.2016.07.335
6. http://www.cse.psu.edu/~kxc104/class/cmpen411/14f/lec/C411L14LowPower2.pdf
7. Noor MM, Hassan W (2018) Current research on internet of things (IoT) security: a survey. Comput Netw 148. https://doi.org/10.1016/j.comnet.2018.11.025
8. Reddy DM, Akshay KP, Giridhar R, Karan SD, Mohankumar N (2017) BHARKS: built-in hardware authentication using random key sequence. In: 2017 4th international conference on signal processing, computing and control (ISPCC), Solan, pp 200–204
9. Atzori L et al (2010) The internet of things: a survey. Comput Netw
10. Koneru et al (2018) HAPMAD: hardware-based authentication platform for malicious activity detection in digital circuits. In: Information systems design and intelligent applications, Springer, pp 608–617. https://doi.org/10.1007/978-981-10-7512-4_60
11. Chakraborty RS, Narasimhan S, Bhunia S (2009) Hardware trojan: threats and emerging solutions. In: 2009 IEEE international high level design validation and test workshop, San Francisco, CA, pp 166–171
Detection of DoS, DDoS Attacks in Software-Defined Networking Dinh-Tu Truong, Khanh-Dang Tran, Quoc-Binh Nguyen, and Dac-Tot Tran
Abstract In a traditional network system, the control plane (CP) and data plane (DP) are both located on the same network device. If that device fails, the whole system stops working and the traffic cannot be managed by the administrator. In distributed denial of service (DDoS) or denial of service (DoS) attacks, the attackers usually use botnets to generate a large or medium-sized traffic flow toward the server where the service is active. Nowadays, many techniques have been proposed to detect DoS and DDoS attacks, but they are not very effective. However, the separation of the CP and DP in software-defined networking (SDN) provides a good foundation for attack detection and prevention, no matter whether the attackers attack some parts of the SDN or all of them. In this research, we use entropy to calculate the randomness level of the sizes of the packets reaching the CP, the system’s brain. With this method, the controller can respond faster, and we can get more accurate output compared with machine learning methods in the SDN. Keywords Software-defined networking · DDoS attacks · Detection of DDoS
1 Introduction In a traditional network architecture, the data plane (DP) and control plane (CP) are both on the same physical device. Each device is independent of the others and has its own version of the packet-forwarding policy. The consistency of these policies,
therefore, may not be guaranteed, and it is very difficult for administrators or network engineers to operate, control or configure the whole network, especially with a large number of devices. Due to the increasing demand from industry and the rise of cloud services, the complex traditional model, which depends so much on each provider and has inconsistent policies, becomes unscalable and hard to adopt. Therefore, network solution researchers have come up with a new network technology called software-defined networking (SDN) [1]. Based on the OpenFlow protocol, this research result of Stanford University and UC Berkeley is the physical separation of the network CP, the routing flows, from the DP, the traffic flows. SDN transfers flow control to a private network component called the flow controller, playing the role of the CP, to control several devices. This allows the flow of packets passing through the network to be programmatically controlled. The network information and status are therefore controlled in a logically centralized manner, which makes the network more flexible. Once the CP is separated from the physical device and passed to the controller, network engineers can set an optimal policy for the entire network on this controller. The controller interacts with physical network devices through the OpenFlow protocol. With SDN, network management can be performed on a single interface, instead of on each network device as in the old model. The SDN architecture can be divided into three layers, namely the application layer, the controller layer and the infrastructure layer. In the application layer, users can use the APIs provided by the controller layer to program and optimize network operations, such as adjusting latency parameters, bandwidth, routing, etc. The controller layer controls the network configuration according to the requirements from the application layer and the network capabilities. It provides APIs to users for building network applications. It also gathers information from the physical network and controls the physical network. This layer contains software programs, called controllers, in a CP that is separated from the physical devices. Each controller is a managed application that controls the network flows. Most SDN controllers are currently based on the OpenFlow protocol. The infrastructure layer is the network’s physical layer, including network devices such as Ethernet switches, routers and the links between them. In this layer, data is forwarded based on the instructions from the controller layer via the OpenFlow protocol. This layer is where the devices make up the DP. A network device can operate under the control of several different controllers, which enhances the virtualization capabilities of the network (Fig. 1).
Fig. 1 SDN architecture
A DDoS or DoS attack is one of the biggest threats to enterprise networks in general and to networks built on SDN in particular. Attackers do not intrude into the system but just block legitimate access by flooding the targeted machine or resource with superfluous requests in an attempt to overload the system. In an SDN-based network, switches are not involved in processing the traffic to and from the network, so the controller is the first affected entity. By defining flow entries, some new DDoS detection methods have been released, including FlowFence [2], Avant-Guard [3], SDNShield [4], self-organizing maps (SOM) [5] and others [1, 6–9]. Among methods for detecting the status of the system at a given moment, K-means clustering [10] has a detection rate of 0.9993 and a false alarm rate of 0.001, collected over the TCPDump KDD-99 dataset. On the other hand, the artificial neural network [5, 11] has a detection rate of 81.34% and a false alarm rate of 21.13%, collected over UNSW-NB15, etc. Machine learning methods should not be applied on the SDN controller because all of them require a long training time using existing datasets (UNSW-NB15, KDD99, NSL-KDD, etc.). Besides, the fitted parameters have to be updated continuously, which raises the computational difficulty and increases the training time. All of them can be applied in IDS and IPS devices [12, 13], but there are economic barriers when the system is expanded.
2 Detection of DDoS and DoS Using Entropy 2.1 Method In this paper, entropy is used to measure the randomness of Internet traffic in the network. In the case of normal network traffic, i.e., when users are performing normal service requests, the entropy is high. On the other hand, if the network is being attacked, that is, a large amount of malicious traffic is sent to the victim, then a lower
value of entropy is observed through the analysis of parameters such as frames, size, port, address, etc. The parameters associated with the packets defined in the controller are used to analyze the entropy of the active network traffic. The set of packet sizes sampled at a specific time is

Y = {x1, x2, x3, …, xn}    (1)

The probability pi with which element xi appears in the set Y is

pi = xi / n    (2)

Entropy is defined as the sum over all elements:

Entropy = − Σ (i = 1 to n) pi log(pi)    (3)
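A direct Python transcription of Eqs. (1)–(3) might look as follows. We read xi in Eq. (2) as the number of occurrences of the i-th distinct packet size in the window, which is the interpretation that makes the pi sum to one; the sample windows are invented for illustration.

```python
import math
from collections import Counter

def window_entropy(packet_sizes):
    # Eqs. (1)-(3): empirical distribution of packet sizes in the window,
    # then Shannon entropy of that distribution.
    n = len(packet_sizes)
    probs = [count / n for count in Counter(packet_sizes).values()]
    return -sum(p * math.log(p) for p in probs)

normal = [64, 1500, 512, 1500, 64, 820, 1500, 256]  # varied sizes -> high entropy
attack = [64] * 7 + [1500]                          # near-uniform flood -> low entropy
print(round(window_entropy(normal), 3), round(window_entropy(attack), 3))
```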
2.2 Statistics for Early DDoS and DoS Detection Using the Entropy Method The SDN model is very vulnerable to DDoS attacks compared to other models. If the detection of attacks takes a long time, it is very easy to destroy the interconnections between the layers in the SDN model. The differences between entropy values in a normal traffic case and an attacked traffic case are shown in Table 1. The flow entries are as shown in Fig. 2. They are a collection of all packet matches defined by the administrator. When the dataset is complete, the entropy method is calculated on this dataset and the evaluation result is alerted. In Fig. 3, the black line represents the value of the entropy for normal traffic, whereas the red line represents attack traffic. A fall in the entropy value is an alarm that the network is at risk of attack. In the SDN model, the most important thing is to have a fast detection method to prevent attacks and take the best measures to deal with them. Table 1 Entropy difference between normal and attack traffic
Y    | Entropy (normal) | Entropy (attack)
50   | 0.103            | 0.098
100  | 0.12             | 0.053
200  | 0.24             | 0.051
500  | 0.29             | 0.063
1000 | 0.33             | 0.08
Fig. 2 Flow definitions
Fig. 3 The difference in entropy between normal and attack traffic
2.3 Threshold The threshold is a value that is used to partition a set of specific values into specific areas. In this research, we use thresholds to identify attack areas as well as to detect the number and information of malicious packets. It also helps to classify the attacks. To find the optimal threshold, we collected the number of packets through two phases: (1) Phase 1: Calculate the entropy median of a total of 500 packets through 25 attack traffic flows (Fig. 4).
Fig. 4 The difference in entropy between normal traffic and attack phase 1
(2) Phase 2: Calculate the entropy median of a total of 2000 packets through 100 attack traffic flows (Fig. 5). The entropy medians of phase 1 and phase 2 are always fixed (by allowing the system to undergo actual attacks). The threshold in this study is the lowest entropy in all cases
Fig. 5 The difference in entropy between normal traffic and attack phase 2
because it allows the controller to detect attacks whose packets occupy more than 50% of the incoming traffic. All threshold values can be changed to suit the requirements of different network designs; even while the network is running, these values can still be changed through the controller of the SDN model.
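A minimal sketch of the resulting decision rule, assuming the threshold is taken as the lowest entropy observed across the attack column of Table 1 and combined with the packet-count check described in Sect. 3.3; the numbers and names are illustrative.

```python
ATTACK_ENTROPIES = [0.098, 0.053, 0.051, 0.063, 0.08]  # attack column of Table 1
H_THRESHOLD = min(ATTACK_ENTROPIES)                    # lowest entropy in all cases

def is_attack(entropy, n_packets, pkt_limit=1000):
    # Alarm only when the window entropy falls below the threshold AND the
    # packet count exceeds the allowable level, ruling out transient dips
    # caused by extraneous factors.
    return entropy < H_THRESHOLD and n_packets > pkt_limit

print(is_attack(0.33, 400), is_attack(0.04, 5000))  # False True
```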
3 Network Setup and Result 3.1 Simulation Scenarios In this part of the paper, we present, step by step, a simulation of our DoS/DDoS attack detection method. The proposed algorithm is applied on the OpenDayLight controller, and the network model is built in GNS3.
3.2 Network Setup The model is built in GNS3 with three distinct network areas, namely the attack network, the ISP network and the SDN network. In the SDN network, there are a controller (OpenDayLight), an Open vSwitch and three hosts corresponding to the DMZ, server farm and local partitions. They are all connected based on the model shown in Fig. 6. The network topology includes three areas, namely (1) the attack network, the network partition of the attacker; (2) the ISP network, the network partition of the Internet provider; and (3) the SDN network, the network partition of the enterprise network.
3.3 Result In this research, we tested normal and attack traffic, both categorized into two cases by taking the same set of sample sizes (Y) and the same number of sampling repetitions. The attack/normal network traffic was generated by Scapy with randomly generated IP sources and random traffic serving the test (Fig. 7). For each calculation of the entropy value, we use the getBackData.py file to enter the information related to the sampling number (−c), the number of IPs received each time (−w), the sampling time (−t) and arbitrary traffic (−n). (1) Normal traffic When the network traffic is normal, traffic moves steadily, so the controller calculates the entropy value for each packet window and does not detect any abnormality throughout the entire experiment; the values are always higher than the attack threshold (Fig. 8a).
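The traffic generator itself can be only a few lines of Scapy. The sketch below sends spoofed-source UDP packets toward a hypothetical victim address (requires root privileges); it does not reproduce the getBackData.py options (−c, −w, −t, −n) used in the experiments, and the addresses and counts are placeholders.

```python
# Requires scapy (pip install scapy) and root privileges to send packets.
from scapy.all import IP, UDP, RandIP, send

def flood(dst, count=500):
    # Spoofed-source UDP flood toward the victim, one packet at a time.
    for _ in range(count):
        pkt = IP(src=str(RandIP()), dst=dst) / UDP(dport=80)
        send(pkt, verbose=0)

flood("10.0.0.5", count=100)   # hypothetical victim address
```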
Fig. 6 Network topology
Fig. 7 Entropy calculator
Fig. 8 Comparison of entropy values between normal traffic (a) and attack traffic (b)
(2) Attack traffic Under attack traffic, the entropy value calculated by the controller is always lower than the threshold; this is combined with checking whether the number of packets exceeds the allowable level, in order to exclude cases where the network encounters extraneous factors that distort the flow. For both the normal and attack traffic cases, the number of sampling rounds and the number of IPs must be the same to achieve the most reliable results (Fig. 8b). (3) Classification of DDoS and DoS types After detecting the attack and gaining information about the malicious packets, in conjunction with the initially defined flow entries, it is possible to classify the attack patterns according to their characteristics (SYN Flood, UDP Flood, ACK Flood, …) and information on the date, time, port and packet size.
4 Conclusion In this paper, we propose a mechanism to detect different types of DDoS or DoS attacks in SDN using entropy to calculate the randomness level of the packet sizes. Both our simulated attacks and our data collection are close to actual business models used nowadays. We can also classify the attacks into types and provide the attack information as a premise for the subsequent blocking steps (Fig. 9).
5 Future Work Our proposed method can be used as a base for several future developments. First of all, the combination of some monitoring tools into the application layer may help the system automate the whole process of evaluating incoming traffic. Secondly, once an attack is classified by our method, we can develop prevention methods based on the type of attack. Besides, to make this model more robust and efficient, we can investigate possible attacks on multiple controllers at the same time across different locations. Furthermore, we can also develop several new ways to mitigate those attacks.
Fig. 9 Attack type classification
References
1. Polat H, Polat O, Cetin A (2020) Detecting DDoS attacks in software-defined networks through feature selection methods and machine learning models. Sustain 12(3)
2. Andres P et al (2015) FlowFence: a denial of service defense system for software defined networking. In: Proceedings of the global information infrastructure networking symposium, pp 1–6
3. Seungwon S et al (2013) Avant-guard: scalable and vigilant switch flow management in software-defined networks. In: Proceedings of the 2013 ACM SIGSAC conference on computer and communications security, ACM, pp 413–424
4. Chen K et al (2016) SDNShield: towards more comprehensive defense against DDoS attacks on SDN control plane. In: 2016 IEEE conference on communications and network security (CNS), IEEE, pp 28–36
5. Moustafa N (2016) The evaluation of network anomaly detection systems: statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set. Inf Secur J: A Global Perspect 25(1–3):18–31
6. Lopez AD, Mohan AP, Nair S (2019) Network traffic behavioral analytics for detection of DDoS attacks. SMU Data Sci Rev 2(1)
7. Khare M, Oak R (2020) Real-time distributed denial-of-service (DDoS) attack detection using decision trees for server performance maintenance. In: Performance management of integrated systems and its applications in software engineering, Springer, Singapore, pp 1–9
8. Sharma A, Agrawal C, Singh A, Kumar K (2020) Real-time DDoS detection based on entropy using Hadoop framework. In: Computing in engineering and technology, Springer, Singapore, pp 297–305
9. Nooribakhsh M, Mollamotalebi M (2020) A review on statistical approaches for anomaly detection in DDoS attacks. Inf Secur J: A Global Perspect, 1–16
10. Riad AM, Elhenawy I, Hassan A et al (2013) Visualize network anomaly detection. 5(5)
11. Idhammad M, Afdel K, Belouch M (2017) DoS detection method based on artificial neural networks. 8(4)
12. Akhunzada A, Ahmed E, Gani A, Khan MK, Imran M (2015) Securing software defined networks: taxonomy, requirements, and open issues. Malaysia, p 39
13. Gu G, Shin S (2012) CloudWatcher: network security monitoring using OpenFlow, pp 1–6
Functional Encryption of Floating-Point ALU Using Weighted Logic Locking for Enhanced Security Bihag Rajeev and N. Mohankumar
Abstract Today, almost all devices are manufactured with an arithmetic and logic unit (ALU). A floating-point ALU is used for performing various fundamental operations. It is preferred over other ALU subsystems since floating-point numbers represent both small and large numbers. Apart from this, several other operations like shifting and rotating are also executed by the ALU. These ALU systems can be easily manipulated, and it is very difficult to prevent unauthorized usage; vital information can be leaked due to piracy issues. Hence, it is necessary to encrypt the ALU architecture with a logic locking block in order to provide safety to the circuit. This paper proposes a technique for a floating-point ALU with enhanced security called weighted logic locking. This technique offers high security when applied to circuits, and it can be combined with other methods to reduce problems like key sensitization attacks. The weighted logic locking block is placed in the floating-point ALU circuit by determining the nets with maximum fault impact. The results show that there is no significant increase in the area and power overheads of the encrypted circuit over the floating-point ALU subsystem. Keywords Weighted logic locking · Floating-point ALU · Hardware security · Design for security
1 Introduction The increase in the cost of IC manufacturing has led to several activities like IP piracy, overbuilding, reverse engineering and hardware Trojans. In order to overcome these challenges, some techniques have been developed, namely IC metering, watermarking, IC
Fig. 1 Workflow of the proposed architecture
camouflaging and logic locking. Compared to all of these, logic locking is one of the simplest techniques commonly used to overcome the threats. The floating-point format has three basic fields, namely sign, exponent and mantissa. The sign bit is a one-bit field. The exponent has 11 bits in the double precision format. For single precision, the bias value is 127, and the double precision bias value is 1023. The mantissa is very significant: it has 23 bits in single and 52 bits in double precision. The ALU is implemented using different hardware modules to perform the various arithmetic and logical operations. In Fig. 1, the important modules include the exponent difference module, right shifter, 2’s complement adder, leading-one detector, left shifter and rounding module. Logic locking is one of the popular techniques against these threats, where additional logic gates are inserted in the circuit together with a secret key. A locking mechanism is provided through the keys, and the locked circuits contain key inputs. The additional key gates may be XORs or LUTs. Without the correct secret keys applied to the logic locking block, the circuit produces incorrect outputs and the data cannot be recovered. The emerging developments in this technique focus on defense against SAT-based and SCAN-based attacks. Weighted logic locking is a technique used for increasing the hamming distance to resist key sensitization attacks. It is one of the strong methods that achieves a 50% hamming distance. In this method, the circuit is locked by means of multiple keys. By inserting a larger number of key gates, it becomes hard for the attacker to detect the secret key and unlock the circuit. The fault-analysis-based locking technique reduces the high execution time for placing the keys. FA-based techniques are highly recommended since they are not vulnerable to key sensitization attacks. The testing methods relate the output of invalid keys to that of the correct keys. In weighted logic locking, a random key bit propagates through the control gate only when the other key bits are controlled. The floating-point ALU subsystem performs the basic operations on 32-bit floating-point numbers. Optimized area and power overheads are obtained when the weighted logic locking encryption scheme is incorporated into the floating-point ALU. The output corruption is measured in terms of hamming distance for different combinations of key bits, and the area and power overheads are validated.
2 Related Works Karousos et al. proposed a new technique called weighted logic locking [1]. It involves multiple key inputs for controlling the key gates. This method provides a remedy against path sensitization attacks, and the execution time is considerably reduced compared to conventional logic locking. The approach is highly generic and can be combined with other methods. The vulnerabilities of such techniques are discussed by Roy et al. in [2]. The overhead requirements of EPIC in delay and power are negligible, and the experimental results show that EPIC is resilient to various attacks and piracy. Krishnan et al. explain how weighted logic locking can be used to increase the hamming distance to provide immunity against key sensitization attacks [3]. Weighted logic locking employs multiple keys to lock the circuit compared to conventional logic locking, and the key gates are inserted at the locations with the maximum fault impact. Rajendran et al. explain that output corruption depends on the combination of keys and the placement of key gates [4]; hence, a fault-analysis-based technique is used to find the locations to place the key gates, and the output corruption obtained with this technique is higher. The concept of SARLock is explained in detail in [5] by Yasin et al. Many attempts have been made to break logic locking in order to extract the secret keys; to avoid such discrepancies, a new logic locking technique called SARLock is proposed. Yasin et al. [6] show that attackers can decode a locked netlist in a linear fashion by sensitizing the key bits, and a new metric is introduced to improve logic locking. Plaza et al. give a remedy for the third-shift problem in IC piracy [7]: the test response allows IC testing before activation by an untrusted agency, providing an effective logic locking method based on a multiplexer-based locking strategy. Azar et al. provide a concept of logic locking which can be applied after implementing the secret keys [8]; the proposed logic locking scheme safeguards the intellectual property present in the circuit and prevents unwanted activities against ICs. The design of obfuscated circuits for DSP applications using high-level transformations is presented by Lee et al. [9], where the order of the filter can be reconfigured using multiple meaningful modes.
3 Methodology The floating-point ALU performs arithmetic and logical operations on the 32-bit inputs (a and b). To improve the security of the ALU subsystem, the weighted logic locking encryption scheme is inserted. The encrypted ALU architecture and the adopted methods are described as follows.
3.1 Design of the Floating-Point ALU Architecture The two operands N1 and N2 are the inputs. They are read and checked for denormalization and infinity. After the process of denormalization, a flag is set to 0 or 1. The fraction parts are extended to 24 bits after setting up the binary bit values. Comparison of the exponent values e1 and e2 is done using 8-bit subtraction; if e2 is greater than e1, N1 and N2 are swapped. The right-shift operation is performed on the smaller fraction, f2. The 2’s complement is taken and f2 is replaced after getting the result of the subtraction. Thus, the two numbers possess the same exponent value. The signs indicate which operation is being performed. In the case of subtraction, the bits present in f2 are inverted and the fractions are added using 2’s complement arithmetic. If the sum of the operation is a negative number, it is inverted and an extra 1 is added to get the result. The process of normalization is initiated by passing the result to the leading-one detector. Based on the result from the detector, the subsequent steps involve shifting the result to the left. The results are rounded and the exponents are adjusted. Figure 2 shows the single precision floating-point ALU.
Fig. 2 Flowchart of the 32-bit floating-point ALU
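For reference, the sign/exponent/fraction unpacking performed by the ALU front end corresponds to the following Python sketch for IEEE-754 single precision; the function name and the hidden-bit handling are our own illustration, not the paper's hardware description.

```python
import struct

def fields(x):
    # Unpack an IEEE-754 single-precision value into sign, biased exponent
    # and 23-bit fraction, as consumed by the ALU front end.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign     = bits >> 31
    exponent = (bits >> 23) & 0xFF          # biased by 127 in single precision
    fraction = bits & 0x7FFFFF
    # The hidden leading 1 of normalized numbers extends the fraction to 24 bits.
    mantissa24 = fraction | (0x800000 if exponent != 0 else 0)
    return sign, exponent, mantissa24

print(fields(1.5))   # (0, 127, 12582912) i.e. mantissa 0xC00000
```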
3.2 Weighted Logic Locking in the C17 Circuit The key idea of weighted logic locking can be illustrated using the C17 benchmark circuit. There are five input nets, two output nets and six NAND gates. The input nets are 1, 2, 3, 6 and 7, and the output nets are 22 and 23. The intermediate nets are 10, 11, 16 and 19, as shown in Fig. 3a. Figure 3b represents the C17 benchmark circuit encrypted using weighted logic locking for a key bit size of 2. A two-input bubbled NAND gate forms the control gate, and a two-input XOR gate forms the key gate. The encryption logic is inserted at net 16 of the benchmark circuit: the fault simulation for C17 is performed, and net 16 is found to have the maximum fault impact. Depending on the combination of the key bits, incorrect outputs are generated. The truth table for the different combinations of inputs for the CUT is represented in Table 1. For the key bit combination K1K2 = 00, the output of the encrypted circuit is the same as that of the benchmark circuit, generating zero output corruption; hence, this forms the correct key combination. The key bit combinations K1K2 = 01, K1K2 = 10 and K1K2 = 11 are the incorrect key combinations. For these, the encrypted circuit generates corrupted outputs, and a hamming distance of 50% is achieved.
Fig. 3 a C17 benchmark circuit, b Encrypted C17 benchmark circuit
Table 1 True value outputs and the encrypted outputs obtained for the input combinations of the C17 benchmark circuit

Input of the circuit under test | Key gate outputs | Encrypted circuit | Benchmark circuit
Net1 | Net3 | Net6 | Net2 | Net7 | K1 | K2 | Net 22 | Net 23 | Net 22 | Net 23
1    | 0    | 0    | 1    | 1    | 0  | 0  | 1      | 1      | 1      | 1
0    | 0    | 1    | 0    | 0    | 0  | 0  | 1      | 0      | 1      | 1
1    | 0    | 1    | 1    | 1    | 0  | 1  | 0      | 0      | 1      | 0
0    | 1    | 1    | 1    | 1    | 0  | 1  | 1      | 1      | 1      | 0
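The key-gate behavior summarized above can be sketched in a few lines of Python, assuming the standard ISCAS-85 connectivity for C17 and reading the "bubbled NAND" control gate as the OR of the two key bits (so net 16 is inverted for every key combination except K1K2 = 00); the variable names and the test vector are ours.

```python
def nand(a, b):
    return 1 - (a & b)

def c17_locked(n1, n2, n3, n6, n7, k1, k2):
    # Standard C17 netlist; the control gate OR(k1, k2) drives an XOR key
    # gate on net 16, so net 16 is inverted unless k1 = k2 = 0.
    n10, n11 = nand(n1, n3), nand(n3, n6)
    n16 = nand(n2, n11) ^ (k1 | k2)
    n19 = nand(n11, n7)
    return nand(n10, n16), nand(n16, n19)   # output nets 22, 23

# One test vector under all four key combinations.
for k1, k2 in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(k1, k2, c17_locked(1, 1, 0, 0, 1, k1, k2))
# Only K1K2 = 00 reproduces the benchmark outputs (1, 1); the other three
# combinations corrupt one of the two outputs, the 50% hamming distance.
```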
3.3 Encryption of the Floating-Point ALU by Weighted Logic Locking The main aim of incorporating the weighted logic locking circuit into the floating-point ALU is to provide secured access. The proposed floating-point ALU architecture contains the locking system. Fault simulation is done for the ALU, and the key gate outputs are provided to the module that performs the arithmetic operations. The basic operations of the ALU subsystem are performed, and the outputs are obtained depending on the key bit configurations.
4 Results and Discussion The encrypted 32-bit floating-point ALU with a key bit size of two performs its functions depending on the combination of key bits. The correct outputs are generated for the key bit combinations K1K2 = 00 and K1K2 = 11. The output corruption is measured using the hamming distance.
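The hamming distance used here is simply the fraction of differing bits between two 32-bit output words. As a sanity check, the short sketch below (our own illustration) reproduces the 93.75% figure for the correct and corrupted outputs reported in the functional verification later in this section.

```python
def hamming_pct(a, b, width=32):
    # Percentage of differing bits between two output words.
    return bin(a ^ b).count("1") / width * 100

correct, corrupted = 3222798335, 1072693248   # outputs from Figs. 4 and 5
print(hamming_pct(correct, corrupted))        # 93.75
```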
4.1 Power Profile See Table 2.

Table 2 Power profile for the proposed architecture

Block                        | Cell internal power | Net switching power | Total dynamic power | Cell leakage power
Encrypted floating-point ALU | 210.0822 uW         | 114.4829 uW         | 324.5651 uW         | 2.1765 mW
Floating-point ALU           | 198.43 uW           | 107.911 uW          | 302.01 uW           | 2.023 mW
Percentage difference        | 5.7                 | 5.91                | 7.2                 | 7.32

The cell internal power of the encrypted floating-point ALU is increased by 5.7%, the net switching power by 5.91%, the total dynamic power by 7.2% and the cell leakage power by 7.32%.
Table 3 Area parameters for the proposed method

Block                        | Combinational area | Non-combinational area | Net interconnect area
Encrypted floating-point ALU | 9274.060776        | 5673.369545            | 925.990717
Floating-point ALU           | 8928.087153        | 5237.290815            | 882.445832
Percentage difference        | 3.731              | 7.686                  | 4.703
4.2 Area Profile The total area of the encrypted floating-point ALU is 15873.421038, while the total area of the original floating-point ALU is 15047.8238, an increase of about 5.49%. See Table 3.
4.3 Functional Verification Two 32-bit inputs are tested for different combinations of key bits, and the corresponding output signals for the different modules are obtained. Based on the input key combination, correct and incorrect outputs are generated. In Fig. 4, two 32-bit input values are provided for a and b. The values are a = 32’d1066401792 and b = 32’d1061158912, and the output value is out = 32’d3222798335. The 32-bit inputs are tested for all the combinations of key inputs K1K2 = 00, 01, 10, 11 in regions 1, 2, 3 and 4, respectively, in which different outputs are obtained for different keys; the functionality is executed correctly for the input key bit combination K1K2 = 00 in region 1. In Fig. 5, the values are a = 32’d1066401792 and b = 32’d1061158912, and the output value is out = 32’d1072693248.
Fig. 4 Simulation output for 32-bit ALU
Fig. 5 Simulation output for 32-bit ALU
The 32-bit inputs are tested for all the combinations of key inputs K1K2 = 00, 01, 10, 11 in all the regions. Different outputs are obtained for different keys; the functionality is executed incorrectly for the input key bit combination K1K2 = 01 in region 2. Maximum output corruption is obtained, and a hamming distance of 93.75% is achieved. In Fig. 6, the values are a = 32’d1066401792 and b = 32’d1061158912, and the output value is out = 32’d1072693248. The 32-bit inputs are tested for all four combinations of key inputs K1K2 = 00, 01, 10, 11. Different outputs for different keys are obtained, and the functionality is executed incorrectly for the input key bit combination K1K2 = 10 in region 3. Maximum output corruption is obtained, and a hamming distance of 93.75% is achieved. In Fig. 7, a = 32’d1066401792 and b = 32’d1061158912, and the output value is out = 32’d3222798335. The 32-bit inputs are tested for all the combinations of key inputs K1K2 in all the regions. Different outputs
Fig. 6 Simulation output for 32-bit ALU
Fig. 7 Simulation output for 32-bit ALU
are obtained for the different keys, of which the functionality is executed correctly for the input key bit combination K1K2 = 11 in region 4.
5 Conclusion A 32-bit floating-point ALU subsystem encrypted using the weighted logic locking method is proposed in this paper. The fault impact metric is employed to determine the location of key gate insertion to protect the subsystem. The functional waveforms for different key bit combinations are obtained, and the output corruption is determined using the hamming distance. The secured single precision floating-point ALU subsystem incurs only small area and power overheads compared to the unprotected floating-point ALU. Future work involves evaluating the security metrics of the system by expanding the key bit size of the encryption scheme.
References
1. Karousos N, Pexaras K (2017) Weighted logic locking: a new approach for IC piracy protection. In: 23rd international IEEE symposium on on-line testing and robust system design, pp 221–226
2. Roy JA, Koushanfar F, Markov IL (2010) Ending piracy of integrated circuits. Computer 43(10):30–38
3. Krishnan S, Mohankumar N, Nirmala Devi M (2019) Weighted logic locking to increase hamming distance against key sensitization attack. In: Proceedings of the 3rd international conference on electronics, communication and aerospace technology (ICECA)
4. Rajendran J et al (2015) Fault analysis-based logic encryption. IEEE Trans Comput 64(2):410–424
5. Yasin M, Mazumdar B (2016) SARLock: SAT attack resistant logic locking. In: 2016 IEEE international symposium on hardware oriented security and trust (HOST), pp 236–241
6. Yasin M, Rajendran JJ, Karri R (2016) On improving the security of logic locking. IEEE Trans Comput Aided Des Integr Circuits Syst 35(9):1411–1424
7. Plaza SM, Markov IL (2015) Solving the third-shift problem in IC piracy with test-aware logic locking. IEEE Trans Comput Aided Des Integr Circuits Syst 34(6):961–971
8. Azar KZ, Kamali HM (2019) Threats on logic locking: a decade later
9. Lee YW, Touba NA (2015) Improving logic obfuscation via logic cone analysis. In: 16th Latin-American test symposium (LATS 2015)
Nonnegative Feature Learning by Regularized Nonnegative Matrix Factorization Viet-Hang Duong, Manh-Quan Bui, and Jia-Ching Wang
Abstract This paper investigates a feature learning problem, where localized and part-based representations are embedded in the dimension-reduction process. We propose a subspace learning model based on the nonnegative matrix factorization (NMF) framework. The proposed NMF model wisely incorporates a spatial location information constraint on the base matrix and manifold and max-margin penalties on the projective feature matrix, enhancing the capability of exploiting richer information from the original data. Experiments on a facial expression recognition scenario reveal the powerful data representation ability of the proposed NMF method, which enables dictionary learning and feature learning jointly. Keywords Feature learning · Supervised learning · Dimensionality reduction · Nonnegative matrix factorization · Facial expression recognition
1 Introduction In data mining and computer vision, the main target of feature learning techniques is to automatically discover the representations needed to better analyze and exploit raw data. Many feature learning approaches have been proposed: linear, nonlinear, supervised, and unsupervised. Matrix factorization, or low-rank approximation, learns the intrinsic structure of data and is widely applied for representation learning. Suppose that the input data sample size is M, and samples are represented as N-dimensional vectors. The general idea of matrix factorization methods for feature learning [1–6] is to jointly learn a dictionary and the projections of the data on this dictionary, and it has the following form: D ≈ BF
(1)
V.-H. Duong (B) · M.-Q. Bui · J.-C. Wang BacLieu University, BacLieu, Vietnam e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_5
The dictionary B constitutes a K-dimensional feature space (K ≪ min(N, M)) and a set of basis vectors Bi, each representing certain aspects of the data. For both unsupervised and supervised techniques, the dictionary is learned only on a training set of the data. The learned data representations F are obtained by projecting the data on the set of learned basis vectors in the dictionary. Thus, the core problem of matrix factorization for feature learning is what kind of B and F we need, and how to achieve them. Principal component analysis (PCA) [1] is a statistical procedure that decomposes a multivariate dataset into an uncorrelated orthogonal basis set B that explains the maximum amount of the variance. Similar to PCA, nonnegative matrix factorization (NMF) [3–5] is a dimension-reduction technique based on a low-rank approximation of the feature space. PCA and NMF optimize for different results: PCA finds a subspace that conserves the variance of the data, while NMF extracts nonnegative features. In this paper, we design an extended NMF model applied to feature learning tasks, referred to as max-margin SVM joint NMF with spatial dispersion and graph constraints (SSGNMF). The main contributions of this paper are as follows: • A new supervised feature learning framework is proposed that seamlessly utilizes spatial, label, and manifold information to alleviate the adverse effect of irrelevant or less important features in the NMF approach; • A new NMF loss objective function is formulated, which incorporates the spatial location constraint on the base matrix and the graph and max-margin penalties on the projective feature matrix, enhancing the capability to exploit various information in the original data; • Comprehensive experiments are conducted to empirically analyze feature learning, particularly in facial expression recognition. The experimental results demonstrate that our work outperforms other methods. The paper is organized as follows. In Sect. 2, we review general Frobenius-norm NMF works. The regularizations and their integration into the dictionary learning and recognition problem are described in Sect. 3; a projected gradient descent algorithm is also given in this part. Experiments are presented and analyzed in Sect. 4. Finally, this paper is concluded in Sect. 5.
2 Feature Learning with Nonnegative Matrix Factorization Given a nonnegative matrix D = [D1, D2, …, DM] ∈ R+^{N×M}, whose columns are N-dimensional feature vectors of M examples, NMF decomposes the input data matrix D into two low-rank nonnegative matrices B = [B1, B2, …, BK] ∈ R+^{N×K} and F = [F1, F2, …, FM] ∈ R+^{K×M} such that D ≈ BF. Here, the dictionary matrix B contains K basis vectors, and F is an activation coefficient matrix derived by projecting the input samples on the learned subspace.
Nonnegative Feature Learning by Regularized …
49
The common Frobenius norm is used for the loss function to minimize the fitting error [3, 4]. The corresponding NMF framework solves the following optimization problem:

$$\min_{B,\,F_i} \sum_{i=1}^{M} \|D_i - B F_i\|^2 \quad \text{s.t. } B \ge 0,\ F_i \ge 0,\ i = 1, 2, \ldots, M \tag{2}$$
The most famous solutions of Eq. (2) are the standard iterative update rules [3]:

$$B_{nk} \leftarrow B_{nk} \frac{(D F^{T})_{nk}}{(B F F^{T})_{nk}}; \qquad F_{km} \leftarrow F_{km} \frac{(B^{T} D)_{km}}{(B^{T} B F)_{km}} \tag{3}$$
As mentioned, feature learning using matrix factorization techniques has the advantage of jointly learning the dictionary and the projections. With NMF, an N-dimensional nonnegative observation vector Dm from the original feature space is explained as a linear combination of K dictionary elements Bk. In other words, each nonnegative Bk activates a part, and all Bk together co-activate the part-based representation of the whole Dm, with Fm indicating the weight of each part. In the document domain, the parts can be local characteristics like semantic topics, while they could be eyes, mouth, and nose in facial data.
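As a concrete reference, the update rules of Eq. (3) translate directly into a few lines of NumPy. The following is a minimal sketch, not code from the paper; the small constant eps guarding against division by zero is our own addition.

```python
import numpy as np

def nmf_multiplicative(D, K, n_iter=200, eps=1e-9, seed=0):
    """Frobenius-norm NMF via the multiplicative updates of Eq. (3).
    D is the (N, M) nonnegative data matrix; returns B (N, K), F (K, M)."""
    rng = np.random.default_rng(seed)
    N, M = D.shape
    B = rng.random((N, K))
    F = rng.random((K, M))
    for _ in range(n_iter):
        B *= (D @ F.T) / (B @ F @ F.T + eps)   # B_nk <- B_nk (DF^T)/(BFF^T)
        F *= (B.T @ D) / (B.T @ B @ F + eps)   # F_km <- F_km (B^T D)/(B^T BF)
    return B, F
```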
3 Proposed Method 3.1 Formulation Here, we aim at building an objective function of four terms: the reconstruction error under the Frobenius norm, the SVM classification loss and the graph regularizer on the coefficient matrix F, and the pixel dispersion penalty on the basis matrix B. It has been demonstrated that finding a subspace that preserves the manifold structure of the original space can be effectively modeled through a nearest-neighbor graph over the data points [5, 7]. We construct an undirected weighted graph G = {D, E} from a given set of M training samples {D1, D2, …, DM} ⊂ R^N, where each edge Eij is assigned a weight based on the similarity of the pair of vertices (Di, Dj). In the same way as Deng et al. [5, 7], we incorporate the Laplacian matrix L = A − E, where A is the diagonal degree matrix with Aii = Σj Eij, to form the graph regularization term Trace(FLF^T) on F; a sketch of the graph construction is given after Eq. (7). How to discover intra-data components and exploit their structure information is of significant interest and importance for complex data analysis. Motivated by the work of [8, 9], we also impose a spatial locality constraint on the bases via the pixel dispersion penalty. The dispersion degree of the nonzero values on each basis vector Bi is measured by the following criterion:
$$B_i^{T} \left[ \sum_{a,a'=1}^{w} \sum_{b,b'=1}^{h} l([a,b],[a',b']) \times t_{a,b}\, t_{a',b'}^{T} \right] B_i = B_i^{T} T B_i \tag{4}$$

where the dispersion kernel matrix

$$T = \sum_{a,a'=1}^{w} \sum_{b,b'=1}^{h} l([a,b],[a',b']) \times t_{a,b}\, t_{a',b'}^{T} \tag{5}$$
and l([a,b],[a',b']) is an association function between two coordinate pairs that measures the distance between them. With h the height of the image, the indicator vector t_{a,b} is defined such that:

$$t_{a,b}(i) = \begin{cases} 1, & \text{if } i = (a-1)h + b \\ 0, & \text{otherwise} \end{cases} \tag{6}$$
Let {D_i, Y_i}_{i=1}^{M} denote a set of data vectors and their corresponding labels, where D_i ∈ R^N and Y_i ∈ {−1, 1}. We aim at building a common dictionary matrix B that can be used to extract features and is optimal under a max-margin classification criterion. This is accomplished by imposing constraints on the feature vectors derived from B. In this work, the features extracted from a data example D are given by B^T D; that is, they are the projections of the data examples D on the basis vectors stored in B. Then, the optimization problem is given by:

$$\min_{B, F_i, (v,c), \xi_i} \sum_{i=1}^{M} \|D_i - B F_i\|^2 + \eta \left( \frac{1}{2}\|v\|^2 + c \sum_{i=1}^{M} \xi_i \right) + \mu\, \mathrm{Trace}\!\left(F L F^{T}\right) + \lambda\, \mathrm{Trace}\!\left(B^{T} T B\right)$$
$$\text{s.t. } B \ge 0,\ F_i \ge 0,\ \xi_i \ge 0,\ Y_i\!\left(v^{T} F_i + c\right) \ge 1 - \xi_i,\ i = 1 \ldots M \tag{7}$$
where η, μ and λ are used to balance the trade off between four terms.
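To make the graph term concrete, the Python sketch below builds a k-nearest-neighbor similarity graph E and the Laplacian L = A − E used in Trace(FLF^T). The heat-kernel weighting and the choice k = 5 are our assumptions; the paper requires only some similarity-based edge weights.

```python
import numpy as np

def graph_laplacian(D, k=5, sigma=1.0):
    """L = A - E for the columns of D (the M training samples)."""
    M = D.shape[1]
    sq = ((D[:, :, None] - D[:, None, :]) ** 2).sum(axis=0)  # pairwise dists
    E = np.exp(-sq / (2 * sigma ** 2))                       # heat kernel
    np.fill_diagonal(E, 0.0)
    drop = np.argsort(-E, axis=1)[:, k:]                     # keep k strongest
    for i in range(M):
        E[i, drop[i]] = 0.0
    E = np.maximum(E, E.T)                                   # symmetrize
    A = np.diag(E.sum(axis=1))                               # A_ii = sum_j E_ij
    return A - E
```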
3.2 Optimization Algorithm We obtain a locally optimal solution with an iterative algorithm that alternately updates F and B and then the SVM variables (v, c) and ξ. Step 1: Fixing the coefficient matrix F and the basis matrix B, the projection vector and slack variables are updated. The SSGNMF problem reduces to the standard binary soft-margin SVM classification:
$$\min_{(v,c), \xi_i} \eta \left( \frac{1}{2}\|v\|^2 + c \sum_{i=1}^{n} \xi_i \right) \quad \text{s.t. } \xi_i \ge 0,\ Y_i\!\left(v^{T} F_i + c\right) \ge 1 - \xi_i,\ i = 1 \ldots M \tag{8}$$
Step 2: When the other variables are fixed, the optimization of the coefficient matrix is transformed as follows:

$$\min_{F_i} \sum_{i=1}^{M} \|D_i - B F_i\|^2 + \mu\, \mathrm{Trace}\!\left(F L F^{T}\right)$$
$$\text{s.t. } F_i \ge 0,\ \xi_i \ge 0,\ Y_i\!\left(v^{T} F_i + c\right) \ge 1 - \xi_i,\ i = 1 \ldots M \tag{9}$$
Step 3: Update the basis matrix. The model is transformed into a nonnegative matrix factorization with a spatial constraint:

$$\min_{B} \sum_{i=1}^{M} \|D - B F\|^2 + \lambda\, \mathrm{Trace}\!\left(B^{T} T B\right) \quad \text{s.t. } B \ge 0 \tag{10}$$
Due to the nonnegative constraints, a projected gradient descent method is adopted to solve it.
$$B^{(t+1)} = \max\!\left(0,\ B^{(t)} - \vartheta^{(t)} \nabla_B \left[ \sum_{i=1}^{M} \|D - B^{(t)} F^{(t)}\|^2 + \lambda\, \mathrm{Trace}\!\left(B^{(t)T} T B^{(t)}\right) \right] \right) \tag{11}$$

Herein, the gradient of Eq. (10) is

$$\nabla_B \left[ \sum_{i=1}^{M} \|D - B F\|^2 + \lambda\, \mathrm{Trace}\!\left(B^{T} T B\right) \right] = B F F^{T} - D F^{T} + \lambda T B \tag{12}$$
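A minimal sketch of this projected gradient step in Python follows; the constant step size ϑ is an illustrative choice, since the excerpt does not specify a step-size schedule.

```python
import numpy as np

def update_B(B, F, D, T, lam=0.1, step=1e-3, n_iter=50):
    """Projected gradient descent for Eq. (10), using Eqs. (11)-(12)."""
    for _ in range(n_iter):
        grad = B @ F @ F.T - D @ F.T + lam * (T @ B)  # gradient, Eq. (12)
        B = np.maximum(0.0, B - step * grad)          # projection, Eq. (11)
    return B
```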
4 Experiments The proposed algorithm is used for latent representation learning and tested on the facial expression recognition task with two datasets: the Japanese Female Facial Expression (JAFFE) dataset [10] and the extended Cohn-Kanade (CK+) dataset [11]. Figure 1 presents original images from the JAFFE dataset and the CK+ dataset. The recognition phase is based on the minimum Euclidean distance between the extracted feature vectors of the test facial expression images. The proposed SSGNMF is matched against basic NMF [2], sparseness-regularized NMF (SNMF) [12], manifold NMF (GNMF) [7], and NMF with spatial
Fig. 1 The six “basic” facial expressions (happiness, sadness, surprise, anger, disgust, fear) image from the JAFFE dataset [10] (the first row) and the CK+ dataset [11] (the second row)
dispersion penalty (spatialNMF) [8], supervised robust nonnegative graph embedding (supRNGE) [13], and a matrix factorization model on the complex domain (siCMF) [6]. For the hyperparameters of the proposed SSGNMF, the three values governing the weight of the max-margin constraint η, the strength of the graph regularization μ, and the balance of the pixel dispersion penalty λ are empirically set in the range [0.1, 1].
4.1 Experiments on the JAFFE Dataset The JAFFE dataset consists of 219 facial images, and all of them were used in the experiments. One image of each expression per person was taken randomly to set up the training set, and the remaining images were used in the testing phase. It is visible from Table 1 that the proposed SSGNMF model outperforms the baseline NMF methods, and the highest recognition rate achieved is 79.19%. There is the same trend between the number of training images and the accuracy rate in most of the compared algorithms.

Table 1 Obtained results (%) using the JAFFE dataset with different subspace dimensionalities (no. of basis)

Method        20      40      60      80      100
siCMF         64.85   63.27   62.89   60.30   60.97
supRNGE       47.67   54.66   60.68   64.93   71.37
spatialNMF    61.23   62.60   69.45   71.08   70.71
graphNMF      10.96   12.33   10.96   10.96   31.51
sparseNMF     58.49   60.97   67.40   66.16   67.12
NMF           58.90   58.49   66.58   18.49   10.27
SSGNMF        63.01   64.06   70.34   71.82   72.19
Table 2 Obtained results (%) using the CK+ dataset with different subspace dimensionalities (no. of basis)

Method        20      40      60      80      100
siCMF         86.72   85.54   80.65   79.00   75.08
supRNGE       70.39   70.74   74.74   74.68   74.35
spatialNMF    84.90   90.13   91.13   90.44   93.00
graphNMF      23.42   23.42   26.45   31.68   49.59
sparseNMF     85.04   90.70   90.25   92.45   93.17
NMF           61.64   65.48   68.09   76.58   78.25
SSGNMF        83.01   90.15   93.95   94.00   96.72
The exceptions are the overfitting problem of the basic NMF and the matrix factorization on the complex field (siCMF).
4.2 Experiments on CK+ Dataset The CK+ dataset contains 593 video sequences from 123 subjects showing distinct facial expressions. For each expression of a subject, the last five frames of the video are selected and treated as static images. From these images, the testing set and training set were built with a ratio of 1:4. It is observed that the CK+ dataset is less challenging than the JAFFE dataset. Once again, our designed framework shows the best performance and reaches an average recognition rate of 91.57%, followed by sparseNMF and spatialNMF at 90.32% and 89.92%, respectively (Table 2).
4.3 Confusion Matrix on JAFFE Images The computed confusion matrix of the SSGMF algorithm clarifies which facial expressions are more difficult to recognize. The results are given in Table 3.

Table 3 Confusion matrix (%) of 6-class facial expression recognition of the proposed SSGMF

Expression    Disgust   Fear    Anger   Sadness   Happiness   Surprise
Disgust       67.93     5.07    6       0         4.56        16.44
Fear          4.5       46.02   0       4.5       27.85       17.13
Anger         18.72     3.25    65.58   0         4.12        8.33
Sadness       0         8.22    0       90.08     0           1.7
Happiness     0         8.17    0       15.48     67.53       8.82
Surprise      9.55      3.32    9.11    20.13     0           57.89
Because it is highly confused with happiness and disgust, the fear expression has the lowest recognition accuracy, namely 46.02%. Similarly, surprise is affected by most of the other emotions and reaches only just over fifty percent accuracy. In contrast, sadness is classified well, with the highest accuracy of 90.08%.
5 Conclusion In this paper, a regularized nonnegative matrix factorization model (SSGNMF) is introduced for feature learning. The pixel dispersion-based constraint on the basis matrix is employed to discover intra-sample components, especially the statistical relationship and spatial distance between features. Discrimination and local structure preservation are obtained through the graph regularization and max-margin criteria on the coding matrix. As a result, a better-localized dictionary and a more interpretable representation are yielded. The experimental results confirm that the proposed SSGMF algorithm can substantially outperform the popular NMF algorithms for facial expression recognition. In future work, to further confirm the effectiveness of the proposed method, two approaches will be considered: conducting more experiments on noisy data such as occluded images, and employing different kinds of divergence functions instead of the Frobenius norm.
References 1. Fukunaga K (2013) Introduction to statistical pattern recognition. Academic, San Diego, CA 2. Noushath S, Kumar G, Shivakumara P (2006) Diagonal fisher linear discriminant analysis for efficient face recognition. Neurocomputing 69:1711–1716 3. Lee D, Seung H (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791 4. Lee D, Seung H (2001) Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems, pp 556–562 5. Cai D, He X, Wu X, Han J (2008) Nonnegative matrix factorization on manifold. In: IEEE Int'l conference on data mining (ICDM'08), pp 63–72 6. Hang D, Quan B, Jiun D, Shan L, Tung P, Bao P, Ching W (2017) A new approach of matrix factorization on complex domain for data representation. IEICE Trans Inf Syst E100-D(12):3059–3063 7. Cai D, He X, Han J, Huang T (2011) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(8):1548–1560 8. Zheng W, Lai J, Liao S, He R (2012) Extracting non-negative basis images using pixel dispersion penalty. Pattern Recogn 45(8):2912–2926 9. Hang D, Shan L, Tung P, Mathulaprangsan S, Bao P, Ching W (2016) Spatial dispersion constrained NMF for monaural source separation. In: 10th International symposium on Chinese spoken language processing (ISCSLP) 10. Kanade T, Cohn J, Tian Y (2000) Comprehensive database for facial expression analysis. In: 4th IEEE International conference on automatic face and gesture recognition, pp 46–53
11. Lyons M, Akamatsu S, Kamachi M, Gyoba J (1998) Coding facial expressions with Gabor wavelets. In: 3rd IEEE International conference automatic face and gesture recognition, pp 200–205 12. Hoyer P (2004) Nonnegative matrix factorization with sparseness constraints. J Mach Learn 5:1457–1469 13. Zhang H, Zha Z, Yang Y, Yan S, Chua T (2014) Robust (semi) nonnegative graph embedding. IEEE Trans Image Process 23(1):2996–3012
MongoDB Versus MySQL: A Comparative Study of Two Python Login Systems Based on Data Fetching Time Shrikant Patel, Sanjay Kumar, Sandhya Katiyar, Raju Shanmugam, and Rahul Chaudhary Abstract A database is a data structure used for the systematic and organized collection of data. Nowadays, database systems mostly follow two different paradigms, i.e., SQL and NoSQL. In a SQL database, the schema is predefined, fixed, and vertically scalable, whereas in a NoSQL database, the schema is dynamic and horizontally scalable. SQL databases are table-based, such as MySQL, Oracle, and MS SQL, whereas NoSQL databases are document-based, key-value-based, or graph databases, such as MongoDB, CouchDB, and so on. Here, a login system developed in the Python programming language is used to analyze the performance of MongoDB and MySQL based on their data fetching speed. A login system can be used wherever the user must be verified before accessing confidential information or data. This paper also helps decide which type of database system should be used as the backend database for storing and retrieving user credential information in any login-based system. Keywords SQL · NoSQL · MySQL · MongoDB · Python login system
1 Introduction A relational database is a set of formally described tables from which data can be retrieved or assembled in many different ways without having to reorganize the database tables. The relational database has been the groundwork of enterprise applications for years, and since MySQL was released in 1995, it has been a popular and inexpensive
S. Patel (B) · S. Kumar · R. Shanmugam School of Computing Science and Engineering, Galgotias University, Greater Noida, India e-mail: [email protected] S. Katiyar Galgotias College of Engineering and Technology, Greater Noida, India R. Chaudhary Bhai Parmanand Institute of Business Studies, Delhi, India © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_6
option. Due to the explosion in the volume and variety of data in recent years, non-relational database technologies like MongoDB have become useful to address the problems faced by old-style databases [1]. MongoDB is very useful for new applications as well as for augmenting or replacing existing relational infrastructure. MySQL is a popular open-source relational database management system (RDBMS) that is distributed, developed, and supported by Oracle Corporation. Relational systems like MySQL store data in tabular form and use structured query language (SQL) to access data. In MySQL, the programmer must pre-define the schema based on requirements and set up rules to govern the relationships between fields in the records [2]. Related information may be stored in different tables, but it is associated through the use of joins; thus, data duplication can be minimized. MongoDB is an open-source database developed by MongoDB Inc. MongoDB stores data in JSON-like documents that can vary in structure. Related information can be stored together for fast query access through the MongoDB query language. MongoDB uses dynamic schemas, which allow records to be created without first defining their structure, such as the attributes or data types; the structure of records can be changed simply by adding new attributes or deleting existing fields. This model makes it easy to represent hierarchical relationships, store arrays, and handle other more complex structures, and documents in a collection need not have an identical set of fields [3]. The "Python Login System" verifies the user credentials from the database. A graphical user interface login form developed in Python accepts the user credentials (login id and password), verifies them against the database, and, once they are verified, redirects the user to the profile page. To compare data fetching speed, one scenario uses the NoSQL database management system MongoDB, and the other uses the SQL database management system MySQL. One of the key differentiators is that NoSQL is supported by column-oriented databases, whereas an RDBMS is row-oriented; NoSQL seems to work better on unstructured and unrelated data. NoSQL databases give up some features of traditional databases in exchange for speed and horizontal scalability. MongoDB supports a "single master" model: there is one master node and a number of slave nodes, and if the master goes down, one of the slaves is elected as master. This failover happens automatically, but it takes time; before the 3.2 release it took 10–40 seconds, whereas with MongoDB 3.2 and later, failures are detected more quickly and a new leader is elected in less than 2–10 seconds. The trade-off for multi-master replication is that reads are slower and scale less efficiently, because the client must read from multiple nodes to ensure consistency. During the election of a new leader, the replica set is down and cannot take writes.
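For illustration, a MongoDB document of the kind discussed above might look as follows; the field names are hypothetical, since the paper does not list the exact schema:

```python
# A JSON-like MongoDB document: schema-free, with nesting and arrays,
# so related information is stored together without joins.
user_document = {
    "user_id": "mahi",
    "password": "mahi@123",
    "profile": {
        "name": "Mahi",
        "roles": ["customer"],
    },
}
```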
2 Problem Definition Almost every system holding confidential information or data requires a login system to authenticate its users and grant access accordingly. It is crucial that a login system offers good performance as well as assured security, and the question that arises is which database system should be used for better performance. In the project used in this paper, we employ both types of database management systems (DBMS), MySQL for SQL and MongoDB for NoSQL, to conclude which of the two is better on the basis of data fetching speed. When compared to MySQL, it is observed that MongoDB is much better in query processing [4]. A MongoDB deployment consists of a set of databases, each of which contains multiple collections. Because MongoDB operates with dynamic schemas, every collection may contain different types of data. Every object, also called a document, is represented by a JSON structure: a list of key-value pairs. A value can be of mainly three types: a primitive value, an array of documents, or a list of key-value pairs. To query these objects, the client can filter the collections with conditions expressed as a list of key-value pairs [5]; it is also possible to query nested fields. The queries are also JSON-like structures; hence, a complex query can take much more space than the same query for a relational database [3, 6].
3 Methodology The login system is developed using Python as the programming language, in which we use the "tkinter" module to build the graphical user interface so that the user can interact with the system. The first user interface has one grid with two fields, one for the user id and the other for the user password; the grid also has two buttons, one for login and the other for clearing the fields. Whenever a user clicks the login button, two user-defined functions (written in Python) are executed serially: one verifies the user id and password from the MongoDB database, and the other verifies the same user id and password from the MySQL database [7, 8]. The two user-defined Python functions work as follows: FUNCTION 1 (for MongoDB): using MongoDB as the Database Management System. First, it establishes a connection with the MongoDB server using the MongoClient function provided by the pymongo module. After a successful connection, it selects the database on which the query will be performed. The required login credential data is then fetched from that database using the function 'db.login_data.find()' and stored in a variable. The credentials read from the user login grid are compared with each credential fetched from the database; if the same credentials are found in the database, a message is shown to the user that the login succeeded, and the user is redirected to the profile page. Otherwise, if the credentials
entered by the user do not match any fetched credentials from the database, a login error message is shown as 'Invalid User'. Structure of the MongoDB Database for this Project: (Figs. 1 and 2) FUNCTION 1 (for MongoDB):
Fig. 1 MongoDB database structure for this login system
Fig. 2 User-defined function for credentials verification with MongoDB
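A minimal Python sketch of the verification logic described for FUNCTION 1 is given below. The connection URI and the database name login_system are assumptions; the collection name login_data and the use of pymongo's MongoClient and find() follow the description above.

```python
from pymongo import MongoClient

def mongo_verify(user_id, password):
    """Compare the entered credentials with every document in login_data."""
    client = MongoClient("mongodb://localhost:27017/")  # assumed URI
    db = client["login_system"]                         # assumed DB name
    for cred in db.login_data.find():                   # fetch stored records
        if cred.get("user_id") == user_id and cred.get("password") == password:
            return True    # 'successfully logged in' -> show profile page
    return False           # 'Invalid User'
```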
FUNCTION 2 (for MySQL): using MySQL as the Database Management System. Here, it establishes a connection with the MySQL server using the connector provided by the mysql.connector module. After a successful connection, it selects the database on which the query will be performed. The required login credential data is then fetched from that database using the functions 'cursor.execute(select * from login_data)' and 'result2 = cursor.fetchall()' and stored in a variable. The credentials read from the user login grid are compared with each credential fetched from the database; if the same credentials are found in the database, a message is shown to the user that the login succeeded, and the user is redirected to the profile page. Otherwise, if the credentials entered by the user do not match any fetched credentials from the database, a login error message is shown as 'Invalid User'. Structure of the MySQL Database for this Project: (Figs. 3 and 4) FUNCTION 2 (for MySQL): Fig. 3 MySQL database structure for this login system
Fig. 4 User-defined function for credentials verification with MySQL
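The MySQL counterpart can be sketched in the same way; the connection parameters and the column order of the login_data table are assumptions, while the SELECT/fetchall() flow follows the description above.

```python
import mysql.connector

def mysql_verify(user_id, password):
    """Fetch all rows of login_data and compare with the entered credentials."""
    conn = mysql.connector.connect(host="localhost", user="root",
                                   password="secret", database="login_system")
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM login_data")
    result2 = cursor.fetchall()          # assumed (user_id, password) tuples
    conn.close()
    return any(row[0] == user_id and row[1] == password for row in result2)
```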
4 Practical Results On execution of the project, different user credentials were provided by the user (tester), and the data fetching time was captured for both functions, "mongo_database()" and "mysql_database()". For the calculation of execution time, the Python built-in function "time.time()" provided by the "time" module was used. Various credentials were supplied to the login system for both databases, and we obtained the results below, for valid user credentials (Table 1) and for invalid user credentials (Table 2). The performance of MongoDB compared with MySQL when performing credential verification operations is shown here; a small number of records was used for the operations in both databases. The charts plotted from these results are shown below, for valid user credentials (Fig. 5) and for invalid user credentials (Fig. 6). Table 1 Execution results for valid user credentials
Table 2 Execution results for invalid user credentials
Fig. 5 Chart for valid credentials
For valid credentials

User ID    User password    Execution time (s): MongoDB    MySQL
mahi       mahi@123         0.115                          0.795
rahul      rahul@123        0.032                          0.375
da         da@123           0.037                          0.303
For invalid credentials

User ID    User password    Execution time (s): MongoDB    MySQL
it         it@123           0.027                          0.316
dfs        dfs@123          0.022                          0.349
mca        mca@123          0.026                          0.305
Fig. 6 Chart for invalid credentials
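The timing harness behind these tables can be reproduced with a few lines; mongo_verify and mysql_verify stand for the two verification functions sketched in Sect. 3, and time.time() is used exactly as described above.

```python
import time

def timed(verify_fn, user_id, password):
    """Return (verification result, elapsed data-fetching time in seconds)."""
    start = time.time()
    ok = verify_fn(user_id, password)
    return ok, time.time() - start
```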
5 Performance Evaluation On analysis, it is found that the credential verification speed of MongoDB is better than that of MySQL in the login system. Many login systems adopt MongoDB because it enables them to build their systems faster. One of the major problems with relational databases (SQL) is that they can become complex as the amount of data grows and the relations between pieces of data become more complicated. On the other hand, non-relational databases (MongoDB) use key-value pairs to store the data, and the problem of complexity does not arise as the amount of data grows [9, 10]. Here, we have taken very few records for evaluation; still, the difference between the performance of the two database systems can clearly be seen, so for login systems with a huge number of records this difference will increase at a large scale. Consider the example of a bank login system, where the major concern is the security of the customers' money. All banks have a huge number of online customers who can access their accounts from anywhere, and the only way to authenticate them is through a login system. Suppose a bank login system has 10,000 online customers; whenever any user tries to access their account through the login system, the user's credentials are verified against the database, which has to search the 10,000 records for the credentials provided by the user. Whether the credentials are valid or not, the database system must perform this search over 10,000 records in both cases. So, it is crucial that the database system [11] is capable of performing such operations efficiently, even with a huge amount of data, to provide security and speed simultaneously. From the observed results, it is clearly seen that whether the user's login credentials are valid or invalid, the data fetching time is always better in the MongoDB database system than in the MySQL database system. MongoDB is widely used in the field of large databases; one of its most important advantages is scalability. MongoDB follows 'BASE' transactions: basically available, soft state, eventual consistency [12]. Another important feature is the handling of failures. For simplicity, we conducted an analysis based on a Python login system, but MongoDB is also suitable for other applications with a large volume of data where the data needs high security [13]. Since it is schema-less, it supports different types of data.
6 Conclusion In this paper, we have gone through a performance evaluation of MySQL and MongoDB in a login system developed using Python, taking data fetching time as the evaluation criterion. We reached the conclusion that the data fetching speed of the MongoDB database system is faster than that of the MySQL database system. It can be seen clearly from the observed results that for valid credentials the difference in data fetching time is somewhat smaller but still easily observable, while for invalid credentials the data fetching time is almost constant for the MongoDB database system, whereas for the MySQL database it remains at the same level as for valid credentials and is higher in comparison with MongoDB. To summarize, SQL relational database management systems (RDBMS) are slower in data fetching than NoSQL, non-relational database management systems. So, whenever speed matters more than security for a system, a non-relational database management system like MongoDB should be used for the database requirement.
References 1. Dipina Damodaran B, Salim S, Vargese SM (2016) MONGODB versus MYSQL: a comparative study of performance in super market management system. Int J Comput Sci Inf Technol (IJCSITY) 4(2) 2. Kolonko K (2018) Performance comparison of the most popular relational and non-relational database management systems. Master of science in software 3. Jain V, Upadhyay A (2017) MongoDB and NoSQL databases. Int J Comput Appl (0975–8887) 167(10) 4. Han J, Haihong E, Le G, Du J (2011) Survey on NoSQL database, IEEE 5. Okman L, Gal-Oz N, Gonen Y, Gudes E, Abramov J (2011) Security issues in NoSQL databases, IEEE 6. Jia T, Zhao X, Wang Z, Gong D, Ding G (2016) Model transformation and data migration from relational database to MongoDB, IEEE 7. Mukesh BA, Garg M (2018) Comparative analysis of simple and aggregate query in MongoDB. Int J Adv Res Ideas Innov Technol 8. Jayathilake D, Sooriaarachchi C, Gunawardena T, Kulasuriya B, Dayaratne T (2016) A study into the capabilities of NoSQL databases in handling a highly heterogeneous tree. Int J Comput Sci Inf Technol (IJCSITY) 4(2) 9. Elmasri, Navathe (2006) In: Fundamentals of database systems, 5th edn, Pearson Education 10. Ullman JD In: Principals of database systems, Galgotia 11. Malhar L In: Python data persistence: with SQL and NOSQL databases 12. Meier A, Kaufmann M In: SQL and NoSQL databases: models, languages, consistency options and architectures for big data management 13. Bradshaw S, Brazil E, Chodorow K In: MongoDB: the definitive guide: powerful and scalable data
Industrial LoRaWAN Network for Danang City: Solution for Long-Range and Low-Power IoT Applications Thanh Dinh Ngo, Fabien Ferrero, Vinh Quang Doan, and Tuan Van Pham
Abstract The development of IoT applications is lightning fast in terms of technology and solutions in all areas of life. This brings opportunities and challenges for wireless sensor networks, which require low-power and long-range devices. LoRa technology is a new wireless communication standard designed to address these challenges. The industrial LoRaWAN protocol shows its ability to provide long-range data transmission, long autonomy, and high scalability. In this paper, we experiment with a LoRaWAN network for IoT applications in the Danang City area. Evaluation of LoRa coverage in real-life measurements shows an average communication range of 6 km, with a maximum distance of 26 km. Keywords Internet of things · LoRaWAN · Range of coverage · Long-range communication · Low-power · Power management
1 Introduction In recent years, the Internet of Things (IoT) has shown up as a potential technology to disrupt many domains. IoT applications appear everywhere in all fields from industry to agriculture, smart homes, smart schools, and smart cities. According to Cisco’s expectation [1], there will be over billions of smart devices connected to the Internet that will have a great contribution to the global economy by 2020. The IoT technology T. D. Ngo (B) · V. Q. Doan · T. Van Pham The University of Danang - University of Science and Technology, Danang, Vietnam e-mail: [email protected] V. Q. Doan e-mail: [email protected] T. Van Pham e-mail: [email protected] F. Ferrero LEAT Universite Cote D’Azur, Nice, France e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_7
65
66
T. D. Ngo et al.
boom will continue to grow exponentially over the next decade, along with big data and energy consumption issues for IoT devices. Today, many IoT applications require long-distance communication as well as low energy consumption. LoRa, which stands for long-range radio, was developed by Cycleo SAS and was later acquired by Semtech in 2012. LoRa uses a modulation technique called chirp spread spectrum (CSS) to offer low-power, low-cost, low-data-rate devices while significantly increasing the transmission range, helping to transmit data over long distances. LoRa operates in unlicensed bands from 430 MHz to 915 MHz, depending on the region of the world. LoRa devices can exchange data within the network for up to 10 years on batteries [2]. Low-power wide area networks (LPWANs) are wireless communication technologies designed to support the diverse deployment of IoT applications. These technologies enable wide-range and large-scale connectivity at low power and low data rates. LoRaWAN is one of the most successful and popular LPWAN technologies [2]. LoRaWAN comprises communication protocols defined by the LoRa Alliance, operating on the LoRa physical layer in license-free bands. Therefore, LoRaWAN technology attracts the attention of both industry and academia, with many scientists and research institutes around the world. Research on the modeling and simulation of LoRaWAN networks has made notable achievements, confirming the position of LoRaWAN in long-range IoT applications [2–4]; LoRa can be a reliable link for low-cost, long-range IoT applications. For real measurements, most research uses indoor LoRa gateways [3, 5], where the coverage of the LoRa network is less than 2 km. However, in order to evaluate and analyze the feasibility of a real deployment of an industrial LoRaWAN network for long-range applications, we should consider specific geographical conditions as well as the communication factors that affect the range of coverage in the LoRaWAN network. This paper focuses on the evaluation of coverage performance based on the deployment of an industrial LoRaWAN network in the Danang City area. The rest of the paper is organized as follows: in Sect. 2, we describe the proposed model and all the resources used to carry out the coverage testing. In Sect. 3, we evaluate the performance of the system, the implementation results, and the coverage analysis. Finally, we conclude the paper in Sect. 4.
2 Proposed LoRaWAN Network 2.1 Proposed LoRaWAN Network Model In the proposed model, LoRaWAN industrial network consists of four main layers as shown in Fig. 1:
Industrial LoRaWAN Network for Danang City: Solution …
67
Fig. 1 LoRaWAN network
• LPWAN sensor layer: In the scope of this study, the device hardware uses the open-source circuit board [6] including Semtech’s LoRa module, Atmega328 microcontroller, and sensors. In addition, we also use additional STMicroelectronics development circuit boards in combination with MEMS motion sensors and environmental sensors. The sensor collects measurement data and transmits it to the gateway at near or far distances, indoors or outdoors with the lowest energy requirements. • LoRaWAN IoT Industrial layer: Using the industrial gateway of RAK and MultiTech, data from end-nodes is transmitted to the gateway in LoRaWAN network, the data is then transferred to the server. In order to increase coverage based on industrial gateways, relay repeaters are used which serve as virtual gateways. • Server layer: Gateway sends data to a server via WiFi, Ethernet, or mobile network using IoT MQTT protocol. In this research, we use The Things Network’s server. • Application services layer: including smart applications on smartphones, tablets. The collected data is converted into data that the user understands on the interface and system control functions, applying machine learning algorithms and artificial intelligence to analyze and evaluate data.
2.2 Industrial IoT LoRaWAN Gateway In the current proposed LoRaWAN network, we use two IoT LoRaWAN industrial gateways. In the first test, an Industrial IoT gateway of MultiTech was installed on the
68
T. D. Ngo et al.
21-story building of Danang Software Park (DSP) in the center of Danang City. This DSP gateway uses a 6 dBi antenna. In order to increase LoRa coverage throughout Danang City, the IoT LoRaWAN gateway RAK7249 was later installed on the 10-floor S building at the University of Danang - University of Science and Technology (DUT). This is the latest IP67 outdoor gateway from RAK. The DUT gateway integrates two separate receivers, to which two different antennas are connected (a 12 dBi and a 3 dBi antenna) (Fig. 2). Both gateways are configured to operate at 868 MHz. Figure 3 shows the locations of these two industrial IoT gateways on the map. In order to conduct the LoRaWAN network coverage evaluation, we use two versions of the LoRa sensor node for testing:
Fig. 2 RAK and multitech IoT LoRaWAN gateways
Fig. 3 Two industrial IoT LoRaWAN gateway at Danang
Industrial LoRaWAN Network for Danang City: Solution …
69
Fig. 4 UCA sensor node based on Arduino
Development Board of UCA The sensor node uses LoRa RFM95W module, Arduino Mini Pro 3.3 V, 8 MHz and allows connection with temperature and humidity sensors such as DHT-22, BME280, motion sensor, and distance sensor. A detailed description of antenna design is given in [7, 8]. Antenna design on the circuit board with the shape of the logo of the Université Côte d’Azur (UCA). The entire circuit design is provided via git [6]. This sensor node is powered by an AA lithium battery (Fig. 4). Development Board B-L072Z-LRWAN1 This integrated development board is based on LoRa and Sigfox technology used STM32L072Cz microcontroller and LoRa SX1276 transceiver module, provides extremely wide spectrum communication and high anti-jamming ability as well as minimizes energy consumption. To measure environmental parameters, shield XNucleo-IKS01A1 is used to integrate MEMS motion sensors and environmental sensors. This sensor node is powered by three lithium AAA batteries (Fig. 5).
Fig. 5 Environment sensor node based on B-L072Z-LPWAN1
70
T. D. Ngo et al.
3 LoRaWAN Network Performance Evaluation To evaluate performance, other LPWAN technologies are analyzed and compared [9]. As shown in Table 1, LoRaWAN and Sigfox are potential technologies for long-range applications, with a battery life of up to 10 years and a low cost per end-node. However, when it comes to security, LoRaWAN technology is superior to Sigfox. Besides, LoRaWAN overcomes the disadvantages of other technologies such as WiFi and ZigBee [2]. LoRa uses three bandwidths (BW): 125, 250, and 500 kHz. The wider the bandwidth, the shorter the coding time, which reduces the data transmission time but also the transmission distance. In this test, we want to transmit over long distances, so BW = 125 kHz was chosen. The spreading factor (SF) determines the number of chirp signals used when coding the chipped data signal. In our test, we chose SF = 12, so one logic level of the modulated chirp signal is coded by 12 chip pulses. To assess the coverage, the LoRaWAN network system is set up with the following parameters for the gateway and network nodes (Table 2).
Table 1 LPWAN technology comparison [9]

Feature            LoRaWAN      NB-IoT       Sigfox       LTE-M
Urban range        2–5 km       1.5 km       3–10 km      —
Rural range        45 km        20–40 km     30–50 km     200 km (4G)
Battery lifetime   ~10 years    ~10 years    ~10 years    —
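The BW/SF trade-off discussed above can be made concrete with Semtech's nominal CSS bit-rate formula, R_b = SF · (BW / 2^SF) · CR. The sketch below is ours, and the coding rate of 4/5 is an assumption, since the text fixes only SF = 12 and BW = 125 kHz:

```python
def lora_bit_rate(sf=12, bw=125_000, cr=4 / 5):
    """Nominal LoRa PHY bit rate in bit/s."""
    return sf * (bw / 2 ** sf) * cr

print(f"{lora_bit_rate():.0f} bps")  # ~293 bps: maximum range, minimal rate
```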
(I1 > I2 > I3). When the solar output current is Ipv < I3, all three PV panels can provide current at the same time, and the total output voltage of the solar array is Vpv = vpv1 + vpv2 + vpv3. When the solar output current is I3 < Ipv < I2, the third PV panel cannot provide current and the diode D3 conducts the current, so the third PV panel is isolated; therefore, the total output voltage of the solar array is Vpv = vpv1 + vpv2. Table 1 shows the cases of the PV panels under the partial shading effect. Table 1 Relationship between output voltage and current under partial shading effect
Cases of current    Output voltage                  The PV panel is isolated    The diode will be connected
Ipv < I3            Vpv = vpv1 + vpv2 + vpv3        None                        None
I3 < Ipv < I2       Vpv = vpv1 + vpv2               PV3                         D3
I2 < Ipv < I1       Vpv = vpv1                      PV2, PV3                    D2, D3
Fig. 3 P–V curve of one case under partial shading effect
From (1), the output voltage of each PV panel can be expressed by the following equation:

$$v_{pv(j)} = \frac{N_s}{k_o} \ln\!\left( \frac{N_p I_{ph(j)} + N_p I_{sat} - i_{pv(j)}}{N_p I_{sat}} \right), \quad j = 1, 2, \ldots, n \tag{4}$$
where v_pv(j) and i_pv(j) are the output voltage and current of the jth PV panel, and i_pv(j) depends on the load. I_ph(j) depends on the sunshine intensity. Moreover, to better understand the partial shading effect, we used MATLAB software to simulate it. The solar sunshine intensities used are 100 mW/cm², 80 mW/cm² and 25 mW/cm² at a fixed ambient temperature of 25 °C, as illustrated in Fig. 3. Figure 3 also shows that there are three MPPs: one global MPP of 336.9 W and two local MPPs of 167.4 W and 200.1 W.
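For reference, Eq. (4) can be evaluated directly; this Python sketch is ours (the paper's simulations use MATLAB), and the value of k_o is a placeholder since the excerpt does not give it numerically.

```python
import numpy as np

def pv_panel_voltage(i_pv, i_ph, i_sat, ns=54, n_p=1, ko=1.0):
    """Output voltage of the j-th panel from Eq. (4).
    Valid while i_pv < n_p * (i_ph + i_sat), so the log argument is positive."""
    return (ns / ko) * np.log((n_p * i_ph + n_p * i_sat - i_pv) / (n_p * i_sat))
```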
3 MPP Tracking Control of the Stand-Alone PV Energy System 3.1 Architecture of the Standalone Solar Energy System Nowadays, many methods have been proposed to obtain the best energy conversion efficiency of PV power systems. In this paper, a stand-alone PV energy conversion system combined with the hybrid method is used. This system consists of PV panels, a DC boost power converter, a control circuit board and a DC load.
When the PV panels receive solar energy, the controller uses the proposed method to find the MPP, and a DC/DC boost converter transfers the PV power to the load.
3.2 MPP Tracking Method under Irradiance Variation When the PV irradiance changes, the solar panels exhibit multiple MPPs. Traditional MPPT methods are easily trapped in a local MPP, which reduces the energy conversion efficiency of the solar array. Therefore, this paper proposes a new approach that combines the SFLA with the traditional P&O method. The SFLA is used to search for the global maximum power region, avoiding the local MPPs. When the region of the largest power peak is found, the traditional P&O method takes over to find the global MPP quickly. This proposed method prevents trapping effectively, and it has two stages, presented as follows: Stage 1 Shuffled Frog Leaping Algorithm (SFLA) The SFLA simulates the way a group of frogs searches for food. The frogs are divided into many subgroups to find food, and when one frog finds food, the other frogs receive information from it. Since the food intake of each subgroup is different, all subgroups exchange information with each other to compare the food found between subgroups, so that the whole population can move to the place with the most food. To perform the process, the group of frogs is divided into several subgroups, and within each subgroup the frogs are ranked from the worst fitness to the best fitness. The worst frog in each subgroup is updated by (5), (6) and (7). The main operation of the SFLA is to update the position of the worst frog in each subgroup through an iterative procedure; the position of the worst frog is improved by learning from the best frog of the subgroup. Thus, the new position of the worst frog in each subgroup is updated according to the following equations: Eqs. (5) and (7) are used to calculate the update distance Di and the new fitness Fw. If this process leads to a better solution, Fw(new) replaces the worst frog; otherwise, Eq. (5) is replaced by (6), meaning that Fb is replaced by Fg. If this still cannot achieve a better solution, a new solution is randomly generated to replace that worst frog. Because only the worst frogs are updated, the SFLA does not use much computation time and improves the solar power conversion efficiency compared with the GA and PSO. When applied to solar MPPT, the method replaces the fitness value Fi with the solar power value Ppv and the frog position Di with the solar array output current Ipv to find the global region.

$$D_i = \mathrm{rand}() \times (F_b - F_w) \tag{5}$$

$$D_i = \mathrm{rand}() \times (F_g - F_w) \tag{6}$$

$$F_w(\mathrm{new}) = D_i + F_w \tag{7}$$
where rand() is a random number between 0 and 1; Di is the frog's moving distance, with −Dmax < Di < +Dmax, where Dmax is the maximum distance the frog can move; Fb and Fw are the best frog and the worst frog in each subgroup; and Fg is the best frog of all groups. Stage 2 Perturbation and Observation (P&O) Method Perturbation and observation (P&O) is a low-complexity algorithm for MPPT control. This method tracks the maximum power point accurately at constant irradiance and temperature, but it has some demerits, such as slow response and oscillation under changing environmental conditions. By changing the voltage by a small amount (δV), the voltage (V_{k+1}) and current (I_{k+1}) values after the change are measured and compared with the previous values (V_k, I_k). Based on the measured voltage change (ΔV = V_{k+1} − V_k) and power change (ΔP = P_{k+1} − P_k), the algorithm decides whether to increase or decrease the voltage by the small amount (δV) in the next iteration. The algorithm terminates when the perturbation produces no further change, i.e., ΔP = 0. The proposed algorithm is summarized through the following steps:

Step 1 Measure V_pv and I_pv and calculate P_pv.
Step 2 Find F_w, F_b and F_g.
Step 3 Update F_w using Eqs. (5), (6) and (7).
Step 4 Compare F_w(new) with F_w(old); if F_w(new) is better than F_w(old), go to Step 6. Otherwise, go to the next step.
Step 5 Replace F_w with a random value.
Step 6 Check whether |F_b − F_w| < e; if true, record V_k, I_k, P_k and go to the next step. Otherwise, go to Step 2.
Step 7 Measure the voltage and current and calculate the power at the next time step, obtaining V_{k+1}, I_{k+1} and P_{k+1}, and compute ΔV = V_{k+1} − V_k and ΔP = P_{k+1} − P_k.
Step 8 Perturb the voltage by a small amount (δV), measure the new voltage and current, and calculate the new power. If ΔP = 0, the proposed method finishes. Otherwise, go to Step 7 for the next iteration.
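The two stages of Steps 1-8 can be sketched as follows. This is our Python illustration (the paper's results come from MATLAB simulations): p_of stands for measuring P_pv at an operating point, the bounds and population sizes mirror the simulation settings of Sect. 4, and re-ranking between subgroup updates is omitted for brevity.

```python
import random

def sfla_stage(p_of, lo, hi, groups=5, per_group=3, iters=30, tol=10.0):
    """Stage 1 (Steps 1-6): SFLA search for the global-MPP region."""
    frogs = [random.uniform(lo, hi) for _ in range(groups * per_group)]
    for _ in range(iters):
        frogs.sort(key=p_of, reverse=True)
        fg = frogs[0]                         # best frog of all groups
        for g in range(groups):
            sub = frogs[g::groups]            # shuffled subgroup
            fb, fw = sub[0], sub[-1]          # best / worst of the subgroup
            for leader in (fb, fg):           # Eq. (5), then Eq. (6)
                new = fw + random.random() * (leader - fw)  # Eq. (7)
                if p_of(new) > p_of(fw):
                    break
            else:
                new = random.uniform(lo, hi)  # Step 5: random replacement
            frogs[frogs.index(fw)] = new
        if p_of(max(frogs, key=p_of)) - p_of(min(frogs, key=p_of)) < tol:
            break                             # Step 6: |F_b - F_w| < e
    return max(frogs, key=p_of)

def p_and_o(p_of, v, dv=0.1, steps=200):
    """Stage 2 (Steps 7-8): P&O refinement from the region SFLA found."""
    p_prev = p_of(v)
    step = dv
    for _ in range(steps):
        v += step
        p = p_of(v)
        if p == p_prev:      # dP = 0: the MPP has been reached
            return v
        if p < p_prev:       # power fell: reverse the next perturbation
            step = -step
        p_prev = p
    return v
```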
3.3 Control Circuit of the Standalone PV Panel System The basic diagram of the stand-alone boost converter module is illustrated in Fig. 4. The main task of this circuit is to convert the energy of the solar panels to the output load, perform the load voltage regulation, and track the MPP. When the system operates, the Arduino processing board uses the voltage and current signals to perform MPPT. Based on the proposed method, the output signal of this board is compared with the output voltage signal by the IC TL494. The change of these two
Fig. 4 Basic control circuit and boost converter for PV panels
compared signals will change the duty cycle to control the DC/DC boost converter. The advantage of this circuit is its simple structure, low cost, high performance and easy installation.
4 Simulation Results The PV panels used in the simulation have the following specifications: Pmax = 200 W, Vmax = 26.3 V, Voc = 32.9 V, Imax = 7.61 A, Isc = 8.21 A, Ns = 54, Np = 1. For the SFLA, the parameters are set as follows: the number of frogs in a group is 3, the number of groups is 5, and the error tolerance is set to ΔP = 10 W. Figure 3 shows the simulated voltage–power curve for the considered partial shading situation: the global MPP is 336.9 W, and the lowest local MPP is 167.4 W. The simulation results of the P&O, PSO and hybrid methods are shown in Figs. 5, 6 and 7. In the partial shading situation, the P&O method was caught in a local MPP trap (167.4 W). With the proposed combination, the SFLA finds the optimal region, and this is very successful when combined with the P&O method in finding the best global solution (336.9 W). The SFLA and PSO methods also achieve the global optimum, but they take longer than the proposed hybrid method.
5 Conclusion This paper uses a DC stand-alone PV system with the proposed hybrid method. This hybrid method improves on the traditional MPPT methods under varying irradiance. First, the SFLA method is used to find the global region, and the global MPP is then quickly obtained through the P&O method. This solution is very effective for quickly achieving the best energy conversion efficiency while avoiding the local maximum power peaks. The P&O method is very easy to program, and it is
Fig. 5 Simulation result of P&O method under the partial shading condition
Fig. 6 Simulation result of PSO method under the partial shading condition
Fig. 7 Hybrid method under the partial shading condition
often used to track the MPP. Therefore, combining the SFLA and P&O is a useful solution to increase the power conversion efficiency of the PV energy array. Acknowledgements This work was supported by the Ministry of Science and Technology, R.O.C., under Grant MOST-107-2221-E-033-064.
Comparison of Several Data Representations for a Single-Channel Optic System with 10 Gbps Bit Rate Ihsan A. Alshimaysawe, Hayder Fadhil Abdulsada, Saif H. Abdulwahed, Mohannad A. M. Al-Ja’afari, and Ameer H. Ali
Abstract Data representation refers to converting bits into signal forms. These signals must match the conditions of the transmission channel, such as DC levels and channel capacity; for an optical fiber system with linear and nonlinear effects, it is important to check which data representation suits a certain type of modulation. In this paper, a comparison is made among several kinds of representations, namely Manchester, alternate mark inversion (AMI), ordinary non-return to zero (NRZ), chirped NRZ and duobinary (DB), for a single-channel optical fiber link with direct transmission at 10 Gbps over 350 km (without DCF compensation). The results are presented in terms of power versus quality factor and the optical signal spectra at both ends. They show that Manchester coding has a 12-order difference in signal power at the end of the fiber, and AMI coding has the farthest threshold power point of 9 dBm with a quality factor of 14, using Optisystem version 13. Keywords Manchester code · Chirped NRZ · Dispersion compensation fiber
1 Motivation The figure of merit in optical communication is the large bandwidth capability and hence the huge bit rate; for that reason, it is the proper choice to meet the needs of the fast growth in optical networks and of future technologies [1]. A major role has been played by optical technology in replacing the copper wire of electrical communication systems with fiber links, which brings immunity against electromagnetic interference I. A. Alshimaysawe (B) Department of Communication Technique Engineering, Engineering Technical College/Najaf, Al-Furat al-Awsat Technical University, 31001 Najaf, Iraq e-mail: [email protected] H. F. Abdulsada Babylon Technical Institute, Al-Furat al-Awsat Technical University, 51015 Babylon, Iraq e-mail: [email protected] S. H. Abdulwahed · M. A. M. Al-Ja'afari · A. H. Ali Najaf Technical Institute, Al-Furat al-Awsat Technical University, 31001 Al-Najaf, Iraq © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_9
and fast systems with large capacity; but at the same time, many impairments have appeared [2]. Like any communication system, the optical system consists of three main parts: transmitter, medium and receiver. There are many kinds of optical transmitters, each with a different mechanism, but all share a common section called the modulator, in which the laser source is modulated with the information bits shaped into a certain pulse format [3]. The pulse shaping ("data representation") types investigated and compared in this paper are as follows; Fig. 1 shows these schemes. The first block is the Manchester code, also called the bi-phase code; it is one of the promising techniques for high-speed optical systems, requires twice the bandwidth of NRZ coding, and is used to decrease the DC level of the optical signal [4]. The second block is NRZ, a coding technique that represents ones (1's) with +V and zeros (0's) with −V and needs half the baseband bandwidth; chirped NRZ is used to broaden the signal spectrum, which improves the dispersion tolerance and thus the performance of the system [5]. AMI is a bipolar coding scheme that represents binary 1 with an alternating negative or positive voltage level and binary 0 with the neutral level [6], and it offers a reduction of the unwanted fiber effects such as self-phase modulation (SPM), cross-phase modulation (XPM) and four-wave mixing (FWM) [7–9]. In the duobinary format, a phase shift of π is introduced after an odd number of zeros, and it uses less than half the bandwidth of the baseband signal [10]. In 2015, Prabhdeep Kaur et al. [11] introduced an inter-satellite optical wireless communication (IS-OWC) scheme with 32 channels using the non-return to zero (NRZ) and return to zero (RZ) schemes; it works at 10 Gbps, and a comparison was made in terms of different input powers. In 2016, Saumya et al. [12] used the RZ and NRZ modulation formats to study a dense wavelength division multiplexing (DWDM) system with dispersion compensation. In 2017, Doutje and Vincent et al. [13] proposed transmission schemes based on duobinary and binary NRZ to obtain a 25 Gbps bit rate using 10 Gbps equipment. In 2018, Ravneet and Harminder [14] evaluated DPSK, chirped
Fig. 1 Transmission part of the system for different types of modulation techniques
RZ and AMI modulation methods in an IS-OWC system using 64 channels. In 2018, Abhishek Sharma et al. [15] introduced a comparison between the AMI and NRZ techniques in radio over fiber (RoF) using a single channel at a bit rate of 2.5 Gbps, with a 5 GHz radio signal sent over 25 km of optical fiber. In 2018, Sasikala and Chitra [16] showed the impact of FWM and XPM using NRZ and RZ signals in DWDM optical systems at 10 Gbps for 2, 4 and 8 channels.
2 Complete System

In an optical fiber system with a single channel, there are two types of impairments: linear and nonlinear. The linear impairments represent the dispersion effect, while the nonlinear impairments represent self-phase modulation (SPM) and stimulated Brillouin scattering (SBS). In this system, there is a single optical channel with a length of 420 km constructed in seven segments; each segment consists of a pair of single-mode fibers (SMF), optical amplifiers and dispersion compensating fibers (DCF) of 5 km length, 5 dB gain and −83.75 ps/nm/km, respectively, in a symmetric arrangement as shown in Fig. 2. The gain of the amplifiers was 5 dB, to cancel the equivalent attenuation of a 25 km SMF, and the value of the negative dispersion is calculated according to Eq. (1) [17] so as to mitigate the dispersion of the 25 km SMF (16.75 ps/nm/km). Direct transmission was used in the overall optical system with a Mach-Zehnder modulator with two inputs: the first is the laser source at a transmitting frequency of 193.1 THz, within the 1550 nm window, the optical window with the lowest attenuation; the second input is the information signal with the different representations, namely Manchester, NRZ, chirped NRZ, AMI and Duobinary, as
Fig. 2 Overall optical system with 10 Gbps
Fig. 3 Optical fiber complete system diagram
seen in Fig. 1; the modulated optical signal then travels through the fiber. With a system bit rate of 10 Gbps, the optical signal reaches the receiver, which contains a PIN detector to convert the optical signal into an electrical signal fed to optical visualizers that analyze the signal. The complete diagram of the whole system is shown in Fig. 3.

D_SMF × L_SMF = −D_DCF × L_DCF    (1)
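As a quick numerical check of Eq. (1) with the link parameters above (an illustrative Python snippet; the variable names are ours):

```python
# Symmetric compensation, Eq. (1): D_SMF * L_SMF = -D_DCF * L_DCF
D_SMF = 16.75            # SMF dispersion (ps/nm/km)
L_SMF = 25.0             # SMF length per span (km)
L_DCF = 5.0              # DCF length per span (km)

D_DCF = -D_SMF * L_SMF / L_DCF
print(D_DCF)             # -83.75 ps/nm/km, the value used in the simulation
```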
3 Result and Discussion

Signal representation is an important part of the transmitter block; it has an obvious effect on the performance of the optical system. Both linear and nonlinear impairments limit the ability of the system to handle a longer distance and a larger bit rate. The power of the system with Manchester representation has dropped by 12 orders (dBm) at the end of the 420 km, as shown in Fig. 4, while the drop in system performance as seen by the optical spectrum analyzer is 13.4, 14, 12 and 14 orders (dBm) for AMI, chirped NRZ, ordinary NRZ and DB, respectively. One of the useful ways to describe the performance, or to observe the behavior of the system under a certain coding technique, is to illustrate the spectrum of the optical signal, because the spectrum relates signal power to frequency. Any degradation of the system performance can be noticed from the amount of the
Fig. 4 Spectrum visualizer for optical signal with Manchester coding, a after transmitter, b before receiver
power dropping of the optical signal at a specific frequency, and that is the reason behind the statement "dropping by 12 (or any other number of) orders." Figures 5, 6, 7 and 8 show the results displayed by the spectrum visualizer for the optical signal after the transmitter and before the receiver for the different coding schemes: AMI, chirped NRZ, NRZ and DB.
Fig. 5 Spectrum visualizer for an optical signal with AMI coding, a after transmitter, b before receiver
Fig. 6 Spectrum visualizer for optical signal with chirped NRZ coding, a after transmitter, b before receiver
Fig. 7 Spectrum visualizer for optical signal with NRZ coding, a after transmitter, b before receiver
One of the encouraging results is that, with NRZ coding, the difference between the signal powers at the transmitter and the receiver is small; this means that the optical impairments have a weaker effect in the NRZ case. Figures 9, 10, 11, 12 and 13 show the threshold power, which is the power point of maximum quality factor in the whole curve. The quality factor in the Manchester case was 18 at 8 dBm, which indicates that increasing the power beyond 8 dBm causes severe limitations in optical fiber distance and in the speed of information transmission.
Fig. 8 Spectrum visualizer for optical signal with DB coding, a after transmitter, b before receiver
Fig. 9 Q-factor versus power for single-channel optical fiber system with Manchester coding
Fig. 10 Q-factor versus power for single-channel optical fiber system with AMI coding
Fig. 11 Q-factor versus power for single-channel optical fiber system with chirped NRZ coding
Fig. 12 Q-factor versus power for single-channel optical fiber system with NRZ coding
Fig. 13 Q-factor versus power for single-channel optical fiber system with DB coding
Because of the existence of nonlinear effects, the Q-factors were 14 at 9 dBm, 8.2 at 4 dBm, 16 at 7 dBm and 3.7 at 5 dBm for the AMI, chirped NRZ, NRZ and DB representations, respectively. In Fig. 13, the curve of Q-factor versus signal power is poor in terms of both threshold power and quality factor, even though DB coding is one of the important techniques used for optical transmission; this happens due to the interplay among several factors and their relation to this coding technique, which will be the scope of further studies. From the results, it is clear that the threshold power point of AMI exceeds those of the other representations, which allows a longer distance and a faster system, although its Q-factor lies below that of the Manchester case at the threshold point, the case which offers better signal quality at the receiver.
4 Conclusion

Kerr and scattering effects in a single-channel optical fiber have a dominant influence on the optical pulse. The results show that each kind of data representation scheme inside the optical transmitter gives a different value of signal power at the receiver. Manchester and ordinary NRZ provide the best performance by introducing the minimum signal power difference between transmitter and receiver, 12 orders each, outperforming the other schemes by 1.4, 2 and 2 orders for AMI, chirped NRZ and DB, respectively. AMI has the maximum resistance to optical nonlinearities, with a threshold power of 9 dBm, exceeding Manchester by 1 order and ordinary NRZ, DB and chirped NRZ by 2, 4 and 5 orders, respectively; this gives flexibility for increasing the transmission power and, in return, achieving a longer transmission distance.
References 1. Amit K, Simranjit S (2013) Performance comparison of modulation formats for wavelength re-modulated bi-directional passive optical network. Int J Eng Bus Enterp Appl 5(2):169–172 2. Varun M, Ankur S, Satinder P (2012) Performance evaluation of modulation format for optical system. Int J Electron Commun Technol 3(1):190–192 3. Ali Y, Saif H (2016) Mitigation of fiber nonlinearity effects in ultra high dense WDM system by using fractional fourier transform for 32 channel system. Eng Tech J 34(1):50–60 4. Hideki N, Yoshiaki Y, Keishi H (2002) Design of a 10-Gb/s burst-mode optical packet receiver module and it is demonstration in a WDM optical switching network. J Lightwave Technol 20(7):1078–1083 5. Vorgelegt V, Diplom I, Anes H, Aus B (2004) Investigations of high bit rate optical transmission systems employing a channel data rate of 40 Gb/s. Elektrotechnik und Informatik—der Technischen Universitat Berlin 6. Sushank C, Priyanka C, Abhishek S (2017) High Speed 4 × 2.5 Gbps-5 GHz AMI-WDM-RoF transmission system for WLANs. Opt Commun 40(3):285–288 7. Cheng K, Conradi J (2002) Reduction of pulse-to-pulse interaction using alternative RZ formats in 40-Gb/s systems. IEEE Photon Technol Lett 14(1):98–100
8. Franck T, Nielsen T, Stentz A (1997) Experimental verification of SBS suppression by duobinary modulation in integrated optics and optical fiber communications. In: 11th International conference on and 23rd European conference on optical communications, vol 17 (448), pp 71–74 9. Guo W, Lian K, Chun C (2005) A simple AMI-RZ transmitter based on single-arm intensity modulator and optical delay interferometer. Opt Commun 255(1):35–40 10. Mandeep K (2014) Performance analysis of optical fiber communication systems for NRZ, RZ and doubinary formats. Int J Adv Eng Res Sci (IJAERS) 1(5):81–84 11. Prabhdeep K, Amit G, Mandeep C (2015) Comparative analysis of Inter satellite optical wireless channel for NRZ and RZ modulation formats for different levels of input power. Procedia Comput Sci 58(2):572–577 12. Saumya S, Kamal K, Nar S (2016) Design and performance analysis of dispersion managed system with RZ and NRZ modulation format. In: IEEE International conference on control computing, communication and materials (ICCCCM) 13. Doutje T, Vincent E (2016) Proposals for cost-effectively upgrading passive optical networks to a 25G line rate. J Lightwave Technol 35(6):8733–8724 14. Ravneet K, Harminder K (2018) Comparative analysis of chirped AMI and DPSK modulation techniques in IS-OWC system. Optik 154(1):755–762 15. Abhishek Sharma et al. (2018) High speed radio over fiber system for wireless local area networks by incorporating alternate mark inversion scheme. J Opt Commu 1–5 16. Sasikala V, Chitra K (2018) Effects of cross-phase modulation and four-wave mixing in DWDM optical systems using RZ and NRZ signal. In: Optical and microwave technologies, Springer Nature Singapore Pte Lt, pp 53-66 17. Ameer H, Saif H, Mohannad A (2019) Investigation of the different compensation methods for single optical channel. Eng Appl Sci 14(9):3018–3022
Classification of Gait Patterns Using Overlapping Time Displacement of Batchwise Video Subclips Khang Nguyen, Jiawei Chee, Chong Wee Soh, Ngoc-Son Hoang, Jeong-Hoon Lim, Binh P. Nguyen, Chee-Kong Chui, and Matthew Chin Heng Chua
Abstract Gait patterns of Cerebral Palsy (CP) patients have been used for cluster and classification analysis. Diplegia is the paralysis of one or more body parts, which may be caused by CP and may come in various forms. Current clinical practice in diagnosing gait issues relies heavily on observation and is prone to human error. Following previous studies, the effectiveness of introducing modern machine learning techniques and processes to improve classification accuracy on gait video data was investigated. This paper proposes a novel feature engineering approach that transforms the original video into overlapping sub-clips, which not only maintains important features but also reduces training time. Multiple machine learning models were constructed to examine their individual performances before ensembling them to improve overall performance. The ensemble architecture consists of two stages.

K. Nguyen (B) Institute of Science and Information Technology, Hanoi, Vietnam e-mail: [email protected]
J. Chee · C. W. Soh · N.-S. Hoang · M. C. Heng Chua Institute of Systems Science, National University of Singapore, Singapore, Singapore e-mail: [email protected]
C. W. Soh e-mail: [email protected]
N.-S. Hoang e-mail: [email protected]
M. C. Heng Chua e-mail: [email protected]
J.-H. Lim Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore e-mail: [email protected]
B. P. Nguyen Victoria University of Wellington, Wellington, New Zealand e-mail: [email protected]
C.-K. Chui Department of Mechanical Engineering, National University of Singapore, Singapore, Singapore e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_10
The first stage is a probabilistic-based aggregator and normalizer; the second is a performance-weighted ensemble. Finally, the model classification accuracy reached over 95%, a marked improvement over the results obtained by models applied to a similar dataset in the literature. This highlights the effectiveness of the proposed method in classifying gait patterns and its potential to change current clinical practice in gait-related diagnosis. Keywords Motion analysis · Neural networks · Optical imaging · Optimization · Tracking
1 Introduction

Cerebral Palsy is a set of childhood movement disorders. The cause of CP is abnormal brain development, or damage in the portions of the brain that govern movement and balance [1]. Diplegia is the paralysis of symmetric body parts and a manifestation of symptoms found in patients suffering from CP [2]. Typical symptoms of CP and diplegia include poor movement coordination, poor motor function involving the lower and/or upper limbs, and muscle spasticity. In general, the most common cause of lower limb (legs) diplegia is CP, while a smaller but significant number of CP patients experience upper limb (arms) diplegia. CP treatment is a multi-disciplinary medical effort involving neurology, rehabilitation and therapy. As each patient may have varying levels of the multitude of possible symptoms, the treatment approach for each individual varies largely depending on the class or form of diplegia to which the patient belongs [3]. A form is defined by clinicians to provide a standard with which they can record and portray a clinical snapshot of a child with regard to patients of similar symptoms [4]. This allows the standard to assign a generalized prognosis to patients with comparable symptom states. Due to a lack of consensus and standards in the medical industry, many classification methods are used by different groups when it comes to classifying and clustering CP patients [5]. In our study, we use a classification method that divides CP into 4 main clinical classes, as in Table 1. These classes are defined according to the walking patterns of the patient as well as the motor repertoire available to the patient. It is noted, however, that this classification has a slight drawback due to the possibility of partial overlaps between forms [3]. This classification has been found to be effective in facilitating clinical assessments, providing a framework for assigning treatments and benchmarking their effectiveness for patients [6]. Furthermore, this method of classification has been shown to be supported by the Gross Motor Function Classification System (GMFCS), in which GMFCS levels correlate well with the forms of diplegia [6, 7]. This paper proposes using various techniques to improve the classification accuracy of the current model, which was constructed from the dataset of [3]. Such techniques include the use of pre-processing steps, different machine learning models, as well as a nested ensembling approach of multiple models.
Table 1 Different forms/classes of Diplegia [3]

Form | Main traits
Form 1 | Antepulsion of trunk, toe balancing. Consistent support from causes
Form 2 | Pronounced knee flexion in midstance, loaded knee behavior, short steps
Form 3 | Frontal trunk swinging and use of upper limbs to keep balance, presence of dys-perceptive disorders (fear of falling and of open spaces)
Form 4 | Mainly a motor deficit. Increased talipes equinus at the start of walking. Difficulty in stopping walking immediately
Machine learning is commonly employed as a classification and categorization tool. Given a set of pre-labelled data, machine learning models may be trained to identify, interpolate and classify new or unseen data points. Due to the effectiveness and cost-saving benefit of automated classifiers, much progress has been made on gait classification for various purposes beyond the domain of diplegia [8]. Machine learning using the various available models, such as neural networks, support vector machines and random forests, has been used for gait analysis and classification [9]. Ferrari et al. [5] took two distinct approaches to the problem, viewing the data as a time-series sequence of steps and as static frequency-based data obtained by Fourier transform of the original dataset. Long Short-Term Memory (LSTM) was used for time-series analysis, while a Multi-Layer Perceptron (MLP) was used on the Fourier-transformed dataset. A unique pre-processing transformation technique was proposed and adopted by the authors: converting the original set of 3-dimensional (3-D) locations of each body marker into a set of 27 3-D angles. This transformation was done due to knowledge of the significance of angular information with regard to clinical signs [10]. The results of the original authors showed an accuracy of 58% and 64% for the MLP and LSTM models, respectively. As a baseline, a radial-basis Support Vector Machine (SVM) was used and later found to be outperformed by both LSTM and MLP. Their paper further reveals that there was difficulty in accurately classifying Form 3 data, which suggests a lack of easily identifiable traits on which models can be trained.

Dataset. The dataset was obtained via optoelectronic markers and sensors, which provide gait data in the form of 19 3-D markers (Table 2). Specifically, the dataset consists of 1139 trials (walks) involving 178 patients collected at the Laboratorio Analisi del Movimento del Bambino Disabile of the hospital Arcispedale S. Maria Nuova, hosted in a 12 m × 8 m room equipped with a Vicon system of 8 MX+ cameras. The resultant data from the trials were saved in coordinate 3D (C3D) format. The distribution of patients and trials can be found in Table 3. The raw measurements were subsampled to reduce the sampling rate from 100 fps to 50 fps. While the raw data were provided in C3D format, the authors released the full anonymised dataset in the format of a Numpy array file (.npy). It is noted that the data obtained from the Numpy array files should in principle be identical to the raw data used by the original authors.
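A minimal sketch of how the released Numpy data might be loaded and subsampled from 100 to 50 fps (the file name and per-trial array layout are our assumptions, not the authors' published loader):

```python
import numpy as np

# Hypothetical file name; assume each entry is one trial of shape (frames, 19, 3)
trials = np.load("diplegia_trials.npy", allow_pickle=True)

# Keep every second frame: 100 fps -> 50 fps
subsampled = [trial[::2] for trial in trials]
```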
Table 2 Dataset—3-D body markers [3]

No. | Identifier | Marker position
1 | C7 | 7th cervical vertebrae
2 | RA | Right acromioclavicular joint
3 | LA | Left acromioclavicular joint
4 | REP | Right lateral elbow epicondyle
5 | LEP | Left lateral elbow epicondyle
6 | RUL | Right lateral prominence of ulna
7 | LUL | Left lateral prominence of ulna
8 | RASIS | Right anterior superior iliac spine
9 | LASIS | Left anterior superior iliac spine
10 | RPSIS | Right posterior superior iliac spine
11 | LPSIS | Left posterior superior iliac spine
12 | RGT | Right prominence of the greater trochanter
13 | LGT | Left prominence of the greater trochanter
14 | RLE | Right lateral knee epicondyle
15 | LLE | Left lateral knee epicondyle
16 | RCA | Right upper ridge of the calcaneus posterior surface
17 | LCA | Left upper ridge of the calcaneus posterior surface
18 | RFM | Right dorsal aspect of first metatarsal head
19 | LFM | Left dorsal aspect of first metatarsal head
Table 3 Dataset—distribution of patients and trials

Class | No. of patients | No. of trials
0 | 13 | 65
1 | 52 | 297
2 | 34 | 246
3 | 79 | 531
2 Methodology

2.1 Proposed Architecture

Figure 1 provides an overview of the architecture of the proposed methodology for improving the model's overall performance. It consists of two novel approaches. The former focuses on feature engineering, with the objective of creating value from the original dataset. The latter, on the other hand, explores complementary interactions among the individual models: as they are constructed independently and their architectures are distinctly different from each other, there lies a certain degree of diversity, which justifies the use of a model ensemble [11].
Fig. 1 Proposed architecture (data input: dataset with 3-D angle transformation or raw coordinates; data pre-processing: sequence transformation and sub-trial split into static frame sequences or time series, with feature engineering and normalization; model training and prediction with MLP, SVM, RF and LSTM, each followed by sub-trial prediction aggregation and normalization; ensemble prediction via a probabilistic normalizer and performance-weighted aggregator)
2.2 Data Pre-processing

Data pre-processing is a common step in machine learning applications. The benefits of pre-processed data include faster training times and better accuracy [12]. Two main data pre-processing techniques are utilized in the proposed approach, as sketched below. Principal Component Analysis (PCA) is a statistical tool often used to perform an orthogonal transformation as a pre-processing step. PCA removes inter-variable dependency by creating a new set of uncorrelated principal components [9]. Furthermore, PCA allows us to take the top N components that have the largest variance. As a result, unimportant components or features can be omitted, which gives faster training times as well as better accuracy and prediction consistency. Normalization is the transformation of data such that the scale of each feature is equalized. This allows for more efficient learning, as the data points are standardised to a common, more easily trainable scale of values [10]. Furthermore, training neural networks on normalized values tends to be easier because their activation functions may not be suitable for overly large or small values [13].
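A minimal sketch of these two pre-processing steps with scikit-learn, assuming each sub-clip has already been flattened into a feature vector (the array shapes are placeholders):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

X = np.random.rand(24296, 4275)    # placeholder for the flattened sub-clip features

# Min-max normalization: rescale every feature to [0, 1]
X_norm = MinMaxScaler().fit_transform(X)

# PCA keeping enough components to explain 95% of the total variance
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_norm)
print(X_reduced.shape[1])          # number of retained principal components
```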
Fig. 2 Batch-wise transformation with overlap
2.3 Feature Engineering

It is well established in statistical classification that various functions exist to examine the similarities between two objects. One of them is the Fisher kernel, which can process data of variable length [14]. Analogously, in video classification, while a full video clip records the entire motion of a Diplegia patient, it is a subset of the entire clip that characterizes the stage of Diplegia the patient is suffering from [15]. This inspires a customized algorithm that transforms the full-duration video clip into batch-wise sub-clips. Figure 2 illustrates the output of the algorithm, and a minimal code sketch is given at the end of this subsection. After dividing the full-duration video clip into equal sizes, these batch-wise clips are overlapped to avoid losing information at the boundaries between them. Not only does this algorithm allow the model to gain focus during the training process, it also reduces the training time significantly, because a single record now consists of only a fraction of the number of frames of the full video clip. The original paper by Ferrari et al. [5] adopted only the 3-D angles in the dataset, as the authors believe that most of the clinical signs are strongly related to angular information [7]. Alternatively, our paper explores how the raw coordinate data might be useful as well, under the hypothesis that there may be information that resides only in the raw coordinates. In addition to data pre-processing, min-max normalization is carried out to eliminate the effect of gross values [16]. This is especially important for principal component analysis (PCA), as it is a variance-maximizing exercise. PCA is also conducted for dimensionality reduction, which is crucial in reducing the number of features and improving the accuracy of training a machine learning model [16]. Lastly, the full dataset is divided into training and test datasets in the ratio of 4:1, a necessary countermeasure to test for overfitting [17].
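A minimal sketch of the overlapping batch-wise split (the window and stride sizes are illustrative assumptions; the paper does not state its exact values):

```python
import numpy as np

def split_overlapping(clip, window=50, stride=25):
    """Split a (frames, features) clip into overlapping fixed-size sub-clips.

    Choosing stride < window makes consecutive sub-clips overlap, so no
    boundary between sub-clips cuts a gait event out of every window.
    """
    subclips = []
    for start in range(0, clip.shape[0] - window + 1, stride):
        subclips.append(clip[start:start + window])
    return np.stack(subclips)

clip = np.random.rand(300, 57)   # e.g., 300 frames x (19 markers x 3 coordinates)
subs = split_overlapping(clip)
print(subs.shape)                # (11, 50, 57)
```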
2.4 Nested Ensemble

A nested ensemble approach was adopted to improve overall performance through (1) the minimization of errors in variance and (2) factoring in the imperfection of the models. It consists of two different ensemble stages:
Probabilistic-Based Aggregator and Normalizer. The proposed feature engineering algorithm implies that the machine learning models are built to classify individual subsets of a complete video clip. To classify patients into the various stages of Diplegia, the preliminary results from the models therefore need to be processed. Instead of using a common majority-voting ensemble logic, the approach aggregates and normalizes the probability outputs that the model predicts for each trial [18]. For instance, in a neural network, the probability output for classification problems is provided by the softmax layer. The main motivation of this decision is to factor in the imperfection across all the models [19]. It is highly unlikely to have a model that can predict with 100% precision across all classes of Diplegia. A majority-voting logic cannot account for such imperfection, as its results are binary for each prediction class. A probabilistic approach, on the other hand, addresses this limitation by associating each predicted class with a probability: while a model may favour a specific class for a particular test subject, it also acknowledges that there is a chance the test subject falls into another class [20].

Performance-Weighted Ensembler. Four different types of machine learning models are constructed to explore their strengths, weaknesses and diversity, and eventually how they can complement each other. On top of this array of models, a final layer of ensemble logic is built to receive the probability inputs from the individual models, weigh them against their accuracies and produce a final class label for a specific test case [21].
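A compact sketch of the two stages, under our own simplifying assumptions (mean aggregation of per-sub-clip softmax outputs, and accuracy-proportional weights; the paper does not spell out its exact weighting formula):

```python
import numpy as np

def aggregate_trial(subclip_probs):
    """Stage 1: average one model's per-sub-clip softmax outputs for a trial
    and renormalize, giving a single class distribution per trial."""
    p = np.mean(subclip_probs, axis=0)
    return p / p.sum()

def weighted_ensemble(trial_probs_per_model, accuracies):
    """Stage 2: weight each model's trial-level distribution by its validation
    accuracy and pick the class with the largest combined probability."""
    w = np.asarray(accuracies) / np.sum(accuracies)
    combined = sum(wi * pi for wi, pi in zip(w, trial_probs_per_model))
    return int(np.argmax(combined))

# Toy example: two models, one trial with three sub-clips, four classes
m1 = aggregate_trial(np.random.dirichlet(np.ones(4), size=3))
m2 = aggregate_trial(np.random.dirichlet(np.ones(4), size=3))
print(weighted_ensemble([m1, m2], accuracies=[0.66, 0.71]))
```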
3 Results and Discussion

In this section, the improvements arising from the proposed approaches in feature engineering and nested ensembling are quantified and discussed. This is done by comparing the performance of the proposed model with the baseline published in previous studies [5].
3.1 Feature Engineering

The proposed approach requires a full video clip to be split into overlapping batchwise subclips. This resulted in a large increase in the total number of records, from 1139 to 24,296. On the other hand, as each subclip is now significantly shorter, there is a proportional decrease in the number of features available for training. Nevertheless, principal component analysis continues to be carried out to improve the model's training process through dimensionality reduction. Based on retaining 95% of the total variance, Table 4 details the reduction in features for the two datasets, Raw Coordinates and 3-D Angles.
Table 4 Features on raw coordinates and 3-D angles

Dataset | Number of features before | Number of features after
Raw coordinates | 4275 | 18
3-D angles | 6075 | 455
Fig. 3 Scree plot of 3-D angles dataset
The scree plots arising from principal component analysis are illustrated in Figs. 3 and 4. The former plot has been truncated, as it is not possible to plot the results of all 455 components. It is evident from the latter plot that, although 18 principal components account for 95% of the variance, the first 3 components already explain the majority (80%) of the total variance. This further affirms (1) our hypothesis that not all the time frames are necessary to explain a specific form of Diplegia and (2) the decision to split the video clip into batchwise overlapping subclips. With the necessary pre-processing performed, the Raw Coordinates and 3-D Angles datasets are used independently to train all 4 models (LSTM, MLP, Random Forests (RF), SVM). Their architectures and test results are presented in Tables 5 and 6, respectively. With reference to the MLP model, an increase in accuracy of roughly 50% relative to the baseline established earlier has been observed. In addition, Table 6 also illustrates that the 3-D Angles dataset might not benefit all models. In fact, LSTM, RF and SVM do not train well with the 3-D Angles dataset. One possible reason is that these models do not generalize well if there are too many features [21]. Even with normalization and PCA, the number of features in the transformed 3-D Angles dataset remains large at 455, approximately 25 times more than in the transformed Raw Coordinates dataset.
Fig. 4 Scree plot of raw coordinates dataset

Table 5 Architecture/hyper-parameters of models

Model | Architecture/hyper-parameters
LSTM | Architecture: LSTM(100, 64) - FC(100, 32) - Flatten(3200) - FC(1024) - FC(496) - FC(64) - FC(32) - Dropout(0.5) - FC(4). Activation functions: tanh for LSTM, softmax for the last FC (fully-connected) layer, ReLU for the rest. Metric: accuracy. No. of epochs: 100
MLP | Architecture: FC(1000) - FC(1000) - FC(2048) - FC(518) - FC(64) - Dropout(0.5) - FC(16) - FC(4). Activation functions: softmax for the last FC layer, ReLU for the rest. Metric: accuracy. No. of epochs: 40
RF | bootstrap=True, classWeight=None, criterion='gini', maxDepth=None, maxFeatures='auto', maxLeafNodes=None, minImpurityDecrease=0.0, minImpuritySplit=None, minSamplesLeaf=1, minSamplesSplit=2, minWeightFractionLeaf=0.0, nEstimators=100, oobScore=False, randomState=None, warmStart=False
SVM | C=1.0, cacheSize=200, classWeight=None, coef0=0.0, decisionFunctionShape='ovr', degree=3, gamma=0.1, kernel='rbf', maxIter=-1, probability=True, randomState=None, shrinking=True, tol=0.001
Table 6 Test results of the adopted models

Metrics | Baseline | LSTM | MLP | RF | SVM
Accuracy—3-D angles | 64% | 47% | 98% | 58% | 66%
Accuracy—Raw coordinates | – | 53% | 88% | 66% | 71%
Fig. 5 Precision evaluation of the models (precision by Diplegia class, 0-3, for Random Forest, Support Vector Machine, Long Short-Term Memory and Multi-Layer Perceptron)

Fig. 6 Recall evaluation of the models (recall by Diplegia class, 0-3, for the same four models)
For each type of model, the better-performing variant is selected, and their results are illustrated in Figs. 5 and 6. It is clear that MLP dominates all other models in precision and recall. MLP aside, however, the performance of the other models varies. For instance, SVM has greater precision than RF in class 0 but poorer precision in class 3. Hence, if MLP is excluded, the performance of these individual models might be further improved through an ensemble.
3.2 Nested Ensembling Table 7 illustrates the results from the ensemble logic. Option 1 excludes MLP and it is evident that our probabilistic-based and performance-weighted ensemble logic
Table 7 Ensemble results

Option | LSTM | MLP | RF | SVM | Ensemble v1 | Ensemble v2
1 | 53% | – | 66%, 58% | 71%, 66% | 77% | –
2 | 53% | 88%, 98% | 66%, 58% | 71%, 66% | – | 95%
Fig. 7 Confusion matrix
boosts overall accuracy by approximately 8%. However, the underlying condition is that the models' accuracies have to be within 20% of each other. In option 2, where there is a dominant model, the ensembling logic fails to serve its purpose: overall accuracy suffers a 3% drop. The MLP model utilizing the 3-D Angles dataset should be established as the new baseline, as it has near-perfect accuracy results. The confusion matrix is plotted accordingly in Fig. 7 and is in line with the high accuracy, precision and recall across all classes.
4 Conclusion

Novel approaches in feature engineering and ensembling have been proposed for the classification of the various forms of Diplegia. The former focuses on transforming the original dataset and extracting value from the data by separating it into overlapping sub-clips. The accuracy of the proposed model improves from 64% to 98%, a significant improvement over the baseline established earlier. The latter approach exploits the diversity among the individual models to improve overall performance through a probabilistic-based, performance-weighted ensemble logic. The proposed ensemble architecture has proven effective in boosting overall accuracy by 8% when the accuracies of the individual models are within 20% of each other. We can extend our work by adding to the ensemble other classifiers such as kernel dictionary learning [22, 23], enhanced k-NN [24], and wide and deep
learning [25]. This paper highlights the potential of the proposed approach to be used as a more quantitative measurement of gait patterns in clinical practice, hence reducing human error in interpretation. Acknowledgements This research is supported by the Singapore Ministry of Health's National Medical Research Council under its Enabling Innovation Grant, Grant No: NMRC/EIG06/2017.
References 1. Krigger KW (2006) Cerebral palsy: an overview. Am Family Physician 73(1) 2. Dobson F, Morris ME et al (2007) Gait classification in children with cerebral palsy: a systematic review 3. Liptak GS, Accardo PJ (2004) Health and social outcomes of children with cerebral palsy. J Pediatr 145(2):S36–S41 4. Bax M, Goldstein M, et al (2005) Proposed definition and classification of cerebral palsy 5. Ferrari A, Bergamini L et al (2019) Gait-based diplegia classification using LSMT networks. J Healthcare Eng 2019:1–8 6. Ferrari A, Brunner R et al (2015) Gait analysis contribution to problems identification and surgical planning in CP patients: An agreement study. Euro J Phys Rehab Med 51(1):39–48 7. Cioni G, Lodesani M, et al (2008) The term diplegia should be enhanced. Part II: contribution to validation of the new rehabilitation oriented classification. Euro J Phys Rehab Med 44(2):203– 211 8. Mannini A, Trojaniello D et al (2016) A machine learning framework for gait classification using inertial sensors: Application to elderly, post-stroke and huntington’s disease patients. Sensors 16(1):134 9. Kohle M, Merkl D, Kastner J (1997) Clinical gait analysis by neural networks: issues and experiences. In: Proceedings of computer based medical systems. IEEE, New York, pp 138– 143 10. Ferrari A, Alboresi S, et al (2008) The term diplegia should be enhanced. Part I: A new rehabilitation oriented classification of cerebral palsy. Euro J Phys Rehab Med 44(2):195–201 11. Patro SGK, Sahu KK (2015) Normalization: a preprocessing stage. CoRR abs/1503.06462, 1–4 12. Hansen LK, Salamon P (1990) Neural network ensembles. IEEE Trans Pattern Anal Mach Intell 12(10):993–1001 13. Kuncheva LI, Whitaker CJ (2003) Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Mach Learn 51(2):181–207 14. Sola J, Sevilla J (1997) Importance of input data normalization for the application of neural networks to complex industrial problems. IEEE Trans Nucl Sci 44(3):1464–1468 15. Yang J, Frangi AF et al (2005) KPCA plus LDA: A complete kernel Fisher discriminant framework for feature extraction and recognition. IEEE Trans Pattern Anal Mach Intell 27(2):230– 244 16. Jayalakshmi T, Santhakumaran A (2011) Statistical normalization and back propagation for classification. Int J Comput Theory Eng 3(1):1793–8201 17. Shlens J (2014) A tutorial on principal component analysis. CoRR abs/1404.1100, 1–12 18. Gonçalves I, Silva S (2013) Balancing learning and overfitting in genetic programming with interleaved sampling of training data. In: Krawiec K, et al (eds) European conference on genetic programming, LNCS, vol 7384, pp 73–84 19. Bridle JS (1990) Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In: Neurocomputing. Springer, Berlin
20. Dietterich TG (2000) Ensemble methods in machine learning. In: Kittler J, Roli F (eds) International workshop on multiple classifier systems (MCS 2000), LNCS, vol 1857. Springer, Berlin, pp 1–15 21. Kuncheva LI, Whitaker CJ et al (2003) Limits on the majority vote accuracy in classifier fusion. Patt Anal Appl 22. Chen X, Nguyen BP et al (2016) Automated brain tumor segmentation using kernel dictionary learning and superpixel-level features. In: Proceedings of the international conference on systems, man, and cybernetics, pp 2547–2552 23. Chen X, Nguyen BP et al (2017) An automatic framework for multi-label brain tumor segmentation based on kernel sparse representation. Acta Polytechnica Hungarica 14(1):25–43 24. Nguyen BP, Tay WL, Chui CK (2015) Robust biometric recognition from palm depth images for gloved hands. IEEE Trans Human-Mach Syst 45(6):799–804 25. Nguyen BP, Pham HN et al (2019) Predicting the onset of type 2 diabetes using wide and deep learning with electronic health records. Comput Methods Programs Biomed 182:105055
A Comparative Study to Classify Cumulonimbus Cloud Using Pre-trained CNN Sitikantha Chattopadhyay, Souvik Pal, Pinaki Pratim Acharjya, and Sonali Bhattacharyya
Abstract A thunderstorm is one of the most dangerous natural phenomena. Clouds can be broadly classified into three types; among them, the cumulonimbus cloud is the main ingredient of this kind of severe weather. In this paper, we classify cumulonimbus clouds in natural images of different clouds. We use watershed transformation to segment only the cloud parts from those images and two pre-trained convolutional neural networks, namely AlexNet and GoogLeNet, to classify them into cumulonimbus and non-cumulonimbus clouds. We find that AlexNet performs much better than GoogLeNet in terms of total training time with the same training parameters. The simulation has been done in MATLAB, giving a promising accuracy of 81.65% in the case of AlexNet. Keywords Digital image processing · AlexNet · GoogLeNet
1 Introduction

Weather is chaotic in nature, meaning it cannot be fully predicted. There are many weather parameters, such as air pressure, temperature and humidity. Over a specific time range, the changes in these parameters are observed, and then the weather condition at a particular geographic location can be predicted.
S. Chattopadhyay Department of CSE, Brainware University, Barasat, India e-mail: [email protected] S. Pal (B) Department of CSE, Global Institute of Management and Technology, Krishnanagar, India e-mail: [email protected] P. P. Acharjya Department of CSE, Haldia Institute of Technology, Haldia, India e-mail: [email protected] S. Bhattacharyya Department of CSE, JIS College of Engineering, Kalyani, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_11
Among those parameters, cloud type is an important one. There are three basic cloud types: cumulonimbus, cirrus and stratus. Among them, the cumulonimbus cloud is mainly responsible for thunderstorms and heavy rain over a specific location. It can form from cumulus cloud, which gives pleasant weather. A cumulonimbus cloud is a puffy, cotton-like cloud with very high density, generally vertically shaped. It is made of water vapor and can cause thunderstorms, and its top is generally "anvil-shaped." In this paper, our main objective is to identify this type of cloud [1, 2]. This paper is organized as follows. Module 1 is the introduction to the cumulonimbus cloud, followed by morphological operations and watershed segmentation in module 2. Module 3 gives a brief overview of the two pre-trained networks used in this work. The proposed method is discussed in module 4, followed by the experimental results in module 5. Conclusion and future work are discussed in module 6, and the references in module 7.
2 Watershed Segmentation and Mathematical Morphology

Segmenting a specific part of a digital image is one of the fundamental operations in digital image processing; it is called image segmentation, and it can be done in different ways. We have segmented the cloud portions from the real images using the marker-controlled watershed segmentation technique, which is built from several morphological operations. Morphology is a biological term related to the study of the shape and size of animals and plants. Mathematical morphology is a collection of basic and advanced set-theoretic operations that can be used to solve many problems in digital image processing. The two basic morphological operations are dilation and erosion [3]. Dilation can be used to add pixels to an object boundary, while erosion removes pixels from an object boundary. At first, these two operations were applied to binary images, but later they were extended to gray-scale images. Both operations need a structuring element, which is another (small) digital image; its shape and structure directly determine the number of pixels added to or removed from an object boundary. Let f(x, y) and b(x, y) be the digital image and the structuring element, respectively; then dilation can be expressed as follows:

[f ⊕ b](x, y) = max_{(s,t)∈b} f(x − s, y − t)    (1)
Similarly, erosion can be expressed as follows:

[f ⊖ b](x, y) = min_{(s,t)∈b} f(x + s, y + t)    (2)
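These definitions map directly onto standard library routines; a small illustrative sketch with SciPy in Python (the paper's own implementation is in MATLAB):

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

img = np.random.randint(0, 256, (128, 128)).astype(np.uint8)  # stand-in gray image

# With a flat 3x3 structuring element, grayscale dilation is the local maximum
# (Eq. 1) and grayscale erosion the local minimum (Eq. 2)
dilated = grey_dilation(img, size=(3, 3))
eroded = grey_erosion(img, size=(3, 3))
```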
Apart from dilation and erosion, there are many other operations; among them, opening and closing are the most important, and they are directly derived from dilation and erosion. Opening is erosion followed by dilation with the same structuring element; closing is dilation followed by erosion with the same structuring element. The result of opening is closely related to erosion, but its effect is weaker than that of erosion; similarly, the result of closing is closely related to dilation, but its effect is weaker than that of dilation. Other morphological operations include morphological thinning, thickening, skeletonization and edge detection. Watershed segmentation can be used to segment touching objects present in a digital image, with or without the help of controlled markers. For a noisy image, a watershed algorithm without controlled markers leads to a problem called "over-segmentation," which can be solved using controlled markers. The algorithm treats a digital image as a topographic surface in which dark pixels are considered low values and light pixels high values. It works better if the foreground object (also called the object of interest) can be distinguished from the background [4]. The five main steps of marker-controlled watershed segmentation are as follows (a code sketch follows the list):
1. Calculation of the segmentation function.
2. Foreground marker identification.
3. Background marker identification.
4. Modification of the segmentation function so that it contains minima only at the marker locations.
5. Calculation of the watershed transform of this modified segmentation function [5].
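The five steps map naturally onto library routines; a rough sketch with scikit-image in Python (the paper's implementation is in MATLAB, and the intensity thresholds here are illustrative assumptions, not the authors' values):

```python
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed

def marker_watershed(gray, low=30, high=150):
    gradient = sobel(gray)                  # step 1: segmentation function
    markers = np.zeros_like(gray, dtype=int)
    markers[gray > high] = 2                # step 2: foreground marker (bright cloud)
    markers[gray < low] = 1                 # step 3: background marker (dark sky)
    # steps 4-5: flooding the gradient image from the imposed minima (the markers)
    return watershed(gradient, markers)
```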
3 Pre-Trained CNN (AlexNet and GoogLeNet)

Machine learning and deep learning rely on single-layer and multilayer artificial neural networks. These networks can perform tasks for which they are not explicitly programmed: by training on a specific set of data, they gain sufficient knowledge, and by applying that knowledge, many tasks can be done that were not previously defined. In this paper, we work with two pre-trained convolutional neural networks (CNN), namely AlexNet [6] and GoogLeNet [7]. Sometimes resource allocation [8] is also a vital issue in a cloud computing environment. AlexNet is a pre-trained CNN trained on a subset of the well-known ImageNet database used in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC). This network is trained on millions of real-life images and can classify an image into one of 1000 different categories, for example keyboard, mouse, pencil and different animals. The network has 25 layers in total; among them, 8 layers have learnable weights: 5 convolutional layers and 3 fully connected layers. It is designed to accept digital images of size 227 × 227 × 3 [9].
Table 1 Steps for cloud segmentation and classification

Step | Explanation
Step 1 | A total of 545 different cloud images have been taken
Step 2 | Only the cloud parts have been segmented using the marker-controlled watershed segmentation algorithm
Step 3 | These parts are given to both AlexNet and GoogLeNet for training
Step 4 | The parts are resized according to the input size of the respective CNN
Step 5 | 80% of the total images are used for training and 20% for validation
Step 6 | Transfer learning is applied, with the last three layers of both CNNs modified
Step 7 | After training, validation is done with the 20% of the total images
Step 8 | Both CNNs give a validation accuracy of 81.65%
Since we have far fewer cloud images than the original training set of this model, we have used the transfer learning method to train the network with our own images. This helps us gain the power of the pre-trained network with less time and complexity. GoogLeNet is another pre-trained CNN like AlexNet, but of a different shape and size. It is also trained on millions of real-life images from the ImageNet database and is capable of classifying 1000 different object classes. Like AlexNet, GoogLeNet accepts digital images, of size 224 × 224 × 3. It is a very large network, with 144 layers, compared with AlexNet. We have performed transfer learning on this network as well, to gain the power of this pre-trained network with less time and complexity [10].
4 Proposed Method

In this paper, our objective is to identify the presence of cumulonimbus cloud in a cloud image; if we can do that, then the chance of a thunderstorm can be detected. The step-by-step process of cumulonimbus cloud segmentation and classification is explained in Table 1. Here we have taken only 545 different images for training purposes. If this training set can be enlarged, then higher accuracy at the validation level can be achieved.
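The implementation here is in MATLAB; for illustration, the same transfer-learning idea (replace the final classification layer of a pre-trained network with a new two-class head) can be sketched in Python with torchvision, assuming a two-class cumulonimbus/non-cumulonimbus problem:

```python
import torch.nn as nn
from torchvision import models

# Load AlexNet with ImageNet weights and freeze the convolutional features
model = models.alexnet(pretrained=True)
for p in model.features.parameters():
    p.requires_grad = False

# Replace the last fully connected layer: 1000 ImageNet classes -> 2 cloud classes
model.classifier[6] = nn.Linear(4096, 2)
```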
5 Experimental Results Since we have taken two pre-trained CNNs to implement our work, we have two sets of experimental results—one for AlexNet and another for GoogLeNet. Total implementation has been done with MATLAB R2017B. Figure 1 shows the validation accuracy of AlexNet. This figure clearly explains transfer learning for AlexNet.
Fig. 1 Transfer learning for AlexNet
Figure 2 shows the same for GoogLeNet. Both reach 81.65% accuracy at the validation stage with an initial learning rate of 0.0001. Figures 3 and 4 show the corresponding validation results for AlexNet and GoogLeNet, respectively, and both clearly show the validation outputs. The two networks give almost the same accuracy at the validation level, but GoogLeNet takes much more time to train than AlexNet on the same training image set, as illustrated in Fig. 5. With the same input training dataset, the total execution time for training AlexNet is 96 ns, whereas with the same parameters GoogLeNet takes 310 ns; here 1 represents AlexNet and 2 represents GoogLeNet.
Fig. 2 Transfer learning for GoogLeNet
Fig. 3 Validation result for AlexNet
Fig. 4 Validation result for GoogLeNet
6 Conclusion and Future Work

In this paper, we have worked with the marker-controlled watershed segmentation technique. The marker is used to overcome the "over-segmentation" problem in a noisy image; if the image is not noisy, then plain watershed segmentation is sufficient to segment a specific region of interest (in this work, the cloud part). We have performed training on two pre-trained networks and used the features extracted from those networks to classify cumulonimbus clouds. Since the structure and working principle of AlexNet and GoogLeNet are predefined, the validation accuracy might improve if we design our own network; but in that situation, we would require more example images to train that network.
Fig. 5 Comparison of total training time
Since GoogLeNet is much larger than AlexNet, the corresponding training time is also larger for GoogLeNet, while the two give almost the same accuracy. Therefore, it can be concluded that AlexNet performs better than GoogLeNet in terms of total training time. This time is directly related to the number of epochs, the number of iterations and the initial learning rate; to improve the validation accuracy, these parameters must be tuned accordingly. Other pre-trained networks also exist, such as VGG-16 and VGG-19; if these networks were used for the same purpose, the training time and validation accuracy would differ from those of AlexNet and GoogLeNet.
References 1. Chakrabarty D, Biswas HR, Das GK, Kore PA (2008) Observational aspects and analysis of events of severe thunderstorms during April and May 2006 for Assam and adjoining states—a case study on Pilot storm project. vol 59, Issue 4, Mausam, pp 461-478 2. Galvin JFP (2009) The weather and climate of the tropics: part 8 Mesoscale weather systems. Weather 64(2):32-38 3. Gonzalez RC, Woods RE (2004) Digital image processing. 2nd edn. Prentice Hall 4. Hamdi MA (2011) Modified algorithm marker-controlled watershed transform for image segmentation based on curvelet threshold. Can J Image Process Comput Vis 2(8):88–91 5. Couprie C, Grady L, Najman L, Talbot H (2009) Power watersheds: a new image segmentation framework extending graph cuts random walker and optimal spanning forest. In: Proceedings ICCV, Kyoto, Japan, pp 731–738 6. Muhammad NA, Ab Nasir A, Ibrahim Z, Sabri N (2018) Evaluation of CNN, alexnet and googlenet for fruit recognition. Indonesian J Electri Eng Comput Sci 12(2):468–475 7. Krizhevsky A, Sutskever I, Hinton G (2012) Image net classification with deep convolutional neural networks. In: Proceedings advances in neural information processing systems, Lake Tahoe, NV, USA, pp 1097–1105 8. Pal S, Kumar R, Son LH et al (2019) Novel probabilistic resource migration algorithm for cross-cloud live migration of virtual machines in public cloud. J Supercomput 75:5848–5865
9. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Boston, 7–12 June, pp 1–9 10. Sudha KK, Sujatha P (2019) A qualitative analysis of googlenet and alexnet for fabric defect detection. Int J Recent Technol Eng 8(1):86–92
Improved Electric Wheelchair Controlled by Head Motion He-Thong Bui, Le-Van Nguyen, Thanh-Nghi Ngo, Tuan-Sinh V. Nguyen, Anh-Ngoc T. Ho, and Qui-Tra Phan
Abstract Electric wheelchairs are designed to assist disabled people. However, for people with a higher degree of impairment, such as quadriplegics, who cannot move any body part except the head, medical devices are usually very complex, rare and expensive. In this paper, we improve an electric wheelchair model that is controlled by the motion of the head using a microcontroller system. The system includes microelectromechanical components, tilt angle sensors and mechanical mechanisms. A new head-motion recognition technique based on processing angular acceleration data is presented. The control of the wheelchair motion has been achieved by using the MPU 6050 tilt angle sensor and an Arduino UNO R3 microcontroller. The signal of the head movement direction is sent to the microcontroller; depending on the orientation of the sensor, the microcontroller controls the wheels to turn left, turn right, or move forward or backward with the help of DC motors. Keywords Wheelchair · Quadriplegia · Medical devices · Tilt angle sensors · Head motion
H.-T. Bui (B) · L.-V. Nguyen · T.-S. V. Nguyen · A.-N. T. Ho · Q.-T. Phan The University of Da Nang-University of Technology and Education, Danang, Vietnam e-mail: [email protected] L.-V. Nguyen e-mail: [email protected] T.-S. V. Nguyen e-mail: [email protected] A.-N. T. Ho e-mail: [email protected] Q.-T. Phan e-mail: [email protected] T.-N. Ngo The University of Da Nang-University of Science and Technology, Danang, Vietnam e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_12
1 Introduction

According to the report of the World Health Organization (WHO), the number of disabled people (those who have one or more physical or mental impairments that cause a significant and long-term reduction of their ability to perform everyday activities) is about 1 billion in the world. Two hundred million of them have serious difficulties in daily life, and it is estimated that about 100 million people with disabilities need a wheelchair for their daily life [1, 2]. In Vietnam, according to the report of the Ministry of Labor, War Invalids and Social Affairs, by 2015 there were about seven million people with disabilities, accounting for 7.8 percent of the population. People with special and severe disabilities accounted for nearly 29% (such as paralysis of the limbs, being confined to bed, or being unable to move by themselves) [3]. Quadriplegia is a condition caused by spinal cord injury. When the spinal cord is damaged, people lose their sensation and their ability to move. Quadriplegia includes paralysis of the arms, hands, torso, legs and pelvic organs. Such reduced mobility may have various causes, such as stroke, arthritis, high blood pressure, osteoarthritis, paralysis and birth defects; in addition, quadriplegia may appear as a result of an accident or of age. Patients with such serious disabilities cannot perform daily actions such as eating, using the restroom and moving. Depending on the severity of the disability, the patient may maintain freedom of movement to a certain extent by using various medical devices [4]; these assistive devices help them move easily and contribute to their quality of life. A few studies have been carried out by the authors of [7–9] in the field of electric wheelchairs controlled by head motion. In our study, we research, redesign and improve an electric wheelchair to support the independent mobility of people with severe disabilities and quadriplegia. The product of our research is an electric wheelchair whose movement is driven by gestures of the head, helping patients move in any desired direction: going straight, backward, turning left and turning right; the wheelchair stops when the patient's head is upright. In our study, a microcontroller system has been developed for controlling a standard electric wheelchair with head motion. This article describes an electric wheelchair for people with disabilities developed using sensors that read the movements of head gestures and circuits that control the speed of the electric (DC) motors. The improved power wheelchair uses the Arduino UNO R3 microcontroller because of its low cost and easy programming. The MPU 6050 tilt angle acceleration sensor is used to translate head movements effectively into signals the computer can interpret. For motion detection, the accelerometer data are calibrated and filtered; the accelerometer measures the magnitude and direction of gravity. The model uses two DC motors, and the microcontroller is programmed in the C programming language.
2 Methodology In our study, we have used “Ergonomics” for the research, calculation and design of electric wheelchairs controlled by head motion. Ergonomics: “The scientific study of people and their working conditions, especially done in order to improve effectiveness” [5].
2.1 Selecting Electric Motor Weight of the wheelchair G0 = 250 N, maximum weight of the patient G = 1250 N, maximum speed 12 km/h. The electric wheelchair is equipped with two DC motors, and a chain transmission is used to reduce velocity. The kinetics and dynamics when the wheelchair moves on a flat road (or up a slope with an angle of less than 15°) are calculated approximately based on reference [6]. In particular, the load (weight of the person plus weight of the vehicle) is 1500 N, i.e. 750 N on each beam of the frame. The approximate conversion formula is as follows:

F = m·a + f·N   (1)

where m is the total mass of patient and wheelchair, m = 150 kg; a is the acceleration of the wheelchair, a = 0.1 m/s²; f is the friction factor between wheels and road, f = 0.015; and N is the reaction force from the road on the wheels, N = m·g, with g the gravitational acceleration, g = 9.81 m/s². Hence

F = 150 × 0.1 + 0.015 × 150 × 9.81 = 37.07 N   (2)

The necessary power of the DC motor is

P = F·v   (3)

where v is the maximum velocity of the wheelchair, v = 12 km/h = 3.33 m/s:

P = 37.07 × 3.33 = 123 W   (4)

So, we choose a DC motor of 120 W for each wheel.
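The sizing above is a plain arithmetic chain, so it can be checked with a few lines of code. The following is a minimal C sketch of Eqs. (1)-(4) using the constants quoted in the text; the variable names are illustrative only.

#include <stdio.h>

/* Minimal sketch of the motor sizing in Eqs. (1)-(4);
   constants are taken from the text above. */
int main(void) {
    const double m = 150.0;   /* total mass of patient + wheelchair, kg */
    const double a = 0.1;     /* acceleration of the wheelchair, m/s^2  */
    const double f = 0.015;   /* friction factor wheels/road            */
    const double g = 9.81;    /* gravitational acceleration, m/s^2      */
    const double v = 3.33;    /* max velocity, 12 km/h = 3.33 m/s       */

    double N = m * g;          /* reaction force from the road           */
    double F = m * a + f * N;  /* traction force, Eq. (1)                */
    double P = F * v;          /* required motor power, Eq. (3)          */

    printf("F = %.2f N, P = %.0f W\n", F, P); /* 37.07 N and 123 W */
    return 0;
}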
Table 1 Specifications of tilt angle acceleration sensor
Chip: MPU-6050 (16-bit ADC, 16-bit data out)
Gyroscope range: ±250, ±500, ±1000, ±2000 degree/s
Acceleration range: ±2 g, ±4 g, ±8 g, ±16 g
Communication: I2C
Power: 3–5 V (DC)
2.2 Choice of Battery A battery (12 V–20 Ah) is used for the transmission system. The battery capacity allows the wheelchair to travel about 20–30 km per charge. Charging time is from 4 to 6 h.
2.3 Equipment and Components in the Model
2.3.1 MPU 6050 Angle Sensor
The MPU 6050 integrates 6-axis sensing, including:
• a 3-axis MEMS gyroscope
• a 3-axis MEMS accelerometer.
In addition, the MPU 6050 has a hardware acceleration unit specialized for signal processing (the Digital Motion Processor, DMP) that performs the necessary calculations on the data collected by the sensor. It significantly reduces the microcontroller's computational load, improves processing speed and provides faster responses. This is a significant difference between the MPU 6050 and other acceleration sensors. The MPU 6050 can be combined with an external magnetic field sensor to form a full 9-axis sensor via I2C communication. The sensors inside the MPU-6050 use a 16-bit analog-to-digital converter (ADC) for detailed results about rotation angles and coordinates. The specifications of the MPU 6050 are summarized in Table 1.
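As a concrete illustration of the I2C communication mentioned above, the following is a minimal, hedged Arduino sketch that wakes the MPU-6050 and reads the raw 3-axis accelerometer registers. The register addresses (0x6B power management, 0x3B start of accelerometer data) and the 0x68 device address come from the MPU-6050 datasheet; the baud rate is an assumption, and the real system would also calibrate and filter these values as described in the introduction.

#include <Wire.h>

// Minimal sketch: read raw accelerometer data from the MPU-6050 over I2C.
const int MPU_ADDR = 0x68;

void setup() {
  Serial.begin(9600);
  Wire.begin();
  Wire.beginTransmission(MPU_ADDR);
  Wire.write(0x6B);          // PWR_MGMT_1 register
  Wire.write(0);             // wake the sensor up
  Wire.endTransmission(true);
}

void loop() {
  Wire.beginTransmission(MPU_ADDR);
  Wire.write(0x3B);          // ACCEL_XOUT_H: first of 6 accel bytes
  Wire.endTransmission(false);
  Wire.requestFrom(MPU_ADDR, 6, true);
  int16_t ax = Wire.read() << 8 | Wire.read();
  int16_t ay = Wire.read() << 8 | Wire.read();
  int16_t az = Wire.read() << 8 | Wire.read();
  Serial.print(ax); Serial.print(' ');
  Serial.print(ay); Serial.print(' ');
  Serial.println(az);
  delay(100);
}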
2.3.2 Microcontroller Arduino UNO R3
The Arduino UNO R3 (Fig. 1) can use one of three microcontrollers: ATmega8, ATmega168 or ATmega328. This brain can handle simple tasks such as controlling a flashing LED, processing signals for remote-control cars, or building a temperature and humidity measurement station and displaying the readings on an LCD screen. Some specifications of the Arduino UNO R3 are listed in Table 2.
Fig. 1 Microcontroller Arduino UNO R3
Table 2 Specification of Microcontroller Arduino UNO R3
Microcontroller: 8-bit ATmega328
Voltage: 5 V DC
Frequency: 16 MHz
Amperage: ~30 mA
Recommended voltage: 7–12 V DC
Input voltage limit: 6–20 V DC
Number of digital I/O pins: 14 (6 with hardware PWM)
Number of analog pins: 6 (10-bit resolution)
Flash memory: 32 KB (ATmega328)
SRAM: 2 KB (ATmega328)
EEPROM: 1 KB (ATmega328)

2.3.3 Control Board DC Motor BTS7960 43A
The BTS7960 43A H-bridge circuit (Fig. 2) communicates easily with the microcontroller. The integrated circuit (IC) provides full current sensing (combined with a current-measurement resistor) and protection against downtime, overheating, overvoltage, overcurrent, voltage drop and short circuit (Table 3).
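To make the interface concrete, the following is a minimal sketch of driving one DC motor through a BTS7960 module from Arduino code. The RPWM/LPWM pin numbers are wiring assumptions, not values from the paper; the module is controlled by applying a PWM duty cycle to one input while holding the other low.

// Minimal sketch: drive one DC motor through a BTS7960 H-bridge.
const int RPWM = 5;   // forward PWM input (assumed wiring)
const int LPWM = 6;   // reverse PWM input (assumed wiring)

void setup() {
  pinMode(RPWM, OUTPUT);
  pinMode(LPWM, OUTPUT);
}

// speed: -255..255, the sign selects the direction
void driveMotor(int speed) {
  if (speed >= 0) {
    analogWrite(LPWM, 0);
    analogWrite(RPWM, speed);
  } else {
    analogWrite(RPWM, 0);
    analogWrite(LPWM, -speed);
  }
}

void loop() {
  driveMotor(128);   // half speed forward
}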
Fig. 2 DC BTS7960 43A motor control circuit
Table 3 Specification of motor control circuit
Power: 6–27 V
Load current: 43 A (load resistance) or 15 A
Control logic signal: 3.3–5 V
Maximum control frequency: 25 kHz
Automatic shutdown at low voltage: to avoid motor control at low voltage, the device shuts down automatically if the voltage is < 5.5 V
Over-temperature protection: the BTS7960 protects against overheating with a built-in thermal sensor; the output is disconnected when overheating occurs
Dimension: 40 × 50 × 12 mm
3 Head-Motion Block Diagram and Algorithm
3.1 Block Diagram
The microcontroller system that allows control of a standard electric wheelchair by head movement comprises a tilt-angle acceleration sensor, the microcontroller system and the mechanical transmission system. The relationship between the microcontroller and the other devices is described in Fig. 3. This system controls two DC motors to move the wheelchair in the desired direction. The tilt-angle acceleration sensor collects motion data from the head. When the motion signal (the inclination angle of the head) is received from the sensor, the data are processed by the microcontroller, which transmits control commands to the mechanical transmission system for positioning and controlling the wheelchair via the user commander. In this way, the head movement of the driver is translated into the positions needed to control the electric wheelchair. The devices of the mechanical transmission are compatible with a number of different standard electric wheelchairs.

Fig. 3 Block diagram of control system of wheelchair
3.2 Head-Motion Algorithm
The proposed control system allows users to issue only four different movement commands: go forward, go back, turn left and turn right; the set of recognized movements therefore has only four components. The meaning of each request is relative and depends on the current wheelchair status. If there is an obstacle, the wheelchair stops; otherwise, it continues to move in the possible directions (Fig. 4) until the desired position is reached. Finally, the wheelchair stops when the goal is completed.

Fig. 4 Schematic algorithm of wheelchair movement controlled by head gestures
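A minimal C sketch of this decision logic is given below. It assumes that pitch and roll tilt angles have already been derived from the calibrated accelerometer data; the 15° dead-zone threshold and the obstacle flag are illustrative assumptions, since the paper does not state the thresholds used.

// Sketch of the decision logic of Fig. 4 (threshold is an assumption).
typedef enum { STOP, FORWARD, BACKWARD, LEFT, RIGHT } Command;

Command headCommand(float pitch, float roll, int obstacleDetected) {
    const float T = 15.0f;           /* assumed dead-zone threshold, deg */
    if (obstacleDetected) return STOP;
    if (pitch >  T) return FORWARD;  /* head tilted forward  */
    if (pitch < -T) return BACKWARD; /* head tilted backward */
    if (roll  >  T) return RIGHT;    /* head tilted right    */
    if (roll  < -T) return LEFT;     /* head tilted left     */
    return STOP;                     /* head upright: wheelchair stops */
}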
4 Results and Discussion
The electric wheelchair product is improved and upgraded from a manual wheelchair using two DC motors with a capacity of 120 W each, the two wheels being controlled by a microcontroller system (Fig. 5). The microcontroller system, including the MPU-6050 acceleration sensor mounted on the patient's head and the Arduino UNO R3, is used in this research to control the mechanical transmission system in the desired directions: turn left, turn right, forward, backward and stop. The advantages of the wheelchair model with head-motion control (Fig. 6) are:
• environmentally friendly;
• useful for people paralysed by stroke who do not have enough strength to move their hands;
• reduced human activity and reduced physical stress.

Fig. 5 Improved electric wheelchair model
Fig. 6 Wheelchair with head-motion control
5 Conclusions In the Industrial Revolution 4.0, the study of electric wheelchair motion controlled by head gestures will be a new trend for improving the quality of life of patients with quadriplegia and an inability to move. In this paper, a head-motion recognition technique enabling wheelchair control for quadriplegic users has been implemented, and the initial tests have been applied successfully. An angular acceleration sensor mounted on the patient's head, together with a microcontroller, controls the wheelchair movements in the desired direction. In the future, we will study and improve the electric wheelchair design and enhance the control algorithm to meet the characteristics of different types of people with disabilities, reduce manufacturing costs and improve product reliability. In the next study, we will integrate a function to notify relatives via mobile phone or send rescue signals when wheelchair users have problems.
References 1. Organisation mondiale de la Santé et Banque mondiale. Rapport mondial sur le handicap (2011). https://www.who.int/disabilities/world_report/2011/fr/. Accessed Feb 24 2020 2. Bui HT, Lestriez P, Pradon D, Debray K, Abdi E, Taiar R (2018) Biomechanical modeling of medical seat cushion and human buttock-tissue to prevent PUs. Russ J Biomec 22(1):37–47. https://doi.org/10.15593/RJBiomechh/2018.1.04 3. Ngu,`o,i khuy´êt tâ.t Viê.t Nam (2020) https://tieplua.net/tin-tuc/so-lieu-thong-ke-ve-nguoi-khuyettat-viet-nam-141.html; Accessed Feb 24 2020 4. Cortés U, Urdiales C, Annicchiarico R, Barrué C, Martinez AB, Caltagirone C (2007) Assistive wheelchair navigation: a cognitive view. In: Studies in computational intelligence advanced computational intelligence paradigms in healthcare. vol 1 (48), pp 165–187. https://doi.org/10. 1007/978-3-540-47527-9_7 5. Ergonomics (2020) https://dictionary.cambridge.org/vi/dictionary/english/ergonomics. Accessed Feb 24 2020 6. Yao F (2007) Measurement and modeling of wheelchair propulsion ability for people with spinal cord injury. Thesis for the Degree of Master of Mechanical Engineering in the University of Canterbury. https://core.ac.uk/download/pdf/35458655.pdf 7. Satish K, Dheeraj J, Neeraj J, Sandeep K (2015) Design and development of head motion controlled wheelchair. Int J Adv Eng Technol 8(5):816–822. https://doi.org/10.13140/RG.2.2. 29123.71209 8. Kunti A, Chouhan V, Singh K, Yadav ARY, Pankaj ID, Somkuwar S (2018) Head-motion controlled wheel chair direction using ATMega328p microcontroller. Int J Adv Res Comput Commun Eng 7(4):61–65. https://doi.org/10.17148/ijarcce.2018.7411 9. Pajkanovi´c A, Doki´c B (2013) Wheelchair control by head motion. Serbian J Electri Eng 10(1):135–151. https://doi.org/udk:615.478.3:681.586
Innovation, Entrepreneurship and Sustainability of Business Through Techno-Social Ecosystem–Indian Scene Arindam Chakrabarty and Tenzing Norbu
Abstract Creativity and innovation are the architects of human aesthetics, excellence and the essence of advancement. Creativity has the broader spectrum of human aspiration, while innovation is the key driving force for achieving a superior quality of life and the notion of modernization. In fact, innovation is a continuous process through which we add value propositions to the present form of products or services with the help of concurrent ideas and the technological ecosystem. The flow of innovation essentially drives the development of new ideas and new attributes in current products or services, or it can bring a completely new generation of products. Existing firms can create a new business line, or innovation may trigger the evolution of a new entrepreneurial venture. One of the key factors for new start-ups is the pace and culture of innovation across the society. This transformation needs multiple supports from all the stakeholders, and even new firms require a constant flow of support so that they can sustain themselves from a long-term perspective. Thus, the natural flow of "Innovation-Entrepreneurship-Sustainability" makes for higher business growth and perpetuity. According to the Global Entrepreneurship Index (GEI), 2018, India achieves the 68th position among 137 nations of the world; it achieves a higher position in the Product Innovation dimension, while the country performs below the mark in terms of Technology Absorption. This indicates that the rate of transition from Lab to Market is indeed a matter of high concern in India if it aspires to propel the pace of rapid socio-economic transformation. This is more evident in the regime of the 4IR (Fourth Industrial Revolution)-led ecosystem. The paper has attempted to understand and identify key inhibiting factors that obstruct the smooth transformation of innovation into successful entrepreneurship. The paper has also explored why many of India's first-generation firms fail to achieve long-term perpetuity. The study A. Chakrabarty (B) Assistant Professor, Department of Management, Rajiv Gandhi University (Central University), Itanagar, Arunachal Pradesh 791112, India e-mail: [email protected] T. Norbu Assistant Professor, Don Bosco College, Jollang, Itanagar, Arunachal Pradesh 791111, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_13
has also made an effort to recommend an appropriate strategy so that the silver line of business flow can be rejuvenated. Keywords Creativity · Innovation-Entrepreneurship-Sustainability · 4IR · India
1 Introduction Creativity and innovation are the driving forces that bring new ideas to entrepreneurs [1–3]. Entrepreneurs need to incubate these ideas so that they can emerge as a business model. Every business model requires growth as well as perpetuity in the context of sustainability [4]. So, innovation, entrepreneurship and business sustainability are the key elements of a single continuum which brings rapid as well as long-lasting socio-economic transformation, particularly in the era of 4IR, which is largely dominated by the knowledge economy backed by excellence in R and D as a means of techno-social dominance. This is the right time to understand how this business value continuum is functioning, or aiming to excel and prosper, in order to manifest the sky-high aspirations of the 4IR ecosystem.
2 Objectives of the Study i. To explore the flow of transformation from Innovation to Business Expansion and Creation of Entrepreneurial Ventures with special reference to India. ii. To study major drawbacks for achieving Long-term Sustenance and Perpetuity of Entrepreneurial Ventures in India.
3 Research Methodology The study is entirely based on secondary information. The GEI (Global Entrepreneurship Index) Report, 2018 and the GII (Global Innovation Index) Report, 2019 have been used extensively to arrive at meaningful conclusions.
4 Analysis and Interpretation 4.1 Analysis–I Entrepreneurship is a form of successful commercialization of a unique business concept or innovative idea that eventually attracts a new set of customers with
increasing trend with the passage of time [9, 10]. In fact, a product life cycle depends on how it can absorb, incorporate and upgrade itself with the advent of threshold technology or innovation. So, a new-generation business idea has become one of the key driving forces for any entrepreneurial venture. The culture of creativity and innovation can be incubated in society by the intervention of universities, institutes of higher learning, research organizations or R and D wings, etc. However, the culture of creativity and innovation needs to be patronized by the state, its policy makers and their commitments and, of course, by a large ecosystem for sharing knowledge, networking and the flow of investment. This does not imply that all new innovative ideas or creative thinking can be transformed into a business line: the syndrome of non-convertibility remains an inherent threat to the successful implementation of ideas. According to GEI-2018, some countries like the USA, China, Japan, Germany, the United Kingdom, etc., scored quite decently in terms of Technology Absorption, which means that the entrepreneurial ecosystem and other stakeholders are highly committed, so that the ease of transformation from Lab-to-Market can be optimized irrespective of technological limitations or inherent constraints among its users. Figure 1 shows the business sustainability eclipse. A country can rapidly boost up its economy and create higher job opportunities if the state can lead in the domain of overall technology absorption. Table 1 shows technology absorption worldwide in comparison with India. Table 2 deals with the correlation between GEI ranking and Transparency Index ranking, and Table 3 presents the correlations. Survival, growth, sustenance and perpetuity are the major stages for an entrepreneur in order to establish a long-term and dominant presence in the market. Apart from individual innovation, developing a product prototype and grabbing markets to earn profits, this requires a comprehensive support mechanism from every stakeholder of the entrepreneurial ecosystem, compounded by the highest order of commitment and transparency. On that assumption, using the GEI ranking and the corresponding Transparency Index ranking of the top ten (10) economies of the world [7], a correlation was carried out to understand the nature and degree of the relationship.
Fig. 1 Business sustainability eclipse
Table 1 GEI (2018)—technology absorption worldwide: comparison with India (countries with higher GEI technology absorption score/rank with respect to India)

Continent | Countries | GEI technology absorption score | GEI technology absorption rank
Europe | Switzerland | 1.000 | 1
Europe | United Kingdom | 1.000 | 2
Europe | Germany | 0.863 | 9
Europe | France | 0.840 | 11
Europe | Italy | 0.554 | 32
Europe | Montenegro | 0.222 | 74
America | United States | 0.814 | 15
America | Canada | 0.779 | 17
America | Brazil | 0.178 | 92
Africa | Egypt | 0.214 | 77
Africa | South Africa | 0.220 | 75
Africa | Cameroon | 0.219 | 76
Africa | Rwanda | 0.212 | 78
Asia | Japan | 0.902 | 8
Asia | Korea | 0.667 | 27
Asia | China | 0.252 | 69
Asia | Vietnam | 0.222 | 73
Asia | Pakistan | 0.136 | 109
Asia | Bangladesh | 0.130 | 112
Asia | Myanmar | 0.119 | 115
Asia | Sri Lanka | 0.049 | 132
Asia | India | 0.047 | 134
Asia | Philippines | 0.018 | 135
South America | Guyana | 0.017 | 136
South America | Suriname | 0.014 | 137

Note: According to the Global Entrepreneurship Index (GEI)—2018 score/rank in terms of ease and access for 'Technology Absorption', India ranks 134th out of the 137 nations of the world, which implies that only three countries, namely Guyana, Suriname and the Philippines, are below India. All the top economies of the world are placed in the top order; even the economic giants in Asia are above the country. Quite surprisingly, almost all participating African countries are placed above India. This is a serious indicator for India, as it is the second most populated nation of the world and fortunately has a large youth population [8]. In fact, technology absorption reflects the ease and access of technological output in designing, formulating and developing new products that may grab higher opportunity in the market. As a result, the country suffers from inadequate investment support, and that essentially minimizes the creation of jobs. In addition, the natural process of technological obsolescence brings de-growth for industry, which badly impacts GDP, resulting in a significant loss of jobs. This manifests that entrepreneurial opportunity is directly related to the extent of innovation, if it can be adequately transferred and transformed from Lab-to-Market.
Table 2 Correlation between GEI ranking and Transparency Index ranking

Name of country | GEI ranking | Transparency index ranking
USA | 15 | 22
China | 69 | 87
Japan | 8 | 18
Germany | 9 | 11
UK | 2 | 11
India | 134 | 78
France | 11 | 21
Italy | 32 | 53
Brazil | 92 | 105
Canada | 17 | 9
Table 3 Correlations

Spearman's rho | | GEI ranking | Transparency index ranking
GEI ranking | Correlation coefficient | 1.000 | 0.760**
GEI ranking | Sig. (2-tailed) | – | 0.011
Transparency index ranking | Correlation coefficient | 0.760** | 1.000
Transparency index ranking | Sig. (2-tailed) | 0.011 | –

** Correlation is significant at the 0.05 level (2-tailed). List-wise N = 10
The result depicts an r-value of 0.760, which is significant at the 0.05 level (2-tailed). It may be interpreted that success in entrepreneurial ventures depends on the level of transparency prevailing in the respective country. As the Transparency Index ranking of India is 78 out of 180 nations across the globe, it is difficult to retain survival, growth, sustenance and perpetuity for business firms, particularly first- and second-generation entrepreneurial ventures. In fact, a higher (worse) ranking in the Transparency Index indicates that a country is plagued by corruption or a biased rule of law. The dominance of inequality in the social and political ecosystem encourages the hiding of information, and that makes the country less transparent. The most surprising thing is that the state has already passed and has been practicing the Right to Information Act, 2005, which predominantly focuses on greater disclosure in the government system so that the country could raise its standard in the 'Global Transparency Index'.
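The reported coefficient can be reproduced from Table 2 with the classic Spearman formula ρ = 1 − 6Σd²/(n(n² − 1)). The C sketch below hardcodes the within-sample ranks of the two columns (the transparency tie between Germany and the UK at rank 11 receives the average rank 2.5); with ties present the simplified formula is an approximation, but here it returns ≈0.761, matching Table 3.

#include <stdio.h>

/* Sketch reproducing the Spearman rho of Table 3 from the Table 2 data. */
int main(void) {
    /* within-sample ranks of the GEI and transparency columns,
       country order as in Table 2 (USA ... Canada) */
    double gei[10]   = {5, 8, 2, 3, 1, 10, 4, 7, 9, 6};
    double trans[10] = {6, 9, 4, 2.5, 2.5, 8, 5, 7, 10, 1};
    int n = 10;
    double sumd2 = 0.0;
    for (int i = 0; i < n; i++) {
        double d = gei[i] - trans[i];
        sumd2 += d * d;
    }
    double rho = 1.0 - 6.0 * sumd2 / (n * (n * n - 1.0));
    printf("Spearman rho = %.3f\n", rho);  /* prints 0.761 */
    return 0;
}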
4.2 Analysis-II
Major drawbacks for achieving long-term sustenance and perpetuity of entrepreneurial ventures in India:
1. Rapid technological obsolescence: the Indian corporate sector is mostly dominated by conventional or stereotyped businesses, but of late, highly sophisticated businesses are led by new start-ups or up-to-third-generation entrepreneurs, because of the changing tastes, preferences, expertise and specialization of the upcoming generation, as well as diminishing demand trends in the market.
2. Lack of huge investment support.
3. Turbulent legal frameworks in the state.
4. Imbalanced international trade policy.
5. Lack of infrastructure for free consultancy or advocacy by state agencies to boost up new start-ups [5].
6. Lack of transparency and equitability in the banking sector for business financing.
Business sustainability (the model is shown in Fig. 2) is broadly reinforced by two pillars, i.e., durability and imitability. Durability can be defined as the rate at which a firm's underlying resources and capabilities (core competencies) depreciate or become obsolete with the passage of time. Imitability is the rate at which a firm's underlying resources and capabilities can be duplicated by its rivals. In fact, the durability quotient is linked to the technology life cycle, while imitability is related to the degree of complexity of the business model. The aspect of imitability can be further divided into three continuous entities. They are
i. Transparency: the speed with which other firms can understand the relationship between resources and capabilities in order to achieve successful implementation.
Fig. 2 Determinants of business sustainability/continuum of resource sustainability
ii. Transferability: the ability of rival firms to acquire the resources and capabilities necessary to create value addition equitable to the leader firm.
iii. Replicability: the ability of competitors to use duplicated resources and capabilities to replicate a similar form of product or service.
In a nutshell, the logic gates ask whether the existing business model is transparent enough for its rivals to understand easily. If it is not easily understandable by the rival firms, it is presumed that the mainframe organization enjoys a superior order of business sustainability, as the business model is next to impossible for the competitors to understand. If it is easily understandable by the competing firms, the next logic gate emerges with reference to the ease of transferability, that is, whether the rival company can easily transfer the core competencies or not. If the answer is 'No', the leader company still enjoys a great extent of business sustainability. But if the answer is 'Yes', a further problem loop arises with regard to replicability, i.e., whether the rival firms have the ability to replicate the product or KRAs consistent with the mainframe company. If the answer is 'No', the organization still enjoys a last layer of defence in terms of retaining its business secrecy and sustainability. But if the answer is 'Yes', it is difficult for the mainframe organization to keep its core competencies intact, and thereby it would lose its market dominance, monopoly leverage and, finally, its overall business sustainability.
5 Recommendation From the above discussion, it is crystal clear that creativity and innovation are among the major driving forces for the expansion of business as well as for creating new-generation entrepreneurship. However, the transformation is not easy unless the process is adequately supported and supplemented by various state and non-state stakeholders. In fact, entrepreneurship is a continuous flow of life which needs to be imbibed in the culture of a society so that it can be patronized by the community and political leadership. The paper suggests that not all entrepreneurial ventures can sustain themselves from a long-term perspective, simply because this is the harder phase for a company, when it requires adequate support systems, resource mobilization and a greater network among the elements of the respective ecosystem within the framework of a transparent and unbiased rule of law. There are indeed shortages of resources such as investments, regular consultancies and direction, and firms lacking them may encounter immediate de-growth or indefinite closure. The government should develop an appropriate model to extend the necessary support to new-generation entrepreneurs. The state can augment a dedicated real-time support system for a batch of entrepreneurs by providing advocacy, consultancy and a reliable information search engine with the intervention of IoT [5]. In the era of data dominance, it is important for new-generation entrepreneurs to capture,
retrieve and utilise appropriate real-time databases so that they can devise, transform and implement a comprehensive strategy to win in a turbulent market. Recent research work explains how Quality Function Deployment (QFD) can be implemented to enable an integrated ecosystem for information and data management. If these innovative policy reforms are adopted, new-generation entrepreneurs will be immensely benefited and can remain competitive in the market [6]. The flow of innovation, entrepreneurship and business sustainability should be integrated as a continuum of policy, not as mutually exclusive, discrete, dissociated or directionless, superficial exercises.
6 Conclusion The study has showcased how innovation, entrepreneurship and business sustainability are integral parts of manifesting socio-economic development. The emerging 4IR is a representation of techno-social dominance, where the technological ecosystem attempts to dominate the social value system. The research work exhibits that higher transparency in government machinery leads to superior entrepreneurial activity.
References 1. Krishna RRBMM, Swathi A (2013) Role of creativity and innovation in entrepreneurship. Innov J Bus Manage 2(05) 2. Chen MH (2007) Entrepreneurial leadership and new ventures: Creativity in entrepreneurial teams. Creativity Innov Manage 16(3):239–249 3. Sarri KK, Bakouros IL, Petridou E (2010) Entrepreneur training for creativity and innovation. J Eur Ind Training 4. Stubbs W, Cocklin C (2008) Conceptualizing a “sustainability business model”. Organ Environ 21(2):103–127 5. Norbu T, Mall M, Sarkar B, Pal S, Chakrabarty A (2019 Jan). Revitalizing MSMEs’ performance with transparency: monitoring, mentoring and malwaring through IoT intervention. In: International conference on intelligent computing and communication technologies, Springer, Singapore, pp 553–560. https://doi.org/10.1007/978-981-13-8461-5_63 6. Chakrabarty A, Norbu T (2020). QFD Approach for integrated information and data management ecosystem: umbrella modelling through internet of things. In: Principles of internet of things (IoT) ecosystem: insight paradigm, Springer, Cham, pp 349–362. https://doi.org/10. 1007/978-3-030-33596-0_14 7. Retrieved from https://www.focus-economics.com/blog/the-largest-economies-in-the-world. Accessed on 08 Dec 2019 8. Retrieved from https://www.worldometers.info/world-population/population-by-country/. Accessed on 08 Dec 2019 9. Barringer BR (2015) Entrepreneurship: successfully launching new ventures. Pearson Education India 10. Hougaard S (2006) The business idea: the early stages of entrepreneurship. Springer Science & Business Media
Similar Image Retrieval Based on Grey-Level Co-Occurrence Matrix and Hu Invariants Moments Using Parallel Computing Beshaier Ali Abdulla, Yossra Hussian Ali, and Nuha Jameel Ibrahim
Abstract In previous years, several researchers have presented various techniques and algorithms for correct and dependable image retrieval systems. The goal of this paper is to build an image retrieval system that retrieves the images most similar to a query image. In this method, the Hu invariant moments and the grey-level co-occurrence matrix (GLCM) feature extraction methods are performed. Furthermore, with the purpose of boosting the system performance, a multi-thread technique is applied. Later, the Euclidean distance measure is used to compute the resemblance between the query image features and the features stored in the database. As shown by the results, the execution time is reduced to 50% of the conventional time of applying both algorithms without multi-threading. The proposed system is evaluated according to the measures used in the detection, description and matching fields, which are the precision, recall, accuracy, MSE and SSIM measures. Keywords Image retrieval · Feature extraction · Grey-level co-occurrence matrix (GLCM) · Hu invariants moments · Euclidian distance measure
1 Introduction Digital images hold various portions of information named features, a number of those features hold important information, and when the features are used to retrieve similar images, it will retrieve the images that have similar features to the query image. The chosen methods which were operated on images in the database for B. A. Abdulla (B) · Y. H. Ali · N. J. Ibrahim Department of Computer Science, University of Technology, Baghdad, Iraq e-mail: [email protected] Y. H. Ali e-mail: [email protected] N. J. Ibrahim e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_14
feature extraction, and later retrieving images from the database by using these features, lead to effective retrieval [1]. A vital feature of images is texture; it is represented via the spatial distribution of grey intensities in a region. Usually, texture and tone have an association with each other which cannot be stated simply. Both of them are always present in the image, and in certain conditions one characteristic can dominate the other. Therefore, finding the spatial distribution and the dependence of the grey-level values that exist in an image makes the main contribution to both the impression and the structure of the texture, so a two-dimensional texture analysis matrix is computed. Equally, the pixels and the pixel values define the properties of the texture [13]. The grey-level co-occurrence matrix method involves massive computation. Once all 256 grey levels are employed for creating the GLCMs, every GLCM created will be of size (256 × 256). The grey-level co-occurrence matrix is computed in order to extract the textural features, which includes a computation for every component in the GLCM; thus, the bigger the size, the more computations are executed [14]. Ever since, it has been widely applied in numerous texture analysis applications and remains a significant feature extraction method in the field of texture analysis [9]. Moments and the associated invariants have been analysed extensively to distinguish image patterns in various applications. Well-known moments include Zernike moments, geometric moments, rotational moments and complex moments. Moment invariants were originally presented by Hu, who derived six pure orthogonal invariants and one skew orthogonal invariant based on algebraic invariants, which are independent of size, position, orientation and parallel projection. Moment invariants have been confirmed as a suitable measure for tracing image patterns with respect to translation, scaling and rotation, under the assumption that the images are continuous functions and noise-free. Moment invariants have been used significantly in image registration, image pattern recognition and image reconstruction. However, in practical applications digital images are neither continuous nor noise-free: images are quantized with finite-precision pixels within a discrete coordinate system. In addition, noise may be introduced in several circumstances, for instance by the camera. In this manner, errors are certainly introduced in the calculation of the moment invariants; in other terms, the moment invariants may vary with the geometric transformation of the image [6]. Salama [12] examined the influence of spatial quantization on the moment invariants. Salama established that the error drops as the image size increases and the sampling intervals decrease; however, it does not decrease monotonically in general. He examined three important problems related to moment invariants: sensitivity to image noise, capability for image representation and aspects of information redundancy. It is noted that moments become more exposed to noise as their order gets higher. Calculation errors in the moment invariants may be produced by quantization and the pollution of noise, and also by transformations such as scaling and rotation. When the dimensions of images are reduced or enlarged, image pixels are inserted or deleted. Furthermore,
image rotation also affects the image function, since it involves rounding pixel values and coordinates. Consequently, moment invariants may change when images are rotated or scaled. This study introduces an approach to retrieving similar images based on the grey-level co-occurrence matrix and the Hu invariant moments, applied with a multi-thread technique; the Euclidean distance is then used to calculate the similarity between the input image and the various images in the dataset.
2 Literature Review Numerous studies have focused on image retrieval systems; some of them are as follows: Ruliang and Lin [11] introduced a new image matching algorithm under the name of the IMEA algorithm, which is based on the Hu invariant moments. Its first stage is population initialization; then a subgraph set is created. Based on a Hu-invariant-moments fitness function, the seven Hu invariant moments of the template image and the searched subgraph are computed. To measure the similarity between the subgraph and the template image, the Euclidean distance measure is applied. Lastly, different subgraphs are built through different evolutionary policies, and a different subgraph replaces the subgraph with the extreme value of the fitness function. The experimental outcomes show the IMEA algorithm's great robustness and effectiveness. Qing and Xiping [10] successfully extracted feature information of human viruses (HV) from microscopic images, first planning the HV algorithm for feature extraction and recognition applying the grey-level co-occurrence matrix. After applying GLCM, 20 microscopic images are acquired; the four texture feature parameters are then extracted using GLCM, and HV image recognition is performed. The experimental outcomes confirm that the GLCM is able to efficiently classify HV images, which is important for current HV recognition and identification. Urooj and Singh [15] utilized Hu moments to address the issues of geometric transformation. Three experiments were conducted, i.e. scale invariance, rotation invariance, and combined rotation and scale invariance. Jie et al. [8] introduced a technique joining moment invariants with the fractal dimension. Additionally, in order to distinguish the axis orbit edges of images, the Canny operator is applied. These two techniques are utilized as the feature vectors of BP neural networks. Forty sample sets of the usual fault axis are trained and sampled in order to examine the method. For eight different sets, the axis orbit recognition rate reached 100%, and the recognition result is acceptable. Test outcomes showed that the applied technique has great recognition speed as well as accuracy, and also great practical value for the intelligent fault diagnosis of rotor systems.
Htay and Maung [5] stated that breast cancer is a disease that threatens women's lives. The simplest way in medical imaging to prevent breast cancer is to utilize a computer-assisted diagnosis system to detect the disease in its early stage. The goal of their paper was to create an early-stage breast cancer detection system able to sort irregularities in mammograms automatically. In their technique, the pre-processing phase is done by eliminating noise using a median filter and later cropping the images. Otsu's thresholding is then applied to segment the breast area found on the non-uniform background. After thresholding, feature extraction is dedicated to the grey-level co-occurrence matrix and first-order statistical analysis. Lastly, the extracted objects are sorted as either ordinary or extraordinary by applying a k-nearest neighbour classifier.
3 Grey-Level Co-Occurrence Matrix
Several texture extraction methods exist, including the Gabor filter, wavelet analysis, Laws texture energy and the grey-level co-occurrence matrix. Each of these methods has its own specific properties. The GLCM is a matrix of the joint probability density of pairs of grey levels in an image; it signifies the spatial relationship of two points within the image. Let $[P(x, y, d, \theta)]_{L \times L}$ stand for the joint probability, where
• x and y index the x-th line and y-th column within the GLCM,
• d is the distance,
• θ is the direction,
• L is the number of grey levels.
In other words, $P(x, y, d, \theta)$ is the probability that grey level x (the origin) co-occurs with grey level y (the destination). Here d = 1 and θ = 0°, 45°, 90° and 135°. In practical use, several statistics serve as texture feature values in texture analysis founded on the grey-level co-occurrence matrix. Fourteen features can be extracted from the grey-level co-occurrence matrix; eight of them are preferred as texture features. The equations for calculating the eight statistics are as follows [17]:

Mean:
$\mu = \sum_{x=1}^{L} \sum_{y=1}^{L} x \cdot P(x, y)$   (1)

Variance:
$\sum_{x=1}^{L} \sum_{y=1}^{L} (x - \mu)^2 \cdot P(x, y)$   (2)

Angular second moment (ASM):
$\sum_{x=1}^{L} \sum_{y=1}^{L} P(x, y)^2$   (3)

Entropy:
$-\sum_{x=1}^{L} \sum_{y=1}^{L} P(x, y) \lg P(x, y)$   (4)

Inverse difference moment (IDM):
$\sum_{x=1}^{L} \sum_{y=1}^{L} \frac{P(x, y)}{1 + (x - y)^2}$   (5)

Homogeneity (HOM):
$\sum_{x=1}^{L} \sum_{y=1}^{L} |x - y| \, P(x, y)$   (6)

Contrast (CON):
$\sum_{n=1}^{L} n^2 \sum_{x=1}^{L} \sum_{y=1}^{L} P(x, y), \quad |x - y| = n$   (7)

Correlation (COR):
$\frac{\sum_{x=1}^{L} \sum_{y=1}^{L} x \, y \, P(x, y) - \mu_1 \mu_2}{\sigma_1^2 \sigma_2^2}$   (8)

where μ, μ₁ and μ₂ are mean values, which may be computed via Eq. (1), and σ₁² and σ₂² are variances, which may be computed via Eq. (2).
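To make the construction concrete, the following is a minimal C sketch that builds a normalized GLCM for d = 1 and θ = 0° (horizontal neighbour) over a tiny quantized image and evaluates the ASM and contrast features of Eqs. (3) and (7); the toy image and the reduced number of grey levels are assumptions for illustration.

#include <stdio.h>

#define L 4  /* number of grey levels in this toy example */

int main(void) {
    int img[4][4] = {{0,0,1,1},{0,0,1,1},{0,2,2,2},{2,2,3,3}};
    double P[L][L] = {{0}};
    int pairs = 0;

    /* count horizontal co-occurring pairs (d = 1, theta = 0) */
    for (int r = 0; r < 4; r++)
        for (int c = 0; c + 1 < 4; c++) {
            P[img[r][c]][img[r][c+1]] += 1.0;
            pairs++;
        }
    for (int x = 0; x < L; x++)
        for (int y = 0; y < L; y++)
            P[x][y] /= pairs;                    /* joint probabilities */

    double asm_ = 0.0, contrast = 0.0;
    for (int x = 0; x < L; x++)
        for (int y = 0; y < L; y++) {
            asm_     += P[x][y] * P[x][y];            /* Eq. (3) */
            contrast += (x - y) * (x - y) * P[x][y];  /* Eq. (7) */
        }
    printf("ASM = %.4f, contrast = %.4f\n", asm_, contrast);
    return 0;
}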
4 Hu Moment Invariants
The moment invariants were first introduced by Hu [7], who verified that the seven moment invariants remain invariant with respect to direction, scale and orientation. The Hu moment invariants have been utilized effectively in the field of image processing, for instance in object classification, visual pattern recognition and image coding [4]. The seven Hu moment invariants are outlined as follows:

$\varphi_1 = m_{20} + m_{02}$   (9)

$\varphi_2 = (m_{20} - m_{02})^2 + 4 m_{11}^2$   (10)

$\varphi_3 = (m_{30} - 3 m_{12})^2 + (3 m_{21} - m_{03})^2$   (11)

$\varphi_4 = (m_{30} + m_{12})^2 + (m_{21} + m_{03})^2$   (12)

$\varphi_5 = (m_{30} - 3 m_{12})(m_{30} + m_{12})\left[(m_{30} + m_{12})^2 - 3 (m_{21} + m_{03})^2\right] + (3 m_{21} - m_{03})(m_{21} + m_{03})\left[3 (m_{30} + m_{12})^2 - (m_{21} + m_{03})^2\right]$   (13)

$\varphi_6 = (m_{20} - m_{02})\left[(m_{30} + m_{12})^2 - (m_{21} + m_{03})^2\right] + 4 m_{11} (m_{30} + m_{12})(m_{21} + m_{03})$   (14)

$\varphi_7 = (3 m_{21} - m_{03})(m_{30} + m_{12})\left[(m_{30} + m_{12})^2 - 3 (m_{21} + m_{03})^2\right] - (m_{30} - 3 m_{12})(m_{21} + m_{03})\left[3 (m_{30} + m_{12})^2 - (m_{21} + m_{03})^2\right]$   (15)

where $m_{pq} = \mu_{pq} / \mu_{00}^{\gamma}$, $\gamma = (p+q)/2 + 1$, $p, q = 0, 1, 2, \ldots$, and the central moments are $\mu_{pq} = \iint (x - \bar{x})^p (y - \bar{y})^q f(x, y)\,dx\,dy$.
5 Research Methodology The goal of this research is to demonstrate the difference in performance between using the grey-level co-occurrence matrix (GLCM) and the Hu invariant moments separately, and the proposed enhanced system, which applies them together using a multi-thread technique so that the two algorithms run in parallel and process quickly, thus saving the user's time. The research methodology consists of two main phases: the first phase is dataset creation (feature extraction) and the second phase is testing and matching, as illustrated in Fig. 1.
Fig. 1 Model for image retrieval design
Algorithm 1: The proposed system algorithm
Input: Test image
Output: The ten images most similar to the test image
Begin:
Step 1: Implement GLCM and Hu invariant moments in parallel on the test image.
Step 1.1: First, image blocking is applied.
Step 1.2: The quantization process is applied.
Step 1.3: A two-dimensional code array is created to represent the intensity signature of each pixel in the test image.
Step 1.4: Features are extracted from the code array to describe the behaviour of every pixel in the image. This step consists of two stages: probability computation and feature computation.
Step 1.5: Feature normalization: the selected features come out in different ranges; their values are scaled to the range (−1 to 1).
Step 1.6: Apply the seven Hu moment equations to extract seven features from the test image.
Step 1.7: All the features extracted by both algorithms are stored in the hybrid features array.
Step 2: Matching occurs between the test image and the database images.
Step 3: Apply the Euclidean distance to measure the similarity between the test image and the database images.
Step 4: Retrieve the ten images most similar to the test image from the image database.
End
Stage 1: Database Feature Extraction
This phase starts by implementing the Hu invariant moments together with the grey-level co-occurrence matrix method, using the multi-thread technique to extract the image features in parallel. The extracted features are then stored in the hybrid features array (Fig. 2).

Fig. 2 Parallel features extraction of input image by using multi-thread
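A minimal sketch of the parallel step in Fig. 2, using POSIX threads, is given below. The two worker bodies are stubs standing in for the real GLCM and Hu-moment extractors; what matters is that each thread writes into its own disjoint half of the 15-element hybrid array, so no locking is needed.

#include <pthread.h>
#include <stdio.h>

static double features[15];   /* 8 GLCM features + 7 Hu invariants */

static void *glcmWorker(void *arg) {
    (void)arg;
    for (int i = 0; i < 8; i++) features[i] = 0.0;   /* stub: GLCM features */
    return NULL;
}

static void *huWorker(void *arg) {
    (void)arg;
    for (int i = 8; i < 15; i++) features[i] = 0.0;  /* stub: Hu invariants */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, glcmWorker, NULL);
    pthread_create(&t2, NULL, huWorker, NULL);
    pthread_join(t1, NULL);   /* once both extractors finish,   */
    pthread_join(t2, NULL);   /* the hybrid array is complete   */
    printf("hybrid feature array filled (15 values)\n");
    return 0;
}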
The hybrid features array consists of the 8 GLCM features plus the 7 features extracted using the Hu invariant method. As a result, the features table for each image consists of 15 features, as shown in Table 1. The used database includes 100 images of 10 different types, as shown in Fig. 3.

Stage 2: Testing and matching
After completing the feature extraction procedure, a query image is input and its features are extracted; the matching phase then compares the input image features with each feature vector in the database. The similarity measure considered for comparison of images is the Euclidean distance:

$D(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$   (16)
Table 1 A sample of the extracted features applying GLCM and Hu invariants moments (extracted features using GLCM and the seven Hu invariants moments)

Image Id | Mean | Variance | ASM | Entropy | IDM | HOM | CON | COR | M1 | M2 | M3 | M4 | M5 | M6 | M7
1 | 0.980 | 0.928 | 0.944 | 0.890 | 0.234 | 0.945 | 0.934 | 0.918 | 0.712 | 9.344 | 1.876 | 2.874 | 5.419 | 5.322 | 5.555
2 | 0.981 | 0.921 | 0.990 | 0.956 | 0.936 | 0.987 | 0.933 | 0.978 | 0.342 | 9.326 | 1.478 | 2.828 | 5.045 | 5.374 | 5.495
3 | 0.981 | 0.989 | 0.921 | 0.987 | 0.932 | 0.956 | 0.943 | 0.989 | 0.009 | 9.589 | 1.373 | 3.327 | 4.593 | 4.749 | 3.163
4 | 0.977 | 0.976 | 0.998 | 0.934 | 0.923 | 0.987 | 0.934 | 0.986 | 0.729 | 9.634 | 1.457 | 3.467 | 1.731 | 4.984 | 3.734
5 | 0.937 | 0.941 | 0.914 | 0.993 | 0.921 | 0.678 | 0.911 | 0.921 | 0.410 | 9.123 | 1.132 | 2.622 | 3.123 | 4.324 | 3.912
6 | 0.780 | 0.951 | 0.965 | 0.965 | 0.949 | 0.939 | 0.951 | 0.949 | 0.765 | 8.768 | 1.464 | 3.272 | 1.320 | 3.866 | 3.142
7 | 0.932 | 0.952 | 0.967 | 0.950 | 0.939 | 0.952 | 0.914 | 0.950 | 0.890 | 9.352 | 1.406 | 2.879 | 5.019 | 5.327 | 5.655
8 | 0.954 | 0.962 | 0.974 | 0.960 | 0.949 | 0.962 | 0.932 | 0.960 | 0.322 | 9.568 | 1.380 | 3.314 | 4.511 | 4.371 | 3.849
9 | 0.945 | 0.959 | 0.976 | 0.965 | 0.948 | 0.959 | 0.934 | 0.965 | 0.441 | 9.658 | 1.458 | 3.309 | 1.624 | 4.362 | 3.633
10 | 0.943 | 0.961 | 0.974 | 0.959 | 0.948 | 0.961 | 0.928 | 0.959 | 0.101 | 9.155 | 1.130 | 2.630 | 3.741 | 4.616 | 3.952
Fig. 3 A sample of the database image
This paper utilizes the Euclidean distance, which applies the concept of Pythagoras' theorem, to compute the distance D(x, y).
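A minimal C sketch of Eq. (16) as used in the matching stage might look as follows; here the 15-element feature vectors are those of Table 1, and the database images with the smallest distances are the best matches.

#include <math.h>

/* Euclidean distance between two n-element feature vectors, Eq. (16). */
double euclidean(const double *x, const double *y, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double d = x[i] - y[i];
        sum += d * d;
    }
    return sqrt(sum);
}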
6 Experimental Results
The suggested system is evaluated against the GLCM and the Hu moments algorithms applied separately. The database used in the experimental tests consists of 10 different datasets. By applying the enhanced system on this database, the results show that recall, precision and accuracy improved and the execution time was reduced. The well-known measurements used to evaluate the system are recall, precision and accuracy. These measurements involve the quantities true positive (TP), false positive (FP), true negative (TN) and false negative (FN) [3]:
1. True positive (TP): the number of retrieved images that are similar to the input image.
2. False positive (FP): the number of retrieved images that are not similar to the input image.
3. True negative (TN): the number of not-retrieved images that are dissimilar to the input image.
4. False negative (FN): the number of not-retrieved images that are similar to the input image.
$\text{Accuracy} = \frac{TP + TN}{\text{total number of database images}}$   (17)

$\text{Precision} = \frac{TP}{TP + FP}$   (18)
Fig. 4 The results from applying the grey-level co-occurrence matrix on the test images
$\text{Recall} = \frac{TP}{TP + FN}$   (19)
In addition, the MSE and SSIM measurements are applied for evaluating the system. The SSIM (structural similarity index) is computed using the following equation [16]:

$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$   (20)
The MSE (mean-squared error) is calculated using the following equation [2]:

$\mathrm{MSE} = \frac{1}{m \cdot n} \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} \left[f(x, y) - g(x, y)\right]^2$   (21)
where f is the original image data matrix, g is the degraded image data matrix, m is the number of rows of pixels in the image, x is the row index, n is the number of columns of pixels in the image, and y is the column index (Figs. 4, 5, 6; Tables 2, 3, 4, 5, 6 and 7).
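As an illustration of Eq. (21), the following is a minimal C sketch computing the MSE between two small grayscale arrays; the toy image values are assumptions.

#include <stdio.h>

#define M 4
#define N 4

/* MSE between original image f and degraded image g, Eq. (21). */
double mse(double f[M][N], double g[M][N]) {
    double sum = 0.0;
    for (int x = 0; x < M; x++)
        for (int y = 0; y < N; y++) {
            double d = f[x][y] - g[x][y];
            sum += d * d;
        }
    return sum / (M * N);
}

int main(void) {
    double f[M][N] = {{1,2,3,4},{5,6,7,8},{9,10,11,12},{13,14,15,16}};
    double g[M][N] = {{1,2,3,4},{5,6,7,8},{9,10,11,12},{13,14,15,18}};
    printf("MSE = %.3f\n", mse(f, g)); /* one pixel off by 2 -> 4/16 = 0.250 */
    return 0;
}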
7 Conclusion This paper used the grey-level co-occurrence matrix (GLCM) and the Hu invariant moments algorithms together to extract features from the created database and from the input image. The two algorithms were applied with a multi-thread technique to speed up the whole process and save time. After that, the Euclidean distance measure was applied to compute the similarity between the query image and the database images.
Fig. 5 The results from applying the Hu moments invariants on the test images chart
Fig. 6 The results from using the hybrid GLCM and Hu moments with multi-thread technique on the test images chart
Table 2 A sample of the tested images used in the experiment (test images with Image IDs 1–5)
As observed, the Hu invariant moments results were poor, achieving accuracy between 70% and 80% on the images containing no human faces. The aim of hybridizing the Hu invariant moments algorithm with the grey-level co-occurrence matrix algorithm is to enhance its performance. As seen in the experimental results, the results improved when applying the proposed hybrid method rather than applying only the Hu invariant moments
Table 3 A sample of the results from applying the grey-level co-occurrence matrix

Test image ID | Accuracy | Precision | Recall | TP | FP | TN | FN
1 | 0.98 | 0.8 | 0.8 | 8 | 2 | 90 | 2
2 | 0.97 | 0.7 | 0.7 | 7 | 3 | 90 | 3
3 | 0.98 | 0.8 | 0.8 | 8 | 2 | 90 | 2
4 | 0.99 | 0.9 | 0.9 | 9 | 1 | 90 | 1
5 | 0.98 | 0.8 | 0.8 | 8 | 2 | 90 | 2
Table 4 A sample of the results from applying the Hu moments invariants on the test images

Test image ID | Accuracy | Precision | Recall | TP | FP | TN | FN
1 | 0.75 | 0.2 | 0.2 | 2 | 6 | 90 | 6
2 | 0.76 | 0.3 | 0.3 | 3 | 7 | 90 | 7
3 | 0.75 | 0.2 | 0.2 | 2 | 8 | 90 | 8
4 | 0.80 | 0.4 | 0.4 | 4 | 6 | 90 | 6
5 | 0.76 | 0.3 | 0.3 | 3 | 7 | 90 | 7
Table 5 The results from using the hybrid GLCM and Hu moments with multi-thread technique

Test image ID | Accuracy | Precision | Recall | TP | FP | TN | FN
1 | 0.99 | 0.9 | 0.9 | 9 | 1 | 90 | 1
2 | 1 | 1 | 1 | 10 | 0 | 90 | 0
3 | 0.99 | 0.9 | 0.9 | 9 | 1 | 90 | 1
4 | 1 | 1 | 1 | 10 | 0 | 90 | 0
5 | 0.99 | 0.9 | 0.9 | 9 | 1 | 90 | 1
algorithm. The system achieved accuracy between 90% and 98%. Furthermore, the execution time was reduced by almost 50%.
Table 6 A sample of MSE and SSIM results on the retrieved images

Input image Id | Image1 [MSE] | Image1 [SSIM] | Image2 [MSE] | Image2 [SSIM] | Image3 [MSE] | Image3 [SSIM] | Image4 [MSE] | Image4 [SSIM] | Image5 [MSE] | Image5 [SSIM]
1 | 109.1 | 0.45 | 223.94 | 0.40 | 145.97 | 0.41 | 262.88 | 0.40 | 343.98 | 0.35
2 | 211.41 | 0.60 | 732.66 | 0.54 | 767.12 | 0.52 | 796.21 | 0.52 | 712.83 | 0.55
3 | 389.3 | 0.46 | 746.46 | 0.42 | 201.45 | 0.35 | 142.89 | 0.80 | 143.39 | 0.93
4 | 473.9 | 0.50 | 133.62 | 0.58 | 308.35 | 0.56 | 110.99 | 0.53 | 241.57 | 0.49
5 | 137.9 | 0.40 | 126.34 | 0.59 | 158.97 | 0.39 | 102.01 | 0.61 | 171.07 | 0.51
Table 7 Execution time for the algorithms

Algorithms | Time (in sec)
GLCM | 36.8
Hu invariant moment | 34.85
Proposed system | 15.60
References 1. Abu Mezied A, Alattar A (2017) Medical image retrieval based on gray cluster co-occurrence matrix and edge strength. In: IEEE transactions on promising electronic technologies, pp 71–76 2. Albertus JS, Lukito EN, Gede BS, Risanuri H (2011) Compression ratio and peak signal to noise ratio in grayscale image compression using wavelet. Int J Comput Sci Technol IJCST 3. Cheung S, Kamath C (2005) Robust background subtraction with foreground validation for urban traffic video. EURASIP J Appl Signal Process 2330–2340 4. Duan G, Zhao X, Chen A, Liu Y (2014) An improved Hu moment invariants based classification method for watermarking algorithm. In: IEEE international conference on information and network security, pp 205–209 5. Htay TT, Maung SS (2018) Early stage breast cancer detection system using glcm feature extraction and k-nearest neighbor (k-NN) on mammography image. In: IEEE 18th international symposium on communications and information technologies (ISCIT), pp 171–175 6. Huang Z, Leng J (2010) Analysis of Hu’s moment invariants on image scaling and rotation. In: IEEE 2nd international conference on computer engineering and technology, pp V7–476V7-480 7. Hu M (1962) Visual pattern recognition by moment invariants. In: IEEE IRE transactions on information theory, pp 179–187 8. Jie S, Xinyu P, Zhaojian Y, Juanli L (2017) Research on intelligent recognition of axis orbit based on Hu moment invariants and fractal box dimension. In: IEEE 14th international conference on ubiquitous robots and ambient intelligence (URAI), pp 794–799 9. Kavitha C, Suruliandi A (2016) Texture and color feature extraction for classification of melanoma using SVM. In: IEEE international conference on computing technologies and intelligent data engineering (ICCTIDE’16), pp 1–6 10. Qing L, Xiping L (2013) Feature extraction of human viruses microscopic images using gray level co-occurrence matrix. In: IEEE international conference on computer sciences and applications, 619–622 11. Ruliang Z, Lin W (2011) An image matching evolutionary algorithm based on Hu invariant moments. In: IEEE international conference on image analysis and signal processing, 113–117 12. Salarna GI, Abbott AL (1998) Moment invariants and quantization effects. In: IEEE computer society conference: 157–163 13. Seetha BDM, Muralikrishna I, Hegde N (2008) Artificial neural networks and other methods of image classification. J Theor Appl Inf Technol 1039–1053 14. Tou JY, Tay YH, Lau PY (2008) One-dimensional grey-level co-occurrence matrices for texture classification. In: IEEE international symposium on information technology, pp 1–6 15. Urooj S, Singh SP (2016) Geometric invariant feature extraction of medical images using Hu’s invariants. In: IEEE 3rd international conference on computing for sustainable global development (INDIACom), pp 1560–1562 16. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 600–612 17. Yu J (2010) Texture image segmentation based on gaussian mixture models and gray level co-occurrence matrix. In: IEEE third international symposium on information science and engineering, pp 149–152
Abstraction of ‘C’ Program from Algorithm R. N. Kulkarni and S. Shenaz Begum
Abstract In the past few decades, computers have become a most useful tool and are being used in a variety of areas like industry, educational institutions, universities, hospitals, business, etc. In almost all these areas, algorithmic solutions to problems are common. The usage of both hardware and software is increasing drastically, and emphasis has been placed on the development of application software. The traditional approach to developing any software comprises gathering the requirements from the client organization, writing the specification, developing the design of the system in the form of an algorithm and afterwards implementing this algorithm using a programming language. Usually, the algorithm is written in a natural language, either structured or unstructured; these are basically pseudocodes, and their implementations in programming languages change because of the syntax of each language. There is no single defined way to implement an algorithm, because the algorithm may be translated into more than one programming language and each translation produces a different implementation. Even when two programmers translate the same algorithm using the same programming language, the implementations differ. In this paper, we propose an automated tool which takes the input in the form of natural language statements (an algorithm) and then restructures the statements into a form amenable to further processing. The restructured statements are then converted into the equivalent C program. The proposed approach works on all algorithms which are specified in the defined template. Keywords Requirements · Restructure · Algorithm · Translation
R. N. Kulkarni (B) · S. Shenaz Begum Department of Computer Science and Engineering, Ballari Institute of Technology and Management, Ballari, India e-mail: [email protected] S. Shenaz Begum e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_15
1 Introduction In the past few decades, computers have become a most useful tool and are being used in a variety of areas like industry, educational institutions, universities, hospitals, business, etc. Each time, the developing organization has to collect requirements from the client organization or customers and evaluate them for feasibility; then, if the project is found feasible, the design and development of the software are carried out. Moreover, there are many programming languages which allow coding in a variety of paradigms. An algorithm is a method represented as a list of well-defined instructions for solving a problem. More precisely, an algorithm is basically an instance of pseudocode written by a developer to effectively produce output. Algorithms are an effective method to solve many real-time problems; hence they have been applied in many areas like universities, banks, industries, etc. Many algorithms are being devised to tackle industrial applications, real-time applications and applications related to universities, banks, etc. But algorithms are not of complete use unless they are implemented as programs, and for that one should be well trained in programming languages. Many organizations use different programming languages to develop software.
2 Literature Survey In papers [1, 2], the authors discussed automatic algorithm-specification-to-source-code translation: a translation process which creates executable code for a given algorithm. The papers discuss an algorithmic specification which is implemented using XML and the Perl programming language. The method proposed in this paper is a straightforward approach in which there is a direct conversion from the algorithm into a program in the C language. The papers [3–6] provide details regarding the structured way of writing algorithms and the basic data structures and programming techniques often used in efficient algorithms. In paper [7], the author discussed real-time code generation using MATLAB for the detection of heartbeats. The paper [8] provides details for implementing the project using the Java programming language, where the algorithm statements are translated into C code by using simple if-else and linked-list concepts. The paper [9] discusses the translation of an algorithm into its equivalent C program using the syntax-directed translation technique. In that paper, the author discussed two tools, one used as a scanner (flex) and the other as a parser (bison), which define the rules for string acceptance and token generation; this is two-way processing, but in this paper a straightforward approach is used, where algorithm statements are converted directly into a C program. The paper [10] discusses a tool for the abstraction of predicates from C programs using a tool called Sat, in which a Boolean variable is used to represent each predicate in the abstract program.
In paper [11], the author discusses natural language programming using class sequential rules, describing a system that recognizes a number of procedural primitives and operators. In the present paper, we provide a template for writing algorithms in natural language.
3 Proposed Methodology Figure 1 illustrates the restructuring and conversion of an algorithm into the appropriate 'C' program. In the restructure module, the input algorithm may contain blank lines, blank characters between the words, and multiple statements in a single line. The proposed tool removes the blank lines present in the algorithm, compresses multiple blank spaces between words into a single space, and converts multi-line statements into single-line statements.

    restructure(file)   # Here each line in the file holds a single line from the input algorithm
    For all statements in file Do Begin
        Call REMOVE_BLANK_LINE
        Call MULTILINE_STATEMENT_TO_SINGLE
        Call COMPRESS_MULTIPLE_SPACES
        Call ASSIGN_LINE_NUMBERS
    End for

In the conversion module, each line of the restructured algorithm is processed and then converted to an equivalent statement in the 'C' programming language. The user needs to follow the guidelines defined in the template. The purpose of this tool is to convert the input algorithm into 'C' programming statements; the converter acts like an interpreter which converts each algorithm statement into an executable 'C' statement.
Fig. 1 Block diagram for the restructure and conversion module: Algorithm → Restructure → Restructured Algorithm → Converter → 'C' Source Code
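To make the restructure step concrete, the following is a minimal C sketch of the text-normalization passes named above (blank-line removal and space compression). The function names and the line-based processing are illustrative assumptions; the paper does not give the tool's actual implementation.

```c
#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* Compress runs of blanks in 'line' to a single space (in place). */
static void compress_spaces(char *line) {
    char *src = line, *dst = line;
    while (*src) {
        if (isspace((unsigned char)*src) && *src != '\n') {
            while (isspace((unsigned char)*src) && *src != '\n') src++;
            *dst++ = ' ';
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}

/* Read the raw algorithm, drop blank lines, compress spaces,
   and prefix each surviving line with a step number. */
static void restructure(FILE *in, FILE *out) {
    char line[512];
    int lineno = 1;
    while (fgets(line, sizeof line, in)) {
        compress_spaces(line);
        /* Skip lines that are now empty (blank-line removal). */
        if (line[0] == '\0' || line[0] == '\n' ||
            (line[0] == ' ' && (line[1] == '\n' || line[1] == '\0')))
            continue;
        fprintf(out, "Step%d.%s", lineno++, line);
    }
}

int main(void) {
    restructure(stdin, stdout);   /* e.g., ./restructure < algo.txt */
    return 0;
}
```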
    Converter(file)   # Here file takes the restructured input algorithm
    1. i = 0
    2. For each line in the file
    3. Do Begin
    4.     Include the most required header files
    5.     Include main()
    6.     Store the ith line in a variable line
    7.     Run the corresponding handler for converting to the equivalent 'C' statement
    8.     i = i + 1
    9. End for
Our proposed system consists of three phases. The first phase is recording the user's input algorithm: the user is provided with an algorithm specification (template) and is asked to go through it before inputting an algorithm, so that the user is clear about the format the algorithm should follow. The second phase is input restructuring, in which the given algorithm is properly restructured by deleting blank lines, compressing multiple spaces, and inserting line numbers. The last phase is the conversion phase, which takes the restructured algorithm as input and generates executable 'C' source code. This tool thus allows users to analyze how each and every line of an algorithm statement is converted to an executable C program statement, as illustrated by the sketch after the following outline.

ALGORITHM
INPUT: Algorithm in accordance with the specified template.
NOTATIONS: Variables ← [a-zA-Z_]*[0-9]; Constant ← [0-9]
OUTPUT: Executable 'C' source code.
Step 1. [Read input file]
Step 2. If the file exists then go to Step 3, else go to Step 5.
Step 3. algofile = call restructure(file)
Step 4. Call converter(algofile)
Step 5. Write "File does not exist."
Step 6. End
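As an illustration of how a converter handler might map restructured statements to C, the following is a minimal, hypothetical C sketch; the pattern strings and the emitted code are assumptions chosen to match the case study, not the authors' actual handler table.

```c
#include <stdio.h>
#include <string.h>

/* Emit the C statement corresponding to one restructured algorithm line.
   Only a few illustrative patterns are handled. */
static void emit_c(const char *stmt, FILE *out) {
    if (strncmp(stmt, "start", 5) == 0) {
        fprintf(out, "#include <stdio.h>\nvoid main() {\n");
    } else if (strncmp(stmt, "integer ", 8) == 0) {
        fprintf(out, "int %s;\n", stmt + 8);          /* integer a,b -> int a,b; */
    } else if (strncmp(stmt, "read ", 5) == 0) {
        fprintf(out, "scanf(\"%%d\", &%s);\n", stmt + 5);
    } else if (strncmp(stmt, "print ", 6) == 0) {
        fprintf(out, "printf(\"%s\\n\");\n", stmt + 6);
    } else if (strncmp(stmt, "stop", 4) == 0) {
        fprintf(out, "}\n");
    } else {
        fprintf(out, "/* unhandled: %s */\n", stmt);  /* fall-through for brevity */
    }
}

int main(void) {
    /* Tiny demonstration on four restructured lines. */
    const char *lines[] = { "start", "integer start,end", "read start", "stop" };
    for (size_t i = 0; i < sizeof lines / sizeof lines[0]; i++)
        emit_c(lines[i], stdout);
    return 0;
}
```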
4 Case Study
PHASE 1: Input algorithm

    START
    Let rev_num ← 0
    Define num, rem, rev_num, temp, start, end as integer variables
    READ start
    read end
    PRINT Palindrome numbers between start and end
    For num ← start to end
        temp ← num
        while (temp)
            rem ← temp % 10
            temp ← temp / 10
            rev_num ← rev_num * 10 + rem
        end while
        if (num equals rev_num)
            print num
        end if
    end for
    stop

PHASE 2: Restructured input algorithm

    Step1. start
    Step2. integer num, rem, rev_num, temp, start, end
    Step3. read start
    Step4. read end
    Step5. print palindrome numbers between start and end
    Step6. for num ← start to end
    Step7. temp ← num
    Step8. while (temp)
    Step9. rem ← temp % 10
    Step10. temp ← temp / 10
    Step11. rev_num ← rev_num * 10 + rem
    Step12. end while
    Step13. if (num equals rev_num)
    Step14. print num
    Step15. end if
    Step16. end for
    Step17. stop
PHASE 3: Generated 'C' code (the source listing is truncated mid-loop; the for-loop body below is completed from Steps 7 to 16 of the restructured algorithm)

    #include <stdio.h>
    #include <conio.h>
    void main() {
        int num, rem, rev_num = 0, temp, start, end;
        printf("Enter the start value: ");
        scanf("%d", &start);
        printf("Enter the end value: ");
        scanf("%d", &end);
        printf("Palindrome numbers between %d and %d are: ", start, end);
        for (num = start; num <= end; num++) {
            temp = num;
            rev_num = 0;
            while (temp) {
                rem = temp % 10;
                temp = temp / 10;
                rev_num = rev_num * 10 + rem;
            }
            if (num == rev_num)
                printf("%d ", num);
        }
    }

The offer-generation algorithm of the following paper (Fig. 5) is reproduced below; the opening of the buyer branch is garbled in the source and is restored here by symmetry with the seller branch.

    Deal = False
    While Deal == False
        If buyer_offer < seller_offer
            buyer_offer = buyer_offer * (1 + buyer_concession_rate)
            If buyer_offer > seller_offer
                buyer_offer = seller_offer
                Deal = true
            Else
                If buyer_offer > buyer_max
                    buyer_offer = buyer_max
                Else
                    buyer_offer          # offer stands as computed
                EndIf
            EndIf
        Else
            Deal = true
        End If
        k = k + 1
        If buyer_offer < seller_offer
            seller_offer = seller_offer * (1 - seller_concession_rate)
            If seller_offer < buyer_offer
                seller_offer = buyer_offer
                Deal = true
            Else
                If seller_offer < seller_min
                    seller_offer = seller_min
                Else
                    seller_offer         # offer stands as computed
                EndIf
            EndIf
        Else
            Deal = true
        End If
        k = k + 1
    End While
Fig. 5 Algorithm for generating offers
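A compact, runnable C rendering of this alternating-concession loop is sketched below; the initial offers, rates, and bounds are made-up example values, and the iteration cap k_max is an added safeguard not present in the figure.

```c
#include <stdio.h>
#include <stdbool.h>

int main(void) {
    /* Illustrative values only; the paper derives rates from the fuzzy inference step. */
    double buyer_offer = 600.0, seller_offer = 890.0;
    double buyer_max = 760.0, seller_min = 650.0;
    double buyer_rate = 0.05, seller_rate = 0.05;
    bool deal = false;
    int k = 0, k_max = 50;

    while (!deal && k < k_max) {
        /* Buyer concedes upward toward the seller's ask. */
        if (buyer_offer < seller_offer) {
            buyer_offer *= (1.0 + buyer_rate);
            if (buyer_offer > seller_offer) { buyer_offer = seller_offer; deal = true; }
            else if (buyer_offer > buyer_max) buyer_offer = buyer_max;
        } else deal = true;
        k++;
        /* Seller concedes downward toward the buyer's offer. */
        if (!deal && buyer_offer < seller_offer) {
            seller_offer *= (1.0 - seller_rate);
            if (seller_offer < buyer_offer) { seller_offer = buyer_offer; deal = true; }
            else if (seller_offer < seller_min) seller_offer = seller_min;
        } else deal = true;
        k++;
        printf("round %d: buyer %.4f, seller %.4f\n", k / 2, buyer_offer, seller_offer);
    }
    if (deal) printf("agreed price: %.4f\n", buyer_offer);
    else      printf("no agreement\n");
    return 0;
}
```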
Fig. 6 Input and outputs of an ANFIS framework
6 Results The framework is tested with a number of different laptop characteristics and market expectations. The outcome is contrasted with a fixed-concession strategy (5%). The strategy success rates are determined from the cumulative concessions made when the deal is struck, divided by the difference between the original asking price and the customer's proposal. The agent conducts various negotiations with a higher laptop utility score of 0.8568 and a lower one of 0.6421, as shown in Table 2. The purchasing agent tries to keep the deal close to its original offer. When the intrinsic value of the laptop is smaller, the device does not closely match the customer's preferences; therefore, such negotiations mostly do not lead to closing the deal. Whenever a fixed strategy is implemented, the negotiators may reach an offer in a short timeframe, but at an increased cost for a smaller payoff.
7 Conclusion This paper dealt with the design and implementation of multi-agent negotiations for e-commerce recommender systems. We have demonstrated various types of agents, especially the mediator agent that facilitates simultaneous negotiations. These agents use a multi-feature weighted fuzzy function to estimate appropriate deals. The program offers its operator the ability to convey interests over the selected values in a fuzzy manner. This is really beneficial because individuals have a hard time defining their priorities precisely enough to specify crisp standards. In order to identify which offers have characteristics nearest to the customer-selected interests, we have included a fuzzy assessment process. A fuzzy inference function uses the utility value, the initial offer, and the available negotiating period to produce a counteroffer.
Table 2 Negotiations with various parameters

| Negotiation | Vendor's and customer's offers with yielding rates (utility score 0.8568) | Vendor's and customer's offers with yielding rates (utility score 0.6421) | Vendor's and customer's offers with fixed discount rates |
|---|---|---|---|
| 0 | 890.0000 | 890.0000 | 890.0000 |
| 1 | 652.0688 (0.1645%) | 652.0688 (0.1645%) | 681.4000 (5.0000%) |
| 2 | 837.0000 | 837.0000 (5.0000%) | 837.0000 (5.0000%) |
| 3 | 651.9946 (0.1422%) | 651.9946 (0.1422%) | 716.6250 (5.0000%) |
| 4 | 798.2000 | 795.1000 (5.0000%) | 795.1000 (5.0000%) |
| 5 | 654.8571 (0.1323%) | 654.8271 (0.1323%) | 751.4662 (5.0000%) |
| 6 | 755.4900 (5.0000%) | 755.4900 (5.0000%) | 755.5000 (5.0000%) |
| 7 | 695.4811 (6.3711%) | 654.6920 (0.1433%) | Accepted |
| 8 | 718.7655 (5.0000%) | 718.7650 (5.0000%) | – |
| 9 | Accepted | 655.8350 (0.1594%) | – |
| 10 | – | 682.9170 (5.0000%) | – |
| Agreement reached? | Yes | No | Yes |
| Agreed price | 718.7650 | – | 755.5000 |
The results reveal that the suggested smart agents significantly improve the counteroffer by the time negotiations are over. Whether an agent should continue to raise its price depends on the availability, the ongoing negotiations, and the existing supply of the vendor. The presented framework performs better from the purchaser's viewpoint than the fixed discount rates: the customer pays less for the preferred product. The framework has managed to meet the predicted attributes and was evaluated on a repository where laptops are purchased to meet certain customer requirements.
Improved Doppler Radar Target Detection Method Using Quad Code Sequence Generation Raj Kumar D. Bhure and K. Manjunathachari
Abstract Today's world is concentrating on setting new developments in radar to face the security challenges in defence. The present research in the areas of radar and coding theory has introduced better detection probability of a target in terms of its distance and location, but most of the present art focuses less on the detection of multiple moving targets using Doppler radar. The existing approaches have tried to increase the merit factor and range resolution of the echoes from a stationary target; such methods fail to find the targets when they are multiple, and the amplitude of the side spikes (noise) becomes much larger with respect to the maximum detectable limit. As these side spikes due to multiple moving targets mask the weak received signals (echoes) from the targets, they may degrade the detection probability. The proposed approach provides clear information about the target with respect to Doppler by creating several pure windows at various Dopplers with respect to range. The amplitude in all windows is 85–90 dB down, so all moving targets can be easily detected. The proposed approach is validated using MATLAB. Keywords Quad code · Multiple windows · Cyclic redundancy technique · Merit factor · Doppler · MATLAB · Side spikes
R. K. D. Bhure (B) GITAM University, Hyderabad, Telangana, India e-mail: [email protected] R. K. D. Bhure · K. Manjunathachari ECE Department, GITAM University, Hyderabad, Telangana, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_44

1 Introduction In general, radar resolution performance can be assessed in terms of the capacity to differentiate the desired object in a multi-target situation where targets are close to each other. Adding to this, the challenging issues are to recognize and judge the problems as soon as targets are found in multiple and moving scenarios. The
resolution, therefore, becomes an important constraint in every discussion associated with high-performance radars. Initially, the interference from the numerous targets must be addressed so that the radar can effectively sense the targets and also determine the exact target distance and speed in a multi-target environment. Considering a high-resolution pulse radar, the range K must be measured by considering the transmitted pulse width, and it can be represented as K = S·Tp/2; here, S indicates the speed of light, Tp is the pulse width, and the dividing factor 2 accounts for the back-and-forth distance travelled by the radar signal. This gives an idea about finding the target distance for a fundamental (basic) pulse radar system where unmodulated (static) frequency pulses are transmitted. In this paper, we present an approach that uses equal-weighted hex codes to generate the quad code sequence and the cyclic redundancy technique to enhance the target detection probability of Doppler-tolerant radar among many moving targets.
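As a quick numeric illustration of the relation K = S·Tp/2 (the pulse width below is an arbitrary example, not a value taken from the paper):

```c
#include <stdio.h>

int main(void) {
    const double S  = 3.0e8;   /* speed of light, m/s */
    const double Tp = 1.0e-6;  /* example pulse width: 1 microsecond */
    double K = S * Tp / 2.0;   /* range resolution: 150 m for this Tp */
    printf("K = %.1f m\n", K);
    return 0;
}
```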
2 Related Work Golay [1] states that complementary sequences cancel all the nonzero delays when added up, but this does not permit unambiguous range imaging; hence, it is useful only for radars under static target ranging. Kirkpat et al. [2] proposed the simulated annealing (SA) algorithm to optimize coded waveforms; this approach was tested for lengths N = 32 and N = 128, and as per the observations, the crest levels of the cross-correlation side lobes and of the autocorrelation side lobes are approximately equal to L/N, where L represents the total number of sequences in the sets and N the length of the sequence. Deng proved that the SA algorithm can also be applied to generate frequency-coded waveforms in radar applications, and proposed optimally reliable impulse (i.e. autocorrelation) functions with almost zero-level cross-correlation functions by the use of the SA algorithm and an optimization procedure. The author also justified the drawbacks of the applicability of algorithms such as the threshold accepting (TA) algorithm [3] and the Hamming scan (HS) algorithm [4]. The TA algorithm is a general-purpose numerical optimization algorithm, a revised version of the SA algorithm with better characteristics, but its convergence is very slow; hence, the HS algorithm was proposed and rectified the problem of convergence. However, this algorithm has the demerit that it can get stuck at a local minimum point. Toward this drawback, Singh et al. [5] proposed a model to overcome these limitations, called the 'new hybrid algorithm' (HA), and considered its validity for polyphase waveforms. An attempt is made here to carry this research forward and discover the properties of the 'new hybrid algorithm' for different frequency-coded waveforms, possibly for the first time. Sharma and Raja Rajeswari [6] proposed a MIMO design for ambiguity optimization with the use of phase-coded pulse waveforms, which are used to reduce the crest of the uncertainty function at every mismatched value of delay. But it cannot
reduce the side lobes; therefore, this method is not applicable to moving target detection. Reddy and Anuradha [7] suggested a model in which they tried to increase the SNR factor of Mesosphere–Stratosphere–Troposphere (MST) radar with the use of Cosh–Hamming and Kaiser–Hamming windowing techniques. But this is also limited to stationary targets, because with this method the main lobe power of the signal is increased to enhance the merit factor, and it fails to minimize the effect of side lobes under continuously changing Doppler; therefore, this method is validated only for stationary targets. Sindhura et al. [8] proposed a model which uses GPS signals and can provide a novel remote-sensing application, as these signals carry precious information regarding the reflecting surface. This approach focuses on image formation of fixed (static) targets only, using GPS L5 reflected signals processed with a SAR signal processing approach. Singh et al. [9] presented an approach to improve multiple moving target detection by using windowing techniques. However, the presented approach has side-lobe amplitudes of more than 0.2 at various Dopplers and therefore is not suitable for detecting fast multiple moving targets, which restricts the enhancement of the probability of target detection at the desired Doppler. Also, this technique consumes maximum energy because the initial length of the code word is four bits, so the delay to reach 256 bits is large.
3 Proposed Approach The cyclic redundancy technique is used to generate the target detection bit sequence. In this methodology, the target detection bit sequence is found from a prearranged statistic. Here, M(x) is called the transmitting polynomial and G(x) the generator polynomial. To create M(x) as well as G(x) (i.e. the divisor), we use equal-weighted binary hex codes (i.e. four-digit binary codes) up to the decimal value of 15 (i.e. 0–15). The transmitting polynomial M(x), the required bit sequence of decimal numbers from 0 to 15 represented in equal-weighted binary hex, can be written as

    M(x) = {C · p | p = 1, …, q}, where C = 3, q = 4    (1)

We get M(x) = {3, 6, 9, 12} = {0011 0110 1001 1100}
The remaining equal-weighted missing decimal numbers 05 and 10 (from 0 to 15) are used to form the generator polynomial G(x), designed from

    S = (I_N + E_N) / I_N = 05, i.e. in binary hex coding '0101'    (2)
where 'I_N' is the first term (i.e. 3) and 'E_N' is the last decimal number (i.e. 12) of Eq. (1); the succeeding decimal number (i.e. 10) is found by R = 2 · S = 2 · 05 = 10, i.e. in binary hex coding '1010'.
The combination of the two terms R and S (i.e. 1010 and 0101) together is considered as the generator code word (divisor): G(x) = {R, S} = {1010 0101}. In this approach, we choose the divisor as '10100101' to eliminate the initial zero bit. According to the CRC technique, if G(x) consists of 'k' bits, then the number of zero bits appended to the message polynomial M(x) is 'k − 1' (i.e. 8 − 1 = 7 zeros) in order to obtain the transmitting polynomial T(x):

    M(x) = {0011 0110 1001 1100 0000000}

    T(x) = M(x) / G(x)    (3)

So after substituting M(x) and G(x) in Eq. (3), we have the remainder R_md = 1100001, where 'R_md' denotes the remainder, called the cyclic redundancy check bits. The transmitting polynomial is obtained by replacing the zeros appended to the message polynomial with the cyclic redundancy check bits, as follows:

    T(x) = {M(x), R_md(x)}    (4)
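For readers who want to reproduce the check bits, the following is a minimal C sketch of standard mod-2 (XOR) long division on bit strings. Note this is a generic CRC routine under common conventions; the exact remainder obtained depends on the padding and alignment conventions the authors use, which the paper does not fully specify.

```c
#include <stdio.h>
#include <string.h>

/* Mod-2 (XOR) long division of a padded message by a generator.
   Both arguments are '0'/'1' strings; the remainder (g-1 bits)
   is written into rem. */
static void crc_remainder(const char *padded, const char *gen, char *rem) {
    size_t n = strlen(padded), g = strlen(gen);
    char work[256];
    strcpy(work, padded);
    for (size_t i = 0; i + g <= n; i++) {
        if (work[i] == '1')
            for (size_t j = 0; j < g; j++)          /* XOR in the divisor */
                work[i + j] = (work[i + j] == gen[j]) ? '0' : '1';
    }
    memcpy(rem, work + n - (g - 1), g - 1);         /* last g-1 bits */
    rem[g - 1] = '\0';
}

int main(void) {
    /* 16-bit message {3,6,9,12} padded with k-1 = 7 zeros, divisor 10100101 */
    char rem[16];
    crc_remainder("00110110100111000000000", "10100101", rem);
    printf("remainder = %s\n", rem);
    return 0;
}
```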
Thus, the corresponding transmitting code word is T_R = {00110110100111001100001}. The length of the transmitting code word T_R is further increased with the use of a three-bit EX-OR operation on the given code word (refer Fig. 1), so that the merit factor of the received pulse is further enhanced. Let E_n = b_{m−1} ⊕ b_m ⊕ b_{m+1}, where m = 3n + 1 and n = 0, 1, 2, 3, …, 7; b_m represents the bits of T_R, and E_R = (E_0, E_1, E_2, …, E_7, b_21, b_22) = (100100001). The required 32-bit code word C_w can be generated as
Fig. 1 EX-OR operation
C_W = {T_R, E_R} = 0 0 1 1 0 1 1 0 1 0 0 1 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 1

The determined code word consists of 32 bits; to approach the minimal length of most existing work, i.e. 256 bits, complementary bits of 'C_W' are appended to the right of the code and the process is continued until 256 bits are generated, so that the merit factor and the SNR are increased. This proposed approach is called Quad Code Sequence Generation Using Cyclic Redundancy Technique. This particular approach to target detection has four folds.

(i) Quad Code Sequence Generation Using Cyclic Redundancy Technique

The code word length can be designed as follows. Let

    C_RTDI = C_W    (5)

where C_RTDI is the preliminary target detection code; further codes are determined as

    (C_RTD)_1 = C_W ∥ C̄_W    (6)

    (C_RTD)_2 = (C_RTD)_1 ∥ (C̄_RTD)_1    (7)

    (C_RTD)_n = (C_RTD)_{n−1} ∥ (C̄_RTD)_{n−1}    (8)

where ∥ denotes concatenation and the overbar denotes the bit-wise complement. Equations (5)–(8) are used to reach the maximum code word, whose length is 256 bits, to detect the positions of the numerous moving targets; a small sketch of this expansion follows.
In the Doppler versus amplitude plot shown in Fig. 2, we observe that almost 20 windows have been created with respect to the desired Doppler to detect the multiple moving targets below 85–90 dB. This large number of windows enhances the probability of target detection at the desired Doppler. Figure 3 shows the variation of Doppler versus delay (i.e. range calculation); from the figure, we observe that a number of windows have been created at the desired Doppler, so all multiple moving targets can be easily detected with more accuracy when compared with the conventional approaches (i.e. Golay codes, PTM and oversampled PTM codes).
Fig. 2 Variation of Doppler versus amplitude
Fig. 3 Variation of Doppler versus delay
(ii) Quadratic Residue Sequence Generation Using Quad Code Cyclic Redundancy Technique (QRSGQCCRT)

In this technique, we use the quadratic residues of the decimal value 33 to change the corresponding bit positions of the code word generated from Eq. (5). The bit positions affected by the quadratic residues of 33 are 1, 3, 4, 6, 9, 12, 14, 15, 16, 22, 24, 25, 27 and 31, and these bit positions are changed to their complement bits (i.e. 1→0 or 0→1), so that the maximum number of bits of the proposed code word is changed to get more clarity about the windows with respect to the desired Doppler. No doubt one can choose any decimal value for the quadratic residue, but the given decimal number has the following importance: (a) it can cover maximum numbers up to 32 bits; (b) almost all diagonal bits are changed, which gives the maximum change in merit factor amplitude. A small sketch of the three bit-flip variants follows; Table 1 represents the 32-bit quadratic residue code word. Figure 4 represents the variation of Doppler with respect to amplitude using the quadratic residue technique (refer Table 1); from Fig. 4, we observe that after −15 kHz and +15 kHz we have clear windows up to −35 and +35 kHz, respectively, which enhances the multiple moving target detection probability. Figure 5 represents the variation of Doppler versus delay and shows clear information about the windows created after −15 kHz and +15 kHz up to −35 and +35 kHz, respectively.

(iii) Quadratic Residue with '0' Change Sequence Generation Using Quad Code Cyclic Redundancy Technique (QRZCSGQCCRT)

In this type of code word, we change only those bits at the quadratic-residue positions of '33' whose weight is '0' (i.e. '0' changes to '1') (refer Table 2). Figure 6 shows the variation of Doppler versus amplitude using the quadratic residue with '0' change; here, we see that moving targets can be easily detected within the 3–9, 13–19, 23–28 and 30–31 kHz Doppler ranges, as all windows are 85–90 dB down. Figure 7 shows the variation of Doppler versus delay, which gives the appropriate information about the clear windows with respect to range.

(iv) Quadratic Residue with '1' Change Sequence Generation Using Quad Code Cyclic Redundancy Technique (QROCSGQCCRT)

In this type of code word, we change only those bits at the quadratic-residue positions of '33' whose weight is '1' (i.e. '1' changes to '0') (refer Table 3). Figure 8 again shows the variation of Doppler versus amplitude using the quadratic residue with '1' change; here, we observe that in no way will the target be missed, as the window is clear from 2 to 20 and from 22 to 35 kHz: a huge window which can detect moving and multiple targets very easily. Figure 9 gives clear information about these windows with respect to speed.
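The three bit-flip variants can be expressed compactly; the following C sketch applies each rule stated above to C_W and prints the resulting 32-bit words, which should correspond to the last rows of Tables 1–3.

```c
#include <stdio.h>
#include <string.h>

int main(void) {
    const char *cw = "00110110100111001100001100100001";   /* 32-bit C_W */
    /* Quadratic residues of 33 (1-based bit positions) */
    const int qr[] = {1, 3, 4, 6, 9, 12, 14, 15, 16, 22, 24, 25, 27, 31};
    const int nqr = sizeof qr / sizeof qr[0];
    char full[33], zero[33], one[33];

    strcpy(full, cw); strcpy(zero, cw); strcpy(one, cw);
    for (int i = 0; i < nqr; i++) {
        int p = qr[i] - 1;                    /* 0-based index */
        full[p] = (cw[p] == '0') ? '1' : '0'; /* (ii): complement every QR bit */
        if (cw[p] == '0') zero[p] = '1';      /* (iii): flip only '0' bits */
        if (cw[p] == '1') one[p]  = '0';      /* (iv): flip only '1' bits */
    }
    printf("QRSGQCCRT   : %s\n", full);
    printf("QRZCSGQCCRT : %s\n", zero);
    printf("QROCSGQCCRT : %s\n", one);
    return 0;
}
```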
Table 1 Quadratic residue code word

| Row | Values (bit positions 1–32) |
|---|---|
| C_w | 0 0 1 1 0 1 1 0 1 0 0 1 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 1 |
| Quadratic residues of 33 (positions complemented) | 1, 3, 4, 6, 9, 12, 14, 15, 16, 22, 24, 25, 27, 31 |
| Quadratic residue code word | 1 0 0 0 0 0 1 0 0 0 0 0 1 0 1 1 1 1 0 0 0 1 1 0 1 0 0 0 0 0 1 1 |
Fig. 4 Variation of Doppler versus amplitude
Fig. 5 Variation of Doppler versus delay
4 Comparative Analysis Figure 10 presents the comparative analysis of noise peaks with respect to the desired Doppler for various existing approaches against our proposed approach. After zero Doppler, the presented approach reduces the noise peaks below 0.2 dB, which is the detectable threshold limit of Doppler-tolerant radar. This indicates that the proposed approach can easily be used to detect multiple moving targets by creating clear windows, as the noise peaks have a value of 0.1 (normalized), which is much less than the threshold
Table 2 Quadratic residue code word with '0' change only

| Row | Values (bit positions 1–32) |
|---|---|
| C_w | 0 0 1 1 0 1 1 0 1 0 0 1 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 1 |
| Quadratic residues of 33 | 1, 3, 4, 6, 9, 12, 14, 15, 16, 22, 24, 25, 27, 31 |
| Quadratic residue code word with '0' change only | 1 0 1 1 0 1 1 0 1 0 0 1 1 1 1 1 1 1 0 0 0 1 1 1 1 0 1 0 0 0 1 1 |
Fig. 6 Variation of Doppler versus delay
Fig. 7 Variation of Doppler versus amplitude
limit of Doppler-tolerant radar. So the presented approach finds extensive use in multiple moving target detection.
5 Conclusion In this paper, we presented a mathematical model based on cyclic redundancy and quadratic residue techniques to generate a code that can detect multiple moving targets. We also presented simulation results, which reveal that the presented approach is very suitable and effective for detecting multiple moving targets. The four proposed coding approaches create a huge number of clear windows with respect to Doppler to observe the current position
Table 3 Quadratic residue code word with '1' change only

| Row | Values (bit positions 1–32) |
|---|---|
| C_w | 0 0 1 1 0 1 1 0 1 0 0 1 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 1 |
| Quadratic residues of 33 | 1, 3, 4, 6, 9, 12, 14, 15, 16, 22, 24, 25, 27, 31 |
| Quadratic residue code word with '1' change only | 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 |
Fig. 8 Variation of Doppler versus amplitude
Fig. 9 Variation of Doppler versus delay
Fig. 10 Variation of Doppler versus noise peaks
of multiple moving targets efficiently. Also, this approach is very simple to operate.
References 1. Golay MJE (1961) Complementary series. IRE Trans Inform Theory 7(2):82–92 2. Kirkpat S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680
3. Dueck G, Scheuer T (1990) Threshold accepting: a general purpose optimization algorithm appearing superior to simulated annealing. J Comput Phys 90:161–175 4. Moharir PS, Singh R, Maru VM (1996) S-K-H algorithm for signal design. Electron Lett 32(18):1642–1649 5. Singh SP, Subba Rao K (2007) A new hybrid algorithm for polyphase code design. ANNIE–2007, Missouri, USA, 11–14 Nov 2007, pp 573–578 6. Sharma GVK, Raja Rajeswari K (2013) MIMO radar ambiguity optimization using phase coded pulse waveforms. Int J Comput Appl (0975–8887) 61(10), January 2013 7. Ravi Krishna Reddy D, Anuradha B. Improved SNR of MST radar signals by Kaiser 8. Kavya Sindhura S, Narayana Reddy S, Kamaraj P (2015) Comparison of SNR improvement for lower atmospheric signals using wavelets. Int J Adv Res Electr Electron Instrum Eng 4(10), October 2015 9. Singh RK, Elizabath Rani D, Ahmad SJ (2017) Hex quadratic residue Ex-Or coded matrix technique to improve target detection in Doppler tolerant radars. Pontee, Florence, Italy. Int J Sci Res 73(1)
Genetic Algorithm-Aided Received Signal Strength Maximization in NLOS Visible Light Communication Tahesin Samira Delwar, Anindya Jana, Prasanta Kumar Pradhan, Abrar Siddique, and Jee-Youl Ryu
Abstract Visible light communication (VLC) is a subset of OWC. VLC considers line of sight (LOS) links as well as different reflections from different paths. In a practical setting, LOS links can be affected by the existence of an obstruction between the LED transmitter and the PD receiver. However, it is possible to create the wireless link in a non-line-of-sight (NLOS) scenario when reflected light paths are present. Nevertheless, in terms of received signal strength (RSS), there is still a probability of degraded system performance. This paper presents a new and unique indoor visible-light-communication-based maximization of the received signal power under NLOS conditions, based on a white light-emitting diode, a photo-detector, and an evolutionary algorithm; the genetic algorithm (GA) is one type of evolutionary algorithm. In this paper, we use a GA to perform the maximization of the received power. Simulation results reflect the feasibility of the proposed methodology. The maximization of VLC-based power under NLOS conditions can enable smarter indoor communication in the near-future VLC domain. Keywords Genetic algorithm · Non-line of sight · Optical communication · Visible light communication
1 Introduction The visible light communication (VLC) is the name given to the form of communication in which information is transmitted from the visible spectrum by modulating T. S. Delwar (B) · A. Siddique · J.-Y. Ryu Department of Information and Communications Engineering, Pukyong National University, Busan, South Korea e-mail: [email protected] A. Siddique e-mail: [email protected] A. Jana · P. K. Pradhan Department of Information and Communications Engineering, J. B Institute of Engineering and Technology, Hyderabad, India © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_45
light waves with wavelengths varying from 380 to 750 nm [1]. Generally, any scheme where data is transferred using light visible to human eyes can be called visible light communication. It should be noted that VLC is a very promising technology. VLC systems are strongly connected to the fast and growing adoption of LEDs around the globe and the imminent arrival of the smart lighting paradigm [2]. VLC uses an LED at the transmitter and a PD at the receiver; the LEDs serve the dual functionality of illumination and communication. VLC delivers point-to-point wireless communication. Multi-gigabit-per-second data rates could be provided over short distances, with ~300 THz of bandwidth available for VLC; for example, LED arrays could be used in a multi-input multi-output (MIMO) manner [3]. Literature Survey: Indoor communication based on LEDs is largely dependent on line of sight (LOS). Until now, several studies have been conducted on VLC systems where line of sight (LOS) plays a significant role, but any object can permanently block this LOS in a real indoor situation. Wang et al. used camera rolling-shutter patterning in their paper [4] to detect long-range NLOS signals. Esmail and Fathallah [5] used various PD array orientations to achieve better system efficiency in the NLOS situation. In the NLOS case, multiple light sources illuminate the receiver; therefore, maximization of the received power at the user position plays an important role in indoor VLC to achieve maximum signal strength at a specific user position. We present a new methodology in this paper to maximize the received signal power based on a genetic algorithm. The rest of this paper is organized as follows: Sect. 2 describes the proposed power maximization method, Sect. 3 presents a short overview of the genetic algorithm and analyzes the results, and Sect. 4 concludes the proposed methodology.
2 Proposed Methodology of Power Maximization Technique 2.1 Problem Formulation The major concept behind maximizing the received power is the set of light intensities received on the PD from different directions. Using micro-electro-mechanical systems (MEMS) to find the various intensities received from different directions, the PD rotates while the user is fixed at a specific position. At this point, the maximum received signal strength is optimized by a genetic algorithm (GA). Finally, at that user position, the PD is fixed in the direction optimized by the GA to ensure the best possible NLOS communication. Communication in VLC is possible through reflected signals coming from different sides of the room, i.e. NLOS links, since, as can be seen in Fig. 1, the LOS links are blocked by an obstruction. From this figure, we can observe that the
Fig. 1 System diagram for the proposed methodology
angle between the LED axis and the light beam is indicated by α1, and α2 indicates the angle between the normal to the wall and the light beam reflected toward the PD. In addition, θ1 is the angle between the incident light beam on the reflecting wall and the normal to the wall, while θ2 is the angle of arrival (AOA) of the light, i.e. the angle between the light beam reflected onto the PD and the axis of the PD [6]. Here, d_a is defined as the distance between the LED and the reflecting point on the wall, and d_b is the distance between the PD and the reflecting point on the wall. V_T and V_R in Fig. 1 are the transmitter and receiver normal vectors, respectively. We can obtain the impulse response as

    h(t) = Σ_{n=1}^{N_LED} Σ_{k=0}^{∞} h^(k)(t; P)

    h^(k)(t; P) = ∫_S L_1 L_2 ⋯ L_{k+1} n^(k) rect(θ_{k+1}/FOV) δ(t − (d_1 + d_2 + ⋯ + d_{k+1})/c) dA_ref,  k ≥ 1    (1)
With the following equation, the power received on the PD can be estimated [5]:

    Max P_r(R_j) = Σ_{i=1}^{I} { P_t H_d(0; S_i, R_j) + Σ_{A_sur} P_t H_ref(0; S_i, R_j) }    (2)
2.2 Genetic Algorithm to Find the Maximum RSS Each generation of the GA [7] consists of crossover and mutation. The GA initially generates a random population of size 20; each individual corresponds to the received signal strength at a candidate orientation. These 20 individuals are randomly crossed over to generate another 20, and the received signal strength of the new individuals is calculated. Out of this population of 40, only the 20 candidates with maximum signal strength are selected and the others are discarded. These 20 candidates then undergo mutation with a rate of 0.001, and their signal strengths are calculated once more. This completes one whole generation, and the process is repeated until the stopping criteria are met.
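The following is a minimal C sketch of one such generation (selection by fitness after crossover and mutation). The chromosome encoding and the fitness function rss() are placeholders, since the paper does not specify them; only the population size (20), the 40-to-20 selection, and the 0.001 mutation rate are taken from the text.

```c
#include <stdio.h>
#include <stdlib.h>

#define POP 20

/* Placeholder fitness: received signal strength for an orientation x in [0,1). */
static double rss(double x) { return 1.0 - (x - 0.6) * (x - 0.6); }

static double urand(void) { return (double)rand() / RAND_MAX; }

/* Sort by fitness, descending. */
static int cmp_desc(const void *a, const void *b) {
    double fa = rss(*(const double *)a), fb = rss(*(const double *)b);
    return (fa < fb) - (fa > fb);
}

int main(void) {
    double pop[2 * POP];
    for (int i = 0; i < POP; i++) pop[i] = urand();    /* random initial population */

    for (int gen = 0; gen < 100; gen++) {
        for (int i = 0; i < POP; i++) {                /* crossover: blend two random parents */
            double p1 = pop[rand() % POP], p2 = pop[rand() % POP];
            double child = 0.5 * (p1 + p2);
            if (urand() < 0.001) child = urand();      /* mutation rate 0.001 */
            pop[POP + i] = child;
        }
        qsort(pop, 2 * POP, sizeof(double), cmp_desc); /* keep the best 20 of 40 */
        if (gen % 20 == 0)
            printf("gen %3d: best fitness %.5f\n", gen, rss(pop[0]));
    }
    return 0;
}
```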
3 Research Results and Discussion The entire simulation was carried out in the MATLAB environment with a white LED in mind. Figure 2 reflects the fitness value over generations. The fitness function for this application is given by

    ω_i = max_{j=1:N}(φ_j)    (3)

Fig. 2 Fitness evolution over generation using GA
where ω_i is the fitness value of the ith generation and φ_j the light intensity received by the jth chromosome. Here, the fitness function represents the position at which maximum power is received in the current generation. Table 1 reflects the reduction in fitness value over generations, Fig. 3 reflects the received signal power over generations, and Table 2 reflects the evolution in received signal strength.
Table 1 Reduction in fitness value with new generations

| Generation | Minimization in fitness [error minimization] |
|---|---|
| 1 | 62.028 |
| 10 | 2.0076 |
| 20 | 0.95303 |
| 30 | 0.52003 |
| 40 | 0.4118 |
| 50 | 0.365 |
| 60 | 0.33743 |
| 70 | 0.32776 |
| 80 | 0.3183 |
| 90 | 0.28284 |
| 100 | 0.28194 |
Fig. 3 Receiving normalized signal power over generation
Table 2 The evolution in received signal strength

| Generation | Normalized received signal strength |
|---|---|
| 1 | 0.38796 |
| 10 | 0.53777 |
| 20 | 0.68747 |
| 30 | 0.78898 |
| 40 | 0.81029 |
| 50 | 0.82323 |
| 60 | 0.84240 |
| 70 | 0.85876 |
| 80 | 0.87179 |
| 90 | 0.87684 |
| 95 | 0.87684 |
4 Conclusions and Future Scope This paper presented a novel and unique technique to maximize the received signal strength. The proposed method provides accuracy in receiving maximum signal strength at the user's position without hindering the system, and it outlines a cost-effective, less time-consuming, smart indoor communication scheme using visible light as the communication medium. For future highly efficient indoor VLC, the proposed VLC-based signal power maximization methodology for NLOS communication can be considered as an additional functionality. The simulation results show the system's robustness and reliability. However, it is clear that the proposed VLC scheme could be further explored in order to steer the PD toward the optimal RSS orientation in real time, considering numerous factors such as room size, transmitter location, obstacle position, and receiver position. Acknowledgements This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2018R1D1A1B09043286).
References 1. Matheus LEM, Vieira AB, Vieira LF, Vieira MA, Gnawali O (2019) Visible light communication: concepts, applications and challenges. IEEE Commun Surv Tutorials 2. Sevincer A, Bhattarai A, Bilgi M, Yuksel M, Pala N (2013) LIGHTNETs: smart lighting and mobile optical wireless networks—a survey. IEEE Commun Surv Tutorials 15(4):1620–1641 3. Zeng L et al (2009) High data rate multiple input multiple output (MIMO) optical wireless communications using white LED lighting. IEEE JSAC 27(9):1654–1662 4. Ghorai A, Walvekar P, Nayak S, Narayan KS (2017) Influence of non-line of sight luminescent emitters in visible light communication systems. J Opt 20(1):015703 5. Esmail CA, Fathallah HA (2015) Indoor visible light communication without line of sight: investigation and performance analysis. Photon Netw Commun 30(2):159–166
6. Delwar TS, Cahyadi WA, Chung YH (2018) Visible light signal strength optimization using genetic algorithm in non-line-of-sight optical wireless communication. Opt Commun 426:511–518 7. Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading
Implementing Web Testing System Depending on Performance Testing Using Load Testing Method Sawsan Sahib Abed and al-Hakam Hikmat Omar
Abstract Web testing is a very important method for users and developers because it gives the ability to detect errors in applications and check their quality in serving users: performance, user interface, security, and the other different types of web testing that may be applied to a web application. This paper focuses on a major branch of performance testing, called load testing. Load testing depends on two important elements called request time and response time; from these elements, it can be decided whether the performance of a web application is good or not. In the experimental results, load testing is applied to the website (http://ihcoedu.uobaghdad.edu.iq): the main home page and all the science department pages. In the conclusion, it appeared that the internet speed and the number of users that entered the website play a significant part in determining the quality of a website using the load testing algorithm. Keywords Web testing · Web application · Performance testing · Load testing · Request time · Response time
1 Introduction In the web world, every user wants everything to be quick, but at the same time there are concerns about dependability of use. Many users decide to give up if a page takes a long time to load, and the interest of people in a business will be lost if the web application is not running fast. These days, every organization uses the internet to conduct its online business, so the performance of a web application is very important for any organization [10, 7]. Performance means providing information in an accurate and quick way despite the interaction
of many simultaneous users or constrained hardware resources, together with the ability of the system to finish its transactions [13]. Performance testing includes recording and monitoring the levels of performance during high, normal, and low stress loads. The procedure of executing an application with the intention of discovering software mistakes is named software testing [2]; this process is a very important level of the software development life cycle (SDLC). The role of software testing is to detect bugs and try to remove them, which in turn increases the reliability and quality of the software [1]. The key aim of software testing is to evaluate the capacity or attributes of a program and determine whether it satisfies the quality of service (QOS) [9]. Software testing centered on web applications is named web testing, and it consists of several types, as illustrated in Fig. 1: • Functionality Testing: The main role of this testing is to check all links in a webpage (test all internal links, test the outgoing links from every page, and check for broken links); cookie testing (a cookie is a very small file stored on the user machine; it is used to keep login sessions, and the application can be tested in the browser's options by disabling or enabling cookies); and form testing (forms are used to acquire information from users and to keep communication with them; testing forms includes the option to create a form, checking whether any user can view, delete, or modify the forms, examining all validations on every field, and examining the fields' default values). Functional testing also includes database testing, which is used to check the data integrity in the web application.
2 Related Work Guo and Chen [5] proposed a performance testing model for Web Services. Aiming to boost testing capacity and automation, the model offers a multi-machine joint testing model and an approach model: the first is applied to share the intense load among numerous machines, which can be called a load balance model, and the second is applied to simulate a true Web Services running environment. The model was applied to mainstream web services testing software and confirmed a functional path for performance testing of services [5]. Büchler and Oudinet [3]: web applications and web services enjoy continually growing popularity, and these applications have to deal with a set of advanced and skilled attacks. The difficulty of recognizing individual weaknesses rises with the complexity of applications. Furthermore, the technique of penetration testing, for the most part, counts on the abilities of highly qualified test specialists; the difficulty of assessing web applications thus represents an intimidating task for developers. As a step toward upgrading security analyses, model checking has, on the model level, shown the ability to distinguish sophisticated attacks and so improve security analyses toward a push-button technology. In order to bridge the gap within existing systems, they presented SPaCiTE, a tool that depends on a dedicated model-checker for security analyses, which generates possible attacks with respect to common weaknesses in web applications. It then semi-automatically runs those attacks on the system under validation (SUV) and reports the weaknesses that were effectively exploited. They applied SPaCiTE to the role-based access control (RBAC) and cross-site scripting (XSS) lessons of WebGoat, an insecure web application maintained by OWASP; the instrument effectively replicated RBAC and XSS attacks [3]. Vani and Deepalakshmi [17] note that software testing is a tricky duty, and testing web applications can be trickier because of the features of these applications. A single technique to estimate IT infrastructure performance is load testing, which allows you to estimate how well the Web site handles its predicted workload by executing an identified collection of scripts that emulate client actions at various load levels. Their paper clarifies the QoS factors load testing addresses, the way to conduct load testing, and how it addresses business requirements at numerous obligation levels, and presents the performance of web-based applications in terms of QoS, response time, and throughput [17]. Stoyanova [15] presented that testing service orchestrations is a challenging work domain because of the importance of extra testing efforts completing classical software testing. Several testing tools and approaches have been suggested; almost all of them contribute limited solutions that cover only one testing activity, like test path analysis, web service emulation, test case generation, fault injection, et cetera. Following the present direction of the research, they developed a unified testing framework, named TASSA, that aims to deliver end-to-end testing of Business Process Execution Language (BPEL) orchestrations. The study presents
its core capability for automation of test scenario generation, execution, and supervision, implemented in an open-source tool for WSDL-based testing of both single web services and composite web services described with BPEL. The tool's functionality includes identification of web service operations and BPEL variables in the context of service composition testing, creation of SOAP request templates, data-driven testing, the definition of assertions at different levels (SOAP, HTTP, and BPEL variable), and the execution and running of test scenarios [15]. Sabharwal and Sibal [12] note that the internet is a major source of information. The massive rise in the number of user requests drives the expansion of web systems, which must be suitable for the changing requests of users. In order to assure high-quality web-based systems, extensive testing of the web application is obligatory before it goes live. Their study is a literature survey of existing methods applied for testing web applications, with major attention on the testing methods used in web application testing; they searched publications from 2000 to 2014 in selected electronic databases, and through their careful survey, a total of thirteen documents were nominated as essential readings. Their paper also performs a comparative study [12]. Dhiman and Sharma [4]: testing is a quite substantial phase of the SDLC, where the software is studied accurately and adjustments are submitted. In current practice, software testing is the process of validating and verifying the correctness of software; therefore, they suggest, testing is needed for the performance that is delivered by the software. Performance testing is used to establish the response, throughput, consistency, and/or scalability of a system under a specific workload. Web services are a concept used around the world these days because of the rapid popularization of the Web, and minimal literature is available regarding web services' performance. Web applications are hard to test compared to conventional applications, especially in terms of performance aspects like unpredictable load, response time, etc. Their paper researches the comparison of three performance testing tools, i.e., Apache JMeter, Grinder, and HttpRider, on the basis of their response times [4]. Lee and Chen [8] said that test-scenario recording and playback technology, usually employed as a test case recorder, has been most practical in industry. An important research subject is exactly how to determine whether a web page has reached a complete state before the recorder may execute the succeeding test command. If a web component is wrongly selected during playback, the corresponding test command will fail. Their study put forward four types of automatic waiting mechanisms for playing test commands. The experimental conclusion shows that the waiting time can be specified dynamically and automatically, so the testers do not have to add waiting commands by hand, thereby decreasing the time and error of manual judgment. The suggested mechanisms were implemented as parts of the SideeX open-source web testing software and have been further adopted by the recent Katalon Recorder and Selenium IDE [8].
3 Performance Testing When a web application takes minutes to load, it can be said that it is frustrating compared with other sites that download similar content in seconds; when someone tries to log in to a web application and receives a message like "server busy", it can be said that it is aggravating; and when a web application responds in some situations but goes into an infinite wait state in other situations, it can be said that it is disconcerting. All these problems happen on the web, and they are related to performance. Performance testing is used to discover performance defects that can result from weak network bandwidth, lack of server-side capacity, weak operating system capabilities, and other software or hardware problems. Performance testing is used to simulate real-world loading situations such as an increasing number of online transactions, amount of data, or number of simultaneous web application users [11]. One advantage of performance testing is determining the throughput and response time of the required web application [11]. Throughput observes how many transactions per second an application can handle during the test; it indicates the amount of transactions produced over time and depends on different elements like: 1. The types of transactions being processed. 2. The specifications of the host computer. 3. The processing overload in software [11]. Response time is the time that begins when the user sends the request and ends when the application observes that the request has completed, as shown in detail in the next section [14]. Tools for performance testing are used to determine how much time the system will take to execute a task [18]. Performance testing is used to make sure whether or not the system meets the non-functional requirements in the Software Requirement Specification (SRS) document. At the current time, most website builders and developers know that it is essential to test the performance prior to launching [19]. Performance testing consists of load and stress testing. Load testing measures the response time, and its main goal is to see whether the application can sustain the increasing load on the server; this paper focuses on load testing, explained in detail in the next section. Stress testing is similar to load testing but goes further in increasing the load on the server; the main goal of stress testing is to test the extreme limits of the application. The following Fig. 2 shows the main steps of performance testing [6].
4 Load Testing This is the most used technique for measuring performance. The major goal of this method is to specify the anticipated peak load condition and the behavior of web applications; that is, while managing the particular load given by users to the system, this
Fig. 2 Performance testing steps
testing provides much information about system behavior [10]. This testing process is based on a gradual rise in resources: the test usually starts by loading the tested web application with a limited number of virtual users and gradually increases the number from normal to peak, as displayed in Fig. 3. Load testing provides a measurement of the application's quality of service (QOS) performance based on actual customer behavior. To build interaction scripts, a script recorder uses customers' requirements; the scripts are then replayed by the load generator module, possibly altered via test parameters, against the Web site. (a) Response time
Fig. 3 Load testing steps
The major key in (QOS) is response time. The response time must be measured to define how customers perceive things, for instance keyword search times and page download. When defining response time, the difference between the base HTML page's download time and that of the other page components (images, ad banners) should be noticed. Response time for a web application varies according to many different elements, such as:
The customer’s internet service provider. The testing site IPS. Bandwidth. Which network route packets from the customer to the testing website?
Obviously, measuring response time from a specific time window and geographical location will not give a full picture: response time depends on time and space, and it must be known how customers from various sites with various connectivity observe the site's performance at different times [10]. (b) How does load testing do the job? To accomplish the load testing process, the load generator module of the tester (see Fig. 2) simulates browser behavior: it continuously sends requests to the Web site, the site sends a reply to the request, it waits for a period of time called the think time, and then it submits a different request. The load generator is able to simulate many simultaneous users to test the scalability of the web application. Every simulated browser is called a virtual user, which is a key load testing concept. If the virtual users' behavior has features comparable to real users, then the load test is valid, so it must be made sure that those virtual users: • use accurate think times; • react like real users, for example abandoning a Web session, frustrated, if the reply period is excessive; • follow patterns similar to real users. A minimal sketch of such a virtual-user loop is given after this section. (c) How do connectivity speed or bandwidth affect load testing? One of the most significant aspects of load testing is the amount of bandwidth allocated to its use: the bandwidth determines how quickly the web server is able to upload information [11]. This will be shown in the discussion section, where load testing is performed for 100 users and for 1000 users.
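The following is a minimal C sketch of the virtual-user loop described in (b); the request itself is stubbed out, and the think time and abandonment threshold are illustrative values, since the paper does not prescribe them.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Stub for one HTTP request; returns the observed response time in ms.
   A real load generator would issue the request over a socket. */
static long send_request(void) { return 150 + rand() % 9000; }

int main(void) {
    const long abandon_ms = 9000;  /* frustrated user gives up past this */
    const unsigned think_s = 2;    /* think time between requests */

    for (int i = 0; i < 10; i++) { /* one virtual user issuing 10 requests */
        long rt = send_request();
        printf("request %d: %ld ms\n", i + 1, rt);
        if (rt > abandon_ms) {     /* emulate session abandonment */
            printf("virtual user abandoned the session\n");
            break;
        }
        sleep(think_s);            /* wait, then submit the next request */
    }
    return 0;
}
```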
5 Methodology
Four elements should be considered when designing the load testing module; these elements are described below, after Tables 1 and 2.
Table 1 Experimental results from the first test (applying load testing with 100 users)

| Name | Method | Requests | Failures | Median response time (ms) | Average response time (ms) | Min response time (ms) | Max response time (ms) | Average content size (bytes) | Request/s |
| Page Id = 15001 | Get | 190 | 0 | 9000 | 9106 | 3949 | 21,435 | 57,406 | 0.21 |
| Page Id = 15003 | Get | 601 | 0 | 9000 | 9226 | 4077 | 21,043 | 38,043 | 0.66 |
| Page Id = 15004 | Get | 727 | 0 | 9200 | 9238 | 3846 | 22,152 | 37,332 | 0.8 |
| Page Id = 15005 | Get | 915 | 0 | 9200 | 9203 | 3943 | 22,293 | 43,524 | 1.01 |
| Page Id = 15006 | Get | 1117 | 0 | 9000 | 9117 | 2116 | 22,031 | 37,148 | 1.23 |
| Page Id = 15007 | Get | 0 | 1336 | 0 | 0 | 0 | 0 | 0 | 0 |
| Page Id = 15008 | Get | 371 | 0 | 8900 | 8898 | 4440 | 19,133 | 37,273 | 0.41 |
| Total | Get | 3921 | 1336 | 9100 | 9155 | 2116 | 22,293 | 39,800 | 4.33 |
Table 2 Experimental results from the second test (applying load testing with 1000 users)

| Name | Method | Requests | Failures | Median response time (ms) | Average response time (ms) | Min response time (ms) | Max response time (ms) | Average content size (bytes) | Request/s |
| Page Id = 15001 | Get | 161 | 189 | 45,000 | 49,315 | 16,330 | 166,734 | 57,406 | 0.18 |
| Page Id = 15003 | Get | 486 | 558 | 44,000 | 49,800 | 3920 | 194,815 | 38,043 | 0.55 |
| Page Id = 15004 | Get | 649 | 765 | 43,000 | 47,928 | 2579 | 173,367 | 37,332 | 0.74 |
| Page Id = 15005 | Get | 845 | 984 | 42,000 | 47,248 | 2014 | 164,167 | 43,524 | 0.96 |
| Page Id = 15006 | Get | 957 | 1234 | 45,000 | 49,103 | 2535 | 167,473 | 37,148 | 1.09 |
| Page Id = 15007 | Get | 0 | 2504 | 0 | 0 | 0 | 0 | 0 | 0 |
| Page Id = 15008 | Get | 358 | 386 | 42,000 | 47,729 | 14,243 | 164,016 | 37,273 | 0.41 |
| Total | Get | 3456 | 6620 | 44,000 | 48,394 | 2014 | 194,815 | 39,824 | 3.93 |
A. Statistics: this part consists of several measured elements, as shown in Tables 1 and 2 (a sketch of how these statistics can be aggregated follows this list). These elements are:
• Type: the type of HTTP request (method). If it is Get, it deals with requesting the homepage and profile page, which must be entered before the program starts; if it is Post, it deals with the login() and logout() functions.
• Name: consists of the tested pages and functions (homepage, profile page, login(), logout()), as shown in Tables 1 and 2.
• Requests: the number of test requests executed against the homepage, profile page, login() and logout().
• Failures: the user test requests that were not executed because of different reasons such as a broken connection or low bandwidth; the different fault types are explained in detail in the failures field.
• Median (ms): measured in milliseconds; the waiting times needed to execute the test requests are sorted from maximum to minimum (or vice versa), and the median is the middle waiting time [16].
• Average (ms): also measured in milliseconds; all waiting times needed to execute the requests are summed and divided by their number to obtain the average time.
• Min (ms) and Max (ms): the minimum and maximum times to execute a user request.
• Content size (bytes): the weight of the required page in bytes.
• Request/s: the number of requests executed per second.
B. Chart: consists of the number of users, the response time and the total requests per second; it is explained in detail in the results section (Figs. 4, 5).
C. Failures: explains why the requests have failed, as shown in Fig. 6.
D. Download data: the purpose of this field is to download the statistics, charts, data and failures as Excel files.
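Assuming the results list produced by the load-generator sketch in Sect. 4, the per-page statistics of Tables 1 and 2 can be aggregated as follows; this is an illustrative reconstruction, not the actual testing system:

```python
import statistics
from collections import defaultdict

def summarize(results, duration_s):
    """Aggregate (page, ok, elapsed) samples into per-page statistics."""
    ok_times = defaultdict(list)     # successful response times per page (ms)
    failures = defaultdict(int)      # failed request counts per page
    for page, ok, elapsed in results:
        if ok:
            ok_times[page].append(elapsed * 1000.0)
        else:
            failures[page] += 1
    for page in sorted(set(ok_times) | set(failures)):
        times = ok_times.get(page) or [0.0]   # pages with only failures -> zeros
        n = len(ok_times.get(page, []))
        print(f"Page Id = {page}: requests={n}, failures={failures[page]}, "
              f"median={statistics.median(times):.0f} ms, "
              f"average={statistics.mean(times):.0f} ms, "
              f"min={min(times):.0f} ms, max={max(times):.0f} ms, "
              f"request/s={n / duration_s:.2f}")
```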
Fig. 4 The total requests made by the users per second (x-axis: testing running time; y-axis: number of requests)
Fig. 5 Response time for 100 users (x-axis: testing running time; y-axis: response time in milliseconds)
Fig. 6 Total number of users, X-axis is the time taken to reach 100 users, Y-axis is number of users
6 Results and Discussion
In the experimentation, the load testing algorithm was applied to the college website (http://ihcoedu.uobaghdad.edu.iq) two times: the first test was applied with 100 entered users requesting all the pages of the website, and the second test was applied with 1000 entered users. The goal is to see the difference in response time when the number of users grows, and to gather detailed information about how efficient the website is. The website was put under test for 15 min, and the internet speed was 130.0 Mbps; see Tables 1 and 2. Table 1 represents the statistics of the website for 100 entered users. The results show that almost all the requests were executed normally except on the page (Page Id = 15007), where all the requests failed; there are many reasons for requests to fail, and the system is provided with a scanner that gives the cause of failure for each request, as shown in Fig. 7. The median sorts the times in ascending order and takes the middle response time; the numbers in the table represent times in milliseconds, and the range needed to execute a request successfully is from 4000 to 9000 ms, while requests that required more than 9000 ms failed. It appeared that all the requests failed on the page "Page Id = 15007",
which means that there is a problem in this page. From the results, the website achieves a good average, the response time is in a good range, the number of failures is very low, and the values of all attributes are very good. The charts of the response time and the total requests through the test are given in Fig. 4, which shows the number of requests made by the users; by increasing the users, the number of requests also increases (x-axis: testing running time; y-axis: number of requests). Figure 5 shows the response time recorded in each second; as can be seen, the response time becomes higher as the number of requests increases (x-axis: testing running time; y-axis: response time in milliseconds). Figure 6 shows the number of users that have entered the website over time, and Fig. 7 shows the failed requests and the reasons why they failed (see also Figs. 8, 9, 10). Table 2 represents the statistics of the second test, applied with 1000 entered users. The results show that the average time and the response time are higher than in the first test, and on each page of the website there are many failures of different types. As a whole,
Fig. 7 Total number of failures in each page, with their reasons
Fig. 8 The total requests made by the users per second (x-axis: testing running time; y-axis: number of requests)
Fig. 9 Response time for 1000 users (x-axis: testing running time; y-axis: response time in milliseconds)
Fig. 10 Total number of users (x-axis: the time taken to reach 1000 users; y-axis: number of users)
the numbers in the first test are better than those in the second test. The reason for these results, even though the internet speed was the same, is the large number of users requesting at the same time: when many users enter the website, the response time is delayed and requests may fail. Figures 11, 12 and 13 show each type of failure with the number of requests that failed for each page. Charts of the response time and the total requests through the second test are given in Figs. 8 and 9; Fig. 9 shows the response time recorded in each second, and Fig. 10 shows the number of users that have entered the website over time.
Fig. 11 Total number of failures in each page, with their reasons
Fig. 12 Total number of failures in each page, with their reasons
7 Conclusion
This paper concentrates on load testing, the main goal of which is to test the application in order to determine and report its behavior under an anticipated live load. The outcome of this kind of testing can be end-user response times, CPU response times and memory statistics; these outcomes offer the tester data to act on for an improved response of the web application. Load testing is a very useful method for commercial sites. From the results shown above, it can be concluded that the bandwidth (internet speed) and the number of users play the major role in load testing, and sometimes they may adversely affect the evaluation and obstruct assessing the application in the right way. There are also many reasons for a request to fail, some of which have been studied in this paper. When load testing is applied for 100 users, the results are better than when it is applied for 1000 users.
Fig. 13 Total number of failures in each page
It appeared that the request failures increased when the number of users increased. The test works online, so the better results came with the better internet speed. For future work and to improve the results, the real user monitoring (RUM) method is recommended; this method gathers performance data from real users as they browse the site, so RUM is especially useful for identifying how fast a page is. After that, the synthetic testing method is recommended, which can be used to make the page work faster.
References
1. Bathla R, Bathla S (2009) Innovative approaches of automated tools in software testing and current technology as compared to manual testing. Glob J Enterp Inf Syst 127–131
2. Barr ET, Harman M, McMinn P, Shahbaz M, Yoo S (2015) The oracle problem in software testing: a survey. IEEE Trans Softw Eng 507–525
3. Büchler M, Oudinet J (2012) SPaCiTE – web application testing engine. In: IEEE fifth international conference on software testing, verification and validation
4. Dhiman S, Sharma P (2016) Performance testing: a comparative study and analysis of web service testing tools. Int J Comput Sci Mobile Comput 507–512
5. Guo X, Chen Y (2010) Design and implementation of performance testing model for web services. In: 2nd international Asia conference on informatics in control, automation and robotics
6. Kaushik M, Fageria P (2014) A comparative study of performance testing tools. Int J Adv Res Comput Sci Softw Eng 1300–1307
7. Khan R, Amjad M (2016) Performance testing (load) of web applications based on test case management. Perspect Sci 355–357
8. Lee S, Chen Y (2018) Test command auto-wait mechanisms for record and playback-style web application testing. In: 42nd IEEE international conference on computer software and applications
9. Maheshwari S, Jain DC, Maheshwari MS (2012) A comparative analysis of different types of models in software development life cycle. Int J Adv Res Comput Sci Softw Eng 285–290
10. Menascé DA (2002) Load testing of web sites. IEEE Internet Comput 70–74
11. Pressman R, Lowe D (2008) Web engineering: a practitioner's approach. McGraw-Hill, Inc
12. Sabharwal S, Sibal R (2015) A survey of testing techniques for testing web based applications. Int J Web Appl
13. Sarojadevi H (2011) Performance testing: methodologies and tools. J Inf Eng Appl 5–13
14. Sharmila MS, Ramadevi E (2014) Analysis of performance testing on web application. Int J Adv Res Comput Commun Eng 2319–5940
15. Stoyanova V (2013) Automation of test case generation and execution for testing web service orchestrations. In: IEEE seventh international symposium on service-oriented system engineering
16. Teoh SH, Ibrahim H (2017) Median filtering frameworks for reducing impulse noise from grayscale digital images: a literature survey. Int J Future Comput Commun
17. Vani B, Deepalakshmi R (2013) Web based testing – an optimal solution to handle peak load. In: International conference on pattern recognition, informatics and mobile engineering (PRIME)
18. https://loadfocus.com/blog/2013/07/04/what-is-throughput-in-performance-testing. Accessed 29 Nov 2018
19. https://loadfocus.com/blog/2013/07/03/what-is-response-time-in-performance-testing. Accessed 2 Dec 2018
Content-Based Caching Strategy for Ubiquitous Healthcare Application in WBANs Dong Van Doan, Sang Quang Nguyen, and Tuan Phu Van
Abstract Named-data and in-network caching are two main properties of content centric networking (CCN), which bring tremendous benefits to many challenging networks, especially the wireless body area network (WBAN). In a WBAN, data are collected and stored in intermediate nodes over a certain time period for behavioral healthcare. However, some existing caching strategies show a low utilization efficiency of those stored contents, due to both the stringent deadlines of sensory information and the low energy of the sensor nodes. As a result, a WBAN integrated with CCN requires a more efficient caching strategy. This letter proposes a content-based caching strategy, based on WBAN content, to evaluate and retain the most valuable information over a certain time period for efficient use of the stored contents. Meanwhile, invalid or expired data will be dynamically discarded from the cache to improve the cache performance. Simulation results show that the proposed strategy achieves a great performance gain over the existing approaches in terms of improving the cache hit ratio and reducing the data retrieval delay.
1 Introduction
The ubiquitous healthcare is one of the potential application areas of wireless body area networks (WBANs), where patients are equipped with wearable and implantable body sensor nodes to gather sensory information for remote monitoring [1–3].
D. V. Doan Faculty Electrical and Electronic Engineering, Ho Chi Minh City University of Transport, Ho Chi Minh City 700000, Vietnam
S. Q. Nguyen (B) Institute of Fundamental and Applied Sciences, Duy Tan University, Ho Chi Minh City 700000, Vietnam e-mail: [email protected]
Faculty of Electrical-Electronic, Duy Tan University, Da Nang 550000, Vietnam
T. P. Van Department of Information & Communication Engineering, Kongju National University, Cheonan, Republic of Korea
© Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_47
The sensory information is monitored in a real-time fashion and delivered to storage devices or a healthcare institute for further processing. Recently, CCN has been considered as a promising solution to improve the caching efficiency in WBAN [4]. In particular, the cache size of a storing node is usually supposed to be infinite; thus, ALWAYS caching [5] is commonly applied as the default caching strategy, where all incoming packets are stored without any selection. However, the cache size of a node is limited in a practical scenario, so a caching policy is indispensable to retain valuable data packets in a cache of limited size. Moreover, the existing replacement policies, i.e., first in first out (FIFO), least recently used (LRU), etc., have been applied mainly to IP-based networks and data centers [6] and are not suitable for WBAN. In an emergency event, the collected data need to be provided in a timely fashion for a prompt intervention by healthcare staff. In addition, the existing caching policies may not store important data in the caches. These factors lead to a low utilization efficiency of the stored contents for WBAN data traffic. Generally, a suitable caching policy for WBAN depends on a few metrics such as network delay, data rate, and content popularity [7–10]. Only the work in [11] considered a set of parameters for cache replacement to remove invalid content from the cache. However, that approach does not consider whether the cache has enough space to store a valuable coming data item. In addition, the priority level of the selected attributes in each type of application is not specified in detail, which in turn affects the accuracy of the caching strategy. To solve these problems, this letter provides a caching decision and replacement strategy that considers the key attributes of WBAN data. These attributes are combined with the different levels of importance of the WBAN application types (the weights of the attributes) to select the most valuable information for storage and to discard invalid data from the cache.
2 System Model
In this section, our proposed model is elaborated for the remote healthcare monitoring domain. A WBAN is typically composed of three tiers: intra-body (tier-1), inter-body sensor network (BSN) communications (tier-2), and extra-body (tier-3), as shown in Fig. 1. We focus on the extra-body tier, which includes intermediate nodes/devices such as routers; these nodes are integrated with CCN processing modules. The sensor nodes send the collected information to consumers (e.g., a physician, or end devices such as medical and telemedicine servers, and other users) by sending periodically or by responding to the requests of the consumers. On the forwarding path, the valuable data items may be stored in the cache of nodes/devices for reuse in processing, analyzing and evaluating activities, and even for a timely intervention by physicians. Due to the limited cache size, a smart caching ability is necessary for selecting the most valuable information and discarding the unimportant from the cache.
Fig. 1 The CCN-WBAN network model
On the other hand, the WBAN data traffic can be classified into emergent, normal, and on-demand traffic, defined as follows:
(i) Emergent traffic (EM) represents the vital signals, including both the critical data packets (e.g., EEG) and the data that are initiated by nodes when a normal threshold is exceeded (e.g., blood pressure (BP) ≥ 140 mmHg or body temperature ≥ 40 °C).
(ii) Normal traffic (NR) denotes the data traffic that is generated under normal patient conditions without any criticality. This NR type includes medical data (e.g., motion control) that must be delivered within a stringent deadline or periodically. These time intervals are also known as the age of the corresponding content existing in the cache.
(iii) On-demand traffic (OD) is required by the consumers to acquire certain data for diagnostic purposes. It corresponds to regular measurements of patient physiological information that typically indicate normal values received from lower-priority sensors (e.g., SpO2, non-invasive cuff). Thus, there are no stringent requirements on the content attributes.
3 Proposed Content-Based Caching Strategy
Based on the above analysis, the proposed content-based caching strategy can be divided into two main phases: a caching decision phase and a replacement policy phase. In the first phase, the caching decision is made for each of the data items by considering the crucial attributes (e.g., data freshness, sensing delay, popularity of the content, hop-count, and energy of a node) to retain the most valuable information in the cache for a certain time period. These crucial attributes of each data item are combined to assign a value when the item traverses the nodes. Only the contents whose values are greater than a predetermined threshold can be stored in the cache. In the second phase, a comparison is made between the candidate data item and the stored contents when the cache is full; the candidate data item is retained in the cache if its value is larger than any of the values of the stored contents.
In addition, all of the contents stored in the cache will be evaluated periodically based on the types of applications; in this way, if a content item is invalid or expired, it will be dynamically removed from the cache. The steps are given as follows:
A. Calculation of the attributes
For the caching decision, all content traversing each node is fully assessed for its value based on some attributes of the content. Data freshness, along with reliable data delivery, has been identified as one of the most important attributes to support the stringent quality-of-service requirement. A data item is considered to be valid if it meets the freshness requirement of the corresponding application. When a data item traverses a node, it is evaluated to determine whether or not it satisfies the desired freshness. The freshness of a data item is given by Eq. (1):

$F = \frac{T - d_{age}}{T} = \frac{T - \sum_{i=1}^{n}\left(d_{path}(i) + d_{caching}(i)\right)}{T}$  (1)

where d_age denotes the time interval between the source and the ith node on the transmission path, T is the data lifetime, and n is the number of nodes on the transmission path; d_path and d_caching are the time delay on the path between two nodes and the caching duration, respectively. The WBAN sensor nodes need to be exposed to the patient's environment for different time periods to capture the sensed readings accurately; this affects both the energy and the lifetime of the sensor nodes. The sensed data have a high value if they take a long time to sense, which is known as the sensing delay τ, as shown in (2):

$\tau = \max(D_1, D_2, \ldots, D_k)$  (2)
where D_k is the sensing delay of application type k. A data item is termed popular when a user requests it many times, or when multiple users request it within a period of time. Thus, retaining popular content for a longer period in the cache is significant for efficient data distribution. The popularity of the content can be expressed as

$P(l) = \frac{R(l)}{R_{total}}$  (3)
where R(l) is the total count of requests for the lth content, and R_total represents the total count of requests received by a node in a period of time. Like freshness, hop-count is another important parameter for a content caching policy. Concretely, the probability of a data item entering a cache is inversely proportional to the distance from the consumer to the source, which depends on the number of hops that packets traverse, as expressed in (4):

$H = \frac{c}{\delta}$  (4)
where c and δ are the position of the current node and the total number of nodes on a transmission path, respectively. In a practical scenario, the sensor nodes in a WBAN are equipped with capacity-limited batteries. Among the different energy levels of the sensor nodes, only the nodes with a low energy level are prioritized for storing content for a longer time. The energy (E) of a node is also modelled as a normalized parameter (0 ≤ E ≤ 1), where the values 0 and 1 denote the lowest and highest energy levels, respectively.
B. Pre-evaluation of the overall attributes of WBAN traffics
To enhance the effectiveness of our proposed approach, a preliminary assessment is made to determine whether or not the current attributes meet the requirements of the running application. In fact, a new content item is considered to be invalid as long as one of its attributes does not satisfy the requirements of the running application; if one of these parameter values is below its minimum threshold, the item will not be stored, otherwise the caching process is triggered.
C. Calculation of the weights of the attributes
Obviously, each attribute plays a different role in different application types. For emergent traffic, the attributes (F, τ) related to the quality of the content should have a higher priority than the network attributes (E, H, etc.), since they are very critical to guarantee the reliability and transmission deadlines. In contrast, the network attribute parameters are more important than any of the content parameters for the normal and on-demand traffic. The central problem is how each node recognizes and applies this priority rule as an evaluation criterion for storage. Thus, calculating the weight (ω_p) of the attributes is the best solution to accomplish this, based on the analytic hierarchy process (AHP) approach [12, 13], as shown in (5), (6) and Table 1:

$\omega_p = \frac{1}{m}\left(\frac{a_{p1}}{\sum_{p=1}^{m} a_{p1}} + \frac{a_{p2}}{\sum_{p=1}^{m} a_{p2}} + \cdots + \frac{a_{pm}}{\sum_{p=1}^{m} a_{pm}}\right)$  (5)

$a_{pq} = \begin{cases} 1, & \text{if } L_p = L_q \\ L_p - L_q + 1, & \text{if } L_p > L_q \\ \frac{1}{L_q - L_p + 1}, & \text{if } L_p < L_q \end{cases}$  (6)
where the attributes p and q are mapped to any of the linguistic values L_m ∈ {1, 3, 5, 7, 9}; L_m denotes the scale of importance, m is the number of attributes, and a_pq ∈ {1/9, 1/7, 1/5, 1/3, 1, 3, 5, 7, 9}.
D. Content-based caching decision
The main function of the caching decision is to select the valuable contents to store. The accuracy of this decision is based on two factors: the assigned value of the attributes and the predetermined threshold. Of these, the assigned value (V_I) is temporary in nature and is expressed in association with the weight vector as follows:

$V_I = \sum_{p=1}^{m} \omega_p \times r(p)$  (7)
Table 1 Weight of attributes

| Attribute | L_m (EM) | L_m (NR) | L_m (OD) | ω_p (EM) | ω_p (NR) | ω_p (OD) |
| F | 9 | 7 | 3 | 0.50 | 0.46 | 0.06 |
| τ | 7 | 5 | 3 | 0.26 | 0.20 | 0.06 |
| P | 5 | 5 | 5 | 0.13 | 0.20 | 0.15 |
| E | 3 | 3 | 7 | 0.07 | 0.07 | 0.36 |
| H | 1 | 3 | 7 | 0.03 | 0.07 | 0.36 |
where r(p) is the modeled and normalized value of the attribute parameter r(p) ∈ {F, τ, P, E, H}. The strength of V_I denotes the importance of the WBAN application. From Eq. (7), it is observed that the assigned value fully represents the importance of the content. To give a selection criterion for an assigned value, a minimum threshold is determined to assess whether the current content is information worth reserving in the cache. This threshold plays a decisive role in the accuracy of the caching decision, and it is calculated using the minimum and maximum attribute threshold values of the different application types, x_min(i) and x_max(i), as shown in (8):

$\mu = \frac{1}{m}\sum_{i=1}^{m}\frac{x_{min}(i)}{x_{max}(i)}$  (8)

To summarize, the procedure of the caching decision is outlined in Algorithm 1.

Algorithm 1 Cache decision
Input: m, V_(I,coming), μ, x(i), x_min(i)
Procedure: Update the coming data item
1: Calculate the attributes, using (1), (2), (3), (4)
2: Pre-evaluate the overall attributes of WBAN traffics
3: for i = 1 to m do
4:   if x(i) < x_min(i) then
5:     Refuse to cache
6:   else
7:     Calculate the attribute weights, using (5), (6)
8:     Set μ using (8) and V_(I,coming) using (7)
9:     if V_(I,coming) > μ then
10:      Cache the coming data item
11:    else Discard and forward the coming data item
12:    end if
13:  end if
14: end for
15: end procedure
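To make the weighting and admission steps concrete, the following Python sketch (our own illustration, not the authors' implementation) derives the emergent-traffic weights of Table 1 from Eqs. (5) and (6) and applies the admission test of Algorithm 1; the example attribute vector and threshold are invented for demonstration.

```python
SCALE_EM = {"F": 9, "tau": 7, "P": 5, "E": 3, "H": 1}   # L_m for EM (Table 1)

def ahp_weights(levels):
    """Eqs. (5)-(6): pairwise comparison matrix -> attribute weights."""
    names = list(levels)
    m = len(names)
    def a(p, q):
        lp, lq = levels[p], levels[q]
        if lp == lq:
            return 1.0
        return lp - lq + 1.0 if lp > lq else 1.0 / (lq - lp + 1.0)
    col_sum = {q: sum(a(p, q) for p in names) for q in names}
    return {p: sum(a(p, q) / col_sum[q] for q in names) / m for p in names}

def assigned_value(r, weights):
    """Eq. (7): V_I = sum over p of w_p * r(p), with r(p) normalized."""
    return sum(weights[p] * r[p] for p in weights)

w = ahp_weights(SCALE_EM)
print({p: round(v, 2) for p, v in w.items()})
# -> {'F': 0.5, 'tau': 0.26, 'P': 0.13, 'E': 0.07, 'H': 0.03}, the EM column of Table 1

r = {"F": 0.9, "tau": 0.6, "P": 0.3, "E": 0.5, "H": 0.8}  # example attribute vector
mu = 0.4                                                  # threshold from Eq. (8)
print("cache" if assigned_value(r, w) > mu else "discard and forward")
```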
Algorithm 2 Cache replacement
Input: m, V_(I,coming), V_(I,stored)(j), μ, x(i), N
Procedure: Update the coming data item
1: Set the value of each content, using (7)
2: if cache is full then
3:   for j = N to 1 do
4:     if V_(I,coming) > V_(I,stored)(j) then
5:       if Size_(I,coming) ≤ Size_stored(j) then
6:         Remove the jth stored item to cache the coming item
7:       else j = j − 1
8:         if V_(I,coming) > V_(I,stored)(j) and Size_(I,coming) ≤ Σ_{j}^{N} Size_stored(j) then
9:           Remove the jth to Nth stored items to cache the coming item
10:        else Forward the item without any caching
11:        end if
12:      end if
13:    else Forward the item without any caching
14:    end if
15:  end for
16: else Drop the expired content by checking periodically
17: end if
18: end procedure
E. Efficient caching space
For efficient utilization of the cache space, all stored contents need to be checked periodically in order to remove invalid and expired contents. Moreover, the contents in the cache should be arranged in order of priority according to V_I. Periodic and stringent-deadline contents are dynamically removed when they expire, even if the cache is not full; these time intervals (the periodic time and the stringent deadline) are known as the age of the stored content existing in the cache. Only periodic contents may be refreshed at every periodic time interval.
F. Cache replacement
When the cache becomes full, the cache replacement relies on V_I and on the sizes of the coming and stored data items to determine which data item is to be replaced. When the value of the coming data item (V_(I,coming)) is larger than the values of the evaluated stored items (V_(I,stored)), the coming item is retained in the cache, with an additional constraint to ensure that the cache space is sufficient for its storage: the size of the coming item must be smaller than the total size of the evaluated stored data items. Algorithm 2 provides the steps to be executed by each node when its cache is full, in order to replace the less valuable data; a sketch of this eviction logic follows.
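A matching sketch of this size-aware eviction, under the same illustrative assumptions as the previous code block:

```python
def cache_replace(item, cache, capacity):
    """Size-aware, value-based replacement in the spirit of Algorithm 2."""
    used = sum(c["size"] for c in cache)
    if used + item["size"] <= capacity:        # free space: no eviction needed
        cache.append(item)
        return True
    # Consider evicting stored items strictly less valuable than the newcomer,
    # starting from the least valuable, until enough space is freed.
    victims = sorted((c for c in cache if c["value"] < item["value"]),
                     key=lambda c: c["value"])
    freed, chosen = 0, []
    for v in victims:
        chosen.append(v)
        freed += v["size"]
        if used - freed + item["size"] <= capacity:
            for v in chosen:
                cache.remove(v)
            cache.append(item)
            return True
    return False                               # forward without caching

cache = [{"name": "ecg", "size": 50, "value": 0.7},
         {"name": "spo2", "size": 300, "value": 0.2}]
print(cache_replace({"name": "bp", "size": 150, "value": 0.6}, cache, capacity=400))
```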
4 Performance Evaluation
The proposed strategy is evaluated using the network topology in Fig. 1 with the OPNET Modeler. A mesh topology is assumed for tier 3 in this simulation. A patient is in the coverage of an access point with a circular area of 100 m radius. The data packet size and data rate of each application type (EM; NR; OD) are set to (50; 150; 300 bytes) and (300; 400; 50 kbps), respectively. Every CCN node has the same cache capacity (at most 1000 packets). The cache hit ratio and the data retrieval delay are evaluated in comparison with probabilistic caching, where a node randomly caches new data items with a certain probability p (p is set to 0.5, shortened as Prob(0.5)) [14], and with the typical ALWAYS-LRU and ALWAYS-FIFO strategies. The cache hit ratio is the probability of achieving a cache hit from the stored content of the nodes instead of the original servers, defined as the fraction of cache hits over the total number of request messages. Figure 2 presents the cache hit ratio comparison of the different caching strategies. It can be observed that the hit ratio increases over time for all the strategies. Under the same LRU replacement policy, probabilistic caching always performs better than ALWAYS. In addition, the cache hit ratio of the proposed approach increases significantly in time and outperforms the other schemes. This can be explained as follows: by explicitly considering the content according to its purpose of use and caching time, the valuable data are more likely to be accurately selected for storage, and the invalid data are removed from the cache in a timely manner. The data retrieval delay is another important criterion that reflects effective caching by determining how long the consumer waits for the responses to its data requests.
Fig. 2 The cache hit ratio (x-axis: simulation time in seconds; y-axis: hit ratio; curves: Content-Based, Prob(0.5)-LRU, ALWAYS-LRU, ALWAYS-FIFO)
Fig. 3 Average data retrieval delay time (x-axis: simulation time in seconds; y-axis: average data retrieval delay time in seconds; curves: ALWAYS-FIFO, ALWAYS-LRU, Prob(0.5)-LRU, Content-Based)
This metric is computed from the time when the consumer sends the interest message to the reception of the matching data. As shown in Fig. 3, the proposed approach achieves the lowest retrieval delay in comparison with the others. In particular, this delay time reflects the ability to fetch data from intermediate nodes; in other words, the approach is shown to efficiently reduce the data delay time and increase the data retrieval from the cache.
5 Conclusion
In this letter, a novel efficient caching strategy is proposed for the integration of CCN and WBAN. In particular, the proposed approach utilizes a content-based technique to retain the most valuable content in the cache and to increase the cache utilization efficiency. The numerical results have shown the remarkable effectiveness of the proposed approach in terms of improving the total hit rate and reducing the data retrieval delay.
References
1. Movassaghi S, Abolhasan M, Lipman J, Smith D, Jamalipour A (2014) Wireless body area networks: a survey. Commun Surv Tutor 16(3):1658–1686
2. Lal KN, Kumar A (2018) E-health application using network coding based caching for information-centric networking (ICN). In: International conference on reliability, Infocom technologies and optimization
3. Al-Turjman F (2019) Internet of nano-things and wireless body area networks (WBAN). Auerbach Publications
4. Hassan MM, Lin K, Yue X, Wan J (2017) A multimedia healthcare data sharing approach through cloud-based body area network. Future Gener Comput Syst 66:48–58
5. Chai WK, He D, Psaras I, Pavlou G (2013) Cache "less for more" in information-centric networks (extended version). Comput Commun 36(7):758–770
6. Bernardini C, Silverston T, Festor O (2015) A comparison of caching strategies for content centric networking. In: Proceedings of the IEEE GLOBECOM, pp 1–6
7. Vural S, Wang N, Navaratnam P, Tafazolli R (2017) Caching transient data in internet content routers. IEEE/ACM Trans Netw 25(2):1048–1061
8. Lim SH, Ko YB, Jung GH, Kim J, Jang MW (2014) Inter-chunk popularity-based edge-first caching in content-centric networking. IEEE Commun Lett 18(5):1331–1334
9. Wang S, Bi J, Wu J (2012) On performance of cache policy in information-centric networking. In: Proceedings of the IEEE ICCCN, pp 1–7
10. Hajimirsadeghi M, Mandayam NB, Reznik A (2017) Joint caching and pricing strategies for popular content in information centric networks. IEEE J Sel Areas Commun 35(3):654–667
11. Al-Turjman FM, Imran M, Vasilakos AV (2017) Value-based caching in information-centric wireless body area networks. Sensors 17(181)
12. Chandavarkar BR, Guddeti RMR (2015) Simplified and improved analytical hierarchy process aid for selecting candidate network in an overlay heterogeneous networks. Wireless Pers Commun 83(4):2593–2606
13. Van DD, Ai Q, Liu Q, Huynh D-T (2018) Efficient caching strategy in content-centric networking for vehicular ad-hoc network applications. IET Intel Transport Syst 12(7):703–711
14. Tarnoi S, Suksomboon K, Kumwilaisak W, Ji Y (2014) Performance of probabilistic caching and cache replacement policies for content-centric networks. In: Proceedings of the 39th annual IEEE conference on local computer networks, pp 99–106
Design and Simulation of High-Frequency Actuator and Sensor Based on NEMS Resonator Tiba Saad Mohammed, Qais Al-Gayem, and Saad S. Hreshee
Abstract Nano-electromechanical systems are a type of device in which electrical and mechanical functions are integrated at the nanoscale. One of these devices is the resonator, which can be defined as a frequency-selective amplifier that starts to vibrate after the appropriate type of excitation is applied. In this work, a nano-electromechanical resonator is designed, studied, and simulated. This resonator is verified in two case studies: in the first case, it is employed as a nano-actuator, while in the second case study it is used as a nano-sensor. The design of the resonator is based on a nano-cantilever which includes three main parts: a drive electrode, a readout electrode, and a clamped-free beam. This beam is placed in parallel and very close to the fixed electrode and is anchored only at one end to the bias electrode, so it can bend freely around its position after appropriate excitation at the resonant frequency. The simulation results show that the resonance frequency is around 3.59 GHz, where a 6.8 V signal is passed from the driver to the readout electrode. In the second case, the nano-resonator is employed as a sensor, for which the sensitivity is measured as a function of voltage; a significant increase is observed as the number of plates increases. Keywords NEMS · Resonator · Actuator · Sensor · Nano electro-mechanical
T. S. Mohammed (B) · Q. Al-Gayem · S. S. Hreshee Electrical Engineering Department, College of Engineering, University of Babylon, Hillah, Iraq e-mail: [email protected] Q. Al-Gayem e-mail: [email protected] S. S. Hreshee e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_48
1 Introduction
A NEMS device is a nanofabricated mechanical structure with at least one nanometric dimension; these systems generally consist of a moving part whose motion can be actuated or sensed electrically [1–3]. A resonator exhibits a dynamic behavior that is observed when specific systems are excited properly. The excitation occurs through a variety of mechanisms: after the appropriate excitation, the mechanical structure of the resonator starts to vibrate, and this vibration is turned into the output signal. So, according to the previous idea, the resonator is both a sensor and an actuator: when it converts the input signal from the electrical domain to the mechanical domain, it works as an actuator (the sensor works in the opposite direction) [4]. In general, these systems exhibit an amplified response to their input when the frequency of the excitation is equal to the resonant frequency of the system [5]. The advantages of these devices are small size, low power consumption, good performance, and very low cost. The rapid development of sensor technology across the decades has presented a remarkable case in global technology [6], and sensors have grown exponentially in the last decade. On the one hand, today's devices have become increasingly complex and independent, for example smartphones, home automation equipment, smart watches, self-driving cars, etc. These devices are called smart because they integrate different sensors and a precise processor to handle the amount of data being collected. On the other hand, the size, energy consumption and cost of these devices are important criteria. The quality factor of NEMS resonators can be up to one million if they are properly sealed in vacuum, and they show a wide range of resonant frequencies from 10 Hz to 1 GHz. Chen et al. [1] show an example of such a device; NEMS resonator systems include sensors for detecting different kinds of chemical or physical properties [3], and NEMS resonators have also been used as actuators [5]. The common structures of NEMS are as follows:
A. Cantilevers, also called clamped-free beams (CFB): the mechanical portion is fixed at one side and free at the other side.
B. Bridges, also called doubly clamped beams (CCB): both sides of this type of structure are fixed [7].
C. Free-free beams (FFB): the beam is supported mechanically at its flexural nodal points by four torsional beams [8].
This paper presents the design and modeling of the cantilever type of resonator and demonstrates that it works as an actuator in the first case study and is used as a sensor in the second case study.
2 Design and Simulation of NEMS Resonator as an Actuator
A NEMS resonator is a device with a movable part whose motion can be actuated or sensed electrically; it is designed to show mechanical resonance at a certain frequency. When the applied energy is converted from the electrical domain to the mechanical domain, the device acts as an actuator; when the mechanical motion is converted back to an electrical signal at the output electrode, the device acts as a sensor. In this section, the design of a NEMS resonator is introduced. The structure of the nano-device consists of three main parts: the first part is the input electrode, which is fixed; the second part is the output electrode, which is also fixed; and the third part is the movable beam, which is connected in parallel with the input and output electrodes. NEMS resonators based on nanocantilevers include the main parts shown in Fig. 1, where w, L, t, and d stand for the cantilever width, length, thickness, and the gap between the cantilever and one electrode, respectively [1]. Geometrical and electrical notations are used in this work. The NEMS resonator is based on a nanobeam, which constitutes the main part of the general design of the device; Fig. 1 describes the dimensions of one type of nanobeam (CFB). Taking into consideration the historical review of previous work in this field, Table 1 lists the chosen dimensions of the CFB structure employed in the designed NEMS resonator. The NEMS resonator can be described through its equivalent RLC circuit; this circuit, built in MATLAB Simulink, is depicted in Fig. 2. The main physical parameters of this model are the damping, elasticity and mass, which are electrically equivalent to R_s, L_s, and C_s, respectively [2, 3, 5]. These parameters can be obtained mathematically using the following equations.
Fig. 1 The polysilicon structure of CFB
Table 1 The cantilever dimensions

| Parameter | Symbol | Value | Unit |
| Cantilever length | L | 165 | nm |
| Cantilever width | w_can | 20 | nm |
| Electrode width | w_elc | 15 | nm |
| Cantilever thickness | t | 75 | nm |
| Cantilever-electrode area | A | 2475 | nm² |
| Gap between cantilever and input electrode | d | 10 | nm |
Fig. 2 Equivalent RLC circuit (R_in = 50 Ω)
$R_s = \frac{\sqrt{k\,m_{eff}}}{Q\,\eta^2}$  (1)

$L_s = \frac{m_{eff}}{\eta^2}$  (2)

$C_s = \frac{\eta^2}{k}$  (3)
where R_s, L_s and C_s are the motional resistance, inductance and capacitance, respectively, and m_eff is the effective mass of the cantilever:

$m_{eff} = 0.25\,\rho\,w_{can}\,L\,t$  (4)
Here ρ is the density of the material from which the cantilever is made; assuming a poly-Si cantilever,

$\rho = 2330\ \mathrm{kg/m^3}$  (5)
k is the effective stiffness:

$k = \frac{0.257\,E\,t\,w_{can}^{3}}{L^{3}}$  (6)
η is the electro-mechanical coupling factor:

$\eta = V_p\,\frac{\varepsilon_o\,L\,t}{d^{2}}$  (7)
ε_o is the vacuum dielectric constant (ε_o = 8.85 × 10⁻¹²), and Q is the quality factor of the resonator; the NEMS resonator is assumed to work in vacuum, with Q = 2000. Here, E is Young's modulus of the nanocantilever material, which can be read from the chart shown in Fig. 3 [7]; for poly-Si, E = 150 GPa. C_o is the static capacitance of the cantilever-driver system, which increases when a dc voltage (V_p) is applied, according to the electro-mechanical coupling equation:

$C_o = C(1 + \eta)$  (8)

$C = \frac{\varepsilon A}{d}$  (9)

where A is the cantilever-electrode area, given below in (10).
Fig. 3 This chart illustrates the value of E/ρ for different materials
Table 2 Electrical parameters of the RLC circuit

| Parameter | Symbol | Unit | Value |
| Serial motional inductance | L_s | H | 0.0073 |
| Serial motional capacitance | C_s | F | 2.727 × 10⁻¹⁹ |
| Serial motional resistance | R_s | Ω | 8.1745 × 10⁴ |
| Resonator static capacitance | C_0 | F | 2.191 × 10⁻¹⁸ |
| Effective stiffness | k | N/m | 704,358 |
| Resonant frequency | f_o | GHz | 3.5697 |
$A = L \cdot w_{elc}$  (10)
The resonance frequency can be expressed as:

$f_o = \frac{1}{2\pi}\sqrt{\frac{k}{m_{eff}}}$  (11)
C_p represents the parasitic capacitance associated with the communication lines between the cantilever and the circuit [5]. The parameters of the equivalent electrical circuit of the NEMS resonator, extracted using the previous equations at V_p = 20 V, are listed in Table 2. The transfer function can be found by analyzing the electrical circuit in Fig. 2 as follows:
$v_o = \frac{v_i \cdot \frac{1}{sC_p}}{\frac{1}{sC_p} + Z}$  (12)

$Z = \frac{\frac{1}{sC_O}\left(R_s + sL_S + \frac{1}{sC_S}\right)}{\frac{1}{sC_O} + R_s + sL_S + \frac{1}{sC_S}}$  (13)
Substituting Eq. (13) into Eq. (12) gives

$v_o = \frac{v_i \cdot \frac{1}{sC_p}}{\frac{1}{sC_p} + \frac{\frac{1}{sC_O}\left(R_s + sL_S + \frac{1}{sC_S}\right)}{\frac{1}{sC_O} + R_s + sL_S + \frac{1}{sC_S}}} \cdot \frac{sC_p}{sC_p}$  (14)
Fig. 4 Frequency response with respect to the gain at C_p = 10⁻¹⁸ F (x-axis: frequency; y-axis: gain)
$\frac{v_o}{v_i} = \frac{1}{1 + \frac{sC_p \cdot \frac{1}{sC_O}\left(R_s + sL_S + \frac{1}{sC_S}\right)}{\frac{1}{sC_O} + R_s + sL_S + \frac{1}{sC_S}}}$  (15)

$\frac{v_o}{v_i} = \frac{\frac{1}{sC_O} + R_s + sL_S + \frac{1}{sC_S}}{\frac{1}{sC_O} + R_s + sL_S + \frac{1}{sC_S} + \frac{C_p R_s}{C_O} + \frac{sC_p L_S}{C_O} + \frac{C_p}{sC_S C_O}}$  (16)
Using a MATLAB m-file and Eq. (16), the frequency response of the gain for different values of the parasitic capacitance C_p can be found, as shown in Figs. 4, 5 and 6. The proposed model was simulated with different values of the parasitic capacitance C_p without changing the values of the other parameters in Table 2. The effect of the parasitic capacitance is illustrated in Figs. 4, 5 and 6; it can be concluded that the value of the parasitic capacitance deeply affects the stability of the resonator system, and decreasing its value improves the resonant performance.
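Equation (16) can also be evaluated directly, as in the following Python sketch; this is our own illustration of the sweep (the original work used a MATLAB m-file), with the circuit values taken from Table 2 and C_p chosen as an example.

```python
import math

# Equivalent-circuit values from Table 2; C_p is the parasitic capacitance.
R_s, L_s, C_s, C_o = 8.1745e4, 0.0073, 2.727e-19, 2.191e-18
C_p = 1e-18

def gain_db(f):
    """|v_o/v_i| in dB, evaluated from Eq. (16) at frequency f (Hz)."""
    s = 2j * math.pi * f
    z_m = R_s + s * L_s + 1.0 / (s * C_s)     # motional-branch impedance
    num = 1.0 / (s * C_o) + z_m
    den = num + (C_p / C_o) * z_m             # s*C_p*(1/(s*C_o))*z_m = (C_p/C_o)*z_m
    return 20.0 * math.log10(abs(num / den))

# Sweep around the expected resonance (Table 2: f_o = 3.5697 GHz).
for f in (1e9, 2e9, 3e9, 3.57e9, 4e9, 5e9):
    print(f"{f / 1e9:5.2f} GHz: {gain_db(f):8.3f} dB")
```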
3 NEMS Resonator as an Actuator
This resonator is employed in this case study as an actuator. The NEMS resonator examined in this work is a cantilever actuated by electrostatic force; it is thus a transducer which converts energy between the electrical and mechanical domains [8].
Fig. 5 Frequency response with respect to the gain at C_p = 10⁻¹⁷ F (x-axis: frequency; y-axis: gain)
Fig. 6 Frequency response with respect to the gain at C_p = 10⁻¹⁷ F (x-axis: frequency; y-axis: gain)
Under electrostatic actuation, the resonator uses the attraction between electric charges to generate a mechanical force; when a cantilever is actuated by the electrostatic (capacitive) force, both an ac and a dc voltage are applied [9]. The dc voltage (V_dc) is applied on a fixed bias electrode of the resonator, and the ac voltage (V_ac) is applied on the input electrode. In this work, it is assumed that the ac voltage can be represented as a time-varying sine wave of amplitude V_ac = 2 V. Figure 7 shows the cantilever structure in mechanical motion.
Fig. 7 Mechanical motion of the cantilever
Fig. 8 The input signal, with a peak-to-peak value of 4 V
Fig. 9 The output signal, with a peak-to-peak value of 6.8 V
Fig. 10 The frequency response of the device (x-axis: frequency in GHz; y-axis: peak-to-peak voltage in V)
Figures 8 and 9 show the input signal applied to the circuit of Fig. 2 and the output signal that results when using the circuit parameter values in Table 2, at the resonant frequency and assuming the device is actuated in vacuum. Different frequencies have been used for the input signal to check the possibility of generating the required vibration in the actuator. From Fig. 9, it can be seen that the amplitude of the output signal has a peak value at the resonance frequency; this amplification verifies that the device works as an actuator at that frequency. The frequency response of the actuator is also clear from Fig. 10, where different frequencies of the input signal have been used to identify the frequency at which the cantilever starts to vibrate. It can be concluded that the resonance frequency is around 3.56 GHz, where the maximum amplitude has been achieved.
4 NEMS Resonator as a Sensor
A NEMS resonator can be employed as a sensor: a device that converts mechanical motion into an electrical signal when it is suitably linked with an electrical circuit at the output (sensing) electrode [10]. The electrostatic transduction is sensitive to the overlap area, where the number of parallel plates and the gap between the plates are the main parameters that affect the sensitivity level. In this section, a simple parallel-plate capacitor structure consisting of N plates is designed and modeled, as shown in Fig. 11.
Fig. 11 Structure of CFB with N = 5 parallel plates

Table 3 Electrical parameters of the RLC circuit at different N (ε_r = 1)

| Parameter | L_s (H) | R_s (Ω) | C_s (F) | C_o (F) | f_0 (GHz) | V_pp (V) |
| N = 4 | 4.5557 × 10⁻⁴ | 5.1091 × 10³ | 4.3633 × 10⁻¹⁸ | 8.7655 × 10⁻¹⁸ | 4.4 | 9.578 |
| N = 5 | 2.9157 × 10⁻⁴ | 3.2698 × 10³ | 6.8177 × 10⁻¹⁸ | 1.0957 × 10⁻¹⁷ | 4.55 | 10.69 |
| N = 8 | 1.1389 × 10⁻⁴ | 1.2773 × 10³ | 1.7453 × 10⁻¹⁷ | 1.7531 × 10⁻¹⁷ | 5 | 13.824 |
Assuming that every plate has the same dimensions as those used in Table 1, the overall capacitance (9) becomes:

$C = N\,\varepsilon_r\,\varepsilon_o\,\frac{A}{d}$  (17)
From Table 3 and Figs. 12, 13 and 14, it is noted that when the number of parallel plates (N) increases, the peak-to-peak output voltage increases.
5 Conclusion
Electromechanical systems are devices that can perform electrical and mechanical functions, and from this paper it is clear that these devices can be implemented and integrated at the nanoscale. In this paper, a NEMS resonator, which can be defined as a frequency amplifier and considered one of these nanoscale devices that starts to vibrate after applying the appropriate excitation, was designed, studied, and simulated.
Fig. 12 The output signal when N = 4
Fig. 13 The output signal when N = 5
The resonator relies for its operation on the cantilever, which represents the core of the device. The proposed model was simulated with different values of the parasitic capacitance, and the results showed that its value deeply affects the stability of the resonator, and decreasing its value would improve the resonant performance. Another model of the NEMS resonator was designed and studied; this second model depends on the parallel-plate structure of the capacitor. The results of the second model illustrate that the device becomes more sensitive as the number of parallel plates increases, which means the device worked as a sensor.
Fig. 14 The output signal when N = 8
References
1. Chen M, Zhang Z, Ding H, Lang S, Tao Y, Chu T, Xiao G, Graddage N, Ton K (2018) Fully printed parallel plate capacitance humidity sensors. In: 2018 international flexible electronics technology conference (IFETC)
2. Rajendran S, Mary Lourde R (2015) Modeling and optimization of a cantilever based DNA sensor for biomedical applications. In: 2015 international conference on electrical electronics signals communication and optimization (EESCO)
3. Huang JG, Dong B, Tang M, Gu YD, Wu JH, Chen TN, Yang ZC, Jin YF, Hao YL, Kwong DL, Liu AQ (2015) NEMS actuator driven by electrostatic and optical force with nano-scale resolution. In: 2015 transducers - 18th international conference on solid-state sensors, actuators and microsystems (TRANSDUCERS)
4. Ouerghi I, Philippe J, Duraffourg L, Laurent L, Testini A, Benedetto K, Charvet AM, Delaye V, Masarotto L, Scheiblin P, Reita C, Yckache K, Ladner C, Ludurczak W, Ernst T (2014) High performance polysilicon nanowire NEMS for CMOS embedded nano sensors. In: 2014 IEEE international electron devices meeting
5. Verma VK, Yadava RDS (2016) Stochastic resonance in MEMS capacitive sensors. Sens Actuat B: Chem
6. Tanaka K, Kihara R, Sánchez-Amores A, Montserrat J, Esteve J (2007) Parasitic effect on silicon MEMS resonator model parameters. Microelectr Eng
7. Arcamone J, Misischi B, Serra-Graells F, van den Boogaart MAF, Brugger J, Torres F, Abadal G, Barniol N, Pérez-Murano F (2008) Compact CMOS current conveyor for integrated NEMS resonators. IET Circuits Dev Syst
8. Verd J, Abadal G, Teva J, Gaudo MV et al (2005) Design, fabrication, and characterization of a sub-microelectromechanical resonator with monolithically integrated CMOS readout circuit. J Microelectromech Syst
9. Verma VK, Yadava RDS (2016) Stochastic resonance in MEMS capacitive sensors. Sens Actuators B: Chem
10. Li L, Simulation of mass sensor based on luminescence of micro/nano electromechanical
Fast and Accurate Estimation of Building Cost Using Machine Learning T. Q. D. Pham, Nguyen Ho Quang, N. D. Vo, V. S. Bui, and V. X. Tran
Abstract Accurately predicting the building cost is of great importance to house building companies. In this study, a machine learning (ML) framework with several regression approaches is developed to model and estimate building costs accurately in both the actual and inverse ways. A dataset of 10,000 samples, consisting of real data computed by different services of our partner construction company, is used to train and to validate the ML models. The ML-based estimation results indicate that the linear regression model and the decision tree model provide the most accurate results for the construction cost and the maintenance cost, respectively. Furthermore, an artificial neural network framework with the highest regression accuracy is considered in the inverse way in order to identify the best available features of the to-be-built house that a buyer can have for a given budget. Keywords Machine learning · Regression analysis · Neural networks · Cost estimation
1 Introduction and Objective
Machine learning (ML), i.e., algorithms which are able to self-learn based on a set of training data, is becoming widely used in different technical sectors.
T. Q. D. Pham · N. H. Quang · V. S. Bui · V. X. Tran (B) Thu Dau Mot University, Binh Duong, Vietnam e-mail: [email protected]
T. Q. D. Pham e-mail: [email protected]
N. H. Quang e-mail: [email protected]
V. S. Bui e-mail: [email protected]
N. D. Vo Industrial University of Ho Chi Minh City, Ho Chi Minh City, Vietnam e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_49
In the civil construction sector, ML is mainly used to predict the price of buildings from the mortgage data of building features (locations, building state, useful areas, garden, etc.); see, for example, [1–3]. Some authors have employed ML to predict the structural behavior of houses [4, 5]. For a construction company, it is essential to provide a fast and accurate estimation of the building cost so that the sales representatives can communicate with the buyers on an instantaneous basis while assuring that profitable prices are always offered. In a simplified representation, the current workflow of a buying command in a construction company consists of the different steps shown in Fig. 1. As shown in Fig. 1, in the current practice, once a buying command is made, the key information on the features of the house to be built, such as ground surface area, number of floors, number of windows, and construction materials, is collected by the sales representatives. These data are then passed to the design team, who sketch a plan of the future house, and structural mechanics engineers assure that the design is safe to build. Then, the resources, such as the quantity of materials and the labor needed to build the house, are estimated and used to compute the bare cost of construction, normally in terms of the cost of a unit surface for construction and for maintenance. This process is repeated until the design is approved by the buyer, prior to the final cost. More often than not, during this process the buyer continues to change the requests, sometimes with just very minor changes. One such iteration takes a lot of time and requires many interfaces between different services within the company, leading to complicated coordination work. Ideally, the sales person should be assured that the proposed offer is always profitable, and should also be able to guide the client quickly to the most realistic options of the future house for a given budget. Our partner, a long-established company, has designed thousands of buildings and houses and wishes to provide a fast and reasonably accurate estimation of the house building cost in order to increase the operational efficiency of the company. In this paper, it is aimed to build an ML-based estimation tool which can provide fast and useful estimations of the house building cost: the laborious and time-consuming workflow steps performed by the back-end offices shown in Fig. 1 are to be embedded in an estimation tool, so that a front-end sales person can estimate the house building cost within one minute. This problem is described in the ML-based workflow of a buying command in a construction company shown in Fig. 2.
Fig. 1 Current workflow of a buying command in a construction company
Fig. 2 ML-based workflow of a command in the house building company
2 ML Problem Definition and Data Analysis
A machine learning problem consists of three important components: the task, the performance measure, and the experience. These components are described in this section, together with the data analysis.
2.1 Task
In this paper, two types of task are required for the ML problems. The first one is the standard regression, which identifies the value of the output for any provided input value; this task is to help the sales person to estimate quickly the minimum costs of house building and maintenance to assure profits. The second task is the inverse regression, which identifies the value of the input for any provided output value; this task is to help the sales person to identify the best features of the future house that the buyer could afford for a given budget.
2.2 Performance Measure
For forming the training and testing datasets, 80% of the collected data were used to train the algorithms, while the remaining 20% were used to validate their performance in both the standard and inverse cases. The accuracy was calculated as the coefficient of determination R² of the prediction.
2.3 Experiences
A snapshot of the data structure, with normalized values, used in this study is represented in Fig. 3.
Fig. 3 A snapshot of the data structure (normalized values):

X1         X2        X3        X4        X5        X6        X7        X8        X9        X10       X11       Z1        Z2
0.00E+00   3.33E-02  2.89E+00  1.56E+00  5.00E-01  7.22E-01  1.67E-01  2.78E-01  4.22E-01  1.00E+00  5.56E-01  1.69E+02  2.22E+01
-1.67E-02  2.22E-02  1.56E+00  1.56E+00  6.11E-01  7.78E-01  2.22E-01  2.22E-01  3.33E-01  9.44E-01  1.00E+00  1.56E+02  2.10E+01
0.00E+00   4.44E-02  2.33E+00  1.11E+00  6.11E-01  8.89E-01  1.67E-01  2.22E-01  3.89E-01  1.00E+00  7.78E-01  1.86E+02  2.29E+01
0.00E+00   4.44E-02  2.89E+00  1.78E+00  4.44E-01  8.33E-01  2.78E-01  2.78E-01  4.11E-01  1.00E+00  4.44E-01  1.69E+02  2.22E+01
As shown in Fig. 3, the input, denoted by the vector X, has 11 features named X1, X2, …, X11 for confidentiality reasons. These features, which do not necessarily correspond to the nominated order, are key characteristics of the house to be built, such as thermal conductivity, wall thickness, number of floors, house depth, width of windows, and heights. The output, denoted by the vector Z, has two components, named Z1 and Z2; they can be interpreted as the cost of the building per unit area and the maintenance cost per unit area. About 10,000 samples, obtained from different combinations of parameters, are real data computed by the different services of our partner company.
2.4 Data Analysis To obtain meaningful results, the available data were cleaned, transformed, and carefully checked. The data analysis consists of the following steps. Information Value Computation The information values are computed in the framework using the Python library pandas. These values were analyzed to find defects in the data; they are then used as input for the machine learning framework. Data Transformation Data transformation techniques are applied to clean the available data, such as removing the outliers and the "null" entries that exist in the data, followed by a data checking process. Data Checking The input data are checked for multi-collinearity. If the problem were found, principal component analysis (PCA) would be applied to ensure there is no defect in the data. As shown in Fig. 4, no multi-collinearity problem is found in the available dataset. The available data are therefore reliable and representative for use in the ML problems.
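A possible implementation of this cleaning and checking step is sketched below; the file name, column names, and thresholds are assumptions, since the text only names the operations (null removal, outlier removal, multi-collinearity check, optional PCA):

```python
# Hypothetical sketch of the cleaning and multi-collinearity check described above.
import pandas as pd
from sklearn.decomposition import PCA

df = pd.read_csv("houses.csv")
df = df.dropna()                                   # remove "null" entries
# simple outlier removal: keep rows within 3 standard deviations per column
num = df.select_dtypes("number")
df = df[((num - num.mean()).abs() <= 3 * num.std()).all(axis=1)]

# multi-collinearity check via the pairwise correlation matrix of X1..X11
features = [f"X{i}" for i in range(1, 12)]
corr = df[features].corr().abs()
high = (corr > 0.9) & (corr < 1.0)                 # assumed collinearity threshold
if high.any().any():
    # fall back to PCA only if strongly correlated feature pairs exist
    X_reduced = PCA(n_components=0.99).fit_transform(df[features])
```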
3 Machine Learning Frameworks After the data analysis process, several machine learning techniques, such as linear regression, Lasso regression, elastic-net regression, decision tree, K-nearest neighbor, and support vector machines (SVM), are used to develop models that can perform the tasks listed in Sect. 2. These methods are summarized in this section.
Fig. 4 Relationships between the input data
3.1 Linear Regression Linear regression is a method of studying the relationship between independent variables and a dependent variable, based on the following equation:

Z = a_0 + \sum_{i=1}^{11} a_i X_i \qquad (1)
where X1, X2, …, X11 are the components of the input and Z can be Z1, Z2, or (Z1 + Z2). In Eq. (1), a0 and ai are the coefficients of the model.
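A minimal sketch of fitting Eq. (1) is shown below (illustrative only; X_train, z_train, X_test are assumed to be prepared as in the earlier split sketch). The fitted coefficients are the a0 and ai of Eq. (1), which also serve as the partial derivatives discussed later in Sect. 4.1:

```python
# Hypothetical sketch of fitting the linear model of Eq. (1).
from sklearn.linear_model import LinearRegression

reg = LinearRegression().fit(X_train, z_train)
a0 = reg.intercept_          # a0 in Eq. (1)
a = reg.coef_                # a1..a11 in Eq. (1); for a linear model, a_i = dZ/dX_i
z_hat = a0 + X_test @ a      # identical to reg.predict(X_test)
```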
3.2 Lasso Regression Least Absolute Shrinkage and Selection Operator (Lasso) regression is an analysis method similar to linear regression. Its main advantages are that it reduces model complexity and prevents the over-fitting problem in a machine learning framework.
3.3 Elastic-Net Regression Elastic-net regression is a penalized linear modeling approach that is a mixture of ridge regression and LASSO regression. It reduces the impact of collinearity on model parameters and also the dimensionality of the support by shrinking some of the regression coefficients to zero.
3.4 Decision Tree Decision tree regression is a method that builds on the idea of random selection of features; Leo Breiman and Adele Cutler developed this approach in 2001 [6].
3.5 K-Nearest Neighbor K-nearest neighbor regression is a simple algorithm that calculates and predicts the output based on the distance functions.
3.6 Support Vector Machines Support vector regression is a supervised machine learning framework for regression and classification problems. It relies on kernel functions to find the hyper-plane that best separates two classes.
4 Results and Discussion 4.1 Regression Tasks Different regression models are employed in this study; all of them are of supervised-learning type. The models are compared in terms of accuracy and computing time, and the comparative analysis is listed in Table 1. From the results listed in Table 1, it is concluded that linear regression provides the most accurate results with the least computing time in the case of Z1, whereas decision tree regression appears to be the best in the cases of Z2 and (Z1 + Z2). In general, the linear regression model appears to be the most suitable model for the problem under investigation.
Table 1 Comparison between different regression models for different output features

Regression model        | Accuracy (%)          | Computing time (s)
                        | Z1    Z2    Z1 + Z2   | Z1     Z2     Z1 + Z2
Linear regression       | 95    94    94        | 0.003  0.006  0.006
Lasso regression        | 73    93    93        | 0.005  0.01   0.005
Elastic-net regression  | 39    59    58        | 0.003  0.008  0.005
Decision tree           | 61    96    95        | 0.07   0.04   0.05
K-nearest neighbors     | 85    70    69        | 0.04   0.07   0.15
Support vector machine  | 93    54    52        | 3.5    3.2    3.8
In order to find the most important independent variables, the linear regression model is applied in all cases to find the parameters with the highest partial derivatives ∂Z/∂X (see Table 2). The underlying reasoning is that the partial derivative with the highest absolute value marks the most important parameter: a larger partial derivative indicates that a small change in that feature leads to a significant change in the output. As listed in Table 2, in the case of Z1, X1 and X9 are the most important input features, whereas in the cases of Z2 and (Z1 + Z2), the most important input features are X2 and X9. These results are very useful for the sales representatives in choosing the appropriate features to vary following the change requests from the buyer.
4.2 Inverse Regression Task The inverse regression case is performed in this study to extract more insights from the data. It shows not only the relationships between features and the corresponding targets but also the relationships between the targets, thereby guaranteeing a better representation and interpretability of the house-building-cost prediction problem. The problem thus becomes a multiple-output regression with only a single input variable. Three specific inputs were considered: Z1, Z2, and (Z1 + Z2). An artificial neural network (ANN) regression model is implemented in this case because there is only one variable at the input. The ANN model consists of 3 neurons in the first layer, and 5 and 7 neurons in the second and third layers, respectively (see Fig. 5). The ReLU activation function and the Adam optimizer are used in the inverse regression model. The loss of the model for these three cases is shown in Fig. 6; there is no significant difference between the three cases in the loss after 10 epochs.
Table 2 Partial derivatives ∂Z/∂X of the linear regression framework

Case     | Z = Z1 | Z = Z2 | Z = (Z1 + Z2)
X = X1   | 7.5    | −3     | −10
X = X2   | −3.3   | 260    | 257
X = X3   | −0.5   | 4.23   | 3.7
X = X4   | −2.3   | −25.9  | −28.2
X = X5   | 1.9    | 8.7    | 10.6
X = X6   | 0.4    | 8.6    | 9
X = X7   | 4.1    | 20.7   | 24.8
X = X8   | 0.6    | 20.6   | 21.2
X = X9   | 19.2   | 163    | 182
X = X10  | 3.3    | −1.3   | −4.6
X = X11  | −2.0   | 5.1    | 3.0
Fig. 5 Schematic of an artificial neural network with three layers
Fig. 6 Model loss of a Z1 case, b Z2 case, c Z1 + Z2 case
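A minimal Keras sketch of the inverse-regression network described above (layers of 3, 5, and 7 neurons, ReLU activation, Adam optimizer, 10 epochs, as in Figs. 5 and 6) is given below; the data variable names and any settings beyond those stated in the text are assumptions:

```python
# Hypothetical sketch of the ANN used for the inverse regression task.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(3, activation="relu", input_shape=(1,)),  # single input: Z1, Z2, or Z1+Z2
    Dense(5, activation="relu"),
    Dense(7, activation="relu"),
    Dense(11),                                      # predicts the 11 features X1..X11
])
model.compile(optimizer="adam", loss="mse")
history = model.fit(z_train, X_train, epochs=10, validation_split=0.2)
```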
Table 3 lists the results of the artificial neural network regression model for the three cases. An accuracy of at least 92% is obtained in all cases, and the computing times are not significantly different.

Table 3 Artificial neural network regression results

Case     | Accuracy (%) | Computing time (s)
Z1       | 96.21        | 19.7
Z2       | 96.12        | 20.3
Z1 + Z2  | 92.16        | 17.6

Table 4 lists the minimum values of the output and the associated input variables.

Table 4 Minimum outputs with artificial neural network regression results

Minimum value       | X1    | X2   | X3    | X4    | X5   | X6   | X7   | X8   | X9   | X10  | X11
Z1 (= 179.72)       | 0.10  | 0.22 | 20.37 | 10.85 | 4.5  | 4.53 | 1.82 | 1.76 | 3.12 | 7.04 | 6.19
Z2 (= 1317.3)       | −0.22 | 0.19 | 18.88 | 10.46 | 4.14 | 4.35 | 1.98 | 1.71 | 3.15 | 6.58 | 6.2
Z1 + Z2 (= 1506.4)  | 0.11  | 0.31 | 19.39 | 11.42 | 4.63 | 4.72 | 1.83 | 1.96 | 3.10 | 7.04 | 6.49

In practice, this information would help the salesperson to set the limit of the minimum price that assures the profits and the associated features that the buyer can have with this minimum cost.
4.3 Evaluation of Data Distribution In this section, the value of the output feature (Z1 + Z2) is varied to understand the associated changes of the input values X. Five values, at 100, 80, 60, 40, and 20% of the maximum value of Z1 + Z2 (= 2100), are considered in this study. In practice, this information would help the salesperson to quickly guide the buyer about the best features that he/she can have for a given budget.

Z1 + Z2        | X1     | X2   | X3    | X4    | X5   | X6   | X7   | X8    | X9    | X10  | X11
100% (= 2100)  | −0.46  | 0.67 | 28.14 | 13.97 | 7.04 | 6.06 | 3.20 | 1.89  | 4.20  | 9.85 | 8.14
80% (= 1680)   | −0.34  | 0.51 | 22.64 | 11.09 | 5.72 | 4.92 | 2.65 | 1.42  | 3.38  | 7.86 | 6.57
60% (= 1260)   | −0.23  | 0.34 | 17.15 | 8.22  | 4.39 | 3.79 | 2.10 | 0.95  | 2.57  | 5.87 | 5.00
40% (= 840)    | −0.11  | 0.18 | 11.65 | 5.34  | 3.07 | 2.65 | 1.55 | 0.489 | 1.766 | 3.88 | 3.43
20% (= 420)    | −0.002 | 0.02 | 2.47  | 1.74  | 1.52 | 1.00 | 0.02 | 0.95  | 1.89  | 1.86 | 6
5 Conclusion and Perspectives In this study, a dataset of 10,000 samples, consisting of real data computed by the different services of our partner construction company, is used to train and validate ML models. The ML-based estimation results indicate that the linear regression model and the decision tree model provide the most accurate results for the construction cost and the maintenance cost, respectively. Furthermore, an artificial neural network framework is applied in the inverse direction, with high regression accuracy, to identify the best available features of the to-be-built house that a buyer can afford for a given budget. As a perspective, multiple-output regression will be performed using deep learning for more dependent data, providing a clearer interpretation of each model. Problems with intermediate outputs, such as labor cost and materials, will also be considered. Acknowledgements The support of Thu Dau Mot University for this work within the "Modelling and Simulation in the Digital Age—MaSDA" research program is greatly appreciated.
References 1. Mirmirani S, Li HC (2004) A comparison of VAR and neural networks with genetic algorithm in forecasting price of oil. Adv Econometrics 19:203–223 2. Guresen E, Kayakutlu G, Daim TU (2011) Using artificial neural network models in stock market index prediction. Expert Syst Appl 38(8):10389–97 3. Siripurapu A (2014) Convolutional networks for stock trading. Stanford Univ Dep Comput Sci 1–6 4. Banerjee D, Dutta S (2017) Predicting the housing price direction using machine learning techniques. In: 2017 IEEE international conference on power, control, signals and instrumentation engineering (ICPCSI), Chennai, pp 2998–3000 5. Mu J, Wu F, Zhang A (2014) Housing value forecasting based on machine learning methods. Abstr Appl Anal 2014:1–7. https://doi.org/10.1155/2014/648047 6. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
A 24 GHz Wide-Tuning-Range CMOS Digitally Controlled Oscillator for Automotive Radar Abrar Siddique, Tahesin Samira Delwar, Anindya Jana, and Jee Youl Ryu
Abstract This paper presents a 24-GHz CMOS oscillator with a digitally controlled wide tuning range for automotive radar applications. A current-reuse digitally controlled oscillator (DCO) topology is chosen to reduce the power consumption of the circuit. Furthermore, to minimize tank inductive losses, increase the transconductance, and improve the phase noise of the DCO, capacitive source-degenerated negative resistors are employed. The proposed DCO utilizes a capacitive-feedback technique to increase the output power and improve the phase noise level. The capacitive-feedback technique is implemented with a coarse tuning capacitive cell (CTC) and a fine tuning capacitive cell (FTC) in parallel with the n/pMOS source terminals to increase the oscillation range of the DCO. A specific tuning mechanism is applied to design a wide-tuning-range DCO. The proposed DCO is implemented in a 130 nm CMOS process. A wide tuning range of 23.8 GHz to 26.7 GHz with the CTC and of 23.8 GHz to 24.9 GHz with the FTC is achieved simultaneously. The proposed DCO shows a low phase noise of −104.4 dBc/Hz at 1 MHz offset and a low power consumption of 721.5 µW at a power supply of 0.9 V. Keywords Capacitive-feedback technique · 24 GHz · Current-reuse structure · Wide tuning range
A. Siddique (B) · T. S. Delwar · J. Y. Ryu Department of Information and Communications Engineering, Pukyong National University, Busan, South Korea e-mail: [email protected] T. S. Delwar e-mail: [email protected] A. Jana Department of Information and Communications Engineering J. B Institute of Engineering and Technology, Hyderabad, India © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_50
1 Introduction As CMOS technology scales down, radio frequency (RF) systems are increasingly implemented with digital architectural solutions rather than analog ones to reduce production cost. High-speed transistors are therefore available that allow the integration of RF electronics with digital processing. Intensive digital approaches facilitate a high level of integration for the implementation of Systems-on-Chip (SoCs) [1]. Road accidents are one of the world's serious concerns. Automotive radars can be used in advanced cruise control (ACC) systems to monitor the road, provide driver information, regulate the distance between two vehicles, and act on the vehicle's accelerator. Other important radar-based driver assistance functions include crash warning systems, blind spot monitoring, lane-change support, cross-traffic detection, back-up parking support, collision mitigation systems, and vulnerable road user detection. The main frequency bands of radar applications are 24 GHz and 77 GHz [2]. The medium-short-range, wide-beam 24 GHz band is the mainstream choice for detecting other nearby vehicles. Digitally controlled oscillators (DCOs) have recently become one of the main challenges of RF front-end modules. A study [3] proposed a conventional DCO developed for low-gigahertz oscillation. Among the diverse types of oscillators, LC oscillators have been widely used due to their improved phase noise at low supply voltage and their relaxed and reliable startup [4]. The local oscillator (LO) in battery-driven systems often absorbs a large share of the current in the front-end. There are thus two fundamental requirements in the design of a DCO, to minimize the bit error rate (BER) and improve battery life: low phase noise and low power dissipation. This paper is organized as follows. The problem statement and circuit methodology are described in Sect. 2. Section 3 presents the implemented DCO, the design parameters, and the simulation results, followed by the conclusion.
2 Proposed DCO Circuit Methodology 2.1 Problem Statement One architectural approach to replacing analog voltage-controlled oscillator (VCO) circuitry is the digitally controlled oscillator. The DCO generates a frequency that is controlled by a digital word instead of the analog tuning voltage of a VCO. A DCO for automotive radar was reported in [5]; that work was based on the conventional LC tank oscillator [6]. A potential tuning mechanism for maximizing the tuning range of DCOs is discussed in this work. To prove the concept, a 24-GHz DCO with a tuning range of 23.8 GHz to 26.7 GHz with the CTC and of 23.8 GHz to 24.9 GHz with the FTC is designed in a 130-nm CMOS process. To our knowledge from previous studies, a DCO has not been employed in CMOS to widen the tuning range for automotive radar.
Fig. 1 Schematic for the proposed DCO
2.2 DCO Design Aspects The DCO is designed with TSMC 130 nm RFIC models in order to verify the concept. Figure 1 shows the circuit schematic of the proposed DCO. In the proposed DCO, transistors M1 and M2 are configured in a current-reuse topology, and transistors M3 and M4 are capacitive source-degenerated transistors that reduce the tank inductor losses, increase the transconductance, and improve the phase noise of the DCO. A coarse tuning cell (CTC) and a fine tuning cell (FTC) are designed to increase and to digitally control the tuning range of the DCO [1]. The CTC is implemented with binary-weighted metal–oxide–metal (MoM) capacitors controlled by a 2-bit Coarse Word, while a 3-bit Fine Word is used to control the FTC. Figure 2a shows the switched capacitor cell for the CTC. Each cell consists of two identical binary-weighted MoM capacitors connected differentially in series with two pull-down CMOS switches. Figure 2b shows the unit varactor-based cell for the FTC. The capacitive-feedback technique, composed of the capacitor C1 and the FTC and CTC blocks, is used to boost the output swing of the LC DCO.
Fig. 2 Unit capacitance cells. a CTC unit cell. b FTC unit cell
3 Research Results and Discussion The Keysight Advanced Design System environment is used for simulation using 130-nm TSMC models; the device sizes of the designed DCO, FTC, and CTC cells are shown in Table 1. Figures 3 and 4 show the frequency tuning range of the proposed DCO. Figure 3 shows the simulated fine tuning frequency controlled by the 2-bit FTC, which extends from 23.8 GHz to 24.9 GHz, while Fig. 4 shows the simulated coarse tuning frequency controlled by the 3-bit CTC, which extends from 23.8 GHz to 26.7 GHz. The differential output voltage waveforms of the DCO are shown in Fig. 5; the capacitive-feedback technique helps to increase the output voltage levels.
Table 1 Device size of the designed DCO, FTC, and CTC cells
Device  | Size
M1      | 30 µm/130 nm
M2      | 60 µm/130 nm
M3, M4  | 32 µm/130 nm, 1.5 µm/130 nm
M5, M6  | 32 µm/130 nm, 1.5 µm/130 nm
M7      | 32 µm/130 nm
M8, M9  | 32 µm/130 nm, 1 µm/130 nm
C1      | W/finger = 98 nm, array = 8 × 8, multiply = 180
L1      | 1 turn; area: width = 186 µm, length = 171 µm
R1      | 186
Fig. 3 Output frequency of DCO with 2-bit FTC
Fig. 4 Output frequency of DCO with 3-bit CTC
The phase noise of the designed DCO is shown in Fig. 6. The designed DCO shows a phase noise of −104.4 dBc/Hz at 1 MHz offset.
4 Conclusions and Future Scope A tuning mechanism to widen the tuning range of a 24 GHz DCO for radar applications is presented in this paper. A wide tuning range of 23.8 GHz to 26.7 GHz with the CTC and of 23.8 GHz to 24.9 GHz with the FTC is achieved simultaneously. The total circuit, with the main DCO and two buffers, operates from a low 0.9 V supply voltage and consumes 721.5 µW. The phase noise is −104.4 dBc/Hz at 1 MHz offset.
Fig. 5 Differential output voltage waveform
Fig. 6 Phase noise performance of designed DCO
There is an improvement in power consumption, tuning range, and phase noise, which makes the circuit a good candidate for 24 GHz radar applications. As automotive radars operate in both the 24 GHz and 77 GHz frequency bands, future work can address a dual-band 24 GHz/77 GHz DCO for translation between these bands. Acknowledgements This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2018R1D1A1B09043286).
References 1. Taha IY, Mirhassani M (2016) A 24 GHz digitally controlled oscillator for automotive radar in 65 nm cmos. In: 2016 IEEE international symposium on circuits and systems (ISCAS), IEEE, pp 2767–2770 2. Rastegar H, Zare S, Ryu J-Y (2018) A low-voltage low-power capacitive-feedback voltage controlled oscillator. Integration 60:257–262 3. Wu W, Long JR, Staszewski RB (2013) High-resolution millimeter-wave digitally controlled oscillators with reconfigurable passive resonators. IEEE J Solid-State Circ 48(11):2785–2794 4. Verma S (2005) A 17-mW 0.66-mm direct- conversion receiver for 1-Mb/s cable replacement. IEEE J Solid-State Circ 40(12):2547–2554 5. Wu W, Staszewski RB, Long JR (2014) A 56.4-to-63.4 GHz multi-rate all-digital fractional-N PLL for FMCW radar applications in 65 nm CMOS. IEEE J Solid-State Circ 49(5):1081–1096 6. Siddique A, Delwar TS, Kurbanov M, Ryu J-Y (2020) Low-power low-phase noise VCO for 24 GHz applications. Microelectronics J 104720
Output Feedback Adaptive Reinforcement Learning Tracking Control for Wheel Inverted Pendulum Systems Anh Duc Hoang, Hong Quang Nguyen, Phuong Nam Dao, and Tien Hoang Nguyen Abstract This paper presents an output feedback based online adaptive reinforcement learning control for a wheeled inverted pendulum (WIP) system based on the linearized system model, which belongs to a class of continuous-time systems under the influence of external disturbance and/or dynamic uncertainties. Both the Policy Iteration (PI) and Value Iteration (VI) techniques are considered in the proposed algorithm, based on the equivalent discrete-time system and the application of the Kronecker product for the data collection. Finally, the theoretical analysis and simulation results demonstrate the effectiveness of the proposed control structure. Keywords Reinforcement learning · Approximate/adaptive dynamic programming (ADP) · Wheeled inverted pendulum · Output feedback control
1 Introduction In recent years, the control problems of wheeled inverted pendulum (WIP) systems have received much attention in many fields, such as control engineering and industrial automation, as represented in [1–4]. A WIP system is an under-actuated system described as an inverted pendulum mounted on a mobile platform subject to nonholonomic constraints. Because the number of control inputs is less than the number of controlled degrees of freedom, it is hard to employ classical nonlinear control for WIP systems. In fact, the mobile platform cannot be controlled separately from the tilt angle dynamics of the inverted pendulum. The presence of unstable balance and uncertainties in WIP systems leads to difficulties in implementing the control system.
Supported by Research Foundation funded by Thai Nguyen University of Technology.
A. D. Hoang · P. N. Dao · T. H. Nguyen Hanoi University of Science and Technology, Hanoi, Vietnam
H. Q. Nguyen (B) Thainguyen University of Technology, Thai Nguyen, Vietnam e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_51
The authors in [5, 6] established the sliding mode control (SMC) technique to obtain convergence of the error after two stages with an appropriate sliding surface. Neural network based nonlinear control was also considered in [7] by approximating the uncertain terms in the WIP model. The authors in [2, 8] pointed out that optimal control algorithms can be considered an appropriate solution under state variable constraints. However, solving the explicit solution of the Riccati equation in linear systems, and of the partial differential HJB (Hamilton-Jacobi-Bellman) equation in the general case, is a necessary step in optimal control, and these equations are hard to solve [9]. This work has been extended to output feedback based adaptive dynamic programming algorithms by adding an equivalent observer [8, 10–12], with the approach of using the Kronecker product in the case of linear systems. Inspired by the above works within the frame of these two control directions, the main contributions of this paper are as follows: (1) In comparison with the previous papers [1, 2, 4–6], an output feedback based online adaptive reinforcement learning control design is proposed for a disturbed WIP system. (2) A strict proof and offline simulations are given to validate the proposed algorithm. The remainder of this paper is organized as follows. The mathematical model of the WIP system and the control objective are given in Sect. 2. The two proposed algorithms and the theoretical analysis are presented in Sect. 3. Subsequently, the simulation results are described in Sect. 4. Finally, the conclusions are pointed out in Sect. 5. Notation: Throughout this paper, ⊗ represents the Kronecker product. For an arbitrary constant matrix A ∈ R^{n×m}, vec(A) = [a_1^T, a_2^T, …, a_m^T]^T, where a_i ∈ R^n are the columns of A. For a symmetric matrix P ∈ R^{m×m}, vecs(P) = [p_{11}, 2p_{12}, …, 2p_{1m}, p_{22}, 2p_{23}, …, 2p_{m−1,m}, p_{mm}]^T ∈ R^{m(m+1)/2}.
2 Problem Formulation Figure 1 illustrates the basic diagram of a wheel inverted pendulum. The Euler-Lagrange equations that establish the dynamics of the system can be found in [4]. In this paper, we concentrate on the state space model of the inverted pendulum, as follows.
Fig. 1 The Sketch of the Wheel Inverted Pendulum (WIP)
The dynamic model of the WIP can be written in state-space form as

\dot{x} = Ax + Bu, \qquad y = Cx, \qquad (1)

with

A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \zeta_1 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & \zeta_2 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 0 \\ \zeta_3 & \zeta_3 \\ 0 & 0 \\ \zeta_4 & -\zeta_4 \\ 0 & 0 \\ \zeta_5 & \zeta_5 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix},

where x = [x, \dot{x}, \psi, \dot{\psi}, \varphi, \dot{\varphi}]^T, u = [u_1, u_2]^T, and

\zeta_1 = -\frac{M_b^2 d^2 g R^2}{(M_b d^2 + I_x)(M_b R^2 + 2M_w R^2 + 2I_a) - (M_b d R)^2}

\zeta_2 = \frac{(M_b R^2 + 2M_w R^2 + 2I_a) M_b g d}{[(M_b + 2M_w) R^2 + 2I_a] I_x + 2 M_b d^2 (M_w R^2 + I_a)}

\zeta_3 = \frac{R (M_b d^2 + I_x + M_b d R)}{(M_b d^2 + I_x)(M_b R^2 + 2M_w R^2 + 2I_a) - (M_b d R)^2}

\zeta_4 = \frac{L}{2 M_w R^2 + \frac{2 I_a}{R^2} L^2 + I_z}

\zeta_5 = -\frac{M_b R^2 + 2M_w R^2 + 2I_a + M_b d R}{[(M_b + 2M_w) R^2 + 2I_a] I_x + 2 M_b d^2 (M_w R^2 + I_a)}

where x, ψ, φ are the inverted pendulum position, rotation angle, and tilt angle, respectively, and ẋ, ψ̇, φ̇ are the forward velocity and the angular velocities. The control objective is to find the control inputs u = [u_1, u_2]^T that ensure the wheel displacement, the rotation angle, and the tilt angle converge to the origin without requiring knowledge of the system. Without loss of generality, this point-to-point control objective can be implemented as the stabilization problem around the operating point.
3 Problem Statement Consider a continuous-time linear system described as

\dot{x} = Ax + Bu, \qquad y = Cx \qquad (2)

where x ∈ R^n, u ∈ R^m, y ∈ R^r are the state, input, and output vectors, respectively, and A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{r×n} are the system matrices, which are completely unknown. In the case of the wheel inverted pendulum, we have n = 6, m = 2, r = 3. Define a quadratic cost function for system (2):

J(x_0) = \int_0^{\infty} \left[ y^T(\tau) Q y(\tau) + u^T(\tau) R u(\tau) \right] d\tau \qquad (3)

where Q = Q^T \geq 0, R = R^T > 0, and (A, \sqrt{Q}\,C) is observable. The minimum cost J^* can be obtained by implementing the control policy

u = -R^{-1} B^T P^* x = -K^* x \qquad (4)

where P^* = (P^*)^T > 0 is the solution to the algebraic Riccati equation

A^T P^* + P^* A + C^T Q C - P^* B R^{-1} B^T P^* = 0. \qquad (5)
By periodic sampling, we derive a discretised model of (2):

x_{k+1} = A_d x_k + B_d u_k, \qquad y_k = C x_k \qquad (6)

where x_k, u_k are the state and the input at the sampling instants, A_d = e^{Ah}, B_d = \int_0^h e^{A\tau} d\tau \, B, and h > 0 is the sampling period. Suppose the sampling frequency ω_h = 2π/h is non-pathological. Then both (A_d, C) and (A_d, \sqrt{Q}\,C) are observable and (A_d, B_d) is controllable. The discretised cost function for (6) is

J_d(x_0) = \sum_{j=0}^{\infty} \left( y_j^T Q y_j + u_j^T R u_j \right). \qquad (7)

The optimal control law minimizing (7) is

u_k = -\left( R + B_d^T P_d^* B_d \right)^{-1} B_d^T P_d^* A_d x_k \equiv -K_d^* x_k \qquad (8)

where P_d^* = (P_d^*)^T > 0 is the unique solution to

A_d^T P_d^* A_d - P_d^* + C^T Q C - A_d^T P_d^* B_d \left( R + B_d^T P_d^* B_d \right)^{-1} B_d^T P_d^* A_d = 0. \qquad (9)

Equation (9) is nonlinear in P_d, so it is very difficult to solve analytically. Moreover, knowledge of the system dynamics A_d, B_d, C and measurement of all state variables are not available in practical applications. For these reasons, we develop an adaptive control algorithm for the discretised system with output feedback in the next section.
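For reference, when the model is known, the discretisation (6) and the Riccati equation (9) can be solved numerically. The following is a minimal sketch of that model-based baseline (illustrative only; it is not the paper's data-driven method, and it assumes A, B, C, Q, R are available):

```python
# Hypothetical model-based baseline: discretise (A, B) and solve the DARE (9).
import numpy as np
from scipy.linalg import expm, solve_discrete_are

def discretise(A, B, h):
    """Exact zero-order-hold discretisation via the augmented matrix exponential."""
    n, m = A.shape[0], B.shape[1]
    M = np.zeros((n + m, n + m))
    M[:n, :n], M[:n, n:] = A, B
    Md = expm(M * h)
    return Md[:n, :n], Md[:n, n:]                 # Ad, Bd

def lqr_output(Ad, Bd, C, Q, R):
    Pd = solve_discrete_are(Ad, Bd, C.T @ Q @ C, R)            # solves Eq. (9)
    Kd = np.linalg.solve(R + Bd.T @ Pd @ Bd, Bd.T @ Pd @ Ad)   # gain of Eq. (8)
    return Pd, Kd
```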
4 Adaptive Optimal Output Feedback Controller In this section, we propose two algorithms, PI and VI, with an output feedback feature.
4.1 State Reconstruction Similar to [8, 11], the state x_k and the output sequence ȳ_{k−1,k−N} can be reconstructed as follows:
x_k = A_d^N x_{k-N} + V(N) \bar{u}_{k-1,k-N}, \qquad \bar{y}_{k-1,k-N} = U(N) x_{k-N} + T(N) \bar{u}_{k-1,k-N} \qquad (10)

where

\bar{u}_{k-1,k-N} = [u_{k-1}^T, u_{k-2}^T, \ldots, u_{k-N}^T]^T, \quad \bar{y}_{k-1,k-N} = [y_{k-1}^T, y_{k-2}^T, \ldots, y_{k-N}^T]^T,
V(N) = [B_d, A_d B_d, \ldots, A_d^{N-1} B_d], \quad U(N) = [(C A_d^{N-1})^T, \ldots, (C A_d)^T, C^T]^T,

T(N) = \begin{bmatrix} 0 & C B_d & C A_d B_d & \cdots & C A_d^{N-2} B_d \\ 0 & 0 & C B_d & \cdots & C A_d^{N-3} B_d \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & C B_d \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.

Here, N is the observability index. The condition for the matrix U(N) to have full column rank is N r \geq n [11]. There then exists a left inverse of U(N), defined as U^+(N) = \left( U^T(N) U(N) \right)^{-1} U^T(N).

Lemma 1 [11] Let (A_d, C) be observable. If ω_h is non-pathological, x_k is derived uniquely in terms of the measured input/output sequences by

x_k = M_y \bar{y}_{k-1,k-N} + M_u \bar{u}_{k-1,k-N} \equiv \Theta z_k \qquad (11)

where M_y = A_d^N U^+(N), M_u = V(N) - M_y T(N), z_k = [\bar{u}_{k-1,k-N}^T, \bar{y}_{k-1,k-N}^T]^T \in R^q, and q = N(m + r).
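The construction of (10)-(11) can be sketched directly from a model; the following is an assumed illustration (in the paper itself these matrices are never formed explicitly, since the algorithms are data-driven):

```python
# Hypothetical sketch of building My, Mu of Eq. (11) from Ad, Bd, C.
import numpy as np

def reconstruction_matrices(Ad, Bd, C, N):
    m, r = Bd.shape[1], C.shape[0]
    V = np.hstack([np.linalg.matrix_power(Ad, i) @ Bd for i in range(N)])
    U = np.vstack([C @ np.linalg.matrix_power(Ad, N - 1 - i) for i in range(N)])
    T = np.zeros((N * r, N * m))
    for i in range(N):
        for j in range(i + 1, N):            # block Toeplitz structure of T(N)
            T[i*r:(i+1)*r, j*m:(j+1)*m] = (
                C @ np.linalg.matrix_power(Ad, j - i - 1) @ Bd)
    U_plus = np.linalg.pinv(U)               # left inverse of U(N)
    My = np.linalg.matrix_power(Ad, N) @ U_plus
    Mu = V - My @ T
    return My, Mu                            # x_k = My @ ybar + Mu @ ubar
```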
4.2 PI Algorithm The classical PI algorithm for finding the optimal control policy K is described as the iteration

\left( A_d - B_d K_j \right)^T P_j \left( A_d - B_d K_j \right) - P_j + C^T Q_d C + K_j^T R_d K_j = 0 \qquad (12)

K_{j+1} = \left( R_d + B_d^T P_j B_d \right)^{-1} B_d^T P_j A_d \qquad (13)

with K_0 \in R^{m \times n} stabilizing, P_j = P_j^T > 0, and j = 0, 1, 2, \ldots
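When the model is known, one PI step consists of solving the Lyapunov equation (12) for P_j and improving the gain by (13). A minimal sketch of that classical iteration is given below (illustrative only; the data-driven output feedback version that follows removes the need for A_d, B_d):

```python
# Hypothetical sketch of the classical (model-based) PI iteration (12)-(13).
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def pi_iteration(Ad, Bd, C, Qd, Rd, K0, iters=20):
    K = K0                                       # K0 must be stabilizing
    for _ in range(iters):
        Aj = Ad - Bd @ K
        # Eq. (12): Aj^T P Aj - P + C^T Qd C + K^T Rd K = 0
        P = solve_discrete_lyapunov(Aj.T, C.T @ Qd @ C + K.T @ Rd @ K)
        # Eq. (13): policy improvement
        K = np.linalg.solve(Rd + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
    return P, K
```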
Defining A_j = A_d - B_d K_j, the discretised system (6) can be rewritten as

x_{k+1} = A_j x_k + B_d \left( K_j x_k + u_k \right). \qquad (14)

Denote \bar{K}_j = K_j \Theta and \bar{P}_j = \Theta^T P_j \Theta. Then from (12) we obtain

z_{k+1}^T \bar{P}_j z_{k+1} - z_k^T \bar{P}_j z_k = x_{k+1}^T P_j x_{k+1} - x_k^T P_j x_k
= \left( u_k^T \otimes u_k^T - z_k^T \otimes z_k^T \left( \bar{K}_j^T \otimes \bar{K}_j^T \right) \right) \mathrm{vec}(\bar{H}_{j1})
+ 2 \left( z_k^T \otimes z_k^T \left( I_q \otimes \bar{K}_j^T \right) + z_k^T \otimes u_k^T \right) \mathrm{vec}(\bar{H}_{j2}) - \left( y_k^T Q y_k + x_k^T K_j^T R K_j x_k \right) \qquad (15)

where \bar{H}_{j1} = B_d^T P_j B_d and \bar{H}_{j2} = B_d^T P_j A_d \Theta. For a sufficiently large positive integer s and j = 0, 1, 2, \ldots, we define

\phi_k^1 = [\, u_{k_{j,0}}^T \otimes u_{k_{j,0}}^T - z_{k_{j,0}}^T \otimes z_{k_{j,0}}^T (\bar{K}_j^T \otimes \bar{K}_j^T), \ \ldots, \ u_{k_{j,s}}^T \otimes u_{k_{j,s}}^T - z_{k_{j,s}}^T \otimes z_{k_{j,s}}^T (\bar{K}_j^T \otimes \bar{K}_j^T) \,]^T
\phi_k^2 = [\, z_{k_{j,0}}^T \otimes z_{k_{j,0}}^T (I_q \otimes \bar{K}_j^T) + z_{k_{j,0}}^T \otimes u_{k_{j,0}}^T, \ \ldots, \ z_{k_{j,s}}^T \otimes z_{k_{j,s}}^T (I_q \otimes \bar{K}_j^T) + z_{k_{j,s}}^T \otimes u_{k_{j,s}}^T \,]^T
\phi_k^3 = [\, z_{k_{j,0}}^T \otimes z_{k_{j,0}}^T - z_{k_{j,1}}^T \otimes z_{k_{j,1}}^T, \ \ldots, \ z_{k_{j,s-1}}^T \otimes z_{k_{j,s-1}}^T - z_{k_{j,s}}^T \otimes z_{k_{j,s}}^T \,]^T
\Psi = [\, y_{k_{j,0}}^T Q_d y_{k_{j,0}} + z_{k_{j,0}}^T \bar{K}_j^T R_d \bar{K}_j z_{k_{j,0}}, \ \ldots, \ y_{k_{j,s}}^T Q_d y_{k_{j,s}} + z_{k_{j,s}}^T \bar{K}_j^T R_d \bar{K}_j z_{k_{j,s}} \,]^T.

For any matrix \bar{K}_j that stabilizes the system, we can rephrase (15) as

\Phi_j^P \left[ \mathrm{vec}(\bar{H}_{j1})^T, \ \mathrm{vec}(\bar{H}_{j2})^T, \ \mathrm{vec}(\bar{P}_j)^T \right]^T = \Psi. \qquad (16)
Lemma 2 [10] If there exists a positive integer s^* such that for all s > s^*,

\mathrm{rank}(\Lambda) = \frac{m(m+1)}{2} + \frac{q(q+1)}{2} + qm \qquad (17)
where \Lambda = [\, \upsilon_{k_0} \otimes \upsilon_{k_0}, \ \upsilon_{k_1} \otimes \upsilon_{k_1}, \ \ldots, \ \upsilon_{k_s} \otimes \upsilon_{k_s} \,]^T and \upsilon_k = [u_k^T, z_k^T]^T, then the three matrices \bar{H}_{j1}, \bar{H}_{j2}, \bar{P}_j in (15) can be uniquely solved (by the least squares method) using Eq. (16). Then \bar{K}_{j+1} can be obtained from (13) as

\bar{K}_{j+1} = \left( R + \bar{H}_{j1} \right)^{-1} \bar{H}_{j2}. \qquad (18)
Algorithm 1: PI Output Feedback ADP
  Select a sufficiently small constant ρ > 0 and a stabilizing control matrix K̄_0
  Employ an initial stabilizing control law u_k on the time interval [0, k_{0,0}]; j ← 0
  while ||P̄_j − P̄_{j−1}|| > ρ do
    Employ u_k = −K̄_j z_k + e_k on [k_{j,0}, k_{j,s}], where e_k is a suitable exploration noise
    Solve P̄_j, K̄_{j+1} from (16) and (18)
    j ← j + 1
  end while
Remark 1 By solving (16), there is no need to know the system matrices A_d, B_d, C in the control policy. Moreover, only the measurement of input/output data at the sampling instants is required; the requirement of state knowledge is completely eliminated. Theorem 1 If the rank condition in Lemma 2 is satisfied, then after applying Algorithm 1, lim_{j→∞} P̄_j = P̄_d^* and lim_{j→∞} K̄_j = K̄_d^*.
4.3 VI Algorithm Similar to the PI algorithm, the classic VI algorithm is modified as

P_{j+1} = A_d^T P_j A_d + C^T Q_d C - A_d^T P_j B_d \left( R_d + B_d^T P_j B_d \right)^{-1} B_d^T P_j A_d \qquad (19)

K_{j+1} = \left( R_d + B_d^T P_j B_d \right)^{-1} B_d^T P_{j+1} A_d \qquad (20)

with K_0 \in R^{m \times n}, P_j = P_j^T > 0, and j = 0, 1, 2, \ldots
We denote the new terms

H_j = \begin{bmatrix} H_j^{11} & H_j^{12} \\ (H_j^{12})^T & H_j^{22} \end{bmatrix} = \begin{bmatrix} B_d^T P_j B_d & B_d^T P_j A_d \\ A_d^T P_j B_d & A_d^T P_j A_d \end{bmatrix} \qquad (21)

\bar{H}_j = \begin{bmatrix} \bar{H}_j^{11} & \bar{H}_j^{12} \\ (\bar{H}_j^{12})^T & \bar{H}_j^{22} \end{bmatrix} = \begin{bmatrix} B_d^T P_j B_d & B_d^T P_j A_d \Theta \\ \Theta^T A_d^T P_j B_d & \Theta^T A_d^T P_j A_d \Theta \end{bmatrix}. \qquad (22)

From (19), we have

y_{k+1}^T Q_d y_{k+1} = -x_{k+1}^T \Upsilon(P_j) x_{k+1} + x_{k+1}^T P_{j+1} x_{k+1} = -\phi_{k+1}^j + \psi_k^T \mathrm{vecs}(\bar{H}_{j+1}) \qquad (23)

where

\Upsilon(P_j) = A_d^T P_j A_d - A_d^T P_j B_d \left( R + B_d^T P_j B_d \right)^{-1} B_d^T P_j A_d,
\phi_{k+1}^j = z_{k+1}^T \left[ \bar{H}_j^{22} - (\bar{H}_j^{12})^T \left( R + \bar{H}_j^{11} \right)^{-1} \bar{H}_j^{12} \right] z_{k+1},
\psi_k = \mathrm{vecv}\left( [u_k^T, z_k^T]^T \right).

Equation (23) can be expressed as

\Psi_j^V \mathrm{vecs}(\bar{H}_{j+1}) = \Theta_j^V \qquad (24)

where \Psi_j^V = [\psi_{k_{j,0}}, \psi_{k_{j,1}}, \ldots, \psi_{k_{j,s}}]^T and \Theta_j^V = [\, y_{k_{j,1}}^T Q_d y_{k_{j,1}} + \phi_{k,1}^j, \ \ldots, \ y_{k_{j,s+1}}^T Q_d y_{k_{j,s+1}} + \phi_{k,s+1}^j \,]^T.

Theorem 2 If Eq. (17) is satisfied, then lim_{j→∞} \bar{H}_j = \bar{H}_d^* and lim_{j→∞} \bar{K}_j = \bar{K}_d^*.

Proof The proofs of Theorem 1 and Theorem 2 can be obtained from [10].

Theorem 3 (Based on [12]) The control policy obtained from Algorithms 1 and 2 asymptotically stabilizes the continuous system (2).
Algorithm 2: VI Output Feedback ADP
  Select a sufficiently small constant ρ > 0
  Employ an arbitrary initial control law u_k on the time interval [0, k_{0,0}]; j ← 0; H̄_j ← 0, K̄_j ← 0
  while ||H̄_j − H̄_{j−1}|| > ρ do
    Employ u_k = −K̄_j z_k + e_k on [k_{j,0}, k_{j,s}]
    Solve H̄_{j+1} from (24)
    K̄_{j+1} ← (R + H̄_{j+1}^{11})^{−1} H̄_{j+1}^{12}
    j ← j + 1
  end while
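For intuition, the model-based recursion (19)-(20) that the data-driven Algorithm 2 emulates is a simple fixed-point iteration. A minimal sketch follows (illustrative only; the matrices are assumed known here):

```python
# Hypothetical sketch of the classical (model-based) VI recursion (19)-(20).
import numpy as np

def vi_iteration(Ad, Bd, C, Qd, Rd, iters=200):
    n = Ad.shape[0]
    P = np.zeros((n, n))                    # VI can start from an arbitrary P >= 0
    K = np.zeros((Bd.shape[1], n))
    for _ in range(iters):
        G = np.linalg.solve(Rd + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        P_next = Ad.T @ P @ Ad + C.T @ Qd @ C - Ad.T @ P @ Bd @ G    # Eq. (19)
        K = np.linalg.solve(Rd + Bd.T @ P @ Bd, Bd.T @ P_next @ Ad)  # Eq. (20)
        P = P_next
    return P, K
```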
5 Simulation Results In this section, we implement offline simulations of (1) with the parameter values listed in Table 1. We thus obtain the explicit system matrices

A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -3.7706 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 68.9659 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 0 \\ 0.599 & 0.599 \\ 0 & 0 \\ 1.0812 & -1.0812 \\ 0 & 0 \\ -5.776 & -5.776 \end{bmatrix}.
Choose the sampling period h = 0.1, the observability index N = 2 and the weighting matrices as
Table 1 The parameters and variables of the wheeled inverted pendulum

Parameter                                        | Symbol | Value
Mass of the main body                            | Mb     | 13.3 kg
Mass of each wheel                               | Mw     | 1.89 kg
Center of mass from base                         | d      | 0.13 m
Diameter of wheels                               | R      | 0.13 m
Distance between the wheels                      | L      | 0.325 m
Moment of body inertia with respect to x-axis    | Ix     | 0.1935 kg m²
Moment of body inertia with respect to z-axis    | Iz     | 0.3379 kg m²
Moment of inertia of wheel around the centre     | Ia     | 0.1229 kg m²
Acceleration due to gravity                      | g      | 9.81 m s⁻²
Q = \begin{bmatrix} 10 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{bmatrix}, \qquad R = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
The offline simulation results are shown in Figs. 2, 3 and 4, which illustrate the convergence of the actor and critic parts, the tracking behaviour of the WIP, and the control performance of the PI and VI techniques.
Fig. 2 Convergence of matrices K and P in the PI algorithm (||P_k − P*|| and ||K_k − K*|| versus the number of iterations)
Fig. 3 Convergence of matrices K and H in the VI algorithm (||H_k − H*|| and ||K_k − K*|| versus the number of iterations)
Fig. 4 The output of the system (x, theta, phi versus time)
6 Conclusion This paper proposed the application of an online adaptive output feedback optimal control algorithm to the tracking control problem of an uncertain WIP system, which does not require any information about the system dynamics. The theoretical analysis and simulation results showed the convergence of the actor and critic parts and the tracking effectiveness of the proposed algorithm. Future work on this reinforcement learning technique encompasses experimental validation. Acknowledgements This research was supported by the Research Foundation funded by Thai Nguyen University of Technology.
References 1. Do KD, Seet G (2010) Motion control of a two-wheeled mobile vehicle with an inverted pendulum. J Intell Robot Syst 60(3–4):577–605. https://doi.org/10.1007/s10846-010-9432-9 2. Li Z, Zhang Y (2010) Robust adaptive motion/force control for wheeled inverted pendulums. Automatica 46(8):1346–1353. https://doi.org/10.1016/j.automatica.2010.05.015 3. Li Z, Zhang Y, Yang Y (2010) Support vector machine optimal control for mobile wheeled inverted pendulums with unmodelled dynamics. Neurocomputing 73(13–15):2773–2782. https://doi.org/10.1016/j.neucom.2010.04.009 4. Long NT, Van Huong N, Nam DP, Sinh MX, Ha NT (2018) A simple approximate dynamic programing based on integral sliding mode control for unknown linear systems with input disturbance. In: 2018 international conference on system science and engineering, ICSSE 2018. IEEE, New York, pp 1–6. https://doi.org/10.1109/ICSSE.2018.8520064
5. Guo ZQ, Xu JX, Lee TH (2014) Design and implementation of a new sliding mode controller on an underactuated wheeled inverted pendulum. J Franklin Inst 351(4):2261–2282. https:// doi.org/10.1016/j.jfranklin.2013.02.002 6. Huang J, Guan ZH, Matsuno T, Fukuda T, Sekiyama K (2010) Sliding-mode velocity control of mobile-wheeled inverted-pendulum systems. IEEE Trans Robot 26(4):750–758. https://doi. org/10.1109/TRO.2010.2053732 7. Yang C, Li Z, Cui R, Xu B (2014) Neural network-based motion control of an underactuated wheeled inverted pendulum model. IEEE Trans. Neural Networks Learn. Syst. 25(11):2004– 2016. https://doi.org/10.1109/TNNLS.2014.2302475 8. Huang M, Jiang ZP, Chai T, Gao W (2016) Sampled-data-based adaptive optimal outputfeedback control of a 2-degree-of-freedom helicopter. IET Control Theory Appl 10(12):1440– 1447. https://doi.org/10.1049/iet-cta.2015.0977 9. Jiang Y, Jiang ZP (2012) Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica 48(10):2699–2704. https://doi.org/ 10.1016/j.automatica.2012.06.096 10. Gao W, Jiang Y, Jiang ZP, Chai T (2016) Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming. Automatica 72:37–45. https://doi.org/10.1016/j.automatica.2016.05.008 11. Lewis FL, Vamvoudakis KG (2011) Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data. IEEE Trans Syst Man, Cybern Part B 41(1):14–25. https://doi.org/10.1109/TSMCB.2010.2043839 12. Gao W, Jiang Y, Jiang ZP, Chai T (2014) Adaptive and optimal output feedback control of linear systems: An adaptive dynamic programming approach. In: Proceeding 11th World Congress on Intelligent Control and Automation, vol 2015-March. IEEE, New York, pp. 2085–2090. https://doi.org/10.1109/WCICA.2014.7053043
Facial Expression Recognition with CNN-LSTM Bui Thanh Hung and Le Minh Tien
Abstract Facial expression is one of the most important means of expressing human emotions in social communication. Automatic facial expression recognition has become a "favorite" topic in the research area. In this paper, we propose an approach that uses CNN-LSTM to learn facial expressions. By combining the superior features of our own convolutional neural network and long short-term memory models, our proposed model achieves better results than the others. The experiments are conducted on the JAFFE database, and accuracy and confusion matrix scores are used to evaluate our model. Keywords Facial expression recognition · CNN-LSTM · Convolutional neural network · Long short-term memory
1 Introduction According to experts, nonverbal communication accounts for two-thirds of effective communication, which leaves verbal communication with a portion of only one-third. Speaking of nonverbal communication, facial expressions play an important role in delivering nonverbal messages in modern human communication, since they help us to interpret most of the hidden meaning of spoken words. In other words, human faces can convey thousands of emotions, such as happiness, sadness, fear, anger, surprise, disgust, etc. [1, 2]. Nowadays, it can be observed that automatic facial expression recognition has become a "favorite" topic in the research area. It relates not only to computer vision and machine learning but also to the behavioral sciences. Therefore, applications of automatic facial expression recognition can be used in diverse sectors
B. T. Hung (B) · L. M. Tien Data Analytics and Artificial Intelligence Laboratory, Engineering—Technology School, Thu Dau Mot University, 6 Tran van On Street, Phu Hoa District, Thu Dau Mot, Binh Duong, Vietnam e-mail: [email protected]
L. M. Tien e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_52
including but not limited to security, human–computer interaction, driver safety, healthcare, and education. Most databases containing facial emotions use the same key classification of human emotions as originally presented by Ekman et al. [3]. They introduced their seminal study about facial expressions and defined basic emotions based on a cross-cultural study, which indicated that humans express certain basic emotions in the same way throughout the world. They labeled them by their corresponding emotional states, which are happiness, sadness, surprise, fear, disgust, anger, and contempt. This primary emotions hypothesis has been extensively exploited in cognitive computing due to its simplicity and universality. In this paper, we present a method to recognize facial expressions using a CNN-LSTM approach. We use a convolutional neural network to extract features and recognize facial expressions with a long short-term memory network. We conduct experiments on the JAFFE database and evaluate the model by accuracy and confusion matrix scores. The remainder of this paper is organized as follows: Section 2 introduces related work. Section 3 then describes in detail how to recognize facial expressions using the CNN-LSTM approach, and Section 4 shows the experiments as well as a discussion of the results. Finally, Section 5 summarizes our work, as well as future directions.
2 Related Works Facial emotion recognition has drawn increasing attention from researchers in diverse disciplines as an active research topic. The current key challenge of facial emotion recognition comes from the ambiguity and errors that arise when mapping emotional states to the factors that can be used to detect them. There have been several expression recognition approaches, and a lot of progress has been made in this research area recently [4, 5]. Many researchers proposed the local binary pattern (LBP) for the transformation of an image into an arrangement of micro-patterns [6, 7]. Ahonen et al. used binary patterns for feature extraction; they compared and combined different machine learning techniques, such as template matching, support vector machines, linear discriminant analysis, and linear programming, to recognize facial expressions [6]. Other research used histogram of oriented gradients (HoG) features for facial expression recognition. Dahmane et al. used a set of feature extractors like LBP, PCA, and HoG with an SVM to classify static face images as one of the six expressions [8]. Convolutional neural networks (CNN) were first used by Lecun et al. [9]; this model has advantages over the others because the model's input is a raw image rather than hand-coded features. A CNN has the ability to learn the set of features that best models the desired classification. In general, this model has alternating types of layers: convolutional layers and sub-sampling layers. Convolutional neural networks have taken the computer vision community by storm and significantly improved the state of the art in many applications. There are many works that used convolutional neural networks for emotion recognition [4].
Lopes et al. [10] created a five-layer CNN for classifying six different classes of emotions in the Cohn-Kanade (CK+) database. Hamester et al. [11] proposed a two-channel CNN where the upper channel used convolutional filters, while the lower one used Gabor-like filters in the first layer. Xie and Hu [12] used convolutional modules with a different type of CNN structure. This module selects the best set of features for the next layer, reducing the redundancy of identically learnt features and considering similar information between filters of the same layer. Donahue et al. proposed long-term recurrent convolutional networks for visual recognition and description [13]. In our research, we introduce a new method using a hybrid deep learning approach for facial expression recognition. We combine the superior features of our own convolutional neural network ("CNN", used to extract features) and long short-term memory ("LSTM") models (used to recognize facial expressions).
3 Methodology 3.1 The Proposed Model The proposed model is described in Fig. 1. It consists of two main parts: the hybrid deep learning method (CNN-LSTM) and facial expression recognition. The model combines an LSTM with a deep hierarchical visual feature extractor, the CNN model. Therefore, this hybrid model can learn to recognize and synthesize temporal dynamics for tasks involving sequential images. Each visual feature determined through the CNN is passed to the corresponding LSTM and produces a fixed- or variable-length vector representation. The outputs are then passed into a recurrent sequence-learning module. Finally, the predicted distribution is computed by applying softmax. We describe each part in detail below.
Fig. 1 The proposed model (Data → Hybrid Deep Learning Approach CNN-LSTM → Facial Expression Recognition)
3.2 CNN-LSTM Approach Convolutional Neural Network (CNN) In a convolutional neural network model, the input image is convolved with a filter collection in the convolution layers to produce a feature map. Each feature map is then combined into fully connected networks, and the facial expression is recognized as belonging to a particular class based on the output of the softmax algorithm. Figure 2 shows the procedure used by the CNN (the pictures are extracted from the JAFFE database [14]). We decided to create a CNN of our own and trained it based on the results of previous publications. We created a six-layer CNN with two convolutional layers, two pooling layers, and two fully connected layers. The structure of the CNN used in our model is shown in Fig. 3. Long short-term memory (LSTM) A recurrent neural network (RNN) is a type of advanced artificial neural network. RNNs are known for their capability of using their "memory" to process sequences of data. RNNs have shown incredible success in a huge number of applications because they can remember previous inputs in their internal state, make use of sequential information, and produce outputs depending on previous computations.
Fig. 2 Procedure of the convolutional neural network (convolution → subsampling → convolution → subsampling → full connection)

Fig. 3 The proposed convolutional neural network structure (Input → Convolution Layer 1 (10, 5×5) → Max Pooling → Convolution Layer 3 (10, 3×3) → Max Pooling → Fully Connected Layer 1 (256, dropout 0.3) → Fully Connected Layer 2 (128, dropout 0.3) → Softmax Output Layer)
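A minimal Keras sketch of the CNN in Fig. 3 is given below; the input size and any hyper-parameters not stated in the text (e.g., pooling size, activations) are assumptions, since the figure only fixes the filter counts, kernel sizes, dense widths, and dropout rates:

```python
# Hypothetical sketch of the CNN described in Fig. 3.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

cnn = Sequential([
    Conv2D(10, (5, 5), activation="relu", input_shape=(64, 64, 1)),  # assumed input size
    MaxPooling2D((2, 2)),
    Conv2D(10, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(256, activation="relu"),
    Dropout(0.3),
    Dense(128, activation="relu"),
    Dropout(0.3),
    Dense(7, activation="softmax"),          # seven emotion classes
])
```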
Fig. 4 The long short-term memory cell
However, when it comes to "long-term dependency" issues, RNNs have not proved workable. RNNs seem incapable of connecting the information in cases where the gap between the relevant information and the point where it is needed is very large. An advanced modification of the recurrent neural network is the long short-term memory (LSTM) [15]. The LSTM is able to memorize knowledge about previous outputs, in comparison with the regular feedforward neural network; this retention is due to the feedback loop present in its architecture. Figure 4 shows a diagram of a simple LSTM cell. Individual cells are combined to form a large network, thereby befitting the term deep neural network. The cell unit represents the memory, and each cell is composed of six main elements: an input gate I, a forget gate F, an output gate O, a candidate layer C̄, a memory state C, and a hidden state H. Given a sequence of vectors (X_1, X_2, …, X_n), with σ the sigmoid function, the hidden state H_t of the LSTM at time t is calculated as follows:
f_t = \sigma( X_t * U_f + H_{t-1} * W_f ) \qquad (1)
\bar{C}_t = \tanh( X_t * U_c + H_{t-1} * W_c ) \qquad (2)
I_t = \sigma( X_t * U_i + H_{t-1} * W_i ) \qquad (3)
O_t = \sigma( X_t * U_o + H_{t-1} * W_o ) \qquad (4)
C_t = f_t * C_{t-1} + I_t * \bar{C}_t \qquad (5)
H_t = O_t * \tanh( C_t ) \qquad (6)
Fig. 5 The hybrid deep learning CNN-LSTM model
where X_t is the input vector, H_{t−1} the previous cell output, C_{t−1} the previous cell memory, H_t the current cell output, and C_t the current cell memory; W and U are the weight matrices for the forget gate (f), candidate (C̄), input gate (I), and output gate (O); ∗ denotes element-wise multiplication and + element-wise addition. The hybrid deep learning approach was originally proposed by Donahue et al. [13]. This model combines the advantages of the CNN and LSTM models: the convolutional neural network is effective at extracting features for image classification, while the long short-term memory has produced remarkable results in processing sequences of inputs for facial expression recognition. Each visual feature determined through the CNN is passed to the corresponding LSTM and produces a fixed- or variable-length vector representation. The outputs are then passed into a recurrent sequence-learning module. The hybrid deep learning approach [13, 16] in our model, which combines our own CNN structure and the LSTM, is shown in Fig. 5.
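As an illustration of Eqs. (1)–(6), a single LSTM step can be written directly in NumPy; this is a didactic sketch with assumed shapes and weight containers (in practice the Keras LSTM layer implements the same computation):

```python
# Hypothetical NumPy sketch of one LSTM cell step, following Eqs. (1)-(6).
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, U, W):
    """U, W are dicts of weight matrices for the gates f, c, i, o."""
    f_t = sigmoid(x_t @ U["f"] + h_prev @ W["f"])      # Eq. (1) forget gate
    c_bar = np.tanh(x_t @ U["c"] + h_prev @ W["c"])    # Eq. (2) candidate
    i_t = sigmoid(x_t @ U["i"] + h_prev @ W["i"])      # Eq. (3) input gate
    o_t = sigmoid(x_t @ U["o"] + h_prev @ W["o"])      # Eq. (4) output gate
    c_t = f_t * c_prev + i_t * c_bar                   # Eq. (5) memory update
    h_t = o_t * np.tanh(c_t)                           # Eq. (6) hidden state
    return h_t, c_t
```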
3.3 Facial Expression Recognition With the vector learned from the CNN-LSTM model, the output layer is defined as below:
y = W_d x_t + b_d \qquad (7)
where x_t is the vector learned from the CNN-LSTM model, y is the degree of the facial expression class of the target picture, and W_d, b_d are the weight and bias associated with the CNN-LSTM model. The hybrid CNN-LSTM model is trained by minimizing the mean squared error between the predicted facial expression class and the actual facial expression class. The loss function is defined as follows:

L(X, y) = \frac{1}{2n} \sum_{i=1}^{n} \left\| h(x^i) - y^i \right\|^2 \qquad (8)

where X = \{x^1, x^2, \ldots, x^m\} is a training set of facial expression matrices and y = \{y^1, y^2, \ldots, y^m\} is the set of facial expression class ratings of the training set.
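A possible assembly of the full pipeline in Keras is sketched below. The arrangement is an assumption: the text does not specify how frames are batched into sequences, so the sequence length and LSTM width are hypothetical, and cnn_features denotes the convolutional base of the earlier CNN sketch (up to the 128-unit dense layer, without its softmax):

```python
# Hypothetical sketch of the combined CNN-LSTM classifier with output layer (7).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import TimeDistributed, LSTM, Dense

model = Sequential([
    TimeDistributed(cnn_features, input_shape=(10, 64, 64, 1)),  # assumed sequence of 10 frames
    LSTM(64),                                                    # assumed number of units
    Dense(7, activation="softmax"),                              # Eq. (7) followed by softmax
])
model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])  # MSE loss as in Eq. (8)
```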
4 Experiments 4.1 Dataset We used the Japanese Female Facial Expression (JAFFE) database which has 215 images of ten different female models posing for seven emotions [14]. The seven emotions in the JAFFE data set are shown in Fig. 6 below:
Fig. 6 Seven classes of emotions in the JAFFE dataset
To achieve the best classification results, CNN models typically require large datasets of training images. The challenge is that JAFFE contains only 215 images; therefore, before training the CNN model, we need to augment the dataset with various transformations to generate micro-changes in appearance and pose. We conducted the augmentation of the dataset following the method proposed by Li et al. [17]. After the augmentation, we can generate 30 samples for each original image in the dataset. We used five-fold cross-validation in our experiments: the dataset was divided into five subsets, one of the five subsets was chosen as the testing dataset, and the other four subsets were used for training. The cross-validation process was then repeated five times, with each of the five subsets used once as validation data.
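The augmentation and cross-validation protocol could be sketched as follows; the specific transformations are assumptions (the paper follows Li et al. [17], whose exact settings are not repeated in the text), and images/labels are assumed NumPy arrays:

```python
# Hypothetical sketch of the augmentation and five-fold cross-validation protocol.
import numpy as np
from sklearn.model_selection import KFold
from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug = ImageDataGenerator(rotation_range=10, width_shift_range=0.1,
                         height_shift_range=0.1, zoom_range=0.1)  # assumed transformations

def augment(images, labels, copies=30):
    out_x, out_y = [], []
    for img, lab in zip(images, labels):
        for _ in range(copies):                 # 30 samples per original image
            out_x.append(aug.random_transform(img))
            out_y.append(lab)
    return np.array(out_x), np.array(out_y)

for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(images):
    X_tr, y_tr = augment(images[train_idx], labels[train_idx])
    # build, train, and evaluate the CNN-LSTM model on this fold ...
```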
4.2 Experiment Result We implemented our model based on the Keras library by Chollet [18] with a TensorFlow [19] back-end. We use OpenCV [20] for all image operations. We evaluated the results of our model by the accuracy and confusion matrix scores. The accuracy score is the average accuracy of the five cross-validations; this score is calculated as follows:

Accuracy = \frac{TP}{TP + FP} \qquad (9)
where TP is the number of true facial expression recognitions and FP the number of false ones. This can be summed up as the total number of correct facial expression recognitions over the total number of facial expressions in the test dataset. We compared the result of our hybrid deep learning CNN-LSTM approach with the CNN model. The result is shown in Table 1.

Table 1 The result of our hybrid deep learning CNN-LSTM model compared with the CNN model

Model     | Accuracy (%)
CNN       | 84.25
CNN-LSTM  | 86.42

Table 1 shows that the performance achieved by the proposed CNN-LSTM method is much better than that of the CNN method, because the visual features extracted by the CNN are passed to the corresponding LSTM, which produces a fixed- or variable-length vector representation. This proves that combining the superior features of the CNN and LSTM models produces remarkable results in facial expression recognition. With the hybrid deep learning CNN-LSTM model, the accuracy rate increased by 2.17%, from 84.25 to 86.42%. Table 2 presents the average confusion matrix scores of the five cross-validations with the hybrid deep learning CNN-LSTM model.
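The per-fold evaluation behind Tables 1–3 can be sketched with scikit-learn utilities (an assumed implementation of the scoring described above, with one-hot labels assumed):

```python
# Hypothetical sketch of the accuracy and confusion matrix evaluation.
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

y_pred = np.argmax(model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)                 # assuming one-hot labels
print("accuracy:", accuracy_score(y_true, y_pred))
cm = confusion_matrix(y_true, y_pred, normalize="true") * 100  # row-normalized, in %
print(np.round(cm, 2))                             # layout of Tables 2 and 3
```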
Table 2 Confusion matrix score with the hybrid deep learning CNN-LSTM model

          | Angry | Sadness | Surprise | Happiness | Disgust | Fear  | Neutral
Angry     | 92.79 | 1.92    | 0        | 0         | 5.29    | 0     | 0
Sadness   | 1.22  | 80.08   | 2.44     | 6.10      | 1.63    | 5.28  | 3.25
Surprise  | 0     | 0.88    | 86.78    | 0         | 0       | 7.49  | 4.85
Happiness | 1.62  | 2.42    | 0.80     | 83.87     | 0       | 2.42  | 8.87
Disgust   | 6.79  | 4.07    | 0        | 1.81      | 85.52   | 1.81  | 0
Fear      | 0.81  | 6.91    | 1.63     | 0.81      | 0.81    | 86.99 | 2.04
Neutral   | 0     | 3.69    | 1.84     | 4.18      | 0       | 0     | 90.29
Table 3 Confusion matrix score with the CNN model

          | Angry | Sadness | Surprise | Happiness | Disgust | Fear  | Neutral
Angry     | 90.38 | 3.85    | 0.96     | 0         | 4.81    | 0     | 0
Sadness   | 1.63  | 78.05   | 2.44     | 7.32      | 2.03    | 5.28  | 3.25
Surprise  | 0     | 0.88    | 84.58    | 0         | 0       | 9.25  | 5.29
Happiness | 2.02  | 4.03    | 0.81     | 81.85     | 0       | 1.61  | 9.68
Disgust   | 6.33  | 4.08    | 0        | 1.81      | 83.26   | 2.71  | 1.81
Fear      | 1.63  | 6.10    | 1.22     | 0.81      | 1.63    | 84.96 | 3.65
Neutral   | 0     | 5.07    | 2.30     | 4.61      | 0       | 0     | 88.02
5 Conclusion In this paper, we propose an approach to recognize facial expression by using hybrid deep learning CNN—LSTM model. We conducted experiments on JAFFA database and evaluated by accuracy and confusion matrix scores. The CNN proposed method
558
B. T. Hung and L. M. Tien
Fig. 7 The facial expression estimation of CNN model
Fig. 8 The facial expression estimation of CNN—LSTM model
using a convolutional neural network with two convolutional layers, two pooling layers, and two fully connected layers. Based on the experiment results, it shows the superior performance of our proposed method since it could recognize facial expressions with minor errors in JAFFA dataset. We are looking to work more on the visualizing the learned filters in depth; and our future work will focus on training
the network with more data, more filters, and more depth to improve the accuracy in facial expression recognition.
References 1. Tian YL, Kanade T, Cohn JF (2005) Facial expression analysis. In: Handbook of face recognition, Springer, pp 247–275. https://doi.org/10.1007/0-387-27257-7_12 2. Wu Y, Liu H, Zha H (2005) Modeling facial expression space for recognition. In: IEEE/RSJ international conference on intelligent robots and systems-IROS, pp 1968–1973. https://doi. org/10.1109/iros.2005.1545532 3. Ekman P, Freisen WV, Ancoli S (1980) Facial signs of emotional experience. J Pers Soc Psychol 39(6):1125–1134. https://doi.org/10.1037/h0077722 4. Gu J, Wang Z, Kuen J, Ma L, Shahroudy A, Shuai B, Liu T, Wang X, Wang L, Wang G (2017) Recent advances in convolutional neural networks. Pattern Recognitio 1:1–24 arXiv:1512.07108 5. Li S, Deng W (2018) Deep facial expression recognition: A survey. arXiv preprint arXiv:1804. 08348 6. Ahonen T, Hadid A, Pietikainen M (2007) Face description with local binary patterns: application to face recognition. IEEE Trans Pattern Anal Mach Intell 28(12):2037–2041. https://doi. org/10.1109/TPAMI.2006.244 7. Ghimire D, Jeong S, Lee J, Park SH (2017) Facial expression recognition based on local region specific features and support vector machines. Multimedia Tools Appl 76:7803–7821. https:// doi.org/10.1007/s11042-016-3418-y 8. Dahmane M, Meunier J (2011) Emotion recognition using dynamic grid-based HoG features. In: 2011 IEEE international conference on automatic face gesture recognition and workshops (FG 2011), pp 884–888. https://doi.org/10.1109/fg.2011.5771368 9. Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324. https://doi.org/10.1109/5.726791 10. Lopes AT, de Aguiar E, Oliveira-Santos T (2015) A facial expression recognition system using convolutional networks. In: 28th SIBGRAPI conference on graphics, patterns and images, vol 00, pp 273–280. https://doi.org/10.1109/sibgrapi.2015.14 11. Hamester D, Barros P, Wermter S (2015) Face expression recognition with a 2-channel convolutional neural network. In: International joint conference on neural networks, pp 1787–1794. https://doi.org/10.1109/ijcnn.2015.7280539 12. Xie S, Hu H (2017) Facial expression recognition with FRR—CNN. Electron Lett 53(4):235– 237. https://doi.org/10.1049/el.2016.4328 13. Donahue J, Hendricks LA, Rohrbach M, Venugopalan S, Guadarrama S, Saenko K, Darrell T (2017) Long-term recurrent convolutional networks for visual recognition and description. IEEE Trans Pattern Anal Mach Intell 39(4):677–691. https://doi.org/10.1109/TPAMI.2016. 2599174 14. Lyons MJ, Akamatsu S, Kamachi M, Gyoba J (1998) Coding facial expressions with Gabor wave. In: Proceedings of the IEEE international conference on automatic face and gesture recognition, pp 200–205. https://doi.org/10.1109/afgr.1998.670949 15. Hochreiter S, Schmidhuber J (1997) Long short term memory. Neural Comput 9(8):1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735 16. Hung BT (2018) Vietnamese keyword extraction using hybrid deep learning methods. In: Proceedings of the 5th NAFOSTED conference on information and computer science. https:// doi.org/10.1109/nics.2018.8606906 17. Li W, Li M, Su Z, Zhu Z (2015) A deep-learning approach to facial expression recognition with candid images. In: 14th IAPR international conference on machine vision applications (MVA), pp 279–282. https://doi.org/10.1109/mva.2015.7153185
18. Chollet F (2015) Keras. https://github.com/fchollet/keras
19. Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, Kudlur M, Levenberg J, Monga R, Moore S, Murray DG, Steiner B, Tucker P, Vasudevan V, Warden P, Wicke M, Yu Y, Zheng X (2016) TensorFlow: a system for large-scale machine learning. In: Proceedings of the 12th USENIX conference on operating systems design and implementation, OSDI'16, pp 265–283
20. Open source computer vision library, OpenCV (2015). https://www.opencv.org
A Fast Reading of European License Plates Recognition from Monochrome Image Using MONNA

Maad M. Mijwil
Abstract The registration number is an effective way to identify vehicles; it is a unique piece of information for each car. Frequently, it is necessary to identify vehicle license plates (LP) for safety. The extracted information can be used for several purposes, such as access and flow control, border and toll monitoring, searching for suspicious vehicles, or fighting crime. The real problem lies in identifying LPs extracted from recorded footage. The purpose of this study was to develop and integrate an algorithm capable of assisting a human operator in the recognition of an LP using professional PDA-type equipment, implemented in Visual Studio C#. The technical constraints were to recognize a registration number with a satisfactory reading rate in less than 5 s, from a monochrome image of varying resolution, using a system equipped with a 2.67 GHz i5 processor and 4 GBytes of memory running Windows 10. We created a dataset of plate images to be "read" coming from several countries (Germany (Deutschland), Spain (España), France (France), United Kingdom, Turkey (Türkiye) and Italy (Italia)). Keywords License plates · Neural networks · Monochrome image · Recognition
1 Introduction Originally, license plates (LPs) were invented and put into real use for trolleys, but not for cars. The world's first license plate regulation was applied in France on August 14, 1893. Following this rule, all vehicles had to be registered with their owner's name, address, and a registration number. In Germany, on April 14, 1899, the police provided a license plate to Mr. Chubais Barthes, the world's first registered license plate, on which only "A1" was written, as shown in Fig. 1. M. M. Mijwil (B) Computer Engineering Techniques Department, Baghdad College of Economics Sciences University, Baghdad, Iraq e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_53
Fig. 1 First plate number in Germany
License plate reading (LPR) is an identification process that uses image processing and computer vision systems to retrieve the registration number from the image of the plate in character format in ASCII code [10]. LPR uses optical character recognition (OCR) on camera images [11]. Some license plates vary in font size and position. Automatic number plate reading systems need to know how to deal with these differences to be truly effective [12]. The most advanced systems know how to manage variations between countries, although many programs are country-specific [1]. Artificial vision (AV) is a field of artificial intelligence whose goal is to design computer systems that can "understand" the elements and characteristics of real-world scenes or images [14]. These systems can extract information, symbols, and numbers from the recognition of objects and structures present in the image [5]. To achieve this, they carry out four main activities, shown in Fig. 2 [13]. AV is closely related to pattern recognition and image processing techniques [7]. The font used for license plate numbers is not the same for all LPs [15]. These problems, and changing weather conditions, make the field of license plate recognition suitable for testing pattern recognition technology [6]. The development of the application was divided into two parts: location of the LP within an image and recognition of the vehicle license plate characters [8]. Several challenges faced by industrial vision systems, such as lighting conditions, deformation, and partial occlusion of objects, make image classification a complex task [9], to which a great effort
Fig. 2 Artificial vision system
Fig. 3 Turkish (Türkçe) license plate
Fig. 4 The part of ELP
has been devoted to developing sophisticated pattern recognition techniques that do not always produce the expected results [2]. From a machine learning perspective, image classification is a supervised learning problem: the classification algorithm generates a model from a data set of previously classified images [4], and the model is then used to classify new images [3]. For these algorithms, the image is a three-dimensional matrix of height, width, and color depth, and the content at each position in the matrix is a numeric value that represents the color intensity of each pixel in the digital image [16]. In this paper, we develop an algorithm to read European license plates (ELP) in various cases using a Multiple Ordinate Neural Network Architecture (MONNA) classifier for six countries. Figure 3 shows an example license plate from Turkey (Türkiye), and the parts of an ELP are explained in Fig. 4. The remaining parts of the paper are organized as follows: the character extraction is presented in Sect. 2, the performance evaluation is discussed in Sect. 3, and the conclusion is presented in Sect. 4.
2 Characters Extraction The interpretation of a license plate image, i.e., reading the number, requires extracting the alphanumeric characters composing the number. The variability of license plate images, due to their different origins, has forced us to develop an algorithm capable of taking the various cases into account. The character extraction algorithm includes binarization, character reconstruction, and orientation operations. Resizing the bounding box of each character separates the characters. Figure 5 shows that the images to be processed were divided into three categories: first, images of good quality to be read, on the left; second, images captured in poor conditions for
Fig. 5 Images of vehicle license plates in different cases
which reading is expected in the center; and third, poorly captured images with defects strongly hindering reading, on the right. Among the known classification methods, we chose a Multiple Ordinate Neural Network Architecture (MONNA) classifier. This type of classifier is well suited to classification into K classes (more than 30 in our case) of objects with which many measurements are associated. MONNA is composed of a layer of N multilayer perceptron binary classifiers and a decomposition unit that processes the N outputs of the binary classifiers to determine the class of the object. We used binary classifiers with a single layer of neurons, taking as input a set of measures associated with the characters to classify and returning the most probable of the two classes considered by each classifier. Figure 6 shows the classifier decomposing the K-class classification problem into a set of N binary classifiers (a code sketch of this pairwise scheme follows the algorithm outline below):

N = K · (K − 1)/2   (1)
whose responses are processed by the decision unit that selects the most relevant answer. Figure 7 shows an example German (Deutsche) plate; its registration number, B MW 5107, is shown in color and monochrome before detection. The algorithm developed in this work consists of:

1. Standardization and binarization of the plate
2. Segmentation
3. Modeling
4. Recognition.
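As a concrete illustration of the pairwise decomposition in Eq. (1), the sketch below trains one binary perceptron per pair of classes and lets a simple voting rule play the role of the decision unit. This is a minimal, hedged reconstruction in Python using scikit-learn, not the authors' MONNA implementation; the feature matrix X and labels y are assumed to be prepared elsewhere.

```python
# Minimal one-vs-one decomposition in the spirit of MONNA (illustrative only).
# Assumes character images are already reduced to fixed-length feature vectors.
from itertools import combinations
import numpy as np
from sklearn.linear_model import Perceptron

def train_pairwise(X, y):
    """Train one binary perceptron per pair of classes: N = K*(K-1)/2 models."""
    models = {}
    for a, b in combinations(np.unique(y), 2):
        mask = (y == a) | (y == b)          # keep only samples of the two classes
        clf = Perceptron(max_iter=1000)
        clf.fit(X[mask], y[mask])
        models[(a, b)] = clf
    return models

def predict_pairwise(models, x):
    """Decision unit: each binary classifier votes; the majority class wins."""
    votes = {}
    for clf in models.values():
        c = clf.predict(x.reshape(1, -1))[0]
        votes[c] = votes.get(c, 0) + 1
    return max(votes, key=votes.get)
```

For 36 alphanumeric classes this yields 36 · 35 / 2 = 630 binary classifiers, each of which only ever sees data from its two classes.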
Fig. 6 Classifier decomposing
Fig. 7 German (Deutsche) plates
Fig. 8 License plates in black and white color
The colors of the image are represented in the grayscale space. In this space, the colors are represented as a linear combination of the base vectors of black and white; that is, the color of a pixel is represented as ϕ = [B, W]. The shape of the BW space is shown in Fig. 8. The registration number is separated from the marks and writings on the plate by using Eq. (2):

θ* = 1⃗ if φ ≤ δ; θ* = 0⃗ if φ > δ   (2)
where the threshold δ = 0.73, 1⃗ = [1, 1, 1], and 0⃗ = [0, 0, 0]. Segmenting the characters consists of extracting from the original image the regions that contain exclusively the symbol of some letter or number. To obtain such a region, it is necessary to identify the coordinates of the boundaries that make up the image of the character, that is, to locate its four corners; Fig. 9 shows the segmentation of θ*. The vertical and horizontal projection method is used to segment the characters. Given an image I(x, y) of width N₁ and height M₁, with 1 ≤ x ≤ N₁ and 1 ≤ y ≤ M₁, the projections are defined as follows:

P_ver(x₀) = Σ_{y=1}^{M₁} I(x₀, y), ∀x₀ = 1, …, N₁   (3)

Fig. 9 Segmentation of registration number
Fig. 10 Registration number of a binarized image
P_hor(y₀) = Σ_{x=1}^{N₁} I(x, y₀), ∀y₀ = 1, …, M₁   (4)
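Equations (3) and (4) map directly onto column and row sums of the binarized plate. The following NumPy sketch locates character bounding boxes from gaps in the column projection; it is an illustrative reconstruction, not the paper's code, and the minimum character width is an assumed parameter.

```python
import numpy as np

def segment_characters(binary_plate, min_width=3):
    """binary_plate: 2D array, 1 = character pixel, 0 = background.

    Returns a list of (x0, x1, y0, y1) bounding boxes, one per character.
    """
    p_ver = binary_plate.sum(axis=0)                 # Eq. (3): one sum per column
    cols = np.concatenate([[0], (p_ver > 0).astype(int), [0]])
    edges = np.flatnonzero(np.diff(cols))            # rising/falling edges pair up
    boxes = []
    for x0, x1 in zip(edges[::2], edges[1::2]):
        if x1 - x0 < min_width:                      # discard thin noise columns
            continue
        p_hor = binary_plate[:, x0:x1].sum(axis=1)   # Eq. (4) restricted to the slice
        ys = np.flatnonzero(p_hor > 0)
        boxes.append((x0, x1, ys[0], ys[-1] + 1))
    return boxes
```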
The histograms obtained by means of Eqs. (3) and (4) allow determining the coordinates of the region in which each character lies. Figure 10 shows the registration number after segmentation. The segmented characters are modeled using the correlation factor, Hu moments, and Fourier descriptors. Correlation is a statistical technique that quantifies the intensity of the linear relationship between two variables. Quantification is performed using Pearson's linear correlation coefficient, whose value ranges from −1 to 1. The correlation coefficient r is calculated as follows:

r = Σ_{(x,y)∈O} (x − x̄)(y − ȳ) / √( Σ_{(x,y)∈O} (x − x̄)² · Σ_{(x,y)∈O} (y − ȳ)² )   (5)

A value close to −1 indicates a negative relationship between the variables, a value close to 0 indicates no relationship, and a value close to 1 indicates a direct relationship between the variables. Hu moments have been used to recognize characters and, in other cases, to measure geometric characteristics such as elasticity or circularity. Let O = {(x₁, y₁), …, (x_m, y_m)} be the set of coordinates of the pixels that make up an extracted character, and let (x̄, ȳ) be the coordinates of the centroid of the object, which are calculated as follows:

x̄ = (1/m) Σ_{(x,y)∈O} x   (6)

ȳ = (1/m) Σ_{(x,y)∈O} y   (7)
Figure 11 shows the outline of a character, which is defined by n points in the complex plane. Let {(x₁, y₁), …, (x_n, y_n)} ⊂ Z² be the set of points that form the contour; each point is represented as a complex number:

Z_n = x_n + j·y_n, where j = √−1   (8)
Fig. 11 Set of points that form the outline of a character
The discrete Fourier transform of the set of points that make up the character contour can be calculated as follows:

F(u) = (1/m) Σ_{n=1}^{m} Z_n [cos(2πun/m) − j·sin(2πun/m)]   (9)

Finally, the Fourier descriptors are obtained by calculating the absolute values of the complex numbers:

f(u) = |F(u)|   (10)

where u = 1, 2, …, N and N is the total number of descriptors to obtain.
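Equations (8)-(10) can be computed with a standard FFT; a minimal sketch follows. The normalization by |F(1)| is an assumption added here for scale invariance, since the paper does not state which normalization it uses.

```python
import numpy as np

def fourier_descriptors(contour, n_desc=16):
    """contour: (m, 2) array of (x, y) boundary points, Eqs. (8)-(10).

    Returns |F(u)| for u = 1..n_desc, normalized by |F(1)|.
    """
    z = contour[:, 0] + 1j * contour[:, 1]   # Z_n = x_n + j*y_n, Eq. (8)
    F = np.fft.fft(z) / len(z)               # discrete Fourier transform, Eq. (9)
    mags = np.abs(F)                          # f(u) = |F(u)|, Eq. (10)
    return mags[1:n_desc + 1] / (mags[1] + 1e-12)
```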
3 Performance Evaluation The purpose of this paper was to develop software with a manageable graphical interface that automatically detects the LP from car images and extracts the number in a text format readable by the computer. For each character detected and normalized in size, we calculated shape parameters (area, perimeter, elongation, compactness, number of holes, etc.) and distribution measures, namely the number of pixels belonging to the character in each of the [4 × 8] boxes that make up the bounding box of the character. The evaluation of the classifier showed that the shape parameters were little or not at all discriminating. The parameters retained were the [128] distribution parameters
Table 1 Result of ELP for three countries (Turkey, Italy, France)

Image quality | Turkey (Türkiye)                         | Italy (Italia)                           | France (France)
              | Registration number | Detection rate (%) | Registration number | Detection rate (%) | Registration number | Detection rate (%)
High          | 14 DK 205           | 94.0               | DS 900 JS           | 91.6               | NS 758 TJ           | 95.2
Normal        | 33 FP 481           | 86.9               | EV 71 4RA           | 84.2               | AB 671 LD           | 83.5
Poor          | 34 Y JT 51          | 75.1               | DZ 698ND            | 79.9               | CL 803 YZ           | 72.5
Table 2 Result of ELP for three countries (Germany, Spain, United Kingdom)

Image quality | Germany (Deutschland)                    | Spain (España)                           | United Kingdom
              | Registration number | Detection rate (%) | Registration number | Detection rate (%) | Registration number | Detection rate (%)
High          | B MW 5107           | 94.6               | 7795F JH            | 90.6               | YF52 ZYZ            | 93.0
Normal        | B KW 7165           | 94.5               | 4449 BGK            | 80.9               | TR53 OLL            | 88.2
Poor          | M LC 3511           | 76.6               | 0624 HJY            | 75.7               | Y JID—              | 79.6
and the surfaces of the two holes included in the character (they are set to 0 when there is no hole). The learning base was built from simulated characters, using the fonts found on the license plates, to which rotation and perspective transformations were applied in order to take into account the acquisition conditions. For each alphanumeric character, 90 images were generated. Table 1 and Table 2 below show the recognition rates of characters according to the quality of the images.
4 Conclusion Character recognition gives very satisfactory results (more than 98% success on good-quality images). The lower recognition rate of plates is explained by the failure of recognition when one of the characters is incorrectly recognized, which increases the probability of failure, and by the imperfect detection of characters (undetected characters). This should be improved if fully automatic recognition is desired. For the intended application, the results were satisfactory, the operator having the possibility of recapturing the image of the plate if the image quality is insufficient and of selecting another proposal for each detected character. The average reading time of a plate is 2.3 s (1.4 s for character detection and 0.9 s for recognition of the detected characters) with an i5 processor clocked at 2.67 GHz; the system achieved a 95% detection rate on a small dataset.
References

1. Anagnostopoulos CN, Anagnostopoulos IE, Psoroulas ID, Loumos V, Kayafas E (2008) License plate recognition from still images and video sequences: a survey. IEEE Trans Intell Transp Syst 9:337–391
2. Chen R, Luo Y (2012) An improved license plate location method based on edge detection. Phys Procedia 24:1350–1356
3. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. IEEE Comput Soc Conf Comput Vision Pattern Recogn 1:886–893
4. Gupta N, Tayal S, Gupta P, Goyal D, Goyal M (2017) A review: recognition of automatic license plate in image processing. Adv Comput Sci Technol 10(5):771–779
5. Humam M, Ghimiray S, Bhardwaj M, Bhardwaj AK (2017) Automatic number plate recognition: an approach. Int J Appl Innov Eng Manage 6:172–175
6. Kilic I, Aydin G (2018) Turkish vehicle license plate recognition using deep learning. In: IEEE international conference on artificial intelligence and data processing (IDAP)
7. Medathati NVK (2016) Computer vision and image understanding. Comput Vision Image Underst 6:1–30
8. Mohamed MA (2013) A fast algorithm for license plate detection. In: IEEE international conference on signal processing, image processing and pattern recognition, pp 1–5
9. Patel C, Shah D, Patel A (2013) Automatic number plate recognition system (ANPR): a survey. Int J Comput Appl (IJCA) 69:21–33
10. Pavani T, Mohan D (2019) Number plate recognition by using open CV-Python. Int Res J Eng Technol (IRJET) 6:4987–4992
11. Qadri MT, Asif M (2009) Automatic number plate recognition system for vehicle identification using optical character recognition. In: IEEE computer society international conference on education technology and computer, pp 335–338
12. Roy A, Ghoshal DP (2011) Number plate recognition for use in different countries using an improved segmentation. In: IEEE 2nd national conference on emerging trends and applications in computer science, pp 1–5
13. Sereno JE, Bolaños F, Vallejo M (2016) Artificial vision system for differential multiples robots. Technol Appl Electron Teach Conf (TAEE) 6:1–5
14. Shubhendu S, Vijay J (2013) Applicability of artificial intelligence in different fields of life. Int J Sci Eng Res (IJSER) 1:28–35
15. Wang ML, Liu YH, Liao BY, Lin YS, Horng MF (2010) A vehicle license plate recognition system based on spatial/frequency domain filtering and neural networks. In: International conference on computational collective intelligence, vol 6423. Springer, Berlin, Heidelberg, pp 63–70
16. Xie L, Ahmad T, Jin L, Liu Y, Zhang S (2018) A new CNN-based method for multi-directional car license plate detection. IEEE Trans Intell Transp Syst 19:507–517
Implementing Authentication System Depending on Face Recognition Using Anthropometric Model

Ahmed Shihab Ahmed
Abstract This paper presents an automatic method for detecting significant facial feature points employing an advanced anthropometric face model. The facial feature points the system works on concern the areas of the mouth, nose, eyes, and eyebrows. Anthropometry is the scientific study of the measurements and proportions of the human face. Several processes are performed in order to decide whether a human identity is authenticated or not; these processes begin by capturing a colored image using a fixed digital camera and end with the features isolated into separate sub-images, with the lengths and distances among them, representing the authenticated person's information, stored in a database. In the authentication stage, all the extracted features are compared with the stored authenticated facial features in the database; the person is authenticated if a similarity of 83% or greater is achieved. Keywords Facial feature points · Anthropometric · Face recognition · Face detection
1 Introduction Authentication is a process which proves that someone or something is valid or genuine [5], or it can be defined as a process by which a computer accepts or rejects a user's claim to certain privileges. In a computer system or a communication network, authentication is an important part of good data security [14]. Many people cannot distinguish between authorization and authentication. Authentication deals with the question of whether or not you are actually communicating with a specific entity. Authorization is concerned with what that entity is granted to do [7]. Generally, all authentication schemes have a common step in which the validity of one or more parameters must be checked. An authentication scheme is characterized by the nature of the pre-established relationships existing A. S. Ahmed (B) Department of Basic Sciences, University of Baghdad, College of Nursing, Baghdad, Iraq e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_54
between the checked parameters and the quantities to be authenticated [5]. The need for providing authentication arises in many different situations, and it is especially crucial when the user operates from a remote terminal. There are two different cases of authentication: • Message Authentication: a technique which, when agreed between two communicating parties, allows each party to verify whether received messages are authentic [5]; message authentication lets the receiver verify the message's origin, timeliness, and intended receiver. • User Authentication: user authentication is a major part of currently used security infrastructure [10]. There are three main techniques for user authentication: • Password authentication • Token authentication • Biometric authentication. The main aim of the security mechanisms in a system is to govern access to information, but controlling access to the system itself should be considered first. For several systems, physical controls are enough; nevertheless, most systems need to be accessed from sites that are not under the physical control of the site management. This leads to the fact that a system may guard itself in two ways: • It can control who may access the system • It can control what persons may do once they enter the system. The first method requires the system to apply two processes: • Identification (asking who you are) • Verification (asking you to prove it). The best method to prove the identity of a person is face recognition based on the user's face image, which answers the question of who you are. In recent years, face recognition research has achieved prominence due to the sharpened security conditions throughout the western world. Face recognition software has been integrated into a wide variety of biometrics-based security systems for the purposes of identification, authentication, and tracking. Thus, a good security system can be built on face recognition, which can be used either to recognize a person's identity or to authenticate it.
2 Facial Recognition Facial recognition, a kind of biometric recognition, makes human recognition a more automated, computerized procedure [12].
Facial recognition is a computer vision procedure that uses faces to identify a person or verify a person's identity. Facial recognition is done in five steps: • Acquire an image of the face. • Spot the location of every face in the acquired image. • Analyze the spatial geometry of the distinguishing features of the face to extract the identifying features and generate a template from them. • Compare the template generated in step three with those in a database of known faces. The process produces scores that specify how closely the generated template matches each one of those in the database. • The last phase is deciding whether any score created in phase four is high enough to declare a match.
3 Related Work There are many works focusing on image retrieval and face recognition systems; some of them include the following. Abdul Wahab [1] submitted a Ph.D. thesis on an automatic face recognition system. The resulting algorithms are designed and evaluated on four different color image databases. Textural features are extracted from wavelet coefficients: each face is characterized by a subset of band-filtered images holding wavelet coefficients. Those coefficients characterize the face texture, and a group of clear statistical measures permits the user to compose compact and meaningful feature vectors. This work also proposes a face detection algorithm for colored images under different lighting conditions and complicated backgrounds. Hwang et al. [6] present a robust face recognition system for large-scale datasets acquired under extreme illumination variation. The suggested face recognition framework consists of a novel illumination-insensitive preprocessing procedure, a hybrid Fourier-based facial feature extraction, and a score fusion structure. Ahmad et al. [3] assess different face detection and recognition procedures and provide a complete solution for image-based face detection and recognition with higher precision and an improved response rate, as a first stage for video surveillance. Their solution is proposed based on experiments conducted on several face databases with varying subjects, pose, expressions, race, and illumination. Alzubaydi Dhia and Samar [4] describe a system whose workflow occurs in two major stages, enrollment and recognition. The two stages share the two main modules of the system (the preprocessing module and the feature extraction module). The preprocessing contains several phases: a face clip detection phase, an eye block detection phase, and an edge detection phase. After reading the face
image as a bmp file, the face detection step goes through these procedures and converts the image to HSV color space for skin color modeling; the segmentation procedure is then completed to reveal the skin area, followed by noise removal and face localization. The eye block detection phase contains these steps: image smoothing, conversion to grayscale, contrast improvement, and extraction of the master eye block. The third phase uses Canny edge detection and thinning; all of these steps are carried out to obtain the main edges of the eye block. Jayshree and Lodaya [8] observe that an intruder might use the system between the moment the authentic user logs in and the moment they log out. Such situations may arise when the user takes a brief rest or has not logged out for various reasons. This is a serious security problem, especially for systems holding classified information. To overcome this issue, their paper suggests a continuous authentication scheme based on continuous observation of the user. The suggested system employs soft biometric traits such as skin color, face recognition, and the user's clothing. It automatically registers soft biometrics each time the user logs in and performs a prior matching of the soft biometrics against conventional authentication, namely the password and face biometrics. The suggested technique has good tolerance when the user's posture is largely static while using the system. Royer and Blais [9] worked on using individual variations in face recognition ability to advance understanding of the perceptual and cognitive mechanisms supporting face processing, a field that has expanded significantly over recent years. Their paper aims to define how differing levels of face recognition ability relate to variations in visual information extraction strategies in an identity recognition task. To address this question, fifty participants completed six tasks measuring face and object processing abilities. Using the bubbles technique, they also measured each individual's use of visual information in the face recognition task. At the group level, their outcomes replicate earlier results demonstrating the significance of the eye area for face recognition. Importantly, they show that face processing ability is linked to the systematic use of the eye region, particularly the left eye from the viewer's viewpoint. Indeed, their results suggest that the use of this region accounts for about 20% of the variance in face processing ability. The results favor the view that individual differences in face processing are at least partially linked to the perceptual extraction strategy applied during face identification. Abudarhama and Shkiller [2] note that face recognition is a computationally hard task that people achieve easily; however, this extraordinary ability is better for familiar faces than for unfamiliar ones. To account for humans' greater ability to identify familiar faces, current theories propose that different features are used for the representation of familiar and unfamiliar faces. In their study, they employed a reverse engineering method to disclose which facial features are important for familiar face recognition.
In contrast to current views, they determined that the same subset of features that is used for matching unfamiliar faces is also used for matching and recognition of familiar faces. They additionally illustrate that these features are likewise used
by a deep neural network face recognition algorithm. They therefore suggest an alternative framework that assumes a similar perceptual representation for all faces and integrates cognition and perception to account for humans' superior recognition of familiar faces. Zhi and Sanyang [13] note that the expansion of computer technology has led to improvements in face recognition technology, which has now been effectively applied in several areas with the aid of computer and network technology. Their paper builds an effective face recognition model based on principal component analysis, a genetic algorithm, and a support vector machine, in which principal component analysis is applied to reduce the feature dimension, the genetic algorithm is used to enhance the search strategy, and the support vector machine is applied to perform classification. In simulation tests on the 2003 face database of the Institute of Technology of the Chinese Academy of Sciences, the results show that the model can realize face recognition with remarkable effectiveness, with a maximum accuracy rate of 99%.
4 Anthropometric Model Anthropometry is the biological science that deals with the various measurements of the human form and its parts. Information acquired from anthropometric measurement informs a range of projects that rely on knowledge of the distribution of measurements across human populations [11]. The landmark points applied in the anthropometric face model for the localization of facial features are shown in Fig. 1. It has been observed from the proportion statistics derived during our initial monitoring that the positions of the points (P3, P4, P6, P7) can be obtained from the distance between the two eye centers (P1 and P2), using the midpoint of the eyes (P5) as the reference point, since the distances between the pairs of points (P1, P3), (P2, P4), (P5, P6), and (P5, P7) preserve nearly constant ratios with the distance between the centers of the right and left eyes (P1 and P2).
Fig. 1 Anthropometric face model for facial feature region [9]
5 Proposal System Model The idea of the proposed system depends on building an anthropometric face model, a model of the face that is used to find the key points, which are the right eyebrow center, right eye center, mouth center, left eyebrow center, left eye center, midpoint of the eyes, and nose tip. The model is built after many processes to extract the pure face image without any background by using skin color segmentation and an elliptical mask technique. Once the key points are obtained, the feature distances of the face image can be computed, and the system can be separated into four main parts (background removing, face detection, feature extraction, and matching). The flowchart in Fig. 2 displays the main steps of the proposed system. In the authentication phase, the scanned image is processed by the same operations mentioned above and in Fig. 2, but the results are compared with the database fields of each authentic person, stored previously, to determine whether the identity of the claimed person is authorized or not. The comparison is done by comparing the result of the feature measurements with the stored feature measurements in the database; for example, the length of the eyebrow of a person whose authorization is being tested is compared with the lengths of the eyebrows of all the authorized persons stored in the database. If a similarity of 83% or more is achieved, the person is identified as an authorized person; otherwise, he is not.
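One possible reading of the 83% rule is a normalized comparison of the stored and measured feature-distance vectors, as sketched below. The similarity measure and database layout are assumptions made for illustration; the paper does not specify its exact matching formula.

```python
import numpy as np

def similarity(measured, stored):
    """Per-feature ratio similarity in [0, 1], averaged over all distances."""
    measured = np.asarray(measured, float)
    stored = np.asarray(stored, float)
    ratios = np.minimum(measured, stored) / (np.maximum(measured, stored) + 1e-12)
    return ratios.mean()

def authenticate(measured, database, threshold=0.83):
    """database: {person_id: feature-distance vector}. Returns best match or None."""
    best_id, best_sim = None, 0.0
    for person_id, stored in database.items():
        s = similarity(measured, stored)
        if s > best_sim:
            best_id, best_sim = person_id, s
    return best_id if best_sim >= threshold else None
```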
6 Experimental Results The following steps represent the essential operations performed in the system, as shown in Fig. 2. • Step one: Capturing the image. Using a fixed digital camera connected to the computer system, the image is captured within a fixed environment and according to an accepted pose for the authenticated persons, as shown in Fig. 3. • Step two: Preprocessing. This process is very important for detecting the human face shape, isolating it from the background, and converting the background to black. To achieve this, the following steps are performed: • Skin segmentation: this process classifies pixels into skin and nonskin pixels depending on the RGB color space, according to the following threshold: human skin at daylight lies within (R > 95) and (G > 40) and (B > 20) and (|R − G| > 15) and (Max(R, G, B) − Min(R, G, B) > 15) and (R > G) and (R > B) [8, 9], as shown in Fig. 4 (a code sketch of this rule is given after the binary mask description below). • Binary mask: this process is done to remove any noise from the image in three steps: the isolated face from the previous skin segmentation step is converted to binary.
Fig. 2 The proposal system flowchart (start → image capturing → skin segmentation → background removing → binary mask → image resizing → enclosing the face within an elliptical mask → increasing face contrast → converting nonskin pixels to black → feature extraction → isolating each feature into a separate sub-image → dataset matching → authenticated / not authenticated)
Fig. 3 The original image
Fig. 4 Face extraction and background removing to black color
– Filling the small holes in the face to generate a white mask of the face. – Eliminating small white areas in the background (if any). Now a binary mask of the face is created (which can be considered a template of the face), as shown in Fig. 5. Adding the resulting mask to the original image extracts the face without any noise (small black holes), as shown in Fig. 6.
Fig. 5 White template of the face
Fig. 6 The extracted face without any noise
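The daylight skin rule quoted in step two vectorizes directly; a minimal NumPy sketch is shown below. The input is assumed to be an (H, W, 3) uint8 RGB image, and the helper is an illustration of the stated thresholds, not the paper's exact code.

```python
import numpy as np

def skin_mask(rgb):
    """Boolean mask of daylight skin pixels for an (H, W, 3) uint8 RGB image."""
    r, g, b = (rgb[..., i].astype(int) for i in range(3))
    return ((r > 95) & (g > 40) & (b > 20)
            & (np.abs(r - g) > 15)
            & (rgb.max(axis=-1).astype(int) - rgb.min(axis=-1).astype(int) > 15)
            & (r > g) & (r > b))

# Background removal: keep skin pixels, set everything else to black.
# face = rgb * skin_mask(rgb)[..., None]
```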
– Image resizing: to reduce unnecessary information, columns and rows of the background that contain only black pixels are eliminated from the image. Determining the face area position exactly facilitates the process of enclosing the face with an elliptical mask, as shown in Fig. 7. • Step three: Masking. To extract the human face, an elliptical mask is used; the face in the resized image is enclosed with an elliptical mask which is adjusted according to the face boundary. An elliptical mask is used because it is close to the human face shape. Then the face contrast is increased to differentiate easily between skin and nonskin pixels (nonskin pixels represent features) and to facilitate feature extraction. Then nonskin pixels are converted to black pixels,
Fig. 7 Reduced image
which can be used in the feature extraction step to measure distances, as shown in Figs. 8, 9, 10, 11 and 12. A filter is applied to the enclosed face to obtain more clarity in the features and to ensure the ability to analyze them (feature extraction). This is done by applying the Sobel operator to sharpen the facial details. The result is shown in Fig. 10. Additionally, another mask, similar to a Laplacian filter with some changes, is used to increase the image contrast by enhancing image details and making each pixel different from its
Fig. 8 Elliptical mask used to enclose the face in resized image
Fig. 9 Elliptical mask encloses the face in resized image
Fig. 10 Increase in face contrast
surrounding pixels (skin pixels). The result is shown in Fig. 11. The mask used for this purpose is:

⎡ 1  0  −1 ⎤
⎢ 1  1   3 ⎥
⎣ 1  0  −1 ⎦
Fig. 11 Edge detection with sobel filter
Fig. 12 Feature extraction
The two Sobel kernels are:

Gx:
−1  0  1
−2  0  2
−1  0  1

Gy:
−1  −2  −1
 0   0   0
 1   2   1
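Applying these kernels amounts to a pair of 2D convolutions; the sketch below uses OpenCV's filter2D and combines the two responses into a gradient magnitude. It illustrates the Sobel step described above, not the authors' exact pipeline.

```python
import cv2
import numpy as np

# The two Sobel kernels shown above (horizontal and vertical gradients).
gx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
gy = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float32)

def sobel_edges(gray):
    """Gradient magnitude of a grayscale face image, rescaled to 8-bit."""
    dx = cv2.filter2D(gray.astype(np.float32), -1, gx)
    dy = cv2.filter2D(gray.astype(np.float32), -1, gy)
    mag = np.hypot(dx, dy)
    return np.uint8(255 * mag / (mag.max() + 1e-12))
```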
• Step four: Feature extraction. To extract the face features, an anthropometric model is used by computing the distances of the face features (like the space in the middle of the two eyebrows, the eyes, etc.) and then storing the results in the database, which represents the information of the authenticated persons, as shown in Fig. 12. • Step five: Authentication (yes or no). All the above operations are performed for the captured image and compared with the stored information to determine whether the person in the image is authenticated or not.
Table 1 The experimental results by implementing the system on 27 images for each state

Image state                              | Successful rate (%)
Clear background without skin color      | 83
Clear background with skin color         | 10
Clear front view image                   | 80
Pose front view image                    | 40
Face with some changes like sunglasses   | 15
The experimental results reported in Table 1 indicate that the face image must have a clear background and a frontal view with the specified pose, within a stable environment, to obtain high performance and accurate results.
7 Authentication Decision In the authentication phase, the scanned image is processed by the same operations mentioned above, but the results are compared with the database fields of each authentic person, stored previously, to determine whether the identity of the claimed person is authorized or not. The comparison is done by comparing the result of the feature measurements with the stored feature measurements in the database; for example, the length of the eyebrow of a person whose authorization is being tested is compared with the lengths of the eyebrows of all the authorized persons stored in the database. If a similarity of 78% or more is achieved, the person is identified as an authorized person; otherwise, he is not.
8 Conclusions From the practical experience with the proposed system, previously implemented systems, and different algorithms, some conclusions can be drawn. No system can recognize an image with 100% accuracy, which means that such systems allow a small percentage of errors; the practical implementation of the proposed system has shown that the error percentage can reach 17% or slightly more, and the results are good. Utilizing the human skin color feature alone to distinguish the human face from other image parts is not enough, since the skin color is close to that of other objects; therefore, other factors should be taken into consideration, such as the shape, location, and size of the area. The kind and resolution of the digital camera play a significant role in the correctness and speed of image recognition. The anthropometric face recognition algorithm provides very accurate results but requires careful programming to obtain them. If the background or the person's clothes contain red or colors close to human skin, such as brown, this will confuse the extraction process: such regions will not be removed but will be analyzed as
human skin. Also, if the lighting conditions are not controlled, there will be some shadow on the face or the background, which makes face detection and extraction difficult; thus, when an image is captured there should not be any shadow on the face (lighting conditions must be controlled).
9 Suggestion for Future Work 1. Building a system that recognizes persons using voice and image together. 2. Building a system that recognizes persons inside the camera itself and sends the results to the computer by a signal. 3. Building a system that recognizes persons in a face image with certain degree of rotation angle.
References

1. Abdul Wahab SI (2005) Face recognition using skin color and texture feature. Ph.D. thesis, Computer Science Department, University of Technology
2. Abudarhama N, Shkiller L (2019) Critical features for face recognition. J Cogn 73–83
3. Ahmad F, Najam A, Ahmed Z (2012) Image-based face detection and recognition. IJCSI Int J Comput Sci Issues
4. Alzubaydi Dhia AJ, Samar AY (2014) Hybrid features extraction based on master eye block for face recognition. Int J Sci Eng Res
5. Goldstein AJ, Harman LD (1971) Identification of human faces. In: Proceedings of the IEEE
6. Hwang W, Wang H, Kim H, Kee S (2011) Face recognition system using multiple face model of hybrid Fourier feature under uncontrolled illumination variation. IEEE Trans Image Process
7. Jan CA, Lubbe VD (1998) Basic methods of cryptography. Cambridge University Press
8. Jayshree K, Lodaya P (2016) User authentication by face recognition. Int J Adv Res Comput Sci Softw Eng
9. Royer J, Blais C (2018) Greater reliance on the eye region predicts better face recognition ability. J Cogn 12–20
10. Seberry J, Pieprzyk J (1989) Cryptography, an introduction to computer security. Prentice Hall of Australia
11. Sohail A, Bhattacharya P (2006) Detection of facial feature points using anthropometric face model. Concordia University, Montréal, Québec, Canada
12. Woodward JD (2003) Biometrics: a look at facial recognition. Prepared for the Virginia State Crime Commission
13. Zhi H, Sanyang LS (2019) Face recognition based on genetic algorithm. J Visual Commun Image Represent 495–502
14. http://www.oga.co.th/Syncom/Seurid/Resource/FAGs/Index.html
Process Parameter Optimization for Incremental Forming of Aluminum Alloy 5052-H32 Sheets Using Back-Propagation Neural Network

Quoc Tuan Pham, Nguyen Ho Quang, Van-Xuan Tran, Xiao Xiao, Jin Jae Kim, and Young Suk Kim

Abstract In this study, a back-propagation neural network (BPNN) model is developed to optimize forming parameters of aluminum alloy 5052-H32 sheets subjected to incremental sheet forming (ISF). Process parameters including tool diameter, step size in depth, tool feed rate, and tool spindle speed are varied to investigate the formability of the test, which is quantified by a multi-objective fitness function to reach the maximum forming angle and the minimum thickness reduction. A series of experimental tests have been conducted with different values of the above-mentioned process parameters. A BPNN model was then developed to predict material responses during the ISF process. This BPNN model is employed to search for Pareto optimal solutions of the formability of ISF. The derived Pareto optimal front appears to be a useful design guide for forming of aluminum alloy 5052 sheets in the ISF process. Keywords Incremental sheet forming · Parameter optimization · Back-propagation neural network · Multi-objective optimal · Genetic algorithm
1 Introduction In recent decades, incremental sheet forming (ISF) has been developed for rapid prototyping of parts made from sheet metal [1]. In ISF, a single, programmable rigid tool is moved to deform a sheet fully constrained at its edges. Unlike traditional forming processes such as stamping, punching, and stretching, forming a blank sheet
Q. T. Pham · N. H. Quang · V.-X. Tran (B) Faculty of Engineering and Technology, Thu Dau Mot University, Binh Duong, Vietnam e-mail: [email protected] X. Xiao · J. J. Kim Graduate School, Kyungpook National University, Daegu, South Korea Y. S. Kim (B) School of Mechanical Engineering, Kyungpook National University, Daegu, South Korea e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_55
with ISF does not require a rigid die. In addition, greater flexibility and large formability are two major advantages of ISF. The process has great potential for development in SME manufacturing companies in Vietnam. Various studies have been conducted to understand the material's response during ISF. For example, Jackson and Allwood [2] pointed out that the mechanics of ISF processes are combinations of bending, shearing, and stretching deformations. Gatea et al. [3] presented a critical review of the effects of process parameters on the final deformed shapes of products made by ISF. Do and Kim [4] introduced various hole-lancing patterns near the clamping areas to decrease the springback during ISF. Do et al. [5] presented a hybrid experimental–numerical method to identify the fracture limit in ISF. Lu et al. [6] investigated the mechanism of double-side ISF, which is believed to generate less residual stress in the final product than single-side ISF. However, these processes are frequently applied to form sheet metals that exhibit a low yield stress [3]. Myriad experimental efforts have been made to overcome this limitation and increase the applications of ISF in industry. Nowadays, ISF processes can be applied to hard-to-form sheet metals with the help of several assisting methods, such as laser assistance [7], electric-pulse assistance [8], and heating assistance [9]. On the other hand, employing industrial robots is an effective way to improve the accuracy of ISF. Furthermore, adopting ISF processes for two sheet layers has been proven an effective strategy to deform sheets involving complicated shapes, for example, embossed sheet metals [10], porous sheet metals [11], and carbon fiber reinforced polymer sheets [12]. With the development of computational methods and artificial intelligence, several difficulties of ISF processes have been investigated by employing machine learning techniques. Ambrogio et al. [13] developed a neural network (NN) model to estimate the forming depth of a changing-section part made by an ISF process. Later, Fiorentino et al. [14] applied an iterative learning control approach to a NN model to improve the geometrical accuracy of a non-axisymmetric part. Moreover, Hartmann et al. [15] presented a NN model able to generate tool paths in ISF automatically based on the designed geometry. However, few studies have used inverse analyses to optimize the process parameters. This study aims to carry out an inverse analysis to identify optimal process parameters that improve the formability of AA5052-H32 sheets subjected to the ISF process. For this purpose, a series of experimental tests were conducted based on the Box–Behnken design (BBD) to develop a NN model. This model is then trained to identify the relationship between the inputs (tool diameter, spindle speed, step depth, and feed rate) and the outputs (forming angle and thickness reduction). Finally, the machine learning-based model is used to predict input conditions that maximize the forming angle and minimize the thickness reduction simultaneously.
Table 1 Material properties of AA5052-H32 sheet obtained from uniaxial tensile tests

Direction                        | RD    | DD    | TD
Young modulus (GPa)              | 69.53 | 69.39 | 70.22
Yield stress (MPa)               | 165.3 | 154.8 | 156.2
Ultimate tensile strength (MPa)  | 223.8 | 215.1 | 218.3
Elongation (%)                   | 11.2  | 14.4  | 12.1
R-value                          | 0.697 | 0.562 | 0.946
2 Experimental Procedures 2.1 Material This study adopts an aluminum alloy sheet AA5052-H32 with a thickness of 1 mm as the tested material. A series of uniaxial tensile tests were performed following the ASTM E8 standard procedure to obtain the material properties, which are reported in Table 1. In detail, specimens were prepared in the rolling (RD), diagonal (DD), and transverse (TD) directions using a wire cutting machine. The tests were conducted at a constant tensile speed of 3 mm/min with a gauge length of 50 mm. Strains were measured by a 2D digital image correlation system, while stresses were calculated from the measured tensile force.
2.2 Formability in Incremental Sheet Forming In this study, a blank sheet with a size of 130 mm × 130 mm was prepared to be formed using a CNC machine. The geometry of a varying wall-angle conical frustums (VWACF) model, which is designated as the target design for the final product, was shown in Fig. 1. In this model, as the forming depth increases, the forming angle
Fig. 1 a 2D schematic diagram and b 3D schematic diagram for VWACF model
Table 2 Process parameters with their corresponding levels

Parameter            | Notation | Level 1 (−1) | Level 2 (0) | Level 3 (1)
Tool diameter (mm)   | A        | 6            | 8           | 10
Spindle speed (rpm)  | B        | 60           | 120         | 180
Step depth (mm)      | C        | 0.2          | 0.4         | 0.6
Feed rate (mm/min)   | D        | 400          | 800         | 1200
gradually increases from 40° to 90°. The sheet is deformed continuously until fracture occurs. Hence, the maximum forming angle ∅ can be obtained by using Eqs. (1) and (2). The maximum forming angle was adopted to indicate the formability of the test.

H = L − D + r   (1)

∅ = π/2 − arcsin( H / (r + R) )   (2)
Here, D is the depth when fracture happens, R = 50 mm, while L = 38.3 mm, and r is the tool diameter. Experimental tests were conducted until fracture occurred on the specimen. Hence, the deformed specimen was cut out following a center line to measure the thickness of the specimen at a depth of 15 mm from the bottom.
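Equations (1) and (2) reduce to a short helper; the sketch below evaluates the maximum forming angle from a measured fracture depth D, with R = 50 mm and L = 38.3 mm taken from the text and r used exactly as it appears in Eq. (1). The example call with D = 30 mm and r = 4 mm is hypothetical.

```python
import math

def max_forming_angle(D, r, R=50.0, L=38.3):
    """Maximum forming angle (degrees) from fracture depth D, Eqs. (1)-(2)."""
    H = L - D + r                                 # Eq. (1)
    phi = math.pi / 2 - math.asin(H / (r + R))    # Eq. (2); H/(r+R) must lie in [-1, 1]
    return math.degrees(phi)

# Example (hypothetical values): max_forming_angle(D=30.0, r=4.0)
```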
2.3 Experimental Design The BBD is employed in this study to design the experimental tests. Four process parameters including tool diameter, spindle speed, step size in depth, and feed rate were varied to investigate their effect on the formability of ISF indicated by the forming angle and thickness reduction of the sheet at fracture occurrence. According to BBD, three levels for each parameter were set as shown in Table 2. Totally, 27 experimental tests were conducted, and the experimental results were shown in Fig. 2 and Table 3.
3 Back-Propagation Neural Network Model A BPNN model composed of an input layer, a hidden layer, and an output layer is developed in this study, as shown in Fig. 3. Depending on the nonlinearity and complexity of the model, it may have several hidden layers and
Fig. 2 Results of 27 experimental tests for VWACF
neurons. The neurons in each layer sum the values delivered from the previous layer and pass them to the neurons of the next layer. Finally, the value derived from the output layer is compared with the sample data obtained from the test, the error is analyzed, and the weights of each neuron are adjusted by the back-propagation method. The sum of weighted factors linking the input layer to the hidden layer is calculated as follows:

net_k = b_k + wᵀx = b_k + Σ_j w_kj · x_j   (3)

where k and j index the hidden-layer neurons and input variables, respectively, b is a bias vector, w is the weight between each pair of neurons, and x is the input vector. The values derived in the above process are applied to an activation function to impart nonlinearity to the model, thereby predicting the forming angle and thickness reduction in the experiment of the VWACF model. In this study, the sigmoid and identity functions are used as activation functions to estimate the forming angle ∅_angle as follows.

Sigmoid: f(net_k) = 2 / (1 + exp(−2 · net_k)) − 1   (4)

Identity: ∅_angle = f(net_k) = max(0, net_k)   (5)
Furthermore, the mean square error (MSE) was defined in Eq. (6) to compare the difference between the predicted and measured data:

E = (1/Q) Σ_{m=1}^{Q} ( ∅_actual(m) − ∅_predicted(m) )²   (6)
Table 3 Design of experiment and measured response

Exp. No | A  | B  | C  | D  | Maximum forming angle (°) | Thickness reduction (mm)
1       | 0  | −1 | −1 | 0  | 79.704 | 0.561
2       | 0  | 1  | 0  | −1 | 80.969 | 0.536
3       | 1  | 0  | 1  | 0  | 79.974 | 0.559
4       | 0  | 0  | 0  | 0  | 79.541 | 0.569
5       | 0  | 1  | 0  | 1  | 79.974 | 0.561
6       | −1 | 0  | 0  | 1  | 79.541 | 0.567
7       | 0  | −1 | 0  | 1  | 81.169 | 0.542
8       | 0  | 1  | 1  | 0  | 81.169 | 0.548
9       | 1  | −1 | 1  | 0  | 80.769 | 0.558
10      | 1  | 0  | 0  | −1 | 81.169 | 0.551
11      | −1 | 0  | −1 | 0  | 80.704 | 0.558
12      | 1  | 0  | −1 | 0  | 79.974 | 0.557
13      | 0  | 0  | 1  | −1 | 81.169 | 0.541
14      | 1  | 0  | 0  | 1  | 79.974 | 0.565
15      | 1  | 1  | 0  | −1 | 81.624 | 0.535
16      | 0  | 1  | −1 | 0  | 80.704 | 0.548
17      | 0  | 0  | −1 | 1  | 81.169 | 0.538
18      | −1 | 1  | 1  | 1  | 80.704 | 0.545
19      | −1 | 0  | 1  | 0  | 80.704 | 0.547
20      | −1 | −1 | 0  | 0  | 80.769 | 0.56
21      | 0  | −1 | 1  | 0  | 80.704 | 0.552
22      | 0  | −1 | 0  | −1 | 81.169 | 0.54
23      | 0  | 0  | −1 | −1 | 79.704 | 0.563
24      | −1 | −1 | −1 | −1 | 79.974 | 0.558
25      | 1  | 0  | −1 | 1  | 79.974 | 0.55
26      | −1 | 0  | 0  | 0  | 81.169 | 0.549
27      | 0  | 0  | 1  | 1  | 79.704 | 0.56
where Q, ∅_actual, and ∅_predicted are the number of data points, the measured forming angle, and the calculated forming angle, respectively. The LM algorithm, one of the back-propagation techniques, was used to update the BPNN model. According to the LM technique, the weight of each neuron is updated as follows:

w_{i+1} = w_i − [JᵀJ + μ_i · diag(JᵀJ)]⁻¹ Jᵀ n   (7)
In this equation, w is the weight of each neuron, i is the number of repetitions, J is the Jacobian matrix, μ is the damping factor, and n is the residual value between the actual forming angle and the forming angle predicted by the neural network. In
Fig. 3 Schematic of the developed back-propagation neural network
this case, the Jacobian matrix is defined as follows using the backward difference method:

J = ∂r(w)/∂w ≈ (r_i − r_{i−1}) / Δw   (8)
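One update of Eqs. (7) and (8) can be written generically with a backward-difference Jacobian, as sketched below. This is an illustrative implementation, not the authors' training code; residual_fn, the step size eps, and how μ is scheduled are assumptions.

```python
import numpy as np

def lm_step(w, residual_fn, mu, eps=1e-6):
    """One Levenberg-Marquardt update per Eq. (7), Jacobian per Eq. (8)."""
    r = residual_fn(w)                       # residual vector n at current weights
    J = np.empty((r.size, w.size))
    for j in range(w.size):
        w_back = w.copy()
        w_back[j] -= eps                     # backward difference in weight j
        J[:, j] = (r - residual_fn(w_back)) / eps
    JtJ = J.T @ J
    A = JtJ + mu * np.diag(np.diag(JtJ))     # damped normal equations
    return w - np.linalg.solve(A, J.T @ r)   # w_{i+1} of Eq. (7)
```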
The input variables in this study are four designated process parameters; meanwhile, the outputs are two indicators for ISF formability, i.e., the forming angle and thickness reduction. After various trials of machine learning parameters, it appears that at least ten neurons in the hidden layer should be employed to study ISF process. In this model, we randomly set 19 of the 27 data as the training materials (training set of 70%) and four data as the validations (validation set and test set of 15%). Figure 4 shows the correlation between the experimentally measured forming angle and thickness reduction and the BPNN predictions. As shown in this figure, the correlation coefficients of forming angle and thickness reduction are R2 = 0.99769
Fig. 4 Correlation coefficients of training sets for a forming angle and b thickness reduction
and R2 = 0.97546, respectively, which indicate that the BPNN model is able to provide reasonable prediction for output parameters.
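A network matching this description (four inputs, a hidden layer of at least ten tansig neurons, two outputs, MSE loss) could be sketched in Keras as below; note that Eq. (4) is the tansig function, which is mathematically equivalent to tanh. Keras has no built-in Levenberg-Marquardt optimizer, so Adam stands in here as an assumption, and the 70/30 split mirrors the text.

```python
import numpy as np
from tensorflow import keras

# 4 process parameters in, 2 responses out (forming angle, thickness reduction).
model = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(10, activation="tanh"),   # tansig hidden layer, Eq. (4)
    keras.layers.Dense(2, activation="linear"),  # linear output layer
])
model.compile(optimizer="adam", loss="mse")      # MSE objective of Eq. (6)

# X: (27, 4) coded factors from Table 3; Y: (27, 2) measured responses.
# history = model.fit(X, Y, validation_split=0.3, epochs=2000, verbose=0)
```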
4 Multi-objective Optimization This study is designed to achieve the minimum thickness reduction while maximizing the forming angle in the ISF process for AA5052 sheet. These multi-objective targets can only be achieved through multi-objective optimization. Because improving one objective necessarily affects the other, the multi-objective optimization procedure does not have a single solution but presents a series of solutions on a Pareto front, called the non-dominated solutions. A genetic algorithm (GA) is adopted in this section to search for the Pareto front solutions. The developed GA model encodes the four process parameters in each individual, and many individuals together form a population. The population size used by the GA was 50, with crossover and mutation rates of 0.8 and 0.01, respectively. The maximum number of generations is 500 in order to ensure that the algorithm runs to completion. The optimized Pareto front obtained after 102 iterations of the GA is shown in Fig. 5. Each point in this figure represents a specific optimal solution, for which the corresponding input values can be selected according to the requirements of the part geometry. The differences between the maximum and minimum optimal values of the forming angle and thickness reduction were 3% and 7%, respectively. Fig. 5 Optimal Pareto front derived from GA with BPNN
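The Pareto search can be sketched as a small GA over the coded factor space, using the trained network as the objective evaluator. The selection scheme below (a freshly weighted sum each generation, followed by a non-dominated filter) is a simplification chosen for brevity, not the authors' exact algorithm; the population size, generation count, and mutation rate follow the text, while crossover is applied unconditionally here.

```python
import numpy as np

rng = np.random.default_rng(0)

def dominates(a, b):
    """Objectives: maximize forming angle a[0], minimize thinning a[1]."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

def pareto_front(X, Y):
    keep = [i for i in range(len(Y))
            if not any(dominates(Y[j], Y[i]) for j in range(len(Y)) if j != i)]
    return X[keep], Y[keep]

def ga_pareto(predict, pop=50, gens=500, p_mut=0.01):
    """predict: trained surrogate, (n, 4) coded factors -> (n, 2) responses."""
    X = rng.uniform(-1.0, 1.0, (pop, 4))
    for _ in range(gens):
        Y = predict(X)
        # A fresh random weight each generation spreads the population
        # along the trade-off curve between the two objectives.
        w = rng.random()
        fitness = w * Y[:, 0] - (1.0 - w) * Y[:, 1]
        parents = X[np.argsort(fitness)[pop // 2:]]      # keep the better half
        i, j = rng.integers(0, len(parents), (2, pop))
        X = 0.5 * (parents[i] + parents[j])              # arithmetic crossover
        mut = rng.random(X.shape) < p_mut
        X[mut] += rng.normal(0.0, 0.1, mut.sum())        # Gaussian mutation
        X = np.clip(X, -1.0, 1.0)                        # stay in coded range
    return pareto_front(X, predict(X))

# Usage with the Keras sketch above:
# front_X, front_Y = ga_pareto(lambda X: model.predict(X, verbose=0))
```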
5 Conclusions and Perspectives This study developed a non-physical BPNN model to estimate the relation between the inputs, i.e., the process parameters (tool diameter, spindle speed, step depth, feed rate), and the outputs (maximum forming angle and thickness reduction). Based on this relationship, optimal Pareto front solutions were derived using GA optimization, which can be used as a reasonable design guide for practical applications of AA5052 sheet in the ISF process. The results show the potential of using a BPNN for quickly predicting the material's response during ISF processes. Future work should compare the derived results with those obtained from conventional optimization methods, such as the response surface method, to validate its applicability in practice. Acknowledgements This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2019R1A2C1011224). The support of Thu Dau Mot University for QTP, HQN, and VXT within the project "Modelling and Simulation in the Digital Age – MaSDA" is greatly appreciated.
References

1. Ham M, Jeswie J (2006) Single point incremental forming and forming criteria for AA3003. CIRP Annals-Manuf Technol 55(1):241–244
2. Jackson K, Allwood J (2009) The mechanics of incremental sheet forming. J Mat Proc Tech 209(3):1158–1174
3. Gatea S, Ou H, McCartney G (2016) Review on the influence of process parameters in incremental sheet forming. Int J Adv Manuf Tech 87:479–499
4. Do VC, Kim YS (2017) Effect of hole lancing on the forming characteristic of single point incremental forming. Proc Eng 184:35–42
5. Do VC, Pham QT, Kim YS (2017) Identification of forming limit curve at fracture in incremental sheet forming. Int J Adv Manu Tech 92(9–12):4445–4455
6. Lu B, Fang Y, Xu DK, Chen J, Ai S, Long H, Ou H, Cao J (2015) Investigation of material deformation mechanism in double side incremental sheet forming. Int J Mach Tools Manuf 93:37–48
7. Gottmann A, Diettrich J, Bergweiler G, Bambach M, Hirt G, Loosen P, Poprawe R (2011) Laser-assisted asymmetric incremental sheet forming of titanium sheet metal parts. Prod Eng Res Devel 5:263–271
8. Bao W, Chu X, Lin S, Gao J (2015) Experimental investigation of formability and microstructure of AZ31B alloy in electropulse-assisted incremental forming. Mater Des 87:632–639
9. Xiao X, Kim CI, Lv XD, Hwang TS, Kim YS (2019) Formability and forming force in incremental sheet forming of AA7075-T6 at different temperatures. J Mech Sci Tech 33:3795–3802
10. Do VC, Nguyen DT, Cho JH, Kim YS (2016) Incremental forming of 3D structured aluminum sheet. Int J Prec Eng Manuf 17:217–223
11. Nguyen DT, Nguyen TH, Bui NT, Nguyen TD (2012) FEM study to predict springback of embossing and wave shapes on formability of stamping process for multi-hole etching metal foil using SUS316L material. ASEAN Eng J Part A 2(2):43–50
12. Kim YS, Kim JJ, Do VC, Yang SH (2019) FEM simulation for the single point incremental forming of CFRP prepreg. In: Proceedings of the Numiform 2019 conference, June 23–27, 2019, New Hampshire, USA
594
Q. T. Pham et al.
13. Ambrogio G, Filice L, Guerriero F, Guido R, Umbrello D (2011) Prediction of incremental sheet forming process performance by using a neural network approach. Int J Adv Manuf Tech 54:921–930 14. Fiorentino A, Feriti GC, Giardini C, Ceretti E (2015) Part precision improvement in incremental sheet forming of not axisymmetric parts using an artificial cognitive system. J Manuf Sys 35:215–222 15. Hartmann C, Opritescu D, Volk W (2019) An artificial neural network approach for tool path generation in incremental sheet metal free-forming. J Intell Manuf 30:757–770
Classification of Parkinson’s Disease-Associated Gait Patterns Khang Nguyen, Jeff Gan Ming Rui, Binh P. Nguyen, Matthew Chin Heng Chua, and Youheng Ou Yang
Abstract Parkinson's disease (PD) is a progressive neurological disorder that affects the movement of millions of people worldwide. Many methods have been developed to identify and diagnose PD in patients. However, most of these approaches require extensive setup and involve costly equipment, such as depth cameras or devices worn on the body. In this study, we investigate the use of vertical ground reaction force (VGRF) sensor readings to classify PD subjects from non-PD subjects. This presents a low-cost and straightforward approach to identifying PD through the gait characteristics associated with the disease. By fusing together data points from individual sensors to create an ensemble, and combining it with a compact feed-forward neural network capable of accurately identifying the gait characteristics of PD from the sensory input, we present a novel approach to classifying the gait characteristics of PD that is feasible in a clinical setting. We tested our model on a public dataset from PhysioNet consisting of VGRF sensor readings of PD and non-PD patients. Preprocessing was done by extracting several meaningful features from the raw data, which was then split and normalized. Classification was done using a multilayer feed-forward artificial neural network. Experimental results show that this model achieved 84.78% accuracy on the PhysioNet dataset, which is a significant improvement over various state-of-the-art models. K. Nguyen (B) Institute of Science and Information Technology, Hanoi, Vietnam e-mail: [email protected] J. G. M. Rui · M. C. H. Chua National University of Singapore, Singapore, Singapore e-mail: [email protected] M. C. H. Chua e-mail: [email protected] B. P. Nguyen Victoria University of Wellington, Wellington, New Zealand e-mail: [email protected] Y. O. Yang Singapore General Hospital, Singapore, Singapore e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_56
Keywords Gait · Parkinson’s disease · VGRF sensors · Neural network
1 Introduction Parkinson's disease (PD) is the second most common neurological disorder, affecting millions of people worldwide [1, 2]. Patients with PD commonly have motor symptoms such as akinesia (absent movement), bradykinesia (slowed movement), tremors, postural instability, rigidity, and gait impairments [3]. By observing and analyzing these changes in gait, the severity of PD-associated gait patterns can be assessed, with the potential for earlier and more accurate diagnosis of PD [4]. With the rise of new technologies, many devices are available for efficient measurement of patients' gaits. These devices can be broadly classified into two categories: wearable sensors (WS) and non-wearable sensors (NWS) [5]. NWS systems usually require a controlled environment with sensors or markers to capture gait information. In contrast, WS systems allow the subjects to go about their daily lives while capturing their gait data. This study focuses on the gait information from VGRF measurements derived from WS, in order to identify gait characteristics of PD in patients.
2 Literature Review Several studies have attempted to categorize abnormal gait patterns using different wearable sensors such as accelerometers and gyroscopes. This is a convenient and low-cost approach that can provide informative insights for health-related applications [6]. Joshi et al. [7] used wavelet analysis and support vector machines (SVM) to classify Parkinson's subjects and healthy people using gait cycle comparisons, achieving an accuracy of 90.32%. Wu et al. [8] used linear regression analysis and SVM to achieve an accuracy of 84.48% for gait classification. In [9], gait cycle repetitions were exploited to distinguish healthy people from PD subjects. Several other research works are summarized in Table 1 to show the different ways in which gait cycles and patterns captured by wearable sensors have been used to classify subjects with gait characteristics of PD from healthy subjects. Most of the studies use time-domain and frequency-domain features captured using wearable sensors to diagnose PD. However, these features cannot be easily linked to a clinical indicator. In this paper, a novel approach for PD classification using clinical features extracted solely from VGRF sensors is developed to address these shortcomings.
Table 1 Summary of related works [10]

| Years | References | Sensors | Method | Validation |
|---|---|---|---|---|
| 2010 | Wu and Krishnan [11] | Wearable force sensors | LS-SVM | Leave-one-out |
| 2011 | Sarbaz et al. [12] | Wearable force sensors | Nearest mean scaled | 70/30 |
| 2012 | Daliri [13] | Wearable force sensors | SVM | 50/50 |
| 2014 | Dror et al. [14] | Non-wearable 3-D camera sensor | SVM | Leave-one-out |
| 2016 | Jane et al. [15] | Wearable force sensors | Q-BTDNN | CV |
| 2017 | Cuzzolin et al. [16] | Wearable IMU sensors | HMM | CV |
| 2017 | Açici et al. [17] | Wearable force sensors | RF | tenfold CV |
| 2017 | Joshi et al. [7] | Wearable force sensors | SVM | Leave-one-out |
| 2017 | Wu et al. [8] | Wearable force sensors | SVM | Leave-one-out |
| 2018 | Khoury et al. [9] | Wearable force sensors | KNN, CART, RF, SVM, K-means, GMM | tenfold CV |
3 Methodology The proposed pipeline of our method is illustrated in Fig. 1. Details such as gait data preparation, feature extraction and selection, as well as our custom classification techniques, are elaborated in the following subsections. The dataset used in this work, namely the gaitpdb dataset, was provided by Hausdorff et al. [18, 19]. This dataset has been used in several studies [7, 20, 21] and provides a benchmark for comparing this work's model against the work of other researchers. The dataset is made up of gait measurements from 93 patients with idiopathic PD (mean age: 66.3 years; 59 (63%) men, 34 (37%) women) and 73 healthy controls (mean age: 66.3 years; 40 (55%) men, 33 (45%) women). The subjects were told to walk at their normal speed, as naturally as possible, for about 2 min on flat, even ground. Eight Ultraflex Computer Dyno Graphy sensors (Infotronic Inc., USA) under each foot recorded the force (in Newtons) exerted by the foot as a function of time. The output of these 16 sensors was digitized at a time interval of 0.01 s. The records also include the sum of the eight sensor outputs per foot. Figure 2 shows the relative positions of the eight sensors on each foot [22], and Fig. 3 summarizes the VGRFs measured on the left foot and right foot of the subjects.
Fig. 1 Proposed architecture pipeline: sensor readings → data extraction and cleaning → feature extraction and ensemble → model training and optimization → classification results and evaluation
Fig. 2 Relative positioning of VGRF sensors
Fig. 3 VGRFs measured on the left foot (blue) and the right foot (red) of (a) a healthy subject and (b) a subject with PD
3.1 Pre-processing In this study, instead of using the individual sensor signals, which are more prone to noise, the outputs of the eight sensors on each foot were summed to obtain two combined signals (left and right foot). This allows stance and swing phase detection at higher precision and better reflects the overall changes in gait dynamics [23]. The subjects were told to perform a round trip [24]. To account for the anomalies in gait during the start-up and stopping of the experiment, we removed the first and last 20 s of VGRF data. Afterward, the time series was divided into individual stride cycles. A complete stride cycle is the period during which a foot touches the ground, goes off the ground, and touches the ground again. In the VGRF data, it is represented by a series of nonzero combined VGRF values across all eight sensors (the foot touching the ground), followed by a series of zero combined VGRF values (the foot leaving the ground). Each gait cycle is then divided into its stance and swing phases, to allow the relevant information to be abstracted in later parts of the work. Afterward, to account for signal fluctuations, a 10-point median filter was applied. Median filtering replaces a given sample by the median of the other samples in a window around that sample [25]. The result $y(n)$ for an input vector $x(n)$ is the output of a median filter of length $l$. In the case where $l$ is odd, the median filter can be defined as

$$y(n) = \operatorname{median}\big(x(n-k : n+k)\big), \quad k = \frac{l-1}{2} \quad (1)$$
Fig. 4 Data preprocessing: (a) raw VGRF data, (b) processed VGRF data [10]
Essentially, the input sample is replaced by the middle value of a sorted list of the neighbors of the original sample. This reduces the influence of outliers in the signal by effectively ignoring them. Figure 4 shows the results after these preprocessing steps were done.
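A minimal Python sketch of this preprocessing is given below. It is only an illustration under stated assumptions: scipy's `medfilt` requires an odd kernel, so a length of 11 approximates the 10-point filter, and the zero-force threshold `eps` is an assumed value:

```python
import numpy as np
from scipy.signal import medfilt

FS = 100  # gaitpdb VGRF signals are sampled every 0.01 s

def preprocess(total_force):
    """Sum-of-sensors VGRF for one foot: trim start-up/stopping anomalies
    and smooth with a median filter as described above."""
    trimmed = np.asarray(total_force, dtype=float)[20 * FS : -20 * FS]
    return medfilt(trimmed, kernel_size=11)

def stride_cycles(force, eps=1e-6):
    """Split the signal into stride cycles: a run of nonzero force (stance)
    followed by a run of (near-)zero force (swing), ground contact to
    ground contact."""
    on_ground = force > eps
    cycles, start = [], None
    for t in range(1, len(force)):
        if on_ground[t] and not on_ground[t - 1]:   # foot touches down
            if start is not None:
                cycles.append(force[start:t])       # previous full cycle
            start = t
    return cycles
```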
3.2 Feature Selection Feature selection is an important step in this study, as the neural network's performance depends heavily on the quality of the features used as inputs. Feature selection is a dimensionality reduction technique which reduces training time as well as overfitting of the model [26]. Previous studies have shown that, among all the potential spatiotemporal features available from the VGRF signals, only a few significant features are required to capture the important details of the subject's gait [27–29]. Data compression was applied on the processed gait data to obtain several statistical functions, namely maximum, minimum, median, mean, standard deviation, kurtosis, and skewness, for both the left and the right foot. In addition, previous studies [30, 31] showed that swing time variability and stride time variability are important for distinguishing a PD patient's gait from normal gait, hence they are included in our data fusion ensemble. This reduces the number of input variables to just 16, while still being a good representation of the subject's gait (Table 2).

Table 2 Statistical features

| Statistical parameter | Description |
|---|---|
| Coefficient of variation of swing time | Normalized factor of the mean divided by the standard deviation |
| Coefficient of variation of stride time | Normalized factor of the mean divided by the standard deviation |
| Minimum (Newton) | Minimum force value in the dataset |
| Maximum (Newton) | Maximum force value in the dataset |
| Median (Newton) | The midpoint of the frequency distribution |
| Mean (Newton) | The average of the force values in the dataset |
| Mean kurtosis (Second) | Mean kurtosis of the gait cycle duration |
| Mean skewness (Second) | Mean skewness of the gait cycle duration |
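A sketch of this per-foot feature computation in Python follows. It is an illustration, not the authors' code: the coefficient of variation is taken as standard deviation over mean, and the per-cycle duration arrays are assumed to come from the stride segmentation step:

```python
import numpy as np
from scipy.stats import kurtosis, skew

def gait_features(force, swing_times, stride_times):
    """Eight statistics per foot (16 inputs for both feet), following Table 2.
    force: processed VGRF samples; *_times: per-cycle durations in seconds."""
    def cv(x):
        return np.std(x) / np.mean(x)          # coefficient of variation
    return np.array([
        cv(swing_times),                       # swing time variability
        cv(stride_times),                      # stride time variability
        force.min(), force.max(),              # min/max force (Newton)
        np.median(force), force.mean(),        # median/mean force (Newton)
        kurtosis(stride_times),                # kurtosis of cycle duration
        skew(stride_times),                    # skewness of cycle duration
    ])
```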
3.3 Network Design and Architecture Recently, deep neural networks have found their way into the field of gait recognition as well as the classification of neuro-degenerative diseases. In [32], researchers used multilayer perceptron (MLP) combination classifiers to diagnose patients with knee osteoarthritis, using joint angle and time–distance data as features. Jung et al. [33] used an MLP and a nonlinear autoregressive network with external inputs (NARX) to accurately classify gait phases in exoskeleton robots. Traditional machine learning algorithms such as support vector machines (SVM) and random forests (RF) require domain expertise in order to select the best features to feed the classifier. With deep learning, however, the neural network is able to learn which features are more important for classifying PD. Hence, a neural network is used in this study to automate the process of finding the correct features. Our focus is instead on creating different ensembles from the raw gait data to develop meaningful features which can be used as inputs for the neural network. Figure 1 summarizes the different stages of our research, from data preprocessing to the development of the model and eventually the trained classifier to classify gait characteristics of PD.
Fig. 5 Proposed neural network architecture: Features (16) → Dense (13) + ReLU → Dropout (0.5) → Dense (9) + ReLU → Dense (1) + Sigmoid → Output
The neural network used in this study is a multilayer feed-forward network trained with the back propagation (BP) algorithm. BP is a generalization of the least-mean-squares algorithm that minimizes the mean squared error (MSE) between the ground-truth value and the output of the network by modifying the network's weights [34]. The architecture's details are summarized in Fig. 5. To prevent over-fitting, an early stopping callback mechanism was used, while still giving the training process an opportunity to find additional improvements.
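The architecture of Fig. 5 can be sketched in Keras as follows. This is a minimal reconstruction from the figure, not the authors' code; the optimizer choice, patience value and training hyperparameters are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sketch of Fig. 5: 16 input features, two hidden ReLU layers with dropout,
# sigmoid output for PD / non-PD; BP minimizes the MSE as described above.
model = tf.keras.Sequential([
    layers.Input(shape=(16,)),
    layers.Dense(13, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(9, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])

# Early stopping callback mentioned in the text; patience is assumed
stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                        restore_best_weights=True)
# model.fit(X_train, y_train, validation_split=0.2, epochs=200,
#           callbacks=[stop])
```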
3.4 Cross-Validation Cross-validation is a technique to evaluate machine learning models by training several models on subsets of the original dataset and evaluating them on the complementary subset of the data. It can detect over-fitting well, especially in our use case, where the size of the dataset is relatively small.

Since $\Lambda_i > 0$, the root of Eq. (18) is

$$e_i(t) = e_i(0)\,e^{-\Lambda_i t} \to 0 \quad (19)$$
The uncertainty of the function $\bar{h}(\nu)$ in Eq. (15) is the cause of the reduction of control quality. If this uncertain nonlinear term can be compensated, the control quality is improved. From the Stone–Weierstrass theorem [1], a neural network can be used to approximate a nonlinear function to the intended accuracy. In order to approximate the function $\bar{h}(\nu)$, the neural network structure is selected as

$$\bar{h}(\nu) = W\sigma + \varepsilon = \hat{h}(\nu) + \varepsilon \quad (20)$$

where $W$ is an $n_a \times n_a$ square matrix, $\hat{h}(\nu) = [\hat{h}_1, \hat{h}_2, \ldots, \hat{h}_{n_a}]^T = W\sigma$ is an approximation of $\bar{h}(\nu)$, $\varepsilon$ is the error of the approximation, and $\sigma$ is the impact function. If $\|\bar{h}(\nu)\| \le h_0$, then the error is bounded as $\|\varepsilon\| \le \varepsilon_0$. Denoting by $w_i$ the $i$th column vector of the matrix $W$, $\hat{h}$ is rewritten as

$$\hat{h} = [\hat{h}_1, \hat{h}_2, \ldots, \hat{h}_{n_a}]^T = W\sigma = \sum_{i=1}^{n_a} \sigma_i w_i \quad (21)$$

In this research, the artificial neural network is selected to be a radial basis function (RBF) neural network [1], whose structure is shown in Fig. 4. It has been proved that such a structure can approximate a nonlinear function to the intended accuracy with a finite number of neurons. The impact function $\sigma_i$ is selected to be the Gaussian

$$\sigma_i = \exp\left(-\frac{(v_i - C_i)^2}{\chi_i^2}\right) \quad (22)$$

Fig. 4 RBF neural network
where $C_i$ is the center of gravity and $\chi_i^2$ is the normalization parameter of the Gaussian law. The approximate elements of the function $\hat{h}$ are written as

$$\hat{h}_i = \sum_{j=1}^{n_a} \sigma_j w_{ji}, \quad i = 1, \ldots, n_a \quad (23)$$
in which $w_{ji}$ is the weight factor connecting the hidden neuron layer and the output of the approximating neural network. The control problem is then expressed as: find the control law $u$, together with the learning law of the neural network weights $w_{ji}$, such that $\nu \to 0$ and the error $\varepsilon \to 0$; it is then inferred that $q_a(t) \to q_{ad}(t)$. There is a theorem for this problem as follows:

Theorem. The trajectory $q_a(t)$ of the dynamics system (10) with the neural network (21), (23) and the sliding plane (16) will track the given trajectory $q_{ad}(t)$ with the error $e(t) = q_a(t) - q_{ad}(t) \to 0$ if the control law $u$ and the learning law of the neural network are selected as

$$u = \bar{M}(s)\ddot{q}_{ad} + \bar{C}(s,\dot{s})\dot{q}_{ad} + \bar{g} + \bar{d} - \bar{M}(s)\Lambda\dot{e}_a - \bar{C}(s,\dot{s})\Lambda e_a - K\nu - \gamma\frac{\nu}{\|\nu\|} + (1+\eta)W\sigma \quad (24)$$

$$\dot{w}_i = -\eta\,\sigma_i\,\nu \quad (25)$$

where $K$ is an $n_a \times n_a$ positive definite matrix, and $\eta$ and $\gamma$ are freely selected parameters satisfying $\eta > 0$, $\gamma > 0$.

Proof. The theorem is proved by applying Lyapunov's direct method for asymptotic stability. A positive definite function is selected as

$$V(t) = \frac{1}{2}\left(\nu^T \bar{M}\nu + \sum_{i=1}^{n_a} w_i^T w_i\right) \quad (26)$$

Since $\bar{M}(s)$ is a symmetric positive definite matrix, $V(t) > 0$ for all $\nu \ne 0$, $w_i \ne 0$, and $V(t) = 0$ if and only if $\nu = 0$, $w_i = 0$. Differentiating $V(t)$ with respect to time gives

$$\dot{V}(t) = \nu^T \bar{M}\dot{\nu} + \frac{1}{2}\nu^T \dot{\bar{M}}\nu + \sum_{i=1}^{n_a} w_i^T \dot{w}_i \quad (27)$$

Using the skew-symmetry property of the matrix $\dot{\bar{M}}(s) - 2\bar{C}(s,\dot{s})$,

$$\nu^T\big(\dot{\bar{M}} - 2\bar{C}\big)\nu = 0 \;\Rightarrow\; \nu^T \dot{\bar{M}}\nu = 2\,\nu^T \bar{C}\nu \quad (28)$$

By substituting Eq. (28) into Eq. (27), $\dot{V}(t)$ is rewritten as

$$\dot{V}(t) = \nu^T\big(\bar{M}\dot{\nu} + \bar{C}\nu\big) + \sum_{i=1}^{n_a} w_i^T \dot{w}_i \quad (29)$$

Selecting $u = \tau_a$, from the two Eqs. (24) and (25) it is inferred that

$$\bar{M}\dot{\nu} + \bar{C}\nu = -K\nu - \gamma\frac{\nu}{\|\nu\|} + (1+\eta)W\sigma - \bar{h}(\nu) \quad (30)$$

By substituting Eq. (30) into Eq. (29), $\dot{V}(t)$ is then formulated as

$$\dot{V}(t) = \nu^T\left(-K\nu - \gamma\frac{\nu}{\|\nu\|} + \eta W\sigma - \varepsilon\right) + \sum_{i=1}^{n_a} w_i^T \dot{w}_i \quad (31)$$

Noticing the learning law of the neural network in Eq. (25), the last term in Eq. (31) is rewritten as

$$\sum_{i=1}^{n_a} w_i^T \dot{w}_i = -\eta\sum_{i=1}^{n_a} w_i^T \nu\,\sigma_i = -\eta\,\nu^T W\sigma \quad (32)$$

Substituting Eq. (32) into Eq. (31), $\dot{V}(t)$ becomes

$$\dot{V}(t) = -\nu^T K\nu - \gamma\,\nu^T\frac{\nu}{\|\nu\|} - \nu^T\varepsilon = -\nu^T K\nu - \gamma\|\nu\| - \nu^T\varepsilon \quad (33)$$

Selecting $\gamma = \delta + \varepsilon_0$ with $\delta > 0$, it is obtained that

$$\dot{V}(t) = -\nu^T K\nu - \delta\|\nu\| - \varepsilon_0\|\nu\| - \nu^T\varepsilon \quad (34)$$

Because $\|\varepsilon\| < \varepsilon_0$, $\dot{V}(t) < 0$ for all $\nu \ne 0$, and $\dot{V}(t) = 0$ if and only if $\nu = 0$. By Lyapunov's direct method for asymptotic stability, $\nu \to 0$ as $t \to \infty$. Therefore, it is concluded that $e_a(t) = q_a(t) - q_{ad}(t) \to 0$.
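To make the learning law concrete, the following minimal Python sketch implements the Gaussian impact functions of Eq. (22) and one explicit Euler step of the weight update (25). The one-hidden-node-per-component layout, the step size `dt`, and the parameter values are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def rbf(v, centers, chi):
    """Gaussian impact functions of Eq. (22):
    sigma_i = exp(-(v_i - C_i)^2 / chi_i^2)."""
    return np.exp(-((v - centers) ** 2) / chi ** 2)

def h_hat(W, v, centers, chi):
    """Network output of Eq. (21): h_hat = W @ sigma (w_i are columns of W)."""
    return W @ rbf(v, centers, chi)

def update_weights(W, v, centers, chi, eta, dt):
    """One Euler step of the learning law (25): dw_i/dt = -eta * sigma_i * nu,
    applied column by column (v plays the role of the sliding variable nu)."""
    sigma = rbf(v, centers, chi)
    dW = -eta * np.outer(v, sigma)   # column i receives -eta * sigma_i * nu
    return W + dt * dW
```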
4 Numerical Simulation of the Control Problem for the Rostock Delta Parallel Robot From the robot model and the control law presented above, the simulation diagram for the parallel robot control is established as shown in Fig. 5. This section presents the numerical simulation of the control problem for the 3RUS parallel robot using MATLAB/Simulink.
Fig. 5 Simulation diagram for controlling the parallel robot
robot’s parameters and its trajectory are selected as follows: L = 0.242; R = 0.16; r = 0.03 (m); α1 = 0;α2 = 2π/3; α3 = 4π/3; m 1 = 0.12; m 2 = 2 × 0.15; m P = 0.2(kg); I1 = diag(0, m 1 L2 /12, m 1 L2 /12); x dP = −0.05 cos(2π t); y dP = 0.05 sin(2π t); z dP = −0.5 (m) The initial conditions with the position of the center point of the mobile platform are given as x P (t = 0) = −0.04; y P (t = 0) = −0.01; z P (t = 0) = −0.495 (m). The control parameters are given as K = diag(15, 15, 15); = diag(10, 10, 10); η = 1.1;γ = 200; χ1 = 1; χ2 = 2; χ3 = 3; c1 = 0.01; c2 = 0.02; c3 = 0.03. Two cases are simulated in this paper that the first one is the accurate robot without noise, and the second one is the robot with error and noise in the operating process. In the second case, the error and noise are given putatively as d(t) =
1 T [sin 20t cos 20t sin 20t cos 20t . . .]1×12 3
M(s) = 20%M(s); C(s, s˙) = 20%C(s, s˙); g(s) = 20%g(s) The simulation results of tracking error for the position error of the active joints and the center point of the mobile platform are shown in Figs. 6, 7, 8 and 9, where the abscissas and the ordinates present the time and the errors, respectively. Figures 6 Fig. 6 Position error of the active joints in the case of the accurate robot without noise
0.02
[m ]
0.01
0
e
e
a1
-0.01
0
0.2
e
a2
0.4
0.6
t[s]
a3
0.8
1
Tracking Control of Parallel Robot Manipulators Using … Fig. 7 Position error of the center point of the mobile platform in the case of the accurate robot without noise
639
0.015
-5
x 10
[m]
0.01 0.005
5 0 -5 0.6
0.7
ex
ey
0.8
0 -0.005 -0.01
0
0.2
0.4
0.6
ez 0.8
1
t[s] Fig. 8 Position error of the active joints in the case of the robot with modeling error and noise
0.02
[m ]
0.01
0
e
e
a1
-0.01
0
0.2
e
a2
0.4
0.6
a3
0.8
1
t[s] Fig. 9 Position error of the center point of the mobile platform in the case of the robot with modeling error and noise
0.015
-4
x 10
5
0.01
0
0.005
[m]
-5 0.6
0.7
0.8
0
ex
-0.005 -0.01
0
0.2
ey
0.4
0.6
t[s]
ez 0.8
1
640
V. Le Huy and N. D. Dzung
and 7 present the results in the case of the accurate robot without noise, while Figs. 8 and 9 present the results in the case of the robot with modeling error and noise. The plots show that the tracking errors of the active joints and of the center point of the mobile platform approach zero after about 0.2 s. The control method based on the sliding mode control law combined with the neural network keeps the position error of the center point of the mobile platform at about $10^{-5}$ m (Fig. 7) when all the robot dynamics parameters are known exactly. When the robot dynamics parameters have an error of about 20% and there is force noise in operation, the error of the center point of the mobile platform is about $10^{-4}$ m (Fig. 9). This error is acceptable.
5 Conclusion The Rostock Delta parallel robot is nowadays developed strongly, especially for 3D printing applications, since its closed-loop structure provides high accuracy. However, this structure makes it difficult to establish the dynamics model for the control problem, because redundant generalized coordinates must be used to write the dynamics equations in explicit analytic form. In addition, the dynamics parameters of this structure are difficult to determine exactly. Therefore, this paper used an RBF neural network to compensate the uncertain nonlinear terms in the dynamics model of the Rostock Delta robot in order to improve the control quality. The stability of the tracking error was proved by applying Lyapunov's direct method for asymptotic stability, and the simulated tracking errors also show stability. However, a comparison of the numerical simulation results with experimental results remains necessary in future work to consolidate and verify this control method. Acknowledgements This research is funded by PHENIKAA University under grant number 042019.03.
Appendix

Movement equations of the Rostock Delta parallel robot:

$$(m_1+m_2)\ddot{d}_i + \tfrac{1}{2}m_2 l\cos\theta_i\cos\gamma_i\,\ddot{\theta}_i - \tfrac{1}{2}m_2 l\sin\theta_i\sin\gamma_i\,\ddot{\gamma}_i - \tfrac{1}{2}m_2 l\sin\theta_i\cos\gamma_i\,\dot{\theta}_i^2 - m_2 l\cos\theta_i\sin\gamma_i\,\dot{\gamma}_i\dot{\theta}_i - \tfrac{1}{2}m_2 l\sin\theta_i\cos\gamma_i\,\dot{\gamma}_i^2 - (m_1+m_2)g + \lambda_{3i} - F_i = 0, \quad i = 1, 2, 3$$

$$\tfrac{1}{2}m_2 l\cos\theta_1\cos\gamma_1\,\ddot{d}_1 + \left[\left(\tfrac{m_2 l^2}{4} - I_{2x} + I_{2y}\right)\cos^2\gamma_1 + I_{2x}\right]\ddot{\theta}_1 + \left(-\tfrac{m_2 l^2}{4} + I_{2x} - I_{2y}\right)\sin 2\gamma_1\,\dot{\gamma}_1\dot{\theta}_1 - \tfrac{1}{2}m_2 g l\cos\theta_1\cos\gamma_1 + l\cos\alpha_1\sin\theta_1\cos\gamma_1\,\lambda_1 + l\sin\alpha_1\sin\theta_1\cos\gamma_1\,\lambda_2 + l\cos\theta_1\cos\gamma_1\,\lambda_3 = 0$$

$$-\tfrac{1}{2}m_2 l\sin\theta_i\sin\gamma_i\,\ddot{d}_i + \left(\tfrac{m_2 l^2}{4} + I_{2z}\right)\ddot{\gamma}_i + \left(\tfrac{m_2 l^2}{8} - \tfrac{I_{2x}}{2} + \tfrac{I_{2y}}{2}\right)\sin 2\gamma_i\,\dot{\theta}_i^2 + \tfrac{1}{2}m_2 g l\sin\theta_i\sin\gamma_i + l(\cos\alpha_i\cos\theta_i\sin\gamma_i + \sin\alpha_i\cos\gamma_i)\lambda_{3i-2} + l(\sin\alpha_i\cos\theta_i\sin\gamma_i - \cos\alpha_i\cos\gamma_i)\lambda_{3i-1} - l\sin\theta_i\sin\gamma_i\,\lambda_{3i} = 0, \quad i = 2, 3$$

$$m_3\ddot{x}_P + \lambda_1 + \lambda_4 + \lambda_7 = 0, \qquad m_3\ddot{y}_P + \lambda_2 + \lambda_5 + \lambda_8 = 0, \qquad m_3\ddot{z}_P + \lambda_3 + \lambda_6 + \lambda_9 + m_3 g = 0$$

and the constraint equations, for $i = 1, 2, 3$:

$$x_P - (R-r)\cos\alpha_i - l\cos\alpha_i\cos\theta_i\cos\gamma_i + l\sin\alpha_i\sin\gamma_i = 0$$
$$y_P - (R-r)\sin\alpha_i - l\sin\alpha_i\cos\theta_i\cos\gamma_i - l\cos\alpha_i\sin\gamma_i = 0$$
$$z_P + d_i + l\sin\theta_i\cos\gamma_i = 0 \quad (35)$$
References
1. Merlet JP (2006) Parallel robots. Springer, Berlin
2. Bell C (2015) 3D printing with delta printers, pp 1–333
3. Murray RM, Li Z, Sastry SS (1994) A mathematical introduction to robotic manipulation. CRC Press, Florida
4. Spong MW, Hutchinson S, Vidyasagar M (2004) Robot dynamics and control. Wiley
5. Siciliano B, Sciavicco L, Villani L, Oriolo G (2009) Robotics: modelling, planning and control. Springer, London
6. Rachedi M, Bouri M, Hemici B (2014) H∞ feedback control for parallel mechanism and application to delta robot. In: Proceedings of the 22nd mediterranean conference on control and automation. Palermo, Italy
7. Rachedi M, Bouri M, Hemici B (2015) Robust control of a parallel robot. In: Proceedings of the international conference on advanced robotics (ICAR). Istanbul, Turkey
8. Rachedi M, Hemici B, Bouri M (2015) Design of an H∞ controller for the delta robot: experimental results. J Adv Rob 29(18):1165–1181
9. Fabian J, Monterrey C, Canahuire R (2016) Trajectory tracking control of a 3 DOF delta robot: a PD and LQR comparison. In: Proceedings of the 23rd IEEE international congress on electronics, electrical engineering and computing (INTERCON). Piura, Peru, pp 1–5
10. Angel L, Viola J (2018) Fractional order PID for tracking control of a parallel robotic manipulator type delta. ISA Trans 79:172–188
11. Lu X, Liu M (2016) Optimal design and tuning of PID-type interval type-2 fuzzy logic controllers for delta parallel robots. Int J Adv Rob Syst 13:1–12
12. Lu X, Zhao Y, Liu M (2017) Self-learning interval type-2 fuzzy neural network controllers for trajectory control of a delta parallel robot. Neurocomputing 283:107–119
13. Boudjedir C, Boukhetala D, Bouri M (2018) Nonlinear PD control of a parallel delta robot: experimental results. In: Proceedings of the international conference on electrical sciences and technologies in Maghreb (CISTEM). Algiers, Algeria, pp 1–4
14. Boudjedir C, Boukhetala D, Bouri M (2018) Nonlinear PD plus sliding mode control with application to a parallel delta robot. J Electr Eng 69:329–336
15. Hernández JME, Sierra HA, Mejía OA, Chemori A, Núñez JHA (2019) An intelligent compensation through B-spline neural network for a delta parallel robot. In: Proceedings of the international conference on control, decision and information technologies (CoDIT 2019). Paris, France, pp 235–240
16. Utkin VI (1992) Sliding modes in control and optimization. Springer, Berlin
17. Liu J, Wang X (2012) Advanced sliding mode control for mechanical system. Tsinghua University Press, Beijing and Springer, Berlin
18. Yang X, Zhu L, Ni Y, Liu H, Zhu W, Shi H, Huang T (2019) Modified robust dynamic control for a diamond parallel robot. IEEE/ASME Trans Mechatron 24(3):959–968
19. Liu J (2013) Radial basis function (RBF) neural network control for mechanical systems. Tsinghua University Press, Beijing and Springer, Berlin
20. Jalon JG, Bayo E (1994) Kinematic and dynamic simulation of multibody systems—the real-time challenge. Springer, New York
21. Shabana AA (2005) Dynamics of multibody systems, 3rd ed. Cambridge University Press, New York
22. Taghirad HD (2013) Parallel robots: mechanics and control. CRC Press
23. Blajer W, Schiehlen W, Schirm W (1994) A projective criterion to the coordinate partitioning method for multibody dynamics. Arch Appl Mech 64:86–98
An IoT-Based Air Quality Monitoring with Deep Learning Model System Harshit Srivastava, Kailash Bansal, Santos Kumar Das, and Santanu Sarkar
Abstract Air pollution occurs when the concentration levels of environmental gases, including CO2, NH3, etc., rise above permissible levels. The Air Quality Index (AQI) is calculated accordingly, and the Central Pollution Control Board (CPCB) defines standard ranges for each pollution level. This paper presents monitoring of the pollution level using a Raspberry Pi 3 based on IoT technology. Temperature, humidity, dew point and wind speed are also monitored, and these parameters are used as datasets for pollution forecasting. The target of this project is then to apply deep learning for the prediction and analysis of the gas sensors' pollution levels, so that the pollution level due to the pollutant gases can be analyzed based on the prediction. Various experiments were performed to validate the developed system for real-time monitoring. We discuss the different methods used in deep learning, i.e., artificial neural networks (ANN), multilayer perceptron (MLP) and recurrent neural networks (RNN), using the LSTM model to analyze and predict multivariate time-series forecasting. Keywords Internet of Things · Raspberry Pi 3 · Recurrent neural networks (RNN) · Long short-term memory (LSTM) · Air Quality Index (AQI)
H. Srivastava (B) · K. Bansal · S. K. Das · S. Sarkar NIT Rourkela, Rourkela 769008, Odisha, India e-mail: [email protected] K. Bansal e-mail: [email protected] S. K. Das e-mail: [email protected] S. Sarkar e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_60
1 Introduction Pollution is a primary concern in the current scenario. As we move toward smart city development, air pollution is an important concern for human health and other living beings. Any change in the composition of air can harm different life forms. Air pollution is a mixture of one or more contaminants in the atmosphere, such as pollutant gases in excess quantities, that can harm human beings, animals and plants. Air pollutants are measured in percentage form or in ppm.
1.1 Existing and Proposed Model A gas sensor is a form of transducer which senses gas particles and transforms them into an electrical signal with a magnitude proportional to the concentration level of an air pollutant. Previously, instruments using analytical techniques were used for monitoring the gas concentrations. These instruments provide adequate accuracy, but they have disadvantages such as bulkiness, high instrument and maintenance costs, slow response time and the requirement of trained operators.
2 Literature Survey Tajne et al. [1] presented an idea for implementing a pollution monitoring system on a WSN to analyze air pollution and to monitor and control the air quality in Nagpur city, providing an alert message system. Al-Ali et al. [2] presented a monitoring system consisting of a mobile data acquisition unit integrating an MCU, a sensor array, a GPS module, a GPRS module and a server. It gathers air pollutant levels, uploads them through the GPRS/GPS modem to the main server, and overlays them on Google Maps to display real-time air pollutant levels. Shaban et al. [3] focused on applying ML algorithms to build forecasting models analyzing the concentration levels of NO2, O3 and SO2. The algorithms are ANN, SVM and M5P model trees, and the performance measures are based on accuracy and root-mean-square error (RMSE). Khelifi et al. [4] presented merging CNN and RNN with IoT and information-centric networking. CNN is used to extract reliable data from the environment, and RNN is integrated to model the data in real-time applications. CNN has four layer steps, i.e., local receptive fields, shared weights, pooling and activation. Li et al. [5] discussed an M-BP-based algorithm to retrieve and complete missing data. They proposed a deep-learning-based big data framework for robust urban data, which uses the LSTM model to resolve the missing data.
Fig. 1 System architecture model
3 Methodology 3.1 Architecture Model The proposed system model is as follows. Figure 1 shows the overall architecture model of the whole system. The block diagram depicts the hardware requirements and the working principle of the proposed system model [6, 8]. The device is set up to acquire the environmental data and compare it against base standard values; it collects data based on the set values and displays the output.
3.2 Flowchart Model The proposed system comprises hardware and software sections. The gas sensors are calibrated [7] to sense the air pollutants accurately. The coding is done in Python, and the data are stored in the cloud using the Firebase server database. Figure 2 shows the flowchart model. The dataset is then used for pollution prediction based on deep learning algorithms, here an RNN-based LSTM.
Fig. 2 Flowchart model diagram
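The acquisition loop of this flowchart can be sketched in Python as follows. The service-account path, database URL, sensor helper and sampling interval are all placeholders for the actual project configuration, not values from the paper:

```python
import time
import firebase_admin
from firebase_admin import credentials, db

# Placeholder credentials and database URL for the Firebase project
cred = credentials.Certificate("serviceAccountKey.json")
firebase_admin.initialize_app(
    cred, {"databaseURL": "https://example-project.firebaseio.com"})

def read_sensors():
    """Hypothetical helper: returns calibrated readings from the DHT and
    MQ-series gas sensors wired to the Raspberry Pi (implementation omitted)."""
    return {"temperature": 27.4, "humidity": 61.0, "dew_point": 19.3,
            "wind_speed": 1.8, "co2_ppm": 412.0, "nh3_ppm": 0.9}

while True:
    sample = read_sensors()
    sample["timestamp"] = time.time()
    db.reference("/air_quality").push(sample)   # one data packet per cycle
    time.sleep(60)                              # assumed sampling interval
```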
4 Artificial Intelligence-Based Deep Learning Deep learning is a subset of machine learning in AI [4] that performs unsupervised learning from unstructured data. Here, the network is built in the same manner as neurons are connected in the human brain. It uses a hierarchical approach to process and analyze linear and nonlinear data. There are a number of deep learning algorithms, e.g., recurrent neural networks and recursive neural networks. The RNN algorithm is designed to process speech signals and text-based datasets, and it makes use of a feedback loop from the output of the hidden layers back to the input of the same layers.
4.1 Recurrent Neural Network An RNN is a type of neural network which is very useful for analyzing large datasets; it trains and tests the datasets using recurrence, i.e., feedback given back to the hidden layers to optimize the training model, based on which predictions close to the actual values can be made. It uses the LSTM model to build a feedback loop in which the output from the previous step is fed back to the input of the current step. It is useful for prediction in real time-series forecasting models [3]. Recurrent neural network algorithm: Fig. 3 shows the deep learning algorithm; prediction and forecasting are done using an RNN with LSTM, i.e., long short-term memory [5]. In the given algorithm, the data are first loaded, as the datasets contain the parameters used for forecasting. Then, the data are scaled. Fig. 3 RNN algorithm diagram
Fig. 4 Real-time data packets in database
The dataset is prepared for learning by splitting it into training and testing sets. Then, the LSTM model is created and trained, based on which the loss-versus-epoch graph plot is obtained.
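A minimal Keras sketch of this pipeline is given below. The look-back window length, the number of LSTM units, the loss function and the train/validation split are assumptions for illustration, not the paper's exact settings:

```python
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler

# data: rows = timestamped records,
# columns = [pollution, temperature, humidity, wind_speed]
def make_windows(data, lookback=24):
    scaled = MinMaxScaler().fit_transform(data)
    X = np.stack([scaled[i:i + lookback]
                  for i in range(len(scaled) - lookback)])
    y = scaled[lookback:, 0]          # next-step pollution level
    return X, y

def build_lstm(lookback=24, n_features=4):
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(50, input_shape=(lookback, n_features)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mae")
    return model

# X, y = make_windows(dataset); model = build_lstm()
# history = model.fit(X, y, epochs=50, validation_split=0.3)
# history.history["loss"] / ["val_loss"] give the loss-versus-epoch curves
```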
5 Results 5.1 Firebase Server Database In the proposed work, some of the real-time parameters, including temperature, humidity, dew point, wind speed, NH3 and CO2 gas sensor values, are stored in the Firebase database (Fig. 4).
5.2 RNN Model Loss-Versus-Epoch Graph Output The RNN algorithm uses LSTM. The datasets include the pollution level, temperature, humidity and wind speed. Based on the trained model, the testing datasets are validated by plotting loss versus epochs, which shows the error on the testing datasets decreasing relative to the training datasets as the number of epochs increases. The test RMSE obtained is 23.714% (Fig. 5).
Fig. 5 Loss-versus-epoch graph plot
6 Conclusion and the Future Work Here, a monitoring system was developed to monitor and analyze the real-time air pollution level and calculate the AQI of the environment. The data are stored in real time on the server, from which they can be retrieved via Web services, with a deep learning model used to address the prediction problem as a time-series forecasting model. Prediction and analysis of the AQI/pollution level using deep learning with different techniques, and comparison of the prediction accuracies, are carried out. The end-node hardware is to be completed for routing, with 3D modeling and printing. One can also apply data analytics to classify the pollution data by area based on the various pollutants and their effects on human health. Acknowledgements Our work is financed by IMPRINT (Grant No.-7794/2016), a joint initiative of the MHRD and Ministry of Housing and Urban Affairs, Government of India.
References
1. Tajne KM, Rathore SS, Asutkar GM (2011) Monitoring of air pollution using wireless sensors—a case study of monitoring air pollution in Nagpur City. Int J Environ Sci 2(2):829–838
2. Al-Ali AR, Zualkernan I, Aloul F (2010) A mobile GPRS sensors array for air pollution monitoring. IEEE Sens J 10(10):1666–1671. https://doi.org/10.1109/JSEN.2010.2045890
3. Bashir Shaban K, Kadri A, Rezk E (2016) Urban air pollution monitoring system with forecasting models. IEEE Sens J 16(8):2598–2606. https://doi.org/10.1109/JSEN.2016.2514378
4. Khelifi H, Luo S, Nour B, Sellami A, Moungla H, Ahmed SH, Guizani M (2019) Bringing deep learning at the edge of information-centric internet of things. IEEE Commun Lett 23(1):52–55. https://doi.org/10.1109/LCOMM.2018.2875978
5. Li VOK, Lam JCK, Chen Y, Gu J (2017) Deep learning model to estimate air pollution using M-BP to fill in missing proxy urban data. In: GLOBECOM 2017—IEEE global communications conference, Singapore, pp 1–6. https://doi.org/10.1109/glocom.2017.8255004
6. Yang Y, Zheng Z, Bian K, Song L, Han Z (2018) Real-time profiling of fine-grained air quality index distribution using UAV sensing. IEEE Internet Things J 5(1):186–198. https://doi.org/10.1109/JIOT.2017.2777820
7. Maag B, Zhou Z, Thiele L (2018) A survey on sensor calibration in air pollution monitoring deployments. IEEE Internet Things J 5(6):4857–4870. https://doi.org/10.1109/JIOT.2018.2853660
8. Dhingra S, Madda RB, Gandomi AH, Patan R, Daneshmand M (2019) Internet of things mobile-air pollution monitoring system (IoT Mobair). IEEE Internet Things J 6(3):5577–5584. https://doi.org/10.1109/JIOT.2019.2903821
Using Feature Selection Based on Multi-view for Rice Seed Images Classification Dzi Lam Tran Tuan and Vinh Truong Hoang
Abstract Rice is one of the primary foods for humans, and various rice varieties are cultivated in many countries. An automatic rice variety inspection system based on computer vision is needed to replace inspection by human technical experts. In this chapter, we investigate three types of local descriptors, namely the local binary pattern, the histogram of oriented gradients and GIST, to characterize rice seed images. In order to enhance robustness, a multi-view approach is considered which concatenates these features. However, this leads to a high-dimensional feature vector, and the relevant features need to be selected for a compact and better model. We apply different feature selection methods in a filter way to eliminate the impertinent features. The experimental results on the VNRICE dataset show the efficiency of our proposition. Keywords LBP · HOG · GIST · Feature selection · Feature ranking · Multi-view
1 Introduction In many countries, rice is the main crop and an essential source of food, consumed by nearly half the world's population. One of the most important factors in a high-yield crop is that rice seeds must be pure. The quality of rice depends entirely on the genetic properties of a rice variety. In practice, a rice variety may be mixed with others, which can affect its productivity. Recently, ST25, a specific rice variety developed in Soc Trang Province in the south of Vietnam, was named the best rice in the world; soon after, plenty of fake ST25 was produced and sold in other regions. It is thus extremely difficult for a non-expert to distinguish or recognize different rice varieties.

D. L. T. Tuan (B) · V. T. Hoang Ho Chi Minh City Open University, 97 Vo Van Tan Street, Ho Chi Minh City, District 3, Vietnam e-mail: [email protected] V. T. Hoang e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_61

At this time, the process to select and identify the
unwanted seeds is made manually in Vietnamese companies, based on visual inspection by technical experts. This process is time consuming and can degrade the quality of the seeds. An inspection process via a computer vision system is required to replace the human task. A huge number of vision systems for real-life problems have been developed and implemented [1, 2]. In agriculture, various works use machine vision for automatic inspection and quality control of agricultural products. The rapid advancement of technology has produced many ways to characterize features of multimedia data, and various attribute descriptors have been introduced to represent images in the last decade [3]. Each kind of attribute represents the data in a specific space and has precise spatial meaning and statistical properties. Nhat and Hoang [4] presented a method to fuse the features extracted by three descriptors (local binary pattern, histogram of oriented gradients and GIST) based on block division for face recognition. The concatenated features are then processed by canonical correlation analysis to obtain a compact representation before feeding them into a classifier. Van and Hoang [5] proposed to reduce noisy and irrelevant local ternary pattern (LTP) and histogram of oriented gradients (HOG) features coded in different color spaces for face analysis. Mebatsion et al. [6] fused Fourier descriptors and three geometrical features for cereal grain recognition. Duong and Hoang [7] extracted features from rice seed images coded in multiple color spaces using the HOG descriptor. Multi-view learning was introduced to exploit complementary information between different views: different local descriptors are extracted and combined to create a multi-view image representation. When concatenating different feature sets, it is evident that not all features contribute equally to the learning task, and some features might decrease the performance. Feature selection is used to select relevant features and reduce the high dimension of the original data. Feature selection increases the efficiency of learning for two reasons: (1) it represents the data better by eliminating noisy or irrelevant features, and (2) it makes classification or clustering more efficient to compute, since the dimension of the selected feature space is reduced. More recently, Zhang et al. presented a complete review of attribute selection and feature-level fusion methods [8]. In this chapter, we propose to extract a multi-view feature set based on the LBP, HOG and GIST descriptors from rice seed images for the classification task. Before feeding the features to a classifier, various feature selection approaches are investigated to reduce the dimension of the space and increase the classification performance. This chapter is organized as follows. Section 2 introduces the feature extraction methods based on three local image descriptors. Sections 3 and 4 present the proposed approach and experimental results. Finally, the conclusion is discussed in Sect. 5.
2 The Feature Extracting Methods This section briefly reviews three local image descriptors used in the experiment for extracting features.
2.1 Local Binary Pattern The $\mathrm{LBP}_{P,R}(x_c, y_c)$ code of each pixel $(x_c, y_c)$ is calculated by comparing the gray value $g_c$ of the central pixel with the gray values $\{g_i\}_{i=0}^{P-1}$ of its $P$ neighbors, as follows [9]:

$$\mathrm{LBP}_{P,R} = \sum_{p=0}^{P-1} \omega(g_p - g_c)\, 2^p \quad (1)$$

where $g_c$ is the gray value of the central pixel, $g_p$ is the gray value of the $p$th neighbor, $R$ is the radius of the circle, and $\omega(g_p - g_c)$ is defined as

$$\omega(g_p - g_c) = \begin{cases} 1 & \text{if } (g_p - g_c) \ge 0, \\ 0 & \text{otherwise.} \end{cases} \quad (2)$$
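For illustration, an LBP histogram can be computed with scikit-image as sketched below. The file name is a placeholder, and the paper's 768-dimensional LBP vector presumably comes from concatenating such histograms over image regions or channels, which is not reproduced here:

```python
import numpy as np
from skimage import io, color
from skimage.feature import local_binary_pattern

P, R = 8, 1                                     # neighbors and radius of Eq. (1)
gray = color.rgb2gray(io.imread("seed.png"))    # placeholder image path
codes = local_binary_pattern(gray, P, R, method="uniform")
# 'uniform' mapping yields P + 2 = 10 distinct labels for P = 8
hist, _ = np.histogram(codes, bins=np.arange(P + 3), density=True)
```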
2.2 GIST GIST was first proposed by Oliva and Torralba [10] in order to classify objects by representing the shape of the scene. The primary idea of this approach is based on the Gabor filter:

$$h(x, y) = e^{-\frac{1}{2}\left(\frac{x^2}{\delta_x^2} + \frac{y^2}{\delta_y^2}\right)}\; e^{-j 2\pi (u_0 x + v_0 y)} \quad (3)$$

By filtering the image with Gabor filters at scales $(\delta_x, \delta_y)$, all the elements of the image close to the spatial frequency $(u_0, v_0)$ are obtained. The resulting GIST vector has many dimensions; to reduce its size, the filter responses are averaged over a 4 × 4 grid. Each image is processed by a Gabor filter bank with 4 scales and 8 orientations, creating 32 feature maps of the same size.
2.3 Histograms of Oriented Gradient The histograms of oriented gradients (HOG) descriptor is applied to different tasks in machine vision [11], such as human detection [12]. HOG features are extracted by counting the occurrences of gradient orientations, based on the gradient angle and the gradient magnitude of local patches of an image. The gradient angle and magnitude are computed at each pixel in an 8 × 8 pixel patch, and the resulting 64 gradient vectors are accumulated into 9 angular bins covering 0–180° (20° each). Denoting the gradient components at position $(k, h)$ of an image $J$ by

$$\Delta_k = |J(k-1, h) - J(k+1, h)| \quad (4)$$

$$\Delta_h = |J(k, h-1) - J(k, h+1)| \quad (5)$$

the gradient magnitude $T$ and angle $K$ at each position are computed as

$$T(k, h) = \sqrt{\Delta_k^2 + \Delta_h^2} \quad (6)$$

$$K(k, h) = \tan^{-1}\left(\frac{\Delta_k}{\Delta_h}\right) \quad (7)$$
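A HOG feature vector, and the multi-view concatenation used in this chapter, can be sketched with scikit-image as follows. The image path, the resize dimensions and the block-normalization setting are illustrative assumptions:

```python
import numpy as np
from skimage import io, color, transform
from skimage.feature import hog

gray = transform.resize(color.rgb2gray(io.imread("seed.png")), (128, 128))
hog_vec = hog(gray, orientations=9, pixels_per_cell=(8, 8),
              cells_per_block=(2, 2), feature_vector=True)

# Multi-view representation: concatenate the per-descriptor vectors
# (lbp_vec and gist_vec computed elsewhere) into a single feature vector
# multi_view = np.concatenate([lbp_vec, gist_vec, hog_vec])
```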
3 Feature Selection Based on the availability of supervised information (i.e., labels), feature selection techniques can be grouped into two large categories: supervised and unsupervised [13]. Additionally, different feature selection strategies have been proposed based on the evaluation process, such as filter, wrapper and hybrid methods [14]. Hybrid approaches incorporate both filter and wrapper into a single structure, in order to give an effective solution for dimensionality reduction [15]. In order to study the contribution of feature selection approaches to rice seed image classification, we propose to apply several selection approaches to images represented by multi-view descriptors. In the following, we shortly present the common feature selection methods applied in the supervised learning context.
1. LASSO (Least Absolute Shrinkage and Selection Operator) performs feature selection under the assumption of a linear dependency between input features and output values. LASSO minimizes the sum of squared residuals subject to the sum of the absolute values of the regression coefficients being less than a constant, which forces some regression coefficients to be exactly 0 [15, 16].
2. mRMR (Maximum Relevance and Minimum Redundancy) is a mutual-information-based feature selection criterion that may also use distance/similarity scores to select features. The aim is to penalize a feature's relevance by its redundancy with respect to the other selected features [17].
3. ReliefF [18] extends Relief [19] to support multiclass problems. ReliefF is a promising heuristic function that may overcome the myopia of current inductive learning algorithms. Kira and Rendell used Relief as a preprocessor to eliminate irrelevant attributes from the data description before learning. ReliefF is general, relatively efficient and reliable enough to guide the search in the learning process [20].
4. CFS (Correlation Feature Selection) mainly applies heuristic methods to evaluate the effect of each single feature with respect to each class in order to obtain an optimal subset of attributes.
5. Fisher [21] identifies a subset of features such that the distances between samples in different classes are as large as possible, while the distances between samples in the same class are as small as possible. Fisher selects the top-ranked features according to their scores.
6. Ilfs (Infinite Latent Feature Selection) is a technique consisting of three steps: preprocessing; feature weighting based on a fully connected graph in which each node connects all features; and, finally, computing energy scores over path lengths, which are ranked according to their correspondence with the features [22]. A small sketch of one of these selectors is given after this list.
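As a minimal illustration of one selector, the LASSO criterion of item 1 can be run with scikit-learn as below. This is a sketch, not the experimental code of this chapter: the regularization strength is an assumed value, and treating the class index as a regression target is a simplification for illustration only:

```python
from sklearn.linear_model import Lasso
from sklearn.feature_selection import SelectFromModel

# X: (n_samples, n_features) multi-view matrix, y: rice variety labels
def lasso_select(X, y, alpha=0.01):
    """Filter-style selection with LASSO: features whose regression
    coefficients are driven to zero are discarded."""
    selector = SelectFromModel(Lasso(alpha=alpha), threshold=1e-6)
    selector.fit(X, y)
    return selector.get_support(indices=True)   # indices of kept features
```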
4 Experimental Results 4.1 Experimental Setup The rice seed image database is composed of six rice seed varieties from the north of Vietnam [7, 23]. We apply the 1-NN and SVM classifiers to evaluate the classification performance. Half of the database is selected as the training set, and the rest is used as the testing set: we use the hold-out method with a (1/2, 1/2) ratio and split the training and testing sets like a chessboard decomposition. All experiments are implemented and simulated in MATLAB 2019a and conducted on a PC with a Xeon 3.08 GHz CPU and 64 GB of RAM.
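An equivalent evaluation protocol can be sketched in Python with scikit-learn (the experiments of this chapter were run in MATLAB; the RBF kernel choice below is an assumption):

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# X_train/X_test, y_train/y_test: the two chessboard-style halves
def evaluate(X_train, y_train, X_test, y_test):
    results = {}
    for name, clf in [("1-NN", KNeighborsClassifier(n_neighbors=1)),
                      ("SVM", SVC(kernel="rbf"))]:
        clf.fit(X_train, y_train)
        results[name] = 100 * accuracy_score(y_test, clf.predict(X_test))
    return results   # accuracies in percent, as reported in Table 1
```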
Table 1 Classification performance without selection approach for different types of features

| Features | Dimension | 1-NN | SVM |
|---|---|---|---|
| LBP | 768 | 53.0 | 77.0 |
| GIST | 512 | 69.4 | 88.3 |
| HOG | 21,384 | 71.5 | 94.7 |
| LBP + GIST | 1280 | 70.5 | 91.7 |
| LBP + HOG | 22,152 | 72.0 | 95.5 |
| GIST + HOG | 21,896 | 72.1 | 94.9 |
| LBP + GIST + HOG | 22,664 | 72.7 | 95.7 |

Table 2 LBP + HOG features (22,152 dimensions in total): classification performance based on different feature selection methods with the 1-NN and SVM classifiers. ACC: accuracy; Dim: number of selected features; id%: percentage of selected features. "ACC 100%" reports the minimum selection reaching the accuracy obtained when all features are used; "Max ACC" reports the maximal accuracy reached

| Method | 1-NN ACC 100% | id% | Dim | 1-NN Max ACC | id% | Dim | SVM ACC 100% | id% | Dim | SVM Max ACC | id% | Dim |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fisher | 72.0 | 21 | 4652 | 73.3 | 23 | 5095 | 95.5 | 100 | 22,152 | 95.5 | 100 | 22,152 |
| mRMR | 72.0 | 5 | 1108 | 76.4 | 11 | 2437 | 95.5 | 93 | 20,601 | 95.6 | 98 | 21,709 |
| ReliefF | 72.0 | 2 | 443 | 75.7 | 3 | 665 | 95.5 | 87 | 19,272 | 95.5 | 88 | 19,494 |
| Ilfs | 72.0 | 98 | 21,709 | 72.2 | 98 | 21,709 | 95.5 | 99 | 21,930 | 95.5 | 99 | 21,930 |
| Cfs | 72.0 | 12 | 2658 | 73.4 | 38 | 8418 | 95.5 | 55 | 12,184 | 95.6 | 88 | 19,494 |
| Lasso | 72.0 | 9 | 1994 | 75.8 | 19 | 4209 | 95.5 | 100 | 22,152 | 95.5 | 100 | 22,152 |
4.2 Results Table 1 shows the accuracy obtained by the 1-NN and SVM classifiers when no feature selection approach is applied. The first column indicates the features used to represent the images: the three individual local descriptors, namely LBP, GIST and HOG, and their concatenations LBP + GIST, LBP + HOG, GIST + HOG and LBP + GIST + HOG. The second column indicates the number of features (dimension) of each feature type, and the third and fourth columns show the accuracy obtained by the 1-NN and SVM classifiers. We observe that the multi-view representation obtained by concatenating multiple features gives better performance; however, it increases the dimension. The SVM classifier performs better than the 1-NN classifier, reaching 95.7% accuracy. Table 2 presents the classification performance based on LBP + HOG features with different feature selection approaches; its rows correspond to the feature selection methods, and the two classifiers are used for each feature ranking approach.
Fig. 1 Accuracy versus the number of selected features for the six selection methods (CFS, Fisher, ILFS, LASSO, mRMR, ReliefF): (a) 1-NN and (b) SVM classifiers on GIST + HOG features; (c) 1-NN and (d) SVM classifiers on LBP + GIST + HOG features
Table 3 GIST + HOG features (21,896 dimensions in total): classification performance based on different feature selection methods with the 1-NN and SVM classifiers (same notation as Table 2)

| Method | 1-NN ACC 100% | id% | Dim | 1-NN Max ACC | id% | Dim | SVM ACC 100% | id% | Dim | SVM Max ACC | id% | Dim |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fisher | 72.1 | 20 | 4379 | 73.4 | 27 | 5912 | 94.9 | 93 | 20,363 | 95.0 | 95 | 20,801 |
| mRMR | 72.1 | 8 | 1752 | 74.9 | 13 | 2846 | 94.9 | 85 | 18,612 | 95.1 | 93 | 20,363 |
| ReliefF | 72.1 | 2 | 438 | 75.0 | 3 | 657 | 94.9 | 83 | 18,174 | 95.1 | 85 | 18,612 |
| Ilfs | 72.1 | 16 | 3503 | 72.3 | 16 | 3503 | 94.9 | 98 | 21,458 | 95.0 | 99 | 21,677 |
| Cfs | 72.1 | 9 | 1971 | 73.4 | 66 | 14,451 | 94.9 | 42 | 9196 | 95.3 | 85 | 18,612 |
| Lasso | 72.1 | 9 | 1971 | 76.9 | 20 | 4379 | 94.9 | 100 | 21,896 | 94.9 | 100 | 21,896 |
Table 4 LBP + GIST + HOG features (22,664 dimensions in total): classification performance based on different feature selection methods with the 1-NN and SVM classifiers (same notation as Table 2)

| Method | 1-NN ACC 100% | id% | Dim | 1-NN Max ACC | id% | Dim | SVM ACC 100% | id% | Dim | SVM Max ACC | id% | Dim |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fisher | 72.7 | 20 | 4533 | 74.3 | 24 | 5439 | 95.7 | 48 | 10,879 | 95.8 | 81 | 18,358 |
| mRMR | 72.7 | 6 | 1360 | 77.2 | 11 | 2493 | 95.7 | 82 | 18,584 | 95.8 | 85 | 19,264 |
| ReliefF | 72.7 | 2 | 453 | 75.6 | 3 | 680 | 95.7 | 100 | 22,664 | 95.7 | 100 | 22,664 |
| Ilfs | 72.7 | 100 | 22,664 | 72.8 | 100 | 22,664 | 95.7 | 100 | 22,664 | 95.7 | 100 | 22,664 |
| Cfs | 72.7 | 10 | 2266 | 73.9 | 29 | 6573 | 95.7 | 44 | 9972 | 95.9 | 72 | 16,318 |
| Lasso | 72.7 | 9 | 2040 | 76.4 | 18 | 4080 | 95.7 | 100 | 22,664 | 95.7 | 100 | 22,664 |
For each classifier, the three sub-columns report: (1) the "ACC 100%" column, the accuracy obtained when all features are used; (2) the "id%" column, the minimum percentage of features needed to reach the same accuracy as with all features; and (3) the "Dim" column, the corresponding number of selected features. The "Max ACC" sub-columns report the maximal accuracy obtained by each feature selection method, together with the percentage of features used and the number of dimensions, respectively. These tables can be analyzed by comparing the accuracy obtained against the number of selected features. For example, by using only 21% of the features, the Fisher selection method reaches the same accuracy as when all features are used. Figure 1 illustrates in detail the performance of the six feature selection approaches as a function of the percentage of selected features for GIST + HOG and LBP + GIST + HOG features. The performance clearly depends on the kind of features and on the selection method, and Tables 3 and 4 show the same observations for the other multi-view strategies. According to the results obtained by the different strategies, we conclude that there is no single best approach for selecting an optimal subset of features, as confirmed in [24]; it mainly depends on the feature type. The multi-view representation improves the classification performance, giving the best accuracy of 95.7% (by the SVM classifier, without feature selection). The dimension is largely reduced, from 22,664 to 16,318 features, by the Cfs method (see Table 4), while the accuracy is even slightly improved, by 0.2%. Moreover, feature selection always yields an accuracy at least equal to that obtained when no selection method is applied.
5 Conclusion This chapter presented a rice seed image classification method based on multi-view descriptors and feature selection. We used three common local image descriptors, namely LBP, HOG and GIST. Six feature selection methods in the supervised learning context, evaluated in a filter manner, were then applied to remove irrelevant features in the different view strategies. The proposed approach was evaluated on a rice seed image database and showed its efficiency by improving the classification performance while reducing the dimension of the feature space. This work is now being extended to ensemble feature selection, which combines multiple selection methods to achieve a more compact and better representation.
References
1. Gomes J, Leta F (2014) Applications of computer vision techniques in the agriculture and food industry: a review. Euro Food Res Technol 235:989–1000
2. Patrício DI, Rieder R (2018) Computer vision and artificial intelligence in precision agriculture for grain crops: a systematic review. Comput Electron Agric 153:69–81
3. Humeau-Heurtier A (2019) Texture feature extraction methods: a survey. IEEE Access 7:8975–9000
4. Nhat HTM, Hoang VT (2019) Feature fusion by using LBP, HOG, GIST descriptors and canonical correlation analysis for face recognition. In: 2019 26th international conference on telecommunications (ICT), pp 371–375
5. Nguyen VT, Truong HV (2019) Kinship verification based on local binary pattern features coding in different color spaces. In: 26th international conference on telecommunications (ICT 2019), Hanoi, Vietnam
6. Mebatsion HK, Paliwal J, Jayas DS (2013) Automatic classification of non-touching cereal grains in digital images using limited morphological and color features. Comput Electron Agric 90:99–105
7. Duong H, Hoang VT (2019) Dimensionality reduction based on feature selection for rice varieties recognition. In: 2019 4th international conference on information technology (InCIT), pp 199–202
8. Rui Z, Feiping N, Xuelong L, Xian W (2019) Feature selection with multi-view data: a survey. Inf Fusion 50:158–167
9. Ojala T, Pietikäinen M, Mäenpää T (2001) A generalized local binary pattern operator for multiresolution gray scale and rotation invariant texture classification. In: Proceedings of the second international conference on advances in pattern recognition. Springer, pp 397–406
10. Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope, p 31
11. Deniz O, Bueno G, Salido J, De la Torre F (2011) Face recognition using histograms of oriented gradients. Pattern Recogn Lett 32:1598–1603
12. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), vol 1. IEEE, pp 886–893
13. Benabdeslem K, Hindawi M (2011) Constrained Laplacian score for semi-supervised feature selection. In: Machine learning and knowledge discovery in databases. Springer, pp 204–218
14. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182
15. Jie C, Jiawei L, Shulin W, Sheng Y (2018) Feature selection in machine learning: a new perspective. Neurocomputing 300:70–79
16. Yamada M, Jitkrittum W, Sigal L, Xing EP, Sugiyama M (2014) High-dimensional feature selection by feature-wise kernelized lasso. Neural Comput 26(1):185–207. arXiv:1202.0515
17. Zhao Z, Anand R, Wang M (2019) Maximum relevance and minimum redundancy feature selection methods for a marketing machine learning platform. arXiv:1908.05376
18. Kononenko I (1994) Estimating attributes: analysis and extensions of RELIEF. In: Carbonell JG, Siekmann J, Goos G, Hartmanis J, Bergadano F, Raedt L (eds) Machine learning: ECML-94, vol 784. Springer, Berlin Heidelberg, pp 171–182
19. Kira K, Rendell LA (1992) A practical approach to feature selection. In: Machine learning proceedings 1992. Elsevier, pp 249–256
20. Kononenko I, Simec E, Robnik-Sikonja M Overcoming the myopia of inductive learning algorithms with RELIEFF, p 17
21. Bishop CM et al (1995) Neural networks for pattern recognition. Oxford University Press
22. Tajul M, Ali WCB, Teguh P (2019) Infinite latent feature selection technique for hyperspectral image classification. Jurnal Elektronika dan Telekomunikasi 19(1):32
23. Thu Hong PT, Thanh Hai TT, Thi Lan L, Ta Hoang V, Hai V, Thi Nguyen T (2015) Comparative study on vision based rice seed varieties identification. In: 2015 seventh international conference on knowledge and systems engineering (KSE).
IEEE, Ho Chi Minh City, Vietnam, pp 377–382
24. Hoang VT (2018) Multi color space LBP-based feature selection for texture classification. PhD thesis, Université du Littoral Côte d'Opale
Smarter Pills: Low-Cost Embedded Device to Elders José Elías Romo, Sebastián Gutiérrez, Pedro Manuel Rodrigo, Manuel Cardona, and Vijender Kumar Solanki
Abstract Currently, most of the world is connected through the Internet; there are no regional barriers between people, or even between things, since they can be connected and communicate through the Internet, giving them greater capabilities and ways of use than before; this concept is known as the Internet of Things. The population of elder adults in Mexico is expected to increase in the coming years; therefore, it is necessary to raise awareness about promoting the healthcare, well-being, and quality of life of the elderly. The problem is how to provide quality healthcare to people with reduced access to providers. Smarter Pills is an embedded device capable of reminding its user of the moment at which his or her medicine should be taken, using Google Calendar reminders. Smarter Pills is a pill container designed to be portable; it tells the elderly when to take their medications and which one. It is a low-cost device that can be used easily by the people who take care of the elderly and, at the same time, is intuitive for its users.
J. E. Romo INFOTEC Centro de Investigación e Innovación en Tecnologías de la Información y Comunicación, Aguascalientes, Mexico e-mail: [email protected] S. Gutiérrez (B) · P. M. Rodrigo Facultad de Ingeniería, Universidad Panamericana, Aguascalientes, Mexico e-mail: [email protected] P. M. Rodrigo e-mail: [email protected] M. Cardona Centro de Automática y Robótica, UPM-CSIC Universidad Politécnica de Madrid, y Facultad de Ingeniería Universidad Don Bosco, San Salvador, El Salvador e-mail: [email protected] V. K. Solanki Department of Computer Science and Engineering, CMR Institute of Technology, Hyderabad, TS, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_62
Keywords Advanced encryption standard · IFTTT · Internet of things · Machine to machine · Real-time clock · System on chip
1 Introduction
In Mexico, the uninterrupted reduction of fertility since the end of the 1970s and the increase in life expectancy have generated an ever narrower pyramidal base and an increasingly higher proportion of adults (30–59 years) and elder adults (60 and over). The adult proportion increased from 26 to 36.7% between 1990 and 2017, while the proportion of people over 60 increased from 6.4 to 10.5% in the same period, and it is expected that by 2050 this group will grow to 32.4 million (21.5% of the total population) [1]. Because of that, many applications for elderly people have been appearing. This document presents a review of the use of IoT devices in the health sector and proposes a low-cost device that serves as an intelligent pill box, helping patients remember when to take their pills and which ones. The Internet of Things (IoT) does not have a unique definition, as each author has their own [2]. IoT can be seen as an idea in which physical objects communicate among themselves [3]. These "things" can range from a human being communicating with an electronic device to a machine communicating with another device to report its data, "machine to machine" (M2M) [3]. It is worth noting that the Internet is a large network that keeps everything interconnected, not just a device or two but an immense number of interconnected devices. The IoT allows physical objects to see, listen, think, and perform work by "talking" together, sharing information and coordinating decisions. The IoT changes these objects from traditional to intelligent by exploiting their underlying technologies, such as computing and data analysis, integrated devices, communication technologies, sensor networks, Internet protocols, and applications [4]. A basic IoT system [3] consists of:
• Embedded system: an electronic board with the ability to obtain and process data and send them to the network.
• Communications network: communication solutions that interconnect several devices at the same time and share information.
• Data storage: physical servers in which the information of the network is stored.
• Data analysis: algorithms to obtain a response from the data provided by each device.
There are several types of architectures for IoT, depending on the approach of the application [5]. A good architecture has to meet certain requirements for this technology to be viable: it must allow the technology to be distributed, where objects can interact with each other efficiently and securely. The architecture presented in this document is a compilation of the best layers of several existing architectures [4]. Some examples of IoT use range from smart homes [6] and smart cities [7] to automotive [8], healthcare [9], energy [10], and agriculture [11], among many other applications.
1.1 IoT in the Health Sector
IoT has great potential within the health sector, in areas such as medical attention, remote supervision, and elderly care, among others. The miniaturization of these devices, as well as their much lower price, has led IoT devices to rapid and widespread adoption [12]. IoT systems for health care are composed of several sensors that measure different health parameters. Some examples of IoT devices applied in the health domain are the following. A non-invasive glucose monitoring system [13] uses a photodiode sensor and an accelerometer to measure blood glucose levels, avoiding devices that require a small incision in the finger to measure glucose directly in the blood. Another example is remote monitoring of elderly people using IoT devices: there are many homes where older family members may be alone during the day and may need to be monitored remotely [14]. IoT devices detect strange behaviors, falls, or the vital signs of the elderly, being a great help to care for these people in an efficient and economical way. There are also heart monitoring systems. Monitoring these vital signs is of great importance in a hospital environment for critical patients, or for patients recovering at home after surgery. Heart rate monitoring has also gained importance in recreational activities such as athletics, intensive physical training, etc. There are many IoT devices that analyze this signal and process it in order to track the user's heart rhythm [12, 15].
1.2 Security in IoT Systems in the Domain of Health
With the development of IoT devices within the health domain, the personal data of users are transported from one device to another and from one server to another; this represents a great challenge in terms of the security and privacy of this information. Many techniques are used to protect patient information in medical IoT systems. An important part of this security lies in the way the data acquired by the IoT devices are encrypted. The advanced encryption standard (AES) has been a well-known algorithm since 2001. Although it is a very strong algorithm, it needs some adaptation to be suitable for IoT applications due to the limited power, area, and memory available; this can be done by minimizing the design area and the time required for processing [16]. The rest of the paper is organized as follows: Section 2 explains the overall development of the Smarter Pills prototype. Section 3 presents the system architecture and the use of IoT platforms for the device communication. Section 4 exposes the results of the prototype implementation. Section 5 discusses the conclusions and presents the future steps.
2 Development
Smarter Pills is a smart pill box that indicates at what time and date the pills must be taken. It has four compartments, and the box is programmed through Google Calendar events, so it can be used with any smartphone or computer. When an event occurs, Smarter Pills turns on the LED indicator of the configured compartment and waits until the patient takes the corresponding medication. Smarter Pills has a touch sensor in each compartment to know whether the patient took the pills. The enclosure was developed in SolidWorks; Fig. 3 shows the proposed design, which is printed on a 3D printer, making the box easy to build and cheap to produce by anyone. The control card used for this project is the ESP32, a System on Chip (SoC) designed by the Chinese company Espressif and manufactured by TSMC. It integrates a 32-bit Tensilica Xtensa dual-core processor at 160 MHz (with the possibility of up to 240 MHz) with WiFi and Bluetooth connectivity [17]. This card is very useful and versatile hardware for developing IoT devices: besides WiFi and Bluetooth connectivity, it can process several GPIO inputs and outputs and integrates several sensors, including touch sensors, a temperature sensor, and a Hall effect sensor. On this card, four LED indicators were connected to the compartments of the pill box, and each compartment was assigned a capacitive touch sensor to verify that the pills were taken from the indicated compartment. The system architecture is based on the hardware exchanging information bidirectionally with touch sensors, indicators, and external commands. It is connected to the Internet through WiFi, enabling a cloud with the Blynk service [18]. Figure 1 shows the electrical connection of the proposed system.
Fig. 1 Electrical connection architecture
Fig. 2 Connectivity architecture
The development board ESP32 allows easy connection of the elements without additional devices or components that could increase the cost of the system. The ESP32 card was configured to be used as a WiFi server. It is connected to the Internet through the indicated WiFi SSID and runs a web service waiting for HTTP RESTful commands [19]. The development board is programmed through the Arduino IDE, simplifying coding and uploading. Figure 2 shows the general connectivity of the prototype: first, the ESP32 board reads the state of the hardware; this information is then uploaded to the Internet through the WiFi connection; finally, the web server is enabled using the libraries of the Blynk service, which waits for RESTful commands. Google Calendar events are read and interpreted by an IFTTT applet, which searches for an event with a previously configured keyword; this event triggers a Webhook [20], which sends the RESTful-type command to the web service of the ESP32 card enabled by Blynk. Finally, these commands are interpreted by the algorithm coded in the ESP32, activating the LEDs when an event occurs and waiting for the user to take his pills.
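To make this event flow concrete, the following Arduino-style sketch illustrates the idea on the ESP32 with the Blynk library. It is a hedged sketch of the mechanism described above, not the authors' firmware: the credentials, the GPIO and touch pin numbers, the touch threshold, and the one-virtual-pin-per-compartment layout are all our assumptions.

```cpp
// Hedged sketch of the Smarter Pills firmware idea (not the authors' code).
// Requires the ESP32 Arduino core and the Blynk Arduino library.
#define BLYNK_PRINT Serial
#include <WiFi.h>
#include <BlynkSimpleEsp32.h>

char auth[] = "YOUR_BLYNK_TOKEN";  // placeholder
char ssid[] = "YOUR_SSID";         // placeholder
char pass[] = "YOUR_PASSWORD";     // placeholder

const int ledPins[4]   = {16, 17, 18, 19};   // one LED per compartment (hypothetical pins)
const int touchPins[4] = {T0, T3, T4, T5};   // ESP32 capacitive touch inputs (hypothetical)
const int TOUCH_THRESHOLD = 30;              // tuned empirically on the metallic base
bool pending[4] = {false, false, false, false};

// The IFTTT webhook writes '1' to a Blynk virtual pin (V0..V3) through the
// RESTful API; each handler lights the LED of the matching compartment.
BLYNK_WRITE(V0) { if (param.asInt()) { pending[0] = true; digitalWrite(ledPins[0], HIGH); } }
BLYNK_WRITE(V1) { if (param.asInt()) { pending[1] = true; digitalWrite(ledPins[1], HIGH); } }
BLYNK_WRITE(V2) { if (param.asInt()) { pending[2] = true; digitalWrite(ledPins[2], HIGH); } }
BLYNK_WRITE(V3) { if (param.asInt()) { pending[3] = true; digitalWrite(ledPins[3], HIGH); } }

void setup() {
  Serial.begin(115200);
  for (int i = 0; i < 4; i++) pinMode(ledPins[i], OUTPUT);
  Blynk.begin(auth, ssid, pass);   // joins the WiFi network and the Blynk server
}

void loop() {
  Blynk.run();
  // When the metallic base of a pending compartment is touched, the pills are
  // considered taken and the LED is switched off.
  for (int i = 0; i < 4; i++) {
    if (pending[i] && touchRead(touchPins[i]) < TOUCH_THRESHOLD) {
      pending[i] = false;
      digitalWrite(ledPins[i], LOW);
    }
  }
}
```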
3 Results A design was made in SolidWorks of the form that is wanted to be the Smarter Pills device, in which it is thought to place a metal base in each compartment to serve as an amplifier of the capacitive touch sensor of the ESP32 card, as well as with internal edges rounded to facilitate the collection of the pads; the LED indicator has a window on these edges which will turn on if the assigned event activates them. At the bottom, it has a compartment for the ESP32 card and its battery. Figure 3 shows the CAD design with the LED indicator and the metallic base that amplifies the touch sensor embedded with the development board ESP32; in order to have a portable device, a battery is needed; the lower part of the box will have a battery
Fig. 3 Smarter pills box design (LED indicator, metallic base, battery case and electronics compartment)
Fig. 4 IFTTT applets
Fig. 5 Test of the prototype (LED indicator, ESP8266 board, app notifications)
The battery can be a traditional 3.6 V or 3.7 V lithium battery, as in any smartphone, or a supercapacitor, which allows very fast charging (in seconds) if needed [21, 22]. Figure 4 shows the applets that were developed in IFTTT; events were created in Google Calendar with the name "PastillasAbu." Afterward, the IFTTT application was installed on a smartphone and the applets were executed on it. The Blynk web service was programmed on the ESP32 card, and code was added to read the touch sensor. The LEDs were connected to the GPIOs of the card, and cables were placed to simulate the metallic platform of the pill box. When the events created on the smartphone were activated, the IFTTT application executed the created applets, and these applets sent the RESTful command to the Blynk web server enabled on the ESP32 card. Figure 5 shows the Android notifications on the smartphone and the LEDs turned on by these notifications.
4 Conclusions and Future Work
IoT devices are the future of development and innovation in the world; we live in a time when we can break the barriers of technology by bringing smart devices to the
entire population, thus allowing greater insertion of society into technological development and seeking to help and include all people, regardless of age or disability, in these benefits. The results obtained with the Smarter Pills prototype show that it is possible to make a low-cost device that can help people who do not have many resources to have a better quality of life. This prototype was designed and built around resources most people already have: a smartphone and an Internet connection. The cost of the electronics is low, and the integrated sensors of the ESP32 card reduce the number of electronic components in the Smarter Pills embedded device. The design of the box was made with a conventional 3D printer in mind, to reduce manufacturing costs and bring this model to everyone. Based on the results of this prototype, it can be concluded that this project has many potential users. Readers can add further improvements, such as an Android application as a user interface to register Google Calendar events, as well as a simple way to configure the network to which the ESP32 card is connected, in order to switch networks without reprogramming the card.
References 1. INEGI I (2015) Encuesta Intercensal. http://www.beta.inegi.org.mx/programas/intercensal/ 2015/ 2. Durón JIM, Gutiérrez S, Rodríguez F (2019) Mobile positioning for IoT-based bus location system using LoRaWAN. In: 2019 IEEE International conference on engineering veracruz (ICEV), pp 1–7 3. Bandyopadhyay D, Sen J (2011) Internet of things: applications and challenges in technology and standardization. In: Wireless personal communications, pp 49–69 4. Al-Fuqaha A, Guizani M, Mohammadi M, Aledhari M, Ayyash M (2015) Internet of things: a survey on enabling technologies, protocols, and applications. IEEE Commun Surv Tutorials 17:2347–2376. https://doi.org/10.1109/COMST.2015.2444095 5. Weyrich M, Ebert C (2016) Reference architectures for the internet of things. IEEE Softw. https://doi.org/10.1109/MS.2016.20 6. Alaa M, Zaidan AA, Zaidan BB, Talal M, Kiah MLM (2017) A review of smart home applications based on internet of things 7. Zanella A, Bui N, Castellani A, Vangelista L, Zorzi M (2014) Internet of things for smart cities. IEEE Internet Things J. https://doi.org/10.1109/JIOT.2014.2306328 8. Liu T, Yuan R, Chang H (2012) Research on the internet of things in the automotive industry. In: Proceedings—2012 international conference on management of e-Commerce and e-Government, ICMeCG 2012 9. Rahmani AM, Thanigaivelan NK, Gia TN, Granados J, Negash B, Liljeberg P, Tenhunen H (2015) Smart e-health gateway: bringing intelligence to internet-of-things based ubiquitous healthcare systems. In: 2015 12th Annual IEEE consumer communications and networking conference, CCNC 10. Huang AQ, Crow ML, Heydt GT, Zheng JP, Dale SJ (2011) The future renewable electric energy delivery and management (FREEDM) system: The energy internet. Proc IEEE (2011). https://doi.org/10.1109/JPROC.2010.2081330 11. Zhao JC, Zhang JF, Feng Y, Guo JX (2010) The study and application of the IOT technology in agriculture. In: Proceedings—2010 3rd IEEE international conference on computer science and information technology, ICCSIT 2010
12. An J, Chung WY (2017) A novel indoor healthcare with time hopping-based visible light communication. In: 2016 IEEE 3rd World forum on internet of things, WF-IoT 2016
13. Istepanian RSH, Hu S, Philip NY, Sungoor A (2011) The potential of internet of m-health things m-IoT for non-invasive glucose level sensing. In: Proceedings of the annual international conference of the IEEE engineering in medicine and biology society, EMBS
14. De Luca GE, Carnuccio EA, Garcia GG, Barillaro S (2016) IoT fall detection system for the elderly using Intel Galileo development boards generation I. In: CACIDI 2016—Congreso Argentino de Ciencias de la Informatica y Desarrollos de Investigacion
15. Velrani KS, Geetha G (2016) Sensor based healthcare information system. In: 2016 IEEE technological innovations in ICT for agriculture and rural development (TIAR), pp 86–92
16. Alharam AK, El-Madany W (2017) Complexity of cyber security architecture for IoT healthcare industry: a comparative study. In: Proceedings—2017 5th international conference on future internet of things and cloud workshops, W-FiCloud 2017
17. ESP32 Reference Manual. https://www.espressif.com/sites/default/files/documentation/esp32_technical_reference_manual_en.pdf
18. BlynkTeam, Blynk-Server. https://github.com/blynkkk/blynk-server
19. Blynk HTTP RESTful API Apiary. https://blynkapi.docs.apiary.io/#reference/0/write-pin-value-via-put/write-pin-value-via-put?console=1
20. The Webhooks Service. https://help.ifttt.com/hc/en-us/articles/115010230347-The-Webhooks-Service
21. Ibanez F, Vadillo J, Echeverria JM, Fontan L (2013) Design methodology of a balancing network for supercapacitors. In: IEEE PES ISGT Europe
22. Ibanez FM (2018) Analyzing the need for a balancing system in supercapacitor energy storage systems. IEEE Trans Power Electron 33(3):2162–2171
Security and Privacy in IOT Devender Bhushan and Rashmi Agrawal
Abstract Internet of Things (IoT) security and privacy are major concerns, as many devices, communication channels, and media are used to create the network. Power, communication media, and storage are also required. The challenge becomes more critical when we scale up to different kinds of devices running various operating systems and sensors, which makes managing information security even harder. IoT sits close to the human side, and any error or security breach can lead to loss of human lives and major disasters. In this article, we discuss the security measures we need to take and also how to manage the privacy of the system user, which is equally important and critical.
Keywords IoT security · Security challenge · Security layer · IoT privacy
1 Introduction
The Internet of Things (IoT) consists of various devices that gather, process, and exchange huge amounts of security- and safety-critical data as well as privacy-sensitive information, and hence are appealing targets of various cyber-attacks. New-generation networkable devices are IoT-enabled, lightweight, and have low energy requirements. These devices must devote most of their available energy and computation to executing core application functionality, which makes supporting security and privacy affordably quite challenging. Traditional security methods tend to use more resources in terms of energy consumption and processing overhead. Moreover, many state-of-the-art security frameworks are highly centralized and are thus not necessarily well-suited for IoT due to the difficulty of scale, the many-to-one nature
D. Bhushan (B) · R. Agrawal Manav Rachna International Institute of Research and Studies, Faridabad, India e-mail: [email protected] R. Agrawal e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_63
of the traffic, and single points of failure. To protect user security and privacy, available methods often reveal either noisy or incomplete data, which may prevent some IoT applications from offering personalized services. Consequently, IoT demands a lightweight, scalable, and distributed security and privacy safeguard. In recent years, IoT applications have increased due to the rapid development of radio-frequency identification (RFID) technology and wireless sensor networks (WSN). With the help of RFID, we are able to tag each device in the network. On the other hand, with the help of WSN, a "thing" (a person, device, controller, etc.) becomes a wirelessly identifiable object able to communicate among the physical, cyber, and digital worlds.
2 IoT Architecture
IoT is not only connected devices or sensors. In fact, IoT is a system capable of sensing and responding to input autonomously, without human intervention. Therefore, there is a complete protocol stack to manage all the operations. Like any other network protocol stack, IoT has the three major layers below to manage the data flow; details of each layer, with its purpose and operation, are given below (Fig. 1).
(A) Perception layer
The perception layer is also called the "sensors" layer in IoT. Its main function is to acquire data from the environment with the help of sensors and actuators. This layer detects, collects, and processes information and transmits it to
Fig. 1 Three-layer architecture framework of IoT
the next layer, which is the network layer. The other task of this layer is to perform IoT node collaboration in local and short-range networks [3].
(B) Network layer
The network layer serves the function of data routing and transmission to different IoT hubs and devices over the Internet. At this layer, cloud computing platforms, Internet gateways, switching and routing devices, etc., operate using recent technologies such as WiFi, LTE, Bluetooth, 3G, and Zigbee. The network gateways serve as mediators between different IoT nodes by aggregating, filtering, and transmitting data to and from different sensors.
(C) Application layer
The application layer is responsible for, and guarantees, the authenticity, integrity, and confidentiality of the data. At this layer, the purpose of IoT, the creation of a smart environment, is achieved.
3 Security Challenges
The security goals of confidentiality, integrity, and availability (CIA) apply to IoT. However, IoT has many restrictions and limitations on components, devices, and computational and power resources. The heterogeneous and ubiquitous nature of IoT also introduces additional concerns. Now, not only do we interact with the Internet through our computers and mobiles, but "things" (sensors) also interact without any human intervention. These things regularly exchange huge amounts of data over the Internet and can control physical assets. Hacking and lack of security can put assets, or even human lives, at risk in some cases. From manufacturers to users, IoT still has many gaps due to immature technologies in this area. Below are the major security challenges from the manufacturer-to-user perspective.
1. Manufacturing standards
2. Update management
3. Physical hardening
4. User knowledge and awareness
4 Top IoT Security Risks
In 2016, due to a lack of compliance and unstandardized processes, an IoT video camera manufacturer set weak and unprotected passwords for some of their products, which in turn led to one of the most damaging Mirai malware botnet attacks. Out of the many IoT security threats, we highlight some of the most important below.
1. Lack of compliance
Due to the huge development of the IoT domain, lots of new devices come out almost daily, but there is still no standard compliance for these devices to follow. The primary reason for most IoT device security issues is that not enough time and resources are spent on the security domain. Some of the major risks are:
1. Very weak, guessable, or hardcoded passwords
2. Issues with the hardware
3. Weak or no mechanism for firmware updates
4. Old or unpatched embedded operating systems
5. Insecure data storage and transfers with other devices
2. Lack of user awareness
Over time, we have learned to avoid spam and perform virus scans on PCs. But IoT devices are new, and users are not aware of updates and security measures; although major issues come from the manufacturer's side, users are still responsible when they do not secure IoT devices the way they should be. Social engineering attacks are commonly overlooked IoT security attacks in which, instead of targeting devices, hackers target the humans using the IoT devices. In 2010, social engineering was used in the Stuxnet attack against a nuclear facility in Iran.
3. Problems in device update management
Insecure software and firmware are another challenge for security. All new devices sold by manufacturers carry new software or firmware, but it is almost inevitable that new vulnerabilities will come out. Updates and security patches are important to push whenever a new vulnerability is discovered. Unlike for smartphones and computers, no such automated mechanism has been developed so far, due to the different device types and manufacturers and the complex network architecture of some solutions.
4. Physical hardening
Another security challenge for IoT devices is the lack of physical hardening. Most IoT devices operate autonomously without any human intervention. Physical security for such devices is all the more important because physically tampering with them, or infecting them via USB or other means, is easy. Physical security starts on the manufacturer's side, but users are equally responsible for securing physical access to the device.
5. Botnet attacks
A single IoT device infected with malware is not a big problem for the network. The issue comes when a collection of them starts attacking together, sending requests every second, which can bring anything down. To perform a botnet attack, hackers create an army of bots, infect them with malware, and direct them to send thousands of requests per second to block the network and other resources of the devices (Fig. 2).
Fig. 2 DDoS attacks on IoT
6. Hijacking IoT devices
Ransomware is among the nastiest malware types ever to exist. Currently, it is rare for an IoT device to get infected, but it could become a trend soon, as healthcare and home gadgets could be hijacked to control their operations.
7. Crypto mining
Mining cryptocurrency needs CPU and GPU resources, and from this requirement another IoT security risk has emerged: crypto mining with IoT bots. This type of attack does not create any damage, but it uses the IoT device's resources to mine cryptocurrency. The open-source currency Monero was the first to be mined with infected IoT devices.
5 Conclusion
IoT security and privacy remain critical to manage. There are many risks and challenges, and more will emerge in the coming years. The introduction of more devices in the future will make them even more complex to manage and define. International organizations and governments need to define universal standards and protocols for IoT solutions. Manufacturers of IoT devices also need to set, or follow, common standards to design and deliver their solutions. Manufacturers must put security at the center and follow all the steps strictly, not only during delivery but also post-delivery.
As IoT becomes a vital part of modern solutions, the threats discussed above may be only the beginning. So, over time, if we want our devices smart, we also need them secure, for better control and safety.
References
1. Bhushan D, Agrawal R (2020) Security challenges for designing wearable and IoT solutions. In: Balas V, Solanki V, Kumar R, Ahad M (eds) A handbook of internet of things in biomedical and cyber physical system. Intelligent systems reference library, vol 165. Springer, Cham
2. Mahmoud R, Yousuf T, Aloul F, Zualkernan I (2016) Internet of things (IoT) security: current status, challenges and prospective measures. In: 2015 10th international conference for internet technology and secured transactions, ICITST. https://doi.org/10.1109/ICITST.2015.7412116
3. Suo H, Wan J, Zou C, Liu J (2012) Security in the internet of things: a review. In: Proceedings—2012 international conference on computer science and electronics engineering, ICCSEE 2012. https://doi.org/10.1109/ICCSEE.2012.373
4. Gubbi J, Buyya R, Marusic S, Palaniswami M (2013) Internet of things (IoT): a vision, architectural elements, and future directions. Future Gener Comput Syst. https://doi.org/10.1016/j.future.2013.01.010
5. Lin J, Yu W, Zhang N, Yang X, Zhang H, Zhao W (2017) A survey on internet of things: architecture, enabling technologies, security and privacy, and applications. IEEE Internet of Things J. https://doi.org/10.1109/JIOT.2017.2683200
6. Minerva R, Biru A, Rotondi D (2015) Towards a definition of the internet of things (IoT)
7. Patel K, Patel SM (2016) Internet of things-IOT: definition, characteristics, architecture, enabling technologies, application and future challenges. Int J Eng Sci Comput. https://doi.org/10.4010/2016.1482
8. Alaba FA et al (2017) Internet of things security: a survey. J Netw Comput Appl. https://doi.org/10.1016/j.jnca.2017.04.002
9. Weyrich M, Ebert C (2016) Reference architectures for the internet of things. IEEE Software. https://doi.org/10.1109/MS.2016.20
10. Botta A et al (2016) Integration of cloud computing and internet of things: a survey. Future Gener Comput Syst. https://doi.org/10.1016/j.future.2015.09.021
11. EPoSS (2008) Internet of things in 2020: a roadmap for the future. RFID working group of the European technology platform on smart systems integration (EPoSS)
12. Dorey P (2017) Securing the internet of things. In: Smart cards, tokens, security and applications, 2nd edn. https://doi.org/10.1007/978-3-319-50500-8_16
13. Khan R et al (2012) Future internet: the internet of things architecture, possible applications and key challenges. In: Proceedings—10th international conference on frontiers of information technology, FIT 2012. https://doi.org/10.1109/fit.2012.53
14. Rayes A, Salam S (2016) Internet of things—from hype to reality: the road to digitization. https://doi.org/10.1007/978-3-319-44860-2
15. Pal A (2015) Internet of things: making the hype a reality. IT Professional. https://doi.org/10.1109/MITP.2015.36
16. Shrikanth G (2014) Internet of things: hype or reality. Dataquest. https://doi.org/10.1007/978-3-319-44860-2
17. Nielsen P, Fjuk A (2010) The reality beyond the hype: mobile internet is primarily an extension of PC-based internet. Inf Soc. https://doi.org/10.1080/01972243.2010.511561
18. Hui TKL, Sherratt RS, Sánchez DD (2017) Major requirements for building smart homes in smart cities based on internet of things technologies. Future Gener Comput Syst. https://doi.org/10.1016/j.future.2016.10.026
19. Xu LD, He W, Li S (2014) Internet of things in industries: a survey. IEEE Trans Ind Inf. https://doi.org/10.1109/TII.2014.2300753
20. Chatterjee JM, Ghatak S, Kumar R, Khari M (2018) BitCoin exclusively informational money: a valuable review from 2010 to 2017. Qual Quant 52(5):2037–2054 21. Jha S, Kumar R, Chatterjee JM, Khari M (2019) Collaborative handshaking approaches between internet of computing and internet of things towards a smart world: a review from 2009–2017. Telecommun Syst 70(4):617–634 22. Khari M, Kumar M, Vij S, Pandey P (2016). Internet of things: proposed security aspects for digitizing the world. In: 2016 3rd International conference on computing for sustainable global development (INDIACom), IEEE, pp 2165–2170 23. Chatterjee JM, Kumar R, Khari M, Hung DT, Le DN (2018) Internet of things based system for smart kitchen. Int J Eng Manuf 8(4):29 24. Solanki VK, Kumar R, Khari M (2019) In: Balas VE (ed) Internet of things and big data analytics for smart generation, Springer
Cross-Domain Using Composing of Selected DCT Coefficients Strategy with Quantization Tables for Reversible Data Hiding in JPEG Image Pham Quang Huy, Ta Minh Thanh, Le Danh Tai, and Pham Van Toan
Abstract Reversible data hiding (RDH) techniques for JPEG images face big challenges in improving capacity, quality, and the flexibility of the embedding method. In this paper, we propose a new RDH algorithm that combines a selected-DCT-coefficients strategy with quantization tables. In order to further reduce the distortion, we survey the effect of the quantization process on the DCT coefficients to devise a strategy for selecting the embedding DCT coefficients. Compared to the conventional method, our experimental results show that our proposed method performs better in terms of both capacity and the quality of JPEG images after data embedding.
Keywords Reversible data hiding (RDH) · JPEG image · DC coefficients · AC coefficients · Quantization tables
P. Q. Huy
Electric Power University, Ha Noi, Vietnam
e-mail: [email protected]
T. M. Thanh · L. D. Tai
Le Quy Don Technical University, Ha Noi, Vietnam
e-mail: [email protected]
T. M. Thanh (B) · P. Van Toan
Sun* Inc., Ha Noi, Vietnam
e-mail: [email protected]
P. Van Toan
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021
R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_64
1 Introduction
1.1 Overview
Recently, with the development of IT technology, the exchange of multimedia content via the Internet is becoming normal and is dramatically increasing. Everyone
can easily access all types of multimedia information, which raises the problems of threats to multimedia data privacy and the protection of data ownership and integrity. Such risks are inherent to multimedia in digital form used via the Internet. Therefore, researchers mainly focus on developing techniques that can protect the ownership and integrity of digital content after transfer via the Internet. In general, there are several methods for enhancing the protection of digital content. Encryption is the first notable methodology: it encrypts the original content to create scrambled content. Such scrambled content cannot be disclosed unless users apply the secret key to decode it [1]. However, the key management task in an encryption system is quite difficult: when the secret key is stolen (disclosed), the original content is no longer protected. Therefore, solutions ensuring the security, integrity, and confidentiality of multimedia information are highly required nowadays. In the literature, reversible data hiding (RDH) is a solution allowing users to hide a payload of secret information in their multimedia content, generating the embedded content. In these solutions, both the original cover content and the secret information can be recovered from the embedded content without distortion. Therefore, these important techniques are usually employed in medical imagery, military imagery, and law forensics, because the original images must not be adjusted or damaged after extracting the secret information [2, 3]. Such RDH techniques are also a special kind of fragile watermarking: RDH is not robust against image processing attacks, compression attacks, geometric attacks, and so on. That is why RDH is always used for integrity authentication, where distortion of the original content is easily detected if the embedded content was even slightly attacked. Many RDH algorithms have been proposed in the past decade. In general, we can classify such RDH techniques into three well-known categories: lossless compression appending schemes [3], difference expansion [4], and histogram shifting [5]. The algorithms mentioned above are normally applied to grayscale or color images. Recently, combinations of RDH methods have also been considered in order to achieve higher efficiency, such as the prediction-error (PE) approaches proposed in [6–8]. Most RDH algorithms are easily applied to image data, including gray images [9–11]. In color-based images, the luminance component, the chrominance components, and the RGB channels are employed to control the color domain for hiding a large capacity of information [12–14]. However, color-based RDH methods may destroy color images irreversibly, which is why images cannot be restored without any loss in some cases. Therefore, RDH methods for images should choose robust features and appropriate regions for embedding the secret digital information. One more problem of RDH methods for images is the computational cost of the recovery algorithm, which makes RDH methods difficult to apply in practical applications; only applications like medical imaging and military imaging, in which distortion is not desired, can adopt them. In order to apply RDH methods in real applications, some researchers have proposed RDH algorithms for the Joint
Photographic Experts Group (JPEG) image.1 The JPEG image format is used in many applications of daily life, and RDH in JPEG images is useful for image integrity, image authentication, and image privacy. RDH for JPEG images needs to consider the following problems: (1) the capacity of the hidden information is limited; (2) the visual quality of a JPEG image is lower than that of an uncompressed image; (3) the size of the embedded JPEG image may increase compared with the original one. That is why RDH for JPEG images is a challenge compared with RDH for uncompressed images.
1 https://jpeg.org/.
1.2 Our Contributions
In this paper, we focus on proposing an RDH algorithm for JPEG images in order to solve the problems mentioned above. Based on the method of Tian [4], we improve the algorithm to adapt it to the data domain of the JPEG image, namely the DCT domain (DC coefficients and AC coefficients). We also extend the proposed RDH method to embed information in a cross-domain by composing the selected DCT coefficients strategy with quantization tables. In particular, we make the following contributions in this paper:
Proposing appropriate RDH for JPEG images: Conventional RDH methods for JPEG images normally employ the plain DCT domain of the JPEG algorithm for embedding information. However, such methods do not select the coefficients carefully, which causes large distortion and prevents control of the capacity. We propose an algorithm for JPEG reversible embedding that can flexibly select the DCT coefficients with less total distortion. We also combine the DCT frequency domain with the quantization table domain, called the cross-domain, to expand the embedding domain.
Increasing the capacity of RDH for JPEG images: The limitation of RDH for JPEG images is low capacity, because the embeddable data domain depends on the structure of the JPEG image. Most conventional RDH methods employ the nonzero AC coefficients or the DC coefficients for embedding information. However, the nonzero coefficients of the DCT domain in JPEG are limited, so the embedding capacity cannot be improved. In order to increase the embedding capacity, we propose a new RDH method in which not only the DCT coefficient domain is used but the quantization tables are also employed for hiding data. Our proposed method can improve the capacity significantly.
Improving the quality of the embedded images: In general, when the embedding capacity increases, the quality of the embedded images suffers and the distortion becomes visible to human eyes. In our method, we survey the effect of the quantization tables on image quality and then design the algorithm for hiding information in the quantization table
for minimizing the total distortion of the images. This makes our method proactive in selecting the appropriate DCT coefficients in the JPEG image.
1.3 Roadmap
The rest of this paper is organized as follows. Section 2 presents a brief review of related work. The proposed method is then presented in Sect. 3. In Sect. 4, our proposed method is thoroughly evaluated: experimental results are presented to show its efficiency. Finally, we conclude the paper in Sect. 5.
2 Related Work
Our proposed method focuses on RDH for JPEG images and tries to resolve the problems mentioned in Sect. 1.2. This section gives an overview of the JPEG algorithm and classifies the categories of RDH for JPEG images.
2.1 JPEG Algorithm
An overview of the JPEG algorithm is shown in Fig. 1. Normally, the algorithm includes three steps, namely the discrete cosine transform (DCT), quantization, and the entropy encoder. Firstly, the original image is divided into non-overlapping 8 × 8
Fig. 1 Algorithm of JPEG
blocks, each of which is then fed into the 2D forward DCT (FDCT) function. The obtained DCT tables are quantized by quantization tables. Finally, the quantized coefficients are encoded using the Huffman tables in order to generate the entropy coding. In the JPEG algorithm, the elements of the quantization tables can be adjusted or controlled by using different quality factors (QF) [15]. We can control the quality of a JPEG image by controlling the value of the QF parameter in the range [1, 100]. The values of the quantization table Qt are obtained using the following equations:
$$ Q_t(i, j) = \max\left(\left[\frac{Q(i, j) \times SF}{100}\right],\ 1\right) \qquad (1) $$

$$ SF = \begin{cases} \dfrac{5000}{QF}, & \text{if } QF < 50 \\ 200 - 2\,QF, & \text{if } QF \ge 50 \end{cases} \qquad (2) $$
where Q is the recommended quantization table for the luminance and chrominance components, SF is the scale factor, and [x] is the rounding operator applied to x. Figure 2 shows a sample quantization table for the luminance component and the quantized process in the JPEG algorithm. The recommended quantization table Q (Fig. 2a) is scaled to become the quantization table Qt (Fig. 2b) when the original image is compressed with the quality factor QF = 70. The quality factor QF is the parameter that decides the quality of the JPEG image: the bigger the value of QF ∈ [1, 100], the higher the quality of the generated JPEG image. In the quantization process, the FDCT tables F (Fig. 2c) are quantized to compress the DC coefficients and AC coefficients shown in Fig. 2d. The nonzero coefficients gather in the low- and middle-frequency zones, whereas the high-frequency zone contains many zero coefficients, which are efficiently compressed by the Huffman code. In our method, we compose the selected DCT coefficients with the appropriate quantization table to propose a new cross-frequency domain in order to increase the capacity of the embedding method.
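For illustration, Eqs. (1)–(2) translate directly into code; this is a plain transcription of the formulas (the function name is ours), assuming an 8 × 8 base table Q as input:

```cpp
#include <algorithm>
#include <cmath>

// Scale the base quantization table Q into Qt for a quality factor QF,
// following Eqs. (1)-(2): SF = 5000/QF when QF < 50, SF = 200 - 2*QF
// otherwise, and Qt(i,j) = max(round(Q(i,j) * SF / 100), 1).
void scaleQuantTable(const int Q[8][8], int QF, int Qt[8][8]) {
    const double SF = (QF < 50) ? 5000.0 / QF : 200.0 - 2.0 * QF;
    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 8; j++)
            Qt[i][j] = std::max(static_cast<int>(std::lround(Q[i][j] * SF / 100.0)), 1);
}
```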
2.2 Overview of RDH for JPEG
The difficulty of RDH for JPEG images is embedding the secret information into the quantized DCT coefficients. The values of the embedding domain are based on the values of the DCT domain; therefore, it is hard to control and improve the quality of the embedded JPEG image. There are some proposals for data hiding using the features of the JPEG image. Fridrich et al. [16] proposed an embedding method that employs the redundant component of the DCT domain for embedding information. The embedded data can be extracted exactly because their method preserves the embedding domain efficiently. However,
Fig. 2 Quantization tables and quantized process
the capacity of Fridrich et al.'s method [16] is limited, as it focuses only on the DCT domain. The disadvantage of RDH methods based on the quantized DCT domain is an increase in the file size of the embedded JPEG image; such a change may lead attackers to probe images for hidden information. For example, the method of [17] also achieved high JPEG image quality; however, its file size increases greatly. Another JPEG feature that can be employed for RDH is modification of the Huffman tables. In general, this kind of method controls the variable code length of the Huffman code to embed the secret data into the DCT domain [18]. The variable code length of the JPEG algorithm is a useful feature for embedding data; a similar idea is shown in our paper [19]. The file size problem can be solved by using the invariant Huffman code length feature; however, the limited embedding capacity remains. Another approach, using histogram shifting for RDH, is proposed in the method of Wedaj et al. [20]. Their method employs the histogram of the AC coefficients in each DCT block to decide where to embed the information: it utilizes the '+1' and '−1' AC coefficients for embedding information while shifting the remaining nonzero
AC coefficients' histogram to the left or right according to their sign. Their method achieved a better peak signal-to-noise ratio (PSNR) for the quality of the embedded JPEG image compared with previous methods. However, its embedding capacity still needs to be improved.
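Schematically, the per-coefficient rule of such a ±1 histogram-shifting embedder looks as follows; this is only an illustration of the general idea described above, not the exact algorithm of [20]:

```cpp
#include <cstddef>
#include <vector>

// One-coefficient step of a ±1 histogram-shifting embedder on quantized AC
// values: coefficients equal to +1 or -1 carry one payload bit; larger
// magnitudes are shifted outward by one so extraction stays unambiguous;
// zero coefficients are left untouched.
int hsEmbedCoeff(int ac, const std::vector<int>& bits, std::size_t& pos) {
    if (ac == 1)  return ac + bits[pos++];   //  1 ->  1 (bit 0) or  2 (bit 1)
    if (ac == -1) return ac - bits[pos++];   // -1 -> -1 (bit 0) or -2 (bit 1)
    if (ac > 1)   return ac + 1;             // shift the right lobe
    if (ac < -1)  return ac - 1;             // shift the left lobe
    return 0;                                // ac == 0
}
```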
2.3 Inspiration from Tian's Method [4]
The method proposed by Tian was the first RDH scheme for digital images, and it is a simple algorithm to implement. His method uses the spatial domain, in which intensity values lie within the range of 0 and 255. In order to apply Tian's method to JPEG images, we need to adapt his approach to be suitable for the DCT domain of the JPEG image. The difference expansion (DE) algorithm pairs the pixels $(x_{i,j}, y_{i,j})$ of the original image I to create a low-pass image L and a high-pass image H. The low-pass pixels $l_{i,j}$ and the high-pass pixels $h_{i,j}$ are defined as the integer averages and the pixel differences of $(x_{i,j}, y_{i,j})$, computed as follows:

$$ l_{i,j} = \left\lfloor \frac{x_{i,j} + y_{i,j}}{2} \right\rfloor, \qquad h_{i,j} = x_{i,j} - y_{i,j} \qquad (3) $$
The information bit b is embedded into the difference $h_{i,j}$ by replacing its least significant bit (LSB):

$$ h'_{i,j} = 2h_{i,j} + b \qquad (4) $$
After embedding the watermark information, the watermarked pixel pair $(x'_{i,j}, y'_{i,j})$ is obtained by the following calculation:

$$ x'_{i,j} = l_{i,j} + \left\lfloor \frac{h'_{i,j} + 1}{2} \right\rfloor, \qquad y'_{i,j} = l_{i,j} - \left\lfloor \frac{h'_{i,j}}{2} \right\rfloor \qquad (5) $$
In order to ensure that the watermarked image is invertible, the difference $h_{i,j}$ and the integer average $l_{i,j}$ should be classified into the expandable difference (E) and changeable (C) sets mentioned in the paper [4]. That means, to restrict the values of $x'_{i,j}$ and $y'_{i,j}$ to the range [0, 255], the following condition is required:

$$ 0 \le l_{i,j} + \left\lfloor \frac{h'_{i,j} + 1}{2} \right\rfloor \le 255, \qquad 0 \le l_{i,j} - \left\lfloor \frac{h'_{i,j}}{2} \right\rfloor \le 255 \qquad (6) $$

for both $b \in \{0, 1\}$. When Condition (6) is satisfied, the difference $h_{i,j}$ associated with $h'_{i,j}$ is said to be expandable under the integer average value $l_{i,j}$.
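As a minimal sketch, the forward and inverse transforms of Eqs. (3)–(5) for a single pair can be written as follows (function names are ours; the expandability test of Condition (6) is left to the caller):

```cpp
// Floor division (C++ '/' truncates toward zero, which differs for negatives).
static int floorDiv(int a, int b) {
    int q = a / b, r = a % b;
    return (r != 0 && ((r < 0) != (b < 0))) ? q - 1 : q;
}

// Embed one bit b into the pair (x, y) by difference expansion, Eqs. (3)-(5).
void deEmbed(int x, int y, int b, int& xp, int& yp) {
    int l  = floorDiv(x + y, 2);    // integer average, Eq. (3)
    int h  = x - y;                 // difference, Eq. (3)
    int hp = 2 * h + b;             // expanded difference carrying b, Eq. (4)
    xp = l + floorDiv(hp + 1, 2);   // Eq. (5)
    yp = l - floorDiv(hp, 2);       // Eq. (5)
}

// Recover the hidden bit and the original pair from the embedded pair.
void deExtract(int xp, int yp, int& x, int& y, int& b) {
    int hp = xp - yp;               // the expanded difference is preserved by Eq. (5)
    int l  = floorDiv(xp + yp, 2);  // the integer average is preserved as well
    b = ((hp % 2) + 2) % 2;         // LSB of h' is the hidden bit
    int h = floorDiv(hp, 2);        // original difference
    x = l + floorDiv(h + 1, 2);
    y = l - floorDiv(h, 2);
}
```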
3 Our Proposed Method
The method of [4] is basically applied to RGB-domain images, so the values of the spatial domain are easily controlled. In order to apply the RDH algorithm to JPEG images, we need to adapt the method of [4] to the DCT frequency domain of the JPEG image. The values of the DCT frequency domain of JPEG are integers ranging from −1024 to 1023. This section presents how to adapt the general RDH algorithm to the JPEG image.
3.1 Improved RDH (IRDH) Algorithm for JPEG Image
Our proposed method improves the method of [4] by selecting the suitable DCT coefficients involved in our algorithm. We need to choose the DCT coefficients in the encoding method such that the process can be reversed in the decoding method. In our method, the original JPEG image I is decomposed into low-pass integer averages $l_{i,j}$ and high-pass differences $h_{i,j}$. We also determine the changeable (C) and expandable (E) locations. In addition, to control the values of the DCT coefficients involved in our improved algorithm, let T be a given threshold value. An expandable pair with difference value $h_{i,j}$ and average value $l_{i,j}$ should satisfy Condition (6) or the following conditions:

$$ |h_{i,j}| \le T, \qquad T \le l_{i,j} < 255 - T \qquad (7) $$
Our improved RDH method applied to DCT coefficients is based on the method of [21]. We classify the difference values of the DCT coefficients into two categories: the expandable set E and the inexpandable values. The expandable set E can be divided into the set expandable only once ($N_e$) and the set expandable more than once ($M$); note that $N_e = E \setminus M$. The inexpandable values can be divided into two groups: (1) the ambiguously inexpandable set ($N_{\bar e}$) and (2) the unambiguously inexpandable set ($U$). In our method, we classify the embeddable and inembeddable DCT coefficient pairs $(x_{i,j}, y_{i,j})$ using the values of $h_{i,j}$ and $T$. The conditions for such pairs are mapped as follows:
• If $|h| \le \frac{T-1}{2}$, $(x_{i,j}, y_{i,j})$ belongs to $M$
• If $\frac{T-1}{2} \le |h| \le T + 1$, $(x_{i,j}, y_{i,j})$ belongs to $N$:
  – If $\frac{T-1}{2} \le |h| \le T$, $(x_{i,j}, y_{i,j})$ belongs to $N_e$
  – If $T < |h| \le T + 1$, $(x_{i,j}, y_{i,j})$ belongs to $N_{\bar e}$
• If $|h| > 2T + 1$, $(x_{i,j}, y_{i,j})$ belongs to $U$
After defining the category of every DCT coefficient in the whole image, embedding and extraction can be implemented as in the method shown in Sect. 2.3. Based on this improvement, the IRDH method can be applied to JPEG images efficiently. A sketch of the classification rule in code is given below.
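This is a small sketch of the classification rule, transcribed from the mapping above; note that the interval between T + 1 and 2T + 1 is not assigned by the text, so the sketch places it in U, which is our assumption:

```cpp
#include <cstdlib>  // std::abs

enum PairClass { M, Ne, NeBar, U };

// Classify a pair by its difference h and threshold T, following the mapping
// transcribed above from [21]; boundary handling follows that text.
PairClass classifyPair(int h, int T) {
    int a = std::abs(h);
    if (2 * a <= T - 1) return M;      // |h| <= (T-1)/2: expandable more than once
    if (a <= T)         return Ne;     // expandable only once
    if (a <= T + 1)     return NeBar;  // ambiguously inexpandable
    return U;                          // treated as unambiguously inexpandable
}
```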
3.2 Proposal of Capacity by Using Quantization Table Embedding
This paper also proposes to increase the capacity of the IRDH method by exploiting a feature of the JPEG image. We employ the quantization table Q because each Q is defined by the specified quality factor QF. We investigate the influence of Q on the values of the DCT coefficients. We count the total number $N_z(i, j)$ of nonzero DCT coefficients of the whole JPEG image, where (i, j) are the coordinates of the quantization coefficient Q(i, j) in the quantization table Q. We suppose that a quantization coefficient Q(i, j) corresponding to a large number of nonzero DCT coefficients has a great impact on the whole JPEG image. Therefore, if the secret information W is embedded into quantization coefficients Q(i, j) that have less impact on the DCT coefficients of the JPEG image, we can improve the capacity while maintaining the quality of the JPEG image. To show the influence of the table Q on the values of the DCT coefficients, we have investigated the number of nonzero DCT coefficients of the Girl image. Figure 3 shows the frequency of nonzero DCT coefficients over the whole Y component of the Girl image. Based on the result of Fig. 3, we consider that if $N_z(i, j) = 0$, the corresponding quantization coefficient Q(i, j) has no impact on the quality of the JPEG image, and if $N_z(i, j) \le T_q$ ($T_q$ is a predefined threshold), the corresponding quantization coefficient Q(i, j) has little impact on the quality of the JPEG image. Based on the above investigation, we propose a new RDH scheme using quantization table embedding (called QRDH) as follows:
– If $N_z(i, j) = 0$, replace the value of Q(i, j) by 8 bits of secret information from W.
– If $N_z(i, j) < T_q$, embed the information bit b into the LSB of Q(i, j).
– If $N_z(i, j) \ge T_q$, do not embed.
Our proposed method guarantees that the capacity and the quality of the JPEG image after information embedding can be improved effectively.
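The three QRDH rules above translate into a short routine such as the following sketch (names are ours; practical details such as keeping table entries nonzero and signaling which positions carry payload are omitted):

```cpp
#include <cstddef>
#include <vector>

// Sketch of the QRDH rule. nz[i][j] is the count N_z(i, j) of nonzero quantized
// DCT coefficients at position (i, j) over the whole image; 'bits' is the secret
// bitstream W and 'pos' the read cursor (both names are ours).
void qrdhEmbed(int Qt[8][8], const int nz[8][8], int Tq,
               const std::vector<int>& bits, std::size_t& pos) {
    for (int i = 0; i < 8; i++) {
        for (int j = 0; j < 8; j++) {
            if (nz[i][j] == 0) {
                int byte = 0;                   // entry never used by the image:
                for (int k = 0; k < 8; k++)     // replace it with 8 payload bits
                    byte = (byte << 1) | bits[pos++];
                Qt[i][j] = byte;
            } else if (nz[i][j] < Tq) {
                Qt[i][j] = (Qt[i][j] & ~1) | bits[pos++];  // low impact: embed in the LSB
            }
            // nz[i][j] >= Tq: too influential on quality, leave untouched
        }
    }
}
```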
4 Experimental Results
Eight standard images of size 512 × 512, shown in Fig. 4 and including 'airplane,' 'barbara,' 'boat,' 'fruits,' 'goldhill,' 'lena,' 'peppers,' and 'zelda,' are used to evaluate the efficiency of the proposed method. The test images are compressed using the IJG toolbox with the quality factor QF = 75.
Fig. 3 Number of nonzero DCT coefficients of girl image
Fig. 4 Eight test images
Table 1 Comparison of capacity (bits) at QF = 75

JPEG image    IRDH    Our proposed method (IRDH + QRDH)
airplane      5969    6483
baboon        4989    5405
barbara       5654    6266
boat          5684    6296
fruits        6384    6797
goldhill      5744    6258
lena          6424    7026
peppers       5625    6194
zelda         5918    6619
The secret message bits are extracted from the grayscale logo shown in Fig. 4(i), of size 32 × 32. In order to evaluate the proposed method, we choose two factors that are important for an RDH method: visual quality and capacity. We compare the experimental results with those of the IRDH method of Sect. 3.1. In addition, in our proposed method, the thresholds T and $T_q$ are empirically set to 20 and 500, respectively.
4.1 Evaluation of Capacity

In an RDH method, the capacity of the embedding method is very important, and many researchers have tried to improve it. However, when the capacity of the secret information increases, the quality of the embedded JPEG images usually becomes worse. With our proposed method that combines IRDH and QRDH (called cross-domain), we can improve the capacity while almost preserving the quality of the embedded JPEG images. The results shown in Table 1 demonstrate that our proposed method achieves a better capacity than the IRDH method, which makes it possible to hide more information in JPEG images effectively.
4.2 Evaluation of Visual Quality

The peak signal-to-noise ratio (PSNR) defined in [19] is used to measure the visual quality between the embedded JPEG image and the original one. The PSNR values of IRDH and of our proposed method (IRDH + QRDH) are calculated and compared with each other. Table 2 shows the comparison of PSNR values.
Table 2 Comparison of PSNR values

JPEG images    PSNR (dB) (QF = 75)
               IRDH      Our proposed method (IRDH + QRDH)
airplane       32.80     32.79
baboon         34.23     34.21
barbara        33.10     33.09
boat           32.80     32.79
fruits         35.23     35.21
goldhill       34.44     34.43
lena           34.13     34.12
peppers        34.01     34.00
zelda          37.87     37.86
According to the results, we find that although the capacity is increased, the proposed method preserves the quality of the embedded JPEG images.
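[19] is cited for the PSNR definition; for 8-bit grayscale images the usual form is PSNR = 10·log10(255²/MSE). A small sketch under that standard assumption:

```python
import numpy as np

def psnr(original, embedded, peak=255.0):
    """Peak signal-to-noise ratio between two grayscale images (uint8 arrays)."""
    diff = original.astype(np.float64) - embedded.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```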
5 Conclusion

A new method of reversible data hiding in JPEG images is proposed. We have improved the method of [4] for JPEG image embedding and proposed the cross-domain method combining IRDH and QRDH for increasing the capacity while preserving the quality of the embedded JPEG images. Our method is very simple and easy to implement in real applications.

Acknowledgements This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2019.12.
References
1. Fiat A, Naor M (1994) Broadcast encryption. In: Stinson DR (ed) CRYPTO 1993, LNCS, vol 773. Springer, Heidelberg, pp 480–491
2. Brar AS, Kaur M (2012) Reversible watermarking techniques for medical images with ROI tamper detection and recovery: a survey. Int J Emerg Technol Adv Eng 2(1):32–36
3. Fridrich J, Goljan M, Du R (2002) Lossless data embedding for all image formats. Proc SPIE 4675:572–583
4. Tian J (2003) Reversible data embedding using a difference expansion. IEEE Trans Circ Syst Video Technol 13(8):890–896
5. Ni Z, Shi Y, Ansari N, Wei S (2006) Reversible data hiding. IEEE Trans Circ Syst Video Technol 16(3):354–362
6. Ou B, Li X, Zhao Y, Ni R, Shi YQ (2013) Pairwise prediction-error expansion for efficient reversible data hiding. IEEE Trans Image Process 22(12):5010–5021
7. Ou B, Li X, Wang J, Peng F (2017) High-fidelity reversible data hiding based on geodesic path and pairwise prediction-error expansion. Neurocomputing 226:23–34
8. Wang J, Ni J, Zhang X, Shi Y (2017) Rate and distortion optimization for reversible data hiding using multiple histogram shifting. IEEE Trans Cybern 47(2):315–326
9. Bao P, Ma X (2005) Image adaptive watermarking using wavelet domain singular value decomposition. IEEE Trans Circ Syst Video Technol 15(1):96–102
10. Asikuzzaman M, Alam MJ, Lambert AJ, Pickering MR (2014) A blind and robust video watermarking scheme using chrominance embedding. In: International conference on digital image computing: techniques and applications, pp 1–6
11. Chang C, Lin C, Fan Y (2008) Lossless data hiding for color images based on block truncation coding. Pattern Recogn 41(7):2347–2357
12. Kutter M, Winkler S (2002) A vision-based masking model for spread spectrum image watermarking. IEEE Trans Image Process 11(1):16–25
13. Asikuzzaman M, Alam MJ, Lambert AJ, Pickering MR (2014) Imperceptible and robust blind video watermarking using chrominance embedding: a set of approaches in the DT CWT domain. IEEE Trans Inf Forensics Secur 9(9):1502–1517
14. Chou CH, Liu KC (2010) A perceptually tuned watermarking scheme for color images. IEEE Trans Image Process 19(11):2966–2982
15. Pennebaker WB, Mitchell JL (1993) JPEG: still image data compression standard. Springer
16. Fridrich J, Goljan M (2002) Lossless data embedding for all image formats. SPIE Proc Photon West Electron Imaging 4675:572–583
17. Fridrich J, Goljan M, Du R (2001) Invertible authentication watermark for JPEG images. In: IEEE international conference on information technology, pp 223–227
18. Mobasseri BG, Berger RJ, Marcinak MP, Naik Raikar YJ (2010) Data embedding in JPEG bit-stream by code mapping. IEEE Trans Image Process 19(4):958–966
19. Thanh TM, Iwakiri M (2013) A proposal of digital rights management based on incomplete cryptography using invariant Huffman code length feature. J Multimedia Syst 20(2):127–142
20. Wedaj FT, Kim S, Kim J, Huang F (2017) Improved reversible data hiding in JPEG images based on new coefficient selection strategy. EURASIP J Image Video Process 2017:63. https://doi.org/10.1186/s13640-017-0206-1
21. Kim HJ, Shi YQ, Nam J, Choo H, Sachnev V (2008) A novel difference expansion transform for reversible data embedding. IEEE Trans Inf Forensics Secur 3(3):456–465
Empirical Analysis of Routing Protocols in Opportunistic Network Renu Dalal and Manju Khari
Abstract Opportunistic network is known as a new mechanism to provide communication in wireless mobile ad hoc networks. Communication in an opportunistic network depends on opportunistic contacts through which messages are exchanged between wireless devices, whereas in a mobile ad hoc network, nodes use end-to-end paths. A major trait of opportunistic networks is spasmodic connectivity: connections occur at irregular intervals of time, with variable connections between mobile nodes. With no a priori information about the network topology, paths are created dynamically. Many efficient message transfer protocols were proposed in the last decade to determine when, and through which node and path, it is suitable to forward a message. In this paper, a literature review with an empirical analysis of the routing protocols used in opportunistic networks to provide an appropriate path between sender and receiver nodes is presented. Challenges and recent applications of opportunistic networks are also considered, and finally, comparison results and an analysis of these protocols are given. Keywords Mobile nodes · Opportunistic network · Routing protocols · MANETs
1 Introduction

Wireless networks have already emerged as a crucial part of everyday life. A mobile ad hoc network (MANET) is one type of wireless network which consists of mobile nodes. In the present and future scenario, the opportunistic network is an evolutionary paradigm which provides great advantages to heterogeneous networks, multi-hop wireless networks and pervasive computing. Because of largely dispersed mobile nodes, lack of wireless radio coverage and limited resources
R. Dalal · M. Khari (B) Department of Computer Science, AIACT and R, GGSIP University, Delhi, India
e-mail: [email protected]
R. Dalal e-mail: [email protected]
of energy, the opportunistic network works as a sub-class of delay-tolerant networks [1–3]. Figure 1 shows the basic working paradigm of an opportunistic network: within communication range, different movable and stationary objects communicate with each other whenever they find the opportunity to communicate. Routing protocols and network performance are affected by the peculiarities of opportunistic networks, such as mobility of nodes, contact duration time, energy and storage delay [4–6]. Multiple challenges have to be considered while transmitting data in an opportunistic network:
1. Energy efficiency of mobile nodes: Mobile nodes consume a high level of energy while saving and disseminating message packets in the opportunistic network.
2. Functionality issue: Sensor nodes, mobile nodes, wireless devices, etc., make up the heterogeneous environment of an opportunistic network; due to this heterogeneity, interoperability becomes a challenge in this network.
3. Buffer capacity: Because the store-carry-forward process is followed by mobile nodes to disseminate data, more storage capacity is needed in the buffer.
4. Protracted delay: Until the route for packet transmission is decided, the sender node keeps the message in its buffer, so it can take a long time for a packet to reach the destination node.
5. Security issue: Cryptographic algorithms are required to provide safe and secure transmission in an opportunistic network, but these algorithms degrade battery performance.
Fig. 1 Opportunistic network
Because of the lack of privacy between mobile nodes, a malicious node may enter the network.
6. Network overhead and recurrent connectivity: The overhead of the wireless network increases because the routing algorithm generates duplicate copies of the same packet.

The rest of this paper is organized as follows. Section 2 presents related work and describes the routing protocols of opportunistic networks. The simulation of some routing protocols on the ONE tool, with results and analysis, is presented in Sect. 3. Conclusion and future work are described in Sect. 4.
2 Routing Protocols of Opportunistic Network

Various effective routing protocols have been proposed for MANETs due to their low cost of development, like hybrid group key management for heterogeneous networks, HSR, LKH, etc. [7, 8]. Routing protocols in opportunistic networks, however, work in a different way: they follow the principle of the store-carry-forward mechanism. Mobile nodes in an opportunistic network have no fixed path and no source-to-destination connectivity, and the high maneuverability of the nodes makes the network recurrent. Opportunistic networks work in real-time scenarios with a diverse number of applications like Underwater Sensor Networks (UWS) [9, 10], CenWits, the Shared Wireless Infostation Model (SWIM), ZebraNet [11–13], Airborne Networks, Space and Underwater Acoustic Communication Networks, etc. [14–19]. In this section, a few routing protocols are described as follows:
Direct Delivery: The direct delivery routing protocol was proposed by Spyropoulos et al. [6] in 2004. The sender node keeps the message in its buffer until it comes into direct contact with the receiver node; the message is not disseminated to any nearby mobile node. Because only direct contact between the sending node and the receiving node delivers a message, this routing protocol is adequate in terms of network bandwidth and network resources. However, the receiver node receives the packet only after a long delay while the sender node holds the packet in its buffer, which reduces the reliability of the protocol.
Epidemic Protocol: Vahdat et al. [20] proposed the epidemic opportunistic routing protocol in 2000. This routing protocol broadcasts a packet to all neighboring nodes whenever a mobile node receives it. The packet is disseminated until the destination mobile node receives the packet or the packet reaches its maximum number of hops. The main objective of the epidemic protocol is to reduce delay and increase message delivery, at the cost of considerable bandwidth and memory consumption during the transmission of a message from sender to receiver.
Spray and Wait Protocol: Spyropoulos et al. [21] proposed this opportunistic routing protocol in 2005. Spray and Wait is an efficient, advanced version of the epidemic routing protocol. The protocol works in two phases: 1. Spray phase, 2. Wait phase. In the first phase, N copies of a message packet are disseminated by the sender node to N mobile nodes in the network. These N nodes, which received the message, keep the message packet in their
buffer. In the second phase, a mobile node carries the message with itself until it finds the destination node within its communication range.
Cluster Protocol: Dang and Wu [22] proposed the cluster-based opportunistic routing protocol in 2010. Multiple mobile nodes having the same mobility pattern and properties form a cluster in the wireless network. To balance the load and minimize the overhead of the network, this routing protocol interchanges and shares resources. The protocol achieves high scalability, packet delivery ratio and performance with a small buffer size and low latency under limited resource consumption, although it is suitable for only one hop.
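To make the store-carry-forward decisions of these protocols concrete, here is a hedged sketch of the per-contact decision in the Spray and Wait protocol described above; the function and its return values are our own illustration of the source-spray variant (the binary variant would halve the copy count instead):

```python
def spray_and_wait_forward(copies_left, encountered_is_destination):
    """Two-phase Spray and Wait decision at a node holding a message.

    Returns (action, remaining copies).
    """
    if encountered_is_destination:
        return "deliver", copies_left              # destination reached
    if copies_left > 1:
        return "spray_one_copy", copies_left - 1   # spray phase
    return "wait", copies_left                     # wait phase: carry the last copy
```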
3 Result and Analysis

In this section, the different routing protocols are simulated for distinct parameters on the Opportunistic Network Environment (ONE) tool. The network simulator with the MapRouteMovement mobility model is used to implement the routing protocols discussed in Sect. 2. Tables 1 and 2 present the simulated results of these protocols. The routing protocols are simulated for distinct parameters at distinct time intervals such as 7200.1000, 14,400.0000 and 28,800.1000 ms, as shown in Table 1. In Table 2, the simulation time is varied again, and accordingly the results for messages delivered, overhead ratio and buffer time average also vary. Figure 2 shows the simulation graph of the routing protocols for the buffer time average parameter: in terms of buffer time average, the direct delivery protocol gives the maximum value and the cluster protocol the minimum with respect to simulation time. Figure 3 shows the simulation graph of the routing protocols for the messages delivered parameter with respect to simulation time: the cluster protocol gives the highest output, and the epidemic protocol the minimum. The overhead ratio against simulation time is plotted for distinct times as shown in Fig. 4: the maximum overhead ratio is given by the cluster protocol and the minimum by direct delivery. Figure 5 shows the details of the simulation environment scenario in which the Epidemic, Spray and Wait, direct delivery and cluster protocols are simulated.
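The metric names in Tables 1 and 2 (messages delivered, overhead ratio, buffer time average) appear to correspond to fields of the ONE simulator's MessageStatsReport. A small parser sketch, assuming the standard "key: value" format of that report (the file path below is a hypothetical example):

```python
def parse_message_stats(path):
    """Read a ONE MessageStatsReport file into a dict of metrics."""
    stats = {}
    with open(path) as f:
        for line in f:
            if ":" not in line:
                continue  # skip the header line
            key, _, value = line.partition(":")
            try:
                stats[key.strip()] = float(value)
            except ValueError:
                stats[key.strip()] = value.strip()
    return stats

# e.g. stats = parse_message_stats("reports/scenario_MessageStatsReport.txt")
# print(stats["delivered"], stats["overhead_ratio"], stats["buffertime_avg"])
```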
4 Conclusion and Future Work

The store-carry-forward mechanism is used by relay or intermediate nodes for transferring a message to the final destination node in an opportunistic network. An opportunistic network is formed on the opportunity of network accessibility and proximity, with no predefined connection and disconnection of networks and devices. The purpose of this paper is to analyze some routing protocols of opportunistic networks. Major challenges and important applications were also summarized in this research paper. Distinct simulation times were used to simulate the Epidemic, Spray and Wait,
Table 1 Simulation results of routing protocols

Factors             Simulation time   Epidemic    Direct delivery   Spray and Wait   Cluster
Messages delivered  7200.1000         46          99                28               194
                    14,400.0000       107         208               77               264
                    28,800.1000       156         289               197              432
Overhead_ratio      7200.1000         87.1028     0.0000            21.5972          167.5963
                    14,400.0000       95.7826     0.0000            18.7005          178.0712
                    28,800.1000       99.2115     0.0000            17.5856          181.5785
Buffertime_avg      7200.1000         995.2923    4036.0000         2114.5763        1252.6538
                    14,400.0000       1122.8088   8166.5444         2175.5257        1739.8182
                    28,800.1000       1341.8709   15,220.0803       2596.7257        2037.4312
Table 2 Simulation results of routing protocols

Factors             Simulation time   Epidemic    Direct delivery   Spray and Wait   Cluster
Messages delivered  21,600.1000       158         192               197              370
                    36,000.1000       276         379               540              620
                    43,200.1000       343         468               664              740
Overhead_ratio      21,600.1000       98.4937     0.0000            18.7005          175.4588
                    36,000.1000       92.6413     0.0000            17.8333          182.5678
                    43,200.1000       89.3965     0.0000            17.4729          184.4590
Buffertime_avg      21,600.1000       1270.6052   13,748.6630       2175.5257        1981.3946
                    36,000.1000       1362.0338   15,653.4312       2665.3039        2072.185
                    43,200.1000       1407.0311   15,743.6380       2744.3308        2101.5392
Fig. 2 Buffer time average and simulation time
Fig. 3 Message delivered and simulation time
direct delivery and cluster protocols. In the future, we will focus on designing a novel routing protocol with minimum delivery delay and negligible overhead ratio.
Fig. 4 Overhead ratio and simulation time
Fig. 5 Simulation environment
References 1. Dalal R, Singh Y, Khari M (2012) A review on key management schemes in MANET. Int J Distrib Parallel Syst 3(4):165 2. Dalal R, Khari M, Singh Y (2012) Different ways to achieve Trust in MANET. Int J AdHoc Netw Syst (IJANS) 2(2):53–64 3. Dalal R, Khari M, Singh Y (2012) Survey of trust schemes on ad-hoc network. In: International conference on computer science and information technology, Springer, Berlin, Heidelberg, pp 170–180
4. Chaintreau A, Hui P, Crowcroft J, Diot C, Gass R, Scott J (2005) Pocket switched networks: real-world mobility and its consequences for opportunistic forwarding. Tech Rep UCAM-CL-TR-617, University of Cambridge
5. Mtibaa A, Chaintreau A, LeBrun J, Oliver E, Pietiläinen A, Diot C (2008) Are you moved by your social network application? In: Proceedings of the first workshop on online social networks, pp 67–72
6. Small T, Haas Z (2005) Resource and performance trade-offs in delay-tolerant wireless networks. In: Proceedings of the 2005 ACM SIGCOMM workshop on delay-tolerant networking, pp 260–267
7. Dalal R, Khari M, Singh Y (2012) Authenticity check to provide trusted platform in MANET (ACTP). In: Proceedings of the second international conference on computational science, engineering and information technology, pp 647–655
8. Dalal R, Khari M, Singh Y (2012) The new approach to provide trusted platform in MANET
9. Chandrasekhar V, Seah WK, Choo YS, Ee HV (2006) Localization in underwater sensor networks: survey and challenges. In: Proceedings of the 1st ACM international workshop on underwater networks, ACM, pp 33–40
10. Partan J, Kurose J, Levine BN (2007) A survey of practical issues in underwater networks. ACM Sigmobile Mobile Comput Commun Rev 11(4):23–33
11. Huang JH, Amjad S, Mishra S (2005) CenWits: a sensor-based loosely coupled search and rescue system using witnesses. In: Proceedings of the 3rd international conference on embedded networked sensor systems, ACM, pp 180–191
12. Small T, Haas ZJ (2003) The shared wireless infostation model: a new ad hoc networking paradigm (or where there is a whale, there is a way). In: Proceedings of the 4th ACM international symposium on mobile ad hoc networking and computing, ACM, pp 233–244
13. Martonosi M (2004) The Princeton ZebraNet project: sensor networks for wildlife tracking. Princeton University, pp 2–7
14. http://www.airbornewirelessnetwork.com/index.asp#network. Accessed on 25 March 2018
15. https://www.nasa.gov/open/space-communications.html. Accessed on 25 March 2018
16. Wood L, Eddy WM, Ivancic W, McKim J, Jackson C (2007) Saratoga: a delay-tolerant networking convergence layer with efficient link utilization. In: International workshop on satellite and space communications (IWSSC'07), IEEE, pp 168–172
17. https://www.nasa.gov/mission_pages/station/research/experiments/730.html. Accessed on 25 March 2018
18. Bhasin K, Hayden J (2004) Developing architectures and technologies for an evolvable NASA space communication infrastructure. In: 22nd AIAA international communications satellite systems conference and exhibit 2004 (ICSSC), p 3253
19. Stojanovic M, Preisig J (2009) Underwater acoustic communication channels: propagation models and statistical characterization. IEEE Commun Mag 47(1):84–89
20. Vahdat A, Becker D (2000) Epidemic routing for partially-connected ad hoc networks. Duke Tech Report CS-2000-06
21. Spyropoulos T, Psounis K, Raghavendra C (2005) Spray and wait: an efficient routing scheme for intermittently connected mobile networks. In: Proceedings ACM SIGCOMM workshop on delay-tolerant networking (WDTN'05)
22. Dang H, Wu J (2010) Clustering and cluster-based routing protocol for delay-tolerant mobile networks. IEEE Trans Wireless Commun 9(6)
Fuzzy Lexicon-Based Approach for Sentiment Analysis of Blog and Microblog Text Srishti Sharma and Vaishali Kalra
Abstract Sentiment analysis refers to determining the polarity of a document through natural language processing, text analytics and computational linguistics. With the continuous explosion of opinionated online text from blog Web sites, microblogs and product review Web sites, investigations in sentiment analysis are growing manifold. However, most of the sentiment analysis approaches are either lexicon-based or machine learning-based or a hybrid of these two. Very few researchers have explored the use of fuzzy logic in sentiment analysis. In this work, we propose a fuzzy lexicon-based approach for unsupervised sentiment analysis involving multiple lexicons and datasets. The crux of this work is the set of fuzzy rules proposed to assess the sentiment of blogs and microblogs. Experimental results verify the suitability of this approach for SA. We also contrast our work with the existing fuzzy sentiment analysis approaches to establish the superiority of the proposed system. Keywords Sentiment analysis · Fuzzy logic · Lexicons · Blogs · Microblogs · Twitter
1 Introduction

Owing to its immense potential as the voice of the customer, sentiment analysis (SA) has become the latest buzzword these days and is driving business intelligence and analytics. Sentiment analysis examines texts, like social media posts (blogs, microblogs, product reviews, etc.), with respect to the opinions present in them about a product, service, event, person or idea. The elementary task in SA is grouping
S. Sharma · V. Kalra (B) Department of Computer Science and Engineering, The NorthCap University, Gurgaon, India
e-mail: [email protected]
S. Sharma e-mail: [email protected]
a text at the document, sentence or feature/aspect level and categorizing it into one of the polarity classes positive, negative or neutral [1]. The major challenges in SA are: (i) analyzing textual polarity is so difficult that oftentimes even humans are not able to arrive at a consensus regarding the polarity of a text; (ii) because of its informal nature, social media text is often not grammatically correct; the language used is informal, full of short forms and abbreviations, and marked with slangs and emoticons. Fuzzy logic was introduced in 1965 by Lotfi Zadeh [2]. Fuzzy logic is an advancement over Boolean logic wherein the truth values of variables are not limited to only 0 and 1 but may lie anywhere between these two values, both inclusive. According to the theory of fuzzy logic, the truth value of a variable may lie somewhere between wholly true and wholly false. In this work, we propose a new fuzzy rule- and lexicon-based SA approach.
2 Related Work

According to the SA surveys carried out in [3] and [4], primarily three SA approaches have evolved over the years: lexicon-based, machine learning-based and hybrids of these two. We review them one by one.
Lexicon-based: Authors in [5] use the lexicon-based approach for aspect-wise summarization of product review texts. In [6], the authors use a lexicon-based SA approach to topically segregate opinion holders and their respective sentiments toward different topics. In [7], the authors propose using a lexicon for SA which contains words as well as phrases; they also use eleven affix patterns, such as imX, disX and Xful, to generate further words. Authors in [8] use a list of linguistic constraints and a number of seed adjectives to mine new polar words and their polarities. For example, according to the rule for the conjunction "and", the conjoined words are sentiment-consistent and are likely to have the same polarity; similarly, words like "but" and "however" are sentiment changers. Some authors propose lexicons for obtaining tweet sentiment, as in [9]. Authors in [10] explore the prospect of using text streams as an alternative to the traditional ballot system; they compare mass opinion obtained from polls with the opinions articulated on Twitter. In [11], the authors carry out SA of movie reviews using SentiWordNet [12] and a domain-specific lexicon, computing the polarity class and the strength per aspect.
Machine learning-based: [13] was one of the earliest ML approaches for SA. In it, the authors extract sentiments from movie reviews employing three machine learning methods: NB, MaxEnt and SVM. They analyze the factors that make SA more challenging than other classification problems. In [14], the authors use NB, SVM and k-NN classifiers with suitable feature selection and reduction schemes for SA of customer feedback data; they show that linear SVMs attain good accuracy on data which may be difficult even for humans to analyze. Authors in [15] combine R-CNN and C-RNN using a fusion gate to improve the performance of SA systems and reduce the dependence on feature engineering. Authors in [16] examine the performance
of different ML algorithms like NB, SVM, J48 and Sequential Minimal Optimization (SMO) on Twitter data. They conclude that SMO, SVM and random forest display acceptable performance, but the NB classifier failed to perform; moreover, there was no provision for handling negation words like no, not and never. In [17], the authors use deep learning for enhanced news SA.
Hybrid: In [18], the authors carry out SA on Twitter using part-of-speech (POS) based prior polarities. They investigate using tree kernels for effective tweet representation as well as the amalgamation of different feature classes. Authors in [19] carry out SA on tweets using the Weka data mining APIs. At the outset, they use a sentiment lexicon for SA and then research classifier training and effectiveness in tweet classification, experimenting with diverse classifiers.
Fuzzy SA approaches: Authors in [20] and [21] use fuzzy logic for sentiment analysis. In [22], the authors propose a fuzzy logic-based SA system for determining the temper fluctuations of cricket fans over time by evaluating their tweets.
Contribution and objective: The use of fuzzy logic in SA is comparatively rare compared with lexicon-based, ML-based or hybrid SA techniques. Researchers in [20–22] carry out SA using fuzzy logic; however, in all these works the authors propose a system for SA of tweets only, while much of the online data is in the form of blogs, product reviews, etc., which have a structure totally different from tweets. This work aims to bridge that research gap.
3 Proposed Work

Figure 1 outlines the detailed working of the proposed SA scheme. In this work, we use the Mamdani fuzzy inference scheme [23]. It is made up of four stages: fuzzification, inferencing, aggregation of rule outputs and defuzzification. We explain these steps with reference to our proposed fuzzy SA scheme.
3.1 Input

The input is the set of texts on which we need to perform SA. First of all, we carry out text preprocessing and sentiment scoring using a lexicon.
Fig. 1 Working of the proposed fuzzy SA scheme
Text preprocessing: Our datasets primarily comprise blog and microblog data. The text needs to be preprocessed first, as social media text is often unstructured, full of grammatical errors, and contains many abbreviations, slangs, emoticons, hashtags, Uniform Resource Locators (URLs), etc.
Sentiment scoring using the Vader lexicon: After preprocessing, we carry out sentiment scoring using the Vader lexicon. Vader calculates its scores using a wisdom-of-the-crowd approach along with a set of rules for processing English-language textual data. Users generally express their sentiments in different ways on social media, using emoticons, intensifiers, punctuation or capitalization, and conjunctions; the Vader lexicon has defined rules to deal with these [24]. The Vader lexicon takes a text as input and outputs its positive and negative score values, represented by TextPos and TextNeg, respectively.
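This scoring step can be reproduced with the vaderSentiment Python package; polarity_scores is the package's actual entry point, while mapping its 'pos'/'neg' fields to TextPos/TextNeg is our reading of the text above:

```python
# pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
scores = analyzer.polarity_scores("The movie was GREAT!!! :)")
# scores is a dict of the form {'neg': ..., 'neu': ..., 'pos': ..., 'compound': ...}
text_pos, text_neg = scores["pos"], scores["neg"]  # our TextPos / TextNeg reading
```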
3.2 Fuzzification Fuzzification is the name assigned to the task of making a concrete value vague. The polarity scores of each text, calculated from the last stage, are fuzzified using triangular membership function. The membership function for a fuzzy set Z on the universe Y is represented as µZ : Y → [0,1]. All the members of Y are converted to a corresponding value between 0 and 1. This represents the degree of belongingness of the particular item in X into the fuzzy set A. The triangular fuzzy membership is
Fig. 2 Triangular membership function
graphically presented in Fig. 2, where the factors c, n and d are taken to be 0.2, 0.5 and 0.8, respectively. We construct three fuzzy sets, namely Less (L), Intermediate (I) and More (M), using triangular fuzzy membership for the universe variables: positive (y_p), negative (y_n) and output (y_op). We compute the limits of y_p, y_n and y_op independently for all the datasets. Subsequently, we compute two factors, namely lower and upper, which are the overall minimum and maximum values, respectively, over both the positive scores TextPos and the negative scores TextNeg of all the texts. The factors y_p and y_n vary over (lower, upper). The factor middle is worked out as shown in Eq. 1:

middle = (lower + upper)/2    (1)
The factors pertaining to the membership of the sets Less (L), Intermediate (I) and More (M) are as follows: L: {lower, lower, middle}; I: {lower, middle, upper}; M: {middle, upper, upper}. With parameters c, n and d, the triangular membership function is

μ(x) = 0 for x ≤ c;  μ(x) = (x − c)/(n − c) for c < x ≤ n;  μ(x) = (d − x)/(d − n) for n < x < d;  μ(x) = 0 for x ≥ d.

Nine Mamdani fuzzy rules presented in Eqs. 2–10 form the crux of this work.
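A sketch of the triangular membership function as we reconstruct it from the garbled formula; the parameter names c, n, d and the factors 0.2, 0.5, 0.8 follow the text, while the example input value is ours:

```python
def triangular(x, c, n, d):
    """Triangular membership function with feet c, d and peak n (c < n < d)."""
    if x <= c or x >= d:
        return 0.0
    if x <= n:
        return (x - c) / (n - c)
    return (d - x) / (d - n)

# Fuzzify a normalized score with the factors used above.
low, mid, high = 0.2, 0.5, 0.8
membership = triangular(0.65, low, mid, high)  # 0.5 for this example input
```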
S(r) = P{R > r} = 1 − F(r)    (2)

where R is a random variable of the radius r.
F(r) = P{R ≤ r} = P(relevant photos of the POI exist inside radius r)    (3)
F(r) is the failure function, or the cumulative distribution function, of the random variable R; it is the probability that relevant photos exist before a certain radius r.
• Probability Density Function
The probability density function of the survival radius R is defined as the probability that the POI has relevant photos in a short interval, per unit distance. It can be expressed as follows:

f(r) = lim_{Δr→0} P[r ≤ R < r + Δr] / Δr    (4)
• Hazard Function
The hazard function h(r) can be estimated using:

h(r) = lim_{Δr→0} P[(r ≤ R < r + Δr) | R ≥ r] / Δr    (5)

h(r) = f(r)/(1 − F(r)) = f(r)/S(r)    (6)
We use the life table method [10] to estimate the survival function:

Ŝ(r_i) = ∏_{j=1}^{i−1} (1 − q̂_j)    (7)
where q̂_j = d_j/n_j is the conditional probability of failure in the interval, d_j is the number of photos that fail in the interval, and n_j is the number of relevant photos exposed in the interval.
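A minimal sketch of the life table estimate of Eq. (7); the list-based interface is our choice, not from the paper:

```python
def life_table_survival(d, n):
    """Life table estimate S(r_i) = prod_{j<i} (1 - q_j), with q_j = d[j] / n[j].

    d[j]: number of photos that fail in interval j
    n[j]: number of photos exposed (at risk) in interval j
    Returns the survival estimate at the start of each interval, S[0] = 1.
    """
    S, s = [1.0], 1.0
    for dj, nj in zip(d, n):
        s *= 1.0 - dj / nj
        S.append(s)
    return S
```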
3 Case Study

We show the effectiveness of our approach in a case study with two POIs in the USA using Flickr data. As an exemplary dataset, we select two POIs that have public
Fig. 2 Existent probability of basic relevant photos outside each radius distance
information, including GPS location, from Wikipedia (www.wikipedia.org). We observed the empirical number of basic relevant photos of each POI for each radius distance with the initial radius r_0 = 0.1 km and the increasing interval Δr = 0.1 km.

3.1 Example 1: Angels Stadium of Anaheim, California, USA

The POI is a baseball stadium in Anaheim, California. The photos around the POI that contain the keyword "Angels Stadium" or "Angel Stadium" in their text are considered the basic relevant photos. The survival probability of basic relevant photos by radius of the stadium is shown in Fig. 2. It is stable over a long radius distance once the radius is greater than 0.4 km, so we select 0.4 km as a reasonable radius for extracting other relevant photos. Using Google Maps, we identify the neighbor POIs of the stadium within 0.4 km. There are four: City National Grove of Anaheim; Anaheim Amtrak; Industrial Plastic Supply, Inc; and Anaheim North Net Fire Training. None of the neighbor POIs is related to baseball or sport, which is why keywords related to baseball or sport are distinctive keywords between the stadium and its neighbor POIs (Fig. 3).
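The paper picks the stable radius by inspecting the curve; one hedged way to automate that choice (the tolerance eps below is our assumption, not a value from the paper) is:

```python
def reasonable_radius(S, r0=0.1, dr=0.1, eps=0.01):
    """Smallest radius after which the survival curve stays stable.

    S: survival values sampled at radii r0, r0 + dr, r0 + 2*dr, ...
    """
    for i in range(1, len(S)):
        # Accept index i if every later step changes by less than eps.
        if all(abs(S[j] - S[j - 1]) < eps for j in range(i, len(S))):
            return r0 + (i - 1) * dr
    return r0 + (len(S) - 1) * dr
```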
3.1 Example 1: Angels Stadium of Anaheim, California, USA The POI is a baseball stadium in Anaheim, California. The photos around the POI that contain the keywords “Angels Stadium” or “Angel Stadium” in its text is considered as the basic relevant photos. The survival probability of basic relevant photos by radius of the stadium shows in Fig. 2. It is stable over a long radius distance when the radius distance is greater than 0.4 km. We select 0.4 km as a reasonable radius for extracting other relevant photos. By using Google Maps, we specify neighbor POIs of the stadium within 0.4 km. There are four neighbor POIs including: City National Grove of Anaheim; Anaheim Amtrak; Industrial Plastic Supply, Inc; and Anaheim North Net Fire Training. All of the neighbor POIs are not related to baseball or sport that is why keywords related to baseball or sport are distinctive keywords between the stadium and its neighbor POIs (Fig. 3). From top 200 tag frequency, from the set of tag documents of the basic relevant photos, we can manually select the list of distinctive keywords including: “angel stadium”, “baseball”, “los angeles angels of Anaheim”, “anaheim angels”, “angels stadium”, “ballpark”, “mlb”, “stadium”, “major league baseball”, “california angels”, “baseball stadium”, “ballparks”, “red sox”, “sports”, “sporting event”, “stadiums”, “playoff”. With the distinctive keywords, we can extract other relevant photos of the stadium within the radius 0.4 km. We regard the set of relevant photos that do not exist on basic relevant photos as advanced relevant photos. From the top 200 tag frequency of advanced relevant photos, we can discover new information that is very strong related to the stadium including: “batting”, “ramirez, “diamondclub”, “edison field”, “mason”, “all-star”, “tailgate”, “games”, “match”, “regular season”. We merge the set of basic relevant photos with advanced relevant photos into the full relevant photos. We use tag clustering technology [12] to compare the topics of 3 www.wikipedia.org.
Fig. 3 Map at Angels Stadium, Anaheim, California, USA
the basic relevant photos with those of the full relevant photos, as in Table 1. Table 1 shows that the full collection of relevant photos is better for generating a description of the POI since it conveys more information in its topics.
3.2 Example 2: Queens Museum of Art, New York, USA

This POI is a museum in New York City, USA. The photos around the museum with "Queens Museum" or "Queen Museum" in their text are considered the basic relevant photos. The existent probability of basic relevant photos by radius distance of the museum is shown in Fig. 4. The existent probability of basic relevant photos is nearly stable where the radius is greater than 0.5 km, so we select this radius as a reasonable distance to extract other relevant photos. Within this distance, using Google Maps, we can recognize the neighbor POIs, including: Arthur Ashe Stadium, Louis Armstrong Stadium, USTA Billie Jean King National Tennis Center, New York Hall of Science, Corona Golf Playground, Terrace On the Park, Queens Zoo, New York State Pavilion, Westinghouse Time Capsules, Rocket Thrower and Flushing Meadow Corona Park, as shown in Fig. 5. We can specify some distinctive keywords from the top 200 tags by frequency of the basic relevant photos, including: "passport Fridays", "public art", "museum", "museums", "queensmuseum", "brian tolle", "queensmuseumofart", "qma", "sculpture", "art", "performance art", "exhibits", "museumexhibits". With the set of distinctive keywords, we can extract other relevant photos. Based on the advanced relevant photos, we can discover more information within the top 200 tags that is strongly related to the museum, including: "louis comfort tiffany", "art museum", "panorama
Table 1 Top tag clustering topics for Angels Stadium (each row pairs a topic from the basic collection with the corresponding topic from the full collection of relevant photos)

Basic: Anaheim, anaheim angels, angel stadium, angels, angels versus oakland, ballpark, baseball, california, dodgers, los angeles, los angeles angels of anaheim, orange county, southern california, sporting event, sports
Full:  Anaheim, anaheim angels, anaheim stadium, angel, angel stadium, angeles, angels, angels versus oakland, ballpark, baseball, california, los, los angeles, los angeles angels of anaheim, mlb, orange county, seattle mariners, southern california, sporting event, sports, stadium, edison field, major leagues

Basic: Anahiem, angel, angels stadium, bono, christ, christian, concert, county, duck dynasty, expanded, fireworks, greg laurie, harvest crusade, live, music, nighttime, orange, phil robertson, phil wickham, screen, socal, stadium, summer, the edge
Full:  Bono, christ, christian, concert, duck dynasty, greg laurie, harvest crusade, live, music, phil robertson, phil wickham, socal, summer, third day, sox, people, baseball stadium, fireworks, diamondclub

Basic: Angel stadium of anaheim, baseball stadium, boston, ca, chavez ravine, division series, jakepix, la, major league baseball, mlb, playoffs, red sox, red sox versus angels
Full:  Athletics, oakland, the, verses, california angels, night, long exposure, nighttime, united states, angelstadium, yankees, texas rangers, losangelesangels, batting, anny, ramirez, los angeles dodgers, alds, boston red sox, playoffs, red sox, jakepix, mason, all-star

Basic: Ama, ama supercross, anaheimca, bill ritter, blue, female, hat, jordan ross, male, man, motorcycle, people, portrait, racing, red, sky, smile, sunglasses, supercross, united states, woman
Full:  Ama, anaheimca, motorcycle, racing, supercross, bill ritter, blue, female, hat, jordan ross, male, man, people, portrait, red, sky, smile, sunglasses, united states, woman, tailgate
Fig. 4 Existent probability of basic relevant photos outside each radius distance
challenge", "science museum", "new york worlds fair", "fotokemika", "museum of art", "flushing corona meadows", "history", "sputnik", "manhattan bridge", "transportation", "grounds". We merge the basic relevant photos with the advanced relevant photos into the full relevant photos and use tag clustering technology to compare the topics of the basic relevant photos with those of the full relevant photos, as in Table 2. As can be seen, more information is provided in the full collection of relevant photos.
Fig. 5 Map at Queens Museum, New York, USA
4 Related Work

Abbasi et al. [1] presented a method to identify landmark photos using tags and social Flickr groups. The authors applied group information and statistical preprocessing of the tags to obtain relevant landmark photos; in addition, they exploited tagging features and social Flickr groups to train a classifier to identify landmark photos. In the work of Ahern et al. [2], the authors used real-world data collected from Flickr to create a visualization tool, World Explorer, which displayed derived tags and relevant photos of geographic areas on a map interface. In this work, the authors used clustering technology, applying the K-means algorithm to the geo-tag information of photos, to collect relevant photos for geographic areas. However, it cannot be guaranteed that all photos in a cluster are relevant to a geographic area. Similarly, He et al. [13] used mean-shift, a nonparametric clustering algorithm, to collect relevant photos for generating landmark descriptions; once the clustering step is done, they regard the photos in each cluster as a set of relevant photos. Hao et al. [6] generated location overviews with relevant images and tags by mining user-generated travelogues; the authors used descriptive tags in topics mined from the travelogues to collect relevant images having those descriptive tags. Samany [15] proposed a novel approach to extract landmarks automatically by clustering geo-tagged photos with the density-based spatial clustering of applications with noise (DBSCAN) method and object detection by a deep artificial neural network (deep belief network) algorithm. Barros et al. [16] explored the potential of geotagged data from social networks to analyze visitors' behavior in national parks, using the Teide National Park as a study area. Bui et al. [18] proposed a novel POI mining framework using two-level clustering, random walk and constrained clustering for mining POI clusters of geo-tagged photos from Flickr.
Table 2 Top tag clustering topics for Queens Museum (each row pairs a topic from the basic collection with the corresponding topic from the full collection of relevant photos)

Basic: Lester associates, manhattan, model, moses, new york, new york city, new york city building, ny, nyc, panorama, panorama of the city of new york, qma, queens, queens museum of art, raymond lester associates, robert moses, scale model, the panorama, the panorama of the city of new york, united states
Full:  Corona, flushing, flushingmeadows, flushingmeadowscoronapark, lester associates, manhattan, model, moses, museum, new york, new york city, new york city building, ny, nyc, nycparks, panorama, panorama of the city of new york, park, qma, queens, queens museum of art, panorama challenge

Basic: Architecture, buildings, center for urban pedagogy, control, corona, crisis, cup, damon rich, design, flushing, flushing meadow, flushing meadows, flushing meadows corona park, housing, housing crisis, infrastructure, land use, larissa harris, learning, learning center, lines, long island, modeling, urbanism
Full:  Architecture, buildings, center for urban pedagogy, control, crisis, cup, damon rich, design, flushing meadow, housing, housing crisis, infrastructure, land use, larissa harris, learning, learning center, lines, long island, modeling, nyc panorama, ohny, ownership, pedagogy, power, urbanism, grounds

Basic: Ballet international africans, corona park, dance, drums, gothamist, morocco, music, passport fridays, pro-zak, rachid halihal ensemble, timothy vogel, west africa
Full:  Ballet international africans, corona park, dance, drums, gothamist, morocco, music, passport fridays, pro-zak, timothy vogel, west africa, rachid halihal ensemble, light

Basic: Architecturalmodels, cities, cityscapes, models, museums, world's fairs, artifacts, lamp, bridge, bronx, brooklyn bridge, dusk, lower manhattan, manhattan bridge, staten island
Full:  Architecturalmodels, cities, cityscapes, models, museums, world's fairs, artifacts, new york world's fair, lamp, louis comfort tiffany, tiffany, bridge, bronx, brooklyn bridge, dusk, lower manhattan, manhattan bridge, staten island

Basic: American bridge company, aymar embury ii, blue, daniel chait, gilmore clarke, gilmore d. clarke, globe, internationalism, landmark, light, modern, modernism, people, space age, summer, trees, unisphere, world
Full:  American bridge company, aymar embury ii, blue, daniel chait, flushing meadow park, flushing meadows park, gilmore d. clarke, globe, new york state pavilion, people, summer, trees, unisphere, world, manhattan bridge
Han et al. [19] presented a novel framework including an improved cluster method and multiple neural network models for extracting representative images from Flickr for tourist attractions.
5 Conclusion

In this paper, we propose a novel approach for extracting relevant geo-tagged photos for POIs from social media. We utilize nonparametric survival analysis to estimate the existent probability of photos with the POI name in their tags, title and description; these photos are considered the basic relevant photos of the POI. We select a reasonable radius distance within which the existent probability of basic relevant photos is considerable.
Using tags that are distinctive between the POI and its neighbor POIs, taken from the top tags by frequency of the basic relevant photos, we extract other relevant photos within the reasonable radius distance. We have demonstrated the effectiveness of our method in a case study with two POIs in the USA. As future work, we will apply our method to other tasks such as generating descriptions for POIs, POI recommendation, and so on.
References
1. Abbasi R, Chernov S, Nejdl W, Paiu R, Staab S (2009) Exploiting Flickr tags and groups for finding landmark photos. In: Proceedings of the 31st European conference on IR research on advances in information retrieval, pp 654–661
2. Ahern S, Naaman M, Nair R, Hui J (2007) World explorer: visualizing aggregate data from unstructured text in geo-referenced collections. In: Proceedings of the 7th ACM/IEEE-CS joint conference on digital libraries, pp 1–10
3. Clements M, Serdyukov P, de Vries AP, Reinders MJT (2010) Using Flickr geotags to predict user travel behaviour. In: SIGIR '10, pp 851–852
4. Cox DR (1972) Regression models and life tables. J Roy Stat Soc 34:189–220
5. De Choudhury M, Feldman M, Amer-Yahia S, Golbandi N, Lempel R, Yu C (2010) Automatic construction of travel itineraries using social breadcrumbs. In: HT, pp 35–44
6. Hao Q, Cai R, Wang X-J, Yang J-M, Pang Y, Zhang L (2009) Generating location overviews with images and tags by mining user generated travelogues. In: Proceedings ACM Multimedia, pp 801–804
7. Hauff C (2013) A study on the accuracy of Flickr's geotag data. In: SIGIR
8. Lu X, Pang Y, Hao Q, Zhang L (2009) Visualizing textual travelogue with location-relevant images. In: LBSN, pp 65–68
9. Lemmerich F, Atzmueller M (2011) Modeling location-based profiles of social image media using explorative pattern mining. In: Proceedings of IEEE SocialCom
10. Gehan EA (1969) Estimating survival function from life table. J Chron Dis 21:629–644
11. Snavely N et al (2006) Photo tourism: exploring photo collections in 3D. In: SIGGRAPH
12. Begelman G, Keller P, Smadja F (2006) Automated tag clustering: improving search and exploration in the tag space. In: Proceedings of the collaborative web tagging workshop at WWW
13. He W, Li R, Wu Z, Hu J, Liu Y (2012) Generating landmark overviews with geo-tagged web photos. In: Proceedings of systems, man, and cybernetics (SMC)
14. Majid A, Chen L, Chen G, Mirza HT, Hussain I, Woodward J (2012) A context-aware personalized travel recommendation system based on geotagged social media data mining. Int J Geogr Inf Sci 27:662–684
15. Samany NN (2019) Automatic landmark extraction from geo-tagged social media photos using deep neural network. Cities 93:1–12
16. Barros C, Moya-Gómez B, Gutiérrez J (2019) Using geotagged photographs and GPS tracks from social networks to analyse visitor behaviour in national parks. Curr Issues Tour 1–20
17. Kuo CL, Chan TC, Fan I, Zipf A (2018) Efficient method for POI/ROI discovery using Flickr geotagged photos. ISPRS Int J Geo-Inf 7(3):121
18. Bui TH, Park SB (2017) Point of interest mining with proper semantic annotation. Multim Tools Appl 76(22):23435–23457
19. Han S, Ren F, Du Q, Gui D (2020) Extracting representative images of tourist attractions from Flickr by combing an improved cluster method and multiple deep learning models. ISPRS Int J Geo-Inf 9(2):81
Smart Wheelchair Remotely Controlled by Hand Gestures Hemlata Sharma and Nidhi Mathur
Abstract The main objective of this paper is to develop a hand gesture-controlled smart wheelchair for physically challenged people who are unable to move from one place to another in day-to-day life. The hand gesture-controlled wheelchair is a special kind of mobility device which steers the wheelchair in the right direction through the hand movements of a person. This type of wheelchair is a blessing for people who are not able to move their lower limbs and are fully dependent on a caretaker for their activities. The proposed system can be categorized into two parts: the gesture unit and the wheelchair unit. A three-axis 'Micro-Electro-Mechanical Systems (MEMS)' accelerometer is used as a sensor; it is attached to the hand and sends the position of the hand to an Arduino Lilypad microcontroller. On the basis of the data gathered from the accelerometer, the microcontroller drives the signal to move the wheelchair in the desired direction. Keywords Wheelchair · Hand gesture · Accelerometer · MEMS · Arduino lilypad
1 Introduction

The problem of physical disability in this world is a big threat. A physically disabled person is restricted in regular physical activities like walking, running, sitting, standing, etc. In terms of data, almost 10% of the world's population (650 million people) are physically challenged. These physically challenged people are also part of our society, and it is our duty to help them. Most of our public assets like hospitals, government buildings, banks, public transport, etc., are not suitable for such people, so these public assets should be developed in such a way that people with disabilities can also effortlessly access these properties
H. Sharma (B) · N. Mathur IMT CDL, Ghaziabad, India
e-mail: [email protected]
N. Mathur e-mail: [email protected]
and services. Even today, these disabled people do not get the respect they deserve, so we have to change our perspective, accept these people and incorporate them into our society. That is why we have developed an exceptional type of wheelchair which not only helps them integrate into our society but also makes them self-dependent. Along with disabled people, paralytic and elderly persons can also avail themselves of this wheelchair. The wheelchair works just with the hand gestures of a person; Chuang et al. [5] found that finger and hand gestures possess rich information regarding human interaction and communication. To use this wheelchair effectively, the user must wear a transmitting device on the hand which contains an accelerometer; it passes a special type of signal to the wheelchair so that it can move in the desired direction. Traditional wheelchairs came with many limitations in terms of functionality, flexibility, usability, etc.; this hand gesture-controlled wheelchair, on the other hand, not only assists handicapped, paralytic and elderly people in obstacle-free movement but also saves them from cardiovascular problems. Teotia et al. [4] introduced a hand-gesture-based control interface for navigating a vehicle. The wheelchair proposed in this paper is cost-effective (the low cost and simplicity of the fabrication process play a crucial role in commercial manufacturing [8]) and simple, as it can be easily converted from a standard wheelchair readily available in the market. In this project, a prototype model for the wheelchair has been developed which is controlled by hand gestures.
2 Related Work

When an unfortunate incident affects the ability of a person to walk, it becomes necessary to use assistive technology. Many types of wheelchairs, like manual wheelchairs, electronic wheelchairs, voice recognition-controlled wheelchairs, eyeball-controlled wheelchairs, etc., have been built so far. Manual wheelchairs are the cheapest among all available wheelchairs; they are best suited for persons who are not able to move their lower limbs or cannot move without support. The electronic wheelchair has been developed for people who cannot even move a wheelchair manually, and to date many experiments have been made in this field. Motors and batteries are used to operate electronic wheelchairs, and most are run with the help of joysticks. Mazo et al. [1] developed an electric power wheelchair for quadriplegic patients who cannot use a joystick; this system was controlled by the neck movement of a person and can be used in any type of environment because it is less sensitive to environmental variations. Another type, the eye-sensing wheelchair, is controlled by the movement of the eyes: if a person moves the eyes towards the right, the wheelchair also moves towards the right; if towards the left, the wheelchair moves towards the left, and so on. A camera placed on a headset focuses on the person's eye. Video of the eyeball movement is recorded for a specific time interval, and snapshots are then drawn from that video. In MATLAB, these snapshots are matched to determine the movement of the eyeball.
Then, MATLAB sends signals serially to the Arduino, which commands an H-bridge to drive the motors according to the directed motion. Gautam et al. [2] used the concept of the eye-sensing wheelchair in their research: they developed a wheelchair controlled by an optical eye-tracking system which works through the eye movements of a person. The developed system incorporated a spectacle-mounted camera that tracked the eye movement of the person and captured moving pictures of the user's face. A microprocessor took these processed images from a laptop as USB data and converted them into signals sent to the wheels of the wheelchair for movement; the microprocessor controlled the direction of the wheelchair according to the captured images of the person's eyeball movement. For example, on the right, left and upward movement of the eyeballs, the wheelchair moved in the right, left and forward direction, respectively. All four wheels were connected to the microprocessor, which sent the signals to the wheels for the overall movement. One more specific type of wheelchair also exists: the voice recognition-based wheelchair. These wheelchairs use specific voice commands for their movement: if the person sitting on the wheelchair says "right", the wheelchair moves right; if the person says "forward", it moves forward; and if the person says "stop", it stops. These wheelchairs are a boon for physically handicapped people who cannot move any part of their body, but their biggest drawback is that they do not work accurately in noisy and loud areas. Asgar et al. [3] developed a voice recognition wheelchair for physically handicapped people which integrated an ultrasonic and infrared sensor system along with a speaker-dependent voice recognition system; Arduino boards were used for the voice recognition. This voice-command-driven wheelchair has a very low possibility of accidents due to obstacles and downward stairs. Some existing wheelchairs are also fitted with PCs for gesture recognition, but using a PC makes the system large and increases complexity.
3 Hardware and Software Used

3.1 Standard Wheelchair

A quality of our project design is that any wheelchair with standard features readily available in the market (Fig. 1) can be converted into an electronic wheelchair. In this project, a prototype model for the wheelchair has been developed: we have made a toy-car-like model which can be controlled by hand gestures.
Fig. 1 Manual wheelchair
3.2 DC Motors

We researched and performed calculations regarding the motor specification to be used in our project, and decided that four DC motors rated at 9 V, 1 A and 60 RPM should be used.
3.3 Electrical Components

Two batteries are connected to provide the total voltage required by our design. These 9-V batteries are regulated down to 5 V using a regulator. The motors attached to these batteries drive the wheelchair according to the input command. The batteries also provide power to the electronic components, which include the Arduino Lilypad controller and the motor driver circuit.
3.4 Accelerometer

In this model, we use a three-axis accelerometer that gives an analogue output on each axis. The sensor can be modelled as a movable beam that moves between two mechanically fixed beams, creating two gaps: one between the movable beam and the first stationary beam, and the other between the movable beam and the second stationary beam. The pin description of the accelerometer is as follows (Table 1).
Table 1 Accelerometer pin description

Pins    Description
Vcc     5-V supply should be connected at this pin
X-OUT   This pin gives an analogue output in the x direction
Y-OUT   This pin gives an analogue output in the y direction
Z-OUT   This pin gives an analogue output in the z direction
GND     Ground
ST      This pin is used to set the sensitivity of the sensor
3.5 Microcontroller Interface

The microcontroller used in our project is the Arduino Lilypad. The Lilypad Arduino family of boards has been designed for wearable applications: it works on rechargeable batteries and allows easy connection with sensors and actuators developed for easy integration into clothes and fabrics.
3.6 Arduino Software (IDE) The Lilypad Arduino is programmed using the Arduino Software (IDE).
4 Proposed Framework The whole system is divided into two parts: a transmitter part and a receiver part. In this project, a three-axis Micro-Electro-Mechanical Systems (MEMS) accelerometer is used as the sensor. Cechowicz et al. [7] presented a MEMS-based orientation sensor with auto-calibration capability in their research. The output of this device is in the form of analogue signals, which are converted into digital signals with the help of a comparator. An RF transmitter is used to transmit the signal to the receiver, but before transmission the data is encoded with an HT12E encoder to avoid unwanted interference with other devices. At the receiver end, we use an RF receiver to receive the data, which is then applied to an HT12D decoder. This decoder IC converts the received serial data to parallel data, which is then read by the Arduino. According to the received data, we drive the wheelchair or model by using the DC motors in the forward, reverse, left, right and stop directions (Fig. 2).
Fig. 2 Proposed framework
5 Project Design The following section shows the whole project design through use case and data flow diagrams:
5.1 Use Case Diagram See Fig. 3. Use cases:
• Set up the device (Table 2)
• Start button (Table 3)
• Perform gestures (Table 4)
• Stop the button (Table 5)
Fig. 3 Use case diagram
Table 2 Use case 1: set up the device

ID              1
Description     This use case describes that the user must set up the device on his/her hand
Actors          Admin
Pre-conditions  User must have the device
Post-conditions If the use case is successful
Basic flow      This use case sets the device on the hand
Table 3 Use case 2: start button

ID              2
Description     This use case describes how a user starts the device
Actors          User
Pre-conditions  User
Post-conditions The use case is successful, actor can use the device
Basic flow      This use case starts when the actor presses the button
Table 4 Use case 3: perform gestures

ID              3
Description     This use case describes performing gestures so that the wheelchair moves in the desired direction
Actors          User
Pre-conditions  User must have started the device
Post-conditions The use case is successful, actor can perform gestures
Basic flow      This use case starts when the actor wishes to move
Table 5 Use case 4: stop the button

ID              4
Description     This use case describes how the user stops the device
Actors          User
Pre-conditions  User must have started the device
Post-conditions Use case is successful, hand gestures no longer produce movement
Basic flow      This use case starts when the actor presses the button
Fig. 4 Context level DFD
Fig. 5 First level DFD
5.2 DFD DFD (context level) (Fig. 4) DFD (first level) (Fig. 5)
5.3 System Sequence Diagram See Fig. 6.
Fig. 6 System sequence diagram
6 Working Principle Now, we look at the working principle of our project step by step:
• STEP I: A transmitting device (which contains the RF transmitter and accelerometer) is tied to the physically disabled person's hand.
• STEP II: The device transmits commands to the wheelchair so that it can perform the required task: moving forward, reversing, turning left, turning right and stopping. All these tasks are performed using hand gestures. The following commanding options are available to the physically disabled person:

Commands          Hand gestures
Forward command   By tilting the hand in the forward direction
Backward command  By tilting the hand in the backward direction
Stop command      By keeping the hand in a straight position
Right command     By tilting the hand towards the right
Left command      By tilting the hand towards the left
• STEP III: The most important component of this project is the accelerometer, a three-axis acceleration measurement device. Its output is analogue in nature and proportional to the acceleration. The device measures the static acceleration of gravity when we tilt it and gives the result in the form of motion or vibration. • STEP IV: Phase-sensitive demodulation techniques are then used to determine the magnitude and direction of the acceleration.
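As an illustration of the command mapping in STEP II, a minimal Python sketch is given below; the axis convention and the tilt threshold are assumptions for illustration, not values from the paper.

# Illustrative sketch of the gesture-to-command mapping in STEP II.
# The axis convention and TILT threshold are assumed, not from the paper.
TILT = 0.3  # assumed threshold on the normalized accelerometer reading

def command(ax, ay):
    """Map x/y tilt readings to one of the five wheelchair commands."""
    if ay > TILT:
        return "FORWARD"   # hand tilted forward
    if ay < -TILT:
        return "BACKWARD"  # hand tilted backward
    if ax > TILT:
        return "RIGHT"     # hand tilted right
    if ax < -TILT:
        return "LEFT"      # hand tilted left
    return "STOP"          # hand held straight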
7 Conclusion The main components used in our project are the accelerometer, Arduino Lilypad, RF module and motors. Values for the movement of the hand are given by the accelerometer mounted on the transmitter; the microcontroller receives these values and processes them. The processed commands then operate the motors accordingly. Processing of the commands in the microcontroller is performed according to the control algorithm. Five commanding actions can be performed by the patient: forward, backward, right, left and stop. By tilting his hand forward, the wheelchair moves in the forward direction, and the same goes for the right and left commands. When the person keeps his hand straight, the wheelchair stops. In addition, by simply sliding the button on the transmitter, the device becomes active and the wheelchair can be operated. As soon as the button is slid back, the device becomes inactive.
8 Future Works 8.1 Controlling of Speed Presently, our model moves with a constant speed, which cannot be varied at the user's or patient's desire. This can be addressed by providing a variable voltage to the motors of the wheelchair.
8.2 Obstacle Detection System Currently there is no such mechanism for obstacle detection; however, a system can be introduced in such a way that if some obstacle is detected the wheelchair should stop to avoid any collision or incident.
8.3 Health Monitor A health monitoring system should be introduced in the wheelchair such that it can measure basic information about health, such as temperature, blood pressure and pulse. Upper and lower ranges should be defined, and immediate emergency indication should be provided to the caretaker on crossing these ranges.
8.4 GPS Embedded Various safety measures can also be installed on the wheelchair, such as a GPS system to track the wheelchair and its user, and a GSM system to receive any important or emergency message from the wheelchair user.
References 1. Mazo M, Rodriguez FJ, Lázaro JL, Ureña J, Garcia JC, Santiso E, Garcia JJ et al (1995) Wheelchair for physically disabled people with voice, ultrasonic and infrared sensor control. Autonomous Robots 2(3):203–224 2. Gautam G, Sumanth G, Karthikeyan KC, Sundar S, Venkataraman D (2014) Eye movement based electronic wheelchair for physically challenged persons. Int J Sci Technol Res 3(2):206–212 3. Asgar M, Badra M, Irshad K, Aftab S (2013) Automated innovative wheelchair. Int J Information Tech Converg Serv 3(6):1 4. Sideridis V, Zacharakis A, Tzagkarakis G, Papadopouli M (2019, September) Gesture keeper: gesture recognition for controlling devices in IoT environments. In: 2019 27th European Signal Processing Conference (EUSIPCO), pp 1–5. IEEE 5. Teotia P, Srivastav A, Varshney N, Kanaujia J, Singh A, Singh RK (2019) Gesture controlled vehicle for military purpose. Int J Adv Res Dev 4(4):42–44 6. Chuang WC, Hwang WJ, Tai TM, Huang DR, Jhang YJ (2019) Continuous finger gesture recognition based on flex sensors. Sensors 19(18):3986 7. Cechowicz R, Bogucki M (2019) Indoor vehicle tracking with a smart MEMS sensor. In: MATEC web of conferences, vol 252, p 02004. EDP Sciences 8. Mishra MK, Dubey V, Mishra PM, Khan I (2019) MEMS technology: a review. J Eng Res Rep, pp 1–24
Robust Compensation Fault-Tolerant Control Based on Sensor Fault Estimation Using Augmented System for DC Motor Tan Van Nguyen, Nguyen Ho Quang, and Cheolkeun Ha
Abstract This paper proposes a fault-tolerant control (FTC) technology that uses a sensor fault compensation function to minimize the effects of unknown input disturbances and sensor faults. First, an LQR controller is used to control the system. Second, an unknown input observer (UIO) algorithm is constructed to estimate sensor faults based on an augmented system. Third, a model for determining residuals is set up to detect the occurrence of a fault. Fourth, a diagnosis process based on residual logic is carried out to make a decision of compensation or non-compensation. Finally, numerical simulation results applied to a DC motor demonstrate the superior performance of the proposed method. Keywords Sensor fault estimation · Unknown input observer · Fault-tolerant control · Fault detection and isolation
1 Introduction In the past decades, direct current (DC) motors have been known for their preeminent features in speed control and are widely used as actuators in various control applications. DC motors can provide high starting torque as well as speed control over a wide range. In addition, DC motors have advantageous characteristics such as a compact structure, controllability, high accuracy, reliability, and low cost. However, in practice, many faults or failures arise in the actuator, such as imbalance, misalignment, and wear of the shaft or mechanical looseness, T. Van Nguyen · N. H. Quang (B) Institute of Engineering and Technology, Thu Dau Mot University, 6 Tran Van On, Phu Hoa ward, Thu Dau Mot city, Binh Duong province, Vietnam e-mail: [email protected] T. Van Nguyen e-mail: [email protected] C. Ha Robotics and Mechatronics Lab, Ulsan University, Ulsan 44610, Korea e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_72
misalignment of the motor bearing housing, as well as the surrounding environment; moreover, system nonlinearities with large uncertainties become critical challenges in utilizing motors to obtain high-precision position and accurate speed tracking control. Furthermore, after a period of operation, the motor system can be faulted by aging or cable breakage of sensors. To solve these issues, a solution for sensor fault compensation by the fault-tolerant control (FTC) technique is proposed for DC motors. There have been many studies on fault detection and isolation (FDI) technology used to indicate fault occurrence and isolate faults [1, 2]. FDI is becoming increasingly important for process monitoring because of the increasing demand for performance, as well as the increased safety and reliability of dynamic systems. FDI refers to the timely detection, diagnosis and correction of abnormal fault conditions in a process. More advanced methods include data-driven fault detection [3–6], which is most commonly used in many chemical and manufacturing industries. Other FTC methods applied fault compensation based on estimated faults and residuals of the UIO algorithm to make a compensation decision [6, 7]. Here, the value of the fault was determined on the basis of faults estimated by the UIO model. It was also pointed out that the critical issue in FDI is to build, on the basis of the underlying model, a good residual model describing the behavior of the monitored system. A number of different methods based on dynamic physical modeling, such as output observers, parity relationships, and parameter estimation methods, have been investigated [8–14]. Fault diagnosis utilizing fuzzy-logic-based techniques was also shown in [15]. In addition, fault estimation is an important element in FTC technology for controlling fault compensation, and several studies have applied different algorithms, such as the unknown input observer [7, 12, 16] and sliding mode observers [17]. The main goal of the fault diagnosis and fault-tolerant control methods widely used in industrial systems for compensating sensor faults is to ensure system reliability and stability. The uncertainty of systems, the presence of noise, and the random error of some variables make it difficult to achieve these goals. To address such issues, this paper proposes an approach to sensor fault compensation using the FTC technique to reduce the impact of sensor faults and unknown input disturbances.
2 Observation Design and Fault Estimation 2.1 Linear Model The state space equation in discrete time can be written as:
x(k + 1) = A_k x(k) + B_k u(k)
y(k) = C_k x(k)
(1)
Fig. 1 System diagram takes implementation point into account [18]
where x(k) ∈ R^n is the state vector and y(k) ∈ R^p is the output vector. A_k, B_k, C_k, D_k, and F_s are known constant matrices with suitable dimensions. Operating points are often defined as equilibrium points. We consider a system that combines its actuators and sensors with the entire range of operating areas for the inputs U and the output signals Y. If the system is linearized around an operating point (U_0, Y_0), the linear model corresponds to the relationship between the input variables u and the outputs y as (Fig. 1)
u = U − U_0 and y = Y − Y_0
(2)
2.2 Observer Design An observer is defined as a dynamic system whose state variables are estimated from the state variables of another system [12]. The state space equation of a linear system with a sensor fault, around the operating point (2), can be expressed in discrete time as
x(k + 1) = A_k x(k) + B_k u(k) + D_k d(k)
y(k) = C_k x(k) + F_s f_s(k)
(3)
where x ∈ R^n is the state vector, u ∈ R^m is the control input vector, and y ∈ R^q is the system output vector; f_s(k) is the sensor fault signal and d(k) is the disturbance. The state space equation of the Luenberger observer, in the case without a sensor fault, is given by [12]:

λ(k + 1) = N λ(k) + GB u(k) + (L_1 + L_2) y(k)
x̂(k) = λ(k) + H y(k)
ŷ(k) = C x̂(k)    (4)

The estimation error can be defined as e(k) = x(k) − x̂(k)
(5)
and
e(k + 1) = (I_n − HC)x(k + 1) − λ(k + 1)
(6)
From (6), we have
λ(k + 1) = N x̂(k) − NH y(k) + GB u(k) + L_1 C x(k) + L_2 y(k)
(7)
From (4) to (7), the estimation error can be represented as
e(k + 1) = (I_n − HC)x(k + 1) − N x̂(k) − NH y(k) + GB u(k) + L_1 C x(k) + L_2 y(k)
= [(I_n − HC)A − L_1 C]x(k) + [(I_n − HC) − G]B u(k) + (NH − L_2)y(k) − N x̂(k) + (I_n − HC)D d(k)
(8)
Therefore, the estimation error dynamics of Eq. (9) are obtained:
e(k + 1) = N e(k)
(9)
The error dynamics (9) hold if the following conditions are satisfied:
(I_n − HC)A − L_1 C = N
(10)
(In − H C) − G = 0
(11)
(In − H C)D = 0
(12)
N H − L2 = 0
(13)
From (12), the matrix H can be written as
H = D[(CD)^T (CD)]^{-1} (CD)^T
(14)
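As a quick numerical illustration of (14), a minimal Python sketch is given below; the C and D matrices are illustrative placeholders, not values from the paper, and the sketch also verifies the decoupling condition (12).

import numpy as np

# Illustrative C and D (placeholders, not from the paper)
C = np.array([[1.0, 0.0],
              [0.0, 1.0]])
D = np.array([[0.5],
              [1.0]])

CD = C @ D
# H = D[(CD)^T (CD)]^{-1} (CD)^T, cf. (14)
H = D @ np.linalg.inv(CD.T @ CD) @ CD.T

# Check condition (12): (I_n - H C) D = 0
print(np.allclose((np.eye(2) - H @ C) @ D, 0.0))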
2.3 Sensor Fault Estimation Based on the Law of Nominal Control Consider a MIMO system represented as the following discrete time state space:
x(k + 1) = A_k x(k) + B_k u(k) + D_k d(k)
y(k) = C_k x(k)
(15)
If the number of outputs is greater than the number of controller inputs, the selection of the control law must be considered, and the output vector y is divided as
follows:
y(k) = C_k x(k) = [C_1^T  C_2^T]^T x(k) = [y_1^T(k)  y_2^T(k)]^T
(16)
The feedback controller is required to make the output vector y_1 ∈ R^p (p ≤ m) track the reference vector at the input so that a steady state is reached.
(17)
To achieve this goal, an integral comparator z ∈ R^p and a fault vector were added to satisfy the following relationship:
(18)
where T_s is the selected sample period. The open-loop system is then given by Eq. (19), where I_p is the identity matrix of dimension p and 0_{n,p} is a zero matrix of n rows and p columns:
[x(k + 1); z(k + 1)] = [A, 0_{n,p}; −T_s C_1, I_p] [x(k); z(k)] + [B; 0_{p,m}] u(k) + [0_{n,p}; T_s I_p] y_r(k)
y(k) = [C  0_{q,p}] [x^T(k)  z^T(k)]^T    (19)
This state space equation can also be written as follows
X(k + 1) = Ā X(k) + B̄ u(k) + B̄_r y_r(k)
y(k) = C̄ X(k)
(20)
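A minimal Python sketch of assembling the augmented matrices of (19)–(20) is given below; the dimensions, sample period, and matrix values are illustrative placeholders, not the paper's.

import numpy as np

n, m, p = 2, 1, 1          # states, inputs, tracked outputs (assumed)
Ts = 0.01                  # assumed sample period
A = np.array([[0.95, 0.10], [-0.004, 0.96]])   # placeholder discrete A
B = np.array([[0.0], [0.02]])                  # placeholder discrete B
C1 = np.array([[1.0, 0.0]])                    # tracked output rows of C

# Augmented open-loop system, with X = [x; z]
Abar = np.block([[A,        np.zeros((n, p))],
                 [-Ts * C1, np.eye(p)       ]])
Bbar = np.vstack([B, np.zeros((p, m))])
Brbar = np.vstack([np.zeros((n, p)), Ts * np.eye(p)])
# X(k+1) = Abar @ X(k) + Bbar @ u(k) + Brbar @ yr(k)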
The nominal feedback control law of this system can be calculated as
u(k) = −K X(k) = −[K_1  K_2] [x^T(k)  z^T(k)]^T
(21)
K = [K_1  K_2] is the feedback matrix obtained, for example, using the pole placement technique or linear quadratic (LQ) optimization. The estimated sensor fault can be determined as follows:
y(k) = C x(k) + D_s f_s(k) = [C_1^T  C_2^T]^T x(k) + [D_{s1}^T  D_{s2}^T]^T f_s(k)
(22)
In this case, the integral error vector z will also be affected by the fault. The integral error vector can be described as follows [18]: z(k + 1) = z(k) + T_s (y_r(k) − C_1 x(k) − D_{s1} f_s(k)) The magnitude of the sensor fault can be estimated as follows [18]:
(23)
E¯ s X¯ s (k + 1) = A¯ s X¯ s (k) + B¯ s U¯ s (k) + G¯ s yr (k)
(24)
where
Ē_s = [I_n, 0, 0; 0, I_p, 0; 0, 0, 0];  Ā_s = [A, 0, 0; −T_s C_1, I_p, −T_s D_{s1}; C, 0, F_s];
B̄_s = [B, 0; 0, 0; 0, I_q];  Ḡ_s = [0; T_s I_p; 0];
X̄_s = [x^T(k)  z^T(k)  f_s^T(k)]^T;  Ū = [u^T(k)  y^T(k + 1)]^T
3 Sensor Fault-Tolerant Control 3.1 Controller Design (LQR—Linear Quadratic Regulator) LQR is a modern control method used to control the system [19]. The purpose of the controller is to minimize the quadratic performance index. u(k) = −L x(k)
(25)
To drive the state variables to zero, the closed-loop poles can be placed through Ackermann's formula. Here, the objective function built for the LQR controller is defined as
J = Σ_{k=0}^{∞} [x_k; u_k]^T [Q, S; ∗, R] [x_k; u_k]    (26)
where Q ≥ 0 and R > 0. The gain L can then be calculated from the Riccati equation:
L = (B^T P B + R)^{−1} B^T P A    (27)
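As a hedged illustration of (27), the sketch below computes the discrete LQR gain with SciPy's Riccati solver (for the common case S = 0); the A_k, B_k, Q, and R values are placeholders, not the paper's.

import numpy as np
from scipy.linalg import solve_discrete_are

Ak = np.array([[0.95, 0.10], [-0.004, 0.96]])  # placeholder discrete A
Bk = np.array([[0.0], [0.02]])                 # placeholder discrete B
Q = np.eye(2)                                  # state weight, Q >= 0
R = np.array([[1.0]])                          # input weight, R > 0

P = solve_discrete_are(Ak, Bk, Q, R)           # solve the discrete Riccati equation
L = np.linalg.inv(Bk.T @ P @ Bk + R) @ (Bk.T @ P @ Ak)  # L = (B^T P B + R)^{-1} B^T P A
# control law: u(k) = -L @ x(k)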
3.2 Fault Diagnosis A fault diagnosis process is carried out based on the residual which is determined by the relationship between the measured output (y) and the estimated signal ( yˆ ), expressed as r (r = y− yˆ ) [12]. FTC technology is built for sensor fault compensation
process to reduce the impact of faults on the system. The FTC compensation control model follows the structure of the model in [13] as y_c(t) = y(t) − F_s f_s(t)
(28)
3.3 Residual Design The residual r(t) is generated by an observer-based redundancy structure to detect faults: r(t) = r_{fs}(t)
(29)
where r fs (t) is the sensor fault signal.
4 Illustration Example and Simulation Results 4.1 DC Motor Model Design The proposed FDI method is applied to a DC motor, whose state space equation is described as follows:
ẋ(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + F_s f_s(t)    (30)
where A = [−5, 10; −0.4, −4]; B = [0, 2]^T; C = [1, 0; 0, 1]; F_s = [0, 1]^T; u = V. The input u(t) is the armature voltage and y(t) is the measured output variable. x(t) is the selected state vector, such that
x(t) = [x_1^T  x_2^T]^T = [ω^T  i_a^T]^T    (31)
where i_a is the armature current and ω is the angular velocity of the shaft of the DC motor. Parameter values of the DC motor are shown in Table 1. The dynamics of the DC motor in discrete time can be expressed as
x(k + 1) = A_k x(k) + B_k u(k)    (32)
where A_k = I + T_s A and B_k = T_s B; T_s is the sampling time.
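A minimal numerical sketch of the discretization in (32), using the A and B matrices of (30); the sampling time T_s below is an assumed value.

import numpy as np

A = np.array([[-5.0, 10.0], [-0.4, -4.0]])  # continuous-time A from (30)
B = np.array([[0.0], [2.0]])                # continuous-time B from (30)
Ts = 0.01                                   # assumed sampling time

Ak = np.eye(2) + Ts * A                     # Ak = I + Ts A
Bk = Ts * B                                 # Bk = Ts B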
Table 1 Parameter values of the DC motor

Value                           Content
R = 2 Ω                         Resistance
L_a = 0.5 mH                    Inductance
b = 1 × 10^−1 Nm/(rad s^−1)     Viscosity coefficient
J = 0.02 kg m^2                 Moment of inertia
K = 0.2 Nm/A                    Torque coefficient
Assume that the sensor fault is generated by the equation

f_s(t) = 0,               if t ≤ 4 s
         6t − 24,         if 4 s < t ≤ 4.5 s
         −2t + 12,        if 4.5 s < t ≤ 5.5 s
         sin(7t) + 0.5,   if 5.5 s < t ≤ 7.9 s
         2t − 16.0,       if t > 7.9 s    (33)
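The piecewise signal (33) translates directly into code; a minimal Python sketch:

import numpy as np

def f_s(t):
    """Sensor fault signal of Eq. (33)."""
    if t <= 4.0:
        return 0.0
    elif t <= 4.5:
        return 6.0 * t - 24.0
    elif t <= 5.5:
        return -2.0 * t + 12.0
    elif t <= 7.9:
        return np.sin(7.0 * t) + 0.5
    else:
        return 2.0 * t - 16.0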
4.2 Simulation Results
4.2.1 Result with Sensor Fault, Without FTC and Without Disturbance
(a) Simulation results: The angular velocity result of the DC motor with gain K in the case with a sensor fault, without FTC and without disturbance, is shown in Fig. 2.
Fig. 2 Reference and response angle velocity with sensor fault, without disturbance and FTC case
(b) Sensor fault and its estimation
Fig. 3 Sensor fault and its estimation with sensor fault, without disturbance and FTC case
The simulation result for the sensor fault and its estimation in the case without disturbance and without FTC is shown in Fig. 3.
4.2.2 Simulation Result of Sensor Fault Using FTC, Without Disturbance
(a) Simulation results: The angular velocity result of the DC motor with gain K in the case with a sensor fault and FTC, without disturbance, is shown in Fig. 4. (b) Sensor fault and its estimation
Fig. 4 Reference and response angle velocity with sensor fault and FTC, without disturbance
The simulation result for the sensor fault and its estimation in the case with FTC and without disturbance is shown in Fig. 5. The simulation results in Figs. 2 and 4 show that the controller works well. Figure 2 shows that the response signal is affected under the impact of the sensor fault. Figure 3 shows that the fault estimator works well. Figure 4 displays the effectiveness of the fault-compensation FTC technology under sensor fault conditions. Figure 5 shows the sensor fault after the FTC technology is applied.
Fig. 5 Sensor fault and its estimation with sensor fault and FTC, without disturbance
5 Conclusion This paper presents sensor fault compensation of the FTC technique using an LQR controller. Here, a UIO model of the augmented system is constructed to determine the residual and to estimate faults. Simulation results show the effectiveness of the proposed method: estimated faults are approximately equal to zero after the FTC is performed, as shown in Fig. 5.
References 1. Wei H, Xiaoxin S (2015) Design of a fault detection and isolation system for intelligent vehicle navigation system. Int J Navig Observ 2015, Article ID 279086, 19. http://dx.doi.org/10.1155/ 2015/279086 2. Thirumarimurugan M, Bagyalakshmi N, Paarkavi P (2016) Comparison of fault detection and isolation methods: a review. In: 2016 10th international conference on intelligent systems and control (ISCO), Coimbatore, pp 1–6. https://doi.org/10.1109/isco.2016.7726957 3. Gertler TJ (1998) Fault detection, and diagnosis in engineering systems. Marcel Dekker, New York, USA 4. Isermann R (2006) Fault-diagnosis systems: an introduction from fault detection to fault tolerance. Springer, Germany, Berlin 5. Nandi S, Toliyat HA, Ziaodong L (2005) Condition monitoring and fault diagnosis of electrical machines—a review. IEEE Trans Energy Convers 20:709–729 6. Venkatasubramanian V, Rengaswamy R, Yin K, Kavuri SN (2003) A review of process fault detection and diagnosis. Part I: quantitative model-based methods. Comput Chem Eng 27:293– 311 7. Djilali T, Mohamed B, Mohamed T (2012) Observer-based fault diagnosis and field oriented fault tolerant control of induction motor with stator inter-turn fault. Arch Electr Eng 61(2):165– 188 8. Katherin I, Achmad J, Trihastuti A (2015) Robust observer-based fault tolerant tracking control for linear systems with simultaneous actuator and sensor faults: application to a DC motor system. Int Rev Modell Simul 8(4):410–417. August, 2015 9. Halim A, Christopher E et al (2011) Fault detection and fault-tolerant control using sliding modes. Springer-Verlag, London, ISSN 1430-9491 10. Fazal Q, Liaquat M, Naz N (2015) Robust fault tolerant control of a DC motor in the presence of actuator faults. In: 2015 16th international conference on sciences and techniques of automatic control and computer engineering (STA), pp 301–333, Monastir
11. Tan VN, Cheolkuen H (2019) The actuator and sensor fault estimation using robust observer based reconstruction for mini motion package electro-hydraulic actuator. In: Intelligent computing methodologies, proceeding of 15th international conference, ICIC 2019. Nanchang, China, 3–6 Aug. Part III, pp 244–256 12. Tan NV, Cheolkeun H (2019) Sensor fault-tolerant control design for mini motion package electro-hydraulic actuator. MDPI Process 7:89. https://doi.org/10.3390/pr7020089 13. Tan NV, Cheolkeun H (2019) Experimental study of sensor fault-tolerant control for an electrohydraulic actuator based on a robust nonlinear observer. Energies, MDPI, Open Access J Energ 12:4337. https://doi.org/10.3390/en12224337 14. Sobhani MH, Poshtan J (2012) Fault detection and isolation using unknown input observer with structured residual general. Int J Instrum Contr Syst (IJICS), 2(2) April 2012 15. Miguel LJ, Blazquez LF (2005) Fuzzy logic-based decision-making for fault diagnosis in a DC motor. Eng Appl Artif Intell 18:423–450 16. Liu X, Gao Z, Zhang A (2018) Robust fault tolerant control for discrete-time dynamic systems with applications to aero engineering systems. IEEE Access 6:18832–18847. https://doi.org/10.1109/ACCESS.2018.2817548 17. Edwards C, Tan CP (2004) Fault tolerant control using sliding mode observers. In: 2004 43rd IEEE conference on decision and control (CDC) (IEEE Cat. No. 04CH37601), Nassau, vol 5, pp 5254–5259. https://doi.org/10.1109/cdc.2004.1429642 18. Noura H, Theilliol D, Ponsart JC, Chamseddine A (2009) Fault-tolerant control systems design and practical applications. In: Michael JG, Michael AJ (eds) Springer: Dordrecht, The Netherlands; Heidelberg, Germany; London, UK; New York, NY, USA; 2009, ISBN 978-1-84882-652-6. https://doi.org/10.1007/978-1-84882-653-3 19. Yaguang Y (2018) An efficient LQR design for discrete-time linear periodic system based on a novel lifting method. Instrumentation and Control Engineer, Office of Research, US NRC, Two White Flint North
Monitoring Food Quality in Supply Chain Logistics Sushant Kumar and Saurabh Mukherjee
Abstract Food is indispensable for human survival and should be cared for with the utmost attention. This paper monitors the quality of perishable food packaging over a supply chain logistics network and helps to act fast to save perishable foods along the route. The Internet of things signifies that the things which surround us and help us live a better and healthier life are connected wirelessly via computer systems and the Internet, can communicate among themselves and with a master node, and exchange data. Those things can be devices, sensors, appliances, machines, etc. The containers are located via the global positioning system (GPS) and monitored remotely over a cloud. The perishable food items are in a container with environment indicators such as temperature and humidity sensors. The communication aspect is critical for a better work scenario, precautionary measures, and monitoring energy consumption too. Any security or risk hazard can be prevented if we can analyze and study that data continuously or periodically from any remote site. Keywords Perishable food · Sensors · Internet of things (IoT) · Manufacturer · GPS
1 Introduction Food is the foremost thing which makes humans strive for survival [1]. Food quality can be monitored by IoT [2]. The technology has spread to major sectors like industry, automobiles, health care, retail, government, etc. The IoT infrastructure provides analytics to help with predicting failures before they occur, giving the ability to react at the right time [3]. A particular use case explored in this paper is that plenty of processed perishable foods are transported from the production area S. Kumar SKIT, Jaipur, India e-mail: [email protected] S. Mukherjee (B) Banasthali Vidyapeeth, Banasthali, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_73
to the distributor, the retailer, and finally the consumer. There are risks involved, like refrigerator malfunctioning or fungi/microbe attack, through which food gets stale midway and hence this model gets jeopardized. The idea of the model is to use temperature, humidity, and GPS sensors at each processed food storage area, with the adjustment of the sensors performed with respect to the amiable conditions [4]. Then, the data from those sensors is collected continuously at a remote location, along with the GPS location of the sensor at that point in time, and the times when the temperature or humidity fall outside a set threshold can be monitored [5]. The server gathers the data from all the sensors, and those results can be sent to a cloud from where all the beneficiaries get informed by an application or a browser notification [6]. The food freshness index of each product is maintained by a barcode on the food item prior to its journey and is compared continuously with the monitoring server [7]. A proposed model, research analysis, and conclusion are presented, respectively, in this paper. The motivation for this paper came from an instance of millions of dollars of processed food going stale in the USA because it was not monitored [7]. In the upcoming sections, a model is proposed, on which analysis is performed and conclusions are stated.
2 Literature Survey Nowadays, as the world grows and processed food items become scarcer because of the scarcity of resources, it remains a big concern to save our food for a longer duration of time and keep it consumable at that point of time too. Dairy products, chicken strips, prawns, and crab meat are a few examples. The seafood industry has a few constraints, like shipment delays, weather changes, unavoidable delays, and ultimately food getting stale by its usage day. This results in inflation of particular food items, and so ultimately everyone in the food industry suffers. The current generation is more digitally connected than ever. Wireless communication models are experiential learning hubs, and devices like IoT help us to bridge the gap between manufacturers and consumers. IoT helps both parties to view and manage the items more efficiently than ever and hence increases the productivity of the cycle. Sensors, in correspondence with these technologies, can bring a revolution to the food industry and help in the cause of eating fresh and living fresh. A similar problem and its discussion were presented in [2, 7], where the authors recognized the critical food industry requirements and the supply chain orchestration. In that paper, a central data analytics model was implemented with the help of radio frequency identification (RFID) for communications [6, 8]. Continuous monitoring of perishable foods was done with the help of radio waves and RFID, with data transferred to an analytics and operations center which could be cloud hosted.
3 Proposed Model This model uses the following hardware: Raspberry Pi, jumper wires, a temperature and humidity sensor (digital humidity and temperature sensor, DHT) [9], a GPS sensor, a breadboard, a 5 V power supply, and resistors (4.7 kΩ). Connections:
1. Connect PIN 1 of the DHT sensor to the 3.3 V–5 V pin of the Raspberry Pi.
2. Connect PIN 2 of the DHT sensor to any input pin of the Raspberry Pi; here, we have used pin 11.
3. Connect PIN 4 of the DHT sensor to the ground pin of the Raspberry Pi.
An Adafruit library for the DHT22 sensor is used to read the sensor data; the function Adafruit_DHT.read_retry() reads data from the sensor. Figure 1 depicts a screenshot on the Raspberry Pi device capturing data from the sensor. It shows the temperature and humidity parameters at a particular node at a stipulated instant of time. User Datagram Protocol (UDP) creates a two-way communication between nodes, acting as a server and a client. Creating a socket: s = socket.socket(socket_family, socket_type, protocol=0). The socket family can be AF_UNIX (for Unix-based systems) or AF_INET (for general Internet programming); the socket type can be SOCK_STREAM or SOCK_DGRAM; the protocol is set to the default 0. For sending data from client to server, we need the Internet Protocol (IP) address and port number of the server [10]. Clients can have many IP addresses, but the server will have one IP address. The client takes readings from the sensor and sends them to the server. The server receives the data from the client and saves it in a text or CSV file. As multiple clients send sensor values that accumulate at the server, we need to segregate the data so that we can process it. Also, there can be malfunctions like data corruption, incomplete data, and the server not collecting data properly, so filtering the data becomes essential. Figure 2 depicts the path followed for analyzing the data on food item quality throughout the process, indicating the input and output sides as well. Firstly, the sensors are deployed at each container holding processed food batches which require the same environmental conditions to be maintained. All sensors sense the environment continuously at each of the containers, monitor the temperature and humidity of that particular place, and send them to the Raspberry Pi module connected to them. After that, there is data acquisition over a cloud [5]. Machine learning algorithms are applied, and the filtered data, which consists of aberrant readings like a temperature/humidity increase over an extended span of time, is taken as
Fig. 1 Snapshot of capturing data
Fig. 2 Block diagram for model
input to be sent over to a remote server and an application which is monitored by the team. The readings may contain the location of the container according to the GPS coordinates and the abnormal readings. Also, the container/van owner is informed, and then any risk can be averted through manual intervention [7]. Proper alignment and cooperation of the above steps can be achieved after a trigger-type setup is installed and initiated.
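A minimal client-side sketch of the sensor reading and UDP transfer described above is given below, assuming the Adafruit_DHT Python library; the GPIO pin number (BCM numbering for header pin 11) and the server address are assumptions.

import socket
import Adafruit_DHT  # Adafruit library for the DHT22 sensor

SERVER = ("192.168.1.10", 5005)  # assumed server IP address and port

# UDP socket (AF_INET family, SOCK_DGRAM type, default protocol 0)
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Header pin 11 corresponds to BCM GPIO 17 on the Raspberry Pi
humidity, temperature = Adafruit_DHT.read_retry(Adafruit_DHT.DHT22, 17)
if humidity is not None and temperature is not None:
    msg = "{:.1f},{:.1f}".format(temperature, humidity)
    s.sendto(msg.encode(), SERVER)  # send the reading to the server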
4 Research Analysis Suppose, for a particular processed food item, the ambient temperature is 6–12 °C and the humidity should be between 35 and 55 units. The readings, indexed by time, will be: item_id, item_name, gps_coordinates, container_no, temp_value, humidity_value, etc. The Boolean NOT expression is used because the freshness constraint of the processed food item is violated when the temperature and humidity at a given time are not both within the mentioned ranges; hence, we use the NOT (~) expression. Once we know the container_no and gps_coordinates, we can inform the specified driver and make the necessary arrangements. Also, the same application can be installed on the driver's smartphone, thereby reducing the work of intermediary servers. Figure 3 uses the matplotlib library and shows the temperature and humidity varying with the elapsed time; similarly, the right part of the figure shows the digital values of humidity and temperature, respectively.
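A minimal sketch of this freshness check in Python; the field names follow the reading format above, and the ranges are those stated for this example item.

TEMP_RANGE = (6.0, 12.0)   # ambient temperature range, deg C
HUM_RANGE = (35.0, 55.0)   # humidity range, units as above

def is_anomalous(reading):
    """Flag a reading when NOT(temperature and humidity both in range)."""
    t = reading["temp_value"]
    h = reading["humidity_value"]
    in_range = (TEMP_RANGE[0] <= t <= TEMP_RANGE[1]
                and HUM_RANGE[0] <= h <= HUM_RANGE[1])
    return not in_range

# Anomalous readings carry container_no and gps_coordinates, so the
# specified driver can be informed directly.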
Fig. 3 Temperature and humidity values and its graph
notified to appropriate persons. This way the perishable processed food is made available for longer duration with the help of caretakers at each container.
5 Conclusion IoT in conjunction with recent communication technologies can bridge the gap between different industries and can effectively offer transparency to the parties involved. The scope of this paper is to propose a model which can track a food item continuously and monitor its quality by indicating the temperature and humidity factors, which are the most influential in degrading a food item's quality over a span of time. With this setup, any perishable food issue can be detected in due course and acted upon in time. The challenge we face, in broader terms, is that if multiple parties are involved in this trajectory, for example manufacturers, dealers and suppliers, then the visibility, mutability, and security of the data become a major concern. Any party might like to modify/delete particular data at some instant without the other parties noticing it and hence may be able to deceive them. The food item data concerning all should be visible to all participants at all times (for tracking), and the security should not be jeopardized. These requirements can be accommodated if all the parties involved are brought together and assimilated in a blockchain environment [11–13]. As we know, blockchain is immutable, visible, and secured at a very high level, so the trust among all parties will be automatically restored and there will be a secured environment for all.
References 1. Witjaksono G, Rabih AAS, bt Yahya N, Alva S (2018) IOT for agriculture: food quality and safety. In: IOP Conference Series: Materials Science and Engineering, vol 343(1), p 012023 2. Pal A, Kant K (2018) IoT-based sensing and communications infrastructure for the fresh food supply chain. Computer 51(2):76–80 3. Gu Y, Han W, Zheng L, Jin B (2012) Using IoT technologies to resolve the food safety problem– an analysis based on chinese food standards. In: international conference on web information systems and mining, pp 380–392 4. Popa A, Hnatiuc M, Paun M, Geman O, Hemanth DJ, Dorcea D, Ghita S (2019) An intelligent IoT-based food quality monitoring approach using low-cost sensors. Symmetry 11(3):374 5. Marconi M, Marilungo E, Papetti A, Germani M (2017) Traceability as a means to investigate supply chain sustainability: the real case of a leather shoe supply chain. Int J Prod Res 55(22):6638–6652 6. Verdouw CN, Wolfert J, Beulens AJM, Rialland A (2016) Virtualization of food supply chains with the internet of things. J Food Eng 176:128–136 7. Wang J, Yue H (2017) Food safety pre-warning system based on data mining for a sustainable food supply chain. Food Control 73:223–229 8. Atzori L, Iera A, Morabito G (2010) The internet of things: a survey. Comput Netw 54(15):2787–2805 9. Venkatesh A, Saravanakumar T, Vairamsrinivasan S, Vigneshwar A, Santhosh Kumar M (2017) A food monitoring system based on bluetooth low energy and internet of things. Int J Eng Res Appl 7(3):30–34 10. Xia F, Yang LT, Wang L, Vinel A (2012) Internet of things. Int J Commun Syst 25(9):1101 11. Kshetri N (2019) Blockchain and the economics of food safety. IT Professional 21(3):63–66 12. Laurier W (2019) Blockchain value networks. In: 2019 IEEE social implications of technology (SIT) and information management (SITIM). IEEE, pp 1–6 13. Song JM, Sung J, Park T (2019) Applications of blockchain to improve supply chain traceability. Procedia Comput Sci 162:119–122
Applications of Virtual Reality in a Cloud-Based Learning Environment: A Review Nikhil S. Kaundanya and Manju Khari
Abstract Virtual learning environments (VLE) are spaces designed to educate students remotely via online platforms. Virtual reality (VR) technologies can be used in combination with cloud gaming technologies to create virtual environments. These can be used effectively to create cloud-based social virtual environments, which can serve many purposes, such as multiplayer virtual gaming or platforms for remote learning for various users. Virtual reality affects our perceived information with its simulations. In this review paper, an overview is given of a cloud-based VLE. We study the application of the VLE in a cloud-based setting, identify the challenges faced, and learn about the possible solutions. Keywords Virtual learning · Virtual reality · Low latency · Gaming · Application of VR
1 Introduction VLEs are spaces made to teach students remotely using online programs [1]. Previous work with VLEs has shown effectiveness in teaching students remotely. One example is iSocial [2], a desktop application that trains students with Autism Spectrum Disorder to improve their social abilities [3] by enabling social communication in a virtual world [4]. Virtual reality (VR) has likewise shown promise in teaching generalizable skills through immersive settings [5]. Cloud computing helps to provide a gaming backend as a service (GBaaS) [5] feature that helps to improve real-time gaming performance [6]. The issues found in cloud gaming include the server provisioning problem [6], the need to cut server N. S. Kaundanya (B) · M. Khari Department of Computer Science and Engineering, Ambedkar Institute of Advanced Communication Technologies and Research, GGSIP University, Geeta Colony, New Delhi 110031, India e-mail: [email protected] M. Khari e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_74
running and data storage costs [7], as well as the regulation and reduction of latency [8]. Additionally, several methods exist, based on graphics rendering and video encoding, for network bandwidth optimization in video transmission [9]. To reduce the intercommunication latency among interactive clients, the cloud resource allocation challenges were analyzed in reference [10]. An enabling technique used by many designers to alleviate cloud-related limitations is computational offloading [11]. The computationally costly parts of mobile games can be executed by enlisting remote computers in this way; during this, the mobile clients serve only as front-ends to receive and display the results. Computation offloading enables the release of local CPU resources on the mobile device and supports a more affordable consumption of battery power. The primary focus of this review paper is to analyze the feasibility and effectiveness of such a virtual reality learning environment [12] and whether it can be achieved with reduced latency [12] and data speed requirements [13] using current technology [14]. It is also meant to help future researchers better focus on the deficiencies of the current technologies and find appropriate solutions. Virtual reality has been rapidly becoming a part of our current world. Applications of virtual reality now span from gaming to automobile designing, architecture simulation to immersive entertainment, and so forth. Keeping that in mind, past work with virtual learning environments (VLEs) [15] has proven effective in teaching students remotely. Virtual reality has additionally demonstrated massive potential as a medium for effectively promoting the learning of generalizable skills and knowledge through immersive environments. Game developers and designers of today have made huge amounts of progress in the field of virtual space/environment design [7]. The concept of high realism has started to take shape in applications of virtual reality [16]. This has opened an avenue for a wide variety of applications of virtual reality [17]. With virtual learning environments providing an effective way of delivering practical knowledge in the classroom environment, this review focuses on evaluating the pros and cons of the technologies used to achieve this. The following sections elaborate on the use of virtual reality technology with a cloud-based environment to provide a functional social virtual reality learning environment.
2 Literature Review Throughout the examination of this topic, we studied and analyzed various papers focused on themes related to virtual learning environments, to determine the possible development of such applications with current technology and the present state of related applications. 3D VLEs are characterized as "computer-generated, three-dimensional simulated environments that are coupled with well-defined learning objectives." [18] Students enter these environments as avatars using a networked computer and interact with one another through text chat, voice chat, and the manipulation of shared objects. At present, VLEs use
multimedia technologies, for example, videos and PowerPoint presentations, to help in the communication of educational concepts [19]. However, a gap arises due to the absence of an environment that is interactive with the training content. This makes for an inadequate setting for the clarification of the various questions that may be generated during classroom sessions [20]. These issues can be avoided by using VR in combination with a VLE. Adding VR to VLEs can offer a beneficial transfer of skills from a virtual space to the real world, since the environment mimics authentic imagery and settings, as noted in some of the research references [21]. Part of the research presents a review and assessment concerning VRLEs and the reported advantages these academic structures provide to students, as well as how they influence students' learning journeys within the program of study [22]. The authors conclude that an adequately implemented VLE improves student performance and is associated with improved engagement. Some references present a sweeping overview of virtual-reality-based instruction research and conclude that VR environments are effective for teaching K-12 and higher education [23]. These results are used by instructional designers in their designs. Most existing VLEs have been made as demonstrations, enhancements, or companions for specific training assignments or instructor-led activities, and none of them leverage fast networking and distributed computing advances. Strikingly, some VRLEs represent an emerging distance education paradigm [24].
3 Research Methodology 3.1 Search Strategy and Selection Criteria Operational definitions of virtual reality gaming technologies, virtual learning environments, and cloud-based gaming technologies were obtained through a careful search of related work from various well-reputed publications in SCOPUS, Springer, etc. Related and recent work on these topics was then obtained and studied for similarity of ideas, approaches toward the achievement of their respective goals, methods of designing experiments, and systems supporting their research. The works were analyzed for the challenges faced during the execution of their ideas and the solutions they tentatively proposed and acted upon. Keywords used for the search are: virtual reality, artificial intelligence, virtual learning environment, cloud-based game, virtual reality applications, virtual reality in education, multi-user network application, low latency cloud gaming, high-performance virtual reality. The total number of keyword-related papers obtained:
(1) Before the Year 2000: 23
(2) In the range of the Year 2001–2005: 41
(3) In the range of the Year 2006–2010: 102
(4) In the range of the Year 2011–2015: 216
(5) In the range of the Year 2016–2019: 98
After sorting for papers related to our research goals, the total number of research papers obtained was 39.
3.2 Effective VRLE Related to Review Objective The VRLE application model design based on reference [25] was found most suitable to the review objectives. Its architecture is based on Fig. 1 given below, and it is also designed visually for virtual learning modules [26]. The guidelines provide direction, for instance, environments built with reduced distractions, suitable avatars, guiding markers to direct movement, and locking cases to help keep the students in place while viewing a lesson. On the client side, it can support up to 150 users (owing to High Fidelity capabilities and assuming enough server-side resources [27]) across different geographical regions, each wearing a VRLE client device [28]. The client device consists of a wearable VR headset and a local PC. It adapts the existing iSocial Social Competence Intervention (SCI) training curriculum, as a learning module, to its new VRLE application.
Fig. 1 Basic structure of the cloud-based virtual reality learning environment
Using the system illustrated in Fig. 1, the online instructor will easily be able to upload the videos and files related to the curriculum to the VRLE server. The game designed for the VRLE teaching simulation will use these files and videos and share them with the students who connect to the game application. With proper state control, lessons and games will be rendered to the students on their local machines with minimal effort.
4 Technology Used, Utilization, and Effectiveness of the VRLE Technology 4.1 High Fidelity High Fidelity supports a high number of parallel users with the application of enough server-side resources. It offers high competency in the creation of virtual 3D avatars for users and in the creation of a well-designed, spatialized, immersive sound experience, in desktop and VR modes especially. High Fidelity also allows for decentralization, allowing users to create their own content and control with whom they want to share it.
4.2 HTML/CSS React Redux Utilized in the creation of Web sites to hold instructional pages. Browsers placed in training areas make for convenient and intuitive classrooms. The platform allows lessons to be added and changed by educators more easily than in iSocial. Different platforms can be linked depending on the present need and lessons.
4.3 Cloud Gaming Model A cloud gaming platform runs the game logic that is responsible for the game mechanics [29] and game interaction [30]. Game events are obtained through the game server API [30] and passed to the graphics processing unit (GPU) renderer [31], which helps in making game-world scenes advanced [31], fast and smooth [32]. The generated video frames, after encoding and compression, are streamed to the mobile devices [32]. The devices handle the decoding [33] and present the result to the game player [34, 35].
5 Research Issues in VRLE In the course of this research, the use of these technologies revealed some inefficiencies that can be corrected in future research.
5.1 High Fidelity High fidelity requires a high amount of server-side resources for smooth functioning. The amount of resources required is directly proportional to the increased functionality provided. A possible alternative to this is required if a large number of people want to connect at once.
5.2 HTML/CSS React Redux HTML/CSS React Redux does not allow for smooth information transfer between VRLE and the Web platform with the current technology. A stable technology with better features is sorely needed.
5.3 Cloud Gaming Model The cloud gaming model consumes a large amount of resources for the proper rendition of the GPU-created game-world in real time [36], and a high-speed data link is needed for the real-time transfer [37]. This has to be improved to provide high-speed links while consuming fewer resources [38]. Another alternative would be to design new hardware dedicated to these applications [39], containing most of the resources common to all applications and needing to update only the unique resources through the data link [39].
6 Conclusion VRLE is an upcoming advancement in the world of learning and virtual reality, and its benefits are numerous. The application of this technology will help relieve the burden on students while promoting curiosity and the desire to learn among them. The current technology is, however, not suitable to accomplish this idea to its best possible extent. Future possible applications of VRLE span a wide range of social, academic, or even possibly strategic-level applications. The virtual reality learning
environment (VRLE) builds upon a wide number of existing foundations to provide a virtualized environment extremely suitable for social activities online. With the combination of brain–computer interfaces (BCI), it may even be possible to design a completely cognition-controlled virtual reality learning environment (VRLE), which would lead to a full-immersion virtual dive environment controlled entirely by neural activity without impact on the other senses.
References 1. Stichter JP, Laffey J, Galyen K, Herzog M (2014) “iSocial: delivering the social competence intervention for adolescents (SCI-a) in a 3d virtual learning environment for youth with high functioning autism. J Autism Dev Disord 44(2):417–430 2. Liou W-K, Chang C-Y (2018) Virtual reality classroom applied to science education. In: 23rd international scientific-professional conference on information technology (IT), 19–24 Feb 2018 3. Berman M, Chase JS, Landweber L, Nakao A, Ott M, Raychaudhuri D, Ricci R, Seskar I (2014) GENI: a federated testbed for innovative network experiments. Comput Network 61:5–23 4. Schmidt M, Jonassen D (2010) Social Influence in a 3D Virtual Learning Environment for Individuals with Autism Spectrum Disorders. University of Missouri–Columbia 5. Didehbani N, Allen T, Kandalaft M, Krawczyk D, Chapman S (2016) Virtual reality social cognition training for children with high functioning autism. Comput Human Behav 62:703– 711 6. Austin R, Sharma M, Moore P, Newell D (2013) Situated computing and virtual learning environments: e-learning and the benefits to the students learning. IEEE, pp 523–528 7. Costa I, Araujo J, Dantas J, Campos E, Silva FA, Maciel PRM (2016) Availability evaluation and sensitivity analysis of a mobile backend-as-a-service platform. Qual Reliab Eng Int 32:2191– 2205 8. Merchant Z, Goetz ET, Cifuentes L, Keeney-Kennicutt W, Davis TZ (2014) Effectiveness of virtual reality-based instruction on students’ learning outcomes in k-12 and higher education: a meta-analysis. Comput Educ 70:29–40 9. Zizza C, Starr A, Hudson D, Nuguri SS, Calyam P, He Z, (2017) Towards a social virtual reality learning environment in high fidelity. arXiv.org, 21st of July 2017 10. Deng Y, Li Y, Seet R, Tang X, Cai W (2018) The server allocation problem for session-based multiplayer cloud gaming. IEEE Trans Multime 20:1233–1245 11. Buyya R, Yeo CS, Venugopal S, Broberg J, Brandic I (2009) Cloud computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility. Future Gener Comp Syst 25:599–616 12. Cai W, Chi Y, Zhou C, Zhu, C, Leung (2018) V.C.M. UBC gaming: ubiquitous cloud gaming system. IEEE Syst J 12:2483–2494 13. Ross PE (2009) Cloud computing’s killer app: gaming. IEEE Spectr 46:14 14. Cai W, Shea R, Huang C-Y, Chen K-T, Liu J, Leung VCM, Hsu C-H (2016) A survey on cloud gaming: future of computer games. IEEE Access 4:7605–7620 15. Al-Rousan NM, Cai W, Ji H, Leung VCM (2015) DCRA: decentralized cognitive resource allocation model for game as a service. In: Proceedings of the IEEE 7th international conference on cloud computing technology and science (CloudCom), Vancouver, BC, Canada, 30 Nov–3 Dec 2015, pp 218–225 16. Shea R, Liu J, Ngai E, Cui Y (2013) Cloud gaming: architecture and performance. Netw IEEE 27:16–21 17. Kim H, Kim KJ (2017) Optimized state update for mobile games in cloud networks. Cluster Comput 1–7
18. Li Y, Deng Y, Tang X, Cai W, Liu X, Wang G (2018) Cost-efficient server provisioning for cloud gaming. ACM Trans Multimed Comput Commun Appl 14:55
19. Choy S, Wong B, Simon G, Rosenberg C (2014) A hybrid edge-cloud architecture for reducing on-demand gaming latency. Multimed Syst 20:503–519
20. Ahmadi H, Zad Tootaghaj S, Reza Hashemi M, Shirmohammadi S (2014) A game attention model for efficient bit rate allocation in cloud gaming. Multimed Syst 20:485–501
21. Wang H, Shea R, Ma X, Wang F, Liu J (2014) On design and performance of cloud-based distributed interactive applications. IEEE Comput Soc, pp 37–46
22. Jiang MH, Visser OW, Prasetya ISWB, Iosup A (2018) A mirroring architecture for sophisticated mobile games using computation-offloading. Concurr Comput Pract Exp 30:e4494
23. Mishra D, El Zarki M, Erbad A, Hsu C-H, Venkatasubramanian N (2014) Clouds + games: a multifaceted approach. IEEE Internet Comput 18:20–27
24. Lee J, Kim M, Kim J (2017) A study on immersion and VR sickness in walking interaction for immersive virtual reality applications. Symmetry 9:78
25. Hwang G (2010) Supporting cloud computing in thin-client/server computing model. In: ISPA 2010, pp 612–618
26. Cuervo E, Wolman A, Cox LP, Lebeck K, Razeen A, Saroiu S, Musuvathi M (2015) Kahawai: high-quality mobile gaming using GPU offload. In: Proceedings of the 13th annual international conference on mobile systems, applications, and services (MobiSys), Florence, Italy, pp 121–135
27. Amiri M, Al Osman H, Shirmohammadi S, Abdallah M (2016) Toward delay-efficient game-aware data centers for cloud gaming. ACM Trans Multimed Comput Commun Appl 9:12
28. Jain R, Paul S (2013) Network virtualization and software-defined networking for cloud computing: a survey. IEEE Commun Mag 51:24–31
29. Mondesire SC, Angelopoulou A, Sirigampola S, Goldiez B (2018) Combining virtualization and containerization to support interactive games and simulations on the cloud. Simul Model Pract Theory (in press)
30. Kempa WM, Woźniak M, Nowicki RK, Gabryel M, Damaševičius R (2016) Transient solution for queueing delay distribution in the GI/M/1/K-type model with "queued" waking up and balking. In: Rutkowski L, Korytkowski M, Scherer R, Tadeusiewicz R, Zadeh L, Zurada J (eds) Artificial intelligence and soft computing, vol 9693. Springer, Cham, pp 340–351
31. de Santana RAS, Dias-Júnior CQ, do Vale RS, Tóta J, Fitzjarrald DR (2017) Observing and modeling the vertical wind profile at multiple sites in and above the Amazon rain forest canopy. Adv Meteorol
32. Desai PR, Desai PN, Ajmera KD, Mehta K (2014) A review paper on Oculus Rift—a virtual reality headset. Int J Eng Trends Technol 13(4)
33. Liagkou V, Salmas D, Stylios C (2019) Realizing virtual reality learning environment for Industry 4.0. Procedia CIRP 79:712–717
34. Bogusevschi D, Muntean C, Muntean G-M (2019) Teaching and learning physics using 3D virtual learning environment: a case study of combined virtual reality and virtual laboratory in secondary school. In: Society for Information Technology & Teacher Education International Conference, Las Vegas, NV, United States, 18 Mar 2019
35. Parong J, Mayer RE (2018) Learning science in immersive virtual reality. J Educ Psychol 110(6):785–797
36. Zhou Y, Ji S, Xu T, Wang Z (2018) Promoting knowledge construction: a model for using virtual reality interaction to enhance learning. Procedia Comput Sci 130:239–246
37. Lai Z, Charlie Hu Y, Cui Y, Sun L, Dai N, Lee H-S (2019) Furion: engineering high-quality immersive virtual reality on today's mobile devices. IEEE Trans Mob Comput
38. Cai W, Shea R, Huang C-Y, Chen K-T, Liu J, Leung VCM, Hsu C-H (2016) The future of cloud gaming [point of view]. Proc IEEE 104:687–691
39. Buzys R, Maskeliūnas R, Damaševičius R, Sidekerskienė T, Woźniak M, Wei W (2018) Cloudification of virtual reality gliding simulation game. Information 9:293. MDPI, Basel, Switzerland
Different Platforms Using Internet of Things for Monitoring and Control Temperature
Sebastián Gutiérrez, Rafael Rocha, Emmanuel Estrada, David Rendón, Gemma Eslava, Luis Aguilera, Pedro Manuel Rodrigo, and Vijender Kumar Solanki
Abstract This article describes a monitoring system developed for the temperature control and monitoring of a building using the Internet of things (IoT). The system controls an actuator that simulates an HVAC system using a global controller, based on the data obtained by two temperature sensors. These data are stored in a platform called Firebase and are visualized in a monitoring platform through JSON, whose interface shows the temperature data in different colors according to the temperature of the room and shows the status of the fan (ON–OFF) in real time. Most global controllers do not have a data storage system. This work shows how it is possible to implement storage on different cloud platforms through the Internet of things.
Keywords Cloud services · Internet of things · Smart building · Temperature monitoring
S. Gutiérrez (B) · R. Rocha · E. Estrada · D. Rendón · G. Eslava · L. Aguilera · P. M. Rodrigo Facultad de Ingeniería, Universidad Panamericana, Aguascalientes, Mexico e-mail: [email protected] R. Rocha e-mail: [email protected] E. Estrada e-mail: [email protected] D. Rendón e-mail: [email protected] G. Eslava e-mail: [email protected] L. Aguilera e-mail: [email protected] P. M. Rodrigo e-mail: [email protected] V. K. Solanki Department of Computer Science and Engineering, CMR Institute of Technology, Hyderabad, Telangana, India e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_75
1 Introduction One of the most important Internet of things (IoT) applications is in the area of building automation systems, which integrate diverse kinds of equipment such as sensors, actuators, and control systems. This branch of the IoT is gaining popularity because of the reduction in energy costs, and it can also provide services to the electric grid such as frequency regulation [1]. The automation implementations range from heating, ventilation, air conditioning, and lighting for occupants' comfort to critical infrastructure such as fire safety and physical access control [2]. A smart/automated building gives us the capacity to always maintain an ideal level of comfort, safety, and energy consumption. The main physical processes in a building include heat and cold transfer [3]. The cost of energy consumption represents a huge percentage of a building's general expenses. In general, the energy consumption of a large office building's HVAC system takes up 40–50% of the building's total energy use [4]. Population growth, improved construction services and comfort levels, together with increased time spent inside buildings, have raised the energy consumption of buildings to transport and industry levels [5–7]. Because the control and monitoring of temperature is so important, a temperature monitoring system was developed using a global controller based on the data obtained by two temperature sensors, one of them wireless and the other wired. The sensors provide information that supports decision making with the aim of reducing expenses and maintaining the comfort of the occupants [8]. Analysis of the acquired data takes place at the input of the system, while the output includes the selection of actions (decision making) and the implementation of actions. With this input and output arrangement, the system allows or inhibits certain actions. For this system, we switch a fan (simulating an HVAC device) ON or OFF according to the temperature of the room. The two sensors send the data to the Firebase database through the global controller based on HTML5 [9–11], and the activation or deactivation of the fan and the temperature history are then plotted. In this way, temperature can be monitored, knowing that it represents the main factor affecting human comfort and considering that it accounts for the largest part of the building energy footprint (heating, ventilation, HVAC) [12]. The rest of the paper is organized as follows: Sect. 2 overviews the system operation. Section 3 presents and discusses the results obtained with the proposed approach. Section 4 gives some future work perspectives. Finally, Sect. 5 concludes the work, summarizing the main concepts.
2 System Operation The implementation of the project was made in the building of Universidad Panamericana in the city of Aguascalientes, México. The system was designed to monitor and control two zones: zone "A" gets values from the first half of the laboratory and zone "B" gets values from the second half, as shown in Fig. 1.
Fig. 1 User interface: main menu
2.1 Zone A The zone "A" uses an integration of the ESP8266 board and a DHT11 sensor to get and send the temperature/humidity data to the cloud through the Internet. Figure 2 shows the prototype of the system of the zone "A". The temperature/humidity values are uploaded to Firebase and are displayed on the global controller through the JSON communication protocol. Figure 3 shows the graphic interface of the global controller of the zone "A". Figure 4 shows the IoT architecture of the zone "A", where the important function of the JSON communication protocol can be seen. The protocol links the global controller, the ESP8266 board, and the DHT11 sensor using the Firebase cloud.
Fig. 2 Prototype of the collected data of the zone "A"
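To make the zone "A" data path concrete, the following MicroPython-style sketch reads the DHT11 and writes a JSON record to Firebase through its REST interface. The published prototype was programmed with the Arduino toolchain rather than MicroPython, and the Firebase URL, the database path zoneA, the pin number, and the update period below are hypothetical placeholders.

import json
import time
import dht
import machine
import urequests  # MicroPython HTTP client

# Hypothetical Firebase Realtime Database endpoint (replace with a real project URL)
FIREBASE_URL = "https://example-project.firebaseio.com/zoneA.json"

sensor = dht.DHT11(machine.Pin(4))  # DHT11 data line on GPIO4 (assumption)

while True:
    sensor.measure()  # trigger one reading
    payload = {
        "temperature": sensor.temperature(),  # degrees Celsius
        "humidity": sensor.humidity(),        # percent RH
    }
    # PUT overwrites the zoneA node; the global controller reads it back via JSON
    urequests.put(FIREBASE_URL, data=json.dumps(payload)).close()
    time.sleep(30)  # update period (assumption)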
ESP8266& DHT11 Sensor
LCD
Relays
Fig. 3 User interface: zone “A”
Fig. 4 IoT architecture of the zone “A”
2.2 Zone B The zone "B" gets values directly from a wired temperature/humidity sensor that was installed close to the ceiling of the laboratory. The values are first processed inside the global controller and uploaded to Firebase for storage; later, the values are downloaded and shown in the application. Figure 5 shows the graphic interface in the global controller of the zone "B".
Fig. 5 User interface: zone "B"
The temperature can be set up using a text box inside our application interface. The set value is compared, using an IF structure, with the value obtained from the temperature/humidity sensor. The result is a particular action, for instance, start fan or stop fan; a minimal sketch of this logic is given below. A better control of the fan could be achieved using a power converter (an inverter or motor drive) instead of a simple relay [13–15]. For the correct operation of the system in the zone "B", the connection was made as shown in Fig. 6. An additional feature included with the global controller software is a chart that shows the historical information about the temperature and humidity. With such information, it is possible to make decisions about the cooling system and even change it in case the comfort temperature cannot be reached. All the information was stored on the Internet (Firebase cloud database) and read by the interface of the global controller.
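The ON–OFF comparison can be sketched in a few lines of Python; the Relay class and the example values are hypothetical stand-ins for the global controller's internals.

class Relay:
    """Hypothetical stand-in for the relay driver used by the global controller."""
    def on(self):
        print("fan ON")
    def off(self):
        print("fan OFF")

def control_fan(set_point_c, measured_c, relay):
    # Simple ON-OFF comparison, as in the IF structure described above
    if measured_c > set_point_c:
        relay.on()   # room warmer than desired: start fan
    else:
        relay.off()  # at or below the set point: stop fan

control_fan(set_point_c=24.0, measured_c=26.5, relay=Relay())  # example: fan ON

A real deployment would normally add a small dead band (hysteresis) around the set point so the relay does not switch rapidly when the temperature hovers near the set value.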
Fig. 6 Single line connection architecture of the zone “B”
Fig. 7 General IoT architecture
Figure 7 shows the general IoT architecture of the system which includes the zone “A” and “B”. Each of the different zones has a connection to the database through the JSON communication protocol which allows the information to be sent to the cloud and subsequently visualized in the graphical interface of the global controller.
3 Results Obtaining comfort conditions inside a building with a fan was difficult. Although our application kept the fan working for hours, the temperature set point was never reached, because the fan was only a simulation of an HVAC system: it is not a cooling device and only distributes the air around the room. Through the process, we needed to obtain data from different zones of the building. Of course, the first approach was to connect our sensor with a cable to the global controller, but it was complicated by the distance between zones. Zone "B" maintains the connection by cable, but zone "A" uses additional hardware to send data to the global controller. As a solution, a device was created which monitored humidity and temperature and sent the data to the global controller through Wi-Fi. The process to create a wireless device that sends data to the global controller was
Fig. 8 Temperature and humidity sensor and fan in (a) zone "A" and (b) zone "B"
achieved using the popular development board NodeMCU ESP8266, which sent the information over Wi-Fi. The connection between the development board and the global controller was not direct: there was a need for a place to store the information and then retrieve it in the global controller. The Google Firebase service was chosen as the information storage because it is a free solution. By connecting one sensor through cable and the additional sensor through Wi-Fi, it was possible to determine the general status of the temperature and humidity conditions in the laboratory. With the previous information, it was also possible to show a summary and the real status of the laboratory using a digital interface. The user interface was developed with a DSA platform and uploaded to the global controller. This step was the core of our development. The program logic was tested and validated with real values, using some test results and assumed values of delay from our sensors and the global controller. Delay issues made it impossible to see real-time changes between the user interface and the hardware. Therefore, in order to prevent delays, the DSA development was updated and improved to speed up the writing and reading processes to the Firebase database. Finally, the application controlled two different zones, each one with a specific temperature and humidity sensor and an independent fan that simulates the HVAC system, as shown in Fig. 8a, b. The user interface shows in real time the actual temperature and humidity values in the laboratory and decides whether to turn the fans on or off.
4 Discussion This project provides a solution for buildings in which it is not possible to store data locally. The database in the cloud (Firebase) gives us the capacity to connect the IoT system, control the building remotely, and store a lot of data to graph, analyze, and implement better energy efficiency programs. Further improvements include the use
of air-conditioning systems, power electronics for controlling the actuators, and even the aperture of windows in order to renovate the air automatically.
5 Conclusions Nowadays, IoT technology has enabled improvements in energy efficiency management, maintenance savings, and the quality of human life. This project has made it possible to verify that, even with apparently obsolete equipment and little or no on-site storage capacity, using Internet of things technology it is possible to have remote storage, allowing real-time storage and display for different control platforms. The implementation of the system worked perfectly, fulfilling all required monitoring and control tasks.
References
1. Parshin M, Majidi M, Ibanez F, Pozo D (2019) On the use of thermostatically controlled loads for frequency control. In: 2019 IEEE Milan PowerTech, pp 1–6
2. Pan Z, Hariri S, Pacheco J (2019) Context aware intrusion detection for building automation systems. Comput Secur. https://doi.org/10.1016/j.cose.2019.04.011
3. Liu Z, Chen X, Xu X, Guan X (2013) A decentralized optimization method for energy saving of HVAC systems. In: IEEE international conference on automation science and engineering
4. Pérez-Lombard L, Ortiz J, Pout C (2008) A review on buildings energy consumption information. Energy Build. https://doi.org/10.1016/j.enbuild.2007.03.007
5. Xu Y, Zhang L (2011) Study on energy efficiency design of public building base on overall consideration of energy consumption factors. In: 2011 international conference on multimedia technology (ICMT 2011)
6. Wang R, Lu S, Feng W (2020) A novel improved model for building energy consumption prediction based on model integration. Appl Energy. https://doi.org/10.1016/j.apenergy.2020.114561
7. Shaikh PH, Nor NBM, Nallagownden P, Elamvazuthi I, Ibrahim T (2014) A review on optimized control systems for building energy and comfort management of smart sustainable buildings
8. Hong T, Taylor-Lange SC, D'Oca S, Yan D, Corgnati SP (2016) Advances in research and applications of energy-related occupant behavior in buildings
9. Envysion Distech Controls. http://www.distechcontrols.com/en/us/products/envysion
10. Gutiérrez S, Velázquez R, Alvarez J (2017) Review of system controller for smart homes/building applications. In: 7th IMEKO TC19 symposium on environmental instrumentation and measurements (EnvIMEKO 2017)
11. Gutierrez S, Barrientos E, Alvarez J, Cardona M (2018) An integrated architecture for monitoring and control the temperature of different platforms based on internet of things. In: 2018 IEEE 38th Central America and Panama Convention (CONCAPAN XXXVIII), pp 1–5. https://doi.org/10.1109/CONCAPAN.2018.8596403
12. Minakais M, Okaeme CC, Mishra S, Wen JT (2017) Iterative learning control for coupled temperature and humidity in buildings. IFAC-PapersOnLine. https://doi.org/10.1016/j.ifacol.2017.08.2290
13. Mohan N, Undeland T, Robbins WP (2002) Power electronics: converters, applications, and design, 3rd edn. Wiley, London. ISBN: 9780471226932
14. Florez-Tapia AM, Ibanez FM, Vadillo J, Elosegui I, Echeverria JM (2017) Small signal modeling and transient analysis of a Trans quasi-Z-source inverter. Electr Power Syst Res. https://doi.org/10.1016/j.epsr.2016.10.066 15. Ibanez FM (2019) Bidirectional series resonant DC/AC converter for energy storage systems. IEEE Trans Power Electron 34:3429–3444. https://doi.org/10.1109/TPEL.2018.2854924
Internet of Things System Using the Raspberry Pi to Monitor a Small-Scale Server Room Thien M. Nguyen, Phuc G. Tran, Phuoc V. Dang, Huy H. T. Le, and Nhu Q. Tran
Abstract In this paper, an overview of an IoT system prototype to monitor the ambient conditions of the Vietnamese-German University's server room is discussed. The system makes use of a Raspberry Pi as the central processor, with on-site touch screen input and its own back-up power and Internet connection. The server room's power status is recorded by the Raspberry Pi and reported back to the user. Ambient data is gathered by several ESP8266 microcontroller boards to build an ambient model of the isolated server room. A smartphone application is used to push notifications and ambient data, along with simple remote control commands, to the user. Air conditioning is controlled automatically by these microcontroller boards by simulating the isolated AC's remote control. This project concentrates on using off-the-shelf hardware components and well-known, well-maintained software libraries. Our monitoring results are satisfactory, with less than 2 °C of temperature error. Keywords Internet of Things · Raspberry Pi · Server · ESP8266
T. M. Nguyen · P. G. Tran · P. V. Dang · H. H. T. Le · N. Q. Tran (B) EEIT Program, Vietnamese—German University, Thu Dau Mot City, Binh Duong Province, Vietnam e-mail: [email protected] T. M. Nguyen e-mail: [email protected] P. G. Tran e-mail: [email protected] P. V. Dang e-mail: [email protected] H. H. T. Le e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_76
1 Introduction The concept of smart monitoring plays a significant role in presenting the concept of the Internet of Things to the masses. Many research groups and companies are working in this domain. Many projects [1–3] use the Raspberry Pi, a single-board ARM-based computer, due to its affordability and versatility. Microcontroller networks are also used. Noar [4] used the ESP8266 with Blynk to detect the water level in floods. One notable mention is Xiaodong [5]: using a control system based on the STM32 microcontroller, he created a smart home control system with a graphic user interface (GUI) and data transmission to the user's mobile phone and a remote webpage. Shaheed [6] presented a smart home system with adaptable cost using the ESP8266 Wi-Fi module. Mao [7] designed an intelligent home project to monitor environmental conditions and electric household appliances, regulate humidity and temperature, and provide camera surveillance. Using a ZigBee sensor network and the 3G/4G network, his system supports users through a mobile application and webpage to manage household control options. Malche [8] modeled an IoT platform for household application, ambient monitoring, and security supervision. Nguyen Huynh [9] created a project to supervise and monitor the ambient conditions of a small-scale server room to prevent and reduce electrical incidents. This author made use of Blynk, a hardware-independent IoT platform which allows third-party hosting of the service and customization of the prebuilt mobile application. These advantages are also why Noar implemented Blynk in his project [4]. Wireless interfaces provide better mobility and a broader range of data collection [5, 10]. The GSM/GPRS network is commonly used to transmit data to the user's mobile phone. Many commercial products are available on the market, like the APC temperature sensor system and the server room temperature and humidity sensor system by ITWatchdog. This project was executed to monitor the main server room of the Vietnamese-German University (VGU), which hosts eight servers in a room of 12 m2 with an insulated air conditioner and an insulated floor. The servers are kept in three server racks and are powered by an uninterrupted power supply (UPS). Maintenance of this server room is the responsibility of the VGU IT department, which consists of five staff members. Prior to this project, every working condition of the servers was collected and processed manually, either to make future decisions or to make reports periodically. The system was designed to report the servers' working conditions, which include temperature, humidity, Internet connection status, and power status of VGU's server center, to IT managers through mobile phone notifications and e-mail in certain situations. This project has its own unique and adapted features, which are a self-hosted data server, a back-up power source, and a back-up Internet connection. It tracks the server room ambiance and notifies the users if a power outage or a loss of Internet connection occurs, or if the current environmental conditions do not meet the requirements. Users can use the provided data to make decisions in time and to prevent unwanted circumstances.
In this paper, we introduce the design, implementation, and realization of the system at the level of main ideas and general descriptions. What is not included in this paper is the actual codebase and detailed descriptions of every software and hardware module.
2 Hardware Descriptions The hardware of this system mainly consists of a Raspberry Pi, some ESP8266 modules, a 4G access point, and their peripherals. The main connection between modules is a wireless connection via an IEEE 802.11n (commercially known as Wi-Fi 4) network broadcasted at 2.4 GHz by the 4G access point (Fig. 1).
2.1 Raspberry Pi This system makes use of a Raspberry Pi model 3B as the main processor. The Raspberry Pi draws its power from the AC outlet using a 5 V/2.6 A power supply connected to the micro-USB type-B port. In case of a power outage, there is a separate uninterruptable power supply (UPS) connected to this Raspberry Pi. A liquid-crystal touch screen is connected to the full-size HDMI port for display and to one USB type-A connection for power and input. Another USB port is utilized for communicating with the servers' UPS. The Ethernet port is used to connect the Raspberry Pi with the server room's Internet connection. A back-up Internet connection is provided via the wireless connection between the Raspberry Pi and the 4G hotspot, using Wi-Fi 4 through the board's built-in antenna. LCD Module. One 7-inch Waveshare capacitive liquid-crystal touch screen is used to provide the on-site user interface with the system. The resolution of the display is
Fig. 1 Hardware connections
1024 by 600 pixels, and the display technology is in-plane switching liquid-crystal display (IPS LCD). The display is connected to the Raspberry Pi by an HDMI cable to provide the image output, and an additional full-size USB type-A cable is used to provide power and touch input. The configured maximum current drawn is 3 A at 5 V. Raspberry Pi’s Back-Up Power. One Xiaomi Mi 2S 10,000 mAh power bank is used to power the Raspberry Pi in case of power failure. The maximum current is 2.4 A. The micro-USB type-B port is used to charge the batteries, using a 16 W phone charger. A switching circuit using capacitors is used to prevent connecting the Raspberry Pi, the power bank, and its DC adapter in series. This can provide power for at least 30 min in case of failure.
2.2 4G Mobile Hotspot A 4G LTE mobile hotspot is used to broadcast a Wi-Fi network and provide a back-up Internet connection. The model is the TP-Link M7200, which supports IEEE 802.11n speeds up to 300 Mbps. This model also has a back-up battery of 2000 mAh. The micro-USB type-B port is used to provide DC power for this module. The data usage is relatively low, since the servers' Internet connection is rarely lost.
2.3 ESP8266 Six ESP8266 microcontroller boards are scattered around the server room: five to collect ambient data, and one directly under the air conditioner to control it. The microcontroller board is the NodeMCU ESP8266 module. The main microcontroller is the ESP8266, a 32-bit microcontroller from Espressif. This circuit is powered by three AA batteries, at an average voltage of 4.5 V. The sleep current draw is around 40 mA, and the peak current draw is around 180 mA. The microcontroller boards are connected to a Si7021 module to measure ambient data, or to an infrared light-emitting diode to simulate the air conditioner remote control. Si7021 Modules. Five Si7021 I2C modules are used for monitoring the ambient temperature and humidity. These are connected to the respective I2C-enabled pins of their controller board. The working voltage of the module is 3.3 V. IR LED. An infrared light-emitting diode is connected to a PWM-enabled pin of the ESP8266 to emulate the air conditioner's remote control. The working PWM frequency of the IR LED is 44 kHz, and the average drawn current is 5 mA.
3 Software Design The system’s software includes the software core, the graphic user interface (GUI), the notification services, the TCP/IP server, the UPS monitor, the 4G module client, the Blynk server, along with the Blynk mobile application, and the ESP8266 software base. The two latter run on the user’s Android device and the ESP8266 module, respectively, while the rest of the software run on the Raspberry Pi. All data, including the user’s credentials and monitoring results, are stored on a database file using the Python sqlite3 library, with an exception of the log file. The user’s credentials are hashed using the SHA3-512 algorithm before being stored in or compared with the database (Fig. 2).
3.1 TCP/IP Server This Python module creates a TCP/IP socket to connect and exchange data with the ESP8266 boards via Wi-Fi. Using the default Python socket library, it automatically creates a TCP/IP socket on the local network with a predefined port. Whenever a connection is established to a new device, the module asks the user for an identifier (ID) and binds it to the client's Media Access Control (MAC) address, so the board is recognized when it tries to reconnect on waking up from sleep. Recognizing the client by its ID and MAC enables the client to disconnect and connect again. By disconnecting the ESP8266 and enabling the deep sleep mode, the average current drawn is reduced from 150 to 5 mA.
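A simplified sketch of such a server loop with the standard socket library follows; the port, the handshake in which the board first sends its MAC address, and the message sizes are assumptions rather than the project's actual protocol.

import socket

PORT = 5005  # predefined port (assumption)
known_clients = {}  # MAC address -> user-assigned ID

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("", PORT))
server.listen()

while True:
    conn, addr = server.accept()
    # The first message from the board is assumed to carry its MAC address
    mac = conn.recv(64).decode().strip()
    if mac not in known_clients:
        # New device: ask the operator for an identifier and bind it to the MAC
        known_clients[mac] = input("ID for new device %s: " % mac)
    client_id = known_clients[mac]
    # Subsequent data carries the text-encoded sensor readings
    data = conn.recv(256).decode().strip()
    print("%s (%s): %s" % (client_id, mac, data))
    conn.close()  # the board disconnects and deep-sleeps until the next cycle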
Fig. 2 Software model
3.2 ESP8266 Software The software of the ESP8266 module is programmed mainly using the official ESP8266 core for Arduino library. These C++ programs allow the ESP8266 to interface with a known Wi-Fi hotspot and TCP port to exchange data. For the ambient data collector, a package from the TCP server determines the sleep time of the board and puts the board to sleep. After waking up, the board communicates with its I2C slave to read the temperature and humidity data, packs it into a text package, and sends it to the server socket. The air conditioner controller uses the IRremoteESP8266 repository hosted by crankyoldgit on GitHub. The system can set specific settings for the temperature, humidity, and fan on request. However, a "capture" mode can be used to save a few presets of the commands. This can later be extended to many other IR-enabled devices.
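The collector's wake-report-sleep cycle can be sketched as follows. The actual firmware is Arduino C++, so this MicroPython-flavoured sketch is only illustrative: the server address, the message format, and the stubbed Si7021 read are all assumptions.

import machine
import socket

SERVER = ("192.168.0.10", 5005)  # Raspberry Pi address and port (assumptions)

def read_si7021():
    """Placeholder for the I2C transaction with the Si7021;
    returns (temperature_c, humidity_rh)."""
    return 25.0, 60.0  # dummy values for illustration

# After waking from deep sleep, reconnect and report one reading
s = socket.socket()
s.connect(SERVER)
t, h = read_si7021()
s.send(("%.1f,%.1f" % (t, h)).encode())   # simple text package
sleep_ms = int(s.recv(16).decode())       # the server dictates the sleep time
s.close()

# ESP8266 deep sleep: arm the RTC alarm, then power down until it fires
rtc = machine.RTC()
rtc.irq(trigger=rtc.ALARM0, wake=machine.DEEPSLEEP)
rtc.alarm(rtc.ALARM0, sleep_ms)
machine.deepsleep()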
3.3 UPS Monitor The servers' uninterruptable power supply (UPS) is connected to the Raspberry Pi by a USB cable to monitor its electrical power status. We conducted testing on the Santak Blazer 2000 Pro UPS, which has a USB type-B port for serial communication. The software is written in Python using the official os library to interface with Network UPS Tools (NUT), an open-source package available on Debian. Supported monitoring functions include input AC voltage and frequency, total load power, the system status, etc. These functionalities depend on the support of Network UPS Tools for the specific model of the UPS and are subject to change when implementing the system.
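One way such polling can look is sketched below: instead of the os-library wiring used in the project, it shells out to NUT's upsc utility, which prints one `variable: value` pair per line. The UPS name santak@localhost and the chosen variables are assumptions.

import subprocess

def read_ups(ups_name: str = "santak@localhost") -> dict:
    """Poll NUT for all variables of the named UPS and parse them."""
    out = subprocess.run(["upsc", ups_name], capture_output=True, text=True, check=True)
    status = {}
    for line in out.stdout.splitlines():
        key, _, value = line.partition(": ")
        status[key] = value
    return status

info = read_ups()
print(info.get("input.voltage"), info.get("ups.load"), info.get("ups.status"))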
3.4 Graphic User Interface Engine The Python module we used, Kivy, provides the user with a UI to interact with the system thoroughly. Kivy is an open-source multi-platform Python library for rapid development of applications that make use of innovative user interfaces. Through the GUI, the user can configure all settings and observe the device's statistics. There are six main screens, "Home," "Report," "AC Control," "E-mail," "Settings," and "Log," and each screen runs a specific task. "Home" simply shows the information about the temperature and humidity of VGU's server room. The data are updated every 30 s, but there is also a button that can refresh the information immediately. The "Report" screen presents the statistics of the room as a graph, which can be exported to be sent to the user. The "AC Control" page allows the user to set up the air conditioner, such as adjusting the temperature and programming the timer. The "E-mail" screen shows a list of e-mail recipients. The user can also add/remove e-mail addresses and choose which data to send: the e-mail address list or statistics from the "Report" page. Besides being
Internet of Things System Using the Raspberry …
811
Fig. 3 On-site user’s interface
able to modify the user's info, the "Settings" screen has settings related to the system, like the sleeping time of the ESP8266. The user can inspect the real-time logger of Python and Kivy, the two core modules of this interface, through the last page, "Log" (Fig. 3).
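A minimal Kivy skeleton of this six-screen layout might look as follows, with a plain label standing in for each screen's real content; the widget contents and navigation are assumptions.

from kivy.app import App
from kivy.uix.label import Label
from kivy.uix.screenmanager import Screen, ScreenManager

SCREENS = ["Home", "Report", "AC Control", "E-mail", "Settings", "Log"]

class MonitorApp(App):
    def build(self):
        sm = ScreenManager()
        for name in SCREENS:
            screen = Screen(name=name)
            # Each real screen runs its own task; a label stands in for it here
            screen.add_widget(Label(text=name))
            sm.add_widget(screen)
        sm.current = "Home"
        return sm

if __name__ == "__main__":
    MonitorApp().run()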
3.5 IoT Platform The Raspberry Pi hosts the Blynk server, which forwards messages from the Raspberry Pi to the Blynk mobile app. A self-hosted server was chosen to cut the cloud renting fee, and it also provides more flexibility, as it does not require a third-party cloud solution. Blynk is responsible for two main purposes. It displays the recorded data from the database on the mobile application and illustrates it as a graph over a time period. Besides that, it also alerts the user by sending push notifications to the mobile app when the local server is offline or any negative event occurs, such as exceeding the safe temperature or humidity, or loss of electricity (Fig. 4).
4 Results and Discussions The testing results are satisfactory. The ESP8266, when powered by two AA batteries, can run continuously for three weeks. The recorded temperature, when compared with the Extech Instruments Humidity Alert 2, an industry thermometer, always has an error of less than 2 °C. The humidity, on the other hand, has an error of about 5% RH, which is vastly different from the manufacturer's datasheet. However, this is acceptable, considering the cost of the whole sensor system is only one-tenth of the industry device. This system has a very primitive approach to the IoT problem. One of its main advantages, as for other previously mentioned projects, is its usage of mainly off-the-shelf components [2]. This makes later troubleshooting and development much
Fig. 4 Blynk mobile app interface
easier. One other major advantage is the low cost [1, 6] and ease of maintenance, with no usage of third-party services. Large and actively developed software projects like sqlite3, Python 3, and Kivy are used throughout the project, keeping the software of this system up to date. However, this system has many security concerns. Security was not our main priority for this project, since it is just a prototype built on a limited timeline. The low versatility means that future adaptation of this project for other purposes is difficult. The system also only introduces a few functions tailored to the use case of the VGU IT department.
5 Conclusions This paper introduced an IoT prototype to monitor the ambient conditions of VGU's server room. The article contributes to the automation initiative of VGU. The system has some advantages, such as using off-the-shelf components and low maintenance costs, but its main drawbacks are its lack of built-in security and being single purposed. Further implementation of this project can lean toward adding security functionality. Protecting the user login, database, local server, and especially the UPS connection is needed, as well as implementing door access control and closed-circuit television for the entrance of the room. Optimizing the modules, especially the software, to save more power is also a possibility. One way to enhance automation is to add more ambient sensors to feed an ambient model of the specific room. From that, the software can automatically and accurately change the air conditioner settings.
Bibliography
1. Wen X, Wang Y (2018) Design of smart home environment monitoring system based on Raspberry Pi. In: 2018 Chinese control and decision conference (CCDC), pp 4259–4263. https://doi.org/10.1109/ccdc.2018.8407864
2. Taylor J, Hossain HMS, Ul Alam MA, Al Hafiz Khan MA, Roy N, Galik E, Gangopadhyay A (2017) SenseBox: a low-cost smart home system. In: 2017 IEEE international conference on pervasive computing and communications workshops (PerCom Workshops), pp 60–62. https://doi.org/10.1109/percomw.2017.7917522
3. Sharon V, Karthikeyan B, Chakravarthy S, Vaithiyanathan V (2016) Stego Pi: an automated security module for text and image steganography using Raspberry Pi. In: 2016 international conference on advanced communication control and computing technologies (ICACCCT), pp 579–583. https://doi.org/10.1109/icaccct.2016.7831706
4. Noar NAZM, Kamal MM (2017) The development of smart flood monitoring system using ultrasonic sensor with blynk applications. In: 2017 IEEE 4th international conference on smart instrumentation, measurement and application (ICSIMA), pp 1–6. https://doi.org/10.1109/icsima.2017.8312009
5. Xiaodong Z, Jie Z (2018) Design and implementation of smart home control system based on STM32. In: 2018 Chinese control and decision conference (CCDC), pp 3023–3027. https://doi.org/10.1109/ccdc.2018.8407643
6. Shaheed SM, Ilyas MSB, Sheikh JA, Ahamed J (2017) Effective smart home system based on flexible cost in Pakistan. In: 2017 fourth HCT information technology trends (ITT), pp 35–38. https://doi.org/10.1109/ctit.2017.8259563
7. Mao X, Li K, Zhang Z, Liang J (2017) Design and implementation of a new smart home control system based on internet of things. In: 2017 international smart cities conference (ISC2), pp 1–5. https://doi.org/10.1109/isc2.2017.8090790
8. Malche T, Maheshwary P (2017) Internet of things (IoT) for building smart home system. In: 2017 international conference on I-SMAC (IoT in social, mobile, analytics and cloud) (I-SMAC), pp 65–70. https://doi.org/10.1109/i-smac.2017.8058258
9. Nguyen Huynh DA (2018) Design and implementation of an IoT server room condition monitoring system. EEIT graduation thesis, Vietnamese-German University
10. Han D, Lim J (2010) Smart home energy management system using IEEE 802.15.4 and ZigBee. IEEE Trans Consum Electron 56:1403–1410. https://doi.org/10.1109/tce.2010.5606276
A Survey on Hybrid Intrusion Detection Techniques Nitesh Singh Bhati and Manju Khari
Abstract In the new era, information plays a key role for everyone; compromised information may harm users or society. Intrusion detection is a very useful tool to protect information at the host level as well as the network level. Many researchers have proposed several techniques for intrusion detection. Presently, hybrid intrusion detection is a prime area of research in the field of intrusion detection systems. In hybrid intrusion detection, more than one detection method is combined to take advantage of multiple techniques in a single model. In this paper, a survey of intrusion detection systems (IDS) and hybrid techniques for intrusion detection is presented. The aim of this survey is to motivate researchers in the area of hybrid intrusion detection techniques. Keywords Intrusion detection · Information security · Hybrid intrusion detection · Ensemble
1 Introduction The Internet, the global communication system for computer networks, uses the Internet protocol suite. It is a network of networks, consisting of diverse private, public, academic, and government networks ranging in scope from local to global. As Internet applications grow day by day, their security against intrusion is becoming a major issue [1]. The computer emergency response team (CERT) has reported that intrusions have increased steadily. In layman's language, an intrusion is basically a condition in which someone performs an activity
that they are not permitted to perform; similarly, intrusion here refers to the act of compromising computer network security policies, i.e., confidentiality, integrity, and availability. Intrusion detection is the process of tracking the actions performed by the system and constantly evaluating them for traces of intrusion [2]. An intrusion detection system (IDS) provides software or hardware facilities to automate the intrusion detection procedure. Besides IDS, there are intrusion detection and prevention systems (IDPS), also called intrusion prevention systems (IPS). An IPS or IDPS has all the features of an IDS along with the ability to attempt to stop possible incidents.
1.1 Types of IDS The various types of intrusion detection system are as follows [3]:
• Host-based Intrusion Detection System (HIDS): It is installed on the workstations which are to be monitored. It is used to detect intrusions on a computer system or workstation. In this type of IDS, analysing intrusions across different computers in a large network is a difficult process.
• Network-based Intrusion Detection System (NIDS): It is installed on network appliances or sensors with a network interface card. It is used to detect intrusions within a network.
• Perimeter-based Intrusion Detection System (PIDS): It detects intrusion attempts on perimeter fences of critical infrastructure.
• Virtual Machines Intrusion Detection System (VMIDS): It helps to detect intrusions using virtual machines. As the latest technique, it is still evolving.
2 Procedures of Detecting Intrusions Data is collected in a canonical form by the monitoring system before pre-processing of the data to be analysed. Intrusions are detected through statistical approaches, signature comparison, or anomaly-based methods. If the system encounters any intrusion, the intrusion alert system is activated. Figure 1 shows a generic IDS [4].
3 Intrusion Detection Techniques The intrusion detection techniques can be broadly classified into the following three categories. The comparison of these three techniques is presented in Table 1:
Fig. 1 A generic intrusion detection system
Table 1 Comparison of intrusion detection techniques

Approach | Decision method | Advantages | Disadvantages
Rule-based | Based on predefined rules stored in a database | (a) Minimum false alarms are produced; (b) better for known attacks | (a) Only past attacks are detected; (b) rules need to be updated periodically
Signature | Based on pre-existing signatures stored in a database | (a) False positive rate is low; (b) performance is good | (a) Unknown attacks cannot be detected
Anomaly | Based on deviation from normal behaviour | (a) Better for unknown attacks; (b) it can be configured easily | (a) High false alarm rate
• Rule-based intrusion detection systems: This technique takes into account certain experiences and predefined rules against which data is parsed. The parsing filters out data that do not satisfy the parameters. It efficiently detects known attacks with a minimum number of false alarms. Provided that the administrator updates the rules, past attacks can be detected, but new and unknown attacks are not covered [5].
• Signature-based intrusion detection systems: This method works on the misuse detection principle, using a repository of known data that contains a database of malicious signatures which could be dangerous for the system, leading to attacks. This database is traversed across the input data set, and the unacceptable patterns are compared against a compilation of network traffic and alerts [6].
• Anomaly-based intrusion detection systems: This is also called the profile-based intrusion detection technique. In this technique, the system looks for any deviation from the normal behaviour of the defined profile. The history of the system's activity
and specifications of the user's intended behaviour are used to predict the kind of data a user might need. Its advantages are easy configuration and acceptable accuracy, while its disadvantage is the high number of false alarms produced. Moreover, training sets need to be collected extensively to characterize normal behaviour [7].
4 Hybrid Approach-Based Intrusion Detection Techniques The survey of hybrid approach-based techniques is presented in this section.
4.1 Neural and Fuzzy Logic This hybridization is achieved by combining fuzzy logic with neural networks. Liang [8] gave a Takagi–Sugeno (T-S) FNN-based algorithm that uses fuzzy theory together with neural networks to classify objects and recognize normal and abnormal behaviour. It helps overcome earlier weaknesses of conventional neural networks in feasibility, standardization, flexibility, and adaptability. Due to the removal of redundant and uncertain data from the original KDD data set, the proposed method achieved more than 90% detection accuracy.
4.2 Binary PSO and Random Forests Algorithm PSO-RF combines the advantages of random forest, which consists of many decision trees, and particle swarm optimization, a population-based stochastic optimization technique. Malik et al. [9] proposed feature selection and classification as the first and second steps, respectively: the first step uses binary PSO, while the second step uses the random forest algorithm. The proposed technique provides better performance than other classification techniques in terms of average false positive rate and average intrusion detection rate during attacks, while for normal records the RF algorithm achieved a lower false positive rate.
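To make the two-step idea concrete, here is a compact, hedged sketch using NumPy and scikit-learn on synthetic data: a sigmoid-based binary PSO searches for a feature mask, and a random forest's cross-validated accuracy serves as the fitness. The swarm parameters and data are illustrative, not those of [9].

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
rng = np.random.default_rng(0)
n_particles, n_feat, iters = 10, X.shape[1], 15
pos = rng.integers(0, 2, (n_particles, n_feat))   # binary positions (feature masks)
vel = rng.normal(0, 1, (n_particles, n_feat))

def fitness(mask):
    # Step 2: random forest accuracy on the selected feature subset
    if mask.sum() == 0:
        return 0.0
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    return cross_val_score(clf, X[:, mask == 1], y, cv=3).mean()

pbest = pos.copy()
pbest_fit = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, n_feat))
    # Standard PSO velocity update, then a sigmoid turns velocities into bit probabilities
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = (rng.random((n_particles, n_feat)) < 1 / (1 + np.exp(-vel))).astype(int)
    fit = np.array([fitness(p) for p in pos])
    better = fit > pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[pbest_fit.argmax()].copy()

print("selected features:", np.flatnonzero(gbest), "fitness:", pbest_fit.max())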
A Survey on Hybrid Intrusion Detection Techniques
819
4.3 Combining Decision Tree with Naive Bayes An effective intrusion detection approach combining naive Bayes (NB) with a decision tree (DT) was proposed by Panda et al. [10]. Using forward selection, the selected attributes are modelled by the DT initially and by naive Bayes at each step.
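One possible realization of this combination with scikit-learn is sketched below: a decision tree drives forward feature selection, and naive Bayes classifies on the retained attributes. The selector settings and synthetic data are assumptions; [10] describes the method, not this exact code.

from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=15, random_state=1)

# Forward selection: a decision tree scores candidate attribute subsets
selector = SequentialFeatureSelector(
    DecisionTreeClassifier(random_state=1), n_features_to_select=5, direction="forward"
)
X_sel = selector.fit_transform(X, y)

# Naive Bayes classifies on the attributes kept by the tree-driven selection
print(cross_val_score(GaussianNB(), X_sel, y, cv=5).mean())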
4.4 Neural-Based Hybrid Technique Govindarajan et al. proposed an intrusion detection technique based on a neural hybrid classification method to enhance prediction accuracy, using patterns of errors and expediting methods on the pertinent data. The authors concluded that the proposed method gives better detection accuracy compared to the non-hybrid model [11].
4.5 Neural Network and Bayesian Networks Jamili et al. [12] used intrusion detection based on hybrid propagation in Bayesian networks. A Bayesian network is a modelling network used for problems that involve uncertainty. The junction tree inference algorithm is used to compute exact discrete inference. Data sets contain many uncertainties, which make security difficult. These uncertainties have two main origins: the first is the uncertain character of the information, resulting from stochastic phenomena; the second, termed epistemic uncertainty, is the imprecise and incomplete character of the information due to insufficient knowledge. The DARPA data sets were employed to design and test the IDS.
4.6 Neural with Genetic Algorithms Li in 2010 described a hybrid neural network intrusion detection system using a genetic algorithm, a method that solves constrained and unconstrained optimization problems based on natural selection by repeatedly modifying a population of individual solutions. Neural networks have been used extensively for analysing both misuse and anomalous patterns. A hybrid evolutionary neural network (HENN) is proposed, integrating structure design, feature selection, and weight training in the IDS. The author analysed the proposed scheme in terms of detection accuracy [13].
4.7 HMM with Naïve Bayesian Karthick et al. proposed a hybrid approach for adaptive network intrusion detection with a two-stage architecture. In the first stage, a probabilistic classifier detects potential anomalies in the traffic. In the second stage, a hidden Markov model (HMM)-based model is used to narrow down the attack to a given IP address. An HMM is a generative model for sequential data in which the states and their transitions are not directly visible. Karthick et al. achieved approximately 100% detection accuracy [14].
4.8 RBF and Elman Neural Network Tong et al. proposed a hybrid RBF/Elman neural network intrusion detection model in which the detection rate and false positive rate are determined. The sensitivity of the system is configured easily and permits end users to tune the system for acceptable tolerances without having to retrain the neural network [15].
4.9 Self-organizing Map and Backpropagation Network In 2009, Aydin et al. proposed a hybrid intrusion detection system for computer network security based on the backpropagation method and the self-organizing map. The proposed system gave better performance for known as well as unknown attacks. Signature-based systems detect only known attacks, whereas anomaly-based systems detect unknown attacks. Snort's pre-processor architecture has been used here to combine the network traffic anomaly detector (NETAD) and the packet header anomaly detector (PHAD) with Snort. A pre-processor basically refers to a set of instructions or a program that processes its input data to generate output which is used as input to another program. The KDD Cup 99 data set has been used to examine the performance of the proposed system [16].
4.10 Fuzzy Logic and Data Mining Kumar et al. successfully applied artificial neural networks (ANN) to the development of IDS. ANNs provide a simple representation of the nonlinear relationship between input and output and have inherent computational speed. A SOM is a type of
ANN trained by unsupervised learning to produce a low-dimensional, discretized representation of the input space called a map. Fuzzy logic and data mining techniques have been used for anomaly-based intrusion detection, while a SOM was used for host-based detection. Data mining is basically used for knowledge extraction from databases; it efficiently discovers interesting and useful facts from large collections of data. Fuzzy logic gives a way to categorize a concept or method in an abstract way: between completely true and completely false situations there are situations of partial truth, and fuzzy logic, a superset of conventional logic, is used to handle them [17].
4.11 Fuzzy Logic with Neural Network Bashah et al. proposed a technique combining fuzzy logic and neural networks to obtain better anomaly intrusion detection, including for host-based IDS [18].
4.12 Hybridization of Unsupervised and Supervised Neural Networks Bahrololum et al. proposed a method that categorizes packets as normal or abnormal. They applied misuse detection techniques to normal packets, by which the processing time can be minimized [19].
4.13 Random Forest Classifier and Ensemble One-Class SVM Abebe and Lalitha proposed a hybrid technique to detect known and unknown attacks. They used a random forest classifier to detect known attacks and an ensemble one-class SVM for unknown attacks, achieving a high detection rate with a low false positive rate.
4.14 Decision Tree and Rules-Based Models Ahmim et al. proposed a hierarchical intrusion detection system that combines a decision tree (REP tree) with rule-based algorithms such as the JRip algorithm and Forest PA in order to improve accuracy and detection rate. They
822
N. S. Bhati and M. Khari
verified their experiment on the CICIDS2017 data set. The results were a detection rate of 94.457%, an overall accuracy of 96.665%, and the lowest false alarm rate at 1.145% [20].
4.15 Anomaly-Based Model Using Feature Selection Analysis Aljawarneh et al. proposed a hybrid intrusion detection model aimed at exploring meta-heuristic anomalies present in the network. They performed the experiment on the binary and multiclass parts of the NSL-KDD data set. The results showed a reduction in computation time, and the accuracy of the model was 99.81% for the binary class and 98.56% for the multiclass NSL-KDD data sets. However, the authors also reported issues with the false positive and false negative rates [21].
4.16 HIDS for DDoS Attacks Cepheli et al. proposed a hybrid intrusion detection system to detect DDoS attacks. Their model combines anomaly-based methods (created using multidimensional Gaussian mixture models, GMMs) and signature-based methods (created using Snort). They performed their experiment on two different data sets, DARPA and a commercial bank data set. The results verified the improved performance: the accuracy rate was 92.1% on DARPA and 99.9% on the commercial bank data set [22].
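The anomaly half of such a hybrid can be sketched with scikit-learn's GaussianMixture: fit on normal traffic features, then flag records whose log-likelihood falls below a threshold. The threshold rule and the synthetic features are assumptions, and the Snort signature side is omitted.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, (1000, 4))      # stand-in for normal traffic features
gmm = GaussianMixture(n_components=3, random_state=0).fit(normal)

# Threshold chosen as a low percentile of the training log-likelihoods (assumption)
threshold = np.percentile(gmm.score_samples(normal), 1)

test = np.vstack([rng.normal(0, 1, (5, 4)), rng.normal(6, 1, (5, 4))])
alerts = gmm.score_samples(test) < threshold  # True marks suspected attack traffic
print(alerts)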
4.17 Hybrid Feature Selection Technique Kamarudin et al. proposed a hybrid feature selection method to address feature selection in high dimensions, combining filter and wrapper selection procedures. A random forest (RF) classifier was used to evaluate the features selected by the filter method. The experiment was tested on the KDD99 and DARPA 1999 data sets. The result was a 0.03% false positive rate, with a 99.99% detection rate and 99.98% accuracy [23].
4.18 Stacking Ensemble of C5 Decision Tree Classifier and One-Class Support Vector Machine Khraisat et al. proposed a combination of the C5 decision tree classifier and a one-class support vector machine as the basis of a hybrid IDS whose main goal is to detect both known and unknown intrusions with higher detection accuracy and fewer false alarms. They performed their experiment on the NSL-KDD data set and the Australian Defence Force Academy (ADFA) data set, reporting accuracies of 83.24% on NSL-KDD and 97.40% on ADFA (Table 2) [24].
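A rough scikit-learn analogue of this idea is sketched below, with C5.0 replaced by a CART-style decision tree: the tree labels known classes, while a one-class SVM trained only on normal records flags traffic outside the normal profile as an unknown attack. This wiring and the synthetic data are illustrative, not the authors' exact design.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import OneClassSVM
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=10, n_classes=3,
                           n_informative=5, random_state=2)
tree = DecisionTreeClassifier(random_state=2).fit(X, y)  # known classes (0 = normal)
ocsvm = OneClassSVM(nu=0.05).fit(X[y == 0])              # trained on normal only

def classify(x):
    x = x.reshape(1, -1)
    if ocsvm.predict(x)[0] == -1:          # outlier w.r.t. the normal profile
        return "unknown attack"
    return "class %d" % tree.predict(x)[0] # otherwise use the tree's known label

print(classify(X[0]))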
Table 2 Summary of intrusion detection techniques

S. No | Year | Author | Techniques used | Results
1 | 2007 | Kumar et al. | SOM on BPN | Good detection rate
2 | 2007 | Bashah et al. | Neural and fuzzy logic | Efficient technique for anomaly
3 | 2009 | Bahrololum et al. | SOM and backpropagation | KDD data set
4 | 2009 | Jamili et al. | Hybrid propagation in Bayesian network on DARPA data set | For normal detection, 87.68 and 98.63% for classic and hybrid; for intrusion, 88.64 and 96.57%
5 | 2009 | Tong et al. | Hybrid RBF/Elman neural network | High detection rate and low false positive rate
6 | 2009 | Aydin et al. | Hybridization | NETAD + PHAD + Snort
7 | 2010 | Li | HENN (hybrid evolutionary neural network) | 91.51% total detection rate and 1.31% false positive rate
8 | 2011 | Govindarajan et al. | Neural-based classification (MLP and RBF) | 98.81% for normal, 93.31% for abnormal
9 | 2012 | Karthick et al. | Hidden Markov model (HMM) | 100% accuracy with CAIDA as attack and DARPA as clean
10 | 2013 | Pratibha and Dileesh | Hybrid | Using Snort and Hadoop
11 | 2015 | Abebe and Lalitha | Random forest classifier and ensemble one-class SVM | High detection rate
12 | 2016 | Cepheli et al. | HIDS for DDoS attacks | 92.1% for DARPA and 99.9% for the commercial bank data set
13 | 2017 | Aljawarneh et al. | Anomaly-based model using feature selection analysis | 99.81% accuracy for the binary class and 98.56% for the multiclass NSL-KDD data sets
14 | 2018 | Ahmim et al. | Decision tree and rules-based models | 94.457% detection rate, overall accuracy of 96.665%, lowest FAR at 1.145%
15 | 2019 | Kamarudin et al. | Hybrid feature selection technique | 0.03% false positive rate, 99.99% detection rate, 99.98% accuracy
16 | 2020 | Khraisat et al. | Stacking ensemble of C5 decision tree classifier and one-class SVM | 83.24% accuracy on NSL-KDD and 97.40% on ADFA
References
1. Gupta A, Bhati BS, Jain V (2014) Artificial intrusion detection techniques: a survey. Int J Comput Network Inform Secur 6(9):51
2. Bace R, Mell P (2001) NIST special publication on intrusion detection systems. Booz-Allen and Hamilton Inc., McLean
3. Bhati BS, Rai CS (2016) Intrusion detection systems and techniques: a review. Int J Critical Comput Syst 6(3):173–190. https://doi.org/10.1504/IJCCBS.2016.079077
4. Lundin E, Jonsson E (2002) Survey of intrusion detection research. Chalmers University of Technology
5. Bhati BS, Rai CS (2019) Analysis of support vector machine-based intrusion detection techniques. Arab J Sci Eng 1–13
6. García-Teodoro P, Díaz-Verdejo J, Maciá-Fernández G, Vázquez E (2008) Anomaly-based network intrusion detection: techniques, systems and challenges. Comput Secur 28(1):109–115
7. Jyothsna V, Prasad VVR, Prasad KM (2011) A review of anomaly based intrusion detection systems. Int J Comput Appl 28(7):26–35
8. Liang H (2014) An improved intrusion detection based on neural network and fuzzy algorithm. J Networks 9(5):1274–1280
9. Malik AJ, Shahzad W, Khan FA (2012) Network intrusion detection using hybrid binary PSO and random forests algorithm. Secur Commun Netw [online]. http://onlinelibrary.wiley.com/doi/10.1002/sec.508/full. Accessed 02 Dec 2019
10. Panda M, Abraham A, Patra MR (2011) A hybrid intelligent approach for network intrusion detection. In: International conference on communication technology and system design, pp 1–9
11. Govindarajan M, Chandrasekaran RM (2011) Intrusion detection using neural based hybrid classification methods. Comput Networks 55(8):1662–1671
12. Jemili F, Zaghdoud M, Ahmed MB (2009) Intrusion detection based on "hybrid" propagation in Bayesian networks. In: 2009 IEEE international conference on intelligence and security informatics. IEEE, pp 137–142
13. Li F (2010) Hybrid neural network intrusion detection system using genetic algorithm. In: 2010 international conference on multimedia technology. IEEE, pp 1–4
14. Karthick RR, Hattiwale VP, Ravindran B (2012) Adaptive network intrusion detection system using a hybrid approach. In: 2012 fourth international conference on communication systems and networks (COMSNETS 2012). IEEE, pp 1–7
15. Tong X, Wang Z, Yu H (2009) A research using hybrid RBF/Elman neural networks for intrusion detection system secure model. Comput Phys Commun 180(10):1795–1801
16. Aydın MA, Zaim AH, Ceylan KG (2009) A hybrid intrusion detection system design for computer network security. Comput Electr Eng 35(3):517–526
17. Kumar PG, Devaraj D (2007) Network intrusion detection using hybrid neural networks. In: 2007 international conference on signal processing, communications and networking. IEEE, pp 563–569
18. Bashah N, Shanmugam IB, Ahmed AM (2005) Hybrid intelligent intrusion detection system. World Acad Sci Eng Technol 11:23–26
19. Bahrololum M, Salahi E, Khaleghi M (2009) Anomaly intrusion detection design using hybrid of unsupervised and supervised neural network. Int J Comput Networks Commun (IJCNC) 1(2):26–33
20. Ahmim A, Maglaras L, Ferrag MA, Derdour M, Janicke H (2019) A novel hierarchical intrusion detection system based on decision tree and rules-based models. In: 2019 15th international conference on distributed computing in sensor systems (DCOSS). IEEE, pp 228–233
21. Aljawarneh S, Aldwairi M, Yassein MB (2018) Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model. J Comput Sci 25:152–160
22. Cepheli Ö, Büyükçorak S, Karabulut Kurt G (2016) Hybrid intrusion detection system for DDoS attacks. J Electr Comput Eng
23. Kamarudin MH, Maple C, Watson T (2019) Hybrid feature selection technique for intrusion detection system. Int J High Perform Comput Network 13(2):232–240
24. Khraisat A, Gondal I, Vamplew P, Kamruzzaman J, Alazab A (2020) Hybrid intrusion detection system based on the stacking ensemble of C5 decision tree classifier and one class support vector machine. Electronics 9(1):173
Topic-Guided RNN Model for Vietnamese Text Generation Dinh-Hong Vu and Anh-Cuong Le
Abstract Text generation is one of the most important tasks in NLP and has been applied in many applications such as machine translation, question answering and text summarization. Most recent studies on text generation use only the input for output generation. In this research, we suggest that the topic information of an input document is an important factor for generating the destination text. We propose a deep neural network model in which topic information is used together with the input text to generate summarized texts. An experiment on a Vietnamese news corpus shows that our model outperforms a baseline model by at least 23% in BLEU score. Keywords Text generation · Topic-guided · Deep neural networks
1 Introduction

Text generation is a task in NLP (natural language processing) that generates a new text from an input text. Given an input text x of n words, denoted x = {x_1, x_2, ..., x_n}, the task generates an output text y of arbitrary length, y = {y_1, y_2, ..., y_|y|}. The text generation task can be defined as finding ȳ:

ȳ = argmax_y P(y|x)   (1)

where P(y|x) is the conditional probability of the generated text y given the input text x.

D.-H. Vu (B) · A.-C. Le, NLP-KD Lab, Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam; e-mail: [email protected]; A.-C. Le e-mail: [email protected]
Deep neural network (DNN) models have recently been used effectively for text generation; such models can be trained on a set of source-target pairs. A source-target pair can be a document and its summary in text summarization, or a question and its answer in question answering systems. In machine translation, source and target texts are in different languages, while in the other tasks they are in the same language. DNN models such as sequence-to-sequence (seq2seq) [8] have shown their power for text generation. This model has been applied to many NLP tasks, including machine translation [8], question answering [10] and text summarization [6], and its performance improves further when an attention mechanism is added [1]. However, these works did not explicitly consider the topics of the source texts. Recent years have also shown the success of other deep neural networks: Generative Adversarial Nets (GAN) [3] can generate images and texts, although GANs perform better on images than on text; the Transformer model [9] and GPT-2 [5] achieve stronger results in text generation, but these models are large and expensive to train. Nevertheless, external information about a document may play an important role in text generation. We suggest that the topics of an input text provide additional information for generating the destination text, because texts on the same topic may exhibit specific generation patterns. For example, a text about weather usually includes temperature, humidity and location, while a text about security usually includes actions and effects. In this paper, we propose a model that integrates topic information of source texts into a deep neural network model for generating target texts. We apply the proposed model to a Vietnamese news summarization task. More specifically, we use news categories as topics and feed this external information into the model, so that the model can effectively learn the corresponding patterns and generate well-related target texts. In addition, we construct a news corpus that includes the summary, title, topics, content, date-time and tags of each news article.
2 The General Model for Text Generation

In this work, we use the common seq2seq DNN model [8] combined with an attention mechanism to form a general text generation model. The model uses a bidirectional gated recurrent unit (GRU) [2] encoder to encode a feature vector from the input text and a GRU decoder to generate the output text.
2.1 Gated Recurrent Unit

The GRU is an extension of the recurrent neural network (RNN) [4]. A GRU takes an input sequence x = {x_1, x_2, ..., x_n}, where n is the length of the sequence x, and computes the output sequence h = {h_1, h_2, ..., h_n}. The i-th unit produces the output h_i from the inputs h_{i−1} and x_i, using the update gate z_i (2) and the reset gate r_i (3):

z_i = σ(W_z x_i + V_z h_{i−1})   (2)

r_i = σ(W_r x_i + V_r h_{i−1})   (3)

where σ(·) is the sigmoid function. The unit then computes the candidate state ĥ_i and the final output h_i:

ĥ_i = tanh(W_h x_i + V_h (r_i · h_{i−1}))   (4)

h_i = (1 − z_i) · h_{i−1} + z_i · ĥ_i   (5)
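For concreteness, the following is a minimal NumPy sketch of one GRU step implementing formulas (2)-(5). The weight matrices are assumed to be given with compatible shapes; this is an illustrative sketch, not the implementation used in the experiments.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_i, h_prev, W_z, V_z, W_r, V_r, W_h, V_h):
    """One GRU unit: formulas (2)-(5), with element-wise gating."""
    z_i = sigmoid(W_z @ x_i + V_z @ h_prev)             # update gate, (2)
    r_i = sigmoid(W_r @ x_i + V_r @ h_prev)             # reset gate, (3)
    h_cand = np.tanh(W_h @ x_i + V_h @ (r_i * h_prev))  # candidate state, (4)
    return (1.0 - z_i) * h_prev + z_i * h_cand          # final output, (5)
```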
2.2 Seq2Seq: The Baseline Model

We use the seq2seq model with Bahdanau attention [1] as the baseline. The baseline model includes an encoder that encodes the input text x into a feature vector h, and a decoder that generates the output text y from h. In general, the model estimates the conditional probability p(y|x), where y = {y_1, y_2, ..., y_m} is the output text and x = {x_1, x_2, ..., x_n} is the input text:

p(y|x, θ) = ∏_{i=1}^{m} p(y_i | h, y_1, y_2, ..., y_{i−1}, θ)   (6)

where θ denotes the model parameters and h is the concatenation of the forward and backward GRU hidden states, h_fw and h_bw:

h = h_fw ⊕ h_bw   (7)
3 The Proposed Topic-Guided Text Generation Model

This section presents our proposed model, called the topic-guided text generation model, and shows how to integrate topic information into a DNN model for text generation. Figure 1 illustrates the proposed model, which is built from seq2seq with Bahdanau attention. This model integrates the input text and the topic information into both the encoder and the decoder; therefore, the outputs h and h^d in this model contain not only input text information but also topic information. The encoder encodes the input text and its topic into a fixed-length vector, and the decoder then uses that vector combined with the topic information to generate the output text. Encoder: we use a bidirectional GRU to process the input text x and the topic vector τ forward and backward. The forward pass calculates h_fw and the backward pass calculates h_bw; then we
Fig. 1 Topic-guided text generation model
calculate h using (7) and feed h to the decoder. Specifically, at the i-th unit, the forward and the backward encoder compute z_i, r_i and ĥ_i as follows:

z_i = σ(W_z [x_i, τ] + V_z h_{i−1})   (8)

r_i = σ(W_r [x_i, τ] + V_r h_{i−1})   (9)

where σ(·) is the sigmoid function. The unit then computes ĥ_i and the final output h_i using formula (5):

ĥ_i = tanh(W_h [x_i, τ] + V_h (r_i · h_{i−1}))   (10)

The forward encoder generates h_fw,i and the backward encoder generates h_bw,i, and we combine them to get h_i using formula (7).

Decoder: we use a GRU to process the input h, and each unit's output is fed forward to the next unit as input. Specifically, at the i-th unit, the input is the concatenation of τ and y_{i−1}, where h^d_{i−1} is the previous unit's output and y_{i−1} is selected from the dictionary according to the hidden state h^d_{i−1} using a softmax function.

Attention: to attend over x at decoding time step i, we calculate the context vector c_i, which captures relevant source-side information to help predict the current target word y_i. We then produce the attention vector h̃^d_i of this step from h_i and c_i as:

h̃^d_i = tanh(W_c [c_i, h_i])   (11)

where c_i is calculated as:

c_i = Σ_{j=1}^{n} a_{ij} h_j   (12)

a_{ij} = exp(e_{ij}) / Σ_{k=1}^{n} exp(e_{ik})   (13)

where e_{ik} = f(h^d_{i−1}, h_k) is a feed-forward neural network. Finally, we feed h̃^d_i to a softmax layer to get the decoder output y_i.
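A small NumPy sketch of the attention step, formulas (11)-(13), is given below. The scoring network f is reduced here to a single dense layer with parameters W_a and v_a, which is an illustrative assumption rather than the authors' exact parameterization.

```python
import numpy as np

def attention_step(h_dec_prev, h_cur, H_enc, W_a, v_a, W_c):
    """One attention step following formulas (11)-(13).

    h_dec_prev : previous decoder hidden state h^d_{i-1}, shape (d,)
    h_cur      : current hidden state h_i used in formula (11), shape (d,)
    H_enc      : encoder states h_1..h_n, shape (n, d)
    """
    # e_ik = f(h^d_{i-1}, h_k): one-layer feed-forward scorer (assumed form)
    scores = np.array([v_a @ np.tanh(W_a @ np.concatenate([h_dec_prev, h_k]))
                       for h_k in H_enc])
    a = np.exp(scores - scores.max())
    a /= a.sum()                                   # weights a_ij, formula (13)
    c_i = a @ H_enc                                # context vector, formula (12)
    h_tilde = np.tanh(W_c @ np.concatenate([c_i, h_cur]))  # formula (11)
    return h_tilde, a
```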
4 News Corpus

We build a news corpus containing 140k documents of Vietnamese news; each document includes Id, Title, Summary, Author, Time, Tags, Topics and Content (see an example in Fig. 2). We crawled news articles from the web and then cleaned the text by removing extra spaces, empty lines and special characters, roughly as sketched below.
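A minimal sketch of this cleaning step follows; the exact set of special characters the authors removed is not specified, so the pattern here is an assumption.

```python
import re

def clean_text(raw: str) -> str:
    """Normalize crawled news text: drop control/special characters,
    collapse repeated whitespace, and remove empty lines (assumed rules)."""
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", " ", raw)  # control chars
    lines = [re.sub(r"\s+", " ", ln).strip() for ln in text.splitlines()]
    return "\n".join(ln for ln in lines if ln)
```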
Fig. 2 News corpus samples
This corpus can be used for different NLP tasks such as text summarization, headline generation and essay generation. We use the text in the "summary" field as the input and the text in the "title" field as the output for our text generation task, so the task can be considered a form of summarization.
5 Experiments

We use the summary, title and topic fields of our news corpus; each news summary has between 10 and 90 words, and each title has between 4 and 25 words (Fig. 3). The 140k documents are distributed over 11 topics, so we feed the topic to the model as a one-hot vector with 11 dimensions. We split the corpus into 100k documents for training, 20k for testing and 20k for validation. We use 300-dimensional word embeddings; the vocabulary size is 15k for summaries at the encoder and 9k for titles at the decoder. The encoder and decoder use GRUs with a hidden size of 300. For better training speed, we pad the inputs in a batch to the same length; based on Fig. 3, we select a maximum length of 80 for summaries and 25 for titles. We trained the baseline and the topic-guided model for 50 epochs with the RMSProp optimizer and a learning rate of 10^-3, and we use the dropout method [7] to avoid overfitting.

Fig. 3 Summary and title histogram in news corpus

Table 1 shows our experimental results; the topic-guided model outperforms the baseline model by at least 23% in BLEU score.

Table 1 Experiment results

          Baseline model   Topic-guided model   Increment (%)
BLEU 1    0.119            0.146                23
BLEU 2    0.066            0.089                35
BLEU 3    0.034            0.052                54
BLEU 4    0.017            0.031                76
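The topic encoding and input padding described in this section can be sketched as follows; the vocabulary indices and the padding token id are illustrative assumptions.

```python
import numpy as np

NUM_TOPICS, MAX_SRC_LEN, PAD_ID = 11, 80, 0   # from the experimental setup

def one_hot_topic(topic_index):
    """11-dimensional one-hot topic vector tau fed to the model."""
    v = np.zeros(NUM_TOPICS, dtype=np.float32)
    v[topic_index] = 1.0
    return v

def pad_summary(token_ids):
    """Truncate/pad an encoder input to the fixed length of 80 tokens."""
    ids = list(token_ids)[:MAX_SRC_LEN]
    return ids + [PAD_ID] * (MAX_SRC_LEN - len(ids))
```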
6 Conclusion

In this paper, we have proposed a topic-guided RNN model for text generation and conducted an experiment on a form of text summarization for Vietnamese news. The proposed model shows how to use topic information in an attention-based GRU model. The experimental results show that the topic is an important feature for text generation and yields a much better result. In the future, we will work on more kinds of topic information and consider other extra information of documents, such as tags.
References 1. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: 3rd International conference on learning representations (ICLR 2015)— conference track proceedings, pp 1-15. arXiv: 1409.0473 2. Chung J et al (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. In: arXiv preprint arXiv:1412.3555 3. Goodfellow IJ et al (2014) Generative adversarial nets. In: Advances in neural information processing systems, pp 4089–4099, Dec 2017. ISSN: 10495258 4. Mikolov T et al (2010) Recurrent neural network based language model. In: Eleventh annual conference of the international speech communication association 5. Radford A et al (2019) Language models are unsupervised multitask learners. In: OpenAI Blog 1.8 6. Shi T et al (2018) Neural abstractive text summarization with sequence-to-sequence models. In: arXiv preprint arXiv:1812.02303
7. Srivastava N et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958 8. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Advances in neural information processing systems, pp 3104–3112, 4 Jan 2014. ISSN: 10495258. arXiv: 1409.3215 9. Vaswani A et al (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008 10. Zheng H-T et al (2018) Automatically answering questions with nature languages. In: 2018 5th international conference on systems and informatics (ICSAI). IEEE, pp 350–355
Automated Currency Recognition Using Neural Networks Abhishek Jain, Paras Jain, and Vikas Tripathi
Abstract Paper currency recognition (PCR) is a type of intelligent system that is an important need of present-day automation in the modern world. It has various potential applications, including electronic banking, currency monitoring systems and currency exchange machines. This paper proposes an automatic recognition system for paper currency. It utilizes InceptionV3 for feature extraction and a neural network for classification, taking Indian paper currency as a case study. The strategy is quite reasonable in terms of accuracy. The system works with 3100 images distributed over six categories (10, 20, 50, 100, 200 and 500), which are used for analysis and classification. The proposed algorithm is fully automatic and requires no human intervention. To validate the adequacy of the system and the suitability of a neural network for currency image classification, experiments were also carried out with other classifiers such as K-nearest neighbors (KNN) and support vector machine (SVM). The proposed technique delivers very satisfactory results in terms of recognition and efficiency, achieving an average accuracy of 97.2% over the six categories. Keywords Paper currency · Image processing · Neural network · InceptionV3 · Content-based classification
A. Jain (B) · P. Jain · V. Tripathi, Graphic Era Deemed to be University, Dehradun 248002, India; e-mail: [email protected]; P. Jain e-mail: [email protected]; V. Tripathi e-mail: [email protected]
1 Introduction

Money consists of the paper notes and coins issued by the government for circulation within an economy; it is the medium of exchange for services and goods. For trade, paper currency is an important medium. The characteristics of paper currency are simplicity, durability, full control and efficiency, which is why it became popular: among all alternative forms of money, paper is the most preferred. One drawback of paper currency is that notes can be torn or their colour can fade, but this issue is not very serious. As part of the technological progress introduced into the financial and banking sectors, financial organizations and banks have started financial self-services. With ATM counters and coin machines, an automated banking system is achieved in which machines handle currencies. In such circumstances, the machine uses a currency recognizer for the classification of banknotes. Manual checking of all notes in transactions is a very tedious and messy procedure, and there is also a chance of tearing notes while handling them. Therefore, automatic techniques for banknote recognition are required in many applications, for example automatic vending machines. Every year, the RBI (Reserve Bank of India) faces the problem of counterfeit or damaged currency notes.
2 Literature Survey

The authors of [1] presented a system divided into two parts: currency recognition and currency verification. They extracted features such as the identity mark and the optically variable ink; pixel values are calculated, and a histogram is plotted based on those values. For currency verification, features such as the watermark, security thread, fluorescence and latent image are used. Aruna et al. [2] presented a survey on Indian currency note recognition systems; they evaluated the algorithms proposed by various researchers for currency recognition and also discussed the feature extraction techniques used. The authors of [3] structured a framework that helps identify Indian currency notes and check whether a note is valid or invalid, that is, to distinguish fake notes from authentic ones; the features are segmented using a 3 × 3 grid, and the SIFT technique is used for efficient matching of the features. The authors of [4] presented an implementation for the denominations of 100, 500 and 1000, on which the authentication process is performed; the design of the complete system consists of three interfaces, one per denomination. They preprocess the image using gray-scale conversion
and edge detection. The features are extracted using edge-based segmentation with the Sobel operator, which works well in the whole process with little computation time. Patel et al. [5] proposed a method using a neural network pattern recognition tool, which yields an accuracy of 95.6%; the image segmentation is done using the Canny edge detector. After the demonetization in India, it became important to build a framework for new as well as old currency notes, so we prepared a framework for this purpose. In this paper, we present a structure based on a neural network for effective classification of images of the various denominations, and we perform a comparative analysis of the different currency categories using different classifiers such as KNN, SVM and neural network to validate our system.
3 Methodology

Image categorization involves two significant steps: feature descriptor computation and classification. A visual representation of our whole structure is shown in Fig. 1. We first gathered paper currency of every class, that is 10, 20, 50, 100, 200 and 500, and made our own dataset. This data set is fed into an image embedder for the extraction of feature descriptor values. For feature extraction, the InceptionV3 model is utilized; it is Google's pre-trained model, trained on more than 1000 classes and over 1.4 million images. The InceptionV3 model is an image recognition model that extracts features with the help of a convolutional neural network; further classification is performed with fully connected and softmax layers.
Fig. 1 Framework for Indian currency image classification
Fig. 2 Diagrammatic representation of InceptionV3 where a is the input image, b is the convolution layer, c is the subsampling layer, d and e are fully connected output layers
We then split the dataset into training and testing parts; we kept 70% of the data as training data and the remaining 30% as testing data, and fed the training data to our classifier, i.e., the neural network. After that, the remaining 30% of the data was sent to the prediction stage, and finally a confusion matrix was built to check the fitness of the training. The InceptionV3 model extracts useful features from the given input images in the training part, and the classification is then performed on the extracted features. The diagrammatic representation of the working of InceptionV3 is shown in Fig. 2. The activation function used is the rectified linear unit (ReLU). It is linear for all positive values and zero for all negative values. It is very cheap to compute, so the model takes less time to train. We used this function because it does not suffer from the vanishing gradient problem present in other activation functions such as sigmoid and tanh. Mathematically, it can be expressed as shown in Eq. (1):

X(x) = max(0, x)   (1)

Here, X(x) is the activation function and x is the input; the output is the maximum of 0 and the input value, so all negative inputs map to 0.
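As a rough illustration of the pipeline described in this section (a frozen InceptionV3 image embedder followed by a small fully connected classifier), a Keras sketch is given below; the hidden layer size and training settings are illustrative assumptions, since the paper does not list them.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Pre-trained InceptionV3 used as a frozen feature extractor (image embedder)
base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=(299, 299, 3))
base.trainable = False

# Small fully connected classifier with ReLU and softmax over 6 denominations
model = models.Sequential([
    base,
    layers.Dense(256, activation="relu"),    # assumed hidden size
    layers.Dense(6, activation="softmax"),   # 10/20/50/100/200/500
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy", metrics=["accuracy"])
```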
4 Results and Discussion

The proposed framework is deployed on a system configured with an Intel(R) Core(TM) i7-7700 CPU @ 3.60 GHz × 8 and 8 GB of RAM.
Fig. 3 Sample frames of different currency notes
The images used for the investigation have a size of 320 × 240. The images for each of the six categories are chosen so that the analysis becomes challenging: some images are blurred, some are dark and some are clear; apart from that, images are taken from various sources so that the background and camera angle are not uniform. The inconsistency in background, camera angle and image blurring makes our dataset challenging as well as suitable for analysis. Sample frames for each of the six categories of images are shown in Fig. 3. The dataset was created by taking pictures with a camera and contains 3062 images in total, distributed over the six classes as detailed in Table 1. Following the standard practice for classification, the dataset is divided into testing (30%) and training (70%) parts. Table 1 shows the detailed statistics of the images used for the different paper currencies for training and testing.

Table 1 Dataset divided into training and testing data

Currency category   Total images   Training data   Testing data
Ten                 924            813             111
Twenty              410            329             81
Fifty               377            296             81
One hundred         315            235             80
Two hundred         382            300             82
Five hundred        654            543             111
Total               3062           2516            546
The classification is carried out with the help of the neural network with an accuracy of 97.22%. To check the fitness of the training, the dataset was also trained with other classifiers such as KNN and SVM, and the comparative analysis shows that the result given by the neural network is the highest. The confusion matrix of the neural network is shown in Table 2. The comparative analysis shows that the results given by the neural network are the most efficient, with an average accuracy of 97.22%, as shown in Fig. 4. The reason behind these best results is that the neural network works more efficiently than the other classifiers when more data is given to it: the efficiency of the neural network increases as the amount of data increases.

Table 2 Confusion matrix achieved by neural network over testing dataset

        10    20    50    100   200   500   Total
10      105   1     1     1     1     0     109
20      2     79    0     0     0     0     81
50      0     0     77    2     0     0     79
100     0     0     1     78    0     1     80
200     0     0     0     0     81    0     81
500     5     0     0     0     0     105   110
Total   112   80    79    81    82    106   540
Fig. 4 Graph representing the average accuracy obtained by the different models (KNN, SVM and neural network)
5 Conclusion

In this paper, a robust system for the classification of different categories of paper currency using a neural network and InceptionV3 has been presented. Paper currency images are divided into six classes: 10, 20, 50, 100, 200 and 500. We have shown that the system achieves an average accuracy of 97.22%. To validate the adequacy of the neural network, results of other classifiers have been computed; as shown in the results, the neural network achieves the best accuracy compared with the other classifiers due to its ability to handle large datasets effectively. The future scope of this work is wide open: more classes can be incorporated for investigation, extracted features from currency images can be used for verification, an application can be designed to check whether currency notes are fake or genuine, and other feature descriptors can be used for more effective outcomes.
References 1. Suresh IA, Narwade PP (2016) Indian currency recognition and verification using image processing. Int Res J Eng Technol (IRJET) 03(06) 2. Aruna DH, Bagga M, Dr. Singh B (2015) A survey on Indian currency note denomination recognition system. Int J Adv Res Sci Eng (IJARSE) 4(01) 3. Aggarwal H, Kumar P (2014) Indian currency note denomination recognition in color images. Int J Adv Comput Eng Commun Technol 1(1) 4. Mirza R, Nanda V (2012) Design and implementation of Indian paper currency authentication system based on feature extraction by edge based segmentation using sobel operator. Int J Eng Res Develop 3(2) 5. Patel VN, Dr. Jaliya UK, Brahmbhatt NK (2017) Indian currency recognition using neural network pattern recognition tool. In: ICRISET international conference on research and innovations in science, engineering & technology. Selected papers in computing
An Application of Vision Systems for the Inspection of Two-Dimensional Entities in a Plane Van Thao Le, Quang Huy Hoang, Duc Manh Dinh, and Yann Quinsat
Abstract This study proposes a method for measuring the dimensions and position of two-dimensional (2D) entities in a plane by using a vision system (e.g., a camera). In the proposed method, a camera was installed in a non-orthogonal configuration to capture an image of 2D entities in a work plane. The plane of 2D entities possesses four reference points with their given coordinates. The image of entities captured in a non-orthogonal view was transformed into an orthogonal image by applying a homography transformation estimated from four reference points. Subsequently, the position and dimensions of 2D entities in the orthogonal image were detected automatically using some image processing algorithms that were implemented in MATLAB software. The position and dimensions of 2D entities in the world coordinate system were finally computed by using a ratio of millimeters per pixels. The accuracy and efficiency of the proposed method were validated through the case of measuring a set of milled holes. Keywords Vision system · Inspection · Camera calibration · Homography · Image processing
1 Introduction

Nowadays, vision inspection systems are widely used for automated inspection, robot guidance, quality control and manufacturing applications because of their accuracy, flexibility, repeatability and efficiency [1, 2]. In the manufacturing field, inspection is an important task, and it is increasingly integrated into the manufacturing process to avoid defective products and shorten the manufacturing lead time
V. T. Le (B) · Q. H. Hoang · D. M. Dinh, Le Quy Don Technical University, Hanoi, Vietnam; e-mail: [email protected]. Y. Quinsat, ENS Paris-Saclay, Université Paris-Saclay, Paris, France
of products [3]. In this context, this paper proposes an inspection method for 2D entities in a plane using a vision system. The vision system, composed of a camera and a lighting source, is used to recognize and measure the shape and dimensions of entities. The proposed method can be applied to inspect the quality of parts during the manufacturing process or after the manufacture of parts is completed.
2 Proposal of Inspection Method

The proposed method consists of four steps, as follows:

• Firstly, the camera was calibrated to obtain its intrinsic parameters (i.e., the focal length, principal point, pixel skew coefficient, and radial and tangential distortion coefficients). These parameters were subsequently used for correcting the lens distortion of images and calculating the position of entities in the world coordinate system (WCS).
• Secondly, a homography transformation matrix was estimated based on four point correspondences. Applying this transformation allows an image captured in a non-orthogonal view to be transformed into an orthogonal image; namely, the perspective distortion of the image is corrected.
• Thirdly, the entities were detected in the corrected image using image processing algorithms.
• Finally, the dimensions and position of the entities were computed in the WCS defined in the work plane.

To describe these steps of the proposed method explicitly, a case study was used. The part used in the case study is a machined part composed of 53 milled holes with a diameter of 15 mm (Fig. 1); the material of the part is an aluminum alloy. The center points of the four milled holes in the corners of the part were considered as the four reference points for the homography computation, and the other holes were considered as entities to be measured by the proposed method. The position and diameter of the holes were also measured by a coordinate measuring machine (CMM) to evaluate the accuracy of the proposed measurement method.

Fig. 1 The case study: designed part (a) and machined part (b)
The steps of the proposed method were implemented in MATLAB software (version R2017b) and detailed in the following sections.
3 Camera Calibration

The calibration of a camera aims at estimating the intrinsic parameters and the pose of the camera. For this purpose, the pinhole camera model and the chessboard calibration pattern are widely used [4] (Fig. 2). In the pinhole model, the relation between a 3D point P = (X_w, Y_w, Z_w) in the WCS (R_w) and its corresponding pixel point p = (u, v) in the pixel coordinate system (R_uv) is described by Eq. (1) and Fig. 2a:

λ [u v 1]^T = A_{3×3} · [R_{3×3} T_{3×1}] · [X_w Y_w Z_w 1]^T,  with  A_{3×3} = [f_x s c_x; 0 f_y c_y; 0 0 1]   (1)

where c_x and c_y are the coordinates of the principal point; f_x and f_y are the effective focal lengths along the u-axis and v-axis of the pixel coordinate system (R_uv); s is the skew coefficient; and λ is a non-zero scale factor. The rotation matrix R and the translation vector T describe the pose of the camera in the WCS (R_w). In addition, to describe a real camera completely, the lens distortions (radial and tangential) are also taken into consideration, Eq. (2) [4]:

[x_distorted; y_distorted] = (1 + k_1 r² + k_2 r⁴ + k_3 r⁶) [x; y] + [2p_1 xy + p_2 (r² + 2x²); 2p_2 xy + p_1 (r² + 2y²)]   (2)

where (x, y) are the coordinates of the point p in the image coordinate system (R_im) without distortions: x = u − c_x and y = v − c_y; (x_distorted, y_distorted) denote the coordinates of the distorted point of p; r² = x² + y²; k_1, k_2 and k_3 are the radial distortion coefficients; and p_1 and p_2 are the tangential distortion coefficients.

Fig. 2 Model of a pinhole camera (a) and the chessboard calibration pattern (b)
The intrinsic parameters and the pose of the camera were estimated by minimizing the following cost function (3):

Σ_{i=1}^{n} Σ_{j=1}^{m} || m_ij − M̂(A, k_1, k_2, k_3, p_1, p_2, R, T, M_j) ||²   (3)

where M̂ is the image point of the world point M_j according to Eq. (1); m_ij is the j-th image point of the world point M_j in image i; n is the number of images used for the calibration; and m is the number of world control points. In this study, a Nikon D810 camera with a full-frame 36.3-megapixel CMOS sensor was used to capture the images of entities; the size of an image captured by the camera is 4912 × 7360 pixels. Twenty-five images of the chessboard pattern (Fig. 2b) were captured by the camera in different positions and orientations for the camera calibration. The calibration was performed using the camera calibrator toolbox available in MATLAB R2017b. The metric used to evaluate the calibration quality is the root mean square error (RMSE) of the reprojection; the reprojection errors are the distances in pixels between the control points detected in the images and the points of the calibration pattern reprojected into the images. In this study, the RMSE for all images is below 0.36 pixels, which indicates an acceptable quality of the camera calibration [5]. The estimated intrinsic parameters of the camera were used for the lens distortion correction of the images and the computation of the position of entities in the WCS.
4 Homography Computation

Perspective distortion always exists in an image taken in a non-orthogonal configuration (e.g., Fig. 3a). This distortion can be corrected by estimating a 2D homography matrix H defined as [6]: [u′ v′ 1]^T = H_{3×3} · [u v 1]^T, where (u, v) are the coordinates in the non-orthogonal image and (u′, v′) are the coordinates in the orthogonal image. Herein, the matrix H was computed from the four correspondences {M, N, P, Q} ↔ {M′, N′, P′, Q′}, as shown in Fig. 3.

Fig. 3 Homography transformation: non-orthogonal image (a) and corrected image (b)
The points (M, N, P, Q) are the center points of the holes {1, 2, 3, 4} detected in the non-orthogonal image (Fig. 3a). The points (M′, N′, P′, Q′) are the corresponding points in the orthogonal image (Fig. 3b), determined from the corresponding world points, the focal length and the principal point of the camera. By warping the image with the homography transformation matrix H, the perspective distortion of the non-orthogonal image is corrected effectively (Fig. 3); the corrected image appears as an image captured in the orthogonal configuration of the camera (Fig. 3b).
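A minimal Python/OpenCV sketch of this correction is shown below, assuming the four detected hole centers and their target positions in the orthogonal image are already known; the pixel coordinates here are placeholders.

```python
import cv2
import numpy as np

# Four hole centers in the non-orthogonal image (pixels) and their
# corresponding positions in the desired orthogonal image (placeholders)
src = np.float32([[412, 300], [4480, 285], [4500, 6900], [430, 6950]])
dst = np.float32([[400, 300], [4500, 300], [4500, 6900], [400, 6900]])

H = cv2.getPerspectiveTransform(src, dst)   # exact 4-point homography
img = cv2.imread("part_view.jpg")
corrected = cv2.warpPerspective(img, H, (img.shape[1], img.shape[0]))
```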
5 Identification of Entities in the Image

Once the lens and perspective distortions of the image were corrected (Sects. 3 and 4), the entities in the corrected image were identified using image processing algorithms. For the part used in the case study, all milled holes appear in the corrected image (Fig. 3b) approximately as circles or ellipses. The process for automatically detecting all circles in the corrected image was as follows (a rough sketch of steps (iii)-(v) is given after the list):

(i) Firstly, a theoretical grid of points was created so that each point of the grid is close to the center of a hole (Fig. 4a).
(ii) For each point of the grid, an image (Fig. 4b) that includes only the corresponding hole was cropped from the corrected image (Fig. 4a). This isolates the considered hole from the others and reduces the image processing time.
(iii) The points on the boundary of the hole were detected using the Canny algorithm [6]; this step also detects points in the interior of the hole (Fig. 4c).
(iv) Undesired points (i.e., the points in the interior of the hole) were filtered out and eliminated using a threshold distance (Fig. 4d).
(v) An ellipse was estimated from the boundary points of the hole using the algorithm developed in [7] to determine the center point and diameter of the hole (Fig. 4e).
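The authors implemented these steps in MATLAB; a rough Python/OpenCV equivalent of steps (iii)-(v) for one cropped hole image might look like the sketch below. The Canny thresholds and the distance band are assumptions, and cv2.fitEllipse stands in for the ellipse-fitting algorithm of [7].

```python
import cv2
import numpy as np

def measure_hole(crop, center0, r0, tol=8.0):
    """Steps (iii)-(v): Canny edges, distance filtering, ellipse fit."""
    edges = cv2.Canny(crop, 50, 150)                      # step (iii)
    ys, xs = np.nonzero(edges)
    pts = np.column_stack([xs, ys]).astype(np.float32)
    # step (iv): keep edge points whose distance from the expected center
    # is close to the expected radius (assumed filtering rule)
    d = np.linalg.norm(pts - np.float32(center0), axis=1)
    boundary = pts[np.abs(d - r0) < tol]
    (cx, cy), (d1, d2), _ = cv2.fitEllipse(boundary)      # step (v)
    return (cx, cy), (d1 + d2) / 2.0   # center and mean diameter, in pixels
```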
Fig. 4 Entity identification in the corrected image
6 Calculating Positions and Dimensions of Entities in the WCS

To calculate the coordinates of the centers and the diameters of the holes in the WCS (R_w), the ratio r of a distance in the WCS to the corresponding distance in the orthogonal image was used. This ratio is defined as Eq. (4):

r = ||P_i P_k|| (mm) / ||p_i p_k|| (pixel)   (4)

where p_i and p_k are image points selected from the four reference points in the orthogonal image (Fig. 3b), with p_i ≠ p_k; P_i and P_k are, respectively, the corresponding world points of p_i and p_k. Figure 5a shows the distribution of the absolute position deviations between the centers of all holes measured by the proposed method and the data measured by the CMM. It reveals that the proposed method achieves a good level of accuracy, with a mean absolute position deviation of 0.045 mm and about 80% of the measured holes having a position deviation of less than 0.06 mm (Fig. 5b).
Fig. 5 Distribution of absolute position deviations between the data measured by the proposed method and the CMM-measured data (a), and the cumulative histogram of deviations (b)
Fig. 6 Absolute deviation between the diameters of holes measured by the proposed method and by the CMM
Figure 6 presents the absolute deviation between the diameters of the holes measured by the proposed method and by the CMM. It was found that all holes have a deviation below 0.045 mm. Based on these results, it can be concluded that the proposed method is suitable for inspection applications requiring high accuracy. Moreover, the measuring time is significantly reduced compared with a conventional method such as CMM measurement: the method only needs an image of the entities and four given reference points.
7 Conclusions

In this paper, an inspection method based on a vision system was proposed for 2D entities in a plane. The method uses a camera to take images of the measured objects and computes the position and dimensions of the objects from the images based on four available reference points in the measuring plane. The accuracy of the proposed method was validated via the case study. The proposed method can be applied efficiently to the automated inspection of final products, or during the manufacturing process to avoid defective products. Acknowledgements This research is funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 107.99-2019.18.
References 1. Zhao YJ, Li HN, Song KC, Yan YH (2017) In-situ and in-process monitoring of optical glass grinding process based on image processing technique. Int J Adv Manuf Technol 3017–3031 2. Everton SK, Hirsch M, Stravroulakis P et al (2016) Review of in-situ process monitoring and in-situ metrology for metal additive manufacturing. Mater Des 95:431–45 3. Vacharanukul K, Mekid S (2005) In-process dimensional inspection sensors. Measurement 38:204–18 4. Heikkila J, Silven O (1997) A four-step camera calibration procedure with implicit image correction. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp 1106–1112 5. Usamentiaga R, Garcia DF, Ibarra-Castanedo C, Maldague X (2017) Highly accurate geometric calibration for infrared cameras using inexpensive calibration targets. Measurement 112:105–16 6. Hartley R, Zisserman A (2004) Multiple view geometry in computer vision. Cambridge University Press, Cambridge 7. Mitchell DRG, Van den Berg JA (2016) Development of an ellipse fitting method with which to analyse selected area electron diffraction patterns. Ultramicroscopy 160:140–145
An Efficient Image-Based Skin Cancer Classification Framework Using Neural Network Tejasvi Ghanshala, Vikas Tripathi, and Bhaskar Pant
Abstract A good image analysis model can be very useful for the accurate diagnosis/classification of diseases for which images are available. Thanks to the abundance of open image databases, the training and testing of algorithms on such datasets have driven the development of efficient systems for image classification. Skin cancer is one such disease for which image databases have recently been developed. Among the various strategies for classifying skin cancer based on image analysis, CNNs have been shown to be superior to traditional machine learning techniques. Understanding the importance of developing a novel structure, in this paper we use a pre-trained convolutional neural network to classify images into two categories, namely malignant and benign. We trained the model using skin cancer images available in the ISIC archive. The dimensions of the images used for training the model are 32 × 32, 64 × 64 and 128 × 128 pixels. It was found that 128 × 128 colored images yielded the best accuracy of 83.78%. Keywords Convolutional neural network · Skin cancer · Pooling · Medical image processing
1 Introduction

Skin cancer is the most common cancer worldwide, with nearly 40% of reported cancer cases being forms of skin cancer. There are two major types of skin lesions: malignant and benign. In malignant skin cancer, the cancer cells divide quickly

T. Ghanshala (B), University of British Columbia, Vancouver, Canada; e-mail: [email protected]. V. Tripathi · B. Pant, Graphic Era Deemed to Be University, Dehradun, India; e-mail: [email protected]; B. Pant e-mail: [email protected]
and interfere with the normal functions of the body's cells, whereas in benign lesions the normal functioning of the body's cells is not interfered with [1]. Although malignant skin cancers are the most dangerous type, most skin cancers can be cured if detected early enough for possible treatment. This makes it even more important to detect and classify them, as malignant cancers are very hazardous and require immediate attention. Deep learning is the state-of-the-art machine learning methodology for solving demanding problems that arise in machine vision, image processing and image classification. Deep learning techniques such as artificial neural networks (ANN) and convolutional neural networks (CNN) provide much more sophisticated outcomes than standard machine learning practices. Today, with the deployment of new-generation GPUs, which are much faster in speed and computation, training on comprehensive datasets becomes less cumbersome, which in turn gives results with higher accuracy [2]. In this paper, a framework that uses a CNN for feature extraction is employed to train on the image dataset and predict the labels of the remaining images. We used a pre-trained CNN on colored images of different dimensions (32 × 32, 64 × 64, 128 × 128), on gray-scale images of the same dimensions, and on edge images of the same dimensions to find the best possible accuracy. The remaining paper is structured as follows: Sect. 2 reviews previous work in machine learning and cancer image classification; Sect. 3 presents our framework for classifying cancer images; Sect. 4 provides the results generated by the proposed framework; and Sect. 5 concludes the work and outlines its future aspects.
2 Literature Review

Previously, researchers have proposed numerous methods for classifying specific cancer classes [3]. Classification approaches such as support vector machines, instance-based learning, decision-tree-based learning, Bayesian methods and neural networks have frequently been utilized to categorize skin lesions in given images. In instance-based classifiers, a distance function such as the Manhattan or Euclidean distance is typically used to evaluate which image of the training dataset is nearest to an unknown or unlabeled image. In decision-tree-based algorithms, a drawback arises in terms of over-fitting. Bayesian learning approaches calculate the probability of each category given the features; because of their fast training, they have been used for the classification of skin images, but since these methods assume that features are independent, this sometimes has an adverse impact. In [4], the authors used a CNN to train on and classify a similar dataset into three groups, namely melanoma, keratosis and benign, and achieved outstanding results. In [5], the authors showcased the usage of a deep convolutional network (DCNN) to accomplish lesion extraction on skin cancer in the ISIC 2017 challenge.
A ResNet model on a similar dataset was used in [6] to classify lesion segmentation into three classes, achieving good results. In [7], the authors conducted lesion segmentation in which the original images are combined with binary masks obtained by manual tracing of the lesion boundaries. Esteva et al. [8] reported results of CNNs trained on 757 disease classes. Ge et al. [9] used clinical imagery and dermoscopy to extract features and then classify the classes. In [10], the authors showcased the usage of a genetic algorithm (GA) to optimize the pipeline and to select the most distinctive features for classification. From the above survey, it is clearly visible that there is a need for a more efficient algorithm for skin cancer identification. Since neural networks provide effective results in widely used applications, we propose a framework using a neural network for skin cancer detection.
3 Methodology

Machine learning algorithms are widely exploited for the classification of images in various applications [11]. In this paper, we introduce a neural-network-based mechanism for the classification of skin cancer with an image as input. Any effective image classification depends on image characteristics such as brightness and smoothness, which need to be processed before deploying any machine learning algorithm. As shown in Fig. 1, images are fed into the system for preprocessing. In the preprocessing step, the brightness of the images is fine-tuned to extract better features from the input images. Apart from brightness fine-tuning, images are resized and converted into gray scale. Further, the Canny edge detector is applied to extract information from the images in the form of edges. After preprocessing, the fine-tuned images are fed into the CNN model, which mines the principal features from the input images. In our proposed framework, we utilize multiple layers of neural network; the sigmoid function is used as the activation in the last layer, which is very useful for extracting the likelihood that an image falls into a category, such as malignant or benign, for skin cancer. The most integral part of the framework is the functioning of the CNN; a CNN is a class of neural networks containing convolution layers, which perform mathematical convolution operations on the given data and pass the resulting values to the next layer. The images used in the dataset are of various dimensions and are brought to the same dimensions, whether 128 × 128, 64 × 64 or 32 × 32, before being provided to our model as input. The framework uses a deep learning CNN to classify skin cancers. The proposed CNN consists of four basic processing steps: convolution, max pooling, flattening and full connection. Figure 2 represents the arrangement of all four steps. Firstly, convolution is performed, in which features are extracted from the image; two or more convolution layers are added to perform this extraction. The convolutional layer is the most significant part of building a CNN (Fig. 2). Many adaptable filters work cohesively to form this layer's parameters.
Fig. 1 Proposed architecture for identification of cancer
Fig. 2 Architecture of neural network
In the forward pass, each filter slides and convolves over the dimensions of the input data, computing the dot product between the input and the filter (kernel) and producing a feature map [12]. Combinations of such feature maps form a convolution layer. The convolution function used is shown in Eq. (1):

(f ∗ g)(t) ≜ ∫_{−∞}^{+∞} f(τ) g(t − τ) dτ   (1)
The rectified linear unit (ReLU) has been utilized here. The rectifier function removes all negative elements from the feature map, keeping only the positive values and thus increasing the non-linearity in the image representation. The second step is max pooling: a 2 × 2 window is slid over the dimensions of the feature map generated by the layer, and the maximum numerical value found inside the window is fed into the pooled feature map. In the next step, flattening, the pooled feature map is flattened into a column vector, which works as the input of the next layer. The last step is the fully connected layer, which works as a classifier over the features extracted in the previous steps; the probability of a test image belonging to each class is then computed [13]. A loss function is defined for the probability of a prediction being incorrect; it indicates the quality of the system, and backpropagation is used iteratively to minimize the loss. Thus, a more robust model can be trained.
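A minimal Keras sketch of the four-convolution-layer variant that performed best (Sect. 4) is given below; the filter counts and the dense layer width are illustrative assumptions, since the paper does not specify them.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),           # brightened color images
    layers.Conv2D(32, 3, activation="relu"),     # convolution + ReLU
    layers.MaxPooling2D(2),                      # 2x2 max pooling
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),                            # flattening step
    layers.Dense(128, activation="relu"),        # fully connected layer
    layers.Dense(1, activation="sigmoid"),       # malignant vs. benign
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```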
4 Results and Discussion

The approach presented in the methodology section was deployed on a computer having 16 GB of RAM, a 4 GB NVIDIA graphics card, and an Intel i7 seventh-generation processor with eight threads. To achieve good results in any classification approach, the dataset plays a vital role. The images accessible from the ISIC archive were utilized for this purpose [14]. It consists of 3801 colored dermoscopic skin images divided into two categories, malignant and benign. There are 1896 images of malignant skin cancer and 1905 images of benign skin lesions in the dataset, and the images come in various pixel sizes (from 1022 × 767 to 6748 × 4499). Sample pictures are displayed in Fig. 3.
Fig. 3 Sample images from ISIC dataset [14]
Table 1 Details of cancer image dataset

Class              Number of images
Malignant images   1896
Benign images      1905
Total images       3801
As presented in Table 1, the images are divided into training and testing sets: 70% of the images are used for training and the remaining images for testing. All the pictures and their preprocessed forms were initially converted to a size of 64 × 64 and run on a two-layered architecture. In the preprocessing stage, the pictures were converted into gray scale, which has proved to be the most widely used image type as it decreases computational time while providing fair results. However, this framework gave an accuracy of only 66.18% when fed with gray-scale pictures, which is not at all a satisfactory result, so we tried another very popular preprocessing technique, i.e., edge detection. Edge detection was performed on the dataset using the Canny method, but this training also yielded an accuracy of only 68.22%. After obtaining these underwhelming outcomes, the framework was trained on the brightness-adjusted color pictures produced during the preprocessing. These pictures increased the time taken for training the framework; however, they gave the best accuracy of the lot, 80.17%, as shown in Table 2.

Table 2 Accuracy assessment for various types of images

Image type             Accuracy (%)
Gray scale images      66.18
Edge-detected images   68.22
Colored images         80.17

Having achieved better results with colored images [15], we explored this analysis further by adjusting the size of the input images fed to the CNN. Firstly, the images were brought down to a dimension of 32 × 32, which lowered the computational time but also lowered the accuracy marginally, to 77.01%. Lastly, as the dimensions of the images were brought up to 128 × 128, the computational time increased along with the accuracy, which reached 81.31%, as depicted in Table 3.

Table 3 Accuracy assessment of different sizes of image

Image type       32 × 32   64 × 64   128 × 128
Colored images   77.01%    80.17%    81.31%

Since we achieved the most effective results with the 128 × 128 image size, we carried our analysis forward on this size. For an in-depth analysis, we started to increase the number of convolution layers. Firstly, we added another layer, bringing the network to a total of three convolution layers; with a marginal increase in computational time, we achieved a higher accuracy of 82.64%. We then added another convolution layer to make a four-layer architecture, which gave an even better accuracy of 83.78%. Lastly, we added one more convolution layer to bring the framework to five layers, but contrary to our anticipation, the accuracy dipped to 67.27% (Table 4). Thus, we obtained our best accuracy on brightened color images with dimensions of 128 × 128, and a four-convolution-layer architecture is most suited for this dataset.

Table 4 Accuracy assessment on different convolution layers

Number of convolution layers   Colored image (128 × 128) (%)
Two-layer framework            81.31
Three-layer framework          82.64
Four-layer framework           83.78
Five-layer framework           67.27
5 Conclusion

In this paper, we have proposed a framework in which a CNN model is applied to the classification of skin lesions into malignant and benign, achieving an accuracy of 83.78%. To achieve better accuracy, various parameters were varied, such as the image type, the size of the input image, and the number of convolution layers in the model. The results demonstrate that the proposed model works best when the CNN consists of four convolution layers with brightened pictures of size 128 × 128 as input. In the future, there is scope to improve the proposed framework by further preprocessing the images with cutting-edge techniques as well as by growing the present dataset. Furthermore, more skin diseases can be classified using the current analysis.
References 1. Gulati S, Bhogal R (2020) Classification of melanoma from dermoscopic images using machine learning. In: Smart intelligent computing and applications. Springer, Singapore, pp 345-354 2. Krizhevsky AI (2012) Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, pp 1097–1105 3. Oliveira RB, Papa JP, Pereira AS, Tavares JMRS (2018) Computational methods for pigmented skin lesion classification in images: review and future trends. Neural Comput Appl 29(3):613– 636 4. Amirreza M, Rupert E (2017) Skin lesion classification using hybrid deep neural networks. Published at Cornell University. https://arxiv.org/abs/1702.08434
5. Yang X, Zeng Z, Yeo S, Tan C, Tey H, Su Y (2017) A novel multi-task deep learning model for skin lesion segmentation and classification. Published at Cornell University: https://arxiv. org/abs/1703.01025 6. Lei B, Jinman K, Euijoon A, Dagan F (2017) Automatic skin lesion analysis using large-scale dermoscopy images and deep residual networks. Published at Cornell University. https://arxiv. org/abs/1703.04197 7. Codella N, Gutman D, Celebi M (2018) Skin lesion analysis toward melanoma detection: a challenge. In: 2017 International symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC). Published on IEEE, Washington DC. https:// ieeexplore.ieee.org/abstract/document/8363547 8. Esteva A, Kuprel B, Novoa R, Ko J (2017) Dermatologist-level classification of skin cancer with deep neural networks. Int J Sci 9. Ge Z, Demyanov S, Bozorgtabar B, Abedini M, Chakravorty R, Bowling A, Garnavi R (2017) Exploiting local and generic features for accurate skin lesions classification using clinical and dermoscopy imaging. In: 2017 IEEE 14th international symposium on biomedical imaging (ISBI 2017), pp 986–990 10. Tan TY, Zhang L, Jiang M (2016) An intelligent decision support system for skin cancer detection from dermoscopic images. In: 2016 12th International conference on natural computation, fuzzy systems and knowledge discovery (ICNC-FSKD), pp 2194–2199 11. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, Sánchez CI (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88 12. Aliakbarisani R, Ghasemi A, Wu SF (2019) A data-driven metric learning-based scheme for unsupervised network anomaly detection. Comput Electr Eng 73:71–83 13. Cornelisse D (2018) An intuitive guide to convolutional neural networks. Retrieved from www.freecodecamp.com: https://medium.freecodecamp.org/an-intuitive-guide-to-con volutional-neural-networks-260c2de0a050 14. International Skin Imaging Collaboration Website. Available: https://www.isic-archive.com/#!/ topWithHeader/onlyHeaderTop/gallery 15. Zghal NS, Derbel N (2020) Melanoma skin cancer detection based on image processing. Current Medical Imaging 16(1):50–58
Inverse Kinematics Analysis of Welding Robot IRB 1520ID Using Algorithm for Adjusting the Increments of Generalized Vector Chu Anh My, Duong Xuan Bien, and Le Chi Hieu
Abstract This paper presents an algorithm for adjusting the increments of the generalized coordinate vector to solve the inverse kinematics problem of the welding robot IRB 1520ID, which has six degrees of freedom; with respect to the position-tracking task considered here, the system is redundant. The method allows the end-effector of the robot to follow a desired trajectory while ensuring a given tolerance. Values of the joint variables are calculated consecutively for each position of the end-effector point; these values stay within the permitted limits, remain close to one another, and avoid singular points. The velocity, acceleration and jerk of the joints are also calculated from the results of the algorithm mentioned above, combined with an accurate determination of the first- and second-order time derivatives of the Jacobian matrix; these values are guaranteed to be within the permitted motion limits of the robot. In addition, the acceleration and the jerk of the end-effector point are determined through the forward kinematics problem. Keywords Welding robots · Adjusting algorithm · Redundant system · Inverse kinematics
C. A. My (B) · D. X. Bien, Le Quy Don Technical University, Hanoi, Vietnam
L. C. Hieu, Faculty of Science and Engineering, University of Greenwich, London, UK
© Springer Nature Singapore Pte Ltd. 2021. R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_82

1 Introduction
The inverse kinematics problem always plays an important role in designing robot control systems. Many methods have been developed to solve inverse kinematics problems [1–28], such as the Jacobian transpose [7, 8, 20], pseudo-inverse [1, 7], damped least squares [2], quasi-Newton and conjugate gradient [4, 5], and closed-loop inverse kinematics (CLIK) [3, 6, 13, 23]. The inverse kinematics problem was solved using the CLIK method with velocity and acceleration constraints in [3, 13]. My et al. [23] used the CLIK method for a six-DOF welding robot combined with
a positioner to track a complex 3D curve. A parallel genetic algorithm is used to solve the IK problem of the Puma 500 robot in [14]; this method is also presented in [16], and in [19] for a 3-DOF robot. Pan et al. [15] and Fu et al. [18] solved the IK problem for the welding robot TA 1400 and for a painting robot using the offset modification (OM) method. A neural network algorithm is used in [9, 17]. Aydun and Kucuk [10] treated industrial robot manipulators with an Euler wrist using a quaternion-based method. Husty et al. [11] presented an elimination technique, based on an analytical method, to reduce the complexity of the inverse kinematics formulation. A quick IK algorithm is proposed in [20]. A new solution method to avoid joint limits, singularities, and obstacles is introduced in [21]. This paper presents an algorithm for adjusting the increments of the generalized vector to solve the inverse kinematics of the welding robot IRB 1520ID with six degrees of freedom (DOFs). Values of the joint variables are calculated consecutively along a given path of the end-effector point. The velocity, acceleration, and jerk of the joints are also calculated and are ensured to be within the permitted motion limits of the robot.
2 Kinematics Modeling of Welding Robot IRB 1520ID

2.1 Kinematics Modeling
Consider the kinematics model of the industrial six-DOF welding robot IRB 1520ID as shown in Fig. 1. The fixed coordinate system $(OXYZ)_0$ is located at point $O_0$, and $(OXYZ)_i$, $(i = 1 \div 6)$, are the local coordinate systems attached to link $i$. Table 1 describes the kinematic parameters according to the D-H rule [7]. Accordingly, the homogeneous transformation matrices $H_i$, $(i = 1 \div 6)$, are determined. The position and orientation of the end-effector point (point E) with respect to the fixed coordinate system are obtained from the matrix $D_6$, determined as follows [7]:

$$D_6 = H_1 H_2 H_3 H_4 H_5 H_6$$
(1)
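As a minimal numerical sketch of this composition, assuming the standard D-H convention of Table 1 (the function names and the NumPy implementation are illustrative, not from the paper):

```python
import numpy as np

def dh_matrix(theta, d, a, alpha):
    """Homogeneous transformation H_i of one link, standard D-H convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

def forward_kinematics(q, dh_table):
    """D6 = H1 H2 ... H6 for a joint vector q and rows (d, a, alpha) of Table 1."""
    D = np.eye(4)
    for theta, (d, a, alpha) in zip(q, dh_table):
        D = D @ dh_matrix(theta, d, a, alpha)
    return D  # the last column holds the end-effector position (x_E, y_E, z_E, 1)
```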
Define the generalized vector of the robot as $q(t) = [q_1\ q_2\ q_3\ q_4\ q_5\ q_6]^T$, and let $x(t) = [x_E(t)\ y_E(t)\ z_E(t)]^T$ be the coordinate vector of the end-effector point in the fixed coordinate system. The forward kinematics equations are $x = f(q)$, where $f$ is a vector function representing the robot forward kinematics. Differentiating these equations with respect to time, the relation between the generalized velocities is obtained as

$$\dot{x} = J(q)\dot{q}$$
(2)
where $J(q)$ is the Jacobian matrix of size $3 \times 6$. The acceleration of the end-effector point can be obtained by differentiating (2):

$$\ddot{x} = J\ddot{q} + \dot{J}\dot{q}$$
(3)
Fig. 1 Kinematics model of welding robot IRB 1520ID
Table 1 Kinematics parameters (D-H)

| Link | θi = qi | di | ai | αi |
|------|---------|-----|-----|------|
| 1 | q1 | d1 | a1 | π/2 |
| 2 | q2 | 0 | a2 | 0 |
| 3 | q3 | 0 | a3 | π/2 |
| 4 | q4 | d4 | 0 | π/2 |
| 5 | q5 | 0 | 0 | −π/2 |
| 6 | q6 | d6 | 0 | 0 |
Differentiating (3) once more, the jerk of the end-effector point is determined as

$$\dddot{x} = J\dddot{q} + 2\dot{J}\ddot{q} + \ddot{J}\dot{q}$$
(4)
The inverse kinematics equations of the robot are $q(t) = f^{-1}(x(t))$. Because the system is redundant, solving these equations yields a great number of possible solutions. Once the values of $q$ have been determined, the joint velocity is computed as $\dot{q} = J^+(q)\dot{x}$, where $J^+(q)$ is the pseudo-inverse of the matrix $J(q)$ and is defined as [7]:
$$J^+(q) = J^T(q)\left[J(q)J^T(q)\right]^{-1}$$
(5)
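A short sketch of Eq. (5) in NumPy; for a 3 × 6 Jacobian of full row rank this right pseudo-inverse is well defined, and the SVD-based `np.linalg.pinv` is a more robust alternative near singular configurations:

```python
import numpy as np

def pseudo_inverse(J):
    """Right pseudo-inverse J+ = J^T (J J^T)^{-1} of a full-row-rank Jacobian, Eq. (5)."""
    return J.T @ np.linalg.inv(J @ J.T)

# Joint velocities for a desired end-effector velocity x_dot:
# q_dot = pseudo_inverse(J) @ x_dot
```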
The joint acceleration is calculated from (3):

$$\ddot{q} = J^+(q)\left(\ddot{x} - \dot{J}\dot{q}\right)$$
(6)
Similarly, the joint jerk is determined from (4):

$$\dddot{q} = J^+(q)\left(\dddot{x} - 2\dot{J}\ddot{q} - \ddot{J}\dot{q}\right)$$
(7)
Thus, the position, velocity, acceleration, and jerk of the joints $(q, \dot{q}, \ddot{q}, \dddot{q})$ are completely determined from the given vectors $x, \dot{x}, \ddot{x}, \dddot{x}$. The calculation of the first- and second-order time derivatives of the Jacobian matrix $(\dot{J}, \ddot{J})$ is also addressed.
2.2 The Algorithm for Adjusting the Increments of the Generalized Vector
Assume that the robot works over the period from $t = 0$ to $t = T_f$. Divide the time interval $0 \div T_f$ into $N$ equal segments; the length of each segment is $\Delta t = T_f/N$. The next time step is calculated as $t_{k+1} = t_k + \Delta t$, $k = 0 \div (N - 1)$. Applying a Taylor expansion of $q(t_{k+1})$ about the value $q(t_k)$, the values of the generalized joint position vector are given as

$$q(t_{k+1}) = q(t_k + \Delta t) = q(t_k) + \dot{q}(t_k)\Delta t + \frac{1}{2}\ddot{q}(t_k)(\Delta t)^2 + \ldots$$
(8)
Ignoring the extremely small second-order terms, we have

$$q(t_{k+1}) = q(t_k) + J^+(q(t_k))\dot{x}(t_k)\Delta t$$
(9)
Thus, $q(t_{k+1})$ can be determined from a given $q_0$ at $t = 0$ with $k = 0 \div (N - 1)$. However, the result given by (9) is quite rough at $t = t_{k+1}$. Likewise, at $t = 0$ only an approximate value $\tilde{q}_0$ of the vector $q_0$ is available, and the corrected value is obtained as

$$q_0 = \tilde{q}_0 + \Delta q_0$$
(10)
From the forward kinematics equations,

$$x_0 = f(q_0) = f(\tilde{q}_0 + \Delta q_0) \approx f(\tilde{q}_0) + J(\tilde{q}_0)\Delta q_0$$
(11)
So,

$$J(\tilde{q}_0)\Delta q_0 \cong x_0 - f(\tilde{q}_0)$$
(12)
The increment of the generalized vector is determined as

$$\Delta q_0 = J^+(\tilde{q}_0)\left[x_0 - f(\tilde{q}_0)\right]$$
(13)
If $\|\Delta q_0\| \ge \varepsilon$ (where $\varepsilon$ is the allowable error), then the new value $q_0^* = \tilde{q}_0 + \Delta q_0$ of $q_0$ is calculated and substituted into (9), and this equation is solved again. The process is repeated until the condition $\|\Delta q_0\| < \varepsilon$ is satisfied; at this point, $q_0 = q_0^*$. At $t = t_{k+1}$, the approximate value $\tilde{q}(t_{k+1})$ is defined as follows:

$$\tilde{q}(t_{k+1}) = q(t_{k+1}) + J^+(q(t_{k+1}))\dot{x}(t_{k+1})\Delta t$$
(14)
A better approximation is determined according to Formula (10):

$$q(t_{k+1}) = \tilde{q}(t_{k+1}) + \Delta q(t_{k+1})$$
(15)
where the value of $\Delta q(t_{k+1})$ is determined similarly to (13):

$$\Delta q(t_{k+1}) = J^+(\tilde{q}(t_{k+1}))\left[x(t_{k+1}) - f(\tilde{q}(t_{k+1}))\right]$$
(16)
and

$$\tilde{q}(t_{k+1}) = \tilde{q}(t_{k+1}) + \Delta q(t_{k+1})$$
(17)
The loop is operated until $\|\Delta q(t_{k+1})\| < \varepsilon$, and then $q(t_{k+1}) = \tilde{q}(t_{k+1})$ is taken. Thus, the generalized joint position vector $q$ has been found. The algorithm for adjusting the increments of the generalized vector is described in Fig. 2. The values of the vectors $\dot{q}$, $\ddot{q}$, and $\dddot{q}$ can then be calculated from the values of $q$. Note that the kinematic conditions are always checked and ensured during the calculation process following the abovementioned algorithm. These kinematic constraints are defined as

$$q_{min} < q < q_{max};\quad |\dot{q}| \le \dot{q}_{max};\quad |\ddot{q}| \le \ddot{q}_{max};\quad |\dddot{q}| \le \dddot{q}_{max}$$
(18)
where $\dot{q}_{max}$, $\ddot{q}_{max}$, and $\dddot{q}_{max}$ are the maximum velocity, acceleration, and jerk of the joints.
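The algorithm of Fig. 2 can be sketched as the following predictor-corrector loop; this is a hedged illustration, assuming helper callables f(q) (forward-kinematics position) and jacobian(q), with the tolerance and iteration cap chosen for the example:

```python
import numpy as np

def adjusted_ik(x_traj, xdot_traj, q0, f, jacobian, dt, eps=1e-5, max_iter=50):
    """Inverse kinematics by adjusting the increments of the generalized vector.

    x_traj[k] and xdot_traj[k] are the desired position and velocity at t_k;
    f(q) is the forward kinematics, jacobian(q) the 3x6 Jacobian.
    """
    def pinv(J):
        return J.T @ np.linalg.inv(J @ J.T)   # Eq. (5)

    q = np.asarray(q0, dtype=float)
    history = [q.copy()]
    for k in range(len(x_traj) - 1):
        # predictor: Euler step of Eq. (9)
        q_tilde = q + pinv(jacobian(q)) @ xdot_traj[k] * dt
        # corrector: increment adjustment of Eqs. (13), (16), (17)
        for _ in range(max_iter):
            dq = pinv(jacobian(q_tilde)) @ (x_traj[k + 1] - f(q_tilde))
            q_tilde = q_tilde + dq
            if np.linalg.norm(dq) < eps:
                break
        q = q_tilde
        history.append(q.copy())
    return np.array(history)
```

The joint velocities, accelerations, and jerks would then follow from Eqs. (6)-(7), with the constraints (18) checked at every step.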
3 Numerical Simulation Results
Consider the parameters of the welding robot IRB 1520ID from [22]: $d_1 = 0.453$ (m), $a_1 = 0.16$ (m), $a_2 = 0.59$ (m), $a_3 = 0.2$ (m), $d_4 = 0.723$ (m), $d_6 = 0.2$ (m). The limit values of the joint velocities, accelerations, and jerks are given as
Fig. 2 Diagram of the algorithm [12]
$\dot{q}_{max} = [2.26\ 2.44\ 2.44\ 5.58\ 6.63\ 8.02]^T$ (rad/s), $\ddot{q}_{max} = [5.65\ 6.1\ 6.1\ 13.95\ 16.58\ 20]^T$ (rad/s²), $\dddot{q}_{max} = [28.25\ 30.5\ 30.5\ 70\ 82.88\ 100]^T$ (rad/s³). The desired end-effector trajectory, i.e., the required welding seam, is given as $x_E = 0.45$; $y_E = -0.45\cos(t)$; $z_E = 0.97 + 0.45\sin(t)$. The allowable error in the algorithm is $\varepsilon = 10^{-5}$ (rad). The diagram in Fig. 3 describes the performing steps.

Fig. 3 Calculation diagram
Fig. 4 Values of joint positions
The simulation results are shown in Figs. 4-13. The values of the position, velocity, acceleration, and jerk of the joints are shown in Figs. 4-7; these values are within the allowable limits of the robot, and the results demonstrate the high efficiency of the algorithm. Figures 8-11 show the values of the position, velocity, acceleration, and jerk at the end-effector point along the given trajectory; these quantities are recalculated from the results of the inverse kinematics problem using the algorithm mentioned above. Figure 12 shows that the position error of the end-effector point is extremely small; the maximum error value is $2.8 \times 10^{-5}$ (m). Figure 13 presents the numerical simulation model of the robot IRB 1520ID in MATLAB, using the kinematic parameters and the values of the joint variables calculated from the inverse kinematics solution.
Fig. 5 Values of joint velocities
Fig. 6 Values of joint accelerations
Fig. 7 Values of joint jerks
Fig. 8 Values of the end-effector point position
Fig. 9 Values of the end-effector point velocity
Fig. 10 Values of the end-effector point acceleration
Fig. 11 Values of the jerk at the end-effector point
Fig. 12 Values of the end-effector position deviation
Fig. 13 Robot simulation

4 Conclusions
This study successfully applied the algorithm for adjusting the increments of the generalized vector to solve the inverse kinematics problem of the six-DOF robot IRB 1520ID. The kinematic characteristics of the robot, such as the position, velocity, acceleration, and jerk of the joints, are calculated and kept within the kinematic limits, ensuring minimum position errors of the end-effector point in the workspace. The results of the inverse kinematics problem are used to determine the acceleration and jerk values of the end-effector point through solving the forward kinematics problem. Furthermore, this paper shows the effectiveness of adjusting the increments of the generalized vector for redundant systems with many degrees of freedom, and it serves as a basis for developing optimal algorithms for tracking and following given paths of welding, cutting, and 3D printing robots in both the time domain and the parametric domain.
Funding Statement This work was supported by a Research Environment Links, ID 528085858, under the Newton Fund partnership. The grant is funded by the UK Department for Business, Energy and Industrial Strategy and delivered by the British Council.
References 1. Yoshikawa T (1985) Dynamic manipulability of robot manipulators. J Robot Syst 2:113–124 2. Wampler CW (1986) Manipulator inverse kinematic solutions based on vector formulations and damped least squares methods. Trans Syst Man Cybern 16:93–101 3. Sciavicco L, Siciliano B (1988) A solution algorithm to the inverse kinematic problem for redundant manipulators. J Robot Autom 4:403–410 4. Wang LCT, Chen CC (1991) A combined optimization method for solving the inverse kinematics problem of mechanical manipulator. Trans Robot Autom 7:489–499 5. Zhao J, Badler NI (1994) Inverse kinematics positioning using nonlinear programming for highly articulated figures. Trans Graph 13:313–336 6. Antonelli G, Chiaverini S, Fusco G (2000) Kinematic control of redundant manipulators with online end-effector path tracking capability under velocity and acceleration constraints. IFAC Robot Control, Austria, pp 183–188 7. Spong MW, Hutchinson S, Vidyasagar M (2001) Robot modeling and control, 1st edn. New York, USA 8. Lewis FL, Dawnson DM, Abdallah C (2004) Robot manipulator control theory and practice, 2nd edn. Marcel Dekker INC, New York, USA 9. Bingul Z, Ertunc HM, Oysu C (2005) Comparison of inverse kinematics solutions using neural network for 6R robot manipulator with offset. In: Computational intelligence methods and applications, pp 1–5 10. Aydun Y, Kucuk S (2006) Quaternion based inverse kinematics for industrial roboyt manipulators with Euler wrist. In: ICM 2006 IEEE 3rd International conference on mechatronics, pp 581–586 11. Husty ML, Pfurner M, Schrocker HP (2007) A new and efficient algorithm for the inverse kinematics of a general serial 6R manipulator. Mech Mach Theory 42:66–81 12. Khang NV, Dien NP, Vinh NV, Nam TH (2010) Inverse kinematic and dynamic analysis of redundant measuring manipulator BKHN-MCX-04. Vietnam J Mech VAST 32:15–26 13. Wang J, Li Y, Zhao X (2010) Inverse kinematics and control of a 7 dof redundant manipulator based on the closed loop algorithm. Int J Adv Robot Syst 7:1–10 14. Aguilar OA, Huegel JC (2011) Inverse kinematics solution for robotic manipulators using a CUDA-based parallel genetic algorithms. In: Mexican international conference on artificial intelligence 2011, Part 1, pp 490–503 15. Pan H, Fu B, Chen L, Feng J (2011) The inverse kinematics solutions of robot manipulators with offset wrist using the offset modification method. Adv Autom Robot 1:655–663 16. Ramirez J, Rubiano A (2011) Optimization of inverse kinematics of a 3R robotic manipulator using genetic algorithms. Int J Mecha Mech Eng 5:2236–2241 17. Feng Y, Yaonan W, Yimin Y (2012) Inverse kinematics solution for robot manipulator based on Neural Network under joint subspace. Int J Comput Commun 7:459–472 18. Fu Z, Yang W, Yang Z (2013) Solution of inverse kinematics for 6R robot manipulators with offset wrist based on geometric algebra. J Mech Robot (ASME) 5 19. Momani S, Zaer S, Hammour A, Alsmadi MK (2016) O: Solution of inverse kinematic problem using genetic algorithms. Appl Math Inf Sci 10:225–233 20. Lian S, Han Y, Wang Y, Bao Y, Xiao H, Li X, Sun N (2017) Accelerating inverse kinematics for high-DOF robots. In: Proceedings of the 54th annual design automation conference 2017, Austin, USA
21. Kelemen M, Virgala I, Liptak T, Mikova L, Filakovsky F, Bulej V (2018) A novel approach for an inverse kinematics solution of a redundant manipulator. Appl Sci 8:2–20 22. ABB Robotics (2018) Product manual IRB 1520. ABB AB, Robotics and Motion Se-721 68 Vasteras, Sweden 23. My CA, Bien DX, Tung HB, Hieu LC, Cong NV, Hieu TV (2019) Inverse kinematic control algorithm for a welding robot-positioner system to trace a 3D complex curve. In: International Conference on advanced technologies for communications (ATC), pp 319–323 24. My CA, Le CH, Packianather M, Bohez EL (2019) Novel robot arm design and implementation for hot forging press automation. Int J Prod Res 57(14):4579–4593 25. My CA (2016) Inverse kinematics of a serial-parallel robot used in hot forging process. Vietnam J Mech 38(2):81–88 26. My CA (2013) Inverse dynamic of a N-links manipulator mounted on a wheeled mobile robot. In: 2013 IEEE International conference on control, automation and information sciences (ICCAIS), pp 164–170 27. My CA, Trung VT (2016) Design analysis for a special serial-parallel manipulator transferring billet for hot extrusion forging process. Vietnam J Sci Technol 54(4):545 28. My CA, Hoan VM (2019) Kinematic and dynamic analysis of a serial manipulator with local closed loop mechanisms. Vietnam J Mech 41(2):141–155
Recent Trends in Big Data Ingestion Tools: A Study Garima Sharma, Vikas Tripathi, and Awadhesh Srivastava
Abstract In the big data era, data floods in at an unparalleled, inflexible rate, making the collection and processing of data hard and unmanageable without appropriate data handling tools. Selecting the correct tool to meet current as well as future requirements is a demanding task, and it becomes more strenuous without awareness of all the available tools in this area. With the right tools, one can rapidly fetch, import, process, clean, filter, store, and export data from a variety of sources with different frequencies as well as capacities of data generation. A comprehensive survey and comparative study of the performance, merits, demerits, and usage of various ingestion tools in existence for frequent data ingestion activities (keeping volume, variety, velocity, and veracity in mind) is presented in this paper. Keywords Apache Kafka · Apache Flink · Apache NIFI · Amazon Kinesis · Apache Storm · Apache Gobblin
G. Sharma (B) · V. Tripathi, Graphic Era Deemed to be University, Dehra Dun, India. e-mail: [email protected]
A. Srivastava, KIET, Ghaziabad, Uttar Pradesh, India
© Springer Nature Singapore Pte Ltd. 2021. R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_83

1 Introduction

1.1 Background
Management of data has shifted from a native competency to a critical differentiator that determines market winners, for all organizations irrespective of their size. New initiatives and re-evaluations of existing strategies are in process to examine how businesses can be transformed or boosted using big data. Big data is not a single independent technology, tool, or initiative; rather, it can be counted as a trend across many areas of business and technology. It refers to technologies, tools, and initiatives that involve data that is too diverse,
rapidly changing, or enormous for traditional technologies, concepts, skill sets, and infrastructure to address efficiently. Big data can be viewed from different aspects depending upon user requirements [1]. One aspect is at the fundamental level, in which it is just another collection of data that can be analyzed and utilized for the benefit of a business; the other is analyzing and understanding the uniqueness of the data and leveraging it for distinctive purposes. The major part of this data is unstructured in nature. Big data includes all varieties of data/information that can help deliver the right information to the appropriate person within a defined time, which helps in making the right decision using available big data tools and techniques [2]. Data ingestion is the first step of data pipelining and also one of the substantial tasks in the big data paradigm. In the data ingestion phase, methodologies are proposed to ingest data flows from hundreds or thousands of sources into a defined data center. As the data can come from numerous sources at unpredictable speeds and in dissimilar formats, this layer is completely responsible for extracting, transforming, and loading the content into a big data database system. ETL is an acronym for extraction, transformation, and loading of datasets into data lakes. The process may include a number of database (similar/dissimilar) combinations, i.e., one type of data is fed into a database using one ingestion tool, and another mechanism is responsible for placing it into another database after the required refinements. Extraction refers to the process of fetching data from any data source (having different data formats) [3]. Transformation refers to the process of converting the extracted data from its previous format to the format required by another database. Basic ETL is nowadays the responsibility of the data ingestion layer, which makes the ingestion layer more critical and important [4].
1.2 Contributions
In this paper, we present a comparative study of trending big data ingestion tools on the basis of research already completed in various fields. We also discuss their correct application in research areas by presenting a comparison of their prominent usage and benchmark performance on particular tasks. As each tool specializes in a particular type of data stream, we also mark the data stream type each tool handles best.
1.3 Organization of the Paper
This paper is arranged as follows: In Sect. 2, we provide a comprehensive background on data ingestion and how it differs from a basic ETL process.
We also define different types of data streams, as this is one of the bases for tool differentiation. In Sect. 3, we give general information about the various data ingestion tools along with their real-time usage in various research fields. In Sect. 4, a comparison chart is given to benchmark tool performance along with prominent usage and the data streams supported. In Sect. 5, we conclude with a guideline to be followed before choosing and adopting any one of the data ingestion tools.
2 Literature Survey
Data ingestion is the process of collecting and integrating datasets from various data originators into one or more targets. A data ingestion tool facilitates the process by providing a data ingestion framework that makes it easier to extract data from different types of sources and supports a range of data transport protocols. It may comprise any or all of the data ingestion processes, such as fetching data, shifting it, loading it, and performing computations on it for later use or storage in an SQL or NoSQL database. This mainly involves ingesting data from heterogeneous sources in a variety of formats; converting, transforming, and modifying existing files; and preparing a single larger dataset. The process can be either continuous or asynchronous, and either real time or batch [5]. The source and sink files can be of different formats or can follow the same or different protocols, which may require some form of transformation or conversion. A good data ingestion tool eliminates the need to manually code individual data pipelines for every data source and accelerates data processing by helping to deliver data efficiently to ETL tools and other types of data integration software, or by loading multi-sourced data directly into a data warehouse. The tools discussed in this paper provide a structure that empowers users to fetch, ingest, integrate, and perform computations on datasets from disparate data sources. These tools expedite the ingestion and computation flow by supporting various data formats as well as protocols. Besides gathering, integrating, and processing data, they help in modifying the data for analytics and storage purposes. They can be used for ingesting data either in batches or as real-time streams. Before choosing any tool, the user must know the above specifications and select the best solution accordingly. A few tools are efficient at processing batch data, while others are made only for real-time data streaming. Data in batch is basically data at rest, while data in motion refers to real-time streaming; data in motion uses streaming tools, while data at rest uses batch processing tools [6]. Several ingestion tools can be used in IoT as well as cloud-based applications [7]. Tools can be differentiated on the basis of the streams they support; the two processing styles are contrasted in Table 1.
Table 1 Batch processing versus real-time stream processing

| Batch processing | Real-time stream processing |
|---|---|
| Discrete chunks of data at a certain interval | Import data when it is produced |
| Prominent in traditional analytics | Prominent in real-time analytics in the current trend |
| Collection and loading of information are two different tasks | Collection, loading, and processing of data are under one umbrella |
| Output or analytics is based on historical data | Output or analytics is based on real-time or current data |
| Store first before processing | Processing can come first, even before storage |
| Data volume is generally specific | Data volume is dependent on the tool being used |
| Decision making is much slower than with real-time processing tools | Real-time decision-making capability |
3 Related Work and Study of Each Tool

3.1 Apache Kafka
Apache Kafka is basically a real-time data ingestion tool. It excels as a high-throughput distributed messaging system. Kafka can be leveraged where there is no requirement for data computation or cleansing and data can be directly consumed by some analytical system. The connection-oriented TCP protocol is used to maintain high-performance communication between clients and servers. Kafka is a special kind of distributed file system exclusively devoted to high-performance, low-latency commit log storage, replication, and propagation. It is a perfect fit for any large dataset requiring real-time data, with an inbuilt fault-tolerance facility and highly scalable solutions, although its implementation complexity is a bit high. A Kafka cluster durably keeps all published records, irrespective of their consumption by any consumer, for a configurable retention period; this persisted data is automatically deleted after the defined retention time. Kafka can be set up using one or more clusters with a large number of servers, with or without multiple datacenters. Kafka is best suited as a replacement for traditional message brokers: to track user activity as a set of real-time pub-sub feeds, to monitor datasets, as a storage system for 50 KB to 50 TB of stored data on the server, and to maintain sequence in records. To support the above use cases, the Kafka system comprises four basic application interfaces: the producer, consumer, streams, and connector APIs, for publishing, subscribing, consuming input stream records, producing output streams of records, and finally connecting these topics to existing applications. Using the connector API, one can apply the same connector to different relational databases to capture every change to the respective tables. All the APIs are extensible, and a developer can write their own plug-in over any of the above APIs depending on the use case [8].
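As a hedged illustration of the producer and consumer APIs described above, using the third-party kafka-python client (the broker address and topic name are placeholders):

```python
from kafka import KafkaProducer, KafkaConsumer

# publish a record to a topic (broker address and topic are placeholders)
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("sensor-readings", b'{"device": 42, "temp": 21.5}')
producer.flush()

# subscribe to and consume the same real-time feed
consumer = KafkaConsumer("sensor-readings",
                         bootstrap_servers="localhost:9092",
                         auto_offset_reset="earliest",
                         consumer_timeout_ms=10000)
for record in consumer:
    print(record.value)  # raw bytes of each published message
```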
3.2 Apache NIFI
Apache NIFI automates the movement of data between varied data sources and data lake systems, making ingestion fast, easy, and secure. It supports both batch processing and real-time data processing. Everything from the extraction of data from various sources, such as disk files, HBase datasets, the HDFS file system, Kafka message queues, and SQL as well as NoSQL datasets, to the cleansing or conversion of data from one format to another and finally the loading of the dataset into the required data lake can be handled by this single tool. It supports powerful and scalable directed graphs for data routing, transformation, and system mediation logic. Each graph contains a set of processors connected to each other to accomplish the required business task. Capabilities like a GUI to design, implement, control, give feedback on, and monitor the processors [9], fault tolerance, and data provenance, secured with protocols like SSL, SSH, and HTTPS, make NIFI stand out from other ingestion tools. NIFI shines with low-latency streaming flows as well as heavyweight batch data transfers. It can be deployed on a single cluster or on multi-node clusters. In one real-time scenario, a NIFI architecture was successfully deployed on the cloud for processing data in cyber-physical-social systems through edge computing [10]; the authors also integrated it with Kafka by leveraging its message-queuing capability. Data ingestion, integration, and cleansing were the responsibility of one tool, i.e., NIFI, which gave a tremendous outcome of streaming more than 100 data values per second even after using extension processors. One great piece of work using this tool has been showcased in the development of the knowledge graph lifecycle [11], where the authors used the system directly for data cleaning purposes and also maximized system throughput by leveraging the capabilities of the tool mentioned above.
3.3 Amazon Kinesis
Amazon Kinesis is a Web service provided and hosted on Amazon Web Services (AWS) for processing big data in a real-time processing system. With this tool, accumulating, transforming, and analyzing real-time streaming data is quite easy. One can gain insights from data in real time and can react quickly to up-to-date collected information. This tool supports real-time ingestion of major data types such as videos, audio, logs, click streams, event streams, and IoT telemetry data for machine learning, analytics, and other applications. It is capable of analyzing and processing data as it arrives and responding instantly, instead of waiting for a batch to be prepared before processing can begin. Since it is a Web service, infrastructure management is not part of the developer's activity; it is handled by the service itself. Amazon Kinesis capabilities include video stream processing for analytics and machine learning (ML), building tailored real-time applications that process data
streams, and capturing, transforming, and loading real-time data into AWS data stores for analytics effortlessly [12], using SQL or Java, in real time and without any wait.
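A minimal ingestion sketch with the official boto3 SDK; the stream name is hypothetical and the credentials/region are assumed to come from the AWS environment:

```python
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

# ingest one click-stream event; the partition key controls shard assignment
kinesis.put_record(
    StreamName="clickstream-demo",          # hypothetical stream name
    Data=json.dumps({"user": "u123", "page": "/home"}).encode("utf-8"),
    PartitionKey="u123",
)
```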
3.4 Apache Flink
Apache Flink is a distributed processing engine, or framework, for stateful computations over unbounded or bounded data streams. Flink is designed to run in all common cluster environments and to perform computations at in-memory speed without scalability issues. The two fundamental APIs, DataStream and DataSet, are built on top of Flink's core dataflow engine and support operations on datasets such as mapping, filtering, and grouping of data [13]. Flink is capable of maintaining the application state of a very large system effortlessly, and its checkpointing algorithms guarantee exactly-once state consistency. Flink integrates with all the common resource managers, such as Hadoop YARN, and can also be set up as a stand-alone cluster. In one practical approach, an integrated system of Apache Flink and Apache Kafka was built for a distributed pattern prediction system over multiple large-scale event streams of moving objects (vessels). This system uses event forecasting with a pattern Markov chain (PMC) as the base prediction model on each event stream, and it applies a protocol for distributed online prediction to exchange information between the prediction models over the various input event streams [14].
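A minimal stateless sketch with the PyFlink DataStream API (exact imports and type handling vary across Flink releases; the bounded demo source stands in for a real Kafka or file connector):

```python
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()

# bounded demo source of (vessel id, speed) events
readings = env.from_collection([("vessel-1", 3.2), ("vessel-2", 7.8), ("vessel-1", 4.1)])

# filter and map operations run on Flink's core dataflow engine
readings.filter(lambda r: r[1] > 4.0) \
        .map(lambda r: f"{r[0]} moved fast: {r[1]}") \
        .print()

env.execute("speed-filter-demo")
```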
3.5 Apache Storm
Apache Storm is a real-time computing system. It works on the task-parallelism principle, using code reusability across multiple nodes with disparate input dataset values. It is open source, scalable, and fault tolerant; it guarantees data processing; and it is easy to set up and operate. It has tremendous performance statistics, processing a million tuples per second on a single node. Apache Storm does not have any state-management capability of its own; instead, it utilizes an open-source server, Apache ZooKeeper, to manage its cluster state information, such as message acknowledgements and processing status, therefore enabling Storm to resume right from where it left off even after a restart [15]. In one practical approach, Storm is used as a real-time distributed processing system for processing and analyzing rapidly generated data in real time on distributed nodes in the energy analytics domain. In addition, a solution to both the blind distribution problem of shuffle grouping and the sender and receiver imbalance problem of local-or-shuffle grouping in Apache Storm is presented [16].
In another successfully implemented use case, in medical research, Apache Storm was used to develop a scalable health monitoring system that supports the concurrent monitoring of multiple patients; the authors used Apache Storm to analyze a simple heartbeat monitoring system for multiple patients [16].
3.6 Apache Gobblin
Apache Gobblin is a data ingestion tool for fetching, processing, computing, and loading high volumes of data from numerous data sources. It can be viewed as a data integration tool implicitly integrated with big data resource managers such as YARN, supporting coding as MapReduce jobs [17]. Gobblin provides six different component interfaces to support scalability and customization in component development: source, extractor, converter, quality checker, writer, and publisher. The source helps in the integration of various data sources; the extractor watermarks the start and end of a source's topic; and the converter performs required transformations such as filtration. After transformation, Gobblin provides an implicit quality-checker component before actually writing the data to a staging directory. After the data is written, publishing can be performed using the publisher component [7]. For batch processing, the endpoint checks the checkpoint and commits the changes, while in the case of real-time streaming the staging is continuous, with commits made periodically [7].
4 Tool Comparison See Table 2.
Table 2 Use case—performance comparison chart

| Tool | Prominent usage | Performance | Integration tools | Type of stream supported |
|---|---|---|---|---|
| Apache Kafka | For ingesting data, no computations | 2 million writes per second using 03 clusters | All prominent languages | Real time |
| Apache NIFI | For ETL purposes with visualization | 1000 events per second using a 03-node cluster [18] | Languages, Apache Spark, Apache Kafka, Apache Atlas | Batch, near real time, real time |
| Amazon Kinesis | For video analytics | 3000 writes per second using a 03-node cluster [14] | Java, Python, Ruby, Node.js, .NET, Amazon RedShift, Amazon DynamoDB, Amazon S3 | Real-time streaming |
| Apache Flink | For ETL as well as business purposes | 4 million records per second using a 01-node cluster [15] | Apache Kafka, Elasticsearch, HDFS, RabbitMQ, Amazon Kinesis Streams, Twitter, Apache NIFI, Apache Cassandra, Apache Flume | Batch and real time |
| Apache Spark | For ETL as well as business needs | Approx. 2.5 million records per second using a 01-node cluster | All prominent languages, big data tools as well as frameworks | Batch and near real time |
| Apache Storm | Only for data computation purposes | Approx. 01 million tuples processed per second using a 01-node cluster | All queuing tools like Kafka, Kinesis, JMS and database systems like SQL, MongoDB, etc. | Real time |
| Apache Gobblin | Data ingestion, cleansing, integration | Approx. 100 TB/day using three clusters | Majorly Java connectors, Apache Kafka | Batch and real time [15] |

5 Concluding Remarks
We have presented a contrast between various trending data ingestion tools. While selecting an ingestion tool, the following points should be kept in mind. First and foremost, prioritize all the available data sources: whether they are of the same format or different, whether the velocity of each data source is similar or dissimilar, and whether they are real-time-based or flat-file batch systems. After settling the variety and velocity, move to the veracity of each file, i.e., how correct the data actually is and the intensity of the required data cleaning activity. After data cleaning, validate each file with respect to the user's requirements. Finally, configure the data extraction and processing units, including the methodologies involved, and check whether they are available in the chosen system or whether it needs to be combined with another ingestion system. We also need to check whether support for data visualization is required during cleaning, extraction, and processing of data. From current and future needs, we should also analyze the scalability of the data stream as well as of the tool, and whether the tool supports multiple platforms. Since in the future we may require a fusion of tools (for example, Apache Kafka is working on integration with Apache Flink), we also need to check the available integration and connecting tools. The most important aspect of tool selection involves its security features, so that data is not vulnerable to malicious attack; data security and integrity should be the topmost priority.
References 1. Ranjan R (2014) Streaming big data processing in datacenter clouds. IEEE Cloud Comput 1(1):78–83. https://doi.org/10.1109/MCC.2014.22 2. Thein KMM (2014) Apache kafka: next generation distributed messaging system. Int J Sci Eng Technol Res 3(47):9478–9483 3. Young R, Fallon S, Jacob P (2018) Dynamic collaboration of centralized and edge processing for coordinated data management in an IoT paradigm. In: 32nd International conference on advanced information networking and applications 4. Liu J, Braun E, D¨upmeier C, Kuckertz P, Ryberg DS, Robinius M, Stolteñ D, Hagenmeyer V (2018) A generic and highly scalable framework for the automation and execution of scientific data processing and simulation workflows. In: IEEE International conference on software architecture 5. M˘at˘acut, a˘ A, Popa C (2018) Big data analytics: analysis of features and performance of big data ingestion tools. In: Informatica economic˘a, vol 22 6. Dautov R, Distefano S, Bruneo D, Longo F, Merlino G, Puliafito A (2018) data processing in cyber-physical-social systems through edge computing 7. Pal G, Li G, Atkinson K (2018) Big data real time ingestion and machine learning. In: IEEE Second International conference on data stream mining and processing 8. Sarnovsky M, Bednar P, Smatana M (2017) Data integration in scalable data analytics platform for process industries. In: 21st International conference on intelligent engineering systems 9. Amudhavel J et al. (2015) Perspectives, motivations, and implications of big data analytics. In: Proceedings international conference on advanced research in computer science engineering and technology, vol 9. Newyork, USA, pp 344–352 10. Dung N, Duffy EB, Luckow A, Kennedy K, Apon A (2018) Evaluation of highly available cloud streaming systems for performance and price. In: 18th IEEE/ACM International symposium on cluster, cloud and grid computing 11. Simsek U, Umbrich J, Fensel D (2020) Towards a knowledge graph lifecycle: a pipeline for the population of a commercial knowledge graph. In: Proceedings of the conference on digital curation technologies, vol 2535 12. Qadah E, Mock M (2018) A distributed online learning approach for pattern prediction over movement event streams with apache flink. Vienna, Austria 13. Ficco M (2017) Aging-related performance anomalies in the apache storm stream processing system, Future generation 14. Son S, Lee S, Gil M-S, Choi M-J, Moon Y-S (2018) Locality aware traffic distribution in apache storm for energy analytics platform. In: IEEE International conference on big data and smart computing 15. Agbo CC, Mahmoud QH, Eklund JM (2018) A Scalable patient monitoring system using apache storm. In: IEEE Canadian conference on electrical and computer engineering (CCECE) 16. Dixit A, Choudhary J, Singh DP (2018) Survey of apache storm scheduler. In: 3rd International conference on internet of things and connected technologies (ICIoTCT) 17. Lindemann T, Kauke J, Teubner J (2018) Efficient stream processing of scientific data. In: IEEE 34th International conference on data engineering workshops 18. Firouzi F, Farahani B (2020) Architecting IoT Cloud. In: Firouzi F, Chakrabarty K, Nassif S (eds) Intelligent internet of things. Springer, Cham
Optimization of the Feed Rate of Six-DOFs Robot in a Parametric Domain Based on Kinematics Modeling Chu Anh My, Duong Xuan Bien, and Le Chi Hieu
Abstract Nowadays, robots are widely used in welding, cutting, 3D plastic printing, and metal additive manufacturing because of their great flexibility. Owing to the demand to reduce time costs and increase manufacturing performance, the issue of optimizing the production process is always of concern; in particular, optimizing the feed rate parameter has received much attention from researchers. This paper presents a method for optimizing the feed rate of the cutting tool along the toolpath profile of an industrial robot with six degrees of freedom, in the parametric domain, based on kinematics modeling. The inverse kinematics problem of the redundant system is solved using the algorithm for adjusting the increments of the generalized vector. The positions, velocities, accelerations, and jerks of the joints are determined along the given toolpath in the parametric domain. The algorithm is built on a gradual increase of the feed rate value while ensuring the kinematic limits of the joints. The results of this study play an important role in improving the technological capabilities of machining robots when manufacturing complex toolpaths in space. Keywords Feed rate · Optimize algorithm · Machining robots · Parametric domain
C. A. My (B) · D. X. Bien, Le Quy Don Technical University, Hanoi, Vietnam. e-mail: [email protected]
L. C. Hieu, Faculty of Science and Engineering, University of Greenwich, London, UK
© Springer Nature Singapore Pte Ltd. 2021. R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_84

1 Introduction
One of the most effective ways to increase productivity is to optimize the production process and reduce machining time by optimizing the feed rate of robots, especially for complex, constantly changing toolpaths. While optimizing the feed rate, the accuracy of each position on the toolpath and kinematic limits such as the velocity, acceleration, and jerk of the joints have to be ensured to prevent
the robots from overloading and to avoid unwanted vibrations and toolpath errors. On the other hand, during the machining process, the tool speed is usually kept constant even when the toolpath profile changes, especially in welding. This leads to unimproved machining performance, because the feed rate of the welding torch/cutting tool (tool tip) is not consistent with the welding seam/toolpath. Some sections of the toolpath allow the robot to move at high speed; in contrast, others require a slow motion of the tool tip to ensure machining quality. Therefore, calculating and optimizing the feed rate of the tool tip appropriately for each segment of the toolpath profile is essential. Research on kinematics modeling, feed rate optimization, and machining robots is mainly addressed in the works [1–31]. Optimization of the feed rate parameter on CNC machines is addressed in [1, 5, 8, 9, 12–18, 22, 24]. Kinematics and dynamics modeling and analysis for welding and cutting robots are addressed in [2, 4, 10, 15, 23]. A method of feed rate planning for machining robots based on geometric parametric interpolation was considered in [6]. Fang et al. [15] introduced a method to optimize a welding robot's motion for welding a Y-joint. My et al. [23] presented a closed-loop inverse kinematics control algorithm for a six-DOF welding robot combined with a welding positioner. Building geometric models for complex welding paths is considered in [11, 19, 21]. Yan et al. [21] proposed an algorithm to determine the welding seam at the intersection of pipes based on non-ideal models in the parametric domain. This paper presents a method for optimizing the feed rate of the cutting tool along the toolpath profile of an industrial six-DOF robot (ABB IRB 1520ID) in the parametric domain based on kinematics modeling. The inverse kinematics problem of the redundant system is solved using the algorithm for adjusting the increments of the generalized vector. The positions, velocities, accelerations, and jerks of the joints are determined along the toolpath in the parametric domain. The algorithm is built on a gradual increase of the feed rate value while ensuring the kinematic limits of the joints. The results of this study play an important role in improving the technological capabilities of machining robots when manufacturing complex toolpaths in space.
2 Kinematics Modeling
Consider the kinematics model of the industrial six-DOF welding robot IRB 1520ID as shown in Fig. 1. The fixed coordinate system $(OXYZ)_0$ is located at point $O_0$, and $(OXYZ)_i$, $(i = 1 \div 6)$, are the local coordinate systems attached to link $i$. Table 1 describes the kinematic parameters according to the D-H rule [7]. Accordingly, the homogeneous transformation matrices $H_i$, $(i = 1 \div 6)$, are determined. The position and orientation of the end-effector point (point E) with respect to the fixed coordinate system are obtained from the matrix $D_6$, determined as follows [7]:

$$D_6 = H_1 H_2 H_3 H_4 H_5 H_6$$
(1)
Fig. 1 Kinematics model of welding robot IRB 1520ID
Table 1 Kinematics parameters (D-H)

| Link | θi = qi | di | ai | αi |
|------|---------|-----|-----|------|
| 1 | q1 | d1 | a1 | π/2 |
| 2 | q2 | 0 | a2 | 0 |
| 3 | q3 | 0 | a3 | π/2 |
| 4 | q4 | d4 | 0 | π/2 |
| 5 | q5 | 0 | 0 | −π/2 |
| 6 | q6 | d6 | 0 | 0 |
Define the generalized vector of the robot as $q(t) = [q_1\ q_2\ q_3\ q_4\ q_5\ q_6]^T$, and let $x(t) = [x_E(t)\ y_E(t)\ z_E(t)]^T$ be the coordinate vector of the end-effector point in the fixed coordinate system. The forward kinematics equations are $x = f(q)$, where $f$ is a vector function representing the robot forward kinematics. Differentiating these equations with respect to time, the relation between the generalized velocities is obtained as $\dot{x} = J(q)\dot{q}$, where $J(q)$ is the Jacobian matrix of size $3 \times 6$. Due to the kinematic redundancy of the robot, the inverse kinematics equation is written as $\dot{q} = J^+(q)\dot{x}$, where $J^+(q)$ is the pseudo-inverse of the matrix $J(q)$, defined as [7]:

$$J^+(q) = J^T(q)\left[J(q)J^T(q)\right]^{-1}$$
Note that q˙k = qk+1t−qk , x˙k = xk+1t−xk , where, k is an arbitrary calculation step with respect to time. The value of generalized vector at k + 1 step can be given as qk+1 = qk + J + (q)(xk+1 − xk )
(3)
For the given x, x, ˙ x¨ vectors and using the algorithm for adjusting the increments of generalized vector which was proposed in [7], the approximately joint variables value can be determined.
886
C. A. My et al.
Given a geometric trajectory such as a toolpath in parametric domain x(u) = [x(u) y(u) z(u)]T , u = [0, 1]. Define f (t) = s˙ (t) is the feed rate along the toolpath [24], where s is the arc length of curve x(s) = [x(s) y(s) z(s)]T . The inverse kinematics equation in parametric domain is rewritten as q(u) = f −1 (x(u)). Assume that the value of generalized vector q(u) is calculated by using the method mentioned above. The velocity, acceleration, and jerk of joints need to be determined in parametric domain. The generalized velocity vector can be given as q˙ = J + (q(u))
x dx ds = J + (q(u))xs s = J˙+ u f x ds dt u
(4)
du and x = dx = dx/du = dx/du = xu . The acceleration and where ds ≈ dx s |ds/du| du ds ds/du |xu | jerk of joints are determined in parametric domain as [24]
⎞ T xu xu xu x ⎠f2 + q¨ = J + ⎝⎝ u2 + 4 x x u u ⎛⎛
⎞ xu ˙ f − J˙q˙ ⎠ x u
(5)
and, ⎛
xu 3 xu
xu
(xu )T xu
T T +xu xu xu +(xu ) xu
f + ⎜ ... |xu |5 q = J +⎜ | |
⎝ T x x x u ( u) u x + u 2 + f˙ f − 2 J˙q¨ − J¨q˙ |xu | |xu |4 3
⎞ f ⎟ ⎟ ⎠ 3
(6)
The values of $\dot{q}$, $\ddot{q}$, and $\dddot{q}$ depend mainly on the geometric characteristics of the toolpath trajectory ($x_u$, $x_{uu}$, $x_{uuu}$), on the feed rate ($f$, $f^2$, $f^3$, $\dot{f}$), and on the kinematic structure of the robot ($J$, $\dot{J}$, $\ddot{J}$).
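A hedged numerical sketch of Eq. (4): given the parametric curve x(u) and a feed rate f, the joint velocity follows from a finite-difference estimate of x_u (jacobian(q) is an assumed helper returning the 3 × 6 Jacobian):

```python
import numpy as np

def joint_velocity(q, u, f, x_of_u, jacobian, h=1e-6):
    """Eq. (4): q_dot = J+(q) (x_u / |x_u|) f, with x_u by central differences."""
    x_u = (x_of_u(u + h) - x_of_u(u - h)) / (2.0 * h)   # dx/du
    J = jacobian(q)
    J_plus = J.T @ np.linalg.inv(J @ J.T)                # Eq. (2)
    return J_plus @ (x_u / np.linalg.norm(x_u)) * f
```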
3 Optimizing the Feed Rate Value in the Parametric Domain
In this section, the feed rate of the tool tip along the toolpath is optimized based on the kinematic constraints of the robot. An initial feed rate value is given and is increased gradually with each increment of the loop; the increase stops only when the constraint conditions are violated. Thus, each position on the parametric trajectory obtains a corresponding optimal feed rate value. Define the following symbols: $f_{ini}$ and $f_{max}$ are the initial and maximum feed rate values, and the generalized vectors $\dot{q}_{max}$, $\ddot{q}_{max}$, and $\dddot{q}_{max}$ are the maximum velocity, acceleration, and jerk vectors of the joints. The objective function is given as

$$f_{optimal} = \max_{u \in [0,1]} f(u)$$
(7)
Fig. 2 Algorithm diagram for feed rate optimization
The constraint conditions for the optimization problem can be defined as

$$|\dot{q}(u)| \le \dot{q}_{max},\quad |\ddot{q}(u)| \le \ddot{q}_{max},\quad |\dddot{q}(u)| \le \dddot{q}_{max},\quad f(u) \le f_{max}$$
(8)
The algorithm diagram is shown in Fig. 2.
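A sketch of that loop, assuming a helper joint_rates(u, f) that evaluates Eqs. (4)-(6) at parameter u (the step sizes are illustrative):

```python
import numpy as np

def optimal_feed_rate(u, f_ini, df, f_max, joint_rates, limits):
    """Grow f at position u until a kinematic limit of Eq. (8) would be violated.

    joint_rates(u, f) -> (q_dot, q_ddot, q_dddot); limits = (qd_max, qdd_max, qddd_max).
    """
    qd_max, qdd_max, qddd_max = limits
    f = f_ini
    while f + df <= f_max:
        qd, qdd, qddd = joint_rates(u, f + df)
        if (np.any(np.abs(qd) > qd_max) or np.any(np.abs(qdd) > qdd_max)
                or np.any(np.abs(qddd) > qddd_max)):
            break  # the next increment breaks Eq. (8); keep the current f
        f += df
    return f

# optimal profile over the whole toolpath, one value per parameter sample:
# f_opt = [optimal_feed_rate(u, 0.005, 0.001, 0.1, joint_rates, limits)
#          for u in np.linspace(0.0, 1.0, 101)]
```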
4 Numerical Simulation Results
This part presents the numerical simulation results for the welding robot IRB 1520ID with a 3D complex welding seam. Some parameters of the welding robot IRB 1520ID are given as [22]: $d_1 = 0.453$ (m), $a_1 = 0.16$ (m), $a_2 = 0.59$ (m), $a_3 = 0.2$ (m), $d_4 = 0.723$ (m), $d_6 = 0.2$ (m), $f_{max} = 2$ (m/s), $\dot{q}_{max} = [130\ 140\ 140\ 320\ 380\ 460]^T$ (deg/s), $\ddot{q}_{max} = [325\ 350\ 350\ 800\ 950\ 1150]^T$ (deg/s²), $\dddot{q}_{max} = [1625\ 1750\ 1750\ 4000\ 4750\ 5750]^T$ (deg/s³). The welding seam is defined as a third-degree Bezier curve [24]:

$$X(u) = P_0(1-u)^3 + 3P_1(1-u)^2u + 3P_2(1-u)u^2 + P_3u^3$$
(9)
where the coordinates of the points $P_0$, $P_1$, $P_2$, and $P_3$ are given as $P_0 = [1.5\ 0\ 1.24]^T$, $P_1 = [1.2\ {-0.5}\ 0.8]^T$, $P_2 = [1.5\ {-0.8}\ 1.2]^T$, and $P_3 = [1.0\ {-1.2}\ 1.0]^T$. Note that the motion of the robot depends on the kinematic structure and on welding technology parameters such as the voltage, welding current, and wire output speed. In practice, the welding seam is produced with a feed rate much smaller than the maximum value. Therefore, based on actual welding practice, the maximum speed limits for the algorithm are redefined as follows: $f_{ini} = 5$ (mm/s), $\Delta f = 1$ (mm/s), $f_{max} = 100$ (mm/s), $\Delta u = 0.01$, $q_0 = [0\ 90\ 0\ 0\ 0\ 0]^T$ (deg), $\dot{q}_{max} = [20\ 22\ 22\ 50\ 58.5\ 70.5]^T$ (deg/s), $\ddot{q}_{max} = [37.5\ 55\ 55\ 125\ 146\ 176]^T$ (deg/s²), $\dddot{q}_{max} = [187.5\ 275\ 275\ 625\ 730\ 880]^T$ (deg/s³).

Fig. 3 Model of robot IRB 1520ID and welding seam
Fig. 4 Values of joint variables
Fig. 5 Values of joint velocities
Fig. 6 Values of joint accelerations
Fig. 7 Values of joint jerks
Fig. 8 Values of optimal feed rate
Fig. 9 Torch positions and optimal feed rate values

Based on the given welding path (Fig. 3), the values of the joint variables were calculated and are shown in Fig. 4. The values of the joint velocities and accelerations are shown in Figs. 5 and 6; clearly, the velocity and acceleration of joint 3 change significantly and reach the maximum values. Figure 7 shows the jerk values of the joint variables; these values are within the allowed range. Figure 8 shows
the optimal feed rate value for each position on the welding path; this is shown more clearly at selected positions in Fig. 9. The feed rate values do not change too abruptly and are within the permissible limits.
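For reference, the welding seam of Eq. (9) and its first parametric derivative x_u, needed in Eq. (4), can be evaluated directly from the control points above; a short sketch:

```python
import numpy as np

P = np.array([[1.5,  0.0, 1.24],   # P0 ... P3 of Eq. (9)
              [1.2, -0.5, 0.80],
              [1.5, -0.8, 1.20],
              [1.0, -1.2, 1.00]])

def bezier(u):
    """X(u) of Eq. (9), a cubic Bezier curve."""
    b = np.array([(1 - u)**3, 3 * (1 - u)**2 * u, 3 * (1 - u) * u**2, u**3])
    return b @ P

def bezier_du(u):
    """First parametric derivative x_u of Eq. (9)."""
    db = np.array([-3 * (1 - u)**2,
                   3 * (1 - u)**2 - 6 * (1 - u) * u,
                   6 * (1 - u) * u - 3 * u**2,
                   3 * u**2])
    return db @ P
```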
5 Conclusions
In general, this paper presented research results on optimizing the feed rate along the toolpath in the parametric domain for a six-DOF robot. The optimal feed rate values are calculated for each specific position on the parametric toolpath. The values of the position, velocity, acceleration, and jerk of the joint variables are determined based on the given parametric curves and the algorithm for adjusting the increments of the generalized vector. In determining these values, the complex calculation of the pseudo-inverse Jacobian matrix and of the first- and second-order Jacobian matrix derivatives was completely solved. The algorithm for optimizing the feed rate is developed by gradually increasing the feed rate value while ensuring that the position, velocity, acceleration, and jerk values of the joint variables remain within the allowed limits. The results of this paper play an important role in programming optimal machining for complex toolpaths, contributing to reduced machining time costs and improved productivity. Furthermore, the results can be applied in welding, cutting, 3D plastic printing, and metal additive manufacturing using robots.

Funding Statement This work was supported by a Research Environment Links grant, ID 528085858, under the Newton Fund partnership. The grant is funded by the UK Department for Business, Energy and Industrial Strategy and delivered by the British Council.
References 1. Altintas Y, Erkorkmaz K (2003) Feed rate optimization for spline interpolation in high speed machine tools. CIRP Ann 52:297–302 2. Misti S, Bouzakis D, Massour G, Sagris D, Maliaris G (2004) Off-line programming of an industrial robot for manufacturing. Int J Adv Manuf Technol 26:262–267 3. Lewis FL, Dawnson DM, Abdallah C (2004) In: Robot manipulator control theory and practice, 2nd edn. Marcel Dekker INC, New York, USA 4. Huo L, Baron L (2008) The joint-limits and singularity avoidance in robotic welding. Ind Robot An Int J 35:456–464 5. Sedighi M, Azad MN (2008) Classification of the feed rate optimization techniques a case study in minimizing CNC machining time. Int J Eng Sci 19:83–87 6. Olabi A, Bearee R, Gibaru O, Damak M (2010) Feed rate planning for machining with industrial six-axis robots. Control Eng Pract Elsevier 18:471–482 7. Khang NV, Dien NP, Vinh NV, Nam TH (2010) Inverse kinematic and dynamic analysis of redundant measuring manipulator BKHN-MCX-04. Vietnam J Mech VAST 32:15–26 8. Zhang K, Yuan CM, Gao XS (2011) Efficient algorithm for time-optimal feed rate planning and smoothing with confined chord error and acceleration. Math Mech Res 31:43–61 9. Beudaert X, Lavernhe S, Tournier C (2010) Feed rate interpolation with axis jerk constraints on 5 axis NURBS and G1 toolpath. Inte J Mach Tools Manuf 10 10. Erkaya S (2012) Investigation of joint clearance effects on welding robot manipulators. Robot Comput-Integr Manuf 28:449–457 11. Chen C, Hu S, He D, Shen J (2013) An approach to the path planning of tube–sphere intersection welds with the robot dedicated to J-grooves joint. Robot Comput-Integr Manuf 29:41–48 12. Dong J, Wang T, Li B, Ding Y (2014) Smooth feed rate planning for continuous short line tool path with contour error constraint. Robot Comput-Integr Manuf 76:1–12 13. Bharathi A, Dong J (2016) Feed rate optimization for smooth minimum-time trajectory generation with higher order constraints. Int J Adv Manuf Technol 82:1029–1040 14. My CA, Bohez ELJ (2016) New algorithm to minimize kinematic tool path errors around 5-axis machining singular points. Int J Prod Res 54(20) 15. Fang HC, Ong SK, Nee AYC (2017) Robot path planning optimization for welding complex joints. Int J Adv Manuf Technol 90:3829–3839 16. Liu H, Liu Q, Yuan S (2017) Adaptive feed rate planning on parametric tool path with geometric and kinematic constraints for CNC machining. Int J Adv Manuf Technol 90(5–8) 17. Mansour SZ, Seethaler R (2017) Feed rate optimization for computer numerically controlled machine tools using modeled and measured process constraints. J Manuf Sci Eng (ASME) 139 18. Liang F, Zhao J, Ji S (2017) An iterative feed rate scheduling method with confined high-order constraints in parametric interpolation. Int J Adv Manuf Technol 92:2001–2015
19. Liu Y, Tian X (2018) Weld seam fitting and welding torch trajectory planning based on NURBS in intersecting curve welding. Int J Adv Manuf Technol 95:2457–2471 20. ABB Robotics (2018) Product Manual IRB 1520. ABB AB, Robotics and Motion Se-721 68 Vasteras, Sweden 21. Yan L, Jiang L, Xincheng T (2019) An approach to the path planning of intersecting pipes weld seam with the welding robot based on non-ideal models. Robot Comput Integr Manuf 55:96–108 22. My CA, Bohez ELJ (2019) A novel differential kinematics model to compare the kinematic performances of 5-axis CNC machines. Int J Mech Sci 163:105–117 23. My CA, Bien DX, Tung HB, Hieu LC, Cong NV, Hieu TV (2019) Inverse kinematic control algorithm for a welding robot-positioner system to trace a 3D complex curve. In: International conference on advanced technologies for communications (ATC), pp 319–323 24. My CA, Bien DX, Tung HB, Hieu LC, Cong NV (2020) New feed rate optimization formulation in a parametric domain for 5-axis milling robots. In: 6th International conference on computer science, applied mathematics and applications (ICCSAMA 2019), pp 403–411 25. My CA, Le CH, Packianather M, Bohez EL (2019) Novel robot arm design and implementation for hot forging press automation. Int J Production Res 57(14):4579–4593 26. My CA, Parnichkun M (2015) Kinematics performance and structural analysis for the design of a serial-parallel manipulator transferring a billet for a hot extrusion forging process. Int J Adv Robot Syst 12(12):186 27. My CA, Bien DX, Tung BH, Hieu LC, Cong NV, Hieu TV (2019) Inverse kinematic control algorithm for a welding robot-positioner system to trace a 3D complex curve. In: 2019 IEEE International conference on advanced technologies for communications (ATC), pp 319–323 28. My CA (2016) Inverse kinematics of a serial-parallel robot used in hot forging process. Vietnam J Mech 38(2):81–88 29. My CA (2013) Inverse dynamic of a N-links manipulator mounted on a wheeled mobile robot. In: 2013 IEEE International conference on control, automation and information sciences (ICCAIS), pp 164–170 30. My CA, Hoan VM (2019) Kinematic and dynamic analysis of a serial manipulator with local closed loop mechanisms. Vietnam J Mech 41(2):141–155 31. My CA, Bien DX, Le CH, Packianather M (2019) An efficient finite element formulation of dynamics for a flexible robot with different type of joints. Mech Mach Theor 134:267–288
Cold Start Problem Resolution Using Bayes Theorem Deepika Gupta, Ankita Nainwal, and Bhaskar Pant
Abstract With the plethora of data available online, academicians and researchers find it difficult to retrieve relevant material. The problem is compounded for a new academician, who has no prior knowledge of which research papers are useful, and for a newly added paper, whose work has not yet been acknowledged. This is the cold start problem in recommender systems, where items have few or near-zero ratings. To address it, we propose a methodology that rates each academician's work using collaborative filtering, developing hidden relations between a titled paper and its corresponding references and citations. The rating of each research paper is computed as a Bayes theorem conditional probability, so a new academician in a research area receives recommendations based on these calculated probabilities. The approach also mitigates the sparsity issue, i.e., the low or absent ratings of research work, and caters to other issues such as researcher details, probable attacks by fake papers, and trust-bounded feedback. Finally, each paper is labeled as recommended or not against a set threshold. Keywords Bayes theorem conditional probability · Cold start problem · Collaborative filtering · Research paper recommender system · Sparsity problem
1 Introduction

Since time immemorial, people have relied on some form of recommendation to predict the forthcoming and prepare themselves accordingly. The onset of
huge data assimilation gave rise to recommender systems, which process bulk data and recommend relevant items to the user, saving time and making things simpler. However, no recommender system is free from challenges. Multiple issues arise, such as sparsity, scalability, shilling attacks, and copyright problems; among them, the most pressing is the cold start problem. It concerns new users who must select their preferences with no prior knowledge of items, and new items whose work is not yet recognized even though it could be a potential reference. Cold start problems are categorized into three types: (i) the new community problem, (ii) the new user problem, and (iii) the new item problem [1]. Various proposals for retrieving a required paper range from a list of citations [2] and a list of papers by an author [3] to similarity on partial text [4]. All these approaches build a user profile of interests and recommend items similar to the ones provided; however, none of them resolves the cold start problem. This study revolves around the cold start problem and addresses it using Bayes theorem conditional probability. The following sections present related work on the cold start problem; the gaps identified in reviewing existing studies motivate the requisite methodology, followed by the results and discussions, and the study concludes with future prospects.
2 Related Work

The cold start problem in recommender systems traces back to the mid-1990s [1], when it emerged as an independent research area and recommendation relied explicitly on the ratings structure. The cold start problem is distinguished into three kinds [1]:

• New community: [5] relates to enabling recommendation by obtaining a sufficient amount of data (ratings) when there are no established users or general votes.
• New item: This problem is visible in collaborative filtering algorithms because they rely on item interactions for recommendation. The item set [6] is bifurcated into a head H and a tail T. Items in the tail T are clustered into groups and recommended on the ratings of the cluster, whereas head items are recommended on the ratings of the individual item. A predictive model [7] is built for constructing user/item pair profiles.
• New user: The original rating matrix is reduced to a cross table, which is used as an index into the rating matrix for finding neighbors, thus helping to build a concept lattice [8]. A new user who receives poor recommendations [9] may abandon a lengthy sign-up and lose interest in the site. Hence, [9] proposed an evaluation of information-theoretic strategies through extensive offline simulations and with real users of an online recommender system.

The Quickstep and Foxtrot systems [10] are knowledge-based, ontological user-profiling recommender systems for digital libraries. The content of such libraries is dynamically updated using users' browsing behavior and their explicit feedback, and academicians are notified of updates in their own research areas upon the
arrival of new research. [11] adopted an explicit way to measure a user's personality from personality quizzes on social Web sites. [12] presents a movie recommender system for producers that uses movie swarm mining with a minimum threshold, i.e., the frequency of users enjoying a movie, to prune time and space complexity. The movie swarm gives the producer an idea of currently popular movies and users' interests, and the genres within a swarm indicate popularity; similarity within a group thus forms the base for recommendation. A new rating profile can also be predicted from trusted neighbors and similarity [13]. The alternating least squares method maps user and item latent factors to minimize an objective function on Apache Spark [14], with k-means clustering applied manually; the user closest to a cluster center represents that cluster, and each cluster maintains its top-N rated movies, which are recommended to the users of that cluster. The cold start problem can also be mitigated using external source data [15], amalgamating social media data and machine learning into a hybrid filtering recommender system; the data are extracted from two sources, Yahoo! Movies and Facebook Fan Pages, using content-based filtering. Demographic similarities are likewise used for recommendation [16], performing better on large data. In summary, the existing work either depends on persuading the user to rate items explicitly or derives relations from historical activities such as group similarity, demographics, or past liked items; all of them, however, inhibit initiation when no prior knowledge exists.
3 Requisite Methodology

The study gives equal weightage to all the research papers present in the dataset; hence, the weight of every research paper is assumed to be 1.
3.1 Concerned Algorithm

Algorithm: Probabilistic research paper recommender system
Input: a title (target) paper
Output: top-N recommended papers

1. Retrieve all references of the input target paper.
   a. Calculate each reference paper's conditional probability given it is titled in the dataset.
2. If the target paper is cited:
   a. Retrieve all the research papers in which the input paper is cited.
   b. Calculate each cited paper's conditional probability given it is titled in the dataset.
   else:
   a. Perform Step 1.
3. If the target paper is cited:
   a. Qualify the potential research papers based on the calculated values of both citations and references.
   else:
   a. Qualify the potential research papers based on the calculated values of the references.
4. Hence, recommend the top-N potential papers.

The conditional probability of the references and the citations is calculated using Bayes theorem. For a given hypothesis H and evidence E, Bayes theorem relates the probability of the hypothesis before seeing the evidence, P(H), to the probability of the hypothesis after seeing the evidence, P(H|E):

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)} \qquad (1)$$
where P(H) is the prior probability, P(H|E) is the posterior probability, and P(E|H)/P(E) is the likelihood ratio. In this study, H represents a reference or a citation of the research paper, and E represents the title of the research paper. The qualified research papers are labeled as recommended or not using a threshold of 0.5:

$$f(x) = \begin{cases} \text{Yes}, & \text{if } x \ge 0.5 \\ \text{No}, & \text{if } x < 0.5 \end{cases} \qquad (2)$$
where x is the calculated conditional probability of the references and citations, and f(x) is the label attribute that qualifies the research paper into the Yes or No recommendation list. Each instance thus carries the following attributes: the research paper title, its references, the papers citing it, the conditional probabilities of both the references and the citations, and the label attribute indicating whether to recommend or not.
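As a concrete illustration, the following minimal Python sketch estimates the rating of Eq. (1) from simple occurrence counts over a corpus and applies the threshold of Eq. (2). The data model (dictionaries with 'title', 'references', and 'citations' keys) and the count-based probability estimates are our own assumptions for illustration; the paper does not prescribe an implementation.

```python
def occurs(title, record):
    # True if `title` appears in a corpus record as its title,
    # one of its references, or one of its citations.
    return (title == record['title']
            or title in record['references']
            or title in record['citations'])

def bayes_rating(candidate, target, corpus):
    """Estimate P(H|E) = P(E|H) P(H) / P(E), Eq. (1), by counting.

    H: the candidate occurs in a record; E: the target title occurs.
    """
    n = len(corpus)
    n_h = sum(occurs(candidate, r) for r in corpus)   # count supporting P(H)
    n_e = sum(occurs(target, r) for r in corpus)      # count supporting P(E)
    n_eh = sum(occurs(candidate, r) and occurs(target, r) for r in corpus)
    if n_h == 0 or n_e == 0:
        return 0.0
    # (n_eh/n_h) * (n_h/n) / (n_e/n) algebraically reduces to n_eh / n_e.
    return (n_eh / n_h) * (n_h / n) / (n_e / n)

def label(x, threshold=0.5):
    # Eq. (2): qualify a paper into the Yes/No recommendation list.
    return 'Yes' if x >= threshold else 'No'
```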
Fig. 1 Research chronology
3.2 Research Chronology For every titled paper in a dataset, retrieve all the reference papers and thus calculate the Bayes theorem conditional probability for each reference given it is titled in the dataset. Further, check whether the title paper is cited in another research paper. If it is cited, retrieve all the cited papers of the given title paper and thus calculate their respective Bayes theorem conditional probability given it is titled in the dataset. Hence, recommend top-N potential papers to the researcher based on calculated conditional probability of each reference and cited papers of given titled paper. However, if title paper is not cited, then recommend top-N potential papers based on the calculated conditional probability of all the reference papers of the given titled paper (Fig. 1).
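A hypothetical end-to-end driver for this chronology, reusing the bayes_rating and label helpers sketched in Sect. 3.1, could look as follows; the function and field names are illustrative assumptions rather than the authors' code.

```python
def recommend(target, corpus, top_n=10):
    # Follow the chronology of Fig. 1 for one titled paper.
    record = next(r for r in corpus if r['title'] == target)
    candidates = list(record['references'])
    if record['citations']:              # the title paper is cited elsewhere
        candidates += list(record['citations'])
    rated = [(c, bayes_rating(c, target, corpus)) for c in candidates]
    qualified = [(c, p) for c, p in rated if label(p) == 'Yes']
    # Recommend the top-N potential papers by calculated probability.
    return sorted(qualified, key=lambda cp: cp[1], reverse=True)[:top_n]
```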
3.3 Data Acquisition

The requisite study is implemented on a dataset taken from the ACM Digital Library [17]. We extracted the available research papers and, for every title paper, found the hidden associations between its references and citations using the proposed algorithm, thus recommending the top-N potential papers.
4 Results and Discussions

A critical review of the related work shows that existing approaches either motivate the new user to rate items explicitly or develop relations from past activities; similarity also forms the base of several recommendations, as in clustering, demographics, or proximity attributes across two or more domains. The requisite research, in contrast, alleviates cold start without any explicit user input or similarity requirement; a direct comparison with previous studies is therefore difficult owing to the inherent novelty of the concerned study. The present study uses a probabilistic approach to calculate the rating of each paper and then recommends the eligible papers to the user. Moreover, it addresses several open issues and challenges faced by recommender systems:

• Cold start problem: The study focuses on resolving cold start. It recommends eligible papers by calculating the rating of each titled paper through conditional probability.
• Sparsity: Ratings are not taken from users but calculated through the probabilistic approach, which removes the sparsity problem.
• Privacy issue: The method does not use any user information, thereby reducing privacy and security concerns.
• Gray sheep: This issue is resolved because the user's profile is not used for recommendation.
• Trust-based issue: Calculating trust scores takes time and carries the challenges of unfair rating, bias, and Sybil attacks. Since the present method uses no user profile, trust-based issues are avoided.

The requisite work uses the dataset from the ACM Digital Library and calculates the rating of each titled paper with Bayes theorem conditional probability to recommend the potential papers.
5 Conclusion and Future Scope

The concerned research resolves the cold start problem using a probabilistic approach: the rating of each paper is calculated by Bayes theorem conditional probability, and the eligible papers are then recommended to the user. Since the study does not use a user profile, it removes challenges such as trust-based issues, the new user cold start problem, privacy issues, gray sheep, and data sparsity. Several open issues in recommender systems remain as future scope. The present work does not use a real-time dataset and is therefore not scalable; further enhancements could employ dynamic, real-time data to improve the existing approach and to tackle other open challenges faced by recommender systems, such as serendipity and scalability.
References

1. Bobadilla J, Ortega F, Hernando A, Bernal J (2012) A collaborative filtering approach to mitigate the new user cold start problem. Knowledge-Based Systems, pp 225–238. https://doi.org/10.1016/j.knosys.2011.07.021
2. McNee SM, Albert I, Cosley D, Gopalkrishnan P, Lam SK, Rashid AM et al (2002) On the recommending of citations for research papers. In: Proceedings of the 2002 ACM conference on computer supported cooperative work, New Orleans, LA, USA, pp 116–125. https://doi.org/10.1145/587078.587096
3. Sugiyama K, Kan MY (2013) Exploiting potential citation papers in scholarly paper recommendation. In: 13th ACM/IEEE-CS joint conference on digital libraries, Indianapolis, IN, USA, pp 153–162. https://doi.org/10.1145/2467696.2467701
4. Sugiyama K, Kan MY (2015) A comprehensive evaluation of scholarly paper recommendation using potential citation papers. Int J Digit Libr 16:91–109. https://doi.org/10.1007/s00799-014-0122-2
5. Schein AI, Popescul A, Ungar LH, Pennock DM (2002) Methods and metrics for cold-start recommendations. In: Proceedings of the 25th annual international ACM SIGIR conference on research and development in information retrieval, pp 253–260. https://doi.org/10.1145/564376.564421
6. Park YJ, Tuzhilin A (2008) The long tail of recommender systems and how to leverage it. In: Proceedings of the 2008 ACM conference on recommender systems, pp 11–18. https://doi.org/10.1145/1454008.1454012
7. Park ST, Chu W (2009) Pairwise preference regression for cold-start recommendation. In: Proceedings of the third ACM conference on recommender systems, pp 21–28. https://doi.org/10.1145/1639714.1639720
8. Ryan PB, Bridge D (2006) Collaborative recommending using formal concept analysis. Knowledge-Based Systems, pp 309–315. https://doi.org/10.1016/j.knosys.2005.11.017
9. Rashid AM, Karypis G, Riedl J (2008) Learning preferences of new users in recommender systems: an information theoretic approach. ACM SIGKDD Explorations Newsletter, pp 90–100. https://doi.org/10.1145/1540276.1540302
10. Middleton SE, Shadbolt NR, De Roure DC (2004) Ontological user profiling in recommender systems. ACM Trans Inf Syst, pp 54–88. https://doi.org/10.1145/963770.963773
11. Hu R, Pu P (2011) Enhancing collaborative filtering systems with personality information. In: Proceedings of the fifth ACM conference on recommender systems, pp 197–204. https://doi.org/10.1145/2043932.2043969
12. Halder S, Sarkar AMJ, Lee Y-K (2012) Movie recommendation system based on movie swarm. In: Second international conference on cloud and green computing, pp 804–809. https://doi.org/10.1109/cgc.2012.121
13. Guo G (2013) Integrating trust and similarity to ameliorate the data sparsity and cold start for recommender systems. In: Proceedings of the 7th ACM conference on recommender systems, pp 451–454. https://doi.org/10.1145/2507157.2508071
14. Panigrahi S, Rakesh KL, Stitipragyan A (2016) A hybrid distributed collaborative filtering recommender engine using Apache Spark. Procedia Comput Sci, pp 1000–1006. https://doi.org/10.1016/j.procs.2016.04.214
15. Lee MR, Chen TT, Cai YS (2016) Amalgamating social media data and movie recommendation. In: 14th Pacific Rim knowledge acquisition workshop, Phuket, Thailand, pp 141–152. https://doi.org/10.1007/978-3-319-42706-5_11
16. Allioui YE (2017) A novel approach to solve the new user cold-start problem in recommender systems using collaborative filtering. Int J Sci Eng Res, pp 273–281
17. ACM Digital Library. https://dl.acm.org/
Probabilistic Model Using Bayes Theorem Research Paper Recommender System Ankita Nainwal, Deepika Gupta, and Bhaskar Pant
Abstract Uniqueness is inherent in every individual, yet individuals link with one another to form social organizations, which in turn impose trends or norms that influence each member; individuals therefore come to share common likings and behaviors. Analyzing such behavioral patterns leads to recommender systems, which predict and determine the preferences an individual is likely to opt for. Recommender systems have proved significant in e-commerce marketing strategies and also find application in academics, health, entertainment, and more. This paper proposes a probabilistic research paper recommender system using the naive Bayes algorithm. Its originality lies in describing each target paper by attributes such as title, references, citations, and calculated conditional probability, and labeling the resulting instances as recommended or not. Deploying the naive Bayes classifier, the current research is evaluated by generating a confusion matrix and computing accuracy, precision, recall, and F-measure; we achieve an accuracy of 90.47%. Keywords Bayes theorem conditional probability · Evaluation metrics · Hybrid method · Naive Bayes classifier · Research paper recommender system
1 Introduction

The abundance of data increases the complexity users face in selecting relevant information, deducing inferences, and taking decisions thereupon; users find it difficult to organize the data they need. Eliminating this difficulty of irrelevant data that could
influence users' behavioral patterns led to the development of the first recommender system, Tapestry [1], which was based on collaborative filtering and performed numerous operations, such as receiving files and documents, filtering them, and browsing electronic documents, much like our present e-mail accounts. Recommender systems have evolved alongside life itself, with applications ranging from the abiotic to the biotic: a river course that changes its route every hundred years can be seen as following the recommendation of the natural ecosystem, while flocks of birds, ants, sheep and goats, climbing plants, and human arrival, reproduction, and their aftermath are all guided by recommendations of one kind or another suited to their needs. The evolution of technology has likewise transformed recommender systems, from similarity-based approaches to machine learning and more advanced deep learning, enlarging their scope of application to Amazon, Pandora, Netflix, matrimonial sites, social networking sites, trip advisors, and so on. Among these, Amazon excels, having created a virtual space for three million customers through e-commerce [2]. The growing needs of the research sector have similarly motivated recommender systems for research papers, which guide researchers and identify gaps in a searched subject. Searching for research papers often confronts the researcher's inability to personalize the search results. Various proposals for retrieving a required paper range from a list of citations [3] and a list of papers by an author [4] to similarity on partial text [5]; all these approaches build a user profile of interests and recommend items of similar interest to the ones provided. The motivation behind the current research is to resolve the unavailability of research papers due to copyright issues by establishing hidden associations among the input target paper, its references, and the papers that cite the target paper. The approach also caters to trust-based challenges and sparsity issues, respectively, by not building a user profile and by considering non-cited research papers. As elaborated in [6], to qualify for recommendation, a paper must satisfy its candidature: there must be a paper containing the target paper and one of its references, and a paper containing both the candidate paper and the cited target paper. Similarity is then computed between the target paper and the qualified candidate paper, assuming significant co-occurrence, and the top-N similar papers are recommended. In the same spirit, this paper generates hidden associations among research papers; here, however, the occurrence of the reference papers and cited papers of the target paper, rated through Bayes theorem conditional probability, labels the enlisting of the recommended papers. Implementing the naive Bayes algorithm then yields the evaluation of the current research. The remainder of the study is organized as follows: related work is discussed first; a method is then proposed by designing a probabilistic algorithm and evaluating the work; the results are discussed; and the study concludes with directions for future enhancement.
2 Related Works

According to [7], a trust-based recommender system helps suggest items to the user by providing information about the reliability of other users; trust or reputation scores are calculated for each user. However, calculating reputation scores takes time, making the approach time-consuming, and it faces the challenges of unfair rating, bias, and Sybil attacks. The study in [8] provides a method to reduce sparsity, the problem of too few ratings, by introducing a virtual-reality feature that lets items be viewed in detail so that better ratings can be given; two rating methods, virtual rating and physical rating, are used. This method, however, also relies on trust-based rating, which is time-consuming and faces the same challenges of unfair rating, bias, and Sybil attacks. The new user cold start problem has been handled by retrieving information from the user's Facebook account: the information of the user and the user's friends is processed to recommend items preferred and rated by them; however, if the user's Facebook credentials are unavailable, it is hard to determine the user's preferences [9]. A collaborative filtering recommendation based on a Bayesian model [10] builds on both user- and item-based approaches using the naive Bayes classifier, which rests on Bayes theorem. The model uses three techniques: the user-based technique computes the likelihood and prior probability from the ratings given by users; the item-based technique computes them from the ratings received by each item; and the hybrid approach combines both to improve performance. Agarwal et al. [11] put forward a recommender system based on the past searches of other users; it also handles sparsity and scalability through a scalable subspace clustering algorithm. The searches of each researcher and the article data are maintained and mapped into a space, and users with similar past searches, assumed to like similar article topics, are represented by a subspace cluster. However, this system does not deal with the cold start problem, being unable to place a user into a subspace without past access data, and it also faces a serendipity challenge, as it may not recommend papers unrelated to past searches. Content-based filtering leads to over-specialization, whereas collaborative filtering provides serendipity and introduces new items to the user [12]. Collaborative filtering algorithms are of two types: probabilistic approaches (Bayesian networks) and non-probabilistic approaches (nearest-neighbor algorithms); the non-probabilistic approaches must contend with the sparsity issue. The study [13] combines collaborative filtering and a content-based approach into a hybrid based on fuzzy logic. The model aims to reduce the stability/plasticity problem, in which the changing preferences of a user make it difficult to identify the user's interest; since interests are dynamic,
fuzzy logic helps the method consider the different interests of a user as they change over time. Overall, the collaborative filtering approach suffers from the cold start and sparsity issues, and most of the collaborative techniques in use do not cater to them. Recommender systems also face privacy and security issues whenever user information is used. Most recommendation methods that handle sparsity in the collaborative approach rely on trust-based methods, which may not be suitable for such issues and may lead to problems such as shilling attacks.
3 Requisite Hybrid-Based Research Paper Recommendation System

The current research generates hidden associations among research papers using Bayes theorem conditional probability. The conditional probability of the occurrence of the references and citations of the target paper, given that these references and citations appear as titles of research papers in the available dataset, is calculated. Based on the conditional probabilities of both the references and the citations, a potential paper to recommend is evaluated, and this evaluation labels the research paper into the recommendation or non-recommendation list. Further, implementing the naive Bayes theorem on attributes comprising the titles of the research papers, their references, the citations of the title paper, the calculated conditional probability, and the label attribute (recommended or not) generates a confusion matrix, from which precision, recall, F-measure, and accuracy are evaluated.
3.1 Concerned Algorithm

Algorithm: Probabilistic research paper recommendation
Input: a single titled paper (the target paper)
Output: top-N recommendations

1. Retrieve all the references of the input target paper.
   a. Calculate each reference paper's conditional probability given it is titled in the dataset, using Eq. (1).
2. Retrieve all the research papers in which the input paper is cited.
   a. Calculate each cited paper's conditional probability given it is titled in the dataset, using Eq. (1).
3. Based on the calculated values for citations and references, qualify the potential research papers using Eq. (2).
4. Hence, recommend the top-N similar papers.

The conditional probability of the references and the citations is calculated using Bayes theorem. For a given hypothesis H and evidence E, Bayes theorem relates the probability of the hypothesis before seeing the evidence, P(H), to the probability of the hypothesis after seeing the evidence, P(H|E):

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)} \qquad (1)$$
where P(H) is the prior probability, P(H|E) is the posterior probability, and P(E|H)/P(E) is the likelihood ratio. Here, H represents a reference or a citation of the research paper, and E represents the title of the research paper. The qualified research papers are labeled as recommended or not using a threshold of 0.5:

$$f(x) = \begin{cases} \text{Yes}, & \text{if } x \ge 0.5 \\ \text{No}, & \text{if } x < 0.5 \end{cases} \qquad (2)$$
where x is the calculated conditional probability of the references and citations, and f(x) is the label attribute that qualifies the research paper into the yes or no recommendation list. Instances are generated with attributes comprising the research paper title, its references, the papers citing it, the conditional probabilities of both the references and the citations, and the label attribute indicating recommendation or not. Further, applying the naive Bayes classifier, a probabilistic machine learning model for classification tasks that assumes independent attributes, we evaluated the current recommender system study using the confusion matrix.
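A minimal sketch of this evaluation step is given below, assuming the labeled instances are encoded as a numeric feature matrix X (e.g., reference/citation counts plus the calculated conditional probability) with labels y in {No, Yes}; the synthetic data, the Gaussian variant of naive Bayes, and the scikit-learn pipeline are our illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix, classification_report

rng = np.random.default_rng(0)
X = rng.random((84, 3))             # hypothetical: 84 instances, 3 numeric attributes
y = (X[:, 2] >= 0.5).astype(int)    # hypothetical label from the probability column, Eq. (2)

# Fivefold cross-validated predictions with a naive Bayes classifier,
# followed by the confusion matrix and the derived per-class metrics.
pred = cross_val_predict(GaussianNB(), X, y, cv=5)
print(confusion_matrix(y, pred))
print(classification_report(y, pred, target_names=['No', 'Yes']))
```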
3.2 Architecture Design

The architecture of the current research model (Fig. 1) depicts the relations between the references and the citations of the target paper, from which the similarity to the target paper is computed based on conditional probability.
Fig. 1 Proposed architecture for research paper recommender system
3.3 Evaluation Metrics

To evaluate the current study, we applied fivefold cross-validation over the references and citations of the title papers. The performance of the approach is assessed using four evaluation metrics: accuracy, precision, recall, and F-measure.

• Accuracy: Accuracy measures the overall correctness of the classified values.

$$\text{Accuracy} = \frac{\text{True Positive} + \text{True Negative}}{\text{True Positive} + \text{True Negative} + \text{False Positive} + \text{False Negative}} \qquad (3)$$
• Precision: Precision represents the proportion of predicted positives that are actually positive.

$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \qquad (4)$$
• Recall: Recall measures the proportion of actual positives that are correctly identified.

$$\text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \qquad (5)$$
• F-measure: F-measure is the harmonic mean of precision and recall.

$$\text{F-measure} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (6)$$
3.4 Data Acquisition

The current research study is implemented on a dataset built from the ACM Digital Library [14]. We extracted the available research papers and, for every target paper, found the hidden associations between its references and citations using the presented algorithm, thus recommending the top-N potential papers.
3.5 Results and Discussions

Existing studies apply one level of filtering for prediction, whether traditional approaches such as content-based and collaborative filtering or more advanced approaches such as clustering, fuzzy logic, and Bayesian methods. The requisite work, in contrast, applies double filtering: at level one, the study calculates the rating of each research paper, which is then filtered at level two against a preset threshold. This double filtering narrows the data and provides a more accurate recommendation; the inherent novelty of this approach, however, makes a direct comparison with existing studies difficult. To evaluate the current research, we extracted a dataset from the ACM Digital Library, implemented the algorithm using Bayes theorem, and classified the instances as recommended or not. Further, incorporating the naive Bayes classifier, the confusion matrix is generated, from which accuracy, precision, recall, and F-measure are calculated. The results are as follows:

                Precision   Recall   F-Measure   ROC Area
Yes               0.857     0.947      0.900      0.993
No                0.952     0.870      0.909      0.993
Weighted Avg.     0.909     0.905      0.905      0.993

Accuracy: 90.47%
Confusion matrix:

    a    b   <-- classified as
   36    2   a = yes
    6   40   b = no
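As a quick sanity check (ours, not from the paper), the reported figures follow directly from this confusion matrix via Eqs. (3)–(6):

```python
# Counts read off the confusion matrix above, taking 'yes' as the positive class.
tp, fn, fp, tn = 36, 2, 6, 40

accuracy  = (tp + tn) / (tp + tn + fp + fn)                 # Eq. (3)
precision = tp / (tp + fp)                                  # Eq. (4)
recall    = tp / (tp + fn)                                  # Eq. (5)
f_measure = 2 * precision * recall / (precision + recall)   # Eq. (6)

print(f"accuracy  = {accuracy:.4f}")   # 0.9048 (reported as 90.47 %)
print(f"precision = {precision:.3f}")  # 0.857 for class 'yes'
print(f"recall    = {recall:.3f}")     # 0.947
print(f"f_measure = {f_measure:.3f}")  # 0.900
```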
4 Conclusion and Future Scope

The current research identified existing gaps in recommender systems and worked on them. It is substantiated by a probabilistic method that finds the hidden associations between the references and the citations of the target paper. Without developing any user profile, it addresses the trust-based challenges in the recommender system and resolves the sparsity issue for new researchers. More advanced approaches can be incorporated to resolve the remaining open challenges in recommender systems and to enhance their performance.
References

1. Goldberg D, Nichols D, Oki BM, Terry D (1992) Using collaborative filtering to weave an information tapestry. Commun ACM 35(12):61–70. https://doi.org/10.1145/138859.138867
2. Schafer JB, Konstan JA, Riedl J (2001) E-commerce recommendation applications. Data Min Knowl Discov 5:115–153. https://doi.org/10.1023/A:1009804230409
3. McNee SM, Albert I, Cosley D, Gopalkrishnan P, Lam SK, Rashid AM et al (2002) On the recommending of citations for research papers. In: Proceedings of the 2002 ACM conference on computer supported cooperative work, pp 116–125. https://doi.org/10.1145/587078.587096
4. Sugiyama K, Kan MY (2013) Exploiting potential citation papers in scholarly paper recommendation. In: Proceedings of the 13th ACM/IEEE-CS joint conference on digital libraries, pp 153–162. https://doi.org/10.1145/2467696.2467701
5. Sugiyama K, Kan MY (2015) A comprehensive evaluation of scholarly paper recommendation using potential citation papers. Int J Digit Libr 16:91–109. https://doi.org/10.1007/s00799-014-0122-2
6. Haruna K, Ismail MA, Damiasih D, Sutopo J, Herawan T (2017) A collaborative approach for research paper recommender system. PLoS ONE 12(10):e0184516. https://doi.org/10.1371/journal.pone.0184516
7. Ozsoy MG, Polat F (2013) Trust based recommendation systems. In: IEEE/ACM international conference on advances in social networks analysis and mining, pp 1267–1274. https://doi.org/10.1145/2492517.2500276
8. Guo G (2012) Resolving data sparsity and cold start in recommender systems. Springer, pp 361–364. https://doi.org/10.1007/978-3-642-31454-4_36
9. Bedi P, Sharma C, Goel D, Dhanda M (2015) Handling cold start problem in recommender systems by using interaction based social proximity factor. In: IEEE international conference on advances in computing, communications and informatics (ICACCI) 2015, pp 1987–1993. https://doi.org/10.1109/icacci.2015.7275909
10. Valdiviezo-Diaz P, Ortega F, Lara-Cabrera R (2019) A collaborative filtering approach based on Naïve Bayes classifier. IEEE Access. https://doi.org/10.1109/access.2019.2933048
11. Agarwal N, Haque E, Liu H, Parsons L (2005) Research paper recommender systems: a subspace clustering approach. In: International conference on web-age information management, Springer, Berlin, Heidelberg, pp 475–491. https://doi.org/10.1007/11563952_42
12. Schafer JB, Frankowski D, Herlocker J, Sen S (2007) Collaborative filtering recommender systems. In: Brusilovsky P, Kobsa A, Nejdl W (eds) The adaptive web. Lecture notes in computer science, vol 4321. Springer, Berlin, Heidelberg, pp 291–324. https://doi.org/10.1007/978-3-540-72079-9_9
13. Maatallah M, Seridi H (2010) A fuzzy hybrid recommender system. In: IEEE international conference on machine and web intelligence, pp 258–263. https://doi.org/10.1109/icmwi.2010.5648168
14. ACM Digital Library. https://dl.acm.org/
The Role of Big Data Analytics and AI in Smart Manufacturing: An Overview Chu Anh My
Abstract In recent years, smart manufacturing, the core idea of the Fourth Industrial Revolution (Industry 4.0), has gained increasing attention worldwide. Recent advancements in information and manufacturing technologies, such as the Internet of Things (IoT), big data analytics, artificial intelligence (AI), cloud computing, the digital twin, and cyber-physical systems, have motivated its development. This paper presents a comprehensive review of recent publications related to smart manufacturing, especially the particular role of big data analytics and AI in the optimization of process parameters for smart manufacturing shop floors consisting of CNC machines and robots. Keywords Smart manufacturing · Big data analytics · Artificial intelligence · Industrial robot · CNC machine
1 Introduction

Nowadays, emerging big data analytics incorporated with AI presents a significant opportunity to implement smart manufacturing. In practice, most manufacturing industries now deal with increasingly massive datasets generated in short time spans owing to the adoption of IoT, sensors for asset monitoring, weblogs, social media feeds, product and parts tracking, and others (Shukla et al. [1]). In this context, big data analytics incorporated with AI, i.e., the capability of organizations to perform systematic and computational analysis of big datasets, plays a key role in improving the smartness, efficiency, and effectiveness of manufacturing systems. Big data analytics and AI have the potential to transform and advance manufacturing in the future [1, 2]: they can help industries make smarter decisions, such as self-optimizing real-time manufacturing processes, providing greater visibility into operations, and monitoring the real-time status of the manufacturing system. Finally, they can help enterprises improve the flexibility and efficiency of production and services and increase
the productivity and the quality of products. Shukla et al. [1] claimed that big data analytics and AI are critical to the success of Industry 4.0; their benefits for smart manufacturing are enormous, yet their adoption in many organizations is still at a nascent stage. Smart manufacturing is a broad concept in practice, since products are extremely diverse and may be manufactured in totally different ways, e.g., by chemical processes or by machine tools and industrial robots. In this research, "manufacturing" is restricted to product fabrication by machine tools and robots, which are the typical agents of material processing in today's advanced manufacturing systems. From this perspective, the key enablers of smart manufacturing can be identified more specifically as industrial connectivity devices, robots, computer numerical control (CNC) machines, and a big data processing platform. Storing big datasets is not new for these industries, but gathering actionable and manageable insights from the data is often lacking; big data analytics, popularly characterized by the 5Vs (volume, velocity, variety, veracity, and value-adding), addresses exactly this capability. In recent years, an increasing number of research works have been devoted to the development of smart manufacturing, especially the use of big data analytics and AI to develop intelligent algorithms for optimizing the process parameters of smart manufacturing systems and shop floors. In this paper, the role of big data analytics and AI in smart manufacturing is comprehensively reviewed and discussed.
2 The Role of Big Data Analytics and AI in Smart Manufacturing

Figure 1 shows a general systematic diagram of a smart manufacturing shop floor. Based on a big data processing platform, the shop floor is constructed with a robots–machines (equipment) layer, a sensor layer, a network
Fig. 1 A conceptual diagram of a smart manufacturing shop floor
layer, a cognitive layer, and a control layer; the system achieves deep integration of humans, products, a physical space, and a cyberspace. The sensor layer obtains data and information from the equipment layer and transfers them to the cognitive layer via the network layer. The smart characteristics of this cyber-physical model lie mostly in the cognitive layer, which is responsible for analyzing and processing the collected big data; the results are then passed to the control layer to realize intelligent feedback control and optimization of the equipment layer. As this example shows, the management and usage of big manufacturing datasets are critical to the process performance of a smart manufacturing shop floor. As can be seen in Fig. 1, the big data platform is the basic framework of the shop floor: based on this platform, every component of the CPS can be connected and work together in a smart manner. All the manufacturing datasets accumulated over time are stored, tailored, and filtered so that they can be used in further data processing and data mining procedures. Not only the databases collected by the sensor network layer, but also all the useful datasets generated by other activities of product design and development, production and
service, product management, and so on are adopted by the platform itself. For example, data are generated and accumulated when designing and processing a product with computer-aided design, computer-aided manufacturing, and computer-aided process planning methods (CAD/CAM/CAPP). The digital versions of the product design from CAD software, the CL data and NC programs (G-code files) from CAM software, and the process plans and process parameters from CAPP toolboxes are all precious resources not to be wasted in the context of AI and smart manufacturing. The databases of a smart manufacturing shop floor can be categorized into three main groups:

– the data of the machine status;
– the data of the production/service;
– the data of the production management.

The first group includes all the measured and feedback signals of the machine layer, such as the feed rate and spindle speed of a CNC machine, the current and voltage of electric motors, vibration signals, and product images. The production/service data comprise all the datasets collected when designing, making, and providing products to market; the digital drawings, lists of materials, operation plans, G-code programs, process parameters, and tool selections are essential data for smart manufacturing. The last group includes all the datasets generated when managing the phases of production/project management, such as production plans, resource plans, inventory plans, quality control, and subcontracts.

In the literature, various investigations have shown that big data analytics incorporated with AI algorithms, machine learning, and soft computing methods plays a key role in the promising future of smart manufacturing [1–14]. Overviews of smart manufacturing and big data analytics are well documented [1–14]. Shukla et al. [1] summarized scenarios where big data analytics and its applications were used to improve decision making in manufacturing processes. Kuo and Kusiak [2] discussed challenges and future opportunities in manufacturing data research. Babiceanu and Seker [3] surveyed the current status and future outlook of smart manufacturing where big data virtualization is taken into account. Kang et al. [4] identified and reviewed eight key technologies related to smart manufacturing. Xu [5] and Tao et al. [6] emphasized the role of cloud manufacturing and big data in the development of smart manufacturing. Strozzi et al. [7] and Zuehlke [8] reviewed smart factory issues, described changes and challenges, and summarized the experience gained to date in smart manufacturing. Kusiak [9] discussed the origin, current status, and future developments of smart manufacturing and big data analytics. Lee et al. [10] and Monostori et al. [11] focused on cyber-physical system and cyber-physical production system concepts and reviewed current trends toward implementing cyber-physical production systems in the manufacturing industry. Zhou et al. [12] provided a better understanding of past achievements and future trends in smart condition monitoring toward energy efficiency in smart manufacturing. Hiatt et al. [13] reviewed human modeling methods for
human–robot collaboration, one of the most critical issues in smart manufacturing. Sünderhauf et al. [14] discussed the limits and potentials of deep learning for robotics, another essential issue in the regime of smart manufacturing. The general frameworks, layouts, architectures, functionalities, and other theoretical aspects of smart manufacturing and big data analytics have been investigated in numerous studies [15–37]. Wang et al. [15] studied a framework of a smart factory that integrates smart agents (machines and robots) with a big-database feedback and coordination platform. Lu and Xu [16] proposed a generic system architecture for a big data analytics-based cyber-physical system integrated with cloud manufacturing. Cheng et al. [17] investigated a framework of cyber-physical integration in smart factories and discussed the enabling technologies needed to achieve it. Krishnamurthy and Cecil [18] and Lu and Cecil [19] designed IoT-based cyber-physical system frameworks. Song and Moon [20] and Tang et al. [21] presented detailed frameworks of a smart production shop floor consisting of cyber-physical systems integrated with cloud computing resources. Tao et al. [22] and Ding et al. [26] discussed methods and frameworks of big data analytics and digital twin-driven product design, manufacturing, and service. Ferry et al. [23] presented a big data platform for managing cyber-physical system-generated data in cloud manufacturing systems. Woo et al. [37] developed a big data analytics framework for smart manufacturing systems. Other aspects of big data analytics, cyber-physical systems, cloud manufacturing, and the digital twin integrated in a general framework of smart production systems were discussed in [24–36]. Experimental investigations and numerical simulations demonstrating conceptual designs related to smart manufacturing and big data analytics have been carried out as well [16, 18, 37–57]. Lu and Xu [16] demonstrated an information model for a digital twin of a smart roll forming machine. Wang et al. [38], Chang and Wu [41], Morgan and O'Donnell [53], Woo et al. [37], Shin et al. [54, 55], and Chen et al. [57] investigated big data analysis-driven prediction and decision-making processes demonstrated with CNC machining operations; in particular, a cloud platform with a CNC turning operation was demonstrated in [38] for adjusting cutting parameters. Gill and Singh [39], Chen and Feng [42], Dong et al. [49], and Boersch et al. [50] investigated big data-driven optimization of automatic welding processes through experimental datasets. In [39], experimental datasets including the rotational speed, forge load, and tensile strength of an inertia friction welding system were collected when welding pipe joints; the datasets were then used to train an ANFIS model to predict the tensile strength of the weld. Intelligent motion planning and control for industrial robots with experiments were studied in [41, 43]. Levine et al. [52] constructed and validated a deep learning model for controlling a manipulator grasping objects. The use of big manufacturing data to improve the process performance of other manufacturing processes and services has also been investigated, e.g., additive manufacturing [58], logistics service [59], manufacturing resource service [60], and production management [61].
Additionally, there have been investigations concerning the use of big data analytics to design, demonstrate, and validate monitoring and fault diagnosis platforms for smart manufacturing [29, 62–65].
It is clearly seen that most previous investigations concerning smart manufacturing and big data analytics focus on comprehensive reviews of the state of the art and detailed discussions of general framework designs, general approaches, and key technologies necessary to realize actual smart factories [1–37]. Besides, some studies address experimental issues [37, 38, 41, 50, 53–55]; however, these experimental investigations mostly aim to demonstrate a conceptual design or certain characteristics of a big data-driven smart production platform that concentrates on an individual machine (i.e., a single CNC machine or a single robot). Investigating a practical smart shop floor consisting of multiple machines remains challenging. Note that studies on big data-driven optimization of process parameters also exist, for CNC machines [38, 41, 53, 57] and for industrial robots [39, 40, 43, 45, 50, 51]. Chang and Wu [41] presented experimental results for the optimization of a turning process, but only for a mini CNC controller. Morgan and O'Donnell [53] demonstrated a cyber-physical process monitoring system for a CNC turning machine, built on a dataset of vibration and motor current signals. In particular, Wang et al. [38] and Chen et al. [57] used the big data approach to optimize the processing parameters of an intelligent CNC turning machine. However, these investigations mostly emphasize illustrative examples of the big data-driven optimization concept; the construction of such optimization algorithms is based mainly on limited experimental data collected in a few experimental scenarios, not on an online big data processing framework. Self-optimization of process parameters for robots and CNC machines operating on a real-time big manufacturing data domain is still far from reality. Additionally, studies on health monitoring and diagnosis for smart manufacturing systems have gained extensive attention in recent years. Torrisi and Oliveira [62, 63] designed a remote health monitoring system for a cyber-physical CNC machine based on a public IP network; the system works with the vibration signals of a machine. Kumar et al. [56] used hidden Markov models and polynomial regression to estimate the remaining useful life of the cutting tool of a CNC drilling machine; the dataset of tool thrust force and torque for training the models was collected in offline experiments. Though studies have used the big data processing approach to construct health monitoring and diagnosis procedures for smart manufacturing, their outcomes are restricted to general systematic designs and experimental demonstrations; constructing and adopting a health monitoring and diagnosis system for practical production systems remain critical issues.
3 Discussion and Conclusion

As reviewed in the previous section, most of the research works related to advanced manufacturing systems, smart manufacturing, AI, and big data
analytics mainly focus on the general frameworks, approaches, and technologies necessary to enable smart manufacturing systems. There have also been publications presenting experimental results, such as the investigations in [37, 38, 41, 50, 53–55]; nevertheless, these are mostly aimed at demonstrating conceptual designs or certain characteristics of a big data and AI-driven smart production platform concentrating on individual machines and robots. Theoretical developments and experimental investigations, accompanied by numerical and real-time demonstrations, of a smart shop floor consisting of real CNC machines and robots remain challenging. It is also noticeable that a number of recent studies concern big data and AI-driven optimization of process parameters for CNC machines [38, 41, 53, 57] and for industrial robots [39, 40, 43, 45, 50, 51]. However, these investigations mostly emphasize illustrative examples of the big data and AI-driven optimization concept: the constructed AI algorithms are trained on limited experimental data collected in a few experimental scenarios, not on a digital twin layer within an online big data processing framework. Self-optimization of process parameters for robots and CNC machines operating on a digital twin layer embedded in a real-time big manufacturing data domain is still far from reality. In conclusion, although AI and big data analytics hold great potential for improving the smartness of single manufacturing systems as well as smart manufacturing shop floors, real industrial applications are still at the beginning stage. There is an emerging need in today's manufacturing to realize and develop smart manufacturing systems such as smart industrial robots and smart computerized numerical control machines. In the future, when industrial robots and CNC machines are developed as intelligent machines and robots, they will be able to control their processing parameters themselves, and smart manufacturing factories will become one of the important driving forces behind the development of Industry 4.0. Robot and CNC machine products could then be designed and developed with AI controllers relying on a big data platform. The combination of analytical computation models for robot design and analysis [66–76] with AI-driven models is a noticeable research interest; in addition, machine error modeling and other machining optimization approaches for multi-axis CNC machines [77–82] must be taken into account when improving machining efficiency and effectiveness in the context of AI-driven machining optimization.

Funding Statement This research was supported by Vingroup Innovation Foundation (VINIF) in project code VINIF.2019.DA08. This work was also supported by a Research Environment Links grant, ID 528085858, under the Newton Fund partnership. The grant is funded by the UK Department for Business, Energy and Industrial Strategy and delivered by the British Council.
References

1. Shukla N, Tiwari MK, Beydoun G (2019) Next generation smart manufacturing and service systems using big data analytics. Comput Ind Eng 128:905–910. https://doi.org/10.1016/j.cie.2018.12.026
2. Kuo YH, Kusiak A (2018) From data to big data in production research: the past and future trends. Int J Prod Res 1–26. https://doi.org/10.1080/00207543.2018.1443230
3. Babiceanu RF, Seker R (2016) Big data and virtualization for manufacturing cyber-physical systems: a survey of the current status and future outlook. Comput Ind 81:128–137. https://doi.org/10.1016/j.compind.2016.02.004
4. Kang HS, Lee JY, Choi S, Kim H, Park JH, Son JY, Do Noh S (2016) Smart manufacturing: past research, present findings, and future directions. Int J Precis Eng Manuf-Green Technol 3(1):111–128. https://doi.org/10.1007/s40684-016-0015-5
5. Xu X (2012) From cloud computing to cloud manufacturing. Robot Comput-Integr Manuf 28(1):75–86. https://doi.org/10.1016/j.rcim.2011.07.002
6. Tao F, Zhang L, Liu Y, Cheng Y, Wang L, Xu X (2015) Manufacturing service management in cloud manufacturing: overview and future research directions. J Manuf Sci Eng 137(4):040912. https://doi.org/10.1115/1.4030510
7. Strozzi F, Colicchia C, Creazza A, Noè C (2017) Literature review on the 'Smart Factory' concept using bibliometric tools. Int J Prod Res 55(22):6572–6591. https://doi.org/10.1080/00207543.2017.1326643
8. Zuehlke D (2010) Smart factory—towards a factory-of-things. Annu Rev Control 34(1):129–138
9. Kusiak A (2018) Smart manufacturing. Int J Prod Res 56(1–2):508–517. https://doi.org/10.1080/00207543.2017.1351644
10. Lee J, Ardakani HD, Yang S, Bagheri B (2015) Industrial big data analytics and cyber-physical systems for future maintenance and service innovation. Procedia CIRP 38:3–7. https://doi.org/10.1016/j.procir.2015.08.026
11. Monostori L, Kádár B, Bauernhansl T, Kondoh S, Kumara S, Reinhart G, Ueda K (2016) Cyber-physical systems in manufacturing. CIRP Ann 65(2):621–641. https://doi.org/10.1016/j.cirp.2016.06.005
12. Zhou Z, Yao B, Xu W, Wang L (2017) Condition monitoring towards energy-efficient manufacturing: a review. Int J Adv Manuf Technol 91(9–12):3395–3415. https://doi.org/10.1007/s00170-017-0014-x
13. Hiatt LM, Narber C, Bekele E, Khemlani SS, Trafton JG (2017) Human modeling for human–robot collaboration. Int J Robot Res 36(5–7):580–596. https://doi.org/10.1177/0278364917690592
14. Sünderhauf N, Brock O, Scheirer W, Hadsell R, Fox D, Leitner J, Corke P (2018) The limits and potentials of deep learning for robotics. Int J Robot Res 37(4–5):405–420. https://doi.org/10.1177/0278364918770733
15. Wang S, Wan J, Zhang D, Li D, Zhang C (2016) Towards smart factory for Industry 4.0: a self-organized multi-agent system with big data based feedback and coordination. Comput Netw 101:158–168. https://doi.org/10.1016/j.comnet.2015.12.017
16. Lu Y, Xu X (2019) Cloud-based manufacturing equipment and big data analytics to enable on-demand manufacturing services. Robot Comput-Integr Manuf 57:92–102. https://doi.org/10.1016/j.rcim.2018.11.006
17. Cheng Y, Zhang Y, Ji P, Xu W, Zhou Z, Tao F (2018) Cyber-physical integration for moving digital factories forward towards smart manufacturing: a survey. Int J Adv Manuf Technol 1–13. https://doi.org/10.1007/s00170-018-2001-2
18. Krishnamurthy R, Cecil J (2018) A next-generation IoT-based collaborative framework for electronics assembly. Int J Adv Manuf Technol 96(1–4):39–52. https://doi.org/10.1007/s00170-017-1561-x
19.
Lu Y, Cecil J (2016) An internet of things (IoT)-based collaborative framework for advanced manufacturing. Int J Adv Manuf Technol 84(5–8):1141–1152. https://doi.org/10.1007/s00170015-7772-0
The Role of Big Data Analytics and AI in Smart Manufacturing …
919
20. Song Z, Moon Y (2017) Assessing sustainability benefits of cyber manufacturing systems. Int J Adv Manuf Technol 90(5–8):1365–1382. https://doi.org/10.1007/s00170-016-9428-0 21. Tang D, Zheng K, Zhang H, Zhang Z, Sang Z, Zhang T, Vargas-Solar G (2018) Using autonomous intelligence to build a smart shop floor. Int J Adv Manuf Technol 94(5–8):1597– 1606 22. Tao F, Cheng J, Qi Q, Zhang M, Zhang H, Sui F (2018) Digital twin-driven product design, manufacturing and service with big data. Int J Adv Manuf Technol 94(9–12):3563–3576 23. Ferry N, Terrazas G, Kalweit P, Solberg A, Ratchev S, Weinelt D (2017) Towards a big data platform for managing machine generated data in the cloud. In: 2017 IEEE 15th International conference on industrial informatics (INDIN), IEEE, pp 263–270 24. Xiang F, Yin Q, Wang Z, Jiang GZ (2018) Systematic method for big manufacturing data integration and sharing. Int J Adv Manuf Technol 94(9–12):3345–3358 25. Xu X (2017) Machine Tool 4.0 for the new era of manufacturing. Int J Adv Manuf Technol 92(5–8):1893–1900 26. Ding K, Chan FT, Zhang X, Zhou G, Zhang F (2019) Defining a digital twin-based cyberphysical production system for autonomous manufacturing in smart shop floors. Int J Prod Res 1–20 27. Bi Z, Cochran D (2014) Big data analytics with applications. J Manage Anal 1(4):249–265 28. Davis J, Edgar T, Porter J, Bernaden J, Sarli M (2012) Smart manufacturing, manufacturing intelligence and demand-dynamic performance. Comput Chem Eng 47:145–156 29. Deng C, Guo R, Liu C, Zhong RY, Xu X (2018) Data cleansing for energy-saving: a case of Cyber-physical machine tools health monitoring system. Int J Prod Res 56(1–2):1000–1015 30. Huang B, Li C, Yin C, Zhao X (2013) Cloud manufacturing service platform for small-and medium-sized enterprises. Int J Adv Manuf Technol 65(9–12):1261–1272 31. Jain S, Shao G, Shin SJ (2017) Manufacturing data analytics using a virtual factory representation. Int J Prod Res 55(18):5450–5464 32. Song T, Liu H, Wei C, Zhang C (2014) Common engines of cloud manufacturing service platform for SMEs. Int J Adv Manuf Technol 73(1–4):557–569 33. Valilai OF, Houshmand M (2010) INFELT STEP: an integrated and interoperable platform for collaborative CAD/CAPP/CAM/CNC machining systems based on STEP standard. Int J Comput Integr Manuf 23(12):1095–1117 34. Yin Y, Stecke KE, Li D (2018) The evolution of production systems from Industry 2.0 through Industry 4.0. Int J Prod Res 56(1–2):848–861 35. Zhang C, Jiang P, Cheng K, Xu XW, Ma Y (2016) Configuration design of the add-on cyberphysical system with CNC machine tools and its application perspectives. Procedia CIRP 56:360–365 36. Jin J, Liu Y, Ji P, Liu H (2016) Understanding big consumer opinion data for market-driven product design. Int J Prod Res 54(10):3019–3041 37. Woo J, Shin SJ, Seo W, Meilanitasari P (2018) Developing a big data analytics platform for manufacturing systems: architecture, method, and implementation. Int J Adv Manuf Technol 99(9–12):2193–2217 38. Wang Z, Jiao L, Yan P, Wang X, Yi J, Shi X (2018) Research and development of intelligent cutting database cloud platform system. Int J Adv Manuf Technol 94(9–12):3131–3143 39. Gill SS, Singh J (2013) Artificial intelligent modeling to predict tensile strength of inertia friction-welded pipe joints. Int J Adv Manuf Technol 69(9–12):2001–2009 40. Bedaka AK, Vidal J, Lin CY (2019) Automatic robot path integration using three-dimensional vision and offline programming. Int J Adv Manuf Technol 1–16 41. 
Chang WY, Wu SJ (2016) Big data analysis of a mini three-axis CNC machine tool based on the tuning operation of controller parameters. Int J Adv Manuf Technol 1–7 42. Chen B, Feng J (2014) Multisensor information fusion of pulsed GTAW based on improved DS evidence theory. Int J Adv Manuf Technol 71(1–4):91–99 43. Janson L, Ichter B, Pavone M (2018) Deterministic sampling-based motion planning: optimality, complexity, and performance. Int J Robot Res 37(1):46–61
920
C. A. My
44. Kong X, Chang J, Niu M, Huang X, Wang J, Chang SI (2018) Research on real time feature extraction method for complex manufacturing big data. Int J Adv Manuf Technol 99(5–8):1101– 1108 45. Krueger V, Chazoule A, Crosby M, Lasnier A, Pedersen MR, Rovida F, Veiga G (2016) A vertical and cyber–physical integration of cognitive robots in manufacturing. Proc IEEE 104(5):1114–1127 46. Krueger V, Rovida F, Grossmann B, Petrick R, Crosby M, Charzoule A, Veiga G (2019) Testing the vertical and cyber-physical integration of cognitive robots in manufacturing. Robot Comput-Integr Manuf 57:213–229 47. Liu J, Zhou H, Tian G, Liu X, Jing X (2019) Digital twin-based process reuse and evaluation approach for smart process planning. Int J Adv Manuf Technol 100(5–8):1619–1634 48. Ramesh R, Jyothirmai S, Lavanya K (2013) Intelligent automation of design and manufacturing in machine tools using an open architecture motion controller. J Manuf Syst 32(1):248–259 49. Dong H, Cong M, Zhang Y, Liu Y, Chen H (2018) Modeling and real-time prediction for complex welding process based on weld pool. Int J Adv Manuf Technol 1–14 50. Boersch I, Füssel U, Gresch C, Großmann C, Hoffmann B (2016) Data mining in resistance spot welding-A non-destructive method to predict the welding spot diameter by monitoring process parameters. Int J Adv Manuf Technol 99(5–8):1085–1099 51. Levine S, Pastor P, Krizhevsky A, Ibarz J, Quillen D (2018) Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection. Int J Robot Res 37(4– 5):421–436 52. Miao G, Hsieh SJ, Segura JA, Wang JC (2019) Cyber-physical system for thermal stress prevention in 3D printing process. Int J Adv Manuf Technol 100(1–4):553–567 53. Morgan J, O’Donnell GE (2017) Multi-sensor process analysis and performance characterisation in CNC turning—A cyber physical system approach. Int J Adv Manuf Technol 92(1–4):855–868 54. Shin SJ, Woo J, Rachuri S (2014) Predictive analytics model for power consumption in manufacturing. Procedia CIRP 15:153–158 55. Shin SJ, Woo J, Rachuri S (2017) Energy efficiency of milling machining: Component modeling and online optimization of cutting parameters. J Clean Prod 161:12–29 56. Kumar A, Chinnam RB, Tseng F (2019) An HMM and polynomial regression based approach for remaining useful life and health state estimation of cutting tools. Comput Ind Eng 128:1008– 1014 57. Chen J, Yang J, Zhou H, Xiang H, Zhu Z, Li Y, Xu G (2015) CPS modeling of CNC machine tool work processes using an instruction-domain based approach. Engineering 1(2):247–260 58. Mativo T, Fritz C, Fidan I (2018) Cyber acoustic analysis of additively manufactured objects. Int J Adv Manuf Technol 96(1–4):581–586 59. Zhong RY, Xu C, Chen C, Huang GQ (2017) Big data analytics for physical internet-based intelligent manufacturing shop floors. Int J Prod Res 55(9):2610–2621 60. Yuan M, Deng K, Chaovalitwongse WA, Yu H (2018) Research on technologies and application of data mining for cloud manufacturing resource services. Int J Adv Manuf Technol 99(5– 8):1061–1075 61. Fang C, Liu X, Pardalos PM, Pei J (2016) Optimization for a three-stage production system in the internet of things: procurement, production and product recovery, and acquisition. Int J Adv Manuf Technol 83(5–8):689–710 62. Torrisi NM, Oliveira JF (2008) Remote control of CNC machines using the CyberOPC communication system over public networks. Int J Adv Manuf Technol 39(5–6):570–577 63. 
Torrisi NM, de Oliveira JFG (2012) Remote monitoring for high-speed CNC processes over public IP networks using CyberOPC. Int J Adv Manuf Technol 60(1–4):191–200 64. Kumar A, Shankar R, Choudhary A, Thakur LS (2016) A big data MapReduce framework for fault diagnosis in cloud-based manufacturing. Int J Prod Res 54(23):7060–7073 65. Wang J, Ye L, Gao RX, Li C, Zhang L (2018) Digital Twin for rotating machinery fault diagnosis in smart manufacturing. Int J Prod Res 1–15
The Role of Big Data Analytics and AI in Smart Manufacturing …
921
66. My CA, Bien DX, Le CH, Packianather M (2019) An efficient finite element formulation of dynamics for a flexible robot with different type of joints. Mech Mach Theory 134:267–288 67. My CA, Le CH, Packianather M, Bohez EL (2019) Novel robot arm design and implementation for hot forging press automation. Int J Prod Res 57(14):4579–4593. https://doi.org/10.1080/ 00207543.2018.1521026 68. My CA, Parnichkun M (2015) Kinematics performance and structural analysis for the design of a serial-parallel manipulator transferring a billet for a hot extrusion forging process. Int J Adv Rob Syst 12(12):186. https://doi.org/10.5772/62026 69. My CA, Bien DX, Tung BH, Hieu LC, Cong NV, Hieu TV (2019) Inverse kinematic control algorithm for a welding robot-positioner system to trace a 3D complex curve. In: 2019 IEEE international conference on advanced technologies for communications (ATC), pp 319–323. https://doi.org/10.1109/atc.2019.8924540 70. My CA, Makhanov SS, Van NA, Duc VM (2020) Modeling and computation of real-time applied torques and non-holonomic constraint forces/moment, and optimal design of wheels for an autonomous security robot tracking a moving target. Math Comput Simul 170:300–315. https://doi.org/10.1016/j.matcom.2019.11.002 71. My CA et al. (2009) Design and control of a six wheels terrain robot. In: Proceedings of the IFToMM international symposium on robotics and mechatronics, pp 97–103 72. My CA (2009) Mechanical design and dynamics modelling of RoPC robot. In: Proceedings of international symposium on robotics and mechatronics, Hanoi, Vietnam, pp 92–96 73. My CA (2016) Inverse kinematics of a serial-parallel robot used in hot forging process. Vietnam J Mech 38(2):81–88. https://doi.org/10.15625/0866-7136/38/2/5958 74. My CA (2013) Inverse dynamic of a N-links manipulator mounted on a wheeled mobile robot. In: 2013 IEEE International conference on control, automation and information sciences (ICCAIS), pp 164–170. https://doi.org/10.1109/iccais.2013.6720548 75. My CA, Trung VT (2016) Design analysis for a special serial-parallel manipulator transferring billet for hot extrusion forging process. Vietnam J Sci Technol 54(4):545. https://doi.org/10. 15625/0866-708X/54/4/6231 76. My CA, Hoan VM (2019) Kinematic and dynamic analysis of a serial manipulator with local closed loop mechanisms. Vietnam J Mech 41(2):141–155. https://doi.org/10.15625/0866-7136/ 13073 77. My CA, Bohez EL (2016) New algorithm to minimise kinematic tool path errors around 5-axis machining singular points. Int J Prod Res 54(20):5965–5975 78. My CA (2010) Integration of CAM systems into multi-axes computerized numerical control machines. In: 2010 IEEE second international conference on knowledge and systems engineering, pp 119–124. https://doi.org/10.1109/kse.2010.30 79. My CA (2010) Integration of CAM systems into multi-axes computerized numerical control machines. In: IEEE Proceedings of second international conference on knowledge and systems engineering, pp 119–124 80. My CA, Bohez EL (2003) Multi-criteria optimization approach for 5-axis CNC tool path planning: modeling methodology. In: Proceedings of the 3rd asian conference on industrial automation and robotics, pp 5–11 81. My CA, Bohez EL, Makhanov SS, Munlinb M, Phien HN, Tabucanon MT (2009) On 5-axis freeform surface machining optimization: vector field clustering approach. Int J CAD/CAM 5(1) 82. My CA, Bohez ELJ, Makhanov SS (2005) Critical point analysis of 3D vector field for 5-axis tool path optimization. 
In: Proceedings of the 4th asian conference on industrial automation and robotics, ACIAR, pp 11–13
Literature Review: Real Time Water Quality Monitoring and Management Deepika Gupta, Ankita Nainwal, and Bhaskar Pant
Abstract With the advent of this new era of water crisis, save water is the cry all over. Water sources are encroached from every existence on Earth. Saving water needs a systematic monitoring approach to determine its quality. Availability of Internet of Things (IoT) and remote sensing techniques mark the ease of congregating, analyzing and handling of real time data to further accelerate measures taken upon. Real-time water quality monitoring and management initiates prompt alarm ensuring timely response to water contamination in protecting and conserving the aquatic habitat, improving crop production by controlling quality of irrigated water, etc. This paper upheavals the water quality parameters required due consideration for monitoring real time water quality along with the available remote sensors. Also it briefs the review of parameters covered so far. Further it proposes the methodology suitable to the needs of detecting real time water contaminations based on the challenges of existing management system and IoT. Keywords Water quality · Internet of things · Water parameters · Water quality standards · Remote sensors · Water quality management
1 Introduction Increasing natural influences and the impact of anthropogenic activities generate the concern to evaluate the water quality across various services such as clean drinking water, biodiversity preservation, sustainable fisheries, and water for irrigation, etc.
D. Gupta (B) · A. Nainwal · B. Pant Graphic Era Deemed to Be University, Dehradun 248002, India e-mail: [email protected] A. Nainwal e-mail: [email protected] B. Pant e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_88
923
924
D. Gupta et al.
More than half out of 2.17 lakh habitations are prone to excess iron, fluoride, salinity, nitrate and arsenic [1]. Moreover, approximately, 10 million cases of diarrhoea, more than 7.2 lakh typhoid cases and 1.5 lakh viral hepatitis cases occur every year due to contaminated water supply and weak healthy environment. UN-Water Integrated Monitoring Initiative for SDG 6 places sufficient evidence on deteriorating water quality. Even in India, the inauguration of Jal Shakti Ministry particularly focuses on the steps of cleaning water providing clean drinking water to everyone. All these concerns compositely required methodology to work upon. IS 10500 BIS formulated the standards to assess the quality of water resources. Turbidity, pH, conductivity, temperature, Dissolved Oxygen, Dissolved Ammonia, Bio-chemical Oxygen Demand, Chemical Oxygen Demand, nitrates and chloride are the real time monitoring parameters by CPCB. The remaining parameters still follow the traditional methodology in monitoring water quality. Traditional methodology to assess water quality requires collecting data sample from various sources manually, testing and analyzed as per standards in a laboratory. The complete task is laborious and time consuming. Meanwhile, possibility may arise on further deterioration of water quality against analyzed data. Here evolves the scope of harnessing the real time water quality monitoring in the remaining parameters persistently.
2 Water Quality Parameters Water quality parameters are widely divided into physical, chemical and biological. • Physical Parameters—Physical parameter includes turbidity, temperature and conductivity of water. Turbidity measures cloudiness or haziness in water. The turbidity in drinking water should be less than 1 nephelometric turbidity units (NTU) [2]. Temperature governs the type and the kind of various aquatic lives. It also regulates the dissolved oxygen. The conductivity of water measures its capability to pass electric current. The conductivity in drinking water should lies in the range of 0–2500 µS/cm [2]. • Chemical Parameters—Chemical parameter comprises pH, dissolved oxygen and oxidation-reduction potential [3]. pH measures the degree of alkalinity and acidity in a solution [2]. pH 7 indicates neutral solution, pH less than 7 is acidic solution and pH greater than 7 is alkaline solution. pH between 6 and 9 is considered for water distribution systems. Dissolved oxygen measures oxygen dissolved in water. The threshold limit of dissolved oxygen in water is 0.5 mg/L [3]. Oxidation-reduction potential is the measure of cleanliness of water. Higher the oxidation-reduction potential, the better is the quality of water [2]. • Biological Parameters—Biological parameters are more important in terms of the direct effect on human health. Biological parameters include microorganisms like bacteria, viruses, algae, etc. Drinking water must be regularly monitored and treated for biological components to improve water quality.
Literature Review: Real Time Water Quality Monitoring …
925
3 Challenges Complex communication among various stakeholders owing to their corresponding policies and needs [4]. Real time monitoring mostly limits river system [5] with limited water quality parameters such as temperature, dissolved oxygen, pH, conductivity, B.O.D, Nitrate– N + Nitrite–N, Faecal Coliform, Total Coliform [6]. IoT interoperability issues encompassing multi-platform connectivity, technology dependence, system dependence and various standards solving different interoperability impose a challenge to work on [7]. Energy efficient design and complex IoT systems are prone to security and privacy breach [8].
4 The Requisite Flow The requisite flow depicts the wireless sensors employment to sense the contaminated water which could be analyzed as per the set standards and hence alarm the required authorities to take measures thereupon. Sensors collected data is sent to the proprietary software SEMS (Smart Environment Monitor System), which monitors parameter, manages sensors, execute custom queries, manage users, report alarms and other operations. Sensors are deployed on each source as rivers, ponds, lakes, water tanks etc. to sense the parameters composition in the water such as temperature, turbidity, fluoride, chlorine, magnesium, copper, nitrates etc. and collect the data to further send it the controller (Fig. 1).
Fig. 1 Methodology employment to detect water contaminants
926
D. Gupta et al.
Controller and IoT gateway consolidate all sensed data irrespective of network dependency and transmit it to the cloud data repository in an efficient and secured manner using Bluetooth, Wi-Fi, GSM, ZigBee etc. Cloud consists of record capabilities, data centric processing and analysis of the sensed parameters to imitate decision making. The analyzed data thus interfaced with user device application to alarm the real time composition of the water parameters. Working–Sensors such as Libelium [9] deployed at various water sources sensed the parameters composition present in the water and send the data to the controller like Raspberry pi and IoT gateway using protocols Sigfox, IPv6 etc., which collect, consolidate and transmit data further to the cloud data repository. Data is then analyzed as per the set standards for various sources and hence drawn the inferences. These inferences in turn are sent to the user interface to inform the present level of water parameters at concerned sources.
5 Results and Discussion See Table 1.
6 Conclusion The paper holistically reviewed the physical, chemical and biological parameters which are analyzed and smartly monitored using remote sensors. It focuses on the analysis of not smartly monitored parameters and hence put forth the sensors availability to cater them via designing a methodology that can be employed to timely diminish the excess of any contaminant within the permissible limits ascribed by the standards of water usability. It requires low startup and maintenance cost. Hence have great potential in numerous monitoring applications. The reliability of this approach on sensor data is vulnerable to cyber-attacks. Moreover, the complete system dynamism will incorporate the new policies as and when government mandated.
[IS 13428]
Silver
[IS 3025, part 42]
Copper
[IS 3025, part 59]
[IS 3025, part 32]
Chloride
[IS 3025, part 46]
[IS 3025, part 40]
Calcium
Manganese
[IS 3025, part 57]
Boron
Magnesium
[IS 15302]
Barium
[IS 3025, part 60]
[IS 3025, part 34]
Ammonium
[IS 3025, part 53]
[IS 3025, part 34]
Nitrate, Nitrogen
Iron
[IS 3025, part 55]
Aluminium
Fluoride
Methods adopted
Parameters to be monitored
Libelium [9], chalcogenide glass potentiometric sensors
chalcogenide glass potentiometric sensors
Libelium [9]
chalcogenide glass potentiometric sensors
Libelium [9]
Libelium [9], chalcogenide glass potentiometric sensors
Eureka Ion-selective electrodes (ISE’s), Libelium [9]
Eureka Ion-selective electrodes (ISE’s), Libelium [9]
Boroline [10]
Anthracenone Sensor
Eureka Ion-selective electrodes (ISE’s), Libelium [9]
Eureka Ion-selective electrodes (ISE’s), Libelium [9]
Remote Optical Water—O—2311A
Sensors identified
–
–
–
–
–
–
mg/l
mg/l
ppm, mg/l, g/l
arb./p.d.u
mg/l
> 1 µm
Resolution
–
>10–5 M 868/900 MHz
– –
868/900 MHz
– –
>10–5 M
–
0.1
0.1
868/900 MHz
868/900 MHz
0–18,000 mg/l
0–40,000 mg/l
0.1
–
(continued)
Response time < 3 min
–
Response time < 3 min
–
–
5% or 2 mg/l
5% or 2 mg/l
± 1.2%
1.1E + 04
0–1000 ppm
0 − 4.0 × 10−5
1 × 10−5
5% or 2 mg/l
5% or 2 mg/l
2–5 m
Accuracy
0–100 mg/l as nitrogen 0.1
0–100 mg/l as nitrogen 0.1
0–10 m
µm mg/l
Range
Units
Table 1 The sensors that could be employed for the parameters yet to smartly monitor
Literature Review: Real Time Water Quality Monitoring … 927
Methods adopted
[IS 3025, part 24]
[IS 3025, part 29]
[IS 3025, part 49]
[IS 3025, part 41]
[IS 3025, part 27]
[IS 3025, part 47]
[IS 3025, part 48]
[IS 3025, part 2]
[IS 3025, part 54]
[IS 3025, part 37]
[IS 3025, part 52]
USEPA
Parameters to be monitored
Sulphate
Sulphide
Zinc
Cadmium
Cyanide
Lead
Mercury
Molybdenum
Nickel
Arsenic
Chromium
Biological
Table 1 (continued)
nm
ppm
Units
–
nm
nm
TOX Control
Inhibition in percentage
–
–
MFC—based Biosensors µA/mM [11]
1 × 10-5
0–4.0 × 10−5
–
–
–
–
>10–5 M
–
arb./p.d.u
–
–
−10 000– + 10,000 Rad(RI)
–
–
–
2.4E + 04
Response time < 3 min
NA
–
Absolute at 276 nm [9]
1 × 10−9 –
590 nm
1 × 10−9
590 nm
1 × 10−9 –
± 3%
1 × 10−6 –
Accuracy
Resolution
868/900 MHz
600–800 nm
525–650 nm
–
525–650 nm
0–9990 ppm
Range
MFC—based Biosensors µA/mM [11]
Anthracenone Sensor
chalcogenide glass potentiometric sensors
EventLab [12], ppm MFC—based Biosensors [11]
Libelium, chalcogenide glass potentiometric sensors
colorimetric sensor
FRET–sensor, chalcogenide glass potentiometric sensors
MFC—based Biosensors µA/mM [11]
FRET–sensor, Libelium
TDS Sensor
Sensors identified
928 D. Gupta et al.
Literature Review: Real Time Water Quality Monitoring …
929
References 1. Eleventh Five Year plan (2007–12), Planning commission government of India, vol 2 2. Pule M, Yahya A, Chuma J (2017) Wireless sensor networks: a survey on monitoring water Quality. ScienceDirect 15(6):562–570. https://doi.org/10.1016/j.jart.2017.07.004 3. Radhakrishnan V, Wu W (2018) IoT Technology for smart water system. IEEE, pp 1493– 1498. https://doi.org/10.1109/hpcc/smartcity/dss.2018.00246 4. Robles T et al (2014) An internet of things-based model for smart water management, IEEE, pp 821–826. https://doi.org/10.1109/waina.2014.129 5. Cpcb.nic.in (2016) Real time water quality monitoring of river Ganga. http://52.172.40.227: 8992/cr/. Accessed 14 Jul 2014 6. Water Quality Data Year (2016) https://cpcb.nic.in/nwmp-data-2016/. Accessed 28 Aug 2019 7. Noura M, Atiquzzaman M, Gaedke M (2018) Interoperability in internet of things: taxonomies and open challenges. Springer 24(3):796–809. https://doi.org/10.1007/s11036-018-1089-9 8. Sisinni E, Saifullah A, Han S, Jennehag U, Gidlund M (2018) Industrial internet of things: challenges, opportunities, and directions. IEEE X(X):1–11. https://doi.org/10.1109/TII.2018. 2852491 9. Libelium pushes the Water Quality Market ahead with its new Smart Water Xtreme Monitoring Platform (2018). http://www.libelium.com/libelium-pushes-the-water-quality-marketahead-with-its-new-smart-water-xtreme-monitoring-platform/. Accessed 13 Nov 2018 10. Boronline®—Rolls-Royce, (2005). https://www.rolls-royce.com/~/media/Files/R/RollsRoyce/documents/customers/nuclear/UK_Boronline. Accessed 23 Jan 2017 11. Chouler J, Lorenzo MD (2015) Water quality monitoring in developing countries; can microbial fuel cells be the answer? MDPI 5(3):450–470. https://doi.org/10.3390/bios5030450 12. JRC European Union (2013) Review of sensors to monitor water quality. https://doi.org/10. 2788/35499
Development of a Stimulated Model of Smart Manufacturing Using the IoT and Industrial Robot Integrated Production Line Minh D. Tran, Toan H. Tran, Diem T. H. Vu, Thang C. Nguyen, Vi H. Nguyen, and Thanh T. Tran Abstract The paper is to develop a stimulated model of an automatic production line in shoe manufacturing industry by integrating Internet of things (IoT) technology and industrial robots. Firstly, a conceptual design and prototype development of the simulated model is proposed for experimental study. Secondly, a control software in combination with human-machine interface (HMI) for the prototype is developed by using the programmable logic controller(PLC) and a Ardruno micro-controller. Finally, a model integrated system for automatic database management is provided by using the IoT technology. Keywords Smart factory · Automated production line · Industrial robots · IoT
1 Introduction Manufacturing automation increases its role over a variety of industrial fields that contributes hugely to the global economy. The application of automation into all fields of lives and industry which decreases the using of labors, number of equipM. D. Tran · T. H. Tran · D. T. H. Vu · T. C. Nguyen · V. H. Nguyen · T. T. Tran (B) Vietnamese -German University, Thu Dau Mot City, Binh Duong Province, Vietnam e-mail: [email protected] M. D. Tran e-mail: [email protected] T. H. Tran e-mail: [email protected] D. T. H. Vu e-mail: [email protected] T. C. Nguyen e-mail: [email protected] V. H. Nguyen e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_89
931
932
M. D. Tran et al.
ment and making the process more comfortable in comparison to manual, process and systems can also be automated in [1]. Especially, robot-based automatic manufacturing system in [2] is introduced to bring significant deductions of labor cost, operating cost, the number of equipment in the production line. Main features of the automated plant are presented to show the advantages in using a new integrated design and production process in [3, 4]. In recent years, application of internet of things(IoT) and industrial robots for automatic production lines in manufacturing technology plays an important role in improving the Overall Equipment Effectiveness (OEE) in production in [5], [6]. In addition to production activities, the production lines with integrated robots are also used for training companys employees about manufacturing process and operations in [7]. This activity aims to provide them an overview of robot integrated production lines such as how robot arm works, how conveyor, machine and industrial robots are integrated together in a production line. However, training activities usually lead to a disruption and effects to the production operations. Thus, development of a simulated model of an industrial robot integrated production line is necessary and meaningful in support of the training activities for employees in factories. This paper develops a stimulated model of IoT and industrial robot integrated production line which could provide education and training activities for factors’ employees and university students of robot integrated production and IoT technology, and create a knowledge foundation for the application of IoT and robot integrated production line.
2 Model Development of Robot Integrated Production Line In this section, two conveyors and buffing machine are designed and fabricated by using additive manufacturing technology. Design of electrical wiring diagram which is along with selection and reviews electrical component would use in this research. The last content would show the functionality of robot arms, structure assembly and the selection of robot arm components.
2.1 Mechanical Design and Fabrication for the Production Line Model Conveyors play an important role is transporting the objects from the starting point of the model to the picking point of robot arm on the first conveyor which simulates for the vulcanizing process in the assembly line of shoe production line. Figures 1 show the conveyors for the model and Fig. 2 show a buffing machine for the production line.
Development of a Stimulated Model of Smart Manufacturing Using …
933
Fig. 1 Design and fabrication of conveyor model
Fig. 2 Design and fabrication of the buffing machine
The chosen motor of the buffing machine is DC Motor 12 V 775- 150 W with the specification is listed: Power is 150 W, Motor diameter and shaft diameter is 52 mm and 8 mm relatively, voltage norm is 12V DC, Speed is 1500 rpm, Torque is 30 kg. cm. The chosen motor has high speed 1500 rpm. Therefore we select a DC motor controller 15 A which could control the speed of DC motor from 6 to 90 V.
2.2 Electrical Design and Fabrication for the Production Line Model Three common level of operating voltage: 5, 12 and 24 V in this research excepted to 220 V because it could danger the trainees. This three levels of voltage help the trainees having the overview outlook about the power in automation from a tiny
934
M. D. Tran et al.
applications. Our system uses one 12 V–12 A power, one 5 V–15 A power, and two 24 V power, one is used for 24 V sensors and 24 V actuators, and one uses for PLC. Also, three types of sensor: 5 V infrared impedance sensor, 24 V inductive proximity sensor, and the 24 V optical sensor are used to develop the model.
2.3 Robot Arm Design for the Production Line Model In the real production line, the robot arm is responsible for picking—doing the processing placing. The processing the robots do which could be buffing process, priming process or cementing process. Kuka Robot Arm has six-axes controlled by PLC and integrated into the system of automation line with vulcanization machines, heating machine, buffing machine, scanning machines by Azani Supplier which is an Italia Automation Supplier. Structure of the robot arm from the market which includes many mechanical parts such as joints, bearings, fixtures, bolts, and nuts assembled to a robot structure as shown in Fig. 3. The industrial robot arms are generally controlling by a PLC or professional robot arm controllers, but in this model, we choose Arduino Uno to control robot arm because of many reasons mentioned such that Low-cost controller, open source for reference and available in the market, a programming language is ergonomic to the coder, Operating voltage is 5 V. Therefore, we used the Arduino Uno board for the robot controller.
Fig. 3 Design and fabrication of the robot arm
Development of a Stimulated Model of Smart Manufacturing Using …
935
3 System Integration for the Production Line Model This section includes three main parts: the first part is an integration system which would share the necessary step at the planning stage consist of the block diagram of the system and the method of communicating among devices. The second part which is software design, it would be intensely designed and explained about PLC programming, HMI programming, and Robot arm programming. The last part of this chapter is a Database system which would depict about the database system of this model which is used to monitor the data and to supervise from a distance.
3.1 Block Diagram of the System Firstly, the system is starting with a hard switch or HMI button. Then the first sensor on the first conveyor inspects the product on its beginning, based on that to send the command signal to the first conveyor running or not. Consequently, the first conveyor transports the product to the picking position of the robot arm, when the product reaches the picking position of the robot arm the signals is sent to PLC by the optical sensor to stop the first conveyor and start the buffing machine. Concurrently, the robot arm sensor sends the trigger signals to the robot arm to start the robot arm process. The robot arm process is accomplishing; after that, the product is placing on the second conveyor, the inductive proximity sensor on the second conveyor detects the product then transfer signals to PLC which start the second conveyor running and stop the buffing machine for saving energy. Lastly, the product is brought to the end of the second conveyor and restart the new process from the beginning as Fig. 4.
3.2 Software Design of PLC Programming The Programmable logic controller (PLC) Delta DVP-12SE is used for control software. The PLC has eight digital input (X0-X7) and four digital output (Y0-Y3), integrate with USB gate, Ethernet gate, and 2 RS482 gates. PLC DVP-12SE operates with 24 V DC with the program memory is 16,000 steps, and Real-time clock integrated. To create programming for PLC, we write the program based on four types of languages: ladder, function block, statement list, and logic function. In this application, the ladder language is choosing because of its benefit: ergonomic, popular. In the paper, we use Delta HMI DOP-B07E415 which is made by Delta Electronics. Inc. with specification: working power 24 V, 7-inch touch screen HMI, the resolution is 800 × 480 pixel, flash Rom 4 MB, Ethernet, RS232, RS422, RS 485 communication port. Based on the block diagram of the system as the previous discussion in Fig. 4. We write programming for PLC-DVP-12SE by ladder languages on the software of Delta PLC ISPSoft 2.05. All the contacts, outputs and timers used in the program defined.
936
Fig. 4 System block diagram for the production line model
M. D. Tran et al.
Development of a Stimulated Model of Smart Manufacturing Using …
937
3.3 Design of IoT Integrated Data Management System The database system consists of two main aspects: the hardware devices and Icloud. The hardware device which is using in this model that is Wemos ESP8266 modules. The Icloud which is Thingspeak, an Icloud is produced and operating by MathWorks. A proximity sensor is connected to the Wemos D1 ESP8266 board to detect and calculate the output product value, then send that value to Thingspeak Icloud. To create a Thingspeak Icloud channel for this research, we use the email and sign up free an account of Thingspeak.com. In the next step, creating a channel which names Productivity per day to save the database of our system. A Wi-Fi setting where we would connect to the current Wi-Fi in the Manufacturing Department has: ssid is WIFI-H2, and password is 61616161. To declare the Thingspeak apiKey is necessary to define and the server of Thingspeak which the system connect to is api.thingspeak.com. After defining Thingspeak and Wi-Fi setting, the database system is connected to specified Wi-Fi and display connection status by showing Wi-Fi connected on the serial monitor of IDE software
3.4 Assembly of Production Line Model Figure 5 shows a stimulated model of smart manufacturing using the IoT and industrial robot integrated production line and Fig. 6 shows the human-machine interface (HMI) for the production line model.
Fig. 5 Model of the robot integrated production line
938
M. D. Tran et al.
Fig. 6 Human-machine interface (HMI) for the production line mode
4 Testing, Validation, and Evaluation Experimental setup is built to test the operating status of the model as shown in Fig. 5. Then, the data is recorded and analyze the received data to evaluate the reliability of the model. Finally, the result is showing for discussions about the current disturbances of the model and improvements. For experiment, experiment is setup to evaluate the reliability of the model which simulates the real production line with the simulated products. Therefore, the experiment is set up to analyze the behaviors of the model with different cases. In the model, the simulated product shape is square, and the gripper of the Robot Arm is similar to the clip gripper as shown in Fig. 7.
Fig. 7 Gripper and sample for testing
Development of a Stimulated Model of Smart Manufacturing Using …
939
Fig. 8 Thingspeak Icloud recorded experiment results
We decided to test the model based on two placing cases of the products by twenty times of the simulated products pass through to the model from the beginning to the end of the model. Results show that the stimulated model work well with all successful cases that are with activities of pick-up and place of samples in manufacturing process. Figure 8 show the received data of experiments that are recorded and analyzed for decisions in production process in real applications.
5 Conclusions and Future Works A stimulated model of an automated manufacturing system for education and training activities is developed and fabricated by integrating Internet of things (IoT) technology and industrial robots. With the integration of industrial robots, the automated production lines provide a lost-cost and effective process for factory planning in manufacturing activities. On the other hand, the IoT technology is introduced and integrated in automated manufacturing systems in order to provide a solution of smart data management. This also supports managers to analyze database in assisting with decisions in production process. The proposed model provides an professional training solution for technical employees about an automated manufacturing systems in factories. Also, this model is an effective tool for educate graduate students who are working in the fields of automation and IoT application in manufacturing industry. Therefore, the provided model plays an important role in developing the further concepts of smart learning factories and industry 4.0 in near future.
940
M. D. Tran et al.
References 1. Rodic AD (2009) Automation & control: theory and Practice. IntechOpen, New York. https:// doi.org/10.5772/163 2. Rooks WB (1996) Robots bring automation to shoe production. Assembly Automation 16(3):22– 25 3. Cocuzza S, Fornasiero R, Debei S (2013) Novel automated production system for the footwear industry. In: Emmanouilidis C, Taisch M, Kiritsis D (eds) Advances in production management systems. Competitive manufacturing for innovative products and services. APMS 2012. IFIP advances in information and communication technology, vol 397. Springer, Berlin, Heidelberg 4. Nelly Ayllon PNA (2016) Profibus vs. Profinet comparison and migration strategies, pp 1–6 5. Yang H, Kumara S, Bukkapatnam S, Tsung F (2019) The internet of things for smart manufacturing: a review. IIE Transactions, pp 1–35. https://doi.org/10.1080/24725854.2018.1555383 6. Gmez Maureira MA, Oldenhof D, Teernstra L (2014) “ThingSpeak” an API and web service for the internet of things. World Wide Web 7. Pedersen M, Nalpantidis L, Andersen R, Schou C (2019) Robot skills for manufacturing: from concept to industrial deployment. Rob Comput-Integrated Manuf 37
Reinforcement Learning Based Adaptive Optimal Strategy in Robotic Control Systems Phuong Nam Dao
and Hong Quang Nguyen
Abstract This paper considers the application of online adaptive dynamic programming for robotic systems including manipulators and wheeled inverted pendulum (WIP) systems. The sliding mode control technique enable us to implement the control design for reduced order systems and combine with Neural Networks. Both the two control problems are considered in main part of online adaptive reinforcement learning strategy. Finally, the theoretical analysis about the convergence of Actor/Critic as well as the tracking problem and simulation results demonstrate the effectiveness of the two proposed control schemes. Keywords Manipulators · Approximate/adaptive dynamic programming (ADP) · Wheeled inverted pendulum (WIP) · Output feedback control
1 Introduction The motion of a physical systems group such as robotic manipulators, Wheeled Inverted Pendulum (WIP),... can be considered as mechanical systems with dynamic uncertainties, external disturbances [2]. Recently, several control schemes have been considered for manipulators to handle the input saturation disadvantage by integrating the additional terms into the control structure [5]. In this work, a new desired trajectory has been proposed due to the actuator saturation. The additional term would be obtained after taking the derivative of initial Lyapunov candidate function along the state trajectory in presence of actuator saturation [5]. Optimal control solution has the remarkable way that can solve above constraint problems by considering the constraint based optimization [2] and Model predictive control (MPC) is one of the most effective solutions to tackle the these constraint problems for manipulators. P. N. Dao Hanoi University of Science and Technology, Hanoi, Vietnam H. Q. Nguyen (B) Thainguyen University of Technology, Thainguyen, Vietnam e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_90
941
942
P. N. Dao and H. Q. Nguyen
The optimal control algorithm has been mentioned in the work of [2] after using classical nonlinear control law. However, the online computation technique has not considered yet in [2]. Furthermore, it is difficult to find the explicit solution of Ricatti equation and partial differential HJB (Hamilton-Jacobi-Bellman) equation in general nonlinear systems. A WIP system with the under-actuated description described as an inverted pendulum mounted on a mobile platform subjected to nonholonomic constraints. Due to the number of control inputs is less than the number of the controlled degree of freedom, it is hard to employ the classical nonlinear control for WIP systems. The fact is that the mobile platform cannot be separately controlled from the tilt angle of inverted pendulum dynamics. Because of the presence of unstable balance and uncertainties in WIP systems lead to the disadvantages in implementing the control system. The authors in [4] established the sliding mode control (SMC) technique to obtain the convergence of error after two stages with appropriate sliding surface. The authors in [8] pointed out the optimal control algorithms being considered as the appropriate solution for state variables constraint. However, solving the explicit solution of Ricatti equation in linear systems and partial differential HJB (Hamilton-Jacobi-Bellman) equation in general case are the necessary step in optimal control systems and it is hard to solve them [6]. This work has been extent to output feedback based adaptive dynamic programming algorithm by adding more the equivalent observer [3] with the approach of using Kronecker product for the case of linear systems. Inspired by the above works and analysis from traditional nonlinear control technique to optimal control strategy, the work focus on the frame of online adaptive reinforcement learning for manipulators and nonlinear control with main contribution in control design for 2 types of robotics including WIP and Manipulators.
2 Dynamic Model of Robot Manipulator Consider the robot manipulator systems without constraint force described by the following dynamic equation: M (η)η¨ + C(η, η) ˙ η˙ + G(η) + F(η) ˙ + d (t) = τ (t)
(1)
Property 01: The inertia symmetric matrix M (η) is positive definite, and satisfies ∀ξ ∈ Rn : (2) aξ 2 ≤ ξ T M (η)ξ ≤ b(η)ξ 2 ˙ (η) − 2C(η, η)ξ ˙ =0 ξ T (M
(3)
where a ∈ R is a positive constant, b(η) ∈ R is a positive function with respect to η. Several following assumptions will be employed in considering the stability later.
Reinforcement Learning Based Adaptive Optimal …
943
Assumption 1 If η(t), η(t) ˙ ∈ L∞ , then all these functions C(η, η), ˙ F(η), ˙ G(η) and the first, second partial derivatives of all functions of M (η), C(η, η), ˙ G(η) with respect to η(t) as well as of the elements of C(η, η), ˙ F(η) ˙ with respect to η(t) ˙ exist and are bounded. Assumption 2 The desired trajectory ηd (t) as well as the first, second, third and fourth time derivatives of it exist and are bounded. Assumption 3 The vector of external disturbance term d (t) and the derivatives with respect to time of d (t) are bounded by known constants. Let us consider the following sliding surface: M = s (t) = e˙ 1 + λ1 e1 = 0, e1 = ηref − η, λ1 ∈ Rnxn > 0
(4)
According to (1) and (4), it can be seen that: M s˙ = −Cs − τ + f + d
(5)
where f (η, η, ˙ ηr ef , η˙ r ef , η¨ r ef ) is nonlinear function defined: f = M (η¨ ref + α1 e˙ 1 ) + C(η˙ ref + α1 e1 ) + G + F
(6)
3 ADP Strategy for Robotic Systems 3.1 Adaptive Reinforcement Learning Based Optimal Control Design for Manipulators Assume that the dynamic model of robot manipulator is known, the control input can be designed as τ =f +d −u (7) where the term u is designed by using optimal control algorithm and the remaining term f + d will be estimated later. Therefore, it can be seen that M s˙ = −Cs + u
(8)
According to (4) and (8), we obtain the following time-varying model x˙ =
−M (ηref
−λ1 e1 + s 0n×n + u M −1 − e1 )−1 C(ηref − e1 , η˙ ref + λ1 e1 − s)s
(9)
944
P. N. Dao and H. Q. Nguyen
where x = [e1T , sT ]T and the infinite horizon cost function to be minimized is ∞ J (x, u) =
1 T 1 x Qx + uT Ru dt 2 2
(10)
0
where Q ∈ R2n×2n and R ∈ Rn×n are positive definite symmetric matrices. However, in order to deal with the problem of tracking control, some additional states are given. This work leads us to avoid the non-autonomous systems. Subsequently, the adaptive reinforcement learning is considered to find optimal control solution for autonomous affine state-space model with the assumption that the desired trajectory ηref (t) satisfies η˙ ref (t) = f ref (ηref ): X˙ = A(X ) + B(X )u
(11)
where X = [xT , ηrefT , η˙ refT ]T ⎡ ⎢−M (ηref A(X ) = ⎢ ⎣
⎤ ⎡ ⎤ −λ1 e1 + s 0n×n − e1 )−1 C(ηref − e1 , η˙ ref + λ1 e1 − s)s⎥ ⎥ , B(X ) = ⎣ M −1 ⎦ ⎦ f ref (ηref ) 02n×n f˙ ref (ηref )
Define the new infinite horizon integral cost function to be minimized is ∞ J (X , u) =
1 T 1 T X QT X + u Ru d τ 2 2
(12)
t
Q0 QT = . 0 0
where
(13)
The simultaneous learning based online solution is considered by using Neural networks to represent the optimal cost function and the equivalent optimal controller [1]: (14) V (X ) = W T ψ(X ) + v (X ), 1 u (X ) = − R−1 BT (X ) 2 ∗
∂ψ ∂x
T
W+
∂ευ (x) ∂x
T (15)
where W ∈ RN is vector of unknown ideal NN weights, N is the number of neurons, ψ(X ) ∈ RN is a smooth NN activation function, v (X ) ∈ R is the function reconstruction error.
Reinforcement Learning Based Adaptive Optimal …
945
In [1], the Weierstrass approximation theorem us to uniformly approxi enables ∂ V ∗ (X ) ∂ευ (x) ∗ → 0 as N → ∞. Consider mate not only V (X ) but also ∂X with ευ (x) , ∂x to fix the number N , the critic Vˆ (X ) and the actor uˆ (X ) are employed to approximate the optimal cost function and the optimal controller as: Vˆ (X ) = Wˆ cT ψ(X )
(16)
∂ψ T 1 u (X ) = − R−1 BT (X ) Wa 2 ∂x
(17)
The adaptation laws of critic Wˆ c and actor Wˆ a weights are simultaneously implemented to minimize the integral squared Bellman error and the squared Bellman error δhjb , respectively.
δhjb
∂ Vˆ = Hˆ X , uˆ , ∂X
∂V ∗ − H ∗ X , u∗ , ∂X
(18)
1 1 = Wˆ cT σ + X T QT X + uˆ T Rˆu 2 2 (A + Bˆu) is the critic regression vector. where σ (X , uˆ ) = ∂ψ ∂x Similar to the work in [1], the adaptation law of Critic weights is given: σ d ˆ δhjb Wc = −kc λ dt 1 + νσ T λσ
(19)
where ν, kc ∈ R are constant positive gains, and λ ∈ RN ×N is a symmetric estimated gain matrix computed as follows λσ T d λ = −kc λ λ; λ(ts+ ) = λ(0) = ϕ0 I dt 1 + νσ T Ψ σ
(20)
where ts+ is resetting time satisfying αmin {λ (t)} ≤ ϕ1 , ϕ0 > ϕ1 . It can be seen that ensure λ(t) is positive definite and prevent the covariance wind-up problem [1]. ϕ1 I ≤ λ(t) ≤ ϕ0 I
(21)
Moreover, the actor adaptation law can be described as: ∂ψ −1 T ∂ψ T ˆ ka1 d ˆ BR B (Wa − Wˆ c )δhjb − ka2 (Wˆ a − Wˆ c ) Wa = proj − √ dt ∂x 1 + σ T σ ∂x (22)
946
P. N. Dao and H. Q. Nguyen
Consequently, the control design (7) is completed by implementing the estimation of = f + d , which is designed based on the Robust Integral of the Sign of the Error (RISE) framework [2] as follows: (t) = (ks + 1)s(t) − (ks + 1)s(0) + ρ(t)
(23)
where ρ(t) ∈ Rn is computed by the following equation: d ρ = (ks + 1)λ2 s + γ1 sgn(s) dt
(24)
and ks ∈ R is a positive control gain, γ1 ∈ R is a positive control gain selected satisfying the sufficient condition as: γ1 > ζ1 +
1 ζ2 . λ2
(25)
Remark 1 In early works [2], the optimal control design was considered for uncertain mechanical systems with the RISE framework. The work in [2] was extent by integrating adaptive reinforcement learning in the trajectory tracking problem.
3.2 Adaptive Optimal Output Feedback Controller for WIP The state variables xk and output y¯ k−1,k−N can be reconstructed as follows [3, 7]
y¯ k−1,k−N
xk = ANd xk−N + V (N ) u¯ k−1,k−N = U (N ) xk−N + T (N ) u¯ k−1,k−N
(26)
where Classical VI algorithm will be designed as: −1 Pj+1 = ATd Pj Ad + C T Qd C − ATd Pj Bd Rd + BdT Pj Bd BdT Pj Ad
(27)
−1 Kj+1 = Rd + BdT Pj Bd BdT Pj+1 Ad
(28)
We define Hj =
H 11 H 12 j T j Hj12 Hj22
=
BdT Pj Bd BdT Pj Ad ATd Pj Bd ATd Pj Ad
(29)
Reinforcement Learning Based Adaptive Optimal …
H¯ j =
H¯ j11 H¯ j12 BdT Pj Ad BdT Pj Bd T = T ATd Pj Bd T ATd Pj Ad H¯ j12 H¯ j22
947
(30)
Algorithm 1 VI Output Feedback ADP Select a sufficient small constant ρ < 0 ¯ ¯ Employ initial control law uk on the time interval 0, k0,0 .j ← 0 Hj ← 0, Kj ← 0 an arbitrary while H¯ j − H¯ j−1 > ρ do j Employ uk = −K¯ j zk + ek on kj,0 , kj,s Solve H¯ j+1 from (32) −1 11 12 H¯ j+1 K¯ j+1 ← R + H¯ j+1 j ←j+1 end while
From (27), we have j T Qd yk+1 = −φk+1 + (ψk )T vecs H¯ j+1 yk+1
(31)
where −1 Υ Pj = ATd Pj Ad − ATd Pj Bd R + BdT Pj Bd BdT Pj Ad T −1 j T R + H¯ j11 φ = zk+1 H¯ j12 zk+1 H¯ j22 − H¯ j12 k+1
ψk = vecv
ukT zkT
T
Equation (31) can be expressed as ψjV vecs H¯ j+1 = φjV
(32)
where φjV = ykTj,1 Qd ykj,1
ψjV = ψkj,0 , ψkj,0 , . . . , ψkj,s j j + φk,1 , . . . , ykTj,s+1 Qd ykj,s+1 + φk,s+1
Remark 2 It should be noted that the difference between the two proposed solution as follows. By contrast to the algorithm in 3. 1 with learning process described via (20), (22), the algorithm 1 is evaluated as off-policy and output feedback.
948
P. N. Dao and H. Q. Nguyen
4 Simulation Results In this section, to verify the effectiveness of the proposed tracking control algorithm, the simulation is carried out by a 2-DOF planar robot manipulator system, which is modeled by Euler-Lagrange formulation (1). In the case of 2-DOF planar robot manipulator systems (n = 2), the above matrices in (1) can be represented as follows: 1 + 22 cos η2 3 + 2 cos η2 cos η1 + 5 cos(η1 + η2 ) , G(η) = 4 5 cos(η1 + η2 ) 3 + 2 cos η2 3 −2 sin η2 η˙ 2 −2 sin η2 (η˙ 1 + η˙ 2 ) (33) C(η, η) ˙ = 2 sin η2 η˙ 1 0 M (η) =
where i , i = 1...5 are constant parameters depending on mechanical parameters and gravitational acceleration. In this simulation, these constant parameters are chosen as 1 = 5, 2 = 1, 3 = 1, 4 = 1.2g, 5 = g. T The time-varying desired reference signal is defined as ηd = 3sin(t) 3cos(t) T T and we imply that: η˙ d = 3cos(t) − 3sin(t) , η¨ d = −3sin(t) − 3cos(t) The positive definite symmetric matrices in cost function (10) are: ⎡
40 ⎢2 Q=⎢ ⎣−4 4
2 40 4 −6
−4 4 4 0
⎤ 4 −6⎥ ⎥ , R = 0.25 0 0⎦ 0 0.25 4
The design parameters in sliding variable (4) are selected as
15.6 10.6 λ1 = 10.6 10.4
The remaining control gains in RISE framework (23), (24), (25) are chosen as
60 0 140 0 λ2 = , ks = , γ1 = 5 0 35 0 20 and the gains in Actor-Critic learning laws are selected as kc = 800, ν = 1, ka1 = 0.01, ka2 = 1, On the other hand, according to [2], the consideration of V in (14) can be calculated precisely as V = 2x12 − 4x1 x2 + 3x22 + 2.5x32 + x32 cos(η2 ) + x3 x4 + x3 x4 cos(η2 ) + 0.5x42 (34)
Reinforcement Learning Based Adaptive Optimal …
949
Although we can choose arbitrary ψ(X ) in (14), to facilitate later comparison between result from experiences and result in (34), and the ψ(X ) was considered as T ψ(X ) = x12 x1 x2 x22 x32 x32 cos(η2 ) x3 x4 x3 x4 cos(η2 ) x42 and according to (34), exact value of Wˆ c in (16) and Wˆ a in (17) are Wˆ c = 2 −4 3 2.5 1 1 1 0.5 Wˆ a = 2 −4 3 2.5 1 1 1 0.5
(35)
In the simulation, the covariance matrix is initialized as Ψ (0) = diag 100 300 300 1 1 1 1 1 The simulation results is displayed in Fig. 1 depicting the tracking problem. It is clear that the problem of tracking was satisfied after only about 2.5 times through Fig. 1. The highest error which is approximately 0.05 is a acceptable result although the time of convergence is still high. These results proved the correctness of the algorithm. The offline simulations results based on Output Feedback—ADP are shown in Figs. 1, 2 and 3, which determine the convergence of Actor and Critic part, the tracking problem of a WIP, and the control input based on VI technique. Furthermore, the effectiveness of SMC based Output Feedback ADP was also shown in Figs. 4 and 5.
Fig. 1 System states q(t) and its references qd (t) with persistently excited input for the first 100 times
4 q q
3
1 2
qd1 q
2
d2
1 0 -1 -2 -3 -4
0
5
10
15
20
25
30
950
Fig. 2 Convergence matrices P, K in PI algorithm Fig. 3 The output of the system
Fig. 4 Trajectories of state variables under Algorithm 2
P. N. Dao and H. Q. Nguyen
Reinforcement Learning Based Adaptive Optimal …
951
Fig. 5 Convergence matrices under Algorithm 2
5 Conclusion This paper presented two solutions including the application of online adaptive output feedback optimal control scheme for WIP and SMC based ADP for tracking control problem of a manipulator system. The theoretical analysis and simulation results shown the convergence of weights and high effectiveness of proposed algorithm. Future work of this online optimal technique will be conducted in experimental validation. Acknowledgements This research was supported by Research Foundation funded by Thai Nguyen University of Technology.
References 1. Bhasin S, Kamalapurkar R, Johnson M, Vamvoudakis KG, Lewis FL, Dixon WE (2013) A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica 49(1):82–92 2. Dupree K, Patre PM, Wilcox ZD, Dixon WE (2011) Asymptotic optimal control of uncertain nonlinear Euler-Lagrange systems. Automatica 47(1):99–107 3. Gao W, Jiang Y, Jiang ZP, Chai T (2016) Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming. Automatica 72:37–45 4. Guo ZQ, Xu JX, Lee TH (2014) Design and implementation of a new sliding mode controller on an underactuated wheeled inverted pendulum. J Franklin Inst 351:2261–2282 5. Hu X, Wei X, Zhang H, Han J, Liu X (2019) Robust adaptive tracking control for a class of mechanical systems with unknown disturbances under actuator saturation. Int J Robust Nonlinear Control 29(6):1893–1908 6. Jiang Y, Jiang ZP (2012) Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica 48(10):2699–2704 7. Lewis FL, Vamvoudakis KG (2011) Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data. IEEE Trans Syst Man Cybern Part B 41(1):14–25 8. Li Z, Zhang Y (2010) Robust adaptive motion/force control for wheeled inverted pendulums. Automatica 46(8):1346–1353
Energy Efficient Cluster Head Selection for Wireless Sensor Network Using Fuzzy Logic Kusum Lata Jain, Shivani Gupta, and Smarnika Mohapatra
Abstract Many routing protocols are proposed and used for wireless sensor networks that use clustering for load sharing in the network. This is a significant technique which extends the network lifetime. Dynamic or topology makes cluster head selection a difficult task in the network. The following paper proposed a fuzzy logicbased cluster head selection method and compared with LEACH routing protocol which considered as the benchmark algorithm for the cluster-based routing algorithm for WSN. Keywords WSN · Fuzzy-based system · Network lifetime · Routing algorithm · Clustering
1 Introduction A WSN sensed the physical environment for some specific phenomena, gathered the data using resource scare nodes, and communicate the data to the base station (BS). Base station is resourceful and performs analysis to decision making for the environment [1]. WSNs are deployed in human unattended physical environments [2] that make the availability of energy only by some unconventional way as solar energy or vibration apart from battery connected with them [12]. This results in energy efficient methods in system design and operation of WSNs. On the basis of understanding gained through the study of existing protocols, most energy efficient protocols use hierarchical method for one or other basis [13]. Clustering is one of K. L. Jain (B) · S. Gupta Department of Computer & Communication Engineering, Manipal University Jaipur, Jaipur, India e-mail: [email protected] S. Gupta e-mail: [email protected] S. Mohapatra Poornima University, Jaipur, India e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_91
In this paper, a fuzzy logic-based cluster head selection algorithm is proposed. An environment with n sensors sensing physical values is considered, together with a resourceful base station that collects and analyzes the sensed data. The base station is deployed at the boundary of the network. A sensor node consumes energy for data transmission and reception, and the energy required also depends on the distance between the transmitting and receiving nodes. The nodes in the network are organized in clusters. Each cluster contains a leader, called the cluster head (CH); the member nodes are called cluster members (CMs). All nodes have sensing capabilities, but the CH performs additional responsibilities in terms of communication with the cluster members as well as with the BS. A cluster head receives information from all CMs, aggregates it, and sends it to the BS for further processing; the cluster head thus eliminates multiple transmissions of the same data from different CMs. By this localization, a node need not send data directly to the BS, which also reduces energy consumption. A cluster member selects a cluster head on the basis of distance: the connection is established only if the distance between the CH and the CM is shorter than the CM's distance to any other CH in the network. In the proposed work, CM-to-CH, CH-to-base-station, base-station-to-node, and node-to-BS communication are all possible. CHs change over time in order to share the load. Each node can sense an area called its sensing range; in the proposed work, the sensing range is a circle of radius r in the two-dimensional plane with the node at its center [11, 12, 14]. Many clustering algorithms for WSNs are proposed in [11–15]. The rest of the paper is organized as follows: Section 2 describes the proposed algorithm; Section 3 presents the cluster head selection method; Section 4 gives the network parameters for simulation; Section 5 discusses the simulation results and comparison.
2 Proposed Algorithm The algorithm performs its task in rounds. Each round comprises CH identification, cluster formation, sensing, and transmission, and broadly has two phases: an initial phase and a sensing phase. The initial phase uses the fuzzy cost to select cluster heads from the eligible sensor nodes and completes the cluster formation by broadcasting a join request to the other nodes. A node that wants to become a member of a cluster replies to the join request, and the cluster head sends an acknowledgment as confirmation. The base station maintains a counter for measuring the probability (P) of each node in every round. The algorithm deals with cluster head selection as follows. In the first round, cluster head selection is random, since after deployment all nodes have the same energy and the same probability of becoming a cluster head. For the other rounds, the base station calculates the fuzzy cost on the basis of three parameters: probability (P), degree of coverage (C-mean), and residual energy (Er). A higher fuzzy cost makes a node eligible to become a cluster head. For the simulation, a sensing phase is added, in which the sensor nodes collect data from the environment and transmit them to the CHs. The CHs use an aggregation
algorithm to aggregate the received data and send them to the base station. At the end of the round, each non-CH node calculates its residual energy (Er) and its degree of coverage (C-mean) from the radio energy received from neighboring nodes, and reports them to the base station via its cluster head if an unusual change from the previous round has occurred. The resourceful base station thus maintains a set of three values for each node. After receiving this information, the base station selects a predefined number of CHs on the basis of the fuzzy cost.
3 Cluster Head Selection The fuzzy-based algorithm uses probability, degree of coverage, and residual energy for cluster head selection. The degree of coverage is defined in terms of the C-mean and is calculated for each point of the network area; the C-mean value is 1 at a point if the sensing range of a sensor covers that point. Residual energy is the current energy of the node. Probability is defined in terms of rounds: with a probability of 0.5 of becoming a cluster head, a node cannot be a cluster head again for the next five rounds. Initially, the cluster heads are selected randomly; from the second round, cluster head selection is done on the basis of the fuzzy cost. The first-order radio model, similar to the one presented in LEACH [4], is used for the transmission and reception energy, as sketched below.
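As a concrete illustration, the following Python sketch implements a LEACH-style first-order radio model. The electronics and aggregation costs follow Table 2; the amplifier coefficient EPS_AMP and the free-space d² loss term are assumptions, since the paper does not state them.

# Sketch of the first-order radio energy model (LEACH-style; see hedges above).
E_ELEC = 50e-9     # J/bit, electronics cost for transmitting or receiving (Table 2)
E_DA = 5e-9        # J/bit, aggregation cost at the cluster head (Table 2)
EPS_AMP = 100e-12  # J/bit/m^2, amplifier coefficient (assumed, as in LEACH)

def tx_energy(k_bits, d):
    # Energy to transmit k_bits over distance d (free-space d^2 loss assumed).
    return E_ELEC * k_bits + EPS_AMP * k_bits * d ** 2

def rx_energy(k_bits):
    # Energy to receive k_bits.
    return E_ELEC * k_bits

def ch_round_energy(k_bits, n_members, d_to_bs):
    # One round at a cluster head: receive from all members, aggregate
    # every packet (including its own), and forward one packet to the BS.
    return (n_members * rx_energy(k_bits)
            + (n_members + 1) * E_DA * k_bits
            + tx_energy(k_bits, d_to_bs))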
3.1 Fuzzy Logic-Based System [5] The Mamdani method [6] of fuzzy logic inference is used in the simulation of the proposed algorithm. It comprises four steps: fuzzification, rule evaluation, aggregation, and defuzzification. Fuzzification is the function that maps the system values to fuzzy input sets. Rule evaluation takes the fuzzified input values and evaluates them according to the fuzzy rules. Aggregation unifies the outputs of the rules. Defuzzification transforms the fuzzy set obtained by the inference engine into a single crisp value. MATLAB is used for the simulation. The output of the fuzzy system, the fuzzy cost, is obtained using "if-then" rules, and the range of the input and output values is 0–1. Three values of each node are used as inputs to the FIS and converted into the linguistic variable sets used by the FIS:
Residual energy = {Excellent, Good, Poor}
C-mean = {Low, Medium, High}
Probability = {Higher, Middle, Lower}
Table 1 shows the mapping rules from the input fuzzy sets for C-mean, residual energy, and probability to the fuzzy cost.
Table 1 Fuzzy mapping rules

Mapping rule | C-mean | Residual energy | Probability | Fuzzy cost
1  | Low    | Poor      | Lower  | Very poor
2  | Low    | Poor      | Middle | Poor
3  | Low    | Poor      | Higher | Poor
4  | Low    | Good      | Lower  | Poor
5  | Low    | Good      | Middle | Poor
6  | Low    | Good      | Higher | Poor
7  | Low    | Excellent | Lower  | Poor
8  | Low    | Excellent | Middle | Good
9  | Low    | Excellent | Higher | Good
10 | Medium | Poor      | Lower  | Poor
11 | Medium | Poor      | Middle | Good
12 | Medium | Poor      | Higher | Good
13 | Medium | Good      | Lower  | Poor
14 | Medium | Good      | Middle | Very good
15 | Medium | Good      | Higher | Very good
16 | Medium | Excellent | Lower  | Poor
17 | Medium | Excellent | Middle | Very good
18 | Medium | Excellent | Higher | Very good
19 | High   | Poor      | Lower  | Poor
20 | High   | Poor      | Middle | Good
21 | High   | Poor      | Higher | Good
22 | High   | Good      | Lower  | Good
23 | High   | Good      | Middle | Very good
24 | High   | Good      | Higher | Very good
25 | High   | Excellent | Lower  | Poor
26 | High   | Excellent | Middle | Very good
27 | High   | Excellent | Higher | Excellent
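To make the inference concrete, the sketch below evaluates a Mamdani-style fuzzy cost for a single node in Python. The triangular membership breakpoints, the crisp output levels, and the weighted-average defuzzification over representative levels are assumptions for illustration (the paper's actual value ranges are in Table 3, and MATLAB's FIS performs full centroid defuzzification); only three of the 27 rules of Table 1 are encoded here.

def tri(x, a, b, c):
    # Triangular membership function with feet a, c and peak b.
    return max(min((x - a) / (b - a + 1e-12), (c - x) / (c - b + 1e-12)), 0.0)

def fuzzify(x):
    # Three-term partition of a normalized input in [0, 1] (assumed shapes).
    return {"low": tri(x, -0.5, 0.0, 0.5),
            "mid": tri(x, 0.0, 0.5, 1.0),
            "high": tri(x, 0.5, 1.0, 1.5)}

# Representative crisp levels for the five fuzzy-cost terms (assumed).
COST = {"very poor": 0.1, "poor": 0.3, "good": 0.6,
        "very good": 0.8, "excellent": 1.0}

# A few rules from Table 1: (C-mean, residual energy, probability) -> cost.
RULES = [("low", "low", "low", "very poor"),     # rule 1
         ("mid", "mid", "mid", "very good"),     # rule 14
         ("high", "high", "high", "excellent")]  # rule 27

def fuzzy_cost(c_mean, energy, prob):
    c, e, p = fuzzify(c_mean), fuzzify(energy), fuzzify(prob)
    num = den = 0.0
    for rc, re, rp, out in RULES:
        w = min(c[rc], e[re], p[rp])  # Mamdani min for the firing strength
        num += w * COST[out]
        den += w
    return num / den if den > 0 else 0.0

print(fuzzy_cost(0.8, 0.9, 0.9))  # a well-covered, high-energy node scores high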
3.2 Pseudo Code for Algorithm
Start
1: Initialize probability (p), number of nodes (n), and the maximum number of cluster members in a cluster (MaxCMinCL);
2: Einit(i) = E0, i = 1, 2, …, n; // all nodes have the same initial energy

(I) INITIAL PHASE
1: IF Round == First THEN
      Base station selects CHs randomly;
      SendToBS(IDi, (Xi, Yi));
      BS sends a message to each selected CH
   ELSE // for the other rounds
      Calculate FC; // the base station calculates the fuzzy cost and selects the nodes with the highest fuzzy cost as CHs
      CH{C} = TRUE; // nodes selected as CHs
      SendToCH; // the base station sends a message to each selected CH
   END IF
2: IF (CH{C} == TRUE) THEN
      BoCast(JOIN); // broadcast a JOIN request to the other nodes
      IF MaxCMinCL == number of ACKs THEN
         BC(FULL); // the cluster head sends a FULL message to the other nodes
      ELSE
         Join(IDs); // non-CH node s joins the closest CH; the distance is estimated from the signal strength
         ACK(IDi); // the CM receives an ACK from the cluster head as confirmation
         Cluster(c); // form a cluster c
      END IF
3: ELSE
      RC(JOIN); // the node receives one or more JOIN requests from different CHs
      The node calculates its distance to each CH from which it received a JOIN request;
      SELECT(CH); // the node selects the CH with the minimum distance
      SendToCH(ACK, IDi); // the node sends an ACK to confirm membership to the CH; if the node received a FULL message, it selects the next nearest CH
4: END IF
5: Select state // each node selects its state

(II) SENSING PHASE
IF (CH(c) == TRUE) THEN
   Rec(IDi, Data); // receive data from the members, consuming ERX
   Aggre(IDi, Data); // aggregate the received data, consuming EDA
   TrToBS(CHc, Data); // transmit the aggregated data, consuming ETX
   TrToBS(IDi, E(i), C-Mean(i)); // transmit the residual energy and C-mean to the base station
ELSE
   TransToCH(IDi, Data);
END IF
All nodes in the ACTIVE state:
1: TransToCH(IDi, E(i), C-Mean(i)); // transmit only if there is a change from the last round
END
Table 2 Network parameters for simulation

Parameters | Values
Number of nodes in the network | 200 nodes + 1 base station
Network area | 100 × 100 m²
Probability of node for cluster head selection | 0.5
Coordinates for BS | (50, 175)
Node energy at deployment | 0.5 J
Sensing range of node | 10 m
Energy required for 8-bit data transmission (ETX) | 50 × 10⁻⁹ J/bit
Energy required for data aggregation at cluster head (EDA) | 5 × 10⁻⁹ J/bit
Energy required for 8-bit data reception (ERX) | 50 × 10⁻⁹ J/bit
Time duration for each round | 10 s
4 Network Parameters for Simulation The simulation is performed in MATLAB with the Mamdani fuzzy system. Table 2 shows the network parameters for the simulation. For the simulation, 200 nodes are placed randomly in the square with vertices (0, 0), (0, 100), (100, 0), and (100, 100). The MATLAB FIS [7, 8] model comprises four components: a fuzzy inference engine, fuzzy rules, a fuzzifier, and a defuzzifier; the MATLAB FIS editor is used. Fuzzification is the process of transforming the network values used as inputs into fuzzy sets. Table 3 shows the three linguistic input sets of the FIS and the corresponding actual node values. On the basis of the fuzzification variables in Table 3 and the rules in Tables 1 and 4, the defuzzified values are obtained. The output, the fuzzy cost, is used to select the CHs in the network. A node with an Excellent or Very good value has the highest chance of becoming a cluster head.
5 Simulation Results Table 5 shows the simulation results in terms of the first dead node (FDN), the round in which the first dead node occurs in the network. A dead node is a node with zero residual energy, which can no longer be part of the network. The round in which the last dead node (LDN) occurs is also identified. The table gives the round numbers of the FDN [9] and LDN for LEACH [4, 10] and for the proposed algorithm: the FDN for LEACH is 587, which increases to 1302 with the proposed algorithm.
Table 3 Linguistic input sets and the corresponding ranges of the node values (residual energy, C-mean, and probability)

Sliding Mode Control Based Output Feedback Adaptive Dynamic Programming …
P.-N. Dao and H.-Q. Nguyen

Let k ← 0
Repeat:
1. Apply v_o = −K_k x̂ + e and solve P_k from (?).
2. Update K_{k+1} by using (?).
3. k ← k + 1
until ‖P_k − P_{k+1}‖ < υ; then set k* ← k.

Algorithm 2 enables us to obtain the online output feedback approximated optimal control policy v_o = −K_{k*} x̂, which is one part of the proposed controller for the WIP system in (15). Remark 1 Unlike the work in [2, 3], the ADP algorithm here is developed for continuous-time systems. It is worth noting that in Algorithm 2, we use only the input and output of system (4) and the state estimate from the observer, without requiring any knowledge of the real states of system (4). Furthermore, as long as the PE condition is satisfied by using an exploration noise, the convergence of the optimal control problem is guaranteed; the convergence property does not depend on a high-performance observer. Finally, it should be noted that the solution in Sect. 3.2, including the two algorithms, also realizes adaptive reinforcement learning by considering the closed-loop system on the sliding surface.
4 Simulation Results In this section, we implement offline simulations of the two algorithms described above, using the parameter values of Table 1 in (?). Thus, we obtain the explicit system matrices as
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -3.7706 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 68.9659 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 \\ 0.599 & 0.599 \\ 0 & 0 \\ 1.0812 & -1.0812 \\ 0 & 0 \\ -5.776 & -5.776 \end{bmatrix}$$
Choose the sampling period h = 0.1, the observability index N = 2, and the weighting matrices

$$Q = \begin{bmatrix} 10 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 4 \end{bmatrix}, \qquad R = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
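For reference, the following is a minimal model-based Python sketch of the policy-iteration (PI) loop whose convergence is illustrated in Fig. 2, using the matrices above. The output matrix C (selecting the position, yaw, and pitch states), the initial pole locations, and the use of the model-based Kleinman iteration in place of the paper's data-driven output-feedback updates are all assumptions.

import numpy as np
from scipy.linalg import solve_continuous_lyapunov
from scipy.signal import place_poles

A = np.array([[0, 1, 0, 0, 0,       0],
              [0, 0, 0, 0, -3.7706, 0],
              [0, 0, 0, 1, 0,       0],
              [0, 0, 0, 0, 0,       0],
              [0, 0, 0, 0, 0,       1],
              [0, 0, 0, 0, 68.9659, 0]])
B = np.array([[0, 0], [0.599, 0.599], [0, 0],
              [1.0812, -1.0812], [0, 0], [-5.776, -5.776]])
Q = np.diag([10.0, 5.0, 4.0])
R = np.eye(2)

# Assumed output matrix: position, yaw, and pitch angles.
C = np.zeros((3, 6)); C[0, 0] = C[1, 2] = C[2, 4] = 1.0
Qx = C.T @ Q @ C  # state weight induced by Q on the outputs

# Kleinman policy iteration: start from an assumed stabilizing gain.
K = place_poles(A, B, [-1, -2, -3, -4, -5, -6]).gain_matrix
for k in range(20):
    Ak = A - B @ K
    # Policy evaluation: solve Ak' P + P Ak + Qx + K' R K = 0 for P.
    P = solve_continuous_lyapunov(Ak.T, -(Qx + K.T @ R @ K))
    K_next = np.linalg.solve(R, B.T @ P)  # policy improvement
    if np.linalg.norm(K_next - K) < 1e-8:
        break
    K = K_next
print(K)  # converged state-feedback gain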
Fig. 2 Convergence of the matrices P, K in the PI algorithm
Fig. 3 The output of the system
Fig. 4 Trajectories of state variables with Algorithm 2
Fig. 5 Convergence matrices under Algorithm 2
The offline simulation results based on output-feedback ADP are shown in Figs. 2, 3, 4 and 5, which show the convergence of the actor and critic parts, the tracking performance of the WIP, and the control input based on the PI and VI techniques.
5 Conclusion This paper proposed two solutions: the application of an online adaptive output feedback optimal control algorithm and an SMC-based ADP scheme for the tracking control problem of an uncertain WIP system. The theoretical analysis and simulation results showed the high effectiveness of the proposed algorithm. Future work on this reinforcement learning technique will focus on experimental validation.
References 1. Bature AA, Buyamin S, Ahmad MN, Muhammad M (2014) A comparison of controllers for balancing two wheeled inverted pendulum robot. Int J Mech Mechatron Eng 14(3):62–68 2. Gao W, Jiang Y, Jiang ZP, Chai T (2015) Adaptive and optimal output feedback control of linear systems: an adaptive dynamic programming approach. In: Proceedings of the World Congress Intelligent Control Automation 3. Gao W, Jiang Y, Jiang ZP, Chai T (2016) Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming. Automatica 72:37–45 4. Jiang Y, Jiang ZP (2012) Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica 48(10):2699–2704 5. Lewis FL, Vamvoudakis KG (2011) Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data. IEEE Trans Syst Man Cybern Part B 41(1):14–25 6. Li Z, Zhang Y (2010) Robust adaptive motion/force control for wheeled inverted pendulums. Automatica 46(8):1346–1353 7. Yang C, Li Z, Cui R, Xu B (2014) Neural network-based motion control of an underactuated wheeled inverted pendulum model. IEEE Trans Neural Networks Learn Syst 25(11):2004–2016 8. Yue M, Wei X, Li Z (2014) Adaptive sliding-mode control for two-wheeled inverted pendulum vehicle based on zero-dynamics theory. Nonlinear Dyn 76(1):459–471
Region-Based Space Filling Curve for Medical Image Scanning Ashan Eranga Kudabalage, Le Van Dang, Leran Du, and Yumi Ueda
Abstract A region-based space filling curve (RB) scanning method based on image segmentation and gradient vector flow (GVF) is presented. The proposed method extends the context-based space filling curve proposed in [1]. In this paper, the scanning path is guided by a preferred vector field which is constructed from the gradient vector field and enhanced by the GVF technique. The proposed method is compared with relevant scanning techniques. The study is important for image compression, pattern recognition, halftoning, and other applications in image processing. Keywords Space filling curve · Segmentation · Gradient vector field · Image scanning
1 Introduction Image scanning is the process of constructing a path travelling through all pixels of an image exactly once. The method converts the image data from 2D to 1D for further applications. In image compression, the original 1D data is compressed using an approximation technique. Let {z_k, k = 1, 2, ..., N_p} be the 1D data extracted from a 2D grayscale image, where N_p is the number of pixels. Assume that z_k is segmented and approximated by a new sequence {(m_i, n_i), i = 1, 2, ..., s}, where s is the number of segments, m_i is the approximated pixel value of segment i, and n_i is the number of elements in segment i. The objective of 1D image data compression is modelled as [2]

$$s^* = \arg\min_s \varepsilon(s) \tag{1}$$
A. E. Kudabalage · L. V. Dang (B) Sirindhorn International Institute of Technology, Thammasat University, Rangsit, Thailand e-mail: [email protected] L. Du · Y. Ueda Center for Frontier Medical Engineering, Chiba University, Chiba, Japan © Springer Nature Singapore Pte Ltd. 2021 R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_93
where

$$\varepsilon(s) = \sum_{i=1}^{s} \sum_{k=p_i}^{p_i+n_i} (z_k - m_i)^2, \quad p_1 = 1 \ \text{and} \ p_i = \sum_{j=1}^{i-1} n_j \tag{2}$$
ε(s) represents the mean square error (MSE) of the compression. The peak signal-to-noise ratio PSNR (δ) is then calculated as

$$\delta = 10 \log\left(z_{\max}^2 / \varepsilon\right) \tag{3}$$
where z_max is the maximal pixel value in z_k, i.e., z_max = 255 if pixels are represented using 8 bits per sample. For a given approximation technique, the scanning path is expected to have small ε and high δ [2]. Common scanning patterns are zigzag, raster, and spiral. A fractal space filling curve (SFC) was first proposed by Peano [3] and improved in [4]; the result is called the Peano-Hilbert space filling curve (PH). A PH curve completely visits its current quadrant before moving to the next one until the entire image is scanned. Moore [5] constructs a scanning technique based on the PH method; the Moore curve differs from the PH curve in that its end point lies right next to its starting point, which makes Moore's method suitable for scanning circular networks. The above methods are context-free, meaning the scanning paths are independent of the image contents. A context-based SFC (CB) is proposed by Dafner, Cohen-Or, and Matias [1]. The generated CBs follow the direction of similar-intensity pixels within the input images, and it has been shown that the CB outperforms other context-free SFC methods in terms of inherent coherence. A gradient SFC utilizes the gradient vector field of the image to identify a suitable zigzag structure of the curve (i.e., in the horizontal, vertical, left-skew, or right-skew direction) [6]. The global direction is evaluated from the aforementioned directions of the local vectors; to follow the direction of minimal pixel change, the direction perpendicular to the global gradient vector is used. Clearly, the pixels of an image can be segmented such that within each cluster the pixel differences are small; in other words, scanning curves can be generated based on the clusters. Kamata and Hayashi [2] present an adaptive SFC method that follows the regions of a minimum spanning tree (MST). The path is constructed by merging pixel blocks with similar intensity values, each block being structured by four neighboring pixels on a rectangular grid cell. The method is then extended to the CB by adding the Manhattan-Hamiltonian path [1]. Pračko, Polec, and Hasenöhrlová [7] propose a scanning method based on image segmentation: first, the image is segmented using the JSEG method [8], which clusters the image into sub-regions based on a pixel similarity criterion; finally, a raster structure is used to scan the pixels of each region. Compared to the context-free methods, the CB method provides a significant improvement. However, it turns out that the CB and recent state-of-the-art methods still produce 1D data with low auto-correlation and too many peaks; the peaks result when the curve crosses pixels having large differences [7].
Fig. 1 Image scanning process under the proposed method: a input image, b image segmentation, c preferred direction field, d RB path
In addition, the starting point of the paths has not been fully explored. As a result, further studies can be performed to determine the optimal scanning direction at each pixel and the starting location. In this paper, a region-based space filling curve (RB) method is proposed. The RB method first segments the input image into sub-regions with similar pixel values. Then, the gradient vector flow (GVF) technique is applied to smooth the gradient vector field on each cluster. Finally, the RB is constructed by applying the CB algorithm on the dual of the gradient field, called the preferred direction field (PDF). The architecture of the RB method is shown in Fig. 1. The paper is organized as follows. Section 1 summarizes related works. Section 2 presents the image segmentation algorithm. Section 3 details the procedure to generate the PDF. Section 4 describes the RB method and numerical tests. Finally, discussions and conclusions are given in Sect. 5.
2 Image Segmentation Various image segmentation techniques are available; see [9]. The choice of segmentation method depends on the prescribed objective, such as pixel similarity or pixel pattern. In this paper, a method based on morphological reconstruction and fuzzy c-means clustering (FRFCM) is used [9]; the objective is to segment an image based on pixel similarity. The method is fast and applicable for clustering medical images since it is robust to various types of noise. The FRFCM algorithm has two steps. First, noisy images are rebuilt using morphological reconstruction to improve image consistency. Next, the reconstructed image is partitioned using a modification of the fuzzy c-means clustering technique in which the membership partition is modified by local membership filtering; the partition relies on the pixel differences within local spatial neighborhoods and the clustering centers. A simplified sketch of the clustering core is given below. Examples of image clustering results with FRFCM are shown in Fig. 2. Figure 2a, b show the input and segmentation result of a color space image with the number of clusters NR = 3, whereas Fig. 2c, d display the input and segmentation result of an actual ultrasound breast image with NR = 2. The original image after segmentation can be seen as a composition of sub-region images, as shown in Fig. 3.
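The sketch mentioned above is a plain fuzzy c-means iteration on pixel intensities; it is a rough stand-in for FRFCM that omits the morphological reconstruction and membership filtering steps, and the fuzzifier m = 2 and the iteration count are assumed values.

import numpy as np

def fcm_segment(img, n_clusters=3, m=2.0, iters=50):
    # Plain fuzzy c-means on intensities; returns a label map.
    x = img.reshape(-1, 1).astype(float)
    rng = np.random.default_rng(0)
    centers = rng.uniform(x.min(), x.max(), (n_clusters, 1))
    for _ in range(iters):
        d = np.abs(x - centers.T) + 1e-9          # (pixels, clusters) distances
        u = d ** (-2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)         # membership update
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]  # center update
    return u.argmax(axis=1).reshape(img.shape)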
Fig. 2 Image segmentation using FRFCM: a a color space image, b segmentation result, NR = 3, c an ultrasound breast image, d segmentation result, NR = 2
Fig. 3 Original image and its components by FRFCM segmentation
3 Gradient Vector Flow The consistency of pixels within each region should be higher than that across region boundaries. In order to increase the consistency within each region, the gradient vector flow (GVF) technique [10] is applied. Consider the gradient vector field of an image, denoted by V_0(u, v). The GVF replaces V_0(u, v) by a vector field V(u, v) that minimizes the energy function

$$\lambda \iint \left( V_{1,u}^2 + V_{1,v}^2 + V_{2,u}^2 + V_{2,v}^2 \right) du\, dv + \iint |V_0|^2 \,(V - V_0)^2 \, du\, dv \tag{4}$$
where the indices u and v denote partial derivatives and λ is a calibration coefficient. The second term keeps V close to V_0 where |V_0| is large, while the first term produces a smoothly varying field, extending the influence of large vectors far from their original locations and making small vectors follow the direction of large neighboring vectors. The Euler equation for Eq. (4) is given by

$$\lambda \nabla^2 V - (V - V_0)\,|V_0|^2 = 0 \tag{5}$$
Equation (5) is solved numerically by treating V as a function of a pseudo-time and performing the iterations

$$V^{n+1} = V^n + \tau \left( \lambda \nabla^2 V^n - (V^n - V_0)\,|V_0|^2 \right) \tag{6}$$
where n is the iteration number and τ is the time step. An example of a raw gradient field and its smoothed version using GVF is shown in Fig. 4. Finally, a smooth version V_GVF(u, v) of the gradient vector field of the original image is constructed by summing the GVF of each sub-region. The preferred direction field (PDF) is obtained as the dual of the GVF, i.e., V_PDF = V_GVF^⊥.
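A minimal Python sketch of the iteration in Eq. (6), followed by the 90° rotation that yields V_PDF, might read as follows; the step size τ, the coefficient λ, the iteration count, and the periodic boundary handling are assumed values.

import numpy as np

def laplacian(f):
    # 5-point Laplacian with periodic borders (a simplification).
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
            np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f)

def gvf(v0, lam=0.2, tau=0.5, iters=200):
    # Iterate Eq. (6) on each component of the initial field v0 = (v1, v2).
    mag2 = v0[0] ** 2 + v0[1] ** 2
    v = [v0[0].copy(), v0[1].copy()]
    for _ in range(iters):
        v = [vc + tau * (lam * laplacian(vc) - (vc - v0c) * mag2)
             for vc, v0c in zip(v, v0)]
    return v

def preferred_direction(v):
    # The PDF is the GVF rotated by 90 degrees: (v1, v2) -> (-v2, v1).
    return [-v[1], v[0]]

# Usage: build V0 from the image gradient, smooth it, then rotate.
img = np.random.rand(64, 64)
gy, gx = np.gradient(img)
v_pdf = preferred_direction(gvf([gx, gy]))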
Fig. 4 Example of gradient vector flow: a a vector field before smoothing, b GVF
4 Region-Based Space Filling Curve In this section, the RB path is generated using the CB algorithm with a modified weight function between two circuits C_i and C_j, as in Fig. 5a:

$$W(C_i, C_j) = y_{ij}^1 + y_{ij}^2 - x_{ij}^1 - x_{ij}^2 \tag{7}$$
where x_{ij}^1, x_{ij}^2, y_{ij}^1, y_{ij}^2 are the pixel differences between the corresponding pixels of the circuits. The CB algorithm includes three steps: – constructing the disjoint circuits map (Fig. 5a)
Fig. 5 Context-based space filling curve process: a circuit map, b MST, c Hamilton path
– constructing the MST by merging circuits using the cost function defined in Eq. (7) (Fig. 5b) – merging the circuits in the MST to construct a Manhattan-Hamiltonian path (Fig. 5c). In the proposed RB algorithm, the pixel differences x_{ij}^1, x_{ij}^2, y_{ij}^1, y_{ij}^2 in Eq. (7) are replaced by the vector differences extracted from the vector field V_PDF. The RB has been tested on actual ultrasound breast cancer images taken from http://onlinemedicalimages.com. Several test results are shown in Fig. 6.
Fig. 6 RB tests on medical images. Left: gray scale images, Right: RB paths
Table 1 Scanning comparison among LB, PH, CB, and RB (ε: MSE, δ: PSNR)

Method | n=4: ε, δ | n=8: ε, δ | n=16: ε, δ | n=32: ε, δ
LB | 286, 54 | 406, 50 | 418, 50 | 399, 50
PH | 387, 51 | 527, 48 | 748, 43 | 1086, 40
CB | 92, 65 | 160, 60 | 267, 54 | 488, 48
RB | 80, 66 | 154, 60 | 265, 55 | 506, 48
Fig. 7 Real ultrasound gray-scale images for testing the proposed method
Fig. 8 CB versus RB: a original image, b CB, c RB
In order to compare the efficiency of the scanning methods, a simple 1D data compression using linear segments is employed [11] (a sketch of the metric follows below). The 1D data obtained by the line-by-line (LB), PH, CB, and RB scanning methods is segmented using equal segment lengths of n = 4, 8, 16, 32, and the mean pixel value of each segment is used as the compressed approximation. Table 1 shows the average MSE (ε) and PSNR (δ) of each scanning method on five real ultrasound grayscale images, shown in Fig. 7; for the RB method, NR = 4 is used. Table 1 shows that the RB algorithm gives the best results for n = 4, 8, and 16; as the segment length increases, the advantage decreases. For instance, when n = 4, the RB method outperforms the CB (the second best) by 13% in MSE and 1.5% in PSNR. For n = 32, LB is the best; a possible reason is that, although the LB method scans images row by row so that its local similarity is low, over long segments the pixel variation along the LB path is smaller than along the other scanning paths. Since the proposed method generates the scanning path following the PDF, the algorithm is able to scan the image region by region (see Fig. 8b, c), and the RB constructs scanning paths with better coherence, i.e., a smaller mean square error (ε) and a higher peak signal-to-noise ratio (δ).
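The Python sketch below computes the metric of Eqs. (1)-(3) for equal-length segments. Two details are assumptions inferred from the reported numbers: ε is normalized per pixel (consistent with calling it a mean square error), and the logarithm in Eq. (3) is natural, which reproduces the magnitudes of Table 1 (e.g., ε = 92 gives δ ≈ 65 for an 8-bit image).

import numpy as np

def compress_metrics(z, n):
    # Approximate the 1D scan z by the mean of each length-n segment,
    # then return (eps, delta) as in Eqs. (2) and (3).
    z = np.asarray(z, dtype=float)
    pad = (-len(z)) % n                 # pad the tail so segments divide evenly
    zp = np.pad(z, (0, pad), mode="edge")
    means = zp.reshape(-1, n).mean(axis=1)
    approx = np.repeat(means, n)[:len(z)]
    eps = np.mean((z - approx) ** 2)    # per-pixel MSE (normalization assumed)
    return eps, 10 * np.log(z.max() ** 2 / eps)

rng = np.random.default_rng(0)
print(compress_metrics(rng.integers(0, 256, 1024), 4))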
Fig. 9 Sensitivity of NR on the performance of the proposed method
The PSNR indexes of the tested methods versus the segment length are shown in Fig. 9; the RB and CB are the first and second best among the four, especially when n is small. It is noted that the number of clusters in the image segmentation algorithm may affect the performance of the proposed method, so an appropriate number of clusters needs to be carefully selected. Our numerical experiments show that NR = 3 or 4 is a good choice.
5 Discussions and Conclusions An image scanning method based on image segmentation and the PDF is presented. The proposed scanning technique is constructed to follow the preferred direction, i.e., the direction of minimal pixel difference. The gradient vector flow technique is used to increase the homogeneity of the directions of local vectors within each region. Thus, by following the directions with the most similar pixel values, the proposed method produces better scanning paths in terms of data coherence. In many applications, a scanning path with fewer turning points is preferred; for further study, it is possible to construct the PDF based on the principal direction of each sub-region.
The proposed method and further investigation of the study topic are important for image processing (compression, pattern detection, etc.) as well as for other areas such as robotic control, space searching, and computer numerical control (CNC), e.g., CNC machining and laser cutting. In particular, the method proposed here could be directly incorporated into toolpath generation for CNC machining [12–14]. Acknowledgements This research is a part of the 5th Annual Workshop held by the Center of Excellence in Biomedical Engineering, Thammasat University, Thailand and the Center for Frontier Medical Engineering, Chiba University, Chiba, Japan.
References 1. Dafner R, Cohen-Or D, Matias Y (2000) Context-based space filling curves. Comput Graphics Forum 19:209–218 2. Kamata S, Hayashi Y (2000) Region-based scanning for image compression. In: Proceedings: international conference on image processing, vol 2, pp 895–898 3. Peano G (1890) Sur une courbe, qui remplit toute une aire plane. Math Ann 36:157–160 4. Hilbert D (1891) Ueber die stetige Abbildung einer Line auf ein Flächenstück. Math Ann 38:459–460 5. Moore E (1900) On certain crinkly curves. Trans Am Math Society 1:72–72 6. Ouni T, Lassoued A, Abid M (2013) Lossless image compression using gradient based space filling curves (G-SFC). SIViP 9:277–293 7. Pracko R, Polec J, Hasenöhrlová K (2007) Segmentation based image scanning. Radioengineering 16(2):71–76 8. Yining D, Manjunath B, Shin H (1999) Color image segmentation. IEEE Comput Society Conf Comput Vis Pattern Recogn, pp 446–451 9. Lei T, Jia X, Zhang Y, He L, Meng H, Nandi A (2018) Significantly fast and robust fuzzy Cmeans clustering algorithm based on morphological reconstruction and membership filtering. IEEE Trans Fuzzy Syst 26:3027–3041 10. Xu CY, Prince JL (1998) Snakes, shapes, and gradient vector flow. IEEE Trans Image Process 7(3):359–369 11. Othman SM, Mohamed AE, Nossair Z, El-Adawy MI (2019) Image compression using polynomial fitting. In: 2019 3rd international conference on electronics, communication and aerospace technology (ICECA), pp 344–349 12. Anotaipaiboon W, Makhanov S (2005) Tool path generation for five-axis NC machining using adaptive space-filling curves. Int J Prod Res 43:1643–1665 13. Anotaipaiboon W, Makhanov S (2008) Curvilinear space-filling curves for five-axis machining. Comput Aided Des 40:350–367 14. Makhanov S (2009) Space-filling curves in adaptive curvilinear coordinates for computer numerically controlled five-axis machining. Math Comput Simul 79:2385–2402
Flexible Convolution in Scattering Transform and Neural Network Dinh-Thuan Dang
Abstract Convolution is an essential component of image processing and computer vision. There are two common approaches to implementing convolution: spatial domain-based and frequency domain-based. Which approach is better for image processing? There is no single answer for all situations; the fastest method depends on the task at hand. In this paper, we (i) evaluate the performance of the two convolutional approaches on 2D images; (ii) design neural layers whose output structure parallels that of the 2D scattering transform; and (iii) compare the classification results obtained when applying the scattering transform and our neural network. Keywords Convolution · Design neural network · Scattering transform · Image classification
1 Introduction Convolution occupies a significant role in the world of machine learning. It has been applied widely in many fields such as digital signal processing, speech recognition, and computer vision. Convolutional neural networks (CNN) [1] are inspired by biological neural networks. A CNN includes one or more convolutional layers which act as linear filters to detect specific features. CNN sizes have grown dramatically in the number of trainable parameters, and complexity has become an important consideration as CNNs have come into use in practical systems. In particular, the computation of high-dimensional convolutions in large systems remains a challenge due to substantial requirements for computing resources and energy [2].
D.-T. Dang (B)
Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80778, Taiwan
e-mail: [email protected]
Department of Information Technology, Pham Van Dong University, 509 Phan Dinh Phung Rd, Quang Ngai 57000, Vietnam
© Springer Nature Singapore Pte Ltd. 2021
R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering, Advances in Intelligent Systems and Computing 1254, https://doi.org/10.1007/978-981-15-7527-3_94
Fig. 1 Structure of the paper
In an effort to improve compute performance, convolution is implemented in several ways: it can be computed as a pointwise convolution (sliding dot-product) or as a circular convolution (via the discrete Fourier transform) [3, 4]. The scattering transform [5, 6] is an invariant signal representation suitable for many signal processing and machine learning applications [7]. It is stable to deformations and preserves high-frequency information for classification [8, 9]. Despite the remarkable successes of the scattering transform, deep convolutional networks also have a significant ability to build large-scale invariants which are stable to deformations [10]; this is the reason we design neural layers to compare against the scattering transform. The structure of this paper is organized as presented in Fig. 1. Firstly, we evaluate the performance of the two convolutional approaches, as described in Sect. 2. Secondly, we design neural layers whose output has the same structure as the output of the 2D scattering transform, as described in Sect. 4. Finally, we apply the 2D scattering transform and our designed neural layers separately to extract image features, integrate the features to classify images, and evaluate the results of the two classification methods, as described in Sect. 5.
2 Convolution in Image Processing 2.1 Convolution Convolution is a formal mathematical operation: it is a mathematical way of combining two signals to form a third signal. Denote the input image by y, the filter by h, and the convolution operator by ⊗. The convolution of
image y and filter h is written as

$$c = h \otimes y = y \otimes h \tag{1}$$
Convolution in 2D continuous space [11]:

$$c(\tau_1, \tau_2) = h(\tau_1, \tau_2) \otimes y(\tau_1, \tau_2) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} h(\chi, \zeta)\, y(\tau_1 - \chi, \tau_2 - \zeta)\, d\chi\, d\zeta \tag{2}$$
Convolution in 2D discrete space [11]:

$$c(\tau_1, \tau_2) = h(\tau_1, \tau_2) \otimes y(\tau_1, \tau_2) = \sum_{i=-\infty}^{+\infty} \sum_{j=-\infty}^{+\infty} h[i, j]\, y[\tau_1 - i, \tau_2 - j] \tag{3}$$
Approaches to implementing convolution. Figure 2 illustrates the two methods of implementing a convolution: one in the spatial domain and the other in the frequency domain. Frequency domain convolution. This approach is carried out through the discrete Fourier transform [12]. By the convolution theorem, the Fourier-based method can be more efficient than a straightforward implementation (Fig. 3). Denoting by F the Fourier transform and by F−1 the inverse Fourier transform, the basic outline of Fourier-based convolution is:
– Apply the direct fast Fourier transform (FFT) to the filter h: F{h}.
– Apply the direct FFT to the input image y: F{y}.
– Perform the pointwise multiplication of the two preceding results: F{h} · F{y}.
– Apply the inverse FFT to the result of the multiplication: F−1{F{h} · F{y}}.

The 2D convolution command in Numpy (np):

np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(y))
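One caveat: multiplying same-size FFTs as in the one-liner above computes a circular convolution, so the image and filter must share a shape and the result wraps around. A hedged sketch of full linear convolution via zero-padded FFTs is given below.

import numpy as np
from scipy.signal import convolve2d

def fft_conv2d(y, h):
    # Linear 2D convolution of image y with filter h via zero-padded FFTs.
    shape = (y.shape[0] + h.shape[0] - 1, y.shape[1] + h.shape[1] - 1)
    return np.real(np.fft.ifft2(np.fft.fft2(y, shape) * np.fft.fft2(h, shape)))

# Cross-check against a direct spatial implementation on a small example.
y = np.random.rand(16, 16)
h = np.random.rand(5, 5)
assert np.allclose(fft_conv2d(y, h), convolve2d(y, h, mode="full"))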
Fig. 2 Image processing chart in spatial domain and frequency domain
Fig. 3 Convolution of image y and filter h based on FFT
Spatial domain convolution. The pixels in an input image represent different "spatial" positions [12]. In the spatial domain, convolution is the process of changing the value of each pixel to the weighted average of the pixels in its neighborhood (Fig. 4). A mask, usually 3 × 3 or 5 × 5, is created and applied from the first pixel of the original image to the last. The key factor is that the kernel is small, e.g., 3 × 3, 5 × 5, ..., 9 × 9. This implementation approach is common and easy to integrate into deep learning. The 2D convolution command in PyTorch [13]: torch.nn.functional.conv2d(y, h)
Fig. 4 a Each step in convolution. b Direction of kernel movement
2.2 Performance Evaluation In order to evaluate the performance of the two approaches, we test a single convolution between an image and a filter. The parameters used for the evaluation are the sizes of the image and the filter. The results are obtained using the Numpy and PyTorch [13] environments. Figure 5a shows the convolution time on an image of size 64 × 64 as the filter size varies; Fig. 5b shows the convolution time on an image of size 512 × 512, and Fig. 5c the corresponding time on an image of size 1024 × 1024, again with varying filter size. The results show that the frequency domain convolution (blue line) is stable with respect to the filter size. For filter sizes smaller than 7 × 7, the spatial domain implementation (yellow and red lines) is faster than the frequency domain one.
3 Scattering Transform 3.1 Scattering Transform The scattering transform was introduced by Mallat [5, 6, 14]. It is an invariant signal representation suitable for machine learning applications, and it is also effective in combination with modern representation learning approaches [10]. The backend of the scattering transform is FFT-based [15]. The scattering transform is defined as a complex-valued convolutional network whose filters are fixed to be wavelets and low-pass averaging filters, coupled with modulus nonlinearities. Each layer is a wavelet transform, which separates the scales of the incoming signal. Let us consider a set of wavelets {ψ_λ}_λ such that there exists some ε satisfying [15]

$$1 - \varepsilon \le \sum_{\lambda} \left| \hat{\psi}_{\lambda}(\omega) \right|^2 \le 1 \tag{4}$$
Given a signal x, we define its scattering coefficient of order k corresponding to the sequence of frequencies (λ_1, ..., λ_k) to be

$$Sx[\lambda_1, \ldots, \lambda_k] = \Big|\, \cdots \big|\, |x \otimes \psi_{\lambda_1}| \otimes \psi_{\lambda_2} \big| \cdots \otimes \psi_{\lambda_k} \Big| \tag{5}$$
Fig. 5 a Convolutional time on image of size 64 × 64. b Convolutional time on image of size 512 × 512. c Convolutional time on image of size 1024 × 1024
3.2 The Second-Order Scattering Transform [16, Sect. 2.1] Consider a signal x(u) with finite energy, i.e., x ∈ L²(ℝ²), with u the spatial position index, and an integer J ∈ ℕ, which is the spatial scale of our scattering transform. The first-order scattering coefficients are

$$S_J^1 x(j_1, \theta_1, u) = \left( |x \otimes \psi_{j_1, \theta_1}| \otimes \phi_J \right)(2^J u) \tag{6}$$

where φ_J is a local averaging filter with a spatial window of scale 2^J, and ψ_{j,θ} is a family of wavelets obtained by dilating and rotating a complex mother wavelet, with θ the rotation angle. The second-order scattering coefficients apply the averaging to the first order again, leading to

$$S_J^2 x(j_1, j_2, \theta_1, \theta_2, u) = \left( \big|\, |x \otimes \psi_{j_1, \theta_1}| \otimes \psi_{j_2, \theta_2} \big| \otimes \phi_J \right)(2^J u) \tag{7}$$
Output of Scattering Transform in 2D. Let us assume that x is a tensor of size (B, C, N_1, N_2). Then the output Sx of a scattering transform with scale J and L angles will have size

$$\left( B,\; C,\; 1 + LJ + \frac{L^2 J (J-1)}{2},\; \frac{N_1}{2^J},\; \frac{N_2}{2^J} \right)$$
where (N_1, N_2) are the image sizes, B is the batch size, and C is the channel size. Scattering propagation with J = 2, L = 8. A scattering transform iterates on wavelet modulus operators to compute cascades of wavelet convolutions. Each node is subject to one low-pass filter and eight band-pass filters. The final result is 81 outputs, each a quarter of the original size. Scattering network with J = 2, L = 8. Figure 6b explains that x_0 has size (N_1, N_2); the first level of the scattering transform with J = 2 and L = 8 (nine filters: one low-pass filter and eight band-pass filters) outputs the block x_1, which contains nine images of size (N_1/2, N_2/2). Continuing, the second level applies nine filters to each image in x_1, so that the final block x_2 has 81 outputs of size (N_1/4, N_2/4). Filters in the scattering transform with J = 2, L = 8: see Fig. 6c.
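Assuming the Kymatio package [14, 15] and its PyTorch frontend, the output shape above can be checked directly; the snippet below is a sketch under that assumption.

import torch
from kymatio.torch import Scattering2D

J, L = 2, 8
scattering = Scattering2D(J=J, shape=(128, 128), L=L)
x = torch.randn(2, 3, 128, 128)   # a batch of two RGB images
Sx = scattering(x)
# 1 + L*J + L**2 * J*(J - 1) // 2 = 1 + 16 + 64 = 81 coefficient maps
print(Sx.shape)                   # expected: torch.Size([2, 3, 81, 32, 32])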
Fig. 6 a Tree of computations for a scattering transform. b Scattering transforms as a cascade of filters. c The first column is low-pass filters, and the other columns are eight band-pass filters
4 Design the Equivalent Output Model in Neural Network 4.1 2D Scattering Transform Intuition Figure 7a is a picture of the Mira hotel from the web site of the RICE 2020 conference. We apply the 2D scattering transform to the green color channel of the picture; the outputs of level 1 and level 2 are shown in Fig. 7b, c. The actual outputs are given in the appendix.
4.2 Equivalent Neural Layers Scope. To keep things as simple as possible, we design the neural network architecture following Fig. 6b. This architecture corresponds to a scattering transform with J = 2 and L = 8.
Fig. 7 a Mira hotel, 784 × 512 pixels. b Nine outputs of level 1, 392 × 256 pixels. c 81 outputs of level 2, 196 × 128 pixels
Considering a gray image of size (128, 128), the first level uses nine filters and downsamples by half, so the output of the first level is nine images of size (64, 64). The second level also applies nine filters to each image and downsamples by half, so the output of the second level is 81 images of size (32, 32).

Fig. 8 Our designed neural layers

Design of the Neural Layers. Figure 8 illustrates our designed neural layers. This architecture has an output structure equivalent to a scattering transform of level 2. conv1 is the first convolutional layer, with one input channel and nine output channels: we use nine filters of size 5 × 5, corresponding to the one low-pass filter and eight band-pass filters of the scattering transform. pool1 is a MaxPool2D layer that halves the size of its input; after pooling, the outputs are moved into the batch dimension (nine batches per input map) for the next convolutional layer. conv2 is the second convolutional layer, again with one input channel and nine output channels; it has the same structure as conv1, but the batch size is a multiple of 9. pool2 again halves the size; after pooling, the batch dimension is moved back to the channel dimension, giving a multiple of 81 channels.
Fig. 9 Output of an input of size [2, 3, 128, 128]
The neural architecture is implemented in PyTorch below.

import torch
import torch.nn as nn

N = 512  # image size
C = 3    # input color channels, default value is 3
L = 2    # number of levels

class EqualL2Net(nn.Module):
    def __init__(self):
        super(EqualL2Net, self).__init__()
        # Level 1: nine 5x5 filters (one low-pass + eight band-pass analogues).
        self.conv1 = nn.Conv2d(1, 9, 5, padding=2)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=(2, 2))
        # Level 2: same structure, applied to each level-1 output map.
        self.conv2 = nn.Conv2d(1, 9, 5, padding=2)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=(2, 2))

    def forward(self, x):
        # Fold channels into the batch so each map is filtered independently.
        x = x.view(-1, 1, x.shape[2], x.shape[3])
        out = self.pool1(self.relu1(self.conv1(x)))
        out = out.view(-1, 1, out.shape[2], out.shape[3])
        out = self.pool2(self.relu2(self.conv2(out)))
        # Unfold back: 9**L maps per input channel.
        return out.view(-1, 9**L * C, out.shape[2], out.shape[3])
We consider an input of two color images of size [128, 128]; the input is thus a tensor of shape [2, 3, 128, 128], with a batch of two and three color channels. The output is a tensor of shape [2, 243, 32, 32], as explained in detail in Fig. 9.
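The shape claim can be verified with a quick usage check of the listing above:

x = torch.randn(2, 3, 128, 128)   # two color images of size 128 x 128
out = EqualL2Net()(x)
print(out.shape)                  # torch.Size([2, 243, 32, 32]); 243 = 9**2 * 3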
4.3 Neural Network Model for Classification Image classification models generally combine two components: a feature extraction part and a classification part (Fig. 10).
Fig. 10 Overview model of image classification
In the feature extraction part, we apply the scattering transform or our designed architecture. In the classification part, we use two linear layers. The test dataset [17] has nine classes, so the code is as described below.

from collections import OrderedDict

self.classifier = nn.Sequential(OrderedDict([
    ('linear1', nn.Linear(feature_num, 512)),
    ('relu1', nn.ReLU()),
    ('linear2', nn.Linear(512, 9)),
    ('relu2', nn.ReLU()),
]))
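Putting the two parts together, a sketch of the full model could look as follows; flattening the feature maps before the linear layers and dropping the final ReLU (a common choice with the cross-entropy loss) are our assumptions, and feature_num assumes the 128 × 128 three-channel inputs of the earlier example.

import torch.nn as nn

class FeatureClassifier(nn.Module):
    # Feature extractor (EqualL2Net or Scattering2D) plus a linear classifier.
    def __init__(self, features, feature_num=243 * 32 * 32, n_classes=9):
        super().__init__()
        self.features = features
        self.classifier = nn.Sequential(
            nn.Linear(feature_num, 512),
            nn.ReLU(),
            nn.Linear(512, n_classes))

    def forward(self, x):
        f = self.features(x)
        return self.classifier(f.flatten(1))  # flatten all maps to one vector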
5 Performance Evaluation Environments. In order to evaluate the performance of the designed neural layers, we train and test in the PyTorch [13] and Kymatio [14, 15] frameworks. Parameters. We divide the NEU-CLS dataset into two subsets, a training set and a test set, with a ratio of 80% train and 20% test. The cross-entropy loss function is used for the classification tasks. The neural networks are trained using the Adam optimizer with an initial learning rate of 0.01; training is run for 50 epochs. Dataset. We care about defects of metal surfaces, so we choose NEU-CLS [17, 18] for classification. Figure 11 shows sample images of some kinds of typical surface defects that can be clearly observed. The NEU-CLS dataset has nine kinds of surface
Fig. 11 Some kinds of steel surface defects in NEU-CLS
Fig. 12 a Scattering transform, b our method in neural network
defects: CR (1210 images), GG (296), IN (775), PA (1148), PS (797), RP (200), RS (1589), SC (773), and SP (438). Classification Results. In Fig. 12a, b, the red lines show the test accuracy versus epochs, and the blue lines present the loss function versus epochs. In general, the accuracies are nearly equivalent. Our experiment code is available at https://github.com/ddthuan/rice2020.
6 Conclusion Task (i) shows why most CNNs have to use small kernels: Fourier-based convolutions are flexible on home and mobile platforms, while spatial-based convolutions are applied in large-scale systems. We have built a neural model equivalent to the scattering transform and integrated it into image classification. The results of the two approaches are quite similar after training for 50 epochs, whereas, as is well known, neural networks converge slowly compared with most classical numerical methods. The scattering transform is invariant to geometric transformations and converges fast, so designing neural networks that achieve the properties of the scattering transform is useful for large-scale systems; this paper has contributed a part of this idea.
Fig. 13 Inverse problem in learning kernel
A solution to the slow convergence is weight initialization of the convolutional layers. This is the extended idea of the paper: we will train our neural network based on the scattering transform, so that after weight initialization the network attains the properties of the scattering transform. The broader idea is further explained in Fig. 13.
Appendix Nine outputs of level 1.
81 outputs of level 2.
References 1. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324 2. Nguyen-Thanh N, Le-Duc H, Ta D-T, Nguyen V-T (2016) Energy efficient techniques using FFT for deep convolutional neural networks. In: Advanced technologies for communications (ATC), Ha Noi, Vietnam, Oct 2016
3. Podlozhnyuk V (2007) FFT-based 2D convolution
4. WikipediA. https://en.wikipedia.org/wiki/Convolution_theorem
5. Mallat S (2012) Group invariant scattering. Comm Pure Appl Math 65(10):1331–1398
6. Bruna J, Mallat S (2013) Invariant scattering convolution networks. IEEE Trans Pattern Anal Mach Intell 35(8):1872–1886
7. Andén J, Mallat S (2014) Deep scattering spectrum. IEEE Trans Signal Process 62(16):4114–4128. https://doi.org/10.1109/tsp.2014.2326991
8. Oyallon E, Mallat S (2015) Deep roto-translation scattering for object classification. In: Proceedings of CVPR, June 2015
9. Sifre L, Mallat S (2013) Rotation, scaling and deformation invariant scattering for texture discrimination. In: Proceedings of CVPR
10. Bietti A, Mairal J (2019) Group invariance, stability to deformations, and complexity of deep convolutional representations. J Mach Learn Res 20
11. Convolution. http://www.mif.vu.lt/atpazinimas/dip/FIP/fip-Convolut.html
12. Convolution-based Operations. http://www.mif.vu.lt/atpazinimas/dip/FIP/fip-Convolut-2.html
13. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Köpf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S (2019) PyTorch: an imperative style, high-performance deep learning library. In: 33rd conference on neural information processing systems (NeurIPS 2019), Vancouver, Canada
14. Andreux M, Angles T, Exarchakis G, Leonarduzzi R, Rochette G, Thiry L, Zarka J, Mallat S, Andén J, Belilovsky E, Bruna J, Lostanlen V, Hirn MJ, Oyallon E, Zhang S, Cella C, Eickenberg M (2020) Kymatio: scattering transforms in python. https://arxiv.org/abs/1812.11214v2
15. KymatIO (2019). https://www.kymat.io/userguide.html
16. Oyallon E (2017) Analyzing and introducing structures in deep convolutional neural networks. Mach Learn
17. NEU-CLS Dataset. http://faculty.neu.edu.cn/yunhyan/NEU_surface_defect_database.html
18. Song K, Yan Y (2013) A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl Surf Sci 285:858–864