Learning and Analytics in Intelligent Systems 21
Margarita N. Favorskaya · Sheng-Lung Peng · Milan Simic · Basim Alhadidi · Souvik Pal Editors
Intelligent Computing Paradigm and Cutting-edge Technologies Proceedings of the Second International Conference on Innovative Computing and Cutting-edge Technologies (ICICCT 2020)
Learning and Analytics in Intelligent Systems Volume 21
Series Editors George A. Tsihrintzis, University of Piraeus, Piraeus, Greece Maria Virvou, University of Piraeus, Piraeus, Greece Lakhmi C. Jain, Faculty of Engineering and Information Technology, Centre for Artificial Intelligence, University of Technology, Sydney, NSW, Australia; KES International, Shoreham-by-Sea, UK; Liverpool Hope University, Liverpool, UK
The main aim of the series is to make available a publication of books in hard copy form and soft copy form on all aspects of learning, analytics and advanced intelligent systems and related technologies. The mentioned disciplines are strongly related and complement one another significantly. Thus, the series encourages cross-fertilization highlighting research and knowledge of common interest. The series allows a unified/integrated approach to themes and topics in these scientific disciplines which will result in significant cross-fertilization and research dissemination. To maximize dissemination of research results and knowledge in these disciplines, the series publishes edited books, monographs, handbooks, textbooks and conference proceedings.
More information about this series at http://www.springer.com/series/16172
Margarita N. Favorskaya · Sheng-Lung Peng · Milan Simic · Basim Alhadidi · Souvik Pal
Editors
Editors Margarita N. Favorskaya Institute of Informatics and Telecommunication Reshetnev Siberian State University of Science and Technology Krasnoyarsk, Russia Milan Simic School of Engineering RMIT University, Bundoora East Campus Melbourne, VIC, Australia
Sheng-Lung Peng Department of Creative Technologies and Product Design National Taipei University of Business Taipei City, Taiwan Basim Alhadidi Department of Computer Information Systems Al-Balqa’ Applied University As-Salt, Jordan
Souvik Pal Department of Computer Science and Engineering Global Institute of Management and Technology Kolkata, India
ISSN 2662-3447 ISSN 2662-3455 (electronic) Learning and Analytics in Intelligent Systems ISBN 978-3-030-65406-1 ISBN 978-3-030-65407-8 (eBook) https://doi.org/10.1007/978-3-030-65407-8 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Organizing Committee and Key Members
Conference Committee Members
Conference Honorary Chair Lakhmi C. Jain, KES International, UK; Liverpool Hope University, UK; and University of Technology Sydney, Australia
Conference General Chair Sheng-Lung Peng, National Taipei University of Business, Taiwan
Conference Conveners Basim Alhadidi, Al-Balqa’ Applied University, As-Salt, Jordan Margarita N. Favorskaya, Reshetnev Siberian State University of Science and Technology, Russia Souvik Pal, Global Institute of Management and Technology, India
Programme Chairs Duraisamy Balaganesh, Lincoln University College, Malaysia Thamer Al-Rousan, Isra University, Jordan
Milan Simic, RMIT University, Australia S. Jyothi, Sri Padmavati Mahila Visvavidyalayam, India
International Advisory Board Members Abdel Rahman A. Alkharabsheh, Qaseem Private University, Kingdom of Saudi Arabia Jhimli Adhikari, Adhikari Academy of Learning Private Limited, India Ton Quang Cuong, Vietnam National University, Vietnam Aruna Chakraborty, St. Thomas College of Engineering and Technology, India Anirban Das, University of Engineering and Management, India Debashis De, Maulana Abul Kalam Azad University of Technology, India Ahmed A. Elnger, Beni-Suef University, Egypt Paulo João, University of Lisbon, Portugal Kavita Khare, M A National Institute of Technology, India Suneeta Mohanty, KIIT University, India Mrutyunjaya Panda, Utkal University, India Prantosh Kumar Paul, Raiganj University, India Anitha S. Pillai, Hindustan Institute of Technology and Science and Technology, India Balamurugan Shanmugam, QUANTS IS & CS, India Abdel-Badeeh M. Salem, Ain Shams University, Egypt
Publication Chairs Margarita N. Favorskaya, Reshetnev Siberian State University of Science and Technology, Russia Aleksandra Klasnja-Milicevic, University of Novi Sad, Serbia Souvik Pal, Global Institute of Management and Technology, India
Programme Conveners Osman Adiguzel, Firat University, Turkey Mingyu Gloria Liao, National Kaohsiung University of Science and Technology, Taiwan Bikramjit Sarkar, JIS College of Engineering, India Satarupa Mohanty, KIIT University, India Dr. Ahmed J. Obaid, Department of Computer Science, Faculty of Computer Science and Mathematics, University of Kufa, Iraq
Technical Chairs Melike Sah Direkoglu, Near East University, Turkey Yousef Daradkeh, Prince Sattam bin Abdulaziz University, Kingdom of Saudi Arabia Pranati Rakshit, JIS College of Engineering, India Anastasia Grigoreva, St. Petersburg Polytechnic University, Russia
Technical Programme Committee Alti Adel, University UFAS of Setif, Algeria Brahim Aksasse, Moulay Ismail University, Morocco Kalinka Regina Lucas Jaquie Castelo Branco, University of Sao Paulo, Brazil Jean M. Caldieron, Florida Atlantic University, USA Arindam Chakrabarty, Rajiv Gandhi University, India Melike Sah Direkoglu, Near East University, Turkey Yousef Daradkeh, Prince Sattam bin Abdulaziz University, Kingdom of Saudi Arabia Anastasia Grigoreva, St. Petersburg Polytechnic University, Russia Zoltan Gal, University of Debrecen, Hungary Andrey Gavrilov, Novosibirsk State Technical University, Russia Virendra Gawande, College of Applied Sciences, Oman Bharat S. Rawal Kshatriya, Penn State University, USA Saravanan Krishnann, Anna University, India Maheshkumar H. Kolekar, University of Missouri, USA Ibikunle Frank, Covenant University, Nigeria Paulo João, University of Lisbon, Portugal Tarig Osman Khider, University of Bahri, Sudan Pradeep Laxkar, Mandsaur University, India Lina M. Momani, Higher Colleges of Technology, United Arab Emirates Ken Revett, Loughborough University, England Kamel Hussein Rahouma, Minia University, Egypt Abdel-Badeeh M. Salem, Ain Shams University, Egypt Zhao Shuo, Northwestern Polytechnic University, China Sandor Szabo, University of Pécs, Hungary Sattar B. Sakhan, Babylon University, Iraq Suresh Sankaranarayanan, University of West Indies, Jamaica Santanu Singha, JISCE, India S. Vijayarani, Bharathiar University, India P. Vijayakumar, Anna University, India Abraham G. van der Vyver, Monash University, South Africa Shaikh Enayet Ullah, University of Rajshahi, Bangladesh
Preface and Acknowledgements
The main aim of this proceedings book is to bring together leading academic scientists, researchers and research scholars to exchange and share their experiences and research results on all aspects of intelligent ecosystems, data science and mathematics. With the proliferation of challenging issues in artificial intelligence, machine learning, big data and data analytics, high-performance computing, network and device security, the Internet of Things (IoT), IoT-based digital ecosystems and their impact on society, communication and digital pedagogy have attracted a growing number of researchers. The conference also provides a premier interdisciplinary platform for researchers, practitioners and educators to present and discuss the most recent innovations, trends and concerns, as well as the practical challenges encountered and solutions adopted in the fields of IoT and analytics. This proceedings covers basic and high-level concepts of the intelligent computing paradigm and communications in the context of distributed computing, big data, high-performance computing and the Internet of Things. The 2nd International Conference on Innovative Computing and Cutting-edge Technologies (ICICCT 2020) offered an opportunity to convey experiences and to present result analyses, future scopes and challenges facing the fields of computer science, information technology and telecommunication, bringing together experts from industry, government, universities, colleges and research institutes, along with research scholars. ICICCT 2020 was organized by the Middle East Association of Computer Science and Engineering (MEACSE) and hosted by Lincoln University College, Malaysia, and was held during 11–12 September 2020 in online mode (on the Zoom platform). The conference brought together researchers from all regions of the world working in a variety of fields and provided a stimulating forum for them to exchange ideas and report on their research. The proceedings of ICICCT 2020 consist of the 40 best papers selected from the 104 submissions, peer reviewed by the conference committee members and international reviewers. The presenters delivered their talks via virtual screen.
Many distinguished scholars and eminent speakers joined from countries including India, Malaysia, Bangladesh, Iraq, Iran, China, Vietnam, Taiwan and Jordan to share their knowledge and experience and to explore better ways of educating our future leaders. The conference became a platform for sharing knowledge across the research cultures of different countries. The foremost pillars of any academic conference are its authors and researchers, so we thank the authors for choosing this conference platform to present their work during the pandemic. The editors and conference organizers are sincerely thankful to the Springer team, especially Ms. Jayarani Premkumar, for providing constructive inputs and the opportunity to finalize these conference proceedings. We are also thankful to Thomas Ditzinger, Dieter Merkle and Guido Zosimo-Landolfo for their support, and to Prof. Lakhmi C. Jain for his fruitful suggestions for making the conference a higher-quality event. We thank all the reviewers, from around the globe, who lent their support and stood firm for quality submissions during the pandemic. Finally, we wish all participants success in their presentations and social networking; their strong support was critical to the success of this conference. We hope that the participants not only enjoyed the technical program but also engaged with the eminent speakers and delegates on the virtual platform. Wishing you a fruitful and enjoyable ICICCT 2020.
Krasnoyarsk, Russia — Margarita N. Favorskaya
Taipei City, Taiwan — Sheng-Lung Peng
Melbourne, Australia — Milan Simic
As-Salt, Jordan — Basim Alhadidi
Kolkata, India — Souvik Pal
About This Book
This conference proceedings book is a repository of knowledge enriched with recent research findings. The main focus of this volume is to bring computing- and communication-related technologies together on a single platform. ICICCT 2020 aimed to provide a platform for exchanging the most recent scientific and technological advances in information technology and computational science and to strengthen links in the scientific community. The event aspired to bring together leading academic scientists, researchers, industry professionals and research scholars to exchange and contribute their knowledge, experience and research outcomes across all phases of computer science and information technology. This book is a podium for conveying researchers’ experiences and presenting result analyses, future scopes and challenges facing the fields of computer science, information technology and telecommunication. It also provides a premier interdisciplinary platform for researchers, practitioners and educators to present and discuss the most recent innovations, trends and concerns, as well as the practical challenges encountered and solutions adopted in these fields. The book offers authors, research scholars and readers opportunities for national and international collaboration and networking among universities and institutions, promoting research and developing technologies globally. Readers will have the chance to learn about the most recent research outcomes, analyses and developments of some of the world’s leading researchers and to catch up with current trends in industry and academia. Through its chapter organization, the book presents concepts of technologies related to intelligent and innovative computing systems, big data and data analytics, IoT-based ecosystems, high-performance computing, communication systems, digital education and the learning process, along with the novel findings of the researchers. The primary audience includes specialists, researchers, graduate students, designers, practitioners and engineers who are engaged in research and computer science-related issues. The book is organized in independent chapters to provide readers with good readability, adaptability and flexibility.
Invited Speakers
Dr. Ruay-Shiung Chang, President, National Taipei University of Business, Taiwan Prof. Dong Xiang, Professor, School of Software, Tsinghua University, China Prof. Md. Ekramul Hamid, Dean, Faculty of Engineering and Professor, Department of Computer Science and Engineering, University of Rajshahi, Bangladesh Prof. Sivakumar Rajagopal, Professor, School of Electronics Engineering, Vellore Institute of Technology (VIT), Vellore, India Dr. Zaid Mustafa Abed-IL Fattah Al-Lami, Assistant Professor, Computer Information Systems Department, Abdullah Bin Ghazi Faculty of Information and Communication Technology, Al-Balqa’ Applied University, Jordan Prof. (Dr.) Amiya Bhaumik, President, Lincoln University College, Malaysia
Contents
Throat Microphone Speech Enhancement Using Machine Learning Technique ... 1
Subrata Kumer Paul, Rakhi Rani Paul, Masafumi Nishimura, and Md. Ekramul Hamid

Use of Artificial Neural Network to Predict the Yield of Sinter Plant as a Function of Production Parameters ... 13
Arghya Majumder, Chanchal Biswas, Saugata Dhar, Rajib Dey, and G. C. Das

An Approach to Self-reliant Smart Road Using Piezoelectric Effect and Sensor Nodes ... 27
Aindrila Das, Aparajita Ghosh, Sudipta Sahana, Dharmpal Singh, and Ahmed J. Obaid

Using Static and Dynamic Image Maps Built by Graphic Interchange Format (GIF) and Geographic Information System (GIS) for Project Based Learning Air Pollution in Schools in Hanoi, Vietnam ... 33
Bui Thi Thanh Huong

Prediction of Autism Spectrum Disorder Using Feature Engineering for Machine Learning Classifiers ... 45
N. Priya and C. Radhika

A Novel and Smart Parking System for the University Parking System ... 63
V. Madhumitha, V. Sudharshini, S. Muthuraja, Sivakumar Rajagopal, S. A. Angayarkanni, and Thamer Al-Rousan

Role of M-CORD Computing Architecture for Over the Top (OTT) Services and Applications ... 75
N. Senthil Kumar, P. M. Durai Raj Vincent, Kathiravan Srinivasan, Sivakumar Rajagopal, S. A. Angayarkanni, and Basim Alhadidi
Application of Virtual Reality and Augmented Reality Technology for Teaching Biology at High School in Vietnam ... 87
Lien Lai Phuong, An Nguyen Thuy, Quynh Nguyen Thi Thuy, and Anh Nguyen Ngoc

Internet of Things Based Gesture Controlled Wheel Chair for Physically Disabled ... 99
Akanksha Miharia, B. Prabadevi, Sivakumar Rajagopal, and Basim Alhadidi
An Adaptive Crack Identification Scheme Using Enhanced Segmentation and Feature Extraction for Electrical Discharge Machining of Inconel X750 ... 109
K. J. Sabareesaan, J. Jaya, Habeeb Al Ani, R. Sivakumar, and Basim Alhadidi

Surface Wear Rate Prediction in Reinforced AA2618 MMC by Employing Soft Computing Techniques ... 123
N. Mathan Kumar, N. Mohanraj, S. Sendhil Kumar, A. Daniel Das, K. J. Sabareesaan, and Omar Adwan

An Assessment of the Relationship Between Bio-psycho Geometric Format with Career and Intelligence Quotient by an IT Software ... 139
Mai Van Hung, Tran Van The, Pham Thanh Huyen, and Ngo Van Hung

Design of Low Power Low Jitter Delay Locked Loop in 45 nm CMOS ... 147
P. Latha, R. Sivakumar, Y. V. Ramana Rao, and Thamer Al-Rousan

An Overview of Multicast Routing Algorithms in Network on Chip ... 163
Sumitra Velayudham, Sivakumar Rajagopal, Swaminathan Kathirvel, and Basim Alhadidi

Intelligent Gym Exercise Classification Using Neural Networks ... 179
Kathiravan Srinivasan, Vinayak Ravi Joshi, R. Sivakumar, and Basim Alhadidi

To Assess Basic Anthropometry Indices of Vietnamese People by WHO AnthroPlus Software ... 189
Mai Van Hung, Vu Dinh Chuan, and Pham Thanh Huyen

Computer-Aided Simulation Techniques for Ultra-Wideband Band Pass Filter ... 199
Ramakrishnan Gopalakrishnan, R. Sivakumar, and Omar Adwan

The Conceptual Framework for VR/AR Application in Mobile Learning Environment ... 207
Nguyen Thi Linh Yen, Ton Quang Cuong, Le Thi Phuong, and Pham Kim Chung
Leaf Classification Model Using Machine Learning ... 219
T. S. Prabhakar and M. N. Veena

Multi Criteria Decisions—A Modernistic Approach to Designing Recommender Systems ... 231
B. N. Nithya and Manish Kumar

Convolutional Neural Network (CNN) Fundamental Operational Survey ... 245
B. P. Sowmya and M. C. Supriya

Smart Schedule Design for Blended Learning in VNU—University of Education, Hanoi, Vietnam Based on LMS Moodle Online Portal ... 259
Nguyen Thuong Huyen and Bui Thi Thanh Huong

Mobile App for Accident Detection to Provide Medical Aid ... 269
B. K. Nirupama, M. Niranjanamurthy, and H. Asha

Image Processing Using OpenCV Technique for Real World Data ... 285
H. S. Suresh and M. Niranjanamurthy

Virtual Reality: A Study of Recent Research Trends in Worldwide Aspects and Application Solutions in the High Schools ... 297
Tran Doan Vinh

A Detailed Study on Implication of Big Data, Machine Learning and Cloud Computing in Healthcare Domain ... 309
L. Abirami and J. Karthikeyan

Fog Computing: State-of-Art, Open Issues, Challenges and Future Directions ... 317
M. Sathish Kumar and M. Iyapparaja

Using GeoGebra Software Application in Teaching Definite Integral ... 327
Nguyen Thi Hong Hanh, Ta Duy Phuong, Nguyen Thi Bich Thuy, Tran Le Thuy, and Nguyen Hoang Vu

Student Performance Analysis in Spark ... 337
Arun Krishna Chitturi, C. Ranichandra, and N. C. Senthilkumar

Identification and Data Analysis of Digital Divide Issue of Ethnic Minorities in the Context of Information Technology Access-Based Approach in Vietnam ... 351
Doanh-Ngan-Mac Do, Trung Tran, and Le-Thanh Thi Tran

A Review on Content Based Image Retrieval and Its Methods Towards Efficient Image Retrieval ... 363
R. Raghavan and K. John Singh
Performance Analysis of Experimental Process in Virtual Chemistry Laboratory Using Software Based Virtual Experiment Platform and Its Implications in Learning Process ... 373
Vu Thi Thu Hoai and Tran Thi Thu Thao

Human Detection in Still Images Using Hog with SVM: A Novel Approach ... 385
Shanmugasundaram Marappan, Prakash Kuppuswamy, Rajan John, and N. Shanmugavadivu

Simulation of Algorithms and Techniques for Green Cloud Computing Using CloudAnalyst ... 399
Hasan Qarqur and Melike Sah

Data-Mining Based Novel Neural-Networks-Hierarchical Attention Structures for Obtaining an Optimal Efficiency ... 409
Ahmed J. Obaid and Shubham Sharma

Protecting Cloud Data Privacy Against Attacks ... 421
Maryam Ebrahimi, Ahmed J. Obaid, and Kamran Yeganegi

Cuckoo Search Optimization Algorithm and Its Solicitations for Real World Applications ... 435
M. Niranjanamurthy, M. P. Amulya, N. M. Niveditha, and Dharmendra Chahar

Academic Article Recommendation by Considering the Research Field Trajectory ... 447
Shi-jie Lin, Guanling Lee, and Sheng-Lung Peng

Classification of Kannada Hand Written Alphabets Using Multi-class Support Vector Machine with Convolution Neural Networks ... 455
Kusumika Krori Dutta, Aniruddh Herle, Lochan Appanna, A. Tushar, and K. Tejaswini
About the Editors
Dr. Margarita N. Favorskaya is a full Professor and Head of the Department of Informatics and Computer Techniques at Reshetnev Siberian State University of Science and Technology, Russian Federation. Professor Favorskaya has been a member of the KES organization since 2010 and serves as an IPC member and chair of invited sessions at international conferences. She is a reviewer for international journals (Neurocomputing; Pattern Recognition Letters; Engineering Applications of Artificial Intelligence) and an associate editor of international journals (Intelligent Decision Technologies Journal; Knowledge Engineering and Soft Data Paradigms; International Journal of Knowledge-Based and Intelligent Engineering Systems; International Journal of Reasoning-based Intelligent Systems). She is the author or co-author of more than 200 publications and 20 educational manuals in computer science, and has recently co-edited twenty books for Springer. She has supervised nine Ph.D. candidates and is presently supervising five Ph.D. students. Her main research interests are digital image and video processing, machine learning, deep learning, remote sensing, pattern recognition, artificial intelligence, and information technologies. Sheng-Lung Peng is a Professor and the director (head) of the Department of Creative Technologies and Product Design, National Taipei University of Business, Taiwan. He received his Ph.D. degree in Computer Science from the National Tsing Hua University, Taiwan. He is an honorary Professor of Beijing Information Science and Technology University, China, and a visiting Professor of Ningxia Institute of Science and Technology, China. He is also an adjunct Professor of Mandsaur University, India. Dr. Peng has edited several special issues of journals such as Soft Computing, Journal of Internet Technology, Journal of Real-Time Image Processing, International Journal of Knowledge and System Science, and MDPI Algorithms. His research interests are in designing and analysing algorithms for bioinformatics, combinatorics, data mining and networks, areas in which he has published over 100 research papers.
Dr. Milan Simic has Ph.D., Master and Bachelor degrees in Electronic Engineering from the University of Nis, Serbia, and a Graduate Diploma in Education from RMIT University, Melbourne, Australia. He has comprehensive experience from industry (Honeywell), research institutes and academia in Europe and Australia, and holds industry and academic awards for his research. Currently with the School of Engineering, RMIT University, Melbourne, Dr. Simic is also:
• General Editor, KES Journal: http://www.kesinternational.org/journal/
• Professor, University Union Nikola Tesla, Belgrade, Serbia
• Adjunct Professor, KIIT University, Bhubaneswar, Odisha, India
• Associate Director, Australia-India Research Centre for Automation Software Engineering
He conducts multidisciplinary research in the following areas: mechatronics, automotive, biomedical engineering, robotics, physical networks, information coding, green energy, autonomous systems, engineering management and education. He established the mechatronics laboratory and developed and managed the first Mechatronics Program at RMIT University, as well as the Master in Engineering Management Program, and established the RMIT Engineering CISCO Networking Academy. He is a member of programme committees for a large number of international conferences and a reviewer for numerous journals. Basim Alhadidi is presently a full Professor at the Computer Information Systems Department at Al-Balqa Applied University, Jordan. He earned his Ph.D. in 2000 in Engineering Science (Computers, Systems and Networks) and received his M.Sc. in 1996 in Engineering Science (Computer and Intellectual Systems and Networks). He has published many research papers on topics such as computer networks, image processing and artificial intelligence. He is a reviewer for several journals and conferences and has been appointed at many conferences as keynote speaker, reviewer, track chair and track co-chair. Souvik Pal, Ph.D., is currently an Associate Professor and Head of the Computer Science and Engineering Department at the Global Institute of Management and Technology, West Bengal, India. Prior to that, he was associated with Brainware University, Kolkata, India; JIS College of Engineering, Nadia; Elitte College of Engineering, Kolkata; and Nalanda Institute of Technology, Bhubaneswar, India. Dr. Pal received his B.Tech., M.Tech. and Ph.D. degrees in the field of Computer Science and Engineering. He has more than a decade of academic experience. He is the author or co-editor of 12 books from reputed publishers, including Elsevier, Springer, CRC Press and Wiley, and he holds three patents. He serves as a series editor for “Advances in Learning Analytics for Intelligent Cloud-IoT Systems”, published by Scrivener Publishing (Scopus-indexed), and “Internet of Things: Data-Centric Intelligent Computing, Informatics, and Communication”, published by CRC Press, Taylor & Francis Group, USA. Dr. Pal has published a number of research papers in Scopus/SCI-indexed international journals and conferences. He is the organizing chair of RICE 2019, Vietnam; RICE 2020,
Vietnam; and ICICIT 2019, Tunisia. He has been invited as a keynote speaker at ICICCT 2019, Turkey, and ICTIDS 2019, Malaysia. His professional activities include roles as associate editor and editorial board member for more than 100 international journals and conferences of high repute and impact. His research areas include cloud computing, big data, the Internet of Things, wireless sensor networks and data analytics. He is a member of many professional organizations, including MIEEE; MCSI; MCSTA/ACM, USA; MIAENG, Hong Kong; MIRED, USA; MACEEE, New Delhi; MIACSIT, Singapore; and MAASCIT, USA.
Throat Microphone Speech Enhancement Using Machine Learning Technique Subrata Kumer Paul, Rakhi Rani Paul, Masafumi Nishimura, and Md. Ekramul Hamid
Abstract Throat Microphone (TM) speech is narrow-bandwidth speech and sounds unnatural, unlike an acoustic microphone (AM) recording. Although TM-captured speech is not affected by environmental noise, it suffers from naturalness and intelligibility problems. In this paper, we focus on enhancing the perceptual quality of TM speech using a machine learning technique, by modifying the spectral envelope and vocal tract parameters. The Mel-frequency Cepstral Coefficients (MFCCs) feature extraction technique is carried out to extract speech features, and a neural network is then used to map the features of the TM speech to those of the AM speech. This improves the perceptual quality of the TM speech with respect to the AM speech by estimating and correcting the missing high-frequency components between 4 and 8 kHz from the low-frequency band (0–4 kHz) of the TM signal. Least-square estimation and the Inverse Short-time Fourier Transform Magnitude method are then applied to the estimated power spectrum to reconstruct the speech signal. The ATR503 dataset is used to test the proposed technique, and the simulation results show a visible performance gain for speech enhancement in adverse environments. The aim of this study is natural human–machine interaction for people with vocal tract disorders.

Keywords Machine learning · Multi-layered feed forward neural network · Mel frequency cepstral coefficients · Speech spectra · Linear prediction coefficients

S. K. Paul (B) · R. R. Paul · Md. E. Hamid
Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
e-mail: [email protected]

M. Nishimura
Faculty of Informatics, Shizuoka University, 3-5-1 Johoku, Naka-ku, Hamamatsu-shi, Shizuoka 432-8011, Japan
e-mail: [email protected]
1 Introduction

The throat microphone comprises a pair of housed units fixed on a neckband. This skin vibration transducer, placed near the larynx, is used to record the voice signal. Although this voice is perceptually intelligible, it does not sound as natural as AM speech. Acoustic microphone speech, in turn, suffers from noise in real environments [1]. The aim of this study is to enhance the quality of TM speech for better human–machine communication. Over the last few decades, many studies have focused on speech enhancement by suppressing additive background noise. In this section, we review the throat microphone speech enhancement methods published to date. Shahina and Yegnanarayana [2] show that throat microphone speech is robust to noise but does not sound natural; their approach improves the perceptual quality of the throat microphone speech by using an ANN to map speech features from TM to AM speech, and also presents a mapping technique for bandwidth expansion of telephone speech. Another study, by Murthy et al. [3], presents a mapping technique from TM speech to AM speech to improve the quality of TM recordings, using pairwise vector quantization of spectral feature vectors obtained from every analysis frame of TM and AM speech. From the literature, we understand that the enhancement of throat microphone speech can be approached in two ways: one based on the source-filter model and another based on neural networks deployed as mapping models. In this research, we enhance the perceptual quality of the TM speech signal using an Artificial Neural Network (ANN) based mapping technique that maps frame-wise spectra from TM to AM speech. A frame-wise Multi-Layered Feed Forward Neural Network (MLFFNN) is used to obtain a smooth mapping without ‘spectral jumps’ between adjacent frames. Speech features are estimated using the MFCC feature extraction method, and the MLFF neural network then maps the features of the TM speech to those of the AM speech to improve the perceptual quality of the former. As the spectrograms in Fig. 1 show, frequencies above 4000 Hz are totally missing in throat microphone speech; moreover, TM speech energy between 2000 and 4000 Hz is low compared to the acoustic microphone. The throat and acoustic microphones are thus sensitive to different features of the signal, and their spectra vary as a function of the speaker, the location of the transducers, and the voicing of the speech itself. The paper is organized as follows: Sect. 2 describes the proposed system, the ATR503 dataset, the feature extraction process, the design of the ANN model, and the speech reconstruction method. Section 3 presents the experimental results and discussion, and concluding remarks are given in Sect. 4.
Fig. 1 Spectrograms of speech sound for vowel /e/ recorded simultaneously by using AM (top) and TM (bottom)
2 Methods and Dataset

2.1 Proposed Machine Learning Based Method

This study addresses the improvement of the perceptual quality of the TM speech signal using an Artificial Neural Network (ANN) based mapping technique that maps frame-wise spectra from the TM signal to the AM signal [4]. The study consists of two stages. In the first stage, the system is trained to learn the mapping. For training, speakers utter sample speech that is recorded simultaneously with a TM placed near the larynx and an AM placed near the mouth; concurrent recording ensures that the model learns the mapping between parallel frames from both microphones. The cepstral coefficients are then estimated using the MFCC feature extraction technique [5]. Figure 2 illustrates the two stages of the mapping technique. During the training phase, the cepstral coefficients extracted from the throat speech are mapped onto the cepstral coefficients (CCs) extracted from the corresponding AM speech: the coefficients derived from the TM data form the input of the Multilayer Feed Forward Neural Network (MLFFNN) mapping system, while the coefficients extracted from the AM speech form the desired output. In the second stage, testing, the cepstral coefficients derived from a test TM utterance are given as input to the trained network. The output produced by the network is the set of estimated, or mapped, cepstral coefficients, which correspond to the AM speech for that test input; these mapped coefficients are then used for speech reconstruction.
Fig. 2 The artificial neural network model for training and testing
2.2 Experimental Dataset Description

We use the ATR503 dataset to evaluate the proposed study. This dataset was collected from the Nishimura Laboratory, Shizuoka University, Japan. It contains simultaneously recorded Acoustic Microphone (AM) and Throat Microphone (TM) audio files of the five vowel sounds [a, e, i, o, u]. Each vowel is uttered 100 times, so the dataset contains 100 × 5 = 500 audio files each for TM and AM speech. The recordings were made by reading voice through two channels, an acoustic and a throat microphone, in a soundproof room. The dataset contains 5 male speakers recorded at 44 kHz.
2.3 Speech Features Extraction Using MFCC Method

Figure 3 presents the block diagram of the computation steps of Mel-frequency Cepstral Coefficient estimation for speech feature extraction. The MFCC calculation includes the following steps: preprocessing of the signal, framing and windowing, the Fast Fourier Transform (FFT), the Mel filter bank and logarithm, and lastly the Discrete Cosine Transform (DCT) [6]. The number of Mel filter banks determines the number of MFCCs computed. As Fig. 4 illustrates, MFCC sizes of 10, 20, 30, …, 90, 100 are considered, so for each speech signal ten reconstructed files are generated, and for each of these ten MFCC sizes we calculate the formant distance between the original and reconstructed signals. Formant distances are calculated in the same way for all 200 signals in the database, as shown in Fig. 4. The MLFFNN parameters used are described in Table 1.
Fig. 3 MFCC steps for speech feature extraction
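To make the pipeline of Fig. 3 concrete, a minimal MATLAB sketch of the MFCC computation is given below. This is an illustrative from-scratch implementation, not the authors' code: the function name simple_mfcc, the 25 ms frame length, the 10 ms hop and the triangular Mel filter bank are common defaults rather than values reported in the paper, and the Signal Processing Toolbox is assumed for buffer and dct.

% Minimal MFCC computation: framing/windowing, FFT, Mel filter bank,
% log compression and DCT, following the block diagram of Fig. 3.
function C = simple_mfcc(x, fs, numCoeffs)
frameLen = round(0.025 * fs);              % 25 ms frames (assumption)
hop      = round(0.010 * fs);              % 10 ms hop (assumption)
nfft     = 2^nextpow2(frameLen);
numFilters = max(numCoeffs, 26);           % at least 26 Mel filters
mel  = @(f) 2595 * log10(1 + f / 700);     % Hz -> Mel
imel = @(m) 700 * (10.^(m / 2595) - 1);    % Mel -> Hz
edges = imel(linspace(mel(0), mel(fs/2), numFilters + 2));
bins  = floor((nfft + 1) * edges / fs) + 1;
H = zeros(numFilters, nfft/2 + 1);         % triangular filter bank
for k = 1:numFilters
    for j = bins(k):bins(k+1) - 1
        H(k, j) = (j - bins(k)) / (bins(k+1) - bins(k));
    end
    for j = bins(k+1):bins(k+2) - 1
        H(k, j) = (bins(k+2) - j) / (bins(k+2) - bins(k+1));
    end
end
frames = buffer(x, frameLen, frameLen - hop, 'nodelay');
frames = frames .* hamming(frameLen);      % windowing
P = abs(fft(frames, nfft)).^2;             % power spectrum per frame
E = H * P(1:nfft/2 + 1, :);                % Mel filter-bank energies
C = dct(log(E + eps));                     % logarithm + DCT
C = C(1:numCoeffs, :);                     % keep the first coefficients
end

A call such as C = simple_mfcc(x, 44000, 60) would then return the 60 coefficients per frame that the study found to perform best.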
Fig. 4 Analysis of the reconstruction speech signals

Table 1 MLFFNN parameters and their values

No  Parameter                  Value
1   Neurons (input layer)      60
2   Neurons (hidden layer 1)   100
3   Neurons (hidden layer 2)   100
4   Neurons (output layer)     60
5   Activation function        Sigmoid
6   Learning rate              0.01
7   Number of epochs           500
8   Training target            1e-25
9   Batch size                 10
Finally, we calculate the band-wise mean, standard deviation, and standard error. The goal is to determine which reconstruction is most similar to the original speech signal and to quantify the error between them. It is observed that when 60 cepstral coefficients per frame are taken, the standard deviation and the standard error are minimal, which indicates the best performance.
2.4 Design of the MLFF Neural Network

In this section, we discuss the MLFFNN, which belongs to the family of deep learning models. It uses one or two hidden layers to recognize more complex features: each hidden layer applies weights to its inputs and passes them through an activation function to produce its output. A fully connected multi-layer neural network is called a Multilayer Perceptron (MLP). The main advantage of an ANN is that it can solve difficult and complex problems, although it sometimes needs a long training time. In this study, the proposed MLFFNNs provide the least mean absolute errors at a given SVD value. An MLFF neural network has three kinds of layers: input, hidden and output (refer to Fig. 5) [7]. The hidden neurons enable the network to handle more complicated tasks by progressively extracting features from the input vector pattern. As Fig. 5 illustrates, the input layer forwards data to the hidden layers and on to the output layer; this step is forward propagation. Backward propagation, in turn, adjusts the weights to minimize the difference between the estimated output and the original. The parameters and values used to train the MLFFNN are given in Table 1.
Fig. 5 A MLFFNN model architecture of the proposed method
In this experiment, two hidden layers are used, each containing 100 neurons, which gives better results with less distortion. The sigmoid function is chosen because its output lies between 0 and 1; it is commonly used in artificial neural networks where the output is interpreted as a probability, which is why we adopt it here.
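A hedged sketch of how a network with the Table 1 settings could be configured in MATLAB is shown below. It assumes the Deep Learning Toolbox; X and T stand for 60 × N matrices of TM and AM cepstral coefficients, plain batch gradient descent stands in for whatever optimizer the authors actually used, and the mini-batch size of 10 from Table 1 is not reproduced.

% 60-100-100-60 MLFFNN with sigmoid activations (Table 1 values)
net = feedforwardnet([100 100], 'traingd');  % two hidden layers, gradient descent
net.layers{1}.transferFcn = 'logsig';        % sigmoid activation
net.layers{2}.transferFcn = 'logsig';
net.trainParam.lr     = 0.01;                % learning rate
net.trainParam.epochs = 500;                 % number of epochs
net.trainParam.goal   = 1e-25;               % training target
net = train(net, X, T);                      % X: TM MFCCs, T: AM MFCCs
mappedCC = net(Xtest);                       % mapped cepstral coefficients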
2.5 Speech Reconstruction

The speech power spectrum is obtained from the cepstral coefficients using the Moore–Penrose pseudo-inverse technique. Least-squares estimation followed by the Inverse Short-time Fourier Transform Magnitude (LSE-ISTFTM) method is then applied to this power spectrum, which is used to reconstruct the speech waveform [8].
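A rough sketch of this inversion is given below, reusing H, frameLen, hop and nfft from the MFCC sketch above. The pseudo-inverse step follows the text; the iterative phase recovery shown is a standard Griffin–Lim-style LSE-ISTFTM loop, assumed here rather than taken from the paper, and the iteration count of 30 is arbitrary.

% Cepstra -> log Mel energies -> power spectrum via Moore-Penrose pinv
logE = idct(C, size(H, 1));                 % invert the DCT
mag  = sqrt(max(pinv(H) * exp(logE), 0));   % one-sided magnitude spectrum
[nBins, nFrames] = size(mag);
xlen = hop * (nFrames - 1) + frameLen;
x = randn(xlen, 1);                         % random initial phase
w = hamming(frameLen);
for it = 1:30                               % LSE-ISTFTM iterations
    F = zeros(nfft, nFrames);               % analysis STFT of the estimate
    for m = 1:nFrames
        seg = x((m-1)*hop + (1:frameLen)) .* w;
        F(:, m) = fft(seg, nfft);
    end
    ph   = angle(F(1:nBins, :));            % keep phase, impose magnitude
    half = mag .* exp(1i * ph);
    F = [half; conj(flipud(half(2:end-1, :)))];  % full symmetric spectrum
    x = zeros(xlen, 1); den = zeros(xlen, 1);
    for m = 1:nFrames                       % least-squares overlap-add
        seg = real(ifft(F(:, m)));
        idx = (m-1)*hop + (1:frameLen);
        x(idx)   = x(idx) + seg(1:frameLen) .* w;
        den(idx) = den(idx) + w.^2;
    end
    x = x ./ max(den, eps);                 % normalized synthesis
end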
3 Experimental Results and Discussions

3.1 Speech Spectrogram Comparison

We plot spectrograms of both the AM speech and the enhanced speech to visually compare the enhancement of the TM speech [9]. Figure 6 shows the spectrograms of the AM speech, the TM speech, and the speech enhanced by the proposed method. It is observed that the enhanced speech is much closer to the AM speech, as the proposed MLFFNN model recovers the missing high-band frequencies.

Fig. 6 Speech spectrograms of the acoustic microphone speech, throat microphone speech and the enhanced speech by the proposed method for vowel sound /e/
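For reference, a Fig. 6-style comparison can be plotted in MATLAB as sketched below (Signal Processing Toolbox; am, tm and enh are illustrative names for time-aligned signals at sampling rate fs, and the window sizes are arbitrary choices, not the paper's settings):

% Stacked spectrograms of the AM, TM and enhanced signals
sigs = {am, tm, enh}; names = {'AM', 'TM', 'Enhanced'};
for k = 1:3
    subplot(3, 1, k);
    spectrogram(sigs{k}, hamming(512), 384, 1024, fs, 'yaxis');
    title(names{k});
end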
3.2 LPC Power Spectrum Comparison

Linear prediction coefficients (LPCs) are a form of linear regression; the spectral envelope magnitude can be computed from the LPC parameters by evaluating the transfer function of the corresponding all-pole filter [10]. Figure 7 shows the speech spectra of the AM speech, the TM speech, and the enhanced speech produced by the proposed method. It is clear from the figure that the enhanced speech spectra are a close approximation of the AM speech spectra.

Fig. 7 LPC power spectra of the AM speech (blue), TM speech (red), and the enhanced speech (green) by the proposed method for vowel sound /e/. In the figure, F1, F2, … are the formant frequencies
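A hedged sketch of this comparison for one analysis frame is given below; the LPC order of 12 is an assumption, and picking formants straight from the pole angles is a simplification (bandwidth checks are omitted):

% LPC spectral envelope and rough formant estimates for one frame
p = 12;                                       % LPC order (assumption)
a = lpc(frame .* hamming(numel(frame)), p);   % all-pole model A(z)
[h, f] = freqz(1, a, 512, fs);                % envelope = 1/A(z) response
plot(f, 20*log10(abs(h)));                    % LPC power spectrum in dB
r = roots(a);
r = r(imag(r) > 0);                           % one root per conjugate pair
formants = sort(angle(r) * fs / (2*pi));      % candidate F1, F2, ... in Hz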
3.3 Perceptual Evaluation of Speech Quality (PESQ) Measure

The PESQ measure is carried out to assess the quality of the speech enhancement. The PESQ score varies from 0.50 (worst) up to 4.50 (best), as determined by the ITU; it is a widely used method and one of the best algorithms for estimating a subjective measure [11].
Table 2 LSD and PESQ scores comparison

Vowel   Signals                  LSD (dB)   PESQ
/a/     AM and TM speech         1.3        3.1
        AM and enhanced speech   1.2        3.2
/e/     AM and TM speech         1.4        3.3
        AM and enhanced speech   1.3        3.3
/i/     AM and TM speech         1.9        3.7
        AM and enhanced speech   1.8        3.9
/o/     AM and TM speech         1.0        4.0
        AM and enhanced speech   1.1        4.1
/u/     AM and TM speech         1.2        3.9
        AM and enhanced speech   1.2        4.0
Table 2 presents the LSD and PESQ scores between the AM speech and the TM speech, and between the AM speech and the enhanced speech produced by the proposed method. Higher PESQ scores indicate better speech quality.
3.4 Log Spectral Distance (LSD) Measure

The LSD measures the quality of the estimated speech signal with respect to its original wide-band counterpart. Table 2 shows the average LSD scores between the original acoustic speech and the throat speech signal, together with the corresponding average PESQ scores between the AM speech and the enhanced TM speech signal. Notice that as the PESQ increases, the LSD scores decrease in a consistent way [11].
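For reference, a standard definition of the LSD over $M$ frames and $K$ frequency bins is given below; the paper does not state the exact variant used, so this is the common form, with $P(m,k)$ and $\hat{P}(m,k)$ denoting the power spectra of the reference (AM) and estimated signals:

$$\mathrm{LSD} = \frac{1}{M}\sum_{m=1}^{M}\sqrt{\frac{1}{K}\sum_{k=1}^{K}\left(10\log_{10}\frac{P(m,k)}{\hat{P}(m,k)}\right)^{2}}\ \text{dB}.$$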
3.5 Speech Formant Analysis Measure

Linear prediction coefficients can be used to represent a signal. Formants are the resonance frequencies of the vocal tract, observed as characteristic amplitude peaks in the spectrum. Table 3 shows that the formant frequencies of the enhanced speech produced by the proposed method are very close to the corresponding AM speech values. Figure 7 shows the formant structure for vowel sound /e/, where it is clearly observed that the formant structure of the reconstructed (enhanced) signal is close to that of the desired AM signal. Figure 8 presents a graphical comparison of the formant frequencies for vowel /e/, which is easier to interpret than Table 3.
Table 3 Speech formant frequencies measurement

Vowel   Speech     F1    F2     F3     F4     F5     F6
/a/     AM         337   1214   2131   3266   5585   5961
        TM         351   2083   3587   3747   5743   6249
        Enhanced   347   1286   2097   3210   5527   5951
/e/     AM         431   2040   2496   3167   6105   7365
        TM         378   1977   2064   2448   6360   7322
        Enhanced   490   2002   2389   3114   6073   7312
/i/     AM         452   1698   1980   2774   4102   5173
        TM         466   1938   2052   2482   4202   4996
        Enhanced   452   1726   1916   2598   4110   4872
/o/     AM         303   2270   2988   3872   5622   6138
        TM         415   2274   3236   3849   5684   6031
        Enhanced   321   2463   2965   3844   5686   6255
/u/     AM         315   1239   1304   2160   3275   5628
        TM         322   2089   2208   2477   3591   5709
        Enhanced   324   1273   1822   2150   3293   5684

‘F’ represents formant frequency in Hz
[Bar chart comparing formant frequencies F1–F6 (Hz) of the AM, TM, and enhanced speech for vowel /e/.]
Fig. 8 Comparison of speech formant distance for /e/ vowel
4 Conclusion

In this research, we focused on improving the perceptual quality of TM speech using a machine learning technique, modelling the proposed system with a Multi-Layered Feed Forward Neural Network (MLFFNN). The perceptual quality of a speech signal depends on its acoustic characteristics. The method was divided into two subtasks: the first extracted speech features using the MFCC method for both the AM and TM speech, and the second performed spectral mapping using the MLFFNN. We used the ATR503 dataset for both training and testing. The output of the neural network was the enhanced speech, in which the missing and degraded frequencies of the TM speech were corrected. Various subjective and objective measures were taken to evaluate the performance of the proposed method, and the results show a noticeable improvement for speech communication in adverse environments.
References

1. Gibbon, D. (2001). Prosody: Rhythms and melodies of speech (pp. 1–35). Germany: Bielefeld University.
2. Shahina, A., & Yegnanarayana, B. (2007). Mapping speech spectra from throat microphone to close-speaking microphone: A neural network approach. EURASIP Journal on Advances in Signal Processing, 2007, 1–10.
3. Murty, K. S. R., Khurana, S., Itankar, Y. U., Kesheorey, M. R., & Yegnanarayana, B. (2008). Efficient representation of throat microphone speech. In INTERSPEECH, 9th Annual Conference of the International Speech Communication Association (pp. 2610–2613).
4. Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation (pp. 318–362). Cambridge, MA: MIT Press.
5. Vijayan, K., & Murty, K. S. R. (2014). Comparative study of spectral mapping techniques for enhancement of throat microphone speech. In Twentieth National Conference on Communications (NCC), Kanpur (pp. 1–5).
6. Sremath, S., Reza, S., Singh, A., & Wang, R. (2017). Speaker identification features extraction methods: A systematic review. Expert Systems with Applications, 90, 250–271.
7. Yegnanarayana, B. (1994). Artificial neural networks for pattern recognition. Sadhana, 19(Part 2), 189–238.
8. Min, G., Zhang, X., Yang, J., & Zou, X. (2015). Speech reconstruction from Mel frequency cepstral coefficients via 1-norm minimization. In IEEE 17th International Workshop on Multimedia Signal Processing (MMSP), Xiamen (pp. 1–5).
9. Hussain, T. (2017). Experimental study on extreme learning machine applications for speech enhancement. IEEE Access, 5, 1–13.
10. Roy, S. K. (2016). Single channel speech enhancement using Kalman filter (Master’s thesis), Concordia University.
11. Ali, M., & Turan, T. (2013). Enhancement of throat microphone recordings using gaussian mixture model probabilistic estimator (Master’s thesis), Koc University.
Use of Artificial Neural Network to Predict the Yield of Sinter Plant as a Function of Production Parameters Arghya Majumder, Chanchal Biswas, Saugata Dhar, Rajib Dey, and G. C. Das
Abstract Nowadays an effective process management system is essential for the sustainability of an integrated steel plant: an effective process enhances product quality and increases cost efficiency. The nature of the raw material, its mix proportion, size, chemical composition and the process parameters play a vital role in sinter mineralogy. The main objective of this study is to optimize the sinter plant process parameters to obtain the best productivity of the sinter plant. Sinter plays a vital role in the production of hot metal in the blast furnace. A large number of industrial parameters, as many as 106, control the productivity of the sinter plant in a very complex manner, and there has not been much study on predicting sinter yield as a function of all those parameters combined. Perhaps for the first time, an attempt has been made here to predict the sinter yield using an Artificial Neural Network (ANN), with the large volume of industrial data available at Vizag Steel over a fairly long period of time. One of the most important achievements of this paper is the reduction in the number of parameters using metallurgical knowledge and experience alone (without any sophisticated optimization technique): the prediction of sinter yield with this reduced parameter set is almost as good as that obtained using the exhaustive set of 106 parameters within the ANN framework.
A. Majumder (B) · C. Biswas
School of Mining and Metallurgical Engineering, Kazi Nazrul University, Asansol 713340, India
e-mail: [email protected]

S. Dhar · R. Dey · G. C. Das
Department of Metallurgical and Material Engineering, Jadavpur University, Kolkata 700032, India
e-mail: [email protected]
Keywords Sinter making · Productivity · RMSD · Artificial neural networking (ANN)
1 Introduction

The main feeds for the blast furnace are sinter and pellets. In the first part of the twentieth century, sinter-making technology evolved to treat and better utilize the fines generated during iron ore mining as well as during steel making. It is no surprise that sinter has become the widely accepted and preferred blast furnace feed material for iron making: in excess of 70% of hot metal in the world is currently produced through sinter [1]. In India, around 50% of hot metal is produced using sinter feed in blast furnaces, which helps improve their productivity [2]. The sinter plant plays the vital role of supplying sinter of the right quality and quantity to the blast furnace. A great deal of technological upgrading of sinter making has taken place in the last few decades all over the world; competition among steel plants and stringent environmental requirements have brought tremendous changes on all fronts of sinter making [2–4]. Under present conditions there remains tremendous scope for improvement in both the quantity and the quality of sinter. Since the quality of the raw material, mainly sinter, has not yet reached its optimum level, a careful approach to improving both will ensure consistency in hot metal quality. The productivity of the sinter plant is affected by different parameters, which in turn affect the productivity of the blast furnace [2]; in other words, improving sinter plant productivity raises blast furnace productivity correspondingly. These parameters include vacuum, machine speed, green mix moisture, crushing indices of limestone, dolomite and coke, raw material chemistry and size fractions, and the usage of LD slag and metallurgical waste, to name a few. Each of these parameters individually affects the productivity of the sinter plant [5]. By controlling these parameters as well as the chemistry, the final sinter quality can also be controlled. The sinter plant process parameters can thus be linked to higher sinter plant productivity; once this link is found, the plant may be operated with those parameters to obtain optimized productivity with the required sinter microstructure and mechanical properties [2, 6–9]. Hardly any research has been done to understand and ascertain the complex interlinkage between the operating parameters and plant productivity, so it would be worthwhile to find such an interlinkage. Among the available models, the neural network is one of the most used nowadays. Neural networks process information much as the human brain does: to solve a specific problem, a large number of neurons work together in tandem to process the data, and the network learns from its environment and stores that knowledge internally. Numerical simulations of the various conditions in an iron ore sintering bed can be carried out for various parameters, such as coke content and air suction rate, along with other model parameters. A transient one-dimensional model that considers multiple solid phases can be used for the iron ore sintering process based on
the assumption of porous media. Mathematical modelling can be a very important tool for analyzing, controlling and optimizing a complex process like sintering, bearing in mind that the process can be modelled with a reasonable degree of accuracy. Such a model can also be used to study the sensitivity of the process to critical variables such as the applied suction, the amount and mean size of coke breeze, and the ignition time. An integrated mathematical model for the complete sintering process can be formed by extending such sub-models [10, 11]. Much effort has gone into understanding the iron ore sintering process through the simulation of mathematical models. These models demonstrate the effects of heat and mass transfer, the drying and condensation of water, gas flow in the machines, coke combustion, and the charge melting and solidification phenomena that occur during the process; they also help calculate parameters inside the bed such as composition, solid and gas temperature, and porosity. The present study uses the concepts of artificial neural networks. A large amount of data has been collected for use as input patterns for the system. An effort has been made to mimic the working of the human nervous system using different learning methods — error-correction, memory-based, Hebbian, competitive and Boltzmann — to find the error generated through a weighted network developed with MATLAB as the working language [12–15]. This process is repeated over several iterations, varying the weights, to reach the minimum error. The objectives of the paper are twofold: first, to predict sinter plant productivity using an ANN as a function of an exhaustive set of industrial sinter plant parameters; second, to study the feasibility of reducing the number of parameters using metallurgical knowledge alone, without significantly affecting the accuracy of the yield prediction.
2 Methodology A detailed study was made of the functioning of the sinter plant, and the parameters governing its productivity were identified. Data containing the controlling factors for sinter plant productivity were collected over 12 years. The guiding parameters are tabulated against productivity, each set of parameters forming one record. The tabulated data are then fed to a MATLAB program, where every set is treated as one pattern, which is used to help the network learn and mimic the human brain using the generalized regression model. The network on which it works is modelled on the human brain and is called an artificial neural network. It works through optimized weight values, and the method by which the optimized weight values are attained is called learning. The learning process teaches the network how to produce the output when the corresponding input is presented. Learning is complete when there is no significant change in the weights, i.e., the best possible synaptic connection among the patterns is reached. The algorithm used in the generalized regression model is given below:
• The input variables are organized in a single Excel sheet named “InputFileParametersValue.xlsx”, in columnar fashion, forming a table whose columns are the parameters of one pattern and whose rows are the observations taken for each parameter. Thus, an (n × m) matrix is formed, where ‘n’ is the number of parameters and ‘m’ is the number of observations.
• Another Excel file named “OutputFileProductivity.xlsx” holds the output (productivity) tabulated in a row, forming a (1 × m) matrix, where m is the number of observations.
• The above files are imported into MATLAB using the xlsread command and assigned to input and output variables.
• A network is then designed using the generalized regression model.
• The weights are updated in every step, and the revised weights are used in further iterations to form the network.
• This process is repeated until the global minimum of the error is reached, at which point there is very little or no change in the weights on further addition of input patterns.
• Once the artificial neural network architecture is built using the supervised generalized regression neural network method, an unknown input pattern is introduced as a test input for predicting productivity.
• To introduce a test input, a set of patterns is provided in an Excel file named “TestInput.xlsx”, and a test output variable is created which returns the predicted productivity. In this new variable, testoutput, the simulation function is called on the trained network with the new testinput variables.
• The weight update used in each iteration is of the form wi(t + 1) = wi(t) + η(tp − fp), where wi(t) is the weight at the tth iteration, tp and fp are the actual value and the best fit for each point, η is the learning rate, and wi(t + 1) is the new weight in the (t + 1)th iteration.
• The MATLAB code used in the command window is given below:

input = xlsread('InputFileParametersValue.xlsx');
output = xlsread('OutputFileProductivity.xlsx');
net = newgrnn(input, output);
testinput = xlsread('TestInput.xlsx');
testoutput = sim(net, testinput);
where input = variable created to store the input parameters from the Excel sheet; output = variable created to store the corresponding productivity from the Excel sheet; InputFileParametersValue.xlsx = input pattern Excel file; OutputFileProductivity.xlsx = output pattern Excel file; net = the generalized regression network formed between the input and the output variables.
testinput = test input pattern variable; TestInput.xlsx = test input pattern Excel file; testoutput = simulated output variable for the test input pattern and the network formed. The trained neural network, with the updated optimal weights, is able to produce the output corresponding to an input pattern within the desired accuracy. Thereafter, the trained network can predict the outcome beforehand, with a legitimate error duly considered. Once the artificial neural network is formed, further steps can be carried out to find the individual contribution of each parameter to productivity, which will in future help to increase process productivity by controlling the major affecting parameters.
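For readers outside MATLAB, the listing below is a minimal NumPy sketch of the same generalized-regression (GRNN) prediction step, which amounts to a Gaussian-kernel-weighted average of the stored training outputs. The file names mirror the MATLAB listing, while the smoothing parameter sigma and the sheet-orientation handling are illustrative assumptions, not plant settings.

# Minimal GRNN sketch (kernel-weighted regression); sigma and the file
# handling are illustrative assumptions, not plant settings.
import numpy as np
import pandas as pd

def grnn_predict(train_x, train_y, test_x, sigma=0.5):
    preds = []
    for x in test_x:
        d2 = np.sum((train_x - x) ** 2, axis=1)   # squared distances
        w = np.exp(-d2 / (2.0 * sigma ** 2))      # Gaussian kernel weights
        preds.append(w @ train_y / (w.sum() + 1e-12))
    return np.array(preds)

# Sheets laid out as in the MATLAB listing: parameters x observations,
# with no header row; transpose to observations x parameters.
inputs = pd.read_excel("InputFileParametersValue.xlsx", header=None).to_numpy().T
output = pd.read_excel("OutputFileProductivity.xlsx", header=None).to_numpy().ravel()
testinput = pd.read_excel("TestInput.xlsx", header=None).to_numpy().T
print(grnn_predict(inputs, output, testinput))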
3 Results and Discussion 3.1 Detailed Analysis of Sinter Plant Parameters A working database was tabulated from the VSP raw data covering an extensive 12-year period, from the available months of 2003–2005 until October 2016. The data were collected for each day, so the available data are large enough for the use of an artificial neural network (ANN). The 66 parameters that affect the productivity of the sinter plant are detailed in Table 1.
3.2 Analysis of ANN An artificial neural network (ANN) is an information processing system that works like the human nervous system. The structure of the human nervous system is critical for processing any information: to solve any kind of problem, all interconnected elements work together in a typical way. ANNs, like human beings, learn in their own way; a learning process configures an ANN for a specific application, such as pattern recognition or data classification. Learning in the human nervous system involves adjustments to the synaptic connections between neurons, and the ANN is exactly analogous [16]. After a detailed study of the functioning of the sinter plant at Vizag Steel Plant, the aforesaid parameters were taken as the exhaustive set of variables that control sinter plant productivity. An ANN architecture was formed, and a total of 106 data sets, each consisting of 66 parameters, were used sequentially to train the ANN. The training was carried out in MATLAB R2016a (9.0.0.341360), 64-bit (win64). Once training was over, the network was tested with 35 data sets. Figure 1 shows the plot of predicted and observed
Table 1 Details of the parameters in the sinter plant. The 66 parameters cover the chemistry (Total Fe, CaO, MgO, SiO2, Al2O3, Fe2O3, FeO/Fe2O3, MnO, TiO2, P2O5), loss on ignition (LOI), moisture, ash content and size fractions ((+)10 mm and (−)150 micron) of the iron ore fines (%), metallurgical waste (%), limestone (%), dolomite (%), LD slag (%), coke breeze (%), lime (%) and sand (%); the raw material consumption shares; the crushing indices of coke breeze, limestone and dolomite; and process variables such as % of lime, sinter return used (%), machine speed (m/min), vacuum (mm WC), green mix moisture (%) and basicity.
productivity of sinter as a function of the number of data sets used for testing. Table 2 summarizes the test data, i.e., the predicted productivity of the sinter along with the industrially observed productivity. These data have been used to compute the RMSD (root mean square deviation). The computed value is found to be 4.06, which is quite satisfactory and acceptable.
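This figure follows the computation laid out in Table 2: the percentage error of each test pattern is squared, averaged over the 35 patterns, and square-rooted. A small Python helper with purely illustrative numbers is given below (Table 2 takes the predicted value as the percentage base, while Table 4 uses the observed value).

# RMSD over percentage errors, as computed in Tables 2 and 4.
import numpy as np

def rmsd_percent(observed, predicted):
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    pct_err = 100.0 * (predicted - observed) / predicted  # base as in Table 2
    return np.sqrt(np.mean(pct_err ** 2))

# Illustrative call on the first three test patterns of Table 2:
print(rmsd_percent([370, 355, 371], [380, 380, 380]))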
Fig. 1 Plot of predicted and the corresponding observed productivity of sinter as a function of no. of data sets for testing
Working with such a huge number of variables poses problems for the operators, as it takes a long time to record all of them properly. Secondly, and most importantly, it requires a high-speed computational device to work with such a large data set, and it requires longer computing time. To get rid of this problem, it would be worthwhile to reduce the number of variables using experience and metallurgical knowledge, along with feedback from operators who have run the machine for more than 20 years. Keeping the above criteria in view, the parameters have been scaled down from 66 to a workable 34, and Table 3 contains the effective list of 34 parameters. With the reduced number of parameters, the ANN analysis has been carried out in the same way as with 66 parameters. Table 4 contains the predicted yield along with the actual yield and the calculated RMSD value, and Fig. 2 shows the comparison of the predicted yield with the actual yield. The input data show a gradual decrease in the error percentage with the number of data inputs; beyond a certain point, around 106 data sets, there is very little change in the error value. This clearly indicates that the system is trained by the input data. Once trained, the system is fit to predict for unknown sets of input. The minimum error of prediction is found to be 1.87% for the given sets of data. Figure 2 shows the variation of predicted and observed production of sinter as a function of the 35 data sets used for testing. Table 4 summarizes the predicted and observed production of sinter, from which the RMSD value has been found. The observed RMSD value is 5.92, which is marginally higher than the earlier RMSD value of 4.06. However, in the earlier case the number of variables considered was 66, significantly higher than 34. Secondly, the observed productivity of sinter is about two
Table 2 Observed and predicted value of production and computation of RMSD

No. of test   Observed value   Predicted value   Error (Predicted − Observed)
1             370              380                10
2             355              380                25
3             371              380                 9
4             388              380                −8
5             374              380                 6
6             376              380                 4
7             369              380                11
8             374              380                 6
9             386              380                −6
10            368              380                12
11            395              380               −15
12            386              380                −6
13            383              380                −3
14            385              380                −5
15            386              380                −6
16            393              380               −13
17            399              380               −19
18            423              380               −43
19            413              380               −33
20            419              380               −39
21            406              380               −26
22            400              362               −38
23            397              362               −35
24            386              362               −24
25            403              362               −41
26            394              380               −14
27            409              380               −29
28            398              362               −36
29            384              362               −22
30            383              362               −21
31            371              362                −9
32            370              362                −8
33            371              362                −9
34            398              362               −36
35            395              362               −33

% error E is computed relative to the predicted value. Error sum (Σ E²) = 577.63; Error sum/No. of obs. (E) = 16.50; Sqrt of E (RMSD) = 4.06
Table 3 Sample values of the 34 parameters considered

S. No.   Parameter                                                Sub-division             Trend value
1        Iron ore fines (%)                                       Total Fe                 65.09
2                                                                 SiO2                     2.73
3                                                                 Al2O3                    1.74
4                                                                 (−) micron               6.83
5                                                                 (+) micron               23.81
6        Metallurgical wastes (%)                                 Total Fe                 42.83
7                                                                 SiO2                     3.53
8                                                                 Moisture                 13
9        Limestone (%)                                            CaO                      40.58
10                                                                SiO2                     9.72
11       Dolomite (%)                                             CaO                      30.86
12                                                                MgO                      14.54
13                                                                Loss on Ignition (LOI)   45.62
14       Coke breeze (%)                                          SiO2                     51.24
15                                                                Al2O3                    33.34
16                                                                Ash content              14.48
17       Lime (%)                                                 CaO                      86.09
18       Green mix moisture                                                                4.65
19       Sinter return used (%)                                                            12.53
20       Machine speed (m/s)                                                               2.52
21       Vacuum                                                                            851
22       Raw material composition (%)                             Iron ore fines           69.71
23                                                                Dolomite                 7.99
24                                                                Limestone                8.85
25                                                                Sand/Quartz              0
26                                                                Mn ore fines             0
27                                                                Coke breeze              5.75
28                                                                Lime                     1.06
29                                                                LD slag                  0.11
30                                                                Metallurgical waste      6.52
31       Gas Cleaning Plant Injection Temperature (GCPIT) K                                114
32       Crushing index                                           Coke breeze              63.79
33                                                                Limestone                63.79
34                                                                Dolomite                 65.43
Table 4 Calculation of the root mean square deviation (RMSD) for the reduced (34-parameter) model

No. of test   Observed value   Predicted value   Error (Predicted − Observed)
1             374              355               −19
2             376              357               −19
3             369              388                19
4             374              361.74            −12.25
5             386              370               −16
6             386              371               −15
7             383              388                 5
8             385              355               −30
9             386              371               −15
10            393              370               −23
11            399              370.42            −28.57
12            423              371               −52
13            413              371               −42
14            419              371               −48
15            406              370.70            −35.29
16            400              371               −29
17            397              371               −26
18            386              371               −15
19            403              370               −33
20            394              370               −24
21            409              370               −39
22            398              371               −27
23            384              371               −13
24            371              371                 0
25            370              370                 0
26            371              371                 0
27            398              371               −27
28            395              371               −24
29            389              371               −18
30            375              371                −4
31            377              371                −6
32            360              371                11
33            373              370.89             −2.10
34            366              371                 5
35            377              371                −6

% error E = |Error|/Observed × 100. Error sum (Σ E²) = 1230.50; Error sum/No. of obs. (E) = 35.16; Sqrt of E (RMSD) = 5.92
Fig. 2 Predicted and observed sinter productivity as a function of the number of data sets used for testing
orders of magnitude higher than the RMSD of 5.92. Thus, prediction by the ANN is quite satisfactory and acceptable even with the reduced number of variables. This is in itself a significant achievement, as one can predict the sinter productivity with only 34 variables and, most importantly, no attention has to be paid to the remaining large number of variables.
4 Conclusion Both objectives of this paper were successfully achieved. A mathematical relation between the process parameters and sinter plant productivity could be established with the help of the available data and an artificial neural network model. Secondly, the reduction in working variables reduces the computational time, and the machine operators have to concentrate on fewer parameters. Finally, and most importantly, the accuracy of the yield predicted by the ANN with the reduced number of parameters is almost the same as that predicted with the exhaustive set of 66 parameters. Acknowledgements The authors would like to acknowledge the support of RINL Management in completing this paper, without which it would not have been possible. The authors also acknowledge the support of Kazi Nazrul University, Asansol and Jadavpur University, Kolkata.
References
1. Biswas, A. K. (1981). Principles of blast furnace iron making: Theory and practice (Ch. VI). Australia: Cootha Publ. House.
2. Umadevi, T., Brahmacharyulu, A., Roy, A. K., Mahapatra, P. C., Prabhu, M., & Ranjan, M. (2011). Influence of fines feed size on microstructure, productivity and quality of iron ore sinter. ISIJ International, 51(6), 922–929. https://doi.org/10.2355/isijinternational.51.922.
3. Loo, C. E., & Leung, W. (2003). Factors influencing the bonding phase structures of iron ore sinters. ISIJ International, 41(2), 128–135. https://doi.org/10.2355/isijinternational.43.1393.
4. Panigraphy, S. C., Jallouli, M., & Rigaud, M. (1984). Porosity of sinters and pellets and its relationship with some of their properties. In Ironmaking Proceedings (Vol. 43, pp. 233–240), AIME, Chicago, Illinois, USA.
5. Ishikawa, Y., Shimomura, Y., Sasaki, M., Hida, Y., & Hideo, T. (1983). Improvement of sinter quality based on the mineralogical properties of ores. In Ironmaking Proceedings (Vol. 42, pp. 17–29), Atlanta, GA, USA.
6. Hida, Y., Sasaki, M., Shimomura, Y., Haruna, S., & Soma, H. (1983). Basic factors for ensuring high quality of sintered ore (pp. 60–74). Australia: BHP Central Research Laboratories.
7. Loo, C. E. (2005). A perspective of goethitic ore sintering fundamentals. ISIJ International, 45, 436–448. https://doi.org/10.2355/isijinternational.45.436.
8. Tang, W.-D., Xue, X.-X., Yang, S.-T., Zhang, L.-H., & Huang, Z. (2018). Influence of basicity and temperature on bonding phase strength, microstructure, and mineralogy of high-chromium vanadium–titanium magnetite. International Journal of Minerals, Metallurgy, and Materials, 25, 871–880. https://doi.org/10.1007/s12613-018-1636-1.
9. Bhagat, R. P. (1999). Factors affecting return sinter fines regime and strand productivity in iron ore sintering. ISIJ International, 39, 889–895. https://doi.org/10.2355/isijinternational.39.889.
10. Brimacombe, J. K. (1985). Role of mathematical modelling in metallurgical engineering. In International Conference on Progress in Metallurgical Research, Fundamental and Applied Aspects, Feb 11–15, IIT Kanpur.
11. Higuchi, K., Kawaguchi, T., Kobayashi, M., Toda, Y., Tsubone, Y., Ito, Y., & Furusho, S. (2006). High-productivity operation of commercial sintering machine by stand-support sintering. Nippon Steel Technical Report No. 94.
12. Shalmabeebi, A., Saranya, D., & Sathya, T. (2015). A study on neural networks. International Journal of Innovative Research in Computer and Communication Engineering, 3(12), 12890–12895. https://doi.org/10.15680/IJIRCCE.2015.0312055.
13. Fan, X.-H., Lib, Y., & Chen, X.-L. (2012). Prediction of iron ore sintering characters on the basis of regression analysis and artificial neural network. Energy Procedia, 16, 769–776. https://doi.org/10.1016/j.egypro.2012.01.124.
14. Bhadeshia, H. K. D. H. (1999). Neural networks in materials science. ISIJ International, 39(10), 966–979. https://doi.org/10.2355/isijinternational.39.966.
15. Mehta, A. J., Mehta, H. A., Manjunath, T. C., & Ardi, C. (2007). A multi-layer artificial neural network architecture design for load forecasting in power systems. International Journal of Applied Mathematics and Computer Science, 10(4), 227–240.
16. Stevenson, R. L. (1981). Power system analysis. Singapore: McGraw Hill.
An Approach to Self-reliant Smart Road Using Piezoelectric Effect and Sensor Nodes Aindrila Das, Aparajita Ghosh, Sudipta Sahana, Dharmpal Singh, and Ahmed J. Obaid
Abstract Road accidents are a major problem these days, and a few of them are due to human error in controlling traffic signals. In this era of advanced technologies, there are many ways to prevent these kinds of blunders. This paper describes an automatic traffic signal that works on piezoelectric power. The traffic lights operate only when traffic is passing; in this way, accidents are not only minimized but power is also conserved, especially around midnight or at dawn when traffic is light. The power supply of the system is self-dependent and independent of weather or power failure. In addition, this power supply can also be used to provide electricity to street lights and CCTV cameras, conserving a huge amount of electricity. Keywords Piezo · Energy harvest · Surveillance · Traffic · Power failure
1 Introduction The proposed work aims to track and reduce the number of road accidents occurring all over India, and to conserve electricity. The Greek-derived word piezoelectricity refers to electricity resulting from pressure and latent heat. Jacques and Pierre Curie first
A. Das · A. Ghosh · S. Sahana (B) · D. Singh Department of CSE, JIS College of Engineering, Kalyani, Nadia, West Bengal, India e-mail: [email protected] A. Das e-mail: [email protected] A. Ghosh e-mail: [email protected] D. Singh e-mail: [email protected] A. J. Obaid Faculty of Computer Science and Mathematics, Kufa University, Kufa, Iraq e-mail: [email protected]
introduced the piezoelectric effect in 1880. When mechanical stress or vibration is applied to certain materials, the piezoelectric effect gives them the ability to generate an AC voltage. Applications of piezoelectricity are increasing day by day, resulting in greater demand for piezoelectric sensors, which have started to be manufactured on a large scale in industry. The mechanical energy imparted by passing traffic is harvested by hidden piezoelectric devices, which convert it into electrical energy. The electrical energy thus produced can be supplied to the LEDs of the traffic lights and also to street lights and CCTV cameras. In this way, electricity can be prevented from being wasted.
2 Related Work In [1] the authors tried to obtain a highly sensitive piezoelectric transducer, for which selecting the material and its structure is important; it should be kept in mind that the material must withstand various load conditions, including environmental factors. Their investigation leads to the conclusion that materials like barium titanate (BaTiO3) and lead zirconate titanate (PZT) exhibit maximum displacement. Reference [2] presents the idea of the Winkler foundation. Implementing such systems requires a small change in the road's foundation so that it behaves like a plate resting on a Winkler foundation; the aim is to exploit the vibration caused by kinetic energy, and in this way energy can be harvested. The authors in [3] describe classic plate theory, in which the flexural rigidity of the pavement is D = Eh³/(12(1 − ν²)), where E is the Young's modulus, h is the thickness (here, the thickness of the pavement) and ν is the Poisson's ratio; this mainly describes the road structure as a plate. The displacement of the pavement [4, 5] due to the load intensity of vehicles is computed by considering the coefficient of friction. With the displacement found, the output voltage and power can be obtained from the formula C0 dV(t)/dt + V(t)/R = dQ(t)/dt
2 Related Work In [1] the authors tried to get piezoelectric transducer which was highly sensitive, selecting the material and its structure is important. Things should be kept in mind that the material should withstand various load conditions, including factors related to environment. The investigation on this leads to the conclusion that the materials like barium titanate (BaTiO3 ) and lead zirconatetitanate (PZT) exhibits maximum displacement. It [2] tells about the idea of Wrinkler Foundation. Implementing such systems require small amount of change in the road’s foundation so that it behaves like a plate resting on the Wrinkler Foundation. It aims to enhance the vibration caused by the kinetic energy. This way energy can be harvested. The authors in [3] describes the classic plate theory. D = (Eh3 )/(12(1 − u)) Where, E is the Young’s Modulus, H is the thickness, here which implies the thickness of the pavement and u is the Poisson’s Ratio. This mainly describes road structure as plates. Computation of the displacement of the pavement [4, 5] because of load intensity of vehicles are done by considering coefficient of friction. With the help of newly found displacement, output voltage and power can be found out by the formula: C0dV(t)/dt + V(t)/R = dQ(t)/dt In [6] the author explained their approach using piezoelectric materials where a technical simulation-based system is presented in order to provide support to the concept of generation of energy from road traffic.
3 Proposed Work This setup is planned for a four-way junction, which is usually the crossing of two streets. Each road is divided into two halves, designed for traffic passing in two opposite directions. The piezo plates are spread over each road for a distance of ‘d’ metres. A sensor is used for detecting vehicles, so that humans or other objects are not mistaken for traffic, although footpaths are allotted for pedestrians. These sensors are placed at the starting points of the piezo plates, facing away from the traffic signal, on each half of the road. When a vehicle starts to move over the piezo plates, the sensors sense the presence of traffic. The mechanical stress or vibration caused by the moving traffic is harvested, and this mechanical energy is converted to electrical energy by the piezoelectric effect of the piezo plates. Essentially, the mechanical energy, i.e., pressure, is converted into voltage before being converted into electricity. A higher voltage can be obtained if the crystals used on the piezo plates are arranged in a cascading manner, i.e., serially. Examples of crystals used on piezo plates are quartz, topaz, Rochelle salt, berlinite (AlPO4), potassium niobate (KNbO3), lithium niobate (LiNbO3) and many more. After the generation of electricity, the yellow LED of the traffic signal is lit in order to decide the direction of the moving traffic; the movements of traffic on the other streets are then observed and a decision is taken accordingly. The harvested energy is also used to light the streets and power the CCTV. This process is executed on each road. The CCTV needs to be set up over a network: the three devices, the router, the NVR and the camera, should all be on the same network in order to communicate with each other, and the IP address of the IP camera needs to match the network. Main roads have CCTVs these days; the electricity they require can be drawn from the energy harvested by the piezo plates and converted to electrical energy.
4 Algorithm To implement the system, we need to follow a set of steps. The setup is used for a four-way junction; for reference, Fig. 1 is provided in the next section. Step 1: First, a vehicle starts to move over the piezo plates. Step 2: The sensor present at the very starting point of the piezo plates detects the vehicle. Step 3: The crystals on the piezo plates accept the vibration of the vehicles and harvest the energy. Step 4: This energy is converted into voltage and then into electricity, which is supplied to the traffic lights, street lights and the CCTV cameras.
Fig. 1 System architecture of the proposed system
Step 5: The traffic signal then shows a yellow light while the decision is made. Step 6: The direction of the flow of traffic on the other roads is checked with the help of the sensors. Step 7: After deciding how the flow of traffic must be changed, on the basis of the weight of the traffic flow, the signals are set accordingly: the road with the highest number of vehicles shows a green signal, and those with fewer vehicles, that is, less pressure, show a red signal. When the pressure increases on a road with a red signal, its signal colour changes; a sketch of this decision rule is given below. In this way, the whole traffic and road system can be controlled without requiring any non-renewable source of energy, and it will also be cost-effective in the long run. Moreover, the traffic police no longer need to concentrate on changing the light signals and can focus on other road activities.
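A compact sketch of the decision rule in Steps 5–7 follows: the approach with the heaviest sensed traffic receives the green signal. The counts and road names are mocked for illustration.

# Sketch of the Step 5-7 decision rule: the approach with the heaviest
# sensed traffic gets green, all others get red. Counts are mocked.
def decide_signals(vehicle_counts):
    busiest = max(vehicle_counts, key=vehicle_counts.get)
    return {road: ("GREEN" if road == busiest else "RED")
            for road in vehicle_counts}

# Example: piezo/sensor-node counts from the four approaches (assumed).
counts = {"north": 14, "south": 3, "east": 22, "west": 7}
print(decide_signals(counts))   # east gets the green signal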
5 Implementation Framework The proposed work eases the system of traffic control, as explained in Fig. 1, which shows a four-way junction with the arrangement of piezo plates on the roads. Unnecessary traffic jams can be avoided: suppose one road is almost empty yet its cars face a green signal, while another road with heavy traffic faces a red signal; this kind of scenario happens especially when traffic signals operate on fixed timing. Electrical energy can also be conserved, since the mechanical energy of the vehicles is harvested and converted to electricity. Errors in traffic signal operation may happen due to human carelessness; these can be prevented by this method.
6 Conclusion On our many streets, thousands of traffic lights, street lamps and CCTV cameras are used, consuming a great deal of electrical energy. The piezoelectric approach conserves this energy, since electricity is generated from mechanical stress, a renewable source. At night, when traffic is light or the street remains almost empty, traffic lights are normally in unnecessary use, resulting in wasted electricity; with the piezoelectric approach, the traffic lights are not used unnecessarily, so that energy is saved. The harvested energy can also be used for other purposes, such as medical surveillance. Due to the use of piezo sensors, the whole system becomes cost-effective.
References
1. Najini, H., & Muthukumaraswamy, S. A. (2016). Investigation on the selection of piezoelectric materials for the design of an energy harvester system to generate energy from traffic. International Journal of Engineering and Applied Science, 3(2), 43–49.
2. Zhang, Z., Xiang, H., & Shi, Z. (2016). Modeling on piezoelectric energy harvesting from pavements under traffic loads. Journal of Intelligent Material Systems and Structures, 27(4), 567–578.
3. Xu, C.-N., Akiyama, M., Nonaka, K., Shobu, K., & Watanabe, T. (1996). Electrical output performance of PZT-based piezoelectric ceramics. In Proceedings of the Tenth IEEE International Symposium on Applications of Ferroelectrics (ISAF ’96) (Vol. 2, pp. 967–970). IEEE, East Brunswick, NJ, USA.
4. Zhao, H., Tao, Y., Niu, Y., & Ling, J. (2014). Harvesting energy from asphalt pavement by piezoelectric generator. Journal Wuhan University of Technology, Materials Science Edition, 29(5), 933–937.
5. Goldfarb, M., & Jones, L. D. (1999). On the efficiency of electric power generation with piezoelectric ceramic. Journal of Dynamic Systems, Measurement and Control, Transactions of the ASME, 121(3), 566–571.
6. Najini, H., & Muthukumaraswamy, S. A. (2017). Piezoelectric energy generation from vehicle traffic with technoeconomic analysis. Journal of Renewable Energy, 2017 (Article ID 9643858), 16 p. https://doi.org/10.1155/2017/9643858.
Using Static and Dynamic Image Maps Built by Graphic Interchange Format (GIF) and Geographic Information System (GIS) for Project Based Learning Air Pollution in Schools in Hanoi, Vietnam Bui Thi Thanh Huong Abstract This article recommends a new teaching application: project based learning using the graphic interchange format (GIF) built on geographic information system (GIS) data. Teachers and students implemented a project researching the status of air pollution at high schools in Hanoi, one of the most urgent problems in Hanoi, Vietnam in recent times. The air pollution at the high schools was characterised by collecting the air quality index (AQI) of the United States Environmental Protection Agency (EPA) from December 5th to 22nd, 2019 at schools in Hanoi. By integrating ArcGIS Desktop and GIF, static and dynamic image maps of the air pollution status were built for assessing pollution levels by spatial distribution. The project was not only a new application of ArcGIS Desktop and GIF in learning but also an alert about pollution issues affecting the health of students and teachers in high schools. This is an essential issue for proposing mitigation and adaptation solutions for educational managers. Keywords Graphic interchange format (GIF) · Geographic information system (GIS) · ArcGIS desktop · Air quality index (AQI) · Project based learning (PBL) · Dynamic map image · Air pollution
1 Introduction According to the latest report of the World Health Organization (WHO), air pollution is affecting billions of children around the world, damaging the central nervous system and leading to disease and premature death. The report claims that more than 90% of the world's children—approximately 1.8 billion, ranging from newborns to age 18—are breathing poisoned air [1]. According to the annual
B. T. T. Huong (B) Faculty of Educational Technology, University of Education, Vietnam National University, Hanoi, Vietnam e-mail: [email protected]
report of the Environmental Performance Index (EPI) of the US, Vietnam is listed in the top 10 air-polluted countries in Asia. Notably, in Hanoi and Ho Chi Minh City, the amount of dust is constantly increasing, keeping the air quality index (AQI) at an alarming level. This is really worrying, since about 60,000 deaths in Vietnam in 2016 from heart disease, stroke, lung cancer, chronic obstructive pulmonary disease and pneumonia were related to air pollution [2]. The issue of integrating environmental education content into teaching in K-12 and higher education is increasingly urgent, requiring teachers and lecturers to renovate teaching methods and bring real-life environmental issues into lesson practice, in order to raise awareness of environmental protection and self-protection of health in the current Vietnamese context. With an interdisciplinary approach, the author proposes a process of integrating GIS and GIFs in the study of the air pollution situation in Hanoi's high schools, as a demonstration of how to exploit graphic technology applications and geospatial data in reconstructing the air pollution situation of schools in Hanoi vividly and in a modern way.
2 Methodology 2.1 Some Concepts Geographical Information System (GIS): There are multiple definitions of GIS. As defined by National Geographic, a GIS is a computer system for dealing with data related to location. ESRI, a US-based giant in the GIS industry, defines GIS as a framework for gathering, managing and analyzing data. ArcGIS Desktop: To process location-based data, specific types of programs are needed. ArcGIS Desktop is a large software package built by ESRI; it is a program for GIS professionals to create, analyze, and manage geographic information. Graphic Interchange Format (GIF): The Graphic Interchange Format is a common image file format used on the web. Even though GIF was first used almost 30 years ago, it is still widely used today. Unlike the JPEG image format, GIF's compression is lossless, so the quality of a GIF is not degraded. One popular use is the animated GIF: although GIF was not designed for animation, its capacity to store multiple images in one single file suggested using this format to store animation in the form of multiple frames. Static map images: images in PNG or JPEG format, including overlays like lines, markers, or polygons. Dynamic map images: images in GIF format, including overlays like lines, markers, or polygons, that can show changes over time. Air Quality Index (AQI): as the name implies, the AQI tells us about the quality of the air. The formula to calculate the AQI can be complicated; however, we only need to understand that the higher the value of the AQI, the worse the air quality is (Table 1).
Table 1 Classification of AQI by increasing health concern

Air quality index (AQI)   Impact level on health            Color assessment
0–50                      Good                              Green
51–100                    Moderate                          Yellow
101–150                   Unhealthy for sensitive groups    Orange
151–200                   Unhealthy                         Red
201–300                   Very unhealthy                    Purple
301–500                   Hazardous                         Maroon

Source The United States Environmental Protection Agency [3]
Project based learning (PBL) is a teaching method centred on the learner, built around a learning project through which learners gain deeper knowledge by acting on real-world problems themselves. Learners study a subject by working for an extended period to investigate and solve the tasks given by teachers [4, 5].
2.2 Research Steps Project based learning is the teaching method selected for learning about air pollution in K-12 or higher education. To perform the project, the implementation steps below were proposed (Fig. 1). To help learners understand the status of air pollution at the schools where they study, teachers take students into a research project, guiding them to collect AQI data with the AirVisual application at three times of day (6 AM, 12 PM, 6 PM) at 82 high schools in Hanoi from December 5th to 22nd, 2019 (Step 1). To show the data according to the schools' spatial distribution, maps are built with ArcGIS Desktop (Step 2). For engaging reports, these maps should
Fig. 1 Implementation steps of the project based learning
Step 1: Guide students in collecting AQI data. Step 2: Guide students in building maps with ArcGIS Desktop.
Step 3: Guide students in building a dynamic map with GIF. Step 4: Experimental teaching.
be presented as dynamic images, so students are also guided to build a dynamic map with GIF (Step 3). Last but not least, the results (static maps and the dynamic map) are used in the teachers' experimental lectures (Step 4). Students are all the more motivated by lectures built on the products of their own field research.
3 Results 3.1 The Status of Air Pollution in Vietnam According to the latest WHO data, 97% of cities in low- and middle-income countries with more than 100,000 people do not meet WHO air standards; for high-income countries, the proportion drops to 49% [1]. According to the AirVisual air quality monitoring organization and Greenpeace, 18 of the 20 most polluted cities in the world are located in South Asia (WHO 2016). In Southeast Asia, Vietnam, Indonesia and Thailand are responsible for a large portion of the region's pollution emissions, according to International Energy Agency reports. In Thailand, when the AQI exceeds 200, students are allowed to leave school, while Vietnamese students still go to school as normal. Yale University's Environmental Performance Index (EPI) studies assess and rank five main issues: water and sanitation, air quality, health, agriculture, and biodiversity and habitat. Regarding air quality, Vietnam ranked 132 out of the 180 countries assessed, with a score of 46.96/100 [5]. Earlier, Forbes Vietnam's research also showed that air quality in Vietnam is very low, with deep red, red and yellow levels covering the entire country. In particular, the capital Hanoi has the most serious pollution level: on some days the fine dust index reaches a dangerous level (purple), and Hanoi's air pollution ranked first in the world for many days. According to continuous data from 13 automatic monitoring stations in the period 12–29/9/2019, the PM2.5 dust concentration tended to increase from 12–17/9, then decrease from 18–22/9 before rising again; from 23/9 onwards, the PM2.5 level remained high. In the periods 15–17/9 and 23–29/9, up to 75% of the 24-h PM2.5 values in Hanoi City exceeded the Vietnamese standard QCVN 05:2013 [6]. In 2019, according to the statistics of the General Department of Environment, Hanoi City suffered at least five waves of air pollution [7]. Air quality during these pollution waves was always alarming, with the AQI often 2–3 times higher than the permitted standard [8]. In the second wave of air pollution, lasting from 11 to 27/3 (17 days), the 24-h average PM2.5 fine dust index exceeded 140 µg/m3 at the Pham Van Dong and Hang Dau stations on 13 and 14/3. The third wave of air pollution in Hanoi occurred from September 12 to October 3 (18 days), with the PM2.5 index continuously higher than 50 µg/m3 and the PM value at stations such as Minh Khai, Hang Dau and Nguyen Van Cu above 80 µg/m3;
on September 29, the average PM2.5 dust index was up to 110 µg/m3. The air pollution coincides with the school year, which is really worrying when the number of students across the country in 2019 reached more than 24,000,000 [9] and, in Hanoi City alone, more than 90,000 students enrolled in grade 10 [10]. Air pollution in Hanoi City is a serious problem that requires the thorough and comprehensive participation of all levels and sectors to provide fresh air for students' learning and living, because their future depends on our decisive actions on air pollution today.
3.2 Guiding Learners in Building Maps of Air Pollution from AQI Data with ArcGIS Desktop First, a data table was built. The fields in this table include: name of school, longitude, latitude, and value of AQI (at the different measurement times). This table was the frame the research team used, together with AirVisual, to collect and monitor air quality at 82 high schools across Hanoi City from December 5th to 22nd, 2019 (Figs. 2 and 3). The collected data were then analyzed and processed in ArcGIS Desktop to display the location of each school. The data table was read into the ArcGIS software, and then the
Fig. 2 ArcGIS desktop interface
Fig. 3 Table structure in ArcGIS desktop
information on its location, including longitude and latitude, was used to display each school in the geographic coordinate system WGS 84 (Figs. 4 and 5).
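The same point-display step can also be scripted outside the ArcGIS Desktop GUI; the sketch below assumes a hypothetical CSV export of the data table with the column names shown, and uses the geopandas library rather than ArcGIS.

# Plot school AQI points in WGS 84; file and column names are assumed.
import pandas as pd
import geopandas as gpd

df = pd.read_csv("hanoi_schools_aqi.csv")   # columns: school, lon, lat, aqi

gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["lon"], df["lat"]),
    crs="EPSG:4326",               # WGS 84, as in the ArcGIS workflow
)
ax = gdf.plot(column="aqi", cmap="RdYlGn_r", legend=True, markersize=30)
ax.figure.savefig("schools_aqi_map.png", dpi=150)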
Fig. 4 Result of displaying school points on ArcGIS desktop
Fig. 5 Static air pollution maps of high schools in Hanoi, Vietnam (daily panels for December 5th, 12th, 15th, 16th, 17th, 18th, 19th, 20th, 21st and 22nd, 2019)
3.3 Building the Dynamic Map with GIF There are a number of software tools for creating GIFs, such as Photoshop and Pixlr; however, these are not free. Therefore, using a free online site for creating GIFs is a way out. A GIF maker allows users to create their own animated GIFs by combining separate image files (of various formats such as JPG, PNG) as frames. Some online platforms produce GIFs of high quality, free of watermarks or attribution; this makes an online GIF maker ideal for developers and content creators. To create a GIF, we first need a series of images to put in the GIF, on the website https://gifmaker.me/. Then, we upload this sequence of images. The images may be in GIF, JPG, PNG, BMP or TIFF format; images in a ZIP archive, even mixed formats and sizes, are converted automatically. We can also upload animated GIF, WebP or APNG images, which are split while the delay time is preserved. The online GIF maker platform also offers functions to edit, shorten or merge existing GIFs. The steps for building the dynamic map with GIF are as follows:
Press the “Upload Images” button, then select the images desired as frames. We can press and hold the Ctrl key (or the Command key on a Macbook keyboard) to select multiple files (Figs. 6 and 7). When all the necessary images are uploaded, we can customize the animation speed and frame order before creating the GIF. After the GIF is created, we can resize, crop, and optimize it to our needs. We can even adjust the speed of the GIF
Fig. 6 The steps for making the dynamic air pollution map of high schools in Hanoi
Fig. 7 Dynamic air pollution maps of high schools in Hanoi, Vietnam (GIF): https://drive.google.com/file/d/15KI4kQe36xheAqIxwJTnydOgxXA7TzLN/view?usp=sharing
by setting a “Delay time”, or adjust the delay for each individual frame with a “Delay” input box located next to each frame. If we upload images of different sizes, additional options are provided: we can either crop them all, resize them to match the smallest dimensions, or choose an alignment.
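The frame assembly can equally be done locally with the Pillow library instead of the website; the file names and the delay value below are illustrative assumptions.

# Combine the daily static maps (frames) into one animated GIF.
from PIL import Image

frame_files = [f"aqi_map_day{i:02d}.png" for i in range(1, 11)]  # assumed names
frames = [Image.open(f).convert("P") for f in frame_files]

frames[0].save(
    "aqi_dynamic_map.gif",
    save_all=True,              # write all frames, not just the first
    append_images=frames[1:],   # remaining frames
    duration=800,               # delay per frame in milliseconds
    loop=0,                     # loop forever
)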
3.4 Learning About Air Pollution Through the Project The project “researching the status of air pollution in high schools in Hanoi” was carried out from December 5th to 22nd, 2019. The students joining the project collected AQI data at the high schools using the AirVisual app and the Global Positioning System (GPS). The collected data were processed in ArcGIS Desktop by the students themselves to build the static maps, and the students also built the dynamic map image with the GIF application. The GIF supported the reporting of their research results in a seminar, organized by teachers, on the status of air pollution in schools in Hanoi, Vietnam. It is easy to understand the content of lessons by doing them through the project; teachers need to guide students to proceed step by step following the process the teachers prepared. The results of the project were achievements of both learners and teachers (Figs. 8 and 9).
Fig. 8 Learners collecting data at school gates
Fig. 9 Learners reporting on the project and discussing it in the seminar
4 Conclusion PBL is an effective teaching method that contributes to innovating teaching methods in K-12 and higher education, and one of the tools supporting the PBL implementation process is the application of technology in teaching. Using static and dynamic image maps built with the Graphic Interchange Format (GIF) and a Geographical Information System (GIS) for project based learning proved effective in understanding the air pollution status of schools in Hanoi, Vietnam. Learners feel much more interested when building the dynamic map image that visually reflects the fluctuations of the AQI with the support of GIF. The study has shown that the air pollution situation at high schools in Hanoi City is very alarming, with over 30 of the 82 surveyed schools having an AQI above 200. This AQI level is dangerous to human health, forcing people to recognize its severity for the health of all Vietnamese people in particular and the world in general. The project provides a database of AQI for 82 high schools in Hanoi from December 5th to 22nd, 2019, contributing to the database of actual environmental pollution at high schools in particular and in Hanoi City in general. Acknowledgements We would like to thank the University of Education, Vietnam National University for funding us to finish this article through the project: “Geographic Information System (GIS)” in teaching: Contents and Paradigms (case study of the Bachelor curriculum of Natural Science in the University of Education, Vietnam National University, Hanoi). Code: QS.19.03.
References
1. WHO. More than 90% of the world’s children breathe toxic air every day [Online]. Available: https://www.who.int/news-room/detail/29-10-2018-more-than-90-of-theworlds-children-breathe-toxic-air-every-day.
2. WHO. (2018). More than 60,000 deaths in Vietnam each year linked to air pollution [Online].
3. EPA. AQI brochure [Online]. Available: https://www.airnow.gov/sites/default/files/2018-04/aqi_brochure_02_14_0.pdf.
4. Moursund, D. G. (1999). Project-based learning using information technology. Eugene, OR: International Society for Technology in Education.
5. Teksoz, G. Managing air pollution: How does education help?
6. Yale University. (2018). EPI rankings [Online]. Available: https://epi.yale.edu/downloads/epi2018policymakerssummaryv01.pdf.
7. Chinh, G. (2019). Hanoi air quality persists at unhealthy levels [Online]. Available: https://e.vnexpress.net/news/news/hanoi-air-quality-persists-at-unhealthy-levels-4020528.html.
8. Tuoitrenews. Hanoi’s air pollution reaches worst level since year-start [Online]. Available: https://tuoitrenews.vn/news/society/20191112/hanois-air-pollution-reaches-worst-level-since-yearstart/51849.html.
9. GSO. (2019). Education [Online].
10. Vietmy. Over 24 million students nationwide enter new academic year [Online]. Available: https://en.vietmy.net.vn/social/over-24-million-students-nationwide-enter-new-academic-year-487642.
Prediction of Autism Spectrum Disorder Using Feature Engineering for Machine Learning Classifiers N. Priya and C. Radhika
Abstract Machine learning presents a brand-new method of predicting children with Autism Spectrum Disorder (ASD) at an early stage using different behavioral analytics. Predicting autistic characteristics through screening trials is very expensive and time-consuming. According to WHO figures, the number of patients identified with ASD is steadily growing. Such children are essentially unable to interact with others, are delayed in the acquisition of linguistic, cognitive, speech and non-verbal communication skills, and show repetitive behavior. The goal of the paper is to focus on the early detection of ASD in the affected individual. Feature engineering is a process that extracts the appropriate features from a dataset for predictive modeling. In this study, features are analyzed and reduced in three different ASD datasets, categorized by age. The reduced feature set is investigated with machine learning classifiers such as SVM, Random Forest (RF) and KNN. The overall performance of the prognostic models is assessed in terms of the accuracy and sensitivity performance metrics. In particular, the RF method classified the ASD datasets with higher precision. Keywords ASD · Machine learning techniques · PCA
1 Introduction Autism Spectrum Disorder, a neurodevelopmental disability, has become one of the most prevalent illnesses among children. Studies indicate that early diagnosis and intervention therapies help in achieving positive longitudinal outcomes [1]. It can take up to six months to diagnose a toddler with autism because of the prolonged procedure, and a child may have to see many different specialists to diagnose autism,
N. Priya · C. Radhika (B) S.D.N.B. Vaishnav College for Women, University of Madras, Chennai, Tamil Nadu, India e-mail: [email protected] N. Priya e-mail: [email protected]
ranging from developmental neurologists to pediatricians and therapists [2, 3]. ASD is a developmental disorder defined by certain challenges related to communication, social abilities, and repetitive behaviors [4]. Recently, machine learning-based approaches have shown a promising direction for the objective assessment of neuropsychiatric disorders [5–8]. ASD is characterized by difficulties with social interaction and communication, and by restricted and repetitive behaviors [9–11].
2 Related Works Early prediction of ASD can substantially reduce health problems and thereby improve the overall mental health of the child. Akyol et al. [12] proposed prognostic models using fuzzy rules for the detection of children with ASD; for classification, the combination of the logistic regression algorithm with fuzzy rules performs better than the fuzzy rules alone. Thabtah [13] describes the new ASD Tests app, which can be used by health professionals to support their practice or to tell individuals whether they should pursue a formal clinical diagnosis; feature and prognostic analyses show that small groups of autistic traits improve the performance and accuracy of screening methods, with the classifications generated by machine learning techniques evaluated against different performance metrics. Alwidian et al. [5] proposed the association classification (AC) method for predicting whether or not an individual has autism, and evaluated the behavior and performance of AC algorithms on the prediction tasks using several metrics. Erkan and Thanh [6] showed that early diagnosis of ASD through machine learning techniques is feasible, with random forests being a good method for classifying ASD data. Zhou et al. [1] focus on the speech abnormalities of young children with ASD and present an automatic assessment framework to help clinicians quantify the unusual prosody associated with ASD, modelling the training data with both a traditional SVM and a DNN. Shihab et al. [14] note that data analysis and classification of ASD remain difficult due to unresolved issues arising from the varying severity levels and the variety of signs and symptoms; to understand the features involved in ASD, neuroscience technology has analyzed responses to auditory and video stimuli in autistic subjects, and the authors analyze datasets of adults and children with autism using the principal component analysis algorithm. Omar et al. [15] present an effective and efficient technique to detect autism traits for different age groups; with the help of autism screening software, a person can be guided at an early stage in order to prevent the condition from worsening and to reduce the costs associated with delayed diagnosis. Padmapriya [16] defines a new relevant feature selection system to reduce the feature size, selecting the minimal subset of features with the greatest relevance to the class labels in ASD datasets. Pedregosa et al. [17] describe scikit-learn, a Python module integrating a wide range of state-of-the-art machine learning algorithms
Table 1 ASD dataset descriptions [20]

Id     Name                                         Age        Instances   Features   Class (Yes/No)
ASD1   ASD screening data for children dataset      4–11 yrs   292         21         141/151
ASD2   ASD screening data for adolescence dataset   12–16 yrs  104         21         63/41
ASD3   ASD screening adult dataset                  ≥18        704         21         189/515
for supervised and unsupervised problems; the package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Wang et al. [10] present novel feature engineering and feature encoding schemes, in conjunction with a DNN classifier, for autism screening based on behavioral characteristics. Kupper et al. [18] identified a reduced subspace of behavioral attributes from the ADOS-IV, for the whole sample as well as for adolescents and adults separately, that showed classification performance comparable to that of the whole ADOS algorithm. Abdullah et al. [8] build a new feature selection method based on LASSO to choose the most important features for supervised machine learning strategies. Vaishali and Sasikala [19] used the binary firefly algorithm for feature selection to predict ASD with minimal behavior sets.
3 Dataset The ASD datasets were collected from the UCI repository [20]. They are divided into three groups: the ASD Screening Data for Children Dataset, the ASD Screening Data for Adolescence Dataset, and the ASD Screening Adult Dataset. The dataset descriptions and feature descriptions are shown in Tables 1 and 2. Figures 1 and 2 show the association of gender and of jaundice/ASD family history with the result: in each case, a person with a higher result score is much more likely to be autistic, independent of the other traits.
4 Methodology Effective pre-processing was applied to the three ASD datasets, after which a feature engineering process was introduced for feature selection and extraction. Finally, the reduced feature set was investigated with the machine learning models for classification. The framework of the proposed methodology is shown in Fig. 3. The procedure is as follows. Step 1: Acquire the ASD datasets from the UCI repository.
Table 2 Feature set descriptions of the ASD datasets [13, 21, 22]

Attribute                                                                                        Values
A1–A10 (cognitive, linguistic, repetitive-behavior, speech and non-verbal communication items)   {0, 1}
Age                                                                                              Age in years
Gender                                                                                           {m, f}
Ethnicity                                                                                        List of ethnicities
Jaundice                                                                                         {y, n}
Family with ASD                                                                                  {y, n}
Residence                                                                                        List of countries
Relations                                                                                        Parent, self, medical staff, etc.
Used app before                                                                                  {y, n}
Score                                                                                            Integer values {0…10}
Age_desc                                                                                         {'4 to ≥18 yrs'}
Class                                                                                            {y, n}
Fig. 1 The correlation between the result and jaundice/ASD with family
Step 2: Pre-process the data by standardizing it, imputing missing values, and converting categorical data into numerical data. Step 3: Partition the ASD datasets into training and test sets. Step 4: Apply the sbs algorithm for feature selection and PCA for feature extraction. Step 5: Build the machine learning classifiers. Step 6: Compare the evaluation accuracies. Step 7: Select the most effective machine learning model for the ASD datasets.
Fig. 2 The pictorial representation of gender and result
Fig. 3 The framework of the proposed methodology
4.1 Pre-processing Data pre-processing is achieved by standardization of the data, imputation of missing values, one-hot encoding, partitioning and scaling, and feature selection and extraction. • Standardization of the data: In the ASD datasets there are a few attributes that provide no benefit to our assessment (Ethnicity, Residence, Relations, Used app before, Age_desc, Class), beyond the class attribute supplying the target variable. For the purpose of better evaluation, these six attributes are removed. • Imputing missing values: Sometimes a value needs to be replaced; the developer can specify the replacement value, for example replacing all NaN occurrences in the dataset, and numeric or string values can likewise be replaced. In the ASD datasets there are missing values in the age, ethnicity and relations variables. Applying the imputation approach
to the autism datasets, all missing values are replaced with the value 'unknown'. After pre-processing the missing values, we visualize the data to identify challenges and solutions for the classifiers.
• One-hot encoding: This method is used to create a dummy feature for every unique value of the nominal attributes [23].
• Partitioning and scaling: The ASD datasets are partitioned into test and training data; it is also necessary that the chosen attributes are on the same scale for the most reliable overall performance. The records are randomly assigned to the test and training sets. In scaling, the different attributes are converted onto a uniform scale. The proposed system uses the StandardScaler() approach for scaling the data. A minimal sketch of these pre-processing steps is given below.
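The following sketch illustrates this pre-processing stage with pandas and scikit-learn (the stack implied by the Jupyter/Anaconda setup and by [17, 23]); the file name and column names are hypothetical placeholders for the UCI screening data, not the authors' exact identifiers.

```python
# Minimal pre-processing sketch (assumed stack: pandas + scikit-learn).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("asd_child_screening.csv")  # hypothetical file name

# Detach the attributes that give no benefit to the assessment.
df = df.drop(columns=["ethnicity", "residence", "relation",
                      "used_app_before", "age_desc"], errors="ignore")

# Impute missing values: median for age, the sentinel 'unknown' elsewhere.
df["age"] = df["age"].fillna(df["age"].median())
df = df.fillna("unknown")

# One-hot encode the nominal attributes (dummy features).
df = pd.get_dummies(df, columns=["gender", "jaundice", "family_asd"])

# Partition into 70% training / 30% test, then scale to a uniform range.
X, y = df.drop(columns=["class"]), df["class"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1, stratify=y)
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)
```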
4.2 Feature Engineering
Feature selection chooses a subset of the original features, while feature extraction transforms the data onto a new feature subspace. The proposed scheme uses the sequential backward selection (SBS) [23] procedure to reduce the dimensionality of the initial feature set to a minimal subset, and principal component analysis (PCA) is applied as the feature extraction step to reduce a complex dataset to a lower dimensionality [23]. In this system, the feature selection procedure is applied to reduce the size of the ASD datasets. The ASD screening child dataset is reduced to a minimal subset based on interaction, communication, and facial features; Table 3 describes the important features selected for the classification process on the ASD child dataset. Similarly, for the ASD adolescent and adult datasets, minimal features are selected based on interaction, imagination, gestures, and conversation. Tables 4 and 5 show the important features selected by the feature selection and extraction process, and Fig. 4 gives a graphical representation of the feature importance for each ASD dataset. A sketch of the SBS + PCA stage follows Table 3.
Table 3 Features importance of the ASD child dataset [13, 21, 22]
| Feature selected | Question |
|---|---|
| A2: Q2 | How easy is it for you to get eye contact with your child? |
| A3: Q3 | Does your child point to indicate that s/he wants something? |
| A4: Q4 | Does your child point to share interest with you? |
| A5: Q5 | Does your child pretend? |
| A6: Q6 | Does your child follow where you are looking? |
| A7: Q7 | If you or someone else in the family is visibly upset, does your child show signs of wanting to comfort them? |
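Below is a minimal sketch of the SBS + PCA stage referenced above, continuing from the scaled splits of the pre-processing sketch. The paper follows Raschka's SBS implementation [23]; scikit-learn's SequentialFeatureSelector with direction="backward" is used here as a close stand-in, and the estimator and subset/component sizes are illustrative assumptions.

```python
# Feature engineering sketch: backward feature selection followed by PCA.
from sklearn.decomposition import PCA
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

# Reduce the initial feature set to a minimal subset (here, 6 features).
sbs = SequentialFeatureSelector(KNeighborsClassifier(n_neighbors=5),
                                n_features_to_select=6,
                                direction="backward")
X_train_sel = sbs.fit_transform(X_train_std, y_train)
X_test_sel = sbs.transform(X_test_std)

# Project the minimal subset onto a lower-dimensional subspace.
pca = PCA(n_components=2)
X_train_red = pca.fit_transform(X_train_sel)
X_test_red = pca.transform(X_test_sel)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```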
Table 4 Feature importance of the ASD adolescent dataset [13, 21, 22]

| Feature selected | Question |
|---|---|
| A3: Q3 | Does your child point to indicate that s/he wants something? |
| A4: Q4 | Does your child point to share interest with you? |
| A5: Q5 | Does your child pretend? |
| A6: Q6 | Does your child follow where you are looking? |
| A7: Q7 | If you or someone else in the family is visibly upset, does your child show signs of wanting to comfort them? |
| A9: Q9 | Does your child use simple gestures? |
Table 5 Feature importance of the ASD adult dataset [13, 21, 22]

| Feature selected | Question |
|---|---|
| A3: Q3 | Does your child point to indicate that he/she wants something? |
| A4: Q4 | Does your child point to share interest with you? |
| A5: Q5 | Does your child pretend? |
| A6: Q6 | Does your child follow where you are looking? |
| A8: Q8 | Would you describe your child's first words as: |
| A9: Q9 | Does your child use simple gestures? |
Fig. 4 Feature importance of ASD datasets
4.3 Machine Learning Models
• Random Forest: The random forest is suitable for modeling high-dimensional data because it can cope with missing values and can handle continuous, categorical, and binary data. The bootstrapping and ensemble scheme make the random forest robust enough to overcome overfitting, so there is no need to prune the trees. Besides high prediction precision, the
random forest is efficient, non-parametric, and interpretable for many kinds of datasets [24, 25].
• Support Vector Machine: Support vector machines are based on the structural risk minimization principle [26] from computational learning theory. The idea of structural risk minimization is to find a hypothesis 's' for which we can guarantee the lowest true error. The true error of 's' is the probability that 's' will make a mistake on an unseen and randomly selected test sample. An upper bound can be used to connect the true error of a hypothesis 's' with the error of 's' on the training set and the complexity of the hypothesis space 'S' containing 's' (measured using the VC-dimension). The support vector machine finds the hypothesis 's' that minimizes this bound on the true error by effectively and efficiently controlling the VC-dimension of 'S' [24, 26].
• K-Nearest Neighbors (KNN): KNN is a lazy, non-parametric classifier in which the function is only approximated locally and all computation is deferred until classification. In KNN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its 'k' nearest neighbors. If k = 1, the object is simply assigned to the class of that single nearest neighbor [27]. A sketch of the three classifiers is given below.
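The following sketch trains the three classifiers on the reduced features from the earlier sketches; all hyperparameters are illustrative assumptions rather than the authors' exact settings.

```python
# Train and compare the three classifiers on the reduced feature set.
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

models = {
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=1),
    "SVM": SVC(kernel="rbf", C=1.0, random_state=1),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    model.fit(X_train_red, y_train)
    acc = accuracy_score(y_test, model.predict(X_test_red))
    print(f"{name}: test accuracy = {acc:.2f}")
```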
5 Experimental Results
An extensive evaluation was carried out on the experimental results to assess accuracy. This study runs the ASD datasets through the proposed model in a Jupyter Notebook (Anaconda3). In our proposed system, the data were randomly partitioned into 70% training and 30% test sets. Around 200 instances of the data were randomly chosen for training and testing in the classification techniques. The overall performance assessment of the classification using all 21 attributes before pre-processing, with 70–30% training and test datasets respectively, is shown in Table 6. Among the 21 attributes, only the 6 most important features, selected primarily on the basis of verbal interaction, communication, and facial features as shown in Tables 3, 4, and 5, are retained; feature engineering is applied and the result is fed as input to each classifier. To examine the overall performance of the classification models, the following metrics are applied: precision (pre), recall (rec), F1-score (f1), accuracy (acc), and error rate (err) [5, 6, 23]:

pre = True positive / (True positive + False positive)    (1)

rec = True positive / (False negative + True positive)    (2)

f1 = 2 * (pre * rec) / (pre + rec)    (3)
Table 6 Classification results with different ASD datasets before pre-processing (70–30)

| Performance metrics | RF ASD1-child | RF ASD2-adolescent | RF ASD3-adult | SVM ASD1-child | SVM ASD2-adolescent | SVM ASD3-adult | KNN ASD1-child | KNN ASD2-adolescent | KNN ASD3-adult |
|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 92 | 91 | 93 | 97 | 78 | 87 | 85 | 88 | 83 |
| Precision | 0.97 | 1.00 | 0.98 | 0.94 | 0.67 | 0.92 | 0.84 | 1.00 | 0.80 |
| Recall | 0.82 | 0.70 | 0.78 | 1.00 | 0.44 | 0.58 | 0.76 | 0.50 | 0.50 |
| F1-score | 0.89 | 0.82 | 0.87 | 0.97 | 0.53 | 0.71 | 0.80 | 0.67 | 0.62 |
| Error rate | 0.08 | 0.09 | 0.07 | 0.03 | 0.21 | 0.13 | 0.14 | 0.12 | 0.17 |
err = (False positive + False negative) / (False positive + False negative + True positive + True negative)    (4)

acc = 1 − err    (5)
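A minimal sketch of Eqs. (1)–(5) computed from a confusion matrix is given below, assuming binary labels with the ASD class as positive.

```python
# Eqs. (1)-(5) from a binary confusion matrix (rows = true, cols = predicted).
from sklearn.metrics import confusion_matrix

def screening_metrics(y_true, y_pred):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    pre = tp / (tp + fp)                    # Eq. (1) precision
    rec = tp / (fn + tp)                    # Eq. (2) recall
    f1 = 2 * pre * rec / (pre + rec)        # Eq. (3) F1-score
    err = (fp + fn) / (fp + fn + tp + tn)   # Eq. (4) error rate
    acc = 1 - err                           # Eq. (5) accuracy
    return pre, rec, f1, acc, err
```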
The autism spectrum disorder screening method is a binary classification problem, since people are labeled as either having ASD or not having ASD traits. Therefore, performance assessment strategies that align with binary classification in machine learning were used. The classification outcomes for the child, adolescent, and adult datasets after feature selection and extraction, for different percentages of test data, are shown in Tables 7, 8 and 9, respectively. The confusion matrix can be used to derive the different assessment metrics, such as error rate, classification accuracy, F1-score, precision, and recall, to report the overall performance of the learning algorithms (Tables 7, 8 and 9). Using the confusion matrix, a test instance can be allocated to a predicted class within the classification step of the screening. As can be seen, the comparative outcomes indicate that the RF model achieved higher accuracy than the other classifiers, as depicted in Table 10 and Fig. 5. In the final analysis, we plot the ROC curve for the best classifier on the ASD datasets, which turned out to be the random forest. As can be observed from the ROC curves in Figs. 6, 7 and 8, under the same false positive rate, we are able to identify ASD with a very high true positive rate.
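A minimal sketch of that ROC analysis is shown below, assuming the random forest from the earlier sketch and test labels encoded as 0/1 with 1 = ASD.

```python
# ROC curve for the best model (random forest).
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

rf = models["Random forest"]
scores = rf.predict_proba(X_test_red)[:, 1]   # P(class = ASD)
fpr, tpr, _ = roc_curve(y_test, scores, pos_label=1)
plt.plot(fpr, tpr, label=f"RF (AUC = {auc(fpr, tpr):.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="Chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```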
6 Conclusion
This study examined different methods to detect autism using the ML models RF, SVM, and KNN. The experiments were performed on three autism datasets from the UCI repository [20]. In our experiments, we found that the RF model is more effective than the SVM and KNN models for classifying autism data. We also found that an early prediction of ASD is genuinely feasible. If the number of data instances is large enough, the precision of the diagnosis of autism by machine learning techniques will be higher. The precision also depends on the completeness of the gathered records: if the records in the autism datasets are complete, the precision of the early diagnosis of autism will be high. Feature selection and extraction with the PCA technique are proposed with a minimal subspace and used to classify autism. Finally, it can be confirmed that with the RF model we can detect autism easily, rapidly, and accurately. Therefore, ASD can eventually be treated successfully, which in turn can improve the quality of life for patients with autism and their relatives. Using the minimal feature subset to detect ASD at a preliminary stage and thereby improve the lives of affected people is the future scope. Although further studies are needed to evaluate these minimal subset features for discovering ASD with new and unseen data and to determine their experimental worth, these results may
Table 7 Classification results with the ASD1_Child dataset after feature selection and extraction for different test-data percentages

| Performance metrics | RF 20% | RF 30% | RF 40% | RF 50% | RF 60% | SVM 20% | SVM 30% | SVM 40% | SVM 50% | SVM 60% | KNN 20% | KNN 30% | KNN 40% | KNN 50% | KNN 60% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 92 | 95 | 87 | 92 | 82 | 89 | 93 | 94 | 88 | 83 | 85 | 86 | 86 | 87 | 86 |
| Precision | 0.94 | 0.95 | 0.83 | 0.95 | 0.77 | 0.91 | 0.94 | 0.95 | 0.95 | 0.89 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Recall | 0.85 | 0.88 | 0.77 | 0.79 | 0.64 | 0.84 | 0.88 | 0.89 | 0.73 | 0.61 | 0.64 | 0.65 | 0.64 | 0.66 | 0.62 |
| F1-score | 0.89 | 0.92 | 0.80 | 0.86 | 0.70 | 0.87 | 0.91 | 0.92 | 0.83 | 0.72 | 0.78 | 0.79 | 0.78 | 0.80 | 0.77 |
| Error rate | 0.08 | 0.05 | 0.13 | 0.08 | 0.18 | 0.11 | 0.07 | 0.06 | 0.12 | 0.17 | 0.15 | 0.14 | 0.14 | 0.13 | 0.14 |
Table 8 Classification results with the ASD2_Adolescent dataset after feature selection and extraction for different test-data percentages

| Performance metrics | RF 20% | RF 30% | RF 40% | RF 50% | RF 60% | SVM 20% | SVM 30% | SVM 40% | SVM 50% | SVM 60% | KNN 20% | KNN 30% | KNN 40% | KNN 50% | KNN 60% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 81 | 93 | 79 | 76 | 77 | 76 | 81 | 76 | 74 | 78 | 81 | 84 | 86 | 85 | 87 |
| Precision | 0.80 | 1.00 | 1.00 | 0.75 | 0.67 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Recall | 0.57 | 0.78 | 0.25 | 0.20 | 0.12 | 0.29 | 0.33 | 0.18 | 0.07 | 0.24 | 0.43 | 0.44 | 0.50 | 0.47 | 0.50 |
| F1-score | 0.67 | 0.88 | 0.40 | 0.32 | 0.21 | 0.44 | 0.50 | 0.28 | 0.12 | 0.38 | 0.60 | 0.62 | 0.67 | 0.64 | 0.67 |
| Error rate | 0.19 | 0.07 | 0.21 | 0.24 | 0.23 | 0.24 | 0.19 | 0.24 | 0.26 | 0.22 | 0.19 | 0.16 | 0.14 | 0.15 | 0.13 |
Table 9 Classification results with the ASD3_Adult dataset after feature selection and extraction for different test-data percentages

| Performance metrics | RF 20% | RF 30% | RF 40% | RF 50% | RF 60% | SVM 20% | SVM 30% | SVM 40% | SVM 50% | SVM 60% | KNN 20% | KNN 30% | KNN 40% | KNN 50% | KNN 60% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy (%) | 91 | 94 | 86 | 83 | 89 | 88 | 90 | 88 | 87 | 85 | 87 | 88 | 87 | 88 | 89 |
| Precision | 0.93 | 0.95 | 0.89 | 0.86 | 0.94 | 0.86 | 0.93 | 0.92 | 0.90 | 0.84 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Recall | 0.71 | 0.86 | 0.54 | 0.45 | 0.63 | 0.63 | 0.68 | 0.62 | 0.58 | 0.56 | 0.53 | 0.55 | 0.54 | 0.56 | 0.62 |
| F1-score | 0.81 | 0.90 | 0.67 | 0.59 | 0.76 | 0.73 | 0.78 | 0.74 | 0.71 | 0.67 | 0.69 | 0.71 | 0.70 | 0.72 | 0.77 |
| Error rate | 0.09 | 0.06 | 0.14 | 0.17 | 0.11 | 0.12 | 0.10 | 0.12 | 0.13 | 0.15 | 0.13 | 0.12 | 0.13 | 0.12 | 0.11 |
Table 10 Prediction accuracy of the models

| ASD dataset | Samples | Split | RF | SVM | KNN |
|---|---|---|---|---|---|
| ASD1-child | 292 | Training | 1.00 | 0.98 | 0.90 |
| ASD1-child | 292 | Test-30% | 0.95 | 0.93 | 0.86 |
| ASD2-adolescent | 104 | Training | 0.98 | 1.00 | 0.88 |
| ASD2-adolescent | 104 | Test-30% | 0.93 | 0.81 | 0.81 |
| ASD3-adult | 704 | Training | 0.99 | 0.98 | 0.97 |
| ASD3-adult | 704 | Test-30% | 0.94 | 0.90 | 0.88 |
Fig. 5 Performance analysis of classifiers after pre-processing
Fig. 6 ROC for the ASD2-adolescent dataset
Fig. 7 ROC for the ASD3-adult dataset
Fig. 8 ROC for the ASD1-child dataset
also help to improve the complex autism diagnosis process in children, adolescents, and adults.
References
1. Zhou, T., Xie, Y., Zou, X., & Li, M. (2017). An automated assessment framework for speech abnormalities related to autism spectrum disorder. In 3rd International Workshop on Affective Social Multimedia Computing (ASMMC).
2. Goin-Kochel, R. P., Mackintosh, V. H., & Myers, B. J. (2006). How many doctors does it take to make an autism spectrum diagnosis? Autism, 10(5), 439–451. https://doi.org/10.1177/1362361306066601.
3. Thabtah, F., & Peebles, D. (2020). A new machine learning model based on induction of rules for autism detection. Health Informatics Journal, 26(1), 264–286. https://doi.org/10.1177/1460458218824711.
4. Ali, J., Khan, R., Ahmad, N., & Maqsood, I. (2012). Random forests and decision trees. International Journal of Computer Science Issues, 9.
5. Alwidian, J., Elhassan, A., & Rawan, G. (2020). Predicting autism spectrum disorder using machine learning technique. International Journal of Recent Technology and Engineering, 8, 4139–4143. ISSN: 2277-3878.
6. Erkan, U., & Thanh, D. (2019). Autism spectrum disorder detection with machine learning methods. Current Psychiatry Research and Reviews, 15, 297–308.
7. Bone, D., Goodwin, M. S., Black, M. P., Lee, C. C., Audhkhasi, K., & Narayanan, S. (2015). Applying machine learning to facilitate autism diagnostics: Pitfalls and promises. Journal of Autism and Developmental Disorders, 45(5), 1121–1136. https://doi.org/10.1007/s10803-014-2268-6.
8. Abdullah, A. A., et al. (2019). Evaluation on machine learning algorithms for classification of autism spectrum disorder (ASD). In International Conference on Biomedical Engineering. Journal of Physics: Conference Series, 1372, 012052.
9. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders: DSM-5. Washington, DC: American Psychiatric Association.
10. Wang, H., Li, L., Chi, L., & Zhao, Z. (2019). Autism screening using deep embedding representation. In International Conference on Computational Science. https://doi.org/10.1007/978-3-030-22741-8_12.
11. Alarifi, H. S., & Young, G. S. (2018). Using multiple machine learning algorithms to predict autism in children. In International Conference on Artificial Intelligence (pp. 464–467).
12. Akyol, K., Gultepe, Y., & Karaci, A. (2018). A study on autistic spectrum disorder for children based on feature selection and fuzzy rule. In International Congress on Engineering and Life Science (pp. 804–807).
13. Thabtah, F. (2019). An accessible and efficient autism screening method for behavioral data and predictive analyses. Health Informatics Journal, 25(4), 1739–1755. https://doi.org/10.1177/1460458218796636.
14. Shihab, A., Dawood, F., & Kashmar, A. H. (2020). Data analysis and classification of autism spectrum disorder using principal component analysis. Advances in Bioinformatics. https://doi.org/10.1155/2020/3407907.
15. Islam, M. N., Omar, K., Mondal, P., Khan, N., & Rizvi, M. (2019). A machine learning approach to predict autism spectrum disorder. In International Conference on Electrical, Computer and Communication Engineering. https://doi.org/10.1109/ECACE.2019.8679454.
16. Padmapriya, M. (2018). A novel feature selection method for pre-processing the ASD dataset. International Journal of Pure and Applied Mathematics, 118, 17–24.
17. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
18. Kupper, C., Stroth, S., Wolff, N., et al. (2020). Identifying predictive features of autism spectrum disorders in a clinical sample of adolescents and adults using machine learning. Scientific Reports, 10(1), 4805. https://doi.org/10.1038/s41598-020-61607-w.
19. Vaishali, R., & Sasikala, R. (2018). A machine learning based approach to classify autism with optimum behavior sets. International Journal of Engineering & Technology. https://doi.org/10.14419/ijet.v7i4.18.14907.
20. UCI machine learning repository. Retrieved from https://Archive.Ics.Uci.Edu/ML/Index.Php.
21. Thabtah, F. (2017). ASDTests: A mobile app for ASD screening [Internet] [cited December 20, 2018]. Available from: www.asdtests.com.
22. Thabtah, F. (2017). Autism spectrum disorder screening: Machine learning adaptation and DSM-5 fulfillment. In ICMHI '17 Proceedings of the 1st International Conference on Medical and Health Informatics. https://doi.org/10.1145/3107514.3107515.
23. Raschka, S. (2015). Python machine learning. Packt Publishing, September 2015. ISBN: 978-1-78355-513-0. www.packtpub.com.
24. Vapnik, V. N. (1995). The nature of statistical learning theory. New York: Springer.
25. Qi, Y. (2012). Random forest for bioinformatics. In C. Zhang & Y. Ma (Eds.), Ensemble machine learning. Boston, MA: Springer. https://doi.org/10.1007/978-1-4419-9326-7_11.
26. Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features. In 10th European Conference on Machine Learning (pp. 137–142).
27. Tanvi, S., Anand, S., & Vibhakar, M. (2016). Performance analysis of data mining classification techniques on public health care data. International Journal of Innovative Research in Computer and Communication Engineering, 4, 11381–11386.
A Novel and Smart Parking System for the University Parking System V. Madhumitha, V. Sudharshini, S. Muthuraja, Sivakumar Rajagopal, S. A. Angayarkanni, and Thamer Al-Rousan
Abstract The necessity of an efficient and reliable car parking system is felt widely in today's technologically advanced world. In the Vellore Institute of Technology, car owners find it a very tedious task to search for car parking slots across the campus. Thus, our system caters to the needs of such individuals by replacing the existing traditional method with an efficient car parking system. The proposed system finds a parking slot without any hassle through the use of a reliable smart app along with a cost-efficient and accurate infrared sensor. Successful implementation of this technique leads to fewer traffic jams and hence judicious use of time. Keywords Infrared sensor · Smart app · Parking system · Reliable · Cost-efficient
V. Madhumitha (B) · V. Sudharshini · S. Muthuraja · S. Rajagopal School of Electronics and Communication Engineering, Vellore Institute of Technology, Vellore, India e-mail: [email protected] V. Sudharshini e-mail: [email protected] S. Muthuraja e-mail: [email protected] S. Rajagopal e-mail: [email protected] S. A. Angayarkanni R.M.K Engineering College, Kavaraipettai, Tamilnadu, India e-mail: [email protected] T. Al-Rousan Faculty of Information Technology, Isra University, Amman, Jordan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_6
1 Introduction
There is a need to replace the present parking system with an efficient one due to various problems such as inefficient management and the high cost of maintenance. The development of this idea of creating a smart parking system is a continuous, evergreen process. Various models have been developed. Cunningham et al. proposed the multimodal access and payment system (MAPS) that uses smart card technology to maintain a database of the transportation modes used by various employees; this system is not cost-efficient, and the programming speed of RFID is very high [1]. The system developed by Inaba et al. provides users with the ability to use the internet to reserve a parking space around the area where they intend to park the vehicle [2]. The advantages of this system are the reduction of traffic congestion as well as reduced time lag. However, a parking spot that stays vacant for a long time cannot be used by any other user. Mohandes et al. developed a system specifically for the KFUPM campus that allots parking spaces according to the top priority of the spot [3]. The main disadvantage of this system is that it uses either ultrasonic sensors or cameras to detect a vacancy after the user provides an ID card. There exists a wide range of surveys about various traffic congestion detection methodologies [4–6], which help to better understand the scenario. Allotting a proper parking space for a vehicle consumes time and is a major cause of roadside bottlenecks, which result in congestion. This work is intended to provide a smart parking system for our college campus, VIT. The infrastructure of every college is different and unique; the VIT campus has several buildings with a range of parking spots available. Hence, the idea developed is specific to the VIT campus. The need for an efficient parking system is felt in many public spaces such as multiplexes, hospitals, offices, and markets. Finding a car parking spot has become increasingly difficult, and the problem tends to grow with the increasing number of private car owners. This creates the need to design an automated parking system. Institutions like VIT provide a limited number of car parking spots near each building. Not all faculty members can find their desired car parking space, and hence there is always a time-consuming procedure of checking for parking spaces near each building until one is found. The main problem is the efficiency of parking resources, and hence the need to create a reliable parking spot finder is a matter of pressing concern. When a user can find a parking spot quickly, the searching time is reduced, as is the traffic congestion near the buildings. The main idea is to inform owners about the vacancy of parking spots before they enter the campus. Conventional parking systems lack the advantage of intelligent monitoring. The smart parking system proposed here uses IR sensors to detect the presence of a vehicle at a particular parking spot. This data is pushed to the smart mobile application. When a user opens the application and wants to find the vacant parking slots near a particular building, it
helps the user to track down the vacancy using real-time information updates. Much time is wasted, and many jams are created, in the search for vacant parking spots across the campus; the situation worsens during peak hours. An efficient parking system not only reduces human effort but also reduces the amount of time spent on this tedious task. In this system, a representation of the vacant and occupied parking slots is provided in the application. The rest of the paper is structured as follows: Sect. 2 narrates the related works and Sect. 3 describes the experimental setup. Results are discussed in Sect. 4. Section 5 concludes the paper with future directions.
2 Related Works
'Smart Parking System in VIT' revolves around the main idea of creating a smart and efficient parking system in VIT. For a parking spot available near a particular building, a faculty member is normally expected to come and check its availability in person. However, an application that can report the availability of that parking spot before the driver reaches it eliminates the massive problem of checking and then re-visiting another parking spot elsewhere. The Android application displays the vacancy or availability of the parking spots in a particular parking lot. Infrared sensors detect the presence of a vehicle at a particular parking spot; each IR sensor is placed at a location that is inaccessible to other real-world objects. Adafruit, the cloud computing platform, provides the same information about whether a parking spot is available or occupied. Additionally, it provides an analysis of the data in graph form, where an individual gets details about the frequency of availability of slots at a particular time. It also provides information about frequently used parking slots at the mentioned time and details about vacant parking slots. It will be of immense use for faculty members, who will be able to locate a nearby parking slot instead of wasting time searching for parking slots at different places. Finding a car parking spot has become increasingly difficult, and the problem tends to grow with the increasing number of private car owners. There is a need to replace the present parking system with an efficient one due to various problems such as inefficient management and the high cost of maintenance. In institutions like VIT, there are a limited number of car parking spots available to the faculty near each building. Not all faculty members can find their desired car parking space, and hence there is always a time-consuming procedure of checking for parking spaces near each building until one is found. The proposed system leads to a reduction in searching time as well as traffic congestion near the buildings. The main idea is to inform owners about the vacancy of parking spots before they enter the campus. The parking system developed by Basavaraju et al. has efficient sensors deployed in the parking spaces that provide details of the vacancy of the parking slots well in
advance [7]. Research works for parking allocation with different schemes have also been proposed [8–10]. A fog-enabled method and an IoT-based technique were adopted by Awaisi et al. [11] and Mahendra et al. [12]; they monitor occupancy with quick data processing units to analyze data collected from different sources. The system proposed here uses IR sensors to detect if a vehicle is parked in a particular parking area. This data is pushed to the smart mobile application. When a user opens the application and wants to find the availability of parking spots near a particular building, it helps the user find the vacancies, as this data is updated on a real-time basis in the mobile app. Due to the inefficient management system and the high cost of maintenance, there is a need for a visually monitored parking system, especially in institutions like VIT. The current system requires owners to drive endlessly around the campus until they find a parking slot. Since the preference for a parking slot differs for each individual, and the time taken to park also differs, a massive jam is created around the area, especially during peak hours. Also, the current system depends on another person for parking the vehicles and not on an automated system. To overcome these two problems, there is a need for an automated parking system that tracks vacancies of the slots and updates this real-time data to an application. Various parking systems have been developed to reduce human effort and provide additional comfort. Hassoune elaborates the methods to resolve parking issues and upcoming technologies, and compares them for clear understanding [13]. In the parking system developed by Grodi et al., a vehicle parked at a particular slot is detected using RFID sensors [14]. The system notifies the drivers once a vehicle has occupied a particular parking spot. The disadvantage of this system is that parking slots far away cannot be detected, as it involves no GPS sensors. The system proposed by Khanna et al. allows the user to book a slot, after which the place is allocated to him/her [15]. In case the user does not occupy that place at the specified time, an alarm is sent to the user as a reminder. The numbers of allocated and empty parking spaces are shown in this app. After the allocation of parking space, if the user does not reach it on time, there is wastage of time and money, since this parking space cannot be used by another individual either. Kanteti et al. developed a system where IP cameras are used in the smart parking system to capture the vehicle registration numbers of pre-registered users [16]. Such users can proceed without any interruptions, since details such as parking time, place of visit, and other relevant details are collected. Whenever a user parks a vehicle, the required amount is deducted from their E-wallet and the user is notified; for new users, the same process is followed offline. Ramasamy et al. developed, with the help of a mobile application, an IoT-based smart parking system for a large parking lot that provides the details of the nearest parking space, thereby reducing traffic congestion and efficiently managing the parking system [17]. Hence a successful cloud-based parking system is
developed using the IoT application to guide the user. Salpietro et al., using the concept of embedded sensors and Bluetooth connectivity, designed automatic detection of parking actions [18]. This analysis allows the details of a detected parking event to be passed on to the target; the main idea is a combination of an internet connection and a remote server. Tang et al. used the concept of wireless sensor networks to detect parking and activity in the parking lot [19]. This system is just a prototype that uses sensor nodes to detect vehicles and simultaneously sends this information to the database, where all the required tasks are performed. Pala et al. use a modern system based on RFID technology to reduce traffic congestion during check-ins and check-outs [20]; a system used to track cars is employed here as well. The GPS-based system developed by Swetha et al., using GPS and Bluetooth, has the disadvantage that Bluetooth is connected within a specific limited range of just 100 m [21]. In VIT it is impossible to connect with the users this way, so we use the NodeMCU's Wi-Fi, which can be easily connected and has a greater range than Bluetooth. Al-Jabi et al. have used augmented-reality-based interaction in smart parking, which involves a lot more work, so we converted this disadvantage into an advantage for our problem by using a cloud service as an easy way to communicate with the users [22]. Jioudi et al. define the relationship between parking demand, price, and parking time [23]. The proposed system consists mainly of two components: the parking allocation component and the smart application. The parking allocation component consists of the IR sensor, which detects if a vehicle is parked at that spot and updates this information to the application. The user-friendly application, when in use, displays the occupied and vacant parking slots around the individual building the user searches for on the campus. The application has real-time data that is updated at regular intervals of time to display the availability or occupancy of a particular parking spot. The system works as follows: when a car is parked in the slot-1 area, the IR sensor detects the object and sends the message to the app through the cloud computing platform, which is the base platform for the sensor and the smart app. The IR sensor is connected to the NodeMCU pins, with the Arduino UNO acting as a junction between the IR sensor and the NodeMCU pins. The NodeMCU is connected to the Wi-Fi, and the Wi-Fi is connected to the MQTT IoT platform. MQTT [Adafruit] is the base platform for the cloud computing network, and this network sends the message to the smart app. The smart app is based on a frontend and a backend: the detected object state is sent to the Adafruit cloud, and this server sends the message to the backend of the app (there is separate code for the backend and for Adafruit), so the backend receives the message; Adafruit and the app are interfaced with each other. Our smart app will be published in the Google Play Store, from where faculty members can download it and check whether a parking slot is filled or vacant. A device-side sketch of this publish step is given below.
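For illustration only: the actual prototype runs Arduino code on the NodeMCU, while this sketch assumes MicroPython firmware instead; the Wi-Fi and Adafruit IO credentials and the feed name are placeholders, not values from the paper.

```python
# Device-side sketch in MicroPython (the prototype itself uses Arduino code).
import time
import network
from machine import Pin
from umqtt.simple import MQTTClient

ir1 = Pin(16, Pin.IN)  # D0 = GPIO16, wired to the slot-1 IR sensor output

wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect("WIFI_SSID", "WIFI_PASSWORD")   # placeholder credentials
while not wlan.isconnected():
    time.sleep(0.5)

client = MQTTClient("parking-node", "io.adafruit.com",
                    user="AIO_USERNAME", password="AIO_KEY")  # placeholders
client.connect()

while True:
    # Typical IR modules pull their output LOW when an object is detected.
    occupied = ir1.value() == 0
    client.publish(b"AIO_USERNAME/feeds/ir1", b"1" if occupied else b"0")
    time.sleep(5)
```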
3 Experimental Setup
Our system works on the basic idea described above: when a car is parked in any area, the IR sensor detects the object and sends the message to the app through the Adafruit cloud platform, with the NodeMCU forwarding the sensor readings over Wi-Fi via MQTT. The basic principle of the IR sensor is to detect objects, in this case the cars parked at the respective parking slots. We connected three IR sensors to the NodeMCU and a power source. According to the NodeMCU GPIO [General-Purpose Input/Output] pin map, we connected GPIO16 (D0) as IR1, GPIO5 (D1) as IR2, and GPIO4 (D3) as IR3, and wired GND, RST, 5V, the sensor OUTPUT, and VCC to the corresponding pins and to the power source. The NodeMCU is then connected to the Adafruit cloud computing service.
Software Design. Adafruit is a cloud service platform that helps to send and receive messages to users and subscribers; we used the Adafruit MQTT protocol, and the Adafruit dashboard helps us view the graphs and the value of each slot. Figure 1 shows the block diagram, Fig. 2 shows the hardware design, and Fig. 3 shows the Adafruit dashboard displaying the three feeds (IR1, IR2, IR3). We created a smart app called SmartParking IO, using external software for the front end and Android Java for the back end; the Adafruit code is interfaced with the back-end code, which acts as a junction, so the hardware details sent to the Adafruit cloud service are passed to the back-end Java code. The details are displayed on the page after logging into the smart app, as shown in Fig. 4a, b. A backend-side subscription sketch is given below.
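A backend-side sketch of the subscribe path follows. The real app's back end is Android Java; this Python version uses the generic paho-mqtt client (1.x-style callbacks) against Adafruit IO's MQTT broker, with placeholder credentials and feed names.

```python
# Backend-side sketch with paho-mqtt; credentials and feeds are placeholders.
import paho.mqtt.client as mqtt

FEEDS = ["AIO_USERNAME/feeds/ir1",
         "AIO_USERNAME/feeds/ir2",
         "AIO_USERNAME/feeds/ir3"]

def on_connect(client, userdata, flags, rc):
    for feed in FEEDS:          # subscribe to every slot feed on connect
        client.subscribe(feed)

def on_message(client, userdata, msg):
    slot = msg.topic.rsplit("/", 1)[-1]
    state = "occupied" if msg.payload == b"1" else "vacant"
    print(f"{slot}: {state}")   # the app would update its slot view here

client = mqtt.Client()
client.username_pw_set("AIO_USERNAME", "AIO_KEY")
client.on_connect = on_connect
client.on_message = on_message
client.connect("io.adafruit.com", 1883)
client.loop_forever()
```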
4 Results and Discussions
During the rush hours of a day, it is a hectic and time-consuming job to find parking spots on the campus, especially around class timings. Due to the increased vehicle density, there is increased traffic congestion when each owner needs to
Fig. 1 Block diagram
Fig. 2 Hardware design
Fig. 3 Adafruit dashboard displaying three feeds (IR1, IR2, IR3)
visit each parking spot near each respective building at the time of need. This system aims to eliminate, to an extent, the significant task of finding car parking space by checking near each building. The problems of traffic congestion and wastage of time need to be reduced. Our system, which has the necessary components of an IR sensor and a user-friendly application, is cost-efficient and provides a solution to the above problems. The IR sensor is used because it is cost-efficient and has a maximum detecting range. The user-friendly mobile application has real-time data, updated at regular intervals of time, to depict the availability or occupancy of a particular parking spot. In recent times, the idea of smart cities has become immensely popular. Various issues need to be sorted out, such as traffic congestion and optimal usage of time; the implementation of the smart parking system can resolve these issues. Smart parking in VIT can also be viewed as a way to increase productivity and reduce pollution. 'Smart Parking System in VIT' mainly focuses on the optimal usage of the time of faculty and students, by reducing the amount of time spent searching for parking spots during peak hours.
5 Conclusions
With the help of this system, the aim is to create an efficient parking system by replacing the traditional and tedious method of searching for parking spots across the institution. The use of cost-efficient and accurate infrared sensors along with the
Fig. 4 a Smart app displaying screen with login details. b A screen displaying the vacancy/occupancy of slots
customized smart application is needed for an accessible parking system. Mohandes et al. developed their system on the main idea of providing a well-advanced reservation system that gives the user the advantage of booking parking slots before reaching the place [3]. However, this reservation system cannot be used on campuses like VIT, as there would be unnecessary wastage of parking space in case the user does not use that space. Our system is built on an idea that is specific to the campus of VIT, which has many buildings with limited car parking spots near each building [19]. There are mainly two components in this system: the parking allocation component and the smart application. The parking allocation component consists of the IR sensor, which detects if a particular vehicle is parked at that slot and updates this information to the application. The user-friendly application, when in use, displays the occupied and vacant parking slots around the individual building the user searches for on the campus. The application has real-time data that is updated at regular intervals of time to display the availability or occupancy of a particular parking spot. This data is also linked to the cloud, which depicts the availability of a particular parking spot at a particular interval of time during the day. The cloud provides the
data of peak occupancy of available parking spots, which helps car owners predict the vacancy of spots near the desired building. Currently, the 'Smart Parking System in VIT' is in the prototype stage and works with a limited number of sensors. However, this system is scalable and can accommodate more IR sensors to monitor an increased number of parking slots. The scale can, therefore, be adjusted according to the customer and user by adding new features or increasing the number of parking slots. The above aspects regarding cost efficiency and a user-friendly application are being considered for the future use of this system. There is a feasibility of short-term traffic flow forecasting with Chennai road traffic with mixed traffic flow, which may give a clear picture of predicting upcoming traffic in a particular area [24]. Our work can be extended to predict the availability of free space in public parking slots, which will facilitate the commuters of the congested city.
References
1. Cunningham, R. F. (1993). Smart card applications in integrated transit fare, parking fee and automated toll payment systems-the MAPS concept. In Conference Proceedings of the National Telesystems Conference (pp. 21–25), Atlanta, GA, USA. https://doi.org/10.1109/NTC.1993.293015.
2. Inaba, K., Shibui, M., Naganawa, T., Ogiwara, M., & Yoshikai, N. (2001). Intelligent parking reservation service on the internet. In Proceedings Symposium on Applications and the Internet Workshops (Cat. No. 01PR0945) (pp. 159–164), San Diego, CA, USA. https://doi.org/10.1109/SAINTW.2001.998224.
3. Mohandes, M., Deriche, M., Abuelma'atti, M. T., & Tasadduq, N. (2019). Preference-based smart parking system in a university campus. IET Intelligent Transport Systems, 13(2), 417–423. https://doi.org/10.1049/iet-its.2018.5207.
4. Mohan Rao, A., & Ramachandra Rao, K. (2012). Measuring urban traffic congestion: A review. International Journal of Traffic and Transportation Engineering, 2(4), 286–305.
5. Vlahogianni, E. I., Karlaftis, M. G., & Golias, J. C. (2014). Short-term traffic forecasting: Where we are and where we're going. Transportation Research Part C: Emerging Technologies, 43, 3–19.
6. Angayarkanni, S. A., Sivakumar, R., & Ramana Rao, Y. V. (2019). Review on traffic congestion detection methodologies and tools. International Journal of Advanced Science and Technology, 28(16), 1400–1414.
7. Basavaraju, S. R., Poornimakkani, S., Senthilkumar, S., & Finney Daniel, S. (2018). A cloud based end-to-end smart parking solution powered by IoT. International Research Journal of Engineering and Technology (IRJET), 3(5), 3559–3565.
8. Kiliç, T., & Tuncer, T. (2017). Smart city application: Android based smart parking system. In International Artificial Intelligence and Data Processing Symposium (IDAP) (pp. 1–4), Malatya. https://doi.org/10.1109/IDAP.2017.8090284.
9. Sadhukhan, P. (2017). An IoT-based E-parking system for smart cities. In International Conference on Advances in Computing, Communications and Informatics (ICACCI) (pp. 1062–1066), Udupi. https://doi.org/10.1109/ICACCI.2017.8125982.
10. Kazi, S., Khan, S., Ansari, U., & Mane, D. (2018). Smart parking based system for smarter cities. In International Conference on Smart City and Emerging Technology (ICSCET) (pp. 1–5), Mumbai. https://doi.org/10.1109/ICSCET.2018.8537281.
11. Awaisi, K. S., Abbas, A., Zareei, M., Khattak, H. A., Khan, M. U. S., Ali, M., et al. (2019). Towards a fog enabled efficient car parking architecture. IEEE Access, 7, 159100–159111.
12. Mahendra, B. M., Sonoli, S., Bhat, N., Raju, & Raghu, T. (2017). IoT based sensor enabled smart car parking for advanced driver assistance system. In 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT) (pp. 2188–2193), Bangalore. https://doi.org/10.1109/RTEICT.2017.8256988.
13. Hassoune, K., Dachry, W., Moutaouakkil, F., & Medromi, H. (2016). Smart parking systems: A survey. In 11th International Conference on Intelligent Systems: Theories and Applications (SITA) (pp. 1–6), Mohammedia. https://doi.org/10.1109/SITA.2016.7772297.
14. Grodi, R., Rawat, D. B., & Rios-Gutierrez, F. (2016). Smart parking: Parking occupancy monitoring and visualization system for smart cities. In SoutheastCon 2016 (pp. 1–5), Norfolk, VA. https://doi.org/10.1109/SECON.2016.7506721.
15. Khanna, A., & Anand, R. (2016). IoT based smart parking system. In International Conference on Internet of Things and Applications (IOTA) (pp. 266–270), Pune. https://doi.org/10.1109/IOTA.2016.7562735.
16. Kanteti, D., Srikar, D. V. S., & Ramesh, T. K. (2017). Smart parking system for a commercial stretch in cities. In International Conference on Communication and Signal Processing (ICCSP) (pp. 1285–1289), Chennai. https://doi.org/10.1109/ICCSP.2017.8286588.
17. Keat, T. M. (2018). IoT based smart parking system for large parking lot. In IEEE 4th International Symposium in Robotics and Manufacturing Automation (ROMA) (pp. 1–4), Perambalur, Tamil Nadu, India. https://doi.org/10.1109/ROMA46407.2018.8986731.
18. Salpietro, R., Bedogni, L., Di Felice, M., & Bononi, L. (2015). Park here! A smart parking system based on smartphones' embedded sensors and short-range communication technologies. In IEEE 2nd World Forum on the Internet of Things (WF-IoT) (pp. 18–23), Milan. https://doi.org/10.1109/WF-IoT.2015.7389020.
19. Tang, V. W. S., Zheng, Y., & Cao, J. (2006). An intelligent car park management system based on wireless sensor networks. In First International Symposium on Pervasive Computing and Applications (pp. 65–70), Urumqi. https://doi.org/10.1109/SPCA.2006.297498.
20. Pala, Z., & Inanc, N. (2007). Smart parking applications using RFID technology. In 1st Annual RFID Eurasia (pp. 1–3), Istanbul. https://doi.org/10.1109/RFIDEURASIA.2007.4368108.
21. Swatha, M., & Pooja, K. (2018). Smart car parking with monitoring system. In IEEE International Conference on System, Computation, Automation and Networking (ICSCAN) (pp. 1–5), Pondicherry. https://doi.org/10.1109/ICSCAN.2018.8541196.
22. Al-Jabi, M., & Sammaneh, H. (2018). Toward mobile AR-based interactive smart parking system. In IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (pp. 1243–1247), Exeter, United Kingdom. https://doi.org/10.1109/HPCC/SmartCity/DSS.2018.00209.
23. Jioudi, B., Sabir, E., Moutaouakkil, F., & Medromi, H. (2019). Estimating parking time under batch arrival and dynamic pricing policy. In 2019 IEEE 5th World Forum on Internet of Things (WF-IoT) (pp. 819–824).
24. Angayarkanni, S. A., Sivakumar, R., & Ramana Rao, Y. V. (2020). Hybrid Grey Wolf: Bald Eagle search optimized support vector regression for traffic flow forecasting. Journal of Ambient Intelligence and Humanized Computing. https://doi.org/10.1007/s12652-020-02182-w.
Role of M-CORD Computing Architecture for Over the Top (OTT) Services and Applications N. Senthil Kumar, P. M. Durai Raj Vincent, Kathiravan Srinivasan, Sivakumar Rajagopal, S. A. Angayarkanni, and Basim Alhadidi
Abstract With the deep proliferation of progress in technology, the prospects of future telecommunication would change the guidelines that safeguard the standards of the communication system and would further increase the resource contention for innovative technologies. The 3GPP (3rd Generation Partnership Project) has spelled out some essential rules which should be followed for the effective usage of the 5G network. As new mobile wireless standards enter the market, there is a deemed necessity for network providers to provide their consumers a reliable, robust and seamless network connection. When the 4G spectrum was introduced into the market, it was almost 5 times faster than 3G, and in tandem now, 5G would be almost 4 times faster than its predecessor 4G. To enhance the transmission speed, the M-CORD has been placed on SDN and NFV. Meanwhile, cloud computing technologies have been utilized for mobile wireless networks since they offer virtualization of the RAN and support other seminal properties of the network to boost the transmission rate. The M-CORD would enable the 5G mobile service faster and support community participation for
the fulfilment of real 5G technology. The consumer experience of 5G services would be rendered seamless and unparalleled by any existing network service, with an ensured minimum connection speed of around 15 Gbps and a low latency of 1 ms.
Keywords Cloud computing · Telecommunication · 3G network · SDN · NFV · 5G service
N. Senthil Kumar (B) · P. M. Durai Raj Vincent · K. Srinivasan · S. Rajagopal School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India e-mail: [email protected] P. M. Durai Raj Vincent e-mail: [email protected] K. Srinivasan e-mail: [email protected] S. Rajagopal e-mail: [email protected] S. A. Angayarkanni R.M.K Engineering College, Kavaraipettai, Tamil Nadu, India e-mail: [email protected] B. Alhadidi Department of Computer Information Systems, Al-Balqa' Applied University, As-Salt, Jordan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_7
1 Introduction
As the modern generation uses the mobile network for its daily routine activities, the utilization of Over-The-Top (OTT) applications has paved the way for wide-ranging confrontation among companies that provide similar or overlapping services. The conventional Internet Service Providers (ISPs) and the prominent telecommunication systems have had to foresee the impending challenges pertaining to third-party giants that provide over-the-top applications. For instance, the confrontation between a leading company like Netflix and a cable company is growing inherently over seamless connectivity. As far as customers are concerned, they still need to pay the cable company in order to access the Internet, but their cable package may fluctuate as they access cheaper streaming video over the Internet. Due to the heavy workload assigned to data centres, most of the existing technologies have depended on them heavily and needed them to carry out the major functionalities considered very important. With the steep increase of processor-intensive work for the data centres at one end, safeguarding some of the running data centres across locations is an exorbitant issue. The burden on the data centres has posed serious threats to the core functionalities that the companies offer to their customers and further leads to a low settlement on the QoE of the customers. The delay between the actual data transmission and its receipt can be reduced well in advance, and the QoS enhanced, by a re-engineered 5G technology that manages the data transmission using its central office rather than the costlier data centres. It has been noted over the last decade, with the emergence of Web 3.0, that multimedia content has occupied the Web space and increased the workload of the underlying data centres, which further hinders the performance of conventional networks drastically. To provide fitting solutions to the video traffic in mobile networks, the two network giants Reliance JIO and CISCO joined hands to team up on a novel plan to curtail video-traffic-related network collisions on mobile networks. As the modern generation uses mobile phones to view most video content on YouTube, Facebook and other social media sites, CISCO has predicted that the share of mobile network traffic caused by video streaming would increase to almost 82% by 2021, and it may give the mobile operators a great deal of work to overcome the challenges in reducing the network traffic caused by video content and give the customer seamless and undisrupted mobile services.
Since the number of mobile users has increased every year and has been growing exponentially, Reliance JIO and CISCO worked out a strategic partnership to handle these changing dynamics of network traffic and increased their computational methods across various domains of the networks, scaled vibrantly across many technological zones. With the joint collaboration between CISCO and Reliance JIO to intensify multi-access edge computing processes, they re-invented the wheel on the mobile Content Delivery Network (CDN), which paved the way for enhancing video experiences on mobile networks. This process started with a use case to further upgrade and improve the video experience over the system by building a mobile Content Delivery Network (CDN) integrated with the mobile LTE network with edge caches, conveying the content by means of edge cloudlets to furnish a superior client experience with lower latency and higher performance. For the model, the organizations utilized the right blend of IP address management techniques based on the mobile core, demanding a coupling between the edge cache, the mid-level cache, and the traffic switch in the Cisco Open Media Distribution framework. The mobile core and the Cisco Open Media Distribution were joined into one framework-level strategy for the CDN embedded in the mobile network. To take care of the issue of user-plane selection, they utilized a strategy called Control/User Plane Separation (CUPS), as defined in 3GPP R14 (Third-Generation Partnership Project Release 14 standards). Cisco and Reliance JIO additionally solved the problem of assigning a geographically suitable IP address to cell phones, as well as the problem of handover stability notwithstanding IP address changes on the customer side and at the CDN cache site, without change or disturbance to the present network services. According to a statement from NOKIA, it has proclaimed it will work with the leading mobile agent BSNL with the ultimate goal of strengthening the impending progress on the 5G mobile spectrum. In this regard, NOKIA and BSNL will reportedly prepare to schedule a nation-wide demonstration of the 5G network that promotes features such as high-speed network access and low latency between network transmissions, and discuss a wide range of possibilities such as remote healthcare, augmented reality, full automation, virtual reality and many more. They have also planned to fix a suitable measure on the aspect of a cost-effective path for a 5G network with high speed and robust capacity. Table 1 shows smart applications of M-CORD services. Figure 1 depicts the underlying principle that exists between the diverse subscribers and the telecom companies that offer effective alternative solutions from a single point of presence. In a nutshell, the central office possesses all the seminal functionalities of the network and disseminates seamless network connectivity to its customers.
Table 1 Smart applications of M-CORD services

| S. No. | Smart city application | M-CORD services |
|---|---|---|
| 1 | Smart Homes | Demand Response, Fire Detection, Temperature Monitoring, Security Systems, Social Networking Support |
| 2 | Smart Parking | Number of Cars, Departures and Arrivals, Environment Monitoring, Mobile Ticketing, Traffic Congestion Control |
| 3 | Health Care | Tracking, Identification, Data Gathering, Sensing |
| 4 | Weather & Water System | Weather Condition, Water Quality, Water Leakage, Water Level, Water Contamination |
| 5 | Transport & Vehicular Traffic | Camera Monitoring, Environment Monitoring, Travel Scheduling, Traffic Jam Reduction, Assisted Driving |
| 6 | Environment Pollution | Greenhouse Gas Monitoring, Energy Efficiency Monitoring, Renewable Energy Usage, Air Quality Monitoring, Noise Pollution Monitoring |
| 7 | Surveillance Systems | CCTV, Violence Detection, Public Place Monitoring, People & Object Tracking, Traffic Police |
| 8 | Smart Energy | Smart Metering, Demand Response and Demand-Side Management, Distribution Automation, Network Monitoring and Control |
| 9 | Smart Buildings | Illumination Control, Temperature Control, Energy Efficiency, Safety and Occupancy Control, Synergies between Energy Efficiency, Comfort and Security |
| 10 | Smart Waste | Waste Management, Waste Water Treatment, City Cleaning, Waste Tracking |
| 11 | Smart Education | Flexible Studying in a Collaborative Learning Domain, Access to the Best Digital Content Using Collaborative Mechanisms, Massive Open Online Courses (MOOC) |
2 M-CORD Computing Architecture for OTT Services
The M-CORD computing architecture is used both for testing and for development in a sophisticated manner. The structure of mobile networks needs to be completely revamped in order to accommodate the 5G network. This task will be a resource-consuming one, so the M-CORD computing architecture will be really helpful for proper resource utilization; in particular, it will save spectrum. This alternative arrangement also provides customized provisions which ensure the best QoE for its users. Its agile nature also ensures cost-effective deployment, which in turn reduces the overall expenses. This was actually the bigger threat for all the newer technologies earlier, and it is addressed in this architecture. The customers or users will benefit greatly, as the reduced expenses will ensure many more services at the same cost they
Fig. 1 Central Office and its association with subscribers
The CORD controller and Monitoring-as-a-Service are some of the additions that address network-related issues. Projects such as ONOS, Docker, XOS, and OpenStack are combined in this proposed computing architecture. Note that all of these are open-source projects, and together they integrate into an extensive service-delivery platform. The XOS project performs the functions of assembling and composing services through its operating system. Docker and OpenStack accomplish data-centre management in the cloud environment and take care of the interconnectivity services present in the respective software containers. On the other side, ONOS controls the white-box switch fabric that organizes the leaf-spine topology. As shown in Fig. 2, the M-CORD computing architecture integrates both a virtual EPC and a virtual RAN, along with various mobile-edge services such as SON and edge caching. By integrating all of these with CORD, a cloud-based agile platform is established to customize services for effective network sharing and profound observability. It applies dynamic resource optimization and a programmable data plane, which make this architecture more robust and flexible than others. Off-the-shelf hardware is used to provide performance identical to that of typical purpose-built hardware; the services provided here are virtual solutions running on that hardware setup. The responses given by the radio-frequency signals are real-time, which makes virtualization of the RAN a genuinely tough task. In a wireless network, a typical base station consists of a radio head separated from the baseband unit (BBU): the BBU is placed close to the ground equipment, while the remote radio unit (RRU) is located at the top of the mast. But when a centralized RAN architecture is established, all these BBUs are moved to a centralized location to share space with other BBUs.
Fig. 2 M-CORD architecture
A manifold partitioning approach to the baseband is highly desirable in a 5G environment, and it is very appropriate for small cells since it reduces maintenance expenses. All such virtualized RAN solutions are part of building this 5G networking architecture. In a dynamic setting, high demand must be handled, and this architecture tunes the network towards the ability to manage it. In every way it improves on the traditional approach, in which every node had to be provisioned to handle the maximum demand. Figure 3 shows the comparison between the traditional approach and the virtual RAN. The existing 4G network is built on the framework of the Evolved Packet Core (EPC). Previously, circuit switching and packet switching were used to handle voice and data traffic separately; in 4G, IP services handle both voice and data together in the EPC. Figure 4 shows the conventional structure of the RAN and EPC. The important constituents of the EPC include the Packet Data Network Gateway (PGW), the Serving Gateway (SGW), and the Mobility Management Entity (MME), which together handle everything. User authentication, maintenance of the different states, and user tracking are the responsibility of the MME. The SGW component takes care of data packets and their routing throughout the network. Quality of service when switching between previous-generation networks and 4G is handled by the PGW. The SGW and PGW control the user plane as well as the control plane, splitting into two when needed. Figure 5 establishes the RAN architecture along with the EPC. High flexibility and scalability can be achieved in this computing architecture because of the virtualization of the RAN. It is also possible to manage the whole network at a central level, which minimizes spectrum utilization.
Fig. 3 Conventional and virtualized RAN strategies
Fig. 4 Conventional Structure of RAN and EPC
Hardware customization can be done according to user requirements because of disaggregation. These measures ensure better throughput at reduced cost. Thus the M-CORD computing architecture guarantees a better environment and networking architecture for 5G. A few aspects are elaborated below.
Fig. 5 Architecture of RAN and virtualized EPC
2.1 Optimized Core for Static IoT Devices
In the current 4G network, the overhead of control-plane signalling keeps rising to new highs. This particular issue is addressed by M-CORD's computing architecture, which is very helpful in evolving to a new standard like 5G. This open-source package gives the proposed networking architecture more flexibility. Scalability is another big issue addressed by this computing architecture, which can handle many stationary IoT devices. The M-CORD computing architecture enables us to establish different cores to handle cellular devices and IoT devices: all regular LTE connections are handled by the usual path, whereas the IoT devices are handled by the SGW and PGW together.
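As a rough illustration of this dual-core idea, consider the sketch below. It is our own Python toy, not part of any M-CORD API; the class fields and core names are invented for this text. It simply routes attach requests from stationary IoT devices to a lightweight core instance while regular handsets keep the full MME/SGW/PGW chain.

```python
# Hypothetical illustration of M-CORD's split between a full EPC core and a
# slim core for stationary IoT devices. All names are invented for this sketch.
from dataclasses import dataclass

@dataclass
class Device:
    imsi: str
    device_type: str  # "handset" or "stationary_iot"

def attach(device: Device) -> str:
    """Pick the core that should serve this device's attach request."""
    if device.device_type == "stationary_iot":
        # Static IoT devices need no mobility signalling, so a slim,
        # connectionless core can serve them (reduced control-plane load).
        return "iot-core (no mobility state, combined SGW/PGW)"
    # Regular handsets keep the full LTE control-plane chain.
    return "full-epc (MME -> SGW -> PGW)"

for dev in [Device("00101-001", "handset"), Device("00101-002", "stationary_iot")]:
    print(dev.imsi, "->", attach(dev))
```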
2.2 Programmable and Scalable Connectionless Core
This M-CORD computing architecture facilitates the ability to generate a programmable core to handle a wide variety of adapted UEs. Static IoT devices, on the other hand, do not carry any complex processes that must be handled separately, so it is better to isolate the static IoTs in order to reduce the signalling load. By doing this, the performance of the M-CORD architecture improves significantly with DPDK (Data Plane Development Kit).
Fig. 6 End-to-End slicing in M-CORD architecture
2.3 Adaptive Analytical Service
A tool developed on M-CORD's computing architecture permits launching both experimentation and observation agents. It is applied at the edge of the network, where it supports geographically informed decisions.
2.4 End-to-End Slicing
The QoS requirements of different UEs are handled properly to support improved data rates in the 5G architecture. Network slicing is very helpful for handling 5G network traffic in a segregated manner; this is achieved by dynamically programmed virtual networks. Network slicing in the M-CORD computing architecture is done with the help of the virtualized RAN, utilizing the various components of the EPC. Figure 6 shows end-to-end slicing in the M-CORD architecture.
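To make the slicing idea concrete, here is a small, self-contained sketch. It is our own illustration: the slice names, QoS figures, and cost weights are invented and do not come from the M-CORD project. It matches a UE's requirements to the cheapest virtual slice that satisfies them.

```python
# Toy end-to-end slice selection: each slice advertises a latency bound and a
# bandwidth floor; a UE is mapped to the cheapest feasible slice.
SLICES = {
    "eMBB":  {"max_latency_ms": 20, "min_bandwidth_mbps": 100, "cost": 3},
    "URLLC": {"max_latency_ms": 1,  "min_bandwidth_mbps": 10,  "cost": 5},
    "mMTC":  {"max_latency_ms": 50, "min_bandwidth_mbps": 1,   "cost": 1},
}

def select_slice(needed_latency_ms: float, needed_bandwidth_mbps: float) -> str:
    feasible = [
        (spec["cost"], name)
        for name, spec in SLICES.items()
        if spec["max_latency_ms"] <= needed_latency_ms
        and spec["min_bandwidth_mbps"] >= needed_bandwidth_mbps
    ]
    if not feasible:
        raise ValueError("no slice satisfies the requested QoS")
    return min(feasible)[1]  # cheapest feasible slice

print(select_slice(needed_latency_ms=5, needed_bandwidth_mbps=5))    # -> URLLC
print(select_slice(needed_latency_ms=100, needed_bandwidth_mbps=1))  # -> mMTC
```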
3 Applications
With the advent of 5G technology in the global space, Safety as a Service (SaaS) will become important for effective credit checks and will further extend the maximum bandwidth of existing networks, even though users are affiliated with more than one network provider. This SaaS offering goes beyond the expected reach of voice calls and is dedicated to sharing more location details and seamless video calls. M-CORD aims to give robust resource allocation by streamlining real-time resource management and surveillance checks on the mobile framework, and it further strengthens its use across multiple Radio Access Technologies (RATs). This embedded model offers a virtualized projection and efficient low-cost deployment of open-source software on leading commodity RAN and EPC hardware. The CORD architecture consolidates open-source projects such as ONOS, OpenStack, XOS, and Docker to make a versatile platform for offering service delivery to customers.
Actually, the Open Network Operating System (ONOS) emulates the leaf-spine topology by subordinating the white-box switch control. The XOS operating system has been devised to assemble the components and the composition services of the network. Docker and OpenStack are used pervasively to carry out the cloud data-centre processes and to interlink the network services in the software containers. This technology paves an exuberant way not only for mobile users but also for the many users directly connected to IoT devices. Bandwidth requirements will grow manifold once these devices are connected, and the devices will also require a system that uses less power. The upcoming 5G technology would enable low-power devices to keep running for as long as 10 years without requiring another charge. It would also have significantly low latency, which would let us avoid the essential roadblock of network delay in the realization of self-driving vehicles. Moreover, the lower latency would enable specialists to perform surgery with injected nanoparticles in real time. Likewise, this could revamp the healthcare industry and prove to be a boon for multitudinous users. Figure 7 demonstrates the applications of 5G.
4 Conclusion
In this work, the M-CORD computing architecture is discussed for 5G networks, which perform their computations at the edge of the network. The M-CORD architecture is further helpful in over-the-top service mechanisms. This architecture works very well with Software Defined Networks and with the cloud, and its open-source nature supports this as well. Its smoothness and accuracy are ensured by multi-access edge computing, which applies its transforming abilities at the edge of the network to ease the workload of the core. The benefit of this computing architecture, which operates on an open-source platform, has been discussed in this paper.
Fig. 7 Applications of 5G in various streams
References
1. Barakabitze, A. A., Ahmad, A., Mijumbi, R., & Hines, A. (2020). 5G network slicing using SDN and NFV: A survey of taxonomy, architectures and future challenges. Computer Networks, 167, 106984. https://doi.org/10.1016/j.comnet.2019.106984.
2. Németh, B., & Sonkoly, B. (2020). Advanced computation capacity modeling for delay-constrained placement of IoT services. Sensors, 20, 3830.
3. Hong, J., Kim, W., Yoo, J., & Hong, J. W. (2019). Design and implementation of container-based M-CORD monitoring system. In 20th Asia-Pacific Network Operations and Management Symposium (APNOMS), Matsue, Japan (pp. 1–4). https://doi.org/10.23919/APNOMS.2019.8893141.
4. Huang, C., Ho, C., Nikaein, N., & Cheng, R. (2018). Design and prototype of a virtualized 5G infrastructure supporting network slicing. In IEEE 23rd International Conference on Digital Signal Processing (DSP), Shanghai, China (pp. 1–5). https://doi.org/10.1109/ICDSP.2018.8631816.
5. Sinh, D., Le, L., Lin, B. P., & Tung, L. (2019). SDN/NFV-based M-CORD for achieving scalability in deploying NB-IoT gateways. In IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM), Victoria, BC, Canada (pp. 1–6). https://doi.org/10.1109/PACRIM47961.2019.8985066.
6. Guo, H., Liu, J., & Zhang, J. (2018). Computation offloading for multi-access mobile edge computing in ultra-dense networks. IEEE Communications Magazine, 56(8), 14–19. https://doi.org/10.1109/MCOM.2018.1701069.
7. Golestan, S., Mahmoudi-Nejad, A., & Moradi, H. (2019). A framework for easier designs: Augmented intelligence in serious games for cognitive development. IEEE Consumer Electronics Magazine, 8(1), 19–24. https://doi.org/10.1109/MCE.2018.2867970.
8. Sabella, D., Vaillant, A., Kuure, P., Rauschenbach, U., & Giust, F. (2016). Mobile-edge computing architecture: The role of MEC in the Internet of Things. IEEE Consumer Electronics Magazine, 5(4), 84–91. https://doi.org/10.1109/MCE.2016.2590118.
9. Srinivasan, K., & Agrawal, N. K. (2018). A study on M-CORD based architecture in traffic offloading for 5G-enabled multiaccess edge computing networks. In IEEE International Conference on Applied System Invention (ICASI), Chiba (pp. 303–307). https://doi.org/10.1109/ICASI.2018.8394593.
10. M-CORD: https://opencord.org/wp-content/uploads/2017/02/MWC-M-CORD-WP-Short-Final.pdf. Accessed on 20 June 2020.
11. Srinivasan, K., Agrawal, N. K., Cherukuri, A. K., & Pounjeba, J. (2018). An M-CORD architecture for multi-access edge computing: A review. In IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), Taichung (pp. 1–2). https://doi.org/10.1109/ICCE-China.2018.8448950.
12. Abbas, M. T., Khan, T. A., Mahmood, A., Rivera, J. J. D., & Song, W. (2018). Introducing network slice management inside M-CORD-based-5G framework. In NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, Taipei (pp. 1–2).
13. Saha, R. K., Tsukamoto, Y., Nanba, S., Nishimura, K., & Yamazaki, K. (2018). Novel M-CORD based multi-functional split enabled virtualized cloud RAN testbed with ideal fronthaul. In IEEE Globecom Workshops (GC Wkshps), Abu Dhabi, United Arab Emirates (pp. 1–7).
14. Trindade, E. P., Hinnig, M. P. F., Moreira da Costa, E., Marques, J. S., Bastos, R. C., & Yigitcanlar, T. (2017). Sustainable development of smart cities: A systematic review of the literature. Journal of Open Innovation: Technology, Market, and Complexity, 3, 11.
15. Merino, R., Bediaga, I., Iglesias, A., & Munoa, J. (2019). Hybrid edge–cloud-based smart system for chatter suppression in train wheel repair. Applied Sciences, 9, 4283.
16. Popescu, D., Dragana, C., Stoican, F., Ichim, L., & Stamatescu, G. (2018). A collaborative UAV-WSN network for monitoring large areas. Sensors, 18, 4202. https://doi.org/10.3390/s18124202.
17. Langmann, R., & Stiller, M. (2019). The PLC as a smart service in Industry 4.0 production systems. Applied Sciences, 9, 3815.
18. Ren, J., Zhang, D., He, S., Zhang, Y., & Li, T. (2019). A survey on end-edge-cloud orchestrated network computing paradigms: Transparent computing, mobile edge computing, fog computing, and cloudlet. ACM Computing Surveys, 52(6), Article 125, 36 p. https://doi.org/10.1145/3362031.
19. Sittón-Candanedo, I., Alonso, R. S., Corchado, J. M., Rodríguez-González, S., & Casado-Vara, R. (2019). A review of edge computing reference architectures and a new global edge proposal. Future Generation Computer Systems, 99. https://doi.org/10.1016/j.future.2019.04.016.
20. Sha, K., Yang, T. A., Wei, W., & Davari, S. (2020). A survey of edge computing-based designs for IoT security. Digital Communications and Networks, 6(2), 195–202. https://doi.org/10.1016/j.dcan.2019.08.006.
21. Wang, S., Zhao, Y., Xu, J., Yuan, J., & Hsu, C.-H. (2019). Edge server placement in mobile edge computing. Journal of Parallel and Distributed Computing, 127, 160–168. https://doi.org/10.1016/j.jpdc.2018.06.008.
Application of Virtual Reality and Augmented Reality Technology for Teaching Biology at High School in Vietnam Lien Lai Phuong, An Nguyen Thuy, Quynh Nguyen Thi Thuy, and Anh Nguyen Ngoc Abstract Augmented Reality (AR) is a technology that helps users interact with virtual content in real environments. Virtual Reality (VR) is a virtualized (digitized) environment rendered on computers or smart mobile devices. Thanks to their outstanding ability to simulate images in 3D, VR/AR are among the effective support tools for teaching in general and for Biology in particular. In Vietnam, VR technology/AR technology (VRT/ART) has been initially applied in teaching; however, there are still many shortcomings, for many reasons. In this article, the authors propose a process for, and assess the status of, VRT/ART application in biology teaching at high schools in Vietnam, and initially test the effectiveness of applying VRT/ART to teaching specific content. This research plays a fundamental role, opening the way to more in-depth studies of applying VRT/ART to teaching in Vietnam. Keywords Virtual reality technology · Augmented reality technology · Imaging technology · Biology teaching
1 Introduction
In the context of the explosion of science and technology, developing capacity for students, especially the capacity to use IT tools, towards a global society and global citizenship is an extremely urgent task. With the strong development of software technology
in general and educational software in particular, teachers have many tools to make the teaching process more vivid and attractive. Resolution 8 of the 11th Central Conference of the Communist Party of Vietnam on fundamental and comprehensive innovation of education and training stated: "Promote the application of information and communication technology in teaching and learning" [1]. In general education, Biology is the science of the living world; its subjects can be microscopic structures that we cannot see with the naked eye, such as the structures of cells and microorganisms, or the structure of a whole ecosystem. To provide visual images, technology is always an effective tool to support teaching. Thanks to the outstanding feature of simulating images in 3D, VR/AR is one of the effective tools for teaching Biology content on structures, functions, biological processes, etc. With their practical features, VR/AR software will contribute to learning goals by putting theoretical learning content into practice and direct experience through vivid interaction, with cost savings. Therefore, the application of these two technologies in education around the world is increasingly popular.
2 Virtual Reality and Augmented Reality Applications for Teaching
2.1 Overview of Virtual Reality and Augmented Reality Technologies
Virtual reality (VR) is a technology that uses three-dimensional modeling techniques, with the help of modern multimedia devices, to build a computer-simulated world called the virtual environment. In this virtual world, users are no longer outside observers but become part of the system. A VR system has three main characteristics: Interaction, Immersion, and Imagination [2, 3]. Augmented reality (AR) is a technology that allows users to view a real environment, directly or indirectly, in which the components of the environment are augmented (or supplemented) by machine-generated data such as audio, images, and GPS. An Augmented Reality system has the following three characteristics: (1) it combines reality and virtuality; (2) it creates interaction in real time; (3) it is shown in three dimensions [4]. Unlike VR, the augmented information in an AR system is closely related to the real environment: the appearance of the information varies according to the way the user moves, and it takes the components of the real environment into account. In other words, in VR the user's perception of reality lies entirely within the virtual environment, while in augmented reality users are given more awareness of reality through the operation of computers [5–7].
2.2 The Situation of Research on and Application of Virtual Reality and Augmented Reality in Education Around the World and in Vietnam
Around the world and in Vietnam, there have been many studies on virtual reality and augmented reality applications in education. Elliot Hu-Au and Joey J. Lee presented the opportunities that VR brings to education, such as engaging students, providing a learning environment in which students construct knowledge themselves, helping visualize difficult models, and supporting students' creativity and future career orientation [8]. Among developers of virtual reality applications for education and training, UNIMERSIV has successfully developed a medical training application called "Molecule VR," providing a thorough visual experience beyond what mere observation offers. It also promotes the positive, proactive, and creative nature of learners, exploiting many of the learners' senses to acquire knowledge [9]. Michael Bodekaer, the founder of Labster, presented a virtual reality laboratory for chemistry at a global TED conference organized by the Sapling Foundation [10]. The lab is modeled on a real laboratory through 3D graphics; participants can directly manipulate it, create chemical reaction sequences, and dissect reaction images to observe them. In Vietnam, there have been some research topics on VRT/ART application in education. The project "Applying Virtual Reality Technology to Biology Education: The Experience of Vietnam" by Le and Do studied the difficulties in applying VR in biology teaching in Vietnam, proposed several suitable biological topics for VR technology, and proposed a 5-step process for constructing biology lectures that apply VR [11]. The topic "Virtual reality technology: Application direction and development in multimedia training" by Vu Huu Tien studied the application of VR in teaching, classroom management, and the development orientation of virtual-reality-based multimedia training [12]. The topic "Application educational technology into elementary school: A case study of using VR/AR in an elementary teacher's training" by Bui Thi Thuy Hang, Bui Ngoc Son, and Nguyen Huong Giang analyzed the characteristics of teaching and learning at primary schools in Vietnam and the role of technology application there, including a study of using AR in color-experience activities when teaching color to first-grade students [13]. The above summarizes the application of and research on VR/AR technology in domestic and foreign education. However, these applications are mainly used for teaching at higher levels such as colleges, universities, or training centers; there is little research on applications for teaching biology to high school students.
3 Application of Virtual Reality and Augmented Reality Technology in Biology Teaching at High Schools in Vietnam
3.1 The Reality of Applying Virtual Reality and Augmented Reality to Biology Teaching in High Schools in Vietnam
Before presenting the content on virtual reality and augmented reality, we surveyed the status of using VR/AR in teaching Biology. We surveyed 130 Biology teachers in Hanoi, Vietnam. The survey results show that up to 90.7% of teachers are not familiar with VR/AR, 6.1% know of the technology but have never used it in Biology teaching, and only the remaining 3.2% have applied it (shown in Fig. 1). The reason for such results, in the authors' view, is that VR/AR technology is still very new to Vietnam in general and to the Vietnamese education sector in particular. VR/AR technology is mainly applied at the primary or tertiary level in engineering or medicine, but even there it is not yet popular. In high schools, VR/AR technology has very few applications because of its novelty and high operating costs. Fig. 1 The reality of applying VR and AR to biology teaching in high schools in Vietnam
3.2 Process of Applying Virtual Reality and Augmented Reality to Teaching Biology
Based on an overview of VRT/ART application in teaching Biology, together with our own views, we propose a process for applying VR/AR in the Biology subject, which is also used in this study (Fig. 2).
• Determine the topic content: The teacher reviews high school Biology programs and textbooks to find lesson content that matches the selected topic criteria and can be linked to VR/AR. After that, the teacher determines the topic name and duration.
• Define teaching goals: The teacher reviews the high school Biology program to determine the goals for knowledge, skills, attitudes, and competencies to be achieved after the lesson.
• Identify content in topics that integrate VR/AR: The teacher identifies the parts of the lesson content to be integrated with VR/AR.
• Teaching plan: The teacher builds learning tasks using VR/AR software towards the set learning goals, and searches for documents (pictures, information, videos, etc.) as learning materials for building lessons.
• Implementation of the plan:
– Stage 1: Introduce the topic; introduce and guide students in using the software; assign tasks for each lesson period; provide several sources and references; state the product evaluation criteria.
– Stage 2: Practice following the plan.
• Evaluate and learn from lessons: The teacher evaluates students' products based on the evaluation criteria that were built, and draws lessons from the experience.
Fig. 2 Process of applying VR/AR in Biology subject
3.3 Example of Applying VRT/ART to Teaching Biology at High Schools in Vietnam
We have tried applying VRT/ART to teach the topic "Viruses and infectious diseases," whose content belongs to Part 3: Microbiology of Grade 10 Biology. The knowledge circuits in the topic, with their goals and products, are as follows: (1) definition, structure, and morphology of viruses; (2) viral replication in host cells; (3) the concept of infectious diseases, their modes of transmission and prevention, including the acute respiratory disease (Covid-19) caused by SARS-CoV-2 (structure, replication cycle, mode of transmission, prevention). The objectives, expected products, teaching methods, and assessment are presented in Table 1. To apply VRT/ART to teaching this topic, we use the Expeditions software. This is a lively learning and teaching tool that allows users to take Virtual Reality trips or explore objects in Augmented Reality.

Table 1 Targets and proposed products for the topic "Viruses and infectious diseases"

Content | Required knowledge
The concept, structure, morphology of viruses | Present the structure of the virus; present virus characteristics; present the morphological structure of viruses; distinguish viruses from bacteria; explain why antibiotics cannot be used to kill viruses as they kill bacteria
Virus replication in the host cell | Present the stages in the replication cycle of animal viruses and phages; explain why each virus can only enter certain types of host cells; distinguish the differences in the replication cycles of animal viruses and phages; differentiate between the lysogenic cycle and the lytic process; explain why the disease is difficult to detect in the first stage of infection; explain why viruses are obligate intracellular parasites
Concept of infectious diseases, modes of transmission and ways to prevent them | Identify the concept of infectious disease; list the modes of transmission of infectious disease; present the structure and replication cycle of the SARS-CoV-2 virus; outline how the SARS-CoV-2 virus infects and how to prevent the diseases it causes; give examples of some common infectious diseases and state their modes of transmission and prevention
Fig. 3 Scenes in Google Expeditions app under the topic Virus (a: View in AR mode; b: View in VR mode)
In the classroom or with groups, Google Expeditions allows a team member to act as a guide who leads classroom-sized groups of "explorers" through Virtual Reality (VR) tours or shows them Augmented Reality (AR) objects. A guide can use a set of tools to point out interesting things along the way (Fig. 3). Using this application, we taught the topic "Viruses and infectious diseases" to 89 students in two Grade 10 classes (10A1 and 10A2) at Viet Duc High School, Hanoi, Vietnam. Based on the survey results, and especially on the first-semester exam results, we found that the academic levels of students in these classes were similar. Therefore, we selected 10A1 as the control class (C) and 10A2 as the experimental class (E) to conduct pedagogical experiments. The experimental plans were conducted in parallel in the two classes: (1) E class: taught with lesson plans organized for students that use Expeditions on tablets to exploit VR/AR images for solving the learning tasks the teacher sets out; (2) C class: taught with regular lesson plans, familiar learning methods, and common visual aids such as pictures and presentations. After completing the pedagogical experiment, we conducted a 15-min test with the same set of problems in both classes. The results of the test are collected in Table 2.
• Plot the frequency of test points between the two classes: from Table 2, we draw Fig. 4.

Table 2 Frequency distribution of test points (%) of the C and E classes

Class | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
E | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 4.17 | 12.50 | 27.08 | 31.25 | 18.75
C | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 15.91 | 27.27 | 25.00 | 25.00 | 6.82
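The percentages in Table 2 are simple per-point frequency counts. For readers who want to reproduce such a table, a short sketch follows; the score list in it is a made-up placeholder, not the study's raw data.

```python
# Compute a point-frequency distribution (in %) like Table 2 from a raw list
# of test scores. The scores below are placeholders, not the real data.
from collections import Counter

def frequency_percent(scores, max_point=10):
    counts = Counter(scores)
    n = len(scores)
    return {p: round(100.0 * counts.get(p, 0) / n, 2) for p in range(max_point + 1)}

sample_scores = [6, 7, 7, 8, 8, 8, 9, 9, 9, 9, 10, 10]  # hypothetical class
print(frequency_percent(sample_scores))
```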
Fig. 4 Chart of the frequency of test points of control and experimental classes
Looking at the chart, we see the difference in points between the C and E classes. Specifically, in the C class the number of students with an average score (5–7 points) accounts for a large proportion, while in the E class, in contrast, the percentage of students with fairly good and good grades (8–10) accounts for the majority. Regarding the rates of weak, average, good, and excellent students: from the results of Table 3, Fig. 5 is drawn.
• Comment: from the above tables and charts, the study makes the following remarks:
– The graph of the E class always lies to the right of that of the C class (Fig. 4), which shows that the E-class students' ability to absorb knowledge is better and more uniform than that of the C class.
– Figure 4 clearly shows the difference in points between the C and E classes. Specifically, in the C class the number of students with 5–7 points accounts for a large percentage, whereas in the E class the percentage of students scoring 9–10 accounts for the majority.
– In Fig. 5, the percentage of students who got good grades in the E class is higher than in the C class, and students with good grades account for the majority of the E class (50%). In contrast, the percentage of students who achieved an average grade in the C class is higher than in the E class. Thus, the results show that the application of VR/AR in teaching the E class is more effective than in the C class.
• Value of the characteristic parameters: from the test results of the two classes, processed through Excel, we obtained the values of the characteristic parameters in the following table.
Table 3 Student learning results through the test

Class | Number of tests | Weak point (0–4): Quantity / % | Average point (5–6): Quantity / % | Good point (7–8): Quantity / % | Excellent point (9–10): Quantity / %
E | 45 | 0 / 0 | 2 / 4.17 | 19 / 39.58 | 24 / 50
C | 44 | 0 / 0 | 7 / 15.91 | 23 / 52.27 | 14 / 31.82
Fig. 5 Chart of classification of student learning results through the test
Table 4 Values of the characteristic parameters

Parameters | C class | E class | Difference
Average point | 7.79 | 8.51 | 0.72
Variance | 1.42 | 1.21 |
Standard deviation | 1.19 | 1.10 |
T-test's value | p = 0.004 | |
– Processing the score data of the two classes in Excel, we calculated the parameter p = 0.004, less than 0.05 (Table 4), which confirms that the difference between the C class and the E class is statistically significant. Specifically, it is the application of VR/AR to teaching the E class that increased the test scores compared to the C class, not other random factors. Based on the pedagogical experiment results and the processing of the experimental data, the study found that the learning quality of students in the experimental groups was higher than in the control groups. This result demonstrates the value of the method the authors chose in this paper.
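The paper reports p = 0.004 from Excel; an equivalent check can be scripted as in the sketch below. The two score arrays are placeholders, since the study's raw per-student scores are not reproduced here; substituting the real scores would reproduce the reported p-value.

```python
# Independent two-sample t-test on the E (experimental) and C (control) class
# scores, mirroring the Excel analysis. The arrays are hypothetical placeholders.
from scipy import stats

e_scores = [6, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10]  # hypothetical E-class data
c_scores = [6, 6, 7, 7, 7, 8, 8, 9, 9, 10]      # hypothetical C-class data

t_stat, p_value = stats.ttest_ind(e_scores, c_scores, equal_var=False)  # Welch's t-test
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
# A p-value below 0.05 indicates the difference in class means is
# statistically significant rather than due to chance.
```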
4 Conclusion
Biology is the science of the living world, and a unique feature of the subject is that it requires many images illustrating the biological structures of that world. Thanks to the outstanding feature of simulating images in 3D, VR/AR is one of the effective tools for teaching Biology content on structures, functions, biological processes, etc. Besides the obvious advantages of VR/AR technology in education, some limitations need to be overcome. One of them is the cost of operation, and another is the awareness of
teachers and students regarding the application of VRT/ART in education. Therefore, to achieve high efficiency, the technology must be applied creatively, flexibly, and appropriately, based on the actual conditions of each field and each school, the learners' capacity, and teachers' ability to use modern teaching facilities for specific subjects. One of the basic orientations of educational reform in Vietnam today is to move from an academic education that is far from practice to an education that focuses on building learners' capacity for action and promoting their initiative and creativity. The application of VRT/ART to education opens up great prospects for innovating teaching methods and forms. With their practical features, VR/AR software will contribute to learning goals by putting theoretical learning content into practice and direct experience through lively interaction, with cost savings.
References
1. Prime Minister of Vietnam. (1993). Resolution no. 49/CP on 04 August 1993 about Information technology development in our country in the 1990s.
2. Cong, N. T. (Ed.). (2015). 3D modeling the digestive system in virtual reality. Thesis. University of Information Technology and Communications, Thai Nguyen University, Vietnam.
3. Li, Z., Yue, J., & Jáuregui, D. A. G. (Eds.). (2009). A new virtual reality environment used for e-Learning. In 2009 IEEE International Symposium on IT in Medicine & Education, Jinan (pp. 445–449).
4. Wu, H.-K., Lee, S. W.-Y., Chang, H.-Y., & Liang, J.-C. (2013). Current status, opportunities and challenges of augmented reality in education. Computers & Education, 62, 41–49.
5. Carmigniani, J., Furht, B., Anisetti, M., Ceravolo, P., Damiani, E., & Ivkovic, M. (2011). Augmented reality technologies, systems and applications. Multimedia Tools and Applications, 51(1), 341–377.
6. Ma, M., Jain, L. C., & Anderson, P. (Eds.). (2014). Virtual, augmented reality and serious games for healthcare 1 (p. 120). Springer Publishing.
7. Sidiq, M., Lanker, T., & Makhdoomi, K. (2017). Augmented reality VS virtual reality. International Journal of Computer Science and Mobile Computing, 6(6), 324–327.
8. Hu-Au, E., & Lee, J. J. (2017). Virtual reality in education: A tool for learning in the experience age. International Journal of Innovation in Education, 4(4), 215–226.
9. Fernandez, M. (2017). Augmented-virtual reality: How to improve education systems. Higher Learning Research Communications, 7(1), 1–15.
10. Andreoli, M. (2018). La Realtà Virtuale al servizio della Didattica. Studi Sulla Formazione, 21(1), 33–56.
11. Thi, P. L., & Thuy, L. D. (2019). Applying virtual reality technology to biology education: The experience of Vietnam. In Proceedings of the First International Conference on Innovative Computing and Cutting-edge Technologies (ICICCT 2019), October 30–31, Istanbul, Turkey (pp. 455–462).
12. Van Cuong, N., & Meier, B. (Eds.). (2010). Some general issues about innovating teaching methods in high schools. Berlin/Hanoi: High School Education Development Project.
13. Hang, B. T. T., Giang, N. T. H., & Son, B. N. (Eds.). (2019). Application educational technology into elementary schools: A case study of using VR/AR in an elementary teachers training. In International Conference on Innovation in Teacher Education, VNU of Education, November 15, Vietnam.
Internet of Things Based Gesture Controlled Wheel Chair for Physically Disabled Akanksha Miharia, B. Prabadevi, Sivakumar Rajagopal, and Basim Alhadidi
Abstract A gesture-controlled wheelchair is beneficial to a physically disabled person. Though there has been an incredible leap in the field of wheelchair technology, these advances have not yet produced a wheelchair that can be steered unaided, that is, one operated by effortless hand signals. The user controls the wheelchair using hand movements: the system interprets the motion made by the user and directs the wheelchair accordingly. It also aims at making the system wireless by using ZigBee, which creates a personal area network for transferring data wirelessly. Keywords Gesture · Wheel chair · Wireless · ZigBee
1 Introduction Gesture recognition is to recognize specific human gestures and process them to control device. It is used to achieve hands-free interaction. It is chosen because it makes the work efficient, time-saving and reduces workload. Especially in hospitals, where some patients are strictly dependent on wheelchair should get an opportunity to feel free and independent, and hence, the concept of making a gesture-controlled wheelchair was formed. To create a basic prototype of a robot that is capable of moving when the user makes the required gesture and also in worse circumstances, prevent collision automatically. This could further be enhanced to make a better use at the hospitals. A. Miharia · B. Prabadevi (B) School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India e-mail: [email protected] S. Rajagopal School of Electronics Engineering, Vellore Institute of Technology, Vellore, India e-mail: [email protected] B. Alhadidi Department of Computer Information Systems, Al-Balqa’ Applied University, As-Salt, Jordan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_9
99
100
A. Miharia et al.
To build a robot that used the fundamental of artificial intelligence and automated engineering which is capable of making its movements by the gestures that the user gives and also capable of sensing an obstacle and successfully avoiding it automatically. It should be able to sense gestures within a meter of distance and be able to move in the direction accordingly so that the person can independently use the wheelchair through just minimal movements and can move forward, backward, right or left and also be able to avoid knocking into an obstacle. The modification of this logic can be used to make a robotic arm utilized for welding or for controlling hazardous substances. The concept can be used in home automation. Further gestures for more complex works can be introduced to make a whole system controlled only by gestures. When due to certain reasons, a person is unable to use his motor capacity, it becomes essential for the person to utilize some instrument like a wheelchair. It offers a different movement for the patients with difficulty due to which they can’t move on their own. However, simply having a wheelchair is not enough. A person with such a problem would always have to be dependent on the other person for his movements, especially people who cannot handle the wheelchair by their bands due to inadequate force. For this particular reason, wheelchairs that can be manipulated by the user itself were made. Initially, they made one that would require the user to manipulate the wheels by his hands, but the user would get tired easily, and therefore won’t be long travel distance with ease. Then came the idea of the joystick, but these come costly and require a lot of maintenance and pose a lot of other problems. So a new technology involving gesture controls came into the picture. In this invention, I-GCWC, a gesture-controlled wheelchair for physically disabled is IoT and AI-based robot. This device helps the people who have to be strictly dependant on a wheelchair to feel free and independent. I-GCWC is a robot that can be controlled by the hand movements of the user. The utilizer will wear a glove which is the transmitter part of the device. The glove is enabled with sensor for sensing the movements made by the hand. The sensed data is manipulated in the Arduino board is sent to the receiver using ZigBee. Based on the signals from the transmitter, the receiver controls the gear movement of the wheelchair using a motor driver. A sensor is used for sensing the obstacles to avoid collisions. Also on collision detection, the robot is redirected to other paths, thus avoiding it. A buzzer can be used to notify about the obstacle on collision.
2 Related Works
Various researchers have implemented different techniques for controlling wheelchairs for the needy; some of them are presented in this section. A joystick-controlled wheelchair is vital for physically challenged people. It is straightforward to operate and can give disabled people freedom. The drive of the wheelchair is controlled physically by the joystick: a command issued through the joystick is sent to the Arduino board,
Fig. 1 Joystick Controller
where the controller which will process the command. After processing the controller sends the command in the form of a digital signal to the motor driving IC and the motor driving IC controls the movement of the wheelchair. To gadget these functionalities, a software program is embedded into the internal FLASH of ATMega328P microcontroller as shown in Fig. 1. L293D IC controls DC motor and Ardunio ATMega328P according to the instruction of the joystick. This is useful for people who cannot walk and can use their hands for movements [1]. Figure 1 shows the Joystick controller. A satisfactory motion to human drivers is that of a smart robot with graceful and collision-free. A sophisticated movement is harmless, contented, speed and automatic. It is done by providing formal evaluation criteria. By use of B-splines, it shows an automatic path for a location, and draw route-following control law for various drive wheeled vehicle to continue that route within the speed and acceleration leaps a smooth functioning is achieved. The system has a MEMS transducers unit and wheelchair monitoring console. The MEMS transducer, attached to hand on a wheelchair is a 3-axis controller, and the ultrasonic sensor transforms into discrete number values and provides it to the 8051 controllers [2]. The wheelchair system should be designed utilizer favourable to diminish the burden of the custodian and will lift the faith level of the restricted movement person by helping him/her more independent. Wheelchairs are planned for the disabled person. The system has two major blocks: MEMS transducer and the other one is wheelchair manipulator. The MEMS transducer is attached to hand on a wheelchair, are a 3-axis controller and ultrasonic transducer convert into discrete number values and give it to the 8051 controllers. The controller and ultrasonic transducer provide input to the analog to digital converter and it manipulates analogue input signals into discrete number values. This, in turn, transmits the data to the processor and then communicated by the GSM Module to Mobile device. The controller is interfaced by the DC motor and the motor driver manipulates the movement of the wheelchair by the input that the ultrasonic transducer gets [3, 4]. The chair emphases on position approximation of movements used by restricted movement people to provide appropriate signals to the wheels. The position is identified using silhouettes and by movement identification by a Mahalanobis distance measure. It additionally controls a wireless robot from different movements used by restricted movement person. The robot is made to perform the front of the direction,
back, right, left, and stop actions for each gesture. It can additionally be adapted to operate apparatus used in everyday life with the aid of movements, and it can be used by hospital staff to guide patients to distant locations during an emergency [5]. A NeuroSky MindWave headset picks up EEG signals from the brain. The signal measured by the EEG transducer is processed on the ARM processor FRDM KL-25Z. The processor identifies a proper solution for moving the wheelchair in a particular direction based on the obstacle-avoidance transducers fitted on the wheelchair's base, and it presents real-time details on a display interface. An additional joystick control interface is provided [6]. Assistive technology is important for the mobility of people whose movement is restricted by spinal cord problems or disorders of the central nervous system; supportive techniques help them live a self-supporting, individual life. A brief description of the numerous assistive technologies for disabled people and their limitations is presented. Further, a tongue-movement-based system joining wireless devices addresses the restriction-related issues of the previous techniques: the tongue-operated unit is a tongue-based, minimally invasive, unobtrusive, and effective methodology for controlling several instruments in its area. It gives users the option to operate power wheelchairs and computer functions by using unconstrained tongue movements [7]. A vigorous head-movement gesture interface (HGI) is intended for head-movement identification of the RoboChair user. The gestures generated are used to produce movement control commands for the low-level DSP motion processor, which maintains the movement of the RoboChair according to the user's instructions. Adaboost face detection and Camshift object tracking are fused in the system to realize true facial recognition. It offers two types of control, an intelligent manipulator and a manual manipulator: manual control uses a joystick, while intelligent control uses sensors and acts automatically [8]. A standard joystick is challenging for some people to use in a powered wheelchair. A robotic wheelchair gives users not only driving support but also lets them travel with efficiency and greater ease; this is achieved with an inbuilt computer, sensors, and a graphical user interface. The Wheelesley instrument provides low-level navigation support for the user, allowing the user to give high-level movement commands like "forward" or "right." The robot contains a 68332 controller that is used to run the robot and process transducer details. For identifying its location, the robot has 12 infrared transducers, four ultrasonic range sensors, two wheel encoders, and 2 Hall-effect transducers. The infrared and sonar transducers are placed on the outside of the wheelchair, mostly in the front part of the chair; the Hall-effect transducers are also placed on the front part of the wheelchair. More transducers to determine the present state of the location are being added to the chair [9]. A system to make people more independent was developed in [10]: a smart wheelchair that can be installed in proper seating, with the wheelchair batteries powering the sensors. In [11], the Bremen Autonomous Wheelchair was presented as a platform focusing on rehabilitation applications. It is necessary to satisfy
the needs of the potential users as well as to ensure the safety of the assistive technologies; design decisions should consider these factors. A secure system for the Bremen Autonomous Wheelchair was developed, and more light is shed on a route navigation system [11]. Rao et al. proposed the idea and application of commands for a robotic arm for real human-robot communication on an automated wheelchair, since people with disability face problems with self-care [12]. MAid describes the hardware arrangement and the control and navigation system of the robotic wheelchair called Mobility Aid for mobility-restricted persons [13]. The purpose of this wheelchair is to move severely affected persons and provide them with a good amount of mobility. The equipment is built on a commercial wheelchair that has been fitted with a good navigation system. The work proposed by Nakanishi et al. [14] applies a robotic wheelchair that can be controlled by face gestures: the robot moves according to the face movement. Although such robotic wheelchairs can be used freely, one problem is that unintentional gestures of the face may affect the wheelchair's motion. They proposed an improved variant of the wheelchair, enhanced by observing both the user and the environment. It uses autonomous capabilities together with interaction through face position, and it uses the transducer information obtained for navigation signals to resolve the issues with control by face signs. Additionally, based on face orientation, the system selects a suitable autonomous navigation procedure to make the user more independent. The NavChair Assistive Wheelchair Navigation system is for individuals prevented from using a powered wheelchair by cognitive, perceptual, or motor impairments [15]. It improves on the plain joystick through software that performs the filtering and smoothing operations the earlier joystick would perform. A system that can be fitted onto existing chairs was deployed by Gomi et al. [16]: a behaviour-based method was used to produce sufficient movement at lower expense and component count while obtaining good performance, adequate safety, and transparency.
3 Proposed Architecture
I-GCWC is an autonomous robot for aiding the physically disabled; it includes collision detection and avoidance for a gesture-controlled wheelchair. The system is a two-wheeled, gesture-controlled robot with a matching glove for making gestures, which, when switched on, will demonstrate gesture-controlled movement of the robot in several directions, show that it is fully battery-powered and wireless, show that all movements are pre-programmed and hence automated, and exhibit collision avoidance. Sometimes the user makes the wrong gesture, in which case he/she might collide and get hurt. Also, because ZigBee is used for wireless communication, there is a lag in data transfer, so there can be a slight lag in movements when the user is somewhat far from the robot. To overcome this, a sensor to detect nearby objects and thus automatically avoid collisions was implemented; in turn, a buzzer can be made to sound when a collision is imminent. Figure 2
Fig. 2 Working of gesture-controlled wheelchair
depicts the working principle of the gesture-controlled wheelchair.
3.1 Transmitter Section It consists of a glove which is the primary tool here for making gestures. As shown in Fig. 3, The glove consists of a MEMS sensor. Arduino and ZigBee. The MEMS Fig. 3 Transmitter Glove
Fig. 4 Receiver chassis
3.2 Receiver Section
It consists of the chassis on which the Arduino, the motor driver module, and a ZigBee module are placed, as shown in Fig. 4. The ZigBee module here receives the values from the transmitter; these are processed by the Arduino and sent to the motor driver module, which in turn controls the movement of the wheels.
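A compact model of this receiver logic is sketched below. Python is used purely for illustration; on the actual chassis this logic runs as Arduino firmware driving an L293D-style motor driver, and the one-character command codes are our own assumption, not values given in the paper.

```python
# Model of the receiver: a one-character command arriving over ZigBee is
# translated into per-wheel drive directions for the motor driver.
# Command codes are assumed for this sketch, not taken from the paper.
COMMANDS = {
    "F": (+1, +1),  # forward: both wheels forward
    "B": (-1, -1),  # backward: both wheels reverse
    "L": (-1, +1),  # left: differential drive (left wheel back, right forward)
    "R": (+1, -1),  # right: mirror of the left turn
    "S": (0, 0),    # stop
}

def drive(command: str) -> tuple[int, int]:
    """Return (left_wheel, right_wheel) directions; stop on unknown input."""
    return COMMANDS.get(command, (0, 0))

for cmd in "FLRS":
    print(cmd, "->", drive(cmd))
```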
4 Results and Discussions
The device successfully implements movement of the robot from the gestures made by the user wearing the glove. When the user rotates his/her palm downwards, the robot moves forward; when it is rotated upwards, the robot moves backwards; and similarly, when the palm is flicked right or left, the robot moves accordingly. The robot is also equipped with collision detection, and it avoids colliding by taking an alternate path. With proper connection of the components and circuits and proper assignment of values to the ports, a proper working model can be developed. For this work, we assigned the optimal values for the x, y, and z axes of the MEMS sensor to sense the rotation of the arm forming a certain gesture. We also assigned a proper delay to
execute a particular function for a certain period. According to the requirements of the situation, we assigned the values for the minimum and maximum rotation of the MEMS sensor. All these factors, combined with appropriate connections, resulted in the development of the obstacle-avoidance robot.
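The thresholding just described can be expressed as a small decision function. The sketch below is our reconstruction in Python (the firmware itself would be Arduino C++), and the 0.5 g axis thresholds and sign conventions are invented placeholders, since the paper does not list the calibrated values.

```python
# Reconstruction of the gesture decision logic: the tilt of the gloved palm,
# read as MEMS accelerometer x/y values, is mapped to a wheelchair command.
# The +/-0.5 g thresholds are placeholders; the real calibrated limits depend
# on the sensor mounting and are not given in the paper.
TILT = 0.5  # assumed threshold, in g

def classify_gesture(ax: float, ay: float) -> str:
    if ay < -TILT:
        return "F"  # palm rotated downwards -> forward
    if ay > TILT:
        return "B"  # palm rotated upwards -> backward
    if ax > TILT:
        return "R"  # palm flicked right -> turn right
    if ax < -TILT:
        return "L"  # palm flicked left -> turn left
    return "S"      # neutral hand position -> stop

print(classify_gesture(ax=0.0, ay=-0.8))  # -> F
print(classify_gesture(ax=0.7, ay=0.1))   # -> R
```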
5 Conclusions
The twenty-first century is upon us, and with the increasing needs of people, it is a need rather than a luxury to create smart systems. With this gesture-controlled wheelchair, augmented with collision detection and avoidance, we have created a small image of the future that is yet to come. A gesture-controlled wheelchair that can be driven purely by the gestures of a user who, for reasons such as paralysis, is chair-ridden and cannot support himself without a wheelchair, is a smart system that makes the person on the wheelchair independent: he does not have to depend on anyone for his movements, nor are his hands restricted to a particular position on a joystick. This could lead to a surge of autonomous robots that could be programmed to perform certain tasks without external intervention; they require no supervision and can perform the tasks efficiently. The idea is to promote efficient work with a tinge of smartness. The proposed model is a miniature of a true autonomous model, as it deals with only one real-time situation. We envision the autonomous system having gestures not only for robot movements but also for any other purpose that seems fit; with the addition of these and further functions, a truly smart system could be developed. These functionalities could be replicated in assistive robots, in home automation, and in hazardous sites where work must be done from a distance using movements. The same proposed chair could be used by a blind person, who could receive directions from an audio device dictating the path while the system detects any obstacle present, so the person can use the wheelchair and hand gestures for movement. The work is not limited to a gesture-controlled wheelchair; the same concept can be used in any form of assistive robot. Since the entire system is wireless, the user wearing the glove can control a robot a few meters away; using an even better networking device, the distance can be increased, and any assistive robot's operation can be controlled with gestures from a particular distance, taking the smart system to a whole new level.
References 1. Saharia, T., Bauri, J., & Bhagabati, C. (2017). Joystick controlled wheelchair. International Research Journal of Engineering and Technology (IRJET), 4, 235–237. 2. Gulati, S., & Kuipers, B. (2008). High-performance control for the graceful motion of an intelligent wheelchair. In IEEE International Conference on Robotics and Automation, May 19–23, Pasadena, CA, USA (pp. 3932–3938). 3. Kumar, S., & Raja, P. (2015). Ultrasonic sensor with accelerometer based smart wheel chair using microcontroller. International Research Journal of Engineering and Technology (IRJET), 2(09), 537–543. 4. Meeravali, S., & Aparna, M. (2013). Design and development of a hand-glove controlled wheel chair based on MEMS. International Journal of Engineering Trends and Technology (IJETT), 4(8), 3706–3712. 5. Singh, A., Kumar, D., Srikanth, P., Karanam, S., & Acharya, N. (2012). An Intelligent multigesture spotting robot to assist persons with disabilities. International Journal of Computer Theory and Engineering, 4(6), 998. 6. Sinha, U., & Kanthi, M. (2016). Mind controlled wheelchair. International Journal of Control Theory and Applications, 9(39), 19–28. 7. Chen, Y. L. (2001). Application of tilt sensors in human-computer mouse interface for people with disabilities. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 9(3), 289–294. 8. Gray, J. O., Jia, P., Hu, H. H., Lu, T., & Yuan, K. (2007). Head gesture recognition for hands-free control of an intelligent wheelchair. Industrial Robot: An International Journal, 34(1), 60–68. 9. Yanco, H. A., Hazel, A., Peacock, A., Smith, S., & Wintermute, H. (1995). An initial report on Wheelesley: A robotic wheelchair system. In International Joint Conference on Artificial Intelligence, August, Montreal, Canada (pp. 1–5). 10. Simpson, R., LoPresti, E., Hayashi, S., Nourbakhsh, I., & Miller, D. (2004). The smart wheelchair component system. Journal of Rehabilitation Research & Development, 41(38), 429–442. 11. Röfer, T., & Lankenau, A. (2000). Architecture and applications of the Bremen Autonomous Wheelchair. Information Sciences, 126(1–4), 1–20. 12. Rao, R. S., Conn, K., Jung, S. H., Katupitiya, J., Kientz, T., Kumar, V., et al. (2002). Human robot interaction: Application to smart wheelchairs. In IEEE International Conference on Robotics and Automation (Cat. No. 02CH37292), May 11–15, Washington, DC, USA (Vol. 4, pp. 3583–3588). 13. Prassler, E., Scholz, J., Strobel, M., & Fiorini, P. (1999). MAid: A robotic wheelchair operating in public environments. In Sensor based intelligent robots (pp. 68–95). Berlin, Heidelberg: Springer. 14. Nakanishi, S., Kuno, Y., Shimada, N., & Shirai, Y. (1999). Robotic wheelchair based on observations of both user and environment. In IEEE/RSJ International Conference on Intelligent Robots and Systems. Human and Environment Friendly Robots with High Intelligence and Emotional Quotients (Cat. No. 99CH36289), October 17–21 (Vol. 2, pp. 912–917). 15. Levine, S. P., Bell, D. A., Jaros, L. A., Simpson, R. C., Koren, Y., & Borenstein, J. (1999). The NavChair assistive wheelchair navigation system. IEEE Transactions on Rehabilitation Engineering, 7(4), 443–451. 16. Gomi, T., & Griffith, A. (1998). Developing intelligent wheelchairs for the handicapped. In Assistive Technology and Artificial Intelligence (Vol. 1458, pp. 150–178). Berlin, Heidelberg: Springer.
An Adaptive Crack Identification Scheme Using Enhanced Segmentation and Feature Extraction for Electrical Discharge Machining of Inconel X750
K. J. Sabareesaan, J. Jaya, Habeeb Al Ani, R. Sivakumar, and Basim Alhadidi
Abstract Fast similarity search is of great significance for image indexing and retrieval over large-scale datasets in many applications, and image mining presents special characteristics owing to the richness of the data that an image can carry. With the rapid growth in the number of people travelling the world quickly by air, the materials used in the aerospace industry and their machining operations have gained prominence. Before components are released for use in commercial operation and assembly, they must be assessed for flaws, which creates the need for an automated solution based on a Computer Aided Detection (CAD) system for recognizing defects on metal surfaces; the implementation of CAD frameworks has already enabled a giant leap in such inspection operations. It is proposed to retrieve the information relevant to image analysis and diagnosis from a database of image descriptions. This work centres on implementing an effective segmentation approach using the KFCM (Kernel Fuzzy C-Means) algorithm. The solution consists of texture processing, including feature extraction and segmentation, with content-based image retrieval. The proposed solution achieves better performance than the direct matching scheme in terms of the number of segmented objects and of extracted and matched features.
K. J. Sabareesaan (B)
University of Technology and Applied Science, Nizwa, Sultanate of Oman
e-mail: [email protected]
J. Jaya
Hindusthan College of Engineering and Technology, Coimbatore, India
e-mail: [email protected]
H. Al Ani
College of Engineering, Taibah University, Medina, Saudi Arabia
e-mail: [email protected]
R. Sivakumar
Vellore Institute of Technology, Vellore, India
e-mail: [email protected]
B. Alhadidi
Department of Computer Information Systems, Al-Balqa’ Applied University, As-Salt, Jordan
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_10
Keywords Computer aided detection system · Kernel fuzzy C-means
1 Introduction
Research interest in the capabilities of digital images has grown enormously over the last few years, fuelled at least in part by the rapid growth of imaging on the World Wide Web. Users in many professional fields are exploiting the opportunities offered by the ability to access and manipulate remotely stored images in a wide range of new and exciting ways. Advances in computing and multimedia technologies allow images to be digitized and stored in large repositories at little cost, which has led to a rapid increase in the number of image collections [1]. As a result, image content is becoming a major focus of data mining research. Mining images is very difficult because they are unstructured; nevertheless it has been an active research area, covering, for example, supervised and unsupervised image classification. In general, to perform image mining, low-level features of the images, such as colour, texture and shape, are first extracted; the extracted features, represented as feature vectors, are then used to describe the image content for mining. In image classification, a learning machine or classifier is trained on a given training set that contains a number of training examples, each composed of a low-level feature vector paired with its class label. The trained classifier can then assign unknown or unlabelled low-level feature vectors to one of the trained classes.
It is generally accepted that the human visual system uses texture for recognition and interpretation. Psychological research shows that the human visual system analyses textured images by decomposing the image into its frequency and orientation components. Spectral texture analysis is suited to this kind of examination because it separates a signal into constituent sinusoids of different frequencies; it can be considered as changing our view of the signal from time-based to frequency-based. Early visual processing performs spatial-frequency analysis and thus responds to particular frequencies and orientations.
The problems of image retrieval are becoming widely recognized, and the search for solutions is an increasingly active area of research and development. The shortcomings of conventional methods of image indexing have prompted growing interest in techniques for retrieving images on the basis of automatically derived features such as colour, texture and shape, a technology now generally referred to as Content-Based Image Retrieval (CBIR). CBIR, a technology that performs information retrieval by using the image content directly, is
currently a prominent technique in information retrieval. Google, Yahoo and Bing have all launched image search engines that take the content of images as input. Since people attend most strongly to colour, image retrieval methods based on colour features have developed quickly during the past decade. Feature extraction is a preprocessing step for image indexing and retrieval in a CBIR framework, and several colour descriptors have been proposed as feature vectors. The colour histogram is the earliest technique to express such a feature; it is invariant to rotational changes, distance changes and partial occlusion of the target object around the viewpoint. On the other hand, the technology still needs to mature and is not yet used on a significant scale. Without hard evidence on the effectiveness of CBIR techniques in practice, opinion remains sharply divided about their usefulness in handling genuine queries over large and diverse image collections. In such a situation it is difficult for managers and users of image collections to make informed decisions about the value of CBIR techniques for their own work. The use of the auto-correlogram as the feature vector in CBIR systems outperforms colour histograms and other kinds of feature vectors.
The goal of image indexing is to retrieve similar images from an image database for a given query image. Every image has its own distinctive features, and indexing can therefore be implemented by comparing the features extracted from the images. The similarity criteria between images may be based on features such as colour, intensity, shape, location and texture. There are two types of techniques used for image indexing: (1) textual indexing and (2) content-based indexing.
Textual indexing: very basic methods in which, keeping the user's approach in mind, keywords are assigned to a particular image. This includes caption indexing, keyword augmentation, standard subject headings and classification.
Content-based indexing: also known as automated indexing. In this technique images are indexed on the basis of their content, such as colour, shape, direction, texture, spatial relationships and so on. This kind of indexing is handled by software itself: algorithms are developed that can extract the colour, shape, texture and the like. Retrieval through this technique is known as Content-Based Image Retrieval (CBIR).
Similarity search is defined as the task of finding close samples for a given query. It is of great importance to many multimedia applications, such as content-based media retrieval. Recently, with the rapid development of the Web and the explosive growth of visual content online, large-scale image search has attracted considerable attention. Exhaustively comparing the query image with every sample in the database
is infeasible because the linear complexity does not scale in practical situations. Hashing-based methods are promising for accelerating similarity search through their ability to generate compact binary codes for large numbers of images in the dataset, such that similar images have close binary codes. Retrieving similar neighbours is then accomplished simply by finding the images whose codes lie within a small Hamming distance of the code of the query. Matching search over such binary codes is very fast because (1) the highly compressed encoded data can be stored in main memory, and (2) the Hamming distance between binary codes can be computed efficiently by applying a bitwise XOR and counting the number of set bits (with the development of computers, an ordinary PC can perform millions of Hamming-distance computations in a few milliseconds).
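As a concrete illustration of the XOR-and-popcount matching just described, the following is a minimal sketch; the 16-bit codes, the search radius and the toy database are illustrative assumptions, not data from this work.

```python
# Hamming-distance matching over binary hash codes:
# XOR the two codes, then count the surviving set bits.

def hamming_distance(code_a: int, code_b: int) -> int:
    return bin(code_a ^ code_b).count("1")

def search(query_code: int, database: list, radius: int = 4) -> list:
    """Return indices of database codes within `radius` bits of the query."""
    return [i for i, code in enumerate(database)
            if hamming_distance(query_code, code) <= radius]

database = [0b1011010011010010, 0b1011010011010110, 0b0100101100101101]
print(search(0b1011010011010010, database))  # -> [0, 1]
```

Because the comparison reduces to one XOR and one popcount per candidate, the whole database scan stays memory-resident and runs in linear time with a very small constant.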
2 Related Work
2.1 Content-Based Image Retrieval
Hard data on the effectiveness of automatic CBIR techniques [3] is difficult to obtain. Few of the early system developers made serious attempts to evaluate retrieval effectiveness, offering only examples of retrieval output to demonstrate system capability. System developers now generally do report effectiveness measures, such as precision and recall on a source dataset, though there is little discussion of how these scale to user satisfaction. Without comparative retrieval-effectiveness scores measuring two different systems on equal sets of data and queries, it is hard to draw many firm inferences. All that can be assumed is that the retrieval-effectiveness figures reported for image retrieval systems are in the same ballpark as those normally reported for text retrieval. The fundamental drawback of current CBIR systems, however, is more central: the only retrieval cues they can exploit are primitive features such as colour, texture and shape. Consequently, current CBIR systems are likely to be of significant use only for applications built on primitive features. This limits their principal usefulness to specialist application areas such as fingerprint matching, trademark retrieval or fabric selection. IBM's QBIC system has been applied to a variety of tasks, but appears to have been most successful in expert domains, such as colour matching of items in electronic mail-order catalogues and classification of geological specimens on the basis of texture.
Within specialist level-1 applications, CBIR technology [2] does appear capable of delivering useful results, though it should be borne in mind that some types of feature have proved considerably more effective than others. It is generally accepted that colour and texture retrieval obtain good results (machine similarity scores that agree
well with the judgments of human observers) more reliably than shape matching. Part of the problem with shape matching lies in the difficulty of automatically distinguishing foreground shapes from background detail in a natural image. Even when faced with stylized images in which human intervention has been used to separate foreground from background, shape retrieval systems often perform poorly. A major contributing factor is almost certainly that few, if any, of the shape-feature measures in current use are accurate predictors of human judgments of shape similarity. Later techniques based on wavelets or Gaussian filtering appear to perform well in the retrieval experiments reported by their inventors, but again it is hard to compare their effectiveness with more traditional methods, as no comparative evaluation studies have been performed. An image retrieval counterpart of the TREC (Text Retrieval Conference) text retrieval experiments might well prove useful here.
Although current CBIR systems [4] use only primitive features for image matching, this does not restrict their scope purely to level-1 queries. With a little resourcefulness from the searcher, they can often be used to retrieve images of desired objects or scenes. A query for beach scenes, for instance, can be formulated by specifying images that are blue at the top and yellow beneath; a query for pictures of fish can be expressed by drawing a typical fish shape on the screen. Pictures of specific objects such as the Eiffel Tower can be retrieved by submitting an accurate scale drawing, provided the angle of view is not too different. A skilled search intermediary can therefore handle some level-2 queries with current technology, though it is not yet clear how large a range of queries can be handled successfully in this way. On the other hand, if an image database that has been indexed with keywords or descriptive captions is available, it is possible to combine keyword and image-similarity querying (sometimes known as hybrid image retrieval).
Other CBIR techniques [5] may have a part to play in specialist colour- or shape-matching applications, and could conceivably be useful in improving the effectiveness of general-purpose text-based image retrieval systems. However, real advances in technology will be required before systems capable of automatic semantic feature recognition and indexing become available. Hence the prospects of CBIR superseding manual indexing in the near future for general applications handling semantic queries look remote. Research into semantic image retrieval methods is beginning to gather momentum, particularly in restricted domains (for example, identifying unclothed human bodies) where it is possible to supply precise examples of the objects involved. It will, however, take some time before such research finds its way into commercially available products.
2.2 CBIR Versus Manual Indexing
At the current stage of CBIR development [6] it is pointless to ask whether CBIR techniques work better or worse than manual indexing. Potentially, CBIR techniques have several advantages over manual indexing: they are inherently faster, cheaper and completely objective in their operation. Nonetheless, there are open problems. The primary issue must be retrieval effectiveness: how well does each kind of system perform? Unfortunately, the two types of method cannot be compared properly, as they are designed to answer different kinds of query. Given a specialist level-1 application such as trademark retrieval, CBIR frequently performs better than keyword indexing, because large proportions of the images cannot adequately be described by linguistic cues. For level-2 tasks, by contrast, such as finding a photograph of a given type of object to illustrate a newspaper article, manual indexing is more effective, because CBIR simply cannot cope. It should be remembered, though, that manual classification and indexing techniques for images also have their limitations, particularly the difficulty of anticipating the retrieval cues future searchers will actually use.
Attempts to retrieve images [7] by the exclusive use of either keywords or primitive image features have met with limited success. Is the use of keywords and image features in combination likely to prove any more effective? There are in fact several reasons for believing this to be the case. Keyword indexing can be used to capture an image's semantic content, describing objects that are clearly identifiable by linguistic cues, such as trees or cars. Primitive feature matching can usefully supplement this by identifying aspects of an image that are difficult to name, such as a particular shape of roof on a building. Moreover, evaluation studies of the Chabot system showed that higher precision and recall scores could be achieved when text and colour similarity were used in combination than when either was used separately. Finally, theoretical support for this idea comes from the cognitive IR model, which suggests that retrieval by a combination of techniques using different cognitive structures is likely to be more effective than any single technique.
Further synergies between text and image-feature indexing are conceivable. The SemCap system uses techniques from clinical psychology to derive semantic features from images to improve primitive feature matching. The VisualSEEk system allows users to add descriptive keywords to an entire set of similar images in a single operation, greatly speeding up the process of manually indexing an image collection. A further possible refinement is the development of search-expansion aids such as the visual thesaurus, designed to link similar-looking objects. Finally, the Informedia project has shown that automatic indexing of video through the simultaneous analysis of both images and speech on the soundtrack can significantly improve indexing effectiveness.
2.3 CBIR in Context
Although university researchers may experiment with standalone image retrieval systems to test the effectiveness of query algorithms [3], this is not at all typical of the way such systems are likely to be used in practice. The experience of every commercial vendor of CBIR software is that system effectiveness is heavily influenced by the extent to which image retrieval capabilities can be embedded within users' overall work tasks. Trademark examiners need to be able to combine image searching with other keys, such as application status, and to embed retrieved images in official documentation. Engineers will want to adapt retrieved components to meet new design requirements. Video editors are unlikely to be satisfied merely with the ability to locate retrieved video sequences. It is important to stress that CBIR is never more than a means to an end. One implication of this is that a distinctive future use of CBIR is likely to be the retrieval of images by content within a multimedia framework. Opportunities for synergy in true multimedia systems will be far greater, as already demonstrated by the Informedia project, which combines still and moving image data, sound and text in generating retrieval cues. One example of such synergy revealed by their retrieval experiments is that, in the presence of visual cues, nearly 100% recall can be achieved even with a 30% error rate in automatic word recognition.
Another aspect of multimedia systems that can be more widely exploited is their use of hyperlinks to direct readers toward associated information, whether elsewhere in the same document or at a separate location. The Microcosm project pioneered the concept of the generic link, which uses a given text string as its source instead of a specific document location. This allows users to follow links from any occurrence of that word, regardless of whether the document author has explicitly created that link. It is accomplished by storing the links in a separate link database, which can be queried either by selecting a particular word in a source document or by direct keyword search of the database; in either case, all matching links are retrieved.
2.4 Standards Relevant to CBIR
Potentially, several distinct types of standards can influence, and be influenced by, developments in CBIR technology [6]. These include: network protocols, for instance TCP/IP, governing data transfer between the computers holding the data and the clients running the applications that use it; image storage formats, for instance TIFF or JPEG, determining how images are encoded for long-term storage or transmission; image data compression standards, for instance JPEG and MPEG-2, specifying standard methods for compressing image (and feature) data for efficient transmission; database query languages, such as
SQL, providing a standard syntax for specifying queries to a relational database; and metadata standards, such as RDF, providing a framework for describing the content of multimedia objects, together with languages written in XML for content descriptions. Some of these standards are unlikely to have any implications for the development of CBIR. For instance, low-level network transmission protocols such as TCP/IP handle all kinds of data in the same way, viewing them simply as packets of binary data whose meaning, if any, is left to the sending and receiving applications to deal with.
3 Feature Extraction
The transformation of the input, i.e. the region of interest, into the set of features relevant for classification has been defined. The features extracted must be chosen carefully, because the desired classification task must be performed using this reduced representation rather than the complete region of interest. Working with an abundant amount of data generally strains storage and computation, and can lead a clustering algorithm to overfit the training data. The features extracted are first-order textural features, higher-order gradient features, and discrete wavelet transform features using Daubechies, Coiflet and Haar filters.
The common ground for Content-Based Image Retrieval (CBIR) is to extract a signature of the images. For this, every image has been divided into blocks of 4 × 4 pixels and a feature vector comprising 6 components has been extracted for each block. The LUV colour space has been used, where L stands for luminance, U for hue and V for saturation; U and V carry the colour (chrominance) information. The first three components are the averages of the luminance, hue and saturation values, respectively, of the 16 pixels present in the 4 × 4 block. For the other three components, the Haar (wavelet) transform has been applied to the L component of the image. After a one-level wavelet transform, a 4 × 4 block is decomposed into four frequency bands of 2 × 2 coefficients. The other three components of each feature vector are the square roots of the second-order moments of the wavelet coefficients of the HL, LH and HH bands, respectively, because each of these bands captures the variations present in a different direction.
Feature extraction is a preprocessing step for image indexing and retrieval in the CBIR framework, and several colour description technologies have been proposed as feature vectors. The colour histogram is the earliest technique to express such a feature; it is invariant to rotational changes, distance changes and partial occlusion of the target object. The Colour Autocorrelogram and Mutual Information method extracts Colour Mutual Information (CMI) from the colour correlation feature matrix as another feature descriptor and combines CMI with the colour autocorrelogram (CA) to produce a composite feature. The highlights of this composite feature are:
(i) it incorporates the spatial correlation of both distinct pairs of colours and identical pairs of colours;
(ii) it can be used to describe the global distribution of the local spatial correlation of colours;
(iii) its computational time and feature-vector space are much less costly.
The colour correlogram is computed as an m × m feature matrix. Although it captures the global distribution of the local spatial correlation of colours, it requires considerable processing time and space for such large vectors. The colour autocorrelogram extracts only the elements of the main diagonal of the feature matrix to constitute the feature vector, which gives the spatial correlation of identical colours. Although the colour correlogram expresses distinct pairs of colours in the spatial distribution, its use has been limited by the large feature vectors involved in image matching. The simplified colour autocorrelogram sharply reduces the computational complexity, but the retrieval efficiency for images with rich colours or dramatic colour changes is poor because only identical pairs of colours are considered. Colour Mutual Information, built on the colour correlogram feature matrix, not only reduces the feature vector to m × 1 but also introduces dimensional information about distinct pairs of colours; the role of the descriptor is to map the spatial information of colours through pixel correlations at different distances. It computes the probability of finding in the image a pair of pixels with colour C at distance d from each other; for every distance d, m probabilities are computed, where m is the number of colours in the quantized space. The extraction algorithm computes the autocorrelogram over more than one image property; the properties considered can be colour, gradient magnitude, texturedness and rank. Colour is extracted in the RGB colour space and the other properties are extracted from the grey-level image.
A generalized algorithm has been used to partition the feature vectors into several classes, with each class corresponding to one region in the segmented image. From each segmented object, discrete points are computed using the wavelet approximation. These points are grouped together; from them the average point is identified, and then the standard deviation and expected value are computed. By adding the expected value to the average value, an approximate centroid value is calculated. This centroid value is matched against the grouped points, after which the exact centroid is identified. Using this centroid value, the nearest points are formed into regions. From the remaining points, a new centroid is identified by repeating the above steps until the complete image is transformed into regions. These regions are used to identify defects by matching their similarity.
In computer vision, image segmentation is the process of partitioning a digital image into multiple segments. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyse. Image segmentation is commonly used to locate defects and boundaries in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.
The result of image segmentation is a set of regions that collectively cover the entire image, or a set of contours extracted from the image. All the pixels in a region are similar with respect to some chosen or computed property, such as colour, intensity or texture, while adjacent regions differ significantly with respect to the same characteristics.
In this work, segmentation is performed using two algorithms, Fuzzy C-Means (FCM) and Kernel Fuzzy C-Means (KFCM). The FCM clustering algorithm is the soft extension of conventional hard C-means. It treats each cluster as a fuzzy set, while a membership function measures the degree to which each training vector belongs to a cluster; thus each training vector may be assigned to multiple clusters. In this way it overcomes, to some degree, hard C-means' dependence on the initial partition values. However, much like the C-means algorithm, FCM is effective only for clusters that are crisp, spherical and non-overlapping. When dealing with non-spherical and heavily overlapping data, such as the Ring dataset, FCM cannot always work well. We therefore use the kernel trick to build a nonlinear version of FCM, yielding the Kernel-based Fuzzy C-Means clustering algorithm (KFCM). The basic idea of KFCM is to first map the input data into a feature space of higher dimension through a nonlinear transformation and then perform FCM in that feature space. In this way, data structures that are complex and nonlinearly separable in the input space may become simple and linearly separable in the feature space after the nonlinear transformation, so improved performance can be expected. Another merit of KFCM is that, unlike FCM, which needs the desired number of clusters in advance, it can adaptively determine the number of clusters in the data under certain criteria. If the separation boundaries between clusters are nonlinear, FCM will perform inadequately; to tackle this problem we adopt the approach of nonlinearly mapping the data space into a higher-dimensional feature space and then performing linear FCM within that feature space.
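To make the KFCM idea concrete, the following is a minimal sketch of the common Gaussian-kernel variant that keeps prototypes in the input space, where the kernel-induced distance is d²(x, v) = 2(1 − K(x, v)). The fuzzifier m, kernel width σ and the two-blob toy data are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def gaussian_kernel(x, v, sigma=1.0):
    return np.exp(-np.sum((x - v) ** 2, axis=-1) / (2 * sigma ** 2))

def kfcm(X, c=2, m=2.0, sigma=1.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), c, replace=False)]            # initial prototypes
    for _ in range(n_iter):
        K = np.stack([gaussian_kernel(X, v, sigma) for v in V])  # (c, n)
        d2 = np.maximum(2.0 * (1.0 - K), 1e-12)                  # kernel distances
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0)                                # fuzzy memberships
        W = (U ** m) * K                                         # update weights
        V = (W @ X) / W.sum(axis=1, keepdims=True)               # new prototypes
    return U, V

X = np.vstack([np.random.default_rng(1).normal(0, 0.3, (50, 2)),
               np.random.default_rng(2).normal(3, 0.3, (50, 2))])
U, V = kfcm(X)
print(np.round(V, 2))  # one prototype near (0, 0), the other near (3, 3)
```

The nonlinear mapping enters only through the kernel evaluations, so the update loop stays as cheap as ordinary FCM while handling cluster boundaries that are nonlinear in the input space.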
4 Implementation & Performance Evaluation
A mathematical tool, the discrete wavelet transform, has been used to provide a simple and powerful means of extracting the pattern variations present in an image (this tool is well known for image compression without much loss of quality). The different elements present in an image, such as colour, pattern, shape and texture, have been extracted using this tool and represented by vectors, called feature vectors. The computer is then made to identify the distinct regions present in an image with the help of statistical clustering of these feature vectors using the Kernel Fuzzy C-Means clustering algorithm. The resulting clusters are collections of feature vectors, and within a cluster the feature vectors are nearly alike, because a cluster represents one region of the image with similar content.
The distance between the mean feature vector of each of these clusters in one image and that of the query image is calculated by the simple Euclidean distance formula for the distance between two vectors. The weighted sum of these distances, weighted according to the significance of the pair of clusters considered (over all combinations from one image to the query image), is used as a measure of closeness or similarity between two images. The distance between every region present in R1 and every region present in R2 is calculated and represented in matrix form. The Most Significant Highest Priority algorithm is then applied to obtain the priority of significance of all the combinations of components from R1 to R2. The final distance between the images is obtained by calculating the weighted sum of the components of the distance matrix, where the weights are the components of the matrix obtained from the Most Significant Highest Priority algorithm. Using these methods, the query image can be matched with every image in the database, and the images in the database can be sorted in increasing order of distance, and hence decreasing order of similarity. The framework has been developed in MATLAB and tested by retrieving images similar to the query image for diverse images. The proposed implementation is evaluated in terms of the number of segmented objects and of extracted and matched features, as sketched in the example below.
Figure 1 illustrates the comparison of the number of segmented objects in the image using texture processing with matching versus direct matching; the proposed system achieves significantly better performance in the number of segmented objects. Figure 2 depicts the corresponding comparison of extracted features, where the proposed system again performs significantly better. Figure 3 compares the execution time of the two segmentation approaches, KFCM and FCM, with the proposed KFCM achieving the better execution time. Figure 4 compares the objective function of KFCM and FCM, where the proposed KFCM again performs better.
Fig. 1 Number of segmented objects
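The region-matching step just described can be sketched as follows: a pairwise Euclidean distance matrix between the cluster centroids of two images, collapsed into one similarity score by a weighted sum. The source does not spell out the Most Significant Highest Priority weighting in detail, so a simple inverse-distance weighting is used here purely as a placeholder, and the centroid values are invented for illustration.

```python
import numpy as np

def image_distance(centroids_r1, centroids_r2):
    # Pairwise Euclidean distances between the two sets of mean feature vectors.
    diff = centroids_r1[:, None, :] - centroids_r2[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Placeholder weighting (stand-in for Most Significant Highest Priority):
    # closer region pairs receive larger normalized weights.
    W = 1.0 / (1.0 + D)
    W /= W.sum()
    return float((W * D).sum())        # weighted sum = final image distance

r1 = np.array([[0.2, 0.5, 0.1], [0.8, 0.3, 0.6]])    # centroids of image R1
r2 = np.array([[0.25, 0.45, 0.1], [0.7, 0.4, 0.5]])  # centroids of image R2
print(image_distance(r1, r2))
```

Sorting the database by this scalar distance gives the increasing-distance, decreasing-similarity ranking described above.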
Fig. 2 Extracted features
Fig. 3 Execution time
Fig. 4 Objective function
5 Conclusions
In this paper, a texture-processing-based image segmentation and matching model is proposed to match the objects in an image. The model consists of texture processing, including feature extraction and segmentation. The quantification of intensity values is based on the input image. Defect-free metal images are given as training data and their texture information is collected; test images are then given as input to match the features extracted in the image. The proposed model yields significantly better performance than the direct matching scheme in terms of the number of segmented objects
and of extracted and matched features. Applications of existing image feature extraction to defect recognition have shown good results; however, those methods focus essentially on the extraction of a single feature, such as the colour, shape or texture feature, resulting in lower recognition rates across distinct defect types. To tackle this problem, a new recognition method based on the fusion of colour, shape and texture features is presented.
References
1. Aigrain, P. (1996). Content-based representation and retrieval of visual media—A state-of-the-art review. Multimedia Tools and Applications, 3(3), 179–202.
2. Sabareesaan, K. J., Ani, H. A., Jaya, K. J., & Varahamoorthi, R. (2015). An image processing approach for detection of surface cracks in EDM machined aerospace super alloy—Inconel X750. European Journal of Advances in Engineering and Technology, 2(9), 39–44.
3. Beigi, M. (1998). MetaSEEk: A content-based meta-search engine for images. In I. K. Sethi & R. C. Jain (Eds.), Storage and retrieval for image and video databases VI (pp. 118–128). Proc. SPIE 3312.
4. Jain, A. K. (1997). Multimedia systems for art and culture: A case study of Brihadisvara Temple. In I. K. Sethi & R. C. Jain (Eds.), Storage and retrieval for image and video databases V (pp. 249–261). Proc. SPIE.
5. Boureau, Y. L., Bach, F., LeCun, Y., & Ponce, J. (2010). Learning mid-level features for recognition. In Conference on Computer Vision and Pattern Recognition (pp. 2559–2566).
6. Jaya, J., & Thanushkodi, K. (2009). Structural modeling and analysis of computer aided diagnosis (CAD) system: A graph theoretic approach. International Journal of Computer Science Application, 2(1), 5–8.
7. Chen, Y., Su, W., Li, J., & Sun, Z. (2009). Hierarchical object oriented classification using very high resolution imagery and lidar data over urban areas. Advances in Space Research, 43(7), 1101–1110.
Surface Wear Rate Prediction in Reinforced AA2618 MMC by Employing Soft Computing Techniques
N. Mathan Kumar, N. Mohanraj, S. Sendhil Kumar, A. Daniel Das, K. J. Sabareesaan, and Omar Adwan
Abstract As a result of their enhanced mechanical and thermal properties, metal matrix composites (MMCs) have shown enormous potential as candidate materials for various aerospace and automotive applications. In the proposed work, the thermomechanical wear properties of Aluminium Alloy AA 2618 reinforced with Silicon Nitride (Si3N4), Aluminium Nitride (AlN) and Zirconium Boride (ZrB2) have been investigated. For the fabricated MMC, a regression model for predicting the wear features was developed using the statistical and data-analysis package Minitab; the predicted response was found to vary linearly with the actual responses. Prediction of the wear characteristics was also performed using Support Vector Machines (SVM) and Artificial Neural Networks (ANN) alongside the developed regression model. The statistical performance parameters show that the ANN minimizes the Mean Absolute Error (MAE) compared with the other approaches: the MAE percentages of the soft computing techniques ANN, SVM and LR are 10%, 26% and 29%, respectively. The values predicted by the proposed approaches through the optimized ANN model show sustained consistency with the conventional experimental values.
N. Mathan Kumar (B) · S. Sendhil Kumar
Akshaya College of Engineering and Technology, Coimbatore, India
e-mail: [email protected]
S. Sendhil Kumar
e-mail: [email protected]
N. Mohanraj
Sri Krishna College of Technology, Coimbatore, India
e-mail: [email protected]
A. Daniel Das
Karpagam Academy of Higher Education, Coimbatore, India
e-mail: [email protected]
K. J. Sabareesaan
University of Technology and Applied Science, Nizwa, Sultanate of Oman
e-mail: [email protected]
O. Adwan
Al-Ahliyya Amman University, Amman, Jordan
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_11
Keywords Wear · Prediction · Mean absolute error · Reinforcements
1 Introduction
In the present technological era, applications of aluminium alloys are ubiquitous owing to their promising light-weight characteristics. Nonetheless, their use in automotive applications is restricted by their low hardness and wear resistance [1]. Owing to their enhanced properties, aluminium metal matrix composites have attracted great attention in the current scenario [2]. AA-MMCs have gained significant recognition in the field of tribology due to their improved wear resistance characteristics. However, with the use of distinct reinforcements in an aluminium matrix, the quality of some physical properties has to be compromised. Improved wear resistance and strength can be obtained by the addition of reinforcement particulates, but machining becomes difficult with the consequent increase in hardness [3]. In situ formation of AMCs offers a clean interface and good bonding between matrix and reinforcement, with uniform dispersal of the reinforcement in the matrix [4]. Raising the level of reinforcement increases the abrasive wear resistance of the AMCs. Many researchers have investigated the erosion wear behaviour of composite materials [5]. Mechanically driven machine parts suffer substantial economic losses through wear and friction [6], so there is a need for materials that greatly increase the life of machine parts by resisting wear. The wear resistance of softer material surfaces can be enhanced by surface coating [7]. The lifespan of machine parts can also be extended by providing wear-resistant surfaces; by depositing coatings such as borides, nitrides or carbides on the metallic surfaces of machine parts, resistance to wear can be augmented [8]. Many investigations must be carried out to characterize abrasive wear resistance, and the experiments can be time-consuming because they must be deployed under diversified operating conditions [9]. By employing soft computing algorithms for the estimation of the wear behaviour of materials, the cost incurred in experimental studies is reduced [10]. Wear resistance has been predicted with LR [11]; the abrasion resistance of a hard chrome layer of given thickness has been obtained with SVM [12]; a Gaussian mixture regression model has been used to estimate tool wear [13]; and ANNs have been used to predict wear loss [14, 15]. Such algorithms have also been used to forecast wear loss and friction coefficients [16]. In this paper, soft computing algorithms are utilized to predict the wear rate of AA 2618 (aluminium alloy) reinforced with Si3N4 (silicon nitride), AlN (aluminium nitride) and ZrB2 (zirconium boride). Polynomial regression, SVM and ANN approaches are used to estimate the output wear rate from inputs such as composite wt%, load and velocity.
Table 1 AA 2618—Chemical composition

Element | Ti   | Si   | Ni  | Fe  | Mg   | Cu   | Al
wt%     | 0.07 | 0.18 | 1.0 | 1.1 | 1.60 | 2.30 | Bal
2 Experimentation and Data Acquisition
2.1 Selection of Material
The proposed work concerns aluminium metal matrix composites synthesized with AA 2618 as the base metal and Si3N4, AlN and ZrB2 as reinforcements. Metal matrix composites with various weight-percentage compositions were synthesized to investigate the mechanical properties and tribological characteristics. The reinforcement particulates were added to AA 2618 in weight percentages of 0, 2, 4, 6 and 8 wt% to form the metal matrix composite. The chemical composition of AA 2618 is tabulated in Table 1.
2.2 Work Piece Design
Specimens were synthesized with 0, 2, 4, 6 and 8 wt% of the strengthening particulates. To mix the reinforcement materials into the base metal AA 2618 effectively, the stir casting method was employed. At the outset, the base metal was heated to 850 °C in a graphite crucible. The molten base metal and the reinforcement particulates Si3N4, AlN and ZrB2 at the given wt% (2, 4, 6 and 8) were mixed homogeneously with the help of a stirrer at a speed of 350 rpm. Precautions were taken with the stirrer material (mild steel coated with zirconium) to inhibit possible contamination. Inorganic salts mix chemically with the molten aluminium alloy to form the MMC [6]. After homogeneous mixing of the reinforcement particles and the base metal, the molten composite was poured into a cylindrical mould with dimensions of 100 mm × 25 mm. After solidification, the specimens were subjected to machining operations to achieve the desired shape and size. Figure 1 depicts the MMC specimens with different weight percentages of Si3N4, AlN and ZrB2.
2.3 Pin on Disc Wear Test
Dry sliding wear tests were conducted on a pin-on-disc apparatus at room temperature as per the ASTM G99-95a standard. A line sketch of the pin-on-disc apparatus used in this investigation is shown in Fig. 2.
Fig. 1 Work piece specimens
Fig. 2 Schematic representation of pin on disc apparatus
MMC specimens with dimensions of 100 mm × 25 mm were used as the test material for all combinations of reinforcement particles. An electronic weighing machine with two-decimal-point accuracy was used to measure the weight of the specimens before and after they were subjected to the pin-on-disc test. A total of 54 specimens, covering the different combinations of reinforcement particle weights, underwent the wear test. The tests were carried out in sequential steps with loads of 10, 20, 30, 40 and 50 N, sliding speeds of 1, 2, 3, 4 and 5 m/s, and a constant sliding distance of 3000 m. The mass of each specimen was recorded during each experiment for the calculation of the wear rate.
Figure 3 shows SEM images of the specimens with various weight percentages of reinforcement. They reveal that the reinforcement particles are distributed uniformly in the AA 2618 matrix.
Fig. 3 SEM images of specimens of Al2618 with various percentage reinforcement particle
Through the Orowan mechanism, fine grain refinement increases the strength of the MMC; Orowan strengthening is commonly considered to operate when the particle size is below 1 µm. Accordingly, fine dispersal of the reinforcement improves the strength of the MMC, as evidenced by the improved wear resistance. When a dislocation line meets the reinforcement particles it cannot simply pass through them, which leaves dislocation rings around the particles; the strength of the composite may decrease as the number of dislocation rings increases. The microstructure shown in Fig. 3 reveals that dislocation lines pass through the grain boundaries; these lines multiply when the grains are disturbed at the boundaries and the material weakens due to this hindrance effect.
3 Soft Computing Approaches
Predictive modelling is a prominent and extensively used application of soft computing in the manufacturing sector. Regression modelling can be carried out in three steps, namely modelling, estimation and hypothesis testing. The present study assesses the accuracy and robustness of predictive models of wear behaviour by investigating polynomial regression models, support vector machines and ANNs. To eliminate magnitude differences between the data variables and make them dimensionless, the data were normalized for all the modelling techniques. For training and validation purposes, the experimental data were randomly split 70% and 30%, respectively, for all models.
3.1 Polynomial Regression Models
Polynomial regression is best applied when the inter- and intra-relationships between the process parameters and the output are curvilinear. The complexities involved in the experiments introduce non-linearity in the relation between input and output parameters, and polynomial regression can deal effectively with such non-linearity. A Taylor-series expansion is used to initiate the modelling between the predictors, i.e. reinforcement weight percentage, load and speed, and the response, i.e. wear rate. In most manufacturing problems, second-order polynomial models are sufficient. The second-order polynomial model in three variables is given by Eq. (1):

$$y = \beta_0 + \sum_{i=1}^{3} \beta_i x_i + \sum_{i=1}^{3} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon \qquad (1)$$
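As a concrete illustration, the following is a minimal sketch of fitting the second-order model of Eq. (1) with scikit-learn; the few data rows are copied from Table 2, but the pipeline itself is an illustrative assumption, not the authors' Minitab workflow.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Columns: reinforcement wt%, load (N), velocity (m/s) -- rows from Table 2.
X = np.array([[0, 10, 1], [0, 20, 2], [0, 30, 3], [4, 20, 2],
              [8, 30, 3], [12, 40, 4], [15, 50, 5]])
y = np.array([0.00036, 0.0016, 0.0027, 0.0014, 0.00055, 0.0048, 0.0062])

# degree=2 generates the linear, squared and interaction terms of Eq. (1).
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.predict([[6, 30, 3]]))  # predicted wear rate (mm^3/m)
```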
3.2 Support Vector Machine Models
In classification, the boundary is chosen as the centre of the maximum margin between the pairs of data sets. For regression, in order to minimize the error, the predictor curve is passed through the centre of the margin between the lower and upper boundaries of the observed data. This approach estimates a regression model by attaining a high-dimensional space through a transformation of the input variable space. ANN models are time-consuming to train, so support vector regression is generally preferred as it consumes less time. If there are $N$ experiments $\{x_i, y_i\}_{i=1}^{N}$, where the training set consists of the input vectors $x_i$ and the output variables $y_i$, then the SVM constraint is expressed by Eq. (2):

$$y_i (W \cdot x_i + b) - 1 \ge 0 \qquad (2)$$

The function for the hypothesis space is given by Eq. (3):

$$f(W, b) = \operatorname{sign}(W \cdot x + b) \qquad (3)$$
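A minimal sketch of support vector regression with the RBF kernel, which Sect. 4 reports as the best-performing kernel, is shown below; the hyperparameters (C, epsilon, gamma) are illustrative assumptions rather than values reported by the authors.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Columns: reinforcement wt%, load (N), velocity (m/s) -- rows from Table 2.
X = np.array([[0, 10, 1], [0, 20, 2], [4, 20, 2], [8, 30, 3],
              [12, 40, 4], [15, 50, 5]])
y = np.array([0.00036, 0.0016, 0.0014, 0.00055, 0.0048, 0.0062])

# Scaling the inputs mirrors the normalization step described in Sect. 3.
model = make_pipeline(StandardScaler(),
                      SVR(kernel="rbf", C=10.0, epsilon=1e-4, gamma="scale"))
model.fit(X, y)
print(model.predict([[6, 30, 3]]))  # predicted wear rate (mm^3/m)
```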
3.3 Artificial Neural Network Models
With the evolution of computing power, ANNs have become the most widely used tool in predictive modelling for mapping predictors to responses. The neural network architecture and the algorithm used to determine the node weights define the processing capability
of an ANN model. The experimental data obtained from the pin-on-disc apparatus are first fed into the network for learning; the model's accuracy is compared with the original values of the test set after learning is completed. The number of hidden layers is selected depending on the non-linearity between the predictors and responses. A multi-layer perceptron model is used in the present study, and the associated weights between the nodes are calculated using the back-propagation algorithm. A network model comprising an input layer with three nodes (weight percentage, load and velocity), one hidden layer and one output layer for the output wear rate was used for the training. The momentum constant and learning rate were fixed at 0.9 and 0.01, respectively, while training the network. The ANN modelling was carried out using commercially available MATLAB software.
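For readers without MATLAB, the following is a minimal sketch of an equivalent 3-input back-propagation network in scikit-learn; the momentum (0.9) and learning rate (0.01) follow the text, while the hidden-layer size of 10 and the data rows pulled from Table 2 are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Columns: reinforcement wt%, load (N), velocity (m/s) -- rows from Table 2.
X = np.array([[0, 10, 1], [0, 20, 2], [4, 20, 2], [8, 30, 3],
              [12, 40, 4], [15, 50, 5]])
y = np.array([0.00036, 0.0016, 0.0014, 0.00055, 0.0048, 0.0062])

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(10,),  # one hidden layer, as in the text
                 solver="sgd",              # plain back-propagation with momentum
                 learning_rate_init=0.01, momentum=0.9,
                 max_iter=5000, random_state=0))
model.fit(X, y)
print(model.predict([[6, 30, 3]]))  # predicted wear rate (mm^3/m)
```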
4 Results and Discussions
4.1 Investigations of Wear Rate
During the experimentation it was observed that, as the load gradually increases from 10 to 50 N, the material loss of each sample wt% also increases gently, thereby increasing the wear rate. The maximum loss of particles occurs under the maximum load on the composite. From this study, the non-linearity noticed in the wear rates of the samples can be understood from factors such as the wear debris and the wear surfaces. At the maximum load of 50 N, the wear rate of the plain alloy trends much higher than that of the composites with 12% and 15% reinforcement. At the low load of 10 N, the wear rate of the composite declines as the reinforcement wt% rises from 0 to 15%. The lower wear rates of composites with increasing reinforcement wt% can be attributed to the increasing peak hardness and good interfacial bonding. On account of metal-to-metal contact, the wear rate reaches its maximum as the aluminium pin constantly displays a tendency toward growing wear; as a consequence, large plastic deformation occurs during wear. The specific wear rate of the MMC increases at room temperature for each wt% of the reinforcement particles, and an increment in the specific wear rate was noticed as the load applied to the composites was raised gently from 10 to 50 N. When a load of 50 N was applied, there was an abrupt increase in the specific wear rate of the MMC with 8 wt% of reinforcement, which may be due to the loss of strength of the MMC at that reinforcement wt%. In this research, the soft computing techniques polynomial regression, SVM and ANN were employed to predict the wear rate of AA-MMCs synthesized with different weight percentages of the composites. The training and testing data of the wear rate for the soft computing models are provided in Tables 2 and 3.
Table 2 Training data—Wear rate of the samples and its associated process parameters

Exp. No. | Reinforcement particles, Wt (%) | Load (N) | Velocity (m/s) | Wear rate (mm³/m)
1 | 0 | 10 | 1 | 0.00036
2 | 0 | 20 | 2 | 0.0016
3 | 0 | 30 | 3 | 0.0027
4 | 0 | 40 | 4 | 0.0026
5 | 0 | 50 | 5 | 0.0057
6 | 2 | 10 | 1 | 0.00043
7 | 2 | 40 | 4 | 0.0022
8 | 2 | 50 | 5 | 0.0051
9 | 4 | 10 | 1 | 0.00045
10 | 4 | 20 | 2 | 0.0014
11 | 4 | 30 | 3 | 0.0014
12 | 4 | 40 | 4 | 0.0017
13 | 4 | 50 | 5 | 0.0053
14 | 6 | 20 | 2 | 0.00048
15 | 6 | 50 | 5 | 0.0035
16 | 8 | 10 | 1 | 0.0003
17 | 8 | 20 | 2 | 0.00043
18 | 8 | 30 | 3 | 0.00055
19 | 8 | 40 | 4 | 0.00079
20 | 8 | 50 | 5 | 0.0026
21 | 9 | 10 | 1 | 0.00046
22 | 9 | 20 | 2 | 0.00054
23 | 9 | 40 | 4 | 0.0033
24 | 9 | 50 | 5 | 0.0048
25 | 10 | 10 | 1 | 0.00051
26 | 10 | 20 | 2 | 0.0025
27 | 10 | 30 | 3 | 0.0036
28 | 10 | 40 | 4 | 0.0048
29 | 10 | 50 | 5 | 0.0055
30 | 11 | 20 | 2 | 0.0032
31 | 11 | 50 | 5 | 0.0061
32 | 12 | 20 | 2 | 0.0032
33 | 12 | 30 | 3 | 0.0045
34 | 12 | 40 | 4 | 0.0048
35 | 12 | 50 | 5 | 0.006
36 | 13 | 20 | 2 | 0.0031
37 | 13 | 30 | 3 | 0.0042
38 | 13 | 40 | 4 | 0.0048
39 | 13 | 50 | 5 | 0.0062
40 | 15 | 10 | 1 | 0.00065
41 | 15 | 20 | 2 | 0.0035
42 | 15 | 30 | 3 | 0.0045
43 | 15 | 40 | 4 | 0.005
44 | 15 | 50 | 5 | 0.0062
Table 3 Test data—Wear rate

Exp. No. | Reinforcement particles, wt (%) | Load (N) | Velocity (m/s) | Wear rate (mm³/m)
1 | 2 | 20 | 2 | 0.0016
2 | 2 | 30 | 3 | 0.0024
3 | 6 | 10 | 1 | 0.00033
4 | 6 | 30 | 3 | 0.0011
5 | 6 | 40 | 4 | 0.0011
6 | 9 | 30 | 1 | 0.0021
7 | 11 | 10 | 1 | 0.00056
8 | 11 | 30 | 3 | 0.0044
9 | 11 | 40 | 4 | 0.0048
10 | 12 | 10 | 1 | 0.00063
11 | 13 | 10 | 1 | 0.00064
4.2 Models Performance
The model performance is evaluated based on the root mean squared error, RMSE (smaller is better); the coefficient of determination, R-squared (should be close to one); the mean squared error, MSE (smaller is better); and the mean absolute error, MAE (smaller is better), as given by the following equations:
Table 4 Performance parameters of the training data

Performance parameter | LR | SVM | ANN
RMSE | 0.11 | 0.11 | 0.07
R-Squared | 0.68 | 0.70 | 0.98
MAE | 0.19 | 0.15 | 0.09

Table 5 Performance parameters of the test data

Performance parameter | LR | SVM | ANN
RMSE | 0.15 | 0.16 | 0.11
R-Squared | 0.63 | 0.69 | 0.87
MAE | 0.29 | 0.26 | 0.10
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{Predicted}_i - \mathrm{Actual}_i\right)^2} \qquad (4)$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(\mathrm{Predicted}_i - \mathrm{Actual}_i\right)^2}{\sum_{i=1}^{N}\left(\mathrm{Predicted}_i\right)^2} \qquad (5)$$

$$\mathrm{MSE} = \mathrm{RMSE}^2 \qquad (6)$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{\mathrm{Predicted}_i - \mathrm{Actual}_i}{\mathrm{Predicted}_i}\right| \qquad (7)$$
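The four measures of Eqs. (4)–(7) can be sketched in a few lines of NumPy; the relative form of the MAE follows the paper's usage of MAE as a percentage, and the sample values below are taken from the LR column of Table 6 purely for illustration.

```python
import numpy as np

def metrics(predicted, actual):
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    rmse = np.sqrt(np.mean((predicted - actual) ** 2))                   # Eq. (4)
    r2 = 1 - np.sum((predicted - actual) ** 2) / np.sum(predicted ** 2)  # Eq. (5)
    mse = rmse ** 2                                                      # Eq. (6)
    mae = np.mean(np.abs((predicted - actual) / predicted))              # Eq. (7)
    return rmse, r2, mse, mae

pred = [0.0018, 0.0027, 0.000343, 0.0012]   # sample LR predictions (Table 6)
act = [0.0016, 0.0024, 0.00033, 0.0011]     # corresponding actual values
print(metrics(pred, act))
```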
The performance parameters provide a quantitative estimate of how well the predicted values track the actual values. The performance parameters of the models used to forecast the wear rate values of the samples for the LR, SVM and ANN models are provided in Tables 4 and 5. The superior performance of the ANN model over the SVM and LR models is clearly evident from the RMSE, R-squared and MAE values for the wear rate on the training and testing datasets, provided in Tables 4 and 5. The models' performance can also be assessed efficiently through visual inspection, and scatter plots are the best way to visualize the data. Figure 4 shows the comparison of the actual and predicted wear rate obtained with the linear regression model; a comparison plot of actual and predicted wear rate using the SVM model is depicted in Fig. 5, and the actual versus predicted wear rate of the ANN model is shown in Fig. 6. The values predicted by the linear regression model differ substantially from the actual values on the test data (Fig. 4); in forecasting the wear rate, the competency of the other regression models is far superior to that of the linear regression model. The percentage error between predicted and actual values is 19% (Table 4) and 29% (Table 5) for the training and validation cases, respectively.
Fig. 4 LR—Actual versus predicted values of wear rate
Fig. 5 SVM—Actual versus predicted values of wear rate
However, the values predicted by this model fall within ±30% of the actual values; hence, regardless of the complications involved, the model provides steady values. The support vector machine, a non-parametric regression model, performs prediction according to the chosen kernel function, and the best performance is achieved by proper selection of that kernel. In this study, the best performance of the SVM model was achieved with the radial basis function (RBF) kernel when compared with the other kernel functions.
Fig. 6 Back propagation ANN model regression plots of wear rate (Training: R = 0.97546; Validation: R = 0.97355; Test: R = 0.96819; All: R = 0.97388)
Figure 5 shows the performance scatter plot of actual versus predicted wear rates: approximately 70% of the samples lie near the fitted line and 30% deviate from it. The performance parameters of the regression models (RMSE, R-squared and MAE) are in close coherence for both SVM (0.11, 0.70 and 0.15) and LR (0.11, 0.68 and 0.19). The mean absolute error of the SVM model is slightly better than that of the LR model, showing that the SVM model outperforms the other regression models. The SVM model works in a high-dimensional space, which overcomes the disadvantages associated with non-linear data; research studies show that one major advantage of SVM models is their better performance when the available data are limited, as is the case in many manufacturing-related studies.
From the simulation results for the performance parameters (Tables 4 and 5), the highest R-squared value of 0.98 and the lowest values of RMSE (0.07) and MAE (0.09) are noted for the ANN model. Among the regression models, the SVM model performed better, followed by the LR model. A broadly similar trend of performance parameters is observed for the test data. The mean absolute errors of both the SVM and LR models indicate poor performance on the test data; however, the MAE of the ANN is notably the lowest on the test set, and its predictions are in fine conformity with the actual experimental values.
Fig. 7 Comparison plot of percentage MAE of the soft computing models
data, and the predictions are in fine conformity with the actual experimental values. Based on the statistical estimators/performance parameters considered in this study, the models can be ranked in the order ANN, SVM, LR. However, the performance of the SVM and LR models remains similar when the data set is large, and it is then difficult to compare the efficacy of these models.
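As a minimal sketch of how these statistical estimators are computed, the following Python snippet evaluates RMSE, MAE, R-squared and the percentage MAE (as plotted in Fig. 7) on a few illustrative value pairs taken from Table 6, not the full dataset:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Example values from the first rows of Table 6 (actual vs. LR predictions).
actual = np.array([0.0016, 0.0024, 0.00033, 0.0011])
predicted = np.array([0.0018, 0.0027, 0.000343, 0.0012])

rmse = np.sqrt(mean_squared_error(actual, predicted))
mae = mean_absolute_error(actual, predicted)
r2 = r2_score(actual, predicted)
pct_mae = 100.0 * np.mean(np.abs(actual - predicted) / actual)  # as in Fig. 7

print(f"RMSE = {rmse:.5f}, MAE = {mae:.5f}, R2 = {r2:.3f}, %MAE = {pct_mae:.1f}")
```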
4.3 Validation The percentage MAE of the models used in this study is compared in Fig. 7. The actual values acquired from the experiment and the wear rates estimated by the models for each composition of the MMC are tabulated in Table 6. From the predicted values, it can be inferred that the ANN is a suitable approach for predicting the wear rate for given process parameters. Figure 8 shows the actual versus predicted values of the approaches.
5 Conclusions Investigation of the wear rate in AA-MMC and prediction of the wear rate using soft computing techniques were performed in the proposed work. Predictions using the linear regression, SVM and ANN models were highlighted in the previous section of the paper. The significant outcomes obtained from this work are as follows: • Considering factors such as the root mean squared error, RMSE (smaller is better), the coefficient of determination, R-squared (should be close to one), the mean squared
Table 6 Comparison of actual and predicted wear rate of LR, SVM and ANN models

Sample No. | Actual values | Predicted value (LR) | Predicted value (SVM) | Predicted value (ANN)
1 | 0.0016 | 0.0018 | 0.0019 | 0.00016
2 | 0.0024 | 0.0027 | 0.0024 | 0.00237
3 | 0.00033 | 0.000343 | 0.00027 | 0.00033
4 | 0.0011 | 0.0012 | 0.0011 | 0.0011
5 | 0.0011 | 0.00024 | 0.0014 | 0.00123
6 | 0.0021 | 0.0021 | 0.0022 | 0.0021
7 | 0.00056 | 0.000312 | 0.00055 | 0.00049
8 | 0.0044 | 0.0044 | 0.0044 | 0.004319
9 | 0.0048 | 0.00482 | 0.0047 | 0.00528
10 | 0.00063 | 0.000677 | 0.000636 | 0.000653
11 | 0.00064 | 0.00064 | 0.00059 | 0.000405
Fig. 8 Actual versus predicted values of the approaches
error, MSE (smaller is better), and the mean absolute error, MAE (smaller is better), the performance of each model is evaluated. • The performance parameters of the regression models, such as RMSE, R-squared and MAE, are in close agreement for both the SVM (0.11, 0.68 and 0.19) and LR (0.11, 0.70 and 0.15) models. • Based on the statistical estimators/performance parameters, the models can be placed in the order of accuracy ANN, SVM and LR.
An Assessment of the Relationship Between Bio-psycho Geometric Format with Career and Intelligence Quotient by an IT Software Mai Van Hung, Tran Van The, Pham Thanh Huyen, and Ngo Van Hung
Abstract Bio-geometric and Psycho-geometric are theories developed since 1978 by scientists around the world. The core of these theories is five geometric shapes, each of which describes a type of human personality relating to communication, language, interaction, career choice, stress relief and decision making. This study explores the relationship between the Bio-geometric and Psycho-geometric format and the career and logical intelligence of students, through the students' choices of the shapes (circle, box, triangle, rectangle or squiggle). The choices show that there is a relationship between the Bio-geometric and Psycho-geometric format and the career and logical intelligence of students in some universities in Hanoi. Students with box, triangle and squiggle geometry often have a better IQ than students with circle and rectangle shapes. This relationship is significant in recognizing the ability to think through geometric observation. Keywords Geometric biology · Geometric psychology · Intelligence
M. Van Hung Research Center for Anthropology and Mind Development, VNU University of Education, Hanoi, Vietnam e-mail: [email protected] T. Van The Ha Tay Teacher Training College, Hanoi, Vietnam e-mail: [email protected] P. T. Huyen Research and Development for Real Estate Institute, Hanoi, Vietnam e-mail: [email protected] N. Van Hung (B) Hanoi Metropolitan University, Hanoi, Vietnam e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_12
1 Introduction Biological geometry (Bio-geometrics) and psychological geometry (Psycho-geometrics) are sciences based on an analysis of the morphological characteristics of each body and the instinctive attraction of images. Individual observations act as a database for determining a specific geometric structure for each person. When individuals recognize their own "geometric symbols," they have an opportunity to learn more about their personal characteristics, strengths and weaknesses. People's shapes are divided into 5 categories through 5 geometric symbols: box, circle, triangle, squiggle and rectangle [1–3]. According to Dr. Susan Dellinger, each individual's personality contains all 5 geometric symbols, but only one or two outstanding symbols represent the individual personality [4–6]. Logical thinking is characterized by the IQ (intelligence quotient), a concept introduced by the English scientist Francis Galton in the book Hereditary Genius, published in the late nineteenth century. IQ is often thought to be related to academic success and to achievement at work and in society. Some recent studies show a link between IQ and health and life expectancy [5–8]. There are many studies discussing the relationship between logical thinking capacity and morphological characteristics of the body such as fingerprints, lip prints, physiognomy and diagnostic areas. However, no research has addressed the relationship between logical thinking capacity and the geometric shape characteristics of the body [9–11]. This study was conducted with the purpose of understanding the relationship between the geometry of the body and the ability to think logically, and what this relationship means in education and training.
2 Methodology The study was conducted on 350 volunteers (those without physical and mental deformities) at the VNU University of Education, VNU University of Natural Sciences, VNU University of Social Sciences and Humanities, VNU University of Foreign Languages and International Studies, VNU University of Engineering and Technology, and VNU University of Economics. The study duration was from September to December 2019. The body geometry characteristics were assessed according to Martin's anthropometric method and Susan Dellinger's method: participants look at five figures (box, circle, triangle, rectangle and squiggle) [6] and choose the image that is most relevant to them or the one they are most interested in. The experiment was carried out at the Center for Anthropology and Intelligence Development, University of Education, VNU Hanoi. Logical intelligence was assessed through the US Cognitive Assessment System (CAS) test [8]. Based on the Bio-geometric and Psycho-geometric indices, an IT software package was used to assess the relationship between the Bio-geometric and Psycho-geometric format and the career choice and logical intelligence indices of the students. Classification was done using WHO Anthro for computers. Frequencies and
means were computed. The results were presented in tables and graphs with statistical inference. Both univariate and multivariate analyses were conducted on the relationship between the Bio-geometric and Psycho-geometric format and career choice and IQ. Variables significant at p < 0.2 in the univariate analysis were entered into the final multivariate analysis model. Statistical significance was accepted at the 5% probability level, i.e. a p-value of less than 0.05 [12, 13].
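As a hedged sketch of this two-stage analysis, the Python snippet below screens one categorical association univariately at p < 0.2 and, if it passes, fits a multivariate model judged at p < 0.05. The data frame and its values are synthetic placeholders, not the study's dataset or software.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "shape": rng.choice(["box", "circle", "triangle", "rectangle", "squiggle"], 350),
    "career": rng.choice(["education", "social", "economics", "technology"], 350),
    "iq": np.round(rng.normal(105, 12, 350)),
})

# Univariate screen: keep a variable for the multivariate model if p < 0.2.
chi2, p, dof, expected = chi2_contingency(pd.crosstab(df["shape"], df["career"]))
print(f"shape vs. career: chi2 = {chi2:.2f}, p = {p:.3f}")

if p < 0.2:
    # Multivariate model of IQ on shape and career; coefficients are
    # considered significant at p < 0.05.
    model = smf.ols("iq ~ C(shape) + C(career)", data=df).fit()
    print(model.summary())
```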
3 Results and Discussion 3.1 Bio-geometric and Psycho-geometric Format with Career Humans are classified geometrically into 5 groups corresponding to 5 symbols, each of which has been assigned characteristic mental traits by researchers. Research results for students of Vietnam National University are presented in Table 1. The box people, who show a clear organizational structure, are normally neat, careful, accurate and detail-oriented. This group often works to a plan and has a knack for working with specific processes, numbers and data, but dislikes abstract theories. In addition, they are hard-working and reliable, though not always punctual. The data in Table 1 show that 21.42% of the surveyed students belong to this group, mostly majoring in Technology and Economics. The circle people tend to be friendly, approachable, thoughtful, caring and understanding. They are willing to overcome difficulties, have a good aptitude for diplomacy, are able to listen and find it easy to make friends, and they can persuade others when they want to. In groups, they are often excellent at connecting with people. The circle group accounts for 25.71% of the students surveyed (Table 1), in which students of education and social studies dominate compared with other disciplines. The triangle people often have ambitions and an intention to conquer, and consider themselves stars in a crowd. They show clear leadership capabilities and are ambitious, assertive, self-confident and always get
Table 1 Bio-geometric and psycho-geometric format of VNU's students
No. | Format | Quantity | Ratio (%)
1 | Box | 75 | 21.42
2 | Circle | 90 | 25.71
3 | Triangle | 55 | 15.71
4 | Rectangle | 70 | 20.00
5 | Squiggle | 60 | 17.14
Total | | 350 | 100
ready to take action. Moreover, they are very competitive, always believe that they are right, and expect to win. They are characterized as highly focused and not easily dominated. Approximately 15.71% of the students belong to this group (Table 1), accounting for the majority of students of Humanities and Economics. Rectangle people often have unstable traits; people become rectangles in the face of changes in life or work, which means they are often erratic and unpredictable. They are also quick at acquiring new knowledge, ways of thinking and ideas, and find it easy to make friends, though they often fall into states of forgetfulness. The results in Table 1 show that there are about 70 students in this group, accounting for 20%; the majority of these students are studying foreign languages and Economics. The squiggle people are creative and intuitive. They are often fascinated with novel ideas that are sometimes fanciful, and thus easily overlook practical problems. They are always inspired and like to express themselves, so they are often motivators and influencers. These people tend to work independently and often have trouble working with others. About 60 students, accounting for 17.14% of those surveyed (Table 1), are in the squiggle group, mostly from natural science and foreign language majors. The initial survey thus shows that there is a relationship between body geometry format and career, field of study and ambition across different groups of students. This provides a scientific basis for orienting students' future careers through observation of the body geometry format.
3.2 Bio-geometric and Psycho-geometric Format with Logical Intelligence Gardner's theory of multiple intelligences views human intelligence in many ways and varieties [14]. H. Gardner proposed 8 types of intelligence, in which logical–mathematical intelligence is considered the original intelligence. This intelligence is determined according to Wechsler, 1981 (from the total score, IQ can be determined for ages 16–45) using the following criteria: over 130: very intelligent; from 120 to 129: intelligent; from 110 to 119: above-average intelligence; from 90 to 109: average intelligence; from 80 to 89: below-average intelligence; from 70 to 79: boundary state; under 70: intellectual disability (Table 2). The data in Table 2 show that students with box geometry have the highest level of logical intelligence, followed by students with rectangle geometry; students with squiggle, circle and triangle geometry score lower and are roughly equivalent. Thus, through the geometry of the body it is possible to assess students' logical intelligence, but more effective assessment tools and a larger number of research subjects are needed to distinguish the logical intelligence of the triangle, rectangle and circle groups.
Table 2 Bio-geometric and psycho-geometric format and IQ of VNU's students (columns: No., Format, n, and IQ bands >130, 120–129, 110–119, 90–109, 80–89, 70–79)
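The Wechsler IQ bands quoted in Sect. 3.2 translate directly into a small classification helper; the following Python sketch is illustrative only, with the band labels taken from the text above:

```python
def iq_category(iq: int) -> str:
    """Map a total IQ score to the Wechsler bands quoted in Sect. 3.2."""
    if iq > 130:
        return "very intelligent"
    if iq >= 120:
        return "intelligent"
    if iq >= 110:
        return "above-average intelligence"
    if iq >= 90:
        return "average intelligence"
    if iq >= 80:
        return "below-average intelligence"
    if iq >= 70:
        return "boundary state"
    return "intellectual disability"

print(iq_category(125))  # -> intelligent
```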
Site Administration > Plugins > Install the plugin (Figs. 7, 8 and 9), then upload the ZIP file. You should only be prompted to add additional details (in the Show Extra section) if your plugin is not automatically detected. If your destination directory cannot be written to, you will see a warning message. Finally, check the plugin validation report (Fig. 10).
3.4 Assessing the Status of the Application of Smart Schedule in LMS UEd Moodle Advantages: • Users with permission to edit the calendar can add calendar events to the system, to users or to courses
Fig. 7 Installing the event reminder plugin
Fig. 8 Running the event reminder plugin
• Calendars can easily be shared directly when creating events in a course • The calendar can display website, course, group, user and category events • Nearly all scheduling features are provided, such as exporting a calendar, sharing, tracking a calendar and hiding the calendar display • A new event can be added by clicking the button or by clicking on the blank space of the desired day in the calendar. Limitations of scheduling on Moodle: • The interface and its operations are still easy to confuse • It is not synchronized with online meeting software or with Microsoft or Google programs
Fig. 9 Bringing the event reminder plugin onto Moodle
Fig. 10 Authentication of event reminder plugin
• There are still some small errors, such as failures when saving or deleting the calendar, and detailed event reminders are not available (only a single reminder email one day ahead) • Scheduling operations are still manual • It is not possible to schedule meetings that integrate with meeting software or to invite additional participants, …
4 Conclusion and Recommendation In this era of digital education transformation, smart schedules have proven essential in training management. There are many ways to classify educational schedules, depending on many different factors. Smart schedules are organized into several main categories: examination timetables, school timetables and course timetables. Regarding university course schedules, the goal is to assign lectures to rooms and times while offering different options. At present, the Moodle platform of the University of Education is fully exploited and actively used; in 2020, due to the influence of the Covid-19 epidemic, pupils and students had to leave school, so using the Moodle platform instead of in-class study became very necessary and frequently used, which made the integration of the Moodle plugin urgent and quickly implemented. Here are some suggestions for developing LMS UEd Moodle: • A simple, easy-to-use interface, friendly to all users. • The timetable should send accurate reminders to teachers and students via the system and via mail. • The calendar and calendar reminders need to be synchronized with online meeting applications such as Zoom and Google Meet. • Calendars on the Moodle system need to be exportable to and importable from other systems, such as Microsoft Excel or any other scheduling software, without re-entering data. Acknowledgements We would like to thank the University of Education, Vietnam National University for funding us to finish this article through the project: "Smart Schedule contribution research on LMS-Moodle academic support according to Blended learning teaching in UEd—VNU, Hanoi". Code: QS.NH.20.11
Mobile App for Accident Detection to Provide Medical Aid B. K. Nirupama, M. Niranjanamurthy, and H. Asha
Abstract This application is used to provide immediate medical aid. Road accident rates are extremely high these days, especially for two-wheelers, and timely medical aid can help in saving lives. This system aims to alert the nearby medical centre about an accident so that immediate medical aid can be provided. The accelerometer in the Android mobile senses the tilt of the vehicle and, if an accident is detected, fetches the longitude and latitude of the vehicle using the Global Positioning System (GPS) and forwards the details to a web server over the Internet. The web server identifies the nearest hospital and police station using a Euclidean distance computation; once the nearest hospital and police station are shortlisted, the web server sends the accident details to the concerned hospital and police station. The Android application on the mobile phone also sends a text message with the accident location to the guardian of the victim. This system can save the life of the accident victim by sharing the exact location of the accident. The system detects accidents with the help of an accelerometer sensor and provides an alert message to the authorized people through the Android application. Keywords Accident detection · Alert system · Accelerometer · Android application
B. K. Nirupama (B) · H. Asha Department of MCA, BMS Institute of Technology and Management, Bengaluru 560064, India e-mail: [email protected] H. Asha e-mail: [email protected] M. Niranjanamurthy Department of Master of Computer Applications, M S Ramaiah Institute of Technology, Bangalore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_23
1 Introduction In this era, the motor vehicle population is increasing faster than social and economic development, and road accidents and deaths, particularly involving bikes, are increasing at a high rate. Most of the accident deaths that happen on roads such as highways are due to the absence of immediate medical help; providing immediate medical help helps to reduce fatalities more effectively. This motivated the idea of developing an alert system that senses the severity of an accident and alerts the nearby hospital to send an ambulance to save the victim. The developed system checks whether an accident has occurred and recognizes injury to the driver. If an accident has occurred, the system searches for the nearby medical centre and alerts it with a notification about the incident. The rescue team can rush to the incident spot immediately without delay, as the current location of the user is shared by his or her mobile. It also sends messages to family, friends and relatives to tell them about the incident. The information about family, friends and relatives is already stored in the database, provided by the mobile users who registered with this application. All this information is kept in a directory, and the details are given to authorized persons if an accident happens. This system lets us know the current location of the accident victim and sends alert messages to the nearby police station, hospital and fire engine as well, so that the rescue team can rush to the victim's current location immediately. With the development of mobile cloud computing, wireless communication methods, intelligent mobile terminals and data mining techniques, Mobile Crowd Sensing (MCS), as a new paradigm of the Internet of Things, can be used in traffic congestion control to offer more helpful services and reduce traffic problems [1]. Regarding network availability, smartphones are used in a variety of wireless network environments, such as high-speed networks, unstable networks, and areas without an available network connection [2]. Nowadays, mobile app development and its use in various areas are also significantly rising. Nevertheless, resource constraints of mobile phones, such as limited processing power, low storage, limited memory and faster dissipation of energy, have restricted resource-intensive mobile application development and its availability [3]. Existing mobile systems handle relatively simple user needs, where a single application is taken as the unit of interaction. To understand users' expectations and to provide context-aware services, it is essential to model users' interactions in the task space [4]. Considering that an android integrates a number of actuators, sensors and devices into one system, a standardized framework is beneficial for easy transfer and extension of potential applications [5].
Android OS uses a Java-based virtual machine for its functional section. Mobile applications save money and time and provide solutions at one's fingertips [6]. Android provides third-party applications an extensive API that includes access to phone hardware, settings, and user data; access to privacy- and security-relevant parts of the API is controlled with an install-time application permission system [7]. Recent years have seen a worldwide adoption of smart mobile phones, particularly those based on Android [8]. With the integration of mobile phones into daily life, mobile phones hold increasing amounts of sensitive information, and advanced mobile malware, particularly Android malware, acquires or uses such data without user consent [9]. We humans humanize targets of communication; in this sense, humanoids or androids can be an ideal interface for humans [10].
2 Existing and Proposed System 2.1 Existing System The Android application is tied to an Internet of Things kit, so it is quite difficult to use. In the existing system the mobile accelerometer sensor is not considered; only a hospital-nearby ambulance assistance alert system is considered, with limited use of technology. Drawbacks: • It is not possible to send an emergency message containing the location to registered contacts. • The existing system has no voice recognition.
2.2 Proposed System A "Help" button is provided and is activated when pressed, providing assistance to a vehicle driver who has met with an accident. When the user screams "help", "help me", "please help me" or "help me please", the system traces the nearby police station via GPS. The GPS tracks the longitude and latitude that make up the exact location of the user and forwards the pre-entered emergency alert message given by the user to the nearby police station using GSM, as well as to the registered mobile numbers, so that an alert notification is forwarded to the police, hospital, family, friends and relatives about the incident.
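A minimal sketch of the server-side nearest-facility lookup described above, using the Euclidean distance on stored coordinates as the system's abstract does; the facility names and coordinates below are made-up placeholders:

```python
import math

# Made-up facility coordinates (latitude, longitude) for illustration only.
hospitals = {
    "City Hospital": (12.9716, 77.5946),
    "North Clinic": (13.0359, 77.5970),
    "East Medical Centre": (12.9980, 77.6820),
}

def nearest_facility(lat: float, lon: float, facilities: dict) -> str:
    """Return the facility with the smallest Euclidean distance in degrees."""
    return min(
        facilities,
        key=lambda name: math.hypot(facilities[name][0] - lat,
                                    facilities[name][1] - lon),
    )

print(nearest_facility(12.9800, 77.6000, hospitals))  # -> City Hospital
```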
2.3 Expected Outcome The proposed system shares information about the user by sending an alert message to the police station and hospital when the driver is not safe, using the GPS location tracking system in the Android mobile app.
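The detection rule itself can be illustrated in a few lines. The app runs on Android, so the Python sketch below is only a stand-in for the on-device logic, and the threshold values are assumptions rather than calibrated figures:

```python
import math

IMPACT_G = 3.5   # assumed spike in total acceleration (in g) treated as impact
TILT_DEG = 60.0  # assumed tilt angle (degrees) suggesting the vehicle has fallen

def is_accident(ax: float, ay: float, az: float) -> bool:
    """Flag an accident on a hard impact or an excessive tilt of the device."""
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    # Angle between the device's z-axis and the measured gravity vector.
    tilt = math.degrees(math.acos(max(-1.0, min(1.0, az / max(magnitude, 1e-9)))))
    return magnitude > IMPACT_G or tilt > TILT_DEG

print(is_accident(0.1, 0.2, 0.95))  # normal riding posture -> False
print(is_accident(3.0, 2.4, 0.3))   # hard impact -> True
```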
3 Tools and Technology Used 3.1 Android Introduction Android is a complete software stack for mobile devices such as tablet PCs, notepads, mobile phones, eBook readers and set-top boxes. It comprises a Linux-based OS, middleware and key mobile applications. It tends to be thought of as a mobile operating system, but it is not limited to mobile devices: it is now used in a variety of gadgets such as mobiles, tablets, TVs and so on.
3.2 Android Emulator Figure 1 shows the Android emulator, where the user can see the navigation menus for using the system. Highlights: (1) open source; (2) anyone can customize the Android platform; (3) a greater number of mobile applications can therefore be chosen by the consumer; (4) it provides attractive features such as weather details, opening screens and live RSS feeds.
3.3 AndroidManifest.xml File in Android This document contains information about your package, including components of the application such as activities, services, broadcast receivers and content providers. It performs specific tasks: • It secures the application's access to protected parts by specifying the permissions.
Fig. 1 Emulator of android
• It also declares the Android API level that the application is going to use. • It lists the instrumentation classes. The instrumentation class provides profiling and other information; this information is removed shortly before the application is published, and so on.
3.4 Dalvik Virtual Machine The DVM is an Android virtual machine optimized for mobile phones. It optimizes the virtual machine for memory, battery life and performance. Dalvik is the name of a town in Iceland, and the DVM was written by Dan Bornstein. The Dex compiler converts the class files into a .dex file that runs on the Dalvik VM; multiple class files are converted into one dex file.
3.5 HTML Hypertext Markup Language allows users to produce Web pages that include text, graphics and pointers to other Web pages. Advantages: • HTML documents are small and hence easy to send over the net. • They do not include formatted information. • HTML is platform-independent. • It is not case-sensitive.
3.6 JavaScript JavaScript is a script-based programming language. It was initially called LiveScript and renamed JavaScript to indicate its relation to Java. It supports the development of both client and server components. On the client side, scripts embedded in web pages are executed by the browser. On the server side, it can be used to write web server programs that process information submitted by a web browser and then update the browser's display accordingly.
3.7 Java Server Pages Figure 2 shows JSP. JSP technology lets you embed snippets of servlet code directly into a text-based document. A JSP page is a text-based document that contains two types of content: static template data, which can be expressed in many text-based formats such as HTML, WML and XML, and JSP elements, which determine how the page constructs dynamic content.
3.8 J2EE Platform Overview This platform is designed to provide server-side and client-side support for creating distributed, multi-tier applications. Such applications are conventionally arranged as
Fig. 2 JSP-Java server pages
a client tier providing the UI, one or more middle-tier modules that provide user services and business logic for an application, and back-end enterprise information systems providing data to administrators. Benefits: • Simplified design and development • Freedom of choosing servers, tools and components • Integration with existing information systems • Scalability to meet demand variations.
3.9 MySQL MySQL is a relational database management system, which organizes data as tables. MySQL is a database server based on the RDBMS model and takes care of three key aspects of data: data structure, data integrity and data manipulation. MySQL uses system resources efficiently, on diverse hardware configurations, to deliver unmatched performance and scalability. Features: Portable: MySQL runs on a wide range of platforms, from PCs to supercomputers, and its multi-user load module lets the same application run unchanged. Reliable: the MySQL RDBMS is a high-performance, fault-tolerant DBMS, specially designed for online transaction processing and for dealing with large database applications. Multithreaded server architecture: MySQL's scalable multithreaded server architecture delivers adaptable, high performance for very large numbers of clients on any hardware configuration.
Highlights: • Data independence. • Concurrency management. • Parallel-access support to speed up data entry and online transaction processing for applications. • Database procedures, functions and packages. A small access sketch follows.
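As a minimal sketch of querying such a MySQL database from Python: the MySQL Connector/Python driver is assumed to be installed, and the host, credentials, database and table names are placeholders, not the system's actual configuration.

```python
import mysql.connector  # MySQL Connector/Python driver, assumed installed

# Host, credentials, database and table names are placeholders.
conn = mysql.connector.connect(
    host="localhost", user="appuser", password="secret", database="alertdb"
)
cur = conn.cursor()
cur.execute("SELECT name, latitude, longitude FROM hospital ORDER BY name")
for name, lat, lon in cur.fetchall():
    print(name, lat, lon)
cur.close()
conn.close()
```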
3.10 SQLyog SQLyog is written in C++ and has no restrictions on runtimes. It uses a local database to preserve internal data such as connection settings, so these settings persist across sessions on a per-table basis. It helps: • to save time writing queries, with syntax checking; • to save time designing visually complicated queries.
3.11 Servlet Figure 3 shows the client/servlet database. A servlet is a small program that runs within the web server; it receives and responds to requests from the client, usually over HTTP.
Fig. 3 Client/servlet database
4 Results 4.1 Web Application Figure 4: on the local host we click on our project title, which leads to the admin login page. The admin must enter his/her user name and password correctly, after which it redirects to the actual web application. Figure 5: Add Hospital. The add-hospital form consists of details such as the name of the hospital and the address, containing area, city, state and PIN code. It also has an Email id field,
Fig. 4 Admin login
Fig. 5 Add hospital
where we should provide an existing email id, and a Phone No. field. Most importantly, we must provide the location longitude and latitude of the hospital. The hospital details added above are displayed in the list (Fig. 6, View Hospital Information); this list may contain many records, and we can also delete records by selecting them. Figure 7: Add, Edit and Delete Police Station Details. The add-police-station form consists of details such as the name of the station and the address, containing area, city, state
Fig. 6 View hospital information
Fig. 7 Add, edit and delete police station details
and PIN code. It also has an Email id field, where we must give an existing email address, and a cell-phone number. In particular, we have to give the location longitude and latitude of the police station. The police station details added above are displayed here; the list may contain many records, and we can also remove a record by selecting it. Figure 8 shows the View Police Station Details page. The system also allows us to update to a new password; Fig. 9 shows the Change Password page.
Fig. 8 View police station details
Fig. 9 Change password page
5 Android Application We should provide the IP address here, as given by the system. The version should be typed, then press OK to go to the registration page (Fig. 10). To register, we should provide a user name, email id, phone number and password, then click on the Register button. Once this is done, we can go to the login page (Fig. 11). Figure 12, Client Login: at login we give the same user name used in the registration form, type the password, and click on the sign-in button. Figure 13, Home Page: once we log in, it redirects to the home page. Here we can see Add Authorize, Panic, Call Police Station, Call Fire Engine, Change Password and Log-out options. In Add Authorize we can add details such as name, phone number and relationship. The Panic button helps us send a voice message, and we can call the nearby police station and fire engine. We can also change our password and then log out. Figure 14: this page displays the user id, name, phone number and relationship of the user with the authorized person; we can view the details which were entered previously. Figure 15, Change Password: the update-password form requires the current password and the new password entered twice, after which we click on the Update button. Fig. 10 IP address
Fig. 11 Register form
Fig. 12 Client login
Fig. 13 Home page
Fig. 14 Change password
Fig. 15 Change password
6 Conclusion The proposed system deals with accidents and alerts by sending messages. The accelerometer is the main part of the system, supporting the transfer of the message to the various devices in the system. The accelerometer sensor activates when an accident takes place, and the details are transmitted to the stored mobile numbers via the GSM module. Using GPS, the exact position can be communicated to the nearby police station, hospital and fire engine, and messages are also sent to family, friends and relatives. The accident is recognized by the accelerometer sensor, which is used as the primary module of the system. As a future enhancement of this application, by activating an automatic driving mode the alert system could identify incidents automatically and help prevent accidents. As of now this system has been produced only for Android phones; in future it can be developed for iOS, Windows and other platforms as well.
References 1. Yan, H., Hua, Q., Zhang, D., et al. (2017). Cloud-assisted mobile crowd sensing for traffic congestion control. Mobile Networks and Applications, 22, 1212–1218. https://doi.org/10. 1007/s11036-017-0873-2.
2. Ma, S., Lee, W., Chen, P., et al. (2016). Framework for enhancing mobile availability of RESTful services. Mobile Networks and Applications, 21, 337–351. https://doi.org/10.1007/s11036015-0655-7. 3. Anitha, S., & Padma, T. (2020). Adaptive proximate computing framework for mobile resource augmentation. Mobile Networks and Applications, 25, 553–564. https://doi.org/10.1007/s11 036-019-01278-8. 4. Tian, Y., Zhou, K., Lalmas, M., & Pelleg, D. (2020). Identifying tasks from Mobile App usage patterns. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’20) (pp. 2357–2366). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3397271.3401441. 5. Yu, S., Nishimura, Y., Yagi, S., Ise, N., Wang, Y., Nakata, Y., Nakamura, Y., & Ishiguro, H. (2020). A software framework to create behaviors for androids and its implementation on the Mobile Android “ibuki”. In Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction (HRI’20) (pp. 535–537). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3371382.3378245. 6. Roussel, G., Forax, R., & Pilliet, J. (2014). Android 292: Implementing invokedynamic in Android. In Proceedings of the 12th International Workshop on Java Technologies for Realtime and Embedded Systems (JTRES’14) (pp. 76–86). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/2661020.2661032. 7. Felt, A. P., Chin, E., Hanna, S., Song, D., & Wagner, D. (2011). Android permissions demystified. In Proceedings of the 18th ACM conference on Computer and communications security (CCS’11) (pp. 627–638). Association for Computing Machinery, New York, NY, USA. https:// doi.org/10.1145/2046707.2046779. 8. Sufatrio, Tan, D. J. J., Chua, T.-W., & Thing, V. L. L. (2015). Securing android: A survey, taxonomy, and challenges. ACM Computing Surveys, 47(4), Article 58 (July 2015), 45 p. https://doi.org/10.1145/2733306. 9. Tam, K., Feizollah, A., Anuar, N. B., Salleh, R., & Cavallaro, L. (2017, February). The evolution of android Malware and android analysis techniques. ACM Computing Surveys, 49(4), Article 76, 41 p. https://doi.org/10.1145/3017427. 10. Ishiguro, H. (2006). Interactive humanoids and androids as ideal interfaces for humans. In Proceedings of the 11th international conference on Intelligent user interfaces (IUI’06) (pp. 2– 9). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/111 1449.1111451.
Image Processing Using OpenCV Technique for Real World Data H. S. Suresh and M. Niranjanamurthy
Abstract Image processing is a significant component in many image analysis and computer vision tasks. A methodological study on the importance of image processing and its applications in the field of computer vision is presented here. During an image processing operation the input given is an image and its output is an enhanced, high-quality image according to the techniques used. Image processing is usually referred to as digital image processing, but optical and analog image processing are also possible. This research work provides information for the design and implementation of image processing using OpenCV, together with background on image processing for study and research. Implemented in the Python programming language, the resulting system provides a new approach and reference for the customizable development of remote-sensing image processing algorithms. Image processing is the application of a set of methods and algorithms to a digital image to analyze, enhance, or optimize image characteristics. Keywords OpenCV · OpenCV functions · Image processing · Digital image · OpenCV library · Modules
H. S. Suresh · M. Niranjanamurthy (B) Department of Master of Computer Applications, M S Ramaiah Institute of Technology, Bangalore, India e-mail: [email protected] H. S. Suresh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_24
1 Introduction Image processing is a technique to perform certain operations on an image, in order to obtain an improved image or to extract valuable information from it. It is a kind of signal processing where the input is an image and the output can be an image or characteristics/features connected to that image. Today, image processing is one of the most rapidly developing technologies. It is also an established core research area in
the disciplines of engineering and computer science. Image processing basically involves the following three steps: • importing the image via image acquisition tools; • analysing and manipulating the image; • output, in which the result can be an altered image or a report based on image analysis. The principal distinction between TensorFlow and OpenCV is that TensorFlow is a framework for machine learning while OpenCV is a library for computer vision. You can do image recognition with TensorFlow, although it is suitable for more general problems as well, such as classification, clustering and regression. Python is an apt choice for such image processing tasks, owing to its growing popularity as a scientific programming language and the free availability of many state-of-the-art image processing tools in its ecosystem. OpenCV releases two kinds of Python interfaces, cv and cv2. In cv, all OpenCV data types are preserved as such; for example, when loaded, images are of type cvMat, the same as in C++. The cv2 import name has been kept to remain consistent with the many tutorials around the web. Deep learning is a rapidly growing domain of machine learning, and if you are working in the field of computer vision/image processing (or getting up to speed), it is a crucial area to explore; with OpenCV 3.3, we can use pre-trained networks with popular deep learning frameworks. It is important to know what exactly image processing is and what its role is in the bigger picture before diving into how it works. Image processing is most commonly termed 'digital image processing', and the domain in which it is frequently used is 'computer vision'. Do not be confused: we are going to discuss both of these terms and how they connect. Both image processing algorithms and computer vision (CV) algorithms take an image as input; however, in image processing the output is also an image, whereas in computer vision the output can be features/information about the image. OpenCV is free for both commercial and non-commercial use, so you can use the OpenCV library even for your commercial applications. It is a library mainly aimed at real-time processing. It now has several hundred inbuilt functions implementing image processing and computer vision techniques, which make developing advanced computer vision applications easy and efficient. Computer vision overlaps significantly with the following fields: Image Processing, which focuses on image manipulation; Pattern Recognition, which explains various methods to classify patterns; and Photogrammetry, which is concerned with obtaining precise measurements from images.
2 Related Work Face recognition technology is used in many applications, for example mobile applications, biometric identification, and student attendance systems; it is a subfield of image processing. In OpenCV, part of the face can be used for identification, and the whole face is not required. This method is very easy to implement and to verify the data, there is no missing data in OpenCV, and the performance of face recognition can be measured on accurate output [1]. OpenCV is a library built by Intel in 1999. It is a cross-platform library, essentially built for real-time image processing systems, and it includes state-of-the-art computer vision algorithms. The system the authors built achieves an accuracy of around 85–90% depending on the lighting conditions and camera resolution [2]. OpenCV Python programming is used to perform the required image processing operations [3]. Feature extraction is a process of recognizing and extracting features from images and storing them in feature vectors. It is a significant phase of content-based image retrieval (CBIR); extracted features are equally significant in terms of their use, since they can further be taken as inputs to the next phases of CBIR. There are several procedures for extracting features from an image; one work applies feature extraction to clinical images using the SURF method on the OpenCV platform [4]. Another study used a camcorder to record knee-joint treatment and displayed real-time colour-detection results on a PC, using connected-component labelling and bounding-box methods in an OpenCV program to design a colour detector that displayed the current frame and recorded patient data for new and existing patients [5]. The core of the cement production process is the clinker kiln; proper operation of the kiln depends on factors such as timely monitoring of its thermal behaviour under different operating conditions. One work systematizes the empirical knowledge of skilled kiln operators, connecting it with the analysis of thermography images of the kiln using OpenCV [6]. Bank cheques are used widely for financial transactions in various organizations and are still verified manually; traditional verification includes the date, signature, legal information and payment written on the cheques. In one paper, the legal information is extracted from a captured cheque image by preprocessing the image, extracting the required information, and then recognizing and verifying the handwritten fields. Image processing techniques such as thinning, median filtering and dilation, together with verification procedures, are used in this approach [7].
288
H. S. Suresh and M. Niranjanamurthy
(OpenCV) library. By and large, coordinated highlights in 160.24 ms. (Overall, recognizing and coordinating 1132.00 and 80.20 highlights, separately, in 265.67 ms) [8]. Camera alignment dependent on 3D network target was utilized to build up the connection between the places of the pixels and the scene focuses. This progression was cultivated by OpenCV functions [9]. Among the procedures of sound system vision, individuals have given more consideration to binocular sound system vision which depends on preparing two pictures. It straightforwardly reenacts the way of natural eyes watching one scene from two diverse viewpoints. Based on OpenCV, the significant calculation of sound system vision is accomplished and the profundity data of item is recovered [10]. In Image Processing, various calculations can give comparable yields yet various efficiencies. In this manner, the engineer needs to lead tests making variations for more effectiveness so as to choose the best calculation. Kraken upholds OpenCV picture handling capacities. It is additionally equipped for library redirection for new or adjusted capacities frequently overhauled in OpenCV [11]. The interest for every day necessities keeps on expanding, simultaneously, the creation effectiveness and quality prerequisites of the items are getting increasingly elevated. Focusing on this issue A practical strategy for 3D printing physical surface deformity discovery dependent on OpenCV [12].
3 Image-Processing and Digital Image 3.1 Image Processing Image processing is a technique for reproducing certain procedures on an image, in order to obtain an enhanced image or possibly extract valuable data from it. The fundamental meaning of image preparation “Image preparation is the investigation and control of a digitized image, in particular to improve its quality.” Image processing is a technique to perform some procedures on an image, in order to obtain an improved image or extract valuable data from it. It is a kind of sign preparation where the information is an image and the output can be an image or attributes related to that image.
3.2 Digital Image An image can be categorized as a 2D capacity f (x, y), where x and y are spatial arrangements (planes), and the fat fit in any pair of directions (x, y) is known as the strength or degree of darkness of the image by then. An image is simply a twodimensional network (3-D in the case of tinted images) characterized by the digital
Image Processing Using OpenCV Technique for Real World Data
289
capacity f (x, y) at any time gives the estimate of the pixels of an image, the pixel estimate shows what how bright is this pixel and what shading it must have. Image preparation is essentially signal management in which the information is an image and the output is an image or qualities depending on the preconditions related to that image. Here we can see the basice structure of picture, In Fig. 1 shows blend 0’s and 1’s in formate of every pixels and mix of these pixels in cluster would we be able to call picture. In Fig. 2 shading layer of 3D picture which comprises of 3 primary shading mix RED, GREEN and BLUE. The size of the framework depends of the shading framework utilized. All the more precisely, it depends from the quantity of channels utilized. For multichannel pictures the segments contain the same number of sub sections as the quantity of channels. In each shaded picture have a few highlights and they are HSV (Hue, Saturation and Value). Hue compares to the shading components(base pigment). Saturation Fig. 1 Combination 0’s and 1’s
Fig. 2 Coloring layer of 3D image
290
H. S. Suresh and M. Niranjanamurthy
is the measure of color(depth of pigment) (dominance of Hue) (0–100%) Value is fundamentally the splendor of the color (0–100%).
4 Architecture of Image Processing Figure 3 shows a framework of plan design and shows the relationship among picture and video libraries, graphical UI (GUI) libraries, calculation and preparing libraries, and OpenCV in different picture designs (for example JPEG, PNG, TIFF, JPEG2000), video codecs and imaging gadgets (for example QTKit, VFW, videoInput, V4L), and GUI structures (for example Cocoa, Gtk+, Windows API, Qt) were actualized utilizing libraries, for example, other open source programming. Interfaces for pictures, recordings, imaging gadgets, and GUIs were incorporated during the usage. Along these lines, designers can profit by a joined I/O and GUI without considering picture designs, video codecs, camera drivers, and diverse working frameworks. For instance, we can utilize imread() and imshow() to load and show a picture, individually. These activities generally incurthe lion’s share of the cost associated with picture preparing. In OpenCV, great picture handling innovations and most recent advancements are effectively executed. Specifically, innovations that are like mechanical technology have been actualized. AI, which is crucial to these advances, has additionally been effectively actualized. Thus, calib3d, features2d, objdetect, video, and ml modules are refreshed every now and again, and new functionalities are added to the storage compartment. Utilizations of Image Processing.
Fig. 3 OpenCV architecture and development
Image Processing Using OpenCV Technique for Real World Data
291
Fig. 4 OpenCV Block diagram with sustained operating systems
In Fig. 4. OpenCV is inherent layers. At the top is the OS under which OpenCV works. Next comes the language ties and test applications. Beneath that is the subsidized code in opencv_contrib, which contains generally more significant level usefulness. After that is the center of OpenCV, and at the base are the different equipment advancements in the equipment speeding up layer (HAL). Figure 4 shows this association. Figure 5 Shows the OpenCV Basic structure, Basic Structure of OpenCV CV fundamentally contains picture handling, picture structure investigation, movement and following, design acknowledgment, camera alignment and so on. HighGui gives the graphical UI and deals with picture stockpiling. CXCore contains information structure, network variable based math, object constancy, blunder dealing with, drawing and fundamental math’s and so on. It likewise works for dynamic stacking of code. MLL comprises of many bunching, characterization and information investigation capacities. Figure 6. represents the Basic structure of OpenCV library. It mainly contains four sections Core, CV, HighGUI, and ML. The image is classified into 3 parts: Fig. 5 Basic structure of OpenCV
Fig. 6 The basic structure of the OpenCV library
• Binary image: each pixel can take only two possible values, 0 or 1, where 0 is black and 1 is white. The number of channels in a binary image is 1, and the depth of a binary image is 1 (bit). Figure 7a shows the representation of a binary image.
• Grayscale image: each pixel is 8 bits and can take values from 0 to 255, where each value corresponds to a tone between black and white (0 for black and 255 for white). The number of channels for a grayscale image is 1, and the depth of a grayscale image is 8 (bits).
• RGB image: in this image (Fig. 7c), each pixel stores 3 values: R is 0–255, G is 0–255, and B is 0–255, where each value corresponds to a color tone. The depth of an RGB image is 8 (bits), and the number of channels is three.

g(x, y) = 1 if f(x, y) ≥ T, and 0 otherwise
Thresholding is the simplest technique for image segmentation and the most common way to convert a grayscale image into a binary image. Here g(x, y) represents the thresholded image pixel at (x, y) and f(x, y) represents the grayscale image pixel at (x, y). Circle detection using OpenCV: circle detection finds a variety of uses in biomedical applications, ranging from iris detection to white blood cell segmentation. A circle can be described by the equation given below. To discover potential circles, the algorithm uses a 3-D matrix called the "accumulator matrix" to store candidate a, b, and r values. The value of a (the x-coordinate of the center) may
Fig. 7 a Binary image, b grayscale image, c RGB image
range from 1 to rows, b (the y-coordinate of the center) may range from 1 to cols, and r may range from 1 to maxRadius = √(rows² + cols²). The circle itself is given by

(x − h)² + (y − k)² = r²

where (h, k) is the center and r is the radius.
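The following hedged Python sketch ties the two techniques together: it applies the global thresholding rule above with cv2.threshold and detects circles with cv2.HoughCircles, which implements the accumulator voting internally. The file name and all parameter values are illustrative assumptions, not values from the paper:

```python
import cv2
import numpy as np

# Read the image in grayscale: one channel, 8-bit depth (values 0..255).
gray = cv2.imread("cells.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
print(gray.shape, gray.dtype)  # e.g. (rows, cols) and uint8

# Global thresholding: g(x, y) = 255 if f(x, y) > T, else 0
# (OpenCV uses a strict inequality for THRESH_BINARY).
T = 127
_, binary = cv2.threshold(gray, T, 255, cv2.THRESH_BINARY)

# Hough circle detection; blur first so spurious edges vote less.
blurred = cv2.medianBlur(gray, 5)
circles = cv2.HoughCircles(
    blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
    param1=100, param2=30, minRadius=5, maxRadius=50)

if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        print(f"circle: centre=({x}, {y}), radius={r}")
```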
4.1 Image Processing in OpenCV

Image processing in OpenCV is used for:
• Changing color spaces
• Image gradients
• Canny edge detection
• Image pyramids
• Contours
• Histograms
• Image transformations
• Geometric image transformations
• Image thresholding
• Smoothing images
• Morphological transformations
• Template matching
• Hough line transform
• Hough circle transform
• Image segmentation with the watershed algorithm
• Interactive foreground extraction using GrabCut.
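Most of the operations listed above map to one-line calls in the Python bindings. A brief, hedged sampler (the input path is a placeholder):

```python
import cv2
import numpy as np

img = cv2.imread("input.jpg")                        # placeholder path

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)           # change color space
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)                    # Canny edge detection
smaller = cv2.pyrDown(img)                           # image pyramid (downscale)
blurred = cv2.GaussianBlur(img, (5, 5), 0)           # smoothing
hist = cv2.calcHist([gray], [0], None, [256], [0, 256])  # histogram

# Morphological transformation: erosion with a 3x3 kernel.
kernel = np.ones((3, 3), np.uint8)
eroded = cv2.erode(gray, kernel, iterations=1)
```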
5 Advantages of Image Processing Using Computer Vision

• The processing of images is faster and more cost-effective. One needs less time for processing, as well as less film and other photographic equipment.
• It is easier to process images: no developing or fixing chemicals are required to take and process digital pictures.
• Processing in a simpler and faster manner: it permits customers and enterprises to run checks and gives them access to their products. This is possible thanks to computer vision running on fast computers.
• Reliability: computers and cameras do not suffer from the human factor of tiredness, which is eliminated in them. Productivity is largely constant and does not depend on external factors such as illness or emotional state.
• Accuracy: the precision of computer imaging and computer vision ensures better accuracy in the final product.
• A wide range of applications: the same computer system can be used in several different fields and activities, for instance in factories with warehouse tracking and transportation of supplies, and in the medical industry through scanned images, among many other options.
• Cost reduction: time and error rates are reduced in the computer imaging process, which cuts the expense of recruiting and training special staff to do the activities that computers can perform like so many specialists.
6 Disadvantages of Image Processing Using Computer Vision

• Copyright misuse is now easier than it used to be. For example, pictures can be copied from the Internet just by clicking the mouse a few times.
• Old professions (for example, make-up artist, repro cameraman) disappear, and new ones do not necessarily appear. For example, in the mid-1990s the newspaper Aamulehti began using electronic make-up, and the traditional make-up staff were left jobless.
• Work has become more technical, which may not be a disadvantage for everybody.
• A digital file of a given size can no longer be enlarged with good quality. For example, a good poster cannot be made from an image file of 500 kB. However, it is easy to make an image smaller.
7 Conclusion

Image processing plays a vital role in the improvement of low-quality images. In particular, data obtained from automated image acquisition systems, which is in digital form, can best be utilized with the help of digital image processing. Image enhancement is a significant part of digital image processing. Image enhancement techniques help improve the visibility of any portion or feature of the image while suppressing the information in other portions or features. The goal of image enhancement is to improve the visual appearance of an image or to provide a better transform representation for subsequent automated image processing. Images used in legal contexts come from different domains. Many images, such as clinical images,
satellite images, microscopic images, and even authentic photographs are adversely affected by poor contrast and noise. It is important to improve contrast and eliminate noise to improve image quality.
Additional research: more complex preprocessing, research, and enhancement algorithms should be developed alongside advances in computer technology, while keeping their legal validity in a court of law. OpenCV (Open Source Computer Vision Library) is an open-source library for computer vision and machine learning software.
Acknowledgements We thank the Management and Principal of M S Ramaiah Institute of Technology, Bangalore for their support and encouragement to complete this research work.
Virtual Reality: A Study of Recent Research Trends in Worldwide Aspects and Application Solutions in the High Schools Tran Doan Vinh
Abstract The development of science and technology has posed many problems for us. In particular, in the field of education, how to use and manage technology in schools reasonably and scientifically so as to contribute to improving the quality of teaching and learning. In this paper, after highlighting some concepts of Virtual Reality (VR), VR technology, and VR in Vietnamese high schools, we offer solutions for applying VR in Vietnamese high schools. Keywords Virtual reality · VR technology · School · High schools
1 Introduction

The development of science and technology has posed many problems for us. In particular, in the field of education, how to use and manage technology in schools reasonably and scientifically so as to contribute to improving the quality of teaching. The technologies in the classroom that we are particularly interested in are computers, interactive boards, projectors, laptops, VR technology, AR technology, etc. VR technology is one of the areas that needs to be researched, exploited, and applied more because of its practicality, besides Internet of Things (IoT) technology, smart cities, smart schools, artificial intelligence (AI), etc. VR technology is still a new term, and there is little specific information in Vietnamese books and newspapers on this issue. In particular, the application of VR technology to teaching and education is still a new issue today. Moreover, the effect of VR technology is still a mystery that needs to be explored in time. If applied in teaching, VR technology can be a breakthrough that brings high efficiency in stimulating students' interest, one of the important and difficult problems for many educators. The application of information technology (IT) in teaching is an orientation of educational innovation, especially in teaching subjects such as Technology, Physics, and many
other subjects. High school students' main activities involve exchanging ideas and confidences with their friends; they like to explore, want to satisfy their imagination, and have fun with their friends. However, at this age, they do not yet have real international experiences. In the teaching process, if we research and apply VR technology to each topic and each subject in a logical and scientific way, it will give students practical, formative experiences. They gain comprehensive perspectives from many aspects without spending too much money, which helps them develop a well-rounded personality and improves the quality of teaching those topics and subjects. VR technology helps create images of the concepts taught by teachers in the classroom, stimulating students' imagination and connecting it to real things. Innovating teaching methods is one of the essential jobs of a modern teacher. The application of VR technology helps teachers improve their IT skills, catch up with the trend of the 4.0 technology era, and motivates students, stimulating their interest in each lesson. Being aware of the above issues, and in accordance with the application of VR technology, we have selected the subject: application solutions of VR technology in teaching in Vietnamese high schools. The scientific hypothesis of the topic is that if VR is applied to teaching in high schools reasonably and scientifically, students can manipulate VR and form subject knowledge across topics, improving the quality of teaching and learning in high schools. The research questions of the topic are: • How does VR technology help students form subject knowledge in Vietnamese high schools? • How can the effectiveness of VR technology in helping students improve the quality of learning subjects and topics be evaluated? Based on the reasons for choosing the topic, the scientific hypothesis, and the research questions, we present the content of this scientific report in the following order: • Overview of domestic and foreign research on the application of VR technology in teaching; • What is VR? What are the characteristics of VR? What is the technology of a VR system? • The actual situation of using VR in Vietnamese high schools; • Solutions for applying VR in Vietnamese high schools.
2 What Is VR? And VR in Vietnam High Schools

Before learning about VR in Vietnam high schools, we would like to introduce the concept of VR, an overview of domestic and foreign research on the application of VR technology in teaching, and the characteristics and technologies of VR.
2.1 Overview of Domestic and Foreign Research on the Application of VR Technology in Teaching

2.1.1 In the World
The application of VR in training, vocational training, and education (AR Education) is a breakthrough in teaching methods and a new approach for learners and teachers: instead of rote and purely theoretical study, learners move to practical learning and hands-on experience through 3D simulation and LAB laboratories, so lessons and knowledge become practical, detailed, quick to understand, and easy to remember, attracting learners and proving many times more effective than the old way. According to Forbes, Daniel Newman, CEO of Broadsuite Media Group (USA), affirmed that AR and VR are among the educational technology trends of 2019. Along with AI (artificial intelligence), VR and AR technologies are increasingly proving their great benefits and potential for education. In developed countries, VR and AR have achieved initial success in the social sciences and are continuing to grow with STEM. AR technology is expected to reach $61.39 billion by 2023. Around the world, VR is used in many training fields, such as firefighter training, vocational training, and especially high school training programs. Virtual reality has developed enough to change the way students learn and explore the world around them. Some schools in Utah are using Nearpod digital teaching tools to educate students in an immersive environment. Instead of just reading dryly, students can experience and observe what they are actually learning. Full Sail University in Florida is using VR not only to educate students but also to provide an on-campus environment where online students can learn more about their modules. Virtual classes allow students to study remotely and explore many social aspects without the need for a classroom, with tools that allow students to interact in a way that blurs the boundary between classroom and online experience.
2.1.2 In Vietnam
VR technology has now attracted a lot of attention and is applied in the production processes of small and large enterprises at home and abroad, and in various fields, due to its convenience. The growing wave of VR technology worldwide has also quickly entered the Vietnamese market. Recently, VR exhibitions have gradually asserted their position. In Vietnam, the application of AR and VR technology in teaching faces many challenges: funding and technology are limited, and the spread to students is not wide. Currently only a few training units apply this technology. Prominent among them is Arena Multimedia, with e-books and learning materials equipped with AR. Arena Multimedia is a pioneer in multimedia art training in Vietnam and Asia. Therefore, creating conditions for students to study and experience is put on the
top. In addition to the desirable facilities and dynamic learning environment, the AR curriculum at Arena is a unique point in its training and teaching. Arena has successfully applied AR and VR technology to its electronic curriculum with the support of the Aptech content development team. In addition, virtual reality glasses offer great utility in all areas and are still being studied, researched, and developed by many young people. In Vietnam, many VR technology companies have launched research and products that aim to develop and improve VR applications in the future. Notable names include Holomia and Tourzy Media, two leading units in Vietnam in VR. • Holomia The strength of Holomia's VR application is in the field of real estate. However, in the development of VR technology, Holomia is also one of the units that has successfully applied VR in the field of education. The unit has sponsored and built a number of projects that apply 3D space to subjects such as History and Science, so that students can observe images in the most realistic way even while standing in a closed room. • Tourzy Media Tourzy Media has applied VR to all areas of society, notably VR web design. How is web VR different from traditional websites? According to the criteria of Tourzy Media, web VR gives users 360-degree images and videos, and users are free to choose the appropriate viewing angle.
2.2 What Is VR?

Virtual reality is an image of the real world generated by computer that responds to human movements. Interaction with this simulated environment happens through devices such as interactive speakers, glasses, or data gloves, backed by large amounts of data. To enrich the experience, simulators engage other senses such as hearing (sound). Virtual reality is a term used for computer-generated 3D environments that allow the user to enter and interact with alternate realities. The users are able to "immerse" themselves to varying degrees in the computer's artificial world, which may be either a simulation of some form of reality or the simulation of complex data [1]. Through the above definitions of VR, we see that VR is an environment simulated by humans, by a system consisting of computers, specialized software, big data, and specialized devices such as interactive speakers, glasses, or gloves. Virtual reality tries to create an illusory environment that can feed our senses with artificial information, leading us to believe that it is (almost) real. In
addition to creating virtual space, VR technology can also interact with the user through gestures and various senses such as hearing, smell, and touch. The virtual reality we are talking about is the kind made by computers that allows users to experience and interact with a simulated 3D world by wearing a headset with a built-in monitor and something like a motion tracking sensor. The display is usually split between the user's two eyes, creating a stereoscopic 3D effect with stereo sound; combined with motion-sensing technology, this creates a believable experience that lets you explore virtual worlds created by computers. As such, we can understand that there are currently two important terms for virtual reality in the world: VR and AR (Augmented Reality). VR is a computer simulation that simulates or recreates a real-life environment or situation. It immerses the user in the surrounding landscape by making them feel like they are experiencing the simulation live, primarily by stimulating their sight and hearing. We will discuss the relationship between the two technologies, VR and AR, in the section on research trends.
2.3 Characteristics of Virtual Reality

Characteristics of VR are:
• The VR must be guaranteed: an ideal world generated by computers; it must look real; and it must interact with users in the same way as the real world.
• In practice, to achieve acceptance, it may provide a lower standard and reproduce only some aspects of the real world.
• Immersion: the VR environment is immersive if it gives the user the feeling of being inside the environment rather than the feeling of observing it.
• A perfect VR does not yet exist; to make people feel comfortable rather than confused, it must have these characteristics: immersion, generating scenes that feel as though they belong to the user's virtual environment; interaction with objects in the scene as the user would in reality, accepting approximation within a reasonable limit; and appealing to as many senses as possible: sight and sound, haptic interfaces, taste and smell, and even a "sixth sense".
2.4 What Is the Technology of a VR System?

A VR technology system consists of two components, namely hardware and software.
• Hardware
The hardware of a VR system consists of a computer, input devices, and output devices.
Computer: a PC or workstation with a strong graphics configuration.
Input devices: position tracking to determine the observer's position; navigation interfaces to move the user's location; gesture interfaces such as data gloves so that the user can control objects.
Output devices: graphical displays (such as monitors, HMDs, etc.) to view 3D objects; sound equipment (speakers) to hear surround sound (such as Hi-Fi, Surround); haptic feedback (like gloves) to create the sense of touch when touching or grasping an object; force feedback to create impact forces, as in cycling, walking, or shocks.
• Software
Software is always the soul of VR, as it is of any modern computer system. In principle, any programming language or graphics software can be used to model and simulate VR objects: OpenGL, C++, Java3D, VRML, X3D, etc., or commercial software such as World Toolkit, People Shop, etc. The software of any VR system must serve two main purposes. First, it creates the pictures in the simulation: the objects of VR are modeled by this software or imported from 3D models (designed in other CAD software such as AutoCAD, 3D Studio, or 3ds Max). Second, the VR software must then be able to simulate dynamics and object behavior. The above are the basic components of a VR technology system, as shown in Fig. 1.
Fig. 1 Basic VR system. Source Author
2.5 Analyzing the Situation of Teaching and Applying VR Technology in Some High Schools in Hanoi Today

Through teaching practice and through surveys and interviews with teachers who have taught in high schools for many years, we realize that teaching and learning with VR applications in high schools today is still limited, stemming from many different reasons, such as the teachers' teaching methods and the students' practical competencies. In the field of education, especially in Vietnamese high schools, VR technology is applied in many subjects, such as Mathematics, Physics, Chemistry, Biology, History, Technology, and Languages. There are 131 high schools in Hanoi, and according to the statistics, the number of high schools in Hanoi city using VR technology in subjects and topics does not exceed 50%. Through surveys and investigation of the methods commonly used in teaching at high schools, we find that most teachers still use traditional teaching methods and rarely use other methods such as teaching with the support of IT, VR applications, conversation, etc. Very few teachers pay attention to applying teaching methods that promote students' active learning. After administering a questionnaire to 350 10th grade students at Tay Ho High School, Hanoi, we obtained information about their degree of interest in using VR technology in their subjects, as shown in Table 1. The above investigation shows that students are not yet interested in learning with VR and teachers do not focus on applying VR with students. Therefore, we need to come up with appropriate solutions to encourage the spirit of learning, creative autonomy, exploration, and the application of VR technology in learning.
Table 1 Statistics of students' interest in applying VR

Interest level of students | Amount | Ratio %
Like so much | 23 | 6.6
Prefer | 46 | 13.1
Normal | 111 | 31.7
Dislike | 170 | 48.6

3 Some Measures for the Application of VR in Education in Vietnam

Over the past few years, we have learned about how virtual reality can transform the way we learn and teach, from providing in-depth knowledge and helping us
understand the content taught in the curriculum and complex topics, to enabling students to learn languages and take virtual trips in the best way. In the educational environment, VR technology is an effective tool for learning and teaching, but in Vietnam and around the world its deployment has been slow because it is an expensive, modern, and difficult tool for schools and countries to apply. However, according to many scientists' predictions, by 2020 VR will start to become a main trend in the field of educational technology, including classroom applications. Drawing on international experience, on teaching and learning practice, and on the conditions of Vietnamese schools, we strongly recommend the following solutions for applying VR in teaching and learning, for education in general and for high schools in particular.
3.1 Virtual Field Trips

Today, schools often use Google Expeditions to send students to remote and even inaccessible places on the planet on virtual field trips; it is one of the most popular applications of VR technology for learning [2]. Teachers can instruct students to download the Google Expeditions app for free on iOS or Android and invest in some cheap cardboard viewers that hold a smartphone. Students can actively explore everything from outer space to the deep sea with these simple headsets. These applications are used by teachers in Vietnamese high schools in Physics and Biology classes to study the planets of the universe or deep-sea creatures.
3.2 Language Immersion

The main foreign language that students learn in Vietnamese high schools is English. When students immerse themselves completely in VR, they listen to and speak the new language they are learning every day in the best way. When students become proficient at virtual role-playing, it is as if they could afford to fly to another country for a period of time. One such application is Holomia, which can be used with Oculus Rift headsets. The Holomia application allows learners to connect with people from all over the world and to practice their language skills while playing games and interacting with other students in a virtual world. In addition to Holomia, high schools, especially language schools, in Hanoi, Ho Chi Minh City, and many other cities in Vietnam have also used VR technology to teach English to students, and it has initially brought high efficiency.
3.3 Architecture and Design

Virtual reality technology is a great way to arouse student creativity and get them involved, especially when it comes to architecture and design in schools. Many Technology teachers in vocational centers in Hanoi have researched ways to apply VR technology in their fields and believe that it opens up countless possibilities in architectural design. When coming to these centers, students can experience, be creative, and apply the contents and topics learned in practice. The Oculus Rift hardware makes it possible for architects to take computer-generated 3D models and place viewers into those models in order to bring their plans to life. In Hanoi, Vietnam, a high school student used VR to build 3D models of Vietnamese historic sites and then visit them.
3.4 Skills Training

To help students practice real-world skills, we can use virtual reality simulations. Students can train for real situations without being in real situations. We know that people trained in VR learn faster and better than those who only watch video tutorials. In an interactive learning experiment on making coffee, students either watched YouTube tutorials on pulling espresso shots or were allowed to practice in VR. The results show that students who learned in VR made fewer mistakes and were quicker at pulling the espresso shots than those who watched the video tutorials. In high schools in Hanoi, Vietnam, teachers often let students preview videos to learn how something works and then practice it in detail, but when students practice with VR applications, the effect is much higher than when they just watch the video.
3.5 Special Education

In Hanoi and Ho Chi Minh City, Vietnam, high schools for special students have used Oculus Rift headsets in the classroom. To help arouse students' imagination and give them insight they would not otherwise have, teachers can let them use the Oculus Rift. For example, students can peek inside the Thang Long citadel in Hanoi, a temple, or the Hue palace, or watch a jet engine to understand how it all fits together, which makes the lessons much more realistic. Teachers should also note that lessons with VR applications exploring the planets and stars, the Thang Long citadel, or the Hue palace have a calming effect on their students, some of whom have some form of autism.
Fig. 2 VR Box glasses 3D image. Source Author
3.6 Distance Learning

The distance learning industry can apply virtual reality technology because it has great potential, and VR technology can improve learning outcomes for online students. In universities and high schools in Vietnam, especially during the Covid-19 epidemic season, students used VR chat technology. VRChat is an application that provides virtual online chat spaces where students with a VR headset can project themselves and interact with lecturers and other students. In addition, schools across the country have used Microsoft Teams or Zoom software to teach online, and these have proven very effective.
3.7 Game-Based Learning

Games can be used for learning with VR. To increase engagement and motivation, virtual reality can take this to a new level. We must note that VR games are not the only source of fun when using games and simulations in lessons. Experience shows that game-based learning is motivating because it is fun. The playing field is leveled: a player's gender, weight, or race does not have to interfere with their acceptance by other players. Applying VR, we can do things that we cannot do in real life. Virtual worlds have contributed to students' ability to learn from visual and kinesthetic experiences. In Vietnam, many technology and education companies have invented educational games to bring into high schools. For example, at the Hanoi-Amsterdam High School for the Gifted, Hanoi, we tested the game "conquer the dance", and students can also use a VR Box to play games (Fig. 2).
3.8 Improved Collaboration

Collaboration between teachers and students, in both distance learning and classroom-based teaching, has the potential to be greatly enhanced by VR technology.
We know that virtual and augmented reality simulations increase student motivation and improve collaboration and knowledge construction. Students have shown improvements in key areas, including reduced embarrassment when practicing their language skills and better social interaction between students. The above are eight solutions for applying VR technology in high schools in Vietnam. Through applied practice, we find that these solutions have been useful for teaching and learning subjects and topics in Vietnamese high schools effectively.
4 Research Trends

The research trends of VR have been compared and investigated over the last 50–60 years worldwide, across the categories of science, technology, architecture, the military, entertainment, tourism, real estate, medicine, the arts, and virtual tours, meeting all needs: research, education, commerce, and services. Especially in the field of education, VR technology is applied in teaching many topics, in many subjects, at different levels of education. VR technology is also applied to create learning environments in which learners experience and discover, and it helps us create vivid and useful digital materials for the teaching process. In addition, VR technology helps create activities and supports testing in learning and teaching. We know that the name "Virtual Reality" was coined by Jaron Lanier, founder of the Visual Programming Lab (VPL), in 1987. In 2014, Facebook bought Oculus VR and has been developing VR chat rooms for the future. According to many scientists in the world (for example, Ezawa), the application of VR technology in the field of education is still 10 years away from VR becoming a popular everyday technology among the population [3]. Today VR technology has developed toward the AR, XR, and MR technologies, which have a close relationship with VR technology. • Augmented Reality (AR) Augmented Reality technology is a combination of the real world and virtual information, without separating the virtual and real worlds as VR does. AR technology adds virtual details created by computers and smartphones to the real world to enhance the experience. Users can freely interact with virtual content in real life, such as by touching and grabbing. VR brings a whole virtual world to users, while AR is a unique combination of real and virtual worlds. These two terms are not always separate. • Mixed Reality (MR) A Mixed Reality experience is one that seamlessly blends the user's real-world environment and digitally created content, where both environments can coexist and interact with each other. It can often be found in VR experiences and installations
and can be understood to be a continuum on which pure VR and pure AR are both found, comparable to Immersive Entertainment/Hyper-Reality. 'Mixed Reality' has seen very broad usage as a marketing term, and many alternative definitions coexist today, some encompassing AR experiences, or experiences that move back and forth between VR and AR. However, the definition above is increasingly emerging as the agreed meaning of the term. • Extended Reality (XR) Extended Reality is the term covering all of VR, AR, and MR, as well as all future "realities" that technology may bring. XR is the intersection of different types of technologies and of how they will work together to support our everyday tasks. It refers to IT technology interwoven between reality and virtual reality (game spaces, CG video) through computers, headsets, etc. The development of VR alongside the AR, MR, and XR technologies above will help us better understand this field as well as know how to apply it for different purposes, including education.
5 Conclusion

This paper has investigated the recent research trends of VR, VR technology, and VR in Vietnamese high schools, and eight solutions for applying VR in Vietnamese high schools. In addition to the eight solutions mentioned above, we can apply Philosophical Theories and Virtual Campus Visit solutions to apply VR technology in Vietnamese high schools. We see that VR technology is necessary and plays a very important role in the teaching process at Vietnamese high schools. Hopefully, these solutions will partly help teachers and students in schools, as the use of VR technology in the classroom is being actively pursued to improve the quality of teaching and learning.
References

1. Giraldi, G., Silva, R., & de Oliveira, J. C. Introduction to virtual reality. LNCC, National Laboratory for Scientific Computing, Scientific Visualization and Virtual Reality Laboratory. https://www.lncc.br.
2. Stenger, M. 10 ways virtual reality is already being used in education. https://www.opencolleges.edu.au.
3. Martín-Gutiérrez, J., Mora, C. E., Añorbe-Díaz, B., & González-Marrero, A. (2017). Virtual technologies trends in education. EURASIA Journal of Mathematics Science and Technology Education, 13(2), 469–486. ISSN 1305-8223 (online), 1305-8215 (print). https://doi.org/10.12973/eurasia.2017.00626a. https://www.ejmste.com.
A Detailed Study on Implication of Big Data, Machine Learning and Cloud Computing in Healthcare Domain L. Abirami and J. Karthikeyan
Abstract This paper offers one of the promising solutions to limited human resources and decision support for medical practitioners. Cerebrovascular stroke (CVS) has recently become a critical global public health issue, acting as one of the leading causes of death. To address this problem, this paper presents a literature survey on the fusion of machine learning (ML) algorithms, the analysis of health data using big data applications, and cloud computing concepts to analyze and predict stroke severity level. Keywords Big data · Machine learning algorithms · Cloud computing · Cerebrovascular stroke
1 Introduction

In this paper, we present a deep literature review that helps analyze the efficiency of the proposed system in the healthcare domain by means of treatment planning for cerebrovascular stroke. The gathered feasibility study helps us better understand disease diagnosis methodologies. The upcoming sections of our paper focus on the different methodologies, techniques, classifications, performance measurements, and limitations faced in the health care field for processing and predicting vulnerable diseases. A wide amount of research has been performed in the health care field; improved machine learning classification and detection of CVS has been undertaken over the last few decades [1]. This paper deals with extensive research already done in the fields of big data, machine learning, and cloud computing for predicting cerebrovascular stroke.
2 Health Care Data Processing Using Big Data

Nowadays, the data used across many fields is growing steadily. This available data needs to be processed and analyzed effectively to produce the expected accurate results as well as to predict future directions. Every day, clinical experts deal with a huge amount of health data, such as computerized physician order entry, electronic medical records, clinical notes, medical images, cyber-physical systems, the medical Internet of Things, genomic data, and clinical decision support systems. In this survey [2], the authors investigated the key challenges, data sources, techniques, and future directions in the field of big data in human health care. They reviewed, in a user-friendly way, the various big data technologies that have been used to build an integrated healthcare analytics application. The primary challenge in health care data analytics is the absence of analysis tools for predicting critical stroke from a tremendous amount of health data. In [3], the authors proposed an intuitionistic fuzzy based decision tree to diagnose the various kinds of stroke disease. The intuitionistic fuzzy outcome is estimated using Hamming distance with intuitionistic fuzzy entropy. In this approach, using data derived from the intuitionistic entropy method, the authors identified stroke disease; with decision-tree-based analysis, the whole data set was processed and stroke was predicted with an accuracy of 90.59%. In [4], the authors used a qualitative research design with four focus groups, namely (i) variability; (ii) context of use; (iii) basic design features; and (iv) barriers to adopting the technology, for gathering health data about stroke patients. The gathered data were analyzed to develop general categories representing the stakeholder considerations for the development of stroke-specific wearable monitoring technology for the lower extremity. In developing countries like ours, the concentration of ambient sulfur dioxide (SO2) in the air has been causing severe stroke-related health problems. In [5], the authors collected data on ischemic and hemorrhagic strokes attributable to air pollution over a period of five years (2009–2014). They then applied a generalized additive model with a quasi-Poisson link to the gathered data to analyze the ratio of both stroke types across different age groups. Finally, they proved that the share of ischemic stroke for people with
90% of respondents preferred using smartphones to access the internet. Among 10% of those who prefer to use laptops or desktops, there were 7 ethnic minority people (4%) and 10 Kinh people (6%). With the question of self-assessment of household economic conditions, 30.5% said that their family was in a difficult situation, 68.3% earning the average income and 1.2% being better-off. In particular, most of those who have difficulty accessing the internet are in poor, remote, and isolated areas. For economic and development issues, people living in mountainous areas and rural areas of Vietnam often have to be poor and do not have the facilities to access the internet. Check the question: Do you encounter difficulties (scoring on a 5-point scale from easy to very difficult) in accessing and using the internet? We have received evidence for the hypothesis that the economic situation and living conditions will affect more internet access. Only 2 Hmong people in extremely difficult areas chose very difficult levels on a 5-point scale. The 4-point scale was selected by 29 people including 6.9% Dao; 3.4% Nung; 55.2% Tay; 6.9% Thai; 6.9% Mong; 3.4% Cao Lan and 17.2% Kinh, these subjects also live in remote areas. 46 people (71.7% in rural and remote areas) including 4.3% Cao Lan; 8.7% Dao; 2.2% Ha Nhi; 2.2% Han; 13% Nung; 28.3% Tay; 2.2% San Diu and 30.4% Kinh, scored the 3-point (the average level). At 3-point scale,
At the 2-point scale, there were 19 people: 5.3% Tay, 10.5% Nung, 5.3% Muong, 5.3% Dao, 5.3% Thai, and 68.4% Kinh, of whom 47.4% live in rural and 52.6% in urban areas. At the very convenient and easy level (1 point), there were 71 people: 4.2% Dao, 5.6% Cao Lan, 12.7% Tay, 1.4% Pa Then, 7% Nung, and 69% Kinh, of whom 60.6% live in urban areas and the rest in rural areas. The survey also recorded age and gender variables; however, there were only 41 such samples (24%), and the ages of the samples were quite uniform (70% from 19 to 25 years old, 19.8% from 26 to 35 years old, 10.2% over 35 years old), so these two variables had little effect on the hypothesis. • Examination of Affected Human Rights Ethnic minorities in Vietnam in particular, and in the world in general, have rarely been given adequate access to information, resulting in a lack of knowledge and communication capacity. In particular, because ethnic minorities are few in number, their cultural identity is always at risk of disappearing easily. On that basis, we asked a question about the need for internet access, with 166/167 expressing a need (99.4%). Given the need to preserve minority languages and identities, we asked about the need for internet access through interfaces in ethnic minority languages, with 52.1% expressing this need. Regarding infrastructure conditions and availability, in assessing the use and availability of high-speed internet services in Vietnam, including mountainous and midland areas, the Vietnam Internet Network Information Center (VNNIC) of the Ministry of Information and Communications published statistics from nearly 30,000 internet users in April 2020 [15]. The results show that Vietnam's internet access quality basically meets the standards: the average download speed reaches 61.69 Mbps for broadband networks and 39.44 Mbps for mobile networks. Thus, in general, the infrastructure for people to access the internet in Vietnam is guaranteed, even in mountainous and midland areas. However, because Vietnam is still a developing country, we received 99/167 responses (59.3%) saying that internet access was relatively expensive, and 103/167 responses (61.7%) feeling that the internet connection in their place of residence was slower than elsewhere. This extends the theoretical contributions of previous studies finding that the internet is more likely to be considered a scarce asset in developing countries [16]. Regarding the right to education, we asked about the need for distance learning during the period of social distancing caused by the Covid-19 pandemic in early 2020, with 152/167 responses (91%) expressing a need and still maintaining their schedule of learning and training via the internet. • Survey of Benefits or Limitations from the Internet 86.8% of people chose the answer Yes when asked: "Do you feel happy when using the internet, or bored without the internet?". We listed the 8 most popular online activities that generate economic value in Vietnam (shown in Table 1).
Table 1 Devices used to complete economic-value-creating online activities

Activity | Regular computer | Smartphone | Both | Mean difference | Std. deviation | t | df | Sig. (2-tailed)
… send or read email | 43 (25.8%) | 58 (34.7%) | 66 (39.5%) | 15 | 6 | 32 | 167 | 0.00
… use a search engine to find information | 6 (3.6%) | 79 (47.3%) | 82 (49.1%) | 73 | 35 | 26 | 167 | 0.00
… research a product or service before buying, or buy a product | 6 (3.6%) | 119 (71.3%) | 40 (24%) | 113 | 47 | 31 | 167 | 0.00
… research for education or training | 41 (24.6%) | 69 (41.3%) | 56 (33.5%) | 28 | 11 | 32 | 167 | 0.00
… get financial info online | 12 (7.2%) | 122 (73.1%) | 26 (15.6%) | 110 | 48 | 29 | 167 | 0.00
… look for info about a place to live | 98 (58.7%) | 14 (8.4%) | 33 (22.8%) | 84 | 35 | 31 | 167 | 0.00
… sell something online | 86 (51.5%) | 17 (10.2%) | 51 (30.5%) | 69 | 28 | 31 | 167 | 0.00
… buy or sell stocks, bonds, or mutual funds, or do any banking online | 73 (43.7%) | 16 (9.6%) | 35 (21%) | 57 | 23 | 32 | 167 | 0.00

Besides, we further realized that smartphones are more commonly used than conventional computers for online activities,
except for finding accommodation, selling online, or buying or selling stocks, etc. It can be seen from Table 1 that users tend to prefer conventional computers (laptops, desktops, or tablets) for activities that are riskier and require more careful selection. In contrast, we also obtained 167 responses on the ways internet access posed harmful risks or consequences. The main reasons included: wasting time in vain; addiction leading to neglect of one's main job; unhealthy use; abusing internet communication to infringe on the human rights of others; personal information failing to be kept confidential;
much information and many websites violating morals and laws without being censored; and cyberspace affecting real emotions and life, etc.
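As a consistency check rather than a re-analysis, the t statistics reported in Table 1 match a paired-difference test computed from the table's own summary columns, assuming n = df + 1 respondents:

```python
import math

def paired_t(mean_diff: float, std_dev: float, df: int) -> float:
    """t = mean difference / (s / sqrt(n)), taking n = df + 1."""
    n = df + 1
    return mean_diff / (std_dev / math.sqrt(n))

# Row "send or read email": mean difference 15, std. deviation 6, df 167.
print(round(paired_t(15, 6, 167)))   # ~32, matching the reported t
# Row "use a search engine": mean difference 73, std. deviation 35, df 167.
print(round(paired_t(73, 35, 167)))  # ~27, close to the reported 26
```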
5 Discussion

5.1 Hypothesis Verification

Through the aforementioned data, we continue to corroborate the theories developed by Brown et al. [8] and the World Bank [16] for developing countries. In general, in the mountains of Vietnam today, the boundary of the digital divide is still quite blurred, and the variable that most affects the digital divide in the mountains of Vietnam remains the economic situation and living conditions of the household. We also continue to affirm the inevitable trend of internet access as a new form of communication and social connection, especially in the educational environment [17], even in poor mountainous areas such as Tuyen Quang and Thai Nguyen. In particular, it is still noticeable that most people tend to rely more on smartphones, instead of laptops or tablets (5.4%) or desktops (4.8%), to access and use the internet. We continue to demonstrate that a digital divide on the basis of ethnicity is less common in Vietnam, a country with a very comprehensive system of ethnic policies [18]. In addition, we have established and begun to identify the picture of human rights in relation to internet access in the areas where ethnic minorities and Kinh people live in the mountainous regions of Vietnam today (see Sect. 5.2). These parameters not only advance our common understanding of how people access the internet, how they apply it to their needs in life, and how it affects their satisfaction with life, but also relate to the implementation of human rights in Vietnam. Thereby, across the barriers in the mountains of Vietnam, ethnic minorities have the opportunity to regain their inherently important positions, showing the great attraction of multi-ethnic culture in Vietnam in particular and in multi-ethnic nations in general.
5.2 Public Policy and the Exercise of Human Rights

We have provided more evidence for the studies of Brown et al. [8], Hoffman et al. [4], and Mossberger et al. [19], continuing to show that the digital divide has led to a new form of inequality based on the economic and professional backgrounds of individuals and families. This is because advances in digital science and technology do not give the same power to everyone [20]. Ethnic minorities and people living in areas with difficult economic conditions are likely to be denied access to the world's largest libraries, to education and training, to pricing and product information, and to policies and laws, and to have reduced opportunities for seeking jobs (especially jobs through
remote interviews), and even from engaging in e-government and securing other civil rights, if they cannot access the internet. In the context of the Covid-19 pandemic and the imperative to establish social distancing, the application of wireless technologies and smartphone apps is a matter of urgency, even quickly a global issue. It seems we no longer have time to ponder the question: "Should access to Information and Communication Technology (ICT) be considered a human right?". Ensuring the human rights related to internet access in the mountains of Vietnam has obviously become urgent. These rights include: the rights of ethnic minorities to electronic interfaces in minority languages; the right to access information; the right to education; the right to development; the right to employment via cyberspace (internet access); etc. Therefore, the responsibilities of nations under Clause 5, Article 4 of the UDRM, "Nations should consider appropriate measures to enable those of minorities to fully participate in the development and economic progress in their country" [21], must be executed immediately. One may also refer to General Comment No. 25 of the 57th Session (1996) of the Human Rights Commission on creating the necessary conditions for ethnic minorities living in disadvantaged areas to access the above group of rights.
5.3 Concluding Remarks

Vietnam is a unified country of 54 ethnic groups living together. Throughout its revolutionary leadership and at every step of each historical period, Vietnam has been determined to protect the rights of ethnic minorities in particular and human rights in general. Although still a developing country, with difficulties even in accessing the world's economies, Vietnam has made great efforts to understand and meet the needs of people nationwide. Among them, ethnic minorities must be guaranteed the right to education and the right to access information to raise their awareness, so that they can adjust and adapt, and can exploit, select, preserve, and promote cultural values for the purpose of development. From this perspective, the study provides more insight into the digital divide from a human rights-based approach, confirming the importance of information technology development even in difficult areas such as the Northern Uplands of Vietnam, in order to promote development momentum in this field and area in the coming period.
6 Limitations Because only a small number of samples were collected in the two Northern Uplands provinces of Vietnam, we may not have been able to find other unknowns in other
localities in Vietnam. Subsequent research can inherit our research theory and place it in broader contexts and samples in Vietnam. Acknowledgements We would like to thank the Board of Directors of Tan Trao University and the University of Science - Thai Nguyen University for helping us collect survey questionnaires in this study. Author Contributions Conceptualization, methodology, writing (original draft), D., D-N-M.; data analysis, T., T-L-T.; review and editing, T., T. All authors have read and agreed to the published version of the manuscript. Conflicts of Interest The authors declare no conflict of interest.
References

1. Bartikowski, B., Laroche, M., Jamal, A., & Yang, Z. (2018). The type-of-internet-access digital divide and the well-being of ethnic minority and majority consumers: A multi-country investigation. Journal of Business Research, 82, 373–380.
2. Mesch, G. S., & Talmud, I. (2011). Ethnic differences in Internet access: The role of occupation and exposure. Information, Communication & Society, 14(4), 445–471.
3. West, E., & Riggins, S. (1994). Ethnic minority media: An international perspective. Contemporary Sociology, 23, 315.
4. Hoffman, D. L., Novak, T. P., & Schlosser, A. E. (2001). The evolution of the digital divide: Examining the relationship of race to Internet access and usage over time. In The digital divide: Facing a crisis or creating a myth (pp. 47–97).
5. Anderson, R. H. (1995). Universal access to E-mail: Feasibility and societal implications. RAND, 1700 Main St., PO Box 2138, Santa Monica, CA 90407-2138.
6. Fong, E., Wellman, B., Kew, M., & Wilkes, R. (2001). Correlates of the digital divide: Individual, household and spatial variation. Office of Learning Technologies, Human Resources Development.
7. Nielsen. (2012). State of the Hispanic consumer: The Hispanic market imperative. https://www.nielsen.com. Accessed on 10 July, 2020.
8. Brown, A., López, G., & Lopez, M. H. (2016). Digital divide narrows for Latinos as more Spanish speakers and immigrants go online. Washington, DC: Pew Research Center. https://www.pewinternet.org. Accessed on 10 July, 2020.
9. Luận, Đ. X. (2019). Điện thoại thông minh thúc đẩy tiếp cận tín dụng của hộ gia đình ở Tây Bắc, Việt Nam: Hàm ý chính sách ứng dụng công nghệ số trong thúc đẩy tài chính toàn diện ở nông thôn [Smartphones promote household access to credit in Northwest Vietnam: Policy implications of applying digital technology to promote rural financial inclusion]. Tạp chí Nghiên cứu Kinh tế và Kinh doanh Châu Á (JED), 30(11).
10. Hoàng, T. T. H. (2017). Tình hình thư viện thế giới, thư viện Việt Nam và một số đề xuất đổi mới để phát triển [The state of libraries in the world and in Vietnam, and some proposals for innovation and development].
11. Hargittai, E. (2007). A framework for studying differences in people's digital media uses. In Grenzenlose Cyberwelt? (pp. 121–136). VS Verlag für Sozialwissenschaften.
12. Attewell, P. (2001). Comment: The first and second digital divides. Sociology of Education, 74(3), 252–259.
13. Tossell, C., Kortum, P., Shepard, C., Rahmati, A., & Zhong, L. (2015). Exploring smartphone addiction: Insights from long-term telemetric behavioral measures. International Journal of Interactive Mobile Technologies (iJIM), 9(2), 37–43.
14. Tuyen Quang (https://tuyenquang.gov.vn); Thai Nguyen (https://thainguyen.gov.vn). Accessed on 10 July, 2020.
A Review on Content Based Image Retrieval and Its Methods Towards Efficient Image Retrieval R. Raghavan and K. John Singh
Abstract The main aim of this paper is to review the importance of image retrieval with respect to the different forms of user requirements that depend on the given content. Requirements vary from one user to another in terms of content, and the content of an image can have many interpretations from different ends: priority may be given to retrieving images based on colour, on shape, or even on size. All such possible interpretations are reviewed in this paper, which helps in making a proper choice among the available methods according to the analysis required. This survey also focuses on the following agendas: minimising or discarding irrelevant images, using parallel retrieval systems to handle images with high-dimensional and complex features, and performing image retrieval in an unsupervised fashion. Keywords Content based image retrieval · Precision · Recall · Accuracy · Dimensionality reduction · Image features · Feature matching · Query image · Feature extraction · Retrieved images
1 Introduction Retrieving images has become an important activity in the domain of image processing. Digitization takes place in the form of scanning images, uploading images, and other forms of image handling in a wide variety of places, including schools, hospitals, digital malls, offices, marketing and related industries, and other places where images need to be handled in a digitized and bulky manner, and
R. Raghavan (B) · K. John Singh School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India e-mail: [email protected] K. John Singh e-mail: [email protected]
hence handling such a huge volume of images has become more challenging and even time consuming. To handle this challenging task, many measures have been taken from the viewpoints of time, accuracy, and efficiency. User interaction plays an important role in achieving these measures, and user interaction combined with other parameters has also been found fruitful in meeting them. The key element of user interaction is typically a query image, through which the requirements of the user are specified. It is important to interpret the given query image along various dimensions, such as colour, shape, and other related properties, so that the observed details help in fetching the nearest set of images, or sometimes even retrievals of very similar images. It is the algorithms and the methods they describe that deliver such accuracy. Sometimes the user interaction takes place in the form of feedback, which productively improves the accuracy rate by acting as a filtering mechanism that avoids unwanted images according to the requirements expressed through the feedback. Feedback always has such a positive impact in filtering the outputs to the desired level. Pipelining, as described in an algorithm, is the key to transferring the result of one stage to the next, where further refinements or improvements are made on the way to the aim of obtaining the nearest or the actual image retrieval. After obtaining the required image, we have to look into how efficiently the result was obtained: how many other results come along with the required output; whether the answer is the nearest or an exact one; and, if exact, whether an equivalent level of precision already exists in the literature or the proposed method is ahead of it. Efficiency may be oriented towards time, accuracy, requirement matching, fetching limited but exact results, or reduction; papers of all these kinds are discussed in this review. The architecture of a CBIR system can be either general or detailed. In both, the query image and the database images are ultimately compared, but the detailed architecture includes preprocessing stages, and the types of feature extraction are more elaborate. The similarity measure receives input from the feature database and from the feature vector of the query image, and the result is a filtered set of images, which should be the relevant image set. The general architecture describes the outline flow, whereas the detailed architecture elaborates on whether the system is single-feature or multi-feature based, and similar details in depth. The general architecture and the detailed architecture of CBIR are shown in Figs. 1 and 2.
Fig. 1 General architecture of CBIR system
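To make the architecture just described concrete, the matching stage can be written in a few lines of MATLAB. The snippet below is a minimal sketch, not any reviewed author's implementation: the file name 'query.jpg' and the folder 'database' are hypothetical, HOG stands in for whichever feature (colour, texture, shape) a system actually uses, and a plain Euclidean distance serves as the similarity measure.

% Minimal CBIR matching sketch: extract a feature vector from the query
% and from every database image, then rank by Euclidean distance.
queryImg = rgb2gray(imread('query.jpg'));            % assumes an RGB query image
queryVec = extractHOGFeatures(imresize(queryImg, [128 64]));
files = dir(fullfile('database', '*.jpg'));          % hypothetical image folder
dists = zeros(numel(files), 1);
for k = 1:numel(files)
    img = rgb2gray(imread(fullfile('database', files(k).name)));
    vec = extractHOGFeatures(imresize(img, [128 64]));
    dists(k) = norm(queryVec - vec);                 % similarity measure
end
[~, order] = sort(dists);                            % smallest distance first
topMatches = {files(order(1:min(5, end))).name};     % the retrieved image set

Any of the feature descriptors surveyed below can be substituted for the HOG call without changing the overall flow.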
2 Literature Review Jaspreet Singh Dhanoa and Anupam Garg proposed a new idea for extracting shape-oriented features using CBIR. Their aim suits the current environment, where the usage of digital images is increasing indefinitely, and the technique also fits the advancement of technology and multimedia handling. The effort of image annotation varies from person to person, and it proves time consuming when handled through text-based image retrieval. The authors chose shape because it is simple to use compared with other features. They tested the proposed algorithm against other related state-of-the-art techniques and found it efficient. Munjal and Bhatia [1] handled CBIR in a unique way through two different explorations: one through query-by-image retrieval and another through retrieval by ticket or by text. The authors fetch the feature vectors from the image source with technical support from MPEG-7 (Motion Picture Expert Group-7) and from the edge-direction descriptor. A GUI was also introduced as a highlighted tool for carrying out the retrieval effectively. Colour, texture, and shape are usually considered promising features; each visual feature has some specific property and addresses it in the concerned segment of the image. In [2] the authors combined both local and global features and devised their algorithm according to this principle. Some local features are used to find the corner points of the image; such detection was carried out using the BEMD method for edge detection and the HCD method to sense the corner points of an image. The non-local feature used is HSV. The authors tested with the COIL-100 database.
Fig. 2 Detailed architecture of CBIR system
Zhou and Yang proposed a parallel retrieval system for fetching images, aimed mainly at high responsiveness [3]. The core architecture of this system is cluster based. The high responsiveness is achieved because many embedded servers are used for retrieval to provide the CBIR service; hence the system is able to handle high-dimensional and complex features. It also has the advantage of being ready to handle many requests from many Internet users, and it adopts the Browser/Server mode.
The users are allowed to access the developed system through the enabled web pages. In this work a symmetric method named SCSF was used to represent the colour segment of the spatial features. SCSF was recommended because it produces good matching accuracy, is not dependent on image distortion, and can speed up the searching task. Seng et al. proposed an image retrieval system that carries out its task efficiently without being supervised by any user [4]. This unsupervised activity was applied to fetch image information related to blood cells; hence it is one of the few systems available to handle medical image-related queries. The performance of descriptors for retrieving images related to blood cells is examined through a CBIR model. The user gives the input in the form of a query image and also selects which method is to be used for implementation. After these inputs, histogram techniques based on non-monochrome data as well as wavelet-based methods are investigated for the provided input, and the way the indexing performed using the introduced descriptors was concluded positively after the analysis. By all these supporting means, the aim of the paper is well achieved. Shao et al. proposed a model that proves efficient in improving the retrieval process [5]. The key idea was to use additional text as a measure of semantic features, which rightly addresses medical image description, a problem still requiring a consistent solution. The additional text plays an important role in fetching a meaningful semantic. A hierarchy was introduced in handling the semantic features, since the sequence is an important factor in deriving a correct decision or conclusion, particularly in the medical domain. The authors combined this semantic feature with the low-level features and rightly concluded that the experimental results showed better precision and recall, since the additional text input refined the semantic features to a better level. Patil et al. proposed a novel hybrid approach [6] in which shape feature extraction was carried out in two ways: region based and contour based. The hybrid approach combines the Zernike and Hu moment models for feature extraction, with a support vector machine as the classifier. Defining metrics plays a vital role when performing the similarity comparison for the feature extraction method in use, in order to obtain efficient results. Siradjuddin et al. presented a method for feature extraction using a CNN with an autoencoder [7]. There are two layers: one performs encoding and the other decoding. Dimensionality reduction is mainly carried out in the encoding stage, and the reduced dimension proved advantageous in this work by producing results faster. The encoding stage also carries the core responsibility of extracting features in the CBIR, while the decoding layer reconstructs the representation so that the output of the autoencoder becomes close to the given data.
After extracting the features, the distance, as the difference between the database image and the image obtained
through the query, is calculated, and this calculation leads to retrieving some images; here the retrieval is found to be more relevant, as the results are reasonably good compared with other works in the literature. Tanioka proposed a much-needed form of scalable CBIR using visual features [8]. The core factor addressed by the authors is how to fetch images faster, an aim of practical importance since the current demand in the image domain is to carry out image analysis quickly and without error. Such speed is realised through advances in Elasticsearch, used with cosine similarity and the L2 norm. The use of a CNN proved outstanding in performance, and a deep CNN using the softmax function joins the same queue with even better efficiency. Some drawbacks limiting applicability to high-end big data applications are overcome in this method through a specified evaluation approach, and the evaluation results showed improved effectiveness and efficiency. Fadaei et al. expressed their logic impressively through database reduction in the retrieval process, discarding the irrelevant images [9]. The filtering of the irrelevant images was done through Zernike moments: an interval is calculated for the query, and images outside the interval limit are dropped. Hence the authors conclude that this new logic speeds up the retrieval task, achieved after performing the database reduction. Collins et al. proposed a technique that uses deep feature factorization to combine activations into a compact global image descriptor [10]. Through factorization of the convolutional neural network activations, the given input image is divided into semantic regions whose meaning can be derived through some means of representation, carried out by both spatial saliency heat maps and basis vectors. When combined into a global image descriptor, the experiments show that this technique surpasses the state of the art in both image retrieval and localization of the region of interest within the set of retrieved images. B. C. Mohan et al. proposed a new CBIR algorithm [11] to handle shape-oriented image features. Such features have a high impact in Internet and marketing applications, more specifically in filtering sets of images towards the shapes desired by the user according to the user's interest and necessity. Moments are widely used when handling shape-oriented features; among the different kinds, the authors selected Gaussian Hermite (GH) moments together with support vector machines to carry out the desired aim. The average retrieval accuracy and the performance metrics are found to be better compared with the other state-of-the-art methods. Ajam et al. proposed a novel retrieval system based on human vision [12]. Among the several features, the human eye mainly considers the content of the image, its texture, and its colour-related properties, and the features we use should match the human visual system as closely as possible. In this work the
authors focussed on extracting the texture part using binary patterns. Once the texture is extracted, two adjacent pixels with similar texture are calculated in HSV space, after which a histogram is introduced: the colour difference of two adjacent pixels helps construct it. The system then extracts features from the histogram, and these values describe the visual contents of the images in much detail. The entropy value calculated assists in selecting the effective features among the many available. The main advantage of this proposed method is that it allows the user to skip some of the implementation stages, which saves a lot of processing time. The retrieval rate of this method is significantly improved compared with the state-of-the-art methods described. Another CBIR scheme abstracts each image in the database in terms of statistical features computed using the MGA of NSCT; noise resilience is one of the main advantages of this feature representation. As we know, feedback from the user filters the unwanted output to some extent, which means the semantic gap starts to reduce or is much reduced. One such feedback mechanism, relevance feedback, was used in this work, together with a ranking scheme based on the obtained user feedback. To estimate the ranking, a graph-theoretic approach is used: a graph is constructed with edge values based on the similarity of image pairs under the feature representation, and images are tagged as relevant or non-relevant according to the relevance feedback in terms of probability. In [13] the authors produced a work on fast anisotropic Gauss filtering. In [14] the authors worked on a measure that performs compression faster, with a corresponding application to CBIR. CBIR applied to biometric security, using different features including colour, texture, and shape, controlled by a fuzzy-heuristics-based algorithm, was well carried out by the authors in [15]. In [16-20], the ideas and trends followed from the early years to the present day are well explored. Semantic-based image retrieval issues are well addressed in [21]. In addition, general reviews of CBIR, and particularly of trademark images and multimedia contents, are carried out in [22, 23]. Further reviews, along with image indexing techniques, are given in [24, 25].
3 Conclusion This review examines several papers with different methodologies under a single motto: how an image is to be retrieved efficiently. Such a survey is needed to recommend, or possibly to improve, the stated methods to a more efficient level; such efficiency would bring good accuracy in the precision and recall parameters. The statistics and values given in this review will certainly be useful for choosing any of the stated models for an author's current work directly, or for building the next, improved system from the existing level discussed.
References 1. Munjal, M. N., & Bhatia, S. (2019). A novel technique for effective image gallery search using content based image retrieval system. In International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon) (pp. 25-29). 2. Kavitha, H., & Sudhamani, M. V. (2014). Object based image retrieval from database using combined features. In Fifth International Conference on Signal and Image Processing (pp. 161-165). 3. Bing, Z., & Xin-xin, Y. (2010). A content-based parallel image retrieval system. In International Conference on Computer Design and Applications (pp. 332-336). 4. Seng, W. C., & Mirisaee, S. H. (2009). A content-based retrieval system for blood cells images. In International Conference on Future Computer and Communication (pp. 412-415). 5. Hong, S., Wen-cheng, C., & Li, T. (2005). Medical image description in content-based image retrieval. In IEEE Engineering in Medicine and Biology 27th Annual Conference (pp. 6336-6339). 6. Patil, D., Krishnan, S., & Gharge, S. (2019). Medical image retrieval by region based shape feature for CT images. In International Conference on Machine Learning, Big Data, Cloud and Parallel Computing. 7. Siradjuddin, A., Wardana, W. A., & Sophan, M. K. (2019). Feature extraction using self-supervised convolutional autoencoder for content based image retrieval (pp. 1-5). 8. Tanioka, H. (2019). A fast content-based image retrieval method using deep visual features. In International Conference on Document Analysis and Recognition Workshops (pp. 20-23). 9. Fadaei, S., Rashno, A., & Rashno, E. (2019). Content-based image retrieval speedup (pp. 1-5). 10. Collins, E., & Süsstrunk, S. (2019). Deep feature factorization for content-based image retrieval and localization. In IEEE International Conference on Image Processing (pp. 874-878). 11. Mohan, B. C., Chaitanya, T. K., & Tirupal, T. (2019). Fast and accurate content based image classification and retrieval using Gaussian Hermite moments applied to COIL 20 and COIL 100. In International Conference on Computing, Communication and Networking Technologies (pp. 1-5). 12. Ajam, A., Forghani, M., AlyanNezhadi, M. M., Qazanfari, H., & Amiri, Z. Content-based image retrieval using color difference histogram in image textures. In Iranian Conference on Signal Processing and Intelligent Systems (pp. 1-6). 13. Geusebroek, J.-M., Smeulders, A. W. M., & van de Weijer, J. (2003). Fast anisotropic Gauss filtering. IEEE Transactions on Image Processing, 12(8), 938-943. 14. Cerra, D., & Datcu, M. (2012). A fast compression-based similarity measure with applications to content-based image retrieval. Journal of Visual Communication and Image Representation, 23(2), 293-302. 15. Iqbal, K., Odetayo, M. O., & James, A. (2012). Content-based image retrieval approach for biometric security using colour, texture and shape features controlled by fuzzy heuristics. Journal of Computer and System Sciences, 78(4), 1258-1277. 16. Datta, R., Joshi, D., Li, J., & Wang, J. Z. (2008). Image retrieval: Ideas, influences and trends of the new age. ACM Computing Surveys, 40(1), 1-60. 17. Liu, Y., Zhang, D., Lu, G., & Ma, W. (2007). A survey of content-based image retrieval with high-level semantics. The Journal of the Pattern Recognition Society, 40(1), 262-282. 18. Smeulders, A. W., Worring, M., Santini, S., Gupta, A., & Jain, R. (2000). Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1349-1380. 19. Rui, Y., Huang, T.
S., & Chang, S. F. (1997). Image retrieval: Past, present and future. Journal of Visual Communication and Image Representation, 1-23. 20. Thilagam, M., & Arunish, K. (2018). Content-based image retrieval techniques: A review. In International Conference on Intelligent Computing and Communication for Smart World (pp. 106-110).
21. Hare, J. S., Lewis, P. H., Enser, P. G. B., & Sandom, J. (2006). Mind the gap: Another look at the problem of the semantic gap in image retrieval. Multimedia Content Analysis, Management and Retrieval, 17-19. 22. Eakins, J. P., Boardman, J. M., & Graham, M. E. (1998). Similarity retrieval of trademark images. IEEE Multimedia, 5(2), 53-63. 23. Jadhav, S. M., & Patil, V. S. (2012). Review of significant researches on multimedia information retrieval. In International Conference on Communication, Information & Computing Technology (ICCICT) (pp. 1-6). 24. Idris, F., & Panchanathan, S. (1997). Review of image and video indexing techniques. Journal of Visual Communication and Image Representation, 8(2), 146-166. 25. Dharani, T., & Aroquiaraj, I. L. (2013). A survey on content based image retrieval. In International Conference on Pattern Recognition, Informatics and Mobile Engineering (pp. 485-490).
Performance Analysis of Experimental Process in Virtual Chemistry Laboratory Using Software Based Virtual Experiment Platform and Its Implications in Learning Process Vu Thi Thu Hoai and Tran Thi Thu Thao Abstract The experimental chemistry competency is considered to be the core and specific competency that needs to be formed and developed in students in chemistry education. In recent times, some educators and teachers have been interested in researching measures to develop this competency in students. This article researches the principle of, and suggests a process for, constructing virtual experiments for the oxygen-sulfur unit of Chemistry 10, and applies it in teaching students at Tay Ho High School, Hanoi, Vietnam, in the school year 2019-2020. The initial experimental results show the feasibility and effectiveness of using virtual chemistry experiments to develop the experimental chemistry competency of students in teaching chemistry. Keywords Experimental chemistry competency · Virtual chemistry experiments · Students
1 Introduction The application of information technology in teaching to meet learning needs in the context of the Fourth Industrial Revolution has opened up innovation of teaching methods towards developing competencies for students, and is also indispensable to education innovation. One of the core and specific competencies is the experimental chemistry competency, which is formed and developed in students by many different measures, including the construction and use of virtual chemistry experiments. In addition, the experimental chemistry competency helps students remember and understand the nature of the taught knowledge and use such knowledge flexibly and accurately to do chemistry exercises and to settle practical
V. T. T. Hoai (B) · T. T. T. Thao VNU-University of Education, Hanoi, Vietnam e-mail: [email protected] T. T. T. Thao e-mail: [email protected]
situations. The experimental chemistry competency is also a basis for students to integrate into the world, in their professional field as well as in future life. Scientists and teachers around the world, as well as in Vietnam, have been interested in researching the construction and use of virtual chemistry experiments at high schools. Some recent studies showed that the use of virtual chemistry experiments is highly economical and especially appropriate for current teaching at high schools, given the shortage and non-synchronization of experimental facilities and equipment. Teachers can conduct and describe virtual experiments and instruct students to practise using pieces of software in the case of experiments that are toxic, difficult to observe, or dangerous, as in the oxygen-sulfur unit of Chemistry 10 [1]. Besides, virtual chemistry experiments are used as a preparatory tool that helps improve the experimental chemistry competency of students [2, 3]. Based on Woodfield et al. [4], who reported improvement in students' learning when using virtual chemistry experiments, we think that the use of virtual experiments combined with actual laboratory activities in teaching will create good conditions for students to develop their skills and competencies, including the experimental chemistry competency.
2 Definition and Structure of the Experimental Chemistry Competency According to the document [5], the "experimental chemistry competency" of students is their ability to use existing or acquired knowledge, skills, and experience to design and organize the safe and successful conduct of chemistry experiments and to scientifically explain the observed experimental phenomena in order to draw necessary conclusions, form new knowledge and skills, and apply them in learning as well as in practical problem-solving. According to the authors of the research works [6-8], the structure of the experimental chemistry competency consists of 4 competencies: • Competency for selecting, conducting and using experiments safely and accurately; • Competency for forecasting, observing, describing and explaining experimental phenomena; • Competency for processing experiment-related information and concluding; • Competency for proposing, conducting, and verifying successful experiments.
2.1 Bases for Selecting NB Chemical Virtual Chemistry Experiment Software in Teaching Chemistry to Develop the Experimental Chemistry Competency for Students When considering the use of real and virtual experiments in teaching chemistry at high schools in the present period, we find the following issues. In case a teacher conducts real experiments in the class for students' observation, most experimental tools are small, the class is crowded, and the classroom is wide. For this reason, when the teacher conducts experiments, not every student can easily observe them; students at the back of the classroom can only listen to the teacher and cannot observe how the experiments are conducted or the experimental phenomena. Meanwhile, virtual experiments are conducted on a projector screen. Normally, the projector screen is placed so that all students in the class can easily observe the experiments. Besides, the teacher can increase the sizes of the experimental tools so that all students, even those at the back of the classroom, can easily observe the experiments. Another issue is experiment safety. If an experiment is conducted using real tools and chemicals, unexpected fire and explosion can sometimes occur due to negligence. However, virtual experiments are absolutely safe: if chemicals are wrongly used, the fire and explosion occurring on the computer screen are just virtual, not real. Besides, not every real experiment is successful, whereas virtual experiments are preprogrammed, nearly all of them are accurate, and they can be conducted efficiently as expected. The next issue is the preparation of experimental tools. In the high school curriculum, experiments are required for nearly every hour of chemistry. In the case of simple experiments with few tools, teachers can easily prepare the tools and move from one class to another; however, in the case of experiments requiring cumbersome tools, this is not simple. For virtual experiments, teachers are not anxious about this problem: the tools are available in the software, so teachers only install the experiment design software on the computer and can feel secure about experimental tools. At present, there are many pieces of software for constructing virtual chemistry experiments, such as Crocodile, ChemDraw Ultra 8.0, Yenka, Macromedia Flash, Chemwindow, Chemlab Portable virtual chemistry lab, Portable crocodile chemistry, Chemist by thix, NBchemical, Science Teacher's Helper, etc. Among them, the two pieces of software for constructing virtual chemistry experiments studied in the present article are Chemist by thix and NBchemical. For Chemist by thix, the interface is easy to interact with, the operation is easy, and the simulated tools and chemicals closely match reality, so this software can develop the experimental chemistry competency of students. Nevertheless, Chemist by thix has disadvantages: chemicals in the software require fee payment, and students need competency in chemical metering. Based on the above-mentioned analysis, we use the NBchemical software to design virtual chemistry experiments with a view to developing the experimental chemistry competency in teaching lessons on oxygen-sulfur Chemistry 10.
2.2 Principle and Process of Constructing Virtual Chemistry Experiments According to Huong and Chung [9], the principle of constructing virtual chemistry experiments in teaching chemistry to develop the experimental chemistry competency for students is that experiments must be scientific and accurate in terms of experiment practice skills, every experiment must be aesthetic and the implementation steps must be proper, attractive and lively with obvious and quick phenomena. Based on NB chemical virtual chemistry experiment software, the definition and structure of the experimental chemistry competency and reference to the documents [10, 11], the process of constructing virtual chemistry experiments to develop the experimental chemistry competency for students in teaching chemistry consists of 5 steps: Step 1: Surveying virtual chemistry experiment lesson contents; Step 2: Constructing a virtual chemistry experiment scenario; Step 3: Constructing an experiment framework; Step 4: Constructing experiments; and Step 5: Adjustment.
2.3 Using Virtual Chemistry Experiments in Lessons on Oxygen-Sulfur Chemistry 10 in Teaching Chemistry to Develop the Experimental Chemistry Competency for Students Based on the principle and process of constructing virtual chemistry experiments and the study of the document [10], we propose the following process of using virtual experiments in teaching chemistry to develop the experimental chemistry competency in lessons on oxygen-sulfur Chemistry 10 for students: • Step 1. Working out a lesson plan for oxygen-sulfur; • Step 2. Preparing a lesson for which virtual experiments can be conducted: – Analyzing the contents of the lesson on oxygen-sulfur and selecting proper experiments to design as virtual experiments. For example, the reaction of sulfuric acid (H2SO4) can result in the reducing product sulfur dioxide (SO2), which is a toxic gas and can cause respiratory tract infection if inhaled. – Instructing students to install the NB chemical virtual chemistry experiment software on the computer for presentation preparation. • Step 3. Describing the virtual experiment process in lessons on oxygen-sulfur: – Identifying and presenting the effect of objects in virtual experiments; – Stating the experiment process, describing the operations in virtual experiments, and carrying out comparison when working with real objects; – Forecasting the experiment results which will be obtained.
• Step 4. Conducting virtual experiments in lessons on oxygen-sulfur: – Teachers instruct students to conduct virtual experiments and use questions to lead students through the virtual experiment process; – Students observe and practise the operations; – Students record and process data obtained from the experiments. • Step 5. Discussion, assessment and summary: – Students can explain the obtained experiment results and draw scientific conclusions. – Students can analyze and assess the experiment results and then propose and adjust the experiments. Following the process of constructing virtual chemistry experiments in lessons on oxygen-sulfur, the article used the "NBchemical" software to design the virtual chemistry experiments "Preparing and testing particular properties of sulfur dioxide and carbon dioxide in a laboratory" and "Water-retaining ability of sulfuric acid", and tested them in teaching chemistry according to the following process: Step 1: Determining experiment objectives. • Students can select the necessary tools and chemicals when conducting experiments and can use other chemicals as replacements; • Students can present the process of conducting experiments safely; • Students can forecast and explain the phenomenon and write the chemical equation for the reaction. Step 2: Determining tools and chemicals and steps for conducting experiments. • Proposing experiment methods; • Determining the process of implementing the proposed experiment method. Step 3: Conducting the virtual chemistry experiments "Preparing and testing particular properties of sulfur dioxide and carbon dioxide in a laboratory" and "Water-retaining ability of sulfuric acid". The use of the NBchemical software simulates the process of preparing and testing the properties in detail, with easy observation and safety for students. 1. Install NBchemical software Access the web: https://www.nobook.com/v2/chemistry.html and download the software; it is possible to download it to a computer or mobile device or to use it directly on the website, then run the NBchemical software and begin to conduct virtual experiments. 2. Conduct experiments Start the NB chemical software: the experiment interface of the software is as in Fig. 1. 3. Carry out the virtual chemistry experiment 1: "Prepare and test the characteristics of sulfur dioxide and carbon dioxide in a laboratory".
Fig. 1 Some main function icons on the interface screen
Step 1: Click the icons for selecting the tools and chemicals to prepare the experimental practice, Fig. 2. • Tools, chemicals (replaceable): charcoal, concentrated sulfuric acid (H2SO4) (18.0 mol/l), fuchsin staining solution (C20H20N3·HCl), KMnO4 solution (0.119 mol/l), Ca(OH)2 solution (0.021 mol/l), NaOH solution (1.712 mol/l), U-shaped test tube containing anhydrous copper (II) sulfate (CuSO4), alcohol burner, test tubes, tongs, and a gas-absorbing set, as in Fig. 2.
Fig. 2 Prepare the tools and chemicals to test the characteristics of CO2 , SO2
Fig. 3 Fit the tools and chemicals
• Cautions for performing a safe and successful experiment: because concentrated sulfuric acid has a water-retaining ability, it will cause severe burns if it directly contacts the skin of the hands, so students must be very careful when performing the experiment and must follow the safety rules. • Guide the air pipe into the test tube containing the test solution. Students predict the phenomena of the reactions. Step 2: Fit the tools as in the drawing, Fig. 3. • Put several charcoal pieces into a 250 ml flask. Add slowly 40 ml of concentrated sulfuric acid into the separatory funnel. • Pour about 22 ml of fuchsin staining solution into a test tube. Then add the KMnO4, Ca(OH)2, and NaOH solutions respectively into the remaining bottles. • Fit the air pipe from the flask to the U-shaped test tube and the solution containers. Step 3: Observe the phenomena, explain, and write the chemical equations, Fig. 4. Unlock the separatory funnel and light the alcohol burner. The phenomena shown to students: • In the U-shaped test tube, anhydrous copper (II) sulfate (CuSO4) turns from white to blue. • The fuchsin solution pales and gradually fades away. • The colour of the permanganate gradually fades. • The clear lime solution turns cloudy. Explain the phenomena and write the chemical equations of the reactions.
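For reference, the standard equations behind these phenomena (a textbook note added here, not output of the NB chemical software) are:

C + 2H2SO4 (concentrated, heated) → CO2↑ + 2SO2↑ + 2H2O
CuSO4 + 5H2O → CuSO4·5H2O (the white anhydrous salt turns blue, confirming water vapour)
5SO2 + 2KMnO4 + 2H2O → K2SO4 + 2MnSO4 + 2H2SO4 (the violet colour fades)
SO2 + Ca(OH)2 → CaSO3↓ + H2O and CO2 + Ca(OH)2 → CaCO3↓ + H2O (the limewater turns cloudy)

SO2 also bleaches the fuchsin solution by forming a colourless addition compound, and excess gas is trapped by the NaOH solution: SO2 + 2NaOH → Na2SO3 + H2O.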
Fig. 4 Observe experimental phenomenon
Fig. 5 Experimental tools and chemicals to test water-retaining ability of sulfuric acid
Experiment 2: "Water-retaining ability of concentrated sulfuric acid". Step 1: Prepare the experimental tools and chemicals, Fig. 5. Note that concentrated sulfuric acid must be used, with a concentration of 18.0 mol/l, Fig. 5. Step 2: Pour glucose into a 250 ml beaker. Then add concentrated H2SO4 solution, observe the phenomenon, and write the chemical equation, Fig. 6.
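For reference (again a standard textbook note, not taken from the software): concentrated sulfuric acid dehydrates glucose,

C6H12O6 → 6C + 6H2O (with concentrated H2SO4 as the dehydrating agent),

and the carbon formed can react further, C + 2H2SO4 → CO2↑ + 2SO2↑ + 2H2O, which typically makes the black mass swell and rise.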
2.4 Pedagogical Experiment To evaluate the effectiveness of the application of NB chemical virtual chemistry experiment software, we carried out a pedagogical experiment at Tay Ho High
Fig. 6 Experimental phenomenon that proves the water-retaining ability of concentrated sulfuric acid
School–Hanoi, class 10D5 (44 students), in the school year 2019-2020. The experimental results were analysed based on observation and on survey results collected by questionnaire in the experimental class. Below are some images from the experimental process, Figs. 7, 8, 9 and 10. The survey results showed that 92% of the students had never heard of virtual chemistry experiment software, proving that teaching with virtual chemistry experiments is still new to high school students. Regarding the NB chemical virtual chemistry experiment software, 86.21% of the students assessed that the interface of the software was beautiful and simulated real experiments well; 79.39% assessed that the experimental phenomena were easy to observe and confirmed that NB chemical was easy to use for carrying out the experiments. The NB chemical virtual chemistry experiment software helped students absorb the lesson contents well, giving them initiative and activeness in searching and studying the lessons in a safe way (86.49%). Additionally, most of the students could develop their Fig. 7 Teacher instructs students to use NBchemical software
Fig. 8 Student practices experiments
Fig. 9 Teacher supports students to do the operations on the software
Fig. 10 Student presents the phenomenon and writes the chemical equation of illustrated reaction
experimental chemistry competency through the virtual chemistry experiment software (about 78.42%). Most students could choose the tools and chemicals and describe and explain the experimental phenomena. However, the language of the NB chemical software is Chinese, which causes some difficulties in interaction, and some chemicals are not available and must be replaced or prepared. During the experimental process, some students were able to process the related data, while others met difficulties in quantitative calculation and in using substitute chemicals. After the lesson, 92.52% of the students agreed to continue studying with the NB chemical virtual chemistry experiment software, and about 7.48% of the students did not like
to study by this method due to the difficulties in choosing alternative chemicals to perform experiments.
3 Conclusion The experimental results initially affirm that the application of NB chemical software to construct virtual chemistry experiments in teaching creates conditions for students to promote their initiative and creativity in learning and develops some typical competencies of chemistry, especially the experimental chemistry competency. Moreover, virtual chemistry experiment software can easily perform several complex reactions, as well as reactions that produce products toxic to the testers and the surroundings. An outstanding advantage of the virtual chemistry experiment is its suitability given the shortage and poor quality of facilities at many high schools in Vietnam. At present, however, the virtual chemistry experiment software covers only the preparation of substances in laboratories, not in industry, and does not fully match the teaching content of the high school chemistry program or suit all groups of students; therefore, teachers need to research and select appropriate experimental content so that it can be used effectively in teaching towards competency development for high school students in chemistry.
References 1. Dung, N. T. H. (2015). Applying Crocodile chemistry software to design virtual experiment models in teaching experimental chemistry practice. Vietnam Journal of Education, special edition, 74–76. 2. Martinez-Jimenez, P., Pontes-Pedrajas, A., Polo, J., & Climent-Bellido, M. S. (2003). Learning in chemistry with virtual laboratories. Journal of Chemical Education, 80(3), 346–352. 3. Ministry of Education and Training. (2018). High School Education Program—Overall Program (Promulgated together with Circular No. 32/2018/TT-BGDDT dated December, Minister of Education and Training, Hanoi, Vietnam). 4. Woodfield, B. F., Andrus, M. B., Andersen, T., Miller, J., Simmons, B., Stanger, R., et al. (2005). The virtual chemlab project: A realistic and sophisticated simulation of organic synthesis and organic qualitative analysis. Journal of Chemical Education, 82(11), 1728–1735. 5. Hoai, V. T. T., Hanh, D. H., Hong, N. T. B., & Hien, B. T. (2017). Fostering teachers about the development of experimental chemistry competency for high school students. In Proceedings of the International Science Conference: “Developing pedagogical competency for teachers of Natural Sciences to meet the requirements of innovation of high school education”. University of Education Publishing House, Hanoi, Vietnam (pp. 289–297). 6. Anh, N. T. K., & Thuy, N. N. (2018). Using the system of practice exercises of non-metallic experiments in teaching chemistry to develop experimental competency for grade 11 students. Vietnam Journal of Education, special edition Period 2 of May, 200–205. 7. Russell, J. W., Kozma, R. B., Jones, T., Wykoff, J., Marx, N., & Davis, J. (1997). Use of simultaneous-synchronized macroscopic, microscopic, and symbolic representations to enhance the teaching and learning of chemical concepts. Journal of Chemical Education, 74, 330–334.
8. Tinh, P. T., & Hoai, V. T. T. (2019). The reality of design and use of experimental exercises to develop experimental chemistry competency in High Schools of Huong Son District, Ha Tinh Province. Vietnam Journal of Education, 455 Period 1 of June, 43–50. 9. Huong, B. M., & Chung, N. H. (2019). Developing chemistry self-studying competency for high school students through using experimental chemistry teaching software. Summary record of the first international seminar on innovation of teacher training, Hanoi, Vietnam (pp. 265–274). 10. Hoai, V. T. T., & Trang, V. T. (2020). Using “Chemist by thix” software to construct virtual chemistry experiments for the development of experimental chemistry competency of high school students. Vietnam Journal of Education, 470, 40–45. 11. Tatli, Z., & Ayas, A. (2013). Effect of a virtual chemistry laboratory on students’ achievement. Educational Technology & Society, 159–170.
Human Detection in Still Images Using Hog with SVM: A Novel Approach Shanmugasundaram Marappan, Prakash Kuppuswamy, Rajan John, and N. Shanmugavadivu
Abstract Human detection in images has become popular for applications including night vision, robotics, and surveillance. Existing systems in the literature support detecting single and multiple human objects, but algorithms for detecting humans under occlusion face many difficulties. This paper proposes a method to detect human objects under both multiple-object and occluded conditions. To address this problem, the method uses HOG for describing features and SVM for classifying objects. To examine the proposed algorithm, different sets of images are used, consisting of human and non-human objects such as buildings, trees, lawns, and roads. Features are obtained from the image dataset and labels are assigned to differentiate human from non-human images; the feature vectors and labels are then stored as a model for training the detector. SVM is used to train the detector: the feature vectors and labels are fed into the SVM. The detector slices the image into sub-windows and processes each for feature extraction, using the trained model to classify new images for detection. Keywords Human detection · SVM · HOG · Occluded · Fusion
S. Marappan (B) · P. Kuppuswamy · R. John Jazan University, Jazan, Kingdom of Saudi Arabia e-mail: [email protected] P. Kuppuswamy e-mail: [email protected] R. John e-mail: [email protected] N. Shanmugavadivu RVS College of Engineering & Technology, Coimbatore, India e-mail: [email protected]
1 Introduction Image processing is a procedure of performing significant operations on images in order to obtain an enhanced image or to extract useful information from it. Processing images digitally by computing algorithms is known as DIP (Digital Image Processing). Mathematically, images are represented in two dimensions, but DIP supports multidimensional imaging systems as well. Basically, it includes the following three steps: (i) importing the image via image acquisition tools; (ii) analyzing and manipulating the image; (iii) producing output, which can be an altered image or a report based on the image analysis. DIP supports many decision-making solutions, which has led to many real-world applications, and most problems in other fields can be solved with the help of image processing techniques [1]. For example, with the help of image processing, the water level in a dam can be measured when computing the water evaporation rate. In mechanical and automobile engineering, engine capacity and speed can be determined from the fluctuation rate of the flame by taking continuous images and processing them frequently [2]. In medicine, many severe problems can be addressed through the physician's decisions based on images of the inner organs taken by different imaging modalities [1, 3]. Similarly, plenty of useful applications can be developed and used in many industries, giving significant benefits to human life. In today's fast life, imaging devices have been installed in common places such as roads, shops from small traders to huge malls, and places where large crowds appear. The purpose of fitting camera devices there is to monitor people and their activities. With this setup, many activities can be recorded based on the events individuals take part in, which can help in making serious decisions and providing alerts to avoid unwanted events, for example, automatic traffic control for monitoring pedestrians [4-6] on the road, or an alarm message when an enemy penetrates the border of a country. Solutions for such requirements are possible by processing the image through a series of logical steps. In this series, finding humans in images is an important step for many application areas [7, 8]. Thousands of algorithms have been presented by many research scholars for face detection; in fact, face detection may be the first step in the problem of human detection. But it may not be sufficient, because the different poses of humans lead to challenges [9, 10] in identifying the human structure. Unlike other objects, a human is dynamic and wears clothes of different colors, which makes detecting the required human object difficult [11, 12]. Technically, the process of detecting objects in an image consists of several steps [13]. First and foremost, unwanted information such as the image background has to be removed, and the outline of each object derived. The image then contains only the edges of the objects, and their geometrical structure is highlighted [13, 14]. To do this, edge detection techniques, for example Sobel and Canny [14], and background subtraction techniques can be used. Finally, objects appearing in the image can be grouped and classified using machine learning
techniques such as AdaBoost [15], neural networks, and the Support Vector Machine (SVM) [16-20]. This paper presents a novel human detection method supported by an improved Histogram of Oriented Gradients (HOG) [21] for deriving edge features and an SVM for classifying and identifying which objects are human. Section 2 gives a brief introduction to HOG and its functionality. The proposed method for detecting humans in a single image is provided in Sect. 3. In order to assess the proposed method, images are tested by the given algorithm and the results are discussed in Sect. 4.
2 Histogram of Oriented Gradients—HOG HOG is used in image processing to describe features of objects [22]. In this connection, HOG helps detect human objects using a single detection window technique. It uses a global feature to detect the human object instead of using local features.
2.1 Basic Algorithm Principle Objects in an image are represented as gradients with different orientations in HOG. A histogram is generated from the collection of gradients of the pixel values of the corresponding cells of a region.
2.2 Normalize Gamma and Color Usually, color and gamma normalization take place as separate steps before computing feature descriptors. In HOG, however, this normalization is achieved by computing the descriptors directly.
2.3 Computing the Gradients Image gradients are computed for each pixel. For this purpose, HOG convolves the image with simple one-dimensional derivative masks; the resulting gradients are later accumulated over 8 * 8 cells. The masks are applied as

$G_x(x, y) = [-1\ 0\ 1] * I(x, y), \quad G_y(x, y) = [-1\ 0\ 1]^T * I(x, y)$ (1)

The magnitude at each pixel is calculated using

$G = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}$ (2)

The angle is calculated using

$\theta = \tan^{-1}\left( \frac{G_y(x, y)}{G_x(x, y)} \right)$ (3)

One of the main applications of gradient vectors is edge detection and feature extraction.
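A minimal MATLAB sketch of Eqs. (1)-(3); the input file name is hypothetical, and conv2 is used for the 1-D convolutions (its kernel flip only changes the sign of the derivatives, which does not affect the magnitude):

I  = double(rgb2gray(imread('person.jpg')));  % grayscale image as double
Gx = conv2(I, [-1 0 1],  'same');             % horizontal gradient, Eq. (1)
Gy = conv2(I, [-1 0 1]', 'same');             % vertical gradient, Eq. (1)
mag   = sqrt(Gx.^2 + Gy.^2);                  % gradient magnitude, Eq. (2)
theta = atan2d(Gy, Gx);                       % gradient angle in degrees, Eq. (3)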
2.4 Gradient Orientation The gradient vectors computed in the previous step are the input for calculating orientations. Each cell consists of 8 * 8, i.e., 64 pixels. The 64 gradient vectors computed from the 64 pixels in a cell are poured into a 9-bin histogram. For signed gradients the range 0°-360° is used (for unsigned gradients, 0°-180°).
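Continuing the sketch above, the per-cell voting can be written as follows. This simplified version uses unsigned orientations and hard bin assignment, omitting the interpolation discussed in the next subsection:

cellSz = 8; nBins = 9;
ang = mod(theta, 180);                        % unsigned orientation in [0, 180)
[rows, cols] = size(mag);
nCy = floor(rows/cellSz); nCx = floor(cols/cellSz);
hist9 = zeros(nCy, nCx, nBins);               % one 9-bin histogram per cell
for cy = 1:nCy
    for cx = 1:nCx
        ry = (cy-1)*cellSz + (1:cellSz);      % pixel rows of this 8 * 8 cell
        rx = (cx-1)*cellSz + (1:cellSz);
        a = ang(ry, rx); m = mag(ry, rx);
        bin = min(floor(a/20) + 1, nBins);    % 20-degree-wide bins
        for b = 1:nBins
            hist9(cy, cx, b) = sum(m(bin == b));  % magnitude-weighted votes
        end
    end
end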
2.5 Contrast Normalize Over Overlapping Spatial Blocks A naïve approach would be to just do the above step for every cell in the window/image, but this would lead to a side effect called aliasing. The image should be partitioned into spatially connected (preferably overlapping) blocks of 2 * 2 cells to reduce this effect; a block is made up of 2 * 2, i.e., 4 cells. So instead of performing linear histogram interpolation for every cell on its own, it is better to interpolate tri-linearly: 1. bi-linearly into spatial cells; 2. linearly into orientation bins. Let v denote the unnormalized block descriptor. The common normalization schemes are:

L2-norm: $v \rightarrow v / \sqrt{\|v\|_2^2 + \varepsilon^2}$ (4)

L1-norm: $v \rightarrow v / (\|v\|_1 + \varepsilon)$ (5)

L1-sqrt: $v \rightarrow \sqrt{v / (\|v\|_1 + \varepsilon)}$ (6)

L2-Hys block normalization (L2-norm followed by clipping the values and renormalizing) is implemented in this project.
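As a sketch, L2-Hys normalization of a single 2 * 2-cell block can be written as below, reusing the hist9 array from the previous sketch; the clipping threshold of 0.2 is the value commonly used with HOG and is an assumption here, since the paper does not state it:

blk = hist9(1:2, 1:2, :);                     % one block = 2 * 2 cells
v   = blk(:);                                 % 36-D block vector (4 cells * 9 bins)
epsv = 1e-4;                                  % small constant, the epsilon in Eq. (4)
v = v / sqrt(sum(v.^2) + epsv^2);             % L2-norm, Eq. (4)
v = min(v, 0.2);                              % clip large values (the "Hys" step)
v = v / sqrt(sum(v.^2) + epsv^2);             % renormalize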
2.6 Mask Detection—HOG The histograms computed for each block are concatenated together to form one final feature descriptor. This final descriptor is a row feature vector of dimension 1 * N, and it contains all the details about an image.
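For reference, MATLAB's Computer Vision System Toolbox function extractHOGFeatures performs all of the above steps internally. With 8 * 8 cells, 2 * 2-cell blocks, 9 bins, and the default half-block overlap, a 128 * 64 window yields a 1 * 3780 row vector; this sketch assumes the window has already been cropped or resized:

win = imresize(uint8(I), [128 64]);           % one detection window
descriptor = extractHOGFeatures(win, ...
    'CellSize', [8 8], 'BlockSize', [2 2], 'NumBins', 9);  % 1 * 3780 vector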
3 Proposed Methodology This paper elaborates on how human detection in an image can be achieved and the factors involved in it. Apart from using edge detection algorithms to find the edges of objects, this paper uses HOG features. The methodology is two-fold: feature extraction and classification (Fig. 1). In the first phase, cropped human images and non-human images are collected. Human images are considered positive images and non-human images negative images; this distinction makes it easy to differentiate the image dataset. This image collection is often termed the 'Training Dataset'. After this step, labeling is done for every image so that it can be used for classification. In the next phase, the HOG features are taken from all the images in the training dataset and stored in a vector. This vector and the labeling are stored in a model, and this model is used to train the classifier. SVM is chosen as the learning classifier and is trained with the model [23]. A new image from the testing dataset is applied to the detection algorithm, which uses the sliding window technique to slice a sub-window from the image and classifies it with the trained SVM model. When the algorithm hits a detection, it draws a bounding box over the human.
3.1 Feature Extraction Reducing unwanted data from the image helps the algorithm run fast and provides only the necessary features. This process is called "feature extraction". The features are related to the input data, so that comparison can be made easily. The Histogram of Oriented
Fig. 1 Methodology to detect humans in a single image
Gradients is used as the feature extraction algorithm. It is unique in that it is invariant to illumination and cluttered backgrounds.
3.2 Training Dataset 'n' cropped images of humans in standing or walking poses can be considered as the positive training dataset, and 'n' cropped images of buildings and trees, which possess non-human characteristics, can be considered as the negative training dataset (Fig. 2).
3.3 Labeling

Each image is labeled: positive images are labeled +1 and negative images are labeled −1.
Fig. 2 Human detection pseudo code
Step 1: Read human and non-human images and load them into the training dataset
Step 2: Label human images +ve and non-human images −ve
Step 3: Extract HOG features from all images
Step 4: Save the model and submit it to the SVM to create the trained model
Step 5: Read the testing image
Step 6: Convert it into grey scale
Step 7: Extract HOG features using the sliding window technique
3.4 Extracting HOG Features

For every image in the training dataset, the HOG features are extracted, a label is assigned, and the result is saved as a model. Since the dataset is small, flipped versions of the original images and shifted versions of the image arrays are added for more rigorous training. This model is used to train the SVM structure for classification.
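The paper performs this step in MATLAB; as an illustration only, the following is a scikit-learn analogue of the training procedure just described (HOG extraction, flip augmentation, +1/−1 labels, linear SVM). The function names here are illustrative assumptions:

import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def extract(img):
    return hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

def train(pos_images, neg_images):
    # pos_images / neg_images: lists of equally sized grayscale arrays
    X, y = [], []
    for img in pos_images:
        for aug in (img, np.fliplr(img)):    # add flipped copies, as above
            X.append(extract(aug)); y.append(+1)
    for img in neg_images:
        X.append(extract(img)); y.append(-1)
    return LinearSVC().fit(np.array(X), np.array(y))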
3.5 Sliding Window Technique

After training with the training dataset, an SVM classifier is generated. When a new image is submitted for human detection, the function svmclassify() comes into the picture: it takes the SVM model together with the HOG features of the new image as input. The sliding window is a classic technique for human detection: a window slides over the image step by step, hence the name 'Sliding Window Technique'. This step is repeated for all the sub-windows in the image, and the features of each sub-window are submitted to the svmclassify() function for classification. The group contains all the classified objects. The maximum score in the group is then computed, and if it passes the match test, a bounding box is drawn over the human. The detector produces confidence scores for the detected objects.
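A sketch of the sliding-window loop described above, reusing extract() from the previous sketch; the 128 × 64 window, the step size and the score threshold are illustrative assumptions:

import numpy as np

def detect(model, gray, win=(128, 64), step=8, thresh=0.5):
    # Slide the window over the image, score each sub-window with the
    # trained SVM, and keep boxes whose confidence exceeds the threshold.
    boxes = []
    H, W = gray.shape
    for y in range(0, H - win[0] + 1, step):
        for x in range(0, W - win[1] + 1, step):
            feat = extract(gray[y:y + win[0], x:x + win[1]])
            score = model.decision_function([feat])[0]
            if score > thresh:
                boxes.append((x, y, win[1], win[0], score))
    return boxes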
3.6 Classification

Classification is generally performed with supervised learning, for which many machine learning algorithms exist. In this proposed method, SVM is used as the classifier.
4 Results and Discussion

This proposed work is carried out in MATLAB R2014a. The image dataset used for training and testing comes from the INRIA repository, a well-known database containing still images of humans in standing, walking and running poses. Several other databases, notably the Caltech database and the MIT pedestrian database, also provide person images and videos usable for human detection. The INRIA database is available for research purposes at https://lear.inrialpes.fr/data. Figure 3 shows the Graphical User Interface (GUI) implemented for this project: a small window with four options through which the user interacts to run the project. Figure 4 shows the result of clicking the 'Generate Database' button; the system generates the database by reading all the images in the training dataset, and the output is shown as loading humans and non-humans. Figure 5 shows the result of clicking the 'Create SVM' button; the system creates the SVM model and displays the number of support vectors.
Fig. 3 Front end GUI
Fig. 4 Generating Database
Fig. 5 Creating SVM Model
A cell array of 3 rows and N columns is created (Table 1). The first row stores the names of the images involved in training; the second row stores the labels assigned to each image in the dataset; the third row stores the extracted feature vectors of all the images.
Table 1 Label and feature vector training data set

Names for every image:            1.png       2.png       3.png       4.png       …  20.png
Labeling for every image:         +1          +1          −1          +1          …  −1
(+1 for cropped human images; −1 for non-human images)
Feature vectors for every image:  1 × 12,960  1 × 12,960  1 × 12,960  1 × 12,960  …  1 × 12,960
Fig. 6 Testing on Images
Figure 6 shows the result of clicking the 'Test on Photos' button, which opens a File Open dialogue for the user to select any image to test. In Fig. 7, the first image shows a person walking along the street with a row of cycles in the background; the corresponding output image detects the person by drawing a rectangular bounding box over them. To represent another pose, the next image shows a woman walking while talking on the phone; the background contains many cycles, which represent non-human features, and the system produces an image in the second row containing the detected person. The third image shows a man walking along the road in the opposite direction; in the next row, the system draws a rectangular bounding box over the pedestrian, indicating that the person has been detected in this image. The last image shows a student carrying a backpack and walking along a road whose background is full of lush green trees.
Input images for Testing
Corresponding Human Detected Images
Fig. 7 Single human detection test and result images
Fig. 8 Multiple Human detect
The result image indicates that the student has been detected by drawing a bounding box around the person. Figure 8 presents the test images and result images for multiple human detection. The first image shows many people walking and standing in a corridor, facing different directions and partially occluding one another; it can be seen that some legs are partially occluded. All persons in the image have been detected, even the occluded ones. In the last image the detector finds four persons and leaves one person undetected, a missed detection (false negative).
5 Conclusion

This proposed approach shows how to detect humans in an image, which is very useful for detecting pedestrians, tracking people, identification and gender classification. The human detection process is done in two steps: object detection and classification. Instead of other edge detectors, the histogram of oriented gradients is used because it is well suited to human detection and highly invariant to changes in illumination. The extracted features are saved for later classification, and new images are then subjected to classification. The sliding window technique slides over the image
in all directions, cutting the image into sub-windows and processing each for feature extraction. The detector uses the trained model to classify new images for detection.
References

1. Rabbani, H., Nezafat, R., & Gazor, S. (2009). Wavelet-domain medical image denoising using bivariate Laplacian mixture model. IEEE Transactions on Biomedical Engineering, 56(12), 2826–2837.
2. Lee, K., Bae, C., & Kang, K. (2007). The effects of tumble and swirl flows on flame propagation in a four-valve S.I. engine. Applied Thermal Engineering, 27(11–12), 2122–2130.
3. Oster, J., Pietquin, O., Kraemer, M., & Felblinger, J. (2010). Nonlinear Bayesian filtering for denoising of electrocardiograms acquired in a magnetic resonance environment. IEEE Transactions on Biomedical Engineering, 57(7), 1628–1638.
4. Munder, S., & Gavrila, D. (2006). An experimental study on pedestrian classification. PAMI, 28(11), 1863–1868.
5. Dollár, P., Wojek, C., Schiele, B., & Perona, P. (2009). Pedestrian detection: A benchmark. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 304–311).
6. Viola, P., Jones, M., & Snow, D. (2003). Detecting pedestrians using patterns of motion and appearance. In Proceedings of IEEE International Conference on Computer Vision (ICCV) (Vol. 2, pp. 734–741).
7. Heili, A., Chen, C., & Odobez, J.-M. (2011). Detection-based multi-human tracking using a CRF model. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 1673–1680).
8. Shu, G., Dehghan, A., Oreifej, O., Hand, E., & Shah, M. (2012). Part-based multiple-person tracking with partial occlusion handling. In Conference on Computer Vision and Pattern Recognition.
9. Yao, J., & Odobez, J.-M. (2011). Fast human detection from joint appearance and foreground feature subset covariances. Computer Vision and Image Understanding (CVIU), 115(10), 1414–1426.
10. Lin, Z., & Davis, L. S. (2008). A pose-invariant descriptor for human detection and segmentation. In ECCV.
11. Ikemura, S., & Fujiyoshi, H. (2010). Real-time human detection using relational depth similarity features. In Asian Conference on Computer Vision (ACCV).
12. Rujikietgumjorn, S., & Collins, R. T. (2013). Optimized pedestrian detection for multiple and occluded people. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3690–3697).
13. Papageorgiou, C., & Poggio, T. (2000). A trainable system for object detection. International Journal of Computer Vision, 38(1), 15–33.
14. Phung, S. L., & Bouzerdoum, A. (2007). Detecting people in images: An edge density approach. In IEEE International Conference on Acoustics, Speech and Signal Processing.
15. Lim, J. S., & Kim, W. H. (2012). Detection of multiple humans using motion information and Adaboost algorithm based on Haar-like features. International Journal of Hybrid Information Technology (IJHIT), 5(2).
16. Tang, S., Andriluka, M., & Schiele, B. (2014). Detection and tracking of occluded people. International Journal of Computer Vision (IJCV).
17. Schwartz, W. R., Kembhavi, A., Harwood, D., & Davis, L. S. (2009). Human detection using partial least squares analysis. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 24–31).
18. Wu, B., & Nevatia, R. (2005). Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors. In IEEE International Conference on Computer Vision (ICCV) (Vol. I, pp. 90–97).
19. Xu, Y., Xu, L., Li, D., & Wu, Y. (2011). Pedestrian detection using background subtraction assisted support vector machine. In 11th International Conference on Intelligent Systems Design and Applications (ISDA), Cordoba (pp. 837–842).
20. Chen, Y.-T., & Chen, C.-S. (2008). Fast human detection using a novel boosted cascading structure with meta stages. IEEE Transactions on Image Processing, 17, 1452–1464.
21. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE Conference on Computer Vision and Pattern Recognition (Vol. 1, pp. 886–893).
22. Wang, X., Han, T. X., & Yan, S. (2009). An HOG-LBP human detector with partial occlusion handling. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 32–39).
23. Nigam, S., & Khare, A. (2014). Multiresolution approach for multiple human detection using moments and local binary patterns. In International Conference on Computational Vision and Robotics (ICCVR).
Simulation of Algorithms and Techniques for Green Cloud Computing Using CloudAnalyst Hasan Qarqur and Melike Sah
Abstract In our fast-growing world, cloud computing has taken its place and become a milestone in the new technology revolution. Cloud computing offers online services wherever and whenever the user requests them. Leading companies such as Microsoft, Google, and Amazon have contributed to the development of this field and offer their services to users, companies, and organizations through cloud computing. Because of the importance of cloud computing, keeping the service sustainable and reliable is essential, but this leads to high electrical power consumption and large emissions of CO2. As a result, the concept of green cloud computing (GCC) has appeared and attracted the interest of service providers, researchers, and end-users, and many algorithms have been developed to reduce power consumption and carbon emissions. In this paper, we simulate the efficiency of datacenters that use different algorithms for cloud computing (CC). In the simulation, seven datacenters in different regions are used, together with seven user bases whose users make ten requests per hour over a period of one year. During the simulation we tabulated comparisons of the performance of the algorithms to find the best one for GCC. Results show that the best algorithm combination for the GCC concept is the Round Robin Load Balancer Policy with the ORT Broker Policy, because it gives the shortest response time from sending a request to receiving the response. Keywords Cloud computing · Green cloud computing · Datacenters · Algorithms of GCC · Simulation of datacenters · Power reduction
H. Qarqur (B) Software Engineering Department, Near East University, Nicosia, Cyprus e-mail: [email protected] M. Sah Computer Engineering Department, Near East University, Nicosia, Cyprus e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_34
1 Introduction

Cloud computing (CC) offers various services with sustainability and reliability, high performance and low price [1]. Networking has improved with the new demands of cloud computing, and a new concept of green cloud computing (GCC) has been introduced, which deals with environmental protection and conservation while cloud computing services continue to be provided [1]. GCC offers cloud computing services at low cost, which strengthens the economy and decreases e-waste and carbon footprints. Nowadays, cloud computing has become a milestone among the available technologies, since it offers access to online resources without demanding extensive local resources. Improvements in GCC are tied to the development of green datacenters, since datacenters are the cradle of cloud computing. In 2010, the electrical energy consumed by datacenters represented 1.5% of total power consumption in the US, and the United States consumes approximately a quarter of the energy of the whole world. Carbon dioxide emissions from information and communication technologies (ICTs) made up 1.3% of world emissions in 2002 and were projected to reach approximately 2.3% of global emissions in 2020. In this paper, we simulate the efficiency of datacenters that use different algorithms for cloud computing services. In the simulations, seven datacenters in different regions are used, together with seven user bases whose users make ten requests per hour during a period of one year. During the simulation we compare the performance of the algorithms to find the best one for GCC.
2 Literature Review

According to Wickremasinghe [2], techniques for reducing power consumption are divided into two main parts: hardware strategies and software techniques. The hardware strategies are divided into four sections: green compilers, dynamic voltage and frequency scaling devices, sleep mode, and feedback-driven threading. The software techniques are divided into green compilers, ready-made resources, repetition, and other techniques explained briefly later.
2.1 Hardware Strategies

If the power supply unit is matched to the requirements, it will draw only the power needed for a given workload. After tasks finish executing, the energy consumed by the system can be reduced by putting the CPU into sleep mode.
2.2 Green Software Techniques

Many ready-made resources are available for use as a service. With repetition, more energy may be consumed because the execution takes more time. Other techniques can also be used, such as changes to memory addressing and instruction registering.
2.3 Algorithms Solution

In this research, we tested the performance of chosen algorithms in the CloudAnalyst software to observe their impact on the response time and the cost of cloud computing services. These algorithms fall into two categories: Virtual Machine (VM) Load Balancer (LB) algorithms and Service Broker algorithms. We briefly summarize them as follows; a sketch of representative policies is given after the summaries.

VM Load Balancing Algorithms

Round Robin LB (RRLB). Round-robin is one of the algorithms employed by process and network schedulers in cloud computing to allocate VMs to the proper physical machine; it generally uses time-sharing, giving each job a time slot. If a task is not completed, it is resumed in its next slot.

Throttled LB (TLB). The TLB stores an indexed table of VMs and their states, i.e., whether each is available or not. In the beginning, all VMs are marked available. The Data Center Controller then starts to receive requests, querying the TLB for each new allocation.

Active Monitoring LB (AMLB). The main job of this LB is to keep the workload equal across all virtual machines, so the algorithm is similar to the TLB. The Active VM LB maintains an indexed table of virtual machines; when the Data Center Controller requires a new virtual machine allocation, it parses the table.

Service Broker Algorithms

The Closest Data Center (TCDC). This service broker keeps an index table of all datacenters categorized by their area. When a user in a user base sends a request, it queries the closest-data-center service broker for the path to a Datacenter Controller. The broker looks up the district of the sender, obtains a proximity table for that area from the internet characteristics, and picks the first datacenter mapped to the closest district in the proximity table.

Performance Optimized Routing (POR). This algorithm is usually used by the best Response Time (RT) service broker, which keeps an index of the available DCs. When a user in a user base sends a request, it queries the service broker for the path to a Datacenter Controller. The best-RT service broker first determines the closest DC using the proximity algorithm; it then iterates over the table of all DCs and estimates the current response time of each by retrieving the last recorded response time from the internet characteristics.

Dynamic Service Broker (DSB). The DSB is a development of the two algorithms mentioned before, the closest datacenter and performance optimized routing. The dynamic service broker keeps one table of DCs and another table of the best response time recorded for each DC [3].
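As an illustration only, the following Python sketch captures two of these policies as described above: round-robin VM allocation and best-response-time datacenter selection. The data structures and names are assumptions for exposition, not CloudAnalyst's actual classes:

from itertools import cycle

class RoundRobinLB:
    # RRLB: hand out VM ids in circular order, one per allocation.
    def __init__(self, vm_ids):
        self._ring = cycle(vm_ids)

    def allocate(self):
        return next(self._ring)

def best_rt_datacenter(datacenters, last_response_ms, user_region):
    # ORT/POR broker: start from the proximity choice, then prefer any
    # DC whose last recorded response time is lower.
    # datacenters: list of (dc_id, region)
    # last_response_ms: dc_id -> last recorded response time
    best = next((dc for dc, region in datacenters if region == user_region),
                datacenters[0][0])
    for dc, _ in datacenters:
        if last_response_ms.get(dc, float('inf')) < \
           last_response_ms.get(best, float('inf')):
            best = dc
    return best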
3 GCC Simulations of a Shopping Scenario Using CloudAnalyst

In this scenario, we suppose that a shopping website is hosted in seven datacenters in different regions with different numbers of VMs: 5 VMs, 25 VMs, and 100 VMs. The image size of each VM is about 10,000 Mb; each VM also has 1024 Mb of memory and 1000 Mbit/s of bandwidth. Each user makes 10 requests per hour, with a simulation period of 365 days. We use nine different cases to carry out the simulation and then compare the nine cases, with nine different algorithm combinations, to find the optimum way to support the GCC concept and reduce electrical power consumption and CO2 emissions.
3.1 Simulation Setup

In CloudAnalyst we can add and remove datacenters and manipulate the following parameters: architecture, name, number of servers, operating system, storage cost per GB, data transfer cost per GB (both in and out), region, cost per 1 Mb of memory per hour, and cost per VM hour. The parameters are summarized in Table 1, and Table 2 shows the algorithms used in the simulation; each case uses a combination of two algorithms, as illustrated below.

Table 1 The values and parameters used in the simulation
Parameter                              Value used
Data center—available BW per machine   1000
Data center—storage per machine        100,000 Mb
VM bandwidth                           1000
VM memory                              1024 Mb
Data center—VM policy                  Time shared
Table 2 Cases and algorithms used in simulations

Case      Algorithm
Case 1    The RRLB policy and CDB policy
Case 2    The RRLB policy and ORT service broker policy
Case 3    The RRLB policy and RDS broker policy
Case 4    The AMLB policy and CDB policy
Case 5    The AMLB policy and ORT service broker policy
Case 6    The AMLB policy and RDS broker policy
Case 7    The TLB policy and CDB policy
Case 8    The TLB policy and ORT service broker policy
Case 9    The TLB policy and RDS broker policy
3.2 Simulation Results

Firstly, we analyzed the simulation results of the nine cases for different numbers of VMs: 5 VMs, 25 VMs and 100 VMs, respectively.

Results when using 5 VMs. As shown in Table 3 and Fig. 1, the best combination is the RRLB Policy with the Optimized RT (ORT) Service Broker Policy, because the best-RT service broker chooses either the closest DC or the DC with the minimum RT; as seen in the results, this combination gives the minimum response time and total cost. On the other hand, the worst combination is the TLB Policy with the RDS Broker Policy, which took the longest response time and the highest grand total cost in the simulation.

Results when using 25 VMs. Table 4 and Fig. 2 show the results for 25 VMs. As can be seen, the best combination is again the RRLB Policy with the ORT Service Broker Policy, because the best-RT service broker chooses either the closest DC or the DC with the minimum RT. On the other hand, the worst combination

Table 3 Illustrates the comparison between different cases using 5 VMs

Case     Overall response time   Data center processing time   Total VM cost ($)   Total data transfer cost ($)   Grand total ($)
Case 1   50.11                   0.48                           30,660.04           649.89                         31,309.93
Case 2   50.11                   0.48                           30,660.04           649.89                         31,309.93
Case 3   55.28                   5.65                           537,800.16          649.89                         538,450.05
Case 4   50.11                   0.48                           30,660.04           649.89                         31,309.93
Case 5   50.11                   0.48                           30,660.04           649.89                         31,309.93
Case 6   50.11                   0.48                           30,660.04           649.89                         31,309.93
Case 7   50.79                   1.15                           537,803.78          649.89                         538,453.67
Case 8   50.11                   0.48                           30,660.04           649.89                         31,309.93
Case 9   50.79                   1.15                           537,801.84          649.89                         538,451.73
Fig. 1 Comparison of cases versus cost for 5 VMs simulation

Table 4 Illustrates the comparison between different cases using 25 VMs

Case     Overall response time   Data center processing time   Total VM cost ($)   Total data transfer cost ($)   Grand total ($)
Case 1   50.15                   0.51                           138,408.19          649.89                         139,058.08
Case 2   50.15                   0.51                           138,408.19          649.89                         139,058.08
Case 3   51.75                   2.12                           537,824.26          649.89                         538,474.15
Case 4   50.16                   0.52                           138,408.19          649.89                         139,058.08
Case 5   50.16                   0.52                           138,408.19          649.89                         139,058.08
Case 6   50.16                   0.52                           138,408.19          649.89                         139,058.08
Case 7   50.16                   0.52                           138,408.19          649.89                         139,058.08
Case 8   50.16                   0.52                           138,408.19          649.89                         139,058.08
Case 9   50.52                   0.88                           537,822.89          649.89                         538,472.77
Fig. 2 Illustrates the comparison of cases versus cost for 25 VMs
is again the TLB Policy with the RDS Broker Policy; as seen, it took the longest response time and the highest grand total cost in the simulation.

Results when using 100 VMs. Table 5 and Fig. 3 show that for 100 VMs most of the algorithms are nearly equal in response time and cost, with only slight differences. This again identifies the RRLB Policy with the ORT Service Broker Policy, whose best-RT service broker chooses either the closest DC or the DC with the minimum RT, as the best policy under all conditions of our experiment; the TLB Policy with the RDS Broker Policy is again the worst combination, taking the longest response time and the highest grand total cost in this simulation.

Table 5 Illustrates the comparison between different cases using 100 VMs

Case     Overall response time   Data center processing time   Total VM cost ($)   Total data transfer cost ($)   Grand total ($)
Case 1   50.50                   0.87                           532,608.73          649.89                         533,258.62
Case 2   50.50                   0.87                           532,608.73          649.89                         533,258.62
Case 3   51.62                   1.99                           537,863.88          649.89                         538,513.77
Case 4   50.50                   0.87                           532,608.73          649.89                         533,258.62
Case 5   50.50                   0.87                           532,608.73          649.89                         533,258.62
Case 6   50.52                   0.89                           537,863.90          649.89                         538,513.78
Case 7   50.50                   0.87                           532,608.73          649.89                         533,258.62
Case 8   50.50                   0.87                           532,608.73          649.89                         533,258.62
Case 9   50.52                   0.89                           537,863.92          649.89                         538,513.80

Comparing the results of the nine cases for 5 VMs, 25 VMs and 100 VMs. According to Tables 1, 2, 3, 4 and 5 and Figs. 1, 2 and 3, we conclude that the best algorithm combination to use is the RRLB Policy with the ORT Service Broker Policy, because it gives the best response time, which means lower usage of resources and equipment
Fig. 3 Illustrates the comparison of cases versus the cost of 100 VMs
to conserve power energy and, in turn, the lowest cost. On the other hand, the worst algorithm combination is the TLB Policy with the RDS Broker Policy, because it takes the longest response time, which means more power consumption and a higher total cost. We also noticed that when the number of VMs increases, the total response time increases, which means more power consumption. In general, we recommend using the RRLB Policy with the ORT Service Broker Policy because it gives the best response time, and thus lower usage of resources and equipment, lower power consumption, and the lowest cost; although the RRLB Policy with the CDB (closest data center) Policy gives similar results in the three experiments, it has the disadvantage that the closest DC is not always available, so it cannot be the optimum choice (Table 6 and Fig. 4).

Table 6 Comparison of time response versus total cost over different VMs numbers

Number of VMs   Algorithms used                                 Best time response in ms   Worst time response   Total cost in US dollars
5               The RRLB policy and ORT service broker policy   50.11                      55.28 (a)             31,309.93
25              The RRLB policy and ORT service broker policy   50.15                      51.75 (a)             139,058.08
100             The RRLB policy and ORT service broker policy   50.50                      51.62 (a)             533,258.62

(a) Obtained with the TLB policy and RDS broker policy

Fig. 4 Illustrates the comparison between time response and VMs number
4 Conclusions

In this paper, we discussed the techniques and algorithms used in Green Cloud Computing (GCC) to reduce the energy costs of running cloud computing services. Huge datacenters and network transfer services consume a great amount of energy and hence emit massive amounts of CO2. The GCC concept has therefore appeared and developed to reduce electrical power consumption and worldwide carbon dioxide emissions while maintaining the sustainability, reliability, and efficiency of these services. In our work, we used the CloudAnalyst simulation software to simulate 7 datacenters in normal operating mode for 7 user bases in different locations over a period of 365 days. Results show that the best algorithm combination is the RRLB Policy with the ORT Service Broker Policy, because it gives the best response time, which means lower usage of resources and equipment, lower power consumption, and the lowest cost. On the other hand, the worst combination is the TLB Policy with the RDS Broker Policy, because it takes the longest response time, which means more power consumption and higher costs.
References

1. Radu, L. (2017). Green cloud computing: A literature survey. Symmetry, 9(12), 295. https://doi.org/10.3390/sym9120295.
2. Wickremasinghe, B. (2009). A CloudSim-based tool for modelling and analysis of large scale cloud computing environments (pp. 433–659, Rep.).
3. Nine, M. S., Azad, M. A., Abdullah, S., & Rahman, R. M. (2013). Fuzzy logic based dynamic load balancing in virtualized data centers. In 2013 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (Vol. 1, pp. 1–7). https://doi.org/10.1109/fuzz-ieee.2013.6622384.
Data-Mining Based Novel Neural-Networks-Hierarchical Attention Structures for Obtaining an Optimal Efficiency Ahmed J. Obaid and Shubham Sharma
Abstract Big data and its classification have been a recent challenge in the evolving world: ever-growing data needs to be classified in an effective way, and deep learning and machine learning models have evolved for the classification process. The Hierarchical Attention Network (HAN) is one of the most dominant neural network structures for classification. Its major demerits are high computation time and numerous layers. This drawback of HAN is overcome by a new idea derived from mining methods, yielding a mixed attention network for android data classification; with this flow it can handle more complex requests beyond the identified concept. The EHAN (Enhanced Hierarchical Attention Network) has two prototypes: the first is an attention model to distinguish features, and the second is a self-attention model to identify global facts. From the demonstrated outcomes, we infer that the partitioning of tasks is constructive, and the EHAN therefore shows significant improvement on the news database. In addition, other subnetworks can subsequently be added to assist this ability further. Keywords Enhanced hierarchical attention network · Visual question answering · Artificial intelligence · Deep learning · Machine learning
A. J. Obaid (B) Department of Computer Science, Faculty of Computer Science and Mathematics, University of Kufa, Kufa, Iraq e-mail: [email protected] S. Sharma Department of Mech. Engg, IKG Punjab Technical University, Jalandhar-Kapurthala Road, Kapurthala, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_36
1 Introduction

CNNs (Convolutional Neural Networks) have been explored in a broad range of recognition tasks, especially image classification. Nowadays, however, people pose higher demands, and merely recognizing objects in remote sensing images can no longer meet the requirements. Based on VQA (Visual Question Answering) methods, we can gather more information from various queries: for instance, we can not only identify objects in the image, but may also find answers that require logical reasoning, such as 'What is in the center of the residential area?' Moreover, depending on the question, we can introduce attention mechanisms to achieve a better performance. By examining the NWPU-RESISC45 dataset, we find that some kinds of images are hard to distinguish even for human observers [1]. Take an instance: if we do not investigate the details, scenes labeled with terms such as 'petrol' and 'football' may be misconstrued as 'gasoline' and 'soccer'. Conversely, if we concentrate too much on details, it may cause the opposite effect: because there are always vehicles parked at a petrol station, it is very natural to take a petrol station for a gasoline scene, even for human observers. From the perspective of neural networks, 'gasoline' will receive greater weights in the attention distribution charts, so global information may be missed. To take care of this issue, we design two subnetworks, one with an attention mechanism to capture details, and the other with a self-attention mechanism to grab global information. In our work, we present a hierarchical attention network whose structure is delineated in Fig. 1. The Hierarchical Attention Network consists of the following important components:

• Attention model: this uses the question as a query and introduces the attention mechanism from Visual Question Answering, joining the image vector and the question vector to form a rectified model.
Fig. 1 Three-layer HAN
• Self-attention model: this uses a global block that simplifies several operations and introduces a self-attention mechanism, which has an advantage in capturing global information. Afterwards, the image feature is concatenated with the question vector to scan that information.

• Combination model: based on the two lattices' probability distributions, a fully connected layer generates the final answer.

This model primarily makes three contributions. First, we propose a hierarchical attention arrangement for remote sensing image classification; the accuracy of the Hierarchical Attention Network on NWPU-RESISC45 shows that it outperforms previous conventional systems, as shown in Fig. 2. Second, by inspecting the attention distribution heat maps, we demonstrate that the division of tasks genuinely helps. Finally, the Hierarchical Attention Network can be extended with more subnetworks, each with its own attention mechanism, to split the task further; it can thus also be viewed as a framework (Fig. 3).

Visual Question Answering (VQA). With the development of natural language processing and computer vision, Visual Question Answering tasks have steadily become a popular area, since they rely on both extracting image features and transforming questions to infer answers. The baseline method for VQA uses a CNN to extract image features, then encodes the question through an LSTM,
Fig. 2 Hierarchical Attention Network (HAN) architecture
Fig. 3 Structure of model
and finally combines and decodes them to obtain the result. VQA is closely related to image captioning, so naturally several techniques carry over [1, 2]. The difference between VQA and image captioning is that the question is already given; what we should do is find the relevant information. Attention mechanism: the model in [3] introduced an attention mechanism in the process of image captioning. In [4], the author noted that concatenation is the best way to combine the question vector and the image vector. Furthermore, [5] describes the basic line of applying attention mechanisms in the Visual Question Answering area: the author designed an attention network that can find relevant regions on images step by step.
1.1 Related Works

Most of the existing literature targets conventional user-generated content, such as blogs and reviews, for context-based sentiment analysis problems. Initially, Simonyan et al. [6] make use of program statistical information to infer the context-sensitive sentiment polarities of phrases and words. Wang et al. [7] examined two types of context information, local and global, to model complicated sentences, incorporating lexicon and discourse knowledge into a CRF model. Wang et al. [8] recognized keywords with emotional information and used this concept to develop sentiment features with information retrieval technology.
Krizhevsky et al. [9] suggested a technique for integrating information from various sources to construct a context-aware sentiment vocabulary. Earlier works have observed that the identical word may express entirely dissimilar sentiment polarities in different circumstances. Nearly all the related works depend on precise topic-focused domains or semantic features of product reviews, although the emotions embedded in blogging are generally more condensed and lack clear sentiment words. In addition, a new challenge arises for the context-aware sentiment analysis task when working on context-related information in blogging conversations: the relevant context may be far away, with the query pointed toward the end, and this procedure must be carried out for each point. Yang et al. [10] extended the CRF (Conditional Random Fields) model to label every sentence within product reviews, framing the sentiment labeling of sentences as a sequence tagging problem. In [11], however, the authors find that different query points share nearly the same attention map; therefore, computing only one attention map for all picture elements can significantly reduce the computation cost. Self-attention mechanism: it was first utilized in natural language processing tasks and afterwards transferred to extracting global image information. Wang et al. [12] initially compute the pairwise interconnections among the picture elements (each pixel being a query point) to form an attention map, and afterwards compute weighted sums of the attention distribution weights and the corresponding points. The NWPU-RESISC45 dataset covers 45 categories and consists of 31,500 images. Unlike previous datasets, which are restricted in the number of images (2,800 at most [13]), in class scale (21 at most [10]), in within-class diversity, and so on, NWPU-RESISC45 compensates for these shortcomings; that is to say, it can support developing more data-driven algorithms [14].
2 Proposed Methodology

2.1 Enhanced HAN

Both the attention and the self-attention model yield the likelihood of every label, which allows us to merge them through fully connected layers to calculate the end product. In this section, we describe the components of the network in depth.

i. ATTENTION: At the beginning, the image is resized to 448 × 448 picture elements. The output of the last convolutional layer of VGGNet (Visual Geometry Group Net) is a feature map of size 512 × 14 × 14; the depth of the features is 512, i.e., VGGNet extracts 512 kinds of features from the image. The 14 × 14 grid partitions the original image into 196 regions, each 32 × 32 zone of the original image corresponding to one of the 196 regions. We reshape the resulting 512 × 196 feature map into a 1024 × 196 feature matrix V through a fully connected layer with a hyperbolic tangent activation, in order to later merge it with the question vector:

Vep = tanh(Wp Vp + bp),
(1)
where Vp denotes the feature vector of a zone and Vep the corresponding p-th column of the transformed matrix. LSTM (Long Short-Term Memory) networks comprise many memory cells; updating the cell state involves several steps. The first step is to determine what information has to be discarded from the memory block, and the consequent one determines what new data is cached within the memory block:

ft = σ(Wf · [ht−1, at] + bf),
(2)
Xt = σ (Wi · [ht − 1, at] + bi),
(3)
M̃t = tanh(Wc · [ht−1, at] + bc),
(4)
In the above derivation, at denotes the input vector, ft the forget gate, ht the hidden state, Xt the input gate, and M̃t the candidate memory to be recalled, so that the previous memory block Mt−1 is updated into a fresh memory block Mt. Finally, the output gate Rt determines what enters the end product:

Mt = ft * Mt−1 + Xt * M̃t,    (5)

Rt = σ(Wo · [ht−1, at] + bo),    (6)

ht = Rt * tanh(Mt),    (7)
In this lattice, questions are first embedded into position vectors that take term correlations into consideration; compared with one-hot codes, this decreases execution time during simulation:

Veq = LSTM(Embed(qt)), t ∈ {1, 2, . . . , n},
(8)
where, relative to the one-hot-code dimension, Veq denotes the embedded question and qt the term at position t in the question.

ii. ATTENTION LAYER: In order to combine them under a hyperbolic tangent function, we pass Vep and Veq through a fully connected layer individually. The attention distribution chart results from applying the SoftMax function:
Data-Mining Based Novel Neural-Networks-Hierarchical …
415
hatt = tanh((Wep,att · Vep + bep,att) ⊕ (Weq,att · Veq + beq,att)),    (9)

PI = Softmax(Wp · hatt + bp),    (10)
Note that Wep,att and Weq,att are 1024 × 512 matrices, Wp is a 512-dimensional vector, Veq is a 1024-dimensional vector, and Vep is a 196 × 1024 matrix, so PI has 196 dimensions. The symbol ⊕ denotes the sum of a matrix and a vector obtained by adding the vector to each column of the matrix. From the attention distribution chart we can then compute a weighted sum; that is, the data of every zone are mixed with each other with varying significance, the sum running over the 196 zones:

Vi = Σ PI · Vep,    (11)
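A NumPy sketch of the attention layer of Eqs. (9)–(11), with the shapes stated above (Vep: 196 × 1024 region features, Veq: 1024-dimensional question embedding); a sketch under those assumptions, not the paper's exact code:

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention(Vep, Veq, Wep, Weq, bep, beq, Wp, bp):
    # Eq. (9): the question term is broadcast-added to all 196 regions
    h_att = np.tanh(Vep @ Wep + bep + (Veq @ Weq + beq))
    PI = softmax(h_att @ Wp + bp)     # Eq. (10): 196 attention weights
    V_i = PI @ Vep                    # Eq. (11): weighted sum, 1024-d
    return V_i, PI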
For instance, words and phrases such as 'petrol' versus 'gasoline' and 'football' versus 'soccer' vary across different zones; attending to the asked question sorts out the unrelated data and finds the right position. We also establish a self-attention process to gather global information. Because of memory limitations, we use an eleven-layer residual neural network to provide the global data, though it already yields considerable significance. Natural Language Processing (NLP) makes extensive use of the self-attention process: when rephrasing a sentence, the interpretation of a few terms depends on others that are far away, and Long Short-Term Memory cannot handle them properly. To resolve this problem, the self-attention process makes each term a query point and calculates a weighted sum from the correlations among the query points, which is more efficient than LSTM at merging information from the whole sentence. Wang et al. [12] show that acquiring global image data follows the same thesis; differing from NLP, query words are now replaced by query picture elements. We denote the input by x and the number of picture elements by Np, with xi the i-th picture element of x. Similar to the NLP formulation, we evaluate the attention maps. The derivation is expressed below:

Ei = xi + Σb=1…Np [f(xa, xb) / C(y)] · (Wv xb)    (12)
where the function f(xa, xb) denotes the interconnection between positions a and b (a and b being indices of query points), Wv is a linear transform matrix, and C(y) is a normalization factor. We use the embedded Gaussian, exp(xa · xb) / Σm exp(xa · xm), to work out f(xa, xb) and C(y). We note that in [15] the author argues that some components of the original equations are not requisite, so we present the simplified form. Many of the global facts affect each position, so we realize that computing the attention map is crucial. Cao et al. [11] astonishingly found that every query point shares mostly
the same attention map. Supported by the examined data, the authors computed the cosine distances between the attention maps of different positions, which turned out to be negligible.
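A NumPy sketch of the self-attention step of Eq. (12) with the embedded-Gaussian pairing; the separate query/key embedding matrices Wq and Wk follow the usual non-local-block formulation and are an assumption here:

import numpy as np

def self_attention(x, Wq, Wk, Wv):
    # x: Np x C, one row per pixel position (query point)
    q, k = x @ Wq, x @ Wk
    logits = q @ k.T                               # pairwise f(x_a, x_b)
    att = np.exp(logits - logits.max(axis=1, keepdims=True))
    att = att / att.sum(axis=1, keepdims=True)     # normalization C(y)
    return x + att @ (x @ Wv)                      # residual, as in Eq. (12)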
2.2 Data Extraction

Data extraction is done through the android API. The dataset contains data from various sources, which are consolidated into the same database. The data is then preprocessed in the initial stages of stemming, lemmatization and tokenization.
2.3 EHAN

The proposed system improves the architecture of HAN. The efficiency improvement rests on three novel characteristics:

• Function Calling: function deployment is performed at the convolutional layer after the attention process and stored in a single storage layer. The storage is then passed to a single function, and calling that function into the rectified linear unit (ReLU) layer optimizes the process.

• Attribute Shape: the attribute shape is fixed in the initial convolutional layers, which assures that the neural network develops with steady-state computation.

• Layer Optimization: EHAN contains multiple convolutional and hidden layers, and processing data through its various stages takes a lot of training time; each convolutional layer has therefore been optimized with the least number of filters needed for processing.
2.4 Data Classification

After building the neural structure, the data is passed on for classification of the android data into two classes, positive and negative. The data is fed to EHAN and classified into these two explicit classes, denoting the nature of the data and its classification.
3 Experimental Evaluation

The experiment was set up on a GPU instance for rapid training with a Python 3.6 environment. Keras was used as the top layer of the network with TensorFlow as the backend. Figure 4 shows the dataset being loaded and displayed on screen.
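As an illustration of this setup, a minimal Keras/TensorFlow sketch of a binary text classifier of the kind trained here; the vocabulary size, sequence length and layer sizes are assumptions, not the exact EHAN configuration:

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Embedding(input_dim=20000, output_dim=128),   # token embeddings
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid'),   # positive vs. negative
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
# model.fit(x_train, y_train, epochs=10, validation_split=0.2)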
Fig. 4 Categories of dataset
Fig. 5 Dataset information
In Fig. 5, the data is categorized into two object types, Text and Category, to recognize the text content in our dataset, as shown in the figure. We then validate each category label as a kind of cross-validation to check the accuracy of the proposed model, as shown in Fig. 6.
Fig. 6 Splitting of positive and negative
3.1 Results and Discussions

At this step we check our proposed EHAN approach. It is evident that the proposed novel EHAN approach produces optimal efficiency in comparison with the existing system. The loss was also minimized over the increasing epochs and
Fig. 7 Achieving accuracy in optimal epoch
Fig. 8 Accuracy of EHAN
Fig. 9 Loss versus Performance
the performance was higher; Fig. 7 shows the testing-step accuracy increasing simultaneously. Figure 8 shows the accuracy of our EHAN approach through the training stage, and Fig. 9 represents the loss of weight over performance of the EHAN approach through the training stage, as shown below.
4 Conclusions

This paper exhibits a data extraction method on the Enhanced Hierarchical Attention Network using android data classification. From this analysis, we illustrate that the EHAN surpasses the earlier HAN on the news dataset. From the demonstrated outcomes explained in the charts, it is inferred that the partitioning of tasks is fruitful. The EHAN consists of an attention model to ascertain internal data and, at the same time, a self-attention model for global facts. Subsequently, EHAN could add other subnetworks with distinctive attention mechanisms to partition the job further.
References

1. Chen, X., & Zitnick, C. L. (2014). Learning a recurrent visual representation for image caption generation. arXiv preprint arXiv:1411.5654.
2. Vinyals, O., et al. (2015). Show and tell: A neural image caption generator. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
3. Xu, K., et al. (2015). Show, attend and tell: Neural image caption generation with visual attention. In International Conference on Machine Learning.
4. Ren, M., Kiros, R., & Zemel, R. (2015). Exploring models and data for image question answering. In Advances in Neural Information Processing Systems.
5. Yang, Z., et al. (2016). Stacked attention networks for image question answering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
6. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
7. Wang, T., et al. (2019). Aed-net: An abnormal event detection network. Engineering, 5(5), 930–939.
8. Wang, T., et al. (2018). Generative neural networks for anomaly detection in crowded scenes. IEEE Transactions on Information Forensics and Security, 14(5), 1390–1399.
9. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems.
10. Yang, Y., & Newsam, S. (2010). Bag-of-visual-words and spatial extensions for land-use classification. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems.
11. Cao, Y., et al. (2019). GCNet: Non-local networks meet squeeze-excitation networks and beyond. In Proceedings of the IEEE International Conference on Computer Vision Workshops.
12. Wang, X., et al. (2018). Non-local neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
13. Zou, Q., et al. (2015). Deep learning-based feature selection for remote sensing scene classification. IEEE Geoscience and Remote Sensing Letters, 12(11), 2321–2325.
14. Wang, T., & Snoussi, H. (2014). Detection of abnormal visual events via global optical flow orientation histogram. IEEE Transactions on Information Forensics and Security, 9(6), 988–998.
15. Hu, H., et al. (2018). Relation networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Protecting Cloud Data Privacy Against Attacks Maryam Ebrahimi, Ahmed J. Obaid, and Kamran Yeganegi
Abstract Data security and privacy protection are the main factors behind users' concerns about cloud technology. Various methods are used to protect users' privacy, including anonymity, encryption, data distortion, and generalization. Another method is machine learning, whose varied algorithms can forecast the likelihood of a violation of users' privacy from the data. One of the most important issues in machine learning and data mining is information and data security: placing data in separate, homogeneous categories increases efficiency and improves the results of data mining algorithms. One of the most important and widely used machine learning algorithms is the artificial neural network, which tries to increase accuracy through the constructed model and its hidden layers; it is very simple and fast and has many applications in different sciences. In this study, to improve the results and forecast privacy in cloud computing, we provide a model that predicts the extent of privacy exposure, using neural networks to improve user security and privacy, and we employ an optimization algorithm called World Competitive Contests (WCC). According to the results, the proposed algorithm is more accurate than the WCC alone, better clustering can be performed using it, and therefore we achieve better privacy-violation detection. Keywords Artificial neural network · World Competitive Contests (WCC) · Privacy · Cloud computing
1 Introduction

With the increasing use of cloud computing, the issue of security is becoming more and more important, and many academic studies have been conducted on it. Cloud computing has many applications in industry and academia, which is why it has been considered by many researchers. Services in cloud computing are divided into three main categories: Software-as-a-Service (SaaS), Infrastructure-as-a-Service (IaaS), and Platform-as-a-Service (PaaS). SaaS refers to a method for delivering software applications over the Internet. IaaS is the most basic class of cloud computing services, enabling users to hire IT infrastructure such as servers from a cloud provider on a pay-as-you-go basis. PaaS provides an on-demand environment for developing, testing, delivering and managing software applications such as web or mobile apps. One of the most challenging topics in cloud computing environments is privacy protection: in publishing data, organizations and individuals are concerned about the disclosure of their confidential and private information. The boundaries and content of what is considered private vary between cultures and individuals, but the main theme is common. The aim of this study is to provide a model using a neural network to maintain security. The question presented in this study is: when an unknown person x requests access to the information of another person y, can this request threaten the security of person y or not? For this purpose, artificial neural networks will be used, an example of which can be seen in Fig. 1. The main goal here is to provide a model using a neural network that can accurately identify threats. One of the important issues in neural networks is determining the weights of the network [1]. Determining the weights is an NP (nondeterministic polynomial) problem; in other words, it is not possible to obtain an acceptable answer in a reasonable time, but if an answer exists, it can be verified in polynomial time. This is an optimization problem, and in this study the World Competitive Contests (WCC) algorithm is proposed for neural network
Fig. 1 An example of an artificial neural network for security testing
weighting, to determine in an acceptable time how likely it is that a request could violate security. The WCC [2] is a new optimization algorithm whose efficiency has been proven. In recent years, because the rate of data production has outpaced processing speed, much attention has been paid to cloud processing, and it is used in many industries. Cloud computing is based on computer networks and the Internet, and one of its most important topics is privacy. In this chapter, we take a brief look at cloud computing, followed by a history of its evolution and a comparison of several related technologies. By analyzing the system architecture, development model and type of service, we examine the characteristics of cloud computing from a technical, operational and economic perspective. Following that, we present the efforts that have been made in business and academia to understand the challenges and opportunities of this dimension. In this article, we try to present a suitable method to improve privacy in cloud environments, using the neural network to this end.
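As an illustration of the model just outlined, the following is a NumPy sketch of a one-hidden-layer network that scores how likely a request by x violates the privacy of y; the feature encoding and the weights W1/b1/W2/b2 (which the WCC optimizer would tune) are assumptions for exposition, not the paper's implementation:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def threat_score(request_features, W1, b1, W2, b2):
    # request_features: vector encoding the access request (x, y, context)
    hidden = np.tanh(W1 @ request_features + b1)
    return sigmoid(W2 @ hidden + b2)   # estimated probability of a violation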
2 Study the Literature

2.1 Cloud Computing

From 2007 onwards, the term 'cloud' became one of the most common terms in the information technology industry. Many researchers have tried to discuss cloud computing from various application aspects, but there is no common, universally accepted definition. Among the many definitions that have been provided, we mention the following:

a. I. Foster: A distributed, large-scale computing paradigm driven by economies of scale, providing a pool of virtual, dynamic, and scalable resources, including computing power, storage space, platforms, and services, that are delivered on demand to the user through the Internet.

b. From an architectural point of view, the greatest focus has been on the technical features that distinguish cloud computing from other distributed computing paradigms. For example, computing entities are virtualized and provided as services, and these services are dynamically driven by economies of scale.

c. Gartner: A style of computing in which scalable IT capabilities are provided as services to multiple customers through the Internet. Gartner is a technology consulting firm that examines the quality of cloud computing from an industrial perspective. This definition emphasizes operational characteristics, such as whether cloud computing is flexible and scalable and whether it offers its services over the Internet.

d. NIST: Cloud computing is a model for easy, on-demand access to a repository of configurable computing resources (e.g., networks,
servers, storage space, applications and services) that can be provisioned and released quickly and easily, with minimal management overhead or interaction with the provider. Compared to the other definitions provided, the National Institute of Standards and Technology in the United States gives a specific and purposeful definition that not only covers the general cloud concept but also details the essential characteristics of cloud computing along with its development models. A cloud is a distributed, parallel computing system that includes a set of interconnected, virtual computers that are dynamically provisioned and presented as one or more integrated computing resources based on service level agreements (SLAs) [2]. Foster defines cloud computing as: 'A large-scale distributed computing model managed by economies of scale, providing virtualized, dynamically scalable, managed computing capability, storage space, platforms and services, delivered via the Internet according to customers' needs' [3].
2.1.1 A Cloud Computing System Consists of Two General Parts

– the cloud provider, and
– the users of the cloud environment.

Figure 2 gives an overview of the cloud computing environment. On the users' side, a reduction in cost and waiting time and an increase in productivity are expected; users expect their programs to be fully executed in minimal time. The cloud provider, in turn, tries to provide appropriate services, allocate resources to users' requests, reduce response time, and maximize the use of available resources.
2.2 Challenges of Cloud Computing

There are three main challenges in cloud computing [4]:

• Information sharing: Since service providers and users are often separate parties with their own interests, they often do not share the exact state of their resources. Service providers usually offer opaque resource repositories instead of precise specifications.
• Heterogeneous environment: Most cloud applications rely heavily on a heterogeneous environment.
• Unpredictability: To increase resource utilization, service providers tend to over-commit users onto shared infrastructure. This leads to contention and conflicts over resources. Another factor in the unpredictability of these environments is the heterogeneity that must be managed to maintain service levels.
Fig. 2 Overview of the cloud environment, adapted from Lee (2012) [1]
2.3 Privacy

Privacy in cloud computing is important and unavoidable. When publishing data, organizations and individuals are concerned about the disclosure of their confidential and private information. Data miners work on sensitive information such as financial and economic data, medical data, weather data, military data, etc., and without methods to protect privacy they cannot be entrusted with the data of some organizations or individuals. Trustworthiness is one of the most important duties of a data analyst, so that data and models can be published with sufficient confidentiality. In general, privacy in cloud computing has two requirements: first, to prevent the disclosure of personal and confidential information such as name, address, national ID number, account number, etc. during processing; second, to make sure that processing remains efficient and useful [5]. Some organizations want to publish their data with a desired level of security. These security levels can be applied to data at different stages: data collection, data dissemination, output dissemination, or distributed data sharing [6]. Of course, we may experience side effects, such as data loss or the generation of incorrect data. In the data dissemination stage, pseudo-identity or fake-identity methods are used. The k-anonymity, l-diversity, and t-closeness methods are used to generate anonymized data with unknown identities.
The purpose of all these methods is to prevent the disclosure of sensitive information. Sensitive information may be disclosed in the output, and rules can be used to prevent this. We also need to protect the modeled results, namely association rules and search processing. Two methods, distortion and blocking, are used to protect association rules. In the distortion method, we change the input value of the variable; in the blocking method, we do not change the input value but provide incomplete information, so that the association rules cannot easily be extracted. Of course, both methods can have side effects on the data, especially on insensitive data. In distributed privacy, the purpose of data dissemination is collaboration among competing parties; horizontal or vertical partitioning as well as encryption protocols are used.
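To make the anonymization criterion above concrete, the following is a minimal Python sketch (the column names and generalized values are illustrative, not taken from this chapter): a released table is k-anonymous when every combination of quasi-identifier values is shared by at least k records.

import pandas as pd

def k_of_table(df, quasi_identifiers):
    # k is the size of the smallest group of records sharing the same
    # combination of quasi-identifier values
    return int(df.groupby(quasi_identifiers).size().min())

# Generalized (anonymized) release: ZIP code and age are coarsened
records = pd.DataFrame({
    "zip": ["476**", "476**", "479**", "479**"],
    "age": ["2*", "2*", "3*", "3*"],
    "diagnosis": ["flu", "cold", "flu", "asthma"],  # sensitive attribute
})
print(k_of_table(records, ["zip", "age"]))  # prints 2 -> the release is 2-anonymous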
2.4 Resource Allocation

Resource allocation is an issue that has been defined in many areas of computing, such as operating systems, network computing, and data center management. A resource allocation system in cloud computing is any mechanism whose purpose is to ensure that the required applications are properly served by the provider's existing infrastructure. Along with this assurance, the developer of resource allocation mechanisms must also consider the current state of each resource in the cloud environment, in order to apply algorithms that better allocate physical or virtual resources to applications and thereby also reduce cloud operating costs. The purpose of resource provisioning in the cloud environment is to allocate resources to the users who have requested them. Resources should be allocated so that no node in the cloud is overloaded and no resource is wasted, whether as lost bandwidth, unnecessary CPU processing, or unused memory [4].
2.5 Task Scheduling

When resources are made available to the user, the application makes scheduling decisions. In most cases, the application contains many tasks to which resources must be allocated. For example, a MapReduce cluster in a cloud environment may perform many tasks simultaneously. Task schedulers are responsible for allocating the preferred resources to a specific task so that all computational resources can be used effectively. The scheduler must also ensure that each task is allocated a sufficient or fair amount of resources. If the cloud environment is heterogeneous, such scheduling decisions become more complex. A very large body of research has addressed task scheduling [7].
3 Research Background

There are many technologies at the interface between software and hardware, through which many results and ideas can be accessed and updated. Among them, a key role of cloud computing is to lease data centers to external customers so that they can use them for their own computing. In 2006, Amazon launched the Amazon Simple Storage Service (S3) and the Elastic Compute Cloud (EC2). Since then, several vendors have introduced cloud solutions, including Google, IBM, Sun, HP, Microsoft, and Yahoo. From 2007 onwards, the number of trademarks covering cloud computing increased, and many goods and services were offered accordingly. Cloud computing became a hot research topic. In 2007, Google, IBM, and a number of universities worked on a research project called ACCI, which aimed to solve the challenges of large-scale distributed computing. Since 2008, several open-source projects have been created. For example, the Eucalyptus platform was the first application programming interface (API)-compatible platform used to build private clouds; OpenNebula also developed private cloud platforms and introduced various cloud models. In July 2010, SiteonMobile was announced together with HP for markets in which individuals access the Internet via mobile phones rather than PCs. As more people became equipped with smartphones, mobile cloud computing also came into focus, and several network operators, such as Orange, Vodafone, and Verizon, began providing cloud computing services to individuals. In March 2011, the Open Networking Foundation was founded by 23 IT companies, including Deutsche Telekom, Facebook, Google, Microsoft, Verizon, and Yahoo. The organization supports a new platform known as software-defined networking, a design that enables innovation through software changes in communication networks, wireless networks, data centers, and other networked segments. A brief history is given in Fig. 3. Cloud computing can be considered the natural evolution of virtualization, service-oriented architecture, and utility and pervasive computing. As a new computing platform, this paradigm provides high-quality, reliable and customized services
Fig. 3 A brief history of cloud computing [7]
that can provide a dynamic computing environment to end users, and it can therefore be combined with several related computing paradigms, such as utility computing and pervasive computing [7].
4 Cloud Service

Cloud computing as a service-delivery mechanism can provide services at three levels: software, platform and infrastructure. Software as a service represents the most commonly used option for organizations in the cloud market: software is licensed on a subscription basis and centrally hosted. Platform as a service enables customers to develop, run, and manage applications without the complexity of building and maintaining the infrastructure typically associated with developing and launching an app. Infrastructure as a service offers access to computing resources such as servers, storage and networking provided by a vendor; companies run their own platforms and applications within the service provider's infrastructure. The work done in the literature can be divided into two categories:

A. predicting the relationship between people, and
B. maintaining security in the cloud environment.
5 Related Work on Forecasting Communication Between People

An efficient solution called the local directional path has been proposed to determine the direction of communication and has been compared with other algorithms. The approach in that work adds an extra edge to the graph and can predict, in acceptable time, whether the relationship between two people is constructive or destructive. The recognition of positive or negative relationship types and their similarity has been performed using the available links. In another approach, nodes are ranked according to existing criteria, and the forecasting operation proceeds from high-priority nodes to low-priority nodes. Forecasting the relationship between two people in dynamic networks has also been studied, because traditional methods assume a static network structure, whereas the structure and characteristics of dynamic networks change constantly over time; these issues complicate forecasting the edge sign. Forecasting the type of communication in complex networks has been studied extensively, and studies show that most work proceeds similarly and is based on routing. One study investigated which factors and criteria determine the relationship between individuals in social networks. The proposed method
selects criteria that are highly accurate, and the tests performed confirm the accuracy of the proposed method. A fuzzy evolutionary algorithm has also been proposed to forecast the type of communication between individuals in social networks. That algorithm is based on the ant algorithm and belongs to the category of unsupervised learning algorithms; it has been tested on several social networks and has shown good performance.
5.1 Related Work on Maintaining Security in Cloud Environments

One of the most common uses of the cloud environment is by mobile users who store their data in the cloud infrastructure; one study examined the challenges of this environment and provided solutions for each of them. IoT networks are based on a cloud computing environment in which objects are connected to the Internet and communicate with each other. Maintaining the security of people in this environment is one of the most challenging issues, and many studies have addressed it. In one work, an algorithm based on encryption methods was proposed to increase security protection; the algorithm is fuzzy-based, and the tests performed showed its efficiency.
6 Research Method

In this study, we provide a way to predict and estimate the extent of a privacy breach in order to improve user security and privacy. In this method, we use a neural network as the machine learning component. The proposed method estimates the extent of a privacy breach by building a suitable model and by using the WCC algorithm. One of the important points in privacy is not disclosing data about any individual. Therefore, we change the objective function of the algorithm so that, where possible, privacy violations can be identified and information leakage detected: we try to place very similar records in one cluster, so that we can show when the data belongs to a specific person and the principle of anonymity is not observed in the data.
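One way to read this idea is sketched below in Python; this is a hedged illustration rather than the authors' implementation, and the cluster count, size threshold k, and tightness threshold are illustrative parameters. Records of the released data are clustered, and a very small, very tight cluster suggests that its records can be linked back to a single individual, i.e. that anonymity is broken.

import numpy as np
from sklearn.cluster import KMeans

def flag_privacy_risks(X, n_clusters=10, k=3, tightness=1e-2, seed=0):
    # Cluster the released records; a cluster with fewer than k members
    # whose records are nearly identical marks a likely anonymity breach
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    risky = []
    for c in range(n_clusters):
        members = X[labels == c]
        if 0 < len(members) < k and members.std(axis=0).mean() < tightness:
            risky.append(c)
    return labels, risky

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:2] = [9.0, 9.0, 9.0, 9.0]     # two near-duplicate outlying records
print(flag_privacy_risks(X)[1])  # cluster indices flagged as potential leaks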
6.1 Implementation of the Proposed Method

The proposed algorithm is implemented by simulation. To do this, we use MATLAB, which provides various toolboxes that can be used to write the required code. In this research, we use the nftool toolbox to access neural network functions; the network used from this toolbox is feedforwardnet, which is used to train the model and categorize the data.
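For readers without MATLAB, the following is a rough Python analogue of the feedforwardnet workflow just described; it is a sketch only, with scikit-learn's MLPClassifier standing in for the MATLAB network and the hidden-layer size being an illustrative choice, not the authors' setting.

from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

# Load the Iris data, one of the two datasets used in this chapter
X, y = load_iris(return_X_y=True)

# A small feedforward network, analogous to MATLAB's feedforwardnet(10)
net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0)
net.fit(X, y)
print(net.score(X, y))  # training accuracy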
6.2 Evaluation of the Proposed Method

To evaluate the proposed algorithm and its improvement, we use two datasets, Iris and Wages. These datasets are freely available on the Internet, are very popular, and have been used in many machine learning articles. To evaluate the algorithm, we use two factors: accuracy and execution time. We measure runtime using the tic and toc commands.
7 Research Findings

In this research, we use MATLAB (MATLAB 2012) for simulation, and the Iris and Wages datasets to obtain results and compare different methods. These datasets are freely available on the Internet, are very popular, and have been used in many machine learning articles. In the proposed algorithm, the values 10, 20, and 100 are considered for the number of iterations, respectively. The results on each dataset are compared based on the best solution identified over 10 executions of each algorithm, and the processing time needed to obtain the best solution is also measured. The algorithms were implemented in MATLAB 2012 on a system with a 2 GHz Intel Core i7 processor and 8 GB of RAM. In general, the steps of the program can be summarized as follows:

a. Read the input data
b. Initialize the required parameters of the algorithm
c. Train the neural network
d. Test the data using the trained neural network
e. Evaluate the results
f. Show the outputs
To run the program, we must first read the data and save it in variables, as seen in the code in Figs. 4 and 5. Because the data in the sets are ordered, we randomly shuffle the data to remove this ordering and gain more confidence in the results. We then run the algorithms in turn, and compare their outputs with the original labels of the set to obtain and compare the accuracy of the different algorithms. We use the tic and toc commands to calculate the execution time of each algorithm.
Fig. 4 Coding Screenshot 1
Fig. 5 Coding Screenshot 2
7.1 Test Set Data

To execute the desired algorithms, we divide the datasets into two parts, training and testing: we build the model using the training data and then test its accuracy using the test data. We use 75% of each dataset for training and 25% for testing. To evaluate the algorithms, we use the execution time and the accuracy on the test data. The dataset is shown in Fig. 6; it is a hypothetical set used to obtain the desired results. For this dataset, the prediction results were obtained with the neural network model, and its accuracy is about 80% on the test data. Finally, we consider one more evaluation parameter. The biggest advantage of the privacy violation algorithm is its low execution time, which makes the algorithm efficient and broadly applicable; we therefore measure the execution time of the algorithm using MATLAB functions. As shown in Figs. 6 and 7 and Tables 1 and 2, the proposed algorithm is more accurate than the WCC; better clustering can be performed using this algorithm, and we therefore achieve better privacy-violation detection.
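The evaluation protocol above (shuffle, 75/25 split, accuracy, runtime) can be mirrored in Python as a hedged sketch; time.perf_counter plays the role of MATLAB's tic/toc, and the model settings are illustrative rather than the authors' exact configuration.

import time
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
# Shuffle and hold out 25% of the data for testing, as described above
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, shuffle=True, random_state=0)

t0 = time.perf_counter()                   # counterpart of MATLAB's tic
model = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
elapsed = time.perf_counter() - t0         # counterpart of MATLAB's toc

print("accuracy:", accuracy_score(y_te, pred))
print("runtime (s):", round(elapsed, 3))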
Fig. 6 Data set
Fig. 7 Graph of forecast results

Table 1 Resultant Data Set 1 (accuracy, %)

  Data set                              Iris   Wages   Wine
  Best accuracy (proposed method)        82     81      80
  Average accuracy (proposed method)     79     76      78
  Best accuracy (WCC)                    67     68      72
  Average accuracy (WCC)                 65     64      68
Table 2 Resultant Data Set 2 (execution time)

  Data set                      Iris   Wages   Wine
  Least time (best case)         40    112      65
  Average time                   43    117      70
  Most time (worst case)         49    131      81
8 Summary and Conclusion

Privacy can be considered the most important issue in cloud computing. The goal of clustering is to find a regular structure within a set of unlabeled data. To maintain privacy in cloud computing, efforts are made to publish data and make it available to users in such a way that the privacy of cloud users is not compromised. Various methods are used to protect users' privacy, including anonymization, encryption, data distortion, and generalization. Another approach is to use machine learning and its various algorithms to predict the likelihood of a breach of user privacy (according to the data). One of the most important issues in machine learning and data mining is information and data security. Placing data in separate, homogeneous categories increases the efficiency and improves the results of data mining algorithms. In classification, each record is assigned to a predefined class, whereas in clustering there is no information about the classes present in the data; in other words, the clusters themselves are extracted from the data. In fact, one of our goals in data clustering is to assess privacy using data categorization. One of the most important and widely used machine learning algorithms is the artificial neural network, which tries to increase accuracy through the constructed model and its hidden layers. This algorithm is simple and fast and has many applications in various sciences. In this study, we examined different privacy methods and showed how they can be used to improve privacy protection and prevent violations.
References

1. Rezaei, O. (2014). A study on cloud computing. In National Conference on Computer Science and Engineering with a Focus on National Security and Sustainable Development.
2. Sadr al-Sadati, M., & Kargar, M. J. (2012). Security challenges in cloud computing and providing a solution to improve its security in order to develop public services of e-government. In 8th Conference on Advances in Science and Technology.
3. Arjmand, M. (2013). Graph coloring using genetic algorithm. Master Thesis.
4. Sadashiv, N., & Kumar, S. D. (2011, August). Cluster, grid and cloud computing: A detailed comparison. In 2011 6th International Conference on Computer Science & Education (ICCSE) (pp. 477–482). IEEE.
5. Kim, H., Kim, W., & Kim, Y. (2010). Experimental study to improve resource utilization and performance of cloud systems based on grid middleware. Journal of Communication and Computer, 7(12), 32–43.
6. Koch, F., Assunção, M. D., Cardonha, C., & Netto, M. A. (2016). Optimizing resource costs of cloud computing for education. Future Generation Computer Systems, 55, 473–479.
7. Teng, F. (2011). Resource allocation and scheduling models for cloud computing. Dissertation, École Centrale de Paris, Châtenay-Malabry.
Cuckoo Search Optimization Algorithm and Its Solicitations for Real World Applications M. Niranjanamurthy, M. P. Amulya, N. M. Niveditha, and Dharmendra Chahar
Abstract Cuckoo Search is an optimization algorithm developed by Yang and Deb in 2009. It is used to solve optimization problems and was inspired by a species of bird, the cuckoo, that lays its eggs in the nests of other host birds. The laying and breeding of cuckoo eggs is the primary inspiration for the development of this optimization algorithm, which improves efficiency, accuracy, and convergence rate. The cuckoo optimization algorithm (COA) is used for continuous non-linear optimization. The algorithm can be extended to more complex situations in which each nest holds several eggs that represent a set of solutions. This work presents a brief review of the cuckoo search algorithm and of optimization and its issues. Different cuckoo search classes and various cuckoo search applications are reviewed.

Keywords Cuckoo search · Cuckoo-search variants · Cuckoo-search algorithm · CS pseudo-code · Levy flight · Flowchart of CSA
M. Niranjanamurthy (B) Department of Master of Computer Applications, M S Ramaiah Institute of Technology, Bangalore, India e-mail: [email protected] M. P. Amulya · N. M. Niveditha Department of Computer Science and Engineering, B G S Institute of Technology, Adhichunchanagiri University, B G Nagar, Mandya 571448, India e-mail: [email protected] N. M. Niveditha e-mail: [email protected] D. Chahar M D College Pallu, Hanumangarh, Rajasthan, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_38
1 Introduction

An optimization algorithm is a procedure that is performed iteratively, comparing several solutions until an optimal or satisfactory solution is found. Optimization is one of the key components of AI: the essence of most AI algorithms is to build an optimization model and learn the parameters of the objective function from the given data. In a computational optimization problem, the goal is to locate the best of all solutions, i.e., to find a solution in the feasible region that has the minimum (or maximum) value of the objective function. Optimization is a mathematical technique for finding a maximum or minimum value of a function of several variables subject to a set of constraints, as in linear programming or systems analysis. In optimization under uncertainty, or stochastic optimization, the uncertainty is incorporated into the model; robust optimization methods can be used when the parameters are known only within certain limits, and the goal is to find a solution that is feasible for all the data and optimal in some sense. Cuckoo Search (CS) uses the following representations: every egg in a nest represents a solution, and a cuckoo egg represents a new solution. In the simplest form, each nest has one or more eggs; the algorithm can be extended to more complex cases in which each nest holds several eggs that represent a set of solutions. The cuckoo search algorithm is a recently developed metaheuristic optimization algorithm used to solve optimization problems. Some species, such as the Ani and Guira cuckoos, lay their eggs in a host bird's nest and may remove the host's eggs to improve the hatching probability of their own. The algorithm follows three idealized rules: (1) each cuckoo lays one egg at a time and drops it into a randomly chosen nest; (2) the best nests with high-quality eggs carry over to later generations; (3) the number of available host nests is fixed, and the host bird detects an egg laid by a cuckoo with probability pa ∈ [0, 1]. Cuckoo-search variants include Modified CS, Binary CS, and Enhanced CS. Advantages of CS: it deals with multi-criteria optimization problems, it can be hybridized with other swarm-based algorithms, it aims to accelerate convergence, and it is simple and easy to implement. The applications of CS to engineering optimization problems have demonstrated its promising efficiency. Some of the applications are:

• Spring design and welded-beam design
• Design optimization of truss structures
• Engineering optimization
• Steel frames
• Wind turbine blades
• Reliability problems
• Stability analysis
• Solving the nurse scheduling problem
• Efficient computation
• A new Quantum-Inspired CS developed to solve Knapsack problems
• Efficiently generating independent test paths for structural software testing and test-data generation
• Applied to training

Other examples are:

• Hybrid CS for shop scheduling problems
• Improved CS for global optimization, improving accuracy and convergence rate
• Modified CS for unconstrained optimization problems
• Levenberg–Marquardt (LM)-based CS, which helps reduce errors and avoid local minima
• Multi-objective CS for job-scheduling problems
• A novel complex-valued CS, reducing local convergence and enhancing the information of nests
• Discrete CS for solving the traveling-salesman problem
• Neural-based CS for assessing employee health and safety (HS) risk in the workplace
2 Related Work

Cuckoo search is one of many nature-inspired algorithms used extensively to solve optimization problems in different fields of engineering. It is very effective for global optimization because it can maintain a balance between local and global random walks using a switching parameter. The switching parameter of the original CS algorithm is fixed at 25%, and few studies have assessed the impact of a dynamically changing switching parameter on the performance of the CSA [1]. The cuckoo search algorithm is a recently developed metaheuristic optimization algorithm used to solve optimization problems. It is a nature-inspired metaheuristic based on the brood parasitism of some cuckoo species, together with the random walks of Lévy flights. Usually the parameters of cuckoo search are held constant, which can reduce the efficiency of the algorithm [2]. The metaheuristic cuckoo search (CS) technique is also used to determine optimal structural designs over discrete and continuous variables. It is based on the brood-parasitic behavior of some cuckoo species, together with the Lévy-
flight behavior of certain birds and fruit flies. CS is a population-based optimization algorithm and, like many other metaheuristic algorithms, starts with a random initial population, treated as host nests or eggs [3]. However, a single search strategy in CS gives all cuckoos similar search behavior, which makes the algorithm prone to falling into local optima. Moreover, whether CS can successfully solve a problem depends to a large degree on the values of its control parameters, and using trial and error to determine these values often incurs a large computational cost [4]. The best variants of the CSA make the step size of the Lévy flight adaptive. An optimization algorithm is a set of steps used to solve an optimization problem, and the most popular ones are based on nature-inspired ideas for selecting the best alternative under the given objective functions [5]. The simplicity and success of the cuckoo search (CS) algorithm has motivated researchers to apply it to multi-objective optimization; CS has been used to solve multi-objective optimization problems (MOPs) based on decomposition techniques [6]. The cuckoo search algorithm has been applied to the multi-skill resource-constrained project scheduling problem [7]. Since the cuckoo search algorithm can easily fall into local optima, it sometimes suffers from poor global search results [8]. The cuckoo search algorithm has also been used to optimize the node points on an integration range in order to obtain more accurate results; this method can compute not only ordinary integrals but also singular and oscillatory integrals [9]. The cuckoo search algorithm (CSA) is a behavior-based algorithm that is effective for optimization problems, including clustering; k-means is also effective for the clustering problem, particularly for its fast convergence [10]. A discrete cuckoo search algorithm, a metaheuristic optimization algorithm used to solve real-world problems, has been applied to the multiple-input multiple-output symbol detection problem; however, directly applying cuckoo search is problematic because it does not consider the convex and discrete nature of the symbol detection problem [11, 12]. CS is adapted from swarm intelligence concepts and inspired by cuckoo birds, which are brood parasites: cuckoos lay their eggs not in their own nests but in the nests of other birds [13]. The following fundamental principles can be used to solve a given optimization problem with the CS algorithm: (i) selecting the best solution by selecting the best nests; (ii) replacement of host eggs depending on the quality of the new solutions; (iii) detection of some cuckoo eggs by the host birds and replacement based on the quality of local random walks [14].
3 Cuckoo Search Optimization Algorithm

Cuckoo Search (CS) uses the following representations: every egg in a nest represents a solution, and a cuckoo egg represents a new solution. In the simplest form, every nest has one egg. The algorithm can be extended to more complex cases in which each nest holds several eggs that represent a set of solutions. The cuckoo search algorithm is also evaluated on optimization benchmark functions; standard benchmark functions are used to compare the cuckoo algorithm with other algorithms.

Basic cuckoo search algorithm: The algorithm below (Fig. 1) embodies three rules: every cuckoo lays one egg at a time and drops it into a randomly selected nest; the best nests with high-quality eggs carry over to the next generation; and the number of available host nests is fixed, with the host bird detecting an egg laid by a cuckoo with probability pa ∈ [0, 1]. The pseudo-code can be summarized as in Fig. 2, which shows the CSA search procedure. Like other swarm-based algorithms, the CSA starts with an initial population of n host nests. These initial host nests are randomly visited by cuckoos carrying eggs, and random Lévy flights are performed to lay the eggs. Nest quality is then evaluated and compared with that of another random host nest; if the new nest is better, it replaces the old host nest, and this new solution holds the egg laid by the cuckoo. If the host bird discovers the egg, with probability pa ∈ (0, 1), it either throws out the egg or abandons the nest and builds a new one. This step is implemented by replacing the abandoned solutions with new random solutions.

begin
  Generate iteration time t = 1
  Initialize with random vector values and parameters
  Evaluate the fitness of each individual (nest) and determine the best individual with the best fitness value
  while (stopping criterion is not met or t < MaxGeneration)
    Get a cuckoo randomly by local random walk or Levy flights
    Evaluate its fitness Fm
    Choose a nest among n (say, n) randomly
    if (Fm < Fn)
      replace n by the new nest m
    end if
    A fraction (Pa) of worse nests are abandoned and new ones are built
    Keep the best nests with quality solutions
    Rank the solutions and find the current best
    Update the generation number t = t + 1
  end while
end
Fig. 1 Basic cuckoo search algorithm
1: Objective function f(X), X = (x1, x2, …, xd)^T
2: Generate initial population of n host nests Xi (i = 1, 2, …, n)
3: while t < Max_iterations do
4:   Get a cuckoo randomly by Levy flights
5:   Evaluate its quality/fitness Fi
6:   Choose a nest among n (say, j) randomly
7:   if Fi > Fj then
8:     replace j by the new solution;
9:   end if
10:  A fraction (Pa) of worse nests are abandoned and new ones are built;
11:  Keep the best solutions
12:  Rank the solutions and find the current best
13: end while
14: Postprocess results and visualization
Fig. 2 The pseudo-code
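The procedure of Figs. 1 and 2 can be sketched in Python as follows. This is a hedged, minimal implementation rather than the exact algorithm of any cited work: it assumes fitness minimization, uses Mantegna's algorithm for the Lévy step, and all parameter values (n_nests, pa, alpha, the Lévy exponent 1.5) are illustrative defaults.

import numpy as np
from math import gamma, pi, sin

def levy_step(beta, dim, rng):
    # Mantegna's algorithm for drawing Levy-stable step lengths
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_search(f, dim, n_nests=25, pa=0.25, alpha=0.01,
                  lb=-5.0, ub=5.0, max_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    nests = rng.uniform(lb, ub, (n_nests, dim))
    fitness = np.array([f(x) for x in nests])
    for _ in range(max_iter):
        # Global random walk (Eq. 2): one cuckoo explores via a Levy flight
        i = rng.integers(n_nests)
        candidate = np.clip(nests[i] + alpha * levy_step(1.5, dim, rng), lb, ub)
        j = rng.integers(n_nests)  # nest chosen at random for comparison
        if f(candidate) < fitness[j]:
            nests[j], fitness[j] = candidate, f(candidate)
        # Local random walk (Eq. 1): a fraction pa of nests is abandoned and
        # rebuilt by stepping along the difference of two random nests
        mask = (rng.random((n_nests, 1)) < pa).astype(float)
        step = rng.random((n_nests, 1)) * (rng.permutation(nests) - rng.permutation(nests))
        nests = np.clip(nests + mask * step, lb, ub)
        fitness = np.array([f(x) for x in nests])
    best = int(np.argmin(fitness))
    return nests[best], fitness[best]

# Example: minimize the 5-dimensional sphere function
best_x, best_f = cuckoo_search(lambda x: float(np.sum(x ** 2)), dim=5)
print(best_f)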
Cuckoo birds lay their eggs in the nests of other host birds (usually of different species) with astonishing abilities, such as choosing nests containing recently laid eggs and removing existing eggs to increase the hatching probability of their own eggs. Some host birds can counter this parasitic behavior of cuckoos and throw out the discovered alien eggs or build a new nest in a different location. This cuckoo breeding relationship is used to develop the CS algorithm. Natural systems are complex, so they cannot be modeled exactly by a computer algorithm in their basic form, and simplification of natural systems is necessary for an effective implementation in computer algorithms. Yang and Deb simplified the cuckoo reproduction process into three idealized rules. Three different operators define the evolution process of CS: (A) Lévy flight, (B) replacement of some nests by constructing new solutions, and (C) an elitist selection strategy. To generate a new solution x_i^(t+1) for, say, cuckoo i, a Lévy flight is performed:

x_i^(t+1) = x_i^(t) + α ⊕ Lévy(λ)

where α > 0 is the step size, which should be related to the scale of the problem of interest; in most cases we can use α = O(1). The product ⊕ means entry-wise multiplication. The CSA is very effective for global optimization problems since it maintains a balance between the local random walk and the global random walk. The local and global random walks are defined by Eqs. 1 and 2, respectively, with their parameters summarized in Table 1.

x_i^(t+1) = x_i^t + α s ⊗ H(pa − ε) ⊗ (x_j^t − x_k^t)    (1)

x_i^(t+1) = x_i^t + α L(s, λ)    (2)
One of the most remarkable features of cuckoo search is the use of Lévy flights to generate new candidate solutions (eggs).
Table 1 Parameters of the local random walk and global random walk

  Parameter      Description
  x_j^t, x_k^t   Current positions selected by random permutation
  α              Positive step-size scaling factor
  x_i^(t+1)      Next position
  s              Step size
  ⊗              Entry-wise product of two vectors
  H              Heaviside function
  pa             Used to switch between local and global random walks
  ε              Random number from a uniform distribution
  L(s, λ)        Lévy distribution, used to define the step size of the random walk
Lévy flights are very simple search procedures, yet their results can be good compared with naive enumeration and plain random walks. Their lack of any form of memory is their major weakness, as memory can provide large improvements.
4 Flowchart of CSA

Figure 3 shows the flowchart of the Cuckoo Search Algorithm. Optimization algorithms developed from nature-inspired ideas deal with selecting the best alternative in the sense of a given objective function. An optimization algorithm can follow either a heuristic or a meta-heuristic approach. Heuristic approaches are problem-specific: each optimization problem has its own heuristic techniques, which are not applicable to other kinds of optimization problems. Optimization algorithms can thus be divided into two main categories, heuristic and meta-heuristic. Swarm-based algorithms belong to the meta-heuristic category and include the bee colony, particle swarm, firefly, and cuckoo search algorithms. Table 2 presents comparison factors for the Genetic Cuckoo Optimization Algorithm (GCOA) versus Cuckoo. CS is a new evolutionary optimization algorithm inspired by the lifestyle of the cuckoo bird family. This section presents statistics on CS drawn from the reviewed articles. According to the statistical results shown in Fig. 4, the CS algorithm has been applied to many kinds of optimization problems across different categories. From Fig. 4, it is clear that the major categories considered in the CSA literature were engineering, followed by pattern recognition, software/product testing and data generation, networking, job scheduling, and data fusion and wireless sensor networks [15].
Fig. 3 Flowchart of CSA
Table 2 Genetic Cuckoo Optimization Algorithm (GCOA) versus Cuckoo

  Comparison factor             Cuckoo   GCOA
  Find optimal solution         Higher   Higher
  Reach optimal search space    Lower    Lower
  Iterations needed             Lower    Lower
Figure 5 shows the performance measures considered in the CSA, with various parameters — cost, rate, energy, reduced mass searching, secure partitioning, and SSL — plotted against performance and count.
Fig. 4 Major categories in the Cuckoo Search Algorithm
Fig. 5 Performance measures considered in CS
5 Conclusion

CS is a fine search algorithm inspired by the breeding behavior of cuckoos. This paper gives a short description of the applications of this nature-inspired algorithm. The CS algorithm has been used in various domains, including industry, image processing, wireless sensor networks, flood forecasting, document clustering, speaker recognition, shortest-path computation in distributed systems, the health sector, and job scheduling. The cuckoo algorithm outperforms various other nature-inspired algorithms in terms of improved performance and lower computational time. The cuckoo search algorithm is considered one of the most promising meta-heuristic algorithms for solving different problems in different fields. Nevertheless, it suffers from a premature convergence problem
for high-dimensional problems, because the algorithm converges quickly. This paper has presented the Cuckoo Search Optimization Algorithm and its solicitations for real-world applications, the three idealized rules, applications of CS in engineering optimization, the flowchart, and comparisons.
References

1. Mareli, M., & Twala, B. (2018). An adaptive Cuckoo search algorithm for optimisation. Applied Computing and Informatics, 14(2), 107–115. https://doi.org/10.1016/j.aci.2017.09.001.
2. Joshia, A. S., Kulkarni, O., Kakandikar, G. M., & Nandedkar, V. M. (2017). Cuckoo search optimization—A review. In Materials Today: Proceedings (Vol. 4, Issue 8, pp. 7262–7269). https://doi.org/10.1016/j.matpr.2017.07.055.
3. Kaveh, A. (2017). Cuckoo search optimization. In Advances in metaheuristic algorithms for optimal design of structures. Cham: Springer. https://doi.org/10.1007/978-3-319-46173-1_10.
4. Gao, S., Gao, Y., Zhang, Y., & Xu, L. (2019). Multi-strategy adaptive cuckoo search algorithm. IEEE Access, 7, 137642–137655. https://doi.org/10.1109/ACCESS.2019.2916568.
5. Reda, M., Haikal, A. Y., Elhosseini, M. A., & Badawy, M. (2019). An innovative damped cuckoo search algorithm with a comparative study against other adaptive variants. IEEE Access, 7, 119272–119293. https://doi.org/10.1109/ACCESS.2019.2936360.
6. Chen, L., Gan, W., Li, H., Xu, X., Cao, L., & Feng, Y. (2019). A multi-objective cuckoo search algorithm based on decomposition. In 2019 Eleventh International Conference on Advanced Computational Intelligence (ICACI), Guilin, China (pp. 229–233). https://doi.org/10.1109/ICACI.2019.8778450.
7. Dang Quoc, H., Nguyen The, L., Nguyen Doan, C., & Phan Thanh, T. (2020). New cuckoo search algorithm for the resource constrained project scheduling problem. In 2020 RIVF International Conference on Computing and Communication Technologies (RIVF), Ho Chi Minh, Vietnam (pp. 1–3). https://doi.org/10.1109/RIVF48685.2020.9140728.
8. Zhang, M., He, D., & Zhu, C. (2016). Cuckoo search algorithm based on hybrid-mutation. In 2016 12th International Conference on Computational Intelligence and Security (CIS), Wuxi (pp. 538–542). https://doi.org/10.1109/CIS.2016.0131.
9. Zexi, D., & Feidan, H. (2015). Cuckoo search algorithm for solving numerical integration. In 2015 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER), Shenyang (pp. 1508–1512). https://doi.org/10.1109/CYBER.2015.7288168.
10. Girsang, A. S., Yunanto, A., & Aslamiah, A. H. (2017). A hybrid cuckoo search and K-means for clustering problem. In 2017 International Conference on Electrical Engineering and Computer Science (ICECOS), Palembang (pp. 120–124). https://doi.org/10.1109/ICECOS.2017.8167117.
11. Jung, D., Eom, C., & Lee, C. (2019). Discrete cuckoo search algorithm for MIMO detection. In 2019 34th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC), JeJu, Korea (South) (pp. 1–4). https://doi.org/10.1109/ITC-CSCC.2019.8793322.
12. Zefan, C., & Xiaodong, Y. (2017). Cuckoo search algorithm with deep search. In 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu (pp. 2241–2246). https://doi.org/10.1109/CompComm.2017.8322934.
13. Bustamam, A., Nurazmi, V. Y., & Lestari, D. (2018). Applications of cuckoo search optimization algorithm for analyzing protein-protein interaction through Markov clustering on HIV. In AIP Conference Proceedings 2023, 020232 (pp. 020232-1–020232-6). https://doi.org/10.1063/1.5064229.
14. Yasar, M. (2016). Optimization of reservoir operation using cuckoo search algorithm: Example of Adiguzel Dam, Denizli, Turkey. Mathematical Problems in Engineering, 2016, Article ID 1316038, 7 pages. https://doi.org/10.1155/2016/1316038.
15. Mohamad, A., Zain, A. M., Bazin, N. E. N., & Udin, A. (2013). Cuckoo search algorithm for optimization problems—A literature review. AMM, 421, 502–506. https://doi.org/10.4028/www.scientific.net/amm.421.502.
Academic Article Recommendation by Considering the Research Field Trajectory Shi-jie Lin, Guanling Lee, and Sheng-Lung Peng
Abstract Academic resource discovery has been an open and challenging problem. Finding the needed articles among a large number of research papers has always been a time-consuming task, especially for scholars or students who are new to a field. With the advancement of information technology and the evolution of applications, each research field has its own evolutionary process. If beginners can master the evolution and the technology of a research field, they will gain a deep understanding of the current target area. Therefore, in this paper, we adopt the concept of the research field trajectory to develop a paper recommendation method that is useful for beginners. The goal is to provide research papers systematically, so that researchers who are new to a field can quickly find academic research papers related to their own research and form a solid research foundation. Furthermore, the method can also recommend emerging research papers in the field. To show the benefit of our approach, a set of experiments is performed. The experimental results show that the proposed method is effective in solving this problem.

Keywords Recommendation system · Content-based recommendation · Academic article recommendation · Research field trajectory
S. Lin · G. Lee (B) Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien, Taiwan, Republic of China e-mail: [email protected] S. Lin e-mail: [email protected] S.-L. Peng Department of Creative Technologies and Product Design, National Taipei University of Business, Taipei City, Taiwan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_39
1 Introduction

With the rapid development of computer technology and the Internet, more and more applications have emerged, such as search engines and e-commerce. The advent of these applications has made digital data grow at an alarming rate. Therefore, how to find information that satisfies users' needs efficiently and effectively is an important issue. To cope with this problem, recommendation systems have received widespread attention in recent years, and recommendation technology is now widely used for goods, music and movie recommendations. Furthermore, with the advancement of digital libraries and the Budapest Open Access Initiative [14] proposed by the Open Society Institute in 2002, academic research papers have become more and more accessible. According to statistics covering 1665 to 2009 [8], more than 50 million academic research papers have been published. Statistics from the Digital Bibliography and Library Project (DBLP) [1] also show that the number of academic research papers published in the field of computer science has increased year by year (from 210,245 in 2010 to 403,901 in 2017, i.e., to about 190% of the 2010 figure). Within such a large body of research papers, finding papers that satisfy a user's information need efficiently and effectively has become a difficult task, and many researchers consider finding related research papers to be very time-consuming. Therefore, over the past 20 years, research paper recommendation systems and methods have been widely discussed. The methods applied to research paper recommendation can be roughly divided into content-based filtering (CBF) [2, 6, 7, 11, 12], collaborative filtering (CF) [5, 13, 16], reference-based recommendation [3, 4, 17], graph-based recommendation [9, 10, 15] and hybrid recommendation approaches. We find that few of these papers discuss how to recommend suitable research papers for scholars or students who are new to a field. For beginners, it is difficult to issue an accurate query (CBF) because they lack background knowledge of the target domain. Furthermore, beginners have read few research papers, so if CF is used as the recommendation method, we face the cold-start problem. In addition, with the advancement of information equipment and technology, and the needs of application evolution, each research field has its own evolutionary course. If beginners can master the evolution of a research field and the technology it has bred, it will greatly help them understand the current target area in depth. Therefore, in this paper, we propose a paper recommendation method based on the research field trajectory, which can recommend user-related papers systematically. The organization of this paper is as follows. The proposed method is presented in Sect. 2. Section 3 provides experimental results and discussions. Finally, Sect. 4 concludes this work.
2 Methods

Figure 1 shows the ER diagram of the database for the developed system. The basic search process is as follows. Step 1: find the articles containing the keywords the user entered. Step 2: find the authors of the articles found in step 1. Step 3: find all the research papers written by the top k% of the authors found in step 2, ranked by Author_score (how this score is calculated is presented later). Finally, merge the keywords of the articles found in step 3 and, according to the resulting keyword vector, recommend related articles. In the proposed method, the vector space model is used
Fig. 1 ER diagram of the database
to represent each article, and cosine similarity is adopted to measure the similarity between two articles. As mentioned above, in step 2 we use author_score to select the top k% of authors. In the proposed method, the higher the author_score, the greater the author's influence. The basic idea is that an author's influence is proportional to the quality of the author's published papers and to the author's activity. Therefore, in our approach, author_score is calculated as follows:

Author_score = Quality × Activity    (1)

In Eq. 1, Quality is measured as the average citation count of the author:

Quality = (total citations of the papers published by the author) / (number of papers published by the author)    (2)
For example, if author 1 published 3 papers cited by 20, 30 and 40 papers respectively, the Quality of author 1 is (20 + 30 + 40)/3 = 30. Although Quality is an indicator of Author_score, it cannot reflect the activity of the author. For example, if author 2 published 4 papers per year from 2010 to 2016, with an average citation count of 28, it is not objective to say that author 1 is better than author 2. Therefore, the concept of Activity is proposed to overcome this. Referring to Eq. 3, Activity is measured as the number of years in which the author has published articles divided by the number of years in the search range:

Activity = (number of years in which the author has published articles) / (number of years in the search range)    (3)
For example, assume we want to search for papers published from 2010 to 2016. Table 1 summarizes the publication records of three authors. The Activities of authors 1, 2 and 3 are 3/7, 7/7 and 4/7, respectively. Continuing the earlier example, the Qualities of authors 1 and 2 are 30 and 28, respectively, so according to Eq. 1 the Author_scores of authors 1 and 2 are 12.86 (30 × 3/7) and 28 (28 × 1), respectively. As mentioned above, in step 3 the proposed method selects the top k% of authors according to Author_score, merges the keyword vectors of the articles published by them, modifies the query keyword vector accordingly, and recommends the related articles. In the following, we set k to 25, i.e., the top 25% of authors are selected.
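The scoring defined by Eqs. 1–3, together with the cosine-similarity matching of keyword vectors, can be expressed compactly in Python. This is a hedged sketch with helper names of our own choosing; it reproduces the paper's worked example (author 1 scores 12.86, author 2 scores 28).

import numpy as np

def author_score(citations_per_paper, pub_years, search_years):
    quality = sum(citations_per_paper) / len(citations_per_paper)            # Eq. 2
    activity = len(set(pub_years) & set(search_years)) / len(search_years)   # Eq. 3
    return quality * activity                                                # Eq. 1

def cosine_similarity(u, v):
    # Similarity between two keyword vectors in the vector space model
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

years = range(2010, 2017)  # the 7-year search range 2010-2016
print(author_score([20, 30, 40], [2012, 2013, 2014], years))        # 12.857...
print(author_score([28, 28, 28, 28], list(years), years))           # 28.0
print(cosine_similarity(np.array([1, 0, 2]), np.array([1, 1, 2])))  # ~0.913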
Academic Article Recommendation by Considering …
451
Table 1 Publication records of author 1, author 2 and author 3 over 2010–2016 (V = published in that year)

  Author 1: V in 3 of the 7 years
  Author 2: V in all 7 years (2010–2016)
  Author 3: V in 4 of the 7 years
3 Experimental Results

To verify the proposed method, we compare our approach with two basic methods: content-based recommendation (recommend articles according to content similarity) and citation-number-based recommendation (recommend articles according to citation counts). We collected the articles published between 2010 and 2017 from the IEEE Xplore Digital Library as the experimental dataset. The numbers of articles, keywords and authors contained in the dataset are 286,740, 351,350, and 554,689, respectively. To show the effectiveness of the recommendation method, we selected 3 research papers from the IEEE Xplore Digital Library, each of which cites papers contained in the dataset. We then used the keywords of each paper as query keywords and made paper recommendations according to the three methods. Taking the cited papers as the correct answers of the recommendation, the MAPs (Mean Average Precision) of the three approaches are listed in Tables 2, 3 and 4. Furthermore, from the viewpoint of research trends, we used "grid computing" as the initial search term to see whether articles with the keywords "cloud computing" or "distributed computing" would be recommended. Grid computing and cloud computing can be said to have developed from distributed computing: grid computing is a super-virtual computer composed of a loosely coupled computer cluster, and as the Internet became more sophisticated and complex, the term cloud computing emerged to describe today's very complex network architectures. Table 5 lists the number of articles containing the keywords "cloud computing" or "distributed computing" among the top 20 papers recommended by each of the three methods.
Table 2 MAPs of the three methods (1)

  Title: Using dVRK teleoperation to facilitate deep learning of automation tasks for an industrial robot (2017)
  Keywords: Education, Service robots, Manipulators, Machine learning, Grippers

  Method                  MAP
  Our approach            0.62
  Content based           0.46
  Citation number based   0
Table 3 MAPs of the three methods (2)

  Title: A framework for energy efficient control in heterogeneous cloud radio access networks (2016)
  Keywords: Interference, Cloud computing, Resource management, Computer architecture, Optimization, Frequency control, Radio access networks

  Method                  MAP
  Our approach            0.47
  Content based           0.31
  Citation number based   0.08
Table 4 MAPs of the three methods (3)

  Title: The Interplay Between Massive MIMO and Underlaid D2D Networking (2016)
  Keywords: Interference, Receivers, MIMO, Transmitters, Signal to noise ratio, Uplink, Arrays

  Method                  MAP
  Our approach            0.38
  Content based           0.22
  Citation number based   0.08
Table 5 Number of articles containing the keywords "cloud computing" or "distributed computing"

  Method            Number of articles
  Our approach      16
  Content based     7
  Citation based    10
Of the first 20 academic papers recommended by our approach, 16 are related to cloud computing or distributed computing, more than for the other two approaches. The experimental results verify that the proposed method can effectively find academic research papers related to the development and evolution of the query keywords.
4 Conclusions

In this paper, we proposed a research-field-trajectory-based paper recommendation method that is useful for beginners. The basic idea is that if beginners can master the evolution and the technology of a research field, they will gain a deep understanding of the current target area; furthermore, the method can also recommend emerging research papers in the field to researchers. To show the effectiveness of the proposed method, we collected 286,740 articles published between 2010 and 2017 from the IEEE Xplore Digital Library for our experiments, and the experimental results showed that the proposed method can recommend research papers that reflect the evolution of research.

Acknowledgements This work was partially supported by the Ministry of Science and Technology, ROC, under contract MOST 108-2221-E-259-019.
References

1. Digital Bibliography & Library Project. (2018). Retrieved from https://dblp.uni-trier.de/statistics/newrecordsperyear.html.
2. Ekstrand, M. D., Kannan, P., Stemper, J. A., Butler, J. T., Konstan, J. A., & Riedl, J. T. (2010). Automatically building research reading lists. In Proceedings of the Fourth ACM Conference on Recommender Systems.
3. Gipp, B., & Beel, J. (2009). Citation Proximity Analysis (CPA)—A new approach for identifying related work based on Co-Citation Analysis. In Proceedings of the 12th International Conference on Scientometrics and Informetrics.
4. Gori, M., & Pucci, A. (2006). Research paper recommender systems: A random-walk based approach. In Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence.
5. Haruna, K., & Ismail, M. A. (2017). Research paper recommender system evaluation using collaborative filtering. In Proceedings of the 25th National Symposium on Mathematical Sciences.
6. Hassan, H. A. M. (2017). Personalized research paper recommendation using deep learning. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization.
7. Jiang, Y., Jia, A., Feng, Y., & Zhao, D. (2012). Recommending academic papers via users' reading purposes. In Proceedings of the Sixth ACM Conference on Recommender Systems.
8. Jinha, A. (2010). Article 50 million: An estimate of the number of scholarly articles in existence. Learned Publishing.
9. Lao, N., & Cohen, W. W. (2010). Relational retrieval using a combination of path-constrained random walks. Machine Learning.
10. Liang, Y., Li, Q., & Qian, T. (2011). Finding relevant papers based on citation relations. In Proceedings of the International Conference on Web-Age Information Management.
11. Mai, F., Galke, L., & Scherp, A. (2018). Using deep learning for title-based semantic subject indexing to reach competitive performance to full-text. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries.
12. Manning, C. D., Raghavan, P., & Schütze, H. (2009). An introduction to information retrieval (Online). Cambridge, England: Cambridge University Press.
13. McNee, S. M., Albert, I., Cosley, D., Gopalkrishnan, P., Lam, S. K., Rashid, A. M., Konstan, J. A., & Riedl, J. (2002). On the recommending of citations for research papers. In Proceedings of the ACM Conference on Computer Supported Cooperative Work. 14. Open Access Initiative. (2002). Retrieved from https://www.budapestopenaccessinitiative.org/. 15. Petri, M., Moffat, A., Wirth, A. (2014). Graph representations and applications of citation networks. ADCS. 16. Pohl, S., Radlinski, F., & Joachims, T. (2007). Recommending related papers based on digital library access records. In Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries. 17. Williams, K., & Lee Giles, C. (2016). Improving similar document retrieval using a recursive pseudo relevance feedback strategy. In Proceedings of the ACM/IEEE-CS on Joint Conference on Digital Libraries.
Classification of Kannada Hand Written Alphabets Using Multi-class Support Vector Machine with Convolution Neural Networks

Kusumika Krori Dutta, Aniruddh Herle, Lochan Appanna, A. Tushar, and K. Tejaswini

Abstract Handwritten character recognition is onerous due to the enormous variation in handwriting among people. In the recent past, many machine learning approaches have been developed to recognize handwritten characters, but most of them center on the English language, with some work on other languages such as Arabic, Chinese and Bangla. This work aims to recognize handwritten alphabets of the Kannada language, spoken in the southern part of India. Many Kannada alphabets differ only marginally, so classification becomes arduous. Convolution Neural Networks (CNNs) have shown great promise in this field, both as classifiers and as image feature extractors. In this paper, a CNN model built on the pre-trained ResNet-50 network is used both as a classifier and as an image feature extractor. A Support Vector Machine (SVM) trained on the features extracted by the ResNet-50 CNN is found to outperform the single CNN model. The single CNN model was built and trained using TensorFlow with a Keras front-end, in Python 3; for the combined CNN and SVM model, MATLAB version 2019b was used.

Keywords Character recognition · Machine learning · Convolution neural networks · Feature extractor · Support vector machine · MATLAB
K. Krori Dutta (B) · A. Herle · L. Appanna · A. Tushar · K. Tejaswini Department of Electrical and Electronics Engineering, M. S. Ramaiah Institute of Technology, Bangalore 560054, India e-mail: [email protected] A. Herle e-mail: [email protected] L. Appanna e-mail: [email protected] A. Tushar e-mail: [email protected] K. Tejaswini e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 M. N. Favorskaya et al. (eds.), Intelligent Computing Paradigm and Cutting-edge Technologies, Learning and Analytics in Intelligent Systems 21, https://doi.org/10.1007/978-3-030-65407-8_40
1 Introduction

Handwritten character recognition is a very active area of research. The task is especially challenging given the large variations in handwriting from person to person. Character recognition is an important field of computer vision, and its applications range from deciphering ancient texts to reading license plate numbers. The characters of many different languages have been used to build classifiers, especially English. This work deals with Kannada, a language spoken predominantly in Karnataka, a state in South-West India. Kannada has 44 million native speakers, and about 56 million total speakers if people who speak it as a second or third language are counted. To the best of our knowledge, this is the first attempt to use neural networks to classify handwritten Kannada letters.

A CNN was used to distinguish between the letters of the Arabic language, with a reported accuracy of around 95% [1]. Similarly, CNNs have been used for Chinese character recognition [3] and for Bangla character recognition [2]; both reported very accurate classification even though the characters exhibit large complexity in shapes, strokes and number of classes. In most models like these, the CNN is an end-to-end black box in which the model learns on its own the features relevant to the classification task [3]. The extraction of image features using a 3D CNN trained in a supervised fashion has been reported for the classification of hyper-spectral images [4]. Compared with unsupervised approaches, a supervised CNN can extract features using the class-specific labels provided by training samples [4]. A CNN can then be used to extract spectral-spatial features that represent the input image very well.

CNNs are known to perform excellently in classification tasks. These models learn many levels of data representation, where higher-level features are learnt from lower-level ones. This system of learning leads to the extraction of highly abstract and invariant image features, which can be very beneficial for classification problems. An entire CNN stage involves a convolution and a pooling layer, and a CNN is made of several stacked stages of this kind. The problem with this approach is that, as the network grows deeper, it becomes susceptible to vanishing or exploding gradients. This occurs in deep networks because the gradient undergoes repeated matrix multiplications as it is propagated backwards to earlier layers: if the gradient values are small, they rapidly approach zero, and if they are large, they rapidly grow towards infinity. To offset this, the concept of Residual Networks was introduced in [5]. It is implemented by incorporating skip connections between the stacked layers. These simply perform an identity mapping and help in the training of very deep networks, as they allow the network to skip training weights that are less relevant to the task. The skip connections introduce no new parameters or functions to be trained, and hence do not affect model complexity. ResNet-50 is an example of this kind of network and is 50 layers deep. Here, if the input mapping is not well represented by several non-linear mappings, the identity mapping is learnt instead. In real cases, an identity mapping may be non-optimal, but the model can learn perturbations around the identity mapping rather than several zero-mapped complex non-linear layers [5].

The ResNet-50 model used in this work was trained on one of the best image datasets in the world, the ImageNet dataset [6], which consists of 1000 classes; the model was trained on 1.28 million images. This large assortment of images makes ResNet-50 a very good pre-trained model for this work, as it is highly efficient at extracting distinguishing features from an input image. These features can then be used by the SVM model to perform the classification task.
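As an illustration of the skip-connection idea, a minimal Keras sketch of a basic residual block is given below. This is the plain two-convolution variant rather than ResNet-50's 1 × 1/3 × 3/1 × 1 bottleneck block, all names are illustrative, and it assumes the input tensor already has the same number of channels as the block's filters:

```python
import tensorflow as tf

def residual_block(x, filters):
    # Skip connection: carry the input forward unchanged (identity mapping).
    shortcut = x
    y = tf.keras.layers.Conv2D(filters, 3, padding='same')(x)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(filters, 3, padding='same')(y)
    y = tf.keras.layers.BatchNormalization()(y)
    # Adding the shortcut lets gradients flow directly to earlier layers,
    # countering the vanishing-gradient problem described above.
    y = tf.keras.layers.Add()([y, shortcut])
    return tf.keras.layers.ReLU()(y)
```

Because the shortcut is an identity mapping, the convolutions only need to learn the residual F(x) = H(x) − x, which is easier to optimise than learning the full mapping H(x) from scratch.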
2 Dataset

The dataset used in this work consists of around 12,000 pre-processed, handwritten letters of the Kannada alphabet. The entire Kannada lexicon of 48 letters is represented in the dataset, so there are 48 classes in total. Of these, 11 categories were used for the purposes of this paper. The images were first resized to 224 × 224 pixels and then gray-scaled (a sketch of this preprocessing follows below). The dataset, compiled from characters handwritten by many people, naturally contained many defects; the 4379 images present across the 11 classes were cleaned meticulously, resulting in the final dataset. The dataset also contains the same letters at many different rotation angles. Letters from 10 of the 11 classes are shown in Fig. 1. To simplify the text, the position of each letter in the alphabet is used in this work instead of its English pronunciation; Table 1 gives the key to this notation.
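The following is a minimal sketch of the preprocessing step described above, assuming the images are loaded from files. The helper name is illustrative, and the replication of the gray channel to three channels is our assumption for feeding gray images to an RGB network, not something the paper specifies:

```python
import numpy as np
from PIL import Image

def load_letter(path):
    # Resize to 224 x 224 (ResNet-50's expected input size) and gray-scale.
    img = Image.open(path).convert('L').resize((224, 224))
    arr = np.asarray(img, dtype=np.float32)
    # Replicate the single gray channel three times so the image matches
    # ResNet-50's 3-channel input (assumed; not stated in the paper).
    return np.stack([arr] * 3, axis=-1)
```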
Fig. 1 Letters “Bha” to “Ha” from the dataset
Table 1 Key for numbering of letters

Letter   Bha  Ma  Ya  Ra  La  Va  Sha  Ssha  Sa  Ha  Lla
Number   38   39  40  41  42  43  44   45    46  47  48
3 Methodology

CNNs have proven to be very useful for feature extraction. Instead of using hand-crafted features to classify images, in this work the ResNet-50 network is used to extract features from the input images. These extracted features are then used to train a multi-class linear Support Vector Machine that performs the final inference. This approach has several advantages over traditional feature extraction. Chief among them is that ResNet has been trained on over a million images, so its ability to extract features that represent images well is difficult to match. Using a pre-trained CNN like ResNet also saves a great deal of training time, as the volume of data used to train it cannot be rivalled under normal hardware constraints. By using it as a feature extractor, we can leverage the high-quality inference power of ResNet.

ResNet was initially created for a 1000-class classification problem, which means its last fully-connected layer has 1000 outputs. We do not use this layer; the network is re-purposed for the task at hand. Every layer in the network outputs a response (activation) when applied to an image. As the network grows deeper, the layers capture higher- and higher-level features of the image: the first few layers react only to the edges and general shape, while deeper layers look at abstract features such as texture. In essence, the features become more and more distilled the deeper in the network the image is processed. The deeper layers effectively combine all the lower-level features and create a richer representation of the image. This is why the last layer of the network is used to extract features from the images. Once these features are extracted, an SVM is trained to classify the images corresponding to the input features, using the class labels 38 through 48 as the targets (classification outputs); a sketch of this pipeline is given at the end of this section.

To show the advantage of this dual-model approach, a CNN was also trained using transfer learning on top of the ResNet-50 model. Transfer learning is a method that capitalises on the inference power of a well-trained network: the ResNet model is used as a base, all layers except the final fully-connected layer are frozen, and the last layer is retrained on a new dataset. In our case, the final layer was retrained on the Kannada letters dataset images from classes 38 through 48. This process makes use of the weights and biases that the model learnt during its extensive training on the ImageNet dataset, and sharpens the inference for a specific custom use case. The training progress of this CNN is shown in Table 2. The single CNN model was built and trained using TensorFlow [7] with a Keras front-end, in Python 3; for the combined CNN and SVM model, MATLAB version 2019b was used [8]. Table 2 reports the accuracy of the single CNN model over 10 epochs; after 7 epochs the accuracy hardly varies.
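A minimal Python sketch of the feature-extraction pipeline is given below. The combined model in this work was actually built in MATLAB; the Keras/scikit-learn version here is an equivalent illustration under that caveat, and the array names are placeholders:

```python
import tensorflow as tf
from sklearn.svm import LinearSVC

# ResNet-50 up to its global average-pooling layer serves as the feature
# extractor; include_top=False discards the 1000-way ImageNet head.
extractor = tf.keras.applications.ResNet50(
    weights='imagenet', include_top=False, pooling='avg',
    input_shape=(224, 224, 3))

def extract_features(images):
    # images: float32 array of shape (n, 224, 224, 3)
    x = tf.keras.applications.resnet50.preprocess_input(images.copy())
    return extractor.predict(x)  # (n, 2048) feature vectors

# train_images / train_labels are placeholders; labels take values 38..48.
# svm = LinearSVC().fit(extract_features(train_images), train_labels)
# predicted = svm.predict(extract_features(test_images))
```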
Table 2 CNN model accuracy with different epochs

Epoch   Loss     Accuracy (%)
1       2.4132   17.80
2       2.1981   22.39
3       2.1364   24.30
4       2.1132   25.70
5       2.0823   26.60
6       2.0767   26.86
7       2.0567   28.44
8       2.0466   28.36
9       2.0459   27.98
10      2.0271   28.65
4 Convolutional Neural Networks

A CNN is a class of deep learning model consisting of several stacked convolution filters. It comprises an input layer, multiple hidden layers and an output layer [1]. The layers have trainable weights and biases that are found during training; the goal of training is to reduce the loss function, which is done using Stochastic Gradient Descent (SGD). The multi-dimensional matrix of pixel intensities produced by each layer's convolution is called a feature map [9]. The convolution feature maps are then pooled (subsampled) by pooling layers [2] and passed on to the next convolution layer. Each neuron is connected to only a few neurons from the previous layer. Rectified Linear Units (ReLU) are also present, which help the model capture non-linearities. These feed-forward networks are known to perform especially well on image data. In this work, the ResNet-50 model is used as the base model. Its architecture is as follows:

• A 7 × 7 kernel input convolution layer.
• 1 × 1, 3 × 3 and 1 × 1 kernel convolution layers repeated 3 times each; 10 layers so far.
• 1 × 1, 3 × 3 and 1 × 1 kernel convolution layers repeated 4 times each; 22 layers so far.
• 1 × 1, 3 × 3 and 1 × 1 kernel convolution layers repeated 6 times each; 40 layers so far.
• 1 × 1, 3 × 3 and 1 × 1 kernel convolution layers repeated 3 times each; 49 layers so far.
• One average pooling operator and a fully-connected layer.

The layers add up to a 50-layer deep neural network. Transfer learning is a method by which a network built for one task is re-purposed for another, related task [10]; the weights and biases learnt initially are exploited for the new task. Usually, the first model has been trained on a large dataset and is then optimised for a specific task by training on a smaller, more specific dataset [11]. A sketch of this procedure follows below.
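Below is a minimal Keras sketch of this transfer-learning setup, assuming the 11 classes are re-indexed 0–10. The optimizer setting mirrors the one reported in Sect. 6; everything else (variable names, loss choice) is illustrative:

```python
import tensorflow as tf

# Pre-trained ResNet-50 base; the 1000-way ImageNet head is dropped.
base = tf.keras.applications.ResNet50(
    weights='imagenet', include_top=False, pooling='avg',
    input_shape=(224, 224, 3))
base.trainable = False  # freeze all pre-trained layers

# New final fully-connected layer, retrained for the 11 Kannada letter classes.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(11, activation='softmax'),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
# model.fit(train_images, train_labels, epochs=10)
```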
5 Support Vector Machines

SVMs are among the most useful tools in machine learning and are known to generalise well to unseen data. An SVM attempts to find an optimal hyperplane in the d-dimensional feature space that separates the extracted input features into the 11 classes. For a classification problem with d features, a hyperplane is a (d − 1)-dimensional subset of the feature space that separates the classes. The SVM finds the hyperplane that maximizes the margin between the hyperplane and the nearest samples. This hyperplane may correspond to a non-linear decision boundary in the original space while being linear in the higher-dimensional feature space [12]. Instead of feeding the features directly into the SVM, a kernel function is first applied. In our case, a linear kernel was used, given by Eq. (1):

K(x_i, x_j) = x_i^T x_j    (1)

where x_i and x_j are the i-th and j-th data points. The linear kernel has the lowest training time of all kernels.
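In code, Eq. (1) is simply a dot product; the following toy snippet (names illustrative) computes the kernel value for a pair of feature vectors and the full Gram matrix for a small set:

```python
import numpy as np

def linear_kernel(xi, xj):
    # Eq. (1): the linear kernel is the dot product of two feature vectors.
    return float(xi @ xj)

X = np.array([[1.0, 2.0], [3.0, 4.0], [0.5, 0.5]])
K = X @ X.T  # Gram matrix: K[i, j] = linear_kernel(X[i], X[j])
```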
6 Results

The CNN model was trained using the Adam optimizer with an initial learning rate of 0.0001. As Table 2 shows, the CNN model achieved a final accuracy of around 29%. This accuracy is far too low for any practical real-world application. It can be attributed to the fact that the images are very similar, because the letters in these classes look alike; in addition, the number of training images per class was not equal, which could have biased the model towards some classes. An SVM was then trained using features extracted from the CNN, and the results are shown in Fig. 2. The SVM model achieved an accuracy of 40.8%, roughly a 10-percentage-point improvement over the single CNN approach. Here, the model was trained on 352 images from each class; this number is constrained by the smallest class, since all categories must contribute an equal number of training images so that the model is not biased towards any one class. Classes 38, 41 and 42 were identified best by this model, all with identification rates above 85%. This can be attributed to their distinctive
Fig. 2 Confusion Matrix of training with all 11 Classes of letters
appearance compared with the rest of the letters in the dataset. Classes 43 and 45 were identified least successfully, with accuracies below 25%; this can be attributed to their similarity to other letters in the dataset. The final model was prepared using the dual CNN-SVM architecture with classes 41 and 42 only, trained on 405 images per class. The final average accuracy over the two classes was 89.22%. This shows that the combination of a CNN and an SVM is far better than a single CNN at differentiating the letters in this dataset. Figure 3 shows the confusion matrix of training with classes 41 and 42.
Fig. 3 Confusion Matrix of training with Classes 41 and 42
7 Conclusion

As is clearly evident in the results, a combination of a CNN and an SVM outperforms a single CNN model on this image classification task. In our study, the pre-trained ResNet-50 network was used as the base model over which the tests were performed. Using the CNN as an image feature extractor and training a linear SVM on these features, we were able to model the dataset and produce far more accurate results than a single CNN on a straightforward image classification problem. Nevertheless, the final accuracy of around 90% was achieved only for two quite dissimilar letters, and the accuracy for 11 classes was too low for practical use. This shows that even though the model can be trained to differentiate two classes well, accuracy is compromised as more classes enter the training set. This can be understood in light of the fact that Kannada letter identification poses a much harder problem than, for example, English: Kannada letters tend to be highly similar, and many share the same basic shape except for minor differences, which exacerbates the classification problem. Further tests on the entire dataset of 48 classes are needed to establish the feasibility of this model for the full classification problem, which could be a topic for further research.
References

1. El-Sawy, A., Loey, M., & El-Bakry, H. (2017, January). Arabic handwritten characters recognition using convolutional neural network. WSEAS Transactions on Computer Research, 5.
2. Rahman, P. C., Akhand, M., Islam, M., & Shill, S. (2015). Bangla handwritten character recognition using convolutional neural network. International Journal of Image, Graphics and Signal Processing, 7(8), 50–57.
3. Xu, X., Zhou, J., Zhang, H., & Fu, X. (2018). Chinese characters recognition from screen-rendered images using inception deep learning architecture. Lecture Notes in Computer Science (Vol. 10735, LNCS, pp. 722–732).
4. Chen, Y., Jiang, H., Li, C., Jia, X., & Ghamisi, P. (2016). Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Transactions on Geoscience and Remote Sensing, 54(10), 6232–6251.
5. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).
6. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Li, F.-F. (2009). ImageNet: A large-scale hierarchical image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 248–255).
7. Abadi, M., Agarwal, A., Barham, P., et al. TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org. Available: https://www.tensorflow.org/.
8. MATLAB, 9.7.0.1261785 (R2019b) Update 3. Natick, Massachusetts: The MathWorks Inc., 2019.
9. Zhao, Z., Zheng, P., Xu, S., & Wu, X. (2019). Object detection with deep learning: A review (pp. 1–21).
10. Hussain, M., Bird, J. J., & Faria, D. R. (2019). A study on CNN transfer learning for image classification. Advances in Intelligent Systems and Computing, 840, 191–202.
11. Huh, M., Agrawal, P., & Efros, A. A. (2016). What makes ImageNet good for transfer learning? (pp. 1–10). Available: https://arxiv.org/abs/1608.08614.
12. Marwala, L. R. (2014). Forecasting the stock market index using artificial intelligence techniques. Journal of Artificial Intelligence, 2007(2), 67–70. Available: https://core.ac.uk/download/pdf/39667613.pdf.