English Pages 412 [413] Year 2023
Lecture Notes on Data Engineering and Communications Technologies 177
Leonard Barolli Editor
Innovative Mobile and Internet Services in Ubiquitous Computing Proceedings of the 17th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2023)
Lecture Notes on Data Engineering and Communications Technologies Series Editor Fatos Xhafa, Technical University of Catalonia, Barcelona, Spain
The aim of the book series is to present cutting edge engineering approaches to data technologies and communications. It will publish the latest advances on the engineering task of building and deploying distributed, scalable and reliable data infrastructures and communication systems. The series will have a prominent applied focus on data technologies and communications, with the aim of promoting the bridging from fundamental research on data science and networking to data engineering and communications that lead to industry products, business knowledge and standardisation. Indexed by SCOPUS, INSPEC, EI Compendex. All books published in the series are submitted for consideration in Web of Science.
Editor Leonard Barolli Department of Information and Communication Engineering Fukuoka Institute of Technology Fukuoka, Japan
ISSN 2367-4512 ISSN 2367-4520 (electronic) Lecture Notes on Data Engineering and Communications Technologies ISBN 978-3-031-35835-7 ISBN 978-3-031-35836-4 (eBook) https://doi.org/10.1007/978-3-031-35836-4 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Welcome Message of IMIS-2023 International Conference Organizers
Welcome to the 17th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2023), which will be held from July 5 to July 7, 2023, in conjunction with the 17th International Conference on Complex, Intelligent and Software Intensive Systems (CISIS-2023). This International Conference focuses on the challenges and solutions for Ubiquitous and Pervasive Computing (UPC) with an emphasis on innovative, mobile and internet services. With the proliferation of wireless technologies and electronic devices, there is a fast-growing interest in UPC. UPC makes it possible to create a human-oriented computing environment where computer chips are embedded in everyday objects and interact with the physical world. Through UPC, people can get online even while moving around, thus having almost permanent access to their preferred services. With a great potential to revolutionize our lives, UPC also poses new research challenges. The conference provides an opportunity for academic and industry professionals to discuss the latest issues and progress in the area of UPC. For IMIS-2023, we received many paper submissions from all over the world. The papers included in the proceedings cover important aspects of the UPC research domain. We are very proud and honored to have two distinguished keynote talks by Dr. Salvatore Venticinque, University of Campania “Luigi Vanvitelli”, Italy, and Prof. Sanjay Kumar Dhurandher, Netaji Subhas University of Technology, India, who will present their recent work and give new insights and ideas to the conference participants. The organization of an International Conference requires the support and help of many people. Many people have helped and worked hard to produce a successful IMIS-2023 technical program and conference proceedings.
First, we would like to thank all the authors for submitting their papers, the Program Committee Members, and the reviewers who carried out the most difficult work by carefully evaluating the submitted papers. We are grateful to Honorary Chair Prof. Makoto Takizawa, Hosei University, Japan, for his guidance and advice. Finally, we would like to thank the Web Administrator Co-chairs for their excellent and timely work. We hope that all of you enjoy IMIS-2023 and find this a productive opportunity to learn, exchange ideas and make new contacts.
Organization
IMIS-2023 Organizing Committee

Honorary Chair
Makoto Takizawa, Hosei University, Japan

General Co-chairs
Isaac Woungang, Toronto Metropolitan University, Canada
Hsing-Chung Chen, Asia University, Taiwan

Program Committee Co-chairs
Kin Fun Li, University of Victoria, Canada
Tomoyuki Ishida, Fukuoka Institute of Technology, Japan

Advisory Committee Members
Vincenzo Loia, University of Salerno, Italy
Arjan Durresi, IUPUI, USA
Kouichi Sakurai, Kyushu University, Japan

Award Co-chairs
Tomoya Enokido, Rissho University, Japan
Lidia Ogiela, AGH University of Science and Technology, Poland
Fang-Yie Leu, Tunghai University, Taiwan

International Liaison Co-chairs
Elis Kulla, Fukuoka Institute of Technology, Japan
Farookh Hussain, University of Technology Sydney, Australia
Hyunhee Park, Myongji University, Korea
Publicity Co-chairs
Kangbin Yim, Soonchunhyang University, Korea
Hiroaki Kikuchi, Meiji University, Japan
Keita Matsuo, Fukuoka Institute of Technology, Japan

Finance Chair
Makoto Ikeda, Fukuoka Institute of Technology, Japan

Local Arrangement Co-chairs
Mehrdad Tirandazian, Toronto Metropolitan University, Canada
Glaucio Carvalho, Toronto Metropolitan University, Canada

Web Administrators
Phudit Ampririt, Fukuoka Institute of Technology, Japan
Ermioni Qafzezi, Fukuoka Institute of Technology, Japan

Steering Committee Chair
Leonard Barolli, Fukuoka Institute of Technology, Japan

Track Areas and PC Members

1. Multimedia and Web Computing

Track Co-chairs
Chi-Yi Lin, Tamkang University, Taiwan
Tomoyuki Ishida, Fukuoka Institute of Technology, Japan

PC Members
Noriki Uchida, Fukuoka Institute of Technology, Japan
Tetsuro Ogi, Keio University, Japan
Yasuo Ebara, Osaka Electro-Communication University, Japan
Hideo Miyachi, Tokyo City University, Japan
Kaoru Sugita, Fukuoka Institute of Technology, Japan
Akio Doi, Iwate Prefectural University, Japan
Chang-Hong Lin, National Taiwan University of Science and Technology, Taiwan
Chia-Mu Yu, National Chung Hsing University, Taiwan
Ching-Ting Tu, National Chung Hsing University, Taiwan
Shih-Hao Chang, Tamkang University, Taiwan
2. Data Management and Big Data

Track Co-chairs
Been-Chian Chien, National University of Tainan, Taiwan
Akimitsu Kanzaki, Shimane University, Japan
Wen-Yang Lin, National University of Kaohsiung, Taiwan

PC Members
Hideyuki Kawashima, Keio University, Japan
Tomoki Yoshihisa, Osaka University, Japan
Pruet Boonma, Chiang Mai University, Thailand
Masato Shirai, Shimane University, Japan
Bao-Rong Chang, National University of Kaohsiung, Taiwan
Rung-Ching Chen, Chaoyang University of Technology, Taiwan
Mong-Fong Horng, National Kaohsiung University of Applied Sciences, Taiwan
Nik Bessis, Edge Hill University, UK
James Tan, SIM University, Singapore
Kun-Ta Chuang, National Cheng Kung University, Taiwan
Jerry Chun-Wei Lin, Harbin Institute of Technology, China

3. Security, Trust and Privacy

Track Co-chairs
Tianhan Gao, Northeastern University, China
Lidia Ogiela, AGH University of Science and Technology, Poland
Arcangelo Castiglione, University of Salerno, Italy
PC Members
Jindan Zhang, Xianyang Vocational Technical College, China
Qingshan Li, Peking University, China
Zhenhua Tan, Northeastern University, China
Zhi Guan, Peking University, China
Nan Guo, Northeastern University, China
Xibin Zhao, Tsinghua University, China
Cristina Alcaraz, Universidad de Málaga, Spain
Massimo Cafaro, University of Salento, Italy
Giuseppe Cattaneo, University of Salerno, Italy
Zhide Chen, Fujian Normal University, China
Clara Maria Colombini, University of Milan, Italy
Richard Hill, University of Derby, UK
Dong Seong Kim, University of Canterbury, New Zealand
Victor Malyshkin, Russian Academy of Sciences, Russia
Barbara Masucci, University of Salerno, Italy
Arcangelo Castiglione, University of Salerno, Italy
Xiaofei Xing, Guangzhou University, China
Mauro Iacono, Second University of Naples, Italy
Joan Melià, Universitat Oberta de Catalunya, Spain
Jordi Casas, Universitat Oberta de Catalunya, Spain
Jordi Herrera, Universitat Autònoma de Barcelona, Spain
Antoni Martínez, Universitat Rovira i Virgili, Spain
Francesc Sebé, Universitat de Lleida, Spain

4. Energy Aware and Pervasive Systems

Track Co-chairs
Chi Lin, Dalian University of Technology, China
Elis Kulla, Fukuoka Institute of Technology, Japan

PC Members
Jiankang Ren, Dalian University of Technology, China
Qiang Lin, Dalian University of Technology, China
Peng Chen, Dalian University of Technology, China
Tomoya Enokido, Rissho University, Japan
Makoto Takizawa, Hosei University, Japan
Tetsuya Oda, Okayama University of Science, Japan
Admir Barolli, Aleksander Moisiu University of Durres, Albania
Makoto Ikeda, Fukuoka Institute of Technology, Japan
Keita Matsuo, Fukuoka Institute of Technology, Japan

5. Modeling, Simulation and Performance Evaluation

Track Co-chairs
Tetsuya Shigeyasu, Prefectural University of Hiroshima, Japan
Bhed Bista, Iwate Prefectural University, Japan
Remy Dupas, University of Bordeaux, France

PC Members
Jiahong Wang, Iwate Prefectural University, Japan
Shigetomo Kimura, University of Tsukuba, Japan
Chotipat Pornavalai, King Mongkut’s Institute of Technology Ladkrabang, Thailand
Danda B. Rawat, Howard University, USA
Gongjun Yan, University of Southern Indiana, USA
Sachin Shetty, Old Dominion University, USA

6. Wireless and Mobile Networks

Track Co-chairs
Luigi Catuogno, University of Salerno, Italy
Hwamin Lee, Soonchunhyang University, Korea
PC Members
Aniello Del Sorbo, Orange Labs – Orange Innovation, UK
Clemente Galdi, University of Naples “Federico II”, Italy
Stefano Turchi, University of Florence, Italy
Ermelindo Mauriello, Deloitte Spa, Italy
Gianluca Roscigno, University of Salerno, Italy
Dae Won Lee, Seokyoung University, Korea
Jong Hyuk Lee, Samsung Electronics, Korea
Sung Ho Chin, LG Electronics, Korea
Ji Su Park, Korea University, Korea
Jaehwa Chung, Korea National Open University, Korea
Massimo Ficco, University of Campania “Luigi Vanvitelli”, Italy
Jeng-Wei Lin, Tunghai University, Taiwan

7. Intelligent Technologies and Applications

Track Co-chairs
Marek Ogiela, AGH University of Science and Technology, Poland
Yong-Hwan Lee, Wonkwang University, Korea
Jacek Kucharski, Technical University of Lodz, Poland

PC Members
Gangman Yi, Gangneung-Wonju National University, Korea
Hoon Ko, J. E. Purkinje University, Czech Republic
Urszula Ogiela, AGH University of Science and Technology, Poland
Lidia Ogiela, AGH University of Science and Technology, Poland
Libor Mesicek, J. E. Purkinje University, Czech Republic
Rung-Ching Chen, Chaoyang University of Technology, Taiwan
Mong-Fong Horng, National Kaohsiung University of Applied Sciences, Taiwan
Bao-Rong Chang, National University of Kaohsiung, Taiwan
Shingo Otsuka, Kanagawa Institute of Technology, Japan
Pruet Boonma, Chiang Mai University, Thailand
Izwan Nizal Mohd Shaharanee, University Utara, Malaysia

8. Cloud Computing and Service-Oriented Applications

Track Co-chairs
Baojiang Cui, Beijing University of Posts and Telecommunications, China
Neil Yen, The University of Aizu, Japan
Flora Amato, University of Naples “Federico II”, Italy
PC Members
Aniello Castiglione, University of Naples Parthenope, Italy
Ashiq Anjum, University of Derby, UK
Beniamino Di Martino, University of Campania “Luigi Vanvitelli”, Italy
Gang Wang, Nankai University, China
Shaozhang Niu, Beijing University of Posts and Telecommunications, China
Jianxin Wang, Beijing Forestry University, China
Jie Cheng, Shandong University, China
Shaoyin Cheng, University of Science and Technology of China, China
Jingling Zhao, Beijing University of Posts and Telecommunications, China
Qing Liao, Beijing University of Posts and Telecommunications, China
Xiaohui Li, Wuhan University of Science and Technology, China
Chunhong Liu, Henan Normal University, China
Yan Zhang, Yan Hubei University, China
Hassan Althobaiti, Umm Al-Qura University, Saudi Arabia
Bahjat Fakieh, King Abdulaziz University, Saudi Arabia
Jason Hung, National Taichung University of Science and Technology, Taiwan
Frank Lai, University of Aizu, Japan

9. Ontology and Semantic Web

Track Co-chairs
Alba Amato, Italian National Research Council, Italy
Fong-Hao Liu, National Defense University, Taiwan
Giovanni Cozzolino, University of Naples “Federico II”, Italy

PC Members
Flora Amato, University of Naples “Federico II”, Italy
Claudia Di Napoli, Italian National Research Center (CNR), Italy
Salvatore Venticinque, University of Campania “Luigi Vanvitelli”, Italy
Marco Scialdone, University of Campania “Luigi Vanvitelli”, Italy
Wei-Tsong Lee, Tamkang University, Taiwan
Tin-Yu Wu, National Ilan University, Taiwan
Liang-Chu Chen, National Defense University, Taiwan
Omar Khadeer Hussain, University of New South Wales (UNSW) Canberra, Australia
Salem Alkhalaf, Qassim University, Saudi Arabia
Osama Alfarraj, King Saud University, Saudi Arabia
Thamer AlHussain, Saudi Electronic University, Saudi Arabia
Mukesh Prasad, University of Technology Sydney, Australia

10. IoT and Social Networking

Track Co-chairs
Sajal Mukhopadhyay, National Institute of Technology, Durgapur, India
Keita Matsuo, Fukuoka Institute of Technology, Japan

PC Members
Animesh Dutta, NIT Durgapur, India
Sujoy Saha, NIT Durgapur, India
Jaydeep Howlader, NIT Durgapur, India
Mansaf Alam, Jamia Millia Islamia, New Delhi, India
Kashish Ara Shakil, Jamia Hamdard, New Delhi, India
Makoto Ikeda, Fukuoka Institute of Technology, Japan
Elis Kulla, Fukuoka Institute of Technology, Japan
Shinji Sakamoto, Kanazawa Institute of Technology, Japan
Evjola Spaho, Polytechnic University of Tirana, Albania
Francesco Moscato, University of Salerno, Italy

11. Embedded Systems and Wearable Computers

Track Co-chairs
Jiankang Ren, Dalian University of Technology, China
Kangbin Yim, SCH University, Korea
PC Members
Yong Xie, Xiamen University of Technology, Xiamen, China
Xiulong Liu, The Hong Kong Polytechnic University, Hong Kong
Shaobo Zhang, Hunan University of Science and Technology, China
Kun Wang, Liaoning Police Academy, China
Fangmin Sun, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China
Kaoru Sugita, Fukuoka Institute of Technology, Japan
Tomoyuki Ishida, Fukuoka Institute of Technology, Japan
Noriyasu Yamamoto, Fukuoka Institute of Technology, Japan
Nan Guo, Northeastern University, China
IMIS-2023 Reviewers

Leonard Barolli, Makoto Takizawa, Fatos Xhafa, Isaac Woungang, Hyunhee Park, Fang-Yie Leu, Kangbin Yim, Marek Ogiela, Makoto Ikeda, Keita Matsuo, Francesco Palmieri, Massimo Ficco, Salvatore Venticinque, Admir Barolli, Elis Kulla, Arjan Durresi, Bhed Bista, Hsing-Chung Chen, Kin Fun Li, Hiroaki Kikuchi, Lidia Ogiela, Nan Guo, Hwamin Lee, Tetsuya Shigeyasu, Kosuke Takano, Flora Amato, Tomoya Enokido, Minoru Uehara, Tomoyuki Ishida, Hwa Min Lee, Jiyoung Lim, Tianhan Gao, Farookh Hussain, Omar Hussain, Nadeem Javaid, Chi-Yi Lin, Luigi Catuogno, Akimitsu Kanzaki, Wen-Yang Lin, Tetsuya Oda, Tomoki Yoshihisa, Masaki Kohana, Hiroki Sakaji, Baojiang Cui, Arcangelo Castiglione, Shinji Sakamoto, Massimo Cafaro, Mauro Iacono, Barbara Masucci, Gianni D’Angelo, Aneta Poniszewska-Maranda, Sajal Mukhopadhyay, Tomoyuki Ishida, Yong-Hwan Lee, Lidia Ogiela, Hiroshi Maeda, Evjola Spaho, Jacek Kucharski, Vamsi Paruchuri, Seyed Buhari, Olivia Fachrunnisa, Yoshihiro Okada, Sriram Chellappan, Xu An Wang
IMIS-2023 Keynote Talks
Evolution of Intelligent Software Agents
Salvatore Venticinque
University of Campania “Luigi Vanvitelli”, Caserta, Italy

Abstract. The talk will focus on the evolution of models, techniques, technologies and applications of software agents in recent years. Rapidly evolving areas of software agents range from programming paradigms to artificial intelligence. Driven by different motivations, a heterogeneous body of research is carried out under this banner. In each research area, the acceptance of agents has always been at once critical or skeptical and enthusiastic for promising future opportunities. Nevertheless, efforts have been continuously made to advance the research in this field. One example is the semantic Web vision, whereby machine-readable Web data could be automatically acted upon by intelligent software Web agents. This vision may not yet have been realized; however, the semantic enrichment of Web metadata of digital archives is constantly growing, including links to domain vocabularies and ontologies, thereby supporting more and more advanced reasoning.
Securing Mobile Wireless Networks
Sanjay Kumar Dhurandher
Netaji Subhas University of Technology, New Delhi, India

Abstract. The area of mobile computing aims toward providing connectivity to various mobile users. There is an increasing demand by users that information be available to them at any place and at any time. This has led to more use of mobile devices and networks. Since wireless networks such as WLAN and Wi-Fi require the use of the unlicensed ISM band for data communication, there are increased threats to users because the data may be modified or fabricated. Additionally, these types of networks are prone to various other threats which may even result in cyber-attacks and cyber-crime. Thus, there is a need to protect users and devices from such threats, which can lead to the loss of important financial data and, in some cases, the leakage of important defense documents of certain targeted countries.
Contents
An Enhanced AI-Based Vehicular Driver Support System Considering Hyperparameter Optimization
  Hibiki Tanaka, Masahiro Miwata, Makoto Ikeda, and Leonard Barolli

An AOI-Based Surface Painting Equipment
  Wei-Chun Hsu, Chao-Tung Yang, Hsing-Chung Chen, Kai-Ming Uang, Yan-Ting Chen, and Jheng-Shun Chen

An Aircraft Assembly System Based on Improved YOLOv5
  Zhengji Yao, Tianhan Gao, Xinbei Jiang, and Zichen Zhu

Hyperparameter Tuning and Comparison Analysis of the DNN Model to Predict Wireless Network Conditions of Live Video Services
  SoYeon Lee and Dae-Young Kim

A Soldering Motion Analysis System for Monitoring Whole Body of People with Developmental Disabilities
  Kyohei Toyoshima, Chihiro Yukawa, Yuki Nagai, Genki Moriya, Kei Tabuchi, Tetsuya Oda, and Leonard Barolli

A Fuzzy Theory Based Attitude Control for Takeoff of Quadrotor
  Chihiro Yukawa, Kyohei Toyoshima, Yuki Nagai, Yuma Yamashita, Nobuki Saito, Tetsuya Oda, and Leonard Barolli

A New Method for Improving Cache Hit Ratio by Utilizing Near Network Cache on NDN
  Akari Kanazawa and Tetsuya Shigeyasu

An Analysis of Theoretical Network Communication Speedup Using Multiple Fungible Paths
  David W. White, Isaac Woungang, Felix O. Akinladejo, and Sanjay K. Dhurandher

Universal Intrusion Detection System on In-Vehicle Network
  Md Rezanur Islam, Insu Oh, and Kangbin Yim

The Comparison of Machine Learning Methods for Email Spam Detection
  Gwonsik Kang, Kamronbek Yusupov, Md Rezanur Islam, Keunkyoung Kim, and Kangbin Yim

A Lightweight Intrusion Detection System on In-Vehicle Network Using Polynomial Features
  Baatarsuren Sukhbaatar, Md Rezanur Islam, Kamronbek Yusupov, Insu Oh, and Kangbin Yim

Fuzz Testing and Safe Framework Development for Vehicle Security Analysis
  Tugsmandakh Nyamdelger, Munkhdelgerekh Batzorig, Esam Ali Albhelil, Yeji Koh, and Kangbin Yim

Advanced Mathematics Curriculum Reform Based on Nine Screen Method and CDIO Educational Concept
  Jiangtao Li, Xiaokang Liu, Yanyan Zhao, and Qiong Li

Proposal of a Music Auditioning Application Using Music Compact Disk Jacket as Augmented Reality Marker
  Naho Kuriya, Momoka Hagihara, and Tomoyuki Ishida

Proposal of a Real-Time Video Avatar Generation Method for Metaverse Environment
  Momoka Hagihara, Naho Kuriya, and Tomoyuki Ishida

An Efficient Privacy-Preserving Authentication Scheme Based on Shamir Secret Sharing for VANETs
  Jiayu Qi, Tianhan Gao, and Cong Zhao

A Fuzzy-Based System for Selection of Radio Access Technology in 5G Wireless Networks Considering QoE as a New Parameter
  Phudit Ampririt, Shunya Higashi, Ermioni Qafzezi, Makoto Ikeda, Keita Matsuo, and Leonard Barolli

Assessment of FC-RDVM and LDIWM Router Replacement Methods by WMN-PSOHC Hybrid Simulation System Considering Chi-Square Mesh Client Distribution
  Shinji Sakamoto, Admir Barolli, Yi Liu, Leonard Barolli, and Makoto Takizawa

Implementation of FC-RDVM in WMN-PSOHCDGA System Considering Two Islands Distribution of Mesh Clients: A Comparison Study of FC-RDVM and RDVM Methods for Small Scale and Middle Scale WMNs
  Leonard Barolli, Shinji Sakamoto, Admir Barolli, and Evjola Spaho

A Comparison Study of FBR and FBRD Protocols for Underwater Optical Wireless Communication Using Transporter Autonomous Underwater Vehicles
  Keita Matsuo, Elis Kulla, and Leonard Barolli

General Dynamic Difficulty Adjustment System for Major Game Genres
  Qingwei Mi and Tianhan Gao

Method of Facial De-identification Using Machine Learning in Real-Time Video
  Si-On Kim, Da-Wit Jeong, and Sun-Young Lee

Softprocessor RISCV-EC for Edge Computing Applications
  Guillermo Montesdeoca, Víctor Asanza, Rebeca Estrada, Irving Valeriano, and M. A. Muneeb

Vulnerability of the Hypercube Network Based on P2-cuts
  Yuan-Hsiang Teng and Tzu-Liang Kung

Applications of Artificial Fish Swarm Algorithms for Indoor Positioning and Target Tracking
  Shu-Hung Lee, Chia-Hsin Cheng, Chien-Chih Lin, and Yung-Fa Huang

Attacks and Threats Verification Based on 4G/5G Security Architecture
  Lie Yang, Chien-Erh Weng, Hsing-Chung Chen, Yang-Cheng-Kuang Chen, and Yung-Cheng Yao

Design of a Composite IoT Sensor Stack System for Smart Agriculture
  Meng-Chang Wu, Yung-Hoh Sheu, Shing-Hong Liu, Jen-Yu Shieh, and Hui-Kai Su

The Implement of a Reconfigurable Intelligence Trust Chain Platform with Anti-counterfeit Traceable Version Function for the Customized System-Module-IC
  Hsing-Chung Chen, Yao-Hsien Liang, Jhih-Sheng Su, Kuen-Yu Tsai, Yu-Lin Song, Pei-Yu Hsu, and Jia-Syun Cai

Prototyping of Haptic Datagloves for Deafblind People
  Patrick C. K. Hung, Kamen Kanev, Atsushi Nakamura, Ryuhei Takeda, Hidenori Mimura, and Masakatsu Kimura

The Design and Implementation of a Weapon Detection System Based on the YOLOv5 Object Detection Algorithm
  Tsung-Yu Su and Fang-Yie Leu

DR.QG: Enhancing Closed-Domain Question Answering via Retrieving Documents for Question Generation
  Zhi-Wei Tong, Yao-Chung Fan, and Fang-Yie Leu

QoS-Oriented Uplink OFDMA Random Access Scheme for IEEE 802.11be
  Chia-Wen Chang and Fang-Yie Leu

Regression Testing Measurement Model to Improve CI/CD Process Quality and Speed
  Sen-Tarng Lai and Fang-Yie Leu

A Study on the Abnormal Stock Returns of Listed Companies in Taiwan’s Construction Sub-industry due to the Covid-19 Epidemic Announcement
  Kuei-Yuan Wang, Ying-Li Lin, Chien-Kuo Han, and Hsieh-Jung Sung

The Business Model of Cross-Border E-Commerce: Source Globally, Sell Globally
  Ying-Li Lin and Shih-Chieh Lin

Impact of SARS and COVID-19 on Taiwan’s Tourism Industry
  Ying-Li Lin, Shih-Chieh Lin, Kuei-Yuan Wang, and Ching-Lun Lin

Using the Balanced Scorecard to Analyze Bank Operational Performance – Comparison of Domestic and Foreign Banks
  Ying-Li Lin, Shih-Chieh Lin, and Ya-Yun Yang

The Influence of CEO/CFO Turnover on Company Value
  Mei-Hua Liao, Yen-Ju Chen, Yun-Hsuan Tsai, and Ya-Lan Chan

The Relationships Between Underpricing and Turnover: The Study of Seasoned Equity Offerings
  Chun-Ping Chang, Yung-Shun Tsai, Shyh-Weir Tzang, and Chih-Yun Liu

Research on the Influence of On-the-go Cross-store Access through APPs on Consumer Behavior
  Ya-Lan Chan, Po-Hung Chen, Sue-Ming Hsu, and Mei-Hua Liao

Author Index
An Enhanced AI-Based Vehicular Driver Support System Considering Hyperparameter Optimization

Hibiki Tanaka1, Masahiro Miwata1, Makoto Ikeda2(B), and Leonard Barolli2

1 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
{mgm23107,mgm21108}@bene.fit.ac.jp
2 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected], [email protected]
Abstract. In general, accidents are caused by a decline in driving skills and a lack of attention, associated with the increasing number of elderly people. With the expansion of infotainment functions, concentration while driving is also hindered. In this paper, we focus on this problem and propose an enhanced intelligent driving support system to detect distracted driving behaviors. In the proposed system, object detection is performed by YOLOv5m considering hyperparameter optimization. The proposed system can detect multiple distracted driving behaviors by considering the driver’s hand movements. From the evaluation results, we found that the proposed system detected hand manipulation of the in-vehicle infotainment (IVI) system with 100% accuracy, while cell phone usage was detected with a probability over 90%.
Keywords: Driving Support System · Hyperparameter Optimization · YOLO

1 Introduction
There are many services connected to smartphone apps and in-vehicle infotainment systems, which increase the number of distracting scenarios while driving. Furthermore, the advancement of multimedia and automated driving technology has resulted in a tendency towards over-reliance on and complacency about these technologies. Cars are now equipped with various devices, each with specific functions, connected via networks [13]. Thus, it has become easier to lose focus while driving. Consequently, users demand not only advanced driver assistance systems and vehicle control technology to prevent accidents but also a driving environment that enables concentration [3,6,9,12,18,26]. Achieving this will require the availability of cost-effective devices and a reliable supply of distracted driving detection systems.

Artificial Intelligence (AI)-based advanced systems have received significant interest across various industries [5,11,16,23,27]. In edge-focused AI systems, daily training is carried out in the cloud while real-time detection occurs at the edge [15,19,20,25]. The availability of open datasets, models and platforms has accelerated the development of apps by reducing application development time [1,2]. In [22], a dataset containing ten classes of distracted driving was provided. However, the dataset is outdated and does not take into account vehicles equipped with Advanced Driver Assistance Systems (ADAS) or large displays for electric vehicles. In our previous work [14], we proposed an intelligent distracted driving detection system. However, this system did not consider hyperparameter optimization.

The focus of this paper is to evaluate an enhanced intelligent driving support system designed to detect instances of distracted driving behaviors. In order to enhance the performance of the previous system, we have incorporated an evolution method for optimization of hyperparameters. From the evaluation results, we found that the proposed system detected hand manipulation of the IVI with 100% accuracy, while cell phone usage was detected with a probability over 90%.

The paper is structured as follows. Section 2 covers the overview of the YOLO series. In Sect. 3, we explain the approach for supporting the driver to detect distracted behaviors. The evaluation results are presented in Sect. 4. The paper concludes with Sect. 5.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 1–7, 2023. https://doi.org/10.1007/978-3-031-35836-4_1
2 Overview of YOLO Series
In recent years, many approaches that utilize Deep Neural Networks (DNNs) have been proposed [7,10]. A DNN is a multi-layered network with a complex hierarchy that detects features and learns representations. Representation learning refers to the process of extracting essential information from data observed in the real world [21]. The YOLO algorithm was initially proposed by Joseph Redmon and his team [17]. However, due to concerns regarding military use and privacy, Redmon ceased development. Later, Alexey Bochkovskiy took over and released YOLOv4 [4] in April 2020. Subsequently, Glenn Jocher and his team released YOLOv5 [8] in June 2020, which uses PyTorch as the machine learning library. YOLOv5 release v7.0 is equipped with an instance segmentation function. YOLOv6 and YOLOv7 were then announced in 2022, and YOLOv8 [24] was released in January 2023. The YOLO series is a family of one-stage object detection networks, which makes the detector faster than two-stage detectors. With YOLO, the entire image is taken as input and divided into a grid, and predictions are made directly for the entire image. This approach helps avoid errors caused by the background by utilizing the available environmental information.
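As a rough illustration of the grid idea, the following sketch maps the center of an object to the grid cell that contains it, which is how the original YOLO formulation [17] assigns responsibility for a prediction. The function name, the grid size `S` and the example coordinates are assumptions made for illustration and are not taken from any YOLO implementation.

```python
def responsible_cell(cx, cy, img_w, img_h, S=7):
    """Return (row, col) of the S x S grid cell that contains the box center.

    cx, cy are the box-center coordinates in pixels; S is a hypothetical
    grid size chosen only for this example.
    """
    if not (0 <= cx < img_w and 0 <= cy < img_h):
        raise ValueError("box center must lie inside the image")
    col = int(cx / img_w * S)  # horizontal cell index
    row = int(cy / img_h * S)  # vertical cell index
    return row, col


# Example: a 640 x 480 image with an object centered at (320, 100).
print(responsible_cell(cx=320, cy=100, img_w=640, img_h=480, S=7))
```

Later YOLO versions predict several boxes per cell at multiple scales, so this sketch only conveys the basic cell-assignment step.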
An Enhanced AI-Based Vehicular Driver Support System
3 Distracted Driving Detection System
3.1 Overview
Figure 1 shows the structure of our AI-based driving support system. The system consists of an intelligent driving assistance system designed for in-vehicle use and an AI-based application that predicts objects and alerts the driver. To facilitate training, we have integrated an edge device, specifically the Jetson Xavier NX, which is connected to the Internet and uses web scraping to obtain training images. Object detection is performed using YOLOv5m with hyperparameter optimization. By optimizing hyperparameters, object detection models can benefit from improved accuracy, faster learning and reduced overfitting. When these optimization techniques are applied properly, performance in real-world applications can be significantly enhanced, making it possible to build more effective object detection systems.
Fig. 1. Proposed driving support system.
3.2 Classification
During driving, it is not guaranteed that both hands will be in contact with the steering wheel at the same time, and the position of the hands may also change while driving. Our system provides detection of driving behavior considering the following five classes, with Class #6 reserved for false positives (predicted) and false negatives (true).
– Class #1: In-Vehicle Infotainment (IVI)
– Class #2: Hand Operation of Steering Wheel
– Class #3: Hand Operation of Cell Phone
– Class #4: Hand
– Class #5: Hand Operation of IVI
– Class #6: False Positive (Predicted), False Negative (True)
3.3 Dataset
In this study, we gather images of both normal and distracted behavior to create a unique dataset. The dataset comprises 1,136 images. For our network model, we use the YOLOv5m model, which is enhanced with hyperparameter evolution. We used up to 300 epochs to generate a trained model.
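To make the idea of hyperparameter evolution more concrete, the sketch below runs a very simple elitist, mutation-only search over a small hyperparameter dictionary. It is a simplified stand-in and not the evolution procedure used in this paper or in YOLOv5: the search space `SPACE`, the `toy_fitness` function (which replaces "train the detector and measure validation accuracy") and all numeric values are assumptions chosen only so that the example runs quickly.

```python
import random

# Hypothetical search space: name -> (lower bound, upper bound).
SPACE = {"lr0": (1e-5, 1e-1), "momentum": (0.6, 0.98), "weight_decay": (0.0, 1e-3)}


def toy_fitness(hyp):
    """Stand-in for training a detector and returning a validation score."""
    target = {"lr0": 0.01, "momentum": 0.937, "weight_decay": 0.0005}
    return -sum(((hyp[k] - target[k]) / (SPACE[k][1] - SPACE[k][0])) ** 2
                for k in SPACE)


def mutate(parent, rng, sigma=0.2, prob=0.8):
    """Scale each value by a random factor with probability prob, then clip."""
    child = {}
    for k, (lo, hi) in SPACE.items():
        v = parent[k]
        if rng.random() < prob:
            v *= 1.0 + sigma * rng.gauss(0.0, 1.0)
        child[k] = min(max(v, lo), hi)
    return child


def evolve(generations, fitness=toy_fitness, seed=0):
    """Keep the best individual and replace it only when a child is fitter."""
    rng = random.Random(seed)
    best = {k: (lo + hi) / 2 for k, (lo, hi) in SPACE.items()}
    best_fit = fitness(best)
    history = [best_fit]
    for _ in range(generations):
        child = mutate(best, rng)
        child_fit = fitness(child)
        if child_fit > best_fit:
            best, best_fit = child, child_fit
        history.append(best_fit)
    return best, history


best, history = evolve(generations=50)
print(best, history[-1])
```

Because the best individual is never discarded, the fitness history is non-decreasing, which is why running more generations can only keep or improve the best score found under this scheme.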
4 Evaluation Results
In this section, we assess the proposed system’s performance based on the number of generations in the genetic algorithm (GA). We employ the optimized hyperparameters for each generation of the GA. Then, we analyze how the number of epochs and batch size affect the number of required generations.
4.1 Impact from GA Generations
Figure 2 shows the classification accuracy of the proposed system for 200 and 300 generations. The vertical axis in the graph represents the predicted values, while the horizontal axis represents the true values. When the numbers on both axes match, the detection is considered accurate. A False Negative (FN) occurs when a positive case is incorrectly identified as negative, while a False Positive (FP) occurs when a negative case is incorrectly identified as positive. For instance, if a monitor is detected when it is not actually present, it is an FP. On the other hand, if a monitor is not detected when it should be, it is an FN. For 200 generations (see Fig. 2(a)), the detection accuracy of hand operations during driving is high, but the hands themselves are not detected at all. A large number of instances in Classes #6 and #2 are identified as something other than hands. Additionally, it can be confirmed that other classes are not misidentified as hands. For 300 generations (see Fig. 2(b)), the hand detection accuracy increased to 59%. However, the accuracy of detecting hands while operating the steering wheel (Class #2) declined compared with 200 generations. Despite this, the overall average value improved for 300 generations.
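The way such a matrix is built can be sketched as follows, with predicted classes on the rows and true classes on the columns as described above. The class names and the label lists are hypothetical and serve only to show where FP and FN counts end up; they are not data from the evaluation.

```python
def confusion_matrix(true_labels, pred_labels, classes):
    """Count (predicted, true) pairs; rows are predicted, columns are true."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, pred_labels):
        matrix[index[p]][index[t]] += 1
    return matrix


def column_normalized(matrix):
    """Divide each column by its total so every true class sums to 1."""
    n = len(matrix)
    totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [[matrix[r][c] / totals[c] if totals[c] else 0.0 for c in range(n)]
            for r in range(n)]


# Hypothetical labels: "background" stands for "no object".
classes = ["ivi", "phone", "background"]
true_labels = ["ivi", "ivi", "phone", "background"]
pred_labels = ["ivi", "background", "phone", "ivi"]

m = confusion_matrix(true_labels, pred_labels, classes)
fn_ivi = m[classes.index("background")][classes.index("ivi")]  # missed IVI
fp_ivi = m[classes.index("ivi")][classes.index("background")]  # spurious IVI
print(m, fn_ivi, fp_ivi)
print(column_normalized(m))
```

In this layout, a missed object appears in the background row of its true-class column (FN), and a spurious detection appears in the background column of its predicted-class row (FP).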
(a) 200 generations
(b) 300 generations
Fig. 2. Classification results of different GA generations.
4.2 Classification Performance Using Optimized Hyperparameters
We discuss the classification results of distracted driving detection for different batch sizes and epochs. We use the hyperparameter combination for 300 generations.
Fig. 3. Classification results for batch size 4: (a) 100 epochs, (b) 200 epochs, (c) 300 epochs. Each panel plots Predicted against True classes.
Fig. 4. Classification results for batch size 8: (a) 100 epochs, (b) 200 epochs, (c) 300 epochs. Each panel plots Predicted against True classes.
The classification results for batch size 4 are shown in Fig. 3. The classification accuracy improves with increasing epochs. However, the Class #2 classification accuracy decreased by 14% at 300 epochs, and some of the data are misclassified as Class #4. Figure 4 presents the classification results for batch size 8. Both Class #1 and Class #5 exhibit an increase in accuracy as the number of epochs rises. Class #4 reaches a peak of 65% at 100 epochs, and its classification accuracy then decreases as the number of epochs increases. Based on these results, batch size 8 with 200 epochs appears to give the best classification accuracy. Hand manipulation of the IVI system was detected with 100% accuracy. Additionally, cell phone usage was detected with a probability of over 90%. We observed that the system effectively predicts distracted driving behavior while in motion.
5 Conclusions
In this paper, we evaluated an enhanced AI-based vehicular driver support system considering hyperparameter optimization. We observed that our system detected hand manipulation of the IVI with 100% accuracy. Cell phone usage was detected with a probability of over 90%. We conclude that our system effectively predicted distracted driving behaviors while in motion. In future work, we will consider other parameters to improve the accuracy of the proposed system.
References
1. Kaggle: Data science community. https://www.kaggle.com/
2. Roboflow: The world’s largest collection of open source computer vision datasets and APIs. https://universe.roboflow.com/
3. Bergasa, L.M., Almeria, D., Almazan, J., Yebes, J.J., Arroyo, R.: DriveSafe: an app for alerting inattentive drivers and scoring driving behaviors. In: Proceedings of the IEEE Intelligent Vehicles Symposium 2014, pp. 240–245 (2014). https://doi.org/10.1109/IVS.2014.6856461
4. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: optimal speed and accuracy of object detection. In: Computer Vision and Pattern Recognition (cs.CV) (2020). https://arxiv.org/abs/2004.10934
5. Chen, G., et al.: NeuroIV: neuromorphic vision meets intelligent vehicle towards safe driving with a new database and baseline evaluations. IEEE Trans. Intell. Transp. Syst. 23(2), 1171–1183 (2022). https://doi.org/10.1109/TITS.2020.3022921
6. Ersal, T., Fuller, H.J.A., Tsimhoni, O., Stein, J.L., Fathy, H.K.: Model-based analysis and classification of driver distraction under secondary tasks. IEEE Trans. Intell. Transp. Syst. 11(3), 692–701 (2010). https://doi.org/10.1109/TITS.2010.2049741
7. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
8. Jocher, G.: The project page of Ultralytics YOLOv5 (2020). https://github.com/ultralytics/yolov5/wiki
9. Kandeel, A.A., Elbery, A.A., Abbas, H.M., Hassanein, H.S.: Driver distraction impact on road safety: a data-driven simulation approach. In: Proceedings of the IEEE Global Communications Conference (GLOBECOM-2021), pp. 1–6 (2021). https://doi.org/10.1109/GLOBECOM46510.2021.9685932
10. Le, Q.V.: Building high-level features using large scale unsupervised learning. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing 2013 (ICASSP-2013), pp. 8595–8598 (2013). https://doi.org/10.1109/ICASSP.2013.6639343
11. Li, B., et al.: A new unsupervised deep learning algorithm for fine-grained detection of driver distraction. IEEE Trans. Intell. Transp. Syst., 1–13 (2022). https://doi.org/10.1109/TITS.2022.3166275
12. Liu, T., Yang, Y., Huang, G.B., Yeo, Y.K., Lin, Z.: Driver distraction detection using semi-supervised machine learning. IEEE Trans. Intell. Transp. Syst. 17(4), 1108–1120 (2016). https://doi.org/10.1109/TITS.2015.2496157
13. McCall, J.C., Trivedi, M.M.: Driver behavior and situation aware brake assistance for intelligent vehicles. Proc. IEEE 95(2), 374–387 (2007). https://doi.org/10.1109/JPROC.2006.888388
14. Miwata, M., Tsuneyoshi, M., Ikeda, M., Barolli, L.: Performance evaluation of an AI-based safety driving support system for detecting distracted driving. In: Proceedings of the 16th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2022), pp. 10–17 (2022). https://doi.org/10.1007/978-3-031-08819-3_2
15. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236
16. Poon, Y.S., Lin, C.C., Liu, Y.H., Fan, C.P.: YOLO-based deep learning design for in-cabin monitoring system with fisheye-lens camera. In: Proceedings of the IEEE International Conference on Consumer Electronics (ICCE-2022), pp. 1–4 (2022). https://doi.org/10.1109/ICCE53296.2022.9730235
17. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2016), pp. 779–788 (2016). https://doi.org/10.1109/CVPR.2016.91
18. Shaout, A., Roytburd, B., Sanchez-Perez, L.A.: An embedded deep learning computer vision method for driver distraction detection. In: Proceedings of the 22nd International Arab Conference on Information Technology (ACIT-2021), pp. 1–7 (2021). https://doi.org/10.1109/ACIT53391.2021.9677045
19. Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016). https://doi.org/10.1038/nature16961
20. Silver, D., et al.: Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017). https://doi.org/10.1038/nature24270
21. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR-2015) (2015). https://doi.org/10.48550/arXiv.1409.1556
22. State Farm: Dataset of state farm distracted driver detection (2016). https://www.kaggle.com/c/state-farm-distracted-driver-detection/
23. Ugli, I.K.K., Hussain, A., Kim, B.S., Aich, S., Kim, H.C.: A transfer learning approach for identification of distracted driving. In: Proceedings of the 24th International Conference on Advanced Communication Technology (ICACT-2022), pp. 420–423 (2022). https://doi.org/10.23919/ICACT53585.2022.9728846
24. Ultralytics: The project page of Ultralytics YOLOv8 (2023). https://github.com/ultralytics/ultralytics
25. Vicente, F., Huang, Z., Xiong, X., la Torre, F.D., Zhang, W., Levi, D.: Driver gaze tracking and eyes off the road detection system. IEEE Trans. Intell. Transp. Syst. 16(4), 2014–2027 (2015). https://doi.org/10.1109/TITS.2015.2396031
26. Wang, Y.K., Jung, T.P., Lin, C.T.: EEG-based attention tracking during distracted driving. IEEE Trans. Neural Syst. Rehabil. Eng. 23(6), 1085–1094 (2015). https://doi.org/10.1109/TNSRE.2015.2415520
27. Xing, Y., Lv, C., Wang, H., Cao, D., Velenis, E., Wang, F.Y.: Driver activity recognition for intelligent vehicles: a deep learning approach. IEEE Trans. Veh. Technol. 68(6), 5379–5390 (2019). https://doi.org/10.1109/TVT.2019.2908425
An AOI-Based Surface Painting Equipment
Wei-Chun Hsu1(B), Chao-Tung Yang1, Hsing-Chung Chen2,3, Kai-Ming Uang4, Yan-Ting Chen5, and Jheng-Shun Chen5
1 Department of Vehicle Technology and Entrepreneurship, WuFeng University, 117, Sec 2, Chiankuo Rd, Minhsiung, Chiayi County 62153, Taiwan, R.O.C. {ingmar.hsu,sl9655}@wfu.edu.tw
2 Computer Science and Information Engineering, Asia University, 500, Lioufeng Rd., Wufeng, Taichung 41354, Taiwan, R.O.C. [email protected]
3 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung 404327, Taiwan [email protected]
4 Department of Electrical Engineering, WuFeng University, 117, Sec 2, Chiankuo Rd, Minhsiung, Chiayi County 62153, Taiwan, R.O.C. [email protected]
5 Weifang Enterprise Co., Ltd., No. 42-1, Zhongshan Rd., Minxiong, Chiayi County 62154, Taiwan, R.O.C. [email protected], [email protected]
Abstract. This study applies automatic optical inspection (AOI) technology to the painting of metal parts in production. The technology developed in this research upgrades the original metal parts painting technology. The developed system can improve the production rate and reduce manufacturing and labor costs. Combined with automatic detection technology, painting technology for various types of metal parts is established to reduce automation requirements and increase the production rate. This technology can save space in the painting process. Reducing the cost of painted metal parts can improve the market competitiveness of the product. Such automation technology is very important for the stability of manufacturing, and it is of great help in raising the research and development level of metal parts painting technology.
1 Introduction
In order to meet the high consumption demands on products, most competitive industries [1] must maintain high quality standards for their products. Automatic optical inspection (AOI) is one of the non-destructive techniques used in quality inspection of various products. This technique is robust and can replace human inspectors, who may become bored and fatigued when performing inspection tasks. A fully automated optical inspection system consists of hardware and software setups. The hardware setup includes the image sensor and illumination settings and is responsible for acquiring the digital inspection signals, while the software part implements an inspection algorithm that extracts the features of the acquired images and sends the information to the next operation based on the user requirements [2].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 8–17, 2023. https://doi.org/10.1007/978-3-031-35836-4_2
In recent years, due to the rapid development of computer technology, image capture and processing have become faster. As a result, AOI technology has brought structural changes to quality monitoring and non-destructive testing in various industries [3]. In addition to quality monitoring, the automation of the control process is also a very important factor in improving productivity in factories. It is worth noting that relying on manual judgment and inspection increases the inspection time and reduces the accuracy. Inspection techniques include inspection by human inspectors and automated optical inspection (AOI) performed using image sensors and processors. Some papers [4, 5] show that human visual judgment ability declines over the course of continuous daily work (i.e., fatigue). Automated machine-sensing data acquisition systems pave the way for large-scale automated production. These systems also facilitate the collection of statistically feasible information for automated production techniques [6, 7]. In recent years, due to the combination of machine learning technology and deep learning, as well as the rapid improvement of computer application technology, AOI algorithms have been further enhanced, and the detection results and speed have been significantly improved [8, 9]. In Reference [10], the authors put forward two main requirements: a 100% detection rate and the lowest possible false positive rate. The convolutional neural network (CNN) is one of the commonly used deep learning algorithms recently applied to inspection purposes [11]. The main reason is that the CNN is specifically designed to process image data. A CNN does not require separate feature extraction or preprocessing of images, because its hidden layers can embed the preprocessing and feature extraction process.
AOI technology is currently used for inspection in many fields, including the food industry [12], textile industry [13], construction industry [11], metal industry [14, 15] and medical technology applications [16]. An effective inspection system should ensure minimal escape rates and false positives [17]. A necessary prerequisite for data acquisition and processing is the accuracy of the inspection system. Therefore, before designing an automated detection system, careful consideration should be given to selecting a suitable signal acquisition system. According to the published research paper [5], a standard AOI system consists of a camera and lighting setup, a computer (processor), a conveyor belt, and a sorting mechanism, as shown in Fig. 1 [5]. A camera including an object sensor is responsible for acquiring the optical signal. Conveyor belts in automated processes are responsible for the movement of objects. Computers are responsible for preprocessing, feature extraction and selection, and classification. Finally, a sorting mechanism with an industrial controller (e.g. PLC) is responsible for sorting the products for the next process (e.g. scrap, recycling, etc.) based on decisions made by inspection algorithms. Most traditional inspection systems use subtraction or template matching techniques to compare the inspected component to a reference template image. Choosing an appropriate inspection algorithm can enhance the classification process and reduce escape and false positive rates. Often in AOI applications, the acquired information must undergo some enhancement processing before the detection algorithm can be applied. Feature extraction and selection techniques are then used to segment regions and discover important features. The final stage is to feed the processed information to the classifier algorithm, as shown in Fig. 1 [5].
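The subtraction technique mentioned above can be sketched in a few lines: the inspected image is compared pixel by pixel with a reference template, and the part is rejected when too many pixels differ. This is a minimal illustration under stated assumptions; the gray-level threshold, the allowed number of defect pixels and the tiny 3 x 3 images are hypothetical values, and a real system would add alignment and enhancement steps first.

```python
def defect_mask(image, template, threshold=30):
    """Mark pixels whose absolute gray-level difference exceeds threshold."""
    if len(image) != len(template) or any(
            len(a) != len(b) for a, b in zip(image, template)):
        raise ValueError("image and template must have the same shape")
    return [[abs(p - q) > threshold for p, q in zip(row_i, row_t)]
            for row_i, row_t in zip(image, template)]


def inspect(image, template, threshold=30, max_defect_pixels=0):
    """Return ('accept' or 'reject', number of differing pixels)."""
    mask = defect_mask(image, template, threshold)
    defects = sum(sum(row) for row in mask)
    decision = "reject" if defects > max_defect_pixels else "accept"
    return decision, defects


# Hypothetical 3 x 3 gray-level images (0 to 255).
template = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
sample = [[100, 100, 100], [100, 200, 100], [100, 100, 100]]
print(inspect(sample, template))
print(inspect(template, template))
```

The decision returned here corresponds to the input that the sorting mechanism would receive from the inspection algorithm.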
In order to complete the cycle of the full AOI system, some researchers consider establishing a classification mechanism to divide products into different categories (such as good, recycled, scrapped, etc.), as shown in Fig. 1. An industrial controller such as a PLC is typically used to send commands (based on decisions received from a detection algorithm) to a sorting system consisting of motors, cylinders and conveyor belts.
Fig. 1. AOI system [5]
There are many styles of metal office furniture produced by general furniture companies, and the painting process [18] for iron parts is an important part of the production process. As shown in Fig. 2 and Fig. 3, the iron parts are hung on the automatic conveyor. They then enter the manual painting position and, after manual painting, enter the oven for drying. It can be seen from the painting situation that there are floating particles in the working environment, which are very harmful to workers’ health. Therefore, in order to achieve a more automated painting operation mode, automatic equipment for painting metal parts of all styles can be designed with the aid of AOI technology [19].
Fig. 2. Paint conveyor [18]
Fig. 3. Manual painting process [19]
2 Painting Process Planning
The most effective way to carry out industrial upgrading is to use automated production. In order to increase production capacity, spray painting is integrated into automatic equipment developed for spray painting all types of metal parts. The technical plan is shown in Fig. 4. The technical planning is mainly divided into 3 major items: automatic inspection of the appearance of iron parts, automatic painting, and drying. The dynamic inspection planning classifies the size and style of the workpieces to be painted in the factory as a design basis for the automatic painting conditions and specifications. A sensor device senses the size and length of the workpiece, which is used to set the painting area. Appearance inspection experiments are then carried out on various workpieces. Next, fixtures for the various workpieces are designed, together with a multi-directional automatic paint spraying machine, so that paint can be sprayed in more directions, angles, and heights. Spray painting experiments are then carried out on various workpieces. Finally, drying planning is used to design the drying process of various workpieces to achieve a more environmentally friendly and better-quality result.
Fig. 4. Painting process [19]
3 Metal Parts Painting Equipment Based on AOI
The automation equipment for coating all types of metal parts is developed using mechatronics and works together with automatic detection technology to complete the coating action. The coating equipment design is shown in Fig. 5. Combined with automatic detection technology, various AOI metal parts coating technologies are established to reduce automation requirements and increase productivity. This technique saves space during the painting process. The coating automation hardware first uses the sensor device to sense the size and length of the workpiece in order to set the coating area. The installation method of the sensors is shown in Fig. 5. There are 4 sensors used to measure the length of the workpiece. The width of the workpiece is calculated based on the travelling speed of the conveyor belt. The use of AOI technology makes automatic size detection and automatic painting possible. The installation method of the spray guns is shown in Fig. 6. In the equipment layout of the production line, the control unit is connected to the smart machine box, and the smart machine box is connected to the server. Thus, data on the current production status can be transmitted to electronic signage and mobile devices. The data include the current speed of the conveyor belt, the amount of paint, the frequency of each spray gun, and the dimensions of the workpiece. Production can be remotely monitored by company supervisors, and production data can also be precisely analyzed and appropriately adjusted on site to improve production efficiency (Fig. 7). The system architecture diagram is shown in Fig. 8. It shows 4 sensors (S1, S2, S3 and S4) and 4 spray guns (P1, P2, P3 and P4). The number of sensors and spray guns can be determined according to the actual application.
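The two size measurements described above can be sketched as follows. This is an illustrative calculation only: the sensor spacing, the belt speed and the timestamps are hypothetical values, and the rule of counting the span from the first to the last triggered sensor is an assumption about how four discrete sensors could be turned into a length estimate.

```python
def length_from_sensors(triggered, sensor_pitch_m):
    """Estimate the extent covered by the sensor column.

    triggered holds one boolean per sensor (S1..S4, top to bottom);
    sensor_pitch_m is the assumed spacing between neighbouring sensors.
    """
    hits = [i for i, on in enumerate(triggered) if on]
    if not hits:
        return 0.0
    return (hits[-1] - hits[0] + 1) * sensor_pitch_m


def width_from_belt(belt_speed_m_s, t_on_s, t_off_s):
    """Extent along the belt: speed times the time the sensor stays blocked."""
    if t_off_s < t_on_s:
        raise ValueError("t_off_s must not be earlier than t_on_s")
    return belt_speed_m_s * (t_off_s - t_on_s)


# Hypothetical workpiece: blocks S2 and S3 for 5 s on a 0.1 m/s belt.
print(length_from_sensors([False, True, True, False], sensor_pitch_m=0.25))
print(width_from_belt(belt_speed_m_s=0.1, t_on_s=2.0, t_off_s=7.0))
```

With only four sensors the length estimate is coarse, which is one reason the number of sensors would be chosen according to the actual application.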
The signals sensed by the sensors can be transmitted to the network computer, which then calculates the speed of the object on the conveyor belt so that the corresponding spray gun can be controlled to carry out the painting operation. According to the size of the sensed object, the signals of the sensors (S1, S2, S3 and S4) can be sent to the network computer when the conveyed workpiece reaches the sensor area. The signals can be used to turn the spraying action of the spray guns (P1, P2, P3 and P4) on or off to complete the automatic painting function.
Fig. 5. Design of painting equipment
Fig. 6. Painting sensors [19]
Fig. 7. Spray guns [19]
Fig. 8. System Architecture Diagram of Sensors and Spray guns
According to the size of the sensed object, the signals of the sensors (S1, S2, S3 and S4) are shown in Fig. 9, and the spraying operation of the spray guns (P1, P2, P3 and P4) is shown in Fig. 10. The signals in Fig. 10 lag those in Fig. 9 because of the distance between the sensors and the spray guns.
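The time lag between a sensor signal and the corresponding spray gun action can be estimated as the sensor-to-gun distance divided by the belt speed. The sketch below shifts the on/off intervals of a sensor by that lag to obtain the intervals of its spray gun. The function name, the distance and the speed are hypothetical, and a constant belt speed is assumed.

```python
def gun_schedule(sensor_intervals, sensor_to_gun_m, belt_speed_m_s):
    """Shift each (on, off) sensor interval by the travel time to the gun."""
    if belt_speed_m_s <= 0:
        raise ValueError("belt speed must be positive")
    lag_s = sensor_to_gun_m / belt_speed_m_s
    return [(on + lag_s, off + lag_s) for on, off in sensor_intervals]


# Hypothetical case: S1 is blocked from t = 0 s to t = 4 s, gun P1 is
# mounted 1.0 m downstream and the belt moves at 0.25 m/s.
print(gun_schedule([(0.0, 4.0)], sensor_to_gun_m=1.0, belt_speed_m_s=0.25))
```

If the belt speed changes during operation, the lag would have to be recomputed, which is consistent with the conveyor speed being one of the monitored production parameters.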
Fig. 9. The detected signals by Sensors (S1, S2, S3 and S4)
Fig. 10. The operation of spray guns (P1, P2, P3 and P4)
4 Discussions
In order to achieve an effective painting result, it is necessary to pay attention to the setting of many parameters in this automated painting equipment. Because workpieces differ in size, a scanning area is set at the entrance. Four sensors are set up from top to bottom. The corresponding spray gun can be driven to spray paint according to the sensing results of the corresponding sensor. Moreover, the speed of the conveyor belt can also control the output. It is also necessary to test for the appropriate speed to match the amount of paint sprayed by the spray gun. In the future, the equipment will also work with the smart box to upload the operating parameters of the system to the cloud. The production information can be displayed on the production site, and managers can also view the current production status over the Internet. Since the production information is open and can be recorded, the production results can be studied in the future. This will help to improve production efficiency.
5 Conclusions
In addition to using AOI technology to complete the painting process, it is still necessary to select different spreaders, temperatures, speeds, numbers of spray guns, powder outputs, and air volumes, and to establish an SOP to ensure the most cost-effective process. In the future, all this operational information will be online, so information security is also a very important topic.
Acknowledgments. Thanks to Weifang Enterprise Co., Ltd. for supporting the project WFU-EG1-10709-2 “Painting Gun Design”, which enabled this research to be carried out smoothly and an automatic painting device with AOI technology to be developed.
References
1. Lv, S., Kim, H., Zheng, B., Jin, H.: A review of data mining with big data towards its applications in the electronics industry. Appl. Sci. 8(4), 1–34 (2018)
2. Abu Ebayyeh, A.A.R.M., Mousavi, A.: A review and analysis of automatic optical inspection and quality monitoring methods in electronics industry. IEEE Access 8, 183192–183271 (2020)
3. Czimmermann, T., et al.: Visual-based defect detection and classification approaches for industrial applications: a survey. Sensors 20(5), 1–25 (2020)
4. Wang, M.-J.-J., Huang, C.-L.: Evaluating the eye fatigue problem in wafer inspection. IEEE Trans. Semicond. Manuf. 17(3), 444–447 (2004)
5. Chin, R.T., Harlow, C.A.: Automated visual inspection: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 4(6), 557–573 (1982)
6. Rao, A.R.: Future directions in industrial machine vision: a case study of semiconductor manufacturing applications. Image Vis. Comput. 14(1), 3–19 (1996)
7. Huang, S.-H., Pan, Y.-C.: Automated visual inspection in the semiconductor industry: a survey. Comput. Ind. 66, 1–10 (2015)
8. Chen, J., Ran, X.: Deep learning with edge computing: a review. Proc. IEEE 107(8), 1655–1674 (2019)
9. Huang, R., Gu, J., Sun, X., Hou, Y., Uddin, S.: A rapid recognition method for electronic components based on the improved YOLO-V3 network. Electronics 8(8), 825 (2019)
10. Timm, F., Barth, E.: Novelty detection for the inspection of light-emitting diodes. Expert Syst. Appl. 39(3), 3413–3422 (2012)
11. Cha, Y.-J., Choi, W., Büyüköztürk, O.: Deep learning-based crack damage detection using convolutional neural networks. Comput.-Aided Civ. Infrastruct. Eng. 32(5), 361–378 (2017)
12. Brosnan, T., Sun, D.-W.: Improving quality inspection of food products by computer vision: a review. Appl. Sci. 61(1), 3–16 (2004)
13. Kumar, A.: Neural network based detection of local textile defects. Pattern Recognit. 36(7), 1645–1659 (2003)
14. Xue-Wu, Z., Yan-Qiong, D., Yan-Yun, L., Ai-Ye, S., Rui-Yu, L.: A vision inspection system for the surface defects of strongly reflected metal based on multi-class SVM. Expert Syst. Appl. 38(5), 5930–5939 (2011)
15. Hsu, W.C., Lee, L.F., Chen, J.S., Chen, Y.T., Singh, G.: Painting Equipment Planning. ICSSMET A3-812 (2018)
16. Ker, J., Wang, L., Rao, J., Lim, T.: Deep learning applications in medical image analysis. IEEE Access 6, 9375–9379 (2017)
17. Malamas, E.N., Petrakis, E.G.M., Zervakis, M., Petit, L., Legat, J.-D.: A survey on industrial vision systems, applications and tools. Image Vis. Comput. 21(2), 171–188 (2003)
18. Lee, L.F.: Painting equipment design. Master’s thesis, Graduate School of OptoMechatronics and Materials, WuFeng University (2019)
19. Hsu, W.C.: Painting gun design. Project report WFU-E-G1-10709-2, Graduate School of OptoMechatronics and Materials, WuFeng University (2018)
An Aircraft Assembly System Based on Improved YOLOv5
Zhengji Yao, Tianhan Gao(B), Xinbei Jiang, and Zichen Zhu
Software College, Northeastern University, Shenyang 110004, China
[email protected], [email protected], {elasunaming, zhuzichen}@stumail.neu.edu.cn
Abstract. Ensuring accuracy and feasibility in aircraft assembly is challenging due to its numerous components and stringent quality requirements. Assembly assistant systems based on computer vision have advanced rapidly. However, the confidentiality of aircraft components leads to a lack of the datasets required for object detection, which degrades the performance of the assistant system. In this paper, we propose a mixed aircraft component dataset (MACD), including real photos and synthetic images. We adopt Squeeze and Excitation (SE)-YOLOv5 by introducing the SE-Layer into CSPDarknet53 to improve object detection accuracy. In addition, we define the price-performance ratio (PPR) as a measure of dataset quality. We also develop an augmented reality assembly assistant system that offers simple and convenient assembly assistance and can improve assembly efficiency and quality.
1 Introduction The manufacture and assembly of aircraft differ from other manufacturing industries because it cannot be mass-produced due to the higher cost and lesser demand for aircraft. This task is challenging to execute because it frequently requires a lot of tooling, manufacturing support and cooperation, creating a huge number of interconnected production divisions. A large amount of industry knowledge with complex contents and various forms is required in the process of aircraft assembly. As a result, meeting the actual production needs to ensure the accuracy of the assembly process is difficult, and computer assistance is required. In recent years, object detection and AR (augmented reality) have been used to guide workers through assembly operations, which is more intuitive and convenient than traditional plane drawings. However, the current AR technology has the following two disadvantages. (1) The process of object detection is included in AR technologies. Region-based object detection methods (such as Faster R-CNN [1]) and regressionbased object detection methods are common solutions (such as SSD [2]). However, because the aircraft parts are massive in volume, and the internal mechanical structures of the assembly and manufacturing plants are numerous and complex, it is difficult to take enough photos as a training set for object detection. Although the synthetic dataset generated by a machine can save a substantial amount of manpower, a significant amount © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 18–28, 2023. https://doi.org/10.1007/978-3-031-35836-4_3
An Aircraft Assembly System Based on Improved YOLOv5
of work is still involved in labeling the image set. Because the design and manufacture of aircraft are often highly confidential, it is difficult to carry out this work on a large scale. Object detection accuracy is difficult to guarantee with a limited image set. (2) AR technology is operated synchronously by users wearing data gloves and AR glasses. Although it provides more information and perspectives, the added conversion from human actions to machine actions easily produces unpredictable errors, and it still relies on the professional knowledge of the staff for actual proofreading. To address these issues, we proposed SE-YOLOv5, a new object detection method based on YOLOv5. The method can locate aircraft components in real time and provide assembly guidance. Our most significant contributions are as follows: (1) We built a mixed aircraft component dataset (MACD), including real photos and synthetic images, to compensate for the lack of data. Three predefined classes were included concerning the target cases. (2) We improved the backbone of YOLOv5 by adding the Squeeze and Excitation block between the convolution layers of YOLOv5. The block helps the model learn more relationships between channels. The model's precision was improved with very little additional computational cost. (3) We defined a new measure of dataset quality called the price-performance ratio (PPR). Because it takes the difference in cost between real photos and synthetic images into account, PPR can validly measure the quality of datasets. The rest of the paper is organized as follows: Section 2 introduces related work, Section 3 describes the main steps of the proposed method, Section 4 presents the implementation of the algorithm and system and analyzes the experimental results, and Section 5 concludes the paper.
2 Preliminaries 2.1 Computer-Aided Aircraft Assembly Currently, the application strategies of production management technology in the aircraft assembly process can be summarized as follows: (1) Order-oriented production optimization, that is, optimizing management content for specific orders and determining order content before production begins, to avoid the inconvenience caused by order changes during the manufacturing process. (2) Improved production measurement, that is, incorporating measurement technology into the manufacturing process to improve production accuracy. (3) Intelligent networks and modern network technology, used to achieve over-the-horizon control, plan solutions to potential problems, and increase work efficiency. Serván et al. of AIRBUS Military Company [3] created the MOON (Assembly Oriented Authoring Augmented Reality) project by combining work instructions (WI) with an industrial digital prototype and demonstrated the project's feasibility with an A400M FAL use case. The MOON method employs the interactive mode of the enhanced display to capture a real-time image of the working area in the aircraft assembly workshop via webcam and then superimposes the database's virtual information on the real-time image. This system
Z. Yao et al.
broadens the scope of information acquisition by staff and improves their experience of acquiring information. The Upskill Skylight software developed by GE based on Google Smart Glasses displays the moments of components in AR glasses, which effectively reduces the error rate of engineers during the assembly process and increases production efficiency by 8%–12% [4]. Boeing utilizes AR visual supports to guide technicians as they install wiring in their aircraft. The situational visual guides reduce error rates and cut production time by 25% [5].
2.2 Object Detection Object detection based on a convolutional neural network (CNN) uses a feedforward neural network with a deep structure that includes convolution calculations. As one of the representative methods of deep learning, CNN has been applied in object detection and other fields for many years. In recent years, with the development of deep learning, the accuracy of CNN has gradually improved, and CNN has increasingly been applied to assist industrial production processes.
2.3 Synthetic Dataset With the increasing demand for data in the real world, manual shooting and labeling can no longer meet the needs of research and application, and synthetic datasets have gradually entered the research field of object detection. As early as 2007, Taylor et al. [6] created a tracking and evaluation system by using Valve's computer game Half-Life and applied it to monitoring facilities. Marin et al. [7] used a public dataset rendered from synthetic scenes for a benchmark test in pedestrian detection, and the classifiers trained on synthetic and real data showed similar performance in the test. Hong et al. [8] discussed the feasibility of applying a deep neural network model trained in the synthetic world to vision-based robot control in the real world. Luo et al. [9] applied an end-to-end active tracker trained in a synthetic environment to robot settings in the real world. Bewley et al. [10] obtained a deep learning model by training on driving in a synthetic environment and optimized it according to the characteristics of the real world. Wang et al. [11] narrowed the gap between the synthetic training set and the real training set in vehicle recognition by transfer learning.
3 Method 3.1 Mixed Aircraft Component Dataset (MACD) We used the Unity Perception [12] tool to generate a synthetic dataset, in which the images are rendered by virtual cameras in the Unity engine. For each image, the ground truth in JSON format provides the data required by the neural network model. Unity Perception also provides randomizers to vary the rendered scenes. Some of the Unity Perception components are shown in Fig. 1.
Fig. 1. Perception Camera component and Randomizer component.
Unity Perception provides a truth tag component to mark the objects to detect. We use it to map the objects captured by the camera to the label values in the dataset and write the collected ground-truth data into JSON files. At the same time, to avoid over-fitting, we use the randomization function of the Perception tool to generate random elements in the image, including aggregates of different shapes and colors, different shooting angles, and different light sources. The random effect is shown in Fig. 2(e).
Fig. 2. MACD images. (a) Synthetic image with object bounding box, (b) Synthetic image rendered in URP, (c) Synthetic image rendered in HDRP, (d) real photo, (e) Synthetic image rendered with random objects.
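YOLOv5 reads labels as one text line per object: the class index followed by the box center, width, and height, all normalized by the image size. The conversion from a pixel-space box can be sketched as follows. The record layout (`x`, `y` for the top-left corner, `width`, `height`), the image size, and the sample values are hypothetical and are not taken from the Unity Perception JSON schema.

```python
def to_yolo_line(box, class_id, img_w, img_h):
    """Convert a pixel-space box (top-left x, y, width, height) to a YOLO label line."""
    x_center = (box["x"] + box["width"] / 2) / img_w
    y_center = (box["y"] + box["height"] / 2) / img_h
    width = box["width"] / img_w
    height = box["height"] / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical ground-truth record for one object in a 640 x 480 image.
record = {"x": 100, "y": 50, "width": 200, "height": 100}
line = to_yolo_line(record, class_id=0, img_w=640, img_h=480)
print(line)
```

Real photos annotated with Labelme would need the same normalization before they can be mixed with the synthetic images in one training set.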
3.2 Basic Network Framework YOLOv5 has been widely used in computer vision, so we chose it to verify that our synthetic dataset is valid, and our improvements to the neural network model are also made on YOLOv5. The YOLOv5 algorithm consists of three parts, and the overall structure is shown in Fig. 3. The backbone network extracts rich image features using the CSPDarkNet53 network. The third part uses multi-scale detection and adds a new bottom-up path aggregation network after the feature pyramid networks [13] to fuse feature information of multiple scales and make predictions on the three resulting feature maps. YOLOv5 has two convolution kernel types. The convolution structure includes a convolution layer, a batch normalization layer [14] and an activation function layer. For multi-scale fusion, the spatial pyramid pooling structure uses maximum pooling [15]. Once the backbone network
extracts features, two upsampling operations and three convolutions on three scales produce category and position predictions for large, medium, and small targets.
Fig. 3. Network structure of YOLOv5
The CSPDarkNet53 layer outputs a feature map with a size of 13 × 13 × 1024. After upsampling, it outputs feature maps with three sizes. The channels of each feature map fall into three groups: bounding box information (x, y, h, w), grid confidence, and the score of each category.
3.3 A Priori Anchor Box Computing In YOLOv5, the image is divided into grids at three different scales. The default anchor boxes of YOLOv5 are calculated from the COCO dataset. For MACD, we clustered the data by k-means to obtain prior information on the bounding boxes of the task. In this algorithm, K points are first randomly selected as initial cluster centers. Each data point is then assigned to the cluster whose center has the smallest Euclidean distance to it. After assignment, the centroid of each cluster is recalculated, and the two steps are repeated until the change in the centroids between two successive iterations is less than a certain threshold or the maximum number of iterations is reached. This procedure yields 9 clusters, as shown in Fig. 4. With these nine anchor boxes, the convergence speed of model training can be significantly increased, and the performance of the model in detecting the target bounding box can be improved.
Fig. 4. The visualization of k-means clustering on MACD. The width and height of the box are predicted as offsets from cluster centroids.
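The clustering procedure of Sect. 3.3 can be sketched in a few lines of plain Python. This is a minimal illustration of k-means with Euclidean distance on (width, height) pairs, not the authors' implementation; the function name, the default threshold, the seed, and the sample boxes are all assumptions made for the example.

```python
import random


def kmeans_anchors(boxes, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster (width, height) pairs with plain k-means and Euclidean distance."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # k randomly chosen boxes as initial centers
    for _ in range(max_iter):
        # Assignment step: attach every box to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            dists = [(w - cw) ** 2 + (h - ch) ** 2 for cw, ch in centroids]
            clusters[dists.index(min(dists))].append((w, h))
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(old)  # an empty cluster keeps its center
        shift = max(abs(nw - ow) + abs(nh - oh)
                    for (nw, nh), (ow, oh) in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:  # centroids stopped moving
            break
    return sorted(centroids, key=lambda c: c[0] * c[1])  # small to large anchors


# Made-up box sizes in pixels; real input would be the MACD label sizes.
boxes = [(10, 12), (11, 13), (12, 11), (50, 60), (52, 58),
         (49, 61), (120, 90), (118, 95), (125, 92)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

For the paper's setting, `k` would be 9 and the resulting centroids would replace the COCO-derived default anchors. Because the initial centers are random, different seeds can converge to different clusterings.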
3.4 SE-YOLOv5 Considering the essential difference between computer-rendered datasets and real photo datasets, and given that our goal is to detect objects in real images, not all image information should be processed in the same way; otherwise, undesirable over-fitting will occur. Therefore, we add the SENet [16] attention mechanism module to the YOLOv5 neural network model and optimize it. We hope that the neural network will pay more attention to the characteristics that synthetic data shares with real data rather than to the differences. We introduce the SE layer, which acts on each channel, into the backbone network and call the result SE-CSPDarknet53. SENet consists of three steps, as shown in Fig. 5(b): squeeze, excitation, and scale. The SENet structure we use is as follows: first, the input feature map is pooled globally, which gives a feature map with a size of C × 1 × 1 (where C is the number of feature map channels). Dimension reduction and dimension increase are then carried out in two fully connected layers respectively, and the result is activated by the Sigmoid function to obtain weights with a size of C × 1 × 1, which are multiplied by the original input feature map at the corresponding positions to produce the output. The SE-CSPDarknet53 structure shown in Fig. 5(a) inhibits information degradation in the deep layers of the neural network, lets the network learn more features in the deep layers than in the shallow layers, enhances the expressive ability of the feature map, reduces on a small scale the influence of dataset differences on the training results, and weakens the over-fitting caused by the synthetic dataset.
Fig. 5. The structure of (a) SE-YOLOv5 (b) SE Block
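The three SE steps can be written compactly as follows. This is a sketch in the notation of the SENet paper [16] rather than a formula given by the authors: $x_c$ is channel $c$ of an input feature map of size $C \times H \times W$, $r$ is the reduction ratio of the first fully connected layer, $\sigma$ is the Sigmoid function, and $\delta$ is assumed to be ReLU as in [16].

```latex
\begin{align}
z_c &= \frac{1}{H W} \sum_{i=1}^{H} \sum_{j=1}^{W} x_c(i, j)
      && \text{(squeeze: global average pooling)} \\
s &= \sigma\bigl(W_2\, \delta(W_1 z)\bigr), \quad
     W_1 \in \mathbb{R}^{\frac{C}{r} \times C},\;
     W_2 \in \mathbb{R}^{C \times \frac{C}{r}}
      && \text{(excitation: reduce, then restore the dimension)} \\
\tilde{x}_c &= s_c \cdot x_c
      && \text{(scale: reweight each channel)}
\end{align}
```

Since the only learned parameters are $W_1$ and $W_2$, the added cost is about $2C^2/r$ weights per block, which is consistent with the small computational overhead reported above.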
4 Experiment 4.1 Dataset In industrial applications, real data is often insufficient because of confidentiality agreements, workpiece size, and other factors. Rather than simply replacing the real data with a synthetic dataset as the input of the neural network, we use synthetic data and real data together and take a mixed dataset as the training set of the neural network model. The mixed dataset includes two parts: the real dataset and the synthetic dataset. The synthetic dataset is rendered by the Unity Perception tool, and the real data is produced by photographing the components and annotating the images with Labelme. Train Data: Considering the influence of the configuration of the Unity rendering pipeline, the available functions of Perception, and the mixing ratio of the datasets, we designed and built seven different datasets, trained on each of them, quantitatively evaluated the training results, and finally identified the best way to use the datasets. Validation Data: Since this system is intended to serve assembly scenes in the real world, the validation set consists of 150 real photos, as shown in Fig. 2(d).
4.2 Evaluation Metrics Considering that the cost of synthetic datasets is much lower than that of traditional datasets, and that existing indicators take little account of the cost of datasets, we proposed the following methods to quantitatively analyze the value of datasets. Quantitative Metrics of Datasets. First, we define the authenticity of the dataset as shown in Eq. (1), where A, RD, and SD respectively represent authenticity, real data quantity, and synthetic data quantity. This value measures the proportion of real data among all data in the dataset: the higher the value, the greater the proportion of real photos. Its range is [0, 1].
$$A_{dataset} = \frac{RD}{SD + RD} \tag{1}$$
The basic consumption of the dataset is then added to its authenticity to obtain the consumption of the dataset, as formulated in Eq. (2), where C and BC respectively represent consumption and basic consumption. The basic consumption measures the necessary manpower and material resources spent on formatting the dataset and distributing files, and the authenticity measures the manpower and material resources spent on photographing and labeling the data.
$$C_{dataset} = A_{dataset} + BC \tag{2}$$
Quantitative Metrics of Network Training Results. The precision and recall shown in Eqs. (3) and (4) can be used to evaluate the performance of the network. Precision indicates the proportion of correct detections among all results identified by the neural network. Recall indicates the proportion of all correct results that the neural network identifies.
$$Precision = \frac{TP}{TP + FP} \tag{3}$$
$$Recall = \frac{TP}{TP + FN} \tag{4}$$
In the above formulas, TP, FP, and FN respectively represent true positives, false positives, and false negatives. From the precision and recall values of the neural network, the PRC (Precision-Recall Curve) under a certain IoU can be drawn, and the area under the PRC can be calculated by integration, which gives the average precision (AP) as in Eq. (5).
$$AP = \int_0^1 P(R)\, dR \tag{5}$$
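Eqs. (3) to (5) translate directly into code. The sketch below computes precision and recall from detection counts and approximates the integral in Eq. (5) with the trapezoidal rule; the choice of the trapezoidal rule, the function names, and the sample counts and curve points are assumptions for illustration, since the paper only states that the area is obtained by integration.

```python
def precision_recall(tp, fp, fn):
    """Eqs. (3) and (4): precision and recall from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(recalls, precisions):
    """Eq. (5): trapezoidal area under P(R); points must be sorted by recall."""
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        area += width * (precisions[i] + precisions[i - 1]) / 2
    return area


# Made-up counts: 8 correct detections, 2 false alarms, 4 missed objects.
p, r = precision_recall(tp=8, fp=2, fn=4)

# Made-up precision-recall curve sampled at three recall levels.
ap = average_precision([0.0, 0.5, 1.0], [1.0, 0.8, 0.6])
print(p, r, ap)
```

The mAP reported in the experiments is the mean of the per-class AP values over the three component classes.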
Quantitative Indicators of the Overall Scheme. Finally, we use the price-performance ratio (PPR) to measure the quality of a dataset, and it can be calculated as Eq. (6).
$$PPR = \frac{mAP}{C_{dataset}} \tag{6}$$
The higher price-performance ratio of the dataset shows that it can achieve better detection performance while consuming as little manpower and material resources as possible.
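As a worked example, Eqs. (1), (2), and (6) can be combined into a few small functions. The sketch below uses the mAP values of YOLOv5 from Table 1 and the basic consumption of 0.05 from Table 2; the function names are chosen for this illustration only.

```python
def authenticity(real, synthetic):
    """Eq. (1): share of real photos in the dataset."""
    return real / (real + synthetic)


def consumption(real, synthetic, basic=0.05):
    """Eq. (2): authenticity plus the basic consumption BC."""
    return authenticity(real, synthetic) + basic


def ppr(map_score, real, synthetic, basic=0.05):
    """Eq. (6): price-performance ratio of a dataset."""
    return map_score / consumption(real, synthetic, basic)


# YOLOv5 mAP values taken from Table 1.
ppr_photos = ppr(55.973, real=500, synthetic=0)   # 500 photos
ppr_mixed = ppr(69.881, real=50, synthetic=450)   # 50 photos + 450 synthetic images
print(round(ppr_photos, 2), round(ppr_mixed, 2))
```

The mixed dataset scores far higher because its consumption is 0.15 instead of 1.05, while its mAP is also higher than that of the photo-only dataset.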
4.3 Network Training and Testing The training environment is as follows. CPU: Intel(R) Core(TM) i7-8750H; RAM: 16 GB; graphics card: NVIDIA GeForce RTX 2070; operating system: Windows 10 Home Edition, 64-bit. Datasets with different configurations were used to train YOLOv5 and SE-YOLOv5 10 times. The specific parameters are set as follows: the batch size is 2, the Adam algorithm is used for gradient optimization, and the initial learning rate is 0.001. After every 50 iterations, the learning rate decreases to 1/10 of its previous value; the momentum is 0.9, and the attenuation coefficient is 0.0005. The anchor boxes are calculated by k-means clustering. The mAP is shown in Table 1. It can be seen from Table 1 that synthesizing datasets with HDRP works better than with URP. At the same time, adding randomly changed scene objects when rendering datasets helps to avoid over-fitting. In terms of the ratio of datasets, although the AP is higher when the authenticity of the dataset is higher, the cost performance of the dataset decreases. In terms of the neural network model, SE-YOLOv5 improves the average precision by up to 5% compared with YOLOv5, and the lower the
Table 1. Comparison of the accuracy of different datasets on YOLOv5 and SE-YOLOv5
Training dataset                          YOLOv5    SE-YOLOv5
500 photos                                55.973    57.816
500 synthetic images                      0.513     1.317
50 photos + 450 synthetic images          69.881    73.691
25 photos + 475 synthetic images          34.28     37.384
100 photos + 400 synthetic images         66.436    67.151
125 photos + 375 synthetic images         71.597    73.487
50 photos + 450 synthetic images (URP)    52.954    55.517
Table 2. Comparison of the PPR of the different datasets on YOLOv5 and SE-YOLOv5
Training dataset                          Adataset    BC      PPR of YOLOv5    PPR of SE-YOLOv5
500 photos                                1           0.05    53.31            55.06
500 synthetic images                      0           0.05    10.26            26.34
50 photos + 450 synthetic images          0.1         0.05    465.87           491.27
25 photos + 475 synthetic images          0.05        0.05    342.80           373.84
100 photos + 400 synthetic images         0.2         0.05    265.74           268.60
125 photos + 375 synthetic images         0.33        0.05    188.41           193.39
50 photos + 450 synthetic images (URP)    0.1         0.05    353.03           370.11
authenticity of the dataset, the greater the improvement. The PPR of each dataset is given in Table 2. It can be seen from Table 2 that SE-YOLOv5 improves the PPR of all datasets, and that the best PPR is still achieved by a mixed dataset with an authenticity of about 0.1.
4.4 AR Assembly Assistant System Based on the above research results, we have implemented a lightweight aircraft assembly assistant system with mixed reality. This system uses 3D animation technology to reproduce the actual assembly process of the components. Users can observe the assembly process from any angle, which is more intuitive and understandable than plane drawings. The camera captures the assembly process, and object detection helps judge whether the assembly is correct, thus improving assembly efficiency and accuracy. The system is divided into two functions: training and assembly assistance. The training function can be run locally and aims to give assemblers the necessary information before actual assembly. The assembly assistant function needs to be connected to
the server for image recognition. In the assembly process, the assistant information is provided to the assembly workers through image recognition, and its operation effect and flowchart are shown in Fig. 6.
Fig. 6. The real machine demo and flowchart of the system
5 Conclusion To improve the efficiency and quality of aircraft assembly, we built a dataset (MACD) including real photos and synthetic images with the Unity Perception tool and defined a new measure of dataset quality. We also proposed a corresponding network structure to obtain better object detection. We implemented a real-time assembly assistant system using augmented reality technology, which provides a new idea for computer-aided assembly. Applying it in the assembly process can significantly reduce time consumption and errors, reduce labor and material costs, and improve assembly efficiency. In future work, we will improve the mixed dataset by adjusting the Unity render pipeline and using different software to increase data types. We will change the model structure to improve performance and optimize the training time. Furthermore, we will keep adjusting the system's functions to provide assemblers with more functions beneficial to the assembly process.
Acknowledgments. This work was supported by National Natural Science Foundation of China under Grant Number: 52130403.
References
1. Girshick, R., et al.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
2. Liu, W., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
3. Serván, J., et al.: Using augmented reality in AIRBUS A400M shop floor assembly work instructions. In: AIP Conference Proceedings, vol. 1431, no. 1, pp. 633–640. American Institute of Physics (2012)
4. Robertson, T., et al.: Reducing maintenance error with wearable technology. In: 2018 Annual Reliability and Maintainability Symposium (RAMS), pp. 1–6. IEEE (2018)
5. Bryant, L., Hemsley, B.: Augmented reality: a view to future visual supports for people with disability. In: Disability and Rehabilitation: Assistive Technology, pp. 1–14 (2022)
6. Taylor, G.R., Chosak, A.J., Brewer, P.C.: OVVV: using virtual worlds to design and evaluate surveillance systems. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007)
7. Marin, J., et al.: Learning appearance in virtual scenarios for pedestrian detection. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 137–144. IEEE (2010)
8. Hong, Z.-W., et al.: Virtual-to-real: learning to control in visual semantic segmentation. arXiv preprint arXiv:1802.00285 (2018)
9. Luo, W., et al.: End-to-end active object tracking and its real-world deployment via reinforcement learning. IEEE Trans. Pattern Anal. Mach. Intell. 42(6), 1317–1332 (2019)
10. Bewley, A., et al.: Learning to drive from simulation without real world labels. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 4818–4824. IEEE (2019)
11. Wang, Y., et al.: Deep learning-based vehicle detection with synthetic image data. IET Intell. Transp. Syst. 13(7), 1097–1105 (2019)
12. Unity Technologies: Unity Perception Package (2020). https://github.com/Unity-Technologies/com.unity.perception
13. Lin, T.-Y., et al.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)
14. He, K., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
15. Song, X., Zhou, H., Feng, X.: Research on remote sensing image object detection based on deep learning. In: Kountchev, R., Nakamatsu, K., Wang, W., Kountcheva, R. (eds.) WCI3DT 2022. SIST, vol. 323, pp. 471–481. Springer, Singapore (2023). https://doi.org/10.1007/978-981-19-7184-6_39
16. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
Hyperparameter Tuning and Comparison Analysis of the DNN Model to Predict Wireless Network Conditions of Live Video Services
SoYeon Lee1 and Dae-Young Kim2(B)
1 Department of Software Convergence, Soonchunhyang University, Asan 31538, Korea [email protected]
2 Department of Computer Software Engineering, Soonchunhyang University, Asan 31538, Korea [email protected]
Abstract. Recent advances in IoT and AI technologies have enabled mobile IoT devices to provide live video services. In these services, attempts to apply data learning to network control are expanding in order to minimize the QoS degradation caused by bad network conditions. The data learning model affects the system performance, and the model can be improved by adjusting key parameters (i.e., hyperparameters). Therefore, this paper performs hyperparameter tuning for a deep neural network (DNN) model that predicts network conditions and provides a comparison and analysis of the hyperparameters. For the optimal DNN model, four hyperparameters (i.e., activation function, batch size, dropout, and weight initialization) are adjusted, and twenty-four conditions of the hyperparameters are analyzed. The DNN model for network condition prediction derived through this work can be used as a basic model of an intelligent network for smart applications.
1 Introduction Recently, Internet of Things (IoT) devices have become essential in smart cities, smart factories, and smart home applications. Live video services in smart applications using wireless networks are expanding [1]. In general, image-based deep learning is applied to live video services for object detection or fault detection. Because IoT devices have insufficient computing resources, they depend on a central cloud or an edge cloud for deep learning computing [2]. In this system architecture, the network conditions are critical for running the deep learning operation. In live video services, it is important to maintain the incoming traffic rate to the cloud for deep learning. If the incoming traffic rate decreases, deep learning operations can stall due to the lack of video frames. In bad network conditions, both interference and congestion cause packet loss. This makes it difficult to maintain the incoming traffic rate to the cloud. It also decreases the quality of experience (QoE) of the smart applications [3]. To overcome this problem, a method of predicting the network conditions and responding to them is required. In this paper, we design
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 29–37, 2023. https://doi.org/10.1007/978-3-031-35836-4_4
S. Lee and D.-Y. Kim
a deep neural network (DNN) model to predict wireless network conditions. Using this learning model, we control the network interfaces for the live video services. In our previous work, we collected wireless network data to predict network conditions and constructed a wireless dataset for AI learning [4]. In this paper, we develop a DNN model using the wireless network dataset. Furthermore, we tune the hyperparameters that have the most impact on training performance and analyze the tuning results. Finally, the optimal DNN model is derived. The contributions of this paper are as follows: we compare the results of tuning four essential hyperparameters in the DNN model, and we suggest which hyperparameter values are suitable for predicting the network condition. The remainder of the paper is organized as follows: Sect. 2 describes the related work on hyperparameters in data learning models. Section 3 explains the wireless dataset and the DNN model architecture. The hyperparameter tuning comparison and analysis are described in Sect. 4. Finally, Sect. 5 concludes this paper.
2 Related Work In general, machine learning models have two kinds of parameters: model parameters and hyperparameters. Model parameters are the weights and biases of the DNN that are learned in the training stage, and hyperparameters are parameters that must be configured before training begins [5]. The optimal values of the model parameters are adjusted by the learning algorithm itself through a feedback mechanism. The hyperparameter values, on the other hand, are not adjusted automatically by the training algorithm, and the optimal values must be selected through a series of experiments with different values. This process of finding the optimal hyperparameter values is called hyperparameter tuning. It is essential for developing an optimal model in machine learning, and it is important to select appropriate hyperparameter values [6].
2.1 Hyperparameter – Activation Function The activation function is a non-linear function that determines how much of the input value is sent to the next node. In this experiment, three activation functions, ReLU, LeakyReLU, and SiLU (swish), were applied, compared, and analyzed. The ReLU activation function is a non-linear activation function introduced by [7] to overcome the vanishing gradient problem of the sigmoid activation function. As shown in Table 1, it passes only the positive feature values and removes the negative ones. LeakyReLU, on the other hand, was proposed by [8] to overcome the limitation that ReLU has no negative output values. It slightly changes ReLU to improve optimization convergence by setting the gradient for an input of 0 or an inactive hidden unit to 0.01, as shown in the equation in Table 1. The SiLU activation function, based on the optimization of the sigmoid function and the ReLU activation function, has a stronger nonlinear ability than the ReLU activation function and solves the problem that the output of the ReLU activation function becomes 0 when there is a negative input [9] (Fig. 1).
Table 1. ReLU, LeakyReLU and SiLU activation function mathematical expressions
Activation    Function                                      Derivative
ReLU          $f(x) = \max(0, x)$                           $f'(x) = \begin{cases} 0, & x \le 0 \\ 1, & x > 0 \end{cases}$
LeakyReLU     $f(x) = \max(0.01x, x)$                       $f'(x) = \begin{cases} 0.01, & x \le 0 \\ 1, & x > 0 \end{cases}$
SiLU          $f(x) = \frac{x}{1 + e^{-x}} = x \cdot \sigma(x)$    $f'(x) = f(x) + \sigma(x)(1 - f(x))$
Fig. 1. Activation Function Comparison
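The expressions in Table 1 can be checked with a few lines of plain Python. The sketch below implements each function and its derivative for scalar inputs; the function names are chosen for this illustration, and a deep learning framework would apply the same formulas element-wise to tensors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def relu_grad(x):
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, slope=0.01):
    return max(slope * x, x)


def leaky_relu_grad(x, slope=0.01):
    return 1.0 if x > 0 else slope


def silu(x):
    return x * sigmoid(x)


def silu_grad(x):
    # f'(x) = f(x) + sigma(x) * (1 - f(x)), as in Table 1
    f = silu(x)
    return f + sigmoid(x) * (1.0 - f)


for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), leaky_relu(x), round(silu(x), 4))
```

For a negative input such as -2.0, ReLU returns 0 and LeakyReLU returns a small negative value, while SiLU returns a smooth negative value that approaches 0 as the input decreases further.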
2.2 Hyperparameter – Dropout Dropout is an effective technique for reducing overfitting in neural networks: during training, it randomly removes some of the outputs of the layer immediately before the dropout layer at each update [10]. In this experiment, to avoid overfitting, the model was designed to include one dropout layer for every two hidden layers, and dropout rates of 0.25 and 0.5 were applied and tested.
2.3 Hyperparameter – Batch Size Batch size is an essential hyperparameter of a DNN. A full pass over the training dataset is called an epoch, and the batch size determines how many samples are delivered to the network at a time when the dataset is divided [11]. Test accuracy, training accuracy, and execution time vary with the selected batch size, so it is important to choose a batch size that suits the input dataset and the training environment. In this experiment, the values 32 and 64 were applied and tested.
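The effect of these two hyperparameters can be illustrated with a short sketch. The dropout function below uses the common "inverted dropout" variant, which rescales the surviving outputs during training; this variant, the function names, and the sample count of 1,000 are assumptions for the example and are not details given in the text.

```python
import math
import random


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def batches_per_epoch(num_samples, batch_size):
    """Number of weight updates needed to pass over the dataset once."""
    return math.ceil(num_samples / batch_size)


rng = random.Random(0)
out = dropout([1.0] * 8, rate=0.25, rng=rng)  # surviving outputs become 1 / 0.75
steps_32 = batches_per_epoch(1000, 32)        # hypothetical dataset of 1,000 samples
steps_64 = batches_per_epoch(1000, 64)
print(out, steps_32, steps_64)
```

Doubling the batch size roughly halves the number of updates per epoch, which is one reason training time and accuracy both change with this setting.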
Table 2. Data fields collected by Cameras and Learning Unit
Camera Unit         Learning Unit
Frame Index         Frame Index
Sent Timestamp      Received Timestamp
Packet Loss Rate    Frame Per Second
Delay               -
Fig. 2. Network state data measurement environment
2.4 Hyperparameter – Weight Initialization In deep neural networks, weight initialization is an important factor for various architectures because it can prevent the explosion or vanishing of the layer activation output values [12]. Xavier initialization [13] and He initialization [14] are the initialization techniques that have recently been evaluated as having good performance. The S-shaped activation function is effective when used with the Xavier initialization method, and the activation function in the form of bending at the origin of the coordinate plane is effective when used with the He initialization method. Therefore, in this experiment, the He initialization method was applied to ReLU, LeakyReLU, and SiLU, which are activation functions that bend at the origin of the coordinate plane.
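The difference between the two initialization schemes lies in the variance of the random weights. The sketch below uses the normal-distribution forms given in [13] and [14], with variance 2 / (fan_in + fan_out) for Xavier and 2 / fan_in for He; the layer sizes and function names are hypothetical and serve only as an illustration.

```python
import math
import random


def xavier_std(fan_in, fan_out):
    """Xavier (Glorot) normal initialization: Var(w) = 2 / (fan_in + fan_out)."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def he_std(fan_in):
    """He normal initialization: Var(w) = 2 / fan_in."""
    return math.sqrt(2.0 / fan_in)


def init_weights(fan_in, fan_out, std, rng):
    """Draw a fan_in x fan_out weight matrix from N(0, std^2)."""
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


rng = random.Random(0)
# Hypothetical hidden layer with 64 inputs and 32 outputs.
weights = init_weights(64, 32, he_std(64), rng)
print(he_std(64), xavier_std(64, 32))
```

He initialization produces a larger spread than Xavier for the same layer, which compensates for the half of the activations that ReLU-like functions suppress.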
3 Proposed Scheme 3.1 Dataset In this paper, we designed an AI model to automatically mitigate packet loss, a factor that hinders traffic, using the dataset built in our previous study [4], and compared and analyzed the hyperparameter tuning results of the model. The environment in which the dataset was collected is shown in Fig. 2. The Camera Unit collects Frame Index, Sent Timestamp, Packet Loss Rate, and Delay whenever a frame is transmitted, and the Learning Unit collects Frame Index, Received Timestamp, and Frame Per Second every second. The data fields collected from both units are listed in Table 2. Data was collected under a total of 26 conditions. After each condition was applied to the network interface of the Camera Unit using the Linux tc utility, it was confirmed that the FPS and Latency values vary with the condition.
Hyperparameter Tuning and Comparison Analysis
After preprocessing by comparing the data collected from the two units, three labels (Good, Iffy, and Bad) were defined for multi-class classification based on the FPS and Latency values, which represent service quality. The criterion for “Good” is FPS above 20 and Latency below 0.6 s. The criterion for “Iffy” is FPS above 5 and Latency below 1 s. Finally, the criterion for “Bad” is FPS below 5 and Latency above 1 s.
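The labeling criteria can be sketched as a small function. The stated thresholds do not cover every combination of FPS and Latency, so this sketch assumes the criteria are checked in the order Good, Iffy, and that every remaining sample is labeled Bad; that fallback and the function name are assumptions, not the authors' procedure.

```python
def label_quality(fps, latency_s):
    """Assign a service-quality label from FPS and latency (seconds).

    Assumption: criteria are evaluated in order and anything that is
    neither Good nor Iffy falls through to Bad.
    """
    if fps > 20 and latency_s < 0.6:
        return "Good"
    if fps > 5 and latency_s < 1.0:
        return "Iffy"
    return "Bad"


# Hypothetical measurements, for illustration only.
for fps, latency in [(25, 0.3), (10, 0.8), (3, 1.5)]:
    print(fps, latency, label_quality(fps, latency))
```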
Fig. 3. Number of each class in final dataset [4]
The final training dataset is shown in Fig. 3. The total number of samples is 18,516, with 9,198 for Bad, 3,835 for Good, and 5,483 for Iffy.
3.2 Deep Neural Network Architecture
First, the base model before hyperparameter tuning has the structure shown in Fig. 4 and consists of 10 layers, including an input layer and an output layer. The input layer receives a vector consisting of the FPS and Latency values of the training dataset, and the output layer uses the softmax [15] activation function to classify the input data into three states: Bad, Iffy, and Good. 80% of the total dataset was used for training and 20% for testing. To measure the loss between the actual output and the predicted output, sparse_categorical_crossentropy, which is used in multi-class classification problems, was used, and the number of epochs was set to 100 for all models. Adam [16] was applied as the optimizer for the configured model, and the learning rate was set to 0.0001 for all models. Finally, we applied one dropout layer for every two hidden layers to prevent overfitting.
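To make the output layer and loss concrete, the following sketch computes softmax probabilities and the sparse categorical cross-entropy for a single sample, and derives the 80/20 split sizes from the 18,516 samples. The logits, the class index order, and the rounding of the split are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(true_index, probs):
    """Loss for one sample whose label is an integer class index."""
    return -math.log(probs[true_index])


CLASSES = ["Bad", "Iffy", "Good"]      # assumed index order
logits = [2.0, 0.5, -1.0]              # hypothetical output-layer logits
probs = softmax(logits)
loss = sparse_categorical_crossentropy(CLASSES.index("Bad"), probs)

n_total = 18516
n_train = int(n_total * 0.8)           # assumed rounding (floor)
n_test = n_total - n_train
print(probs, loss, n_train, n_test)
```

The "sparse" variant takes the label as an integer index rather than a one-hot vector, which suits a three-class label such as Bad/Iffy/Good.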
Fig. 4. Base Deep Neural Network Architecture
3.3 Hyperparameter Values
Since the performance of deep learning algorithms is affected by the selection of hyperparameters, it is fundamental to optimize the hyperparameter values to obtain the best performance from the DL model. DNN models are often highly configurable, providing many configuration options for their hyperparameters, and tuning these hyperparameters can be computationally expensive even for small DNN models. Nevertheless, hyperparameter tuning is very important, as an optimal combination of hyperparameters can significantly improve the performance of a DNN model [17]. After designing the base model shown in Fig. 4 in the Learning Unit, we experimented to obtain the most appropriate hyperparameters by manually tuning only the four types of hyperparameter values specified in Table 3.
Table 3. Selected hyperparameter range
No. | Hyperparameter | Values
1 | Activation Function | ReLU, LeakyReLU, SiLU
2 | Dropout rate | 0.25, 0.5
3 | Batch size | 32, 64
4 | He Initialization | Used, Unused
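The hyperparameter ranges in Table 3 define a small grid. As a sketch, the combinations can be enumerated with `itertools.product`; the dictionary keys are illustrative names, not identifiers from the authors' code.

```python
from itertools import product

# Hyperparameter ranges from Table 3 (key names are hypothetical).
grid = {
    "activation": ["ReLU", "LeakyReLU", "SiLU"],
    "dropout_rate": [0.25, 0.5],
    "batch_size": [32, 64],
    "he_init": [True, False],
}

# One dict per experimental condition: 3 x 2 x 2 x 2 combinations.
conditions = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(conditions))
print(conditions[0])
```

Each entry of `conditions` corresponds to one training run, so the cost of manual tuning grows multiplicatively with every added hyperparameter value.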
4 Performance Evaluation
Table 4 shows the results of training with the hyperparameter values in Table 3. The total number of comparison conditions is 24, and to reduce the number of comparison variables, the model with the best performance for each activation function was selected and defined in the ‘Definition’ column. Accuracy and F1-Score were used for model evaluation, and the accuracy and loss trends over the epochs were compared for the three models showing the best performance for each activation function. Across the four evaluation indicators, all three models achieve excellent values. However, we determined that Model C performs best on the dataset we built. Not only were its accuracy, loss, and F1-Score the best overall, but the training accuracy and loss curves of the three models also show that the loss of Model C converges more stably than that of Models A and B, as shown in Fig. 5. Taken together, the experimental results show that the combination of hyperparameters has a significant effect on the optimization of the DNN model for determining the network state. In particular, the activation function greatly affects the model performance, and a smaller dropout rate and the use of the weight initialization technique tend to improve the model performance.
Table 4. Experimental Results
Case | Activation Function | Dropout rate | Batch Size | He Initialization | Train Accuracy | Train Loss | F1-Score (Macro avg) | F1-Score (Weighted avg)
1 | ReLU | 0.25 | 32 | Used | 0.9098 | 0.2201 | 0.89 | 0.91
1 | ReLU | 0.25 | 32 | Unused | 0.8997 | 0.2471 | 0.89 | 0.89
1 | ReLU | 0.25 | 64 | Used | 0.9972 | 0.0111 | 1.00 | 1.00
1 | ReLU | 0.25 | 64 | Unused | 0.9966 | 0.0107 | 1.00 | 1.00
1 | ReLU | 0.5 | 32 | Used | 0.9951 | 0.0131 | 1.00 | 1.00
1 | ReLU | 0.5 | 32 | Unused | 0.9911 | 0.0494 | 0.99 | 0.99
1 | ReLU | 0.5 | 64 | Used | 0.8912 | 0.1512 | 0.88 | 0.89
1 | ReLU | 0.5 | 64 | Unused | 0.9938 | 0.0425 | 0.99 | 0.99
2 | Leaky ReLU | 0.25 | 32 | Used | 0.9971 | 0.0103 | 1.00 | 1.00
2 | Leaky ReLU | 0.25 | 32 | Unused | 0.9377 | 0.1068 | 0.93 | 0.93
2 | Leaky ReLU | 0.25 | 64 | Used | 0.9767 | 0.0555 | 0.97 | 0.97
2 | Leaky ReLU | 0.25 | 64 | Unused | 0.9951 | 0.0127 | 1.00 | 0.99
2 | Leaky ReLU | 0.5 | 32 | Used | 0.9958 | 0.0203 | 1.00 | 1.00
2 | Leaky ReLU | 0.5 | 32 | Unused | 0.9937 | 0.0155 | 0.99 | 0.99
2 | Leaky ReLU | 0.5 | 64 | Used | 0.9973 | 0.0136 | 1.00 | 1.00
2 | Leaky ReLU | 0.5 | 64 | Unused | 0.9963 | 0.0149 | 1.00 | 1.00
3 | SiLU | 0.25 | 32 | Used | 0.9975 | 0.0078 | 1.00 | 1.00
3 | SiLU | 0.25 | 32 | Unused | 0.9970 | 0.0356 | 1.00 | 1.00
3 | SiLU | 0.25 | 64 | Used | 0.9975 | 0.0113 | 1.00 | 1.00
3 | SiLU | 0.25 | 64 | Unused | 0.9974 | 0.0083 | 1.00 | 1.00
3 | SiLU | 0.5 | 32 | Used | 0.9790 | 0.0326 | 0.98 | 0.98
3 | SiLU | 0.5 | 32 | Unused | 0.9935 | 0.0161 | 1.00 | 0.99
3 | SiLU | 0.5 | 64 | Used | 0.9285 | 0.0953 | 0.92 | 0.93
3 | SiLU | 0.5 | 64 | Unused | 0.9229 | 0.1202 | 0.91 | 0.92
Definition: the best-performing configuration for each activation function is designated Model A (ReLU), Model B (Leaky ReLU), and Model C (SiLU).
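The two F1 averages reported in the results differ only in how per-class scores are combined: the macro average weights every class equally, while the weighted average weights each class by its number of true samples. The sketch below shows this on hypothetical toy labels; it is not the evaluation code used in the paper, and the values it prints are unrelated to Table 4.

```python
def f1_per_class(y_true, y_pred, labels):
    """Return per-class F1 scores and supports (true-sample counts)."""
    scores, supports = {}, {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
        supports[c] = tp + fn
    return scores, supports


def macro_and_weighted_f1(y_true, y_pred, labels):
    scores, supports = f1_per_class(y_true, y_pred, labels)
    macro = sum(scores.values()) / len(labels)
    weighted = sum(scores[c] * supports[c] for c in labels) / len(y_true)
    return macro, weighted


labels = ["Bad", "Iffy", "Good"]
# Hypothetical toy labels, for illustration only.
y_true = ["Bad"] * 4 + ["Iffy"] * 2 + ["Good"] * 2
y_pred = ["Bad", "Bad", "Bad", "Iffy", "Iffy", "Iffy", "Good", "Bad"]
macro, weighted = macro_and_weighted_f1(y_true, y_pred, labels)
print(round(macro, 4), round(weighted, 4))
```

With an imbalanced dataset such as the one in Fig. 3, the two averages can diverge when a minority class is classified poorly.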
Fig. 5. Train accuracy and loss graph for each model
5 Conclusion
Data learning using a DNN model is applied to various smart applications. Determining hyperparameters such as the activation function, batch size, dropout rate, and weight initialization is an important step in DNN model design. In this paper, we developed a DNN model for network state prediction and presented the results of hyperparameter tuning using the dataset from previous work on network data collection. The performance evaluation of the hyperparameter tuning provided several useful results. First, training and validation accuracy was higher when the batch size was 32 than when it was 64. Second, dropout reduced overfitting, but at the same time it also reduced validation accuracy. To balance overfitting and validation accuracy, a dropout rate of 0.25 performed better than 0.5. Third, it is better to apply a weight initialization method, which finds an initial weight point close to the optimal weights. Fourth, even if the accuracy and loss are similar in a DNN model, the performance could be improved by observing the change according to the epoch adjustment. In this paper, our DNN model for predicting the network state showed 99.75% train accuracy under the following hyperparameter conditions: 100 epochs, a batch size of 32, a dropout rate of 0.25, a learning rate of 0.001, Adam as the optimizer, SiLU as the activation function, and He weight initialization.
Acknowledgments. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (No. 2021R1C1C1013133) and the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea Government (MSIT) (RS-2022-00167197).
References
1. Duan, L., Lou, Y., Wang, S., Gao, W., Rui, Y.: AI-oriented large-scale video management for smart city: technologies standards and beyond. IEEE MultiMed. 26(2), 8–20 (2019). https://ieeexplore.ieee.org/document/8509149/
2. Hadidi, R., Cao, J., Woodward, M., Ryoo, M.S., Kim, H.: Real-time image recognition using collaborative IoT devices. In: Proceedings of the ReQuEST Workshop ASPLOS, p. 4 (2018)
3. Karmakar, R., De, S., Ghosh, A., Adhikari, T., Jain, P.: S2-GI: intelligent selection of guard interval in high throughput WLANs. In: Proceedings of the 11th International Conference on Computing, Communication, and Networking Technologies (ICCCNT), pp. 1–7 (2020)
4. Lee, S., Park, J., Kim, M., Kim, D.-Y.: Data construction method for smart live video streaming service. In: Proceedings of International Conference on Interdisciplinary Research on Computer Science, Psychology, and Education (ICICPE 2022), Pattaya, Thailand, 26–28 December 2022 (2022)
5. Hasan, M.R., Hasan, M.M., Hossain, M.Z.: Outcomes of deep neural network hyperparameter tuning on Bengali speech token classification. In: 2022 International Conference on Innovations in Science, Engineering and Technology (ICISET), Chittagong, Bangladesh, pp. 445–450 (2022). https://doi.org/10.1109/ICISET54810.2022.9775837
6. Hoque, K.E., Aljamaan, H.: Impact of hyperparameter tuning on machine learning models in stock price forecasting. IEEE Access 1 (2021)
7. Hahnloser, R.H.R., Sarpeshkar, R., Mahowald, M.A., Douglas, R.J., Seung, H.S.: Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature 405(6789), 947–951 (2000)
8. Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of the ICML Workshop Deep Learning for Audio, Speech, and Language Processing (2013)
9. Hu, J., et al.: High speed railway fastener defect detection by using improved YoLoX-nano model. Sensors 22(21), 8399 (2022)
10. Legon, A., et al.: Detection and classification of PCB defects using deep learning methods. In: 2022 IEEE 31st Microelectronics Design & Test Symposium (MDTS). IEEE (2022)
11. Lin, R.: Analysis on the selection of the appropriate batch size in CNN neural network. In: 2022 International Conference on Machine Learning and Knowledge Engineering (MLKE). IEEE (2022)
12. Limnios, S., et al.: Hcore-Init: neural network initialization based on graph degeneracy. In: 2020 25th International Conference on Pattern Recognition (ICPR). IEEE (2021)
13. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. J. Mach. Learn. Res. 9, 249–256 (2010)
14. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of IEEE International Conference on Computer Vision, pp. 1026–1034, December 2015
15. Sharma, S.: Activation functions in neural networks. Towards Data Sci. 6, 1–7 (2017)
16. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv:1412.6980 (2014)
17. Liao, L., Li, H., Shang, W., Ma, L.: An empirical study of the impact of hyperparameter tuning and model optimization on the performance properties of deep neural networks. ACM Trans. Softw. Eng. Methodol. 31(3), 1–40 (2022)
A Soldering Motion Analysis System for Monitoring Whole Body of People with Developmental Disabilities
Kyohei Toyoshima1, Chihiro Yukawa1, Yuki Nagai1, Genki Moriya2, Kei Tabuchi2, Tetsuya Oda3(B), and Leonard Barolli4
1 Graduate School of Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan {t22jm24jd,t22jm19st,t22jm23rv}@ous.jp
2 Graduate School of Science and Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan {r23smq2au,r23sml1og}@ous.jp
3 Department of Information and Computer Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan [email protected]
4 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi-ku, Fukuoka 811-0295, Japan [email protected]
Abstract. The employment situation for people with developmental disabilities is different for different countries and types of disability. But in many cases, there are significant challenges in finding and keeping employment. For employment, it is essential to consider individual differences. Soldering work in factories is one of the options for persons with developmental disabilities, but it is difficult to ensure safety. In this paper, we propose a soldering motion analysis system for monitoring the whole body of persons with disabilities. We evaluate the proposed system by experiments. The experimental results show that the proposed system is able to monitor the soldering workers or people with developmental disabilities.
1 Introduction
The employment situation for people with disabilities differs across countries and types of disability, but in many cases there are significant challenges in finding and keeping employment [1]. According to the World Health Organization (WHO), persons with disabilities face a lack of accessibility in the workplace, negative attitudes towards people with disabilities, insufficient support and accommodations, and exclusion from certain job opportunities due to their disability [2].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 38–46, 2023. https://doi.org/10.1007/978-3-031-35836-4_5
Among these issues, a significant challenge is the loss of education and training opportunities in employment or school [3], which may limit their ability to perform specific jobs. Typical examples of developmental disabilities are Intellectual Disability (ID) [4], Autism Spectrum Disorder (ASD) [5,6], Attention Deficit Hyperactivity Disorder (ADHD) [7] and Specific Learning Disorders (SLD) [8]. Persons with these kinds of disabilities may exhibit different behavior even if the diagnosed disabilities are the same. Therefore, it is important to provide support that considers individual differences.
For electronic device manufacturing, soldering is performed as one of the handicrafts. Also, soldering work in factories is one of the options for persons with developmental disabilities. The soldering work is performed at a desk, and workers should sit in a correct posture to perform the work and improve safety. However, it is difficult to ensure safety for persons with developmental disabilities because of distraction or over-concentration while using soldering irons. Therefore, managers assign many supervisors to ensure safety. However, many issues and challenges remain in improving and increasing the employment of persons with disabilities.
Therefore, in our previous work, we proposed a soldering danger detection system [9,10], a soldering motion monitoring system [11], a haptics based soldering analysis system [12–14] and systems considering ambient intelligence [15–18]. In this paper, we propose a soldering motion analysis system for monitoring the whole body of persons with developmental disabilities. The proposed system synchronizes the motion analysis for monitoring the whole body and assesses safety based on the movement of people in the safe area. We evaluate the proposed system by experiments. The experimental results show that the proposed system is able to monitor soldering workers and persons with developmental disabilities.
The structure of the paper is as follows. In Sect. 2, we present the proposed system. In Sect. 3, we describe the experimental results. Finally, conclusions and future work are given in Sect. 4.
2 Proposed System
In this section, we describe the proposed system. Figure 1 shows an overview of the proposed system. The proposed system supports the learning of soldering skills for beginners and workers with developmental disabilities by monitoring conditions during soldering. In addition, one of the characteristics of developmental disabilities is excessive concentration, which should be considered during the production process. The proposed system has three units that observe the worker from different angles to detect unsafe movements, which reduces blind spots when obtaining worker postures. Also, it is necessary to reduce the delay when capturing images.
Fig. 1. Overview of proposed system.
Fig. 2. Synchronization method.
Figure 2 shows the flowchart for controlling the capture timing of images in the proposed system. The clock of each motion analysis unit is synchronized based on the Berkeley algorithm [19], which is a distributed synchronization method. The motion analysis unit for the face and the front of the upper body is the master node, while the motion analysis units for the left half and the right half of the body are slave nodes. The clocks are synchronized between nodes. In the proposed system, the current time indicates the time elapsed from the start of the synchronization process.
The proposed system performs pose estimation [20–24] considering the whole body motion from the color images and depth images received from multiple depth cameras, based on the flowchart in Fig. 3. As shown in Fig. 4, one motion analysis unit is placed in front of the worker and the other two are placed at 45 [deg.] angles to the rear left and rear right of the worker. The soldering work is performed while sitting on a chair, so the area from the waist to the feet is hidden under the workbench or table, which becomes a blind spot for the camera. Thus, we consider three directions to capture the whole body motion. The proposed system estimates the pose of the face and upper body using the front motion analysis unit, and the torso and lower body using the rear motion analysis units on the left and right.
Fig. 3. Whole body motion analysis methods for persons with disabilities.
Fig. 4. Experimental environment.
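The core averaging step of the Berkeley algorithm used for clock synchronization can be sketched as follows. This is a simplified illustration that ignores network round-trip compensation and outlier rejection, and the clock readings are hypothetical; it is not the authors' implementation.

```python
def berkeley_adjustments(master_time, slave_times):
    """Return the clock adjustment for the master followed by each slave.

    The master computes each node's offset from its own clock, averages
    the offsets (its own offset is 0), and tells every node how far to
    move so that all clocks agree on the average time.
    """
    offsets = [0.0] + [t - master_time for t in slave_times]
    avg_offset = sum(offsets) / len(offsets)
    return [avg_offset - off for off in offsets]


# Hypothetical readings in seconds: front unit (master), left and right units.
master = 10.000
slaves = [10.030, 9.985]
adjustments = berkeley_adjustments(master, slaves)
synced = [t + a for t, a in zip([master] + slaves, adjustments)]
print(adjustments, synced)
```

Because every node moves toward the average rather than toward the master, no single clock is assumed to be correct.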
(a) Motion analysis system for face and upper body.
(b) Motion analysis system for body right side.
(c) Motion analysis system for body left side.
Fig. 5. Examples of safety area during soldering motion.
The posture during soldering is estimated in three dimensions: the key points (x, y) are obtained by skeletal estimation with MediaPipe, while the distance z to each key point is obtained from the depth image of the depth camera. In the proposed system, a cylindrical movable area in three dimensions is defined as shown in Fig. 5. If there is any movement out of the movable area, the proposed system warns workers that they are performing an unsafe movement. The movable area of the upper body is decided based on the three-dimensional coordinates of both shoulders obtained by the three depth cameras, as shown in Fig. 5(a). The radius of the cylinder for the movable area of the upper body is set 20 [%] larger than the average movable area of the worker, considering the worker's physique. The movable area of the lower body is set from the waist to the knee and from the knee to the ankle, as shown in Fig. 5(b) and Fig. 5(c), respectively. The radius of the cylinder for the range of motion of both legs and arms is set to 20 [%] more than half the shoulder width, considering the size of the body and the distance from the worker to the camera. To allow for estimation error, the acceptable safety area is 10 [%] larger than the original size. If the person moves into an unsafe area, a sound is generated from the loudspeaker to inform the person, as shown in Table 1.
Table 1. Examples of voice output for unsafe detection.
Conditions | Content of Output
Body is within a dangerous area | “Please take the correct posture”
More than 10 [sec.] in the safety area | “You are concentrating, now relax”
Less than 10 [sec.] in the safety area | “Please concentrate on your work”
Fig. 6. Experimental results of the distance of safety from the reference point.
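The cylindrical movable area described above can be sketched as a point-to-axis distance test. The sketch assumes that the cylinder axis is given by two 3D key points (for example, waist and knee), that the limb radius is 1.2 times half the shoulder width, and that the acceptable area is obtained by scaling that radius by 1.1; the coordinates and function names are hypothetical.

```python
import math


def distance_to_axis(p, a, b):
    """Distance from point p to the segment a-b (all 3D tuples)."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    t = sum(ap[i] * ab[i] for i in range(3)) / sum(c * c for c in ab)
    t = max(0.0, min(1.0, t))              # clamp to the segment ends
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)


def classify_point(p, axis_start, axis_end, shoulder_width):
    """Classify a key point as safe / acceptable / dangerous."""
    radius = 1.2 * (shoulder_width / 2.0)  # 20 [%] more than half shoulder width
    acceptable = 1.1 * radius              # assumed reading of the 10 [%] margin
    d = distance_to_axis(p, axis_start, axis_end)
    if d <= radius:
        return "safe"
    if d <= acceptable:
        return "acceptable"
    return "dangerous"


# Hypothetical coordinates in metres.
waist, knee = (0.0, 0.0, 0.0), (0.0, 0.5, 0.0)
for point in [(0.10, 0.25, 0.0), (0.25, 0.25, 0.0), (0.30, 0.25, 0.0)]:
    print(point, classify_point(point, waist, knee, shoulder_width=0.40))
```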
3 Experimental Results
In this section, we present the experimental results. The experimental scenario for synchronous processing and safety assessment lasts 50.0 [sec.]. The results of the safety assessment are obtained at intervals of 0.15 [sec.]. As the experimental scenario, we considered the case in which the upper body is moved back and forth at intervals of approximately 10.0 [sec.]. Figure 6 shows the visualization results of the change in distance from the reference point to the center coordinates of the upper body obtained by the motion analysis units placed at three locations. In Fig. 6, the green area indicates the safe area, the grey area indicates the acceptable safe area, and the red area indicates the dangerous area. The synchronization of the image capture by the three motion analysis units has an average error of 0.464 [sec.].
Fig. 7. Visualization results of the postural condition of the worker.
Figure 7 shows the visualization results of the safety assessment for the whole body when the left upper arm is moving between 0.0 [sec.] and 20.0 [sec.], the body is stopped in the safety position between 20.0 [sec.] and 30.0 [sec.], and the right thigh is moving between 30.0 [sec.] and 50.0 [sec.]. We consider the case in which the postural condition is integrated from the states obtained by the three units, with 0 for a safe condition, 1 for an acceptable safe condition and 2 for a dangerous condition. The average posture condition is calculated as the average of the current condition and the conditions 0.15 [sec.] and 0.3 [sec.] earlier. If the average value is less than 0.667 [unit], it is considered a safe condition. When the average value is greater than 0.667 [unit] and less than 1.334 [unit], it is considered an acceptable safe condition. When the average value is greater than 1.334 [unit], it is considered a dangerous condition. The experimental results show that the proposed system can assess the safety of workers or people with developmental disabilities based on the safety area of their movement.
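The averaging of the postural condition over three consecutive samples can be sketched as below. The sketch takes a sequence of already integrated states (0, 1 or 2) sampled every 0.15 [sec.]; how the states of the three units are integrated is not modeled here, and the input sequence is hypothetical.

```python
def assess(states):
    """Label each sample from the third one on, using the mean of the
    current state and the two previous states (0.15 s and 0.3 s earlier)."""
    labels = []
    for i in range(2, len(states)):
        avg = sum(states[i - 2:i + 1]) / 3
        if avg < 0.667:
            labels.append("safe")
        elif avg < 1.334:
            labels.append("acceptable")
        else:
            labels.append("dangerous")
    return labels


# Hypothetical integrated states: 0 = safe, 1 = acceptable, 2 = dangerous.
print(assess([0, 0, 2, 2, 2, 1]))
```

Averaging over three samples means that a single dangerous reading surrounded by safe ones does not immediately trigger a dangerous assessment.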
4 Conclusions
In this paper, we presented a soldering motion analysis system for monitoring the whole body of workers or people with developmental disabilities. We evaluated the proposed system with an experimental scenario. From the experimental results, we found that the proposed system, by synchronizing the motion analysis units, can assess safety based on the safety area of the soldering worker's movement. In the future, we would like to improve the proposed system and consider different scenarios.
Acknowledgement. This work was supported by JSPS KAKENHI Grant Number JP20K19793.
References
1. Khayatzadeh-Mahani, A., et al.: Prioritizing barriers and solutions to improve employment for persons with developmental disabilities. Disabil. Rehabil. 42(19), 2696–2706 (2020)
2. World Health Organization: WHO Policy on Disability (2021)
3. Agran, M., et al.: Why aren’t students with severe disabilities being placed in general education classrooms: examining the relations among classroom placement, learner outcomes, and other factors. Res. Pract. Pers. Severe Disabil. 45(1), 4–13 (2020)
4. Schepens, H.R.M.M., et al.: How to improve the quality of life of elderly people with intellectual disability: a systematic literature review of support strategies. J. Appl. Res. Intellect. Disabil. 32(3), 483–521 (2019)
5. Liu, X., et al.: Technology-facilitated diagnosis and treatment of individuals with autism spectrum disorder: an engineering perspective. Appl. Sci. 7(10), 1051 (2017)
6. Drigas, A., Vlachou, J.A.: Information and communication technologies (ICTs) and autistic spectrum disorders (ASD). Int. J. Recent Contrib. Eng. Sci. IT 4(1), 4–10 (2016)
7. Christina, S., et al.: Information and communication technologies (ICT) and pupils with attention deficit hyperactivity disorder (ADHD) symptoms: do the software and the instruction method affect their behavior? J. Educ. Multimedia Hypermedia 13(2), 109–128 (2004)
8. Grigorenko, E.L., et al.: Understanding, educating, and supporting children with specific learning disabilities: 50 years of science and practice. Am. Psychol. 75(1), 37–51 (2020)
9. Yasunaga, T., et al.: Object detection and pose estimation approaches for soldering danger detection. In: Proceedings of The IEEE 10-th Global Conference on Consumer Electronics, pp. 776–777 (2021)
10. Yasunaga, T., et al.: A soldering motion analysis system for danger detection considering object detection and attitude estimation. In: Proceedings of The 10-th International Conference on Emerging Internet, Data & Web Technologies, pp. 301–307 (2022)
11. Toyoshima, K., et al.: Analysis of a soldering motion for dozing state and attention posture detection. In: Proceedings of 3PGCIC-2022, pp. 146–153 (2022)
12. Toyoshima, K., et al.: Proposal of a haptics and LSTM based soldering motion analysis system. In: Proceedings of The IEEE 10-th Global Conference on Consumer Electronics, pp. 1–2 (2021)
13. Toyoshima, K., et al.: Design and implementation of a haptics based soldering education system. In: Proceedings of IMIS-2022, pp. 54–64 (2022)
14. Toyoshima, K., et al.: Experimental results of a haptics based soldering education system: a comparison study of RNN and LSTM for detection of dangerous movements. In: Proceedings of INCoS-2022, pp. 212–223 (2022)
15. Obukata, R., et al.: Design and evaluation of an ambient intelligence testbed for improving quality of life. Int. J. Space-Based Situat. Comput. 7(1), 8–15 (2017)
16. Oda, T., et al.: Design of a deep Q-network based simulation system for actuation decision in ambient intelligence. In: Proceedings of AINA-2019, pp. 362–370 (2019)
17. Obukata, R., et al.: Performance evaluation of an AmI testbed for improving QoL: evaluation using clustering approach considering distributed concurrent processing. In: Proceedings of IEEE AINA-2017, pp. 271–275 (2017)
18. Yamada, M., et al.: Evaluation of an IoT-based e-learning testbed: performance of OLSR protocol in a NLoS environment and mean-shift clustering approach considering electroencephalogram data. Int. J. Web Inf. Syst. 13(1), 2–13 (2017)
19. Gusella, R., Zatti, S.: The accuracy of the clock synchronization achieved by TEMPO in Berkeley UNIX 4.3BSD. IEEE Trans. Softw. Eng. 15(7), 847–853 (1989)
20. Fang, H., et al.: RMPE: regional multi-person pose estimation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2334–2343 (2017)
21. Xiao, B., et al.: Simple baselines for human pose estimation and tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 466–481 (2018)
22. Martinez, J., et al.: A simple yet effective baseline for 3d human pose estimation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2640–2649 (2017)
23. Soukupova, T., et al.: Real-time eye blink detection using facial landmarks. In: Proceedings of The 21st Computer Vision Winter Workshop, Rimske Toplice, Slovenia (2016)
24. Zhang, F., et al.: MediaPipe hands: on-device real-time hand tracking (2020). arXiv preprint arXiv:2006.10214
A Fuzzy Theory Based Attitude Control for Takeoff of Quadrotor
Chihiro Yukawa1, Kyohei Toyoshima1, Yuki Nagai1, Yuma Yamashita3, Nobuki Saito2, Tetsuya Oda3(B), and Leonard Barolli4
1 Graduate School of Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan {t22jm19st,t21jm01md,t22jm24jd,t21jm02zr,t22jm23rv}@ous.jp
2 Graduate School of Science Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan [email protected]
3 Department of Information and Computer Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama-shi 700-0005, Japan [email protected], [email protected]
4 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi-ku, Fukuoka 811-0295, Japan [email protected]
Abstract. Unmanned Aerial Vehicles (UAVs) are utilized in various fields such as aerial shots, transportation, spraying chemicals in agriculture and surveys of plant growth in forestry. It is necessary to consider the ground effect when operating UAVs and Autonomous Aerial Vehicles (AAVs). The lift coefficient changes and significantly affects the controllability when making sharp turns within the range of the ground effect. So, it is essential to pay attention to safety during takeoff and landing because of the risk of accidents. In this paper, we propose a fuzzy control for the attitude control of a quadrotor during takeoff.
1 Introduction
Unmanned Aerial Vehicles (UAVs) are utilized in various fields such as aerial shots [1,2], transportation [3,4], spraying chemicals in agriculture [5,6] and surveys of plant growth in forestry [7,8]. They are also applied to infrastructure facilities [9,10], building surveillance [11,12] and progress management on construction sites [13,14]. So, various types of UAVs have been proposed and developed in recent years to realize diverse applications [15,16]. In addition, Autonomous Aerial Vehicles (AAVs) are implemented with intelligent algorithms such as obstacle avoidance based on computer vision technology [17–19] for image recognition and adaptive decision-making based on deep reinforcement learning [20,21].
UAVs/AAVs can be categorized into fixed-wing [22,23] or rotary-wing [24,25] types. The fixed-wing types, such as airplanes, generate lift through the forward motion of the primary wing, while the ailerons control the direction of flight. Also, the weight of the vehicle is supported by the fixed wings while the propeller generates forward thrust, allowing for high-speed and long-duration flights. On the other hand, rotary-wing types like helicopters or quadrotors generate lift only from the rotors' thrust. Multi-copters with multiple rotors control the attitude by changing the lift of each rotor. The rotary-wing type has a smaller payload capacity and shorter flight time than the fixed-wing type. However, rotary-wing types offer more operability than fixed-wing types and can take off and land vertically. Therefore, the fixed-wing type is considered well-suited for transportation and surveys within a wide area, while the rotary-wing type is suitable for tasks that require stable hovering at a fixed point, such as inspection and observation within a limited area.
In addition, it is necessary to consider the ground effect when operating UAVs/AAVs [26–28]. The ground effect reduces energy consumption and enables more efficient flight the closer the vehicle is to the ground. However, the lift coefficient changes and significantly affects the controllability when making sharp turns within the range of the ground effect. So, it is essential to pay attention to safety during takeoff and landing because of the risk of accidents.
Attitude control is essential for these UAVs/AAVs to achieve stable flight and accomplish their tasks. Various attitude control methods use the information obtained from different sensors to maintain and control the attitude, velocity, altitude and other flight states of the vehicle, such as Proportional-Integral-Differential (PID) control [29,30] and Model Predictive Control (MPC) [31,32]. In this paper, we propose a fuzzy control for the attitude control of a quadrotor during takeoff [33–35]. Also, we show the experimental results to evaluate the proposed system.
The structure of the paper is as follows. In Sect. 2, we describe the proposed system. In Sect. 3, we discuss the experimental results. Finally, conclusions and future work are given in Sect. 4.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 47–56, 2023. https://doi.org/10.1007/978-3-031-35836-4_6
2 Proposed System
The proposed system for quadrotor attitude control during takeoff based on fuzzy control is shown in Fig. 1. The proposed system estimates the attitude of the quadrotor using the gravity values obtained from an Inertial Measurement Unit (IMU). In addition, the quadrotor is controlled by fuzzy control based on the estimated attitude.
Fig. 1. Proposed system.
2.1 Architecture of Quadrotor
The UAV is designed as a quadrotor, which has high operability and vertical takeoff and landing capability and does not require a runway. Also, the quadrotor is a rotary-wing aircraft with a smaller physical footprint than fixed-wing aircraft, making it suitable for accessing confined spaces and inspecting bridges and structures at elevated heights. The quadrotor is also highly adaptable and scalable, and equipment such as Global Navigation Satellite Systems (GNSS), cameras, and other sensors can easily be mounted within the payload capacity. Figure 2 shows snapshots of the quadrotor. Table 1 shows the components of the quadrotor. The propeller guards and the frame for mounting equipment on the quadrotor are designed with 3D CAD software and created with a 3D printer. The quadrotor is 430 [mm] in length, 100 [mm] in height and 430 [mm] in depth, and weighs 1500 [g]. Figure 3 shows the architecture of the quadrotor. The Raspberry Pi receives remotely transmitted commands and sends them to the brushless motors through the PWM driver and ESCs, controlling the rotation speed of each brushless motor for flight. The PWM values for controlling each brushless motor are derived based on the PWM offset command value calculated by fuzzy control.
2.2 Fuzzy Theory Based Attitude Control in Takeoff Method
Fig. 2. Snapshot of Quadrotor.
Fig. 3. Architecture of Quadrotor.
In this section, we consider the fuzzy control of the quadrotor attitude during takeoff. The proposed system controls the attitude of the quadrotor using the gravity values acquired from the IMU. The quadrotor attitude is controlled by adjusting the rotation speed of the brushless motors through fuzzy control, which has a high computational speed. First, we obtain from the sensor the gravity along the x and y axes of the coordinate system centered on the sensor. Gravity is represented by g, and the gravity along the X-axis and Y-axis is represented by g_{x,y}. The Angle of Quadrotor (AQ) is given by Eq. (1).
\[ AQ_{x,y} = \arcsin\left(\frac{g_{x,y}}{g}\right) \tag{1} \]
The acquired gravity is input to the fuzzy control of each motor. The membership functions are shown in Fig. 4 and the fuzzy rule-base is shown in Table 2.
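As an illustration of Eq. (1), the following minimal Python sketch computes the angle for one axis from a gravity reading. The function name, the use of degrees, the value assumed for g and the clamping of the ratio are assumptions made for this example and are not taken from the paper.

```python
import math

G = 9.80665  # assumed value of g in m/s^2 (standard gravity)


def quadrotor_angle(g_axis: float, g: float = G) -> float:
    """Return the angle AQ in degrees for one axis, following Eq. (1)."""
    # Clamp the ratio so that sensor noise cannot leave the domain of arcsin.
    ratio = max(-1.0, min(1.0, g_axis / g))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    gx, gy = 0.85, -1.70  # hypothetical IMU gravity readings in m/s^2
    print(quadrotor_angle(gx), quadrotor_angle(gy))
```

The clamping is a defensive choice for noisy readings whose magnitude slightly exceeds g. Whether the actual system applies it is not stated in the text.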
Table 1. Architecture of Quadrotor.
Architecture | Model
Propeller | T-Motor T5147
Motor | T-Motor F60PROV KV2550
Electronic Speed Controller (ESC) | Hobbywing Skywalker 2-4S 50A
Power Distribution Board (PDB) | MATEKSYS PDB-HSX
Li-Po Battery | HRB 4S 14.8V 5000mAh 50C
Uninterruptible Power Supply (UPS) | RPi UPSPack V3
Mobile Battery | Raspberry Pi 4 Lithium Battery
Inertial Measurement Unit (IMU) | BNO055
Pressure Sensor | BME680
Single-Board Computer | Raspberry Pi 4
PWM Driver | PCA9685

Table 2. Fuzzy Rule-base.
AQ of X-axis | AQ of Y-axis | PWM for motor offset value
High | High | High
High | Middle | High
High | Low | Middle
Middle | High | High
Middle | Middle | Middle
Middle | Low | Low
Low | High | Middle
Low | Middle | Low
Low | Low | Low
The speed of the brushless motor is adjusted by correcting the PWM signal based on the output of the fuzzy control to control the attitude of the quadrotor.
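The rule base in Table 2 can be sketched in Python as follows. Only the nine rules come from the paper. The triangular membership shapes, the angle range of ±10 degrees, the output offsets of ±50, the use of the minimum for AND and the weighted-average defuzzification are all assumptions made for illustration, since the actual membership functions are given only in Fig. 4.

```python
# Rule base from Table 2: (AQ of X-axis, AQ of Y-axis) -> PWM offset term.
RULES = {
    ("High", "High"): "High", ("High", "Middle"): "High", ("High", "Low"): "Middle",
    ("Middle", "High"): "High", ("Middle", "Middle"): "Middle", ("Middle", "Low"): "Low",
    ("Low", "High"): "Middle", ("Low", "Middle"): "Low", ("Low", "Low"): "Low",
}

ANGLE_LIMIT = 10.0  # hypothetical universe of discourse: [-10, 10] degrees
OFFSET = {"Low": -50.0, "Middle": 0.0, "High": 50.0}  # hypothetical PWM offsets


def memberships(angle):
    """Membership degrees of an angle in Low, Middle and High (assumed shapes)."""
    x = max(-ANGLE_LIMIT, min(ANGLE_LIMIT, angle)) / ANGLE_LIMIT  # scale to [-1, 1]
    return {
        "Low": max(0.0, -x),     # 1 at the negative limit, 0 from zero upward
        "Middle": 1.0 - abs(x),  # 1 at zero, 0 at both limits
        "High": max(0.0, x),     # 1 at the positive limit, 0 from zero downward
    }


def pwm_offset(aq_x, aq_y):
    """Evaluate all rules and return a crisp PWM offset (weighted average)."""
    mu_x, mu_y = memberships(aq_x), memberships(aq_y)
    num = den = 0.0
    for (term_x, term_y), out in RULES.items():
        w = min(mu_x[term_x], mu_y[term_y])  # firing strength, AND as minimum
        num += w * OFFSET[out]
        den += w
    return num / den if den > 0 else 0.0
```

With these assumed shapes, `pwm_offset(5.0, 0.0)` fires the (High, Middle) and (Middle, Middle) rules equally and returns 25.0, which matches the trend reported later that the PWM offset grows with AQ.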
3 Experimental Results
The experimental environment is shown in Fig. 5. The experiment was conducted assuming an indoor takeoff. The output of the fuzzy control is shown in Fig. 6. The output shows that the PWM offset value increases as AQ increases. In addition, the angle of the quadrotor during the fuzzy-control-based takeoff is shown in Fig. 7. Figure 8 shows snapshots of the quadrotor taking off.
The experimental results show that the proposed method can control the attitude of the quadrotor during takeoff by fuzzy control. It can be seen that the fuzzy control adjusts the angle of the quadrotor, trying to return it to the angle shown in Fig. 7.
Fig. 4. Membership function.
Fig. 5. Experimental environment.
Fig. 6. Output of Fuzzy control.
Fig. 7. Experimental results of proposed method.
Fig. 8. Snapshot of the experimental results.
4 Conclusions
In this paper, we proposed a fuzzy control for the attitude control of a quadrotor during takeoff. The experimental results show that the proposed system can control the attitude of the quadrotor during takeoff based on fuzzy control. In the future, we would like to improve the performance and stability of the attitude control.
Acknowledgement. This work was supported by JSPS KAKENHI Grant Number 20K19793.
References
1. Mademli, I., et al.: Challenges in autonomous UAV cinematography: an overview. In: 2018 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2018)
2. Mademli, I., et al.: Autonomous unmanned aerial vehicles filming in dynamic unstructured outdoor environments [applications corner]. IEEE Signal Process. Mag. 36(1), 147–153 (2018)
3. Thiels, A., et al.: Use of unmanned aerial vehicles for medical product transport. Air Med. J. 34(2), 104–108 (2015)
4. Villa, K.D., et al.: A survey on load transportation using multirotor UAVs. J. Intell. Rob. Syst. 98, 267–296 (2020)
5. Huang, Y., et al.: Development of a spray system for an unmanned aerial vehicle platform. Environ. Pract. 25(6), 803–809 (2009)
6. Faiçal, S., et al.: The use of unmanned aerial vehicles and wireless sensor networks for spraying pesticides. J. Syst. Arch. 40, 393–404 (2014)
7. Mohan, M., et al.: UAV-supported forest regeneration: current trends, challenges and implications. Remote Sens. 13(13), 2596 (2021)
8. de Castro, I., et al.: UAVs for vegetation monitoring: overview and recent scientific contributions. Remote Sens. 13(11), 2139 (2021)
9. Shvetsova, S., et al.: Safety when flying unmanned aerial vehicles at transport infrastructure facilities. Transport. Res. Procedia, 141–145 (2015)
10. Jofré-Briceño, C., et al.: Implementation of facility management for port infrastructure through the use of UAVs, photogrammetry and BIM. Sensors 21(19), 6686 (2021)
11. Qazi, S., et al.: UAV based real time video surveillance over 4G LTE. In: 2015 International Conference on Open Source Systems & Technologies (ICOSST), vol. 13, no. 11, pp. 141–145 (2015)
12. Anwar, N., et al.: Construction monitoring and reporting using drones and unmanned aerial vehicles (UAVs). In: The Tenth International Conference on Construction in the 21st Century (CITC-10), vol. 8, no. 3, pp. 2–4 (2018)
13. Li, Y., et al.: Applications of multirotor drone technologies in construction management. Int. J. Constr. Manag. 19(5), 401–412 (2019)
14. Irizarry, J., et al.: Exploratory study of potential applications of unmanned aerial systems for construction management tasks. J. Manag. Eng. 32(3), 05016001 (2016)
15. Bustamante, M., et al.: Design and construction of a UAV VTOL in ducted-fan and tilt-rotor configuration. In: 2019 16th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE), pp. 1–6 (2019)
16. Zong, J., et al.: Evaluation and comparison of hybrid wing VTOL UAV with four different electric propulsion systems. Aerospace 8(9), 256 (2021)
17. Kanellakis, C., et al.: Survey on computer vision for UAVs: current developments and trends. J. Intell. Rob. Syst. 87, 141–168 (2017)
18. Bouguettaya, A., et al.: A review on early wildfire detection from unmanned aerial vehicles using deep learning-based computer vision algorithms. Signal Process. 190, 108309 (2022)
19. Cazzato, D., et al.: A survey of computer vision methods for 2D object detection from unmanned aerial vehicles. J. Imaging 6(8), 78 (2020)
20. Akter, R., et al.: CNN-SSDI: convolution neural network inspired surveillance system for UAVs detection and identification. Comput. Netw. 201, 108519 (2021)
21. Benjdira, B., et al.: Car detection using unmanned aerial vehicles: comparison between Faster R-CNN and YOLOv3. In: 2019 1st International Conference on Unmanned Vehicle Systems-Oman (UVS), pp. 1–6 (2019)
22. Cai, G., et al.: A brief overview on miniature fixed-wing unmanned aerial vehicles. IEEE ICCA 2010, 285–290 (2010)
23. Cory, R., et al.: Experiments in fixed-wing UAV perching. In: AIAA Guidance, Navigation and Control Conference and Exhibit, pp. 7256 (2008)
24. Ucgun, H., et al.: A review on applications of rotary-wing unmanned aerial vehicle charging stations. Int. J. Adv. Rob. Syst. 18(3), 17298814211015864 (2021)
25. Saggiani, G.M., et al.: Rotary wing UAV potential applications: an analytical study through a matrix method. Aircraft Eng. Aeros. Technol. 76(1), 6–14 (2004)
26. Aich, S., et al.: Analysis of ground effect on multi-rotors. In: 2014 International Conference on Electronics, Communication and Computational Engineering (ICECCE), pp. 236–241 (2014)
27. Sanchez-Cuevas, P., et al.: Characterization of the aerodynamic ground effect and its influence in multirotor control. Int. J. Aeros. Eng. 2017 (2017)
28. Sharf, I., et al.: Ground effect experiments and model validation with Draganflyer X8 rotorcraft. In: 2014 International Conference on Unmanned Aircraft Systems (ICUAS), pp. 1158–1166 (2014)
29. Knospe, C., et al.: PID control. IEEE Control Syst. Mag. 26(1), 30–31 (2006)
30. Borase, P., et al.: A review of PID control, tuning methods and applications. Int. J. Dyn. Control 9, 818–827 (2021)
31. Morari, M., et al.: Model predictive control: past, present and future. Comput. Chem. Eng. 23(4–5), 667–682 (1999)
32. Garcia, E., et al.: Model predictive control: theory and practice - a survey. Automatica 25(3), 335–348 (1989)
33. Saito, N., et al.: Approach of fuzzy theory and hill climbing based recommender for schedule of life. In: Proceedings of LifeTech-2020, pp. 368–369 (2020)
34. Ozera, K., et al.: A fuzzy approach for secure clustering in MANETs: effects of distance parameter on system performance. In: Proceedings of IEEE WAINA-2017, pp. 251–258 (2017)
35. Elmazi, D., et al.: Selection of secure actors in wireless sensor and actor networks using fuzzy logic. In: Proceedings of BWCCA-2015, pp. 125–131 (2015)
A New Method for Improving Cache Hit Ratio by Utilizing Near Network Cache on NDN
Akari Kanazawa and Tetsuya Shigeyasu
Department of Information Systems, Prefectural University of Hiroshima, Hiroshima, Japan
[email protected]
Abstract. Most current network traffic consists of data that are used multiple times by multiple users. If such data can be reused to serve users' requests, the amount of network traffic can be effectively reduced. Named Data Networking (NDN) allows data cached in the network to be reused by means of an ID that is not associated with a node's physical location. However, in NDN, only the cached data stored on the content delivery path are reused. In this paper, to further improve cache utilization, we propose a new method that searches for the requested content on CRs (Content Routers) other than those on the content delivery path. The advantages of the proposal are clarified by computer simulations.
1 Introduction
Nowadays, data used multiple times by multiple users account for most of the traffic on the Internet. In communications that exchange this type of data, the users are only interested in obtaining the data. A network that uses IDs based on the content of the data, independent of location information, is categorized as a content centric network. Communications on the current network, however, are basically based on the locations of nodes, not on the content of the data. For example, on the Internet, all communications use IP addresses, which are based on the nodes' physical locations. This type of network is categorized as a location centric network. Hence, there is a gap between the network architecture and the purpose for which users utilize the network, and this gap is expected to be resolved in future network environments. Under these circumstances, ICN (Information Centric Networking) has attracted attention as a new network architecture that fills this gap. In particular, NDN (Named Data Networking) [1–3], one of the best-known ICN architectures, delivers data using content IDs instead of IP addresses based on the nodes' physical locations. In NDN, a CR (Content Router), a router relaying packets, forwards a user's content request toward the content server according to its forwarding table, the FIB (Forwarding Information Base), whose entries consist of a content ID and a next hop node. Once the forwarded content request arrives at the content server, the corresponding content is delivered to the requestor via the reverse route of the content request. The CRs on the content delivery path cache the content (in-network cache) and deliver it on behalf of the original content server if they receive a corresponding content request in the future. Therefore, increasing the number of content deliveries from the CRs effectively reduces both the amount of network traffic and the load on the content server. Content deliveries from CRs in NDN, however, are performed only if a CR holding the content cache receives the content request. Content request and delivery in NDN are performed only over the shortest path between the content requestor and the content server. So, content caches are not utilized if the CRs holding the corresponding cache are not on the forwarding path, even if such a CR exists near the content requestor. To further improve traffic reduction in NDN, it is desirable to utilize content caches on CRs other than those on the shortest path. In this paper, we propose a new forwarding method for content requests in NDN to further utilize cached contents. For this purpose, our proposal employs a new cache search function. The function enables content search on nearby CRs and increases the ratio of content acquisitions from cached contents rather than from the content server. The results of computer simulations confirm that our proposal effectively improves the cache hit ratio, especially for networks in which the content forwarding paths do not intersect.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 57–67, 2023. https://doi.org/10.1007/978-3-031-35836-4_7
2 Related Works
2.1 Overview of NDN
In NDN, content request and content delivery are achieved by using Interest and Data packets, as shown in Fig. 1. In this section, we give an overview of NDN.
Fig. 1. Frame formats of Interest and Data in NDN.
2.1.1 Configuration of CR in NDN
In NDN, routers that perform packet forwarding and content caching are called CRs. A CR exchanges packets with neighboring CRs over communication interfaces called faces. Figure 2 shows the configuration of a CR in NDN. As shown in the figure, a CR consists of the FIB, the PIT (Pending Interest Table), and the CS (Content Store). The FIB is used as the routing table in NDN. An FIB entry is created according to the content name registered when contents are returned from the content holder. An FIB entry consists of two fields: content name and forwarding interface. The information in the entry is used for Interest forwarding. The PIT keeps information on unfinished requests received from downstream nodes that forwarded an Interest. A PIT entry consists of two fields: content name and arriving interface. The information in the entry is used for returning the Data of the requested content and for avoiding redundant forwarding of Interests with the same content name. The CS is used to keep caches of the contents transmitted through the CR. When the Data of a content is returned from the content holder to the requestor, the CRs on the return path of the Data cache the content in their CS buffers.
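The three tables and their use when an Interest or a Data packet arrives (the procedures of Sect. 2.1.2, Figs. 3 and 4) can be sketched in Python as below. This is a simplified illustration, not an NDN implementation: the class and method names are hypothetical, the CS has no capacity limit or replacement policy, and the handling of an Interest without an FIB route is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ContentRouter:
    fib: dict = field(default_factory=dict)  # content name -> forwarding face
    pit: dict = field(default_factory=dict)  # content name -> set of arriving faces
    cs: dict = field(default_factory=dict)   # content name -> cached content

    def on_interest(self, name, in_face):
        """Return the packets to send as (kind, face, payload) tuples."""
        if name in self.cs:                  # cache hit: answer from the CS
            return [("data", in_face, self.cs[name])]
        if name in self.pit:                 # already pending: only record the face
            self.pit[name].add(in_face)
            return []
        if name not in self.fib:             # assumption: drop when no route exists
            return []
        self.pit[name] = {in_face}           # new pending request
        return [("interest", self.fib[name], name)]

    def on_data(self, name, content):
        """Forward Data to all pending faces and cache a copy in the CS."""
        faces = self.pit.pop(name, None)
        if faces is None:                    # no PIT entry: discard the Data
            return []
        self.cs[name] = content
        return [("data", face, content) for face in sorted(faces)]
```

A second Interest for the same name that arrives while the first is pending is aggregated in the PIT rather than forwarded again, which is the redundancy avoidance described above.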
Fig. 2. Configuration of CR in NDN
2.1.2 Forwarding of Interest and Data
Fig. 3. Procedure for arriving Interest on CR
Fig. 4. Procedure for arriving Data on CR
In NDN, a user transmits an Interest containing the content name when a new content request arises. In the following, we describe the procedure performed by a CR that receives an Interest. Figure 3 shows the procedure. The CR receiving the Interest from a neighbor checks whether the corresponding content is stored in its CS. If content with the same content name is in its CS, the CR returns it to the neighbor that transmitted the Interest. Otherwise, the CR checks whether its PIT has an entry with the same content name. If the entry exists, the CR adds the face number to the entry. Otherwise, the CR creates a new entry with the content name and the number of the face on which the Interest arrived, and forwards the Interest to the next CR according to the FIB entry. When the forwarded Interest finally arrives at the original server or at a CR holding a cache of the requested content, the server or CR returns the content corresponding to the Interest. Next, we describe the procedure performed by a CR on which Data arrives (Fig. 4). The CR checks its PIT entries when it receives the Data. If there is an entry with the same content name as the Data, the CR forwards the Data to the faces registered in the entry. After the forwarding, the CR removes the entry used for the Data forwarding and caches a copy of the Data in its CS. If no such entry is kept in its PIT, the CR simply discards the Data.
2.2 Cache Finding on CRs Outside the FIB Forwarding Path
As described previously, an Interest packet, i.e., a user's content acquisition request, is forwarded toward the server according to the FIB information on the relay CRs. In conventional NDN, this Interest is forwarded only over a path that typically consists of the shortest path (best route) between the user and the server. So, contents cached on CRs that do not belong to the path are never utilized, even if those caches are located near the user requesting the content. In order to achieve greater traffic reduction, researchers have proposed methods to utilize those caches in conventional NDN [4, 5]. In [4], the authors proposed a CDN-like utilization method for cached contents named FNR (Fetching the Nearest Replica). In FNR, nodes caching contents register their state with a TS (Tracking Server). The TS notifies the location of the cached contents when it receives an inquiry about the contents. After the notification, the user wishing to get the content requests it according to the information from the TS. Although this mechanism contributes to improving the utilization of cached contents, FNR needs a TS as a centralized control node. On the other hand, in [5], CRs within the vicinity size of a user node notify the information on the contents cached in their CSs. When requesting a content, a user sends its Interest directly to a CR that holds the content corresponding to the Interest, even if the CR is not on the FIB forwarding path. The method proposed in [5] thus also utilizes contents cached on CRs outside the FIB forwarding path. However, the method only considers utilizing the caches around the user. Hence, caches that are near the FIB forwarding path but outside the area defined by the vicinity size are not utilized.
3 Method for Utilizing Nearby Cache for Improving Cache Hit Ratio
In this section, we propose a new method for improving the cache hit ratio by using contents cached on nearby CRs that are not on the forwarding path recorded in the FIB. In the proposal, every CR exchanges the list of contents cached in its CS with the nearby CRs. When a CR receives an Interest, it refers to the exchanged neighbor cache information and forwards the Interest to the CR holding the designated content cache, even if that CR is not recorded as part of the forwarding path in the FIB.
3.1 MUNC: Method for Utilizing Nearby Cache
In the following, we propose MUNC for improving the cache hit ratio using nearby caches. The most important role of MUNC is to find and utilize caches on CRs that do not belong to the FIB forwarding path but are near the path. Whereas conventional NDN utilizes only the contents cached on the FIB forwarding path, MUNC utilizes nearby caches regardless of whether they are on the path. To utilize the nearby caches, MUNC periodically exchanges lists of cached content information among neighboring CRs.
3.1.1 Exchange Procedure of List of Cached Content
In MUNC, CRs exchange their lists of cached content information with nearby CRs. In this procedure, a CR sends both the information on the contents cached in its own CS and the cached content information received from its neighbors. At a CR, information received from neighbor CRs is recorded if it meets the conditions given below. The list of cached contents is called the contents list. The contents list consists of information on the contents cached by the CR itself and by the neighboring CRs. Table 1 shows the structure of the contents list. As shown in the table, the contents list consists of four fields: content ID, time, hops, and owned node. content ID is the content name of a cached content, time is the time at which the content was cached, hops is the number of hops from the CR caching the content, and owned node is the node ID of the CR caching the content. For a CR's own cached contents, the value of hops is 0 (e.g., "/music/3/10" in Table 1). In MUNC, CRs periodically broadcast their contents lists to their neighbors. On receiving a contents list from a neighbor, a CR checks it and stores the entries fulfilling the conditions in its aggregated list. The conditions for storing an entry in the aggregated list are as follows:
• owned node is different from the CR's own node ID.
• The value of hops + 1 is smaller than or equal to area size (see footnote 1).
Footnote 1: area size is used to limit the area for sharing the contents list. It is set by a network operator.

Table 1. Contents list
content ID | time | hops | owned node
"/video/1/0" | 3.0 | 1 | 2
"/music/3/10" | 0.8 | 0 | 4
"/photo/2/5" | 5.2 | 2 | 5
Entries fulfilling the above conditions are stored in the aggregated list. At that time, the value of hops of each entry is incremented by 1. The ID of the face on which the contents list was received is recorded as the value of face. Table 2 shows an example of the aggregated list (area size = 2) when a CR receives the contents list shown in Table 1 from its neighbor CR through face 2. As shown in the table, the two entries at the top are simply copied from the contents list. However, the entry for the content "/photo/2/5" is rejected because the value of hops + 1 exceeds the area size. The entry at the bottom is for a content cached in the CR's own CS, so the value of face is N/A. In addition, if multiple entries with the same owned node are received for the same content ID, those entries are discarded except for the entry with the smallest hops. If multiple entries with the same content ID exist in the aggregated list, the Interest is forwarded according to the entry with the smallest value of hops. If the values of hops are the same, the newest entry is selected on the basis of time.

Table 2. Aggregated list (area size = 2)
prefix | time | hops | face | owned node
"/video/1/0" | 3.0 | 2 | 2 | 2
"/music/3/10" | 0.8 | 1 | 2 | 3
"/document/3/5" | 2.2 | 2 | 1 | 2
"/video/8/2" | 1.2 | 0 | N/A | 1
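The storing conditions and the selection rules above can be sketched in Python as follows. The `Entry` class and the function names are hypothetical, and the sketch assumes that a larger time value means a newer entry. It is an illustration of the described rules, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    content_id: str
    time: float
    hops: int
    owned_node: int
    face: Optional[int] = None  # None for contents cached in the CR's own CS


def merge_contents_list(aggregated, received, in_face, my_id, area_size):
    """Merge a neighbor's contents list into the aggregated list (new list)."""
    best = {(e.content_id, e.owned_node): e for e in aggregated}
    for e in received:
        if e.owned_node == my_id or e.hops + 1 > area_size:
            continue  # storing conditions not met
        new = Entry(e.content_id, e.time, e.hops + 1, e.owned_node, in_face)
        key = (new.content_id, new.owned_node)
        if key not in best or new.hops < best[key].hops:
            best[key] = new  # keep only the entry with the smallest hops
    return list(best.values())


def select_entry(aggregated, content_id):
    """Pick the entry for forwarding: fewest hops, then newest time."""
    candidates = [e for e in aggregated
                  if e.content_id == content_id and e.face is not None]
    return min(candidates, key=lambda e: (e.hops, -e.time), default=None)
```

Feeding the three rows of Table 1 through `merge_contents_list` with `in_face=2` and `area_size=2` keeps the "/video/1/0" and "/music/3/10" entries with hops incremented by 1 and drops "/photo/2/5", as described for Table 2.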
3.1.2 Interest Forwarding Procedure
When a new content request arises at a user, the user first forwards an Interest to the CR to which it is connected. The CR receiving the Interest forwards it to the next CR according to its FIB entry. In parallel with this forwarding, the CR forwards one more Interest if a corresponding entry exists in its aggregated list. When an Interest is forwarded according to the aggregated list, the AGI flag is set in the normal Interest forwarded according to the FIB. CRs receiving an Interest with the AGI flag set do not forward the Interest according to their aggregated lists. On receiving the forwarded Interest, the content server or a CR holding the corresponding content returns the content toward the user.
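A minimal Python sketch of this forwarding decision is given below. The function name and the dictionary-based tables are hypothetical. The text does not state which flag the copy sent according to the aggregated list carries, or what happens when both tables point to the same face, so the choices made here (flag set on the extra copy, no extra copy for an identical face) are assumptions. PIT and CS handling are omitted.

```python
def forward_interest(name, fib, aggregated_faces, agi_flag):
    """Return (face, agi_flag) pairs for the Interest copies a CR sends out.

    fib: content name -> face toward the content server.
    aggregated_faces: content name -> face selected from the aggregated list.
    agi_flag: True if the arriving Interest already carries the AGI flag.
    """
    copies = []
    # A CR that receives an Interest with the AGI flag ignores its aggregated list.
    agg_face = None if agi_flag else aggregated_faces.get(name)
    send_extra = agg_face is not None and agg_face != fib.get(name)
    if name in fib:
        # The FIB copy carries the AGI flag once an aggregated-list copy is sent.
        copies.append((fib[name], agi_flag or send_extra))
    if send_extra:
        # Assumption: the extra copy is also flagged to prevent further fan-out.
        copies.append((agg_face, True))
    return copies
```

For example, with `fib = {"/a": 1}` and `aggregated_faces = {"/a": 2}`, an unflagged Interest for "/a" is sent on faces 1 and 2, while a flagged one is sent on face 1 only.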
4 Performance Evaluations
In the previous section, we proposed MUNC to improve the cache hit ratio. MUNC is expected to improve cache utilization by enlarging the cache search area. In this section, we clarify the advantages of our proposal compared with conventional NDN under various evaluation conditions.
4.1 Effects of Number of Contents and CS Size on Cache Hit Ratio
This section evaluates the effects of the number of contents and the CS size on the cache hit ratio of MUNC and NDN. The parameters used for the evaluations are shown in Table 3. Three users each request randomly selected contents.

Table 3. Simulation parameters
Parameter | Value
Number of nodes | 25 (Content server: 1, CR: 21, User: 3)
Interest packet size | 1024 [byte]
Content packet size | 1024 [byte]
Simulation time | 10.0 [sec]
Interest generation interval | 0.1
Aggregation list check interval | 0.1
Aggregation list entry size | Infinite
CS size | 10∼110 [contents]
Number of contents | 80∼180
Figure 5 shows the network topology for the evaluations. In this topology, Interests from the three users are forwarded along the shortest paths according to the FIB.
Fig. 5. Network topology.
Figure 6 shows the characteristics of the cache hit ratio when the total number of contents on the content server is varied. For this evaluation, the CS size of each CR is set to 100. As described in the previous section, area size determines the area in which cached contents are searched regardless of the FIB information. Hence, cached contents are searched over a wider area as the area size becomes larger. As shown in the figure, MUNC always achieves a higher cache hit ratio than conventional NDN. In addition, MUNC achieves a higher cache hit ratio with a larger area size. These results indicate that the wider cache search area of MUNC contributes to improving the cache hit ratio.
Fig. 6. Characteristics of cache hit ratio – number of contents.
Figure 7 shows the characteristics of the cache hit ratio when the CS size of each CR is varied. For this evaluation, the total number of contents is set to 100. As shown in the figure, MUNC always achieves a higher cache hit ratio than conventional NDN, and a larger area size further improves the performance. When the CS size is smaller than 60, the cache hit ratio of all three results increases with the CS size. However, the cache hit ratios of the three results stay almost flat when the CS size is larger than 60.
Fig. 7. Characteristics of cache hit ratio – CS size.
4.2 Characteristics of Cache Hit Ratio of Each User
In the previous section, the simulation results confirmed that enlarging the cache search area with MUNC effectively improves the cache hit ratio. This section further investigates the effects of the proposal in terms of the users' locations. For this investigation, we evaluate the cache hit ratio of each user. Figure 8 shows the network topology used for the evaluation. In this figure, the three users are placed near one another, and the green lines show the FIB forwarding path of each user. As shown in the figure, the FIB forwarding paths of users 12 and 16 overlap on CR 5 and CR 10.
Fig. 8. Network topology for evaluating effects of user’s location.
The cache hit ratios of the users under each condition, namely conventional NDN, MUNC (area size = 2), and MUNC (area size = 3), are shown in Figs. 9, 10 and 11, respectively.
Fig. 9. Cache hit ratio (NDN).
The results for conventional NDN, shown in Fig. 9, confirm that users 12 and 16 achieve higher performance than user 8. The cache hit ratios of users 12 and 16 are almost the same. The reason is thought to be that users 12 and 16 share the same cached contents on CR 5 and CR 10. In the case of user 8, however, the FIB forwarding path does not overlap with those of the other users, so its cache hit ratio is the lowest among the users.
Fig. 10. Cache hit ratio (MUNC, area size = 2).
The results for MUNC (area size = 2) in Fig. 10 confirm that the cache hit ratios of users 8 and 12 are significantly improved compared with conventional NDN. In particular, the cache hit ratio of user 8 is improved to almost the same level as that of user 16. The reason is that users 8 and 16 each share cached contents with one other user: user 16 shares with user 12, and user 8 shares with user 16. In addition, user 12 achieves the highest cache hit ratio among the three users. This is because only user 12 shares cached contents with the other two users (users 8 and 16). The results for MUNC (area size = 3) in Fig. 11 confirm that all users achieve the highest and almost the same cache hit ratio. The reason is that, with the enlarged cache search area (area size = 3), every user shares cached contents with the other two users.
Fig. 11. Cache hit ratio (MUNC, area size = 3).
5 Conclusion
In this paper, we have proposed a new method that improves the cache hit ratio of conventional NDN by utilizing cached contents. In conventional NDN, cached contents are utilized to reduce network traffic and server load for future content requests. This advantage appears only if the Interest is received by a CR holding the corresponding cache. In conventional NDN, however, an Interest is forwarded only along the shortest path recorded in the FIB. Hence, even if the corresponding cache exists on a CR neighboring the user (requestor), the cache is not utilized when that CR is not on the FIB forwarding path. The purpose of our proposal, MUNC, is to find such content caches by exchanging cached content information among neighboring CRs in advance. The advantages of our proposal have been evaluated by computer simulations. The results of the evaluation confirmed that our proposal achieves greater traffic reduction than conventional NDN by utilizing content caches on CRs outside the FIB forwarding paths. In addition, the evaluation clarified that the cache hit ratio improves as the area size, which defines the search area for cached contents, increases. As future work, we will investigate the characteristics of MUNC under various network conditions.
References
1. Soniya, M., Kumar, K.: A survey on named data networking. In: Proc. of 2015 2nd International Conference on Electronics and Communication Systems (ICECS 2015), pp. 1515–1519 (2015)
2. Jacobson, V., Smetters, D., Thornton, J., Plass, M., Briggs, N., Braynard, R.: Networking named content. In: Proc. of ACM CoNEXT 2009, pp. 1–12 (2009)
3. Chen, Q., Xie, R., Yu, F., Liu, J., Huang, T., Liu, Y.: Transport control strategies in named data networking: a survey. IEEE Commun. Surv. Tutor. 18, 2052–2083 (2016)
4. Cao, J., Pei, D., Zhang, X., Zhang, B., Zhao, Y.: Fetching popular data from the nearest replica in NDN. In: 2016 25th International Conference on Computer Communication and Networks (ICCCN), pp. 1–9 (2016)
5. Suwannasa, A., Broadbent, M., Mauthe, A.: Vicinity-based replica finding in named data networking. In: 2020 International Conference on Information Networking (ICOIN), pp. 146–151. IEEE (2020)
An Analysis of Theoretical Network Communication Speedup Using Multiple Fungible Paths
David W. White1, Isaac Woungang2, Felix O. Akinladejo1, and Sanjay K. Dhurandher3
1 University of Technology, Jamaica, Kingston, Jamaica
{dwwhite,fakinladejo}@utech.edu.jm
2 Toronto Metropolitan University, Toronto, Canada
[email protected]
3 Netaji Subhas University of Technology, New Delhi, India
Abstract. Smart phones and mobile smart Internet of Things (IoT) devices have become more ubiquitous and transmit/receive more data across the Internet, which necessitates more bandwidth, and therefore more pathways. In this paper, we propose a theoretical model with three components which demonstrates how sending data over multiple fungible paths can increase the bandwidth. Using this model, we estimate the expected speedup as more fungible paths are employed, the theoretical fastest time that it will take to transmit data as the number of fungible paths is increased, and the point of theoretical fastest throughput, after which increasing the number of pathways any further would lead to diminishing returns. Our proposed model can help IoT and smart device manufacturers, network administrators and network application developers in building better performing devices and network applications.
1 Introduction
Recent decades have witnessed a phenomenal growth in connected IoT devices, including smart phones. Currently, there are tens of billions of IoT devices and smart phones in existence [1]. Data from [1] presented in Fig. 1 show some forecasts of mobile devices including phones and tablets from 2020 to 2025. Similarly, data from [2] presented in Fig. 2 show some forecasts on smartphones. In [3], projections for the number of connected IoT devices are shown from 2019 to 2030. As the use of smartphones and similar devices has become more ubiquitous and the applications running on them have become more pervasive, more and more bandwidth is needed to allow acceptable communications. Interestingly, in [4], Cisco estimated that about 500 zettabytes will be transmitted on the Internet by IoT devices in the next few years.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 68–77, 2023. https://doi.org/10.1007/978-3-031-35836-4_8
One way to add more bandwidth is to add more paths. In much the same way that smartphones started out with one rear camera, then two, then three, and so on, manufacturers could add more Bluetooth or Wi-Fi radios to provide the needed additional bandwidth. This raises the following questions: (a) physical dimensions aside, how many additional paths should be added to a host device to get more bandwidth? (b) how much speedup can theoretically be expected? and (c) given the aforementioned pathways, how much time will it take to transfer a given amount of data? This paper addresses these questions and presents our findings, analysis and discussion of the results, conclusions and recommendations.
Fig. 1. Forecast of the number of mobile devices in use worldwide from 2020-2025 (in billions of devices). Data for 2020 and 2021 are actual. Data for 2022 to 2025 [1] are forecasts
Fig. 2. Number of smartphone subscriptions worldwide from 2016 to 2021, with forecasts from 2022 to 2027 (in millions) [2].
2 Related Work In this Section, some representative related works [5–10] are discussed. In [5], Amdahl introduced what has become known as Amdahl’s law to the parallel computing world, contrasting the sequential portion (referred to as data management housekeeping) of a computational load or task with the non-sequential portion. In [6], Rodgers expounded on the merits of Amdahl’s law and presented a formula which can be used to predict the theoretical speedup in the latency of the execution of a task on a system with a fixed-workload as the resources (i.e. number of processors in the system) are increased. In [7], Gustafson stated that Amdahl’s law estimated the speedup by using the equation presented in Eq. (1).
$$\text{Speedup} = \frac{s + p}{s + p/N} = \frac{1}{s + p/N} \tag{1}$$
where N is the number of processors, s is the amount of time spent on the serial parts of the program, p is the amount of time spent on parts of the program that can be done in parallel, and (s + p) has been set to 1 for simplicity. In [8], Bryant and O’Hallaron presented an alternative formulation of Amdahl’s law as follows:
$$S_{\text{latency}}(s) = \frac{1}{(1 - p) + \frac{p}{s}} \tag{2}$$
where Slatency represents the theoretical speedup of the entire task’s execution, s is the speedup of the part of the task that benefits from additional resources in the system, and p is the proportion of the task’s overall execution time that the sped-up portion took before speedup.
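As a quick numerical check of Eqs. (1) and (2), the following minimal Python sketch evaluates both formulations. The function names and the 10%/90% serial/parallel split are illustrative assumptions, not values taken from [7] or [8].

```python
def amdahl_speedup_times(serial, parallel, n):
    """Eq. (1): speedup on n processors from serial and parallel run times."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (serial + parallel) / (serial + parallel / n)


def amdahl_speedup_fraction(p, s):
    """Eq. (2): overall speedup when a fraction p of the task is sped up by s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


if __name__ == "__main__":
    # Hypothetical workload: 10% serial, 90% parallelizable (s + p = 1).
    for n in (1, 2, 4, 8, 1_000_000):
        print(n, round(amdahl_speedup_times(0.1, 0.9, n), 3))
```

With s + p normalized to 1, the two functions agree when the parallel part is sped up by a factor equal to the number of processors, and the speedup approaches 1/s (here 10) as the processor count grows.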
In [9], McCool et al. stated that in Amdahl’s law, the speedup is limited by the fraction of the work that is not parallelizable, i.e. the serial portion of the workload, and that this limits the scalability of parallelization. They also reported that even when speedups are possible, the efficiency of the considered system can be worse because of the serial fraction of the workload that is not parallelized. McCool et al. further expounded on the work-span model for parallel computation (shown in Fig. 3), and opined that it is more useful than Amdahl’s law for estimating the running time of computer programs. In the work-span model, the tasks to be performed can take the form of a directed acyclic graph (DAG), and a task is ready to run once all its predecessors in the DAG have completed. The authors also advocated that the work-span model can provide both upper and lower bounds on the running time, which can be deduced from the work and span of the program. The work of the program is defined as the time that a serialization of the program takes, i.e. the total time to complete all the tasks involved. The span of the program is defined as the time a parallel program takes on an ideal computer that possesses an infinite number of processors, i.e. the span is equal to the length of the critical path in the DAG. In [9], the formulation for the speedup Sp on p processors, where T1 is the running time on one processor and Tp the running time on p processors, is given by:
$$S_p = \frac{T_1}{T_p} \le \frac{T_1}{T_1 / p} = p \tag{3}$$
Fig. 3. Work and Span: Arrows denote dependencies between tasks. Work is the total amount of computation, while span is given by the critical path. In this example, if each task takes unit time, the work is 18 and the span is 6, from source [10].
It is also reported in [9] that adding more processors to an ideal machine with greedy scheduling would never slow down the program since the following inequality holds:
$$S_p = \frac{T_1}{T_p} \le \frac{T_1}{T_\infty} \tag{4}$$
which implies that:
$$\text{speedup} \le \frac{\text{work}}{\text{span}} \tag{5}$$
In [10], Eadline summarized the consequences of Amdahl’s law by stating that the amount of speedup that one can expect by parallelizing a program is limited by the sequential portion of the program and the overhead caused by parallelization. The author used a lawn mower as a metaphor to illustrate how the expected speedup is influenced by a program’s sequential section and its parallel overhead.
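The work-span bounds in Eqs. (3) to (5) can be illustrated with a short Python sketch that computes the work and span of a task DAG and the resulting upper bound on speedup. The dictionary-based DAG representation, the unit task costs, and the function names are assumptions made for illustration; they are not taken from [9].

```python
from functools import lru_cache


def work_and_span(dag, cost=None):
    """Return (work, span) for a DAG given as {task: [successor tasks]}.

    Every task costs one time unit unless a cost mapping is supplied.
    """
    tasks = set(dag) | {t for succs in dag.values() for t in succs}
    cost = cost or {t: 1 for t in tasks}
    work = sum(cost[t] for t in tasks)          # time of a fully serial run

    @lru_cache(maxsize=None)
    def longest_from(task):
        # Length of the longest (critical) path that starts at this task.
        succs = dag.get(task, ())
        return cost[task] + max((longest_from(u) for u in succs), default=0)

    span = max(longest_from(t) for t in tasks)  # critical path length
    return work, span


def speedup_bound(work, span, processors):
    """Upper bound on speedup from Eqs. (3) and (5): min(p, work / span)."""
    return min(processors, work / span)


if __name__ == "__main__":
    diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
    w, s = work_and_span(diamond)
    print(w, s, speedup_bound(w, s, processors=2))
```

For the example in Fig. 3, where the work is 18 and the span is 6, the same bound gives a speedup of at most 3 regardless of how many processors are added.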
3 Methodology We employed a quantitative research design, which involved building and testing the proposed mathematical model to represent the theoretical speedup, transfer times, and number of ideal paths to use. The first part of our model is designed to simulate and examine the speedup and transfer times as the number of fungible communication paths between our host peers is increased from one to two and then to three, using graph theory. We extrapolated the first part of our model to continue increasing the fungible paths from three paths to two million paths, and recorded the results. This first part of our model is represented in Fig. 4. It involves transferring a 4 MB plaintext file containing the full text of the King James version of the Bible using a single path. In this first part of our model, we present a directed acyclic graph with six vertices. The first three vertices (labeled 1, 2, 3, in the Single Path section) represent the sending peer. The first vertex provides the data stream to be transmitted from the sending host to the receiving host and passes this on to the second vertex. The second vertex disassembles the data stream into discrete units, so that these individual units can be sent simultaneously along the available fungible paths, and then passes them on to the third vertex. The third vertex then sends the data along its respective path. The last three vertices (labeled 4, 5, 6, in the Single Path section) represent the receiving peer. The fourth vertex receives all the data sent along its respective path from the third vertex and passes it on to the fifth vertex. The fifth vertex then coordinates the assembling of the data received simultaneously from the various available fungible paths into a single data stream, and finally, the sixth vertex collects and consumes that single data stream. Treating the vertices in the first part of our model as tasks, we borrowed the work-span model from Eq. (5), and used it to compute the speedup in the first part of our model, computing the work as 6 and the span as 6, where the work represents the total number of vertices in the Single Path DAG and the span represents the critical path. We then added two more fungible paths to the first part of our model and recalculated the speedup achieved using the work-span model, each time duplicating just the third and
Fig. 4. Directed acyclic graph depicting single and multiple paths used in our model.
fourth vertices (i.e. 3a and 4a, 3b and 4b, and 3c and 4c) for each additional fungible path that we have added, and thus keeping the span constant while the work increased by two as shown in the Multiple Fungible Paths section of Fig. 4. We graphed the results, then used regression analysis, curve fitting, and manual analysis, to determine an equation that estimates the speedup in the first part of our model. Besides, we extrapolated the results to two million fungible paths and again examined the results to determine if we could derive a best-fit equation that can estimate the speedup, and we checked if there was a relationship between what is known in graph theory as the first Betti number (which is the difference between the number of edges and the number of vertices plus the number of connected components in a graph), and our best-fit equation, which would allow us to restate our equation in terms of the number of edges and vertices in the DAG from the first part of our model. The second part of our model involved carefully examining how much time it takes to transfer the 4 MB file using one, two and then three fungible paths. We used this data to determine a Diophantine equation for estimating the theoretical best time it would take as we added more fungible paths to the second part of our model. We then extrapolated these results to two million fungible paths and again examined them to determine if we could derive a best-fit equation to estimate the theoretical best transfer times. The third part of our model was motivated by the “lawnmower” example data from [10], which in turn was based on Amdahl’s law. Unlike the first and second parts of our model, the third part of our model included a serial section time (vertices one and six in Fig. 4), an overhead parallel setup time (vertices two and five in Fig. 4), and a potential parallel time (vertices three and four in Fig. 4), similar to the example data used by [10]. 
The data that we have used is presented in Table 1. We used curve fitting to examine
a possible relationship between the variables to find out if we could estimate the ideal number of fungible paths that can be used to achieve the theoretical best transfer times (i.e. minimum times). We extrapolated the data in Table 1 to two million five hundred thousand records, and again examined the results to determine if we could derive a best-fit equation to estimate the ideal number of fungible paths to be used to achieve the theoretical best transfer times. We have used the R programming language, the LibreOffice Calc spreadsheet and the Wolfram Alpha online computational engine to store, graph and analyze our data and present our results. Table 1. Data used from [10] to develop our third model
4 Results From the first part of our model, we were able to derive a best-fit relation to estimate the speedup. This relationship is shown in Eq. (6),
$$s = \frac{1}{3}L + \frac{2}{3} \tag{6}$$
where s is the computed speedup and L is the number of fungible paths available for data transmission. The settings we used included keeping the span constantly at 6 (the value we used in Fig. 4) as we increased the work by adding more fungible paths from 1 to 2,000,000. The value of the work was increased by 2 on each run, starting from 6 and proceeding to 4,000,004. From Eq. (6), we derived an equivalent formulation given in Eq. (7):
$$s = \frac{L + 2}{3} \tag{7}$$
We also found that we could compute the speedup as a function of the number of edges and vertices in the graph, after determining that the number of fungible paths was equal to 1 plus the first Betti number, i.e.
$$s = \frac{1}{3}(e - v + 4) \tag{8}$$
where s is the computed speedup, e is the number of edges in our graph, and v is the number of vertices in our graph. Equations (6) to (8) assume that we keep the span constant at 6. We found that as we increased the overhead in the first part of our model in increments of 2, the gradient of the graph decreased, indicating that when the overhead is increased and the number of paths is gradually increased, the incremental increase in the speedup between path r – 1 and path r decreases. The resulting linear relationships can be expressed mathematically as shown in Eq. (9), where s is the speedup, g is the gradient, k is a constant, and x is computed as shown in Eq. (10), where z represents the span.
$$s = g \cdot x + k \tag{9}$$
$$x = \frac{z - 2}{2} \tag{10}$$
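The agreement between the work-span computation and Eqs. (6) to (8) can be checked with a small Python sketch. It rebuilds the DAG of Fig. 4 for a given number of fungible paths, assuming the edge layout 1 → 2 → 3 → 4 → 5 → 6 with only vertices 3 and 4 duplicated per path; the vertex labels and function names are illustrative choices, and the span is held at 6 as stated in the text rather than recomputed.

```python
from fractions import Fraction

SPAN = 6  # critical path 1 -> 2 -> 3 -> 4 -> 5 -> 6, held constant as in the text


def build_path_dag(num_paths):
    """Return (vertices, edges) of the model DAG with num_paths fungible paths."""
    if num_paths < 1:
        raise ValueError("at least one path is required")
    edges = [(1, 2), (5, 6)]
    for i in range(num_paths):
        send, recv = (3, i), (4, i)  # duplicated vertices 3 and 4 for path i
        edges += [(2, send), (send, recv), (recv, 5)]
    vertices = {v for edge in edges for v in edge}
    return vertices, edges


def first_betti_number(vertices, edges, components=1):
    """Edges minus vertices plus connected components."""
    return len(edges) - len(vertices) + components


def speedup_work_span(num_paths):
    """Eq. (5) with work = number of vertices and span = 6."""
    vertices, _ = build_path_dag(num_paths)
    return Fraction(len(vertices), SPAN)


def speedup_eq7(num_paths):
    return Fraction(num_paths + 2, 3)


def speedup_eq8(num_paths):
    vertices, edges = build_path_dag(num_paths)
    return Fraction(len(edges) - len(vertices) + 4, 3)


if __name__ == "__main__":
    for paths in (1, 2, 3, 10):
        print(paths, speedup_work_span(paths), speedup_eq7(paths), speedup_eq8(paths))
```

Under these assumptions the graph has 2L + 4 vertices and 3L + 2 edges, so the first Betti number is L − 1 and all three expressions reduce to (L + 2)/3.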
From the second part of our model, we can derive a best-fit relation to estimate the theoretical best transfer times as given in Eq. (11):
$$T = \left\lceil \frac{C}{S \cdot L} \right\rceil \tag{11}$$
where T represents the total time it will take to transfer the data, C is the amount of data to be transmitted, S is the speed of one of the fungible paths, and L is the number of available fungible paths. We used the following settings in deriving the relations above. We varied L from 1 to 2,000,000, we varied L from 1 to 100, and we kept S at a constant 5.
From the third part of our model, when we examined the graph of the data from [10] and our extrapolated results in Table 1, we noticed that the minimum of the function (i.e. best time) representing the total time for all sections (i.e. total transfer time) occurred at the point where the potential parallel time and the overhead parallel setup time intersected each other. Our results also revealed that we could predict that the intersection point b between the potential parallel time x6 and the overhead parallel setup time x5 occurs approximately at the point where:
$$b = \frac{\sqrt{a}}{\sqrt{c}}, \quad a \ge 1,\ c \ge 1 \tag{12}$$
which implies that this intersection point occurs where the number of fungible paths being used, b, is equal to the square root of the single path time a divided by the square root of the per-unit parallel setup time c. This further implies that it is at this point that the total time for all sections, i.e. T = t_s + t_o + t_p, is approximately at its minimum. We used various settings in arriving at our prediction in Eq. (12). Using the data from Table 1, we varied the number of fungible paths from 1 to 40 in increments of 5, we kept the Serial Section Time at a constant of 20, the Single Path Time at a constant of 40, and Per Unit Parallel Setup Time at a constant of 1. We also used another configuration where we varied the number of fungible paths from 1 to 9, we kept the Serial Section Time at a constant of 5, the Single Path Time at a constant of 10, and Per
Unit Parallel Setup Time at a constant of 1. We then used a configuration setting where we varied the number of fungible paths from 1 to 2,500,000, we kept the Serial Section Time at a constant of 5, the Single Path Time at a constant of 10, and Per Unit Parallel Setup Time at a constant of 1.
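The prediction in Eq. (12) can be explored with a minimal Python sketch. It assumes, following the lawnmower example in [10], that the overhead parallel setup time grows as the per-unit setup time multiplied by the number of paths and that the potential parallel time is the single path time divided by the number of paths; this decomposition and all function names are assumptions made for illustration.

```python
import math


def total_time(paths, serial, single_path, per_unit_setup):
    """T = t_s + t_o + t_p under the assumed model."""
    overhead = per_unit_setup * paths  # assumed: setup cost grows with path count
    parallel = single_path / paths     # assumed: transfer time shrinks with path count
    return serial + overhead + parallel


def predicted_optimum(single_path, per_unit_setup):
    """Eq. (12): b = sqrt(a) / sqrt(c)."""
    return math.sqrt(single_path) / math.sqrt(per_unit_setup)


def best_integer_paths(max_paths, serial, single_path, per_unit_setup):
    """Brute-force search for the whole number of paths with the smallest T."""
    return min(
        range(1, max_paths + 1),
        key=lambda b: total_time(b, serial, single_path, per_unit_setup),
    )


if __name__ == "__main__":
    # Two of the configurations described in the text (serial, single path, setup).
    for serial, single, setup, limit in ((20, 40, 1, 40), (5, 10, 1, 9)):
        print(
            round(predicted_optimum(single, setup), 2),
            best_integer_paths(limit, serial, single, setup),
        )
```

Under this model the overhead and parallel terms are equal exactly at b = √(a/c), which is also where their sum is smallest, so the brute-force integer optimum lands next to the value predicted by Eq. (12).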
5 Conclusion In this paper, we discussed the need for an increasing amount of bandwidth to meet the requirements of the ever-increasing billions of IoT and smart devices such as smart phones and their applications. As our contribution to the body of knowledge, we presented a mathematical model with three parts and demonstrated how these parts could be used to determine how many additional fungible paths we should add to a host device to get more bandwidth, how much speedup we could theoretically expect to achieve by adding more fungible paths, and how much time it would take to transfer a given amount of data over a specific number of fungible paths. We were able from our results to develop a Diophantine equation to predict the optimal number of fungible paths to use to achieve the theoretical fastest transfer times. We showed that this optimal number of fungible paths could be found by computing the square root of the Single Path Time divided by the square root of the Per-unit Parallel Setup Time. Adding any more paths after this optimal point is reached will result in diminishing returns. We used graph theory to derive the theoretical speedup expected as more fungible paths were added to our reference model. We computed this speedup as being equal to one-third of the number of fungible paths being used plus the constant two-thirds. We also deduced from our results that the theoretical time taken to transmit a given amount of data was inversely proportional to the number of fungible paths. We showed that this time could be found by computing the ceiling of the amount of data to be transferred, divided by the product of the number of fungible paths available to transmit the data and the speed of one of those fungible paths.
We recommend that network-based applications running on appropriate devices such as smart phones and smart IoT devices dynamically check how many fungible paths are available during runtime, and make use of those paths to gain more bandwidth, while ensuring that no more than the optimal number of fungible paths are used so as not to trigger diminishing returns. These network-based applications can also compute metrics such as the theoretical speedup expected and the data transfer times expected. Network administrators can utilize these metrics to fine-tune their network configurations to try to achieve performance as close to the theoretical values as practical operations allow. Manufacturers can employ our model to help them in designing smart devices such as smart phones and other IoT devices with better network performance. We also recommend that more research be conducted in this area to expand the body of work we have presented here.
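As a rough illustration of this recommendation, the Python sketch below shows how an application might cap the number of paths it uses and report the expected metrics. The function `plan_transfer`, its parameters, and the choice to round the optimum to the nearest whole path are hypothetical; how a device discovers its available paths is outside the scope of the sketch.

```python
import math
from fractions import Fraction


def paths_to_use(available, single_path_time, per_unit_setup_time):
    """Cap the available paths at the optimum of Eq. (12), rounded to a whole path."""
    optimum = math.sqrt(single_path_time) / math.sqrt(per_unit_setup_time)
    return max(1, min(available, round(optimum)))


def expected_speedup(paths):
    """Eq. (7): s = (L + 2) / 3, valid for the reference DAG with span 6."""
    return Fraction(paths + 2, 3)


def transfer_time(data_units, path_speed, paths):
    """Eq. (11) with integer inputs: ceiling of C / (S * L)."""
    capacity = path_speed * paths
    return (data_units + capacity - 1) // capacity


def plan_transfer(available, data_units, path_speed, single_path_time, per_unit_setup_time):
    """Hypothetical helper bundling the three metrics for one transfer."""
    paths = paths_to_use(available, single_path_time, per_unit_setup_time)
    return {
        "paths": paths,
        "speedup": expected_speedup(paths),
        "time": transfer_time(data_units, path_speed, paths),
    }


if __name__ == "__main__":
    print(plan_transfer(available=8, data_units=100, path_speed=5,
                        single_path_time=40, per_unit_setup_time=1))
```

Integer arithmetic is used for the ceiling so that exact multiples are not rounded up by floating-point error.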
References
1. Laricchia, F.: Number of mobile devices worldwide 2020–2025. Statista - The Statistics Portal, The Radicati Group, Hamburg, Germany (2023). https://www.statista.com/statistics/245501/multiple-mobile-device-ownership-worldwide/. Accessed 19 Mar 2023
2. Taylor, P.: Smartphone subscriptions worldwide 2016–2021, with forecasts from 2022 to 2027. Statista - The Statistics Portal, Ericsson, Hamburg, Germany (2023). https://www.statista.com/statistics/330695/number-of-smartphone-users-worldwide/. Accessed 19 Mar 2023
3. Vailshery, L.S.: Number of Internet of Things (IoT) connected devices worldwide from 2019 to 2021, with forecasts from 2022 to 2030. Statista - The Statistics Portal, Transforma Insights, Hamburg, Germany (2022). https://www.statista.com/statistics/1183457/iot-connected-devices-worldwide/. Accessed 19 Mar 2023
4. Cisco Visual Networking: Cisco global cloud index: Forecast and methodology, 2016–2021. White Paper. Cisco Public, San Jose, 1 (2016)
5. Amdahl, G.M.: Validity of the single-processor approach to achieving large scale computing capabilities. In: AFIPS Conference Proceedings, Atlantic City, N.J., vol. 30, pp. 483–485, 18–20 April. AFIPS Press, Reston (1967)
6. Rodgers, D.: Improvements in multiprocessor system design. ACM SIGARCH Comput. Archit. News 13(3), 225–231 (1985)
7. Gustafson, J.L.: Reevaluating Amdahl’s law. Commun. ACM 31(5) (1988)
8. Bryant, R.E., O’Hallaron, D.: Computer Systems: A Programmer’s Perspective, 3rd edn., p. 58. Pearson Education (2016). ISBN 978-1-488-67207-1
9. McCool, M.D., Robison, A.D., Reinders, J.: Performance theory. In: Structured Parallel Programming: Patterns for Efficient Computation, pp. 61–62. Elsevier (2012). ISBN 978-0-12-415993-8
10. Eadline, D.: The Lawnmower Law. Linux Magazine. QuinStreet, Inc., CA, USA (2008). http://www.linux-mag.com/id/6020/. Accessed 13 Nov 2019
Universal Intrusion Detection System on In-Vehicle Network Md Rezanur Islam1 , Insu Oh2 , and Kangbin Yim2(B) 1 Department of Software Convergence, Soonchunhyang University, Asan, Korea
[email protected] 2 Department of Information Security Engineering, Soonchunhyang University, Asan, Korea
{catalyst32,yim}@sch.ac.kr
Abstract. The Controller Area Network (CAN) protocol is widely used in automotive and industrial applications for communication. However, the lack of authentication and encryption in CAN bus networks has made them vulnerable to cyberattacks. This study investigated the effectiveness of different intrusion detection models in accurately classifying attacks, such as Denial-of-Service (DoS) attacks, fuzzing, and replay attacks. A labeled dataset was created using a methodology that uses the CAN ID sequence, time gap, and hamming distance between hexadecimal strings of equal length. The resulting dataset was segmented and converted into heat maps that were input to deep learning models such as VGG16, AlexNet, and ResNet-50. The study provides valuable insights for developing more robust security measures for in-vehicle networks. However, recent research has shown that intrusion detection systems need to be developed individually for each vehicle, taking into account the unique data characteristics of the vehicle. Therefore, this paper proposes to implement universal IDS by using three types of CNN architectures to find the best one that is suitable for all types of attacks with high accuracy.
1 Introduction Controller Area Network (CAN) is a communication protocol widely used in the automotive and industrial sectors that allows electronic devices to communicate with each other and facilitates real-time data exchange and control. The multi-master bus characteristic of the CAN bus makes it very reliable and robust, allowing multiple devices to simultaneously transmit data on the bus without conflict [1]. However, with the increasing use of Internet of Things (IoT) and Internet of Vehicles (IoV) devices and connectivity, CAN bus networks have become increasingly vulnerable to cyberattacks. The openness and lack of authentication in CAN bus networks make them vulnerable to a range of attacks, with attackers able to manipulate CAN messages to gain control of vehicles or industrial processes [2]. Furthermore, the lack of encryption and authentication in CAN bus networks makes it easier for attackers to access the network and compromise its integrity, which can have serious consequences. To address these challenges, systems to detect intrusions into the in-vehicle network have gained significant attention. These systems can detect and respond to potential
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 78–85, 2023. https://doi.org/10.1007/978-3-031-35836-4_9
cyberattacks, mitigate their impact on the integrity of the CAN bus network, and ensure secure communications between devices. However, recent research has shown the importance of developing a customized attack detection system for each vehicle. According to CAN DBC [3], each vehicle has unique data characteristics, even within the same manufacturer. As a result, an IDS built for one vehicle may not detect attacks when the vehicle is changed. There is therefore a need for an in-vehicle network intrusion detection system that can be deployed on each vehicle and takes into account the unique data characteristics of the vehicle. Specifically, the contributions of this research are as follows: First, this study proposes a method for generalizing data from different vehicles. In particular, an approach is defined for addressing the challenge of data diversity in in-vehicle networks, which is a major obstacle to the development of a universal Intrusion Detection System (IDS). The proposed approach eliminates the need to develop IDSs for each vehicle individually. Second, the proposed algorithm is able to classify attack-free and attack-relevant data independently by using data from other vehicles. This capability allows the proposed algorithm to effectively detect attacks even when the number of available data samples for a given vehicle is limited. Finally, the proposed approach can be extended to the Roadside Unit (RSU) and the On-Board Unit (OBU) in the Vehicle-to-Everything (V2X) communication ecosystem. Through this extension, the security of V2X communication can be improved by using the proposed approach for V2X security implementation. In our previous research [4], we performed an in-depth analysis of the Controller Area Network and now we have proposed a heat map-based IDS solution using CNN that can detect attacks independently.
In this paper, we implement three types of CNN architectures to find the best one that is suitable for all types of attacks with high accuracy. In summary, implementing security measures such as encryption, authentication, and intrusion detection is crucial to protect CAN bus networks from cyberattacks and ensure their reliability and security. The development of universal intrusion detection systems can further improve the security of these networks by taking into account the unique characteristics of each vehicle. As the use of IoT and IoV devices continues to grow, it is critical to take proactive measures to protect CAN bus networks from cyber threats to ensure the safety of both vehicles and industrial processes.
2 CAN Specification The specification of CAN was published by manufacturers in an article [1]. In a CAN network, data is transmitted in the form of messages consisting of two main components: the message identifier (CAN ID) and the payload data. The CAN ID is a unique identifier that specifies the priority and content of the message. The payload contains the actual information that is transmitted and can be between 0 and 8 bytes in size. When a message is transmitted, it is encapsulated in a CAN frame, which consists of seven main fields: the Start of Frame (SOF), Arbitration Field, Control Field, Data Field, CRC Field, Acknowledgment Field, and End of Frame (EOF). The Arbitration Field contains the CAN ID and determines the priority of the message. Messages with lower CAN IDs
have higher priority. The Control Field contains information about the data length and the type of message, such as whether it is a remote frame or a data frame. In a remote frame, the CAN ID is used to request data from other nodes on the network, while in a data frame, the payload contains the actual data that is being transmitted. The CRC field is used for error detection and is calculated based on the data payload. It ensures that the transmitted data is error free and can be correctly interpreted by the receiving node. The Acknowledgment Field is used to confirm the successful reception of a message by the receiving node. If the message was received without errors, the receiving node sends an acknowledgement message back to the sending node. Finally, the End of Frame (EOF) field signals the end of the message and allows the receiving node to prepare for the next message on the bus. In summary, the CAN protocol uses unique message identifiers (CAN IDs) to determine the priority and content of messages transmitted on the network. Each message is encapsulated in a CAN frame that contains several fields, including the arbitration field, the control field, the data field, the CRC field, the acknowledgement field, and the end-of-frame field. This enables reliable and efficient communication between nodes in the network.
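The message structure and the priority rule described above can be sketched in a few lines of Python. The `CanFrame` class and `arbitration_winner` function are illustrative names, and the 11-bit identifier range is an assumption corresponding to standard (non-extended) frames; the sketch models only the identifier and payload, not the bit-level SOF, CRC, ACK, or EOF fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanFrame:
    can_id: int           # identifier carried in the Arbitration Field
    data: bytes = b""     # payload carried in the Data Field (0 to 8 bytes)
    remote: bool = False  # True for a remote frame that requests data

    def __post_init__(self):
        if not 0 <= self.can_id <= 0x7FF:  # assumed 11-bit standard identifier
            raise ValueError("CAN ID out of range for a standard frame")
        if len(self.data) > 8:
            raise ValueError("payload must be at most 8 bytes")

    @property
    def dlc(self):
        """Data length, as reported in the Control Field."""
        return len(self.data)


def arbitration_winner(frames):
    """Return the frame that wins bus arbitration: the lowest CAN ID."""
    return min(frames, key=lambda frame: frame.can_id)


if __name__ == "__main__":
    pending = [
        CanFrame(0x244, bytes([0x10, 0x20])),
        CanFrame(0x100, bytes([0xFF])),
        CanFrame(0x7FF),
    ]
    winner = arbitration_winner(pending)
    print(hex(winner.can_id), winner.dlc)
```

This priority rule is also why a flood of low-ID messages can starve other traffic, which is the basis of the DoS attacks considered later in the paper.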
3 Related Works on Automotive Security Zhu et al. [5] proposed a multidimensional IDS using multi-task LSTM for parallel computation on local terminals and mobile devices. The input features combine the 64-bit data with the CAN ID time interval. The local part predicts the next data combination, while the mobile edge has two parallel LSTMs to improve performance and achieve 90% overall accuracy. A federated learning approach proposed by J. Yang et al. [6] achieved 94.85% accuracy using the ID sequence as input to a ConvLSTM. Federated learning is a distributed machine learning approach that allows multiple clients to jointly train a model without sharing their data with a central server. The model is trained locally on each client, and only model updates are sent to the central server for aggregation. The architecture proposed by Narasimhan et al. detects tampering with the incoming CAN features of a vehicle using autoencoder and clustering methods. The IDEC approach [7] is used to learn optimal features and cluster data using K-means. A modified version of the IDEC algorithm is proposed where the clustering loss function is not embedded and is more suitable for IDS [8]. This modified version uses an autoencoder architecture for initial pre-training and GMM for clustering attacks and normal traffic based on various sensor inputs with an accuracy rate of 80.1%. The authors of [9] analyzed the input CAN ID sequence for attacks using a bidirectional GPT network by computing the negative log likelihood (NLL) value. Each ID in the sequence is converted to an integer and compared to a predetermined threshold to determine whether or not it is an attack. The bidirectional processing allows contextual information to be captured. The NLL value provides an estimate of the model’s confidence in its classification. The approach leverages the GPT network’s ability to learn patterns in the data and make accurate predictions.
The overall accuracy is 97.8% with bidirectional GPT. The GIDS system detects attacks on the on-board network using a two-step process [10] that converts uniquely encoded CAN data into CAN images. The first discriminator outputs a value between 0 and 1, and if it is above the threshold, the corresponding
images are fed to the second discriminator, which also outputs a value between 0 and 1. By combining both discriminators, the system detects known and unknown attacks with high accuracy. Al-Jarrah et al. [11] presented a multi-model approach that uses LSTM and ConvLSTM to classify attacks. LSTM was used to input table data, while ConvLSTM used a recursion graph. The combination of the two models increased the accuracy by 2%, resulting in an overall accuracy of 95.1%. In this section, accuracy is measured based on the source data set. In short, data was collected from a specific vehicle to train the model, and the same vehicle was used to test the model. Therefore, the model was not generalized and cannot be implemented in other vehicles unless additional data from other vehicles are added. Our proposed model aims to train the model with data from different vehicles and test it on different vehicle models. In this way, the model can be used universally. Specifically, we trained the model with data from BMW and Kia and tested it with Tesla.
4 Data Pre-processing and Deep Learning Architectures 4.1 Data Pre-processing In this study, we investigated different types of attacks, including Denial-of-Service (DoS) attacks, which can be categorized as low-speed attacks (L-DoS) or distributed attacks (D-DoS), and data injection attacks such as fuzzing and replay attacks. To collect data for these attacks, we used segments of 3–5 s. For DoS attacks, we injected 5000 and 1000 data points per segment, while for fuzzing attacks, we injected 100 and 500 data points per segment. For replay attacks, we injected two random data points. The results of our study shed light on the effectiveness of different intrusion detection models in accurately classifying these attacks and provide valuable insights for developing more robust security measures for in-vehicle networks. When labeling a dataset, simplicity is crucial. In our approach, we split the dataset at the beginning and end times of an attack and compute the time interval and the Hamming distance between the resulting equal-length hexadecimal strings [8]. By treating each symbol as a 4-bit binary number and counting the positions at which the symbols differ, we can determine the Hamming distance for this base-16 number system. This metric is suitable for error detection and correction, cryptography, and other data transmission and storage applications. We then segment the data into fixed-length segments, each containing 40 features, including CAN IDs, time interval, and Hamming distance for CAN ID. Using the scikit-learn library, we convert the hexadecimal CAN IDs into numeric values and scale them to a range between 0 and 1. The resulting segments are then converted into heatmaps that are input to deep learning models. Figure 1 shows the heatmap for different states of the CAN data, where (a) represents attack-free data and (b, c, d) represent fuzzing, DoS, and replay, respectively.
From this heatmap, it can be seen that attack-free and replay data have similarities, while fuzzing and DoS data have different characteristics.
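The pre-processing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the CAN IDs are made up, the helper names `hex_hamming`, `bit_hamming` and `min_max_scale` are hypothetical, and the scaling is written in plain Python instead of scikit-learn so that the block stays self-contained. Both a symbol-level and a bit-level distance are shown because the text does not state which of the two is used.

```python
def hex_hamming(a: str, b: str) -> int:
    """Count positions where two equal-length hex strings hold different symbols."""
    if len(a) != len(b):
        raise ValueError("hex strings must have equal length")
    return sum(x != y for x, y in zip(a.lower(), b.lower()))


def bit_hamming(a: str, b: str) -> int:
    """Count differing bits, treating every hex symbol as a 4-bit number."""
    if len(a) != len(b):
        raise ValueError("hex strings must have equal length")
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def min_max_scale(values):
    """Scale numbers linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


# Hypothetical CAN IDs in arrival order (assumption, not taken from the data set).
can_ids = ["0316", "018f", "0260", "02a0"]

numeric = [int(c, 16) for c in can_ids]   # hexadecimal ID -> integer
scaled = min_max_scale(numeric)           # integer -> value in [0, 1]
distances = [hex_hamming(p, q) for p, q in zip(can_ids, can_ids[1:])]

print(scaled)
print(distances)
```

Each consecutive pair of IDs yields one distance, so a segment of n messages gives n - 1 distance values; how the authors pad or align that column is not specified in the text.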
Md. R. Islam et al.
Fig. 1. Heat-map decomposition according to ID sequence, time gap and hamming distance.
4.2 Deep Learning Architectures
VGG-16 has a simple architecture with 13 convolutional layers and three fully connected layers, for 16 weight layers in total. It uses small 3 × 3 filters in all convolutional layers and has a fixed input size of 224 × 224. The VGG-16 architecture is known for its simplicity and accuracy in image classification tasks, but its depth and large number of parameters make it computationally intensive compared to other architectures [12].
AlexNet was the first CNN to win the ImageNet Challenge, in 2012. It has five convolutional layers followed by three fully connected layers. It uses larger kernel sizes of 11 × 11 in the first convolutional layer and 5 × 5 in the second, and 3 × 3 in the remaining layers. AlexNet also uses overlapping pooling layers to reduce the feature map size. This architecture achieved a significant improvement in accuracy compared to previous models [13].
ResNet-50 is a much deeper CNN architecture with 50 layers that use residual connections. The residual connections mitigate the vanishing gradient problem that can occur with very deep networks [14]. They also make it possible to train even deeper models with thousands of layers. ResNet-50 remains computationally efficient thanks to the use of skip connections, which allow the model to learn identity mappings. This makes it possible to use a deeper architecture with fewer parameters than previous models, resulting in higher accuracy.
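The role of the skip connection can be illustrated with a deliberately simplified residual block. The sketch below is an assumption-laden toy: it uses small fully connected maps in plain Python instead of the convolutional bottleneck blocks and batch normalization of the real ResNet-50, and the names `relu`, `linear` and `residual_block` are hypothetical. It only shows why the identity mapping is easy to represent: when the residual branch outputs zero, the block passes its input through unchanged.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights, v):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(v, w1, w2):
    """Return relu(F(v) + v), where F is linear -> ReLU -> linear."""
    branch = linear(w2, relu(linear(w1, v)))
    return relu([b + x for b, x in zip(branch, v)])


zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]

# With an all-zero residual branch the block reduces to the identity
# (for non-negative inputs, because of the final ReLU).
print(residual_block(x, zeros, zeros))
```

Because the branch only has to learn a correction to the input, adding more such blocks does not force the network to relearn the identity at every depth.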
Universal Intrusion Detection System on In-Vehicle Network
5 Result Evaluation
In this study, we implemented three well-known Convolutional Neural Network (CNN) models to determine their effectiveness in universally detecting intrusions into on-board networks. Specifically, we trained and tested these models with data from BMW and Kia vehicles, and then evaluated their performance on a Tesla vehicle with a different vehicle network architecture. Fuzzing and denial-of-service (DoS) attacks have unique characteristics that allow each model to accurately classify these types of attacks. However, the replay attack is more difficult to detect because the attacker uses the target’s own dataset for anomalous injections. In particular, the ResNet-50 model has demonstrated its reliability in this regard by achieving an accuracy rate of nearly 99% in classifying attack-free, fuzzing, DoS, and replay attack scenarios. In contrast, the VGG-16 and AlexNet models achieved overall accuracy rates of 88% and 93%, respectively. Our results are summarized in the attached confusion matrix (Fig. 2), Table 1 and Table 2. Overall, the results highlight the superiority of the ResNet-50 model in accurately detecting replay attacks, indicating its potential for effective intrusion detection in on-board networks.
Table 1. Accuracy scores according to classifying category for different architecture models.
Types     VGG-16   AlexNet   ResNet-50
Normal    0.78     0.86      0.99
Fuzz      0.93     0.98      1.00
DoS       0.99     1.00      1.00
Replay    0.18     0.39      0.97
Table 2. Overall Accuracy score for different architecture models.
Types      VGG-16   AlexNet   ResNet-50
Accuracy   0.88     0.93      0.99
ROC        0.862    0.889     0.994
Fig. 2. Performance evaluation confusion matrix between different architecture models.
6 Conclusion
The development of a universal in-vehicle network intrusion detection system that works for all vehicle types, including mechanical, hybrid, and electronic vehicles, is the focus of this study. The main challenge in developing such a system is data generalization, which was addressed using heatmaps. The prototype of the proposed system achieved high accuracy and fast response times, indicating the potential for a reliable and effective solution to improve vehicle network security. However, further experiments are needed to validate the performance of the system under different attack scenarios, including different levels of data injection. The study provides valuable insights for the development and implementation of robust in-vehicle network security measures. Overall, the proposed system has the potential to provide a more comprehensive and efficient solution for securing vehicle networks, contributing to the overall safety and security of the automotive industry.
Acknowledgments. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2021R1A4A2001810) and the Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded
by the Korean government (MSIT) (No. 2019-0-01343, Regional strategic industry convergence security core talent training business).
References
1. BOSCH CAN Specification Version 2.0 (1991)
2. Nardus, L., Miller, C., Valasek, C.: Remote Exploitation of an Unaltered Passenger Vehicle
3. Sunny, J., Sankaran, S., Saraswat, V.: A hybrid approach for fast anomaly detection in controller area networks. In: 2020 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS), pp. 1–6, December 2020. https://doi.org/10.1109/ANTS50601.2020.9342791
4. Islam, M.R., Oh, I., Yim, K.: CANTool an in-vehicle network data analyzer. In: 2022 International Conference on Information Technology Systems and Innovation (ICITSI), pp. 252–257, November 2022. https://doi.org/10.1109/ICITSI56531.2022.9970968
5. Zhu, K., Chen, Z., Peng, Y., Zhang, L.: Mobile edge assisted literal multi-dimensional anomaly detection of in-vehicle network using LSTM. IEEE Trans. Veh. Technol. 68(5), 4275–4284 (2019). https://doi.org/10.1109/TVT.2019.2907269
6. Hussain, S., Ali Imran, M., Yang, J., Hu, J., Yu, T.: Federated AI-enabled in-vehicle network intrusion detection for internet of vehicles. Electronics 11(22), 3658 (2022). https://doi.org/10.3390/ELECTRONICS11223658
7. Guo, X., Gao, L., Liu, X., Yin, J.: Improved deep embedded clustering with local structure preservation. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, pp. 1753–1759, August 2017. https://doi.org/10.24963/ijcai.2017/243
8. Narasimhan, H., Ravi, V., Mohammad, N.: Unsupervised deep learning approach for in-vehicle intrusion detection system. IEEE Consum. Electron. Mag. 12(1), 103–108 (2023). https://doi.org/10.1109/MCE.2021.3116923
9. Nam, M., Park, S., Kim, D.S.: Intrusion detection method using bi-directional GPT for in-vehicle controller area networks. IEEE Access 9, 124931–124944 (2021). https://doi.org/10.1109/ACCESS.2021.3110524
10. Seo, E., Song, H.M., Kim, H.K.: GIDS: GAN based intrusion detection system for in-vehicle network (2018). https://doi.org/10.1109/PST.2018.8514157
11. Al-Jarrah, O.Y., El Haloui, K., Dianati, M., Maple, C.: A novel detection approach of unknown cyber-attacks for intra-vehicle networks using recurrence plots and neural networks. IEEE Open J. Veh. Technol. 4, 271–280 (2023). https://doi.org/10.1109/OJVT.2023.3237802
12. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: 3rd International Conference on Learning Representations ICLR 2015 - Conference Track Proceedings, September 2014. http://arxiv.org/abs/1409.1556
13. Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and
, which show the 2D location point and communicable distance. An arc existing between nodes u and v if v indicate that the nodes can communicate with each other.
3 Intelligent Algorithms
3.1 Particle Swarm Optimization
The PSO algorithm uses a swarm of particles placed in a search space to find better solutions. The location of each particle represents a candidate solution [9]. The goal is to minimize the fitness function, and the movement of particles is used to explore the search space. Each particle moves in the search space by considering its current and best-fitness locations, along with those of other particles in the swarm. An iteration ends once all particles have moved, and then the next iteration starts. The swarm’s movement converges towards an optimum of the fitness function, resembling a flock of birds searching for food. Each particle in the swarm is composed of three D-dimensional vectors: the current position, the previous best position and the particle velocity. The interaction and cooperation among particles are essential for the swarm to converge to a good solution. The population topology of the swarm often resembles a social network, with bidirectional edges connecting pairs of particles in the neighbourhood. The best position found by the particles in the neighbourhood, represented by the vector pg, influences the movement of each particle. During the PSO process, the velocity of each particle is adjusted iteratively to guide the particle towards its best position pi and the neighbourhood’s best position pg. PSO can be viewed as a stochastic search algorithm that uses swarm intelligence to explore and exploit the search space.
3.2 Hill Climbing
Hill climbing (HC) is a heuristic algorithm based on a simple idea: the current solution is replaced with the next solution if the fitness value of the next solution is better than or equal to that of the current solution. The decision is made by computing the difference δ between the fitness values of the next and current solutions, δ = f(s') − f(s), where f is the fitness function, s is the current solution and s' is the next solution. The effectiveness of the HC algorithm depends on how neighbour solutions are defined, because this determines the quality and quantity of the candidate solutions generated by the algorithm. Our WMN-PSOHC system defines the neighbour solutions as the next positions of the PSO iterations.
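The acceptance rule can be sketched as a short loop. This is a generic illustration, not the WMN-PSOHC code: it assumes the fitness is maximized (so "better than or equal" means δ ≥ 0), and the objective, the neighbour function and the names `hill_climb`, `toy_fitness` and `toy_neighbour` are hypothetical stand-ins for the PSO-generated next positions described above.

```python
import random


def hill_climb(initial, fitness, neighbour, steps, rng):
    """Keep a candidate only if delta = f(s') - f(s) is non-negative."""
    current = initial
    for _ in range(steps):
        candidate = neighbour(current, rng)
        delta = fitness(candidate) - fitness(current)
        if delta >= 0:          # better than or equal: accept the move
            current = candidate
    return current


def toy_fitness(x):
    """Hypothetical objective with a single maximum at x = 3."""
    return -(x - 3.0) ** 2


def toy_neighbour(x, rng):
    """Hypothetical neighbourhood: a small random step left or right."""
    return x + rng.uniform(-0.1, 0.1)


best = hill_climb(0.0, toy_fitness, toy_neighbour, steps=500, rng=random.Random(0))
print(best, toy_fitness(best))
```

Since worse candidates are never accepted, the fitness of the current solution cannot decrease, which is also why plain HC can stall at a local optimum.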
Assessment of FC-RDVM and LDIWM Router Replacement Methods
Fig. 1. Chi-Square distribution of mesh clients.
4 WMN-PSOHC Hybrid Simulation System
Our proposed system starts by generating an initial solution using randomized methods [20]. Particle velocity is determined randomly based on the size of the considered area (W × H). For example, the velocity of particles can be generated randomly between $-\sqrt{W^2 + H^2}$ and $\sqrt{W^2 + H^2}$. There are different distributions of mesh clients. In this study, we consider the Chi-Square distribution, as shown in Fig. 1. We show the flowchart of WMN-PSOHC in Fig. 2 and explain the particle-pattern, fitness function and router replacement methods in the following.
Particle-pattern
In our proposed system, a particle represents a mesh router. The fitness value of a given particle-pattern is computed based on the combination of mesh router and mesh client positions. In other words, each particle-pattern represents a solution as shown in Fig. 3. As a result, the number of particle-patterns equals the number of potential solutions.
Fitness Function
The fitness function and its encoding are crucial in any optimization problem. In our proposed system, we consider a hierarchical fitness function by maximizing first the Size of Giant Component (SGC) and then the Number of Covered Mesh Clients (NCMC) in the WMN graph. To achieve this, we assign weight coefficients α and β to the fitness function, which is defined as the sum of the product of α and SGC and the product of β and NCMC. The fitness function is defined as:
Fitness = α × SGC(WMN) + β × NCMC(WMN)
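A small sketch may help make the two terms concrete. It rests on assumptions that this section does not confirm: two routers are taken to be linked when their coverage circles overlap, a client counts as covered when it lies inside at least one circle, and SGC and NCMC enter the sum as raw counts without normalization. The function names and the sample coordinates are hypothetical.

```python
import math
from collections import deque


def giant_component_size(routers):
    """Size of the largest connected group of routers given as (x, y, radius).

    Assumed link rule: two routers are connected if their circles overlap.
    """
    n = len(routers)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:                      # breadth-first search
            u = queue.popleft()
            size += 1
            ux, uy, ur = routers[u]
            for v in range(n):
                if not seen[v]:
                    vx, vy, vr = routers[v]
                    if math.hypot(ux - vx, uy - vy) <= ur + vr:
                        seen[v] = True
                        queue.append(v)
        best = max(best, size)
    return best


def covered_clients(routers, clients):
    """Number of clients (x, y) lying inside at least one router circle."""
    return sum(
        any(math.hypot(cx - rx, cy - ry) <= rr for rx, ry, rr in routers)
        for cx, cy in clients
    )


def fitness(routers, clients, alpha=0.7, beta=0.3):
    """Weighted sum alpha * SGC + beta * NCMC on raw counts (assumption)."""
    return alpha * giant_component_size(routers) + beta * covered_clients(routers, clients)


# Hypothetical placement: two linked routers and one isolated router.
routers = [(0.0, 0.0, 2.0), (3.0, 0.0, 2.0), (20.0, 20.0, 2.0)]
clients = [(1.0, 0.0), (3.0, 1.0), (10.0, 10.0)]
print(giant_component_size(routers), covered_clients(routers, clients))
print(fitness(routers, clients))
```

With α larger than β, a move that enlarges the giant component is rewarded more than one that only covers an extra client, which matches the hierarchical intent stated above.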
S. Sakamoto et al.
Fig. 2. WMN-PSOHC flowchart.
Fig. 3. Relationship among global solution, particle-patterns and mesh routers.
Router Replacement Methods
Mesh routers have their own x and y coordinates and a velocity used to move them. Various router replacement techniques have been proposed in Particle Swarm Optimization (PSO) research, including methods proposed by Clerc et al. [4], Shi and Eberhart [15,16], and Schutte et al. [14]. We compare two specific replacement methods: the Linearly Decreasing Inertia Weight Method (LDIWM) and the Fast Convergence Rational Decrement of Vmax Method (FC-RDVM). In LDIWM, C1 and C2 are constant values and are set to 2.0. The ω parameter is changed linearly from the unstable region (ω = 0.9) to the stable region (ω = 0.4) as the number of iterations increases [3,16]. In FC-RDVM, Vmax decreases as the number of iterations increases, as shown in Eq. (1).
$V_{max}(k) = \sqrt{W^2 + H^2} \times \frac{T - k}{T + \gamma k}$ (1)
Table 1. Parameter settings.
Parameters                                       Values
Clients Distribution                             Chi-Square Distribution
Area Size                                        32 × 32
Number of Mesh Routers                           16
Number of Mesh Clients                           48
Total Iterations                                 800
Iteration per Phase                              4
Number of Particle-patterns                      9
Radius of a Mesh Router                          From 2.0 to 3.0
Fitness Function Weight-coefficients (α, β)      0.7, 0.3
Curvature parameter (γ)                          10.0
Replacement Methods                              LDIWM, FC-RDVM
where W and H are the width and the height of the considered area, respectively, while T and k are the total number of iterations and the current iteration number, respectively. The variable k increases from 1 to T as the iterations proceed, and γ is the curvature parameter.
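The two replacement methods can be sketched in a few lines. This is a hedged illustration, not the simulator's code: the exact interpolation used for ω in LDIWM is not given in the text, so a straight line from 0.9 at k = 1 to 0.4 at k = T is assumed, and the velocity update shown is the standard PSO form with Vmax clamping. The function names and the sample values passed to `update_velocity` are hypothetical; T = 800, the 32 × 32 area and γ = 10 follow Table 1.

```python
import math
import random


def ldiwm_omega(k, total, w_start=0.9, w_end=0.4):
    """Inertia weight falling linearly from w_start (k = 1) to w_end (k = total).

    Assumes total > 1; the linear form is an assumption.
    """
    return w_start - (w_start - w_end) * (k - 1) / (total - 1)


def fc_rdvm_vmax(k, total, width, height, gamma):
    """Velocity bound of Eq. (1): sqrt(W^2 + H^2) * (T - k) / (T + gamma * k)."""
    return math.hypot(width, height) * (total - k) / (total + gamma * k)


def update_velocity(v, x, p_best, g_best, omega, c1, c2, vmax, rng):
    """Standard PSO update for one coordinate, clamped to [-vmax, vmax]."""
    new_v = (omega * v
             + c1 * rng.random() * (p_best - x)
             + c2 * rng.random() * (g_best - x))
    return max(-vmax, min(vmax, new_v))


T, W, H, GAMMA = 800, 32.0, 32.0, 10.0
rng = random.Random(1)
for k in (1, 200, 800):
    omega = ldiwm_omega(k, T)
    vmax = fc_rdvm_vmax(k, T, W, H, GAMMA)
    v = update_velocity(5.0, 0.0, 10.0, 12.0, omega, 2.0, 2.0, vmax, rng)
    print(k, round(omega, 3), round(vmax, 3), round(v, 3))
```

At k = T the bound of Eq. (1) reaches zero, so the routers stop moving, whereas LDIWM only damps the previous velocity and never forces it to zero.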
5 Simulation Results
In this section, we present the simulation results of the WMN-PSOHC hybrid intelligent system. We consider Chi-Square distribution of mesh clients and 9 particle-patterns. The parameter settings for simulations are shown in Table 1. The simulation results are shown in Fig. 4 and Fig. 5. For the SGC, it can be observed that for both replacement methods SGC is 100% as shown in Fig. 4. However, the FC-RDVM converges faster compared with LDIWM. Regarding NCMC, Fig. 5 shows that FC-RDVM covers more mesh clients than LDIWM. From the simulation results, we conclude that FC-RDVM has better performance than LDIWM for the simulated scenario.
Fig. 4. Simulation results of WMN-PSOHC for SGC.
Fig. 5. Simulation results of WMN-PSOHC for NCMC.
6 Conclusions
In this work, we presented the WMN-PSOHC hybrid simulation system and introduced the FC-RDVM mesh router replacement method. A comparison study between FC-RDVM and LDIWM was conducted considering the Chi-Square distribution of mesh clients. We evaluated FC-RDVM and LDIWM by computer simulations. The simulation results show that for SGC both replacement methods achieved 100% connectivity. However, FC-RDVM converges faster than LDIWM. Regarding NCMC, FC-RDVM also covers more mesh clients than LDIWM. From the simulation results, we found that FC-RDVM performs better than LDIWM. We will consider other parameters and scenarios in our future work.
References
1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
2. Amaldi, E., Capone, A., Cesana, M., Filippini, I., Malucelli, F.: Optimization models and methods for planning wireless mesh networks. Comput. Netw. 52(11), 2159–2171 (2008)
3. Barolli, A., Sakamoto, S., Ohara, S., Barolli, L., Takizawa, M.: Performance analysis of WMNs by WMN-PSOHC-DGA simulation system considering linearly decreasing inertia weight and linearly decreasing Vmax replacement methods. In: Barolli, L., Nishino, H., Miwa, H. (eds.) INCoS 2019. AISC, vol. 1035, pp. 14–23. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29035-1_2
4. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
5. Franklin, A.A., Murthy, C.S.R.: Node placement algorithm for deployment of two-tier wireless mesh networks. In: Proceedings of the Global Telecommunications Conference, pp. 4823–4827 (2007)
6. Islam, M.M., Funabiki, N., Sudibyo, R.W., Munene, K.I., Kao, W.C.: A dynamic access-point transmission power minimization method using PI feedback control in elastic WLAN system for IoT applications. Internet Things 8(100), 089 (2019)
7. Muthaiah, S.N., Rosenberg, C.P.: Single gateway placement in wireless mesh networks. In: Proceedings of the 8th International IEEE Symposium on Computer Networks, pp. 4754–4759 (2008)
8. Oda, T.: A Delaunay edges and simulated annealing-based integrated approach for mesh router placement optimization in wireless mesh networks. Sensors 23(3), 1050 (2023)
9. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007)
10. Sakamoto, S., Lala, A., Oda, T., Kolici, V., Barolli, L., Xhafa, F.: Analysis of WMN-HC simulation system data using Friedman test. In: The Ninth International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2015), pp. 254–259. IEEE (2015)
11. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evaluation of a simulation system based on particle swarm optimisation for node placement problem in wireless mesh networks. Int. J. Commun. Netw. Distrib. Syst. 17(1), 1–13 (2016)
12. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mob. Netw. Appl. 23(1), 27–33 (2018)
13. Sakamoto, S., Barolli, L., Okamoto, S.: A comparison study of linearly decreasing inertia weight method and rational decrement of Vmax method for WMNs using WMN-PSOHC intelligent system considering normal distribution of mesh clients. In: Barolli, L., Natwichai, J., Enokido, T. (eds.) EIDWT 2021. LNDECT, vol. 65, pp. 104–113. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-70639-5_10
14. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Glob. Optim. 31(1), 93–108 (2005)
15. Shi, Y.: Particle swarm optimization. IEEE Connect. 2(1), 8–13 (2004)
16. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Porto, V.W., Saravanan, N., Waagen, D., Eiben, A.E. (eds.) EP 1998. LNCS, vol. 1447, pp. 591–600. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0040810
17. Taleb, S.M., Meraihi, Y., Gabis, A.B., Mirjalili, S., Ramdane-Cherif, A.: Nodes placement in wireless mesh networks using optimization approaches: a survey. Neural Comput. Appl. 34(7), 5283–5319 (2022)
18. Vanhatupa, T., Hannikainen, M., Hamalainen, T.: Genetic algorithm to optimize node placement and configuration for WLAN planning. In: Proceedings of the 4th IEEE International Symposium on Wireless Communication Systems, pp. 612–616 (2007)
19. Wzorek, M., Berger, C., Doherty, P.: Router and gateway node placement in wireless mesh networks for emergency rescue scenarios. Auton. Intell. Syst. 1(1), 1–30 (2021). https://doi.org/10.1007/s43684-021-00012-0
20. Xhafa, F., Sanchez, C., Barolli, L.: Ad hoc and neighborhood search methods for placement of mesh routers in wireless mesh networks. In: Proceedings of the 29th IEEE International Conference on Distributed Computing Systems Workshops (ICDCS-2009), pp. 400–405 (2009)
Implementation of FC-RDVM in WMN-PSOHCDGA System Considering Two Islands Distribution of Mesh Clients: A Comparison Study of FC-RDVM and RDVM Methods for Small Scale and Middle Scale WMNs
Leonard Barolli1(B), Shinji Sakamoto2, Admir Barolli3, and Evjola Spaho4
1 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
2 Department of Information and Computer Science, Kanazawa Institute of Technology, 7-1 Ohgigaoka Nonoichi, Ishikawa 921-8501, Japan
3 Department of Information Technology, Aleksander Moisiu University of Durres, L.1, Rruga e Currilave, Durres, Albania
4 Department of Electronics and Telecommunication, Faculty of Information Technology, Polytechnic University of Tirana, Mother Teresa Square, No. 4, Tirana, Albania
Abstract. In this paper, we present WMN-PSOHCDGA hybrid simulation system for optimization of mesh routers in Wireless Mesh Networks (WMNs). We implemented FC-RDVM in WMN-PSOHCDGA. We consider Two Islands distribution of mesh clients and carry out a comparison study of FC-RDVM and RDVM router replacement methods for small scale and middle scale WMNs. Both methods have a good performance for connectivity and coverage metrics. However, they have different load balancing. By simulation results, we found that FC-RDVM has better load balancing for middle scale WMNs compared with RDVM.
1 Introduction
Wireless Mesh Networks (WMNs) are cost-effective and have good scalability, fault tolerance and load distribution. They can be used as last-mile networks, as edge networks and in Internet of Things (IoT) applications. The node placement problem is very important in WMNs, but it is very complex because the optimization process must consider different parameters such as mesh router connectivity, mesh client coverage, Quality of Service (QoS), network cost and so on. The mesh router nodes must be deployed in good locations in order to have optimal network connectivity, good client coverage and good load balancing of mesh routers.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 170–178, 2023. https://doi.org/10.1007/978-3-031-35836-4_19
WMN-PSOHCDGA Intelligent Simulation System for Node Placement in WMNs
For the optimization process, we consider three parameters: Size of Giant Component (SGC), Number of Covered Mesh Clients (NCMC) and Number of Covered Mesh Clients per Router (NCMCpR). Mesh node placement in WMNs is computationally hard to solve for most formulations [2, 4–6, 12, 13]. For this reason, in some previous works, the authors proposed and implemented intelligent algorithms [1, 3, 7, 8]. In [9, 10], we implemented intelligent simulation systems for WMNs considering simple heuristic algorithms. In this paper, we present a new hybrid intelligent system for Wireless Mesh Networks (WMNs) called WMN-PSOHCDGA, which integrates three intelligent algorithms: Particle Swarm Optimization (PSO), Hill Climbing (HC) and Distributed Genetic Algorithm (DGA). We implemented the Fast Convergence Rational Decrement of Vmax Method (FC-RDVM) in the WMN-PSOHCDGA system and we compare the performance of FC-RDVM with the Rational Decrement of Vmax Method (RDVM) considering the Two Islands distribution of mesh clients and different instances (different scales of WMNs). The simulation results show that FC-RDVM has better performance and load balancing than RDVM for middle scale WMNs. The paper is organized as follows. We introduce intelligent algorithms in Sect. 2. Section 3 presents the implemented WMN-PSOHCDGA system. The simulation results are given in Sect. 4. Finally, we give conclusions and future work in Sect. 5.
2 Intelligent Algorithms for WMN-PSOHCDGA System
2.1 Particle Swarm Optimization
PSO is a search algorithm that uses a swarm of particles to find solutions for a given problem. It is a population-based algorithm that uses random numbers, and its performance depends on the quantity and quality of the generated numbers. The particles in the swarm are moved according to rules that consider the previous and the best known positions. The initial positions of particles are generated randomly in the search space at the beginning of the search. Then, after every iteration, the position and velocity of each particle are changed in order to move towards the desired location. By weighting the acceleration coefficients, the efficiency of local search and convergence to the global optimum solution can be improved.
2.2 Hill Climbing Algorithm
The HC algorithm is a simple optimization algorithm and belongs to the family of local search algorithms that find the best solution from a set of possible solutions. The HC algorithm can be easily implemented and can be used in a wide variety of optimization problems. It is often very efficient for finding local optima and can be used for problems where a good solution is needed quickly. The HC algorithm can be easily modified and extended. However, it can get stuck in local optima and may not find the global optimum. The performance of the HC algorithm depends on the initial solution, which means that a poor initial solution may result in a poor final solution.
L. Barolli et al.
Fig. 1. Model of Migration in DGA.
2.3 Genetic Algorithm
Genetic Algorithm (GA) is a robust optimization technique that searches through a population of individuals and can operate on various representations. However, the design of genetic operators can be challenging. The main GA operators are Selection, Crossover and Mutation. GA has been successfully applied to various optimization problems, but it is computationally expensive and time-consuming. In our research work, we use Distributed GA (DGA), which has an additional mechanism to escape from local optima by considering multiple islands. The migration process of DGA is shown in Fig. 1.
2.4 Comparison of GA, PSO and HC
Both GA and PSO use diversification mechanisms to explore the search space. For exploration and exploitation, they combine stochastic and deterministic tasks. However, compared with GA, PSO does not need operators such as selection, crossover and mutation. PSO uses only primitive and simple mathematical operators. Thus, it is computationally inexpensive in terms of memory and runtime compared with GA. These two algorithms differ from each other in the implemented mechanisms. GA focuses more on approaches derived from elitism, while PSO considers collaborative behavior. The HC algorithm does not explore the search space very thoroughly, which can limit its ability to find better solutions. It may be less effective than other optimization algorithms, such as GA or PSO, for certain types of problems. GA, PSO and HC are interesting optimization algorithms, but they suffer from some disadvantages which limit their usage to only a few problems. Thus, a combination of these algorithms can improve the overall performance, and a hybrid algorithm is a good approach for different applications.
Fig. 2. Flowchart of WMN-PSOHCDGA system.
3 WMN-PSOHCDGA System Description
For the implementation of the WMN-PSOHCDGA hybrid intelligent simulation system, we consider the integration of three intelligent algorithms: PSO, HC and DGA, in order to improve the convergence and the solution quality of the proposed system. The flowchart of the proposed system is shown in Fig. 2. In the following, we explain the initialization, particle-pattern, fitness function, distribution of mesh clients and replacement methods. Our proposed system generates the initial solution randomly by using ad hoc methods, and the velocity of particles is determined by a random process considering the area size [14]. We consider a particle as a mesh router and the fitness value of a particle-pattern is computed by considering the positions of mesh routers and mesh clients. The solution for
Fig. 3. Relationship among global solution, particle-patterns, and mesh routers in PSO part.
Fig. 4. Two islands distribution of mesh clients.
each particle-pattern is shown in Fig. 3. Each individual in the population is a combination of mesh routers and a WMN is represented by a gene. In WMN-PSOHCDGA, we use the following fitness function:
$Fitness = \alpha \times SGC(x_{ij}, y_{ij}) + \beta \times NCMC(x_{ij}, y_{ij}) + \gamma \times NCMCpR(x_{ij}, y_{ij})$.
In the fitness function, SGC is the maximum number of connected routers, NCMC is the number of mesh clients covered by mesh routers and NCMCpR is the number of clients covered by each router, which is used for load balancing. In this work, we consider the Two Islands distribution of mesh clients. The Two Islands distribution considers users located as shown in Fig. 4. The mesh routers move according to their velocities. There are many replacement methods for mesh routers. In this paper, we consider RDVM and FC-RDVM. In RDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0) and Vmax decreases as the number of iterations increases, as shown in Eq. (1).
$V_{max}(x) = \sqrt{W^2 + H^2} \times \frac{T - x}{x}$ (1)
where W and H are the width and the height of the considered area, while T and x are the total number of iterations and the current iteration number, respectively.
In FC-RDVM [11], Vmax is the maximum velocity, which decreases as the number of iterations increases, as shown in Eq. (2).
$V_{max}(k) = \sqrt{W^2 + H^2} \times \frac{T - k}{T + \delta k}$ (2)
where W and H are the width and height of the considered area, while T and k are the total number of iterations and the current iteration number, respectively. The variable k varies from 1 to T and δ is the curvature parameter.
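The difference between the two schedules is easiest to see numerically. The sketch below evaluates Eq. (1) and Eq. (2) side by side; the values of W, H, T and δ are hypothetical and chosen only for illustration, since this paper's parameter table does not list them, and the function names are not taken from the simulator.

```python
import math


def rdvm_vmax(x, total, width, height):
    """RDVM bound, Eq. (1): sqrt(W^2 + H^2) * (T - x) / x, for 1 <= x <= T."""
    return math.hypot(width, height) * (total - x) / x


def fc_rdvm_vmax(k, total, width, height, delta):
    """FC-RDVM bound, Eq. (2): sqrt(W^2 + H^2) * (T - k) / (T + delta * k)."""
    return math.hypot(width, height) * (total - k) / (total + delta * k)


# Hypothetical settings (assumption).
W, H, T, DELTA = 32.0, 32.0, 800, 10.0
diagonal = math.hypot(W, H)

print("area diagonal:", round(diagonal, 3))
for step in (1, 100, 400, 800):
    print(step,
          round(rdvm_vmax(step, T, W, H), 3),
          round(fc_rdvm_vmax(step, T, W, H, DELTA), 3))
```

Under these assumptions the RDVM bound starts far above the area diagonal, so early moves are effectively unconstrained, while the FC-RDVM bound starts just below the diagonal and shrinks faster for larger δ. Both bounds reach zero at the last iteration.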
4 Simulation Results
In this section, we present the simulation results for FC-RDVM and RDVM. The coefficients of the fitness function are set as α = 0.8, β = 0.1, and γ = 0.1, and the other parameters used for simulations are shown in Table 1. The visualization results for small scale WMNs are shown in Fig. 5, with Fig. 5(a) and Fig. 5(b) illustrating the results for RDVM and FC-RDVM, respectively. Both replacement methods show good performance because all mesh routers are connected and all mesh clients are covered. However, we see a concentration of mesh routers in some areas. The visualization results for middle scale WMNs are shown in Fig. 6, with Fig. 6(a) and Fig. 6(b) illustrating the results for RDVM and FC-RDVM, respectively. For middle scale WMNs, both replacement methods also show good performance for SGC and NCMC. However, compared with Fig. 5, the mesh routers are better distributed. In Fig. 7 and Fig. 8, we show the standard deviation, regression line and correlation coefficient for small scale and middle scale WMNs, respectively, where r is the correlation coefficient. When the regression line of the standard deviation is decreasing, the load balancing among routers is better. We can see that the router replacement methods show different performance on load balancing.
Table 1. Simulation parameters.
Parameters                  Small Scale WMN             Middle Scale WMN
α : β : γ                   8 : 1 : 1                   8 : 1 : 1
Number of GA Islands        16                          16
Evolution Steps             9                           9
Number of Migrations        300                         300
Number of Mesh Routers      16                          32
Number of Mesh Clients      48                          96
Mesh Client Distribution    Two Islands Distribution    Two Islands Distribution
Selection Method            Roulette Selection Method   Roulette Selection Method
Crossover Method            UNDX                        UNDX
Mutation Method             Boundary Mutation           Boundary Mutation
Fig. 5. Visualization results after optimization (Small Scale WMN).
Fig. 6. Visualization results after optimization (Middle Scale WMN).
Fig. 7. Standard deviation, regression line and correlation coefficient (Small Scale WMN).
Fig. 8. Standard deviation, regression line and correlation coefficient (Middle Scale WMN).
The load balancing for small scale WMNs is not good because, as shown in Fig. 5, there is a concentration of mesh routers. For middle scale WMNs, the regression line of the standard deviation is increasing for RDVM, while it is decreasing for FC-RDVM, which shows that FC-RDVM has better load balancing than RDVM.
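The load-balancing indicator can be sketched as follows, under the assumption that the plotted quantity is the standard deviation of the number of clients covered per router, tracked over successive updates and then fitted with a least-squares line. The per-router counts are invented for illustration, and `slope_and_r` is a hypothetical helper written out by hand so that the block depends only on long-standing parts of the standard library.

```python
import math
from statistics import pstdev


def slope_and_r(ys):
    """Least-squares slope and Pearson correlation of ys against 0, 1, 2, ...

    Requires at least two points and a non-constant series.
    """
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Hypothetical clients-per-router counts at three successive updates.
loads_per_update = [
    [6, 0, 0],   # one router serves everyone
    [4, 1, 1],
    [2, 2, 2],   # perfectly balanced
]

deviations = [pstdev(loads) for loads in loads_per_update]
slope, r = slope_and_r(deviations)
print(deviations)
print(slope, r)
```

A negative slope means the spread of load across routers shrinks as the optimization proceeds, which is the behaviour the text reports for FC-RDVM on middle scale WMNs.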
5 Conclusions
In this work, we evaluated the performance of RDVM and FC-RDVM for WMNs using a hybrid simulation system based on PSO, HC and DGA (called WMN-PSOHCDGA). We compared the simulation results of the FC-RDVM and RDVM router replacement methods considering the Two Islands distribution of mesh clients and different scales of WMNs. From the simulation results, we found that both methods achieve high network connectivity and good user coverage, but FC-RDVM has better load balancing than RDVM for middle scale WMNs. In future work, we will consider other parameters and different mesh router replacement methods.
A Comparison Study of FBR and FBRD Protocols for Underwater Optical Wireless Communication Using Transporter Autonomous Underwater Vehicles
Keita Matsuo1(B), Elis Kulla2, and Leonard Barolli1
1 Department of Information and Communication Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan {kt-matsuo,barolli}@fit.ac.jp
2 Department of System Management, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan [email protected]
Abstract. Recently, underwater communication technology has been developing in many ways. Because signals are affected by a number of circumstances, communication interruptions are the fundamental problem in underwater communication. In this paper, we introduce transporter nodes, which move linearly along a particular Horizontal (H) or Vertical (V) line and support message forwarding in a Delay Tolerant Network (DTN) using the Focused Beam Routing considering node Direction (FBRD) protocol for Underwater Optical Wireless Communication (UOWC). We consider Horizontal Transporter (HT) and Vertical Transporter (VT) cases. From the simulation results, we found that for low angles the delivery probability of FBRD increases and its overhead ratio is low. These conditions are very suitable for UOWC, because a narrow angle increases the communication distance. The simulation results also show that the overhead ratio of FBRD is lower than that of FBR over the range from 0° to 360°, which can reduce the power consumption of communication in the underwater environment.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 179–188, 2023. https://doi.org/10.1007/978-3-031-35836-4_20
1 Introduction
Wired Communication (WC), Underwater Acoustic Communication (UAC), Underwater Radio Wave Wireless Communication (URWC), and Underwater Optical Wireless Communication (UOWC) are some of the modern technologies employed for communication in an underwater environment. Due to the limited development of underwater wireless communications and the high cost of hydrophones and other equipment, underwater communication is currently carried out via communication cables [2].
A common method for transmitting messages from sensors, robots and submarines across greater distances is the use of sound signals in water. However, the bandwidth is constrained and the transmission speed is low. Additionally, the temperature, depth and salinity of the undersea environment all affect the sound speed, which therefore varies in an underwater environment [1].
In URWC, radio waves are used for communication. They can achieve a high data rate over a short communication range, but are hampered by the Doppler effect [2]. In [6], the authors show that at 2.4 GHz the signal absorption losses are about 9 dB to 19 dB in freshwater, 19 dB to 24 dB in river water and 25 dB to 30 dB in sea water. The losses in sea water are higher than in the other types of water, so it is not easy to use high frequency radio waves underwater. For example, with high frequency radio waves the communication distance in water is limited to a few centimeters.
Since UOWC uses visible light, longer underwater communication distances are possible. Recently, visible light communication in water has reached distances of over 100 m at 20 Mbps [4,5]. Due to the rapid development of Laser Diodes (LDs) and Light Emitting Diodes (LEDs), much lighting equipment is switching from traditional light sources to LEDs. The primary advantages of LEDs are their high energy efficiency and fast response, which are adequate for achieving high speed communication in UOWC and make the implementation of a broadband underwater network possible.
Because underwater communication links are affected by many circumstances, they are unstable. The Delay Tolerant Network (DTN) is therefore considered a technology that performs well in an underwater environment. UOWC uses blue LED signals, because blue light has lower attenuation underwater than other colors (see Fig. 1).
In this paper, we present transporter nodes, which follow a predetermined Horizontal (H) or Vertical (V) line and can increase message delivery in DTN. We carry out a comparison study of the Focused Beam Routing (FBR) protocol with FBR considering node Direction (FBRD). From the simulation results, we found that for low angles the delivery probability of FBRD increases and its overhead ratio is low. These conditions are very suitable for UOWC, because a narrow angle increases the communication distance. The simulation results also show that the overhead ratio of FBRD is lower than that of FBR over the range from 0° to 360°, which can reduce the power consumption of communication in the underwater environment.
The remainder of this paper is organized as follows. We present the FBR and FBRD protocols in Sect. 2. We explain the implementation of FBR and FBRD in “The ONE Simulator” in Sect. 3. We present the transporter AUVs and the simulation scenario for comparing the FBR and FBRD protocols in Sect. 4. We show the simulation results in Sect. 5. Section 6 provides conclusions and future work.
Fig. 1. Optical attenuation of different colors.
2 FBR and FBRD Protocols
Figure 2 shows an image of UOWC using DTNs. Autonomous Underwater Vehicles (AUVs) can use the store-carry-and-forward paradigm of DTNs to communicate. AUVs may deliver data to surface stations, which can subsequently interact with a surface sink. A variety of applications may also be created by utilizing UOWC. If communication between nodes (AUVs, submarines and surface stations) becomes intermittent, the nodes can convey the data over DTNs using FBR. The communication distance is longer if the UOWC signal is emitted as a narrow beam of visible light, and shorter if the beam is wide.
Figure 3 illustrates the FBR protocol for UOWC. With a 1° angle, the signals can be transmitted over a longer distance. With a 30° angle, the communication distance may be shorter, but the signals can reach more nodes. With a 360° angle, the sender node emits the signal omnidirectionally, which is similar to the Epidemic protocol. In that case the sender distributes the signal to many recipient nodes, but the transmission distance is at its minimum.
Figure 4(a) shows the FBR protocol and Fig. 4(b) shows the FBRD protocol. The FBR protocol computes the angle between the destination receiver and the sender, and then sets both angles θ1 and θ2 as the FBR angles. θ1 and θ2 are usually the same. In Fig. 4(a), receiver nodes 1, 5 and 6 are within the communication range of the sender and might receive the message from the sender node, while receiver nodes 2, 3, 4 and 7 may not receive it.
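As a rough illustration of the beam test described above, the following sketch checks whether a candidate receiver lies inside the focused beam of a sender. It is written under several assumptions that the text does not confirm: a 2D plane, a beam that opens by θ1 on one side and θ2 on the other side of the sender-to-destination line, and a fixed transmission range. The function name in_fbr_beam and all coordinates are hypothetical and are not taken from the authors' implementation.

```python
import math

def in_fbr_beam(sender, destination, candidate, theta1_deg, theta2_deg, tx_range):
    """Return True if candidate is within tx_range of sender and inside the
    beam around the sender->destination direction (theta1 on the
    counter-clockwise side, theta2 on the clockwise side)."""
    dx, dy = candidate[0] - sender[0], candidate[1] - sender[1]
    if math.hypot(dx, dy) > tx_range:
        return False
    to_dest = math.atan2(destination[1] - sender[1], destination[0] - sender[0])
    to_cand = math.atan2(dy, dx)
    # Signed angular offset, normalized to [-180, 180) degrees.
    offset = math.degrees(to_cand - to_dest)
    offset = (offset + 180.0) % 360.0 - 180.0
    return -theta2_deg <= offset <= theta1_deg

sender, destination = (0.0, 0.0), (0.0, 100.0)
print(in_fbr_beam(sender, destination, (5.0, 20.0), 15, 15, 30.0))    # close to the beam axis
print(in_fbr_beam(sender, destination, (20.0, 5.0), 15, 15, 30.0))    # far off the beam axis
print(in_fbr_beam(sender, destination, (20.0, 5.0), 180, 180, 30.0))  # omnidirectional case
```

With θ1 = θ2 = 180° every node in range passes the test, which corresponds to the omnidirectional, Epidemic-like case mentioned above.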
Fig. 2. Image of UOWC using DTNs.
Fig. 3. Image of FBR protocol for UOWC.
The FBRD protocol additionally considers the moving direction of each node. When a node is within the communication range of the sender but is moving downwards, the sender node might not forward messages to that node. In Fig. 4(b), nodes 1, 5 and 6 are moving within the communication range of the sender, but only node 5 can receive the messages from the sender.
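The direction rule can be sketched as an extra filter applied to the nodes that are already inside the beam. The exact criterion is not given in the text; the sketch below assumes one plausible choice, namely that a candidate is accepted only if its velocity has a positive component toward the destination. The names moves_toward and fbrd_select, the coordinate convention (depth grows with y) and all positions and velocities are hypothetical.

```python
def moves_toward(position, velocity, destination):
    """True if the velocity has a positive component along the direction
    from the node's position to the destination (dot product > 0)."""
    to_dest = (destination[0] - position[0], destination[1] - position[1])
    return velocity[0] * to_dest[0] + velocity[1] * to_dest[1] > 0.0

def fbrd_select(candidates, destination):
    """Keep the names of candidates (name, position, velocity) that are
    heading toward the destination."""
    return [name for name, pos, vel in candidates
            if moves_toward(pos, vel, destination)]

# Assumed convention: surface at y = 0, depth increases with y.
surface_station = (200.0, 0.0)
in_beam = [
    ("n1", (190.0, 120.0), (0.5, 1.0)),    # sinking
    ("n5", (205.0, 110.0), (0.0, -1.5)),   # rising toward the surface
    ("n6", (215.0, 125.0), (1.0, 0.8)),    # sinking
]
selected = fbrd_select(in_beam, surface_station)
print(selected)
```

In this made-up configuration only the rising node is selected, which mirrors the situation described for Fig. 4(b).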
Fig. 4. Image of FBR and FBRD protocols for UOWC.
3 Implementation of FBR and FBRD Protocols in the ONE Simulator
The Opportunistic Networking Environment (ONE) simulator provides a framework for developing routing and application protocols and enables users to build scenarios based on various synthetic movement models and real traces [3]. We therefore implemented the FBR and FBRD protocols in “The ONE Simulator”. To simplify the implementation, we make the following assumptions.
• The environment is considered in two dimensions only: the horizontal axis is the width and the vertical axis is the depth.
• Each node is aware of its own position and destination.
• Each node moves according to the Random Waypoint mobility model.
• All data (messages) are sent to a surface station, which is fixed at the middle of the water surface.
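The two-dimensional Random Waypoint assumption can be illustrated with a minimal sketch. This is not the ONE simulator's implementation: the generator random_waypoint is a hypothetical stand-in, pause times are assumed to be zero, and the default area and speed range are simply chosen to match the values used later in the simulation parameters.

```python
import math
import random

def random_waypoint(start, area=(400.0, 400.0), speed_range=(0.5, 2.5),
                    duration=60.0, dt=1.0, seed=0):
    """Yield (t, x, y) for one node moving by Random Waypoint without pauses."""
    rng = random.Random(seed)
    x, y = start
    t = 0.0
    target = (rng.uniform(0, area[0]), rng.uniform(0, area[1]))
    speed = rng.uniform(*speed_range)
    while t <= duration:
        yield t, x, y
        dist = math.hypot(target[0] - x, target[1] - y)
        step = speed * dt
        if dist <= step:
            # Waypoint reached: move onto it, then draw a new target and speed.
            x, y = target
            target = (rng.uniform(0, area[0]), rng.uniform(0, area[1]))
            speed = rng.uniform(*speed_range)
        else:
            # Advance along the straight line toward the current waypoint.
            x += (target[0] - x) / dist * step
            y += (target[1] - y) / dist * step
        t += dt

trace = list(random_waypoint(start=(200.0, 200.0)))
print(len(trace), trace[0], trace[-1])
```

Here x plays the role of the width and y the role of the depth, following the first assumption in the list above.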
4 Simulation Scenario for Comparing FBR and FBRD Protocols for UOWC Using Transporter AUVs
In this section, we describe the simulation scenario using transporter AUVs, as shown in Fig. 5. Three underwater sensors send random signals to a surface station. We use a Horizontal Transporter (HT) and a Vertical Transporter (VT) in this scenario.
Fig. 5. Proposed scenario for using transporter AUVs.
The transporters move along specific orbits to increase node encounters. Four scenarios are used to compare the FBR and FBRD protocols: using neither HT nor VT, using HT only (H), using VT only (V), and using both HT and VT. The simulation parameters are shown in Table 1. We simulate the four scenarios described above with different angles (θ) from 10° to 360°.
Table 1. Simulation parameters’ settings.
Parameters                    Values
Transmission Speed            2 [Mbps]
Transmission Range            30 [m]
Data Size                     80–120 [kB]
Event Interval                2–5 [s]
Number of Surface Stations    1 Static (Middle Top)
Number of Free Nodes          20
Nodes Speed                   0.5–2.5 [m/s]
Movement Model                Random Waypoint
Simulation Time               10800 [s] × 10 Times
Simulation Area               400 × 400 [m]
Buffer Size                   30 MBytes
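To give a feeling for the parameters in Table 1, the short calculation below estimates how long one message occupies the link and how long two nodes stay in range in a simple head-on pass. It is only a back-of-the-envelope sketch: it assumes 1 kB = 1000 bytes and 1 Mbps = 10^6 bit/s, ignores the beam angle and protocol overhead, and the function names are hypothetical.

```python
def transfer_time_s(size_kb, rate_mbps=2.0):
    """Time to send one message, assuming 1 kB = 1000 bytes and 1 Mbps = 1e6 bit/s."""
    return size_kb * 1000 * 8 / (rate_mbps * 1e6)

def head_on_contact_time_s(tx_range_m=30.0, speed_a_mps=2.5, speed_b_mps=2.5):
    """Time two nodes moving straight toward each other take to cross the
    full diameter of the transmission range."""
    return 2 * tx_range_m / (speed_a_mps + speed_b_mps)

t_small = transfer_time_s(80)        # smallest message in Table 1
t_large = transfer_time_s(120)       # largest message in Table 1
contact = head_on_contact_time_s()   # both nodes at the maximum speed
print(f"{t_small:.2f} s, {t_large:.2f} s, {contact:.1f} s")
```

Under these assumptions a single message needs well under one second of link time, so even a short contact can carry several messages.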
5 Simulation Results
We compare the FBR and FBRD protocols using transporter AUVs in terms of delivery probability and overhead ratio for each scenario (both HT and VT, VT only, HT only, and none) with different angles (θ) from 10° to 360°. The results are shown in Fig. 6, Fig. 7, Fig. 8 and Fig. 9.
Figure 6 shows the scenario using only VT. In this case, FBR generally has a higher delivery probability than FBRD. However, when the FBR and FBRD angles are less than 70°, the delivery probability of FBRD is higher than that of FBR (see Fig. 6(a) and Fig. 6(b)). Also, the overhead ratio of FBRD is lower than that of FBR (see Fig. 6(c) and Fig. 6(d)).
Figure 7 shows the scenario using only HT. In this case, FBR generally has a higher delivery probability than FBRD. However, when the FBR and FBRD angles are less than 80°, the delivery probability of FBRD is higher than that of FBR (see Fig. 7(a) and Fig. 7(b)). Also, the overhead ratio of FBRD is lower than that of FBR (see Fig. 7(c) and Fig. 7(d)).
Figure 8 shows the scenario using both VT and HT. In this case, FBR generally has a higher delivery probability than FBRD. However, when the FBR and FBRD angles are less than 40°, the delivery probability of FBRD is higher than that of FBR (see Fig. 8(a) and Fig. 8(b)). In this scenario, the overhead ratio of FBRD is higher than that of FBR between 0° and 40° (see Fig. 8(c) and Fig. 8(d)).
Figure 9 shows the scenario using neither VT nor HT. In this case, FBR generally has a higher delivery probability than FBRD. However, when the FBR and FBRD angles are less than 140°, the delivery probability of FBRD is higher than that of FBR (see Fig. 9(a) and Fig. 9(b)). Also, the overhead ratio of FBRD is lower than that of FBR (see Fig. 9(c) and Fig. 9(d)).
From these results, we found that with low FBRD angles the delivery probability increases and the overhead ratio is low. These conditions are very suitable for UOWC, because a narrow angle increases the communication distance. The results also show that the overhead ratio of FBRD is lower than that of FBR over the range from 0° to 360°, which can reduce the power consumption of communication in the underwater environment.
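For reference, the two metrics compared above can be computed from simple message counters. The text does not state the exact formulas, so the definitions below are assumptions that follow the conventions commonly used in DTN evaluation: delivery probability as delivered over created messages, and overhead ratio as the number of relays that did not result in a delivery per delivered message. The counter values are made up for illustration.

```python
def delivery_probability(created, delivered):
    """Fraction of created messages that reached the destination."""
    return delivered / created if created else 0.0

def overhead_ratio(relayed, delivered):
    """Extra relay transmissions per delivered message; NaN if nothing was delivered."""
    return (relayed - delivered) / delivered if delivered else float("nan")

# Hypothetical counters from one simulation run.
created, relayed, delivered = 1000, 1800, 600
prob = delivery_probability(created, delivered)
overhead = overhead_ratio(relayed, delivered)
print(prob, overhead)
```

With these definitions, a lower overhead ratio means fewer transmissions are spent per delivered message, which is why it is linked to power consumption in the discussion above.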
Fig. 6. Simulation results of FBR and FBRD for different angles (VT only scenario).
Fig. 7. Simulation results of FBR and FBRD for different angles (HT only scenario).
Fig. 8. Simulation results of FBR and FBRD for different angles (VT and HT scenario).
Fig. 9. Simulation results of FBR and FBRD for different angles (None scenario).
6 Conclusions and Future Work
In this paper, we implemented the FBR and FBRD protocols in “The ONE Simulator” and compared their performance using transporter AUVs. The simulation results have shown that the delivery probability of FBR is generally higher than that of FBRD. However, for low angles (less than 70°), the delivery probability of FBRD is higher than that of FBR. This shows that FBRD is a suitable protocol for UOWC, because narrow FBRD angles increase the communication distance. In addition, the overhead ratio of FBRD is lower than that of FBR over the range from 0° to 360°. In future work, we will extend the simulation system with new procedures and features and conduct comprehensive simulations to assess diverse underwater conditions.
References
1. Awan, K.M., Shah, P.A., Iqbal, K., Gillani, S., Ahmad, W., Nam, Y.: Underwater wireless sensor networks: a review of recent issues and challenges. Wirel. Commun. Mob. Comput. 2019, 20 p. (2019). Article ID 6470359
2. Jouhari, M., Ibrahimi, K., Tembine, H., Ben-Othman, J.: Underwater wireless sensor networks: a survey on enabling technologies, localization protocols, and internet of underwater things. IEEE Access 7, 96879–96899 (2019)
3. Keränen, A., Ott, J., Kärkkäinen, T.: The ONE simulator for DTN protocol evaluation. In: Proceedings of the 2nd International Conference on Simulation Tools and Techniques, pp. 1–10 (2009)
4. Matsuo, K., Kulla, E., Barolli, L.: Evaluation of focused beam routing protocol on delay tolerant network for underwater optical wireless communication. In: Barolli, L., Kulla, E., Ikeda, M. (eds.) Advances in Internet, Data & Web Technologies (EIDWT 2022). LNDECT, vol. 118, pp. 263–271. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-95903-6_28
5. Matsuo, K., Kulla, E., Barolli, L.: A focused beam routing protocol considering node direction for underwater optical wireless communication in delay tolerant networks. In: Barolli, L. (ed.) Complex, Intelligent and Software Intensive Systems (CISIS 2022). LNNS, vol. 497, pp. 190–199. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-08812-4_19
6. Qureshi, U.M., et al.: RF path and absorption loss estimation for underwater wireless sensor networks in different water environments. Sensors 16(6), 890 (2016)
General Dynamic Difficulty Adjustment System for Major Game Genres
Qingwei Mi and Tianhan Gao(B)
Software College, Northeastern University, Shenyang, China [email protected], [email protected]
Abstract. Dynamic difficulty adjustment (DDA) is a research hotspot in the field of artificial intelligence (AI) in games. This paper designs and implements a general DDA system based on flow theory to address the fact that existing DDA methods are highly specialized. The proposed system can be applied effectively to the major game genres of the game industry. The improvement of parameters after introducing DDA demonstrates the system’s high adaptability. Players were invited to complete the Game Engagement Questionnaire (GEQ), and the higher mean total score demonstrates that the system provides a better player experience. The soundness and generality of the system can provide a complete solution for optimizing the player’s gaming experience, thus supporting game designers and developers in implementing or porting core AI systems of games more efficiently.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 189–200, 2023. https://doi.org/10.1007/978-3-031-35836-4_21
1 Introduction
Artificial Intelligence (AI) is a booming research field, and the number of important research directions in the field is increasing [1–3]. In recent years, research in AI has focused on tasks that are difficult for humans to regularize or define objectively [4–7]. With the success of highly formalized and symbolic mathematical concepts and representations in numerous AI applications, games have become a popular choice of research testbed in AI [8].
The first video game prototype was implemented in the 1950s, so video games have a history of over 70 years. At the end of the 20th century, video games with relatively complete forms gradually emerged, changing the way humans play games and expanding the definition of the game. Video games are continually evolving in gameplay and mechanisms, and the application of AI technology in games is becoming more extensive as well. The field of Game AI has emerged accordingly.
The research field of Game AI covers the study of AI in and for games [9, 10]. Game AI aims to improve the player’s gaming experience while ensuring the balance and challenge of the game. As a hotspot of research in the field, the Dynamic Difficulty Adjustment (DDA) mechanism can automatically modify the game parameters in real time according to the player’s skill level, thus supporting game designers and developers in optimizing the player experience effectively. With the help of DDA, players will not feel bored when the game difficulty is low and will not be frustrated when facing more difficult challenges [11–14]. However, current research on DDA is scarce, and most DDA methods are highly specialized. How to effectively realize DDA features together with the generality of solutions for games has become a key issue in Game AI.
To address the issue, this paper designs and implements a general dynamic difficulty adjustment system. Based on flow theory, the system achieves generality across the major game genres through independent DDA core algorithms and modules. The system is tested in different situations, and players are invited to take part in gameplay sessions and questionnaires. The results demonstrate that the system provides a better player experience and higher adaptability than games without DDA.
The remainder of the paper is organized as follows. The definitions of flow theory and the major game genres are sketched in Sect. 2. Section 3 elaborates on the detailed design of the proposed system. The tests and questionnaires are given, and the adaptability of the modules as well as the player experience in each major genre is analyzed, in Sect. 4. Finally, the contributions and future work of the paper are concluded in Sect. 5.
2 Related Works
As the most essential participants in a game, players are its main user group. Compared with user experience, player experience focuses on the individuals who play the game and their personal experience during play. Player experience describes the quality of the player’s interaction with a game and is typically surveyed during or after the game [15]. For any game genre, flow is a vital factor in ensuring the player experience.
Flow. Flow was proposed by Mihaly Csikszentmihalyi in 1975 [16]. It is the mental state in which a person performs an activity with full immersion and enjoyment [17]. In games, it can be refined into a mapping between the game challenge and the player’s skill level (shown in Fig. 1) [18, 19]. Flow occurs when players face challenges that match their skill level. If the skill level is low, overly difficult challenges take players out of flow, increase anxiety and frustration, and lead them to give up playing. Conversely, overly easy challenges bore more skilled players and make the flow disappear. Based on the flow model, the key to creating a good game experience is to adjust the relationship between the game challenge and the player’s skill level in real time. Allowing players to re-enter flow from each state during the game is also effective. In game design, challenges are often instantiated by difficulty [20]. The effective application of DDA can keep players in the state of flow and thus provide them with a better game experience [11].
Fig. 1. Mihaly Csikszentmihalyi’s Flow Model.
Game Genres. Since video games first appeared, there has been no precise definition of game genres, and different classification standards vary widely. In today’s game industry, the primary and most common method is to classify by game mode. That is, the specific genre is determined by the gameplay and mechanisms of the game. Accordingly, the Role-playing Game (RPG), Racing Game (RAC), Fighting Game (FTG), Shooting Game (STG), Adventure Game (AVG), Strategy Game (SG), and Casual Game (CG) are the seven mainstream genres widely researched and recognized by the game industry [21–25].
Role-Playing Game. RPG is one of the most popular major game genres. Compared with other genres, RPGs tend to focus more on the interaction between the player and the virtual game world. RPGs include the Action Role-playing Game (ARPG), Strategy Role-playing Game (SRPG), Massively Multiplayer Online Role-Playing Game (MMORPG), and other main branches. The combat interaction between players and enemies is the main source of game challenges. The core algorithms of the combat management system directly affect the player experience, which is an important reason for introducing DDA. The Kung-Fu Circle, Beyond Kung-Fu Circle, and Belgian AI are three representative combat management algorithms.
Racing Game. As one of the most distinctive major game genres, RAC emphasizes detailed and realistic simulation of driving to give players an excellent competition experience. RACs are mainly car racing games, and racing competitions are the main source of the player’s experience. The key to ensuring challenge in RACs is that players and their opponents can compete fiercely and overtake each other. Rubber-banding is a classical DDA technology used in RACs. With its support, AI racers change their driving styles in real time according to the player’s skill level. The Power-Based Rubber-Banding System, Difficulty-Based Rubber-Banding System, and Combined Rubber-Banding System are three traditional systems.
Fighting Game. FTGs have high balance requirements during design to ensure fair fights. FTGs offer vivid, fast-paced fighting immersion, but they are difficult to get started with and to win. The game places higher demands on the player’s response ability and micro-operation. FTGs are generally divided into 3D fighting games and 2D fighting games according to the action space.
Shooting Game. The content of STGs is mostly about completing tasks or destroying enemies, and players attack with realistic or imaginary weapons. In the First-person Shooting Game (FPS), the player plays from a subjective perspective. FPSs center on initiative and realism, and the immersive experience brings players a strong visual impact. In the Third-person Shooting Game (TPS), by contrast, the controller’s perspective is used. TPSs emphasize the overall view and the sense of action, which helps players observe their character and the surroundings. To a certain extent, this subgenre effectively combines action and shooting.
Adventure Game. AVGs focus on creating a story-driven single-player experience. Plot-based and exploratory interaction, such as exploring the unknown and solving puzzles, is the kernel of AVGs. The game emphasizes the mining of story clues and mainly tests the player’s observation and analysis skills. AVGs include puzzle adventure games, action adventure games, text adventure games, and other branches.
Strategy Game. SGs provide an environment in which players think about problems and plan tactics, allowing them to control and manage individual units or troops freely in order to defeat their opponents. The economic system plays a pivotal role in the design of SGs. A sound and balanced economic system guarantees the effective exercise of the player’s decision-making ability. The Tower Defense Game (TD), Turn-based Strategy Game (TBS), and Real-time Strategy Game (RTS) are mainstream subgenres of SG.
Casual Game. CG is a broad genre that is more common on mobile platforms. Because players define the word “casual” differently, casual games in this paper refer specifically to games whose aim is to obtain high scores. Such games are easy to get started with, have simple mechanisms, and have a certain educational effect on players. CGs include matching games, physics games, running games, and other subgenres.
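The flow model described earlier in this section can be read as a simple rule on the gap between challenge and skill. The sketch below is only an illustration under assumed conventions: challenge and skill are normalized to [0, 1], and the width of the flow channel (margin) and the adjustment step are hypothetical tuning values that are not specified by the flow model or by this paper.

```python
def flow_state(challenge, skill, margin=0.15):
    """Classify the player's state from the challenge/skill gap."""
    gap = challenge - skill
    if gap > margin:
        return "anxiety"   # challenge far above skill
    if gap < -margin:
        return "boredom"   # challenge far below skill
    return "flow"

def adjust_difficulty(challenge, skill, step=0.05, margin=0.15):
    """Nudge the challenge back toward the flow channel, clamped to [0, 1]."""
    state = flow_state(challenge, skill, margin)
    if state == "anxiety":
        return max(0.0, challenge - step)
    if state == "boredom":
        return min(1.0, challenge + step)
    return challenge

print(flow_state(0.9, 0.4), adjust_difficulty(0.9, 0.4))
print(flow_state(0.2, 0.8), adjust_difficulty(0.2, 0.8))
print(flow_state(0.5, 0.55), adjust_difficulty(0.5, 0.55))
```

Calling adjust_difficulty repeatedly, for example once per encounter, moves the challenge stepwise until the player is back inside the assumed flow channel.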
3 Dynamic Difficulty Adjustment System
As mentioned above, RPG, RAC, FTG, STG, AVG, SG, and CG are the seven major game genres most widely recognized by the industry. Through detailed analysis of the core challenges and interaction mechanisms of each genre, integrated and extensible DDA core algorithms and modules are designed using technologies from AI development in games. The authors’ previously published work gives the specific details of the MRBP Ring Model, grid priority, real-time induction and distribution algorithm, attack weight threshold, global cooldown, ARBS Relationship Curve, modular mechanism, detail processing schemes, and other technologies applied in the DDA system.
RPG Module. In RPGs, combat is the main source of challenges, so the combat management system is the preferred place to embed the DDA mechanism. The main application scenario of the module is the ARPG, a representative subgenre of RPG. The effectiveness of the following core algorithms is the key to implementing an RPG module with high adaptability and strong extensibility.
1. Based on the definitions of variables in the Belgian AI algorithm, the module designs the MRBP Ring Model to manage enemy attacks and realizes DDA based on the maximum grid capacity and current grid capacity. The MRBP Ring Model is a player-centric four-ring model. A rectangular plane coordinate system is created with its origin at the current position of the player, and the directions of the two axes are determined according to the game world. The four ring areas, from the inside out, are the melee area, ranged area, buffer area, and pursuit area. The y-axis is rotated clockwise by 22.5°, 67.5°, 112.5°, and 157.5°, respectively, forming four line segments, which divide the melee area, ranged area, and buffer area equally into eight sectors. The number of enemies held in each sector of the model can be set adaptively according to the specific game mechanism. The greater the maximum grid capacity and current grid capacity, the more difficult the combat the player faces. Both values are decreased when the player is dying and increased when the player has not been damaged for a long time. Moreover, the system resets the two variables after the player completes the current combat, so as to achieve the expected effect of DDA.
2. Grid priority is set for each enemy to determine the order in which the module processes attack requests. The module also designs the real-time induction and distribution algorithm to enhance the control of the combat management system. When multiple enemies request to attack the player simultaneously, the enemy with the higher grid priority is granted permission first.
If the number of enemies in the current combat, or the areas in which they are located, changes for any reason other than an enemy exiting, the maximum grid capacity and current grid capacity are reset. The real-time induction and distribution algorithm then compares the enemies’ states before and after the change and processes their requests according to their grid priorities. Afterwards, all enemies in the current combat execute their behaviors according to the complete logic.
3. To ensure game balance, an attack weight threshold is set for each enemy and a global cooldown is set for each attack type of the enemy. Each enemy can only use attacks whose attack weight is less than or equal to its threshold. Besides, while an attack type is in global cooldown, no enemy of the same type can use that attack type. The attack weight threshold effectively prevents a single attack type from monopolizing the right to attack, and the global cooldown reduces the number of enemies of the same type that attack the player with the same attack type simultaneously.
RAC Module. Fierce competition is the main challenge for players in RACs. In the RAC module, which introduces rubber-banding technology, AI racers slow down to wait for the player when they are ahead and accelerate to catch up otherwise, so as to control the difficulty of the game.
1. The module designs the ARBS Relationship Curve to describe the relationship between the rubber-banding multipliers and the distance from the AI racers to the player.
The curve adjusts the power of the car through the power multiplier and the skill level of AI racers through the difficulty multiplier. A multistep adjustment mode is constructed by setting thresholds on the distance from the AI racer to the player. The number of thresholds in the curve can be set adaptively according to the game mechanism. Car attributes such as power, speed, acceleration, limit speed, traction, tire grip, sound effect, and limiter are selected as the curve change conditions to improve the generality of the system.
2. RACs can be divided into single-player and multiplayer games, speed and item races, and races with short- and long-distance tracks, according to the number of players, gameplay mode, and track distance. The modular mechanism is used to form six independent submodules based on these six subtypes. For multiplayer games, the reference point of forward banding is set to the front player, and that of reverse banding is set to the back one. In addition, the probability that the last few players obtain good items is increased. On long-distance tracks, multiple checkpoints are set according to the length of the track.
3. However, some mechanisms need to be disabled or adjusted under certain circumstances to improve the robustness of the module. When the player is at the first turn at the beginning of the race, the system sets the player as the reference point and applies an additional reverse banding effect to all AI racers ahead of the player. The forward banding effect is disabled, so that the AI racer drives normally, if the distance between it and the player is less than 100 m. Furthermore, when the AI racer in first place leads the second-place racer by too much, the system sets the latter as the reference point and applies another reverse banding effect to the former to bridge the gap.
FTG Module. Though the player’s action space differs between 3D and 2D FTGs, the goal of the player is always to defeat the opponent in combat and win. Each combat corresponds to a challenge with DDA.
1. In addition to being affected by the skill level of the AI opponent, its combo probability and its defense or counterattack reaction speed are also negatively correlated with the AI’s current health points (HP). The lower the current HP of the AI, the faster its reaction speed and the higher its combo probability.
2. The authors define the danger value as the probability that the player character will be interrupted by the opponent’s attack while using an attack type. It is mainly determined by the cast time of the attack and the relative position of the player and the opponent. Thresholds are then set for the AI opponent’s current HP and for the danger value of the available attack types. When the HP of the AI opponent is lower than its threshold, the AI greatly reduces the probability of using attack types whose danger value is higher than the threshold, thereby lowering, to a certain extent, the risk of being defeated in a short time.
STG Module. In FPSs and TPSs, each competition or task can exist as a separate challenge. AI in shooting games often appears in the form of groups, and some games
are designed with a layered architecture. Individual AIs can communicate with each other with high intelligence.
1. The current HP is added as an influencing factor to every AI’s shooting accuracy, dodge probability, and visual and auditory perception range. That is, AIs with lower HP shoot more accurately, dodge more successfully, and have stronger visual and auditory perception.
2. Based on the grouping of AIs, all AIs in a camp adopt beneficial intelligent behaviors, including finding cover, reloading weapons in hidden places, and tactical cooperation, more frequently when their side is on the verge of defeat. This mechanism re-creates suspense in a situation whose outcome already seemed clear, which improves the player experience.
AVG Module. Puzzle adventure games, which demonstrate the required plot-based and exploratory interaction well, are the main application scenario of the AVG module. The puzzles in these games consist of a series of steps, and players can only solve a puzzle and continue the adventure after completing all the steps in order. Puzzle difficulty can be divided into two categories according to the design form. One is fixed difficulty, which is either unaffected or affected only by the overall preset difficulty of the game. No matter how many times the player attempts the same puzzle, its difficulty does not change. The other is random difficulty, which is mainly determined by the random seed. For games with this mechanism, the system gives players a hint if they spend too long on a single attempt at a single step. At the same time, seeds with higher difficulty are not applied to subsequent steps; the difficulty of subsequent steps is generated only from seeds with lower or moderate difficulty. Conversely, after the player completes the prescribed steps within the preset time, the difficulty of the subsequent steps is drawn randomly from higher-difficulty seeds. The algorithms in the AVG module are valuable for preventing player churn.
SG Module. TD is one of the core subgenres of SG with the longest history. TDs require players to protect the base by building defensive structures such as turrets and walls on the map to prevent enemies from attacking. The DDA algorithms in the SG module are designed on the basis of enemy waves.
1. The initial position of enemies, the order in which different enemies appear, and the interval between the appearance of adjacent enemies are all essential factors that affect the player’s win rate in a given wave. When the player’s base is on the verge of destruction, these three factors are adjusted gradually in the direction that benefits the player. If enemies cannot reach the player’s base for a preset time, the factors are adaptively changed to increase the defending difficulty and make it easier to break through the player’s defensive lines.
2. Map resources, random subsidies, and the value of other bonus rewards are increased if players are at a disadvantage in the wave for a preset time. The refresh
time is shortened as well, so as to help players recover from bad situations and maintain their competitiveness.
CG Module. Because CG is a broad genre, its branches differ greatly. Therefore, the memory game, a kind of matching game, is selected to illustrate the application of DDA, based on its market share. The core gameplay of traditional memory games is that the player must match all pairs of cards within a preset number of steps. When players face the same level, if they have not completed any match within the prescribed steps, or have completed only a few matches within a certain default number of steps, the system rearranges the unturned cards so that the sum of the Euclidean distances between paired cards is smaller, which provides effective clearance support for rookies. Players with higher skill levels, however, can often complete more matches in a short time. In that case, the system increases the randomness of the remaining unturned cards and reduces the occurrence probability of cards of the same type. As a result, the system provides a good game experience for players regardless of their skill levels.
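The rookie support described for the CG module can be illustrated with a minimal Python sketch. It assumes that each card type appears on exactly two unturned cards and that cards sit at integer grid positions; the function names and the random-search strategy are hypothetical choices made for illustration and are not taken from the paper.

```python
import math
import random


def pair_distance_sum(layout):
    """Sum of Euclidean distances between the two cards of every pair.

    layout maps a grid position (row, col) to a card type. Assumption:
    each type appears on exactly two unturned cards.
    """
    by_type = {}
    for pos, card in layout.items():
        by_type.setdefault(card, []).append(pos)
    return sum(math.dist(a, b) for a, b in by_type.values())


def rearrange_for_rookie(layout, tries=200, seed=0):
    """Return a reshuffled layout whose pair-distance sum is no larger.

    Random search is an illustrative stand-in, not the paper's method:
    the unturned cards are shuffled over the same positions several
    times and the candidate with the smallest sum is kept.
    """
    rng = random.Random(seed)
    positions = list(layout)
    best = dict(layout)
    best_sum = pair_distance_sum(best)
    for _ in range(tries):
        cards = [layout[p] for p in positions]
        rng.shuffle(cards)
        candidate = dict(zip(positions, cards))
        candidate_sum = pair_distance_sum(candidate)
        if candidate_sum < best_sum:
            best, best_sum = candidate, candidate_sum
    return best
```

For skilled players the same helper could be used in the opposite direction, for example by keeping a candidate with a larger sum, although the paper only states that randomness is increased.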
4 Results and Discussion
To verify the effect of the proposed methods in application, a prototype was implemented based on the general DDA system. This section elaborates on the process and test results of the design, implementation, adaptability, and player experience of the system.
System Design and Implementation. The system integrates seven subsystems, which implement the RPG, RAC, FTG, STG, AVG, SG, and CG modules, respectively, for the corresponding game genres. The implementation process includes creating the player, AI, and other necessary classes; setting related variables, functions, events, and interfaces; implementing the algorithms and modules; making art assets and sound effects; and building test scenes.
System Tests. After implementing the system, the authors invited 27 players to participate in small-scale internal tests. The tests required the players to compete with enemies at multiple fixed global difficulties with the DDA modules disabled, and the players were ranked by their scores in each game genre. The players were then divided equally into three groups based on the results. The median player of each group was selected to represent the low, medium, and high skill levels, respectively, in the algorithm and module tests.
Algorithm and Module Tests. These tests are primarily used to verify the execution of the algorithms and the mechanisms of the modules by observing the performance of the subsystems under different DDA parameters. According to the test results, the adaptability of the modules with and without DDA is shown in Table 1. It can be concluded that the proposed general system is more adaptive and that the application of DDA in the algorithms and modules is fully effective.
Player Game Experience Tests. To further verify the effect of the general DDA system on improving the player experience, the authors used the Game Engagement Questionnaire
Table 1. Adaptability of seven system modules.
Module | DDA | Parameters
RPG | Disabled | The number of enemies that can be held is fixed, enemy types that can be managed are limited, the interaction area between enemies and the player is differed by their relative positions, attack types of enemies cannot switch automatically
RPG | Enabled | The number of enemies that can be held is adaptive based on specific game mechanisms, nearly all enemy types can be managed, the interaction area between enemies and the player is all the same, attack types of enemies can switch automatically
RAC | Disabled | The relationship curve cannot be adaptively adjusted, low driving stability of AI racers, high probability of cases that impact player experience, narrow controllable range
RAC | Enabled | The relationship curve can be adaptively adjusted according to specific game mechanisms, high driving stability of AI racers, low probability of cases that impact player experience, wide controllable range
FTG | Disabled | Combat attributes of AI opponents are merely affected by their skill level, high probability of being defeated if the HP of the AI opponent is low
FTG | Enabled | Combat attributes of AI opponents are affected by their skill level and current HP, low probability of being defeated if the HP of the AI opponent is low
STG | Disabled | Competing with normal intelligence when AIs’ current HP is low, short duration of stalemate
STG | Enabled | Competing with high intelligence when AIs’ current HP is low, long duration of stalemate
AVG | Disabled | Most players give up the game after multiple failed attempts at a certain step
AVG | Enabled | Only few players give up the game after multiple failed attempts at a certain step
SG | Disabled | Factors that affect the win rate of players are set by default, the number of bonus rewards is fixed
SG | Enabled | Factors that affect the win rate of players can be adjusted dynamically, the number of bonus rewards is determined by the defending situation of the player
CG | Disabled | Clearance time varies greatly among players of different skill levels
CG | Enabled | Less variation in clearance time among players of different skill levels
(GEQ) [26], which is widely used in the game industry. The GEQ consists of 19 questions. Each question has three options, N, M, and Y, referring to “No”, “Maybe”, and “Yes”, respectively. The GEQ questions are shown in Table 2; they were sorted randomly for each player to avoid order effects.
The authors invited 336 players to participate in the gameplay experience, which lasted for three weeks. Afterwards, the authors distributed two questionnaires to the 336 players, who filled them in according to their experience with DDA disabled and enabled, respectively.
Table 2. GEQ Items.
ID | Item
1 | I lose track of time
2 | Things seem to happen automatically
3 | I feel different
4 | I feel scared
5 | The game feels real
6 | If someone talks to me, I don’t hear them
7 | I get wound up
8 | Time seems to kind of stand still or stop
9 | I feel spaced out
10 | I don’t answer when someone talks to me
11 | I can’t tell that I’m getting tired
12 | Playing seems automatic
13 | My thoughts go fast
14 | I lose track of where I am
15 | I play without thinking about how to play
16 | Playing makes me feel calm
17 | I play longer than I meant to
18 | I really get into the game
19 | I feel like I just can’t stop playing
The scores of N, M, and Y are set to 0, 1, and 2, respectively, to quantify the player experience. A difference analysis found no significant effect of the players’ gender, age, or nationality on the test results. Consequently, these factors were not considered in the analysis of the results. The mean total scores of the 336 players in the seven major game genres are shown in Fig. 2. The questionnaire results demonstrate that, in each game genre, the mean score with DDA is higher than that without DDA, to varying degrees. RAC shows the largest increase, 3.76 points, and CG the smallest, 0.87 points. These results indicate that the proposed general DDA system can enhance the player experience significantly for major game genres.
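The scoring rule above can be written down as a short Python sketch. The 0/1/2 mapping and the 19 items come from the text; the function names and the representation of a questionnaire as a list of letters are assumptions made for illustration, and the study's actual analysis procedure is not specified beyond the mean scores.

```python
from statistics import mean

SCORE = {"N": 0, "M": 1, "Y": 2}  # mapping stated in the text
NUM_ITEMS = 19                    # the GEQ has 19 items


def geq_total(answers):
    """Total GEQ score of one questionnaire (between 0 and 38)."""
    if len(answers) != NUM_ITEMS:
        raise ValueError("expected 19 answers")
    return sum(SCORE[a] for a in answers)


def mean_gain(disabled, enabled):
    """Mean total score with DDA enabled minus the mean with DDA disabled.

    Each argument is a list of questionnaires, one per player
    (hypothetical data layout).
    """
    return mean(geq_total(q) for q in enabled) - mean(geq_total(q) for q in disabled)
```

With this layout, a per-genre difference such as those reported for RAC and CG would correspond to one call of `mean_gain` on that genre's two sets of questionnaires.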
Fig. 2. Mean GEQ score of the players in the major game genres.
5 Conclusion
This paper proposes a general DDA system with high adaptability and strong extendibility to enhance the player’s gaming experience while ensuring balance and challenge in RPG, RAC, FTG, STG, AVG, SG, and CG, genres that are well established and widely accepted in the industry. The DDA system designs and implements core algorithms and modules based on a detailed analysis of the core challenges and interaction mechanisms of each genre. The MRBP Ring Model, ARBS Relationship Curve, and other methods applied in the system are adaptive and robust because they integrate related technologies of AI development in games. Systematic tests and questionnaire results demonstrate that the proposed system can provide an efficient solution for improving the player experience in current major game genres. Games are well suited to realizing the long-term goals of general intelligence. In future research, the general DDA system will be optimized for specific game engines and extended to other, minor game genres. The integrity and applicability of the system will thereby be enhanced, so as to promote the common development of AI and the game industry.
Acknowledgments. This work was supported by the National Natural Science Foundation of China under Grant Number 52130403 and the Fundamental Research Funds for the Central Universities under Grant Number N2017003.
References
1. Suleimenov, I.E., et al.: Artificial Intelligence: what is it. In: Proceedings of the 2020 6th International Conference on Computer and Technology Applications, pp. 22–25. ACM (2020)
2. Russell, S.J.: Artificial Intelligence: A Modern Approach. Pearson Education, Philadelphia (2010)
3. Klassner, F.: Artificial intelligence: introduction. Crossroads 3(1), 2 (1996)
4. Jackson, P.C.: Introduction to Artificial Intelligence. Courier Dover Publications, Mineola (2019)
5. Lu, H., et al.: Brain intelligence: go beyond artificial intelligence. Mob. Netw. Appl. 23, 368–375 (2018)
6. Holzinger, A., et al.: Causability and explainability of artificial intelligence in medicine. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 9(4), e1312 (2019)
7. Duan, Y., Edwards, J.S., Dwivedi, Y.K.: Artificial intelligence for decision making in the era of Big Data–evolution, challenges and research agenda. Int. J. Inf. Manag. 48, 63–71 (2019)
8. Rabin, S.: Game AI Pro 2: Collected Wisdom of Game AI Professionals. CRC Press, Boca Raton (2015)
9. Yannakakis, G.N., Togelius, J.: Artificial Intelligence and Games. Springer, Switzerland (2018)
10. Millington, I., Funge, J.: Artificial Intelligence for Games. CRC Press, Boca Raton (2009)
11. Zohaib, M.: Dynamic difficulty adjustment (DDA) in computer games: a review. Adv. Hum.-Comput. Interact. (2018)
12. Xue, S., et al.: Dynamic difficulty adjustment for maximized engagement in digital games. In: Proceedings of the 26th International Conference on World Wide Web Companion, pp. 465–471. ACM (2017)
13. Hunicke, R.: The case for dynamic difficulty adjustment in games. In: Proceedings of the 2005 ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, pp. 429–433. ACM (2005)
14. Silva, M.P., do Nascimento Silva, V., Chaimowicz, L.: Dynamic difficulty adjustment on MOBA games. Entertain. Comput. 18, 103–123 (2017)
15. Makantasis, K., Liapis, A., Yannakakis, G.N.: From pixels to affect: a study on games and player experience. In: 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 1–7. IEEE (2019)
16. Csikszentmihalyi, M.: Beyond Boredom and Anxiety. Jossey-Bass, London (1975)
17. Despain, W.: 100 Principles of Game Design. New Riders, Upper Saddle River (2013)
18. Chen, J.: Flow in games (and everything else). Commun. ACM 50(4), 31–34 (2007)
19. Sylvester, T.: Designing Games: A Guide to Engineering Experiences. O’Reilly Media, Sebastopol (2013)
20. Macklin, C., Sharp, J.: Games, Design and Play: A Detailed Approach to Iterative Game Design. Addison-Wesley Professional, Boston (2016)
21. Heintz, S., Law, E.L.C.: The game genre map: a revised game classification. In: Proceedings of the 2015 Annual Symposium on Computer-Human Interaction in Play, pp. 175–184. ACM (2015)
22. Grace, L.: Game type and game genre 22(2009), 8 (2005). Accessed February
23. Arsenault, D.: Video game genre, evolution and innovation. Eludamos: J. Comput. Game Cult. 3(2), 149–176 (2009)
24. Gregory, J.: Game Engine Architecture. AK Peters, Natick (2018)
25. Adams, E.: Fundamentals of Game Design. Pearson Education, Upper Saddle River (2014)
26. Brockmyer, J.H., et al.: The development of the game engagement questionnaire: a measure of engagement in video game-playing. J. Exp. Soc. Psychol. 45(4), 624–634 (2009)
Method of Facial De-identification Using Machine Learning in Real-Time Video
Si-On Kim1, Da-Wit Jeong1, and Sun-Young Lee2(B)
1 Department of Mobility Convergence Security, Graduate School, Soonchunhyang University, 22, Soonchunhyang-ro, Asan-si, Republic of Korea
[email protected], [email protected]
2 Department of Information Security Engineering, Soonchunhyang University, 22, Soonchunhyang-ro, Asan-si, Republic of Korea
[email protected]
Abstract. A growing number of individuals and small teams create, edit, and broadcast personal content through live streaming. However, there is a concern about inadvertently exposing the faces of non-public figures. To prevent such exposure, de-identification has so far been performed manually during editing rather than on real-time video. This method is prone to mistakes and consumes a significant amount of time, and it cannot prevent face exposure in real-time broadcasts. Therefore, this paper proposes an efficient real-time de-identification technique based on Facial Landmarks that can solve the problem of loss of valid information in the video without compromising video quality. The proposed technique detects only the facial landmark areas of people in the video in real time and de-identifies them. It has been shown to achieve an average object recognition rate of 95.52% and to preserve an additional 1.2–2.2% of valid information in the video on average.
1 Introduction
Recently, with the emergence of various genres such as games, music, and cooking, interest in live streaming has increased. The global viewing time of 5 live streaming services increased by about 14% compared to the first quarter of 2021, and Asia showed the largest annual growth rate, 90% [1]. Videos are produced through live streaming not only on video platforms such as YouTube and Twitch but also on SNSs such as Instagram and Facebook. Uploaded videos are generally edited before being uploaded, but videos produced through live streaming can cause problems such as infringement of others’ portrait rights and exposure of trademarks [2]. In this paper, we propose a real-time video de-identification technique using machine learning. Existing de-identification techniques apply blurring to the Bounding Box area of the de-identification target object itself [3, 4]. This method applies blurring to the area around the Bounding Box, which decreases video quality, causes loss of valid information, and thereby reduces the value of the video. The proposed technique applies de-identification to the landmark area on the edge of the object, so it can solve the problem of decreasing video quality without loss of valid information and provide an efficient technique for real-time de-identification.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 201–208, 2023. https://doi.org/10.1007/978-3-031-35836-4_22
2 Related Work
2.1 YOLO
YOLO stands for You Only Look Once and was developed by Joseph Redmon [5, 6]. It is a 1-stage detector designed to process and generate many images on a frame-by-frame basis, which makes it suitable for video, and it shows high performance in real-time video processing. YOLOv3 divides the image into an S × S grid and predicts Bounding Boxes for each grid cell. A Bounding Box is represented by its coordinates, width, and height. YOLOv3 predicts a defined number of Bounding Boxes for each grid cell, together with the class of the corresponding object. The Non-Maximum Suppression (NMS) algorithm is then used to remove duplicate Bounding Boxes, and finally the selected Bounding Boxes are output with their respective classes and probabilities.
2.2 Facial Landmark
Face detection based on machine learning can be divided into two methods: detecting the face with a Bounding Box, and the landmark method, which marks points on common features such as the eyes, nose, mouth, eyebrows, and jaw. A Bounding Box is a rectangle that surrounds the perimeter of an object to represent its position. The class probability is measured by determining the class of the object surrounded by the Bounding Box [7]. Figure 1 shows the difference between the two methods when they are applied to a face. The landmark method has the advantage of precisely detecting the extent of the face by marking 68 points on the facial features [8]. The detected landmarks can be used not only for face detection but also in various computer vision applications such as emotion recognition and pose estimation.
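As an illustration of the NMS step mentioned in Sect. 2.1, the following Python sketch shows the common greedy variant. It is a generic textbook formulation rather than the YOLOv3 implementation; the (x, y, w, h) box layout, the function names, and the 0.5 overlap threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only
    if it does not overlap an already kept box by more than the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Two heavily overlapping detections of the same face therefore collapse into the one with the higher score, while a detection elsewhere in the frame is unaffected.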
3 Proposed Method
3.1 Methodology
In this paper, we propose real-time de-identification of human faces in situations where control and editing are difficult, such as real-time online personal broadcasting and live news broadcasting. The proposed method is divided into input, processing, and output stages. In the input stage, the video is transmitted to the de-identification processing device through a device that can record and transmit video. In the output stage, the de-identified video is output through a display and a device that can receive the video. Figure 2 shows the processing procedure for de-identification. The video is recorded by a camera, and the real-time video is transmitted to the de-identification processing server through a device capable of transmission. When the server receives
Fig. 1. Image comparison by de-identification methods
the video, the YOLOv3 network detects the faces of people in the image and generates a bounding box for each. The detected bounding boxes are numbered in order of the area they occupy, and the numbers are stored in an array. In personal or news broadcasting, not all individuals need to be de-identified, so de-identified and non-de-identified subjects are distinguished by the assigned numbers. De-identified subjects are chosen by selecting bounding boxes with input devices such as keyboards, mice, and touchpads. If a bounding box is selected as a de-identified subject, the landmarks within the bounding box are detected, and the outer facial feature points are connected to define one area, which is then de-identified. Finally, the de-identified image is converted back to a video format and broadcast.
Figure 3 shows the flowchart for the array that stores the bounding box numbers and for the selection of non-de-identified subjects. To exclude the Num 2 Bounding Box, which is the third bounding box in Fig. 3, from the de-identified subjects, the area of that bounding box is selected using an input device. The number of the selected bounding box is then deleted from the array.
The landmark-based de-identification method proposed in this paper differs significantly from commonly known face area detection and de-identification solutions in the amount of information lost through de-identification. As shown in Fig. 4, the typical face detection method de-identifies the rectangular bounding box area, which loses information that could be obtained around the person’s face. The landmark-based method, by contrast, de-identifies only the area that connects the facial feature points, which allows for the preservation of information other
Fig. 2. De-identification System Structure
than the face. In particular, when de-identifying multiple subjects in one screen, the efficiency of the landmark-based de-identification method is higher than that of the bounding box-based method. The comparison experiment results of the two de-identification methods are presented in Sect. 3.2.
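The numbering and exclusion procedure of Fig. 3 can be sketched in Python as follows. The paper does not state whether boxes are numbered from largest to smallest or the reverse, so the largest-first order, the zero-based numbers, the (x, y, w, h) box layout, and all function names here are assumptions for illustration only.

```python
def number_boxes(boxes):
    """Order boxes by area, largest first (assumed order); the position
    of a box in the returned list is its number (Num 0, Num 1, ...)."""
    return sorted(boxes, key=lambda b: b[2] * b[3], reverse=True)


def contains(box, point):
    """True if the point (px, py) lies inside the box (x, y, w, h)."""
    x, y, w, h = box
    px, py = point
    return x <= px <= x + w and y <= py <= y + h


def exclude_at(numbered, targets, click):
    """Delete from the target array the number of the first box that
    contains the selected point; other numbers are left unchanged."""
    for num in targets:
        if contains(numbered[num], click):
            return [n for n in targets if n != num]
    return list(targets)


# Usage: three detected faces, all de-identified by default.
numbered = number_boxes([(0, 0, 10, 10), (100, 0, 30, 30), (200, 0, 20, 20)])
targets = list(range(len(numbered)))          # [0, 1, 2]
targets = exclude_at(numbered, targets, (5, 5))  # click inside Num 2
```

After the call, only the boxes whose numbers remain in `targets` would be passed on to landmark detection and de-identification.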
Fig. 3. De-identification exclusion target selection Flowchart of Bounding Box Array
Fig. 4. Comparison of De-identification using Bounding Box and Landmark
3.2 Experiment
The proposed method was implemented on an Intel(R) Core(TM) i9-11900K CPU @ 3.50 GHz with 64 GB RAM and a multi-GPU environment of NVIDIA GeForce RTX 3080 Ti and RTX 3080, using the Ubuntu 20.04 LTS operating system. The experiment used a combination of the Facial Landmark dataset, 300-W Dataset, and the face recognition dataset, WIDER Face Dataset [9]. A total of 523 images, a 50:50 mix of the 300-W Dataset and the WIDER Face Dataset, were used to compare the advantages of using Landmarks and YOLOv3 for real-time de-identification. The experiment compared the de-identified areas of images de-identified using a single subject’s Bounding Box with those of images de-identified using a single subject’s Landmarks. The results are presented in Fig. 5. Figure 5 compares the amount of area de-identified using the Bounding Box and using Landmarks and shows how much less area was de-identified using Landmarks. All images had the same resolution. Landmarks de-identified an average of 3.1% less area than the Bounding Box, which helped to prevent the loss of useful information. The lower 1%
of the 523 comparison data had a difference of 0.00156, or 0.156%, and the higher 1% had a difference of 0.20901, or 20.901%. The average difference was 0.0312, or 3.12%. However, looking at Fig. 5, there were significant differences between the higher and lower data points, and the data clustered around 0.02. Therefore, the data was analyzed using a distribution graph, as shown in Fig. 6.
Fig. 5. Ratio of De-identification regions of Landmark to Bounding box for a single target
Figure 6 is a distribution graph that represents the results of Fig. 5 by dividing the data into intervals. With Landmark-based de-identification, the largest number of images, 141, fell in the 1.2% to 2.2% range, followed by 120 images in the 0.2% to 1.2% range. Therefore, compared with the Bounding Box method, using Landmarks to de-identify single subjects can prevent a data loss of 1.2% to 2.2%. Figure 7 shows the object recognition rate of the proposed model. The highest recognition rate was 98.9% and the lowest was 92.03%, with an average of 95.52%.
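The kind of area comparison reported in Figs. 5 and 6 can be illustrated with a short Python sketch. The paper does not define the ratio precisely, so this sketch assumes that the difference between the Bounding Box area and the landmark outline area is measured as a fraction of the whole frame; the function names and the (x, y, w, h) box layout are likewise assumptions.

```python
def polygon_area(points):
    """Area of a simple polygon, given as ordered (x, y) vertices,
    computed with the shoelace formula."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def saved_fraction(bbox, outline, image_size):
    """Fraction of the frame left intact when only the landmark outline
    is masked instead of the whole bounding box (x, y, w, h).
    Assumption: the outline lies inside the bounding box."""
    width, height = image_size
    return (bbox[2] * bbox[3] - polygon_area(outline)) / (width * height)
```

For a 10 × 10 box whose landmark outline cuts off four corner triangles of 4.5 pixels each, the function returns 18 divided by the frame area, which is the share of pixels that the landmark method leaves visible in that frame.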
Fig. 6. Distribution graph by interval of the difference in the ratio of the de-identified area of Landmark to Bounding box for a single target
Fig. 7. Object detection Accuracy of the Proposed model
4 Conclusions
In this paper, we propose a method to identify people in real-time video and de-identify the Landmark area of their faces. Existing Bounding Box de-identification techniques de-identify not only the face but also the surrounding area, resulting in reduced video quality
and loss of valid information. Our proposed technique de-identifies only the Landmark area of the face, thereby minimizing the de-identified area and preserving valid information around the face. Compared to Bounding Box de-identification techniques, our technique preserved about 1.2–2.2% more valid information. Additionally, unlike existing de-identification systems, our method allows arbitrary selection of de-identification targets, making it possible to protect the privacy of individuals in real-time videos such as personal broadcasts and news.
Acknowledgments. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (NO. 2021R1A4A2001810). This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MEST) (NO. NRF2018R1D1A1B07047656).
References
1. Conviva: Conviva’s State of Streaming. http://bitly.ws/zzNu
2. Dong, S., Kim, S., Ahn, H.: Study on trends and characteristics of infringement the right to likeness by the press. J. Korea Contents Assoc. 16(1), 370–381 (2016)
3. Gross, R., Sweeney, L., Cohn, J., de la Torre, F., Baker, S.: Face de-identification. In: Senior, A. (ed.) Protecting Privacy in Video Surveillance, pp. 129–146. Springer, London (2009). https://doi.org/10.1007/978-1-84882-301-3_8
4. Luo, S., Li, X., Zhang, X.: Bounding-box deep calibration for high performance face detection. IET Comput. Vis. 16(8), 747–758 (2022)
5. Redmon, J., et al.: You only look once: unified, real-time object detection. In: CVPR, pp. 1–10 (2016)
6. Redmon, J., et al.: YOLOv3: an incremental improvement. In: CVPR, pp. 1–6 (2018)
7. Koo, J., Seo, J., Jeon, S., Choe, J., Jeon, T.: RBox-CNN: rotated bounding box based CNN for ship detection in remote sensing image. In: The 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2018), pp. 420–423. Association for Computing Machinery, New York (2018)
8. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
9. Sagonas, C., Antonakos, E., Tzimiropoulos, G., Zafeiriou, S., Pantic, M.: 300 faces In-the-wild challenge: database and results. In: Image and Vision Computing (IMAVIS), pp. 3–18 (2016)
Softprocessor RISCV-EC for Edge Computing Applications

Guillermo Montesdeoca1, Víctor Asanza1, Rebeca Estrada1(B), Irving Valeriano1, and M. A. Muneeb2

1 Escuela Superior Politecnica del Litoral, Guayaquil, Ecuador
{guianmon,vasanza,restrada,ivaleria}@espol.edu.ec
2 Guru Nanak Dev Engineering College Bidar, Bidar, India
[email protected]
Abstract. Hard processors have their architecture fixed at the factory, which makes them inflexible in the face of architecture changes, whereas softprocessors can be modified, allowing continuous improvements to the design when they are released as open source. In this paper, we propose the design of the first Ecuadorian open-source softprocessor, called RISCV-EC, which is based on a single-core RISC-V architecture. In addition, a performance comparison is carried out between the proposed RISCV-EC softprocessor and other processors: the AVR ATMEGA328P, the ARM Cortex M0+ of the Raspberry Pi Pico, and the ARM Cortex A9 of the Xilinx Zynq-7000. The comparison consists of estimating the running time of the Fibonacci sequence algorithm while increasing the number of iterations from 0 to 500. Because the RISCV-EC softprocessor is implemented in the FPGA of the Xilinx Zynq-7000 SoC, the same clock as the other processors was used, so that the comparison reflects the architecture improvement and not an overclock. The Fibonacci series was programmed on all processors in assembly language (ASM) to avoid compiler-induced bias in the results. Numerical results show that the RISCV-EC softprocessor performs better than the AVR ATMEGA328P processor for any given number of iterations of the Fibonacci series. Compared with the ARM Cortex M0+ processor, the RISCV-EC processor is better when the number of iterations is greater than 300. Finally, RISCV-EC presents a better running time than the ARM Cortex A9 of the Zynq-7000 only when the number of iterations is lower than 18.
Keywords: RISC-V · FPGA · SoC · Soft-Core Processor · VHDL

1 Introduction
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 209–220, 2023. https://doi.org/10.1007/978-3-031-35836-4_23

Currently, there is a need for more compact processors capable of real-time processing, because they must run in products ranging from smart mobile devices to radar systems and automobiles. Therefore, the use of embedded systems made up of a microprocessor, memory, screen, etc. is increasing. The demand for higher performance per unit of occupied area grows with each passing year [10]. Design principles are based on placing entire systems on a single chip, which can include processors, memory, input and output circuits, Analog-to-Digital Converters (ADC), and a host of special features for applications with a given set of specifications [4]. The problem with microprocessors is that, once they are implemented on a chip, it is not possible to change their specifications, such as the number of ports, the memory, the number of cores or any other characteristic fixed by the manufacturer. As a solution, some companies create a wide variety of Systems on a Chip (SoC) to try to meet the varying demands of the market [9]. Another alternative is a softcore processor, which, instead of being implemented directly as a circuit, is described in a hardware description language and can be synthesized to create an Application-Specific Integrated Circuit (ASIC) or implemented in a Field-Programmable Gate Array (FPGA) [9]. The advantage of this method is that the design is at a higher level of abstraction, so changes can be made easily. Once these changes have been made, an SoC tailored to a particular application can be created. Some examples of this type of processor are the MicroBlaze from Xilinx and the ARM Cortex M1, two processors that can be configured in the company's own development environment and then synthesized and implemented in an FPGA. The novelty is that configuring the intellectual property (IP) block changes the hardware-level characteristics of these two processors. In the case of FPGAs, most chips today have what is called a hard processor, with one or more cores permanently implemented on the silicon of the FPGA chip.
The advantage of this type of processor over the softcore processor is that it can operate at higher clock frequencies due to the physical medium on which it is built. On the other hand, the advantage of the softcore processor over the hard processor is the versatility of being able to change features at the Hardware Description Language (HDL) level [9]. These types of processors are commonly designed to implement an instruction set, that is, a set of basic instructions that the processor is capable of carrying out, such as memory loads, arithmetic operations, etc. In the case of the aforementioned cores, commercial IPs such as those of ARM are used for the design. The only problem with using these IPs is that royalties must be paid to the company that manages the specifications and performs the updates. RISC-V is a free Instruction Set Architecture (ISA) standard, which means that the documents and the reasons for the design decisions can be read by anyone without having to pay any royalty or sign a non-disclosure agreement [6]. The main goal of this work is to design a functional softprocessor based on RISC-V and compare it with other processors, such as the ARM Cortex M0+ found on the Raspberry Pi Pico and the ARM Cortex A9 embedded in the silicon of the FPGA.
The Raspberry Pi Pico can run at up to 133 MHz. It uses an SoC developed by the Raspberry Pi Foundation called RP2040, which contains a dual-core ARM Cortex M0+ processor. This processor is part of the ARM IPs focused on low power consumption. It is also designed to have a short pipeline and low cost. On the other hand, the ARM Cortex A9 is a high-end processor IP from ARM that was used in mobile devices, for example in the Apple A5 of the iPad 2 and in the Samsung Exynos 4 Dual 4210 of the Samsung Galaxy S2. It should be noted that this IP was used in the design of the SoCs of these devices, so the performance depends on the design and the operating frequency chosen by these manufacturers. The rest of the paper is organized as follows: Sect. 2 presents the related work, while Sect. 3 describes the architecture of the proposed RISC-V softprocessor. Numerical results are presented in Sect. 4. Finally, Sect. 5 concludes the research work.
2 Related Work
There are many papers that explore the differences between processor implementations as hard processors and as softprocessors [2,8]. Several related works investigate how to select a microprocessor based on requirements or on microprocessor characteristics. For instance, GSK Gayatri et al. [10] proposed a guide for selecting a microprocessor based on the technical requirements of the project to be carried out. They state that it is important to consider at least ten characteristics when selecting a processor. Other authors implement digital systems on FPGAs; these implementations offer greater flexibility because they allow custom architectures to be implemented instead of relying on a fixed architecture such as that of a microprocessor [3,7]. Other researchers focus on how to build a softprocessor, such as Jim Ledin and Dave Farley [6], who created a guide showing how each part of a processor is built, the differences between various instruction sets including RISC-V, and how commonly used devices such as smartphones work with an ARM processor. Other related works compare hard processors and softprocessors, such as the work of Ahmed Karim Ben Salem [4], which compares the performance of a digital control application implemented on an FPGA using a hard processor and a softprocessor. In [9], the authors compare several commercial processor IPs that were available at the time of that paper's publication and are still in use, such as the Xilinx MicroBlaze. Wali et al. [11] tested a commercial RISC-V core, measuring the frequency of errors when performing basic operations both with and without an operating system. lowRISC [1] is a non-profit company that collaborates with universities, research centers, and companies such as Google to develop RISC-V technology; among other things, it created a RISC-V softcore called Ibex.
Gray Research LLC [5] proposed an accelerator built from RISC-V cores made as resource-efficient as possible, so that as many cores as possible fit in an FPGA to increase parallelism.
3 Microarchitecture

In this section, all the Medium Scale Integration (MSI) blocks of the proposed RISC-V softprocessor architecture are presented. In addition, we describe the design of each part of the microcontroller and its implementation in a Xilinx FPGA. Moreover, we take into account how the synthesis tool converts the code into an implementable design.

3.1 Softprocessor Design
The architecture was implemented in VHDL and is shown in Fig. 1. The code of all the blocks and of the complete architecture is released on the OpenCores platform at https://opencores.org/projects/ecriscv. This platform is the most prominent online community for the development of gateware IP (Intellectual Property) cores. The RISCV-EC softprocessor is based on the architecture of a RISC-V processor, and the architecture design considerations are specified in the ISA documentation. First, the processor has 32 registers, of which the first register (X0) always holds the value zero since it is physically connected to ground.

• Arithmetic Logic Unit: This block is responsible for performing arithmetic operations between integers as well as comparisons. It has two 32-bit inputs and one 4-bit input for the Operation Code (Opcode), which is a type of flag for communication between the control unit and the ALU, as shown in Table 1. In addition, it has a 32-bit output for the results. No carry output is needed, since the RISC-V base instruction set defines no carry flag and results are truncated to 32 bits. In VHDL, a case statement on the Opcode selects the operation that the block performs.
• Register Bank: The register bank is a memory of 32 registers of 32 bits each. It has three address inputs of 5 bits each (two read addresses and one write address), a data input of 32 bits, a write enable signal (WE) to write the data, a clock input and two outputs of 32 bits each. The operation of this memory is asynchronous for reading and synchronous for writing. This is achieved by describing the behavior of the block and letting the synthesis tool infer the hardware, so the block is mostly made up of flip-flops.
• Instruction Memory: This MSI block, implemented in VHDL, is an 8-bit ROM that has a 32-bit input for the address and a 32-bit output.
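The ALU and register bank described above can be approximated by a short behavioral model. The following is only a sketch in Python (the released design is in VHDL); the operation names, the helper `to_signed` and the class `RegisterBank` are illustrative assumptions and do not reproduce the authors' 4-bit opcode encoding.

```python
MASK32 = 0xFFFFFFFF  # results are truncated to 32 bits, as in a 32-bit datapath


def to_signed(x):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Behavioral ALU that dispatches on an operation name.

    The names are hypothetical labels standing in for the 4-bit opcode
    that the control unit sends to the ALU.
    """
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F  # shifts use only the low 5 bits of the second operand
    if op == "add":
        r = a + b
    elif op == "sub":
        r = a - b
    elif op == "and":
        r = a & b
    elif op == "or":
        r = a | b
    elif op == "xor":
        r = a ^ b
    elif op == "sll":
        r = a << shamt
    elif op == "srl":
        r = a >> shamt
    elif op == "sra":
        r = to_signed(a) >> shamt
    elif op == "slt":
        r = int(to_signed(a) < to_signed(b))
    elif op == "sltu":
        r = int(a < b)
    else:
        raise ValueError(f"unknown ALU operation: {op}")
    return r & MASK32


class RegisterBank:
    """32 registers of 32 bits; register 0 always reads as zero."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rs1, rs2):
        # Reads are combinational (asynchronous) in the described design.
        return self.regs[rs1], self.regs[rs2]

    def write(self, rd, value, we=True):
        # Models one clock edge: the write happens only when WE is set.
        if we and rd != 0:
            self.regs[rd] = value & MASK32
```

In this model an addition that exceeds 32 bits simply wraps around, which is why no carry output appears.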
Fig. 1. Diagram of the proposed architecture

Table 1. Instructions that can be executed by the RISCV-EC processor

Operations between registers   Operations with immediate   Flow control operations   Memory operations
add                            addi                        beq                       lb
sub                            slti                        bne                       lh
sll                            sltiu                       blt                       lw
slt                            xori                        bge                       lb
sltu                           ori                         bltu                      lhu
xor                            andi                        bgeu                      sb
srl                            sli                                                   sh
sra                            srli                                                  sw
and                            sli                                                   sh

3.1.1 Program Counter
This MSI block is a register that holds the address of the program instruction that the processor is executing. It increases by 4 every clock cycle by means of an adder and a multiplexer.

• Instruction Decoder: This block separates the 32-bit instruction into the different fields needed by the processor. The instruction is split following the format of R-type instructions, i.e., the first 7 bits for Func7, 5 bits each for rs2 and rs1, 3 bits for Func3, 5 bits for RD, and finally 7 bits for the opcode. This is shown in Fig. 2.
• Multiplexers: The multiplexer is an MSI block that changes its output by selecting between its inputs. The design contains 4 multiplexers in total, and the control signal for each one is generated by the control unit.
• Sign extender: This MSI block is used for immediate instructions, which have a 12-bit integer value encoded in the instruction. Since this value must pass through the ALU, which has 32-bit inputs, the 12 bits must be extended to 32 bits without changing magnitude or sign.
• Control Unit: The MSI block called Control Unit is based on a Moore Finite State Machine (FSM). It is responsible for generating the control signals that lead the processor to execute the instructions. It takes as input the opcode, func3 and func7 fields of the RISC-V instruction format for R-type instructions. A case statement on the opcode identifies the instruction type (R, I, S, B, U or J). A nested case statement on Func3 then selects the specific instruction. Finally, for instructions such as addition and subtraction, which share the same opcode and Func3, an if-else or when-else statement on Func7 selects between the two.

Fig. 2. Instruction Format
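To make the decoder, sign extender and control unit bullets concrete, the sketch below extracts the R-type fields, sign-extends a 12-bit immediate and classifies an instruction word by its major opcode. It is written in Python for illustration, not in the authors' VHDL; the function names are hypothetical, while the bit positions and opcode values are those of the RV32I base instruction set.

```python
def decode_fields(instr):
    """Split a 32-bit RISC-V instruction word into the R-type fields."""
    return {
        "opcode": instr & 0x7F,          # bits 6..0
        "rd": (instr >> 7) & 0x1F,       # bits 11..7
        "funct3": (instr >> 12) & 0x07,  # bits 14..12
        "rs1": (instr >> 15) & 0x1F,     # bits 19..15
        "rs2": (instr >> 20) & 0x1F,     # bits 24..20
        "funct7": (instr >> 25) & 0x7F,  # bits 31..25
    }


def sign_extend(value, bits=12):
    """Extend a `bits`-wide two's-complement value to a signed integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


# Major opcodes of the RV32I base ISA mapped to instruction formats.
FORMATS = {
    0x33: "R",  # register-register ALU operations
    0x13: "I",  # ALU operations with immediate
    0x03: "I",  # loads
    0x67: "I",  # jalr
    0x23: "S",  # stores
    0x63: "B",  # conditional branches
    0x37: "U",  # lui
    0x17: "U",  # auipc
    0x6F: "J",  # jal
}


def classify(instr):
    """Return the format letter and, for R-type add/sub, the mnemonic."""
    f = decode_fields(instr)
    fmt = FORMATS.get(f["opcode"], "?")
    name = None
    if fmt == "R" and f["funct3"] == 0:
        # add and sub share opcode and funct3; funct7 tells them apart
        name = "sub" if f["funct7"] == 0x20 else "add"
    return fmt, name
```

For example, the word 0x002081B3 encodes add x3, x1, x2, and 0x402081B3 differs only in Func7 and encodes sub x3, x1, x2.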
3.2 Instruction Types

The instruction types are Type-R, Type-I, Type-S, Type-B and Type-J, which correspond to Register, Immediate, Store, Branch and Jump, respectively. Each one has its own format, and the RISCV-EC softprocessor is capable of differentiating them and executing the instructions one at a time. In the following, we explain how each type of instruction works in the processor, starting with the R-type instructions.
• Type R: As can be seen in Fig. 1, in the RISCV-EC softprocessor architecture the instructions first go from the instruction memory to a single instruction decoder. Then the opcode, func7 and func3 fields are sent to the control unit, which is in charge of generating activation signals for the other elements, such as the ALU, depending on the instruction, following Table 1. INM-MUX is the signal that controls the mux that selects the field that enters the immediate generator block; in this case its value is a don't care, since no immediate is used in these instructions. REG-WE is 1, since the value that comes out of the ALU must be stored in a register. PC-MUX and MEM-WE are 0, since this is neither a memory instruction nor a branch instruction. Finally, ALU-MUX and MEM2REG-MUX are set to 1, since the register bank output must be selected as the ALU input and the ALU output must go to the din port of the register bank.
• Type I: This case is similar to the previous one as far as activation signals are concerned. The ALU receives from the control unit (CU) the opcode corresponding to the operation. REG-WE and MEM2REG-MUX are 1, since the first is in charge of writing the results of the ALU to the register bank and the second routes the output of the ALU to the data input of the register bank. The rest of the control signals are 0. In this type of instruction, bits 31 to 20 of the instruction are taken together and extended by the sign extender block before reaching one of the ALU ports, whose output is stored in the register bank.
• Type S: The control signals activated in these instructions are INM-MUX and MEM-WE; the rest are deactivated. First, an immediate is created by joining bits 31 to 25 and 11 to 7. This 12-bit immediate is then sign-extended by the sign extender block and added to the value of the register whose address is in RS1. The result becomes the memory address where the data will be stored. The data itself comes from the RS2 register, which is not used in the ALU because ALU-MUX is deactivated.
• Type L: In this type of instruction REG-WE is activated, and the rest of the signals are 0. The value of RS1 is added to an immediate in the ALU; the result becomes the memory address to read from, and the data read arrives directly at the data input (din) of the register bank to be saved in the register addressed by RD.
• Type B: In this case the opcode of the ALU depends on the instruction, according to the operations in Table 1; PC-MUX and ALU-MUX are 1 and the rest are 0. The value in RS1 is compared with the value in RS2, and the result of this comparison is fed back to the CU so that it knows whether the branch is taken and must activate PC-MUX. This multiplexer changes the input of the adder that increments the program counter. The amount by which the program counter increases or decreases is encoded in the instruction bits occupied by the func7 and rd fields, in a different bit order from the previous formats, so it must pass through a block that reorders the bits to generate the desired immediate.

The register bank receives the RS2, RS1 and RD fields from the instruction encoding. RS2 and RS1 are connected to the read address ports of the register bank, and the data stored at those addresses appear on Do2 and Do1 immediately. RD drives the ad port, which is the write address and works together with the we port to store the data arriving on din synchronously with the clock. The outputs of the register bank go to the ALU: Do1 directly to an ALU port and Do2 through a multiplexer, which selects whether an instruction uses an immediate or operates between registers.
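The immediate handling for Type S and Type B can be illustrated as follows. This is a minimal Python sketch under the RV32I encoding, not the authors' VHDL; `imm_s`, `imm_b` and `next_pc` are hypothetical names, and `next_pc` only mimics the role the text assigns to PC-MUX.

```python
def sign_extend(value, bits):
    """Extend a `bits`-wide two's-complement value to a signed integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def imm_s(instr):
    """S-type: imm[11:5] = instr[31:25], imm[4:0] = instr[11:7]."""
    imm = (((instr >> 25) & 0x7F) << 5) | ((instr >> 7) & 0x1F)
    return sign_extend(imm, 12)


def imm_b(instr):
    """B-type: the offset bits are scattered and bit 0 is always zero.

    imm[12] = instr[31], imm[11] = instr[7],
    imm[10:5] = instr[30:25], imm[4:1] = instr[11:8].
    """
    imm = (
        (((instr >> 31) & 0x1) << 12)
        | (((instr >> 7) & 0x1) << 11)
        | (((instr >> 25) & 0x3F) << 5)
        | (((instr >> 8) & 0xF) << 1)
    )
    return sign_extend(imm, 13)


def next_pc(pc, instr, branch_taken):
    """Add the branch offset when the branch is taken, otherwise add 4."""
    step = imm_b(instr) if branch_taken else 4
    return (pc + step) & 0xFFFFFFFF
```

As a usage example, 0x0020A423 encodes sw x2, 8(x1), so `imm_s` returns 8, and 0xFE208CE3 encodes beq x1, x2, -8, so a taken branch at address 16 yields a next address of 8.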
4 Numerical Results

We run Algorithm 1 to evaluate and compare the performance of the different development boards used in this paper. Another important parameter is the number of logic elements used in the FPGA when synthesizing the VHDL code.

Algorithm 1: Fibonacci series iterative test to estimate the runtime
  t1 = get_time()
  i = 0; a = 0; b = 1
  while i < number_of_iterations do
    a = a + b
    b = a + b
    i++
  end
  t2 = get_time()
  print(t2 − t1)
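For readers who want to try the loop on a PC, a direct Python transcription of Algorithm 1 is sketched below. The 32-bit mask is an assumption made here to imitate 32-bit registers; the paper does not state how overflow is handled, and the function name is hypothetical.

```python
import time

MASK32 = 0xFFFFFFFF  # assumption: values are held in 32-bit registers


def fibonacci_test(iterations):
    """Run the loop of Algorithm 1 and return (a, b, elapsed_seconds)."""
    t1 = time.perf_counter()
    i, a, b = 0, 0, 1
    while i < iterations:
        a = (a + b) & MASK32
        b = (a + b) & MASK32
        i += 1
    t2 = time.perf_counter()
    return a, b, t2 - t1
```

Each iteration advances the sequence by two terms, so after k iterations a and b hold the Fibonacci numbers F(2k) and F(2k+1), with F(1) = F(2) = 1. Under the 32-bit assumption the values wrap around from the 24th iteration onward, because F(48) no longer fits in 32 bits; this does not affect a timing test, which only counts the work done per iteration.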
Table 2. FPGA resource usage after synthesis

Slice LUTs   Slice Registers   F7 Muxes   F8 Muxes   Slice     LUT as Logic
(53200)      (106400)          (26600)    (13300)    (13300)   (53200)
3.69%        2.36%             0.77%      0.24%      7.03%     3.20%
0.84%        0.70%             0.00%      0.00%      1.76%     0.80%
2.84%        1.66%             0.77%      0.24%      5.33%     2.40%

LUT as Memory   Block RAM Tile   Bonded IOB   BUFGCTRL   BSCANE2
(17400)         (140)            (125)        (32)       (4)
1.48%           1.07%            0.80%        9.38%      25.00%
0.14%           0.00%            0.00%        3.13%      25.00%
1.34%           1.07%            0.00%        3.13%      0.00%
4.1 Resource Usage

The amount of programmable logic elements occupied in the FPGA after synthesizing the design is presented in Table 2. These results show that the amount of logic elements used is very low. Thus, the proposed processor can also be used in another FPGA with fewer logic elements.
Fig. 3. Comparison among RISCV-EC, Raspberry Pi Pico and Zynq-7000 processors

4.2 Performance Comparison

Algorithm 1 in Sect. 4 was used to test the performance of each processor, taking into account their implementations and the differences between their instruction sets. Fig. 3 shows a comparison between the proposed processor, the Raspberry Pi Pico and the Zynq-7000 processor. All processors were programmed in assembly language, using registers instead of memory to perform the mathematical operations. As we can see, the RISCV-EC is the slowest, followed by the Raspberry Pi Pico, and the fastest is the Zynq-7000. This is because the processors' clock frequencies are different (the RISCV-EC, Raspberry Pi Pico and Zynq-7000 use 50 MHz, 125 MHz and 650 MHz clocks, respectively). The lack of a pipeline in the RISCV-EC keeps its maximum frequency low. The RP2040, which is the heart of the Raspberry Pi Pico, contains an ARM Cortex M0+ core, which is designed for high efficiency and has a short 2-stage pipeline, so its maximum clock frequency is also low. Finally, the SoC of the PYNQ-Z1 has two ARM Cortex A9 cores. This processor has been on the market for several years; it was created for a 45 nm manufacturing process and has an 8-stage pipeline. Although its frequency is the highest in this comparison, it is low by today's standards. In summary, we can conclude that the higher the clock frequency, the higher the performance, and that pipeline stages are essential for a processor to reach a higher frequency. This experiment allows us to measure the implementation of each processor, but because of the large difference in clock frequency, this test alone cannot reveal differences in execution time caused by the instruction sets.
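The closing remark, that clock frequency masks architectural differences, can be made concrete by converting a measured runtime into clock cycles. The sketch below assumes that the cycle count of a program does not depend on the clock frequency, which ignores memory and I/O effects; the function names are hypothetical, and the figures in the usage note are arithmetic examples, not measurements from this paper.

```python
def cycles(runtime_s, clock_hz):
    """Clock cycles consumed by a run that took runtime_s at clock_hz."""
    return runtime_s * clock_hz


def rescaled_runtime(runtime_s, clock_hz, target_hz):
    """Runtime that the same cycle count would take at target_hz.

    Assumes the cycle count is independent of the clock frequency.
    """
    return cycles(runtime_s, clock_hz) / target_hz
```

For instance, a run of 1 ms at 50 MHz corresponds to 50,000 cycles, and the same cycle count would take 0.4 ms at 125 MHz. Comparing cycle counts rather than runtimes is one way to separate the effect of the architecture from the effect of the clock.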
Fig. 4. Performance comparison of RISCV-EC (16 MHz) vs. Arduino Uno (16 MHz) at a lower clock frequency.

Fig. 5. Comparison of RISCV-EC vs. Raspberry Pi Pico with similar operating frequency.

To compare the proposed processor with one running at a similar frequency, we used the Arduino Uno, since it has a lower frequency of 16 MHz and a 1-stage pipeline. Fig. 4 presents the results of this comparison. There is also a difference in the power consumption of these two processors; however, as it is only an estimate, we cannot draw any conclusions in that respect. As expected, the RISCV-EC is significantly faster than the Arduino. This is because the instruction set allows code to be written a little more efficiently, and also because the Arduino has an 8-bit bus while the RISCV-EC has a 32-bit bus.

In Fig. 5, we can see the results of the comparison between the RISCV-EC processor and the ARM Cortex M0+ processor of the Raspberry Pi Pico. The operating frequency of the RISCV-EC was set to match the 125 MHz of the ARM Cortex M0+. It can be noticed that the RISCV-EC performs better when the number of iterations is higher than 300. On the other hand, with fewer than 300 iterations the ARM Cortex M0+ processor of the Raspberry Pi Pico is faster than the 32-bit RISCV-EC processor.

To make a fair comparison with the Zynq-7000, we also performed a comparison using a simulator that allows us to configure the RISCV-EC with a higher operating frequency. We did this through simulation because the experimental RISCV-EC processor was not able to reach the higher frequency due to its physical limitations. Fig. 6 shows the comparison between the RISCV-EC and the Zynq-7000, both with the same clock frequency. From this figure, it can be seen that the Cortex A9 is the fastest even when both use clocks with the same frequency.

Fig. 6. Simulated performance of RISCV-EC vs. Zynq-7000 with the same clock frequency.
5 Conclusions

In this paper, the design of the first Ecuadorian open-source softprocessor is proposed, which is based on a single-core RISC-V architecture. Moreover, we presented a performance comparison with other processors: the AVR ATMEGA328P microcontroller and the ARM Cortex M0+ and A9. The RISCV-EC softprocessor configuration in the FPGA is flexible, allowing the clock frequency to be set to 16 MHz, 125 MHz and 650 MHz for comparisons with the other processors. The implementation of the RISCV-EC softprocessor uses about 3.69% of the Look-Up Tables (LUTs) of the FPGA. This demonstrates the scalability of the FPGA design, which consumes few logic elements. Numerical results show that the RISCV-EC softprocessor outperforms the AVR ATMEGA328P processor at any number of Fibonacci series iterations, while the RISCV-EC processor is better than the ARM Cortex M0+ processor when the number of iterations is greater than 300. Finally, the RISCV-EC is better than the ARM Cortex A9 processor for fewer than 18 iterations.
References
1. Open to the core. https://lowrisc.org/
2. Asanza, V., Estrada, R., Miranda, J., Rivas, L., Torres, D.: Performance comparison of database server based on SoC FPGA and ARM processor. In: 2021 IEEE Latin-American Conference on Communications (LATINCOM), pp. 1–6 (2021). https://doi.org/10.1109/LATINCOM53176.2021.9647742
3. Asanza, V., Pico, R.E., Torres, D., Santillan, S., Cadena, J.: FPGA based meteorological monitoring station. In: 2021 IEEE Sensors Applications Symposium (SAS), pp. 1–6 (2021). https://doi.org/10.1109/SAS51076.2021.9530151
4. Devi, G.G., Swamy, G.K.: An overview of microcontroller unit: from proper selection to specific application. J. Crit. Rev. 3(1) (2016)
5. Gray, J.: GRVI phalanx: a massively parallel RISC-V FPGA accelerator accelerator. In: 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp. 17–20. IEEE (2016)
6. Ledin, J., Farley, D.: Modern Computer Architecture and Organization: Learn x86, ARM, and RISC-V Architectures and the Design of Smartphones, PCs, and Cloud Servers. Packt Publishing Ltd., Birmingham (2022)
7. Montesdeoca, G., Asanza, V., Chica, K., Peluffo-Ordóñez, D.H.: Analysis of sorting algorithms using a WSN and environmental pollution data based on FPGA. In: 2022 International Conference on Applied Electronics (AE), pp. 1–4 (2022). https://doi.org/10.1109/AE54730.2022.9920090
8. Qin, S., Berekovic, M.: A comparison of high-level design tools for SOC-FPGA on disparity map calculation example. arXiv preprint: arXiv:1509.00036 (2015)
9. Salem, A.K.B., Othman, S.B., Saoud, S.B.: Hard and soft-core implementation of embedded control application using RTOS, pp. 1896–1901 (2008)
10. Tong, J.G., Anderson, I.D., Khalid, M.A.: Soft-core processors for embedded systems. In: 2006 International Conference on Microelectronics, pp. 170–173. IEEE (2006)
11. Wali, I., Sánchez-Macián, A., Ramos, A., Maestro, J.A.: Analyzing the impact of the operating system on the reliability of a RISC-V FPGA implementation. In: 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS), pp. 1–4. IEEE (2020)
Vulnerability of the Hypercube Network Based on P2-cuts

Yuan-Hsiang Teng1 and Tzu-Liang Kung2,3(B)

1 Department of Computer Science and Information Engineering, Providence University, Taichung, Taiwan
[email protected]
2 Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan
[email protected]
3 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
Abstract. An abstract graph G can model an interconnection network's topology. A cluster C in G is a node subset of G such that the subgraph induced by C is connected. Theoretically speaking, the order of the largest component of G after certain faulty clusters are removed can be regarded as an index of fault tolerance. A path consisting of k distinct nodes is abbreviated as Pk. A set F of clusters of G is a P2-cut if (i) all clusters of F induce subgraphs that are isomorphic to P2, and (ii) G − F is trivial or disconnected. Because of the growing popularity of the hypercube architecture Qn in real-world supercomputers, this paper is devoted to exploring the vulnerability of Qn based on P2-cuts.
1 Introduction

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 221–228, 2023. https://doi.org/10.1007/978-3-031-35836-4_24

An abstract graph G can model an interconnection network's topology to deal with many real-world applications based on graph theory [1]. For an undirected graph G, V(G) and E(G) denote the node and edge sets of G, respectively. A graph T is a subgraph of G if V(T) ⊆ V(G) and E(T) ⊆ E(G). Two nodes u and v are adjacent if they are linked by an edge uv; they are then neighbors of each other. A node's degree is the number of its neighbors; for any node v ∈ V(G), degG(v) denotes the degree of v. Some graph-theory terminology and notations are listed below in advance:

• δ(G) = min{degG(v) | v ∈ V(G)} is the minimum degree of G.
• NG(v) denotes the set of the neighbors of v ∈ V(G), a.k.a. the neighborhood of v.
• NG(A) denotes the open neighborhood of A ⊂ V(G), formally defined as NG(A) = ∪_{w∈A} NG(w) \ A.
• G[A] is the subgraph induced by A ⊆ V(G).
• Pk is a path that consists of k distinct nodes, where k ≥ 1.
• distG(u, v) denotes the distance between two nodes u and v, equal to the length of the shortest u,v-path in G.
• κ(G) is the connectivity of G.
• κg(G) is the g-extra connectivity of G.
• κ(G|P2) is the P2-connectivity of G.
• κ(G|P2∗) is the P2∗-connectivity of G.

If there exist paths joining every two distinct nodes in a graph G, then G is connected; otherwise, G is disconnected. A node-cut F of G is a subset of V(G) such that G − F is disconnected or trivial. Thus, κ(G) is the minimum cardinality over node-cuts of G. Each node's neighborhood is a trivial node-cut of G; thus, δ(G) ≥ κ(G). More specifically, a g-extra node-cut F of G is a node-cut such that no component of G − F has g or fewer nodes, where g ≥ 0. Then κg(G) is the minimum cardinality over g-extra node-cuts of G [3]. If a subset C ⊆ V(G) of nodes induces a connected subgraph of G, then C is a cluster in G. Once each node of C becomes faulty, G − C is the surviving graph. Today's fault-tolerant routing also considers the cluster fault model to maintain route tables [2,5–7]. According to [9], a set F of clusters is a P2-cut of G if (i) all clusters of F induce subgraphs that are isomorphic to P2, and (ii) G − F is trivial or disconnected. Analogously, F is a P2∗-cut of G if (i) all clusters of F induce subgraphs that are isomorphic to either P2 or P1, and (ii) G − F is trivial or disconnected. Then κ(G|P2) and κ(G|P2∗) are defined as the minimum cardinality over P2-cuts and P2∗-cuts of G, respectively.

Because of the growing popularity of the hypercube architecture Qn in real-world supercomputers [8,10], this paper is devoted to exploring the vulnerability of Qn based on P2-cuts as well as P2∗-cuts. To discuss this issue in a rigorous way, Sect. 2 provides several network properties of hypercubes; Sect. 3 formally proves this paper's main results. Finally, the conclusion is drawn in Sect. 4.
2 Preliminary

The n-dimensional hypercube Qn is a network with 2^n nodes and n·2^(n−1) edges, provided n ≥ 1. Every node of Qn corresponds to an n-bit binary number. Hypercubes follow a standard recursive model. Some notations are listed below; their formal definitions can be found in [9].

• (v)_p is the p-neighbor of v ∈ V(Qn), where 0 ≤ p ≤ n − 1.
• v(v)_p is a p-edge, where 0 ≤ p ≤ n − 1.
• Qn = Q^(0)_{n−1} ⊕ Q^(1)_{n−1} denotes the recursive model of Qn, where Q^(0)_{n−1} and Q^(1)_{n−1} are two disjoint copies of Qn−1 such that a ∈ V(Q^(0)_{n−1}) if and only if (a)_{n−1} ∈ V(Q^(1)_{n−1}).

It is noted that Qn is K3-free, n-regular, and edge- and node-transitive [10]. For clarity, the 5-, 4- and 3-dimensional hypercubes are depicted in Fig. 1.

Lemma 1 [11]. For {a, b} ⊂ V(Qn) with a ≠ b, |NQn(a) ∩ NQn(b)| = 2 if and only if distQn(a, b) = 2.
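The counts above and Lemma 1 can be checked by brute force for small n. The sketch below is only an illustration in Python: it assumes that the p-neighbor of a node is obtained by flipping bit p of its binary label (the formal definition is given in [9]), and the function names are hypothetical.

```python
from itertools import combinations


def neighbors(v, n):
    """Neighbors of node v in Q_n: flip each of the n bits in turn."""
    return {v ^ (1 << p) for p in range(n)}


def distance(a, b):
    """Under the bit-flip model, distance is the Hamming distance."""
    return bin(a ^ b).count("1")


def check_basic_properties(n):
    """Check the edge count of Q_n and Lemma 1; return the number of edges."""
    nodes = range(2 ** n)
    edges = {frozenset((v, w)) for v in nodes for w in neighbors(v, n)}
    assert len(edges) == n * 2 ** (n - 1)
    # Lemma 1: two distinct nodes share exactly two neighbors
    # if and only if they are at distance two.
    for a, b in combinations(nodes, 2):
        common = neighbors(a, n) & neighbors(b, n)
        assert (len(common) == 2) == (distance(a, b) == 2)
    return len(edges)
```

Calling `check_basic_properties(n)` for n from 1 to 6 raises no assertion error, which agrees with the stated edge count and with Lemma 1 on these small cases; it is a sanity check, not a proof.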
Fig. 1. Illustrating Q5, Q4 and Q3.
Lemma 2 [4]. Let A ⊂ V(Qn) with |A| ≥ 1 and n ≥ (|A| + 1)/2. Then |NQn(A)| ≥ n|A| − |A|(|A| + 1)/2 + 1.

Lemma 3 [12]. For 0 ≤ g ≤ n − 4, κg(Qn) = ((2n − 3)g + 2n − g²)/2; for n ≥ g ≥ n − 3 with n ≥ 4, κg(Qn) = (n² − n)/2.

Lemma 4 [9]. For n ≥ 3, κ(Qn|P2) = κ(Qn|P2∗) = n − 1.
3 Main Results
Lemma 5. Let X be any P2∗ - or P2 -cut of Qn for n ≥ 3 such that Qn − X has a trivial component. Then |X| is no less than n. If |X| = n, Qn − X has only two components.
Y.-H. Teng and T.-L. Kung
Proof. Denote by $a$ the node of the smallest component of $Q_n - X$. Because $Q_n$ is $K_3$-free, we have $|A \cap N_{Q_n}(a)| \le 1$ for every $A \in X$. Then
$$|X| \ge \frac{|N_{Q_n}(a)|}{\max_{A \in X}\{|A \cap N_{Q_n}(a)|\}} = n.$$
In particular, suppose that $|X| = n$. As $Q_n = Q_{n-1}^{(0)} \oplus Q_{n-1}^{(1)}$, it is assumed that $a \in V(Q_{n-1}^{(1)})$. Let $X_0 = \bigcup_{B \in X,\ B \cap V(Q_{n-1}^{(0)}) \ne \emptyset} \big(B \cap V(Q_{n-1}^{(0)})\big)$ and $X_1 = \bigcup_{B \in X,\ B \cap V(Q_{n-1}^{(1)}) \ne \emptyset} \big(B \cap V(Q_{n-1}^{(1)})\big)$. Obviously, $Q_{n-1}^{(0)}[X_0]$ is isomorphic to $K_{1,|X_0|-1}$, and $Q_{n-1}^{(0)} - X_0$ is connected. Moreover, $Q_{n-1}^{(1)} - X_1 - \{a\}$ is linked to $Q_{n-1}^{(0)} - X_0$. Therefore, $Q_n - X$ has only two components if $|X| = n$.

Lemma 6. Let $X$ be any $P_2$- or $P_2^*$-cut of $Q_4$ such that $|X| = 3$. Then $Q_4 - X$ has only two components.
Proof. By Lemma 4, $X$ is a minimum $P_2$- or $P_2^*$-cut in $Q_4$. If one cluster of $X$ were isomorphic to $P_1$, then we would have $|\bigcup_{B \in X} B| = \sum_{B \in X} |B| < 2|X| = 6 = \kappa_1(Q_4)$, so that the smallest component of $Q_4 - X$ would be trivial. However, if the smallest component of $Q_4 - X$ were trivial, it would follow from Lemma 5 that $|X| \ge 4$, contradicting the cardinality of $X$. As a consequence, every cluster of $X$ must be $P_2$, and the smallest component of $Q_4 - X$, denoted by $H$, has two or more nodes. Since $|\bigcup_{B \in X} B| = 2|X| = 6$, $Q_4 - X$ contains exactly ten nodes. Thus, $H$ includes at most five nodes. By Lemma 2, $|N_{Q_4}(V(H))| \ge 6$ if $|V(H)| \in \{2, 5\}$, and $|N_{Q_4}(V(H))| \ge 7$ if $|V(H)| \in \{3, 4\}$. As $|N_{Q_4}(V(H))| \le 2|X| = 6$, we have $|V(H)| = 5$ or $|V(H)| = 2$.
• Case 1: $|V(H)| = 5$. Trivially, $Q_4 - X$ has only two components.
• Case 2: $|V(H)| = 2$. Let $V(H) = \{a, x\}$. Obviously, $N_{Q_4}(\{a, x\}) \subseteq \bigcup_{B \in X} B$. Because $Q_4$ is $K_3$-free, we have $|N_{Q_4}(\{a, x\})| = 6 = 2|X|$. This implies that $N_{Q_4}(\{a, x\}) = \bigcup_{B \in X} B$. As $Q_4$ is edge- and node-transitive, it is assumed that $\{a, x\} = \{0000, 1000\}$. Accordingly, $X = \{\{0001, 1001\}, \{0010, 1010\}, \{0100, 1100\}\}$. Clearly, $Q_4 - X$ has only two components, as illustrated in Fig. 2.
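Lemma 6 is small enough to confirm by exhaustive enumeration. The following is a minimal sketch of such a check, not the authors' program; it assumes that the clusters of a cut are pairwise disjoint and encodes nodes of Q4 as 4-bit integers. All function and variable names are hypothetical.

```python
from itertools import combinations

N = 4
NODES = range(1 << N)

def neighbors(v):
    """Neighbors of v in Q4: flip each of the four bits."""
    return [v ^ (1 << p) for p in range(N)]

def components(removed):
    """Connected components of Q4 minus the removed node set (iterative DFS)."""
    seen, comps = set(removed), []
    for s in NODES:
        if s in seen:
            continue
        comp, stack = [s], [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            for w in neighbors(u):
                if w not in seen:
                    seen.add(w)
                    comp.append(w)
                    stack.append(w)
        comps.append(comp)
    return comps

# All clusters inducing P1 (a single node) or P2 (a single edge).
P1 = [frozenset([v]) for v in NODES]
P2 = [frozenset([v, w]) for v in NODES for w in neighbors(v) if v < w]

def cuts_of_size(k, clusters):
    """Yield (X, components) for every set X of k disjoint clusters that disconnects Q4."""
    for X in combinations(clusters, k):
        removed = frozenset().union(*X)
        if len(removed) != sum(len(B) for B in X):
            continue  # overlapping clusters are skipped (assumption: disjoint clusters)
        comps = components(removed)
        if len(comps) >= 2:
            yield frozenset(X), comps

cuts = list(cuts_of_size(3, P1 + P2))
print(len(cuts), {len(comps) for _, comps in cuts})
```

With only 48 candidate clusters, the search visits 17,296 triples and runs instantly. The same idea extends in principle to the Q5 case of Lemma 8, although the search space is far larger there.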
Lemma 7. Let $X$ be any $P_2$- or $P_2^*$-cut of $Q_5$ such that $|X| = 4$. Then $Q_5 - X$ has only two components, the smaller of which is $P_2$.
Proof. Denote by $H$ the smaller component of $Q_5 - X$. By Lemma 3, we have $\kappa_2(Q_5) = 10 > 8 = 2|X|$. This implies that $|V(H)| \le 2$. If $|V(H)|$ were equal to one, then we would have $|X| \ge 5$ according to Lemma 5, contradicting the cardinality of $X$.
Below we consider $|V(H)| = 2$. Let $V(H) = \{a, y\}$. Obviously, $N_{Q_5}(\{a, y\}) \subseteq \bigcup_{A \in X} A$. Because $Q_5$ is $K_3$-free, we have $|N_{Q_5}(\{a, y\})| = 8 \le |\bigcup_{A \in X} A| = \sum_{A \in X} |A| \le 2|X| = 8$. This implies that $N_{Q_5}(\{a, y\}) = \bigcup_{A \in X} A$. As $Q_5$ is edge- and node-transitive, it is assumed that $\{a, y\} = \{00000, 10000\}$. Accordingly, $X = \{\{00001, 10001\}, \{00010, 10010\}, \{00100, 10100\}, \{01000, 11000\}\}$. Clearly, $Q_5 - X$ has only two components, and $H$ is indeed $P_2$.
Fig. 2. Q4 − X consists of exactly two components, where X = {{0001, 1001}, {0010, 1010}, {0100, 1100}}.
Lemma 8. Let $X$ be any $P_2$- or $P_2^*$-cut of $Q_5$ such that $|X| = 5$. Then $Q_5 - X$ has only two components, the smaller of which is either $P_1$ or $P_2$.
Proof. This lemma was verified by an exhaustive (brute-force) computer search.
Theorem 1. For $n \ge 5$, let $X(n)$ be any $P_2$- or $P_2^*$-cut of $Q_n$ such that $|X(n)| \le n$. Then $Q_n - X(n)$ has only two components, the smaller of which is either $P_1$ or $P_2$.
Proof. Lemma 4 says that $n - 1 = \kappa(Q_n|P_2^*) = \kappa(Q_n|P_2) \le |X(n)|$. The proof proceeds by induction on $n$. Based on Lemmas 7 and 8, the statement holds if $n = 5$. For $5 \le k < n$, the inductive hypothesis assumes that $Q_k - X(k)$ consists of exactly two components, the smaller of which is isomorphic to $P_1$ or $P_2$. According to Lemma 3, $\kappa_2(Q_n) = 3n - 5$ for $n \ge 5$. As $X(n)$ includes at most $2|X(n)| \le 2n < 3n - 5$ nodes for $n \ge 6$, the smallest component of $Q_n - X(n)$ has at most two nodes. If $|X(n)| = n - 1$, it is trivial that $Q_n - X(n)$ consists of exactly two components according to the inductive hypothesis. Below we consider $|X(n)| = n$. If every cluster of $X(n)$ is isomorphic to $P_1$, then $\bigcup_{B \in X(n)} B$ must be a node's neighborhood, so that $Q_n - X(n)$ consists of exactly two components. Below we assume that at least one cluster of $X(n)$ is $P_2$. Since $Q_n \equiv Q_{n-1}^{(0)} \oplus Q_{n-1}^{(1)}$, let
$$X^0(n) = \bigcup_{B \in X(n),\ B \cap V(Q_{n-1}^{(0)}) \ne \emptyset} \{B \cap V(Q_{n-1}^{(0)})\}, \qquad X^1(n) = \bigcup_{B \in X(n),\ B \cap V(Q_{n-1}^{(1)}) \ne \emptyset} \{B \cap V(Q_{n-1}^{(1)})\},$$
and
$$X_c = \bigcup_{B \in X(n),\ B \cap V(Q_{n-1}^{(0)}) \ne \emptyset,\ B \cap V(Q_{n-1}^{(1)}) \ne \emptyset} \{B\}.$$
For $0 \le i \le n - 1$, let
$$r_i = |\{B \mid B \in X(n),\ Q_n[B] \text{ is isomorphic to a subgraph generated by an } i\text{-edge}\}|.$$
Without loss of generality, it is assumed that $r_0 \ge r_1 \ge \cdots \ge r_{n-1}$. Thus, we have $0 \le r_{n-1} \le 1$. Furthermore, we consider $|X^1(n)| \le |X^0(n)|$.
If $|X^0(n)|$ were less than $n - 2$, then both $Q_{n-1}^{(0)} - X^0(n)$ and $Q_{n-1}^{(1)} - X^1(n)$ would be connected, so that $Q_n - X(n)$ would be connected. If $|X^0(n)|$ were equal to $n$, then every node of $Q_{n-1}^{(0)} - X^0(n)$ would be linked to $Q_{n-1}^{(1)} - X^1(n)$, so that $Q_n - X(n)$ would be connected. For $Q_n - X(n)$ to be disconnected, we consider $n - 2 \le |X^0(n)| \le n - 1$. Accordingly, since $|X^1(n)| \le 3 \le n - 3$ for $n \ge 6$, $Q_{n-1}^{(1)} - X^1(n)$ is always connected. By the inductive hypothesis, $Q_{n-1}^{(0)} - X^0(n)$ consists of exactly two components $C_1$ and $C_2$, for which $1 \le |V(C_1)| \le 2 < |V(C_2)|$. Clearly, $C_2$ is connected to $Q_{n-1}^{(1)} - X^1(n)$, but $C_1$ is not. As a consequence, $Q_n - X(n)$ has only two components, the smaller of which is $P_1$ or $P_2$.

Remark 1. Let $F$ be a $P_2$-cut of $Q_4$, defined as follows: $F = \{\{0100, 0110\}, \{0001, 0011\}, \{1101, 1111\}, \{1000, 1010\}\}$. Figure 3 illustrates that $Q_4 - F$ consists of four components while $|F| = 4$.
Fig. 3. Q4 − F consists of four components, where F = {{0100, 0110}, {0001, 0011}, {1101, 1111}, {1000, 1010}}.
Remark 2. Let F be a P2 -cut of Q5 , which is defined as follows: F = {{00100, 00110}, {00001, 00011}, {01101, 01111}, {01000, 01010}, {10000, 10010}, {10101, 10111}}. Figure 4 illustrates that Q5 − F consists of three components while |F | = 6.
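The component counts claimed in Remarks 1 and 2 can be reproduced with a short script. This is a minimal sketch under the assumption that binary labels map to integers and that two nodes are adjacent exactly when their labels differ in one bit; the function name is hypothetical.

```python
def count_components(n, clusters):
    """Count the connected components of Q_n after removing every node of every cluster."""
    removed = {int(label, 2) for cluster in clusters for label in cluster}
    seen, count = set(removed), 0
    for start in range(1 << n):
        if start in seen:
            continue
        count += 1
        stack = [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for p in range(n):
                w = u ^ (1 << p)  # neighbor of u along dimension p
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count

# The P2-cut of Q4 from Remark 1.
F4 = [("0100", "0110"), ("0001", "0011"), ("1101", "1111"), ("1000", "1010")]
# The P2-cut of Q5 from Remark 2.
F5 = [("00100", "00110"), ("00001", "00011"), ("01101", "01111"),
      ("01000", "01010"), ("10000", "10010"), ("10101", "10111")]

print(count_components(4, F4))  # 4
print(count_components(5, F5))  # 3
```

Both examples use more clusters than the bound in Theorem 1 allows for a two-component guarantee in Q5, and Remark 1 concerns Q4, which lies outside the range n ≥ 5 of the theorem.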
Fig. 4. Q5 − F consists of three components, where F = {{00100, 00110}, {00001, 00011}, {01101, 01111}, {01000, 01010}, {10000, 10010}, {10101, 10111}}.
4 Conclusion

In this paper, the vulnerability of $Q_n$ is investigated with regard to $P_2$- and $P_2^*$-cuts. For any $P_2$- or $P_2^*$-cut $X$ of $Q_n$, we prove that $Q_n - X$ has only two components, the smaller of which is either $P_1$ or $P_2$, if $|X| \le n$ and $n \ge 5$. An intriguing direction for future work is to consider larger $P_2$- or $P_2^*$-cuts $F$ with $|F| > n$.

Acknowledgements. This work is supported in part by the National Science and Technology Council, Taiwan, under Grant No. NSTC 111-2221-E-468-010.
References
1. Bondy, J.A., Murty, U.S.R.: Graph Theory. Springer, London (2008). https://doi.org/10.1007/978-1-4612-9967-7
2. Bossard, A., Kaneko, K.: Cluster-fault tolerant routing in a torus. Sensors 20(11), 1–17 (2020)
3. Fábrega, J., Fiol, M.A.: On the extraconnectivity of graphs. Disc. Math. 155, 49–57 (1996)
4. Fan, J., Lin, X.: The t/k-diagnosability of the BC graphs. IEEE Trans. Comput. 54(2), 176–184 (2005)
5. Gu, Q.P., Peng, S.: An efficient algorithm for node-to-node routing in hypercubes with faulty clusters. Comput. J. 39, 14–19 (1996)
6. Gu, Q.P., Peng, S.: k-pairwise cluster fault tolerant routing in hypercubes. IEEE Trans. Comput. 46, 1042–1049 (1997)
7. Gu, Q.P., Peng, S.: Node-to-set and set-to-set cluster fault tolerant routing in hypercubes. Parallel Comput. 24, 1245–1261 (1998)
8. Harary, F., Hayes, J.P., Wu, H.J.: A survey of the theory of hypercube graphs. Comput. Math. Appl. 15, 277–289 (1988)
9. Kung, T.L., Lin, C.K.: Cluster connectivity of hypercube-based networks under the super fault-tolerance condition. Disc. Appl. Math. 293, 143–156 (2021)
10. Saad, Y., Schultz, M.H.: Topological properties of hypercubes. IEEE Trans. Comput. 37, 867–872 (1988)
11. Sabir, E., Meng, J.: Structure fault tolerance of hypercubes and folded hypercubes. Theor. Comput. Sci. 711, 44–55 (2018)
12. Yang, W., Meng, J.: Extraconnectivity of hypercubes. Appl. Math. Lett. 22, 887–891 (2009)
Applications of Artificial Fish Swarm Algorithms for Indoor Positioning and Target Tracking
Shu-Hung Lee1, Chia-Hsin Cheng2(B), Chien-Chih Lin2, and Yung-Fa Huang3(B)
1 School of Intelligent Manufacturing and Automotive Engineering, Guangdong Business and Technology University, Guangdong 526020, China
2 Department of Electrical Engineering, National Formosa University, Yunlin 632301, Taiwan
[email protected]
3 Department of Information and Communication Engineering, Chaoyang University of Technology, Taichung 413310, Taiwan
[email protected]
Abstract. Target positioning and tracking in wireless sensor networks (WSNs) are necessary for real applications. This paper uses an artificial fish swarm algorithm (AFSA) with the received signal strength indicator (RSSI) for indoor target positioning and tracking. The performance of random and fixed artificial fish deployment topologies in AFSA for target positioning and tracking is investigated when the number of artificial fish is 12, 24, 52, 72, and 100, respectively. The simulation results show that, for target positioning, the difference between the random and fixed deployments is 5% in average error and 2% in average positioning time. The target tracking performance of the fixed deployment is better than that of the random one when the number of artificial fish is small.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 229–239, 2023. https://doi.org/10.1007/978-3-031-35836-4_25

1 Introduction
Wireless sensor networks (WSNs) have gained increasing attention in various applications, such as disaster management, space exploration, forest monitoring, security installations, and factory automation [1]. WSNs use wireless communication to form self-organizing networks comprising wireless data collectors and numerous sensors. A WSN is a distributed network that does not rely on fixed infrastructure but on direct communication between nodes within range of each other. These networks require cooperative relationships to establish ad hoc communication, adapt to network changes, and have a dynamic network topology. The sensors detect temperature, light, and other environmental factors and transmit the data to the data collector through their wireless transmission equipment. This framework allows sensors or wireless data collectors to be placed randomly, reducing deployment costs and making localization [2, 3] and other applications adaptable to different environments.
Several target positioning and tracking methods have emerged in recent years, including the use of received signal strength (RSS), time of arrival (TOA), time difference of arrival (TDOA), and angle of arrival (AOA) [4–7]. However, the TOA, TDOA, and AOA methods call for more expensive equipment and higher computational complexity than RSS. AFSA has been combined with least-squares methods for target localization [8]. Likewise, AFSA and TDOA algorithms have been used for sensor localization [9]. This paper proposes an AFSA-based target positioning and tracking method for wireless sensor networks, combined with the received signal strength indicator (RSSI) channel model. Two deployment methods for the sensor nodes are studied: random deployment and fixed deployment. The performance of the algorithm is investigated from two aspects: the positioning error and positioning time of target positioning, and the positioning time and success rate of target tracking.
The rest of the paper is organized as follows. Section 2 describes the AFSA, and Sect. 3 introduces the target positioning and tracking methods. Section 4 discusses the simulation results for the target positioning and tracking methods. Finally, a conclusion is given in Sect. 5.
2 AFSA
The AFSA was proposed by Xiaolei Li et al. in 2002 [10, 11]. It imitates the foraging, gathering, and following behaviors of fish to find the optimal solution. However, the setting of the algorithm's parameters has no theoretical basis, and most parameters are set based on experience. Increasing the number of artificial fish improves accuracy but slows convergence. Some modifications introduced random search and artificial fish jumping behavior to reach the global optimal solution, while others used local search to improve optimization precision and shorten solution times [12, 13]. Artificial fish are virtual individuals that can sense environmental information and make decisions accordingly.
The AFSA expresses the state of each artificial fish (AF) as a vector $X$, composed of the variables to be solved, denoted by $x_i$ ($i = 1, 2, \cdots, n$). $Y$ represents the food concentration at the current position of the AF and can be expressed as a function of $X$, denoted by $f(X)$. The distance between two individual AFs is given by $d_{ij} = \|X_i - X_j\|$. Additionally, Visual determines the maximum field of view, Step denotes the maximum step length, $\delta$ represents the crowd factor, and Try_Number specifies the maximum number of tries allowed. The AFs move based on simple behaviors to achieve their goals, including preying, swarming, following, and random behavior.
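As an illustration of the preying behavior described above, the following minimal sketch follows the commonly used textbook formulation of AFSA rather than the exact implementation of this paper. The food-concentration function is a hypothetical stand-in for $f(X)$, the clamp that prevents overshooting $X_j$ is an added safeguard, and the parameter values are those listed in Table 2.

```python
import math
import random

VISUAL = 50.0      # maximum field of view (Table 2)
STEP = 5.0         # maximum step length (Table 2)
TRY_NUMBER = 100   # maximum number of tries (Table 2)

def food_concentration(x, target):
    """Hypothetical stand-in for Y = f(X): larger when x is closer to the target."""
    return -math.dist(x, target)

def random_offset(x, max_radius, rng):
    """Return a point at a random direction and a random radius up to max_radius from x."""
    radius = max_radius * rng.random()
    angle = 2.0 * math.pi * rng.random()
    return (x[0] + radius * math.cos(angle), x[1] + radius * math.sin(angle))

def prey(x_i, target, rng):
    """One preying move of a single artificial fish in the 2-D plane."""
    y_i = food_concentration(x_i, target)
    for _ in range(TRY_NUMBER):
        # Pick a random candidate state X_j inside the visual range of X_i.
        x_j = random_offset(x_i, VISUAL, rng)
        d_ij = math.dist(x_i, x_j)
        if d_ij > 0.0 and food_concentration(x_j, target) > y_i:
            # Move at most STEP toward X_j; min() avoids overshooting (added safeguard).
            move = min(STEP * rng.random(), d_ij)
            return tuple(a + move * (b - a) / d_ij for a, b in zip(x_i, x_j))
    # No better state found within TRY_NUMBER tries: random behavior.
    return random_offset(x_i, STEP, rng)

rng = random.Random(0)
x = (10.0, 10.0)
target = (60.0, 70.0)
x_next = prey(x, target, rng)
print(x_next)
```

Swarming and following behaviors would additionally use the crowd factor $\delta$ and the positions of neighboring fish; they are omitted here to keep the sketch short.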
3 AFSA Target Positioning and Tracking Methods
3.1 Target Positioning Method
Mobile sensors were positioned in a 100 m × 100 m square plane as artificial fish to find the optimal solution for the global best point. The point with the highest food concentration, as determined by the RSSI value, was designated as the global best solution after each artificial fish completed its algorithmic behavior. The position of the current global best solution was tracked until it met the stopping criterion for the search, making it the estimated point. This process is depicted in Fig. 1 [14].
Fig. 1. The procedure of AFSA target location method
In a square wireless sensor network with an area of 100 m × 100 m, random and fixed deployment topologies for target positioning are considered in this paper. Random deployment takes less time to set up the network and requires less cost to deploy mobile sensors randomly in the WSNs. Figure 2 is the topology of random deployment.
Fig. 2. Topology of random deployment.
In this paper, we explore the deployment of different numbers of mobile sensors using fixed deployment topology in a wireless sensor network. Specifically, we deployed a varying number of sensors, including 100, 72, 52, 24, and 12, and considered different fixed deployment topologies. Figure 3 (a)–(e) depict the different fixed deployment topologies for the sensors. These topologies are predetermined, with sensors placed in specific locations across the grid to ensure optimal coverage and collection of data.
Fig. 3. Fixed deployment topologies with 5 different numbers of AF (a) 100; (b) 72; (c) 54; (d) 24; (e) 12.
3.2 Target Tracking Method
The target tracking flow chart of AFSA is shown in Fig. 4 [14]. The mobile sensors are deployed in a 100 m × 100 m square wireless sensor network, either randomly dispersed or at fixed positions. Then, through the algorithmic behaviors, individuals cluster in areas with the highest fitness values. Unlike in target positioning, the individuals eventually gather in the area of the target point, and the position with the highest fitness value is selected as the estimated point through Eq. (1). However, because the target point moves, the algorithm may become stuck in a local solution in the next iteration. Therefore, when using a clustering algorithm for target tracking, the global optimum must be reset to its initial value GlobalBest_Y = −100 in each iteration of finding the optimal solution, so that the algorithm can start a new round of search.
$$\mathrm{GlobalBest\_Y} = Y_i, \quad \text{if } Y_i > \mathrm{GlobalBest\_Y} \tag{1}$$
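The reset of the global best and the update of Eq. (1) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the sample positions and fitness values are hypothetical, and only the initial value −100 comes from the text.

```python
INITIAL_GLOBAL_BEST_Y = -100.0  # initial value of GlobalBest_Y stated in the text

def estimate_target(positions, fitness_values):
    """Apply Eq. (1) over all artificial fish, starting from the reset value."""
    global_best_y = INITIAL_GLOBAL_BEST_Y
    global_best_x = None
    for x_i, y_i in zip(positions, fitness_values):
        if y_i > global_best_y:  # Eq. (1)
            global_best_y = y_i
            global_best_x = x_i
    return global_best_x, global_best_y

positions = [(10.0, 20.0), (40.0, 45.0), (80.0, 15.0)]

# Tracking step 1: the target is near the second fish (hypothetical RSSI values in dBm).
best_x1, best_y1 = estimate_target(positions, [-60.0, -45.0, -70.0])

# Tracking step 2: the target has moved toward the third fish. Every fitness value is
# now below the previous best of -45, so without the reset no fish could replace it.
best_x2, best_y2 = estimate_target(positions, [-80.0, -75.0, -52.0])

print(best_x1, best_y1)
print(best_x2, best_y2)
```

Because each call starts again from −100, the estimate in the second step follows the moved target instead of remaining at the stale optimum of the first step.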
4 Simulation Results
In this study, the wireless sensor network is a two-dimensional area of 100 m × 100 m. The positions of each AF and of the target points are substituted into the objective function to calculate the RSSI value. As mentioned earlier, the closer the location is to the target, the higher the RSSI value. It is assumed that each AF knows the positions of the other AFs. After repeating this process many times, the AFs move towards the region with the highest fitness value and finally converge near the global optimal solution. The AF with the highest fitness value is selected as the estimated position. The parameters of the RSSI channel model during simulation are shown in Table 1 [14]. The simulation parameters of AFSA during target positioning and tracking are shown in Table 2 [14].

Table 1. RSSI channel model parameters.

| Parameter | Value |
|---|---|
| Transmission power $P_t$ | 2 mW |
| Carrier frequency $f$ | 2.4 GHz |
| Path loss exponent $n$ | 4.5 |
| Reference distance $d_0$ | 0.5 m |
| Antenna gains $G_t$, $G_r$ | 1 |
| Standard deviation $\sigma$ | 9 dBm |
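The paper does not spell out the channel equation here, so the following sketch assumes a standard log-distance path-loss model with log-normal shadowing, using the Friis free-space equation to obtain the received power at the reference distance. The model choice, the function name, and the clamping below $d_0$ are assumptions; only the parameter values come from Table 1.

```python
import math
import random

PT_MW = 2.0            # transmission power P_t in mW (Table 1)
FREQ_HZ = 2.4e9        # carrier frequency f (Table 1)
PATH_LOSS_EXP = 4.5    # path loss exponent n (Table 1)
D0_M = 0.5             # reference distance d_0 in metres (Table 1)
GT = GR = 1.0          # antenna gains G_t, G_r (Table 1)
SIGMA_DB = 9.0         # shadowing standard deviation (Table 1 value, treated as dB)
C = 299_792_458.0      # speed of light in m/s

def rssi_dbm(d_m, rng=None):
    """Assumed log-distance RSSI in dBm at distance d_m; shadowing is added if rng is given."""
    wavelength = C / FREQ_HZ
    # Friis free-space received power at the reference distance d_0, in mW then dBm.
    pr_d0_mw = PT_MW * GT * GR * (wavelength / (4.0 * math.pi * D0_M)) ** 2
    pr_d0_dbm = 10.0 * math.log10(pr_d0_mw)
    d = max(d_m, D0_M)  # clamp distances inside the reference distance (assumption)
    shadowing = rng.gauss(0.0, SIGMA_DB) if rng is not None else 0.0
    return pr_d0_dbm - 10.0 * PATH_LOSS_EXP * math.log10(d / D0_M) + shadowing

for d in (0.5, 1.0, 10.0, 50.0):
    print(d, round(rssi_dbm(d), 2))
```

Under this assumed model, doubling the distance lowers the mean RSSI by $10 n \log_{10} 2$ dB, which is why an AF closer to the target observes a higher fitness value.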
In the simulation of the AFSA target positioning method, random and fixed deployment topologies were considered. To evaluate the accuracy of the estimated point obtained by AFSA target positioning, we calculate the average error by
$$\varepsilon = \frac{1}{N} \sum_{i=1}^{N} \left\| \hat{X}_i - X_i \right\| \tag{2}$$
where $N$ represents the total number of target points. The position of the $i$th estimated point is denoted by $\hat{X}_i$, while the position of the $i$th target point is represented as $X_i$.
Target positioning in a wireless sensor network plays a critical role in a range of applications, including emergency response, environmental monitoring, and surveillance. This paper examines the effectiveness of the AFSA target positioning method under both random and fixed deployment topologies. Tables 3 and 4 show the average error and average positioning time of AFSA under these topologies, with roughly a 4% difference between the average error of the random and fixed deployments and a 6% difference in average positioning time. Figures 5 and 6 depict the performance of the AFSA target positioning method for high and low numbers of AFs: Fig. 5 shows the results obtained with 100 AFs, while Fig. 6 shows the results obtained with 12 AFs. When 100 AFs are used in the random deployment topology, the optimal solution obtained is consistently centered around the target point, demonstrating the efficacy of the AFSA method in this setting. In contrast, the use of 12 AFs resulted in a larger positioning error, because the low number of sensors limits the range of information available to the AFs. As a result, the AFSA method falls into a local optimal solution, which limits the accuracy of the target positioning.
These results show that the performance of the AFSA method in target positioning depends on several factors, including the number of AFs and the deployment topology. The random deployment topology provides flexibility and lower cost but sacrifices coverage and precision compared with the fixed deployment topology. The latter offers better coverage and precision due to the predetermined layout of the sensors, but comes with higher costs and requires more intensive planning. Furthermore, the positioning efficacy of AFSA is also affected by the number of AFs employed in the network. A higher number can enhance accuracy, reduce positioning error, and provide a more comprehensive coverage area.
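The average error of Eq. (2) is the mean Euclidean distance between estimated and true target positions. A minimal sketch is given below; the function name and the sample coordinates are hypothetical.

```python
import math

def average_error(estimates, targets):
    """Mean Euclidean distance between estimated points and target points, as in Eq. (2)."""
    if len(estimates) != len(targets) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(math.dist(e, t) for e, t in zip(estimates, targets)) / len(targets)

# Two hypothetical targets: the first estimate is off by 5 units, the second is exact.
targets = [(0.0, 0.0), (50.0, 50.0)]
estimates = [(3.0, 4.0), (50.0, 50.0)]
print(average_error(estimates, targets))  # 2.5
```

The result carries the same length unit as the coordinates passed in.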
Fig. 4. The procedure of AFSA target tracking.
Table 2. Parameters for AFSA target positioning and target tracking.

| Parameter | Value |
|---|---|
| Network size | 100 m × 100 m |
| Number of executions | 100 |
| Number of iterations ($T_{max}$) | 100 |
| Number of sensors ($M$) | 100, 72, 52, 24, 12 |
| Number of targets ($N$) | 10, 1 |
| Try number (Try_Number) | 100 |
| Step (Step) | 5 |
| Visual (Visual) | 50 |

Table 3. Average error of random and fixed deployments in target positioning.

| Number of AFs | Random deployment | Fixed deployment |
|---|---|---|
| 100 | 0.874* | 0.832 |
| 72 | 1.117 | 1.177 |
| 54 | 1.614 | 1.518 |
| 24 | 2.953 | 3.071 |
| 12 | 5.52 | 5.283 |
| Average | 2.416 | 2.376 |

* Unit is cm.

Table 4. Average positioning time (sec) of random and fixed deployments in target positioning.

| Number of AFs | Random deployment | Fixed deployment |
|---|---|---|
| 100 | 12.986* | 13.193 |
| 72 | 8.916 | 9.446 |
| 54 | 6.634 | 6.502 |
| 24 | 2.974 | 2.837 |
| 12 | 1.326 | 1.412 |
| Average | 6.567 | 6.678 |

* Unit is sec.

Fig. 5. The simulation results of AFSA target positioning random deployment method when the number of AF is 100.

In this paper, we analyze the efficacy of the AFSA target tracking system and compare the advantages and disadvantages of fixed deployment versus random deployment. Tables 5 and 6 list the average positioning time and success rate of fixed and random deployment in the AFSA target tracking system. Our findings suggest that when the number of AFs in the algorithm is 12, fixed deployment outperforms random deployment due to its symmetrical and even distribution of AFs in the network, which avoids the uneven distribution of AFs that can occur in random deployment. When the distance between the AFs and the target is too large, and time and resources for tracking are limited, the AFs in the random deployment method are less able to move around the global optimum, leading to suboptimal target tracking success rates. In contrast, fixed deployment ensures that the initial positions of the AFs are uniformly distributed, leading to better target tracking performance than random deployment. When the number of AFs is large, however, our analysis shows that there is little difference in target tracking performance between fixed and random deployments.
Figure 7 shows the tracking traces of both random and fixed deployment methods for the AFSA target tracking system when the number of AFs is 12. These findings highlight the importance of considering the number of AFs and the deployment topology when using the AFSA target tracking system in wireless sensor networks. The results of this study show that the AFSA target tracking system is effective when deployed appropriately. In situations where cost is a factor and the number of AFs is limited, fixed deployment may be a more suitable option, as it allows more precise planning and uniform distribution of AFs. This can improve the target tracking success rate and reduce the potential for an uneven distribution of sensors that can degrade the performance of the AFSA target tracking system. In contrast, random deployment offers more flexibility and responsiveness to changing conditions, which may be more beneficial in some situations.
Fig. 6. The simulation results of AFSA target positioning random deployment method when the number of AF is 12.

Table 5. Average positioning time of random and fixed deployment in target tracking.

| Number of AFs | Random deployment | Fixed deployment |
|---|---|---|
| 100 | 0.081* | 0.089 |
| 72 | 0.065 | 0.077 |
| 54 | 0.061 | 0.064 |
| 24 | 0.044 | 0.051 |
| 12 | 0.041 | 0.042 |
| Average | 0.058 | 0.065 |

* Unit is sec.

Table 6. Average success rate of random and fixed deployment in target tracking.

| Number of AFs | Random deployment | Fixed deployment |
|---|---|---|
| 100 | 99% | 100% |
| 72 | 99% | 100% |
| 54 | 100% | 100% |
| 24 | 97% | 100% |
| 12 | 94% | 100% |
| Average | 97.8% | 100% |
Fig. 7. The tracking trace of AFSA target tracking random and fixed deployment methods.
5 Conclusions
The study examines the effectiveness of the AFSA algorithm for indoor target localization and tracking in wireless sensor networks. The simulations demonstrate that the accuracy of target positioning is directly related to the number of AFs employed in the algorithm, with increased accuracy at the cost of extended positioning time. Additionally, the difference in average error between random and fixed deployment topologies is found to be 5%, while the average positioning time is only slightly different between the two. Fixed deployment is found to provide better target tracking performance than random deployment when the number of AFs is small, due to the symmetrical and evenly distributed placement of AFs. Overall, our findings underline the importance of careful selection of the deployment topology and number of deployed sensors to optimize the AFSA algorithm's effectiveness in wireless sensor networks for indoor target localization and tracking applications.

Acknowledgments. This research was funded by the Ministry of Science and Technology (MOST), R.O.C., grant numbers MOST 111-2221-E-324-018 and MOST-111-2637-E-150-001.
References
1. Han, G., Jiang, J., Zhang, C., Duong, T.Q., Guizani, M., Karagiannidis, G.K.: A survey on mobile anchor node assisted localization in wireless sensor networks. IEEE Commun. Surv. Tutor. 18(3), 2220–2243 (2016)
2. Rajaravivarma, V., Yang, Y., Yang, T.: An overview of wireless sensor network and applications. In: Proceedings of the 35th Southeastern Symposium on System Theory, pp. 432–436 (2003)
3. Corke, P., Wark, T., Jurdak, R., Hu, W., Valencia, P., Moore, D.: Environmental wireless sensor networks. Proc. IEEE 98(11), 1903–1917 (2010)
4. Laaraiedh, M., Yu, L., Avrillon, S., Uguen, B.: Comparison of hybrid localization schemes using RSSI, TOA, and TDOA. In: 11th European Proceedings of IEEE Wireless Conference 2011-Sustainable Wireless Technologies (European Wireless), pp. 1–5 (2011)
5. Cheng, L., Wu, C.-D., Zhang, Y.-Z.: Indoor robot localization based on wireless sensor networks. IEEE Trans. Comput. Electron. 57(3), 1099–1104 (2011)
6. Chugunov, A., Petukhov, N., Kulikov, R.: ToA positioning algorithm for TDoA system architecture. In: Proceedings of International Russian Automation Conference (RusAutoCon), pp. 871–876 (2020)
7. Ahmed, S., Abbasi, A., Liu, H.: A novel hybrid AoA and TDoA solution for transmitter positioning. In: Proceedings of International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–7 (2021)
8. Xu, C.-X., Chen, J.-Y.: Three-dimensional sensor node localization based on AFSA-LSSVM. Int. J. Control Autom. 7(8), 399–406 (2014)
9. Yang, X., Zhang, W., Song, Q.: A novel WSNs localization algorithm based on artificial fish swarm algorithm. Int. J. Online Eng. 12(1), 64–68 (2016)
10. Li, X.-L., Shao, Z.-J., Qian, J.-X.: An optimizing method based on autonomous animats: fish-swarm algorithm. Chin. J. Circuits Syst. 22(11), 32–38 (2002)
11. Li, X.-L., Qian, J.-X.: Studies on artificial fish swarm optimization algorithm based on decomposition and coordination techniques. Chin. J. Syst. Eng.-Theory Pract. 8(1), 1–6 (2003)
12. Fernandes, E.M.G.P., Martins, T., Rocha, A.M.: Fish swarm algorithm for bound constrained global optimization. In: Proceedings of the International Conference on Computational and Mathematical Methods in Science and Engineering (2009)
13. Neshat, M., Sepidnam, G., Sargolzaei, M., Toosi, A.N.: Artificial fish swarm algorithm: a survey of the state-of-the-art, hybridization, combinatorial and indicative applications. Artif. Intell. Rev. 42, 965–997 (2014)
14. Lee, S.-H., Cheng, C.-H., Lin, C.-C., Huang, Y.-F.: Target positioning and tracking in WSNs based on AFSA. Information 14(4), 246 (2023). https://doi.org/10.3390/info14040246
Attacks and Threats Verification Based on 4G/5G Security Architecture
Lie Yang1, Chien-Erh Weng2, Hsing-Chung Chen3, Yang-Cheng-Kuang Chen4, and Yung-Cheng Yao2(B)
1 Department of Electrical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung City 807618, Taiwan
2 Department of Microelectronics Engineering, National Kaohsiung University of Science and Technology, Kaohsiung City 811213, Taiwan
[email protected]
3 Department of Computer Science and Information Engineering, Asia University, Taichung 413305, Taiwan
4 Air Force Institute of Technology, Kaohsiung City 820009, Taiwan
Abstract. 5G networks are in high demand and are expected to transform many industries, including finance, healthcare, transportation, and entertainment, shaping the future of communication. Although the 3rd Generation Partnership Project has developed several cybersecurity protocols, vulnerabilities in these agreements still exist. These vulnerabilities can lead to security attacks on mobile devices during the registration process of the base station or wireless access, such as Data Leaking attacks, Denial-of-Service attacks, and Downgrade attacks. This study aims to analyze and experiment with relevant attack cases, identify existing information security vulnerabilities in the mobile network, and test the 4G/5G network security protocol for vulnerabilities through active attacks to demonstrate their impact on mobile devices. By doing so, this project aims to contribute towards the development of stronger and more secure mobile communication technology.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 240–249, 2023. https://doi.org/10.1007/978-3-031-35836-4_26

1 Introduction
The advent of 5G mobile network technology has brought about faster and more convenient application services, yet its use must be accompanied by caution to mitigate potential security vulnerabilities and threats. One of the most pressing concerns relates to the protection of users' privacy information, which, if intercepted, can lead to detrimental consequences such as identity or location lock-out. It is therefore imperative that effective measures are taken to safeguard users' privacy information in the 5G mobile network ecosystem [1]. In the field of telecommunications, the 5G system is the latest and most advanced network architecture developed by the 3rd Generation Partnership Project (3GPP). While it shares many similarities with the 4G network in terms of planning architecture, protocols, signaling, and procedures, the 5G system is expected to offer significant improvements in speed, capacity, and latency [2–4]. However, the security vulnerabilities that were present in the 4G network raise concerns about the security of the 5G network as well. To this end, we utilize a simulated 5G network platform and evaluate potential attacks, including disruption of 5G network services, network downgrade attacks, privacy information theft, and locating user equipment (UE). Through our investigation, we seek to determine whether the security issues found in the 4G network have been addressed in the 5G system.
As illustrated in Fig. 1, information leakage attacks have become increasingly prevalent among communication attacks. These attacks commonly take place within the wireless transmission channel between user devices and the core network. Notably, certain signaling remains unencrypted during wireless channel transmission, making it vulnerable to interception by attackers who deploy sniffing devices [5, 6]. In this way, attackers can intercept signaling in wireless transmission and analyze the data to obtain sensitive user information, such as the International Mobile Subscriber Identity (IMSI) and the Globally Unique Temporary UE Identity (GUTI). IMSI and GUTI codes are unique identification codes assigned to each UE and thus represent highly critical authentication and privacy data for users. If these codes are stolen, it could lead to user identity theft or targeted attacks by malicious actors.
Fig. 1. The illustration of a security breach
A Denial of Service (DoS) attack is considered one of the most serious and disruptive forms of cyberattack. The primary objective of such an attack is to disrupt communication network services, so that mobile devices temporarily lose service and users are unable to use the internet or make phone calls [7, 8]. As shown in Fig. 2, there are two common methods of conducting a DoS attack. The first method involves setting up a Pseudo Base Station (PBS), which manipulates the parameter settings of mobile devices, making it impossible for them to verify the authenticity of the base station [3]. As a result, the PBS can trick mobile devices into registering with it, thereby achieving its goal of blocking services. The second method involves emitting a signal on the same frequency as the base station using specialized hardware equipment. By increasing the intensity of the emitted signal, the attacker interferes with or interrupts the mobile device’s communication service with the base station [9]. This method can also lead to the blocking of services.
The work in [10] proposed a denial-of-service attack that takes advantage of vulnerabilities in the security protocols of 4G/5G networks to capture the IMSI code of mobile devices. When a mobile device connects to a base station, it undergoes a Tracking Area Update (TAU) process that uses TMSI for authentication. The purpose of this process is to verify the legitimacy of the user and track its location. If the TAU process authentication fails, the core network will require the mobile device to re-initiate the Attach process and register
L. Yang et al.
Fig. 2. Denial-of-Service attack
using its actual identity (IMSI code). The attacker can set up a pseudo base station to lure the mobile device into registering and triggering the TAU process. However, the pseudo base station does not have the authentication data of the mobile device, causing the TAU process authentication to fail. The pseudo base station then prompts the mobile device to interrupt all communication network services and re-initiate the Attach process using its actual identity (IMSI code). As shown in Fig. 3, this allows the attacker to monitor the signaling in the Attach process and obtain the mobile device’s IMSI code.
Fig. 3. IMSI Catcher attack
The attack cases mentioned above demonstrate that there are still some vulnerabilities in the security protocols of 4G/5G networks [7–9]. These vulnerabilities can lead to issues such as disrupted communication services for mobile devices and the leakage of user privacy data [5, 6]. To gain a deeper understanding of the cybersecurity risks posed by these vulnerabilities, this study will analyze and experiment with the IMSI Catcher attack case presented in [10] through practical verification. The attack will be carried out using the TAU process of the LTE network security protocol. If a mobile device fails to register with the base station during the TAU process, it will carry authentication information in plaintext and attempt to register again. This authentication information includes the unique IMSI code, which is essential authentication data during the registration process.
To exploit this vulnerability, the study will use an active attack method to set up a 4G pseudo base station to lure mobile devices into registering. The TAU process will be used to perform a disruptive attack to capture the IMSI code.
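The plaintext identity discussed above is carried in a compact binary-coded-decimal field, which is why anyone who can read the signaling can recover it. The following is a minimal sketch, assuming the mobile identity layout used in 3GPP NAS signaling (the first octet holds the first digit, an odd/even flag and a 3-bit identity type; each later octet holds two digits, low nibble first). The function name `decode_imsi` and the test-network IMSI 001010123456789 are illustrative and do not come from the study.

```python
IDENTITY_TYPE_IMSI = 0x01  # 3-bit "type of identity" value for an IMSI


def decode_imsi(value: bytes) -> str:
    """Decode the value part of a mobile identity field that carries an IMSI.

    `value` excludes any element identifier and length octet (assumption).
    """
    if not value:
        raise ValueError("empty identity field")
    first = value[0]
    if first & 0x07 != IDENTITY_TYPE_IMSI:
        raise ValueError("identity type is not IMSI")
    odd_number_of_digits = bool(first & 0x08)

    digits = [first >> 4]            # digit 1 sits in the high nibble
    for octet in value[1:]:
        digits.append(octet & 0x0F)  # earlier digit in the low nibble
        digits.append(octet >> 4)    # later digit in the high nibble

    if not odd_number_of_digits:
        # With an even digit count the last high nibble is a 0xF filler.
        if digits[-1] != 0x0F:
            raise ValueError("missing filler nibble")
        digits.pop()
    if any(d > 9 for d in digits):
        raise ValueError("non-decimal digit in identity")
    return "".join(str(d) for d in digits)


if __name__ == "__main__":
    print(decode_imsi(bytes.fromhex("0910101032547698")))
```

Because no key is involved in this decoding, confidentiality of the IMSI depends entirely on the message never being sent unprotected.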
2 Background
Based on the specifications outlined in the Security Architecture (3GPP TS 33.102 version 11.5.1 Release 11), the mechanisms for ensuring communication security are categorized into five distinct areas [11]: Network Access Security, Network Domain Security, User Domain Security, Application Domain Security, and Visibility and Configurability of Security.
2.1 Attach Process
After the UE is powered on, it begins by receiving information from the nearby base stations. The Master Information Block (MIB) and System Information Block (SIB) are then decoded to obtain the main message content. Based on the strength of the base station signal and its own Public Land Mobile Network (PLMN), the UE selects an appropriate base station for the random access procedure. Afterwards, the UE sends an Attach Request to the MME, which includes the IMSI code. Upon receiving the Attach Request, the MME retrieves the data needed for Authentication and Key Agreement (AKA) from the HSS based on the IMSI code [10]. Additionally, the mobile device sends data required for AKA authentication to the MME. To confirm that the mobile device and core network have the same keys, the MME compares the parameters sent by both sides. Upon successful AKA authentication, the UE enters NAS Security Mode and AS Security Mode authentication. The former ensures that the mobile device and core network use the same encryption algorithm and key, while the latter ensures that the mobile device and base station use the same encryption algorithm and key. After the security mode is confirmed, the core network allocates a temporary identity code to the mobile device, allowing it to connect.
2.2 Key Hierarchy
In order to implement LTE security features, various keys serve different purposes [4]. 
The session keys for GSM and UMTS are directly generated from the permanent key (K), while in the LTE security protocol, an intermediate main key (KASME) is introduced, which is mainly used to generate session keys. The key hierarchy is divided into five levels. The first level is the permanent key (K), which is shared between the Universal IC Card (UICC) and the Authentication Center (AuC), and is not transmitted over the network. The second level is the ciphering key (CK) and the integrity key (IK), respectively generated by the network-side HSS and the terminal-side UICC. The third level is the main key (KASME), generated by HSS from CK and IK, and sent to the MME. The fourth level is the non-access stratum (NAS) confidentiality and integrity protection key generated by the MME, as well as the eNB key, which is then sent to the eNB. The fifth
level is the access stratum (AS) integrity and confidentiality protection key and the user plane (UP) confidentiality key generated by the eNB. The key generation function is a crucial aspect of ensuring the security and privacy of information in communication networks. It is a one-way relationship that guarantees the confidentiality of low-level key information during delivery. Additionally, during Handover, all keys of the eNB are transferred to the next eNB, thereby avoiding involving the MME and preventing the possibility of reverse leakage of the keys [2, 4]. Different levels of keys have unique functions, and the key generation method is designed to make it extremely challenging for attackers to invade the system and access sensitive information. Furthermore, in the unfortunate event of a single branch key leakage, the key generation method can significantly reduce the potential loss by ensuring that the keys of different levels have distinct functionalities. This feature of the key generation function is critical in ensuring the overall security and reliability of the communication network [4].
2.3 Authentication and Key Agreement
The Authentication and Key Agreement (AKA) process for LTE security is initiated by the UE’s Attach Request. The MME requests the HSS to send the necessary parameters for authentication and stores them before sending an authentication request to the UE. The UE verifies the legitimacy of the signaling request and computes RES, which is sent back to the MME. The MME also verifies the legitimacy of the UE as a user. The Security Architecture (3GPP TS 33.401 version 10.3.0 Release 10) [12] specifies that during the AKA process, the MME includes the IMSI in the Authentication Data Request signaling message and sends it to the HSS. Upon receipt, the HSS retrieves the LTE_Key and LTE_K corresponding to the IMSI from the internal database and derives the Sequence Number (SQN) and Random (RAND). 
Using the Authentication Management Field (AMF), LTE_K, and the two aforementioned parameters, the HSS derives the Authentication Vector (AV) and sends it back to the MME. Upon reception of the AV, the MME stores the Expected RESponse (XRES), IK and CK. The MME then sends the RAND and Authentication Token (AUTN) to the UE through the User Authentication Request signaling. Upon receipt of the signaling, the UE uses its own LTE_K and the RAND to derive an Anonymity Key (AK), which is then used to decrypt the SQN and derive the Expected Message Authentication Code (XMAC). The UE compares the XMAC with the Message Authentication Code (MAC) computed by the core network to verify the legitimacy of the message. If the check succeeds, the UE computes the RESponse (RES) and sends it back to the MME through the User Authentication Response message [10]. The MME then compares the returned RES with the XRES stored earlier to confirm the mutual authentication.
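The AKA exchange described above can be sketched as follows. This is an illustration under stated assumptions, not the standardized algorithm set: the helper `f` uses HMAC-SHA-256 as a stand-in for the operator functions f1 to f5 (real networks typically use Milenage or TUAK), the function and field names are hypothetical, and the SQN freshness check performed by a real UE is omitted. The field sizes (16-byte K and RAND, 6-byte SQN and AK, 2-byte AMF, 8-byte MAC) follow common practice.

```python
import hashlib
import hmac
import os
from typing import Optional


def f(key: bytes, label: bytes, data: bytes, n: int) -> bytes:
    """Stand-in for f1..f5 (assumption: HMAC-SHA-256, NOT Milenage)."""
    return hmac.new(key, label + data, hashlib.sha256).digest()[:n]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def hss_generate_av(k: bytes, sqn: bytes, amf: bytes,
                    rand: Optional[bytes] = None) -> dict:
    """HSS side: build an authentication vector from K, SQN, AMF and RAND."""
    if rand is None:
        rand = os.urandom(16)
    mac = f(k, b"f1", sqn + rand + amf, 8)
    ak = f(k, b"f5", rand, 6)
    return {
        "rand": rand,
        "autn": xor(sqn, ak) + amf + mac,  # AUTN = (SQN xor AK) || AMF || MAC
        "xres": f(k, b"f2", rand, 8),
        "ck": f(k, b"f3", rand, 16),
        "ik": f(k, b"f4", rand, 16),
    }


def ue_authenticate(k: bytes, rand: bytes, autn: bytes) -> Optional[bytes]:
    """UE side: recover SQN, check XMAC against MAC, return RES or None."""
    concealed_sqn, amf, mac = autn[:6], autn[6:8], autn[8:16]
    ak = f(k, b"f5", rand, 6)
    sqn = xor(concealed_sqn, ak)
    xmac = f(k, b"f1", sqn + rand + amf, 8)
    if not hmac.compare_digest(xmac, mac):
        return None  # the network could not prove knowledge of K
    return f(k, b"f2", rand, 8)


if __name__ == "__main__":
    k = bytes(16)
    av = hss_generate_av(k, sqn=bytes(6), amf=b"\x80\x00")
    res = ue_authenticate(k, av["rand"], av["autn"])
    # The MME compares the returned RES with the stored XRES.
    print("mutual authentication ok:", res == av["xres"])
```

A pseudo base station that does not hold K cannot produce a valid AUTN, which is why the attacks in this study act on messages exchanged before authentication completes.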
3 Experiment and Results
The USRP B210 communication module was used as the signal transmission and reception end of the base station in this study. Although the distance of signal transmission is limited, it is sufficient to meet the experimental requirements. Since this experiment is a feasibility study of attacks that exploit vulnerabilities of the LTE protocol, it only needs to reach UEs within a limited range. The B210 supports RF frequencies from 70 MHz to 6 GHz, full duplex, and MIMO features, which are sufficient to support the establishment of a rogue base station system [3]. The open-source software mainly used in the study is OpenAirInterface (OAI) and Open Base Transceiver Station (OpenBTS). OAI is an open-source SDR project initiated by EURECOM, which implements the communication protocols used by the base station and core network, so that ordinary UEs can use 4G mobile internet service through an OAI deployment. This study mainly uses OAI to establish a 4G pseudo base station that carries out DoS or downgrade attacks on UE, obtaining the IMSI code of the UE and affecting its communication services. In addition, OpenBTS is used to establish a 2G pseudo base station that sends SMS messages containing malicious software links to the UE, in order to steal users’ personal privacy data (such as browsing data, call records, and location information).
3.1 Extracting IMSI Code Using DoS Attack
This study employs a 4G pseudo base station to induce UE registration, as illustrated in Fig. 3. The UE will perform the TAU process with the 4G pseudo base station because its TAC differs from that of the original base station. Initially, the UE sends a TAU Request message containing the TMSI code to the 4G pseudo base station. However, as the 4G pseudo base station lacks the UE’s identity information, it returns a TAU Reject message (CAUSE#9: UE identity cannot be derived by the network) to the UE, prompting it to re-register. This causes the UE to trigger the Attach process again, in which it sends its IMSI code, allowing the attacker to obtain it. Next, we present the experimental procedure and results of the DoS attack. 
Under the Handover mechanism, the UE selects the base station with the stronger signal, so a pseudo base station with a stronger signal forces the mobile device to switch its connection to it. Using the LTE Discovery application, this study identifies the base station information to which the UE is currently connected, including MCC, MNC, TAC, Band, Downlink_frequency, and Uplink_frequency, as shown in Fig. 4. Subsequently, the OAI software is utilized to set up the 4G pseudo base station, incorporating the identified base station information and changing the parameter settings of the pseudo base station. As shown in Fig. 5, the Wireshark software can be used to observe whether the UE has registered after the 4G pseudo base station has been set up. Furthermore, it is possible to observe that the 4G pseudo base station responds to the UE’s TAU Request with a TAU Reject message. Figure 6 is used to verify that the TAC received by the UE matches the TAC set by the 4G pseudo base station. Finally, Fig. 7 displays the UE’s Attach Request signaling, which contains the UE’s IMSI code.
3.2 Sending Malicious SMS Using Downgrade Attack
In this attack experiment, a 4G pseudo base station is set up to deceive the UE into registering. The TAC of the 4G pseudo base station differs from the TAC of the UE’s originally connected base station, leading the UE to execute the TAU procedure with the
Fig. 4. Information on the Base Station to which the UE is Connected
Fig. 5. 4G Pseudo Base Station Responds with TAU Reject Message
Fig. 6. The UE Initiates an Attach Request Message
Fig. 7. The Attach Request Signaling Carries the IMSI Code of UE
4G pseudo base station. Initially, the UE sends a TAU Request (TMSI) to the 4G pseudo base station. However, the pseudo base station lacks the UE’s identity information, so it responds with a TAU Reject (CAUSE #7: EPS services not allowed). Upon receiving this message, both the 4G and 3G communication services of the UE will be stopped. Although the 2G base stations of telecom operators have been dismantled, the communication function of the UE still includes the 2G communication system. Therefore, the UE will automatically register with the 2G pseudo base station. Then, the 2G pseudo base station will send an SMS containing a malicious software URL to the UE. Upon downloading and installing the malicious software (DroidJack), the attacker can exploit the UE using the malware, allowing them to capture the UE’s data, including call records, SMS records, web browsing records, stored files, and more. In this experiment, we utilized the LTE Discovery application to retrieve the current base station information that the UE is connected to. After that, we proceeded to set up a 4G pseudo base station with the OAI software, and wrote the known base station information into it, changing its parameters accordingly. As depicted in Fig. 8, the UE has already completed registration, and it can be observed that upon sending a TAU Request, the 4G pseudo base station responds with a TAU Reject message (CAUSE #7: EPS services not allowed) to the UE.
Fig. 8. Results of TAU Downgrading
Upon receiving CAUSE #7: EPS services not allowed, the UE’s 4G and 3G communication services will be halted, and the device will be automatically redirected to the 2G pseudo base station created by the OpenBTS software. As depicted in Fig. 9, the IMSI and IMEI codes of the UE can be seen upon successful connection to the 2G pseudo base station. The base station will then send an SMS message to the UE containing a URL that, once clicked, will trigger the download and installation of malicious software (DroidJack). Once the software is installed, the attacker can launch attacks on the same domain and gain access to the UE’s call records, SMS records, web browsing records, GPS location, phone details, and other sensitive information.
Fig. 9. UE Information of the 2G Pseudo Base Station
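The two TAU Reject outcomes used in Sections 3.1 and 3.2 can be summarized as a small model of UE behaviour. This is a simplified sketch that follows the description in this study rather than the full NAS state machine; the class `UeState`, its fields, and the action strings are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class EmmCause(IntEnum):
    EPS_SERVICES_NOT_ALLOWED = 7
    UE_IDENTITY_CANNOT_BE_DERIVED = 9


@dataclass
class UeState:
    eps_enabled: bool = True         # 4G service currently usable
    has_temp_identity: bool = True   # a TMSI/GUTI is stored
    next_action: str = "idle"


def on_tau_reject(ue: UeState, cause: int) -> UeState:
    """Update the (simplified) UE state after a TAU Reject with `cause`."""
    if cause == EmmCause.UE_IDENTITY_CANNOT_BE_DERIVED:
        # Section 3.1: the temporary identity is dropped and the UE
        # re-attaches, sending its permanent identity (IMSI).
        ue.has_temp_identity = False
        ue.next_action = "attach_with_imsi"
    elif cause == EmmCause.EPS_SERVICES_NOT_ALLOWED:
        # Section 3.2: 4G/3G service stops and the UE looks for 2G cells.
        ue.eps_enabled = False
        ue.has_temp_identity = False
        ue.next_action = "search_2g"
    else:
        ue.next_action = "unhandled"
    return ue


if __name__ == "__main__":
    print(on_tau_reject(UeState(), 9))
    print(on_tau_reject(UeState(), 7))
```

In both branches the UE acts on a reject message that it cannot authenticate, which is the weakness both experiments rely on.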
4 Conclusion Based on the attack case proposed in this study, it is evident that there are several shortcomings in the 4G network security protocol. To ensure the IMSI code of mobile devices cannot be stolen, the network security protocol of the 5G communication system includes a public-private key mechanism. This makes it difficult for attackers to obtain the IMSI code of mobile devices through the current blocking attacks under the architecture of the 5G communication system. However, this study proposes a hypothesis for extracting the IMSI code of mobile devices using a downgrading attack. This attack involves setting up a pseudo base station to lure mobile devices into registering. Since the pseudo base station lacks the authentication data of the mobile device, it will return the TAU Reject (5GS services not allowed) signaling to the mobile device. Upon receiving this signaling, the mobile device will disconnect from its 5G communication system service and downgrade to the 4G communication system service. This study proposes a DoS attack to capture the IMSI code of the mobile device at this point.
References
1. Borgaonkar, R., Hirshi, L., Park, S., Martin, A., Seifert, J.P.: New Adventures in Spying 3G & 4G Users: Locate, Track, Monitor. Las Vegas (2017)
2. Piqueras, J.R., Marojevic, V.: Security and protocol exploit analysis of the 5G specifications. IEEE Access 7, 24956–24963 (2019)
3. Wang, W., Li, H.: Light-weight platform for attack validation in LTE network. IEEE Netw. Lett. 2(4), 212–215 (2020)
4. Rupprecht, D., Dabrowski, A., Holz, T., Weippl, E., Popper, C.: On security research towards future mobile network generations. IEEE Commun. Surv. Tutor. 20(3), 2518–2542 (2018)
5. Bassil, R., Elhajj, I.H., Chehab, A., Kayssi, A.: Effects of signaling attacks on LTE networks. In: 2013 27th International Conference on Advanced Information Networking and Applications Workshops, pp. 499–504 (2013)
6. Fei, T., Wang, W.: LTE is vulnerable: implementing identity spoofing and denial-of-service attacks in LTE networks. In: 2019 IEEE Global Communications Conference (GLOBECOM), pp. 1–6 (2019)
7. Ghannam, R., Sharevski, F., Chung, A.: User-targeted denial-of-service attacks in LTE mobile networks. In: 2018 14th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), pp. 1–8 (2018)
8. Sattar, D., Matrawy, A.: Towards secure slicing: using slice isolation to mitigate DDoS attacks on 5G core network slices. In: 2019 IEEE Conference on Communications and Network Security (CNS), pp. 82–90 (2019)
9. Yu, C., Chen, S.: On effects of mobility management signalling based DOS attacks against LTE terminals. In: 2019 IEEE 38th International Performance Computing and Communications Conference (IPCCC), pp. 1–8 (2023)
10. Christian, S.: Location disclosure in LTE networks by using IMSI catcher. Norwegian University of Science and Technology (2017)
11. 3G Security; Security Architecture (3GPP TS 33.102 Version 11.5.1 Release 11)
12. 3GPP System Architecture Evolution (SAE); Security Architecture (3GPP TS 33.401 Version 10.3.0 Release 10)
Design of a Composite IoT Sensor Stack System for Smart Agriculture
Meng-Chang Wu1, Yung-Hoh Sheu2(B), Shing-Hong Liu3, Jen-Yu Shieh1, and Hui-Kai Su4
1 Department of Electro-Optical Engineering, National Formosa University, Yunlin County 632, Taiwan (R.O.C.)
2 Department of Computer Science and Information Engineering, National Formosa University, Yunlin County 632, Taiwan (R.O.C.)
3 Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taichung City 41349, Taiwan (R.O.C.)
4 Department of Electrical Engineering, National Formosa University, Yunlin County 632, Taiwan (R.O.C.)
Abstract. The main focus of this study is to combine the Internet of Things (IoT) with sensors, where sensor values are transmitted via an RS485 interface to a microcontroller for integration. The values are then uploaded to a custom MQTT server via a Wi-Fi module and stored in a database. FineReport software is used to quickly create web pages or mobile apps for viewing. The data includes relevant information such as temperature, humidity, wind speed, soil moisture and rainfall, allowing farmers to understand the environmental conditions of their fields as long as they have a phone or computer connected to the internet. With this system, farmers can benefit from convenience, improve the quality of agricultural products, adjust farming methods according to current environmental conditions, provide a good growing environment for crops and increase production efficiency.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 250–260, 2023. https://doi.org/10.1007/978-3-031-35836-4_27
1 Introduction
From ancient times to the present day, most farmers have relied on “experience”, “intuition” and “luck” in farming. But when “the weather is bad”, the pressure on farmers often increases. At a time when the world is facing many environmental challenges such as extreme weather, how agriculture can thrive and meet people’s food needs is a problem that the whole world needs to solve. Taiwan, a densely populated country with little arable land, faces challenges such as an ageing population, a shortage of young workers, global competition and climate change. The sustainable development of Taiwan’s agriculture, which is mainly composed of small farmers, is therefore under pressure, and there is an urgent need to promote the development of smart agriculture. To avoid a food crisis, in addition to policy support and industrial
upgrading, it is also necessary to use technology to attract more young people to join the ranks of agricultural production, and to achieve the vision of sustainable agriculture through various improvements. Therefore, in response to the needs of smart agriculture [1–3], a physical design and implementation of a Composite IoT smart agricultural sensor stack is proposed. The RS485 interface connects all the necessary agricultural sensors and standardises and aggregates all the data. The data is displayed in real time on a mobile phone or tablet via a web page.
2 System Principle
This study controls the RS485 chip, LoRa module [4, 5], Wi-Fi module and solar controller via the UART registers of the microcontroller (HT32F52352, Holtek), and the EEPROM via the I2C registers. The GPIO registers control the relay and the water level sensor. Figure 1 shows the function diagram of the Composite IoT smart agriculture sensor stack and water level detector.
Fig. 1. The function diagram of the Composite IoT smart agriculture sensor stack and Water level detector
The Composite IoT smart agricultural sensor post and water level detector are connected via their respective LoRa modules for data transmission. When an anomaly is detected, data is transmitted through the sensor post’s Wi-Fi module to update the database, allowing users to monitor the situation through a web page or mobile app.
2.1 MPPT Solar Controller The physical diagram of the solar controller is shown in Fig. 2. The load curve that provides the maximum power transfer efficiency varies with changes in sunlight conditions. If the load can be adjusted to match the load curve with the highest power transfer efficiency, the system will achieve optimum efficiency. The load curve with the highest power transfer efficiency is called the maximum power point, and maximum power point tracking (MPPT) is the process of finding and maintaining the load curve at this power point [6]. This module can return battery voltage, solar charge current, solar charge voltage and consumed power via the RS232 interface and upload it to the cloud database via Wi-Fi module.
Fig. 2. Diagram of the solar controller
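The idea of tracking the maximum power point described above can be illustrated with the perturb-and-observe method, a widely used MPPT technique. This is a sketch only: the text does not say which algorithm the controller uses (reference [6] describes a fuzzy logic controller), and the toy power curve with its peak at 17 V, the step size, and all function names are assumptions made for the example.

```python
def perturb_and_observe(v, p, prev_v, prev_p, step=0.1):
    """Return the next voltage set-point from two consecutive (V, P) samples."""
    dv = v - prev_v
    dp = p - prev_p
    if dv == 0:
        return v + step          # no perturbation yet, so start one
    if dp == 0:
        return v                 # power unchanged, hold the set-point
    # Keep moving in the same direction while power rises, else reverse.
    moving_toward_peak = (dp > 0) == (dv > 0)
    return v + step if moving_toward_peak else v - step


def toy_panel_power(v):
    """Hypothetical concave power curve (W) with its maximum at 17 V."""
    return 60.0 - (v - 17.0) ** 2


def track(v0=12.0, steps=200, step=0.1):
    """Run the tracker on the toy curve and return the final set-point."""
    prev_v, v = v0 - step, v0
    prev_p = toy_panel_power(prev_v)
    for _ in range(steps):
        p = toy_panel_power(v)
        next_v = perturb_and_observe(v, p, prev_v, prev_p, step)
        prev_v, prev_p, v = v, p, next_v
    return v


if __name__ == "__main__":
    print(f"settled near {track():.2f} V")
```

The set-point climbs to the peak and then oscillates within about one or two steps of it, which is the usual trade-off of this method between tracking speed and steady-state ripple.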
2.2 Wind Direction Sensor, Wind Speed Sensor
Figures 3 and 4 show the wind direction and wind speed sensors. The sensors comprise a standard three-cup anemometer for wind speed and a directional sensor for wind direction, measuring wind speed in the range of 0–60 m/s and wind direction in the range of 0–360°. They are calibrated in precision wind tunnel experiments to provide stable long-term signal output. Both sensors are small and portable, making them easy to carry and install. They are widely used in greenhouses, environmental protection, meteorological stations, ships, docks, aquaculture and other environments.
Fig. 3. Wind Direction Sensor
Fig. 4. Wind Speed Sensor
2.3 5-in-1 Air Sensor
Figure 5 shows the 5-in-1 air sensor. The sensor measures temperature, humidity, light, carbon dioxide and barometric pressure (see Table 1).
Fig. 5. 5-in-1 air sensor
Table 1. 5-in-1 Air Sensor Sensing Range Table.
Parameter               Detection Range
Temperature             −40 °C–80 °C
Humidity                0–100% RH
Illuminance             0–65535 lx
CO2 Concentration       400–60000 ppm
Atmospheric Pressure    30 kPa–120 kPa
2.4 Rain Sensor
Figure 6 shows the rain gauge sensor. This study uses the ZTWL-YL-485 rain gauge sensor, a hydrological and meteorological instrument for measuring natural rainfall. The rainfall is converted into a digital switching signal to meet the needs of information transmission, processing, recording and display. The core component of this instrument adopts a three-dimensional streamlined design, which lets water flow through the funnel more smoothly and makes it easy to clean. It can measure rainfall intensity in the range of 0–8 mm/min.
Fig. 6. Rain Sensor
2.5 6-in-1 Soil Sensor
Figure 7 shows the 6-in-1 Soil Sensor for temperature, humidity, EC, nitrogen, phosphorus and potassium. The main sensing materials of this kind of sensor are generally metal oxides and high-molecular polymers. These materials have a strong adsorption capacity for water molecules, and the amount of water absorbed varies with changes in ambient humidity. Since water molecules have a large electric dipole moment, the dielectric properties of the material change after it absorbs water, and so does the capacitance of the sensing capacitor. By converting the capacitance changes into electrical signals, humidity can be monitored. It is a highly accurate and sensitive sensor for measuring soil moisture, temperature, conductivity, nitrogen, phosphorus and potassium. Table 2 shows the measurement ranges of the 6-in-1 Soil Sensor.
Fig. 7. 6-in-1 Soil Sensor
Table 2. 6-in-1 Soil Sensor Sensing Range Table.
Parameter                          Detection Range
Temperature                        −40 °C–80 °C
Humidity                           0–100% RH (non-condensing)
EC                                 0–2000 µS/cm
Nitrogen, phosphorus, potassium    0–1999 mg/kg
3 System Design
Figure 8 shows the architecture of the main system of the Composite IoT smart agriculture sensor stack. The sensing components measure the current conditions of the farm, such as rainfall, wind direction, soil pH, and air temperature and humidity. The values obtained from each sensor are then uploaded to the cloud database via the Wi-Fi module, and the current status of the farm can be viewed via a web page or app. The power supply is provided by the solar cells and solar panels installed on the column.
Fig. 8. The architecture diagram of the main system of the Composite IoT smart agriculture sensor stack
Since the values of the different sensors are to be transmitted, the content of the communication formats must be planned. Table 3 shows the communication format of the Composite IoT smart agriculture sensor stack. As shown in Fig. 9, this is the architecture diagram of the water level detector. Through the water level sensor and LoRa, the main system can determine if there is excessive water accumulation in the farmland. If there is water, the GPIO will control the relay to activate the water pump and remove the water. The 8-pin selector switch is used to set the number of devices, allowing the system to be expanded up to 256 detectors.
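The addressing and pump logic of the water level detector described above can be sketched in a few lines. The limit of 256 detectors follows from the 8 switch positions (2^8 = 256 distinct addresses, 0 to 255). The bit order of the switch, the function names, and the rule that the pump simply follows the water sensor are assumptions made for illustration.

```python
from typing import Sequence


def device_address(switch_bits: Sequence[bool]) -> int:
    """Convert 8 switch positions into a detector address (0..255).

    Assumption: index 0 is the least significant bit.
    """
    if len(switch_bits) != 8:
        raise ValueError("expected exactly 8 switch positions")
    return sum(1 << i for i, on in enumerate(switch_bits) if on)


def pump_relay_on(water_detected: bool) -> bool:
    """Relay state for the drainage pump: run only while water is detected."""
    return water_detected


def build_report(switch_bits: Sequence[bool], water_detected: bool) -> dict:
    """Hypothetical status record a detector could send to the main system."""
    return {
        "address": device_address(switch_bits),
        "water": water_detected,
        "pump": pump_relay_on(water_detected),
    }


if __name__ == "__main__":
    print(build_report([True, False, True, False,
                        False, False, False, False], True))
```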
Table 3. Communication format of the Composite IoT smart agriculture sensor stack Table.
Byte     Function
0–2      Date
3–5      Timer
6–7      Atmospheric pressure
8–9      Temperature
10–11    Humidity
12–13    Illumination
14–15    CO2
16–17    Wind speed
18       Wind direction
19–20    Rainfall for the day
21–22    Yesterday’s rainfall
23–24    Current hourly rainfall
25–26    Rainfall in the previous hour
27–28    Nitrogen
29–30    Phosphorus
31–32    Potassium
33–34    EC
35       Output Voltage
36       Battery capacity
37       Solar Panel Voltage
38       Relay 1
39       Relay 2
40       Water Level Status
41–42    Temperature Threshold
43–44    Humidity threshold
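The byte layout in Table 3 can be unpacked with a fixed-size structure of 45 bytes. This is a minimal sketch: only the byte offsets come from Table 3, while the big-endian byte order, the treatment of two-byte fields as unsigned integers, and the field names are assumptions. The scaling and units of each raw value are not specified in the text, so the values are left unscaled.

```python
import struct

# 3-byte date, 3-byte time, 6 x uint16, 1 x uint8, 8 x uint16,
# 6 x uint8, 2 x uint16  ->  45 bytes in total (bytes 0..44).
FRAME = struct.Struct(">3s3s6HB8H6B2H")

FIELDS = (
    "date", "timer",
    "pressure", "temperature", "humidity", "illumination", "co2",
    "wind_speed", "wind_direction",
    "rain_today", "rain_yesterday", "rain_this_hour", "rain_previous_hour",
    "nitrogen", "phosphorus", "potassium", "ec",
    "output_voltage", "battery_capacity", "solar_panel_voltage",
    "relay_1", "relay_2", "water_level_status",
    "temperature_threshold", "humidity_threshold",
)


def parse_frame(frame: bytes) -> dict:
    """Split one 45-byte sensor frame into named raw fields."""
    if len(frame) != FRAME.size:
        raise ValueError(f"expected {FRAME.size} bytes, got {len(frame)}")
    return dict(zip(FIELDS, FRAME.unpack(frame)))


if __name__ == "__main__":
    # Dummy frame whose byte values equal their offsets, for easy checking.
    for name, value in parse_frame(bytes(range(45))).items():
        print(name, value)
```

Using a single precompiled `struct.Struct` keeps the offsets in one place, so a change to the table only requires editing the format string and the field list.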
4 System Validation
Figure 10 shows the Composite IoT smart agriculture sensor stack. The microcontroller reads all the sensor values of the sensing pole at one-minute intervals and then uploads them to the database using the Wi-Fi module, in the communication format given in Table 3. The current environmental readings can be accessed via the web and mobile application interfaces shown in Figs. 11 and 12, respectively.
Fig. 9. The architecture diagram of the water level detector
Fig. 10. The physical diagram of the Composite IoT smart agriculture sensor stack
Figure 13 shows the water level detector. When the water level reaches a certain height and triggers the sensor, a message is sent to the sensor post via the LoRa radio module to activate the water pump. Figure 14 shows the test environment of the water level detector and the drainage status. The Composite IoT smart agriculture sensor stack itself has two sets of relays for controlling motors, specifically the water pump motor and the irrigation motor.
Fig. 11. The website interface of the Composite IoT smart agriculture sensor stack
Fig. 12. The mobile app interface of the Composite IoT smart agricultural sensing pole
Fig. 13. The physical diagram of the water level detector
Fig. 14. The test environment of the water level detector and the drainage status
5 Conclusion The current Composite IoT smart agriculture sensor stack can be customized based on the crops on the farm. Farmers can monitor the growth of each crop through the situation room screen on the app and web page on the computer. Currently, the sensing pile has implemented temperature, humidity, air pressure, carbon dioxide, wind direction, wind speed, and rainfall sensing functions. As the research uses the RS485 communication format, it has good scalability and can achieve the use of 32 different sensors for sensing simultaneously. In the future, a water level sensor will be added to the water tank, and the valve threshold can be set using the mobile app. If the temperature is too high, water is sprayed and if the current water level is too high, water is drained. Through the design and implementation of this Composite IoT smart agriculture sensor stack, future smart agriculture needs can be met and the best smart agriculture mode can be provided.
References
1. Tenzin, S., Siyang, S., Pobkrut, T., Kerdcharoen, T.: Low cost weather station for climate-smart agriculture. In: 2017 9th International Conference on Knowledge and Smart Technology (KST), pp. 172–177. Chonburi, Thailand (2017)
2. Pyingkodi, M., et al.: Sensor based smart agriculture with IoT technologies: a review. In: 2022 International Conference on Computer Communication and Informatics (ICCCI), pp. 1–7. Coimbatore, India (2022)
3. Maheswari, R., Azath, H., Sharmila, P., Sheeba Rani Gnanamalar, S.: Smart village: solar based smart agriculture with IoT enabled for climatic change and fertilization of soil. In: 2019 IEEE 5th International Conference on Mechatronics System and Robots (ICMSR), pp. 102–105. Singapore (2019)
4. Edwin, L., et al.: LoRa system with IOT technology for smart agriculture system. In: 2022 IEEE 20th Student Conference on Research and Development (SCOReD), pp. 39–44. Bangi, Malaysia (2022)
5. Supanirattisai, P., U-Yen, K., Pimpin, A., Srituravanich, W., Damrongplasit, N.: Smart agriculture monitoring and management system using IoT-enabled devices based on LoRaWAN. In: 2022 37th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC), pp. 679–682. Phuket, Thailand (2022)
6. Altas, I.H., Sharaf, A.M.: A novel maximum power fuzzy logic controller for photovoltaic solar energy systems. Renewable Energy 33(3), 388–399 (2008)
The Implement of a Reconfigurable Intelligence Trust Chain Platform with Anti-counterfeit Traceable Version Function for the Customized System-Module-IC
Hsing-Chung Chen1,2(B), Yao-Hsien Liang1, Jhih-Sheng Su1, Kuen-Yu Tsai3(B), Yu-Lin Song4(B), Pei-Yu Hsu1(B), and Jia-Syun Cai5(B)
1 Department of Computer Science and Information Engineering, Asia University, Taichung 413305, Taiwan
2 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung 404327, Taiwan
3 Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
4 Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
5 Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan
Abstract. Today, information technology (IT) and operational technology (OT) accelerate the development of Intelligence of Things (IoT) systems, which consist of modular integrated circuits (ICs). However, existing IoT systems cannot support all-round certification based on System-Module-IC vertical integration, that is, hardware version identification of upstream and downstream products in the supply chain. In other words, IoT system equipment is limited in its ability to include version certification of the System, Module and IC chip. Furthermore, a comprehensive trust relationship cannot be established in the cooperation process between suppliers and other manufacturers in the supply chain. This paper presents the first implementation of the reconfigurable chain of trust authentication method of references [21] and [22], applied to a supply chain based on System-Module(s)-IC(s) vertical integration, through a customized IoT system with comprehensive identification of the hardware versions of upstream and downstream products. The implementation is based on a private blockchain combined with a decentralized database system to improve the reliability, traceability and identification of all-round customized versions of customized systems, modules and ICs.
The proposed approach implements blockchain- and IPFS-based anti-counterfeit version traceability for the customized System-Module-IC by using a reconfigurable intelligence trust chain, where the DApp and the smart contracts for the reconfigurable intelligence trust chain have been designed and run on a real test-bed platform. Finally, the implementation results show that this reconfigurable trust chain approach is a kind of designed-in security as well as a kind of trusted platform integrated solution for creating a trusted platform module (TPM) solution based on Blockchain and IPFS.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 261–272, 2023. https://doi.org/10.1007/978-3-031-35836-4_28
1 Introduction
The Internet of Things (IoT) is transforming everyone’s life by providing features such as controlling and monitoring connected smart objects [1]. IoT applications range over a broad spectrum of services including smart cities, homes, cars, manufacturing, e-healthcare, smart control systems, transportation, wearables, farming, and much more. The adoption of these devices is growing exponentially, which has resulted in the generation of a substantial amount of data for processing and analysis [1]. Thus, besides bringing ease to human lives, these devices are susceptible to different threats and security challenges, which not only make users wary of adopting them in sensitive environments such as e-health and smart homes, but also pose hazards for the advancement of IoT in the coming years [1]. However, traditional access control schemes designed for a single server are not sufficient to handle such applications across multiple collaborating IoT servers that provide rich services in IoT environments [2]. In particular, they cannot easily defend against insider security threats [3] in massive IoT applications deployed in wireless ad hoc networks, and they do not take into account both the security and the efficiency of IoT servers that securely share their resources [2]. Therefore, Chen [2] proposed a collaboration IoT-based role-based access control (RBAC) model with a trust evaluation algorithm, designed to reduce internal security threats, both intra-server and inter-server, in collaborative IoT servers for massive IoT integrated applications. Moreover, both information technology (IT) and operational technology (OT) accelerate the rapid development of Artificial Intelligence of Things (AIoT) systems, each of which is designed as a customized system consisting of modules together with some modular integrated circuits (ICs).
However, existing customized systems cannot support an all-round comprehensive trust relationship based on System-Module-IC vertical integration with hardware version identification of upstream and downstream products in the related supply chain. Therefore, this paper presents the first implementation of the reconfigurable intelligence trust chain based on a private Quorum blockchain and IPFS according to references [21] and [22], which can be applied to the supply chain under full System-Module(s)-IC(s) vertical integration. This approach can improve the reliability, traceability and identification of all-round customized versions of the customized systems, modules and ICs. The implemented system provides anti-counterfeit version traceability for the customized System-Module-IC by using the reconfigurable intelligence trust chain, where the DApp and the smart contracts for the reconfigurable intelligence trust chain have been designed and run on the real test-bed system shown in Fig. 1, in the laboratory of Asia University, Taiwan.
The remainder of this paper is organized as follows. Section 2 discusses related works. Section 3 presents the implementation scheme of the blockchain- and IPFS-based anti-counterfeit traceable version of the customized System-Module-IC using the reconfigurable intelligence trust chain. Section 4 evaluates the experimental results and performance. Finally, Sect. 5 summarizes this study.
2 Related Works
Security problems in massive IoT applications are increasing, and several approaches have been proposed to prevent them, as described below. First, three approaches [4–6] were proposed for sharing massive IoT resources by deploying a security gateway. Second, three compression approaches with low transmission cost and high data fidelity, together with data privacy guarantees, were presented in [7–9]. In addition, assessing the security of IoT-based smart environments such as smart homes and smart cities is becoming essential to implementing the correct control measures and effectively reducing the security threats and risks brought about by deploying IoT-based smart technologies [10]. Thus, Karie et al. [10] presented a review of existing security standards and assessment frameworks, including several NIST special publications on security techniques, highlighting their primary areas of focus to uncover those that can potentially address some of the security needs of IoT-based smart environments.
Security in IoT devices is often neglected or treated as an afterthought by IoT manufacturers [11]. This is mostly due to the short time to market and the cost reduction that drive the device’s design and development process. The few devices that support some protection usually employ software-level solutions, such as firmware signing [11]. However, focusing attention on software-based protection schemes often leaves the hardware unintentionally vulnerable, allowing for new attacks. For IoT applications, the development of trust mechanisms is fundamental to helping people overcome perceptions of uncertainty and risk in using IoT services and applications [11–15].
Blockchain technology was introduced by Nakamoto [16]. Alharby et al. [17] state that a blockchain is a distributed data structure used to maintain a continuously growing list of records, called blocks [18]. Each block contains a timestamp and a link to the previous block [18]. Additionally, a blockchain is managed by peer-to-peer networks that collectively follow certain protocols to validate a new block [18]. The blocks form a linear sequence in which each block references the hash of the previous block [18]. In general, a blockchain can be constructed from four parts: the ledger, smart contracts, consensus, and cryptography [18]. In a blockchain network, the ledger archives all transactions by the participant nodes, and each node keeps a copy of them. Finally, cryptographic mechanisms, e.g., Elliptic Curve Cryptography (ECC), SHA256 and RIPEMD160, are applied in the blockchain network to protect all transactions stored in the ledger, which makes these blocks hard for eavesdroppers to break and trace [18]. Moreover, blockchain technology might help solve the problem through four aspects [19]: (1) digital access rules, (2) data immutability, (3) package tracking and update, and (4) easy payment management. Next, the appendable-block blockchain was established in an effort to standardize a blockchain architecture for heterogeneous Internet of Things devices [20]. In current constructions, the main disadvantage lies in the full trust placed in gateways, which can launch coordinated and individual eclipse attacks against IoT sensing devices without being caught [20]. However, an implementation of a trust chain framework with a hierarchical content identifier mechanism using blockchain technology was first proposed in Ref. [21]. The novel idea of the trust chain framework of System-Module(s)-IC(s) with a comprehensive identification mechanism is implemented for any designed MS system [21],
which is built on a private blockchain in conjunction with decentralized database systems, e.g., IPFS, to boost the flexibility, traceability, and identification of the customized system consisting of the deployed modules and IC chips [21]. Finally, Chen H.-C. applied for a patent of the Republic of China on Feb. 16, 2023, titled “A Reconfigurable Trust Chain Authentication Method Based on System-Module-Chip Vertical Integration of Upstream and Downstream Product Hardware Version Identification” [22].
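The hash-linked block structure summarized in this section (each block carries a timestamp and references the hash of the previous block) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `Block`, `append_block` and `verify_chain` are hypothetical, SHA-256 over a JSON encoding is chosen for simplicity, and consensus, smart contracts and peer-to-peer replication are not modeled. It is not the Quorum data format.

```python
import hashlib
import json
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical minimal block: payload plus a link to the previous block."""
    index: int
    timestamp: float
    data: str
    prev_hash: str

    def digest(self) -> str:
        # Hash a canonical JSON encoding of all fields with SHA-256.
        payload = json.dumps(
            {"index": self.index, "timestamp": self.timestamp,
             "data": self.data, "prev_hash": self.prev_hash},
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, data: str) -> list:
    """Return a new chain with one more block linked to the current tail."""
    prev_hash = chain[-1].digest() if chain else "0" * 64
    return chain + [Block(len(chain), time.time(), data, prev_hash)]


def verify_chain(chain: list) -> bool:
    """Check that every block references the hash of its predecessor."""
    return all(chain[i].prev_hash == chain[i - 1].digest()
               for i in range(1, len(chain)))
```

Replacing the contents of any block changes its digest, so the reference stored in the following block no longer matches and `verify_chain` reports the inconsistency.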
3 The Implement Platform of Anti-counterfeit Traceable Version of the Customized System-Module-IC by Using Reconfigurable Intelligence Trust Chain
This implementation scheme has three phases, which follow the innovation concept and partial approaches of Refs. [21] and [22]: the Registration Phase, the Reconfigurable Intelligence Trust Chain Phase, and the Real-Time Challenge for the Customized System, Module or IC Chip Phase. They are described below and shown in Fig. 1.
3.1 Registration Phase
First, the manufacturer of the integrated circuit (IC) chip registers the product information of the IC, such as the description of its functions and its version. The IC product information is written into the distributed storage system, the InterPlanetary File System (IPFS), managed by the IC manufacturer, which returns the hash value, i.e., the content identifier (CID), calculated by IPFS. Next, the IC chip manufacturer assigns to the produced IC chip a unique blockchain address from the wallet belonging to the IC manufacturer. Finally, the blockchain address is written into the blockchain network together with the hash value CID.
Second, the module manufacturer registers the product information of the module and its IC(s), such as the descriptions of the functions of the module and the involved IC(s) and their versions. The module product information is written into the IPFS managed by the module manufacturer, which returns the hash value CID calculated by IPFS. Next, the module manufacturer assigns to the produced module a unique blockchain address from the wallet belonging to the module manufacturer. Finally, the blockchain address is written into the blockchain network together with the hash value CID.
Finally, the system manufacturer registers the product information of the customized system and its involved module(s) and IC(s), including the descriptions of the functions of the customized system, the involved module(s) and IC(s), and their versions. The product information of the customized system is written into the IPFS managed by the system manufacturer, which returns the hash value CID calculated by IPFS. Next, the system manufacturer assigns to the customized system a unique blockchain address from the wallet belonging to the system manufacturer. Finally, the blockchain address is written into the blockchain network together with the hash value CID.
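The three registration steps (store the product information, obtain a content identifier, record the address together with the identifier) can be sketched as follows. This is a simplified stand-in, not the implemented platform: a Python dictionary plays the role of IPFS, a SHA-256 hex digest plays the role of the CID (real IPFS CIDs use a different encoding), a random hex string plays the role of a wallet-assigned address, and a list plays the role of the blockchain. All function names are hypothetical.

```python
import hashlib
import json
import secrets


def put_content(store: dict, info: dict) -> str:
    """Store product info and return its content identifier.

    Assumption: a SHA-256 hex digest stands in for the CID returned by IPFS.
    """
    blob = json.dumps(info, sort_keys=True).encode("utf-8")
    cid = hashlib.sha256(blob).hexdigest()
    store[cid] = blob
    return cid


def new_address() -> str:
    """Return a random 20-byte hex string (stand-in for a wallet address)."""
    return "0x" + secrets.token_hex(20)


def register_product(ledger: list, store: dict, info: dict) -> tuple:
    """Registration sketch: write info to storage, assign an address,
    and append the (address, CID) pair to the ledger."""
    cid = put_content(store, info)
    address = new_address()
    ledger.append({"address": address, "cid": cid})
    return address, cid
```

In this sketch an IC, a module and a system would each be registered by one call to `register_product`, with the module and system records listing the versions of the components they contain.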
3.2 Reconfigurable Intelligence Trust Chain Phase
The blockchain addresses and hash values (CIDs) of the customized system, module and IC chip are written into the blockchain to form a reconfigurable intelligence trust chain (trust chain for short). When the version information of the customized system, module or IC chip is updated, the hash value CID is also updated. A new blockchain address is also assigned to the new product, and both are written into the blockchain to form an updated trust chain of the customized System-Module-IC. Several smart contracts are designed in this implementation: a registration smart contract, a version update smart contract, a version trust smart contract, and a smart contract for the reconfigurable intelligence trust chain that maintains versions among the customized system, module and IC chip. The registration smart contract runs in the blockchain network and uses a domain name registration (name registry) function to register and manage customized, comprehensive top-down protection. Through the name registry function, the blockchain address and CID value of each upstream, midstream or downstream IC chip, module or system are registered under a specific name. Each specific name is registered in the registration smart contract, which automatically writes a plurality of sub-sets for the system, the module, or the IC chip, each with a customized hierarchical name and the corresponding blockchain address. The hierarchical set comprises these sub-sets. The sub-set of the smart contract is executed and published into the blockchain network, and a unique smart contract address is obtained.
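The effect of a version update on the hierarchical identifiers can be illustrated with a small Python sketch. It assumes a Merkle-style construction in which the identifier of a module depends on the identifiers of its ICs, and the identifier of the system depends on that of its module, so a change at the IC level propagates upward. The functions `node_cid` and `build_trust_set`, the hierarchical names such as `system/module/rom`, and the use of SHA-256 hex digests in place of IPFS CIDs are assumptions made for illustration. The assignment of new blockchain addresses is not modeled.

```python
import hashlib
import json


def node_cid(name: str, version: str, child_cids: list) -> str:
    """Identifier of a node computed from its own metadata and its children."""
    payload = json.dumps(
        {"name": name, "version": version, "children": sorted(child_cids)},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_trust_set(ic_versions: dict, module_version: str,
                    system_version: str) -> dict:
    """Map hierarchical names to identifiers for one system with one module."""
    trust_set = {}
    ic_cids = []
    for ic_name, version in ic_versions.items():
        cid = node_cid(ic_name, version, [])
        trust_set[f"system/module/{ic_name}"] = cid
        ic_cids.append(cid)
    module_cid = node_cid("module", module_version, ic_cids)
    trust_set["system/module"] = module_cid
    trust_set["system"] = node_cid("system", system_version, [module_cid])
    return trust_set
```

Calling `build_trust_set` once with the old IC version and once with the new one yields two trust sets whose IC, module and system identifiers all differ, which mirrors the requirement that an update anywhere in the hierarchy produces an updated trust chain.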
Since the registration smart contract has been published into the blockchain network, it is protected by the consensus mechanism of that blockchain, so the registration smart contract and the registered sets are immutable once released. Once on the chain, each set, together with its corresponding hierarchical CID and blockchain address, forms an identifiable trust set. According to the reconfigurable intelligence trust chain defined in Refs. [21] and [22], the following Reconfigurable Intelligence Procedure is implemented and described in this paper. In this procedure, the trust set with hierarchical product names can be reconfigured. The details of the procedure are shown below.
Reconfigurable Intelligence Procedure.
Input: The old blockchain addresses and old CID values of the IC chips, modules and the customized system from upstream, midstream to downstream;
Output: The new trust set of blockchain addresses and new CID values of the IC chips, modules and the customized system from upstream, midstream to downstream;
Begin
Step 1. Reconfigure the old blockchain addresses and old CID values of the IC chips, modules and the customized system from upstream, midstream to downstream through the decentralized application (DApp) proposed in this study by using a Merkle DAG (Directed Acyclic Graph), verify the reassembled collections of the various new versions in the decentralized IPFS, and obtain new collections of CIDs;
Step 2. Through the use of smart contracts with Merkle DAG functions, output the new trust set of the latest updated collection with hierarchical product information, including the IC chips, modules and customized systems from upstream, midstream to downstream (including new/old systems, modules or chips, etc.).
End.
3.3 Real-Time Challenge for the Customized System, Module, or IC Chip Phase
In this implementation, each IC chip, module or customized system can be challenged in real time. The Real-Time Challenge for the Customized System, Module or IC Chip Procedure [21, 22] is implemented as described below and shown in Fig. 1.
Real-Time Challenge for the Customized System, Module, or IC Chip Procedure.
Input: Challenge messages for the customized system, module or IC chip, sent by any host on which the web-based DApp is installed, via calls to the related smart contracts; the DApp is managed by the customized system owner;
Output: Response messages stating whether the challenged item is in this reconfigurable intelligence trust chain and giving the latest version of the customized system, module or IC chip together with the detailed product information, sent back to the web-based DApp managed by the customized system owner;
Begin
Step 1. Send challenge messages for the customized System, Module, or IC chip from the host on which the web-based DApp is installed, via calls to the related smart contracts; the DApp is managed by the customized system owner.
Step 2. Send the messages to the real customized system to challenge the System, Module, or IC chip.
Step 3. Reply with the blockchain address and related information of the customized System, Module, or IC chip.
Step 4. Get and verify the trust information.
Step 5. Get the trust information from the blockchain registration information.
Step 6. Get the IPFS data through the CID.
Step 7. After the challenge is passed, the product specification is successfully obtained.
Step 8. Respond with messages stating whether the challenged item is in this reconfigurable intelligence trust chain (trust chain for short) and giving the latest version of the customized system, module or IC chip together with the detailed product information, sent back to the web-based DApp managed by the customized system owner.
End.
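The verification part of the challenge procedure (Steps 3 to 8) can be sketched in Python as follows. This is a hedged illustration, not the deployed DApp or its smart contracts: `registry` is a plain dictionary standing in for the on-chain registration information, `store` is a dictionary standing in for IPFS, SHA-256 hex digests stand in for CIDs, and the function name `answer_challenge` and the reply and response fields are hypothetical.

```python
import hashlib
import json


def answer_challenge(device_reply: dict, registry: dict, store: dict) -> dict:
    """Verify a device reply against the registry and the content store.

    device_reply: what the challenged item sends back (Step 3),
                  here only its blockchain address.
    registry:     address -> content identifier (stand-in for the blockchain).
    store:        content identifier -> stored bytes (stand-in for IPFS).
    """
    failed = {"in_trust_chain": False, "product_info": None}

    # Step 5: look up the registered trust information for this address.
    cid = registry.get(device_reply.get("address"))
    if cid is None:
        return failed

    # Step 6: fetch the product data through the content identifier.
    blob = store.get(cid)

    # Step 4: the data is trusted only if it hashes to the registered value.
    if blob is None or hashlib.sha256(blob).hexdigest() != cid:
        return failed

    # Steps 7 and 8: the challenge passed, return the product specification.
    return {"in_trust_chain": True, "product_info": json.loads(blob)}
```

An address that was never registered, or stored data that no longer matches its registered identifier, both lead to a negative response in this sketch.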
Fig. 1. The system model which is first proposed in this paper.
4 Experimental Results
To implement the trust chain for the customized System-Module-IC chip, a test-bed named “Lab. H304 Server Rack Environmental Monitoring System”, located at Room H304 in Asia University, Taiwan, was built. It is shown in Fig. 2 and consists of the following:
1) WinFast Server with Dual Socket P (LGA 3647); CPU: 2nd Gen. Intel® Xeon® Scalable Processors @ 10.4 GT/s; RAM: 6 TB 3DS ECC DDR4-2933 MHz;
2) Platform: Ubuntu Linux 22.04.1 LTS;
3) Blockchain: Quorum, which is basically a private or permissioned blockchain network based on the Ethereum blockchain;
4) An FPGA (Cmod A7-35T), used to simulate a ROM (Read-Only Memory) IC chip holding blockchain information;
5) DSI2598+, a narrow-band Internet of Things module with 4G mobile communication access, selected as the module, plus an EEPROM (Electrically-Erasable Programmable Read-Only Memory) holding the related blockchain information;
6) IoT system with sensors: EMI (HMC2003), PM2.5 (HM3301) and DHT12.
Fig. 2. Lab. H304 Server Rack Environmental Monitoring System located at Room H304 in Asia University, Taiwan
The web-based DApp proposed by Hsing-Chung Chen’s NSTC Project Team is shown in Fig. 3. Four main functions are deployed in this web-based DApp. First, the registration function registers the product version and its related blockchain information for the customized system, module, or IC chip individually. Second, the reconfigurable intelligence function lets the customized system reconfigure the corresponding blockchain addresses into the trust chain. The web-based DApp can also verify the corresponding content identifier (CID) output by IPFS, which is the hash value calculated over the functions and versions of the components, including the modules or IC chips.
[Figure 3 panel labels: web-based DApp proposed by Hsing-Chung Chen’s NSTC Project Team; the registration function and the reconfigurable intelligence function for the customized System, Module and/or IC chip.]
Fig. 3. The web-based DApp proposed by Hsing-Chung Chen’s NSTC Project Team.
The reconfigurable intelligence function, which consists of the smart contracts, creates and maintains the trust chain for the customized System-Module-IC chip. It has been implemented in this study and is shown in Fig. 4.
Fig. 4. Real-time challenge of each System, Module or IC chip for verification in the reconfigurable intelligence trust chain.
The performance of challenge and response for the DSI2598+ and the FPGA was evaluated individually with the web-based DApp proposed and implemented in our laboratory H304, Asia University. First, the average challenge-and-response time via the web-based DApp to the FPGA on the module DSI2598+, including successful verification of either the current version in the response or all the related information in the Quorum blockchain and IPFS we implemented, is 11.96 s. The evaluation result for the real-time ROM IC chip challenge is shown in Fig. 5. Second, the average challenge-and-response time via the web-based DApp to the EEPROM on the module DSI2598+, with the same verification, is 11.34 s. The evaluation result for the real-time module challenge is shown in Fig. 6.
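An average challenge-and-response time of this kind can be measured with a simple timing loop, sketched below in Python. The sketch is an assumption about how such a measurement could be organized and is not the evaluation code used in this study: `challenge_once` is a hypothetical callable that performs one full challenge and returns whether verification succeeded, and the default of 50 trials is only suggested by the horizontal axes of Figs. 5 and 6.

```python
import statistics
import time
from typing import Callable


def average_response_time(challenge_once: Callable[[], bool],
                          trials: int = 50) -> float:
    """Return the mean wall-clock time in seconds of successful challenges.

    challenge_once is a hypothetical callable that issues one challenge,
    waits for the response, and returns True if verification succeeded.
    """
    durations = []
    for _ in range(trials):
        start = time.perf_counter()
        ok = challenge_once()
        elapsed = time.perf_counter() - start
        if ok:
            # Only successfully verified challenges contribute to the mean.
            durations.append(elapsed)
    if not durations:
        raise ValueError("no successful challenge to average")
    return statistics.mean(durations)
```

Counting only verified responses matches the wording above, where the reported averages refer to challenges that were answered and successfully verified.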
[Figure 5 plot: The performance of challenge and response for FPGA; x-axis: the challenging numbers (0–50); y-axis: response time (seconds).]
Fig. 5. The performances of challenge and response for ROM IC chip simulated by “FPGA”.
[Figure 6 plot: The performance of challenge and response for DSI2598+; x-axis: the challenging numbers (0–50); y-axis: response time (seconds).]
Fig. 6. The performances of challenge and response for module DSI2598+.
5 Conclusion
The reconfigurable trust chain approach provides designed-in security as a trusted platform solution that integrates system, module and IC. It is a trusted platform module (TPM) solution based on Blockchain and IPFS, first implemented and presented in this paper according to the idea and concept of references [21] and [22]. The implementation aims to provide comprehensive identification of the hardware versions of upstream and downstream products in the supply chain under System-Module(s)-IC(s) vertical integration. The approach is deployed on the private blockchain we created in our Lab. H304, which can be accessed over the Internet, combined with a decentralized database system, an IPFS server installed in the same laboratory, to improve the reliability, traceability and identification of all-round customized versions of customized systems, modules and ICs. The approach achieves anti-counterfeit version traceability for the customized System-Module-IC by using the reconfigurable intelligence trust chain, where the DApp and the smart contracts for the reconfigurable intelligence trust chain have been designed and run on this real test-bed system.
Acknowledgments. This work was funded and supported by the National Science and Technology Council (NSTC), Taiwan, R.O.C., under contract NSTC 111-2218-E-468-001-MBK and the Ministry of Science and Technology, Taiwan, R.O.C., under contract MOST 110-2218-E-468-001-MBK. In addition, this work was supported in part by NSTC, Taiwan, R.O.C., under contract NSTC 111-2218-E-002-037 and MOST 110-2218-E-002-044.
References
1. Iqbal, W., Abbas, H., Daneshmand, M., Rauf, B., Bangash, Y.A.: An in-depth analysis of IoT security requirements, challenges, and their countermeasures via software-defined security. IEEE Internet Things J. 7(10), 10250–10276 (2020). https://doi.org/10.1109/JIOT.2020.2997651
2. Chen, H.-C.: Collaboration IoT-based RBAC with trust evaluation algorithm model for massive IoT integrated application. Mob. Netw. Appl. 24(3), 839–852 (2018). https://doi.org/10.1007/s11036-018-1085-0
3. Chen, H.-C.: TCABRP: a trust-based cooperation authentication bit-map routing protocol against insider security threats in wireless Ad Hoc networks. IEEE Syst. J. 11(02), 449–459 (2017). https://doi.org/10.1109/JSYST.2015.2437285
4. Chen, H.-C., You, I., Weng, C.-E., Cheng, C.-H., Huang, Y.-F.: A security gateway application for End-to-End M2M communications. Comput. Stan. Interfaces 44, 85–93 (2016)
5. Chen, H.-C., Yang, W.-J., Chou, C.-L.: An online cognitive authentication and trust evaluation application programming interface for cognitive security gateway based on distributed massive Internet of Things network. Concurrency Comput.-Pract. Experience 33(19) (2020). https://doi.org/10.1002/cpe.6128
6. Chen, H.-C.: A trust evaluation gateway for distributed blockchain IoT network. In: Chen, J.L., Pang, A.-C., Deng, D.-J., Lin, C.-C. (eds.) WICON 2018. LNICSSITE, vol. 264, pp. 156–162. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-06158-6_16
7. Chen, H.-C., Putra, K.T., Tseng, S.-S., Chen, C.-L., Lin, J.C.-W.: A spatiotemporal data compression approach with low transmission cost and high data fidelity for an air quality monitoring system. Future Gener. Comput. Syst. 108, 488–500 (2020). https://doi.org/10.1016/j.future.2020.02.032
8. Chen, H.-C., Putra, K.T., Weng, C.-E.: A novel predictor for exploring PM2.5 spatiotemporal propagation by using convolutional recursive neural networks. J. Internet Technol. 23(1), 165–176 (2022). https://doi.org/10.53106/160792642022012301017
9. Putra, K.T., et al.: Federated compressed learning edge computing framework with ensuring data privacy for PM2.5 prediction in smart city sensing applications. Sensors 21(13), 4586 (2021). https://doi.org/10.3390/s21134586
10. Karie, N.M., Sahri, N.M., Yang, W., Valli, C., Kebande, V.R.: A review of security standards and frameworks for IoT-based smart environments. IEEE Access 9, 121975–121995 (2021). https://doi.org/10.1109/ACCESS.2021.3109886
11. Frustaci, M., Pace, P., Aloi, G., Fortino, G.: Evaluating critical security issues of the IoT world: present and future challenges. IEEE Internet Things J. 5(4), 2483–2495 (2018)
12. Arias, O., Wurm, J., Hoang, K., Jin, Y.: Privacy and security in Internet of Things and wearable devices. IEEE Trans. Multi-Scale Comput. Syst. 1(2), 99–109 (2015)
13. Pinto, S., Gomes, T., Pereira, J., Cabral, J., Tavares, A.: IIoTEED: an enhanced, trusted execution environment for industrial IoT edge devices. IEEE Internet Comput. 21(1), 40–47 (2017). https://doi.org/10.1109/MIC.2017.17
14. He, D., Chen, C., Chan, S., Bu, J., Vasilakos, A.V.: ReTrust: attack-resistant and lightweight trust management for medical sensor networks. IEEE Trans. Inf. Technol. Biomed. 16(4), 623–632 (2012)
15. Kounelis, I., et al.: Building trust in the human–Internet of Things relationship. IEEE Technol. Soc. Mag. 33(4), 73–80 (2014)
16. Nakamoto, S.: Bitcoin: A Peer-to-Peer Electronic Cash System, pp. 1–9 (2008)
17. Alharby, M., van Moorsel, A.: Blockchain based smart contracts: a systematic mapping study. In: Computer Science & Information Technology, pp. 125–140 (2017)
18. Chen, H.-C., Irawan, B., Shih, C.-Y., Damarjati, C., Shae, Z.-Y., Chang, F.: A smart contract to facilitate goods purchasing based on online haggle. Part of the advances in intelligent systems and computing book series (AISC, volume 994). In: The 13th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2019), First Online (2019)
19. Chen, H.-C., Damarjati, C., Prasetyo, E., Arofiati, F., Sugiyo, D.: Blockchain technology benefit in tackling online shopping transaction revocation issue. In: Proceedings of the 2020 The 6th International Conference on Frontiers of Educational Technologies (ICFET 2020), pp. 191–195. Tokyo, Japan (2020). https://doi.org/10.1145/3404709.3404757
20. Koe, A.S.V., et al.: Hieraledger: towards malicious gateways in appendable-block blockchain constructions for IoT. Inf. Sci. 632, 87–104 (2023)
21. Chen, H.-C., et al.: An implementation of trust chain framework with hierarchical content identifier mechanism by using blockchain technology. Sensors 22(13), 1–30 (2022). https://doi.org/10.3390/s22134831
22. Chen, H.-C.: A reconfigurable trust chain authentication method based on system-module-chip vertical integration of upstream and downstream product hardware version identification. Patent of the Republic of China, Application number: 112105536, pp. 1–14, Applied date on 16 Feb 2023
Prototyping of Haptic Datagloves for Deafblind People
Patrick C. K. Hung1,2(B), Kamen Kanev1,2, Atsushi Nakamura2, Ryuhei Takeda2, Hidenori Mimura1,2, and Masakatsu Kimura1,2
1 Ontario Tech University, Oshawa, Canada
2 Shizuoka University, Hamamatsu, Japan
{mimura.hidenori,kimura.masakazu}@shizuoka.ac.jp
Abstract. Modern haptic gloves incorporate wearable technologies employing sensor and actuator arrays with a power supply and electronics for haptic data acquisition and processing to support human-computer interaction in different application domains. This paper discusses the prototyping process of specialized haptic Datagloves that realize remote two-way communication for the deafblind. The research aims to design and develop a haptic Dataglove system as a human-computer interface that supports the independence of the deafblind. The employed communication approach is based on the Malossi alphabet with minimal adjustments that enable its use in mobile settings. The implemented input/output method reduces the complexity of character spelling for the deafblind by touch sensitive pads and haptic feedback actuators embedded in the Datagloves. As a result, messages can be i) transmitted by simple tapping on the sensitive touch pads with the thumb of the same hand and ii) received through the haptic actuators on the other hand either simultaneously or consecutively.
1 Introduction
According to a survey by the Ministry of Health, Labour and Welfare of Japan, there are over 23,000 deafblind people nationwide, with many others gradually losing sight and hearing as their disabilities progress. Since deafblind disorders combine impairments in the two main human sensory functions, the eyes and the ears, they often lead to communication difficulties. Deafblind people usually rely on the sense of touch as their main route of obtaining information and often use the Braille tactile writing system [1, 2]. The research community is developing assistive technologies for the deafblind that are based on tactile sensation [3] and can support their autonomous living with lesser dependence on caregivers and/or interpreters. However, the advanced technological means for tactile sensation based communications still appear to be challenging for deafblind people when used independently [4]. There is a need, therefore, to develop an efficient and practical remote two-way communication assistive technology that enables deafblind people to become more independent and accessible without caregivers and interpreters [5].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 273–282, 2023. https://doi.org/10.1007/978-3-031-35836-4_29
P. C. K. Hung et al.
Datagloves are used for direct hand and finger sensing and motion tracking in different application domains, often involving Virtual Reality (VR), Augmented Reality (AR), and Serious Gaming [6]. Such Datagloves provide assistive support for gesture-based interactions that enhance natural, highly intuitive interpersonal communication and help overcome language, culture, and other barriers. While traditional Datagloves are limited to gathering input data, there is a demand for enhanced models that can provide assistive feedback through haptic outputs. Such haptic feedback is typically implemented with an array of actuators that vibrate under the control of a specialized driver circuit. Most of the academic research and development of haptic gloves deals with the tailoring and integration of different haptic technologies, where a number of specialized sensors and actuators for specific purposes are investigated and embedded into gloves made of a variety of primary materials [7].

In this paper, we present the prototyping process of haptic gloves (a set of input and output data gloves) that are specifically designed to support the communication of the deafblind. The employed two-way communication method uses single-hand input, which reduces the complexity of character spelling, and combines it with haptic reception on the highly sensitive areas of the user's fingers. The built prototype is based on the Malossi alphabet and can be used in mobile settings through the Mobile Malossi extension [8].

The remainder of the paper is organized as follows. Section 2 overviews the related work on haptic/data gloves for the deafblind. Section 3 presents the prototyping process of the haptic Datagloves, and Section 4 concludes the paper and provides some insights into the planned future work.
2 Literature Overview

An early work by Caporusso [9] described an assistive hardware/software system called DB-HAND, consisting of an input/output glove equipped with sensors and actuators to transmit the Malossi alphabet. Similarly, Gollner et al. [10] presented the Mobile Lorm Glove as a mobile communication and translation device for the deafblind. The glove employed the Lorm alphabet for translation between text and haptic/tactile signals in both directions and for SMS transmission. The tactile input sensors were implemented with pressure-sensitive fabric and placed on the palmar side of the glove, while vibration motors, placed on the dorsal side, were employed as haptic output actuators. Khambadkar and Folmer [11] followed by using combinations of different orientations of both hands to define a set of semaphores and construct a specialized alphabet suitable for communication involving deafblind users. Choudhary et al. [12] then presented a data glove that employed the Braille alphabet for remote communication through SMS (Short Message Service). This system employed capacitive touch sensors, placed on the palmar side of the glove, for input, which was converted to text and sent as an SMS by a mobile phone. Incoming messages were delivered through tactile feedback, implemented as vibration patterns on the dorsal side of the glove. Lee and Lee [13] presented a data glove for communication based on American Sign Language (ASL). The glove incorporated flex sensors, pressure sensors, and inertial motion sensors. Data was collected from the glove sensors and sent to a processing module that employed a Support Vector Machine (SVM) algorithm to classify the gestures. A Bluetooth link was then used to
Prototyping of Haptic Datagloves for Deafblind People
communicate the recognized ASL input to an Android device, which employed text-to-audio conversion for vocalizing the output. In [14], Navaitthiporn et al. presented a specialized input glove device that detected 30 characters from the Thai alphabet and 12 Thai words. This glove used flex sensors for measuring the bends at the finger joints and a tri-axial accelerometer and gyroscope module (GY-521) to detect the direction and movements of the hand. Giulia et al. [15] complemented this work by presenting the output glove device GLOS, which consisted of a Raspberry Pi 3 board equipped with a microphone and vibrating discs for haptic feedback. GLOS employed the microphone for voice input that was processed and converted to text in real time. The obtained letter sequences were then used for controlling the five haptic modules attached to the glove fingers. Ozioko et al. [16], on the other hand, integrated both piezoresistive sensors and vibrotactile actuators into data gloves. In this system, the index, middle, and ring fingers on both hands received haptic feedback based on the Braille tactile code.

In the context of the above, the authors of this work have also been extensively involved in data glove related research and development. The foundational sensor technology, pioneered by some of the authors and employed in data glove implementations, was presented by Suzuki et al. [17]. A later work by Gelsomini et al. [18] discussed the application of the technology in a sensor framework for advanced motion tracking with data gloves. Finally, our recent work on specialized data gloves for enhanced support of the deafblind was presented in Gelsomini et al. [8]. From a broader perspective, we have also considered the employment of data gloves in robot control and exergaming, as reported in Demoe et al. [19], and the experiential aspects of wearable multi-device user interfaces, including data gloves, as discussed in Salgado et al. [20].
3 The Haptic Dataglove Prototyping Process

A Haptic Dataglove-based communication system incorporates two essential components: one for message sending and another for message receiving. While a single Dataglove can accommodate both the sending and the receiving components, in this study we consider separate Datagloves with dedicated functionality. This allows us to explore implementations of sending Datagloves based on different sensing and tracking technologies, as well as their combined use with the different haptic feedback methods that are deemed appropriate for deafblind users. Our ultimate goal is to implement an integrated Haptic Dataglove that will incorporate both the input component and the haptic feedback component and will satisfy the needs of the deafblind community to the best possible extent.

3.1 The Dataglove Input and Its Enhancements

Traditional Datagloves are input devices that sense human hand and finger postures through sensors embedded in the gloves. More advanced Datagloves support real-time hand and finger motion tracking and gesture recognition. In some advanced Datagloves, rapid-response, highly stretchable Carbon Nanotube (CNT) based sensors are employed for improved performance [17]. A Dataglove incorporating such CNT sensors has been commercialized by Yamaha and allows for high-fidelity tracking of subtle hand and finger motions, while the touch and feel remain as if one were wearing traditional work or leisure gloves [18]. The Yamaha Dataglove is a fairly complex device with 11 Degrees of Freedom (DOF) for finger motion tracking and 6 DOF for hand tracking. The 11 embedded CNT sensors are scanned at regular 4 ms intervals (250 scans per second) and the corresponding stretch values are digitized with 12-bit resolution. This is carried out in parallel with the data input from the embedded gyroscope (3 DOF) and accelerometer (3 DOF). The wearing properties and the advanced performance of the Yamaha Dataglove indicate, therefore, that it could serve as the message sending component in the prototyped Haptic Dataglove communication system.

While rudimentary gesture-based communication is widely employed in daily activities, the information content delivered by gestures is very limited. This is mostly because the use of gestures varies greatly across countries, languages, and cultures. The use and the interpretation of the different gestures depend on the backgrounds of the sender and the receiver, so such communication is prone to misunderstanding. To alleviate this problem, we focus on a well-defined set of finger postures with a minimal number of hand gestures that can be mapped to an encoding medium such as an alphabet.

By employing the available real-time data from the Yamaha Dataglove, we have implemented a simple finger posture recognition scheme that can accommodate a sufficient number of well distinguishable finger postures for covering the English alphabet. For enhanced reliability, the scheme is based on binary (Open/Close) finger states for the index, middle, ring, and small fingers. The thumb is used as a modifier, so in addition to its two binary states (Open/Close), a third (Aligned) state is introduced.
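The binary finger states and the three-state thumb modifier described above could, for illustration, be derived from the digitized stretch values with fixed thresholds. The following Python sketch is not the authors' implementation: it assumes one representative 12-bit stretch value per finger, and the threshold constants as well as the idea of separating the thumb states by stretch alone are hypothetical.

```python
from enum import Enum

ADC_MAX = 4095  # 12-bit stretch values, as stated for the CNT sensors

# Hypothetical thresholds; real values would have to be tuned on the hardware.
FINGER_CLOSE_THRESHOLD = 2048
THUMB_ALIGNED_THRESHOLD = 1365
THUMB_CLOSE_THRESHOLD = 2730


class Thumb(Enum):
    OPEN = 0
    ALIGNED = 1
    CLOSE = 2


def finger_code(small, ring, middle, index):
    """Pack four binary finger states into a 4-bit code (Close = 1, Open = 0).

    The small finger is the most significant bit and the index finger the
    least significant one, which matches the digital codes of Table 1.
    """
    code = 0
    for stretch in (small, ring, middle, index):
        closed = 1 if stretch >= FINGER_CLOSE_THRESHOLD else 0
        code = (code << 1) | closed
    return code


def thumb_state(stretch):
    """Classify the thumb into Open/Aligned/Close (assumed two-threshold rule)."""
    if stretch >= THUMB_CLOSE_THRESHOLD:
        return Thumb.CLOSE
    if stretch >= THUMB_ALIGNED_THRESHOLD:
        return Thumb.ALIGNED
    return Thumb.OPEN
```

For example, `finger_code(4000, 0, 0, 0)` yields code 8 (only the small finger closed), while a fist yields code 15.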
While there are $2^4 = 16$ possible Open/Close binary combinations for the four fingers, as outlined in Table 1, two of them are generally not acceptable in some countries and cultures, and two others appear to be difficult to produce for some people. From the remaining 12 postures, we have reserved the one with all four fingers closed (fist) for automatic initialization purposes, as it stretches all embedded CNT sensors to the possible maximum. As a result, the remaining 11 postures marked as available in the table can be used for signing without restrictions. By combining them with the three modifier postures assigned to the thumb (Open/Close/Aligned), we obtain $11 \times 3 = 33$ possible signs, which is more than sufficient for covering the 26 letters of the English alphabet.

The benefit of the model described so far is that it can be implemented with a simple deterministic algorithm employing thresholds that can run on a low-cost microcontroller. Our experiments indicate that a standard Arduino microcontroller is sufficient for real-time processing of the Dataglove input and the subsequent posture recognition and letter encoding. Moreover, due to the binary nature of the employed finger postures, reliable posture detection across multiple users can be achieved without user-specific initialization procedures.

The employed encoding approach, however, has its limitations. Since it was designed by prioritizing reliability and efficiency, usability issues have not been considered in full. While the finger postures can be neatly organized in a table and explained in a succinct and logical way, they constitute a new finger posture-based communication mechanism that has to be mastered. To facilitate and broaden the adoption of the Dataglove-based input, it would be beneficial to align the adopted postures and gestures with those that are commonly employed in existing sign languages.
Table 1. The 16 combinations of the binary (Open/Close) postures for the small, ring, middle, and index fingers and their possible assignments (O = Open, C = Close).

| Digital Code | Small Finger | Ring Finger | Middle Finger | Index Finger | Assignment |
|---|---|---|---|---|---|
| 0 | O | O | O | O | available |
| 1 | O | O | O | C | available |
| 2 | O | O | C | O | available |
| 3 | O | O | C | C | available |
| 4 | O | C | O | O | available |
| 5 | O | C | O | C | available |
| 6 | O | C | C | O | generally unacceptable |
| 7 | O | C | C | C | available |
| 8 | C | O | O | O | available |
| 9 | C | O | O | C | available |
| 10 | C | O | C | O | difficult to produce |
| 11 | C | O | C | C | difficult to produce |
| 12 | C | C | O | O | available |
| 13 | C | C | O | C | generally unacceptable |
| 14 | C | C | C | O | available |
| 15 | C | C | C | C | reserved for initialization |
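The assignments in Table 1 and the resulting sign count can be checked with a few lines of Python. This is only an illustrative sketch: the code sets are copied from the table, while the function and variable names are hypothetical.

```python
from itertools import product

# Code sets taken from Table 1.
UNACCEPTABLE = {6, 13}
DIFFICULT = {10, 11}
RESERVED = {15}
THUMB_STATES = ("Open", "Close", "Aligned")


def postures(code):
    """Return the O/C letters for the small, ring, middle, and index fingers."""
    return tuple("C" if code & (1 << bit) else "O" for bit in (3, 2, 1, 0))


def assignment(code):
    """Return the assignment class of a 4-bit posture code as in Table 1."""
    if code in UNACCEPTABLE:
        return "generally unacceptable"
    if code in DIFFICULT:
        return "difficult to produce"
    if code in RESERVED:
        return "reserved for initialization"
    return "available"


# Postures usable for signing, and all (posture, thumb modifier) pairs.
available = [code for code in range(16) if assignment(code) == "available"]
signs = list(product(available, THUMB_STATES))
```

Here `available` contains 11 codes and `signs` contains 33 pairs, which leaves 7 spare signs beyond the 26 letters of the English alphabet.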
One possibility would be the Lorm alphabet, which is specifically designed to support the deafblind and is mainly employed in German-speaking countries and in the Netherlands, the Czech Republic, Poland, Georgia, etc. [10]. This tactile alphabet places letters on the palm and forms words and sentences through touches and pinches. Another widely used tactile communication method is the Malossi alphabet, which is often perceived as an alternative to the Braille system by people with vision and hearing disabilities [16, 21]. In our previous work, we investigated the employment of the Malossi alphabet in different scenarios for enhancing the support of the deafblind and proposed a version that is particularly suitable for mobile use [8]. The Mobile Malossi alphabet retains very high similarity with the traditional Malossi alphabet and can thus be mastered with minimal additional learning. It also allows single-hand signing and reception, which aligns with our experimental approach where separate sending and receiving Datagloves are being developed. In this research, we therefore adopt the Mobile Malossi alphabet as the communication medium for the technology-enhanced support of deafblind communication.
3.2 The Haptic Feedback

Experimental haptic feedback gloves have been designed and implemented at different research institutions, including Shizuoka University and Ontario Tech University, and some have been commercialized and made available to the general public. Most of those devices, however, provide haptic feedback through miniature DC motors with a shifted weight balance that induces vibration when energized. Among haptic feedback Datagloves designed to support the deafblind, one notable product is the dbGLOVE [21]. However, while the dbGLOVE fully supports the haptic-based communication functionality required by the deafblind, it appears to be rather bulky, mostly due to the limitations of the employed vibration feedback technology.

As an ultimate goal of our research, we are considering a haptic feedback Dataglove with the touch and feel of a normal work or leisure glove that will support all the functionality required for providing advanced technological support for deafblind communication. For this, we are exploring more advanced, flexible piezocrystal-based actuators, some of which are being developed in-house [22]. The advantage of this approach is that such actuators can also be used as pressure-sensing input components, which creates a pathway toward the construction of an integrated Dataglove incorporating both message sending and message receiving in a single unit.

3.3 The Integrated System Prototype

Figure 1 shows the Mobile Malossi alphabet input as mapped on the left hand for message sending. The arrangement of the 26 letters of the English alphabet allows for single-hand input by touching any of the denoted letter spots with the tip of the thumb. For clarity, letters on the palm are shown in black and letters on the side of the fingers are shown in red. Note that only the letter positions for A, P, U, and W have been adjusted. All other letter positions remain consistent with the standard Malossi alphabet (the letters K and F remain on the palm in their original order, although slightly repositioned for easier reachability by the thumb).
Fig. 1. The Mobile Malossi alphabet mapping of the 26 letters from the English alphabet applied to the left hand.
The prototype of the message sending glove is shown in Fig. 2(a). In the course of the design, we explored various methods and construction materials, considering the ease of use, the reliability of the communication, the durability of the Dataglove, and the incurred material and labor costs. The prototype is based on an off-the-shelf synthetic leather glove. A leather puncturing tool was used to punch holes in the synthetic leather of the prototype glove at the letter positions denoted in Fig. 1. Standard snap buttons were installed in the holes and wired to an L-shaped pin header positioned near the wrist. The employed AWG30 heat-resistant insulated wire was additionally bonded to the pin connector to eliminate any excessive wire bending. To reduce possible discomfort, an inner glove was installed in the wired glove and fixed in place using a one-component elastic adhesive.

The initial controller device for the message sending Dataglove was implemented with an Arduino Mega board, which provides a sufficient number of digital I/O ports for direct connections to all letter pads on the Dataglove. More specifically, the pad on the thumb was configured as a signal source, while the letter pads on the other fingers and the palm were configured as digital inputs that were scanned by the Arduino controller.
Fig. 2. The prototypes of the message sending Dataglove (a) and the message receiving Dataglove (b).
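The pad scanning performed by the sending controller can be illustrated with a small simulation. The Python sketch below is an assumption-laden illustration rather than the authors' firmware: the alphabetical pad-to-input wiring, the `read_input` callback, and the debouncing rule (a letter is reported only after several identical consecutive scans) are all hypothetical additions.

```python
import string

# Hypothetical wiring: the pad for 'A' is on input 0, 'B' on input 1, and so on.
PAD_INPUTS = {letter: pin for pin, letter in enumerate(string.ascii_uppercase)}


def scan_once(read_input):
    """Return the single letter whose pad currently reads active, else None.

    `read_input(pin)` stands in for a digital read; it returns True when the
    thumb (the signal source) touches the pad wired to that input.
    """
    active = [letter for letter, pin in PAD_INPUTS.items() if read_input(pin)]
    return active[0] if len(active) == 1 else None


def detect_taps(samples, stable_reads=3):
    """Turn a sequence of scan results into debounced letter events.

    A letter is reported once after it has been seen in `stable_reads`
    consecutive scans, and again only after the reading has changed.
    """
    events = []
    last, count, reported = None, 0, False
    for letter in samples:
        if letter == last:
            count += 1
        else:
            last, count, reported = letter, 1, False
        if letter is not None and count >= stable_reads and not reported:
            events.append(letter)
            reported = True
    return events
```

Ambiguous scans, where several pads read active at once, are treated as no input in this sketch; a real controller might handle them differently.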
The prototype of the message receiving Dataglove, shown in Fig. 2(b), is based on the same synthetic leather material used in the message sending Dataglove. It employs thin, circular, 12 mm piezo actuators driven at 280 Hz with 15 V AC. The actuators are attached to the outer surface of an inner glove at the positions prescribed by the Mobile Malossi alphabet (that is, at the mirror image of the placement shown in Fig. 1). The initial controller device for the message receiving Dataglove was also implemented with an Arduino Mega board. While the communication and signal control were handled entirely by the Arduino control program, additional signal conditioning and amplification were needed for the piezo actuators. A specialized piezo driver unit (PD-206-150B) was thus used for the signal conditioning, along with two Bolage-M W025-5V 16-channel relay arrays for switching the conditioned signal.
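Two 16-channel relay arrays provide 32 switchable channels, enough for one actuator per letter. The Python sketch below shows one way a letter could be mapped to a relay channel; the alphabetical channel ordering is a hypothetical choice, as the actual wiring of the prototype is not specified.

```python
import string

CHANNELS_PER_ARRAY = 16  # each relay array offers 16 channels (from the text)
NUM_ARRAYS = 2           # two arrays are used (from the text)

# The 26 letter actuators must fit into the available relay channels.
assert NUM_ARRAYS * CHANNELS_PER_ARRAY >= len(string.ascii_uppercase)


def relay_channel(letter):
    """Map a letter to (array index, channel index).

    Assumes, hypothetically, that actuators are wired in alphabetical order:
    A..P on the first array and Q..Z on the second.
    """
    index = string.ascii_uppercase.index(letter.upper())
    return divmod(index, CHANNELS_PER_ARRAY)
```

With this ordering, `relay_channel("P")` selects the last channel of the first array and `relay_channel("Q")` the first channel of the second one, leaving six channels unused.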
The signal flow diagram of the integrated system is shown in Fig. 3. The two Arduino boards are interconnected so that any left-hand input detected by the message sending board is immediately communicated to the message receiving board. The identified letter is then mirror-mapped to the receiving right hand and the corresponding piezo actuator is activated. This provides real-time haptic feedback that follows the letters of the current message, thus ensuring proper monitoring and allowing for immediate corrective actions whenever needed. The message sending Arduino controller can also send letters entered in the serial terminal. In this way, the Dataglove user's ability to identify different letters based on their corresponding tactile feedback can be evaluated. This could be extended to experiments addressing the analysis and natural recognition of letter sequences of different lengths, such as words and their abbreviations.
Fig. 3. Signal flow diagram of the integrated system.
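The signal flow of Fig. 3 can be mimicked in software for clarity. The following Python sketch is a simplified stand-in, not the Arduino control program: the class and method names are hypothetical, and actuator activation is represented only by a log of letters.

```python
import string


class ReceivingBoard:
    """Stand-in for the receiving controller: records which actuators fire."""

    def __init__(self):
        self.activated = []

    def on_letter(self, letter):
        # The actuators sit at mirrored positions on the right hand, so in
        # this sketch the letter itself selects the actuator.
        self.activated.append(letter)


class SendingBoard:
    """Stand-in for the sending controller with its two letter sources."""

    def __init__(self, receiver):
        self.receiver = receiver

    def on_pad_touched(self, letter):
        # A letter tapped on the left-hand glove is forwarded immediately.
        self.receiver.on_letter(letter)

    def on_serial_text(self, text):
        # Letters typed in the serial terminal are forwarded one by one;
        # characters outside A-Z are ignored in this sketch.
        for ch in text.upper():
            if ch in string.ascii_uppercase:
                self.receiver.on_letter(ch)
```

Both input paths end in the same `on_letter` call, which reflects how glove input and serial-terminal test input share the haptic output side.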
The process described so far involves a single pair of Datagloves worn by the same person and used for testing and experiments. Mobile Malossi alphabet based communication between different persons, however, will require two sets of Datagloves and involve an exchange of information between the four Arduino boards that control them. In this case, we should distinguish between the local feedback that services the Datagloves of the same user, as described above, and the remote feedback that is needed for the message transfer to the other user. Note that letters and letter sequences received by the remote user must be directly communicated to that user's receiving Arduino board for rendering by the corresponding piezo actuators. While the message sending Dataglove of the remote user is not involved in this process, it remains active and thus allows the receiver to intervene as needed. In this way, communication can be interrupted and/or concurrently moderated by any of the participants for more efficient message exchanges.
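The distinction between local and remote feedback can be made concrete with a small routing model. The Python sketch below is purely illustrative: it collapses each user's pair of boards into one hypothetical `User` object and records which kind of feedback every letter produces.

```python
class User:
    """One participant wearing a sending and a receiving Dataglove."""

    def __init__(self, name):
        self.name = name
        self.peer = None
        self.received = []  # (kind, letter) pairs rendered on the receiving glove

    def send(self, letter):
        # Local feedback: the sender feels each letter on the own receiving glove.
        self.received.append(("local", letter))
        # Remote feedback: the same letter goes to the partner's receiving board.
        if self.peer is not None:
            self.peer.received.append(("remote", letter))


def connect(a, b):
    """Link two users so that each one's letters reach the other."""
    a.peer, b.peer = b, a
```

Because both sending gloves stay active, a letter tapped by the receiving user simply appears as a remote entry in the middle of the partner's message, which models the interruption and moderation described above.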
4 Conclusions and Future Works

This paper addresses the prototyping of input and output haptic Datagloves as human-computer interface components specifically designed for supporting the two-way communication of the deafblind in both local and remote settings. The current system employs separate input and output Datagloves worn on the left and the right hand, respectively. This provides more flexibility in the experimental work through ad-hoc linking of the input sensors and the output actuators embedded in the different Datagloves, as needed by the experiment.

In future work, we are planning to expand the Dataglove related research to other application areas such as emergency services, critical operations, and disaster management. Indeed, Datagloves could be instrumental in extreme situations such as earthquakes, hurricanes, floods, and fires, where the hearing and vision of the general public are severely impeded. Therefore, we are planning to research the design and development of specialized waterproof and/or fire-resistant Datagloves that employ reliable communication channels and are capable of the prolonged independent operation that would be required in such situations.

Acknowledgments. The research and development related to this work was partially supported by JSPS KAKENHI Grant Number JP22K12125 and funding for Cooperative Research at the Research Center for Biomedical Engineering and the Research Institute of Electronics.
References

1. Sandler, W., Lillo-Martin, D.: Sign Language and Linguistic Universals. Cambridge University Press (2006)
2. Mesch, J.: Tactile signing with one-handed perception. Sign Lang. Stud. 13(2), 238–263 (2013)
3. Chang, C.-M., Sanches, F., Gao, G., Johnson, S., Liarokapis, M.: An Adaptive, Affordable, Humanlike Arm Hand System for Deaf and DeafBlind Communication with the American Sign Language. In: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, pp. 871–878 (2022). https://doi.org/10.1109/IROS47612.2022.9982052
4. Duvernoy, B., et al.: HaptiComm: a touch-mediated communication device for deafblind individuals. IEEE Robot. Autom. Lett. 8(4), 2014–2021 (2023). https://doi.org/10.1109/LRA.2023.3241758
5. Raavi, R., Kanev, K., Hung, P.C.K.: Integration of optical and data gloves input for improved sign language analysis and interpretation through machine learning. In: The 8th International Symposium toward the Future of Advanced Research in Shizuoka University, Japan, p. 52 (2022)
6. Kanev, K., Mimura, H., Hung, P.C.K.: Data gloves for hand and finger motion interactions. In: Lee, N. (ed.) Encyclopedia of Computer Graphics and Games, pp. 1–4. Springer International Publishing, Cham (2020). https://doi.org/10.1007/978-3-319-08234-9_510-1
7. Owada, S.S., Sugimura, H., Isshiki, M.: Toddler's hand motion acquisition with handmade data glove. In: 2022 IEEE 4th Global Conference on Life Sciences and Technologies (LifeTech), Osaka, Japan, pp. 96–97 (2022)
8. Gelsomini, F., et al.: Communicating with humans and robots: a motion tracking data glove for enhanced support of deafblind. In: 55th Hawaii International Conference on System Sciences, United States, 9 p. (2022)
9. Caporusso, N.: A wearable Malossi alphabet interface for deafblind people. In: Proceedings of the Working Conference on Advanced Visual Interfaces (AVI '08). Association for Computing Machinery, New York, NY, USA, pp. 445–448 (2008). https://doi.org/10.1145/1385569.1385655
10. Gollner, U., Bieling, T., Joost, G.: Mobile Lorm glove. In: Proceedings of the Sixth International Conference on Tangible, Embedded and Embodied Interaction. ACM, New York, NY, USA, pp. 127–130 (2012)
11. Khambadkar, V., Folmer, E.: A tactile-proprioceptive communication aid for users who are deafblind. In: 2014 IEEE Haptics Symposium (HAPTICS), Houston, TX, USA, pp. 239–245 (2014). https://doi.org/10.1109/HAPTICS.2014.6775461
12. Choudhary, T., Kulkarni, S., Reddy, P.: A Braille-based mobile communication and translation glove for deafblind people. In: 2015 International Conference on Pervasive Computing (ICPC), Pune, India, pp. 1–4 (2015). https://doi.org/10.1109/PERVASIVE.2015.7087033
13. Lee, B.G., Lee, S.M.: Smart wearable hand device for sign language interpretation system with sensors fusion. IEEE Sens. J. 18(3), 1224–1232 (2018). https://doi.org/10.1109/jsen.2017.2779466
14. Navaitthiporn, N., Rithcharung, P., Hattapath, P., Pintavirooj, C.: Intelligent glove for sign language communication. In: 2019 12th Biomedical Engineering International Conference (BMEiCON), 4 p. (2019). https://doi.org/10.1109/bmeicon47515.2019.8990293
15. Giulia, C., Chiara, D.V., Esmailbeigi, H.: GLOS: GLOve for Speech Recognition. In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, pp. 3319–3322 (2019). https://doi.org/10.1109/EMBC.2019.8857927
16. Ozioko, O., Dahiya, R.: Smart tactile gloves for haptic interaction, communication, and rehabilitation. Adv. Intell. Syst. 4(2), 2100091-1-22 (2021)
17. Suzuki, K., et al.: Rapid-response widely stretchable sensor of aligned MWCNT/elastomer composites for human motion detection. ACS Sens. 1(6), 817–825 (2016)
18. Gelsomini, F., et al.: Specialized CNT-based sensor framework for advanced motion tracking. In: The 54th Hawaii International Conference on System Sciences (HICSS-54), Symposium: Computing in Companion Robots and Smart Toys, Grand Wailea, Maui, Hawaii, 7–10 Jan (2021). https://doi.org/10.24251/HICSS.2021.231
19. Demoe, M., Uribe-Quevedo, A., Salgado, A.L., Mimura, H., Kanev, K., Hung, P.C.K.: Exploring data glove and robotics hand exergaming: lessons learned. In: IEEE 8th International Conference on Serious Games and Applications for Health, Vancouver, Canada, pp. 1–8 (2020)
20. Salgado, A.: User experience aspects in wearable multi-device applications designed for health systems: lessons learned. In: The 6th International Symposium on Biomedical Engineering, Japan, 2 p. (2022)
21. Caporusso, N., Biasi, L., Cinquepalmi, G., Trotta, G.F., Brunetti, A., Bevilacqua, V.: A wearable device supporting multiple touch- and gesture-based languages for the deaf-blind. In: Ahram, T., Falcão, C. (eds.) AHFE 2017. AISC, vol. 608, pp. 32–41. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-60639-2_4
22. Takeda, R., Nakamura, A., Kanev, K.: Development of a haptics glove for communication. The Institute of Electronics, Information and Communication Engineers (IEICE), Technical Report ED2022-29, CPM2022-54, LQE2022-62 (2022-11), Japan, 6 p. (2022)
The Design and Implementation of a Weapon Detection System Based on the YOLOv5 Object Detection Algorithm

Tsung-Yu Su and Fang-Yie Leu(B)

Computer Science Department, Tunghai University, Taichung City, Taiwan
[email protected]
Abstract. In recent years, as people's concerns about environmental and personal safety have grown, various image-based detection techniques and related research have gained wide attention. Currently, object-detection algorithms can generally be divided into two categories: traditional ones, which extract features manually, and deep learning-based approaches, which automatically extract features from images. Since the former require a lot of manpower, material resources, financial cost, and time to screen abnormal images, they no longer meet the urgent needs of our societies. In other words, more intelligent identification systems are required. For reasons of public security, if different types of weapons, such as sticks, knives, and guns, can be detected in surveillance images, this can effectively reduce the chance of gangsters carrying weapons and acting violently or seeking revenge. To identify weapons, we need to distinguish them from other objects in surveillance images in real time. However, most cameras have limited computing power, and images captured in the real world have their own problems, such as noise, blur, and rotation jitter, which need to be solved if we want to detect weapons correctly. Therefore, in this study, we develop a weapon detection system for surveillance images by employing a deep learning model. The model used for image detection is YOLOv5 (You Only Look Once, version 5), a lightweight architecture of the YOLO series, and the Sohas (Small Objects Handled Similarly to a weapon) dataset is adopted for the image detection comparison. According to our simulation results, we successfully reduced the number of parameters in the YOLOv5s model by substituting the backbone with ShuffleNetV2, replacing the PANet upsample module in the neck with the CARAFE (Content-Aware ReAssembly of FEatures) upsample module, and replacing the SPPF (Spatial Pyramid Pooling-Fast) module with three lightweight options of simp PPF. 
These changes resulted in a 16.35% reduction in the parameter size of the YOLOv5s model, a 30.38% increase in FLOPS computational efficiency, and a decrease of 0.024 in [email protected].
1 Introduction

In recent years, artificial intelligence and deep learning have been widely applied in scientific research and technological fields, such as image recognition [1], robotics [2], speech interaction [3], and image detection [4]. Deep learning has also been continuously applied in object detection and in industrial or electronic products, for example in facial recognition and Automated Optical Inspection (AOI).

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 283–293, 2023. https://doi.org/10.1007/978-3-031-35836-4_30
On the other hand, artificial intelligence has been successfully applied to computer vision and real-time object detection, producing good research results. For example, the accuracy and speed of object detection and classification algorithms built on Convolutional Neural Networks (CNN) have surpassed the performance of traditional algorithms [5]. Currently, there are two established methods for detecting objects. One is R-CNN (Region-based Convolutional Neural Network) [6], a two-stage detection algorithm that employs region proposals [7]; the other combines HOG (Histograms of Oriented Gradients) features with an SVM (Support Vector Machine) classifier [8]. The latter performed well on the MIT dataset [8]. However, with the rapid development of Deep Neural Networks (DNN), deep learning models have become the third option for object detection. These models work in a way similar to the human visual perception system: they extract features directly from the original image and pass them through a series of layers to build high-dimensional image representations. Owing to the feature-extraction and learning abilities of convolutional neural networks, deep learning techniques have improved the accuracy and speed of object detection.

Also, as people's quality of life improves, awareness of safety is increasing. However, news of people being attacked by gangsters appears from time to time, and for the sake of public security, surveillance cameras have become the norm on roads and in community courtyards. Under these circumstances, if image detection technology can be used to detect whether there are weapons, or people carrying weapons, in our everyday surroundings, so that preventive measures can be taken or the police can be called, our safety can be significantly enhanced.
Furthermore, YOLO (You Only Look Once) [8] is a single-stage object detection solution based on regression models that integrates object detection and object-location detection into one step. Compared with the two-stage detection scheme [8], it greatly improves recognition speed and computational efficiency. Although it sacrifices some object-recognition accuracy, its overall accuracy is still within an acceptable range. Therefore, in this study, we propose a weapon monitoring scheme, named the Weapon Detection System (WDS), which is designed and implemented on the basis of the YOLOv5 object detection algorithm and uses deep learning to monitor the environment for weapons and thereby address the problems mentioned above. Security personnel, guards, and police can increase their vigilance based on the recognition results and even carry out physical security maintenance. For society and enterprises, in addition to improving overall safety, the system also enhances work efficiency and saves a lot of manpower and financial resources, thus reducing management costs.

The rest of this paper is organized as follows. Section 2 briefly presents the literature review and background of this study. Section 3 analyzes the YOLO algorithms. The YOLOv5 improvements and experiments are shown in Sect. 4. Section 5 concludes this study.
2 Literature Review and Background
This section introduces the theoretical basis and relevant techniques of the algorithms used in this study, and describes the characteristics of object detection and the features of the RCNN and YOLO models.
The Design and Implementation of a Weapon Detection System
2.1 Deep Learning-Based Object Detection
Object detection is typically done by first extracting features from images using machine learning and then using classification algorithms to classify the target objects. In recent years, deep learning-based object detection methods have developed rapidly; the techniques most often used are Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Basically, object detection algorithms can be divided into two categories. The first category divides the detection process into two stages, and includes R-CNN, SPP-NET, Fast R-CNN, Faster R-CNN, and Mask R-CNN. The first stage identifies candidate boxes (Region Proposals) through training, and the second stage refines and classifies the candidates. Because the candidate boxes are delineated more accurately, detection accuracy is better, but computation is slower because an independent CNN operation has to be performed on each candidate box. The second category, One-Stage Detectors, uses a single CNN to process the image directly and predict the category and position of the target object; examples are YOLO and SSD (Single Shot MultiBox Detector). R-CNN (Region-based Convolutional Neural Networks) was proposed by Girshick et al. in 2014 as a deep learning-based object detection technique. Compared to traditional CNN-based detection algorithms, R-CNN achieved higher accuracy in object detection, which led to more related research. As a result, more efficient and accurate models such as Fast R-CNN and Faster R-CNN were derived from R-CNN. In the following, we introduce these models.
2.2 R-CNN
In RCNN, the selective search method [9] is first adopted to generate Region Proposals for each image, extracting a certain number of proposed target regions. Then, a CNN is applied to extract features from each region sample to determine all possible Region Proposals in the image.
The procedure of the network model is shown in Fig. 1.
Fig. 1. R-CNN: Regions with CNN features [9].
T.-Y. Su and F.-Y. Leu
2.3 Fast RCNN
The core concept of Fast RCNN [10] is to reduce the computational complexity of RCNN. Its structure, shown in Fig. 2, improves on RCNN mainly in the following three aspects:
(1) A single image is input into the convolutional neural network, and a feature map is obtained. Then, Selective Search extracts several candidate Region Proposals from the entire feature map. Fast RCNN shares the feature map with Selective Search, thus significantly reducing the computational complexity compared with RCNN, which runs feature extraction 2000 times.
(2) The extracted candidate Region Proposals have different sizes and shapes. Simply scaling them to a uniform size would lose information. To solve this problem, Fast RCNN uses ROI (Region of Interest) Pooling, which evenly divides a region into 7 × 7 small blocks and applies Max Pooling to each block, transforming the region into a uniform size.
(3) The SVM and Bounding-Box Regression in RCNN are replaced by SoftMax + Bounding-Box Regression, so that the classification and regression branches are trained jointly.
However, Fast RCNN still employs the Selective Search method to extract candidate Region Proposals, which results in a long extraction time.
Fig. 2. Fast RCNN structure diagram [9].
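The ROI Pooling step in aspect (2) can be illustrated with a minimal sketch. It assumes a single-channel feature map stored as nested Python lists and integer region coordinates; the function name `roi_max_pool` and the bin-boundary rounding are illustrative choices, not the implementation used in Fast RCNN itself.

```python
def roi_max_pool(feature_map, roi, out_size=7):
    """Max-pool one region of a 2-D feature map to out_size x out_size.

    feature_map: list of rows (lists of numbers), one channel for brevity.
    roi: (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = roi
    height, width = bottom - top, right - left
    pooled = []
    for i in range(out_size):
        # Row span of bin i; max(...) guarantees at least one row per bin.
        r0 = top + (i * height) // out_size
        r1 = max(r0 + 1, top + ((i + 1) * height + out_size - 1) // out_size)
        row = []
        for j in range(out_size):
            # Column span of bin j, computed the same way.
            c0 = left + (j * width) // out_size
            c1 = max(c0 + 1, left + ((j + 1) * width + out_size - 1) // out_size)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy 4 x 4 feature map pooled to 2 x 2 (the text uses 7 x 7 on real maps).
toy_map = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
toy_pooled = roi_max_pool(toy_map, (0, 0, 4, 4), out_size=2)
```

Whatever the size of the input region, the output always has `out_size` rows and columns, which is what allows regions of different shapes to feed the same fully connected layers.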
2.4 Faster RCNN
In 2015, Ren et al. proposed Faster RCNN [11], which combines feature extraction, region proposal, and classification in one network, significantly improving the overall performance of the system, in particular its detection speed. Faster RCNN uses a convolutional network, the Region Proposal Network (RPN), to generate proposed boxes and shares this convolutional network with the target detection network, reducing the number of proposed boxes from about 2,000 to about 300 while improving their quality. Compared with Fast RCNN, Faster RCNN replaces the original Selective Search method with the RPN structure. The RPN produces 9 anchors at each point of the feature map and uses SoftMax to determine whether each anchor belongs to the foreground or the background. The anchors' positions are then corrected with bounding-box regression to obtain accurate proposed boxes. Using the RPN to generate proposed boxes directly greatly improves the generation speed.
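The nine anchors per position can be sketched as three scales combined with three aspect ratios. The text only states that there are nine anchors; the specific scales (128, 256, 512) and height/width ratios (0.5, 1, 2) below are assumptions based on commonly used defaults, and `make_anchors` is a hypothetical helper name.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return 3 x 3 = 9 anchor boxes (x1, y1, x2, y2) centred at (cx, cy).

    Each anchor has area scale**2 and height/width equal to `ratio`.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            w = scale / math.sqrt(ratio)   # wider box when ratio < 1
            h = scale * math.sqrt(ratio)   # taller box when ratio > 1
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


# The nine anchors attached to one feature-map position (centre at the origin).
anchors_at_origin = make_anchors(0, 0)
```

In an RPN the same nine shapes are repeated at every feature-map position, and the classifier and regressor then score and adjust each of them.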
3 Analysis and Study of YOLO Algorithm
Because it balances detection accuracy and speed, and in particular because of its excellent computing speed, YOLO has been successfully applied to computer vision in recent years. YOLOv5 improves the network architecture and training techniques of the YOLO series, significantly enhancing the speed and accuracy of object detection. There are four YOLOv5 models: YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x, which differ in network depth and feature-map width. YOLOv5s has the smallest network depth and the narrowest feature maps in the YOLOv5 series, and it runs faster on GPUs than the other three models. The architecture of YOLOv5s, shown in Fig. 3, consists of the Backbone, Neck, and Output parts [12].
Fig. 3. YOLOv5s model structure [12].
3.1 Backbone and Neck Structures
YOLOv5s adopts CSPDarknet53 as its backbone network. CSPDarknet53 is an improved version of Darknet53 in which CSP stands for Cross-Stage-Partial connection, which connects the output of the previous layer with the input of the following layer to improve the efficiency of feature propagation. The neck network of YOLOv5s is PANet (Path Aggregation Network). It is a multi-scale feature fusion method based on the feature pyramid, which adapts better to scale changes and object deformations. The basic idea of PANet is to aggregate and exchange features from different resolutions in the feature pyramid to obtain a richer feature representation.
3.1.1 Focus Structure
The Focus structure in the Backbone was not present in YOLOv3 or YOLOv4. It preserves the original image information in each image slice while reducing computation needs. Slicing is a key operation of its data processing, and the Focus is a crucial
component that divides the input image into multiple sub-regions to extract useful feature information. The number of channels in the Focus output feature map is four times the number of channels in the input image: the structure takes every other pixel along the width and the height to obtain four image slices and stacks them along the channel dimension. For example, the 4 × 4 × 3 image shown in Figs. 4 and 5 is sliced into a 2 × 2 × 12 feature map. The input has 3 channels and the output has 12, which come from the four slices, each keeping the 3 original channels. The feature information is extracted in both the horizontal and vertical directions (width and height). In other words, the channel number of the Focus structure is a key parameter that controls the number of segmented sub-regions and the extraction of feature information. Slicing transforms the RGB three-channel mode into 12 channels.
Fig. 4. Focus diagram [13].
Fig. 5. 4 × 4 × 3 slices in Focus is sampled into 2 × 2 × 12 channel maps [14].
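The slicing of a 4 × 4 × 3 image into a 2 × 2 × 12 feature map can be reproduced with a short sketch. It assumes the image is a nested list indexed as `image[y][x][channel]`; the function name `focus_slice` and the order in which the four slices are stacked are illustrative assumptions (the order follows the common YOLOv5 implementation).

```python
def focus_slice(image):
    """Rearrange an H x W x C image (nested lists) into H/2 x W/2 x 4C."""
    height, width = len(image), len(image[0])
    # The four pixel positions inside every 2 x 2 block, as (dy, dx).
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            pixel = []
            for dy, dx in offsets:
                # Append the C channels of one neighbouring pixel.
                pixel.extend(image[y + dy][x + dx])
            row.append(pixel)
        out.append(row)
    return out


# A 4 x 4 x 3 toy image whose values encode (row, column, channel).
toy_image = [[[100 * c + 10 * y + x for c in range(3)]
              for x in range(4)] for y in range(4)]
toy_sliced = focus_slice(toy_image)
```

No pixel value is discarded: the spatial resolution is halved in each direction while the channel count grows by a factor of four, so the following convolution sees the full image content at a smaller spatial size.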
The design of the Focus aims to enable convolutional layers to extract features more effectively, especially in low-resolution situations. It slices the original image into several small sub-regions and performs convolutional operations within each sub-region to extract more features. Compared to directly convolving the entire image, the Focus increases computational complexity, but also reduces information loss caused by image scaling and slicing operations. Therefore, it can be said that the Focus extracts more comprehensive features while reducing the information loss caused by image processing.
3.1.2 CBL Module
The CBL (Convolution with BN and Leaky ReLU) module is present in both the Backbone and Neck of YOLOv3 and YOLOv4, as shown in Fig. 6. It consists of three parts: a convolution layer (conv), batch normalization (BN) [15], and a Leaky ReLU [16] activation function. The convolution layer in the CBL module uses two different kernel sizes: 1 × 1 and 3 × 3. The former reduces or increases the number of channels in the
feature map, changing the representation ability of features and thereby improving the performance of the model. The latter 3 × 3 convolution is used for feature extraction, and for downsampling of the features without losing information, thereby reducing the size of the feature map and speeding up computation.
Fig. 6. Conv-BN-Leaky ReLU (CBL) module structure [17].
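The three parts of the CBL module can be chained in a minimal single-channel sketch. This is a simplification under stated assumptions: real batch normalization works per channel over a batch with learned scale and shift, whereas `batch_norm` below normalizes one feature map with scale 1 and shift 0, and the Leaky ReLU slope of 0.1 is an assumed value. All function names are illustrative.

```python
import math


def conv2d_valid(x, kernel):
    """2-D convolution (cross-correlation), stride 1, no padding."""
    k = len(kernel)
    out_h, out_w = len(x) - k + 1, len(x[0]) - k + 1
    return [[sum(x[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out_w)] for i in range(out_h)]


def batch_norm(x, eps=1e-5):
    """Normalize one feature map to zero mean and unit variance."""
    values = [v for row in x for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [[(v - mean) / math.sqrt(var + eps) for v in row] for row in x]


def leaky_relu(x, slope=0.1):
    """Keep positive values, scale negative values by `slope`."""
    return [[v if v > 0 else slope * v for v in row] for row in x]


def cbl(x, kernel):
    """Convolution, then batch normalization, then Leaky ReLU."""
    return leaky_relu(batch_norm(conv2d_valid(x, kernel)))


# A 1 x 1 kernel leaves the spatial size unchanged; a 3 x 3 kernel
# without padding shrinks a 4 x 4 map to 2 x 2.
toy_out = cbl([[1.0, 3.0]], [[1.0]])
```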
3.1.3 CSP Both the Backbone and Neck networks employ CSP (Cross Stage Partial) structure in YOLOv4. YOLOv5 followed the idea of CSPNet and designed two types of CSP, CSP1_X and CSP2_X, as shown in Fig. 7. For example, in the YOLOv5s network, the CSP1_X is located in the Backbone network, while the CSP2_X is placed in the Neck network.
Fig. 7. YOLOv5 CSP structure [14].
Basically, CSP can enhance the learning ability of a CNN while keeping the network lightweight. In other words, CSP can maintain object detection accuracy while reducing computational and memory costs.
3.2 Analysis of YOLOv5 Algorithm
The weapon detection system developed in this study can be employed in embedded systems for practical application scenarios. Target detection is implemented on embedded processors, and the memory and storage space of embedded systems are limited. Therefore, while ensuring detection accuracy, the detected data should be compressed as much as possible. In this study, we replace YOLOv5's Backbone with ShuffleNetv2 and introduce the CARAFE algorithm [18] as a lightweight, general-purpose upsampling operator. We also replace the SPPF module with simSPPF.
4 YOLOv5 Improvements and Experiments This section will present our improvement methods and experimental results.
4.1 Experimental Dataset (Weapon Dataset) The Sohas dataset [19] was used in this experiment. It contains six different classes, including handguns, knives, banknotes, wallets, cellphones, and credit cards as shown in Fig. 8. The classification images were obtained from detection images where the object bounding box was cropped. The dataset includes common objects, such as weapons, and the purpose of training is to enable the system to detect weapons held by suspects in criminal activities.
Fig. 8. Classification of the Sohas dataset.
During the training, the experimental results of TP, FP, Precision, and Recall are listed in Table 1. Taking the dangerous object "knife" as an example, its precision is 0.37 and its recall is 0.253 over the 335 detection images. The precision of "handgun" is 0.459, and its recall is 0.308. The overall precision is 0.395, and the overall recall is 0.374.

Table 1. Experimental results of training set for YOLOv5s lightweight network.

| Class | Images | Labels | Precision | Recall | TP | FP | mAP@0.5 | mAP@0.5:.95 |
|---|---|---|---|---|---|---|---|---|
| all | 335 | 388 | 0.395 | 0.374 | 20.8 | 30.6 | 0.324 | 0.124 |
| Knife | 335 | 99 | 0.37 | 0.253 | 25 | 42.6 | 0.198 | 0.0876 |
| Handgun | 335 | 182 | 0.459 | 0.308 | 56 | 66.1 | 0.303 | 0.145 |
| billete | 335 | 32 | 0.517 | 0.312 | 10 | 9.34 | 0.294 | 0.0945 |
| monedero | 335 | 25 | 0.296 | 0.4 | 10 | 23.8 | 0.306 | 0.0944 |
| smartphone | 335 | 29 | 0.387 | 0.448 | 13 | 20.6 | 0.405 | 0.139 |
| tarjeta | 335 | 21 | 0.343 | 0.524 | 11 | 21 | 0.437 | 0.185 |
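The per-class precision and recall in Table 1 can be related to the TP, FP, and Labels columns with a short worked example. It assumes that the Labels column counts the ground-truth instances of a class, so that recall is TP divided by Labels; `precision_recall` is an illustrative helper name.

```python
def precision_recall(tp, fp, labels):
    """Precision = TP / (TP + FP); recall = TP / number of ground-truth labels."""
    precision = tp / (tp + fp)
    recall = tp / labels
    return precision, recall


# (TP, FP, Labels) taken from the Knife and Handgun rows of Table 1.
knife_p, knife_r = precision_recall(25, 42.6, 99)
handgun_p, handgun_r = precision_recall(56, 66.1, 182)

print(round(knife_p, 3), round(knife_r, 3))      # 0.37 0.253
print(round(handgun_p, 3), round(handgun_r, 3))  # 0.459 0.308
```

Under this assumption the computed values agree with the Precision and Recall columns reported for these two classes.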
4.2 Experimental Results
To evaluate the effectiveness of the proposed lightweight networks for the weapon detection system, three lightweight networks were compared with YOLOv5s on the Sohas dataset. The experimental results are shown in Table 2.
Table 2. Performance evaluation on different subsystems of YOLOv5 Lightweight.

| Networks | Parameters | GFLOPs | mAP@0.5 | mAP@0.5:.95 |
|---|---|---|---|---|
| YOLOv5s | 7026307 | 15.8 | 0.444 | 0.199 |
| (A) CARAFE | 6760363 | 15.8 | 0.431 | 0.195 |
| (B) simSPPF | 6927025 | 16.3 | 0.433 | 0.193 |
| (C) Shufflenetv2 | 6242997 | 8 | 0.424 | 0.176 |
| Combined(A,B,C) | 5877771 | 11 | 0.42 | 0.17 |
Our system implementation shows the recognition of a dangerous item, a handgun, in Table 2, and the recognition of another dangerous item, a cutting tool, i.e., knives, is listed in Table 3.

Table 3. Summary of improvement in YOLO lightweight networks.

| Method | No. of Parameters Decrease (%) | No. of FLOPS Increase/Decrease (%) | mAP@0.5 Decrease |
|---|---|---|---|
| YOLOv5s (Baseline) | 0 | 0 | 0 |
| (A) CARAFE upsample | 3.7 | 0 | 0.013 |
| (B) simSPPF | 1.4 | 0 | 0.011 |
| (C) Shufflenetv2 | 11.15 | 49.37 | 0.02 |
| Combined(A,B,C) | 16.35 | 30.38 | 0.024 |
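The percentages in Table 3 can be derived from the raw figures in Table 2 as relative changes with respect to the YOLOv5s baseline. The sketch below is one way to carry out that arithmetic; the dictionary layout and the name `relative_change` are illustrative.

```python
# (parameters, GFLOPs, mAP@0.5) copied from Table 2.
TABLE2 = {
    "YOLOv5s": (7026307, 15.8, 0.444),
    "(A) CARAFE": (6760363, 15.8, 0.431),
    "(B) simSPPF": (6927025, 16.3, 0.433),
    "(C) Shufflenetv2": (6242997, 8.0, 0.424),
    "Combined(A,B,C)": (5877771, 11.0, 0.42),
}


def relative_change(name, baseline="YOLOv5s"):
    """Return the decrease of each quantity relative to the baseline."""
    params, gflops, map50 = TABLE2[name]
    base_params, base_gflops, base_map50 = TABLE2[baseline]
    return {
        "params_decrease_pct": 100 * (base_params - params) / base_params,
        "gflops_decrease_pct": 100 * (base_gflops - gflops) / base_gflops,
        "map50_decrease": base_map50 - map50,
    }


combined = relative_change("Combined(A,B,C)")
print({key: round(value, 3) for key, value in combined.items()})
```

For the combined network this gives a parameter reduction of about 16.35%, a GFLOPs reduction of about 30.38%, and an mAP@0.5 drop of 0.024, matching the last row of Table 3.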
5 Conclusions
In this study, a weapon detection system for surveillance images was developed using deep learning models. Considering the need for practical deployment, the existing YOLOv5 model was improved through lightweight optimization. Before this, YOLOv5s had major issues such as a large number of model parameters, high computational complexity, high computing-resource requirements, and long runtime, all of which limited the model's feasibility in practical applications. In addition, the YOLOv5s backbone adopted CSPDarknet, which performed well in feature extraction but still had some deficiencies in terms of lightweight design. Therefore, the backbone was replaced by Shufflenetv2, the CARAFE upsample module replaced the PANet upsample module in the neck, and the SPPF module was replaced by simSPPF, giving three lightweight options. These changes successfully reduced the parameter size of the YOLOv5s model by 16.35% (see Table 3), increased the
FLOPS computational efficiency by 30.38%, and decreased the mAP@0.5 by 0.024, providing strong support for the practical feasibility of the weapon detection system for surveillance images. However, there is still room for further optimization to reduce algorithm complexity and improve detection efficiency, so that the system can actually be deployed on embedded devices. Developing a new lightweight YOLO network that meets the needs of surveillance image weapon detection systems in embedded application scenarios will be our future goal.
References
1. Nath, S.S., Mishra, G., Kar, J., Chakraborty, S., Dey, N.: A survey of image classification methods and techniques. In: 2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT), pp. 554–557. IEEE (2014)
2. Ruiz-del-Solar, J., Loncomilla, P., Soto, N.: A survey on deep learning methods for robot vision. arXiv preprint arXiv:1803.10862 (2018)
3. Deng, L., Liu, Y. (eds.): Deep Learning in Natural Language Processing. Springer, Singapore (2018). https://doi.org/10.1007/978-981-10-5209-5
4. Xiao, Y., et al.: A review of object detection based on deep learning. Multimed. Tools Appl. 79(33–34), 23729–23791 (2020). https://doi.org/10.1007/s11042-020-08976-6
5. Lee, S., Kim, N., Paek, I., Hayes, M.H., Paik, J.: Moving object detection using unstable camera for consumer surveillance systems. In: IEEE International Conference on Consumer Electronics (ICCE), Jan., pp. 145–146 (2013)
6. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
7. Ouyang, W., Wang, X.: Joint deep learning for pedestrian detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2056–2063 (2013)
8. Han, F., Shan, Y., Cekander, R., Sawhney, H.S., Kumar, R.: A two-stage approach to people and vehicle detection with HOG-based SVM. In: Performance Metrics for Intelligent Systems 2006 Workshop, Aug., pp. 133–140 (2006)
9. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
10. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
11. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, vol. 28 (2015)
12. https://www.mdpi.com/2072-4292/13/18/3776?fbclid=IwAR18DnV-B5seCoPbki5tYabfFVn81JuR770ERYj6iV52e_PJCzZGiRd_hGc
13. https://www.hindawi.com/journals/mpe/2022/7078670/?fbclid=IwAR3gcZEqkS4CvmJgHOcUH-Gnc75I61ZhFNjjkFpVLNFO3z1jaROYLV0xBb0
14. https://www.mdpi.com/2077-1312/10/3/310?fbclid=IwAR0_dK42TBXRqKH8_FEAEkgahyiFYT8L1K9wyL7ei1wHibQxSrnfuuQQBvE
15. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, June, pp. 448–456. PMLR (2015)
16. Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of International Conference on Machine Learning, vol. 30, no. 1, p. 3, Atlanta, Georgia, June (2013)
17. https://www.researchgate.net/figure/Architectures-of-YOLO-v4-and-YOLO-v5s_fig4_351625277
18. Wang, Y., Zhang, R., Li, Z.: An improved convolutional neural network for object detection. IEEE Access 9, 95112–95123 (2021)
19. Angelova, A., Krizhevsky, A., Vanhoucke, V., Ogale, A., Ferguson, D.: Real-time pedestrian detection with deep network cascades (2015)
DR.QG: Enhancing Closed-Domain Question Answering via Retrieving Documents for Question Generation
Zhi-Wei Tong1, Yao-Chung Fan1, and Fang-Yie Leu2(B)
1 National Chung Hsing University, Taichung, Taiwan
2 TungHai University, Taichung, Taiwan
[email protected]
Abstract. In this paper, we propose a new framework called Document-Retrieval Question Generation (DR.QG). The goal of DR.QG is to generate a question corresponding to a given answer. Unlike the common question generation setting, which takes a context passage and an answer as input, DR.QG takes only an answer as input. To achieve this goal, we explore the possibility of incorporating document retrieval. Through performance evaluation on the Question Answering (QA) task, we demonstrate the feasibility of DR.QG. The results show that our method improves QA performance by up to 13%. Furthermore, we simulate the closed-domain situation on an open-domain dataset and show that our method improves performance by 3%.
1 Introduction
Along with the rapid development of AI in recent years, closed-domain question answering (CDQA) applications, that is, question answering whose answers are specific to a particular domain, have attracted more and more attention from enterprises. For example, a customer service/FAQ chatbot can reduce personnel costs and customer waiting time for enterprises [1]. Training an FAQ chatbot requires question and answer pairs. During online inference, the bot maps an input query to possible answers, which it returns as output. However, the lack of training data for CDQA is always an issue. For example, in the e-commerce domain, a company can collect customers' questions and customer service answers as training data for CDQA, but collecting enough question-answer pairs can take a lot of time. This makes data augmentation for CDQA necessary. In this study, we investigate CDQA data augmentation by question generation (QG). The QG task is motivated by exam-like question generation and by data augmentation for the question answering (QA) task. The general QG setting is to take a context and an answer as input and automatically generate a question concerning the context and answer. QG is a common data augmentation technique for improving QA performance. However, traditional QG methods cannot be directly employed here, because traditional QG requires both answers and their corresponding contexts to generate a question.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 294–305, 2023. https://doi.org/10.1007/978-3-031-35836-4_31
Table 1. Example of Document-Retrieval Question Generation based on the retriever-generator framework.

| Answer (Input) | Retrieved Document | Question (Output) |
|---|---|---|
| Morris Cheng | Title: TSMC. Passage: TSMC founded in Taiwan in 1987 by Morris Chang, TSMC was the world’s first dedicated semiconductor foundry and has long been the leading company in its field. When Chang retired in 2018, after 31 years of TSMC leadership, Mark Liu became Chairman and C. C. Wei became Chief Executive... | Who is the TSMC founder? |
However, in the CDQA scenario, what we assume to have is questions and their corresponding answers, as question-answer pairs for training CDQA. Because contexts are lacking, we cannot directly apply traditional question generation to generate question-answer pairs for CDQA training data augmentation. To solve this problem, we propose a new framework called Document-Retrieval Question Generation (DR.QG). The goal of DR.QG is to generate a question corresponding to the answer phrase by means of context retrieval. Unlike the traditional question generation setting, in which existing QG literature takes a context passage and an answer as input, DR.QG takes only answers (answer-only) as input. A straightforward idea is to train a QG model that takes only an answer as input and generates a question as output. However, such a setting is challenging to train, as the input information is very limited (just an answer). For this reason, we introduce the concept of retrieval. To illustrate our setting, we show an example in Table 1. In this example, we take the answer Morris Cheng as the input to our framework and use it to retrieve from Wikipedia. Through retrieval, we obtain a document, TSMC, related to the answer Morris Cheng. We can then use the answer Morris Cheng and the document TSMC as the input to the existing QG setting for generating questions. With this retrieval idea, the critical part is the effectiveness of the retriever. To retrieve documents effectively, we explore the use of a state-of-the-art retriever, Dense Passage Retrieval (DPR) [5], in our DR.QG framework. However, DPR cannot be directly employed in our work either. The main application of DPR is to find a document related to a given question; it was designed and used for Open Domain Question Answering (ODQA). Our application scenario, in contrast, is to find a document related to a given answer.
To improve the performance of the retriever, we propose a new retriever called Answer-Passage Retrieval (APR). The architecture of APR is the same as that of DPR, but the goal of pre-training is changed from using a question to find the most relevant document to using an answer to find the most relevant document. With the help of the APR retriever, the QG model can be employed to generate questions based on answers and retrieved documents. We further improve the quality of question generation by using a state-of-the-art (SOTA) QA model as a filter on the generated questions.
Fig. 1. Overview of the entire DR.QG framework.
The contributions of this paper are summarized as follows: (1) We propose an answer-only question generation framework based on document retrieval. (2) We extend the DPR framework into APR, which outperforms the original DPR [5] when answers are used to retrieve documents. (3) Through performance evaluation, we demonstrate that our framework improves closed-domain QA performance by 3%.
2 Methodology
2.1 DR.QG
We propose a question generation methodology based on the RAG architecture, called DR.QG. An overview of DR.QG is shown in Fig. 1. DR.QG has three components: a retriever, a question generator, and a question filter. The retriever retrieves documents related to the input and sorts them by relevance. The question generator generates questions from the input and the documents returned by the retriever. The question filter filters the generated questions and keeps those that match the answer. Finally, the filtered questions are added to the training data for data augmentation. In the following, we discuss these three components individually.
Retriever. We initially tried to accomplish the question generation task with RAG in an end-to-end setting, i.e., training the retriever and the question generator at the same time. After training, however, we found that more than half of the generated results were not complete questions. We attribute this to RAG's retriever (i.e., DPR): the main application of DPR is to find a document related to a given question, whereas our application scenario is to find a document related to a given answer.
To solve the above problem, we change the original end-to-end process and train the retriever and the generator separately, so that the retriever can be retrained on its own. Specifically, instead of jointly improving the probability $p_R(d_1 \mid a)$ of retrieving the most relevant document among all documents $D$ and the probability $p_G(q \mid a, d_1)$ of generating the correct question, we improve the two separately. The final objective is to maximize the probability of generating question $q$ given the answer $a$:

$$p(q_g \mid a) = p_R(d_1 \mid a)\, p_G(q \mid a, d_1), \quad d_1 \in \text{Top-}1\big(p(D \mid a)\big)$$

This decoupling allows us to replace the original DPR and train a new retriever from scratch. We therefore propose a new retriever called Answer-Passage Retrieval (APR), whose goal is to retrieve the most relevant documents for a given answer. APR is structured like the original DPR: we pre-train the query encoder $E_Q$ and the document encoder $E_D$ via dot-product similarity. As shown in Fig. 2, the only difference is that the original DPR uses a question as the query, whereas we use an answer $a$ as the query to calculate the similarity with a document $d$:

$$\mathrm{sim}(a, d) = E_Q(a)^{\top} E_D(d) \tag{1}$$
It is worth mentioning that the original DPR uses negative samples for training. These negative samples come from the positive samples of other queries and from documents retrieved by BM25 that do not contain the answer. Given one answer $a$, one positive sample $d^+$, and $n$ negative samples $d_i^-$, our negative log-likelihood loss function is:

$$L\big(a, d^+, d_1^-, \dots, d_n^-\big) = -\log \frac{e^{\mathrm{sim}(a, d^+)}}{e^{\mathrm{sim}(a, d^+)} + \sum_{i=1}^{n} e^{\mathrm{sim}(a, d_i^-)}} \tag{2}$$
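Equations (1) and (2) can be evaluated on small vectors to see how the loss behaves. The sketch below is a minimal illustration, assuming that the encoder outputs are already available as plain lists of floats; `sim` and `apr_loss` are hypothetical names, and no encoder is trained here.

```python
import math


def sim(a_vec, d_vec):
    """Dot-product similarity between an answer and a document embedding (Eq. 1)."""
    return sum(x * y for x, y in zip(a_vec, d_vec))


def apr_loss(a_vec, pos_vec, neg_vecs):
    """Negative log-likelihood of the positive document (Eq. 2).

    Uses the log-sum-exp trick so that large similarities do not overflow.
    """
    pos = sim(a_vec, pos_vec)
    scores = [pos] + [sim(a_vec, neg) for neg in neg_vecs]
    top = max(scores)
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_denominator - pos


# Toy embeddings: the positive document is aligned with the answer,
# the negative document is orthogonal to it.
toy_loss = apr_loss([1.0, 0.0], [2.0, 0.0], [[0.0, 1.0]])
```

The loss shrinks as the positive document scores higher than the negatives, which is what pushes the answer encoder and the document encoder toward each other for matching pairs.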
Question Generator. For the question generator, we train a QG model based on the BART architecture [7]. Its seq2seq architecture, with a bidirectional encoder and a left-to-right decoder, and its well-designed pre-training tasks make it well suited to text generation. During training, we use the NQ-DPR (Conv.) dataset, feed its answers and corresponding documents into the generator, and train it to generate the corresponding questions. During inference, we feed DR.QG's input (i.e., an answer) and the document returned by the retriever into the question generator to generate questions.
Question Filter. To ensure the quality of the generated question-answer pairs, we use a state-of-the-art ODQA model called Fusion-in-Decoder [4] to filter the data. This model achieves an Exact Match of 50 on the NQ dataset for ODQA. Specifically, we ask the model to answer all the generated questions, keep the questions whose answers are the same as the gold answers,
Fig. 2. Comparison of DPR and APR. The main training goal of DPR is to find relevant documents through questions; the main training goal of APR is to find relevant documents through answers.
and discard the others. Algorithm 1 shows the process of our question filtering. For example, given a gold answer ag, the retriever and the question generator generate a question qg. We take qg as input to the ODQA model [4], and the model returns an answer af. If af is the same as ag, we keep the question as an augmented data instance. Otherwise, we discard the question.
Algorithm 1: Question filtering (DR.QG)
Input: Generated question qg, Gold answer ag
Output: Filtered question qf
  qf ← ∅
  // Filter's answer
  af ← ODQA(qg)
  // Filtering
  if af = ag then
    qf ← qg
  return qf
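Algorithm 1 translates almost line by line into code. In the sketch below the ODQA model is represented by an arbitrary callable that maps a question to an answer string; `filter_question` and the toy lookup table are hypothetical stand-ins for illustration, not the Fusion-in-Decoder interface.

```python
from typing import Callable, Optional


def filter_question(question: str, gold_answer: str,
                    qa_model: Callable[[str], str]) -> Optional[str]:
    """Return the question if the QA model reproduces the gold answer, else None."""
    predicted = qa_model(question)   # filter's answer, af in Algorithm 1
    if predicted == gold_answer:     # keep only exact agreement with ag
        return question
    return None


# Toy stand-in for an ODQA model: a fixed lookup table.
toy_answers = {"Who is the TSMC founder?": "Morris Chang"}


def toy_qa(question: str) -> str:
    return toy_answers.get(question, "")


kept = filter_question("Who is the TSMC founder?", "Morris Chang", toy_qa)
dropped = filter_question("When was TSMC founded?", "Morris Chang", toy_qa)
```

Strict string equality mirrors the comparison in Algorithm 1; a practical system might normalize case and punctuation before comparing, which would be an additional design choice.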
2.2 DR.MQG
To further improve QA performance, we try to generate more questions. We propose an extension of DR.QG called Document-Retrieval Multiple-Question Generation (DR.MQG). In DR.QG, we generate one question from an answer and the Top-1 document related to the answer, whereas DR.MQG generates multiple questions from an answer and the Top-M documents related to the answer. As shown in Fig. 3, when M = 3, it means that
we use the top-3 retrieved documents and an answer to generate a different question from each document. Specifically, the probability of generating all questions $Q_g$ is the product of the probabilities of generating Question 1 to Question $M$, and the probability of generating each question is the probability $p_R(d_i \mid a)$ of retrieving the Top-$i$ document among all documents $D$ multiplied by the probability $p_G(q_i \mid a, d_i)$ of generating the correct question:

$$p(Q_g \mid a) = \prod_{i=1}^{M} p_R(d_i \mid a)\, p_G(q_i \mid a, d_i), \quad d_i \in \text{Top-}i\big(p(D \mid a)\big)$$

The advantage of this approach is that it generates more questions. However, generating multiple questions also has a side effect on question quality: poor questions may be generated, and using them for data augmentation may degrade the performance of our framework. Therefore, the question filter is important in DR.MQG because it allows us to filter out low-quality questions. Because DR.MQG generates more questions, the operation logic of the question filter is slightly different from that of DR.QG. Algorithm 2 presents the entire flow. For example, given a gold answer $a_g$, the retriever and the question generator generate multiple questions $Q_g$. We feed those questions into the ODQA model one by one, and the model outputs the answers $a_1, a_2, \dots, a_M$ one by one.
Algorithm 2: Question filtering (DR.MQG)
Input: Generated questions Qg = q1, ..., qM, Gold answer ag
Output: Filtered questions Qf
  Qf ← ∅
  for qi ← q1, ..., qM do
    // Filter's answer
    ai ← ODQA(qi)
    // Filtering
    if ai = ag then
      add qi into Qf
    end
  end
  return Qf
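The whole DR.MQG flow (retrieve the Top-M documents, generate one question per document, then filter as in Algorithm 2) can be sketched with toy components. Everything below is a hedged illustration: `dr_mqg` is a hypothetical name, and the scoring function, generator, and QA model are simple stand-ins for the APR retriever, the BART generator, and the ODQA filter.

```python
def dr_mqg(answer, documents, score, generate, qa_model, top_m=3):
    """Generate one question per Top-M document and keep those the QA model confirms."""
    # Retriever: rank all documents by relevance to the answer.
    ranked = sorted(documents, key=lambda doc: score(answer, doc), reverse=True)
    # Question generator: one question per retrieved document.
    questions = [generate(answer, doc) for doc in ranked[:top_m]]
    # Question filter (Algorithm 2): keep questions answered with the gold answer.
    return [q for q in questions if qa_model(q) == answer]


# Toy stand-ins, for illustration only.
toy_docs = [
    "TSMC was founded by Morris Chang in 1987.",
    "Morris Chang retired in 2018.",
    "Taiwan is an island.",
]


def toy_score(answer, doc):
    return doc.count(answer)


def toy_generate(answer, doc):
    if "founded" in doc:
        return "Who founded TSMC?"
    if "retired" in doc:
        return "Who retired from TSMC in 2018?"
    return "What is Taiwan?"


toy_qa_table = {
    "Who founded TSMC?": "Morris Chang",
    "Who retired from TSMC in 2018?": "Morris Chang",
    "What is Taiwan?": "an island",
}

toy_questions = dr_mqg("Morris Chang", toy_docs, toy_score, toy_generate,
                       lambda q: toy_qa_table.get(q, ""), top_m=3)
```

In this toy run the question generated from the unrelated third document is answered with something other than the gold answer and is therefore removed, which is the role the filter plays when M grows.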
Subsequent experiments explore two questions. The first is whether using DR.MQG without the filter makes performance better or worse. The second is whether the performance in the first case can be improved by adding the filter.
3 Experiments
In this section, our experiments are divided into four parts. First, Subsect. 3.1 describes the automatic metrics used for evaluating the retrievers, the question generator, and DR.QG.
Fig. 3. The flow of the entire DR.MQG method.
Second, the datasets employed for performance evaluation are described in Subsect. 3.2. Third, the implementation details are presented in Subsect. 3.3. Finally, Subsect. 3.4 presents the evaluation results.
3.1 Automatic Metrics
For retrievers, we compare the title of the retrieved documents with the title of the gold document. The metrics used are Top-k accuracy. For example, in top-5 accuracy, we get one point if the right title appears in our top-5 retrieved documents. We then add up all the points, divide by the number of gold documents, and end up with a top-5 overall score. For question generator, a commonly used indicator for text generation is the overlap indicator, which mainly evaluates the degree of overlap between the generated sentence and the reference sentence. We employ BLEU [10], METEOR [2], and ROUGE-L [9] to measure the number of overlapping n-grams and the length of the longest common subsequence. These metrics are all most commonly used to evaluate text generation. Evaluating DR.QG is a challenge because traditional QG evaluation methods do not apply to DR.QG. In the past, question generation usually used the token score to calculate the similarity of the generated question to the ground truth. The higher the similarity, the better the performance of the model. But in DR.QG, this method cannot work because different questions from ground truth can still improve the performance of QA. That is, even if it differs from the ground truth, it doesn’t mean a wrong question is generated. Therefore, to evaluate whether the questions we generate can improve the model’s performance, we directly employ QA as our DR.QG evaluation task. For DR.QG evaluation, our focus is mainly on whether it can improve the performance of QA. The most commonly used metrics for QA are Exact Match and F1 score. The concept of the Exact Match is very simple, if our model is fed with a question and the output is exactly the same as the gold answer, it gets one point, otherwise nothing. The final total is the total points divided by the number of questions. F1 score is a common metric for classification tasks and is also commonly used in QA. 
This metric is calculated from the individual words in the prediction versus those in the gold answer.
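The metrics in this subsection can be illustrated with a minimal Python sketch. It is not the authors' evaluation script: the function names, the lowercasing, and the whitespace tokenization are assumptions made for illustration, and F1 is taken as the usual harmonic mean of word-level precision and recall.

```python
from collections import Counter


def top_k_accuracy(retrieved_titles, gold_titles, k=5):
    """Fraction of queries whose gold title is among the top-k retrieved titles."""
    hits = sum(
        1
        for titles, gold in zip(retrieved_titles, gold_titles)
        if gold in titles[:k]
    )
    return hits / len(gold_titles)


def exact_match(prediction, gold):
    """1.0 if prediction and gold answer are identical, else 0.0.

    Assumption: case and surrounding whitespace are ignored.
    """
    return float(prediction.strip().lower() == gold.strip().lower())


def token_f1(prediction, gold):
    """Word-level F1: harmonic mean of precision and recall over shared words."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # Multiset intersection counts each shared word at most as often as it
    # occurs in both the prediction and the gold answer.
    shared = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if shared == 0:
        return 0.0
    precision = shared / len(pred_tokens)
    recall = shared / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For instance, a prediction of "the cat sat" against a gold answer of "the cat" shares two words, so precision is 2/3, recall is 1, and F1 is 0.8.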
DR.QG: Enhancing Closed-Domain Question Answering via Retrieving Documents
The number of shared words between the prediction and the gold answer is the basis of the F1 score, which combines precision and recall. Precision is the ratio of the number of shared words to the total number of words in the prediction, and recall is the ratio of the number of shared words to the total number of words in the gold answer. For both Exact Match and F1 score, a higher score means that the prediction is more similar to the gold answer.

3.2 Experiment Datasets

Table 2 presents the number of instances and the split setting of the compared datasets. The first dataset is NQ-DPR (Conv.), which we use to pre-train our APR retriever. The second is NQ-Open [6], a subset of NQ that contains only questions whose answer length ranges from 1 to 5 (i.e., short-answer questions). We use this dataset to evaluate whether our data augmentation can improve QA performance. Two points should be noted. First, to simulate a lack of data, we randomly take 10% of the train set as the non-augmented setting. Second, to test on more data, we swap the original NQ-Open dev set and test set.

Since we do not have a dataset for closed-domain QA, we filter NQ-Open down to a dataset with only specific answers (topics) and call it NQ-Close, hoping to simulate closed-domain QA as closely as possible. For the train set, we keep only the question-answer pairs whose answers appear in the dev set or test set and discard the rest. The dev set and test set are kept as they are. For example, if the test set contains a question whose answer is Morris Cheng, we search the train set for questions whose answer is also Morris Cheng and keep those question-answer pairs. If the answer of a train-set question is not found in the dev set or test set, that question-answer pair is dropped.

Table 2. Experiment datasets.

Dataset          Train   Dev    Test
NQ-DPR (Conv.)   71741   8001   -
NQ-Open          7917    3610   8757
NQ-Close         -       946    2519
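The train-set filtering described for NQ-Close can be sketched as follows. This is an illustrative reading of the text, not the authors' preprocessing code: the function name is hypothetical, and each example is assumed to be a (question, answer) pair with a single answer string compared by exact match.

```python
def build_nq_close_train(train_pairs, dev_pairs, test_pairs):
    """Keep the train (question, answer) pairs whose answer also occurs
    as an answer in the dev set or the test set; drop all other pairs."""
    eval_answers = {answer for _, answer in dev_pairs}
    eval_answers |= {answer for _, answer in test_pairs}
    return [(q, a) for q, a in train_pairs if a in eval_answers]
```

With this reading, a train pair whose answer is Morris Cheng survives only if some dev or test question has the same answer.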
3.3 Implementation Details

Our models are mainly based on Huggingface Transformers [11], an open-source library on GitHub (footnote 1), except for the Fusion-in-Decoder, for which we use the code released by PAQ [8]. We use the RAG and BART source code from [11] for
1 https://github.com/huggingface/transformers
Z.-W. Tong et al.
model implementation. The other component is the DPR model [5]; we use the DPR source code to train it from scratch. This source code is also available on GitHub (footnote 2).

For APR, we train on 8 NVIDIA® Tesla V100 GPUs with 256 GB RAM. The dataset used for pre-training is NQ-DPR. Our training strategy follows the original DPR, with an initial learning rate of 2e-5, 40 epochs, and a batch size of 16. It took 18 h to train APR and 36 h to build the document index.

For the DR.QG question generator, we train on 2 NVIDIA® Tesla V100 GPUs with 64 GB RAM. The dataset used for fine-tuning is NQ-DPR. We use an initial learning rate of 3e-5, 10 epochs, and a batch size of 64. It took 6 h to train BART.

For our evaluation tasks, closed-book question answering and closed-domain question answering, we use the BART base-sized model as our experimental model. We run these experiments on 2 NVIDIA® RTX 3090 GPUs with 48 GB RAM. The datasets used for evaluation are NQ-Open and NQ-Close, respectively. The learning rate is 1e-5 and the batch size is 256.

3.4 Results
Our main experiments are divided into retriever evaluation and data augmentation performance (the impact of different retrievers on overall QA performance). To demonstrate the effectiveness of our method, we include BM25 and the original DPR as alternative retrievers for comparison in both experiments. Whatever the retriever, in the QA evaluation we attach the fine-tuned BART as the generator to produce questions.

Retriever Evaluation. We follow the DPR paper and evaluate performance by comparing the titles of retrieved documents. T@1 is the percentage of cases in which the gold title is the top-1 retrieved document, and so on for T@5 and T@10. As shown in Table 3, our APR performs better than the original DPR, but a gap to BM25 remains. This is mainly because sparse representations have advantages over dense representations for short words or phrases: the embeddings that the encoder generates for short words are not strong, which hurts performance.

Table 3. Top@k on dev set of NQ-DPR (Conv.).

Model   T@1     T@5     T@10
BM25    15.96   24.98   30.62
DPR     9.09    13.22   14.85
APR     9.44    19.07   24.06

2 https://github.com/facebookresearch/DPR
Question Generator Evaluation. We use the standard QG training setting for the question generator: given an answer phrase and a context text, the model is expected to generate a question corresponding to the answer phrase. Table 4 shows our results on NQ-DPR (Conv.). The performance of our QG implementation is comparable to the SOTA score [3] on the SQuAD dataset, which indicates the quality of our QG implementation.

Table 4. Token scores on dev set of NQ-DPR (Conv.).

Model   Bleu-4   METEOR   ROUGE-L
BART    25.67    36.72    59.82
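To make the longest-common-subsequence idea behind ROUGE-L concrete, the following is a minimal sketch rather than the official ROUGE package [9]. The whitespace tokenization, the lowercasing, and the default beta = 1.0 (equal weight on precision and recall) are assumptions made for illustration; published ROUGE-L scores may use a different beta and preprocessing.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """LCS-based F-measure between a generated and a reference sentence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

For example, the candidate "who founded tsmc" and the reference "who founded the company tsmc" share an LCS of three words, giving precision 1.0, recall 0.6, and an F-measure of 0.75 under these assumptions.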
CBQA. Closed-book question answering (CBQA) is a QA task in which the model is given a question and is expected to produce the correct answer without access to external knowledge. This is the same setup as our CDQA; the only difference is that CBQA usually uses an open-domain QA dataset instead of a closed-domain one. To verify whether the questions we generate improve performance on the CBQA task, we conduct experiments using the original (Origin) and augmented training data. Table 5 shows the results. To simulate a lack of data, we randomly sample 10% of the NQ-Open train set as our Origin train set. Every other train set consists of the Origin train set plus augmented data. The results show that, whatever the retriever, the questions generated by DR.QG improve performance. For example, APR achieves an Exact Match of 5.93 on the dev set, which is higher than Origin's 5.71. When more questions are generated (i.e., DR.MQG), APR (M = 5) obtains the highest scores on the test set, which demonstrates that our proposed retriever is effective. In addition to comparing the retrievers, we also compare the results without question filtering. As shown in Table 5, without question filtering the performance gets worse as we generate more questions. Finally, APR retains the most questions after question filtering, which means that APR helps the question generator produce more questions that match the answer than the other retrievers do.

CDQA. Last, to simulate the closed-domain situation for CDQA as closely as possible, we focus on the questions in the dev set and test set that have the same answer as a question in the train set. We found 946 and 2519 such questions in the dev set and test set, respectively. The accuracy on these questions determines how well our method works in the closed-domain situation. Table 6 shows the results. APR achieves the highest accuracy on both the dev set and the test set. In DR.MQG (M = 5), the accuracy of APR is 3% higher than that of Origin. This indicates that our framework is effective in both the open-domain and closed-domain settings. Table 6 also shows that APR helps the question generator produce good questions more than the other retrievers do. From the results, we can
Table 5. CBQA on the dev set of NQ-Open. Train Set includes original data and augmented data. M=3 means that we use the top-3 documents for each answer to generate questions, so that each answer generates 3 questions in the end, and so on. EM denotes Exact Match.

Methods          Train Set   EM (Dev)   F1 (Dev)   EM (Test)   F1 (Test)
Origin           7917        5.71       8.9        6.76        10.35
BM25             9387        5.96       9.27       7.15        10.5
  w/o filtering  15820       5.32       8.94       6.95        10.3
DPR              8896        5.76       9.27       7.11        10.64
  w/o filtering  15834       5.29       8.51       6.84        10.26
APR              10263       5.93       9.29       7.49        10.93
  w/o filtering  15834       5.43       8.54       7.13        10.32
BM25 (M=3)       11901       6.15       9.47       7.23        10.74
  w/o filtering  31608       5.29       8.75       6.77        10.06
DPR (M=3)        10740       6.01       9.36       7.38        10.84
  w/o filtering  31668       5.35       8.76       6.68        10.04
APR (M=3)        14329       6.26       9.51       7.65        11.01
  w/o filtering  31668       5.51       8.26       7.07        10.04
BM25 (M=5)       14002       6.2        9.64       7.42        11.02
  w/o filtering  47381       5.21       8.57       6.86        9.85
DPR (M=5)        12422       6.12       9.53       7.49        11.03
  w/o filtering  47502       4.99       8.21       6.57        9.75
APR (M=5)        17946       6.2        9.31       7.86        11.17
  w/o filtering  47502       5.71       8.72       7.25        10.37
Table 6. CDQA on the dev set of NQ-Close. Train Set includes original data and incremental data. M=3 means that we use the top-3 documents for each answer to generate questions, so that each answer generates 3 questions in the end, and so on. All values are accuracies.

                 DR.QG              DR.MQG (M=3)       DR.MQG (M=5)
Methods          Dev      Test      Dev      Test      Dev      Test
Origin           21.04%   22.95%    –        –         –        –
BM25             21.78%   24.06%    22.73%   24.37%    22.83%   24.97%
  w/o filtering  19.87%   23.54%    19.66%   23.18%    19.45%   23.54%
DPR              20.93%   24.14%    22.09%   25.01%    22.62%   25.17%
  w/o filtering  19.34%   23.03%    20.19%   22.83%    18.92%   22.67%
APR              21.99%   25.45%    23.04%   26.04%    23.04%   26.84%
  w/o filtering  19.98%   24.49%    20.72%   24.53%    21.56%   25.13%
see that, for both DR.QG and DR.MQG, APR without filtering achieves a relatively high accuracy compared to the other retrievers without filtering. For example, in DR.MQG (M = 5), APR w/o filtering reaches 21.56% on the dev set, which is higher than Origin (21.04%), BM25 w/o filtering (19.45%), and DPR w/o filtering (18.92%).
4 Conclusion

We propose Document-Retrieval Question Generation (DR.QG), a new question generation framework for data augmentation in ODQA. To improve its performance, we introduce a retriever-generator framework that replaces the context required by traditional question generation with retrieved documents. Furthermore, we propose the APR retriever for question generation, which shows advantages over the original DPR when using answers to retrieve documents. Our results show that DR.QG can generate promising questions, and in our evaluation it achieved up to 3% improvement in performance in the CBQA task.
References

1. Badugu, S., Manivannan, R.: A study on different closed domain question answering approaches. Int. J. Speech Technol. 23(2), 315–325 (2020). https://doi.org/10.1007/s10772-020-09692-0
2. Banerjee, S., Lavie, A.: METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pp. 65–72 (2005)
3. Chan, Y.H., Fan, Y.C.: A recurrent BERT-based model for question generation. In: Proceedings of the 2nd Workshop on Machine Reading for Question Answering, pp. 154–162 (2019)
4. Izacard, G., Grave, E.: Leveraging passage retrieval with generative models for open domain question answering. arXiv preprint: arXiv:2007.01282 (2020)
5. Karpukhin, V., et al.: Dense passage retrieval for open-domain question answering. arXiv preprint: arXiv:2004.04906 (2020)
6. Lee, K., Chang, M.W., Toutanova, K.: Latent retrieval for weakly supervised open domain question answering. arXiv preprint: arXiv:1906.00300 (2019)
7. Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint: arXiv:1910.13461 (2019)
8. Lewis, P., et al.: PAQ: 65 million probably-asked questions and what you can do with them. Trans. Assoc. Comput. Linguist. 9, 1098–1115 (2021)
9. Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out, pp. 74–81 (2004)
10. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318 (2002)
11. Wolf, T., et al.: HuggingFace's transformers: state-of-the-art natural language processing. arXiv preprint: arXiv:1910.03771 (2019)
QoS-Oriented Uplink OFDMA Random Access Scheme for IEEE 802.11be

Chia-Wen Chang and Fang-Yie Leu(B)

Computer Science Department, Tunghai University, Taichung City, Taiwan
[email protected]
Abstract. Wireless networks have become an essential element of people's lives. With the booming development of real-time applications (RTA), such as multimedia streaming, virtual reality (VR), and online conferences, the transmission rates required by these applications have become increasingly high. To solve the problem of dense and heavy network transmission, IEEE 802.11ax provides an Uplink OFDMA Random Access (UORA) transmission mode. However, when too many STAs are delivering data, transmission delays are often long and network performance is often poor. To meet the low-latency needs of real-time applications, this study proposes an IEEE 802.11be transmission scheme that adjusts the OFDMA Backoff (OBO) calculation for STAs according to the priority levels of their data streams. Data streams with higher priority levels are given precedence in transmission, thereby improving the transmission efficiency of latency-sensitive data streams and enhancing the Quality of Experience (QoE) for users on the receiving end.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 306–315, 2023. https://doi.org/10.1007/978-3-031-35836-4_32

1 Introduction

In recent years, with the rapid development of technology, the demand for real-time applications has increased, including online conferences [1], multimedia streaming [2], online gaming [3], remote education [4], smart homes [5], etc. To meet these demands and provide a low-latency, high-speed, and stable wireless network experience, the IEEE 802.11be Task Group (TGbe) proposed a new wireless network technology, 802.11be (also known as Wi-Fi 7), called Extremely High Throughput (EHT) wireless communication. It expands the transmission channel width to 320 MHz, continues to support the 6 GHz frequency band of 802.11ax, raises the Quadrature Amplitude Modulation order to 4096-QAM (4K-QAM), and supports up to 16 spatial streams. 802.11ax (also known as Wi-Fi 6) [6] is an important wireless network technology standard that introduced Uplink OFDMA Random Access (UORA) technology [7] to solve the problem of dense and heavy network transmission for IoT (Internet of Things) [8] devices, allowing multiple users to access the network and simultaneously transmit data in the uplink direction. However, as the number of users increases, UORA transmission encounters data-transmission collisions, resulting in serious network transmission timeouts. The latest IEEE 802.11be draft shows that UORA technology will still be used [9]. To address this challenge and improve the overall network transmission performance, in this study we propose a new scheme, named Different Access Categories for OCW (DACO), for highly latency-sensitive data streams to enhance the overall QoS (Quality of Service) of the network. DACO further optimizes the resource allocation of wireless networks, reduces the probability of data-transmission collisions, and provides more efficient and reliable transmission services in high-density user environments.

The rest of this article is organized as follows. Section 2 introduces the UORA transmission mechanism and the reasons for its low network performance. Section 3 analyzes and compares existing UORA improvement solutions. Section 4 presents the new scheme for latency-sensitive networks. Section 5 presents and discusses the simulation results. Section 6 concludes the paper and outlines our future research.
2 Uplink OFDMA Random Access (UORA)

IEEE 802.11ax defines two types of multi-user uplink transmission: Random Access (RA) and Scheduled Access (SA). RA allows N STAs to compete for available channels for data transmission, while SA lets the AP directly allocate channels to STAs in a non-competitive manner. The Uplink OFDMA Random Access (UORA) mechanism in the IEEE 802.11ax standard adopts the former approach, which functions similarly to the slotted Aloha wireless transmission protocol [10]: STAs compete for the available channels during a specific time slot, subject to the requirement that all STAs start and end data transmission simultaneously. The IEEE 802.11ax standard also incorporates OFDMA technology [11], which divides a channel into multiple small resource units (RUs) for data transmission.

The UORA mechanism works as follows. At the beginning of a UORA transmission opportunity (TXOP), the AP broadcasts a Trigger Frame (TF) to trigger the transmission. The TF contains information such as the size of the OFDMA contention window (OCW), the number of RUs that can be randomly accessed, and the association identifiers (AIDs) indicating which STAs are allowed to use each RU. When an STA receives the TF, it begins a backoff procedure to compete for the transmission channel. If the AID of an RU is set to 0, the RU can only be accessed by STAs that are already associated with the AP. If it is set to 2045, the RU can be accessed by non-associated STAs. The OCW size is configured by the AP. If an STA does not receive the OCW announcement from the AP, it uses the default OCW, which is calculated using Eqs. (1) and (2):

$$OCW_{\min} = 2^{EOCW_{\min}} - 1 \qquad (1)$$

$$OCW_{\max} = 2^{EOCW_{\max}} - 1 \qquad (2)$$
The Extended OFDMA Contention Window (EOCW) is used to expand the OCW and thus increase the backoff time when needed, for example, when a collision occurs on an RU and the OCW must be increased to avoid further data-transmission collisions. When an STA wants to compete for RUs to transmit data, it randomly selects a value within the OCW range specified in the TF sent by the AP as the initial OFDMA backoff counter (OBO) for use in the backoff calculation, as shown in Eq. (3):

$$OBO_{\text{initial}} \in [0, OCW] \qquad (3)$$

After receiving the TF, the STA decrements its OBO by the number of available random-access RUs (RA-RUs) in the TXOP, as given by Eq. (4):

$$OBO = OBO - (\text{number of available RA-RUs}) \qquad (4)$$
When the backoff stage ends, STAs whose OBO is less than or equal to zero randomly select an available RU to transmit data. After the transmission, STAs that transmitted successfully receive the Multi-user Block Acknowledgment (MU-BACK) sent by the AP, whereas STAs whose transmission failed do not. STAs with a successful transmission wait for the next TXOP with the same OCW. An STA whose OBO is greater than zero cannot access the RUs in this TXOP and must wait for n further TXOPs, where n ≥ 1, until its OBO is less than or equal to zero. If an STA that transmitted does not receive the MU-BACK from the AP, its transmission failed: its OCW is doubled, and a new OBO is randomly selected from the range between 0 and the doubled OCW, as shown in Eq. (3). The flow chart of the UORA mechanism is illustrated in Fig. 1.

Figure 2 illustrates an example of the UORA mechanism with five STAs attempting to transmit data. STAs 1 to 4 are already associated with the AP, while STA 5 is unassociated. Upon receiving the trigger frame from the AP, the STAs decrement their OBOs according to Eq. (4) for RU contention in this TXOP. In this case, four RUs (RUs 3 to 6) with AID 0 are available for associated STAs and two RUs (RUs 1 and 2) with AID 2045 are available for unassociated STAs. As a result, the OBO counter of each associated STA is decremented by 4 in this TXOP, while the OBO counter of each unassociated STA is decremented by 2. After the backoff phase, the OBO counters of associated STAs 1, 2, and 4 are less than zero, so each of them must randomly select one of the four RUs with AID 0. As shown, associated STA 2 randomly selects RU 5, while STAs 1 and 4 both select RU 3, resulting in a collision. Unassociated STA 5 selects RU 1 because its OBO counter is also less than zero. Since the OBO counter of STA 3 is greater than zero, it is not allowed to compete for an RU in this TXOP.
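The per-TXOP behaviour described in this section can be sketched as a small simulation. It follows the text's description (Eqs. (1) to (4), same OCW after success, doubled OCW after a collision) rather than the exact wording of the IEEE 802.11ax standard. The function names, the dictionary-based state, the restriction to associated STAs, and the cap of the doubled OCW at OCWmax are assumptions made for illustration.

```python
import random
from collections import defaultdict


def ocw_from_exponent(eocw):
    """Eqs. (1)-(2): OCW = 2**EOCW - 1."""
    return 2 ** eocw - 1


def run_txop(obo, ocw, ocw_max, num_ra_rus, rng):
    """Simulate one TXOP for associated STAs.

    obo, ocw: dicts mapping STA id -> current OBO counter / OCW (updated in place).
    Returns (succeeded, collided) as sets of STA ids.
    """
    choices = defaultdict(list)
    for sta in obo:
        obo[sta] -= num_ra_rus  # Eq. (4)
        if obo[sta] <= 0:
            # Eligible STAs pick one RA-RU uniformly at random.
            choices[rng.randrange(num_ra_rus)].append(sta)

    succeeded, collided = set(), set()
    for stas in choices.values():
        (succeeded if len(stas) == 1 else collided).update(stas)

    for sta in succeeded:
        # Success: keep the same OCW and draw a fresh OBO, Eq. (3).
        obo[sta] = rng.randint(0, ocw[sta])
    for sta in collided:
        # Failure: double the OCW (capped at OCWmax, an assumption) and redraw.
        ocw[sta] = min(2 * ocw[sta], ocw_max)
        obo[sta] = rng.randint(0, ocw[sta])
    return succeeded, collided
```

As a usage example, starting from OBO counters {1: 3, 2: 4, 3: 9, 4: 2} with four RA-RUs, STAs 1, 2, and 4 reach zero or below and contend, while STA 3 is left with an OBO of 5 and waits for a later TXOP. Which of the contenders collide depends on the random RU choices.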
3 Related Studies

The UORA mechanism in the IEEE 802.11ax standard has a transmission efficiency of only 37% [12], for two reasons. (1) When the OBOs of two or more STAs are all less than or equal to 0 and these STAs randomly choose the same RU, data-transmission collisions occur. (2) After STAs back off, several RUs may remain idle in the current TXOP because no STA selects them, which lowers transmission efficiency. Many researchers have proposed solutions to these problems. However, in practical scenarios, some situations cannot be eliminated: in a TXOP, the AP cannot predict the number of STAs that will transmit data, e.g., the associated STAs that have just been awakened from sleep mode and the unassociated STAs
Fig. 1. UORA Mechanism for each TXOP
Fig. 2. A UORA Example
that need to transmit data. Under the UORA mechanism, STAs must start and end data transmission at the same time, but they know neither whether other STAs have data to transmit nor which RUs the others will select. To improve the transmission efficiency of UORA, researchers have proposed several solutions [9, 13–15]. In [13], the UORA mechanism is analyzed using a two-dimensional Markov-chain model, which evaluates the transmission efficiency of each STA at each UORA time slot
t given different numbers of RUs under different OBO phases and OBO statuses for each STA.

In [14], the authors proposed H-UORA, which is based on trigger frames and carrier sensing and uses a two-stage OBO backoff mechanism. In the second-stage backoff, each STA generates a random number Xu and compares it with the OBO probability ρu. If Xu < ρu, the STA performs channel sensing and, if idle channels are found, randomly chooses one to transmit data. If Xu ≥ ρu, the STA proceeds to the next round of second-stage backoff. The H-UORA mechanism can reduce the number of idle RUs and the number of unsuccessful transmissions. Its drawback is that it requires modifying the UORA mechanism of the IEEE 802.11ax protocol, which requires all STAs to start and end data transmission at the same time. Additionally, the second-stage backoff in H-UORA may shorten the time available for an STA to transmit data in a TXOP.

In [9], the authors propose Collision Reduction and Utilization Improvement (CRUI) to reduce data-transmission collisions on RUs. After STAs have completed backoff and before data transmission, there is an extra backoff (EBO) stage that lasts for one slot. During the EBO stage, STAs are prioritized by their OBO counters and randomly select available RUs based on their priorities. In addition, the authors propose opportunistic RU hopping (ORH), which gives STAs that perceive themselves as having a low priority level a second chance to transmit data on seemingly idle RUs. The CRUI mechanism brings two concepts to UORA: (1) prioritization by OBO counter, for fairness among STAs that need to transmit data; and (2) an STA hopping probability, which offers STAs that might collide on the same RU a second chance to choose another RU.
Compared to H-UORA, CRUI satisfies the IEEE 802.11ax rule that transmissions start and end simultaneously, but its EBO stage still takes up transmission time, and its ORH part also increases the chance of data-transmission collisions due to STA hopping.

In [15], the authors propose a solution that is more in line with the IEEE 802.11ax standard: it controls each STA's OBO counter and determines the OBO decrement best suited to the STA based on its individual transmission successes and failures. This not only allows the UORA mechanism to be used as specified in the IEEE 802.11ax standard, but also adjusts the OBO value in each TXOP more accurately according to the transmission status of each STA. The OBO counter is calculated as shown in Eq. (5):

$$OBO = OBO - \alpha M_{ru}, \qquad \alpha = \begin{cases} \min(\alpha + \delta,\ \alpha_{\max}) & \text{after a successful transmission} \\ \max(\alpha - \delta,\ \alpha_{\min}) & \text{after a failed transmission} \end{cases} \qquad (5)$$

where $M_{ru}$ is the number of RUs available for data transmission in the current TXOP; $\alpha$ is the factor used to adjust the OBO counter, bounded by $0 < \alpha_{\min} \le 1 \le \alpha_{\max}$; and $\delta$ is a value greater than zero. When an STA successfully transmits data in a TXOP, its α value increases, making the OBO counter decrement faster in the next TXOP and increasing its chance of competing for an RU. Conversely, if the STA fails to transmit data, its α value decreases, causing the OBO counter to decrease more slowly and consequently reducing its chance of competing for an RU. The OBO control scheme allows UORA to adjust the competition speed for RUs for each STA based on
its individual transmission status, indirectly affecting the likelihood of RU collisions and the number of idle slots. This scheme also complies with the IEEE 802.11ax standard. However, as the number of STAs increases, collision problems may still occur.

In [15], the authors proposed RA-NFRP, which combines the Trigger Frame of the UORA scheme with Null Data Packet Feedback Report Polling (NFRP), as shown in Fig. 3. RA-NFRP uses NFRP to let the AP collect resource request messages from STAs. Each STA checks whether its number of pending data packets exceeds a threshold. If so, it reports a 1; otherwise, it replies with a 0. In RA-NFRP, upon receiving the NFRP TF from the AP, the STA randomly selects an RU Tone Set to transmit its feedback status (0 or 1) instead of transmitting data. The AP then randomly selects among the STAs that transmitted a feedback status and schedules RUs for their data transmission. After an STA successfully transmits its data, the AP sends an M-BA to indicate success.
Fig. 3. The RA-NFRP Mechanism
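The α-controlled OBO update of Eq. (5), discussed earlier in this section, can be sketched as follows. This is an illustrative reading of the equation and its accompanying text, not code from [15]; the function names and the example parameter values are hypothetical.

```python
def update_alpha(alpha, success, delta, alpha_min, alpha_max):
    """Eq. (5): raise alpha after a success, lower it after a failure,
    keeping it within [alpha_min, alpha_max]."""
    if success:
        return min(alpha + delta, alpha_max)
    return max(alpha - delta, alpha_min)


def decrement_obo(obo, alpha, m_ru):
    """Eq. (5): OBO = OBO - alpha * M_ru, with M_ru available RUs in this TXOP."""
    return obo - alpha * m_ru
```

With illustrative values α = 1.0 and δ = 0.25, a successful STA moves to α = 1.25, so with four available RUs its OBO drops by 5 instead of 4 in the next TXOP, while a failed STA moves to α = 0.75 and its OBO drops by only 3.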
4 Proposed Solutions

The choice of OCW range plays an important role in our proposed solution for UORA. The AP dynamically adjusts the OCW range based on the current number of known STAs. However, as mentioned above, it is difficult for the AP to accurately identify the number of STAs that need to transmit data, and an inappropriate OCW range reduces the efficiency of UORA. In addition, 802.11ax specifies only the range of the OCW, as shown in Fig. 4, and does not describe how the OCW of the UORA mechanism should vary across different data flows. To explicitly differentiate the priority levels of data streams in 802.11be, we divide network traffic into four Access Categories (ACs) according to their priority levels in EDCA (Enhanced Distributed Channel Access). These four types, ranked from high to low, are AC_VO (Voice), AC_VI (Video), AC_BE (Best Effort), and AC_BK (Background). The OCW varies depending on the type of traffic. For example,
Fig. 4. OCW Range Field format
Table 1. AC and CW

AC      CWmin   CWmax
AC_VO   7       15
AC_VI   15      31
AC_BE   31      1023
AC_BK   31      1023
the minimum contention window for Voice is 7, while the minimum contention window for Background is 31, as shown in Table 1. Our idea is to set different OCW values according to the type of data stream that the STA is transmitting, so that different data streams are transmitted with different priority levels. An STA selects its initial OBO value as in conventional UORA, i.e., as a random value in the range [0, OCW]. For instance, the initial contention window for AC_VO is [0, OCWmin], i.e., [0, 7]; for AC_BK, it is [0, 31]. In general, the OBO value for a given type is chosen from [x, OCW], where x is the lower limit of the OCW for this round (x = 0 for the first transmission). After a TXOP, STAs that transmitted successfully again choose a new OBO from [0, OCW]; STAs that did not get a transmission opportunity wait for the next round; and STAs whose transmission failed double their OCW and randomly select the OBO from [OCW, OCW*2], rather than [0, OCW*2], in order to reduce the chance of STA collisions and improve data stream priority.

Additionally, we set a time slot before STAs transmit data. During this time slot, STAs detect whether signals are being transmitted on each RU; RUs with ongoing signal transmission are considered busy, while the others are idle. When STAs detect potential collisions on an RU, they are prioritized by their own priority levels, and STAs with higher priority can choose idle RUs to transmit data. After the time slot ends, the OCW of each STA experiencing a transmission collision on an RU doubles, and a new OBO value is selected from [OCW, OCW*2].
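The per-AC contention window handling described above can be sketched as follows. This is one possible reading of the scheme, not the authors' simulator. The class and method names are hypothetical, the Table 1 values are used directly as the OCW bounds, and capping the doubled OCW at the AC's maximum is an assumption the text does not state. The sensing time slot and the priority-based choice of idle RUs are not modelled.

```python
import random

# (OCWmin, OCWmax) per access category, taken from Table 1.
AC_OCW = {
    "AC_VO": (7, 15),
    "AC_VI": (15, 31),
    "AC_BE": (31, 1023),
    "AC_BK": (31, 1023),
}


class DacoSta:
    """Backoff state of one STA under an assumed reading of DACO."""

    def __init__(self, ac, rng):
        self.rng = rng
        self.ocw_min, self.ocw_max = AC_OCW[ac]
        self.ocw = self.ocw_min
        # First transmission: OBO drawn from [0, OCWmin] (x = 0).
        self.obo = rng.randint(0, self.ocw)

    def on_success(self):
        # Successful STAs draw a new OBO from [0, OCW].
        self.obo = self.rng.randint(0, self.ocw)

    def on_collision(self):
        # Failed STAs double the OCW and draw from [OCW, 2*OCW]
        # instead of [0, 2*OCW]; the cap at OCWmax is an assumption.
        lower = self.ocw
        self.ocw = min(2 * self.ocw, self.ocw_max)
        self.obo = self.rng.randint(lower, self.ocw)
```

Under these assumptions, a voice STA starts with an OBO in [0, 7] and, after one collision, draws from [7, 14], whereas a background STA starts in [0, 31] and moves to [31, 62], which is how the larger windows of the lower-priority categories spread their retries further out.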
5 Simulation and Discussion

We simulated a BSS (Basic Service Set) environment consisting of one AP and multiple STAs. The transmission bandwidth was 20 MHz, divided into RUs of 26 tones each. The experimental parameters are listed in Table 2.

Table 2. Experimental parameters

Parameter                      Value
Number of subcarriers per RU   26 tones
Number of STAs                 1 ~ 100
MCS                            11
Packet size                    2000 octets
TF duration                    100 μs
SIFS duration                  16 μs
PIFS duration                  25 μs
Multiuser BA duration          68 μs
The experimental results for DACO and UORA throughputs are shown in Fig. 5. The throughput of DACO is higher than that of UORA. However, because the OCW for voice and video streams is smaller, collisions are more likely to occur, leading to a decrease in throughput. Figure 6 shows the throughput for each type of data transmission. As shown, when there are many STAs, AC_BE and AC_BK achieve higher throughput than AC_VO and AC_VI, since the larger range of [OCW, OCW*2] results in fewer OBO collisions. Figure 7 illustrates the latencies of the four types of data streams. Voice and Video experience longer latencies than Best Effort and Background, since shorter OBOs cause more collisions.

Fig. 5. System Throughputs of DACO and UORA given different numbers of STAs ranging between 1 and 100 (x-axis: Number of STAs; y-axis: System Throughput (Mbps); series: UORA, DACO)
Fig. 6. Throughput for each AC given different numbers of STAs ranging from 1 to 100 (x-axis: Number of STAs; y-axis: System Throughput (Mbps); series: DACO(VO), DACO(VI), DACO(BE), DACO(BK))

Fig. 7. Latency for each AC given different numbers of STAs ranging from 1 to 100 (x-axis: Number of STAs; y-axis: Latency, axis scale ×10^-8; series: DACO(VO), DACO(VI), DACO(BE), DACO(BK))
6 Conclusions and Future Work

In this study, we classified data streams and selected the OCW (OFDMA contention window) according to their priority levels. Our simulation results indicate that choosing the correct OCW for each data stream is crucial for the overall throughput and efficiency of the system. Under various network load conditions, selecting an appropriate OCW can reduce the probability of collisions, decrease the number of retransmissions, and increase network throughput. However, dynamically adjusting the OCW to cope with different network environments remains challenging. In future research, we will choose suitable OCWs for each data stream in a more flexible and adaptive manner to meet the requirements of different data streams under various network conditions.
Regression Testing Measurement Model to Improve CI/CD Process Quality and Speed
Sen-Tarng Lai (1, corresponding author) and Fang-Yie Leu (2)
(1) Department of Information Technology and Management, Shih Chien University, Taipei 10462, Taiwan
(2) Department of Computer Science, Tunghai University, Taichung 40704, Taiwan
Abstract. Regression testing is a necessary and important activity in software maintenance and in the CI/CD process. It is responsible for the overall stability and functionality of existing features. After any code change, update, or improvement, regression testing ensures that an application still functions as expected. DevOps and agile software development apply CI/CD processes to increase software delivery quality and deployment speed. Regression testing, however, is a critical workflow that affects the efficiency and quality of the CI/CD process. To improve CI/CD quality and speed, regression testing must have four features: automation, quality assurance, assisting the CD process, and efficiency. In this paper, the Regression Testing Measurement (RTM) model is proposed, based on quality items and influencing factors. Quantified measurement can identify defects and deficiencies in the regression testing workflow. An RTM-based regression testing workflow can assure the four quality features and improve the quality and speed of the CI/CD process.
1 Introduction
Agile software development using IID (Iterative and Incremental Development) was proposed by 17 professionals in 2001 to improve the success rate of software projects through member interactions [1–3]. DevOps was proposed in 2009 to overcome the issues between software development and operations through interactions [4]. DevOps and IID share the same target: improving delivery and deployment speed while ensuring product quality [5–7]. CI/CD is an automation-based practice and a kernel process through which IID and DevOps accomplish this target [8]. Integrating multi-purpose software tools for development, testing, delivery, and deployment is the basis of CI/CD process automation [9, 10]. In the CI/CD process, regression testing is responsible for ensuring the functional and non-functional correctness, consistency, and completeness of existing requirements. After any code change, update, or improvement, regression testing must ensure that the system still functions as expected. Regression testing is therefore a necessary workflow of the CI/CD process. In software maintenance, requirements changes, and IID, regression testing is a necessary task to ensure that functions, non-functional properties, and the system as a whole execute correctly.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 316–326, 2023. https://doi.org/10.1007/978-3-031-35836-4_33
However, regression testing always takes a lot of time and manpower to complete. To ensure product quality and delivery speed, regression testing must have three features: automation, quality assurance, and assisting delivery and deployment. In addition, many academic papers have proposed selection techniques and priority sequence algorithms for regression testing to improve efficiency and reduce personal involvement. Selection techniques can identify defects and bugs in the early testing steps. Priority sequence algorithms can reduce the total testing time. Regression testing therefore also needs to increase efficiency by using suitable selection techniques and priority sequence algorithms. Regression testing is a critical task in the CI/CD process for software IID and software maintenance. This paper defines the quality items of the regression testing workflow and proposes the Regression Testing Measurement (RTM) model to identify the quality defects of the workflow and to develop specific improvement methods. The model assists the CI/CD process in completing testing activities rapidly and delivering high product quality, so that enterprises and organizations can continuously respond to new customer requirements and changing environments.
Section 2 discusses the importance and challenges of regression testing and the relationship between regression testing and the CI/CD process. Section 3 discusses the steps of the regression testing workflow and the major quality items. Section 4, based on the major quality items, proposes the RTM model to quantify and improve the operation quality of the regression testing workflow. Section 5 evaluates the advantages of a CI/CD process that uses a high-quality regression testing workflow. Section 6 describes the importance of a high-quality regression testing workflow for expediting the CI/CD process and rapid deployment.
2 CI/CD Process and Regression Testing
Regression testing is a critical workflow in the CI/CD process that affects product quality and deployment speed.
2.1 Relationship between CI/CD Process and Regression Testing
DevOps continues the IID of agile software development; by speeding up maintenance deployment and improving key product quality, it can effectively improve software maintenance. The CI/CD process is a key workflow in DevOps and agile software development. It covers the continuous development, testing, delivery, and rapid deployment of high-quality products, which makes it an ideal match for perfective maintenance with frequent changes [9]. Combining multiple software tools for development, testing, delivery, and deployment is the basis of CI/CD pipeline automation [10]. The CI/CD process can be broken down into four main steps (shown in Fig. 1). The CI tasks are divided into affected item identification and automatic testing:
Step 1. Affected items identification: Use the VC (Version Control), CM (Configuration Management), and RM (Repository Management) systems to quickly identify the affected software items in the software library.
Step 2. Automatic testing: Using automated testing tools reduces human involvement and improves the efficiency and quality of unit, functional, non-functional, and regression testing activities.
The CD tasks consist of two steps, continuous delivery and continuous deployment:
Step 3. Quality product delivery: Transform changing requirements into acceptance criteria, use tools to assist system testing, acceptance testing, and the delivery process, and confirm that the new version has key qualities such as correctness, integrity, consistency, and security. Regression testing should assist the delivery process to facilitate rapid and continuous delivery of new releases.
Step 4. Rapid deployment: Convert the operating environment requirements into deployment setting parameters, and then use installation assistance tools (such as regression testing) and the deployment process to quickly complete the deployment of the new version.
Fig. 1. Relationship between CI/CD process and Regression Testing
2.2 Importance of Regression Testing
Regression testing is a necessary task for ensuring functional and non-functional requirements and product quality, and it is also a key activity for speeding up product delivery and deployment. The regression testing workflow must cover all possible test cases and impacted functionalities. To improve CI/CD process quality and speed, regression testing needs four features: automation, quality assurance, assisting the CD process, and efficiency. The five steps of the regression testing workflow are as follows:
Step 1. Identify affected and unaffected software items.
Step 2. Identify the relationships between affected and unaffected software items and build the cross-reference table.
Step 3. Apply selection techniques [11] and priority sequence algorithms [12] to select test cases and arrange their sequence, which can greatly reduce the time needed to find bugs and identify defects. In addition, integrate useful automatic test tools to improve regression testing efficiency.
Step 4. Ensure the correctness, completeness, and consistency of unit, functional, and non-functional testing, and verify that the modified requirements specifications are met.
Step 5. Assist the subsequent jobs, such as acceptance testing in the delivery procedure and installation testing in the deployment procedure.
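Steps 1 to 3 of the workflow can be sketched in a few lines. The example below is only an illustration under stated assumptions: the cross-reference table is modelled as a mapping from each test case to the software items it exercises, selection keeps the test cases that touch an affected item, and prioritization orders them by affected-item coverage and past failure count. These simple rules stand in for, and are not, the techniques surveyed in [11] or proposed in [12]; all names and data are hypothetical.

```python
from typing import Dict, List, Set


def select_tests(cross_ref: Dict[str, Set[str]], affected: Set[str]) -> List[str]:
    """Keep only the test cases that exercise at least one affected item."""
    return [test for test, items in cross_ref.items() if items & affected]


def prioritize(
    tests: List[str],
    cross_ref: Dict[str, Set[str]],
    affected: Set[str],
    past_failures: Dict[str, int],
) -> List[str]:
    """Order tests: more affected items first, then more past failures, then name."""
    return sorted(
        tests,
        key=lambda t: (-len(cross_ref[t] & affected), -past_failures.get(t, 0), t),
    )


# Hypothetical cross-reference table: test case -> software items it covers.
cross_ref = {
    "test_login": {"auth", "session"},
    "test_checkout": {"cart", "payment"},
    "test_profile": {"auth", "profile"},
    "test_search": {"catalog"},
}
affected = {"auth", "payment"}          # items changed in this release
past_failures = {"test_checkout": 3, "test_login": 1}

selected = select_tests(cross_ref, affected)
ordered = prioritize(selected, cross_ref, affected, past_failures)
print(ordered)
```

Here `test_search` is skipped because it touches no affected item, and the remaining tests run in an order that puts historically fragile tests first.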
3 Quality Items and Influencing Factors of Regression Testing
3.1 Main Features of Regression Testing
Regression testing is an important and critical workflow for software maintenance and the IID methodology. It also serves as a bridge between CI and CD in the CI/CD process, helping the process overcome challenges such as shortening maintenance time, ensuring the critical quality of the product, and reducing user inconvenience. For this, regression testing must have four features:
(1) Automation: Regression testing is a complex and time-consuming task that usually requires considerable time and resources to complete fully. To increase RT efficiency, personal involvement should be reduced and automated tools need to be properly applied. CM, VC, and RM can identify the affected items in software maintenance and incremental versions. Additionally, proper use of testing tools can increase the efficiency and quality of regression testing activities.
(2) Quality Assurance: The main purpose of regression testing is to assure that unaffected functional and non-functional requirements still execute correctly and meet user requirements after software maintenance or version expansion. Based on the CM system, functional and non-functional modules can be decomposed into many unit programs. Therefore, basic components must be thoroughly tested before the functional and non-functional modules are tested. Regression testing must ensure that unaffected items (unit programs, functional modules, and non-functional modules) are still correct and meet the requirement specification after maintenance or new releases.
(3) Assisting delivery and deployment: In the CI/CD process, regression testing can serve as a bridge between CI and CD. Regression testing therefore has the additional mission of supporting the delivery and deployment procedures. Assisting the acceptance testing of the delivery procedure can improve the quality of the delivered product. Assisting the installation testing of the deployment procedure can improve the speed of system deployment.
(4) Efficiency: Many academic papers have proposed regression testing selection techniques and priority sequence algorithms to reduce time consumption and improve efficiency. Selection techniques can identify defects and bugs of the product as early as possible in regression testing. Priority sequence algorithms can complete the full regression testing with fewer test cases. Evaluating and choosing suitable selection techniques and priority sequence algorithms can improve the efficiency of regression testing.
3.2 Quality Influencing Factors of Regression Testing
To assure that the four features of regression testing have been integrated into the workflow, this paper evaluates the regression testing workflow using quality items and influencing factors (shown in Table 1).
(1) Automation: Identifying the affected items of software maintenance and incremental development, and building the item cross-reference table, requires the VC, CM, and RM systems to increase identification speed. Additionally, qualified testing tools can reduce personnel involvement and speed up testing activities. Automation should consider four quality items:
– VC system quality: this quality item should consider the VC system influencing factors of usability, integration ability, and adaptability.
– CM system quality: this quality item should consider the CM system influencing factors of usability, integration ability, and adaptability.
– RM system quality: this quality item should consider the RM system influencing factors of usability, consistency, and completeness.
– Testing tool quality: this quality item should consider the influencing factors of usability, changeability, and adaptability.
(2) Quality Assurance: Correctness and conformance to user requirements ensure that the system runs with high quality and meets the user's functional and non-functional requirements after maintenance or delivery of new versions. Quality assurance should consider three quality items:
– Unit program correctness: this quality item covers the correctness, completeness, and consistency of unit program regression testing.
– Functional modules correctness: this quality item covers the correctness, completeness, and consistency of functional regression testing.
– Non-functional modules correctness: this quality item covers the correctness, completeness, and consistency of non-functional regression testing.
(3) Assisting delivery and deployment: To improve product delivery and system deployment speed, regression testing should consider three quality items:
– Assisting acceptance testing quality: this quality item covers the correctness, completeness, and consistency of acceptance testing.
– Assisting installation testing quality: this quality item covers the correctness, completeness, and consistency of installation testing.
– Assisting CD process quality: this quality item covers the correctness, completeness, and efficiency of Continuous Deployment (CD).
(4) Efficiency: A good quality evaluation method makes it possible to choose high-quality selection techniques and priority sequence algorithms, which improves the efficiency of regression testing. Efficiency should consider three quality items:
– Selection techniques quality: this quality item should consider the influencing factors of usability, changeability, and extensibility.
– Priority sequence algorithms quality: this quality item should consider the influencing factors of usability, effectiveness, and adaptability.
– Evaluation methods quality: this quality item should consider the influencing factors of usability, quantifiability, and comparability.
Table 1. Quality items and influencing factors of regression testing
4 Regression Testing Measurement Model and Improvements
4.1 Regression Testing Measurement Model
To quantify and improve the RT workflow, we defined its quality items and, based on the linear combination model [13], propose the RTM model, which combines four quality measurements: automation, quality assurance, assisting the CD process, and efficiency. By inspecting and reviewing the influencing factors of each quality item, software professionals and experienced maintainers can assign quantified values to the basic quality items. A value approaching 1 represents good quality, and a value approaching 0 represents bad quality. Senior software engineers assign each quality item a weight between 0 and 1; a weight close to 1 indicates that the item is important for the measurement. The RTM model is described as follows:
(1) Automation Quality Measurement (AQM): To quickly identify affected and unaffected items in the software repository, the Configuration Management (CM), Version Control (VC), and software Repository Management (RM) systems should be of high quality. AQM therefore combines three quality items, CMVCQ, RMQ, and TTQ, as shown in Eq. (1):
AQM: Automation Quality Measurement
CMVCQ: CM and VC Quality; $W_1$: weight of CMVCQ
RMQ: Repository Management Quality; $W_2$: weight of RMQ
TTQ: Testing Tools Quality; $W_3$: weight of TTQ
$$\mathrm{AQM} = W_1 \cdot \mathrm{CMVCQ} + W_2 \cdot \mathrm{RMQ} + W_3 \cdot \mathrm{TTQ}, \qquad W_1 + W_2 + W_3 = 1 \tag{1}$$
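As a worked illustration of Eq. (1), suppose the assessors assign the following values. These numbers are hypothetical and chosen only to show the arithmetic; the paper does not report them.

```latex
% Hypothetical scores and weights, for illustration only
\begin{aligned}
W_1 &= 0.4, & W_2 &= 0.3, & W_3 &= 0.3, \\
\mathrm{CMVCQ} &= 0.8, & \mathrm{RMQ} &= 0.7, & \mathrm{TTQ} &= 0.5, \\
\mathrm{AQM} &= 0.4 \cdot 0.8 + 0.3 \cdot 0.7 + 0.3 \cdot 0.5
             = 0.32 + 0.21 + 0.15 = 0.68 .
\end{aligned}
```

Because the weights sum to 1 and every quality item lies between 0 and 1, the resulting measurement also lies between 0 and 1 and can be read on the same scale as the individual items.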
(2) Quality Assurance Measurement (QAM): Regression testing needs to ensure that basic components, functional modules, and non-functional modules are correct and meet the requirement specifications. The measurement needs to consider the combined quality of unit testing (UT), functional testing (FT), and non-functional testing (NFT). QAM therefore combines three quality items, UTQ, FTQ, and NFTQ, as shown in Eq. (2):
QAM: Quality Assurance Measurement
UTQ: Unit Testing Quality; $W_1$: weight of UTQ
FTQ: Functional Testing Quality; $W_2$: weight of FTQ
NFTQ: Non-Functional Testing Quality; $W_3$: weight of NFTQ
$$\mathrm{QAM} = W_1 \cdot \mathrm{UTQ} + W_2 \cdot \mathrm{FTQ} + W_3 \cdot \mathrm{NFTQ}, \qquad W_1 + W_2 + W_3 = 1 \tag{2}$$
(3) Assisting CD Process Measurement (ADPM): ADPM should consider the Assisting Acceptance Testing Quality of product delivery (AATQ), the Assisting Installation Testing Quality of system deployment (AITQ), and the CD Assistant Quality (CDAQ). ADPM therefore combines three quality items, AATQ, AITQ, and CDAQ, as shown in Eq. (3):
ADPM: Assisting CD Process Measurement
AATQ: Assisting Acceptance Testing Quality; $W_1$: weight of AATQ
AITQ: Assisting Installation Testing Quality; $W_2$: weight of AITQ
CDAQ: CD Assistant Quality; $W_3$: weight of CDAQ
$$\mathrm{ADPM} = W_1 \cdot \mathrm{AATQ} + W_2 \cdot \mathrm{AITQ} + W_3 \cdot \mathrm{CDAQ}, \qquad W_1 + W_2 + W_3 = 1 \tag{3}$$
(4) Efficiency Quality Measurement (EQM): EQM needs to consider the quality of test case selection techniques (STQ), the quality of test case priority sequence algorithms (PAQ), and the quality of evaluation methods (EMQ). EQM therefore combines three quality items, STQ, PAQ, and EMQ, as shown in Eq. (4):
EQM: Efficiency Quality Measurement
STQ: Selection Techniques Quality; $W_1$: weight of STQ
PAQ: Priority sequence Algorithm Quality; $W_2$: weight of PAQ
EMQ: Evaluation Methods Quality; $W_3$: weight of EMQ
$$\mathrm{EQM} = W_1 \cdot \mathrm{STQ} + W_2 \cdot \mathrm{PAQ} + W_3 \cdot \mathrm{EMQ}, \qquad W_1 + W_2 + W_3 = 1 \tag{4}$$
(5) Regression Testing Workflow Measurement (RTWM): RTWM combines the four quality measurements AQM, QAM, ADPM, and EQM, as shown in Eq. (5):
RTWM: Regression Testing Workflow Measurement
AQM: Automation Quality Measurement; $W_1$: weight of AQM
QAM: Quality Assurance Measurement; $W_2$: weight of QAM
ADPM: Assisting CD Process Measurement; $W_3$: weight of ADPM
EQM: Efficiency Quality Measurement; $W_4$: weight of EQM
$$\mathrm{RTWM} = W_1 \cdot \mathrm{AQM} + W_2 \cdot \mathrm{QAM} + W_3 \cdot \mathrm{ADPM} + W_4 \cdot \mathrm{EQM}, \qquad W_1 + W_2 + W_3 + W_4 = 1 \tag{5}$$
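The two-level structure of the model (quality items combined into four measurements, which are then combined into RTWM) can be sketched as follows. This is a minimal illustration, not the authors' tooling: the helper `weighted_measure` and every score and weight below are hypothetical, and the only properties taken from the text are that scores lie in [0, 1] and that each set of weights sums to 1.

```python
from typing import Dict


def weighted_measure(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear combination of quality scores, with the weights required to sum to 1."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same quality items")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if any(not 0.0 <= value <= 1.0 for value in scores.values()):
        raise ValueError("scores must lie between 0 and 1")
    return sum(weights[name] * scores[name] for name in scores)


# Hypothetical assessor inputs for the four measurements.
aqm = weighted_measure({"CMVCQ": 0.8, "RMQ": 0.7, "TTQ": 0.5},
                       {"CMVCQ": 0.4, "RMQ": 0.3, "TTQ": 0.3})
qam = weighted_measure({"UTQ": 0.9, "FTQ": 0.8, "NFTQ": 0.6},
                       {"UTQ": 0.3, "FTQ": 0.4, "NFTQ": 0.3})
adpm = weighted_measure({"AATQ": 0.6, "AITQ": 0.5, "CDAQ": 0.4},
                        {"AATQ": 0.4, "AITQ": 0.3, "CDAQ": 0.3})
eqm = weighted_measure({"STQ": 0.7, "PAQ": 0.7, "EMQ": 0.6},
                       {"STQ": 0.4, "PAQ": 0.4, "EMQ": 0.2})

# Top-level combination with equal (hypothetical) weights.
rtwm = weighted_measure(
    {"AQM": aqm, "QAM": qam, "ADPM": adpm, "EQM": eqm},
    {"AQM": 0.25, "QAM": 0.25, "ADPM": 0.25, "EQM": 0.25},
)
print(round(aqm, 2), round(qam, 2), round(adpm, 2), round(eqm, 2), round(rtwm, 2))
```

Reusing one function for both levels keeps the weight check in a single place, so a weight set that does not sum to 1 is rejected before it can distort the workflow score.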
4.2 RT Workflow Improvements
Quality quantification helps identify the defective quality items of the RT workflow so that appropriate improvements can be made. The process quality improvement measures based on the RTM model are as follows. IF RTWM < 0.6, the defective quality measurements should be identified according to Eq. (5), and the rule-based improvement mechanism is applied (the threshold of 0.6, marked * below, can be adjusted):
(1) IF AQM < 0.6* and the weight of AQM > 0.3, THEN according to Eq. (1), the CMVCQ, RMQ, and TTQ quality items need to be inspected to identify the bad quality items whose quantified value is < 0.6. The influencing factors of the identified bad quality items need to be examined one by one to make the correct improvements.
(2) IF QAM < 0.6* and the weight of QAM > 0.3, THEN according to Eq. (2), the UTQ, FTQ, and NFTQ quality items need to be inspected to identify the bad quality items whose quantified value is < 0.6. The influencing factors of the identified bad quality items need to be examined one by one to make the correct improvements.
(3) IF ADPM < 0.6* and the weight of ADPM > 0.3, THEN according to Eq. (3), the AATQ, AITQ, and CDAQ quality items need to be inspected to identify the bad quality items whose quantified value is < 0.6. The influencing factors of the identified bad quality items need to be examined one by one to make the correct improvements.
(4) IF EQM < 0.6* and the weight of EQM > 0.3, THEN according to Eq. (4), the STQ, PAQ, and EMQ quality items need to be inspected to identify the bad quality items whose quantified value is < 0.6. The influencing factors of the identified bad quality items need to be examined one by one to make the correct improvements.
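The four rules share one pattern, so they can be sketched as a single check. The function below is a hedged illustration rather than the authors' mechanism: `diagnose`, its parameters, and all sample values are hypothetical, while the 0.6 threshold and the 0.3 weight limit are the defaults stated in the rules.

```python
from typing import Dict, List


def diagnose(
    measurements: Dict[str, float],
    weights: Dict[str, float],
    items: Dict[str, Dict[str, float]],
    threshold: float = 0.6,      # adjustable, per the rules
    weight_limit: float = 0.3,
) -> Dict[str, List[str]]:
    """Return, per flagged measurement, the quality items scoring below the threshold."""
    rtwm = sum(weights[name] * measurements[name] for name in measurements)
    if rtwm >= threshold:
        return {}  # workflow quality is acceptable; no rule fires
    findings: Dict[str, List[str]] = {}
    for name, value in measurements.items():
        if value < threshold and weights[name] > weight_limit:
            findings[name] = [q for q, score in items[name].items() if score < threshold]
    return findings


# Hypothetical assessment results.
measurements = {"AQM": 0.68, "QAM": 0.55, "ADPM": 0.45, "EQM": 0.60}
weights = {"AQM": 0.20, "QAM": 0.35, "ADPM": 0.35, "EQM": 0.10}
items = {
    "AQM": {"CMVCQ": 0.80, "RMQ": 0.70, "TTQ": 0.54},
    "QAM": {"UTQ": 0.70, "FTQ": 0.50, "NFTQ": 0.45},
    "ADPM": {"AATQ": 0.60, "AITQ": 0.40, "CDAQ": 0.35},
    "EQM": {"STQ": 0.70, "PAQ": 0.60, "EMQ": 0.50},
}

findings = diagnose(measurements, weights, items)
print(findings)
```

In this sample, RTWM is 0.546, so the rules fire; QAM and ADPM are flagged, and the returned lists name the quality items whose influencing factors would be examined next.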
5 Advantages of RTM-Based Regression Testing
The CI/CD process must accomplish four missions (affected software item identification, automated testing activities, qualified product delivery, and speedy system deployment) to overcome three challenges: shortening maintenance time, ensuring critical product quality, and reducing user inconvenience. Regression testing is a critical workflow that helps the CI/CD process accomplish these four missions. Regression testing with automation and efficiency can speed up testing tasks, product delivery, and system deployment. Regression testing with quality assurance can ensure that all affected items have been identified and that the unit, functional, and non-functional testing activities have been completed correctly. Regression testing that assists the CD process can support acceptance testing in product delivery and installation testing in system deployment. RTM-based regression testing uses quantified measurement and rule-based improvement to integrate the automation, quality assurance, assisting CD process, and efficiency features into the regression testing workflow. Three types of regression testing workflow, each focusing on a core task, are described as follows (shown in Table 2).
(1) Tool-based RT workflow: Applying automated testing tools to RT is a convenient approach to reduce personal involvement and increase testing efficiency. However, selecting suitable testing tools and planning the test data and test results are necessary jobs before RT operation. In addition, automated testing tools cannot assure the correctness and completeness of the RT workflow.
(2) Algorithm-based RT workflow: Many academic papers have proposed selection techniques and test priority designs to improve RT efficiency and performance. However, RT selection techniques and priority algorithms need to be redesigned for different applications or systems, so the algorithm-based workflow is not a convenient approach. In addition, the algorithm-based workflow cannot assure the correctness and completeness of the RT workflow.
(3) RTM-based RT workflow: In the RTM model, the quality items cover four measurements: automatic tools measurement, techniques and algorithms measurement, correctness measurement, and completeness measurement. The RTM-based workflow is a comprehensive approach that involves automatic tools, RT techniques and algorithms, RT workflow correctness to assure the CI process, and RT workflow completeness to assist the CD process.
Table 2. Three Regression Testing Comparison Table
Features of RT | Tool-based RT | Algorithm-based RT | Measure-based RT
Automation | Strong | Concerned | Strong
Efficiency | Weak | Strong | Strong
Quality Assurance | Not concerned | Weak | Concerned
Assisting CD process | Not concerned | Weak | Concerned
6 Conclusion
To increase the market competitiveness of enterprises and organizations, information system maintenance must offer high quality, high security, and rapid delivery and deployment. Agile software development and DevOps apply IID and CI/CD processes to improve product deployment efficiency, critical quality, and security. In the CI/CD process, regression testing has the mission of ensuring product quality in CI activities and supporting delivery and deployment speed in the CD process. After the related test activities, on entering the Continuous Delivery (Deployment) process, regression testing should assist the acceptance testing of product delivery and support the installation testing of system deployment. To improve CI/CD quality and speed, regression testing must have four features: automation, quality assurance, assisting the CD process, and efficiency. In this paper, the Regression Testing Measurement (RTM) model was proposed to identify the defects of the regression testing workflow and to improve the running efficiency and quality of regression testing through rule-based improvement. CI/CD is a kernel process in DevOps and agile software development. By applying the CI/CD process, software maintenance and agile software development can speed up product delivery and deployment while preserving critical quality. High-quality RTM-based regression testing can improve CI/CD process quality and speed.
References
1. Beck, K., et al.: Manifesto for Agile Software Development (Online) (2001). Available: http://www.agilemanifesto.org/
2. Larman, C., Basili, V.R.: Iterative and Incremental Development: A Brief History. Computer. IEEE CS Press. 48 (2004)
3. Gheorghe, A.M., Gheorghe, I.D., Iatan, I.L.: Agile Software Development. Informatica Economica 24(2) (2020)
4. Allspaw, J.: 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr (2009). http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr/2009. Retrieved 21 April 2023
5. Roche, J.: Adopting DevOps practices in quality assurance. Commun. ACM 56(11), 38–43 (2013)
6. Fröschle, H.-P.: DevOps. HMD Praxis der Wirtschaftsinformatik 54(2), 171–172 (2017). https://doi.org/10.1365/s40702-017-0296-3
7. Leite, L., Rocha, C., Kon, F., Milojicic, D., Meirelles, P.: A survey of DevOps concepts and challenges. ACM Computing Surveys (CSUR) 52(6), 1–35 (2019)
8. Shahin, M., Babar, M.A., Zhu, L.: Continuous integration, delivery and deployment: a systematic review on approaches, tools, challenges and practices. IEEE Access 5, 3909–3943 (2017)
9. Purohit, K.: Executing DevOps & CI/CD, Reduce in Manual Dependency. IJSDR 5(6), 511–515 (2020)
10. Pratama, M.R., Kusumo, D.S.: Implementation of Continuous Integration and Continuous Delivery (CI/CD) on Automatic Performance Testing. In: 2021 9th International Conference on Information and Communication Technology (ICoICT), IEEE, pp. 230–235 (2021)
11. Biswas, S., Mall, R., Satpathy, M., Sukumaran, S.: Regression test selection techniques: A survey. Informatica 35(3) (2011)
12. Harikarthik, S.K., Palanisamy, V., Ramanathan, P.: Optimal test suite selection in regression testing with testcase prioritization using modified Ann and Whale optimization algorithm. Clust. Comput. 22(5), 11425–11434 (2017). https://doi.org/10.1007/s10586-017-1401-7
13. Fenton, N.E.: Software Metrics - A Rigorous Approach. Chapman & Hall (1991)
A Study on the Abnormal Stock Returns of Listed Companies in Taiwan’s Construction Sub-industry due to the Covid-19 Epidemic Announcement
Kuei-Yuan Wang (1), Ying-Li Lin (1, corresponding author), Chien-Kuo Han (2), and Hsieh-Jung Sung (3)
(1) Dept. of Finance, Asia University, Taichung, Taiwan
(2) Dept. of Food Nutrition and Healthy Biotechnology, Asia University, Taichung, Taiwan
(3) DT Swiss Asia, Taichung, Taiwan
Abstract. The Covid-19 epidemic has impacted the global economy and daily life, and has even threatened human life. This study takes Taiwan’s listed construction industry as a sample to test the impact of the Guidance Center’s announcements of an escalation or downgrade of the epidemic alert on the abnormal returns of Taiwan’s listed construction industry stocks. The empirical results show that the three sub-industries (construction, construction, and others) of Taiwan’s listed construction industry all react before the announcement of the upgrade and downgrade events, which suggests that investors may have anticipated the events in advance. However, the reactions of the three sub-industries after the announcements differ. These empirical results have reference value for investors’ investment decisions.
1 Introduction
In early December 2019, a new type of unknown virus was found in the Huanan Seafood Market in Wuhan, China. On December 31, 2019, the World Health Organization (WHO) obtained a case statement for this new type of viral pneumonia. On the same day, the Taiwan Centers for Disease Control began border quarantine, especially for incoming flights from Wuhan. On January 11, 2020, China recorded its first death from the novel coronavirus. This coincided with the Lunar New Year holiday in China, and the influx of people returning home and traveling contributed to the rapid spread of the epidemic, which reached other parts of Asia and the United States (Department of Disease Control, Ministry of Health and Welfare, 2021). On January 21, 2020, the first imported case occurred in Taiwan. The Taiwan Centers for Disease Control opened a command center on January 23, 2020. With the rapid and massive increase in the number of deaths worldwide, China’s closure of Wuhan at the end of January 2020 could not stop the spread of the epidemic around the world. WHO officially named the outbreak Covid-19 on February 11, 2020. The WHO has also officially declared that Covid-19 has entered the scale of a global pandemic (Department of Disease Control, Ministry of Health and Welfare, 2021). The purpose of this study is to explore the impact of Covid-19 alert announcements at different levels on the abnormal stock returns of listed companies in different construction sub-industries.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 327–334, 2023. https://doi.org/10.1007/978-3-031-35836-4_34
2 Methodology
2.1 Research Periods and Event Days
This study adopts the event study method, and stock returns are calculated as daily rates of return. The event dates are the announcement of the upgrade to Level 3 on May 15, 2021 and the announcement of the downgrade of the epidemic alert on July 27, 2021. A t-test is used to test whether the abnormal stock returns are statistically significant. Because the stock market was closed on Saturday, May 15, that event day is moved to the next trading day, May 17. Because the two event days are close together, and in order to avoid overlap between the estimation window and the event window, this study uses the 5 days before and after the event day as the event window and the 60 days before the event window as the estimation window, as shown in Table 1.
Table 1 Windows of event study
Event           Day          Estimation window        Event window
Level 3 Alert   2021/05/17   2021/03/13–2021/05/11    2021/05/12–2021/05/22
Level 2 Alert   2021/07/27   2021/05/23–2021/07/21    2021/07/22–2021/08/01
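The window arithmetic behind Table 1 can be sketched in a few lines. This is only an illustration: the function name and parameters are hypothetical, and it assumes that the "5 days" and "60 days" are counted as calendar days, which is what the dates in Table 1 imply but which the text does not state explicitly.

```python
from datetime import date, timedelta


def event_study_windows(event_day, half_width=5, estimation_length=60):
    """Return calendar-day windows around one event day.

    Assumption: days are counted as calendar days, as the dates in
    Table 1 suggest; the paper does not spell this out.
    """
    event_start = event_day - timedelta(days=half_width)
    event_end = event_day + timedelta(days=half_width)
    # The estimation window ends the day before the event window opens.
    estimation_end = event_start - timedelta(days=1)
    estimation_start = event_start - timedelta(days=estimation_length)
    return {
        "estimation": (estimation_start, estimation_end),
        "event": (event_start, event_end),
    }


level3 = event_study_windows(date(2021, 5, 17))  # first trading day after May 15
level2 = event_study_windows(date(2021, 7, 27))

for name, windows in (("Level 3 Alert", level3), ("Level 2 Alert", level2)):
    print(name, windows["estimation"], windows["event"])

# The second estimation window starts only after the first event window ends.
no_overlap = level2["estimation"][0] > level3["event"][1]
```

With these settings the Level 2 estimation window begins on the day after the Level 3 event window closes, which is consistent with the stated aim of avoiding overlap between the two.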
2.2 Samples and Data
This study takes 60 listed companies in Taiwan's construction industry and its sub-industries as the sample. The data are taken from the Taiwan Economic Journal (TEJ) database. Industry classification follows TEJ's standard: 45 companies belong to the building sub-industry, 10 to the construction sub-industry, and 5 to others.
2.3 Building Expected Return on Stocks
2.3.1 Market Model
This study assumes that the return of an individual stock is linearly related to the return of the market-weighted index. It therefore uses ordinary least squares (OLS) with the market-weighted index return to estimate the expected stock return. The formula is as follows.
$$R_{it} = \alpha_i + \beta_i R_{mt} + \varepsilon_{it} \tag{1}$$
where $R_{it}$ is the return of sample i on trading day t; $R_{mt}$ is the return of the market-weighted index on trading day t; $\alpha_i$ and $\beta_i$ are the intercept and slope estimated by OLS; and $\varepsilon_{it}$ is the residual term.
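As a minimal sketch of how the market model in Eq. (1) can be estimated, the following uses the closed-form OLS solution for a single regressor. The function name and the return series are hypothetical and made up for illustration; they are not the TEJ data used in the study.

```python
from statistics import mean


def fit_market_model(stock_returns, market_returns):
    """Closed-form OLS for R_it = alpha_i + beta_i * R_mt + eps_it.

    Returns (alpha, beta) estimated over one estimation window.
    """
    mean_stock = mean(stock_returns)
    mean_market = mean(market_returns)
    covariance = sum(
        (m - mean_market) * (s - mean_stock)
        for m, s in zip(market_returns, stock_returns)
    )
    market_variation = sum((m - mean_market) ** 2 for m in market_returns)
    beta = covariance / market_variation
    alpha = mean_stock - beta * mean_market
    return alpha, beta


# Hypothetical daily returns: the stock is built as an exact linear
# function of the market so the estimates are easy to check by eye.
market = [0.010, -0.020, 0.005, 0.015, -0.010]
stock = [0.001 + 1.2 * m for m in market]

alpha_hat, beta_hat = fit_market_model(stock, market)
print(round(alpha_hat, 6), round(beta_hat, 6))
```

In practice the estimation window would contain the daily returns of one firm and of the market-weighted index, and the fit would be repeated for each of the 60 sample firms.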
2.3.2 Expected Return on Stock During the Event Window
The expected stock return during the event window is calculated as follows.
$$E(\hat{R}_{it}) = \hat{\alpha}_i + \hat{\beta}_i R_{mt} \tag{2}$$
where $E(\hat{R}_{it})$ is the expected rate of return of sample i on trading day t.
2.3.3 Estimation of Abnormal Rate of Return
The abnormal rate of return caused by the event is obtained by subtracting the expected rate of return on a given day from the actual rate of return on that day. The formula is as follows.
$$AR_{it} = R_{it} - E(\hat{R}_{it}) \tag{3}$$
where $AR_{it}$ is the abnormal rate of return of sample i on day t of the event window.
2.3.4 Estimation of Cumulative Abnormal Returns
The cumulative abnormal return of company i is obtained by summing the abnormal returns of sample i over the days of a given period within the event window. The formula is as follows.
$$CAR_i(\tau_1, \tau_2) = \sum_{t=\tau_1}^{\tau_2} AR_{it} \tag{4}$$
where $CAR_i(\tau_1, \tau_2)$ is the cumulative abnormal return of sample i from day $\tau_1$ to day $\tau_2$ of the event window.
2.3.5 Estimation of Average Abnormal Returns
To keep factors other than the specified event from interfering with stock returns, this study averages the abnormal returns of all sample firms on the same event day. The formula is as follows.
$$AAR_t = \frac{1}{N}\sum_{i=1}^{N} AR_{it} \tag{5}$$
where $AAR_t$ is the average abnormal return of all samples on day t and N is the number of sample firms.
2.3.6 Estimation of Cumulative Average Abnormal Returns
The daily average abnormal returns of all sample companies are summed over the event period to obtain the cumulative average abnormal return of all samples. The formula is as follows.
$$CAAR(\tau_1, \tau_2) = \sum_{t=\tau_1}^{\tau_2} AAR_t \tag{6}$$
where $CAAR(\tau_1, \tau_2)$ is the cumulative average abnormal return of all samples from day $\tau_1$ to day $\tau_2$ of the event window.
2.4 Test for Abnormal Rate of Return
In this study, AR and CAR are tested by the traditional method. The formulas are as follows.
$$t_{AR} = \frac{AAR_t}{\frac{1}{N}\sqrt{\sum_{i=1}^{N}\hat{s}_i^{2}}} = \frac{AAR_t}{\frac{1}{N}\sqrt{\sum_{i=1}^{N}\frac{1}{T_i-1}\sum_{t=t_1}^{t_2}\left(\hat{\varepsilon}_{it}-\frac{1}{T_i}\sum_{s=1}^{T_i}\hat{\varepsilon}_{is}\right)^{2}}} \tag{7}$$
where $t_{AR}$ is the t-test value of the average abnormal return on day t of the event window, $\hat{\varepsilon}_{it}$ is the residual of sample i on day t of the estimation window, $\hat{s}_i^{2}$ is the variance of sample i's residuals in the estimation window, and $T_i$ is the length of the estimation window.
$$t_{CAR} = \frac{1}{\sqrt{N}}\sum_{i=1}^{N}\left[\frac{1}{\sqrt{m}}\sum_{t=\tau_1}^{\tau_2}\frac{AR_{it}}{\hat{s}_i}\right] \tag{8}$$
where $t_{CAR}$ is the t-test value of the cumulative average abnormal return over the days from $\tau_1$ to $\tau_2$ of the event window, m is the number of days in that period, and $\hat{s}_i$ is the standard deviation of sample i's residuals. This study uses the traditional method to test whether the AR and CAR of stock prices are significant around the specific events. If the p-value is below the chosen significance level, the announcement of the upgrade or downgrade of the Covid-19 alert is inferred to have had a significant effect, that is, to have moved stock prices. Conversely, if the p-value exceeds the significance level, the effect of the announcement is not statistically significant.
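The chain from abnormal returns to the two test statistics can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the two firms and their numbers are invented, and the t-statistics follow Eqs. (7) and (8) as they are read in this section.

```python
from math import sqrt
from statistics import mean


def abnormal_returns(actual, market, alpha, beta):
    """Eqs. (2)-(3): AR_it = R_it - (alpha_i + beta_i * R_mt)."""
    return [r - (alpha + beta * m) for r, m in zip(actual, market)]


def residual_variance(residuals):
    """s_i^2 in Eq. (7): sample variance of estimation-window residuals."""
    centre = mean(residuals)
    return sum((e - centre) ** 2 for e in residuals) / (len(residuals) - 1)


def average_abnormal_returns(ar_by_firm):
    """Eq. (5): AAR_t = (1/N) * sum over firms of AR_it."""
    n = len(ar_by_firm)
    return [sum(day) / n for day in zip(*ar_by_firm)]


def cumulative(values):
    """Running sum, used for CAR (Eq. 4) and CAAR (Eq. 6)."""
    total, out = 0.0, []
    for value in values:
        total += value
        out.append(total)
    return out


def t_aar(aar_t, variances):
    """Eq. (7): AAR_t divided by (1/N) * sqrt(sum of s_i^2)."""
    n = len(variances)
    return aar_t / (sqrt(sum(variances)) / n)


def t_car(ar_by_firm, variances, start, end):
    """Eq. (8): (1/sqrt(N)) * sum_i [ sum_t AR_it / (s_i * sqrt(m)) ]."""
    n = len(ar_by_firm)
    m = end - start + 1
    total = sum(
        sum(ar[start:end + 1]) / (sqrt(var) * sqrt(m))
        for ar, var in zip(ar_by_firm, variances)
    )
    return total / sqrt(n)


# Hypothetical inputs: two firms, a three-day event window.
ar_demo = abnormal_returns([0.011, -0.023], [0.01, -0.02], alpha=0.001, beta=1.0)
ar_by_firm = [[0.01, -0.02, 0.03], [0.03, 0.00, -0.01]]
variances = [
    residual_variance([0.02, -0.02, 0.0]),
    residual_variance([0.01, -0.01, 0.0]),
]

aar = average_abnormal_returns(ar_by_firm)
caar = cumulative(aar)
t_first_day = t_aar(aar[0], variances)
t_whole_window = t_car(ar_by_firm, variances, 0, 2)
print(aar, caar, t_first_day, t_whole_window)
```

Each t-value would then be compared with the critical value for the chosen significance level, or converted to a p-value, before deciding whether the announcement had a significant effect.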
3 Empirical Results
3.1 Building Sub-Industry
As Table 2 shows, the AR of the building sub-industry is significantly positive on Day t-5. It turns negative on Day t-4, and the negative correction is largest on Day t-3, which is the lowest AR in the event window. This indicates that investors reacted in advance. On the event day the AR is significantly negative, and it turns significantly positive on Day t+1. Over Days t+1 to t+5 the ARs sum to 1.9219%, indicating a positive effect on the building sub-industry after the event and suggesting that investors had overreacted.
As Table 3 shows, the CAR is significantly positive on Day t-5. From Day t-3 it trends downward and becomes significantly negative, which again indicates that investors reacted in advance. The CAR then remains significantly negative until the end of the event window and is lowest on the event day, which indicates a delayed reaction by investors.
Table 2 AR of the 45 companies in the building sub-industry in response to the change in the epidemic alert announcement
Day   AR(%)        t-value     Day   AR(%)        t-value
-5    0.7279***    3.9314      +1    1.6448***    8.8842
-4    -0.6026***   -3.2548     +2    -0.4496**    -2.4284
-3    -1.9937***   -10.7683    +3    0.3882***    2.0967
-2    -0.1087      -0.5869     +4    0.2139       1.1551
-1    0.0019       0.0105      +5    0.1246       0.6728
0     -1.8393***   -9.9344
Note: ***, ** and * represent statistical significance at the 1%, 5% and 10% levels, respectively
Table 3 CAR of the 45 companies in the building sub-industry in response to the change in the epidemic alert announcement
Day   CAR(%)       t-value     Day   CAR(%)       t-value
-5    0.7279***    3.9314      +1    -2.1695***   -4.4291
-4    0.1253       0.4784      +2    -2.6191***   -5.0016
-3    -1.8684***   -5.8264     +3    -2.2310***   -4.0166
-2    -1.9770***   -5.3393     +4    -2.0171***   -3.4452
-1    -1.9751***   -4.7709     +5    -1.8925***   -3.0821
0     -3.8144***   -8.4109
Note: ***, ** and * represent statistical significance at the 1%, 5% and 10% levels, respectively
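The link between Tables 2 and 3 can be checked with a short worked example. The values below are copied from the two tables; the variable names are hypothetical. The last step is an observation about the reported numbers rather than something the text states: the CAR t-values appear to equal the CAR divided by a constant daily standard error times the square root of the number of accumulated days.

```python
from itertools import accumulate
from math import sqrt

# Table 2, AR(%) for the building sub-industry, ordered Day -5 ... Day +5.
ar = [0.7279, -0.6026, -1.9937, -0.1087, 0.0019, -1.8393,
      1.6448, -0.4496, 0.3882, 0.2139, 0.1246]
# Table 3, CAR(%) and t-values, same day order.
car_reported = [0.7279, 0.1253, -1.8684, -1.9770, -1.9751, -3.8144,
                -2.1695, -2.6191, -2.2310, -2.0171, -1.8925]
t_reported = [3.9314, 0.4784, -5.8264, -5.3393, -4.7709, -8.4109,
              -4.4291, -5.0016, -4.0166, -3.4452, -3.0821]

# CAR is the running sum of AR from the start of the event window.
car_rebuilt = list(accumulate(ar))
max_car_gap = max(abs(a - b) for a, b in zip(car_rebuilt, car_reported))

# Sum of the five post-event ARs (Days +1 to +5), quoted in Sect. 3.1.
post_event_sum = sum(ar[6:])

# Implied daily standard error from the Day -5 row, where m = 1.
se_daily = car_reported[0] / t_reported[0]
t_rebuilt = [c / (se_daily * sqrt(m))
             for m, c in enumerate(car_reported, start=1)]
max_t_gap = max(abs(a - b) for a, b in zip(t_rebuilt, t_reported))

print(round(post_event_sum, 4), max_car_gap, max_t_gap)
```

The rebuilt CARs and t-values differ from the reported ones only by rounding in the fourth decimal place, and the post-event sum reproduces the 1.9219% figure in Sect. 3.1.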
3.2 Construction Sub-industry
As Table 4 shows, the construction sub-industry has a positive AR on Day t-5. The next day the AR turns negative but is not statistically significant. On Day t-3 the AR drops sharply and is significantly negative. The AR is insignificant on Days t-2 and t-1 and is significantly negative again on the event day. This indicates that investors reacted in advance. On Day t+1 the AR is significantly positive. Over the following four days the AR alternates between positive and negative with no fixed direction.
As Table 5 shows, Days t-5 and t-4 are the only two days in the event window whose CARs are not statistically significant (0.9620% and -0.0529%). From Day t-3 onward, investors reacted in advance: the CARs for nine consecutive days are all significantly negative. The largest correction occurs on the event day, when the CAR is the lowest of the 11 days. This indicates a delayed reaction by investors.
Table 4 AR of the 10 companies in the construction sub-industry in response to the change in the epidemic alert announcement
Day   AR(%)        t-value     Day   AR(%)        t-value
-5    0.9620       1.9713      +1    1.5137***    3.1018
-4    -1.0149      -2.0797     +2    -0.1326      -0.2718
-3    -2.8564***   -5.8529     +3    -0.4307      -0.8825
-2    0.1590       0.3257      +4    0.3629       0.7435
-1    -0.2961      -0.6067     +5    -0.4372      -0.8959
0     -2.5497***   -5.2246
Note: ***, ** and * represent statistical significance at the 1%, 5% and 10% levels, respectively
Table 5 CAR of the 10 companies in the construction sub-industry in response to the change in the epidemic alert announcement
Day   CAR(%)       t-value     Day   CAR(%)       t-value
-5    0.9620       1.9713      +1    -4.0824***   -3.1617
-4    -0.0529      -0.0767     +2    -4.2150***   -3.0536
-3    -2.9093***   -3.4418     +3    -4.6457***   -3.1731
-2    -2.7503***   -2.8178     +4    -4.2828***   -2.7751
-1    -3.0464***   -2.7916     +5    -4.7201***   -2.9161
0     -5.5961***   -4.6813
Note: ***, ** and * represent statistical significance at the 1%, 5% and 10% levels, respectively
3.3 Other Sub-industry
As Table 6 shows, the AR is significantly positive on Day t-5. It reverses to negative on Day t-3, which is the lowest AR of the 11 days in the event window. This indicates that investors reacted in advance. The AR remains significantly negative on the event day and turns positive on Day t+1. After that, the AR has no fixed direction.
As Table 7 shows, the CAR is significantly positive on Day t-5. It turns negative on Day t-3 and keeps falling until the event day, when the CAR is the lowest of the 11 days. This indicates that investors reacted in advance. From the event day to the end of the event window, only the negative CAR on Day t+5 is not statistically significant, and the CAR on the other days remains significantly negative. This indicates a delayed reaction by investors.
Table 6 AR of the 5 companies in the other sub-industry in response to the change in the epidemic alert announcement
Day   AR(%)        t-value     Day   AR(%)        t-value
-5    0.8983**     2.0464      +1    1.1263**     2.5658
-4    0.0090       0.0205      +2    -0.6496      -1.4798
-3    -1.8540***   -4.2235     +3    0.0532       0.1211
-2    -0.5511      -1.2553     +4    -0.0143      -0.0325
-1    -0.2002      -0.4561     +5    0.1558       0.3549
0     -1.3535***   -3.0834
Note: ***, ** and * represent statistical significance at the 1%, 5% and 10% levels, respectively
Table 7 CAR of the 5 companies in the other sub-industry in response to the change in the epidemic alert announcement
Day   CAR(%)       t-value     Day   CAR(%)       t-value
-5    0.8983**     2.0464      +1    -1.9252*     -1.6576
-4    0.9073       1.4616      +2    -2.5748**    -2.0737
-3    -0.9467      -1.2451     +3    -2.5216*     -1.9147
-2    -1.4977*     -1.7059     +4    -2.5359*     -1.8268
-1    -1.698*      -1.7298     +5    -2.3801      -1.6348
0     -3.0515***   -2.8379
Note: ***, ** and * represent statistical significance at the 1%, 5% and 10% levels, respectively
4 Results and Suggestions
The Covid-19 epidemic has seriously disrupted the global economy and daily life, and has even threatened human life. Taiwan set up a guidance center to guide people in dealing with this shock. This study takes Taiwan's listed construction industry as a sample and tests how the Guidance Center's announcements of an upgrade or downgrade of the epidemic alert affected the abnormal returns of the industry's stocks. The empirical results show that all three sub-industries (building, construction, and others) reacted before the upgrade and downgrade announcements, which suggests that investors anticipated the events. The reactions of the three sub-industries after the announcements, however, differed. These results can serve as a reference for investors' investment decisions.
Acknowledgements. The authors thank the anonymous reviewers for their valuable suggestions.
Reference
1. Department of Disease Control, Ministry of Health and Welfare: Covid-19 epidemic prevention key decision-making timeline. Retrieved on 13 February 2022 from https://covid19.mohw.gov.tw/ch/sp-timeline0-205.html
The Business Model of Cross-Border E-Commerce: Source Globally, Sell Globally
Ying-Li Lin1 and Shih-Chieh Lin2(B)
1 Department of Finance, Asia University, No. 500, Lioufeng Road, Wufeng Dist., Taichung 41354, Taiwan [email protected]
2 Department of Business Administration, Asia University, No. 500, Lioufeng Road, Wufeng Dist., Taichung 41354, Taiwan [email protected]
Abstract. Alibaba Group Holding Limited was established in 1999 with the mission statement "Lower barriers for all businesses." In 2012, the US government declared that big data would be the new oil of the future. In 2013, Lei Jun, the founder of Xiaomi, pointed out that the concept of e-commerce will not exist in the future because all businesses will be e-commerce businesses. The purpose of cross-border e-commerce is to satisfy consumers' demand as quickly as possible. Through big data, cross-border e-commerce businesses can define their operating direction by understanding consumers' preferences for services and goods, studying consumer satisfaction and repeat-purchase rates, and segmenting products well. This study examines a case company and analyzes its click-and-mortar business model. Through a deconstructed five forces analysis, we carry out a SWOT analysis of the business and identify the common ground between the enterprise and its consumers. The study can serve as a reference for e-commerce businesses drawing up operating plans, so that consumers' needs can be met and wealth can be created, which is a win-win situation for the whole economy.
1 Introduction
With the advancement of information technology, the Internet has become an indispensable part of modern life. Mobile devices have pushed Internet use to another peak, and daily life is now closely and intricately linked to the Internet. As Internet technology has matured, people's consumption behavior and consumption patterns have changed significantly, and various types of e-commerce have emerged in response. Today, sourcing globally and selling globally is the trend of the cross-border e-commerce model. To maintain the competitive advantage needed for sustainable operation, continuous innovation is the best way for enterprises.
"Innovation" is the core driving force of the cross-border e-commerce industry. To innovate is to rearrange resources to satisfy customers' needs and create value for customers. Peter Drucker, the father of modern management, once said, "If you don't innovate, you perish." Joseph Schumpeter, a classical economist, was the pioneer of the concept of innovation. He believed that innovation refers to the effective use of resources. If an enterprise meets the needs of the market in an innovative way, it will sustain its growth momentum. In the 21st century, innovation has become a decisive factor in the competitiveness of enterprises. For maintaining growth and competitive advantage, "Service Innovation" has become an increasingly popular and important topic.
The Internet is an important starting point for innovation, and the innovation trend of cross-border e-commerce in particular has great potential for development in China. From entrepreneurs to capital markets, everyone is looking for the next model of China's e-commerce market. Besides B2B† and B2C‡, O2O§ (Offline to Online/Online to Offline) is also a target of e-commerce, because O2O can make up for the lack of consumption experience when consumers purchase online. After purchasing online, consumers can go to a physical store to pick up the goods or experience the service, for example through a group-buying platform. Vertical e-commerce, such as fresh food e-commerce and cosmetics and medical beauty e-commerce, emerged in 2013 and gave vertical e-commerce companies a new development opportunity.
The rapid development of mobile shopping is mainly due to the spread of network infrastructure, the increasingly affordable prices of smart devices, and the accelerating expansion of their functions, especially after the full rollout of 4G networks provided faster wireless access. In addition, tablets and smartphones have gradually become the main devices for accessing the Internet, so online shopping is gradually shifting from computers at home or at work to mobile devices that can be used for shopping anytime and anywhere. More secure transactions also promote the rapid development of mobile online shopping.
This study focuses on the cross-border e-commerce business of the case company, takes B2B, B2C and O2O as the business models, and analyzes how the platform integrates these cross-border e-commerce models through a lower tax base, authentic-product marketing and gift strategies. The case company develops a niche market in the cross-border e-commerce industry. In view of the above, the purposes of this study are as follows:
(1) Studying the trend of cross-border e-commerce and providing guidance for enterprises entering the industry.
(2) Discussing the strategies of cross-border e-commerce in the process of dynamic evolution, and then grasping the market opportunities.
(3) Analyzing the individual case company, which provides B2B, B2C and O2O business models and develops niches and competitiveness, and which thereby occupies a place in the cross-border e-commerce industry in China or globally.
† B2B (Business to Business) is the exchange of products, services and information between enterprises through the Internet.
‡ B2C (Business to Customers) is a business-to-consumer e-commerce model. This form of e-commerce is generally dominated by online retailing, which mainly uses the Internet to carry out online sales activities.
§ The O2O marketing model, also known as the online-offline business model, refers to online sales and online purchases driving offline operations and offline consumption. O2O delivers information about offline stores to Internet users through discounts, information provision, and service reservations, thereby converting them into the stores' own offline customers. It is especially suitable for goods and services that must be consumed in stores.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 335–344, 2023. https://doi.org/10.1007/978-3-031-35836-4_35
2 Literature Review
2.1 Definition and Application of B2B
B2B, Business to Business, refers to e-commerce models of cooperation or transaction between enterprises (Lu Decong, Min 96). Nath et al. (1998) pointed out that electronic data interchange has more mature security technology than the Internet, while electronic data interchange using Internet technology has a lower cost, and B2B has generally developed along this model. In addition, Lu Chengzhi (Min 87) held that e-commerce emphasizes integrated operation and shared benefits between enterprises, and that buyers and sellers often need further negotiation before a B2B transaction can be completed. The concepts of before the event, during the event, and after the event are therefore introduced into the internal operating processes of enterprises, and B2B focuses on cooperating with upstream and downstream businesses in the operating process to jointly create B2B benefits.
2.2 Definition and Application of B2C
B2C, Business to Customers, means that enterprises provide customers with various transactions and services through the Internet. After connecting to the Internet with computers, tablets, mobile devices or other tools, customers can obtain real-time online services, including product catalog inquiry, product support, real-time information reports, and online ordering. Guynes, Vedder, and Vanecek (1997) held that the development of the Internet has increased the number of companies adopting e-commerce, and Chen Shiyun (Min 89) pointed out that how to attract consumers who only browse but do not buy is an important issue for all online shopping businesses. Laudon and Traver (2002), who put forward the seven characteristics of B2C, emphasized the importance of popularization, transnationality, openness, richness, density, interaction, and customization (for example, offering large or small sizes).
2.3 Definition and Application of O2O
O2O was first proposed by Rampell (2010). Analyzing Groupon, OpenTable, Restaurant.Com and SpaFinder, he found that these companies had in common that they promoted online-to-offline commerce, and he defined the mode as an "online payment, offline experience" business (Online to Offline), O2O for short, a business term of the same kind as B2C and B2B. Its consumption mode is "consumers enjoy services or obtain goods in physical stores but pay online". In the O2O virtual-real integrated business model of Li Caiting (Min 103), consumers in the physical world interact with virtual business service systems through mobile digital devices such as tablets or mobile phones and are then led to physical channels for consumption, so as to integrate the virtual and the real. O2O can now be subdivided into four operating modes:
online transaction to offline consumption experience (Online to Offline);
offline marketing to online transaction (Offline to Online);
offline marketing to online transaction to offline consumption experience (Offline to Online to Offline);
online transaction or marketing to offline consumption experience and then to online consumption experience (Online to Offline to Online).
The overall concept is to integrate online and offline resources so as to bring online customers into offline consumption and to guide offline customers into online consumption.
3 Case Study and Research Methods
3.1 Case Profile
The case enterprise owns cross-border e-commerce platforms that integrate business flow, cash flow, information flow, logistics, human resources, professional teams, and government policies. Target customers are gathered through virtual channels and physical platforms. The virtual channels integrate mobile barcodes, including QR codes, with social network resources to connect and communicate with consumers and to enhance the brand image of the website in their minds. The social network resources include community platforms, Rotary Clubs, Lions Clubs, JCI (Junior Chamber International), supercar clubs, lectures for middle-level and high-level management of listed companies, gatherings of luxury house owners, and EMBA platforms for senior managers. These channels rapidly increase the number of members and attract customers whose annual income is in the top 20% of the pyramid. To meet customers' expectations, to satisfy and impress them, and to attract these exclusive consumers to become members, the case enterprise offers free member benefits, brand discounts and boutique sales, and also serves members through a unique yacht club (the physical platform) to strengthen loyalty to the shopping platforms.
Through the synergy of the case enterprise, big data resources generate corresponding economic value in the B2B, B2C and O2O business models. The relationship between the case enterprise, its members, and its upstream and downstream cooperative enterprises is shown in the Business Model Canvas (BMC) analysis below. The cooperative suppliers provide products to the case enterprise. The case enterprise cooperates with various channels (as mentioned above, online virtual channels combined with social networks, and offline physical platforms such as the financial securities industry, private banks, private wealth management centers, and personal financial management associations) to promote business, so that customers can enjoy free gifts, brand discounts, high-quality products, membership rewards, or exclusive services, which attracts more potential consumers. Reputation marketing signals trustworthiness, so more customers are willing to become members of the cross-border e-commerce platform of the case enterprise and to pay by transfer, card, or third-party payment systems for secure online transactions. At the same time, the case enterprise passes the order information to the cooperative suppliers so that they can deliver products to members, or notifies members that they can enjoy services in the physical stores. When members have received the products and services to their satisfaction, the platform deducts its handling fee from the payment and transfers the remainder to the cooperative suppliers to complete the transaction.
3.2 Business Model Diagram Analysis
The Business Model Canvas (BMC) was developed by Osterwalder and first proposed by Pigneur et al. in the book "Business Model Generation" in 2010. The business model diagram is a visual form that describes the value proposition, infrastructure, customers and finance of the enterprise, and it helps the enterprise align its activities by illustrating the potential trade-offs. The nine elements, namely "value proposition", "customer relationship", "target customers", "channels", "key activities", "partners", "core competencies", "revenue source" and "cost structure", can be used to plan business operations and profits, and they help the case enterprise record and review existing business models or develop new ones.
4 Analysis Results
Osterwalder et al. (2010) pointed out that the business model covers four dimensions: "customer", "offer", "infrastructure" and "financial viability". The BMC diagram captures the core business model of the case company, and its nine elements shape that model. The case enterprise can analyze and plan its operations and profits element by element and arrange the nine elements logically, so the Business Model Canvas (BMC) can help the case enterprise develop, discuss, design and then operate a business model. This study uses the nine elements of BMC to explore the business operation blueprint of the case enterprise and to gain an in-depth understanding of how the elements interweave and influence each other. The case enterprise is mapped onto the BMC model as follows.
4.1 Value Propositions
In addition to B2B and B2C, the business model of the case enterprise focuses on "e-commerce O2O". It has also formed alliances with top channels and with different industries, such as a mutually beneficial integration with VIP wealth management banks, and it uses high-tech systems to overturn tradition. This consumption mode gives merchants more publicity and display opportunities and lets them measure and track the effect of promotion and truly grasp the habits of consumers. It thereby saves considerable cost and speeds up sales of new products and consumption in new stores, so that the choice of store location is no longer a problem. Consumers, for their part, can obtain real-time and richer business or product information, as well as more favorable prices and better services than in physical channels. From the above, synergy is the most important value of the cross-border e-commerce platform of the case enterprise.
4.2 Core Competencies (Key Resources)
The case enterprise has a professional management team that provides comprehensive planning of activities, projects and the website, launches activities in conjunction with relevant celebrations, and has dedicated specialists who improve the website, which enhances the corporate image. The strong information technology of the case enterprise should not be underestimated either. The design of the website and its programs is highly polished, and the site is equipped with augmented reality images that make customers feel as if they were on the scene. The enterprise also runs multi-faceted platforms such as entertainment, which shows that its network layout and sources of funds are substantial. The case enterprise has integrated resources for the O2O marketing of its business model and has stable relationships with physical channels. It therefore has rich resources and strong integration capabilities, and leveraging these strengths creates wide possibilities for the enterprise.
4.3 Key Activities
The case enterprise provides an online transaction platform that integrates food, clothing, housing, transportation, and entertainment, so members can enjoy the services of suppliers both in physical stores and through the online platform. Besides this assistance, the case enterprise collects feedback from customers so that it can meet their needs and encourage more partners to take part in its activities. This raises partners' brand awareness and recognition and thereby their operating income, while members enjoy better service.
4.4 Partners (Key Partnerships)
Partners can be divided into four categories: strategic alliances between non-competitors, joint ventures, competitive alliances, and buyer-seller relationships. To operate its business models effectively, the case enterprise establishes various partnerships for various reasons. The cooperative suppliers, which come from different industries, have established solid mutual-assistance relationships with the case enterprise, and the channel dealers that cooperate with the case enterprise are large in scale.
4.5 Target Customers (Customer Segments)
The case enterprise targets high-consumption groups whose income is in the top 20% of the pyramid. Customers come from virtual channels and physical platforms. The virtual channels, such as social network platforms, are used to advertise or to win customers by word of mouth. The physical platforms, such as private banks, the financial securities industry, wealth management centers, and private clubs, are used to expand the niche markets quickly.
4.6 Channels
The case enterprise provides customers with an optimized O2O trading platform and follows a route of virtual and real integration, in which the virtual part is the network platform and the physical part is mainly the physical channel. Existing customers are brought onto the online platform through messages sent to mobile phones, so that they order online and consume offline. Online customers are attracted to offline stores, and mobile barcodes (QR codes) in physical stores bring offline customers online and increase the number of members. These channels create a win-win strategy for partners.
4.7 Customer Relationships
The case enterprise is a platform for cross-industry integration, so information and data can be exchanged smoothly and business can flow effectively. For logistics, the case enterprise uses its ERP and CRM systems to send vouchers, which can be received and used immediately after payment. The main cash flow of the case enterprise comes from shareholders' own joint venture funds, strategic investors and venture capital funds. There is no inventory cost problem, and the quality of goods is relatively stable, which gives the case enterprise a stable relationship with its customers.
4.8 Sources of Income (Revenue Streams)
The income of the case enterprise is diversified. It includes e-commerce in food, clothing, housing, transportation, and entertainment, as well as income from lifetime membership of the yacht club, yacht leasing, trading of new and used yachts, ship management, related leasing, education courses and training, and related products.
4.9 Cost Structure
The main costs of the case enterprise include personnel expenses, rental expenses, management expenses, marketing expenses, and start-up expenses, among which marketing expenses account for the largest proportion.
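The nine elements discussed in Sects. 4.1 to 4.9 can be gathered into a simple record for review or comparison. The sketch below is only an illustration: the class name and field names are hypothetical, and the entries are short paraphrases of the sections above, not additional data about the case enterprise.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BusinessModelCanvas:
    """One field per BMC element; each holds short descriptive entries."""
    value_propositions: tuple
    key_resources: tuple
    key_activities: tuple
    key_partnerships: tuple
    customer_segments: tuple
    channels: tuple
    customer_relationships: tuple
    revenue_streams: tuple
    cost_structure: tuple


# Entries paraphrase Sects. 4.1-4.9 of the case analysis.
canvas = BusinessModelCanvas(
    value_propositions=("e-commerce O2O alongside B2B and B2C",
                        "alliances with top channels and other industries",
                        "richer product information and more favorable prices"),
    key_resources=("professional management team",
                   "information technology, including augmented reality images",
                   "stable relationships with physical channels"),
    key_activities=("online transaction platform", "collecting customer feedback"),
    key_partnerships=("cooperative suppliers from different industries",
                      "large-scale channel dealers"),
    customer_segments=("high-consumption customers in the top 20% of income",),
    channels=("online platform", "physical channels", "QR codes in physical stores"),
    customer_relationships=("vouchers sent through ERP and CRM systems",),
    revenue_streams=("e-commerce sales", "yacht club lifetime memberships",
                     "yacht leasing and trading", "courses and training"),
    cost_structure=("personnel", "rent", "management", "marketing", "start-up"),
)

# A quick completeness check: every element should have at least one entry.
empty_blocks = [f.name for f in fields(canvas) if not getattr(canvas, f.name)]
print(len(fields(canvas)), empty_blocks)
```

Keeping the canvas in a structured form like this makes it easier to record successive versions of a business model and to see which elements changed between them.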
Y.-L. Lin and S.-C. Lin
5 Conclusion
Against the background of global information transparency in modern e-commerce, enterprises in every country must develop cross-border e-commerce to seize the opportunity if they want to keep up with the big waves of international trade in the new era. The two key elements of cross-border e-commerce for the case enterprise are “Fast” and “Excellent Value”: it meets consumer needs as quickly as possible and uses B2B, B2C and O2O as its business models, attracting middle- and upper-class consumers to join membership programs through exclusive yacht clubs. The case enterprise has seven major niches: a professional management team, strong information technology, extensive contacts, smooth cash flow, fast logistics, a complete and innovative business model, and cross-strait government policy support, spanning the cross-strait markets of China and Taiwan. This study uses the Business Model Canvas (BMC) to explore nine elements of cross-border e-commerce companies: value proposition, customer relationships, target customers, channels, key activities, partners, core competencies, revenue sources, and cost structure. The aim is to help cross-border e-commerce companies develop, discuss, design, and operate a business model that maintains a more stable membership relationship and improves the business model structure of the industry.

Acknowledgments. This research was supported by the Ministry of Science and Technology of the Republic of China under contract MOST 111-2420-H-468-001.
Impact of SARS and COVID-19 on Taiwan’s Tourism Industry
Ying-Li Lin1, Shih-Chieh Lin2, Kuei-Yuan Wang3(B), and Ching-Lun Lin3
1 Department of Finance, Asia University, No. 500, Lioufeng Road, Wufeng District, Taichung 41354, Taiwan
[email protected]
2 Department of Business Administration, Asia University, No. 500, Lioufeng Road, Wufeng District, Taichung 41354, Taiwan
3 Department of Finance, Asia University, No. 500, Lioufeng Road, Wufeng District, Taichung 41354, Taiwan
[email protected]
Abstract. This study examines the impact of SARS and COVID-19 on Taiwan’s tourism industry, as well as the impact of COVID-19 on the world, and compares the two epidemics to observe how their effects and the resulting stock price shocks differ. The data are taken from the Taiwan Economic Journal (TEJ) database. The research sample consists of Taiwan’s listed and OTC-traded tourism companies, and the analysis uses the event study method with four event days.
1 Introduction
SARS broke out in early 2003, and the World Health Organization officially named the coronavirus the “SARS virus” on April 16, 2003. Taiwan’s stock market was hit hard: as the epidemic spread, the index fell from 6242 points in March 2003 to 4139 points in April 2003, a drop of about 33.7%. The novel coronavirus disease (COVID-19) broke out at the end of 2019; major global stock markets plummeted, and the pandemic caused a decline in GDP growth rates. For these reasons, this study examines the impact of SARS and COVID-19 on the stock prices of Taiwan’s tourism industry. The major event day for SARS is April 24, 2003, when Taipei Peace Hospital was closed due to an outbreak of SARS infection, the first hospital closure due to SARS in Taiwan. The major event days for COVID-19 are January 22, 2020 (the first case of novel coronavirus pneumonia in Taiwan was confirmed on January 21, 2020), April 19, 2020 (the first case of the cluster infection in the Navy Friendly Aviation Training Detachment occurred on April 18, 2020), and January 13, 2021 (the first case of the Taoyuan Hospital cluster infection occurred on January 12, 2021). Using the event study method, the study finds that both SARS and COVID-19 had a significant and serious impact on the stock prices of Taiwan’s tourism industry. The impact of SARS was reflected relatively faster than that of COVID-19; in other words, COVID-19 had the more serious impact on the stock prices of Taiwan’s tourism industry.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 345–350, 2023. https://doi.org/10.1007/978-3-031-35836-4_36
2 Literature Review
The COVID-19 pandemic has directly affected people’s economic behavior, and economic production activities have also been affected. According to surveys, Taiwan’s consumer confidence index fell steadily from the beginning of 2020 and did not stop falling until June. The business climate test points for industry also began to decline at the beginning of 2020 and gradually recovered after May (first in the service industry, followed by the manufacturing and construction industries). The degree of recession differs across industries, and so does the impact of COVID-19 on the recovery of representative industries. Goodell (2020) and Yarovaya et al. (2020) proposed that the COVID-19 pandemic may have a significant impact on the operation of the financial sector. Baker et al. (2020) compared the response of the US stock market to various infectious diseases and found that COVID-19 caused unprecedented volatility. Similarly, Zhang, Hu and Ji (2020) studied the volatility of ten stock markets in January and February 2020, when most of the confirmed cases occurred, and found that volatility increased significantly in February due to COVID-19. Alfaro et al. (2020), using US data, found that stock market values declined in response to epidemics such as COVID-19 and SARS. Al-Awadhi et al. (2020) also found that overall stock prices in China fell because of the expected adverse economic results. Ashraf (2020a) studied data from 64 countries and found that stock markets reacted negatively to the COVID-19 outbreak, but the response was statistically significant only for the increase in the number of confirmed cases, not for the increase in the number of deaths. Ashraf (2020b) found that strict social distancing measures can greatly weaken the negative reaction of the stock market to the increase in confirmed cases of COVID-19. In addition, Corbet et al. 
(2020) studied the impact of having “corona” in a company’s name on stock returns and found that when COVID-19 was declared a pandemic, there was a strong negative hourly return effect and a very large increase in hourly volatility. Similarly, Sharif et al. (2020) found that the impact of the COVID-19 pandemic on US geopolitical risk and economic uncertainty is greater than its impact on the US stock market.
3 Research Methods
3.1 Event Study Method
The event study methodology examines whether a specific event affects the abnormal returns of corporate securities. The specific event may affect an entire securities market, an industry or an individual security; examples include changes in policies implemented by government agencies, shocks caused by fluctuations in overall economic data, economic events in a specific industry, and the release of an individual company’s financial information. The event study in this paper proceeds in four steps: (1) setting the event date, (2) defining and estimating the abnormal rate of return, (3) testing the abnormal rate of return, and (4) analyzing and explaining the results. The event date is the day when Taiwan’s Ministry of Health and Welfare first released a major message, and the event dates of this study are summarized in Table 1. In this study, the event period is set as 10 days before and after the event day, a total of 21 days. The estimation period is set at 150 days, from 160 days before the event day to 11 days before it.

Table 1. Major event days
Event code   Event day    Event
E1           2003/4/24    Taipei Peace Hospital closed due to SARS
E2           2020/1/22    The first case of novel coronavirus pneumonia in Taiwan was confirmed on January 21, 2020
E3           2020/4/19    The first case of cluster infection in the Friendly Aviation Training Detachment of the Republic of China Navy occurred on April 18, 2020
E4           2021/1/13    The first case of cluster infection at Taoyuan Hospital of the Ministry of Health and Welfare occurred on January 12, 2021
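The estimation and event windows of Section 3.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say which expected-return model is used, so the sketch assumes a market model (an ordinary least squares regression of stock returns on market returns). The names `ols_fit`, `abnormal_returns` and `half_window` are hypothetical and are not taken from the database's event study module.

```python
from statistics import mean


def ols_fit(x, y):
    """Ordinary least squares for y = alpha + beta * x (assumed market model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


def abnormal_returns(stock, market, event_idx,
                     est_start=-160, est_end=-11, half_window=10):
    """Return (AR, CAR) lists over the event period t = -10 ... +10.

    stock, market: daily returns on the same trading days.
    event_idx: position of the event day (t = 0) in both lists.
    """
    # Estimation period: t = -160 ... -11, i.e. 150 trading days.
    lo, hi = event_idx + est_start, event_idx + est_end + 1
    # Event period: t = -10 ... +10, i.e. 21 trading days.
    first, last = event_idx - half_window, event_idx + half_window + 1
    if lo < 0 or last > len(stock):
        raise ValueError("not enough observations around the event day")

    alpha, beta = ols_fit(market[lo:hi], stock[lo:hi])

    # Abnormal return = actual return minus the return the model expects.
    ar = [stock[t] - (alpha + beta * market[t]) for t in range(first, last)]

    # Cumulative abnormal return = running sum of AR over the event period.
    car, total = [], 0.0
    for value in ar:
        total += value
        car.append(total)
    return ar, car
```

With daily return series for one sample company and a market index, `abnormal_returns(stock, market, event_idx)` would give that company's abnormal and cumulative abnormal returns for one of the event days in Table 1.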
3.2 Data Sources and Samples
To ensure the completeness and availability of the research data, the research subjects are tourism companies listed on the Taiwan Stock Exchange (TSEC) that have complete financial, equity structure and stock market information; companies whose financial statements belong to a different industry category are excluded. The research period is April 2003 and December 2019 to February 2021. There are 9 sample companies, as shown in Table 2. Financial information is obtained from the Taiwan Economic Journal (TEJ) database, while non-financial information is obtained from company annual reports and the Market Observation Post System.

Table 2. Sample list
Company (stock code)   Industry
Wanqi (2701)           Tourism
Huayuan (2702)         Tourism
State Guest (2704)     Tourism
Luk Fook (2705)        Tourism
First Store (2706)     Tourism
Regent (2707)          Tourism
Phoenix (5706)         Tourism
Xintiandi (8940)       Tourism
Hao Le Di (9943)       Tourism
4 Empirical Results
4.1 Verification of the Average Abnormal Rate of Return (AR) of E1 on the Event Day
The average abnormal returns at t = 1 and t = 2 were 4% and 4.3%, respectively, and both reached the highly significant level. The test results show that the stocks of Taiwan’s tourism industry reacted to the severe acute respiratory syndrome event with an immediate and significant decline, which lasted until the sixth day after the event day; the average abnormal rate of return gradually stabilized from the tenth day after the event day.

4.2 Verification of E1 Average Cumulative Abnormal Rate of Return (CAAR) on the Event Day
The average cumulative abnormal rate of return before the event day is less significant. After the event day, however, it is mostly negative and highly significant.

4.3 Verification of E2 Average Abnormal Rate of Return (AR) on the Event Day
The average abnormal rate of return on event day E2 reached the highly significant level, and negative abnormal returns appeared as early as six days before the event day. Although the greatest impact occurred on the event day, the reaction was not completed within one day: it continued on the first, second, fourth, sixth and seventh days after the event day, and the average abnormal rate of return tended to stabilize from the eighth day after the event day.

4.4 Verification of E2 Average Cumulative Abnormal Rate of Return (CAAR) on the Event Day
The average cumulative abnormal rate of return before the event day is significant, and after the event day it is mostly negative and highly significant.

4.5 Verification of the Average Abnormal Rate of Return (AR) of E3 on the Event Day
The average abnormal rate of return on event day E3 is -0.9%, which is not statistically significant. Three days before the event (t = -3), the abnormal rate of return is significantly positive. The first, fifth, seventh and eighth days after the event day show positive abnormal returns that reach the highly significant level, implying that the impact of the novel coronavirus pneumonia on Taiwan’s tourism industry was gradually fading and that the trend was improving.

4.6 Verification of E3 Average Cumulative Abnormal Rate of Return (CAAR) on the Event Day
The average cumulative abnormal rate of return for E3 is positive from t = -9 onward, and its highest point falls at t = 8 (CAR = 14.7%, p < 0.01). The significant intervals fall between t = -8 and t = 0 and between t = 5 and t = 10, indicating that Taiwan’s tourism industry was affected by event E3 over a long period.

4.7 Verification of E4 Average Abnormal Rate of Return (AR) on the Event Day
The day before event day E4 saw a 1.5% drop that reached the significant level, and the fifth day after the event day saw a 2% drop that reached the highly significant level. The test results show that negative average abnormal returns appeared not only after the event day but also before it. The reaction was not completed at once, and the decline was significant on the second and fifth days after the event day.

4.8 Verification of E4 Average Cumulative Abnormal Rate of Return (CAAR) on the Event Day
Negative abnormal returns began to accumulate seven days before the event day and reached their maximum negative cumulative value on the ninth day after the event day; most of the cumulative values are negative and highly significant.
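The average abnormal returns and average cumulative abnormal returns reported in Section 4 are cross-company averages, each tested for significance on every day of the event period. The sketch below shows one common way to compute them. It assumes a simple cross-sectional t-test, which is an assumption made here for illustration, since the text does not state which test statistic was used. The function name `aar_caar_tstats` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def aar_caar_tstats(ar_by_firm):
    """Compute AAR, CAAR and cross-sectional t-statistics per event day.

    ar_by_firm: list of per-company abnormal return lists, all covering
    the same event days (at least two companies are required).
    Returns (aar, t_aar, caar, t_caar), each a list with one value per day.
    """
    n_firms = len(ar_by_firm)
    n_days = len(ar_by_firm[0])

    # Cumulative abnormal return of each company: running sum of its AR.
    car_by_firm = []
    for ar in ar_by_firm:
        total, car = 0.0, []
        for value in ar:
            total += value
            car.append(total)
        car_by_firm.append(car)

    def cross_section(series_by_firm):
        means, tstats = [], []
        for day in range(n_days):
            column = [series[day] for series in series_by_firm]
            avg = mean(column)
            # Standard error of the cross-sectional mean (assumed test).
            se = stdev(column) / sqrt(n_firms)
            means.append(avg)
            tstats.append(avg / se if se > 0 else float("nan"))
        return means, tstats

    aar, t_aar = cross_section(ar_by_firm)
    caar, t_caar = cross_section(car_by_firm)
    return aar, t_aar, caar, t_caar
```

The per-company lists passed in would be the abnormal returns estimated for each of the nine sample companies around one event day. A large negative t-statistic on a given day corresponds to the "significant negative" reactions described in the subsections above.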
5 Conclusion
This study explores whether SARS and COVID-19 affected the price fluctuations of stocks related to Taiwan’s tourism industry, and whether the abnormal rate of return reached the significant level. Using the event study module of the Taiwan Economic Journal (TEJ) database, the 10 days before and after the event day are taken as the event period and the 150 days before the event period as the estimation period, and the abnormal rate of return and the cumulative abnormal rate of return are estimated. The empirical results show that the first case of COVID-19 caused negative abnormal returns, and this effect continued after the event day. In April of the same year, when the Friendly Aviation training team had cluster infections, the cumulative abnormal returns of only two companies reached the significant level on the event day, and after the event day the cumulative abnormal rate of return reached the significant level only on some days for companies such as Jinghua (Regent), Huayuan, Xintiandi, Luk Fook, and Guobin (State Guest), which shows that this event did not cause significant abnormal returns for the other companies. When a cluster infection occurred at Taoyuan Hospital of the Ministry of Health and Welfare in the following year, negative abnormal returns were generated. The abnormal rate of return fell on the event day and reached the highly significant level, and the effect was gradually reflected after the event day.
References
Al-Awadhi, A.M., Al-Saifi, K., Al-Awadhi, A., Alhamadi, S.: Death and contagious infectious diseases: impact of the COVID-19 virus on stock market returns. J. Behav. Exp. Finance 27, 100326 (2020)
Ashraf, B.N.: Stock markets’ reaction to COVID-19: cases or fatalities? Res. Int. Bus. Financ. 54, 101249 (2020a)
Ashraf, B.N.: Economic impact of government interventions during the COVID-19 pandemic: international evidence from financial markets. J. Behav. Exp. Finance 27, 100371 (2020b)
Baker, S., Bloom, N., Davis, S.J., Kost, K., Sammon, M., Viratyosin, T.: The Unprecedented Stock Market Reaction to COVID-19. Covid Economics: Vetted and Real-Time Papers 1 (2020)
Chen, J.: Research on the Key Success Factors of the Second Southern International Airport. Evergreen School of Management, Master Thesis (1988)
Chien, G.C.L., Law, R.: The impact of the Severe Acute Respiratory Syndrome on hotels: a case study of Hong Kong. Int. J. Hosp. Manag. 22(3), 327–332 (2003)
Corbet, S., Hou, Y., Hu, Y., Lucey, B., Oxley, L.: Aye Corona! The contagion effects of being named Corona during the COVID-19 pandemic. Financ. Res. Lett. 101591 (2020)
Goodell, J.W.: COVID-19 and finance: agendas for future research. Financ. Res. Lett. 35, 101512 (2020)
Huang, M.: Research on airline service quality evaluation – application of fuzzy multi-criteria decision-making method “Yong” culture university. Master’s Thesis 1984 of the Republic of China (1984)
Li, S.: Performance evaluation of maritime mass transportation operations and services – application of fuzzy multi-criteria evaluation. Master’s Thesis, Jiaotong University (1988)
Lin, L.: Ecological benefit evaluation of coastal plant landscape structure. Master’s Thesis, Tunghai University (1991)
Sharif, A., Aloui, C., Yarovaya, L.: COVID-19 pandemic, oil prices, stock market, geopolitical risk and policy uncertainty nexus in the US economy: fresh evidence from the wavelet-based approach. Int. Rev. Financ. Anal. 70, 101496 (2020)
Wang, G.: Research on the establishment of tourism risk assessment model – application of fuzzy multi-criteria decision-making method. Master’s Thesis, Cultural University (1983)
Xu, X.: Research on the application of fuzzy multi-criteria decision-making method to the performance evaluation of museum promotion activities – taking the National Palace Museum as an example. Master’s Thesis, Culture University, Republic of China 86
Yahoo Stock Market: Benefiting from the epidemic, digital medical investment has boomed. https://reurl.cc/Q3Vzjq (2020)
Yahoo Stock Market: A few bright spots in the epidemic: the two major advantages of biotech groups support investment value. https://reurl.cc/14kyZX (2020)
Yarovaya, L., Brzeszczynski, J., Goodell, J.W., Lucey, B.M., Lau, C.K.: Rethinking financial contagion: information transmission mechanism during the COVID-19 pandemic. SSRN Electron. J. (2020)
Yunsen, Y.: Research on fuzzy multi-criteria evaluation of China’s international commercial port investment in tourism and recreation enterprises – taking Keelung Port as an example. Master’s Thesis, National Ocean University (1988)
Zeng, Y.: Research on the evaluation of urban park green space location from the viewpoint of landscape ecology – Taichung City Dongfeng Park and Fengle Park as examples. Master Thesis, Tunghai University, Republic of China 88
Zhang, D., Hu, M., Ji, Q.: Financial markets under the global pandemic of COVID-19. Financ. Res. Lett. 36, 101528 (2020)
Zheng, Q.: Research on the Business Strategy of Domestic Airlines after the Opening of Taiwan’s High Speed Rail. Master’s Thesis, Chang Jung University (1991)
Using the Balanced Scorecard to Analyze Bank Operational Performance – Comparison of Domestic and Foreign Banks
Ying-Li Lin1, Shih-Chieh Lin2(B), and Ya-Yun Yang3
1 Department of Finance, Asia University, No. 500, Lioufeng Road, Wufeng District, Taichung 41354, Taiwan
2 Department of Business Administration, Asia University, No. 500, Lioufeng Road, Wufeng Dist., Taichung 41354, Taiwan
[email protected]
3 San Ho Elementary School, No.1, Chunting 2nd Street, Daya District, Taichung City 428405, Taiwan
Abstract. Given that the banking industry in Taiwan cannot be changed greatly, an important issue facing management is how to allocate resources more efficiently to the key factors that offer the greatest competitive advantage, in order to improve business performance. The key factors of the balanced scorecard are used as a diagnostic tool to understand the bank’s own strengths and weaknesses, identify the key factors of its competitive advantage, grasp its core competitiveness, examine its shortcomings, and seek the competitive advantage it needs to survive in the future. Through interviews with Banks A and B, this study found that: 1. The balanced scorecard is a framework for thinking, and the focus should be on developing the causal relationships related to the value creation strategy. 2. Enterprises that implement the balanced scorecard must think from the perspective of the whole enterprise and provide comprehensive business intelligence, data acquisition, analysis, aggregation and effective query processing capabilities, in order to supply the information that the balanced scorecard requires. 3. The balanced scorecard needs quantitative indicators. 4. Comprehensive data analysis is needed.
1 Introduction
Given that the banking industry in Taiwan cannot be changed greatly, an important issue facing management is how to allocate resources more efficiently to the key factors that offer the greatest competitive advantage, in order to improve business performance. The key factors of the balanced scorecard are used as a diagnostic tool to understand the bank’s own strengths and weaknesses, identify the key factors of its competitive advantage, grasp its core competitiveness, examine its shortcomings, and seek the competitive advantage it needs to survive in the future. Through interviews with Banks A and B, this study found that: 1. The balanced scorecard is a framework for thinking, and the focus should be on developing the causal relationships related to the value creation strategy. 2. Enterprises that implement the balanced scorecard must think from the perspective of the whole enterprise and provide comprehensive business intelligence, data acquisition, analysis, aggregation and effective query processing capabilities, in order to supply the information that the balanced scorecard requires. 3. The balanced scorecard needs quantitative indicators. 4. Comprehensive data analysis is needed.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 351–360, 2023. https://doi.org/10.1007/978-3-031-35836-4_37
2 Literature Review
2.1 The Nature of the Balanced Scorecard
The concept of the balanced scorecard is that, without excluding financial indicators from the enterprise management system, the enterprise should also actively identify the performance drivers that will contribute to future financial results. These are leading indicators that may be related to financial results, such as customer satisfaction, efficient process arrangements, and employee quality, ability, and morale, so that the short-term performance of business operations tends to be consistent with long-term strategic goals. After the performance measurement mechanism of the balanced scorecard was proposed, most early research focused on private profit-making institutions. Kaplan and Norton (1993) began to apply the balanced scorecard to industries such as oil, computers, electronics, banking, finance, and insurance. These cases show that balanced scorecard measurement indicators are not meant to replace the existing measurement tools of enterprises, but to help enterprises balance the gap between current actions and future results more effectively.

2.2 Four Aspects of Balanced Scorecard
2.2.1 Financial Aspect
The financial aspect is the business result that the shareholders of the company care about, and it is also the ultimate goal of the company. Although financial indicators are lagging indicators, they can directly encourage corporate members and motivate long-term performance. The ultimate goals of the other three aspects are also linked to the financial aspect (Kaplan and Norton, 1993). At the same time, whether the enterprise can operate efficiently and achieve its goals at minimum cost is also a financial measurement that it must continue to ensure. Therefore, the measurement indicators of the balanced scorecard in the financial aspect take into account both financial and non-financial factors.

2.2.2 Customer Aspect
The customer aspect refers to achieving the corporate vision by maximizing customer satisfaction. The purpose of business operation is to acquire target customers and deliver products and services to them. Products and services must therefore be unique and different from those of competitors in order to create value for target customers and allow customers to return revenue to the company; at the same time, improving customer satisfaction can also bring fruitful financial results to the company. The customer and financial aspects of the balanced scorecard thus both describe the desired results of the enterprise development process. In addition, the balanced scorecard not only presents the external and internal balance of the enterprise, but also satisfies the needs of both external and internal customers.

2.2.3 Internal Process Aspect
The internal process aspect establishes an operation process design that can satisfy customers and shareholders (Kaplan and Norton, 1993), covering innovation, operation, after-sales service and other operational processes, with the aim of obtaining the process design that yields the best financial profit. It provides customers with the products or services they need according to their new or potential needs, and provides them with follow-up inquiry or maintenance services. Its strategic priorities determine which business and operation processes are critical for helping the organization provide value to customers, design and develop new products, open up new markets and customer sources, and meet the new needs of customers.

2.2.4 Learning and Growth Aspect
The excellence and performance that an organization must achieve depend on the goals set in the financial, customer, and internal process aspects, and their foundation lies in the efforts made in the learning and growth aspect. In other words, the learning and growth aspect is what drives the first three aspects to achieve superior outcomes (Kaplan and Norton, 1993). Therefore, to establish a balanced scorecard, three key factors should be strengthened: employee capabilities; information system capabilities; and motivation, empowerment, and alignment. The enterprise should also consider how to build a performance measurement framework for learning and growth.
3 Research Design

This chapter explains the research design: the first section covers the research method, the second the introduction of the banking industry, the third the research object, the fourth the research tool, the fifth the implementation process, and the sixth the information processing and analysis.

3.1 Research Methods

3.1.1 Case Study Method

The case study is a scientific, empirical method of inquiry. It refers to the extensive collection of archives and records on single or multiple individuals, families, enterprises, or institutional groups throughout the research process, so as to understand the current environment, background and enterprise development process related to the
Y.-L. Lin et al.
research subject, using paper-based question-and-answer methods, direct observation, written documents, file records, in-depth interviews, and similar means to explore and understand the crux of the research subject's internal operating conditions and external influences, and to propose appropriate methods or suggestions for solving problems in a timely manner.

3.1.2 In-Depth Interview

Qualitative research includes many different research methods, such as the in-depth interview method, the Delphi method, and the content analysis method. The method must be chosen according to the research situation. The data source of this research is mainly the opinions of customers, and the interviewees must be guided through dialogue to express their own feelings, so the in-depth interview method is chosen as the research method of this paper.

3.2 Research Object

This study conducts initial telephone interviews with executives of local Bank A and foreign Bank B through semi-structured interviews. The executives of local Bank A and foreign Bank B are taken as case studies to gain a deeper understanding of how they use the Balanced Scorecard. Respondents were coded as A1, B1 and B2. Within one week after the interviews, the authors collated the respondents' opinions and reconfirmed unanswered questions or doubtful content with the interviewees, so as to clarify their real views and advance this research. The interview methods and the respondents' positions, which underpin the reliability and validity of the in-depth interviews, are shown in Table 1.

Table 1. Interview methods and positions

Respondent ID   Interview method   Department                              Position
A1              Field interview    Marketing Department                    Associate
B1              Field interview    Industry Credit Investigation Section   Project manager
B2              Field interview    Sales Department                        Manager
Source: Organized by this study
Three managers were interviewed in this study. All three were male, all were between 41 and 50 years old, and all had more than 10 years of seniority. The detailed information is shown in Table 2.

3.3 Case Company Profile

3.3.1 Introduction to Bank A

Since its establishment, Bank A has adhered to the traditional spirit of “solid operation and serving the public”. It has gone through the difficult era of Japanese rule,
Table 2. Respondent Profile

Basic information   A1                B1                B2
Gender              Male              Male              Male
Age                 41–50 years old   41–50 years old   41–50 years old
Education level     Master            Master            Colleges and universities
Seniority           11 years          15 years          12 years
Source: Organized by this study
the recovery of Taiwan, and the economic transformation after the relocation of the national government. It has played a pivotal role in revitalizing the economy, promoting industrial upgrading, and improving people’s living standards and international status. Under the government’s policy of promoting financial liberalization, internationalization, and the development of Taiwan as a regional financial center, Bank A upholds its business philosophy of “service”, “efficiency”, and “innovation”, and is committed to innovating its business, improving service standards, and expanding the scope of its services, in order to provide more convenient and more comprehensive financial services to industrial and commercial enterprises and the general public.

3.3.2 Introduction to Bank B

Bank B follows the trend of global financial development and aims to expand economies of scale, realize operational synergy, and strengthen the market competitiveness of its cross-industry financial operations. Based on benefit considerations such as cross-industry cross-selling, sharing of group customers, improvement of group management efficiency, and tax incentives, Bank B is moving towards becoming an integrated financial service provider. In terms of business strategy, it will continue to carry out cross-selling, improve asset quality, advance into the mainland market, and expand capital; continue to strengthen information integration and resource sharing among the subsidiaries of the group; establish customer-centric business departments, strengthen customer-oriented target segmentation, and promote the advantages of cross-industry sales; and, on the premise of increasing the company’s profits and shareholders’ rights and interests, actively seek various strategic alliances and cooperation opportunities, expand its financial territory, and enhance the core competitiveness of the group as a whole.
4 Empirical Results

4.1 Situation of the Four Aspects of the Balanced Scorecard

Respondent A1
  Financial aspect: Growth index of assets and liabilities; fund position status; reserve position; liquidity index; loan classification analysis; deposit classification analysis; interest rate sensitivity index; exposure index; overdue release and collection analysis
  Customer aspect: Provide easy service procedures; high-quality products/services; execute specific customer plans; creative solution; provide complete value-added services
  Internal process aspect: R & D costs; internal turnover of new projects; time to market for new products/services
  Learning and growth aspect: Linkage between corporate goals and personal goals; communication channels and plans between employees and enterprises; performance appraisal system; hierarchical responsibility authorization system; leadership development

Respondent B1
  The company’s system is to construct a balanced scorecard performance indicator system, so the data source and calculation method required for each aspect indicator of the balanced scorecard are the basic operations for the development of this system. The target value and actual value of each measurement index are important basis for designing data warehouse

Respondent B2
  Financial aspect: Net income; AR fee income; import and export fee income; Taiwan Dollar Survival Operating Volume; operating volume in foreign currency
  Customer aspect: The number of customers of joint loan sponsors; new customers; valuable customer ratio; cross-sell completion rate
  Internal process aspect: New overdue loan ratio; audit score; number of new product development; job rework rate
  Learning and growth aspect: License rate; key talent turnover rate; completion rate of training courses
Based on the interview results of Bank A and Bank B, this research identifies the following common points of Balanced Scorecard construction:

4.1.1 Emphasis on Causality Related to Value Creation Strategies

The Balanced Scorecard is a framework for thinking. The focus should be on developing the causal relationships related to the value creation strategy. Enterprises should use the Balanced Scorecard to clarify their vision and strategy and to identify these causal relationships. Once the causal relationships are clearly established, the enterprise can use them to define the aspects it really needs and then develop the measurement indicators, so that implementation has the correct direction and the balanced scorecard delivers its full benefit.

4.1.2 Identify and Quantify the Valuable and Meaningful Measurement Indicators

The most important concept of the Balanced Scorecard is that “if you can’t measure it, you can’t manage it”, and the key to measurement is quantification. What cannot be quantified cannot be used to measure performance. KPIs exist to measure strategy. If an indicator is important to the execution of the strategy, it must be identified. Conversely, if an indicator already exists but is not very important to value creation, it should be discarded.

4.1.3 The Feedback Mechanism of the Balanced Scorecard Must Be Combined with the Internal Processes of the Enterprise

For the balanced scorecard to become a continuous cycle, the enterprise itself must combine the feedback mechanism of the balanced scorecard with its internal processes or systems, for example by modifying the original reward system, budget system,
and other related action plans, in order to ensure the strategic competitiveness of the enterprise.

4.1.4 Balanced Scorecard Indicators Need Data Collection

Companies that implement the Balanced Scorecard must think from the perspective of the overall enterprise to provide comprehensive and holistic business intelligence. The ability to capture, analyze, aggregate, and effectively process inquiries can provide all the information required by the Balanced Scorecard. Therefore, when designing the data warehouse, the indicator requirements of the balanced scorecard must be considered. Some data can be obtained from the operational databases of related internal systems, while other data are external. To collect the external data, it is best to design data tables into which users can enter data, and then transfer the data in batches on a regular basis.

4.1.5 Data Analysis

Analysis is the core element of the performance management cycle. When users analyze and solve problems through the balanced scorecard system, they need to be able to drill down and up between “overall performance” and the various measurement indicators, trace back to the “raw data”, or extract the data of a specific period for the necessary statistical analysis, in order to find the root cause and support management decisions.
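The drill-down and drill-up behaviour described in Sect. 4.1.5 can be illustrated with a minimal Python sketch. The record layout, the sample figures, and the helper names `drill_up` and `drill_down` are hypothetical and chosen only for illustration; they are not taken from the systems of Bank A or Bank B. The achievement ratio (actual divided by target) is likewise an assumed summary measure.

```python
from collections import defaultdict

# Hypothetical raw records: (aspect, indicator, period, target, actual).
RAW_DATA = [
    ("Customer", "New customers", "2023-01", 100, 90),
    ("Customer", "New customers", "2023-02", 100, 110),
    ("Customer", "Cross-sell completion rate", "2023-01", 50, 40),
    ("Financial", "Net income", "2023-01", 200, 220),
]


def drill_up(rows, group_by=()):
    """Aggregate target and actual values over the given column positions.

    group_by=() gives overall performance, (0,) one entry per aspect,
    and (0, 1) one entry per indicator. The ratio actual/target is an
    assumed summary measure, not one defined in the paper.
    """
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        key = tuple(row[i] for i in group_by)
        totals[key][0] += row[3]
        totals[key][1] += row[4]
    return {key: (t, a, a / t) for key, (t, a) in totals.items()}


def drill_down(rows, aspect=None, indicator=None, period=None):
    """Return the raw records behind an aggregate, optionally for one period."""
    return [
        r for r in rows
        if (aspect is None or r[0] == aspect)
        and (indicator is None or r[1] == indicator)
        and (period is None or r[2] == period)
    ]


overall = drill_up(RAW_DATA)            # overall performance
by_aspect = drill_up(RAW_DATA, (0,))    # one level down: per aspect
customer_jan = drill_down(RAW_DATA, aspect="Customer", period="2023-01")
```

Keeping the raw records as the single source and computing every higher level from them is one simple way to guarantee that an overall figure can always be traced back to its underlying data.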
5 Conclusion

This research focuses on local Bank A and foreign Bank B, and analyzes their characteristics and current development status. Based on the concept of the balanced scorecard, its four aspects are used: the financial aspect, the customer aspect, the internal process aspect, and the learning and growth aspect. Based on these four aspects, appropriate performance evaluation indicators and a sound performance evaluation system are established. The common conclusions of this study are as follows:

5.1 Consider the Balanced Scorecard Indicators

Companies that implement the Balanced Scorecard must think from the perspective of the overall enterprise to provide comprehensive and holistic business intelligence. The ability to capture, analyze, aggregate, and effectively process inquiries can provide all the information required by the Balanced Scorecard. Therefore, when designing the data warehouse, the indicator requirements of the balanced scorecard must be considered. Some data can be transferred from the operational databases of related internal systems, while other data are external. To collect the external data, it is best to design data tables into which users can enter data, and then transfer the data in batches on a regular basis.
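As a rough illustration of the user-input tables and periodic batch transfer recommended above, the following Python sketch uses the standard `sqlite3` module. The table names `staging_input` and `warehouse_fact`, their columns, and the sample values are assumptions made for this example only; the study does not describe the actual schema of either bank.

```python
import sqlite3


def create_tables(conn):
    """Create a hypothetical user-input staging table and a warehouse table."""
    conn.executescript(
        """
        CREATE TABLE staging_input (
            indicator   TEXT NOT NULL,
            period      TEXT NOT NULL,
            value       REAL NOT NULL,
            transferred INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE warehouse_fact (
            indicator TEXT NOT NULL,
            period    TEXT NOT NULL,
            value     REAL NOT NULL,
            source    TEXT NOT NULL
        );
        """
    )


def enter_value(conn, indicator, period, value):
    """Record one externally sourced figure typed in by a user."""
    conn.execute(
        "INSERT INTO staging_input (indicator, period, value) VALUES (?, ?, ?)",
        (indicator, period, value),
    )
    conn.commit()


def batch_transfer(conn):
    """Move all not-yet-transferred staging rows into the warehouse table."""
    with conn:  # commits on success, rolls back on error
        pending = conn.execute(
            "SELECT COUNT(*) FROM staging_input WHERE transferred = 0"
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO warehouse_fact (indicator, period, value, source) "
            "SELECT indicator, period, value, 'user_input' "
            "FROM staging_input WHERE transferred = 0"
        )
        conn.execute("UPDATE staging_input SET transferred = 1 WHERE transferred = 0")
    return pending


conn = sqlite3.connect(":memory:")
create_tables(conn)
enter_value(conn, "Valuable customer ratio", "2023-Q1", 0.42)
enter_value(conn, "Audit score", "2023-Q1", 88.0)
first_run = batch_transfer(conn)   # transfers the two pending rows
second_run = batch_transfer(conn)  # nothing left to transfer
```

In practice `batch_transfer` would be called by a scheduler at the chosen interval; marking rows as transferred keeps a repeated run from loading the same figures twice.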
5.2 Integration with Existing Enterprise Systems

A competent balanced scorecard system should be able to mine and integrate useful information from the enterprise's existing systems and transfer it to a visual application system, saving management time and ensuring data consistency. In this way, decision makers and employees who apply the Balanced Scorecard can skip the data editing and processing stage. Business managers can observe changes in operating performance from a multi-dimensional perspective, gain a clearer understanding of the direction of execution, grasp timing and business opportunities, and make the best operational decisions promptly.

5.3 Systematic Architecture

This study has devoted much attention to interpreting the concept of the balanced scorecard and the process of applying it to the case objects. It is also necessary to consider what the balanced scorecard means for management practice, namely its position relative to the various management techniques in use today. The Balanced Scorecard has an integrating effect. The strategy and the Balanced Scorecard can guide enterprises towards the management techniques that suit their needs, and each technique supports one or several aspects of the Balanced Scorecard. The information generated is fed back to the strategy to review its assumptions and serves as a reference for future revisions.
References

Chen, M.: The Balanced Scorecard is used in the performance evaluation of police agencies. Master thesis of Institute of Industrial Engineering, Chung Yuan University (2003)
Chen, Z.: Research on Performance Evaluation Indicators of Non-profit Organizations: Taking 300 Major Foundations in Taiwan as Examples. Master’s Thesis, Institute of Resource Management, Academy of Defense Administration (2002)
Chen, Y., Zheng, H.: Balanced scorecard in China trust. Acc. Res. Monthly 183, 84–96 (2001)
Delaney, J.T., Huselid, M.A.: The impact of human resource management practices on perceptions of organizational performance. Acad. Manag. J. 39(4), 949–969 (1996)
Evans, H., Gary, A., Mike, C., Andrew, D., David, T.: Exploiting activity based information: easy as ABC. Manag. Account. 74(7), 24–29 (1996)
Gao, H.: Planning and Design of Balanced Scorecard – Taking Keelung Port Authority as an example. Master’s thesis of the Institute of Shipping Management, National Taiwan Ocean University (2000)
He, J., Chen, C., Cha, Q.: An empirical study on the relationship between the incentive combination of senior executives and corporate performance in the ownership structure of listed companies. J. Hunan Univ. 19(6), 55–59 (2005)
Kang, J.-K., Shivdasani, A.: Firm performance, corporate governance, and top executive turnover in Japan. J. Finan. Econ. 38(1), 29–58 (1995)
Kaplan, R., Norton, D.: Putting the balanced scorecard to work. Harv. Bus. Rev. 1, 134–147 (1993)
Kaplan, R.S., Norton, D.P.: The strategy-focused organization: how balanced scorecard company thrive in the new business environment. The Intern. Auditor. Altamonte Springs 59(1), 21–40 (2002)
Kirk, J., Miller, M.: Reliability and Validity in Qualitative Research. Sage, London (1988)
Li, S.: Pragmatic and innovative strategic performance evaluation. Mon. J. Acc. Res. 113, 15–23 (1995)
Liang, Y., Xie, X., Wang, Q.: A study on the application of balanced scorecard to the performance evaluation of information departments in public agencies: taking a public department as an example. Electron. Commer. Res. 6(4), 425–445 (2008)
Xu, S.: Towards Organizational Performance Evaluation in the Age of Innovation. Outsmart Culture, Taipei (2000)
Wu, W., Lin, W.: Research on the relationship between supervisor behavior characteristics, organizational culture, organizational learning style and business performance. Furen Manag. Rev. 9(1), 71–94 (2002)
The Influence of CEO/CFO Turnover on Company Value

Mei-Hua Liao1, Yen-Ju Chen1, Yun-Hsuan Tsai1, and Ya-Lan Chan2(B)

1 Department of Finance, Asia University, Taichung, Taiwan, Republic of China
[email protected]
2 Department of Business Administration, Asia University, Taichung, Taiwan, Republic of China
[email protected]
Abstract. Most existing studies focus on the short-term impact of CEO/CFO turnover on the company. However, the structural changes in personnel caused by the turnover of senior executives have a greater impact on the long-term value of the company. Our sample includes 21,689 firm-year observations representing 1,669 firms listed on the Taiwan Stock Exchange during 2000–2021. To explore the impact of CEO/CFO turnover on company value, we set up four hypotheses: (1) Voluntary CEO/CFO turnover increases company value more than involuntary turnover. (2) Involuntary CEO/CFO turnover is associated with worse prior performance, whereas voluntary CEO/CFO turnover is associated with better prior performance. (3) Foreign investors hold more shares in firms with high company value. (4) Hiring more independent directors improves company value.
1 Introduction

In recent years, cases of internal control failures and financial problems in companies have become more frequent, and academics have begun to explore the relationship between managers’ turnover and financial crises. In particular, Chang (2013) studies the impact of CFO turnover on company performance; this project extends that work to explore the relationship between the CFO, the CEO and the value of the company. Most academic research concerns the impact of CFO turnover on financial crises. Most of the relevant research covers a short period of time, and few studies directly examine the impact on the company’s value. However, the impact of senior executive turnover warrants more far-reaching and in-depth research, and its effect on firm value is worth discussing. Foreign professional investment institutions generally make investment decisions based on the long-term interests of the company. Therefore, the more shares they hold, the more optimistic they are about the future value of the company. Additionally, having more independent directors on the board, who are relatively unrelated to the company’s ownership, indicates that the company values expert opinions. This can be inferred to have a positive long-term impact on the company’s value. Therefore, it is worthwhile to explore the investment decisions of foreign professional investment institutions and their impact on the company’s value.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 361–368, 2023. https://doi.org/10.1007/978-3-031-35836-4_38
In the existing literature, there is limited research on the long-term impact of managers’ turnovers on companies’ value. This study aims to investigate the impact of voluntary and involuntary manager turnover, the level of foreign professional investment institutions’ shareholding, and the number of independent directors on the firm value. Specifically, we hypothesize that companies with voluntary manager turnover are more likely to enhance firm value than those with involuntary turnover. We also assume that involuntary manager turnover may negatively affect firm value, while voluntary manager turnover may have a positive impact. Additionally, we hypothesize that a higher level of shareholding by foreign professional investment institutions is positively associated with firm value, and that having more independent directors is also beneficial to the company’s value. Conducting a thorough investigation into these hypotheses can shed light on the factors that contribute to firm value and help firms make better decisions in terms of management and governance.
2 Literature Review

Turnover of the general manager or CFO can cause structural changes in a company that extend to the company’s value. Therefore, this study aims to investigate these changes.

2.1 The Literature on Corporate Value

Shih, Tsai, and Hsiang (2019) found that companies with larger boards and higher shareholdings by independent directors and supervisors fulfill their corporate social responsibilities better. The higher the manager’s shareholding ratio and the transparency of information disclosure, the higher the company’s value. This indicates that fulfilling corporate social responsibility has a positive impact on company value. Furthermore, Chen, Wang, and Hung (2017) added that companies with larger boards, more independent directors, or more institutional investors perform better in corporate social responsibility operations. A good corporate governance mechanism is essential for enhancing company value and operational performance through participation in corporate social responsibility. Chen and Chu (2021) found that, in general, the more investment profits from Mainland China companies repatriate to Taiwan, the more domestic funds can be utilized. Chen and Chu (2022) argue that better corporate governance evaluations strengthen the positive relationship between investment profits repatriated from Mainland China and company value, regardless of whether the company is profitable. Huang and Chen (2017) used Tobin Q as a measure of company value and found that tax avoidance is beneficial to shareholders and positively related to company value only in firms with better corporate governance. In addition, Huang, Lin, and Chang (2022) found that recently established family firms are not suited to using derivatives for hedging, because the limited funds invested in derivatives do not help increase company value.
However, in family firms where the chairman or CEO is not a family member, the most appropriate hedging strategies can be found to improve company performance. Conversely, when hedging strategies are developed by professional managers in family
firms, they can increase company value. Wang and Jang (2011) showed that as manager shareholdings increase, the interests of managers become more aligned with those of shareholders, thereby reducing agency problems and increasing company value. However, higher levels of family shareholdings may infringe on the rights of minority shareholders and lead to a decrease in company value. The wider the gap between control rights and cash flow rights, the lower the company value, indicating that agency problems are detrimental to company value.

Using a sample of Taiwanese listed companies, Lin, Zhu, and Hong (2006) studied the impact of CEO turnover on company performance from 1999 to 2001 and found that companies with voluntary turnover outperformed those that experienced forced turnover. However, Huang (2006) studied the impact of manager turnover from the beginning of 1996 to the end of 2002 and found that not all managers who experienced involuntary turnover had performed poorly. Distinguishing between voluntary and involuntary turnover, the results showed that managers were replaced involuntarily because of poor management performance, whereas managers who left voluntarily may have performed relatively well, so that replacing them had a negative impact and led to a decline in stock prices. That study observed company performance 30 days before and after the event, with a four-month estimation period, but the impact should be more long-term. Therefore, this project focuses on long-term company value.

2.2 Literature on Managerial Turnovers

Weisbach (1988) indicates that, compared to firms whose boards are led by insiders, firms led by outside directors are more likely to replace underperforming managers. Stock returns and changes in earnings can be used as standards to predict CEO resignation.
Goyal and Park (2002) suggest that companies in which the same person holds the positions of CEO and chairman of the board lack independent leadership, making it difficult for the board to remove poorly performing managers. Chang (2001) proposes a negative correlation between the probability of CEO turnover and relative industry performance, and a positive correlation between CEO compensation and relative industry performance. When industry competition is higher, the likelihood of CEO turnover increases. Chen (2003) argues that when the CEO also holds the position of chairman of the board or director, or when the shareholding ratio of external major shareholders is higher, the likelihood of CEO turnover is higher. However, the relationships between the external director shareholding ratio, the CEO shareholding ratio, and CEO turnover are not significant. Lee, Tang, and Lu (2020) verify empirically that companies with D&O insurance and better corporate governance mechanisms have higher value and receive positive evaluations from investors. Excessive insurance, on the other hand, has a negative impact: the greater the excess insurance amount, the higher the possibility of moral hazard among directors, supervisors, and management. Chang (2013) uses univariate tests, logistic regression, and multiple regression analysis to analyze listed and OTC companies in Taiwan from 2006 to 2012, and the results show that among companies experiencing CFO turnover, those with independent directors have better operating performance. Liu, Yang, Di and Li (2022) empirically examine the relationship between CFO tenure and earnings management based on data from A-share listed companies in China from 2009 to 2015. Campagnolo and Vincenti (2022) analyze the impact of cross-border
mergers and acquisitions on the performance of target companies by comparing cross-border mergers and acquisitions with acquisitions of local companies. They find that after a change of CEO in the acquired company, there are greater changes in regulations, values, and management practices, which benefit the company's performance. Liao, Lee, Chiang, and Chen (2020) point out that earnings management is generally more volatile when the CEO or CFO changes. According to their findings, if management's incentive to engage in earnings management is to be suppressed, an audit committee should be established, which makes the positive relationship between CEO turnover, CFO turnover, and earnings management less significant. Lin, Zhu, and Hong (2006) believe that companies with voluntary or no turnover in the position of general manager perform better than those with forced turnover. Company performance can be measured in different ways, but this study focuses on company value, hence the following hypothesis is proposed.

Hypothesis 1. Voluntary managerial turnover is more beneficial to company value than involuntary turnover.

However, Huang (2006) found that managers who are involuntarily replaced are often replaced because of poor performance. Replacing such managers can improve company management and subsequently raise stock prices. On the other hand, managers who leave voluntarily may have performed relatively well before their departure. Replacing these managers may therefore raise investor concerns that the successors will not perform as well as their predecessors, leading to a decrease in stock prices. That study mainly examines excess stock returns, which could also affect long-term company value. Therefore, the following hypotheses are proposed:

Hypothesis 2-a. Involuntary managerial turnover is associated with worse prior performance.

Hypothesis 2-b. Managers voluntarily leave when they are beneficial to company value.

Huang (2006) indicated that foreign institutional investors can supervise and balance managerial performance, which can enhance excess stock returns after involuntary managerial turnover. This study hypothesizes that foreign institutional investors make investment decisions with a long-term perspective, so that an increase in their shareholding is beneficial to company value. Thus, the following hypothesis is proposed:

Hypothesis 3. The more shares held by foreign institutional investors, the more beneficial it is to company value.

According to Chang (2018), larger board sizes and independent directors lead to fewer discretionary provisions for bad debt expenses. This indicates that the board has not effectively supervised the company’s operations. Normally, when companies are profitable, they reduce their bad debt provisions, whereas during periods of poor profitability they make more. When a bank’s capital reserve is sufficient, it is willing to make more provisions for bad debt because such reserves are the first line
of defense against credit risk in the banking industry. Insufficient provisions may lead to bank failures during economic downturns. Chang (2013) found that companies with independent directors have better managerial performance when CFO turnover occurs. This study hypothesizes that having more independent directors is beneficial to company value in the long term. Thus, the following hypothesis is proposed:

Hypothesis 4. The more independent directors a company has, the more beneficial it is to company value.
3 Research Methods

The data for this study were obtained from the Taiwan Economic Journal (TEJ) database, covering the period from the beginning of 2000 to 2021. The financial statement data were downloaded from TEJ’s IFRS Finance International Accounting Standards, while trading-related data such as closing prices and annual returns were downloaded from TEJ’s stock price database. Governance-related data were obtained from TEJ’s governance database, and basic company information was downloaded from TEJ’s Company DB database. All financial statement and trading data were collected on an annual basis. The sample excludes financial industries with different accounting practices, such as the banking, ticketing, insurance, financial holdings, securities, and futures industries, which removes 879 firm-year observations. In addition, 11,453 firm-year observations were excluded due to missing data or insufficient information, resulting in a final sample of 21,689 firm-year observations. As shown in Table 1, the industry with the highest proportion of the sample is the optoelectronics industry, accounting for 25.22% of all samples, with a total of 421 companies. The second largest industry is the semiconductor industry, accounting for 9.83% of all samples, with 164 companies, followed by the resource recycling industry, accounting for 7.79% of all samples, with 130 companies. To investigate the reasons and influencing factors for manager turnover, the four hypotheses mentioned above were established. We use the following two models to test them.

Model 1. Company Value (voluntary turnover managers) = Company Value (involuntary turnover managers)

Model 2.

$$\text{Tobin Q} = \alpha_0 + \alpha_1(\text{Manager Turnover Dummy}) + \alpha_2(\text{Company Size}) + \alpha_3(\text{Market-to-Book}) + \alpha_4(\text{Industry Dummy}) + \alpha_5(\text{Board Size}) + \alpha_6(\text{Debt Ratio}) + \alpha_7(\text{Independent Directors Ratio})$$
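To make the structure of Model 2 concrete, the following Python sketch estimates the coefficients α0 to α7 by ordinary least squares on synthetic data. It is only an illustration under stated assumptions: the paper does not name its estimation method, the data and the "true" coefficients in `TRUE_ALPHA` are invented, the industry effect is reduced to a single dummy, and the helper `ols` is a hypothetical function written for this example using the standard library only.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


random.seed(0)
# Invented coefficients alpha_0..alpha_7, used only to generate the data.
TRUE_ALPHA = [0.5, -0.2, 0.1, 0.3, 0.05, 0.4, -0.6, 0.8]


def synthetic_row():
    """One synthetic firm-year: intercept plus the seven Model 2 regressors."""
    return [
        1.0,                                    # intercept
        1.0 if random.random() < 0.3 else 0.0,  # manager turnover dummy
        random.uniform(4.0, 10.0),              # company size
        random.uniform(0.5, 5.0),               # market-to-book
        1.0 if random.random() < 0.5 else 0.0,  # industry dummy
        random.uniform(0.3, 1.4),               # board size
        random.uniform(0.0, 1.0),               # debt ratio
        random.uniform(0.0, 0.8),               # independent directors ratio
    ]


X = [synthetic_row() for _ in range(200)]
# Noise-free Tobin Q, so the estimates should reproduce TRUE_ALPHA.
y = [sum(a * x for a, x in zip(TRUE_ALPHA, row)) for row in X]
alpha_hat = ols(X, y)
```

Because the synthetic Tobin Q contains no noise, `alpha_hat` should match `TRUE_ALPHA` up to floating-point error; with real data the estimates would carry sampling error and would need standard errors to be interpreted.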
From the descriptive statistics in Table 2, Tobin Q, which represents company value, has a maximum of 52.3, a minimum of 0.02, and a standard deviation of 1.21. The average debt ratio is 0.41, with a standard deviation of 0.18. The average market-to-book ratio is 2.11, with a standard deviation of 10.53. We believe that analyzing the reasons for CEO/CFO turnover can help prevent damage to the company’s value. Furthermore, during the structural changes in personnel that accompany senior executive turnover, conditions more conducive to the company’s long-term performance can be constructed.
Table 1. Statistics on the size of each industry.

| Industry code and industry name | Number of Firms | % |
|---|---|---|
| M11A Cement Manufacturing | 7 | 0.42 |
| M12Z Other Foods | 28 | 1.68 |
| M14C Ready-Made Garments | 53 | 3.18 |
| M15D Automotive Components | 125 | 7.49 |
| M16 Electrical Wires | 16 | 0.96 |
| M17C Pharmaceuticals | 4 | 0.24 |
| M17G Medical Consumables | 125 | 7.49 |
| M17Z Other Chemicals | 41 | 2.46 |
| M18 Glass and Ceramic Products | 29 | 1.74 |
| M19A Paper Making | 7 | 0.42 |
| M20B Metal Products | 45 | 2.70 |
| M21A Tires | 11 | 0.66 |
| M23C Optoelectronics / IO | 421 | 25.22 |
| M23G Semiconductors | 164 | 9.83 |
| M23H Electronic Equipment | 83 | 4.97 |
| M23K Communication Equipment | 98 | 5.87 |
| M23T Information Communication Pathways | 35 | 2.10 |
| M23X Software Services | 37 | 2.22 |
| M25A Construction | 77 | 4.61 |
| M26C Freight Warehousing Industry | 22 | 1.32 |
| M27A Food and Beverage Industry | 39 | 2.34 |
| M28N Insurance Brokers | 1 | 0.06 |
| M29A Department Stores and Wholesale | 35 | 2.10 |
| M99D Public Utilities | 11 | 0.66 |
| M99K Shoes and Luggage | 25 | 1.50 |
| M99L Resource Recycling | 130 | 7.79 |
| Total | 1669 | 100.00 |
Table 2. Summary statistics.

| Variable | Maximum | Minimum | Average | Median | Standard Deviation |
|---|---|---|---|---|---|
| Tobin Q | 52.3 | 0.02 | 1.26 | 0.96 | 1.21 |
| Company Size | 9.59 | 4.24 | 6.65 | 6.57 | 0.63 |
| Board Size | 1.43 | 0.3 | 0.85 | 0.85 | 0.11 |
| Independent Director Ratio | 0.8 | 0 | 0.26 | 0.29 | 0.17 |
| Market-to-Book | 608.57 | −278.81 | 2.11 | 1.08 | 10.53 |
| Debt Ratio | 1 | 0 | 0.41 | 0.41 | 0.18 |
Acknowledgments. Constructive comments of editors and anonymous referees are gratefully acknowledged. This research is partly supported by the National Science Council of Taiwan (NSC 109-2813-C-468-099-H and NSC 110-2813-C-468-086-H).
The Relationships Between Underpricing and Turnover: The Study of Seasoned Equity Offerings

Chun-Ping Chang, Yung-Shun Tsai, Shyh-Weir Tzang, and Chih-Yun Liu

Department of Finance, Asia University, Taichung, Taiwan
Abstract. A cash capital increase not only affects a company’s shareholding structure but also causes a discount in stock prices. This study therefore explores the impact of cash capital increases on stock discounts and corporate performance. The sample consists of listed companies that carried out a cash capital increase between 2009 and 2018. Empirically, the study uses the Fama-French three-factor model and the abnormal return during the holding period to test the relationships among the discount on cash capital increases, company performance, and the stock turnover ratio. The empirical results are as follows:

1. The stock discount rate of a cash capital increase is affected by factors such as the stock turnover rate.
2. The abnormal stock return of a cash capital increase is affected by factors such as the stock turnover rate.
3. The excess return of a cash capital increase is affected by factors such as the stock turnover rate.
4. The excess return of cash capital increase stock is affected by the discount.

Keywords: cash capital · discount rate · abnormal return · Fama-French three-factor model
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 369–380, 2023. https://doi.org/10.1007/978-3-031-35836-4_39

1 Introduction

A cash capital increase not only changes the company’s equity capital but also affects financial leverage, net worth, and earnings per share, all of which are reflected in the stock price. There should therefore be a relationship between different types of cash capital changes and stock prices [14]. A listed company tends to choose a cash capital increase when its operations are recognized by the market and its stock price is performing well. During this period, the company tries to keep the stock price from falling, or even to push it up significantly, so that the capital increase can proceed smoothly. Whether the stock price then develops favorably or falls short of expectations, the outcome has a significant impact on investors’ returns and is therefore of great concern to them. However, few previous studies discuss the correlation between cash capital increases and stock performance, and whether there is really a positive
return on investing in stocks after the capital increase remains to be tested. In view of this, this study takes the cash capital increase event as its research topic. For listed companies in Taiwan that have increased capital in cash, it examines in depth the relationship among the abnormal return after the capital increase, stock turnover, and the underpricing of the offering.

Does a cash capital increase have a positive or a negative impact on the company’s performance and stock price? Some studies have pointed out that investors buy cash capital increase stocks because issuing at a discount creates an arbitrage opportunity, but a high discount may also cause the current value of the stock to be overvalued. Furthermore, investors who participate in the purchase are easily affected by the market’s negative reaction, which makes it more difficult to realize the expected profit [29]. In addition, some studies have found that newly listed companies experience a so-called “cash capital increase market” before their first capital increase, and that the long-term performance of new shares with a cash capital increase within three years of listing is significantly better than that of companies without one. The stock price performance of companies with a second capital increase is also significantly better than that of companies without one [23].

Since the literature indicates that a cash capital increase is closely related to the company’s stock price development and performance, this study takes Taiwan-listed companies that have carried out a cash capital increase as its sample and explores how the capital increase leads to a significant discount to the stock price. We also observe the correlation between cash capital increases and stock performance.
Cash capital increases in Taiwan also exhibit an underwriting price discount. [31] found that the discount is smaller when the company’s information asymmetry is low or the industry is booming. In addition, [32] pointed out that the longer the underwriting period for a cash capital increase, the greater the discount rate, and that the underwriting fee also affects the discount rate. This study therefore uses cash capital increases in the Taiwan stock market as a sample to test their impact on the company’s stock discount and performance. It tests the following hypotheses for companies carrying out a cash capital increase:

1. The stock discount range of a cash capital increase will be affected by stock turnover.
2. The abnormal return of a cash capital increase will be affected by stock turnover.
3. The excess return of a cash capital increase will be affected by stock turnover.
4. The excess return of cash capital increase stock will be affected by the discount.
2 Literature Review

2.1 Underpricing

A cash capital increase is currently one of the most common ways for companies to raise funds. Many studies have verified that when a company raises funds by issuing shares for cash (seasoned equity offerings; SEOs), the issue transmits a signal that the stock price is overvalued, causing market investors to react negatively [1, 3, 6, 7, 19–22]. In addition, because the underwriters understand the
needs of investors better, the company tends to issue at a discount in order to compensate them. To avoid moral hazard, the company also issues at a discount as compensation for the underwriter’s investment in providing information [4]. A discount further subsidizes the purchase risk borne by investors with incomplete information and helps avoid investor complaints caused by the company’s inaccurate profit forecasts [8, 27].

With regard to the motivation for discounting, the parity view offers a quite pragmatic explanation. Besides purchasing new shares at the underwriting price through subscription, investors can also obtain the issuing company’s existing shares in the secondary market at the market price. The profit from buying SEO shares and then selling them at the market price should stimulate the willingness to subscribe. When the underwriting price of an SEO is higher than the market price, investors withdraw from subscription and the sale of new shares fails. Discounting is therefore the most direct way to keep an SEO from attracting too little interest [28]. As information asymmetry becomes more serious, issuing companies should offer larger discounts [5, 18], so that investors do not later discover that the price was set too high, file a lawsuit, and damage the reputation of the underwriter.

The discount rate of the underwriting price for a cash capital increase varies with several factors. For example, when the industry is booming or the company’s information asymmetry is low, the discount rate is smaller [31]. In addition, some studies have pointed out that the longer the underwriting period for a cash capital increase, the greater the discount rate, and that the underwriting fee also affects the discount rate [32].
In addition, information asymmetry may cause investors to suffer losses when participating in a subscription. [25] proposed that information asymmetry exists between informed and uninformed investors. Informed investors hold special information and understand the true value of the issuing company, so they subscribe only to issues that they consider valuable and worth investing in, whereas uninformed investors subscribe randomly. To attract uninformed investors, the issuing company therefore often compensates for their possible investment losses by issuing at a discount. In view of the above, discounted issuance is common in cash capital increase subscriptions. This study therefore uses listed companies that carried out cash capital increases as a sample to explore the extent of the discount and the impact of cash capital increases on discounted issuance.

2.2 Performance

Issuing new shares for a cash capital increase is one of the financing channels most often used by enterprises. The reason Taiwanese investors most readily accept is usually the company’s pursuit of growth or the expansion of its business territory. Companies therefore often rely on Article 266 of the Company Law (“When a company issues new shares, the board of directors shall be carried out by the resolution of more than two-thirds of the directors present and the approval of more than half of the directors present”) to raise funds through a cash capital increase. The company usually raises the stock price before the capital increase to increase investors’
willingness to pay, or carries out the cash capital increase when the stock price is high, so that more funds can be raised and the capital increase completed [26]. In every investment, the investor pays most attention to the return, but the result may fall short of expectations. Past studies have pointed out that a company’s stock earns obvious excess returns in the first six months of a capital increase. Although stock performance continues to decline afterward, returns are not significantly different from zero, and the stocks of companies that increase capital again earn significant returns [30]. However, many studies also find that stock performance after SEOs is poor [1, 3, 6, 7, 12, 13, 19, 22], and past research tends to find that the stocks of American companies have significantly negative abnormal returns around the announcement of a cash capital increase. For example, [24] shows that U.S. firms have cumulative abnormal returns of about −2% over the period in which they announce a cash capital increase. [17] found that company managers prefer a cash capital increase when the stock price is high and repurchase stock when the price is undervalued, rather than basing the decision on the company’s future growth and profitability. The stock price may therefore start to fall after the capital increase, making the investment perform poorly. [15] examined the abnormal stock returns of the initial cash capital increases of Taiwan-listed OTC companies between 1985 and 2005 and tried to explain them with company size, beta, liquidity, the net value-to-market price ratio, momentum, the investment ratio, and individual risk; the empirical results show that companies implementing a cash capital increase for the first time have positive abnormal stock returns.
Whether the impact of cash capital increase on stock is positive or negative, it appears that cash capital increase has a significant impact on stock performance. In view of this, this study will discuss the impact of cash capital increase on stock performance.
3 Methodology

3.1 Model

We calculate the discount range by which the issue price falls below the market price when cash capital is increased, and then use the holding period method and the Fama-French three-factor model to measure the company’s stock performance and abnormal returns. We then regress the abnormal returns on the cash capital increase discount to explore whether there is a significant relationship between cash capital increase stocks and the various variables, and whether a cash capital increase affects the company’s financial performance.

Hypothesis 1. The stock discount rate of a cash capital increase will be affected by stock turnover.

A stock discount means that the company issues shares at a price lower than the current market price, and the discount rate measures the difference between the market price and the SEO underwriting price. Because the company usually publishes the underwriting announcement one day before the start of subscription for the new share issue, investors can already grasp the sales information and understand the discount
situation at this time. Therefore, this study follows the method of Wang (2014) for calculating the discount rate, that is, the gap between the SEO underwriting price and the average closing price of the company’s stock over the 20 days before the start of subscription. The regression used to test Hypothesis 1 and the definition of the discount rate are as follows:

$$UNP_j = a + a_1 RM_j + a_2 Turn_j + a_3 Capital_j + a_4 NET_j + a_5 Earn_j + e_j \tag{1}$$

$$UNP_{j,T} = 1 - \frac{UWP_j}{PM_{j,T}} \tag{2}$$

$UNP_{j,T}$: the discount rate of the jth SEO, measured against the average closing price of the company’s stock during the T days before the start of subscription. The larger the value, the higher the discount rate. T is measured in trading days, and this study sets T = 20.
$UWP_j$: the underwriting price of the jth SEO.
$PM_{j,T}$: the average closing price of the company’s stock for the jth SEO during the T days before the start of subscription.
$RM_j$: the average market return over the 20 days before and after the underwriting day of the jth company.
$Turn_j$: the average stock turnover rate over the 20 days before and after the underwriting day of the jth company.
$Capital_j$, $NET_j$, $Earn_j$: respectively the asset growth rate, net worth growth rate, and revenue growth rate of the jth company on the underwriting day.

Hypothesis 2. The abnormal return of stocks with a cash capital increase will be affected by stock turnover.

In this study, the short-term performance measurement period is set to one month after the start of subscription. It is mainly used to measure whether investors react unusually to cash capital increases, and it can also help investors draw up better trading strategies. This study uses two methods to measure stock performance. The first is the buy-and-hold method, which describes the investor’s overall profit or loss during the holding period. The return is adjusted by the performance of the market index to obtain the abnormal return of an individual stock during the holding period. When this abnormal return is greater than zero, the stock has outperformed the market benchmark index.

$$AR_{j,T} = \prod_{t=1}^{T}\left(1 + R_{j,t}\right) - \prod_{t=1}^{T}\left(1 + RM_{j,t}\right) \tag{3}$$

$$RM_{j,t} = \frac{RM_{TSE,t} \times MV_{TSE,t-1} + RM_{OTC,t} \times MV_{OTC,t-1}}{MV_{TSE,t-1} + MV_{OTC,t-1}} \tag{4}$$

$$AR_j = b + b_1 RM_j + b_2 Turn_j + b_3 Capital_j + b_4 NET_j + b_5 Earn_j + D_j + e_j \tag{5}$$
$AR_{j,T}$: the abnormal return of the jth SEO over a holding period of T days, measured against the market index as the benchmark and starting from the subscription start date (t = 1).
$R_{j,t}$: the return of the jth SEO on day t.
$RM_{j,t}$: the market index return corresponding to the jth SEO on day t; the TSE weighted index return and the OTC index return are combined using market-value weights.
$RM_j$: the average market return over the 20 days before and after the underwriting day of the jth company.
$Turn_j$: the average stock turnover rate over the 20 days before and after the underwriting day of the jth company.
$Capital_j$, $NET_j$, $Earn_j$: respectively the asset growth rate, net worth growth rate, and revenue growth rate of the jth company on the underwriting day.

Hypothesis 3. The excess stock return of a cash capital increase will be affected by stock turnover.

Hypothesis 4. The excess return of cash capital increase stock will be affected by the discount.
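The quantities defined in Eqs. (2) to (4) can be computed directly. The following is a minimal sketch, assuming returns and the discount rate are expressed as fractions rather than percentages; the function names and all input values are hypothetical illustrations, not data from the study.

```python
from math import prod


def discount_rate(underwriting_price, closes_before):
    """Eq. (2): UNP = 1 - UWP / PM, where PM is the mean closing price
    over the T trading days before the start of subscription."""
    pm = sum(closes_before) / len(closes_before)
    return 1.0 - underwriting_price / pm


def market_return(r_tse, mv_tse_prev, r_otc, mv_otc_prev):
    """Eq. (4): TSE and OTC index returns weighted by lagged market value."""
    weighted = r_tse * mv_tse_prev + r_otc * mv_otc_prev
    return weighted / (mv_tse_prev + mv_otc_prev)


def buy_and_hold_abnormal_return(stock_returns, market_returns):
    """Eq. (3): compounded stock return minus compounded market return."""
    stock = prod(1.0 + r for r in stock_returns)
    market = prod(1.0 + m for m in market_returns)
    return stock - market


# Hypothetical inputs for illustration only.
unp = discount_rate(42.5, [50.0] * 20)            # T = 20 trading days
rm = market_return(0.01, 300.0, 0.02, 100.0)      # one day's market return
ar = buy_and_hold_abnormal_return([0.10, -0.05], [0.02, 0.01])
print(unp, rm, ar)
```

A positive `ar` means the stock outperformed the value-weighted market benchmark over the holding period, matching the interpretation given for Eq. (3).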
3.2 Abnormal Return

The second method used in this study is the Fama-French three-factor model, which is also used to test for abnormal performance. [8] proposed estimating the performance of individual stocks after controlling for common factors such as market, size, and book-to-market ratio. When the constant term of this model (i.e., “Jensen’s α”) is different from zero, the stock is said to generate excess returns. The procedure mainly follows [11] and estimates the regression on the pooled sample. All variables in the model are first processed using the holding period method (calculated from the SEO subscription start date) to evaluate period-by-period performance changes. Second, to compare Jensen’s α between SEOs with different discount advantages, a dummy variable that distinguishes sample attributes is added to the model, and its coefficient determines whether the excess performance exists. In addition, the Fama-French three-factor model originally used monthly data. To observe short-term performance changes, this study adopts the method of [10] and uses daily data.

(1) In the first stage, the excess return of the Fama-French three-factor model is calculated:

$$R_{j,T} - RF_{j,T} = \alpha + b\left(RM_{j,T} - RF_{j,T}\right) + s\,SMB_{j,T} + h\,HML_{j,T} + d\,D_j + \varepsilon_T \tag{6}$$

$R_{j,T}$: the return of the jth SEO over a holding period of T days, starting from the subscription start date (t = 1).
$R_{j,t}$: the return of the jth SEO on day t.
$RF_{j,T}$, $RM_{j,T}$, $SMB_{j,T}$ and $HML_{j,T}$: respectively the risk-free return (proxied by First Bank’s one-month time deposit rate converted to a daily rate), the market index return, the size factor, and the net value-to-market ratio factor return, each measured from the subscription start date of the jth company’s cash capital increase.
$D_j$: a dummy variable that distinguishes the group of the jth SEO; its setting differs with the comparison sample.
$\alpha$ (i.e., Jensen’s α), $b$, $s$, $h$ and $d$ are regression parameters.
$\varepsilon$: the error term.

The estimate $\hat{\alpha}$ of the constant $\alpha$ is the excess return.

(2) In the second stage, the excess return $\hat{\alpha}$ from the Fama-French three-factor model is used as the dependent variable in a regression:

$$\hat{\alpha}_j = b + b_1 RM_j + b_2 Turn_j + b_3 Capital_j + b_4 NET_j + b_5 Earn_j + D_j + e \tag{7}$$

$\hat{\alpha}_j$: the excess return.
$RM_j$: the average market return over the 20 days before and after the underwriting day of the jth company.
$Turn_j$: the average stock turnover over the 20 days before and after the underwriting day of the jth company.
$Capital_j$, $NET_j$, $Earn_j$: respectively the asset growth rate, net worth growth rate, and revenue growth rate of the jth company on the underwriting day.
$D_j$: D = 1 for companies in the top half of the discount ranking; D = 0 for companies in the bottom half.

3.3 Sample

The sample comes from the Taiwan Economic Journal (TEJ) database. The sample period is from January 1, 2009 to December 31, 2018, and the sample consists of the stocks of listed companies that carried out a cash capital increase, taking the one closest to the end of the year. After excluding companies lacking financial information, 257 valid observations remain. The remaining financial variables are also taken from TEJ.
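The first-stage regression in Eq. (6) of Sect. 3.2, together with the discount dummy $D_j$, can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `ols` and `discount_dummy` are hypothetical, the data are synthetic and noise-free (not from the study), and the least-squares solver uses the normal equations rather than whatever software the authors used.

```python
import random


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y.
    Each row must already contain a leading 1.0 for the intercept."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


def discount_dummy(unp):
    """D = 1 for offerings in the top half of the discount ranking, else 0."""
    order = sorted(range(len(unp)), key=lambda i: unp[i], reverse=True)
    top = set(order[: len(unp) // 2])
    return [1.0 if i in top else 0.0 for i in range(len(unp))]


# Synthetic, noise-free data for illustration only (not from the paper).
rng = random.Random(0)
n = 40
D = discount_dummy([rng.uniform(0.0, 30.0) for _ in range(n)])
mkt = [rng.gauss(0.0, 1.0) for _ in range(n)]   # RM - RF
smb = [rng.gauss(0.0, 1.0) for _ in range(n)]
hml = [rng.gauss(0.0, 1.0) for _ in range(n)]
true_coefs = [0.5, 1.2, 0.3, -0.1, -2.0]        # alpha, b, s, h, d
X = [[1.0, m, s, h, d] for m, s, h, d in zip(mkt, smb, hml, D)]
y = [sum(c * x for c, x in zip(true_coefs, row)) for row in X]  # R - RF

alpha_hat, b_hat, s_hat, h_hat, d_hat = ols(X, y)
print(alpha_hat, d_hat)
```

Because the synthetic data contain no noise, the estimates recover the chosen coefficients, which makes the sketch easy to check. The second-stage regression in Eq. (7) could reuse the same `ols` helper with the regressors $RM_j$, $Turn_j$, $Capital_j$, $NET_j$, $Earn_j$ and $D_j$.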
4 Empirical Result

4.1 Descriptive Statistical Analysis

This study mainly explores whether the discount rate of listed companies that have carried out a cash capital increase is affected by various financial variables. The sample period is from January 1, 2009 to December 31, 2018, and the sample contains a total of 257 valid observations. The summary statistics of the overall sample are shown in Table 1.
Table 1. Descriptive statistics (unit: %)

| Variable | UNP | RM | TURN | CAPITAL | NET | EARN |
|---|---|---|---|---|---|---|
| Average | 14.6350 | 0.7545 | 21.0816 | 24.1626 | 39.5284 | 40.6102 |
| Median | 14.5208 | 0.6166 | 12.7429 | 15.0000 | 25.1500 | 5.7400 |
| Max | 54.7386 | 18.0595 | 213.5197 | 519.0400 | 976.8300 | 4848.4400 |
| Min | −68.8819 | −12.4682 | 0.0489 | −42.8300 | −29.8100 | −92.6900 |
| Std. Dev | 11.1643 | 4.5189 | 26.3968 | 46.3129 | 77.4924 | 333.5362 |
| Skewness | −1.9592 | −0.1129 | 4.0226 | 5.9861 | 8.8089 | 12.8058 |
| Kurtosis | 17.0094 | 4.3153 | 25.6925 | 56.6780 | 97.9652 | 176.3850 |
| Jarque-Bera | 2266.0760 | 19.0721 | 6207.1720 | 32389.070 | 99895.590 | 328941.90 |
| P value | 0.0000 | 0.0001 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |

Note: UNP is the discount rate, RM is the market return (month), TURN is stock turnover (month), CAPITAL is the asset growth rate, NET is the net worth growth rate, and EARN is the revenue growth rate. The total number of samples is 257.
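The Jarque-Bera statistics in Table 1 can be reproduced from the reported skewness and kurtosis using the standard formula JB = n/6 · (S² + (K − 3)²/4). The sketch below assumes the reported kurtosis is raw (non-excess) kurtosis and uses the asymptotic chi-square distribution with 2 degrees of freedom for the p-value; the function name is hypothetical.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Return (JB statistic, asymptotic p-value).

    JB = n / 6 * (S^2 + (K - 3)^2 / 4), with K the raw kurtosis.
    Under normality JB is asymptotically chi-square with 2 degrees of
    freedom, whose survival function is exp(-x / 2).
    """
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


# Skewness and kurtosis for UNP and RM as reported in Table 1 (n = 257).
jb_unp, p_unp = jarque_bera(257, -1.9592, 17.0094)
jb_rm, p_rm = jarque_bera(257, -0.1129, 4.3153)
print(f"UNP: JB = {jb_unp:.2f}, p = {p_unp:.4f}")
print(f"RM:  JB = {jb_rm:.2f}, p = {p_rm:.4f}")
```

The results agree with the tabulated values up to rounding of the inputs, which supports the reading that every variable departs from normality.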
According to Table 1, all variables are leptokurtic, with kurtosis greater than 3, especially the asset growth rate (CAPITAL), net worth growth rate (NET), and revenue growth rate (EARN). Variables with positive skewness are right-skewed: stock turnover rate (TURN), asset growth rate (CAPITAL), net worth growth rate (NET), and revenue growth rate (EARN). Variables with negative skewness are left-skewed: discount rate (UNP) and market return rate (RM) (Table 2).

Table 2. Regression Analysis of Discount Range

| Variable | Coefficient | Std. Error | T test | P value |
|---|---|---|---|---|
| C | 11.8327 | 0.9298 | 12.7250 | 0.0000*** |
| RM | −0.0442 | 0.1493 | −0.2959 | 0.7675 |
| TURN | 0.1291 | 0.0255 | 5.0535 | 0.0000*** |
| CAPITAL | 0.0237 | 0.0218 | 1.0848 | 0.2790 |
| NET | −0.0140 | 0.0138 | −1.0116 | 0.3127 |
| EARN | 0.0023 | 0.0023 | 0.9985 | 0.3190 |
| Adjusted R-squared | 0.0817 | | | |

Note: RM is the market return (month), TURN is stock turnover (month), CAPITAL is the asset growth rate, NET is the net worth growth rate, and EARN is the revenue growth rate. * p < 0.05, ** p < 0.01, *** p < 0.001
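As a quick consistency check on the regression table, each t statistic is the coefficient divided by its standard error, and the significance stars follow from the p-value. The sketch below assumes the conventional 0.05 / 0.01 / 0.001 thresholds; the function names are hypothetical, and small differences from the tabulated t values arise because the table reports rounded coefficients and standard errors.

```python
def t_statistic(coefficient, std_error):
    """t statistic for H0: coefficient == 0."""
    return coefficient / std_error


def significance_stars(p_value):
    """Map a p-value to stars, assuming 0.05 / 0.01 / 0.001 thresholds."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# TURN row of the discount-rate regression (coefficient, standard error).
t_turn = t_statistic(0.1291, 0.0255)
print(f"t(TURN) = {t_turn:.2f} {significance_stars(0.0000)}")
print(f"stars for RM (p = 0.7675): '{significance_stars(0.7675)}'")
```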
Table 2 tests Hypothesis 1: whether the stock discount rate of a cash capital increase is affected by stock turnover. The results show that only stock turnover affects the discount rate; the other variables are not significant. Stock turnover has a positive effect on the discount rate: if stock turnover increases, the discount rate increases (Table 3).

Table 3. Buy and Hold Abnormal Return Regression Analysis

| Variable | Coefficient | Std. Error | T test | P value |
|---|---|---|---|---|
| C | 4.5321 | 1.9832 | 2.2853 | 0.0229* |
| RM | 0.6101 | 0.2638 | 2.3125 | 0.0213* |
| TURN | 0.4038 | 0.0541 | 7.4619 | 0.0000*** |
| CAPITAL | 0.0410 | 0.0441 | 0.9290 | 0.3535 |
| NET | −0.0131 | 0.0289 | −0.4554 | 0.6491 |
| EARN | 0.0009 | 0.0050 | 0.1826 | 0.8551 |
| D | −18.0921 | 2.7080 | −6.6812 | 0.0000*** |
| Adjusted R-squared | 0.1976 | | | |

Note: RM is the market return (month), TURN is stock turnover (month), CAPITAL is the asset growth rate, NET is the net worth growth rate, and EARN is the revenue growth rate. * p < 0.05, ** p < 0.01, *** p < 0.001
Table 3 tests Hypothesis 2: whether the abnormal return of stocks with a cash capital increase is affected by stock turnover. The results indicate that the market return, stock turnover, and the discount rate all affect abnormal returns. Stock turnover and the discount rate are highly significant, while the remaining variables are not significant (Table 4).

Table 4. Fama French three-factor regression analysis

| Variable | Coefficient | Std. Error | T test | P value |
|---|---|---|---|---|
| C | 9.0403 | 1.7221 | 5.2497 | 0.0000*** |
| MR | 1.4687 | 0.3463 | 4.2415 | 0.0000*** |
| SMB | 2.0250 | 0.5604 | 3.6135 | 0.0003*** |
| HML | 0.8138 | 0.6120 | 1.3297 | 0.1845 |
| D | −13.721 | 2.7588 | −4.9733 | 0.0000*** |
| Adjusted R-squared | 0.1866 | | | |

Note: MR is the market risk premium, SMB is the size premium, HML is the net value-to-market premium, and D is the discount. * p < 0.05, ** p < 0.01, *** p < 0.001
Table 4 reports the Fama-French three-factor regression. Its constant term C estimates the excess return under the Fama-French three-factor model (Table 5).

Table 5. Regression Analysis of Fama French Excess Return

| Variable | Coefficient | Std. Error | T test | P value |
|---|---|---|---|---|
| C | 2.0025 | 1.9309 | 1.0371 | 0.3004 |
| D | −18.1019 | 2.6366 | −6.8657 | 0.0000*** |
| RM | −0.2374 | 0.2569 | −0.9243 | 0.3560 |
| TURN | 0.3903 | 0.0527 | 7.4074 | 0.0000*** |
| CAPITAL | 0.0504 | 0.0430 | 1.1724 | 0.2418 |
| NET | −0.0089 | 0.0281 | −0.3167 | 0.7517 |
| EARN | 0.0010 | 0.0049 | 0.1998 | 0.8418 |
| Adjusted R-squared | 0.1871 | | | |

Note: D is the discount rate, RM is the market return (month), TURN is stock turnover (month), CAPITAL is the asset growth rate, NET is the net worth growth rate, and EARN is the revenue growth rate. * p < 0.05, ** p < 0.01, *** p < 0.001
Table 5 tests Hypothesis 3 (whether the excess return of cash capital increase stocks is affected by stock turnover) and Hypothesis 4 (whether the excess return of cash capital increase stocks is affected by the discount rate). The results indicate that only the discount rate and stock turnover affect the excess return of cash capital increase stocks; the remaining variables are not significant.
5 Conclusion This study uses companies that have handled cash capital increases from 2009 to 2018 in the Taiwan Economic News database. And, we use financial variables to test the correlation between cash capital increase discounts, company performance, and stock turnover, and we use various variables to describe statistics The demonstration obtained by empirical methods such as tables and regression analysis can prove that: (1) The stock discount rate for cash capital increase will be affected by stock turnover: This study shows that the discount rate of cash capital increase stocks will indeed be affected by the stock turnover, and the result is significant and positive correlated. As long as the stock turnover rate increases, the discount rate will become larger. (2) Abnormal stock return for cash capital increase will be affected by stock turnover: The research results show that the rate of return of the market, the stock turnover rate and the discount rate will all affect the abnormal return of the cash capital increase stock, and the analysis result of the stock turnover is very significant, which means that the stock turnover may be the biggest factor affecting the abnormal return of the stock. (3) The excess return of cash capital increase stock will be affected by stock turnover: The research results show that the stock turnover will affect the excess return of cash capital increase stocks, and the result is very significant. (4) The excess return of cash capital increase will be affected by the discount rate: The research results show that the excess return of cash
capital increase is affected by the discount rate, while the remaining variables are not significant. This research studies the correlation between cash capital increase discount, corporate performance, and stock turnover. The results show that the discount rate, abnormal return, and excess return are all affected by stock turnover, so stock turnover can be used as an investment reference index. This study also has several limitations. First, it focuses only on listed stocks that underwent a cash capital increase from 2009 to 2018, so the results cannot be compared with OTC companies. Future work could study all listed companies that have undergone a cash capital increase, use different sample periods, or compare with cash capital reduction. Such research would further clarify the relationship between cash capital increase discount, corporate performance, and stock turnover.
Research on the Influence of On-the-go Cross-store Access through APPs on Consumer Behavior
Ya-Lan Chan1, Po-Hung Chen1, Sue-Ming Hsu2, and Mei-Hua Liao3(B)
1 Department of Business Administration, Asia University, Taichung, Taiwan, ROC
2 Department of Business Administration, Tunghai University, Taichung, Taiwan, ROC
3 Department of Finance, Asia University, Taichung, Taiwan, ROC
Abstract. Convenience stores in Taiwan are introducing smart services for their member APPs to enhance customer satisfaction and loyalty. A survey was conducted to analyze the impact of performance expectations, effort expectations, and social influence on consumers’ intentions to use the app for cross-store purchases. The study found that the behavioral intention of consumers to use apps had a positive impact on their actual usage behavior. The study recommends enhancing consumers’ behavioral intention to buy and use across stores by improving work performance, simplifying system operation, and considering the opinions of people around them. Strengthening self-buying ability, seeking help when encountering problems, and increasing the subjective willingness of consumers to use instant shopping and cross-store pick-up are also recommended to increase the frequency of consumption in convenience stores. The study aims to promote more environmentally friendly consumption behavior by reducing the number of trips and carbon emissions through cross-store pick-up.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 L. Barolli (Ed.): IMIS 2023, LNDECT 177, pp. 381–388, 2023. https://doi.org/10.1007/978-3-031-35836-4_40
1 Introduction
There are four major convenience store chains in Taiwan: Family Mart, 7-Eleven, Hi-Life, and OK. Each chain has launched its own membership system to cultivate loyal customers and segment the market. Family Mart has over 14.5 million registered members and continues to update its membership services, which have contributed 50% of the company’s revenue and set a new peak. Among supermarket apps in Taiwan, Family Mart’s app is the most frequently used by consumers. Furthermore, in 2021, Family Mart launched a map function in its app that allows consumers to check from home the inventory status of products in nearby stores (Economic Daily, 2021). Competition among convenience stores is fierce, and operators have become more aggressive in recent years because the demographic dividend has disappeared and the cost of cultivating new customer groups is rising. They invest in membership management to increase the number and loyalty of members, using big data analysis to understand consumer behavior and market their products more efficiently. By doing so, they aim to
meet the various needs of consumers and seize market opportunities. In this highly competitive market, whoever masters membership will have an advantage. Success depends on the breadth and depth of the membership base. The cross-store pick-up function of the Family Mart app enables consumers to purchase specific products at a discounted group-buying price, and exchange them at any branch. This has prompted a study to examine consumer acceptance of this new sales model. The research aims to: 1) Explore the relationship between performance expectations, effort expectations, social influence, favorable conditions, behavioral intentions, and usage behaviors of consumers when using the Family Mart app through the UTAUT model. 2) Investigate whether background variables such as gender, marital status, and frequency of visiting Family Mart stores affect the presentation of UTAUT factors, such as performance expectations, effort expectations, social influence, favorable conditions, behavioral intentions, and usage behavior.
2 Literature Review
2.1 Features and Characteristics of APP
An APP is a small piece of software that can be used to browse the web on mobile devices. Its advantage is that it saves users time, and it can connect to cloud systems and work with different carriers to perform tasks (Cheng Wan-xuan, 2014). Ma Yun has publicly stated that the era of pure e-commerce will give way to “new retail” and become a thing of the past. Because of its convenience, interactivity, immediacy, and ubiquity, an APP delivers a marketing effect far superior to that of traditional advertising. Therefore, companies have launched enterprise-specific APPs to interact effectively with consumers, and have organized various APP promotion activities to help customers develop a positive attitude towards their brands and become willing to purchase (Huang Xiao-han, 2017). In 2016, Family Mart convenience stores launched their digital membership program, which integrated with their mobile app to virtualize the points previously collected as stickers with purchases. This marked the beginning of a new era of big data applications and the implementation of a membership economy (Dai Qiao-rong, 2019). Over time, the features of the Family Mart app have expanded beyond point accumulation, gift transfers, donations, and item exchanges to include services such as cross-store pick-up, e-wallet, bookkeeping, courier packages, and map inquiries. These additional features have contributed to a membership base that grows year after year.
2.2 Unified Theory of Acceptance and Use of Technology (UTAUT)
Technological products continually launch updated versions, and consumers’ willingness to use them has been a topic of interest in both academic and business circles. Venkatesh et al. (2003) proposed the Unified Theory of Acceptance and Use of Technology (UTAUT) by combining eight important but different theories related to technology acceptance. This model explains how users accept and use new technological
products. In this study, we focus on the factors of performance expectation, effort expectation, social influence, and favorable conditions to examine the relationship between behavioral intention and usage behavior. Additionally, we consider four moderating variables that may affect this relationship: gender, age, experience, and voluntary use. The aim of this study is to explore the impact of these variables on users’ consumer behavior. The main theoretical framework of this study adopts the UTAUT model, which examines the influence of consumers’ behavioral intentions and related factors on their usage behavior. The study uses “performance expectations”, “effort expectations”, “social influence” and “favorable conditions” as the independent variables, and the “behavioral intention” and “usage behavior” associated with cross-store pick-up as the dependent variables to explore the impact of each facet.
Performance expectation can be defined as “an individual’s belief that using or operating this new information technology system will improve their work performance and help achieve better results.” Generally, people are more willing to use a new technology system if it can bring better performance in their work.
Effort expectation refers to the perceived difficulty or ease of using or operating a new information technology system. In other words, if a specific new information technology system is perceived as simple, easy to understand, and user-friendly, the average person is more likely to be willing to use it. Effort expectation mainly includes factors such as perceived ease of use, system complexity, and usability.
Social influence refers to the degree to which significant others, who can affect a user’s adoption of a new information technology system, believe that the user should adopt it. The components of social influence typically include subjective norms, social factors, and public image. Moreover, social influence may differ depending on variables such as gender, age, experience, and voluntariness. In particular, young women and early adopters tend to be more susceptible to the influence of their senior managers and colleagues. However, as experience grows, the impact of social influence diminishes.
Favorable (facilitating) conditions refer to the degree to which an individual feels that the current organizational structure and technical level support the adoption of the new information technology system. If the difficulties that arise while using the new system can be solved by the organization or by the available technology, adoption of the system is helped. Favorable conditions mainly include perceived behavioral control, convenience, and compatibility. As experience increases, so does the effect of favorable conditions.
The conceptual definition of behavioral intention is “the subjective probability that a person will perform a given behavior” (Ajzen & Fishbein, 1975), and many scholars (Hu, Chau, Sheng, & Tam, 1999; Mathieson, 1991; Szajna, 1996) have shown empirically that behavioral intention can be used as a predictor of system usage. In the UTAUT model, in addition to the above-mentioned influences on usage behavior, four moderator variables, namely gender, age, experience, and voluntariness of use, also indirectly affect behavioral intention and usage behavior.
3 Research Model and Hypothesis
Based on the reviewed literature, the research model of this study is developed as illustrated in Fig. 1.
Fig. 1. Research model adapted from Venkatesh et al. (2003)
A number of studies have shown that performance expectations can influence users’ adoption intentions. For instance, Lv Zheyu (2016) found that using mobile payment for transactions brings convenience to users’ daily lives and work, thereby increasing their willingness to use it. Similarly, Chen Shugong (2017) found that investors believe placing orders with an APP can lead to better performance, which increases their willingness to use it.
H1. Performance expectancy has a positive impact on behavioral intention.
Effort expectations are expected to have a direct positive effect on behavioral intentions, because consumers tend to prefer new forms of information technology that are simple, easy to learn, and easy to use, and are more willing to adopt them. For instance, consumers are more likely to continue using Buy Now if they find it easy to use and perceive it as useful. Studies have also shown that effort expectations can affect users’ adoption intentions. For example, how easy a tablet computer is to learn and operate, and how fast it is, can affect the user’s behavioral intentions towards the tablet (Jian Zhengchang, 2015).
H2. Effort expectancy has a positive impact on behavioral intention.
Several studies have shown that social influence can affect users’ adoption intentions. For example, Wu Wanxin (2015) found that Taiwanese stores are generally influenced by social factors when deciding to use Beacon technology for advertising push services. In another study, Zeng Xinxuan (2015) found that the more online game
users hear about or see others participating in the game League of Legends, the greater their intention to use it.
H3. Social influence has a positive impact on behavioral intention.
A number of recent studies have shown that when consumers perceive themselves as having advantages, such as capabilities or related resources, they tend to be more active in their actual use of new forms of information technology. For instance, Yang Xiaojiao (2015) found that the actual use behavior of middle school students is positively affected by the device they use, whether it is a smartphone or a tablet. Similarly, Chen Shugong (2017) found that investors believe securities firms’ ordering APPs can enhance their performance and increase their willingness to use them.
H4. Facilitating conditions have a positive impact on use behavior.
Several studies have demonstrated that users’ usage behavior is influenced by their behavioral intentions. For instance, Ji Jiayou (2013) found that teachers’ willingness to use interactive electronic whiteboards has a positive impact on their usage behavior. Similarly, Yang Xiaojiao (2015) found that middle school students’ actual use behavior is positively affected by their behavioral intention, regardless of whether they are using a smartphone or a tablet.
H5. Behavioral intention has a positive impact on use behavior.
4 Data Analysis
This study employs SPSS statistical software to conduct regression analysis. The six variables considered in this study are performance expectation (PE), effort expectation (EE), social influence (SI), favorable condition (FC), behavioral intention (BI), and usage behavior (UB). Table 1 shows that Model 1 (BI = 0.299 * PE + 0.323 * EE + 0.352 * SI) includes three significant variables: performance expectation, effort expectation, and social influence. Therefore, H1, H2, and H3 are supported. Specifically, consumers find instant shopping highly convenient, which positively influences their willingness to continue using it. Store pick-up also has a positive impact on consumers. Furthermore, when using buy-anywhere cross-store pick-up, consumers feel they are keeping up with the new era of technology, which also has a positive impact on their willingness to continue using this service. The independent variables explain 78.1% of the variation in the dependent variable, with an F value of 126.167 and a p-value < 0.001, indicating that the model fits well. Table 2 indicates that, in Model 2 (UB = 0.722 * BI), favorable conditions do not have a significant impact on usage behavior. Therefore, Hypothesis H4 is not supported. In other words, a consumer’s knowledge of instant shopping and cross-store pick-up does not increase how often they visit convenience stores. However, behavioral intention significantly affects consumer usage behavior, which supports Hypothesis H5. This suggests that consumers are willing to continue using the convenience store’s services, which positively affects the frequency of their visits to
Table 1. Estimation results of Model 1
              β       P-value
(constant)    0.119   0.621
PE            0.299   0.002***
EE            0.323   0.001***
SI            0.352   0.001***
Dependent variable = BI
F value = 126.167, Adjusted R2 = 0.781
Notes: ***: p < .001
convenience stores. The model explains 57% of the variation in the dependent variable, with an F value of 70.679 and a significant P-value < 0.001, suggesting that the model fits well.
Table 2. Estimation results of Model 2
              β       P-value
(constant)            0.355
FC            0.161   0.267
BI            0.613   0.001***
Dependent variable = UB
F value = 70.679, Adjusted R2 = 0.570
Notes: ***: p < .001
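The regressions reported in Tables 1 and 2 were run in SPSS. As a rough illustration of what such an analysis computes, the following Python sketch fits an ordinary least squares model and derives the adjusted R2 and F value. It is not the authors' procedure: the data are synthetic and generated around the Table 1 coefficients purely for demonstration, and the sample size (300), the 1 to 5 response scale, the noise level, and the helper names `solve_linear_system` and `fit_ols` are all assumptions.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit of y on the predictors in rows (intercept added).

    Returns (coefficients with the intercept first, adjusted R^2, F value).
    """
    n, k = len(rows), len(rows[0])
    design = [[1.0] + list(r) for r in rows]
    p = k + 1
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, d)) for d in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    f_value = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return coef, adj_r2, f_value


# Synthetic survey scores (assumption: 300 respondents, scores between 1 and 5).
random.seed(0)
TABLE1 = [0.119, 0.299, 0.323, 0.352]  # constant, PE, EE, SI as reported in Table 1
rows = [[random.uniform(1, 5) for _ in range(3)] for _ in range(300)]
y = [
    TABLE1[0] + sum(b * v for b, v in zip(TABLE1[1:], r)) + random.gauss(0, 0.2)
    for r in rows
]

coef, adj_r2, f_value = fit_ols(rows, y)
print("coefficients:", [round(c, 3) for c in coef])
print("adjusted R2:", round(adj_r2, 3), "F value:", round(f_value, 1))
```

Because the data are simulated, the printed adjusted R2 and F value say nothing about the survey itself; the sketch only shows how the quantities in the table footers relate to the fitted coefficients.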
5 Conclusions
Model 1, estimated in SPSS, shows that performance expectations, effort expectations, and social influence are significantly and positively correlated with behavioral intentions. Consumers perceive cross-store retrieval to be highly useful, which positively affects their willingness to continue using the Buy Now feature. Moreover, consumers feel that cross-store retrieval allows them to keep up with the latest technology, which also positively influences their willingness to continue using this feature. Model 2 shows that favorable conditions do not have a significant positive correlation with usage behavior: higher levels of facilitating conditions for cross-store retrieval do not affect consumers’ usage behavior. In particular, having knowledge of instant shopping and cross-store pick-up does not increase the frequency of consumers’ visits to convenience stores. However, the study did reveal a significant positive correlation between behavioral intention and usage behavior through Model
2. This indicates that the higher the level of behavioral intention that consumers have regarding cross-store retrieval, the greater the influence it will have on their own usage behavior. The results are summarized as Table 3:
Table 3. Summary table of research hypothesis results
Hypothesis                                                                                    Confirmed or not
H1  Performance expectations have a significant positive impact on behavioral intentions     Confirmed
H2  Effort expectations have a significant positive impact on behavioral intentions          Confirmed
H3  Social influence has a significant positive impact on behavioral intentions              Confirmed
H4  Favorable conditions have a significant positive impact on usage behavior                Not confirmed
H5  Behavioral intentions have a significant positive impact on usage behavior               Confirmed
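The verdicts in Table 3 follow from comparing each coefficient's reported P-value with a significance threshold. A minimal sketch of that decision rule is given below. The coefficients and P-values are copied from Tables 1 and 2, while the 0.05 threshold and the names `REPORTED`, `ALPHA`, and `is_confirmed` are assumptions made for illustration, since the paper does not state the cut-off it applied.

```python
# Reported estimates: hypothesis -> (predictor, outcome, beta, p_value).
# The numbers are taken from Tables 1 and 2 of the paper.
REPORTED = {
    "H1": ("PE", "BI", 0.299, 0.002),
    "H2": ("EE", "BI", 0.323, 0.001),
    "H3": ("SI", "BI", 0.352, 0.001),
    "H4": ("FC", "UB", 0.161, 0.267),
    "H5": ("BI", "UB", 0.613, 0.001),
}

ALPHA = 0.05  # assumed significance threshold, not stated in the paper


def is_confirmed(beta, p_value, alpha=ALPHA):
    """A positive-impact hypothesis is confirmed if beta > 0 and p < alpha."""
    return beta > 0 and p_value < alpha


verdicts = {h: is_confirmed(beta, p) for h, (_, _, beta, p) in REPORTED.items()}

for h, ok in verdicts.items():
    predictor, outcome, _, _ = REPORTED[h]
    label = "Confirmed" if ok else "Not confirmed"
    print(f"{h}: {predictor} -> {outcome}: {label}")
```

With these inputs the rule reproduces the pattern in Table 3: H4 is the only hypothesis that is not confirmed, because the P-value for FC (0.267) is well above the assumed threshold.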
Practical recommendations from this study are as follows:
1. Enhance consumers’ intention to use cross-store retrieval when making purchases, primarily by improving personal work performance, making the system easy to operate, and taking into account the opinions of the people around them.
2. Strengthen consumers’ ability and knowledge regarding cross-store retrieval, and make assistance available when they encounter problems. This can increase the frequency of consumers visiting convenience stores.
3. Increase the subjective willingness of consumers to use cross-store retrieval, thereby increasing the frequency of visits to convenience stores.
Acknowledgments. This research is partly supported by the National Science Council of Taiwan NSC 110-2813-C-468-086-H.
References
Ajzen, I., Fishbein, M.: Belief, Attitude, Intention, and Behavior: An Introduction to Theory and Research. Addison-Wesley, Reading (1975)
Ajzen, I.: The theory of planned behavior. Organ. Behav. Hum. Decis. Processes 50, 179–211 (1991)
Abbas, S.K., Hassan, H.A., Asif, J., Ahmed, B., Hassan, F., Haider, S.S.: Integration of TTF, UTAUT, and ITM for mobile banking adoption. Int. J. Adv. Eng., Manag. Sci. 4(5), 89–101 (2018)
Bergeron, F., Rivard, S., de Serre, L.: Investigating the support role of the information center. MIS Q. 14(3), 247 (1990)
Chen, S.-G.: A study on user behavior of securities firms’ mobile trading APPs based on the UTAUT2 model. Master’s thesis, Department of Business Management, National Sun Yat-sen University (2017)
Chen, W.-K.: Factors influencing user acceptance of knowledge community sharing: A case study of PChome online community. Master’s thesis, Department of Information Management, Fu Jen Catholic University (2004)
Cheng, W.-H.: The relationship between visual symbols of APP icons and consumer download decisions: A case study of gaming apps. Master’s thesis, Graduate Institute of Design, National Taipei University of Technology (2014)
Hu, P.J., Chau, P.Y., Sheng, O.R.L., Tam, K.Y.: Examining the technology acceptance model using physician acceptance of telemedicine technology. J. Manag. Inf. Syst. 16(2), 91–112 (1999)
Huang, H.-H.: The interactive effect of brand APPs and event marketing. Master’s thesis, Department of Business Administration, Feng Chia University (2017)
Lu, C.-Y.: A study on consumers’ willingness to use mobile payment: Applying the UTAUT2 model. Master’s thesis, Department of Business Administration, National Taipei University (2016)
Morris, M.G., Venkatesh, V.: Age differences in technology adoption decisions: implications for a changing work force. Pers. Psychol. 53(2), 375–403 (2000)
Mathieson, K.: Predicting user intentions: comparing the technology acceptance model with the theory of planned behavior. Inf. Syst. Res. 2(3), 173–191 (1991)
Szajna, B.: Empirical evaluation of the revised technology acceptance model. Manag. Sci. 42(1), 85–92 (1996)
Tai, C.-J.: Exploring consumers’ usage intentions towards convenience store APPs: A case study of FamilyMart. Master’s thesis, Department of Information Management, College of Technology, Nanhua University (2019)
Tseng, H.-H.: A study on the acceptance of League of Legends players: Based on the UTAUT2 model. Master’s thesis, Graduate Institute of Electronic Business and Commerce, National Kaohsiung First University of Science and Technology (2015)
Venkatesh, V., Morris, M.G., Davis, G.B., Davis, F.D.: User acceptance of information technology: toward a unified view. MIS Q. 27, 425–478 (2003)
Wu, W.-H.: Acceptance of Beacon technology advertising services by Taiwanese stores: An application of the UTAUT and TTF models. Master’s thesis, Department of Industrial Engineering and Management, National Taipei University of Technology (2015)
Yang, S.-C.: A study on the use of smartphones and tablets by junior high school students based on the UTAUT2 model. Master’s thesis, Department of Information Management, Southern Taiwan University of Science and Technology (2015)
Author Index

A
Akinladejo, Felix O. 68
Albhelil, Esam Ali 103
Ampririt, Phudit 149
Asanza, Víctor 209

B
Barolli, Admir 160, 170
Barolli, Leonard 1, 38, 47, 149, 160, 170, 179
Batzorig, Munkhdelgerekh 103

C
Cai, Jia-Syun 261
Chan, Ya-Lan 361, 381
Chang, Chia-Wen 306
Chang, Chun-Ping 369
Chen, Hsing-Chung 8, 240, 261
Chen, Jheng-Shun 8
Chen, Po-Hung 381
Chen, Yang-Cheng-Kuang 240
Chen, Yan-Ting 8
Chen, Yen-Ju 361
Cheng, Chia-Hsin 229

D
Dhurandher, Sanjay K. 68

E
Estrada, Rebeca 209

F
Fan, Yao-Chung 294

G
Gao, Tianhan 18, 138, 189

H
Hagihara, Momoka 120, 129
Han, Chien-Kuo 327
Higashi, Shunya 149
Hsu, Pei-Yu 261
Hsu, Sue-Ming 381
Hsu, Wei-Chun 8
Huang, Yung-Fa 229
Hung, Patrick C. K. 273

I
Ikeda, Makoto 1, 149
Ishida, Tomoyuki 120, 129
Islam, Md Rezanur 78, 86, 96

J
Jeong, Da-Wit 201
Jiang, Xinbei 18

K
Kanazawa, Akari 57
Kanev, Kamen 273
Kang, Gwonsik 86
Kim, Dae-Young 29
Kim, Keunkyoung 86
Kim, Si-On 201
Kimura, Masakatsu 273
Koh, Yeji 103
Kulla, Elis 179
Kung, Tzu-Liang 221
Kuriya, Naho 120, 129

L
Lai, Sen-Tarng 316
Lee, Shu-Hung 229
Lee, SoYeon 29
Lee, Sun-Young 201
Leu, Fang-Yie 283, 294, 306, 316
Li, Jiangtao 112
Li, Qiong 112
Liang, Yao-Hsien 261
Liao, Mei-Hua 361, 381
Lin, Chien-Chih 229
Lin, Ching-Lun 345
Lin, Shih-Chieh 335, 345, 351
Lin, Ying-Li 327, 335, 345, 351
Liu, Chih-Yun 369
Liu, Shing-Hong 250
Liu, Xiaokang 112
Liu, Yi 160

M
Matsuo, Keita 149, 179
Mi, Qingwei 189
Mimura, Hidenori 273
Miwata, Masahiro 1
Montesdeoca, Guillermo 209
Moriya, Genki 38
Muneeb, M. A. 209

N
Nagai, Yuki 38, 47
Nakamura, Atsushi 273
Nyamdelger, Tugsmandakh 103

O
Oda, Tetsuya 38, 47
Oh, Insu 78, 96

Q
Qafzezi, Ermioni 149
Qi, Jiayu 138

S
Saito, Nobuki 47
Sakamoto, Shinji 160, 170
Sheu, Yung-Hoh 250
Shieh, Jen-Yu 250
Shigeyasu, Tetsuya 57
Song, Yu-Lin 261
Spaho, Evjola 170
Su, Hui-Kai 250
Su, Jhih-Sheng 261
Su, Tsung-Yu 283
Sukhbaatar, Baatarsuren 96
Sung, Hsieh-Jung 327

T
Tabuchi, Kei 38
Takeda, Ryuhei 273
Takizawa, Makoto 160
Tanaka, Hibiki 1
Teng, Yuan-Hsiang 221
Tong, Zhi-Wei 294
Toyoshima, Kyohei 38, 47
Tsai, Kuen-Yu 261
Tsai, Yung-Shun 369
Tsai, Yun-Hsuan 361
Tzang, Shyh-Weir 369

U
Uang, Kai-Ming 8

V
Valeriano, Irving 209

W
Wang, Kuei-Yuan 327, 345
Weng, Chien-Erh 240
White, David W. 68
Woungang, Isaac 68
Wu, Meng-Chang 250

Y
Yamashita, Yuma 47
Yang, Chao-Tung 8
Yang, Lie 240
Yang, Ya-Yun 351
Yao, Yung-Cheng 240
Yao, Zhengji 18
Yim, Kangbin 78, 86, 96, 103
Yukawa, Chihiro 38, 47
Yuspov, Kamronbek 96
Yusupov, Kamronbek 86

Z
Zhao, Cong 138
Zhao, Yanyan 112
Zhu, Zichen 18