142 89 46MB
English Pages 548 [544] Year 2021
Lecture Notes in Networks and Systems 279
Leonard Barolli Kangbin Yim Hsing-Chung Chen Editors
Innovative Mobile and Internet Services in Ubiquitous Computing Proceedings of the 15th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2021)
Lecture Notes in Networks and Systems Volume 279
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas— UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/15179
Leonard Barolli Kangbin Yim Hsing-Chung Chen •
•
Editors
Innovative Mobile and Internet Services in Ubiquitous Computing Proceedings of the 15th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2021)
123
Editors Leonard Barolli Department of Information and Communication Engineering Fukuoka Institute of Technology Fukuoka, Japan
Kangbin Yim Department of Information Security Engineering Soonchunhyang University Asan, Korea (Republic of)
Hsing-Chung Chen Department of Computer Science and Information Engineering Asia University Taichung, Taiwan
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-3-030-79727-0 ISBN 978-3-030-79728-7 (eBook) https://doi.org/10.1007/978-3-030-79728-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Welcome Message of IMIS-2021 International Conference Organizers
Welcome to the 15th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2021), which will be held from July 1–3, 2021, at Soon Chun Hyang (SCH) University, Asan, Korea, in conjunction with the 15th International Conference on Complex, Intelligent and Software Intensive Systems (CISIS-2021). This international conference focuses on the challenges and solutions for ubiquitous and pervasive computing (UPC) with an emphasis on innovative, mobile and Internet services. With the proliferation of wireless technologies and electronic devices, there is a fast-growing interest in UPC. UPC enables to create a human-oriented computing environment where computer chips are embedded in everyday objects and interact with physical world. Through UPC, people can get online even while moving around, thus having almost permanent access to their preferred services. With a great potential to revolutionize our lives, UPC also poses new research challenges. The conference provides an opportunity for academic and industry professionals to discuss the latest issues and progress in the area of UPC. For IMIS-2021, we received many paper submissions from all over the world. The papers included in the proceedings cover important aspects from UPC research domain. We are very proud and honored to have two distinguished keynote talks by Dr. Jayh (Hyunhee) Park, Myongji University, Korea, and Dr. Antonio Esposito, University of Campania “Luigi Vanvitelli”, Italy, who will present their recent work and will give new insights and ideas to the conference participants. The organization of an international conference requires the support and help of many people. A lot of people have helped and worked hard to produce a successful IMIS-2021 technical program and conference proceedings. First, we would like to thank all the authors for submitting their papers, the Program Committee members and the reviewers who carried out the most difficult work by carefully evaluating the submitted papers. We are grateful to Honorary Co-Chairs Kyoil Suh, Soon Chun Hyang (SCH) University, Korea, and Prof. Makoto Takizawa, Hosei University, Japan, for their guidance and advices.
v
vi
Welcome Message of IMIS-2021 International Conference Organizers
Finally, we would like to thank Web Administrator Co-Chairs for their excellent and timely work. We hope that all of you enjoy IMIS-2021 and find this a productive opportunity to learn, exchange ideas and make new contacts.
IMIS-2021 Organizing Committee
Honorary Co-chairs Kyoil Suh Makoto Takizawa
Soonchunhyang University, Korea Hosei University, Japan
General Co-chairs Kangbin Yim Francesco Palmieri Hsing-Chung Chen
Soonchunhyang University, Korea University of Salerno, Italy Asia University, Taiwan
Program Committee Co-chairs Jonghyouk Lee Lidia Ogiela Baojiang Cui
Sejong University, Korea Pedagogical University in Krakow, Poland Beijing Univ. of Posts and Telecom, China
Advisory Committee Members Vincenzo Loia Arjan Durresi Kouichi Sakurai
University of Salerno, Italy IUPUI, USA Kyushu University, Japan
Award Co-chairs Hae-Duck Joshua Jeong Tomoya Enokido Farookh Hussain Fang-Yie Leu
Korean Bible University, Korea Rissho University, Japan Univ. of Technology Sydney, Australia Tunghai University, Taiwan
vii
viii
IMIS-2021 Organizing Committee
International Liaison Co-chairs Hyobeom Ahn Marek Ogiela Elis Kulla
Kongju University, Korea AGH Univ. of Science and Technology, Poland Okayama University of Science, Japan
Publicity Co-chairs Hyunhee Park Hiroaki Kikuchi Keita Matsuo
Myongji University, Korea Meiji University, Japan Fukuoka Institute of Technology, Japan
Finance Chair Makoto Ikeda
Fukuoka Institute of Technology, Japan
Local Arrangement Co-chairs Sunyoung Lee Hwankuk Kim Kyungroul Lee
Soonchunhyang University, Korea Sangmyung University, Korea Soonchunhyang University, Korea
Web Administrators Phudit Ampririt Kevin Bylykbashi Ermioni Qafzezi
Fukuoka Institute of Technology, Japan Fukuoka Institute of Technology, Japan Fukuoka Institute of Technology, Japan
Steering Committee Chair Leonard Barolli
Fukuoka Institute of Technology, Japan
Track Areas and PC Members 1. Multimedia and Web Computing Track Co-chairs Chi-Yi Lin Tomoyuki Ishida
Tamkang University, Taiwan Fukuoka Institute of Technology, Japan
PC Members Noriki Uchida Tetsuro Ogi
Fukuoka Institute of Technology, Japan Keio University, Japan
IMIS-2021 Organizing Committee
Yasuo Ebara Hideo Miyachi Kaoru Sugita Akio Doi Chang-Hong Lin Chia-Mu Yu Ching-Ting Tu Shih-Hao Chang
ix
Kyoto University, Japan Tokyo City University, Japan Fukuoka Institute of Technology, Japan Iwate Prefectural University, Japan National Taiwan University of Science and Technology, Taiwan National Chung Hsing University, Taiwan National Chung Hsing University, Taiwan Tamkang University, Taiwan
2. Context and Location-aware Computing Track Co-chairs Massimo Ficco Jeng-Wei Lin
University of Campania Luigi Vanvitelli, Italy Tunghai University, Taiwan
PC Members Kandaraj Piamrat Kamal Singh Seunghyun Park Paolo Bellavista David Camacho Michal Choras Gianni D’Angelo Hung-Yu Kao Ray-I Chang Mu-Yen Chen Shian-Hua Lin Chun-Hsin Wu Sheng-Lung Peng
University of Nantes, France University of Jean Monnet, France Korea University, Korea University of Bologna, Italy Universidad Autónoma de Madrid, Spain University of Science and Technology, Poland University of Benevento, Italy National Cheng Kung University, Taiwan National Taiwan University, Taiwan National Taichung University of Science and Technology, Taiwan National Chi Nan University, Taiwan National University of Kaohsiung, Taiwan National Dong Hwa University, Taiwan
3. Data Management and Big Data Track Co-chairs Been-Chian Chien Akimitsu Kanzaki Wen-Yang Lin
National University of Tainan, Taiwan Shimane University, Japan National University of Kaohsiung, Taiwan
PC Members Hideyuki Kawashima Tomoki Yoshihisa
Keio University, Japan Osaka University, Japan
x
Pruet Boonma Masato Shirai Bao-Rong Chang Rung-Ching Chen Mong-Fong Horng Nik Bessis James Tan Kun-Ta Chuang Jerry Chun-Wei Lin
IMIS-2021 Organizing Committee
Chiang Mai University, Thailand Shimane University, Japan National University of Kaohsiung, Taiwan Chaoyang University of Technology, Taiwan National Kaohsiung University of Applied Sciences, Taiwan Edge Hill University, UK SIM University, Singapore National Cheng Kung University, Taiwan Harbin Institute of Technology Shenzhen Graduate School, China
4. Security, Trust and Privacy Track Co-chairs Tianhan Gao Olivia Fachrunnisa
Northeastern University, China UNISSULA, Indonesia
PC Members Qingshan Li Zhenhua Tan Zhi Guan Nan Guo Xibin Zhao Cristina Alcaraz Massimo Cafaro Giuseppe Cattaneo Zhide Chen Clara Maria Richard Hill Dong Seong Kim Victor Malyshkin Barbara Masucci Arcangelo Castiglione Xiaofei Xing Mauro Iacono Joan Melià Jordi Casas Jordi Herrera Antoni Martínez Francesc Sebé
Peking University, China Northeastern University, China Peking University, China Northeastern University, China Tsinghua University, China Universidad de Málaga, Spain University of Salento, Italy University of Salerno, Italy Fujian Normal University, China Colombini, University of Milan, Italy University of Derby, United Kingdom University of Canterbury, New Zealand Russian Academy of Sciences, Russia University of Salerno, Italy University of Salerno, Italy Guangzhou University, China Second University of Naples, Italy Universitat Oberta de Catalunya, Spain Universitat Oberta de Catalunya, Spain Universitat Autònoma de Barcelona, Spain Universitat Rovira i Virgili, Spain Universitat de Lleida, Spain
IMIS-2021 Organizing Committee
xi
5. Energy Aware and Pervasive Systems Track Co-chairs Chi Lin Elis Kulla
Dalian University of Technology, China Okayama University of Science, Japan
PC Member Jiankang Ren Qiang Lin Peng Chen Tomoya Enokido Makoto Takizawa Oda Tetsuya Admir Barolli Makoto Ikeda Keita Matsuo
Dalian University of Technology, China Dalian University of Technology, China Dalian University of Technology, China Rissho University, Japan Hosei University, Japan Okayama University of Science, Japan Aleksander Moisiu University of Durres, Albania Fukuoka Institute of Technology, Japan Fukuoka Institute of Technology, Japan
6. Modeling, Simulation and Performance Evaluation Track Co-chairs Tetsuya Shigeyasu Bhed Bahadur Remy Dupas
Prefectural University of Hiroshima, Japan Bista Iwate Prefectural University, Japan University of Bordeaux, France
PC Members Jiahong Wang Shigetomo Kimura Chotipat Pornavalai Danda B. Rawat Gongjun Yan Akio Koyama Sachin Shetty
Iwate Prefectural University, Japan University of Tsukuba, Japan King Mongkut’s Institute of Technology Ladkrabang, Thailand Howard University, USA University of Southern Indiana, USA Yamagata University, Japan Old Dominion University, USA
7. Wireless and Mobile Networks Track Co-chairs Luigi Catuogno Hwamin Lee
University of Salerno, Italy Soonchunhyang University, Korea
xii
IMIS-2021 Organizing Committee
PC Members Aniello Del Sorbo Clemente Galdi Stefano Turchi Ermelindo Mauriello Gianluca Roscigno Dae Won Lee Jong Hyuk Lee Sung Ho Chin Ji Su Park Jaehwa Chung
Orange Labs – Orange Innovation, UK University of Naples “Federico II”, Italy University of Florence, Italy Deloitte Spa, Italy University of Salerno, Italy Seokyeong University, Korea Samsung Electronics, Korea LG Electronics, Korea Korea University, Korea Korea National Open University, Korea
8. Intelligent Technologies and Applications Track Co-chairs Marek Ogiela Yong-Hwan Lee Jacek Kucharski
AGH University of Science and Technology, Poland Wonkwang University, Korea Technical University of Lodz, Poland
PC Members Gangman Yi Hoon Ko Urszula Ogiela Lidia Ogiela Libor Mesicek Rung-Ching Chen Mong-Fong Horng Bao-Rong Chang Shingo Otsuka Pruet Boonma Izwan Nizal Mohd Shaharanee
Gangneung-Wonju National University, Korea J. E. Purkinje University, Czech Republic Pedagogical University of Krakow, Poland Pedagogical University of Krakow, Poland J. E. Purkinje University, Czech Republic Chaoyang University of Technology, Taiwan National Kaohsiung University of Applied Sciences, Taiwan National University of Kaohsiung, Taiwan Kanagawa Institute of Technology, Japan Chiang Mai University, Thailand University Utara, Malaysia
9. Cloud Computing and Service-Oriented Applications Track Co-chairs Baojiang Ciu Neil Yen Flora Amato
Beijing University of Posts and Telecommunications, China The University of Aizu, Japan University of Naples “Frederico II”, Italy
IMIS-2021 Organizing Committee
xiii
PC Members Aniello Castiglione Ashiq Anjum Beniamino Di Martino Gang Wang Shaozhang Niu Jianxin Wang Jie Cheng Shaoyin Cheng Jingling Zhao Qing Liao Xiaohui Li Chunhong Liu Yan Zhang Hassan Althobaiti Bahjat Fakieh Jason Hung Frank Lai
University of Naples Parthenope, Italy University of Derby, UK University of Campania “Luigi Vanvitelli”, Italy Nankai University, China Beijing University of Posts and Telecommunications, China Beijing Forestry University, China Shandong University, China University of Science and Technology of China, China Beijing University of Posts and Telecommunications, China Beijing University of Posts and Telecommunications, China Wuhan University of Science and Technology, China Hainan Normal University, China Yan Hubei University, China Umm Al-Qura University, Saudi Arabia King Abdulaziz University, Saudi Arabia National Taichung University of Science and Technology, Taiwan University of Aizu, Japan
10. Ontology and Semantic Web Track Co-chairs Alba Amato Fong-Hao Liu Giovanni Cozzolino
Italian National Research Council, Italy National Defense University, Taiwan University of Naples “Frederico II”, Italy
PC Members Flora Amato Claudia Di Napoli Salvatore Venticinque Marco Scialdone Wei-Tsong Lee Tin-Yu Wu Liang-Chu Chen Omar Khadeer Hussain Salem Alkhalaf
University of Naples “Federico II”, Italy Italian National Research Center (CNR), Italy University of Campania “Luigi Vanvitelli”, Italy University of Campania “Luigi Vanvitelli”, Italy Tam-Kang University, Taiwan National Ilan University, Taiwan National Defense University, Taiwan University of New South Wales (UNSW) Canberra, Australia Qassim University, Saudi Arabia
xiv
Osama Alfarraj Thamer AlHussain Mukesh Prasad
IMIS-2021 Organizing Committee
King Saud University, Saudi Arabia Saudi Electronic University, Saudi Arabia University of Technology Sydney, Australia
11. IoT and Social Networking Track Co-chairs Sajal Mukhopadhyay Francesco Moscato
National Institute of Technology, Durgapur, India University of Campania Luigi Vanvitelli, Italy
PC Members Animesh Dutta Sujoy Saha Jaydeep Howlader Mansaf Alam Kashish Ara Shakil Makoto Ikeda Elis Kulla Shinji Sakamoto Evjola Spaho
NIT Durgapur, India NIT Durgapur, India NIT Durgapur, India Jamia Millia Islamia, New Delhi, India Jamia Hamadard, New Delhi, India Fukuoka Institute of Technology, Japan Okayama University of Science, Japan Seikei University, Japan Polytechnic University of Tirana, Albania
12. Embedded Systems and Wearable Computers Track Co-chairs Jiankang Ren Keita Matsuo Kangbin Yim
Dalian University of Technology, China Fukuoka Institute of Technology, Japan SCH University, Korea
PC Members Yong Xie Xiulong Liu Shaobo Zhang Kun Wang Fangmin Sun Kyungroul Lee Kaoru Sugita Tomoyuki Ishida Noriyasu Yamamoto Nan Guo
Xiamen University of Technology, Xiamen, China The Hong Kong Polytechnic University, Hong Kong Hunan University of Science and Technology, China Liaoning Police Academy, China Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China SCH University, Korea Fukuoka Institute of Technology, Japan Fukuoka Institute of Technology, Japan Fukuoka Institute of Technology, Japan Northeastern University, China
IMIS-2021 Organizing Committee
xv
IMIS-2021 Reviewers Leonard Barolli Makoto Takizawa Fatos Xhafa Isaac Woungang Hyunhee Park Hae-Duck Joshua Jeong Fang-Yie Leu Kangbin Yim Marek Ogiela Makoto Ikeda Keita Matsuo Francesco Palmieri Massimo Ficco Salvatore Venticinque Admir Barolli Elis Kulla Arjan Durresi Bhed Bista Hsing-Chung Chen Kin Fun Li Hiroaki Kikuchi Lidia Ogiela Nan Guo Hwamin Lee Tetsuya Shigeyasu Fumiaki Sato Kosuke Takano Flora Amato Tomoya Enokido Minoru Uehara Santi Caballé Tomoyuki Ishida Hwa Min Lee
Jiyoung Lim Tianhan Gao Danda Rawat Farookh Hussain Jong Suk Lee Omar Hussain Nadeem Javaid Zahoor Khan Chi-Yi Lin Luigi Catuogno Akimitsu Kanzaki Wen-Yang Lin Tomoki Yoshihisa Masaki Kohana Hiroki Sakaji Baojiang Cui Takamichi Saito Arcangelo Castiglione Shinji Sakamoto Massimo Cafaro Mauro Iacono Barbara Masucci Ray-I Chang Gianni D’Angelo Remy Dupas Aneta Poniszewska-Maranda Wang Xu An Sajal Mukhopadhyay Lelio Campanile Tomoyuki Ishida Yong-Hwan Lee Lidia Ogiela Hiroshi Maeda
IMIS-2021 Keynote Talks
Asking AI Why: Explainable Artificial Intelligence Jayh (Hyunhee) Park Myongji University, Yongin, Korea
Abstract. In the early phases of AI adoption, it was okay to not understand what the model predicts in a certain way, as long as it gives the correct outputs. Explaining how they work was not the first priority. Now, the focus is turning to build human interpretable models. In the invited talk, I will explain why eXplainable AI is important. Then, I will explain an AI model. Through this invited talk, I will discuss models such as ensembles and neural networks called black-box models. I will deal with the following questions. • Why should we trust your model? • Why did the model take a certain decision? • What drives model predictions?
xix
Co-evolution of Semantic and Blockchain Technologies Antonio Esposito University of Campania “Luigi Vanvitelli”, Aversa, Italy
Abstract. Semantic technologies have demonstrated to have the capability to ease interoperability and portability issues in several application fields such as Cloud Computing and the Internet of Things (IoT). Indeed, the increase in resource representation and the inference capabilities enabled by semantic technologies represent important components of current distributed software systems, which can rely on better information interoperability and decision autonomy. However, semantics alone cannot solve trust and reliability issues that, in many situations, can still arise within software systems. Blockchain solutions have shown to be effective in this area, creating data sharing infrastructure where information validation can be done without the necessity of third-party services. A co-evolution and integration of semantic and blockchain technologies would at the same time enhance data interoperability and ensure data trust and provenance, creating undeniable benefits for distributed software systems. This talk will focus on the current state of the art regarding the integration of semantic and blockchain technologies, looking at the state of their co-evolution, at the available and still needed solutions.
xxi
Contents
Implementation of a VR Preview Simulation System by Capturing the Human Body Movements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Tomoyuki Ishida and Kyosuke Sakuma
1
Projection Region Detection Model Based on BASNet . . . . . . . . . . . . . . Yukang Zhao, Nan Guo, and Xinbei Jiang
10
Analysis of Epidemic Events Based on Event Evolutionary Graph . . . . Kang Xie, Tao Yang, Rourong Fan, and Guoqing Jiang
20
Proposal and Development of Anonymization Dictionary Using Public Information Disclosed by Anonymously Processed Information Handling Business Operators . . . . . . . . . . . . . . . . . . . . . . . Masahiro Fujita, Yasuoki Iida, Mitsuhiro Hattori, Tadakazu Yamanaka, Nori Matsuda, Satoshi Ito, and Hiroaki Kikuchi
30
VANETs Road Condition Warning and Vehicle Incentive Mechanism Based on Blockchain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Shuai Zhou and Tianhan Gao
40
An Authentication Scheme for Car-Home Connectivity Services in Vehicular Ad-Hoc Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Cong Zhao, Nan Guo, Tianhan Gao, Jiayu Qi, and Xinyang Deng
50
Vulnerability Analysis of Software Piracy and Reverse Engineering: Based on Software C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jaehyuk Lee, Kangbin Yim, and Kyungroul Lee
59
PtPeach: Improved Design and Implementation of Peach Fuzzing Test for File Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hua Yang, Jingling Zhao, and Baojiang Cui
67
xxiii
xxiv
Contents
An Efficient Approach to Enhance the Robustness of Scale-Free Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Syed Minhal Abbas, Nadeem Javaid, Muhammad Usman, Shakira Musa Baig, Arsalan Malik, and Anees Ur Rehman A Blockchain Based Secure Authentication and Routing Mechanism for Wireless Sensor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Usman Aziz, Muhammad Usman Gurmani, Saba Awan, Maimoona Bint E. Sajid, Sana Amjad, and Nadeem Javaid Blockchain Based Authentication and Trust Evaluation Mechanism for Secure Routing in Wireless Sensor Networks . . . . . . . . . . . . . . . . . . Saba Awan, Maimoona Bint E. Sajid, Sana Amjad, Usman Aziz, Usman Gurmani, and Nadeem Javaid
76
87
96
Towards Energy Efficient Smart Grids: Data Augmentation Through BiWGAN, Feature Extraction and Classification Using Hybrid 2DCNN and BiLSTM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 Muhammad Asif, Benish Kabir, Pamir, Ashraf Ullah, Shoaib Munawar, and Nadeem Javaid Comparative Study of Data Driven Approaches Towards Efficient Electricity Theft Detection in Micro Grids . . . . . . . . . . . . . . . . . . . . . . . 120 Faisal Shehzad, Muhammad Asif, Zeeshan Aslam, Shahzaib Anwar, Hamza Rashid, Muhammad Ilyas, and Nadeem Javaid Routing Strategy for Avoiding Obstacles During Message Forwarding in Mobile Ad-Hoc Network . . . . . . . . . . . . . . . . . . . . . . . . . 132 Qiang Gao, Tetsuya Shigeyasu, and Chunxiang Chen Fuzzing Method Based on Selection Mutation of Partition Weight Table for 5G Core Network NGAP Protocol . . . . . . . . . . . . . . . . . . . . . 144 Yang Hu, Wenchuan Yang, Baojiang Cui, Xiaohui Zhou, Zhijie Mao, and Ying Wang Simulation Results of a DQN Based AAV Testbed in Corner Environment: A Comparison Study for Normal DQN and TLS-DQN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 Nobuki Saito, Tetsuya Oda, Aoto Hirata, Kyohei Toyoshima, Masaharu Hirota, and Leonard Barolli Stochastic Geometric Analysis of IRS-aided Wireless Networks Using Mixture Gamma Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 Yunli Li and Young Jin Chun
Contents
xxv
Performance Evaluation of CM and RIWM Router Replacement Methods for WMNs by WMN-PSOHC Hybrid Intelligent Simulation System Considering Chi-square Distribution of Mesh Clients . . . . . . . . 179 Shinji Sakamoto, Yi Liu, Leonard Barolli, and Shusuke Okamoto Web Page Classification Based on Graph Neural Network . . . . . . . . . . 188 Tao Guo and Baojiang Cui Malicious Encrypted Traffic Identification Based on Four-Tuple Feature and Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 Kunlin Li and Baojiang Cui Improved Optimal Reciprocal Collision Avoidance Algorithm in Racing Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 Wenyu Zhang and Tianhan Gao An Intelligent Travel Application Using CNN-Based Deep Learning Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 Kyungmin Lee, Mincheol Shin, Yongho Kim, and Hae-Duck J. Jeong Reduced CNN Model for Face Image Detection with GAN Oversampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232 Jiha Kim and Hyunhee Park Power Meter Software Quality Analysis Based on Dynamic Binary Instrumentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242 Zhai Feng, Kong Lingda, Xu Yongjin, and Ye Xin DClu: A Direction-Based Clustering Algorithm for VANETs Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253 Marco Lapegna and Silvia Stranieri ComPass: Proximity Aware Common Passphrase Agreement Protocol for Wi-Fi Devices Using Physical Layer Security . . . . . . . . . . . 263 Khan Reaz and Gerhard Wunder Fuzzing with Multi-dimensional Control of Mutation Strategy . . . . . . . 276 Han Xu, Baojiang Cui, and Chen Chen An ELF Recovery Method for Linux Malicious Process Detection . . . . . 285 Zheng Wang, Baojiang Cui, and Yang Zhang A High Efficiency and Accuracy Method for x86 Undocumented Instruction Detection and Classification . . . . . . . . . . . . . . . . . . . . . . . . . 295 Jiatong Wu, Baojiang Cui, Chen Chen, and Xiang Long Simulation-Based Fuzzing for Smart IoT Devices . . . . . . . . . . . . . . . . . 304 Fanglei Zhang, Baojiang Cui, Chen Chen, Yiqi Sun, Kairui Gong, and Jinxin Ma
xxvi
Contents
On the Neighborhood-Connectivity of Locally Twisted Cube Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314 Tzu-Liang Kung, Cheng-Kuan Lin, and Chun-Nan Hung Assessing the Super Pk -Connectedness of Crossed Cubes . . . . . . . . . . . . 322 Yuan-Hsiang Teng and Tzu-Liang Kung Learning Performance Prediction with Imbalanced Virtual Learning Environment Students’ Interactions Data . . . . . . . . . . . . . . . . 330 Hsing-Chung Chen, Eko Prasetyo, Prayitno, Sri Suning Kusumawardani, Shian-Shyong Tseng, Tzu-Liang Kung, and Kuei-Yuan Wang An APP-Based E-Learning Platform for Artificial Intelligence Cross-Domain Application Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341 Anthony Y. H. Liao Automatic Control System for Venetian Blind in Home Based on Fuzzy Sugeno Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 Hsing-Chung Chen, Galang Wicaksana, Agung Mulyo Widodo, Andika Wisnujati, Tzu-Liang Kung, and Wen-Yen Lin Combining Pipeline Quality with Automation to CI/CD Process for Improving Quality and Efficiency of Software Maintenance . . . . . . . 362 Sen-Tarng Lai, Heru Susanto, and Fang-Yie Leu Efficient Adaptive Resource Management for Spot Workers in Cloud Computing Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373 Lung-Pin Chen, Fang-Yie Leu, Hsin-Ta Chiao, and Hung-Jr Shiu You Draft We Complete: A BERT-Based Article Generation by Keyword Supplement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383 Li-Gang Jiang, Yi-Hong Chan, Yao-Chung Fan, and Fang-Yie Leu A Q-Learning Based Downlink Scheduling Algorithm for Multiple Traffics in 5G NR Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393 Zhi-Qian Hong, Heru Susanto, and Fang-Yie Leu Study on the Relationship Between Dividend, Business Cycle, Institutional Investor and Stock Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 Yung-Shun Tsai, Chun-Ping Chang, and Shyh-Weir Tzang The Choice Between FDI and Selling Out with Externality and Exchange Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412 Chun-Ping Chang, Yung-Shun Tsai, Khunshagai Batjargal, Hong Nhung Nguyen, and Shyh-Weir Tzang Performance of the Initial Public Offering in the Taiwan Stock Market . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420 Shyh-Weir Tzang, Chun-Ping Chang, Tsatsral Ochirbal, Bolor Sukhbaatar, and Yung-Shun Tsai
Contents
xxvii
Analysis of the Causal Relationship Among Diversification Strategies, Financial Performance and Market Values by the Three-Stage Least Squares (3SLS) . . . . . . . . . . . . . . . . . . . . . . . 431 Ying-Li Lin, Kuei-Yuan Wang, and Jia-Yu Chen Macroeconomic Variables and Investor Sentiment . . . . . . . . . . . . . . . . . 438 Mei-Hua Liao, Chun-Min Wang, and Ya-Lan Chan College Students’ Learning Motivation and Learning Effectiveness by Integrating Knowledge Sharing, Action Research and Cooperative Learning in Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 444 Tin-Chang Chang, I.-Tien Chu, and Pei-Shih Chen A Feasibility Study of the Introduction of Service Apartment Operation Model on Long-Term Care Institutions . . . . . . . . . . . . . . . . . 464 Kuei-Yuan Wang, Ying-Li Lin, Chien-Kuo Han, and Shu-Tzu Shih Research on the Integration of Information Technology into Aesthetic Teaching in Kindergartens . . . . . . . . . . . . . . . . . . . . . . . . . . . 472 Ya-Lan Chan, Pei-Fang Lu, Sue-Ming Hsu, and Mei-Hua Liao A Simulation System for Analyzing Attack Methods in Controller Area Network Using Fuzzing Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 478 Mitsuki Tsuneyoshi, Masahiro Miwata, Daisuke Nishii, Makoto Ikeda, and Leonard Barolli Combination of Pseudonym Changing with Blockchain-Based Data Credibility for Verifying Accuracy of Latest Vehicle Information in VANETs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485 Lei Zhao and Tianhan Gao Vulnerability Analysis of Intellectual Property Rights Using Reverse Engineering: Based on Software D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494 Wontae Jung, Kangbin Yim, and Kyungroul Lee Web Crawler for an Anonymously Processed Information Database . . . 501 Hiroaki Kikuchi, Atsuki Ono, Satoshi Ito, Masahiro Fujita, and Tadakazu Yamanaka A Scheme for Checking Similarity and Its Application in Outsourced Image Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511 Su Yunxuan, Wang Xu An, and Ge Yu Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
Implementation of a VR Preview Simulation System by Capturing the Human Body Movements Tomoyuki Ishida(B) and Kyosuke Sakuma Fukuoka Institute of Technology, Fukuoka 811-0295, Fukuoka, Japan [email protected], [email protected] Abstract. In this research, we implemented a function that enables an avatar to operate the VR space in conjunction with the captured user’s movement. Then, we constructed a VR preview simulation system by applying this function to VR preview. By capturing the user’s movements, this system allows the avatar to walk through and open and close the door in the VR preview space in synchronization with the user’s walking movement and door opening and closing movement in the real world, respectively. This system provides the user with a free space experience and an interior furniture layout arrangement according to their own movements.
1 Introduction Due to the remarkable development of information system and information communication network technologies in recent years, services utilizing these technologies are now indispensable to our lives. Among such technologies, virtual reality (VR), augmented reality (AR), and mixed reality (MR) stand out. With the spread of inexpensive VR devices and tablet terminals, these technologies are presently widely used in various fields, such as medical care, welfare, tourism, and education. In particular, the construction field adopts VR/AR/MR technologies such as VR simulation systems, AR interior simulation systems, and VR preview systems. Presently, with the onset of the new coronavirus (COVID-19) pandemic, a preview system that utilizes VR technology is becoming increasingly important as a tool to prevent infection. A survey by Spacely, Inc. revealed that 360° panoramic VR browsing is common when searching for rental properties [1]. However, most of these devices only yield images taken from the camera viewpoint.
2 Research Objective In this study, we used a RealSense Depth Camera [2] with depth measurement and motion tracking functions. By capturing the user’s movements using RealSense, we obtained a VR preview simulation system that enables walk-through and door opening/closing in the VR space. In this system, when the 3D model avatar that captures the user is placed in the VR space, it performs the same operation in synchronization with the movement of the user. As a result, when the user walks in the real space, the 3D model avatar walks in the VR space. This system provides the user with free and intuitive VR space operation by realizing the interior layout and walk-through of the 3D model avatar in the VR space by the user’s gesture operation. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 1–9, 2022. https://doi.org/10.1007/978-3-030-79728-7_1
2
T. Ishida and K. Sakuma
3 System Configuration Figure 1 shows the configuration of our system, which consists of the VR preview space control function, VR preview space, avatar control function, and object storage.
Fig. 1. System configuration of the VR preview simulation system.
3.1 VR Preview Space Control Function The VR space preview control function presents the VR preview space to the user and reflects the avatar’s walk-through and door opening/closing motions synchronized with the user’s gesture motion in the VR preview space. The VR preview space is presented to the user via HMD. The user can walk through and open/close the door in the VR preview space by walking and opening/closing the door in the real space, respectively. 3.2 VR Preview Space The VR preview space is a virtual space presented to the user through the VR preview space control function. 3.3 Avatar Control Function The avatar control function reads the movement of the user’s whole body captured via the gesture sensor and synchronizes it with the movement of the avatar placed in the VR preview space via the VR preview space control function. This function enables avatar walk-through and interior operation in the VR preview space by recognizing the user’s walking motion and door opening/closing motion in the real space.
Implementation of a VR Preview Simulation System
3
3.4 Object Storage Interior objects that can be placed in the VR preview space are stored in the object storage. According to the user’s operation order, the interior object is taken out from the object storage via the VR preview space control function and reflected in the VR preview space.
4 System Architecture Figure 2 shows the system architecture, which consists of the Virtual Reality Private View Simulation System Control Function, Virtual Reality Private View Simulation Space Control Function, HMD Function, and Object Storage.
Fig. 2. System architecture of the VR preview simulation system.
5 VR Preview Simulation System As mentioned above, we developed a system that can operate the VR preview space by recognizing the user’s gesture using RealSense. In this system, the user’s movement captured by skeleton tracking is synchronized with the avatar in the VR preview space in real time. Figure 3 shows the initial screen of this system, and Fig. 4 shows the initial state of the avatar. The functions of the VR preview simulation system includes a walk-through function, a door/window opening/closing function, and an interior layout function. 5.1 Walk-Through Function The walk-through function is a mechanism that enables the avatar in the VR preview space to walk in conjunction with the user’s movement of walking in the real space. Figure 5 shows the states before and after the movement by the walk-through function, and Fig. 6 shows the state of the user during the walking motion.
4
T. Ishida and K. Sakuma
Fig. 3. Initial screen of the VR preview simulation system.
Fig. 4. Initial state of the avatar.
Fig. 5. Before and after the movement by the walk-through function.
Implementation of a VR Preview Simulation System
5
Fig. 6. A user walks through the VR preview space through walking motion.
5.2 Door/Window Opening/Closing Function The door/window opening/closing function is a mechanism that enables the avatar to open and close the doors and windows placed in the VR preview space in conjunction with the user’s arm movement in the real space. Figure 7 shows the opening and closing of the door, and Fig. 8 shows the state of the user during opening and closing motion.
Fig. 7. Door opening and closing through the door/window opening and closing function.
6
T. Ishida and K. Sakuma
Fig. 8. A user operating the door in the VR preview space through opening and closing motion.
Fig. 9. A VR preview space with interior objects.
Implementation of a VR Preview Simulation System
7
5.3 Interior Layout Function The interior layout function enables the user to freely arrange the interior in the VR preview space by selecting the interior from the 3D menu. Figure 9 shows the VR preview space where the interior is arranged, and Fig. 10 shows the interior objects that can be arranged in the VR preview space.
Fig. 10. Interior objects that can be arranged in the VR preview space.
6 Functionality Evaluation We evaluated the functionality of the VR preview simulation system for the walkthrough, door/window opening/closing, and interior layout functions. Regarding the functionality of the walk-through function (Fig. 11), 83% of the participants answered “satisfied” or “somewhat satisfied.” This result confirms the high functionality of the walk-through function. Regarding the functionality of the door/window opening/closing function (Fig. 12), 67% of the participants answered “somewhat satisfied,” while 33% of the subjects answered “no opinion.” This result indicates that the functionality of the door/window opening/closing function must be improved. Finally, regarding the functionality of the interior layout function (Fig. 13), 100% of the participants answered “satisfied” or “somewhat satisfied,” thus confirming the high functionality of the interior layout function.
8
T. Ishida and K. Sakuma
Fig. 11. Functionality of the walk-through function (n = 6).
Fig. 12. Functionality of the door/window opening/closing function (n = 6).
Implementation of a VR Preview Simulation System
9
Fig. 13. Functionality of the interior layout function (n = 6).
7 Conclusion In this paper, we described the construction and evaluation of a VR preview simulation system by capturing the human body. Additionally, we constructed a VR preview simulation system that enables free walk-through, the opening and closing of doors/windows, and interior layout arrangement in the VR preview space by capturing the human body using RealSense. After evaluating the functionality of this system, we conclude that the walk-through and interior layout functions showed high functionality whereas the door/window opening/closing function presented some functionality issues.
References 1. Spacely, Inc.: Consumer awareness survey report on changing rental real estate in new coronavirus. https://tips.spacely.co.jp/covid_realestate/?_ga=2.69517904.1384256498. 1601879327-1026682804.1601879327. Accessed Feb 2021 2. Intel Corporation: Depth camera D435. https://www.intelrealsense.com/depth-camera-d435/. Accessed Feb 2021
Projection Region Detection Model Based on BASNet Yukang Zhao1 , Nan Guo1(B) , and Xinbei Jiang2 1
Computer Science and Engineering College, Northeastern University, Shenyang 110004, China [email protected] 2 Software College, Northeastern University, Shenyang 110004, China [email protected]
Abstract. As smart devices are used to assist in understanding presentation slides, such as machine translation and augmented reality, in seminars and conferences, separating the projection region from the foreground and background will improve the accuracy of recognition of slides. To take on this challenge, this paper proposes a novel detection model, BASNet with channel and space attention (CS-BASNet), to accurately detect projection area. Firstly, channel module and space attention module are added to the backbone network of ResNet18 to improve the detection evaluation metrics. Secondly, to reduce the influence of color on detection, we used gray images as inputs to train the model. We also propose a projection region dataset which sets projection region as foreground. The test results on this dataset show that the evaluation metrics of CS-BASNet is better than that of BASNet, and it showed that the evaluation metrics of CS-BASNet in detecting projection region has outperformed the latest five salient object detection models.
1
Introduction
International conferences are becoming more and more common with the blooming of economy and globalization. When mobile smart devices such as mobile phones and smart glasses are used to assist in understanding the projection content, separating the projection region from the background will improve the subsequent recognition accuracy. In practice it is necessary to reliably detect the projection region in real time under complex background when detecting the projection region. It is a great challenge for computer vision to locate the region of interest in complex situations. It is classified as salient object detection (SOD) when studying this kind of problem. SOD is used in image compression, segmentation, redirection, video coding, target detection and recognition and other occasions [1–3]. This paper mainly studies the methods of salient object detection. SOD methods are generally divided into traditional methods and deep learning methods. Traditional methods don’t require a lot of images for training, and often calculate saliency based on low-level features such as texture, color and edge gradient, c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 10–19, 2022. https://doi.org/10.1007/978-3-030-79728-7_2
Projection Region Detection Model Based on BASNet
11
as well as artificially designed frequency domain and low-rank matrix. On the contrary, the deep learning SOD model combines low-level features in images non-heuristically to obtain high-level semantic features. Then when calculating the image saliency, it uses high-level semantic features. In this paper,we propose a projection region detection model based on edge information. Our contributions are as follows. 1. The edge features and boundary connectivity of the projection region in the image are fully considered, and they are integrated to make the image saliency detection more robust and general. 2. An attention module is added on the basis of BASNet. The attention map is inferred from two dimensions of channel and space. Then the attention map is multiplied by the input feature map for adaptive feature optimization. 3. We made the PR-SOD dataset, and the test results of CS-BASNet on this dataset show that the performance of the proposed model achieves the expected goal. And the performance of CS-BASNet in detecting projection regions has reached a better level. The remainder of this paper is organized as follows: We discussed the related work in Sect. 2, and put forward the detection framework in Sect. 3, Sect. 4 is the experimental results, and Sect. 5 is the summary of the paper.
2
Related Work
In the past 20 years, there have been dozens of traditional methods and deep learning methods proposed for SOD. In the next part, we mainly reviewed some traditional salient object detection methods and salient object detection models based on deep learning. 2.1
Traditional Salient Object Detection Methods
Block-Based Salient Object Detection Methods: Block-based SOD methods were adopted in the early days [4]. Rosin [5] proposed an effective method to detect salient object. His method is parameter less and only needs very simple pixel operation, but the obtained region is relatively rough. Li et al. [6] defined center-around contrast as a cost-sensitive maximum marginal classification problem. The central patch is marked as a positive sample, detecting salient edges and vertices in hypergraph and capturing context saliency information of image pixels/super pixels. Block-based saliency detection methods can hardly preserve salient object boundaries well enough. Region-Based Salient Object Detection Methods: In order to solve the shortcomings of block-based saliency detection, Jiang et al. [7] proposed a method based on multi-scale local region contrast. This method enhances robustness by calculating the saliency values of multiple segmented regions, and combines the saliency values of these regions to obtain pixel-level saliency mapping.
12
Y. Zhao et al.
In paper [8], propose a regional contrast based saliency extraction algorithm, which simultaneously evaluates global contrast differences and spatial coherence. The proposed algorithm is simple, efficient, and yields full resolution saliency maps. 2.2
Salient Object Detection Based on Deep Learning
Multi-layer Perceptron (MLP)-Based Methods: MLP-based methods usually extract the deep features of each processing unit of the image to train MLP-classifiers for calculating saliency [9]. SSD [10] firstly uses candidate boxes to generate candidate regions on the image, then uses the improved CNN to detect the saliency map of the candidate regions, thus obtaining the shape prediction of saliency objects to acquire a rough saliency map. Then it uses specific middle and low-level information to refine the prediction map. MDF [11] introduces a neural network architecture, which has fully connected layers on top of CNNs responsible for feature extraction at three different scales. They then propose a refinement method to enhance the spatial coherence of our saliency results. Models based on multilayer perceptron cannot capture spatial information well. Fully Convolutional Network (FCN)-Based Methods: Due to the great success of FCN [12] in semantic segmentation, many SOD solutions based on deep learning in recent years have adopted classical classification models, such as VGGNet [13] and ResNet [14]. PAGE-net [15] proposes a pyramid attention module and an edge detection module for Salient object detection. The former effectively expands the receptive field and extracts multi-scale clues, while the latter uses clear edge information to locate and sharpens the boundary of saliency objects. BASNet [16] uses an encoder-decoder module to make coarse saliency prediction, and then refines the saliency map with a fine module. The mixed loss function is used in the training process, which is beneficial to accurate boundary prediction. MLSLNet [17] can detect foreground contour more accurately by guiding foreground edge detection tasks to each other. A mutual learning module is designed, and more information can be learned from multiple modules. SCRN [18] proposes a stacked cross refining unit (CRU), which can refine the features of Salient object detection and edge detection at the same time. The overall effect of this method is better than other methods in salient object detection.
3
CS-BASNet
In order to realize the accurate detection of projection region and improve the detection ability of BASNet, a projection area detection model based on edge information is proposed on the basis of BASNet. BASNet is modular and can be easily adapted or expanded to other tasks by replacing prediction networks or refining modules, so BASNet was chosen as the basic model.Firstly, in the
Projection Region Detection Model Based on BASNet
13
process of model training, the targeting ability of the basic model is improved by strengthening the detection of edge information in pictures. The backbone network of CS-BASNet is ResNet18 rather than ResNet34 [14]. Because the edge of the projection region is regular, and its features are less than those of general salient objects. By introducing attention mechanism into ResNet18, the detection performance of projection area is improved. 3.1
CS-BASNet Architecture
The CS-BASNet module modifies the prediction module according to the characteristics of the detected projection region on the basis of the BASNet model. The module structure is shown in Fig. 1. The prediction module is composed of an encoder-decoder network. This structure can capture both high-level global context information and low-level features. The encoder part consists of an input convolution and six basic res-blocks. The first four stages and the input convolution layer adopt ResNet18, which is different from BASNet. The number of network layers required is less. Because the edge of the projection region is regular, the number of features is less than those of general salient objects. We add an attention module to ResNet18, but it does not affect its structure. By Keeping the change of the input layer filter, the feature image with the size of 3 × 3 and stride of 1 can still keep the same resolution as the input image before the second stage. Our input layer retains the last layer of the decoder, which is supervised by the ground truth enlightened by HED [19]. The method based on global contrast can separate large-scale foreground objects from their background environment, and can produce higher saliency values at or near the edge of the object than the method based on local contrast [8].To reduce the impact of global information, we removed the bridging phase.
Fig. 1. Architecture of CS-BASNet.
The decoder and encoder are BASNet-like which is almost symmetrical. Each stage of the encoder includes three convolution layers, followed by batch standardization and ReLU activation functions. The input of each stage is a feature map of the connection between its previous stage and its up-sampled output of
14
Y. Zhao et al.
the corresponding stage in the encoder. There are a total of six outputs. The last prediction map is taken as the final output of the prediction module which is then taken as the input of the refining module because of its highest accuracy in significance detection among others. The original refining module [16] is retained, because the refining module in BASNet can refine the relatively fuzzy prediction map of the prediction module. This effort is achieved by learning the residual error between the ground truth and the prediction map. 3.2
Attention Module
Convolution Attention Module (CBAM) [20] is an efficient and concise attention module for feedforward convolution neural networks. For feature maps, CBAM module will infer attention maps in turn along two independent dimensions of channel and space, and then multiply the attention maps with the input feature maps for adaptive feature optimization. It can be integrated into any CNN architecture regardless of the cost of the module, and end-to-end training can be carried out together with the CNN, because CBAM is a lightweight general module. Therefore, in the backbone network ResNet18, attention modules are added to the first and the last layer convolution. Through comparative experiments, Fmeasure has been improved under different training conditions by adding CBAM attention module. 3.3
Loss Function
Three loss functions are used including BCE, IoU and SSIM in BASNet. BCE is a pixel-level loss function, SSIM is a local region-level loss function, and IoU is a feature map-level loss function. When detecting the projection region, considering that the foreground region is generally large, IoU is discarded in the improved model. BCE cross entropy loss function is the most widely used loss in dichotomy and segmentation. It is defined as follows. lbce = −
[G(r, c) ∗ log(S(r, c) + (1 − G(r, c)) ∗ log(1 − S(r, c))]
(1)
(r,c)
Where G(r,c) ∈ [0, 1] is the ground truth value of the pixel (r,c) and S(r,c) is the pixel value of the prediction map. SSIM is an index to measure the similarity of two pictures. It is used to capture the structural information in the image. In the BASNet paper, x = {xj : j = 1, . . . , N 2 }, y = {yi : i = 1, . . . , N 2 } are respectively the pixel values of the N × N picture corresponding to the prediction map and the ground truth. It is defined as follows. lssim = 1 −
(2μx μy + C1 )(2σxy + C2 ) (μ2x + μ2y + C2 )(σx2 + σy2 + C2 )
(2)
Projection Region Detection Model Based on BASNet
4 4.1
15
Experimental Analysis Dataset
Datasets commonly used for salient object detection of RGB images include SOD, ECSSD, DUT-OMRON, PASCAL-S, HKU-IS, DUTS. For projection region detection, no public dataset has been found at present, and The dataset (PR-SOD) containing 2000 pieces of image has been made. The dataset includes 500 scene pictures which are selected, (1) the public scene recognition dataset SUN, (2) pictures taken by students in class, (3) network pictures and conference speech pictures (such as TED, etc.). Subsequently, the dataset was expanded to 2000 by data enhancement (increasing brightness, flipping and rotation). According to the actual needs of the task, we ignore the small region occlusion of the projection region, and take the label value as a quadrilateral. Some labels of PR-SOD dataset are shown in Fig. 2.
Fig. 2. PR-SOD dataset. (a) Original picture. (b) Simple label.
4.2
Experimental Setup
During the training process, the hardware configuration is an Intel Core TM i7 8700 3.2 GHz CPU (16 GB RAM) and RTX 2070 GPU (6 GB RAM) for training and testing. The size of each image is adjusted to 256 × 256 then randomly cropped to 224 × 224 during training. Resnet18 model is no longer used as pre-training for encoder parameter initialization, and other convolution layers are initialized by Xavier. The optimizer selects Adam, and its hyperparameters are set to default, where the initial learning rate lr = 1e − 3, eps = 1e − 8, betas = (0.9, 0.999) and the weight decay = 0. Without using the validation set, the network is trained until the loss converges. The training loss converges after 300 iterations, with the batch size of 2, and the training process takes about 80 h. During testing, the input image is adjusted to 256 × 256 and sent to the network to obtain its saliency map. Then, the saliency detection map is adjusted back to the original size, both of which use bilinear interpolation.
16
4.3
Y. Zhao et al.
Evaluation Metrics
Common evaluation criteria for salient object detection mainly include PR curve (Precision-Recall Curve), Mean Absolute Error (MAE) and F-measure curve (Fmeasure curve). PR curve is a standard method to evaluate the probability graph of prediction significance. By comparing the binarized saliency map with the ground truth, the accuracy and recall rate of the saliency map can be calculated. In order to comprehensively measure the accuracy and recall rate, the F-measure expression calculated based on each pair of accuracy and recall rate is as follows. Fβ =
(1 + β 2 )P recision × Recall β 2 P recision + Recall
(3)
Where β represents non-negative weight, which is used to balance the relationship between accuracy rate and recall rate, usually β 2 = 0.3. Compared with PR curve, F-measure curve has the advantage that the effect of significant graph predicted by the model can be judged by observing the change of a single index. MAE represents the average absolute difference per pixel between the predicted saliency map and its ground truth mask. MAE is defined as follows. M AE =
W H 1 |S(r, c) − G(r, c)| H × W r=1 c=1
(4)
Where S and G are saliency probability maps and their ground truth values, W and H represent the width and height of saliency maps, and (r,c) represents pixel coordinates. For a dataset, its MAE can represent the average MAE of ll saliency mappings. 4.4
Evaluation of BASNet and CS-BASNet
By comparing the performance of five relatively new models in six commonly used databases, including BASNet [16], SCRN [18], MLSLNet [17], PAGEnet [15] and SSD [10], the general detection performance of the salient object detection model can be obtained. The F-measure of the model is approximately 0.85–0.93 and the MAE is approximately 0.11–0.03, the performance of the model reaches the benchmark. The salient object detection model detection map is shown in Fig. 3. It can be seen from column (f) in Fig. 3 that the detection results after retraining BASNet model using PR-SOD dataset have achieved good results. In column (d) and column (e), wrong predictions appear inside and at the edge of the projection, and column (c) is CS-BASNet detection chart, which shows the best effect. In the contrast experiment, taking the characteristics of the projection region into consideration, the color features have little influence on the detection of the projection region, so the contrast experiment between gray image and RGB image is considered. There are fewer features required to detect the projected region than in general saliency detection. Whereas in the deep learning
Projection Region Detection Model Based on BASNet
17
Fig. 3. Salient object detection model saliency map. (a) original image. (b) ground truth. (c) ours test saliency map. (d) and (e) saliency map of the ablation test. (f) saliency map of the original BASNet.
model, shallow convolution generally extracts more features, and deep convolution can extract main features. Therefore, when designing experiments, the balance between model capacity and performance is achieved by reducing the number of convolution layers, so as to prepare for the subsequent transplantation of the model to mobile devices. The experimental results of the comparative experiment are shown in in Table 1. Table 1. Performance comparison between BASNet and CS-BASNet Condition
P
R
F-measure MEA
Res34+RGB
0.9018
0.9939
0.9215
0.0262
Res18+RGB
0.9060
0.9979
0.9257
0.0254
Res18+Gray
0.9335
0.9959
0.9454
0.0153
Res18+Att+Gray
0.9435
0.9999
0.9560
0.0116
Res18+Att-Bridge+Gray 0.9526 0.9999 0.9631 0.0135 ∗ Res18 and Res34 are backbone networks, RGB and Gray represent different datasets, Att is added attention module, and Bridge is bridge hierarchy.
It can be concluded that the combined experimental results of Res18 + attbridge + gray are the most ideal, so it is selected as the CS-BASNet model. With comparative experiments, it can be seen that res18 has better effect than res34, which proves that fewer features are needed to detect projection areas, and increasing the number of network layers will help extract useless features. The effect of using gray-scale picture datasets has also been improved. Compared with the color dataset, Precision and F-measure increase when using the gray dataset, but Recall decreases, meaning the color has less effect.
18
5
Y. Zhao et al.
Conclusion
In this paper, we propose an end-to-end model CS-BASNet based on ResNet18. According to the modular characteristics of BASNet and the practical problem of detecting projection region, the prediction module is improved to adapt to the detection of projection region. The proposed CS-BASNet is also a predictrefining architecture, which consists of prediction and refinement module. Combined with simplified hybrid function, the improved model can capture largescale and fine boundaries. Experimental results on PR-SOD dataset shows that our model has better performance in projection region detection. Due to the lack of contrast benchmarks, the performance of the selected general salient object detection model on common datasets is taken as a benchmark. The proposed model performance is better than the selected benchmark. It means that targeted salient object detection is higher than general salient object detection. In the future work, the datasets will continue to be expanded, and the proposed model will be compressed and deployed on source limited devices to meet the requirements of application.
References 1. Bayer, F.M., Cintra, R.: DCT-like Transform for Image Compression Requires 14 Additions Only. In: ArXiv, arXiv:1702.00817 (2017) 2. Ma, C., Huang, J., Yang, X., Yang, M.: Hierarchical convolutional features for visual tracking. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 3074–3082 (2015) 3. Li, T., Gao, S., Xu, Y.: Deep multi-similarity hashing for multi-label image retrieval. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (2017) 4. Borji, A., Cheng, M., Jiang, H., Li, J.: Salient object detection: a survey. Comput. Vis. Media 5, 117–150 (2019). https://doi.org/10.1007/s41095-019-0149-9 5. Rosin, P.L.: A simple method for detecting salient regions. Pattern Recogn. 42, 2363–2371 (2009) 6. Li, X., Li, Y., Shen, C., Dick, A., Hengel, A.V.: Contextual hypergraph modeling for salient object detection. In: 2013 IEEE International Conference on Computer Vision, pp. 3328–3335 (2013) 7. Jiang, H., Wang, J., Yuan, Z., Liu, T., Zheng, N.: Automatic salient object segmentation based on context and shape prior. In: BMVC (2011) 8. Cheng, M., Zhang, G., Mitra, N., Huang, X., Hu, S.: Global contrast based salient region detection. CVPR 2011, 409–416 (2011) 9. Wang, W., Lai, Q., Fu, H., Shen, J., Ling, H.: Salient object detection in the deep learning era: an in-depth survey. IEEE Trans. Pattern Anal. Mach. Intell. PP (2021) 10. Kim, J., Pavlovic, V.: A shape-based approach for salient object detection using deep learning. In: ECCV (2016) 11. Li, G., Yu, Y.: Visual saliency based on multiscale deep features. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5455–5463 (2015)
Projection Region Detection Model Based on BASNet
19
12. Shelhamer, E., Long, J., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 42, 640–651 (2017) 13. Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition. In: CoRR, arXiv:1409.1556 (2015) 14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016) 15. Wang, W., Zhao, S., Shen, J., Hoi, S., Borji, A.: Salient object detection with pyramid attention and salient edges. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1448–1457 (2019) 16. Qin, X., Zhang, Z., Huang, C., Gao, C., Dehghan, M., J¨ agersand, M.: BASNet: boundary-aware salient object detection. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7471–7481 (2019) 17. Wu, R., Feng, M., Guan, W., Wang, D., Lu, H., Ding, E.: A mutual learning method for salient object detection with intertwined multi-supervision. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8142–8151 (2019) 18. Wu, Z., Su, L., Huang, Q.: Stacked cross refinement network for edge-aware salient object detection. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 7263–7272 (2019) 19. Xie, S., Tu, Z.: Holistically-nested edge detection. Int. J. Comput. Vis. 125, 3–18 (2015) 20. Woo, S., Park, J., Lee, J.-Y., Kweon, I. S.: CBAM: convolutional block attention module. In: ECCV (2018)
Analysis of Epidemic Events Based on Event Evolutionary Graph Kang Xie(B) , Tao Yang, Rourong Fan, and Guoqing Jiang The Third Research Institute of Ministry of Public Security, Shanghai, China [email protected]
Abstract. In this paper, by analyzing the characteristics of the news reports of the COVID-19 epidemic events, we extract the event ancestor pairs from the text, extract the relationship between the events through attention-based bidirectional LSTM, and display them in the form of EEG model, which is conducive to the analysis of the evolution of epidemic events. The method proposed in this paper provides a new idea for the evolution of network events. The constructed event map can clearly show the evolution path of network events, monitor key nodes of network events, assist relevant management departments to formulate corresponding measures, and lead the events forward in a positive way.
1 Introduction Since the end of 2019, a novel coronavirus pneumonia has broken out in Wuhan, Hubei province, China, which is called COVID-19. With the spread of the new coronavirus accelerating, person-to-person transmission occurred in families and hospitals. There are at least 101,700 cases confirmed, 4,842 cases death and 96,134 cases cured in China and 205 countries worldwide confirmed 111,278,977 cases being diagnosed, 2,464,928 case death in February 22, 2021. Based on the current epidemiological survey and data, more comprehensive research is executing to understand COVID-19 feature, including the source of disease, transmission, extent of infection, and the clinical plan. Chinese authorities improved surveillance network and provided efficient epidemiological controls, such as restricting travels across cities, contact tracing, Anti-gouging price controls, etc. These days, people have encountered a lot of epidemic events from public media, such as instant messaging, social media chats, online news and more. From large quantities of text stream data, there often exist graph structures constructed by events and their relationships. The evolution and development of events have their underlying principles, which usually indicate the basic patterns of events evolution and human behaviors. How to deal with these massive data and assist people to analysis the law of events has become an important problem. Knowledge graph which describes the events and their relationships in the objective world, has attracted considerable attention in many fields. Event Evolutionary Graph (EEG) is a kind of knowledge graph, which describes the event evolutionary patterns and moreover achieves the event prediction in the next few days [1]. In this paper, we © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 20–29, 2022. https://doi.org/10.1007/978-3-030-79728-7_3
Analysis of Epidemic Events Based on Event Evolutionary Graph
21
use EEG to represent the dynamic emergence of the epidemic events, further more to achieve the decision-making for potential organizations based on event prediction. The rest of the paper is organized as follows. In Sect. 2, we discuss the related work in detail. In Sect. 3, we introduce the design method of knowledge graph based on EEG model, and discuss the event extraction and relationship recognition. We draw the experiments in Sect. 4. Finally, Sect. 5 is the conclusion to summary the paper.
2 Related Work EEG is similar to knowledge graph, which is a structure for entity-relation representation based on ontology. The event and entity relationship are extracted from text stream data. Then, a framework system is built based on ontology, and event information is stored in the form of structured triples. The difference between EEG and knowledge graph is that knowledge graph is less sensitive to the dynamic events, but EEG describes the rules and patterns among dynamic events. Charlotte R. et al. propose a method to produce semantic annotations of news articles based on the EEG and Wikidata knowledge base [2]. M. Liu et al. analyze the crossover and convergence between services in service market based on EEG [3]. Analyses were conducted on the world aviation safety accident investigation report based on EEG [4]. EEG can describe the evolution law between events and predict subsequent events. Since the logical relation of event evolution shows in the form of uncertain probability, the extraction of the event relationship is very important. Lots of work to predict trends in texts has been presented in the past few years. Liu Xue et al. give a brief overview of recent progresses on relationship extraction based on deep neural networks, such as Long Short-Term Memory Networks (LSTM), Bidirectional LSTM and Convolutional Neural Networks (CNN). A novel multi-encoder bidirectional LSTM model is used to find potential relational facts in textual content [6]. They compute the confidence values of automatically labeled content using co-occurrence statistics of relations and dependency patterns of aligned sentences. Lim et al. [7] propose a method to capture the temporal context from Korean natural language sentences based on LSTM for extracting the relationships among the time expressions and events. And numerous efforts have been dedicated to extracting temporal and causal relations from texts. For causality relation extraction, Liang Zhang et al. [8] investigate the causal relations based on linguistic theories of minor sentences. 3D CNN is used to capture temporal relations from semantic neighbors of different receptive fields over time in [9]. However, these studies have some limitations. First, they can only extract relations from single sentences. Second, these studies extract relations based on the semantic of specific context rather than discover the underlying patterns of event evolution from large-scale user generated documents. Thus, we propose a promising construction framework to build EEG from large-scale unstructured web corpus.
3 Proposed Framework We introduce an entity-event network called Event Evolutionary Graph (EEG) to clarify the historical events by retrieving the valuable information from the COVID-19 internet
22
K. Xie et al.
news. The value of each relation denotes the transition probability between events. Therefore, the construction of EEG can be simplified as two key problems. The first is how to construct the expression mode of events. The second is to recognize relationships between each two events. It can be solved based on the classification framework. In this section, we propose a construction framework to construct EEG from largescale unstructured text, including data cleaning, natural language processing, event extraction, event pair candidates extraction, sequential relation and direction recognition, causality recognition and transition probability computation. Figure 1 sketches this framework. Details about the main construction steps are described below.
EEG construction
Knowledge update
Knowledge fusion
Relationship analysis
Relation extraction
Anaphora resolution
Entity alignment
Event extraction
Entity extraction
Attribute extraction
Label engineering
Data acquisition
Unstructured data
Semi structured data
Structured data
Fig. 1. Framework for building EEG from large-scale text
3.1 Design Knowledge Graph Based on EEG EEG is a Directed Cyclic Graph, whose nodes are events, and edges stand for the sequential and causal relations between events. EEG describes underlying evolutionary logic of the relations between events. Events and events relationship are two important concepts in EEG [1]. Events are represented by abstract, generalized, semantically complete predicate phrases that contain event triggers and other necessary components to maintain the semantic completeness of the events. The EEG can reflect the rules and patterns of event evolution, which is very suitable for event prediction. EEG can be formally denoted as G = (V , E), where V = {v1 , v2 , · · · , vn } is the node set and E is the edge set. Each node v ∈ V represents an epidemic event, and each e ∈ E is a directed edge with a weight. The weight ω of a directed edge vi → vj can be computed by: count vi , vj (1) ω vi |vj = k count(vi , vk ) Where count vi , vj means the frequency of vi , vj appears in the whole event chains. Weights will be an important part of the input to the event prediction model.
Analysis of Epidemic Events Based on Event Evolutionary Graph
23
Events relations include temporal and causality relationship. The temporal relationship focuses on the partial order between two events in time, for example, the epidemic data is dynamic update according to the time. The causal relationship between events is a subset of the temporal relationship, emphasizing the cause and effect, e.g. as a result of the epidemic situation, medical supplies such as masks were rushed to purchase and supplies were tight. For temporal direction recognition, if event A occurs before B with a higher frequency than B occurs before A, we think the temporal direction is from event A to event B. 3.2 Event Extraction Event extraction is the process of extracting event information from the description texts and presenting the information in the form of structured data. In EEG, events are represented by abstract, generalized and semantically complete verb phrases. Each event must contain a trigger word, which mainly indicates the occurrence of the event, and some other necessary components, such as the subject, object or modifier, to ensure the semantic completeness. Abstract and generalized phrases mean that we mainly focus on the accurate location, time and the exact subject of the event. Semantically complete means that human beings can understand the meaning of the event without vagueness and ambiguity. We extract abstract and verb-object phrases from a series of natural language processing and TextRank algorithm. Data Collection and Data Clean. Due to the Internet news media of large quantity, there are various types of data formats. News data formats mainly include audio, video, news, text and so on. Therefore, it is essential to carry out some preprocesses such as format conversion and data cleaning, and store them in the database. Data cleaning, combined with a variety of natural language processing operations, removes words that are irrelevant or unable to express the core concepts of the texts, so as to reduce the unnecessary calculation later on, which can greatly shorten the uptime of the system. After cleaning the data, a series of natural language processing steps including segmentation, part-of-speech tagging, and dependency parsing are conducted for event extraction. In the generic EEG, an event is represented by a pair of verb lemma and the grammatical dependency relation between the verb and entity, which can be easily extracted by NLP toolkit. Keyword Extraction. Keyword extraction is a vital step. In this paper, we use TextRank method based on graph model to extract keywords. First of all, the language network graph of the document is constructed, and then the language network graph is analyzed to find the important words or phrases in the graph. In the process of language network construction, the preprocessed words and sentences are used as nodes, the relations between words and sentences are regarded as edges, and the weights between edges are represented by the correlation degree of words and sentences. When using language network to obtain keywords, we need to evaluate the importance of each node, sort the nodes according to the importance, and select the words represented by the top k nodes as keywords. The degree of similarity between two sentences, that is, the weight of the
24
K. Xie et al.
edge, is obtained by the following formula: WS(Vi ) = (1 − d ) + d ×
j∈In(Vi )
wji vk ∈out (Vj ) wjk
WS Vj
(2)
Where Vi represents the word to be calculated weight, S(Vi ) represents the weight of this word, d represents the set in damping coefficient, and In(Vi ) represents the word the same window. out Vj represents the word set in the same window and out Vj represents the number of elements in the word set. Calculate the Similarity Between Documents. The core of text clustering algorithm is to calculate the similarity between documents. When a new article is clustered by conventional algorithm, all documents in all categories will be traversed. As the number of documents increases, the running efficiency will be greatly reduced. Therefore, this paper uses seed events to cluster text. That is, several documents are selected in an event to represent it. Firstly, the document vector is calculated, and then the batch documents are clustered, and the seed event set is selected to save the seed event vector for text similarity calculation. After obtaining the next batch of data, the similarity calculation can be carried out directly. M wik wjk sim di , dj = k=1 (3) M M 2 2 k=1 wik k=1 wjk 1 k sim di , dj = sim di , dj j=1 k
(4)
Where di represents the feature vector of the new document, dj represents the feature vector of the jth seed event of an event, M represents the dimension of the feature vector, wik represents the kth weight of the feature vector of the new document i, wjk represents the kth weight of the feature vector of the jth seed topic of the new document, sim di , dj represents the similarity between the document and a seed, and sim di , dj represents the average similarity between the document feature vector and the j-th seed event. Text Clustering. Text clustering is the core of event generation, and the texts with common topics are gathered into one group by clustering. In this paper, online incremental clustering single-pass algorithm is used to match text similarity. The steps are as follows: Step 1: input a new document D; Step 2: calculate the similarity between d and each document in the existing topic classification to obtain the event with the largest similarity to d, and calculate the similarity value t; Step 3: if t is greater than the threshold, the document D is classified into a known topic category, otherwise a new event is formed; Step 4: end the process of text clustering. UUID Generation. UUID is a universal and unique identification code. A UUID is generated for each topic after clustering, and the event is initially formed.
Analysis of Epidemic Events Based on Event Evolutionary Graph
25
Abstract Generation. In this paper, the generation of abstract is also based on TextRank algorithm. In the process of generating abstract, each sentence is regarded as a node. If there is similarity between two sentences, there should be an undirected weighted edge between the two corresponding nodes, and the weight is the similarity. The most important sentences calculated based on TextRank algorithm form the summary of events. Event Category. Give an event that has been classified a label to determine its category. According to the epidemic events, the label types of the events include: epidemic development, notification by the Health Commission, measures taken by national government agencies, hospital measures, virus research progress, Internet public opinion, social impact, international influence, etc. These labels will form a dynamic thesaurus. In event classification operation, the similarity between event keywords and thesaurus is compared first, and then the corresponding labels are determined.
3.3 Relationship Recognition The logical relationship between events can reflect the evolution path of the whole event, identify the logical relationship in data, extract the logical event pairs, and represent the events in a structured form. The extraction of event relationship can reveal the law of event development and clarify the association between events. Actually, there are various complex relationships between service collaboration events, such as coreference, temporal, comparison, sequential and causality relationship. For the sake of generality, we consider sequential, causal, temporal and reversal relationships in EEG. Hence, a method based on attention-based bidirectional Long Short-Term Memory (LSTM) model is selected for the further processing. The bidirectional LSTM method based on self attention mechanism classifies the relationship between event pairs by using event summary as input, and get various event relationship element pairs as output. By using the attention mechanism and seq2seq model, we can obtain the summary trigger words with sentence sets as the inputs which include event trigger’s position identification. First of all, input the summary A = {w1 , w2 , . . . , wN } and use a bidirectional GRU as the encoder to generate a hidden state hwn for each element wn in the summary the unidirectional GRU as the decoder to generate the title matrix. Then we use (0) (0) (0) K(0) = k1 , k2 , . . . , kt . Next, we use the soft attention mechanism to calculate the correlation between the summary and the trigger words so that the encoder can focus on the most relevant words in the summary. In the process of the N times of decoding, soft attention mechanism is used to encode the hidden layer to obtain the most appropriate summary text vector. Am =
N
αm,n hwn
(5)
n=1
αm,n = softmax(f sm−1 , hwn )
(6)
26
K. Xie et al.
Among them, sm−1 is the m − 1th hidden state, s0 = hwN is the last hidden state, f is used to measure the correlation between the word w1 in the summary and the trigger (0) word km−1 .
By using softmax classifier, we can obtain the classification result of event pair Ei , Ej . h∗ = tanh(HA)
(7)
ˆl = argmax softmax ω(m) h∗ + b(m)
(8)
Where h∗ means the probability belonging to the four relationship categories, and ˆl represents the most likely relationship category.
4 Experiments Based on the EEG, an empirical analysis experiment on the law of attention-based bidirectional LSTM was conducted. We select the reports on epidemic events since December 2019. We use Selenium to obtain the event information from the web browser and crawl the news reports of epidemic-related events. The collected data is preprocessed by a series of natural language operations, such as de-duplication, denoising and part-ofspeech tagging, in order to obtain the description of the event, and to list the description key values of the event. For the extracted events, neo4j is used to construct the EEG. The EEG model contains 3,183 nodes and 5,249 weighted edges. For the convenience of display, the labels of the node events in the EEG are the simplification of the description of the events. Part of the EEG is shown in Fig. 2. The two ends of the directed edge represent the cause event and the result event respectively, and the weight on the edge represents the probability of occurrence of the result event after the cause event occurs. For example, the event reported in January 14th, 2020, “A kit agent was asked by colleagues where to buy the shortage of protective clothing on the market” is simplified to “the shortage of protective clothing on the market”, which is labeled by “social impact”. The shortage of supplies was triggered by a series of conditional relationship events, such as medical workers being infected with unexplained pneumonia, abnormal CT of a doctor without history of exposure to South China seafood wholesale market, Li Wenliang having symptoms of infection. The conditional relationship weight is 0.52, 0.48 and 0.24 respectively. Based on EEG, we can analyze the fundamental reasons for the shortage of protective materials, which include the short-term sharp increase in market demand caused by the outbreak of the epidemic and the government failing to intervene in the market supply and implementing the control policy of distribution on demand in time, resulting in the hoarding of a large number of protective materials by those well-informed people with extensive channels. Since people do not know the real supply of materials, under the agitation of rumors and emotions, the public will think that there will be a shortage of materials, leading to a rush to buy materials.
Analysis of Epidemic Events Based on Event Evolutionary Graph
27
Fig. 2. Experiment result for building EEG
The EEG reveals that the evolutionary path of events is multi-directional. The panic purchase incident of materials caused the price gouging and the problem of materials of poor quality, which caused public controversy and widespread concern from all walks of life. The situation made the government take positive measures and punish the illegal personnel. If an event has a relatively high indegree, it means that more related events will trigger this event. Therefore, when there are nodes like this, the relevant departments should intervene actively and strengthen the monitoring so as to avoid similar incidents of manufacturing counterfeit and shoddy products (Fig. 3). For another example, based on EEG, we can find two kinds of reversal relationship events since December 29, 2019 to January 20, 2020. One is “Officials say no obvious human to human transmission was found”, and the other is “There is evidence of transmission from human to human”. It is obvious that these two kinds of events are contradictory and belong to the reversal relationship in the EEG. Before January 20, 2020, the official website of Wuhan Health Commission has published more than ten issues of notification on the epidemic situation of pneumonia. It was only mentioned in the notification on January 15 that “there is no clear evidence of human to human transmission, so the possibility of limited person to person transmission cannot be ruled out, but the risk of continuous person to person transmission is low.” In other circulars, either “no clear evidence of human to human transmission has been found” or nothing has been mentioned. Since the guidance of public opinion is that there is no risk of human to human transmission, even in the appearance of human to human, medical staff and the public do not take corresponding protective measures. From the perspective of event transition probability, under the assumption of reporting the personnel infection data in time, even if the official continues to release no notice of obvious human to human phenomenon, the probability of occurrence of human to human event reports will continue to increase. As a result, the incident will develop in
28
K. Xie et al.
Fig. 3. Experiment result for reversal relationship
the direction of human to human, so that medical staff and the public will pay more attention to protective measures, and play a role in restraining the development of the epidemic. From the above analysis, when the event shows multi-directional evolution, the deduction model based on EEG can judge the possible evolution direction of the event, simulate the evolution trend of the event by calculating the possibility of each path. Then the official can carry out targeted early warning, and take effective measures to control the evolution of the emergency, so as to provide corresponding measures and suggestions for the control of the event.
5 Conclusions In this paper, we formally analyze the epidemic events based on EEG theory, and discuss the event relationships in detail. The EEG model is mainly designed to extract the potential events and analysis the relationship between events from massive news from the internet, and further discover the trends embodied in the contents in text streams. We also study the connectivity of the event graph, and give the equivalent conditions to determine the connectivity of event graph. Experimental results show that the methods we developed are effective for both relation and direction recognition. Acknowledgements. This work is supported by National Key Research and Development Program of China (Project No. 2018YFC0806903), the basic work project of Ministry of public security science and technology (Project No. 2019GABJC20) and the Key Lab of Information Network Security of Ministry of Public Security C19600 (The Third Research Institute of Ministry of Public Security).
Analysis of Epidemic Events Based on Event Evolutionary Graph
29
References 1. Li, Z., Zhao, S., Ding, X., Liu, T.: EEG: knowledge base for event evolutionary principles and patterns. In: Cheng, X., Ma, W., Liu, H., Shen, H., Feng, S., Xie, X. (eds.) SMP 2017. CCIS, vol. 774, pp. 40–52. Springer, Singapore (2017). https://doi.org/10.1007/978-981-106805-8_4 2. Charlotte, R., Thibault, E., Olivier, F.: Searching news articles using an event knowledge graph leveraged by Wikidata. In: International World Wide Web Conference Committee (2019). 3. Liu, M., Wang, Z., Tu, Z.: Crossover service phenomenon analysis based on event evolutionary graph. In: Liu, X., et al. (eds.) ICSOC 2018. LNCS, vol. 11434, pp. 407–412. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17642-6_35 4. Wang, H., Zhu, H., Yunqing, B.: Construction of causality event evolutionary graph of aviation accident. In: 5th International Conference on Transportation Information and Safety, pp. 692– 697 (2019) 5. Liu, X., Song, Q., Pengzhou, Z.: Relation extraction based on deep learning. In: IEEE/ACIS 17th International Conference on Computer and Information Science, pp. 687–691 (2018) 6. Siddhartha, B., Kostas, T.: Relation extraction using multi-encoder LSTM network on a distant supervised dataset. In: IEEE 12th International Conference on Semantic Computing, pp. 235– 238 (2018) 7. Chae-Gyun, L., Ho-Jin, C.: LSTM-based model for extracting temporal relations from Korean text. In: IEEE International Conference on Big Data and Smart Computing, pp. 666–668 (2018) 8. Liang, Z., Aijun, L., Yingyi, L.: Chinese causal relation: conjunction, order and focus-tostress assignment. In: 11th International Symposium on Chinese Spoken Language Processing, pp. 339–343 (2018) 9. Liwen, Z., Jiqing, H., Ziqiang, S.: Learning temporal relations from semantic neighbors for acoustic scene classification. IEEE Signal Process. Lett. 27, 950–954 (2020)
Proposal and Development of Anonymization Dictionary Using Public Information Disclosed by Anonymously Processed Information Handling Business Operators Masahiro Fujita1(B) , Yasuoki Iida1 , Mitsuhiro Hattori1 , Tadakazu Yamanaka1 , Nori Matsuda1 , Satoshi Ito2 , and Hiroaki Kikuchi2 1 Mitsubishi Electric Corporation, 5-1-1 Ofuna Kamakura, Kanagawa, Japan
[email protected] 2 Meiji University, 4-21-1 Nakano, Tokyo, Japan [email protected]
Abstract. To increase the number of anonymously processed information handling business operators, support for business operators to produce anonymously processed information is indispensable. In this study, as one of the supports, an “anonymization dictionary,” which focuses on public information disclosed by anonymously processing information handling business operators is proposed and developed. In Japan, in compliance with the Act on the Protection of Personal Information, a personal information handling business operator, when producing anonymously processed information and providing it to a third party, shall disclose to the public the categories of information relating to an individual contained in the information and the method of providing the information. Most operators disclose them on their webpages. In several cases, the disclosures also contain information other than that imposed by the act, such as the anonymization techniques used and the purpose of its use. Herein, the disclosures are collected from webpages, and an anonymization dictionary is developed using the disclosures. Evaluating the effectiveness of the dictionary through scenario-based evaluation, we demonstrate that the dictionary provides useful information for anonymously processed information production.
1 Introduction 1.1 Anonymously Processed Information With the advancement of Internet technology, many business operators have collected customers personal information. The business operators want to use this information for their business from the viewpoint of utility. However, in terms of privacy, they must handle the information with caution in accordance with the philosophy of respecting the personalities of individuals. To maintain the utility–privacy balance, “anonymously processed information” was established in Japan as a part of the Act on the Protection of Personal Information [1] and Enforcement Rules of the Act [2]. The information can be used without users’ consent for purposes other than the original intent and/or provided to a third party. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 30–39, 2022. https://doi.org/10.1007/978-3-030-79728-7_4
Proposal and Development of Anonymization Dictionary
31
1.2 Definition of Terms In this paper, the following terms are defined and used. Terms defined in Act [1]1 Anonymously processed information: Information relating to an individual that can be produced by processing personal information such that a specific individual cannot be identified through actions prescribed in each following item in accordance with the divisions of personal information set forth in each said item, nor can the personal information be restored (Article 2 (9) in Act [1]). Anonymously processed information handling business operator: A person who provides for use in business a collective body of information comprising anonymously processed information that has been systematically organized (Article 2 (10) in Act [1]).
Terms defined by authors “use” anonymously processed information: “An anonymously processed information operator uses anonymously processed information” means that “the operator produces the information or provides the information to a third party.” Category/Categories An item/items contained in anonymously processed information, such as “name,” “address,” and “history information.” Anonymization techniques: Individual techniques for making it impossible to identify a specific individual and restore the personal information used for the anonymously processed information production, such as “delete name,” and “top coding of age at 80.” See Reference [3] or [4] for further details. Public information Information disclosed by anonymously processed information handling business operators. According to Articles 36 and 37 in Act [1], the operators shall disclose the public information when using anonymously processed information. See Sect. 3.1 for further details. Elements Individual information contained in the public information, such as “the categories of information relating to an individual contained in the anonymously processed information,” and “the providing method to a third party.” Public webpage A webpage containing public information.
1 In this paper, the explanations of descriptions in Act [1] or Act [2] are un-official, i.e., they were
translated from Japanese to English by the authors or un-official translators.
32
M. Fujita et al.
1.3 Aims of This Paper With an increase in the number of IoT devices, business operators have collected a large amount of personal information. They can provide new services to their customers using anonymously processed information produced from personal information. As discussed in Sect. 2, to promote the use of anonymously processed information, support for the operators to produce anonymously processed information is indispensable. As one of the supports, in this study, an “anonymization dictionary” is proposed and developed. A personal information handling business operator, when producing anonymously processed information and providing the information to a third party, shall disclose certain information. The information was collected from webpages in which the disclosures were described, and then an anonymization dictionary using the collected information was developed. The dictionary provides useful information for production, such as “What anonymization techniques other companies use for the production?” with the operator.
2 Motivation and Related Works A business operator shall, when producing anonymously processed information, process personal information in accordance with the predefined standards. It consists of the following five rules (Article 19 in Enforcement Rules [2]): (i)
(ii)
(iii)
(iv)
(v)
deleting a whole or part of those descriptions etc. which can identify a specific individual contained in personal information (including replacing such descriptions etc. with other descriptions etc. using a method with no regularity that can restore the whole or part of descriptions etc.) deleting all individual identification codes contained in personal information (including replacing such codes with other descriptions etc. using a method with no regularity that can restore the individual identification codes) deleting those codes (limited to those codes linking mutually plural information being actually handled by a personal information handling business operator) which link personal information and information obtained by having taken measures against the personal information (including replacing the said codes with those other codes which cannot link the said personal information and information obtained by having taken measures against the said personal information using a method with no regularity that can restore the said codes) deleting idiosyncratic descriptions etc. (including replacing such descriptions etc. with other descriptions etc. using a method with no regularity that can restore the idiosyncratic descriptions etc.) besides action set forth in each preceding item, taking appropriate action based on the results from considering the attribute etc. of personal information database etc. such as a difference between descriptions etc. contained in personal information and descriptions etc. contained in other personal information constituting the personal information database etc. that encompass the said personal information
Proposal and Development of Anonymization Dictionary
33
These are just “standards,” i.e., a business operator needs to select the appropriate anonymous techniques for satisfying the above standards. However, because there are so many anonymization techniques and elements, this is a very difficult task for the operator. In fact, a report published by the Personal Information Protection Commission [5] also mentioned this difficulty. Sharing the use-cases, explaining how other businesses anonymously processed information handling operators produced anonymously processed information is a good solution to mitigate this difficulty [5]. The Personal Information Protection Commission explained a few use-cases in their report [6, 7]. A few business operators also explained their use-cases on their webpage (e.g., Reference [8]). These are good materials; however, only a few use-cases are explained. This motivated us to develop an anonymization dictionary. Specifically, we collected use-cases comprehensively and organized them systematically as a dictionary.
3 Development of Anonymization Dictionary 3.1 Overview Inquiring individual business anonymously processed information handling operators how they produce anonymously processed information is a simple way to collect usecases comprehensively. However, this is impossible owing to time and cost constraints. “Collecting the use-cases comprehensively,” is a big challenge for the development of an anonymization dictionary. We focused on public information to overcome this challenge. In compliance with Article 36 of the Act [1], the personal information handling business operator shall disclose some information as follows: (3) A personal information handling business operator, when having produced anonymously processed information, shall, pursuant to rules of the Personal Information Protection Commission, disclose to the public the categories of information relating to an individual contained in the anonymously processed information. (4) A personal information handling business operator, when having produced anonymously processed information and providing the anonymously processed information to a third party, shall, pursuant to rules of the Personal Information Protection Commission, in advance disclose to the public the categories of information concerning an individual contained in anonymously processed information to be provided to a third party and its providing method, and state to the third party explicitly to the effect that the information being provided is anonymously processed information.
34
M. Fujita et al.
Most operators disclose them on their webpages. In many cases, the disclosures contain information other than that imposed by Act [1], such as the anonymization techniques they used and destinations (third-party names). The disclosures (called public information) were collected comprehensively from the webpages (called public pages) as use-cases, and the anonymization dictionary was then developed using the disclosures. 3.2 Construction Procedure for Anonymization Dictionary Figure 1 presents an overview of the anonymization dictionary. In this section, the steps are described and depicted in Fig. 1.
Fig. 1. Overview of anonymization dictionary
(1) Collecting public pages. As described in Sect. 3.1, public pages were collected to gather the use-cases. In this study, a total of 308 public pages were manually collected, i.e., the pages were retrieved from an Internet search engine, from May to August 2019. (2) Extract public information. Part of Fig. 2 shows an overview of this step. Public pages contain information other than public information. However, only public information was extracted from the 308 public pages.
Proposal and Development of Anonymization Dictionary
35
Fig. 2. Overview of “extract public information” and “store to database”
(3) Store to the database. Part of Fig. 2 shows an overview of this step. Public information is written in natural language. To store it in the database, a database table was created with 13 elements, as shown in Table 1. The elements were defined based on an analysis of the structure of the collected public information. The public information was then stored in the table. (4) View use-case list. A business operator can search the use-cases by category (e.g., “name” and “address”) and/or an operator name (e.g., “Mitsubishi Electric”). The operator can use regular expressions for the search. Figure 3 is a screenshot of when a business operator searched the use-cases by “name.” As a result of the search, they can obtain a use-case list. Each row shows a single use-case in which the category contained, or which is produced by the operator’s name. Each use-case shows information through the anonymization technique the operator used for the production. (5) View details of a use-case. Figure 4 shows a detailed view of the use-case. A business operator can obtain detailed information on a single use-case. Other views, such as a view of trend analysis and statistical analysis will be added. Notably, this view can be added easily because the use-cases are stored in the database.
36
M. Fujita et al. Table 1. Database table elements
Element name
Supplement
URL Business operator name Date Purpose of use Data source Usage pattern
Recorded as {yes/no, yes/no} The former yes/no means whether the use-case produces anonymously processed information The latter yes/no means whether the use-case provides anonymously processed information to a third party
Categories contained in produced anonymously processed information
Recorded as {category A, category B, category C, …}
Providing method Categories contained in provided anonymously processed information
Recorded as {category A, category B, category C, …}
Third party name Security control action Original full text Supplement
Fig. 3. Screenshot of use-case list view. (The view is originally implemented in Japanese. For this study, the descriptions on the view are translated by authors.)
Proposal and Development of Anonymization Dictionary
37
Fig. 4. Screenshot of use-case detail view. (The view is originally implemented in Japanese. For this study, the descriptions on the view are translated by authors.)
4 Evaluation 4.1 Evaluation Method In this section, the effectiveness of the anonymization dictionary is demonstrated. A scenario-based evaluation is conducted using the following procedure: (1) The following typical situation is assumed: a business operator wants to produce anonymously processed information from personal information, which contains ” in Japanese) as a category. “address” (actually, “ (2) The effectiveness of the dictionary is demonstrated by discussing how the dictionary works and supports the business operator in this situation. This evaluation is recognized as the first step of evaluation. Additional evaluations, such as usability evaluation, will be conducted for the dictionary. 4.2 Scenario-Based Evaluation Based on the situation described in 4.1(1), we inputted “address” to the search form in the use-case list view. Consequently, the view showed “number of search results: 88.” This means that the operator was able to know easily how many business operators use anonymously proceed information containing “address.” Each row of the use-case list view shows which anonymization technique is applied to “address.” This means that the operator was able to know which anonymization technique could be used for the situation. Each row of the use-case list view was linked to the details of each use-case. After clicking one of the links, the details of the use-case were opened. Seeing the detail,
38
M. Fujita et al.
the operator can know which categories in addition to “address” are contained in the use-case. From the above evaluation, we concluded that the dictionary shares the use-cases in a concrete and easily understandable form to a business operator, i.e., the dictionary provides useful information for anonymously proceeding information production.
5 Discussion 5.1 Synonym Retrieval Some of the categories in the collected use-cases have the same meaning as the others. For example, “a place a person lives in” is expressed as the category “address” in a use-case, and on the contrary, expressed as the category “prefecture” in another usecase. The original expressions are used for the dictionary. In the dictionary, when a business operator needs to get a list containing both categories, they need to input both ” in Japanese), into categories, such as “address|prefecture” (actually, “ the search form. This is low usability for operators. In our future work, we will implement a synonym retrieval search function using a lexical database, such as WordNet [9, 10]. 5.2 Automatic Update The number of anonymously processed information handling business operators is increasing every day. In fact, reference [7] reported that the number increased by 70 operators in 2018. In this study, use-cases were manually collected. To manage this increase, an automatic update function needs to be implemented in the dictionary. As described in Sect. 3.2, the procedure of use-case collection involves three steps: (1) collection of public pages, (2) extraction of public information, and (3) storage to a database. Automatic collection for step (1) is a particularly important future work. This is because, in step (1), public pages from an enormous number of webpages were surveyed. It is impossible to do this manually daily. In addition, as explained in Sect. 3.2, a use-case collection was conducted in 2019. Some of the use-cases stored in the dictionary may not be updated. The automatic update not only contributes to increasing the collection efficiency but also guarantees that the information is updated. 5.3 Limitation Act [1] imposes the disclosure obligations to a personal information handling business operator when it produces anonymously processed information and provides it to a third party. On the contrary, there is no disclosure obligation when a personal information handling business operator ceases to use the information. This means that some usecases stored in the dictionary may not be used today by a personal information handling business operator.
Proposal and Development of Anonymization Dictionary
39
6 Conclusion As for the operators’ support to produce anonymously processed information, an “anonymization dictionary” was proposed and implemented. We collected the disclosures by personal information handling business operators from the webpages, and then implemented the anonymization dictionary using the disclosures as use-cases. It was demonstrated that the dictionary shares the use-cases in a concrete and easily understandable form with a business operator through scenario-based evaluation, i.e., the dictionary can provide useful information for the anonymously processed information production with the operator. In our future work, we will add some functions, such as an automatic update function, to the dictionary. Acknowledgments. We would like to thank Yuki Kaneko and Atsuki Ono of Meiji University for their support in collecting use-cases.
References The titles of the references written in Japanese are translated by authors 1. Japanese Law Translation: Act on the protection of personal information. http://www.japane selawtranslation.go.jp/law/detail/?id=2781&vm=04&re=02 (2016). Accessed 20 Mar 2021 2. Japanese Law Translation: Enforcement rules for the act on the protection of personal information. http://www.japaneselawtranslation.go.jp/law/detail/?id=2886&vm=04&re=02 (2016). Accessed 20 Mar 2021 3. U.K. Information Commissioner’s Office: Anonymisation: Managing Data Protection Risk Code of Practice (2012) 4. Aggarwal, C., Yu, P.: A general survey of privacy-preserving data mining models and algorithms. In: Aggarwal, C.C., Yu, P.S. (eds.) Privacy-Preserving Data Mining. Advances in Database Systems, pp. 11–52. Springer US, Boston, MA (2008). https://doi.org/10.1007/ 978-0-387-70992-5_2 5. Mitsubishi Research Institute: Surveys of trends in the proper use of anonymously processed information and personal information (2018 edn). https://www.ppc.go.jp/files/pdf/tokumeika kou_report.pdf (2018). Accessed 20 Mar 2021 (in Japanese) 6. Personal Information Protection Commission Secretariat: Report by the Personal Information Protection Commission Secretariat: anonymously processed information: towards balanced promotion of personal data utilization and consumer trust. https://www.ppc.go.jp/files/ pdf/The_PPC_Secretariat_Report_on_Anonymously_Processed_Information.pdf (2017). Accessed 20 Mar 2021 7. Nomura Research Institute, Ltd.: Fact-finding surveys in the proper use of personal data. https://www.ppc.go.jp/files/pdf/personal_date_cases2019.pdf (2020). Accessed 20 Mar 2021 (in Japanese) 8. JCB Co., Ltd.: What is anonymously processed information? https://www.jcb.co.jp/service/ pop/tokumeikakou.html. Accessed 20 Mar 2021 (in Japanese) 9. Fellbaum, C.: WordNet and wordnets. In: Brown, K., et al. (eds.) Encyclopedia of Language and Linguistics, 2nd edn., pp. 665–670. Elsevier, Oxford (2005) 10. Francis Bond: Japanese Wordnet. http://compling.hss.ntu.edu.sg/wnja/index.en.html. Accessed 20 Mar 2021
VANETs Road Condition Warning and Vehicle Incentive Mechanism Based on Blockchain Shuai Zhou1 and Tianhan Gao1,2(B) 1
2
Software College, Northeastern University, Shenyang 110169, China [email protected] Engineering Research Center of Security Technology of Complex Network System, Ministry of Education, Beijing, China
Abstract. As an extension of mobile Ad Hoc network, VANETs is an important part of Intelligent Transportation System (ITS). However, problems such as vehicle misbehavior and road congestion reduce vehicle efficiency and affect the safety and privacy of vehicle users. This paper proposes a blockchain-based road condition warning and vehicle incentive mechanism (CWVI) for VANETs. The account-based authentication technology is proposed to ensure the privacy and anonymity of vehicles. Using improved blockchain technology to ensure the security of sensitive data; Reward and punishment system promotes the enthusiasm of vehicle release and forwarding traffic information. Performance and security analysis show that compared with other typical VANETs traffic condition warning schemes, the proposed mechanism has higher warning message transmission efficiency and stronger security.
1 Introduction
With the rapid development of wireless communication technology and Intelligent Transportation Systems (ITS), Vehicular Ad-hoc Networks (VANETs) are becoming an important infrastructure of future intelligent transportation [1]. Road safety applications aim to reduce the probability of traffic accidents; related studies have shown that early warning of road emergencies can avoid 60% of traffic accidents [2–4]. Effective supervision of road conditions and early warning of traffic emergencies are therefore urgent needs for VANETs safety applications. Moreover, protecting key information and vehicle users' privacy during the road warning process is a key link in ensuring the security of VANETs. For the road condition early warning problem in VANETs, literature [5] proposes to dynamically adjust the broadcast operation of relay vehicles by predicting the inter-vehicle distance in a single lane, controlling the transmission range of warning information to reduce communication overhead. Literature [6] constructs an intelligent transportation framework that collects traffic information (such as traffic density) of passing vehicles through Intelligent Traffic
Lights (ITLs), using this as a basis for predicting traffic accidents and issuing warnings to vehicles. Literature [7] evaluates the performance of VANETs early warning information based on the IEEE 802.11p standard; the evaluation indicators include the time required to send warning messages, the number of blind spots, and the number of warning packets received by vehicle nodes. Literature [8] further expands these indicators by considering the impact of vehicle neighbor node density on the receiving efficiency of warning messages. Literature [9] compares simulated data with real data and analyzes the impact of different factors on warning information. In recent years, researchers have introduced blockchain technology into VANETs to solve security and privacy protection problems, thereby providing VANETs with more application-layer security solutions. [10,11] propose a lightweight scalable blockchain (LSB), which effectively solves the high overhead, low scalability, and low throughput of traditional blockchains. To ensure the authenticity, correctness, and completeness of information in VANETs, [12] and [13] respectively propose a scheme for rating the reputation of vehicles and a new "proof of event" mechanism. [14] proposes a blockchain-based crypto trust point (CTP) mechanism to establish trust in VANETs and record the bad behavior of vehicles. However, because information about vehicle nodes is transparently accessed and traced on the chain, the privacy of vehicles is difficult to guarantee. Regarding how to use blockchain to increase the participation of vehicle nodes, literature [15,16] evaluates the reputation of vehicles and, at the same time, uses the currency properties of the blockchain to increase the enthusiasm of vehicles to forward announcements.
Existing early warning schemes in VANETs offer solutions for distinguishing warning information and defining warning indicators, but several shortcomings remain unresolved: how to improve the efficiency of information forwarding, how to ensure the reliability of data, how to store the collected warning information, and how vehicles' responses during the warning process affect the warning effect. Therefore, this paper proposes a blockchain-based road condition warning and vehicle incentive mechanism (CWVI) for VANETs. The contributions of this paper are as follows:
(1) This paper puts forward an account-based anonymous authentication scheme. Vehicles can adaptively change pseudonyms, which meets the demands of safety and location privacy. By storing vehicles' identity information on the blockchain, we provide higher security and more efficient revocation than revocation schemes based on a certificate revocation list.
(2) An early warning mechanism based on blockchain is proposed to ensure the traceability, immutability, and security of vehicle behavior and road information.
(3) This paper puts forward a blockchain-based incentive mechanism which, combined with the early warning mechanism, improves the enthusiasm of vehicles to forward information in the network.
The rest of this article is organized as follows. The second section introduces the overall architecture and the specific content of the scheme. The third section presents the security and performance analysis, and the last section concludes the article.
2 VANETs Road Condition Warning and Vehicle Incentive Mechanism Based on Blockchain
2.1 System Architecture
The mechanism proposed in this paper includes three parts: vehicle certification, road condition warning, and a reward and punishment mechanism. The TA generates an identity credential for each vehicle according to its ID and public key; the RSU manages the account information of vehicles and verifies warning messages to record them on the blockchain; the OBU collects and publishes road information, participates in verification, and reports the verification results to the RSU.
2.2 Vehicle Certification
Vehicles should be authenticated to ensure the legitimacy and validity of their identities. First, the TA is initialized. Then a vehicle sends its ID and public key to the TA, and the TA generates an identity credential from which the vehicle generates multiple accounts. The RSU stores vehicles' identity credentials and account information on the blockchain.
(1) Initial settings. First, the TA selects the public parameters param = {P, Q, e, H_1, H_2, H_3, H_4, PK_TA} for the system, where G_1 and G_T are an additive cyclic group and a multiplicative cyclic group, P ∈ G_1, e : G_1 × G_1 → G_T is a bilinear map, Q is a generator of G_1, H_1 : {0,1}* → Z_q^*, H_4 : {0,1}* → G_1, SK_TA ∈ Z_q^* is the TA's private key, and PK_TA = SK_TA · P is the public key. Second, vehicle V_i randomly selects SK_Vi ∈ Z_q^* as its private key and PK_Vi = SK_Vi · P as its public key. Then V_i sends PK_Vi and ID_Vi to the TA, which computes r_Vi = H_2(ID_Vi) as the identity credential of the vehicle. After that, the TA signs the timestamp T_Vi and r_Vi, namely Sig_rVi(r_Vi || T_Vi), and sends the signature together with the initialized Coin = 0 to V_i.
(2) Vehicle account registration. Vehicle V_i generates a new account by computing SK_Account(Vi) = r_Vi ⊕ SK_Vi and PK_Account(Vi) = SK_Account(Vi) · P (the account public key). Then V_i computes H_3(PK_Account(Vi)), which converts the public key into 32 bytes, and takes the last 20 bytes as the account address address(Vi). At this point, Account(Vi) = {address(Vi) || Coin value}. By repeating the above steps, V_i can generate multiple
accounts, but the number of accounts must not exceed m.
(3) RSU records account information. After the RSU receives a vehicle's identity verification request, it checks Sig_rVi(r_Vi || T_Vi) and H_2(ID_Vi). If the vehicle is illegal, the RSU broadcasts its identity; otherwise, the RSU stores Sig_rVi(r_Vi || T_Vi) and the vehicle's accounts on the blockchain.
(4) RSU revokes vehicle identification. If the RSU verifies that the vehicle credential r_Vi is false or expired, or other vehicles report malicious behavior of V_i, the RSU first submits r_Vi of the illegal vehicle V_i to the TA, then finds V_i on the leaf node of block N and marks it with F, which means the identity credential of the vehicle is revoked. Except for the revoked node, the positions of the other leaf nodes remain unchanged in the newly generated block N+1, as shown in Fig. 1.
Fig. 1. RSU revokes vehicle identification
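To make the account derivation in step (2) concrete, the following is a minimal Python sketch. It is illustrative only: modular exponentiation in a toy group stands in for elliptic-curve scalar multiplication, and SHA3-256 stands in for H_2 and H_3, since the paper does not fix concrete primitives.

```python
# Illustrative sketch of the account derivation in Sect. 2.2 (2); the toy group
# (modular exponentiation instead of EC scalar multiplication) and SHA3-256 for
# H2/H3 are assumptions, not the paper's concrete primitives.
import hashlib
import secrets

P_MOD = 2**255 - 19        # toy group modulus (stand-in for the group G_1)
G = 5                      # toy generator (stand-in for the point P)

def derive_account(r_vi: int, sk_vi: int) -> str:
    sk_acct = r_vi ^ sk_vi                        # SK_Account = r_Vi XOR SK_Vi
    pk_acct = pow(G, sk_acct, P_MOD)              # stand-in for SK_Account * P
    h = hashlib.sha3_256(pk_acct.to_bytes(32, "big")).digest()  # H3 -> 32 bytes
    return h[-20:].hex()                          # last 20 bytes = address(Vi)

sk_vi = secrets.randbelow(P_MOD)                  # vehicle private key SK_Vi
r_vi = int.from_bytes(hashlib.sha3_256(b"ID_Vi").digest(), "big")  # r_Vi = H2(ID_Vi)
print(derive_account(r_vi, sk_vi))
```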
2.3 Road Condition Warning
The real road conditions considered in this scheme are shown in Fig. 2 and mainly involve two kinds of emergencies: collisions and malicious behavior. In the event of a sudden emergency, adjacent vehicles request the RSU to form a temporary group to collect information about the emergency road conditions. The RSU decides which vehicle becomes the group leader according to the historical behavior of the applying vehicles. The RSU acts as the group manager, and the group leader systematically summarizes the emergency road situation. Vehicles that have successfully joined the group are those that have applied to the RSU; vehicles in the group are divided into three categories: group leader, applicant, and verifier.

Fig. 2. Road condition warning road scene

(1) Form a temporary group. (a) When an emergency occurs ahead, vehicles first submit a request to the RSU to form a temporary group. The public and private key pair of vehicle V_j in the group is (SK_Vj, address(V_j)). The RSU first verifies whether address(V_j) is on a leaf node of the blockchain account tree; if it is not, the request is rejected. (b) If all checks succeed, V_j applies for its group identity: it randomly selects x_j ∈ Z_q^*, j = 1, 2, ..., k, and then sends SK_Vj · x_j · P, x_j · P, SK_Vj · P, and address(V_j) to the RSU. The RSU verifies whether e(SK_Vj · x_j · P, P) = e(x_j · P, SK_Vj · P) holds; if so, it sends S_j = s · H_4(T, SK_Vj · x_j · P) to V_j as its group identity in the temporary group. (c) After receiving applications from multiple vehicles, the RSU selects the vehicle V_i with the best historical behavior as the leader of the temporary group to collect emergency information within a period Δ counted from the time the first vehicle applied to join. Within Δ, all vehicles complete their applications, obtain group identities, and become eligible to collect emergency road conditions. (2) Collect emergency road conditions. (a) If V_j's Coin > u, it can post an emergency traffic message as a group member; if V_j's Coin < u, it can only verify messages. Publishing a message requires attaching a certain amount of Coins as the cost of publishing; once a message is proved to be false, the attached Coins are confiscated, which raises the cost of publishing false messages. (b) To sign a message m, V_j randomly selects a ∈ Z_q^* and computes: U = a · H_4(T, SK_Vj · x_j · P); V = SK_Vj · x_j · H_4(m, U); h = H_1(m, U + V); W = (a + h) · S_j; Sig = (U, V, W, T, SK_Vj · x_j · P). (c) If V_j's Coin < u, it can only validate the Transaction and attach its Sig. To verify a Sig, one first checks whether T is within the validity period; if so, one computes Q = H_4(T, SK_Vj · x_j · P), H_4(m, U), and h = H_1(m, U + V). If e(W, P) = e(U + h · Q, PK_RSU) and e(V, P) = e(H_4(m, U), SK_Vj · x_j · P) hold,
the signature is verified as legal; otherwise, verification fails. (d) Finally, V_i is responsible for collating the Transactions and verification signatures posted in the group, attaching its own Sig, and sending them to the RSU. (3) Verify emergency road conditions. (a) After receiving the message from V_i, the RSU calculates, according to the current traffic density d (the number of vehicles driving in the same direction per unit time and unit distance, i.e., neighbor density = (number of vehicles × maximum Tx range) / (map area)), the number of Transaction confirmations that should be received within time Φ (the maximum time interval from the time point recorded in the Transaction): n = k · d. (b) If the number of confirmed Transactions is greater than n, the event verification is established and is written into the reward chain by the RSU; if it is less than n, verification fails and the event is written into the penalty chain. In addition, if the verification of a Transaction (M) reporting misbehavior is established, the RSU sends the reported information about the misbehaving vehicle V_x to the TA and broadcasts the identity of V_x across the network. The RSU then revokes the illegal vehicle according to Sect. 2.2 (4).
2.4 Reward and Punishment Mechanism
In the CWVI scheme, a reward and punishment mechanism is designed: according to its behavior, each vehicle is given a corresponding reward or penalty. (a) The RSU calls the smart contract: when the number of Transaction confirmations received by the RSU is greater than n, the RSU returns the reward Coins attached to the Transaction to the corresponding vehicles; the reward for the vehicle publishing the Transaction is twice that of a vehicle verifying it, and the group leader receives 2.5 times. Starting from the first node recorded in the Transaction, rewards decrease in chronological order. If the number of confirmations for a misbehavior report Transaction(M) is greater than n, the RSU deducts all the reward Coins of the reported vehicle V_x, and they are collected by the system as the principal for rewarding other vehicles. (b) When the number of Transaction confirmations received by the RSU is less than n, the RSU confiscates all the Coins attached to the Transaction and deducts a certain amount of Coins as a penalty from the vehicles and group leader that verified it. All rewards and penalties flow on the blockchain along with the confirmation of the event, serving as evidence for follow-up verification and preventing vehicles from arbitrarily minting reward Coins.
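The following Python sketch illustrates the settlement logic just described. The 2× and 2.5× multipliers follow the text, while the base reward, the penalty amount, and the exact decay of the chronological rewards are illustrative assumptions.

```python
# Hypothetical sketch of the reward/punishment settlement in Sect. 2.4; base_reward,
# penalty, and the exact decay of chronological rewards are illustrative choices.
def settle(confirmations, n, deposit, balances, publisher, verifiers, leader,
           base_reward=4, penalty=1):
    if confirmations > n:                    # event verified -> reward chain
        balances[publisher] += deposit       # return the Coins attached as cost
        balances[publisher] += 2 * base_reward       # publisher: 2x a verifier
        balances[leader] += int(2.5 * base_reward)   # group leader: 2.5x
        for i, v in enumerate(verifiers):    # rewards decrease chronologically
            balances[v] += max(base_reward - i, 0)
    else:                                    # verification failed -> penalty chain
        for v in [leader] + verifiers:       # deposit confiscated, verifiers fined
            balances[v] -= penalty

balances = {"V1": 5, "V2": 5, "V3": 5, "L": 5}
settle(confirmations=7, n=5, deposit=2, balances=balances,
       publisher="V1", verifiers=["V2", "V3"], leader="L")
print(balances)   # V1 gets its deposit back plus 2x; L gets 2.5x; V2 > V3
```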
3 Security Analysis
3.1 Security Analysis
(1) Unforgeability: In CWVI, the identity credential of a vehicle is issued exclusively by the TA and cannot be generated by the vehicle itself. Only with the identity credential issued by the TA are the accounts generated by the vehicle legal and able to communicate and obtain information in the network. A vehicle's account is created jointly from the vehicle's identity credential and its chosen private key SK_Vi, and therefore cannot be forged by other vehicles. All the information of the vehicle is recorded on the blockchain, and the state tree is accessed with permission control. (2) Anonymity: In CWVI, accounts and group identities are used for communication between vehicles in the network. Accounts are generated from the identity credentials issued by the TA and the private keys of vehicles. Vehicles can choose and change accounts according to their needs, which ensures that attackers cannot link messages to any specific vehicle. Using the group identity to speak further guarantees the anonymity of the vehicle, and the real identity of the vehicle is known only to the group manager. (3) Non-Repudiation, Tracing, and Revocation: CWVI supports traceability. In the authentication process, the RSU can compute the hash value to identify an illegal vehicle and revoke and broadcast its identity. If a vehicle behaves badly, the RSU can simply search the blockchain to find the vehicle's identity and revoke it. If a vehicle sends false messages in the group, the RSU can find the real identity of the signer. (4) Integrity and Reliability: The reward and punishment system raises the cost of publishing false news and reduces its publishing rate. The authenticity and integrity of a message is verified by neighboring vehicles, and the use of the blockchain ensures that records are open, transparent, and immutable.
3.2 Performance Analysis
(1) Computation Overhead. First of all, CWVI combines blockchain with VANETs security applications. Since safety messages do not contain sensitive information, they do not require confidentiality; the exchange of safety information in VANETs requires authentication rather than encryption. Ethereum produces a block roughly every ten-odd seconds, far more frequently than Bitcoin. The maximum amount of vehicle identity information stored per year in the blockchain maintained by the RSU is (365 × 24 × 60 × 60 / 20 × (176m + 160) / 8 / 1024) × n = (33,876.5624m + 30,796.875) kB × n (n is the number of vehicles). Compared with [1], which stores 43,800 kB × n of identity information every year, CWVI in the optimal state is equivalent to nm + n cars, which is better than [1]. When a vehicle enters a new RSU area and needs to act, the RSU must verify whether the identity of the vehicle is legitimate. The RSU only needs
to call the Bloom filter to query the identity credential and account information of the vehicle on the MPT; the complexity of this query is O(log2 n). In [1], however, the time complexity of querying the certificate revocation list is O(n). As shown in Table 1, after the introduction of the blockchain, the anonymity of vehicles is more flexible and the efficiency of query and revocation is higher.

Table 1. Authentication and revocation data contrast
Content | CWVI | [1]
Authentication capacity | (33,876.5624m + 30,796.875) kB × n | 43,800 kB × n
Query/revocation time complexity | O(log2 n) | O(n)
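The yearly storage coefficients above can be reproduced directly; the short sketch below recomputes them, assuming one block roughly every 20 s as in the formula.

```python
# Recomputing the yearly identity-storage coefficients: one block per ~20 s and
# (176m + 160) bits of identity data per block, converted to kB.
blocks_per_year = 365 * 24 * 60 * 60 / 20     # 1,576,800 blocks per year
bits_to_kb = 1 / 8 / 1024
print(blocks_per_year * 176 * bits_to_kb)     # 33876.5625 (paper: 33876.5624), coeff. of m
print(blocks_per_year * 160 * bits_to_kb)     # 30796.875, constant term
```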
(2) Early warning information processing. The message format for sending warning messages under the warning mechanism is shown in Table 2. It can be seen from [1] that the total message packet size (the safety message plus a digital signature and a certificate containing the public key and signature) is between 284 and 791 bytes, and the packet size affects the message drop rate: larger packets increase the drop rate. In this scheme, CWVI has the group leader aggregate the messages from the group and send them to the RSU. The size of a message sent by a vehicle publishing a transaction in the group is 100 + 4 + 42 + 1 = 147 bytes, and the size of the message aggregated by the group leader is 100 + (4 + 42 + 1) × n = 100 + 47n bytes. Even at the maximum value of 791 bytes mentioned in [1], the aggregate can contain 14 verification signatures. In a congested state with 25 vehicles in the congested section, half of them verifying the emergency satisfies the verification condition without overloading the message.

Table 2. Message format
Message payload | Timestamp | Signature | Coin
100 bytes | 4 bytes | 42 bytes | 1 byte
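A quick check of the packet-size arithmetic in this subsection:

```python
# Verifying the message sizes against Table 2 and the 791-byte maximum from [1].
single = 100 + 4 + 42 + 1                      # one in-group transaction message
print(single)                                  # 147 bytes
n_max = max(k for k in range(1, 100) if 100 + 47 * k <= 791)
print(n_max, 100 + 47 * n_max)                 # 14 signatures, 758 bytes
```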
4 Conclusion
This article puts forward a plan to combine blockchain with VANETs at the security application level, namely a blockchain-based VANETs road condition warning and vehicle incentive mechanism (CWVI). First, the vehicle identity authentication mechanism is improved to reduce the overhead caused by
the certificate revocation list, and the real identity information of vehicles is stored on the blockchain with access control, which protects vehicle privacy while preserving the availability, traceability, and immutability of vehicle identities. In addition, the participation and historical behavior of vehicles in the network are stored in the reward chain and the penalty chain according to vehicle performance, realizing the traceability, tamper-resistance, and openness of network data and vehicle behavior. Finally, the reward and punishment mechanism improves the enthusiasm of vehicles. Performance analysis shows that this solution provides more reliable privacy protection and more efficient information transmission.
References
1. Raya, M., Hubaux, J.-P.: The security of vehicular ad hoc networks (2005). https://doi.org/10.1145/1102219.1102223
2. Eze, E.C., Zhang, S.J., Liu, E.J., Eze, J.C.: Advances in vehicular ad-hoc networks (VANETs): challenges and road-map for future development. Int. J. Autom. Comput. 13(1), 1–18 (2016)
3. Bai, F., Krishnan, H.: Reliability analysis of DSRC wireless communication for vehicle safety applications. In: Proceedings of Intelligent Transportation Systems Conference, pp. 355–362. IEEE, Toronto (2006)
4. Moharrum, M.A., Al-Daraiseh, A.A.: Toward secure vehicular ad-hoc networks: a survey. IETE Tech. Rev. 29(1), 80–89 (2012)
5. Yu, Q., Liu, D.: Disseminate warning message in VANETs based on predicting the interval of vehicles. In: Fifth International Conference on Frontier of Computer Science & Technology. IEEE (2010)
6. Barba, C.T., Mateos, M.A., Soto, P.R., et al.: Smart city for VANETs using warning messages, traffic statistics, and intelligent traffic lights. In: Intelligent Vehicles Symposium (IV). IEEE (2012)
7. Martinez, F.J., Cano, J.C., Calafate, C.M.T., et al.: A performance evaluation of warning message dissemination in 802.11p based VANETs. In: The 34th Annual IEEE Conference on Local Computer Networks, LCN 2009, Zurich, Switzerland, 20–23 October 2009. IEEE (2009)
8. Martinez, F.J., Toh, C.K., Cano, J.C., et al.: The representative factors affecting warning message dissemination in VANETs. Wirel. Pers. Commun. 67(2), 295–314 (2012)
9. Martinez, F.J., Fogue, M., Coll, M., et al.: Evaluating the impact of a novel warning message dissemination scheme for VANETs using real city maps. In: Networking 2010, International IFIP TC 6 Networking Conference, Chennai, India, May 2010. DBLP (2010)
10. Dorri, A., Steger, M., Kanhere, S.S., et al.: BlockChain: a distributed solution to automotive security and privacy. IEEE Commun. Mag. 55(12), 119–125 (2017)
11. Cebe, M., Erdin, E., Akkaya, K., et al.: Block4Forensic: an integrated lightweight blockchain framework for forensics applications of connected vehicles. IEEE Commun. Mag. 56(10), 50–57 (2018)
12. Yang, Z., Zheng, K., Yang, K., et al.: A blockchain-based reputation system for data credibility assessment in vehicular networks. In: 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC). IEEE (2017)
13. Guo, H., Meamari, E., Shen, C.: Blockchain-inspired event recording system for autonomous vehicles. In: 2018 1st IEEE International Conference on Hot Information-Centric Networking (HotICN), Shenzhen, pp. 218–222 (2018)
14. Singh, M., Kim, S.: Crypto trust point (cTp) for secure data sharing among intelligent vehicles. In: 2018 International Conference on Electronics, Information, and Communication (ICEIC), pp. 1–4. IEEE (2018)
15. Lu, Z., Wang, Q., Qu, G., et al.: BARS: a blockchain-based anonymous reputation system for trust management in VANETs (2018)
16. Li, L., Liu, J., Cheng, L., Qiu, S., Wang, W., Zhang, X., Zhang, Z.: CreditCoin: a privacy-preserving blockchain-based incentive announcement network for communications of smart vehicles. IEEE Trans. Intell. Transp. Syst. 19(7), 2204–2220 (2018)
An Authentication Scheme for Car-Home Connectivity Services in Vehicular Ad-Hoc Networks

Cong Zhao1, Nan Guo2(B), Tianhan Gao1,3, Jiayu Qi1, and Xinyang Deng1

1 Software College, Northeastern University, Shenyang 110169, China
[email protected], [email protected], [email protected]
2 Computer Science and Engineering College, Northeastern University, Shenyang 110169, China
[email protected]
3 Engineering Research Center of Security Technology of Complex Network System, Ministry of Education, Beijing, China
Abstract. With the rapid development of technology, the Internet of Things (IoT) has become associated with various fields, such as smart home networks and intelligent transportation systems (ITS). Vehicular ad-hoc networks (VANETs) can provide value-added services to vehicles through roadside units (RSUs). Car-home connectivity can be defined as allowing drivers and passengers to remotely control smart appliances in the smart home while driving. The communication process between connected vehicles and smart appliances may cause personal information disclosure. To protect the security and privacy of this communication, this paper proposes an authentication scheme for car-home connectivity services. The proposed scheme protects the security and privacy of the communication between the vehicle and the smart appliances, and multiple trusted third-party authorities can cooperate to retrieve the identity of misbehaving users. The system analysis shows that the proposed scheme is efficient, secure, privacy-preserving, and easily deployable.
1 Introduction
The Internet of Things (IoT) is a platform where devices become smarter and communication becomes more convenient [1]. IoT has been associated with various fields, such as smart home networks and intelligent transportation systems (ITS) [2,3]. ITS can improve the user experience and ensure driving safety [4]. IoT also helps vehicles become more intelligent by connecting them to ITS servers, the cloud/Internet, and so on. Vehicles are becoming ever smarter and can connect with several entities, such as personal devices, vehicular ad-hoc networks (VANETs), and the Internet, besides deploying innovative applications [5]. VANETs can provide services through roadside units (RSUs) to connected vehicles, and vehicles can communicate with each other. Each vehicle periodically broadcasts
traffic-related messages within a time interval of 100–300 ms [6]. As wireless vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) transfers become common, vehicle security may be compromised [7,8]. The vehicles of the future will be intelligent, autonomous, and connected [9]. A smart home is defined as a home equipped with smart appliances that can predict and respond to users' requests, bringing users a more convenient and comfortable living environment; home appliances are connected to the Internet to provide services to users. Researchers have suggested wireless technologies to connect vehicles with smart homes. Car-home connectivity can be defined as allowing drivers and passengers to remotely control smart appliances in the smart home while driving. However, the car-home connectivity service is in its infancy [10], and the communication between connected vehicles and smart homes may disclose sensitive personal information, which may cause users to suffer property damage [11]. To address this problem, the communication process must be anonymous, yet it must remain possible to retrieve the real identity [12]. To maintain the balance between privacy and security of car-home connectivity services, this paper proposes an authentication scheme for car-home connectivity services in VANETs, protecting the security and privacy of the communication process. It allows the user to interact with home appliances while driving, and the interaction is anonymous and unlinkable, preventing users' sensitive information from being leaked. This paper demonstrates that the proposed scheme is efficient, secure, privacy-preserving, and easily deployable. The remainder of the paper is organized as follows. In Sect. 2, cryptographic building blocks are introduced. In Sect. 3, we show the network model and the main procedures of our proposed scheme. In Sect. 4, a usage scenario is presented in detail. The security and performance are analyzed in Sect. 5, and the conclusion is drawn in Sect. 6.
2 Preliminaries
This section describes the key cryptographic building blocks used in the rest of the paper: zero-knowledge proofs and Paillier's homomorphic encryption.
2.1 Zero-Knowledge Proof
Zero-knowledge proofs [13] were proposed by S. Goldwasser, S. Micali, and C. Rackoff in the early 1980s. A zero-knowledge proof allows the prover to convince the verifier that a statement is correct without providing any useful information to the verifier. We use the notation of Camenisch and Stadler to present the protocol. According to that notation, the statement

PK{(α, β) : y = g^α · h^β ∧ ỹ = g̃^α · h̃^β}

denotes a zero-knowledge proof of knowledge of integers (α, β) such that y = g^α · h^β and ỹ = g̃^α · h̃^β hold, where y, g, h, ỹ, g̃, h̃ are elements of some groups G = ⟨g⟩ = ⟨h⟩ and G̃ = ⟨g̃⟩ = ⟨h̃⟩ of the same order. All the parameters are known to the verifier except for the elements in the parentheses.
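As an illustration of this notation, the sketch below realizes its simplest instance, PK{(α) : y = g^α}, as a non-interactive Schnorr proof via the Fiat-Shamir heuristic; the tiny group parameters are illustrative assumptions only.

```python
# Toy non-interactive proof of knowledge PK{(alpha) : y = g^alpha} (Schnorr with
# Fiat-Shamir); p = 23 with an order-11 subgroup is for illustration only.
import hashlib, secrets

p, q, g = 23, 11, 4                       # g generates the order-11 subgroup mod 23

def challenge(*vals) -> int:              # Fiat-Shamir challenge in Z_q
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

alpha = secrets.randbelow(q)              # the secret known only to the prover
y = pow(g, alpha, p)

r = secrets.randbelow(q)                  # prover: random commitment
t = pow(g, r, p)
c = challenge(g, y, t)
s = (r + c * alpha) % q                   # response reveals nothing about alpha

assert pow(g, s, p) == (t * pow(y, c, p)) % p   # verifier's check: g^s = t * y^c
```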
2.2 Paillier's Homomorphic Encryption
Paillier's homomorphic encryption has been widely used in cryptography since 1999 [14]. The main processes of Paillier encryption are as follows:
• KeyGen: choose primes q, p ← GenPrime(κ) satisfying gcd(pq, (p − 1)(q − 1)) = 1; compute n = pq and λ = lcm(p − 1, q − 1); choose g ∈ Z_{n²}^* satisfying gcd(L(g^λ mod n²), n) = 1; the public key is (n, g) and the private key is λ;
• Encryption (E_pk(M)): choose r ∈ Z_n; compute CT = E_pk(M, r) = g^M · r^n mod n²;
• Decryption (D_sk(CT)): compute M = D_sk(CT, λ) = L(CT^λ mod n²) / L(g^λ mod n²) mod n, where L(x) = (x − 1)/n.
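A minimal runnable sketch of these three procedures is given below, with deliberately tiny primes (p = 17, q = 19) and the standard choice g = n + 1; real deployments use primes of at least 1024 bits.

```python
# Toy Paillier implementation of KeyGen/Encryption/Decryption from Sect. 2.2;
# the 2-digit primes and g = n + 1 are illustrative assumptions.
import math, secrets

p, q = 17, 19                      # toy primes; gcd(pq, (p-1)(q-1)) = 1 holds
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)       # lambda = lcm(p - 1, q - 1)
g = n + 1                          # a standard valid choice of g in Z*_{n^2}

def L(x: int) -> int:              # L(x) = (x - 1) / n
    return (x - 1) // n

def encrypt(m: int) -> int:
    while True:                    # pick r in Z_n with gcd(r, n) = 1
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2     # CT = g^M * r^n mod n^2

def decrypt(ct: int) -> int:       # M = L(CT^lam mod n^2) / L(g^lam mod n^2) mod n
    return (L(pow(ct, lam, n2)) * pow(L(pow(g, lam, n2)), -1, n)) % n

c1, c2 = encrypt(20), encrypt(22)
assert decrypt((c1 * c2) % n2) == 42   # additive homomorphism: E(a)*E(b) -> a+b
```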
3 The Proposed Scheme
This section describes the network model and presents the main procedures of our proposed scheme.
3.1 Network Model
In this subsection, we present the network model of the car-home connectivity service in VANETs. Vehicles in VANETs can control home appliances in the smart home through car-home connectivity services, which provide security and privacy protection for the connectivity between the vehicle and the smart home. The entities in VANETs are the OBU, the RSU, and the TA; the entities involved in car-home connectivity are the car-home service provider, the car-home client, and the car-home clone. These entities are described in [15].
3.2 Construction
To satisfy the security and privacy requirements of the car-home connectivity service, this paper proposes an authentication scheme. The authentication scheme can establish anonymous connectivity between the vehicle and the smart home
and establish anonymous sessions between vehicles and other entities. The proposed scheme is constructed from a credential system that consists of three modules (user, issuer, and inspector), which run on different entities to fulfill their functions. In car-home connectivity services, the issuer running on the car-home service provider (SP) releases credentials to car-home clients and car-home clones for establishing anonymous connectivity. Users can be divided into provers and verifiers: the prover running on the car-home clone proves the validity of an indistinguishable presentation token to the verifier running on the car-home SP or a car-home client. In VANETs, the issuer running on the TA releases credentials to OBUs for anonymous authentication, and the prover running on an OBU proves the validity of the presentation token to the verifier running on another OBU or an RSU. In case the verifier finds misbehavior of the prover, the inspector and issuer cooperate to retrieve the real identity of the prover. This subsection covers four phases: setup and key generation, credential issuance, presentation token proof, and presentation token inspection.
Setup and Key Generation. The issuer first generates system parameters as in RSA signatures and Paillier encryption. Meanwhile, the issuer defines cryptographic hash functions H_1 : {0,1}* → Z_n^* and H_2 : Z_{n²}^* → Z_n^*. The public key of the issuer is pk_is = (n, n², e_is, g_1, g_2, H_1, H_2), and the private key is sk_is = (p, q, d_is, λ). Then the issuer issues a public/private key pair (e, d) for the inspector and (e_u, d_u) for each user.
Credential Issuance. Credential issuance allows a user to be authenticated by the issuer and inspector, which cooperate to issue a certificate Cert for the user. The user generates a secret value s based on its ID and computes the parameter g_s = g_2^s mod n² for subsequent credential issuance. The user sends s and g_s to the issuer, which stores the correspondence between ID and g_s in a database. Then the issuer forwards these parameters to the inspector, which signs them with an RSA signature and forwards the signatures as certificates Cert, via the issuer, to the user.
Presentation Token Proof. The prover computes τ_i = H_1(ts) based on the timestamp ts and computes r = H_2(s || τ_i). The prover encrypts g_s with Paillier encryption to get the ciphertext m = g_1^{r·n} · g_s mod n² and generates a zero-knowledge proof P of m. The prover sends {Cert, P, m, ts} as the anonymous and unlinkable presentation token to the verifier, and the verifier checks the validity of the presentation token by RSA verification and the zero-knowledge proof. After the presentation token proof procedure, an anonymous session between the prover and the verifier can be established.
Presentation Token Inspection. In case of misbehavior, the inspector and the issuer can cooperate to retrieve the real identity ID of the misbehaving user. If the verifier finds that the prover has acted illegally, the verifier forwards the presentation token of the prover and the evidence to the issuer. The issuer forwards the ciphertext m to the inspector. The inspector
executes the Paillier decryption of the ciphertext m to get s = L(m^λ mod n²) / L(g_2^λ mod n²) mod n and sends it back to the issuer. The issuer computes g_s from s and searches the database to get the ID of the prover.
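Putting the pieces together, the following sketch computes a presentation-token ciphertext m as in the proof phase above; the toy modulus, the bases g1 and g2, and the SHA-256-based hashes are illustrative stand-ins, not the paper's parameters.

```python
# Hypothetical sketch of the presentation-token ciphertext from Sect. 3.2.
import hashlib, time

p, q = 17, 19                      # toy primes as in the Paillier sketch above
n, n2 = p * q, (p * q) ** 2
g1, g2 = 2, 3                      # illustrative bases in Z*_{n^2}

def H(data: bytes, mod: int) -> int:     # stand-in for H1/H2 mapping into Z_mod
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % mod

s = H(b"user-ID", n)                     # secret value s derived from the ID
gs = pow(g2, s, n2)                      # g_s = g2^s mod n^2
ts = str(int(time.time())).encode()      # timestamp: one valid token per interval
tau = H(ts, n)                           # tau_i = H1(ts)
r = H(str(s).encode() + b"|" + str(tau).encode(), n)   # r = H2(s || tau_i)
m = (pow(g1, r * n, n2) * gs) % n2       # token ciphertext m = g1^(r*n) * g_s
print(m)
```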
4 Use Case
This section describes a usage scenario for our proposed scheme. Assume that Alice wants to control or check the status of smart appliances in her smart home while driving. Alice first deploys a car-home client in the smart home, such as a hub that connects many home appliances, to establish secure communication with the appliances. The car-home client then registers with the nearest car-home connectivity service provider, which is deployed in the nearest RSU. When the vehicle connects to the smart home for the first time, Alice deploys a car-home clone on the OBU of the vehicle; for subsequent anonymous communication, the car-home clone is loaded with the same parameters as the car-home client. Figure 1 shows the application scenario in which Alice wants to control the status of smart appliances at home after leaving the communication range of the smart home. Since the vehicle has left this range, it can only communicate with the smart home through V2I and V2V in VANETs.
Fig. 1. Use case of the car-home application.
5 Security Analysis and Performance Analysis
This section presents the security analysis and performance analysis of our proposed scheme.
5.1 Security Analysis
This subsection presents the security and privacy analysis of our proposed scheme.
Minimum Disclosure. In the presentation token proof phase, the presentation tokens Cert, m, P, ts are randomly generated based on the timestamp ts, the unique secret value, and random numbers. Cert is the issuer's signature on the secret value and does not reveal the user's sensitive information. m is a ciphertext under Paillier encryption, so the plaintext is well protected, and P is generated from random numbers. Therefore, the presentation token presented by a user contains no additional personal information.
Unlinkability. In the presentation token proof phase, the user chooses random numbers to generate the presentation token; these random numbers obscure the characteristics of the unique secret value. Since the selected random numbers differ each time, the presentation tokens presented by a user are different each time, which satisfies unlinkability.
Restricted Credential Usage. The presentation token is generated based on the timestamp ts, and only one valid presentation token can be generated per interval. Therefore, there is only one available presentation token for each user in the same interval, and our scheme satisfies restricted credential usage.
Anonymity. The presentation tokens Cert, m, P, ts are generated based on the timestamp ts and random numbers. The presentation tokens presented each time are different and contain no additional information; neither the verifier nor anyone else can learn the user's real identity from the presentation token. Consequently, our scheme satisfies the requirement of anonymity.
Accountability. The user uses presentation tokens to communicate with other vehicles in VANETs or with smart appliances in the smart home, and the issuer and the inspector can cooperate to retrieve the identity ID of a misbehaving vehicle from its presentation token. The inspector retrieves the FID of the misbehaving user and forwards it to the issuer, which searches the database to get the ID. Consequently, users cannot deny presentation tokens sent by themselves; meanwhile, these presentation tokens imply the users' real identities, so our scheme satisfies accountability.
Distributed Resolution Authority. In our scheme, the FID can only be retrieved by the inspector, and the inspector then forwards the FID to the issuer to obtain the real identity ID. It is impossible for any single authority to retrieve the identity of a user alone. Therefore, our scheme satisfies distributed resolution authority.
5.2 Performance Analysis
The performance analysis of our scheme is given in this subsection. Owing to the limited computing and storage capacity of vehicles, we analyze the proposed scheme in terms of computation overhead and communication overhead.
The basic cryptographic operations determine the performance of our proposed scheme; they consist of biginteger subtraction, biginteger multiplication, biginteger modular exponentiation, and a secured hash function. We adopt the GNU Multiple Precision Arithmetic Library to execute our program. The benchmark platform is a 3.7 GHz Intel(R) Core(TM) i7-8700K CPU with 4 GB RAM running Debian 9.4. We performed each of these four cryptographic operations 1000 times to calculate its average running time. The results are shown in Table 1, and the lengths of the parameters are given in Table 2.

Table 1. The average running time of cryptographic operations.
Notation | Description | Running time (ms)
Tmul | The running time of a biginteger multiplication | 0.0002
Th | The running time of a secured hash function | 0.0002
Tsub | The running time of a biginteger subtraction | 0.00001
Tpowm | The running time of a biginteger modular exponentiation | 0.25
Communication Overhead. The communication overhead is determined by the size of the message transmitted during authentication. In our proposed scheme, the prover needs to send (Cert, m, ts) and the corresponding proof P to the verifier; the verifier then checks the validity and ownership of the presentation token. The authentication process is non-interactive, and the communication overhead of our scheme is identical for V2V and V2I authentication. The total communication overhead of our scheme is: CO_ours = 256 × 4 + 512 + 4 = 1540 bytes.

Table 2. The length of parameters.
Parameter | Description | Byte
q, p | The size of q, p in biginteger-based cryptography | 128
n | The size of n in biginteger-based cryptography | 256
n² | The size of n² in biginteger-based cryptography | 512
H1, H2 | The size of the hash functions H1, H2 in biginteger-based cryptography | 256
ts | The size of the timestamp | 4

Computation Overhead. The computation overhead is the total execution time required for the authentication process. Since the computational overhead
of biginteger modular exponentiation Tpowm is much higher than that of biginteger subtraction Tsub, biginteger multiplication Tmul, and hashing Th, we only focus on the operations with high computation cost. The computation overhead of the authentication process in our proposed scheme refers to the execution of cryptographic operations in the presentation token proof and presentation token verification procedures. The total computational overhead of V2V authentication is calculated as follows: T_ours-V2V = 4 · Tpowm = 1.00 ms. The V2I authentication process is the same as the V2V process; since RSUs have better computation and storage capabilities than vehicles, we only consider the computation overhead of the vehicle: T_ours-V2I = 3 · Tpowm = 0.75 ms.
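These figures follow directly from Tables 1 and 2:

```python
# Sanity check of the overhead figures above using Table 1 and Table 2.
T_powm = 0.25                       # ms per biginteger modular exponentiation
print(4 * T_powm, "ms V2V")         # 1.0 ms
print(3 * T_powm, "ms V2I")         # 0.75 ms
print(256 * 4 + 512 + 4, "bytes")   # 1540-byte presentation token
```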
6 Conclusions
Security and privacy are important issues that must be considered in car-home connectivity services and VANETs. This paper proposes an authentication scheme based on RSA cryptosystems for privacy preservation. It allows the user to interact with home appliances while driving without sensitive information being leaked, and multiple authorities can cooperate to retrieve the identity of misbehaving users. Security and performance analysis shows that our proposed scheme is secure and efficient. In the future, we will refine our scheme and conduct a more detailed security and performance analysis, as well as a simulation of the scheme.
References
1. Ray, P.P.: A survey on Internet of Things architectures. J. King Saud Univ. Comput. Inf. Sci. 30(3), 291–319 (2016)
2. Al-Fuqaha, A., et al.: Internet of Things: a survey on enabling technologies, protocols and applications. IEEE Commun. Surv. Tutor. 17, 1 (2015)
3. Moser, K., Harder, J., Koo, S.G.M.: Internet of Things in home automation and energy efficient smart home technologies. In: 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1260–1265 (2014)
4. Muhammad, M., Safdar, G.A.: Survey on existing authentication issues for cellular-assisted V2X communication. Veh. Commun. 12, 50–65 (2018)
5. D'Angelo, G., Castiglione, A., Palmieri, F.: A cluster-based multidimensional approach for detecting attacks on connected vehicles. IEEE IoT J. 1 (2020)
6. Al-Shareeda, M.A., Anbar, M., Hasbullah, I.H., Manickam, S.: Survey of authentication and privacy schemes in vehicular ad hoc networks. IEEE Sens. J. 21(2), 2422–2433 (2021)
7. Kukkala, V.K., Pasricha, S., Bradley, T.: SEDAN: security-aware design of time-critical automotive networks. IEEE Trans. Veh. Technol. 69(8), 9017–9030 (2020)
8. Castiglione, A., Palmieri, F., Colace, F., Lombardi, M., Santaniello, D., D'Aniello, G.: Securing the Internet of Vehicles through lightweight block ciphers. Pattern Recogn. Lett. 135, 264–270 (2020)
9. Grimm, D., Pistorius, F., Sax, E.: Network security monitoring in automotive domain. In: Arai, K., Kapoor, S., Bhatia, R. (eds.) FICC 2020. Advances in Intelligent Systems and Computing, vol. 1129, pp. 782–799. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39445-5_57
10. Hong, J., Shin, J., Lee, D.: Strategic management of next-generation connected life: focusing on smart key and car-home connectivity. Technol. Forecast. Soc. Chang. 103, 11–20 (2016)
11. Rubio, J.E., Alcaraz, C., Lopez, J.: Recommender system for privacy-preserving solutions in smart metering. Pervasive Mob. Comput. 41, 205–218 (2017)
12. He, D., et al.: An efficient identity-based conditional privacy-preserving authentication scheme for vehicular ad hoc networks. IEEE Trans. Inf. Forensics Secur. 10(12), 2681–2691 (2015)
13. Goldwasser, S., Micali, S., Rackoff, C.: The knowledge complexity of interactive proof systems. SIAM J. Comput. 18(1), 186–208 (1989)
14. Paillier, P.: Public-key cryptosystems based on composite degree residuosity classes. In: Advances in Cryptology – EUROCRYPT 1999, Prague, Czech Republic, 2–6 May 1999. Springer, Heidelberg (1999)
15. Guo, N., Zhao, C., Gao, T.: An anonymous authentication scheme for edge computing-based car-home connectivity services in vehicular networks. Future Gener. Comput. Syst. 106, 659–671 (2020)
Vulnerability Analysis of Software Piracy and Reverse Engineering: Based on Software C

Jaehyuk Lee1, Kangbin Yim2, and Kyungroul Lee3(B)

1 Department of Computer Software, Daegu Catholic University, Gyeongsan 38430, South Korea
[email protected]
2 Department of Information Security Engineering, Soonchunhyang University, Asan 31538, South Korea
[email protected]
3 School of Computer Software, Daegu Catholic University, Gyeongsan 38430, South Korea
[email protected]
Abstract. Illegal use of software due to piracy has been an issue for several years. Software piracy causes further problems such as non-retention of genuine software, violation of licenses, and use beyond the permitted period. These issues adversely affect the software market and industry. Several studies have analyzed the weaknesses of existing software copyright protection technologies, such as license authentication, obfuscation, and watermarking, to ameliorate these adverse impacts, revitalize the software market and industry, and improve the safety of software copyrights. However, the existing software protection technologies are very diverse, and different products exhibit different vulnerabilities even when the same protection technology is applied. We analyzed and demonstrated the vulnerability of the protection technology of software C, and we propose a more effective and comprehensive method for improving the security of software copyright protection technologies.
1 Introduction
The recent rise in software piracy using reverse engineering has a negative impact on the development of the software market and industry [1]. Studies analyzing illegal-copying vulnerabilities have been introduced to address these adverse effects [2, 3]. However, these studies focused only on the license authentication technology applied to software A and B among the existing software copyright protection technologies, and on feature restrictions such as the maximum number of commands. Beyond license authentication, much software adopts recent protection technologies such as source code obfuscation, watermarking, and encryption. Furthermore, additional limited functions, such as feature restrictions and period limitations, are provided depending on the characteristics of each software product. Given this, software developers need to analyze the weaknesses of the existing copyright protection technologies for various software products to devise measures that increase the security of
the various software copyright protection technologies. Since copyright protection technology is applied in various ways and the restriction functions provided by each software product differ, we analyze the vulnerability of software to illegal copying by reverse engineering, based on software C. The contributions of this study are as follows:
• The experimental results of this paper help to analyze the vulnerability of software C and provide guidelines for improving the security of software copyright protection technology to prevent the illegal copying and use of any software.
• This paper derives clues for providing more secure restriction functions and presents the analysis results of bypassing the restriction functions provided by software C.
This paper is organized as follows. Section 2 describes related works, such as the existing copyright protection technologies and prior research, and Sect. 3 presents the analysis of software C's copyright protection technology and restriction options. Section 4 describes the piracy vulnerability of software C, and finally, Sect. 5 concludes the paper.
2 Related Works
The generally used software copyright protection methods include source code obfuscation, source code watermarking, license authentication, and encryption. Source code obfuscation hides the algorithm or idea of a program and makes it difficult for analysts to understand the source code [4]. Source code watermarking inserts a specific watermark into the source code to identify the copyright holder [5]. License authentication involves serial-key authentication, similar to ID/PW authentication; it is easy to apply and implement and is therefore adopted by much software [6]. Encryption protects sensitive information by transforming it into values that cannot easily be recovered [7]. Existing studies [2, 3] on the vulnerabilities of the software copyright protection technologies stated above analyzed license authentication and derived the vulnerability by which a hard-coded license key is exposed in the source code. For general commercial software, the applicable restricted features include restrictions on the features used and restrictions on the duration of use. For software A, the limited functions restrict what can be used, i.e., the maximum number of commands a user can execute. Like software A, software B has no limit on the duration of use but limits the maximum uploadable file size. This means that even where the same license authentication technology is used, different vulnerabilities arise. Table 1 shows the copyright protection technologies and restrictions applied to software A and B. We analyzed the authentication technology and restriction functions applied to software C, derived the possible vulnerabilities in detail, and, based on the analysis results, derived effective countermeasures for software copyright protection technology. This paper presents all these details.
Table 1. Copyright protection techniques applied in prior research, vulnerabilities found, and restriction functions (software A and B)
Software | Protection technology | Vulnerability | Restriction function
A | License authentication | Hard-coded license keys | Limits the number of commands (2,500 commands)
B | License authentication | Neutralizing limited functions | Limits the number of file uploads
3 Analysis of Copyright Protection Technology and Restriction Function of Software C
To ascertain the license authentication vulnerability of software C, we analyzed its overall authentication process; Fig. 1 shows the authentication flow of software C.
Fig. 1. Authentication process of software C
The analyzed process for software C consists of three steps: software authentication, authentication result output, and function restriction. In the first step, software authentication, the user inputs a username and registration code for authentication. The second step outputs the authentication result, displaying the success or failure of the authentication process: if the license input by the user is the correct key, authentication succeeds; otherwise, it fails. The last step limits functionality according to the authentication result: if authentication succeeds, the user can use all functions of the software; if it fails, only limited functions can be used. Software C protects its copyright by restricting functionality through license authentication. Table 2 shows the restriction functions of software C.
Table 2. Software C: restriction functions
Restriction function | Types | Description
Functional restrictions | Dialog box output | –
Period restrictions | Technical support | –
Limitation function | Upload capacity limit (below 300 MB) | X
The restriction function of software C does not limit the duration of use, only certain features. These include a license purchase prompt (dialog box output), software update support, technical support, and an upload capacity restriction. The upload capacity limit means that the total size of uploaded files cannot exceed 300 MB. If users run the trial version without purchasing a license, they are prompted to purchase one and cannot receive technical support or update support. Therefore, users who do not purchase a license cannot use this software normally.
4 Software C Piracy Vulnerability Analysis
4.1 License Authentication Vulnerability Analysis
As mentioned above, license authentication technology is applied to software C. A serious problem with license authentication is that information related to the correct license key is stored inside the software: to verify the license key input by the user, a comparison routine must be executed against the correct license key, so the correct key used for comparison must be stored inside the software. Such structural issues lead to vulnerabilities in which a valid, hard-coded license key is exposed, creating the security threat of an illegal user masquerading as a legitimate one. Because license authentication is also applied to software C, vulnerabilities and security threats like those in existing software copyright protection technologies arise. Therefore, we analyzed not only hard-coded license key vulnerabilities but also additional vulnerabilities in the license authentication applied to software C, and we located where the displayed strings are stored. Figure 2 shows the results of this analysis.

Fig. 2. Checking strings related to license authentication ("Thank you for registering", "Your username or serial number is not valid.")

For the analysis, the username was "1234" and the registration code was "QWER". Analyzing the messages that display the authentication result, either the success message "Thank you for your registration." or the failure message "The username or serial number is invalid." is shown; we therefore assumed these messages correspond to the authentication result. To verify this, we analyzed the related code and found that the username "1234" is stored in the EAX register and the registration code "QWER" is stored in the ECX register, as shown in Fig. 3. Based on this, the code that produces the results of Figs. 2 and 3 was analyzed in detail; Fig. 4 shows the analysis result.
Fig. 3. License information (ID and registration key) input by the user is stored
As a result of the code analysis, if a user inputs an invalid license key, the function at location 0x004693CC is executed and the string "The username or serial number is invalid" is displayed. Conversely, if the user inputs a valid license key, the function at location 0x0046942F is executed and the string "Thank you for registering" is displayed. In other words, regardless of the validity of the license key input by the user, if the code is manipulated to execute the function at location 0x0046942F, the software is forcibly authenticated without a valid license, meaning that authentication can be bypassed. To confirm this assumption, we modified the code at location 0x004691B9 and found that the "Thank you for registering." string is displayed; the license authentication was bypassed successfully.

Fig. 4. Detailed analysis of license authentication

Nevertheless, even when exploiting the license key authentication bypass vulnerability, the restricted functions of software C still cannot be used. Therefore, to neutralize this limitation, we analyzed the vulnerability of the restriction function. The main function of software C is file upload, and this function is restricted by limiting the maximum upload capacity to 300 MB. This means that when a file is uploaded, software C calculates the file size internally and stops the upload if the size exceeds
300 MB. Therefore, there must be code that checks the size of the uploaded file, i.e., code that compares the size of the uploaded file with the maximum uploadable size. Analyzing the upload file size comparison code, we found that a CMP DWORD PTR instruction against 0x12C00000 is called repeatedly, as shown in Fig. 5.

Fig. 5. Code for calculating the size of the uploaded file

In this code, converting the hexadecimal 0x12C00000 to decimal yields 314,572,800, which is exactly the maximum uploadable capacity of 300 MB. To verify this, we changed 0x12C00000 to 0x7FFFFFFF and uploaded a file larger than 300 MB; the file was uploaded successfully, neutralizing the restriction function as shown in Fig. 6.
Fig. 6. Results of neutralizing restriction function (top: cannot upload files over 300 MB, bottom: can upload files over 300 MB).
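To make the constant check and the patch concrete, the following is a minimal Python sketch. The file offset of the CMP immediate is a hypothetical placeholder that would in practice come from the disassembler, and target.exe stands in for the software C binary; this illustrates the patching idea, not the authors' exact procedure.

import shutil
import struct

LIMIT = 0x12C00000      # constant found in the CMP instruction
PATCHED = 0x7FFFFFFF    # constant written in during patching

# 0x12C00000 = 314,572,800 bytes = exactly 300 MB
assert LIMIT == 314_572_800 == 300 * 1024 * 1024

cmp_imm_offset = 0x0    # hypothetical file offset of the 4-byte immediate
shutil.copyfile("target.exe", "target_patched.exe")
with open("target_patched.exe", "r+b") as f:
    f.seek(cmp_imm_offset)
    assert f.read(4) == struct.pack("<I", LIMIT)  # sanity-check the location
    f.seek(cmp_imm_offset)
    f.write(struct.pack("<I", PATCHED))           # lift the 300 MB limit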
As a result of analyzing the vulnerability of the license authentication technology of software C, we verified that a user without the authority to use the software can, by neutralizing the file upload capacity limit, perform the same functions as a user who has purchased a license.
5 Conclusion

In this study, in order to improve the security of a broad and effective software copyright protection technology, we derived several vulnerabilities that exist in the same
copyright protection technology based on software C. In the case of software C, we confirmed that, as reported in previous studies, the license authentication method was used. However, even with the same copyright protection technology, we were able to derive and verify license authentication bypass vulnerabilities different from the hard-coded license key vulnerability discovered in previous studies. Additionally, we verified that, depending on the characteristics of each software product, the function restriction can be bypassed by modifying a specific piece of code, even though different restricted functions, such as a limitation on the period of use or on a specific function, are applied. Future research should focus on finding the root causes of the various vulnerabilities in the same license authentication technology and on studying countermeasures. Moreover, additional vulnerabilities can be derived by analyzing other software to which existing software copyright protection technologies, such as obfuscation, watermarking, and encryption, are applied.

Acknowledgments. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A4A2001810).
PtPeach: Improved Design and Implementation of Peach Fuzzing Test for File Format

Hua Yang(B), Jingling Zhao, and Baojiang Cui

Beijing University of Posts and Telecommunications, Beijing, China
{yang_hua,zhaojingling,cuibj}@bupt.edu.cn
Abstract. The existing open-source Peach fuzzing framework is based on mutation or model generation and cannot obtain execution information from inside the program under test, which leads to many invalid byte mutations during fuzzing. This paper proposes a new feedback mechanism and a new mutation mechanism. Building on the Peach framework, PtPeach improves the test case generation step: with the help of the changes in execution information observed during mutation, the key fields can be located, so that the keyword sections of a file are fuzzed preferentially. This greatly shortens the time needed to generate fuzzing test cases and speeds up vulnerability mining. Experiments show that, in the same amount of time, fuzzing AcroRd32.dll from Adobe Reader, PtPeach improves Peach's ability to find potential vulnerabilities.
1 Introduction

Software vulnerability mining has always been one of the important topics in software security: software developers try to find software vulnerabilities before attackers do. Fuzz testing generates malformed data by mutating normal test cases and feeds it into the software to explore potential vulnerability paths. At present, fuzz testing is mainly divided into white-box testing and black-box testing. With the release of AFL in recent years, gray-box testing has also become a mainstream approach. Gray-box testing can obtain part of the internal information of the target program or its execution without source code. Different from white-box testing, gray-box testing improves fuzzing by collecting input seed characteristics and execution information, including branch information, coverage information, program execution time, and other auxiliary information, to guide the generation of test cases [1, 2]. Peach 3 is an open-source, extensible framework for fuzzing network protocols and file formats, released by Deja vu Security in 2013. It accepts an input Pit file that establishes the data model of a file format and the state model of a protocol, and uses them to instruct the fuzzer to perform more efficient mutations. Peach supports two mutation strategies, based on mutation and on format generation. However, because Peach lacks a feedback-driven seed strategy, it often spends a lot of time exploring meaningless mutations. In summary, for commonly used software we should use a gray-box fuzzer for vulnerability mining. The main contributions of this paper are as follows:
(1) We record coverage information based on multi-threaded PT to guide the generation of test cases.
(2) We locate the key fields in the file and map them to the effective character array Effmap.
(3) We mutate the key fields through Effmap and optimize Peach's mutation strategy.
2 Background

In this section, we introduce the key technologies and background needed to improve the Peach framework.

2.1 Intel Processor Trace

Intel Processor Trace (IPT) is a processor tracing technology introduced by Intel with its Intel Core M and fifth-generation Intel processors, and it is now available in mainstream Intel processors. IPT can track the execution flow of a program according to application-specific settings and record, in real time, highly compressed control-flow packets such as TNT and TIP. Software decodes the collected packets and combines them with the program binary to reconstruct the accurate and complete control flow of the program. Intel PT provides context for all types of events. Since Intel PT is a hardware-level feature, it incurs less performance overhead and runs faster than instrumentation (Table 1).

Table 1. Types and meanings of data packets recorded by IPT

Packet type | Meaning
TNT         | Records the taken/not-taken outcomes of direct conditional branches
TIP         | Records the destination IP of indirect branches, exceptions, interrupts, and other branches or events
FUP         | Records the source IP of asynchronous events (interrupts or exceptions) and other source IPs that cannot be obtained from the binary file
MODE        | Provides the decoder with important processor execution-mode information, so that the decoder can interpret the disassembled binary and the trace log
The packets recorded by IPT are stored in a memory area set up in advance. When this memory area is filled, the hardware issues an interrupt to notify the software. The software decoder then combines the recorded packets with other information, such as the program binary, to accurately restore the original control flow of the program. The packets recorded by IPT are highly compressed and written directly to physical memory without going through the TLB and cache; therefore, the additional runtime performance overhead introduced by IPT is very small. We can thus consider screening seeds by coverage [3].
2.2 Peach Test Case Mutation Strategies

As a fuzzing framework, Peach provides two different mutation strategies: mutation-based and model-based [4].

Mutation Strategy Based on Mutation. Peach mutates existing test cases and cannot parse the format information inside a file. The mutation methods include bit flipping and double-byte shifting [5]. The random strategy runs indefinitely and selects at most MaxFieldsToMutate elements for mutation at a time; for each selected element, the applied mutation is also random. Random strategies are more effective for larger data models, or when run after the sequential strategy has been exhausted. The sequential strategy fuzzes each element of the DataModel in order: Peach starts from the head of the DataModel and performs all effective mutations on each element until all mutations are used up. By mutating existing test cases, many new test cases can be produced in a short time. However, since this strategy only builds on existing test cases, it limits the diversity of test cases in the later stages of fuzzing; and because the file's format information is not used, mutation-based mutation cannot quickly find new paths.

Mutation Strategy Based on the Model. Peach uses the input Pit file to build a model and analyze the file format. The Pit file defines the data model and the state model, and test cases are generated according to the known model of the target file. Model-based mutation speeds up the generation of effective test cases, but writing Pit files increases the cost of fuzzing. Peach can use these two mutation methods separately, or Pit files covering part of the file format can be written and the two methods combined, so that the fuzzer both analyzes the file structure and further mutates the generated test cases, achieving better mutation effects [6].
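As an illustration of the shape of these two built-in strategies, here is a Python sketch for exposition only; Peach itself is not implemented this way, and max_fields stands in for the MaxFieldsToMutate setting mentioned above.

import random

def mutate_random(seed: bytes, max_fields: int = 8) -> bytes:
    # Random strategy shape: pick up to max_fields positions and apply a
    # random primitive to each (a single bit flip is shown here).
    data = bytearray(seed)
    for _ in range(random.randint(1, max_fields)):
        i = random.randrange(len(data))
        data[i] ^= 1 << random.randrange(8)
    return bytes(data)

def mutate_sequential(seed: bytes):
    # Sequential strategy shape: walk the input front to back and yield
    # every single-bit flip until all mutations are used up.
    for i in range(len(seed)):
        for bit in range(8):
            data = bytearray(seed)
            data[i] ^= 1 << bit
            yield bytes(data)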
3 Design and Implementation

In this section, we introduce the overall design and implementation of the system. The overall framework of the improved fuzzing tool based on Peach is shown in Fig. 1. As shown in Fig. 1, an input test case has its program execution information recorded through IPT; the tool then locates the keyword sections, records the branch information, and completes more effective mutations by optimizing the mutation strategy. This section introduces the three key parts of our fuzzer. The first part extracts the program execution flow through IPT and, combined with statically generated control-flow data, decodes it to obtain the control flow graph (CFG) of the program. The second part uses random mutation and lightweight taint analysis to locate the key fields. The third part transfers the effective mutation strategies of AFL to Peach to complete the generation of test cases.
Fig. 1. PtPeach overall framework (components: initial seed input, byte mutation, target binary, program execution, malformed seeds, seed score calculation, optimized mutation strategy, field type, and branch information).
3.1 Getting the Coverage of the Program

A program's CFG is a graph composed of nodes and directed edges that represents all possible control-flow paths of the program at runtime. Each node of the CFG represents a basic block, that is, a fragment of assembly instructions with a single entry and a single exit. A control-flow jump instruction is the last instruction of a basic block, and its jump target is the first instruction of another basic block. A directed edge represents the control-flow transfer generated by a jump instruction, from the source basic block to the target basic block. Using the CFG, we know the set of target addresses to which all control-flow transfer instructions in the program may jump. Because statically constructed CFGs are usually not accurate enough, a large amount of branch information cannot be recorded, which leads to errors in the coverage calculation. To better guide PtPeach in fuzzing the target program, the CFG must be constructed for the target program. When fuzzing with PtPeach, we use IPT to trace the target program: IPT uses the ipt.sys driver, via the ioctl function, to trace the target program at the processor-instruction level. After loading the target module, we build on libipt to perform multi-threaded block decoding of the obtained trace file, obtain the control-flow information of this fuzzing run, and record it as two-tuples of the form <source basic block, target basic block>. After obtaining the program's coverage, we compare it with the coverage of the previous test case. If the coverage is reduced or unchanged, the mutation is discarded; if the coverage increases, a feedback mechanism such as a genetic algorithm is used to screen the test case and add it to the queue. We define the CFG of the target program as C, the statically constructed CFG as C_t, the branch information obtained from the i-th test case as C_d[i], and the coverage rate as Cov; the coverage rate can then be expressed as follows.
Cov = \frac{C_d[n]}{C_t + \sum_{i=1}^{n} C_d[i]}    (1)
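A minimal bookkeeping sketch of Eq. (1) in Python, reading the "+" as accumulation (set union) of the statically known edges and the edges discovered in runs 1..n; the tuple encoding follows the <source basic block, target basic block> pairs described above. This is an interpretation of the formula, not the authors' code.

def coverage(static_edges, history, run_edges):
    # Eq. (1): coverage of the n-th run relative to all known branches.
    # static_edges: edges of the statically built CFG (C_t)
    # history:      list of edge sets from runs 1..n-1 (the C_d[i])
    # run_edges:    edge set of the current run (C_d[n]),
    #               each edge a (source_block, target_block) tuple
    known = set(static_edges) | set(run_edges)
    for edges in history:
        known |= edges
    return len(run_edges) / len(known) if known else 0.0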
At the same time, because a single test case explores paths incompletely, consider the following code: execution after line 10 has not yet been covered, so the if statement on line 9 should be given a higher weight. Therefore, we do not use path coverage as the only factor, but introduce additional factors into the genetic algorithm to calculate the score of a seed.

int main(int argc, char **argv){
    int size = 0;
    unsigned int buf[1000];
    char *tmp = argv[1];
    while( *tmp != '\0' ){ ++size; ++tmp; }
    tmp = argv[1];
    if( size < 20 ) {
        if(tmp[2] == 'I'){
            if(tmp[3] == 'F'){
                /* reachable only via the check on line 9 */
                /* SOME BUG ... */
            }
        }
    } else {
        /* Error ... */
    }
    return 0;
}
If path coverage is used as the only feedback mechanism, the if statement on line 9 already yields a coverage improvement once it has been executed a few times, and the fuzzer would then abandon further mutation of this field and further exploration. Therefore, when mutating the field checked by the if statement on line 9, we must consider the number of subsequent basic blocks and the number of executions of the statement, to avoid ignoring possible bugs on line 11.

3.2 Key Field Type Targeting

Files can all be regarded as binary files written in a certain format. Through the changes in program execution paths and frequencies after byte mutation, we can infer the attributes of the changed fields and use them to guide the evolution of seed mutation. First we introduce some definitions: p[i] is a map that represents the branch execution counts over the CFG at the i-th iteration, and its values are the numbers of executions of the corresponding paths.

P[i] = \begin{pmatrix} 0 & 1 & 2 & 3 & \cdots & k & \cdots & n-1 & n \\ 2403 & 2403 & 169 & 461 & \cdots & 351 & \cdots & 0 & 0 \end{pmatrix}    (2)
We adopt the calculation method from AFL and divide path execution counts into buckets of 0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, ... times. If a path was executed 62 times in the last run but 60 times in this run, we consider the frequency of path execution unchanged. We can compare whether the corresponding key exists and whether its bucketed value is consistent to judge the similarity with the last run. The similarity is calculated as follows.

S[i] = \frac{\left|\{\, b \mid p[i-1][b] \neq 0,\; p[i][b] \neq 0,\; \lfloor \log_2 p[i][b] \rfloor = \lfloor \log_2 p[i-1][b] \rfloor \,\}\right|}{n}    (3)

Here ⌊·⌋ denotes the round-down (floor) function. Through the similarity calculation, we can group continuous fields with consistent similarity into one category, and then use the characteristics of magic numbers, data, and special fields to determine the field type.

Magic Number. In file-format parsing, a magic number at a fixed offset is usually checked to determine specific attributes of the file, and magic numbers are generally fixed values. Therefore, if a magic-number field is mutated, execution is very likely to enter error handling directly, so the execution information of that run will usually have little in common with the execution information before the mutation, in both the number and the frequency of path executions. It follows that the similarity falls below a threshold α. When we mutate such a byte through all values from 0x00 to 0xff, if the mutation similarity S[i] stays below α, we infer that the byte belongs to a magic number, add it to the Effmap, and map the corresponding value into the data structure.

Data. Data fields fill in the file content, usually values such as colors and positions. Mutating these data does not affect the parsing of the file format and thus does not lead to the exploration of new paths, so we skip the mutation of data fields and do not add them to the Effmap.

Special Field. A special field is one for which part of the mutated values yield more path coverage or higher similarity; that is, a special field has multiple valid values. We add it to the Effmap and map the valid values into the corresponding data structure. At the same time, considering that a valid value may be a loop count, we also test the valid value +1/−1, which may uncover an off-by-one or other vulnerability. In addition, through mutation we can identify most of the error-handling blocks. These blocks are not on the execution paths we care about; therefore, we extract the error-handling blocks and use their weight as a factor in the subsequent genetic algorithm to reduce the fitness of test cases that execute them.
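The bucketing, the similarity of Eq. (3), and the field classification can be sketched as follows (Python for exposition; the value of the threshold α and the classification heuristic are illustrative assumptions, not constants taken from PtPeach).

import math

def bucket(count: int) -> int:
    # AFL-style count bucketing (0, 1, 2, 4, 8, ...): 62 and 60 executions
    # fall in the same power-of-two bucket, so the frequency is "unchanged".
    return -1 if count == 0 else math.floor(math.log2(count))

def similarity(prev: dict, curr: dict) -> float:
    # S[i] per Eq. (3): fraction of branches executed in both runs whose
    # execution counts fall in the same bucket.
    n = len(set(prev) | set(curr))
    same = sum(1 for b in prev
               if prev[b] > 0 and curr.get(b, 0) > 0
               and bucket(prev[b]) == bucket(curr[b]))
    return same / n if n else 1.0

ALPHA = 0.3  # hypothetical threshold for the magic-number test

def classify_byte(similarities, coverage_gains) -> str:
    # similarities / coverage_gains: results of the 256 single-byte mutations
    if all(s < ALPHA for s in similarities):
        return "magic"    # every mutation diverges into error handling
    if any(coverage_gains):
        return "special"  # some values open new paths -> keep in Effmap
    return "data"         # parsing never changes -> skip this byte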
3.3 Mutation Strategy Optimization

Since Peach can take a Pit file that provides partial knowledge of the file specification, we treat it as a preprocessing stage and mutate the fields defined in the DataModel of the
pit file. After the Pit-file mutation is completed, the key-field mutation described in Sect. 3.2 is performed. After the fitness function is calculated, a series of mutation operations are applied to the resulting test cases. Since we obtained the key field types and the coverage information of the test cases in Sect. 3.2, we can look up the valid-value mappings in the Effmap to mutate the bytes of the key fields. For a magic-number field, we use only one mutation strategy; data fields we do not mutate; for special fields, we bring in all their valid values for multiple mutations. At the same time, this is combined with the Pit-file mutation method, so that the test cases have more randomness. For non-deterministic mutation, we use the splice idea from AFL and cross over two screened test cases to generate a new test case. Here we must consider that fully random cross-mutation would, with high probability, produce test cases that go directly into exception handling, so we perform cross-mutation only on data fields and special fields. This reduces, to a certain extent, the possibility of generating low-quality test cases.

3.4 Experiment and Evaluation

Based on Peach, we implemented the PtPeach design described in this paper and compared it with the original Peach tool for 20 h. The experimental target is the AcroRd32.dll dynamic link library of Adobe Reader 9.0. The initial corpus is a set of PDF files collected on the Internet, and the same Pit file and the same test cases were used as input. The experimental host is configured with an Intel® Core™ i7-7700HQ @ 2.80 GHz processor and 128 GB of memory. Table 2 shows the data of comparative experiment 1: the first column is the number of newly discovered branches, the second column is the branch coverage, and the third column is the total number of unique crashes found. The experimental data in Table 2 show that, in the same amount of time, PtPeach achieves increases in the number of new branches, in branch coverage, and in the number of crashes found: the number of new branches increased by 12.78% and the number of crashes by 29.17%. The increases in branch coverage and in the number of crashes found mean that PtPeach can execute more branch paths and discover more potential vulnerabilities.

Table 2. Comparison of experimental data

        | Number of new branches | Branch coverage (%) | Number of unique crashes
Peach   | 1573                   | 14.2                | 24
PtPeach | 1774                   | 16.0                | 31
Figure 2 shows the line graph of comparative experiment 2, which reports the number of test samples that PtPeach and the original Peach tool need to generate over the same period. The fewer the samples, the fewer fuzzing iterations are required and the better the fuzzing performance.
Fig. 2. Comparison of the test samples generated by PtPeach and Peach over time (x-axis: elapsed time, 1 h–5 h; y-axis: number of generated test cases, 0–25,000).
As Fig. 2 shows, after PtPeach introduces the coverage feedback mechanism and the key-field mutation mechanism, PtPeach generates test samples more slowly: after the 5-h test, the number of generated test cases is 27.9% lower than Peach's. It follows that, in the same amount of time, PtPeach saves hard-disk space for storing test cases and discovers more branch paths with fewer iterations; PtPeach makes better use of program execution information and file-format information to improve fuzzing performance. Based on the two comparative experiments, we conclude that, after introducing the coverage feedback mechanism based on multi-threaded PT, PtPeach finds more paths than the original Peach in the same amount of time, while the key-field mutation strategy simultaneously makes fuzzing faster. By fully using program execution information and file-format information, PtPeach reduces the number of fuzzing iterations and the performance overhead of fuzzing.
4 Conclusion

Through our study of the fuzzing framework Peach, we found that Peach lacks a feedback mechanism and efficient methods for mutating key fields. Peach does support inputting Pit files to analyze the file format and combining mutation methods to generate test cases; however, because writing a Pit file requires proficiency with the file format, Pit files are very likely to be incomplete. To improve the automation and efficiency of file-format fuzzing, we improved Peach's fuzzing strategy. This paper uses a feedback mechanism to screen seeds with high coverage and generates higher-quality test cases through key-field positioning and an optimized mutation strategy, thereby improving the ability of fuzzing to find potential vulnerabilities.
References

1. Manès, V.J.M., et al.: The art, science, and engineering of fuzzing: a survey. IEEE Trans. Softw. Eng. (2019)
2. Rawat, S., Jain, V., Kumar, A., Cojocar, L., Giuffrida, C., Bos, H.: VUzzer: application-aware evolutionary fuzzing. In: NDSS, vol. 17, pp. 1–14 (2017)
3. Fioraldi, A., Maier, D., Eißfeldt, H., Heuse, M.: AFL++: combining incremental steps of fuzzing research. In: 14th USENIX Workshop on Offensive Technologies (WOOT 2020) (2020)
4. Kim, M., Park, S., Yoon, J., Kim, M., Noh, B.N.: File analysis data auto-creation model for peach fuzzing. J. Korea Inst. Inform. Security Cryptol. 24(2), 327–333 (2014)
5. You, W., et al.: ProFuzzer: on-the-fly input type probing for better zero-day vulnerability discovery. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 769–786. IEEE (2019)
6. Pham, V.T., Böhme, M., Santosa, A.E., Caciulescu, A.R., Roychoudhury, A.: Smart greybox fuzzing. IEEE Trans. Softw. Eng. (2019)
An Efficient Approach to Enhance the Robustness of Scale-Free Networks

Syed Minhal Abbas, Nadeem Javaid(B), Muhammad Usman, Shakira Musa Baig, Arsalan Malik, and Anees Ur Rehman

COMSATS University Islamabad, Islamabad 44000, Pakistan
Abstract. In this paper, we aim to increase the robustness of Scale-Free Networks (SFNs), which are robust against random attacks but fragile to malicious attacks. When the highest-degree nodes are removed from an SFN, the connectivity of the remaining nodes is greatly affected. In the proposed method, we perform two attacks, namely the high degree link attack and the recalculated high degree link attack. These link attacks affect the core part of the network. To measure the extent of the damage to the network, we use closeness centrality and eigenvector centrality. The goal is to construct a topology that shows better robustness against different malicious attacks at a minimum computational cost. Pearson's correlation coefficient is used to select the measures that are positively correlated with each other. Based on these centrality measures, we optimize the network robustness against both node and link attacks simultaneously. The optimization is performed based on two efficient measures, which are computationally inexpensive, to increase the network robustness.
1 Introduction

The Internet of Things (IoT) has made remarkable progress in the last decade. With the increasing number of smart devices, the digital world is growing rapidly. The IoT allows manual processes to be automated into the digital world; it generates the necessary data and feeds it into real-time applications. With the evolution of the IoT, its applications have spread to many fields, including health systems [1], transportation systems [2, 3], medical monitoring systems [4], etc. In complex network theory, there are mainly two types of models: Small World Networks (SWNs) and Scale-Free Networks (SFNs) [5, 6]. SWNs are constructed from heterogeneous nodes, which have different communication ranges, energy, and bandwidth; moreover, SWNs have a high clustering coefficient and a small average path length [7]. In SFNs, on the other hand, the node degrees are distributed according to a power law: high-degree nodes are few compared to low-degree nodes, which connect with similar types of nodes [8]. SFNs are constructed from homogeneous network topologies [9]; homogeneous nodes have the same bandwidth and communication range. Therefore, networks with the SFN property are usually used in the IoT, owing to the uniform bandwidth and transmission range. However, it is challenging to generate a network topology that is robust against malicious attacks [9, 10].
Schneider et al. [11] introduced a measure to calculate the network robustness R against malicious attacks. The robustness of the network is calculated as

R = \frac{1}{N} \sum_{Q=1}^{N} s(Q)    (1)
In the above equation, N denotes the total number of nodes and s(Q) represents the fraction of nodes in the largest connected component after removing Q nodes. R is calculated by analyzing the connectivity of the nodes in a network while nodes are removed until the network is paralyzed. Therefore, we need to build a network topology that is robust against malicious attacks. Moreover, the probability of malicious attacks, which are based on the intelligent selection of nodes, is growing [12, 13]. The authors in [14] consider degree- and Betweenness Centrality (BC)-based attacks on both nodes and links to evaluate the nodes' importance; when high-degree nodes are attacked, the network breaks down into sub-networks. In a network, random and malicious attacks may occur simultaneously. Among them, malicious attacks damage the network to a greater extent because they target the most important nodes in the network [9, 15]. Several strategies are used to make a network robust against malicious attacks. SFNs have a high ability to resist random attacks; however, they are weak against malicious attacks [16]. Therefore, an efficient topology is required to make the network robust against malicious attacks [17]. Several optimization techniques, including the memetic algorithm [18], the multi-population genetic algorithm [19], enhanced differential evolution [20], the natural connectivity model [21], the greedy model [22], and elephant herding optimization [23], have been used to enhance network robustness. These techniques build an onion-like structure without changing the node degrees; this structure is robust against both random and malicious attacks. Our contributions in this work are summarized as follows.

• Two efficient measures, namely Closeness Centrality (CC) and Eigenvector Centrality (EC), are used to find the most important nodes in the network at less computational cost.
• Pearson's correlation coefficient is used to determine the relationship between the CC- and EC-based robustness measures, and the network is made robust against both node and link attacks.
• Two attacks are performed: the High Degree Link Attack (HDLA) and the Recalculated HDLA (RHDLA). These attacks allow the attacker to damage a greater portion of the network in less time.

The rest of the paper is organized as follows. The related work is provided in Sect. 2. The proposed model and its description are discussed in Sect. 3. Section 4 describes the simulation results of our proposed schemes, and Sect. 5 concludes the paper.
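To make the measure of Eq. (1) concrete, here is a minimal sketch using the networkx library. The removal order (recalculated highest degree) is one common choice of malicious attack and is an assumption of this sketch, not a prescription from [11].

import networkx as nx

def schneider_R(G: nx.Graph) -> float:
    # R = (1/N) * sum_{Q=1}^{N} s(Q), with s(Q) the fraction of nodes in the
    # largest connected component after Q removals (recalculated high degree).
    H = G.copy()
    N = G.number_of_nodes()
    total = 0.0
    for _ in range(N):
        v = max(H.degree, key=lambda nd: nd[1])[0]  # current highest-degree node
        H.remove_node(v)
        if H.number_of_nodes() > 0:
            total += max(len(c) for c in nx.connected_components(H)) / N
    return total / N

G = nx.barabasi_albert_graph(100, 2)  # BA topology with N = 100, m = 2
print(f"R = {schneider_R(G):.3f}")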
2 Related Work

Previous work uses single-objective optimization with crossover and mutation. A network that is robust against node attacks is not necessarily robust against link attacks
[10]. In real-world problems, multiple simultaneous malicious attacks are usually not considered when optimizing the robustness of a network [14]. When the traditional Genetic Algorithm (GA) is used to make the network robust, premature convergence occurs due to loss of diversity in the population [19]. Simulated Annealing (SA) accepts worse solutions with some probability in its search for the best solution, which helps it reach a better solution and approach the global optimum; however, the algorithm performs unnecessary comparisons while optimizing network robustness, which causes computational overhead [18]. With the growth of the IoT in smart cities, networks are prone to cyber-attacks, which include malicious and random attacks, and are required to be robust against them; however, many redundant operations cause computational complexity [20]. In [24], the authors explore hidden features in the solution space of large SFNs to enhance network robustness; however, this incurs computational cost. A network structure may be robust against specific types of attacks, while the structural properties of large complex networks are ignored [21, 25]. Machine Learning (ML) techniques are used to optimize the network topology against cyber-attacks. SFNs are efficient in terms of robustness against random attacks, yet vulnerable to malicious attacks [26, 27]. Robust networks are essential support for IoT applications such as smart cities, intelligent vehicles, smart devices [28], etc. However, resource-constrained IoT devices and various types of attacks challenge the network's robustness and communication capability [29]. ML-based algorithms are designed to enhance network robustness against malicious attacks; modifying the data by swapping edges influences the structure of the network. Constructing a network robust enough to withstand malicious attacks is still a challenging optimization problem [30, 31]. A network that is robust against node attacks is not necessarily strong against simultaneous link attacks [14]. In smart cities, nodes collect data from different domains [32], and handling big data while keeping the network robust against malicious attacks is an open issue [19]. GA is used to enhance a scale-free network's robustness against malicious as well as random attacks [19]. Multiple methods are used to overcome malicious attacks and cascading failures separately [15, 33–35]. The evolutionary algorithm in [19] uses two operators with fixed probability rates to optimize the network's tolerance; however, exploring the hidden features in the solution space of large SFNs remains an issue, and the improved network structure is tied to specific types of attacks. SFNs are tolerant of random attacks but prone to malicious attacks [16]. A fault-tolerance-based network topology evolution scheme has been proposed, in which a new node joins the network based on preferential attachment and fault probability. Qiu et al. [12] propose a scheme that turns the network into an onion-like structure by performing degree-difference and angle-sum operations; nodes of the same degree connect with each other, but redundant operations are performed due to edge swapping. Node failures due to faults and cyber-attacks affect communication among networks [33]. Furthermore, some nodes in a network are more densely connected internally than to the rest of the network and form a community, which is an important feature of the network structure.
Therefore, the network is optimized by considering the onion-like structure in each community [36]. The robustness is enhanced along with the preservation of the network community.
3 Scale-Free Networks Topology and Robustness Optimization

In this section, the proposed system model is designed, as shown in Fig. 1. In the figure, the three limitations to be addressed are mapped to the proposed solutions.

3.1 Construction of Initial Topology

We construct the network using the Barabási–Albert (BA) model [8]. The model generates an SFN topology that follows the power-law degree distribution. Initially, a network is constructed from a small clique, and a new node joins the network based on preferential attachment: the degrees of the neighboring nodes are calculated, and a node with a higher degree has a higher probability of being selected, through the roulette-wheel mechanism. The communication range and energy resources of nodes are limited, and high-degree nodes consume more energy, which affects the overall performance of the network; therefore, the maximum degree of a node is fixed [37].

3.2 Centrality Measures

In the proposed system model, we introduce two measures, CC and EC, to find the most influential nodes in the network. These measures have less computational cost than BC; in particular, CC does not have to determine how many shortest paths pass through specific nodes, as BC requires. The CC measure provides global information about the network: it determines a node's centrality based on the relative distance between each pair of nodes, i.e., how close a node is to all other nodes in the network. In this process, the shortest distance is calculated between a specific node and every other node in the network. The CC value c_x of a node x is calculated as

c_x = \frac{1}{\sum_{y} d(y, x)}    (2)
where d(y, x) represents the shortest distance between the two vertices y and x [38]. On the other hand, EC measures the importance of nodes in the network by assigning a relative score to each node. For a given graph G = (v, e), the EC score x_v of a vertex v connected to other vertices by edges can be defined as

x_v = \frac{1}{\lambda} \sum_{t \in G} a_{v,t} \, x_t    (3)

where a represents the adjacency matrix, v and t are two vertices, and λ is a constant factor [39]. The EC identifies the nodes that influence the entire network and assigns a relative index value to every node based on its neighbors' connections; a node with a high index value contributes more to its neighbors' scores than a node with a low index. Using the above measures, we find the most influential nodes in the network on which to perform the attacks.
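As a quick illustration of the two measures on a BA topology, a sketch with networkx follows; note that networkx's closeness_centrality normalizes Eq. (2) by the number of reachable nodes, a slight variation on the raw reciprocal form.

import networkx as nx

G = nx.barabasi_albert_graph(100, 2)

# CC and EC per Eqs. (2) and (3); both avoid the per-node shortest-path
# counting that makes betweenness centrality expensive.
cc = nx.closeness_centrality(G)    # networkx scales Eq. (2) by (n - 1)
ec = nx.eigenvector_centrality(G)  # power iteration on the adjacency matrix

print("top node by CC:", max(cc, key=cc.get))
print("top node by EC:", max(ec, key=ec.get))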
Fig. 1. Proposed system model
In the proposed model, three types of malicious attacks, based on degree, CC, and EC, are performed. These attacks affect the functionality of the remaining nodes in the network and are carried out on both nodes and links. In particular, an attacker can utilize the information about high-degree nodes in the network to remove links. Therefore, in the proposed system model, we consider two link attacks, namely HDLA and RHDLA. In HDLA, the links between pairs of high-degree nodes are removed based on the initial degree information, whereas in RHDLA the node degrees are recalculated after every link removal. Our focus is to evaluate the robustness of the network under a limited number of attacks, using the information provided by these centrality measures [38, 39]. Based
on the robustness measures for nodes and links, we compute Pearson's correlation coefficient between them. The optimization is performed on the two positively correlated measures. The topology that increases the robustness against both node and link attacks is considered the optimized network topology with less computational cost. A smart rewiring mechanism is adopted in the proposed model, which helps to convert the scale-free topology into an onion-like structure.

Table 1. Mapping of limitations with proposed solutions and their validations.

Limitations identified | Solutions proposed | Validations
L1: The BC measure increases the computational cost of finding the influential node [14] | S1: CC and EC find the influential node at less computational cost | V1: Validate the execution time of the two centrality measures in Fig. 2
L2: The network is not robust against different types of malicious attacks [27] | S2: The network is optimized simultaneously against different types of node and link attacks based on CC, EC, HDLA, and RHDLA | V2: Network performance is validated with robustness values in Fig. 4
L3: The attacker utilizes nodes' information to perform attacks [12, 17, 20, 24] | S3: Introduce link as well as node attacks | V3: Evaluate the network connectivity by performing attacks in Fig. 3
In Table 1, section L1 concerns the measure that calculates a node's centrality using betweenness to make the network robust: the computational cost of finding the influential node based on BC is high compared with selecting nodes by degree [14, 16]. As stated in section S1, the two efficient measures CC and EC find the importance of a node in less time; using these measures, we construct a network that is efficient against malicious attacks while reducing the computational cost of optimizing network robustness. In section V1, the execution times of CC and EC are validated in Fig. 2; these measures require less computation time than BC. In section L2, the network is optimized on the basis of the robustness against node attacks, Rn, and the robustness against link attacks, Rl. Using the proposed centrality measures, we design a network topology that would otherwise be fragmented by a small number of attacks. We use Pearson's correlation coefficient between Rn and Rl, and the positively correlated measures are used to enhance the network robustness. As explained in section S2, various attacks are performed to strengthen the network, including CC, EC, HDLA, and RHDLA attacks. To enhance network robustness, we use the smart rewiring mechanism to achieve minimum computational cost and faster convergence toward the optimal solution.
Section L3 concerns attacks on important nodes [12, 17, 20, 24]. As explained in section S3, two link attacks, HDLA and RHDLA, based on pairs of high-degree nodes, are performed on the network; these attacks affect the whole network because a high-degree edge fails, and they are used to make the network robust. Section V3 evaluates the network connectivity against different malicious attacks, including BC, CC, EC, HDLA, and RHDLA.
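A minimal sketch of the two link attacks follows, assuming (as is common) that a link's "degree" is the sum of its endpoint degrees; this is an illustrative reading of the description above rather than the authors' MATLAB implementation.

import networkx as nx

def link_degree(G: nx.Graph, e) -> int:
    return G.degree(e[0]) + G.degree(e[1])  # assumed "degree" of a link

def hdla(G: nx.Graph, k: int) -> nx.Graph:
    # HDLA: rank all links once by the initial degree information.
    H = G.copy()
    ranked = sorted(H.edges, key=lambda e: link_degree(H, e), reverse=True)
    H.remove_edges_from(ranked[:k])
    return H

def rhdla(G: nx.Graph, k: int) -> nx.Graph:
    # RHDLA: recalculate the degrees after every link removal.
    H = G.copy()
    for _ in range(min(k, H.number_of_edges())):
        u, v = max(H.edges, key=lambda e: link_degree(H, e))
        H.remove_edge(u, v)
    return H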
4 Simulation Results and Discussion

In this section, SFNs are used to evaluate the efficiency of our model against the ROSE and SA algorithms. The simulations are conducted in MATLAB. Nodes are deployed randomly in a 500 * 500 m2 sensor field. The modeling process requires each node to have a sufficient number of neighbors, and the communication radius r is set to 200 m. In the simulations, the number of nodes N is set to 100, the edge density m = 2, and the maximum node degree is 25. We evaluate the results over 100 iterations for each algorithm, and we observe in Fig. 4 that our model converges to optimal results within 100 iterations.

Fig. 2. Comparison of execution time between centrality measures (x-axis: number of nodes, 50–200; y-axis: execution time in seconds; curves: BC, CC, EC).
The comparison of the centrality measures is shown in Fig. 2: the computational overhead of BC is high. Compared with BC, CC has a strong ability to find the most important node in less time, owing to how the shortest distances between node pairs are computed. As the number of nodes increases from 50 to 200, the closeness measure outperforms the betweenness measure because CC does not require counting the shortest paths that pass through each specific node. Furthermore, EC finds the influential node for the attack in less time than BC, because it captures not just the direct influence of a given node but also the influence of nodes that are more than one hop away.
Fig. 3. Malicious link attacks affect the network topology (x-axis: number of link attacks, 0–200; y-axis: probability of removed links, 0–1; curves: BC, CC, EC, HDLA, RHDLA).
Validation of the proposed model is achieved by removing links with malicious attacks. The results demonstrate that the proposed attacks, HDLA and RHDLA, affect the network more than the BC and CC attacks, as shown in Fig. 3. As the number of attacks increases, the probability of removed links under HDLA and RHDLA is greater than under the BC and CC attacks; after the 120th attack, HDLA and RHDLA damage the network to a greater extent than the BC and CC attacks. The effectiveness of HDLA and RHDLA is shown in Fig. 3: HDLA removes the most important link, the one between two high-degree nodes. The robustness of the network is then evaluated over different numbers of iterations; our model, the smart rewiring mechanism, is compared with ROSE and SA in Fig. 4.
0.3
Robustness
0.25
0.2
0.15
Smart Rewiring ROSE SA
0.1
0.05
0 10
50
100
Number of Iterations
Fig. 4. Robustness value with different number of iterations.
As the number of iterations increases, the proposed smart rewiring mechanism outperforms both ROSE and SA in terms of robustness (i.e., the Schneider R-value). Among the best results, the robustness values for Smart Rewiring, ROSE, and SA are 0.350, 0.289, and 0.259, respectively. Compared with the ROSE and SA schemes, our model performs better at making the network topology robust against malicious attacks. The edge-swap process is restricted to the high-degree nodes; in this way, the re-connection of these nodes changes the overall performance of the network relative to ROSE and SA.
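A minimal sketch of degree-preserving rewiring follows. The paper restricts swaps to high-degree nodes; for brevity this sketch samples any two edges, and it uses a cheap attack-fitness proxy in place of the full Schneider R — both are assumptions of the sketch.

import random
import networkx as nx

def attack_fitness(G: nx.Graph, frac: float = 0.2) -> float:
    # Cheap robustness proxy: largest-component fraction after removing the
    # top `frac` highest-degree nodes (the full R of Eq. (1) could be used).
    H = G.copy()
    victims = sorted(H.degree, key=lambda nd: nd[1], reverse=True)
    H.remove_nodes_from(v for v, _ in victims[: int(frac * len(G))])
    comps = list(nx.connected_components(H))
    return max(map(len, comps)) / len(G) if comps else 0.0

def rewire(G: nx.Graph, steps: int = 200) -> nx.Graph:
    # Degree-preserving double edge swap (u,v),(x,y) -> (u,x),(v,y),
    # kept only if the fitness does not decrease.
    best = attack_fitness(G)
    for _ in range(steps):
        (u, v), (x, y) = random.sample(list(G.edges), 2)
        if len({u, v, x, y}) < 4 or G.has_edge(u, x) or G.has_edge(v, y):
            continue
        G.remove_edges_from([(u, v), (x, y)])
        G.add_edges_from([(u, x), (v, y)])
        new = attack_fitness(G)
        if new >= best:
            best = new
        else:
            G.remove_edges_from([(u, x), (v, y)])  # revert the swap
            G.add_edges_from([(u, v), (x, y)])
    return G

G = rewire(nx.barabasi_albert_graph(100, 2))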
5 Conclusion

This paper improves the robustness of SFNs. Two efficient centrality measures, CC and EC, are used to find the most influential nodes in the network, and the malicious attacks are performed using these measures. These attacks affect the network quickly and are used to make it robust. A topology's robustness is determined using the Schneider R metric. Two link attacks are performed, the HDLA and the RHDLA; both target the high-degree links of the network. Simulation results show that these attacks damage the network at less computational cost. The network is designed to withstand malicious attacks on nodes as well as links. Through smart rewiring, the network topology is turned into an onion-like structure, which is more robust against malicious attacks.
References

1. Abdelgawad, A., Yelamarthi, K.: Internet of things (IoT) platform for structure health monitoring. Wireless Commun. Mob. Comput. 2017 (2017)
2. Liu, Y., Weng, X., Wan, J., Yue, X., Song, H., Vasilakos, A.V.: Exploring data validity in transportation systems for smart cities. IEEE Commun. Mag. 55(5), 26–33 (2017)
3. Lou, Y., Zhang, L.: Defending transportation networks against random and targeted attacks. Transp. Res. Rec. 2234(1), 31–40 (2011)
4. Santagati, G.E., Melodia, T.: An implantable low-power ultrasonic platform for the internet of medical things. In: IEEE INFOCOM 2017-IEEE Conference on Computer Communications, pp. 1–9. IEEE (2017)
5. Li, S., Da Li, X., Zhao, S.: 5G Internet of Things: a survey. J. Ind. Inf. Integr. 10, 1–9 (2018)
6. Javaid, N., Sher, A., Nasir, H., Guizani, N.: Intelligence in IoT-based 5G networks: opportunities and challenges. IEEE Commun. Mag. 56(10), 94–100 (2018)
7. Watts, D.J., Strogatz, S.H.: Collective dynamics of 'small-world' networks. Nature 393(6684), 440–442 (1998)
8. Albert, R., Jeong, H., Barabási, A.-L.: Error and attack tolerance of complex networks. Nature 406(6794), 378–382 (2000)
9. Herrmann, H.J., Schneider, C.M., Moreira, A.A., Andrade, J.S., Jr., Havlin, S.: Onion-like network topology enhances robustness against malicious attacks. J. Stat. Mech. Theory Exp. 2011(01), P01027 (2011)
10. Gunasekara, R.C., Mohan, C.K., Mehrotra, K.: Multi-objective optimization to improve robustness in networks. In: Mandal, J., Mukhopadhyay, S., Dutta, P. (eds.) Multi-Objective Optimization, pp. 115–139. Springer, Singapore (2018)
11. Schneider, C.M., Moreira, A.A., Andrade, J.S., Havlin, S., Herrmann, H.J.: Mitigation of malicious attacks on networks. Proc. Natl. Acad. Sci. 108(10), 3838–3841 (2011)
12. Qiu, T., Zhao, A., Xia, F., Si, W., Wu, D.O.: ROSE: robustness strategy for scale-free wireless sensor networks. IEEE/ACM Trans. Netw. 25(5), 2944–2959 (2017)
13. Wang, X., Zhou, W., Li, R., Cao, J., Lin, X.: Improving robustness of interdependent networks by a new coupling strategy. Physica A Stat. Mech. Appl. 492, 1075–1080 (2018)
14. Zhou, M., Liu, J.: A two-phase multiobjective evolutionary algorithm for enhancing the robustness of scale-free networks against multiple malicious attacks. IEEE Trans. Cybern. 47(2), 539–552 (2016)
15. Ma, J., Zhichao, J.: Cascading failure model of scale-free networks for avoiding edge failure. Peer-to-Peer Netw. Appl. 12(6), 1627–1637 (2019)
16. Shihong, H., Li, G.: TMSE: a topology modification strategy to enhance the robustness of scale-free wireless sensor networks. Comput. Commun. 157, 53–63 (2020)
17. Qiu, T., Lu, Z., Li, K., Xue, G., Wu, D.O.: An adaptive robustness evolution algorithm with self-competition for scale-free internet of things. In: IEEE INFOCOM 2020-IEEE Conference on Computer Communications, pp. 2106–2115. IEEE (2020)
18. Zhou, M., Liu, J.: A memetic algorithm for enhancing the robustness of scale-free networks against malicious attacks. Physica A Stat. Mech. Appl. 410, 131–143 (2014)
19. Qiu, T., Liu, J., Si, W., Han, M., Ning, H., Atiquzzaman, M.: A data-driven robustness algorithm for the internet of things in smart cities. IEEE Commun. Mag. 55(12), 18–23 (2017)
20. Qureshi, T.N., Javaid, N., Almogren, A., Abubaker, Z., Almajed, H., Mohiuddin, I.: Attack resistance based topology robustness of scale-free Internet of Things for smart cities
21. Peng, G., Jun, W.: Optimal network topology for structural robustness based on natural connectivity. Physica A Stat. Mech. Appl. 443, 212–220 (2016)
22. Qiu, T., Luo, D., Xia, F., Deonauth, N., Si, W., Tolba, A.: A greedy model with small world for improving the robustness of heterogeneous internet of things. Comput. Netw. 101, 127–143 (2016)
23. Strumberger, I., Beko, M., Tuba, M., Minovic, M., Bacanin, N.: Elephant herding optimization algorithm for wireless sensor network localization problem. In: Doctoral Conference on Computing, Electrical and Industrial Systems, pp. 175–184. Springer (2018)
24. Qureshi, T.N., Javaid, N., Almogren, A., Khan, A.U., Almajed, H., Mohiuddin, I.: An adaptive enhanced differential evolution strategies for topology robustness in Internet of Things
25. Zhang, X.-J., Guo-Qiang, X., Zhu, Y.-B., Xia, Y.-X.: Cascade-robustness optimization of coupling preference in interconnected networks. Chaos Solitons Fractals 92, 123–129 (2016)
26. Chen, N., Qiu, T., Zhou, X., Li, K., Atiquzzaman, M.: An intelligent robust networking mechanism for the Internet of Things. IEEE Commun. Mag. 57(11), 91–95 (2019)
27. Rong, L., Liu, J.: A heuristic algorithm for enhancing the robustness of scale-free networks based on edge classification. Physica A Stat. Mech. Appl. 503, 503–515 (2018)
28. Hussain, B., Hasan, Q.U., Javaid, N., Guizani, M., Almogren, A., Alamri, A.: An innovative heuristic algorithm for IoT-enabled smart homes for developing countries. IEEE Access 6, 15550–15575 (2018)
29. Chen, N., Qiu, T., Chaoxu, M., Han, M., Zhou, P.: Deep actor-critic learning-based robustness enhancement of Internet of Things. IEEE Internet Things J. 7(7), 6191–6200 (2020)
30. Xuan, Q., Shan, Y., Wang, J., Ruan, Z., Chen, G.: Adversarial attacks to scale-free networks: testing the robustness of physical criteria. arXiv preprint arXiv:2002.01249 (2020)
31. Safaei, F., Yeganloo, H., Akbar, R.: Robustness on topology reconfiguration of complex networks: an entropic approach. Math. Comput. Simul. 170, 379–409 (2020)
32. Ain, Q.-U., Iqbal, S., Khan, S.A., Malik, A.W., Ahmad, I., Javaid, N.: IoT operating system based fuzzy inference system for home energy management system in smart buildings. Sensors 18(9), 2802 (2018)
33. Wang, S., Liu, J.: Designing comprehensively robust networks against intentional attacks and cascading failures. Inf. Sci. 478, 125–140 (2019)
34. Xu, S., Xia, Y., Ouyang, M.: Effect of resource allocation to the recovery of scale-free networks during cascading failures. Physica A Stat. Mech. Appl. 540, 123157 (2020)
35. Zheng, G., Liu, Q.: Scale-free topology evolution for wireless sensor networks. Comput. Electr. Eng. 39(6), 1779–1788 (2013)
36. Liu, W., Gong, M., Wang, S., Ma, L.: A two-level learning strategy based memetic algorithm for enhancing community robustness of networks. Inf. Sci. 422, 290–304 (2018)
37. Qiu, T., Liu, J., Si, W., Wu, D.O.: Robustness optimization scheme with multi-population co-evolution for scale-free wireless sensor networks. IEEE/ACM Trans. Netw. 27(3), 1028–1042 (2019)
38. Sabidussi, G.: The centrality index of a graph. Psychometrika 31(4), 581–603 (1966)
39. Bonacich, P.: Power and centrality: a family of measures. Am. J. Sociol. 92(5), 1170–1182 (1987)
A Blockchain Based Secure Authentication and Routing Mechanism for Wireless Sensor Networks

Usman Aziz1, Muhammad Usman Gurmani2, Saba Awan2, Maimoona Bint E. Sajid2, Sana Amjad2, and Nadeem Javaid2(B)

1 COMSATS University Islamabad, Attock Campus, Attock 43600, Pakistan
2 COMSATS University Islamabad, Islamabad 44000, Pakistan
Abstract. In this paper, a blockchain-based authentication scheme is proposed for secure routing in Wireless Sensor Networks (WSNs). Unauthenticated and malicious nodes affect the routing process, and the correct identification of the routing path becomes a challenging issue. Therefore, in our model, to prevent the participation of malicious nodes in the network, the registration of the nodes is performed by a Certificate Authority Node (CAN). Each node that participates in routing is authenticated by the Base Station (BS), and mutual authentication is performed. Moreover, the SHA-256 hashing algorithm is used to verify the registration process. Furthermore, in the proposed routing protocol, a Cluster Head (CH) sends data to the BS by selecting a forwarder CH node based on residual energy and minimum distance from the BS. The simulation results show that our proposed model improves the packet delivery ratio and increases the network lifetime.
1 Introduction
A Wireless Sensor Network (WSN) plays an important role in the development of science and technology. WSNs are considered essential components in the progressive growth of smart societies and are used in many fields such as the military, healthcare, surveillance, environment monitoring, etc. [1, 2]. WSNs are comprised of a number of sensor nodes (SNs) that collect data from the sensing area and forward it to the sink node. In various scenarios, a WSN is divided into multiple clusters, which are managed by the Cluster Heads (CHs); the CHs gather the data from the associated SNs and send it to the Base Station (BS). In recent years, the blockchain has shown its potential in various applications, such as medicine, agriculture, supply chains, smart grids, etc. [3–6]. It provides a secure and permanent database that stores transactions in the form of a digital ledger, which is distributed among all the nodes in the network. The transactions are encapsulated in blocks that are added to the chain in chronological order. Moreover, hashing algorithms are used to ensure the integrity of the data. In the blockchain, once a transaction is added to the ledger,
it cannot be modified. Furthermore, blockchain provides the feature of smart contracts, which are triggered automatically without the involvement of a third party. In addition to the above-mentioned applications, the blockchain is also exploited in different areas of WSNs, such as localization and trust evaluation of SNs, secure routing, authentication of SNs, etc. Authentication is a key operation that is performed to prevent the participation of external entities in the network. Authentication in traditional WSNs relies on a centralized authority, which raises the issue of a single point of failure [7]. Many blockchain-based authentication schemes have been proposed to solve this issue. In [8], a sink node validates the identities of the SNs by assigning them sequence numbers; however, there is no mechanism to check the legitimacy of SNs, and the scheme is prone to a single point of failure. In [7], a mutual authentication scheme is proposed for Internet of Things (IoT) networks; however, it requires high computational power, which makes it unsuitable for resource-constrained IoT devices. Furthermore, to address privacy leakage in a crowd-sensing environment, an incentive mechanism is used in [9]; however, no mechanism is defined for the validation of the shared data. Alongside these, a localization mechanism based on the reputation value of the beacon nodes is proposed in [10]. In this scheme, if unregistered nodes communicate with each other, the reputation of legitimate nodes is affected; therefore, the correct reputation values of nodes are not computed, due to the unregistered nodes. To tackle the above-mentioned issues, a blockchain-based authentication scheme is proposed in this paper. The contributions of the proposed work are listed as follows.

• An authentication mechanism is proposed for secure data transmission between CHs and BSs.
• To reduce the workload of the BSs, a trusted entity named the CAN is introduced for storing the nodes' registration information.
• An efficient CH selection mechanism is proposed based on the SNs' residual energy and distance from the BS.

The rest of the paper is organized as follows. The related work is provided in Sect. 2. The proposed system model and its description are discussed in Sect. 3. Section 4 describes the simulation results of our proposed system, and Sect. 5 concludes the paper.
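To illustrate the tamper-evidence property described above, here is a minimal sketch of SHA-256 block chaining; the block contents are hypothetical examples, not the paper's actual transaction format.

import hashlib
import json
import time

def make_block(transactions, prev_hash):
    # Each block commits to its predecessor's SHA-256 hash, so modifying any
    # stored transaction invalidates every later block in the chain.
    header = {"time": time.time(), "prev": prev_hash, "tx": transactions}
    payload = json.dumps(header, sort_keys=True).encode()
    return {**header, "hash": hashlib.sha256(payload).hexdigest()}

genesis = make_block(["CAN registers CH-1"], prev_hash="0" * 64)
block_1 = make_block(["BS authenticates CH-1"], prev_hash=genesis["hash"])
print(block_1["hash"])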
2 Related Work
In this section, the existing literature is discussed in detail. The related work is further divided into multiple categories on the basis of the problems addressed.

2.1 Trust Based Localization of Sensor Nodes
In [10,11], trust of the sensor nodes is evaluated to identify and eliminate the malicious nodes in the network. The trust values of the nodes are computed based
on their behavior and interaction with other nodes. After the identification of malicious nodes, the localization of the nodes is performed using the location information of the benign nodes. Both of the studies use blockchain to store the trust values and locations of the sensor nodes.

2.2 Authentication of Sensor Nodes
In [8], the authors use a lightweight authentication mechanism to detect malicious nodes in IoT networks. In their scheme, the BS assigns sequence numbers to the nodes based on their distance. When the nodes communicate with the BS, it uses the concept of a Merkle Tree to verify the legitimacy of the nodes based on the assigned sequence numbers. In [7], a blockchain based mutual authentication mechanism for sensor nodes is proposed. In this scheme, the public and private blockchains are deployed on the BSs and CHs, respectively. The CHs are used to register the ordinary nodes, whereas the BSs are used to register the CHs. The registration and authentication of the nodes are carried out using smart contracts, while the identity information is stored in the public blockchain maintained by the BSs. The authors in [12] use blockchain to eliminate the third party for trust evaluation and authentication of the nodes. In addition, they store the credentials and the behavior of the nodes in the blockchain. In [13], the blockchain is utilized for secure key management in WSNs. The nodes' identity information and their public keys are stored in the blockchain.
2.3 Secure Routing Schemes
In WSNs, multihop routing faces security issues in the presence of malicious nodes. To carry out secure routing, a blockchain and Q-learning based routing protocol is presented in [14]. The scheme uses blockchain to store the routing information in a tamper-proof database, while Q-learning is used to select the suitable next hop for packet forwarding based on the forwarding rate of the routing nodes. Similarly, in [17], blockchain based smart contracts are used to perform route discovery and route establishment operations in IoT networks. Furthermore, the authors use blockchain tokens as bonds to prevent the malicious behavior of the nodes participating in the routing process. Consequently, the routing overhead is reduced in this scheme. In [15], the authors present a blockchain based model for secure routing in WSNs. They consider the residual energy of the CHs to create a route to the BS. In [18], a blockchain-like data structure is used to create a chain of blocks of the sensed data. Furthermore, the scheme uses the number of successful and failed data transmissions to select suitable data forwarding nodes.
2.4 Blockchain for IoT Devices
In [19], the authors offload the mining process from IoT devices to the edge nodes. Thus, they minimize the workload on the IoT devices and make it possible to
deploy blockchain on resource constrained IoT devices. In [20], a lightweight blockchain is presented for IoT devices. In this scheme, the authors minimize the computational requirements of blockchain in a resource constrained industrial IoT environment. Similarly, in [21,22], lightweight blockchain systems are proposed for IoT networks to reduce the computational and storage requirements.
2.5 Data Security and Attack Detection
In [9], the authors leverage blockchain in a crowdsensing environment to store and protect the users' location information. In [23], the authors use a blockchain based data structure to store the data as well as the users' information in an air pollution monitoring system. In [24], blockchain is utilized to ensure the security of the sensed data in WSNs. In this scheme, the CHs maintain a blockchain ledger in their cache to store the data within the network. In [25], a blockchain based incentive mechanism is proposed for the sensor nodes in WSNs. In this mechanism, incentives are given to the nodes that store the data sensed by the other nodes, while all the nodes in the network jointly maintain the blockchain. Likewise, in [26], blockchain is utilized for the security of the data sensed by the sensor nodes in WSNs. In [27], blockchain is utilized in smart cities to avoid single point of failure issues and ensure data security using a memory-hard Proof of Work (PoW) consensus. In [28], a rolling blockchain is presented to store the data in IoT environments. The rolling blockchain is implemented on IoT devices to store the data temporarily, while a full blockchain is utilized to store the data permanently. Moreover, in [29], blockchain is utilized as a secure storage for the monitoring data of shellfish. In [30], the blockchain is used to store the hashes of the agricultural data kept in the InterPlanetary File System.
3 Proposed System Model
In our proposed system, the network consists of BSs, CHs, SNs and a CAN, as shown in Fig. 1. The SNs and CHs are static and randomly deployed. The BS has high computational and storage capabilities and is responsible for the authentication of the nodes and the storage of sensed data. On the other hand, the CHs have low computational and storage capabilities as compared to the BS. The responsibilities of the CHs are to receive the data from the SNs, process it and forward it to the BS. Furthermore, a Proof of Authority based blockchain is deployed on the BSs. In the system, first, a CH sends a request to the CAN for its registration. The CAN receives the credentials of the CH and verifies whether it is already registered or not (step 1 in Fig. 1). After verification, the CAN registers the CH, assigns it a pseudonym ID and public key, and stores its information in the blockchain (step 2 in Fig. 1). Second, the SNs send registration requests to the associated CHs. Each CH verifies the requests received from its subordinate SNs. After
verification, it assigns pseudonym IDs and public keys to the SNs. When an SN is successfully registered, it becomes eligible to send sensing data to its CH. In the proposed routing mechanism, a CH communicates directly with the BS if it is located in the transmission range. Otherwise, the CH selects one of its neighbor CHs for packet forwarding based on their residual energy and distance from the BS. When a CH A wants to communicate with another CH B, it sends a communication request to the blockchain via a smart contract. In response, the BS checks the reputation of B stored in the blockchain. If the reputation of B is higher than the predefined threshold value, a confirmation message is sent to A. The same procedure is followed by B to verify the identity of A. Thus, mutual authentication is performed and A sends the data packets to B. After successive transmissions, the data packets reach the BS. After receiving the data, the BS requests the CAN to authenticate the CH. In response, the CAN checks whether it has the information of the CH. After verification, the CAN sends an acknowledgment message to the BS. At this stage, if the CH is not registered, the CAN provides negative feedback to the BS and the CH is declared a malicious node. Consequently, the reputation of the CH is decreased. On the other hand, if the verification of the CH is successful, its reputation is increased. Table 1 shows the mapping of limitations to their proposed solutions and validations; a minimal sketch of the next-hop selection and reputation check is given after Table 1.
Fig. 1. Proposed system model
Table 1. Limitations identified, solutions proposed and validations.

Limitations identified | Solutions proposed | Validations
L.1 No mechanism to check legitimacy of the nodes [8] | S.1 Registration and authentication of the nodes | V.1 Gas consumption, as shown in Fig. 2
L.2 Lack of traceability of CHs participating in routing [11] | S.2 Authentication of the CHs | V.1
L.3 Energy inefficient routing | S.3 CH selection based on residual energy and minimum distance from the BS | V.2 PDR
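To illustrate, the next-hop CH selection and reputation based mutual authentication described above could look roughly as follows in Python. This is a minimal sketch under our own assumptions: the paper does not publish code, so the names (NeighborCH, REPUTATION_THRESHOLD) and the exact weighting of residual energy against distance are hypothetical.

from dataclasses import dataclass
from typing import List, Optional

REPUTATION_THRESHOLD = 0.5  # assumed value; the paper only says "predefined threshold"

@dataclass
class NeighborCH:
    node_id: str
    residual_energy: float   # remaining energy in joules
    distance_to_bs: float    # distance to the base station in metres
    reputation: float        # reputation value stored on the blockchain

def is_trusted(ch: NeighborCH) -> bool:
    # Mutual authentication step: accept a peer CH only if its
    # blockchain-stored reputation exceeds the predefined threshold.
    return ch.reputation >= REPUTATION_THRESHOLD

def select_next_hop(neighbors: List[NeighborCH], own_distance_to_bs: float,
                    bs_range: float) -> Optional[NeighborCH]:
    # A CH within the BS's transmission range communicates directly
    # (returns None); otherwise it picks the trusted neighbor with the best
    # trade-off between residual energy and distance to the BS.
    if own_distance_to_bs <= bs_range:
        return None
    trusted = [ch for ch in neighbors if is_trusted(ch)]
    if not trusted:
        raise RuntimeError("no trusted next hop available")
    # The exact scoring is not specified in the paper; a simple weighted
    # difference is used here purely for illustration.
    return max(trusted,
               key=lambda ch: ch.residual_energy - 0.01 * ch.distance_to_bs)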
4 Simulations and Results
To evaluate the performance of the proposed system, we conduct the simulations on an Intel(R) Core(TM) i3-2350M CPU @ 2.30 GHz with 8 GB RAM and the Windows 10 operating system. The Solidity language is used to write the smart contract and Ganache is used to deploy the blockchain, whereas the network is simulated using MATLAB. Figure 2 shows the gas consumption of the registration and authentication operations of SNs and CHs. The gas consumption during the registration of SNs is less than that during the CHs' registration. The reason is that a CH has to register with the CAN, which is far away from it. On the other hand, the authentication of a CH consumes more gas because the CAN has to authenticate more credentials for a CH as compared to an SN. Figure 3 shows the total number of packets sent from the SNs to the CHs in each round. The results show that the total number of packets received by the CHs increases with the rounds. However, after 2500 rounds, the number of packets becomes constant. This is because only a few CHs and SNs are left in the network.
Fig. 2. Gas consumption of registration and authentication of nodes.
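Per-operation gas figures like those in Fig. 2 could, for example, be collected with web3.py against the Ganache deployment mentioned above. The sketch below is an assumption on our part: the contract interface is not published, so the contract address, ABI and function names (registerNode, authenticateNode) are placeholders.

from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))  # Ganache's default endpoint
CONTRACT_ADDRESS = "0x0000000000000000000000000000000000000000"  # from your deployment
CONTRACT_ABI = []  # ABI produced by the Solidity compiler for your contract
contract = w3.eth.contract(address=CONTRACT_ADDRESS, abi=CONTRACT_ABI)

def gas_used(fn, *args):
    # Send the transaction and read gasUsed from the mined receipt.
    tx_hash = fn(*args).transact({"from": w3.eth.accounts[0]})
    receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
    return receipt.gasUsed

print("registration:", gas_used(contract.functions.registerNode, "CH-1"))
print("authentication:", gas_used(contract.functions.authenticateNode, "CH-1"))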
Fig. 3. Number of packets sent from SNs to CHs.
Figure 4 depicts that the Packet Delivery Ratio (PDR) keeps decreasing with the increase in the number of rounds. The reason is that the energy of the CHs is depleted in each round and a CH's energy drains out completely after a certain number of rounds. As a result, the total number of CHs in the network keeps decreasing. Consequently, the PDR of the network decreases.
Fig. 4. PDR.
5 Conclusion
In this paper, we present a secure authentication and routing mechanism for WSNs. The aim of our proposed mechanism is to carry out the authentication of the sensor nodes and ensure secure communication between the CHs and the BS. The proposed routing protocol selects the CHs on the basis of residual energy and the shortest distance from the BS, whereas a secure authentication mechanism for the nodes is performed using
the blockchain and the CAN. The simulation results show that our proposed model improves the packet delivery ratio and also increases the network lifetime. In future work, the proposed idea will be tested on larger networks and in a realistic routing environment.
References

1. Perkins, D.D., Tumati, R., Wu, H., Ajbar, I.: Localization in wireless ad hoc networks. In: Resource Management in Wireless Networking, pp. 507–542. Springer, Boston (2005)
2. Chintalapudi, K.K., Dhariwal, A., Govindan, R., Sukhatme, G.: On the feasibility of ad-hoc localization systems. Technical report (2003)
3. Khalid, R., Samuel, O., Javaid, N., Aldegheishem, A., Shafiq, M., Alrajeh, N.: A secure trust method for multi-agent system in smart grids using blockchain. IEEE Access 9, 59848–59859 (2021)
4. Samuel, O., Javaid, N.: A secure blockchain-based demurrage mechanism for energy trading in smart communities. Int. J. Energy Res. 45(1), 297–315 (2021)
5. Sadiq, A., Javed, M.U., Khalid, R., Almogren, A., Shafiq, M., Javaid, N.: Blockchain based data and energy trading in internet of electric vehicles. IEEE Access 9, 7000–7020 (2020)
6. Yick, J., Mukherjee, B., Ghosal, D.: Wireless sensor network survey. Comput. Netw. 52(12), 2292–2330 (2008)
7. Cui, Z., et al.: A hybrid blockchain-based identity authentication scheme for multi-WSN. IEEE Trans. Serv. Comput. 13(2), 241–251 (2020)
8. Hong, S.: P2P networking based internet of things (IoT) sensor node authentication by blockchain. Peer-to-Peer Netw. Appl. 13(2), 579–589 (2020)
9. Jia, B., Zhou, T., Li, W., Liu, Z., Zhang, J.: A blockchain-based location privacy protection incentive mechanism in crowd sensing networks. Sensors 18(11), 3894 (2018)
10. Kim, T.-H., et al.: A novel trust evaluation process for secure localization using a decentralized blockchain in wireless sensor networks. IEEE Access 7, 184133–184144 (2019)
11. Goyat, R., Kumar, G., Rai, M.K., Saha, R., Thomas, R., Kim, T.H.: Blockchain powered secure range-free localization in wireless sensor networks. Arab. J. Sci. Eng. 45(8), 6139–6155 (2020)
12. Moinet, A., Darties, B., Baril, J.-L.: Blockchain based trust & authentication for decentralized sensor networks. arXiv preprint arXiv:1706.01730 (2017)
13. Tian, Y., Wang, Z., Xiong, J., Ma, J.: A blockchain-based secure key management scheme with trustworthiness in DWSNs. IEEE Trans. Industr. Inf. 16(9), 6193–6202 (2020)
14. Yang, J., He, S., Yang, X., Chen, L., Ren, J.: A trusted routing scheme using blockchain and reinforcement learning for wireless sensor networks. Sensors 19(4), 970 (2019)
15. Haseeb, K., Islam, N., Almogren, A., Din, I.U.: Intrusion prevention framework for secure routing in WSN-based mobile Internet of Things. IEEE Access 7, 185496–185505 (2019)
16. She, W., Liu, Q., Tian, Z., Chen, J.-S., Wang, B., Liu, W.: Blockchain trust model for malicious node detection in wireless sensor networks. IEEE Access 7, 38947–38956 (2019)
17. Ramezan, G., Leung, C.: A blockchain-based contractual routing protocol for the Internet of Things using smart contracts. Wireless Commun. Mob. Comput. 2018 (2018)
18. Kumar, M.H., Mohanraj, V., Suresh, Y., Senthilkumar, J., Nagalalli, G.: Trust aware localized routing and class based dynamic block chain encryption scheme for improved security in WSN. J. Ambient Intell. Humaniz. Comput. 12, 1–9 (2020)
19. Liu, M., Yu, F.R., Teng, Y., Leung, V.C.M., Song, M.: Computation offloading and content caching in wireless blockchain networks with mobile edge computing. IEEE Trans. Veh. Technol. 67(11), 11008–11021 (2018)
20. Liu, Y., Wang, K., Lin, Y., Xu, W.: LightChain: a lightweight blockchain system for industrial Internet of Things. IEEE Trans. Industr. Inf. 15(6), 3571–3581 (2019)
21. Danzi, P., Kalør, A.E., Stefanović, Č., Popovski, P.: Delay and communication tradeoffs for blockchain systems with lightweight IoT clients. IEEE Internet Things J. 6(2), 2354–2365 (2019)
22. Kim, J.-H., Lee, S., Hong, S.: Autonomous operation control of IoT blockchain networks. Electronics 10(2), 204 (2021)
23. Kolumban-Antal, G., Lasak, V., Bogdan, R., Groza, B.: A secure and portable multi-sensor module for distributed air pollution monitoring. Sensors 20(2), 403 (2020)
24. Mori, S.: Secure caching scheme by using blockchain for information-centric network-based wireless sensor networks. J. Signal Process. 22(3), 97–108 (2018)
25. Ren, Y., Liu, Y., Ji, S., Sangaiah, A.K., Wang, J.: Incentive mechanism of data storage based on blockchain for wireless sensor networks. Mob. Inf. Syst. 2018 (2018)
26. Haro-Olmo, F.J., Alvarez-Bermejo, J.A., Varela-Vaca, A.J., López-Ramos, J.A.: Blockchain-based federation of wireless sensor nodes. J. Supercomput. 1–13 (2021)
27. Sharma, P.K., Park, J.H.: Blockchain based hybrid network architecture for the smart city. Future Gener. Comput. Syst. 86, 650–655 (2018)
28. Sergii, K., Prieto-Castrillo, F.: A rolling blockchain for a dynamic WSNs in a smart city. arXiv preprint arXiv:1806.11399 (2018)
29. Feng, H., Wang, W., Chen, B., Zhang, X.: Evaluation on frozen shellfish quality by blockchain based multi-sensors monitoring and SVM algorithm during cold storage. IEEE Access 8, 54361–54370 (2020)
30. Ren, W., Wan, X., Gan, P.: A double-blockchain solution for agricultural sampled data security in Internet of Things network. Futur. Gener. Comput. Syst. 117, 453–461 (2021)
Blockchain Based Authentication and Trust Evaluation Mechanism for Secure Routing in Wireless Sensor Networks

Saba Awan1, Maimoona Bint E. Sajid1, Sana Amjad1, Usman Aziz2, Usman Gurmani1, and Nadeem Javaid1

1 COMSATS University Islamabad, Islamabad 44000, Pakistan
2 COMSATS University Islamabad, Attock Campus, Attock, Pakistan
Abstract. In this paper, a blockchain based authentication model is proposed where the identity of each node is stored on the blockchain. The public and private blockchains are used for authentication. The authentication of Sensor Nodes (SNs) is performed on the private blockchain, whereas the public blockchain authenticates the cluster heads. The existing malicious node detection methods do not guarantee the authentication of the entities in Wireless Sensor Networks (WSNs). Unregistered nodes can easily access the resources of the network and perform malicious activities. Moreover, malicious nodes broadcast wrong route information, which increases packet delay and lowers the packet delivery ratio. In the proposed model, a trust value is calculated in order to remove the malicious nodes. Secure routing is performed on the basis of the most trustworthy nodes in the network. The aim is to reduce the packet delay and increase the packet delivery ratio. The simulation results show that a high throughput and packet delivery ratio are achieved due to the presence of highly trusted nodes. Moreover, our proposed model detects the malicious nodes effectively.

Keywords: Wireless Sensor Networks · Blockchain · Authentication · Trust evaluation · Smart contract · Secure routing

1 Introduction
Wireless Sensor Networks (WSNs) play a significant role in many areas, such as medical, military, surveillance, industrial, etc. [1–3]. A WSN is a self-organizing network where Sensor Nodes (SNs) are randomly deployed and have limited storage, energy and computational power [4–6]. They monitor parameters like temperature, humidity, etc. and transmit the data towards the Base Stations (BSs) [7]. However, due to the limited resources, attackers can easily attack the SNs [8]. A blockchain is an emerging technology that consists of nodes that maintain the state of a distributed ledger. It efficiently keeps a record of transactions among multiple parties [9,10]. No one can easily perform data tampering as the blockchain is immutable. The data in the blockchain is secure because
the blocks are connected via hashes. Moreover, each block consists of a block header and a block body [11]. The root hash of the Merkle Tree is present in the block header, while the block body comprises the transactions. If any data tampering is performed, it can be easily identified by comparing the hash of the data with the root hash [12].
In WSNs, the security threats are becoming more serious day by day [13,14]. There are two types of attacks in a WSN, namely internal and external. In an internal attack, nodes behave selfishly, while in an external attack, attackers force the nodes to perform malicious activities. Therefore, it is very crucial to identify malicious nodes and remove them from the network. In WSNs, the detection of malicious nodes is widely studied and is divided into WSN protocol based and trust model based methods. In a network without authentication, any node can enter the network and forge the identity of benign nodes. The adversary nodes broadcast wrong route information on behalf of the compromised nodes. The existing malicious node detection methods do not guarantee the authentication of nodes. Moreover, the malicious node detection is performed on the basis of the trust value only. Different metrics are used to check a node's fairness; however, detection is performed at a later stage, which consumes more resources of the network [8,15,16].
The presence of a large number of malicious nodes in the network has a negative impact on the routing [17,18]. The malicious nodes modify the data and broadcast wrong route information, which affects the performance of the network. The attacks in WSNs occur in two ways: one targets the data and the other targets the routing. In a data attack, the adversary nodes perform data tampering, while in a routing attack, the adversary nodes choose a least energetic route that depletes the energy of the SNs. The efficiency of the network is affected in terms of high delay and low packet delivery ratio [19,20]. Moreover, the SNs are resource constrained devices that are vulnerable to different attacks like spoofing, impersonation, etc. To overcome the challenges mentioned above, a blockchain based authentication and trust evaluation mechanism for secure routing in WSNs is proposed. In [19], the authors propose a trust aware localized routing scheme for WSNs. However, the authentication and malicious node detection are not considered. Due to this, unregistered nodes can access the network resources and data [15]. The malicious nodes can enter the network and forge the real identity of the nodes. Moreover, they drop the data packets, which increases the network delay and decreases the packet delivery ratio [20]. The contributions of this paper are summarized as follows.
1. Without authentication, any node can easily enter the network and access its data and resources. Therefore, the nodes are authenticated to secure the network from intruders.
2. After the authentication of nodes, they may behave selfishly and drop the data packets. Therefore, the trustworthiness of the nodes is evaluated based on the trust value.
3. The victim nodes drop the data packets, which increases the number of retransmissions and the delay. Therefore, highly trustworthy nodes are used to perform secure and efficient routing.
The rest of the paper is organized as follows: Sect. 2 contains the related work; the proposed system model is discussed in Sect. 3; the simulation of the proposed model is presented in Sect. 4; the conclusion and future work are explained in Sect. 5.
2 Related Work
This section discusses the literature review of the related studies.

2.1 Trust Evaluation for Malicious Nodes Detection
In a hostile and remote environment, whenever any malicious activity is performed on beacon nodes, getting an accurate node location is challenging. Incorrect location estimation affects the localization accuracy, and energy dissipation affects the lifetime of the WSNs [8]. According to [15], traditional methods of WSNs do not keep a record of the original data for later use. In [16], the dynamic behavior of SNs makes the localization process a challenging concern. Due to incomplete information, the SNs cannot broadcast their accurate location in the network.
2.2 Node's Authentication to Ensure Data Confidentiality and Validity
Intra-platform authentication of specific users and random access of unauthorized users in an IoT platform require an access control process for its protection [21]. Conventional IoT identity protocols rely on a centralized Trusted Third Party (TTP), which is vulnerable to a single point of failure [22]. Dynamic WSNs have more uncertainty and larger coverage as compared to static WSNs and thus cause trust issues. The traditional WSNs are mostly homogeneous, which involves complex design protocols and additional overhead [23]. Existing models do not allow content access, reliable authentication and trust management [12]. The lack of traceability of each node in an IoT network leads to inefficiency and significant loss in industrial growth. Moreover, the interconnection, implementation and communication among IoT devices lead to personal and confidentiality concerns [24]. The existing encryption protocols, such as Secure Socket Layer (SSL) and Transport Layer Security (TLS), allow secure communications from one end to another. However, these protocols do not ensure user anonymity and authentication of data [25].
2.3 Secure Routing Protocols in a WSN
The existing solutions have concentrated on the evolution of static topology and ignored the dynamic behavior of the normal nodes. Data collection by static nodes may be imprecise and leads to unreliability and network disruption. The dynamic nature of the network causes energy deficiency and degradation of packet delivery
[26,27]. The presence of adversaries in a WSN introduces different challenges and threats. Many solutions have been proposed to tackle the security issues, but they do not ensure network performance. In secure routing, the feedback process performed on the basis of the most trusted beacon nodes increases the network overhead [19]. Trust issues arise between IoT vendors due to the centralized nature of trust management and the high cost of implementing solutions such as PKI [28]. The existing routing strategies do not distinguish the nodes' behaviors, and suspicious nodes take part in the routing. When a malicious node receives data packets from its neighbors, it discards them and does not forward them to the next neighbor, which creates a black hole attack [29].
2.4 Lightweight Blockchain for IoT
Blockchain requires high resources and suffers from poor performance, which is not well addressed. Furthermore, miners integrate the huge block data and handle many transactions in a Peer to Peer (P2P) network, where storage and bandwidth are challenging to manage on IIoT devices [30]. Existing blockchain technology cannot handle the big data of the Internet of Underwater Things (IoUT) efficiently, where multiple nodes hold a duplicate of the full ledger. The distributed nature of blockchain requires high storage [31]. A local copy of the blockchain records is not feasible for memory constrained and low power devices [32]. Blockchain has a slow update rate due to its chain structure and cannot update the data in parallel, whereas in a tangle, transactions validate their two previous transactions before joining; however, SNs have limited computational resources to validate the prior transactions [33]. In blockchain applications, wireless mobile devices face many challenges with the PoW puzzle in the mining process, which requires high processing ability and data storage availability [34].
2.5 Good Performance Regarding Data Storage
In a WSN, nodes have limited energy and storage. Thus, some nodes may behave selfishly. Moreover, when these nodes do not forward the packets, the entire network is affected. Furthermore, there is no incentive mechanism for the network nodes to store data [35]. To perform PoW on mobile devices, high computational power is required, which makes it impossible to implement blockchain on them. Nodes have a limited capacity to store the information of other nodes, resulting in keeping node information for a limited time. To keep the network updated, all the elements in the network must have the information of their neighboring nodes, which is not possible all the time [36].
2.6 Data Security and Privacy
The presence of adversaries in a WSN introduces different challenges and threats. Different solutions are proposed to tackle the security issues, but they do not ensure network performance and secure routing, while the feedback process performed
based on the most trusted beacon nodes increases the network overhead [19]. The issues identified in the existing smart city systems include bandwidth bottlenecks, high latency, scalability, privacy and security [37]. The growing number of IoT devices increases the security issues due to attacks. It is necessary to protect the devices from cyber attacks. Existing solutions are not suitable due to issues such as storage constraints, single point of failure, high latency and high computational cost. Furthermore, the traditional systems face issues such as big data problems (a massive amount of data is required for accurate decisions and detection of attacks) and lack of privacy (data is collected without the user's permission). This leads to wrong decision making [38]. In a crowd sensing network, mobile devices sense and compute the data, which saves cost, but there is an issue of privacy leakage. Low user engagement and false information uploaded by users cause the disclosure of private information [39]. In IoT, a central authority is used to store the information. Shellfish products are highly vulnerable and perishable, and quality deterioration generally takes place in a short time. Temperature control of shellfish products is a major issue, as temperature affects the shelf life, freshness and decomposition of the food [40]. In an Information-Centric Network (ICN) based WSN, the caching scheme is not investigated. The dispersal of data, i.e., duplication, raises privacy and security issues [41].
2.7 Blockchain Based Fair Non-repudiation Mechanism
The existing service schemes face security challenges, which cause many concerns; for example, malicious clients may deny the service provisioning and service providers may provide fake services [42].
3 System Model
The proposed system model is motivated from [15,22]. In the proposed system model, a blockchain based authentication and trust evaluation mechanism for secure routing in the WSN is presented. The WSN comprises CHs, SNs, BSs and end users. As the SNs have limited computational power and storage, they sense the data and send it to their associated CHs. The CHs receive the data and forward it to the nearby BSs, which have abundant resources like storage, energy, computational power, etc. In the proposed model, the public and private blockchains are used, as shown in Fig. 1. The public blockchain is deployed on the BSs, whereas the private blockchain is deployed on the CHs. The steps included in the authentication scheme are initialization, registration and authentication. In the initialization, the BS initializes all the nodes that are present in the network. The BS generates public and private keys for the CHs, the SNs and itself. The keys are used to verify the integrity of messages. Each node has its unique MAC (Media Access Control) address. The identities of the BS, CH and SN are marked as BSID, CHID and SNID.
Fig. 1. Proposed system model
The CHs are registered using a smart contract that is deployed on the public blockchain. The identity information of the CHs is stored in the public blockchain. The smart contract verifies whether a CH node already exists or not. Moreover, the validity of the CH's MAC address and the correctness of its identity are checked. When these steps are performed successfully, the public blockchain records the identity of the CH. If the verification of the identity fails, an error message is returned. In contrast, the registration of the SNs is performed on the private blockchain. The SNs are allowed to join the blockchain network after registration. The registration process of the SNs is the same as that of the CHs. The identity information of the SNs is stored on the private blockchain. The SNs are bound to their CHs after the deployment. The WSNs are vulnerable to two kinds of attacks, i.e., external and internal. The registration and authentication of the nodes mitigate the external attacks, as the intruders are not allowed into the network. However, internal nodes may behave maliciously and broadcast wrong information into the network. If a node wants to join the network, it must be registered. When an SN communicates with a CH, the CH authenticates the identity of the SN using the private blockchain. Furthermore, when a CH communicates with the BS, the BS verifies the identity of the CH using the public blockchain. Additionally, when two CHs communicate, the mutual authentication of the CHs is performed; both CHs send requests to the BS for authentication. In an internal attack, nodes behave selfishly in the network. It is crucial to identify and remove the malicious nodes after registration. A minimal sketch of this registration and authentication logic is given below.
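The following Python sketch mimics what the registration smart contract verifies, purely for illustration. The dictionary stands in for the blockchain state, and all identifiers (ledger, register, authenticate) are our own illustrative names, not the authors' contract code.

import hashlib
import uuid

ledger = {}  # identity -> record; stands in for the public/private blockchain

def register(node_id: str, mac: str) -> dict:
    # Reject an already-registered identity, check the MAC address,
    # then store a pseudonym ID and public key on the ledger.
    if node_id in ledger:
        raise ValueError("node already registered")
    if len(mac.split(":")) != 6:
        raise ValueError("invalid MAC address")
    record = {
        "pseudonym": uuid.uuid4().hex,  # assigned pseudonym ID
        # Placeholder "public key"; a real deployment would generate a key pair.
        "pubkey": hashlib.sha256(mac.encode()).hexdigest(),
    }
    ledger[node_id] = record
    return record

def authenticate(node_id: str) -> bool:
    # Authentication succeeds only for identities already on the ledger.
    return node_id in ledger

# Example: a CH registers, then identities are checked against the ledger.
register("CH-1", "aa:bb:cc:dd:ee:01")
assert authenticate("CH-1") and not authenticate("SN-9")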
The trust value of the nodes is computed in order to remove the malicious nodes. The following steps are involved in the trust evaluation of the nodes.
Step 1: The CH checks the state of the SNs, i.e., whether they are alive or not.
Step 2: The delayed transmission, forwarding rate and response time information is collected for the alive nodes and the communication quality is computed.
Step 3: The trust value η is calculated on the basis of the nodes' communication quality.
Step 4: If η is greater than a defined threshold, the node is considered legitimate; otherwise, it is considered malicious.
Step 5: The η of each node lies between 0 and 1. After the trust evaluation, the CHs send the η of the nodes to the BS. The smart contract for malicious node detection is deployed at the BS.
Step 6: The nodes with a high η transmit the packets to their associated CHs. After receiving the packets from the SNs, the CHs forward them to the nearby BS.
A minimal sketch of this trust computation follows.
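The sketch below assumes the three observed quantities are already normalized to [0, 1]; the equal weighting and the threshold of 0.5 are our assumptions, since the paper gives the procedure but not the exact formula.

def trust_value(forwarding_rate: float, delayed_tx_rate: float,
                normalized_response_time: float) -> float:
    # Combine forwarding rate, delayed transmissions and response time into a
    # communication quality, i.e., the trust value eta in [0, 1].
    quality = (forwarding_rate                      # higher is better
               + (1.0 - delayed_tx_rate)            # fewer delayed packets is better
               + (1.0 - normalized_response_time)   # faster responses are better
               ) / 3.0
    return max(0.0, min(1.0, quality))

THRESHOLD = 0.5  # assumed; the paper only requires "a defined threshold"

def classify(eta: float) -> str:
    return "legitimate" if eta > THRESHOLD else "malicious"

eta = trust_value(forwarding_rate=0.9, delayed_tx_rate=0.1,
                  normalized_response_time=0.2)
print(round(eta, 2), classify(eta))  # 0.87 legitimate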
Solutions proposed
Validations done
L1: Unregistered nodes access the network resources
S1: The WSN which consist of two blockchains i.e. public and private
V1: Throughput, packet delivery ratio
L2: Malicious node detection
S2: Malicious node detection is based on trust value
V2: Credibility of the SNs
L3: Insecure routing
S3: Most trust worthy nodes perform a secure routing to reduce the delay
V3: Throughput, packet delivery ratio
Table 1 shows the limitations identified, the solutions proposed and their validations. In the first solution, the authentication is performed based on two blockchains, i.e., public and private. On the public blockchain, the BS registers the CHs, and the SNs are registered by the CHs on the private blockchain. The unique identity of each CH and SN is stored at the BS. Throughput and packet delivery ratio are used to evaluate the performance of the proposed system. In the second solution, whenever the authentication of the nodes is performed, the η of each node is computed, which lies between 0 and 1. The trust value is used to identify the legitimate and malicious nodes. In the third solution, the most trustworthy nodes take part in the packet transmission to reduce the delay. In the network, whenever malicious nodes get a packet from their neighbor node, they do not forward it and drop it, which increases the delay. The efficiency of the network is affected due to the high delay and packet drop ratio. The performance parameters are the packet delivery ratio and throughput (S3).
4 Results and Discussions
This section presents the simulation results and their discussion. MATLAB is used to validate the performance of the proposed model. The SNs are considered stationary for the simulations. The parameters used in the simulations are given in Table 2.

Table 2. Simulation parameters

Parameter | Value
Sensing area | 100 m × 100 m
Deployment | Random
Sensor nodes | 8
Cluster heads | 4
Initial energy | 0.5 J
Fig. 2. Packet delivery ratio
Figure 2 shows the packet delivery ratio against the number of rounds. The number of packets increases with the number of rounds. The η of the nodes is shown in Fig. 4. The route is selected on the basis of η; therefore, a high value of the packet delivery ratio is achieved. As the SNs have limited computational power, they do not send their packets directly to the BS, which would consume high energy. Therefore, each CH of a cluster collects the data from all of its associated SNs and transmits it to the forwarder CH or the BS based on the minimal distance. The proposed model improves the possibility of receiving packets successfully. A large number of cooperating nodes are involved in the packet transmission, thus reducing the computational overhead.
Fig. 3. Throughput
Fig. 4. Credibility of sensor nodes
High reliability is achieved due to the authentication of the nodes, as outsiders cannot participate in the network. Figure 3 depicts that the packets are successfully received at the BS. The packets sent to the CHs depend on the number of highly trusted nodes. The more trustworthy nodes send more packets to the BS, thus increasing the network throughput and packet delivery ratio. To transmit the packets to the BS, different forwarding CHs forward the data to the BS in each round. Therefore, the nodes have enough energy to transmit more packets to the BS and the throughput of the network is also increased. Figure 4 shows the trust values of the SNs. The malicious SNs are detected based on the obtained η. An appropriate threshold is set; if the η of an SN is less than the threshold, the node is considered malicious; otherwise, it is marked as
a legitimate node. When the authentication of an SN is performed, only a registered node can participate in the network. However, due to the limited resources, a node may behave selfishly. Therefore, the trust value is calculated based on the delayed transmission, response time and forwarding rate.
5 Conclusion and Future Work
In this paper, we have proposed a blockchain based authentication and trust model for secure routing. To achieve the goal of authentication, smart contracts and the public and private blockchains are used. The local blockchain is established between the SNs and the CHs, while the CHs and the BS are added to the global blockchain. The identity registration between the SNs and CHs is completed first. When the registration is completed, the trust value of the nodes is computed. The simulation results show a higher throughput and packet delivery ratio in the presence of highly trusted nodes. As the credibility of each node is computed, the nodes having a high trust value participate in the network; otherwise, they are removed from the network. The packet transmission is performed in the intra-network. In the future, we will extend our work to multi-WSNs; therefore, the routing will be performed in the inter-network.
References

1. Kandris, D., Nakas, C., Vomvas, D., Koulouras, G.: Applications of wireless sensor networks: an up-to-date survey. Appl. Syst. Innov. 3(1), 14 (2020)
2. Yetgin, H., Cheung, K.T.K., El-Hajjar, M., Hanzo, L.H.: A survey of network lifetime maximization techniques in wireless sensor networks. IEEE Commun. Surv. Tutorials 19(2), 828–854 (2017)
3. Noel, A.B., Abdaoui, A., Elfouly, T., Ahmed, M.H., Badawy, A., Shehata, M.S.: Structural health monitoring using wireless sensor networks: a comprehensive survey. IEEE Commun. Surv. Tutorials 19(3), 1403–1423 (2017)
4. Wang, J., Gao, Y., Liu, W., Sangaiah, A.K., Kim, H.J.: Energy efficient routing algorithm with mobile sink support for wireless sensor networks. Sensors 19(7), 1494 (2019)
5. Azarhava, H., Niya, J.M.: Energy efficient resource allocation in wireless energy harvesting sensor networks. IEEE Wireless Commun. Lett. 9(7), 1000–1003 (2020)
6. Khan, Z.A., et al.: Efficient routing for corona based underwater wireless sensor networks. Computing 101(7), 831–856 (2019)
7. Lee, H.C., Ke, K.H.: Monitoring of large-area IoT sensors using a LoRa wireless mesh network system: design and evaluation. IEEE Trans. Instrum. Measur. 67(9), 2177–2187 (2018)
8. Kim, T.H., et al.: A novel trust evaluation process for secure localization using a decentralized blockchain in wireless sensor networks. IEEE Access 7, 184133–184144 (2019)
9. Bao, Z., Wang, Q., Shi, W., Wang, L., Lei, H., Chen, B.: When blockchain meets SGX: an overview, challenges and open issues. IEEE Access (2020)
10. Gourisetti, S.N.G., Mylrea, M., Patangia, H.: Evaluation and demonstration of blockchain applicability framework. IEEE Trans. Eng. Manage. 67(4), 1142–1156 (2019)
11. Xu, Y., Huang, Y.: Segment blockchain: a size reduced storage mechanism for blockchain. IEEE Access 8, 17434–17441 (2020)
12. Moinet, A., Darties, B., Baril, J.L.: Blockchain based trust & authentication for decentralized sensor networks. arXiv preprint arXiv:1706.01730 (2017)
13. Jiang, Q., Zeadally, S., Ma, J., He, D.: Lightweight three-factor authentication and key agreement protocol for internet-integrated wireless sensor networks. IEEE Access 5, 3376–3392 (2017)
14. Shin, S., Kwon, T.: A lightweight three-factor authentication and key agreement scheme in wireless sensor networks for smart homes. Sensors 19(9), 2012 (2019)
15. She, W., Liu, Q., Tian, Z., Chen, J.S., Wang, B., Liu, W.: Blockchain trust model for malicious node detection in wireless sensor networks. IEEE Access 7, 38947–38956 (2019)
16. Goyat, R., Kumar, G., Rai, M.K., Saha, R., Thomas, R., Kim, T.H.: Blockchain powered secure range-free localization in wireless sensor networks. Arabian J. Sci. Eng. 45(8), 6139–6155 (2020)
17. Alghamdi, W., Rezvani, M., Wu, H., Kanhere, S.S.: Routing-aware and malicious node detection in a concealed data aggregation for WSNs. ACM Trans. Sensor Networks (TOSN) 15(2), 1–20 (2019)
18. Yadav, M., Fathi, B., Sheta, A.: Selection of WSNs inter-cluster boundary nodes using PSO algorithm. J. Comput. Sci. Colleges 34(5), 47–53 (2019)
19. Kumar, M.H., Mohanraj, V., Suresh, Y., Senthilkumar, J., Nagalalli, G.: Trust aware localized routing and class based dynamic block chain encryption scheme for improved security in WSN. J. Ambient Intell. Humanized Comput., 1–9 (2020)
20. Latif, K., Javaid, N., Ullah, I., Kaleem, Z., Abbas, Z., Nguyen, L.D.: DIEER: delay-intolerant energy-efficient routing with sink mobility in underwater wireless sensor networks. Sensors 20(12), 3467 (2020)
21. Hong, S.: P2P networking based internet of things (IoT) sensor node authentication by blockchain. Peer-to-Peer Networking Appl. 13(2), 579–589 (2020)
22. Cui, Z., et al.: A hybrid blockchain-based identity authentication scheme for multi-WSN. IEEE Trans. Serv. Comput. 13(2), 241–251 (2020)
23. Tian, Y., Wang, Z., Xiong, J., Ma, J.: A blockchain-based secure key management scheme with trustworthiness in DWSNs. IEEE Trans. Ind. Inf. 16(9), 6193–6202 (2020)
24. Rathee, G., Balasaraswathi, M., Chandran, K.P., Gupta, S.D., Boopathi, C.S.: A secure IoT sensors communication in industry 4.0 using blockchain technology. J. Ambient Intell. Humanized Comput., 1–13 (2020)
25. Kolumban-Antal, G., Lasak, V., Bogdan, R., Groza, B.: A secure and portable multi-sensor module for distributed air pollution monitoring. Sensors 20(2), 403 (2020)
26. Haseeb, K., Islam, N., Almogren, A., Din, I.U.: Intrusion prevention framework for secure routing in WSN-based mobile Internet of Things. IEEE Access 7, 185496–185505 (2019)
27. Javaid, N., Shakeel, U., Ahmad, A., Alrajeh, N., Khan, Z.A., Guizani, N.: DRADS: depth and reliability aware delay sensitive cooperative routing for underwater wireless sensor networks. Wireless Networks 25(2), 777–789 (2019)
28. Ramezan, G., Leung, C.: A blockchain-based contractual routing protocol for the internet of things using smart contracts. Wireless Communications and Mobile Computing 2018 (2018)
29. Yang, J., He, S., Xu, Y., Chen, L., Ren, J.: A trusted routing scheme using blockchain and reinforcement learning for wireless sensor networks. Sensors 19(4), 970 (2019)
30. Liu, Y., Wang, K., Lin, Y., Xu, W.: LightChain: a lightweight blockchain system for industrial internet of things. IEEE Trans. Ind. Inf. 15(6), 3571–3581 (2019)
31. Uddin, M.A., Stranieri, A., Gondal, I., Balasurbramanian, V.: A lightweight blockchain based framework for underwater IoT. Electronics 8(12), 1552 (2019)
32. Danzi, P., Kalør, A.E., Stefanović, Č., Popovski, P.: Delay and communication tradeoffs for blockchain systems with lightweight IoT clients. IEEE Internet Things J. 6(2), 2354–2365 (2019)
33. Rovira-Sugranes, A., Razi, A.: Optimizing the age of information for blockchain technology with applications to IoT sensors. IEEE Commun. Lett. 24(1), 183–187 (2019)
34. Liu, M., Yu, F.R., Teng, Y., Leung, V.C., Song, M.: Computation offloading and content caching in wireless blockchain networks with mobile edge computing. IEEE Trans. Veh. Technol. 67(11), 11008–11021 (2018)
35. Ren, Y., Liu, Y., Ji, S., Sangaiah, A.K., Wang, J.: Incentive mechanism of data storage based on blockchain for wireless sensor networks. Mobile Information Systems 2018 (2018)
36. Sergii, K., Prieto-Castrillo, F.: A rolling blockchain for a dynamic WSNs in a smart city. arXiv preprint arXiv:1806.11399 (2018)
37. Sharma, P.K., Park, J.H.: Blockchain based hybrid network architecture for the smart city. Future Gener. Comput. Syst. 86, 650–655 (2018)
38. Rathore, S., Kwon, B.W., Park, J.H.: BlockSecIoTNet: blockchain-based decentralized security architecture for IoT network. J. Network Comput. Appl. 143, 167–177 (2019)
39. Jia, B., Zhou, T., Li, W., Liu, Z., Zhang, J.: A blockchain-based location privacy protection incentive mechanism in crowd sensing networks. Sensors 18(11), 3894 (2018)
40. Feng, H., Wang, W., Chen, B., Zhang, X.: Evaluation on frozen shellfish quality by blockchain based multi-sensors monitoring and SVM algorithm during cold storage. IEEE Access 8, 54361–54370 (2020)
41. Mori, S.: Secure caching scheme by using blockchain for information-centric network-based wireless sensor networks. J. Signal Process. 22(3), 97–108 (2018)
42. Xu, Y., Ren, J., Wang, G., Zhang, C., Yang, J., Zhang, Y.: A blockchain-based non-repudiation network computing service scheme for industrial IoT. IEEE Trans. Ind. Inf. 15(6), 3632–3641 (2019)
Towards Energy Efficient Smart Grids: Data Augmentation Through BiWGAN, Feature Extraction and Classification Using Hybrid 2DCNN and BiLSTM

Muhammad Asif1, Benish Kabir1, Pamir1, Ashraf Ullah1, Shoaib Munawar2, and Nadeem Javaid1

1 COMSATS University Islamabad, Islamabad 44000, Pakistan
2 International Islamic University Islamabad, Islamabad 44000, Pakistan
Abstract. In this paper, a novel hybrid deep learning approach is proposed to detect the nontechnical losses (NTLs) that occur in smart grids due to the illegal use of electricity, faulty meters, meter malfunctioning, unpaid bills, etc. The proposed approach is based on data-driven methods due to the sufficient availability of smart meters' data. A bi-directional Wasserstein generative adversarial network (Bi-WGAN) is utilized to generate synthetic theft samples for solving the class imbalance problem. The Bi-WGAN efficiently synthesizes the minority class theft samples by leveraging the capabilities of an additional encoder module. Moreover, the curse of dimensionality degrades the model's generalization ability. Therefore, the high dimensionality issue is solved using a two dimensional convolutional neural network (2D-CNN) and a bi-directional long short-term memory network (Bi-LSTM). The 2D-CNN is applied on 2D weekly data to extract the most prominent features. In the 2D-CNN, the convolutional and pooling layers extract only the potential features and discard the redundant features to reduce the curse of dimensionality. This process increases the convergence speed of the model as well as reduces the computational overhead. Meanwhile, a Bi-LSTM is also used to detect the non-malicious changes in consumers' load profiles using its strong memorization capabilities. Finally, the outcomes of both models are concatenated into a single feature map and a sigmoid activation function is applied for the final NTL detection. The simulation results demonstrate that the proposed model outperforms the existing scheme in terms of the Matthews correlation coefficient (MCC), precision-recall (PR) and area under the curve (AUC). It achieves 3%, 5% and 4% greater MCC, PR and AUC scores, respectively, as compared to the existing model.
1 Introduction
Electricity has become a necessary part of our lives. The electricity generated through hydropower, wind power or thermal power is transmitted to the grid stations. The grid stations further transmit the electricity to power utilities
for distribution in different industrial and residential regions. During the generation, transmission and distribution of electricity, different losses often occur. These losses are generally divided into technical losses (TLs) and nontechnical losses (NTLs). The former occur due to the energy dissipation in electricity distribution lines, short circuits in transformers, fatal electric shocks, etc. The latter happen due to metering faults, bypassing of the meters, physical tampering through shunt devices, unpaid bills, etc. For power utilities, the NTLs have become a serious issue because they account for billions of dollars in electricity losses every year. According to a World Bank report, the United States suffers a loss of $6 billion [1] due to NTLs, which is a huge amount. Moreover, the power utilities in Fujian, China have borne a loss of almost $15 million so far [2]. That is why electricity theft detection (ETD) is a quite serious issue for the current and future era. However, the emergence of smart grids and advanced metering infrastructure (AMI) enables two-way energy and communication flow between power utilities and consumers. The smart meters collect the electricity consumption data at each time stamp. So, the sufficient availability of electricity consumption data opens a new way for the research community to contribute their efforts towards efficient ETD.
The literature teems with various methods of ETD. Currently, three main methods exist for the detection of energy theft: 1) state or hardware based, 2) game theory based and 3) data driven based. The state based methods [3] require additional hardware devices and sensors for theft detection, which is not a suitable approach because additional monetary cost is needed to install and maintain the devices. Similarly, the game theory based methods [4] create a simulated environment where a game is played between the consumers and the power utilities for solving the electricity theft problem. However, this is not a suitable approach because designing the simulated environment for complex real world scenarios is a challenging task. Therefore, the data driven methods attract the research community's attention because they only require a dataset for the model's training. Afterwards, they are able to discriminate between normal and malicious users by exploiting different machine and deep learning techniques.
A substantial body of work has been done in the literature to identify energy theft by utilizing supervised and unsupervised learning models. Many researchers [5–7] use different machine and deep learning techniques for detecting energy theft. However, all of them have a low detection rate and poor generalization results due to inefficient feature engineering and the limited availability of labeled electricity data. Moreover, another common issue that occurs in ETD is class imbalance [5,8–10] because in real world scenarios, theft samples are rarely available as compared to honest samples. Furthermore, the curse of dimensionality [8] is also a major problem faced by researchers. It degrades the model's accuracy as well as increases the computational time. The major contributions of this research are as follows.
• In this study, a 2D-CNN and Bi-LSTM hybrid model is used to solve the curse of dimensionality issue. The 2D-CNN model captures only the latent trends,
hidden features and periodicity from the high dimensional feature space. Meanwhile, the Bi-LSTM learns the long-term temporal correlation from the electricity consumption data for efficient ETD.
• The Bi-WGAN is used to solve the class imbalance problem. The minority class samples are synthesized through the Bi-WGAN. The Bi-WGAN generates the most plausible theft samples by using the strong capabilities of the additional encoder module. The encoder module performs an inverse mapping of the real input to the latent space in order to strengthen the generator's capabilities.
The rest of the paper is organized as follows: Sect. 2 presents the related work. Sect. 3 describes the detailed information of the proposed system model, whereas Sect. 4 presents the results and discussion about the proposed and existing models. The conclusion is presented in Sect. 5.
2 Related Work
In the literature, many researchers use different machine learning and statistical models for ETD; however, these models demand manual feature engineering and relevant domain knowledge. The existing models are applied to one-dimensional (1D) electricity consumption data, and capturing latent features from the 1D data is a challenging task [5]. Similarly, in [11], the authors discuss that many existing machine learning models do not focus on proper feature engineering, which leads the models towards poor generalization results. In addition, the available electricity consumption dataset is high dimensional. So, extracting the most abstract feature representations from the high dimensional data is a very difficult and challenging task, as improper feature engineering also leads to a high false positive rate (FPR), which degrades the system's performance. In the literature, many traditional schemes focus on handcrafted feature engineering for NTL detection [8], whereas no mathematical mechanisms are found in the existing literature for identifying the shunt and double-tapping attacks. Moreover, for detecting any new type of NTL behavior, the traditional schemes demand the re-involvement of domain experts for creating new relevant features, which is a tedious and time-consuming task. In [7,12–14], the authors address that in the existing methods, no appropriate feature engineering mechanisms are presented. The manual feature engineering process requires extra time and domain knowledge. In [12], an autoencoder is used to extract the abstract features from high dimensional electricity consumption data. However, it still needs improvement to detect some intelligent attacks, such as the zero-day attack, with high precision. The authors of [15,16] study that, in the literature, the features relevant to electricity consumption are mostly designed manually by using the domain experts' knowledge. However, these features are still not suitable for detecting NTL because of the arbitrarily changing patterns of the electricity consumption profile. So, for industrial users, these manually generated features are not sufficient for efficient pattern recognition and NTL detection. In [17,18], the authors mention that several previous studies exploit different machine and deep learning models for efficient ETD and
feature construction. However, none of them maintain the temporal correlation of a customer's consumption pattern for a long period for efficient theft identification. Also, learning hidden patterns from 1D electricity consumption data is a difficult task. In [19,20], the conventional machine learning models have low detection ability and poor performance results because of several non-malicious factors. In [21], a semi-supervised solution is proposed for ETD. However, it still needs improvement in terms of improving the detection rate (DR) and lowering the FPR. In [7,11,22–24], the authors address that data imbalance is a vital issue in ETD. In a real world scenario, theft samples are rarely available as compared to honest samples. So, the machine learning classifiers show bias towards the majority class samples. In addition, the limited availability of theft samples degrades the DR of the classification models. In [6,25,26], NTL detection through machine learning techniques becomes a challenging task due to the insufficient availability of labeled training data. Similarly, in [15,18,27], a severe proportion of imbalanced data also affects the classification model's generalization ability and creates high chances of over-fitting. The authors of [20] discuss that different oversampling techniques are used to reproduce the minority class data samples for solving the data imbalance issue in the case of ETD. The existing oversampling techniques, such as SMOTE, adaptive synthetic sampling (ADASYN), generative adversarial network (GAN), etc., are exploited for synthesizing the theft class samples. However, these techniques do not consider the fluctuation and the probability distribution curve while generating theft samples, which fails to give a realistic assessment.
3 Proposed System Model
This section provides a detailed description of each component of the proposed system model, whereas Fig. 1 shows the complete workflow of the proposed methodology.
1. The electricity consumption data often contains noisy and missing values because of faulty meters, meters' hacking, maintenance or storage issues, etc. The erroneous and noisy data degrade the models' performance. To tackle these issues, we use data preprocessing techniques in this paper. The missing values are handled through the linear interpolation method. This method fills a missing value by taking the average of the next and previous day's electricity consumption. Similarly, the noise and outliers also need to be handled because they affect the model's performance. So, we use the three sigma rule of thumb (TSR) [28] to handle the outliers. Afterwards, the electricity consumption data should be normalized because deep neural networks are very sensitive to diverse data. So, we use Min-Max normalization to normalize the dataset. A minimal sketch of these preprocessing steps is given after Fig. 1.
Fig. 1. Proposed system model (workflow: the structured dataset is preprocessed; Bi-WGAN (S.1) augments the theft class samples using an encoder, a generator driven by latent noise, and a discriminator separating real theft data from fake samples; the 1D daily data is reshaped into 2D weekly data and fed to the hybrid module (S.2), where a 2D-CNN branch (convolution, pooling, flatten) and a Bi-LSTM branch are joined in a fully connected layer for the final theft/benign decision; limitations addressed: L1 class imbalance, L2 curse of dimensionality)
2. The Bi-WGAN [18] is an enhanced version of WGAN [29,30] in which an additional encoder module is integrated to strengthen the generator network. Therefore, in this study, to overcome the class imbalance issue, the Bi-WGAN is employed to generate the most plausible fake electricity theft patterns that closely mimic the real-world behavior of electricity thieves. To generate the most prominent theft samples, the encoder module works in the inverse direction of the generator network, creating the latent space from the real input data. Moreover, the Bi-WGAN utilizes the Wasserstein distance (WD), also called the earth mover's distance, as its loss function, which helps the model achieve stable learning and speedy convergence towards the global optimum. The WD moves a small portion of one probability distribution to the other so that the generated fake samples stay close to the real samples; during the adversarial training of the generator and discriminator, the WD should therefore be minimized.
3. In [31], the authors apply a 2D-CNN on temporal data for speech recognition, where it shows satisfactory performance. Moreover, in [5], a 2D-CNN is used to capture the hidden patterns and trends from the electricity load profile. Motivated by [5] and [31], in this methodology a 2D-CNN is exploited to extract the most prominent features from the high-dimensional feature space. The available electricity data is in 1D raw form, and capturing the hidden fluctuations and trends from it is very difficult because the consumption patterns have no association with each other. Therefore, in this work, we transform the 1D daily electricity data into 2D weekly electricity consumption data and pass it to the CNN model to capture the latent patterns and trends for better generalization. The 2D-CNN model applies 2D convolution layers, each with a specific receptive field over which different filters stride and generate feature maps. Afterwards, pooling layers are applied to the feature maps in order to reduce the dimensionality and the number of parameters. Max-pooling is chosen for the pooling operations; it picks the maximum value from the receptive field and discards the remaining values.
4. In the proposed methodology, a Bi-LSTM [32] is used to capture the temporally correlated features from the time series data for efficient ETD. The Bi-LSTM performs the forward and backward passes concurrently on each timestamp and maintains the context of previous as well as current knowledge for better prediction. By preserving the long-term history of customer patterns, it efficiently deals with non-malicious factors and reduces the FPR to a minimal level, thereby also saving the cost of unnecessary on-site inspections. In Bi-LSTM, different gates maintain the sequence of information: the input gate takes the data of the previous and current states and passes it through the sigmoid function to decide which state information is important; the forget gate decides which information should be kept or discarded; and the output gate decides which and how much information is passed to the next hidden state. Furthermore, a cell state stores the necessary information for a long time. The benefit of the Bi-LSTM is that it remembers the context of the previous knowledge in both directions, which increases the detection accuracy and reduces the FPR.
5. In the hybrid layer, we concatenate the output features of both the 2D-CNN and Bi-LSTM models into a single feature map and apply joint weights for hybrid training. Then, we apply the sigmoid activation function to the combined feature map for the final classification. A sketch of this hybrid architecture is also given after this list.
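The following is a minimal preprocessing sketch of item 1, assuming a pandas DataFrame of daily consumption values; the column layout and the use of global statistics are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Minimal sketch: interpolation, three-sigma rule, Min-Max scaling."""
    # 1) Fill missing values by linear interpolation, i.e. the average of
    #    the neighbouring (previous/next) readings for an isolated gap.
    df = df.interpolate(method="linear", axis=0, limit_direction="both")

    # 2) Three-sigma rule (TSR): clip readings farther than 3 standard
    #    deviations from the mean to suppress outliers.
    mu, sigma = df.values.mean(), df.values.std()
    df = df.clip(lower=mu - 3 * sigma, upper=mu + 3 * sigma)

    # 3) Min-Max normalization to [0, 1], since deep networks are
    #    sensitive to diverse value ranges.
    return (df - df.min().min()) / (df.max().max() - df.min().min())
```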
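And a compact, hedged sketch of the hybrid module from items 3 to 5, written with Keras/TensorFlow; the layer sizes and the 52-week input shape are illustrative assumptions rather than the paper's tuned configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_hybrid(weeks: int = 52) -> Model:
    # 2D weekly view for the CNN branch; 1D daily view for the Bi-LSTM branch.
    inp_2d = layers.Input(shape=(weeks, 7, 1))
    inp_1d = layers.Input(shape=(weeks * 7, 1))

    # 2D-CNN branch: convolution -> max-pooling -> flatten.
    x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inp_2d)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Flatten()(x)

    # Bi-LSTM branch: forward and backward passes over the daily sequence.
    y = layers.Bidirectional(layers.LSTM(64))(inp_1d)

    # Hybrid layer: concatenate both feature maps, then sigmoid classification.
    z = layers.Concatenate()([x, y])
    z = layers.Dense(64, activation="relu")(z)
    out = layers.Dense(1, activation="sigmoid")(z)

    model = Model([inp_2d, inp_1d], out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC()])
    return model
```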
4 Model Evaluation
This section contains the simulation results and a discussion of the proposed and benchmark models. The proposed model is evaluated and tested on the SGCC dataset, which is publicly available on the internet.
4.1 Performance Metrics
As ETD is a class imbalance problem, the selection of appropriate performance measures is necessary for a comprehensive evaluation of the proposed model. Therefore, in this study, the PR, AUC and MCC scores are considered as the performance metrics. The mathematical formulation of these metrics is given as follows:
Precision = TP / (TP + FP), (1)
Recall = TP / (TP + FN), (2)
MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN)), (3)
AUC = (Σ_{i ∈ positive class} Rank_i − P(1 + P)/2) / (P × N), (4)
where TP and TN denote the numbers of consumers accurately identified as abnormal and normal, respectively, FP represents those consumers who are wrongly classified as abnormal, and FN denotes those consumers who are misclassified as normal; in Eq. (4), Rank_i is the rank of the i-th positive sample and P and N are the numbers of positive and negative samples. The precision and recall scores describe how accurately theft is predicted. The AUC-ROC score measures the separability of the fair and unfair classes. The MCC score focuses equally on TP, TN, FP and FN for a fair analysis; it ranges between −1 and +1, and an MCC score close to +1 indicates that the model performs well at detecting energy thieves, and vice versa.
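These metrics can be computed directly; the following is a hedged sketch using scikit-learn with toy arrays (the variable names and values are illustrative, not from the paper's experiments).

```python
import numpy as np
from sklearn.metrics import (precision_score, recall_score,
                             matthews_corrcoef, roc_auc_score)

# Toy example: 1 = theft (abnormal), 0 = benign.
y_true  = np.array([1, 0, 1, 0, 0, 1])
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.3])  # sigmoid outputs
y_pred  = (y_score >= 0.5).astype(int)

print(precision_score(y_true, y_pred))    # Eq. (1)
print(recall_score(y_true, y_pred))       # Eq. (2)
print(matthews_corrcoef(y_true, y_pred))  # Eq. (3)
print(roc_auc_score(y_true, y_score))     # Eq. (4), rank-based AUC
```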
4.2 Simulation Results
Figure 2 depicts the loss curves of both the generator and discriminator models while training and testing on the real and fake samples. In Bi-WGAN, during each training iteration, half a batch of real samples and half a batch of fake samples are used to update the weights of the discriminator model. The blue curve shows the loss of the discriminator on real samples and the orange curve shows its loss on the fake samples produced by the generator. Both curves clearly show that, after a few iterations, the discriminator classifies the fake samples more efficiently than the real samples; they also show that the discriminator competes well with the generator during the adversarial training.
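As a hedged illustration of the adversarial objective just described, the following minimal WGAN-style sketch in TensorFlow computes the two losses; `critic` and `gen` are assumed networks, and the gradient-penalty and encoder terms of the full Bi-WGAN are omitted.

```python
import tensorflow as tf

def wasserstein_losses(critic, gen, real, z):
    """Critic and generator losses approximating the Wasserstein distance."""
    fake = gen(z)
    # The critic maximizes the score gap between real and fake samples,
    # so its loss is the negative of that gap.
    critic_loss = tf.reduce_mean(critic(fake)) - tf.reduce_mean(critic(real))
    # The generator minimizes the negative critic score on fake samples,
    # pushing the generated distribution towards the real one.
    gen_loss = -tf.reduce_mean(critic(fake))
    return critic_loss, gen_loss
```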
Fig. 2. Loss of generator and discriminator of Bi-WGAN during training
Table 1. Mapping table
Limitations identified | Solutions proposed | Validations
L.1 Curse of dimensionality and inefficient feature engineering degrade the model's accuracy and increase the computational time [24, 26] | S.1 A hybrid 2D-CNN and Bi-LSTM approach is used for extracting the most prominent features from the high dimensional time series data | V.1 The performance of the proposed model is validated through MCC, AUC-ROC and the precision-recall curve (PRC), as given in Figs. 3a, 3b and 4
L.2 Due to the class imbalance issue, the classifier is biased towards the majority class [8, 10] | S.2 Bi-WGAN generates the most plausible real-world synthetic attack samples by the addition of an encoder module along with the generator | V.2 The proposed Bi-WGAN synthesizes fake theft samples; the results are depicted in Fig. 2
Moreover, in Bi-WGAN, the discriminator model is updated more frequently than the generator model during the training phase for better generalization results, whereas the green curve demonstrates the loss of the generator model during training. The generator gradually reduces its loss on each iteration thanks to the additional encoder module, which performs the inverse mapping of real samples back to the latent dimension. Due to the updated Wasserstein loss function and the additional encoder module, the Bi-WGAN model achieves the best generalization results while generating the electricity theft samples. Table 1 shows the mapping of limitations to their proposed solutions and validations. Limitation L1 describes the curse of dimensionality and inefficient feature engineering issues: the authors of [6,24,26] did not consider any feature engineering mechanism for extracting the most relevant features from the high dimensional feature space, which decreases the model's detection accuracy and increases the computational overhead. So, in S1, we present a hybrid 2D-CNN and Bi-LSTM approach for extracting the most prominent features from the high-dimensional time series data; in V1, the results of S1 are validated through MCC, AUC-ROC and PRC, as shown in Figs. 3a, 3b and 4. L2 concerns the data imbalance issue. In ETD, the collection of balanced data is a challenging task because, in a real-world scenario, electricity theft samples are rarely available compared to normal users, so classification models become biased towards the majority class. In S2, a Bi-WGAN model is used to generate synthesized fake theft samples that are closely related to real-world theft cases. In V2, the Bi-WGAN performance is validated by measuring the classification results on the synthetic samples, as shown in Fig. 2, which also validates the convergence speed of the Bi-WGAN in terms of loss.
Fig. 3. MCC and ROC-AUC of the proposed 2D-CNN and Bi-LSTM
Figure 3a illustrates the MCC score, which focuses equally on TP, TN, FP and FN to quantify the correlation between them. Calculating the MCC score is necessary in ETD because the FN rate also matters to power utilities for recovering the maximum revenue. The proposed model achieves a 0.91 MCC score, which is good for ETD; it indicates that the proposed model efficiently tackles the FN rate and helps the power utilities save financial and on-site inspection expenses. Figure 3b shows the ROC-AUC scores of the proposed model and a benchmark LSTM-MLP model. The proposed and LSTM-MLP models achieve AUC-ROC scores of 0.98 and 0.96, respectively, which clearly means that the proposed model outperforms the existing benchmark while detecting energy thieves. The proposed model reduces the high FPR to a minimal level due to the strong memorization and learning capabilities of the Bi-LSTM model, while the 2D-CNN module of the hybrid model solves the curse of dimensionality issue using the powerful capabilities of the max-pooling layers.
Fig. 4. PRC of hybrid 2D-CNN and Bi-LSTM
Figure 4 illustrates the PRC scores of the proposed model and the benchmark LSTM-MLP model. Both precision and recall are valuable for power utilities, as they help utilities detect electricity thieves and recover the maximum revenue. The proposed and existing LSTM-MLP models obtain PRC scores of 0.96 and 0.94, respectively. The simulation results prove that the proposed model performs better than the LSTM-MLP model at detecting energy thieves.
5 Conclusion
This paper presents a novel hybrid deep learning model for efficient ETD. The problem of an imbalanced dataset is solved through Bi-WGAN, which efficiently learns the electricity theft patterns and then generates new theft samples that closely mimic real-world theft behavior. The Bi-WGAN performs well due to the addition of an external encoder module to the generator model for the inverse mapping of real inputs back to the latent space; this increases the convergence speed of the generator and helps it to generate the most plausible theft samples. Moreover, in Bi-WGAN, the Wasserstein distance is used as the loss function, which stabilizes the learning of the Bi-WGAN model. Furthermore, the curse of dimensionality issue is solved using the strong capabilities of the 2D-CNN and Bi-LSTM models: the 2D-CNN significantly reduces the data dimensions through its pooling layers, while the Bi-LSTM stores only the relevant information and discards redundant information and overlapping features. Finally, the simulation results show that the proposed model outperforms the existing scheme in terms of AUC-ROC, PRC and MCC scores, which are 3%, 2% and 4% greater, respectively.
References
1. McDaniel, P., McLaughlin, S.: Security and privacy challenges in the smart grid. IEEE Secur. Priv. 7(3), 75–77 (2009)
2. Chen, Q., Zheng, K., Kang, C., Huangfu, F.: Detection methods of abnormal electricity consumption behaviors: review and prospect. Autom. Electr. Power Syst. 42(17), 189–199 (2018)
3. Lo, C.-H., Ansari, N.: Consumer: a novel hybrid intrusion detection system for distribution networks in smart grid. IEEE Trans. Emerg. Top. Comput. 1(1), 33–44 (2013)
4. Amin, S., Schwartz, G.A., Tembine, H.: Incentives and security in electricity distribution networks. In: International Conference on Decision and Game Theory for Security, pp. 264–280. Springer (2012)
5. Zheng, Z., Yang, Y., Niu, X., Dai, H.-N., Zhou, Y.: Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids. IEEE Trans. Industr. Inf. 14(4), 1606–1615 (2017)
6. Buzau, M.M., Tejedor-Aguilera, J., Cruz-Romero, P., Gómez-Expósito, A.: Detection of non-technical losses using smart meter data and supervised learning. IEEE Trans. Smart Grid 10(3), 2661–2670 (2018)
7. Kong, X., Zhao, X., Liu, C., Li, Q., Dong, D., Li, Y.: Electricity theft detection in low-voltage stations based on similarity measure and DT-KSVM. Int. J. Electr. Power Energy Syst. 125, 106544 (2021)
8. Buzau, M.-M., Tejedor-Aguilera, J., Cruz-Romero, P., Gómez-Expósito, A.: Hybrid deep neural networks for detection of non-technical losses in electricity smart meters. IEEE Trans. Power Syst. 35(2), 1254–1263 (2019)
9. Razavi, R., Gharipour, A., Fleury, M., Akpan, I.J.: A practical feature-engineering framework for electricity theft detection in smart grids. Appl. Energy 238, 481–494 (2019)
10. Yao, D., Wen, M., Liang, X., Zipeng, F., Zhang, K., Yang, B.: Energy theft detection with energy privacy preservation in the smart grid. IEEE Internet Things J. 6(5), 7659–7669 (2019)
11. Punmiya, R., Choe, S.: Energy theft detection using gradient boosting theft detector with feature engineering-based preprocessing. IEEE Trans. Smart Grid 10(2), 2326–2329 (2019)
12. Huang, Y., Qifeng, X.: Electricity theft detection based on stacked sparse denoising autoencoder. Int. J. Electr. Power Energy Syst. 125, 106448 (2021)
13. Arif, A., Javaid, N., Aldegheishem, A., Alrajeh, N.: Big data analytics for identifying electricity theft using machine learning approaches in micro grids for smart communities
14. Aldegheishem, A., Anwar, M., Javaid, N., Alrajeh, N., Shafiq, M., Ahmed, H.: Towards sustainable energy efficiency with intelligent electricity theft detection in smart grids emphasising enhanced neural networks. IEEE Access 9, 25036–25061 (2021)
15. Xiaoquan, L., Zhou, Yu., Wang, Z., Yi, Y., Feng, L., Wang, F.: Knowledge embedded semi-supervised deep learning for detecting non-technical losses in the smart grid. Energies 12(18), 3452 (2019)
16. Ramos, C.C.O., Rodrigues, D., de Souza, A.N., Papa, J.P.: On the study of commercial losses in Brazil: a binary black hole algorithm for theft characterization. IEEE Trans. Smart Grid 9(2), 676–683 (2016)
17. Kocaman, B., Tümen, V.: Detection of electricity theft using data processing and LSTM method in distribution systems. Sādhanā 45(1), 1–10 (2020)
18. Hu, T., Guo, Q., Sun, H., Huang, T.-E., Lan, J.: Nontechnical losses detection through coordinated BiWGAN and SVDD. IEEE Trans. Neural Netw. Learn. Syst. 32, 1866–1880 (2020)
19. Saeed, M.S., Mustafa, M.W., Sheikh, U.U., Jumani, T.A., Mirjat, N.H.: Ensemble bagged tree based classification for reducing non-technical losses in Multan electric power company of Pakistan. Electronics 8(8), 860 (2019)
20. Gong, X., Tang, B., Zhu, R., Liao, W., Song, L.: Data augmentation for electricity theft detection using conditional variational auto-encoder. Energies 13(17), 4291 (2020)
21. Aslam, Z., Ahmed, F., Almogren, A., Shafiq, M., Zuair, M., Javaid, N.: An attention guided semi-supervised learning mechanism to detect electricity frauds in the distribution systems. IEEE Access 8, 221767–221782 (2020)
22. Li, S., Han, Y., Xu, Y., Yingchen, S., Wang, J., Zhao, Q.: Electricity theft detection in power grids with deep learning and random forests. J. Electr. Comput. Eng. 2019 (2019)
23. Avila, N.F., Figueroa, G., Chu, C.-C.: NTL detection in electric distribution systems using the maximal overlap discrete wavelet-packet transform and random undersampling boosting. IEEE Trans. Power Syst. 33(6), 7171–7180 (2018)
24. Jokar, P., Arianpoo, N., Leung, V.C.M.: Electricity theft detection in AMI using customers consumption patterns. IEEE Trans. Smart Grid 7(1), 216–226 (2015)
25. Zheng, K., Chen, Q., Wang, Y., Kang, C., Xia, Q.: A novel combined data-driven approach for electricity theft detection. IEEE Trans. Industr. Inf. 15(3), 1809–1819 (2018)
26. Gunturi, S.K., Sarkar, D.: Ensemble machine learning models for the detection of energy theft. Electr. Power Syst. Res. 192, 106904 (2021)
27. Hasan, Md., Toma, R.N., Nahid, A.-A., Islam, M.M., Kim, J.-M., et al.: Electricity theft detection in smart grid systems: a CNN-LSTM based approach. Energies 12(17), 3310 (2019)
28. Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. (CSUR) 41(3), 1–58 (2009)
29. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. In: International Conference on Machine Learning, pp. 214–223. PMLR (2017)
30. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.: Improved training of Wasserstein GANs. arXiv preprint arXiv:1704.00028 (2017)
31. Zhao, J., Mao, X., Chen, L.: Speech emotion recognition using deep 1D & 2D CNN LSTM networks. Biomed. Signal Process. Control 47, 312–323 (2019)
32. Cui, Z., Ke, R., Ziyuan, P., Wang, Y.: Stacked bidirectional and unidirectional LSTM recurrent neural network for forecasting network-wide traffic state with missing values. Transp. Res. Part C Emerg. Technol. 118, 102674 (2020)
Comparative Study of Data Driven Approaches Towards Efficient Electricity Theft Detection in Micro Grids
Faisal Shehzad1, Muhammad Asif1, Zeeshan Aslam2, Shahzaib Anwar1, Hamza Rashid3, Muhammad Ilyas1, and Nadeem Javaid1(B)
1 COMSATS University Islamabad, Islamabad, Pakistan
2 Bahria University Islamabad, Islamabad, Pakistan
3 University of Management and Technology, Lahore, Pakistan
Abstract. In this research article, we tackle the following limitations: high misclassification rate, low detection rate, the class imbalance problem and the unavailability of malicious (theft) samples. The class imbalance problem is a severe issue in electricity theft detection that affects the performance of supervised learning methods. We exploit the adaptive synthetic minority oversampling technique to tackle this problem. Moreover, theft samples are created from benign samples, and we argue that the goal of theft is to report less than the actual electricity consumption. Different machine learning and deep learning methods, including the recently developed light and extreme gradient boosting (XGBoost) classifiers, are trained and evaluated on a realistic electricity consumption dataset that is provided by an electric utility in Pakistan. The consumers in the dataset belong to different demographics and different social and financial backgrounds. Several classifiers are trained on the acquired data; however, long short-term memory (LSTM) and XGBoost attain high performance and outperform all other classifiers. XGBoost achieves a 0.981 detection rate and a 0.015 misclassification rate, whereas LSTM attains a 0.976 detection rate and a 0.033 misclassification rate. Moreover, the performance of all implemented classifiers is evaluated through precision, recall, F1-score, etc.
1 Background Study
The objective of electricity theft detection (ETD) is to detect electricity losses in a smart grid. There are two types of losses: technical losses and non-technical losses (NTL). The former occur due to energy dissipation in transmission lines, transformers and other types of electric equipment; the latter happen due to illegal activities such as meter tampering, meter bypassing and using shunt devices [1]. In the literature, different approaches are designed to detect NTL in the smart grid: game theory, hardware-based and data-driven approaches. In the following articles, the authors use multiple data-driven approaches to analyse users' consumption and extract abnormal patterns. Jokar et al. [2] suggest a consumption pattern-based electricity theft detector to detect abnormal patterns. The Irish electricity dataset, which contains information about only benign1 consumers, is used to train the classifier. The authors propose six theft cases to generate malicious samples
Benign and normal samples are used interchangeably.
and argue that abnormal electricity consumption is always less than normal consumption. However, they use an oversampling technique to handle the class imbalance ratio, which generates duplicated copies of minority class samples and affects the learning of the classifier. Punmiya et al. [3] present a comparison between the gradient boosting classifiers (extreme gradient boosting (XGBoost), light gradient boosting (LightGBM) and categorical boosting (CatBoost)) and the support vector machine (SVM). Irish smart meter data containing 5% to 8% theft samples is used, and new theft samples are generated through the six existing theft cases described in [2] to balance the dataset. In [4–6], the authors use supervised machine learning techniques to detect unusual consumption behaviour from electricity consumption datasets. In [7], Razavi et al. develop a model-agnostic framework for feature extraction through a finite mixture model and a genetic evolutionary algorithm. Kong et al. [8] propose a combined supervised and unsupervised method to detect illegal consumers: a dynamic time warping similarity measure is used for preliminary detection, and the data is then passed to a hybrid SVM and K-nearest neighbour (KNN) based model to differentiate between normal and abnormal patterns. To tackle the class imbalance problem, synthesized electricity theft patterns are generated through a Wasserstein generative adversarial network; finally, features are selected and extracted through clustering evaluation criteria and an autoencoder, respectively. Buzau et al. [9] use XGBoost, electricity consumption data and smart meter information to detect NTL in Spain; K-means clustering and Euclidean and Manhattan distances are utilized to generate new features. Buzau et al. [10] propose a hybrid model combining long short term memory (LSTM) and a multilayer perceptron (MLP): the former is validated and tested on sequential data, the latter on non-sequential data. The dataset used in that article is highly imbalanced, which biases the model towards the majority class and generates false results. Zheng et al. [11] propose a wide and deep convolutional neural network to detect electricity theft: sequential (weekly) data is passed to a convolutional neural network (CNN) to capture local patterns, while 1D data is fed into an MLP model (the wide component) to retrieve global patterns from the electricity consumption data. However, they do not handle the class imbalance ratio, which biases the classifier towards the majority class and increases the misclassification rate. In [12], Huang et al. develop a stacked sparse denoising autoencoder (SSDAE) to capture abnormal patterns from electricity consumption data. SSDAE extracts optimal features, compares them with the original ones and reduces the reconstruction loss; the authors introduce sparsity and noise to enhance the feature extraction ability of the autoencoder, and its hyperparameters are tuned using the particle swarm optimization technique. Fenza et al. [13] introduce an approach based on concept drift and a time-aware system; their framework combines LSTM and the K-means clustering algorithm, where K-means is used to find similar consumption behavior. Bhat et al. [14] use LSTM, CNN and a stacked autoencoder to detect abnormal consumption patterns in distribution systems; the authors use a synthesized dataset that contains 7% abnormal samples, and the CNN classifier outperforms the LSTM and the stacked autoencoder. Hasan et al.
[15] propose a CNN-LSTM based hybrid model: CNN is employed to extract the optimal features and LSTM is used for the classification task. The dataset has an imbalanced ratio, which is solved through the synthetic minority
oversampling technique (SMOTE). In [16], despite the extensive usage of machine learning (ML) techniques, the authors do not focus on the selection of optimal features. In [1, 17], the authors discuss the possibilities of implementing ML classifiers for the detection of NTL and describe the advantages of selecting optimal features and their impact on classifier performance. One of the main challenges that limits the classification ability of existing methods is the high dimensionality of the data [18]. Smart meters greatly improve data collection procedures and provide high dimensional data for capturing complex patterns. However, research shows that most existing classification methods are based on conventional ML classifiers like ANN, SVM and decision trees, which have limited generalization ability and are unable to learn complex patterns in high dimensional data. The list of contributions is given below.
• The legitimate data of any consumer is collected through the consumption history. However, it is very difficult to attain malicious samples because theft cases rarely happen and may not be present in the user's consumption history. So, we apply six theft cases to generate malicious samples and argue that the motive of theft is to report less consumption than the actual usage.
• After generating malicious samples, the adaptive synthetic minority oversampling (ADASYN) technique is exploited to tackle the class imbalance problem, which otherwise creates a biased model and leads to a high FPR.
• We conduct an empirical study to compare the performance of different machine learning and deep learning models: XGBoost, CatBoost, SVM, KNN, MLP, LSTM and RF.
• We train and test all classifiers on a realistic electricity dataset provided by PRECON. Moreover, different measures such as the confusion matrix, precision, recall and F1-score are used to evaluate their performance.
2 Acquiring Dataset and Handling the Class Imbalance Problem
PRECON2 is the first dataset of its kind that covers users of a developing country. The data collection aims to understand the electricity consumption behavior of users belonging to different demographics and different social and financial backgrounds. Data is collected over a period of one year and contains the electricity consumption of thirty-two houses. The people who participated in this research installed smart meters in their houses and agreed to take part, so it is a valid assumption that all participants are honest consumers. The large variety of consumers, the long measurement period and the online availability make this dataset an excellent source for research on smart grids. It contains thirty-two files with electricity measurements for every second. We reduce the data granularity by taking one sample every half-hour, because high granularity requires high computation power and affects consumer privacy. However, the dataset contains only honest consumers. Figure 1a shows the electricity consumption of a benign consumer.
PRECON: Pakistan Residential Electricity Consumption Dataset.
Fig. 1. Electricity consumption profiles: (a) daily consumption of a normal consumer; (b) consumption behaviour of a theft consumer
For the analysis of the electricity consumption dataset, we require theft samples, but theft may be completely absent from users' consumption histories. We solve the lack of theft samples by generating malicious samples from benign ones, because the goal of electricity theft is to report less consumption or to shift load from on-peak hours to off-peak hours. In [2], Jokar et al. introduced six cases to generate malicious samples from benign ones. These theft cases are described below, where xt is the real consumption of a normal consumer (t ∈ [1, 48]):
t1(xt) = α xt, α = random(0.1, 0.8), (1)
t2(xt) = β xt, β = random[0.1, 0.8], (2)
t3(xt) = γt xt, γt = random(0.1, 0.8), (3)
t4(xt) = β mean(x), β = random(0.1, 0.8), (4)
t5(xt) = mean(x), (5)
t6(xt) = x24−t. (6)
t1(·) multiplies the meter reading by a random number between 0.1 and 0.8 and sends it to the electric utility. t2(·) sends or does not send measurements at different intervals of time. t3(·) multiplies each meter reading by a different random number and reports lower consumption. t4(·) and t5(·) report a random factor of the mean or the exact mean value of the measurements, respectively. t6(·) reverses the order of consumption and shifts load towards peak hours. t5(·) and t6(·) launch attacks against the load control mechanism by reporting high consumption in off-peak hours and low consumption in on-peak hours. We apply all these cases to the benign samples and generate the malicious samples, as in the sketch below. Figure 1b shows a normal consumer's consumption after applying the six theft cases. We have 321 normal consumption records, and each theft case generates 321 new theft patterns, so the total number of theft patterns is 1926, which is more than the number of normal records. This situation creates a class imbalance ratio, a critical problem in ETD where one class (honest consumers) dominates the other (theft consumers): the data is not normally distributed and is skewed towards the majority class.
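The following is a hedged sketch, not the authors' exact implementation, of applying the six theft cases to a benign daily profile of 48 half-hourly readings.

```python
import numpy as np

rng = np.random.default_rng(0)

def theft_cases(x: np.ndarray) -> list:
    """Apply the six theft cases (Eqs. 1-6) to one benign profile x of length 48."""
    t1 = rng.uniform(0.1, 0.8) * x                   # one random factor per profile
    t2 = rng.uniform(0.1, 0.8) * x                   # random factor; models partial reporting
    t3 = rng.uniform(0.1, 0.8, size=x.shape) * x     # different factor per reading
    t4 = rng.uniform(0.1, 0.8) * x.mean() * np.ones_like(x)  # random factor of the mean
    t5 = x.mean() * np.ones_like(x)                  # constant mean report
    t6 = x[::-1]                                     # reversed order (load shifting)
    return [t1, t2, t3, t4, t5, t6]

# Example: six malicious profiles from one benign profile.
benign = rng.uniform(0.2, 2.0, size=48)
malicious = theft_cases(benign)
```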
A machine learning model applied to an imbalanced dataset would be biased towards the majority class and would not learn the important features of the minority class, which increases the FPR. Traditionally, two sampling techniques, undersampling and oversampling, are used to balance a dataset. However, these approaches are not adopted here due to computational overhead, information loss and duplication of existing data. In this manuscript, we opt for the Adaptive Synthetic Sampling Approach (ADASYN) to address the class imbalance ratio [19, 20]. ADASYN is based on an adaptive learning approach that focuses on the minority class samples which are difficult to learn rather than those that are easy to learn. It not only reduces the learning bias due to the imbalanced data distribution but also shifts the class boundary towards the minority samples that are difficult to learn. The working process of ADASYN is as follows. The training data Dtr has m samples {xi, yi}, i = 1, ..., m, where xi belongs to the n-dimensional space X and yi ∈ Y = {0, 1} is the class label of xi; ms and ml denote the numbers of minority and majority class samples.
• Calculate the degree of class imbalance,
d = ms / ml, (7)
where d ∈ (0, 1]. If d < dth (dth is a preset threshold for the maximum tolerated imbalance), continue.
• Calculate the number of synthetic minority samples that need to be generated,
G = (ml − ms) × β, (8)
where β ∈ [0, 1] is a parameter that decides the balance level between the minority and majority class samples; if β = 1, the data imbalance problem is fully resolved.
• For each minority sample xi, find its K nearest neighbours and compute
ri = δi / K, i = 1, ..., ms, (9)
where δi is the number of these neighbours that belong to the majority class.
• Normalize ri,
r̂i = ri / Σ_{i=1..ms} ri, (10)
so that r̂i is a density distribution with Σ_{i=1..ms} r̂i = 1.
• Calculate the number of synthetic samples to generate for each minority sample xi,
gi = r̂i × G, (11)
where G is the total number of minority samples that need to be generated.
• For each xi, generate gi samples: repeat gi times, each time randomly selecting one sample xzi from the K nearest neighbours of xi and computing
si = xi + (xzi − xi) × λ, (12)
where λ ∈ [0, 1] and si is a new minority class sample appended to the dataset.
The idea of ADASYN is based on a density distribution function in which r̂i automatically gives higher weights to the minority class samples that are difficult to learn. This is the major difference from SMOTE, which is an oversampling technique that gives equal weight to each minority class sample.
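In practice this procedure is available off the shelf; below is a hedged sketch using the imbalanced-learn library, with illustrative toy data rather than the PRECON features.

```python
import numpy as np
from imblearn.over_sampling import ADASYN

# X: (n_samples, 48) half-hourly profiles; y: 1 = theft, 0 = benign.
X = np.random.rand(200, 48)
y = np.array([0] * 150 + [1] * 50)  # imbalanced toy labels

# n_neighbors corresponds to K in Eq. (9); the minority class is oversampled.
X_res, y_res = ADASYN(n_neighbors=5, random_state=0).fit_resample(X, y)
print(np.bincount(y_res))  # classes are now (approximately) balanced
```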
3 State-of-the-Art NTL Detection Techniques
Different techniques are used for NTL detection: data-driven, game theory and hardware approaches. Figure 2 shows the flowchart of these approaches. We focus on supervised learning approaches, which have applications in different fields such as bioinformatics, object detection, text classification, and spam and anomaly detection. In this manuscript, we use supervised learning approaches to detect anomalies in electricity consumption patterns and compare the results to evaluate their performance. A description of these approaches is given below.
Fig. 2. Different techniques for electricity theft detection (hardware-based, game theory and data-driven techniques; the data-driven branch splits into supervised learning: MLP, SVM, decision tree, gradient boosting classifiers, LR, RF, CNN, LSTM; semi-supervised learning: anomaly detection, heuristic approaches, generative models, graph-based methods; and unsupervised learning: association rule mining, clustering methods, regression models)
Gradient Boosting Classifiers: Boosting is an ensemble technique where several weak learners are combined to form a strong learner. There are many boosting techniques, such as random forest, adaptive boosting and gradient boosting, each with a different mechanism to reduce the loss function; gradient boosting algorithms use gradient descent for this. XGBoost is based on sequential learning, where the weak learners are trained with a parallel implementation that increases the algorithm's performance. It is designed to utilize maximum hardware resources, using the cache and hard drive efficiently to handle both small and large datasets. Besides, it uses a weighted quantile sketch algorithm to find optimal splitting criteria. CatBoost is the latest member of the gradient boosting family, developed by the machine learning team at Yandex. It has the ability to handle categorical features and uses an ordered boosting strategy to avoid information leakage. CatBoost uses oblivious decision trees, which have the same splitting criterion at each level and are less prone to overfitting.
Support Vector Machine: SVM is a well-known classifier in ETD and an extension of the maximal margin classifier. In SVM, training data is fed into the classifier and results are predicted on testing data. It uses kernel functions to transform the data into higher dimensions, where a hyperplane can be drawn to separate the classes: polynomial and radial basis kernels are used to handle non-linear data, whereas a linear kernel is used to draw a decision boundary between classes of linear data. In [2], the authors use SVM to capture abnormal patterns from electricity consumption data.
Random Forest: Random forest (RF) is an ensemble learning technique where multiple decision trees are integrated to give a final decision through the wisdom of the majority. It is widely used for classification and regression tasks and reduces the overfitting problem through its multiple-decision-tree mechanism. However, its main limitation is that it contains a large number of decision trees, which makes it less efficient for real-world problems.
k-Nearest Neighbours Algorithm: KNN is a simple and easy-to-implement supervised machine learning algorithm. It is used for both classification and regression tasks, though mostly for classification in industry. KNN uses a similarity measure to handle classification and regression problems. It is a lazy, non-parametric learning algorithm whose computation time, memory and accuracy depend upon the nature of the data. In [8], the authors use KNN with SVM to reduce the misclassification of data points near the decision boundary.
Multilayer Perceptron: The artificial neural network, or multilayer perceptron, is the first biologically inspired machine learning algorithm. It contains input, hidden and output layers. It is successfully used for NTL detection in [1] and [2], where it extracts latent information from a consumer's consumption to differentiate between normal and theft samples.
Long Short Term Memory: LSTM networks are variants of recurrent neural networks (RNN). A smart meter provides the historical data of a customer, which can span years and grows daily. RNNs are unable to detect abnormal patterns in long sequences due to the vanishing and exploding gradient problems. LSTM is an enhanced version of the RNN that solves these problems. We use LSTM to capture
longer patterns from the user's historical data. The LSTM structure is similar to the RNN but differs in its internal components. The important part of LSTM is its cell state, which acts as a transport highway that carries relevant patterns down the sequence chain. It has three gates to regulate the flow of information throughout the network: the forget gate decides which information to keep or discard from the cell state; the input gate uses the sigmoid and tanh functions to update the cell state; and the output gate produces the final output and decides the hidden state. The equations of the LSTM network are given below [15]:
ft = σ(wf (ht−1, xt) + bf), (13)
it = σ(wi (ht−1, xt) + bi), C̃t = tanh(wc (ht−1, xt) + bc), (14)
Ct = ft ⊙ Ct−1 + it ⊙ C̃t, (15)
ot = σ(wo (ht−1, xt) + bo), (16)
ht = ot ⊙ tanh(Ct), (17)
where ft, it and ot represent the forget, input and output gates, respectively; xt, wf, wi and wo denote the current input and the forget, input and output gate weights; Ct and ht represent the current cell state and hidden state; Ct−1 and ht−1 denote the previous cell state and hidden state; and bf, bi and bo represent the forget, input and output gate biases. ⊙ denotes pointwise multiplication, σ represents the sigmoid function, and tanh squashes values between −1 and +1.
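A minimal Keras sketch of such an LSTM classifier for the 48-reading profiles follows; the layer sizes are illustrative assumptions, not the paper's tuned configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Each sample is a sequence of 48 half-hourly readings with one feature.
model = models.Sequential([
    layers.Input(shape=(48, 1)),
    layers.LSTM(64),                        # gates of Eqs. (13)-(17) inside
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # theft probability
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
```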
4 Experimental Results
In this section, we evaluate the performance of all classifiers through extensive experiments. All of them are implemented on Google Colab, a free cloud-based platform commonly used for machine learning analysis that takes advantage of distributed computing. SVM-linear, KNN and RF are implemented through the Scikit-learn library, while LSTM and MLP are trained and evaluated through the TensorFlow library; XGBoost and CatBoost are also open-source libraries that are available on GitHub. One contribution is that we compare the performance of different classifiers and give a detailed analysis utilizing different performance measures: precision, recall, F1-score and FPR. Another contribution is that we compare the efficiency of the recently developed state-of-the-art classifiers XGBoost and CatBoost with conventional machine learning and deep learning models. Table 1 shows the confusion matrix of all implemented classifiers; precision, recall, F1-score and FPR are presented in Table 2. It is interesting to consider the F1-score as a performance measure: XGBoost has a higher F1-measure than CatBoost and RF. Notably, RF also belongs to the group of ensemble learners, yet it has the lowest performance. KNN and SVM-linear obtain F1-measure values of 0.935 and 0.905, respectively; KNN's good results mean that the normal and theft classes are easily separable. LSTM and MLP belong to the deep learning class; LSTM attains a 0.976 F1-measure, more than MLP, because it has memory cells to remember the consumption patterns of theft and normal consumers.
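For orientation, here is a hedged sketch of how such a gradient boosting classifier is typically fitted; the hyperparameters and toy data are illustrative assumptions, not the paper's tuned values.

```python
import numpy as np
from xgboost import XGBClassifier

X = np.random.rand(400, 48)           # toy stand-in for the PRECON features
y = np.random.randint(0, 2, 400)      # 1 = theft, 0 = benign

clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1,
                    eval_metric="logloss")
clf.fit(X, y)
print(clf.predict_proba(X[:5])[:, 1])  # theft probabilities
```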
Table 1. Confusion matrix of all classifiers
Type of classifiers | True positive | True negative | False positive | False negative
XGBoost    | 58 | 379 | 6  | 7
CatBoost   | 54 | 367 | 10 | 19
RF         | 57 | 289 | 7  | 97
SVM-linear | 50 | 331 | 14 | 55
KNN        | 63 | 340 | 1  | 40
MLP        | 54 | 338 | 10 | 48
LSTM       | 51 | 377 | 13 | 9
Table 2. Precision, recall, F1-score and FPR of all classifiers
Type of classifiers | Precision | Recall | F1-measure | FPR
XGBoost    | 0.983 | 0.984 | 0.981 | 0.015
CatBoost   | 0.973 | 0.950 | 0.961 | 0.026
RF         | 0.976 | 0.748 | 0.847 | 0.023
SVM-linear | 0.959 | 0.857 | 0.905 | 0.04
KNN        | 0.880 | 0.997 | 0.935 | 0.002
MLP        | 0.971 | 0.875 | 0.920 | 0.028
LSTM       | 0.966 | 0.976 | 0.971 | 0.033
XGBoost gives the highest performance among all classifiers because it performs sequential learning and reduces the misclassification rate by utilizing a gradient descent algorithm. Taking precision as an efficiency metric, the ensemble classifiers achieve high precision compared to the conventional and deep learning models: XGBoost attains a 0.983 precision value, higher than all other implemented classifiers, while MLP attains 0.971, which is better than the LSTM precision value. Next, we take recall as the performance measure; it is the ratio of relevant results that are returned by a classifier. KNN obtains the highest recall, while RF achieves the lowest value. FPR, also known as the misclassification rate, is very important for electric utilities because they have limited resources for on-site inspection. KNN gives the lowest FPR value of all implemented classifiers; however, one drawback of KNN is that it belongs to the group of lazy learning classifiers, which give good results on small datasets while their performance decreases drastically on larger datasets. The receiver operating characteristic curve (ROC curve) is a tool that is commonly exploited to assess the performance of machine learning models; it plots the true positive rate against the false positive rate at different threshold values between 0 and 1. Figure 3 shows the ROC curves of all models. XGBoost outperforms all classifiers and achieves the highest ROC curve, while RF gives the lowest performance. The remaining classifiers, CatBoost, SVM-linear, MLP, LSTM and KNN, attain 0.971, 0.884, 0.914, and 0.965 AUC values of the ROC curve, respectively. The precision recall curve (PR curve) is another
measure used to assess classifier performance on imbalanced datasets; it plots precision against recall at different thresholds in [0, 1]. The AUC of the PR curve gives a summary of the skill of the classifiers.
Fig. 3. ROC curve of classifiers
The PR curves of the implemented classifiers are presented in Fig. 4 to check their performance on the imbalanced dataset. XGBoost outperforms all classifiers and achieves a 0.997 AUC value, while RF achieves the lowest AUC value. CatBoost, SVM, MLP, LSTM and KNN get 0.995, 0.975, 0.995 and 0.994 AUC, respectively. In this research article, we manually analyze the performance of conventional machine learning and deep learning classifiers for NTL detection. XGBoost outperforms the other classifiers and achieves the highest results: it has a built-in feature extraction module that removes noise, extracts optimal features and improves performance. Moreover, it performs sequential learning, where weak learners are trained in a sequenced manner and combined in the end to make a strong learner. Due to all of these reasons, XGBoost performs better than all other tested classifiers; a sketch for computing these curve summaries is shown below.
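The ROC-AUC and PR-AUC summaries discussed here can be reproduced with scikit-learn; the classifier and dataset below are illustrative stand-ins, not the paper's models or data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, average_precision_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, weights=[0.85], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

print("ROC-AUC:", roc_auc_score(y_te, scores))
# average_precision_score summarizes the PR curve (PR-AUC).
print("PR-AUC :", average_precision_score(y_te, scores))
```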
Fig. 4. PR curve of classifiers
5 Conclusion
In this manuscript, we exploit different supervised learning methods (XGBoost, CatBoost, SVM, RF, KNN, MLP and LSTM) to detect anomalies in users' consumption histories. For a consumer in a smart grid, we can easily obtain benign samples from the consumption history; however, theft cases may not be present in it. We use six cases to generate malicious samples and argue that the purpose of theft is to report less consumption than the actual electricity consumption. ADASYN is utilized to balance the ratio between benign and theft samples. A realistic electricity consumption dataset, provided by an electric utility of Pakistan, is utilized to train and evaluate all implemented classifiers. XGBoost outperforms the others and achieves 0.986 ROC-AUC and 0.997 PR-AUC; it has a built-in feature extraction module that reduces noise, selects the optimal features and increases its performance. Moreover, precision, recall, F1-score and FPR are utilized to evaluate the performance of the classifiers.
References
1. Avila, N.F., Figueroa, G., Chu, C.-C.: NTL detection in electric distribution systems using the maximal overlap discrete wavelet-packet transform and random undersampling boosting. IEEE Trans. Power Syst. 33(6), 7171–7180 (2018)
2. Jokar, P., Arianpoo, N., Leung, V.C.M.: Electricity theft detection in AMI using customers' consumption patterns. IEEE Trans. Smart Grid 7(1), 216–226 (2015)
3. Punmiya, R., Choe, S.: Energy theft detection using gradient boosting theft detector with feature engineering-based preprocessing. IEEE Trans. Smart Grid 10(2), 2326–2329 (2019)
4. Khan, Z.A., Adil, M., Javaid, N., Saqib, M.N., Shafiq, M., Choi, J.-G.: Electricity theft detection using supervised learning techniques on smart meter data. Sustainability 12(19), 8023 (2020)
5. Arif, A., Javaid, N., Aldegheishem, A., Alrajeh, N.: Big data analytics for identifying electricity theft using machine learning approaches in micro grids for smart communities. Concurrency Comput. Pract. Experience, 1532–0634 (2021)
6. Ghori, K.M., Abbasi, R.A., Awais, M., Imran, M., Ullah, A., Szathmary, L.: Performance analysis of different types of machine learning classifiers for non-technical loss detection. IEEE Access 8, 16033–16048 (2019)
7. Razavi, R., Gharipour, A., Fleury, M., Akpan, I.J.: A practical feature-engineering framework for electricity theft detection in smart grids. Appl. Energy 238, 481–494 (2019)
8. Kong, X., Zhao, X., Liu, C., Li, Q., Dong, D.L., Li, Y.: Electricity theft detection in low-voltage stations based on similarity measure and DT-KSVM. Int. J. Electr. Power Energy Syst. 125, 106544 (2021)
9. Buzau, M.M., Tejedor-Aguilera, J., Cruz-Romero, P., Gómez-Expósito, A.: Detection of non-technical losses using smart meter data and supervised learning. IEEE Trans. Smart Grid 10(3), 2661–2670 (2018)
10. Buzau, M.-M., Tejedor-Aguilera, J., Cruz-Romero, P., Gómez-Expósito, A.: Hybrid deep neural networks for detection of non-technical losses in electricity smart meters. IEEE Trans. Power Syst. 35(2), 1254–1263 (2019)
11. Zheng, Z., Yang, Y., Niu, X., Dai, H.-N., Zhou, Y.: Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids. IEEE Trans. Ind. Inform. 14(4), 1606–1615 (2017)
12. Huang, Y., Qifeng, X.: Electricity theft detection based on stacked sparse denoising autoencoder. Int. J. Electr. Power Energy Syst. 125, 106448 (2021)
13. Fenza, G., Gallo, M., Loia, V.: Drift-aware methodology for anomaly detection in smart grid. IEEE Access 7, 9645–9657 (2019)
14. Bhat, R.R., Trevizan, R.D., Sengupta, R., Li, X., Bretas, A.: Identifying nontechnical power loss via spatial and temporal deep learning. In: 2016 15th IEEE International Conference on Machine Learning and Applications, pp. 272–279 (2016)
15. Hasan, M., Toma, R.N., Nahid, A.-A., Islam, M.M., Kim, J.-M.: Electricity theft detection in smart grid systems: a CNN-LSTM based approach. Energies 12(17), 3310 (2019)
16. Ramos, C.C.O., Rodrigues, D., de Souza, A.N., Papa, J.P.: On the study of commercial losses in Brazil: a binary black hole algorithm for theft characterization. IEEE Trans. Smart Grid 9(2), 676–683 (2016)
17. Coma-Puig, B., Carmona, J.: Bridging the gap between energy consumption and distribution through non-technical loss detection. Energies 12(9), 1748 (2019)
18. Hu, T., Guo, Q., Sun, H., Huang, T.-E., Lan, J.: Nontechnical losses detection through coordinated BiWGAN and SVDD. IEEE Trans. Neural Netw. Learn. Syst. 32, 1866–1880 (2020)
19. He, H., Bai, Y., Garcia, E.A., Li, S.: ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks, pp. 1322–1328 (2008)
20. Javaid, N., Jan, N., Javed, M.U.: An adaptive synthesis to handle imbalanced big data with deep Siamese network for electricity theft detection in smart grids. J. Parallel Distrib. Comput. 0743–7315 (2021)
Routing Strategy for Avoiding Obstacles During Message Forwarding in Mobile Ad-Hoc Network
Qiang Gao1(B), Tetsuya Shigeyasu2, and Chunxiang Chen2
1 Graduate School of Comprehensive Scientific Research, Prefectural University of Hiroshima, Hiroshima, Japan [email protected]
2 Department of Management and Information System, Prefectural University of Hiroshima, Hiroshima, Japan {sigeyasu,chen}@pu-hiroshima.ac.jp
Abstract. It is well known that obstacles like ruins appear after a disaster and degrade the performance of message forwarding. Hence, we propose a scheme to maintain communication performance. With the proposal, nodes in the disaster-affected area can decide whether there are obstacles around themselves, and then change their message routing strategy to successfully forward messages while bypassing the obstacles. Through performance evaluations, we clarify that our scheme has better resistance to obstacles while keeping a higher delivery ratio, low end-to-end latency, and low energy consumption.
1 Introduction
In recent years, earthquakes have occurred frequently [1]. Environments damaged by earthquakes seriously affect the rescue activities in which rescuers help the injured. Rescuers must know the geographical location of the injured and other necessary information before starting relief activities; this information lets the action personnel prepare enough medicine or suitable medical equipment, so that, after reaching the injured person's location, timely medical rescue can be provided. To address the above cases, we previously proposed two schemes in the literature [2]: MRDAI (Message Relay Decision with Area Increase) and SRS (Sub-Relay Stations). MRDAI calculates the appropriate number of hops that a message should go through from source node to destination node according to the distance between these nodes in the message relay area1 [3]. If the number of actual hops of a received message at any node exceeds the evaluated number, MRDAI concludes that obstacles exist in the message relay area and enlarges the original message relay area. The idea is that if more mobile nodes join the forwarding
Only the mobile nodes within the specified range can forward messages.
process, the message may bypass the obstacles thanks to enough relay candidate nodes, and the delivery ratio improves. However, the more nodes join in, the higher the energy consumption and the end-to-end delay. SRS was proposed as an updated version of MRDAI to deal with this unneeded energy consumption. By collecting the coordinates of multiple messages that have been successfully delivered and applying the k-means2 clustering method, the center points that most messages pass through are calculated; these are called the sub-relay points of the transit coordinates. The purpose of the sub-relay points is to establish a sub-relay area extracted from the enlarged message relay area. Afterwards, only the mobile nodes within the sub-relay area can forward messages. SRS is thus able to bypass the obstacles and, at the same time, effectively reduce useless energy consumption; it also improves the delivery ratio compared with MRDAI. However, both methods still need improvement, because their core idea only infers the probable existence of obstacles indirectly. Even the updated version, SRS, fails to work as expected in some cases, such as when multiple obstacles exist in the environment; in other words, the established sub-relay area may still contain obstacles. To solve this, we propose a new scheme named APO (Adapted Predict Obstacle) in this paper. APO predicts and judges the existence of obstacles by checking changes in the number of a node's surrounding nodes over a period of time. In terms of metrics including delivery ratio and end-to-end delay, experiments show that APO outperforms both MRDAI and SRS. The rest of this paper is organized as follows. In Sect. 2, we summarize related work. In Sect. 3, we describe the detailed procedure of our proposal. Results of the performance evaluation are reported in Sect. 4, and Sect. 5 summarizes this paper.
2 Related Works
Many pioneers have proposed solutions for the problems mobile nodes face when confronting obstacles. The following literature can be divided into two categories according to whether the location of the obstacles is known in advance. The works in [4–7] belong to the first category, where the location of the obstacle is not known in advance. The method in [4] applies a waiting-timeout mechanism to the connection establishment process between nodes: if the sender does not receive acknowledgements from the receiver within a predefined period after sending handshake messages, it is determined that obstacles possibly exist between the two nodes. However, using an acknowledgement mechanism is not recommended due to the nature of DTN (Delay Tolerant Network) [8].
A clustering method that divides a group into subgroups by use of the Euclidean distance.
The approach in [5] is that the creator tries to find the location of the destination by sending control messages among neighbor and relay nodes, after which a virtual line between the creator and the destination can be established. Square units surrounding the virtual line, with side length R/(2√2),3 are created. The scheme allows only the nodes within the units to forward messages and delegates to nodes in other units, via the Help Forward control message,4 the task of helping the message bypass obstacles and arrive at the destination. The method in [6] simply provides a model that allows nodes to move quickly to the destination in a full-obstacle environment. The work in [7] proposes a utility-based equation considering the remaining energy of the next-hop node and the number of neighboring nodes, which can successfully make the messages arrive at the destination. The works in [9–13] belong to the second category, where the location of the obstacle is known in advance. The method in [9] is based on advance knowledge of the obstacles' locations: it constructs available routes for mobile nodes by creating a Tyson polygon5 or a triangulation,6 after which the mobile nodes can move on the established routes surrounding the obstacles; characteristics of dynamic graphs are also mentioned in that literature. The method in [10] places nodes called standby nodes beside obstacles: relay nodes forward messages to the standby nodes when facing obstacles, and the standby nodes holding messages forward them to other nodes while moving along the sides of the obstacles. The method in [11] describes a way to place sensor-equipped infrastructure in an obstacle environment. The method in [12] first finds the vertices of the obstacles between the message creator and the destination, then chooses either Voronoi mode or Compass mode to confirm the nodes close to the obstacles, and finally finishes the delivery process. The method in [13] tries to bypass obstacles and deliver messages by selecting border nodes and corner nodes. The above schemes consider a variety of direct and indirect parameters to bypass the obstacles that probably exist in the environment. But all of them are idealized and require predefined conditions (e.g., the location of obstacles in some papers), so they are not applicable to our obstacle model; that is why we propose our scheme, which belongs to the first category.
3 R is the transmission range of nodes.
4 A kind of control information which can relay the critical message after the delay time.
5 The Tyson polygon is also called the Voronoi diagram, named after Georgy Voronoi. It is a set of continuous polygons composed of the vertical bisectors connecting two adjacent points.
6 Triangulation is the most basic research method in algebraic topology. Taking a curved surface as an example, the surface can be cut into pieces, each of which is a curved triangle.
3 Adapted Predict Obstacle
The obstacle model targeted by our scheme is a scene of ruins after a disaster, in which the mobile nodes are mainly human beings, and mobile phones can act as the creators and forwarders of messages. The purpose of our scheme is to allow an injured person to convey the location obtained through GPS (Global Positioning System), or other necessary information, to the medical rescue center, so that unsafe people can be picked up by relief activities.
3.1 Forwarding Direction
In an environment without obstacles, a sender is not allowed to forward the message to nodes whose horizontal vector distance from the destination is larger than the sender's. Hence, we give a default forwarding direction of plus or minus 90°, where the vector direction from the message creator to the destination is the reference line at 0°; the creator and destination are fixed in our model, as shown in Fig. 1. When a node holds a message, the message can only be forwarded to local nodes within plus or minus 90°. The nodes that maintain this behavior are called normal nodes.
3.2 Phenomenon When Nodes Are Close to Obstacles
As shown in Fig. 2, we assume the nodes in the network are uniformly distributed. While a mobile node gets close to different structures of obstacles, its number of neighbors decreases continuously.
Fig. 1. The example about default forwarding direction.
Fig. 2. Changes to number of neighbor nodes when nodes are close to obstacles.
According to the above phenomenon, it is inevitable that the number of neighbors of a mobile node shows a decreasing trend while the node gets closer to the extreme obstacle. That is to say, the probability of forwarding messages will decrease. An exponential function is suitable for processing the probability value and is in line with the change of the number of neighbor nodes in Fig. 2, so we choose it to describe this phenomenon, as shown in Eq. (1):

f(x) = e^(−x)    (1)
Here x means the number of neighbors (number of connections), and f(x) means the probability of a node changing to an unlimited node, which will be introduced in the following paragraph (Fig. 3).
Fig. 3. Relationship when nodes are close to obstacles.
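As a minimal illustration of Eq. (1), the probability can be computed directly from the neighbor count; the function name and the sample values below are our own illustrative choices, not part of the original scheme.

import math

def unlimited_node_probability(num_neighbors: int) -> float:
    # Eq. (1): f(x) = e^(-x), where x is the number of neighbors.
    # Fewer neighbors (a hint of a nearby obstacle) gives a higher
    # probability of switching to 360-degree (unlimited) forwarding.
    return math.exp(-num_neighbors)

# Example: the probability rises as the neighbor count drops.
for n in (10, 5, 2, 0):
    print(n, round(unlimited_node_probability(n), 4))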
In order to deal with the fact that nodes within the interior of obstacles cannot forward messages, we select these special nodes and change their default forwarding direction from plus or minus 90° to 360° forwarding. The mentioned nodes are called unlimited nodes, as shown in Fig. 4.
Fig. 4. Normal node changes to unlimited node
3.3 Connections Change Rate
We found in Subsect. 3.2 that observing changes in the number of connections can roughly predict the distance between a node and an obstacle. Therefore, in this section we discuss the movement characteristics of the node in an obstacle environment at a certain moment by calculating the connection change rate between each movement of the node, as shown in Eq. (2):

rate = (n_curr − n_prev) / t,  s.t.  t = s / v    (2)
Here n_curr, n_prev, s and v are the current number of connections, the number of connections at the previous position, the distance between the current position and the previous position, and the moving speed of the node, respectively. There are two cases for the sign of rate: the moving direction is towards the obstacle (rate ≤ 0) or away from the obstacle (rate > 0).
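A small sketch of Eq. (2), assuming the neighbor count is sampled once per movement; the function name and units are illustrative.

def connection_change_rate(n_curr: int, n_prev: int,
                           distance: float, speed: float) -> float:
    # Eq. (2): rate = (n_curr - n_prev) / t, with t = s / v.
    # rate <= 0 suggests the node is moving towards an obstacle,
    # rate > 0 that it is moving away from it.
    t = distance / speed  # time spent between the two positions
    return (n_curr - n_prev) / t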
3.4 The Oldest Survival Age
In addition to the connection change rate, the oldest survival age of a message can also be used to determine whether an obstacle has affected the forwarding behavior of the node. The so-called survival age of a message refers to the difference between the current time and the time stamp of the message stored in the node. The oldest survival age refers to the largest survival age among all messages in the current node, as shown in Eq. (3):

T_max = max{T_curr − t | t ∈ TS(i, j)}    (3)
7 The time stamp of a message is updated to the current time point every time the message is forwarded, and it is updated at both the sender and the receiver at the same time.
TS(i, j) is the set of time stamps of each delivered copy of the ith message from node j. A bigger T_max means that the message has not been updated for a long time. Therefore, it can be considered that the node has hit an obstacle and entered its interior, which is similar to a concave obstacle.
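The oldest survival age of Eq. (3) can be sketched as follows, assuming each node keeps the last-forwarded time stamp of every stored message copy; names are illustrative.

def oldest_survival_age(t_curr: float, timestamps: list) -> float:
    # Eq. (3): T_max = max{T_curr - t | t in TS(i, j)}.
    # A large value means no stored copy has been forwarded for a
    # long time, hinting that the node entered a concave obstacle.
    return max(t_curr - t for t in timestamps)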
3.5 Probability for Predicting Obstacles
By using rate · T_max instead of x in Eq. (1), we can get the probability shown in Eq. (4):

f = e^(−(v · T_max · (n_curr − n_prev)) / s)    (4)
Assuming that the current connection change rate (rate) remains unchanged during the longest period of time (T_max) in which a message has not been forwarded, the change in the number of connections can finally be obtained. Then the probability that the node is close to an obstacle is calculated through Eq. (4), and we can conclude whether the node has been close to an obstacle during T_max.
3.6 Select Unlimited Nodes
As mentioned in the previous section, we have obtained the probability that the node is close to an obstacle. However, to be more rigorous, we also consider the horizontal distance between the current node and the destination, that is, whether the node is closer to the destination after the move than before. Through Eq. (5), a node is finally judged to become an unlimited node only if the two conditions hold simultaneously:

f_curr > f_prev,  d_curr < d_prev    (5)

Here f_curr, f_prev, d_curr and d_prev are the probability after moving, the probability before moving, the horizontal distance after moving and the horizontal distance before moving, respectively. When normal nodes are chosen as unlimited nodes, the property of 360° forwarding can be transferred from unlimited nodes to their neighbors, but infected neighbor nodes cannot transfer it to other nodes again. Figure 5 gives an example of the executed scheme compared with the previous scheme.
8 The vector direction mentioned in Fig. 1 is considered as the horizon.
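Combining Eqs. (4) and (5), the decision of whether a normal node becomes an unlimited node can be sketched as below; this is our own reading of the two equations, with illustrative names.

import math

def obstacle_probability(v, t_max, n_curr, n_prev, s):
    # Eq. (4): f = e^(-(v * T_max * (n_curr - n_prev)) / s).
    return math.exp(-(v * t_max * (n_curr - n_prev)) / s)

def becomes_unlimited(f_curr, f_prev, d_curr, d_prev):
    # Eq. (5): both conditions must hold simultaneously.
    return f_curr > f_prev and d_curr < d_prev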
Fig. 5. The example when the scheme is executed compared with the previous scheme.

Table 1. Simulation parameters
Parameters                 Value
Simulation period          5,000 s
Number of mobile nodes     400
Field size                 1500 m × 1500 m
dth                        200 m
D(S,R)                     700 m
Buffer size                12.5 MB
Message creation interval  30 s
Transmit speed             2 Mbps
Transmit range             100 m
Movement model             RandomWalk with Obstacle
Angle variation            [0°; 45°; 90°]
Shape of obstacles         Concave
Number of obstacles        3
4 Performance Evaluation
In order to evaluate our scheme in terms of obstacle resistance, we use a multiple-obstacle model as shown in Fig. 6.
Fig. 6. The obstacle model of our evaluation (Obstacles at 0◦ ).
The evaluation has been conducted with the parameters shown in Table 1; among them, dth decides the range of the message relay area. In this simulation, we evaluated the difference between the current scheme and the two previously proposed schemes, comparing various indicators under different angle changes of the obstacles. In Fig. 7, we find that the current scheme is better than the previous two schemes for all angle changes of the obstacle; when the angle of the obstacle is 90°, the performance is improved by almost half compared with the previous two solutions. This also proves that the current scheme has strong resistance to concave obstacles, which is in line with our guess in Subsect. 3.6.
Fig. 7. Delivery ratio among 3 schemes.
Fig. 8. Average delay among 3 schemes.
In Fig. 8, we can see that while the current scheme maintains a high delivery ratio, its average end-to-end delay is also lower than that of the previous two schemes. Although the delay is slightly higher when the obstacles are at 90° compared with the past two schemes, this shows that the stability of the current scheme is stronger: unlike the past two schemes, it is not subject to drastic fluctuations caused by environmental changes.
Fig. 9. Number of connection among 3 schemes.
In Fig. 9, it can be seen that, for the total number of connections, MRDAI is much higher than the current scheme and SRS. This is because MRDAI's core strategy is to allow more mobile nodes to participate in message forwarding in an environment where obstacles exist. The performance of SRS is almost the same as the current scheme, but the current scheme still has a certain advantage when the obstacle angle is 45°.
5 Conclusion
Through the comparison of the three different indicators in the experiment, it can be seen that the current scheme has the following characteristics: resistance to obstacles, stability (it is less affected by environmental changes), and a high delivery ratio while ensuring low energy consumption. However, this scheme has only been tested under our obstacle model, and it should be tested under more obstacle models. In addition, the scheme has some flaws, such as the functional relationship that the node follows as it approaches an obstacle: there may be a more appropriate functional relationship to represent it. Factors like these can still be improved.
References
1. Karimzadeh, S., Matsuoka, M.: Building damage characterization for the 2016 Amatrice earthquake using ascending-descending COSMO-SkyMed data and topographic position index. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 11(8), 2668–2682 (2018). https://doi.org/10.1109/JSTARS.2018.2825399
2. Gao, Q., Shigeyasu, T.: A new DTN relay method reducing number of transmissions under existence of obstacles by large-scale disaster. In: Proceedings of the 15th International Conference on Broad-Band Wireless Computing, Communication and Applications (BWCCA 2020), pp. 98–107 (2020)
3. Kawamoto, M., Shigeyasu, T.: Message relay decision algorithm to improve message delivery ratio in DTN-based wireless disaster information systems. In: 2015 IEEE (2015). https://doi.org/10.1109/AINA.2015.275
4. Ahmed, K.E.U., Miah, I., Hossen, N., Al Aaqib, A.: Obstacles avoidance and optimal path determination in mobile ad hoc network. In: 2009 International Conference on Computer Technology and Development, Kota Kinabalu, pp. 394–396 (2009). https://doi.org/10.1109/ICCTD.2009.231
5. Ho, Y.H., Ho, A.H., Hua, K.A., Do, T.: Adapting connectionless approach to mobile ad hoc networks in obstacle environment. In: 2006 1st International Symposium on Wireless Pervasive Computing, Phuket, p. 7 (2006). https://doi.org/10.1109/ISWPC.2006.1613597
6. Chenchen, Y., Xiaohong, L., Dafang, Z.: An obstacle avoidance mobility (OAM) model. In: 2010 IEEE International Conference on Intelligent Computing and Intelligent Systems, Xiamen, pp. 130–134 (2010). https://doi.org/10.1109/ICICISYS.2010.5658821
7. Maymi, F.J., Rodriguez-Martinez, M.: Obstacle avoidance for utility-based geocasting. In: 2009 Sixth International Conference on Information Technology: New Generations, Las Vegas, NV, pp. 338–343 (2009). https://doi.org/10.1109/ITNG.2009.326
8. Fall, K., Farrell, S.: DTN: an architectural retrospective. IEEE J. Sel. Areas Commun. 26(5), 828–836 (2008)
9. Aboue-Nze, C.G., Guinand, F., Pigné, Y.: Impact of obstacles on the degree of mobile ad hoc connection graphs. In: 2009 Fifth International Conference on Networking and Services, Valencia, pp. 332–337 (2009). https://doi.org/10.1109/ICNS.2009.36
10. Di, W., Yan, Q., Ning, T.: Connected dominating set based hybrid routing algorithm in ad hoc networks with obstacles. In: 2006 IEEE International Conference on Communications, Istanbul, pp. 4008–4013 (2006). https://doi.org/10.1109/ICC.2006.255708
11. Eledlebi, K., Ruta, D., Saffre, F., AlHammadi, Y., Isakovic, A.F.: Voronoi-based indoor deployment of mobile sensors network with obstacle. In: IEEE 3rd International Workshops on Foundations and Applications of Self* Systems (FAS*W), Trento, vol. 2018, pp. 20–21 (2018). https://doi.org/10.1109/FAS-W.2018.00019
12. Di, W., Yan, Q., Zhongxian, C.: A Voronoi-trajectory based hybrid routing (VTBR) algorithm for wireless ad hoc networks with obstacles. In: Sixth International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT 2005), Dalian, China, pp. 65–69 (2005). https://doi.org/10.1109/PDCAT.2005.56
13. Chang, C., Shih, K., Lee, S., Chang, S.: RGP: active route guiding protocol for wireless sensor networks with obstacles. In: 2006 IEEE International Conference on Mobile Ad Hoc and Sensor Systems, Vancouver, BC, pp. 367–376 (2006). https://doi.org/10.1109/MOBHOC.2006.278576
Fuzzing Method Based on Selection Mutation of Partition Weight Table for 5G Core Network NGAP Protocol
Yang Hu1, Wenchuan Yang1, Baojiang Cui1(B), Xiaohui Zhou2, Zhijie Mao3, and Ying Wang1
1 Beijing University of Posts and Telecommunications, Beijing, China
{yangwenchuan,cuibj}@bupt.edu.cn
2 Tianjin Network Technology Institute, Tianjin, China
3 National University of Defense Technology, Changsha, Hunan, China
[email protected]
Abstract. By analyzing the NGAP protocol of the 5G core network, we study the protocol format and find an effective security detection method. In this paper, the Fuzzing technique is used to detect security flaws in the NGAP protocol of the 5G core network. In order to improve the efficiency of Fuzzing, a selection mutation algorithm based on a partition weight table is proposed. In the actual test, we test the NGAP protocol in the 5G core network using our proposed algorithm and find security problems in the NGAP protocol. Finally, we prove that the selection mutation algorithm based on the partition weight table is effective by counting the mutation samples and calculating the increase in the probability of triggering anomalies in the target network elements after using the algorithm compared to before. Keywords: 5G · Fuzzing · Partition weight table · Mutation algorithms · NGAP protocol
1 Introduction
The current mobile communication technology is developing rapidly, and 5G networks have been widely deployed in China and abroad. The rapid growth of mobile data traffic, massive device connections, the constant emergence of new services and application scenarios, and the diverse needs of vertical industry terminal interconnections all make this an era of great demand for 5G [1]. Recently, security research companies have conducted effective security assessments of 5G core networks, analyzing the security issues that exist based on the characteristics of the network architecture and of the 5G core network transport protocols. Regarding the functionally independent features of the network architecture, attackers can use these features to conduct DoS attacks, covertly monitor the location of users, etc. [2]. The security assessment of the transport protocols covers data leakage that may be caused by insecure network transport protocols, attacks against network availability, and tampering with transport
data and compromising data integrity. It can be found that the 5G core network has some security problems due to the characteristics of its network architecture on the one hand, and some potential security risks due to the inability to guarantee the security of its communication protocols on the other hand. Once these security flaws are exposed in usage scenarios, it becomes easier for attackers to exploit them to attack the 5G core network, with very serious consequences for network availability, user privacy, and other aspects. How can effective security testing be performed on 5G core networks? Because 5G core network features are complex and large and vendors are closed source, it is difficult to use white-box testing or code audits for security testing of the 5G core network. It is these features that make Fuzzing the natural choice as our testing technique. Fuzzing is usually classified into two categories: mutation-based Fuzzing and generation-based Fuzzing [4]. The mutation technique generates test data by changing existing data samples, while the generation technique generates new test data by modeling the program input. We generate valid malformed data by mutating standard packets, which are sent to the 5G core network for normal communication with the corresponding network elements, to verify the security of the protocol and the logical integrity of the data processed by the core network. Currently, Fuzzing technology is widely used in software security testing and protocol security testing. Software security testing includes firmware security testing for IoT, operating system kernel testing, and regular application software security testing; protocol security testing covers WIFI, Bluetooth, and vendor-developed protocols [8]. For 5G mobile networks, some researchers also consider using Fuzzing to detect security flaws in 5G access networks; this paper examines a Fuzzing method to efficiently detect security flaws in 5G core networks. Although Fuzzing is widely used in various security research areas, its efficiency depends on a sample generation strategy suited to the detection target. Since the test cases of traditional Fuzzing techniques are generated by a random generation algorithm, the constraints of the protocol data format in protocol Fuzzing cause such test cases to be discarded directly by the target entity, making detection inefficient. Therefore, for protocol Fuzzing, efficient sample generation algorithms should be specified according to the characteristics of the protocol data format and the way the target entity processes protocol data, so that the test samples are more effective and the test surface is wider, thus improving the efficiency and comprehensiveness of security defect detection.
2 N2 Interface and NGAP Protocol Analysis
The 5G core network architecture is shown in Fig. 1, which includes AUSF network elements (authentication server function), AMF network elements (access and mobility management function), NRF network elements (network repository function), PCF network elements (policy control function), UDM network elements (unified data management), SMF network elements (session management function), UPF network elements (user plane function), the N1 to N6 communication interfaces, UE, (R)AN, data network (DN) and inter-network communication protocol components. The focus of this paper
Fig. 1. Schematic diagram of 5G core network infrastructure
is to detect security flaws in the communication protocol NGAP on the N2 interface between (R)AN and AMF network elements, using an improved Fuzzing algorithm.
2.1 N2 Interface
The 5G system adopts the SBA architecture, which combines the characteristics and technical development trend of the 5G core network, divides the network functions into several reusable network element services, and uses lightweight interfaces for communication between the network element services [3]. Compared with the traditional point-to-point network element architecture of the communication industry, this change in the 5G core network is revolutionary. Under the SBA architecture, the interface between core network elements is the SBI (service based interface), while the N2 interface between the 5G core network and the access network still adopts the traditional model. The N2 interface is defined as the interface for communication between (R)AN and AMF network elements and between (R)AN and SMF network elements. For (R)AN-to-SMF communication, the N2 interface provides only transmission, not service processing, and messages between (R)AN and SMF core network elements also need to be transparently transmitted through AMF network elements (as shown in Fig. 2) [7]. For (R)AN-to-AMF communication, the N2 interface not only carries messaging but also realizes business processes; typical processes are: PDU session resource management, the UE context management process, the UE mobility management process, the NAS messaging process, etc.
2.2 NGAP Protocol
In order to propose a Fuzzing method suitable for the NGAP protocol in the 5G core network, on the one hand we need to analyze its protocol format, study the characteristics of the protocol and develop a corresponding test sample generation strategy. On the other hand, we need to conduct a comprehensive analysis of the NGAP protocol business process in
Fig. 2. Schematic diagram of AN-SMF transmission protocol stack
order to improve code coverage and make the testing more comprehensive.
2.2.1 NGAP Protocol Business Process and Protocol Field Analysis
The 5G core network uses the TCP, HTTPS and SCTP protocols on the transport layer, and the NGAP protocol data we study is sent on the transport layer using the SCTP protocol, which shares characteristics with both UDP and TCP. Like UDP, the length of each record written by the sender is passed to the receiver together with the data, so SCTP has characteristics of both the UDP protocol and the TCP protocol. In addition, SCTP provides a multi-homing feature, which means that a single SCTP endpoint can support multiple IP addresses. Once that endpoint has established an association with another endpoint, if one of its networks or some cross-domain Internet pathway fails, SCTP can avoid the failure by switching to another address already bound to that association. NGAP consists of basic processes (elementary procedures, EP), which are units of interaction between NG-RAN nodes and the AMF, with independence and flexibility [6]. There are two types of message composition of the basic process: 1) answer-type messages, with a request and response process, where the response indicates success or failure; 2) notification-type messages, which send a notification without waiting for a response. The main functions are as follows: the interface management function, alarm messaging, the UE mobility management function, the UE context management function, the PDU session management function, the NAS messaging function, etc. TestRegistration is a test registration process (the process analysis is schematically shown in Fig. 3):
A. The NG-RAN node initiates the process by sending an NG SETUP REQUEST message including the appropriate data to the AMF.
B. The NG-RAN node sends a registration request message, and the AMF network element sends back an authentication request.
(R)AN ↔ AMF message sequence: NGSetupRequest; NGSetupResponse; InitialUEMessage (Registration request); DownlinkNASTransport (Authentication request); UplinkNASTransport (Authentication response); DownlinkNASTransport (Security mode command); UplinkNASTransport (Security mode complete); InitialContextSetupRequest (Registration accept); InitialContextSetupResponse; UplinkNASTransport (UL NAS transport, PDU session establishment); PDUSessionResourceSetupRequest (DL NAS transport, PDU session establishment accept); PDUSessionResourceSetupResponse
Fig. 3. Schematic diagram of TestRegistration business process
C. The NG-RAN node responds with an authentication message; the AMF performs the authentication and enters secure mode.
D. The NG-RAN node sends back a message to notify the AMF that the node has completed security mode, and the AMF network element initiates the initial context setup process.
E. The NG-RAN node sends back packets to notify the AMF network element that the setup is complete and sends an UPLINK NAS TRANSPORT message to the AMF to initiate uplink NAS transmission.
F. The AMF initiates the PDU session resource setup by sending a PDU SESSION RESOURCE SETUP REQUEST message to the NG-RAN node, and the NG-RAN sends back a PDU SESSION RESOURCE SETUP RESPONSE to indicate the completion of the setup.
2.2.2 Potential Security Flaws in NGAP Protocol
The NGAP protocol vulnerability analysis is as follows: the many functions of the N2 interface and the complexity of its business processes bring
potential security threats. If an attacker follows the NGAP protocol standard to attack the N2 interface, it can directly cause great harm to the 5G core network, resulting in core network collapse, information leakage and other hazards. Possible attack surfaces are as follows: session hijacking during the establishment of the connection between NG-RAN and AMF, and the NGSetupRequest message lacking a corresponding authentication mechanism, which may allow forging this message to create fake gNB connections and occupy AMF resources. Forging data based on the NGAP protocol standard can make the 5G core network generate release exceptions, context management exceptions, and message delivery exceptions. On the other hand, if equipment vendors do not strictly follow the NGAP protocol standard, the core network will generate many unknown exceptions, such as memory buffer errors and null pointer exceptions, when handling malformed data, because some special data fields were not considered in advance.
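As a rough sketch of how NGAP bytes reach the AMF over the N2 interface, the snippet below opens a one-to-one SCTP association and sends a raw payload. It assumes a Linux host with kernel SCTP support; the AMF address is a placeholder, 38412 is the standard NGAP SCTP port, and NGAP PPID handling is omitted for brevity.

import socket

AMF_ADDR = ("192.168.56.101", 38412)  # placeholder AMF IP; 38412 = NGAP port

def send_ngap_bytes(payload: bytes) -> None:
    # One-to-one SCTP association (requires Linux kernel SCTP support).
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM,
                         socket.IPPROTO_SCTP)
    try:
        sock.connect(AMF_ADDR)
        sock.sendall(payload)  # a (possibly mutated) NGAP PDU
    finally:
        sock.close()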
3 Fuzzing Tool Architecture
Fig. 4. Fuzzing tool architecture
As shown in Fig. 4, the 5G core network NGAP protocol Fuzzing tool architecture consists of five modules: a sample generation module, a mutation module, a sending module, a monitoring module, and a logging module.
Sample Generation Module: Studying and analyzing the NGAP protocol helps us understand the NGAP protocol data format and encapsulate the base message format for NGAP protocol Fuzzing by capturing packets. In order to improve code coverage, we need to master the communication flow of the NGAP protocol in different business processes and encapsulate the complete communication flow in order to test this protocol completely.
Mutation Module: It is responsible for mutating selected fields using the appropriate algorithm. Since a protocol packet consists of multiple fields, after mutating a field we need to assemble it into a complete packet that conforms to the protocol format, based on the data we have encapsulated for the other fields.
Sending Module: Communicates with the AMF network element of the 5G core network; specifically, it sends the samples generated by the mutation module to the AMF and obtains the data passed back from the AMF network element.
Monitoring Module: After the sending module sends the malformed samples, the Fuzzing tool needs to know the status of the target network element: normal, abnormal, or crashed. The next step is taken based on this information; this module detects whether the AMF network element has crashed by sending part of the initial messages (e.g., authentication).
Logging Module: Based on the content obtained by the monitoring module and the sending module, it records the useful information obtained in each test to help improve the efficiency of Fuzzing and to debug the core network.
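A minimal sketch of the mutation module's core operation is given below: one field of a captured base packet is replaced with random bytes while all other fields stay intact, and the packet is reassembled. The function is our own illustration; real offsets and lengths would come from a field table such as Table 1.

import random

def mutate_field(packet: bytes, offset: int, length: int) -> bytes:
    # Replace the field at [offset, offset+length) with random bytes
    # and reassemble a packet that still conforms to the format.
    mutated = bytes(random.getrandbits(8) for _ in range(length))
    return packet[:offset] + mutated + packet[offset + length:]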
4 Selection Mutation Algorithms Based on Partition Weight Table
Fuzzing of 5G core network protocols tends to involve more complex mechanisms than traditional software Fuzzing, and the difference is mainly reflected in the data characteristics. A given kind of protocol data often contains a large number of fields, and traditional protocol Fuzzing techniques tend to mutate each field in turn to generate malformed packets. The disadvantage of this method is obvious: it consumes time and puts a lot of unnecessary load on the core network. If a field is a protocol-specific identifier, blindly mutating it will invalidate all packets generated under that field mutation and waste a lot of time, leading to extremely inefficient testing. By analyzing the protocol data, we understand that although protocol packets contain a large number of fields, the meaning and function of different fields differ (Table 1 parses the UplinkNASTransport packet fields in the TestRegistration business process). This difference in function indicates that the weight or priority of different fields is not the same. Some fields may easily lead to security problems in the core network. For example, for a field indicating a length in the protocol data, if it is not handled properly in the protocol processing module of the target entity and there is no security check for this field, there will be potential vulnerabilities such as integer overflow and buffer overflow. According to the principle of efficient testing, such fields have high priority and cost-effectiveness, and should be mutated first. However, for some fields, such as a specific identifier, the protocol processing module of the target entity uses the field to determine whether the data belongs to the corresponding protocol, so mutating the field is meaningless and causes the packets to be dropped directly. For some other fields, the entity's protocol processing module often has a good error handling mechanism; even if the value of the field is modified to malformed data, it is difficult to trigger security problems because the target entity handles the packet gracefully. In this paper, we propose a selection mutation algorithm based on a field weight table to solve these problems and significantly improve Fuzzing efficiency. The algorithm has the following two objectives: 1. The weights of different fields are set according to their impact on security; mutated packets with high impact on the core network get relatively high weights, those with low impact get low weights, and these weights are recorded in global variables. 2. When conducting the test, the weights of the different fields are normalized to filter out one or several fields with higher weights for mutation, so that the fields most likely to trigger core network anomalies are tested as much as possible within a given time frame. On the one hand, this increases the probability of exposing problems in the core network; on the other hand, it improves testing efficiency.
Table 1. UplinkNASTransport field table.
Head field                    Bytes  Content
NGAP-PDU                      53     Initial message of UplinkNASTransport
Length                        2      The length of the message
DATA chunk                    16     Some info of this message (SSN, PPID…)
ProcedureCode                 1      UplinkNASTransport Id
id-AMF-UE-NGAP-ID             6      The AMF UE NGAP ID uniquely identifies the UE on the NG interface in the AMF
id-RAN-UE-NGAP-ID             6      The RAN UE NGAP ID uniquely identifies the UE in the gNB through the NG interface
id-NAS-PDU                    58     Security protected NAS 5GS message and NAS 5GS message
id-User-Location-Information  19     User location information
The pseudocode for calculating the field weight values is sketched below:

do {
  try {
    getPacket();                                   // obtain a base packet
    Attack::VariationMethodFuzzTestRegistration(); // mutate a field and send
    testStatus();                                  // check the target status
  } catch (...) {
    addValue();                                    // anomaly: raise field weight
  }
} while (condition);
handleAllFieldValue();                             // finalize all field weights
In order to determine the impact of a field on the core network, this paper uses two methods: obtaining the messages sent back from the core network and monitoring the current state of the core network. Specifically, we use a protocol interpreter to assemble a mutated field with the other normal fields into a compliant NGAP packet and then send this packet to the core network. By monitoring the core network, we can know its current operational status. If the packets mutated from a field are more likely to cause the core network to crash or behave abnormally, the core network's filtering or processing mechanism for this field is imperfect, and the field is more likely to cause security threats, so the weight of this field is increased accordingly. On the contrary, if mutations of the field rarely trigger anomalies and crashes in the core network, and the core network handles the field without response or directly sends back error messages, the core network's handling of the field is relatively sound and the field is less likely to trigger security threats, so its weight is unchanged or reduced. Because we need a large number of samples for testing, the final values of the field weights are calculated uniformly based on the test results of all samples.
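The weight bookkeeping described above can be sketched as follows; the per-outcome increments are illustrative assumptions, since the paper does not fix exact values.

from collections import defaultdict

SCORE = {"crash": 3.0, "exception": 2.0, "warning": 1.0, "ok": 0.0}
weights = defaultdict(float)  # global field-weight table

def add_value(field: str, outcome: str) -> None:
    # Raise a field's weight when mutating it disturbs the target.
    weights[field] += SCORE[outcome]

def normalized_weights() -> dict:
    # Normalize once all samples are processed; the highest-weight
    # fields are then chosen for selective mutation.
    total = sum(weights.values()) or 1.0
    return {f: w / total for f, w in weights.items()}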
5 Experimental Results
5.1 Testing Environment
Fig. 5. Free5gc architecture diagram
Free5GC is an open-source project for fifth-generation (5G) mobile core networks. The version tested in this paper is 3.0.4, and the default network element topology is shown in Fig. 5. The ultimate goal of the project is to implement the 5G core network (5GC) as defined in 3GPP Release 15 (R15) and later [5]. The project is written in the Go language and implements functions including AMF, SMF, UDM, UDR, NSSF, PCF, AUSF, NRF, UPF, etc. Free5GC virtualizes IP addresses, customizes loopback addresses, completes the functional interfaces between individual network elements, and implements basic functions such as registration and PDU session establishment.
5.2 Testing Results
We count the number of samples before and after packet mutation with the partition weight table-based selection mutation algorithm, and the number of times they cause core network anomalies, crashes and warnings. Before using the algorithm, we randomly mutated 35 fields in the registration-process packets; the total number of samples generated was 312682, which led to 35 packets crashing the AMF network element, 127 packets triggering exceptions, and 839 packets triggering warnings. Figure 6 counts the number of exceptions caused by the 35 different fields.
Fig. 6. Statistical table of the number of abnormalities in field variation
After using the selection mutation algorithm based on the partition weight table, the 35 fields are selectively mutated according to their weights. Among the first 311682 samples with high weights, 78 packets caused the AMF network element to crash, 1283 packets caused anomalies, and 3261 packets caused warnings. Among the latter 302682 low-weight samples, 12 packets triggered crashes, 78 packets triggered exceptions, and 651 packets triggered warnings. The following is the overall comparison using the partition weight table-based selection mutation algorithm. From the comparison of Fig. 7 and Fig. 8, we can learn that the probability of mutated samples triggering anomalies in the target network elements increases substantially after using the selection mutation algorithm based on the field weight table: it is more likely to expose security flaws in the target when selecting field mutations with high weights, and less likely when selecting field mutations with low weights. The selection mutation algorithm based on the partition weight table improves the efficiency of sample testing, and it is easier to find security flaws in the target network elements in a limited time using this algorithm.
Fig. 7. Statistics of the number of abnormalities caused by the same sample size
Fig. 8. Generate a statistical chart of abnormal probability per 100,000 samples
6 Conclusion
From the test results on the NGAP protocol in the 5G core network, the selection mutation algorithm based on the partition weight table proposed in this paper has a clear advantage over the traditional Fuzzing technique: the samples it generates make the target network elements more likely to produce anomalies within a shorter Fuzzing time, thus improving the efficiency of Fuzzing. However, the partition weight table-based selection mutation algorithm also has obvious drawbacks: we need enough samples
to test and calculate the weights of each field before using the algorithm. In this process, too many samples will undoubtedly reduce the efficiency of the whole Fuzzing run, while too few samples will make the calculation of the field weights inaccurate. Therefore, we need to find an intelligent weight calculation method that can calculate field weights quickly and accurately with less feedback from the target network element. During the testing process we also found that the protocol does have some security flaws that can crash the target network element and cause errors, which shows that our Fuzzing method is effective. Similarly, we can use the improved Fuzzing method to test other protocols in the 5G communication network, which is of general importance for the security of the 5G communication network.
References
1. Schneider, P., Horn, G.: Towards 5G security. In: 2015 IEEE Trustcom/BigDataSE/ISPA, Helsinki, Finland, pp. 1165–1170 (2015). https://doi.org/10.1109/trustcom.2015.499
2. Ahmad, I., Shahabuddin, S., Kumar, T., Okwuibe, J., Gurtov, A., Ylianttila, M.: Security for 5G and beyond. IEEE Commun. Surv. Tutorials 21(4), 3682–3722 (2019). https://doi.org/10.1109/comst.2019.2916180
3. Zhang, X., Kunz, A., Schröder, S.: Overview of 5G security in 3GPP. In: 2017 IEEE Conference on Standards for Communications and Networking (CSCN), Helsinki, pp. 181–186 (2017). https://doi.org/10.1109/cscn.2017.8088619
4. Cui, B., Feng, S., Xiao, Q., Li, M.: Detection of LTE protocol based on format fuzz. In: 2015 10th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA), Krakow, Poland, pp. 187–192 (2015). https://doi.org/10.1109/bwcca.2015.42
5. Free5GC (2021). https://github.com/free5gc/free5gc
6. 3GPP TS 23.501 g70: System architecture for the 5G System (Release 15)
7. 3GPP TS 38.410 g30: NG-RAN; NG general aspects and principles (Release 15)
8. Wikipedia contributors: Fuzzing, 24 Feb 2021. https://en.wikipedia.org/wiki/Fuzzing
Simulation Results of a DQN Based AAV Testbed in Corner Environment: A Comparison Study for Normal DQN and TLS-DQN
Nobuki Saito1, Tetsuya Oda2(B), Aoto Hirata1, Kyohei Toyoshima2, Masaharu Hirota3, and Leonard Barolli4
1 Graduate School of Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan {t21jm01md,t21jm02zr}@ous.jp
2 Department of Information and Computer Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan [email protected], [email protected]
3 Department of Information Science, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan [email protected]
4 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan [email protected]
Abstract. The Deep Q-Network (DQN) is one of the deep reinforcement learning algorithms, which uses deep neural network structure to estimate the Q-value in Q-learning. In the previous work, we designed and implemented a DQN-based Autonomous Aerial Vehicle (AAV) testbed and proposed a Tabu List Strategy based DQN (TLS-DQN). In this paper, we consider corner environment as a new simulation scenario and carried out simulations for normal DQN and TLS-DQN for mobility control of AAV. Simulation results show that TLS-DQN performs better than normal DQN in the corner environment.
1 Introduction
The Unmanned Aerial Vehicle (UAV) is expected to be used in different fields such as aerial photography, transportation, search and rescue of humans, inspection, land surveying, observation and agriculture. The Autonomous Aerial Vehicle (AAV) [1] has the ability to operate autonomously without human control and is expected to be used in a variety of fields, similar to UAV. So far many AAVs [2–4] have been proposed and used practically. However, existing autonomous flight systems are designed for outdoor use and rely on location information from the Global Navigation Satellite System (GNSS) or others. On the other hand, in an environment where it is difficult to obtain position information from GNSS, it is necessary to determine a path without using position information. Therefore, autonomous
movement control is essential to achieve operations that are independent of the external environment, including non-GNSS environments such as indoors, tunnels, and underground. In [5,6] the authors consider Wireless Sensor and Actuator Networks (WSANs), which can act autonomously for disaster monitoring. A WSAN consists of wireless network nodes, all of which have the ability to sense events (sensors) and perform actuation (actuators) based on the sensing data collected by the sensors. WSAN nodes in these applications are nodes with integrated sensors and actuators that have high processing power, high communication capability and high battery capacity, and may include other functions such as mobility. The application areas of WSAN include AAV [7], Autonomous Underwater Vehicle (AUV) [8], Autonomous Surface Vehicle (ASV) [9], Heating, Ventilation, Air Conditioning (HVAC) [10], Internet of Things (IoT) [11], Ambient Intelligence (AmI) [12], ubiquitous robotics [13], and so on. Deep reinforcement learning [14] is an intelligent approach that is effective in controlling autonomous robots such as AAV. It approximates the value function and policy function in reinforcement learning using deep neural networks. Deep Q-Network (DQN) is a deep reinforcement learning method using a Convolutional Neural Network (CNN) as a function approximator of the Q-values in the Q-learning algorithm [14,15]. DQN combines neural fitted Q-iteration [16,17] and experience replay [18], shares the hidden layers of the action value function for each action pattern, and can stabilize learning even with nonlinear approximators such as CNN [19,20]. However, learning is difficult to progress for problems with complex operations and rewards, or problems where it takes a long time to obtain a reward. In this paper, we consider a corner environment as a new simulation scenario and carry out simulations for normal DQN and Tabu List Strategy based DQN (TLS-DQN) for mobility control of AAV. Simulation results show that TLS-DQN performs better than normal DQN in the corner environment. The corner environment in this paper refers to a non-GNSS single-path environment including turns. The structure of the paper is as follows. In Sect. 2, we show the DQN based AAV testbed. In Sect. 3, we describe TLS-DQN. In Sect. 4, we discuss the simulation results for normal DQN and TLS-DQN. Finally, conclusions and future work are given in Sect. 5.
2 DQN Based AAV Testbed
In this section, we discuss the quadrotor for AAV and the DQN for AAV mobility.
Fig. 1. Snapshot of AAV.

Table 1. Components of quadrotor.
Component                  Model
Propeller                  15 × 5.8
Motor                      MN3508 700 kv
Electric speed controller  F45A 32bitV2
Flight controller          Pixhawk 2.4.8
Power distribution board   MES-PDB-KIT
Li-Po battery              22.2 v 12000 mAh XT90
Mobile battery             Pilot Pro 2 23000 mAh
ToF ranging sensor         VL53L0X
Raspberry Pi               3 Model B Plus
PVC pipe                   VP20
Acrylic plate              5 mm
2.1 Quadrotor for AAV
For the design of the AAV, we consider a quadrotor, which is a type of multicopter. A multicopter is highly maneuverable and can operate in places that are difficult for people to enter, such as disaster areas and dangerous places. It also has the advantages of not requiring space for takeoff and landing and of being able to hover in mid-air during flight, therefore enabling activities at fixed points. The quadrotor is a type of rotary-wing aircraft that uses four rotors for takeoff
Fig. 2. AAV control system.
and propulsion; it can operate with less power than a hexacopter or octocopter and is less expensive to manufacture. Figure 1 shows a snapshot of the quadrotor used for designing and implementing the AAV testbed. The quadrotor frame is mainly composed of polyvinyl chloride (PVC) pipes and acrylic plates. The components for connecting the battery, motor, sensors, etc. to the frame were created using an optical 3D printer. Table 1 shows the components used in the quadrotor. The size specifications of the quadrotor (including the propellers) are length 87 [cm], width 87 [cm], height 30 [cm] and weight 4259 [g]. Figure 2 shows the AAV control system. The Raspberry Pi reads the saved data of the best episode obtained from the DQN simulations and uses telemetry communication to send commands such as up, down, forward, back, left, right and stop to the flight controller. Also, multiple Time-of-Flight (ToF) ranging sensors using Inter-Integrated Circuit (I2C) communication and General-Purpose Input/Output (GPIO) are used to acquire and save flight data. The Flight Controller (FC) is a component that calculates the optimum motor rotation speed for flight based on the information sent from the built-in acceleration sensor and gyro sensor. The Electronic Speed Controller (ESC) is a part that controls the rotation speed of the motors in response to commands from the FC. Through this sequence, the AAV reproduces in flight the movement obtained in simulation.
2.2 DQN for AAV Mobility
The structure of the DQN for AAV movement control is shown in Fig. 3. The DQN for AAV mobility is implemented in the Rust programming language [22]. In this work, we use a Deep Belief Network (DBN), whose computational complexity is smaller than CNN, for the DNN part of the DQN. The environment is set as v_i. At each step, the agent selects an action a_t from the action sets of the mobile actuator nodes and observes a position v_t from the current state. The change of the mobile actuator node score r_t is regarded as the reward for the action. For the reinforcement learning, we can treat all of these mobile actuator node sequences m_t as a Markov decision process directly, where the sequences of observations and actions are m_t = v_1, a_1, v_2, ..., a_{t−1}, v_t. A method known as experience replay is used to store the experiences of the agent at each
Fig. 3. DQN for AAV mobility control.
timestep, e_t = (m_t, a_t, r_t, m_{t+1}), in a dataset D = e_1, ..., e_N, cached over many episodes into an experience memory. By defining the discounted future reward with a factor γ, the sum of the future rewards until the end is R_t = Σ_{t′=t}^{T} γ^{t′−t} r_{t′}, where T is the termination time-step of the mobile actuator nodes. After running experience replay, the agent selects and executes an action according to an ε-greedy strategy. Since using histories of arbitrary length as inputs to a neural network can be difficult, the Q-function instead works on a fixed-length format of histories produced by a function φ. The target is to maximize the action value function Q*(m, a) = max_π E[R_t | m_t = m, a_t = a, π], where π is the strategy for selecting the best action. From the Bellman equation (see Eq. (1)), this is equal to maximizing the expected value of r + γQ*(m′, a′), if the optimal value Q*(m′, a′) of the sequence at the next time step is known:

Q*(m, a) = E_{m′∼ξ}[r + γ max_{a′} Q*(m′, a′) | m, a].    (1)
By not using an iterative updating method to optimize the equation, it is common to estimate it by using a function approximator. The Q-network in DQN is a neural network function approximator with weights θ and Q(s, a; θ) ≈ Q*(m, a). The loss function used to train the Q-network is shown in Eq. (2):

L_i(θ_i) = E_{s,a∼ρ(·)}[(y_i − Q(s, a; θ_i))²].    (2)

Here y_i is the target, which is calculated from the previous iteration result θ_{i−1}, and ρ(m, a) is the probability distribution of sequences m and a. The gradient of the loss function is shown in Eq. (3):

∇_{θ_i} L_i(θ_i) = E_{m,a∼ρ(·); s′∼ξ}[(y_i − Q(m, a; θ_i)) ∇_{θ_i} Q(m, a; θ_i)].    (3)
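A numeric sketch of the target and loss of Eqs. (2)–(3) for one replayed sample is given below, assuming γ = 0.9 as in Table 3; the function names are illustrative.

GAMMA = 0.9  # discount rate (Table 3)

def td_target(reward, next_q_values, terminal):
    # y_i = r for terminal steps, otherwise r + gamma * max_a' Q(m', a').
    return reward if terminal else reward + GAMMA * max(next_q_values)

def sample_loss(y, q):
    # One term of Eq. (2): (y_i - Q(m, a; theta_i))^2.
    return (y - q) ** 2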
Algorithm 1. Tabu List for TLS-DQN.
Require: The coordinate with the highest evaluated value in the section is (x, y, z).
1: if (x_before ≤ x_current) ∧ (x_current ≤ x) then
2:   tabu_list ⇐ ((x_min ≤ x_before) ∧ (y_min ≤ y_max) ∧ (z_min ≤ z_max))
3: else if (x_before ≥ x_current) ∧ (x_current ≥ x) then
4:   tabu_list ⇐ ((x_before ≤ x_max) ∧ (y_min ≤ y_max) ∧ (z_min ≤ z_max))
5: else if (y_before ≤ y_current) ∧ (y_current ≤ y) then
6:   tabu_list ⇐ ((x_min ≤ x_max) ∧ (y_min ≤ y_before) ∧ (z_min ≤ z_max))
7: else if (y_before ≥ y_current) ∧ (y_current ≥ y) then
8:   tabu_list ⇐ ((x_min ≤ x_max) ∧ (y_before ≤ y_max) ∧ (z_min ≤ z_max))
9: else if (z_before ≤ z_current) ∧ (z_current ≤ z) then
10:   tabu_list ⇐ ((x_min ≤ x_max) ∧ (y_min ≤ y_max) ∧ (z_min ≤ z_before))
11: else if (z_before ≥ z_current) ∧ (z_current ≥ z) then
12:   tabu_list ⇐ ((x_min ≤ x_max) ∧ (y_min ≤ y_max) ∧ (z_before ≤ z_max))
13: end if
We consider tasks in which an agent interacts with an environment. In this case, the AAV moves step by step in a sequence of observations, actions and rewards. Taking AAV mobility into consideration, we use 7 movement patterns (up, down, forward, back, left, right, stop). In order to decide the reward function, we considered the Distance between AAV and Obstacle (DAO) parameter. The initial weight values are assigned by Normal Initialization [23]. The input layer uses the positions of the AAV and the destination, the total reward values in the experience memory and the AAV movement patterns. The hidden layer is connected with 256 rectifier units of Rectified Linear Units (ReLU) [24]. The output Q-values are the AAV movement patterns.
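The experience memory and the ε-greedy selection described above can be sketched as follows, using the sizes and schedule of Table 3; apart from those parameters, the structure is our own illustration (the testbed itself is implemented in Rust).

import random
from collections import deque

MOVES = ["up", "down", "forward", "back", "left", "right", "stop"]
memory = deque(maxlen=300 * 100)  # experience memory size (Table 3)

def remember(m_t, a_t, r_t, m_next):
    # Store e_t = (m_t, a_t, r_t, m_{t+1}) for experience replay.
    memory.append((m_t, a_t, r_t, m_next))

def select_action(q_values, episode, num_episodes):
    # epsilon = 0.999 - t / (number of episodes), as in Table 3.
    eps = 0.999 - episode / num_episodes
    if random.random() < eps:
        return random.choice(MOVES)  # explore
    return MOVES[q_values.index(max(q_values))]  # exploit

def sample_batch(batch_size=32):  # batch size (Table 3)
    return random.sample(memory, batch_size)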
3 TLS-DQN
The idea of the Tabu List Strategy (TLS) is motivated by Tabu Search (TS), proposed by F. Glover [21] to achieve an efficient search for various optimization problems by prohibiting movements to previously visited search areas in order to prevent getting stuck in local optima.

r = 3 if ((x_current = x_global_destination) ∧ (y_current = y_global_destination) ∧ (z_current = z_global_destination)) ∨ ((x_before < x_current) ∧ (x_current ≤ x_local_destination)) ∨ ((x_before > x_current) ∧ (x_current ≥ x_local_destination)) ∨ ((y_before < y_current) ∧ (y_current ≤ y_local_destination)) ∨ ((y_before > y_current) ∧ (y_current ≥ y_local_destination)) ∨ ((z_before < z_current) ∧ (z_current ≤ z_local_destination)) ∨ ((z_before > z_current) ∧ (z_current ≥ z_local_destination));
r = −1 otherwise.    (4)

In this paper, the reward value is decided by Eq. (4), where x, y and z denote the X-axis, Y-axis and Z-axis, respectively. The current means the current
Fig. 4. Tabu rule addition method.
coordinates of the actor node in the DQN, and the before means the coordinates before the action is selected and the move is made. Also, the global destination means the destination in the problem area, and the local destinations are the target passage points on the way to the global destination. The considered area is partitioned based on the target passage points, and one destination is set in each area. If the current coordinate is closer to the destination than the coordinate before the move, or if the current coordinate is equal to the destination, the reward value is 3. In all other cases, the reward value is −1. The tabu list in TLS is used when an actor node of the DQN selects an action or when the reward for that action is determined. The tabu list is consulted during action selection if the direction of movement has been randomly determined: if the direction of the movement area is included in the tabu list, the actor node reselects the action. The tabu list is also used when the reward is determined: when the reward value is 3, the prohibited area is added to the tabu list based on the rule shown in Algorithm 1. The tabu list holds the added prohibited areas until the end of the episode and is initialized for each episode. Figure 4 shows an example of adding a prohibited area to the tabu list according to Algorithm 1. The n in Fig. 4 is a natural number and refers to the number of iterations in each episode. In Step [n] of Fig. 4, the actor node has moved in the Y-axis direction and is closer to the destination than before the move, so (y_before < y_current) and (y_current ≤ y_local_destination) in Algorithm 1 are satisfied. Therefore, the black-filled area [(x_min ≤ x_max), (y_min ≤ y_before), (z_min ≤ z_max)] is added to the tabu list. Also, in Step [n+1], the actor node has moved in the X-axis direction and is closer to the destination than before the move, so (x_before < x_current) and (x_current ≤ x_local_destination) in Algorithm 1 are satisfied. Therefore, the black-filled area [(x_min ≤ x_before), (y_min ≤ y_max), (z_min ≤ z_max)] is added to the tabu list. The search by TLS-DQN covers a wider range and is better than a search with random movement directions.
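A compact sketch of the TLS reward of Eq. (4) and the tabu-list check is given below; coordinates are (x, y, z) tuples, and the region representation is our own simplification of Algorithm 1.

def tls_reward(curr, before, local_dest, global_dest):
    # Eq. (4): +3 on reaching the global destination or moving closer
    # to the local destination along any axis; otherwise -1.
    if curr == global_dest:
        return 3
    for i in range(3):  # x, y, z axes
        if before[i] < curr[i] <= local_dest[i]:
            return 3
        if before[i] > curr[i] >= local_dest[i]:
            return 3
    return -1

tabu_list = []  # list of (lo, hi) corner pairs, cleared every episode

def is_tabu(position):
    # Reject a randomly chosen move whose area is already prohibited.
    return any(all(lo[i] <= position[i] <= hi[i] for i in range(3))
               for (lo, hi) in tabu_list)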
(a) From the initial placement to the corner.
(b) From the corner to the initial placement.
(c) From the global destination to the corner.
(d) From the corner to the global destination.
Fig. 5. Snapshot of considered area.
Fig. 6. Considered area for simulation.

Table 2. Segmentation of considered area.
Area no.  X-axis range       Y-axis range       Z-axis range
1         0 ≤ x ≤ 200        0 ≤ y < 300        0 ≤ z ≤ 293
2         0 ≤ x ≤ 200        300 ≤ y < 1990     0 ≤ z ≤ 293
3         0 ≤ x < 200        1990 ≤ y ≤ 2200    0 ≤ z ≤ 350
4         200 ≤ x < 400      1990 ≤ y ≤ 2200    0 ≤ z ≤ 350
5         400 ≤ x < 1500     1990 ≤ y ≤ 2200    0 ≤ z ≤ 350
6         1500 ≤ x ≤ 1800    1990 ≤ y ≤ 2200    0 ≤ z ≤ 350
Table 3. Simulation parameters of DQN.
Parameters                        Values
Number of episode                 50000
Number of iteration               2000
Number of hidden layers           3
Number of hidden units            15
Initial weight value              Normal Initialization
Activation function               ReLU
Action selection probability (ε)  0.999 − (t/Number of episode) (t = 0, 1, 2, ..., Number of episode)
Learning rate (α)                 0.04
Discount rate (γ)                 0.9
Experience memory size            300 × 100
Batch size                        32
Number of AAV                     1
(a) Normal DQN. (b) TLS-DQN.
Fig. 7. Simulation results of rewards.
4 Simulation Results
We consider for the simulations operations such as takeoffs, flights and landings between the initial position and the destination. The considered area for the simulation scenarios is an actual environment such as a corridor, which is a single-path environment including a corner. In Fig. 5 and Fig. 6, the red-filled area indicates the corner of the simulation environment and the blue-filled area indicates the floor surface. Figure 5 shows snapshots of the area used in the simulation scenario, taken on the ground floor of Buildings C4 and C5 at Okayama University of Science, Japan. Figure 6 shows the considered area based on the actual measurements of the area in Fig. 5. Table 2 shows the partitioning of the problem area. The initial placement is [100, 150, 0]. The local destinations in areas 1, 2, 3, 4, 5 and 6 are [100, 300, 150], [100, 1990, 150], [200, 2200, 150], [400, 2100,
Simulation Results of a DQN Based AAV Testbed in Corner Environment
165
Initial Placement Local Destination Global Destination
35 30 25 20
Z-axis 15 10 5 0
-20
0
20
40
60
X-axis
80 100 120 140 160 180 0
50
100
150
200
250
Y-axis
(a) Normal DQN. Initial Placement Local Destination Global Destination
35 30 25 20
Z-axis 15 10 5 0
-20
0
20
40
60
X-axis
80 100 120 140 160 180 0
50
100
150
200
250
Y-axis
(b) TLS-DQN.
Fig. 8. Visualization results.
150], [1500, 2100, 150], and [1650, 2100, 0] (global destination), respectively. Table 3 shows the parameters used in the simulation. Figure 7 shows the change in reward value of the action in each iteration for Worst, Median, and Best episodes in normal DQN (which does not take the tabu list into account) and TLS-DQN. The TLS-DQN outperformed the normal DQN for Median and Best episodes. This is because in each episode the normal DQN can re-explore already explored areas and obtain rewards, whereas TLS-DQN is restricted for the re-explored areas. The TLS-DQN has low reward for the explored areas that are far away from the global destination or local destination. Also, in the Best episode of TLS-DQN, the gradient increased from approximately 1600 iterations, which results in reaching the global destination. Figure 8 visualizes the movement of coordinates in episodes with the Best reward in normal DQN and TLS-DQN. From Fig. 8, the normal DQN repeatedly explores the close area of the initial position, whereas the TLS-DQN can reach the global destination. Therefore, the performance of TLS-DQN is better than normal DQN from the visualization results. The simulation results show that TLS-DQN is applicable to action decisions in corner environment.
166
5
N. Saito et al.
Conclusion
In this paper, we considered corner environment as a new simulation scenario and carried out simulations for normal DQN and TLS-DQN for mobility control of AAV. From simulation results, we concluded as follows. • For Median, and Best episodes, the reward value of TLS-DQN is better than normal DQN. • The visualization results of the trajectories of AAV movement show that the performance of TLS-DQN is better than normal DQN because the TLS-DQN has reached the global destination. • The TLS-DQN is a good approach for corner environment. In the future, we would like to improve the TLS-DQN for AAV mobility by considering different scenarios. In addition, we would like to develop automatic settings for the destination or target passage point of DQN by Simultaneous Localization and Mapping (SLAM) using Light Detection and Ranging (LiDAR). Acknowledgements. This work was supported by JSPS KAKENHI Grant Number 20K19793.
References 1. St¨ ocker, C., Bennett, R., Nex, F., Gerke, M., Zevenbergen, J.: Review of the Current State of UAV Regulations. Remote Sens. 9(5), 1–26 (2017) 2. Artemenko, O., Dominic, O., Andryeyev, O., Mitschele-Thiel, A.: Energy-aware trajectory planning for the localization of mobile devices using an unmanned aerial vehicle. In: Proceedings of the 25th International Conference on Computer Communication and Networks, ICCCN-2016, pp. 1–9 (2016) 3. Popovi´c, M., et al.: An informative path planning framework for UAV-based terrain monitoring. Auton. Robot. 44, 889–911 (2020) 4. Nguyen, H., Chen, F., Chesser, J., Rezatofighi, H., Ranasinghe, D.: LAVAPilot: lightweight UAV trajectory planner with situational awareness for embedded autonomy to track and locate radio-tags, pp. 1–8. arXiv:2007.15860 (2020) 5. Oda, T., Obukata, R., Ikeda, M., Barolli, L., Takizawa, M.: Design and implementation of a simulation system based on deep q-network for mobile actor node control in wireless sensor and actor networks. In: Proceedings of the 31th IEEE International Conference on Advanced Information Networking and Applications Workshops, IEEE AINA 2017, pp. 195–200 (2017) 6. Saito, N., Oda, T., Hirata, A., Hirota, Y., Hirota, M., Katayama, K.: Design and implementation of a DQN based AAV. In: Proceedings of the 15th International Conference on Broadband and Wireless Computing, Communication and Applications, BWCCA 2020, pp. 321–329 (2020) 7. Sandino, J., Vanegas, F., Maire, F., Caccetta, P., Sanderson, C., Gonzalez, F.: UAV framework for autonomous onboard navigation and people/object detection in cluttered indoor environments. Remote Sens. 12(20), 1–31 (2020)
Simulation Results of a DQN Based AAV Testbed in Corner Environment
167
8. Scherer, J., et al.: An autonomous multi-UAV system for search and rescue. In: Proceedings of the 6th ACM Workshop on Micro Aerial Vehicle Networks, Systems, and Applications, DroNet 2015, pp. 33–38 (2015) 9. Moulton, J., et al.: An autonomous surface vehicle for long term operations. In: Proceedings of the MTS/IEEE OCEANS, pp. 1–10 (2018) 10. Oda, T., Ueda, C., Ozaki, R., Katayama, K.: Design of a deep Q-network based simulation system for actuation decision in ambient intelligence. In: Proceedings of the 33rd International Conference on Advanced Information Networking and Applications, AINA 2019, pp. 362–370 (2019) 11. Oda, T., Matsuo, K., Barolli, L., Yamada, M., Liu, Y.: Design and implementation of an IoT-based e-learning testbed. Int. J. Web Grid Serv. 13(2), 228–241 (2017) 12. Hirota, Y., Oda, T., Saito, N., Hirata, A., Hirota, M., Katatama, K.: Proposal and experimental results of an ambient intelligence for training on soldering iron holding. In: Proceedings of the 15th International Conference on Broadband and Wireless Computing, Communication and Applications, BWCCA 2020, pp. 444453 (2020) 13. Hayosh, D., Liu, X., Lee, K.: Woody: low-cost, open-source humanoid torso robot. In: Proceedings of the 17th International Conference on Ubiquitous Robots, ICUR 2020, pp. 247–252 (2020) 14. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015) 15. Mnih, V., et al.: Playing Atari with deep reinforcement learning, pp. 1–9. arXiv:1312.5602 (2013) 16. Lei, T., Ming, L.: A robot exploration strategy based on Q-learning network. In: IEEE International Conference on Real-time Computing and Robotics, IEEE RCAR-2016, pp. 57–62 (2016) 17. Riedmiller, M.: Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. In: Proceedings of the 16th European Conference on Machine Learning, ECML-2005, pp. 317–328 (2005) 18. Lin, L.J.: Reinforcement Learning for Robots Using Neural Networks. Technical Report, DTIC Document (1993) 19. Lange, S., Riedmiller, M.: Deep auto-encoder neural networks in reinforcement learning. In: Proceedings of the International Joint Conference on Neural Networks, IJCNN 2010, pp. 1–8 (2010) 20. Kaelbling, L.P., Littman, M.L., Cassandra, A.R.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1–2), 99–134 (1998) 21. Glover, F.: Tabu Search - Part I. ORSA J. Comput. 1(3), 190–206 (1989) 22. Takano, K., Oda, T., Kohata, M.: Design of a DSL for converting rust programming language into RTL. In: Proceedings of the 8th International Conference on Emerging Internet, Data & Web Technologies, EIDWT 2020, pp. 342–350 (2020) 23. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, AISTATS-2010, pp. 249–256 (2010) 24. Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, AISTATS 2011, pp. 315–323 (2011)
Stochastic Geometric Analysis of IRS-aided Wireless Networks Using Mixture Gamma Model Yunli Li and Young Jin Chun(B) Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China [email protected], [email protected]
Abstract. An intelligent propagation environment is in massive demand to achieve ubiquitous connectivity for future wireless networks. One novel approach to resolve this demand is by utilizing passive intelligent reflective surfaces (IRS) that can operate with negligible energy and be deployed at low cost. Due to these properties, IRS has recently gained immense attention in the research community and has been studied extensively. However, most of the published work focused on link-level performance without incorporating the impact of the co-channel interference. These limitations motivated us to evaluate the IRS-aided wireless network’s network performance by using a stochastic geometric framework. We utilized the Mixture Gamma model to represent arbitrary fading distribution and derived its statistics. We derived the outage probability in a closedform expression based on the proposed channel model and introduced a tight bound for asymptotic analysis. Our numerical results indicate that the Mixture Gamma model provides an excellent fit to diverse propagation environments, including the line of sight (LOS) and Non-LOS channel, and the majority of the well-known popular fading models.
1 Introduction The notion of an intelligent communication environment (ICE) has gained interest as a future form of wireless networks. Various technologies have been proposed to realize an ICE, and one tentative solution that is recently getting focused on is the intelligent reflecting surface (IRS). The IRS is a revolutionary technology that can achieve high spectrum and energy-efficient communications at a low cost [1, 2]. The recent development of IRS-related research is triggered by the technical advance in meta-surface and graphene. The IRS consists of a massive number of passive reflect elements on the planar surface and a control part that regulates the phase shift and direction adjustment. Meta-surface-based IRS is often used for millimeter-wave (mmWave) systems, whereas the graphene plasmonic-based IRS is employed for terahertz (THz) systems. In contrast to conventional RF chains, the passive IRS elements reflect signals without additional processing, enabling a cost-efficient deployment of the IRS elements. Furthermore, the IRS’s passive elements can effectively suppress self-interference and noise amplification, achieving a quantum leap improvement than the active relay and surfaces. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 168–178, 2022. https://doi.org/10.1007/978-3-030-79728-7_17
Mixture Gamma Channel Model for IRS-aided Wireless Networks
169
1.1 Related Work Motivated by IRS’s potentials, many research works delivered an extensive study on various aspects of the IRS, including the waveform, architecture, hardware, practical constraints, channel models, and deployment method. Furthermore, there are numerous works on IRS-aided wireless communications system on different architecture: channel estimation [3, 4], passive beamforming [5], orthogonal frequency division multiplexing (OFDM) [6], non-orthogonal multiple access (NOMA) [6], D2D network [7], and physical layer security [8–10]. In [12], the authors analyzed a quasi-static phase shift design for the IRS operating over Rician shadowed fading channel [13]. The authors of [11, 14, 15] presented a network-level performance analysis for the IRS-aided wireless network over Rayleigh faded channel. The authors of [15] extended the prior works by introducing Gamma approximation to represent the cascaded channel between the IRS links, which method has been adopted in [16] to analyze the outage probability of the non-orthogonal multiple access (NOMA) in the uplink of a cellular network. The majority of the published work on IRS-aided wireless networks commonly assumed Rayleigh fading as an underlying channel model due to its simplicity and tractability [14]. However, the propagation environments of the IRS-aided wireless network are more complicated than conventional channel models, where traditional Rayleigh fading can not properly represent the realistic propagation environment of the IRS channel. Furthermore, in future wireless networks, it is essential to incorporate the impact of random shadowing on the received signal power, which is caused by random blockages [17]. On the IRS channel, the transmitted signal follows a cascaded channel between the base station (BS), IRS node, and the user equipment (UE), forming both LOS link (BS-UE) and non-LOS (NLOS) link (BS-IRS-UE). For any advanced fading distribution beyond Rayleigh, the cascaded channel model is often intractable or can only be represented by an extremely versatile function, e.g. Meijer-G or Fox-H function. There are several works that considered asymmetric cascaded channels for relay networks, including Rayleigh/Rician channel [18], Nakagami-m/Rician channel [19], and η − μ /κ − μ fading channel [20]. In [21], the authors proposed a N×Nakagami distribution, which consists of N cascaded Nakagami-m faded channels, to represent symmetric cascaded fading in MIMO communications. Most of the prior works on cascaded fading assumed Gaussian approximation and ignored interference. However, the impact of cochannel interference can not be ignored as the IRS elements generate random scattering, increasing the aggregate interference. In [14], the authors incorporated the co-channel interference, but the channel model is approximated by the Gaussian distributions, and the fading distribution is limited to Rayleigh fading. The authors of [16] extended the work to the Nakagami-m fading channel, where the cascaded fading channel is approximated by a Gamma distribution. These limitations motivated us to consider an accurate approximation model that can represent arbitrary fading distribution on the IRS-aided wireless networks and evaluate the network performance metrics.
170
1.2
Y. Li and Y. J. Chun
Contributions
In this paper, we propose a versatile channel model for the IRS-aided wireless network, where we utilize the Mixture Gamma model to represent a cascaded channel of arbitrary fading distributions. We apply the proposed Mixture Gamma model to represent both LOS and NLOS cascaded channels. We develop a unified framework to evaluate IRS-aided wireless networks’ system-level performance metrics over arbitrary fading environments based on the proposed versatile channel model.
2 Mixture Gamma Approximation Based Fading Model In this section, we utilize a Mixture Gamma model to approximate arbitrary channels that includes not only single fading channels but also cascaded double channels. This approach enables us to virtually represent a cascaded double fading link into a direct LOS link without incorporating the obstacles between each channel. In [22], the authors proved that an arbitrary function f (x) with a positive domain x ∈ (0, ∞) can be accurately approximated as a weighted sum of Gamma distribution power density functions (PDFs). We adopt this approach to model a composite fading models in terms of Mixture Gamma distribution as follows N
N
fh (x) = ∑ wi fi (x) = ∑ αi xβi −1 e−ξi x , fi (x) =
i=1 i=1 βi βi −1 −ξi x ξi x e
Γ (βi )
,
wi =
αiΓ (βi ) βi
ξi
(1) ,
where fi (x) is the PDF of a Gamma distribution, αi , βi , ξi are the parameters of the ith Gamma component, N is the number of terms with ∑Ni=1 wi = 1, 0∞ fh (x)dx = 1. Arbitrary complicated distributions can be expressed by the Mixture Gamma distribution in (1), e.g., Gamma-Lognormal, generalized K, Nakagami-m, Rician, κ -μ /η -μ , and κ -μ shadowed [15, 22]. Furthermore, any cascaded double channel can be represented by the Mixture Gamma model as listed below. 1. Double Rayleigh The PDF of two cascaded Rayleigh channel is given as (2), where each single link is distributed by Rayleigh fading with R2 ∼ Γ (1, 2δ1 2 ) N
− 12 2 y 1 −1 4ti δ1 δ2 w t e . 2δ 2 i i 4 δ i=1 1 2
fY (y) = ∑
(2)
2. Double Nakagami-m The PDF of two cascaded Nakagami-m fading channel is given as (3), where each link is distributed by a Gamma distribution Γ (m, m1 ) with parameters m1 for link 1 and m2 for link 2, respectively (m1 m2 )−m1 − 1 y witim2 −m1 −1 ym1 −1 e ti m1 m2 . i=1 Γ (m1 )Γ (m2 ) N
fY (y) = ∑
(3)
Mixture Gamma Channel Model for IRS-aided Wireless Networks
171
3 System Model 3.1 Network Model We consider a downlink (DL) transmission of the IRS-aided wireless network, where the BSs, IRSs, and UEs are modeled by two dimensional (2D) homogeneous Poisson Point Process (HPPP). We denote the PPP for the BSs as ΛB with node density λB , ΛI for the IRSs with density λI , and ΛU for the UEs with density λU , respectively. The typical UE, denoted by UE0 , is assumed to be located at the origin. As illustrated in Fig. 1, there are two types of links between the BS0 and the UE0 ; a direct, LOS link BS0 -UE0 and a cascaded NLOS link BS0 -IRS0 -UE0 . The typical UE UE0 always connect to its nearest BS, denoted as BS0 , and the PDF of the inter-node distance across BS0 -UE0 link is given by 2 fd0 (d0 ) = 2πλB d0 e−λB π d0 . (4) In this paper, we assume that only a single IRS is associated between UE0 and BS0 . It has been proved in [23] that the optimal deployment of a single associated IRS is in the vicinity of either UE0 or BS0 [23]. Due to the symmetry of the cascaded channel, the IRS association policy will connect the UE0 with its nearest IRS, denoted as IRS0 . The IRS association policy has three different modes of operation based on the distance from the UE0 to its nearest IRS, denoted by r0 , as follows. 1. Case 1. If r0 ≤ D1 , the UE0 associate to the nearest IRS IRS0 , which aligns the phase of received signal for the LOS/NLOS link and transmits the signal to UE0 with beamforming (BF). 2. Case 2. If D1 < r0 ≤ D2 , UE0 does not associated with any IRS and the IRS nodes affect the aggregate interference by generating random scattering. 3. Case 3. If r0 > D2 , the interference caused by the IRS nodes can be ignored or treated as an additive Gaussian white noise (AWGN). Given that an IRS is associated to UE0 , the PDF of the link distance IRS0 − UE0 is expressed as follows 2 fr (r) = 2πλI re−λI π r . (5) To further simplify the performance analysis, we assume the following conditions; ( j)
1. The distance from the BS0 to UE UE0 or any IRS j are identical, i.e., lir,0 ≈ d0 , where ( j)
lir,0 is the link distance between BS0 and IRS j . 2. The distance of the BS0 and IRS0 follows the same distribution as the distance of the UE0 and BS0 . ( j) 3. The link distances lir,0 of different IRS elements are identical.
172
Y. Li and Y. J. Chun
Fig. 1. IRS-aided muti-cell wireless network (DL)
3.2
Channel Model
If IRS j is located within D2 , i.e., r0 < D2 , the channel environment forms a composite channel that consists of a LOS link BSm − UE0 and a cascaded NLOS link BSm − IRS j − UE0 . The channel coefficients of a LOS link is denoted by hd,m , and the channel coefficients of a cascaded NLOS link is given as follows ( j)
( j)
( j)
hir,m [hi,m ]T Φ ( j) [hr ] =
K
( j)
( j)
( j)
∑ hir,m,k hr,k eiφk
,
( j)
( j)
( j)
( j)
hir,m,k hi,m,k hr,k eiφk
(6)
k=1 ( j)
where hir,m,k denotes the channel coefficients of BSm − IRS j − UE0 link reflected by ( j)
the n-th IRS element, Φ ( j) represents the phase adjustment by IRS j , φk
is the phase
( j) adjustment of the n-th element of IRS j , hi,m,k is the channel coefficients of the LOS link, ( j) and hr,k is the channel coefficients of the cascaded NLOS link. Based on the Mixture Gamma approximation, we can virtually convert a cascaded NLOS link BSm − IRS j −
UE0 to a LOS link. Since the IRS nodes are uniformly distributed, the average phase adjustment is zero and the phase effects can be ignored for IRS-aided wireless networks. Hence, the channel model only needs to incorporate the channel gain and path loss and we utilize h to denote the channel coefficients. Based on the proposed model, the received signal to interference plus noise ratio (SINR) at UE0 can be expressed as follows
γ
S . I +W
(7)
The term S is the normalized received signal power, I is the normalized interference 2 power, and W δPt is the normalized noise power as follows
Mixture Gamma Channel Model for IRS-aided Wireless Networks
S hd,0 PLd,0 +
∑
( j)
( j)
hir,0 PLir,0 ,
j∈J
I
∑
m∈ΛB \{0}
173
Im =
∑
hd,m PLd,m +
m∈ΛB \{0}
∑
( j) ( j) hir,m PLir,m
(8) ,
j∈J
where the received power S consists of the transmitted power from BS0 −UE0 link and from all IRSs located within the circle with center UE0 and radius D2 . The interference power I includes the received power from all the BSs and IRSs located within the circle with center UE0 and radius D2 , which are the received signal power from BS − UE0 LOS link and all cascaded NLOS links.
4 Performance Analysis In this section, we evaluate the outage probability of the IRS-aided wireless networks based on stochastic geometric analysis [17]. Based on the SINR model in (7), the outage probability Poutage is defined as follows Poutage = 1 − P{γ > γ } = 1 − P{S > γ (I +W )}.
(9)
Since we approximated the conditional received signal power by Gamma distribution, (9) can be further expressed into a finite summation form as follows k−1 Γ (k, x) (−s) p (p) LY (s) , (10) P{γ > γ } = EX =∑ Γ (k) p=0 p! s=1
where k, θ are the parameters of Gamma distribution, Y (p)
γ (I+W ) , LY (s) is the Laplace θ
transform of Y , and LY (s) denotes the p-th order derivative of LY (s). For a given locations of BS0 , we can derive LY (s) in closed form expression
−sY sγ sγ W = exp − (11) LY (s) E e LI|d ,r = eV (s) , 0 0 θ θ d0 ,r0
where δ = α2 , V (s) − sγθW − 2πλBU U (sη , h) =
sγη θ ,h
and U (x, y) is given by
−α δ 2d2 . Eh (sη h)δ d −2 γ 1 − δ , sη hd0−α − 1 − e−sη hd0 2 (p)
(12)
The outage probability can be further analyzed by evaluating LY (s) based on the lower triangular Toeplitz matrix as follows [24] k−1 1 d p C(z) e , (13) P{γ > γ } = ∑ p p=0 p! ds s=1,z=0
174
Y. Li and Y. J. Chun
(−s) n (m) (s), τ = 1 + j + β , and η = sη d −α . The first where C(z) ∑∞ i n=0 cn z , cm = m! V 0 C(z) M coefficients of e form the first column of the matrix exponential eCM , where CM is a M × M lower triangular Toeplitz matrix ⎤ ⎡ c0 ⎥ ⎢ c1 c0 ⎥ ⎢ ⎥ ⎢ c2 c c 1 0 (14) CM = ⎢ ⎥. ⎥ ⎢ .. . .. ⎦ ⎣ . cM−1 · · · c2 c1 c0 m
Although (13) is expressed in a closed form, the expression is computationally challenging and requires iterative search. To improve tractability, we derived a tight upper bound in (15), which can be readily proved by Alzer’s inequality k
Poutage < 1 − ∑ (−1)n+1
k n
e−ξ nγW LI (ξ nγ ),
(15)
n=1
1
where ξ = k(k!)− k and k is parameter of the Gamma distribution.
5 Numerical Results In this section, we introduce numerical results to verify the analytical derivations. Figure 2 illustrates the CDF of the received signal power against the transmit power over various parameters, including the path loss exponent α , the IRS operation mode, and the link distance across UE0 − IRS0 . The double Nakagami-m fading is assumed as the underlying channel model. We observed that the IRS-aided systems could significantly enhance the received signal power compared to conventional wireless networks without IRS. Furthermore, we notice that the received signal power can be increased by either reducing the path loss exponent or decreasing the link distance of UE0 − IRS0 , which can be explained in terms of channel hardening effect [14]. In Fig. 3, we assessed the impact of diverse channel parameters on the received signal power. We illustrated the variation of the received signal powers’ distribution for a diverse combination of the m parameters on both the LOS and NLOS link. We observed that utilizing the IRS node indeed enhances the received power, and the amount of power increment by using IRS increases for a smaller m parameter, i.e., the IRS can greatly improve the received signal power when the channel is under severe fading. Furthermore, we noticed that the received signal power becomes higher as the fading parameters of the underlying channel get improved. Based on the numerical results, we observed that the channel hardening effect gets stronger as the pathloss /or fading becomes severe and the link distance UE0 − IRS0 gets shorter. In Figs. 4 and 5, the outage probabilities are plotted against the SINR threshold. The outage performance is related to the transmission power, path-loss, BS density, IRS elements number, and IRS density. Figure 4 shows that the IRS-aided system achieves significant improvement, especially when the LOS link is under severe fading. By increasing the IRS node density, the outage probability gets smaller, achieving a better
Mixture Gamma Channel Model for IRS-aided Wireless Networks
Fig. 2. CDF of the mean signal power with different settings
Fig. 3. CDF of conditional received power with different m on each link
175
Y. Li and Y. J. Chun
1 No IRS IRS BF, IRS density:0.002 IRS BF, IRS density:0.003
0.9 0.8
outage probability
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 -15
-10
-5
0
5
10
SINR threshold (dB)
Fig. 4. Outage probability with different IRS density
1 transmit power = 1 transmit power = 10 transmit power = 20
0.9
0.8
0.7
outage probability
176
0.6
0.5
0.4
0.3
0.2
0.1
0 -10
-8
-6
-4
-2
0
2
4
6
8
SINR threshold (dB)
Fig. 5. Outage probability with different transmitting powers
10
12
Mixture Gamma Channel Model for IRS-aided Wireless Networks
177
end-to-end performance. Figure 5 illustrates the change in outage probability versus different transmit power. It is clear that transmitting with a higher power decreases the outage probability.
6 Conclusion In this work, we proposed a uniform framework for analyzing IRS-aided wireless networks over an arbitrary fading environment. First, we utilized the Mixture Gamma distribution model to approximate the channel coefficients of the LOS link and NLOS link for arbitrary fading. Second, we derived the conditional signal power for a given link distance and approximated the conditional mean signal power into Gamma distribution. Based on the three available operational modes of the IRS, we derived the Laplace transform of conditional interference power for a given link distance BS0 −UE0 . Finally, we derived the outage probability in a closed-form expression and provided the tight lower bound by applying Alzer’s inequality. The simulation results validated the analytical derivation. Acknowledgment. This work was supported in part by the City University of Hong Kong (CityU), Startup Grant 7200618, and in part by the CityU, Strategic Research Grant 21219520.
References 1. Tsilipakos, O., Tasolamprou, A.C., Pitilakis, A., Liu, F., Wang, X., Mirmoosa, M.S., Tzarouchis, D.C., Abadal, S., Taghvaee, H., Liaskos, C., et al.: Toward intelligent metasurfaces: the progress from globally tunable metasurfaces to software-defined metasurfaces with an embedded network of controllers. Adv. Opt. Mater. 8(17), 2000783 (2020) 2. Wu, Q., Zhang, R.: towards smart and reconfigurable environment: intelligent reflecting surface aided wireless network. IEEE Commun. Mag. 58(1), 106–112 (2019) 3. Wang, Z., Liu, L., Cui, S.: Channel estimation for intelligent reflecting surface assisted multiuser communications. In: 2020 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–6. IEEE (2020) 4. You, C., Zheng, B., Zhang, R.: Channel estimation and passive beamforming for intelligent reflecting surface: discrete phase shift and progressive refinement. IEEE J. Sel. Areas Commun. 38(11), 2604–2620 (2020) 5. Zhao, M.-M., Wu, Q., Zhao, M.-J., Zhang, R.: Intelligent reflecting surface enhanced wireless network: Two-timescale beamforming optimization. IEEE Trans. Wirel. Commun. 20(1), 2–17 (2020) 6. Yang, G., Xu, X., Liang, Y.-C.: Intelligent reflecting surface assisted non-orthogonal multiple access. In: IEEE Wireless Communications and Networking Conference (WCNC), vol. 2020, pp. 1–6. IEEE (2020) 7. Yin, R., Zhong, C., Yu, G., Zhang, Z., Wong, K.K., Chen, X.: Joint spectrum and power allocation for d2d communications underlaying cellular networks. IEEE Trans. Veh. Technol. 65(4), 2182–2195 (2015) 8. Cui, M., Zhang, G., Zhang, R.: Secure wireless communication via intelligent reflecting surface. IEEE Wirel. Commun. Lett. 8(5), 1410–1414 (2019) 9. Chen, J., Liang, Y.-C., Pei, Y., Guo, H.: Intelligent reflecting surface: a programmable wireless environment for physical layer security. IEEE Access 7, 82 599–82 612 (2019)
178
Y. Li and Y. J. Chun
10. Xu, D., Yu, X., Sun, Y., Ng, D.W.K., Schober, R.: Resource allocation for secure IRS-assisted multiuser miso systems. In: IEEE Globecom Workshops (GC Wkshps), vol. 2019, pp. 1–6. IEEE (2019) 11. Wu, Q., Zhang, S., Zheng, B., You, C., Zhang, R.: Intelligent reflecting surface aided wireless communications: A tutorial. arXiv:2007.02759 (2020) 12. Jia, Y., Ye, C., Cui, Y.: Analysis and optimization of an intelligent reflecting surface-assisted system with interference. arXiv preprint arXiv:2002.00168 (2020) 13. Paris, J.F.: Statistical characterization of κ -μ shadowed fading. IEEE Trans. Veh. Technol. 63(2), 518–526 (2013) 14. Lyu, J., Zhang, R.: Hybrid active/passive wireless network aided by intelligent reflecting surface: System modeling and performance analysis. arXiv preprint arXiv:2004.13318 (2020) 15. Atapattu, S., Fan, R., Dharmawansa, P., Wang, G., Evans, J., Tsiftsis, T.A.: Reconfigurable intelligent surface assisted two-way communications: Performance analysis and optimization. IEEE Trans. Commun. 68(10), 6552–6567 (2020) 16. Tahir, B., Schwarz, S., Rupp, M.: Analysis of uplink IRS-assisted NOMA under Nakagamim fading via moments matching. arXiv preprint arXiv:2009.03133 (2020) 17. Chun, Y.J., Cotton, S.L., Dhillon, H.S., Lopez-Martinez, F.J., Paris, J.F., Yoo, S.K.: A comprehensive analysis of 5g heterogeneous cellular systems operating over κ -μ shadowed fading channels. IEEE Trans. Wirel. Commun. 16(11), 69957010 (2017) 18. Duong, T.Q., Shin, H., Hong, E.-K.: Effect of line-of-sight on dualhop nonregenerative relay wireless communications. In: IEEE 66th Vehicular Technology Conference, vol. 2007, pp. 571–575. IEEE (2007) 19. Gurung, A.K., Al-Qahtani, F.S., Hussain, Z.M., Alnuweiri, H.: Performance analysis of amplify-forward relay in mixed Nakagami-m and Rician fading channels. In: The 2010 International Conference on Advanced Technologies for Communications, pp. 321–326. IEEE (2010) 20. Peppas, K.P., Alexandropoulos, G.C., Mathiopoulos, P.T.: Performance analysis of dual-hop AF relaying systems over mixed η -μ and κ -μ fading channels. IEEE Trans. Veh. Technol. 62(7), 3149–3163 (2013) 21. Karagiannidis, G.K., Sagias, N.C., Mathiopoulos, P.T.: N*nakagami: a novel stochastic model for cascaded fading channels. IEEE Trans. Commun. 55(8), 1453–1458 (2007) 22. Atapattu, S., Tellambura, C., Jiang, H.: A mixture gamma distribution to model the snr of wireless channels. IEEE Trans. Wirel. Commun. 10(12), 4193–4203 (2011) 23. You, C., Zheng, B., Zhang, R.: How to deploy intelligent reflecting surfaces in wireless network: Bs-side, user-side, or both sides? arXiv preprintarXiv:2012.03403 (2020) 24. Yu, X., Zhang, J., Haenggi, M., Letaief, K.B.: Coverage analysis for millimeter wave networks: The impact of directional antenna arrays. IEEE J. Sel. Areas Commun. 35(7), 1498– 1512 (2017)
Performance Evaluation of CM and RIWM Router Replacement Methods for WMNs by WMN-PSOHC Hybrid Intelligent Simulation System Considering Chi-square Distribution of Mesh Clients Shinji Sakamoto1(B) , Yi Liu2 , Leonard Barolli3 , and Shusuke Okamoto1 1
Department of Computer and Information Science, Seikei University, 3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan [email protected], [email protected] 2 Department of Computer Science, National Institute of Technology, Oita College, 1666, Maki, Oita 870-0152, Japan [email protected] 3 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan [email protected]
Abstract. Wireless Mesh Networks (WMNs) have many advantages such as easy maintenance, low upfront cost and high robustness. However, WMNs have some problems such as node placement problem, security, transmission power and so on. In this work, we deal with node placement problem. In our previous work, we implemented a hybrid simulation system based on Particle Swarm Optimization (PSO) and Hill Climbing (HC) called WMN-PSOHC for solving the node placement problem in WMNs. In this paper, we present the performance evaluation of two router replacements methods: Constriction Method(CM) and Random Inertia Weight Method (RIWM), for WMNs by WMN-PSOHC intelligent system considering Chi-square distribution of mesh clients. The simulation results show that RIWM has a better performance than CM.
1 Introduction The wireless networks and devices are becoming increasingly popular and they provide users access to information and communication anytime and anywhere [2–4, 9, 10, 12]. Wireless Mesh Networks (WMNs) are gaining a lot of attention because of their low cost nature that makes them attractive for providing wireless Internet connectivity. A WMN is dynamically self-organized and self-configured, with the nodes in the network automatically establishing and maintaining mesh connectivity among them-selves (creating, in effect, an ad hoc network). This feature brings many advantages to WMNs such as low up-front cost, easy network maintenance, robustness and reliable service coverage [1]. Moreover, such infrastructure can be used to deploy community networks, metropolitan area networks, municipal and corporative networks, and to support applications for urban areas, medical, transport and surveillance systems. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 179–187, 2022. https://doi.org/10.1007/978-3-030-79728-7_18
180
S. Sakamoto et al.
In this work, we deal with node placement problem in WMNs. We consider the version of the mesh router nodes placement problem in which we are given a grid area where to deploy a number of mesh router nodes and a number of mesh client nodes of fixed positions (of an arbitrary distribution) in the grid area. The objective is to find a location assignment for the mesh routers to the cells of the grid area that maximizes the network connectivity and client coverage. Network connectivity is measured by Size of Giant Component (SGC) of the resulting WMN graph, while the user coverage is simply the number of mesh client nodes that fall within the radio coverage of at least one mesh router node and is measured by Number of Covered Mesh Clients (NCMC). Node placement problems are known to be computationally hard to solve [24]. In some previous works, intelligent algorithms have been recently investigated [11, 18, 19]. We already implemented a Particle Swarm Optimization (PSO) based simulation system, called WMN-PSO [15]. Also, we implemented a simulation system based on Hill Climbing (HC) for solving node placement problem in WMNs, called WMN-HC [14]. In our previous work [15, 17], we presented a hybrid intelligent simulation system based on PSO and HC. We called this system WMN-PSOHC. In this paper, we analyze the performance of Constriction Method (CM) and Random Inertia Weight Method (RIWM) for WMNs by WMN-PSOHC simulation system considering Chi-square distribution of mesh clients. The rest of the paper is organized as follows. We present our designed and implemented hybrid simulation system in Sect. 2. In Sect. 3, we introduce WMN-PSOHC Web GUI tool. The simulation results are given in Sect. 4. Finally, we give conclusions and future work in Sect. 5.
2 Proposed and Implemented Simulation System 2.1
Particle Swarm Optimization
In Particle Swarm Optimization (PSO) algorithm, a number of simple entities (the particles) are placed in the search space of some problem or function and each evaluates the objective function at its current location. The objective function is often minimized and the exploration of the search space is not through evolution [13]. However, following a widespread practice of borrowing from the evolutionary computation field, in this work, we consider the bi-objective function and fitness function interchangeably. Each particle then determines its movement through the search space by combining some aspect of the history of its own current and best (best-fitness) locations with those of one or more members of the swarm, with some random perturbations. The next iteration takes place after all particles have been moved. Eventually the swarm as a whole, like a flock of birds collectively foraging for food, is likely to move close to an optimum of the fitness function. Each individual in the particle swarm is composed of three D-dimensional vectors, where D is the dimensionality of the search space. These are the current positionxi , the previous best position pi and the velocity vi . The particle swarm is more than just a collection of particles. A particle by itself has almost no power to solve any problem; progress occurs only when the particles interact. Problem solving is a population-wide phenomenon, emerging from the individual
Performance Evaluation of CM and RIWM Router Replacement Methods
181
behaviors of the particles through their interactions. In any case, populations are organized according to some sort of communication structure or topology, often thought of as a social network. The topology typically consists of bidirectional edges connecting pairs of particles, so that if j is in i’s neighborhood, i is also in j’s. Each particle communicates with some other particles and is affected by the best point found by any member of its topological neighborhood. This is just the vector pi for that best neighbor, which we will denote with pg . The potential kinds of population “social networks” are hugely varied, but in practice certain types have been used more frequently. In the PSO process, the velocity of each particle is iteratively adjusted so that the particle stochastically oscillates around pi and pg locations. 2.2 Hill Climbing Hill Climbing (HC) algorithm is a heuristic algorithm. The idea of HC is simple. In HC, the solution s is accepted as the new current solution if δ ≤ 0 holds, where δ = f (s ) − f (s). Here, the function f is called the fitness function. The fitness function gives points to a solution so that the system can evaluate the next solution s and the current solution s. The most important factor in HC is to define effectively the neighbor solution. The definition of the neighbor solution affects HC performance directly. In our WMNPSOHC system, we use the next step of particle-pattern positions as the neighbor solutions for the HC part.
Fig. 1. Chi-square distribution of mesh clients.
2.3 WMN-PSOHC System Description In following, we present the initialization, particle-pattern, fitness function and router replacement methods. Initialization Our proposed system starts by generating an initial solution randomly, by ad hoc methods [25]. We decide the velocity of particles by a random process considering the area size. For instance, when the area size is W × H, the velocity is decided randomly from
182
S. Sakamoto et al.
√ √ − W 2 + H 2 to W 2 + H 2 . Our system can generate many client distributions. In this paper, we consider Chi-square distribution of mesh clients as shown in Fig. 1. Particle-Pattern A particle is a mesh router. A fitness value of a particle-pattern is computed by combination of mesh routers and mesh clients positions. In other words, each particle-pattern is a solution as shown is Fig. 2. Therefore, the number of particle-patterns is a number of solutions. Fitness Function One of most important thing is to decide the determination of an appropriate objective function and its encoding. In our case, each particle-pattern has an own fitness value and compares other particle-patterns fitness value in order to share information of global solution. The fitness function follows a hierarchical approach in which the main objective is to maximize the SGC in WMN. Thus, we use α and β weight-coefficients for the fitness function and the fitness function of this scenario is defined as: Fitness = α × SGC(xi j , yi j ) + β × NCMC(xi j , yi j ). Router Replacement Methods A mesh router has x, y positions and velocity. Mesh routers are moved based on velocities. There are many router replacement methods in PSO field [8, 21–23]. In this paper, we consider CM and RIWM. Constriction Method (CM) CM is a method which PSO parameters are set to a week stable region (ω = 0.729, C1 = C2 = 1.4955) based on analysis of PSO by M. Clerc et al. [4, 5, 8]. Random Inertia Weight Method (RIWM) In RIWM, the ω parameter is changing randomly from 0.5 to 1.0. The C1 and C2 are kept 2.0. The ω can be estimated by the week stable region. The average of ω is 0.75 [20, 22]. Linearly Decreasing Inertia Weight Method (LDIWM) In LDIWM, C1 and C2 are set to 2.0, constantly. On the other hand, the ω parameter is changed linearly from unstable region (ω = 0.9) to stable region (ω = 0.4) with increasing of iterations of computations [6, 23]. Linearly Decreasing Vmax Method (LDVM) In LDVM, PSO parameters are set to unstable region (ω = 0.9, C1 = C2 = 2.0). A value of Vmax which is maximum velocity of particles is considered. With increasing of iteration of computations, the Vmax is kept decreasing linearly [7, 21]. Rational Decrement of Vmax Method (RDVM) In RDVM, PSO parameters are set to unstable region (ω = 0.9, C1 = C2 = 2.0). A value of Vmax which is maximum velocity of particles is considered. The Vmax is kept decreasing with the increasing of iterations as T −x Vmax (x) = W 2 + H 2 × . x Where, W and H are the width and the height of the considered area, respectively. Also, T and x are the total number of iterations and a current number of iteration, respectively [16].
Performance Evaluation of CM and RIWM Router Replacement Methods
183
Fig. 2. Relationship among global solution, particle-patterns and mesh routers.
Fig. 3. System structure for web interface.
Fig. 4. WMN-PSOHC Web GUI tool.
3 WMN-PSOHC Web GUI Tool The Web application follows a standard Client-Server architecture and is implemented using LAMP (Linux + Apache + MySQL + PHP) technology (see Fig. 3). We show the WMN-PSOHC Web GUI tool in Fig. 4. Remote users (clients) submit their requests by completing first the parameter setting. The parameter values to be provided by the user are classified into three groups, as follows.
184
S. Sakamoto et al. Table 1. Parameter settings. Parameters
Values
Clients distribution
Chi-square distribution
Area size
32.0 × 32.0
Number of mesh routers
16
Number of mesh clients
48
Total iterations
800
Iteration per phase
4
Number of particle-patterns
9
Radius of a mesh router
From 2.0 to 3.0
Fitness function weight-coefficients (α , β ) 0.7, 0.3 Replacement method
(a) CM
CM, RIWM
(b) RIWM
Fig. 5. Simulation results of WMN-PSOHC for SGC.
• Parameters related to the problem instance: These include parameter values that determine a problem instance to be solved and consist of number of router nodes, number of mesh client nodes, client mesh distribution, radio coverage interval and size of the deployment area. • Parameters of the resolution method: Each method has its own parameters. • Execution parameters: These parameters are used for stopping condition of the resolution methods and include number of iterations and number of independent runs. The former is provided as a total number of iterations and depending on the method is also divided per phase (e.g., number of iterations in a exploration). The later is used to run the same configuration for the same problem instance and parameter configuration a certain number of times.
4 Simulation Results In this section, we show simulation results using WMN-PSOHC system. In this work, we consider Chi-square distribution of mesh clients. The number of mesh routers is considered 16 and the number of mesh clients 48. We consider the number of particlepatterns 9. We conducted simulations 10 times, in order to avoid the effect of randomness and create a general view of results. The total number of iterations is considered
Performance Evaluation of CM and RIWM Router Replacement Methods
(a) CM
185
(b) RIWM
Fig. 6. Simulation results of WMN-PSOHC for NCMC.
(a) CM
(b) RIWM
Fig. 7. Visualized image of simulation results for different clients.
800 and the iterations per phase is considered 4. We show the parameter setting for WMN-PSOHC in Table 1. We show the simulation results in Fig. 5 and Fig. 6. For SGC, both router replacement methods reach the maximum (100%). This means that all mesh routers are connected to each other. For the NCMC, RIWM converges faster than CM. This is because the RIWM can search a solution space widely. In Fig. 7, we show the visualized images of the simulation results for both router replacement methods. We see that some mesh clients are not covered for both methods. However, the number of covered mesh clients by RIWM is higher than CM. Therefore, we conclude that the performance of RIWM is better than CM.
5 Conclusions In this work, we evaluated the performance of CM and RIWM router replacement methods for WMNs by WMN-PSOHC hybrid intelligent simulation system. Simulation results show that the performance of RIWM is better compared with CM.
186
S. Sakamoto et al.
In our future work, we would like to evaluate the performance of the proposed system for different parameters and scenarios.
References 1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Networks 47(4), 445–487 (2005) 2. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: A hybrid simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs: performance evaluation considering normal and uniform distribution of mesh clients. In: International Conference on Network-Based Information Systems, pp 42–55. Springer (2018) 3. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs considering different distributions of mesh clients. In: Barolli, L., Xhafa, F., Javaid, N., Enokido, T. (eds.) IMIS 2018. AISC, vol. 773, pp. 32–45. Springer, Cham (2019). https://doi.org/10. 1007/978-3-319-93554-6 3 4. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance evaluation of WMNPSODGA system for node placement problem in WMNs considering four different crossover methods. In: The 32nd IEEE International Conference on Advanced Information Networking and Applications (AINA-2018), pp. 850–857. IEEE (2018) 5. Barolli, A., Sakamoto, S., Durresi, H., Ohara, S., Barolli, L., Takizawa, M.: A comparison study of constriction and linearly decreasing vmax replacement methods for wireless mesh networks by WMN-PSOHC-DGA simulation system. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp 26–34. Springer (2019) 6. Barolli, A., Sakamoto, S., Ohara, S., Barolli, L., Takizawa, M.: Performance analysis of WMNs by WMN-PSOHC-DGA simulation system considering linearly decreasing inertia weight and linearly decreasing vmax replacement methods. In: International Conference on Intelligent Networking and Collaborative Systems, pp 14–23. Springer (2019) 7. Barolli, A., Sakamoto, S., Ohara, S., Barolli, L., Takizawa, M.: Performance analysis of WMNs by WMN-PSOHC-DGA simulation system considering random inertia weight and linearly decreasing vmax router replacement methods. In: Conference on Complex, Intelligent, and Software Intensive Systems, pp 13–21. Springer (2019) 8. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002) 9. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance analysis of WMNs by WMN-GA simulation system for two WMN architectures and different TCP congestion-avoidance algorithms and client distributions. Int. J. Commun. Networks Distrib. Syst. 20(3), 335–351 (2018) 10. Ohara, S., Barolli, A., Sakamoto, S., Barolli, L.: Performance analysis of WMNs by WMNPSODGA simulation system considering load balancing and client uniform distribution. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp 25–38. Springer (2019) 11. Ozera, K., Bylykbashi, K., Liu, Y., Barolli, L.: A fuzzy-based approach for cluster management in VANETs: performance evaluation for two fuzzy-based systems. Internet Things 3, 120–133 (2018) 12. Ozera, K., Inaba, T., Bylykbashi, K., Sakamoto, S., Ikeda, M., Barolli, L.: A WLAN Triage Testbed Based on Fuzzy Logic and Its Performance Evaluation for Different Number of Clients and Throughput Parameter. International Journal of Grid and Utility Computing 10(2), 168–178 (2019)
Performance Evaluation of CM and RIWM Router Replacement Methods
187
13. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007) 14. Sakamoto, S., Lala, A., Oda, T., Kolici, V., Barolli, L., Xhafa, F.: Analysis of WMN-HC simulation system data using friedman test. In: The Ninth International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2015), IEEE, pp. 254–259 (2015) 15. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evaluation of a simulation system based on particle swarm optimisation for node placement problem in wireless mesh networks. Int. J. Commun. Networks Distr. Syst. 17(1), 1–13 (2016) 16. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a new replacement method in WMN-PSO simulation system and its performance evaluation. In: The 30th IEEE International Conference on Advanced Information Networking and Applications (AINA2016), pp 206–211 (2016). https://doi.org/10.1109/AINA.2016.42 17. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mobile Networks Appl. 23(1), 27–33 (2018) 18. Sakamoto, S., Barolli, A., Barolli, L., Okamoto, S.: Implementation of a web interface for hybrid intelligent systems. Int. J. Web Inf. Syst. 15(4), 420–431 (2019) 19. Sakamoto, S., Barolli, L., Okamoto, S.: WMN-PSOSA: an intelligent hybrid simulation system for WMNs and its performance evaluations. Int. J. Web Grid Serv. 15(4), 353–366 (2019) 20. Sakamoto, S., Ohara, S., Barolli, L., Okamoto, S.: Performance evaluation of WMNs by WMN-PSOHC system considering random inertia weight and linearly decreasing vmax replacement methods. In: International Conference on Network-Based Information Systems, pp 27–36. Springer (2019) 21. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Global Optim. 31(1), 93–108 (2005) 22. Shi, Y.: Particle swarm optimization. IEEE Connections 2(1), 8–13 (2004) 23. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Evolutionary Programming VII, pp 591–600 (1998) 24. Wang, J., Xie, B., Cai, K., Agrawal, D.P.: Efficient mesh router placement in wireless mesh networks. In: Proceedings of IEEE Internatonal Conference on Mobile Adhoc and Sensor Systems (MASS-2007), pp 1–9 (2007) 25. Xhafa, F., Sanchez, C., Barolli, L.: Ad hoc and neighborhood search methods for placement of mesh routers in wireless mesh networks. In: Proceedings of 29th IEEE International Conference on Distributed Computing Systems Workshops (ICDCS-2009), pp. 400–405 (2009)
Web Page Classification Based on Graph Neural Network Tao Guo(B) and Baojiang Cui Beijing University of Posts and Telecommunications, Beijing, China [email protected]
Abstract. Web page, a kind of semi-structured document, includes a lot of additional attribute content besides text information. Traditional web page classification technology is mostly based on text classification methods. They ignore the additional attribute information of web page text. We propose WEB-GNN, an approach for Web page classification. There are two major contributions to this work. First, we propose a web page graph representation method called W2G that reconstructs text nodes into graph representation based on text visual association relationship and DOM-tree hierarchy relationship and realizes the efficient integration of web page content and structure. Our second contribution is to propose a web page classification method based on graph convolutional neural network. It takes the web page graph representation as to the input, integrates text features and structure features through graph convolution layer, and generates the advanced webpage feature representation. Experimental results on the Web-black dataset suggest that the proposed method significantly outperforms text-only method.
1 Introduction With the fast development of the network information technology industry, especially in the context of intelligent network digitalization, the content related to the network black and gray industry is quietly spreading on the Internet. Therefore, the security of network content has been paid more and more attention by security researchers. At present, most of the research focuses on the field of hacker attack and defense in network hacking, such as abnormal traffic monitoring, advanced persistent threat (APT) attack detection, etc. However, there are only a few studies on the black content that has the greatest potential impact on the public, such as fraud pages, gambling pages, porn pages, which are most close to the general public. At present, the research on network black content mainly combines search engine technology and web page classification technology. Web page classification refers to the supervised learning task of assigning web pages to one or more predefined category tags, based on learning models that have been trained using labeled data [1, 17]. The current classification methods of web pages are mainly divided into three categories according to the characteristic information, text, HTML elements, and both. In the early days, the classification method based on text information dominated [1, 16, 19]. They usually extract text statistical information, such as information gain [14], mutual information [22], chi-square statistics [22], as features to represent web page. These methods only consider the literal information of c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 188–198, 2022. https://doi.org/10.1007/978-3-030-79728-7_19
Web Page Classification Based on Graph Neural Network
189
the text and convert the semi-structured text of the web page into ordinary text, thus losing the structural information. In addition to the text feature, another obvious feature of Web pages is the HTML markup. To improve the classification accuracy, some researchers divide the webpage text according to the label category of the text and give the corresponding text weight [6, 16, 21]. By coding the tags and concatenating them with the word vectors, the feature representation of the web page is generated [2]. However, most HTML tags are presentation-oriented, and the element tags of text have great uncertainty. It is possible that the HTML markup information is not consistent with the HTML document format. Reference [13] proposed to use the visual feature information of the web page to convert the text and image of the web page into a visual adjacency graph representation, ignoring the structural features of the HTML of the web page. The web page classification method mentioned above is mostly based on the text and HTML elements. Although it achieves a good classification effect, it has more or fewer disadvantages. A web page is a collection of rich text content [1]. In addition to the text information, it also includes a large number of additional attribute content, such as text style attributes: text in the page render size, text display color, etc. Text organization structure attributes. At the same time, there are also complex semantic relations between the text on the web page. For example, the header navigation bar of the web page is mostly the title of the web page content, while the main content in the middle of the web page is the specific content of the header. There is a significant correspondence between title and content. This structural information also plays an important role in the classification of web content, but most of the current classification methods only focus on the text itself, without considering the attributes and structural information listed above. In order to solve the problems mentioned above, we proposed a method, called Web to Graph(W2G), that parses the HTML Web page and reconstructs the text contents into graph representation according to the visual association relation of the text and the DOM-tree hierarchy relation. In this way, we can efficiently realize the integration of the content and structure of the Web page. Then, the pre-trained word vector is used to represent the text of the node, and the attribute feature vector of the node is constructed according to the extracted text attribute feature. Finally, the Web page graph classification method based on graph neural network, named Web-GNN, is used to classify web page graph. Through the node information propagation and aggregation of GNN, we fuse the text features and text structure features of the web page, generate feature vector representation, and finally output web page labels using a fully connected neural network. Our major contributions are as follows: • We propose a novel method to represent web pages(W2G). This method converts original web pages into web page graphs and realizes the high integration of text information and structure information of web pages, which provides a basis for the presentation of web pages for the downstream web-based research. • We propose a novel web page classification method based on graph neural network(Web-GNN). It takes web page graph as input, fuses text features and structure features through graph convolution, pooling, and attention mechanism, gener-
generates an advanced web page feature representation, and finally realizes webpage classification.
• We compare Web-GNN with a number of web page classification methods, and further analyze and visually demonstrate the advantages of this model.

Section 2 summarizes the background of web page classification and graph neural networks. Section 3 defines the concepts and symbolic representations related to the web page graph. The specific details of the classification model and the experimental results and analysis are given in Sect. 4 and Sect. 5. In Sect. 6, we conclude this paper, discuss current challenges, and suggest directions for future work.
2 Related Work

In this section, we introduce the related work on graph neural networks in detail.

2.1 Graph Neural Network
Graph neural networks (GNNs) [18] are deep learning-based methods that operate on the graph domain. Deep learning has been proven to effectively capture the hidden patterns of Euclidean data; however, much data takes the form of trees or graphs, such as chemical compounds [5], knowledge graphs [7], and social networks [8]. GNNs exploit the rich information inherent in the graph's structure and the annotations associated with its vertices and edges. In graph classification, GNN-based methods learn a global representation of the graph and then feed it into classification algorithms such as MLP or SVM. Many proposed methods are built on the "message passing" mechanism [5], meaning the representation of a node is the fusion of its neighbors' representations [7, 8, 12]. For graph classification, the key question is therefore how to summarize the representations of all nodes and produce the corresponding graph representation. In many graph classification domains, such as molecule classification, graphs from a specific class may have low global similarity but share some unique substructures. Therefore, using discriminative substructures as features and transferring graphs into a vector space has become a popular solution. From this inspiration, we apply GNNs to web page classification.
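To make the message-passing idea concrete, the following snippet is our own minimal illustration (not code from the cited works) of one mean-aggregation update step, in which each node fuses its own feature with its neighborhood:

```python
import torch

def message_passing_step(x, edge_index):
    """One mean-aggregation message-passing step.

    x: [num_nodes, dim] node features.
    edge_index: [2, num_edges] (source, target) index pairs.
    Each node's new representation fuses its own feature with
    the mean of its neighbors' features.
    """
    num_nodes, _ = x.shape
    agg = torch.zeros_like(x)
    deg = torch.zeros(num_nodes)
    src, dst = edge_index
    agg.index_add_(0, dst, x[src])                   # sum neighbor features
    deg.index_add_(0, dst, torch.ones(src.shape[0]))  # in-degree count
    agg = agg / deg.clamp(min=1).unsqueeze(-1)       # mean over neighbors
    return 0.5 * x + 0.5 * agg                       # fuse self and neighborhood
```

Stacking several such steps lets information propagate across multiple hops, which is what the GNN layers in Sect. 4 exploit.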
3 Problem Formulation and Definition

A web page (or webpage) is a specific collection of information provided by a website and displayed to a user in a web browser. A web page typically consists of many content elements. Its core element is one or more texts; images, videos, and other multimedia files are also often embedded in web pages. Suppose we want to parse a web page H into a content graph G = (V, E). Our objective is to extract a set of content elements N = {n1, n2, ..., np} from the web page H and construct the graph G = (V, E), where each vertex v ∈ V corresponds to an element of H. Subsequently, we reconstruct the vertex relationships by referring to the
hierarchy of each node and the rendered position relationships. Our core insight here is that a web page is a nested object consisting of elements that are semantically and spatially related to each other. This is discussed in greater detail in the following section.

3.1 Web Page HTML

Hypertext Markup Language (HTML) is the standard markup language for documents designed to be displayed in a web browser. An HTML element is a type of HTML document component, one of several types of HTML nodes. An HTML document is composed of a tree T of simple HTML elements, such as textual elements, and HTML elements that add semantics and formatting to parts of the document. Here, we only consider text elements and image elements and ignore other elements (e.g., hyperlinks and scripts). Each element can have HTML attributes specified; we focus on the text size and the BoundingClientRect of elements, which are the most important attributes in our graph model. Since many meaningless or empty elements exist in H, processing such a large tree can be very expensive, so it is essential to prune elements. To deal with this problem, several pruning rules are proposed. Given a parse tree T = (NT, ET), NT is the set of nodes and ET = {(n1, n2) | n1, n2 ∈ NT and n1 ≠ n2} is the set of edges in T. An element n1 is said to be the parent of element n2 if there is a directed edge (n1, n2) from n1 to n2; element n2 is then said to be a child of element n1. We define an element n to be NULL if and only if n contains neither text nor an image.

• If element n is null, only one child of n is not null, and all other children of n are null, then remove n.
• If node n is null and it has only one child, then substitute n by its child.

Table 1. Graph item description

Item                Attribute  Char  Description
Vertex content      text       vt    vertex type is textual
                    image      vi    vertex type is image
Vertex coordinates  x          vx    vertex coordinate: x
                    y          vy    vertex coordinate: y
                    height     vh    vertex size: height
                    width      vw    vertex size: width
Edge                dis x      ex    horizontal distance: v1 and v2
                    dis y      ey    vertical distance: v1 and v2
                    ratio v1   er1   ratio of v1's width and height
                    ratio v2   er2   ratio of v2's width and height
                    ratio h    eh    ratio of v1's width and v2's width
                    ratio w    ew    ratio of v2's height and v2's width
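The two pruning rules above can be sketched compactly as follows. This is our simplified reading, not the authors' implementation: the `Node` class stands in for a parsed DOM element, and a pruned null leaf is assumed to carry no recoverable content.

```python
class Node:
    def __init__(self, text=None, image=None, children=None):
        self.text = text            # textual content, if any
        self.image = image          # image content, if any
        self.children = children or []

def is_null(node):
    # An element is NULL iff it contains neither text nor an image.
    return node.text is None and node.image is None

def prune(node):
    """Apply the two cutting rules bottom-up; returns the new subtree root."""
    node.children = [prune(c) for c in node.children]
    if is_null(node):
        # Rule 2: a null node with a single child is substituted by the child.
        if len(node.children) == 1:
            return node.children[0]
        # Rule 1 (simplified): a null node whose children are all null except
        # one is replaced by its single content-bearing child; the null,
        # childless siblings are discarded.
        non_null = [c for c in node.children if not is_null(c) or c.children]
        if len(non_null) == 1:
            return non_null[0]
    return node
```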
3.2 Web Page Graph
We define the graph generated by W2G as G = (V, E) with an object type mapping function ϕ: V → A and an edge type mapping function φ: E → R. Each vertex v ∈ V belongs to one particular element type in the element type set A (ϕ(v) ∈ A), and each edge e ∈ E belongs to a particular relation type in the relation type set R (φ(e) ∈ R). The attributes of vertices and edges are listed in Table 1. In order to further reduce the complexity of the graph and enrich the relationships between its nodes, we define a bounded polygon set P. Each bounded polygon p ∈ P is represented as a set of vertices defined by their (x, y) screen coordinates and their hierarchy in T; we introduce the following definition:

• p = {v ∈ p | (vx = n || vh = h), n, h ∈ N}

We then add full-connection edges within each p, and each p is connected to its element vertices v; the contents of p are copies of all the contents of its vertices. Fig. 1 shows an example of a web page graph.
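Reading the construction above loosely, the graph assembly can be sketched with networkx as below. The attribute names and the `poly_` labels are our own illustration, not the paper's data format:

```python
import networkx as nx

def build_web_graph(vertices, polygons):
    """Assemble the web page graph G = (V, E).

    vertices: dict vertex-id -> attribute dict (type, x, y, height, width, ...).
    polygons: list of vertex-id sets, each one bounded polygon p in P.
    Inside each polygon we add full-connection edges, and every member
    vertex is additionally linked to its polygon node.
    """
    G = nx.Graph()
    for vid, attrs in vertices.items():
        G.add_node(vid, **attrs)
    for k, members in enumerate(polygons):
        pid = f"poly_{k}"
        G.add_node(pid, kind="polygon")
        members = list(members)
        for v in members:
            G.add_edge(pid, v)               # polygon-to-element edge
        for i in range(len(members)):        # full connection inside p
            for j in range(i + 1, len(members)):
                G.add_edge(members[i], members[j])
    return G
```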
4 Model

In this section, we introduce our graph neural network-based approach to web page classification, called Web-GNN. First, we show how to process the web page graph G proposed in Sect. 3. Then we briefly describe the Web-GNN model. After giving an overview of our approach, we provide details on: (i) how to weight the importance of vertices in this model; (ii) how to fuse multi-modality vertex features and predict the label of a web page graph G.
Fig. 1. Web page graph
4.1 Overview

The proposed Web-GNN follows the general pipeline depicted in Fig. 2. The novel point here concerns how the webpage-elements matrix is weighted. Instead of using text frequency criteria, we show that Web-GNN can
capture the hierarchy of the text and the rendered position relationships of the vertices by constructing a graph. In addition, we add image vertices into the web page graph G; the experimental results show that this can improve the classification accuracy for some categories of web pages. We discuss these steps in more detail below.
Fig. 2. Graph neural network of web page graph
4.2 Build Web Page Graph

We transform texts, which are strings of characters, into a representation suitable for learning algorithms and classification tasks. Inspired by recent work on word embedding in NLP, we use the Tencent AI Lab Embedding Corpus [20], which provides 200-dimensional vector representations (a.k.a. embeddings) for over 8 million Chinese words and phrases, to represent the text content of each vertex. In detail, we segment the text content and represent each segment by its embedding in the corpus. For each vertex, the representation of the text content is the average embedding of its text segments. Concretely, for vt in G:

vt = (1/N) ∑_{i=1}^{N} ti    (1)

where N is the number of text segments in v and ti is the embedding of the i-th segment.

4.3 Web-GNN

A GNN can capture and extract structure information from a graph. In this paper, we use three GATConv layers to collect information from adjacent vertices. The new representation of a vertex is the fusion of the information collected from its neighbors and itself. More formally, it is defined as:

x′i = αi,i θxi + ∑_{j∈N(i)} αi,j θxj    (2)
The attention coefficient αi,j is the self-attention coefficient of the node vector, computed as the weighted sum of the attention weight of the text content itself and the attention weight of the text location; δ is a hyperparameter.
αi,j = δ · βi,j + (1 − δ) · γi,j    (3)

The attention coefficients βi,j are computed as:

βi,j = exp(LeakyReLU(bᵀ[θxi ‖ θxj])) / ∑_{k∈N(i)∪{i}} exp(LeakyReLU(bᵀ[θxi ‖ θxk]))    (4)

The attention coefficients γi,j are computed as:

γi,j = exp(LeakyReLU(rᵀ[θpi ‖ θpj])) / ∑_{k∈N(i)∪{i}} exp(LeakyReLU(rᵀ[θpi ‖ θpk]))    (5)
where xi is a 200-dimensional vector, the text content embedding of the vertex; pi is a 4-dimensional vector, the position representation of the vertex; θ is a linear function that maps the old vector to a new vector with the target dimension; and N(i) denotes the neighbors of the target vertex.

In order to decrease the scale of the graph G, we add a pooling layer after each convolution layer. Here, we use TopKPooling [4], which selects the top-k nodes from the given graph according to the learned node scores; we set the pooling ratio to 0.8. After each pooling layer, we implement a readout layer that aggregates the vertex features to generate the representation of the graph, as follows:

s = (1/N) ∑_{i=1}^{N} xi ‖ max_{i=1}^{N} xi    (6)

where N is the number of current vertices and ‖ denotes vector concatenation. The key point of graph classification is to obtain an n-dimensional vector representation of the graph. In this paper, the feature representation of each graph is the sum of the output features of the readout layers, and it is fed to an MLP layer for classification.
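The pipeline described above can be assembled roughly as below with PyTorch Geometric. This is our minimal sketch, not the authors' code; in particular, `GATConv` here uses the library's standard attention rather than the position-aware coefficients of Eqs. 3-5, which would require a custom layer, and the hidden size is an arbitrary choice.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import (GATConv, TopKPooling,
                                global_mean_pool, global_max_pool)

class WebGNN(torch.nn.Module):
    """Three GATConv + TopKPooling blocks, mean||max readout, MLP head."""
    def __init__(self, in_dim=200, hidden=64, num_classes=6):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        self.pools = torch.nn.ModuleList()
        for i in range(3):
            self.convs.append(GATConv(in_dim if i == 0 else hidden, hidden))
            self.pools.append(TopKPooling(hidden, ratio=0.8))  # keep top 80%
        self.mlp = torch.nn.Linear(2 * hidden, num_classes)

    def forward(self, x, edge_index, batch):
        graph_repr = 0
        for conv, pool in zip(self.convs, self.pools):
            x = F.relu(conv(x, edge_index))
            x, edge_index, _, batch, _, _ = pool(x, edge_index, batch=batch)
            # Readout after each block: concatenation of mean and max (Eq. 6).
            graph_repr = graph_repr + torch.cat(
                [global_mean_pool(x, batch), global_max_pool(x, batch)], dim=1)
        # Sum of the per-block readouts is fed to the classifier.
        return self.mlp(graph_repr)
```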
5 Experiments

We evaluate the proposed method on the web page classification task. In Sect. 5.1, we introduce the baseline methods. In Sect. 5.2, we describe the dataset used for evaluation. In Sect. 5.3, we describe the experimental setup, and in Sect. 5.4 we present the results and compare them with the baseline methods.

5.1 Baseline
We compare our Web-GNN with multiple state-of-the-art text classification and webpage classification methods:

• SVM: TF-IDF (term frequency-inverse document frequency) features with an SVM classifier.
• BERT: Bidirectional Encoder Representations from Transformers, designed to pre-train deep bidirectional representations from unlabeled text [3].
• TextCNN: defined in Kim 2014 [10]; obtains the text representation by applying convolution and max-pooling operations on word embeddings.
• TextRNN: uses the last hidden state as the representation of the text [15].
• DPCNN: Deep Pyramid Convolutional Neural Networks for text categorization [9].
• BERT+CNN: adds a CNN layer after the BERT model.
• BERT+RNN: the same as BERT but with an added RNN layer.

5.2 Dataset

We consulted public sources such as Kaggle and found no open-source datasets of Internet web pages that met the experimental requirements. Therefore, we crawled 4978 Internet content-related webpages through the Internet of Things content crawler engine, including 1133 legal-content webpages, 1428 gambling webpages, 455 financial-fraud webpages, 625 pornographic webpages, 649 click-farm (illegal ordering) webpages, and 688 other illegal webpages (including pirated novels, surrogacy promotion, etc.). For convenience of presentation, we call this dataset WEB-Black. We randomly select 10% of the web pages from the dataset to build the test set, and the remaining 90% are split into the train and dev sets (Table 2).

Table 2. Description of WEB-Black

Class label  Train web  Dev web  Test web
Legal        751        208      118
Gamble       966        277      135
Finance      293        91       41
Porn         343        91       58
Other        490        128      61
Click farm   410        135      52
5.3 Experimental Setup and Tools

We implemented the proposed Web-GNN algorithm in Python using the PyTorch Geometric library, and the baseline methods using the PyTorch library. We set the dimension of the node representation to 200 and initialize it with pre-trained word embeddings [20]. We use the Adam optimizer [11] with an initial learning rate of 0.0005. Dropout with a keep probability of 0.5 is applied after the dense layer. The batch size of our model is 32. We stop training if the validation loss does not decrease for 10 consecutive epochs. We first preprocessed all the webpages by cleaning and tokenizing the text; here, we extract the text content of the webpages in WEB-Black. Note that we do not remove stop
words or low-frequency words from the dataset, in order to keep the webpages intact. We train the classification model on the train set. The classification performance is evaluated using the macro-averaged precision, recall, and F1-score measures, and classification accuracy.
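The training procedure just described (Adam with learning rate 0.0005, batch size 32, early stopping after 10 epochs without validation-loss improvement) could look roughly like the loop below. This is our sketch, not the authors' code, and the `evaluate` helper returning the dev-set loss is assumed:

```python
import copy
import torch
import torch.nn.functional as F

def train(model, train_loader, dev_loader, max_epochs=200, patience=10):
    """Train with Adam and early stopping on the dev loss (Sect. 5.3)."""
    opt = torch.optim.Adam(model.parameters(), lr=0.0005)
    best_loss, best_state, bad_epochs = float("inf"), None, 0
    for epoch in range(max_epochs):
        model.train()
        for batch in train_loader:           # PyG DataLoader, batch size 32
            opt.zero_grad()
            out = model(batch.x, batch.edge_index, batch.batch)
            loss = F.cross_entropy(out, batch.y)
            loss.backward()
            opt.step()
        dev_loss = evaluate(model, dev_loader)   # assumed helper (dev loss)
        if dev_loss < best_loss:                 # keep the best checkpoint
            best_loss = dev_loss
            best_state = copy.deepcopy(model.state_dict())
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:           # stop after 10 flat epochs
                break
    model.load_state_dict(best_state)
    return model
```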
5.4 Experimental Results
Table 3 reports the results of our model against the baseline methods. We can see that our model achieves state-of-the-art results.

Table 3. Comparison of end-to-end performance against baseline methods on WEB-Black

Index  Model      Precision  Recall
1      SVM        90.00      90.00
2      TextCNN    90.53      90.32
3      TextRNN    89.35      89.03
4      DPCNN      88.36      87.96
5      BERT       91.28      91.10
6      BERT+CNN   92.21      92.16
7      BERT+RNN   91.12      91.10
8      Our model  95.05      95.13
We note that the results of Web-GNN, based on a graph neural network, are better than those of the traditional models. The reason is that representing text in the form of a graph captures structural information between texts: for the rich text structure of a web page, the relationships among sub-page node contents are reflected by the corresponding node positions and connections. Traditional text representation algorithms based on the bag-of-words model or word vector models can only capture text information in a fixed context and are not capable of capturing the complex hierarchical structure information of web pages. Web-GNN learns the structural relations between texts through the GNN layers and filters text nodes through TopKPooling to finally generate the feature vector representation of web pages.

Another significant difference between web text and traditional text is that it carries additional attribute information. Compared with the baseline methods mentioned above, Web-GNN introduces an attention mechanism over the text attribute information. In this way, the method learns not only the information of the text itself, but also the size and location of the text, and finally generates the feature representation of each node. This feature information is not available to traditional classification methods. Figure 3 shows the learning result of the algorithm on a sample.
Fig. 3. This algorithm selects some representative nodes, as shown in the blue box in the figure, which are mainly concentrated in the navigation bar at the top of the page, the content module in the middle of the page and the annotation section at the bottom of the page. The result of this screening has a great overlap with the key nodes selected manually.
6 Conclusion

We propose a web content classification method, Web-GNN, based on graph neural networks. In this work, we first reconstruct the original web page content into a web page graph representation, integrating the text information and structure information of the web page. Then, we use a graph neural network to learn the hierarchical structure features within web pages, use pre-trained word vectors to represent the content of web pages, and achieve an organic combination of the two through the graph attention mechanism to generate the feature vector representation of web pages. Finally, we use a conventional classification method to classify the web pages. Experimental results show that this method is superior to classification methods using only text features. In the future, we will further explore other categories of web content information, such as pictures and videos, and integrate multi-modal information based on web page graphs to further improve the robustness of the method.
References

1. Qi, X., Davison, B.D.: Web page classification: features and algorithms. ACM Comput. Surv. (CSUR) 41(2), 1–31 (2009)
2. Deng, L., Du, X., Shen, J.: Web page classification based on heterogeneous features and a combination of multiple classifiers. Front. Inf. Technol. Electron. Eng. 21, 995–1004 (2020)
3. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
4. Gao, H., Ji, S.: Graph U-Nets. In: International Conference on Machine Learning, pp. 2083–2092. PMLR (2019)
5. Gilmer, J., Schoenholz, S.S., Riley, P.F., Vinyals, O., Dahl, G.E.: Neural message passing for quantum chemistry. In: International Conference on Machine Learning, pp. 1263–1272. PMLR (2017)
6. Golub, K., Ardö, A.: Importance of HTML structural elements and metadata in automated subject classification. In: Rauber, A., Christodoulakis, S., Tjoa, A.M. (eds.) Research and Advanced Technology for Digital Libraries, ECDL 2005. Lecture Notes in Computer Science, vol. 3652, pp. 368–378. Springer, Berlin (2005)
7. Hamaguchi, T., Oiwa, H., Shimbo, M., Matsumoto, Y.: Knowledge transfer for out-of-knowledge-base entities: a graph neural network approach. arXiv preprint arXiv:1706.05674 (2017)
8. Hamilton, W.L., Ying, R., Leskovec, J.: Inductive representation learning on large graphs. arXiv preprint arXiv:1706.02216 (2017)
9. Johnson, R., Zhang, T.: Deep pyramid convolutional neural networks for text categorization. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 562–570 (2017)
10. Kim, Y.: Convolutional neural networks for sentence classification. In: EMNLP (2014)
11. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
12. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
13. Kovacevic, M., Diligenti, M., Gori, M., Milutinovic, V.: Visual adjacency multigraphs: a novel approach for a web page classification. In: Proceedings of the SAWM04 Workshop, ECML 2004 (2004)
14. Lewis, D.D., Ringuette, M.: A comparison of two learning algorithms for text categorization. In: Third Annual Symposium on Document Analysis and Information Retrieval, vol. 33, pp. 81–93 (1994)
15. Liu, P., Qiu, X., Huang, X.: Recurrent neural network for text classification with multi-task learning. arXiv preprint arXiv:1605.05101 (2016)
16. Lu, M.Y., Shen, D., Guo, C.H., Lu, Y.C.: Web-page summarization methods for web-page classification. Dianzi Xuebao (Acta Electronica Sinica) 34(8), 1475–1480 (2006)
17. Mitchell, T.: Machine Learning. McGraw-Hill Higher Education, New York (1997)
18. Scarselli, F., Gori, M., Tsoi, A.C., Hagenbuchner, M., Monfardini, G.: The graph neural network model. IEEE Trans. Neural Netw. 20(1), 61–80 (2009). https://doi.org/10.1109/TNN.2008.2005605
19. Shanks, V., Williams, H.: Fast categorisation of large document collections. In: International Symposium on String Processing and Information Retrieval, p. 194. IEEE Computer Society (2001)
20. Song, Y., Shi, S., Li, J., Zhang, H.: Directional skip-gram: explicitly distinguishing left and right context for word embeddings. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 2 (Short Papers), pp. 175–180 (2018)
21. Sun, A., Lim, E.P., Ng, W.K.: Web classification using support vector machine. In: Proceedings of the 4th International Workshop on Web Information and Data Management, pp. 96–99 (2002)
22. Wiener, E., Pedersen, J.O., Weigend, A.S.: A neural network approach to topic spotting. In: Proceedings of SDAIR-95, 4th Annual Symposium on Document Analysis and Information Retrieval, vol. 317, p. 332. Las Vegas, NV (1995)
Malicious Encrypted Traffic Identification Based on Four-Tuple Feature and Deep Learning Kunlin Li(B) and Baojiang Cui Beijing University of Posts and Telecommunications, Beijing, China {likunlin,cuibj}@bupt.edu.cn
Abstract. With the increasing popularity of traffic encryption protocols such as SSL/TLS, attacks based on encrypted traffic are more and more rampant, as is the need to inspect all SSL traffic. At present, methods based on feature engineering and methods based on deep learning and representation learning are the research hot-spots. In this paper, a new method of encrypted malicious traffic identification is proposed, based on deep learning and four-tuple features. The unit of traffic identification is the flow four-tuple. We extract three types of features: statistical features, handshake byte stream features, and application data size sequence features. We design different deep learning models to deal with the various features, which work together in traffic identification. We experiment on the CTU Malware dataset. The results show that the accuracy of our method reaches 98.31%, which is better than that of other methods using the four-tuple as the unit of flow identification and experimenting on the CTU Malware dataset.
1 Introduction

As awareness of network security and privacy grows, the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols are becoming more common. As reported in [1], by 2020 more than 80% of network traffic was encrypted by the SSL/TLS protocol. In the meantime, threats hidden in SSL-encrypted traffic are on the rise: the number of attacks based on SSL-encrypted traffic is estimated to have increased by 260% over the previous year. In the context of the global COVID-19 epidemic in 2020, the medical and health care industry was the first to be hit by SSL encryption attacks, which brought great harm to human society. As these encrypted-traffic-based attacks grow, so does the need to inspect all encrypted traffic.

There are five kinds of traffic identification methods [2]: port-based methods, deep packet inspection methods, black-and-white-list-based methods, modeling methods based on artificial feature selection, and methods based on deep learning and representation learning. With the use of dynamic ports and port forwarding, the port-based method is no longer applicable. With the popularity of traffic encryption technology, the characteristics of the traffic payload are covered, which makes methods based on deep packet inspection no longer applicable. Methods based on black-and-white lists face problems such as the efficiency of updating the lists and the flexibility of identification. Modeling based on artificial feature selection and deep learning based on automatic feature extraction have become the research hot-spots because of their
excellent performance. In this paper, a four-tuple (the set of sessions with the same source IP, destination IP, destination port, and transport layer protocol) is used as the unit to be identified, and a traffic detection method based on the traffic four-tuple and deep learning is proposed to detect malicious encrypted traffic.

This article is organized as follows. The second part elaborates the related work. The third part describes the proposed encrypted traffic detection method, including the identification unit of the traffic, the extraction of four-tuple features, and the design of the models. The fourth part describes the experimental process and analyzes the experimental results. The fifth part concludes this article.
2 Related Work

Research on traffic identification has a long history. At present, there are five kinds of traffic identification methods [2]: port-based methods, deep packet inspection methods, black-and-white-list-based methods, modeling methods based on artificial feature selection, and methods based on deep learning and representation learning.

The mapping relationship between protocols and ports is managed by IANA. Typically, ports 0 to 1023 are assigned to common protocols in the TCP/IP network (for example, FTP traffic uses port 21), and port numbers 1024 to 49151 are assigned to other applications registered with IANA. However, with the widespread use of dynamic port and port forwarding technologies, the port-based method is no longer suitable for traffic identification.

The widely used deep packet inspection performs keyword search and pattern matching on the payloads of the network, transport, and application layers of data packets, and can achieve high identification accuracy. However, in encrypted traffic the characteristics of the bytes are masked, which limits the use of DPI technology. Deep packet inspection after decryption violates the privacy of users, which goes against the original intention of encryption protocols [3]; it also suffers from heavy computation and the challenging configuration of the decryption system.

The black-and-white-list-based approach relies on black-and-white lists collected by authorities, including server IP lists, domain name lists, and TLS fingerprint lists [4]. Even though IPs and domain names can be changed easily, it is difficult for an attacker to change how their malware connects; as a result, lists of TLS fingerprints have better traffic identification capabilities. This method can identify malicious encrypted traffic with a low false alarm rate, but it cannot flexibly identify traffic that is not in the lists.

In recent years, the identification of encrypted traffic based on artificial feature selection has become a research hot-spot. Features commonly used in previous studies include packet header features (such as features of the network and transport layers), unencrypted handshake features, and series features (including inter-arrival time series and packet size series). Most of the models commonly used in previous research are classical machine learning models. Cisco [5] used a Markov state transition matrix to describe sequence features that can be applied to ML models; in addition, they used
the byte distribution as a feature and the support vector machine model for classification. Cisco [3] then proposed a new study that combined contextual HTTP and DNS features to improve the identification accuracy of the previous work.

Deep learning and representation learning methods are also a research hot-spot in recent years. Deep learning methods have excellent representation ability and can extract features automatically through learning; compared with traditional machine learning methods, deep learning also has a stronger learning ability. Wang's team compared the one-dimensional convolutional neural network (1D-CNN) [6] with the two-dimensional convolutional neural network (2D-CNN) [7], and then proposed a method based on hierarchical spatial-temporal features of a CNN and a long short-term memory network (LSTM) in their later study [8]. Lotfollahi et al. [9] proposed a method to identify encrypted traffic using stacked auto-encoders (SAEs) and a 1D-CNN to represent the traffic payload in lower dimensions.

Traditional traffic identification methods take the flow or the session as the unit of identification. Strasak et al. [10] proposed a method to identify encrypted traffic with the four-tuple (source IP, destination IP, destination port, protocol) as the unit of identification. They extracted multiple features of the four-tuple, compared the identification accuracy of a neural network, random forests, and XGBoost, and obtained the best result with XGBoost. Based on the work of Strasak et al., Dai et al. [11] extracted more features to improve the accuracy of traffic identification.

There are several datasets from different organizations. The dataset used in this paper is the CTU Malware Dataset, which comes from the Malware Capture Facility Project of the Czech Technical University. A newer dataset, the Unisa Malware Dataset (UMD) [20], is based on the extraction of static and dynamic features characterising malware activities.
3 Proposed Method

This section introduces the main innovations of the proposed method. First, it explains why the four-tuple should be used as the unit of traffic identification, which differs from traditional methods that use the session as the unit. Second, it describes the features extracted for the four-tuple. Last but not least, it explains how the deep learning models are constructed to fit the four-tuple features.

3.1 Four-Tuple as Analysis Unit

A four-tuple is a group of sessions that have the same source IP, destination IP, destination port, and transport layer protocol; it contains all the flows in which one client communicates with one port of one server. In communication with the server, the client uses a temporary port in the range 49152 to 65535; even if an SSL/TLS session is resumed, the client may assign a new port to this session. Using the four-tuple as the analysis unit therefore helps provide more information for traffic identification: resumed SSL/TLS sessions do not have a complete handshake process, so the certificate features of these flows/sessions cannot be extracted, which makes identification with the flow/session as the unit less accurate.
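The grouping itself is straightforward; the sketch below is our illustration, assuming flow records are available as dicts with the four key fields. The point is that the ephemeral client source port is deliberately excluded from the key, so resumed sessions from new client ports still land in the same identification unit.

```python
from collections import defaultdict

def group_into_four_tuples(flows):
    """Group flow/session records into four-tuple analysis units.

    A four-tuple key is (source IP, destination IP, destination port,
    transport-layer protocol). `flows` is assumed to be an iterable of
    dicts carrying at least these fields.
    """
    groups = defaultdict(list)
    for flow in flows:
        key = (flow["src_ip"], flow["dst_ip"], flow["dst_port"], flow["proto"])
        groups[key].append(flow)
    return groups
```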
3.2 Feature Extraction
3.2.1 Statistical Feature of Four-Tuple
After analyzing the characteristics of the four-tuple, Dai et al. [11] extracted 32-dimensional features consisting of flow statistics features, handshake field features, and certificate features. Based on their study, we extract a 37-dimensional feature vector, as shown in Table 1. We use the top 1,000,000 domains released by Alexa [12] as a white list to filter the server names in the SNI extension.

Table 1. Statistical features of four-tuple

Id  Description                                                             Category         Type
1   Number of connections in 4-tuple                                        Flow statistics  String
2   Number of bytes in forward direction (upstream)                         Flow statistics  String
3   Number of bytes in backward direction (downstream)                      Flow statistics  String
4   Ratio of backward and forward bytes                                     Flow statistics  String
5   Number of packets in forward direction                                  Flow statistics  String
6   Number of packets in backward direction                                 Flow statistics  String
7   Ratio of backward and forward packets                                   Flow statistics  String
8   Number of bytes in forward direction (upstream) after handshake         Flow statistics  String
9   Number of bytes in backward direction (downstream) after handshake      Flow statistics  String
10  Ratio of backward and forward bytes after handshake                     Flow statistics  String
11  Number of packets in forward direction after handshake                  Flow statistics  String
12  Number of packets in backward direction after handshake                 Flow statistics  String
13  Ratio of backward and forward packets after handshake                   Flow statistics  String
14  Mean duration of the connection                                         Flow statistics  String
15  Standard deviation of duration of the connection                        Flow statistics  String
16  Mean inter-arrival time of connection                                   Flow statistics  String
17  Mean secondary time difference of connection                            Flow statistics  String
18  Number of SSL connections in 4-tuple                                    Handshake field  String
19  Ratio of SSL connections and all connections in 4-tuple                 Handshake field  String
20  Ratio of handshaked connections and SSL connections                     Handshake field  String
21  Ratio of connections with application data and SSL connections          Handshake field  String
22  Ratio of resumption connections and SSL connections                     Handshake field  String
23  Ratio of connections with SNI extension and SSL connections             Handshake field  String
24  Number of different server names in SNI                                 Handshake field  String
25  Indicates if server names in SNI contain destination IP                 Handshake field  Boolean
26  Indicates if server names in SNI contain any server name in white list  Handshake field  Boolean
27  Ratio of SSLv2 connections and SSL connections                          Handshake field  String
28  Ratio of SSLv3 connections and SSL connections                          Handshake field  String
29  Ratio of TLS1.0 connections and SSL connections                         Handshake field  String
30  Ratio of TLS1.1 connections and SSL connections                         Handshake field  String
31  Ratio of TLS1.2 connections and SSL connections                         Handshake field  String
32  Ratio of TLS1.3 connections and SSL connections                         Handshake field  String
33  Number of different certificates                                        Certificate      String
34  Mean certificate chain length                                           Certificate      String
35  Mean certificate validity                                               Certificate      String
36  Indicates if this 4-tuple is in the certificate validity period         Certificate      Boolean
37  Indicates if this 4-tuple contains a self-signed certificate            Certificate      Boolean
3.2.2 Handshake Byte Stream Feature
According to the results of Wang et al. [8], the best result can be obtained by taking the first 800 bytes of the byte stream. The first 800 bytes of the TCP-layer payload cover the client hello message, the server hello message, and part of the certificate message in the handshake layer of the SSL/TLS protocol. JA3/JA3S [13] is a kind of SSL/TLS fingerprint: one client generates only one JA3, and one server answering one client generates only one JA3S. Within a four-tuple, the JA3 of all flows should therefore be the same, and so should the JA3S, so the handshake byte stream of one non-session-resumed flow in a four-tuple can be used as the feature of the whole four-tuple. We use only the first 800 bytes of the payload in the SSL/TLS handshake layer, to avoid erroneously fitting MAC and IP addresses.

3.2.3 Application Data Layer Size Series
After the handshake succeeds, data is transmitted encrypted via the application data layer of the SSL/TLS protocol. Encryption masks the characteristics of the byte stream, but the application data layer size sequence cannot be masked. We extract the application data layer size sequence of each flow in a four-tuple and mark the direction with a plus or minus sign: the plus sign indicates the server-to-client direction, while the minus sign indicates the client-to-server direction. If the sequence is longer than 20, the first 20 sizes are taken as the feature; if it is shorter than 20, it is padded with 0. Each four-tuple takes the application data layer size sequences of the first five flows, ordered by the start times of the flows, so the feature size is (5, 20).

3.3 Model
In this study, three different deep learning models are applied to deal with the three types of features mentioned above. The cores of the applied models are the fully connected network, the CNN [14], and the LSTM [15].

We use a single-layer fully connected network to process the four-tuple statistical features. The fully connected layer performs well due to its non-linear expression ability; we use 32 neurons and ReLU as the activation function.

We use an Inception module [16] consisting of 1D-CNNs to handle the handshake byte stream feature. CNNs have an excellent representation learning capability and can perform shift-invariant classification of the input; they are suitable for processing various signal data, series features, and natural language, as well as for handling handshake byte streams. In traditional convolutional neural networks, multiple convolution layers are stacked up, which makes the model deeper, adds parameters, and leads to easier overfitting and slower training. The Inception module overcomes these shortcomings: the input data is processed with different convolution kernel sizes, producing different outputs that are concatenated together to become the output of the Inception module. Compared with traditional networks, the Inception module increases the width of the network while reducing its depth and the number of parameters, which improves adaptability to inputs of different sizes and provides more
information for classification. Wang's team compared the 1D-CNN [6] and the 2D-CNN [7] and found that the 1D-CNN works better. In this paper, the Inception module consists of 1D-CNNs with ReLU as the activation function. The byte stream values range from 0 to 255, and we use an embedding layer to encode each byte as a vector. The module and parameters used in this study are shown in Fig. 1.

We use a 1D-CNN in combination with an LSTM to process the application data size sequence feature. The LSTM is a kind of recurrent neural network that solves the long-term dependence problem of general recurrent neural networks; it is suitable for processing time series and performs well in natural language processing. Bidirectional recurrent neural networks (BRNNs) were proposed by Graves et al. [17]: while a traditional recurrent neural network makes predictions based only on previous information, a bidirectional recurrent neural network makes predictions based on the full context and performs better. The 1D-CNN is suitable for representing one application data size sequence in a four-tuple as a vector, and the Bi-LSTM is suitable for processing the multiple vectors of the flows ordered by start time. The application data sizes range from −16408 to 16408, so we add 16500 to the data and use an embedding layer to encode each application data size as a vector. The module structure and parameters used in this study are shown in Fig. 2.

The three outputs are concatenated and then input to a fully connected layer with a 1-dimensional output and a sigmoid activation function.
Fig. 1. The model for handshake byte stream feature
Fig. 2. The model for application data size sequence feature
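Figures 1 and 2 specify the exact modules. Purely as an illustration of how the three branches of Sect. 3.3 could be wired together, here is a rough PyTorch sketch; apart from the 37-dimensional statistics input, the 800-byte stream, the (5, 20) size sequences, and the +16500 shift, all layer sizes are our own guesses, not the paper's parameters.

```python
import torch
import torch.nn as nn

class InceptionBlock1D(nn.Module):
    """Parallel 1D convolutions with different kernel sizes, concatenated."""
    def __init__(self, in_ch, out_ch, kernels=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in kernels])

    def forward(self, x):                        # x: [B, in_ch, L]
        return torch.relu(torch.cat([b(x) for b in self.branches], dim=1))

class FourTupleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.stat_fc = nn.Sequential(nn.Linear(37, 32), nn.ReLU())
        self.byte_emb = nn.Embedding(256, 16)    # one byte -> 16-d vector
        self.byte_cnn = InceptionBlock1D(16, 32)
        self.size_emb = nn.Embedding(33001, 16)  # sizes shifted by +16500
        self.size_cnn = nn.Conv1d(16, 32, 3, padding=1)
        self.bilstm = nn.LSTM(32, 32, bidirectional=True, batch_first=True)
        self.head = nn.Linear(32 + 96 + 64, 1)   # concat of the 3 branches

    def forward(self, stats, byte_stream, size_seqs):
        a = self.stat_fc(stats)                              # [B, 32]
        # Handshake byte stream: [B, 800] of byte values.
        b = self.byte_cnn(self.byte_emb(byte_stream).transpose(1, 2))
        b = b.max(dim=2).values                              # [B, 96]
        # Size sequences: [B, 5, 20], one row per flow.
        B, F_, L = size_seqs.shape
        c = self.size_emb(size_seqs.reshape(B * F_, L)).transpose(1, 2)
        c = self.size_cnn(c).max(dim=2).values.reshape(B, F_, 32)
        _, (h, _) = self.bilstm(c)                           # h: [2, B, 32]
        c = torch.cat([h[0], h[1]], dim=1)                   # [B, 64]
        return torch.sigmoid(self.head(torch.cat([a, b, c], dim=1)))
```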
4 Experiments

4.1 Data Set and Data Processing
The dataset used in this paper is the CTU Malware Dataset [18], which comes from the Malware Capture Facility Project [19] of the Czech Technical University. They capture long-lived real botnet traffic by executing specific malware on virtual machines. This dataset includes malware, normal, and background traffic in PCAP form; all we need are the PCAP files of normal and malware traffic. We extract the four-tuple features from these PCAP files, and any four-tuple that does not contain a successfully handshaked SSL/TLS stream is filtered out. The statistical features of the four-tuple are standardized so that the mean and variance of each feature are 0 and 1, respectively. We add 16500 to the application data size sequence features to eliminate negative values. In total, we extracted 11974 normal four-tuples and 9667 malicious four-tuples from the original PCAP files.

4.2 Evaluation Indicators
In this article, we use accuracy, precision, recall, and F1-score to evaluate the performance of the proposed method. The formulas for calculating the indicators are shown in Eqs. 1, 2, 3, and 4. True Positive (TP) means that traffic belonging to Category X is correctly classified as Category X. A False Positive (FP) means that traffic not belonging to Category X is classified as Category X. True Negative (TN) means that traffic not belonging to Category X is correctly classified as a category other than Category X. A False Negative
(FN) means that traffic belonging to Category X is classified as a category other than Category X.

Accuracy = (TP + TN) / (TP + FP + FN + TN)    (1)

Precision = TP / (TP + FP)    (2)

Recall = TP / (TP + FN)    (3)

F1-Score = 2 × (Precision × Recall) / (Precision + Recall)    (4)
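For completeness, the four indicators translate directly into code; the trivial helper below is our own, not part of the paper's toolchain, and takes the confusion counts of a single class (macro-averaging would be done outside):

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion counts (Eqs. 1-4)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```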
4.3 Results and Analyses

We conduct two groups of experiments. Experiment 1 is performed to verify the validity of the three proposed features mentioned in the above section. In Experiment 1, we compare the classification results using only the handshake byte stream feature, the four-tuple statistical feature combined with the handshake byte stream feature, and all three kinds of features. The train set was randomly selected from the dataset and consists of 80% of the data, while the test set consists of the remaining 20%. All three cases use the same train and test sets, so we can compare the accuracy on the test set across different training epochs.

Experiment 2 is performed to verify the superiority of the method proposed in this paper: compared with other related research methods, the proposed method achieves a better effect. In Experiment 2, we compare the classification results of the proposed method with the best classification results of Strasak et al. [10] and Dai et al. [11], who both use the CTU Malware Dataset. The four indicators in this paper are calculated through 5-fold cross-validation and averaging.

The results of Experiment 1 are shown in Fig. 3, where feature 1 denotes the four-tuple statistical feature, feature 2 the handshake byte stream feature, and feature 3 the application layer size sequence feature. From the graph, we can see that using only the handshake byte stream feature, the accuracy on the test set is nearly stable after 80 training epochs, with a peak accuracy of 97.13%. Using the four-tuple statistical feature and the handshake byte stream feature, the accuracy on the test set is nearly stable after 90 epochs, with a peak accuracy of 98.04%. Using all three types of features, the model fits fastest: after 10 epochs the accuracy on the test set reaches a high level and tends to be stable, after 27 epochs it fluctuates more, and at 38 epochs it reaches its peak of 98.54%. After 40 epochs the accuracy decreases obviously, which indicates over-fitting. From these results, the best identification performance is obtained by using all three types of features.

The results of Experiment 2 are shown in Fig. 4. In Experiment 2, the four indicators were calculated by 5-fold cross-validation and averaging. Based on the observation of
the results of Experiment 1, the models in Experiment 2 are trained for 20 epochs. The method proposed in this paper is better than those proposed by Strasak et al. and Dai et al. in all four indicators: the accuracy of the method is 98.31%, the precision is 98.22%, the recall is 97.96%, and the F1-score is 98.09%.
Fig. 3. The relationship between training epoch and testing set accuracy with different features.
Fig. 4. Comparison with other papers on the four evaluation indicators
5 Conclusion

In this paper, a new method of traffic identification based on deep learning and four-tuple features is proposed. We extract the statistical feature, the handshake byte stream feature, and the application data size sequence feature of the four-tuple. For these three kinds of four-tuple features, we adopt a targeted deep learning model structure, which takes the 1D-CNN, the Bi-LSTM, and the fully connected layer as its core. The
experiments show that the three kinds of four-tuple features are all helpful in achieving better classification results. The method achieves 98.31% accuracy, 98.22% precision, 97.96% recall, and 98.09% F1-score in the encrypted traffic identification task, which is a significant improvement over the feature-engineering-based methods. In future work, we will try to apply this method to the multi-classification problem of encrypted traffic. Beyond traffic identification on the four-tuple, we will also explore other traffic identification units, such as the client level.
References

1. ThreatLabZ: 2020 State of Encrypted Attacks. https://www.zscaler.com/resources/industryreports/state-of-encrypted-attacks-summary-report.pdf
2. Rezaei, S., Liu, X.: Deep learning for encrypted traffic classification: an overview. IEEE Commun. Mag. 57(5), 76–81 (2019). https://doi.org/10.1109/MCOM.2019.1800819
3. Anderson, B., McGrew, D.: Identifying encrypted malware traffic with contextual flow data. In: Proceedings of the 2016 ACM Workshop. ACM (2016)
4. Gancheva, Z., Sattler, P., Wüstrich, L.: TLS fingerprinting techniques. Network 15 (2020)
5. Anderson, B., Paul, S., McGrew, D.: Deciphering malware's use of TLS (without decryption). J. Comput. Virol. Hacking Tech. 14(1), 1–17 (2016)
6. Wang, W., Zhu, M., Wang, J., et al.: End-to-end encrypted traffic classification with one-dimensional convolution neural networks. In: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). IEEE (2017)
7. Wang, W., Zhu, M., Zeng, X., et al.: Malware traffic classification using convolutional neural network for representation learning. In: 2017 International Conference on Information Networking (ICOIN). IEEE (2017)
8. Wang, W., Sheng, Y., Wang, J., et al.: HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection. IEEE Access 6, 1792–1806 (2017)
9. Lotfollahi, M., Zade, R.S.H., Siavoshani, M.J., et al.: Deep packet: a novel approach for encrypted traffic classification using deep learning. Soft Comput. (2017)
10. Strasák, F.: Detection of HTTPS malware traffic. Bachelor project assignment, Czech Technical University in Prague (2017)
11. Dai, R., Gao, C., Lang, B., et al.: SSL malicious traffic detection based on multi-view features. In: ICCNS 2019: The 9th International Conference on Communication and Network Security (2019)
12. Alexa. http://www.alexa.com/ (2020)
13. Althouse, J.: TLS fingerprinting with JA3 and JA3S. Salesforce Engineering (2019). https://engineering.salesforce.com/tlsfingerprinting-with-ja3-and-ja3s247362855967
14. Bouvrie, J.: Notes on convolutional neural networks (2006)
15. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
16. Szegedy, C., Liu, W., Jia, Y., et al.: Going deeper with convolutions (2014)
17. Graves, A., Mohamed, A.R., Hinton, G.: Speech recognition with deep recurrent neural networks. In: International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2013)
18. CTU Malware Dataset. https://www.stratosphereips.org/datasets-malware
19. Garcia, S.: Malware Capture Facility Project (2013). https://mcfp.felk.cvut.cz/
20. D'Angelo, G., Palmieri, F., Robustelli, A., et al.: Effective classification of android malware families through dynamic features and neural networks. Connection Sci. 1–16 (2021)
Improved Optimal Reciprocal Collision Avoidance Algorithm in Racing Games Wenyu Zhang and Tianhan Gao(B) Software College, Northeastern University, Shenyang, China [email protected], [email protected]
Abstract. Autonomous character behavior design is a key to the development of game artificial intelligence. Currently, collision avoidance technology, which is widely used in the field of robotics, cannot be directly applied to autonomous character behavior design in racing games without specific adjustment. This paper makes improvements to the classical Optimal Reciprocal Collision Avoidance (ORCA) algorithm to propose a novel collision avoidance algorithm (IORCA) suitable for racing games. Relevant concepts unique to racing games are put forward, and the collision handling principles followed by racing AIs on straights and curves are further presented. By implementing two systems equipped with ORCA and IORCA, the collision avoidance behavior of the racing AIs in multiple scenarios is compared. The results show that the collision avoidance behavior under IORCA is more suitable for racing games and gives players a better gaming experience.
1 Introduction

In racing games, autonomous character behavior design is a key factor in the development of AI [1]. The motion behavior of an autonomous character [2] is divided into three levels: action selection, steering, and movement, in which the steering behavior is described by a geometric calculation of the vector representing the required steering force [3]. As a type of steering behavior, collision avoidance behavior [4] equips the character with the ability to move flexibly in a cluttered environment by avoiding obstacles [5]. For character motion systems, collision avoidance processing for non-player characters (NPCs) is very important [6]. In racing games, collision avoidance technology is the key to maintaining the performance of high-velocity racing AIs [7], which need to avoid various static obstacles and other racing AIs on the track to reach the destination safely.

The Velocity Obstacles (VO) algorithm [8] is a widely used collision avoidance algorithm [9], in which each racing AI determines its route according to its observation and prediction of the positions and velocities of the other racing AIs [10]. However, if one AI's prediction is wrong, two AIs trying to avoid a collision may remain stagnant. Van den Berg et al. [11] designed the Reciprocal Velocity Obstacles (RVO) algorithm based on the VO algorithm; its main idea is a shared decision process for all racing AIs. However, the RVO algorithm is not an optimal collision avoidance scheme [12], so they further proposed the Optimal
Reciprocal Collision Avoidance (ORCA) algorithm [13], which defines a set of constraints on the velocities of the racing AIs. When these constraints are followed, the AIs perform collision-free movement without communicating with each other. Although ORCA can obtain a good approximation of the target route through linearly relaxed constraints in most cases, collisions may still occur when multiple racing AIs pass through a narrow track (e.g., a curve). In particular, when multiple racing AIs turn simultaneously, an AI may choose a longer route to bypass the turn or stand still, and a deadlock may occur. Besides, there is currently no collision detection and avoidance algorithm specifically designed for racing games, so existing approaches like ORCA need further improvement before they can be seamlessly integrated into racing games.

This paper analyzes the design ideas and execution process of ORCA and improves on its problems. Based on the characteristics of racing games, this paper proposes the IORCA algorithm through new concepts and principles. Moreover, the corresponding collision avoidance systems are implemented under Unreal Engine 4, and the effect of the IORCA algorithm is evaluated by testing the actual operation of the related algorithms in a sample game. Experimental results show that the IORCA algorithm performs better than the ORCA algorithm in racing games and can give players a better gaming experience.

The remainder of the paper is organized as follows. Section 2 presents the definition of collision avoidance, the details of the ORCA algorithm, and the proposed IORCA algorithm. In Sect. 3, two collision avoidance systems are implemented based on ORCA and IORCA, respectively, and the corresponding evaluation and discussion are given. Finally, conclusions and perspectives are presented in Sect. 4.
2 ORCA Algorithm and IORCA Algorithm

2.1 ORCA Algorithm

Problem Definition. The original ORCA algorithm focused on decentralized, anticipatory collision avoidance among multiple robots: in the process of avoiding a collision, each robot takes responsibility for half of the avoidance, which provides sufficient conditions for collision-free motion. The collision problem here is known as reciprocal n-body collision avoidance, defined as follows. A group of disk-shaped robots move in a 2D plane. The parameters of each robot A include two parts: external state parameters that can be observed by other robots, namely the current position pA (the center of the disk), the current velocity vA, and the radius rA; and internal state parameters that cannot be observed by other robots, namely the maximum velocity vA^max and the preferred velocity vA^pref of robot A when there are no other obstacles. The main task of collision avoidance is to make each robot choose its own new velocity vA^new by observing the external states of the other robots, so as to ensure collision-free motion at least within a preset time τ; each robot uses the same strategy to select the new velocity [14]. A robot is abstracted as a disk with radius r centered at p, as shown in Fig. 1(a); its geometric representation D(p, r) is given in Eq. 1.
Fig. 1. The relevant geometric representations in the ORCA algorithm. (a) Status of robots A and B. (b) Velocity obstacle. (c) The set of collision avoidance velocity vectors.

D(p, r) = {q | ‖q − p‖ < r}    (1)

The velocity obstacle VO^τ_A|B (see Fig. 1(b)) is defined in Eq. 2. It is a truncated cone (τ = 2 in Fig. 1) with its vertex at the origin, truncated by an arc of radius (rA + rB)/τ centered at (pB − pA)/τ. If robot A chooses a velocity in this set, it will collide with B within time τ.

VO^τ_A|B = {v | ∃t ∈ [0, τ] : tv ∈ D(pB − pA, rA + rB)}    (2)
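Equation 2 admits a direct, if naive, numeric test; the sketch below is our own brute-force sampling check (real ORCA implementations use the closed-form cone test instead):

```python
import numpy as np

def in_velocity_obstacle(v, p_a, p_b, r_a, r_b, tau, steps=100):
    """Numerically check v ∈ VO^τ_A|B (Eq. 2) by sampling t in (0, τ].

    Returns True when some t·v lands inside the disk
    D(p_b − p_a, r_a + r_b), i.e. keeping relative velocity v
    makes the two disks overlap within the time window τ.
    """
    rel = np.asarray(p_b, dtype=float) - np.asarray(p_a, dtype=float)
    v = np.asarray(v, dtype=float)
    for t in np.linspace(tau / steps, tau, steps):
        if np.linalg.norm(t * v - rel) < r_a + r_b:
            return True
    return False
```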
2.2 Improved ORCA Algorithm

The algorithm proposed in this paper (IORCA) makes corresponding improvements to the problems of ORCA, including the introduction of related concepts such as "AI level" and "damage degree", and the optimization of collision handling in turning scenarios.

AI Level. Obstacle avoidance is an essential ability in racing games and is divided into two cases according to its immediacy. Longer-range obstacle avoidance has weak real-time requirements and is mainly used to observe the track and plan a safe forward route that avoids all static obstacles (including faulty vehicles and the railings on both sides of the track). Shorter-range obstacle avoidance is more immediate and is mainly used by the racing AI to make quick, slight adjustments to maintain its route, such as avoiding vehicles rushing towards it, overtaking vehicles next to it, or turning fast.

In racing games, there are great differences in the racing abilities of AIs in terms of vehicle performance, driver skill, and other factors; therefore, we introduce the concept of the racing AI level. The racing AIs are divided into three levels, namely primary AI, intermediate AI, and advanced AI, with at most three racing AIs per level. The intelligence level of the advanced AI is the highest, while those of the intermediate and primary AIs are lower. The racing vehicle AI is abstracted as a rectangle with a
length of a and a width of b, centered at p, and represented by R(p, a, b). Different collision principles are selected according to different scenarios, and the velocity obstacle VO^τ_A|B of two racing AIs A and B is defined as in Eq. 3.

VO^τ_A|B = {v | ∃t ∈ [0, τ] : tv ∈ R(pB − pA, (aA + aB)/2, (bA + bB)/2)}    (3)
(4)
The sets ORCAτA|B and ORCAτB|A are the optimal set of reciprocal collision avoidance velocities for vehicles A and B, respectively, which contain more allowable velocities opt opt close to the optimized velocities vA and vB . For all other reciprocal collision avoidance velocity pairs VA and VB , Eq. 5 should be met. opt opt ORCAτA|B ∩ D vA , r = ORCAτB|A ∩ D vB , r . (5) opt opt min VA ∩ D vA , r , VB ∩ D vB , r Damage Degree. The IORCA algorithm sets a variable, named damage degree, for the racing AI in racing games with the concept of damage. Collision with static obstacles or other racing vehicles will cause different degrees of damage to the vehicle going forward. When the damage degree reaches 1, it needs to wait for a certain time to repair the vehicle. ORCA tends to make all robots undertake the responsibility of avoiding possible collisions without communicating with each other equally. Regardless of the time consumed, ORCA proves to take a good collision avoidance effect when multiple decentralized robots or other AIs cross forward. However, the primary goal of racing AI is not to avoid all collisions in racing games, but to reach the destination as soon as possible. Consequently, push all vehicles to follow the following collision handling principles while passing the straight track: (1) when the damage degree of racing AI is lower than a certain safety threshold, it will ignore all other racing AIs’ collision; (2) when the vehicle whose damage degree is between the safety threshold and the danger threshold, it will shoulder the responsibility of avoiding the collision equally; (3) when the damage degree is higher than the danger threshold, the vehicle will try to avoid all the upcoming collisions regardless of whether the opponent is avoiding collision or not. Therefore, in addition to the guardrails and the faulty vehicles on the straight track, when the racing AI in front is damaged below the safety threshold, it will not slow down or change direction for avoiding the collision of the rear vehicle. The advanced AIs own higher intelligence and more humanized thinking. They will be able to analyze the track situation to select and execute the decision that is conducive to success. For example, when the vehicle’s degree of damage is below the danger threshold, the advanced AI in front will try to move to the other racing AI’s
Improved Optimal Reciprocal Collision Avoidance Algorithm
213
possible route to force the opponent to slow down or change lanes, which is called blocking. Similarly, the advanced AI that is ranked lower will actively seek the route for accelerating and surpassing the vehicle ahead to overtake. When the backward advanced AI observes a vehicle with a high degree of damage in the field of vision ahead, it will even actively crash it. While the primary or intermediate AI is not intelligent enough to actively collide or block other vehicles. Collision Logic for Turning Scenario. This paper aims to improve the inappropriate collision processing flow of ORCA in the turning scenario, where at most three racing AIs passing through a degree of 90 turning to the left at the same time in Fig. 2. When there is only one AI, it will continue to move forward from position 1 to pass the curve as quickly as possible. When two or more AIs pass the curve, the principle of one AI and one lane will be implemented to create an overtaking opportunity. Thus, AI1 will be at position 1or 4, AI2 will be at position 2 or 5. When three AIs pass through, AI1 may be at position 1, 4, or 7, AI2 will be at position 2, 5, or 7, and the possible positions of AI3 are 3, 6, and 9.
Fig. 2. The position distribution of AIs before turning. There are at most three racing AIs passing through a degree of 90 turning to the left at the same time.
In Fig. 2, the racing psychology of the AI is as follows: (1) When the racing AI continues to move forward in the original state and will collide within the collision time range. (2) The racing AI at the forefront just wants to pass the curve as soon as possible without paying attention to the vehicle at the rear. (3) The racing AI on the rear will try its best to avoid the potential collision with the vehicle in front. (4) The racing AI at the leftmost will be responsible for avoiding the possible collision with the left rail when preceding the vehicle on the right, while ignoring the impact of other racing vehicles. (5) When a vehicle is behind the right vehicle, it will be responsible for avoiding the right vehicle at the same time. (6) The racing AI at the rightmost needs to avoid the collision with the right rail. The process in the scenario of turning right is the same as above and will not be repeated.
3 Implementation and Discussion In order to verify the improved effect of the IORCA algorithm, we implement two collision avoidance systems for the racing game based on ORCA and IROCA respectively under Unreal Engine 4 as shown in Fig. 3. Firstly, establish a library of collision avoidance scenarios, including all the situations that the vehicle may encounter, such as static
214
W. Zhang and T. Gao
obstacles, AI blocking and overtaking on a straight track, multiple racing AIs passing corners, etc. The two collision avoidance systems are tested for the above collision avoidance scenarios to check if all the AIs reach the target smoothly and quickly. The indicators such as velocity and naturalness of movement of the racing AIs are also compared. This paper adopts the behavioral simulation diagram in [15] to visually present the simulation results.
Fig. 3. The implementations of collision avoidance systems. (a) The IORCA collision avoidance system. (b) The ORCA collision avoidance system.
3.1 Static Obstacle In the static obstacle scenario, when the racing AI finds a static obstacle on the road ahead, it starts to adjust the direction and velocity to avoid the obstacle. Since the concept of the field of view is introduced in the IORCA collision avoidance system, the obstacles in the field of view can be detected, while the racing AI in the ORCA collision avoidance system can only detect collisions that may occur within the collision time. With the same collision avoidance time, the former racing AI founds obstacles earlier and starts to avoid them, and the movement is more natural and smooth, as shown in Fig. 4(a).
Fig. 4. The behavior simulation diagram of avoiding static obstacles. (a) The IORCA collision avoidance system. (b) The ORCA collision avoidance system.
Improved Optimal Reciprocal Collision Avoidance Algorithm
215
3.2 Advanced AI Blocking and Overtaking When two advanced racing AIs catch up with each other on the straight track, they will react differently according to their own situation and the surrounding environment in the IORCA collision avoidance system. If the damage level of the current vehicle is lower than the danger threshold, the front vehicle will never consider avoiding the rear vehicle, but try to block the rear vehicle for higher ranking. Equally, if the damage level of the rear vehicle is lower than the danger threshold, it will not consider avoiding collision with the front vehicle, but try to find clearance to overtake, as shown in Fig. 5(a). However, if the damage degree of the rear vehicle is higher than the danger threshold, it will adjust to avoid the collision with the vehicle ahead for its own safety considerations, as shown in Fig. 5(b). When the front vehicle’s degree of damage is higher than the danger threshold, the front vehicle shall continue to drive near the edge of the track for its own safety, and take the Responsibility of avoiding the collision. While, if the rear vehicle is robust and observes that the damage of the front vehicle is higher, it will actively hit the front vehicle to improve the ranking. Thus, the front vehicle can only stay in place and wait for a certain period time for repair, as shown in Fig. 5(c). When the damage degree of the rear vehicle is higher than the dangerous threshold, it will equally bear the responsibility of avoiding the collision with the front one, as shown in Fig. 5(d). Since the ORCA collision avoidance system does not consider the damage degree of the vehicle, no matter what the situation is, the vehicle adopts the principle of equally bearing the responsibility of collision avoidance, which greatly reduces the fun of racing games.
Fig. 5. Simulation diagram of racing behavior of two racing AIs on a straight track. (a) The damage degrees of both vehicles are lower than the danger threshold. (b) The degree of the rear vehicle is higher. (c) The degree of the front vehicle is higher. (d) The degrees of both vehicles are higher than the danger threshold.
3.3 Multiple Racing AIs Passing the Curve Multiple racing AIs will choose different tracks when passing through the curve to reduce the possible collisions that other vehicles may cause. In Fig. 2, there are 27
216
W. Zhang and T. Gao
situations when the three racing AIs passing the curve. All these 27 situations are tested in the experiment respectively to verify the feasibility of the proposed collision avoidance logic. This paper takes the three AIs located at position 1, 5, and 9 in Fig. 2 as an example for the following analysis. In the IORCA collision avoidance system, the racing AI in the front only wants to pass the curve as soon as possible, paying no attention to the traffic condition in the rear. Therefore, as shown in Fig. 6(a), AI1 at position 1 is only responsible for avoiding possible collision with the left rail, AI2 at position 5 is fully responsible for the possible collision with AI1, while AI3 at position 9 shoulder the responsibility for the possible collision with AI2 and the right rail. As for the ORCA collision avoidance system, since all AIs are equally responsible for the possible collision with each other, AI1 and AI3 also need to take the collision avoidance responsibility with the rails, thus they won’t pass the curve safely and fast. AI1 at the corner will detach from the track and go a long way in order to avoid the collision with the left rail. The best case is that the three AIs pass the curve from the middle track or the right track in turn, which increases the probability of collision, as shown in Fig. 6(b).
Fig. 6. Simulation diagram of racing behavior when passing curve. (a) The IORCA collision avoidance system. (b) The ORCA collision avoidance system.
3.4 Playability Test Finally, we need to ascertain the extent to which the IORCA algorithm improves the player experience in the racing system. Player experience questionnaires widely used in the game field mainly include immersion experience questionnaires (IEQ) [16] and game engagement questionnaires (GEQ) [17]. In the experiment, the questions in IEQ and GEQ were screened and combined with the experience of the game. The questionnaire was composed of questions from the four aspects of “immersion”, “presence”, “flow” and “absorption”. The list of questions is shown in Table 1. The experiment invited 200 players who liked racing games to experience the two systems for a week. Among the 200 players, there are 100 men and 100 women. Their average age is 25.86, the average gaming age is 7.35 years, of which 2.66 years for racing games. A week later, a questionnaire was issued to the players, and they were asked to answer the questionnaires on their experiences with the two systems. The brief question sets typically include face-valid items that are responded to on Likert scales
Improved Optimal Reciprocal Collision Avoidance Algorithm
217
Table 1. The question list of the questionnaire Categories
ID
Questions
Immersion
Q1
I really get into the game
Q2
I enjoy the graphics and imagery of the game
Q3
I’m in suspense about whether I would win or lose the game
Q4
There are not any particularly frustrating aspects of the controls to get the hang of
Q5
I enjoy playing the game
Q6
I lose track of time
Q7
I feel like I’m in the car rather than in the chair
Q8
I play longer than I meant to
Q9
I’m unaware of operating the keyboard
Presence
Flow
Absorption
Q10
Things seem to happen automatically
Q11
I play without thinking about how to play
Q12
I’m knocked over
Q13
I can’t hear when someone talks to me
Q14
The car seems to be automatic
Q15
I feel excited when I overtake a car
Q16
Time seems to stand still or stop
Q17
I lose track of where I am
Q18
I feel extremely frustrated when I fail
Q19
It seems as if I could interact with the world of the game as if I was in the real world
Q20
I don’t find myself becoming so caught up with the game that I want to speak to directly to the game
with a total of 7 magnitudes varying from strongly disagree to strongly agree. For fear of sequence effects, the questions are arranged randomly for each player. The results of the questionnaire survey are shown in Fig. 7. According to the data, the average total score of the questionnaire of the ORCA collision avoidance system is 60.862, while the score of the IORCA collision avoidance system is 73.269. Apparently, the average of the latter is higher than the former. Through comparative analysis, it is concluded that the IORCA collision avoidance system can significantly improve the player’s experience compared with the ORCA collision avoidance system, that is, the IORCA algorithm can give players a better experience. In summary, the IORCA collision avoidance system is implemented completely in accordance with the collision avoidance logic proposed in this paper. The experimental results prove the correctness, feasibility, and practicability of the proposed algorithm.
218
W. Zhang and T. Gao
Fig. 7. The average score of each question in ORCA collision avoidance system and IORCA collision avoidance system.
Compared with the ORCA algorithm, the IORCA algorithm is more suitable for racing games, which can give players a better experience.
4 Conclusions To improve the collision detection and avoidance effects of racing games and enhance the player’s gaming experience, this paper analyzes the practicability and defects of ORCA in racing games and designs and implements a novel algorithm (IORCA). The IORCA algorithm inherits the basic principle of ORCA to avoid impending collisions. The concept of degree of damage is put forward to analyze the racing psychology of racing AI. The collision processing principle that multiple racing AIs follow when they meet on the straight track is improved. With the introduction of racing AI levels, advanced AI owns higher wisdom, and there will be more humane behaviors such as blocking and overtaking. In addition, IORCA aims at the imperfect performance of
Improved Optimal Reciprocal Collision Avoidance Algorithm
219
multiple racing AIs in the turning scenarios and proposes the corresponding collision avoidance logic according to the position of the racing AIs. The feasibility and rationality of the IORCA algorithm is then verified by implementing the relevant systems and observing their actual running effect in different scenarios. The experiment results demonstrate that the IORCA algorithm greatly promotes the effect of collision avoidance in racing games. However, this paper focuses on the collision avoidance research based on mathematical models, while ignoring the impact on kinematics and dynamics. In future research, we will take these factors into account when calculating the allowable velocity of racing AIs, so as to achieve a more reliable and realistic racing collision avoidance effect. Conflicts of Interest. The authors declare that there is no conflict of interest regarding the publication of this paper. Funding Statement. This work was supported by the National Natural Science Foundation of China under [Grant Number N180716019 and N182808003].
References 1. Lecchi, S.: Artificial intelligence in racing games. In: 2009 IEEE Symposium on Computational Intelligence and Games, p. 1. IEEE (2009) 2. Kontoudis, G.P., Vamvoudakis, K.G.: Robust kinodynamic motion planning using model-free game-theoretic learning. In: 2019 American Control Conference (ACC), pp. 273–278. IEEE (2019) 3. Reynolds, C.W.: Steering behaviors for autonomous characters. In: Game Developers Conference 1999, pp. 763–782 (1999) 4. D’apolito, F., Sulzbachner, C.: Collision avoidance for unmanned aerial vehicles using simultaneous game theory. In: 2018 IEEE/AIAA 37th Digital Avionics Systems Conference (DASC), pp. 1–5. IEEE (2018) 5. Liu, D., Shi, G., Li, W.: Decision support based on optimal collision avoidance path and collision risk. In: 2018 3rd IEEE International Conference on Intelligent Transportation Engineering (ICITE), pp. 164–169. IEEE (2018) 6. Agrawal, S., Varade, S.W.: Collision detection and avoidance system for vehicle. In: 2017 2nd International Conference on Communication and Electronics Systems (ICCES), pp. 476–477. IEEE (2017) 7. Gaur, L., Rizvi, I.: Improved Vehicle collision avoidance system. In: 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), pp. 1071–1075. IEEE (2018) 8. Laurenza, M., et al.: Car collision avoidance with velocity obstacle approach: evaluation of the reliability and performance of the collision avoidance maneuver. In: 2019 IEEE 5th International forum on Research and Technology for Society and Industry (RTSI), pp. 465– 470. IEEE (2019) 9. Rabin, S.: Game AI Pro3: Collected Wisdom of Game AI Professionals, p. 265. A. K. Peters, Ltd. (2017) 10. Guy, S.J., et al.: Clearpath: highly parallel collision avoidance for multi-agent simulation. In: Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 177–187 (2009)
220
W. Zhang and T. Gao
11. Van den Berg, J., Lin, M., Manocha, D.: Reciprocal velocity obstacles for real-time multi-agent navigation. In: 2008 IEEE International Conference on Robotics and Automation, pp. 1928– 1935. IEEE (2008) 12. Rabin, S.: Game AI Pro 2: Collected Wisdom of Game AI Professionals (2015) 13. Durand, N.: Constant speed optimal reciprocal collision avoidance. Transp. Res. Part C Emerg. Technol. 96, 366–379 (2018) 14. Van Den Berg, J., et al.: Reciprocal n-body collision avoidance. In: Robotics Research, pp. 3– 19. Springer, Berlin (2011) 15. Baldi, T.L., et al.: Haptic guidance in dynamic environments using optimal reciprocal collision avoidance. IEEE Robot. Autom. Lett. 3(1), 265–272 (2017) 16. Jennett, C., et al.: Measuring and defining the experience of immersion in games. Int. J. Hum Comput Stud. 66(9), 641–661 (2008) 17. Brockmyer, J.H., et al.: The development of the Game Engagement Questionnaire: a measure of engagement in video game-playing. J. Exp. Soc. Psychol. 45(4), 624–634 (2009)
An Intelligent Travel Application Using CNN-Based Deep Learning Technologies Kyungmin Lee1 , Mincheol Shin2 , Yongho Kim3 , and Hae-Duck J. Jeong4(B) 1
Webtoon Operation Platform Development Team, Kakao Entertainment Corporation, Seoul, South Korea 2 Mobile Development Team, Village Baby, Seoul, South Korea 3 Department of Information and Communication Engineering, Myongji University, Seoul, South Korea [email protected] 4 Department of Computer Software, Korean Bible University, Seoul, South Korea [email protected]
Abstract. This paper discusses the design and implementation of an application that provides travel information inquiries, schedule management, and budget management to travelers preparing to travel or already traveling to provide a more efficient, leisurely and rewarding travel experience. In particular, it utilizes YOLOv4’s CNN-based deep learning technologies to perform coin recognition to ensure near real-time processing time, no GPU or low GPU performance, and guarantees minimum average precision. The intelligent travel application can instantaneously classify coins, identify the issuing country and provide exchange rate information. In addition, when a budgeting application is being used, the travel application is designed to conveniently implement and manage the budget, thereby saving the user time.
1 Introduction According to the Korea Tourism Organization’s 2019 Korean departure statistics, more than 28 million Korean overseas travelers (more than half of the total number of nationals) visited overseas. In addition, according to Mastercard’s Global Travel Cities Index (GDCI) report, Korean tourists’ overseas travel spending is ranked 6th in the world. As a result, numerous services related to travel are being released in advance. Currently available service applications provide limited types of related functions such as hotel reservations and air ticket reservations, and there are not many services that focus on convenience while traveling. This paper describes the design and implementation of an application that provides travel information inquiry, travel schedule management and budget management to travelers preparing to travel or already traveling to provide a more efficient, leisurely and rewarding travel experience. In particular, using the coin recognition function based on YOLOv4’s CNN (Convolutional Neural Networks)-based model, we also implement a function that immediately the classification of coins, coin issuance country and exchange rate information. In addition, when entering the budget into the application, it c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 221–231, 2022. https://doi.org/10.1007/978-3-030-79728-7_22
222
K. Lee et al.
is designed and implemented to conveniently manage the budget, thereby reducing the time required. In this paper, Sect. 2 will introduce research related to artificial intelligence and image processing technology. Section 3 will describe the coin recognition technology, Sect. 4 pertains to the transfer of data, Sect. 5 explains the coin recognition model, and in Sect. 6, experimental results and the implementation of the travel application will be presented, followed by the conclusion.
2 Related Works People from all over the world may well remember that the iconic moment when AlphaGo and Sedol Lee (9 Dan) played the Go game and AlphaGo won the game 4 to 1. AlphaGo is an artificial intelligence Go program developed by Google’s subsidiary DeepMind. With the advent of the 4th industrial revolution era, artificial intelligence and image processing technologies are being used and studied more actively in various application fields across autonomous vehicles, IoT(Internet of Things), and industries [18]. In particular, deep learning not only detects objects including image processing technology [2, 10, 11, 17], but can also sort fruits and vegetables [5, 7], manage autonomous flight based on drone target detection and tracking [13], and predict fine dust according to traffic flow [8, 16], IoT device recognition [3], etc. are being studied in various ways.
3 Coin Recognition The purpose of the coin recognition function is twofold. The first purpose is to reduce the amount of time it takes when entering coin categories and budgets into the application. In the case of foreign coins, when mixed with other coins, it is often difficult to identify the country of origin and value of the coin. Through the coin recognition function, we intend to apply a function that immediately identifies the issuing country of the coin as well as exchange rate information. The second purpose is to make budget management more convenient. The application needs to know the type and number of all currencies of the user, but it is not easy for the user to enter all the information manually. For improved user experience, coin information is to be entered into the application all at once. Computer vision methods used for coin recognition are largely classified into three categories. The first category involves looking at an image and classifying the image into one specific class. The second category involves localization that puts a bounding box on a specific position of the object recognized through the bounding box regression of the classified object in the image. The third category is when there are multiple objects instead of one object, and it is divided into object detection that applies classification and localization to multiple objects. Figure 1 shows the case of a cat, dog, and duck, and shows the connection between classification, localization and object detection.
An Intelligent Travel Application Using CNN-Based Deep Learning Technologies
223
Fig. 1. Classification, classfication + localization, and object detection of cat, dog and duck [4]
If you only want to achieve the first purpose of the coin recognition function, the classification+localization function is sufficient. However, in order to achieve the second purpose, you need to be able to classify coins and count the number of coins. To perform this function, an object detection technique is applied. As a requirement of the object detection technique, processing time close to realtime, no GPU(Graphics Processing Units) or low GPU performance must be achieved, and minimum AP (Average Precision) must be guaranteed. As of the time of this paper’s publication, several techniques such as EfficientDet [15], ATSS [19], ASSFF [12] and CenterMask [9] have been developed, but the requirements specified above have not been satisfied. However, the YOLOv4 [1] model satisfies these requirements. The CNN-based model that has recently been studied by numerous researchers, has a fairly high GPU requirement and must have a specific system environment. On the other hand, YOLOv4 provides guaranteed performance even on traditional GPUs such as Nvidia’s GTX 1080 Ti and RTX 2080 Ti and is in a proper position in the trade-off relationship between AP (Average Precision) and FPS (Frame Per Second). Therefore, in this paper, we intend to describe the design and implementation of an intelligent travel application using YOLOv4’s CNN-based model.
4 Data Preprocessing In order for the program to acquire coin data, about 1,000 images of the front and back of coins are required for each country. However, currently, this type of data is very limited. In addition, it is difficult to collect this data by taking images of coins one by one. Thus, as an alternative, the following coin image pre-processing was performed. A white sheet of paper was placed under the coin and brightness adjusted to shoot for 32 to 33 s. The rational for using white paper was that when a circle was detected in the image later, pictures on white paper had an increased detection rate. The reason for adjusting the brightness is that the user does not know the brightness of the environment in which the coin will be recognized, so we adjusted it to various degrees of brightness when taking the pictures. Next, an image was extracted from the captured coins. Image extraction extracted 1/20 of the entire frame, and about 98 images were extracted. As shown in the below Hough Transform algorithm [6, 14], a circle (the border of a coin) is detected from the image using the Hough Transform algorithm, and the image
224
K. Lee et al.
cut out of only the coin is saved. In this process, the alpha mask is also extracted and saved. The alpha mask is necessary when compositing the coin image with an arbitrary background. Algorithm of the Hough transform : (1) Build a parameter space with a suitable quantization level for line slope m and intercept b. (2) Create an accumulator array A(m, b). (3) Set A(m, b) = 0, ∀(m, b). (4) Extract image edges using Canny detector. (5) For each pixel on the image edges (xi , y j ), ∀(mk , bl ) verifying equation : bl = −xi mk + y j , increase : A(mk , bl ) = A(mk , bl ) + 1. (6) Find the local maxima in A(m, b) that indicate the lines in the parameter space. Although some data was created, it is still not enough, so the coin image data set in various situations is created by synthesizing images of coins with random backgrounds. Finally, augmentation was performed to further inflate the data generated so far. Augmentation is a technique in which an image is rotated, inverted, size reduced and enlarged, and a color filter or noise is added to inflate it into various forms in various ways. After all these steps, data processing required for learning has been completed.
5 Coin Recognition Model The coin recognition model is divided into two types. The first is to detect the bounding box area and then apply the classification model. The second method is to draw a number of random bounding boxes, erase the bounding boxes that are not suitable, and sort out the remaining bounding boxes as coins. A known issue was that the request response speed of the existing coin recognition API was noticeably slow. When this issue was resolved, the process was much faster and coin data was quickly retrieved. The reason for the slowness was that the weight of the coin recognition model was fetched for each request, which led to an inevitably slow response for each request. In order to optimize this process, when starting the server, the problem was solved by loading the model and maintaining the TensorFlow session. In addition, the data received by requesting coin recognition data has been improved so that it can be synchronized with the budget for travel itinerary budget management.
An Intelligent Travel Application Using CNN-Based Deep Learning Technologies
225
5.1 Only YOLOv4 Next, the augmented data was trained using YOLOv4. The specifications of the computer used for learning consist of i7 CPU, 16G RAM, and GTX 960M 4G. As a development environment, Python was used for the back-end, Flutter and Dart for the front-end, and GitHub, Stack, Trello, Skype, Codemagic, and Wikidocs were used as collaboration platforms. The total number of items in the data set was 38,740, and the train and test set were divided by an 8:2 ratio, and the train data was divided into 27,118 and the test data was divided into 11,622. The goal is to classify a total of 8 classes before and after the coin. The study period took 15 d from June 6, 2020 to June 20, 2020. In the first learning, no features were learned. By changing the resolution of the model and image, and adjusting subdivision, convolutional filter, etc., the performance of the model improved, and it became as shown in Fig. 2 on the far right. However, it did not detect the coin area well, and the sorting performance of the coin was very low. Two problems were hypothesized as follows: • Hypothesis 1. Model does not respond to camera distortion. • Hypothesis 2. The model does not detect the change in shape due to the light reflection of the coin. All 40,000 pieces of data were deleted, and data was collected again from the beginning. First, 100 pieces of data were collected and labeled close to the environment in which the actual coin was photographed. In addition, the augmentation method was changed, and the perspective method was applied to respond to coins photographed at an angle. In addition, additional augmentation techniques such as covering or cutting out a part of the coin were applied to extract general features of the data. Training was conducted with a total of 300 pieces of data, and after model optimization, the result as shown in Fig. 3 was calculated. The accuracy was very high, and a model that could be used in practice was derived.
Fig. 2. Process of changing a model and image resolution, and adjusting subdivision and convolutional filter
226
K. Lee et al.
Fig. 3. Model optimization
5.2
YOLOv4 + Image Classification
This study confirmed that learning with YOLOv4 will lead to poor results. However, in the case of this study, creating a model that recognizes only coins with YOLOv4 and implementing an image classification model that can classify detected coins by each type, lead to a successful method of classification through two processes. For purpose of this study, image classification was conducted by learning, using CNN. Before proceeding with image classification, data necessary for learning was also created for the purposes of this study. As shown in Fig. 4(a) and Fig. 4(b), a coin video was taken, and the image was extracted from each frame, and after extracting the border of the coin using the Hough Transform algorithm from the extracted image, only the coin part was cropped. As shown in Fig. 4(c), the cropped image is blended with a random background, and augmentation is performed to inflate a small amount of data.
An Intelligent Travel Application Using CNN-Based Deep Learning Technologies
227
(a) A coin image extracted (b) Cropped (c) A coin image blended from video frame image and augmented with random background
Fig. 4. Coin image classification
6 Experimental Results and Travel Application Implementation 6.1 Coin Recognition Experiment Results Among the generated data, there was a case where coins to be learned went out of the image due to augmentation. Since these types of data would not be helpful for learning, learning was performed after removing images that might interfere with the process. Figure 5 shows the training accuracy and validation accuracy graphs. The xaxis shows the number of epochs and the y-axis shows the accuracy. As the amount of data increases, one can see that the training accuracy is close to 1, and the validation accuracy is 0.46. Therefore, close convergence is evident.
Fig. 5. Graph of training accuracy and validation accuracy
228
K. Lee et al.
Figure 6 shows the CNN structure used for training. The epoch was set to 300, and max pooling and dropout were added to prevent overfitting, resulting in a number of 46.48% of the model performance.
Fig. 6. The CNN structure used for training
The EfficientNet [15] model saves a lot of the number of parameters and FLOPS (FLoating point Operations Per Second) used for training compared to the existing CNN models. It also has a high level of accuracy. However, it took too long to learn, and the results could not be confirmed. 6.2
Travel Application Implementation
As shown in Fig. 7, the screen structured by the application screen by directory is as follows. The internal structure was improved by refactoring, and the design and implementation steps were repeated once more by applying the merits of the existing UI and UX and deleting the parts that were considered to be demerits. From a user’s perspective, it is designed to be easier to use than existing applications. 6.3
Application Structure
One of the most important things when deciding the structure of application development is deciding how to produce and consume data between screens. Depending on which method is adopted, code duplication is prevented, the code can be used flexibly, and readability and resource management are facilitated. First, it passes the BLoC
An Intelligent Travel Application Using CNN-Based Deep Learning Technologies
229
Fig. 7. Major screens of the travel application
(Business Logic Components) pattern, and then the data for each screen transition. The pattern applied through these processes is ChangeNotifierProvider, which is a widget that provides an instance of ChangeNotifier to sub-items. In this pattern, when data is produced in the parent class, child classes can access the data after creating an instance. In addition, if you add a listener for the data, there is the advantage that the data can be continuously updated. Each class model was created by utilizing these features, and the screen was dynamically configured. Also, a signature function among the functions using this function is the ability to change to a dark theme. A separate dark theme
230
K. Lee et al.
function has been added to the application itself, and all properties can be changed without restarting the application.
7 Conclusions and Future Works This paper described an application designed and implemented for convenient and efficient travel abroad. In particular, by utilizing the coin recognition function based on YOLOv4’s CNN-based model, a function to immediately inform the classification of coins, country of issuance and exchange rate information was implemented. In addition, when a budgeting application is being used, the application is designed to conveniently implement and manage the budget, thereby saving the user time. Future research will focus on solving the problem of a low recognition rate when sunlight or lighting is directly reflected on the coin or the degree of damage to the coin is severe. To address this issue, future work will improve the performance of the model by modifying and supplementing the coin recognition model and adding data sets. Acknowledgements. The authors would like to give thanks to the funding agencies for providing financial support. Parts of this work were supported by a research grant from Korean Bible University. The authors also thank Professor WooSeok Hyun, Assistant Professor Wonbin Kim, Professor Jiyoung Lim, Assistant Professor Susan Elizabeth Nel, and Assistant Professor HyeKyung Yang for their constructive remarks and valuable comments.
References 1. Bochkovskiy, A., Wang, C.Y., Liao, H.: YOLOv4: Optimal Speed and Accuracy of Object Detection. ArXiv abs/2004.10934 (2020). https://arxiv.org/abs/2004.10934 2. Cartucho, J., Ventura, R., Veloso, M.: Robust object recognition through symbiotic deep learning in mobile robots. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2336–2341 (2018) 3. Chu, Y., Choi, Y.: A deep learning based IOT device recognition system. J. Semicond. Disp. Technol. 18, 1–5 (2019) 4. Druzhkov, P., Kustikova, V.: A survey of deep learning methods and software tools for image classification and object detection. Pattern Recogn. Image Anal. 26, 9–15 (2016). https://doi. org/10.1134/S1054661816010065 5. Hameed, K., Chai, D., Rassau, A.: A comprehensive review of fruit and vegetable classification techniques. Image Vis. Comput. 80, 24–44 (2018). https://doi.org/10.1016/j.imavis. 2018.09.016 6. Hassanein, A., Mohammad, S., Sameer, M., Ragab, M.: A Survey on Hough Transform, Theory, Techniques and Applications. IJCSI Int. J. Comput. Sci. Issues 12 (2015). http:// arxiv.org/abs/1502.02160 7. Kim, J., Cho, W., Na, M., Chun, M.: Development of automatic classification system of vegetables by image processing and deep learning. J. Korean Data Anal. Soc. 21, 63–73 (2019) 8. Kim, M., Shin, S., Suh, Y.: Application of deep learning algorithm for detecting construction workers wearing safety helmet using computer vision. J. Korean Soc. Saf. 34, 29–37 (2019) 9. Lee, Y., Park, J.: CenterMask: real-timeanchor-free instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13,906– 13,915 (2020)
An Intelligent Travel Application Using CNN-Based Deep Learning Technologies
231
10. Lee, Y.H., Kim, Y.: Implementation of object feature extraction within image for object tracking. J. Semicond. Disp. Technol. 17, 113–116 (2018) 11. Lee, Y.H., Kim, Y.: Comparison of CNN and YOLO for object detection. J. Semicond. Disp. Technol. 19, 85–92 (2020) 12. Liu, S., Huang, D., Wang, Y.: Learning Spatial Fusion for Single-Shot Object Detection. ArXiv abs/1911.09516 (2019) 13. Saqib, M., Daud Khan, S., Sharma, N., Blumenstein, M.: A study on detecting drones using deep convolutional neural networks. In: 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. 1–5 (2017). https://doi.org/10. 1109/AVSS.2017.8078541 14. Shapiro, L., Stockman, G.: Computer Vision. Prentice-Hall, Inc., Hoboken (2001) 15. Tan, M., Pang, R., Le, Q.V.: EfficientDet: scalable and efficient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10,778–10,787 (2020) 16. Yi, H., Bui, K., Seon, C.: A deep learning LSTM framework for urban traffic flow and fine dust prediction. J. KIISE 47, 292–297 (2020) 17. Yu, Y.S., Kim, C., Hong, C.P.: An implementation of embedded Linux system for embossed digit recognition using CNN based deep learning. J. Semicond. Disp. Technol. 19, 100–104 (2020) 18. Yu, Y.S., Kim, C.G., Hong, C.P.: An Implementation of Embedded Linux system for embossed digit recognition using CNN based deep learning. J. Semicond. Disp. Technol. 19, 100–104 (2020) 19. Zhang, S., Chi, C., Yao, Y., Lei, Z., Li, S.Z.: Bridging the Gap between anchor-based andanchor-free detection via adaptive training sample selection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9756–9765 (2020)
Reduced CNN Model for Face Image Detection with GAN Oversampling Jiha Kim and Hyunhee Park(B) Department Information and Communication Engineering, Myongji University, Yongin, South Korea {yaki5896,hhpark}@mju.ac.kr
Abstract. In this paper, we propose a reduced convolution neural network (RedNet) model for face image detection implemented using SqueezeNet and AlexNet. The proposed RedNet model shows about 12.5% lower storage efficiency than AlexNet. Compared to SqueezeNet, the RedNet model can improve accuracy by 2.3809% and reduce loss by 0.05986. Eventually, the RedNet model predicts age and gender based on face images. To do this, the RedNet model that is lighter than AlexNet and more accurate than SqueezeNet is proposed, but it still shows a disadvantage to the imbalance of the dataset. Therefore, we solve the overfitting and data imbalance problem in the learning process through oversampling of GAN and apply the proposed RedNet model.
1
Introduction
Recently, performance of PC hardware has improved while performance of commonly used small-embedded devices has also increased. However, the performance of embedded devices is bound to be lower than that of servers and PCs. Nevertheless, we intend to insert a deep learning model that requires high performance in the embedded devices. For example, it is intended to use small embedded devices rather than large devices in an application area such as detecting a specific person. In particular, in an environment such as the COVID Pandemic, it can be seen that all the equipment installed to detect personal information and thermal sensors is a small embedded environment. Therefore, in this paper, we develop a deep learning model that can predict age and gender through facial images. In particular, we propose a fine-tuning model that the deep learning model can operate in a small embedded environment. In addition, it is necessary to predict age and gender and simply provide services based on the face image and less data. To do this, we can consider two methods when used in the small embedded devices after learning the model. We have to decide whether to use a large amount of data to build a high-performing model or tolerate some errors while using a small amount of data. This is because small embedded devices operate with limited resources. The first is to use a model that takes up a lot of physical storage space instead of maximizing its performance. In order to use the c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 232–241, 2022. https://doi.org/10.1007/978-3-030-79728-7_23
Reduced CNN Model for Face Image Detection with GAN Oversampling
233
freezing technique, we have to use all the data in the training process. The second is to allow some error (depending on the user) but to minimize storage space. For this, it is often used to create ‘.tflite’ files using Tensorflow lite. In these two considerations, we proposes a deep learning model using face images. We proceed to optimize the initial learned model data to fit small embedded devices and show it by inference like a real-time operation using specific data (e.g., weight) rather than using all the information from the model. This is less accurate than using all the data in the model, but it can achieve storage efficiency. The TensorFlow Lite1 binary is 1 MB when all 125+ supported operators are linked (for 32-bit ARM builds), and less than 300KB when using only the operators needed for supporting the common image classification models InceptionV3 and MobileNet.
2
Related Work
AlexNet is a convolutional neural networks (CNN) structure that won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012 [1]. Basically, it is the model that has made the most progress in image classification using CNN. However, because too many parameters produce results over a very large number of computational processes, small embedded devices take up a lot of physical storage space (HDD, etc.) to implant models alone. On the other hand, squeezeNet compresses AlexNet to remove the fully connected layer and produces results using only the convolution layer [2]. Compared to AlexNet, the parameter is overwhelmingly small, and it also uses less storage space when transplanting models to embedded devices. However, it tends to fall in terms of accuracy. To solve the data imbalance that occurs to learn a small amount of original data, we apply the Generative Adversarial Nets (GAN) method. The oversampling method in GAN is based on learning ‘discriminator’ to judge images [3]. When input is provided, the discriminator proceeds with the learning to fit the entered image sample. However, in the early process, the data given by learning can also be determined to be incorrect. At this point, the ‘generator’ generates an image from a random noise using weights of features different from the distribution of the actual input image. If the accuracy of the discriminator is reduced, the learning of the generator is fixed and the optimization of the discriminator is carried out. Then, fix the discriminator and proceed with optimization for the generator. By repeating this process, eventually the discriminator determines the image of the generator, showing 50% to 70% in accuracy. Finally, it becomes an oversampling method that creates a new image from noise. ∗ = DG(z)
Pdata (x) , Pdata (x) + Pg(x)
(1)
∗ is the results of discriminator, Pdata (x) is xth data, and Pg(x) is where DG(z) xth data similarly created by the discriminator, respectively [3]. 1
The TensorFlow Lite: https://www.tensorflow.org/lite/guide.
234
J. Kim and H. Park
Using CGAN and DCGAN will solve the problem of not being able to utilize various features and channels of general GAN [4,5]. CGAN allows the creation of specific image classes by adding conditions. DCGAN uses convolutional layers to use all image features.
3
Proposed Reduced Model
In this paper, we make a face prediction model for small embedded devices based on a small set of face images. To do this, open facial images published on the Internet are collected, and the images are labeled by age and gender. In particular, images in their teens and 60s are collected as a small amount of data, resulting in data imbalance. In order to solve this problem, the GAN oversampling method is applied to a deep learning model. 3.1
Data Collection
The configuration of datasets is the most important part of the process of creating machine learning models. In general, the initial dataset is collected from the Kaggle site. Through the Kaggle site, a large amount of data can be collected at once, and various kinds of datasets can be obtained. Therefore, we collected the first dataset for face images divided by gender to create a simple classification model from the Kaggle site. Depending on the dataset collected, we can specify which model to use. Representative models (e.g., ResNet-50 [6], MobileNet [7], and VGG16 [8]) that are known to have high accuracies can be shown unstable performances depending on datasets. The dataset collected by Kaggle is mainly for human faces. However, it is not sufficient for the classification model by gender and age. Because face image classification can be influenced by the clothes, hair style, and accessories a person wear. Therefore, not only the face but also the upper body image is needed to make a classification model for gender and age. In addition, when the main target is Korean image, we only need to get a dataset of Korean. In this paper, we use the open portal sites such as Naver and Daum for Korea in order to collect datasets for Korean images. The collected images are public images without copyright. The size of the image is 95 by 119 which is the size of the identification photo. In the process of making the classification model, the 95 by 119 size is resized to fixed 64 by 64. 3.2
Data Imbalance Problem
When we collect a data set, it is difficult to have an even number for all classes. Eventually, we need to check if there is the problem of data imbalance in the collecting dataset. In this paper, we collected a total of 4,871 face images. Among the face images, the gender specific image dataset consists of 50.2% of men and 49.8% of women, respectively. In addition, the age specific image dataset consists of 7.0%, 15.8%, 18.7%, 19.6%, 19.4%, and 19.2% for each class, respectively. The gender ratios in the dataset are each composed of similar ratios. However, the
Reduced CNN Model for Face Image Detection with GAN Oversampling
235
ratio of age shows imbalance problems, for example under teens is about 2.5 times less than other ages. In Fig. 1, Each gender consists of a similar proportion. The ratio for the right-hand age consists of a similar proportion, except for those under teens or younger. We divide the ratio of the data set for the learning process into training with 80%, validation with 10%, and test with 10%, respectively. The validation set is used as a data set to prevent overfitting of learning. The test set is used for the testing process of a created classification model during the learning process.
Fig. 1. The ratio of face image dataset.
3.3
Data Preprocessing
In order to use the image dataset, it is necessary to load the images in the form of a numpy array. When loading the images, the size is fixed by 64 by 64. The reason the size is 64 by 64 is for the efficiency of physical storage space when the model is used in a small embedded device. For example, even though the hyperparameter of the AlexNet model is modified to match the size of the image, the physical storage space of the reduced model is one-eighth smaller. The imported image should be made into a 4D array for use in a convolutional neural network. At this time, the information of each dimension is as follows. This data array is used as the input shape of the convolutional neural network. As the last process of data preprocessing, the output is made in vector form by using the one-hot encoding for the target class. For example, when the target class is three, only the value at the index number three position is 1 and the rest are 0 as the vector form. After the one-hot encoding process is over, the process of data preprocessing for the learning and testing datasets is finished. 3.4
Proposed Model with Hyperparameter
In this paper, we proposed a reduced convolutional neural network (RedNet). The proposed RedNet model consists of three convolution layers and each max pooling layer. After that, it is connected to a fully connected layer. The fully connected layer consists of an input layer and two hidden layers, and finally classifies 12 classes as the output results. In order to make the RedNet model,
236
J. Kim and H. Park Table 1. RedNet hyperparameter Layer
Feature map Filter size Stride Padding Activation function
Input
3 (RGB)
–
–
–
–
Conv2d
96
4 by 4
2
Valid
ReLU
MaxPooling 96
3 by 3
2
Valid
–
Conv2d
256
3 by 3
1
Same
ReLU
MaxPooling 256
3 by 3
2
Valid
–
Conv2d
384
2 by 2
1
Same
ReLU
MaxPooling 384
3 by 3
2
Valid
–
Flatten
–
–
–
–
–
Dense
–
–
–
–
ReLU
Dense
–
–
–
–
ReLU
Output
–
–
–
–
Sigmoid
Fig. 2. Overall architecture of RedNet.
64 feature maps with three-by-three filters and two strides are performed as the convolution layer of the input. The reason why three by three filters and two strides are set is to reduce the data size of input in half. After passing through the max pooling layer, 64 features become 30 by 30 kernel size. The remaining four convolutional layers and max pooling layers equally make each output in half. Finally, when entering a fully connected layer, we can make a reduced model with a small number of inputs neural (Table 1).
Reduced CNN Model for Face Image Detection with GAN Oversampling
237
The overall structure of the reduced fully connected layer is shown in Fig. 2. After passing through five convolutional layers and five max pooling layers, the number of neural as final input becomes 1,024. In this paper, we decide to use the optimizer as the adaptive moment estimation (Adam). This is because the Adam optimizer can be updated at moments of gradient to adapt the learning rate for each weight of the neural network. To do this, the learning rate is set to 0.0001. The learning rate and other hyperparameters are optimized through simulation tests. We implement a categorical cross-entropy loss function for each class, which often in practice means a cross-entropy loss function for classification problems. In addition, when configuring a neural network, a positive probability value for each class can be obtained by using softmax as the activation function at the last output. Finally, the error can be calculated using the obtained results. As a learning parameter, 130 epochs are performed. The training and validation are conducted together with the train set and validation set, respectively. The batch size is set to 16 because the amount of dataset is small. As the results, the difference in accuracy between the train set and validation set is 61.191% in Fig. 3(a), and the difference in loss between the train set and validation set is 4.8638 in Fig. 3(b), respectively. 3.5
Data Oversampling and Augmentation
As the results of Fig. 3, the training results must be overfitting. As the validation results, the accuracy shows a difference of 61.191%, and the loss shows a difference of 4.8638. This difference can be seen as a large error even in a general model, and in this paper, three methods are used to solve it. The first is to blur the original images with kernel filters. Using kernel filters means to filter images by the Gaussian noise. It is similar to the operation method of making the same size of the input and output by using padding in the convolutional layer.
(a) Original dataset accuracy
Fig. 3. Original dataset.
(b) Original dataset loss
238
4 4.1
J. Kim and H. Park
Performance Evaluation ImageDataGenerator Dataset Results
Table 2. Each class shows the test dataset (without GAN dataset) results of the models Class Num of sample RedNet AlexNet SqueezeNet 0
308
45
22
38
1
285
20
24
44
2
682
191
174
140
3
659
140
144
81
4
792
117
84
116
5
791
224
256
138
6
836
88
87
127
7
814
129
216
207
8
812
198
187
192
9
835
273
219
139
10
814
567
502
479
11
814
398
324
393
The results of previous simulations are as follows. Simple augmentation techniques are used to supplement datasets with small amounts or unbalance. However, the new dataset is also built on the existing dataset, so it cannot produce meaningful results. Here, deep learning-based augmentation techniques are used (Table 2). Figure 4 shows that there is a wide gap between the actual answers in the test dataset and the results predicted by each model. Figure 4(a) to Fig. 4(c) show the results of each model predicting the test dataset and matching it. Figure 4(d) shows overall loss and accuracy. In the results of Fig. 4(d), RedNet has 1.78868% higher accuracy than AlexNet and 3.50628% higher than SqueezeNet. Loss is 0.7330 lower than AlexNet, but 0.9871 higher than SqueezeNet. 4.2
ImageDataGenerator Dataset with GAN Results
Figure 5(a) to Fig. 5(c) show the results of each model predicting the test dataset and matching it. The results of Fig. 5(d) show that RedNet performs worse than AlexNet. Compared to AlexNet, accuracy dropped 0.6076% and loss increased by 0.0383. Compared to SqueezeNet, accuracy is 2.3809% higher and loss is 0.0598 lower (Table 3).
Reduced CNN Model for Face Image Detection with GAN Oversampling
(a) RedNet test result
(b) AlexNet test result
(c) SqueezeNet test result
(d) Each models accuracy and loss
239
Fig. 4. Results of each model predicting the test dataset. Table 3. Each class shows the test dataset (with GAN dataset) results of the models Class Num of sample RedNet prediction AlexNet prediction SqueezeNet prediction 0
654
642
640
641
1
653
638
642
640
2
671
643
649
641
3
670
638
644
618
4
676
642
654
610
5
676
643
653
645
6
678
631
645
618
7
677
639
649
622
8
677
652
644
629
9
678
646
645
608
10
677
663
671
643
11
677
661
651
631
The Table 4 shows the number of parameters used in each model and the size of the file when the model was stored. For AlexNet, we adjust the filter of convolutional layers to fit the image size used in this paper. The model proposed in this paper tends to be slightly less accurate and loss than AlexNet. However, the number of parameters in the learning process can be reduced to reduce learning time, and it can also take the efficiency of
240
J. Kim and H. Park
(a) RedNet test result with GAN
(b) AlexNet test result with GAN
(c) SqueezeNet test result with GAN
(d) Each models accuracy and loss with GAN
Fig. 5. Results of each model predicting the test dataset with GAN. Table 4. Comparison of total parameters and model size for each model Category
AlexNet
SqueezeNet RedNet
Total parameters 60,162,516 728,652
7,517,580
Model file size
86.1 MB
688 MB
8.58 MB
storage space. In addition, common augmentation techniques are likely to recognize images as similar when they are created and become overfitting. The GAN technique can be used to avoid the overfitting problem by adding results from new images.
5
Conclusion
As small embedded devices become active, they are being used in many areas. In particular, a lot of research is being conducted for a heavy model based on deep learning into a small embedded device. In this paper, we proposed new reduced deep learning model that can predict age and gender using facial image data sets. In particular, the layer has been optimized that it can be put into a small embedded device, at the same time a deep learning model is proposed that the accuracy similar to that of AlexNet comes out. The proposed RedNet in this paper showed a small spatial efficiency of about 1/8 in storage space efficiency
Reduced CNN Model for Face Image Detection with GAN Oversampling
241
compared to AlexNet, and 2.3809% higher accuracy compared to SqueezeNet. In the future, it is expected that each user’s service requirements will increase due to embedded devices. This would require finding a service method that efficiently uses resources within a finite resource of embedded devices. Acknowledgements. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government Ministry of Science and ICT (MSIT) 2019R1F1A1060742.
References 1. Krizhevsky, A., Sutskever, I., Hinton, G.H.: ImageNet classification with deep convolutional neural networks. Commun. ACM Mag. 60(6), 84–90 (2017) 2. Iandola, F.N., et al.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and 0
(1)
Similarly, for the path P = [p1 , p2 , ...pn ], the state of the path can be defined as TP = i∈P 2i . For a new path T0 and current state T , the following formula can calculate the number of newly visited stakes in: (T0 , T ) = bitcount((not T )&T0 ).
(2)
In the formula, the bitcount function refers to the number of ones in the binary representation of a number. After completing the analysis and storage of the new path, the current status needs to be updated. The state transformation formula is as follows, where T represents the new state: T = T |T0
(3)
The total complexity of each round of the algorithm is O( 2nt ), where t is the machine word length.
6 Quality Evaluation Model In order to evaluate the software quality of the program under test and determine when to stop the test based on this, a quality evaluation model needs to be designed [14]. During the testing process of the quality evaluation system, the code coverage information and path coverage information of the program under test are collected. Combined with the code block information obtained during static analysis, the completeness of the current test can be evaluated; after a crash is found in the test process, the path analysis of the crashed sample is performed to determine the uniqueness of the crashed sample; combined with the construction of a test evaluation model, the test can be realized Evaluation of process and software quality. 6.1 Evaluation of the Completeness of the Fuzzing Test During the test, the total number of code blocks of the program under test was analyzed during the disassembly process, and the coverage of the code blocks was counted during the fast path analysis process. Based on the above data, the completeness of the fuzzing test can be calculated as: EC =
Count(Wvisited ) ∗ μC Count(W )
(4)
Where EC represents the completeness of the test, Wvisited represents the set of covered piles, W represents the set of all piles, and μC is the adjustment coefficient.
250
Z. Feng et al.
The adjustment factor is introduced to correct the proportion of unreachable codes during the test. For example, when testing one of the functions of the program, the code blocks to which the other functions belong cannot be accessed, but are counted together during the instrumenting process. Due to the difficulty of counting the number of code blocks, according to the actual program under test, the value of the adjustment coefficient can be estimated based on experience and used to correct the calculation of test completeness. 6.2 Analyze the Uniqueness of the Crash Sample During the testing process, a large number of fuzzing test samples that cause a crash may eventually be output. But some of the samples trigger the crash for the same reason, and need to remove duplication. At this time, the uniqueness of the crashed sample can be judged by the execution path of the sample that caused the crash. According to the path analysis results of the crash samples, the crash samples can be divided into two categories: unique crash point samples and unique path samples. The only crash point sample means that the last executed code block of the sample is at the same location during program execution. The code at this location is often the cause of the crash. Therefore, the samples that trigger the crash in the same code block can often be classified into one category. The unique path sample refers to the same path that the sample traverses during the execution of the program. Since the crash may be caused by multiple factors on the execution path, it is sometimes necessary to classify the samples according to the entire execution path of the sample. Note that the execution path here includes the last executed code block, so the only path sample must be the only crash point sample. 6.3 Comprehensive Evaluation Test Process Using code block coverage information and path coverage information to construct a path coverage map, and combining crash information to construct a test evaluation model, the test process and software quality can be evaluated. Assume that Ccrash represents the sample number of unique crash points triggered by the test process, and EC is the completeness of the test. The design quality evaluation model is as follows: β ∗ 100% (5) E = αEC + 1 + Ccrash Among them, E is the software quality evaluation value, and its value range is (0,1). The larger the value, the higher the degree of safety. α is the coverage coefficient, while β is the crash detection rate coefficient. There exists the equation that α + β = 1.
7 Analysis of Advantages and Disadvantages The advantages of the solution proposed in this paper are that it has no dependence on source code, is adaptable to various instruction sets and hardware environments, supports
Power Meter Software Quality Analysis
251
various debugging interfaces such as SWD and Ptrace, and has high test efficiency and consumes less hardware resources during operation. In addition, the safety evaluation model based on the coverage rate can visually show the test status, solve the blindness problem of traditional fuzz testing, and make the safety inspection process more reliable. The limitation of this solution is that its quality evaluation result mainly depends on the result of fuzzing test, and it has a certain degree of randomness and chaos [15]. For complex power meter software, even if no defects are found during the test, there is no guarantee that there will be no safety hazards in the power meter software [16]. Therefore, the quality evaluation result can only give a reference to the safety of the electric energy meter, and cannot guarantee the safety of the software.
8 Summary Aiming at the difficulty of safety inspection of smart energy meter equipment, this paper studies key technologies such as embedded binary dynamic instrumentation, fast storage and analysis of execution paths, and embedded software security evaluation based on traditional fuzzing testing technology. The inspection system provides a complete solution to the security inspection problem of embedded devices. In the next step, the system will be put into actual electric energy meter safety inspection tasks, and the stability and versatility of the system will be continuously optimized. Acknowledgments. This work is supported by Science and Technology Project of SGCC. (No.5600-201955458A-0–0-00).
References 1. Van Doorn, P.M., Lightbody, S.H., Ki, C.S.S.: Power meter for determining parameters of muliphase power lines (1998) 2. James, W.: Electric power measurement system and hall effect based electric power meter for use therein (1987) 3. Software engineering . Computer 8(2), 72–72 4. Jay, J.: Poorly-secured smart energy meters could place millions at risk from hackers, warns GCHQ[EB/OL] (2018). https://www.teiss.co.uk/smart-energy-meters-security-gchq 5. Makonin, S., Popowich, F., Gill, B.: The cognitive power meter: looking beyond the smart meter (2013) 6. Waheib, B.M., et al.: Design of smart power meter for local electrical power generators in Baghdad city. IOP Conf. Ser. Mater. Sci. Eng. 881, 012105 (2020) 7. Son, M.H.: Power-consumption control apparatus and method for smart meter (2011) 8. Zhang, S.L., Zhou, N., Wu, J.X.: The fuzzy integrated evaluation of embedded system security. In: International Conference on Embedded Software & Systems. IEEE Computer Society (2008) 9. Fujiwara, K., Matoba, O.: Detection and evaluation of security features embedded in paper using spectral-domain optical coherence tomography. Opt. Rev. 18(1), 171–175 (2011) 10. Tah, J.H.M., Carr, V.: A proposal for construction project risk assessment using fuzzy logic. Constr. Manag. Econ. 18(4), 491–500 (2000)
252
Z. Feng et al.
11. Dongmei, Z., Changguang, W., Jianfeng, M.: A risk assessment method of the wireless network security. J. Electron. 24(3), 428–432 (2007). https://doi.org/10.1007/s11767-0060247-6 12. Liu, Y.-L., Jin, Z.-G.: SAEW: a security assessment and enhancement system of Wireless Local Area Networks (WLANs). Wireless Pers. Commun. 82(1), 1–19 (2014). https://doi. org/10.1007/s11277-014-2188-y 13. Kim, D.R., Moon, J.S.: A study for protecting the virtual memory of applications. IEMEK J. Embed. Syst. Appl. 11(6), 335–341 (2016) 14. De Barros, M., et al.: Adaptive fuzzing system for web services. US 2013 15. Householder, A.D., Foote, J.M.: Probability-based parameter selection for black-box fuzz testing (2012) 16. Chen, Q.S., Chen, X.W.: Intelligent electric power monitor and meter reading system based on power line communication and OFDM. In: 2008 Congress on Image and Signal Processing, vol. 4, pp. 59–63. IEEE (2008)
DClu: A Direction-Based Clustering Algorithm for VANETs Management Marco Lapegna and Silvia Stranieri(B) Department of Mathematics and Applications, University of Naples Federico II, Naples, Italy {marco.lapegna,silvia.stranieri}@unina.it
Abstract. Nowadays, the world is adapting to the concept of Internet of Things, changing the way of intending everything around us, including cities. Indeed, smart cities are an interesting emerging application of Internet of Things, combining the infrastructures with new technologies aimed to improve mobility, energy consumption, but also the environment. In such a scenario, vehicular ad hoc networks play an important role, by allowing a smart communication between vehicles, and improving, as a consequence, traffic congestion and road safety. To perform such a communication in an efficient way, clustering techniques have always been used to organize vehicles in groups in an opportune way, so to exploit the network potential at the best. In this work, we propose an innovative clustering approach, which is based on the direction of movement of vehicles, rather than on their punctual position. The direction is expressed in terms of angle of displacement with respect to a fixed point. Results show that the direction-based approach provides a vehicles grouping guaranteeing an efficient inter-vehicular communication. Keywords: VANETs · Clustering · Internet of Things · Direction
1 Introduction With growing development of new technologies based on artificial intelligence, almost everything around us is been given a processing capability. This is what is called Internet of Things (IoT), meaning the application of Internet to objects and places. Among the several enforcement of IoT, we find smart cities, where the urban management is handled through the widespread emerging technologies. One of these are vehicular ad hoc networks (VANETs, for short), where vehicles play the role of nodes in a network by keeping it updated via broadcasting communication. In such a network, vehicle are supposed to be equipped with on board side units placed on the vehicle (such as GPS localization system) and on road side units placed on infrastructure elements along the streets. The communication can happen from vehicle to vehicle, performing a V2V communication, or, when the signal between vehicles is not strong enough to perform the information exchange, road infrastructures can be used as bridge between vehicles to transmit information over a longer distance, performing a V2I communication. Similarly to a generic wireless sensor network, clustering techniques are needed and are normally used to optimize the whole network management. In the VANETs c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 253–262, 2022. https://doi.org/10.1007/978-3-030-79728-7_25
254
M. Lapegna and S. Stranieri
case, clustering can be used to perform a vehicles grouping according to some criterion, in order to improve the information flow through the network itself, by transmitting only the information which is actually needed, among those vehicles which are actually interested in it. Our proposal starts with an observation: among the most interesting features of VANETs, but also their challenging points, we find their high dynamicity and self-organizing capability. For this reason, we want to provide a clustering algorithm which takes into account these properties. Indeed, in the clustering process, it would be reductive considering only snapshots of the network taken into an instant of time, without considering the continuous evolution of the vehicles movement. Indeed, our proposal is based on the direction of movement of the vehicles populating the network. In a way, we apply a punctual clustering, but by using metrics which allow including the network dinamicity, too. The direction of movement is expressed in terms of angle of displacement with respect to a known fixed point on the road map. The proposed algorithm provides the most reasonable number of clusters for a given set of vehicles, based on the intuition that vehicles moving toward the same direction have something in common, are interested in the same information, should belong to the same cluster. The rest of the paper is organized as follows: in Sect. 2, we summarize the state of the art on clustering in VANETs; Sect. 3 shows our proposal, from the used metrics to the algorithm pseudo-code; in Sect. 4, we propose some execution examples of our clustering algorithm; Finally, in Sect. 5, conclusions and hints for future development are provided.
2 Related Work Internet of Things is having an exponential growth in these last years, even thanks to the several and variegate applications. Healthcare is, for sure, one of the fields that is investing on IoT technologies the most, as they do in [25]. Nowadays researchers, such as authors of [26], are betting on IoT more than ever, because of the pandemic from COVID-19 affecting the whole world. However, in literature, applications of IoT are numerous, from data management [15], to energy [18], or network changing detection [8] and training [9, 11], or even social behavior analysis [1, 2], indoor localization [10], but also smart cities [21]. To deal with smart cities, very often researchers employ vehicular ad hoc networks technology to transform the city in a communicating network. Such a communication is made more efficient by means of clustering techniques, which guarantee a smart information flow. Clustering allows grouping vehicles according to some criterion for a more effective network management. Researchers offer different solutions, as they didi in [7, 12]. For instance, some solutions based on variation of the known k-means algorithm have been proposed in [23, 24]. In particular, in this field they propose several criteria for this goal. In [22], there is a survey on the existent clustering algorithms for VANETs, among which we find predictive solutions [14], proposals aimed to optimize energy consumption [16], or solutions based on the monitoring of malicious vehicles [17]. The innovative aspect of our proposal is to consider the vehicle behavior in the clustering process. This is made through the analysis of the direction of movement of any vehicle. In this way, we do not limit to create groups according the vehicles
DClu: A Direction-Based Clustering Algorithm for VANETs Management
255
punctual position in a given instant of time, but we catch several successive snapshots of the network in a single angle, representing the vehicle movement. To the best of our knowledge, this is the first work providing a clustering algorithm based on direction of movement. Our algorithm does not require the number of clusters to be known apriori, but it provides the best number of clusters needed with respect to the set of vehicles given as input. Definitely, the strength of our proposal is the ability to catch the network dinamicity and continuous changing, without increasing the complexity of the clustering process.
3 Direction-Based Clustering As explained in the previous section, clustering algorithms constitute an important ingredient in VANET communication mechanism, as they allow grouping vehicles which have something in common. Typically, vehicles are grouped according to their punctual position and the distance between them: vehicles close to each other in a given instant of time belong to the same cluster. In this work, we want to overcome such an aggregation criterion, by choosing a mechanism which is, in our opinion, more precise. We do not limit the analysis of vehicles features to a single snapshot of a static situation, but we want to enrich this analysis with features existing in their path for longer. To this aim, we introduce the direction parameter associated to each vehicle. When we speak about direction, we mean the way the movement of a vehicle is directed. To be more precise, we will compute the direction of movement as the angle δ of displacement with respect to a known fixed point. 3.1 Network Modeling In this section, we want to specify how our vehicular network is modeled to explain later the parameters and the logic behind our direction-based clustering. Definition 1 (VANET). Our vehicular network is a weighted graph: G = (V, E, weight)
(1)
where V is the set of nodes (vehicles) and E is the set of edges (communication channels) between nodes. The weight function: weight : E → [RSSImin , RSSImax ]
(2)
associates a weight to each edge. To be more precise, such a weight represents the Received Signal Strength Indicator, meaning the communication power between vehicles linked by the corresponding edge. RSSImin represents the minimum power required to perform the communication, while RSSImax is the best power allowing the speediest information exchange. Clearly, given two vehicles, being in the communication range of each other is the necessary condition to be in the same cluster.
256
3.2
M. Lapegna and S. Stranieri
Displacement Angle
We define, now, the key parameter for our clustering algorithm. As said before, the direction of the movement of any vehicle is expressed in terms of angle of displacement with respect to a fixed point. We call such an angle δ . Definition 2 (Displacement Angle). Let V be the set of vehicles as defined in Definition 1. Let v ∈ V be any vehicle of the network. Let F and P be the vector pointing to the known coordinate of the chosen fixed point and the vector obtained as projection of the vehicle respectively. The displacement angle δ is defined as follows:
δ (v) = angle between(F, P) ∀v ∈ V
(3)
In Fig. 1, an example is shown. As we can see, according to the different direction of traveling related to a given vehicle, the corresponding δ , drawn in red in the figure, is wider or smaller. The intuition is that, even if vehicle C and D from the Fig. 1 are very close to each other, they should not belong to the same group, since their paths, their behaviors, their directions are totally different. All this information is kept in a single value, which is δ . In our setting, the δ -function does not affect the vehicle speed at all: indeed, it only indicates where the vehicle is going to move, most likely.
Fig. 1. Example of vehicle movement direction as angle of displacement w.r.t. the fixed point
While in a standard clusterization algorithm based on distance C and D would be grouped in the same cluster, our approach allows discriminating those vehicles whose behavior is completely different even if their positions are not. In our opinion, this is a key point in a network such dynamic such as the vehicular ones. It is meaningless using algorithms based on static situation, punctual values, when we speak about networks which change at any instant. Of course, the assumption made in the previous section about the communication range avoids that vehicles with the same δ , but very far from one another end up in the same cluster.
DClu: A Direction-Based Clustering Algorithm for VANETs Management
257
3.3 The Threshold For the algorithm definition, we also need an additional parameter establishing the threshold over which a cluster switch should be performed, we call it θ . Definition 3 (Threshold). A threshold θ ∈ [0, 360] is an angle such that: v in current cluster if δ (v) ≤ θ v in a new cluster if δ (v) > θ
(4)
Clearly, varying the threshold we can obtain completely different result. Indeed, the choice of θ is a key step in our proposal. 3.4 The Algorithm According to the parameters explained in the sections above, we compute the directionbased clustering algorithm. First of all, we suppose having a set of vehicles and the corresponding δ -function indicating their angle of displacement from a fixed point. As shown in the code of Algorithm 1, we iterate over the vehicles in the network in order to assign an opportune cluster to any of it. Clearly, if the set of clusters, which we indicate as clusters is empty, we first create a new cluster, then we add the vehicle to it, and finally we add the created cluster to the set of clusters. If, instead, there are already formed groups in clusters, then we look for a candidate cluster which can contain the vehicle v to be allocated. If such a group exists, then v is added, otherwise we are in the same situation of the empty clusters.
Input: set of vehicles V Output: set of clusters for any vehicle v in V do if clusters.empty() then create a new cluster c; add v to c; add c to clusters; else c=candidate(clusters); if c.exists() then add v to c; else create a new cluster nc; add v to nc; add nc to clusters; end end end
Algorithm 1: Clusters creation
258
M. Lapegna and S. Stranieri Input: set of clusters and vehicle v to be added Output: candidate cluster c for vehicle v for any cluster c in clusters do M=average(δ (u)), with u ∈ c ; if |δ (v) − M| ≤ θ then candidate c; end end return candidate;
Algorithm 2: Candidate cluster extraction
The way the candidate group of clusters is selected is shown in the code of Algorithm 2. The criterion used to establish if v can be added to an existent cluster depends on δ and θ , obviously. Indeed, if there exists a cluster whose average value obtained from δ functions is similar to the δ function of v (where the similarity is decided by θ ), then such a cluster is returned. Essentially, for any cluster in the set of clusters, an average value over all the δ functions is computed. Such a value will be used as representative displacement angle for that cluster, and such an information will be useful to understand if a new vehicle should belong to a given cluster or it should not. The decision is not only because of the average value, but also the θ value. Indeed, if the new vehicle to be added has a displacement angle such that δ (v) − M, in absolute value, does not overcome the fixed threshold, then the vehicle can be added to the candidate cluster.
4 Evaluation In this section, we are going to provide some simulation results to validate our proposal. We applied the Algorithms 1 and 2 proposed in the previous section. In Table 1, we show a first example: the table contains the parameters given as input to the clustering algorithm, meaning the δ -function value for each vehicle in the set of vehicle to be grouped, made of 10 vehicles in this example. Table 1. Example 1 with 10 vehicles and their δ -functions v0
v1 v2 v3
v4
v5 v6 v7
v8 v9
δ 180 40 30 150 170 65 10 300 20 310
Figure 2 shows the resulting clustering from the set of values of Table 1. To perform the clustering in such example, we fixed the threshold θ = 20, to test the algorithm response. Indeed, it provides 6 different clusters, each one containing vehicles whose δ -functions are distant to each other no more than 20. Vehicles 1 and 2 are grouped, with δ 40 and 30 respectively, 3 and 4 with 150 and 170, 7 and 9 with 300 and 310, as well as vehicles 6 and 8.
DClu: A Direction-Based Clustering Algorithm for VANETs Management
259
Fig. 2. Resulting clusters from example 1 in Table 1 with θ = 20
In Table 2, we propose an additional example with 10 vehicles again, by varying their displacement angles. Table 2. Example 2 with 10 vehicles and their δ -functions v0
v1
v2 v3
v4
v5 v6 v7
v8 v9
δ 200 140 30 350 170 65 40 300 20 310
In the resulting clusters from Fig. 3, obtained by fixing θ = 10, we can immediately see how reducing the threshold causes a higher number of clusters produced by the algorithm. Indeed, in this case we obtain 8 clusters, among which 6 are single.
Fig. 3. Resulting clusters from example 2 in Table 2 with θ = 10
In Table 3, we propose a bigger example, by considering 20 vehicles and their associated displacement angles, then we will study how the clustering results change according to θ for this set of vehicles. Indeed, as we pointed out while presenting the algorithm, the candidate cluster choice is clearly driven by the threshold, whose value establish if a vehicle is accepted in the cluster or not.
260
M. Lapegna and S. Stranieri Table 3. Example 4 with 20 vehicles and their δ -functions v0
v1
v2
v3
v4
v5
v6 v7
v8 v9 v10 v11 v12 v13 v14 v15 v16 v17 v18 v19
δ 300 100 310 350 270 165 40 100 20 10 16 320 155 115 50 75 235 130 200 160
In the following Figs. 4a, 4b, and 4c, we can see the different solutions of clusters creation for the set of vehicle in Table 3, with θ set to 10, 20, and 30 respectively. As one can imagine, the more θ grows, the less the number of cluster.
Fig. 4. Resulting clusters from Table 3.
5 Conclusions Clustering in VANETs is a challenging topic. Several approaches have been done in literature, but, as far as we know, no one ever used the direction of movement to classify vehicles in the network. We have done it in this work, by providing a clustering algorithm exploiting the displacement angle of each vehicle with respect to a known point, by catching the dynamic nature of VANETs. Results prove that our approach provides an opportune number of groups starting from a set of vehicles. As future work, we aim to use formal techniques based on the strategic reasoning for multi-agent systems, by designing the desired solution as an equilibrium and solve it as a game, similarly to [5, 6, 19, 20]. We could also investigate the proposed clustering technique application to social networks, as they did in [3, 4], or to grouping tasks in large data centers to improve the resources management in [13].
DClu: A Direction-Based Clustering Algorithm for VANETs Management
261
References 1. Amato, A., et al.: Analysis of consumers perceptions of food safety risk in social networks. In: Advanced Information Networking and Applications. Springer, Cham (2019). https://doi. org/10.1007/978-3-030-15032-7 102 2. Amato, F., et al.: An agent-based approach for recommending cultural tours. Pattern Recogn. Lett. 131, 341–347 (2020) 3. Amato, F., et al.: Diffusion algorithms in multimedia social networks: a preliminary model. In: Advances in Social Networks Analysis and Mining 2017 (2017) 4. Amato, F., et al.: Extreme events management using multimedia social networks. Future Gener. Comput. Syst. 94, 444–452 (2019) 5. Aminof, B., et al.: Graded modalities in strategy logic. Inf. Comput. 261, 634–649 (2018) 6. Aminof, B., et al.: Graded strategy logic: reasoning about uniqueness of nash equilibria. In: Autonomous Agents & Multiagent Systems (2016) 7. Balzano, W., Murano, A., Stranieri, S.: Logic-based clustering approach for management and improvement of VANETs. J. High Speed Netw. 23(3), 225–236 (2017) 8. Balzano, W., Murano, A., Vitale, F.: EENET: energy efficient detection of network changes using a wireless sensor network. In: Conference on Complex, Intelligent, and Software Intensive Systems. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61566-0 95 9. Balzano, W., Murano, A., Vitale, F.: SNOT-WiFi: sensor network-optimized training for wireless fingerprinting. J. High Speed Netw. 24, 79–87 (2018) 10. Balzano, W., Murano, A., Vitale, F.: V2v-en-vehicle-2-vehicle elastic network. Procedia Comput. Sci. 98, 497–502 (2016) 11. Balzano, W., Murano, A., Vitale, F.: WiFACT–wireless fingerprinting automated continuous training. In: Advanced Information Networking and Applications Workshops (WAINA). IEEE (2016) 12. Balzano, W., et al.: A logic-based clustering approach for cooperative traffic control systems. In: P2P, Parallel, Grid, Cloud and Internet Computing. Springer, Cham (2016). https://doi. org/10.1007/978-3-319-49109-7 71 13. Barone, G., et al.: An approach to forecast queue time in adaptive scheduling: how to mediate system efficiency and users satisfaction. J. Parallel Program. 45(5), 1164–1193 (2017) 14. Cheng, J., et al.: A connectivity-prediction-based dynamic clustering model for VANET in an urban scene. IEEE Internet Things J. 7(9), 8410–8418 (2020) 15. Di`ene, B., et al.: Data management techniques for Internet of Things. Mech. Syst. Signal Process. 138, 106564 (2020) 16. Elhoseny, M., Shankar, K.: Energy efficient optimal routing for communication in VANETs via clustering model. In: Elhoseny, M., Hassanien, A. (eds.) Emerging Technologies for Connected Internet of Vehicles and Intelligent Transportation System Networks. Studies in Systems, Decision and Control, vol. 242. Springer, Cham (2020). https://doi.org/10.1007/9783-030-22773-9 1 17. Fatemidokht, H., Rafsanjani, M.: QMM-VANET: an efficient clustering algorithm based on QoS and monitoring of malicious vehicles in vehicular ad hoc networks. J. Syst. Softw. 165, 110561 (2020) 18. Hossein Motlagh, N., et al.: Internet of Things (IoT) and the energy sector. Energies 13(2), 494 (2020) 19. Jamroga, W., Malvone, V., Murano, A.: Reasoning about natural strategic ability. In: Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems (2017) 20. Jamroga, W., Murano, A.: Module checking of strategic ability. In: Autonomous Agents and Multiagent Systems (2015)
262
M. Lapegna and S. Stranieri
21. Jiang, D.: The construction of smart city information system based on the Internet of Things and cloud computing. Comput. Commun. 150, 158–166 (2020) 22. Katiyar, A., Singh, D., Yadav, R.S.: State-of-the-art approach to clustering protocols in VANET: a survey. Wireless Netw. 26(7), 5307–5336 (2020) 23. Laccetti, G., et al.: A high performance modified K-means algorithm for dynamic data clustering in multi-core CPUs based environments. In: Montella, R., et al. (eds.) Internet and Distributed Computing Systems, IDCS19. Lecture Notes in Computer Science, vol. 11874. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-34914-1 9 24. Laccetti, G., et al.: Performance enhancement of a dynamic K-means algorithm through a parallel adaptive strategy on multicore CPUs. J. Parallel Distrib. Comput. 145, 34–41 (2020) 25. Qadri, Y.A., et al.: The future of healthcare internet of things: a survey of emerging technologies. IEEE Commun. Surv. Tutor. 22(2), 1121–1167 (2020) 26. Singh, R.P., et al.: Internet of Things (IoT) applications to fight against COVID-19 pandemic. Diabetes Metab. Syndrome Clin. Res. Rev. 14(4), 521–524 (2020)
ComPass: Proximity Aware Common Passphrase Agreement Protocol for Wi-Fi Devices Using Physical Layer Security Khan Reaz(B) and Gerhard Wunder Institute of Computer Science, Freie Universit¨at Berlin, Berlin, Germany [email protected]
Abstract. Secure and scalable device provisioning is a notorious challenge in Wi-Fi. WPA2/WPA3 solutions take user interaction and a strong passphrase for granted. However, the often weak passphrases are subject to guessing attacks. Notably, there has been a significant rise of cyberattacks on Wi-Fi home or small office networks during the COVID-19 pandemic. This paper addresses the device provisioning problem in Wi-Fi (personal mode) and proposes ComPass protocol to supplement WPA2/WPA3. ComPass replaces the pre-installed or user-selected passphrases with automatically generated ones. For this, ComPass employs Physical Layer Security and extracts credentials from common random physical layer parameters between devices. Two major features make ComPass unique and superior compared to previous proposals: First, it employs phase information (rather than amplitude or signal strength) to generate the passphrase so that it is robust, scaleable, and impossible to guess. Our analysis showed that ComPass generated passphrases have 3 times more entropy than human generated passphrases (113-bits vs. 34-bits). Second, ComPass selects parameters such that two devices bind only within a certain proximity (≤3m), hence providing practically useful in-build PLS-based authentication. ComPass is available as a kernel module or as full firmware.
1 Introduction Connectivity is the key to the world of business, entertainment, education, and government services. While cellular dominates mobility use-case, 802.11 a.k.a Wi-Fi is the single most widely used technology to access the internet when it comes to streaming movies to the smart TV at home, making a video conference call at the workplace, or merely sharing vacation photos from a hotel room or a caf´e. In recent years, consumers have also embraced Wi-Fi for connecting new types of peripherals as part of their daily life such as Amazon Alexa powered Echo devices or Google Connected Home or Apple Home accessories. Quite recently, the world has faced COVID-19 pandemic. Due to the lockdown, people relied on home Wi-Fi more than ever to work remotely. Interpol reported an alarming rate of cyberattacks during the pandemic months [14]. The increased number of remote-working has made an adversary more interested in the radio part of the communication since it is more straightforward to capture packets over the air. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 263–275, 2022. https://doi.org/10.1007/978-3-030-79728-7_26
264
K. Reaz and G. Wunder
The Wi-Fi Alliance has developed several security protocols over the last decades to secure Wi-Fi communication. Nevertheless, none of the protocols provides full-proof and future-proof security. Recently, a significant flaw, popularly known as KRACKattack was discovered, and it heavily affected all platforms [26]. To ease the provisioning of credentials, especially for resource-constrained devices, Wi-Fi Alliance had developed Wi-Fi Protected Setup (WPS) protocol. It gives consumers an easier option to set up a secure Wi-Fi connection by pushing a button (PBC mode), or entering a PIN, or via NFC interface [33]. However, WPS has a long-standing weak security, known as WPS PIN recovery [28]. In an effort to strengthen the security of Wi-Fi, WPA3 has been recently announced (last release v3.0 on December 2020) [34]. The new standard mandates a higher cryptographic standard (192-bit key for enterprise mode, although 128-bit for personal mode). It replaces the Pre-Shared Key (PSK) exchange with Simultaneous Authentication of Equals (SAE) and introduces Forward Secrecy. However, a passphrase is still used. The newly introduced Wi-Fi Easy Connect [31] replaces all previous methods of WPS with a Public Key Cryptography (PKC) based provisioning mechanism. In Wi-Fi Easy Connect, a network owner is presumed to have a primary device (Configurator) with a rich user interface (e.g., a smartphone or tablet with a camera) that runs the Device Provisioning Protocol (DPP). Here, all Enrollees have electronic or printed QR codes or human-readable strings. The Configurator scans the code (the user can also manually type in the human-readable strings) to provision the Enrollee with credentials. DPP relies on QR code scanning, which is not at all feasible for a large number of devices (Think of a premise to be monitored with a Wi-Fi IP camera; then all the cameras have to be scanned and connected to the network). The Wi-Fi Alliance has released another supporting protocol, called Enhanced Open [10]. It is an adapted version of the Opportunistic Wireless Encryption (OWE) [32] protocol that aims to mitigate attacks on open un-encrypted wireless networks. Here, the client (STA) and the access point (AP) generate pairwise secret by performing Diffie-Hellman (DH) key exchange during the 4-way handshake procedure. OWE is based on PKC, and PKC is threatened by the uprising of quantum computers [17]. It is to be noted that the exponent size used in the DH must be selected such that it has at least double the entropy of the entire crypto-system, i.e., if we use a group whose strength is 128-bits, we must use more than 256-bits of randomness in the exponent used in the DH calculation [18]. This brings to the required DH key-size of 4200 bits at its best strength estimation [18]. The large key-size is a massive burden for the IoT ecosystem [17]. In recent years several works have been done to generate the secret key using PHYlayer properties based on Shannon’s [24], Wyner’s [36], and Maurer’s [20] seminal information-theoretic security concept. In the Physical Layer Security (PLS) approach, the inherent reciprocity property of the wireless channel and its varying nature (i.e. randomness) is used to agree on a key between two legitimate transceivers. Enthusiasm among researchers gave a significant rise towards developing key generation algorithms on this principle. Most of the existing works are based on Amplitude or Received Signal Strength (RSS) [25, 37, 39, 40]. 
It is because Amplitude and RSS show reciprocity without much effort and hence can be easily reconciled to generate a symmetric key. On the other hand, the slightest displacement of the transceivers cause the Phase to vary
ComPass
265
significantly. Xi et al. proposed the Dancing Signal (TDS) scheme in [37]. It requires devices to be within 5cm which is very impractical since most of the cases APs are wallmounted or hidden to keep away unwanted hardware access. In TDS, keys are generated from the local entropy source instead of the randomness of the wireless channel. Their evaluation showed a good performance since the implementation is done on a traditional computer. This will not be the case for resource-constrained IoT devices which are known to have low entropy [17]. From the literature, it is well established that Amplitude or Received Signal Strength (RSS) based existing methods are slow, need iterative communications, authenticated channel, and a large number of samples to generate a good quality key. We propose ComPass to tackle the challenges mentioned above. It is a new proximity aware common passphrase agreement protocol for deployable Wi-Fi network consisting of all classes of Wi-Fi devices (hence, some devices may have no camera or keypad). Our PLS based proposed method uses Phase information of the wireless channel and its varying nature (i.e., randomness) to agree on a passphrase between two legitimate transceivers. With the ComPass generated passphrase, it is possible to generate 128/192/256-bit (or higher) key with high entropy at a minimum communication overhead. Our intention is not to replace the well known WPA2/WPA3; instead, supplement it with the new automated passphrase generation protocol. The paper is organized as follows. A brief introduction to Wi-Fi channel measurement is given in Sect. 2. We presented the end-to-end steps of the ComPass protocol in Sect. 3. and its security analysis in Sect. 4. Sect. 5 describes the implementation details. Our concluding remark is given in Sect. 6.
2 Preliminaries We revisit some of the core technologies of the Wi-Fi PHY, specifically Beamforming. It utilizes the knowledge (i.e., Channel State Information (CSI)) of the MIMO channel to improve the receiver’s throughput significantly. In the complex baseband MIMO channel model, a vector xk = [x1 , x2 , ...xNTx ]T is transmitted in subcarrier k using OFDM scheme. The received vector yk = [y1 , y2 , ...yNRx ]T is then modeled as: yk = Hk xk + Z
(1)
Hk is the channel response matrix of dimensions NRx × NTx where NRx is the maximum number of receiving antenna, NTx is the maximum number of transmitting antenna. Hk is expressed in complex number to represent the attenuation (i.e. amplitude (Hk ) and the phase shift (∠Hk )) for each subcarrier k. Z is the additive white Gaussian noise. The CSI is expressed as a multidimensional matrix taking H = [NRx ][NTx ][NSc ] form. NSc is the number of used data subcarriers [38, 41]. Depending on the Wi-Fi chip, protocol version, bandwidth and channel estimation method, the size of this matrix will vary. For example, a 3 × 3 MIMO device with a Qualcomm Atheros Wi-Fi chip operating on IEEE 802.11n 5GHz band with a BW =20/40 MHz would report CSI as a [3][3][56/114] matrix. We refer to the IEEE standard [13] for the detailed explanation of the IEEE 802.11 PHY procedure.
266
K. Reaz and G. Wunder
3 ComPass Protocol Let us define entities of the ComPass protocol. Access Point is kept hidden (to reduce Evil Twin attacks) and it has an Authenticator with a rich user interface. The Enrollee is a device with limited interface (it can have a rich user interface too). Before initiating the protocol, devices are brought within proximity (≤ 3m). Summary of the protocol steps are as follows (1) With a button press or after booting, the Enrollee broadcasts its name-id with random nonce in Wi-Fi infrastructure mode. Power button or existing WPS button can be re-programmed for this purpose. (2) Authenticator verifies and confirms the Enrollee from an app or from the system’s Wi-Fi setting. a. Authenticator and Enrollee perform procedures as mentioned in the following sections (3.1 to 3.5) to generate a common passphrase1 . Once connected, the Authenticator sends (SSID + AP-MAC) to the Enrollee. Subsequently, it sends Enrollee’s MAC+ passphrase1 to the Access Point. This communication is already encrypted since the Authenticator has joined the network beforehand. b. Enrollee switch to Wi-Fi Client mode after receiving (SSID + AP-MAC) from the Authenticator. It sends Association request to the Access Point appending hashed passphrase1 . c. Access Point verifies the request by comparing hashed passphrase1 . If successful, it initiates procedures as described in Sect. (3.1 to 3.5). (3) Access Point and Enrollee generates passphrase2 in the similar way. (4) If successful, Access Point allows Enrollee to connect and it notifies Authenticator, else Enrollee returns to step (2)b. Finally, Authenticator and Access Point delete the passphrase1 . Authenticator and Enrollee refer to STA and the Access Point as AP. We assume that the Authenticator joins the Access Point securely either by existing WPA2/WPA3 method or by generating their common passphrase according to the procedures mentioned in Sect. (3.1 to 3.5). New devices can only be joined through the Authenticator(s). In the following subsections we present the intermediate steps of the protocol and algorithms. 3.1
Synchronized CSI Collection
In the last few years, several toolchains have been developed by researchers to extract CSI from commercial off-the-shelf (COTS) devices. Among them, the Intel CSI Tool (ICT) by Halperin et al. [9] and the Atheros CSI Tool (ACT) by Xie et al. [38] are widely used. The recent release of nexmon CSI extraction tool [8] has opened the door for extracting CSI from Broadcom and Cypress chipsets. Although there are some differences between these toolchains, they all report CSI to the firmware’s user-space in a similar fashion. Hence, ComPass remains compatible with all of them. In this paper, we worked with ACT to implement ComPass on devices. We have patched some of the bugs that we found in ACT. For example, previously, the driver reported CSI for all packets,
ComPass
267
including the acknowledgment packets (ACK). It caused one device to have more CSI data than the other. The ACT supports up to 3 RF chains, but the SoC firmware sometimes use the Link Adaptation technique (especially in LOS scenarios) to turn off some antennas. Also, the time stamp associated with the reported CSI was according to each device’s local clock. It caused misalignment for our intended use of the CSI to generate a common passphrase on both devices. One of the first challenges of PLS method is to ensure that the collected channel measurements are coming from the packets that are exchanged within the channel’s coherence time. This is to make sure that the collected channel measurements on both sides hold reciprocity property. To mitigate the unwanted effects on CSI, we employed a Synchronous CSI Collection (SCC) procedure between the devices to ensures that they have a common time stamp (up to a certain accuracy) and only CSI from the correct probing packets are logged. At first, STA aligns its local clock with AP by utilizing the Linux built-in library. AP instructs STA to start exchanging a fixed N number of dummy packets after waiting for td seconds. Once CSI for an incoming packet is reported, it checks for Rx × Tx combination. If Rx = Tx , CSI value is dropped. After collecting CSI for N packets, the protocol moves to the Parameter Extraction step. 3.2 Parameter Extraction A vast body of literature on channel-based key generation, specifically those who implemented their schemes on COTS hardware relied only on the Amplitude/RSS part of a signal; only a very few considered to work with the Phase part [25, 29]. However, the Amplitude fluctuation of the signal is very low in proximity and in an static environment [23]. An active adversary can generate a synthetic channel amplitude profile to mimic the intended transceiver. Conversely, Phase varies significantly in an indoor environment while respecting the reciprocity property [35]. Thus, it is nearly impossible for an adversary to generate a synthetic phase profile. In this paper, we investigate the Phase part of the channel frequency response. It is to be noted that the CSI reported by the Wi-Fi SoC driver contains the channel’s cumulative frequency response and the device’s inner circuitry response as it goes through amplification, down-conversion, packet detection phase. All this additional processing contaminate the true channel response as verified by previous works [15, 19, 27, 35, 37, 41]. Hence, the collected CSI needs sanitizing to remove unwanted effects. According to Zhu et al. [41], the measured phase φk = ∠H( fk ) can be decomposed as: φk = atan εg ·
sin(2π · fs · k · ζ + εθ ) cos(2π · fs · k · ζ )
− 2π · fs · k · λ + β
(2)
where gain mismatch and phase mismatch is denoted by εg , and εθ respectively. Unknown timing offset and phase offset error is indicated by ζ and β . λ sums up the delay caused by time-of-flight (TOF), packet detection delay (PDD) and sampling frequency offset (SFO). Note that AWGN is omitted since it would cancel out when comparing phases of the measured CSI from two nodes.
268
K. Reaz and G. Wunder
We adapted the decomposition method of [41] to extract the relevant parameter from the measured CSI phase. We have studied the characteristics of these five parameters through several measurement campaigns performed at various locations at the Freie Universit¨at Berlin and other private apartments that included LOS and NLOS scenarios. Our key findings are: (i) The almost sigmoidal-shaped arcus tangent function of the Eq. 2 strongly conforms in the LOS scenario and fails in the NLOS scenario. (ii) Cumulative delay parameter, λ is almost constant, which is expected because TOF, PDD, SFO remains static for a low mobility environment. Conveniently, λ could be useful to filter out CSI for a packet that arrived later than the channel coherence time Tc . (iii) εg and εθ are the only useful parameter with good statistical properties. This revelation of our analysis encouraged us to extract εg and εθ from the collected CSI and proceed to the next steps. Taking the Eq. 2 as a reference decomposition model, we estimate the default value for each of the five parameters from the ideal arctan function: εg = 0.512, ζ = −0.02812, εθ = −0.006355, λ = −0.02762, β = 0.1326. Then we perform a non-linear least square curve-fitting operation to estimate the parameters. Before we implemented ComPass on our COTS setup, we used a simulation tool for the next steps by quantizing both εg and εθ . Our analysis showed that εg gives a slightly better result. Henceforth, ∑Ni=1 εg is the parameter from the measured CSI-phase that we will use in the following steps. The Delay Aware Parameter Extractor (DAPPER) algorithm is described in the Step 1 of Algorithm 1. AP and STA perform DAPPER independently. 3.3
Parameter Quantization
Existing lossy and lossless (as categorized by Zenger et al. in [39]) quantization schemes in the literature tend to overlook the fact that the underlying reciprocity would be broken if the guard-interval for converting measured complex-valued vectors to bit-string is calculated based on the whole CSI data set. Keeping this fact in mind, we opted in for an adaptive moving window based quantizer (MOW) (Step 2 of Algorithm 1. It is a lossless scheme and produces bit-string at 1 Bit/sample. The resulted scheme overcomes the well-known problem of burst 0’s and 1’s (i.e., 000 . . . 0, 111 . . . 1). In an one-hop wireless environment, Round-Trip-Time (RTT) can be a useful metric to roughly estimate the effective channel coherence time (Tc ) instead of using the Clarke’s mathematical reference model [22]: Tc = 16π (9f )2 , ( fm is the Doppler m spread). RTT is readily available for each packet, and it takes into account various factors including propagation delay, clock offset, processing delay, motions of objects in the environment. We get the mean RTT value for the exchanged packets to set the window size w for the MOW quantizer, which is then rounded up according to the IEEE 745 standard respecting the half-to-even rule. The minimum is w = 3 since it needs at least 3 packets to successfully calculate the distance for two nodes (with asynchronous clocks). Then starting from the most significant bit, we take w element from εg and find the mean w¯ of that window. We convert each element of the w to 1/0 such that 1 for εgi ≥ w¯ QAi /Bi = . After that, it moves to the next window and continues until 0 for otherwise the last element. If the last window has fewer elements than w, it will be filled by 0. This process will construct quantized bit strings QA for STA and QB for the AP.
ComPass
269
Algorithm 1: Related algorithms for ComPass protocol
1 2 3 4 5 6 7 8
9 10 11 12 13 14 15 16
17 18 19 20 21
22 23 24
/* Step 1: Delay Aware Parameter Extractor (DAPPER) */ Input : CSI-phase: φk of N packets Output: Parameter: εg for i ← 1 to N do Extract εg , ζ , εθ , λ , β by non-linear least-square curve fitting on Eq.2 if λi λ0 then Drop the associated εg else Continue end end /* Step 2: Moving Window based quantizer (MOW) */ Input : Channel parameter εg Output: Quantized bits QA for STA, QB for AP Get the mean RT T from exchanged N packets Calculate window size, w ← mean(RT T ) for j ← 1 to N/w do for i ← 1 to w do Calculate mean, w¯ ← ∑w i εg 1 for εgi ≥ w¯ QAi /Bi = 0 for otherwise end end /* Step 3: Generate Secure Sketch at STA */ Input : QA , random strings R1 , R2 Output: SS Generate R1 , R2 : where len(R1 = R2 ) = len(QA ) Multiply R1 and R2 to get MR1 R2 Use BCH to generate code: C ← BCHenc (MR1 R2 ) Generate syndrome of the code: syn(C) return SS (QA , MR1 R2 , syn(C)) /* Step 4: Recovery at AP */ Input : QB , SS Output: QA Generate K(QB , SS ) Decode with BCH: N ← BCHdec (K) return QB (SS , N) ≡ QA
3.4 Reconciliation Reconciliation shares the common properties of error-correction. The quantized bits on AP and STA are not necessarily the same; thus they cannot be used as is. In [5], Dodis et al. presented a new primitive: Secure Sketch (SS). We employ SS as the reconciliation protocol for its notable advantages over others [5]. It allows reconciling one party’s quantized bits with the other at minimum leakage. We chose a binary Bose–Chaudhuri– Hocquenghem (BCH) code based construction for SS, referred to as PinSketch [11].
270
K. Reaz and G. Wunder
It is the most efficient, flexible, and linear over GF(2). One can overcome the computation time by choosing an efficient decoding algorithm for the BCH [5]. We designed the algorithm in a bottom-up approach using the available BCH library in the Linux kernel [4]. 3.4.1 Secure Sketch SS generates public information X about its input a that can be used to reproduce a from its correlated version a , where a ∈ M and the metric space M has a distance function δ . It is a randomized procedure involving Sketch and Recover such that for input a ∈ M , Sketch produces a string s ∈ {0, 1}. The Recover procedure, Recover(a , Sketch(a)) = a works when δ (a, a ) ≤ t, t is the number of error. It uses random bit strings to mask original information from an adversary. 3.4.2 Construction Procedure At this point, STA and AP has quantized bit strings QA , and QB respectively which are similar but not same. Our goal is to reconcile QB with QA at minimum leakage. We start designing the algorithm by choosing the Galois field order m. In our case m = 7 for generating a 128-bit key; which makes the maximum BCH codeword size n = 127 ← (2m − 1). Details of the BCH algorithm is out of scope of this paper, hence, we refer to the original works [2], [12] and its modified version for SS in [11]. With the optimum error-correcting capability set as t = 9 bits, we create blocks each with 56 bits resulting 3 blocks. Because of the size of n, the last block has padding bits. Then each block is treated independently to produce secure sketch according to the Step 3 of Algorithm 1 and concatenated: SS ← Ss1 Ss2 Ss3 STA sent SS to AP as the helper string (note that SS does not expose the quantized bits QA ). AP performs Recovery operation according to the Step 4 of Algorithm 1 to find the mismatch in QB and correct them. Usually, in a BCH decoder, error locator root-finding is done by Chien search [3]. However, in our implementation, we used the technique of [1] for its better performance. It consists of factoring the error locator polynomial using the Berlekamp Trace algorithm down to degree 4. After that, the low degree polynomial solving technique of [42] is used. Fianally, AP and STA possess the same bit string, resulting in QB ≡ QA . 3.5
Mapping Bits to Passphrase
We map each 8-bit (starting with MSB) of the QA /QB according to the widely adopted 8-bit Unicode (UTF-8) (i.e., total 256 characters) encompassing the whole alphabet set of a passphrase (lowercase, uppercase, numerals, and symbols). Since there are some control and non-latin characters within the UTF-8 table, we changed U+0000 – U+0020 → uppercase HEX, and U+0080 – U+00FF → lowercase HEX. U+0021 – U+007E remains unchanged. This way, the generated passphrase complies with password policies such as lower and uppercase letters, digits and symbols (converted HEX are treated as regular AlphaNumeric). Finally, the resulted passphrase is treated as per the IEEE 802.11 standard’s recommended passphrase to PSK mapping, as defined in IETF RFC 2898 Sect. 5.2 [16].
ComPass
271
4 Security Analysis We used two well-known password quality estimators to evaluate ComPass generated passphrase. Microsoft’s zxcvbn toolkit [30] is used to calculate the number of minimum attempts needed to guess (crack) a password using brute-force. zxcvbn’s algorithm finds token, reversed, sequence, repeat, keyboard, date, and brute force pattern to estimate strength (as shown in Fig. 1, and Fig. 2). KeePass– recommended by the German Federal Office for Information Security (BSI-E-CS001/003 1.5), and audited [7] by the European Commission’s Free and Open Source Software Auditing (EU-FOSSA 1) project is used to calculate the available entropy (as shown in Fig. 3). We assume that the information leakage due to reconciliation is negligible and at most tlog2 (n + 1) (as mentioned in Theorem 6.3 of [5]). Notably, an upper bound is given by 56 bits in our case. Since it is a different metric than the password strength, we leave its evaluation for future work. We have collected 50 Wi-Fi passphrases from various Cafes, Hotels, and users, which we label as the human-generated passphrase. Then we use an Apple Macbook Pro (with dedicated crpto processor) to generate another set of 50 passphrases using OSX Keychain’s Password Assistant tool. Finally, we compare these two sets with 50 ComPass generated passphrases for AP (Bob) and STA (Alice).
Fig. 1. Guess analysis of passphrases generated by ComPass, machine, and human.
In Fig. 1, it is shown that the human-generated passphrases would need less than 1015 attempts, whereas machine-generated passphrases almost always need 1031 attempts to crack it using brute-force. ComPass-generated passphrases went up as high as 1032 and never below 1024 guesses. To evaluate an attacker’s (Eve) performance, we put Eve very close (≤ wavelength/2) to the Alice and generate 50 passphrases for Bob-Eve. Although Eve is closely located to Alice, the Phase part of her channel profile is very different from Alice’s (as also observed by Wu et al. in [35]). Whereas, Amplitude and Signal Strength of the two is very similar. For this very reason, we chose to work with the Phase (as we have explained earlier).
272
K. Reaz and G. Wunder
Now, we compare Eve’s passphrases with Alice’s. We append the actual (AliceBob’s) passphrase with Eve’s and Alice’s to mimic the fact that Eve has partial knowledge of the channel profile. Appending Alice-Bob’s passphrase to Alice does not make a difference since repeat is recognized by zxcvbn, and KeePass. Eve’s channel profile will be the product of Alice-Bob’s channel profile (HAB ) and Bob-Eve’s channel profile (HEA ). Eve cannot separate it without a noiseless secondary channel. Notice from Fig. 2 that the chances of Eve to guess the valid passphrase would be very low as the number of guesses is drastically high even though Eve’s channel profile consists of Alice-Bob’s channel profile.
Fig. 2. Guess analysis of passphrases between an attacker and the STA
We positioned the STA at different distances; (1, 3, 5, 10, 15)m apart from the AP in various indoor environments to verify the proximity aspect of ComPass,. We observed that the reconciliation scheme (in Sect. 3.4) fails when the distance is greater than 3m. It happens due to the multi-path effect that causes the reciprocity phenomenon to break, and thus there left almost negligible common randomness in the channel-phase profile to generate a common passphrase.
Fig. 3. Available entropy (in bits) from ComPass, machine, and human generated passphrases.
ComPass
273
Using KeePass entropy analysis tool, we show that on an average the humangenerated passphrases have 34-bit entropy, ComPass-generated ones have 113-bit, and machine-generated passphrases have 168-bit entropy (Fig. 3). Thus ComPass generated passphrases have nearly 3 times more entropy than a typical human generated passphrase. 4.1 Outlook on Privacy Amplification In a conventional channel-based key generation methods, a final step called Privacy Amplification is performed to cover the lost entropy during the Reconciliation. We forgo this additional step in our current implementation of ComPass protocol in favor of a Secure Sketch based reconciliation protocol, which inherently provides security against leakage. In our future work, we aim to incorporate the KECCAK algorithm based NIST SHA-3 family hash functions for this purpose [6]. After this step, we hope to see that the notches in the curve of the guess analysis of ComPass generated passphrase is reduced.
5 Implementation Our demo setup involves implementing the algorithms on COTS hardware. We chose very ordinary and widely available TP-Link N750 routers (v1.5, v1.6), and Android device (8.0+) playing the role of AP and STA. We were operating our devices in 802.11n and chose channel number 40 (on 5 GHz) with BW = 20MHz. Our patched version of ACT uses the upgraded ath10k driver instead of ath9k. All of the devices were equipped with 3×3 antenna, and Modulation and Coding Scheme (MCS)- index 16 is set to enable transmission with all 3 antennas. For the non-linear least-squares fitting, we have used the least-square-cpp library by [21]. We enabled the bidirectional channel estimation option, where two devices (regardless of their role) exchange sounding Physcial Layer Protocol Data Unit (PPDU). The receiving STA computes an estimate of the MIMO channel matrix Hk for each subcarrier k and for each RF chain. While it is possible to extract key-bits from all the available 9 antenna combination, we have implemented one of the nine paths for the demo. We put our devices in various co-working rooms of Freie Universit¨at Berlin campus and private apartments resembling typical indoor environments to perform measurements and protocol tests.
6 Conclusion We presented ComPass, a PLS inspired common passphrase agreement protocol for all classes of Wi-Fi devices governed by proximity (≤ 3m). It forgoes the necessity of memory friendly short password generation by an user and the dependency on PKC. We showed that the ComPass generated passphrase has increased the number of guesses required to crack it using brute force or dictionary attack compared to a typical humangenerated passphrase, and it has increased the available entropy 3 times (113-bits vs. 34bits). ComPass has been implemented on COTS hardware running the latest OpenWrt. The compiled module is 143 kb in size, and can be installed on existing devices using opkg package manager or as a full firmware replacement.
274
K. Reaz and G. Wunder
References 1. Biswas, B., Herbert, V.: Efficient Root Finding of Polynomials over Fields of Characteristic 2. https://hal.archives-ouvertes.fr/hal-00626997/ (2009) 2. Bose, R.C., Ray-Chaudhuri, D.K.: On a class of error correcting binary group codes. Inf. Control 3(1), 68–79 (1960) 3. Chien, R.: Cyclic decoding procedures for Bose-Chaudhuri-Hocquenghem codes. IEEE Trans. Inf. Theor. 10(4), 357–363 (1964) 4. Djelic, I., Borgerding, M.: User BCH (Bose-Chaudhuri-Hocquenghem) encode/decode library based on BCH module from Linux kernel. https://github.com/mborgerding/bch codec (2015) 5. Dodis, Y., Ostrovsky, R., Reyzin, L., Smith, A.: Fuzzy extractors: how to generate strong keys from biometrics and other noisy data. SIAM J. Comput. 38(1), 97–139 (2008) 6. Dworkin, M.J.: SHA-3 standard: permutation-based hash and extendable-output functions. NIST Pubs (2015). https://doi.org/10.6028/NIST.FIPS.202 7. EVERIS-NTT DATA Company: KeePass Code Review Results Report. https://joinup.ec. europa.eu/collection/eu-fossa-2/project-deliveries (2016) 8. Gringoli, F., Schulz, M., Link, J., Hollick, M.: Free your CSI: a channel state information extraction platform for modern Wi-Fi chipsets. In: Proceedings of the 13th International Workshop on Wireless Network Testbeds, Experimental Evaluation & Characterization (2019) 9. Halperin, D., Hu, W., Sheth, A., Wetherall, D.: Tool release: Gathering 802.11 n traces with channel state information. ACM SIGCOMM Comput. Commun. Rev. 41(1), 53–53 (2011) 10. Harkins, D., Kumari, W.: Opportunistic Wireless Encryption. RFC 8110 (2017). https://doi. org/10.17487/RFC8110. https://www.rfc-editor.org/rfc/rfc8110.html 11. Harmon, K., Johnson, S., Reyzin, L.: An implementation of syndrome encoding and decoding for binary BCH codes, secure sketches and fuzzy extractors. https://www.cs.bu.edu/ ∼reyzin/code/fuzzy.html (2008) 12. Hocquenghem, A.: Codes correcteurs d’erreurs. Chiffres 2(2), 147–56 (1959) 13. IEEE: IEEE Std 802.11-2016 Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications (2016). https://doi.org/10.1109/IEEESTD.2016. 7786995 14. INTERPOL: COVID-19 Cybercrime Analysis Report. https://www.interpol.int/en/Newsand-Events/News/2020/INTERPOL-report-shows-alarming-rate-of-cyberattacks-duringCOVID-19 (2020) 15. Jung, P., Wunder, G.: On time-variant distortions in multicarrier transmission with application to frequency offsets and phase noise. IEEE Trans. Commun. 53(9), 1561–1570 (2005) 16. Kaliski, B.: PKCS #5: Password-based cryptography specification version 2.0. RFC 2898 (2000). https://doi.org/10.17487/RFC2898. https://www.rfc-editor.org/rfc/rfc2898.html 17. Kilgallin, J., Vasko, R.: Factoring RSA keys in the IoT era. In: IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (2019) 18. Kivinen, T., Kojo, M.: More Modular Exponential (MODP) Diffie-Hellman groups for Internet Key Exchange (IKE). RFC 3526 (2003). https://doi.org/10.17487/RFC3526. https:// www.rfc-editor.org/rfc/rfc3526.html 19. Kotaru, M., Joshi, K., Bharadia, D., Katti, S.: SpotFi: decimeter level localization using WiFi. In: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, pp. 269–282 (2015) 20. Maurer, U.M.: Secret key agreement by public discussion from common information. IEEE Trans. Inf. Theor. 39(3), 733–742 (1993)
ComPass
275
21. Meyer, F.: A single header-only C++ library for least squares fitting. https://github.com/ Rookfighter/least-squares-cpp (2019) 22. Rappaport, T.: Wireless Communications: Principles and Practice, pp. 165–166 (2001) 23. Reaz, K., Wunder, G.: Wireless Channel-based Autonomous Key Management for IoT (AutoKEY) on WiSHFUL Testbed. http://www.wishful-project.eu/sites/default/files/ AutoKEY-leaflet.pdf (2017) 24. Shannon, C.E.: Communication theory of secrecy systems. Bell Syst. Tech. J. 28(4), 656–715 (1949) 25. Thai, C.D.T., Lee, J., Prakash, J., Quek, T.Q.: Secret group-key generation at physical layer for multi-antenna mesh topology. IEEE Trans. Inf. Forensics Secur. 14(1), 18–33 (2018) 26. Vanhoef, M., Piessens, F.: Key reinstallation attacks: forcing nonce reuse in WPA2. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, ACM (2017) 27. Vasisht, D., Kumar, S., Katabi, D.: Decimeter-level localization with a single WiFi access point. In: 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), pp. 165–178 (2016) 28. Vieb¨ock, S.: Wi-Fi Protected Setup (WPS) PIN brute force vulnerability. CERT Vulnerability Note VU#723755. https://www.kb.cert.org/vuls/id/723755/ 29. Wang, Q., Xu, K., Ren, K.: Cooperative secret key generation from phase estimation in narrowband fading channels. IEEE J. Sel. Areas Commun. 30(9), 1666–1674 (2012) 30. Wheeler, D.L.: zxcvbn: low-budget password strength estimation. In: 25th USENIX Security Symposium, pp. 157–173 (2016) 31. Wi-Fi Alliance: Wi-Fi Easy Connect. https://www.wi-fi.org/discover-wi-fi/wi-fi-easyconnect. Accessed 23 Oct 2019 32. Wi-Fi Alliance: Opportunistic Wireless Encryption Specification. Specification v1.0 (2019) 33. Wi-Fi Alliance: Wi-Fi Protected Setup Version 2.0.2 (2020) 34. Wi-Fi Alliance: WPA3 Specification Version 3.0 (2020) 35. Wu, C., Yang, Z., Zhou, Z., Qian, K., Liu, Y., Liu, M.: PhaseU: Real-time LOS identification with WiFi. In: IEEE Conference on Computer Communications, pp. 2038–2046. IEEE (2015) 36. Wyner, A.D.: The Wire-Tap Channel. Bell Syst. Tech. J. 54(8), 1355–1387 (1975) 37. Xi, W., et al.: Instant and Robust Authentication and Key Agreement among Mobile Devices. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (2016) 38. Xie, Y., Li, Z., Li, M.: Precise Power Delay Profiling with Commodity WiFi. MobiCom 2015. ACM (2015). https://doi.org/10.1145/2789168.2790124 39. Zenger, C., Zimmer, J., Paar, C.: Security analysis of quantization schemes for channelbased key extraction. In: proceedings of the 12th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (2015) 40. Zenger, C.T., Chur, M.J., Posielek, J.F., Paar, C., Wunder, G.: A novel key generating architecture for wireless low-resource devices. In: 2014 International Workshop on Secure Internet of Things. IEEE (2014) 41. Zhu, H., Zhuo, Y., Liu, Q., Chang, S.: π -splicer: perceiving accurate CSI phases with commodity WiFi devices. IEEE Trans. Mobile Comput. 17(9), 2155–2165 (2018) 42. Zinoviev, V.: On the solution of equations of degree ≤ 10 over finite fields GF(2m ). Rapports de recherche-INRIA (1996)
Fuzzing with Multi-dimensional Control of Mutation Strategy Han Xu1 , Baojiang Cui1(B) , and Chen Chen2 1 Beijing University of Posts and Telecommunications, Beijing, China
{xuhan123,cuibj}@bupt.edu.cn
2 Air Force Engineering University, Xi’an, China
[email protected]
Abstract. Vulnerabilities present complexity and diversity, which pose a great threat to the computer systems. Fuzzing is a effective method for vulnerability detection. The exposure of vulnerabilities mainly depends on the quality of the test samples. The traditional fuzzing method has the defect of low code coverage. In order to make up for the shortcomings of traditional fuzzing, this paper proposes a new fuzzer called MCMSFuzzer based on multi-dimensional control of mutation strategy. We model coverage-based graybox fuzzing as a Markov Decision Process, and guide the mutation process by reinforcement learning. MCMSFuzzer optimizes the selection of mutation location, mutation intensity and mutation algorithm to improve quality and efficiency of fuzzing. Experimental results shows that in 5 real-world programs and LAVA-M dataset, MCMSFuzzer has higher code coverage and stronger vulnerability detection capabilities.
1 Introduction In recent years, various security incidents such as data leaks and ransomware attacks have emerged one after another around the world, and the losses and impacts caused have increased year by year. This has made countries, enterprises and people increasingly aware of the importance of cyberspace security. Vulnerabilities are one of the direct causes of many cyberspace security incidents. Fuzzing [1] is currently the most commonly used vulnerability detection technology. The core idea of fuzzing is shown in Fig. 1. The test cases generated automatically or semi-automatically is input into the program under test to monitor whether the target program runs abnormally. By analyzing the input samples that cause the crashes, the hidden bugs in the program can be found. In practice, fuzzing performs extremely well and has been widely used in the field of vulnerability detection. Traditional fuzzing randomly changed the sample. This process destroys the format of the sample, causing most of the generated samples to be invalid, and the deep logic of the program cannot be executed. The current advanced fuzzers use heuristic algorithms to maximize specific goals [2], such as the number of crashes, code coverage and so on. Coverage-based graybox fuzzing (CGF) use lightweight tools to obtain program runtime information. CGF effectively guide the fuzzing process without sacrificing program analysis time. For example, if a sample executes a new basic block, the fuzzer will keep © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 276–284, 2022. https://doi.org/10.1007/978-3-030-79728-7_27
Fuzzing with Multi-dimensional Control of Mutation Strategy
277
Fig. 1. The principle of fuzzing.
the sample; otherwise, it will discard the sample. For large programs, due to the huge search space and the limitation of computing resources, there is no method in practical applications that can thoroughly check the entire input space, nor can it search the entire execution path of the target program [3]. In this article, we implemented MCMSFuzzer for multi-dimensional control of the fuzzing mutation strategy. In the process of interacting with the tested program, the fuzzer uses reinforcement learning to maximize returns and improve the efficiency of fuzzing. The multi-dimensional mutation strategy of fuzzing includes mutation location, mutation intensity and mutation algorithm. Using reinforcement learning and integrating the three-dimensional mutation strategy, MCMSFuzzer has significantly improved the code coverage of the tested program. Our primary contributions in this paper are as follows: (1) Model the CGF as a Markov Decision Process (MDP), and combine reinforcement learning and multi-dimensional mutation strategy to greatly increase the code coverage. (2) We implemented the MCMSFuzzer, whose performance is better than AFL [4] and the fuzzer based only on reinforcement learning on 5 real applications and LAVA-M dataset.
2 Related Work Coverage-Based Fuzzing. Coverage-based fuzzing uses runtime information in different ways to automatically detect vulnerabilities in software and has become a mainstream technology. AFL uses coverage information to evaluate samples and guide the test to execute unexecuted paths. AFL can be said to be the most successful vulnerability detection tool so far, therefore many technologies have improved AFL. AFLFast [5] uses Markov model for energy distribution. Fairfuzz [6] uses Branch Mask to guide mutations. CollAFL [7] achieves greater path coverage by reducing path conflicts in AFL. Skyfire [8] uses probabilistic context-sensitive grammar to generate high-quality samples. VUzzer [9] combines static analysis and dynamic taint analysis to calculate the fitness of each seed to improve the depth of the path coverage of the fuzzer. Machine Learning-Based Fuzzing. Machine learning is a hot research direction in recent years. Researchers are trying to greatly improve the efficiency of fuzzing through machine learning.
278
H. Xu et al.
VDiscover [10] uses machine learning to quickly screen programs that may contain vulnerabilities from a large number of programs. Learn&Fuzz [11] converts the highstructure sample generation problem in the fuzzing into a text generation problem in the NLP field. Neuzz [12] uses deep learning to guide sample mutation and improves the coverage of the fuzzing according to the gradient descent algorithm. Böttinger [13] et al. applied reinforcement learning technology to fuzzing, using mutation operators as actions to improve code coverage. Angora [14] uses taint tracking and gradient descent to reduce constraint solving overhead.
3 Design of Multidimensional Mutation Strategy The key to multi-dimensional control of mutation strategy is to model CGF as MDP and abstract it as a reinforcement learning problem. Use the action elements in reinforcement learning to control the mutation strategy in multiple dimensions and improving the efficiency of fuzzing. 3.1 Reinforcement Learning Reinforcement learning [15] is inspired by the behaviorism theory, that is, under the stimulus that the environment continuously rewards or punishes the organism, the organism adopts the most favorable behavior through the prediction of the stimulus. Reinforcement learning emphasizes that the organism learns in the process of interacting with the environment in order to achieve the expected goals. The learner is called the agent. The interaction continues. The agent chooses the action to perform, and the environment gives feedback to the agent. The agent will be in the next new state, and the environment will return. The goal of reinforcement learning is to maximize the return over a period of time. By defining state, action and reward in the fuzzing environment, fuzzing is formalized as MDP. The fuzzing system is an environment for agent interaction and learning. The state is represented by bits, bytes, or substrings of the sample. Action is the mutation strategy of fuzzing. Reward is calculated based on the runtime information of the target program, such as path coverage, edge coverage and basic block coverage. 3.2 Double Q-Learning Typical task solving methods based on reinforcement learning include SARSA, Qlearning [16], Deep Q-learning (DQN) [17], Double Q-learning (DDQN) [18], Deep Deterministic Policy Gradient (DDPG) [19], etc. SARSA and Q-Learning cannot solve overly complicated problems. They need a Q table. In the case of the scale of actions and states is huge, the Q table will be very large, and search and storage require a lot of time and space. If the state and action reach the million level, it may cause overflow. DQN uses neural network to approximate the value function, which allows DQN to solve large-scale reinforcement learning problems. DQN does not necessarily guarantee the convergence of the Q network, which has become a defect that DQN must solve.
Fuzzing with Multi-dimensional Control of Mutation Strategy
279
Double Q-learning avoids overestimation by separating the action selection and action evaluation. DDQN uses two Q networks Q and Q’. Each Q network uses the value of the other Q network to update the next state. Both Q networks must learn from different sets of experiences. In DQN, we use the following formula to update the Q value: (1) y = R + γ maxa Q φ S , a , ω . In DDQN, two Q networks are used. DDQN finds the action corresponding to the maximum Q value in one of the Q networks, and then calculates the target Q value in the other network Q’: (2) y = R + γ Q φ S , arg maxa Q φ S , a , ω ω . The flow chart of the fuzzing with DDQN is shown in Fig. 2.
Fig. 2. Fuzzing with DDQN.
3.3 Multidimensional Control of Mutation Strategy There are various mutation strategies for fuzzing. The simplest mutation strategy is random mutation, such as randomly inserting or deleting a byte, randomly flipping a bit, rearranging the byte order, and so on. Using multiple mutation strategies at the same time is better than using a single mutation strategy. Using the feedback obtained by CGF, different mutation strategies can be selected intelligently. The mutation strategy of fuzzing can be divided into three dimensions: mutation location, mutation intensity, and mutation algorithm. The three-dimensional mutation strategy can be comprehensively controlled with the help of deep reinforcement learning. Action is one of the key elements of reinforcement learning. As mentioned above, when modeling CGF as MDP, the mutation strategy is action. Combine the byte sequence of the fuzzing sample and multiple mutation algorithms to form a two-dimensional data table.
280
H. Xu et al.
The actions in the reinforcement learning model correspond to each element in the data table. In addition, the selection of actions by deep reinforcement learning is the control of the intensity of mutation. Through the above method, the number of invalid mutations is reduced, and the efficiency of fuzzing is improved.
4 Implementation We designed MCMSFuzzer based on the fuzzing with multi-dimensional control of mutation strategy. The deep reinforcement learning network selects action and return according to the state and rewards in the fuzzing environment. The sample mutates in multiple dimensions according to the selected action, and the mutated sample is sent to the program under test. MCMSFuzzer obtains the state of the sample after execution in the program under test, and feeds back the calculated reward to the DDQN network. 4.1 DDQN Model In traditional fuzzing, the criterion for measuring the effect of fuzzing is whether the sample triggers crashes. Due to the irregular distribution of vulnerabilities and the long process of generating samples that can cause crashes, most fuzzers test programs by maximizing code coverage to improve fuzzing efficiency. Therefore, we record the number of the edge that jump from the basic block i to the basic block j, and use the edge coverage as the reward. MCMSFuzzer uses the sample data as the state space of the DDQN network, expressed in byte sequence. We set the maximum length of the sample as L, and automatically complete samples whose length is less than L. The DDQN model we built has 3 hidden layers with 256, 1024, and 128 nodes respectively. The activation function is softmax. 4.2 Mutation Strategy The multi-dimensional control method of mutation strategy divides the mutation strategy of fuzzing into different dimensions. The mutation location and mutation algorithm are combined into a two-dimensional data table, which is used as the action space in the DDQN model. The mutation position is the character sequence of the input sample of fuzzing. The mutation algorithm of MCMSFuzzer is a series of functions. Based on the above method, for a sample S, the action of DDQN selection can be to select a byte of S to mutate using a mutation function shown in Table 1. Table 1. Mutation functions of MCMSFuzzer. Mutation function Description InsertByte
Insert a random byte
InsertSameBytes
Insert multiple identical random bytes (continued)
Fuzzing with Multi-dimensional Control of Mutation Strategy
281
Table 1. (continued) Mutation function Description DeleteByte
Delete a byte
ChangeByte
Change a byte
ChangeString
Change multiple adjacent bytes
InsertString
Insert multiple random bytes
InsertSameBytes
Insert multiple identical random bytes
SwapBytes
Rearrange multiple adjacent byte sequences
5 Evaluation To evaluate the effectiveness of MCMSFuzzer, we conducted two sets of experiments. The environment of the two experiments is: CPU: E5-2630V4 2.20 GHz, RAM: 384 GB, OS: Ubuntu 16.04.6 LTS. We use 5 real-world applications to test the fuzzing efficiency of MCMSFuzzer, and use the LAVA-M dataset to test the vulnerability detection capabilities of MCMSFuzzer. 5.1 Real-World Programs For the first experiment, we selected 5 popular real-world programs with vulnerabilities: gzip, libming, libjpeg pngquant, and miniFtp. Detailed information about the tested programs is shown in Table 2. Table 2. Tested program information. Program Version Type gzip
1.2.4
Compressor
libming
0.4.8
Flash output library
libjpeg
9a
JPEG image library
mupdf
1.12.0
Lightweight PDF viewer
libxml2
2.9.2
XML file parser
We use MCMSFuzzer, AFL and a fuzzer based only on reinforcement learning to test for 12 h. Among them, MCMSFuzzer and the fuzzer based only on reinforcement learning both perform 2000 warm-up steps. During the execution, the input samples of each tested program are the same, and the test commands are the same. For each program and technique, Fig. 3 plots the edge coverage. The fuzzer based only on reinforcement learning is slightly better than AFL, but the advantage is not obvious. The coverage of MCMSFuzzer with multi-dimensional control of mutation strategy has been significantly improved.
282
H. Xu et al.
Fig. 3. Comparison of edge coverage of fuzzers.
5.2 LAVA-M Dataset LAVA-M [20] contains 4 Linux Utilities programs that injected vulnerabilities, including Base64, Md5sum, Uniq and Who. The LAVA-M test set is an executable program generated by LAVA after automatically injecting difficult-to-trigger vulnerabilities into the source code. It is generally used to verify the vulnerability detection capabilities of tools such as fuzzers and symbolic execution tools. We use MCMSFuzzer and AFL to test on the LAVA-M dataset for 12 h. During the execution, the input seed samples of each tested program are the same, and the test commands are the same. Among them, Md5sum uses the “-c” parameter, Base64 uses the “-d” parameter to run, and Who and Uniq use the default parameters to run. The results of each group of tests to detect the vulnerabilities in LAVA-M are shown in Table 3. The first column is the name of the target program, the second column is the number of all vulnerabilities contained in the target program, and the third column is the number of vulnerabilities found by AFL. The fourth column is the number of vulnerabilities detected by MCMSFuzzer using the multi-dimensional control of mutation Table 3. The number of vulnerabilities detected by different fuzzing methods in LAVA-M. Program Total AFL MCMSFuzzer Base64
44 0
39
Md5sum
57 2
50
Uniq
28 0
21
Who
2136 1
66
Fuzzing with Multi-dimensional Control of Mutation Strategy
283
strategy in this paper. Since all of these injected bugs are guarded by string comparison functions, AFL is ineffective in detecting bugs from LAVA-M. But MCMSFuzzer has significantly improved efficiency in actual vulnerability detection.
6 Conclusion Fuzzing is currently one of the most popular methods of vulnerability detection. In order to make up for the shortcomings of traditional fuzzing, we models CGF as MDP, and implements MCMSFuzzer by combining deep reinforcement learning and multidimensional control of mutation strategies. By comprehensively controlling the variation of various dimensions in the reinforcement learning process, the generation of invalid samples is reduced and the efficiency of fuzzing is improved. Experiments on real-world programs and the LAVA-M data set show that MCMSFuzzer has higher code coverage and stronger vulnerability detection capabilities. In the future, machine learning and fuzzing should be further combined.
References 1. Miller, B.P., Fredriksen, L., So, B.: An empirical study of the reliability of UNIX utilities. Commun. ACM 33, 32–44 (1990) 2. Zhao, J., Wen, Y., Zhao, G.: H-Fuzzing: a new heuristic method for fuzzing data generation. In: Altman, E., Shi, W. (eds.) NPC 2011. LNCS, vol. 6985, pp. 32–43. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-24403-2_3 3. Peng, H., Shoshitaishvili, Y., Payer, M.: T-Fuzz: fuzzing by program transformation. In: 2018 IEEE Symposium on Security and Privacy (SP), pp. 697–710. IEEE (2018) 4. Zalewski, M.: American Fuzzy Lop. https://github.com/google/AFL 5. Böhme, M., Pham, V.T., Roychoudhury, A.: Coverage-based greybox fuzzing as markov chain. IEEE Trans. Softw. Eng. 45, 489–506 (2017) 6. Lemieux, C., Sen, K.: FairFuzz: a targeted mutation strategy for increasing greybox fuzz testing coverage. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, pp. 475–485. ACM (2018) 7. Gan, S., Zhang, C., Qin, X., Tu, X., Li, K., Pei, Z., Chen, Z.: CollAFL: path sensitive fuzzing. In: 2018 IEEE Symposium on Security and Privacy (SP), pp. 679–696. IEEE (2018) 8. Wang, J., Chen, B., Wei, L., Liu, Y.: Skyfire: data-driven seed generation for fuzzing. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 579–594. IEEE (2018) 9. Rawat, S., Jain, V., Kumar, A., Cojocar, L., Giuffrida, C., Bos, H.: VUzzer: application-aware evolutionary fuzzing. In: NDSS, pp. 1–14 (2017) 10. Grieco, G., Grinblat, G. L., Uzal, L., Rawat, S., Feist, J., Mounier, L.: Toward large-scale vulnerability discovery using machine learning. In: Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, pp. 85–96. ACM (2016) 11. Godefroid, P., Peleg, H., Singh, R.: Learn&Fuzz: machine learning for input fuzzing. In: 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 50–59. IEEE (2017) 12. She, D., Pei, K., Epstein, D., Yang, J., Ray, B., Jana, S.: NEUZZ: efficient fuzzing with neural program smoothing. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 803–817. IEEE (2019)
284
H. Xu et al.
13. Böttinger, K., Godefroid, P., Singh, R.: Deep reinforcement fuzzing. In: 2018 IEEE Security and Privacy Workshops (SPW), pp. 116–122. IEEE (2018) 14. Chen, P., Chen, H.: Angora: efficient fuzzing by principled search. In: 2018 IEEE Symposium on Security and Privacy (SP), pp. 711–725. IEEE (2018) 15. Sutton, R.S., Barto, A.G.: Reinforcement learning: An introduction. MIT Press, Cambridge (2018) 16. Watkins, C.J., Dayan, P.: Q-learning. Mach. Learn. 8, 279–292 (1992) 17. Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.: Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013) 18. Hasselt, H.: Double q-learning. Adv. Neural. Inf. Process. Syst. 23, 2613–2621 (2010) 19. Lillicrap, T.P., Hunt, J.J., et al.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015) 20. Dolan-Gavitt, B., Hulin, P., Kirda, E., Leek, T., Mambretti, A., Robertson, W., Ulrich, F., Whelan, R.: Lava: large-scale automated vulnerability addition. In: 2016 IEEE Symposium on Security and Privacy (SP), pp. 110–121. IEEE (2016)
An ELF Recovery Method for Linux Malicious Process Detection Zheng Wang, Baojiang Cui(B) , and Yang Zhang Beijing University of Posts and Telecommunications, Beijing, China {wangzheng96,cuibj,taoshi}@bupt.edu.cn
Abstract. In recent years, malicious attacks against cloud hosts and IoT devices have become more frequent. New types of ransomware and mining viruses have brought a huge threat to Internet security. Traditional static detection methods cannot effectively deal with No-File malware, and the detection methods based on behavior characteristics are difficult to identify the owner of malicious samples. Compared the binary file extracted from process memory with library sample file can detect the malicious process accurately. we retain the dynamic characteristics based on network characteristics in consideration of the time cost of static detection. In this paper, we implemented a prototype system. We selected six typical Linux malicious samples for experiments. By setting similar thresholds, we can accurately screen out malicious processes. The ELF recovery degree of the samples is all above 98%. This technology can be applied to internal memory forensics in the future and can also help combat Internet crimes.
1 Introduction With the development of information technology, the Internet plays an extremely important role in people’s work and life. Linux has been favored by many IoT and cloud host manufacturers due to its rich software ecosystem. However, with the increasing number of network attacks, the Linux system has also been attacked by various malware. According to Rising’s Cloud Security Report, they detected a total of 420000 Linux virus samples from January to June 2017, exceeding the sum of 2013, 2014, and 1025. If these Linux malware attacks are successful, there will be huge losses for businesses and individuals. Therefore, Linux malware detection research is of paramount importance. A malicious process is the dynamic entity of malware in the system, which usually runs in the background. The malware will hide its running traces in some ways to prevent them from being discovered and removed by system administrators. ELF file is a common format of executable files under the Linux system. We explore a technology to recover ELF files in Linux process memory and accurately identify malicious processes through a binary file comparison. Finally, we implemented a prototype system to experiment and verify our method. The remainder of the paper is organized as follows. Section 2 summarizes the related work. Section 3 describes the design of our system. Section 4 presents the experiment and results of our method. Section 5 shows the limitations and future research directions and Sect. 6 is summary. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 285–294, 2022. https://doi.org/10.1007/978-3-030-79728-7_28
286
Z. Wang et al.
2 Related Work Malware Detection Based on Static Analysis. There has been much research on malware detection technology. They obtain the ELF file of the running process and perform static analysis on it [2]. The traditional static analysis methods include comparing the hash value of the file and the sample, searching the characteristic text string, checking the file link library, and reverse engineering analysis [3]. Wu improved the Euclidean distance-based method and designed the method based on Minkowski distance to detect whether PE files are packed. The results show the method has a higher detection rate [4]. Wang studies the theory of Word2vec and proposes an algorithm in which they can extract the n-gram timing characteristics from the PE viruses. The results show that the timing feature extraction algorithm has certain practicality. Malware Detection Based on Behavior. In recent years, the method of dynamic detection based on malware behavior has been deeply studied [5]. Xu extracted the features which have the typical characters from malicious software and designed an algorithm to convert these features into the data formal. Then they designed a BP neural network model to detect these features. The results show the false-negative rate and the false positive reached a quite satisfactory level [6]. Han proposed a flexible malware detection method, which extracted API function call name, input parameters in API functions, the two types of the combination of features. It calculated the information gain value of the corresponding API and its parameters to select the characteristic in distinguishing the malware. Experiment shows a small amount of feature selection and higher accuracy makes it more superior to the algorithm of API-based detection of malware [1]. Zhou constructed the platform through the system call service(SCS) to achieve the collection of benign software and malware system call records. They proposed a feature extraction technique based on n-gram and designed the sequential method to detect malware. The experimental results show that the detection technology has a true positive rate of 95%. It is difficult for traditional static detection methods to analyze malware because many of them hide their static characteristics by packing, obfuscation, and encryption. Besides, some No-File malicious processes will dynamically delete the loaded files saved on the disk when they are running, which makes it hard for static analysis. Furthermore, the dynamic detection method for malware identification is coarse granularity. Different malware may show the same behavioral characteristics. So a combination of dynamic and static characteristics is useful to identify malicious processes. We figure out a method to extract the process’s memory data completely and safely. Then we develop a method to recover the ELF file from the memory data. By comparing the binary recovered and the sample file we can accurately detect malware. Finally, we develop a system that combined static detection with dynamic detection.
3 Basic Design The design architecture of this system is shown in Fig. 1. Specifically, the system can be divided into four sub-modules. Among them, the process information extraction module collects various information about the running processes in the system (such as process
An ELF Recovery Method for Linux Malicious Process Detection
287
name, resource usage rate, IP address, and port number for network connection). The runtime information will be used as input features of the dynamic behavior detection module to help research and judgment. The process memory extraction module uses the Linux kernel module to extract the memory data of the specified process, save it as a disk file, and hand it over to the ELF restoration module for ELF file reconstruction. The resulting file will be used as the input feature of the static behavior detection module for feature comparison and analysis. The design of these four modules will be introduced separately below.
Process informaon extracon module
Process name
Resource Ulizao n
Network Informa on
ELF restoraon module
Auxiliary vector analysis
ELF reorganizao n
Process memory extracon module
Process virtual address
Process memory data
Malicious process detecon module
GOT repair
Stac detecon module
Dynamic behavior detecon module
Fig. 1. Design architecture diagram of the malicious process detection system
3.1 Process Information Extraction Module The Linux system uses the/proc virtual filesystem to save process-related information. We use it to obtain the desired process information. First, we need to determine the detection target. There is a special kind of process called kernel thread in the process, which is created and managed by the kernel, we need to exclude these processes. We traverse the subdirectories under the/proc directory, get the process name when the process is running according to /proc/pid/cmdline, get network connection information according to the file under /proc/net, and check the CPU usage of the process according to the /proc/stat file, view the memory usage of the process according to /proc/pid/status. 3.2 Process Memory Extraction Module On a Linux system, the ptrace system call is usually used to obtain process memory. Ptrace is a system call that provides a method for a process to observe and modify the memory of other processes. It is mainly used to implement debugger. This method of
288
Z. Wang et al.
extracting memory in user mode is easy to implement, but it is also easy to be identified by the malicious process which may take countermeasures such as self-destruction. Therefore, we present a method to obtain process memory in kernel mode. The Linux kernel uses mm_struct structure to manage process memory information, and the virtual memory area of the process is managed through the vm_area_struct structure. As Fig. 2 shows, the vm_start of the structure points to the start of a certain memory area allocated to the process. The vm_end member points to the end address of the memory area. The kernel manages vm_area_struct with a red-black tree. We can use the vm_end pointer to loop through all vm_area to get the virtual address set of the process. Using the multi-level page table of the process, the virtual address set can be translated into a physical address set. Finally, we get the memory set of the process through reading the physical memory.
Userspace Memory
vm_area_struct vm_start vm_end vm_next mm_struct
Fig. 2. Linux kernel process memory management diagram
Specifically, we develop a kernel device driver module. We use the ioctl function to receive the control instructions sent by the user-mode program. The module performs memory data query, the user-mode program interacts with the module for data transmission. 3.3 ELF File Recovery Module The Linux kernel uses the auxiliary vector (Auxiliary Vector) to pass information to the application such as the entry point of the system call. The loader converts an ELF file into a process running in the system. It parses the ELF file, maps various program segments to the memory, and initializes the stack space of the process. It puts the ELF auxiliary
An ELF Recovery Method for Linux Malicious Process Detection
289
vector array and other information on the stack. The auxiliary vector array saves much useful information such as the program header of the program. The start address and length of the segment are stored in each program header. We find the segment loaded into the memory by traversing the auxiliary vector, and reorganize it into a new ELF file after extracting it. For the program with the address randomization compilation option turned on, we need to subtract the program load base address when calculating the offset to get the original ELF file. Dynamically linked ELF files use PLT (Procedure Linkage Table) to realize the call to external functions. To realize delayed binding, PLT adds an indirect jump, which does not directly jump through GOT but through the PLT item. When the ELF file calls an external function for the first time, the instruction pointer first points to the PLT table, then points to the GOT table, and executes the resolve function to resolve the address of the external function. The external function’s address is stored in GOT table. This mechanism reduces the cost of symbol resolution. In other words, the external function address may have been written into the GOT table. We need to restore it to the initial value. We traverse the DYNAMIC section to locate the INIT section. Then we traverse backward from the INIT section, use the bytecode pattern matching method to find the PLT table. Then we rewrite the GOT table to recover it. After the repair, we write the data in the buffer to a new file. The new file is executed as same as its initial ELF file (Fig. 3).
AUXV
MemFile
ELF Segment Extractor
Dynamic linked?
Init ELF
Yes
Recover GOT
No Target ELF
Fig. 3. Design flow chart of ELF restore module
3.4 Malicious Process Detection Module The malicious process detection module can be divided into two sub-modules. We use pre-collected malicious sample sets to make feature sets. The features include malware file hash value, process names, sensitive strings, process network connection information (CC server IP, feature port number), etc. We develop feature detection for each malware. Malicious process detection is performed by comparing the collected process information and the recovered ELF file with the characteristics. First, we view information such as the compilation format and running architecture of the ELF header structure. Then we extract the code segments of the sample file and the recovered ELF file, use the disassembler objdump to convert the machine code into
290
Z. Wang et al.
the corresponding assembly text according to the compilation architecture. The binary file is converted into a text file in this way. We use the diff command to compare them, set similar thresholds, and filter out malicious processes from the processes running on the system. The comparison is coarse-grained. We use Google’s Open-Source tool Bindiff for auxiliary judgment, which compares the similarity of binary files based on a variety of factors, such as basic block call graphs, function call graphs, etc. In the dynamic detection module, we investigated the common process names and network characteristics of the sample files during runtime. For example, the Nopen backdoor uses “noserver-server” as its process name at runtime. The process is traversed to search for the process with the same name. Besides, Nopen backdoor starts listening from port 32754 by default. If it encounters port occupancy, it will increase the number. Therefore, we set the suspicious port set (32754, 32754+n) (n is the heuristic maximum number of suspicious backdoors). Then search all the processes to find the malicious processes listening to the same port (Fig. 4). Stac Analysis Module
Unique String Search
Dynamic Analysis Module
ELF Diff
Process Name Search
Network Feature Match
Fig. 4. Diagram of malicious process detection module
4 Results and Discussion 4.1 Environment The experimental configuration we used to evaluate the model parameters is described in Table 1. Table 1. Experimental environment Machine Information OS
Ubuntu 16.04
CPU
Intel(R) Core (TM) i5-9400F CPU @ 2.90 GHz
RAM
16 GB
DISK
128 GB
An ELF Recovery Method for Linux Malicious Process Detection
291
4.2 Dataset We use Nopen backdoor [7], Prism backdoor [8], Ish backdoor [11], Gates Trojan [9], Vegile backdoor [10], SSH Wrapper, and other six popular malicious samples under Linux as the experiment test samples dataset. The dataset is mainly downloaded through public websites. To evaluate the detection effect of the system, we constructed some interference samples manually. We run normal programs with the same name as the test sample. For example, we write a program named “noserver-server” which acted like a normal process. We also write a communication program. It uses a local listening port to monitor external network connections as same as the Nopen backdoor does. For malicious samples with source code, we add some useless code to simulate mutation samples. The number of test samples and interference samples is shown in Table 2. Both malicious samples and interference samples are run in the same system for detection. Table 2. The number of test samples Malicious sample
Number
Interference sample
Number
Nopen
10
Nopen_same_name
10
Prism
10
Nopen_same_port
1
ish
10
Ish_modification
10
Gates
10
Ish_same_name
10
Vegile
10
Prism_same_name
10
SSH wrapper
10
4.3 Experiment We run the test sample files and detect malicious processes with our detection system. Take the Nopen backdoor as an example, after running the detection system, the results will be saved in the log file including the process numbers of the malicious processes searched by the process name, comparison of the hash value, network characteristics, and binary file similarity. As Fig. 5 shows, one of the ten malicious samples in the experiment failed to start. Nineteen malicious processes were screened by comparing process names, which were composed of ten interference samples and nine malicious samples respectively. A total of nine malicious processes are screened out by the file hash value, and they all belong to the processes loaded by malicious samples. Similarly, these nine malicious samples can be screened out by using binary comparison similarity. Finally, ten suspected malicious processes are obtained through the network characteristic ports, which are composed of nine malicious samples and an interference sample.
292
Z. Wang et al.
Fig. 5. Inspection result diagram of Nopen backdoor of the detection system
To confirm whether the result is correct, we choose the process whose pid is 25280 and extract the ELF file from its process memory. Then we compare it with the library sample file with Bindiff. First, we decompile the ELF file using IDA Pro along with Google’s open-source binexport plugin. We get the Binexport file from this step, which will be used as Bindiff input for binary comparison. As Fig. 6 shows, the similarity between the two files is as high as 99.2089%.
Fig. 6. The similarity between two samples with Bindiff
4.4 Discussion We test other malicious software on the target machine in the same way, record the result and make a table to show detection results. As Table 3 shows, all the malicious processes can be detected by the detection system. The existence of the interfering process affects the detection system to a certain extent. The normal process with the same name as the malicious sample triggered the detection warning. The normal program with the same feature port of the malicious sample will trigger the alarm of the dynamic detection module. After adding some useless codes, the binary file similarity of the mutant samples decreased, but still higher than 90%. Therefore, setting a reasonable threshold can still detect interfering samples effectively.
An ELF Recovery Method for Linux Malicious Process Detection
293
Table 3. Detection effect diagram of different samples by the detection system Sample
Process name
File hash
Network feature
ELFDiff similarity
IsELF executable
Nopen
Yes
Yes
Yes
99.2089%
Yes
Prism
Yes
Yes
No
98.0239%
Yes
ish
Yes
Yes
No
98.3315%
Yes
Gates
No
Yes
No
98.1319%
Yes
Vegile
Yes
Yes
Yes
98.2172%
Yes
SSH wrapper
No
Yes
Yes
98.4728%
Yes
5 Limitation and Future Research Our memory extraction module is implemented under kernel modules. Due to the different kernel functions of different Linux versions, this system has only been tested and adapted on Ubuntu 16.04, Ubuntu 18.04, Ubuntu 20.04, and Debian 9. It may cause compilation errors on some out-of-date Linux systems. Besides, the detection of other malicious samples under multiple architectures remains to be studied, such as the method of comparison between ELF files extracted from arm architecture systems and x86 malicious sample files. Besides, as a prototype system, the features it used are not enough, except for the assembly code features, there are false negatives and false positives to a certain extent for other features. It is still necessary to explore more representative features for detection.
6 Conclusion This paper focuses on the ELF recovery technology on Linux. We can recover the executable ELF file with auxiliary vectors and the process memory data of all non-kernel threads. Compared to the recovery file with the malicious sample file, we can detect malicious processes in a combination of dynamic and static by delimiting a similarity threshold and combine network characteristics. Through experiment results, it can be found that this method can detect Linux malicious processes effectively. The traditional features represented by process name and special communication port number have certain false negatives and false positives, while detection based on binary file similarity can deal with mutated viruses effectively, identify malicious processes accurately.
References 1. Zou, X.: Research on Malicious Process Detection Technology Based on System Call Analysis. Strategic Support Force Information Engineering University (2018) 2. Zhang, J.: Research on forensic analysis method of malware. J. Hubei Univ. Police 27(11), 162–166 (2014)
294
Z. Wang et al.
3. Wu, L., Li, Y., Liang, J.: Minkowski distance-based method to identify packed PE files. Mod. Electron. Tech. 39(19), 80–81+88 (2016) 4. Wang, Z.: Study and Implementation of PE Virus Files Clustering Technology. Beijing University of Posts and Telecommunications (2016) 5. Xu, C.: Research on the Automatic Classification Method Based on the Behaviors of the Malicious Software. Xiangtan University (2014) 6. Han, L.: Behavior detection of malware based on the combination of API function and its parameters. Appl. Res. Comput 30(11), 3407–3410+3425 (2013) 7. Alpha_h4ck: The UNIX backdoor nopen for decryption equation organization [EB]. https:// www.freebuf.com/articles/system/114607.html 8. Fabrizi, A.: Prism Sample Open source [EB]. https://github.com/andreafabrizi/prism.git 9. Tencent computer housekeeper. Analysis of gates Trojan horse on Linux platform [EB]. https://www.freebuf.com/articles/system/117823.html 10. Screetsec. Vegile Sample Open source [EB]. https://github.com/Screetsec/Vegile 11. Sourceforge. Ish Open Sample File[EB]. http://prdownloads.sourceforge.net/icmpshell/ishv0.2.tar.gz
A High Efficiency and Accuracy Method for x86 Undocumented Instruction Detection and Classification Jiatong Wu1 , Baojiang Cui1(B) , Chen Chen2 , and Xiang Long1 1 Beijing University of Posts and Telecommunications, Beijing, China
{jiatongwu07,cuibj,xianglong}@bupt.edu.cn 2 Air Force Engineering University, Xi’an, China
Abstract. Processors are important parts of a computer and have been believed to be secure for a long time. In recent years researches show that processors are, on the contrary, full of undocumented instructions which are caused by design flaws or intentionally hidden by manufacturers. Undocumented instructions can be detected using fuzzing technology. But currently existing methods have low efficiency and accuracy. Furthermore, to analyze the function of detected instructions, a classification method is necessary to reduce the large amount of detected undocumented instructions. Therefore, this paper introduces a high-efficiency instruction searching and classification method by applying instruction format analysis. Results show that our method can successfully find 10 times of the amount of detected undocumented instructions using existing tool with just 30% of executing time. Also, after classifying the large quantity of the detection results, the amount can be reduced to less than 10000 instructions which is a reasonable amount for further research.
1 Introduction Processors are the foundation of computing systems. For a long time, researchers have mainly paid attention to software security and network security, determining the underlying processor as the “trusted base” which causes a huge security blind zone. At the same time, processor designers have long focused on improving processing performance, power consumption, reliability and lacked attention to the safety of the processor itself. Undocumented instruction is one of the important threats to processor security. Those instructions can result in the processor ceasing to function or grant the attacker with illegal privilege. The fuzzing method introduced in this paper could be used on x86 processors and improves efficiency and accuracy compared with existing tools. We improved instruction searching algorithm used in instruction fuzzing by skipping those instructions that have a low possibility of being an undocumented instruction. We also introduce an undocumented instruction classification method for reducing the large detection results according to instruction prefixes, length, signals after execution and other information. After summarizing related work in the next section, the rest of the article is organized in the following order: Sect. 3 describes the implementation of our fuzzing and © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 295–303, 2022. https://doi.org/10.1007/978-3-030-79728-7_29
296
J. Wu et al.
classification method; Sect. 4 compares experiment result with existing tools; Sect. 5 summarizes the limitations and future directions; Sect. 6 is the summary.
2 Related Work 2.1 Vulnerabilities of Processors The vulnerabilities of processors have been appeared since they first been manufactured. However, researches on this area are mainly started in the last 10 years. According to Duflot L. [1], those vulnerabilities can be divided into three types: bugs, backdoors and undocumented functions. A bug is an involuntary implementation mistake that will lead to undefined behavior of instructions. An undocumented function corresponds to a function that developer implemented but not documented for some reason. If a function whose only purpose is to grant additional privileges to the entity using it, then it is a backdoor. Domas C. [2] first proved the existence of a hardware backdoor in x86 processor which is triggered by undocumented instructions. Zhang et al. [3] also model a typical hidden instruction Trojan which is proved to have low trigger probability and can survive from detection of existing methods. All those researches indicate the existence of undocumented instructions and they could bring serious secure vulnerabilities in our computing system. 2.2 Methods of Detecting Undocumented Instructions The first undocumented instruction detecting tool Sandsifter [4] uses fuzzing technology and is designed for x86 processors. Its instruction search process, which is called tunneling, reduces the extremely large number of instructions in x86 instruction set to be tested. UISFuzz [5] is a tool based on Sandsifter. It shortened the executing time of detection by instruction format recognition. However, the amount of detected undocumented instructions are not shown in their results and the source code or tool is not provided. After implementing the UISFuzz using the instruction search algorithm introduced in their paper, we found out the number of undocumented instructions detected is also largely reduced comparing with Sandsifter. Therefore, these methods still have limitations on efficiency and accuracy.
3 Implementation Our work can be divided into two parts, one is a high efficiency and accuracy instruction searching algorithm used in undocumented instruction fuzzing and the other is a result classification method. The framework of our tools is shown in Fig. 1. The instruction fuzzing technology is based on Sandsifter. First, initiate fuzzer with processor vendor, model name and other processor information. Search the next instruction to be executed with instruction searching algorithm. Protect the execution environment before execute the instruction and then restore environment after execution. Compare the signal generated by executing instruction and the result of disassembling the instruction using Capstone. In cases when the instructions could be executed without generating a SIGILL signal while disassembler does not recognize those instructions, we judge them as undocumented instructions.
A High Efficiency and Accuracy Method
297
Fig. 1. Framework of undocumented instruction fuzzing and classification tool.
3.1 Improving Instruction Searching Algorithm Since a x86 instruction is up to 15 bytes, the searching space can be as large as 1.3 × 1036 instructions. Considering the fuzzing speed, only a small part of the instructions can be executed within a reasonable time consumption. Therefore, we introduce an instruction length-based depth-first searching algorithm with format analysis to improve efficiency and variable mutation length to improve accuracy for searching instructions tested in fuzzing. Length-Based Depth-First Searching Algorithm. The instruction searching process works as follows. As the longest instructions in x86 are 15 bytes, we generate a buffer of 15 bytes and set all to zero in the beginning. Then we decide the index of the byte to be mutated as the last byte of instruction’s actual length. The actual length of all-bytes-zero instruction is 2, therefore the index of the byte to be mutated would be 1. We mutate the byte by increasing the value one at a time. As the increasing of the mutation byte, the actual length of the instruction will be changed, as the example shown in Fig. 2. When mutation byte becomes ‘04’, the actual length will be changed to 3 bytes. In those cases, the index of mutation byte will be fixed to the altered last byte. When the new index increases to ‘05’, again the length changes and the index will be fixed. When the mutation byte has been increased 255 times without any change of the instruction length and turns to ‘0xFF’, the mutation byte will return back to 0 and the index will be decreased by one. When the mutation index is 0 and the first byte has increased 255 times, the instruction searching process will end.
298
J. Wu et al.
Fig. 2. Example of length-based depth-first searching algorithm. The light grey block shows fixed part of the instruction and the dark gray block shows the mutation part of the function.
Format Analysis. The algorithm introduced above can programmatically search x86 instruction set and detect a large number of instructions within a reasonable time consumption. But the efficiency can still be improved by applying instruction format analysis. The instruction format of x86 ISA is shown in Fig. 3. A x86 instruction can contain at most 8 bytes of displacement and immediate. Mutating those bytes takes considerable time and will not be helpful for finding undocumented instructions since they have little effort on changing instruction behavior. Instruction behavior are mostly determined by prefixes, opcode ModR/M and SIB.
Fig. 3. Instruction format of x86 ISA. A x86 instruction is consists of optional instruction prefix, opcode, optional ModR/M, optional SIB, displacement and immediate.
For known instructions, those bytes can be analyzed by disassembler such as Capstone. Therefore, before the mutation of an instruction, we use Capstone to disassemble the instruction. If it is known by Capstone, we get details of the instruction to find out if it contains any displacement or immediate bytes and the length of those bytes. We fix
A High Efficiency and Accuracy Method
299
the mutation index according to the information provided by format analysis to avoid mutating displacement and immediate. For instructions not recognized by Capstone, we are not able to find out the details of them but according to x86 ISA format standard, we can still speculate that the bytes that determine instruction behavior are very likely being located in the front and middle part of the instruction instead of the last part. Therefore, when instruction length changes, the mutation byte will be set to the middle part of the instruction instead of the last byte. The specific position of the mutation byte will be discussed in the next section. An example of length-based depth-first searching algorithm with format analysis is shown in Fig. 4. Instruction ‘00040500000000’ is a known instruction. After format analysis we find out the last 4 bytes are immediate, so the mutation byte starts at the third byte. In this case 4 bytes of mutation is reduced, which helps to improve efficiency.
Fig. 4. Example of length-based depth-first searching algorithm with format analysis. The light grey block shows fixed part of the instruction and the dark grey block shows the mutation part of the function.
Fig. 5. The time consumption of Sandsifter and our tool on four platforms.
Variable Mutation Length. With instruction format analysis being applied to searching algorithm, efficiency is greatly improved. The time consumption of fuzzing is shortened to 10% compared with Sandsifter. However, the undocumented instructions being
300
J. Wu et al.
detected is also decreased. To solve this problem, we use variable mutation length to improve the accuracy of searching undocumented instructions in order to detect more in less time. For instructions not recognized by disassembler, the index of the byte to start the mutation is the key point to improve the accuracy. The index is changed every time instruction length changes. When that happens, for those instructions not being detected as undocumented instructions, the index will be increased by one. For those being detected as undocumented instructions, the index will be increased by two since in those cases undocumented instructions are more likely to be existed. In this way, we variate the mutation length for different circumstances to improve the accuracy in searching instruction. 3.2 Result Classification After the fuzzing test of undocumented instruction detection, about 108 instructions will be logged as undocumented instructions. Before any further research, an effective result classification method is necessary. Result classification is based on instruction information. For an undocumented instruction, instruction length, signal generated after execution are the information we can obtain easily. The x86 ISA format is complex and each part of the instruction is not length-fixed, so it is difficult for us to obtain details. But there is still useful information after analysis. The prefixes of x86 instruction are different from instruction opcodes and the prefixes are limited. Therefore, we can find out the prefix part of an instruction by analyze the first a few bytes of the instruction. Also, if a series of undocumented instructions are different from each other only in the mutation byte, then there is a high possibility that those instructions share the same opcode and the mutation byte is part of their displacement or immediate. We classify undocumented instruction detection result according to the information obtained above. First, we analyze the prefixes of the instructions and get the part of the instruction without any prefixes as the new instructions. Drop the duplicated instructions with exactly the same instruction bytes. After this step, the mount of instructions is decreased to about 20% of the original. Then we divide the instructions with the same length and signal in to groups. In each group, find out the instructions that are very likely to share the same opcode and drop the duplicates. After result classification, the number of undocumented instructions is reduced to less than 10000 instructions. The classification results for Sandsifter are even less than 1000. Which are much more reasonable amounts for future researches.
4 Evaluation and Result We tested our undocumented instruction detection and classification tool on four platforms shown in Table 1. We run Sandsifter and our tool on each of these platforms and classified the results. We evaluate our tool by comparing the results with Sandsifter.
A High Efficiency and Accuracy Method
301
Table 1. Details of evaluation platforms. Seq Processor
Memory
1
Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20 GHz
32G DDR4
2
Intel(R) Core(TM) i5-6600 CPU @ 3.30 GHz
8G DDR3
3
Intel(R) Core(TM) i5-4590 CPU @ 3.30 GHz
16G DDR3
4
Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00 GHz 32G DDR4
4.1 Evaluation of Instruction Searching Algorithm The fuzzing test results are shown in following figures. Figure 6 shows the fuzzing test’s time consumption of Sandsifter and our tool on four platforms. Our tool takes less than 30% of the time token by Sandsifter except for Intel(R) Core(TM) i5-6600. On Intel(R) Core(TM) i5-6600 Sandsifter failed to work normally and only detected very few undocumented instructions with little time consumption. The result shows that our tool has higher efficiency and reliability.
Fig. 6. The undocumented instructions detected by Sandsifter and our tool on four platforms.
At the same time, our tool can also detect more undocumented instructions. Figure 5 shows the undocumented instructions being detected by Sandsifter and by our tool on four evaluation platforms. The number of undocumented instructions detected by our tool is about 10 to 15 times of those by Sandsifter. On platform Intel(R) Core(TM) i5-6600, Sandsifter detected very few undocumented instructions while our tool still worked successfully and detected a large number of instructions. The result shows that our instruction searching algorithm has high efficiency and accuracy for detecting undocumented instructions. 4.2 Evaluation of Instruction Classification We classified all the results detected by Sandsifter and our tool on four evaluation platforms. Table 2 shows the results detected by our tool. After classification, the number of
302
J. Wu et al.
Table 2. Numbers of undocumented instructions detected by our tool before and after classification on four platforms. Processor
Before classification After classification
Xeon(R) CPU E5-2630 v4 37160522
9172
Core(TM) i5-6600
37160522
9173
Core(TM) i5-4590
16705075
909
Core(TM) i5-1038NG7
37162570
9189
instructions is largely decrease and has dropped below 10000. The classification process takes less than 15 min. Table 3 shows the results detected by Sandsifter. After classification, the number is dropped under 1000. The result also indicates that our instruction searching algorithm is effective and accurate. Table 3. Numbers of undocumented instructions detected by Sandsifter before and after classification on four platforms. Processor
Before classification After classification
Xeon(R) CPU E5-2630 v4 826627376
418
Core(TM) i5-6600
16518336
299
Core(TM) i5-4590
826627376
418
Core(TM) i5-1038NG7
749925489
634
5 Limitation and Future Research The undocumented instruction detection and classification method introduced in this paper can only be applied on x86 architecture processors. On RISC processors, undocumented instruction detecting tools search the whole instruction space for fuzzing test since their instruction is up to 4 bytes, which makes the number of instructions to be executed within a reasonable range. However, on those processors which have limited performance, an instruction searching method with high efficiency and accuracy is still necessary. Therefore, we will do further research on designing a suitable detection and classification method for RISC processors.
6 Conclusion In this paper, we introduced a high efficiency and accuracy method of undocumented instruction detection and classification on x86 processors. The instruction searching
A High Efficiency and Accuracy Method
303
algorithm we used is based on length-based depth-first searching algorithm. We improved efficiency and accuracy by applying instruction format analysis and variable mutation length. Then we classified the results according to instruction information. Result shows that our method can detect 10 times of the number within 30% of the time compared with Sandsifter. After classification, the number of instructions can be greatly reduced and is much more reasonable for future researches. Acknowledgments. This article is supported by the Fundamental Research Funds for the Central Universities 2019XD-A19.
References 1. Duflot, L.: CPU bugs, CPU backdoors and consequences on security. J. Comput. Virol. 5(2), 91–104 (2009) 2. Domas, C.: Hardware backdoors in x86 CPUs. Black Hat, 1–14 (2018) 3. Zhang, J., et al.: HIT: a hidden instruction trojan model for processors. In: 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1271–1274. IEEE (2020) 4. Li, X., et al.: UISFuzz: an efficient fuzzing method for cpu undocumented instruction searching. IEEE Access 7, 149224–149236 (2019) 5. Wu, H., Wei, Q., Wu, Z.: Research on CPU instruction security. In: 2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC), pp. 233–239. IEEE (2020) 6. Dofferhoff, R., et al.: iScanU: a portable scanner for undocumented instructions on RISC processors. In: 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 306–317. IEEE (2020) 7. Dofferhoff, R.: A performance evaluation of platform-independent methods to search for hidden instructions on RISC processors 8. Strupe, F., Kumar, R.: Uncovering hidden instructions in Armv8-A implementations. In: Workshop on Hardware and Architectural Support for Security and Privacy (2020) 9. Kocher, P., et al.: Spectre attacks: exploiting speculative execution. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 1–19. IEEE (2019) 10. Zhu, J., et al.: CPU security benchmark. In: Proceedings of the 1st Workshop on SecurityOriented Designs of Computer Architectures and Processors, pp. 8–14 (2018) 11. Canella, C., et al.: A systematic evaluation of transient execution attacks and defenses. In: 28th {USENIX} Security Symposium ({USENIX} Security 19), pp. 249–266 (2019) 12. Shamshiri, S., Esmaeilzadeh, H., Navabi, Z.: Test instruction set (TIS) for high level selftesting of CPU cores. In: Proceedings of the 13th Asian Test Symposium, November, pp. 158– 163 (2004) 13. Ahmad, B.A.: Real time detection of spectre and meltdown attacks using machine learning. arXiv preprint arXiv:2006.01442 (2020) 14. Chakraborty, R.S., Narasimhan, S., Bhunia, S.: Hardware Trojan: threats and emerging solutions. In: 2009 IEEE International High Level Design Validation and Test Workshop, pp. 166–171. IEEE (2009)
Simulation-Based Fuzzing for Smart IoT Devices Fanglei Zhang1 , Baojiang Cui1(B) , Chen Chen2 , Yiqi Sun1 , Kairui Gong1 , and Jinxin Ma3 1 Beijing University of Posts and Telecommunications, Beijing, China
{zhangfanglei,cuibj,sunyiqi,realgkr}@bupt.edu.cn 2 Air Force Engineering University, Xi’an, China 3 China Information Technology Security Evaluation Center, Beijing, China
Abstract. The early research on IoT (Internet of Things) firmware is mostly based on the hardware environment, the software interfaces and hardware resources are very limited, and the traditional dynamic debugging and fuzzing tools cannot be executed efficiently, which leads to high research costs. In order to solve this problem, a simulation-based fuzzing prototype tool for smart IoT devices (IoTSFT) is proposed in this paper. It builds a pure software virtual environment to make the firmware run out of hardware constraints. In addition, the security analysis of the firmware can be completed by combining the path coverage-based fuzzing technology. It is verified by experiments that IoTSFT can successfully simulate binary, obtain the sample execution path coverage, and fuzz the target binary.
1 Introduction With the popularization of smart IoT devices in work and life, there are endless attacks on them, and the security research requirements on device firmware is also increasing. Automatic vulnerability detection technology is currently the main method for binary vulnerability mining and security analysis. And it’s mainly for the binary in the desktop environment. However, the analysis of firmware is very different from the binary in the desktop environment: First, the architecture is different. The firmware is generally based on ARM, MIPS and other architectures, while the desktop environment binary is mostly the x86 architecture. Second, the storage space and memory resources are different. In the hardware environment, storage space is limited and memory resources are relatively small. Third, the software interface is different. The kernel of the desktop environment provides abundant calling interfaces, but they are relatively limited in the hardware environment. Due to the above limitations, the hardware environment cannot support the efficient working of fuzzing tools, which seriously affects research progress and efficiency. In this paper, based on QEMU and binary instrumentation technology, we build a virtual environment for firmware binary, which makes it run completely on the desktop environment and get rid of the constraints and interferences of hardware devices. By combining with path feedback-based fuzzing, we can detect the vulnerabilities of firmware binary in the virtual environment.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 304–313, 2022. https://doi.org/10.1007/978-3-030-79728-7_30
Simulation-Based Fuzzing for Smart IoT Devices
305
The contributions of this paper are as follows: 1) We design a fuzzing prototype tool for smart IoT devices based on QEMU. 2) We verify the feasibility of IoTSFT for firmware binary. The remainder of this paper is organized as follows. In Sect. 2, we introduce the related work. Next, in Sect. 3, we describe the specific system design of IoTSFT. Then, Sect. 4 shows the experiments and evaluation results. Section 5 summarizes the limitations and future directions. Finally, the conclusions are given in Sect. 6.
2 Related Work 2.1 Firmware Simulation QEMU [1] and Firmadyne [2] are currently commonly used firmware simulation tools. QEMU is a pure software-implemented simulator, which can simulate application-level software and system-level software. It has strong versatility, which supports multiple architectures such as PowerPC, ARM and MIPS. Firmadyne is a tool for firmware simulation and dynamic analysis based on QEMU. It is mainly for Linux embedded firmware based on ARM and MIPS. It solves the problem of hardware interaction at the system level by modifying the kernel and providing the user-space NVRAM library. Compared with QEMU, Firmadyne only supports system-level simulation, and the performance overhead is higher. Moreover Firmadyne is not friendly to the simulation of some new firmware, and is generally used to verify known vulnerabilities. Bao et al. [3] proposed a QEMU-based embedded firmware simulation method, which requires self-compiling the kernel, modifying the firmware startup program, recreating the root file system image and mounting it on the self-compiled kernel, which solved the problem of firmware kernel incompleteness and firmware kernel address mapping. Aiming at the practical problems of Firmadyne and QEMU’s full-system and user-mode simulation, Zheng et al. [4] proposed a hybrid-mode simulation which combines the high compatibility of system-mode and the efficiency of user-mode, solving the performance bottleneck caused by system-mode simulation. The above two methods mainly solve the problem of the full firmware simulation. But when mining the firmware vulnerabilities, we generally focus on the binaries of the network protocol-related services, so simulating a single binary can meet the demand. In this case, using QEMU’s user-mode for simulation is simpler and more efficient. 2.2 Fuzzing Fuzzing is a commonly used automatic vulnerability detection technology, which can be divided into three types [5, 6]: non-directional fuzzing, target-oriented fuzzing, and path feedback-based fuzzing. Non-directional fuzzing adopts random mutation to generate samples, and during the mutation process, it does not utilize any program runtime information to adjust the sample mutation strategy. In addition, the random mutation destroys the format of the
306
F. Zhang et al.
sample, so that most of the generated samples cannot execute the deep logic of the program, and the path coverage is not high enough. The target-oriented fuzzing will guide the path to the position of suspicious vulnerabilities based on the path information of each sample. This method is more targeted, but there is also the problem of low binary coverage. Path feedback-based fuzzing will select samples that can improve path coverage as seeds, and increase the probability of triggering vulnerabilities by executing as many paths as possible. AFL [7] is a path feedback-based fuzzing tool, which uses instrumentation to obtain path coverage information and guide sample selection. However, AFL usually uses source code for instrumentation, and IoT firmware generally does not provide source code. Peach [8] is a non-directional fuzzing framework, but it restricts the sample format by Peach Pit file, which increases the number of valid samples. At the same time, we can take advantage of the path information to select samples, with the purpose of improving the fuzzing efficiency. 2.3 Binary Instrumentation Binary instrumentation inserts a block of custom code into the binary file to obtain information such as control flow and data flow at runtime, and can also insert jump instructions to change the execution logic of the program. Binary instrumentation includes static binary instrumentation and dynamic binary instrumentation. Dynamic binary instrumentation [9] is to inject code and data into the process memory when the program is running. The instrumented code does not change the binary file. This kind of instrumentation will perform frequent control transfer and data interaction between the original binary code, control code, and instrumentation code, which will seriously affect the run-time performance of the target binary, causing a series of problems such as untimely acquisition of path coverage information. Static binary instrumentation [10] directly modifies the original binary file, and inserts the corresponding code and data in the appropriate position. Because the instrumentation code and the original code are in the same file, and both run in the same process, it avoids the time consumption caused by information interaction between multiple processes. Compared with dynamic instrumentation, static instrumentation is more in line with current needs.
3 System Design IoTSFT is mainly based on QEMU and static binary instrumentation technology to construct a virtual environment of IoT device’s firmware and obtain runtime path coverage information; then implements the fuzzing control module based on Peach, managing test samples according to the path feedback information, to increase efficiency of fuzzing. IoTSFT consists of four modules: Static configuration analysis module, Dynamic executing analysis module, Monitor and information feedback module, and Fuzzing control module. The system design is shown in Fig. 1. Firstly, IoTSFT obtains the target firmware in many ways, extracts the file system, and collects some related static information. Then perform a pre-simulation on the target binary, combine with dynamic
Simulation-Based Fuzzing for Smart IoT Devices
307
debugging to locate where the hardware interaction functions are called. Based on the static instrumentation technology, we patch the hardware interaction functions, so that the binary can be simulated correctly. In order to provide path information to the fuzzing module, monitoring code is inserted into the branch of the binary using static instrumentation. So when the program is running, the path coverage information is sent back to the fuzzing control module in real time to guide the management of seed samples.
Fig. 1. System design of IoTSFT.
3.1 Static Configuration Analysis Before simulating the firmware, we need to make an initial analysis of it and collect some necessary static information, including device type, firmware version, program architecture, etc. Using the above information to create a firmware database can provide references for future research on other firmware. Obtaining Firmware. First, we need to grab the firmware to be studied. Generally, the following methods can be used: Download the firmware of the target model directly from the official website; Use Telnet or SSH to remotely connect to the target device and directly access the file system to obtain the binary; Use some hardware and software tools to extract the firmware from the chip or interface.
308
F. Zhang et al.
Collecting Static Information. For firmware and file system, we can use some Linux commands and tools to collect static information. For example, the ‘file’ command can identify the file type, and the ‘strings’ command can output some printable characters in the firmware, such as the device name. Binwalk is a commonly used tool for analyzing firmware. Its functions include scanning and analyzing firmware structure, entropy analysis, and automatic file system extraction. In the file system, there are also some config files. For example, the lighttpd program needs the support of lighttpd.conf. 3.2 Dynamic Executing Analysis When simulating the firmware binary under QEMU user-mode, it often reports errors and exits, which is generally caused by the hardware interaction functions in the binary. Utilizing the debug feature of QEMU, we can determine the location of the error and locate the function that needs to be patched. Then, based on static binary instrumentation, patching code is inserted directly in the binary, to return the suitable data without calling those functions. After multiple rounds of repeating the above two steps, all the functions that cause simulation failure are patched, and the binary can execute normally in the virtual environment. Figure 2 shows the design principle of this module. This system is currently mainly developed for the ARM architecture, and the following only takes ARM architecture for example.
Fig. 2. The principle of patching hardware interaction functions based on static instrumentation.
Selecting Instrumentation Point. First of all, we need to choose appropriate positions as the instrumentation points. ELF utilizes the optimization technique of Lazy Binding when processing external function calls. When a function is called, it does not directly
Simulation-Based Fuzzing for Smart IoT Devices
309
go to the GOT table to find the real address of the function in the dynamic link library, but jumps to the PLT (Procedure Linkage Table) item corresponding to this function. If the function is called for the first time, it will first perform address binding operation and fill in the real address of this function in the GOT table. When this function is called again, after jumping to the PLT item, it can jump directly to the real address in the GOT table and execute the latter instructions. Each external function in ELF has a corresponding PLT item, and each call of the function will jump to the address of the PLT item. According to the above analysis, we take the PLT item corresponding to the target patched function as the instrumentation point. Inserting Patch Code. In the original binary, the data from hardware is needed after calling the hardware interaction function, but the Nvram library is not supported in the simulation environment, which makes the nvram-related function call fails. Patch code directly returns the preset data, avoiding the interactions with the hardware interface. The size of the PLT item is fixed at 12 bytes, but the size of patch code may be greater than 12 bytes, so it’s not appropriate to insert patch code directly in the PLT item. In order to ensure that the instrumented code does not affect the original instructions, IoTSFT expands the space of the binary and adds a LOAD segment of a custom size at the end of the file to store the patch code. And the instruction at the PLT item is replaced to the LDR instruction, so that it can jump to the patch code in the new segment. After executing, program returns to the next instruction of the function call to continue. 3.3 Monitor and Information Feedback The path coverage information at runtime is essential for fuzzing based on path feedback. IoTSFT inserts the monitoring code into the binary applying static instrumentation. Each time the program runs to a branch, it will execute the monitoring instructions, record the path of the current sample, and provide an interface for real-time monitoring the firmware running status. Obtaining the Address of the Instrumentation Point. To obtain the program path which each sample is executed, we need to insert a probe at each branch of the binary. Program branches are generally jump instructions, so we apply IDA Python scripts to linearly scan the binary and filter out the addresses of all jump instructions as the instrumentation points. Design of Monitoring Code. A complex binary inevitably contains many jump instructions. If the monitoring code is inserted for each jump instruction, it will certainly cause a waste of space. In order to save space as much as possible, we divide the monitoring code into three parts, namely Prepare, Instrument, and End. The Instrument part is the main body of the monitoring code, which temporarily stores path information based on shared memory. The operation of this part is the same at every instrumentation point, so only one copy needs to be kept. The Prepare part is executed after the instrumentation point. It is mainly responsible for saving the field data of the instrumentation point instruction, and saving the address of the corresponding End part to the register. Then the program jumps to the Instrument part. After the Instrument part is executed, the address of the End part will be read from the register, and then the jump will be
310
F. Zhang et al.
realized. The End part is responsible for restoring the field data of the instrumentation point instruction, executing this instruction, and then jumping to the next instruction to continue executing the original code. Because the Prepare part and End part both need to save specific addresses, there are multiple duplicates in the binary. Path Information Feedback. The path information obtained by the monitoring code is temporarily stored in a block of shared memory. We use the monitoring program named Monitor to read the data stored in the shared memory, and connect the fuzzing control module through socket to return the path information. 3.4 Fuzzing Control Module The fuzzing control module is implemented based on the open source fuzzing framework Peach, and we can restrain the sample format and set up the sample mutation by writing Peach Pit files. Peach is a non-directional fuzzing technology, which cannot directly apply the feedback path information. Therefore, in the control module, we maintain a seed queue, and add samples that trigger new paths to the seed queue according to the feedback path information. The seeds in the seed queue are mutated and added to the sample queue, then the samples in the sample queue are sent to the target program for execution. Next this module receives and handles the feedback information. The above cycle process stops when a crash is triggered or the sample queue is empty. Samples Mutation. On the basis of the sample format described in the Pit file, the sample file can be parsed and mutated according to the data model. This mutation method satisfies the format constraints of the sample, enables the sample to execute the deeper-level logic, and finds the deep-level vulnerabilities of the program. Path Information Processing. The path information returned by Monitor is a 01-bit string. We first read it into the temporary memory and compare it with the current path information. If a bit in the temporary memory is 1, and the corresponding position of the current path information is 0, it means that the current sample has generated a new path coverage. Then, update the current path information and related records, and add the sample into the seed queue. Seed Management. The samples newly added to the seed queue need to be calculated weights. The weights are evaluated based on the number of code blocks executed, sample size, and sample execution times. The more code blocks executed, the smaller sample length, and the fewer execution times, the higher the weight. The seed queue is arranged in the order of decreasing weight.
4 Experiment and Result Ubuntu 16.04 32-bit system was adopted to run Monitor and simulate the firmware. The version of QEMU is 2.5.5, and gcc 5.4.0 is used as the compiler. We perform the fuzzing control module on Windows 10 64-bit system, the version of Peach used is 3.1.124. The hardware environment is, CPU: Intel Core i7-8550U @ 1.80 GHz 1.99 GHz, RAM: 16.0 GB. We choose the routers of ASUS RT-AC68U and RT-AC66U as our test targets, which are ARM-based, and the Web service httpd is used to verify the effectiveness of IoTSFT.
Simulation-Based Fuzzing for Smart IoT Devices
311
4.1 Experiment After obtaining the RT-AC68U’s firmware and extracting the file system, we obtained the following information by static analysis. Firmware version: 3.0.0.4.384_20308, Web service version: httpd/2.0, dynamic link library containing nvram hardware interaction functions: libnvram.so and libshared.so, and some other information. We found a total of 26 nvram library functions that need to be patched using dynamic debugging, including nvram_get, nvram_set, nvram_init, etc. By scanning the patched binary, 12016 instrumentation points were obtained. Then, under the supervision of Monitor, employ QEMU user-mode to simulate the instrumented httpd, and at the same time, execute the fuzzing control module to send test samples and receive feedback. 4.2 Result Evaluation For RT-AC68U, after inserting the monitoring code, the file size of httpd has expanded to 2.3 times the original size, and many additional jumps and instructions have been added. We use httping to test the response time of httpd in the simulation environment, and send 20 access requests to patched httpd and instrumented httpd respectively for 10 times. Figure 3 shows a comparison of the average response time. In general, the response time of instrumented httpd is more than that of patched httpd. But there is no obvious difference between the two. And the maximum response time of instrumented httpd is less than 20 ms. Within a tolerable range, the impact of instrumentation on the efficiency of the fuzzing test is small.
Fig. 3. Comparison of response time between patched httpd and instrumented httpd.
After fuzzing httpd of RT-AC68U and RT-AC66U for 10 h, the number of covered basic blocks and covered paths changes over time as shown in the Tables 1 and 2. Since the fuzzing control module has not been further optimized, the path coverage has not reached a very high level, but it can also verify the usability of IoTSFT for fuzzing the firmware binary in a simulation environment.
312
F. Zhang et al.
Table 1. The number of covered basic blocks and covered paths within 10 h. (RT-AC68U) 1 h 2 h 3 h 4 h 5 h 6 h 7 h 8 h 9 h 10 h Basic block 391 433 433 433 433 434 434 434 434 434 Path
546 588 588 588 588 589 589 589 589 589
Table 2. The number of covered basic blocks and covered paths within 10 h. (RT-AC66U) 1 h 2 h 3 h 4 h 5 h 6 h 7 h 8 h 9 h 10 h Basic block 348 351 351 351 351 351 351 351 351 351 Path
917 918 922 922 922 922 922 922 922 922
5 Limitation and Future Direction At present, IoTSFT cannot work if the file system cannot be obtained due to encryption or other reasons. And it only provides the patch function for the hardware interaction functions, so it doesn’t support the binaries those fail simulation because of other causes. This tool is developed only for the ARM architecture currently, and in the future, needs to support other architectures such as MIPS etc. It also relies on manual debugging when searching for hardware interaction functions, so we need summarizing the rules to realize automatic searching. In addition, the fuzzing efficiency is still low, and further research on sample mutation strategies and path feedback utilization is necessary in order to improve the possibility of triggering vulnerabilities.
6 Conclusion In this paper, we present a porotype tool, IoTSFT, which can fuzz the firmware binary in the virtual environment. This tool patched the hardware interaction functions in the firmware binary based on static binary instrumentation, and built a virtual execution environment for IoT firmware, to make sure that the firmware binary executes correctly and efficiently on the pure software platform without the constraint of hardware environment. And then, based on the path feedback information, we implemented the fuzzing test of firmware, which provides a convenient and low-cost prototype tool for the security research and vulnerability mining of firmware. Acknowledgments. This article is supported by the National Natural Science Foundation of China (No. 61872386).
References 1. Bellard F.: QEMU, a fast and portable dynamic translator. In: USENIX Annual Technical Conference, FREENIX Track, vol. 41, p. 46 (2005)
Simulation-Based Fuzzing for Smart IoT Devices
313
2. Chen, D.D., Woo, M., Brumley, D., Egele, M.: Towards automated dynamic analysis for linux-based embedded firmware. In: NDSS, vol. 1, p. 1 (2016) 3. Fang, J., Bao, Q., Song, S.: Network device firmware running analysis based on QEMU. In: 2015 International Conference on Information Computer and Communication Engineering (ICC2015), pp. 1–10 (2015) 4. Zheng, Y., Davanian, A., Yin, H., Song, C., Zhu, H., Sun, L.: FIRM-AFL: high-throughput greybox fuzzing of IoT firmware via augmented process emulation. In: 28th {USENIX} Security Symposium ({USENIX} Security 19), pp. 1099–1114 (2019) 5. Li, J., Zhao, B., Zhang, C.: Fuzzing: a survey. Cybersecurity 1, 1–13 (2018) 6. Chen, C., Cui, B., Ma, J., Wu, R., Guo, J., Liu, W.: A systematic review of fuzzing techniques. Comput. Secur. 75, 118–137 (2018) 7. American Fuzzy Lop. https://lcamtuf.coredump.cx/afl/. Accessed on 31 March 2021 8. Amini, P.: Fuzzing frameworks. In: Black Hat USA, vol. 14, pp. 211–217 (2007) 9. Nethercote, N.: Dynamic Binary Analysis and Instrumentation, pp. 1–8. University of Cambridge, Computer Laboratory, (No. UCAM-CL-TR-606) (2004) 10. Yang, W., Wang, Y., Cui, B., Chen, C.: A Static Instrumentation Method for ELF Binary. In: Barolli, L., Xhafa, F., Hussain, O. (eds.) Innovative Mobile and Internet Services in Ubiquitous Computing. IMIS 2019. Advances in Intelligent Systems and Computing, vol. 994. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-22263-5_38
On the Neighborhood-Connectivity of Locally Twisted Cube Networks Tzu-Liang Kung1(B) , Cheng-Kuan Lin2 , and Chun-Nan Hung3 1
2
Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan [email protected] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China [email protected] 3 Department of Computer Science and Information Engineering, Da-Yeh University, Changhua, Taiwan [email protected]
Abstract. Interconnection networks are emerging as an approach to solving system-level communication problems. A network may be modeled by a graph whose vertices represent the nodes, and whose edges represent communication links. For any vertex v in a graph G, let NG (v) denote the open neighborhood of v, and let NG [v] = {v} ∪ NG (v) denote the closed neighborhood of v. The connectivity has long been a classic factor that characterizes both network reliability and fault tolerance. A set F of vertex subsets of G is called a neighborhood-cut if G − F is disconnected, and each element of F happens to be the closed neighborhood of some vertex in G. The neighborhood-connectivity of G is the minimum cardinality over all neighborhood-cuts in G. The locally twisted cube network is a promising alternative to hypercube, which is well known as one of the most popular network architectures for high-performance computing. In this paper, we determine the exact neighborhood-connectivity of locally twisted cubes.
1
Introduction
Interconnection networks are emerging as a solution to the system-level communication problems [4]. A common challenge for network designers is to match the data communication scheme of the problem at hand to the network’s topology. A variety of interconnection networks have been developed on the basis of some well-known undirected graphs, such as meshes, torus, hypercubes, crossed cubes, exchanged hypercubes, butterfly graphs, star graphs, arrangement graphs, Gaussian graphs, etc. [1,5–7,18,20,23]. Among the different kinds of network topologies, the hypercube network [9,25] is one of the most attractive candidates for high-performance computing [13,18,26] due to its promising advantages, for example, including regularity, vertex/edge symmetry, maximum connectivity, optimal fault-tolerance, Hamiltonicity, and so on. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 314–321, 2022. https://doi.org/10.1007/978-3-030-79728-7_31
On the Neighborhood-Connectivity of Locally Twisted Cube Networks
315
The network’s topological structure is typically modeled by a connected graph for mathematical analysis [2]. A graph G = (V, E) consists of a vertex set V and an edge set E. For convenience, V (G) and E(G) represent the vertex and edge sets of G, respectively. Two vertices u and v of G are neighbors of each other if there is an edge (u, v) joining them in G. For any vertex v of G, NG (v) = {u ∈ V (G) | (u, v) ∈ E(G)} denotes the open neighborhood of v, and NG [v] = {v} ∪ NG (v) denotes the closed neighborhood of v. For S ⊂ V (G), NG (S) = v∈S NG (v) \ S. The minimum degree of G is defined by δ(G) = min{|NG (v)| | v ∈ V (G)}. A graph G is a subgraph of G if V (G ) ⊆ V (G) and E(G ) ⊆ E(G). For any S ⊆ V (G), G[S] denotes the subgraph induced by S. Paths and cycles are two types of fundamental subgraphs. For clarity, Pr = v1 , v2 , . . . , vr represents a path of order r ≥ 2, and Ck = v1 , v2 , . . . , vk , v1 represents a cycle of order k ≥ 3, in both of which any two consecutive vertices are adjacent. A graph G is connected if for every pair u, v of distinct vertices of G, there exists a path between u and v. A vertex-cut of G is a subset S of V (G) such that G − S is disconnected or trivial. The connectivity of G is defined as κ(G) = minS⊂V (G) {|S| | G − S is disconnected or trivial}, which is equal to the minimum cardinality over all vertex-cuts of G and can be determined in polynomial time using Menger’s theorem [24]. The connectivity has long been a classic factor that characterizes both network reliability and fault tolerance. For any S ⊂ V (G), the set F = {NG [v] | v ∈ S} is called a neighborhood-cut of G if G − F is disconnected or trivial. The neighborhood-connectivity of G, denoted by nκ(G), is the minimum cardinality over all neighborhood-cuts in G. The locally twisted cube [28] is a hypercube variant with lower diameter than that of the hypercube. Many attractive properties of the locally twisted cube have been widely studied [8,12,14–16,21,22,29]. For example, cycle and path embedding is of high flexibility on locally twisted cubes [15–17]. This article is inspired to determine the neighborhood-connectivity of the locally twisted cube. The rest of this paper is structured as follows. Section 2 introduces the topological properties of locally twisted cubes. Section 3 presents the main result of this paper. Finally, some concluding remarks are given in Sect. 4.
2
Preliminary
The vertex set of the n-dimensional locally twisted cube LT Qn corresponds typically to the set of n-bit binary numbers: LT Q2 is isomorphic to C4 whose vertex and edge sets are {00, 01, 10, 11} and {(00, 01), (01, 11), (11, 10), (10, 00)}, respectively. For n ≥ 3, LT Qn is a combination of two copies of LT Qn−1 , denoted by LT Q0n−1 and LT Q1n−1 . The vertex set of LT Q0n−1 is V (LT Q0n−1 ) = {x : 0xn−2 xn−3 · · · x0 | xi ∈ {0, 1} for 0 ≤ i ≤ n − 2}, and the vertex set of LT Q1n−1 is V (LT Q1n−1 ) = { y : 1yn−2 yn−3 · · · y0 | yi ∈ {0, 1} for 0 ≤ i ≤ n − 2 }. Let Ec = {(x, y) | x ∈ V (LT Q0n−1 ), y ∈ V (LT Q1n−1 ), yn−2 = (xn−2 + x0 ) mod 2, yi = xi for every 0 ≤ i ≤ n − 3}. Then the vertex set of LT Qn is V (LT Qn ) = V (LT Q0n−1 ) ∪ V (LT Q1n−1 ), and the edge set of LT Qn is
316
T.-L. Kung et al.
E(LT Qn ) = E(LT Q0n−1 ) ∪ E(LT Q1n−1 ) ∪ Ec . Here LT Qn = LT Q0n−1 ⊗ LT Q1n−1 represents the recursive construction of LT Qn . Figure 1 illustrates LT Q3 and LT Q4 . There exists no cycle of order three in LT Qn ; i.e., LT Qn is K3 -free. The n+3 diameter of LT Qn is n+1 2 if n = 3, 4, and 2 if n ≥ 5. Yang et al.. [28] proved that κ(LT Qn ) = n, and some assessments on conditional connectivity of LT Qn have been proposed [3,12]. The locally twisted cube has received a wide variety of researchers’ attention for its attractive properties [8,10,11,14– 17,19,21,22,27,28].
Fig. 1. Illustrating LT Q3 and LT Q4 .
For any two adjacent vertices u = un−1 un−2 · · · u0 and v = vn−1 vn−2 · · · v0 in LT Qn , they are the d-neighbors of each other, 0 ≤ d ≤ n − 1, if ud = vd and ui = vi for i > d. For convenience, the d-neighbors of u and v are denoted by (u)d and (v)d , respectively. As v = (u)d and u = (v)d , it is trivial that ((v)d )d = v. More precisely, take v for example, (v)0 = vn−1 · · · v1 v¯0 , (v)1 = vn−1 · · · v2 v¯1 v0 , vd−2 · · · v0 , where vd−1 = (vd−1 + v0 ) mod 2 if 2 ≤ (v)d = vn−1 · · · vd+1 v¯d vd−1 n−1 = v¯n−1 vn−2 vn−3 · · · v0 , where vn−2 = (vn−2 + v0 ) mod 2. d ≤ n − 2, and (v)
3
Neighborhood-Connecitivity of LT Qn
At the first glance, we observe that it is straightforward to determine nκ(LT Q2 ), nκ(LT Q3 ) and nκ(LT Q4 ). Lemma 1. nκ(LT Q2 ) = 1 and nκ(LT Q3 ) = nκ(LT Q4 ) = 2. Proof. Let x denote any vertex of LT Q2 . Then LT Q2 − NLT Qn [x] is a singleton, so we can immediately derive nκ(LT Q2 ) = 1. Suppose that n ∈ {3, 4}. For any vertex v of LT Qn , LT Qn − NLT Qn [v] is connected. Hence, we can know that nκ(LT Q3 ) ≥ 2 and nκ(LT Q4 ) ≥ 2. Obviously, {{011, 010, 001, 101}, {110, 111, 100, 010}} is a neighborhood-cut in LT Q3 , and {{0011, 0010, 0001, 0101, 1111}, {1100, 1101, 1110, 1000, 0100}} is a neighborhood-cut in LT Q4 . As a consequence, we get nκ(LT Q3 ) ≤ 2 and nκ(LT Q4 ) ≤ 2 to complete the proof. Lemma 2. For n ≥ 3, let Sn be a subset of V (LT Qn ) such that |Sn | ≤ n2 , and let F (Sn ) = {NLT Qn [v] | v ∈ Sn }. Then κ(LT Qn − F (Sn )) ≥ n − 2|Sn |.
On the Neighborhood-Connectivity of Locally Twisted Cube Networks
317
Proof. If Sn = ∅, then F (Sn ) = ∅. Thus, it is obvious that κ(LT Qn − F (Sn )) = κ(LT Qn ) = n. When n is even and |Sn | = n2 = n2 , it is trivial to derive that κ(LT Qn − F (Sn )) ≥ n − 2|Sn | = 0. When n is odd, it is clear that |Sn | ≤ n2 = n−1 n−1 2 . Below we only need to consider 1 ≤ |Sn | ≤ 2 . The proof proceeds by induction on n. The inductive basis stands on Lemma 1, assuring that nκ(LT Q3 ) = 2. As |S3 | = 1, LT Q3 −F (S3 ) is connected, and this implies that κ(LT Q3 − F (S3 )) ≥ 1. As the inductive hypothesis, we assume that nκ(LT Qk − F (Sk )) ≥ k − 2|Sk | if |Sk | ≤ k2 for 3 ≤ k ≤ n − 1. Let T be any vertex subset of LT Qn − F (Sn ) such that |T | ≤ n − 2|Sn | − 1. It suffices to argue that (LT Qn − F (Sn )) − T remains connected. As LT Qn = LT Q0n−1 ⊗ LT Q1n−1 , let i i i = Sn ∩ V (LT Qin−1 ), F (Sn−1 ) = {NLT Qin−1 [v] | v ∈ Sn−1 }, Sn−1 (i) Fn (Sn ) = X∈F (Sn ) {X ∩ V (LT Qin−1 ) | |X ∩ V (LT Qin−1 )| = n}, (i) F1 (Sn ) = X∈F (Sn ) {X ∩ V (LT Qin−1 ) | |X ∩ V (LT Qin−1 )| = 1} and T (i) = T ∩ V (LT Qin−1 ) (i) i 0 1 for i ∈ {0, 1}. Obviously, we have F (Sn−1 ) = Fn (Sn ) and |Sn | = |Sn−1 |+|Sn−1 |. (0) (1) (1) 0 1 Moreover, we have |Sn−1 | = |Fn (Sn )| = |F1 (Sn )| and |Sn−1 | = |Fn (Sn )| = (0) |F1 (Sn )|. Figure 2 sketches the partition of F (Sn ) into LT Q0n−1 and LT Q1n−1 . 0 1 | ≥ |Sn−1 |. Without loss of generality, we assume that |Sn−1 (1)
(0)
1 1 • Case 1: |Sn−1 | ≥ 1. That is, F (Sn−1 ) = Fn (Sn ) = ∅. Thus, both |Fn (Sn )| (1) (1) (0) 1 and |Fn (Sn )| are smaller than |Sn |: 1 ≤ |Sn−1 | = |Fn (Sn )| ≤ |Fn (Sn )| = n−1 0 |Sn−1 | < |Sn | ≤ 2 . Figure 3 illustrates this case. By the inductive hypoth(0) 0 )) ≥ (n − 1) − esis, we have κ(LT Q0n−1 − Fn (Sn )) = κ(LT Q0n−1 − F (Sn−1 0 1 1 1 |> 2|Sn−1 | = n−1−2(|Sn |−|Sn−1 |) = (n−1−2|Sn |)+2|Sn−1 | ≥ |T |+2|Sn−1 (0) (0) (0) (0) 1 (0) 0 |T |+|Sn−1 | = |T |+|F1 (Sn )| so that (LT Qn−1 −Fn (Sn ))−(F1 (Sn )∪ (1) (1) {T (0) }) is connected. Similarly, (LT Q1n−1 − Fn (Sn )) − (F1 (Sn ) ∪ {T (1) }) (0) (1) is connected too. As n|Fn (Sn )| + n|Fn (Sn )| + |T | = n|Sn | + |T | ≤ n−1 for n|Sn |+(n−2|Sn |−1) = (n−2)|Sn |+n−1 ≤ (n−2)× n−1 2 +n−1 < 2 (0) (0) 0 n ≥ 4, there exists a vertex w in (LT Qn−1 − Fn (Sn )) − (F1 (Sn ) ∪ {T (0) })
Fig. 2. Illustrating the partition of F (Sn ).
318
T.-L. Kung et al.
Fig. 3. Illustrating the proof of Case 1.
such that (w)n−1 is in (LT Q1n−1 − Fn (Sn )) − (F1 (Sn ) ∪ {T (1) }). Therefore, (LT Qn − F (Sn )) − T is connected. (1) 1 0 • Case 2: |Sn−1 | = 0 (i.e., |Sn−1 | = |Sn |). That is, Fn (Sn ) = ∅, and thus (0) 0 |F (Sn−1 )| = |Fn (Sn )| = |Sn | ≤ n−1 2 . (1)
(1)
– Subcase 2.1: T (1) = ∅ (i.e., T (0) = T ). Figure 4 illustrates this subcase. (1) (0) Since |F1 (Sn )| = |Fn (Sn )| = |Sn | ≤ n−1 2 < n − 1 = κ(LT Qn−1 ), (1) 1 LT Qn−1 − F1 (Sn ) is connected. Obviously, every vertex of (LT Q0n−1 − (0) (1) Fn (Sn )) − T (0) links to its (n − 1)-neighbor in LT Q1n−1 − F1 (Sn ). Therefore, (LT Qn − F (Sn )) − T is connected. – Subcase 2.2: T (1) = ∅ (i.e., |T (1) | ≥ 1). Thus, |T (0) | = |T | − |T (1) | < n−2|Sn |−1. Figure 5 illustrates this subcase. By the inductive hypothesis, (0) 0 we can derive κ(LT Q0n−1 −Fn (Sn )) = κ(LT Q0n−1 −F (Sn−1 )) ≥ (n−1)− (0) 0 (0) 0 2|Sn−1 | = n−1−2|Sn | > |T |. As a result, (LT Qn−1 −Fn (Sn ))−T (0) is (1) connected. As |F1 (Sn )| + |T (1) | = |Sn | + |T (1) | ≤ |Sn | + |T | ≤ |Sn | + (n − (1) 2|Sn |−1) = n−|Sn |−1 < n−1 = κ(LT Qn−1 ), (LT Q1n−1 −F1 (Sn ))−T (1) (0) is connected. Since n|Fn (Sn )| + |T | = n|Sn | + |T | ≤ n|Sn | + (n − 2|Sn | − n−1 for n ≥ 4, 1) = (n − 2)|Sn | + n − 1 ≤ (n − 2) × n−1 2 +n−1 < 2
Fig. 4. Illustrating the proof of Subcase 2.1.
On the Neighborhood-Connectivity of Locally Twisted Cube Networks
319
Fig. 5. Illustrating the proof of Subcase 2.2.
there exists a vertex w in (LT Q0n−1 − Fn (Sn )) − T (0) such that (w)n−1 (1) (1) is in (LT Qn−1 − F0 (Sn )) − T (1) . Therefore, (LT Qn − F (Sn )) − T is connected. (0)
Lemma 3. For n ≥ 3, nκ(LT Qn ) ≤ n2 . Proof. Suppose that v is any vertex of LT Qn for n ≥ 3. Let n−2 Si (v) = {(v)i , (v)i+1 , ((v)i )i+1 } ∪ j=1 {(((v)i )i+1 )i+1+j (mod n) } f or 0 ≤ i ≤ n − 2 and Sn−1 (v) = {(v)n−1 , ((v)n−1 )0 } ∪ {(((v)n−1 )0 )j | 1 ≤ j ≤ n − 1}. Then the Sk (v) is identical to NLT Qn [((v)k )k+1 (mod n) ] for each 0 ≤ k ≤ n − 1. Let S (v) | 0 ≤ j ≤ n2 − 1 if n is even F (v) = 2j S2j (v) | 0 ≤ j ≤ n2 if n is odd Apparently, v becomes an isolated vertex in LT Qn − F (v) so that F (v) forms a neighborhood-cut of LT Qn . This directly implies that nκ(LT Qn ) can be upperbounded by |F (v)| = n2 . Figure 6 illustrates neighborhood-cuts in LT Q5 and LT Q6 . Theorem 1. For n ≥ 2, nκ(LT Qn ) = n2 . Proof. By Lemma 3, nκ(LT Qn ) ≤ n2 . It suffices to show that nκ(LT Qn ) ≥ n2 . For any Sn ⊂ V (Qn ) with |Sn | ≤ n2 −1, let F (Sn ) = {NLT Qn [v] | v ∈ Sn }. As n2 − 1 ≤ n2 , Lemma 2 assures that κ(LT Qn − F (Sn )) ≥ n − 2|Sn | ≥ n − 2( n2 − 1) ≥ 1. Thus, LT Qn − F (Sn ) is connected, and this implies that nκ(LT Qn ) ≥ n2 .
320
T.-L. Kung et al.
01001
110011
11101
10111
10001
10011
001111
S4(v)
000011
000001
10000 v = 00000
S2(v)
00101
00010
01000
00100
01111
11011
S4(v)
000010
S0(v)
S2(v)
100000
01100 01110
(a) A neighborhood-cut in LTQ5
010000
001000
110000
11100 01101
00011
000101
z = 000000
S0(v) 00001
011011
110001 110100
110010 111000
000100 001100
101100
001110
011100
001101
(b) A neighborhood-cut in LTQ6
Fig. 6. Neighborhood-cuts in LT Q5 and LT Q6 .
4
Conclusion
The neighborhood-cut addresses the scenario that every node can damage its neighborhood simultaneously so that it gets isolated in a network. The neighborhood-connectivity is a reasonable assessment on the degree of a network’s connectedness in such a situation. In this paper, we determine the exact neighborhood-connectivity of locally twisted cubes. Our result shows that nκ(LT Qn ) = n2 for n ≥ 2. In our future study, we will consider a probabilistic case that a node just damage parts of its neighborhood in networks. Acknowledgements. This work is supported in part by the Ministry of Science and Technology, Taiwan, under Grant No. MOST 109-2221-E-468-009-MY2.
References 1. Akers, S.B., Krishnamurthy, B.: A group theoretic model for symmetric interconnection networks. IEEE Trans. Comput. 38(4), 555–566 (1989) 2. Bondy, J.A., Murty, U.S.R.: Graph Theory. Springer, London (2008) 3. Chang, N.-W., Hsieh, S.-Y.: {2, 3}-Extraconnectivities of hypercube-like networks. J. Comput. Syst. Sci. 79, 669–688 (2013) 4. Dally, W.J., Towles, B.: Principles and Practices of Interconnection Networks. Morgan Kaufmann, San Francisco (2004) 5. Day, K., Tripathi, A.: Arrangement graphs: a class of generalized star graphs. Inf. Process. Lett. 42(5), 235–241 (1992) 6. Efe, K.: The crossed cube architecture for parallel computing. IEEE Trans. Parallel Distrib. Syst. 3, 513–524 (1992) 7. Flahive, M., Bose, B.: The topology of Gaussian and Eisenstein-Jacobi interconnection networks. IEEE Trans. Parallel Distrib. Syst. 21(8), 1132–1142 (2010)
On the Neighborhood-Connectivity of Locally Twisted Cube Networks
321
8. Han, Y., Fan, J., Zhang, S., Yang, J., Qian, P.: Embedding meshes into locally twisted cubes. Inf. Sci. 180, 3794–3805 (2010) 9. Harary, F., Hayes, J.P., Wu, H.-J.: A survey of the theory of hypercube graphs. Comput. Math. Appl. 15, 277–289 (1988) 10. Hsieh, S.-Y., Tu, C.-J.: Constructing edge-disjoint spanning trees in locally twisted cubes. Theor. Comput. Sci. 410, 926–932 (2009) 11. Hsieh, S.-Y., Wu, C.-Y.: Edge-fault-tolerant hamiltonicity of locally twisted cubes under conditional edge faults. J. Comb. Optim. 19, 16–30 (2010) 12. Wei, C.-C., Hsieh, S.-Y.: h-restricted connectivity of locally twisted cubes. Discret. Appl. Math. 217, 330–339 (2017) 13. Hsu, L.-H., Lin, C.-K.: Graph Theory and Interconnection Networks. CRC Press, Boca Raton (2008) 14. Hung, R.-W.: Embedding two edge-disjoint Hamiltonian cycles into locally twisted cubes. Theor. Comput. Sci. 412, 4747–4753 (2011) 15. Kung, T.-L.: Flexible cycle embedding in the locally twisted cube with nodes positioned at any prescribed distance. Inf. Sci. 242, 92–102 (2013) 16. Kung, T.-L., Chen, H.-C.: Improving the panconnectedness property of locally twisted cubes. Int. J. Comput. Math. 91(9), 1863–1873 (2014) 17. Kung, T.-L., Chen, H.-C., Lin, C.-H., Hsu, L.-H.: Three types of two-disjoint-cyclecover pancyclicity and their applications to cycle embedding in locally twisted cubes. Comput. J. 64(1), 27–37 (2021) 18. Leighton, F.T.: Introduction to Parallel Algorithms and Architectures: Arrays · Trees · Hypercubes. Morgan Kaufmann, San Mateo (1992) 19. Li, T.-K., Lai, C.-J., Tsai, C.-H.: A novel algorithm to embed a multi-dimensional torus into a locally twisted cube. Theor. Comput. Sci. 412, 2418–2424 (2011) 20. Loh, P.K.K., Hsu, W.J., Pan, Y.: The exchanged hypercube. IEEE Trans. Parallel Distrib. Syst. 16(9), 866–874 (2005) 21. Ma, M., Xu, J.-M.: Panconnectivity of locally twisted cubes. Appl. Math. Lett. 19, 673–677 (2006) 22. Ma, M., Xu, J.-M.: Weak edge-pancyclicity of locally twisted cubes. ARS Comb. 89, 89–94 (2008) 23. Mart´ınez, C., Beivide, R., Stafford, E., Moret´ o, M., Gabidulin, E.M.: Modeling toroidal networks with the Gaussian integers. IEEE Trans. Comput. 57(8), 1046– 1056 (2008) 24. Menger, K., Kurventheorie, Z.: Fundam. Math. 10, 96–115 (1927) 25. Saad, Y., Shultz, M.H.: Topological properties of hypercubes. IEEE Trans. Comput. 37, 867–872 (1988) 26. Xu, J.-M.: Topological Structure and Analysis of Interconnection Networks. Kluwer Academic Publishers, Dordrecht/Boston/London (2001) 27. Xu, X., Zhai, W., Xu, J.-M., Deng, A., Yang, Y.: Fault-tolerant edge-pancyclicity of locally twisted cubes. Inf. Sci. 181, 2268–2277 (2011) 28. Yang, X., Evans, D.J., Megson, G.M.: The locally twisted cubes. Int. J. Comput. Math. 82, 401–413 (2005) 29. Yang, X., Megson, G.M., Evans, D.J.: Locally twisted cubes are 4-pancyclic. Appl. Math. Lett. 17, 919–925 (2004)
Assessing the Super Pk -Connectedness of Crossed Cubes Yuan-Hsiang Teng1 and Tzu-Liang Kung2(B) 1
2
Department of Computer Science and Information Engineering, Providence University, Taichung, Taiwan [email protected] Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan [email protected]
Abstract. An interconnection network is a programmable system that serves to transport data or messages amongst network components and/or terminals. A network’s topology is typically modeled by a graph. A path of order k in a graph G is a sequence of k distinct vertices, denoted by Pk = v1 , v2 , · · · , vk , in which any two consecutive vertices are adjacent. The connectivity is a classic index to assess the level of network reliability and fault tolerance. For k ≥ 2, a set F of vertex subsets of G is a Pk -cut if G − F is disconnected, and each element of F happens to induce a Pk -subgraph in G. A connected graph G is super Pk -connected if the smallest component of G − F is a singleton for every minimum Pk -cut F of G. A network with smaller diameter can reduce its communication delay in a worstcase perspective. The crossed cube CQn is a hypercube variant whose diameter is about one half of that of the hypercube. This paper is inspired to discover whether CQn is super Pk -connected for k = 2, 3, 4.
1 Introduction Interconnection networks are backbones of high-performance computing [14, 21, 25]. Efficient network architectures support high-speed data communication based on flexible possibilities of path and cycle embedding [16–18]. A typical methodology to deal with an interconnection network’s topological structure is on the basis of graph theory [1]. A simple, undirected graph G = (V, E) is a two-tuple consisting of vertex set V and edge set E. For convenience, we also use V (G) and E(G) to denote the vertex and edge sets of G, respectively. For notational simplicity, the unordered pair {u, v} represents the edge joining adjacent vertices u and v; equivalently, u and v are neighbors of each other in G. The degree of a vertex v in G, denoted by degG (v), is the number of edges incident to v. The minimum degree of G is defined as δ (G) = min{degG (v) | v ∈ V (G)}. For any vertex v ∈ V (G), NG (v) = {u ∈ V (G) | {u,v} ∈ E(G)} denotes the neighborhood of v; for any vertex subset S ⊂ V (G), NG (S) = v∈S NG (v) \ S. A subgraph H of G is a graph with V (H) ⊆ V (G) and E(H) ⊆ E(G). A path of order k in G is a sequence of k distinct vertices, denoted by Pk = v1 , v2 , · · · , vk , in which every two consecutive vertices are adjacent. c The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 322–329, 2022. https://doi.org/10.1007/978-3-030-79728-7_32
Assessing the Super Pk -Connectedness of Crossed Cubes
323
A graph G is connected if for every pair u, v of distinct vertices of G, there exists a path between u and v. The connectivity of G, κ (G), is the minimum number of vertices whose removal from G makes the survival graph disconnected or trivial. A vertex-cut of a graph G is a subset F of V (G) such that G − F is disconnected or trivial. It is well known that κ (G) appears to be the minimum cardinality over all vertex cuts of G. Using Menger’s theorem [22] the exact value of κ (G) can be determined in polynomial time. A graph G is super connected if its minimum vertex-cut is always composed of the neighborhood of a vertex v ∈ V (G). For this reason, a vertex-cut F of G is said to be a super vertex-cut if δ (G − F) > 0. A cluster C in a graph G is a vertex subsets of G such that the C-induced subgraph G[C] is either connected or trivial. If all vertices in C are faulty, then C is a faulty cluster. Cluster fault model has been addressed to establish efficient, fault-tolerant routing in today’s interconnection networks [2, 10–12]. Following the definition given in [20], a set F of clusters of G is a cluster-cut if G − F is disconnected or trivial. Suppose that H is any connected graph or trivial graph. Then a cluster-cut F of G is an H-cut if for each cluster C ∈ F, H is isomorphic to a spanning subgraph of G[C]. The H-connectivity of G, denoted by κ (G|H), is the minimum cardinality over all H-cuts of G. Let H ∗ denote the union of the set of all connected subgraphs of H and the set of the trivial graph. Then a cluster-cut F of G is an H ∗ -cut if for each cluster C ∈ F, at least one of H ∗ is isomorphic to a spanning subgraph of G[C]. The H ∗ -connectivity of G, denoted by κ (G|H ∗ ), is the minimum cardinality over all H ∗ -cuts of G. An H-cut F of a connected graph G is said to be a super H-cut if δ (G − F) > 0. Similarly, an H ∗ -cut F of G is a super H ∗ -cut if δ (G − F) > 0. Then a connected graph G is super H-connected if δ (G − F) = 0 for every minimum H-cut F of G; analogously, G is super H ∗ -connected if δ (G − F) = 0 for every minimum H ∗ -cut F of G. The crossed cube [8] is a hypercube variant with lower diameter than that of the hypercube. Many attractive properties of the crossed cube have been widely discovered [3–6, 19, 23, 26]. In this paper, we are going to study whether or not the n-dimensional crossed cube CQn can be super Pk -connected and super Pk∗ -connected for k = 2, 3, 4. The rest of this paper is structured as follows. Section 2 introduces the topological properties of crossed cubes. Section 3 discovers the super Pk - and Pk∗ -connectedness of crossed cubes. Finally, some concluding remarks are drawn in Sect. 4.
2 Preliminary The n-dimensional crossed cube CQn has 2n vertices, each of which corresponds to an n-bit binary string. Any two 2-bit binary strings x1 x0 and y1 y0 are pair-related, briefly denoted by x1 x0 ∼ y1 y0 , if and only if (x1 x0 , y1 y0 ) ∈ {(00, 00), (10, 10), (01, 11), (11, 01)}. The formal definition of CQn is given below. Definition 1 [8]. The n-dimensional crossed cube CQn is recursively constructed as follows: (i) CQ1 is a complete graph with vertex set {0, 1}. (ii) CQ2 is isomorphic to C4 with vertex set {00, 01, 10, 11} and edge set {{00, 01}, {00, 10}, {01, 11}, {10, 11}}.
324
Y.-H. Teng and T.-L. Kung
(iii) For n ≥ 3, let CQ0n−1 and CQ1n−1 denote two copies of CQn−1 with V (CQ0n−1 ) = {0un−2 un−3 · · · u0 | ui = 0 or 1, 0 ≤ i ≤ n − 2} and V (CQ1n−1 ) = {1un−2 un−3 · · · u0 | ui = 0 or 1, 0 ≤ i ≤ n − 2}. Then CQn is formed by combining CQ0n−1 and CQ1n−1 with 2n−1 edges so that a vertex u = 0un−2 un−1 · · · u0 of CQ0n−1 is directly linked to a vertex v = 1vn−2 vn−3 · · · v0 of CQ1n−1 if and only if (1) un−2 = vn−2 while n is even, and (2) u2i+1 u2i ∼ v2i+1 v2i for all i, 0 ≤ i < n−1 2 . For the sake of brevity, the recursive construction of CQn is expressed as CQn = CQ0n−1 ⊕CQ1n−1 . Figure 1 depicts CQ3 and CQ4 . It was proved that CQn is n-connected [15] and its diameter is n+1 2 [8].
Fig. 1. Illustrating CQ3 and CQ4 .
An i-edge in CQn , 0 ≤ i ≤ n − 1, links vertices u = un−1 un−2 . . . u0 and v = vn−1 vn−2 . . . v0 if the following conditions can be satisfied: (i) ui = vi , (ii) u j = v j for all j > i, (iii) u2k+1 u2k ∼ v2k+1 v2k for all k, 0 ≤ k < 2i , and (iv) ui−1 = vi−1 while i is odd. Then u is the i-neighbor of v, denoted by (v)i , and vice versa. Lemma 1 [13]. For n ≥ 3, no subgraph of CQn is isomorphic to K3 or K2,3 . Lemma 2 [13]. For any two vertices u and v in CQn , |NCQn (u) ∩ NCQn (v)| ≤ 2. Lemma 3 [15]. For 1 ≤ k < n, let u and v be k-neighbors of each other in CQn . If i is an odd integer between 0 and k − 1 or i = k − 1, then CQn [{u, v, (u)i , (v)i }] is isomorphic to a cycle of order four. Lemma 4 [8]. For n ≥ 1, κ (CQn ) = n. The g-extra connectivity of a graph G, denoted by κg (G), is the minimum cardinality over vertex-cuts of G, whose deletion not only disconnects G but also every remaining component has more than g vertices [9]. Lemma 5 [7]. For n ≥ 3, κ1 (CQn ) = 2n − 2. Lemma 6 [24]. For n ≥ 5 and 0 ≤ g ≤ n2 , κg (CQn ) = n(g + 1) − g(g+3) 2 . Lemma 7 [23]. Let Pk be any path of order k in CQn for 2 ≤ k ≤ n. Suppose that v is any vertex in CQn −V (Pk ). Then |NCQn (v) ∩V (Pk )| ≤ 2k .
Assessing the Super Pk -Connectedness of Crossed Cubes
325
Lemma 8 [23]. Let Pk be any path of order k in CQn for 2 ≤ k ≤ n. Suppose that u and v are any two adjacent vertices in CQn −V (Pk ). Then |NCQn ({u, v}) ∩V (Pk )| ≤ 2k 3 . Based on the results proposed by Pan and Cheng [23], we have the following two lemmas. Lemma 9. For n ≥ 3, κ (CQn |P2 ) = κ (CQn |P2∗ ) = n − 1. Lemma 10. For 3 ≤ k ≤ n, κ (CQn |Pk ) = κ (CQn |Pk∗ ) =
n . 2k
3 Super Pk - and Pk∗ -Connectedness of CQn For any vertex x of CQn , let F = {(x)i , ((x)1 )i } | 0 ≤ i ≤ n − 1, i = 1 . By Lemma 3, (x)i and ((x)1 )i are adjacent for i = 0, 2, 3 . . . , n − 1. That is, for each element S ∈ F, CQn [S] is isomorphic to P2 . Because CQn [{x, (x)1 }] is an isolated component in CQn − F, F turns out to be a P2 -cut in CQn . As CQn = CQ0n−1 ⊕CQ1n−1 , Fig. 2 illustrates F, in which x is supposedly within CQ0n−1 . By Lemma 9, CQ1n−1 − {(x)n−1 , ((x)1 )n−1 } is connected. Because every vertex of CQ0n−1 − {(x)i , ((x)1 )i | 0 ≤ i ≤ n − 2} has its (n−1)-neighbor in CQ1n−1 −{(x)n−1 , ((x)1 )n−1 }, CQn −F consists of exactly two components. Since δ (CQn −F) = 1, F is a super P2 -cut. By Lemma 9, F is a minimum super P2 -cut. As a consequence, CQn is not super P2 -connected. Theorem 1. For n ≥ 3, CQn is neither super P2 -connected nor super P2∗ -connected.
Fig. 2. A P2 -cut F = {(x)i , ((x)1 )i } | 0 ≤ i ≤ n − 1, i = 1 in CQn .
Below we address the super P3 -connectedness and super P3∗ -connectedness. Theorem 2. CQ4 is super P3 -connected and super P3∗ -connected.
326
Y.-H. Teng and T.-L. Kung
Proof. The proof proceeds by contradiction. Suppose that there exists a minimum that δ (CQ4 − F) > 0. By Lemma 10, |F| = κ (CQ4 |P3∗ ) = P3∗ -cut F in CQ4 such κ (CQ4 |P3 ) = 2. As | S∈F S| ≤ 6, we consider the following two cases. Case 1: | S∈F S| < 6. It follows from Lemma 5 that | S∈F S| < κ1 (CQ4 ) = 6. Thus, the smallest component of CQ4 − F must be a singleton, contradicting the assumption that δ (CQ4 − F) > 0. Case 2: | S∈F S| = 6. Let F = {V1 ,V2 }, where V1 = {u1 , u2 , u3 } and V2 = {v1 , v2 , v3 } are disjoint such that both CQ4 [V1 ] and CQ4 [V2 ] are isomorphic to P3 . Without loss of generality, we assume that u1 , u2 , u3 and v1 , v2 , v3 are P3 . As CQ4 = CQ03 ⊕ CQ13 , let V1j = V1 ∩V (CQ3j ) and V2j = V2 ∩V (CQ3j ) for j = 0, 1. Since |V1j | + |V11− j | = |V2j | + |V21− j | = 3, we distinguish the following subcases. 2.1: |V1k | ≥ 2 and |V2k | ≥ 2 with k = 0 or 1. Thus, |V11−k ∪ V21−k | ≤ 2. By Lemma 4, − (V11−k ∪ V21−k ) is connected. As each vertex of CQk3 − (V1k ∪ V2k ) is CQ1−k 3 1−k linked to its 3-neighbor in CQ1−k ∪V21−k ), CQ4 − F is connected, lead3 − (V1 ing to a contradiction. 2.2: |V1k | = 3 and |V2k | = 1 with k = 0 or 1. In general, we assume that 1−k {u1 , u2 , u3 , v3 } ⊂ V (CQk3 ) and {v1 , v2 } ⊂ V (CQ1−k − 3 ). By Lemma 10, CQ3 {v1 , v2 } is connected. Furthermore, there must exist a vertex x in CQk3 − {u1 , u2 , u3 , v3 } such that (x)3 is in CQ1−k 3 − {v1 , v2 }. k a. Suppose that CQ3 − {u1 , u2 , u3 , v3 } is connected. Then CQ4 − F is connected, leading to a contradiction. b. Suppose that CQk3 − {u1 , u2 , u3 , v3 } is disconnected. Based on a brute-force enumeration, if (u2 )1 ∈ {u1 , u3 }, then CQk3 −{u1 , u2 , u3 , v3 } is connected. For the sake of symmetry, we consider the removal of u2 = 0000 and u1 = 0010. Then either {u3 , v3 , v2 } = {0100, 0111, 1101}, v1 ∈ {1100, 1111, 1011} or {u3 , v3 , v2 } = {0001, 0101, 1111}, v1 ∈ {1110, 1101}. As CQk3 − {u1 , u2 , u3 , v3 } always consists of two components, both − {v1 , v2 }, CQ4 − F is connected, leading are connected to CQ1−k 3
of which to a con-
tradiction. 2.3: |V1k | + |V2k | = 3 with k = 0 or 1. Obviously, we have |V11−k | + |V21−k | = 3. By − (V11−k ∪ V21−k ) are conLemmas 5 and 7, both CQk3 − (V1k ∪ V2k ) and CQ1−k 3 k k nected. Because there exists a vertex x in CQ3 − (V1 ∪ V2k ) such that (x)3 is in 1−k ∪V21−k ), CQ4 − F remains connected, leading to a contradiction. CQ1−k 3 − (V1 As a consequence, every minimum P3∗ -cut of CQ4 is not a super P3∗ -cut, and the proof is completed. Theorem 3. For n ≥ 5, CQn is super P3 -connected and super P3∗ -connected.
Assessing the Super Pk -Connectedness of Crossed Cubes
327
Proof. Let F be any minimum P3 -cut (or minimum P3∗ -cut) of CQn . By Lemma 10, |F| = κ (CQn |P3 ) = κ (CQn |P3∗ ) = n2 . Let C be the smallest component of CQn − F. We claim that C is a singleton. The proof proceeds by contradiction. Suppose that C has more than one vertex. Case 1: |V (C)| = 2. Let V (C) = {u, v}. As CQn is K3 -free, |NCQn ({u, v})| = 2n − 2. By Lemma 8, |L ∩ NCQn ({u, v})| ≤ 2 for each L ∈ F. Thus, the cardinality of F requires |N
({u,v})|
n |F| ≥ CQn 2 ≥ 2n−2 2 = n − 1 > 2 for n ≥ 5, leading to a contradiction. Case 2: |V (C)| ≥ 3. Then F contains at most 3|F| = 3 × n2 vertices. By Lemma 6, κ2 (CQn ) = 3n − 5 > 3 × n2 for n ≥ 5. Consequently, C contains at most two vertices, leading to a contradiction. As a result, C must be a singleton, and CQn is super P3 -connected and super P3∗ connected.
Both CQ4 and CQ5 are neither super P4 -connected nor super P4∗ -connected. For CQ4 , Z is a minimum P4 -cut such that δ (CQ4 − Z) > 0: Z = {{0100, 0101, 0111, 0110}, {1000, 1001, 1011, 1010}}. Then Z = {{00100, 00101, 00111, 00110}, {01000, 01001, 01011, 01010}, {10000, 10010, 10011, 10001}} is a minimum P4 -cut in CQ5 such that δ (CQ5 − Z ) > 0. However, CQ6 is super P4 -connected and super P4∗ -connected. Theorem 4. For n = 6 or n ≥ 8, CQn is super P4 -connected and super P4∗ -connected. Proof. Let Z be any minimum P4 -cut (or minimum P4∗ -cut) of CQn . By Lemma 10, |Z| = κ (CQn |P4 ) = κ (CQn |P4∗ ) = n2 . Let C be the smallest component of CQn − Z. We claim that C is a singleton. The proof proceeds by contradiction. Suppose that C has two or more vertices. Case 1: |V (C)| = 2. Let V (C) = {u, v}. As CQn is K3 -free, |NCQn ({u, v})| = 2n − 2. By Lemma 8, |S ∩ NCQn ({u, v})| ≤ 2×4 3 = 3 for each S ∈ Z. Thus, |Z| ≥
|NCQn ({u,v})| ≥ 2n−2 3 3 ,
leading to a contradiction while n = 6 or n ≥ 8. Case 2: |V (C)| ≥ 3. Then Z totally contains at most 4 × n2 vertices. By Lemma 6, κ2 (CQn ) = 3n−5 > 4 n2 for n = 6 or n ≥ 8. Then C contains no more than two vertices, leading to a contradiction. As a result, C must be a singleton, and the proof is completed. For any vertex x of CQ7 , Fig. 3 shows a super P4 -cut {Z1 , Z2 , Z3 , Z4 }, where Z1 = {((x)0 )1 , (x)0 , ((x)0 )2 , ((x)1 )2 }, Z2 = {(x)2 , ((x)2 )3 , (x)3 , ((x)3 )1 }, Z3 = {((x)4 )1 , (x)4 , ((x)4 )5 , (x)5 }, Z4 = {((x)1 )5 , (((x)1 )5 )6 , ((x)1 )6 , (x)6 )}. Since NCQ7 ({x, (x)1 }) ⊂ Z1 ∪ Z2 ∪ Z3 ∪ Z4 , we have δ (CQ7 − {Z1 , Z2 , Z3 , Z4 }) = 1. As κ (CQ7 |P4 ) = κ (CQ7 |P4∗ ) = 4, CQ7 is neither super P4 -connected nor super P4∗ connected.
328
Y.-H. Teng and T.-L. Kung 0
1
CQ6
CQ6
(x)5 (x)4 ((x)4)5 (x)3
Z2
(x)0
(x)2 ((x)0)2
Z3 ((x)1)4
(x)6
x
Z1
Z4
((x)1)0
((x)1)2 ((x)1)3
(x)1
((x)1)6 (((x)1)5)6
((x)1)5
Fig. 3. A super P4 -cut in CQ7 .
4 Conclusion The super Pk -connectedness and super Pk∗ -connectedness are natural extension to the notion behind the classic vertex-connectivity. In this paper, we determine whether CQn is super Pk -connected and super Pk∗ -connected for k = 2, 3, 4. Regarding to our future work, it is also intriguing to discover if CQn is super Pk -connected and super Pk∗ connected for a larger k ≥ 5. Acknowledgements. This work is supported in part by the Ministry of Science and Technology, Taiwan, under Grant No. MOST 109-2221-E-468-009-MY2.
References 1. Bondy, J.A., Murty, U.S.R.: Graph Theory. Springer, London (2008) 2. Bossard, A., Kaneko, K.: Cluster-fault tolerant routing in a torus. Sensors 20(11), 3286, 1–17 (2020) 3. Chang, C.-P., Sung, T.-Y., Hsu, L.-H.: Edge congestion and topological properties of crossed cubes. IEEE Trans. Parallel Distrib. Syst. 11, 64–80 (2000) 4. Chen, H.-C., Kung, T.-L., Hsu, L.-H.: Embedding a Hamiltonian cycle in the crossed cube with two required vertices in the fixed positions. Appl. Math. Comput. 217, 10058–10065 (2011) 5. Chen, H.-C., Kung, T.-L., Hsu, L.-Y.: 2-Disjoint-path-coverable panconnectedness of crossed cubes. J. Supercomput. 71, 2767–2782 (2015) 6. Chen, H.-C.: The panpositionable panconnectedness of crossed cubes. J. Supercomput. 74(6), 2638–2655 (2018) 7. Chen, Y.-C., Tan, J.J.M.: Restricted connectivity for three families of interconnection networks. Appl. Math. Comput. 188(2), 1848–1855 (2007)
Assessing the Super Pk -Connectedness of Crossed Cubes
329
8. Efe, K.: The crossed cube architecture for parallel computing. IEEE Trans. Parallel Distrib. Syst. 3, 513–524 (1992) 9. F´abrega, J., Fiol, M.A.: On the extraconnectivity of graphs. Discret. Math. 155, 49–57 (1996) 10. Gu, Q.-P., Peng, S.: An efficient algorithm for node-to-node routing in hypercubes with faulty clusters. Comput. J. 39, 14–19 (1996) 11. Gu, Q.-P., Peng, S.: k-pairwise cluster fault tolerant routing in hypercubes. IEEE Trans. Comput. 46, 1042–1049 (1997) 12. Gu, Q.-P., Peng, S.: Node-to-set and set-to-set cluster fault tolerant routing in hypercubes. Parallel Comput. 24, 1245–1261 (1998) 13. Hung, C.-N., Lin, C.-K., Lin, L.-H., Cheng, E., Lipt´ak, L.: Strong fault-Hamiltonicity for the crossed cube and its extensions. Parallel Process. Lett. 27(2), 1750005 (2017) 14. Hsu, L.-H., Lin, C.-K.: Graph Theory and Interconnection Networks. CRC Press, Boca Raton/London/New York (2008) 15. Kulasinghe, P.: Connectivity of the crossed cube. Inf. Process. Lett. 61, 221–226 (1997) 16. Kung, T.-L., Lin, C.-K., Liang, T., Hsu, L.-H., Tan, J.J.M.: On the bipanpositionable bipanconnectedness of hypercubes. Theor. Comput. Sci. 410, 801–811 (2009) 17. Kung, T.-L., Teng, Y.-H., Hsu, L.-H.: The panpositionable panconnectedness of augmented cubes. Inf. Sci. 180, 3781–3793 (2010) 18. Kung, T.-L.: Flexible cycle embedding in the locally twisted cube with nodes positioned at any prescribed distance. Inf. Sci. 242, 92–102 (2013) 19. Kung, T.-L., Chen, H.-C.: Optimizing Hamiltonian panconnectedness for the crossed cube architecture. Appl. Math. Comput. 331, 287–296 (2018) 20. Kung, T.-L., Lin, C.-K.: Cluster connectivity of hypercube-based networks under the super fault-tolerance condition. Disc. Appl. Math. 293, 143–156 (2021) 21. Leighton, F.T.: Introduction to Parallel Algorithms and Architectures: Arrays · Trees · Hypercubes. Morgan Kaufmann, San Mateo (1992) 22. Menger, K.: Zur allgemeinen Kurventheorie. Fundam. Math. 10, 96–115 (1927) 23. Pan, Z., Cheng, D.: Structure connectivity and substructure connectivity of the crossed cube. Theor. Comput. Sci. 824–825, 67–80 (2020) 24. Wang, S., Ma, X.: The g-extra connectivity and diagnosability of crossed cubes. Appl. Math. Comput. 336, 60–66 (2018) 25. Xu, J.-M.: Topological Structure and Analysis of Interconnection Networks. Kluwer Academic Publishers, Dordrecht/Boston/London (2001) 26. Yang, M.-C., Li, T.-K., Tan, J.J.M., Hsu, L.-H.: Fault-tolerant cycle embedding of crossed cubes. Inf. Process. Lett. 88, 149–154 (2003)
Learning Performance Prediction with Imbalanced Virtual Learning Environment Students’ Interactions Data Hsing-Chung Chen1,2 , Eko Prasetyo1,3 , Prayitno1,4 , Sri Suning Kusumawardani5 , Shian-Shyong Tseng6(B) , Tzu-Liang Kung1(B) , and Kuei-Yuan Wang7 1 Department of Computer Science and Information Engineering, Asia University,
Taichung City, Taiwan [email protected] 2 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung City, Taiwan 3 Department of Information Technology, Universitas Muhammadiyah Yogyakarta, Yogyakarta, Indonesia 4 Department of Electrical Engineering, Politeknik Negeri Semarang, Semarang, Indonesia 5 Department of Electrical Engineering and Information Technology, Universitas Gadjah Mada Yogyakarta, Yogyakarta, Indonesia 6 Department of M-Commerce and Multimedia Applications, Asia University, Taichung City, Taiwan [email protected] 7 Department of Finance, Asia University, Taichung City, Taiwan
Abstract. One of the critical aspects in completing study in a virtual learning environment (VLE) is the student behavior when interacting with the system. However, in real cases, most of the student behavior data have imbalanced label distribution. This imbalanced dataset affects the model performance of machine learning algorithms significantly. This study attempts to examine several resampling methods such as random undersampling (RUS), oversampling with synthetic minority oversampling technique (SMOTE), and hybrid sampling (SMOTEENN) to resolve the imbalanced data issue. Several machine learning (ML) classifiers are employed to evaluate the efficiency of the resampling methods, including Naïve Bayes (NB), Logistic Regression (LR), and Random Forest (RF). The experiment results indicate that the performance of classifiers is improved utilizing more balanced dataset. Furthermore, the Random Forest classifier has accomplished the best result among all other models while using SMOTEENN as a resampling approach.
1 Introduction The World Health Organization (WHO) issued a recommendation to all countries to limit face-to-face meetings to prevent the spread of Covid-191 . This recommendation impacts education, where face-to-face learning should be carried out online [1, 2]. The 1 https://www.who.int/emergencies/diseases/novel-coronavirus-2019/technical-guidance.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 330–340, 2022. https://doi.org/10.1007/978-3-030-79728-7_33
Learning Performance Prediction with Imbalanced
331
sudden change from face-to-face learning to online learning raises several problems. One of these problems is how students interact with a virtual learning environment (VLE) platform such as Moodle. Unfortunately, according to [3], most students avoid the process; therefore, they fail online. In [4], one of the success keys to graduate from the online course was assisting students in interacting with the online system. Therefore, identifying the learning performance at an early stage allows students to make appropriate interactions with VLE. When dealing with learning performance classifications with real-world VLE data, the problem becomes more challenging. There are possibilities that imbalanced data appears in the classification training dataset. In binary classification, the class with a larger distribution is called the majority, while the others are called the minority [5]. For example, at the end of an online course, the majority sample class is many students who pass the online course. Meanwhile, the minority sample class does not pass the course and has a tiny number. In 2017, [6] found that traditional machine learning (ML) algorithms failed to predict minority observations when dealing with imbalanced data; the classification model was inclined to choose the majority class. As in [6], the classification accuracy of the minority class will be much lower than the majority class. Finally, it is necessary to increase the ML model’s accuracy performance for both the minority and the majority class. This study investigates the effectiveness of resampling methods combined with machine learning techniques to predict students at risk of failure based on their interactions with the virtual learning environment. The contributions of this study include: • Evaluating the effectiveness of resampling methods combined with machine learning techniques such as Naïve Bayes (NB), Logistic Regression (LR), and Random Forest (RF). • Measuring the performance of the implemented models using different evaluation measure methods such as Accuracy, Recall, Precision, and F1-Score. • Showing the effect of the resampling methods on the classifiers’ performance. The paper is structured as follows. The next section describes a study of the existing literature on predictions of students’ performance. Section 3 provides a description of the dataset, proposed methods, and metrics for this study. The tests and results are described in Sect. 4 accompanied by performance evaluations. Finally, the conclusions and future directions are described in Sect. 5.
2 Literature Review Several works related to predicting student academic performance as a regression problem have been studied, where ML algorithms are used to predict students’ final grades. Moreover, the prediction of student academic performance consists of a pass, fail, or drop from the course. One of the variables that affect the student performance prediction is student behavior during the learning process. Several studies regarding the imbalanced distribution class have been conducted. They implement an algorithmic approach and data leveling approach to handle this
332
H.-C. Chen et al.
problem [7, 8]. The algorithmic approach is performed using a more conducive classifier to minority class or using a hybrid method. Generally, this approach uses cost-sensitive learning, boosting, bagging, and stacking [9]. Meanwhile, the data leveling approach is performed by adjusting the “relatively balanced” ratio between the minority and majority classes. Most data leveling approaches use resampling or data synthesis mechanisms at the pre-processing stage [10]. There are several types of machine learning algorithms. Certain algorithms perform better for use in different scenarios. This is due to the differences in the dataset regarding their applications. To generate the best model, several different algorithms are evaluated to solve a particular problem. Four different machine learning algorithms, i.e., Generative, Discriminative, Decision trees, and Ensemble learners, are described in a study [11]. The generative classifier is a statistical model for the distribution of joint probability on an observable variable (X ) and target variable (Y ). Naïve Bayes (NB) is one of Generative classifier model family. NB predicts each of the features x ∈ X in the model that is conditionally independent for each other. The discriminative probabilistic is a model, where Y ’s probability is modeled according to X ’s observation. Meanwhile, Logistic Regression (LR) is categorized into the Discriminative classifier. LR uses a logistic function to illustrate the relationship between one or more than one class, ratio level, or ordinal variables. Finally, Ensemble learners is a type of ML in which several models are trained on a particular task using the same algorithm. Then the results are combined to produce a better predictive performance against each constituent. Random Forest (RF) is a kind of Ensemble learner methods. It boosts performance by aggregating the results of different trained trees. The user could set a number of estimators before simulating. A problem, i.e. unbalanced data, could appear in a classification system and become a major problem that causes the reduction in classifier performance. Oversampling and undersampling is one way to solve the problem of imbalanced data by distributing data in a balanced way by repeatedly replicating random minority instances (i.e., synthetic data) [12]. On the other hand, undersampling techniques remove examples from the training dataset that belong to the majority class in order to better balance the class distribution [12].
3 Proposed Methods This section presents a proposed method for identifying student performance with imbalanced VLE data. Figure 1 presents the architecture of the proposed method. In the initial step, a database of student interactions with VLE is collected. Then, from the collected VLE student activities logs, feature extraction is performed. This requires a pre-processing method. After 8 features are extracted, the next step is to apply the resampling method. This method is implemented before training the data into ML algorithms which are useful for improving performance, especially in the imbalanced data processing. The output of the ML algorithm is the prediction of student performance, i.e., pass or fail.
Learning Performance Prediction with Imbalanced
333
Fig. 1. Architecture of the proposed method for student performance detection.
3.1 Learning Performance Prediction Task The student performance prediction task can be described as follows. Given a labeled interaction data from students D with n data D = {(x1 , y1 ), (x2 , y2 ), . . . , (xn , yn )} whereas xi (i = 1,2, . . . ,n) represent features in i-th space and yi ∈ {1,0}(i = 1,2, . . . , n) represent the label of learning performance at i-th. Given unlabeled interaction data from student x, the task is to predict whether the student is ‘pass’ or ‘failed’ such that: 1, whilexisthestudentpredictedpass f (x) = 0, otherwise
3.2 Dataset Information This study uses a dataset of the “Digital Transformation” course held from February to August 2019 at Gadjah Mada University Yogyakarta, Indonesia. This course is an interdisciplinary course, which is followed by various study programs. The number of participants was 977 students divided into 8 classes. The lecturer in this course is teamteaching. This course is delivered online using synchronous and asynchronous methods. On the one hand, the synchronous method is delivered using online meeting facility. On the other hand, the asynchronous method is provided using the Moodle VLE facility. The dataset obtained contains information on assignment scores, quizzes, and exams, as well as interactions between students and VLEs for more than 202,000 clickstreams. The dataset is recognized into two different data types: • Performance – reflects students’ results and achievements during their learning in the course. The score files contain the students’ marks of their activities related to quizzes, assignments, and exams on the VLE. • Learning portfolio – contains logs of student interactions with materials and activities in the VLE.
334
H.-C. Chen et al.
3.3 Data Preprocessing and Feature Construction Data preprocessing is one of the essential phases in machine learning. The raw data is converted into an appropriate format in this step. Hence, the step eliminates the errors, and datasets could be handled easily [11]. Datasets are preprocessed to provide detailed features that could increase the performance of the ML algorithm. The features also describe student behavior when interacting with VLE. To obtain this behavior, both types of files which are score and log types, are further processed to get the number of clickstreams for each student activity in learning. Student learning activities on the online course are listed in Table 1. Several other features are generated to facilitate understanding of student behavior and predict their course outcome. Furthermore, the number of students’ clickstreams in this learning activity is combined with a value table to generate meaningful features. Merging of these tables is carried out in all classes from this course. The next process is to classify the students’ performance based on the final grade of the course. Classification is performed by differentiating students with the “pass” specification (1) if the final score >= 50, while the “fail” & “withdrawn” specification (final score .05, which does not reach the significant level. It means that there is no significant difference in the pre-experiment test scores of the experimental group and the control group. There is no statistical difference in the artificial intelligence knowledge of the two groups of students, that is, the two groups of students have the same level of awareness of artificial intelligence knowledge. Table 1. Independent samples t test of pre-test for two groups Levene test for equal variance Assume equal variance
Equal mean t-test F
p-value
t
df
p-value
10.674
.002
1.591
58
.117
1.591
49.081
.117
Do not assume equal variance
4.2 Paired Sample t Test of Academic Performance for Two Groups This research uses artificial intelligence cross-domain application practice learning platform teaching courses, as well as traditional artificial intelligence teaching course learning achievement pre-test and post-test to compare the differences in the learning effectiveness of students through different course teaching methods. Therefore, after the learning activity, the pre-test scores and post-test scores were taken to make paired sample tests. 4.2.1 Paired Sample t Test of Experimental Group It can be seen from Table 2 that the experimental group paired sample t value = − 6.830, p = .000 < 0.050, reaching a significant level of 0.050. Therefore, it can be seen that the pre-test and post-test results are significantly different. After the students in the experimental group used the platform developed by this study to teach on the crossdomain application of artificial intelligence learning platform, their post-test scores (M = 95.87) were significantly better than the pre-test scores (M = 86.53), showing the traditional teaching methods also significantly help academic performance. Table 2. Paired sample t test of experimental group Variable name M Pre-test score
N SD
t
p-value
86.53 30 7.62 −6.278 .000
Post-test score 95.43 30 3.39
348
A. Y. H. Liao
4.2.2 Paired Sample t Test of Control Group It is found from Table 3 that the paired sample t value of the control group = −16.919, p = .000 < 0.050, reaching a significant level of 0.050. Therefore, after the students in the control group used traditional teaching methods to learn on the cross-domain application of artificial intelligence learning platform, their post-test scores (M = 84.16) were significantly better than the pre-test scores (M = 70.36), showing the traditional teaching methods It is also a significant help for academic performance. Table 3. Paired sample t test of control group Variable name M
N SD
t
p-value
82.40 30 12.01 −4.622 .000
Pre-test score
Post-test score 91.47 30
5.70
4.2.3 Independent Sample t Test of Post-test Scores for Two Groups After the teaching course, the independent sample t test of post-test scores is implemented, and the Levene test of the homogeneity of the variance is performed. Table 4 shows the results of the independent sample t test of the two groups of students before the experiment. After the F value test result of the Levene test method, F = 6.107, significance (p) = .016 < .05, which reaches the significant level. Therefore, the two sets of variances can be regarded as equal, so look at the t value in the column of “assuming equal variances”, t = 3.274, significance (p) = .002 < .05, reaching a significant level, as shown in Table 4. It means that there is a significant difference in the post-experiment test results of the experimental group and the control group, which means that the two groups of students have significant differences in the results of artificial intelligence knowledge learning through different learning methods. Table 4. Independent sample t test of post-test scores for two groups Levene test for equal variance Assume equal variance Do not assume equal variance
Equal mean t-test F
p-value
t
df
p-value
6.107
.016
3.274
58
.002
3.274
4.219
.002
4.3 Comparison of Learning Attitudes Between Pre-test and Post-test This study conducted pre- and post-tests for the experimental group and the control group to compare the differences in learning attitudes of students learning artificial intelligence
An APP-Based E-Learning Platform for Artificial Intelligence
349
learning courses through different teaching methods. In the aspect of attitude pre-test, the Levene test F value of the homogeneity test of variance = .007 (p = .936, p > 0.050), as shown in Table 5, does not reach a significant difference, so it is assumed that the variance is equal, and its independent samples The t test value was 1.466 (p = .148, p > 0.050), which did not reach a significant difference, indicating that the students in the experimental group and the control group had the same learning attitude at the starting point. After the students in the experimental group and the control group learn through two different teaching methods, the post-test results of the students’ learning attitudes, the paired sample t test shows that the experimental group is better than the control group students, as shown in Table 6. Table 5. The Levene test of pre-test differences in learning attitudes between two groups Levene test for equal variance Assume equal variance
Equal mean t-test F
p-value
t
df
p-value
.007
.936
1.466
58
.148
1.466
57.657
.148
Do not assume equal variance
Table 6. The Levene test of the difference in the post-test of learning attitudes Levene test for equal variance Assume equal variance Do not assume equal variance
Equal mean t-test F
p-value
t
df
p-value
7.250
.009
6.368
58
.000
6.368
50.849
.000
4.4 Comparison of Pre-test and Post-test for Learning Achievement This study conducted pre-test and post-test for the experimental group and the control group to compare the differences in the learning achievement of students learning artificial intelligence learning courses through different teaching methods. In the aspect of learning achievement pre-test, the Levene test F value of the homogeneity test of the variance is 1.015 (p = .318, p > 0.050). As shown in Table 7, the significant difference is not reached, so assuming the variance is equal, its independent sample t test value is .226 (p = .822, p > 0.050), which does not reach a significant difference, indicating that the learning achievements of the students in the experimental group and the control group are the same at the starting point. After the students in the experimental group and the control group learn through two different teaching methods, the post-test results of the learning achievement of the students in the experimental group are better than those in the control group by the paired sample t test, as shown in Table 8.
350
A. Y. H. Liao
Table 7. The Levene test of the difference of learning achievement pre-test for two groups Levene test for equal variance Assume equal variance
Equal mean t-test F
p-value
t
df
p-value
1.015
.318
.226
58
.822
.226
56.615
.822
Do not assume equal variance
Table 8. The Levene test of the difference of learning achievement post-test for two groups Levene test for equal variance Assume equal variance
Equal mean t-test F
p-value
t
df
p-value
10.830
002
5.257
58
.000
5.257
48.761
.000
Do not assume equal variance
4.5 Survey of System Usage Satisfaction After completing the teaching course of the cross-domain application of artificial intelligence learning platform, students in the experimental group directly used the platform developed by the institute to fill in the student satisfaction survey questionnaire. There are 25 questions in this questionnaire, which is based on the Likert five-point scale (the highest score is 5 points, the lowest score is 1 point). According to the results of the satisfaction survey questionnaire, it can be seen that students’ satisfaction with this cross-domain application of artificial intelligence learning platform system is quite high, and the percentage of students who choose the answer is very satisfied and the total percentage of the two positive options is as high as 88.3%, as shown in Table 9. Table 9. Survey statistics of system usage satisfaction Answer items
Very satisfied
Satisfied
Neutral
Unsatisfied
Very unsatisfied
Total average percentage
38.3%
50%
11.7%
0%
0%
5 Conclusion This research mainly uses cloud-based digital technology to integrate digital teaching and learning. According to a questionnaire survey conducted after learning on a learning platform with cross-domain applications of artificial intelligence, it shows that students’ satisfaction with the APP platform is as high as 88.3%. The analysis results of the students’ learning factors such as attitude and achievement show that compared to traditional learning, this artificial intelligence cross-domain application practice learning
An APP-Based E-Learning Platform for Artificial Intelligence
351
platform can enable students to learn more effectively in a simulated environment, and enhance students’ interest in learning artificial intelligence knowledge and applications. Students use mobile devices and APP for learning, which improves their learning attitude and achievement. For the learning of artificial intelligence knowledge, compared with traditional teaching, it can be quickly absorbed and learned. If the teaching method of multimedia AR/VR teaching method can be used, the learning of artificial intelligence knowledge can be improved. This learning model also can be used in other subjects. In terms of learning, it can also obtain very good learning results. Acknowledgments. The author of this paper would like to express his gratitude to the Ministry of Science and Technology of the Republic of China for its partial grant support to this research. The grant number is MOST 109-2511-H-468-002.
References 1. Brophy, J.: Socializing students’ motivation to learn. Adv. Mot. Achiev.: Enhanc. Mot. 15, 181–210 (1987) 2. Bybee, R.W.: Teaching Science as Inquiry. American Association for the Advancement of Science, Washington, DC (2000) 3. Colburn: An Inquiry Primer. Science Scope, pp. 42–44 (2000) 4. Gilbert, J., Watts, M.: Concepts, misconceptions and alternative conceptions: changing perspectives in science education. Stud. Sci. Educ. 10, 61–98 (1983). https://doi.org/10.1080/030 57268308559905 5. Chang, C.-H.: Chang’s Dictionary of Psychology, 2nd edn. Donghua Bookstore, Taipei, Taiwan (1991) 6. Lave, J., Wenger, E.: Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, Cambridge (1991) 7. Choi, J., Hannifin, M.: Situated cognition and learning environments: roles, structures, and implications for design. Educ. Technol. Res. Dev. 43(2), 53–69 (1995)
Automatic Control System for Venetian Blind in Home Based on Fuzzy Sugeno Method Hsing-Chung Chen1,2 , Galang Wicaksana1 , Agung Mulyo Widodo1,3 , Andika Wisnujati1,4 , Tzu-Liang Kung1(B) , and Wen-Yen Lin5(B) 1 Department of Computer Science and Information Engineering, Asia University,
Taichung City, Taiwan [email protected] 2 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung City, Taiwan 3 Department of Computer Science, Esa Unggul University, West Jakarta, Indonesia 4 Department of Machine Technology, Universitas Muhammadiyah Yogyakarta, Yogyakarta, Indonesia 5 Department of Information Engineering, National Taichung University of Science and Technology, Taichung City, Taiwan [email protected]
Abstract. Smart home technology is evolving quickly, and numerous smart home devices associated with Artificial Intelligence (AI) have increased the quality of life for occupants. The purpose of making this tool is to design, build, and test the Internet of Things (IoT) using the Universal Board with the ATMega328 microcontroller to measure, record, and display data via a smartphone. The Sugeno fuzzy method is used to find the cryptic value of the system. The system design consists of a series of Universal Board modules with an ATMega328 microcontroller which acts as a controller for automatic drying monitoring, a series of light-dependent resistor sensors, raindrop, and DHT22 sensors, as well as a DC number and a micro switch that functions as the output of all these sensors. The data obtained is displayed on the mobile application. These tools and applications have worked well, it can be seen from the several tests that have been carried out there are no significant differences in system calculations and manual calculations.
1 Introduction The definition of the Internet of Things is not widely agreed (also as IoT). The word IoT, used for the first time by Kevin Ashton in a 1998 lecture, defines the evolving global information service architecture focused on the Internet. About a decade earlier, late Mark Weiser created a seminal view of future technical ubiquity – one in which the growing “availability” of computing capacity will be followed by its diminishing “visibility” [16, 20]. The words “Internet” and “Things” mean an interconnected worldwide network based on sensory, communication, networking, and information processing technologies, which might be the new version of information and communications technology (ICT) [8]. The exponential growth of telecommunications has sparked new concepts © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 352–361, 2022. https://doi.org/10.1007/978-3-030-79728-7_35
Automatic Control System for Venetian Blind
353
that can be developed with the use of internet technology. The Internet of Things is one of the most common ideas. IoT or the Internet of Things is a term that seeks to extend the advantages of constant access to the Internet. The IoT application that is being built is a smart house. In recent years, the growth of smart home technology has led to the transition from a conventional home to a smart, wired Internet. Smart Home is a household filled with devices such as sensors, wired and wireless networks, actuators and intelligent systems [5]. The fields of use for IoT technologies are as vast as they are diverse, as IoT applications increasingly extended to nearly all aspects of daily life. The most important fields of implementation include, for example, the smart industry, where the implementation of intelligent production systems and linked production sites also was discussed under the heading of Industry 4.0. Intelligent thermostats and monitoring devices attract the most coverage in the smart house or construction market, while smarts energy solutions depend on smarts power, gas, and water meters [15, 20]. IoT can be considered as a global network infrastructure composed of numerous connected devices that rely on sensory, communication, networking, and information processing technologies [14, 20]. The smart home is a high-tech home that requires multiple systems and gadgets at home to interact with each other. Smart home systems can be used to monitor nearly all things in the home that can be remotely controlled (remote). Applying a smart home is simple and effective to overcome the many challenges and events at home, taking into account the different activities of any person outside the home [11, 20]. Smart home is a convergence of information technology and computer systems used in homes or buildings occupied by humans, depending on the performance, automation, convenience, protection and savings of home electronic devices. As a result of technical advances, the new supply of the smart home has been well developed with different ideas and systems in mind [4].
2 Related Works In [2] the model of smart IoT was used for irrigation. It was presented how sensor readings can be used for precise irrigation in modern agriculture. IoT infrastructure for temperature monitoring and optimal control in retail refrigerated cabinets is one of the recent propositions [12]. This [1] research proposes a new solution based on multiple objective optimizers (MOO) to set a proper flow rate and water temperature minimizing the used amount of energy by the taps. [20] Note that, there is a large gap in this area when there are multiple users on different taps within a house. The future of the Internet will consist of heterogeneously connected devices that will further extend the borders of the world with physical entities and virtual components. The Internet of Things (IoT) will empower the connected things with new capabilities [6]. They present a modified Takagi–Sugeno method, one of Fuzzy Rule-Based Systems family, applied for prediction of forging dies wear [7]. A technological process, as well as acquired data are briefly discussed. The modified Takagi-Sugeno approach is introduced. Its main advantage, acquiring knowledge from experts instead of datasets, is emphasized [20]. They also defined the deployment status of artificial intelligence in smart home devices and how it is used in our home so that we can grasp how artificial intelligence
354
H.-C. Chen et al.
is used to make smart homes [5]. This research describes the smart home management paradigm for the proposed system and the key tasks to be undertaken at each level. In addition, we address realistic architecture issues with focus on data transmission as well as smart home connectivity protocols and their interoperability [13]. The author [3] proposed to use a ball robot (BR) device control issue, where the BR has the ability to travel omni-directionally. [20] The suggested control system incorporates two fuzzy BR control methods. In this fuzzy control approach, the TS fuzzy paradigm for fuzzy BR modeling was introduced. The principle of parallel distributed compensation (PDC) was used to build a fuzzy control scheme for TS fuzzy models. Fuzzy language can be interpreted as vague, in other words, fuzzy logic is vague logic. Where in fuzzy logic a value can be ‘true’ and ‘false’ simultaneously. The rate of ‘true’ or ‘false’ value in fuzzy logic depends on the membership weights it has [19]. A new regression model is developed by [10] for the calculation of program effort based on the use case point model. In addition, the Sugeno Fuzzy Inference Method (FIS) approach to this model is used to improve estimation. The findings show that an increase of 11% could be made in the MMRE following the Sugeno fuzzy logic strategy [10]. Martin et al. [9] described the application whose effects are compared to those of multiple regression. A subset of 41 modules built from 10 programs is used as results [9]. The result indicates that the value of the MRE (aggregation of magnitude of relative error, MRE) applying fuzzy logic was marginally higher than that of the MRE applying multiple regression, while the value of Pred (20) applying fuzzy logic was slightly higher than that of Pred (20) applying multiple regression. In comparison, six of 41 MRE was equal to zero (without any deviation) when fuzzy logic was implemented (not any similar case was presented when multiple regression was applied) [9].
3 Methods 3.1 Research Approach The methodology used is the planning approach. In Indriarti 1997, Thierauf and Klekamp 1975 suggest that the planning approach can be used to critically identify problems [20]. There are several steps in the planning approach method. Researchers developed a mobile-based automatic monitoring system using Arduino Nano to solve the current problems. It uses a Rain Drop Sensor (RD), Light Dependent Resistor (LDR) and DHT22 Sensor to construct this solution, as well as a mobile application for remote monitoring of venetian blind. The system design, hardware development and programming of mobile applications will be carried out using the Arduino Nano device and the ESP8266 module. After that, the hardware and application will be tested. If the test results are suitable, the hardware and applications will be implemented [20]. 3.2 Research Stages 3.2.1 Data Flow Diagram The data flow diagram illustrates that the input has an analog signal from the lightdependent resistor sensor (LDR), the raindrop sensor (RD) and the DHT22 sensor. The
Automatic Control System for Venetian Blind
355
Arduino Nano will process all of these inputs. After all data has been processed, the data transfer process will be carried with the server via the HTTP Protocol. ON/OFF is the output of the DC motor. The information will be transferred into a WEB API from the server and processed using a mobile application [20]. The data flow diagram is illustrated in Fig. 1 [20]. Analog Signal
LDR Sensor
Analog Signal
HTTP Protocol
Analog Signal
Arduino Nano
RD Sensor
DHT22 Sensor
Server
Analog Signal
WEB API Digital Signal
Android DC Motor
DC Motor
Fig. 1. Data flow diagram
3.2.2 Sensor’s Variables The DHT22 sensor will detect temperature and humidity, the LDR sensor will detect light, and the RD sensor will detect rain. On the Arduino Nano microcontroller, the results from each sensor will be processed and the data processing results will be sent to the server [20]. Each sensor’s variables are as follows. 1. 2. 3. 4.
The values of variable for DHT22 Sensor (Humidity) are shown in Fig. 2. The values of variable for RD Sensor (Rain) are shown in Fig. 3. The values of variable for LDR Sensor (Light) are shown in Fig. 4. The values of variable for DC Motor are shown in Fig. 5.
3.2.3 System Design System design is carried out at this stage to read the value of the RD sensor, LDR sensor, and DHT22 sensor which the Arduino Nano microcontroller and ESP8266 module have already processed [20]. The data is sent to the server and processed to be shown in the mobile application. The system design is shown in Fig. 6.
356
H.-C. Chen et al.
Fig. 2. DHT22 sensor graphics
Fig. 3. RD sensor graphics
3.2.4 User Interface Design Designing user interface to display data from RD sensors, LDR sensors, and DHT22 sensors. Figure 7 illustrates the user interface design [20]. 3.2.5 Prototype Design The venetian blind is made of aluminum with enamel finished flat sheets with the following characteristics: Blind dimensions: 1 m × 1.42 m, Sheet wide: 80 mm, Maximum distance between the sheets: 75 mm, Distance from the glass: 150 mm, Sheet colour: white [17]. It is shown in Fig. 8 for more details. An Arduino Nano with an LDR sensor and an RD sensor is located on the roof. On the wall, 2 DC motors are installed which function to provide the venetian blind with output and there is also a DHT22 sensor to detect humidity and temperature. See Fig. 9 for more details [20].
Automatic Control System for Venetian Blind
357
Fig. 4. LDR sensor graphics
Slow
Fast
Medium
Membership Degrees (x)
1
0.8 0.6 0.4 0.2 0
550
350
750
Fig. 5. DC motor graphics DC Motor
RD Sensor
DHT22 Sensor
Arduino Nano Server
LDR Sensor
Mobile DC Motor
Fig. 6. System design
3.2.6 Database Design [20] Database design to support the process of developing a mobile application. The database design is shown in Fig. 10.
358
H.-C. Chen et al.
Fig. 7. Mobile application interface design
Fig. 8. Venetian blind
Fig. 9. Prototype design
user -id_user: Integer -nama_user: Varchar -password: Integer -email: Varchar +add() +show()
report
data
-id_data: Integer -id_user: Integer +show()
-id_data: Integer -sensor_ldr: Integer -sensor_rd: Integer -sensor_dht22: Integer -posisi_jemuran: Varchar +show()
Fig. 10. Database design
Automatic Control System for Venetian Blind
359
Fig. 11. Sensor data processing
Fig. 12. Serial on the Arduino IDE (LDR Sensor)
Fig. 13. Serial on the Arduino IDE (RD Sensor)
3.2.7 Sensor Data Processing A DHT22 (Temperature and Humidity) sensor, an LDR (Light) sensor, and an RD sensor are the system input on the hardware [20]. Each sensor has analog input data (0–1023). All sensor inputs are processed in the Arduino Nano microcontroller and the ESP8266 module. The data will be transmitted to the DC motor if the sensor data matches the logic that has been made so that the DC motor can move the venetian blind. 3.2.8 Programming Stage At this stage, mobile application programming and all hardware components are carried out [20]. The programming language used to program hardware is the C programming language and the Java programming language for mobile applications. The data sent to the mobile application is WEB API data. If the results of this programming are as
360
H.-C. Chen et al.
expected, the testing process can be performed. To measure the effect of the independent variable on the dependent variable, a t-test or partial test was used to test the system. This t-test is used to ensure that the command moves in accordance with each variable.
4 Results and Discussion 4.1 Testing and Analysis of Light Dependent Resistor and Rain Drop Sensor Both sensors require the Arduino IDE and installing the ESP8266 library [20]. Testing process is done by connecting the sensor pin to the Arduino Nano board pin. Light Dependent Resistor Sensor pin A0 is connected to pin A0 on the Arduino Nano, Rain Drop Sensor pin A0 is connected to pin A1 on the Arduino Nano, pin ground is connected to pin ground, and pin VCC is connected to pin 5 V. If the light-dependent resistor sensor does not have an error, the Serial Monitor display will contain the Arduino Nano data will appear as below. The measurement method to find out the size of the error from testing the lightdependent resistor sensor is using Root Mean Square Error (RMSE) [20]. RMSE is not ambiguous in the results, and it is more appropriate to use [18]. RMSE =
=
==
= 245.38
5 Conclusion From this research, it was concluded that the circuit designed based on the application of the fuzzy logic algorithm can be implemented to control the response of the servo motor DC that acts as an actuator [20]. The experiments included programming, testing, and comparing measurement variables were found to be statistically not significantly different with the reference system. The system that was designed was feasible to be used as an IoT system development and contributed to the development of research towards the next IoT system. As a suggestion for further research, the alternative techniques of artificial e.g. Artificial Neural Networks (ANN), Genetic Algorithms and Gray Systems can be explored to increase system accuracy and reliability [20]. Acknowledgements. This work was also supported in part by the Ministry of Science and Technology, Taiwan, under Grant both No. MOST 109-2221-E-468-009-MY2 and No. MOST 1102218-E-468-001-MBK. This work was also supported in part by Ministry of Education under Grant No. I109MD040. This work was also supported in part by Asia University Hospital under Grant No. 10951020.
References 1. Farid, A.M., Sharifi, J., Mouhoub, M., Barakati, S.M., Egerton, S.: Multiple objective optimizers for saving water and energy in smart house. In: IEEE International Conference on Systems, Man and Cybernetics (SMC) (2019) 2. García, L., Parra, L., Jimenez, J.M., Lloret, J., Lorenz, P.: IoT-based smart irrigation systems: an overview on the recent trends on sensors and IoT systems for irrigation in precision agriculture. Sensors 4(20), 1042 (2019) 3. Chiu, C.-H., Peng, Y.-F.: Design of Takagi-Sugeno fuzzy control scheme for real world system control. Sustainability 11(14), 3855 (2019) 4. Aditya, F.G., Hafidudin, H., Permana, A.G.: Analisis Dan perancangan prototype smart home dengan sistem client server berbasis platform android melalui komunikasi wireless. eProc. Eng. 2(2) (2015) 5. Guo, X., Shen, Z., Zhang, Y., Wu, T.: Review on the application of artificial intelligence in smart homes. Smart Cities 2(3), 402–420 (2019) 6. Li, S., Da Xu, L., Zhao, S.: The internet of things: a survey. Inf. Syst. Front. 2(17), 243–259 (2015) 7. Macioł, A., Macioł, P., Mrzygłód, B.: Prediction of forging dies wear with the modified Takagi-Sugeno fuzzy identification method. Mater. Manuf. Processes 6(35), 700–713 (2020) 8. Marry, W.: Disruptive civil technologies six technologies with potential impacts on us interests out to 2025 (2013) 9. Martin, C.L., Pasquier, J.L., Yanez, C.M., Tornes, A.: Software development effort estimation using fuzzy logic: a case study. In: Sixth Mexican International Conference on Computer Science (ENC’05) (2005) 10. Nassif, A.B., Capretz, L.F., Ho, D.: Estimating software effort based on use case point model using sugeno fuzzy inference system. In: IEEE 23rd International Conference on Tools with Artificial Intelligence (2011) 11. Putri, D.R., Perdana, D.P., Bisono, Y.G.: Design and performance analysis of smart roof clothesline system based on microcontroller by smartphone application. TEKTRIKA-Jurnal Penelitian dan Pengembangan Telekomunikasi, Kendali, Komputer, Elektrik, dan Elektronika 1(2) (2017) 12. Ramírez-Faz, J., Fernández-Ahumada, L.M., Fernández-Ahumada, E., López-Luque, R.: Monitoring of temperature in retail refrigerated cabinets applying IoT over open-source hardware and software. Sensors 3(20), 846 (2020) 13. Stojkoska, B.L.R., Trivodaliev, K.V.: A review of Internet of Things for smart home: challenges and solutions. J. Clean. Prod. 140, 1454–1464 (2017) 14. Tan, L., Wang, N.: Future internet: the internet of things. In: 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE) (2010) 15. Wortmann, F., Flüchter, K.: Internet of things. Bus. Inf. Syst. Eng. 3(57), 221–224 (2015) 16. Wu, M., Lu, T.-J., Ling, F.-Y., Sun, J., Du, H.-Y.: Research on the architecture of Internet of Things. In: 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE) (2010) 17. Carletti, C., et al.: Thermal and lighting effects of an external venetian blind: experimental analysis in a full scale test room. Build. Environ. 106, 45–56 (2016) 18. Chai, T., Draxler, R.R.: Root mean square error (RMSE) or mean absolute error (MAE)? – arguments against avoiding RMSE in the literature. Geosci. Model Dev. 7, 1247–1250 (2014) 19. Trillas, E., Eciolaza, L.: Fuzzy Logic, vol. 10, pp. 978. Springer International Publishing (2015). https://doi.org/10.1007/978-3-319-14203-6 20. Chen, H.C., Wicaksana, G., Widodo, A.M., Wisnujati, A.: The Internet of Things (IoT) for automatic control mobile based on fuzzy Sugeno method. Future ICT 2021. 
Taichung, Taiwan, 01–05 Feb 2021
Combining Pipeline Quality with Automation to CI/CD Process for Improving Quality and Efficiency of Software Maintenance Sen-Tarng Lai1(B) , Heru Susanto2,3,4 , and Fang-Yie Leu5 1 Department of Information Technology and Management, Shih Chien University,
Taipei 10462, Taiwan [email protected] 2 School of Business, University Technology of Brunei, Bandar Seri Begawan BE 1410, Brunei Darussalam [email protected] 3 Department of Information Management, Tunghai University, Taichung 40704, Taiwan 4 Research Center for Informatics, The Indonesia Institute of Sciences, Cibinong 16912, Indonesia 5 Department of Computer Science, Tunghai University, Taichung 40704, Taiwan [email protected]
Abstract. In Taiwan, regular software maintenance is an essential procedure in commercial banks. However, during system maintenance the banking system must be suspended and its related services stopped. The maintenance period usually causes many inconveniences for banking customers and can even affect the transaction efficiency and competitive advantage of the enterprise. Improving the efficiency and quality of software system maintenance is therefore critical to increasing banking service quality and reducing customer loss. In this paper, we redefine the software maintenance workflow, apply automation tools to construct an automated CI/CD workflow, and propose the CPQM model. The CPQM model combines the pipeline quality factors and automation tool factors of the CI/CD process to identify process defects and assist in improving process quality and efficiency. A CI/CD process with high quality and a high degree of automation can concretely enhance the maintenance efficiency and quality of a software system.
1 Introduction
In the internet and information age, an information system is a weapon for all walks of life to improve market competitiveness. However, information systems are impacted by technological evolution and continuous changes in the environment. In order to satisfy the needs of multi-faceted users, conform to the values of the enterprise, and enhance the market competitiveness of enterprises and organizations, an information system must be able to be continuously adjusted, modified, and expanded. Taking online banking as an example, it provides a system that is not limited by time, area, or the number
of service personnel. As long as the network is available, banking business can be processed anytime and anywhere, greatly improving service quality and transaction convenience. However, in addition to routine maintenance operations, online banking sometimes requires unexpected maintenance tasks. During the maintenance period, no service can be provided, which causes great inconvenience to users who urgently need online banking services. Under fierce market competition, information systems can improve the service quality and market competitiveness of enterprises and organizations. At the same time, information systems face the challenge of continuous maintenance. In order to advance with the times, enterprises and organizations must continually launch new service items and promotion activities to improve their service advantages and market value. For this reason, improving the efficiency of software maintenance and reducing its impact on users is a challenge that software maintenance must overcome. DevOps is an abbreviation of the two terms Development and Operations. It is a culture, a form of interaction, and a set of operational habits that value communication and cooperation between "Developer" and "Operator" [1]. DevOps promotes close cooperation between traditional developers and maintenance personnel, effectively increasing customers' competitive advantage in a fierce market. DevOps uses agile development methods [2] to cope with changing needs, advocates collaboration between development and operations, and is committed to improving software maintenance efficiency and deployment speed. Four critical items of DevOps are people-oriented collaboration, Test Driven Development (TDD) [3], Continuous Integration (CI), and Continuous Delivery (CD). The automated CI/CD workflow has become the core of the DevOps environment, and the quality of the CI/CD process is the main factor that affects the efficiency and success of DevOps. This paper analyzes the quality and automation mechanisms of the CI/CD workflow and proposes a CI/CD process quality measurement model (Continuous Process Quality Measurement, CPQM). Based on the CPQM model, a CI/CD Process Quality Improvement (CPQI) procedure is designed to concretely improve the efficiency of software maintenance and the quality of continuously delivered products. Section 2 discusses the necessity and importance of software maintenance. Section 3 describes the pipeline workflow, key qualities, and automation mechanisms that a CI/CD process should possess. Section 4 proposes a CI/CD process quality measurement model based on a multi-layer linear combination model. Based on the CPQM model, Sect. 5 designs a CI/CD process quality improvement procedure. Section 6 emphasizes again the importance and influence of software maintenance and describes the contributions of this paper.
2 Importance of Software Maintenance
2.1 The Values of Software Maintenance
Software maintenance cost is as high as 70% of software development cost, and for some systems even exceeds 80% [4]. The efficiency and quality of software maintenance affect the service quality of the information system, and therefore the normal
operation of the enterprise and organization. An information system in operation must have high maintainability to overcome the challenges of a changing environment and fast-evolving technology [5]. Online banking provides a set of online banking transaction systems that are not restricted by time or region. Customers do not have to queue at a physical bank and wait for personnel to process their banking business, which greatly improves service quality and transaction convenience. However, software maintenance forces online banking to suspend all services, which causes great inconvenience to users who urgently need them. For this reason, improving the efficiency of maintenance operations and reducing their impact on users is a challenge that the maintenance of online banking must overcome. Information systems of a similar nature face the same maintenance problem: unless an information system has been retired and is no longer in use, maintenance operations are inevitable. Maintenance operations therefore need three characteristics: high efficiency, high quality, and short outage time. Software maintenance with high efficiency improves the availability and convenience of information systems and reduces the impact on users. Software maintenance with high quality increases the trust and acceptance of information systems and improves the reputation of the enterprise and organization. Continuous software maintenance is a necessary task for enhancing the competitiveness of enterprises and organizations, and a continuous process with pipeline quality and automation can concretely improve its quality and efficiency.
2.2 Improvement Methods for Software Maintenance
Reducing software maintenance costs and improving software maintenance efficiency and quality have always been goals of software developers and maintainers. Building key maintenance qualities into the products during the development process can concretely improve software maintainability. In addition, combining multiple maintenance requests can reduce maintenance costs and maintenance frequency [4]. Applying automated testing and integration tools can improve maintenance efficiency and quality, and importing a CI/CD pipeline with automatic features can shorten maintenance downtime and reduce the impact on users [6]. There are three methods for improving software maintenance operations, with the following advantages (shown in Table 1): (1) summarize several maintenance requests and complete them at one time, to reduce the cost of software maintenance and the frequency of system pauses; (2) apply automated testing tools to reduce personnel involvement and improve maintenance efficiency and quality; (3) import a pipelined and automated continuous process to effectively improve maintenance efficiency and quality and reduce the impact on users.
Table 1. Three improvement methods of software maintenance and their benefits

| Benefits | Summarize many requests at one time | Apply automation tools | Apply CI/CD pipeline workflow and automatic features |
|---|---|---|---|
| Reduce maintenance cost | VV | V | V |
| Reduce paused frequency | VV | V | VVV |
| Increase efficiency | V | VVV | VVV |
| Reduce the use impact | – | VV | VVVV |
| Increase maintenance quality | – | VV | VVV |

** Effect: –: none, V: low, VV: middle, VVV: high, VVVV: very high
3 CI/CD Pipeline Workflow and Pipeline Automation
3.1 CI/CD Pipeline Workflow
The CI/CD pipeline is an automated tool chain that improves the operating efficiency and process quality of the DevOps environment [7, 8]. The precondition for pipeline automation is to establish a standard format for the input and output interface of each phase, so that manual involvement can be reduced. In the CI/CD process, reducing personnel participation decreases the error rate and improves process efficiency and product quality [7]. In addition, a monitoring mechanism is required to confirm that the products completed at each phase are correct, complete, and consistent; it must quickly identify abnormal or error conditions in a phase operation, immediately determine the causes, and take appropriate corrective measures to ensure that the process can operate normally. To achieve the rapid integration and deployment capabilities of the CI/CD pipeline, it should have the phase operations Build, Unit Testing, Deploy, Automated Testing, and Deploy to Production (shown in Fig. 1); in addition to automation capabilities, these phases must be highly integrated. The tasks of each phase are described as follows: (1) Build phase: the build phase is triggered when added or modified code is submitted to the program resource library. The source code is stored in a branch of the resource library, so the compiler collects all the functions of the code and their dependencies and then compiles the code into a new version. (2) Unit testing phase: the testing job includes multiple testing levels, of which unit testing is the most critical step. The unit test checks the newly added or modified program units against the program specifications of the source code.
(3) Deployment phase: after the build passes the tests, it enters the deployment phase, and the initially built product is put onto the test server. At this phase, developers can conduct deployment testing in a simulated environment equivalent to the delivered product in order to check its various functions. (4) Automated test phase: acceptance testing is performed in the automated testing phase, and the completed functions are tested before deployment. In this phase, automatic and continuous testing exercises the built functions and ensures that no errors remain in the software system. (5) Production deployment phase: once the source code or product passes all tests, it enters the production server in the final phase. The continuous feedback loop turns the CI/CD pipeline into a closed process: the build content is continuously submitted, tested, and deployed to the operating environment. In the CI/CD process, the monitoring work of each phase inspects whether that phase completes its task. When an abnormality or error is found, a message is immediately sent to the development team as feedback, so that the problem can be confirmed and a solution proposed. After the abnormal or error event is confirmed and repaired, the work returns to the CI/CD pipeline process again (shown in Fig. 1).
Fig. 1. CI/CD pipeline process
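To make the closed-loop behavior concrete, the following is a minimal Python sketch of the five-phase pipeline of Fig. 1 with its feedback mechanism. The phase callables are hypothetical stubs standing in for real build, test, and deployment steps; they are not part of the original system.

```python
from typing import Callable, List, Tuple

Phase = Tuple[str, Callable[[], bool]]

def make_pipeline() -> List[Phase]:
    """The five phases of Fig. 1; each callable returns True on success.
    The lambdas are hypothetical stubs for real build/test/deploy steps."""
    return [
        ("build", lambda: True),
        ("unit_testing", lambda: True),
        ("deploy_to_test", lambda: True),
        ("automated_test", lambda: True),
        ("deploy_to_production", lambda: True),
    ]

def run_pipeline(pipeline: List[Phase], notify) -> bool:
    """Run the phases in order; on failure, send feedback and stop,
    so the repaired commit re-enters the pipeline (the closed loop)."""
    for name, phase in pipeline:
        if not phase():
            notify(f"phase '{name}' failed; feedback sent to the development team")
            return False
    return True

if __name__ == "__main__":
    ok = run_pipeline(make_pipeline(), notify=print)
    print("deployed to production" if ok else "returned to development")
```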
3.2 CI/CD Pipeline Automation
The pipeline workflow must adopt continuous and iterative phased operations. The tasks and activities of each phase should therefore be clearly defined, and appropriate tools must be introduced to assist automated operation [9]. Finally, the correctness, completeness, and consistency of the integration between the phases must be ensured. In order to obtain the advantages of pipeline operation, the CI/CD process must have the following processing capabilities: (1) in view of the maintenance requirements, the affected functional modules and individual components can be identified in a timely manner.
(2) According to the CM system and the cross-reference relationships, the order of module or component revision can be drawn up. (3) Appropriate testing tools are imported to assist automated unit testing and integration testing [9]. (4) The configuration management and version control systems cooperate to improve the capability of automated deployment and delivery. (5) In order to increase the efficiency and quality of pipeline operation, personnel inspection is replaced with an automated monitoring mechanism, which speeds up monitoring and reduces misjudgments. The CI/CD process must be able to automatically inspect and identify phased exceptions or defects, and ideally it also has automatic correction or improvement measures.
4 CI/CD Process Quality Measurement
4.1 Pipeline Workflow and Automation Quality Factors of CI/CD
CI/CD process quality combines two critical items, pipeline workflow quality and pipeline automation quality (shown in Fig. 2). (1) Pipeline Workflow Quality (PWQ) for the CI/CD process considers the Modularization/Object-oriented Design Quality (MODQ) and the CM-based Cross-Reference relationship Quality (CMCRQ). (2) Pipeline Automation Quality (PAQ) for the CI/CD process considers the Automation Testing Tool Quality (ATTQ), the Automation Deployment Quality (ADQ), and the Automation Inspection Quality (AIQ).
Fig. 2. Multi-layer continuous process quality architecture
Checklists and tool-based detection are used to collect the important quality factors that measure the critical quality characteristics:
(1) MOD quality: checklists for detailed design inspection collect the quality factors. High modularity/OOD quality requires three key factors: high cohesion, low coupling, and low complexity. (2) CMCR quality: checklists for phase documentation review collect the cross-reference quality factors. High cross-reference quality requires three key factors: requirement-design cross-references, design-implementation cross-references, and traceability. (3) ATT quality: automated testing tools increase the efficiency and quality of test activities. To reach pipeline automation, the CI/CD process should have tools for unit testing, integration testing, and interface checking [9]; basic test tools should offer availability, consistency, and integrability. (4) AD quality: software system delivery and deployment consume considerable resources. Complete CM and VC procedures, tools, and systems, together with delivery and deployment assistant tools, increase the efficiency and correctness of software system delivery and deployment. (5) AI quality: the CI/CD process has pipeline and automation mechanisms, so the process workflow needs normal, correct, and complete operation. Data collection, data analysis, and exception detection help identify exception and error situations.
4.2 Multi-layer Quality Measurement Model
Based on the linear combination model [10], the quality factors that affect the quality of CI/CD operations are combined into key quality characteristics to quantify and evaluate the operational quality of the CI/CD pipeline. The quality measurement model for the CI/CD process is a multi-layer quality measurement model that combines the two major measures PWQ and PAQ. The detailed combination formulas of the measurement model are described as follows:
(1) Pipeline workflow quality for the CI/CD process has two quality items:
• Modularization/Object-oriented Design Quality (MODQ) requires high cohesion, low coupling, and low complexity (logic complexity and data structure complexity) in the detailed design activities. The MODQ measurement combines the module cohesion, coupling, and complexity metrics, as shown in Eq. (1):

MODQM = W1 ∗ MChM + W2 ∗ MCuM + W3 ∗ MCxM, W1 + W2 + W3 = 1 (1)

where MODQM is the MOD quality measurement, MChM the module cohesion metric, MCuM the module coupling metric, MCxM the module complexity metric, and Wi the weight of each metric.
• CM-based Cross-Relationship Quality (CRQ) requires requirement-design cross-relationships, design-implementation cross-relationships, and document traceability in the software development activities. The CRQ measurement combines these three metrics, as shown in Eq. (2):

CRQM = W1 ∗ RDCRM + W2 ∗ DICRM + W3 ∗ DTM, W1 + W2 + W3 = 1 (2)

where CRQM is the CR quality measurement, RDCRM the requirement-design cross-relationship metric, DICRM the design-implementation cross-relationship metric, DTM the documents traceability metric, and Wi the weight of each metric.
• The Pipeline Workflow Quality Measurement (PWQM) combines MODQM and CRQM.
(2) Pipeline automation quality for the CI/CD process has three quality items:
• Automation Testing Tool Quality (ATTQ) requires unit testing, integration testing, and interface checking tools; CASE tool evaluation can identify the quality factors of automated testing tools. The ATTQ measurement combines these three metrics, as shown in Eq. (3):

ATTQM = W1 ∗ UTM + W2 ∗ ITM + W3 ∗ ICM, W1 + W2 + W3 = 1 (3)

where ATTQM is the ATT quality measurement, UTM the unit testing tool metric, ITM the integration testing tool metric, ICM the interface checking tool metric, and Wi the weight of each metric.
• Automation Deployment Quality (ADQ) requires a version control system, a configuration management system, and delivery/deployment assistant tools in the software development activities. The ADQ measurement combines the quality metrics of the version control system, the CM system and tools, and the delivery and deployment tools, as shown in Eq. (4):

ADQM = W1 ∗ VCQM + W2 ∗ CMQM + W3 ∗ DDQM, W1 + W2 + W3 = 1 (4)

where ADQM is the AD quality measurement, VCQM the version control quality metric, CMQM the configuration management quality metric, DDQM the delivery and deployment metric, and Wi the weight of each metric.
• Automation Inspection Quality (AIQ) requires tools for data collection, data analysis, and exception detection in the continuous process monitoring activities. The AIQ measurement combines these three metrics, as shown in Eq. (5):

AIQM = W1 ∗ DCM + W2 ∗ DAM + W3 ∗ EDM, W1 + W2 + W3 = 1 (5)

where AIQM is the AI quality measurement, DCM the data logging and collection metric, DAM the data analysis metric, EDM the exception detection metric, and Wi the weight of each metric.
• The Pipeline Automation Quality Measurement (PAQM) combines ATTQM, ADQM, and AIQM.
(3) The Continuous Process Quality Measurement (CPQM) combines PWQM and PAQM, as shown in Eq. (6):

CPQM = Wpw ∗ PWQM + Wpa ∗ PAQM, Wpw + Wpa = 1 (6)

where PWQM is the pipeline workflow quality measurement with weight Wpw, and PAQM the pipeline automation quality measurement with weight Wpa.
The linear combination model is divided into three layers. The first layer combines basic quality factors into quality measures. The second layer combines these quality measures into the two higher-level measurements PWQM and PAQM. The third layer combines the two higher-level measures into a quality indicator of the CI/CD process. This paper calls this quality measurement the CI/CD Process Quality Measurement (CPQM) model; its architecture is shown in Fig. 2.
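As an illustration of the three-layer combination, the following is a minimal Python sketch of Eqs. (1)-(6). All metric values and weights are illustrative assumptions (the paper does not fix them), and equal combination weights are assumed for the second and third layers.

```python
def combine(metrics, weights):
    """Weighted linear combination; the weights must sum to 1 (Eqs. (1)-(6))."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * m for w, m in zip(weights, metrics))

# Layer 1: quality factors -> quality measures (all values illustrative, 0..1).
modqm = combine([0.8, 0.7, 0.9], [1/3, 1/3, 1/3])  # Eq. (1): MChM, MCuM, MCxM
crqm  = combine([0.6, 0.7, 0.8], [1/3, 1/3, 1/3])  # Eq. (2): RDCRM, DICRM, DTM
attqm = combine([0.9, 0.8, 0.7], [1/3, 1/3, 1/3])  # Eq. (3): UTM, ITM, ICM
adqm  = combine([0.8, 0.9, 0.6], [1/3, 1/3, 1/3])  # Eq. (4): VCQM, CMQM, DDQM
aiqm  = combine([0.7, 0.6, 0.8], [1/3, 1/3, 1/3])  # Eq. (5): DCM, DAM, EDM

# Layer 2: measures -> PWQM and PAQM (equal combination weights assumed).
pwqm = combine([modqm, crqm], [0.5, 0.5])
paqm = combine([attqm, adqm, aiqm], [1/3, 1/3, 1/3])

# Layer 3: Eq. (6): CPQM = Wpw * PWQM + Wpa * PAQM.
cpqm = combine([pwqm, paqm], [0.5, 0.5])
print(f"PWQM={pwqm:.3f}  PAQM={paqm:.3f}  CPQM={cpqm:.3f}")
```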
5 CI/CD Process Quality Improvement Procedure
Effectively monitoring the CI/CD workflow can identify problems and defects in CI/CD operation in time, so that specific improvements to these problems and defects can be proposed. The CPQM model proposed in Sect. 4 uses quantitative values to assess the quality of the CI/CD workflow and can identify problems and defects in CI/CD activities through its measurement formulas. The CPQM model applies a multi-layer linear combination model; the CI/CD process quality measurement is composed of the key quality characteristics of CI/CD activities. The business processes that guide the various workflows must be continuously adjusted and improved in line with changes in the environment and requirements, in order to meet the operational requirements of enterprises and organizations. The purpose of the quantitative quality measurement (CPQM) is to identify quality defects of the CI/CD process, assist in discovering poor-quality process items, and then propose improvement measures, achieving continuous improvement of the CI/CD process. Therefore, it is first necessary to specify a threshold for the
continuous quality improvement of the CI/CD process. When the quality measurement is not within the range of the expected target threshold, the quality of the CI/CD process is in a poor state, and a poor-quality process will affect the efficiency and effectiveness of software maintenance. The improvement procedure should therefore use the multi-layer combination formulas of the CPQM model to identify poor quality metrics, quality factors, and the corresponding quality activity items in a rule-based manner; this helps in understanding the defects or problems of the quality activities and in proposing improvement measures. The steps of the CI/CD Process Quality Improvement (CPQI) procedure, sketched in code below, are described as follows:
Step 1. Set or adjust the quality threshold of CPQM.
Step 2. Evaluate CPQM with the CPQM model and inspect its quality indicator. If the quality indicator meets the threshold, the procedure ends; otherwise, continue to the next step.
Step 3. If PWQM shows a low quality indicator, enter Step 4 to inspect the quality indicators of PWQM; otherwise, enter Step 5 to inspect the quality indicators of PAQM.
Step 4. Based on the quality metrics and quality factors of Formulas (1) and (2), identify the poor quality items and the corresponding quality activities. A senior software engineer assists in identifying the poor quality activities and proposing improvements.
Step 5. Based on the quality metrics and quality factors of Formulas (3), (4), and (5), identify the poor quality items and the corresponding quality activities. A senior software engineer assists in identifying the poor quality activities and proposing improvements.
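A minimal Python sketch of the CPQI steps follows, assuming each layer-2 measure comes with its underlying metric values; the metric names and numbers used in the example call are hypothetical.

```python
def cpqi(threshold, pwqm, paqm, pw_metrics, pa_metrics, wpw=0.5, wpa=0.5):
    """Rule-based sketch of the CPQI steps; the metric dicts map name -> value."""
    cpqm = wpw * pwqm + wpa * paqm                 # Step 2: evaluate CPQM
    if cpqm >= threshold:                          # Step 2: threshold met -> end
        return "quality indicator meets the threshold"
    if pwqm <= paqm:                               # Steps 3-4: inspect PWQM side
        worst = min(pw_metrics, key=pw_metrics.get)
        return f"low PWQM: review the activity behind '{worst}' (Eqs. (1)-(2))"
    worst = min(pa_metrics, key=pa_metrics.get)    # Steps 3, 5: inspect PAQM side
    return f"low PAQM: review the activity behind '{worst}' (Eqs. (3)-(5))"

print(cpqi(0.8, pwqm=0.70, paqm=0.78,
           pw_metrics={"MODQM": 0.75, "CRQM": 0.65},
           pa_metrics={"ATTQM": 0.80, "ADQM": 0.78, "AIQM": 0.76}))
```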
6 Conclusions
Software maintenance is a necessary activity to extend the life cycle of an information system; it continues until the information system is retired. However, software maintenance must temporarily stop some or all services, which causes great impact and inconvenience to users who need the system. It may even affect the customer's competitive advantage in the market and negatively impact the service quality of the information system. For this reason, information system maintenance should be as short and as infrequent as possible, to improve the efficiency and quality of services and reduce customer complaints. This paper applies the CI/CD continuous process, which combines pipeline and automated quality features, to remedy the deficiencies of traditional software maintenance, and proposes the CPQM model to measure the operating quality of the continuous process. The measured values of the CPQM model then drive the continuous quality improvement procedure, which effectively improves the efficiency and quality of software maintenance and concretely reduces the impact on users during software maintenance. The advantages and contributions of adopting a high-quality, pipelined, and automated CI/CD continuous process are described as follows:
• The multi-layered measurement model can effectively identify the defects and problems of quality items and assists in developing improvement measures.
• The CPQM model and CPQI procedure can be applied to improve CI/CD process quality.
• Improving the pipeline operation quality of the CI/CD continuous process can reduce the service impact during the software maintenance process.
• Enhancing the automation quality of the CI/CD continuous process can increase the efficiency and quality of the software maintenance process.
• A high-quality CI/CD continuous process can improve the efficiency and quality of software maintenance and reduce the impact on users.
References
1. Loukides, M.: What is DevOps?. O'Reilly Media, Inc. (2012)
2. Szalvay, V.: An introduction to agile software development. Danube Technol. 3 (2004)
3. Larman, C., Basili, V.R.: Iterative and incremental development: a brief history. IEEE Comput. 36(6), 47–56 (2003)
4. Schach, S.R.: Object-Oriented and Classical Software Engineering, 8th edn. McGraw-Hill, New York (2011)
5. Erich, F., Amrit, C., Daneva, M.: Cooperation between information system development and operations: a literature review. In: Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, p. 1 (2014)
6. Wikström, A.: Benefits and Challenges of Continuous Integration and Delivery: A Case Study (2019)
7. Purohit, K.: Executing DevOps & CI/CD, reduce in manual dependency. IJSDR 5(6), 511–515 (2020)
8. Jackson, L.: The Complete ASP.NET Core 3 API Tutorial: Hands-On Building, Testing, and Deploying. Apress, Berkeley, CA (2020). https://doi.org/10.1007/978-1-4842-6255-9
9. Wolf, J., Yoon, S.: Automated testing for continuous delivery pipelines. Industrial talk, Pacific NW Software Quality Conference (2016)
10. Fenton, N.E.: Software Metrics - A Rigorous Approach. Chapman & Hall (1991)
Efficient Adaptive Resource Management for Spot Workers in Cloud Computing Environment Lung-Pin Chen, Fang-Yie Leu(B) , Hsin-Ta Chiao, and Hung-Jr Shiu Department of Computer Science, Tunghai University, Taichung, Taiwan {lbchen,leufy,josephchiao,hjshiu}@thu.edu.tw
Abstract. Due to the flexible scheduling requirements of various applications, a cloud platform usually has some temporarily unleased workers. To be cost-effective, this considerable amount of idle workers can be collected to perform malleable tasks. These workers, called spot workers, often cannot work stably, since they may be interrupted by an external scheduler. This paper proposes a resource management approach that employs replication to increase resource availability. Raising the replication factor increases the reliability with which a worker group can finish a job, but it also decreases resource-usage efficiency due to computational redundancy. We develop an algorithm that controls the replication factor to adapt to the changing workload while maintaining system performance.
1 Introduction
Today, many organizations and institutes rely on cloud computing to conveniently deploy their applications, which run in a monitored environment. In a data center, cloud platforms often allow users to use resources in an on-demand manner to accommodate increasing requirements. A cloud usually provides reserved, on-demand, or other flexible lease modes under different pricing models, such as the common pay-as-per-usage model. The reserved mode, with higher rent charges, lets users use dedicated computing resources over a contracted period of time. Alternatively, the on-demand mode dynamically allocates resources according to user requirements. In a cloud platform, dynamically allocating multiple hardware resources to serve various applications is considered a challenge. Sometimes the scheduler tries to collect CPU cores across multiple hosts for executing a job that requests a great number of cores; in this situation, the cores are reserved until the job can start, thereby forming inter-node fragments. A cloud service provider usually estimates and prepares a sufficient capacity of resources in order to meet surge demands. This can lead to excess capacity of idle resources during off-peak hours. To fully utilize these resources, the cloud service provider offers these idle machines under a discounted short-term contract; these idle machines are called spot workers in this paper. When new on-demand requests arrive at the cloud, the resource broker may interrupt the applications hosted on some spot workers and transition them into normal workers. Additionally, idle resources can also exist in leased machines. For example, in the machines
with a monthly-based lease mode, the user is not motivated to fully utilize the rented resources due to the fixed rent cost. These idle resources can also be useful. Spot workers are temporarily unused cloud hosts in a cloud platform. To be cost-effective, this considerable amount of idle workers can be collected and rented at low cost. Spot workers serve the hosted applications unstably, since they may be abruptly interrupted and re-scheduled by the main system scheduler. Computing systems built on top of unreliable or unstable workers typically employ replicas to guarantee computation quality [1–3]. By assigning multiple copies of a task to different workers, the success rate of the execution can be increased through group strength, based on probability. Determining the replication factor is crucial for such an opportunistic computing system [1, 4]. The trade-off is that raising the replication factor increases the reliability but decreases the throughput due to computational redundancy, and vice versa. To find the best replication factor, many brokers adopt a trial-and-error approach which makes varied attempts at adjusting the factor until success. In this study, we develop a systematic approach for determining replication factors efficiently and an algorithm that controls the replication factor adaptively to the changing workload, therefore maintaining the system performance. The rest of this paper is organized as follows. Section 2 presents the system architecture. Section 3 discusses the policies of organizing workers into reliable groups for executing tasks. Section 4 discusses the approach that controls the replication factor based on a balancing technique. The implementation and experimental results are presented in Sect. 5. Section 6 concludes this study and addresses our future work.
2 Background
This section discusses the reliability of computing resources in a cloud computing platform, focusing on computing machines. Resource allocation for other types of resources, such as communication bandwidth and data storage, is beyond the scope of this paper.
2.1 Normal Workers and Spot Workers
A cloud computing environment contains workers with various hardware and software of different capabilities and versions. Fully utilizing the computing resources in a cloud computing platform is considered a challenge due to the complex scheduling criteria of various applications. In this paper, spot workers refer to the temporarily unused machines in the concerned cloud. A spot worker can be leased or unleased. For example, under the monthly-based lease mode, the user is not motivated to fully utilize the rented resources owing to the fixed lease cost; in this case, a rented machine can be idle for several hours or even longer. To be cost-effective, the considerable amount of idle resources collected on spot workers can be used to perform malleable tasks, such as high-scale web applications. The applications using the idle resources on spot workers are called spot applications. The system architecture for collecting idle resources and managing the pools of spot workers and non-spot workers is illustrated in Fig. 1. There is a resource manager that
controls the pools of the different types of workers. The resource manager monitors the CPU utilization of the computing machines in the cloud; if the utilization rate is less than a threshold, the machine is transitioned from the non-spot to the spot type.
Fig. 1. Spot workers in a cloud computing system.
In this study, we develop a replication approach to obtain stable resources from unreliable spot workers. As illustrated in Fig. 1, a replication controller controls the replication factor according to the conditions of individual workers; a spot worker can be interrupted abruptly without notification. We also develop a new resource broker for performing malleable tasks over a pool of spot workers. The new approach controls the replication factor by balancing the system metrics that are positively or negatively related to the factor. With this balancing technique, the broker can rapidly adjust the replication factor to the changing workload and thereby improve system performance.
2.2 Reliability and Replication Factor
Redundancy is a key technique for performing tasks on top of an unreliable network infrastructure. With this technique, the server performs a task by making several task replicas and assigning them to different workers. By organizing independent workers that are hidden from each other, the server can mitigate unreliable reports sent by these workers via an agreement policy, e.g., a majority voting mechanism. To model the computation on unstable spot workers, this paper uses the reliability ri to estimate the probability of successfully executing a spot job on spot worker wi. We assume that ri is given for all workers i and is an input parameter of our scheduling algorithm. This probability distribution can be determined from a classification of hardware features, or from the probability distribution of history logs over time.
For the sake of simplicity, we assume that each worker uses exactly one CPU core. The reliability of a worker wi at time t is given by [1–4]:

ri(t) = (ni(t) + 1) / (Ni(t) + 1)
which represents the ratio of correct answers reported by worker wi (i.e., ni(t)) to the total number of tasks assigned to worker wi (i.e., Ni(t)). For convenience, we hereinafter omit t in the notation ri(t) when the context is clear. A worker group (or simply, group) refers to a set of workers that execute the replicas of a job. Let (w1, w2, · · · , w2k+1) be a group of 2k + 1 workers with reliability ratings (r1, r2, · · · , r2k+1). HPC cluster workers are considered trusted, as they are hosted in a managed high-capability data center. For such workers, instead of the majority voting mechanism, we adopt the at-least-one voting mechanism, which confirms a successful execution upon receiving the first success reply among the workers. For the at-least-one voting model, the reliability of a worker group G is defined as follows:

λ(G) = 1 − ∏_{i=1}^{k} (1 − ri)   (1)
Clearly, 0 ≤ λ(G) ≤ 1. The reliability of a group is also termed its group strength. In Eq. (1), all response vectors except {0, 0, · · · , 0} are counted; {0, 0, · · · , 0} represents the case in which no response has been received from any group member. For example, for a worker group G = (w1, w2) with reliability ratings (0.5, 0.5), the group reliability based on at-least-one voting is λ(G) = 1 − 0.5 · 0.5 = 0.75. Intuitively, if we assign a task to two workers, each with reliability 1/2, the probability that neither worker reports correctly is 1/4.
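The following Python sketch computes the worker reliability and the at-least-one group strength of Eq. (1); the sample counts in the example calls are hypothetical.

```python
def worker_reliability(n_correct, n_assigned):
    """r_i(t) = (n_i(t) + 1) / (N_i(t) + 1)."""
    return (n_correct + 1) / (n_assigned + 1)

def group_strength(reliabilities):
    """Eq. (1), at-least-one voting: lambda(G) = 1 - prod_i (1 - r_i)."""
    failure = 1.0
    for r in reliabilities:
        failure *= (1.0 - r)
    return 1.0 - failure

print(worker_reliability(8, 10))   # a worker with 8 of 10 correct answers
print(group_strength([0.5, 0.5]))  # the worked example from the text: 0.75
```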
3 System Design
This section discusses the design and architecture of the proposed system.
3.1 Groups and Execution Success Rate
As defined in Eq. (1), the reliability λ(G) of a group refers to the success rate of executing a job in group G. This subsection discusses how the group size affects the success rate. Let γ(G) = 1 − λ(G), where γ(G) refers to the probability of a failed execution of the job for group G. Herein, we omit G in the notations λ(G) and γ(G) when the context is clear. In the volunteer computing system, a job is executed and may be rescheduled until success. Based on the geometric distribution [5], the total probability of getting k − 1 failures followed by one successful execution is given by

P(X = k) = (1 − λ)^{k−1} λ = γ^{k−1} λ
The expected value of P(X = k) is the sum over every possible k multiplied by its probability:

E(X) = Σ_{k=1}^{∞} k · P(X = k)   (2)

Simplifying the series of Eq. (2) by differentiation gives

E(X) = Σ_{k=1}^{∞} k · γ^{k−1} · λ = λ · (d/dγ) Σ_{k=1}^{∞} γ^k = λ · (d/dγ)(γ/(1 − γ)) = λ/(1 − γ)² = λ/λ² = 1/λ   (3)
For example, for a worker group with λ = 0.5, the expected number of executions that the group performs to obtain a correct result is 1/0.5 = 2. Alternatively, if λ = 0.8, the worker group performs an expected 1/0.8 = 1.25 executions to complete the job.
3.2 Groups and Throughput
The throughput of a group of spot workers during a time period T is the total number of jobs the spot workers can complete during that period. The throughput is dominated by the following two factors:
• Average throughput of a group, denoted by α: the average number of tasks completed by the worker group in the period of time T.
• Number of groups, denoted by β: the number of groups formed in the period of time T.
The average throughput of a group with group reliability λ is equal to the average number of tasks the group can complete in T units of time, which, according to (3), is given by α = T/(1/λ) = Tλ
(4)
According to Eq. (4), the throughput of a group depends on the reliability λ of the group; the reliability indicates the strength of the group. In addition, another factor, called group granularity, refers to the average size of the groups. Although high group strength can guarantee group performance, there is a trade-off between group granularity and group strength that we need to take into account when designing spot-worker scheduling algorithms. Let W be the number of spot workers and let r be the average reliability of the spot workers. For low-reliability workers, we need to gather more workers to increase the
strength. In contrast, highly reliable workers can achieve the same strength with a smaller group size. The group granularity (i.e., group size) k is related to the group reliability λ as follows. A group of k spot workers, each with average worker reliability r, leads to group strength

λ = 1 − (1 − r)^k

Conversely, the number of workers needed to achieve group strength λ can be derived as

k = log_{1−r}(1 − λ) = log(1 − λ) / log(1 − r)

If the average group size is k, the number of groups β can be derived as β = W/k = W · log(1 − r) / log(1 − λ)
(5)
Now the total throughput H of all groups in a period of time T can be formulated as H(T, r, λ) = αβ = (Tλ) · (W · log(1 − r) / log(1 − λ))
(6)
3.3 Trade-Off Property Between α and β
Given a system with W spot workers with average reliability r, we need to determine the parameter λ that maximizes the system performance. According to Eq. (6), the total throughput H is equal to αβ; that is, higher α and β induce better throughput H. The relationship between α and the parameter λ, formulated in Eq. (4), is illustrated in Fig. 2: α increases as λ grows. The relationship between β and λ, formulated in Eq. (5), is illustrated in Fig. 3: β decreases as λ grows.
Fig. 2. The trade-off property in terms of reliability: α versus λ.
Fig. 3. The trade-off property in terms of reliability: β versus λ.
We summarize the trade-off property of α and β, formulated in Eqs. (4)–(6), in the following property. Property 3.1. For a system with W spot workers, each with average reliability r, the total throughput H(T, r, λ) = αβ satisfies the following properties: (1) α increases as λ grows; (2) β increases as λ decreases.
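The following Python sketch tabulates α, β, and H of Eqs. (4)–(6) for a range of candidate strengths, exposing the trade-off of Property 3.1; the values of T, W, and r are illustrative assumptions.

```python
import math

def alpha(T, lam):
    """Eq. (4): average per-group throughput, alpha = T * lambda."""
    return T * lam

def beta(W, r, lam):
    """Eq. (5): number of groups, beta = W * log(1 - r) / log(1 - lambda)."""
    return W * math.log(1.0 - r) / math.log(1.0 - lam)

def throughput(T, W, r, lam):
    """Eq. (6): H(T, r, lambda) = alpha * beta."""
    return alpha(T, lam) * beta(W, r, lam)

# Tabulate the trade-off for W workers of average reliability r over a
# period T (all three values are illustrative).
T, W, r = 100, 1000, 0.5
for lam in (0.5, 0.75, 0.9, 0.99):
    print(f"lambda={lam:.2f}  alpha={alpha(T, lam):7.1f}  "
          f"beta={beta(W, r, lam):7.1f}  H={throughput(T, W, r, lam):9.1f}")
```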
4 Adaptive Replication Algorithms
This section discusses our new scheduling algorithm for maintaining the replication factor. The replication control algorithm maintains a global parameter called the group reliability threshold λth. When constructing a group g, the group strength λ(g) must be greater than or equal to λth, i.e., λ(g) ≥ λth. Given a system with W spot workers with average reliability r, our algorithm aims to maximize H(T, r, λth) as given in Eq. (6).
Table 1. Algorithm 1: Maintain_Group ().
To achieve this, as shown in Property 3.1, we need to maximize both α and β. However, increasing λth leads to higher α but lower β, and vice versa. Next, we develop algorithms to find a value of λth that leads to the maximal αβ. Two algorithms, Maintain_Group and Control_Replication, as listed in Tables 1 and 2, respectively, have been developed. Algorithm Maintain_Group organizes workers to form groups in terms of the reliability threshold λth, as described below. In Line 1(a) and Line 1(b), when a new task is received, Maintain_Group scans the current groups and checks whether their group reliability exceeds λth. All the groups
with strength less than λth are dissolved and will be reconstructed in Line 1(a); in contrast, groups with strength greater than λth are reduced. In Line 1(b), if no group can execute task T, the algorithm sorts the workers by their reliability values and tries to form a new group g with λ(g) ≥ λth (a code sketch of this behavior follows).
Table 2. Algorithm 2: Control_Replication ().
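The following is a Python sketch of the Maintain_Group behavior described above (dissolve weak groups, trim over-strong ones, form a new group from workers sorted by reliability); the worker representation and the trimming order are our assumptions, not the paper's exact listing.

```python
def group_strength(group):
    """At-least-one group strength of Eq. (1) over a list of worker dicts."""
    p = 1.0
    for w in group:
        p *= (1.0 - w["r"])
    return 1.0 - p

def maintain_group(groups, idle_workers, lam_th):
    """Sketch of Algorithm 1: rebuild groups with respect to lam_th."""
    # Line 1(a): dissolve groups below the threshold; trim over-strong groups.
    for g in list(groups):
        if group_strength(g) < lam_th:
            groups.remove(g)
            idle_workers.extend(g)             # members return to the idle pool
        else:
            while len(g) > 1 and group_strength(g[:-1]) >= lam_th:
                idle_workers.append(g.pop())   # release a redundant replica
    # Line 1(b): if no group can take the task, form a new one greedily.
    if not groups and idle_workers:
        idle_workers.sort(key=lambda w: w["r"], reverse=True)
        g = []
        while idle_workers and group_strength(g) < lam_th:
            g.append(idle_workers.pop(0))
        if group_strength(g) >= lam_th:
            groups.append(g)
    return groups

workers = [{"id": i, "r": r} for i, r in enumerate([0.9, 0.6, 0.5, 0.4, 0.3])]
print(maintain_group([], workers, lam_th=0.95))
```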
5 Experiments
In the following, we ran a series of simulated jobs on a Qube render management system in Formosa 3 and a Computer Game Desktop Grid (CGDG).
Fig. 4. Logs of Formosa 3 render farm: (a) 2014/8/19~2014/9/1; (b) 2014/3/28~2014/4/30.
Fig. 5. Logs of Formosa 3 render farm: (a) 2014/8/19~2014/9/1; (b) 2014/3/28~2014/4/30.
From the above experiments, we can see that after applying the MJ system to the HPC test environment, we may achieve a resource usage of 97.06% for 60-min. jobs, or a higher resource usage of 99.23% if the relatively shorter 10-min. jobs are used (Figs. 4 and 5).
6 Conclusions and Future Studies
In this paper, we propose a high-performance computing system, the malleable job scheduling system ("MJ system"), which maximizes resource utilization by allocating unused resource fragments to suitable jobs. The resource utilization of shorter jobs is better than that of longer jobs: using shorter jobs in the MJ system allows the overall computing resource utilization to reach 99.23% and job completion to reach 89.53%, effectively solving the resource fragmentation problem. In the future, we would like to improve resource utilization for long jobs and derive a reliability model for the proposed system. These constitute our future studies.
References 1. Wu, I.C., Huang, D.Y., Chang, H.C.: Connect6. ICGA J. 28(4), 234–241 (2005)
2. Wu, I.C., Chen, C.P.: Desktop grid computing system for Connect6 application. Institute of Computer Science and Engineering, College of Computer Science, NCTU (2009)
3. Wu, I.C., Han, S.Y.: The study of the worker in a volunteer computing system for computer games. Institute of Computer Science and Engineering, College of Computer Science, NCTU (2011)
4. Wu, I.C., Jou, C.Y.: The study and design of the generic application framework and resource allocation management for the desktop grid CGDG. Institute of Computer Science and Engineering, College of Computer Science, NCTU (2010)
5. Foster, I., Kesselman, C.: The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann Publishers, Inc., Burlington (1999)
You Draft We Complete: A BERT-Based Article Generation by Keyword Supplement
Li-Gang Jiang1, Yi-Hong Chan1, Yao-Chung Fan1, and Fang-Yie Leu2(B)
1 National Chung Hsing University, Taichung, Taiwan
2 TungHai University, Taichung, Taiwan
[email protected]
Abstract. In this paper, we investigate a text generation scenario in which we are given a sequence of keywords and our goal is to automatically turn the sequence into a semantically coherent passage. Moving toward this goal, we propose a scheme called round-robin generation that produces semantically coherent results by overcoming the repeated-word problem (commonly observed in story generation scenarios) through global attention and multi-task training. Experimental evaluation demonstrates that our scheme effectively mitigates the word repetition problem. We also show an improvement in BLEU 4 from 17.8 to 19.1 over a strong baseline on the benchmarking datasets.
1 Introduction
In this paper, we investigate a text generation scenario in which we are given a sequence of keywords and our goal is to automatically turn the token sequence into a semantically coherent passage. Specifically, as shown in Fig. 1, we are given a sequence of keywords K = [k1, ..., k|K|] as a seed, and the goal is to generate an article containing the keyword sequence. Such a text generation scenario is motivated by the need for quick story generation in news reporting. With this goal, we investigate the employment of BERT [1] for the targeted scenario. We investigate three architecture variants. The first one is Left-to-Right Generation (LR), which generates articles from left to right in a sequential manner. However, from the experiments, we find that LR generation suffers from a repeated-word problem. We therefore propose a second scheme that generates text contents in a round-robin manner to reduce the repeated-word problem. On top of the second design, we further investigate a multi-task training setting that enables the model to capture global attention for text generation. Through the experimental evaluation, we demonstrate the effectiveness of our model design. The contribution of this paper is summarized as follows.
– We propose a round-robin generation scheme, which alleviates the repetition problems encountered in the left-to-right generation scheme.
– We demonstrate the effectiveness of jointly training parallel generation and sequential generation in a multi-task manner, which mitigates the semantic inconsistency issue in our text generation task.
Fig. 1. An example for our goal
– Through the evaluation on the CNN/Daily mail data set, we demonstrate that our schemes significantly outperform several strong baselines in terms of BLEU score. Our best scheme provides an improvement of 19.172% (BLEU 4) compared with the strong baselines.
2 Text Generation Methods
2.1 Left to Right Generation
Training. Assume that we are given an article T = [t1, ..., t|T|]. We select keywords by first deleting the stop words from T and then randomly selecting some of the remaining words as keywords. Assuming the selected keywords are K = [k1, ..., k|K|], we align the input sequence for BERT by inserting mask tokens [M] before k1 and between any consecutive keyword tokens. Each [M] signals a token prediction. As an example, in Table 1 we show the training data creation for T = [t1, t2, t3, t4, t5] and K = [t2, t4]; in this example, we generate six training samples. We design our model to decode tokens in an auto-regressive manner, one token at a time. The first training sample is [[M1], t2, [M2], t4, [M3]] with label t1, and the second sample is t1, [M1], t2, [M2], t4, [M3] with label [S]. Two things should be noted. First, we generate one token at a time. Second, a given [M] is decoded by the following grammar:

[M] −→ {w[M], [S] | w ∈ V}

That is, for a given [M], we predict (1) a token w from vocabulary V and insert a new [M] after w, or (2) an [S] token to signal the completion of the mask token in consideration. Note that in this scheme, we decode the mask tokens from left to right; the decoding process of a mask token starts when all mask tokens to its left are completed.
Table 1. A LR masking example for target article T = [t1, ..., t5], and a given sequence of keywords K = [t2, t4]

| Iter. | Input sequence | Label for [M1] |
|---|---|---|
| 1 | [M1], t2, [M2], t4, [M3] | t1 |
| 2 | t1, [M1], t2, [M2], t4, [M3] | [S] |
| 3 | t1, t2, [M1], t4, [M2] | t3 |
| 4 | t1, t2, t3, [M1], t4, [M2] | [S] |
| 5 | t1, t2, t3, t4, [M1] | t5 |
| 6 | t1, t2, t3, t4, t5, [M1] | [S] |
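The sample construction of Table 1 can be sketched as follows in Python. The segment-based bookkeeping is our reconstruction of the procedure, with literal token strings standing in for BERT vocabulary ids.

```python
def split_by_keywords(article, keywords):
    """Split the article into the gap segments around the keyword sequence."""
    gaps, cur, ki = [], [], 0
    for tok in article:
        if ki < len(keywords) and tok == keywords[ki]:
            gaps.append(cur)
            cur, ki = [], ki + 1
        else:
            cur.append(tok)
    gaps.append(cur)
    return gaps  # len(gaps) == len(keywords) + 1

def lr_training_samples(article, keywords):
    """One (input sequence, label) pair per decoding step, as in Table 1."""
    gaps, samples = split_by_keywords(article, keywords), []
    for j in range(len(gaps)):                  # decode the masks left to right
        for step in range(len(gaps[j]) + 1):    # each gap token, then [S]
            seq = []
            for g in range(j):                  # fully decoded prefix
                seq += gaps[g] + [keywords[g]]
            seq += gaps[j][:step] + ["[M]"]     # current mask position
            for g in range(j, len(keywords)):   # not-yet-decoded structure
                seq += [keywords[g], "[M]"]
            label = gaps[j][step] if step < len(gaps[j]) else "[S]"
            samples.append((seq, label))
    return samples

# Reproduces the six samples of Table 1.
for seq, label in lr_training_samples(["t1", "t2", "t3", "t4", "t5"], ["t2", "t4"]):
    print(" ".join(seq), "->", label)
```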
Prediction. For given keywords K = [k1, ..., k|K|], we insert [M] tokens between every two consecutive input keywords, and the input sequence Xi is aligned as

Xi = [[C], [Mj], k1, [Mj+1], k2, [Mj+2], ..., k|K|, [M|M|]]

The input sequence Xi is then represented by the BERT embedding layers and fed forward into the BERT model. We denote the final hidden vector of [Mj] as h[Mj] ∈ R^h. We compute the label probabilities Pr(w|Xi) ∈ R^|V| by a softmax function as follows:

Pr(w|Xi) = softmax(h[Mj] · WLR + bLR)
t̂[Mj] = argmax_w Pr(w|Xi)

Each step generates only one token, at the first [M] token; the newly generated token t̂[Mj] is appended to Xi, and the article generation process is repeated with the new Xi until [S] is predicted. Then the first [M] is removed from Xi, the prediction moves to the [M] at the next position, and the process continues until Xi contains no [M] tokens. In Fig. 2, we illustrate the prediction process.
2.2 Round Robin Generation
Through experimental evaluation, we find that LR generation suffers from a word repetition problem (the generation result contains repeating n-grams). We therefore explore changing the order of generation: the [M] positions are processed in a round-robin manner, as shown in Table 2. We call this generation scheme Round Robin Generation (RR).
Training. To make the [M] predictions in a round-robin manner, we introduce a special token [SP]. Except for the currently focused [M], the remaining [M] tokens are replaced with [SP]; that is, all [M] positions take turns making predictions, as illustrated in Fig. 3.
Fig. 2. The LR prediction

Table 2. A RR masking example for target article T = [t1, ..., t5], and a given sequence of keywords K = [t2, t4]

| Iter. | Input sequence | Label for [M] |
|---|---|---|
| 1 | [M], t2, [SP], t4, [SP] | t1 |
| 2 | t1, [SP], t2, [M], t4, [SP] | t3 |
| 3 | t1, [SP], t2, t3, [SP], t4, [M] | t5 |
| 4 | t1, [M], t2, t3, [SP], t4, t5, [SP] | [S] |
| 5 | t1, t2, t3, [M], t4, t5, [SP] | [S] |
| 6 | t1, t2, t3, t4, t5, [M] | [S] |
As an example, in Table 2 we show a case where T = [t1, t2, t3, t4, t5] and K = [t2, t4]; we generate six training samples. The first training sample is [M], t2, [SP], t4, [SP] with label t1, and the second sample is t1, [SP], t2, [M], t4, [SP] with label t3. Three things should be noted. First, we generate one token at a time. Second, all [M] positions take turns making predictions in a round-robin manner. Third, for a given [M], the prediction is based on

[M] −→ {w[M], [S] | w ∈ V}
Prediction. Our RR scheme works as follows. First, for given keywords K = [k1, ..., k|K|], we insert [M] tokens between every two consecutive input keywords.
Fig. 3. The RR prediction
Except for the current target [M], all other [M] tokens are replaced by [SP]; the input sequence Xi to RR is formulated as

Xi = [[C], [M], k1, [SP], k2, [SP], ..., k|K|, [SP]]

The prediction at [M] is computed by a linear layer WRR ∈ R^{h×|V|} and the hidden representation h[M] ∈ R^h, with a softmax activation over the vocabulary dimension:

Pr(w|Xi) = softmax(h[M] · WRR + bRR)
t̂[M] = argmax_w Pr(w|Xi)

After generating t̂[M], we proceed to predict at the next [SP]: that [SP] is changed to [M], and the remaining positions become [SP]. The generation process terminates when no [M] remains, as shown in Fig. 3 (a code sketch of this loop follows at the end of this section).
2.3 Context Level Parallel Generation
As mentioned, our goal is to produce complete news articles. The quality of an article depends on the degree of fact matching and on the correctness of grammar and semantics. Both LR and RR are one-by-one generation architectures. According to our observations, both models only learn to focus on the generation between two keywords, rather than on the generation of the entire text; this leads to the problem of factual inconsistency. For this problem, we propose a context level parallel generation (CL) architecture for our task. Unlike the previous two models, the tokens between all keywords are predicted in parallel, as illustrated in Fig. 4. Our goal is to make the model simultaneously pay attention to all keywords at the sentence level. However, parallel generation often fails to generate fluent results. Therefore, we propose to incorporate context level parallel generation in a multi-task manner, as shown in Fig. 5.
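As a concrete illustration of the RR decoding loop of Sect. 2.2 referenced above, here is a minimal Python sketch. The predict callable is a hypothetical stand-in for the BERT forward pass plus argmax, and the [C] token and slot bookkeeping details are simplified assumptions.

```python
import random

def rr_generate(keywords, predict, max_len=50):
    """Round-robin decoding: the slots around the keywords take turns growing.

    predict(sequence) stands in for the BERT forward pass plus argmax; it
    returns the token decoded at the single [M] position, or [S] to close
    that slot. The [C] token of the paper is omitted for brevity.
    """
    slots = [[] for _ in range(len(keywords) + 1)]  # slot i precedes keyword i
    open_slots = list(range(len(slots)))
    while open_slots and sum(map(len, slots)) < max_len:
        for i in list(open_slots):                  # visit the open slots in turn
            seq = []
            for j, slot in enumerate(slots):
                seq += slot
                if j in open_slots:                 # closed slots lose their mark
                    seq.append("[M]" if j == i else "[SP]")
                if j < len(keywords):
                    seq.append(keywords[j])
            tok = predict(seq)
            if tok == "[S]":
                open_slots.remove(i)                # this slot is complete
            else:
                slots[i].append(tok)
    out = []
    for j, slot in enumerate(slots):
        out += slot + ([keywords[j]] if j < len(keywords) else [])
    return out

# Toy stand-in predictor: randomly emits a filler token or closes the slot.
toy = lambda seq: random.choice(["the", "team", "[S]", "[S]"])
print(" ".join(rr_generate(["milan", "contract"], toy)))
```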
388
L.-G. Jiang et al.
Fig. 4. The CL prediction Table 3. Number of dataset Train data Test data CNN/Daily mail
5849
1000
AIdea AI CUP 2019 9558
1000
3 Experiments 3.1
Experimental Settings
Implementation Details. As mentioned, in order to complete cloze task and parallel generation, we use BERT to train our model. To achieve this, our experiment is based on huggingface transformers framework[5]. The optimization is performed using Adamax, with an initial learning rate of 5e–5. In the case of Chinese, the batch size is set to 38. All models are trained with 2 GPUs (RTX Titan) for 5 epochs (Table 3). Datasets. For our task, we two news data sets (CNN/Daily mail [3] and AI-Cup2019) for performance evaluation. The splits and the dataset settings are summarized in the following table. 3.2
Token Score Comparison and Case Study
To evaluate the performance of our methods, we use BLEU score [4] and ROUGE L [2] score as performance metrics. The BLEU scores evaluate average n-gram precision on a set of reference sentences, with penalty for overly long sentences. The ROUGE (L) measure is the recall of longest common sub-sequences. The evaluation is summarized in Table 4.
You Draft We Complete
389
Fig. 5. Multi-task with context level generation
We compare the two methods of LR and RR and their advantages after multitasking. From the experiment result, we can see that RR outperforms LR. We also show two case-study in Table 5 and Table 6 to see the qualitative results of the compared schemes. From the case studies, it can be found that LR is prone to the problem of predicting repeated words. For example, in Example 1, the word “col” is generated repeatedly, and in Example 2, when enumerating date, “50” and other places are continuously generated. In RR, the order of generation is in a round-robin manner, so that the model focuses on the generation between different keywords, Even if one of the positions encounters a repetitive situation, it can be predicted by the next step to go to the other position, skipping the repetitive problem. Although RR mitigates the problem of duplication, there are still other problems, such as factual conflicts and inconsistent context. For example, in Example 1, the sentences “ac milan have signed a three - year contract extension with ac milan” are Table 4. Comparison of token score Model BLEU 1 BLEU 2 BLEU 3 BLEU 4 ROUGE-L CNN/Daily mail
LR
55.703
36.12
24.671
17.845
42.18
RR
55.815
36.989
25.819
19.01
43.748 44.447
CL
55.958
37.238
25.711
18.563
AIdea AI CUP 2019 LR
60.441
48.42
38.627
31.305
56.44
RR
61.139
49.294
39.722
32.465
57.473
CL
58.564
47.439
38.296
31.252
57.184
390
L.-G. Jiang et al. Table 5. Article generation example 1
You Draft We Complete Table 6. Article generation example 2
391
392
L.-G. Jiang et al.
semantically inconsistent, the point of the conflict is that “AC Milan” is a football team, so it should sign contracts with the players, not with the team itself. We believe that even if the CL itself has a low token score, the ability to generate globally has a certain degree of influence on the model itself to understand the context. Therefore, both LR and RR are multi-tasked with CL. After performing multi-tasks on the two models LR and RR separately, we see that the quality of the generation has been greatly improved, as shown in the score comparison. In Example 1 and Example 2, after LR completed multitasking, the problem of repetition was greatly reduced and the models generate smooth and coherent articles. RR is based on the original advantage, and then alleviates the problem of fact mismatch and makes the context semantics more fluid. For example, in Example 2, the team “AC Milan” originally signed a contract with the team itself, which created a contradiction, but now it can sign the contract with the player correctly.
4 Conclusion In this paper, we propose to use BERT models for article auto generation. From our study, we propose to use round robin decoding strategy augmented by a multi-tasking with a context-level parallel decoding. From the performance evaluation, we show an improvement of BLEU 4 from 17.8 to 19.1 compared with a strong baseline on the benchmarking datasets.
References 1. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) 2. Lin, C.Y.: Rouge: A package for automatic evaluation of summaries. In: Text Summarization Branches Out, pp. 74–81 (2004) 3. Nallapati, R., Zhou, B., Gulcehre, C., Xiang, B., et al.: Abstractive text summarization using sequence-to-sequence RNNs and beyond. arXiv preprint arXiv:1602.06023 (2016) 4. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pp. 311–318 (2002) 5. Wolf, T., et al.: Huggingface’s transformers: State-of-the-art natural language processing. ArXiv arXiv:1910.03771 (2019)
A Q-Learning Based Downlink Scheduling Algorithm for Multiple Traffics in 5G NR Systems Zhi-Qian Hong1 , Heru Susanto2,3,4 , and Fang-Yie Leu1(B) 1 Department of Computer Science, Tunghai University, Taichung 40704, Taiwan
{g08350021,leufy}@thu.edu.tw
2 School of Business, University Technology of Brunei, Bandar Seri Begawan
BE 1410, Brunei Darussalam [email protected] 3 Department of Information Management, Tunghai University, Taichung 40704, Taiwan 4 Research Center for Informatics, The Indonesia Institute of Sciences, Cibinong 16912, Indonesia
Abstract. Nowadays, due to the rapid growth and popularity of IoT devices, current wireless networks suffer from the huge traffic produced by these devices. On the other hand, with the availably of 5G networks, people expect to have high quality streaming mechanisms. In a 5G network, data transmission between BS and UE is one of the biggest challenges for high-quality streaming since the bandwidth that a BS can provide is limited. To enlarge the bandwidth efficiency for a BS, in this study, we propose a downlink scheduling mechanism, named the Q-learning based Scheduling and Resource Allocation Scheme (QSRAS). This scheme dynamically adjusts radio resource allocation by referring to QoS parameters. Our simulation and analysis show that the system’s throughputs, fairness and average delays, all outperform those of state-of-the-art systems.
1 Introduction With the quick progress of transmission technology, online streams and multimedia, such as, VR and 8K live broadcast, are popularly deployed in our surrounding environment. On the other hand, a large number of IoT systems, such as smart families, factory automation and smart agriculture, have or will be soon connected to the Internet and/or 5G networks. However, current 4G network systems and communication technologies do not meet these service requirements and are also unable to solve the problems caused by these requirements during data transmission, e.g., narrow residual bandwidth and poor channel quality. In order to offer users with greater bandwidth and tolerable delays, LTE uses Scheduling and Resource Allocation (SRA) to schedule and distribute wireless resources to users. Effectively managing limited radio resources of a BS is a key factor for providing users with better QoS. In this study, we propose a downlink scheduling mechanism, called Q-learning based Scheduling and Resource Allocation Scheme (QSRAS), which optimally allocates downlink radio resources to users and applications, so that the receiving ends can enjoy better network usage experience. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 393–402, 2022. https://doi.org/10.1007/978-3-030-79728-7_39
394
Z.-Q. Hong et al.
This paper is organized as follows. Section 2 introduces the similarities and differences between LTE-4G PHY and 5G PHY, and briefly describes related studies of this paper. Section 3 presents the QSRAS. Our simulation and the results are stated and discussed in Sect. 4, respectively. Section 5 concludes this paper and addresses out future studies.
2 Relate Work 2.1 5G NR Physical Layer 5G wireless communication has been evolved from Orthogonal Frequency Division Multiple Access (OFDMA) to scalable New Radio (NR) which introduces the concept of Numerologies. As shown in Table 1, a Frame (subframe) still lasts 10 (1) ms. However, NR provides a variety of subcarrier spacing (SCS) (15 kHz, 30 kHz, 60 kHz …) to enhance bandwidth scalability. The durations of a CP and a slot of a subframe vary given different SCSs. The number of Numerologies allocated to a UE is basically based on UEs’ needs. Table 1. Comparison between LTE’s SCS and CP and 5G’s numerologies. Numerologies (μ)
0
1
2
3
4
LTE
Sub-carrier 15 kHz Spacing (SCS) (2μ x 15 kHz)
30 kHz
60 kHz
120 kHz 240 kHz 15 kHz
Cyclic Prefix
Normal
Normal Normal/Extended Normal
Normal
Normal/Extended
No. of symbols per slot
14
14
14/12
14
14
7/6
No. of slots per subframe
1
2
4
8
16
2
No. of slots per frame
10
20
40
80
160
20
Slot duration
1000 μs 500 μs
250 μs
125 μs
62.5 μs
500 μs
Although, in 5G NR, RB has detailedly defined the allocation in Frequency domain, the definition in Time domain is not clear. The minimum length in Time domain could be an OFDM symbol. However, the exact time varies due to Start and Length Indicator (SLIV) which is an indicator for the time domain allocation in PDCCH.
A Q-Learning Based Downlink Scheduling Algorithm for Multiple Traffics
395
2.2 Scheduling and Resource Allocation (SRA) The transmission architecture of 5G NR consists of two parts: BS and UE. In a real environment, it is connected by UEs with a BS. For easy understanding, there is only one UE connected to BS as illustrated in [1]. Each of BS and UE has MIMO mechanisms, which own array antenna of NT × NR for sending and receiving radio signals simultaneously, where NT (NR ) represents N transmission (receiving) antennas and currently NT = NR . When M UEs request transmitting or receiving data, they stay in queues at BS. Of course, these requests may be various applications, such as eMBB, URLLC and eMTC. In each TTI, most scheduling algorithms allocate RBs to UEs according to channel quality, queue state and other QoS requests. Channel State Information (CSI) is exchanged between US and BS every TTI. After receiveing CQI from UE, BS calculates an appropriate modulation and coding scheme (MCS) for the UE following Adaptive Modulation and Coding (AMC) defined in [2]. Then, the SRA effectively allocates appropriate radio resources to the UE to achieve better system performance. To improve the spectral efficiency of a system, for every UE, resource allocation is performed by calculating a Matrix (see Eq. (1)) with which to assign suitable RB to UE. (1) Matrix = wi,j ∗ Ai,j where wi,j is the weight of the UEi to RBj . When wi,j is the highest for all js, 1 ≤ j ≤ m, RBj will assign to UEi and then, Ai,j is 1; otherwise, Ai,j = 0. 2.3 Packet Scheduling Algorithm Analysis According to [1], as shown in Fig. 1, radio resource management and scheduling solutions can be divided into two categories: QoS unaware and QoS aware.
Fig. 1. Packet scheduling classification.
396
Z.-Q. Hong et al.
2.3.1 QoS Unaware QoS unaware considers parameters related to system fairness, including CSI, average transmission rate, etc. The goal is to effectively allocate radio resources and scheduling UE transmission. Max CQI [3] allocates radio resources to UEs only based on the conditions of the channel allocated to UEs at current TTI. Using Max CQI can maximize the overall throughputs, but it does not guarantee resource allocation fairness among UEs. For RR, the rotation schedule provides the best user fairness, but its throughputs are not maximum because of lacking channel quality. PF maximizes throughputs and achieves fairness by estimating the average transmission rate and current channel quality before assigning RBs to UEs. Equation (2) [1] shows the PF index Wi,j (t). Wi,j (t) =
ri,j (t) R¯ i (t)
R¯ i (t) = (1 − β)R¯ i (t − 1) + β ri (t)
(2) (3)
where ri,j (t) is the instant transmission rate of flowi at subcarrierj . R¯ i (t) defined in Eq. (3) [1], as the estimated transmission rate for UEi at time t, is calculated by Exponential average method. R¯ i (t − 1) is the estimated transmission rate at the previous TTI, i.e., t − 1. ri (t) is the data rate achieved by UEi in current scheduling time t. β is a parameter that smooths the system transmission rate and controls system fairness. 2.3.2 QoS-Aware This scheduling category takes different parameters into account for scheduling decisions. Figure 1 shows five methods. The following explains them one by one. 2.3.2.1 Delay Aware This category is suitable for RT flows, such as, real-time gaming and live streaming. QoS class identifier (QCI), as defined in the standard of LTE and 5G [4], is suitable for different types of RT flows, including priority, data delay, data loss, etc. Delay aware methods include Modified earliest deadline first (M-EDF) [5], Modified Largest Weighted Delay First (M-LWDF) [6], Exponential Proportional Fair (EXP-PF) [6] and Exponential Rule (EXP-Rule) [7]. However, these algorithms do not provide balanced QoS services when serving hybrid flows, i.e., RT and NRT. RT flows of higher priority will be served first. M-LWDF The M-LWDF considers PF and Head of Line (HoL) Delay when processing RT flows. For NRT flows, the PF is used. In other words, M-LWDF supports RT and NRT flows for scheduling. The calculation of the weight parameter Wi,j (t) is described in [1], where DHoL,i (t) is HoL delay, and αi is used to determine the degree of delay for the weight. δi is defined as the maximum probability that DHoL,i (t) exceeds the predefined delay threshold τi . If the delay time of a RT-flow packet in the MAC queue exceeds τi , the packet will be discarded.
A Q-Learning Based Downlink Scheduling Algorithm for Multiple Traffics
397
EXP-PF The EXP-PF scheduler calculates weight Wi,j (t) to improve the priority level of RT flows, by involving exponential function, DHoL and PF rule, where h(t) defined in [1] represents the delay of the overall RT flows of the system, and Ni is the current number of RT flows in the system. It effectively reduces delays of all RT flows. 2.3.2.2 Queue Aware This category emphasizes fairness for overall UEs in a system, especially focusing on queue length, aiming to optimize throughputs of RT flows while providing the lowest guaranteed transmission rate for NRT flows. One of the most widely used schedulers in this category is Virtual Token Modified Largest Weighted Delay First (VT-M-LWDF), which is an improved version of M-LWDF. The calculation of Wi,j (t) [1] replaces the delay of RT Flows in M-LWDF with queue length. This parameter indicates the amount of data now in a queue. In fact, the VT-M-LWDF improves QoS for multimedia services, such as VoIP and jitter sensitive applications. 2.3.2.3 Target Bitrate The purpose of this scheduling category is to maximize throughputs of all UEs in a system, like those in [8, 9], by miniming and maximizing target transmission rates for RT flows and NRT flows, respectively. However, when serving the two types of flows simultaneously, their rules do not guatantee the QoS requested by UEs. 2.3.2.4 Hybrid and Others Many other algorithms have been proposed. Nasvalla [1] introduced a hybrid scheduling algorithm which improves M-LWDF by simultaneously performing SRA algorithm on both RT and NRT flows to solve the problem of NRT delays. In addition, a cross-layer high-efficient algorithm was proposed in [10] which reaches the best solution of the SRA system by using a dynamic programming approach. It is a improved version of the system stated in [11], which adopts a greedy algorithm. However, a greedy algorithm may calculate the best solution locally, rather than globally. Feki and Zarai [12] proposed a Q-learning SRA algorithm to determine which scheduling algorithm should be used for current TTI. According to system throughputs, fairness indexes and various thresholds, PF is empolyed when the system achieves a balance between throughputs and fairness. Max CQI is selected when system throughputs do not meet UE‘s requirements. When fairness is lower than requirements, RR is chosen. 2.4 The Concept of Q-Learning Reinforcement learning is one of machine learning techniques, which emphasizes on the interaction between itself and current working environment to maximize the benefits for the following activities. The main idea of Q-learning is to strengthen a good behavior, and weaken a bad one. Q function defined in Eqs. (4) and (5) [12] for a given Q(state, action) is as follows:
398
Z.-Q. Hong et al.
Q(st , at ) ← Q(st , at ) + α × δt
(4)
δt = rt+1 + γ max Q(st+1 , at ) − Q(st , at )
(5)
a
where Q(st , at ) is function which evaluates current actions; δt represents temporal difference error; α is learning rate, 0 ≤ α ≤ 1, used to rate the certitude of values previously estimated; rt+1 is an immediate reward received from the environment and γ, representing the weight affected to future rewards by immediate rewards, 0 ≤ γ ≤ 1, is a discount factor indicating the importance of future rewards. If γ = 0, only the immediate reward is considered.
3 Proposed Q-Learning Based SRA Algorithm – QSRAS Figure 2 shows the system architecture of the QSRAS, which balances throughputs and delays for UEs by updating Q-Table in Q-learning mechanism. For RT flows, QoS is guaranteed. Meanwhile, it also serves NRT flows. The QSRAS considers channel quality and HoL Delay as its QoS parameters. As shown in Eq. (6), the QSRAS prioritizes UE with the highest weight calculated by the Q-learning mechanism for the next TTI. Wi,j (t) = Wi,j (t − 1) + α Ri,j (t) − Wi,j (t − 1) (6) where Wi,j (t) is weight of flowi at subcarrierj at time t, Wi,j (t − 1) represents previous weight of flowi at subcarrierj , Ri,j (t) is the immediate reward received from the environment for flowi at subcarrierj at time t, α is the learning rate, 0 ≤ α ≤ 1. The larger the value of α, the greater the impact on Q by reward r. To ensure balanced QoS for RT flows, α > 0.5 to enhance the impact on the next TTI by current reward, thus conducting a greater impact on the weight of RT flows at time t. γ in Eq. (5) represents the weight affected to future rewards relative to immediate rewards. In this study, we only take immediate reward into account because this reward is determined by current environment, i.e., γ = 0. The reward from the current environment is defined in Eq. (7). Ri,j (t) =
ri,j (t) × DHoL,i (t) R¯ i (t)
(7)
where ri,j (t) the same as that in Eq. (2) is the instantaneous transmission rate of flowi at subcarrierj at time t, R¯ i (t) is defined in Eq. (3), DHoLi (t) is flowi ’s HoL Delay at time t. When SRA algorithm is utilized to allocate an RB to UE, e.g., UEl , SRA looks up the maximum weight, e.g., Wl,k (t), 1 ≤ k ≤ m, in Matrix table (n × m) and then allocates RBk to UEl . According to Eq. (7), ri,j (t) is used to improve system throughputs, and DHoL,i (t) enhances the weight for delay in Matrix table. Since Q-learning takes past weight into consideration, its diversification of weight is smoother than those of the other algorithms.
A Q-Learning Based Downlink Scheduling Algorithm for Multiple Traffics
399
Fig. 2. The architecture of the QSRAS.
4 Simulation and Performance Analysis In this section, we use an open-source simulator, 5G-air-simulator [13], to do our experiments. We evaluate performance of the QSRAS and existing SRA algorithms in 5G environments with interference. 4.1 Simulation Scenario Table 2 lists parameter settings in 5G-air-simulator. In a cellular system, 7 cells are connected as a hexagon, and many UEs may connect to a BS. UEs move along randomly chosen directions with the speed of 30 km/h. Each UE delivers VoIP, video and CBR traffic at the same time. The simulation duration is 30 s, and a flow lasts 25 s. Table 2. Simulation parameters Parameters
Values
System Bandwidth 20 MHz Number of RBs
100 RBs
Number of UEs
10–60
Max Delay
100 ms
Frame Structure
FDD
Cell radius
1 km
Video bitrate
440 kbps
4.2 Performance Evaluation In the following experiment, we evaluate performance of some SRA algorithms, including PF, M-LWDF, EXP-PF, RR and QSRAS. The evaluation metrics including PLRs, packet delays, throughputs and spectral efficiencies. Each experiment is performed 10 times.
400
Z.-Q. Hong et al.
Fig. 3. Packet loss rates.
Fig. 4. Delays.
Fig. 5. Throughputs.
The PLRs for VoIP, video, and CBR flows are presented in Fig. 3. As the number of users rises, the video quality decreases. However, PF and RR have better performance than M-LWDF, EXP-PF and the QSRAS do on VoIP and CBR. When both RT and NRT flows are transmitted, the delivery priority of RT flows is higher. Delays are illustrated in Fig. 4. It is clear that no matter whether it is VoIP, video or CBR, the algorithms without QoS receive longer delays. That is why delays are considered as one of the QoS parameters so that the algorithms can effectively reduce packet delay and improve user experience. The QSRAS performs as well as M-LWDF and EXP-PF do in delay. When transmitting VoIP packets, performance of RR and PF are more excellent than others due to small traffic employed.
A Q-Learning Based Downlink Scheduling Algorithm for Multiple Traffics
401
Throughputs are shown in Fig. 5. As number of users rises, throughputs increase for all applications. With video, when traffic is larger, the problems caused by PF and RR appear. In addition, we found that performance of the QSRAS encounters a bottleneck when number of users rises up to 60. This is a problem that we need to improve in future studies.
Fig. 6. System spectral efficiencies.
The total system spectral efficiencies for a network are plotted in Fig. 6. Spectral efficiency is one of the performance indicators for resource utilization. As number of users is higher, spectral efficiencies increase up to 7 bit/Hz in M-LWDF, EXP-PF and the QSRAS. However, situations are the same as throughputs in video, the QSRAS might encounter a bottleneck when number of users is higher than 60.
5 Conclusions and Future Studies This paper proposes a downlink SRA algorithm based on Q-learning, aiming to enhance the performance for different categories of traffic, and improve system capacity. According to [10], the scheduling algorithms with different QoS parameters are analyzed in detail, including delay aware, queue aware, target bit-rate aware, Hybrid aware and others. The QSRAS takes channel quality, past throughputs and delays into account, and calculates the weight by using Q-learning technique for RB allocation. Our simulation compares and analyzes different downlink scheduling algorithms in terms of performance metrics, including PLRs, packet delays, throughputs and system spectral efficiencies. According to our simulation results, the algorithms with delay awareness perform well. Furthermore, the QSRAS balances QoS and performance in different categories of traffic. Our conclusions are that QoS balanced scheduling algorithms perform well. In future, an optimization and an extension of this study will be carried out. Acknowledgments. This study is financial support in part by Ministry if Technology and Science, Taiwan under the grants MOST 108–2221-E-029–009 and MOST 109–2221-E-029–017-MY2.
402
Z.-Q. Hong et al.
References 1. Nasralla, M.M.: A hybrid downlink scheduling approach for multi-traffic classes in LTE wireless systems. IEEE Access 8, 82173–82186 (2020) 2. 3GPP 38 Series TS 138 214V16.5.0 5G; NR; Physical layer procedures for data 3. Gueguen, C., Baey, S.: A fair MaxSNR scheduling scheme for multiuser OFDM wireless systems. In: 2009 IEEE 20th International Symposium on Personal Indoor and Mobile Radio Communications, Tokyo, Japan, pp. 2935–2939 (2009) 4. 3GPP 23 Series TS 123 501V16.8.0 5G; System architecture for the 5G System 5. Hamed, M.M., Shukry, S., El-Mahallawy, M.S., El-Ramly, S.H.: Modified earliest deadline first scheduling with channel quality indicator for downlink real-time traffic in LTE networks. In: The Third International Conference on e-Technologies and Networks for Development (ICeND2014), Beirut, Lebanon, pp. 8–12 (2014) 6. Basukala, R., Ramli, H.A.M., Sandrasegaran, K.: Performance analysis of EXP/PF and M-LWDF in downlink 3GPP LTE system. In: 2009 First Asian Himalayas International Conference on Internet, Kathmundu, Nepal, pp. 1–5 (2009) 7. Ang, E.M., Wee, K., Pang, Y.H., Phang, K.K.: A performance analysis on packet scheduling schemes based on an exponential rule for real-time traffic in LTE. EURASIP J. Wirel. Commun. Netw. 2015(1), 1–12 (2015). https://doi.org/10.1186/s13638-015-0429-8 8. Monghal, G., Pedersen, K.I., Kovacs, I.Z., Mogensen, P.E.: QoS oriented time and frequency domain packet schedulers for the UTRAN long term evolution. In: Proceedings of VTC Spring-IEEE Vehicular Technology Conference, Singapore, pp. 2532–2536 (2008) 9. Skoutas, D.N., Rouskas, A.N.: Scheduling with QoS provisioning in mobile broadband wireless systems. In: Proceedings of the European Wireless Conference (EW), Lucca, Italy, pp. 422–428 (2010) 10. Vora, A., Kang, K.-D.: Effective 5G wireless downlink scheduling and resource allocation in cyber-physical systems. Technologies 6(4), 105 (2018) 11. Femenias, G., RieraPalou, F., Mestre, X., Olmos, J.J.: Downlink scheduling and resource allocation for 5G MIMO-multicarrier: OFDM vs FBMC/OQAM. IEEE Access 5, 13770– 13786 (2017) 12. Feki, S., Zarai, F.: Cell performance-optimization scheduling algorithm using reinforcement learning for LTE-advanced network. In: 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications, pp. 1075–1081 (2017) 13. Martiradonna, S., Grassi, A., Piro, G., Boggia, G.: 5G-air-simulator: an open-source tool modeling the 5G air interface. Comput. Netw. 173, 107151 (2020). (Elsevier)
Study on the Relationship Between Dividend, Business Cycle, Institutional Investor and Stock Risk Yung-Shun Tsai, Chun-Ping Chang(B) , and Shyh-Weir Tzang Department of Finance, Asia University, Taichung, Taiwan
Abstract. Investors in Taiwan mostly use dividends and business cycles as their main indicators of judgment when making investment decisions, and they pay less attention to individual stock risks. Recently, the proportion of corporate investment in the Taiwan stock market has gradually surpassed that of individual investors. Therefore, this paper analyzes the stock price risk of 200 listed companies in Taiwan that issued dividends every year and all institutional investors hold shares from 2008 to 2014. The individual risk is measured by the standard deviation and the market risk is measured by the beta coefficient. The impact of the dividend, business cycle, and institutional investors’ shareholding on stock risk is discussed. The empirical results found that (1) Dividends and institutional investors’ shareholding are significantly positively correlated with individual risks. (2) The business cycle is negatively correlated with individual risks. (3) The relationship between dividends, business cycle, the shareholding of institutional investors, and market risks is not obvious. Keywords: Risk · Dividend · Business cycle · Institutional investor
1 Introduction When most investors choose stocks for investment in the stock market, they often prioritize the dividends and price differences of individual stocks and ignore the risk assessment. In theory, risk means the probability of financial loss. Therefore, it is necessary to predict the stock risk when the asset portfolio is matched. The past literature clearly pointed out that the volatility of the overall stock market will change over time. From the studies of French et al. (1987) and Hamao et al. (1990), we know that there is a cluster of volatility in time series data such as stock prices, exchange rates, and inflation in financial markets. The phenomenon is the risk of “changing with time”. However, there are few relevant studies on how stock risks change over time. Therefore, this study intends to conduct empirical research on stock price volatility and market risk to measure stock risk. Linter (1956) put forward the connotation of dividend information, and believed that the company’s dividend payment is not only the result of the company’s operating performance, but also the expected future earnings; if the company expects future earnings to increase, it will increase the dividend payment. Litzenberger and Ramaswamy (1979) © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 403–411, 2022. https://doi.org/10.1007/978-3-030-79728-7_40
404
Y.-S. Tsai et al.
also found that the dividend yield is helpful to predict the change in the average return of stocks in the longitudinal section, while Lamont (1998) proposed that the dividend payout rate has a significant correlation with the average return of stocks; therefore, the dividend policy is used to Explore stock risks. In terms of basic factors, overall economic factors have a considerable impact on stock prices. Taiwan’s weighted stock price index had more than 10,000 points in 1997 and 2000. In November 2003, the stock market crashed under the influence of the financial crisis in the US subprime mortgage, and the index once fell below 400 points. This shows that the stock market is deeply affected by the business cycle, so this study chooses the business cycle as an explanatory variable to measure stock risk. In terms of transaction factors, Scholes (1972) proposed that if there is a huge transaction in stocks, the asymmetry between buying and selling will occur in the market, which will result in the inability of buyers and sellers to trade at the expected price, resulting in transaction costs of the bid-ask spread. Affect the actual price of the stock. Since 1990, Taiwan has opened up foreign professional investment institutions to directly invest in the domestic securities market year by year. Until 2012, the transaction volume of institutional investors reached as high as 38%, and the proportion of institutional investors investment increased year by year, and its influence on the stock market was increasing. Therefore, this study adopts the shareholding ratio of institutional investors as an explanatory variable to measure stock risk. Based on the above reasons, the main research directions of this study can be summarized as follows: (1) Explore the correlation between dividend policy and stock risk (2) Explore the correlation between business cycle and stock risk (3) The impact of Institutional investor’s shareholding ratio on stock risk.
2 Literature Review Pettit (1972) explored the impact of cash dividend declarations on stock prices, and pointed out that there is a positive correlation between the increase in cash dividend declarations and abnormal returns in stock prices, and dividends have information content; in addition, when dividends decrease or increase significantly, the market reacts quickly. Banke et al. (1993) explored the relationship between stock dividends and cash dividends. According to the company’s past and current dividend policies, they divided into two sample groups of good and bad companies. The results showed that during the announcement period, the good Companies will generate positive abnormal returns; bad companies will generate negative abnormal returns, and the sample subgroups are significantly related to the direction of abnormal returns. Baskin (1989) uses the standard deviation of stock returns as a measure of volatility to explore the correlation between the dividend and stock price volatility; his research found that the dividend is significantly negatively correlated with stock price volatility. The dividend policy will directly affect the volatility of stock prices, rather than other factors that rely on the dividend policy to affect the volatility of the market. Rajverma et al. (2019) shows that dividends paid by management have decreased, resulting in lower valuations and increased individual risks. The research further shows that the concentration of family ownership and family control will affect the company’s performance and risk level.
Study on the Relationship Between Dividend, Business Cycle
405
Officer (1973) uses the standard deviation of the industrial production index as a substitute index for changes in economic activity. They found that the standard deviation of the industrial production index has an excellent explanatory power for the stock market risk, the higher the industrial production index, the lower the stock market risk. Mukherjee and Naka (1995) study the dynamic relationship between macroeconomic variables and the Japanese stock market. The research results show that there is a balanced relationship between overall economic variables and stock prices. Stock prices are positively correlated with exchange rates, money supply, industrial production index. Morelli (2002) uses the GARCH model and the VAR model to explore whether the Macroeconomic variable volatility affects the volatility of stock prices in the UK. The research results show that only the exchange rate is significant, and the overall variables obtained through regression analysis cannot explain the volatility of the stock market. He and Leippold (2020) they found that short-term risks vary with the business cycle, and it has a substantial predictive power for market excess returns and the value premium. Value stocks have a larger exposure to risk than growth stocks. Reilly and Wright (1984) used regression analysis to measure the impact of trading volume on stock price volatility. The empirical results found that: trading volume and stock market volatility showed an insignificant correlation. Chan et al. (1993) used regression analysis to explore the impact on stock prices before and after corporate transactions. The empirical results found that: institutional investors bought transactions, causing the stock price to rise. However, institutional investors’ selling transactions caused a small fluctuation in the stock price index. In addition, the size of capital and the scale of transactions are also the main reasons for the volatility of the stock market. Hamao and Mei (1995) tested the performance of foreign institutional investors and domestic institutional investors in Japanese stock market. The empirical results found that: foreign institutional investors has no better investment ability than domestic institutional investors. Malkiel and Xu (1999) used S&P 500 listed company as sample. Their research found that the higher the institutional investors’ holdings, the more volatility that impacts individual stocks. In addition, they also proved that the shareholding of institutional investors can effectively predict the industrial volatility of the investment portfolio. Choi et al. (2017) Using data on security holdings for institutional investors from different countries to test whether concentrated investment strategies result in excess risk-adjusted returns. They find that portfolio concentration is directly related to risk-adjusted returns for institutional investors.
3 Methodology 3.1 Hypothesis Hypothesis 1. Dividends are positively correlated with stock risk. This study explores whether dividends affect stock risks. According to previous literature, the dividend is positively correlated with the fluctuation of stock return, so the dividend has a positive relationship with the stock risk. Hypothesis 2. Business cycle is negatively correlated with stock risk.
406
Y.-S. Tsai et al.
As the business cycle progresses, various economic factors also change. If the economy starts to recover from the trough, corporate revenue will increase, and profit per share will also grow. The stock prices will also rise accordingly. Conversely, if the economy is in recession, the stock price will fall. Hypothesis 3. The shareholding ratio of institutional investor is positively correlated with stock risk. The investment proportion of institutional investor is increasing year by year, and its influence on Taiwan stocks is increasing day by day. The ratio of foreign ownership in individual stocks will affect the fluctuation of stock prices. Gompers and Metrick (2001) pointed out that compared with general investors, institutional investors prefer to invest in large companies and stocks with higher liquidity, and confirmed that stocks with a higher proportion of institutional investors will have higher returns.
3.2 Model The purpose of this research is to explore the impact of dividends, business cycle, and the shareholding ratio of institutional investors on the stock market fluctuations and investment risks in Taiwan’s stock market. Volatility refers to the deviation of market transaction prices that reflect the growth and decline of buyers and sellers. Therefore, this study uses standard deviation to measure stock price fluctuations to assess risk. Sharpe proposed the capital asset pricing model (CAPM) in 1964, and then through supplementary research by Lintner (1965) and Mossin (1966), it has become one of the most important theories in financial economics. In addition to using CAPM to determine the theoretical price of individual securities assets, investors can also use Beta Coefficient to measure the market risk coefficient of individual securities assets. Using the β coefficient value to compare the returns of individual assets allows investors to have multiple comparisons and choices, which facilitates the selection of ideal investment targets. Because OLS is easier to ignore the consideration of individual effects or time effect, this will be resulting in estimation errors (K’onya 2006). Hsiao (2003) pointed out that traditional OLS is not possible to estimate time-series and cross-sectional data together. However, Panel Data combined with time-series and cross-section data forms not only have the dynamic nature of time series data, but also have cross-section data that can show the characteristics of individual environments. It can provide more complete information than OLS. Therefore, this research adopts Panel Regression Analysis Test for research and exploration. The observation and selection samples of this research are the relevant variables of 200 listed companies from 2008 to 2014. Model 1. Stdt = α0 + α1 ∗ Assett + α2 ∗ Earningt + α3 ∗ Dividendt + α4 ∗ Indext + α5 ∗ Institutiont + εt Model 2. βt = β0 + β1 ∗ Assett + β2 ∗ Earningt + β3 ∗ Dividendt + β4 ∗ Indext + β5 ∗ Institutiont + εt
(1)
(2)
“Std” represents the standard deviation of return. “β” Represents the beta coefficient. “Dividend” represents the rate of change in dividend. When the coefficient is
Study on the Relationship Between Dividend, Business Cycle
407
significantly positive, it means that the sum of cash dividends and stock dividends is positively correlated with stock price fluctuations. The “Index” represents the weighted index of Taiwan’s stock price. This study uses this as the business cycle index. When the coefficient is significantly positive, it means that the business cycle and stock price fluctuations are positively correlated. “Institution” represents the shareholding ratio of institutional investors. When the coefficient is significantly positive, it means that the shareholding ratio of institutional investors is positively correlated with stock price fluctuations. “Asset” represents the scale of assets. This article uses assets as a measure of company scale. When the coefficient is significantly positive, it means that the scale of assets is positively correlated with stock price fluctuations. “Earning” represents the change in earnings per share.
4 Empirical Result This chapter is an empirical test of the relationship between dividend, business cycle, and the shareholding of institutional investors and stock risks. We taking Taiwan listed companies as the research object, collecting annual data of 200 companies that issued dividends and institutional investors all have shares from 2008 to 2014. Through the establishment of regression models, a step-by-step analysis method is used. Analyze and verify the hypothesis proposed in this research. 4.1 Descriptive Statistics This research design is based on Taiwan listed companies. The research period is from 2008 to 2014. The sample data is extracted from the Taiwan Economic Journal (TEJ). After excluding samples with incomplete sampling years and insufficient data in this study, there are a total of 200 companies and 1,200 valid data. The distribution of variable statistics for the entire sample is shown in the table (Table 1). Table 1. Descriptive statistics
The data in the table shows that the average rate of change in dividend payment is 0.2656. It can be seen that the total amount of dividends paid by the 200 companies in the sample each year between 2008 and 2014 has not changed much. The average of Taiwan’s stock index is 8296.703, the median is 8188.110, and the minimum is 4591.220;
408
Y.-S. Tsai et al.
it shows that most of Taiwan’s stock index falls between 7000 and 8500 points; and the minimum of 4591.220 is the 2008 financial crisis and the global stock market is in a downturn. The average shareholding ratio of institutional investors is 16.9865, ranging from 0.0008 to 79.3400, and the standard deviation is 17.4026; it shows that among the sampled companies, the shareholding ratio of institutional investors is quite different. The average of the standard deviation of the returns of the explained variables is 1.7348, which shows that the stock price of the selected sample fluctuates greatly. The average number of beta coefficients is 0.8042, and the median is 0.7960, which is smaller than the average number; it can be seen that the beta coefficients of most samples are less than 1. The results show that the volatility of individual stock returns in most samples is smaller than that of the market. 4.2 Individual Risk This section discusses the impact of various variables on stock price fluctuations. The regression used is the Eq. (1) in the empirical model. The explained variable is the standard deviation of return, this is used to measure the individual risk. The explanatory variables are the dividend, the Taiwan stock index, the shareholding ratio of institutional investors, the asset, and the change rate of earnings. The empirical results are as follows (Table 2). Table 2. Individual risk (standard devidation, std)
The result of “Model 1” show that dividend is significantly positively correlated with the individual risk (the standard deviation of return). This result is in line with the “Hypothesis 1”, the dividend is positively correlated with stock risk.” This is consistent with the conclusion put forward by the Pettit (1972), investors will base the stock price adjustment based on the declaration of dividends. The result of “Model 2” show that business cycle is significantly negatively correlated with the individual risk (the standard deviation of return). This result is in line with the
Study on the Relationship Between Dividend, Business Cycle
409
“Hypothesis 2”, business cycle is negatively related to stock risk.” And it is consistent with the conclusion of Officer (1973). “Model 3” include the shareholding ratio of institutional investors; the result is significantly positively correlated with the individual risk (the standard deviation of return). This result is consistent with “Hypothesis 3”, the shareholding ratio of institutional investors is positively correlated with stock risk. At the same time, it is consistent with the result of Malkiel and Xu (1999). The study found that the higher the institutional investor’s holdings, the greater the volatility, this will increase the risk of investment. “Model 4” also have the same result compare to the other model. 4.3 Market Risk This section discusses the impact of various variables on market risk. The regression used is the Eq. (2) in the empirical model. The explained variable is the beta coefficient of stock market, this is used to measure the market risk. The explanatory variables are the dividend, the Taiwan stock index, the shareholding ratio of institutional investors, the asset, and the change rate of earnings. The empirical results are as follows (Table 3). Table 3. Market risk (beta coefficient, β)
The results of the above table found that the dividend is significantly positively correlated with the lower market risk (β1). Conversely, the dividend has no significant correlation with the higher market risk (β2). Therefore, the company’s internal operating conditions are relatively stable, stock price fluctuations are more affected by market risk. It is consistent with the “Hypothesis 1”. The index of stock market has a significant negative correlation with the low market risk (β1), and the index of stock market has no significant correlation with the high market risk (β2). That is, for companies with low dividend, the stock market is more
410
Y.-S. Tsai et al.
affected by market risks than companies with higher dividend. This result is in line with the “Hypothesis 2”. The shareholding ratio of institutional investors is significantly positively correlated with the lower market risk (β1). Conversely, the shareholding ratio of institutional investors has no significant correlation with the higher market risk coefficient (β2). It can be inferred from this that for companies with a lower shareholding ratio of institutional investors is more affected by market risks than companies with a higher shareholding ratio of institutional investors. This article infers that if market risks increase, they will easily cause substantial transactions by institutional investors and affect stock prices. This result is in line with the “Hypothesis 3”.
5 Conclusion This article uses the standard deviation of the return as a measure of the individual risk, and at the same time uses the β coefficient as a measure of market risk; and divides the sample group into a lower market risk and a higher market risk. Compared with the previous literature that mostly focused on the return of stock, this article focuses on the risk of stock. We discuss the impact of dividend, business cycle, and shareholding ratio of institutional investors on stock risk. The empirical results of this paper are summarized as follows. First, the result show that the dividend is significantly positively correlated with individual risk and lower market risk, and the dividend is no significant correlation with higher market risk. Secondly, the business cycle is significantly negatively correlated with individual risk and lower market risk; and the impact of business cycle to higher market risk is not significant. In addition, the shareholding ratio of institutional investors has a positive and significant impact on individual risk and lower market risk. From this empirical result, it is inferred that the higher the shareholding ratio of institutional investors, the more likely the stock price to fluctuate. The limitation of this research is that not all listed companies pay dividends during the above-mentioned period, and some companies that do not have shareholding of institutional investors, and emerging companies that lack information due to insufficient data, are excluded from this research. Therefore, it is suggested that in the future, researchers may aim to join emerging companies or comparison with other markets in order to obtain more complete research results.
References Baskin, J.: Dividend policy and the volatility of common stocks. J. Portf. Manage. 159, 19–25 (1989) Chan, L.K.C., Lakonishok, J.: Institutional trades and intraday stock price behavior. J. Financial Econ. 33, 173–199 (1993) Choi, I.: Unit root tests for panel data. J. Int. Money Finance 20, 249–272 (2001) Choi, N., Fedenia, M., Skiba, H., Sokolyk, T.: Portfolio concentration and performance of institutional investors worldwide. J. Financial Econ. 123(1), 189–208 (2017)
Study on the Relationship Between Dividend, Business Cycle
411
Morelli, D.: The relationship between conditional stock market volatility and conditional macroeconomic volatility: empirical evidence based on UK data. Inte. Rev. Financial Anal. 11, 101–110 (2002) French, K.R., Schwert, G.W., Stambaugh, R.F.: Expected stock returns and volatility. J. Financial Econ. 19, 3–29 (1987) Gompers, P.A., Metrick, A.: Institutional investors and equity prices. Q. J. Econ. 116(1), 229–259 (2001) Hamao, Y., Mei, J.: Living with the enemies: an analysis of foreign and domestic investor behavior in the Japanese equity market. Columbia University Paine Webber Working Paper Series in Money Economics, and Finance, p. 20 (1995) Hamao, Y., Masulis, R., Ng, V.: Correlation in price changes and volatility across international stock markets. Rev. Financial Stud. 3, 281–303 (1990) He, Y., Leippold, M.: Short-run risk, business cycle, and the value premium. J. Econ. Dyn. Control 120, 1–36 (2020) Hsiao, C: Analysis of Panel Data. Cambridge University, Press, Cambridge (2003) K’onya, L.: Exports and growth: granger causality analysis on OECD countries with a panel data approach. Econ. Model. 23(6), 978–992 (2006) Lamont, O.: Earnings and expected returns. J. Finance 53, 1563–1587 (1998) Linter, J.: Distribution of incomes of corporations among dividends retained earnings and taxes. Am. Econ. Rev. 46, 97–113 (1956) Lintner, J.: Security prices, risk and maximal gains from diversification. J. Finance 20, 587–616 (1965) Litzenberger, R.H., Ramaswamy, K.: The effect of personal taxes and dividends on capital asset prices. Theory Empir. Evid. 7, 163–195 (1979) Malkiel, B.G., Xu, Y.: The structure of stock market volatility. Financial Research Centre Working paper, no. 154. Princeton University (1999) Mossin, J.: Equilibrium in a capital asset market. Econometrica 34(4), 68–83 (1966) Mukherjee, T.K., Naka, A.: Dynamic relations between macroeconomic variables and the Japanese stock market: an application of a vector error correction model. J. Financial Res. 18(2), 223–237 (1995) Officer, R.R.: The variability of the market factor of New York stock exchange. J. Bus. 46, 434–453 (1973) Pettit, R.R.: Dividend announcements, security performance, and capital market efficiency. J. Finance 27(5), 993–1007 (1972) Rajverma, A.K., Misra, A.K., Mohapatra, S., Chandra, A.: Impact of ownership structure and dividend on firm performance and firm risk. Manag. Finance 45(8), 1041–1061 (2019) Reilly, F.K., Wright, D.J.: Block trading and aggregate stock volatility. Financial Anal. J. 40, 54–60 (1984) Scholes, M.S.: The market for securities: Substitution versus price- pressure and the effects of information on share prices. J. Bus. 45(2), 179–211 (1972) Sharp, W.F.: Capital asset prices: a theory of market equilibrium under conditions of risk. J. Finance 19, 425–442 (1964)
The Choice Between FDI and Selling Out with Externality and Exchange Rate Chun-Ping Chang(B) , Yung-Shun Tsai, Khunshagai Batjargal, Hong Nhung Nguyen, and Shyh-Weir Tzang Department of Finance, Asia University, Taichung City, Taiwan {109035162,108035493}@live.asia.edu.tw
Abstract. The purpose of this study is to formalize the choice of foreign direct investment (FDI) and selling out assets for the domestic firm with externality relative to exchange rate effect from a dynamic perspective. This paper analyzes that the ratio of externality to exchange rate affects market entry of greenfield investment and exit of selling out the assets for the domestic firm. The predictable results suggest that the ratio changes play an important role at switching its production platform and selling out the assets that have an impact on the choice for better investment opportunities. The model theoretically differs from other similar models in several ways. Keywords: FDI · Selling out · Externality
1 Introduction Many traditional financial models are based on complete information and no externality. Some abnormal phenomena cannot be explained by the above models. However, Chamley and Gale (1994) discuss strategy delay with information revelation in the investment process. Chari and Jagannathan (1988) analyze bank run with two effects of information externality and reward externality. The lack of externality between options executive is a characteristic that most real options papers have in common. Grenadier (1996) develops an equilibrium game framework, and shows a possible explanation for why some markets may experience building booms in the face of declining demand and property value. Grenadier (1999) argues that a waterfall of information will appear when decision makers reveal two positive continuous signals. Décampsand Mariotti (2003) and Nielsen (2002) extends the duopoly result of Dixit and Pindyck (1994) for investments with positive externality. Corato and Maoz (2019) explicitly model the present externality and then let the social planner choose the cap level maximizing welfare. Differing from the above research directions, I propose firm either can set up a new plant abroad for searching for better resources in foreign markets with externality relative to exchange rate movement by switching or implement exit strategy to sell out the assets for getting more return than keep staying there while it operates in the domestic market. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 412–419, 2022. https://doi.org/10.1007/978-3-030-79728-7_41
The Choice Between FDI and Selling Out with Externality
413
The firm commonly capitalizes on business opportunities by engaging in FDI or selling out the assets because it is more attractive to the domestic firm. Making the related decisions are economic opportunities. It is like exercising the call (put) option, and the economic cost of the investment(disinvestment) is the strike price of the option (Dixit (1989), Dixit and Pindyck (1994)). I assume that there are two uncertain factors arising from externality and exchange rate. I develop a model with creative concepts and examine the impact of the ratio of uncertainties on the timing of decision. In this article, I account for timing problem with the ratio of externality to exchange rate and analyze optimal green investment and selling out timing problem and provide analytical numerical solutions. My model is further extended into more complex situations based on considerable releasing. The paper is organized as follows. Section 2 provides the literature review. Section 3 provides a theoretical model set up. Section 4 provide predictable results. Section 5 draws remarks and suggestions for future research.
2 Literature Review Katz and Shapiro (1985) find that the positive externality effect will be derived to accelerate the consistent consumption of this product by some consumers. Vives (1995) shows that the company that is issuing new shares has the information revelation in early implementation of the contract. Siddiqui and Takashima (2012) develop a sequential leader–follower two-stage ROG model for capacity expansion investments where there is knowledge spillover. Lee et al. (2013) finds that governments can focus on generating positive externality, as opposed to avoiding failure for individual firms. Kong and Kwok (2007) examine strategic investment games between two firms. Under positive externality, firms do not compete to lead. The firm engages in FDI to reduce costs and benefit from economies of scale, foreign factors of production, foreign raw material, foreign technology and reacting to exchange rate movements. The firm aims to obtain in attracting new sources of demand, entering profitable markets, exploiting monopolistic advantages, reacting to trade restrictions, diversifying internationally. Buckley and Casson (1976) create an overall framework to explain why MNEs choose FDI. Smets (1993) uses an option exercise game for an application in international finance. Campa (1993) finds a negative effect of exchange rate volatility on the number of foreign firms. Mello et al. (1995) composed an integrated model of real flexibility and financial hedging. Chen et al. (2003) analyze the impact of policy uncertainty on foreign direct investment strategies. Aray and Gardeazabal (2006) analyze the effect of changes in the parameters of the demand and cost functions, local and foreign taxation and the role of labor costs in deciding where to locate production. Gilroy and Lukas (2006) formalize the choice of market entry strategy for an individual multinational enterprise (MNE). Lukas (2007) formalize the optimal choice of market entry strategy for an individual multinational enterprise (MNE) from a dynamic perspective. Kijima and Tian (2013) examine the impact of international debt shifting and exchange rate uncertainty on investment and capital structure decisions of foreign subsidiary. Martzoukos and Zacharias (2013) find that strategy shifts are easier to observe in market environments of high growth and high volatility. Azevedo, et al. (2017) developed
a real options model that determines the optimal time to invest in an FDI project. Zhang and Xue (2017) examine the operational decisions of a multinational corporation in a cooperative framework, where the corporation is endowed with an abandonment option and shares its profit with the host country. Mankan et al. (2017) find that FDI in host countries with uncertain demand, strong competition and few barriers to trade is likely to be delayed relative to immediate investment.
3 Methodology
We assume that all uncertainty arises from exchange rate movements and the externality; the stochastic evolution of the firm's cash flows is crucially affected by both. The externality and the exchange rate, $X$ and $Y$, are described by the following stochastic differential equations:

$\dfrac{dX}{X} = \mu_x\,dt + \sigma_x\,dz_x \quad (1)$

$\dfrac{dY}{Y} = \mu_y\,dt + \sigma_y\,dz_y, \qquad \mu_y = r_h - r_f \quad (2)$

where $dz_x$ and $dz_y$ are increments of standard Wiener processes, normally distributed with zero mean and variance $dt$; $\mu_x$ and $\mu_y$ are the drifts, where $\mu_y = r_h - r_f$ is the interest-rate differential between the home country $h$ and the foreign country $f$ and captures the deviation of the exchange rate from its equilibrium path at each point in time; and $\sigma_x$ and $\sigma_y$ measure the respective volatilities.

Value of the All-Equity Firm
Consider an asset (contingent claim) as a perpetual entitlement to an income flow. Let $P_n$, $n = h, f$, denote the selling price in the home market of country $n$; $C_n$ the production cost in country $n$, which depends on the production platform in place; and $\tau_n$ the tax rate of country $n$. The net operating after-tax cash flow of the firm in country $n = h, f$ is worth

$\left(X\,\mathbf{1}_{n=h} + Y\,\mathbf{1}_{n=f}\right)\left(1 - \tau_n\right)\left(\dfrac{P_n}{r_n} - \dfrac{C_n}{r_n}\right) \quad (3)$

where $\mathbf{1}_n$ is the indicator function of the firm being located in country $n$. When the all-equity firm exits the home industry into one of the two states, by switching to the other production platform or by selling out, the options vanish. As a result, the firm values in the two states can be expressed as follows:

$V_f(X, Y) = Y\left(1 - \tau_f\right)K(1 - \alpha), \qquad R < R_d \quad (4)$

where $Y(1 - \tau_f)K$ is the residual value after tax and $(1 - \alpha)$ is the recovery rate. The intuition behind Eq. (4) is that foreigners' willingness to take over is weak when the home currency appreciates. When the externality is large and the profit relatively small, operating the foreign market yields no great benefit, so reselling becomes more likely: selling the assets to another party at a certain value is the relatively good choice.

$V_f(X, Y) = Y\left(1 - \tau_f\right)\left(\dfrac{P_f}{r_f} - \dfrac{C_f}{r_f}\right), \qquad R > R_g \quad (5)$

The intuition behind Eq. (5) is similar: a negative externality together with an exchange-rate depreciation triggers the switch to the other production platform. The determinants of the values of these options thus shape the investment and disinvestment decisions. For some of the proofs it is convenient to study two-sided hitting claims whose value depends on a geometric Brownian motion's first excursion from a strip, e.g., the first time the ratio $R$, starting in the interior, exits the open interval $(R_d, R_g)$ to either side. For any values of $X$ and $Y$, the payoffs of the selling-out option and the switching option are $D_l(X, Y)$ and $G_f(X, Y)$, respectively. Using real-option pricing techniques, we compute the value of the options held by the firm. Let $R = X/Y$ be the initial ratio of the externality to the exchange rate before switching or selling out. The payoffs from the options to sell out the assets and to switch satisfy:

$D_l(X, Y) = Y K (1 - \alpha) - X\left(\dfrac{P_h}{r_h} - \dfrac{C_h}{r_h}\right) \quad (6)$

$G_f(X, Y) = Y\left(\dfrac{P_f}{r_f} - \dfrac{C_f}{r_f}\right) - X\left(\dfrac{P_h}{r_h} - \dfrac{C_h}{r_h}\right) \quad (7)$

The value of the firm embedding the two options is defined as follows:

$V_h(X, Y) = X\left(1 - \tau_h\right)\left(\dfrac{P_h}{r_h} - \dfrac{C_h}{r_h}\right) + D_l(X, Y) + G_f(X, Y), \qquad R_d < R < R_g$
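As a hedged illustration of the timing problem, the sketch below simulates the externality X and the exchange rate Y of Eqs. (1)-(2) as geometric Brownian motions and records which boundary the ratio R = X/Y hits first. All parameter values, thresholds and names are illustrative assumptions, not calibrated results from the paper.

```python
# Monte Carlo sketch of the two-sided first-passage problem for R = X / Y.
# Parameter values are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

def first_exit(mu_x=0.02, sigma_x=0.25, r_h=0.05, r_f=0.03, sigma_y=0.15,
               R0=1.0, R_d=0.6, R_g=1.8, dt=1 / 252, horizon=30.0):
    """Simulate X and Y until R = X / Y leaves (R_d, R_g); return the exit
    time and which decision it triggers, or (None, None) within the horizon."""
    mu_y = r_h - r_f                      # drift of Y = interest-rate differential
    x, y = 1.0, 1.0 / R0                  # normalised so that R(0) = R0
    for step in range(1, int(horizon / dt) + 1):
        zx, zy = rng.standard_normal(2)   # independent Wiener increments
        x *= np.exp((mu_x - 0.5 * sigma_x**2) * dt + sigma_x * np.sqrt(dt) * zx)
        y *= np.exp((mu_y - 0.5 * sigma_y**2) * dt + sigma_y * np.sqrt(dt) * zy)
        R = x / y
        if R <= R_d:
            return step * dt, "sell out"  # payoff D_l is collected
        if R >= R_g:
            return step * dt, "switch"    # payoff G_f is collected
    return None, None

results = [first_exit() for _ in range(500)]
exited = [s for _, s in results if s is not None]
print(f"paths exiting within horizon: {len(exited) / 500:.0%}, "
      f"of which sell-outs: {exited.count('sell out')}")
```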
where the selling-out threshold is $R_d$ and the switching threshold is $R_g$.
Arrow–Debreu Prices
Denote by $T_d$ the first passage time to the disinvestment threshold $R_d$ and by $T_g$ the first passage time to the expansion threshold $R_g$. $H(1, R) = \sup \mathbf{1}_{T_d} \dots$

Initial returns (%) of the 161 sample IPOs by year and offering group (>45 vs. ≤45)

Year       Group  N   Initial returns (%)
2010       >45     8  53.56  45.76  54.66  53.55  60.35
           ≤45    12  11.60   7.47  12.73  14.22  17.61
2011       >45    15   4.18   1.77  −0.98  −0.73  −0.20
           ≤45    22   0.11  −4.72  −4.11  −5.27  −3.62
2012       >45    10  12.51   6.74   6.96   6.42   6.77
           ≤45     9  19.86  18.39  18.11  14.48  13.92
2013       >45    13  32.75  37.29  36.55  37.25  38.58
           ≤45    14  37.55  33.74  36.19  37.26  33.67
2014       >45     8  25.57  26.91  25.97  20.71  20.08
           ≤45     8  29.01  29.58  26.44  27.71  28.32
2015       >45    13  27.01  24.88  28.46  27.74  28.87
           ≤45     7  63.80  60.55  57.36  60.80  75.94
2016       >45    13  13.15   7.88   7.06   7.57   6.42
           ≤45     9  20.56  22.67  22.09  23.66  30.35
2010–2016  >45    80  22.11  19.82  20.46  19.88  20.85
           ≤45    81  28.90  22.69  20.89  21.60  22.05
Table 5. Descriptive statistics

                          Mean     Median    Std. Dev.  Min        Max
Initial return            22.36%   15.29%    42.08%     −50.89%    418.52%
Initial market return     −0.05%    0.03%     0.75%      −2.75%      1.64%
Initial abnormal return   22.40%   15.41%    42.02%     −52.13%    417.39%
Raw return - 3 year       24.39%   −4.11%   114.98%     −91.60%   1073.16%
Market return - 3 year     7.29%    6.71%    12.09%     −21.49%     40.21%
MAR - 3 year              17.10%  −10.08%   113.71%    −113.71%   1072.65%
Table 5 shows the descriptive statistics for the IPO return variables. The sample consists of 161 initial public offerings issued in Taiwan in 2010–2016. The mean initial return of IPOs in Taiwan is 22.36%. The three-year raw return of Taiwan IPOs is 24.39%, while the average return of the market over the same period is 7.29%. The high standard deviations reflect the Taiwanese stock market's high volatility. Figure 1 shows that the raw return of IPO stocks increases over the three years and stays remarkably above the market index return. By applying the BHAR and CAR approaches, abnormal returns are generated for the IPOs on the Taiwan stock exchange over holding periods of 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33 and 36 months.
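As a hedged sketch of how these two measures can be computed, assume (purely for illustration) that `ipo_ret` is a 36-month by 161-firm DataFrame of monthly IPO returns and `mkt_ret` a Series of the corresponding monthly market index returns:

```python
# Minimal sketch of the CAR and BHAR measures; variable names and data layout
# are illustrative assumptions, not the authors' code.
import pandas as pd

def car_series(ipo_ret: pd.DataFrame, mkt_ret: pd.Series) -> pd.Series:
    """Market-adjusted abnormal returns, averaged across firms per event month
    (AAR), then cumulated through event time (CAR, as in Table 6)."""
    ar = ipo_ret.sub(mkt_ret, axis=0)    # AR_{i,t} = R_{i,t} - R_{m,t}
    return ar.mean(axis=1).cumsum()

def bhar(ipo_ret: pd.DataFrame, mkt_ret: pd.Series, horizon: int) -> pd.Series:
    """Per-firm buy-and-hold abnormal return over `horizon` months:
    BHAR_i = prod(1 + R_{i,t}) - prod(1 + R_{m,t})."""
    bh_firm = (1 + ipo_ret.iloc[:horizon]).prod()   # one value per firm
    bh_mkt = (1 + mkt_ret.iloc[:horizon]).prod()    # scalar market BHR
    return bh_firm - bh_mkt
```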
Fig. 1. Returns of IPO stock, market and market adjusted average abnormal return for holding period from 3 to 36 months.
Cumulative Abnormal Return (CAR)
When CAR takes a positive (negative) value, it indicates that the IPOs outperform (underperform) the market portfolio. Table 6 shows the average and cumulative average abnormal returns over the 36 months after the listing date, in quarterly steps, for the full sample of 161 IPOs issued between 2010 and 2016, separated into 1-, 2- and 3-year periods to estimate the market-adjusted AAR and the CAR. The results show that CAR is positive over the 3-year period: Taiwanese IPO companies outperformed the market in the three years after listing. Unlike other studies, the results reveal that IPOs on the TWSE outperform persistently over the long run. Relative to the market benchmark, the average abnormal returns are significant by t-statistic in the 6th, 9th, 12th, 21st, 24th, 27th, 30th, 33rd and 36th months, and the cumulative average abnormal returns are significant from the 9th through the 36th month as well.
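The t-statistics in Table 6 can be approximated with a conventional cross-sectional construction (assuming independence across firms, which is an assumption of this sketch since the paper does not spell out its variance estimator), reusing the abnormal-return frame `ar` from the previous sketch:

```python
# Hedged sketch of event-study t-statistics for AAR and CAR.
import numpy as np
import pandas as pd

def aar_car_tests(ar: pd.DataFrame) -> pd.DataFrame:
    n = ar.shape[1]                           # number of IPO firms (161 here)
    aar = ar.mean(axis=1)                     # AAR_t across firms
    se = ar.std(axis=1, ddof=1) / np.sqrt(n)  # standard error of AAR_t
    car = aar.cumsum()
    se_car = np.sqrt((se**2).cumsum())        # Var(CAR_t) = sum of Var(AAR_s)
    return pd.DataFrame({"AAR": aar, "t_AAR": aar / se,
                         "CAR": car, "t_CAR": car / se_car})
```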
Table 6. Market adjusted average abnormal return and cumulative abnormal return

        Month  AAR (%)  t-statistic  CAR (%)  t-statistic
Year-1      3    12.49      1.58       12.49     0.91
            6    14.99      1.80*      27.49     1.33
            9    13.34      1.73*      40.83     1.74*
           12    13.57      1.86*      54.40     2.12**
Year-2     15    11.30      1.48       65.70     2.18**
           18    11.17      1.48       76.86     2.13**
           21    15.42      1.82*      92.28     2.34**
           24    18.34      2.00**    110.63     2.42**
Year-3     27    21.90      2.06**    132.52     2.38**
           30    26.44      2.41**    158.97     2.62***
           33    23.30      2.23**    182.27     2.98***
           36    22.93      2.19**    205.20     3.25***
Note: *** statistically significant at 1% level, ** statistically significant at 5% level, * statistically significant at 10% level.
Buy and Hold Abnormal Return (BHAR)
The buy-and-hold return is considered better than arithmetic mean returns because it avoids the negative-return problems of long-run abnormal returns [14]. A positive BHAR indicates that the IPO outperformed the market; a negative value indicates that it underperformed. Table 7 reports the long-run performance of the full sample using the equally weighted buy-and-hold return method. Columns 4 and 6 report the raw buy-and-hold returns for our IPO sample and the buy-and-hold returns for the market index, while column 8 reports the buy-and-hold abnormal return (BHAR), calculated as the difference between the raw returns and the market returns. Interestingly, the results show that IPO companies listed on the Taiwan stock market outperform the market over the three years after going public, with BHARs of 11.99%, 12.55% and 22.23%. The results for years 1 and 3 are statistically significant at the 10% and 5% levels. Our results are consistent with the CAR results: IPO companies listed on the TWSE tend to outperform.

Table 7. Buy and hold abnormal return (BHR, BHAR)
      n    BHR-stock (%)  t-stat  BHR-market (%)  t-stat  BHAR (%)  t-stat
yr-1  161      14.58       1.97        2.59        5.00    11.99    1.63*
yr-2  161      19.35       2.42        6.80        7.26    12.55    1.59
yr-3  161      34.43       3.37       12.20       12.74    22.23    2.19**
* and ** denote 10% and 5% significance level respectively.
5 Conclusion
This study examines the short- and long-term IPO returns of companies listed on the Taiwan stock market during 2010–2016. The sample includes 161 companies listed on the TWSE. We use the market-adjusted excess return, CAR and BHAR to analyze short-run and long-run performance. A year-wise analysis shows that investors earned high positive returns over the first 20 trading days. These findings provide strong evidence of short-run underpricing in the Taiwan IPO market. We also find that underpricing was greater for smaller offerings than for larger ones, except in 2010 and 2011; the Eurozone debt crisis possibly affected the results in those two years. Given the results of both the CAR and the BHAR methods on aftermarket performance, the performance of the Taiwan IPO market is not consistent with previous findings from other markets. Further evidence is needed to support this Taiwan anomaly beyond this initial analysis.
References 1. Jain, B.A., Kini, O.: The life cycle of initial public offering firms. J. Bus. Finance Account. 26(9–10), 1281–1307 (1999) 2. Ritter, J.R.: The long-run performance of initial public offerings. J. Finance 46(1), 3–27 (1991) 3. Loughran, T., Ritter, J.R.: The new issues puzzle. J. Finance 50(1), 23–51 (1995) 4. Chen, A.: The long-run performance puzzle of initial public offerings in Taiwan: empirical findings from various models. J. Financial Stud. 9, 1–20 (2001) 5. Ritter, J.R., Welch, I.: A review of IPO activity, pricing, and allocations. J. Finance 57(4), 1795–1828 (2002) 6. Gounopoulos, D., Nounis, C., Stylianides, P.: The short and long term performance of initial public offerings in the Cyprus stock exchange. J. Financial Decis. Mak. 4(1) (2007) 7. Ahmad-Zaluki, N.A., Kect, L.B.: The investment performance of MESDAQ market initial public offerings. Asian Acad. Manage. J. Account. Finance Penerbit Universiti Sains Malays. 8(1), 1–23 (2012) 8. Wen, Y.-F., Cao, M.H.: Short-run and long-run performance of IPOs: evidence from Taiwan stock market. J. Finance Account. 1(2), 32–40 (2013) 9. Fama, E.F.: Market efficiency, long-term returns, and behavioral finance. J. Financial Econ. 49(3), 283–306 (1998) 10. Gompers, P., Lerner, J.: The really long-run performance of initial public offerings: the preNasdaq evidence. J. Finance 58(4), 1355–1392 (2003) 11. Mitchell, M., Stafford, E.: Managerial decisions and long-term stock price performance. J. Bus. 73(3), 287–329 (2000) 12. Lee, P.J., Taylor, S.L., Walter, T.S.: Australian IPO pricing in the short and long run. J. Bank. Finance 20(7), 1189–1210 (1996) 13. Conrad, J., Kaul, G.: Long-term market overreaction or biases in computed returns. J. Finance 48(1), 39–63 (1993) 14. Ljungqvist, A.P.: Pricing initial public offerings: further evidence from Germany. Eur. Econ. Rev. 41(7), 1309–1320 (1997)
Analysis of the Causal Relationship Among Diversification Strategies, Financial Performance and Market Values by the Three-Stage Least Squares (3SLS) Ying-Li Lin, Kuei-Yuan Wang(B) , and Jia-Yu Chen Department of Finance, Asia University, No. 500, Lioufeng Road, Wufeng District, 41354 Taichung, Taiwan [email protected]
Abstract. According to many studies, product diversification will benefit companies and motivate companies to carry out business diversification, including improving market competitiveness, spreading business risks and reducing ruin probabilities. In contrast to previous studies which only focus on the factors influencing diversification and the benefits of diversification, this study intends to analyze the causal relationship among diversification strategies, financial performance, and market values by the three-stage least squares (3SLS).
1 Introduction
Diversification, as a method of diversified operations, widens the range of investment and is more likely to produce economies of scale. Scholars generally believe that transnational enterprises perform better than purely domestic ones. Gyan et al. (2017) found that enterprises that frequently use diversification strategies in transnational investments perform better than those whose investments are limited to the home country. From the perspective of diversified operation, broad product lines or investments in various industries can produce economies of scale. Through product diversification, enterprises can obtain departmental resources, share the internal capital market, enter new markets under the same brand, and spread business risks by tie-selling various goods (Cerrato 2006). Diversification allows companies to enter new business areas. In order to reflect the value of these resources in enterprise performance, enterprises will expand product lines by using these resources when implementing diversification strategies for products in their own industries, and commit to specialized research and development supported by their systems and size. Product diversification also allows enterprises to strengthen their weaker links. Hence, this paper explores the following questions: (1) the influence of diversification strategies on companies' overall business performance; (2) the causal relationship among the factors influencing diversification strategies, namely diversification degree, financial performance and market value.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 431–437, 2022. https://doi.org/10.1007/978-3-030-79728-7_43
2 Literature Review
2.1 Relationship Between Company Size and Degree of Product Diversification
Company size is reflected in an enterprise's resource capacity and influences its competitiveness in the market (Chiao et al. 2008). Size is usually measured by total assets and turnover, and large scale is achieved through consistent long-term cumulative profits. Strong risk-bearing capacity makes it easier for enterprises to achieve anticipated targets. In general, large-scale enterprises have more resources and are usually superior in production, so they can expand their businesses based on abundant resources. According to studies on investment layout in mainland China, the larger an enterprise is, the more diversified its products will be, producing greater benefits and effective economies of scale. Hence, the following hypothesis is proposed:
H1a: Company size has a positive effect on the degree of the product diversification strategy.
2.2 Relationship Between Management Team Heterogeneity and Degree of Product Diversification
When a diversified, heterogeneous management team implements new ideas in the right direction, it can carry out enterprise strategies with positive effects, which contributes to the effectiveness and development of decisions (Chemmanur and Paeglis 2005). When a team has multiple specialties, it considers problems more comprehensively, so decisions on product diversification can achieve the most appropriate results under the leadership of a professional team (Lu et al. 2014). Hence, the following hypothesis is proposed:
H1b: A heterogeneous management team is more likely to raise the degree of the product diversification strategy.
2.3 Influences of Diversification Degree on Financial Performance
The interaction between product diversification and internationalization has positive effects on financial performance (Tallman and Li 1996). International strategies can significantly reduce debt ratios. The multiplicative interaction between corporate product diversification and internationalization is positively related to financial performance (Chao et al. 2011). According to Chien (2012), corporate product diversification has positive influences on corporate performance. Hence, the following hypothesis is proposed:
H2: Diversification degree is significantly and positively related to financial performance
2.4 Influences of Financial Performance on Market Values
Through diversification strategies, in addition to spreading risks, enterprises can improve their use of resources and their operational efficiency, and improve their recognition in the international market, so as to increase confidence and motivation and indirectly enhance their market values. In the process of internationalization, companies operating businesses in two industries that fully implement diversification strategies improve corporate performance and spread risks, thus enhancing corporate values. Braun and Traichal (1999) pointed out that trade between different countries drives internationalization by reducing barriers. Internationalization can thus reduce trade barriers, increase market integration and benefit companies. Hence, the following hypothesis is proposed:
H3: Financial performance is significantly and positively related to market values
2.5 Influences of Diversification Degree on Corporate Values
Lang and Stulz (1994) identified that the values of highly diversified companies are significantly lower than those of single-segment companies, and that related product diversification performs better than unrelated product diversification. Based on the multiplicative interaction among diversification strategies, corporate values increase as internationalization deepens. Investors expect that, in addition to spreading risks, strategies of product diversification and internationalization can, through international recognition, improve market share and increase confidence in companies, so as to enhance corporate values in the capital market. According to Pantzalis (2001), product diversification is positively related to corporate values, and transnational enterprises in developing countries have significantly higher corporate values than those in developed countries. Hence, according to the above literature review, the following hypothesis is proposed:
H4: Diversification degree is directly and positively related to corporate values
Based on the above discussion, this study proposes the conceptual framework in Fig. 1.
Fig. 1. Conceptual framework: company size (H1a) and management team heterogeneity (H1b) affect diversification degree; diversification degree affects financial performance (H2) and corporate values (H4); financial performance affects corporate values (H3).
3 Research Methods
3.1 Research Period and Data Sources
The research period is the 10 years from 2010 to 2019, and the subjects are companies listed on the Taiwan Stock Exchange. Samples are selected according to the following criteria: (1) the subjects are companies in the electronics industry publicly listed on the Taiwan Stock Exchange; (2) a sample firm must have at least one overseas subsidiary. The product diversification data of listed companies are obtained from the notes to the annual financial statements and annual consolidated financial statements of listed companies in the "professional securities database".
3.2 Variable Definition
3.2.1 Dependent Variable: Corporate Value Measurement Variable (TQ)
Pantzalis (2001) measured the values of transnational companies by Tobin's Q as a proxy variable, and found that product diversification was positively related to corporate values. Since Tobin's Q can measure corporate performance, predict investment opportunities, and measure the values of intangible and technological assets by considering deferred benefits, a specification with a time lag is generally considered realistic. Hence, the natural logarithm of the corrected Tobin's Q, deferred for one period, is used to measure corporate value (MV), defined as:

$MV_{i,t+1} = \ln\left(\dfrac{CS_{i,t} + PS_{i,t} + D_{i,t}}{TA_{i,t}}\right)$

where CS is the value of outstanding common shares, PS is the value of outstanding preferred shares, D is the debt value, and TA is the total asset value.
3.2.2 Independent Variables
1. Financial performance (FP). This paper uses ROA (return on assets) and ROE (return on equity) to measure performance (Hitt 1997; Chkir and Cosset 2001), and analyzes whether product diversification can maximize the benefits of invested assets and the overall profits of multinational operations, based on the accounting profits transnational enterprises obtain from asset efficiency and increasing returns on shareholders' equity.
2. Diversification degree (FATA).
To measure the diversification degree, this study mainly follows the classification method proposed by Rumelt (1974). In addition, Sullivan (1994) proposed five significantly reliable variables to measure the degree of international diversification, two of which are: (1) a performance measure, the proportion of foreign sales in total sales (FSTS); and (2) a structural measure, the ratio of foreign assets to total assets (FATA). Hence, this study measures the diversification degree by dividing foreign assets by total corporate assets (FATA).
3. Management team heterogeneity: degree of education (DEG). According to Castle and Jane (1997), employees with higher education have strong information-processing abilities and can accept innovative thinking. Hence, this study classifies managers into four categories by educational level, namely doctor's degree, master's degree, university degree and below university degree, weighted 4, 3, 2 and 1 respectively, calculates the weighted average educational level, and uses the standard deviation to measure management team heterogeneity.
4. Firm size (SIZE). Following the measurement method of Chatterjee and Wernerfelt (1991), this study takes the natural logarithm of total corporate capital as the measure of firm size, so as to meet the normality assumption.
3.2.3 Control Variables
The control variables in this paper are corporate growth (GROWTH) (Nagaoka 2007), debt ratio (DEBT) (Nagaoka 2007), outside directors (INDEDIR) (Chatterjee et al. 2003) and independent directors holding concurrent posts (PART) (Fich and Shivdasani 2006).
3.3 Empirical Model
Following the structure in Fig. 1, company size, management team heterogeneity, diversification strategy, financial performance and market value are related through three-stage least squares regression. To test whether they are interdependent, this study estimates the simultaneous equations by three-stage least squares (3SLS), which has good properties of unbiasedness and consistency and can be efficient. The simultaneous equations are set up as follows:

$TQ_{i,t+1} = \alpha_{11} + \alpha_{21} FP_{i,t} + \alpha_{31} FATA_{i,t} + \alpha_{41} GROWTH_{i,t} + \alpha_{51} DEBT_{i,t} + \varepsilon_{i1} \quad (1)$

$FP_{i,t} = \alpha_{12} + \alpha_{22} FATA_{i,t} + \alpha_{32} INDEDIR_{i,t} + \alpha_{42} PART_{i,t} + \varepsilon_{i2} \quad (2)$

$FATA_{i,t} = \alpha_{13} + \alpha_{23} DEG_{i,t} + \alpha_{33} SIZE_{i,t} + \varepsilon_{i3} \quad (3)$
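As a hedged sketch of how such a system can be estimated in practice, assuming the Python `linearmodels` package and its `IV3SLS` system estimator (with `TQ1` standing for the one-period-ahead Tobin's Q, a hypothetical input file name, and instrument choices that are illustrative rather than the authors' exact specification):

```python
# 3SLS estimation sketch for Eqs. (1)-(3); file and column names are
# hypothetical placeholders.
import pandas as pd
from linearmodels.system import IV3SLS

equations = {
    # Eq. (1): corporate value; FP and FATA treated as endogenous and
    # instrumented by the exogenous variables of Eqs. (2)-(3).
    "value": "TQ1 ~ 1 + GROWTH + DEBT + [FP + FATA ~ INDEDIR + PART + DEG + SIZE]",
    # Eq. (2): financial performance with FATA endogenous.
    "performance": "FP ~ 1 + INDEDIR + PART + [FATA ~ DEG + SIZE]",
    # Eq. (3): diversification degree, fully exogenous.
    "diversification": "FATA ~ 1 + DEG + SIZE",
}

df = pd.read_csv("firm_year_panel.csv")  # hypothetical input panel
res = IV3SLS.from_formula(equations, df).fit()
print(res)  # coefficients and p-values comparable to Table 1
```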
4 Empirical Results
According to Table 1, the empirical results show that diversification degree has no significant influence on corporate values, debt ratio has a significantly negative influence on corporate values, diversification degree has a significantly negative influence on financial performance, and outside directors have a significantly negative influence on financial performance. However, independent directors holding concurrent posts have a positive influence on financial performance. Finally, the higher the education level, the higher the diversification degree.

Table 1. Empirical results.

          TQ               FP               FATA
FP        −2.63 (0.48)
FATA       0.08 (0.51)     −0.04 (0.03)**
GROWTH     0.04 (0.88)
DEBT      −1.07 (0.00)***
INDEDIR                    −0.08 (0.00)***
PART                        0.01 (0.06)*
DEG                                          0.53 (0.00)***
Note: The value in parentheses is the p-value. ***, **, * denote coefficient estimates that are significant at the 1%, 5%, and 10% levels, respectively.
5 Conclusion
According to the empirical results, senior managers are highly educated, but a higher diversification degree does not help improve financial performance. In addition, the more independent directors there are, the greater the negative influence on financial performance, while the more independent directors hold concurrent posts, the greater the positive influence on financial performance. This phenomenon is contrary to expectations, and the corporate governance mechanism that fails to deliver the expected result deserves further investigation. Given that the results are not as expected, future studies are advised to use different measurement indicators for cross-validation in order to select the most appropriate operational definitions of the variables.
Acknowledgments. This research was supported by the Ministry of Science and Technology of the Republic of China under contract MOST 109-2813-C-468-098-H.
References Braun, G.P., Traichal, P.A.: Competitiveness and the convergence of international business practice: North American evidence after NAFTA. Glob. Finance J. 10(1), 107–122 (1999) Castle, N.G., Jane, B.H.: Top management team characteristics and innovation in nursing homes. Gerontologist 37(5), 572–580 (1997) Cerrato, D.: The multinational enterprise as an internal market system. Int. Bus. Rev. 15(3), 253–277 (2006) Chao, L., Pan: Study the influences of internationalization depth on financial performance and corporate value. East-Asia Rev. 473, 49–71 (2011) Chatterjee, S., Harrison, J.S., Bergh, D.D.: Failed takeover attempts, corporate governance and refocusing. Strateg. Manage. J. 24, 87–96 (2003) Chatterjee, S., Wernerfelt, B.: Strateg. Manage. J. 12(1), 33–48 (1991) Chemmanur, T.J., Paeglis, I.: Management quality, certification, and initial public offerings. J. Financial Econ. 76(2), 331–368 (2005) Chkir, I.E., Cosset, J.C.: Diversification strategy and capital structure of multinational corporations. J. Multinatl. Financi. Manage. 11, 163–179 (2001) Chiao, Y.C., Yu, C.M.J., Li, P.Y., Chen, Y.C.: Subsidiary size, internationalization, product diversification, and performance in an emerging market. Int. Mark. Rev. 25(6), 612–633 (2008) Chien: An analysis of diversification-performance in coffee industry of Taiwan and ChinaEvidence from B coffee company. Unpublished master’s dissertation, Soochow University (2012) Pantzalis, C.: Does location matter? An empirical analysis of geographic scope and MNC market valuation. J. Int. Bus. Stud. 32(1), 133–155 (2001) Fich, E.M., Shivdasani, A.: Are busy boards effective monitors? J. Finance 61(2), 689–724 (2006) Gyan, A.K., Brahmana, R., Bakri, A.K.: Diversification strategy, efficiency, and firm performance: insight from emerging market. Res. Int. Bus. Finance 42, 1103–1114 (2017) Hitt, M.A., Hoskisson, R.E., Kim, H.: International diversification: effects on innovation and firm performance in product-diversified firms. Acad. Manage. (1997) Lang, L.H.P., Stulz, R.M., et al.: Tobin’s q, corporate diversification, and firm performance. J. Polit. Econ. 102(6), 1248–1280 (1994) Lu, et al.: A new class of ubiquitin-Atg8 receptors involved in autophagy and polyQ protein clearance. Autophagy 10(12), 2381–2382 (2014) Nagaoka, S.: Assessing the R&D management of a firm in terms of speed and science linkage: evidence from the US patents. J. Econ. Manage. Strategy 16(1), 129–156 (2007) Pantzalis: J. Bus. Stud. 32, 793–812 (2001) Rumelt, R.P.: Strategy, Structure and Economic Performance, pp. 9–11. Harvard University Press, Boston (1974) Sullivan, D.: Measuring the degree of internationalization of a firm. J. Int. Bus. Stud. 25, 325–342 (1994) Tallman, S., Li, J.: Effects of international diversity and product diversity on the performance of multinational firms. Acad. Manage. (1996)
Macroeconomic Variables and Investor Sentiment Mei-Hua Liao1(B) , Chun-Min Wang1 , and Ya-Lan Chan2(B) 1 Department of Finance, Asia University, Taichung, Taiwan, People’s Republic of China
[email protected]
2 Department of Business Administration, Asia University, Taichung, Taiwan,
People’s Republic of China [email protected]
Abstract. Many studies have shown that investor behavior is closely related to macroeconomic variables. In the Taiwan stock market, foreign investors, mutual funds, and dealers have a huge impact on the stock market, and the differences in their trading behavior come from different investment needs. In order to explore the relationship between the investor sentiment of the three major institutional investors and macroeconomic variables, this paper sets the following hypotheses: (1) the investor sentiment of foreign investors, mutual funds, and dealers has a positive impact on macroeconomic variables; (2) macroeconomic variables have a positive impact on the investor sentiment of the three major institutions; (3) in the bear market, the sentiment of the three major institutional investors has a positive impact on macroeconomic variables. The purpose of this article is to fill this gap in the literature.
1 Introduction
Macroeconomic variables reflecting economic activity provide little information regarding future investor sentiment. From the perspective of long-term economic development, when production efficiency and resource utilization rise, economic standards rise and the economy expands; after reaching a peak, economic activity generally slows down and business activity shrinks, forming a cycle. The relationship between macroeconomic variables and investor sentiment has always been a topic of concern in financial academic circles. Su et al. [1] emphasize the relationship between the stock market and the overall economy. Traditional financial theory puts forward the efficient market hypothesis, which holds that investors are rational and that their emotional changes are reflected in stock prices in the shortest time. However, Kamare et al. [2] show that stock market prices often over- or under-react relative to real prices, so the market is not efficient. Meanwhile, behavioral finance emerged, holding that people are boundedly rational, adding social-science perspectives and trying to explain phenomena that traditional efficient market theory cannot; investor sentiment is one of its prominent research topics.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 438–443, 2022. https://doi.org/10.1007/978-3-030-79728-7_44
In the early days, some scholars, such as Solt and Statman [3], believed that investor sentiment was not related to the overall economy. However, more studies have found that investor sentiment is significantly related to the overall economy. For example, Lee et al. [4] found that investor sentiment and market volatility move in opposite directions: when investors are more optimistic (pessimistic), market volatility becomes smaller (larger). Beyond the evidence that investor sentiment affects market prices, Baker and Wurgler [5] survey the bubble episodes that occurred in the United States in recent decades and find that such waves of market sentiment often lead to stock market crashes, after which people turn to more stable stocks. The investor structure of Taiwan's stock market is dominated by individual investors, but DeVault et al. [6] point out that institutional investor sentiment better tracks market changes. In the Taiwan stock market, foreign investors, mutual funds, and dealers have had a huge impact in recent years. The three major institutional investors command substantial funds and large research teams, so they have an advantage over individual investors in interpreting information, and their sentiment and investment behavior have a greater impact on the market. Most previous studies on investor sentiment discuss its impact on returns, but seldom the interaction between investor sentiment and the overall economy. Yet when observing the stock market and the overall economy, the relationship between the two cannot be ignored. This study focuses on the Taiwan stock market and analyzes the respective roles of the sentiment of foreign investors, mutual funds, and dealers in overall economic changes. We therefore explore the relationship between overall economic changes and the investor sentiment of foreign investors, mutual funds, and dealers. We also examine this relationship separately in bear and bull markets.
2 Literature Review and Hypotheses
In line with the foregoing research objectives, we first discuss the literature on investor sentiment, institutional investors and macroeconomic variables, and then state the hypotheses that follow from it.
2.1 Investor Sentiment Related Literature
Behavioral finance is a recent research topic that has attracted the attention of many researchers. Many studies point out that when there are noise traders in the market, traders with high (low) sentiment hold more (fewer) stocks, thereby increasing (decreasing) market turnover. In addition, Chou et al. [7] found that investor sentiment is one of the irrational factors in noise trading. Peress and Schmidt [8] point out that the investment decisions of noise traders are affected by past performance [9, 10]. Choi and Robertson [11] find that noise traders reduce market liquidity.
2.2 Institutional Investors Related Literature
Because institutional investors have the ability to gather information, the firms in which they hold shares exhibit less information asymmetry [6, 12–14]. Chakravarty [15] points out that, compared with retail investors, institutional investors placing moderately sized orders have a larger cumulative price impact on the stock market. Pukthuanthong-Le and Visaltanachoti [16] document the herding behavior of foreign investors in Thailand. Chang [17] found that prices in the Taiwan stock market are prone to overreaction due to foreign trading activity. Pavabutr and Yan [18] find that abnormal plunges in Asian stock markets are not significantly related to foreign transactions. Foreign institutional investors have better information capabilities than domestic investors, so the companies they hold are relatively free of information asymmetry [19]. Li and Laih [20] find that when the stock market rises sharply, fund managers mainly use a buying-herding strategy while dealers mainly use a selling-herding one, but when stock prices fall sharply there is strong herding evidence across the whole market.
2.3 Business Cycle Related Literature
A large literature has explored whether New York stock market volatility has significant power in predicting the overall economy [21–23]. Based on the above literature, stock prices are correlated with the overall economy to a considerable degree, so the following hypothesis is established:
Hypothesis 1: Investor sentiment of foreign investors, mutual funds, and dealers all have a positive impact on the overall economy.
Some scholars have also tried to show that stock market fluctuations can be predicted [24–26]. However, Davis and Kutan [27] show that the overall economy's predictive power for the stock market is weak. Therefore, the following hypothesis is established:
Hypothesis 2: Macroeconomic variables have a positive influence on the sentiment of the three major institutional investors.
Lee et al. [28] find that the overall economy has predictive power for the stock market in a bear market. Chen [29] investigates the predictive power of various macroeconomic variables for bear markets. Wu and Lee [30] also point out that the consumption-wealth ratio has better predictive power for the US bear market. Eisfeldt [25] finds that when the economy is in an expansion period, the stock market has a greater rate of change. Wu et al. [31] explore the forecasting ability of macroeconomic variables in several countries for their own bear markets and find that interest rates have better forecasting ability. Therefore, we establish the following hypothesis:
Hypothesis 3: Macroeconomic variables have a positive impact on the sentiment of the three major institutional investors in the bear market.
3 Research Methods
This research mainly discusses the relationship between macroeconomic variables and institutional investors' sentiment. The samples selected are listed companies. The research period is from January 2008 to December 2020.
In this study, the macroeconomic variables, stock prices, and trading volumes used to build institutional investors' sentiment are obtained from the Taiwan Economic Journal (TEJ) database, an authoritative data bank covering extensive institutional-investor trading data on the Taiwan Stock Exchange (TWSE) since 2008. The macroeconomic variables include M1b, the monitoring indicator, and the annual growth rate of the Taiwan Stock Price Index published by the National Development Council. To measure investor sentiment, we calculate the turnover rate as follows:

$\text{Turnover rate} = \dfrac{(\text{shares bought by investors} + \text{shares sold by investors})/2}{\text{shares outstanding}} \times 100$

Table 1 shows the regression results of investor sentiment in various industries on each macroeconomic variable. The dependent variables are M1b (M1), the monitoring indicator (M2) and the annual growth rate of the Taiwan Stock Price Index (M3).

Table 1. The regressions of investor sentiments in various industries.
                                                  M1        M2        M3
Cement Industry                                  −0.55***  −0.38     −2.65***
Food Industry                                    −0.02      0.71***   0.35
Plastic Industry                                 −0.12     −0.11      1.33
Textile Industry                                 −0.10      0.01      0.05
Electric Machinery Industry                       0.06      0.17      0.66
Electrical and Cable Industry                     0.07      0.39***   0.41
Chemical Biomedical Industries                   −0.28*     0.27     −0.44
Biotechnology and Medical Care Industry           0.11**   −0.09      0.03
Glass and Ceramic Industry                        0.30***   0.17      0.15
Paper and Pulp Industry                           0.02      0.16      0.58
Iron and Steel Industry                           0.30**   −0.29     −1.33**
Rubber Industry                                   0.57***   0.24      2.12***
Automobile Industry                               0.29***  −0.22**   −0.61**
Electronic Industry                               0.20*    −0.07      0.80
Information Service Industry                      0.11***   0.14***   0.40***
Building Material and Construction Industry       0.07      0.03      0.88*
Shipping and Transportation Industry             −0.18**    0.10     −0.62**
Tourism Industry                                 −0.02     −0.28     −0.48
Financial and Insurance Industry                 −0.41**   −0.37     −3.18***
Trading and Consumers' Goods Industry            −0.50***   0.07      0.13
Oil, Gas and Electricity Industry                −1.25**   −1.52      1.67
Other Industry                                    0.14     −0.81***  −1.02
R-squared                                         0.76      0.60      0.64
Note: Superscripts *, **, and *** denote significance of the t-test at the 10%, 5%, and 1% levels, respectively.
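A minimal sketch of the turnover-rate sentiment proxy and one industry-level regression of the kind reported in Table 1; the input objects and names (`buys`, `sells`, `shares_out`, `sentiment`, `macro`) are illustrative assumptions, not the authors' code:

```python
# Turnover-based sentiment proxy and a macro-variable regression sketch.
import pandas as pd
import statsmodels.api as sm

def turnover_rate(buys: pd.Series, sells: pd.Series, shares_out: pd.Series) -> pd.Series:
    """Sentiment proxy: ((shares bought + shares sold) / 2) / shares outstanding * 100."""
    return (buys + sells) / 2 / shares_out * 100

def industry_regression(sentiment: pd.DataFrame, macro: pd.Series):
    """Regress one macro variable (e.g. M1b) on the per-industry monthly
    sentiment series."""
    X = sm.add_constant(sentiment)          # industries enter as regressors
    return sm.OLS(macro, X, missing="drop").fit()

# Hypothetical usage:
# sent = pd.DataFrame({ind: turnover_rate(b[ind], s[ind], n_out[ind]) for ind in inds})
# print(industry_regression(sent, macro["M1"]).summary())
```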
The regression results show that investor sentiment in the information service industry is positively related to all three macroeconomic variables, suggesting that investor behavior in that industry responds to fiscal or monetary policy.
Acknowledgments. Constructive comments of the editors and anonymous referees are gratefully acknowledged. This research is partly supported by the National Science Council of Taiwan (NSC 109-2813-C-468-099-H and NSC 108-2813-C-468-039-H) and by Asia University of Taiwan (ASIA-105-CMUH-10 and ASIA104-CMUH-12).
References 1. Su, X., Qi, A., Lowe, C.D., Yuan, J., Yang, B.: Stock market liquidity and the business cycle: comprehensive evidence from Taiwan. J. Manag. Syst. 23, 65–106 (2016) 2. Kamare, T.W.M., Siegel, A.F.: The effect of futures trading on the stability of standard and poor 500 return. J. Futur. Mark. 12, 645–658 (1992) 3. Solt, M.E., Statman, M.: How useful is the sentiment index? Financ. Anal. J. 44, 45–55 (1988) 4. Lee, W.Y., Christine, X.J., Daniel, C.I.: Stock market volatility, excess returns, and the role of investor sentiment. J. Bank. Finance 26, 2277–2299 (2002) 5. Baker, M., Wurgler, J.: Investor sentiment and the cross-section of stock returns. J. Finance 61, 1645–1680 (2006) 6. DeVault, L., Sias, R.W., Starks, L.: Sentiment metrics and investor demand. J. Finance 74, 985–1024 (2019) 7. Chou, P.H., Zhang, Y.Z., Lin, M.C.: The interaction between investor sentiment and stock returns. Rev. Secur. Futures Markets 19, 153–190 (2019) 8. Peress, J., Schmidt, S.: What matters to individual investors? Evidence from the horse’s mouth. J. Finance 75, 1083–1133 (2020) 9. Liao, M.H., Sone, H.: Do institutional investors like local longevity companies? In: Proceedings of the 13th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2019), Sydney (2019) 10. Liao, M.H., Kuo, W.L., Chan, Y.L.: Investment concentration and home bias. In: Proceedings of the 14th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2020), Lodz (2020) 11. Choi, J.J., Robertson, A.: Glued to the TV: distracted noise traders and stock market liquidity. J. Finance 75, 1965–2020 (2020) 12. Brown, G.W., Cliff, M.T.: Investor sentiment and the near-term stock market. J. Empir. Finance 11, 1–27 (2004)
Macroeconomic Variables and Investor Sentiment
443
13. Michaely, R., Vincent, C.: Do institutional investors influence capital structure decisions? Working Paper, Cornell University, NY (2013) 14. Dennis, P.J., Strickland, D.: Who blinks in volatile markets, individuals or institutions? J. Finance 57, 1923–1949 (2002) 15. Chakravarty, S.: Stealth-trading: Which traders’ trades move stock prices? J. Finance Econ. 61, 289–307 (2001) 16. Ukthuanthong-Le, K., Visaltanachoti, N.: Commonality in liquidity: evidence from the Stock Exchange of Thailand. Pac.-Basin Finance J. 17, 80–99 (2009) 17. Chang, C.: Herding and the role of foreign institutions in emerging equity markets. Pac. Basin Financ. J. 18, 175–185 (2010) 18. Pavabutr, P., Yan, H.: The impact of foreign portfolio flows on emerging market volatility: evidence from Thailand. Aust. J. Manag. 32, 345–368 (2007) 19. Dahlquist, M., Robertsson, G.: Direct foreign ownership, institutional investors, and firm characteristics. J. Finance Econ. 59, 413–440 (2001) 20. Li, C.A., Laih, Y.W.: Taiwan stock market and domestic institutional investors herding behaviors during acutely market volatility periods. Taiwan Acad. Manag. J. 5, 231–267 (2005) 21. Kaul, A., Kayacetin, N.V.: Forecasting economic fundamentals and stock returns with equity market order flows: macro information in a micro measure? Working Paper, University of Alberta (2009) 22. Næs, R., Skjeltorp, J.A., Ødegaard, B.A.: Stock market liquidity and the business cycle. J. Finance 66, 139–176 (2011) 23. Beber, A., Brandt, M.W., Kavajecz, K.A.: What does equity sector orderflow tell us about the economy? Working Paper, University of Amsterdam (2010) 24. Fujimoto, A.: Macroeconomic sources of systematic liquidity. Working Paper, University of Alberta (2004) 25. Eisfeldt, A.L.: Endogenous liquidity in asset markets. J. Finance 59, 1–30 (2004) 26. Kurihara, Y.: The relationship between exchange rate and stock prices during the quantitative easing policy in Japan. Int. J. Bus. 11, 375–386 (2006) 27. Davis, N., Kutan, A.M.: Inflation and output as predictors of stock returns and volatility: international evidence. Appl. Financial Econ. 13, 693–700 (2010) 28. Lee, W.M., Wu, S.J., Huang, C.T.: Predicting bear markets of the TAIEX and industry indices in Taiwan. Taipei Econ. Inq. 51, 171–224 (2015) 29. Chen, S.S.: Predicting the bear stock market: macroeconomic variables as leading indicators. J. Bank. Finance 33, 211–223 (2009) 30. Wu, S.J., Lee, W.M.: Predicting the U.S. bear stock market using the consumption-wealth ratio. Econ. Bull. 32, 3174–3181 (2012) 31. Wu, S.J., Lee, W.M., You, S.Y.: Predicting bear stock markets: international evidence. Unpublished Working Paper (2013)
College Students’ Learning Motivation and Learning Effectiveness by Integrating Knowledge Sharing, Action Research and Cooperative Learning in Statistics Tin-Chang Chang1(B) , I.-Tien Chu2 , and Pei-Shih Chen3 1 Department of Digital Media Design, Department of Business Administration,
Asia University, Taichung, Taiwan [email protected] 2 Department of Creative Product Design, Asia University, Taichung, Taiwan [email protected] 3 General Education Center, Central Police University, Taoyuan, Taiwan [email protected]
Abstract. Statistics is everywhere in people's lives. It is very important, fundamental knowledge, one of the basic compulsory courses of the School of Management, and widely applied at all levels of management. The purpose of this study is to use multiple teaching methods to improve learning effectiveness, enhance learning motivation, and avoid the shortcomings of traditional, rigid, single teaching methods and poor teaching results in the statistics course. This research combines three learning and management models into an innovative teaching method, which includes collaborative learning, action research, and the application of learning processes as practical knowledge management. This innovative teaching method is applied to sophomores in a compulsory statistics course at Asia University. The students are thereby better prepared for the demands of employment, gain a broader vision, and experience more diverse learning in a more structured statistics course. The evaluation of learning effectiveness also uses innovative methods comprising three parts: a. a statistical learning motivation scale; b. a questionnaire assessment scale; c. self-assessment of learning effectiveness. It is also hoped to break the restriction of the traditional teaching method of using textbooks and a standardized teaching process, as well as the traditional practice of paper-oriented examinations, in order to improve students' learning motivation and learning effectiveness. The results of this research show that: 1. The average score of the first-semester mid-term examination in the 108 academic year is higher than that in the first semester of the 107 academic year, indicating that the overall learning effect has improved, although the difference is not significant (t value 0.342, p-value = 0.733); the standard deviation of the mid-term exam results in the 108 semester is also smaller than in the 107 semester, indicating that the overall learning effect is more consistent. 2. Learning motivation and assessment results grew significantly in both the first and second semesters. 3. The self-assessment of learning effectiveness grows every semester, and the growth in the second semester is more significant than in the first. In particular, the self-assessment
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): IMIS 2021, LNNS 279, pp. 444–463, 2022. https://doi.org/10.1007/978-3-030-79728-7_45
College Students’ Learning Motivation and Learning Effectiveness
445
of learning effectiveness at the end of each semester has grown much more than at the beginning of the semester. These results show that the approach adopted in this research has actual learning effects, but it is still necessary to continue to strengthen teaching preparation and improve teaching methods.
1 Research Motive and Purpose
This research focuses on the method of cooperative learning through group assignments. Based on the theory of action research applied to case studies in statistics, learning by doing is a great way to internalize students' explicit knowledge from the textbook (the text and formulas) into tacit knowledge (practical statistical skills that can be applied in reality). The subjects of this research are college students in the School of Management. Students are admitted through various channels, including multi-star project recommendation, individual admission, admission by recommendation and screening, advanced subjects test admission, transfer test admission, and repeaters. Students' admission backgrounds are so different that their learning processes also differ greatly. Therefore, more diversified methods are needed to increase students' learning effectiveness, and the shortcomings of traditional teaching methods in statistics classes, such as boring content, a single teaching method and poor teaching outcomes, can be avoided. In traditional classrooms with a teacher-designed, teacher-arranged learning environment, students have simply followed the teacher's directions, done their own thing, used mobile phones, or even done nothing. Therefore, the current teaching environment must change: teachers are not only knowledge givers but facilitators. According to the learning pyramid proposed by Edgar Dale in 1969, the most efficient learning is the active learning at the bottom of the pyramid, that is, learning from direct and purposeful experiences such as hands-on workshops. Moreover, lesson simulation and presentation performance can enable students to remember 90% of the lesson content after two weeks. This research emphasizes students' learning motivation and learning effectiveness, focusing on collaborative learning and action research from the students' perspective to cultivate their creative abilities and offset the shortcomings of traditional teaching methods. Unlike traditional methods, this study aims to internalize students' explicit textbook knowledge into tacit knowledge, so that they can apply the practical abilities learned in statistics class in reality. The purposes of this research are as follows:
A. Using the method of cooperative learning to encourage students to work in groups and to increase active learning.
B. Working on case studies using the method of action research to develop students' abilities in analysis and problem-solving.
2 Literature Review
(1) Learning Pyramid
American educator Edgar Dale (1969) pointed out that the least efficient learning sits at the top of the learning pyramid, and the most efficient at the bottom, namely direct and purposeful learning, as shown in Fig. 1. The higher up the pyramid, such as lecturing, the lower the retention rate of students' learning, which is only 5%. The retention rate of learning by using audio-visual materials is 20%. The lower down the pyramid, the higher the retention rate: learning by doing can reach 75%, and teaching others can reach 90%. Therefore, with a teacher-centered teaching method the retention rate of students' learning is low, while with a student-centered method students absorb more from the class. The teacher-centered method has the advantage of transferring knowledge systematically, but students may become passive learners with poor learning effectiveness. On the contrary, the student-centered method can greatly improve students' learning effectiveness. This is the most obvious difference between interactive teaching and lecture-based teaching. Relying only on the teacher to deliver the course content is boring, difficult for students to absorb completely, and the teaching mode least able to stimulate students' learning motivation and efficiency.
(2) Action Research
Action research is particularly about the role that researchers play in the learning process: they are not only researchers but also participants, consultants and empowerers. In the process of action research, the researchers and the research subjects are in an equal relationship, participating in the process and solving problems together (Webb 1991; Li 1997). The concept of action learning is derived from the continuous cycle of learning and reflection in experiential learning theory (McGill and Beaty 1992).
Fig. 1. Learning Pyramid by Edgar Dale, The cone of experience, in “Audiovisual methods in teaching” (1969). Hinsdale, IL: The Dryden Press. p. 107
College Students’ Learning Motivation and Learning Effectiveness
447
According to Freire (1993), knowledge is produced or acquired in the process of action research, and self-awareness is improved through self-reflection in action. Through self-reflection and the clarification of one's own value standpoint, learners can deepen their views of the outside world. Action research combines research and action: participants solve actual problems or improve the current situation in cooperation with experts, scholars or members of the organization by conducting systematic research (Elliott 2007). The action research process proposed by Stringer (1996) is a spiral with three steps in each cycle: the first step is the observation stage, including collection of relevant information, situation construction, and situation description; the second is the thinking stage, including exploration, analysis, and explanation of the problem; the third is the implementation stage, including plan formation, implementation and evaluation. After the first cycle, the researcher and participants review the results together, re-observe, revise the plan, and put it into practice again. This research focuses on finding the problems that arise while learning statistics, solving them, and thereby promoting more practical learning of statistics.
(3) Cooperative Learning
Cooperative learning allows students to work together in a small team or group to master the learning materials (Slavin 1980). It divides students into groups or teams whose members cooperate with each other and actively participate in the learning process to construct their own knowledge (Tsai 2001). Cooperative learning has significant effects not only on the acquisition of basic knowledge and skills, but also on higher-level cognitive abilities, such as critical thinking and reasoning strategies (Huang 1992; Johnson and Johnson 1987). In addition, students' abilities in logical thinking, judgmental reasoning, and problem-solving are greatly enhanced (Schunk et al. 1987). Cooperative learning is the key to knowledge construction and knowledge sharing (Tannenbaum and Tahar 2008; Vygotsky 1978). It allows students to build their own expertise, cooperate with others seamlessly, share knowledge, and think in depth while participating in cooperative learning (Tannenbaum and Tahar 2008; Tsai 2001). Students also improve their communication and problem-solving skills while cooperating with others in learning. Teachers who implement the method of cooperative learning should consider the following requirements (Huang and Lin 1996; Lin 2004; Tsai 2001):
A. Teaching preparation: deciding the number of students in one group (5–8 people is most appropriate, and no more than 10), grouping students, working in small groups, creating a suitable learning environment, preparing cooperative learning materials, and setting up learning tasks.
B. Teaching implementation: explaining the learning tasks, clearly explaining the rules, establishing a positive relationship of mutual dependence, designing individual performance evaluations and group evaluations, describing the behaviors expected in cooperative learning, and conducting cooperative learning teaching.
C. Learning evaluation and praise: tracking students' learning process, providing assistance in tasks and social skills, conducting formative and diagnostic evaluations, and giving praise to students and groups for outstanding performance.
D. Learning and teaching reflection: reflection on the groups' learning process and reflection on the teaching process.
(4) Knowledge Sharing and Cooperative Learning
Andersen (2000) pointed out that the scope of knowledge management includes four levels: data, information, knowledge, and wisdom. This definition is supported by many scholars, who believe that the objects of knowledge management should cover all of these levels (Liu 2001; Sena and Shani 1999; Wen and Tang 2007; Wu 2000). Generally speaking, the purpose of knowledge management is to improve the performance and competitive advantage of individuals and organizations: for the important explicit and tacit knowledge that exists inside and outside the organization and among the employees themselves, it carries out a management process of efficient collection, storage, delivery, sharing, utilization, and creation. Although knowledge has common characteristics, it takes on different forms when it appears in different presentation modes, storage locations, abstraction levels, and utilization purposes (Lin 2005). The most common classification divides knowledge into two categories: tacit knowledge and explicit knowledge (Hedlund 1994; Maula 2000; Nonaka and Takeuchi 1995; Polanyi 1967; Sanchez 2007). When knowledge can be clearly expressed in words, numbers, graphics, or other symbols, it can be clearly defined and captured; this is explicit knowledge. Tacit knowledge is highly personalized knowledge that is deeply embedded in personal experience, judgment, values, the subconscious mind, and mental models. It is usually only grasped intuitively and is difficult to express, so it easily creates communication barriers. In order to integrate knowledge into internal competence, the organization must have a well-planned knowledge management process. The steps of knowledge management can be divided into the following stages: knowledge creation, confirmation, collection, classified storage, sharing and access, use, and improvement through to elimination (O'Dell and Grayson 1998). Knowledge management can be regarded as a strategy for the organization to enhance its own competitiveness: knowledge is accumulated through a series of processes of knowledge discovery, knowledge specification, and knowledge transfer to develop new knowledge and enhance organizational assets. This research focuses on the teaching of the statistics class and students' learning process; by sharing knowledge with the students, the teachers can understand their students' performance in the process of learning. Yang (2003) mentioned in a study of knowledge management that the effectiveness of the knowledge management process depends on the environment, which means that the process is affected by the context in which the knowledge is used. Knowledge sharing includes the processes of knowledge transfer and knowledge conversion; through the dynamic conversion of tacit and explicit knowledge, new knowledge is created. The transformation of knowledge can be divided into four types: socialization, externalization, internalization, and combination (Hsieh 2005; Liu 2000; Nonaka and Takeuchi 1995). In the process of combination, existing explicit knowledge
is analyzed, classified, integrated, and reorganized to form new explicit knowledge. Through these four stages, knowledge is transferred from individuals to groups, and the organization constantly innovates and grows. The four stages, as shown in Fig. 2, are explained as follows:
A. Socialization (tacit knowledge to tacit knowledge): the process of transforming tacit knowledge by sharing experience.
B. Externalization (tacit knowledge to explicit knowledge): tacit knowledge is expressed through metaphors, analogies, concepts, assumptions, or models.
C. Combination (explicit knowledge to explicit knowledge): the process of systemizing concepts to form a knowledge system by combining different bodies of explicit knowledge.
D. Internalization (explicit knowledge to tacit knowledge): conveying knowledge through language and stories, or compiling it into a document manual, helps transform explicit knowledge into tacit knowledge.
Fig. 2. Transformation of tacit knowledge and explicit knowledge (Nonaka and Takeuchi 1995; Liu 2000; Hsieh 2005)
This research aims to understand the motivation and behaviors of knowledge sharing in teacher teaching and student group learning. In the mode of cooperative learning, the knowledge-sharing behaviors among group members are also analyzed to explore the effectiveness of student learning.
3 Research Questions
A. By using the method of cooperative learning so that students cooperate and motivate each other, can students' learning motivation be improved?
B. By integrating the method of action research into the class to emphasize students' abilities in analysis and problem-solving, can students' learning effectiveness be improved?
C. By using the method of knowledge sharing when working on industrial case studies, with teachers' facilitation in group operation and knowledge construction, can students' learning effectiveness be improved?
4 Research Methodology
At the beginning of the semester, the students in the class of Statistics 1 were divided into groups of 5 to 6 people for the purpose of cooperative learning. Each group then discussed the category of industry that would be used in its action research. Each group shared its learning progress in class regularly, and the groups learned from each other. The explicit knowledge in the textbook, such as the text content and statistical formulas, is applied to practical analysis and internalized into the students' tacit knowledge, such as statistical abilities that can be applied in reality. The statistics course design is shown in Table 1 and includes descriptive statistics, probability theory, sampling theory, statistical inference, and applications of statistics. At the same time, students are grouped according to industry categories as a classification plan for writing statistical analysis case reports, including agriculture, forestry, and fishery; manufacturing; transportation and storage; accommodation and catering; finance and insurance; arts and entertainment; and leisure services. By using the method of cooperative learning, students in the statistics class can motivate each other when working in groups, which promotes active learning. Teachers play the role of problem-oriented facilitators and guide students in a timely manner. By using the method of action research, students' analytical and problem-solving skills are emphasized and developed with the learning materials provided in the statistics class and the case studies of various industries worked on by each group. Teachers further guide students to work in groups through knowledge sharing, surveying different industrial scenarios combined with the questionnaire design process demonstrated by the teachers in class. Students' knowledge construction in statistics can be further internalized through the process of knowledge sharing in groups and in class, so their learning becomes more structured, complete, and fulfilling. Also, through the process of knowledge sharing, teachers receive feedback from the students, learn where the teaching obstacles are, and can later adjust their teaching methods accordingly. The teachers play the role of observers. Each cycle includes the steps of planning, cooperation, implementation, sharing, and reflection. The research structure is shown in Fig. 3.
The research subjects of this study are the college students enrolled in the class of statistics in a Department of Management and Administration in central Taiwan. There are two assessment tools for evaluating students' learning motivation and learning effectiveness in this study: the Learning Motivation Scale and the Self-evaluation of Learning Effectiveness. The Learning Motivation Scale was constructed and modified by the authors from the original Work Preference Inventory by Amabile et al. (1994). The Work Preference Inventory was originally designed to evaluate individual differences in intrinsic and extrinsic motivational orientations. Under the Work Preference Inventory's
Table 1. Course design in the class of statistics

Unit 1 (Statistics 1, Weeks 1–6)
Topic: Descriptive statistics; setting up of topic, interview, questionnaire
Teaching preparation: prepare teaching materials in various industries; learning motivation scale (pre-test)
Cooperative learning: help in grouping, set up learning tasks
Action research: find out the appropriate interviewers for the relevant industry, discuss the topic direction
Knowledge sharing: Week 3 sharing; Week 6 sharing
Evaluation/assessment: pre-test; Week 3 sharing 10%; Week 6 sharing 20%

Unit 2 (Statistics 1, Weeks 7–12)
Topic: Probability theory; demo of questionnaire samples, topics of probability distribution in various industries
Teaching preparation: group discussion; preparation of statistical analysis case studies; learning motivation scale (2nd test)
Cooperative learning: help in designing the questionnaire, mid-report sharing
Action research: discuss the topics to find applicable statistical data; film appreciation (Moneyball)
Knowledge sharing: Week 9 sharing; Week 12 sharing; movie: Moneyball
Evaluation/assessment: Week 9 mid-report sharing 20% (including midterm); Week 12 sharing 10%

Unit 3 (Statistics 1, Weeks 13–18)
Topic: Sampling theory; sampling selection, sampling method
Cooperative learning: help with questionnaire distribution location and method
Action research: questionnaire distribution, data collection
Knowledge sharing: Week 15 sharing; Week 18 sharing
Evaluation/assessment: Week 15 sharing 10%; Week 18 final report sharing 30% (including final)

Unit 4 (Statistics 2, Weeks 1–6)
Topic: Statistical inference
Cooperative learning: help students understand the scope of applications of statistics, increase the scope of data for group discussion, and increase the variables
Action research: discuss data, find out the appropriate statistical method
Knowledge sharing: Week 3 sharing; Week 6 sharing
Evaluation/assessment: pre-test; Week 3 sharing 10%; Week 6 sharing 20%

Unit 5 (Statistics 2, Weeks 7–12)
Topic: Applications in statistics 1; comparison between two groups and multiple groups, preparation of the chi-square test
Cooperative learning: help in data analysis and discussion, find suitable statistical analysis methods, mid-term sharing
Action research: variable analysis; film appreciation (market survey)
Knowledge sharing: Week 9 sharing; Week 12 sharing; movie: market survey
Evaluation/assessment: Week 9 mid-report sharing 20% (including midterm); Week 12 sharing 10%

Unit 6 (Statistics 2, Weeks 13–18)
Topic: Applications in statistics 2; correlation analysis in various case studies; learning motivation scale (post-test)
Cooperative learning: help in discussing forecast variables and statistical methods
Action research: discuss variable analysis
Knowledge sharing: Week 15 sharing; Week 18 sharing
Evaluation/assessment: Week 15 sharing 10%; Week 18 final report sharing 30% (including final)
Fig. 3. Research structure
intrinsic motivational orientations, the ideas of self-determination, competence, task involvement, curiosity, enjoyment, and interest are included, while competition, evaluation, recognition, money, and tangible incentives are included in extrinsic motivational orientations. Following the principles of the Work Preference Inventory, the Learning Motivation Scale used in the class of statistics was constructed. In terms of intrinsic motivation, elements of self-determination, competence, task involvement, curiosity, enjoyment, and interest are included. In terms of extrinsic motivation, elements of competition, evaluation, recognition, money, tangible incentives, and influence by significant persons are included. In traditional classes, most students just followed the teachers' instructions: they either followed the teachers' directions to learn specific tasks, or they did their own thing, used mobile phones, or even did nothing. Teachers should not only be knowledge givers but also facilitators; they should bring students' attention into the classroom and obtain their focus on learning when the lessons begin. Traditional evaluation is passive; self-assessment, by contrast, can be a proactive way of evaluating students' learning performance.
5 Teaching Outcomes and Results
(1) Teaching Process
a. Teaching Preparation and Lecture: The primary purpose of the statistics course is to establish a foundation for students to understand statistical theories. The lesson content covered in class does not focus only on details; it seeks to integrate theory and practice and to develop students' critical thinking and analytical skills. All the lesson plans are presented in the form of slides. Teachers upload these lesson plans and teaching materials to the course website in advance so that students can download them for preview before class. Then, through the teacher-made teaching materials, the background knowledge of the authentic case studies is explained in order and in a step-by-step manner. The teaching materials are taught in a logical order from simple to complex. Each learning
stage provides one-minute feedback. Teachers regularly review students' learning progress as a basis for modifying the teaching mode. Teachers also assist students in constructing the basic concepts needed to understand the content of these case studies in an orderly manner.
b. Problem-oriented Cooperative Learning: When teaching the class of statistics, teachers must consider students' interests, needs, levels, goals, and problems. When designing the context for a case scenario, teachers need to cover inspiring topics with supplementary materials and examples that can be applied to the discussion topics, and guide students in how to look for problems, clarify problems, collect information, participate in the discussion, and use induction to reach conclusions. In addition, the incentive mechanism of film appreciation, including Moneyball and Market Research, can increase students' interest in learning, engage them in interactive discussions, and lead them to discuss specific issues together. Students can be equipped with problem-solving skills through the process of communication, listening, and information sharing. The statistics course can be designed based on the attributes of the case: teachers explain the key points in the classroom and give the students case scenarios to be solved in class, the students suggest possible solutions, and the teacher immediately answers any questions and discusses the case scenarios with the students. The students must form groups and go through the case scenarios, which were already uploaded to the course website, with their group members before the class. By competing with each other in groups, students learn how to work with each other within the team. Students feel secure because they work with partners, and they will not lose their confidence in learning because of competition. Through cooperative and competitive learning among peers, it is possible to take both students' learning motivation and their learning confidence into account. In addition, each group can ensure that every member prepares for class and discusses with peers. Each group decides on a group leader, who then leads the discussion of the content of the case studies and key plots in the class. The team works together to find the problems, analyze them, and suggest possible solutions. During the group discussion, group members can not only exchange opinions and information with each other through verbal and non-verbal means, but also share personal views and feelings with the group and seek a consensus acceptable to most people.
c. Action Research (group discussion, questionnaire design, questionnaire distribution, data collection): This research partly adopts practice-oriented action research, which combines action and research into one. Action research is a cycle of planning, action, observation, and reflection, and it is often categorized as qualitative research. Qualitative research is not concerned with objective classification and measurement, the search for universal laws, or the falsification of causal hypotheses through statistical inference. It pays more attention to the constructing process of social facts and to experience and interpretation
under a unique cultural and social context. This exploratory research is filled with contextual process, interaction, meaning, and interpretation; its research value and measuring criteria cannot be fully covered and explained by the elements of quantitative research (Lin 2005). Therefore, a quantitative score cannot completely represent students' learning effectiveness in the class of statistics. In the action research part of this study, the researcher is the course designer and the on-site teacher. In order to avoid overly subjective results, students' mid-term exam scores, attendance rates, and teaching reflection records are included to improve the credibility of the research results. In the design of the research methodology, quantitative research is adopted in the form of quasi-experimental research: students' learning effectiveness in the current class is compared with that in the previous class in order to understand the degree of influence of different teaching styles in the class of statistics. The method of action research serves to deliver the qualitative results. Researchers who use this research method must be the course instructors or mentors, and the students in their own teaching class are their research subjects. When encountering problems in class, teachers use course materials as a medium to design lesson plans, seek improvement in the process of action research, and reflect introspectively to solve the problems. In the meantime, examples of different questionnaire modules in various industry categories are included in the course materials for the groups to use in writing statistical analysis case reports. The topics of questionnaire design for each group after the semester are shown in Table 2.
d. Knowledge Sharing (explicit knowledge transferred into tacit knowledge, then shared with others): The basic nature of knowledge includes explicitness and codification, which affect the ability to transfer and acquire knowledge (Boisot 1983; Nonaka 1994; Sanchez 1997). Explicitness refers to the degree to which knowledge can be clearly expressed to others, and codification affects the disseminating ability of knowledge. Generally, codified knowledge is easier to spread than uncodified knowledge. Tacit knowledge usually spreads slowly and is confined to a limited audience; if tacit knowledge is codified into explicit knowledge, we often have to generalize it and lose its original richness. Increasing the disseminating ability of knowledge often depends on the situation and needs at the time. Even though tacit knowledge is not easy to comprehend and transfer, it is usually the knowledge that everyone must acquire. From the perspective of resources and strategic management, the ambiguity of resources creates barriers to competitive advantage, making it difficult for competitors to understand and imitate them (Barney 1991; Mosakowski 1997). This kind of tacit knowledge is difficult to codify. Although relying on tacit knowledge is risky, organizations can gain greater value by converting or sharing it (Haldin-Herrgard 2000). Students feel secure because they work with partners, and they will not lose their confidence in learning because of competition. Through sharing knowledge among peers, they can increase each other's learning motivation and confidence in learning. The team works together to find the problems,
analyze them, and suggest possible solutions. During the group discussion, group members can not only exchange opinions and information with each other through verbal and non-verbal means, but also share personal views and feelings with the group and seek a consensus acceptable to most people.
Table 2. Each group's topic in the class of statistics

1st Group: Research on customer satisfaction of Xiangshan Administration and visitor center
2nd Group: Research on the correlation of banking staffs' work stress, personality traits, and job satisfaction
3rd Group: Research on customer satisfaction of Hsinchu Logistics Co., Ltd.
4th Group: The impact of high-density development on real estate prices in Taipei City
5th Group: Research on customer satisfaction of Harbor Plaza
6th Group: Consumer behaviors in the insurance industry, taking Cathay Pacific Life Insurance as the case
7th Group: Exploring the impact of customer satisfaction on consumer behavior of ETUDE HOUSE
8th Group: The influence of motivational intention on consumer behaviors of McDonald's
9th Group: The influence of motivational intention, perceived value, and service quality on consumer behavioral intentions of JuliaNews
10th Group: Research on satisfaction in the media industry: Vieshow Cinema in Taiwan
11th Group: The impact of different services and communication difficulties on consumer behavior intentions in the Cultural and Educational Foundation
(2) Teachers' teaching reflection and results
According to the implementation process data completed in each stage of teaching, the self-evaluation of learning effectiveness, the learning motivation scale, and the learning assessments are conducted at the beginning, middle, and end of each semester, as shown in Table 3. Students provide feedback to the teachers, so the teachers can understand the status and effectiveness of learning and modify their teaching accordingly. This study was implemented in the academic year of 2019, and as shown in Table 4, the results were compared with those of the academic year of 2018. The results show no significant difference between the midterm examinations taken in the academic years of 2018 and 2019; the students' learning performance in the two academic years was similar. The results of the final exam in the academic years of 2018 and 2019 are shown in Table 5.
Table 3. Implementation of learning evaluation

1st Semester
Weeks 1–6: Week 1 learning motivation scale; Week 6 self-evaluation of learning effectiveness
Weeks 7–12: Week 9 midterm exam; Week 9 self-evaluation of learning effectiveness
Weeks 13–18: Week 16 to 18 evaluation of questionnaire design; Week 18 self-evaluation of learning effectiveness

2nd Semester
Weeks 1–6: Week 1 learning motivation scale; Week 6 self-evaluation of learning effectiveness
Weeks 7–12: Week 9 midterm exam; Week 9 self-evaluation of learning effectiveness
Weeks 13–18: Week 16 to 18 evaluation of questionnaire design; Week 18 learning motivation scale; Week 18 self-evaluation of learning effectiveness
The results show a significant difference: the students' academic performance in the final exam in the academic year of 2019 is better than that of the academic year of 2018.

Table 4. Midterm in the first semester of academic year (AY) of 2018 and 2019

AY           n   Mean    SD       SE      p-value (two-tailed)
2019 (1081)  58  49.052  12.1098  1.5901  .733
2018 (1071)  57  48.193  14.6911  1.9459

Table 5. Final in the first semester of academic year of 2018 and 2019

AY    n   Mean   SD      SE     p-value (two-tailed)
2019  58  73.35  10.153  1.345  0.014
2018  57  68.55  10.386  1.364
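The reported p-values can be checked directly from the summary statistics in Tables 4 and 5. The following sketch is an editorial illustration, not part of the original study; it uses SciPy's two-sample t-test computed from summary statistics, and the equal-variance assumption is ours, since the paper does not state which variance model was used.

```python
# Reproducing the two independent-sample t-tests of Tables 4 and 5 from the
# reported n, mean, and SD alone (values copied from the tables).
from scipy.stats import ttest_ind_from_stats

# Midterm (Table 4): AY 2019 vs. AY 2018
t_mid, p_mid = ttest_ind_from_stats(
    mean1=49.052, std1=12.1098, nobs1=58,
    mean2=48.193, std2=14.6911, nobs2=57,
    equal_var=True,  # assumption; the paper does not specify
)

# Final (Table 5): AY 2019 vs. AY 2018
t_fin, p_fin = ttest_ind_from_stats(
    mean1=73.35, std1=10.153, nobs1=58,
    mean2=68.55, std2=10.386, nobs2=57,
    equal_var=True,
)

print(f"midterm: t = {t_mid:.3f}, p = {p_mid:.3f}")  # p close to .733 (n.s.)
print(f"final:   t = {t_fin:.3f}, p = {p_fin:.3f}")  # p close to .014 (significant)
```

Under these assumptions, the computed p-values match the tabulated .733 and 0.014.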
As for the analysis of the learning motivation assessment, only the results of the academic year of 2019 are examined, because no data were collected in the academic year of 2018. As shown in Fig. 4, students were significantly more motivated in the second semester of the academic year of 2019 than in the first semester. This indicates that students' learning motivation increased over the year of course learning in the class of statistics: students felt more motivated by learning with peers and learning by doing.
Fig. 4. Learning motivation assessment in the academic year of 2019 (1st semester 1st take, 2nd semester 1st take, and 2nd semester 2nd take)
In terms of the learning assessment, the scores of the midterm and the final are used. As shown in Fig. 5, students did better on the midterm and the final in the second semester than in the first semester of the academic year of 2019. This indicates that students' learning performance on the midterm and final increased over the year of course learning in the class of statistics: students were able to absorb more of the lesson content by learning with peers and learning by doing.
Fig. 5. Learning assessment of midterm and final in the academic year of 2019 (1st and 2nd semester midterms and finals)
As for the students' self-evaluation of learning effectiveness shown in Table 6, students showed the greatest improvement at the 3rd take in the second semester, followed by the 3rd take in the first semester and then the 2nd take in the second semester; the least improvement was at the 1st take in the first semester. Figure 6 shows that students' self-assessment of learning effectiveness gradually increases from the beginning of each semester to its highest level at the end of each semester, which indicates that students increase their confidence in learning by learning with peers and learning by doing during the learning process.

Table 6. Descriptive analysis of self-evaluation of learning effectiveness in the academic year of 2019

Take                   n    Mean  SD    SE   95% CI for mean (lower, upper)  Min    Max
1st Semester 1st take  58   6.86  1.69  .22  (6.41, 7.31)                    2.00   9.00
1st Semester 2nd take  58   7.64  1.55  .20  (7.23, 8.05)                    2.00  10.00
1st Semester 3rd take  58   8.57  1.99  .26  (8.05, 9.10)                    2.00  10.00
2nd Semester 1st take  45   7.56  1.39  .21  (7.14, 7.97)                    4.00  10.00
2nd Semester 2nd take  45   7.91  1.06  .16  (7.59, 8.23)                    5.00  10.00
2nd Semester 3rd take  45   8.96  1.15  .17  (8.61, 9.30)                    7.00  10.00
Total                  309  7.89  1.68  .09  (7.70, 8.08)                    2.00  10.00
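The confidence intervals in Table 6 follow from n, the mean, and the SD by the usual t-based formula, mean plus or minus t(.975, n-1) times SE. A minimal sketch (ours, not the authors') verifying the first row:

```python
# Verifying one row of Table 6: the 95% CI for the 1st semester 1st take.
import math
from scipy.stats import t

n, mean, sd = 58, 6.86, 1.69           # values copied from Table 6
se = sd / math.sqrt(n)                 # about .22, as reported
half_width = t.ppf(0.975, n - 1) * se  # t critical value with 57 df (about 2.00)

print(f"SE = {se:.2f}")
print(f"95% CI = [{mean - half_width:.2f}, {mean + half_width:.2f}]")
# Output is roughly [6.42, 7.30], matching the reported (6.41, 7.31)
# up to rounding of the inputs.
```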
Fig. 6. Self-evaluation of learning effectiveness in the academic year of 2019 (1st and 2nd semester, 1st to 3rd takes)
As shown in Table 7, the one-way analysis of variance reaches a significant level, which means that the average scores of the six self-evaluations of learning effectiveness are significantly different.

Table 7. Variance analysis of self-evaluation of learning effectiveness in the academic year of 2019

Source          Sum of squares  df   Mean square  F       Sig.
Between groups  147.852         5    29.570       12.458  .000
Within groups   719.184         303  2.374
Total           867.036         308
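Table 7's F statistic and its significance can be re-derived from the reported sums of squares. The sketch below is an editorial check, not part of the study; it confirms that F equals MS between divided by MS within, about 12.46, and that the exact p-value is far below .001, which the table rounds to .000.

```python
# Re-deriving the one-way ANOVA result in Table 7 from its sums of squares.
from scipy.stats import f

ss_between, df_between = 147.852, 5
ss_within, df_within = 719.184, 303

ms_between = ss_between / df_between   # 29.570, as reported
ms_within = ss_within / df_within      # about 2.374, as reported
F = ms_between / ms_within             # about 12.46, matching the table

p = f.sf(F, df_between, df_within)     # upper-tail probability of the F distribution
print(f"F({df_between}, {df_within}) = {F:.3f}, p = {p:.2e}")  # p << .001
```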
(3) Students' Feedback and Teaching Adjustment
a. As shown in Table 8, in the sixth week of the first semester, one student said that the homework questions were too difficult. As a teaching adjustment, the teacher demonstrated the exercises in class and reduced the level of difficulty.
b. Some students suggested reviewing the homework after the corrections in order to improve learning performance. As a teaching adjustment, some class time is reserved to answer students' questions and review their homework problems.
c. After the mid-term exam, some students proposed giving examples of statistical theory. As a teaching adjustment, the teacher started to introduce questionnaire design
and the application of industrial topics after the midterm, which also triggered an improvement in students' learning motivation.
d. As shown in Table 9, in the sixth week of the second semester, one student reported that the homework exercises were too difficult. By talking with the student, the teacher found out that this student was a repeater, so the teacher will provide this student with more assistance on homework in the future.
Table 8. Students' feedback in the first semester

Other feedback to the teacher
1. We always hand in the exercises to the teacher after the class, but we don't know how well we did. We hope the teacher can review the last lesson's exercises
2. The teacher is very nice and hard-working, but I hope we can have some extra credit in the quizzes
3. The teacher works very hard, but some areas are a bit difficult. I hope the teacher can simplify the questions
4. I hope the teacher can grade the homework and discuss it in class
5. I hope the teacher won't flunk us. Teacher, thank you
6. Statistics is so hard
7. Overall it is great. The lecture could be slowed down. I hope the teacher gives more examples that are easy to understand so we will know how to use them in the future
Table 9. Students' feedback in the second semester

Other feedback to the teacher
1. The teacher was hard-working in the class. He answered our questions
2. The teacher worked hard in the class with great time management
6 Recommendations and Reflections
I. Students can be motivated by the method of cooperative learning when working in groups. The results show that this method can increase students' initiative and enthusiasm, thereby enhancing learning motivation. Students also feel secure because they work with partners, and they will not lose their confidence in learning because of competition.
II. Working on class cases with the method of action research emphasizes students' abilities in analyzing problems as well as their problem-solving skills. The results show that this method can effectively improve students' learning performance. This research method must center on the teaching instructors themselves.
Teachers can adjust at any time when facing problems, seek improvement in the process of action research, and reflect introspectively to solve the problems in the classroom. In the meantime, practical examples from various industries are included in the course materials to enhance students' learning interest.
III. Using the method of knowledge sharing when working on the case studies, along with teachers' guidance, supports knowledge construction. The results show that this method can enrich students' structured learning and effectively enhance their learning effectiveness, and that learning motivation and learning confidence can be increased through knowledge sharing among peers. In the process of knowledge sharing, each group first decides on a group leader, who then leads the discussion of the content of the case studies and key plots in the class. The team works together to find the problems, analyze them, and suggest possible solutions. During the group discussion, group members can not only exchange opinions and information with each other through verbal and non-verbal means, but also share personal views and feelings with the group and seek a consensus acceptable to most people.
References

Amabile, T.M., Hill, K.G., Hennessey, B.A., Tighe, E.M.: The work preference inventory: assessing intrinsic and extrinsic motivational orientations. J. Pers. Soc. Psychol. 66, 950–967 (1994)
Andersen, A.: Value for Money Drivers in the Private Finance Initiative. Arthur Andersen and Company, Chicago (2000)
Altrichter, H., Posch, P., Somekh, B.: Teachers Investigate their Work: An Introduction to the Methods of Action Research. Routledge, New York (1997)
Barney, J.B.: Firm resources and sustained competitive advantage. J. Manag. 17, 99–120 (1991)
Boisot, M.H.: Convergence revisited: the codification and diffusion of knowledge in a British and a Japanese firm. J. Manage. Stud. 20(2), 159–190 (1983)
Boisot, M.H.: Knowledge Assets. Oxford University Press, Oxford (1998)
Dale, E.: Audiovisual Method in Teaching. The Dryden Press, Hinsdale (1969)
Elliott, J.: Assessing the quality of action research. Res. Pap. Educ. 22(2), 229–246 (2007)
Freire, P.: Pedagogy of the Oppressed. Continuum, New York (1993)
Hart, F., Bond, M.: Action Research for Health and Social Care: A Guide to Practice. McGraw-Hill Education, London (1995)
Haldin-Herrgard, T.: Difficulties in diffusion of tacit knowledge in organizations. J. Intellect. Cap. 1, 357–365 (2000)
Hedlund, G.: A model of knowledge management and the N-form corporation. Strateg. Manag. J. 15, 73–90 (1994)
Huang, T.-J.: Promoting five ways of life in the elementary and middle schools. Center for Educational Research of NTNU, Taipei (1992)
Huang, T.-J., Lin, P.-S.: Pedagogy of creativity and cooperation. ShtaBook, Taipei (1996)
Hsieh, T.-Y.: Study of virtual community and knowledge sharing in role of expectancy and value. Unpublished Master thesis. Department of Information Management of National Central University, Taoyuan (2005)
Johnson, D.W., Johnson, R.T.: Learning Together and Alone: Cooperative, Competitive, and Individualistic Learning. Prentice-Hall Inc, Hoboken (1987)
Kemmis, S., McTaggart, R.: The Action Research Planner. Deakin University Press, Victoria (1988)
Li, I.-C.: Application of action research in nursing. Nurs. Res. 5(5), 463–467 (1997)
Lin, D.-C.: Information Management: Core Competence of e-Enterprise. Best-Wise Publishing, Taipei (2005)
Lin, P.-S.: Principles of Instruction. Psychological Publishing, Taipei (2004)
Liu, G.-W.: First Text About Knowledge Management. Business Weekly Publications Inc, Taipei (2000)
Mosakowski, E.: Strategy making under causal ambiguity: conceptual issues and empirical evidence. Organ. Sci. 8(4), 414–442 (1997)
McGill, I., Beaty, L.: Action Learning: A Practitioner's Guide. Kogan Page, London (1992)
McNiff, J.L., Lomax, P.P., Whitehead, J.: You and Your Action Research Project. Routledge, New York (1996)
Maula, M.: Three parallel knowledge processes. Knowl. Process. Manag. 7(1), 55–59 (2000)
Nonaka, I., Takeuchi, H.: The Knowledge-Creating Company: How Japanese Companies Create the Dynamics of Innovation. Oxford University Press, Oxford (1995)
Nonaka, I.: A dynamic theory of organizational knowledge creation. Organ. Sci. 5(1), 14–37 (1994)
O'Dell, C., Grayson, C.J.: If only we knew what we know: identification and transfer of internal best practices. Calif. Manage. Rev. 40(3), 154–173 (1998)
Polanyi, M.: The Tacit Dimension. Routledge and Kegan Paul, London (1967)
Sanchez, R.: Managing articulated knowledge in competence-based competition. In: Sanchez, R., Heene, A., Thomas, H. (eds.) Strategic Learning and Knowledge Management, pp. 163–187. Wiley, New York (1997)
Sanchez, R.: Tacit knowledge versus explicit knowledge: approaches to knowledge management practice (2007). www.knowledgeboard.com/download/3512/Tacit-vs-Explicit.pdf
Schunk, D.H., Hanson, A.R., Cox, P.D.: Peer-model attributes and children's achievement behaviors. J. Educ. Psychol. 79(1), 54 (1987)
Sena, J.A., Shani, A.B.: Intellectual capital and knowledge creation: towards an alternative framework. Knowl. Manag. Handbook 8(1), 8–16 (1999)
Stringer, E.T.: Action Research: A Handbook for Practitioners. Macmillan, London (1996)
Slavin, R.E.: Cooperative learning. Rev. Educ. Res. 50(2), 315–342 (1980)
Tannenbaum, M., Tahar, L.: Willingness to communicate in the language of the other: Jewish and Arab students in Israel. Learn. Instr. 18(3), 283–294 (2008)
Tsai, P.-L.: Introduction of Instruction. Pro-ED Publishing Company, Taipei (2001)
Vygotsky, L.: Interaction between learning and development. Read. Dev. Child. 23(3), 34–41 (1978)
Webb, C.: Action research. In: Cormack, D.F.S. (ed.) The Research Process in Nursing, pp. 155–165. Blackwell Scientific, London (1991)
Wen, Y.-F., Tang, K.-Y.: Knowledge Management. Princeton International Publishing Co. Ltd., Taipei (2007)
Wu, S.-H.: Nine Strategies: The Essence of Strategic Thinking. Faces Publication, Taipei (2000)
Yang, Y.-C.: A study of inter-organizational knowledge sharing in information system outsourcing processes. Unpublished Master thesis. Department of Information Management of National Sun Yat-sen University, Kaohsiung (2003)
A Feasibility Study of the Introduction of Service Apartment Operation Model on Long-Term Care Institutions Kuei-Yuan Wang1 , Ying-Li Lin1(B) , Chien-Kuo Han2 , and Shu-Tzu Shih3 1 Department of Finance, Asia University, Taichung, Taiwan
{yuarn,yllin}@asia.edu.tw
2 Department of Food Nutrition and Healthy Biotechnology, Asia University, Taichung, Taiwan
[email protected]
3 Gome Development Co., Ltd., Taichung, Taiwan
[email protected]
Abstract. The Long-Term Care Service Act stipulates that it will be implemented two years after its formulation. This study explored the feasibility of introducing the service apartment model into community-based long-term care institutions. The conclusions are as follows: (1) The establishment of community-based long-term care institutions enables the elderly to obtain old-age support and live a healthy and dignified life, reduces the economic burden on ordinary families, and promotes peace and harmony in the community. (2) The service apartment management model is in line with the development of community-based long-term care institutions: apartment-style equipment and personnel training are based on hotel-like professional services, and day care places the elderly in tourist-like accommodation. The use of equipment for the elderly and courses designed specifically for them is more likely to be favored by the elderly and is less expensive than a private care center. The introduction of serviced apartments into community long-term care institutions is thus an important reference for feasible decisions.
1 Introduction
The elderly population in Taiwan is increasing. At the end of 2020, the registered population of Taiwan was 47,122,472, of which the elderly population (over 65 years old) was 3,471,407 (7.37%) (Statistics Department of the Ministry of the Interior 2021). The health care of the elderly is an important issue that cannot be ignored. In order to improve the long-term care service system, the government must provide more comprehensive long-term care services and ensure that the quality of care and service does not depend on the recipient's gender, sexual orientation, gender identity, marriage, age, disability, illness, class, race, religious beliefs, nationality, or residence area, which would result in differential treatment and trigger discriminatory behaviors. The "Long-Term Care Service Act" was enacted on June 3, 2015 (Laws & Regulations Database of the Republic of China 2021). Next, in order to solve the problem of the financial resources of the Long-Term Care Service Act, and so that the long-term care institutions established under the current laws and regulations can continue to
operate under the new conditions, the inheritance tax, gift tax, and tobacco and alcohol taxes were increased to raise financial resources, and on June 3, 2016, the "Long-Term Care Service Act" was revised again. The service apartment is positioned between the residence and the hotel. Its main target groups include leisure and business travelers. The service apartment provides professional management and family-style equipment, so that guests can feel at home whether they are on a short stay, an extended stay, or a long stay. Most service apartments are located in communities with high population density in major cities, and they usually have the characteristics of convenient transportation and low price. Serviced apartments can also provide temporary accommodation, group houses, small-scale multi-function services, and other integrated services. In addition, young people often need to go to work during the day and have no time to take care of the elderly at home. Therefore, if community-based long-term care institutions can integrate the temporary accommodation, low prices, leisurely vacation, multi-function, and other integrated services provided by service apartments to form a competitive advantage, they can create huge business opportunities in the day care market. Whether it is feasible for community-based long-term care institutions to introduce the business model of hotel-style apartments therefore forms the research motivation of this study. The research purpose of this study is to rethink, by case study, the feasible business strategy and marketing strategy of traditional community-based long-term care institutions introducing service apartments.
2 Methodology
2.1 Case Study
Yin (1981) regards the case study as a kind of empirical research and as a basic research style and research strategy in the social sciences, which should focus on the analysis and study of individual cases rather than merely collecting them. Ye (2001) holds that the case study method is to collect complete information about a particular individual or group and then make an in-depth analysis of the cause and effect of the problem.
2.2 Interview
The interviews in this research are mainly unstructured interviews, and the interview procedure and question design are as follows:
2.2.1 Interview Procedure
The interview procedure of this study uses unstructured interviews. The role of the interviewee is defined as an information provider rather than someone who simply answers questions. In addition, this research hopes not only to allow interviewees to speak freely, but also to let them express more in-depth views. Considering the limited time of the interviewees, an outline of the interview content is sent before the actual interview so that
the interviewee can understand the subject of this research in advance. Finally, after the verbatim draft of the interview is compiled, it is confirmed with the interviewees, so that the interview content and article citations are consistent with the perceptions of both parties.
2.2.2 Interview Outline
This study uses an unstructured interview method to design the interview outline. In order to avoid confusing the interviewee with semantic terms, this research tries to avoid difficult academic terms, chooses colloquial words, and gives examples during the interview to make it easier for the interviewee to understand the questions. The outline of the interview for this study is as follows:
a. What is the incentive for your company to invest in this mature long-term care industry with a service apartment business model?
b. In the face of competition in the industry, what is your company's business strategy?
c. In the face of a saturated market, what is your company's marketing strategy?
2.2.3 Interviewee
This study invited the general manager of Company A as the interviewee. Company A was established in 2009 and is located in Taichung City, Taiwan.
3 Analysis Results
3.1 Company A
Company A adopts "three goods" as its business philosophy: good design, good construction, and good quality. Taiwan's population aging index exceeded 100 for the first time in February 2017, reaching 100.18 (Zhong 2017); for the first time, the elderly population in Taiwan exceeded the young population, and in the future Taiwan will become a super-aged society. Company A believes that the industries related to long-term care institutions will create unlimited business opportunities in the future. Therefore, Company A actively plans to invest in the community-based long-term care industry and introduces a hotel-style apartment business model, hoping to meet the current market demand for long-term care.
3.2 SWOT Analysis
Company A adopts a new business philosophy that is different from its past, and introduces the operations of long-term care service institutions into the service apartment business model. Company A not only builds on the current aging trend of society, but also has a strong grasp of the sources of raw materials in construction and operation. In addition, Company A uses standard operating procedures for its internal operations and uses an information system to integrate all relevant information and data of its various
departments. However, because Company A has entered the long-term care industry for the first time, its understanding of the various laws and regulations on long-term care is relatively inferior to that of its competitors. The SWOT analysis results are shown in Table 1.

Table 1. SWOT analysis results of Company A

Strength
1. A new business philosophy different from the past (S1)
2. Strong ability to master the source of raw materials (S2)

Weakness
1. The understanding of various regulations is relatively inferior to competitors (W1)
2. Technological progress is rapid and difficult to follow (W2)

Opportunity
1. Combining the current trend of an aging society (O1)
2. The internal standard operating program of Company A uses an information system to integrate all relevant information and data of various departments (O2)

Threat
1. Suppliers adopt cooperative development methods; therefore, there are uncertain factors that may prevent cooperation, and it is not easy to establish alternative suppliers (T1)
3.3 TOWS Analysis
3.3.1 SO Strategy
a. Utilize new business concepts and combine them with the external opportunities of an aging society
Company A uses its advantage of a business philosophy different from the past (S1), combined with the current trend of an aging society (O1), to introduce long-term care institutions into the business model of hotel-style apartments. There is no such product concept in the current market, and this business model is unique. Since existing competitors still adopt the traditional long-term care business model, Company A will be able to achieve its goals if it can effectively introduce a more advanced management system.
3.3.2 ST Strategy
a. Cooperate with suppliers to develop and master the source of high-quality raw materials
In the past, Company A mainly focused on housing construction planning, design, construction, and sales. It has rich experience in construction-related projects and has a strong grasp of the sources of raw materials (S2). Company A has established good relationships with manufacturers over a long period. Company A will
adopt a cooperative development approach with suppliers; therefore, it is not easy for Company A to switch to other suppliers (T1). Company A requires suppliers to provide the highest-quality raw materials. If Company A can cooperate with suppliers who have a good reputation, a sense of responsibility, good quality, and excellent technology, and can propose a mutually beneficial cooperation plan, a win-win result can be achieved: the suppliers will be willing to cooperate with Company A for the long term, and Company A can continue to obtain the best raw materials.
3.3.3 WO Strategy
a. Take a positive attitude toward understanding the relevant laws and regulations of the long-term care industry
Because Company A has just entered the long-term care industry, its understanding of long-term care laws and regulations is relatively inferior to that of competitors (W1). In recent years, long-term care industries have received much attention; the "Long-term Care Ten-Year Plan" (O1) has been implemented in Taiwan since 2007 (Ministry of Health and Welfare 2012). These relevant laws and regulations are also external opportunities in the current aging market. Therefore, Company A needs to actively understand the relevant laws and regulations of long-term care.
3.3.4 WT Strategy
a. Combine the company's internal information system to develop diversified technology products
Technology advances rapidly and is difficult to follow (W2); therefore, the entire industry needs to cooperate to develop the long-term care industry to the maximum effect. Company A attaches great importance to informatization: its internal standard operating program integrates all relevant information and data of various departments and units with an information system (O2). Therefore, Company A can combine its information technology experience with the development of ICT products to construct an innovative cloud-based management service system for long-term care institutions. By further improving the quality of service and combining the innovative thinking of "technology and humanity care," Company A implements welfare services for the elderly centered on "aging with vitality," so as to reduce the number of elderly people who need long-term care and to achieve the goal of remote, comprehensive personal health management and care services, so that the elderly can receive better life care. In addition, Company A takes smart care as its core axis and uses technologies such as the Internet of Things and big data analysis to create smart care services and environments. For example, a smart health passbook and wearable devices can continuously track the long-term care service data of the elderly. The introduction and development of care robots is also a focus of development in many countries and is expected to relieve the serious shortage of care manpower.
3.4 Marketing Strategy Analysis
3.4.1 Product Strategy
Because the elderly and the young have different needs, tastes, preferences, and lifestyles, Company A integrates long-term care institutions with the characteristics of the service apartment to provide hotel-like, customized services, for example, professional room cleaning, customized butler service, and thoughtful room equipment. Company A also invites a professional hotel management company to take charge of the management. It has the advantages of meticulous service and complete facilities to meet the needs of the elderly for respect and self-fulfillment.
3.4.2 Pricing Strategy
Company A formulated a three-part pricing strategy as follows:
a. Direction of the pricing strategy: Before formally entering a market, Company A first investigates local market trends, consumer spending power, and so on, and uses these results as a reference for pricing. Since Company A has just entered this market, it intends to adopt a low-to-medium price strategy to encourage consumers to try and experience Company A's products, thereby achieving the goal of sales growth.
b. Influence of the pricing strategy on market share: Because Company A does market research and budget evaluation before it officially enters a market, its products and marketing plans can not only meet the needs of local elderly consumers, but also achieve the goal of sales growth, thereby expanding market share.
c. Influence of the pricing strategy on margin: Company A must not only recover its investment in the product, but also provide the company with sufficient profits. Therefore, Company A formulated a medium-to-high price strategy in the initial stage of the product launch to encourage consumption. Next, Company A conducts big data analysis on the price elasticity of different consumer groups: it applies a higher-price strategy to price-insensitive consumer groups and a low-price strategy to price-sensitive consumer groups, so as to increase the company's profits.
3.4.3 Place Strategy
Company A uses the Internet to create a dedicated web page and provide product-related information online, so that consumers can easily search for product information without having to go to the destination to learn about it. In this way, consumers can not only grasp the latest information, but the information can also be updated in real time.
3.4.4 Promotion Strategy
Company A believes that successful promotional activities must pay attention to three key points, which are as follows:
a. Limited-time offers: If consumers see a promotional activity but do not respond immediately, it is often because they are tired of similar activities or see similar promotional programs too often. Therefore, Company A tries to keep the duration of each promotion within the consumer's "hesitation period," for example, by formulating limited-time preferential housing schemes to stimulate consumers' willingness to spend.
b. Online fan page for quick information updates: Under the influence of the rapid development of information technology, Company A has established a fan page on the Internet and updates its information from time to time. Through the power of the Internet, Company A can not only reduce advertising expenditures, but also expand its product promotion rapidly.
c. Combination with integrated marketing tools: Company A combines promotion with local daily community activities. This method can not only improve the visibility of the organization, but also increase the exposure of its products. In addition, Company A has launched more favorable plans for people in the same county or city, and has combined local medical resources to launch more attractive integrated marketing plans, giving consumers the best service and environment for the elderly.
4 Conclusions
Company A uses a service apartment business model to overturn people's traditional thinking about community-based long-term care institutions. Company A not only improves the quality of service, but also improves the living environment. Before entering the market, Company A first studies local market trends and consumer spending power and analyzes the results to formulate its pricing strategy. In addition, since Company A's business model also uses information technology to create a dedicated web page, consumers can easily obtain Company A's product information. Company A has also combined local community activities, increased product exposure, and integrated marketing tools to launch more attractive marketing programs, providing consumers with the best service and elderly care environment and increasing market share and gains. From the experience of Company A, it can be seen that it is feasible for long-term care institutions to introduce service apartment operations, and there will be considerable business opportunities in the future long-term care market. This study offers the following suggestions: (1) In response to the aging population, the government's introduction of foreign labor can relieve the shortage of domestic care manpower, improve care for the disabled, and reduce the burden of domestic women's housework; however, if the introduction of foreign labor is excessive, it may deprive local laborers of employment. Therefore, it is recommended that the introduction of foreign labor be planned appropriately to meet the needs of the current industry, so as to avoid excessive deprivation of the right
to employment of local labor. (2) Care attendants' low satisfaction with work remuneration, benefits, and career development opportunities is the main reason for resignation. In addition, overly long working hours can easily disrupt their daily routines, affect their physical and mental condition, lead to job burnout, and raise the turnover rate. It is therefore recommended to formulate appropriate working hours and related benefits and regulations to protect care attendants from obstacles to employment, and to provide relevant vocational training to enhance their professionalism. (3) This research recommends that the industry strive to develop localized and advanced service models, and encourage the integration of self-help, mutual assistance, and innovative services to provide excellent community care.
References Laws & Regulations Database of the Republic of China, Long-Term Care Services Act (2021). https://law.moj.gov.tw/LawClass/LawAll.aspx?pcode=L0070040 Ministry of Health and Welfare, Long-term care ten-year plan-101 to 104 mid-year plan book, Taipei: Ministry of Health and Welfare (2012) Statistics Department of the Ministry of the Interior (2021). https://ws.moi.gov.tw/001/Upload/ 400/relfile/0/4405/75dd71a8-e5f0-4b89-820a-f0cc57c4e951/year/year.html Ye, C.C.: Educational Research Method. Psychology, Taipei (2001) Yin, R.K.: The case study crisis: some answers. Adm. Sci. Q. 26(1), 58–65 (1981) Zhong, N.: (2017). http://www.chinatimes.com/newspapers/20170310000049-26
Research on the Integration of Information Technology into Aesthetic Teaching in Kindergartens
Ya-Lan Chan1, Pei-Fang Lu1, Sue-Ming Hsu2, and Mei-Hua Liao3(B)
1 Department of Business Administration, Asia University, Taichung, Taiwan
2 Department of Business Administration, Tunghai University, Taichung, Taiwan
3 Department of Finance, Asia University, Taichung, Taiwan
[email protected]
Abstract. This study explores the impact of the current use of information technology on its integration into teaching in the field of aesthetics. Taking kindergartens in Taichung City (Taiwan) as the setting, we examine how the current use of information technology and its integration into aesthetics teaching differ across background variables. The research targets public and private kindergartens registered in Taichung City and is based on 385 valid questionnaires. From the analysis of the questionnaire data, this study draws the following conclusions: 1. The use of information technology by Taichung City's early childhood education and care service personnel differs significantly with the background variable "position held". 2. There is a significant positive correlation between these personnel's current use of information technology and the integration of information technology into aesthetics teaching. 3. The higher the current level of information technology use among Taichung City's early childhood education and care service personnel, the higher their degree of acceptance of integrating information technology into the field of aesthetics.
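The paper does not state how the target of 385 valid questionnaires was determined; one common derivation in survey research (an assumption here, not the authors' stated method) is Cochran's sample-size formula for a large population at a 95% confidence level with a 5% margin of error and maximum variance:

```latex
% Cochran's formula; z = 1.96, p = 0.5, e = 0.05 are assumed values
n = \frac{z^2\,p(1-p)}{e^2}
  = \frac{(1.96)^2 (0.5)(0.5)}{(0.05)^2}
  \approx 384.16
  \;\Longrightarrow\; n = 385 \text{ (rounded up)}
```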
1 Introduction
Aesthetics education has long been regarded as part of basic education, and Finland is the best example: its performance in music, architecture, and design aesthetics has astounded the world, elevating "beauty" to "national power" (Tianxia Magazine, 2005). In recent years, Taiwan's education policy has gradually emphasized the importance of aesthetic education. In 2004, the Ministry of Education proposed policies related to art education, but only for the primary and secondary levels; in the 2005 education policy blueprint, aesthetic education was listed as a main development strategy, yet the primary and secondary levels remained the focus. In 2013, the Ministry of Education announced the "Aesthetic Sense from Childhood,
Aesthetic Lifelong Learning" mid-to-long-term five-year plan for aesthetic education. In this era of advanced information technology, can different information technologies be used to assist teachers in the field of aesthetics, and is such equipment effective in aesthetics teaching? This study explores the integration of information technology into teaching in the field of aesthetics. Its theme is "the impact of the current use of information technology on the integration of information technology into teaching in the field of aesthetics". Through literature analysis and empirical research, it examines whether early childhood education and care service personnel's use of information technology in aesthetics teaching can help teachers implement teaching in the field of aesthetics.
2 Related Research on Information Technology and Teaching in the Field of Aesthetics
(1) Information-technology-assisted teaching methods. (i) Computer-assisted instruction (CAI). Computer assistance is the use of computers to support teachers in teaching and help students learn (Li Baojin, 1984). Computer-assisted teaching materials can increase students' motivation to learn; short units compiled as computer-assisted textbooks allow students to complete them at the pace of their own learning ability. (ii) The ASSURE model of systematic teaching design. Heinich, Molenda, Russell, and Smaldino (2002) proposed that, in actual teaching situations, teachers should carefully choose and make good use of multimedia tools to help achieve teaching goals and encourage students to participate interactively. The acronym "ASSURE", formed from six verbs, expresses the idea of "ensuring the success and effectiveness of teaching":
A: Analyze learners. Learner characteristics are considered in three aspects: general characteristics, specific entry competencies, and learning style.
S: State objectives. Specify the knowledge or attitudes learners should possess at the end of the study.
S: Select instructional methods, media, and materials. After teachers understand the students' characteristics and write the learning objectives, they must choose appropriate methods, media, and teaching materials to bridge the starting point and the end point.
U: Utilize media and materials. This stage includes the following steps: the teacher previews the materials, arranges the teaching environment, prepares the learners, and operates or presents the teaching materials.
R: Require learner participation. Teachers should give learners opportunities to practice newly acquired knowledge and provide feedback to enhance learning effectiveness.
E: Evaluate and revise. Evaluation covers learner achievement as well as the teaching media and materials.
(2) The aesthetics domain in the new kindergarten curriculum: The three abilities in the aesthetics domain of the curriculum guidelines are "exploration and awareness", "performance and creation", and "response and appreciation". This research mainly discusses two of them: 1. "Exploration and awareness" refers to using keen awareness to explore the beauty of things in everyday life and to perceive the changes in them. In an encouraging learning environment, such varied aesthetic experiences can arouse children's curiosity and exploration. 2. "Response and appreciation" means expressing individual or group feelings about, and preferences for, the diverse artistic creations or performances in the living environment. Younger children usually respond to these creations intuitively with body movements or vocal expressions; as they grow older, they gradually describe or express their feelings and opinions in more complex ways.
3 Research Methods and Statistical Analysis
(1) Research structure.
H1: Teachers' current use of information technology has a positive influence on "exploration and awareness" in aesthetics teaching.
H1-1: Teachers' information technology ability has a positive influence on exploration and awareness.
H1-2: Teachers' willingness to use information technology has a positive influence on exploration and awareness.
H1-3: Teachers' information technology attitudes and concepts have a positive influence on exploration and awareness.
H2: Teachers' current use of information technology has a positive influence on "response and appreciation" in aesthetics teaching.
H2-1: Teachers' information technology ability has a positive influence on response and appreciation.
H2-2: Teachers' willingness to use information technology has a positive influence on response and appreciation.
H2-3: Teachers' information technology attitudes and concepts have a positive influence on response and appreciation.
(2) Analysis of the influence of the current use of information technology on the integration of information technology into aesthetics teaching. This research mainly adopts regression analysis, using the three dimensions of information technology use, namely "information technology ability", "information technology willingness", and "information technology attitudes and concepts", as independent variables, and "exploration and awareness" and "response and appreciation" as dependent variables; a sketch of this analysis appears below.
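As an illustration only, the following is a minimal sketch of the regression design described above, written in Python with pandas and statsmodels. The file name and column names are hypothetical, since the paper does not publish its data or variable coding.

```python
# Minimal sketch of the regression analysis: three dimensions of IT use
# as independent variables, two aesthetics-domain abilities as dependent
# variables. File and column names below are assumptions, not the authors'.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("questionnaire.csv")  # hypothetical file of 385 valid responses

# Independent variables: the three dimensions of information technology use.
X = df[["it_ability", "it_willingness", "it_attitude"]]
X = sm.add_constant(X)  # add the intercept term

# One regression per dependent variable (H1: exploration and awareness,
# H2: response and appreciation).
for dv in ["exploration_awareness", "response_appreciation"]:
    model = sm.OLS(df[dv], X).fit()
    print(dv)
    print(model.summary())  # reports the F-statistic, p-values, coefficients
```

The F-statistic printed by each model summary corresponds to the overall model significance test reported in the paper's Table 1.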
(i) Analysis of the influence of the independent variables (the dimensions of information technology use) on the dependent variable "exploration and awareness" (Table 1). The results show that the model is significant (F = 51.635, p