Lecture Notes in Networks and Systems 313
Leonard Barolli Hsing-Chung Chen Tomoya Enokido Editors
Advances in Networked-Based Information Systems The 24th International Conference on Network-Based Information Systems (NBiS-2021)
Lecture Notes in Networks and Systems Volume 313
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas— UNICAMP, São Paulo, Brazil Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus Imre J. Rudas, Óbuda University, Budapest, Hungary Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/15179
Editors

Leonard Barolli, Department of Information and Communication Engineering, Fukuoka Institute of Technology, Fukuoka, Japan

Hsing-Chung Chen, Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan

Tomoya Enokido, Faculty of Business Administration, Rissho University, Tokyo, Japan
ISSN 2367-3370 / ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-3-030-84912-2 / ISBN 978-3-030-84913-9 (eBook)
https://doi.org/10.1007/978-3-030-84913-9

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Welcome Message from NBiS-2021 Organizing Committee
We would like to welcome you to the 24th International Conference on Network-Based Information Systems (NBiS-2021), which will be held at Asia University, Taichung, Taiwan, from September 1 to September 3, 2021. The main objective of NBiS is to bring together scientists, engineers, and researchers from both network systems and information systems with the aim of encouraging the exchange of ideas, opinions, and experiences between these two communities. NBiS started as a workshop and was held for 12 years together with the DEXA International Conference as one of the oldest among the DEXA workshops. The workshop was very successful, and in the 2009 edition, NBiS was held at IUPUI, Indianapolis, USA, as an independent international conference supported by many international volunteers. In the following years, NBiS was held in Takayama, Gifu, Japan (2010); Tirana, Albania (2011); Melbourne, Australia (2012); Gwangju, Korea (2013); Salerno, Italy (2014); Taipei, Taiwan (2015); Ostrava, Czech Republic (2016); Toronto, Canada (2017); Bratislava, Slovakia (2018); Oita, Japan (2019); and Victoria, Canada (2020). It is our honor to chair this prestigious conference, as one of the important conferences in the field. Extensive international participation, coupled with rigorous peer reviews, has made this an exceptional technical conference. The technical program and workshops add important dimensions to this event. We hope that you will enjoy each and every component of this event and benefit from interactions with other attendees.

Since its inception, NBiS has attempted to bring together people interested in information and networking, in areas that range from theoretical aspects to the practical design of new network systems, distributed systems, multimedia systems, Internet/Web technologies, mobile computing, intelligent computing, pervasive/ubiquitous networks, dependable systems, semantic services, and scalable computing. For NBiS-2021, we have continued these efforts as novel networking concepts emerge and new applications flourish. In this edition of NBiS, many papers were submitted from all over the world. They were carefully reviewed, and only high-quality papers will be presented during the conference days.
The organization of an international conference requires the support and help of many people. Many people have helped and worked hard for a successful NBiS-2021 technical program and conference proceedings. First, we would like to thank all the authors for submitting their papers. We are indebted to the track co-chairs, program committee members, and reviewers, who carried out the most difficult work of carefully evaluating the submitted papers. We would like to express our great appreciation to our keynote speakers for accepting our invitation to deliver keynote talks at NBiS-2021. We hope that you have an enjoyable and productive time during the conference.
NBiS-2021 Organizing Committee
Honorary Chairs
Shian-Shyong Tseng, Asia University, Taiwan
Mao-Jiun Wang, Tunghai University, Taiwan
General Co-chairs
Hsing-Chung Chen, Asia University, Taiwan
Marek Ogiela, AGH University of Science and Technology, Poland
Naohiro Hayashibara, Kyoto Sangyo University, Japan
Program Committee Co-chairs
Fang-Yie Leu, Tunghai University, Taiwan
Tomoya Enokido, Rissho University, Japan
Kin Fun Li, University of Victoria, Canada
Award Co-chairs
Yung-Fa Huang, Chaoyang University of Technology, Taiwan
Minoru Uehara, Toyo University, Japan
David Taniar, Monash University, Australia
Arjan Durresi, IUPUI, USA
Publicity Co-chairs
Yeong-Sheng Chen, National Taipei University of Education, Taiwan
Markus Aleksy, ABB AG, Germany
Wenny Rahayu, La Trobe University, Australia
Isaac Woungang, Ryerson University, Canada
Lidia Ogiela, Pedagogical University of Cracow, Poland
International Liaison Co-chairs
Chuan-Yu Chang, National Yunlin University of Science and Technology, Taiwan
Ching-Chun Chang, Tsinghua University, China
Tomoyuki Ishida, Fukuoka Institute of Technology, Japan
Farookh Hussain, University of Technology Sydney, Australia
Hiroaki Kikuchi, Meiji University, Japan
Zahoor Khan, Higher Colleges of Technology, UAE
Local Arrangement Co-chairs
Jui-Chi Chen, Asia University, Taiwan
Wei-Zu Yang, Asia University, Taiwan
Shyi-Shiun Kuo, Nan Kai University of Technology, Taiwan
Finance Chair
Makoto Ikeda, Fukuoka Institute of Technology, Japan
Web Administrator Co-chairs
Phudit Ampririt, Fukuoka Institute of Technology, Japan
Kevin Bylykbashi, Fukuoka Institute of Technology, Japan
Ermioni Qafzezi, Fukuoka Institute of Technology, Japan
Steering Committee
Leonard Barolli, Fukuoka Institute of Technology, Japan
Makoto Takizawa, Hosei University, Japan
Track Areas and PC Members

Track 1: Mobile and Wireless Networks

Track Co-chairs
Tetsuya Shigeyasu, Prefectural University of Hiroshima, Japan
Vamsi Krishna Paruchuri, University of Central Arkansas, USA
Makoto Ikeda, Fukuoka Institute of Technology, Japan

PC Members
Nobuyoshi Sato, Iwate Prefectural University, Japan
Kazunori Ueda, Kochi University of Technology, Japan
Masaaki Yamanaka, Japan Coast Guard Academy, Japan
Takuya Yoshihiro, Wakayama University, Japan
Tomoya Kawakami, Nara Institute of Science and Technology, Japan
Masaaki Noro, Fujitsu Laboratory, Japan
Admir Barolli, Aleksander Moisiu University of Durresi, Albania
Keita Matsuo, Fukuoka Institute of Technology, Japan
Elis Kulla, Okayama University of Science, Japan
Arjan Durresi, IUPUI, USA
Track 2: Internet of Things and Big Data

Track Co-chairs
Stelios Sotiriadis, Birkbeck, University of London, UK
Chun-Wei Tsai, National Ilan University, Taiwan
Patrick Hung, University of Ontario Institute of Technology, Canada

PC Members
Sergio Toral, University of Seville, Spain
Euripides G. M. Petrakis, Technical University of Crete (TUC), Greece
Mario Dantas, Federal University of Juiz de Fora (UFJF), Brazil
Xiaolong Xu, University of Posts & Telecommunications, China
Kevin Curran, Ulster University, UK
Shih-Chia Huang, National Taipei University of Technology, Taiwan
Jorge Roa, UTN Santa Fe, Argentina
Alvaro Joffre Uribe, Universidad Militar Nueva Granada, Colombia
Marcelo Fantinato, University of Sao Paulo, Brazil
Marco Zennaro, Wireless and T/ICT4D Laboratory, Italy
Priyanka Rawat, University of Avignon, France
Francesco Piccialli, University of Naples Federico II, Italy
Chi-Yuan Chen, National Ilan University, Taiwan
Track 3: Cloud, Grid and Service Oriented Computing

Track Co-chairs
Ciprian Dobre, Polytechnic University of Bucharest, Romania
Omar Hussain, UNSW Canberra, Australia
Muhammad Younas, Oxford Brookes University, UK

PC Members
Adil Hammadi, Sultan Qaboos University, Oman
Walayat Hussain, University of Technology Sydney, Australia
Farookh Hussain, University of Technology Sydney, Australia
Rui Pais, University of Stavanger, Norway
Raymond Hansen, Purdue University, USA
Antorweep Chakravorty, University of Stavanger, Norway
Rui Esteves, National Oilwell Varco, Norway
Constandinos X. Mavromoustakis, University of Nicosia, Cyprus
Ioan Salomie, Technical University of Cluj-Napoca, Romania
George Mastorakis, Technological Educational Institute of Crete, Greece
Sergio L. Toral Marín, University of Seville, Spain
Marc Frincu, West University of Timisoara, Romania
Alexandru Costan, IRISA/INSA Rennes, France
Xiaomin Zhu, National University of Defense Technology, China
Radu Tudoran, Huawei, Munich, Germany
Mauro Migliardi, University of Padua, Italy
Harold Castro, Universidad de Los Andes, Colombia
Andrea Tosatto, Open-Xchange, Germany
Rodrigo Calheiros, Western Sydney University, Australia
Track 4: Multimedia and Web Applications

Track Co-chairs
Takahiro Uchiya, Nagoya Institute of Technology, Japan
Tomoyuki Ishida, Fukuoka Institute of Technology, Japan
Nobuo Funabiki, Okayama University, Japan

PC Members
Shigeru Fujita, Chiba Institute of Technology, Japan
Yuka Kato, Tokyo Woman's Christian University, Japan
Yoshiaki Kasahara, Kyushu University, Japan
Rihito Yaegashi, Kagawa University, Japan
Kazunori Ueda, Kochi University of Technology, Japan
Ryota Nishimura, Keio University, Japan
Shohei Kato, Nagoya Institute of Technology, Japan
Shinsuke Kajioka, Nagoya Institute of Technology, Japan
Atsuko Muto, Nagoya Institute of Technology, Japan
Kaoru Sugita, Fukuoka Institute of Technology, Japan
Noriyasu Yamamoto, Fukuoka Institute of Technology, Japan
Track 5: Ubiquitous and Pervasive Computing

Track Co-chairs
Chi-Yi Lin, Tamkang University, Taiwan
Elis Kulla, Okayama University of Science, Japan
Isaac Woungang, Ryerson University, Canada

PC Members
Jichiang Tsai, National Chung Hsing University, Taiwan
Chang Hong Lin, National Taiwan University of Science and Technology, Taiwan
Meng-Shiuan Pan, Tamkang University, Taiwan
Chien-Fu Cheng, Tamkang University, Taiwan
Ang Chen, University of Pennsylvania, USA
Santi Caballe, Open University of Catalonia, Spain
Evjola Spaho, Polytechnic University of Tirana, Albania
Makoto Ikeda, Fukuoka Institute of Technology, Japan
Neeraj Kumar, Thapar University, India
Hamed Aly, Acadia University, Canada
Glaucio Carvalho, Sheridan College, Canada
Track 6: Network Security and Privacy

Track Co-chairs
Takamichi Saito, Meiji University, Japan
Sriram Chellappan, University of South Florida, USA
Feilong Tang, Shanghai Jiao Tong University, China

PC Members
Satomi Saito, Fujitsu Laboratories, Japan
Kazumasa Omote, University of Tsukuba, Japan
Koji Chida, NTT, Japan
Hiroki Hada, NTT Security (Japan) KK, Japan
Hirofumi Nakakouji, Hitachi, Ltd., Japan
Na Ruan, Shanghai Jiao Tong University, China
Chunhua Su, Osaka University, Japan
Kazumasa Omote, University of Tsukuba, Japan
Toshihiro Yamauchi, Okayama University, Japan
Masakazu Soshi, Hiroshima City University, Japan
Bagus Santoso, The University of Electro-Communications, Japan
Laiping Zhao, Tianjin University, China
Jingyu Hua, Nanjing University, China
Xiaobo Zhou, Tianjin University, China
Yuan Zhan, Nanjing University, China
Yizhi Ren, Hangzhou Dianzi University, China
Arjan Durresi, IUPUI, USA
Vamsi Krishna Paruchuri, University of Central Arkansas, USA
Track 7: Database, Data Mining and Semantic Computing

Track Co-chairs
Wendy K. Osborn, University of Lethbridge, Canada
Eric Pardede, La Trobe University, Australia
Akimitsu Kanzaki, Shimane University, Japan

PC Members
Asm Kayes, La Trobe University, Australia
Ronaldo dos Santos Mello, Universidade Federal de Santa Catarina, Brazil
Saqib Ali, Sultan Qaboos University, Oman
Hong Quang Nguyen, Ho Chi Minh City International University, Vietnam
Irena Holubova, Charles University, Prague, Czech Republic
Prakash Veeraraghavan, La Trobe University, Australia
Carson Leung, University of Manitoba, Canada
Marwan Hassani, Aachen University, Germany
Tomoki Yoshihisa, Osaka University, Japan
Tomoya Kawakami, NAIST, Japan
Atsushi Takeda, Tohoku Gakuin University, Japan
Yoshiaki Terashima, Soka University, Japan
Yuuichi Teranishi, NICT, Japan
Track 8: Network Protocols and Applications

Track Co-chairs
Sanjay Kumar Dhurandher, NSIT, University of Delhi, India
Hsing-Chung Chen, Asia University, Taiwan

PC Members
Amita Malik, Deenbandhu Chhotu Ram University of Science and Technology, India
Mayank Dave, NIT Kurukshetra, India
Vinesh Kumar, University of Delhi, India
R. K. Pateriya, MANIT Bhopal, India
Himanshu Aggarwal, Punjabi University, India
Neng-Yih Shih, Asia University, Taiwan
Yeong-Chin Chen, Asia University, Taiwan
Hsi-Chin Hsin, National United University, Taiwan
Ming-Shiang Huang, Asia University, Taiwan
Chia-Cheng Liu, Asia University, Taiwan
Chia-Hsin Cheng, National Formosa University, Yunlin County, Taiwan
Tzu-Liang Kung, Asia University, Taiwan
Gene Shen, Asia University, Taiwan
Jim-Min Lin, Feng Chia University, Taiwan
Chia-Cheng Liu, Asia University, Taiwan
Yen-Ching Chang, Chung Shan Medical University, Taiwan
Shu-Hong Lee, Chienkuo Technology University, Taiwan
Ho-Lung Hung, Chienkuo Technology University, Taiwan
Gwo-Ruey Lee, Lung-Yuan Research Park, Taiwan
Li-Shan Ma, Chienkuo Technology University, Taiwan
Chung-Wen Hung, National Yunlin University of Science & Technology, Taiwan
Yung-Chen Chou, Asia University, Taiwan
Chen-Hung Chuang, Asia University, Taiwan
Jing-Doo Wang, Asia University, Taiwan
Jui-Chi Chen, Asia University, Taiwan
Young-Long Chen, National Taichung University of Science and Technology, Taiwan
Track 9: Intelligent and Cognitive Computing

Track Co-chairs
Lidia Ogiela, Pedagogical University of Cracow, Poland
Farookh Hussain, University of Technology Sydney, Australia
Shinji Sakamoto, Seikei University, Japan

PC Members
Yiyu Yao, University of Regina, Canada
Daqi Dong, University of Memphis, USA
Jan Platoš, VŠB-Technical University of Ostrava, Czech Republic
Pavel Krömer, VŠB-Technical University of Ostrava, Czech Republic
Urszula Ogiela, Pedagogical University of Krakow, Poland
Jana Nowaková, VŠB-Technical University of Ostrava, Czech Republic
Hoon Ko, Chosun University, South Korea
Chang Choi, Chosun University, Republic of Korea
Gangman Yi, Gangneung-Wonju National University, Korea
Wooseok Hyun, Korean Bible University, Korea
Hsing-Chung Jack Chen, Asia University, Taiwan
Jong-Suk Ruth Lee, KISTI, Korea
Hyun Jung Lee, Yonsei University, Korea
Ji-Young Lim, Korean Bible University, Korea
Omar Hussain, UNSW Canberra, Australia
Saqib Ali, Sultan Qaboos University, Oman
Morteza Saberi, UNSW Canberra, Australia
Sazia Parvin, UNSW Canberra, Australia
Walayat Hussain, University of Technology Sydney, Australia
Tetsuya Oda, Okayama University of Science, Japan
Makoto Ikeda, Fukuoka Institute of Technology, Japan
Admir Barolli, Aleksander Moisiu University of Durresi, Albania
Yi Liu, National Institute of Technology, Oita College, Japan
Track 10: Parallel and Distributed Computing

Track Co-chairs
Naohiro Hayashibara, Kyoto Sangyo University, Japan
Bhed Bista, Iwate Prefectural University, Japan

PC Members
Tomoya Enokido, Rissho University, Japan
Kosuke Takano, Kanagawa Institute of Technology, Japan
Masahiro Ito, Toshiba Lab, Japan
Jiahong Wang, Iwate Prefectural University, Japan
Shigetomo Kimura, University of Tsukuba, Japan
Chotipat Pornavalai, King Mongkut's Institute of Technology Ladkrabang, Thailand
Danda B. Rawat, Howard University, USA
Gongjun Yan, University of Southern Indiana, USA
Naonobu Okazaki, Miyazaki University, Japan
Yoshiaki Terashima, Soka University, Japan
Atsushi Takeda, Tohoku Gakuin University, Japan
Tomoki Yoshihisa, Osaka University, Japan
Akira Kanaoka, Toho University, Japan
NBiS-2021 Reviewers
Ali Khan Zahoor, Barolli Admir, Barolli Leonard, Bista Bhed, Caballé Santi, Chang Chuan-Yu, Chellappan Sriram, Chen Hsing-Chung, Cui Baojiang, Di Martino Beniamino, Durresi Arjan, Enokido Tomoya, Ficco Massimo, Fun Li Kin, Funabiki Nobuo, Gotoh Yusuke, Hussain Farookh, Hussain Omar, Javaid Nadeem, Jeong Joshua, Ikeda Makoto, Ishida Tomoyuki, Kikuchi Hiroaki, Kohana Masaki, Koyama Akio, Kulla Elis, Matsuo Keita, Nishigaki Masakatsu, Ogiela Lidia, Ogiela Marek, Okada Yoshihiro, Omote Kazumasa, Palmieri Francesco, Paruchuri Vamsi Krishna, Rahayu Wenny, Rawat Danda, Shibata Yoshitaka, Saito Takamichi, Sato Fumiaki, Takizawa Makoto, Tang Feilong, Taniar David, Uchiya Takahiro, Uehara Minoru, Venticinque Salvatore, Wang Xu An, Woungang Isaac, Xhafa Fatos
NBiS-2021 Keynote Talks
Big Data Management for Data Streams
Wenny Rahayu, La Trobe University, Melbourne, Australia
Abstract. One of the main drivers behind big data in recent years has been the proliferation of applications and devices that generate data with high velocity in multiple formats. These devices include IoT sensors, mobile devices, GPS trackers, and so on. This new generation of data, called data streams, requires new ways to manage, process, and analyze. Data streams drive the need for a new database architecture that is able to manage the complexity of multiple data formats, deal with high-speed data, and integrate them into a scalable data management system. In this talk, the primary motivation for using data lakes, a new wave of database management underpinned by the need to deal with the volume and variety of big data storage, will be presented. Then, some case studies demonstrating the development of big data ecosystems involving data streams will be discussed. These case studies include the development of a data lake for a smart factory with sensor data collection/ingestion, and a big data system for GPS crowdsourcing as part of community planning.
Convergence of Broadcast and Broadband in 5G Era
Yusuke Gotoh, Okayama University, Okayama, Japan
Abstract. In order to converge broadband and broadcast, realizing TV viewing on mobile devices is a particularly important challenge. Work on standardized mobile communication technologies for multicast transmission started in 2006, and now Further evolved Multimedia Broadcast Multicast Service (FeMBMS) is an official component of 5G in 3GPP as an LTE-based 5G terrestrial broadcasting system. In this talk, I will introduce the technologies for the convergence of broadband and broadcast in the 5G era. Furthermore, I will introduce our recent work on broadcasting technology that maintains compatibility with 5G mobile networks.
Contents
A Monotonically Increasing (MI) Algorithm to Estimate Energy Consumption and Execution Time of Processes on a Server . . . 1
Dilawaer Duolikun, Tomoya Enokido, Leonard Barolli, and Makoto Takizawa

Performance Comparison of CM and LDIWM Router Replacement Methods for WMNs by WMN-PSOHC Simulation System Considering Chi-Square Distribution of Mesh Clients . . . 13
Shinji Sakamoto, Yi Liu, Leonard Barolli, and Shusuke Okamoto

A Capability Token Selection Algorithm for Lightweight Information Flow Control in the IoT . . . 23
Shigenari Nakamura, Tomoya Enokido, and Makoto Takizawa

Join Processing in Varying Periodic and Aperiodic Spatial Data Streams . . . 35
Wendy Osborn

The Improved Redundant Active Time-Based Algorithm with Forcing Termination of Meaningless Replicas in Virtual Machine Environments . . . 50
Tomoya Enokido, Dilawaer Duolikun, and Makoto Takizawa

Optical Simulations on Aerial Transmitting Laser Beam for Free Space Optics Communication . . . 59
Shun Jono, Takuto Koyama, Kota Watanabe, Kiyotaka Izumi, and Takeshi Tsujimura

Employee Management Support Application for Regional Public Transportation Service in Japan . . . 71
Chinasa Sueyoshi, Hideya Takagi, Toshihiro Uchibayashi, and Kentaro Inenaga

Message Ferry Routing Based on Nomadic Lévy Walk in Delay Tolerant Networks . . . 82
Koichiro Sugihara and Naohiro Hayashibara

A Trust-Based Tool for Detecting Potentially Damaging Users in Social Networks . . . 94
Kaley J. Rittichier, Davinder Kaur, Suleyman Uslu, and Arjan Durresi

Secure Cloud Storage Using Color Code in DNA Computing . . . 105
Saravanan Manikandan, Islam M. D. Saikhul, Hsing-Chung Chen, and Yu-Lin Song

A Hybrid Intelligent Simulation System for Node Placement in WMNs: A Comparison Study of Chi-Square and Uniform Distributions of Mesh Clients for CM and LDVM Router Replacement Methods . . . 117
Admir Barolli, Kevin Bylykbashi, Ermioni Qafzezi, Shinji Sakamoto, and Leonard Barolli

Outage Probability of CR-NOMA Schemes with Multiple Antennas Selection and Power Transfer Approach . . . 131
Hong-Nhu Nguyen, Ngoc-Long Nguyen, Nhat-Tien Nguyen, Ngoc-Lan Nguyen, and Miroslav Voznak

An Efficient Framework for Resource Allocation and Dynamic Pricing Scheme for Completion Time Failure in Cloud Computing . . . 143
Anjan Bandyopadhyay, Vikash Kumar Singh, Sajal Mukhopadhyay, Ujjwal Rai, and Arghya Bandyopadhyay

Inferring Anomalies from Cloud Metrics Using Recurrent Neural Networks . . . 154
Spyridon Chouliaras and Stelios Sotiriadis

Method of Lyric Association Based on Mind Mapping in Collaborative Lyric Writing of Popular Music . . . 165
Meguru Yamashita and Kiwamu Satoh

Collaborative Virtual Environments for Jaw Surgery Simulation . . . 179
Krit Khwanngern, Juggapong Natwichai, Vivatchai Kaveeta, Phornphanit Meenert, and Sawita Sriyong

Deterrence-Based Trust: A Study on Improving the Credibility of Social Media Messages in Disaster Using Registered Volunteers . . . 188
Takumi Kitagawa, Tetsushi Ohki, Yuki Koizumi, Yoshinobu Kawabe, Toru Hasegawa, and Masakatsu Nishigaki

A Perceptron Mixture Model of Intrusion Detection for Safeguarding Electronic Health Record System . . . 202
Wei Lu and Ling Xue

Personalized Cryptographic Protocols - Obfuscation Technique Based on the Qualities of the Individual . . . 213
Radosław Bułat and Marek R. Ogiela

Personalized Cryptographic Protocols for Advanced Data Protection . . . 219
Urszula Ogiela, Makoto Takizawa, and Lidia Ogiela

Antilock Braking System (ABS) Based Control Type Regulator Implemented by Neural Network in Various Road Conditions . . . 223
Hsing-Chung Chen, Andika Wisnujati, Agung Mulyo Widodo, Yu-Lin Song, and Chi-Wen Lung

Physical Memory Management with Two Page Sizes in Tender OS . . . 238
Koki Kusunoki, Toshihiro Yamauchi, and Hideo Taniguchi

Sensor-Based Motion Analysis of Paralympic Boccia Athletes . . . 249
Ayumi Ohnishi, Tsutomu Terada, and Masahiko Tsukamoto

A Design and Development of a Near Video-on-Demand Systems . . . 258
Tomoki Yoshihisa

A Consideration of Delivering Method for Super-Resolution Video . . . 268
Yusuke Gotoh and Takayuki Oishi

Proposal of a Tele-Immersion Visiting System . . . 275
Reiya Yahada and Tomoyuki Ishida

A Study on the Impact of High Refresh-Rate Displays on Scores of eSports . . . 283
Koshiro Murakami and Hideo Miyachi

Assessing the Sense of Presence to Evaluate the Effectiveness of Virtual Reality Wildfire Training . . . 289
Huang Heyao and Ogi Tetsuro

3D Measurement and Feature Extraction for Metal Nuts . . . 299
Zhiyi Gao, Tohru Kato, Hiroki Takahashi, and Akio Doi

A Machine Learning Approach for Predicting 2D Aircraft Position Coordinates . . . 306
Kazuma Matsuo, Makoto Ikeda, and Leonard Barolli

Evaluation of Rainfall Characteristics Between 1-h Precipitation and 10-min Precipitation Observed by AMeDAS . . . 312
Kiyotaka Fujisaki

Numerical Analysis of Photonic Crystal Waveguide with Stub by CIP Method . . . 320
Hiroshi Maeda

A CCM-Based HC System for Mesh Router Placement Optimization: A Comparison Study for Different Instances Considering Normal and Uniform Distributions of Mesh Clients . . . 329
Aoto Hirata, Tetsuya Oda, Nobuki Saito, Yuki Nagai, Kyohei Toyoshima, and Leonard Barolli

EPOQAS: Development of an Event Program Organizer with a Question Answering System . . . 341
Jun Iio

Estimating User's Movement Path Using Wi-Fi Authentication Log . . . 349
Jun Yamano, Yasuhiro Ohtaki, and Kazuyuki Yamamoto

Integrating PPN into Autoencoders for Better Information Aggregation Performance . . . 359
Yudai Okui, Tatsuhiro Yonekura, and Masaru Kamada

An AR System to Practice Drums . . . 367
Kaito Kikuchi, Michitoshi Niibori, and Masaru Kamada

A Proposal of Learning Feedback System for Children to Promote Self-directed Learning . . . 374
Yoshihiro Kawano and Yuka Kawano

A SPA of Online Lecture Contents with Voice . . . 384
Masaki Kohana, Shusuke Okamoto, and Masaru Kamada

A Dynamic and Distributed Simulation Method for Web-Based Games . . . 391
Ryoya Fukutani, Shusuke Okamoto, Shinji Sakamoto, and Masaki Kohana

Author Index . . . 401
A Monotonically Increasing (MI) Algorithm to Estimate Energy Consumption and Execution Time of Processes on a Server

Dilawaer Duolikun (1), Tomoya Enokido (2), Leonard Barolli (3), and Makoto Takizawa (4)

(1) Hosei University, 3-7-2, Kajino-cho, Koganei-shi, Tokyo 184-8584, Japan
(2) Faculty of Business Administration, Rissho University, 4-2-16, Osaki, Shinagawa-ku, Tokyo 141-8602, Japan. [email protected]
(3) Department of Information and Communications Engineering, Fukuoka Institute of Technology, 3-30-1, Wajiro-Higashi, Higashi-ku, Fukuoka, Japan. [email protected]
(4) Research Center for Computing and Multimedia Studies, Hosei University, 3-7-2, Kajino-cho, Koganei-shi, Tokyo 184-8584, Japan. [email protected]
Abstract. It is critical to reduce the electric energy consumption of information systems, especially clusters of servers. In order to reduce the energy consumption, we have to select a virtual machine on an energy-efficient server to perform an application process so that the total energy consumption of the servers can be reduced. Here, we have to estimate how much energy a server consumes to perform processes. In the simple estimation (SP) algorithm previously proposed, every current process is assumed to have the same computation residue. The execution time of each process and the energy consumption of the server are underestimated, although the estimation itself takes only a short time. In this paper, we propose a novel monotonically increasing (MI) algorithm to estimate the energy consumption of a host server to perform application processes. Here, the current processes p_1, ..., p_{n_t} on a server s_t are assumed to be totally ordered so that the computation residue RP_i of a process p_i is RP_1 · α^{i−1} larger than that of the process p_{i−1} (i = 1, ..., n_t). In the evaluation, we show that the execution time and energy consumption of a server obtained by the MI algorithm differ by only about one percent from the simulation results.

Keywords: Energy-efficient server cluster · Energy estimation algorithm · MI algorithm · SP algorithm · Green computing systems
1 Introduction

Information systems are getting scalable like the IoT (Internet of Things) [24] and consume a huge amount of electric energy [2]. Especially, we have to reduce the energy consumption of server clusters [2–5,15] to realize green computing systems.
Virtual machines [1] supported by server cluster systems [2–6] like cloud computing systems [21] are widely used to provide applications with virtual computation services on resources like CPUs and storage, independently of the heterogeneity and locations of various types of servers. If a client issues a service request to a cluster of servers, a virtual machine on a host server is selected, where an application process to handle the request is performed. Algorithms [7–13,18,20] are proposed to select a host server which is expected to consume the smallest amount of energy. After a process is issued to a selected server in a cluster, the server might be overloaded and consume more energy than estimated. If a server consumes more energy, processes on the server migrate to another server which is expected to consume a smaller amount of energy. This migration approach to reducing the energy consumption of servers in clusters is also proposed in our previous studies [8,9,15–17,19].

In order to find a host virtual machine to perform a new process and to make a virtual machine migrate among servers, we have to estimate the energy consumption of a server to perform application processes. Power consumption and computation models [2–5,13] are proposed to give the power consumption and execution time of a server to perform application processes. By using these models, the execution time of processes and the energy consumption of a server to perform the processes are obtained by simulating the computation of the processes in the simulation (SM) algorithm [10–13]. While the execution time and energy consumption can be precisely estimated this way, the simulation itself takes time. In order to do the estimation more efficiently, the simple (SP) estimation algorithm is proposed [16,22,23], where every current process is assumed to have the same computation residue. That is, the execution time and energy consumption are estimated just by using the number of active processes. However, the execution time and energy consumption are underestimated. In this paper, we newly propose a monotonically increasing (MI) algorithm to more precisely estimate the execution time and energy consumption of a server. Here, the current active processes p_1, ..., p_{n_t} on a server s_t are totally ordered so that the computation residue RP_i of each process p_i is RP_1 · α^{i−1} larger than the computation residue RP_{i−1} of the process p_{i−1} preceding p_i. In the evaluation, the execution time and energy consumption of a server are shown to be more precisely estimated in the MI algorithm than in the SP algorithm.

In Sect. 2, we discuss the computation and power consumption models of a server. In Sect. 3, we propose the MI estimation algorithm. In Sect. 4, we evaluate the MI algorithm compared with the SM and SP algorithms.
2 System Model

A cluster is composed of servers s_1, ..., s_m (m ≥ 1) which are interconnected in networks. Each server s_t is equipped with np_t (≥ 1) homogeneous CPUs, and each CPU supports pc_t (≥ 1) homogeneous cores. The server s_t thus totally supports nc_t (= np_t · pc_t) cores. Each core supports tn_t threads, each of which is a unit of computation; usually, tn_t is one or two. The server s_t totally supports applications with nt_t (= nc_t · tn_t) threads. On each thread, one process can be performed at a time. A thread is active if and only if (iff) a process is performed on it. Each server s_t can perform nt_t processes in parallel. A server, a CPU, and a core are active iff at least one of their threads is active; otherwise, they are idle. A cluster supports applications with a set VM of virtual machines vm_1, ..., vm_m (m ≥ 1). Applications can make use of the computation resources of servers without being conscious of the heterogeneity and locations of the servers. Application processes are performed on a virtual machine vm_k. A virtual machine vm_k on a host server s_t can migrate to another guest server s_u in a live manner [1]. That is, active processes on the virtual machine vm_k can migrate to the guest server s_u without terminating the processes, while the processes are suspended during the migration of the virtual machine vm_k.

In this paper, we consider a computation type of application process which uses CPU resources [3]. The term process stands for an application process in this paper. Each process p_i is performed at a time on a thread of a server s_t. Time is considered to be discrete, i.e. a sequence of time units [tu]. The minimum execution time minT_ti [tu] of a process p_i is the shortest time to perform the process p_i on a server s_t, i.e. only the process p_i is performed on the server s_t without any other process. The more processes are performed concurrently with a process p_i, the longer the execution time of the process p_i. Let minT_i be the minimum one of minT_1i, ..., minT_mi. That is, minT_i is the minimum execution time minT_fi on the fastest server s_f, whose threads support the fastest computation rate. In this paper, the amount of computation of each process p_i is defined to be the minimum execution time minT_i. In reality, well-formed application processes are performed in most information systems like Web-based ones; here, the minimum execution time minT_i of each process p_i can be obtained. The thread computation rate TCR_t of a server s_t is defined to be minT_i/minT_ti (≤ 1) for any process p_i. It is noted that minT_i/minT_ti = minT_j/minT_tj for any pair of processes p_i and p_j. If only one process p_i is performed on a thread of a server s_t, the process p_i is performed at rate TCR_t. If (n_t − 1) processes are performed concurrently with a process p_i on the same thread, the process p_i is performed on the server s_t at rate TCR_t/n_t, under the assumption that CPU resources are fairly allocated to every process in fair scheduling schemes. On a server s_t where n_t processes are active, each process is performed at rate NPR_t(n_t) in the MLC (Multi-Level Computation) model [10,12,13] as follows:
[MLC model]
\[
NPR_t(n_t) =
\begin{cases}
TCR_t & \text{if } n_t \le nt_t,\\
nt_t \cdot TCR_t / n_t & \text{if } n_t > nt_t.
\end{cases}
\tag{1}
\]
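For illustration only (this sketch is ours, not the paper's; the names npr, tcr, and ntt are hypothetical), formula (1) can be coded directly:

```python
# A minimal sketch of the MLC rate model of formula (1): a process on server
# s_t runs at the full thread rate TCR_t while the number of current
# processes does not exceed the number of threads nt_t.
def npr(n, tcr, ntt):
    """Computation rate NPR_t(n_t) of one process when n processes are active."""
    if n <= ntt:
        return tcr
    return ntt * tcr / n
```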
Here, the total computation rate NSR_t(n_t) of a server s_t is NPR_t(n_t) · n_t. In the MLPCM (Multi-Level Power Consumption) model [10,12,13], the electric power NE_t(n_t) [W] consumed by a server s_t to concurrently perform n_t processes is given as follows:
[MLPCM model]
\[
NE_t(n_t) =
\begin{cases}
minE_t & \text{if } n_t = 0,\\
minE_t + n_t \cdot (bE_t + cE_t + tE_t) & \text{if } 1 \le n_t \le np_t,\\
minE_t + np_t \cdot bE_t + n_t \cdot (cE_t + tE_t) & \text{if } np_t < n_t \le nc_t,\\
minE_t + np_t \cdot bE_t + nc_t \cdot cE_t + n_t \cdot tE_t & \text{if } nc_t < n_t < nt_t,\\
maxE_t \;(= minE_t + np_t \cdot bE_t + nc_t \cdot cE_t + nt_t \cdot tE_t) & \text{if } n_t \ge nt_t.
\end{cases}
\tag{2}
\]
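Formula (2) can be sketched the same way (again our illustration; the parameter names are hypothetical and follow the symbols above):

```python
# A minimal sketch of the MLPCM power model of formula (2) for a server s_t.
def ne(n, min_e, max_e, be, ce, te, npt, nct, ntt):
    """Power NE_t(n_t) [W] of a server with n active processes."""
    if n == 0:
        return min_e                                  # idle server
    if n <= npt:
        return min_e + n * (be + ce + te)             # CPUs activating
    if n <= nct:
        return min_e + npt * be + n * (ce + te)       # cores activating
    if n < ntt:
        return min_e + npt * be + nct * ce + n * te   # threads activating
    return max_e                                      # every thread active
```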
An idle server s_t just consumes the minimum electric power minE_t [W] although no process is performed. Each time a CPU, a core, and a thread are activated on a server s_t, the power consumption of the server s_t increases by bE_t, cE_t, and tE_t [W], respectively. A server s_t consumes the maximum electric power maxE_t [W] if every thread is active, i.e. more than or equal to nt_t processes are performed on the server s_t. For example, minE_t, bE_t, cE_t, tE_t, and maxE_t are 126.1, 30, 5.6, 0.8, and 301.3 [W], respectively, for an HP server DL360 Gen9 [14]. Suppose a server s_t is equipped with two CPUs, each of which supports two cores, each of which supports two threads, i.e. np_t = 2, nc_t = 4, and nt_t = 8. If three processes (n_t = 3) are performed on the server s_t, NE_t(3) = minE_t + 2 · bE_t + 3 · cE_t + 3 · tE_t. If n_t ≥ nt_t, the server s_t consumes the maximum power maxE_t independently of the number n_t of processes. The MLPCM model also holds for a server with multiple CPUs.

Let CP_t(τ) be the set of active processes on a server s_t at time τ. The electric power E_t(τ) of the server s_t to perform processes at time τ is assumed to be NE_t(|CP_t(τ)|) in this paper. The energy consumed by the server s_t from time st to time et is \(\sum_{\tau=st}^{et} NE_t(|CP_t(\tau)|)\) [W·tu].

We discuss how processes are performed on a server s_t based on the MLC and MLPCM models. Let C_t be a variable denoting the set of active processes on a server s_t, and let RP_i show the computation residue of each process p_i in the set C_t. Processes on a server s_t are performed as follows:

[Computation model of processes on a server s_t]
1. Initially, E_t = 0; C_t = φ; τ = 1; T_t = 0;
2. while (some process is or gets active)
   (a) for each process p_i which starts at time τ, C_t = C_t ∪ {p_i}; RP_i = minT_i; T_i = 0;
   (b) n_t = |C_t|; /* number of active processes */
       E_t = E_t + NE_t(n_t); /* energy consumption */
       T_t = T_t + 1; /* the server is active at time τ */
   (c) for each process p_i in C_t,
       i. RP_i = RP_i − NPR_t(n_t); /* the residue is decremented */
       ii. T_i = T_i + 1; /* p_i is active */
       iii. if RP_i ≤ 0, C_t = C_t − {p_i}; /* p_i terminates at time τ */
   (d) τ = τ + 1;

When a process p_i starts on a server s_t, the computation residue RP_i is minT_i. At each time, RP_i is decremented by the process computation rate NPR_t(n_t). T_i is the execution time of the process p_i on the server s_t. If RP_i ≤ 0, the process p_i terminates. E_t and T_t give the total energy consumption and execution time of the server s_t, respectively.
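The computation model above is, in effect, the simulation carried out by the SM algorithm. A runnable sketch follows; it is our illustration under the MLC and MLPCM models, reusing the npr and ne functions sketched above, with hypothetical names (simulate, arrivals, power_params):

```python
# arrivals: maps a start time tau to a list of minimum execution times
# minT_i of the processes starting at tau. power_params is the tuple
# (min_e, max_e, be, ce, te, npt, nct, ntt) expected by ne().
def simulate(arrivals, tcr, ntt, power_params):
    residues = {}                             # pid -> computation residue RP_i
    energy, active_time, tau, pid = 0.0, 0, 1, 0
    while residues or any(t >= tau for t in arrivals):
        for min_t in arrivals.pop(tau, []):   # processes starting at tau
            residues[pid] = min_t             # RP_i = minT_i
            pid += 1
        n = len(residues)
        if n > 0:
            energy += ne(n, *power_params)    # formula (2)
            active_time += 1                  # the server is active at tau
            rate = npr(n, tcr, ntt)           # formula (1)
            for p in list(residues):
                residues[p] -= rate           # residue is decremented
                if residues[p] <= 0:
                    del residues[p]           # process p terminates
        tau += 1
    return energy, active_time                # E_t and T_t
```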
3 Estimation Model

3.1 Estimation on a Server

In order to find a server to perform a new process and to make a virtual machine on a host server migrate to another guest server, we have to estimate the energy to be consumed by the servers to perform processes. Let n_t be the number of active processes p_1, ..., p_{n_t} on a server s_t at time τ, i.e. n_t = |CP_t(τ)|. RS_t denotes the total computation residue of the active processes on the server s_t, and RP_i shows the computation residue of each process p_i (RP_i ≤ minT_i). In our previous studies [12,13,16], the simulation (SM) algorithm is proposed, where the execution time and energy consumption of a server are obtained in the simulation, as shown in the computation model. The execution time of a server s_t means the earliest time after which no process is active on the server. In the SM algorithm, the execution time and energy consumption of a server can be precisely estimated, but it takes a long time to do the simulation. In order to do the estimation more efficiently, the simple estimation (SP) algorithm [15,16] is proposed. Here, the execution time ET_t of a server s_t is RS_t/NSR_t(n_t) [tu], and the energy consumption EE_t of the server s_t is NE_t(n_t) · ET_t [W·tu]. This means that every active process p_i is assumed to have the same computation residue RP_i = RS_t/n_t at the current time τ, as shown in Fig. 1. As a result, the execution time and energy consumption of a server are underestimated in the SP algorithm.

In this paper, we newly propose a monotonically increasing (MI) algorithm to more precisely estimate the execution time and energy consumption of a server s_t. Here, the active processes p_1, ..., p_{n_t} in the set CP_t(τ) are totally ordered so that RP_i < RP_{i+1} and RP_i = RP_1 · (1 + α + α² + ... + α^{i−1}) = RP_1 · (1 − α^i)/(1 − α). For α = 1, RP_i = RP_1 · i (i = 1, ..., n_t). Thus, the computation residue RP_i of each process p_i monotonically increases and decreases as i increases and decreases, respectively, as shown in Fig. 2. The total computation residue RS_t of the server s_t is RP_1 + ... + RP_{n_t} = RP_1 · [1 + (1 + α) + (1 + α + α²) + ... + (1 + α + ... + α^{n_t−1})] = RP_1 · [(1 − α) + (1 − α²) + ... + (1 − α^{n_t})]/(1 − α) = RP_1 · (n_t − (α + α² + ... + α^{n_t}))/(1 − α) = RP_1 · (α^{n_t+1} − (n_t + 1) · α + n_t)/(1 − α)². This means RP_1 = RS_t · (1 − α)²/(α^{n_t+1} − (n_t + 1) · α + n_t). For α = 1, RS_t = RP_1 · n_t · (n_t + 1)/2. Thus, since RP_1 depends on the number n_t of active processes and the total computation residue RS_t, RP_1 is given as the following function AC_t(n_t, RS_t):

\[
AC_t(n_t, RS_t) =
\begin{cases}
RS_t \cdot (1-\alpha)^2/(\alpha^{n_t+1} - (n_t+1) \cdot \alpha + n_t) & \text{if } \alpha \ne 1,\\
2 \cdot RS_t/(n_t \cdot (n_t+1)) & \text{if } \alpha = 1.
\end{cases}
\tag{3}
\]
Fig. 1. Computation residue RP in the SP model.
Fig. 2. Computation residue RP in the MI model.
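Continuing the illustrative sketches above (the function name ac is ours), formula (3) reads:

```python
# A minimal sketch of AC_t(n_t, RS_t) of formula (3): the residue RP_1 of the
# process that terminates first under the MI ordering with ratio alpha.
def ac(n, rs, alpha):
    if alpha == 1:
        return 2 * rs / (n * (n + 1))
    return rs * (1 - alpha) ** 2 / (alpha ** (n + 1) - (n + 1) * alpha + n)
```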
Let aT be the average value of the minimum execution times minT_1, ..., minT_{n_t} of the active processes p_1, ..., p_{n_t}. In this paper, we assume the total computation residue RS_t of the active processes p_1, ..., p_{n_t} on a server s_t is aT · (n_t + 1)/2. In the MLC model, the process p_1 first terminates time T_1(n_t, RS_t) [tu] after the current time τ. Here, the amount RP_1 · n_t of computation is performed on the server s_t, i.e. the computation residue RP_i of each process p_i is decremented by RP_1 (i = 1, ..., n_t). Hence, T_1(n_t, RS_t) = RP_1 · n_t/NSR_t(n_t) = AC_t(n_t, RS_t) · n_t/NSR_t(n_t). For α = 1, T_1(n_t, RS_t) = (aT/n_t) · n_t/NSR_t(n_t). Then, there are (n_t − 1) active processes p_2, ..., p_{n_t} on the server s_t. The process p_2 terminates T_2(n_t, RS_t) [tu] after the process p_1 terminates, where T_2(n_t, RS_t) = AC_t(n_t, RS_t) · (n_t − 1)/NSR_t(n_t − 1). The computation residue RP_j of each process p_j is decremented by RP_1 · α (j = 2, ..., n_t). For α = 1, T_2(n_t, RS_t) = (aT/n_t) · (n_t − 1)/NSR_t(n_t − 1). When a process p_i (i < n_t) terminates, (n_t − i) processes p_{i+1}, p_{i+2}, ..., p_{n_t} are still active. Thus, each process p_i terminates T_i(n_t, RS_t) = AC_t(n_t, RS_t) · (n_t − i + 1)/NSR_t(n_t − i + 1) [tu] after the process p_{i−1} terminates. Here, the process p_i is referred to as the top active process, and the computation residue RP_j of each process p_j (j = i, i + 1, ..., n_t) is decremented by RP_1 · α^{i−1}. If n_t − i + 1 ≤ nt_t, T_i(n_t, RS_t) = AC_t(n_t, RS_t)/TCR_t, since each process p_j (j = i, ..., n_t) is performed on a different thread. The execution time T_i(n_t, RS_t) is as follows:

\[
T_i(n_t, RS_t) =
\begin{cases}
AC_t(n_t, RS_t) \cdot (n_t - i + 1)/(nt_t \cdot TCR_t) & \text{if } n_t - i + 1 > nt_t,\\
AC_t(n_t, RS_t)/TCR_t & \text{otherwise}.
\end{cases}
\tag{4}
\]

The time T_1(n_t, RS_t) + ... + T_{n_t}(n_t, RS_t), at which the process p_{n_t} terminates, is the execution time TS_t(n_t, RS_t) of the n_t active processes p_1, ..., p_{n_t}. The total execution time TS_t(n_t, RS_t) of a server s_t to perform the n_t processes p_1, ..., p_{n_t} is:

\[
TS_t(n_t, RS_t) =
\begin{cases}
AC_t(n_t, RS_t) \cdot \big[\,(n_t + (n_t-1) + \ldots + (nt_t+1))/(nt_t \cdot TCR_t)\\
\quad\; + (nt_t/nt_t + (nt_t-1)/(nt_t-1) + \ldots + 1/1)/TCR_t\,\big]\\
\quad = AC_t(n_t, RS_t) \cdot \big[(n_t + nt_t + 1) \cdot (n_t - nt_t) + 2 \cdot nt_t^2\big]/(2 \cdot nt_t \cdot TCR_t) & \text{if } n_t \ge nt_t,\\
AC_t(n_t, RS_t) \cdot n_t/TCR_t & \text{otherwise}.
\end{cases}
\tag{5}
\]
A server s_t consumes the energy E_i(n_t, RS_t) [W·tu] to perform the (n_t − i + 1) processes p_i, p_{i+1}, ..., p_{n_t} for T_i(n_t, RS_t) [tu] from time τ + T_1(n_t, RS_t) + ... + T_{i−1}(n_t, RS_t):

\[
E_i(n_t, RS_t) = NE_t(n_t - i + 1) \cdot T_i(n_t, RS_t).
\tag{6}
\]

The total energy ES_t(n_t, RS_t) to be consumed by a server s_t is E_1(n_t, RS_t) + ... + E_{n_t}(n_t, RS_t):

\[
ES_t(n_t, RS_t) = \sum_{i=1}^{n_t} NE_t(n_t - i + 1) \cdot T_i(n_t, RS_t).
\tag{7}
\]
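Putting formulas (3)–(7) together, the MI estimation can be sketched compactly (our illustration; ts_es is a hypothetical name, and ac and ne are the functions sketched earlier):

```python
# A sketch of the MI estimator: execution time TS_t and energy ES_t of a
# server with n active processes whose total computation residue is rs.
def ts_es(n, rs, alpha, tcr, ntt, power_params):
    rp1 = ac(n, rs, alpha)                   # formula (3)
    total_time, total_energy = 0.0, 0.0
    for i in range(1, n + 1):                # p_i is the i-th to terminate
        k = n - i + 1                        # processes still active
        if k > ntt:
            t_i = rp1 * k / (ntt * tcr)      # formula (4), threads saturated
        else:
            t_i = rp1 / tcr                  # formula (4), one thread each
        total_time += t_i                    # formula (5) as a running sum
        total_energy += ne(k, *power_params) * t_i   # formulas (6) and (7)
    return total_time, total_energy
```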
Given the number n_t of current active processes and the total computation residue RS_t of a server s_t, AC_t(n_t, RS_t) = 2 · RS_t/(n_t · (n_t + 1)) for α = 1, as presented in formula (3). By using n_t and AC_t(n_t, RS_t), the execution time TS_t(n_t, RS_t) and the energy consumption ES_t(n_t, RS_t) are obtained as discussed above.
3.2 Migration of a Virtual Machine
Next, we discuss the execution time and energy consumption of servers among which virtual machines migrate. We consider the computation residue NRS_t(n_t, RS_t, tm) of a server s_t at time τ + tm, where n_t processes p_1, ..., p_{n_t} are performed at the current time τ and the total computation residue of the processes is RS_t. Here, tm is the migration time of a virtual machine [19]. If T_1(n_t, RS_t) + ... + T_{i−1}(n_t, RS_t) ≤ tm < T_1(n_t, RS_t) + ... + T_i(n_t, RS_t) (n_t ≥ 2), the (n_t − i + 1) processes p_i, p_{i+1}, ..., p_{n_t} are performed and the process p_i is the top active process at time τ + tm. Then, the computation residue RP_i of the top process p_i is α^{i−1} · AC_t(n_t, RS_t) · (tm − Σ_{j=1}^{i−1} T_j(n_t, RS_t))/T_i(n_t, RS_t). The computation residue RP_j of another active process p_j (j > i) is AC_t(n_t, RS_t) · [α^{i−1} · (tm − Σ_{k=1}^{i−1} T_k(n_t, RS_t))/T_i(n_t, RS_t) + (α^i + α^{i+1} + ... + α^{j−1})]. The total computation residue NRS_t(n_t, RS_t, tm) of the server s_t is:

\[
NRS_t(n_t, RS_t, tm) = AC_t(n_t, RS_t) \cdot \sum_{j=i}^{n_t} \Big[\, \alpha^{i-1} \cdot \big(tm - \textstyle\sum_{k=1}^{i-1} T_k(n_t, RS_t)\big)/T_i(n_t, RS_t) + (\alpha^{i} - \alpha^{j})/(1-\alpha) \Big].
\tag{8}
\]
The number NP_t(n_t, RS_t, tm) of active processes on a server s_t at time τ + tm, where p_i is the top process, is given as follows:

\[
NP_t(n_t, RS_t, tm) = n_t - i \quad \text{if } \sum_{j=1}^{i} T_j(n_t, RS_t) \le tm < \sum_{j=1}^{i+1} T_j(n_t, RS_t).
\]
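As a final illustrative sketch for this section (our code; active_after is a hypothetical name), the number of processes still active tm time units after the current time follows directly from the phase times T_1, T_2, ...:

```python
# Given the phase lengths t[0] = T_1, t[1] = T_2, ... of formula (4), count
# how many of the n processes are still active tm [tu] after the current time.
def active_after(n, t, tm):
    elapsed = 0.0
    for i, t_i in enumerate(t, start=1):  # p_i terminates at the end of phase i
        elapsed += t_i
        if tm < elapsed:
            return n - i + 1              # p_i, ..., p_n are still active
    return 0                              # every process has terminated
```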
The Improved Redundant Active Time-Based Algorithm with Forcing Termination of Meaningless Replicas in Virtual Machine Environments

Tomoya Enokido, Dilawaer Duolikun, and Makoto Takizawa

\[
\ldots \; nVT_{vt}(\tau), \quad \text{otherwise}.
\tag{3}
\]
Here, α_vt(τ) is the computation degradation ratio of a virtual machine VM_vt at time τ (0 ≤ α_vt(τ) ≤ 1). α_vt(τ_1) ≤ α_vt(τ_2) ≤ 1 if NC_vt(τ_1) ≥ NC_vt(τ_2), and α_vt(τ) = 1 if NC_vt(τ) ≤ 1. α_vt(τ) is assumed to be ε_vt^{NC_vt(τ)−1}, where 0 ≤ ε_vt ≤ 1. Suppose a replica p_vt^i starts and terminates on a virtual machine VM_vt at time st_vt^i and et_vt^i, respectively. The total computation time T_vt^i of the replica p_vt^i performed on the virtual machine VM_vt is et_vt^i − st_vt^i, and \(\sum_{\tau=st_{vt}^{i}}^{et_{vt}^{i}} f_{vt}^{i}(\tau) = VS_{vt}^{i}\). The computation laxity lc_vt^i(τ) [vs] of a replica p_vt^i at time τ is \(VS_{vt}^{i} - \sum_{x=st_{vt}^{i}}^{\tau} f_{vt}^{i}(x)\).
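For illustration (our sketch; the name alpha_vt is hypothetical), the assumed degradation ratio is:

```python
# Computation degradation ratio of a virtual machine with nc current
# replicas: 1 if nc <= 1, and eps ** (nc - 1) otherwise (0 <= eps <= 1).
def alpha_vt(nc, eps):
    return 1.0 if nc <= 1 else eps ** (nc - 1)
```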
2.3 Power Consumption Model

A notation E_t(τ) shows the electric power [W] of a server s_t to perform every current replica on every active virtual machine in the server s_t at time τ. Let maxE_t and minE_t be the maximum and minimum electric power [W] of the server s_t, respectively. Let ac_t(τ) be the total number of active cores in the server s_t at time τ. Let minC_t be the electric power [W] where at least one core c_t^h is active on the server s_t, and let cE_t be the electric power [W] consumed by the server s_t to make one core active. The PCSV (Power Consumption model of a Server with Virtual machines) model [13] to perform replicas of computation processes on virtual machines is proposed in our previous studies. In the PCSV model, the electric power E_t(τ) [W] of a server s_t at time τ is given as follows [13]:

\[
E_t(\tau) = minE_t + \sigma_t(\tau) \cdot (minC_t + ac_t(\tau) \cdot cE_t).
\tag{4}
\]
Here, σ_t(τ) = 1 if at least one core c_t^h is active on the server s_t at time τ; otherwise, σ_t(τ) = 0. The processing power PE_t(τ) [W] of a server s_t at time τ is E_t(τ) − minE_t. The total processing electric energy TPE_t(τ_1, τ_2) [J] of a server s_t from time τ_1 to τ_2 is \(\sum_{\tau=\tau_1}^{\tau_2} PE_t(\tau)\).
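A minimal sketch of the PCSV model of formula (4) and of the derived quantities (our illustration; all names are hypothetical):

```python
# Electric power E_t(tau) of formula (4) for a given number of active cores.
def pcsv_power(active_cores, min_e, min_c, ce):
    sigma = 1 if active_cores > 0 else 0
    return min_e + sigma * (min_c + active_cores * ce)

# Total processing electric energy TPE_t(tau1, tau2): the sum over time of
# the processing power PE_t(tau) = E_t(tau) - minE_t; core_counts holds
# ac_t(tau) for each time step in [tau1, tau2].
def tpe(core_counts, min_e, min_c, ce):
    return sum(pcsv_power(ac, min_e, min_c, ce) - min_e for ac in core_counts)
```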
3 Process Replication

3.1 The IRATB Algorithm

In our previous studies, the IRATB (Improved Redundant Active Time-Based) algorithm [11] is proposed to reduce the total electric energy consumption of a server cluster S and the average response time of each process while redundantly performing each computation process. The total processing electric energy laxity tpel_t(τ) [12] shows how much electric energy a server s_t has to consume to perform every current replica on every active virtual machine in the server s_t at time τ. The IRATB algorithm estimates the total processing electric energy laxity tpel_t(τ) [J] of each server s_t at time τ based on the response time RT_vt^i of each current replica p_vt^i performed on each virtual machine VM_vt in the server s_t. Let TPEL_vt^i be the total processing electric energy laxity of a server cluster S where a replica p_vt^i of a request process p_i is allocated to a virtual machine VM_vt performed on a server s_t at time τ. In the IRATB algorithm, each time a load balancer K receives a new request process p_i, the load balancer K estimates the total processing electric energy laxity TPEL^i of the server cluster S where rd_i replicas are performed on a subset VMS^i of virtual machines in the server cluster S. Then, the load balancer K selects the subset VMS^i of virtual machines for which the total processing electric energy laxity TPEL^i of the server cluster S is the minimum for the request process p_i. In addition, if a thread th_t^k on a core c_t^h allocated to a virtual machine VM_v't is idle, the thread th_t^k is used for another active virtual machine VM_vt performed on the same core c_t^h in each server s_t at time τ. Then, the computation time of each replica performed on the virtual machine VM_vt can be reduced, since the computation rate of the virtual machine VM_vt increases. As a result, the total processing electric energy consumption of each server s_t can be more reduced in the IRATB algorithm.
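The selection step can be sketched as follows (our illustration, not the authors' code; estimate_tpel stands in for the laxity estimation of [11,12], whose details are not reproduced here, and exhaustive enumeration of candidate subsets is our simplification):

```python
from itertools import combinations

# Choose the subset of rd virtual machines that minimizes the estimated
# total processing electric energy laxity TPEL^i of the cluster.
def select_vms(vms, rd, estimate_tpel):
    best_subset, best_tpel = None, float("inf")
    for subset in combinations(vms, rd):   # candidate subsets VMS^i
        tpel = estimate_tpel(subset)       # estimated TPEL^i for process p_i
        if tpel < best_tpel:
            best_subset, best_tpel = subset, tpel
    return best_subset
```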
3.2 The IRATB-FMRT Algorithm
In this paper, we newly propose the IRATB-FMRT (IRATB with Forcing Meaningless Replicas to Terminate) algorithm to provide an energy-efficient server cluster system for redundantly performing computation processes. Each replica p_vt^i of a request process p_i is modeled to be a sequence of operations. Suppose a replica p_vt^i of a request process p_i successfully terminates on a virtual machine VM_vt, but another replica p_vu^i is still being performed on another virtual machine VM_vu. The request process p_i can commit without performing the replica p_vu^i to completion, since the replica p_vt^i has already successfully terminated.
[Definition] A subsequence of a replica p_vu^i to be performed on a virtual machine VM_vu in a server s_u after another replica p_vt^i performed on a virtual machine VM_vt in a server s_t successfully terminates is a meaningless tail part.

In the IRATB-FMRT algorithm, the meaningless tail part of each replica is forced to terminate on every active virtual machine. Then, the computation time of each replica can be reduced, since the computation rate of each virtual machine used to perform the meaningless tail part of a replica can be used to perform other replicas. As a result, the total processing electric energy consumption of a server cluster S can be more reduced in the IRATB-FMRT algorithm than in the IRATB algorithm. Each time a replica p_vt^i successfully terminates on a virtual machine VM_vt in a server s_t, the virtual machine VM_vt sends a termination notification TN(p_vt^i) message of the replica p_vt^i to both the load balancer K and every other virtual machine in the subset VMS^i. The termination notification TN(p_vt^i) includes a reply r_vt^i. On receipt of TN(p_vt^i) from a virtual machine VM_vt, the meaningless tail part of a replica p_vu^i is forced to terminate on each virtual machine VM_vu by the following procedure:

Force_Termination(TN(p_vt^i)) {
  if a replica p_vu^i is being performed, the replica p_vu^i is forced to terminate;
  else TN(p_vt^i) is neglected;
}

The total processing electric energy consumption of a server cluster S in the IRATB-FMRT algorithm can be more reduced than in the IRATB algorithm by forcing the meaningless tail parts of replicas to terminate. In addition, the computation time of each replica can be more reduced in the IRATB-FMRT algorithm than in the IRATB algorithm, since the computation resources of each virtual machine are more efficiently utilized.
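A runnable sketch of this notification handling (our illustration; the class and method names are hypothetical):

```python
# On receipt of a termination notification TN for process pid, a virtual
# machine forces its own, now meaningless, replica of pid to terminate.
class VirtualMachine:
    def __init__(self):
        self.running = {}                  # pid -> locally running replica

    def on_termination_notification(self, pid, reply):
        # reply corresponds to the reply r carried by the notification TN.
        replica = self.running.pop(pid, None)
        if replica is not None:
            replica.cancel()               # force the meaningless tail to end
        # else: the replica already finished here, so TN is neglected
```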
4 Evaluation

The IRATB-FMRT algorithm is evaluated in terms of the total processing electric energy consumption [KJ] of a homogeneous server cluster S and the response time of each process p_i, compared with the IRATB algorithm [11]. The homogeneous server cluster S is composed of five physical servers s_1, ..., s_5 (n = 5). The parameters of every server s_t and every virtual machine VM_kt are the same; they are obtained from the experiment [13] and shown in Tables 1 and 2. Every server s_t is equipped with a dual-core CPU (nc_t = 2), and two threads are bound to each core (ct_t = 2). Hence, four threads are supported in each server s_t, i.e. nt_t = 4. There are twenty virtual machines in the server cluster S. We assume the fault probability fr_t is the same, fr = 0.1, for every server s_t.
Table 1. Homogeneous cluster S.

Server  nc_t  ct_t  nt_t  minE_t    minC_t   cE_t     maxE_t
s_t     2     2     4     14.8 [W]  6.3 [W]  3.9 [W]  33.8 [W]

Table 2. Parameters of virtual machine.

Virtual machine  Maxf_vt      ε_vt  β_vt(1)  β_vt(2)
VM_vt            1 [vs/msec]  1     1        0.6
A number m of processes p_1, ..., p_m (0 ≤ m ≤ 8,000) are issued to the server cluster S. The starting time st_i of each process p_i is randomly selected in a unit of one millisecond [msec] between 1 and 3,600 [msec]. The minimum computation time minT^i_vt of every replica p^i_vt is assumed to be 1 [msec]. The delay time dK_t between a load balancer K and every server s_t is 1 [msec] in the server cluster S. Hence, the minimum response time minRT^i_vt of every replica p^i_vt is 2dK_t + minT^i_vt = 2 · 1 + 1 = 3 [msec]. Figure 1 shows the average total processing electric energy consumption of the server cluster S to perform the number m of processes in the IRATB-FMRT and IRATB algorithms. In Fig. 1, IRATB-FMRT(rd) and IRATB(rd) stand for the average total processing electric energy consumption of the server cluster S in the IRATB-FMRT and IRATB algorithms with redundancy rd (= 1, 2, 3), respectively. In the IRATB-FMRT algorithm, the meaningless tail part of each replica is forced to terminate on every virtual machine VM_vt. As a result, the average total processing electric energy consumption to perform the number m of processes is smaller in the IRATB-FMRT algorithm than in the IRATB algorithm. Figure 2 shows the average response time of each process in the IRATB-FMRT and IRATB algorithms. If rd = 1, the average response time of each process in both algorithms is 3 [msec] since every request process p_i is not redundantly performed and the computation resources to perform the number m of processes in the server cluster S are sufficient. For rd ≥ 2, the average response time in the IRATB-FMRT algorithm is shorter than in the IRATB algorithm since the computation resources of each virtual machine used to perform the meaningless tail parts of replicas can be used to perform other replicas. Following the evaluation, we conclude that the IRATB-FMRT algorithm is more useful in a homogeneous server cluster than the IRATB algorithm.
Fig. 1. Total electric energy consumption.
Fig. 2. Average response time.
5 Concluding Remarks
In this paper, a meaningless tail part of a replica, which is not required to be performed on a virtual machine, was defined. Then, the IRATB-FMRT algorithm, which improves the IRATB algorithm, was proposed to further reduce the total electric energy consumption of a server cluster and the average response time of each computation process by forcing the meaningless tail part of each replica to terminate on each virtual machine. The evaluation results showed that the total electric energy consumption of a server cluster and the average response time of each process are smaller in the IRATB-FMRT algorithm than in the IRATB algorithm. Following the evaluation results, the IRATB-FMRT algorithm is more energy- and performance-efficient than the IRATB algorithm in a homogeneous server cluster.
References
1. KVM: Main Page - KVM (Kernel Based Virtual Machine) (2015). http://www.linux-kvm.org/page/Main_Page
2. Enokido, T., Aikebaier, A., Takizawa, M.: Process allocation algorithms for saving power consumption in peer-to-peer systems. IEEE Trans. Ind. Electron. 58(6), 2097–2105 (2011)
3. Enokido, T., Aikebaier, A., Takizawa, M.: A model for reducing power consumption in peer-to-peer systems. IEEE Syst. J. 4(2), 221–229 (2010)
4. Enokido, T., Aikebaier, A., Takizawa, M.: An extended simple power consumption model for selecting a server to perform computation type processes in digital ecosystems. IEEE Trans. Ind. Inform. 10(2), 1627–1636 (2014)
5. Enokido, T., Takizawa, M.: Integrated power consumption model for distributed systems. IEEE Trans. Ind. Electron. 60(2), 824–836 (2013)
6. Natural Resources Defense Council (NRDC): Data center efficiency assessment - scaling up energy efficiency across the data center industry: evaluating key drivers and barriers (2014). http://www.nrdc.org/energy/files/data-center-efficiency-assessment-IP.pdf
7. Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982)
8. Schneider, F.B.: Replication management using the state-machine approach. In: Distributed Systems, 2nd edn., pp. 169–197. ACM Press (1993)
9. Enokido, T., Aikebaier, A., Takizawa, M.: An energy-efficient redundant execution algorithm by terminating meaningless redundant processes. In: Proceedings of the 27th IEEE International Conference on Advanced Information Networking and Applications (AINA-2013), pp. 1–8 (2013)
10. Enokido, T., Aikebaier, A., Takizawa, M.: Evaluation of the extended improved redundant power consumption laxity-based (EIRPCLB) algorithm. In: Proceedings of the 28th IEEE International Conference on Advanced Information Networking and Applications (AINA-2014), pp. 940–947 (2014)
11. Enokido, T., Duolikun, D., Takizawa, M.: The improved redundant active time-based (IRATB) algorithm for process replication. In: Proceedings of the 35th International Conference on Advanced Information Networking and Applications (AINA-2021), pp. 172–180 (2021)
12. Enokido, T., Duolikun, D., Takizawa, M.: An energy-efficient process replication algorithm based on the active time of cores. In: Proceedings of the 32nd IEEE International Conference on Advanced Information Networking and Applications (AINA-2018), pp. 165–172 (2018)
13. Enokido, T., Takizawa, M.: Power consumption and computation models of virtual machines to perform computation type application processes. In: Proceedings of the 9th International Conference on Complex, Intelligent and Software Intensive Systems (CISIS-2015), pp. 126–133 (2015)
Optical Simulations on Aerial Transmitting Laser Beam for Free Space Optics Communication

Shun Jono, Takuto Koyama, Kota Watanabe, Kiyotaka Izumi, and Takeshi Tsujimura(B)

Saga University, Saga, Japan
[email protected]
Abstract. In free space optics, it is necessary to align the optical axis of the laser beam with the center of the receiver lens to perform broadband transmission. The authors propose an algorithm for laser beam direction adjustment based on controlling the laser beam spread. This paper conducts computer simulations on the laser beam emitted from the transmitter to evaluate the transferred beam radius with respect to the tunable arrangement of lenses contained in the transmitter.
1 Introduction

Free space optics (FSO) communication is expected to be used in the event of a disaster as a communication method that complements or replaces optical fiber and radio networks [1–3]. The advantages of free space optics are a wider band than radio waves and easier installation than optical fiber cables [4–9]. FSO has the function of converting the laser beam transmitted through an optical fiber into a spatial laser beam. When communication cables or radio antennas break due to a disaster, a temporary communication network can be configured by using FSO at the broken point, as shown in Fig. 1 [10–14].
Fig. 1. Image of FSO utilization.
The FSO device has the flaw that communication cannot be performed unless the laser beam is incident precisely on the remote receiver. It is necessary to align the optical axis of the laser beam with the receiver lens to perform free space optics. This paper describes the design of FSO transmitter/receiver devices and a method of estimating the optical axis of the laser beam using positioning photodiodes (PDs). The method requires expanding and contracting operations of the discharged laser beam. The authors have investigated the optical alignment procedure by disclosing the relationship between the arrangement of lenses within the transmitter and the spread of the transmitted laser beam through computer simulations.
2 Free Space Optics

2.1 Free Space Optics System Design

The authors have designed and constructed the prototype of a free space optics system, as shown in Fig. 2. It is equipped with two lenses. The right one is a transmitter, which emits a 1550 nm laser beam of 10 mm in diameter and transmits 1 Gbit/s broadband data. Optical signals are supplied through a single mode fiber (SMF). The laser beam is steered by voice coil motors (VCMs) with a positioning accuracy of 7.54 × 10^−6 m/V, which is equivalent to a resolution of 1.15 × 10^−9 m/bit [14–16].
Fig. 2. FSO transmitter/receiver device.
Figure 3 shows an overall view of the two-way communication system. A laser beam is emitted from the transmitter and is incident on the receiver to perform communication. The transmitter is equipped with voice coil motors, which move the lenses to tune the diffusion size and direction of the laser beam. Positioning PDs are attached around the receiver and play a role in estimating the position of the arrived laser beam. Figure 4 shows the positioning PD layout. The positioning PDs surround the lens of the FSO receiver. The five positioning PDs are designated as sensor 1, sensor 2, sensor 3, sensor 4, and sensor 5, respectively, and the layout diagram is shown in a coordinate system,
where the origin of the coordinate system is sensor 1. In order to perform active free space optics, it is necessary that all positioning PDs are exposed to the laser beam. The laser beam radius needs to be large enough to cover all the arrangement coordinates of the sensors shown in Fig. 4.
Fig. 3. Overall view of the two-way communication system.
Fig. 4. Positioning PD layout.
We have designed an ad-hoc optical space communication device that incorporates hardware and software for automatic optical axis adjustment, and produced a prototype according to the specifications in Table 1. Outdoor optical wireless transmission is planned using the ad-hoc free space optics communication device, and its characteristics will be quantitatively evaluated with respect to automated installation. The influence of disturbance vibration, the optical axis adjustment time, operability, and reliability will be verified by measuring the bit error rate, as shown in Table 2. The outline of the free space optics system is shown in Fig. 5. The incident laser beam passes through the lens barrel and reaches the optical fiber. The optical path is adjusted by the motor-driven lenses placed in the middle, and the arrival position is detected from the branched light by the quadrant photodiode (QPD). The transmitter has the function of converting the laser beam transmitted in the single-mode optical fiber (SMF) into the air via the optical devices. The role of the VCM is to control the discharge direction of the laser beam. The designed communication system can transmit and receive laser beams and easily establish a broadband communication line by setting two devices 1000 m apart, facing each other.
Table 1. Specifications of ad-hoc free space optics device.

Target transmission distance               1000 m
Communication wavelength                   1550 nm
Transmission band                          10 Gbps
Signal beam                                15 (collimated beam)
Optical axis adjustment drive system       VCM actuator
Light intensity distribution measurement   Quadrant photodiode [for fine adjustment];
                                           distributed PD array [for rough adjustment]

Table 2. Characteristic evaluation items.

Transmission quality                  Bit error rate 10^−7 or less
Optical axis adjustment performance   Automatic optical axis adjustment time within 5 min
Quantitative evaluation items         Operability, reliability, environmental resistance, aging characteristics, etc.
Fig. 5. Outline of free space optical system of free space optics device.
Bilateral optical communication is established between the FSO apparatuses. Figure 6 represents a symmetrical block diagram of the active free space optics system in which the apparatuses mutually discharge laser beams [14–18]. Each apparatus contains a transmitter and a receiver as well as a controller PC. Broadband communication is performed by transferring the thin laser beam from the transmitter to the opposite receiver. It is necessary to keep the laser spot within the receiver lens even if it drifts. The optical path of the transmission laser beam is adjusted by VCMs installed in the transmitter lens barrel. The beam angle can be
steered up to 1.2°. The objective angle of the laser beam is determined by the output data of the opposite positioning photodiodes. The receiver has two types of laser beam alignment systems. Coarse adjustment is performed between the confronting transmitter and receiver in terms of regional optical intensity. The laser luminescent distribution can be presumed based only on the dispersed regional intensity of the off-target laser beam. The measured data of the positioning PDs are transferred to the VCM in the opposite transmitter via the reverse optical line and are converted to feedback control commands. The VCM is controlled based on the positioning PD data to lead the laser beam within the receiver lens. Fine adjustment is conducted within the receiver. Around 10% of the incident laser beam is divided by the beam splitter and monitored by the QPD to evaluate the positional shift of the laser beam alignment. The feedback control algorithm maintains the monitor laser beam onto the center of the QPD in real time, and the mainstream laser beam is consequently coupled to the outlet SMF at all times.
Fig. 6. Control system block diagram of optical space communication device.
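The fine adjustment loop described above can be made concrete with the standard quadrant-photodiode centroid estimate followed by a proportional correction. The quadrant numbering, normalization, and gain below are illustrative assumptions; the prototype's actual control law is not disclosed at this level of detail.

def qpd_offset(i1, i2, i3, i4):
    # Normalized beam-position error from the four quadrant intensities.
    # Quadrants are assumed numbered counter-clockwise from the upper right.
    total = i1 + i2 + i3 + i4
    x = ((i1 + i4) - (i2 + i3)) / total   # right minus left
    y = ((i1 + i2) - (i3 + i4)) / total   # upper minus lower
    return x, y

def vcm_correction(i1, i2, i3, i4, gain=0.5):
    # One step of a proportional controller driving the offset to zero;
    # the gain is a placeholder, not a value measured on the prototype.
    x, y = qpd_offset(i1, i2, i3, i4)
    return -gain * x, -gain * y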
2.2 Bidirectional Communication Path Opening Algorithm

Figure 7 illustrates the collaborative procedure of the initial alignment, where the confronting apparatuses, FSO #1 and #2, are simplified into the transmitter, receiver, and controller [14–18]. It represents the situation of the laser beam and the flow of control information. A pair of FSO apparatuses are remotely located facing each other. At the onset of connection, laser beams are emitted toward the opposite FSO apparatus, but they miss the receivers, as shown in Fig. 7 (1). Neither transmission line is established yet. Positioning PDs in FSO #2 detect the intensity of the off-target laser beam traveling from FSO #1. Next, the laser beam haphazardly scans the area where the target receiver is expected to be, as shown in Fig. 7 (2). Meanwhile, the laser beam carries both the transmitter orientation control data and the positioning PD data. The moment the laser beam accidentally hits the receiver
by a fluke, it transfers those data to FSO #1 in a flash, as shown in Fig. 7 (3). The controller in FSO #1 is informed of the transmission laser direction of FSO #2 on the occasion of the correspondence, as well as the output values of the positioning PDs. In the next instance, the laser beam passes by and goes out of the receiver. Then, the FSO #1 controller adjusts the transmitter direction based on the FSO #2 positioning PD data to reach the target receiver, as shown in Fig. 7 (4). The downstream transmission line is connected from FSO #1 to #2. Finally, the downstream line transfers the positioning PD data of FSO #1 to the FSO #2 controller, as it keeps the connection ever after, as shown in Fig. 7 (5). That makes it possible for the FSO #2 controller to establish the upstream transmission from FSO #2 to #1. Bidirectional lines are successfully connected after all, as shown in Fig. 7 (6).
Fig. 7. Collaborative adjustment procedure of laser alignment.
According to the algorithm, the two devices alternately emit laser beams from the transmitters and position the optical axis coordinates of the laser beams on the opposing receivers. An additional operation is required to find the target lens while widely scanning over the receiver, as shown in Fig. 7 (2). The receiver is equipped with the positioning PDs to find the misalignment of the optical axis of the arrived laser beam. It estimates the gap of misalignment based on the received intensity distribution of the laser beam even if it misses the target receiver lens, as shown in step (2) of Fig. 7. The strategy for efficient adjustment of the laser beam is as follows. First of all, the scanning laser beam is expanded wide enough to cover the positioning PDs around the opposite receiver, as shown in Fig. 8. This is realized by tuning the arrangement of lenses in the transmitter. This procedure makes it easier to catch the arrival laser beam. Once the positioning PDs detect the laser beam, the laser control system estimates the optical axis of the beam based on the distribution of laser intensity by applying the principle of Gaussian beam optics. Then, the direction of the discharged laser beam is adjusted to meet the center of the receiver lens. Finally, the optical devices are rearranged to create a thin and dense laser beam to maintain stable optical communications.
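As an illustration of the estimation step, the sketch below fits a Gaussian irradiance profile I(r) = I0 · exp(−2r²/w²) to the readings of the five positioning PDs and returns the estimated arrival point of the beam axis. The PD coordinates, the initial guess, and the use of a plain least-squares fit are our assumptions, not the authors' implementation.

import numpy as np
from scipy.optimize import least_squares

# placeholder coordinates of the five positioning PDs [m]
PD_XY = np.array([[0.0, 0.0], [0.3, 0.0], [0.0, 0.3], [-0.3, 0.0], [0.0, -0.3]])

def residuals(params, intensities):
    x0, y0, log_i0, w = params
    r2 = (PD_XY[:, 0] - x0) ** 2 + (PD_XY[:, 1] - y0) ** 2
    model = np.exp(log_i0) * np.exp(-2.0 * r2 / w ** 2)
    return model - intensities

def estimate_axis(intensities):
    # Fit the Gaussian profile to the five PD readings and return the
    # estimated center (x0, y0) of the arrived laser beam.
    intensities = np.asarray(intensities, dtype=float)
    guess = [0.0, 0.0, np.log(intensities.max()), 0.5]
    fit = least_squares(residuals, guess, args=(intensities,))
    return fit.x[0], fit.x[1]

Five measurements determine the four parameters (x0, y0, I0, w), so the fit is well posed whenever the beam actually illuminates the PDs.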
After all, the strategy expands and contracts the laser beam to the appropriate size by controlling the placement of the optical lenses. That is why we have investigated the relationship between the lens layout and the expanding size of the laser beam transmitted over a long distance by conducting optical simulations.
Fig. 8. Conceptual diagram of alignment by expanding/contracting laser beam.
3 Simulation

An optical model for the simulation is illustrated in Fig. 9. Four VCM-driven lenses are built into the FSO. By changing the positions of the four lenses in the simulation, we reproduce moving the lenses on the actual machine.
Fig. 9. Free space optical model for simulation.
Figure 10 shows the simulation results at the standard position of the lenses. The beam radius expands as the distance increases from 0 to 1000 m. The beam radius is 4, 16, and 120 mm at a transmission distance of 1, 10, and 100 m, respectively. If the transmission distance exceeds 100 m, the beam radius suddenly increases. Note that this optical system is designed to be used at a short distance of around 100 m.
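The internals of the optical simulator are not detailed, but the qualitative behavior above (slow growth up to around 100 m followed by rapid expansion) can be reproduced with Gaussian-beam propagation through ray-transfer (ABCD) matrices. In the sketch below, the lenses are ideal thin lenses, and the focal lengths and spacings are illustrative placeholders rather than the prototype's values.

import numpy as np

WAVELENGTH = 1550e-9   # [m], communication wavelength of the device
W0 = 5e-3              # initial beam radius [m] (10 mm diameter beam)

def free_space(d):
    # ray-transfer matrix for propagation over distance d [m]
    return np.array([[1.0, d], [0.0, 1.0]])

def thin_lens(f):
    # ray-transfer matrix for an ideal thin lens of focal length f [m]
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

def propagate(q, m):
    # transform the complex beam parameter q by the ABCD matrix m
    (a, b), (c, d) = m
    return (a * q + b) / (c * q + d)

def beam_radius(q):
    # beam radius w from Im(1/q) = -lambda / (pi w^2)
    return np.sqrt(-WAVELENGTH / (np.pi * (1.0 / q).imag))

# collimated beam at its waist: q = i z_R with z_R = pi w0^2 / lambda
q = 1j * np.pi * W0 ** 2 / WAVELENGTH

# placeholder four-lens train followed by 100 m of free space
for element in [thin_lens(0.05), free_space(0.06), thin_lens(-0.04),
                free_space(0.05), thin_lens(0.10), free_space(0.12),
                thin_lens(0.08), free_space(100.0)]:
    q = propagate(q, element)

print(f"beam radius at 100 m: {beam_radius(q) * 1e3:.1f} mm")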
We have conducted some computer simulations of the laser beam profile. First of all, each of Lens 1, Lens 2, Lens 3, and Lens 4 is moved independently to simulate the diameter of the laser beam at a point 100 m away. Figure 11 indicates the beam radius when displacing Lens 1, Lens 2, Lens 3, and Lens 4 independently.
Fig. 10. The simulation results at the standard position of the lens.
Fig. 11. Beam radius at 100 m apart.
When the displacement of Lens 1 or Lens 2 is changed in the range of 0 mm to +20 mm, the laser beam diameter at 100 m increases as the displacement increases. In contrast, there is almost no change in the beam radius at 100 m even if the displacement of Lens 3 or Lens 4 is changed. When Lens 3 and Lens 4 are moved, the beam radius is not enlarged, so moving them is not effective for measuring the beam reception intensity distribution.
Next, computer simulations are conducted under the conditions of displacing two lenses together. We move both Lens 1 and another lens at the same time to see the change in the beam radius. The simulation results when moving Lens 1 and Lens 2 are shown in Fig. 12. It simulates Lens 1 with a displacement of 0 mm to +20 mm and Lens 2 with a displacement of 0 mm to +20 mm, and shows the beam radius at 100 m for a total of 9 combinations. When the displacement of Lens 1 is 0 mm and the displacement of Lens 2 is 0 mm, the beam radius is 123.3 mm, which is the smallest. When the displacement of Lens 1 is +20 mm and the displacement of Lens 2 is 0 mm, the beam radius is 8349 mm, which is the largest. The beam radius becomes smaller as the displacements of Lens 1 and Lens 2 both approach 0 mm. The closer the displacements are to +20 mm for Lens 1 and 0 mm for Lens 2, the larger the beam radius.
Fig. 12. Beam radius when Lens 1 and Lens 2 are moved at the same time.
The simulation results when moving Lens 1 and Lens 3 are shown in Fig. 13. It simulates Lens 1 with a displacement of 0 mm to +20 mm and Lens 3 with a displacement of −10 mm to +10 mm, and summarizes the beam radius at 100 m for a total of 9 combinations. When the displacement of Lens 1 is 0 mm and the displacement of Lens 3 is −10 mm, the beam radius is 123.0 mm, which is the smallest. When the displacement of Lens 1 is +20 mm and the displacement of Lens 3 is +10 mm, the beam radius is 8854 mm, which is the largest. The beam radius becomes smaller as the displacement of Lens 1 approaches 0 mm and the displacement of Lens 3 approaches −10 mm. The closer the displacements are to +20 mm for Lens 1 and +10 mm for Lens 3, the larger the beam radius. The simulation results when moving Lens 1 and Lens 4 are shown in Fig. 14. It simulates Lens 1 with a displacement of 0 mm to +20 mm and Lens 4 with a displacement of −10 mm to +10 mm, and summarizes the beam radius at 100 m for a total of 9 combinations. When the displacement of Lens 1 is 0 mm and the displacement of Lens 4 is +10 mm, the beam radius is 196.1 mm, which is the smallest. When the displacement of Lens 1 is +20 mm and the displacement of Lens 4 is −10 mm, the beam radius is 15270 mm, which is the largest. The beam radius becomes smaller as the displacement
Fig. 13. Beam radius when Lens 1 and Lens 3 are moved at the same time.
Fig. 14. Beam radius when Lens 1 and Lens 4 are moved at the same time.
of Lens 1 approaches 0 mm and the displacement of Lens 4 approaches 0 mm. The closer the displacements are to +20 mm for Lens 1 and −10 mm for Lens 4, the larger the beam radius. The results suggest that it is appropriate to combine Lens 1 and Lens 4 to control the size of the laser beam radius. It is effective to move Lens 1 away from the light source and Lens 4 closer to the light source. When combining Lens 1 with one of the other lenses, it is effective to move the other lens without changing the displacement of Lens 1 to contract the beam radius.
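Under the same assumptions, the pairwise displacement scans can be imitated by perturbing the spacings in the ABCD sketch given earlier (reusing free_space, thin_lens, propagate, beam_radius, W0, WAVELENGTH, and numpy as np). Because the base layout is a placeholder, only the qualitative trend, a larger spread as Lens 1 moves away from the light source, should be expected to match the reported figures.

def radius_with_displacements(d1, d4):
    # Beam radius at 100 m when Lens 1 is shifted by d1 [m] and Lens 4
    # by d4 [m]. Shifting Lens 1 changes the spacings on both of its
    # sides; the ~100 m downstream path absorbs the shift of Lens 4,
    # so only its upstream gap is adjusted.
    q = 1j * np.pi * W0 ** 2 / WAVELENGTH
    train = [free_space(0.02 + d1), thin_lens(0.05),
             free_space(0.06 - d1), thin_lens(-0.04),
             free_space(0.05), thin_lens(0.10),
             free_space(0.12 + d4), thin_lens(0.08),
             free_space(100.0)]
    for m in train:
        q = propagate(q, m)
    return beam_radius(q)

for d1 in (0.0, 0.01, 0.02):          # 0 mm, +10 mm, +20 mm
    for d4 in (-0.01, 0.0, 0.01):     # -10 mm, 0 mm, +10 mm
        print(f"d1={d1}, d4={d4}: {radius_with_displacements(d1, d4):.3f} m")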
4 Conclusion

This paper describes the design and construction of the ad-hoc free space optics communication system. The authors also propose an algorithm to automatically connect the aerial optical communication line. It requires optical operations on the transmitted laser beam: expanding its radius to cover the positioning PDs installed in the receiver and contracting it to concentrate the intensity of the optical signals. Optical simulations are carried out to evaluate the profile of the laser beam emitted from the transmitter. They reveal the effects of the lens positions on the dispersion of the travelling laser beam. The beam radius is traced along the distance from 0 to 1000 m for the standard lens arrangement. It is confirmed that the laser beam gradually enlarges until 100 m and steeply extends afterwards. We conduct simulations for different arrangements of the four lenses in the transmitter by changing the position of a sole lens or a pair of lenses. The results have proved that we can manipulate the transmitted laser beam radius by controlling either of the two lenses placed far from the light source, or by altering the relative distance between the specified two lenses. After all, this paper has established the way of optical axis adjustment: the expanded laser beam covers the positioning PDs to measure the distribution of laser intensity and to estimate the arrived position; after correcting the optical axis position, the beam is reformed thinner to concentrate the transmission energy.
References
1. Pratt, W.K.: Laser Communication Systems. Wiley, New York (1969)
2. Ueno, Y., Nagata, R.: An optical communication system using envelope modulation. IEEE Trans. COM-20(4), 813 (1972)
3. Willebrand, H., Ghuman, B.S.: Free-Space Optics: Enabling Optical Connectivity in Today's Networks. Sams Publishing, Indianapolis (1999)
4. Nykolak, G., et al.: Update on 4×2.5 Gb/s, 4.4 km free-space optical communications link: availability and scintillation performance. In: Optical Wireless Communications II. Proceedings of SPIE, vol. 3850, pp. 11–19 (1999)
5. Dodley, J.P., et al.: Free space optical technology and distribution architecture for broadband metro and local services. In: Optical Wireless Communications III. Proceedings of SPIE, vol. 4214, pp. 72–85 (2000)
6. Vitasek, J., et al.: Misalignment loss of free space optic link. In: 16th International Conference on Transparent Optical Networks, pp. 1–5 (2014)
7. Dubey, S., Kumar, S., Mishra, R.: Simulation and performance evaluation of free space optic transmission system. In: International Conference on Computing for Sustainable Global Development, pp. 850–855 (2014)
8. Wang, Q., Nguyen, T., Wang, A.X.: Channel capacity optimization for an integrated WiFi and free-space optic communication system. In: 17th ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, pp. 327–330 (2014)
9. Kaur, P., Jain, V.K., Kar, S.: Capacity of free space optical links with spatial diversity and aperture averaging. In: 27th Biennial Symposium on Communications, pp. 14–18 (2014)
10. Tsujimura, T., Yoshida, K.: Active free space optics systems for ubiquitous user networks. In: 2004 Conference on Optoelectronic and Microelectronic Materials and Devices, pp. 197–200 (2004)
11. Tanaka, K., Tsujimura, T., Yoshida, K., Katayama, K., Azuma, Y.: Frame-loss-free optical line switching system for in-service optical network. J. Lightwave Technol. 28(4), 539–546 (2010)
12. Yoshida, K., Tsujimura, T.: Seamless transmission between single-mode optical fibers using free space optics system. SICE J. Control Measur. Syst. Integr. 3(2), 94–100 (2010)
13. Yoshida, K., Tanaka, K., Tsujimura, T., Azuma, Y.: Assisted focus adjustment for free space optics system coupling single-mode optical fibers. IEEE Trans. Ind. Electron. 60, 5306–5314 (2013)
14. Shimada, Y., Tashiro, Y., Yoshida, K., Izumi, K., Tsujimura, T.: Initial alignment method for free space optics laser beam. Jpn. J. Appl. Phys. 55, 8S3 (2016)
15. Tsujimura, T., Suito, Y., Yamamoto, K., Izumi, K.: Spacial laser beam control system for optical robot intercommunication. In: 2018 IEEE International Conference on Systems, Man, and Cybernetics (2018)
16. Tsujimura, T., Izumi, K., Yoshida, K.: Collaborative all-optical alignment system for free space optics communication. In: Xhafa, F., Barolli, L., Greguš, M. (eds.) INCoS 2018. LNDECT, vol. 23, pp. 146–157. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-98557-2_14
17. Tsujimura, T., Shimogawa, R., Yamamoto, K., Izumi, K.: Design of binocular active free space optics transmission apparatus. In: 2019 IEEE 44th Conference on Local Computer Networks (LCN) (2019)
18. Yamamoto, K., Simogawa, R., Izumi, K., Tsujimura, T.: Optical axis estimation method using binocular free space optics. In: Barolli, L., Nishino, H., Miwa, H. (eds.) INCoS 2019. AISC, vol. 1035, pp. 247–256. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29035-1_24
Employee Management Support Application for Regional Public Transportation Service in Japan

Chinasa Sueyoshi1(B), Hideya Takagi1, Toshihiro Uchibayashi2, and Kentaro Inenaga1

1 Kyushu Sangyo University, Fukuoka, Japan
{sueyoshi,inenaga}@is.kyusan-u.ac.jp, [email protected]
2 Kyushu University, Fukuoka, Japan
[email protected]
Abstract. Recently, there have been several issues with regional public transportation in Japan that are different from the issues encountered in major transportation systems. The use of information technology is expected to solve these issues. The shortage of human resources is a severe problem that requires a solution at the earliest. In Japan, regional public transportation systems fill the gaps in depopulated areas where the major transportation systems are unable to operate or have been withdrawn. Because most of the financial resources required for managing and operating these regional public transportation systems are provided by the respective local governments, the systems experience a chronic shortage of funds. Hence, regional public transportation systems employ only a limited number of personnel. There has been an increasing number of reported accidents caused by drivers who become inattentive while driving because of health issues. The number of drivers in these regional transportation systems who suffer from health issues will increase with the aging of the workforce. We believe the operation of regional public transportation systems must be managed by monitoring the health status of the drivers. Therefore, to digitize the operations management of the regional transportation systems and improve their work efficiency, we developed an application with two modules, namely, one for operation managers and the other for call operators. The application will support employee management in the systems and reduce the burden on the employees. One function of the application will be to link the health status of the employees with operations management using employee health information.

Keywords: Regional public transportation · Employee management support · Community bus · Mobile application
1 Introduction
Over the last few years, research on regional public transportation in Japan has been actively pursued, mostly by local governments [2,6,9,10]. Regional public
transportation has encountered several problems, which are different from those found in major transportation. These problems could be solved using information technology. The regional transportation systems suffer from a shortage of human resources, which has become a severe problem that requires an urgent solution. According to the statistics maintained by the Metropolitan Police Department of Japan, the number of Class 2 licensed drivers of buses, cabs, hire cars, and private ambulances has been decreasing year by year. For example, in 2008 the number of driving license holders was 1,100,000, whereas in 2018 the figure was approximately 900,000, i.e., a decrease to approximately 80% of the 2008 figure. Several regional public transportation systems operate in depopulated areas because the major transportation systems have either been withdrawn from those areas or are unable to handle them. The management and operation of regional public transportation systems is funded mostly by the respective local governments. The local governments are already experiencing fund shortages and thus find it financially challenging to secure sufficient human resources to operate the transportation systems. The systems, therefore, are compelled to operate with a limited number of employees. Providing the systems with the latest equipment is also difficult. The operations management in these systems is often still done on paper. With a shortage of staff, the existing staff are overloaded and burdened with work. With no new staff joining the system, the workforce of a regional transportation system is aging. According to the results of the "Basic Survey on Wage Structure" conducted by the Ministry of Health, Labour and Welfare of Japan, the average age of bus drivers in the country has increased over the years; for example, it has increased from 46.5 years in 2010 to 50.7 years in 2019. In 2019, the average ages of drivers in companies with 10 to 99, 100 to 999, and more than 1000 employees were 54.8, 50.8, and 47.8 years, respectively. Thus, the average age of a driver increases as the company size decreases. Most of the regional public transport companies have only 10–99 employees. The average age of the employees in a company of that size is the highest among the three company sizes. According to a report prepared by the Ministry of Land, Infrastructure, Transport and Tourism of Japan, the number of reported commercial vehicle accidents for which the poor health of drivers was responsible is increasing. In 2017, approximately 300 such accidents were reported. Approximately 30% of the drivers responsible for these accidents had been inattentive while driving. Hence, as the number of employees with health issues increases with the aging of the workforce, the health status of the drivers in the regional transportation systems will require monitoring. We developed an application to support employee management in the regional transportation systems of Japan. The application will digitize the operations management and improve the operational efficiency of regional public transportation systems while reducing the burden on the employees. The application is intended for use by operation managers and call operators. One of the functions of the application will be to link the employee health status with operations management. In this paper, we introduce and discuss the developed application. Section 2 presents the salient information presented in related papers of
other authors. Section 3 outlines the issues faced by regional public transportation systems, whereas Sect. 4 explains the development and implementation of the application to solve the issues. Section 5 presents the conclusion.
2 Related Works
This section introduces existing studies on employee management and demonstration experiments targeting cities outside Japan. A past study by Paandian et al. on employee health care prepared an online database of employee medical history and developed a platform for information exchange among employers, employees, and healthcare providers through a web service [5]. Several organizations offer medical benefit schemes for their employees to ensure a healthy work environment. These medical benefit schemes manage the medical expenses of the employees. However, they do not store the medical records of the employees or monitor their health. Two past studies on urban transportation have explored the transportation systems of Dhaka and Taiwan [1,3]. The Dhaka study designed an intelligent transportation system for the public buses in Dhaka to enable passengers to see the name of the bus, the distance traveled, the travel cost, and the route map, which can be used to add, delete, or reroute routes as required. The system operates under the supervision of traffic controllers and traffic bureaus. In Taiwan, risk and health are linked to occupational issues. The significance of ergonomic exposures that can lead to musculoskeletal disorders (MSDs) was assessed; the workstations used by the bus drivers were evaluated; and solutions to minimize MSDs in drivers were recommended. The authors concluded that MSDs can affect anyone who uses improper work methods and postures; by following the stipulated guidelines, the bus drivers will be comfortable when driving. In their study on employee management, [4] presents a cloud-based attendance management system using the Near Field Communication (NFC) technology. Accurate employee attendance records play an essential role in managing work discipline and improving worker productivity. Human errors and fraudulent time and attendance records can affect the productivity of an organization. The differences among the time and attendance policies of different organizations complicate the evaluation of the working hours of employees using the time and attendance systems available in the market. An automated time and attendance system, therefore, is essential to improve organizational performance and profitability. However, some of the existing time and attendance systems have limitations in the speed and cost of the equipment required, real-time monitoring of attendance, and the flexibility provided for managing database storage size. Therefore, they proposed an application which provides essential functions such as attendance record capture using the NFC technology, automatic time calculation, vacation and overtime hour checking, working hour evaluation, real-time updates, and report creation. User satisfaction evaluations indicate that the proposed system is practical and satisfactory.
3 Regional Public Transportation Issues
As shown in Fig. 1, we studied the regional public transportation systems operated by more than 10 local governments in Fukuoka Prefecture in Japan [7,8]. To identify the issues faced in these transportation systems, we conducted onboard questionnaire surveys among passengers and studied the passenger count records maintained by the drivers, public transportation infrastructure data, and bus location systems. Staff shortage and workforce aging were the most significant issues faced by the transportation systems. In contrast to most of the major transportation systems in Japan that use a distance-based fare system, the regional public transportation systems operate on a fixed fare structure. The fares being low, the primary source of funding of a regional transportation system is the local government that operates it. Consequently, these regional transportation systems face financial difficulties and must be managed carefully. To operate smoothly in the presence of a staff shortage, each employee, especially an operations manager, must carry a heavy workload. Because the transportation systems cannot afford expensive equipment, their operations managers adopt conventional analog methods to perform their work on paper. Therefore, we developed an application to support operations management through digitization. In Japan, cab and bus drivers are required by law to take roll calls before and after each ride. To optimize the pre- and post-boarding processes, we implemented a mechanism that allows the application to take roll calls tutorially with automatic transitions, thereby relieving the drivers from the responsibility of taking roll calls. Any information on operations management can be easily obtained using the application, which will reduce the work pressure on the employees caused by the overabundance of operations management information. The information can be added and edited easily. The application can also produce roll call information in real time, eliminating time lags and reducing errors. To link employee health information with operations management information, the health checkup and hospital visit histories provided by the employees can be fed to the operations management system, enabling the operations manager to check the health status of an employee and manage the operations efficiently. Applying the developed application to actual regional public transportation will make it possible to manage the operation more efficiently. Efficient operations management will create financial leeway and generate employment. Employee health management can prevent health-related accidents and protect the safety of employees.
4 Development and Implementation
This section describes the development and implementation of the application to solve the regional public transportation issues (Fig. 2). The application has two modules: one for call operators and the other for operation managers. The call operator can use the employee management support module intended for call operators to call the employee before and after work. The purpose of the module
Fig. 1. The regional public transportation services in Fukuoka supported by our study.
intended for operation managers is to allow viewing of employee information and support operations management.
4.1 Support Application for Call Operator
Figure 3 shows the flow of the module. The call operator makes the call on behalf of the employee. Figure 4 shows the template of the employee identity (ID) card. First, the call operator logs in to the system by getting the application to read the QR code of the employee or by entering the employee ID card number and password (Fig. 5). Each employee has a unique QR code printed on his/her ID card. Next, the call operator selects either “pre-” or “post-” in the application and makes the call on behalf of the employee (Fig. 6). The employee is recognized by reading the QR code printed on his/her employee ID card. The call operator
Fig. 2. Employee management support application.
begins to make the roll call. In the pre-trip call, questions are asked on nine items (Fig. 7). In the post-trip call, questions are asked on eight items (Fig. 8). The call operation can be easily performed by following the items displayed on the screen. The inputs are immediately sent to the server for sharing with the operation manager's employee support application. Although the steps to be followed are slightly different for the pre- and post-ride calls, the work procedures have the same flow.
Fig. 3. Application flow.
Fig. 4. Employee ID card template.
4.2 Support Application for Operation Manager
Figure 9 shows the login screen of the application module intended for operation managers. Unlike call operators, operation managers can log in to the system by entering only the employee ID card number and password. After the login is completed, the employee list is displayed (Fig. 10). The employee list shows static and dynamic information regarding the employee status. The static information contains the branch name and the name (JP), chronic illnesses, and hospital visit frequency of the employee. The dynamic information contains the car number, attendance, and the date of the next hospital visit of the employee. The call operators update the car number and attendance. The date of the next hospital visit of an employee is entered using the relevant medical checkup information, as described below. The navigation bar can be used to add an employee, filter the employee list, and reset and update the filtered employee list (Fig. 11). Using the Add Employee screen, a new employee can be added by entering the basic information regarding the employee (Fig. 12). The dynamic information of an employee, such as attendance, is not documented here because it will be added automatically later. Figure 13 shows the health records of the employees. The health record of a particular employee can be accessed by clicking the employee name in the list of employees and then clicking the health checkup item. This record can be viewed only with the consent of the employee. The records of employees who have not consented will not be available. On the same screen, the results of past health checkups can be viewed. The items displayed are the last checkup date, the number of checkups conducted within the year, the person who conducted the checkup, whether the checkup was repeated, and the details of the checkups. Using the navigation bar, the results of health checkups can be added and the checkup list can be updated (Fig. 14). The screen for adding and editing health checkup results can be used to enter health checkup findings (Fig. 15).
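The static/dynamic split of the employee information can be mirrored directly in the application's data model. The sketch below is our guess at such a record based on the screens described; the field names are not the actual schema.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class EmployeeRecord:
    # static information, entered when the employee is registered
    branch_name: str
    name_jp: str
    chronic_illnesses: List[str] = field(default_factory=list)
    hospital_visit_frequency: Optional[str] = None

    # dynamic information, updated by call operators and checkup entries
    car_number: Optional[str] = None
    attendance: Optional[bool] = None
    next_hospital_visit: Optional[str] = None  # ISO date string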
Fig. 5. Call operator login page.
Fig. 6. Call select page.
Fig. 7. Pre-call operation data input page.
Fig. 8. Pre-call operation data confirmation page.
Fig. 9. Operations manager login page.
Fig. 10. Employee information list page.
4.3 Backend Environment of the Applications
Figure 16 shows the block diagram of the overall support system. The backend system is built in a serverless environment provided by Amazon Web Services (AWS). A REST API provides access to the backend. The application is connected to the backend via the API. The API has three endpoints: "driverinfo/" for employee information, "user/" for user information, and "healthinfo/" for the health information of the employees. Each endpoint can manipulate the database of related information, with operations such as "add/" for adding and "delete/" for deleting. The AWS backend that provides the API consists of the API Gateway, Lambda, and DynamoDB. Each operator has its own API.
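Based on the endpoint names given above, a call from one of the modules might look like the following; the base URL, payload fields, response handling, and absence of authentication are assumptions for illustration only.

import requests

API_BASE = "https://example.execute-api.amazonaws.com/prod"  # placeholder URL

def add_driver_info(record: dict) -> bool:
    # POST a new employee record to the 'driverinfo/' endpoint's
    # 'add/' operation; returns True on success.
    resp = requests.post(f"{API_BASE}/driverinfo/add/", json=record, timeout=10)
    return resp.ok

# hypothetical payload; the real schema is not published
add_driver_info({"branch": "Fukuoka", "name_jp": "Taro Yamada", "chronic_illnesses": []})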
Fig. 11. Employee information list navigation bar.
Fig. 12. Page used for adding a new employee.
Fig. 13. Health checkup result navigation bar.
Fig. 14. Page used for editing a health checkup result.
Fig. 15. Page used for editing a health checkup result.
Fig. 16. Block diagram of the support application.
5 Conclusion
In this paper, we focused on the financial issues encountered in regional public transportation systems in Japan and the aging of their employees. We developed an application to solve these issues. The developed application has two modules, one for use by call operators and the other by operations managers. The modules are connected to a REST API. The information entered in each module is immediately shared with the other module. Using this application, the health status of an employee can be monitored, thereby relieving the burdens on the operations managers. In the future, we expect to prevent accidents by acquiring real-time health data from the smartwatches worn by the employees and linking the data to the application. We also plan to strengthen the security of the server environment and develop it into a platform that can handle personal information. In addition, we plan to have actual regional public transportation operators use the developed application to evaluate it and obtain feedback, and then refine the application to make it more in line with actual operation patterns.

Acknowledgements. This research was supported by JSPS KAKENHI Grant Numbers JP17K00472 and 21K18021, and Kyushu Sangyo University 2020 KSU Research Funds Number K060267.
References
1. Estember, R.D., Huang, C.: Essential occupational risk and health interventions for Taiwan's bus drivers. In: 2019 IEEE 6th International Conference on Industrial Engineering and Applications (ICIEA), pp. 273–277 (2019)
2. Ichikawa, K., Nakatani, Y.: Driver support system: spatial cognitive ability and its application to human navigation. In: Smith, M.J., Salvendy, G. (eds.) Human Interface and the Management of Information. Interacting in Information Environments, pp. 1013–1019 (2007)
3. Kamal, M.M., Islam, M.Z., Based, M.A.: Design of an intelligent transport system for Dhaka city. In: 2020 IEEE Region 10 Symposium (TENSYMP), pp. 1812–1815 (2020)
4. Oo, S.B., Oo, N.H.M., Chainan, S., Thongniam, A., Chongdarakul, W.: Cloud-based web application with NFC for employee attendance management system. In: 2018 International Conference on Digital Arts, Media and Technology (ICDAMT), pp. 162–167 (2018)
5. Paandian, M., Malarvili, M.B.: A web based employee medical history management and monitoring system. In: 2012 IEEE-EMBS Conference on Biomedical Engineering and Sciences, pp. 789–794 (2012)
6. Shimada, Y., Takagi, M., Taniguchi, Y.: Person re-identification for estimating bus passenger flow. In: 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pp. 169–174 (2019)
7. Sueyoshi, C., Takagi, H., Inenaga, K.: Analysis of the number of passengers in consecutive national holiday collected with a practical management support system in the case of community bus of Shingu town in Japan. In: 2020 11th International Green and Sustainable Computing Workshops (IGSC), pp. 1–8 (2020)
8. Sueyoshi, C., Takagi, H., Inenaga, K.: Developing practical management support system for regional public transportation service provided by municipalities. In: Proceedings of the 4th International Conference on Compute and Data Analysis (ICCDA 2020), pp. 23–28 (2020)
9. Uchimura, K., Takahashi, H., Saitoh, T.: Demand responsive services in hierarchical public transportation system. IEEE Trans. Veh. Technol. 51(4), 760–766 (2002)
10. Watanabe, C., Ishikawa, S., Yabe, H., Tanaka, M.S.: A study of a bus stop that displays the current location of the bus to increase user convenience. In: 2020 IEEE 9th Global Conference on Consumer Electronics (GCCE), pp. 268–269 (2020)
Message Ferry Routing Based on Nomadic Lévy Walk in Delay Tolerant Networks

Koichiro Sugihara and Naohiro Hayashibara(B)

Kyoto Sangyo University, Kyoto, Japan
{i1986089,naohaya}@cc.kyoto-su.ac.jp
Abstract. Message ferrying is a way of communication in intermittently connected delay-tolerant networks. It is used for collecting data from stationary sensor nodes. The efficiency of message delivery using message ferries depends on the routing scheme. Nomadic Lévy Walk, a variant of Lévy Walk, is an eligible candidate for a message ferry routing scheme. It includes homing behavior in addition to behavior similar to Lévy walk, with strategic relocation of the home (sink) position. In this paper, we conducted several simulation runs on a DTN to measure the delivery ratio and the latency of message delivery by message ferries with several configurations. We also discuss the correlation between the efficiency of message delivery and the link density of graphs according to our simulation results.
1 Introduction

Various messaging protocols in Delay Tolerant Networks (DTNs) have been proposed so far [10]. This is because WSNs have attracted attention in various research fields. In this type of network, mobile/stationary nodes which have wireless communication capability with a limited communication range are located in a field, and messages sent by nodes are delivered to the destination in a Store-Carry-Forward manner because end-to-end paths do not always exist. It means that each message or data item is carried to the destination by mobile entities which repeat message passing to each other. This type of communication is useful to send and deliver messages without a stable network infrastructure, especially in the case of disasters. Now, we focus on the message ferrying based approach for data collection in DTNs. Message ferries (or MFs for short) are mobile nodes that move around the field to collect messages from nodes and to deliver them to destinations properly. Whenever a node sends a message to other nodes in the network, it first passes the message to an MF with short-range communication capability (e.g., Bluetooth) when the MF gets close. The message is then carried to the destination by MFs. Therefore, the efficiency of message delivery depends on the routing scheme of the MFs. Most of the message ferry routing schemes assume a fixed route; for instance, MFs move on a circular path [9,22,23]. However, a fixed route-based routing scheme does not work if a part of the route is not available due to some event such as road maintenance or a disaster. We focus on a random walk-based routing scheme for message ferry routing. In particular, Lévy walk has recently attracted attention due to the optimal search in animal
foraging [7,21] and the statistical similarity to human mobility [11]. It is a mathematical fractal that is characterized by long segments followed by shorter hops in random directions. Lévy walk has also been used to improve the spatial coverage of mobile sensor networks [19], to analyze the properties of an evolving network topology of a mobile network [5], and to enhance grayscale images combined with other bio-inspired algorithms (i.e., bat and firefly algorithms) [6]. It is also considered to be particularly useful for message ferry routing that aims to deliver messages in DTNs [12]. Nomadic Lévy Walk (NLW) is a variant of Lévy walk, which has been proposed by Sugihara and Hayashibara [16]. An agent starts from a sink node and returns to the sink node with a given probability α. Each sink node changes its position according to a given strategy with probability γ. We suppose that messages are delivered by battery-driven autonomous electric vehicles acting as MFs. Moreover, they have an onboard camera to drive along the road and avoid obstacles. Each of them basically moves with a mobile power source vehicle. Then, it departs from the sink to visit nodes for collecting and delivering messages. It is required to go back to the sink to charge its battery. This paper applies NLW as a message ferry routing scheme and measures the message delivery ratio and the latency in a DTN. We conducted simulations to measure these criteria with several different configurations. Then, we analyze the simulation results to clarify the impact of the parameter γ and the given sink relocation strategy of NLW on the message delivery ratio and the latency. We also discuss how the link density of graphs influences the delivery ratio and delivery latency.
2 Related Work

We now introduce several research works related to message ferries in DTNs and WSNs. Then, we explain existing movement models that can be used for message ferry routing.

2.1 Message Ferry in DTNs and WSNs

Tariq et al. proposed the Optimized Way-points (OPWP) ferry routing method to facilitate connectivity in sparse ad-hoc networks [4]. According to the simulation result, OPWP-based MFs outperform other ferry routing methods based on the Random Way Point (RWP) model. Shin et al. apply the Lévy walk movement pattern to the routing of message ferries in DTNs [12]. They demonstrated message diffusion using message ferries based on various configurations of Lévy walk. According to the simulation result, the ballistic movement of message ferries (i.e., a smaller scaling parameter of Lévy walk) is efficient regarding the message delay. Basagni et al. proposed the notion of a mobile sink node and its heuristic routing scheme called Greedy Maximum Residual Energy (GMRE) [3]. The motivation of the work is to prolong the lifetime of wireless sensor nodes that are deployed in a large field and send data to the sink periodically. The mobile sink, as a data collection and processing point, moves around the field to save the energy consumption of sensor nodes. The simulation result showed that the mobile sink with the proposed routing scheme improved the lifetime of sensor nodes.
Alnuaimi et al. proposed a ferry-based approach to collect sensor data in WSNs [1,2]. It divides a field into virtual grids and calculates an optimal path for a mobile ferry to collect data with a minimum round trip time. It utilizes a genetic algorithm and the node ranking clustering algorithm for determining the path.
2.2 Movement Models
Birand et al. proposed the Truncated Lévy Walk (TLW) model based on real human traces [5]. The model gives heavy-tailed characteristics of human movement. The authors analyzed the properties of graph evolution under the TLW mobility model. Valler et al. analyzed the impact of mobility models, including Lévy walk, on epidemic spreading in MANETs [20]. They adopted the scaling parameter λ = 2.0 in the Lévy walk mobility model. From the simulation result, they found that the velocity of mobile nodes does not affect the spread of virus infection. Thejaswini et al. proposed a sampling algorithm for mobile phone sensing based on the Lévy walk mobility model [19]. The authors showed that the proposed algorithm gives significantly better performance compared to the existing method in terms of energy consumption and spatial coverage. Fujihara et al. proposed a variant of Lévy flight called Homesick Lévy Walk (HLW) [8]. In this mobility model, agents return to the starting point with a homesick probability after arriving at the destination determined by the power-law step length. As their result, the frequency of agent encounters obeys a power-law distribution, though those of random walks and Lévy walks do not. Most of the works related to Lévy walk assume a continuous plane, and hardly any results on graphs are available. Shinki et al. defined the algorithm of Lévy walk on unit disk graphs [14]. They also found that the search capability of Lévy walk improves as the distance between the target and the initial position of the searcher increases. It is also efficient if the average degree of a graph is small [13]. Sugihara et al. proposed a novel mobility model called Nomadic Lévy Walk (NLW) as a variant of HLW, together with a sink relocation strategy based on a hierarchical clustering method [16,17]. They conducted simulations to measure the cover ratio on unit disk graphs. The simulation result showed that the mobility model covers a wide area while preserving homing behavior.
3 System Model

We assume a delay tolerant network (DTN) modeled by a unit disk graph UDG = (V, E) with a constant radius r. Each node v ∈ V is located in the Euclidean plane, and an undirected edge (link) {v_i, v_j} ∈ E between two nodes v_i and v_j exists if and only if the distance between these nodes is less than 2r. Nodes in the network represent landmarks, such as shelters in disaster cases. Note that r is a Euclidean distance in the plane. We assume that any pair of nodes in the graph has a path (i.e., the graph is connected). This does not mean network connectivity; it just means that there is a geometric route to any node in the field. UDGs are often used for modeling road networks. We assume the field restricts the movement of mobile entities to the links.
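A minimal sketch of constructing such a unit disk graph follows: n nodes are placed uniformly at random in a rectangular field, and two nodes are linked iff their Euclidean distance is less than 2r, exactly as defined above. The field size is our choice; if the sampled graph is not connected, it would have to be resampled to satisfy the stated assumption.

import math
import random

def unit_disk_graph(n, r, width=1000.0, height=1000.0, seed=1):
    # Place n nodes uniformly at random and link two nodes iff their
    # Euclidean distance is less than 2r.
    rng = random.Random(seed)
    pos = {v: (rng.uniform(0, width), rng.uniform(0, height)) for v in range(n)}
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if math.dist(pos[u], pos[v]) < 2 * r:
                adj[u].add(v)
                adj[v].add(u)
    return pos, adj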
We also assume a mobile entity called a message ferry (MF), which collects messages from nodes and forwards them to the destination. This assumption is similar to the existing work [1, 2, 12, 23]. MFs know the positions of nodes in the network. Practically speaking, an MF is a battery-driven autonomous electric vehicle that is required to charge its battery within a certain period of time. Moreover, MFs and nodes have short-range communication capability such as Bluetooth, the ad hoc mode of IEEE 802.11, Near Field Communication (NFC), infrared transmission, and so on. Therefore, each node is able to transmit a message to an MF when the MF gets close to the node. Each MF starts moving from the corresponding power source vehicle, called the sink, and it goes back to the sink to recharge its battery. There are two types of ferry interaction.

• Ferry-to-Ferry interaction. Message ferries exchange messages with each other if they are at the same node (i.e., they are in each other's communication range).

• Ferry-to-Node interaction. Each sensor node can send data piggybacked on a message to an MF when the MF is physically close to the node (i.e., it is at the node in the graph).

We also assume that each MF identifies its position (e.g., obtained by GPS), which is accessible to the MF, and that it has a compass to get the direction of a walk. Each MF has a set of neighbor nodes, and such information (i.e., the positions of neighbors) is also accessible. Moreover, an MF has an onboard camera to drive along the road and avoid obstacles. We suppose that MFs move and visit shelters, which are represented as nodes in this paper, to collect and deliver messages in the case of disasters.
4 Nomadic Lévy Walk

We proposed a variant of Lévy walk called Nomadic Lévy Walk in our previous work [16, 17] to improve the ability of broad-area search while preserving the homing behavior. NLW is an extension of HLW [8], and it holds the following properties in addition to those of HLW.

Nomadicity: Each sink moves its position with the given probability γ.

Sink relocation strategy: The next position of the sink is decided by a particular strategy.

The movement of each MF obeys HLW; thus, their trajectories radiate from their sinks. Moreover, they move their sink with probability γ in Nomadic Lévy walk. In fact, a fixed sink restricts the area that each MF explores, and the nomadicity property is expected to improve the coverage achieved by each MF. We now explain the details of the algorithm of Nomadic Lévy walk on unit disk graphs in Algorithm 4.
1: Initialize:
   s ← the position of the sink
   c ← the current position
   o ← 0                           {orientation for a walk}
   PN(c) ← the possible neighbors to move to
2: if Probability: α then
3:   d ← the distance to s
4:   o ← the orientation of s
5: else
6:   d is determined by the power-law distribution
7:   o is randomly chosen from [0, 2π)
8: end if
9: if Probability: γ then
10:   s ← P(c)                     {update the position of the sink according to the given strategy}
11: end if
12: while d > 0 do
13:   PN(c) ← {x | abs(θ_ox) < δ, x ∈ N(c)}
14:   if PN(c) ≠ ∅ then
15:     d ← d − 1
16:     move to v ∈ PN(c) where v has the minimum abs(θ_ov)
17:     c ← v
18:   else
19:     break                      {no possible node to move to}
20:   end if
21: end while
At the beginning of the algorithm, each MF holds its initial position as the sink position s. In every walk, each MF determines the step length d by the power-law distribution and selects the orientation o of a walk at random from [0, 2π). It can obtain a set of neighbors N(c) and a set of possible neighbors PN(c) ⊆ N(c), to which MFs can move, from the current node c. In other words, a node x ∈ PN(c) has a link with c such that the angle θ_ox between o and the link is smaller than π/2. In unit disk graphs, it is not always true that links exist in the designated orientation, so we introduce δ, a permissible error against the orientation. In this paper, we set δ = 90°, meaning that MFs can select links to move within ±90° of the orientation o. With a given probability α, the MF goes back to the sink node (line 2 in Algorithm 4); in this case, it sets the orientation toward the sink node as o and the distance to the sink as d. Each MF changes its sink position with the given probability γ (line 10 in Algorithm 4) and then starts a journey from the new sink.
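As a minimal illustration, the Python sketch below implements one walk of the algorithm above. The MF state is reduced to its current node and sink node, the step-length bounds in power_law_step are assumptions, and hop_distance (a BFS hop count toward the sink) and relocate_sink (one of the Sect. 4.1 strategies) are assumed helpers.

    import math
    import random

    def power_law_step(lmbda=1.2, d_min=1.0, d_max=100.0):
        """Step length from a truncated power law P(d) ~ d^(-lmbda)
        via inverse-transform sampling (bounds are assumptions)."""
        a, b = d_min ** (1 - lmbda), d_max ** (1 - lmbda)
        return (a + random.random() * (b - a)) ** (1 / (1 - lmbda))

    def orientation(p, q):
        """Angle of the vector p -> q, normalized to [0, 2*pi)."""
        return math.atan2(q[1] - p[1], q[0] - p[0]) % (2 * math.pi)

    def angle_diff(a, b):
        d = abs(a - b) % (2 * math.pi)
        return min(d, 2 * math.pi - d)

    def nlw_walk(mf, adj, pos, alpha, gamma, delta=math.pi / 2):
        """One walk of Nomadic Levy Walk on a unit disk graph."""
        if random.random() < alpha:              # homesick: head back to the sink
            d = hop_distance(adj, mf.current, mf.sink)  # assumed helper (BFS)
            o = orientation(pos[mf.current], pos[mf.sink])
        else:
            d = power_law_step()
            o = random.uniform(0, 2 * math.pi)
        if random.random() < gamma:              # nomadicity: relocate the sink
            mf.sink = relocate_sink(mf)          # assumed Sect. 4.1 strategy
        while d > 0:
            # neighbors whose link deviates from o by less than delta
            cands = [(angle_diff(orientation(pos[mf.current], pos[v]), o), v)
                     for v in adj[mf.current]]
            cands = [(dev, v) for dev, v in cands if dev < delta]
            if not cands:
                break                            # no possible node to move to
            d -= 1
            mf.current = min(cands)[1]           # minimum angular deviation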
4.1 Sink Relocation Strategy
Each sink relocates its position according to the given strategy in NLW. Obviously, the sink position is one of the important factors for covering a wide area of a graph. Each sink manages the history H of past sink positions. We have proposed several sink relocation strategies for NLW [16, 17]; we now introduce those strategies as follows.
4.1.1 Lévy Walk Strategy (LWS)

The next position of the sink is changed obeying the Lévy walk movement pattern. The orientation o of the next base is selected at random from [0, 2π), and its distance d from the current position is decided by the power-law distribution.

4.1.2 Reverse Prevention Strategy (RPS)

This strategy has been proposed in [16]. In this strategy, each agent maintains a set H of past sink positions. It calculates the reverse orientation of each position in the set. Then, it determines the next sink so that its position is on the opposite side of the past sink positions.

4.1.3 Clustering-Based Reverse Prevention Strategy (CRPS)

In this strategy, each sink maintains an ordered set Bhist of the history of sink positions, with size h, to store the coordinates of the locations at which the sink has been located in the past [17]. It calculates the reverse orientation of each position in Bhist. The sink positions could become biased toward a particular area of the graph while relocating; as a result, the cover ratio achieved by MFs could stop improving. For this reason, we use the unweighted pair group method with arithmetic mean (UPGMA) clustering method [15] to detect a bias of sink positions. UPGMA is an agglomerative hierarchical clustering method based on the average distance of members in two clusters. It computes a pairwise distance pdist(A, B) between two clusters A and B as follows:

pdist(A, B) = (1 / (|A| · |B|)) Σ_{a∈A} Σ_{b∈B} d(a, b)    (1)

where d(a, b) is the distance between elements a and b. In each step, a cluster A merges with another cluster B so that pdist(A, B) is the minimum value. Finally, all elements are in one cluster, and the merging process can be represented as a binary tree. First, each sink constructs a matrix of pdist({x}, {y}), where x, y ∈ H, calculated by Eq. 1; the distance d(x, y) is the Euclidean distance. Then, two clusters {x} and {y} merge into one if pdist({x}, {y}) is the minimum value and pdist({x}, {y}) ≤ T. This merging phase is repeated until there is no candidate to merge. As a result, the sink determines that the past sink positions are biased if the size of the biggest cluster is larger than |H|/2. In the case of detecting a bias of sink positions, the next sink s will be relocated under the following condition, where C ⊆ H is the biggest cluster:

pdist(C, {s}) > T
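The Python sketch below shows one way the bias test can be realized. It is a simplified UPGMA-style agglomeration over the position history H under the merge threshold T, not the authors' exact implementation.

    import math

    def pdist(A, B):
        """Average pairwise Euclidean distance between clusters (Eq. 1)."""
        return sum(math.dist(a, b) for a in A for b in B) / (len(A) * len(B))

    def sink_positions_biased(H, T):
        """Merge past sink positions bottom-up; report a bias when the
        biggest cluster holds more than half of the history |H|."""
        clusters = [[p] for p in H]
        while len(clusters) > 1:
            d, i, j = min((pdist(a, b), i, j)
                          for i, a in enumerate(clusters)
                          for j, b in enumerate(clusters) if i < j)
            if d > T:
                break                    # no pair within the threshold
            clusters[i].extend(clusters[j])
            del clusters[j]
        biggest = max(clusters, key=len)
        return len(biggest) > len(H) / 2, biggest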
5 Performance Evaluation

We measure the average message delivery ratio, which indicates the ratio of sensor nodes from which MFs have collected data, i.e., the efficiency of data collection by MFs. We show the message delivery ratio and the latency achieved by MFs using Nomadic Lévy Walk (NLW) as the message ferry routing scheme, with the parameters α and γ.
5.1 Environment
In our simulation, we distribute 1,000 nodes at random in a 1,000 × 1,000 Euclidean plane. Each node has a fixed communication radius r. A link between two nodes exists if and only if the Euclidean distance between them is less than 2r. Thus, the parameter r is a crucial factor in forming a unit disk graph. We automatically generated unit disk graphs such that two nodes have an undirected link if their circles of radius r have an intersection.

5.1.1 Radius r of Unit Disk Graphs

We set r ∈ {35, 50} as a parameter of the environments in our simulations. r is closely related to the diameter of a unit disk graph: since nodes gain links to otherwise unconnected nodes as r increases, the diameter tends to become small. The average degrees of nodes deg(UDG) are 14.6 with r = 35 and 28.9 with r = 50, respectively. The diameters of the unit disk graphs are 24 with r = 35 and 16 with r = 50. Links in a graph represent the roads on which MFs move, which means that r restricts the freedom of movement of MFs. Therefore, the degree of freedom of movement of MFs is proportional to r.
5.2 Parameters
The scaling parameter λ and the homesick probability α are common to NLW and HLW. We set λ = 1.2 for all simulations because it is known as the parameter that realizes efficient search on unit disk graphs [13]. Each message has a time-to-live (TTL) for message expiration. TTL decreases as simulation steps elapse, and a message is removed from the MFs that store it when its TTL equals zero. The number of MFs is set as k ∈ [2, 10]. Each MF has a particular sink node (e.g., a mobile power source vehicle).
5.3 Performance Criteria
We use the following criteria for measuring the performance of MFs.

Message delivery ratio: This indicates the ratio of nodes that receive a message. Every message is broadcast to all nodes. A message ferry routing scheme is efficient for message delivery if the message delivery ratio is high with a small number of steps.

Latency: This indicates the time for delivering a message that is successfully received at the destination.
5.4 Simulation Result
We show the simulation results for the message delivery ratio and the latency of NLW-based MFs on unit disk graphs. The destination of each message is selected at random from the nodes.
5.4.1 Message Delivery Ratio

We compare the message delivery ratio of NLW-based MFs with the different sink relocation strategies stated in Sect. 4.1 and the parameter γ. The number of MFs is set at two to ten. Each message has a TTL value. Every message could eventually arrive at the destination if TTL = ∞; however, that would mean messages are never removed from the storage of mobile devices. In this simulation, we set the initial TTL value for each message according to the simulation time at which the cover ratio of MFs reaches 0.8; we set the initial TTL = 800 based on the simulation in our previous work [18]. Figures 1, 2 and 3 show the message delivery ratio on unit disk graphs with r = 50, and Figs. 4, 5 and 6 show that on unit disk graphs with r = 35. The parameter γ determines how frequently the sink location changes: the sink is relocated more frequently as γ increases. In CRPS and LWS, the message delivery ratio with γ = 0.8 is always better than the ones with γ = 0.2 and 0.5. Regarding the difference between strategies, CRPS and LWS are slightly better than RPS. Graphs with r = 50 have more links than those with r = 35, which means that the freedom of movement of MFs on the former is larger than on the latter. In general, the freedom of movement influences the cover ratio, and our previous work showed this by simulation [18]. The message delivery ratio has the same tendency as the cover ratio. The freedom of movement is proportional to r. The message delivery ratio for r = 50 with CRPS is 19.1% higher with k = 3, 17.8% higher with k = 4, and 15.3% higher with k = 5 compared to the ones for r = 35. However, the gap between them is reduced to 2.2% with k = 10. Although the freedom of movement has an impact on the message delivery ratio, this can be compensated for by adding more MFs to the network. The freedom of movement could be reduced by a cut-off road network in disaster cases; in that case, we need to adjust the number of MFs to reach the required message delivery ratio.
[Plots: Delivery Ratio (%) vs. Num. of Agents, curves for LWS, RPS, CRPS]
Fig. 1. Delivery ratio on UDG of r = 50 with γ = 0.2.
Fig. 2. Delivery ratio on UDG of r = 50 with γ = 0.5.
Fig. 3. Delivery ratio on UDG of r = 50 with γ = 0.8.
[Plots: Delivery Ratio (%) vs. Num. of Agents, curves for LWS, RPS, CRPS]
Fig. 4. Delivery ratio on UDG of r = 35 with γ = 0.2.
Fig. 5. Delivery ratio on UDG of r = 35 with γ = 0.5.
Fig. 6. Delivery ratio on UDG of r = 35 with γ = 0.8.
5.4.2 Latency

We compare the message delivery latency of NLW-based MFs with different sink relocation strategies and the parameter γ. The configuration of the number of MFs is the same as for the message delivery ratio (see Sect. 5.4.1). Figures 7, 8 and 9 show the latency of message delivery on unit disk graphs with r = 50, and Figs. 10, 11 and 12 show that on unit disk graphs with r = 35. In the previous simulation regarding the message delivery ratio, we configured TTL = 800 for each message; here we set TTL = ∞ to purely measure the average latency of message delivery. γ does not influence the latency on the graph with r = 50. On the other hand, the latency with γ = 0.8 is slightly better than with γ = 0.2 and 0.5 in CRPS and LWS on the graph of r = 35; however, the difference between them is negligible. LWS and CRPS are almost the same regarding the latency, with LWS slightly better than CRPS for k ≥ 8. The gap between LWS and CRPS becomes larger as r decreases. The latency of LWS is 15% to 28% lower than that of CRPS with k ≥ 4 on the graph of r = 35, whereas the difference is just a few percent on the graph of r = 50. CRPS is a good strategy for covering a wide area. On the other hand, LWS is much better than CRPS for improving the latency of message delivery if the freedom of movement is restricted.
[Plots: Latency vs. Num. of Agents, curves for LWS, RPS, CRPS]
Fig. 7. Latency on UDG of r = 50 with γ = 0.2.
Fig. 8. Latency on UDG of r = 50 with γ = 0.5.
Fig. 9. Latency on UDG of r = 50 with γ = 0.8.
[Plots: Latency vs. Num. of Agents, curves for LWS, RPS, CRPS]
Fig. 10. Latency on UDG of r = 35 with γ = 0.2.
Fig. 11. Latency on UDG of r = 35 with γ = 0.5.
Fig. 12. Latency on UDG of r = 35 with γ = 0.8.
6 Conclusion

In this paper, we evaluated the delivery ratio and the latency of message delivery by NLW-based MFs in delay-tolerant networks (DTNs). We then discussed the impact of the parameters and the sink relocation strategies on the delivery ratio and the latency according to the simulation results, as well as the impact of the link density of graphs on these criteria. Among the sink relocation strategies of NLW, LWS and CRPS are better than RPS on both the delivery ratio and the latency. LWS with γ = 0.8 outperforms the other strategies on the latency when the link density is reduced (see Figs. 10, 11 and 12). The freedom of movement is restricted if the link density is reduced (i.e., r becomes smaller); this means that we need to relocate the sink position frequently and use LWS as the sink relocation strategy to improve the latency. Our result would be useful for MF-based message delivery, especially in DTNs with restricted movement.
References

1. Alnuaimi, M., Shuaib, K., Abdel-Hafez, K.A.M.: Data gathering in delay tolerant wireless sensor networks using a ferry. Sensors 15(10), 25809–25830 (2015)
2. Alnuaimi, M., Shuaib, K., Abdel-Hafez, K.A.M.: Ferry-based data gathering in wireless sensor networks with path selection. Procedia Comput. Sci. 52, 286–293 (2015)
3. Basagni, S., Carosi, A., Melachrinoudis, E., Petrioli, C., Wang, Z.M.: Controlled sink mobility for prolonging wireless sensor networks lifetime. Wireless Netw. 14(6), 831–858 (2008)
4. Bin Tariq, M.M., Ammar, M., Zegura, E.: Message ferry route design for sparse ad hoc networks with mobile nodes. In: Proceedings of the 7th ACM International Symposium on Mobile Ad Hoc Networking and Computing, MobiHoc 2006, pp. 37–48. Association for Computing Machinery (2006)
5. Birand, B., Zafer, M., Zussman, G., Lee, K.W.: Dynamic graph properties of mobile networks under levy walk mobility. In: Proceedings of the 2011 IEEE Eighth International Conference on Mobile Ad-Hoc and Sensor Systems, MASS 2011, pp. 292–301. IEEE Computer Society (2011)
6. Dhal, K.G., Quraishi, M.I., Das, S.: A chaotic lévy flight approach in bat and firefly algorithm for gray level image. Int. J. Image Graph. Sign. Process. 7, 69–76 (2015)
7. Edwards, A.M., et al.: Revisiting lévy flight search patterns of wandering albatrosses, bumblebees and deer. Nature 449, 1044–1048 (2007)
8. Fujihara, A., Miwa, H.: Homesick lévy walk and optimal forwarding criterion of utility-based routing under sequential encounters. In: Proceedings of the Internet of Things and Inter-cooperative Computational Technologies for Collective Intelligence 2013, pp. 207–231 (2013)
9. Kavitha, V., Altman, E.: Analysis and design of message ferry routes in sensor networks using polling models. In: 8th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks, pp. 247–255 (2010)
10. Li, Y., Bartos, R.: A survey of protocols for intermittently connected delay-tolerant wireless sensor networks. J. Netw. Comput. Appl. 41, 411–423 (2014)
11. Rhee, I., Shin, M., Hong, S., Lee, K., Kim, S.J., Chong, S.: On the levy-walk nature of human mobility. IEEE/ACM Trans. Netw. 19(3), 630–643 (2011)
12. Shin, M., Hong, S., Rhee, I.: DTN routing strategies using optimal search patterns. In: Proceedings of the Third ACM Workshop on Challenged Networks, CHANTS 2008, pp. 27–32. ACM (2008)
13. Shinki, K., Hayashibara, N.: Resource exploration using lévy walk on unit disk graphs. In: The 32nd IEEE International Conference on Advanced Information Networking and Applications (AINA-2018) (2018)
14. Shinki, K., Nishida, M., Hayashibara, N.: Message dissemination using lévy flight on unit disk graphs. In: The 31st IEEE International Conference on Advanced Information Networking and Applications (AINA 2017) (2017)
15. Sokal, R.R., Michener, C.D.: A statistical method for evaluating systematic relationships. Univ. Kansas Sci. Bull. 38, 1409–1438 (1958)
16. Sugihara, K., Hayashibara, N.: Message dissemination using nomadic Lévy walk on unit disk graphs. In: Barolli, L., Hussain, F., Ikeda, M. (eds.) Complex, Intelligent, and Software Intensive Systems. CISIS 2019. Advances in Intelligent Systems and Computing, vol. 993. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-22354-0_13
17. Sugihara, K., Hayashibara, N.: Performance evaluation of nomadic Lévy walk on unit disk graphs using hierarchical clustering. In: Proceedings of the 34th International Conference on Advanced Information Networking and Applications (AINA-2020), pp. 512–522. Springer (2020)
18. Sugihara, K., Hayashibara, N.: Message ferry routing based on nomadic Lévy walk in wireless sensor networks. In: Barolli, L., Woungang, I., Enokido, T. (eds.) Advanced Information Networking and Applications - Proceedings of the 35th International Conference on Advanced Information Networking and Applications (AINA-2021), Toronto, ON, Canada, 12–14 May 2021, vol. 1, Lecture Notes in Networks and Systems, vol. 225, pp. 436–447. Springer (2021). https://doi.org/10.1007/978-3-030-75100-5_38
19. Thejaswini, M., Rajalakshmi, P., Desai, U.B.: Novel sampling algorithm for human mobility-based mobile phone sensing. IEEE Internet Things J. 2(3), 210–220 (2015)
20. Valler, N.C., Prakash, B.A., Tong, H., Faloutsos, M., Faloutsos, C.: Epidemic spread in mobile ad hoc networks: determining the tipping point. In: Proceedings of the 10th International IFIP TC 6 Conference on Networking - Volume Part I, NETWORKING 2011, pp. 266–280. Springer-Verlag (2011). https://doi.org/10.1007/978-3-642-20757-0_21
21. Viswanathan, G.M., Afanasyev, V., Buldyrev, S.V., Murphy, E.J., Prince, P.A., Stanley, H.E.: Lévy flight search patterns of wandering albatrosses. Nature 381, 413–415 (1996)
22. Zhao, W., Ammar, M., Zegura, E.: A message ferrying approach for data delivery in sparse mobile ad hoc networks. In: Proceedings of the 5th ACM International Symposium on Mobile Ad Hoc Networking and Computing, pp. 187–198. Association for Computing Machinery (2004)
23. Zhao, W., Ammar, M., Zegura, E.: Controlling the mobility of multiple data transport ferries in a delay-tolerant network. In: IEEE INFOCOM 2005, Miami, FL, USA (2005)
A Trust-Based Tool for Detecting Potentially Damaging Users in Social Networks

Kaley J. Rittichier, Davinder Kaur, Suleyman Uslu, and Arjan Durresi(B)

Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA
{krittich,davikaur,suslu}@iu.edu, [email protected]
Abstract. We offer a trust-based framework to detect potentially damaging users in social networks. This method captures human and community trust within the network to identify users who are likely to cause harm. Human- and community-based frameworks offer an advantage over other approaches because trust and credibility are hard to fake, as they are built over time. Furthermore, given that the metric represents human trust with evidence, it serves as an excellent label to inform users of the potential damage associated with another account. We illustrate the proposed metric with Twitter data, distinguishing potentially damaging users from both trustworthy users and those who lack trustworthiness but have a low chance of causing harm.
1 Introduction

Since its conception, social media has seen a growing number of users and has become a daily activity for hundreds of millions of people. With this increase, there is also an increased risk of spreading damaging information on these platforms. This damage can be to a government, with elections being tampered with [4], or to users' personal health. We also saw increased medical misinformation with the fear arising from the COVID-19 global pandemic [7]. Between January and March of 2020 alone, there were 225 verified cases of misinformation about COVID-19 circulating [7]. After the vaccines became publicly available, false information spread, questioning the vaccines' legitimacy and safety. In France, there was recently a reported case of social media users with a high follower count being offered large sums of money to spread false negative information about the Pfizer COVID-19 vaccine [21]. The European External Action Service (EEAS) has been releasing an update on the disinformation surrounding this every few months [1]. There is a need to detect users who could potentially spread such misinformation. Labeling a user as less trustworthy can help prevent the initial sharing of false information: users would see that the person contributing does not have any reputability on the platform. Social media platforms like Twitter offer politicians, companies, celebrities, academics, etc., a way to share their thoughts and actions with the public without traditional intermediaries, such as journalists or public hearings. Consequently, several accounts on the platform do not interact with the community but simply use their account to follow other users. For this reason, our approach distinguishes general low-trust users from damaging users. We propose a tool that can be used to help determine
user trust and to distinguish how harmful a given lack of trust is. In this paper, we focus on the detection of potentially damaging users and note how these methods may be helpful for the detection of fake followers. This paper is organized as follows: Sect. 2 presents the background and related work in the field of misinformation and fake user detection, as well as mechanisms for measuring trust. Section 3 describes our framework for calculating trust. Section 4 discusses our implementation. Section 5 presents results using our framework on real Twitter accounts, and Sect. 6 concludes the paper.
2 Background and Previous Work

In this section, we discuss the background of this topic and the related work done to detect fake users and damaging information.

2.1 Misinformation Detection

Misinformation on social media can come in text, images, videos, and audio [22]. These forms of information can be original to the user, copied from another website, re-shared from another user, or shared via a link. These different forms lend themselves to different methods of detection. Some methods evaluate the information itself, which is especially beneficial when the information is original to the user. There have been some successes with machine learning approaches [23, 25, 31] that reach accuracies of roughly 75–85% by using types of terms or styles of writing to determine whether an article is fake. There have also been information retrieval techniques used to compare claims in the text to claims stated on reputable websites [24, 34]. These methods benefit from the fact that they can analyze more nuanced types of fake news, considering the trust of the claims, but they are limited by the extent of the data they have. For instance, with textual information, the information's phrasing can be limited to near-direct quotes. Given the drawbacks of these two approaches, most of the detection on prominent platforms like Facebook¹ relies on credibility and user/expert labeling. The first approach leverages the credibility of the website; this method only works if a link is shared and the website is well-known enough to be officially labeled as a reputable or disreputable source. The second approach relies on user identification and labeling: the platform allows users to flag a post as false and then has third-party fact-checkers investigate the case to reach a verdict. The problem with this second type is scalability. Given how much activity happens on these social media platforms and how much misinformation is spread, much damage can occur before action is taken. Therefore, there should be attempts to detect the kind of users who post such misinformation.
¹ https://www.facebook.com/formedia/blog/working-to-stop-misinformation-and-false-news.
2.2 Fake User Detection
The primary goal of fake users is to misrepresent [26]. Various forms of fake users are created for different ends. Some, such as fake followers or review accounts, focus simply on presence, making something appear more well-loved than it is. These types of fake users can call into question the integrity of a platform, whether it be social media or e-commerce; companies may want to remove fake followers because their presence changes the aim of the website. Another type of fake user is the damaging user, who spreads false information or viruses [13]. Fake user detection can be an essential part of misinformation detection, because it is in the best interest of fake news producers to have bots and other kinds of fake users that can share misinformation [19, 30]. There is a specific risk of health misinformation: several studies have shown a distinct danger of spreading false medical information and affecting people's actions [5]. Social bots in particular have been demonstrated to push forward misinformation [13, 30]; in 2017, one study [6] found that social bots were widely used to promote electronic cigarettes. Fake users can be particularly damaging because of their anonymity, allowing them to evade punishment from the law. Much research has been done on the topic of fake user detection. Two prominent approaches are graph property analysis and machine learning. One main graph-based approach is random walks [8]. Random walks, or similar approaches, are sometimes paired with other methods focused on detecting fake users at an earlier stage by looking at behavior such as the speed of sending friend requests or posting and how other users respond [3, 9]. In addition, many machine learning methods use various page characteristics to classify the accounts [2, 12, 20, 33]. These methods have their advantages, but a problem with many of them is that they do not consider how adaptive malicious account-creators can be. This point was illustrated in a 2018 New York Times article [11] showing how the methods continue to improve to avoid detection. For example, there have been cases of social media identity theft where malicious account-creators use the pictures and names of real people to avoid detection by the characteristics mentioned before. This is important because the information they are using is legitimate, meaning a fake account can have the exact characteristics of a real person. Additionally, some of these characteristics might be slightly tweaked to create more accounts while still avoiding detection.
2.3 Trust Mechanisms
Trust is an integral part of knowledge acquisition. As a result, there has been much research on modeling trust and using these models for crime detection [18], the Internet of Things [27], healthcare [10], and more. Trust is a vital component in social networks because users share other users' posts with friends or followers, expressing their approval or disapproval of their contents. For this reason, trust-based mechanisms have been applied to such networks. [29] presents our framework to measure trust in a social network. A trust-based framework was used on Twitter data to predict the stock market [28]. In [16], a trust-based metric was coupled with clustering techniques to detect fake users on simulated data.
3 Trust Network for Twitter

This section outlines, for completeness, our measurement theory-based trust mechanisms [29] and how they are adapted to process social networks, specifically Twitter data.

3.1 Trust Components

Impression (m): Impression is defined as the level of trust one person has in another. The trustworthiness of one person can be calculated by another based on interactions or experiences. These interactions and experiences do not have to be first-hand but can also be indirect. In social networks particularly, direct interactions can be measured between the truster and trustee. The level of trustworthiness resulting from N measurements taken between the truster and trustee is given in Eq. 1:

m = (Σ_{i=1}^{N} m_i) / N    (1)

Confidence (c): Confidence is defined as how certain a person is about his or her impression (m) of another person. This metric deals with the errors that can arise when calculating trust. In measurement theory, confidence is related to the variance, or error, of the measurements. If a person is highly confident about their impression, then there should be little variance between their different trust assessments (the m_i's). Equation 2 uses the error to calculate confidence:

c = 1 − 2e, where e² = Σ_{i=1}^{N} (m_i − m)² / (N(N − 1))    (2)
3.2 Trust Modeling for Twitter

The measurements we collect are from users on Twitter. We use interactive tweets, which contain mentions, replies, and retweets from one user to another. These interactions contain mentions, which are indicated with the @ symbol. They can be used in our measurement framework as they function as direct experiences between two individuals. Using these interactive tweets, we can calculate the impression (m) with Eq. 1. We divide these tweets up into time windows based on the date they were posted. A new time window is created every month so we can treat the months differently. In each window, we calculate the mean impression of a given day m_d and the confidence c_d for that day. After calculating this for each day, we combine the results for each month with Eq. 3. The weight is used to assign a higher value to the impressions that have higher confidence. This weight is given in Eq. 4, and the corresponding error of the weighted mean in Eq. 5:

m_month = (Σ_{i=1}^{31} w_i m_i) / (Σ_{i=1}^{31} w_i)    (3)
w_i = 1 / e_i²    (4)

e²_month = 1 / Σ_{i=1}^{31} w_i    (5)
We also have a metric that can give higher weight to older interactions, as time is an important feature of trust. This metric is based on the idea that a positive impression of someone you have known or interacted with for a while is more valuable than a positive interaction with someone you just met. We give this weight by having a newness factor σ, where σ < 1, to capture the effect this has on the truster's confidence. In our system, we take this into account with Eq. 6, where t is the number of time windows (or months) since the collection of the tweets started. With this formula, the confidence in January is rated slightly higher than the confidence in the following July; where January is discounted by σ, February is discounted by σ².
c_t = c_t × σ^t    (6)
In this work we use σ = 0.99 as the newness factor. However, further refinements based on theories of trust over time might be offered; these are briefly discussed in Sect. 5.
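A compact sketch of Eqs. 3–6 in Python follows, treating each day as an (m_d, e_d) pair; the function names are ours.

    def monthly_impression(daily):
        """Combine daily (m_d, e_d) pairs with weights w = 1/e^2
        (Eqs. 3-5); returns the weighted mean and its squared error."""
        ws = [1.0 / (e * e) for _, e in daily]
        m_month = sum(w * m for w, (m, _) in zip(ws, daily)) / sum(ws)
        e2_month = 1.0 / sum(ws)
        return m_month, e2_month

    def discounted_confidence(c_t, t, sigma=0.99):
        """Newness discount (Eq. 6): window t is scaled by sigma^t."""
        return c_t * sigma ** t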
3.3 Trust Inference
Multiple aggregation and transitivity trust operators can be used. [28] demonstrates some of the possible options and their assumptions about the relationships in the networks. We use two of the transitivity and aggregation operators presented and described in [29]. The choice of these two operators was based on previous experiments [28], which demonstrated them to be the most advantageous for measuring trust in Twitter data.

Trust Transitivity: This method is used to calculate indirect trust. If user A does not have any direct connection with user Z, but does have a direct connection with user B, who has a direct connection with user Z, then the trust from A to Z is calculated using the trust from A to B and from B to Z. Equation 7 shows how the impression is calculated for transitive trust, while Eq. 8 shows how the transitive error is calculated. Given the nature of social media user-to-user trust and the transitivity operator we use, we only calculate trust for 2 hops. This transitivity operator is beneficial as it assumes low trust is more indicative of transitive trust than high trust is.

m_AB ⊗ m_BZ = m_min = min(m_AB, m_BZ)    (7)

e_AB ⊗ e_BZ = e_i where m_i = m_min    (8)
Trust Aggregation: This method is used to determine the trust value between two nodes if there is more than one path between them. For example, if there are two paths
to reach Z from A, such as A-B-Z and A-C-Z, then the aggregated trust will be the combination of both trust values. Equations 9 and 10 show how the impression and error are calculated for trust aggregation, respectively:

m_Z^{A:B} ⊕ m_Z^{A:C} = (w_1 · m_Z^{A:B} + w_2 · m_Z^{A:C}) / Σ w_i    (9)

e_Z^{A:B} ⊕ e_Z^{A:C} = (1 / (Σ w_i)²) · (w_1² · (e_Z^{A:B})² + w_2² · (e_Z^{A:C})²)    (10)
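Both operators reduce to a few lines of Python. The sketch below treats trust values as (m, e) pairs and is an illustration of Eqs. 7–10 as reconstructed above, not the authors' code.

    def transitive(m_ab, e_ab, m_bz, e_bz):
        """Transitivity (Eqs. 7-8): the path takes the minimum impression,
        together with the error attached to that impression."""
        return (m_ab, e_ab) if m_ab <= m_bz else (m_bz, e_bz)

    def aggregate(paths):
        """Aggregation (Eqs. 9-10) over parallel paths [(m, e), ...] with
        weights w = 1/e^2 from Eq. 4."""
        ws = [1.0 / (e * e) for _, e in paths]
        total = sum(ws)
        m = sum(w * mp for w, (mp, _) in zip(ws, paths)) / total
        e = sum(w * w * ep * ep for w, (_, ep) in zip(ws, paths)) / total ** 2
        return m, e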
This operator for aggregation is beneficial as it offers a straightforward way to incorporate the weight from Eq. 4 into calculating the impression. This is important because not all paths are created equal. Using this framework, we can calculate the trust of every user.

3.4 User Power

The confidence and impression are then used to arrive at the user's power, shown in Eq. 11. Here, IN_u is the set of users who have impressions of user u. Only the impressions that are neutral or higher are counted toward the user power.

P_u = Σ_{u_i ∈ IN_u, m_{u_i} ≥ 0.5} m_{u_i} · c_{u_i}    (11)
The user power serves as the labeling described in Sect. 1. The benefit of the user power metric is its adaptability to various platforms and metrics. For example, one may set multiple conditions for displaying the user power, such as only showing it for accounts that are not followed by or friends of the user in question. Additionally, the platform could show either the original sharer's power or the accumulated powers of the users who shared the post. This power can be combined with other tools, such as clustering, for platforms to evaluate a user's likelihood of causing damage.
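Equation 11 is then a one-line filter-and-sum; here, incoming holds the (m, c) pairs of the users with impressions of u.

    def user_power(incoming):
        """User power (Eq. 11): sum of m * c over impressions that are
        neutral or higher (m >= 0.5)."""
        return sum(m * c for m, c in incoming if m >= 0.5)

    # Example: the second pair is filtered out because m < 0.5.
    assert user_power([(0.9, 0.8), (0.4, 0.9)]) == 0.72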
4 Implementation

In this section, we discuss the implementation, describing the dataset used and how it was pre-processed.

4.1 The Data

The Twitter data we use in this project is from [12]. E13 and TFP are the two real-user datasets that contain user information such as relationships and related tweets. These datasets were gathered separately through different means. E13 was collected by the University of Perugia and the Sapienza University of Rome as part of a sociological study about the 2013–2015 Italian election cycle [12]. The researchers collected the data following the use of the hashtag #elezioni2013. They removed any users who were officially involved in the political discourse, such as politicians, journalists, political
bloggers, etc. The remaining accounts were manually checked by two sociologists at the University of Perugia. All the accounts that were not agreed upon by both researchers were discarded; the number of accounts remaining after this process is 1481. TFP was collected by creating a Twitter account and explicitly asking for real users with "Follow me only if you are NOT a fake". The researchers then contacted other researchers, journalists, and bloggers, who shared the account on various platforms. After the accounts were collected, they were verified using a CAPTCHA; 574 accounts successfully completed it. The other portion of the data is made up of the fake-follower datasets: INT, FSF, and TWT. All three datasets were bought by the researchers from intertwitter.com, fastfollowerz.com, and twittertechnology.com, respectively. The final numbers of users collected were 1337 from INT, 1000 from FSF, and 845 from TWT. Given that our aim is to detect potentially damaging users, we primarily analyze how real users demonstrate trust amongst themselves. However, in Sect. 5 we also demonstrate the metric on the fake followers.
4.2 Processing
Our method uses sentiment as the impression (m). Under this view, users trust those to whom they say positive things more than those to whom they say negative things. Sentiment is, therefore, used to measure a user's impression of another user's trustworthiness. To calculate the sentiment, we used SentiStrength, a prominent sentiment analysis tool for analyzing social media text in academic research [32]. SentiStrength was demonstrated to give an absolute mean error of 0.224 when evaluating Yelp reviews [28]. The tweets are processed by removing the URLs and mentions of other users. All the tweets that contain mentions, whether retweets or tags, make up our interactive tweets dataset. We then detect the language of each of those tweets using Facebook AI Research (FAIR) lab's fastText language detection package [14, 15]; fastText is an open-source package for word embedding and classification. SentiStrength can process 16 languages: English, Spanish, Italian, French, German, Polish, Turkish, Finnish, Dutch, Welsh, Swedish, Russian, Greek, Persian, Portuguese, and Arabic. These languages cover around 97% of the interactive tweets, and this is the portion of the dataset that we use to measure the trust of the users. Because our concern is with damaging users and misinformation, we narrow down to only these languages, as they are the most likely to do harm to the population from which they were collected. In other words, we only analyze the tweets that are likely to be read by those living in Italy. For instance, Korean is one of the most highly represented of the languages we do not process; still, Korean tweets do not pose as much of a threat because the targeted users are not likely to be able to read them and thereby receive misinformation.
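A sketch of this preprocessing step is shown below. The fastText model file name and the exact language-code set are assumptions, and the sentiment scoring itself (done with SentiStrength) is left out.

    import re
    import fasttext  # FAIR's language-identification model

    # ISO codes for the 16 languages SentiStrength can process (assumed codes)
    SUPPORTED = {"en", "es", "it", "fr", "de", "pl", "tr", "fi", "nl", "cy",
                 "sv", "ru", "el", "fa", "pt", "ar"}

    lang_model = fasttext.load_model("lid.176.bin")  # model path is an assumption

    def clean_tweet(text):
        """Strip URLs and @mentions before sentiment analysis."""
        text = re.sub(r"https?://\S+", "", text)
        return re.sub(r"@\w+", "", text).strip()

    def keep_for_trust(tweet):
        """Keep interactive tweets whose language SentiStrength can score."""
        labels, _ = lang_model.predict(clean_tweet(tweet).replace("\n", " "))
        return labels[0].replace("__label__", "") in SUPPORTED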
Fig. 1. The distribution of real users’ number of tweets based on whether or not they receive interaction.
5 Results

As stated earlier, some real accounts are going to have low power because of how little they interact with the community or how little the community interacts with them. We consider these users one category of low-trust users and call them Spectators. Spectators may simply observe, or post some themselves, but they receive little interaction from other users. Figure 1 shows the distribution of tweets of those who receive interaction vs. those who do not; the users who receive interaction are much more likely to post. The existence of such Spectators does not seem to pose a threat to the community because, in addition to spreading less information, they are not spreading information that has an effect on other users. For information to be potentially damaging, it should have some effect on the retweets and replies from other users. We therefore focus our attention on users who have received interactions. In Fig. 2, we demonstrate the tool being used with a threshold of 190 interactions received by the user and a user power of 35. Spectators, represented in blue, make up the majority of the users in this graph, with most users appearing to have little effect on the community. The second most common type are trusted users, who have received many positive interactions from others. These come in two categories: the green users, called Widely-trusted users, have received high trust over many tweets, while the yellow users have received high trust over fewer tweets, making them Well-trusted users. The red users are our target users, the Potentially Damaging users. With the aforementioned thresholds, the system detects 9 Potentially Damaging users, 1885 Spectators, 39 Well-trusted users, and 17 Widely-trusted users within the dataset. These thresholds may be changed, and multiple thresholds for both user power and amount of interaction might be used for different classifications. The user power can inform users of other users' community trust without Twitter having to delete accounts from the platform because of a low score. Additionally, this helps the system meet the requirements of Trustworthy AI [17] by allowing the user to be informed about their interactions and the platform's method of evaluation.
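One plausible reading of how the two thresholds combine is sketched below; the cut-offs (190 received interactions, power 35) are from the text, but the exact decision rule is an assumption of ours.

    def classify(received, power, i_thr=190, p_thr=35):
        """Four-way account labeling from received interactions and user
        power (the combination rule is an assumption)."""
        if power >= p_thr:
            return "Widely-trusted" if received >= i_thr else "Well-trusted"
        return "Potentially Damaging" if received >= i_thr else "Spectator"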
Fig. 2. The account classification of the users who are the target of the interacted tweets when σ = 0.99.
5.1 Fake Followers
Although the aim of this study has been to demonstrate the behavior of real users, our system was also helpful in identifying the fake users from the datasets discussed in Sect. 4: INT, FSF, and TWT. Of the 3,351 fake users, 4.45% were pure Spectator accounts that neither posted nor received any interaction from other users. 99.88% of the accounts fall into the zero-received-interaction category of low trust. The remaining 4 accounts are all from the INT dataset. When setting the user power threshold to the maximum of these accounts, all of them were correctly labeled by our system as having low trust. Additionally, 84.5% of the real users who received interaction from another user had a user power above this threshold when the same σ value was used. Future work will extend these results to different datasets with smarter fake users and demonstrate how this tool may be paired with other tools.
6 Conclusions

In this paper, we demonstrate that a trust-based framework can be beneficial for detecting potentially damaging users. This system can identify users based on network-level trust and characterize them based on the potential damage they could do. The trust-based framework has benefits over some other systems: it does not focus on characteristics that a malicious account-creator can easily fake, but instead on trust built up over time.

Acknowledgments. This work was partially supported by the National Science Foundation under No. 1547411 and by the U.S. Department of Agriculture (USDA), National Institute of Food and Agriculture (NIFA) (Award Number 2017-67003-26057) via an interagency partnership between USDA-NIFA and the National Science Foundation (NSF) on the research program Innovations at the Nexus of Food, Energy and Water Systems.
References

1. EEAS special report update: Short assessment of narratives and disinformation around the COVID-19 pandemic (update December 2020–April 2021) (2021). https://euvsdisinfo.eu/eeas-special-report-update-short-assessment-of-narratives-and-disinformation-around-the-covid-19-pandemic-update-december-2020-april-2021/
2. Agarwal, N., Jabin, S., Hussain, S.Z., et al.: Analyzing real and fake users in Facebook network based on emotions. In: 2019 11th International Conference on Communication Systems & Networks (COMSNETS), pp. 110–117. IEEE (2019)
3. Al-Qurishi, M., Al-Rakhami, M., Alamri, A., Alrubaian, M., Rahman, S.M.M., Hossain, M.S.: Sybil defense techniques in online social networks: a survey. IEEE Access 5, 1200–1219 (2017)
4. Allcott, H., Gentzkow, M.: Social media and fake news in the 2016 election. J. Econ. Perspect. 31(2), 211–36 (2017)
5. Allem, J.P., Ferrara, E.: Could social bots pose a threat to public health? Am. J. Pub. Health 108(8), 1005 (2018)
6. Allem, J.P., Ferrara, E., Uppu, S.P., Cruz, T.B., Unger, J.B.: E-cigarette surveillance with social media data: social bots, emerging topics, and trends. JMIR Pub. Health Surveill. 3(4), e98 (2017)
7. Brennen, J.S., Simon, F., Howard, P.N., Nielsen, R.K.: Types, sources, and claims of COVID-19 misinformation. Reuters Inst. 7(3), 1 (2020)
8. Breuer, A., Eilat, R., Weinsberg, U.: Friend or faux: graph-based early detection of fake accounts on social networks. Proc. Web Conf. 2020, 1287–1297 (2020)
9. Cao, Q., Sirivianos, M., Yang, X., Pregueiro, T.: Aiding the detection of fake accounts in large scale social online services. In: 9th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2012, pp. 197–210 (2012)
10. Chomphoosang, P., Durresi, A., Durresi, M., Barolli, L.: Trust management of social networks in health care. In: 2012 15th International Conference on Network-Based Information Systems, pp. 392–396. IEEE (2012)
11. Confessore, N., Dance, G.J.X., Harris, R., Hansen, M.: The follower factory (2018). https://www.nytimes.com/interactive/2018/01/27/technology/social-media-bots.html
12. Cresci, S., Di Pietro, R., Petrocchi, M., Spognardi, A., Tesconi, M.: Fame for sale: efficient detection of fake Twitter followers. Decis. Support Syst. 80, 56–71 (2015)
13. Ferrara, E., Varol, O., Davis, C., Menczer, F., Flammini, A.: The rise of social bots. Commun. ACM 59(7), 96–104 (2016)
14. Joulin, A., Grave, E., Bojanowski, P., Douze, M., Jégou, H., Mikolov, T.: Fasttext.zip: compressing text classification models. arXiv preprint arXiv:1612.03651 (2016)
15. Joulin, A., Grave, E., Bojanowski, P., Mikolov, T.: Bag of tricks for efficient text classification. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pp. 427–431. Association for Computational Linguistics (2017)
16. Kaur, D., Uslu, S., Durresi, A.: Trust-based security mechanism for detecting clusters of fake users in social networks. In: Barolli, L., Takizawa, M., Xhafa, F., Enokido, T. (eds.) WAINA 2019. AISC, vol. 927, pp. 641–650. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-15035-8_62
17. Kaur, D., Uslu, S., Durresi, A.: Requirements for trustworthy artificial intelligence – a review. In: Barolli, L., Li, K.F., Enokido, T., Takizawa, M. (eds.) NBiS 2020. AISC, vol. 1264, pp. 105–115. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-57811-4_11
18. Kaur, D., Uslu, S., Durresi, A., Mohler, G., Carter, J.G.: Trust-based human-machine collaboration mechanism for predicting crimes. In: Barolli, L., Amato, F., Moscato, F., Enokido, T., Takizawa, M. (eds.) AINA 2020. AISC, vol. 1151, pp. 603–616. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-44041-1_54
19. Lazer, D.M., et al.: The science of fake news. Science 359(6380), 1094–1096 (2018)
20. McAuley, J.J., Leskovec, J.: Learning to discover social circles in ego networks. In: NIPS, vol. 2012, pp. 548–56. Citeseer (2012)
21. BBC News: France puzzled by mystery anti-Pfizer campaign offer (2021). https://www.bbc.com/news/world-europe-57250285
22. Parikh, S.B., Atrey, P.K.: Media-rich fake news detection: a survey. In: 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), pp. 436–441. IEEE (2018)
23. Pérez-Rosas, V., Kleinberg, B., Lefevre, A., Mihalcea, R.: Automatic detection of fake news. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 3391–3401 (2018)
24. Potthast, M., Kiesel, J., Reinartz, K., Bevendorff, J., Stein, B.: A stylometric inquiry into hyperpartisan and fake news. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 231–240 (2018)
25. Reis, J.C., Correia, A., Murai, F., Veloso, A., Benevenuto, F.: Supervised learning for fake news detection. IEEE Intell. Syst. 34(2), 76–81 (2019)
26. Ruan, Y., Durresi, A.: A survey of trust management systems for online social communities – trust modeling, trust inference and attacks. Knowl. Based Syst. 106, 150–163 (2016)
27. Ruan, Y., Durresi, A., Alfantoukh, L.: Trust management framework for Internet of Things. In: 2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA), pp. 1013–1019. IEEE (2016)
28. Ruan, Y., Durresi, A., Alfantoukh, L.: Using Twitter trust network for stock market analysis. Knowl. Based Syst. 145, 207–218 (2018)
29. Ruan, Y., Zhang, P., Alfantoukh, L., Durresi, A.: Measurement theory-based trust management framework for online social communities. ACM Trans. Internet Technol. (TOIT) 17(2), 1–24 (2017)
30. Shao, C., Ciampaglia, G.L., Varol, O., Yang, K.C., Flammini, A., Menczer, F.: The spread of low-credibility content by social bots. Nat. Commun. 9(1), 1–9 (2018)
31. Shu, K., Sliva, A., Wang, S., Tang, J., Liu, H.: Fake news detection on social media: a data mining perspective. ACM SIGKDD Expl. Newsl. 19(1), 22–36 (2017)
32. Thelwall, M.: The heart and soul of the web? Sentiment strength detection in the social web with SentiStrength. In: Hołyst, J.A. (ed.) Cyberemotions. UCS, pp. 119–134. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-43639-5_7
33. Xiao, C., Freeman, D.M., Hwa, T.: Detecting clusters of fake accounts in online social networks. In: Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, pp. 91–101 (2015)
34. Zhou, X., Zafarani, R.: A survey of fake news: fundamental theories, detection methods, and opportunities. ACM Comput. Surv. (CSUR) 53(5), 1–40 (2020)
Secure Cloud Storage Using Color Code in DNA Computing

Saravanan Manikandan¹, Islam M. D. Saikhul¹, Hsing-Chung Chen¹, and Yu-Lin Song¹,²(B)

¹ Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan
[email protected], [email protected]
² Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
Abstract. Nowadays, cloud computing has become the trend for storage, computational power, and other advanced services. Already 90% of companies are on the cloud, and more and more corporate and public data are being stored there. Nearly two-thirds of organizations see security as the biggest challenge for cloud adoption, so much research is now underway to reduce the security risk of storing confidential data in the cloud. In this research, we introduce a novel encryption method to secure data in the cloud using color-code encryption and the well-known DNA computing method. We use a 1024-bit encryption key to store the data in the cloud. This key is designed using 256 RGB color codes, the user's attributes (such as the MAC address and personal data), decimal encoding, ASCII values, and DNA bases with complementary rules. A security analysis of the algorithm is discussed in this research.
1 Introduction

IT technologies are getting more advanced each day, progressing from basic mainframe computers to supercomputers, and theoretically up to quantum computers and spintronics-based computers. Recently, cloud computing has acted as a game-changer in the IT industry, changing the overall method of computing and storing data [1, 2]. Previously, companies had to spend millions of dollars to design on-premise servers for their needs. Now cloud service providers like Amazon Web Services, Microsoft Azure, and Google Cloud Platform provide many services, including compute power, containers, storage, databases, management and governance, and more [3]. Cloud service providers offer many cost management policies, different pricing for different services, and adjustable price ranges. Compared to the expense of on-premise servers, cloud services have a cheap pricing policy, and services can be scaled up and down whenever needed, so highly configured on-premise computers are not required to run complex tasks. Users can buy the needed services and storage from the cloud service provider on a pay-as-you-go basis. The cloud revolution has made computational systems rich and less expensive. Already 90% of companies have moved to the cloud, and more and more corporate and public data are being stored there.
Experts say 60% of workloads were running on a hosted cloud service in 2020. So, what makes the remaining 40% of companies hesitate about cloud adoption? Although cloud service providers offer many services with low pricing policies, corporations still regard security as the main concern. There are several security risk factors in cloud computing [4]: consumers have reduced visibility and control, on-demand self-service simplifies unauthorized use, Internet-accessible management APIs can be compromised, data deletion is incomplete, credentials are insecure, insiders abuse authorized access, and so on. Because of those risks, the remaining 40% of companies are still hesitant to move to cloud computing. Numerous research efforts around the world aim to remove these security risk factors in the cloud environment; the proposed system helps to remove the risk of insiders abusing authorized access. Cryptography is one of the main solutions for confidential data security. It is mainly used for the secure transmission of data over a public network with the help of an encryption key. If the same key is used for encrypting and decrypting the data at the sender and receiver ends, it is called symmetric key encryption [5]. If different keys are used at the sender and receiver ends, it is called asymmetric key encryption. Here, the sender and receiver are each provided with a key pair consisting of a private key (Prk) and a public key (Puk). The private key is kept secret and the public key is shared publicly. The sender needs to encrypt the data with the receiver's public key before sending it over the public network, so that the receiver can decrypt the data with his private key; only the receiver's private key has the ability to decrypt data locked with its public key. Many studies have introduced different schemes to improve the performance of the cloud environment [6–10].
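As a concrete illustration of this asymmetric flow (independent of the CCDNAS scheme proposed later), the snippet below uses RSA with OAEP padding from Python's cryptography library:

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    # The receiver generates a key pair; only the public key is shared.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # The sender encrypts with the receiver's public key ...
    ciphertext = public_key.encrypt(b"confidential cloud record", oaep)
    # ... and only the receiver's private key can decrypt it.
    assert private_key.decrypt(ciphertext, oaep) == b"confidential cloud record"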
2 Related Works

Crues [11] has proposed a system in which the administrator of the cloud service provider has complete control over the data and decides whether access is granted; the user has no role and has to accept whatever the admin decides. Hota et al. [12] have introduced a capability-based access control model. In this method, the data owner encrypts their data with a symmetric key generated by the data owner, and the key is shared with the user so the user can decrypt the data. The problem with this approach is that, because a symmetric key is used, one user can access another user's data, and the model takes more time to transfer data to the user. Zhu et al. [13] proposed a method called temporal access control for cloud computing. This method provides security at the time the user accesses the data, with data encryption handled on a proxy; they introduced the current time into this model with respect to proxy-based re-encryption. The problem with this scheme is that it does not allow the user to access the data or service at any time they need. DNA computing is one of the advanced, prominent technologies used in multiple sectors such as storage, hardware, data security, and more. Adleman was the first person to introduce DNA computing [14]. In the beginning stage of the invention, this technology was used for solving NP-complete problems. After that, many technologies were gradually invented in the area of DNA schemes. Nowadays DNA
computing plays a main role in cryptography and data security. In DNA cryptography, DNA is used as a computational carrier. Because of the complex structure of DNA, it has become a prominent method in data security. In this method, instead of using zeros and ones, encrypted data are stored in DNA bases. DNA has four bases called Adenine, Cytosine, Guanine, and Thymine, represented as A, C, G, and T, respectively. The sender has a free choice in selecting the combination of DNA bases at the time of data encryption, which increases data security. Many researchers have proposed schemes to improve data security in the cloud computing environment using DNA computing. Risca et al. [15] proposed a system for hiding messages in microdots using a concept called steganography; here the microdots are defined by the process of concealing the data, but this system is comparatively slow. Lu MingXin et al. [16] introduced a DNA symmetric key cryptosystem using symmetric key encryption; they used DNA computing to generate the symmetric key. Because of the symmetric key, the possibility of cracking the key is high. Xuejia Lai et al. [17] proposed an asymmetric encryption and signature method with DNA technology using a key called DNA-PKC. It uses two pairs of keys for encryption and signature. The key generation process has many layers, so it takes more time to generate a key. Bin Wang et al. [18] introduced reversible data hiding based on DNA computing, using a histogram modification algorithm for reversible data hiding. The embedding rate of the scheme is improved compared to previous methods, but the speed of key generation is comparatively slow. All the above methods have security issues, and they take much time for key generation and retrieval. To solve these problems, we have introduced a double-layered key generation scheme using DNA bases and color codes. Like DNA computing, color-coded cryptography also has good features for data encryption. Adithya [19] proposed a system called color coded cryptography, which encrypts data using the RGB 256-color mode. In the RGB 256-color scheme, each of R, G and B is represented by 8 bits, so almost 16 million colors are possible. We mapped the 256 colors to 8-bit binary numbers, so that we obtain the hex values of the colors; the hex values are converted to binary and the resultant values are used in the encryption process. In this research, DNA cryptography and color-coded cryptography are used to increase the level of hardness of the secret key generation for cloud data sharing. The final key of the scheme is 1024 bits long. The rest of the paper presents the background of the scheme, a discussion of the proposed system, and the conclusion and future work.
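To make the DNA encoding concrete, the following minimal Python sketch (our illustration, not code from the paper) maps 2-bit binary pairs to the four DNA bases and back:

```python
# One possible 2-bit-to-base assignment; the scheme lets the sender pick any
# of the 24 permutations of (A, C, G, T), which is part of the key's strength.
BIT_TO_BASE = {"00": "A", "01": "G", "10": "T", "11": "C"}
BASE_TO_BIT = {b: s for s, b in BIT_TO_BASE.items()}

def to_dna(bits: str) -> str:
    """Encode an even-length bit string as DNA bases."""
    return "".join(BIT_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def from_dna(seq: str) -> str:
    """Decode a DNA base string back to bits."""
    return "".join(BASE_TO_BIT[base] for base in seq)

assert from_dna(to_dna("0011010010")) == "0011010010"
```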
3 Discussion

3.1 System Model

The system model of the proposed system has three entities (Fig. 1 shows the system model of the scheme).

Cloud End. The Cloud-End is a normal cloud service provider in the real world, which has overall control of the cloud architecture. The Cloud-End has storage and servers, so it can provide data storage space and computational power for different services.
User End. The User-End comprises the normal users who want to access data and use the services from the Cloud-End. Before accessing the data from the Cloud-End, they get authorized by the concerned data owner and the Cloud-End.

Data Owner End. The Data-Owner-End comprises the producers of the data who want to store their confidential or normal data on the cloud end. The data owner personally exchanges keys with the user so that the user can access the data from the cloud end. The data owner is the creator of the CCDNAS key.
Fig. 1. System model for CCDNAS.
3.2 Proposed Scheme

The scheme is proposed to give strong security to confidential data in the cloud; for that purpose we have created a 1024-bit CCDNAS key. The CCDNAS-based key is generated with the help of DNA computing and color codes. The creation of the key involves several steps. The user at the user end sends a request for the data to the cloud service provider at the cloud end. If the user is an authorized user of the cloud, the cloud sends back the Pu k of the concerned data owner. With the Pu k of the data owner, the user can send a request for the CCDNAS key to decrypt the data from the cloud end; the user shares his personal details and a secret color code with the data-owner end. After getting the request, the data owner checks the authenticity of the user with the cloud and reads the user details. The data owner then generates a CCDNAS key and shares the key and certificate with the user over an SSL connection to access the data from the cloud. There are five stages in this proposed scheme: (1) System config, (2) User enrolling, (3) Generation of CCDNAS key, (4) Cloud end storage, (5) Cloud end access (Fig. 2 shows the complete workflow of the system).

System Config. The cloud service provider decides its own public and private keys from (Z/nZ)*. A big prime number p is selected in this stage by the service provider to identify the multiplicative group (Z/nZ)*. The Pu k and Pr k for the data-owner end
and user end have to be selected by the cloud service provider from the multiplicative group (Z/nZ)*. These keys are shared with the user and data owner at the time of their registration. Only users authorized by the cloud have the public and private keys. Without knowing the public key of an entity, no one can send data to that specific entity.
Fig. 2. Workflow diagram of CCDNAS.
User Enrolling. Users must register themselves at the cloud end to access the data. Only authorized users can access data from the cloud end. When the Cloud-End receives the request from the user, it collects the user's personal data such as name, location, date of birth, first pet name, and so on, and then creates the user account. The public and private keys for the user are generated in this stage and shared with the user to access data. The secret keys are shared over an SSL connection to avoid the risk of an intruder.

Generation of CCDNAS Key. This is the most crucial stage in this system. Once the data-owner end receives the request from the user, it checks with the cloud end to verify the user's authenticity. If the user is authorized, the data-owner end collects the user's personal information and color code (C_code) from the user; the color code is a randomly generated color chosen by the user, who has 16 million combinations from which to select the C_code. This increases the strength of the CCDNAS key. Once the data are collected from the user, the data-owner end starts to generate the CCDNAS key using DNA computing and 256-color-code mapping. The process is explained in the steps below; Table 2 shows the algorithm of the key generation.

Step 1: After getting the public key of the data owner from the Cloud-End, the user generates the request for the secret key and sends the user credentials with the C-code to the data owner.
Step 2: The data owner verifies the user's authorization with the cloud end and continues with the further steps only if the user is authorized by the cloud.
Step 3: The data owner fetches all the credentials of the user in the following order: User-id, address, C-code, date of birth, and MAC address. For instance,
AsiaUniv xxxyyy 2F32C7 17071992 00:1B:44:11:3A:B5
Step 4: The data owner converts the values to decimal numbers using decimal encoding rules. For instance,
65 115 105 97 85 110 . . . . . . . . . . 58 51 65 58 66 53
Step 5: Each decimal number is converted into 8-bit binary using standard binary encoding rules.
00000110.00000101 . . . . . . . . . 00000101.00000011
Step 6: The resultant binary sequence is then divided into two parts and an EXOR operation is performed on those two parts. For instance, if the first half of the binary sequence is 0000011000000101 and the second half is 0000010100000011, then after the EXOR operation on these binary sequences the output is
0000001100000110
Table 1. Some samples of the 256-color sheet
Step 7: For each 8 bits, the data owner maps a color in the 256-color sheet (Table 1 shows some samples of the 256-color sheet), and then the hex value of the color is converted into a binary sequence. This table is also shared with the user. For example,
00000011 → Gray → #808080
00000110 → Teal → #008080
Step 8: Convert the hex values to binary sequences. For example,
100000001000000010000000
000000001000000010000000
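A minimal Python sketch of Steps 4–8 (our illustration; the color sheet here contains only the two entries from the example above, whereas the scheme uses the full 256-color sheet shared between data owner and user):

```python
# Hypothetical two-entry excerpt of the 256-color sheet (byte value -> hex color).
COLOR_SHEET = {0b00000011: "#808080",   # Gray
               0b00000110: "#008080"}   # Teal

def credentials_to_bits(credentials: str) -> str:
    """Steps 4-5: characters -> decimal (ASCII) -> 8-bit binary."""
    return "".join(format(ord(ch), "08b") for ch in credentials)

def xor_halves(bits: str) -> str:
    """Step 6: split the sequence into two halves and EXOR them."""
    half = len(bits) // 2
    return format(int(bits[:half], 2) ^ int(bits[half:], 2), f"0{half}b")

def color_map_to_bits(bits: str) -> str:
    """Steps 7-8: each byte selects a color; its hex value becomes 24 bits."""
    return "".join(format(int(COLOR_SHEET[int(bits[i:i + 8], 2)][1:], 16), "024b")
                   for i in range(0, len(bits), 8))

assert len(credentials_to_bits("AsiaUniv")) == 64   # Steps 4-5 on a fragment

# Reproduce the paper's running example: the Step 6 output 0000001100000110
# maps to Gray and Teal, giving the two 24-bit strings shown in Step 8.
folded = xor_halves("00000110000001010000010100000011")
print(folded)                     # -> 0000001100000110
print(color_map_to_bits(folded))  # -> the concatenation of the Step 8 strings
```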
Step 9: Each 8-bit binary number is transformed into a randomly generated ASCII value. For instance, in the above step there are six 8-bit numbers: 10000000 10000000 10000000 00000000 10000000 10000000. The randomly generated ASCII value of 10000000 can be 120 and that of 00000000 can be 6, or it can be anything. Now the ASCII values are converted into their standard 8-bit binary format. For example,
00110010 10110101
Step 10: The data owner divides the resultant of the above step into two parts and adds the MAC address between them in the binary sequence. The sequence then becomes
00110010 00:1B:44:11:3A:B5 10110101
Step 11: The resultant is again converted into a binary sequence, and extra bits are deleted from the left side or zeros are appended on the right side to maintain a binary sequence length of 1024 bits.
00110010 . . . . . . . . . . . . . . . 10110101
Step 12: Now the binary sequence is grouped into 256-bit blocks and a DNA base is assigned to each block. For instance, the sequence 001100100110100101 . . . . . . 100110101010110101 is divided into four 256-bit blocks, which are assigned the bases A, G, T, and C, respectively.
Step 13: Now the complementary rule is applied on the resultant to make the final CCDNAS secret key. The final key sequence will be like GATC.
The session is complete once the data-owner end has generated the CCDNAS key. The resultant of Step 12 is used as the secret key for data encryption at the data-owner end, and the resultant of Step 13 is used as the key for decryption at the user end. The key is sent to the user after it is encrypted with the user's Pu k. The table of 256 color codes and the table of randomly generated ASCII values are sent to the user over an SSL connection. The user can decrypt the real key by applying his Pr k.

Cloud End Storage. After generating the key, the data owner encrypts his data with the CCDNAS key, and the data is then stored at the cloud end for user access. The data owner splits the data into 1024-bit blocks, and each block is further divided into four 256-bit blocks so that the EXOR operation can be done easily with the CCDNAS key. The data owner performs this operation block by block on 256-bit blocks, so four EXOR operations are required for every 1024-bit block. Each pair of binary digits is translated to a DNA base as shown in Table 3; there are 24 combinations of assignments for the 4 DNA bases. Then the data owner executes the same complementary rule we have seen in the key generation section. After completing the encryption, the data-owner end encrypts the data with the cloud end's Pu k and the user's Pu k. The cloud
end does not have the CCDNAS key to open the data; it can only forward it to the user. Because the authorized user already has his private key and the CCDNAS key, only he can decrypt and use the data. Other authorized users of the cloud who are not authorized by the data owner cannot decrypt the data. Hence, without the knowledge of the data owner, no one can read his data.
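The following Python sketch illustrates the block encryption just described (our illustration under simplifying assumptions — a toy 16-bit block in place of 1024 bits, and one fixed base assignment out of the 24 permutations):

```python
BASES = {"00": "A", "01": "G", "10": "T", "11": "C"}

def encrypt_block(data_bits: str, key_bits: str) -> str:
    """EXOR a data block with the CCDNAS key block, then map 2-bit pairs
    to DNA bases (Table 3 style); the complementary rule would follow."""
    assert len(data_bits) == len(key_bits)
    x = int(data_bits, 2) ^ int(key_bits, 2)
    xored = format(x, f"0{len(data_bits)}b")
    return "".join(BASES[xored[i:i + 2]] for i in range(0, len(xored), 2))

# Toy 16-bit example; the scheme uses 1024-bit blocks split into four
# 256-bit sub-blocks, i.e. four EXOR operations per 1024-bit block.
print(encrypt_block("0011001010110101", "0101101001100110"))  # -> GTTACGAC
```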
Table 2. Algorithm of CCDNAS secret key generation
CCDNAS secret key generation
Input: Attributes of the users
Output: CCDNAS 1024-bit secret key
1  Begin
2    If user = authorized by cloud_End then
3      Collect User_data and c_code from the user
4      Convert the attributes to decimal using the decimal encoding rule
5      Convert the decimal numbers into 8-bit binary using the binary encoding rule
6      Divide the resultant into two parts
7      Execute the EXOR operation between the two parts
8      Map the resultant binary into hex values using the 256-color-code table
9      Convert the hex values to 8-bit binary numbers
10     Convert binary to ASCII values using the table randomly generated by the data-owner end
11     Convert the ASCII values to standard binary numbers
12     Divide the resultant into two parts
13     Add the two parts on both sides of the MAC address
14     Convert the resultant into a binary number
15     If length(resultant) > 1024 bit then
16       Delete extra bits from the left end to make 1024 bit
17     Else if length(resultant) < 1024 bit then
18       Append zeros at the right end to make 1024 bit
19     End
20     Divide the binary number into four blocks
21     Assign a DNA base to each block
22     Apply the complementary rule on the DNA sequence
23   Else
24     Complain to the cloud service provider to change the public key of the DO
25   End
26 End
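The complementary rule in line 22 is not spelled out in this excerpt; the data owner is free to choose it. As one hedged example, the sketch below assumes the standard Watson–Crick pairing (A↔T, C↔G):

```python
# Assumed complementary rule (Watson-Crick pairing); the actual rule is
# chosen by the data owner and may differ from this example.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def apply_complementary_rule(dna_key: str) -> str:
    """Derive the user's decryption key from the data owner's DNA key."""
    return dna_key.translate(COMPLEMENT)

print(apply_complementary_rule("AGTC"))  # -> "TCAG" under this pairing
```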
Cloud End Access. The user must send a data request to the cloud end to get access to the data. After registration, the user can log in to the cloud and access the data. The user sends a request to the cloud end by encrypting the request with the user's Pr k and the cloud end's Pu k. After receiving the request from the user, the cloud end searches
Table 3. DNA bases with 2-bit binary numbers

00  01  10  11
A   G   T   C
A   G   C   T
G   A   T   C
G   A   C   T
...
T   C   A   G
for the requested data owner's Pu k and sends the data owner's details encrypted with the cloud end's private key and the user's public key. Now the user can request the CCDNAS key and the certificate from the data owner to access the data; the user must encrypt the request with the data owner's Pu k and the user's Pr k. After cross-checking with the cloud end, the data owner generates the CCDNAS key and sends it to the user together with the certificate to access the cloud. While sending the key, the data owner encrypts it with his Pr k and the user's Pu k. The user then gives the certificate to the cloud end and gets the encrypted data from the cloud end. Now the data is encrypted with the cloud end's Pr k and the user's Pu k. Finally, the user can decrypt and use the data, because he already has the CCDNAS key and the cloud end's public key.
4 Security Analysis

The CCDNAS key based encryption can withstand many kinds of attacks. The following is some of the security analysis.

4.1 Insider Attack

In an insider attack [20], a normal user itself can become an attacker to hack the data or services of another user. In the cloud environment, any person who uses the same cloud services can become an attacker; even the cloud service provider itself can become an attacker. In the proposed method, the data owner generates a CCDNAS key using the user's credentials, a color code with 16 million possibilities, and randomly generated ASCII values, so it is hard for attackers to break the key. The data owner shares this CCDNAS key only with the authorized user, encrypted using the public and private key concept. Therefore, the CCDNAS method solves the insider attack.
4.2 Side Channel Attack

In a side-channel attack [21], the attacker uses an unauthorized virtual machine on the same host to take confidential data from the cloud. In this attack, the attacker mainly focuses on the implementation of the key-encryption algorithm. In our proposed system, we have used multiple layers of key generation to generate the CCDNAS key, and the data is encrypted with the 1024-bit key. Even after this hard encryption, the scheme uses the public and private key concept to protect the key. The corresponding private and public keys are known only by the concerned authorized user. The data owner is the only person who generates the CCDNAS key; he does not store the key anywhere, not even in the cloud. So it is impossible for the attacker to get the key. Therefore, this scheme stands strong against side-channel attackers.

4.3 Phishing Attack

In a phishing attack [22], unauthorized users obtain the credentials of an authorized cloud-end user, such as his name, date of birth, and MAC address. The attacker then tries to get the cloud services or access the user's data by using those details. In this scheme, we use a color code with 16 million possibilities, and it is hard for attackers to guess the color code. At the same time, the cloud end gets all the details from the user and sends the Pu k and Pr k to the user through an SSL connection. After that, all transmissions are encrypted with Pu k and Pr k encryption. When the user requests data from the cloud, the cloud end encrypts the data with the corresponding user's Pu k and the cloud end's private key. Neither the data owner nor the cloud end shares their confidential details with any unauthorized user. So it is hard for the attacker to steal data from the cloud. Therefore, the proposed scheme can protect the data from phishing attacks.

4.4 Denial of Service Attack

In a denial of service attack, attackers overload the server by sending many requests using bots, making the cloud service unavailable for authorized users. In the proposed method, the data is stored in the cloud with 1024-bit CCDNAS key encryption, and the data is additionally encrypted with the cloud's public key and the data owner's private key. The materials required for decrypting the data are given only to the authorized user, not to the cloud. So even if the cloud is hacked, the attackers cannot decrypt or obtain any data from the cloud. The system is thus safe, and the proposed scheme helps to withstand the DoS attack.
5 Conclusions and Future Works

Cloud technology is making a big change in this era. Because of security threats to confidential data, this technology is not accepted by many companies. To increase security in the cloud computing environment, this research proposed a method called CCDNAS encryption with a 1024-bit encryption key. This method uses DNA computing and color codes to increase security. The public and private key concepts
are also used in the key exchange phase. Here the data owner does not need to stay online to give access to the user; after sharing the certificate and the CCDNAS key, the data owner can go offline. This proposed system solves many security threats in the cloud computing environment. The performance analysis and the mathematical analysis will be considered as future work. Moreover, there is large scope to improve the authentication process of the cloud computing environment.

Acknowledgements. This work was supported in part by the Ministry of Science and Technology, Taiwan, through grant MOST 110-2218-E-468-001-MBK.
References

1. Thomas, E., Ricardo, P., Zaigham, M.: Cloud Computing: Concepts, Technology & Architecture. The Pearson Service Technology Series (2014)
2. Dotson, C.: Practical Cloud Security. O'Reilly Media, Inc. (2019)
3. Mathew, S.: AWS principal solutions architect, Overview of Amazon Web Services (2020)
4. Morrow, T.: 12 risks, threats, & vulnerabilities in moving to the cloud. Carnegie Mellon University's Software Engineering Institute Blog (2018)
5. Kumar, S., Wollinger, T.: Fundamentals of symmetric cryptography. In: Lemke, K., Paar, C., Wolf, M. (eds.) Embedded Security in Cars. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28428-1_8
6. Namasudra, S., Roy, P., Balusamy, B., Vijayakumar, P.: Data accessing based on the popularity value for cloud computing. In: 2017 International Conference on Innovations in Information, Embedded and Communication Systems, pp. 1–6 (2017)
7. Namasudra, S., Devi, D., Kadry, S., Sundarasekar, R., Shanthini, A.: Towards DNA based data security in the cloud computing environment. Comput. Commun. 151, 539–547 (2020)
8. Younis, Y.A., Kifayat, K., Merabti, M.: An access control for cloud computing. J. Inf. Secur. Appl. 19(1), 45–60 (2014)
9. Balamurugan, B., Krishna, P.V.: Extensive survey on usage of attribute-based encryption in cloud. J. Emerg. Technol. Web Intell. 6(3), 263–272 (2014)
10. Shamir, A.: Identity-based cryptosystems and signature schemes. In: Proceedings of Advances in Cryptology, pp. 47–53 (1985)
11. Crues, R.A.: Methods for access control: advances and limitations (2013)
12. Hota, C., Sanka, S., Rajarajan, M., Nair, S.K.: Capability based cryptographic data access control in cloud computing. Int. J. Adv. Netw. Appl. 3(3), 1152–1161 (2011)
13. Zhu, Y., Hu, H., Ahn, G.J., Huang, D., Wang, S.: Towards temporal access control in cloud computing. In: Proceedings of IEEE INFOCOM, Orlando, USA, pp. 2576–2580 (2012)
14. Adleman, L.M.: Molecular computation of solutions to combinatorial problems. Science 266, 1021–1024 (1994)
15. Clelland, C., Risca, V., Bancroft, C.: Hiding messages in DNA microdots. Nature 399, 533–534 (1999)
16. MingXin, L.: Symmetric key cryptosystem with DNA technology. Sci. China Ser. F Inf. Sci. 50, 324–333 (2007)
17. Lai, X., MingXin, L., Qin, L.: Asymmetric encryption and signature method with DNA technology. Sci. China Inf. Sci. 53, 506–514 (2010)
18. Wang, B., Xie, Y., Zhou, S., Zhou, C., Zheng, X.: Reversible data hiding based on DNA computing. Comput. Intell. Neurosci. 2017, 9 (2017). Article ID 7276084
19. Gaitonde, A.: Color coded cryptography. Int. J. Sci. Eng. Res. 3(7), 828–830 (2012)
20. Tiri, K.: Side-channel attack pitfalls. In: Proceedings of the 44th ACM/IEEE Design Automation Conference, San Diego, USA. IEEE (2007) 21. Jagatic, T.N., Johnson, N.A., Jakobsson, M., Menczer, F.: Social phishing. Commun. ACM 50(10), 94–100 (2007) 22. Chen, H.-C., et al.: A secure color-code key exchange protocol for mobile chat application. In: You, I., Leu, F.-Y., Chen, H.-C., Kotenko, I. (eds.) MobiSec 2016. CCIS, vol. 797, pp. 54–64. Springer, Singapore (2018). https://doi.org/10.1007/978-981-10-7850-7_6
A Hybrid Intelligent Simulation System for Node Placement in WMNs: A Comparison Study of Chi-Square and Uniform Distributions of Mesh Clients for CM and LDVM Router Replacement Methods

Admir Barolli1(B), Kevin Bylykbashi2, Ermioni Qafzezi2, Shinji Sakamoto3, and Leonard Barolli4

1 Department of Information Technology, Aleksander Moisiu University of Durres, L.1, Rruga e Currilave, Durres, Albania
2 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
3 Department of Computer and Information Science, Seikei University, 3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan
[email protected]
4 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected]

Abstract. Wireless Mesh Networks (WMNs) are gaining a lot of attention from researchers due to their advantages such as easy maintenance, low upfront cost, and high robustness. Connectivity and stability directly affect the performance of WMNs. However, WMNs have some problems such as the node placement problem, the hidden terminal problem, and so on. In our previous work, we implemented a simulation system to solve the node placement problem in WMNs considering Particle Swarm Optimization (PSO), Simulated Annealing (SA) and Distributed Genetic Algorithm (DGA), called WMN-PSOSA-DGA. In this paper, we compare chi-square and uniform distributions of mesh clients for two router replacement methods: Constriction Method (CM) and Linearly Decreasing Vmax Method (LDVM). The simulation results show that for the chi-square distribution, the mesh routers can cover all the mesh clients, but this distribution does not achieve good load balancing. The uniform distribution achieves better load balancing, but not all mesh clients are covered. Both CM and LDVM show good results for this distribution, with the latter showing slightly better results.
1 Introduction
The wireless networks and devices are becoming increasingly popular and they provide users access to information and communication anytime and anywhere [3,8–11,14,20,26,27,29,33]. Wireless Mesh Networks (WMNs) are gaining
a lot of attention because of their low-cost nature, which makes them attractive for providing wireless Internet connectivity. A WMN is dynamically self-organized and self-configured, with the nodes in the network automatically establishing and maintaining mesh connectivity among themselves (creating, in effect, an ad hoc network). This feature brings many advantages to WMNs such as low up-front cost, easy network maintenance, robustness and reliable service coverage [1]. Moreover, such infrastructure can be used to deploy community networks, metropolitan area networks, municipal and corporate networks, and to support applications for urban areas, medical, transport and surveillance systems. Mesh node placement in WMNs can be seen as a family of problems, which is shown (through graph theoretic approaches or placement problems, e.g. [6,15]) to be computationally hard to solve for most of the formulations [37]. We consider the version of the mesh router node placement problem in which we are given a grid area where a number of mesh router nodes are to be deployed and a number of mesh client nodes of fixed positions (of an arbitrary distribution) in the grid area. The objective is to find a location assignment for the mesh routers to the cells of the grid area that maximizes the network connectivity and client coverage and considers load balancing for each router. Network connectivity is measured by the Size of Giant Component (SGC) of the resulting WMN graph, while the user coverage is simply the number of mesh client nodes that fall within the radio coverage of at least one mesh router node and is measured by the Number of Covered Mesh Clients (NCMC). For load balancing, we added to the fitness function a new parameter called NCMCpR (Number of Covered Mesh Clients per Router). Node placement problems are known to be computationally hard to solve [12,13,38]. In previous works, some intelligent algorithms have been investigated for the node placement problem [4,7,16,18,21–23,31,32]. In [24], we implemented a Particle Swarm Optimization (PSO) based simulation system, called WMN-PSO. Also, we implemented another simulation system based on Genetic Algorithm (GA), called WMN-GA [19], for solving the node placement problem in WMNs. Then, we designed and implemented a hybrid simulation system based on PSO and distributed GA (DGA). We call this system WMN-PSODGA. In this paper, we compare the simulation results of chi-square and uniform distributions of mesh clients for two router replacement methods: Constriction Method (CM) and Linearly Decreasing Vmax Method (LDVM). The rest of the paper is organized as follows. We present our designed and implemented hybrid simulation system in Sect. 2. The simulation results are given in Sect. 3. Finally, we give conclusions and future work in Sect. 4.
2 Proposed and Implemented Simulation System

2.1 Particle Swarm Optimization
In PSO a number of simple entities (the particles) are placed in the search space of some problem or function and each evaluates the objective function at its
current location. The objective function is often minimized and the exploration of the search space is not through evolution [17]. Each particle then determines its movement through the search space by combining some aspect of the history of its own current and best (best-fitness) locations with those of one or more members of the swarm, with some random perturbations. The next iteration takes place after all particles have been moved. Eventually the swarm as a whole, like a flock of birds collectively foraging for food, is likely to move close to an optimum of the fitness function. Each individual in the particle swarm is composed of three D-dimensional vectors, where D is the dimensionality of the search space. These are the current position xi, the previous best position pi and the velocity vi. The particle swarm is more than just a collection of particles. A particle by itself has almost no power to solve any problem; progress occurs only when the particles interact. Problem solving is a population-wide phenomenon, emerging from the individual behaviors of the particles through their interactions. In any case, populations are organized according to some sort of communication structure or topology, often thought of as a social network. The topology typically consists of bidirectional edges connecting pairs of particles, so that if j is in i's neighborhood, i is also in j's. Each particle communicates with some other particles and is affected by the best point found by any member of its topological neighborhood. This is just the vector pi for that best neighbor, which we will denote with pg. The potential kinds of population "social networks" are hugely varied, but in practice certain types have been used more frequently. We show the pseudo code of PSO in Algorithm 1. In the PSO process, the velocity of each particle is iteratively adjusted so that the particle stochastically oscillates around the pi and pg locations.
2.2 Distributed Genetic Algorithm
Distributed Genetic Algorithm (DGA) has been used in various fields of science. DGA has shown its usefulness for the resolution of many computationally hard combinatorial optimization problems. We show the pseudo code of DGA in Algorithm 2.

Population of individuals: Unlike local search techniques that construct a path in the solution space by jumping from one solution to another through local perturbations, DGA uses a population of individuals, thus giving the search a larger scope and better chances of finding good solutions. This feature is also known as the "exploration" process, in contrast to the "exploitation" process of local search methods.

Fitness: The determination of an appropriate fitness function, together with the chromosome encoding, is crucial to the performance of DGA. Ideally we would construct objective functions with "certain regularities", i.e. objective functions that verify that for any two individuals which are close in the search space, their respective values of the objective functions are similar.

Selection: The selection of individuals to be crossed is another important aspect in DGA, as it impacts the convergence of the algorithm.
Algorithm 1. Pseudo code of PSO.
/* Initialize all parameters for PSO */
Computation maxtime := T_pmax, t := 0;
Number of particle-patterns := m, 2 ≤ m ∈ N^1;
Particle-pattern initial solution := P_i^0;
Particle-pattern initial position := x_ij^0;
Particle initial velocity := v_ij^0;
PSO parameter := ω, 0 < ω ∈ R^1;
PSO parameter := C1, 0 < C1 ∈ R^1;
PSO parameter := C2, 0 < C2 ∈ R^1;
/* Start PSO */
Evaluate(G^0, P^0);
while t < T_pmax do
  /* Update velocities and positions */
  v_ij^(t+1) = ω · v_ij^t + C1 · rand() · (best(P_ij^t) − x_ij^t) + C2 · rand() · (best(G^t) − x_ij^t);
  x_ij^(t+1) = x_ij^t + v_ij^(t+1);
  /* if fitness value is increased, a new solution will be accepted. */
  Update_Solutions(G^t, P^t);
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return Best found pattern of particles as solution;
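A minimal, runnable Python sketch of the PSO loop in Algorithm 1 (our illustration; the objective, search bounds, and dimensions are placeholders, not the WMN fitness defined later, and this version minimizes a simple test function):

```python
import random

def pso(fitness, dim=2, m=20, t_max=200, w=0.729, c1=1.4955, c2=1.4955):
    """Minimize `fitness` with a basic PSO; CM-style parameters by default."""
    xs = [[random.uniform(-10, 10) for _ in range(dim)] for _ in range(m)]
    vs = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(m)]
    pbest = [x[:] for x in xs]                      # best position per particle
    gbest = min(pbest, key=fitness)[:]              # best position in the swarm
    for _ in range(t_max):
        for i in range(m):
            for d in range(dim):
                vs[i][d] = (w * vs[i][d]
                            + c1 * random.random() * (pbest[i][d] - xs[i][d])
                            + c2 * random.random() * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            if fitness(xs[i]) < fitness(pbest[i]):  # accept improved solutions
                pbest[i] = xs[i][:]
                if fitness(pbest[i]) < fitness(gbest):
                    gbest = pbest[i][:]
    return gbest

print(pso(lambda x: sum(v * v for v in x)))         # sphere-function demo
```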
Several selection schemes have been proposed in the literature for selection operators, trying to cope with the premature convergence of DGA. There are many selection methods in GA. In our system, we implement two selection methods: the random method and the roulette wheel method.

Crossover operators: The use of crossover operators is one of the most important characteristics of DGA. The crossover operator is the means by which DGA transmits the best genetic features of parents to offspring during the generations of the evolution process. Many crossover operators have been proposed, such as Blend Crossover (BLX-α), Unimodal Normal Distribution Crossover (UNDX), and Simplex Crossover (SPX).

Mutation operators: These operators intend to improve the individuals of a population by small local perturbations. They aim to provide a component of randomness in the neighborhood of the individuals of the population. In our system, we implemented two mutation methods: uniformly random mutation and boundary mutation.

Escaping from local optima: GA itself has the ability to avoid falling prematurely into local optima and can eventually escape from them during the search process. DGA has one more mechanism to escape from local optima, by considering some islands. Each island runs a GA for optimization, and the islands migrate their genes to provide the ability to avoid local optima (see Fig. 1).
Algorithm 2. Pseudo code of DGA.
/* Initialize all parameters for DGA */
Computation maxtime := T_gmax, t := 0;
Number of islands := n, 1 ≤ n ∈ N^1;
Initial solution := P_i^0;
/* Start DGA */
Evaluate(G^0, P^0);
while t < T_gmax do
  for all islands do
    Selection();
    Crossover();
    Mutation();
  end for
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return Best found pattern of particles as solution;
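A compact Python sketch of the island-model loop in Algorithm 2 (our illustration; the selection, crossover, and mutation used here are simplified stand-ins for the roulette-wheel/UNDX/uniform-mutation operators named above):

```python
import random

def dga(fitness, dim=2, n_islands=4, pop=10, t_max=100, mut_rate=0.2):
    """Maximize `fitness` with a toy island-model distributed GA."""
    islands = [[[random.uniform(-10, 10) for _ in range(dim)]
                for _ in range(pop)] for _ in range(n_islands)]
    for _ in range(t_max):
        for isl in islands:
            isl.sort(key=fitness, reverse=True)
            parents = isl[:pop // 2]                          # truncation selection
            children = []
            while len(children) < pop - len(parents):
                a, b = random.sample(parents, 2)
                child = [(x + y) / 2 for x, y in zip(a, b)]   # blend-style crossover
                if random.random() < mut_rate:                # uniform mutation
                    child[random.randrange(dim)] += random.uniform(-1, 1)
                children.append(child)
            isl[:] = parents + children
        # Migration: each island receives the previous island's best individual.
        bests = [max(isl, key=fitness) for isl in islands]
        for i, isl in enumerate(islands):
            isl[-1] = bests[(i - 1) % n_islands][:]
    return max((max(isl, key=fitness) for isl in islands), key=fitness)

print(dga(lambda x: -sum(v * v for v in x)))   # maximizing -||x||^2
```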
Fig. 1. Model of migration in DGA.
Convergence: The convergence of the algorithm is the mechanism by which DGA reaches good solutions. A premature convergence of the algorithm would cause all individuals of the population to be similar in their genetic features; the search would thus become ineffective and the algorithm would get stuck in local optima. Maintaining the diversity of the population is therefore very important to this family of evolutionary algorithms.
2.3 WMN-PSODGA Hybrid Simulation System
In this subsection, we present the initialization, particle-pattern, fitness function, and replacement methods. The pseudo code of our implemented system is
Algorithm 3. Pseudo code of WMN-PSODGA system.
Computation maxtime := T_max, t := 0;
Initial solutions: P.
Initial global solutions: G.
/* Start PSODGA */
while t < T_max do
  Subprocess(PSO);
  Subprocess(DGA);
  WaitSubprocesses();
  Evaluate(G^t, P^t);
  /* Migration() swaps solutions (see Fig. 2). */
  Migration();
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return Best found pattern of particles as solution;
shown in Algorithm 3. Also, our implemented simulation system uses a Migration function, as shown in Fig. 2. The Migration function swaps solutions among the lands included in the PSO part.

Initialization
We decide the velocity of particles by a random process considering the area size. For instance, when the area size is W × H, the velocity is decided randomly from −√(W² + H²) to √(W² + H²).

Particle-Pattern
A particle is a mesh router. A fitness value of a particle-pattern is computed by the combination of mesh router and mesh client positions. In other words, each particle-pattern is a solution, as shown in Fig. 3.

Gene Coding
A gene describes a WMN. Each individual has its own combination of mesh nodes. In other words, each individual has a fitness value. Therefore, the combination of mesh nodes is a solution.

Fitness Function
WMN-PSODGA has a fitness function to evaluate the temporary solution of the routers' placements. The fitness function is defined as:

Fitness = α × NCMC(x_ij, y_ij) + β × SGC(x_ij, y_ij) + γ × NCMCpR(x_ij, y_ij).
This function uses the following indicators.
• NCMC (Number of Covered Mesh Clients): the number of the clients covered by the SGC's routers.
• SGC (Size of Giant Component): the maximum number of connected routers.
Fig. 2. Model of WMN-PSODGA migration.
Fig. 3. Relationship among global solution, particle-patterns, and mesh routers in PSO part.
• NCMCpR (Number of Covered Mesh Clients per Router): the number of clients covered by each router. The NCMCpR indicator is used for load balancing.
WMN-PSODGA aims to maximize the value of the fitness function in order to optimize the placement of the routers using the above three indicators. The weight-coefficients of the fitness function are α, β, and γ for NCMC, SGC, and NCMCpR, respectively. Moreover, the weight-coefficients satisfy α + β + γ = 1.

Router Replacement Methods
A mesh router has x and y positions and a velocity. Mesh routers are moved based on their velocities. There are many router replacement methods; in this paper, we use CM and LDVM.

Constriction Method (CM)
CM is a method in which the PSO parameters are set to a weak stable region (ω = 0.729, C1 = C2 = 1.4955) based on the analysis of PSO by M. Clerc et al. [2,5,35].
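The fitness function above can be made concrete with a short Python sketch (our illustration: the NCMCpR load-balancing term is modeled as a simple min-minus-max coverage spread, and the radius and weights are placeholder values):

```python
import math
from collections import deque

def fitness(routers, clients, radius=2.5, alpha=0.6, beta=0.3, gamma=0.1):
    """Evaluate a placement; routers/clients are lists of (x, y) points."""
    n = len(routers)
    adj = [[j for j in range(n) if j != i
            and math.dist(routers[i], routers[j]) <= 2 * radius]
           for i in range(n)]
    # SGC: largest connected component of the router graph (BFS).
    seen, sgc, giant = set(), 0, []
    for s in range(n):
        if s in seen:
            continue
        comp, q = [], deque([s])
        seen.add(s)
        while q:
            u = q.popleft()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    q.append(v)
        if len(comp) > sgc:
            sgc, giant = len(comp), comp
    # NCMC: clients within radio range of at least one giant-component router.
    ncmc = sum(1 for c in clients
               if any(math.dist(routers[i], c) <= radius for i in giant))
    # NCMCpR term: penalize imbalance via the spread of per-router coverage.
    per_router = [sum(1 for c in clients if math.dist(routers[i], c) <= radius)
                  for i in giant]
    balance = min(per_router, default=0) - max(per_router, default=0)
    return alpha * ncmc + beta * sgc + gamma * balance
```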
Table 1. The common parameters for each simulation.

Parameters                      Values
Distribution of mesh clients    Chi-square, uniform
Number of mesh clients          48
Number of mesh routers          16
Radius of a mesh router         2.0–3.5
Number of GA islands            16
Number of migrations            200
Evolution steps                 9
Selection method                Random method
Crossover method                UNDX
Mutation method                 Uniform mutation
Crossover rate                  0.8
Mutation rate                   0.2
Replacement method              CM, LDVM
Area size                       32.0 × 32.0
Random Inertia Weight Method (RIWM)
In RIWM, the ω parameter changes randomly from 0.5 to 1.0. C1 and C2 are kept at 2.0. The ω can be estimated from the weak stable region; the average of ω is 0.75 [28,35].

Linearly Decreasing Inertia Weight Method (LDIWM)
In LDIWM, C1 and C2 are set to 2.0, constantly. On the other hand, the ω parameter is changed linearly from the unstable region (ω = 0.9) to the stable region (ω = 0.4) with increasing iterations of the computation [35,36].

Linearly Decreasing Vmax Method (LDVM)
In LDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). A value Vmax, which is the maximum velocity of particles, is considered. With increasing iterations of the computation, Vmax is decreased linearly [30,34].

Rational Decrement of Vmax Method (RDVM)
In RDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). Vmax is decreased with increasing iterations as

V_max(x) = √(W² + H²) × (T − x) / x,

where W and H are the width and the height of the considered area, respectively. Also, T and x are the total number of iterations and the current iteration number, respectively [25].
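The two Vmax schedules can be sketched in a few lines of Python (our illustration of the formulas above; the linear form used for LDVM is an assumption, since the text only states that Vmax decreases linearly):

```python
import math

def vmax_ldvm(x, T, W, H):
    """Assumed LDVM schedule: linear decrease from sqrt(W^2+H^2) to 0."""
    return math.sqrt(W**2 + H**2) * (T - x) / T

def vmax_rdvm(x, T, W, H):
    """RDVM schedule: Vmax(x) = sqrt(W^2 + H^2) * (T - x) / x."""
    return math.sqrt(W**2 + H**2) * (T - x) / x

W = H = 32.0   # area size from Table 1
T = 200
for x in (1, 50, 100, 150, 199):
    print(x, round(vmax_ldvm(x, T, W, H), 2), round(vmax_rdvm(x, T, W, H), 2))
```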
Fig. 4. Visualization results after the optimization (Chi-square Distribution): (a) CM, (b) LDVM.
3 Simulation Results
In this section, we compare the simulation results of chi-square and uniform distributions of mesh clients for CM and LDVM. The weight-coefficients of the fitness function were adjusted for optimization; in this paper, the weight-coefficients are α = 0.6, β = 0.3, γ = 0.1. The number of mesh routers is 16 and the number of mesh clients is 48. Table 1 summarizes the common parameters used in each simulation. Figures 4 and 5 show the visualization results after the optimization for the chi-square and uniform distributions, respectively. Figures 6 and 7 show the number of covered mesh clients by each router. Figures 8 and 9 show the standard deviation, where r is the correlation coefficient. As shown in Fig. 4, when using the chi-square distribution, 16 mesh routers can cover all the mesh clients for both CM and LDVM. On the other hand, in Fig. 5, the simulation results show that for the uniform distribution, 16 mesh routers are not enough to cover all mesh clients, regardless of which router replacement method is used. However, as we can see from Figs. 6, 7, 8 and 9, this distribution has better results compared to the chi-square distribution in terms of load balancing. For the chi-square distribution, some routers cover most of the clients while the rest cover only a few of them, and this can be seen for both CM and LDVM. The uniform distribution, on the other hand, has good load balancing for both router replacement methods, with LDVM showing slightly better results than CM.
Fig. 5. Visualization results after the optimization (Uniform Distribution): (a) CM, (b) LDVM.
Fig. 6. Number of covered clients by each router after the optimization (Chi-square Distribution).
Fig. 7. Number of covered clients by each router after the optimization (Uniform Distribution).
Fig. 8. Transition of the standard deviations (Chi-square Distribution).
Fig. 9. Transition of the standard deviations (Uniform Distribution).
4 Conclusions
In this work, we evaluated the performance of WMNs using a hybrid simulation system based on PSO and DGA (called WMN-PSODGA). We compared the simulation results of chi-square and uniform distributions of mesh clients for two router replacement methods: CM and LDVM. The simulation results show that for the chi-square distribution, the mesh routers can cover all the mesh clients, but this distribution does not achieve good load balancing. The uniform distribution achieves better load balancing, but not all mesh clients are covered. Both CM and LDVM show good results for this distribution, with the latter showing slightly better results. In future work, we will consider other distributions of mesh clients and other router replacement methods.
References

1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
2. Barolli, A., Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L., Takizawa, M.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering constriction and linearly decreasing Vmax methods. In: Xhafa, F., Caballé, S., Barolli, L. (eds.) 3PGCIC 2017. LNDECT, vol. 13, pp. 111–121. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69835-9_10
3. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs considering different distributions of mesh clients. In: Barolli, L., Xhafa, F., Javaid, N., Enokido, T. (eds.) IMIS 2018. AISC, vol. 773, pp. 32–45. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93554-6_3
4. Barolli, A., Sakamoto, S., Ozera, K., Barolli, L., Kulla, E., Takizawa, M.: Design and implementation of a hybrid intelligent system based on particle swarm optimization and distributed genetic algorithm. In: Barolli, L., Xhafa, F., Javaid, N., Spaho, E., Kolici, V. (eds.) EIDWT 2018. LNDECT, vol. 17, pp. 79–93. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75928-9_7
5. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
6. Franklin, A.A., Murthy, C.S.R.: Node placement algorithm for deployment of two-tier wireless mesh networks. In: Proceedings of Global Telecommunications Conference, pp. 4823–4827 (2007)
7. Girgis, M.R., Mahmoud, T.M., Abdullatif, B.A., Rabie, A.M.: Solving the wireless mesh network design problem using genetic algorithm and simulated annealing optimization methods. Int. J. Comput. Appl. 96(11), 1–10 (2014)
8. Goto, K., Sasaki, Y., Hara, T., Nishio, S.: Data gathering using mobile agents for reducing traffic in dense mobile wireless sensor networks. Mobile Inf. Syst. 9(4), 295–314 (2013)
9. Inaba, T., Elmazi, D., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A secure-aware call admission control scheme for wireless cellular networks using fuzzy logic and its performance evaluation. J. Mobile Multimedia 11(3&4), 213–222 (2015)
10. Inaba, T., Obukata, R., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of a QoS-aware fuzzy-based CAC for LAN access. Int. J. Space-Based Situated Comput. 6(4), 228–238 (2016)
11. Inaba, T., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A testbed for admission control in WLAN: a fuzzy approach and its performance evaluation. In: Barolli, L., Xhafa, F., Yim, K. (eds.) BWCCA 2016. LNDECT, vol. 2, pp. 559–571. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49106-6_55
12. Lim, A., Rodrigues, B., Wang, F., Xu, Z.: k-center problems with minimum coverage. Theor. Comput. Sci. 332(1–3), 1–17 (2005)
13. Maolin, T., et al.: Gateways placement in backbone wireless mesh networks. Int. J. Commun. Netw. Syst. Sci. 2(1), 44–50 (2009)
14. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance analysis of WMNs by WMN-GA simulation system for two WMN architectures and different TCP congestion-avoidance algorithms and client distributions. Int. J. Commun. Netw. Distrib. Syst. 20(3), 335–351 (2018)
15. Muthaiah, S.N., Rosenberg, C.P.: Single gateway placement in wireless mesh networks. In: Proceedings of 8th International IEEE Symposium on Computer Networks, pp. 4754–4759 (2008)
16. Naka, S., Genji, T., Yura, T., Fukuyama, Y.: A hybrid particle swarm optimization for distribution state estimation. IEEE Trans. Power Syst. 18(1), 60–68 (2003)
17. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007). https://doi.org/10.1007/s11721-007-0002-0
18. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study of simulated annealing and genetic algorithm for node placement problem in wireless mesh networks. J. Mobile Multimedia 9(1–2), 101–110 (2013)
19. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study of hill climbing, simulated annealing and genetic algorithm for node placement problem in WMNs. J. High Speed Netw. 20(1), 55–66 (2014)
20. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A simulation system for WMN based on SA: performance evaluation for different instances and starting temperature values. Int. J. Space-Based Situated Comput. 4(3–4), 209–216 (2014)
21. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Performance evaluation considering iterations per phase and SA temperature in WMN-SA system. Mobile Inf. Syst. 10(3), 321–330 (2014)
22. Sakamoto, S., Lala, A., Oda, T., Kolici, V., Barolli, L., Xhafa, F.: Application of WMN-SA simulation system for node placement in wireless mesh networks: a case study for a realistic scenario. Int. J. Mobile Comput. Multimedia Commun. (IJMCMC) 6(2), 13–21 (2014)
23. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: An integrated simulation system considering WMN-PSO simulation system and network simulator 3. In: Barolli, L., Xhafa, F., Yim, K. (eds.) BWCCA 2016. LNDECT, vol. 2, pp. 187–198. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49106-6_17
24. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evaluation of a simulation system based on particle swarm optimisation for node placement problem in wireless mesh networks. Int. J. Commun. Netw. Distrib. Syst. 17(1), 1–13 (2016)
25. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a new replacement method in WMN-PSO simulation system and its performance evaluation. In: The 30th IEEE International Conference on Advanced Information Networking and Applications (AINA 2016), pp. 206–211 (2016)
26. Sakamoto, S., Obukata, R., Oda, T., Barolli, L., Ikeda, M., Barolli, A.: Performance analysis of two wireless mesh network architectures by WMN-SA and WMN-TS simulation systems. J. High Speed Netw. 23(4), 311–322 (2017)
27. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Implementation of an intelligent hybrid simulation systems for WMNs based on particle swarm optimization and simulated annealing: performance evaluation for different replacement methods. Soft Comput. 23(9), 3029–3035 (2017)
28. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering random inertia weight method and linearly decreasing Vmax method. In: Barolli, L., Xhafa, F., Conesa, J. (eds.) BWCCA 2017. LNDECT, vol. 12, pp. 114–124. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-69811-3_10
29. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mobile Netw. Appl. 23(1), 27–33 (2017)
30. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering constriction and linearly decreasing inertia weight methods. In: Barolli, L., Enokido, T., Takizawa, M. (eds.) NBiS 2017. LNDECT, vol. 7, pp. 3–13. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-65521-5_1
31. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of intelligent hybrid systems for node placement in wireless mesh networks: a comparison study of WMN-PSOHC and WMN-PSOSA. In: Barolli, L., Enokido, T. (eds.) IMIS 2017. AISC, vol. 612, pp. 16–26. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61542-4_2
32. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of WMN-PSOHC and WMN-PSO simulation systems for node placement in wireless mesh networks: a comparison study. In: Barolli, L., Zhang, M., Wang, X. (eds.) EIDWT 2017. LNDECT, vol. 6, pp. 64–74. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59463-7_7
33. Sakamoto, S., Ozera, K., Barolli, A., Barolli, L., Kolici, V., Takizawa, M.: Performance evaluation of WMN-PSOSA considering four different replacement methods. In: Barolli, L., Xhafa, F., Javaid, N., Spaho, E., Kolici, V. (eds.) EIDWT 2018. LNDECT, vol. 17, pp. 51–64. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75928-9_5
34. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Global Optim. 31(1), 93–108 (2005)
35. Shi, Y.: Particle swarm optimization. IEEE Connections 2(1), 8–13 (2004)
36. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Porto, V.W., Saravanan, N., Waagen, D., Eiben, A.E. (eds.) EP 1998. LNCS, vol. 1447, pp. 591–600. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0040810
37. Vanhatupa, T., Hannikainen, M., Hamalainen, T.: Genetic algorithm to optimize node placement and configuration for WLAN planning. In: Proceedings of The 4th IEEE International Symposium on Wireless Communication Systems, pp. 612–616 (2007)
38. Wang, J., Xie, B., Cai, K., Agrawal, D.P.: Efficient mesh router placement in wireless mesh networks. In: Proceedings of IEEE International Conference on Mobile Adhoc and Sensor Systems (MASS 2007), pp. 1–9 (2007)
Outage Probability of CR-NOMA Schemes with Multiple Antennas Selection and Power Transfer Approach

Hong-Nhu Nguyen1,2(B), Ngoc-Long Nguyen1(B), Nhat-Tien Nguyen1,2(B), Ngoc-Lan Nguyen2(B), and Miroslav Voznak1(B)

1 VSB Technical University of Ostrava, 17 Listopadu 2172/15, 70800 Ostrava-Poruba, Czech Republic
{nhu.hong.nguyen.st,ngoc.long.nguyen.st,miroslav.voznak}@vsb.cz, [email protected]
2 Faculty of Electronics and Telecommunications, Saigon University, Ho Chi Minh City, Vietnam
[email protected]
Abstract. In this paper, the outage performance of CR-NOMA schemes in decode-and-forward (DF) relay systems for Device-to-Device (D2D) communication with antenna selection is investigated. We propose a power beacon, which can feed energy to the relay device node to further support the transmission from the source to the destination. To this end, closed-form expressions for the outage probabilities at the user are derived. An asymptotic analysis at high signal-to-noise ratio (SNR) is carried out to provide additional insights into the system performance. Furthermore, computer simulation results are presented to validate the accuracy of the attained analytical results.

Keywords: CR-NOMA · Non-orthogonal multiple access · Decode-and-forward · Outage probability (OP)
1 Introduction

Cognitive radio (CR) has received remarkable attention thanks to its ability to improve spectrum utilization [1]. In the context of CR being proposed for the fifth generation (5G) mobile networks, both the primary users, including a base station (BS) and mobile users served by the BS, and the secondary users (mobile users not served by the BS) can coexist in the same licensed system [2]. Non-orthogonal multiple access (NOMA) has been identified as a promising solution to achieve higher spectrum efficiency in 5G wireless networks [3]. Thanks to technical improvements of superposition coding and successive interference cancellation (SIC) in NOMA, multiple users can be served in the same resource block. Cooperative NOMA is proposed in [4] to improve reception reliability. The performance of cooperative NOMA is investigated in [5, 6]. The achievable rate of cooperative NOMA is studied in [5]. Considering imperfect SIC, the ergodic sum capacity
of the cooperative NOMA is investigated, where two sources communicate with their corresponding destinations via a standard relay [7]. Cooperative NOMA is applied in a cognitive radio network to enhance spectrum efficiency [8]. Approximated expressions for the outage probability are derived for cooperative NOMA in an underlay cognitive radio network where one secondary user is selected as a relay [9]. In [10], the outage probability and the ergodic capacity of cooperative NOMA in an underlay cognitive radio network are derived, assuming that the primary source's interference is a constant, which is not realistic in practice. Recently, the integration between wireless networks and vehicle-to-everything (V2X) communications has been investigated to provide more advantages for vehicular networks, including more efficient, more intelligent, and safer road traffic in the future [11–13]. Energy harvesting wireless networks are also expected to introduce several transformative changes in wireless networking, so a prospective approach is to apply energy harvesting technologies to relay networks. Radio-frequency (RF) energy harvesting is used in [14]. Relaying networks have been widely studied by introducing flexible, sustainable, and stable energy supply devices in such networks [15–18]. In [16], renewable energy is considered as a solution for employing dense small cell base stations (SBSs) to adapt to the increasing demand for communication services. In [18], system throughput can be enhanced by utilizing the optimal channel selection method and the harvested RF energy; cognitive radio sensor networks benefit from RF energy harvesting. The rest of this paper is organized as follows. In Sect. 2, the system model of power beacon-assisted CR-NOMA and energy harvesting is presented. Section 3 considers the outage probability and asymptotic OP analysis of such NOMA applied together with wireless power transfer (WPT). The numerical simulations are conducted in Sect. 4, and we conclude with some remarks in Sect. 5.
2 System Model

Consider a cooperative NOMA system in an underlay cognitive radio network consisting of a primary destination PD, a power beacon B, a secondary source BS with N antennas (n = 1, . . . , N), and two secondary destinations D_1, D_2, as shown in Fig. 1. Assume that there is a direct path between BS and the secondary destination D_2. Suppose that each of the other nodes has a single antenna and operates in half-duplex mode. BS transmits the signal to the secondary destinations according to the NOMA principle. A transmission frame of the network consists of two equal-length phases. In the first phase, BS transmits a signal to D_1 and D_2. In the second phase, D_1 transmits the re-encoded signal to the secondary destinations only if the received signal at D_1 is decoded successfully; D_1 harvests energy from B and then uses this energy to transmit signals [19]. The wireless channels denoted in Fig. 1 are subject to Rayleigh flat fading plus additive white Gaussian noise. The complex channel coefficients for the links BS → D_1, BS → D_2, BS → PD, B → D_1, D_1 → PD, and D_1 → D_2 are h_{1n} ∼ CN(0, λ_{1n}), h_{2n} ∼ CN(0, λ_{2n}), h_{p1} ∼ CN(0, λ_{p1}), h_b ∼ CN(0, λ_b), h_{p2} ∼ CN(0, λ_{p2}), and h_3 ∼ CN(0, λ_3), respectively. In the first phase, the received signal at D_1 is given by

y_{D_1} = \sqrt{P_{BS}}\, h_{1n} \left(\sqrt{\beta_1}\, s_1 + \sqrt{\beta_2}\, s_2\right) + \mu_{D_1}, \quad (1)
where $P_{BS}$ is the transmit power of BS, $s_i$ (i = 1, 2) is the information symbol for $D_i$, $\beta_i$ is the power allocation coefficient for $s_i$ with $\beta_1 + \beta_2 = 1$ and $\beta_1 < \beta_2$, and $\mu_{D_1}$ is the AWGN at $D_1$ with $\mu_{D_1} \sim \mathcal{CN}(0, \omega_0)$.
Fig. 1. System model of power beacon-assisted CR-NOMA
The signal-to-interference-plus-noise ratio (SINR) at $D_1$ to decode $s_2$ is given by

$$\gamma_{D_1}^{(x_2)} = \frac{\beta_2 P_{BS} |h_{1n}|^2}{\beta_1 P_{BS} |h_{1n}|^2 + \omega_0}. \qquad (2)$$
After imperfect SIC, the SINR to decode $s_1$ is given by

$$\gamma_{D_1}^{(x_1)} = \frac{\beta_1 P_{BS} |h_{1n}|^2}{\omega_0}. \qquad (3)$$
The observation at $D_2$ over the direct link is written as

$$y_{BS-D_2} = \sqrt{P_{BS}}\, h_{2n} \left( \sqrt{\beta_1}\, s_1 + \sqrt{\beta_2}\, s_2 \right) + \mu_{D_2}, \qquad (4)$$

where $\mu_{D_2}$ is the AWGN at $D_2$ with $\mu_{D_2} \sim \mathcal{CN}(0, \omega_0)$. The received SINR at $D_2$ to detect $s_2$ over the direct link is given by

$$\gamma_{BS-D_2}^{(x_2)} = \frac{\beta_2 P_{BS} |h_{2n}|^2}{\beta_1 P_{BS} |h_{2n}|^2 + \omega_0}. \qquad (5)$$
In the second phase, since the decode-and-forward (DF) relaying protocol is invoked at $D_1$, we assume that $D_1$ can decode and forward the signal $s_2$ to $D_2$ successfully over the relaying link:

$$y_{D_1-D_2} = \sqrt{P_{D_1}}\, h_3 s_2 + \mu_{D_2}, \qquad (6)$$

where $P_{D_1}$ is the transmit power of $D_1$.
Therefore, the SNR to detect $s_2$, which is transmitted in the second hop from $D_1$ to $D_2$, is given as

$$\gamma_{D_1-D_2}^{(x_2)} = \frac{P_{D_1} |h_3|^2}{\omega_0}. \qquad (7)$$
The antenna can be selected to strengthen the BS → $D_i$ (i = 1, 2) link as follows [19]:

$$n^{*} = \arg \max_{n = 1, \ldots, N} |h_{in}|^2. \qquad (8)$$
The CDF and PDF of the selected channels are given as [20]

$$F_{|h_{in}|^2}(x) = 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{nx}{\lambda_{in}}\right), \qquad (9)$$

and

$$f_{|h_{in}|^2}(x) = \sum_{n=1}^{N} \binom{N}{n} \frac{n (-1)^{n-1}}{\lambda_{in}} \exp\left(-\frac{nx}{\lambda_{in}}\right). \qquad (10)$$
In the considered system, the relay $D_1$ harvests energy from the power beacon, and the harvested energy at $D_1$ supports the second stage of signal processing. In the energy harvesting phase, the time-switching (TS) based energy harvesting technique is applied. In a transmission block of time $T$ (in which a block of information is sent from the beacon to the relay), the relay spends $\chi T$ harvesting energy from the beacon, where $\chi$ is the energy harvesting time fraction that depends on the schedule of B. The remaining time $(1-\chi)T$ is then divided into two equal time slots for the BS → $D_1$ and $D_1$ → $D_2$ transmissions. Therefore, the energy harvested at $D_1$ is given as [19]

$$E_{D_1}^{H} = \theta P_B \chi T |h_b|^2, \qquad (11)$$
where $0 < \theta < 1$ is the efficiency coefficient of the energy conversion process, $0 < \chi < 1$ is the fraction of time spent harvesting energy, and $P_B$ is the transmit power of the beacon (all power beacons are assumed to have the same power level). Under the assumption that the processing energy at $D_1$ is negligible, the transmit power of $D_1$ is written as [19]

$$P_{D_1}^{EH} = \frac{2 \theta P_B \chi |h_b|^2}{1 - \chi}. \qquad (12)$$
In order to guarantee the quality-of-service requirement of PD, the interference power at PD must be kept below a tolerable interference constraint $H$. The transmit powers of BS and $D_1$ are therefore limited by $P_{BS} \le \min\left(\frac{H}{|h_{p1}|^2},\, P_{BS}^{\max}\right)$ and $P_{D_1} \le \min\left(\frac{H}{|h_{p2}|^2},\, P_{D_1}^{EH},\, P_{D_1}^{\max}\right)$ [21, 22], where $P_{BS}^{\max}$ and $P_{D_1}^{\max}$ are the maximum available powers of BS and $D_1$, respectively.
3 Outage Probability Analysis

3.1 The Outage Probability of D1

According to the NOMA protocol, the complementary event of an outage at $D_1$ is that $D_1$ can detect $s_2$ as well as its own message $s_1$. From this description, the outage probability of $D_1$ is expressed as [23]

$$OP_1 = \Pr\left( \min\left( \gamma_{D_1}^{(x_2)},\, \gamma_{D_1}^{(x_1)} \right) < \varepsilon_1 \right) = 1 - \Pr\left( |h_{1n^*}|^2 > \frac{\varepsilon_1 \omega_0}{(\beta_2 - \varepsilon_1 \beta_1) P_{BS}},\; |h_{1n^*}|^2 > \frac{\varepsilon_1 \omega_0}{\beta_1 P_{BS}} \right) = 1 - \underbrace{\Pr\left( |h_{1n^*}|^2 > \frac{\xi}{P_{BS}} \right)}_{A}, \qquad (13)$$

where $\varepsilon_i = 2^{2 R_i} - 1$ (i = 1, 2), with $R_i$ being the target rate at $D_i$ to detect $s_i$, and $\xi = \max\left( \frac{\varepsilon_1 \omega_0}{\beta_2 - \varepsilon_1 \beta_1},\, \frac{\varepsilon_1 \omega_0}{\beta_1} \right)$.

Theorem 1. The closed-form expression for the outage probability of $D_1$ is given by
$$OP_1 = 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \xi}{P_{BS}^{\max} \lambda_{1n}}\right) \times \left[ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right] - \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H \lambda_{1n}}{n \xi \lambda_{p1} + m H \lambda_{1n}} \exp\left( -\left( \frac{n \xi}{H \lambda_{1n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right). \qquad (14)$$
Proof: See Appendix A.

3.2 The Outage Probability of D2

For the second scenario, the outage events of $D_2$ are described below. One event occurs when $s_2$ can be detected at $D_1$, but the received SINR after SIC at $D_2$ in one slot is less than its target SNR. Another event occurs when neither $D_1$ nor $D_2$ can detect $s_2$. Therefore, the outage probability of $D_2$ is expressed as [24]

$$OP_2 = \Pr\left( \max\left( \gamma_{BS-D_2}^{(x_2)},\, \min\left( \gamma_{D_1}^{(x_2)},\, \gamma_{D_1-D_2}^{(x_2)} \right) \right) < \varepsilon_2 \right) = \left[ 1 - \underbrace{\Pr\left( \gamma_{BS-D_2}^{(x_2)} \ge \varepsilon_2 \right)}_{B_1} \right] \underbrace{\Pr\left( \min\left( \gamma_{D_1}^{(x_2)},\, \gamma_{D_1-D_2}^{(x_2)} \right) < \varepsilon_2 \right)}_{B_2}. \qquad (15)$$

Theorem 2. The closed-form expression for the outage probability of $D_2$ with the direct link is given by
$$OP_2 = \left\{ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \varepsilon_2 \omega_0}{P_{BS}^{\max} (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}}\right) \left[ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right] - \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}}{n \varepsilon_2 \omega_0 \lambda_{p1} + m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}} \exp\left( -\left( \frac{n \varepsilon_2 \omega_0}{H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right) \right\} \times \left\{ 1 - \left[ \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \varepsilon_2 \omega_0}{P_{BS}^{\max} (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}}\right) \left( 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right) + \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}}{n \varepsilon_2 \omega_0 \lambda_{p1} + m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}} \exp\left( -\left( \frac{n \varepsilon_2 \omega_0}{H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right) \right] \times \frac{H \lambda_3}{\omega_0 \varepsilon_2 \lambda_{p2} + H \lambda_3} \exp\left( -\frac{\omega_0 \varepsilon_2}{P_{D_1}^{\max} \lambda_3} \right) \sqrt{\frac{2 \omega_0 \varepsilon_2 (1-\chi)}{\theta P_B \chi \lambda_3 \lambda_b}}\, K_1\left( \sqrt{\frac{2 \omega_0 \varepsilon_2 (1-\chi)}{\theta P_B \chi \lambda_3 \lambda_b}} \right) \right\}. \qquad (16)$$
Proof: See Appendix B.

3.3 Asymptotic Outage Probability Analysis

When $P_{BS}^{\max} \to \infty$, the asymptotic performance of $D_1$ and $D_2$ can be obtained. The asymptotic outage probabilities of the first and second users are

$$OP_{1-asym} = 1 - \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H \lambda_{1n}}{n \xi \lambda_{p1} + m H \lambda_{1n}}, \qquad (17)$$

$$OP_{2-asym} = \left\{ 1 - \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}}{n \varepsilon_2 \omega_0 \lambda_{p1} + m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}} \right\} \times \left\{ 1 - \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}}{n \varepsilon_2 \omega_0 \lambda_{p1} + m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}} \times \frac{H \lambda_3}{\omega_0 \varepsilon_2 \lambda_{p2} + H \lambda_3} \sqrt{\frac{2 \omega_0 \varepsilon_2 (1-\chi)}{\theta P_B \chi \lambda_3 \lambda_b}}\, K_1\left( \sqrt{\frac{2 \omega_0 \varepsilon_2 (1-\chi)}{\theta P_B \chi \lambda_3 \lambda_b}} \right) \right\}. \qquad (18)$$
4 Numerical Results and Simulations

In this section, we compare the outage performance of the two NOMA users, which are grouped in the downlink over Rayleigh fading channels, under different simulated parameters. The outage probability versus the transmit SNR at the BS is illustrated in Fig. 2, where we consider two main scenarios. Different power allocation coefficients are assigned to the two users, and hence the outage performance of the first user differs from that of the second user. It can also be seen that more antennas result in lower outage.
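To make the simulation setup concrete, the following is a minimal Monte Carlo sketch, written by us for illustration rather than taken from the paper, that estimates $OP_1$ directly from Eq. (13); the parameter values mirror the captions of Figs. 2–5, and the variable names and trial count are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)

# Parameters mirroring the captions of Figs. 2-5 (illustrative).
N, beta1, R1 = 3, 0.2, 0.5
beta2 = 1.0 - beta1
eps1 = 2 ** (2 * R1) - 1
omega0 = 1.0
Pbs_max = 10 ** (50 / 10)      # P_BS^max / omega_0 = 50 dB
H = 10 ** (30 / 10)            # H / omega_0 = 30 dB
lam_1n = lam_p1 = 1.0
trials = 10 ** 6

# Rayleigh fading: channel gains |h|^2 are exponential with mean lambda.
g1 = rng.exponential(lam_1n, size=(trials, N)).max(axis=1)   # TAS, Eq. (8)
gp1 = rng.exponential(lam_p1, size=trials)                   # BS -> PD link

Pbs = np.minimum(H / gp1, Pbs_max)   # underlay power constraint
snr_s2 = beta2 * Pbs * g1 / (beta1 * Pbs * g1 + omega0)      # Eq. (2)
snr_s1 = beta1 * Pbs * g1 / omega0                           # Eq. (3)
print("simulated OP1 =", np.mean(np.minimum(snr_s2, snr_s1) < eps1))

Averaging the outage indicator over many channel draws in this way gives an empirical curve that a closed form such as Eq. (14) can be checked against.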
Fig. 2. Outage performance comparison of $D_1$ and $D_2$ versus $P_{BS}^{\max}/\omega_0$ by varying N ($\beta_1 = 0.2$, $\theta = 0.8$, $\chi = 0.6$, $R_1 = R_2 = 0.5$ bps/Hz, $\lambda_{1n} = \lambda_{2n} = \lambda_3 = \lambda_b = \lambda_{p1} = \lambda_{p2} = 1$, $P_{D_1}^{\max}/\omega_0 = 50$ dB, $H/\omega_0 = P_B/\omega_0 = 30$ dB).

Fig. 3. Outage performance comparison of $D_1$ and $D_2$ versus $P_{BS}^{\max}/\omega_0$ by varying $\beta_1$ ($\theta = 0.8$, $\chi = 0.6$, $R_1 = R_2 = 0.5$ bps/Hz, $\lambda_{1n} = \lambda_{2n} = \lambda_3 = \lambda_b = \lambda_{p1} = \lambda_{p2} = 1$, $P_{D_1}^{\max}/\omega_0 = 50$ dB, $H/\omega_0 = P_B/\omega_0 = 30$ dB, N = 3).

Fig. 4. Outage performance comparison of $D_1$ and $D_2$ versus $P_{BS}^{\max}/\omega_0$ by varying $R_1 = R_2$ ($\beta_1 = 0.2$, $\theta = 0.8$, $\chi = 0.6$, $\lambda_{1n} = \lambda_{2n} = \lambda_3 = \lambda_b = \lambda_{p1} = \lambda_{p2} = 1$, $P_{D_1}^{\max}/\omega_0 = 50$ dB, $H/\omega_0 = P_B/\omega_0 = 30$ dB, N = 3).

Fig. 5. Outage performance comparison of $D_1$ and $D_2$ versus $P_{BS}^{\max}/\omega_0$ ($R_1 = R_2 = 0.5$ bps/Hz).
When the SNR is greater than 30 dB, the outage probabilities for these cases flatten into a straight line, meaning that they reach saturation. In addition, imperfect SIC at the first user yields worse outage performance compared with the perfect case. Considering the outage performance of the two users versus the transmit SNR with different power allocation factors, as in Fig. 3, the users' performance changes based on the amount of power allocated. A higher R leads to better outage performance at the first user, and the trends of the outage curves are similar, as shown in Fig. 4. Finally, when considering how the transmit SNR at the power beacon impacts the outage probability, similar performance can be seen in Fig. 5.
5 Conclusions

In this paper, we investigated a CR-NOMA scheme in an underlay cognitive radio network enabled with energy harvesting and transmit antenna selection. To this end, exact closed-form expressions and asymptotic expressions for the outage probabilities of the two users were derived under imperfect SIC. Furthermore, the direct link between BS and the far user was utilized to convey information, and a diversity order of one was obtained for the distant user. In future work, we will consider the problem of secure CR-NOMA with a multiple-input single-output (MISO) architecture applying the transmit antenna selection (TAS) technique.

Acknowledgments. The research leading to these results was supported by the Czech Ministry of Education, Youth and Sports under project reg. no. SP2021/25 and partially under the e-INFRA CZ project ID:90140. The authors would like to thank the anonymous reviewers for their helpful comments and suggestions. This work is a part of the basic science research program CS2020-21 funded by Saigon University.
Appendix A: Proof of Theorem 1

From (13), A can be formulated by

$$A = \underbrace{\Pr\left( |h_{1n^*}|^2 > \frac{\xi}{P_{BS}^{\max}},\; |h_{p1^*}|^2 < \frac{H}{P_{BS}^{\max}} \right)}_{A_1} + \underbrace{\Pr\left( |h_{1n^*}|^2 > \frac{\xi |h_{p1^*}|^2}{H},\; |h_{p1^*}|^2 > \frac{H}{P_{BS}^{\max}} \right)}_{A_2}. \qquad (A.1)$$

Here, $A_1$ can be calculated as

$$A_1 = \Pr\left( |h_{1n^*}|^2 > \frac{\xi}{P_{BS}^{\max}} \right) \Pr\left( |h_{p1^*}|^2 < \frac{H}{P_{BS}^{\max}} \right) = \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \xi}{P_{BS}^{\max} \lambda_{1n}}\right) \times \left[ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right]. \qquad (A.2)$$
In a similar way, $A_2$ can be calculated as

$$A_2 = \Pr\left( |h_{1n^*}|^2 > \frac{\xi |h_{p1^*}|^2}{H},\; |h_{p1^*}|^2 > \frac{H}{P_{BS}^{\max}} \right) = \int_{H/P_{BS}^{\max}}^{\infty} \left[ 1 - F_{|h_{1n^*}|^2}\!\left( \frac{\xi x}{H} \right) \right] f_{|h_{p1^*}|^2}(x)\, dx = \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m}{\lambda_{p1}} \int_{H/P_{BS}^{\max}}^{\infty} \exp\left( -\left( \frac{n \xi}{H \lambda_{1n}} + \frac{m}{\lambda_{p1}} \right) x \right) dx = \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H \lambda_{1n}}{n \xi \lambda_{p1} + m H \lambda_{1n}} \exp\left( -\left( \frac{n \xi}{H \lambda_{1n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right). \qquad (A.3)$$
From (A.2) and (A.3), A can be written as

$$A = \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \xi}{P_{BS}^{\max} \lambda_{1n}}\right) \times \left[ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right] + \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H \lambda_{1n}}{n \xi \lambda_{p1} + m H \lambda_{1n}} \exp\left( -\left( \frac{n \xi}{H \lambda_{1n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right). \qquad (A.4)$$
Plugging (A.4) into (13), $OP_1$ is obtained as in the proposition. This completes the proof.
Appendix B: Proof of Theorem 2

From (15), $B_1$ can be formulated by

$$B_1 = \Pr\left( \gamma_{BS-D_2}^{(x_2)} \ge \varepsilon_2 \right) = \underbrace{\Pr\left( |h_{2n^*}|^2 \ge \frac{\varepsilon_2 \omega_0}{P_{BS}^{\max} (\beta_2 - \varepsilon_2 \beta_1)},\; |h_{p1^*}|^2 \le \frac{H}{P_{BS}^{\max}} \right)}_{B_{1a}} + \underbrace{\Pr\left( |h_{2n^*}|^2 \ge \frac{\varepsilon_2 \omega_0 |h_{p1^*}|^2}{H (\beta_2 - \varepsilon_2 \beta_1)},\; |h_{p1^*}|^2 > \frac{H}{P_{BS}^{\max}} \right)}_{B_{1b}}. \qquad (B.1)$$

Next, $B_{1a}$ can be calculated as

$$B_{1a} = \Pr\left( |h_{2n^*}|^2 \ge \frac{\varepsilon_2 \omega_0}{P_{BS}^{\max} (\beta_2 - \varepsilon_2 \beta_1)} \right) \times \left[ 1 - \Pr\left( |h_{p1^*}|^2 \ge \frac{H}{P_{BS}^{\max}} \right) \right] = \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \varepsilon_2 \omega_0}{P_{BS}^{\max} (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}}\right) \times \left[ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right]. \qquad (B.2)$$

From (B.1), $B_{1b}$ can be calculated as
$$B_{1b} = \Pr\left( |h_{2n^*}|^2 \ge \frac{\varepsilon_2 \omega_0 |h_{p1^*}|^2}{H (\beta_2 - \varepsilon_2 \beta_1)},\; |h_{p1^*}|^2 > \frac{H}{P_{BS}^{\max}} \right) = \int_{H/P_{BS}^{\max}}^{\infty} \left[ 1 - F_{|h_{2n^*}|^2}\!\left( \frac{\varepsilon_2 \omega_0 x}{H (\beta_2 - \varepsilon_2 \beta_1)} \right) \right] f_{|h_{p1^*}|^2}(x)\, dx = \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m}{\lambda_{p1}} \int_{H/P_{BS}^{\max}}^{\infty} \exp\left( -\left( \frac{n \varepsilon_2 \omega_0}{H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}} + \frac{m}{\lambda_{p1}} \right) x \right) dx = \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}}{n \varepsilon_2 \omega_0 \lambda_{p1} + m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}} \exp\left( -\left( \frac{n \varepsilon_2 \omega_0}{H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right). \qquad (B.3)$$
From (B.2) and (B.3), $B_1$ can be written as

$$B_1 = \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \varepsilon_2 \omega_0}{P_{BS}^{\max} (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}}\right) \times \left[ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right] + \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}}{n \varepsilon_2 \omega_0 \lambda_{p1} + m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}} \exp\left( -\left( \frac{n \varepsilon_2 \omega_0}{H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{2n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right). \qquad (B.4)$$
From (15), $B_2$ can be formulated by

$$B_2 = \Pr\left( \min\left( \gamma_{D_1}^{(x_2)},\, \gamma_{D_1-D_2}^{(x_2)} \right) < \varepsilon_2 \right) = 1 - \underbrace{\Pr\left( \gamma_{D_1}^{(x_2)} \ge \varepsilon_2 \right)}_{B_{2a}} \underbrace{\Pr\left( \gamma_{D_1-D_2}^{(x_2)} \ge \varepsilon_2 \right)}_{B_{2b}}. \qquad (B.5)$$

$B_{2a}$ can be formulated in the same way as $B_1$; we have

$$B_{2a} = \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \varepsilon_2 \omega_0}{P_{BS}^{\max} (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}}\right) \times \left[ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right] + \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}}{n \varepsilon_2 \omega_0 \lambda_{p1} + m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}} \exp\left( -\left( \frac{n \varepsilon_2 \omega_0}{H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right). \qquad (B.6)$$

From (B.5), $B_{2b}$ can be calculated as
$$B_{2b} = \Pr\left( \gamma_{D_1-D_2}^{(x_2)} \ge \varepsilon_2 \right) = \Pr\left( \min\left( P_{D_1}^{EH},\, P_{D_1}^{\max},\, H/|h_{p2}|^2 \right) \ge \frac{\omega_0 \varepsilon_2}{|h_3|^2} \right) = \Pr\left( P_{D_1}^{\max} \ge \frac{\omega_0 \varepsilon_2}{|h_3|^2} \right) \Pr\left( P_{D_1}^{EH} \ge \frac{\omega_0 \varepsilon_2}{|h_3|^2} \right) \Pr\left( \frac{H}{|h_{p2}|^2} \ge \frac{\omega_0 \varepsilon_2}{|h_3|^2} \right) = \exp\left( -\frac{\omega_0 \varepsilon_2}{P_{D_1}^{\max} \lambda_3} \right) \times \underbrace{\Pr\left( |h_3|^2 \ge \frac{\omega_0 \varepsilon_2 (1-\chi)}{2 \theta P_B \chi |h_b|^2} \right)}_{B_{2b}^{(1)}} \times \underbrace{\Pr\left( |h_3|^2 \ge \frac{\omega_0 \varepsilon_2 |h_{p2}|^2}{H} \right)}_{B_{2b}^{(2)}}. \qquad (B.7)$$
From (B.7), $B_{2b}^{(1)}$ can be calculated as

$$B_{2b}^{(1)} = \Pr\left( |h_3|^2 \ge \frac{\omega_0 \varepsilon_2 (1-\chi)}{2 \theta P_B \chi |h_b|^2} \right) = \int_{0}^{\infty} \exp\left( -\frac{\omega_0 \varepsilon_2 (1-\chi)}{2 \theta P_B \chi \lambda_3 x} \right) \frac{1}{\lambda_b} \exp\left( -\frac{x}{\lambda_b} \right) dx = \sqrt{\frac{2 \omega_0 \varepsilon_2 (1-\chi)}{\theta P_B \chi \lambda_3 \lambda_b}}\, K_1\left( \sqrt{\frac{2 \omega_0 \varepsilon_2 (1-\chi)}{\theta P_B \chi \lambda_3 \lambda_b}} \right). \qquad (B.8)$$

It is worth noting that the last equality follows from $\int_0^{\infty} \exp\left( -\frac{\delta}{4x} - \varphi x \right) dx = \sqrt{\frac{\delta}{\varphi}}\, K_1\left( \sqrt{\delta \varphi} \right)$ in [25, Eq. (3.324)].

From (B.7), $B_{2b}^{(2)}$ can be calculated as

$$B_{2b}^{(2)} = \Pr\left( |h_3|^2 \ge \frac{\omega_0 \varepsilon_2 |h_{p2}|^2}{H} \right) = \frac{1}{\lambda_{p2}} \int_{0}^{\infty} \exp\left( -\left( \frac{\omega_0 \varepsilon_2}{H \lambda_3} + \frac{1}{\lambda_{p2}} \right) x \right) dx = \frac{H \lambda_3}{\omega_0 \varepsilon_2 \lambda_{p2} + H \lambda_3}. \qquad (B.9)$$
From (B.6), (B.7), (B.8), and (B.9), $B_2$ can be written as

$$B_2 = 1 - \left\{ \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n \varepsilon_2 \omega_0}{P_{BS}^{\max} (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}}\right) \left[ 1 - \sum_{n=1}^{N} \binom{N}{n} (-1)^{n-1} \exp\left(-\frac{n H}{P_{BS}^{\max} \lambda_{p1}}\right) \right] + \sum_{n=1}^{N} \sum_{m=1}^{N} \binom{N}{n} \binom{N}{m} (-1)^{n+m-2} \frac{m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}}{n \varepsilon_2 \omega_0 \lambda_{p1} + m H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}} \exp\left( -\left( \frac{n \varepsilon_2 \omega_0}{H (\beta_2 - \varepsilon_2 \beta_1) \lambda_{1n}} + \frac{m}{\lambda_{p1}} \right) \frac{H}{P_{BS}^{\max}} \right) \right\} \times \frac{H \lambda_3}{\omega_0 \varepsilon_2 \lambda_{p2} + H \lambda_3} \exp\left( -\frac{\omega_0 \varepsilon_2}{P_{D_1}^{\max} \lambda_3} \right) \sqrt{\frac{2 \omega_0 \varepsilon_2 (1-\chi)}{\theta P_B \chi \lambda_3 \lambda_b}}\, K_1\left( \sqrt{\frac{2 \omega_0 \varepsilon_2 (1-\chi)}{\theta P_B \chi \lambda_3 \lambda_b}} \right). \qquad (B.10)$$
Plugging (B.4) and (B.10) into (15), $OP_2$ is obtained as in the proposition. This completes the proof.
References

1. Haykin, S.: Cognitive radio: brain-empowered wireless communications. IEEE J. Sel. Areas Commun. 23(2), 201–220 (2005)
2. Goldsmith, A., Jafar, S., Maric, I., Srinivasa, S.: Breaking spectrum gridlock with cognitive radios: an information theoretic perspective. Proc. IEEE 97(5), 894–914 (2009)
3. Akyildiz, I.F., Lee, W.-Y., Vuran, M.C., Mohanty, S.: Next generation/dynamic spectrum access/cognitive radio wireless networks: a survey. Comput. Netw. 50(13), 2127–2159 (2006)
4. Namdar, M., Basgumus, A.: Outage performance analysis of underlay cognitive radio networks with decode-and-forward relaying. Cogn. Radio (2017)
5. Ghasemi, A., Sousa, E.: Fundamental limits of spectrum-sharing in fading environments. IEEE Trans. Wireless Commun. 6(2), 649–658 (2007)
6. Wang, L., Kim, K.J., Duong, T.Q., Elkashlan, M., Poor, H.V.: Security enhancement of cooperative single carrier systems. IEEE Trans. Inf. Forensics Secur. 10(1), 90–103 (2015)
7. Rodriguez, L.J., Tran, N.H., Duong, T.Q., Le-Ngoc, T., Elkashlan, M., Shetty, S.: Physical layer security in wireless cooperative relay networks: state of the art and beyond. IEEE Commun. Mag. 53(12), 32–39 (2015)
8. Sun, L., Zhang, T., Lu, L., Niu, H.: On the combination of cooperative diversity and multiuser diversity in multi-source multi-relay wireless networks. IEEE Signal Process. Lett. 17(6), 535–538 (2010)
9. Ju, M., Song, H.-K., Kim, I.-M.: Joint relay-and-antenna selection in multi-antenna relay networks. IEEE Trans. Commun. 58(12), 3417–3422 (2010)
10. Do, D.-T., Le, A.-T.: NOMA based cognitive relaying: transceiver hardware impairments, relay selection policies and outage performance comparison. Comput. Commun. 146, 144–154 (2019)
11. Martinek, R., Danys, L., Jaros, R.: Adaptive software defined equalization techniques for indoor visible light communication. Sensors 20(6), 1618 (2020)
12. Martinek, R., Danys, L., Jaros, R.: Visible light communication system based on software defined radio: performance study of intelligent transportation and indoor applications. Electronics 8(4), 433 (2019)
13. Martinek, R., et al.: Design of a measuring system for electricity quality monitoring within the smart street lighting test polygon: pilot study on adaptive current control strategy for three-phase shunt active power filters. Sensors 20(6), 1718 (2020)
14. Nasir, A.A., Zhou, X., Durrani, S., Kennedy, R.A.: Relaying protocols for wireless energy harvesting and information processing. IEEE Trans. Wireless Commun. 12(7), 3622–3636 (2013)
15. Le, T., Shin, O.: Wireless energy harvesting in cognitive radio with opportunistic relays selection. In: Proceedings of IEEE PIMRC, Hong Kong, pp. 949–953 (2015)
16. Banerjee, A., Paul, A., Maity, S.P.: Joint power allocation and route selection for outage minimization in multihop cognitive radio networks with energy harvesting. IEEE Trans. Cogn. Commun. Netw. 4(1), 82–92 (2018)
17. Wakaiki, M., Suto, K., Koiwa, K., Liu, K., Zanma, T.: A control theoretic approach for cell zooming of energy harvesting small cell networks. IEEE Trans. Green Commun. Netw. 3(2), 329–342 (2019). https://doi.org/10.1109/TGCN.2018.2889897
18. Wang, H., Wang, J., Ding, G., Wang, L., Tsiftsis, T.A., Sharma, P.K.: Resource allocation for energy harvesting-powered D2D communication underlaying UAV-assisted networks. IEEE Trans. Green Commun. Netw. 2(1), 14–24 (2018). https://doi.org/10.1109/TGCN.2017.2767203
19. Nguyen, N., Duong, T.Q., Ngo, H.Q., Hadzi-Velkov, Z., Shu, L.: Secure 5G wireless communications: a joint relay selection and wireless power transfer approach. IEEE Access 4, 3349–3359 (2016)
20. Fan, L., Yang, N., Duong, T.Q., Elkashlan, M., Karagiannidis, G.K.: Exploiting direct links for physical layer security in multiuser multirelay networks. IEEE Trans. Wireless Commun. 15(6), 3856–3867 (2016)
21. Ye, J., Liu, Z., Zhao, H., Pan, G., Ni, Q., Alouini, M.: Relay selections for cooperative underlay CR systems with energy harvesting. IEEE Trans. Cogn. Commun. Netw. 5(2), 358–369 (2019)
22. Im, G., Lee, J.H.: Outage probability for cooperative NOMA systems with imperfect SIC in cognitive radio networks. IEEE Commun. Lett. 23(4), 692–695 (2019)
23. Yue, X., Liu, Y., Kang, S., Nallanathan, A., Ding, Z.: Exploiting full/half-duplex user relaying in NOMA systems. IEEE Trans. Commun. 66(2), 560–575 (2018)
24. Lee, S., Benevides da Costa, D., Duong, T.Q.: Outage probability of non-orthogonal multiple access schemes with partial relay selection. In: 2016 IEEE 27th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Valencia (2016)
25. Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series and Products, 6th edn. Academic Press, New York (2000)
An Efficient Framework for Resource Allocation and Dynamic Pricing Scheme for Completion Time Failure in Cloud Computing

Anjan Bandyopadhyay¹, Vikash Kumar Singh², Sajal Mukhopadhyay¹, Ujjwal Rai¹, and Arghya Bandyopadhyay¹

¹ Department of Computer Science and Engineering, National Institute of Technology (NIT), Durgapur, India
[email protected], [email protected], [email protected], [email protected]
² School of Computer Science and Engineering, Vellore Institute of Technology, Amaravati, Andhra Pradesh, India
[email protected]
Abstract. Cloud computing, as an infrastructure-less service, has gained a lot of attention over the past decade. Resource allocation and pricing have been at the centre stage of cloud computing research for a while. In this paper, we propose an efficient resource allocation and dynamic pricing algorithm for completion time failure in cloud computing (RADPACTF). Theoretical analysis is also provided in support of the proposed algorithm.
1 Introduction

Cloud computing is a computing paradigm used for sharing on-demand computing services such as CPU and RAM. Various applications, such as Flickr for photo sharing and Dropbox and Google Docs for document storage, are based on cloud computing. Since cloud computing is an infrastructure-less service, instead of deploying actual infrastructure at the user end, a huge amount of virtual infrastructure has to be shared with the users. Two main challenges emerge in sharing these resources: (1) how to allocate system infrastructure such as CPU, RAM, and other hardware among several users, and (2) the payment made by the users to the service providers in exchange for the costly infrastructure they receive. A lot of work has been carried out on allocating infrastructure to the users who demand the services. Allocation of infrastructure mainly deals with two issues: (1) sharing of servers, and (2) scheduling of tasks over a time horizon [1]. Sharing of servers among users, together with the attendant security issues, is handled by virtual machines and container-based technology [1–3]. Scheduling of tasks arriving over a time horizon is pursued mainly with two objectives. The first is the service level objectives, generally termed SLOs (for example, meeting the deadlines of the tasks under consideration), addressed in [1, 4–6]. The second objective, aptly discussed in the literature, is cluster utilization [7–9]. The concepts about the system-level challenges that have evolved in the
literature have also been brought into practice in both public and private clouds [10–12], a good indication that the allocation perspective has matured to some extent [1]. As more and more individuals (persons, organizations, etc.) demand infrastructure-less services, it is also necessary to develop efficient pricing schemes (that is, how much money to demand based on the service being provided) coupled with adequate allocation schemes [1]. A few recent works have moved in this direction, with economically viable solutions as the prime focus [1, 13–16]. The pricing schemes currently deployed in cloud computing are:

• Fixed prepaid prices for a guaranteed quota of service [10, 17, 18].
• Unit prices according to the demand made for the resources to be utilized [12, 19].
• Unlike the other two, there is another interesting pricing scheme where an individual can show his maximum willingness to pay (a bidding-like scenario) based on the Spot instance price, though in this scheme there may be an interruption while getting the service [14–16, 20].

The main issue is to design an efficient dynamic pricing scheme coupled with resource allocation, in comparison with the pricing schemes used in current deployments of cloud computing. In [1], a framework has been proposed in this direction, where a value-based efficiency (one that maximizes social welfare) is developed while designing a good economic mechanism. In this paper, we extend their framework to the case when a user is unable to complete his task within the allocated time frame, i.e., within the completion time (this was mentioned as future work in [1]). In this setting, we address how to design an efficient dynamic pricing scheme when users are not able to perform their desired tasks within the stipulated completion time. We can summarize our contributions as follows:

• Providing flexibility to the users if they are unable to complete their tasks within the allocated time frame (stipulated completion time).
• Handling the situation when enough samples are not available (the initial scenario when the process starts).

The remainder of this paper is organized as follows. In Sect. 2, we describe our proposed system model. We then present our proposed mechanism in Sect. 3. Analysis of the proposed mechanism is carried out in Sect. 4. In Sect. 5, the paper is concluded and future directions are outlined.
2 System Model

The model deals with the resource allocation and pricing scheme in a cloud framework when users fail to meet the deadline and then resubmit their demands. Here, a cloud service provider has k resources, and each resource has a capacity (typical resources are RAM, cores, HDDs, etc.). This is denoted as $r = \{(r_1, a_1), (r_2, a_2), \ldots, (r_k, a_k)\}$, where the ith component $(r_i, a_i)$ represents the ith resource $r_i$ and its available capacity $a_i$. A user can choose any subset of resources.
For example, a user can demand two resources as $(r_1, \hat{a}_1)$ and $(r_2, \hat{a}_2)$, where $\hat{a}_i \le a_i$ or $\hat{a}_i > a_i$. Here, $\hat{a}_i$ is the capacity of resource $r_i$ requested by the user. If the demand of a user exceeds the capacity, i.e., $\hat{a}_i > a_i$, then that request will be rejected, as the supply is less than the demand. Another user may demand $(r_1, \hat{a}_1)$, $(r_2, \hat{a}_2)$, and $(r_4, \hat{a}_4)$. If a user provides a demand that is a proper subset of the available resources a cloud provider has, the system fills the demand for the other resources with zero. For example, if $r = \{(r_1, a_1), (r_2, a_2), (r_3, a_3), (r_4, a_4), (r_5, a_5)\}$ and a user renders $\{(r_2, \hat{a}_2), (r_4, \hat{a}_4)\}$, then the demand of the user will be treated as $\{(r_1, 0), (r_2, \hat{a}_2), (r_3, 0), (r_4, \hat{a}_4), (r_5, 0)\}$. The demand vector of the users is represented as $D = \{d_1, d_2, \ldots, d_n\}$, where we have n users, and the total unit demand of the ith user can be calculated as

$$\sum_{j=1}^{k} d_i \cdot a_j \qquad (1)$$

where $d_i \cdot a_j$ extracts the component actually placed in terms of capacity. For example, if a user places the demand (RAM, 2 GB), (cores, 4), then $\sum_{j=1}^{k} d_i \cdot a_j = 2 + 4 + 0$, considering the 3 resources (RAM, cores, HDDs) the cloud provider has. The total unit demand of n users, likewise, amounts to

$$\sum_{i=1}^{n} \sum_{j=1}^{k} d_i \cdot a_j \qquad (2)$$

This characterization will help us design a scalable algorithm for resource allocation when users fail to meet the deadline and resubmit their demands. Along with these resource attributes, the user also places a time restriction. Each user request is then denoted by the tuple $\{d_i, t_i, t_i^*\}$, where $d_i$ is already defined, $t_i$ is the time required to finish the requested job, and $t_i^*$ is the deadline to finish the job. For example, a user demand will be something like: needing simultaneously the resources (RAM, 2 GB) and (cores, 4), with the time restriction that the requested job requires $t_i$ time to complete and has to be completed within the deadline $t_i^*$. The ith user also provides a valuation (maximum willingness to pay) for the demand, denoted by $\theta_i$; each user request is then finally denoted by $\{d_i, t_i, t_i^*, \theta_i\}$. After collecting the demands, the allocations are to be made and the price for each user is to be calculated. The price vector of the users is denoted by $c = \{c_1, c_2, \ldots, c_n\}$, and the corresponding price of the demand $d_i$ is $d_i \cdot c_i$.
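To make the demand representation concrete, the following is a minimal Python sketch, ours rather than the paper's, of the zero-filled demand vector and the total unit demand of Eqs. (1) and (2); the resource names and capacities are illustrative assumptions.

resources = {"RAM": 8, "cores": 16, "HDD": 500}   # available capacities a_j

def fill_demand(request, resources):
    # Zero-fill the resources the user did not ask for.
    return {name: request.get(name, 0) for name in resources}

def unit_demand(demand):
    # Eq. (1): total units requested by one user.
    return sum(demand.values())

requests = [{"RAM": 2, "cores": 4}, {"RAM": 4, "HDD": 100}]
demands = [fill_demand(req, resources) for req in requests]
total = sum(unit_demand(d) for d in demands)      # Eq. (2)
print(demands, total)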
3 Proposed Mechanism: RADPACTF

In this section, a sketch of the proposed mechanism, RADPACTF, is presented first, and then the detailed methods are discussed.

3.1 Sketch of RADPACTF

In this section the underlying idea of RADPACTF is discussed.
RADPACTF

• First, the demands are collected for a stipulated time (say, 15 min before running the actual algorithm) and stored in a list.
• Then the list of demands is randomized and processed in that order, one by one. The list is randomized to give each user an equal opportunity to be processed first (a priority-based processing may be used instead; in that case we can first schedule the requests of the frequent visitors who have defaulted a few times, and to avoid the cold start problem an ε-greedy algorithm may be employed). If the first user meets the available supply criteria, we tentatively allocate the user; otherwise we reject the user. Based on the remaining supply, the next user is addressed, and the process repeats.
• To set the price of the user being processed now, we look at the users that were processed earlier over a period of time and collect a good sample of them. The actual price calculation for an agent (user and agent will be used interchangeably) is elaborated in Sect. 3.1.1 with a detailed formulation. This helps us design the (dynamic) pricing scheme based on past demand right from the beginning.
• After setting the price $c_i$ of the ith agent, the tentative allocation is confirmed if $c_i \le \theta_i$; otherwise the agent's demand is rejected for this round. Then the next agent is processed, and so on.
• How $c_i$ is actually calculated is discussed in the next section.

3.1.1 Price Calculation

For the price calculation, we first consider the $t_i^*$ of agent i. The time window for agent i is shown in Fig. 1.
Fig. 1. Time span of the ith user (from when this round starts to $t_i^*$)
• Let us denote the time from when the present round starts by $T_j$. The time window for collecting the samples for setting the price of the current agent i is $(T_j, t_i^*)$. Within this time window there were several time periods in which requests were executed earlier. Out of this time window, we first select $s = \{s_1, \ldots, s_{l^*}\}$ time periods randomly, as shown in Fig. 2.
Fig. 2. Sampling options: (a) total time periods; (b) selected time periods
• Say we select the red-colored time periods. In each of these time periods we received many requests. We extract and create a list $L = \{L_1, \ldots, L_{l^*}\}$ of the requests that were winners. Each $L_i$ is then searched for requests similar to the one the current agent has made. What counts as a similar request may vary depending on the application, the present demand, the objective of the service provider, etc. For example, say the agent being processed has requested (6 GB, 4 cores). For similarity checking we can take (6 GB, 3 cores) or (5 GB, 4 cores) or others as similar. In this process, out of all the samples we collect, we separate L into two lists, one with similar requests and the other with dissimilar requests, denoted by $L = \{L^*, L^{**}\}$.
• To set the price, we take a weighted combination of the average prices of $L^*$ and $L^{**}$ as follows:

$$p_i = \frac{\sum_{i=1}^{|L^*|} d_i \cdot c_i}{\sum_{i=1}^{|L^*|} \sum_{j=1}^{k} d_i \cdot a_j} + \gamma\, \frac{\sum_{i=1}^{|L^{**}|} d_i \cdot c_i}{\sum_{i=1}^{|L^{**}|} \sum_{j=1}^{k} d_i \cdot a_j} \qquad (3)$$

Here $0 < \gamma < 1$, and we can set it based on our objective. For example, $\gamma = 0.01$ gives less weight to the dissimilar requests when setting the price of the agent being processed, while $\gamma = 0.5$ gives them more weight.

3.2 Detailing of RADPACTF

Based on the sketch of the proposed mechanism and the price calculation, a formal algorithm (named resource allocation and dynamic pricing algorithm for completion time failure in cloud computing, RADPACTF) is presented in Algorithm 1 and Algorithm 2. Algorithm 1 is the Main() routine, where we collect and process the demands of the users. The first for loop collects the demands into the list l.
Algorithm 1: Main

begin
    Let T_j be the present time.
    Let x be the time from when the collection of demands started.
    for ∀y ∈ {T_j − x, T_j} do
        l ← l ∪ l_y
    end
    l ← rand(l)                       /* randomize the list */
    for i = 1 to |l| do
        bool ← true
        for k = 1 to |l_i(d_i)| do
            if l_i(d_i · â_k) ≤ a_k for all t ∈ (T_j, t*) then
                a_k ← a_k − l_i(d_i · â_k)
            else
                bool ← false
            end
        end
        if bool = false then
            reset()                   /* this resets the a_k values again */
        end
        Price()
    end
    return
end
Then we randomize the list, which gives every user an equal opportunity to be processed first. The second (nested) for loop processes the list l in order. With the inner for loop, it checks whether the demand of the current user can be satisfied by the available supply throughout the user's deadline $t^*$. The reset() function is called if the user does not satisfy the available supply criteria. If the user satisfies the criteria, the Price() subroutine is called to set the price of the user dynamically based on previous demand. The Price() function sets the price to be paid by the current user. Its if condition takes care of the prices to be set when this type of system (for failed users) first begins to operate. Here, W is the number of rounds to be executed before going to the actual dynamic pricing.
Algorithm 2: Price

begin
    Initially, for several rounds, set the price based on the present demand with Σ_i Σ_j d_i · a_j
    if # of rounds ≤ W then           // W is set by the system
        p_i ← p̂_i                     // p̂_i is a system-generated threshold price
    else
        s ← rand(T_j, t_i*)           // randomly selected time periods
        L* ← process(s)               // process(s) separates s into list L*
        L** ← process(s)              // process(s) separates s into list L**
        p_i ← price computed by Eq. (3)
    end
    if p_i ≤ θ_i then
        final allocation made; c_i ← p_i
    else
        reject
    end
    return
end
The threshold price $\hat{p}_i$ is system generated and can also be set dynamically depending on the demand of the previous round: we can start with some $\hat{p}_i$ and then set it to $\varepsilon \cdot \hat{p}_i$ or $(1+\varepsilon) \cdot \hat{p}_i$, where $0 \le \varepsilon \le 1$, depending on the demand. In the else part, the samples from the time window $(T_j, t_i^*)$ are first collected, as shown in Fig. 2. After collecting the samples in s, they are divided into two separate lists, one with similar requests and the other with dissimilar requests, denoted by $\{L^*, L^{**}\}$. Then, by Eq. 3, the price for the user is set. With this price, the current user is checked against the condition $p_i \le \theta_i$. If it is satisfied, the final allocation of the user is made; otherwise her request is rejected.
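As an illustration only, here is a minimal Python sketch of the Price routine combining the warm-up threshold with the weighted rule of Eq. (3); the past-winner representation and all numeric values are our hypothetical assumptions, not part of the paper.

GAMMA, W, P_HAT = 0.01, 10, 5.0   # assumed weight, warm-up rounds, seed threshold

def avg_unit_price(winners):
    # Average price per unit of capacity over past winning requests,
    # where each winner is a pair (units_demanded, price_paid).
    if not winners:
        return 0.0
    return sum(p for _, p in winners) / sum(u for u, _ in winners)

def price(rounds_done, similar, dissimilar):
    # Warm-up: fall back to the system-generated threshold price p_hat.
    if rounds_done <= W:
        return P_HAT
    # Eq. (3): similar requests weighted fully, dissimilar ones by gamma.
    return avg_unit_price(similar) + GAMMA * avg_unit_price(dissimilar)

# Tentative allocation is confirmed only if the price respects the valuation.
p_i, theta_i = price(12, [(6, 12.0), (7, 13.5)], [(2, 3.0)]), 20.0
print("allocate" if p_i <= theta_i else "reject", round(p_i, 2))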
4 Analysis

In this section, we first calculate the expected number of users that can be allocated from the requests collected within a given time frame $\{T_j - x, T_j\}$ under two probability models. Finally, an insight into the truthfulness of the proposed algorithm is provided.

Lemma 1. Given that the probability of the ith user's request being fulfilled is $\frac{1}{i}$, the expected number of users successfully allocated is bounded by $\log_2 n + 1$.
Proof. In the time period $T_j - x$ to $T_j$, say l users have given their demands. Depending on the availability of the resources, some of the requests can be fulfilled and others may not be. We want to see how many requests in expectation can be fulfilled. The earlier a user is processed from l, the higher the chance of that user's demand being fulfilled. So we can take the probability of a request being fulfilled as $\frac{1}{i}$, where i is the position of the user being processed. For example, if she is processed as the 10th user, the probability of her request being fulfilled is $\frac{1}{10}$, which is quite realistic. For each user's request, we create a random variable $M_i$ that counts a successful request for the ith user. The total number of successful requests is then

$$M = M_1 + \cdots + M_n = \sum_{i=1}^{n} M_i \qquad (4)$$

Taking expectations on both sides,

$$E[M] = E\left[\sum_{i=1}^{n} M_i\right] = \sum_{i=1}^{n} E[M_i] = \sum_{i=1}^{n} \left( \frac{1}{i} \cdot 1 + \left(1 - \frac{1}{i}\right) \cdot 0 \right) = \sum_{i=1}^{n} \frac{1}{i} = H_n \le \log_2 n + 1, \qquad (5)$$

where the second equality follows from the linearity of expectation, $H_n$ is the harmonic series, and the last step uses the standard upper bound on $H_n$.
Harmonic series by upper bound of Hn
The probability model that we consider above may not be plausible if the service provider has good amount of resources to allocate. In this case, we can proceed like this: the probability of not getting successfully allocated can increase with the number of allocations already made including the ith user being processed now. So, that probability can be taken as ni . The interpretation is like: When the first user is considered, then she is not allocated is 1n which is very less and so on. With this interpretation we can present our next lemma. Lemma 2. Given the fact that the probability of the ith user’s request is not getting fulfilled is ni , the expected number of users successfully allocated is bounded by n−1 2
Proof. The probability that the ith user gets allocated is $\left(1 - \frac{i}{n}\right)$. The total number of allocations counted by the random variables is

$$M = M_1 + \cdots + M_n = \sum_{i=1}^{n} M_i \qquad \text{(same as before)}$$

Taking expectations on both sides,

$$E[M] = E\left[\sum_{i=1}^{n} M_i\right] = \sum_{i=1}^{n} E[M_i] = \sum_{i=1}^{n} \left( \left(1 - \frac{i}{n}\right) \cdot 1 + \frac{i}{n} \cdot 0 \right) = \sum_{i=1}^{n} \left(1 - \frac{i}{n}\right) = \frac{1}{n} \sum_{i=1}^{n} (n - i) = \frac{1}{n} \sum_{i=0}^{n-1} i = \frac{1}{n} \cdot \frac{(n-1)n}{2} = \frac{n-1}{2},$$

where we used $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ (the first term of $\sum_{i=0}^{n-1} i$ is 0).
Lemma 3. RADPACTF is truthful.

Proof. We have to show that if the ith user deviates from $\theta_i$ to $\hat{\theta}_i$, she cannot gain. For any arbitrary user i, the price calculation $p_i = \frac{\sum_{i=1}^{|L^*|} d_i \cdot c_i}{\sum_{i=1}^{|L^*|} \sum_{j=1}^{k} d_i \cdot a_j} + \gamma\, \frac{\sum_{i=1}^{|L^{**}|} d_i \cdot c_i}{\sum_{i=1}^{|L^{**}|} \sum_{j=1}^{k} d_i \cdot a_j}$ is independent of $\theta_i$. If the user reveals her true valuation $\theta_i$, then her utility is $u_i = (\text{valuation} - \text{payment}) = \theta_i - p_i$. Now, if she deviates and reports a value $\hat{\theta}_i$ other than $\theta_i$, her utility $\hat{u}_i$ will not change, as the price calculation (which is the payment to be paid by the user) is independent of her reported value for the demand she puts forward, and thereby the (valuation - payment) remains the same. Hence $u_i = \hat{u}_i$.
5 Conclusion and Future Works

When a service is allocated to users in cloud computing by a service provider, there may be a scenario in which the users are unable to complete, within the reported time, the task for which they demanded the service. For this situation, an efficient framework is proposed so that a user can extend the time required to finish the task, along with a dynamic pricing scheme for when the resources are allocated. In future work, the sample complexity for determining the price could be further optimized, so that the trade-off between faster and more accurate pricing decisions can be handled properly.

Acknowledgements. This work is supported by the Visvesvaraya Ph.D. scheme, sponsored by MeitY, Govt. of India, with grant number [PhD-MLA/4(29)/2014-15].
References

1. Babaioff, M., et al.: ERA: a framework for economic resource allocation for the cloud. In: Proceedings of the 26th International Conference on World Wide Web Companion, WWW 2017 Companion, pp. 635–642. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE (2017)
2. Pahl, C., Brogi, A., Soldani, J., Jamshidi, P.: Cloud container technologies: a state-of-the-art review. IEEE Trans. Cloud Comput. 7, 677–692 (2019)
3. Park, J., Kim, D., Yeom, K.: An approach for reconstructing applications to develop container-based microservices. Mob. Inf. Syst. 2020, 1–23 (2020). Article id: 4295937
4. Ferguson, A.D., Bodik, P., Kandula, S., Boutin, E., Fonseca, R.: Jockey: guaranteed job latency in data parallel clusters. In: Proceedings of the 7th ACM European Conference on Computer Systems, EuroSys 2012, pp. 99–112. ACM, New York (2012)
5. Tumanov, A., Zhu, T., Park, J.W., Kozuch, M.A., Harchol-Balter, M., Ganger, G.R.: TetriSched: global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters. In: Proceedings of the 11th European Conference on Computer Systems, EuroSys 2016. ACM, New York (2016)
6. Griebler, D., Vogel, A., De Sensi, D., Danelutto, M., Fernandes, L.G.: Simplifying and implementing service level objectives for stream parallelism. J. Supercomput. 76, 4603–4628 (2020)
7. Rasley, J., Karanasos, K., Kandula, S., Fonseca, R., Vojnovic, M., Rao, S.: Efficient queue management for cluster scheduling. In: Proceedings of the 11th European Conference on Computer Systems, EuroSys 2016. ACM, New York (2016)
8. Ousterhout, K., Wendell, P., Zaharia, M., Stoica, I.: Sparrow: scalable scheduling for sub-second parallel jobs. Technical Report No. UCB/EECS-2013-29, EECS Department, University of California, Berkeley (2013)
9. Grandl, R., Ananthanarayanan, G., Kandula, S., Rao, S., Akella, A.: Multi-resource packing for cluster schedulers. SIGCOMM Comput. Commun. Rev. 44, 455–466 (2014)
10. Verma, A., Pedrosa, L., Korupolu, M., Oppenheimer, D., Tune, E., Wilkes, J.: Large-scale cluster management at Google with Borg. In: Proceedings of the 10th European Conference on Computer Systems, EuroSys 2015. ACM, New York (2015)
11. Hindman, B., et al.: Mesos: a platform for fine-grained resource sharing in the data center. In: NSDI 2011, pp. 295–308 (2011)
12. Sarkar, D.: Introducing HDInsight. In: Pro Microsoft HDInsight. Apress, Berkeley (2014). https://doi.org/10.1007/978-1-4302-6056-1_1
13. Lee, I.: Pricing schemes and profit-maximizing pricing for cloud services. J. Revenue Pricing Manage. 18, 112–122 (2019)
14. Chun, S.-H.: Cloud services and pricing strategies for sustainable business models: analytical and numerical approaches. Sustainability 12, 49 (2020)
15. Bhan, R., Singh, A., Pamula, R., Faruki, P.: Auction based scheme for resource allotment in cloud computing. In: Patnaik, S., Yang, X.-S., Tavana, M., Popentiu-Vladicescu, F., Qiao, F. (eds.) Digital Business. LNDECT, vol. 21, pp. 119–141. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-93940-7_5
16. Ni, T., Chen, Z., Chen, L., Zhong, H., Zhang, S., Xu, Y.: Differentially private combinatorial cloud auction. arXiv preprint arXiv:2001.00694 (2020)
17. Boutin, E., et al.: Apollo: scalable and coordinated scheduling for cloud-scale computing. In: Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation, OSDI 2014, pp. 285–300. USENIX Association, USA (2014)
18. Mazrekaj, A., Shabani, I., Sejdiu, B.: Pricing schemes in cloud computing: an overview. Int. J. Adv. Comput. Sci. Appl. 7 (2016)
19. Dimitri, N.: Pricing cloud IaaS computing services. J. Cloud Comput. 9, 14 (2020). https://doi.org/10.1186/s13677-020-00161-2
20. Song, Y., Zafer, M., Lee, K.-W.: Optimal bidding in spot instance market. In: Proceedings of IEEE INFOCOM 2012, pp. 190–198 (2012)
Inferring Anomalies from Cloud Metrics Using Recurrent Neural Networks

Spyridon Chouliaras and Stelios Sotiriadis

Birkbeck, University of London, Malet Street, Bloomsbury, London WC1E 7HX, UK
{s.chouliaras,stelios}@dcs.bbk.ac.uk
Abstract. Cloud computing has emerged as a new paradigm that offers on-demand availability and flexible pricing models. However, cloud applications are being transformed into large-scale systems where managing and monitoring cloud resources becomes a challenging task. System administrators need automated tools to effectively detect abnormal system behaviour and ensure the Service Level Agreement (SLA) between the service user and the service provider. In this work, we propose a framework for online anomaly detection based on cloud application metrics. We utilize Recurrent Neural Networks to learn normal sequence representations and predict future events. We then use the predicted sequence as the representative sequence of normal events, and based on the Dynamic Time Warping algorithm we classify future time series as normal or abnormal. Furthermore, to create a real-world scenario and validate the proposed method, we used the Yahoo! Cloud Serving Benchmark, a state-of-the-art benchmark tool for cloud data serving systems. Our experimental analysis shows the ability of the proposed approach to detect abnormal behaviours of NoSQL systems on-the-fly with minimum instrumentation.
1 Introduction

Today, cloud providers offer numerous features for deploying large-scale applications in the cloud. Cloud computing offers high availability in terms of virtualized resources (e.g., CPU cores, memory, disk) while at the same time introducing pay-as-you-go pricing models. Consequently, cloud users have the ability to increase or decrease cloud resources based on their demand and only pay for what they use. Cloud users may choose between three main services, namely Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Each type of service has different characteristics and can offer different functionalities to the end user. As a result, cloud computing becomes more and more popular, since it is fertile ground for modern application deployment. However, high availability and cloud scalability generate massive amounts of data in terms of volume, variety, and velocity. A variety of systems, namely
NoSQL systems such as MongoDB¹, Apache Cassandra², and Elasticsearch³, promise to deal with massive amounts of data and support a schema-less structure for real-time analysis [1]. Unfortunately, in such systems hardware failures are common, and a failure per data center per day is commonly reported [2]. Furthermore, it is still a challenge to provide security against denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks in cloud systems [3]. Therefore, automated ways of detecting abnormal behaviours in such systems are more important than ever, since such behaviours can cause application performance degradation and provoke unreasonable energy consumption [4]. In this work, we developed a framework for detecting abnormal behaviours on-the-fly with minimum instrumentation. We deployed a NoSQL system and used state-of-the-art workloads to emulate a real-world scenario. In more detail, we used the Yahoo! Cloud Serving Benchmark (YCSB) [5], which executed a variety of operations in MongoDB. In addition, we monitored our system under YCSB workload execution to collect various application metrics. The latter were used as input to train Recurrent Neural Networks (RNNs) to learn normal sequence representations and generate future predictions. The predicted sequence is used in our method as the representative sequence of normal events. Then, the Dynamic Time Warping (DTW) algorithm is used to calculate the distance between the representative sequence and future signals. Our system is able to classify a future sequence as normal or abnormal based on the DTW distance and a given threshold: a new unseen sequence is classified as normal if the DTW distance is lower than the normal threshold, and abnormal otherwise.
2 Motivational Experiment

This work is motivated by the need to experimentally show the abnormalities that may occur in cloud systems. Although periodic running workloads are very common in NoSQL systems, different abnormal signals can negatively impact application performance in a realistic scenario. To demonstrate this, we present a preliminary experimental analysis by running state-of-the-art workloads inside our application. In particular, we utilise an intensive workload generator, YCSB, to demonstrate a realistic scenario that executes in MongoDB. In more detail, a YCSB workload with 200 thousand records was executed in a small-size VM with 2 CPU cores, a 200 GB hard disk, and 4 GB of memory. YCSB offers a variety of workload types to the user, such as read-heavy, update-heavy, read-latest, and read-only configurations. In this work, a workload with 50% read and 50% update operations (the update-heavy type) was used to emulate a real-world scenario. In addition, we used the Linux stress package to further stress our system and create abnormal signals. Figure 1 demonstrates an abnormal behaviour of the CPU usage in our system. As the YCSB workload was executing in MongoDB,
¹ https://www.mongodb.com/
² https://cassandra.apache.org/
³ https://www.elastic.co/
S. Chouliaras and S. Sotiriadis
Stress Linux package executed various “stress” processes simultaneously in order overutilize the CPU resources of the VM. Consequently, the CPU usage reached the maximum level of 100% for a significant period of time between 16:48 and 16:50.
Fig. 1. CPU usage percentage over YCSB workload and stress package
Moreover, we are motivated to further explore the relationship between resource usage and application performance metrics. Since, application throughput consists a key performance indicator for NoSQL applications, we are motivated to explore the throughput of MongoDB and visualize unusual trends that occurred from abnormal system behaviours. For that reason, we collected the application throughput (operations/second) for the same period of time. Figure 2 shows the throughput of MongoDB at the same time window with Fig. 1. We can observe that the sharp increase in CPU usage has negatively impacted application throughput. In more detail, as the CPU reached the maximum level of 100% for a substantial period of time, the application throughput significantly dropped from 1800 ops/sec to 600 ops/sec. Furthermore, application throughput continued to fluctuate between 600 ops/sec and 1400 ops/sec without recovering until the CPU reached a normal level of 60% at 16:50:25. Our motivation findings suggest that NoSQL systems may experience various abnormalities, that lead to application performance degradation. Our goal is to introduce statistical learning methods to learn from the repeatable normal patterns and generate future predictions that used as a representative sequence of normal events. The latter will be used as a benchmark to detect future abnormalities and generate alarms to system administrators.
Inferring Anomalies from Cloud Metrics Using Recurrent Neural Networks
157
Fig. 2. MongoDB throughput over YCSB workload and stress package
3
Methodology
In this work, a variety of metrics captured over time to support our method. Our key idea is to monitor MongoDB while YCSB workload executes different types of tasks. Then, under the assumption that YCSB generates repeatable patterns, we created a training dataset to train RNNs to predict normal sequence representations. Additionally, the DTW algorithm has been used to measure the distance between the predicted sequence, that is the representative sequence of normal events, and the new unseen signal. The latter has been classified as normal or abnormal based on a threshold parameter set by the user. 3.1
System Overview
Our cloud environment consists three Virtual Machines that host our basic components. The first VM (VM1) hosts the Monitoring engine that monitors the NoSQL application. In more detail, we monitor VM1 and collect various application metrics (e.g. throughput, average read operations, average write operations etc.) in order to observe application and system behaviour while YCSB workload executes in MongoDB. The second VM (VM2) consists the storage Node that hosts ElasticSearch as a NoSQL database storage system. This component stores the collected metrics in a form of a log file inside ElasticSearch. The third VM (VM3) hosts the analyser which is responsible for analysing and detecting future abnormalities. Figure 3 demonstrates the workflow of our system. After the monitoring phase, the collected normal data used to train RNNs in order to model normal sequence representations. Then, the RNN model produces a predicted sequence which is used as the representative sequence of normal workload executions. Then, as new sequences arrive in our system, we calculate the DTW distance between the unseen sequences and the representative sequence. If the DTW distance is greater than normal application threshold, then the new sequence is being classified as an anomaly.
158
S. Chouliaras and S. Sotiriadis
Fig. 3. System flowchart
3.2
Recurrent Neural Networks
Artificial Neural Networks (ANNs) inspired by biological neural networks and designed to solve a variety of problems such as pattern recognition, forecasting and optimisation [6]. ANNs contain connected computational units called neurons organised together in layers. RNNs are powerful deep neural networks models build for sequential data in order to remember patterns, maintaining context and process complex signals for long time periods [7] which makes them ideal for real time resource usage analytics. Figure 4 demonstrates the architecture of RNNs. In RNNs the information flows from the input layer denoted as x(t) in time step t as well as the hidden layer denoted as h(t − 1) from the previous time step t − 1. The latter allows the network to have a memory of past events [8].
Fig. 4. RNNs architecture
$W_{xh}$ is the weight matrix between the input $x(t)$ and the hidden layer h, $W_{hh}$ is the weight matrix associated with the information from the previous time step, that is, the recurrent edge, and $W_{hy}$ is the weight matrix between the hidden layer h and the output layer y.
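For illustration, a minimal Keras sketch of a recurrent forecaster in this spirit is shown below (ours, not the authors' code); the two-layer, 50-unit, tanh/Adam configuration mirrors the settings reported later in Sect. 4.2, while the window length, epoch count, and synthetic signal are assumptions.

import numpy as np
from tensorflow import keras

WINDOW = 30  # assumed number of past samples used to predict the next one

def make_windows(series, window=WINDOW):
    # Slice a 1-D metric series into (past window, next value) pairs.
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., np.newaxis], y

model = keras.Sequential([
    keras.layers.SimpleRNN(50, activation="tanh", return_sequences=True,
                           input_shape=(WINDOW, 1)),
    keras.layers.SimpleRNN(50, activation="tanh"),
    keras.layers.Dense(1),
])
# MSE is used here; the paper reports RMSE, which has the same minimiser.
model.compile(optimizer="adam", loss=keras.losses.MeanSquaredError())

# Example: fit on a synthetic periodic "throughput" signal scaled to [-1, 1].
series = np.sin(np.linspace(0, 60 * np.pi, 3000)).astype("float32")
X, y = make_windows(series)
model.fit(X, y, epochs=5, batch_size=64, verbose=0)
predicted = model.predict(X[-200:], verbose=0).ravel()  # representative sequence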
3.3 Dynamic Time Warping
DTW uses a dynamic approach to align the time series; thus, the pattern detection involves searching wavelets of different data points [9]. However, this research focuses on wavelets that have the same number d of data points. Specifically, the pattern detection involves searching wavelet P for instances on wavelet T:

$$P = p_1, p_2, p_3, \ldots, p_d \qquad (1)$$

$$T = t_1, t_2, t_3, \ldots, t_d \qquad (2)$$

The sequences P and T are used to form a d-by-d square matrix where each point (i, j) corresponds to an alignment between elements $p_i$ and $t_j$. A warping path W maps the elements $p_i$ and $t_j$ so that the distance between them is minimised:

$$W = w_1, w_2, w_3, \ldots, w_k \qquad (3)$$

This alignment corresponds to the distance between two elements, where either the magnitude of the difference or the square of the difference of points $p_i$ and $t_j$ can be used:

$$\delta(i, j) = |p_i - t_j| \qquad (4)$$

$$\delta(i, j) = (p_i - t_j)^2 \qquad (5)$$
Once the measurement is defined, DTW defines a minimisation problem over potential warping paths based on the cumulative distance of each path:

$$DTW(P, T) = \min_{W} \sum_{k=1}^{K} \delta(w_k) \qquad (6)$$

where $w_k$ is the k-th element of the warping path W that needs to be minimised based on the cumulative distance measurement function δ. The cumulative distance at each point γ(i, j) is based on the following recurrence relation:

$$\gamma(i, j) = \delta(i, j) + \min\left[\gamma(i-1, j),\ \gamma(i-1, j-1),\ \gamma(i, j-1)\right] \qquad (7)$$
that is, the sum of the distance between the current point's elements and the minimum cumulative distance of the neighbouring points.
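A compact dynamic-programming sketch of this recurrence, written by us for illustration (with the squared-difference cost of Eq. (5) and a caller-chosen threshold, as described next), could look as follows:

import numpy as np

def dtw_distance(p, t):
    # gamma[i][j] follows Eq. (7); borders are initialised to infinity.
    d = len(p)
    gamma = np.full((d + 1, d + 1), np.inf)
    gamma[0, 0] = 0.0
    for i in range(1, d + 1):
        for j in range(1, d + 1):
            cost = (p[i - 1] - t[j - 1]) ** 2       # Eq. (5)
            gamma[i, j] = cost + min(gamma[i - 1, j],      # insertion
                                     gamma[i - 1, j - 1],  # match
                                     gamma[i, j - 1])      # deletion
    return gamma[d, d]

def classify(sequence, representative, threshold):
    # Abnormal when the DTW distance exceeds the user-set normal threshold.
    return "abnormal" if dtw_distance(sequence, representative) > threshold else "normal"

rep = np.sin(np.linspace(0, 2 * np.pi, 50))
print(classify(rep + 0.05, rep, threshold=1.0))   # -> normal
print(classify(rep * 3.0, rep, threshold=1.0))    # -> abnormal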
DTW is used as the primary signal similarity algorithm of our solution, and it enables the user to set a normal threshold variable. If the DTW distance is greater than the normal threshold, then the sequence is classified as abnormal and the system reports an abnormality.
4 Inferring Anomalies Using RNNs and DTW Algorithm

In this section, experimental scenarios are discussed and visualized to demonstrate the effectiveness of our solution. We verify the performance of our method based on a YCSB workload that executes in MongoDB. This includes (a) the experimental setup, (b) RNNs for time series forecasting, and (c) the DTW algorithm for anomaly detection.

4.1 Experimental Setup
In this section, the solution is clarified by demonstrating the cloud deployment architecture. Three medium-size VMs, each with 4 CPU cores, a 300 GB hard disk, and 8 GB of RAM, were deployed and monitored as the system runs. The first VM hosts MongoDB as a NoSQL database together with the monitoring engine. The latter sends metric beats (e.g., application throughput) while our application is exposed to real-world workloads. However, our goal is to avoid storage inefficiencies inside the VM that hosts the main application. Thus, the second VM serves as the storage node that receives data from the monitoring engine and stores them in Elasticsearch in JavaScript Object Notation (JSON) format, a lightweight data-interchange format that supports the necessary data types for this work. The third VM is responsible for collecting, processing, analysing, and visualising data on-the-fly.

4.2 RNNs for Time Series Forecasting
As already mentioned, NoSQL systems tend to generate repeated patterns over significant periods of time. RNNs are used to learn the application's throughput behaviour and make future predictions of unknown events. The first step is to normalise the data within the range of −1 and 1 in order to train the RNN model. Next, the hyperbolic tangent (tanh) function is used as the activation function for each node. In addition, the Root Mean Squared Error (RMSE) is used as the loss function in the training phase; thus, the weights of the RNN model are updated based on the square root of the mean of the squares of all the errors. Furthermore, the RNN model uses adaptive moment estimation (Adam) as the optimisation algorithm. Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions, based on lower-order estimates that adapt throughout the training process [10]. The number of layers and the number of nodes inside each layer were significant parameters to consider. In many cases, more layers and nodes indicate that the model tends to overfit the data, as the complexity of the model increases sharply. Thus, we used a test dataset to evaluate the RNN model's performance on unseen data and ensure
Fig. 5. Recurrent neural networks prediction
model generalization. As shown in Fig. 5, the first 10 wavelets were used as the training set, while the last 2 wavelets were used as the test set. The best RMSE = 0.071 was achieved with a two-layer architecture with 50 nodes in each layer and the number of training epochs set to 300. Since the RNN model was trained on normal sequences (blue line), it produces a predicted sequence (green dotted line) that is used as the representative sequence of normal events to support our method.
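A hedged sketch of this training setup in Keras is shown below: two recurrent layers of 50 tanh nodes, an RMSE loss, the Adam optimiser, and 300 epochs, as reported in the text. The look-back window length and the synthetic input series are our assumptions, since the paper does not state them.

```python
import numpy as np
import tensorflow as tf

WINDOW = 30  # assumed look-back length; not stated in the paper

def rmse(y_true, y_pred):
    # Root Mean Squared Error, the training loss used in the text.
    return tf.sqrt(tf.reduce_mean(tf.square(y_true - y_pred)))

# Synthetic stand-in for the normalised throughput series in [-1, 1].
t = np.linspace(0, 24 * np.pi, 1200)
series = np.sin(t).astype("float32")

# Sliding windows: predict the next value from the previous WINDOW values.
X = np.stack([series[i:i + WINDOW] for i in range(len(series) - WINDOW)])
X = X[..., np.newaxis]                    # shape (samples, WINDOW, 1)
y = series[WINDOW:].reshape(-1, 1)

model = tf.keras.Sequential([
    tf.keras.layers.SimpleRNN(50, activation="tanh", return_sequences=True,
                              input_shape=(WINDOW, 1)),
    tf.keras.layers.SimpleRNN(50, activation="tanh"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss=rmse)
model.fit(X, y, epochs=300, verbose=0)
```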
4.3 Dynamic Time Warping Algorithm for Anomaly Detection
The DTW algorithm is used as a score to detect abnormalities in time-series data. Consequently, we are able to classify a wavelet as normal or abnormal based on the DTW distance. As discussed earlier, the RNN model is used to learn normal sequence representations. As a result, the RNN model produces a predicted sequence that serves as the representative sequence of normal workload executions. The DTW algorithm then measures the distance between the representative sequence and a new, unseen sequence and classifies the latter as normal or abnormal based on a given threshold. Figure 6 shows the normal, predicted, and abnormal sequences over a YCSB workload execution in MongoDB. A new wavelet is classified as normal (blue line) when the DTW distance between it and the predicted wavelet (green dotted line) is lower than the normal threshold. On the contrary, a new wavelet is classified as abnormal (red line) when the DTW distance between it and the predicted wavelet is higher than the normal threshold. The normal threshold constitutes a tuning parameter that can be adjusted by the user based on the application's functionality. As a result, a high threshold value will detect only extreme sequence representations, whereas a low threshold value will be less forgiving, and more unseen sequences will be classified as abnormal.
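Given the dtw_distance function sketched earlier, the classification step reduces to a threshold comparison. This is an illustrative sketch; the threshold value itself is application-specific and set by the user.

```python
def classify_wavelet(predicted, observed, threshold):
    """Label an observed wavelet by its DTW distance to the
    RNN-predicted (representative) wavelet.

    `dtw_distance` is the dynamic-programming sketch given earlier.
    """
    distance = dtw_distance(predicted, observed)
    return "abnormal" if distance > threshold else "normal"
```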
Fig. 6. DTW algorithm to label a sequence as normal (blue line) or abnormal (red line) based on a normal threshold.
5 Related Work
In this section, we briefly discuss various techniques based on resource usage and application monitoring for detecting abnormalities in cloud systems. In [11], the authors underline the importance of cloud resource-usage monitoring for both the cloud user and the service provider. The cloud user's interest is to arrive at an appropriate service-level agreement (SLA), while the cloud provider's interest is to ensure the user's satisfaction. To meet these requirements, they propose a distributed monitoring framework that enables application monitoring and ensures the SLA based on Quality of Service (QoS) requirements. Three web servers were hosted on three VMs on a single host, and the httperf [12] benchmarking tool was used to measure web-server performance by generating specific HTTP workloads. The basic architecture of their virtual environment consists of four components: VM Agent, Dom0 Agent, Metrics Collector, and Customer Interface Module. The VM Agent collects the metrics for each VM, while the Dom0 Agent is specific to the Xen hypervisor and collects the per-VM effort. Both agents communicate with the Metrics Collector, which in turn communicates with the Customer Interface Module in order to meet the customer's monitoring requirements. In [13], the authors propose a framework to detect abnormal behaviour of cloud systems based on application-metric monitoring. They use Long Short-Term Memory (LSTM) autoencoders to learn to reconstruct normal sequence representations. New, unseen signals are then classified as normal or abnormal based on the reconstruction error produced by the LSTM-autoencoder model: a new signal is classified as normal if the reconstruction error is lower than a given threshold, and abnormal otherwise. Recently, different approaches have been used to detect abnormalities in cloud systems based on resource-usage data. Such techniques constitute a powerful tool against denial-of-service (DoS) attacks from unauthorised users, since DoS attacks often occupy large amounts of computing resources. In [14] the authors proposed
a defence scheme that uses a virtual machine monitor (VMM) to detect DoS attacks effectively by adapting the threshold of available resources. Because a DoS attack occupies available resources without crashing the operating system, the VMM collects resource-usage metrics (e.g. CPU, memory) to acquire information at a lower level. Consequently, they detect abnormal behaviour as the resources of the virtual machine decrease to the threshold after the attacker has deprived it of resources. Lastly, if the defence system confirms an attack, it migrates the operating system alongside the tagged applications, on the fly, into a new isolated environment, while the initial VM is interrupted or even destroyed. In [15], the authors introduce logging as a key source of information on the system state. However, traditional ways of analysing log files to detect abnormalities are difficult to adapt to modern applications that need to scale up dynamically in the cloud. As virtual machines can scale horizontally to huge sizes, the complexity of these systems rises and auditing becomes a challenging task for system administrators. Their work presents efficient algorithms for log mining with minimal human intervention. They applied a word-embedding technique based on Google's word2vec algorithm that requires little intervention and promises strong predictive performance. Natural Language Processing (NLP) techniques give a linguistic approach to anomaly detection that can improve the performance of the analysis. However, extra parameters such as system metrics need to be considered to generate stronger models with the ability to analyse and predict abnormalities in large-scale systems. Although all the aforementioned techniques propose effective solutions for detecting abnormalities in cloud systems, they are mainly focused on log mining and data-storage techniques. On the contrary, our method focuses on real-time anomaly detection by monitoring application metrics and classifying signals using RNNs and the DTW algorithm. Since signal-similarity methods detect abnormalities in real time, they can also compare signals of unequal length and give administrators the ability to preset customised thresholds. Furthermore, the proposed method efficiently collects and distributes the application-metric data outside the application environment without "stealing" the application's resources.
6 Conclusion
Distributed systems have the ability to scale out to huge numbers of cluster nodes in order to deal with big data volumes. However, this ability introduces inefficiencies, as auditors struggle to manually track abnormal behaviour in the system. In this work, we presented a real-time anomaly detection system based on the application metrics of a NoSQL system. As we observed that YCSB generates similar patterns over time, we used deep-learning modelling to learn from that context and generate future predictions. The predicted sequence is used as a representative sequence of normal workload executions. Our key idea was to find dissimilarities between the representative and new abnormal sequences by using the DTW distance metric. We introduced an application threshold as a tuning parameter
to classify a new signal as normal or abnormal. Thus, a new sequence is classified as normal if the DTW distance is lower than the normal threshold, and abnormal otherwise. In future work, we will further explore various resource-usage features and alternative statistical-learning approaches to detect abnormal behaviour in a cloud environment. We aim to combine the information that arrives from the monitoring engine and use multiple predictors, resulting in a multivariate time-series analysis for anomaly detection.
References

1. Gudivada, V.N., Rao, D., Raghavan, V.V.: NoSQL systems for big data management. In: 2014 IEEE World Congress on Services, pp. 190–197. IEEE (2014)
2. Bhattacharyya, A., Jandaghi, S.A.J., Sotiriadis, S., Amza, C.: Semantic aware online detection of resource anomalies on the cloud. In: 2016 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), pp. 134–143. IEEE (2016)
3. Gupta, B., Badve, O.P.: Taxonomy of DoS and DDoS attacks and desirable defense mechanism in a cloud computing environment. Neural Comput. Appl. 28(12), 3655–3682 (2017). https://doi.org/10.1007/s00521-016-2317-5
4. Chouliaras, S., Sotiriadis, S.: Real-time anomaly detection of NoSQL systems based on resource usage monitoring. IEEE Trans. Ind. Inf. 16(9), 6042–6049 (2019)
5. Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: Proceedings of the 1st ACM Symposium on Cloud Computing, pp. 143–154. ACM (2010)
6. Jain, A.K., Mao, J., Mohiuddin, K.M.: Artificial neural networks: a tutorial. Computer 29(3), 31–44 (1996)
7. Williams, R.J., Zipser, D.: A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1(2), 270–280 (1989)
8. Raschka, S.: Python Machine Learning. Packt Publishing Ltd., Birmingham (2015)
9. Berndt, D.J., Clifford, J.: Using dynamic time warping to find patterns in time series. In: KDD Workshop, vol. 10, no. 16, pp. 359–370. Seattle, WA (1994)
10. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
11. Dhingra, M., Lakshmi, J., Nandy, S.: Resource usage monitoring in clouds. In: 2012 ACM/IEEE 13th International Conference on Grid Computing. IEEE (2012)
12. Mosberger, D., Jin, D.: httperf – a tool for measuring web server performance. ACM SIGMETRICS Perform. Eval. Rev. 26(3), 31–37 (1998)
13. Chouliaras, S., Sotiriadis, S.: Detecting performance degradation in cloud systems using LSTM autoencoders. In: Barolli, L., Woungang, I., Enokido, T. (eds.) Advanced Information Networking and Applications. AINA 2021. Lecture Notes in Networks and Systems, vol. 226, pp. 472–481. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-75075-6_38
14. Zhao, S., Chen, K., Zheng, W.: Defend against denial of service attack with VMM. In: 2009 Eighth International Conference on Grid and Cooperative Computing, pp. 91–96. IEEE (2009)
15. Bertero, C., Roy, M., Sauvanaud, C., Trédan, G.: Experience report: log mining using natural language processing and application to anomaly detection. In: 2017 IEEE 28th International Symposium on Software Reliability Engineering (ISSRE), pp. 351–360. IEEE (2017)
Method of Lyric Association Based on Mind Mapping in Collaborative Lyric Writing of Popular Music

Meguru Yamashita(B) and Kiwamu Satoh

Graduate School of Software and Information Science, Iwate Prefectural University, Takizawa, Japan
[email protected], [email protected]
Abstract. Owing to the prosperity of consumer-generated media and user-generated content, collaborative production (collaborative composition) of popular music by multiple people over a network has become common. However, there has been limited discussion of the collaborative production of lyrics (collaborative lyric writing), which is an important component of creating popular music. Collaborative lyric writing is a collaborative creative act, where words and sentences are conceived based on the vocabulary and experiences of each member. It is important to support this process because it can be expected to produce richer ideas than those produced by individuals. In this study, we propose a method to support collaborative lyric writing using lyric association maps (LAMs). These maps support collaborative lyric writing as a creative act by visualizing the process of the group's lyric conception, facilitating the understanding of the entire lyrics of a song and the association and sharing of the lyrics.
1 Introduction

With the rise of consumer-generated media and user-generated content, collaborative production of popular music by multiple people via a network in a desktop music environment (hereafter referred to as "collaborative composition") has become common practice [1]. Additionally, regarding the publication of musical works, users can now choose from various sharing services, such as YouTube (https://www.youtube.com) and SoundCloud (https://soundcloud.com/), as noncommercial places to publish their music on the Internet. Most popular songs have lyrics, and lyrics are important elements that compose songs. Lyric writing—similar to composing—is an act performed by one person or by multiple people. In fact, there are numerous examples of songs in commercial popular music whose lyrics were written by multiple people (hereinafter referred to as "collaborative lyric writing"). Collaborative lyric writing is a collaborative creative act, where words and sentences are conceived based on the vocabulary and experience of each member, and richer ideas can be expected than those created by individuals. Most existing research on the application of information technology to lyric writing has been on the automation of lyric writing [2–5]; however, there are few studies on the ideation of lyrics by multiple people.
In our previous paper [6], we proposed an integrated collaborative lyric-writing support environment for lyrics and melodies. The focus was on the lyric-editing function, that is, mapping lyrics to melodies; therefore, it did not provide support for the creative aspect of lyric writing. Hence, we are currently focusing on the co-creative actions of lyric writing. When writing lyrics for popular music, focusing on the elements that constitute the lyrics (such as stories, characters, and perspectives) makes lyric writing easier. All members should share these elements accurately during collaborative lyric writing. If members have ideas that do not share these elements accurately, it will be difficult for the ideas to diverge and converge. To solve this problem, we propose a collaborative lyric-writing support method using lyric association maps based on the radial thinking of mind maps. The objectives of this study are as follows.

1. We propose a supporting method that expands the lyric ideas of each member by visualizing the process of group lyric writing. This method allows us to visually grasp the entire world behind the lyrics of the song and facilitates the association and sharing of lyrics.
2. We propose a method to support the convergence of the ideas presented in 1 and the decision on the most appropriate lyrics. This method encourages the construction of story-based knowledge and facilitates lyric decision-making by listing all lyric candidates and presenting them to the user.
3. We implement additional functions for creating lyrics using the lyric association map (hereinafter referred to as LAM), sharing and listing the conceived lyrics, and performing the melodies with the lyrics in the collaborative composition system from our previous work [7]. With these additional functions, the collaborative lyric-writing support described in 1 and 2 becomes possible in our system.

The study comprises six sections. Section 2 describes the process of lyric writing and its constraints in popular music in general, as well as lyric writing by multiple people. Section 3 describes our proposed method of lyric divergence and convergence using lyric association maps based on mind maps. Section 4 describes the collaborative lyric-writing functions based on our proposed method, which were implemented in our collaborative music composition system. Section 5 describes an outline of the preliminary experiment examining how the use of lyric association maps affects lyric ideation. Section 6 presents a summary of the study.
2 Collaborative Lyric Writing

2.1 Lyric Writing in Popular Music

Recently, in music production, technological innovations—mainly digitalization—have simplified the required equipment and lowered the cost of production [8, 9]. Such a decrease in cost has made it easier for major record companies as well as small independent labels, independent individuals and groups, and even unknown amateurs to release
their music. In other words, lyric writing by people without advanced lyric-writing skills or experience is increasing. Lyric writing is the process of writing words serially to form phrases, sentences, and stanzas. This process is an act of creativity, wherein a person conceives words and sentences based on his or her linguistic knowledge and experience, such as vocabulary and grammar. The success of lyric writing depends on the ability of the lyricist to produce good ideas. In the field of popular music, various methods to make lyric writing easier (lyric-writing methods) have been published in books and in movies on the Internet. However, in general, there is no set procedure for lyric writing. In this situation, this study supports the lyric-writing act based on a lyric-writing method [10] that considers elements such as "story," "characters," "viewpoint," "line of sight," and "perspective." The visualization and presentation of these elements and their relationships will encourage further conception of lyrics. In this study, the above elements are defined as follows.

Characters: People whose existence is described in the lyrics. The central character is called the protagonist.

Lyrics Story: A world composed of words and phrases inspired by the characters. Moreover, this world constitutes the message that the lyricist is attempting to convey. The lyrics are important words—extracted from these words and phrases—that construct the lyrics story.

Perspective: The center of the world of the lyrics story, which is the position of the protagonist.

Viewpoint: The position in the world of the lyrics story where a character other than the protagonist is located.

Line of Sight: A line (vector) that represents the protagonist's attention, from his or her perspective, toward a character at a particular viewpoint.
2.2 Constraints on Lyric Writing in Popular Music

In writing lyrics within the scope of popular music, the following three constraints are considered.

1. Structural constraint
The structural constraint is imposed by the structure of the music. The lyrics need to be adjusted in terms of overall length and separation, considering the time during which the melody is playing and the number of notes. When comparing English and Japanese lyrics, English lyrics correspond to one syllable per note, whereas Japanese lyrics correspond to one mora per note [11]. Because of this characteristic, the number of notes used by a word in Japanese is larger than in English. Therefore, the selection of words is important in Japanese lyric writing.
Additionally, contemporary popular-music production methods can be divided roughly into two categories: tune-first, where the lyrics are added to a previously composed melody, and lyrics-first, where the melody is added to previously written lyrics [5]. Japanese is classified as a pitch-accented language [12], in which words are identified by their pitch. Therefore, in the case of tune-first songwriting, lyricists tend to focus on the temporal changes in the pitch of the melody when selecting words whose accents are comfortable in Japanese. Consequently, knowledge of synonyms is important in the selection of Japanese lyrics. In summary, when writing lyrics in Japanese, it is important to be aware of the following two points in selecting words and phrases: the number of morae relative to the number of notes, and the pitch accent of the chosen words.

2. Content constraint
The content constraint is a constraint on the content of the lyrics and the way they are expressed. In popular music, especially when a work is produced as a commercial piece, it is necessary to have lyrics that can be understood by, and are relatable to, various listeners. Therefore, in collaborative lyric writing, it is considered effective for ideation support to have a mechanism allowing members to consider whether the conceived words and phrases can gain the sympathy of listeners.

3. Ethical constraint
In some cases, a record company or broadcasting station takes measures such as banning or canceling the broadcast or release of a song because the contents of the lyrics, or specific words in them, are discriminatory, obscene, or otherwise offensive to public order and morals. However, truncating certain vocabulary during creative work is considered to have a negative effect on ideation support. Therefore, we did not consider this constraint in this study.

2.3 Collaborative Lyric Writing

We conducted research on the act of composing music by multiple people, that is, collaborative composition [6, 7]. Collaborative composition is a collaborative creative act wherein members conceive musical rhythms and phrases based on their own musical knowledge, repertoire, and experience. The act of collaborative lyric writing—occurring before and after collaborative composition—is also a collaborative creative act, in which words and sentences are conceived based on each member's linguistic knowledge and experience, such as vocabulary and grammar. Such a collaborative creative act by multiple people can be expected to produce richer ideas than those produced by individuals. In commercial popular music, the percentage of collaborative writing is lower than that of individual lyric writing. However, collaborative lyric writing is common practice. We think that collaborative lyric writing facilitates the writing of lyrics under the constraints described in Sect. 2.2, for the following reasons.

1. Concerning structural constraints, a small number of words means that lyrics carry relatively little information. In other words, trial and error is necessary to consider and select the appropriate words. The range of choices is widened by using the vocabulary of each member. Additionally, a collaborative examination of words and phrases among members is expected to facilitate the writing of lyrics.
2. Concerning content constraints, by sharing the lyrics with multiple people, it is possible to examine whether everyone can understand or empathize with the lyrics.
3 Support for Lyric Writing Based on Mind Mapping

In this section, we discuss a method to support the conception of song lyrics. In ideation support, it is important to support both the divergence and the convergence of ideas [13]. Therefore, the proposed method includes both. The divergence support method is described in Sect. 3.2, and the convergence support method in Sect. 3.3. Additionally, collaborative lyric writing is not necessarily a task performed synchronously by multiple members. Asynchronous work procedures, such as one member creating lyrics and another member changing them later, are also possible. Therefore, these methods need to be suitable for both individual and collaborative work. In this paper, the terms "lyrics" and "lyric candidates" are used distinctly. Lyric candidates refer to all "words" or "phrases" (a sequence of two or more words, regardless of whether they form a sentence) conceived using our proposed method. Lyrics refer to lyric candidates that are associated with the melody of a musical piece.

3.1 Methods for Supporting Divergent Thinking

Support methods for divergent thinking include mind mapping by Buzan [14], the KJ method by Kawakita [15], and the NM method by Nakayama [16]. For the following reasons, we consider mind mapping the most suitable method to support word divergence in lyric writing. Mind mapping is a method of ideation support through visual drawing based on radial thinking. The structure of mind maps mimics the connection of synapses in the human brain; by branching out from a theme in the center of the map and drawing curves, it is possible to expand ideas and facilitate associations. This feature of mind mapping, which facilitates association, is suitable for associating the words and phrases to be examined and considered in lyric writing. Additionally, because the divergence process of thinking is illustrated, it has the following two advantages for collaborative work.

• It is easy to understand the thought processes of others.
• It is easy to add one's own associations without destroying existing ideas.

The KJ method is used to classify created and collected data, to "let the chaos (data) speak for itself" [15], and to offer new ideas and solutions to problems. In this method, labeling and spatial arrangement of the classified data are performed. We assume that the creation of lyrics is a series of actions where new words are conceived in a chain from a perspective and from existing words, and are subsequently connected to each other. In other words, the collected data constitute the lyrics as the deliverable. In the case of mind maps, this connection can be made using radial connections, and there is little need to label the dataset in lyric writing. Therefore, when comparing
these two methods, mind mapping is considered more suitable for supporting divergent thinking in lyric writing. The NM method connects seemingly unrelated things and obtains new ideas from the things connected through metaphor and association. We assume that the lyrics are the words and phrases, inspired by the characters, that construct the world or the message the lyricist is trying to convey; they are not something conceived by combining different things. Additionally, ideation from seemingly unrelated things can create unexpected lyrics. However, it is difficult for an inexperienced lyricist to deliberately aim for the unexpected. For beginners, it is appropriate to write lyrics using an ideation method that allows more direct associations between words. From the above discussion, we conclude that mind mapping is the more appropriate method for this purpose.

3.2 Support for the Divergence of Words and Phrases Based on Mind Maps

As described in Sect. 2.1, when creating lyrics, it is important to create a lyrics story with clear characters, perspectives, viewpoints, and lines of sight. Here, the term "lyrics story" does not mean a chronological sequence of events, as in a novel. It is a set of words and phrases that expresses the various concepts constructing the world conveyed by the lyrics, such as relationships between characters, thoughts, situations, and dialogs. Therefore, to turn the concepts that construct the world into lyrics, it is necessary to express these concepts in words. Hereinafter, we refer to these words as "the lyrics keywords." The structure of the relationships between the words that construct the lyrics story is considered to have a high affinity with the structure of keywords conceived through the radial thinking of mind mapping. This is because:

1. One mind map has one central image, and branches that each hold a lyrics keyword diffuse radially from the image. Similarly, the words that construct a lyrics story can be considered a diffusion of lyrics keywords conceived with the protagonist as the central image.
2. The idea of diffusing lyrics keywords from the central image in reason 1 can be applied to characters other than the main character. However, because there is generally only one central image in mind mapping, we call the image of a non-protagonist a sub-image.
3. By adding sub-branches under a branch, it is possible to associate them with lyrics keywords. This action can be seen as an act of associatively delving into and expanding the expression of the world described by the lyrics.
4. A complete map constructed using lyrics keywords can be considered a lyrics story. Therefore, the lyrics keywords are the lyric candidates. Additionally, because a semantic connection through association can be found among connected branches, there is a high probability that the sentences of the lyrics can be composed from the lyrics keywords of these branches.
5. In mind mapping, there is a rule that an arrow must be drawn between two distant branches if they are connected. It is easy to understand who is paying attention to whom by connecting perspectives and viewpoints with arrows.
The perspective in reason 1 is associated with the position of the central image, and the viewpoint in reason 2 is associated with the endpoint of an arbitrary branch; furthermore, the words and phrases are associated with the keywords of the branches. However, the concepts of sub-images and lines of sight are not found in conventional mind maps. Because Buzan's mind map has a strict definition, the introduction of these concepts is our own extension based on the mind map. Therefore, we call the radial map that introduces these concepts the LAM. The LAM comprises the following elements.

Central image: The central image is the protagonist, and there is only one on the map. Graphically, it consists of an icon and a name label. The center of the icon of the central image is simultaneously the origin of the map and the coordinates of the perspective.

Sub-image: A sub-image is a character other than the protagonist. Thus, there are as many of them on the map as there are such characters. Graphically, it consists of an icon and a name label.

Branch: A branch holds a lyrics keyword and extends from the central image or a sub-image. Graphically, it consists of a line or curve and a lyrics keyword label. Zero or more branches can be connected to a branch; the connected branches are called sub-branches.

Edge: An edge indicates the endpoint of a branch. Graphically, it consists of a circle. The center of the icon of a sub-image is simultaneously the center of an arbitrary edge and the coordinates of a viewpoint.

A line of sight is formed when a branch connects the central image icon and a sub-image icon. If the two are far apart, an arrow is placed between them to represent the line of sight. We show an example of the LAM on the left side of Fig. 1. The advantages of using the LAM for collaborative lyric writing are as follows.

1. The radial representation of the LAM allows the information to be understood easily. The map makes it easy for members to understand and share the lyrics story behind the lyrics.
2. If a member presented the lyrics as sentences from the beginning, intervening in the contents of the lyrics or editing them would have a significant impact on the overall composition. When creating the lyrics story using the LAM, it is easy to add, edit, or delete words without affecting the overall structure.
3. Beginners in lyric writing are expected to find it difficult to begin immediately with the larger framework of lyrics (entire lyrics, stanzas, and sentences). By using the LAM, members can easily grasp the relationships among the minimal components of lyrics, such as words and phrases. Furthermore, members can create sentences, stanzas, and entire lyrics by associating and connecting new words from these relationships, which facilitates bottom-up lyric writing.
Fig. 1. Example of the LAM (left) and lyric candidate sentences (right)
3.3 Supporting Convergence by Enumerating Lyric Candidates

Generally, lyrics are story-based texts. They can be regarded as a story-type, non-multi-attribute knowledge representation. Because the lyrics story defined in Sect. 2.1 is not story-based, it needs to be transformed into story-based text, that is, into lyrics. In other words, in the stage of diverging lyric candidates onto the LAM, branches are associated and added with a focus on word-level relationships; in the stage of converging the lyric candidates into lyrics, they are recognized as a story-type knowledge format. If group members can recognize the story-type knowledge format during the convergence stage, this will facilitate the construction of story-type knowledge [17]. The above convergence from lyric candidates to lyrics requires (1) discarding and selecting words displayed in the LAM and (2) converging them into a sentence structure. The problem with this process is that it is difficult to interpret the map as a story-type representation by examining the map alone, because people's knowledge structures differ. Therefore, as described in reason 4 of Sect. 3.2, we convert the map into a list of lyric candidate sentences that have a high possibility of forming lyric sentences, and then select words from this list. A lyric candidate sentence is the sequence of lyric candidates along a path from a terminal branch to an image, or between two adjacent images. We show an example of lyric candidate sentences on the right side of Fig. 1. An additional advantage of such an enumeration method is that it can avoid the situation in which something is not recognized because it is not represented [18]. Because it is up to the user to decide which lyric candidate is best, all lyric candidate sentences should be listed and recognized by the user. Because the LAM does not consider grammatical rules such as case changes, prepositions, and particles, it is unlikely that the selection of words on the LAM alone will result in grammatically correct sentences. Users handle such grammatical corrections and additions in the process of converging to the sentence structure, adjusting the structure so that it is correct. Additionally, as mentioned under the structural constraint in Sect. 2.2, it is desirable that the pitch accent of the lyrics can be immediately identified when singing them, regardless of the lyricist's singing level.
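To illustrate how such an enumeration could work, the sketch below models a LAM as a simple tree of branches (our own simplified data structure, omitting sub-images, edges, and coordinates) and lists every candidate sentence from the image down to each terminal branch.

```python
from dataclasses import dataclass, field

@dataclass
class Branch:
    keyword: str
    children: list = field(default_factory=list)

def candidate_sentences(image_name, branches):
    """Enumerate lyric candidate sentences: the keywords along every
    path from the image down to a terminal branch."""
    sentences = []

    def walk(branch, path):
        path = path + [branch.keyword]
        if not branch.children:          # terminal branch -> one candidate
            sentences.append(" ".join([image_name] + path))
        for child in branch.children:
            walk(child, path)

    for b in branches:
        walk(b, [])
    return sentences

# A toy LAM around the protagonist "I": two branches, one with a sub-branch.
lam = [Branch("rainy", [Branch("umbrella")]), Branch("waiting")]
print(candidate_sentences("I", lam))
# ['I rainy umbrella', 'I waiting']
```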
4 Implementation of the Proposed Method in the Cooperative Music Composition System

Based on the methods described in Sect. 3, we implemented functions for lyric ideation support. These functions are an addition to the cooperative music composition system implemented in our previous study [7].

4.1 Requirements for the Lyrics Ideation Support Function

Based on the discussion in Sects. 2.1, 2.2, and 3, we defined the following four requirements for lyrics ideation support in collaborative lyric writing using our system.

1. All lyric candidates expressed on the LAM should be displayed such that members can easily grasp them.
2. The operations to edit a perspective, as well as to add, edit, and delete viewpoints and branches, should be easy.
3. All the lyric candidate sentences created from the lyric candidates on the LAM should be listed, and the members should be able to see them at any time.
4. The members should be able to discuss and exchange opinions with other members regarding lyric decisions.

Requirements 1 and 2 concern divergence support and are implemented in the LAM window, as described in Sect. 4.2. Requirements 3 and 4 concern convergence support and are implemented in the lyrics editing window, as described in Sect. 4.3.

4.2 LAM Window

When "Open lyrics associative map" is selected from the context menu, the LAM window appears on the screen (Fig. 2). In the LAM window, unique icons represent the viewpoints and perspectives, and their name labels are displayed below the icons. Additionally, the lyrics keywords of the branches are displayed in the immediate vicinity of, and parallel to, the straight lines representing the branches. This window has the following three functions.

1. Perspective editing function
The perspective editing function sets or unsets the center of any edge as a viewpoint. This function satisfies requirement 2 in Sect. 4.1. To set a viewpoint, the user enters text data for the name label. When the input is completed, an icon appears on the screen, and a name label appears below the icon.

2. Branch editing function
The branch editing function adds, deletes, or inserts branches on any image or branch, and sets lyrics keywords. This function satisfies requirement 2 in Sect. 4.1. On the server side, morphological analysis is performed on the text data of the lyrics keywords. The text data are stored on the server along with the morphological and pronunciation data. In Japanese, some words have multiple pronunciations.
Therefore, each lyric-writing project has a database for registering the pronunciations of special lyrics keywords.

3. Branch coordinate modification function
The branch coordinate modification function moves branches to an arbitrary position in the window when the user drags a sub-image or an edge with the mouse. This function satisfies requirement 1 in Sect. 4.1. The sub-branches below the moved branch are moved simultaneously. The system does not assign meaning to the coordinates; their interpretation is left to the user who creates the lyrics.
Fig. 2. Lyrics association map window (left) and lyrics editing window (right)
4.3 Lyrics Editing Window

When "Open lyrics editing window" is selected from the context menu, the lyrics editing window appears on the screen (Fig. 2). The status of the lyrics is displayed for each theme in the upper half of the window, with buttons for various operations displayed on the right side of the lyrics. A theme refers to a part of the song separated by its development. A list of lyric candidate sentences is displayed in tabular form in the lower half of the window. The lyric candidate sentences are created by selecting the lyrics keywords entered in the lyrics associative map window in one of the following three ways:

• All lyrics keywords from a terminal branch to the central image
• All lyrics keywords from a terminal branch to a sub-image
• All lyrics keywords from a sub-image to the central image

Buttons corresponding to the number of themes are displayed on the right of each line of the list. The lyrics editing window has the following three functions.

1. Lyrics selection function
The lyrics selection function selects an arbitrary lyric candidate as a lyric of the theme. This function satisfies requirement 3 in Sect. 4.1. To decide which lyric candidates are included in the lyrics, a lyric candidate sentence is first registered to a
theme by pressing the button with the number of the theme to be used. The registered lyric candidate sentence is displayed as labels in the upper part of the window, directly below the lyrics. By clicking on a label, the corresponding lyric candidate is selected as a lyric. When the user determines the lyrics, the decision is immediately reflected in the melody of the corresponding theme, and he or she can listen to the singing of a vocal synthesizer to confirm it. Manipulating the cursor specifies the selected position in the lyrics. If the number of notes in the melody is larger than the number of morae in the lyrics, the remaining notes become scats (pronounced "la" by default). If the number of morae in the lyrics is larger than the number of notes in the melody, the extra morae are ignored.

2. Lyrics editing function
The lyrics editing function fine-tunes the lyrics as follows:

• Change the order of words in the lyrics
• Edit the lyrics line by line
• Delete the lyrics line by line

The text data modification and dictionary data registration described in function 2 of Sect. 4.2 can also be performed in the lyrics editing window.

3. Commenting function
The commenting function is used to comment on lyrics line by line and to share the comments among the members. This function satisfies requirement 4 in Sect. 4.1. It is also possible to comment on existing comments. This function makes it possible for members to discuss and examine sympathy for the lyrics, as described under the content constraint in Sect. 2.2.
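The note-mora alignment described under the lyrics selection function can be sketched as follows. This is an illustration under our own simplifications; the actual system operates on the melody data of the composition system.

```python
def align_lyrics_to_notes(morae, n_notes, scat="la"):
    """Map a list of morae onto a melody with n_notes notes.

    Extra notes are sung as scat syllables; extra morae are ignored,
    mirroring the behaviour of the lyrics editing window.
    """
    if len(morae) >= n_notes:
        return morae[:n_notes]                          # surplus morae dropped
    return morae + [scat] * (n_notes - len(morae))      # fill with scat

print(align_lyrics_to_notes(["sa", "ku", "ra"], 5))
# ['sa', 'ku', 'ra', 'la', 'la']
```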
5 Experiment

5.1 Experiment Objective

As described in Sect. 3, collaborative lyric writing can be a synchronous or an asynchronous process involving multiple members; that is, the proposed methods should apply to both individual and collaborative work. Therefore, as a preliminary experiment before the main experiment on collaborative lyric writing, our experiment aims to verify whether the divergence-related functions of our system, which implements the proposed method, are effective for individual tasks. The evaluation of the convergence method and of collaborative work is left for future work.

5.2 Experimental Overview

Evaluation method: The evaluation method was a controlled experiment, in which the experimental group wrote lyrics using the LAM and the control group did not use the map.

Participants: There were seven participants in total, i.e., six males and one female, who were undergraduate students and a faculty member.
Experimental system: Two systems were used. System A implements the lyric-writing function of the collaborative composition system described in Sect. 4. System B removes the LAM function from System A: only the lyrics editing window is displayed on the screen, and instead of the lyric association map, users perform batch input of words into rows of a tabular list from the text area at the bottom of the window.

Task: The participants were given an original piece of music (four parts: melody, piano, bass, and drums) and asked to freely write lyrics using the system.

Experimental procedure:
1. We explained to the participants the purpose of the experiment and how to use the system.
2. For the lyric-writing task, the experimental group used System A and the control group used System B. The duration of the task was 60 min, which could be extended as needed by the participants. The work was recorded on video.
3. After the task was completed, a questionnaire and an interview regarding the system evaluation were administered.

5.3 Results

We extracted the number of words conceived by each participant from the operation log and compared the words selected by the experimental and control groups. From the operation log, we excluded entries in which a meaningless string of characters was entered and entries in which a temporary word was entered to fill the space until a lyric candidate was conceived. Table 1 shows the number of words and phrases conceived by the participants for each task in the two groups, as well as their averages. The average number of words conceived by the experimental group was larger than that of the control group. Welch's t-test (α = 0.05) for the two groups showed no significant difference.

Table 1. Number of lyric candidates for lyric-writing tasks
System   | Participant | Number of lyric candidates | Average number of lyric candidates
---------|-------------|----------------------------|-----------------------------------
System A | 1           | 77                         | 35.5
         | 2           | 14                         |
         | 3           | 39                         |
         | 4           | 12                         |
System B | 5           | 29                         | 27.8
         | 6           | 24                         |
         | 7           | 35                         |
         | 8           | 23                         |
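The reported test can be reproduced from the table values with SciPy (a sketch; the grouping of participants 1–4 under System A and 5–8 under System B follows from the reported averages):

```python
from scipy import stats

system_a = [77, 14, 39, 12]   # lyric candidates per participant, System A
system_b = [29, 24, 35, 23]   # System B (control)

# equal_var=False selects Welch's t-test.
t, p = stats.ttest_ind(system_a, system_b, equal_var=False)
print(f"t = {t:.3f}, p = {p:.3f}")  # p > 0.05: no significant difference
```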
5.4 Discussion

The experimental group outperformed the control group in terms of both the maximum and the average number of lyric candidates conceived. However, there was no significant difference between the two groups because of the large variation in values between individuals. The large variation is considered a result of some participants being so focused on deciding the lyrics that they could not conceive words and phrases in time. Further experiments are required to obtain more accurate data.
6 Conclusion

In this study, we proposed a lyric ideation method using the lyric association map, an extension of the mind map, for collaborative lyric writing, and implemented the method as an additional function of our collaborative music composition system. Additionally, as a preliminary step toward verifying the effectiveness of the proposed method for collaborative lyric writing, we conducted a controlled experiment using our system with the additional functions. Verifying the effectiveness of the proposed method in collaborative lyric writing by multiple users remains to be done. In the future, we would like to propose a cooperative lyric-writing system that supports multiple users in presenting lyric candidates collaboratively, examining the candidates (including confirming their relationship with the melody by listening to the vocalized melody), and deciding on the lyrics by consensus.
References

1. Goto, M.: The CGM movement opened up by Hatsune Miku, Nico Nico Douga and PIAPRO. IPSJ Mag. 53(5), 466–471 (2012)
2. Nishimura, A., Shiio, I.: conteXinger: VOCALOID takes the context of everyday in singing. IPSJ SIG Tech. Rep. 2013-UBI-38(9), 1–6 (2013)
3. Hori, G., Sagayama, S.: Lyrics generator exploiting distributed semantic representation. In: The 31st Annual Conference of the Japanese Society for Artificial Intelligence, vol. 1N1–2, pp. 1–2 (2017)
4. Nakamura, C., Onisawa, T.: Music/lyrics composition system reflecting user's preference of music. In: 24th Fuzzy System Symposium, vol. WA1–3, pp. 18–23 (2008)
5. Yamamoto, T., Matsubara, M., Saito, H.: Conversion from natural text to lyrics considering the number of morae and syllables. In: The 15th Annual Meeting of the Association for Natural Language Processing, pp. 168–171 (2009)
6. Yamashita, M., Sato, K., Nunokawa, H.: Proposal for the collaborative lyric writing support environment to support integrated lyric writing with melody. IPSJ SIG Tech. Rep. 2019-DCC-21(29), 1–6 (2019)
7. Yamashita, M., Sato, K., Nunokawa, H.: The implementation of collaborative composition system. Int. J. Affect. Eng. 16(2), 81–94 (2017)
8. Nagayama, S.: The mechanisms of building business-systems in the contents industry: a case study of the Japanese record business history. J. Japan Soc. Inform. Manag. 33(2), 71–82 (2012)
9. Ugaya, H.: What is J-pop: The Burgeoning Music Industry, Tokyo (2005)
10. Shimazaki, T.: Songwriting Study Book: The Key to Creating a Sympathetic Story is to Expand Your Perspective and Ideas, Tokyo (2015)
11. Azechi, N.: J-pop: transition in the inclusion rules of rhythm and lyrics. Japan. J. Music Educ. Pract. 5(1), 25–31 (2007)
12. Kitahara, Y., Uwano, Z.: Asakura Japanese Language Courses Vol. 3: Phonetics and Phonology, Tokyo (2018)
13. Sugiyama, K., et al. (eds.): Knowledge Science, pp. 150–163, Tokyo (2002)
14. Buzan, T., Buzan, B.: The Mind Map Book, London (2006)
15. Kawakita, J.: Way of Conception: For Creativity Development, Tokyo (1967)
16. Nakayama, M.: All About NM Method: Theory and Practical Methods of Idea Generation, Tokyo (1977)
17. Kameda, T.: Seeking Consensus Knowledge: Group Decision Making, pp. 71–104, Tokyo (1997)
18. Fischhoff, B., Slovic, P., Lichtenstein, S.: Fault trees: sensitivity of estimated failure probabilities to problem representation. J. Exp. Psychol. Hum. Percept. Perform. 4(2), 330–344 (1978)
Collaborative Virtual Environments for Jaw Surgery Simulation

Krit Khwanngern1(B), Juggapong Natwichai2,4, Vivatchai Kaveeta1, Phornphanit Meenert3, and Sawita Sriyong3

1 Princess Sirindhorn IT Foundation Craniofacial Center, Chiang Mai University, Chiang Mai, Thailand
{krit.khwanngern,vivatchai.k}@cmu.ac.th
2 Center of Data Analytics and Knowledge Synthesis for Healthcare, Chiang Mai University, Chiang Mai, Thailand
[email protected]
3 National Science and Technology Development Agency, Pathum Thani, Thailand
{Phornphanit.mee,Sawita.sri}@ncr.nstda.or.th
4 Faculty of Engineering, Chiang Mai University, Chiang Mai, Thailand
Abstract. Jaw surgery is a challenging surgical technique to study because of the small number of cases; surgeons may never gain real experience with the procedure. This causes delays in critical procedures and may affect the patient's speech development. Virtual reality (VR) is a great tool for simulating surgical operations. Previously, we introduced a virtual reality system for jaw surgery simulation. In this work, we performed an additional evaluation of our system to understand its limitations and the elements that can be improved. The results show that its potential as a training tool is limited, mainly due to the lack of collaborative interaction: trainers outside the VR environment can only communicate with VR users verbally, and surgical techniques are impractical to teach this way. To address this shortcoming, we surveyed works on collaborative features in virtual reality systems. Other proposed improvements are advanced input devices and artificial intelligence. These improvements can raise the realism of the VR system.
1 Introduction

1.1 Jaw Surgery

At Princess Sirindhorn IT Foundation Craniofacial Center, we focus on the treatment of patients with cleft lip, cleft palate, and craniofacial anomalies. These conditions are mainly birth defects in which newborns have malformed facial bones or features. The treatment of these anomalies can last from birth until 20 years of age, during which multiple procedures need to be performed. One of the crucial procedures in the treatment is jaw surgery. It is a surgical operation for patients with malocclusion, a misalignment between the upper and lower jawline. Malocclusion can affect the patient's ability to chew food,
maintain oral hygiene, and their physical appearance. The procedure involves cutting the lower and/or upper jaw to realign and lock the jaws into a better position, and it requires highly skilled specialists. As a teaching hospital affiliated with Chiang Mai University, our mission is to impart medical knowledge to students. Unfortunately, the number of jaw surgery cases per year is very limited. Therefore, many medical students may never have real-world experience. This is the main reason behind our effort to simulate this operative procedure in a virtual reality system.
1.2 Our Proposed System
Fig. 1. Virtual operating room.
To address the shortcoming of the lack of sample cases, we proposed a VR jaw surgery simulation system in our previous work [9]. The system simulates the operating room environment (Fig. 1). It uses the patient's CT scan images to create accurate three-dimensional skull models, and it simulates the jaw surgery process by providing multiple interaction modes, as follows (Fig. 2).
Fig. 2. Interaction modes in our system. (Left) Viewing mode. (Right) Cutting mode.
1.2.1 Training Mode
This is the first mode in which users start their interaction. It teaches users the control schemes and user interfaces. Short exercises are provided in each section to test user skills before continuing to the next modes.
1.2.2 Viewing Mode
Users can visualize the skull model in this mode. They can use the hand controllers and their buttons to move and grab the model for a better viewing angle.

1.2.3 Cutting and Drilling Mode
Users perform the main interactions, cutting and drilling, in this mode. They select the appropriate tool in the visual interface panels. The user needs to cut the jaw at the correct position; the application tracks their progress by comparing the cutting line with the guideline predefined by professionals.

1.2.4 Joining Mode
After users cut the jaw into separate parts, they can glue the cut model into a single object. The objects are then locked and move together. This helps when comparing models in the comparing mode afterward.

1.2.5 Comparing Mode
In this mode, the two skull models, before and after the cutting operation, are shown side by side. Users can control the position and rotation to visualize their differences.
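The paper does not describe how cutting progress is scored. As a purely hypothetical sketch, cutting accuracy could be measured as the fraction of user cut points lying within a tolerance of the predefined guideline, here approximated by its vertices; the tolerance value and millimetre unit are assumptions.

```python
import numpy as np

def cutting_score(cut_points, guideline, tolerance=2.0):
    """Fraction of cut points within `tolerance` (mm, assumed) of the
    guideline, with the guideline approximated by its vertex set."""
    cut = np.asarray(cut_points, dtype=float)     # shape (n, 3)
    guide = np.asarray(guideline, dtype=float)    # shape (m, 3)
    # Distance from each cut point to its nearest guideline vertex.
    d = np.linalg.norm(cut[:, None, :] - guide[None, :, :], axis=2).min(axis=1)
    return float((d <= tolerance).mean())
```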
2 Related Works

2.1 Virtual Reality for Medical Training

The first category comprises works on VR and AR medical applications. [8] reviews multiple methods for anatomy education, including virtual reality (VR) and augmented reality (AR). [2,15] create virtual reality simulators for the training of orthopedic and orthognathic surgery. [6,7] create a simulated laparoscopy procedure by integrating 360-degree video recordings of the operating room. [17] provides a VR and AR implementation for radiology training and communication; the application targets both medical professionals and patients.

2.2 Collaborative Virtual Reality
These works focus on the collaborative aspect of virtual reality applications. [4,10] review past and recent works on collaboration in mixed, augmented, and virtual reality. [3] explores the current and future directions of collaborative mixed reality (MR) systems; many collaborative concepts in this work provide a basis for our proposed extensions. [16] employs collaborative VR in geography education: multiple users share a virtual room, the movement of objects in the room is tracked and transferred, participants are shown as avatar representations, and a laser-pointer function is used for communication. Audio is recorded and played back on the devices. The results show the importance of the social aspect in the effectiveness of virtual-environment-based education. [13,14] use virtual avatars in remote multiplayer VR, where users can acknowledge gazing and pointing from other users' avatars.
2.3 Artificial Intelligence in Virtual Reality
The last category consists of works related to artificial intelligence and machine learning for virtual reality simulation. We see these techniques as an important part of near-future VR systems. [11] integrates artificial intelligence as a virtual operative assistant that can perform automated benchmarks of user performance. Similarly, [1] employs artificial intelligence to differentiate the surgical training level of senior and junior participants.
3 System Evaluation
Fig. 3. A testing session by surgical residents.
3.1 Evaluation Metrics
Since our previous work, we have improved critical parts of the system and deployed it in a real teaching scenario. It underwent further testing by surgical residents as part of their training exercises. Afterward, the participants filled in a survey form and provided additional feedback. The evaluation metric is a satisfaction score ranging from 1 (lowest) to 5 (highest) in the following areas: (1) training mode, (2) viewing mode, (3) cutting mode, (4) drilling mode, (5) joining mode, (6) comparing mode, (7) user-friendliness, and (8) interaction realism.
3.2 Results
The results in Table 1 show above-average scores for all of the system's features. We believe these scores are acceptable for the first real-world version of the application. However, some features need further examination and improvement. The highest score, for the training mode, shows the importance of the first user impression. We spent considerable time implementing a comprehensive training routine.
Table 1. User satisfaction results

Mode/Function       | Average satisfaction score (1–5)
--------------------|---------------------------------
Training mode       | 4.3
Viewing mode        | 4.0
Cutting mode        | 4.0
Drilling mode       | 4.0
Joining mode        | 4.0
Comparing mode      | 4.0
User friendliness   | 3.7
Interaction realism | 4.0
Users are introduced to each control feature one by one and given a simple exercise to test their understanding. The other interactive modes received a score of 4.0 out of 5. User-friendliness received the lowest score, 3.7 out of 5. This mainly comes from the non-standardized control mapping across features: users often found that the same button interacts differently in different modes. We plan to improve the overall control scheme for better coherence in the next version. Users also provided suggestions beyond the list in our survey; these remarks are summarized in the next section.
3.3 User Remarks
Beyond the satisfaction scores, we collected additional feedback and summarize it into three main themes, as follows.
3.4 Guidance and Communication
In the first testing session, we immediately found a major challenge in the communication between the lecturer and the students. Trainers outside the VR environment can only monitor user progress via a live recording of the user's viewpoint. This view differs from the actual user view, as it consists of cropped images displayed on a flat screen, so the trainer may not be able to comprehend all of the objects in the scene. Besides the viewpoint limitation, trainers can only communicate verbally with VR users. This leads to difficult situations where they need to indicate something in the scene. This is a critical limitation that hinders the system's usefulness as a teaching tool.
3.5 Control Scheme
As noted in Sect. 3.2, the control scheme is another pain point in the first version. Users mentioned the inconsistency of button mapping between modes. The joystick, which we map to whole-skull rotation, was negatively received. Users also proposed a feature
to open and close a patient's lower jaw beyond the normal physical range for better views of important regions, and they want the ability to enlarge the skull model beyond its actual size. These features can be beneficial for visualization purposes, but at the same time they seriously break the simulation realism and reduce its accuracy with respect to reality. Users also find the movement cumbersome and not precise enough; we found that for VR applications, a lower number of button controls is preferred in most situations. The last suggestion about the control scheme concerns the lack of action feedback. Users find it difficult to control the cutting and drilling tools without a clear indication of their effect. Currently, when a tool approaches the bone, sound and a particle emitter are produced. However, other cues could also be used, such as a cutting-depth indicator, controller vibration, and haptic feedback controllers.
3.6 Device Limitations
Some feedback stems from the limitations of current-generation VR devices. Although VR devices keep getting smaller with each generation, their weight is still an obstacle for long usage periods. A better head-band design and weight distribution are needed to make the devices comfortable for all types of users. We currently use a VR headset tethered to a personal computer with powerful graphics capability, which restricts the range and area of possible movement. However, [12] shows that recent mobile-based VR achieves great results when compared with desktop-based devices. When mobile VR devices can render detailed skull models and physics simulations, we intend to port this application to a wireless mobile-based VR application. In the next section, we propose possible improvements that directly address these limitations.
4 Proposed Improvement
We look at possible improvements to our current surgical simulation system. The extensions are listed in several groups, and their implementation, benefits, and possible drawbacks are discussed.
4.1 Multiplayer Session
Multiplayer capability in surgery simulation can provide many benefits. Users can share a virtual environment and interact with objects in the same scene, with object properties, including location, rotation, and movement, synced across user views. Another benefit is that the trainer can also enter the VR environment and directly give visual cues to trainees in the form of a laser pointer or tag, eliminating the communication shortcomings of our current version. We can further extend local multiplayer VR into a real remote collaborative experience, which will open many possibilities for collaboration across healthcare providers. However, a remote VR application faces technical challenges in
network bandwidth and delay. Without proper mitigation, interaction between participants over long distances can feel sluggish and error-prone. Another possible multiplayer scenario is asymmetric interaction: [5] uses both AR and VR devices in a shared session. This asymmetric structure may be even more suitable than a VR-only setup for medical education, as the trainer can remain outside of VR while still interacting with the shared environment using AR devices.
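As an illustration of the object-state synchronization described at the start of this subsection, the following sketch shows one plausible newest-wins replication scheme for shared objects; the message format, field names, object identifiers, and timestamp policy are our own illustrative assumptions rather than part of the system described here.

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class TransformUpdate:
    """State of one shared object (all fields are illustrative)."""
    object_id: str   # e.g. "skull" or "saw"
    position: tuple  # (x, y, z)
    rotation: tuple  # quaternion (x, y, z, w)
    timestamp: float # sender clock, used to discard stale packets

def encode(update: TransformUpdate) -> bytes:
    """Serialize an update for transmission to the other peers."""
    return json.dumps(asdict(update)).encode("utf-8")

class SceneReplica:
    """Per-client copy of the shared scene; applies newest-wins updates."""
    def __init__(self):
        self.objects = {}  # object_id -> latest TransformUpdate

    def apply(self, packet: bytes) -> None:
        update = TransformUpdate(**json.loads(packet.decode("utf-8")))
        current = self.objects.get(update.object_id)
        # Ignore out-of-order packets so a high-latency link cannot
        # roll an object back to an older pose.
        if current is None or update.timestamp > current.timestamp:
            self.objects[update.object_id] = update

# Example: one client moves the skull; every replica converges.
replica = SceneReplica()
replica.apply(encode(TransformUpdate("skull", (0, 1.2, 0), (0, 0, 0, 1), time.time())))
```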
4.2 Advanced Input Devices
Currently, our system relies on the standard 3D controllers bundled with VR head-mounted display devices. This type of interface is suitable for general VR applications; surgery, on the other hand, is a delicate procedure that requires input accuracy and fidelity. A new generation of input devices has recently been introduced that provides extra input dimensions for VR users. Hand and finger tracking can detect user finger movements as actions in VR environments: movements such as pinch, push, and swipe can be projected into virtual interactions, and users can usually control the system more naturally with their fingers than with artificial controllers. Combining tracking with haptic feedback devices lets users feel an object's presence, and force feedback in cutting and drilling operations makes the simulation much more realistic.
4.3 Artificial Intelligence
Our current system evaluates user progress and success by measuring the cut distance against a predefined optimal cutting path. The system calculates the progress as a percentage and shows it on a progress panel at the side of the main user viewport, so users need to glance at the panel to get the information. [1,18] train machine learning models to evaluate user performance. In the same way, we can implement an AI to monitor and evaluate user cutting progress. Instead of a single predefined optimal line, the system should be able to adapt to acceptable alternative routes. Other aspects of the session that should be included are the operating time, cutting straightness, and the position of the jaw after surgery.
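As a concrete illustration of the progress measurement described above, the following sketch computes the percentage of a predefined optimal cutting path that the user's cut has covered; the point-based representation, function name, and tolerance value are illustrative assumptions, not the system's actual implementation.

```python
import numpy as np

def cutting_progress(optimal_path, cut_points, tolerance=2.0):
    """Return the percentage of optimal-path samples that lie within
    `tolerance` of some cut point. Both inputs are (N, 3) arrays of
    points (e.g. in millimetres); names and tolerance are illustrative."""
    optimal_path = np.asarray(optimal_path, dtype=float)
    cut_points = np.asarray(cut_points, dtype=float)
    if len(cut_points) == 0:
        return 0.0
    # Distance from every path sample to its nearest cut point.
    diff = optimal_path[:, None, :] - cut_points[None, :, :]
    nearest = np.linalg.norm(diff, axis=2).min(axis=1)
    covered = nearest <= tolerance
    return 100.0 * covered.mean()

# Example: half the path samples lie within tolerance of the cut.
path = np.array([[0, 0, 0], [10, 0, 0], [20, 0, 0], [30, 0, 0]])
cut = np.array([[0, 1, 0], [10, 1, 0]])
print(cutting_progress(path, cut))  # 50.0
```

Supporting "acceptable alternative routes" would then amount to evaluating this score against several candidate paths and taking the maximum.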
5 Conclusion
In this work, we surveyed works related to multiplayer collaborative virtual reality and evaluated our proposed virtual reality jaw surgery simulation. We presented the results and categorized the user feedback. Based on the users' suggestions, we propose additional features to address limitations, increase user-friendliness, and improve interaction realism. The proposed features are multiplayer support, advanced input devices, and artificial intelligence.
References
1. Bissonnette, V., Mirchi, N., Ledwos, N., Alsidieri, G., Winkler-Schwartz, A., Del Maestro, R.F.: Artificial intelligence distinguishes surgical training levels in a virtual reality spinal task. J. Bone Joint Surg. Am. 101(23), e127 (2019)
2. Cecil, J., Kumar, M.B.R., Gupta, A., Pirela-Cruz, M., Chan-Tin, E., Yu, J.: Development of a virtual reality based simulation environment for orthopedic surgical training. In: Ciuciu, I., et al. (eds.) On the Move to Meaningful Internet Systems. LNCS, vol. 10034, pp. 206–214. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-55961-2_21
3. Ens, B., et al.: Revisiting collaboration through mixed reality: the evolution of groupware. Int. J. Hum. Comput. Stud. 131, 81–98 (2019)
4. Fraser, M., et al.: Revealing the realities of collaborative virtual reality. In: Proceedings of the Third International Conference on Collaborative Virtual Environments, pp. 29–37 (2000)
5. Grandi, J.G., Debarba, H.G., Maciel, A.: Characterizing asymmetric collaborative interactions in virtual and augmented realities. In: 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 127–135. IEEE (2019)
6. Huber, T., Paschold, M., Hansen, C., Wunderling, T., Lang, H., Kneist, W.: New dimensions in surgical training: immersive virtual reality laparoscopic simulation exhilarates surgical staff. Surg. Endosc. 31(11), 4472–4477 (2017)
7. Huber, T., Wunderling, T., Paschold, M., Lang, H., Kneist, W., Hansen, C.: Highly immersive virtual reality laparoscopy simulation: development and future aspects. Int. J. Comput. Assist. Radiol. Surg. 13(2), 281–290 (2018)
8. Iwanaga, J., Loukas, M., Dumont, A.S., Tubbs, R.S.: A review of anatomy education during and after the COVID-19 pandemic: revisiting traditional and modern methods to achieve future innovation. Clin. Anat. 34(1), 108–114 (2021)
9. Khwanngern, K., et al.: Jaw surgery simulation in virtual reality for medical training. In: Barolli, L., Nishino, H., Enokido, T., Takizawa, M. (eds.) NBiS 2019. AISC, vol. 1036, pp. 475–483. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29029-0_45
10. Ladwig, P., Geiger, C.: A literature review on collaboration in mixed reality. In: Auer, M., Langmann, R. (eds.) REV 2018. LNNS, vol. 47, pp. 591–600. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-95678-7_65
11. Mirchi, N., Bissonnette, V., Yilmaz, R., Ledwos, N., Winkler-Schwartz, A., Del Maestro, R.F.: The virtual operative assistant: an explainable artificial intelligence tool for simulation-based training in surgery and medicine. PLoS ONE 15(2), e0229596 (2020)
12. Moro, C., Štromberga, Z., Stirling, A.: Virtualisation devices for student learning: comparison between desktop-based (Oculus Rift) and mobile-based (Gear VR) virtual reality in medical and health science education. Australas. J. Educ. Technol. 33(6), 1–10 (2017)
13. Piumsomboon, T., Day, A., Ens, B., Lee, Y., Lee, G., Billinghurst, M.: Exploring enhancements for remote mixed reality collaboration. In: SIGGRAPH Asia 2017 Mobile Graphics & Interactive Applications, pp. 1–5 (2017)
14. Piumsomboon, T., et al.: Mini-Me: an adaptive avatar for mixed reality remote collaboration. In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 1–13 (2018)
15. Pulijala, Y., Ma, M., Pears, M., Peebles, D., Ayoub, A.: An innovative virtual reality training tool for orthognathic surgery. Int. J. Oral Maxillofac. Surg. 47(9), 1199–1205 (2018)
16. Šašinka, Č., et al.: Collaborative immersive virtual environments for education in geography. ISPRS Int. J. Geo-Information 8(1), 3 (2019)
17. Uppot, R.N., et al.: Implementing virtual and augmented reality tools for radiology education and training, communication, and clinical care. Radiology 291(3), 570–580 (2019)
18. Winkler-Schwartz, A., et al.: Artificial intelligence in medical education: best practices using machine learning to assess surgical expertise in virtual reality simulation. J. Surg. Educ. 76(6), 1681–1690 (2019)
Deterrence-Based Trust: A Study on Improving the Credibility of Social Media Messages in Disaster Using Registered Volunteers
Takumi Kitagawa1, Tetsushi Ohki1, Yuki Koizumi2, Yoshinobu Kawabe3, Toru Hasegawa2, and Masakatsu Nishigaki1(B)
1 Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
[email protected]
2 Graduate School of Information Science and Technology, Osaka University, Osaka, Japan 3 Department of Information Science, Aichi Institute of Technology, Aichi, Japan
Abstract. There is increasing recognition of the importance of social media in information transmission during disasters. However, it is difficult to distinguish credible information from non-credible information in the deluge of content on social media. In order to ensure the transmission of highly credible information during disasters, both the credibility of messages and the credibility of individuals must be guaranteed. We propose to enhance (i) the credibility of messages by the combined use of an automatic message filtering system and a crowdsourcing message verification platform, and (ii) the credibility of individuals by registering volunteers' identity information and using its traceability as a deterrent against false verification. We are studying both (i) and (ii) simultaneously, but this paper focuses on the latter and aims to determine the relationship between types of identity information and their ability to deter the transmission of false reports. An online survey administered to 114 volunteers (with 41 valid responses) was used to analyze the appropriate types of identity information from the perspectives of privacy concerns and deterrence of false information transmission. Keywords: Disaster communication · Social media messages · Trust · Biometric information
1 Introduction
In recent years, the increasing severity of natural disasters has coincided with an increase in public disaster awareness. During disasters, it is vital for rescue organizations, such as fire and police departments, and emergency personnel to collect the latest information about victims and disaster conditions, and the usefulness of the Internet in transmitting disaster information through social media platforms, such as Twitter, is gaining increasing recognition. As social media makes it possible to collect real-time information from all individuals involved in an event, the rate of social media usage for disaster support by local governments is growing from year to year [5, 29]. However, the credibility of information on social media is generally said to be low compared to other media. Social
media features a plethora of information from a variety of users, which makes it difficult to single out highly credible information. Given this challenge, a mechanism is needed to guarantee the credibility of information on social media during disasters. In order to ensure the transmission of highly credible information during disasters, it is necessary to identify highly credible messages and senders. However, social media messages are riddled with uncertainty and inaccuracies. For example, victims might send false messages with self-serving motives, volunteers might send incorrect messages due to misunderstandings, and outsiders might send prank messages. In addition, as disaster situations can be very dynamic, any conditions confirmed as true at a given location and time might not persist long-term. Because of these conditions, there is a pressing need to find ways to evaluate and verify the credibility of social media messages. Determining message consistency (the lack of contradictory messages from multiple sources) may help in verifying message credibility. However, in order to address circumstances where multiple users intentionally or unknowingly retweet fake news, the addition of a monitoring mechanism for confirming the veracity of messages by human eyes is desirable. Confirming the credibility of the individuals posting information on social media may also help in evaluating message credibility; however, it is difficult to guarantee the credibility of anonymous social media users. To overcome this challenge, we propose to enhance the credibility of messages by the combined use of a system called the social media engine (SME), which automatically filters out inconsistent information from the multiple messages sent and received on social media, and a crowdsourcing credibility verification platform, which asks volunteers (on-site volunteers providing physical support at a disaster site) to verify the credibility of messages. The credibility of individuals is enhanced by asking registered volunteers, who agree to have some form of personal information ("identity information" hereafter) temporarily registered with shelters at the disaster site, to monitor anonymous volunteers' behavior. This effectively increases the accuracy of crowdsourcing credibility verification by having a few registered volunteers encourage a larger number of anonymous volunteers to modify their behavior. In a later chapter, we will provide an overview of our credibility enhancement system. As will be explained later, in order to make our credibility enhancement system feasible, the following issues must be studied: IS1) how to develop the social media engine, IS2) how to improve trust in anonymous volunteers, and IS3) how to create trust in registered volunteers. We are studying all the issues simultaneously, but this paper focuses on IS3. That is, this paper aims to analyze how much each type of identity information contributes to creating trust in registered volunteers. We refer to this type of trust as "deterrence-based trust." It is based on the psychology of "if I post a lie, then I will be identified as the liar when the lie is exposed," leading to increased credibility among registered volunteers. In this paper, details on deterrence-based trust will be provided.
2 Related Research
2.1 Use of Social Media in Disasters
To make quick and correct decisions regarding how human lives can be protected in the event of a disaster, it is important to analyze one's surroundings, which necessitates the quick compilation of disaster information that varies over time. Therefore, communication is vital for responding to and recovering from a disaster. However, disasters frequently damage communication and information infrastructure, resulting in reduced communication availability and information flow [27]. One countermeasure to this issue is to study and develop disaster-resilient networks such as delay/disruption-tolerant networks [6, 17, 30]. The use of social media as a means of communication during disasters has also been proposed as a countermeasure [11, 13]. There are examples from recent years of social media being used at actual disaster sites. In the case of the Haiti earthquake in January 2010, volunteers monitored traditional and social media sources and plotted the disaster state, linked to its position, on a live map on the Internet; the emergency services used this information for disaster-response planning [9]. After the Tohoku earthquake in Japan in March 2011, social media was used not only as a communication tool between family and community members but also as a direct channel to the government and the public [16]. In addition to the local sharing of disaster information, social media allows the continuous transmission of local information to the entire world [12]. This is a big help in quickly providing information to family members living far away from the affected areas and in obtaining external support (e.g., through the transportation of goods). In Mabi Town, Kurashiki City, Japan, where approximately 30% of the area was submerged in water owing to torrential rains in September 2018, a man who was contacted by a friend conducted rescue activities on a water bike, and this information was spread through social media [28]. Thus, social media plays an important role as an information-sharing tool in the event of a disaster.
2.2 Credibility of Social Media
Trust is an important aspect of communication. In particular, in the event of a disaster, accurate and credible information is highly necessary to protect the property and lives of victims in affected areas. However, information on existing social media often lacks credibility because of fake news. The increase in fake news is becoming a global problem [21]. Furthermore, social media is considered a platform where even false information can spread rapidly, because it specializes in the spreading of news by its users through the sharing and retweeting of information [25]. Although one report suggested that social media credibility improves in the event of a disaster [11], a review that analyzed the factors leading to the spread of fake news [1] indicated that fake news is more likely to spread in the event of a disaster, so the low credibility of social media remains a major concern. In this regard, it has been reported that emergency-management organizations sometimes hesitate to trust and utilize information received from social media users because they cannot quickly verify the information [4, 8].
2.3 Building Credibility of Social Media in the Event of a Disaster
Various studies have been conducted to improve the credibility of social media. One previous study [1] attempted to identify fake news by analyzing the factors leading to the spread of fake news and the types of users spreading the tweets. In another study [26], machine learning was used to create a classifier that predicts the credibility of web pages shared on Twitter. However, these existing studies did not target disaster communication and required a large number of social media messages sent and received over a long period of time before fake tweets could be identified or predicted. Because social media communication is likely to exhibit different behavior in the event of a disaster, and to vary from disaster to disaster, we believe that the techniques proposed by these studies cannot be used in a straightforward manner. As a credibility-verification system applicable in the event of a disaster, a crowdsourcing inspection model has been proposed [22]. In that model, it is assumed that social media users are hired as crowd workers, i.e., social media users play the roles of both information disseminators and information-credibility verifiers. However, as mentioned above, the low credibility of social media itself remains a concern, and emergency-management organizations do not always trust reports from social media users. Judging from these facts, the problem of fake-news proliferation is difficult to solve with crowdsourcing inspection conducted by social media users alone. Therefore, we focus on volunteers working in the field at disaster sites and propose to ask them to play the role of crowd workers. We believe that, as social media has become increasingly popular as a tool for sharing information regarding disasters, information review must become a new and necessary activity for supporting areas affected by disasters. By asking volunteers at a disaster site to visually confirm the credibility of information on social media, we aim to improve the trustworthiness of the crowdsourcing inspection model for social media in the event of disasters.
3 Credibility Enhancement System
3.1 Overview
Jahanian et al. proposed the use of dual-level information cleansing with the social media engine (SME) and crowdsourcing inspection as means of gathering accurate information during a disaster [14]. The SME automatically filters out inconsistent information from the multiple messages sent and received on social media. Volunteers gathered at a disaster site serve as crowd workers and visually confirm the credibility of information from the SME. For this type of crowdsourced information cleansing to work perfectly, we must assume that no volunteers are providing non-credible reports. However, this is not always true, especially when volunteers are anonymous. If the identities of all volunteers were known, then the fact that any volunteer's identity could be revealed would deter volunteers from submitting non-credible reports. However, it would not be pragmatic to force all volunteers to register their identities during a disaster, owing to the psychological burdens that would be imposed on those registering identities.
Fig. 1. Credibility verification system.
Therefore, we developed a system in which volunteers who have had their identities registered and confirmed (registered volunteers) can check the work of other volunteers who have not had their identities registered (anonymous volunteers). For registered volunteers, the fact that any registered volunteer's identity could be revealed is expected to deter them from submitting non-credible reports. For anonymous volunteers, the fact that they are being watched by registered volunteers is expected to deter them from submitting non-credible reports. This should reduce the number of volunteers who must have their identities registered while achieving information cleansing with a higher degree of certainty. Figure 1 shows an overview of the proposed credibility verification platform.
3.2 SME: Filtering Out Inconsistent Social Media Messages
The social media engine (SME) parses social media messages, uses text mining technology to extract disaster site information, and then classifies and itemizes this information according to time, location, and content. Information spread through tweets on Twitter with the same content is condensed and summarized in the form of individual "event reports" for each independent item. However, there can be circumstances where multiple users intentionally or unknowingly retweet fake news. Therefore, even after the SME filters out inconsistent information from multiple messages, some event reports still contain inaccurate information at this stage. To address this, during the next stage, the credibility of event reports is visually confirmed by both anonymous volunteers and registered volunteers.
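The paper does not specify the SME's internals, but a minimal sketch of the condensation step just described, grouping parsed messages by time, location, and topic into event reports and flagging mutually inconsistent claims for the human verification stage, might look as follows; the field names, bucketing scheme, and consistency flag are our illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Message:
    """A parsed social media message; field extraction via text mining
    is assumed to have happened upstream, and all names are illustrative."""
    time_slot: str  # e.g. "2021-07-03T14" (hour bucket)
    location: str   # e.g. "shelter-12" or a geohash cell
    topic: str      # e.g. "flooding", "road-blocked"
    claim: str      # normalized content, e.g. "bridge A impassable"

def build_event_reports(messages):
    """Condense messages sharing time, location, and topic into one event
    report, counting each distinct claim so that contradictory claims
    remain visible to the next (human) verification stage."""
    reports = defaultdict(lambda: defaultdict(int))
    for m in messages:
        reports[(m.time_slot, m.location, m.topic)][m.claim] += 1
    return [
        {"event": key, "claims": dict(claims), "consistent": len(claims) == 1}
        for key, claims in reports.items()
    ]

# Example: two agreeing messages and one contradicting message are
# condensed into a single event report flagged as inconsistent.
msgs = [
    Message("2021-07-03T14", "shelter-12", "flooding", "bridge A impassable"),
    Message("2021-07-03T14", "shelter-12", "flooding", "bridge A impassable"),
    Message("2021-07-03T14", "shelter-12", "flooding", "bridge A passable"),
]
for report in build_event_reports(msgs):
    print(report)
```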
3.3 Crowdsourcing Platform: Confirming the Credibility of Event Reports
The event reports output by the SME are submitted to a crowdsourcing system, and jobs are published to confirm the credibility of each individual event report. After arriving at a disaster site, volunteers access the crowdsourcing system to check the list of jobs (event reports). Volunteers then select event reports that suit their individual situations (such as event reports located near their current location) and receive orders to confirm the credibility of the received jobs. Volunteers then proceed directly to the site of the reported event, visually verify the credibility of the contents of the report, and report the results via the crowdsourcing system. Given the large number of messages sent and received on social media, there is likely to be an equivalently large number of event reports output by the SME. Therefore, it is necessary to ensure that there are enough on-site volunteers (crowd workers) to determine the credibility of such large numbers of event reports. To secure as many volunteers as possible, all volunteers gathered at a disaster site are allowed to participate anonymously. How to recruit registered volunteers will be described later in Sect. 3.5.
3.4 WiT-Based Trust: Ensuring Credibility of Anonymous Volunteers
Because volunteers are anonymous, lazy volunteers might make quick assessments of the credibility of event reports, or even simply submit reports without actually visiting the site of an event. Anonymity would also make it easier for people to engage in deliberate fraud, such as launching conspiracy or Sybil attacks. To deter this type of behavior, we use registered volunteers as a form of deterrence against fraudulent behavior by anonymous volunteers. Specifically, we ask volunteers to confirm the credibility of event reports not individually but in groups. Teams of two to three volunteers are formed by randomly selecting both anonymous and registered volunteers (a sketch of this assignment is given below). Adams et al. reported that overall team credibility is greatly influenced by trust in the team leader [2]. Therefore, it is expected that the anonymous volunteers in a team with a registered volunteer will be motivated to behave properly. It is also expected to be difficult for anonymous volunteers to lie even in teams formed solely of anonymous volunteers, because they will be subject to peer pressure from the other team members with whom they are working. We refer to this type of trust as "Within-the-Team-based trust (WiT-based trust)." It is worth noting that it is standard practice for volunteers to work in teams at actual disaster sites, because volunteers acting on their own could increase the risk of secondary disasters.
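A minimal sketch of the team-formation step referenced above follows; the round-robin policy that spreads registered volunteers across teams first is our illustrative reading of "randomly selecting both anonymous and registered volunteers," not a specification from the paper.

```python
import random

def form_teams(registered, anonymous, team_size=3, seed=None):
    """Randomly partition volunteers into teams of roughly `team_size`,
    dealing registered volunteers out first so that as many teams as
    possible contain at least one of them. All names are illustrative."""
    rng = random.Random(seed)
    registered = registered[:]
    anonymous = anonymous[:]
    rng.shuffle(registered)
    rng.shuffle(anonymous)
    volunteers = registered + anonymous
    n_teams = max(1, len(volunteers) // team_size)
    teams = [[] for _ in range(n_teams)]
    # Round-robin dealing: registered members land in distinct teams
    # first, anonymous members fill the remaining slots.
    for k, v in enumerate(volunteers):
        teams[k % n_teams].append(v)
    return teams

teams = form_teams(["R1", "R2"], ["A1", "A2", "A3", "A4"], seed=7)
print(teams)  # e.g. [['R1', 'A2', 'A4'], ['R2', 'A1', 'A3']]
```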
3.5 Deterrence-Based Trust: Ensuring Credibility of Registered Volunteers
The registered volunteers responsible for verifying the credibility of the anonymous volunteers are the "last line of trust" in our information cleansing system. Therefore, it is important to provide processes by which we can recruit registered volunteers and create deterrence so that the registered volunteers cannot lie. For volunteer recruiting, we can obtain support from shelters at the disaster site. All volunteers at actual disaster sites are required to fill out a volunteer application by accessing the Internet or by visiting a shelter. This is a standard procedure, because volunteers need to receive job information to know "what to do." In general, this application includes volunteer insurance enrollment, because in many countries volunteers are usually advised to take out volunteer insurance in order to prepare for the risk of secondary disasters [23, 7, 3]. Through the insurance enrollment, the volunteer's contact information and personal information are stored with shelters and/or insurance companies. We believe that we can recruit registered volunteers for our information cleansing system simply by adding an opt-in question to the volunteer application form. We ask those volunteers who explicitly agree to the use of their information to identify them to participate as registered volunteers in our system (Fig. 2). The information of the other volunteers will never be used to identify any particular volunteer, and we ask them to participate as anonymous volunteers. For deterrence creation, we have to examine which types of identity information are appropriate as a means of identity verification and as deterrents to the transmission of false information. Here, "identity information" signifies information that is used for the exclusive purpose of verifying an individual's identity; examples are personal information (name, address, date of birth, gender), biometric information, and smartphone numbers.
4 Effectiveness of Deterrence-Based Trust
4.1 Survey Overview
In order to make the social media information cleansing system explained in the previous chapter feasible, issues IS1 (how to develop the social media engine), IS2 (how to improve trust in anonymous volunteers), and IS3 (how to create trust in registered volunteers) must be studied. We are studying all the issues simultaneously, but this paper focuses on issue IS3. Our idea for reaching this goal is to use deterrence as a mechanism for enhancing the credibility of registered volunteers. "Deterrence-based trust" leverages the psychology of "if I post a lie, then I will be identified as the liar when the lie is exposed" to make it harder for information transmitters to tell lies. As a result, it is expected that the registration of identity information increases credibility among registered volunteers.
Fig. 2. Information transmission from registered volunteers.
This paper focuses on analyzing how much each type of identity information contributes to creating trust in registered volunteers. The types of identity information featured in this survey consisted of three items of biometric information, four items of personal information, and one item as a control: face, fingerprint, voiceprint, two pieces of personal information (name and address), four pieces of personal information (name, address, date of birth, and gender), smartphone number, driver's license, and no identity information registered. We expect to report survey results regarding issues IS1 and IS2 in the near future.
4.2 Survey Items
In order to clarify the effectiveness of deterrence-based trust, the following research questions were investigated via a survey:
RQ1: What types of identity information should be used to heighten deterrence of the transmission of false information?
RQ2: Is the degree to which privacy is violated equivalent to the effectiveness of the deterrence of the transmission of false information?
RQ3: Does the effectiveness of deterrence of the transmission of false information change based on the severity of the lie?
Regarding the severity of potential lies in RQ3, various classifications of lies exist: lies that disguise the truth, lies that exaggerate, and so on. It is believed that the type of lie affects how easy it is for users to tell lies and how hard it is for deterrence to work. In this survey, we featured three types of lies: "rescue," a lie requesting that the liar be rescued faster, which causes other injured persons to be rescued more slowly; "distribution," a lie requesting a food ration for the liar faster, which causes other persons' food to be distributed more slowly; and "prank," a lie that has no merit for the liar and no disadvantage to others.
4.3 Survey Question Composition
In order to clarify RQ1 to RQ3, the following Questions (1) through (7) were created for the survey. Due to the word limit of this paper, we describe only Questions (2) to (5) in detail. The question numbers represent the order in which the questions were administered. The survey was created using LimeSurvey [19], a Japanese web questionnaire system, and respondents were recruited using Lancers [20], a Japanese crowdsourcing service. During recruitment, the survey targets were indicated as "individuals who have used biometric authentication."
(1) Questions about usage of biometric authentication.
(2) IMC questions. It is well known that potential satisficing greatly impacts online surveys [18]. One possible safeguard against satisficing is an instructional manipulation check (IMC) [24]. An IMC is a method of checking for satisficing that asks respondents to respond "incorrectly" to a question.
(3) Questions about privacy. In order to survey how much of their privacy respondents felt would be violated if personal information happened to be disclosed to others, respondents were asked the following question for each of seven types of identity information (face, fingerprint, voiceprint, two pieces of personal information, four pieces of personal information, smartphone number, and driver's license): "How much of a privacy violation do you feel disclosure of this information would represent?" Responses used a five-point Likert scale: "1: None at all"; "2: Not that much of a violation"; "3: Can't say either way"; "4: Somewhat of a violation"; "5: Yes, it would be a violation." The response results constituted the "privacy score."
(4) Questions about false message transmission. In order to survey the effectiveness of deterrence-based trust for each of eight types of identity information (face, fingerprint, voiceprint, two pieces of personal information, four pieces of personal information, smartphone number, driver's license, and no information registered), respondents were asked the following question for each of the three types of lies (prank, distribution, and rescue): "Do you think a user would knowingly transmit a false message on an emergency communications network if their identity information were registered?" Responses used a five-point Likert scale: "5: None at all"; "4: Not really"; "3: Can't say either way"; "2: Somewhat yes"; "1: Very much yes." The responses to this question constituted the "deterrence score."
(5) Questions about knowledge of biometric authentication. Questions were posed to the respondents regarding the vulnerabilities of biometric authentication. The questions were created with reference to 17 vulnerabilities described in the literature [10]. Some questions were created in a pairwise manner so that filtering could be conducted using response consistency. As illustrated by the following example, two questions were posed for each of several vulnerabilities to confirm response consistency:
Ex.) The possibility of physically duplicating biometric information:
Question (a): Biometric information cannot be duplicated by physical methods.
Question (b): It is possible to falsify biometric information successfully.
If the responses are inconsistent, it can be considered that the respondent is skimming through the questions. As a result of discussion among the experimenters (the author and two co-authors), 31 questions were created; of these, 12 pairs were created for response consistency assessment. Furthermore, two dummy questions were added requesting that the respondent obey an instruction: "please select the right (or left) answer for this question."
(6) Questions about social media usage.
(7) Questions about basic information.
4.4 Results
The survey garnered 114 respondents in their 20s to 60s (75 men, 39 women). IMC filtering (Question (2)) was first applied to the respondents; 73 respondents passed, while 41 were excluded. Next, the 73 respondents were filtered via response consistency (Question (5)). As stated in item (5) of Sect. 4.3, a total of 11 question pairs
were used, and pairs with inconsistent answers were counted as wrong answers. Strictly speaking, only respondents with zero inconsistent answers should have been included in the analysis, but unfortunately there were only four such respondents. Hence, we widened the scope to a tolerance of 0 to 2 inconsistent answers, so that 41 respondents were ultimately included; the subsequent analysis was conducted on these 41 respondents. In this paper, hereinafter, each item of identity information used in the survey is denoted as follows: no identity information registered is represented by "none"; the face by "face"; fingerprints by "fing"; voiceprints by "voice"; two pieces of personal information (name and address) by "PI2"; four pieces of personal information (name, address, date of birth, and gender) by "PI4"; a driver's license by "DL"; and a smartphone number by "PN." Averages and variances for the privacy scores (Question (3)) and deterrence scores (Question (4)) are listed in the "Privacy Score" and "Deterrence Score" columns of Table 1, respectively. The values in the table are calculated to three significant digits and displayed in an "average (variance)" format. Based on the survey results summarized in Table 1, a statistical analysis was conducted to clarify the research questions detailed in Sect. 4.2. As responses were gathered using a Likert scale, the analysis treated privacy scores and deterrence scores as ordinal scales; therefore, they must be analyzed using non-parametric statistics [15]. In order to answer RQ1, we examined whether disparities existed in the deterrence scores for "prank." To compare three or more groups non-parametrically, a Friedman test was conducted, and multiple comparisons were conducted using the Holm method with a Wilcoxon signed-rank test at a significance level of 5%. The Friedman test gave p < .05, confirming a degree of difference between the types of identity information. The multiple comparison results in Table 2 illustrate the identity information across which disparities in deterrence scores exist; the values in the table are rounded to three significant digits. From Tables 1 and 2, the deterrence against posting false information was highest for "PI4, DL, PI2," followed by "face, PN, fing." The lowest was "voice," though no particularly significant gap was seen. Significant gaps were demonstrated only between "voice" and the others except "face," and between "none" and the others. Similar trends were confirmed in the deterrence scores for the other types of lies. In order to answer RQ2, disparities in privacy scores were first examined in the same manner. The Friedman test results were p < .05, confirming a degree of disparity between the types of identity information. Table 3 lists the results of the multiple comparisons for "prank." From Tables 1 and 3, the amount of potential privacy violation was highest for "PI4, DL, PI2," followed by "face, PN, fing"; the lowest was "voice." Significant gaps were seen in particular between "PI4" and "face, PN, fing, voice"; between "PI2, DL" and "PN, fing, voice"; and between "face, PN, fing" and "voice." Similar trends were confirmed for the other types of lies. Then, an analysis was conducted to determine whether the degree of potential privacy violation and the deterrence against the transmission of false information were similar.
Spearman’s rank correlation was used as a nonparametric measure of correlation. An analysis was conducted at a 5% significance level for each type of lie. Table 4 lists the results for each
lie. Based on Table 4, it is clear that there is a low correlation between the privacy score and the deterrence score for "prank" and "distribution." For visual comparison, distribution charts for each lie were created in Fig. 3, with the privacy score on the horizontal axis and the deterrence score on the vertical axis. The ellipses on the charts represent 30% probability regions. From Fig. 3, it can be seen visually that the effectiveness of deterrence decreases as the severity of the lie increases.
Table 1. Survey results and analysis.
Table 2. Deterrence score multiple comparison results: p values (n = 41).
Table 3. Privacy score multiple comparison results: p values (n = 41).
In order to answer RQ3, the existence of disparities resulting from the severity of the lie was examined. First, for each type of lie, the deterrence scores of all identity information items (except "none") were averaged for each respondent. Then, a comparison between the averaged deterrence scores for "prank," "distribution," and "rescue"
Table 4. Correlation analysis results (n = 41).
Table 5. Deterrence score disparity based on the degree of lie (n = 41).
was performed. The analysis methods were identical to those used previously. The Friedman test, with p < .05, confirmed a degree of difference between the types of lies. The results of the multiple comparisons are shown in Table 5. From Table 5, it is clear that the deterrence posed by identity information registration becomes less effective as the severity of the lie increases.
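As an illustration of the non-parametric pipeline described above (a Friedman test, pairwise Wilcoxon signed-rank tests with Holm correction, and Spearman's rank correlation), a sketch using scipy follows; the data below are random placeholders rather than the survey responses, and the implementation is ours, not the authors'.

```python
import numpy as np
from itertools import combinations
from scipy.stats import friedmanchisquare, wilcoxon, spearmanr

# scores: rows = respondents, columns = identity-information types.
# Random placeholder Likert data, not the paper's survey data.
rng = np.random.default_rng(0)
scores = rng.integers(1, 6, size=(41, 8)).astype(float)

# 1) Friedman test: any differences across the 8 related conditions?
stat, p = friedmanchisquare(*scores.T)
print(f"Friedman: chi2={stat:.2f}, p={p:.4f}")

# 2) Post hoc: pairwise Wilcoxon signed-rank tests, Holm-corrected.
pairs = list(combinations(range(scores.shape[1]), 2))
raw_p = np.array([wilcoxon(scores[:, i], scores[:, j]).pvalue
                  for i, j in pairs])
order = np.argsort(raw_p)
m = len(raw_p)
adj_p = np.empty(m)
running_max = 0.0
for rank, idx in enumerate(order):
    # Holm: k-th smallest p is multiplied by (m - k); enforce monotonicity.
    running_max = max(running_max, raw_p[idx] * (m - rank))
    adj_p[idx] = min(1.0, running_max)
print("significant pairs:", [pairs[k] for k in range(m) if adj_p[k] < 0.05])

# 3) Spearman rank correlation between privacy and deterrence scores.
privacy = rng.integers(1, 6, size=41).astype(float)
deterrence = rng.integers(1, 6, size=41).astype(float)
rho, p_rho = spearmanr(privacy, deterrence)
print(f"Spearman: rho={rho:.2f}, p={p_rho:.4f}")
```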
Fig. 3. Privacy score and deterrence score distribution charts.
5 Conclusion and Future Research
As a means of guaranteeing the credibility of information on social media during a disaster, we proposed the combined use of a system called the social media engine (SME), which automatically filters out inconsistent information from the multiple messages sent and received on social media, and a crowdsourcing credibility verification platform, in which two layers of volunteers at a disaster site are asked to visually verify the credibility of messages. The crowdsourcing credibility verification platform uses both WiT-based trust as a method of guaranteeing the credibility of anonymous volunteers and deterrence-based trust as a method of guaranteeing the credibility of registered volunteers. Our social media information cleansing system will work more
effectively as the number of volunteers visiting the disaster site increases during its relief supply phase. This paper focused on deterrence-based trust and conducted a survey to examine its potential utility. The survey revealed that the use of identity information can act as a deterrent against the transmission of false information, and that the degree of deterrence varies based on both the type of identity information and the severity of the potential lie. It also found that deterrence-based trust can contribute to the crowdsourcing credibility verification platform, and that biometric information could be one of the optimal types of identity information to use for deterrence-based trust. However, the effectiveness will differ on a case-by-case basis depending on various factors, and therefore further investigation is required. Currently, we are developing the SME and analyzing how WiT-based trust can be effective for several disaster scenarios. An integration mechanism for the SME, WiT-based trust, and deterrence-based trust is planned as future work.
Acknowledgements. This study is a joint research effort with Prof. K. K. Ramakrishnan of the Department of Computer Science and Engineering, University of California, Riverside, USA. This study is partially supported by the National Institute of Information and Communications Technology in Japan (Contract No. 193).
References
1. Apuke, O.D., Omar, B.: Fake news and COVID-19: modelling the predictors of fake news sharing among social media users. Telematics Inform. 56, 101475 (2020)
2. Adams, B.D., Sartori, J.A.: Validating the trust in teams and trust in leaders scales. Defence Research and Development Canada, Toronto (2006)
3. CIMA Volunteers Insurance. https://www.cimaworld.com/nonprofits/cima-volunteers-insurance/. Accessed 04 April 2021
4. Crowe, A.: Disasters 2.0: The Application of Social Media Systems for Modern Emergency Management. CRC Press, Boca Raton (2012)
5. Information Technology Strategy Planning Office, Cabinet Secretariat: Guidebook for Social Media Usage During Disasters (in Japanese). https://www.kantei.go.jp/jp/singi/it2/senmon_bunka/pdf/h2903guidebook.pdf. Accessed 04 April 2021
6. Fall, K.: A delay-tolerant network architecture for challenged internets. In: Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 27–34 (2003)
7. Fukushi Hoken Service Co.: Volunteer Activity Insurance (in Japanese). https://www.fukushihoken.co.jp/fukushi/front/council/volunteer_activities.html. Accessed 04 April 2021
8. Haddow, G.D., Bullock, J.A., Coppola, D.P.: Introduction to Emergency Management, 5th edn. Butterworth-Heinemann, Waltham (2014)
9. Haddow, G.D., Haddow, K.S.: Disaster Communications in a Changing Media World, 2nd edn. Butterworth-Heinemann, Waltham (2014)
10. Hitachi System Development Research Institute: Development of biometric security assessment standards: 2003 report (excerpt) (in Japanese). https://www.jaisa.or.jp/action/group/bio/pdfs/0715_02.pdf. Accessed 04 April 2021
11. Houston, J.B., et al.: Social media and disasters: a functional framework for social media use in disaster planning, response, and research. Disasters 39(1), 1–22 (2015)
12. Houston, J.B., Schraedley, M., Worley, M.E., Reed, K.: Disaster journalism: fostering citizen and community disaster mitigation, preparedness, response, recovery, and resilience across the disaster cycle. Disasters 43(3), 591–611 (2019)
13. Jaeger, P.T., Shneiderman, B., Fleischmann, K.R., Preece, J., Qu, Y., Wu, P.F.: Community response grids: e-government, social networks, and effective emergency management. Telecommun. Policy 31(10–11), 592–604 (2007)
14. Jahanian, M., et al.: DiReCT: disaster response coordination with trusted anonymous volunteers. In: Proceedings of the 2019 International Conference on Information and Communication Technologies for Disaster Management (2019)
15. Jamieson, S.: Likert scales: how to (ab)use them. Med. Educ. 38, 1212–1218 (2004)
16. Jung, J.Y., Moro, M.: Multi-level functionality of social media in the aftermath of the Great East Japan Earthquake. Disasters 38(S2), S123–S143 (2014)
17. EL Khaled, Z., Mcheick, H.: Case studies of communications systems during harsh environments: a review of approaches, weaknesses, and limitations to improve quality of service. Int. J. Distrib. Sens. Netw. 15(2), 1–22 (2019)
18. Krosnick, J.A.: Response strategies for coping with the cognitive demands of attitude measures in surveys. Appl. Cogn. Psychol. 5, 213–236 (1991)
19. LimeSurvey: the online survey tool - open source surveys. https://www.limesurvey.org/. Accessed 04 April 2021
20. Lancers. https://www.lancers.jp/. Accessed 04 April 2021
21. McGonagle, T.: Fake news: false fears or real concerns? Neth. Q. Hum. Rights 35(4), 203–209 (2017)
22. Mehta, A.M., Bruns, A., Newton, J.: Trust but verify: social media models for disaster management. Disasters 41(3), 549–565 (2017)
23. Welfare insurance (Welfare insurance services) (in Japanese). https://www.shakyo.or.jp/guide/hoken/index.html. Accessed 04 April 2021
24. Oppenheimer, D.M., Meyvis, T., Davidenko, N.: Instructional manipulation checks: detecting satisficing to increase statistical power. J. Exp. Soc. Psychol. 45, 867–872 (2009)
25. Rampersad, G., et al.: Birds of a feather: homophily in social networks. Comput. Hum. Behav. 9(1), 1–9 (2019)
26. Shah, Z., Surian, D., Dyda, A., Coiera, E., Mandl, K.D., Dunn, A.G.: Automatically appraising the credibility of vaccine-related web pages shared on social media: a Twitter surveillance study. J. Med. Internet Res. 21(11), e14007, 1–14 (2019)
27. Shklovski, I., Burke, M., Kiesler, S., Kraut, R.: Technology adoption and use in the aftermath of Hurricane Katrina in New Orleans. Am. Behav. Sci. 53(8), 1228–1246 (2010)
28. A hero who came with a water bike saves 120 people over 15 hours. https://www.tellerreport.com/news/--a-hero-who-came-with-a-water-bike-saves-120-people-over-15-hours.S1qKWXaXX.html. Accessed 04 April 2021
29. Utsu, K., Uchida, O.: Analysis of rescue request and damage report tweets posted during 2019 Typhoon Hagibis. IEICE Trans. Fundam. E103-A(11), 1319–1323 (2020)
30. Yan, L.: A survey on communication networks in emergency warning systems. All Computer Science and Engineering Research, Report No. WUCSE-2011-100, Saint Louis, MO (2011)
A Perceptron Mixture Model of Intrusion Detection for Safeguarding Electronic Health Record System
Wei Lu1(B) and Ling Xue2
1 Department of Computer Science, Keene State College, USNH, Keene, NH, USA
[email protected]
2 College of Health and Human Services, University of New Hampshire, Durham, NH, USA
[email protected]
Abstract. The Electronic Health Record System (EHRS) has recently become a common healthcare information technology widely adopted by many physicians. Due to compromised internal structures and systems, safeguarding the privacy and security of EHRS has become a very challenging issue, according to the most recent studies on trends and characteristics of protected health information breaches in the United States. Traditionally, intrusion detection systems were proposed to address the security of EHRS infrastructure by detecting unauthorized accesses; however, they tend to generate a large number of false alerts, mainly due to the lack of proper features to model normal behaviors and to overfitting when using signature-based detection algorithms. In this paper, we address this limitation and propose a mixture model combining misuse detection and anomaly detection approaches to minimize the number of false alerts over a specific time period in real time, through a self-learning-fixing-and-improving capability built upon the perceptron algorithm.
1 Introduction
The Electronic Health Record System (EHRS), including medical history, notes, symptoms, diagnoses, medications, lab results, vital signs, immunizations, and reports from diagnostic tests, is the electronic version of the paper repository of information reviewed and used by doctors and other healthcare providers for clinical, research, administrative, and financial purposes. EHRS has recently become a common healthcare information technology widely adopted by many physicians. In fact, over 90% of hospitals in the US have already integrated electronic health record technology into their systems, mainly due to the benefits of improved quality of healthcare with more efficiency and convenience. Emerging technologies such as wearable medical devices have dramatically increased the opportunities for new uses of electronic health records by ushering in more ways of capturing and accessing health data, such as steps taken, distance covered, body temperature, and heart rate. However, they have also paved the way for a large number of cyberattack activities targeting the EHRS. According to the most recent studies on "trends and characteristics of protected health information breaches in
the United States": "From 2010 to 2018, a total of 2,529 breaches affected 194.74 million individual records. Overall, 72.08% of incidents involved healthcare providers; theft (32.94%) and hacking (22.7%) were the major types of breaches. Large cases affecting more than a million records happened due to compromised internal structures and systems" [1]. Traditionally, intrusion detection systems are applied to deal with different kinds of intrusions into health information systems. They, however, tend to generate a large number of false alerts, mainly due to the lack of proper features to model normal behaviors and to overfitting when using signature-based detection algorithms. In order to reduce the number of false alerts, alert correlation was proposed. In alert correlation, multiple components analyze alerts to provide a high-level view of the security state of the network under surveillance, thus offering the potential to relieve system administrators of the large number of false alerts. Nevertheless, a major limitation of alert correlation is that it neglects the inherent cause of the large number of false alerts, even though it reduces their number externally through correlation techniques. As a result, in this paper we address this limitation and propose a perceptron mixture model (PMM) that incorporates a large number of different detection algorithms and various features from different sources. Using dynamic programming techniques, we show that PMM-based intrusion detection systems can obtain the minimum number of false alerts over a specific time period through adaptive adjustment of the internal parameters of the PMM. Therefore, instead of tuning or correlating alerts externally to reduce the number of false alerts, intrusion detection using the PMM can always minimize the number of false alerts through its self-learning-fixing-and-improving capability. The rest of the paper is organized as follows. Section 2 introduces related work, in which we summarize existing work on intrusion detection systems. Our proposed detection scheme is explained in Sect. 3. Section 4 presents a preliminary experimental evaluation of our approach and discusses the obtained results. Finally, in Sect. 5, we draw conclusions and discuss future work.
2 Related Work
Generally, existing approaches for detecting network intrusions in health information systems are based mainly on misuse (signature-based) detection and anomaly detection. Misuse detection is based on the assumption that most attacks leave a set of signatures in the stream of network packets or in audit trails, and thus attacks are detectable if these signatures can be identified by analyzing the audit trails or network traffic behaviors [22]. However, misuse detection is strictly limited to known attacks; detecting new attacks is one of its biggest challenges. On the other hand, anomaly detection techniques attempt to establish normal activity profiles by computing various metrics, and an intrusion is detected when the actual system behavior deviates from the normal profiles, thus addressing the weakness of misuse detection [23–25]. Machine learning techniques are well-suited instruments for implementing network intrusion detection systems in both misuse detection and anomaly detection for a number of reasons: (1) machine learning techniques can learn and improve
based on sample data (both with and without manual labelling); (2) modern machine learning algorithms can cope with large amounts of training data; and (3) machine learning can be applied to making optimal decisions automatically, taking into account uncertainties, risks, and computational costs. According to whether they are based on supervised or unsupervised learning techniques, intrusion detection schemes can be classified as supervised or unsupervised [2]. Supervised intrusion detection establishes the normal/abnormal profiles of systems or networks through training on labelled datasets. In contrast, unsupervised intrusion detection attempts to detect intrusions without using any prior knowledge of attacks or normal instances. Although learning techniques have achieved good results in network intrusion detection so far, they still face many challenges, such as "can machine learning be secure?" [3], "would behavioral non-similarity in training and testing data totally fail learning algorithms on anomaly detection?" [4], and "what are the limitations for detecting previously unknown attacks due to the large number of false alarms?" [5]. In order to overcome these challenges, some researchers have proposed the idea of hybrid detection. This way, the system achieves the advantage of misuse detection, a high detection rate on known attacks, as well as the ability of anomaly detectors to detect unknown attacks. According to the fusion approach, current hybrid intrusion detection systems can be divided into two categories: (1) sequence-based, in which either anomaly detection or misuse detection is applied first and the other is applied next [6–14]; and (2) parallel-based, in which multiple detectors are applied in parallel and the final decision is made based on multiple output sources [15–19]. Although, with respect to the characteristics of signature-based and anomaly-based methods, the fusion of these two approaches should theoretically provide a high-performance intrusion detection system, there are still two important issues that make this task cumbersome. First, all anomaly-based methods need a completely labeled and up-to-date training set, which is very costly and time-consuming, if not impossible, to create. Second, getting different detection technologies to interoperate effectively and efficiently is a big challenge for building an operational hybrid intrusion detection system. The first issue was recently addressed in [20] and [21]. In [20], Boxwala et al. propose a system monitoring access patterns to detect inappropriate access to electronic health record data. Two statistical and machine learning methods are applied to model electronic health record access logs collected over a 2-month period, from which 26 features are extracted for detecting suspicious accesses. The two models are built upon 10-fold cross-validation sets of 1291 labelled events and then applied to an external set of 58 events identified as truly suspicious. In [21], Kim et al. improve the system implemented in [20] and develop an integrated filtering method combining both anomaly detection (i.e., symbolic clustering) and misuse detection (i.e., a rule-based technique using signatures). A comparative study on two datasets (one with 25.5 million access records and the other with 8.6 million access records) shows that fusing the two different types of detectors improves performance significantly in terms of false negative rates.
3 Perceptron Mixture Model for Hybrid Intrusion Detection

Most existing intrusion detection systems use machine learning techniques such as support vector machines or neural networks in an ad hoc manner. In this paper, we formulate a generative perceptron mixture model and use it to inform the choice, design, and training of classifiers. We prove, in theory, that by following the model we can always obtain the minimum number of false alerts over a specific time period. Figure 1 illustrates the perceptron mixture model for our hybrid intrusion detection system. The notations appearing in Fig. 1 are explained as follows:
Fig. 1. Perceptron mixture model for hybrid intrusion detection framework
• Feature vector is denoted by $F(f_1, f_2, \ldots, f_n)$, in which $f_i$ ($i = 1, 2, \ldots, n$) refers to features that might be based on temporal, flow, distribution, packet, host logs, firewall/alert events, traffic behaviour, biometrics, to name a few.
• Detection agents are denoted by $S(S_1, S_2, \ldots, S_m)$, comprising $m$ different detection algorithms for intrusion detection systems.
• Notation $PW$ refers to Perceptron Weight and measures the credibility degree of decisions. In particular, $PW_{f_i S_j}$ is the perceptron weight for feature $f_i$ in $S_j$, where $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$. The higher the value of $PW_{f_i S_j}$, the more credible the decision by feature $f_i$ and $S_j$ is. For each separate detection agent $S_j$, we have:
$$\sum_{i=1}^{n} PW_{f_i S_j} = 1$$
• $PW_{S_j}$ is the reputation weight for each detection agent $S_j$, where $j = 1, 2, \ldots, m$. The higher the value of $PW_{S_j}$, the more credible this specific detector's decision is. For all the $m$ detection agents, we have:

$$\sum_{j=1}^{m} PW_{S_j} = 1$$
• Notation $p_{f_i S_j}$ is the attacking probability generated by feature $f_i$ and detection agent $S_j$. It measures the anomalous degree of the current network by feature $f_i$ and detection agent $S_j$, where $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$. The higher the value of $p_{f_i S_j}$, the more anomalous the current network.
• Notation $p_{S_j}$ is the attacking probability correlated by all features $f_i$ ($i = 1, 2, \ldots, n$) with a specific detection agent $S_j$, and we have:

$$p_{S_j} = \sum_{i=1}^{n} p_{f_i S_j} \times PW_{f_i S_j}, \quad j = 1, 2, \ldots, m$$
• Notation $p_{anomalous}$ is the final attacking probability generated by the mixture model, and we have:

$$p_{anomalous} = \sum_{j=1}^{m} p_{S_j} \times PW_{S_j}$$

• Notation $FACount$ is the number of false alerts obtained from historical alerting reports. Network administrators verify every alert reported by the mixture model and make after-event decisions on true or false. Based on $FACount$, feedback factors are used to adjust the values of $PW_{f_i S_j}$ and $PW_{S_j}$ in order to minimize $FACount$.
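To make the two-stage weighted combination concrete, the following minimal Python sketch computes $p_{S_j}$ and $p_{anomalous}$ from per-feature probabilities and the two weight sets; the sizes and the random values are illustrative placeholders, not taken from the paper:

```python
import numpy as np

# Hypothetical sizes: n features, m detection agents.
n_features, n_agents = 4, 3

# p[i, j]: attacking probability reported for feature f_i by agent S_j.
p = np.random.rand(n_features, n_agents)

# PW_f[i, j]: perceptron weight of feature f_i within agent S_j;
# each column sums to 1, matching sum_i PW_{f_i S_j} = 1.
PW_f = np.random.rand(n_features, n_agents)
PW_f /= PW_f.sum(axis=0, keepdims=True)

# PW_S[j]: reputation weight of agent S_j; sums to 1 over agents.
PW_S = np.random.rand(n_agents)
PW_S /= PW_S.sum()

# p_S[j] = sum_i p[i, j] * PW_f[i, j]
p_S = (p * PW_f).sum(axis=0)

# Final mixture output: p_anomalous = sum_j p_S[j] * PW_S[j]
p_anomalous = float(p_S @ PW_S)
print(p_anomalous)
```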
Our perceptron mixture model aims to minimize the total number of false alerts over a long time period by dynamically and adaptively adjusting the perceptron matrix with feedback factors. We formalize this problem by defining the objective function $F$ and its value $f$, where $f$ is the total number of false alerts over a time period $T$. Each time interval is denoted by $\Delta t$. Supposing there are $N$ time intervals over $T$, we have:

$$T = N \times \Delta t$$

The loss function is denoted by $L$; the loss factor is the value of the loss function and is denoted by $l$. The gain function is denoted by $G$; the gain factor is the value of the gain function and is denoted by $g$. $L$ and $G$ are used to adjust the values of the perceptron weights, and we have:

$$l = L(g) = \frac{1}{g}, \qquad g = G(l) = \frac{1}{l}, \qquad g \cdot l = 1$$

Considering a time period $[t, t + \Delta t]$, the objective function value over $[t, t + \Delta t]$ is denoted by $f_t$ and depends on the values of the variables $g$, $l$, and $t$. Thus, we have the objective function over $[t, t + \Delta t]$ as follows:

$$f_t = F_t(g_t, l_t, t)$$

Since $g_t \times l_t = 1$, we can simplify $F_t$ as follows:

$$f_t = F_t(l_t, t)$$

where $t$ is the momentary time variable and $l_t$ is the loss factor at $t$ used to adjust the value of the perceptron weights. Similarly, we denote the objective function over $[t + \Delta t, t + 2\Delta t]$ as $F_{t+\Delta t}(l_{t+\Delta t}, t + \Delta t)$, the objective function over $[t + 2\Delta t, t + 3\Delta t]$ as $F_{t+2\Delta t}(l_{t+2\Delta t}, t + 2\Delta t)$, and so on. Therefore, we have the objective function $F$ over $T$ as follows:

$$f = F\left(l_t, t,\; l_{t+\Delta t}, t + \Delta t,\; \ldots,\; l_{t+(N-1)\Delta t}, t + (N-1)\Delta t\right) = \sum_{i=0}^{N-1} F_{t+i\Delta t}\left(l_{t+i\Delta t},\, t + i\Delta t\right)$$

Our goal is to find an optimal solution vector $l^*$ that minimizes $F$ over $T$. Supposing $f^*$ is the minimum value of $F$, we have:

$$f^* = \min \sum_{i=0}^{N-1} F_{t+i\Delta t}\left((l^*)_{t+i\Delta t},\, t + i\Delta t\right)$$

where the vector $l^* = \left[(l^*)_t, (l^*)_{t+\Delta t}, \ldots, (l^*)_{t+(N-1)\Delta t}\right]^T$. To optimize the objective function $F$, suppose that the minimum value of $F$ over $[t, t + j\Delta t]$ is denoted by $f^*\left((l^*)_j, j\right)$; then we have:

$$f^*\left((l^*)_j, j\right) = \min \sum_{i=0}^{j-1} F_{t+i\Delta t}\left((l^*)_{t+i\Delta t},\, t + i\Delta t\right)$$

where $(l^*)_j = \left[(l^*)_t, (l^*)_{t+\Delta t}, \ldots, (l^*)_{t+(j-1)\Delta t}\right]^T$. Thus, given the formula for $f^*\left((l^*)_j, j\right)$, we have:

$$f^*\left((l^*)_j, j\right) = \min\left\{F_{t+(j-1)\Delta t}\left((l^*)_{t+(j-1)\Delta t},\, t + (j-1)\Delta t\right) + \sum_{i=0}^{j-2} F_{t+i\Delta t}\left((l^*)_{t+i\Delta t},\, t + i\Delta t\right)\right\}$$

According to the principle of optimality, the minimization over the earlier loss factors decomposes into the last term plus the minimum of the partial sum, i.e.:

$$f^*\left((l^*)_j, j\right) = \min\left\{F_{t+(j-1)\Delta t}\left((l^*)_{t+(j-1)\Delta t},\, t + (j-1)\Delta t\right) + f^*\left((l^*)_{j-1},\, j-1\right)\right\}$$

where $j \geq 1$ and $f^*\left((l^*)_0, 0\right) = f_0$, in which $f_0$ is a constant standing for the initial value of the number of alerts. Based on the above formula, $f^*$ can be solved recursively, starting from calculating $\min\{F_t((l^*)_t, t)\}$. As a result, the optimization of the objective function $F$ over $T$ has been transformed into minimizing the objective function $F_{t+i\Delta t}\left(l_{t+i\Delta t},\, t + i\Delta t\right)$ over each interval $[t + i\Delta t,\, t + (i+1)\Delta t]$, where $i = 0, 1, \ldots, N-1$.

Next, we consider only the objective function $F_t(l_t, t)$ over $[t, t + \Delta t]$ and discuss how to optimize it; the other objective functions $F_{t+i\Delta t}\left(l_{t+i\Delta t},\, t + i\Delta t\right)$ over $[t + i\Delta t,\, t + (i+1)\Delta t]$ are then optimized using the same algorithm. Supposing $f_t^*$ is the minimum value of $F_t$, we have $f_t^* = \min\{F_t((l^*)_t, t)\}$, where $l^*$ is the optimum that minimizes $F_t$. Many nonlinear optimization algorithms can be used to solve this problem, such as hill climbing (gradient descent, if the space is continuous), simulated annealing, and genetic algorithms, to name a few.
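The recursion above reduces the global problem to independent per-interval minimizations, each solvable with any of the optimizers just listed. A toy sketch of that inner step follows; the objective here is a purely hypothetical stand-in for $F_t$, since the real objective would be the observed false-alert count:

```python
import random

def interval_objective(l, t):
    # Hypothetical stand-in for F_t(l_t, t); in a real deployment this
    # would be the false-alert count observed over [t, t + dt] after
    # applying loss factor l to the perceptron weights.
    return (l - 0.5 + 0.1 * t) ** 2 + 1.0

def minimize_interval(t, l0=1.0, step=0.1, iters=200):
    # Simple stochastic hill climbing, one of the nonlinear optimizers
    # mentioned above (simulated annealing or a GA would also do).
    best_l, best_f = l0, interval_objective(l0, t)
    for _ in range(iters):
        cand = best_l + random.uniform(-step, step)
        f = interval_objective(cand, t)
        if f < best_f:
            best_l, best_f = cand, f
    return best_l, best_f

# By the recursion, f* over T is the sum of the per-interval minima.
N, dt = 5, 1.0
f_star = sum(minimize_interval(i * dt)[1] for i in range(N))
print(f_star)
```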
4 Preliminary Experimental Evaluation

To analyze the performance of our method, we chose the DARPA 1998 dataset for our preliminary experimental evaluation, mainly because it has been the most widely used large-scale dataset for evaluating intrusion detection systems, and because some signature-based detection systems perform very poorly on it, so we can show how the mixture model improves the general performance of the whole system. For our experiment on the DARPA 1998 dataset, we chose the first 8 hours of Thursday of week 6 because of the diversity of attacks on that day. We divided the traffic into four two-hour time intervals. Table 1 compares the detection and false positive rates of the single individual detection system with those of the hybrid system. As expected, the single detector performs poorly on the DARPA dataset. However, using the mixture detector we obtain a good detection rate, especially on the first and third intervals, while keeping the false alarm rate near 0%.
Table 1. Preliminary evaluation results on the DARPA 1998 dataset

                   Single detector                Mixture detector
                   Detection rate  False alarm    Detection rate  False alarm
Time interval 1    0.04%           0.35%          45.27%          0.21%
Time interval 2    0.06%           0.39%          2.97%           0.38%
Time interval 3    0.04%           0.40%          52.77%          0.55%
Time interval 4    0.0%            0.53%          4.28%           0.61%
5 Conclusions and Future Work

In this paper, we propose a new perceptron mixture model for intrusion detection: a general framework for combining any misuse and anomaly detection systems in order to reduce the large number of false alerts when safeguarding electronic health record systems in digital health. We prove the effectiveness of the model using dynamic programming, and a very preliminary experimental evaluation with the 1998 DARPA intrusion detection dataset shows the potential of the proposed framework to improve the detection rate while reducing the number of false alarms.

In terms of artificial neural networks, the proposed generative perceptron mixture model can also be seen as a variation of the well-known hierarchical mixture-of-experts model in committee machine learning theory. A committee machine is defined as a combination of experts (classifiers) in which a complex computational task is solved by dividing it into a number of computationally simple tasks according to the principle of divide and conquer. Committee machines fall into two major categories: static structure and dynamic structure. In static-structure committee machines, the outputs of the sub-classifiers are combined by a mechanism that does not involve the inputs; the two typical methods in this category are ensemble averaging and boosting. In dynamic-structure committee machines, the output of each individual sub-classifier is integrated into the overall output together with the input signals; that is, the input signals are involved in the final response of the system. There are two major types of dynamic structures in this category, namely mixture of experts (ME) and hierarchical mixture of experts (HME). In ME, the individual responses of classifiers are nonlinearly combined by a single gating network. HME is a natural extension of ME, in which the individual responses of classifiers are nonlinearly combined by several gating networks in a hierarchical fashion. The example HME model in Fig. 2 includes four classifiers and two levels of hierarchy (i.e., two layers of gating networks). The four classifiers are called expert networks, and the integrating unit is called the gating network, which acts as a mediator among the expert networks. The basic idea of HME is that different classifiers work best in different regions of the input space according to a specific probabilistic generative model, namely the gating network. Our model falls into the category of two-level HME. The input features in the perceptron mixture model stand for the input vector x in HME. The perceptron weight $PW_{f_i S_j}$ for feature $f_i$ in classifier $S_j$ refers to the realization of
Fig. 2. HME for two levels of hierarchy with 4 classifiers
the second-level gating network $j$ in HME, and the perceptron weight $PW_{S_j}$ stands for the first-level gating network in HME. In the future, we will extend the approach by integrating more detectors and evaluate it with more datasets collected from public networking environments.

Acknowledgments. This research was supported in part by funding from a Keene State College Faculty Development Grant.
References

1. Hossain, M.M., Hong, Y.A.: Trends and characteristics of protected health information breaches in the United States. In: AMIA Annual Symposium Proceedings, pp. 1081–1090, 4 March 2019
2. Portnoy, L., Eskin, E., Stolfo, S.: Intrusion detection with unlabeled data using clustering. In: Proceedings of ACM CSS Workshop on Data Mining Applied to Security (DMSA-2001), Philadelphia, PA (2001)
3. Barreno, M., Nelson, B., Sears, R., Joseph, A.D., Tygar, J.D.: Can machine learning be secure? In: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, pp. 16–25 (2006)
4. Sabhnani, M., Serpen, G.: Analysis of a computer security dataset: why machine learning algorithms fail on KDD dataset for misuse detection. Intell. Data Anal. 8(4), 403–415 (2004)
5. Patcha, A., Park, J.M.: An overview of anomaly detection techniques: existing solutions and latest technological trends. Comput. Netw.: Int. J. Comput. Telecommun. Netw. 51(12), 3448–3470 (2007)
6. Barbará, D., Couto, J., Jajodia, S., Popyack, L., Wu, N.N.: ADAM: detecting intrusions by data mining. In: Proceedings of the 2001 IEEE Workshop on Information Assurance and Security, West Point, NY, June 2001
7. Lunt, T.F., et al.: A Real-Time Intrusion Detection Expert System (IDES). Technical Report, Computer Science Laboratory, SRI International, Menlo Park, USA, February 1992
8. Anderson, D., Frivold, T., Tamaru, A., Valdes, A.: Next Generation Intrusion Detection Expert System (NIDES). Software User's Manual, Beta-Update Release, Technical Report SRI-CSL-95-0, Computer Science Laboratory, SRI International, Menlo Park, CA, USA, May 1994
9. Porras, P., Neumann, P.: EMERALD: event monitoring enabling responses to anomalous live disturbances. In: Proceedings of the 20th NIST-NCSC National Information Systems Security Conference, Baltimore, MD, USA, pp. 353–365 (1997)
10. Tombini, E., Debar, H., Mé, L., Ducassé, M.: A serial combination of anomaly and misuse IDSes applied to HTTP traffic. In: Proceedings of the 20th Annual Computer Security Applications Conference, Tucson, AZ, USA (2004)
11. Zhang, J., Zulkernine, M.: A hybrid network intrusion detection technique using random forests. In: Proceedings of the 1st International Conference on Availability, Reliability and Security, pp. 262–269. Vienna University of Technology (2006)
12. Peng, J., Feng, C., Rozenblit, J.W.: A hybrid intrusion detection and visualization system. In: Proceedings of the 13th Annual IEEE International Symposium and Workshop on Engineering of Computer Based Systems, pp. 505–506 (2006)
13. Depren, O., Topallar, M., Anarim, E., Ciliz, M.K.: An intelligent intrusion detection system (IDS) for anomaly and misuse detection in computer networks. Expert Syst. Appl. 29(4), 713–722 (2005)
14. Qin, M., Hwang, K., Cai, M., Chen, Y.: Hybrid intrusion detection with weighted signature generation over anomalous internet episodes. IEEE Trans. Dependable Secure Comput. 4(1), 41–55 (2007)
15. Xiang, C., Lim, S.M.: Design of multiple-level hybrid classifier for intrusion detection system. In: Proceedings of the IEEE Workshop on Machine Learning for Signal Processing, pp. 117–122 (2005)
16. Thames, J.L., Abler, R., Saad, A.: Hybrid intelligent systems for network security. In: Proceedings of the 44th ACM Annual Southeast Regional Conference, pp. 286–289 (2006)
17. Peddabachigari, S., Abraham, A., Grosan, C., Thomas, J.: Modeling intrusion detection system using hybrid intelligent systems. Special issue on network and information security: a computational intelligence approach. J. Netw. Comput. Appl. 30(1), 114–132 (2007)
18. Shon, T., Moon, J.: A hybrid machine learning approach to network anomaly detection. Int. J. Inf. Sci. 177(18), 3799–3821 (2007)
19. Sabhnani, M.R., Serpen, G.: Application of machine learning algorithms to KDD intrusion detection dataset within misuse detection context. In: Proceedings of the International Conference on Machine Learning: Models, Technologies, and Applications, pp. 209–215 (2003)
20. Boxwala, A.A., Kim, J., Grillo, J.M., Ohno-Machado, L.: Using statistical and machine learning to help institutions detect suspicious access to electronic health records. J. Am. Med. Inform. Assoc. 18(4), 498–505 (2011). https://doi.org/10.1136/amiajnl-2011-000217
21. Kim, J., et al.: Anomaly and signature filtering improve classifier performance for detection of suspicious access to EHRs. In: AMIA Annual Symposium Proceedings, vol. 2011, pp. 723–731, 22 October 2011
22. Ghorbani, A.A., Lu, W., Tavallaee, M.: Network attacks. In: Network Intrusion Detection and Prevention: Concepts and Techniques, vol. 47, pp. 1–25. Springer, Heidelberg (2010). https://doi.org/10.1007/978-0-387-88771-5_1, ISBN: 978-0-387-88770-8
23. Lu, W., Xue, L.: A heuristic-based co-clustering algorithm for the internet traffic classification. In: 28th International Conference on Advanced Information Networking and Applications Workshops, pp. 49–54 (2014). https://doi.org/10.1109/WAINA.2014.16
24. Lu, W., Traore, I.: Determining the optimal number of clusters using a new evolutionary algorithm. In: Proceedings of the IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2005), Hong Kong, pp. 712–713, November 2005
25. Garant, D., Lu, W.: Mining botnet behaviors on the large-scale web application community. In: 27th International Conference on Advanced Information Networking and Applications Workshops, pp. 185–190 (2013). https://doi.org/10.1109/WAINA.2013.235
Personalized Cryptographic Protocols Obfuscation Technique Based on the Qualities of the Individual

Radosław Bułat and Marek R. Ogiela(B)

Cryptography and Cognitive Informatics Laboratory, AGH University of Science and Technology, 30 Mickiewicza Avenue, 30-059 Kraków, Poland
{bulat,mogiela}@agh.edu.pl
Abstract. Most cryptographic protocols nowadays, mainly asymmetrical ones, are based on a generated personal user key, known only to its owner but still vulnerable to theft. We propose an innovative solution that hashes the cipher with the chosen key and uses personal characteristics of the person generating it (e.g., voice samples or biometrics, encoded as a bit vector). Such a private key would be usable only by its creator, making the design much safer, authenticating the creator personally, and making non-repudiation of the encoded material possible. The proposed solution also opens another avenue of research: using the technique in steganography, encoding information in a user-provided picture secured by the user's biometric sequences (quality vectors). In both cases, the cryptographic protocols take on truly unique sequences, used instead of traditional salt values, adding a layer of a truly personal touch to an otherwise purely theoretical construct. This may find many applications in the growing world of IoT, where people and their unique needs have to be incorporated into the growing web of security protocols.

Keywords: Personalized cryptography · Security protocols · User-oriented security systems
1 Introduction

Nowadays, as modern society advances and electronic devices permeate every area of our lives, both public and private, there is a growing need for security. Both user data and communications have to be secured, regardless of whether they serve business or private use. Many methods and protocols are employed to do exactly that, most of them revolving around the concepts of prime numbers, secret hashes, and other arbitrary keys, which, being purely mathematical constructs, are not attuned to any specific person. As such, those methods generally provide a high layer of security to data or message traffic but cannot be used to confirm the user's identity, having no connection with any of the user's inherent qualities as a person. Thus, there is a need to invent and put into widespread use protocols unique to the individuals using them, tailored to their qualities and identifying their actions.
As of now, we can differentiate between two classes of encryption methods – symmetrical and asymmetrical [5]. The symmetric encryption principle is very simple – encryption and decryption share the same key, and as such, both the sender and the receiver share it. The secrecy of the key-sharing protocol poses another set of challenges by itself, though that point is moot here: since the key has to be shared between individuals, it cannot be considered personal (unless it takes on the characteristics of the whole group). The asymmetric methods, on the other hand, differ considerably. The keys are generated in pairs, one of them not being secret (the public key) and the other private. The keys work only in pairs, and what is encoded by one of them can be decoded only by its matching twin. Thus, all messages and data encoded with an individual's public key can only be read by the private key holder, which provides secrecy. Conversely, all data encoded with an individual's private key can be publicly accessed with their public key – which serves as a digital signature and a non-repudiation tool. Still, both keys remain arbitrary numbers, and there is no possibility of inferring the signer's or key owner's identity from the cipher alone [11].

As can be seen, the latter method is quite a promising field for introducing personalized cryptography, either in private keys (to make them confirm their owner's identity) or in public keys (to ensure that the message recipient has not only the correct cipher but also some given personal characteristics). The projected risks relate mainly to privacy issues, as well as to the computational power involved. Since most of the data used in the ciphers is personal, whether a medical record or a private file, there must be certain transparency and explicit consent. There is also a great need to ensure that, although the cipher has to be traceable to the one individual in question, there should be no way to obtain their raw data from the cipher. On another note – since authorization often has to take place on IoT devices, using data either already stored or taken on the fly, there is a need to minimize the computational power or complexity of the checks involved, since the devices may not be up to par [1]. As such, it would be prudent to follow industry practice, which limits asymmetric methods (in this case, personal ones) to digital signatures and symmetric key exchange. In this way, one symmetric, non-personalized key may be signed and secured by a personalized asymmetric protocol, making the whole encryption and decryption process significantly easier without compromising the security of the venture.
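As a rough illustration of this hybrid pattern, the sketch below signs a one-time symmetric session key with an asymmetric key. Note that the binding of the key pair to personal characteristics is only conceptual here – a standard Ed25519 key stands in for the personalized key envisioned in this paper:

```python
# pip install cryptography
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Asymmetric pair standing in for the "personalized" key; in the proposed
# scheme its generation would be bound to the owner's quality vector.
signer = Ed25519PrivateKey.generate()

# The one-time symmetric session key does the heavy lifting...
session_key = Fernet.generate_key()
ciphertext = Fernet(session_key).encrypt(b"confidential payload ...")

# ...while the personalized asymmetric key only signs it (non-repudiation).
signature = signer.sign(session_key)

# Receiver side: verify provenance before trusting the session key.
signer.public_key().verify(signature, session_key)  # raises on mismatch
plaintext = Fernet(session_key).decrypt(ciphertext)
```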
2 Personalization of a Protocol

In this paper, a method of adding personal characteristics to a protocol is proposed. First of all, it should be determined which set of a person's qualities would be the starting point for encryption (forming an ordered vector of qualities, from which a subset can be chosen for any given cipher).
2.1 Personal Characteristics

As presented in previous research in this field, the characteristics considered in various protocols differ greatly and represent a wide range of personal traits [3]. The most useful and cost-efficient to consider would be:

• Fingerprints. Especially useful on mobile devices, with easily stored profiles that can be recognized and translated from images to digital data.
• Iris recognition. Like the above, using a device's built-in camera, a view of a person's eye is not a complicated quality to obtain and, using pattern recognition, authenticate. A digitized map of a person's iris is likewise a unique quality.
• Face recognition. A feature used in most image-recognition devices, measuring the dimensions of facial features and their relations.
• Voice recognition. Not fully unique to an individual and easily mimicked, but also easy to obtain and useful in multimodal authentication.
• Speech patterns. Not to be confused with the above – while the former captures a person's voice characteristics such as pitch and tone, the latter encodes individual speech habits, like interjections, involuntary stuttering, speech cadence, and any other pattern that emerges from long recordings in a casual environment (often taken from the person's personal data library and stock) [7].
• Typing patterns and cadence. Like speech, typing has its own pattern that is distinguishable for a person, measured as the time differences between keystrokes and between some letter pairs.
• GPS data. In mobile or personal devices, GPS or at least motion sensors are commonplace. Thus, a person's movement history or daily routine (on weekdays or weekends) can easily be used to form another digital footprint identifying a person.
• Sleep patterns. Nowadays, wearable health devices record their user's sleep habits, including the usual hours and the sleep phases, which also can define an individual.
• ECG data. Similar to the above – data tracked in real time is readily available and stored on most wearable devices.
• Scanned (written) signature. A person's signature, which in most parts of the world is considered proof of identification/non-repudiation, can also be digitized and stored as one of the personal variables (and is not easy to forge without special skills).
• Social media data and history. Data posted by an individual on social media sites (with consent to be used for advertising and identification purposes) – especially their history, groups, or public "likes" – already builds a person's digital profile for advertising companies.
• Personal photos and media. The data (especially visual data, such as photos) from an individual's personal storage can be used as their "keys" or passwords – to denote simple ownership of the data in question or to put the user's face in a picture for shape-recognizing algorithms.

As can be seen, a user can generate a nearly endless stream of data during their daily routine without any special activity. Such data can be used to identify them for encryption purposes [6, 8].
2.2 Encryption Method

All of the aforementioned qualities can be represented as digitized binary data – either sensor readings or pictures taken by the devices themselves [4]. In the case of pictures, they need to be put through visual pattern detection software to provide context to the observed data. In the end, any individual should be represented by an indexed vector of raw data (Fig. 1). Such a vector should have distinct identifiable entries, each representing one of the categories mentioned earlier, along with its context (time and date of sampling, user coordinates, photography context, et caetera). Furthermore, the vector address table should be uniform for any particular implementation of the algorithm, to provide access to any given category used in the encryption process.
Fig. 1. Examples of personal features, which can be used for security purposes.
During the process mentioned above, there is no need to use the whole data stream – such a method would be ineffective and insecure (especially given that a vector can include hundreds of personal data nodes, for example, a whole photo collection). It is feasible to use an approach similar to masked passwords (since every node in the data stream has an indexed location). As a result, the stream used in any given communication can be identified (and requested as a server challenge) to provide just a subset of data based on arbitrary criteria. Although at least one quality unique to the individual (such as an iris scan or other biometrics) should be included in every challenge, a different subset of the keystream should be chosen for each encrypted communication. Changing subsets would make it impossible to compute the whole data vector, even by eavesdropping on the authentication process. Any fragment chosen from the vector retains some of its user's qualities and can be used to secure the communication, adding another layer of security (acting as a one-time pad used to salt the communication). It is also possible to identify the user as the correct sender/receiver, since there is no randomness in the vector itself. The sketch below illustrates this challenge-and-subset mechanism.
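This is a minimal sketch of the idea, assuming a purely hypothetical index layout and placeholder feature bytes; the derivation step uses PBKDF2 merely as one concrete way of turning the challenged subset into a one-time session secret:

```python
import hashlib, secrets

# Illustrative indexed vector of digitized personal qualities; the index
# layout (the "address table") is fixed per implementation, as in Sect. 2.2.
vector = {
    0: b"<iris-template-bits>",      # always-required biometric entry
    1: b"<typing-cadence-bits>",
    2: b"<gps-routine-bits>",
    3: b"<ecg-sample-bits>",
}

def respond_to_challenge(challenge_indices, nonce):
    # Concatenate only the challenged subset and derive a one-time
    # session secret; the subset acts like a masked-password selection.
    material = b"".join(vector[i] for i in sorted(challenge_indices))
    return hashlib.pbkdf2_hmac("sha256", material, nonce, 100_000)

# The server picks a fresh nonce and a fresh subset for each session, so
# an eavesdropper never sees enough to reconstruct the whole vector.
nonce = secrets.token_bytes(16)
session_secret = respond_to_challenge({0, 2}, nonce)
```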
2.3 Steganography

Another research area that would benefit from personalization is steganography. As we are already using a vector comprising the user's personal pictures or photos, another layer of security can be put to use there. Any information that needs to be encoded and signed can be inserted into one of the user's pictures by simply changing the least significant bits of every pixel (which produces color changes imperceptible to the observer) [9, 10]. As such, any message can be "watermarked" by taking the sender's picture (with all the geolocation and timestamp tags embedded if necessary) and using any biometric checks the sender's equipment is capable of (iris checks or face recognition). The message should already be encoded by the encryption method mentioned above – especially using the same qualities present in the photo. As a result, using a photograph of someone other than the individual in question, or one that differs from their routine – wrong geolocation, wrong surroundings (as judged by pattern recognition software) – would not work. The cause is obvious: the decoding process would use those data instead, and they would not match the secret key – the original biometrics of the person. In this way, swapping the photos would not work, and neither would spoofing – only the original data could provide a chance of matching the proper variables for decryption. In that context, any message sent by the person embedded into their real-time photo would be non-repudiable, as the picture would provide the "here and now" context while being an immutable ingredient in the decryption process. Nevertheless, to keep the information confidential, the encryption vector still has to include factors not present in the subject picture (such as the history of other pictures, their surroundings for recognition purposes, et caetera) to prevent cryptanalysis. The receiver would get a photo with encoded information, verify the sender's identity by simply checking whether the keys included in the photo match those in the message, and then proceed with decryption using the one-time pad (fragments of the stream) received earlier in the asymmetric handshake. In such a model, a man-in-the-middle attack would prove nearly impossible, as decryption and re-encryption of the data with another person's data would not reproduce the correct context in the resulting picture, causing the keys to mismatch and corrupting the information.
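The least-significant-bit embedding itself is straightforward. The sketch below operates on a bare byte array standing in for decoded pixel data; real use would decode an actual image first, and the payload would already be encrypted as described above:

```python
def embed_lsb(pixels: bytearray, payload: bytes) -> bytearray:
    # Write payload bits (MSB first) into the least significant bit of
    # each pixel byte; the visual change is imperceptible.
    bits = [(byte >> k) & 1 for byte in payload for k in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for payload")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return out

def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    # Read the low bit of each pixel byte and reassemble payload bytes.
    bits = [p & 1 for p in pixels[: n_bytes * 8]]
    return bytes(
        sum(bit << (7 - k) for k, bit in enumerate(bits[i : i + 8]))
        for i in range(0, len(bits), 8)
    )

cover = bytearray(range(256)) * 4          # stand-in for raw pixel data
stego = embed_lsb(cover, b"ciphertext")    # payload is already encrypted
assert extract_lsb(stego, len(b"ciphertext")) == b"ciphertext"
```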
3 Conclusions

As can be seen, many qualities describe an individual, many of them innate to the user in question. This raises some interesting research problems: which approach is worth pursuing in the future of distributed small systems? What makes a person unique in the context of authentication? Since the IoT has its own strong and weak points (many biometric sensors and weak computational power, respectively), any future methods should be considered with this in mind. Nevertheless, any person nowadays leaves a huge digital footprint (especially due to the use of wearable devices and social media presence) that is measured, stored, and digitized and can be useful as proof of identity and as a cryptographic primitive. Such data, if properly ordered, can be used, in line with the previous research, as the encoding stream and a personal signature. Due to its semi-random nature, it would be difficult to replicate in case of an attack, yet the identity of the owner of any given data stream can still be determined. Thus, it can be used as an addition to existing methods, binding them permanently to their user instead of just random noise, thereby adding another security layer to the protocol.

Still, storing and processing those data also raises privacy issues. Most of them are provided willingly by the user (especially as wearable sensors track their fitness)
[2]. However, the usage of anybody's social media history or private photos should be regulated and transparent to their owner – their consent is instrumental in any case. Should consent be given, the sheer amount of data can be enormous (and still chosen by the user, who could, for example, limit the usage of some categories of the data vector) and would undoubtedly raise the security factor of their encryption and communication.

Acknowledgments. This work has been supported by the AGH University of Science and Technology Research Grant No 16.16.120.773. This work has been supported by the AGH Doctoral School Grant No 10.16.120.7999.
References

1. Sfar, A.R., Natalizio, E., Challal, Y., Chtourou, Z.: A roadmap for security challenges in the Internet of Things. Digit. Commun. Netw. 4(2), 118–137 (2018)
2. Loukil, F., Ghedira, C., Aïcha-Nabila, B., Boukadi, K., Maamar, Z.: Privacy-aware in the IoT applications: a systematic literature review. In: Panetto, H., et al. (eds.) On the Move to Meaningful Internet Systems. LNCS, vol. 10573, pp. 552–569. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-319-69462-7_35
3. Traore, I., Ahmed, E.A.: Continuous Authentication Using Biometrics: Data, Models, and Metrics, 1st edn. IGI Global, Hershey (2011)
4. Ogiela, M.R., Ogiela, L.: Cognitive cryptography techniques for intelligent information management. Int. J. Inf. Manage. 40, 21–27 (2018)
5. Menezes, A., Van Oorschot, P., Vanstone, S.: Handbook of Applied Cryptography. CRC, Boca Raton (1997)
6. Fierrez, J., Morales, A., Vera-Rodriguez, R., Camacho, D.: Multiple classifiers in biometrics. Part 2: trends and challenges. Inf. Fusion 44, 103–112 (2018). https://doi.org/10.1016/j.inffus.2017.12.005, ISSN: 1566-2535
7. Szlósarczyk, S., Schulte, A.: Voice encrypted recognition authentication – VERA. In: 2015 9th International Conference on Next Generation Mobile Applications, Services and Technologies, pp. 270–274 (2015). https://doi.org/10.1109/NGMAST.2015.74
8. Acien, A., Morales, A., Vera-Rodriguez, R., Fierrez, J., Tolosana, R.: MultiLock: mobile active authentication based on multiple biometric and behavioral patterns, pp. 53–59 (2019). https://doi.org/10.1145/3347450.3357663
9. Cox, I., Miller, M., Bloom, J., Fridrich, J., Kalker, T.: Digital Watermarking and Steganography. Morgan Kaufmann Publishers, Burlington (2008)
10. Ogiela, M.R., Koptyra, K.: False and multi-secret steganography in digital images. Soft Comput. 19(11), 3331–3339 (2015). https://doi.org/10.1007/s00500-015-1728-z
11. Bułat, R.: Zastosowanie metod ewolucyjnych w kryptografii – problem faktoryzacji. In: Metody analizy i oceny bezpieczeństwa oraz jakości informacji, Kraków (2012)
Personalized Cryptographic Protocols for Advanced Data Protection

Urszula Ogiela1, Makoto Takizawa2, and Lidia Ogiela1(B)

1 Pedagogical University of Krakow, Podchorążych 2 Street, 30-084 Kraków, Poland
2 Department of Advanced Sciences, Hosei University, 3-7-2, Kajino-cho, Koganei-shi, Tokyo 184-8584, Japan
[email protected]
Abstract. In data protection processes, security measures that are to some extent tied to the creator of a given solution are increasingly being used. Data security techniques based on personal methods of securing information are increasingly common; such solutions are intended to guarantee full data protection. This paper presents personalized data security techniques dedicated to advanced data protection processes.

Keywords: Advanced data security · Personalized cryptographic protocols
1 Introduction

Personalized data protection methods are currently used in many technical solutions aimed at full data protection, with access to data tied to information identified with its owner. Personalized protection of data and data sources has become indispensable in the face of increasingly aggressive methods of taking over confidential data. This is now observable not only in the protection of state secrets, but also in protecting data against attacks by external services and cyber-hackers, and it is quite commonly observed at the lower levels of the functioning of societies, enterprises, or specific groups of associates. Therefore, it becomes necessary to develop effective methods of securing data against unauthorized access. One class of possible solutions is provided by cryptographic protocols for the division and sharing of a secret [1–4, 8–10]. However, by themselves they do not guarantee data protection tied to a specific person – that is, protection with the features of a personalized solution. The development of such solutions makes it possible for the cryptographic algorithm to be modified only by its creator, by means of settings of personal user characteristics known only to them [8, 9]. Personalized data protection solutions thus offer a wide range of possibilities for modifying the implemented solutions in the field of full data protection; they allow unlimited changes to the base data protection procedures by random selection of identification features for each participant of the data protection protocol. Their functionality and usability result from the level of data protection, measured by the complexity of the applied cryptographic algorithms.
2 Personalized Cryptographic Protocols

Personalized protocols constitute a group of cryptographic solutions aimed at securing data through techniques and algorithms for secure information protection that take into account the personal data of protocol participants. This ensures a level of data protection that guarantees that only the specific persons participating in a given protocol are able to execute it safely. The personal features of the individual participants are unique and unambiguously belong to each of them. The use of individual personal data in the sense of personal characteristics – from simple features such as a fingerprint, a retinal scan, or facial geometry, through voice timbre, speech features, and the DNA code, to less standard features such as finger spacing, the distances between the eyes, the ears, or the arms, the skeletal structure, or the identification of lesions (or their absence) – allows the owner's compliance with a specific participant in the protocol to be determined.

The use of personal characteristics in data protection protocols allows proper identification of each participant in the protocol, while eliminating access by persons who do not have the required characteristics. The chance of breaking access is smaller the more data is required in the verification process, because if a protocol is based on one selected type of personal information, there is a risk that an unauthorized person or system will gain possession of that single piece of personal information. Enlarging the required set of information with further data means that a person trying to access the data will have to gain access to much more information. To protect the data against such an attack, the several personal characteristics required in the verification process of each participant can be selected at random. Moreover, each participant can be verified by a different set of features; such a procedure guarantees random selection of the verification features [5–7]. The possibility of randomly selecting the personalized data of each participant of the cryptographic protocol for proper verification guarantees constant variability of the base set identifying both each participant and anyone attempting to take over the information.
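The following minimal sketch illustrates such a randomized challenge over a participant's registered features; the feature names, the salted-digest registry, and the subset size are all hypothetical choices, not prescribed by the paper:

```python
import hashlib, secrets

# Illustrative registry: a participant stores salted digests of the
# personal features chosen for verification (never the raw data).
def digest(value: bytes, salt: bytes) -> bytes:
    return hashlib.sha256(salt + value).digest()

salt = secrets.token_bytes(16)
registered = {
    "fingerprint": digest(b"<fp-bits>", salt),
    "voice": digest(b"<voice-bits>", salt),
    "gait": digest(b"<gait-bits>", salt),
    "skeletal": digest(b"<skeletal-bits>", salt),
}

def verify(presented: dict, k: int = 2) -> bool:
    # Challenge a fresh random k-subset of the registered features,
    # so the required evidence changes on every verification.
    challenge = secrets.SystemRandom().sample(sorted(registered), k)
    return all(
        digest(presented[name], salt) == registered[name]
        for name in challenge
    )

print(verify({"fingerprint": b"<fp-bits>", "voice": b"<voice-bits>",
              "gait": b"<gait-bits>", "skeletal": b"<skeletal-bits>"}))
```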
3 Advanced Personalized Data Protection Techniques

Personalized data protection techniques can be used in a variety of cryptographic protocols, but their valuable features are most useful in data partitioning and sharing solutions. Threshold schemes allow real modification of the adopted solutions, both in the sets of personal characteristics of protocol participants and in the numbers required in the shadow distribution process and in the process of reconstructing the secret. An important novelty distinguishing the described solution is that it allows a completely diversified selection of threshold schemes in the process of securing a secret. The number of protocol participants may depend on the degree of confidentiality of the given information, while the number of shadows distributed among protocol participants may vary at the same time. In addition, the secret reconstruction process can be performed at many different levels, taking into account the superiority or subordination of particular groups of shadow holders.
Another novelty of the proposed solution is that all protocol participants can define their own personal set of features on the basis of which they will be verified. Personal features belonging to this set will be randomly selected during each verification process; since each participant independently defines the base set, only he or she knows the full set of identification features. Moreover, a participant may indicate features that are extremely specific, such as deformations, special marks, or individual changes in body structure that affect the data recorded by the system. In that case, each participant can be sure whether the features presented for verification are correct. If the personal verification system is breached and the system or a hacker tries to randomly select the participant's verification features, features may appear that the participant did not indicate – in which case the participant is sure that an attempt is being made to take over the secret. A solution that can completely prevent the interception of data is for all protocol participants to include in their collections personal characteristics that only they know and that are characteristic of them. Forcing the system to indicate one or several such features in each verification of a protocol participant ensures that, in the event of an attempted hostile interception of data, the intruder will not have knowledge of these specific characteristics of each participant, which significantly complicates or completely prevents the takeover of the data.

The use of data personalized for, and characteristic of, each protocol participant is an advanced data protection technique. It allows full protection of information at every level of its functioning: from state data, development and strategic data, through data on strategic defense, towards data related to the proper functioning of individual entities and units, to data related to information management processes in every single subject. Developing algorithms based on the indication of groups of secret trustees, in which each trustee independently defines personal characteristics that can be verified by personal identification systems, combined with the cryptographic threshold schemes used in the process of dividing and distributing parts of a secret, provides unique solutions belonging only to each participant of these protocols. The high security of these solutions makes them optimal protection for important and strategic data at every stage of management and transmission. The uniqueness of human personal characteristics, in turn, makes these protocols the most effective safeguards against hostile and unexpected interception of secret information.
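For reference, here is a bare-bones Shamir (k, n) threshold scheme of the kind such shadow-distribution protocols build on; the prime and parameters are arbitrary demo values, and the personalized verification layer discussed above would sit in front of both operations:

```python
import secrets

PRIME = 2**127 - 1  # Mersenne prime large enough for a 16-byte secret

def split(secret: int, n: int, k: int):
    # Shamir (k, n) threshold scheme: any k of the n shadows
    # reconstruct the secret; fewer reveal nothing about it.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation of the polynomial at x = 0.
    total = 0
    for j, (xj, yj) in enumerate(shares):
        num = den = 1
        for i, (xi, _) in enumerate(shares):
            if i != j:
                num = num * (-xi) % PRIME
                den = den * (xj - xi) % PRIME
        total = (total + yj * num * pow(den, -1, PRIME)) % PRIME
    return total

secret = 1234567890
shadows = split(secret, n=5, k=3)
assert reconstruct(shadows[:3]) == secret
```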
4 Conclusions

The development of data protection protocols based on personal verification techniques for protocol participants steers currently functioning solutions onto the path of advanced data protection. The developed algorithmic solutions are the foundations of a new, extremely effective protection of data against hostile takeover; at the same time, the sets on the basis of which all protocol participants are verified can be modified at any time.
Therefore, the solutions presented in this paper indicate effective possibilities of securing data with various degrees of confidentiality, circulating in various entities and dedicated to their various trustees.

Acknowledgments. This work has been supported by Pedagogical University of Krakow research Grant No. BN.711-79/PBU/2021. This work has been supported by the National Science Centre, Poland, under project number DEC-2016/23/B/HS4/00616.
References

1. Chen, C.-K., Lin, C.-L., Chiang, C.-T., Lin, S.L.: Personalized information encryption using ECG signals with chaotic functions. Inf. Sci. 193, 125–140 (2012)
2. Chomsky, N.: Language and Problems of Knowledge: The Managua Lectures. MIT Press, Cambridge (1988)
3. Ghosh, A.M., Grolinger, K.: Edge-cloud computing for internet of things data analytics: embedding intelligence in the edge with deep learning. IEEE Trans. Industr. Inf. 17(3), 2191–2200 (2021)
4. Mackenzie, O.J. (ed.): Information Science and Knowledge Management. Springer, Berlin (2006)
5. Nakamura, S., Ogiela, L., Enokido, T., Takizawa, M.: Flexible synchronization protocol to prevent illegal information flow in peer-to-peer publish/subscribe systems. In: Barolli, L., Terzo, O. (eds.) CISIS 2017. AISC, vol. 611, pp. 82–93. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-61566-0_8
6. Ogiela, L.: Transformative computing in advanced data analysis processes in the cloud. Inf. Process. Manag. 57(5), 102260 (2020)
7. Ogiela, L., Ogiela, M.R.: Cognitive security paradigm for cloud computing applications. Concurr. Comput. Pract. Exp. 32(8), e5316 (2020)
8. Ogiela, M.R., Ogiela, U.: Secure information splitting using grammar schemes. In: New Challenges in Computational Collective Intelligence. Stud. Comput. Intell. 244, 327–336 (2009)
9. Ogiela, M.R., Ogiela, U.: Linguistic cryptographic threshold schemes. Int. J. Future Gener. Commun. Network. 2(1), 33–40 (2009)
10. Schneier, B.: Applied Cryptography: Protocols, Algorithms, and Source Code in C. Wiley, Hoboken (1996)
Antilock Braking System (ABS) Based Control Type Regulator Implemented by Neural Network in Various Road Conditions

Hsing-Chung Chen1,2(B), Andika Wisnujati1,3(B), Agung Mulyo Widodo1,4(B), Yu-Lin Song1,5, and Chi-Wen Lung6(B)

1 Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan
[email protected], [email protected]
2 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
3 Department of Machine Technology, Universitas Muhammadiyah Yogyakarta, Yogyakarta, Indonesia
[email protected]
4 Department of Computer Science, Esa Unggul University, West Jakarta, Indonesia
[email protected]
5 Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan
6 Department of Creative Product Design, Asia University, Taichung, Taiwan
[email protected]
Abstract. The Antilock Braking System (ABS) is a braking system that prevents the wheels of a vehicle from locking. Locking happens because the friction coefficient between the tire and the road surface degrades when the brakes are applied on a slippery surface or during panic braking. Conventional control algorithms have a limited ability to compensate for a wide variety of road conditions; adding learning to the controller enables it to compensate for adverse road conditions. This paper proposes a solution to this problem by applying a controller that is able to learn, based on an artificial neural network. With the proposed concept, the neural network does not need to learn the inverse dynamics of the controlled plant, as is usual for neural networks in control systems. The design concept stems from the view that the main objective of control system design is to determine the controller generating the signal that achieves the best plant output performance. As a result, training takes only a little time (about 300–500 epochs) for the neural network to reach the desired error target (0.02). The simulation achieves good performance against braking standards, such as maintaining a 20% slip value and keeping the braking coefficient at its maximum point for various adverse road conditions; moreover, the neuro-regulator can keep the slip within 10–30%. The simulation's effectiveness is compared with two other controllers, i.e., a bang-bang controller and fuzzy logic control, in various road conditions.

Keywords: Antilock braking system (ABS) · Neural network · Various road conditions
1 Introduction

In the early 1900s, the concept of the Antilock Brake System (ABS) was introduced, modulating hydraulic braking pressure with the primary goal of preventing tire locking and wheel slippage [1]. The ABS is activated and deactivated in response to wheel deceleration and slip. It is used in modern automobiles to increase the level of safety and consistency. The ABS should be designed for a variety of driving conditions, particularly those that involve frequent braking and acceleration; extending the life of the braking system becomes critical in these conditions [2]. It is essentially an active protection system, currently installed in the majority of automobiles, capable of preventing the wheels from locking during hard braking [3]. ABS is a well-established safety feature in the automotive industry. ABS generally improves vehicle safety by limiting directional wheel slip during a braking event characterized by deep slip [4]. The friction force on a locked wheel sliding on the road is usually considerably less. Under braking, if one or more of the vehicle's wheels lock (begin to slip), there are a number of consequences: braking distance increases, steering control is lost, and tire wear becomes abnormal. These are undesirable conditions. Modern automobiles incorporate ABS, traction control systems, and other safety and reliability features. An autonomous ABS can take over the vehicle's traction control entirely or in part. However, the ABS exhibits strongly nonlinear characteristics [5].

An antilock braking system using an on-off control strategy to maintain the wheel slip within a predefined range is studied here. The controller design must be integrated with the vehicle dynamics model. A single-wheel or bicycle vehicle model considers only a constant normal load on the wheels; a four-wheel vehicle model that takes into account dynamic normal loading on the wheels and generates the proper lateral forces is more suitable for designing a reliable brake system [2]. The ABS's primary objective is to rapidly decelerate the vehicle while maintaining steerability during an emergency braking maneuver, thereby improving braking, steering, and driving stability. Today, almost every country in the world requires the ABS as a mandatory safety system [6, 7]. A properly functioning ABS controller should be capable of maintaining the wheel slip at an optimal value appropriate for the road conditions encountered at any given point in time, thereby preventing the wheel from locking during braking [8]. Several approaches to ABS control strategy have been successfully implemented, such as robust control, describing functions, sliding mode control, etc. [9]. Advanced control strategies such as fuzzy neural network control have also been investigated by other researchers, with good performance [10–12]. In general, a major problem in all control strategies is that the performance of the ABS may deteriorate when the tire meets a road where adverse conditions are encountered. Therefore, a learning control strategy is needed to improve performance over time by learning how to compensate for a wide variety of road conditions. This sense of "learning" is the reason to choose an advanced control strategy such as neural network control. The fuzzy controller's performance is determined by if-then inference rules, similar to those used in expert systems.
The primary advantage of this representation is its readability for the human user. A fuzzy system’s knowledge base is composed of two components: 1) a definition of fuzzy sets; and 2) a look-up table containing rules for abstraction. Inference is a critical component of fuzzy logic control.
It possesses the ability to make human-like decisions based on fuzzy concepts. The most frequently used technique for determining the controller output is to specify the output of each individual fuzzy rule and then to express the resulting output as a composition of the individual rules [1]. Defuzzification is the process of converting an output linguistic value derived from the composition of partial outputs to a "crisp" value that is used as the controlled process input via an actuator.

In recent years, neural networks have been developed for nonlinear process control. They provide an alternative by mimicking the biological neural network of the brain in a mathematical model. Typically, neural network algorithms for control employ error backpropagation to learn the characteristics or the inverse dynamics of the controlled system. All of these schemes use the desired response and/or the plant output. However, they may take much time due to backpropagation learning. Neural networks (NNs) have been applied to the identification and control of dynamic systems with great success. Numerous methods for incorporating NNs into control structures have been proposed. Multilayer NNs have become a popular choice for modeling nonlinear systems and implementing general-purpose nonlinear controllers due to their universal approximation capabilities. NN-based controllers have been developed to compensate for the effects of nonlinearities and system uncertainties, thereby improving the control system's stability, convergence, and robustness [13].

The bang-bang controller is essentially an on-off switch. It is a common signum function used to keep the wheel slip within a specified range. The ABS system continuously monitors the wheel slip value and compares it to the desired slip value, typically 0.2, thereby driving the error to zero. This action occurs whenever the brake torque (TB) reaches its maximum value. The model is a closed-loop control system in which the controller determines a reference signal for the actuator using sensor data [3].

In this paper we propose a neural network control strategy that does not learn the dynamics of the process. The design is based on the observation that, in general, the control objective is achieved by a controller that can generate a good control action. The advantage of this method is a greatly simplified learning process, since the input and output of the ABS controller are known; therefore, training takes very little time. It also follows the concept of a regulator, which maintains the process output at a fixed set-point (20% slip for ABS) [10, 11] in the presence of disturbances from various road types. The amount of braking power and energy retrieved on a dense snow road is less than on a cobblestone wet road; the torque required for a low-friction road is less than that required for a medium-friction road [14].

Nomenclature
F      Force
M      Moment
v      Velocity
bv     Viscous friction coefficient proportional to linear motion
γ      Angle of inclination of the road
m      Mass of the vehicle
g      Gravitational acceleration constant
Tb     Braking torque
Rwb    Viscous friction force between tire and brake shoe
rw     Radius of the wheel
fb     Friction force
I      Rotational inertia of the wheel
Fx     Longitudinal braking force
vx     Longitudinal velocity of the vehicle
λ      Wheel slip
μ(λ)   Friction coefficient as a function of slip ratio
μ      Coefficient of wheel slip
Cd     Discharge coefficient
qws    Rate of fluid flow
Aws    Area of the wheel cylinder
ρ      Fluid mass density
Pws    Pressure in the wheel cylinder
av     Acceleration of the vehicle
θ      Bias serving as a threshold
w      Weight connection
x      Input vector
C1     Maximum value of the friction curve
C2     Friction curve shape
C3     Difference between the friction curve maximum and its value at λ = 1
C4     Wetness characteristic value (range 0.02–0.04 s/m)
As illustrated in Fig. 1, the tire forces and moments generated by the road surface act on the tire. The longitudinal force Fx, the lateral force Fy, and the normal force Fz are the forces acting along the x, y, and z axes, respectively. Similarly, the moments acting along the x, y, and z axes are as follows: the overturning moment Mx, the rolling resistance moment My, and the self-aligning moment Mz [2, 15].
Fig. 1. Tire forces and moments [2]
2 The Mathematical Model of ABS

A simplified quarter-vehicle model is shown in Fig. 2. Each block contains a transfer function developed for the ABS: the linear vehicle dynamics, the rotational dynamics of a single wheel, the pressure dynamics in the wheel cylinder, the slip estimator, and the solenoid valve acting as the actuator. Mathematical modeling is the first and most critical step in designing a regulator-type controller for an antilock braking system. Modeling an antilock braking system, however, is a difficult task because the ABS dynamics are highly nonlinear and time varying. In this paper, a simplified model is therefore used for controller design and computer simulation.

Fig. 2. Simplified model of the closed-loop ABS dynamics
2.1 Linear Motion of the Vehicle

The model considers the wheel speed and vehicle speed as state variables, while the torque applied to the wheel is the input variable. The two state variables are related to the dynamics of a single wheel rotation and to the linear vehicle dynamics. The state equations result from applying Newton's law [16] to the dynamics of the wheel and the vehicle in Fig. 3. Equation (1) [16] sums the forces applied to the
vehicle during normal braking and derives the differential equation for the vehicle:

$$\dot{v} = -\frac{1}{m}\left(\mu(\lambda)\, w \cos\gamma + w \sin\gamma + b_v v\right) \quad (1)$$

where f_b = μ(λ) w cos γ is the friction force and w = m·g, with g the gravitational acceleration constant.

2.2 Rotational Dynamics of a Wheel

To control wheel slip, a dynamical quarter-vehicle model with the essential characteristics of the actual system is used as the design model. This model has been used extensively in the development of longitudinal wheel slip controllers. The dynamics of one wheel is modeled by Eq. (2) [16]:

$$\dot{\omega} = \frac{1}{I}\left(r_w f_b - R_{wb} - T_b\right) \quad (2)$$
Fig. 3. Wheel model body diagram
The wheel slip λ is the normalized difference between the vehicle's angular velocity and the wheel's angular velocity. A slip value of λ = 0 denotes free motion of the wheel, when no friction force is applied. When the slip reaches λ = 1, the wheel is locked.

2.3 Pressure Dynamics in the Wheel Cylinder

The wheel dynamics are driven by the hydraulic system. The fluid flow in the wheel cylinder is assumed to be incompressible. The differential equation is derived by treating the flow as flow across a sudden area change (such as flow through a nozzle). From Bernoulli's theorem, the pressure dynamics in the wheel cylinder are given by Eq. (3):

$$\frac{dP_{ws}}{dt} \cong C_d\, q_{ws} \quad (3)$$

The solenoid valve is modeled as a gain: the valve's operating pressure divided by the controlling current (−1 to 1 A). Therefore, from Eq. (3), the braking torque in Eq. (2) can be derived as $T_b \cong P_{ws} \cdot A_{ws} \cdot r_w$. It also follows that the manipulated variable is the wheel cylinder pressure P_ws.
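To make the structure of Eqs. (1)–(3) concrete, the sketch below integrates the simplified quarter-vehicle model with a forward-Euler step. It is only an illustration of the equations as written: the parameter values and function names are placeholders of ours, not the values or code used in the paper's simulation.

```python
import math

# Illustrative parameters (placeholders, not the paper's values)
m, I, rw = 350.0, 1.0, 0.3   # quarter-vehicle mass [kg], wheel inertia [kg m^2], wheel radius [m]
bv, Rwb = 5.0, 1.0           # viscous coefficient, brake-shoe viscous friction
g, gamma = 9.81, 0.0         # gravity [m/s^2], road inclination [rad]
Cd = 1.0                     # discharge coefficient (placeholder)

def quarter_car_step(v, omega, Pws, qws, mu, Aws, dt):
    """One Euler step of Eqs. (1)-(3): vehicle speed, wheel speed, cylinder pressure."""
    w = m * g                                   # vehicle weight
    fb = mu * w * math.cos(gamma)               # road-tire friction force (mu depends on slip)
    Tb = Pws * Aws * rw                         # brake torque from wheel cylinder pressure
    v_dot = -(fb + w * math.sin(gamma) + bv * v) / m        # Eq. (1)
    omega_dot = (rw * fb - Rwb - Tb) / I                    # Eq. (2)
    Pws_dot = Cd * qws                                      # Eq. (3)
    return v + v_dot * dt, max(omega + omega_dot * dt, 0.0), Pws + Pws_dot * dt
```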
2.4 The Road-Tire Friction for Various Road Types

The key element in using a nonlinear controller such as a neural network is the nonlinear curve of the friction coefficient versus slip ratio, μ(λ), shown in Fig. 4. The feedback linearizing controller, given the desired wheel slip ratio, cancels all nonlinear dynamics and imposes an acceptable linear behavior on the wheel slip ratio [4]. Another friction model often used to model tire forces is the Burckhardt equation [17], Eq. (4), derived with a similar methodology, in which μ is expressed as a function of the wheel slip λ and the vehicle velocity v (Table 1):

$$\mu(\lambda, v) = \left[C_1\left(1 - e^{-C_2 \lambda}\right) - C_3 \lambda\right] e^{-C_4 \lambda v} \quad (4)$$
Table 1. Friction parameters [17]

Surface          C1      C2      C3      C4
Asphalt-dry      1.029   17.16   0.523   0.035
Asphalt-wet      0.857   33.822  0.347   0.035
Cobblestone-dry  1.3713  6.4565  0.6691  0.035
Snow             0.1946  94.129  0.0646  0.035
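As a quick check on Eq. (4), the following sketch evaluates the Burckhardt model with the Table 1 parameters; it reproduces the qualitative shape of Fig. 4 (a friction peak at low slip followed by a monotonic decrease). The function and dictionary names are ours.

```python
import math

# Burckhardt parameters (C1, C2, C3, C4) from Table 1
SURFACES = {
    "asphalt-dry":     (1.029,  17.16,  0.523,  0.035),
    "asphalt-wet":     (0.857,  33.822, 0.347,  0.035),
    "cobblestone-dry": (1.3713, 6.4565, 0.6691, 0.035),
    "snow":            (0.1946, 94.129, 0.0646, 0.035),
}

def burckhardt_mu(lam, v, surface):
    """Friction coefficient mu(lambda, v) from Eq. (4)."""
    c1, c2, c3, c4 = SURFACES[surface]
    return (c1 * (1.0 - math.exp(-c2 * lam)) - c3 * lam) * math.exp(-c4 * lam * v)

# Example: friction near the reference slip (0.2) at 10 m/s on dry asphalt
print(burckhardt_mu(0.2, 10.0, "asphalt-dry"))
```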
Fig. 4. The typical curve of μ(λ) [2]
Figure 4 illustrates the slip–friction curve for a wheel traveling at a linear speed of 10 m/s under various road conditions. As shown in this figure, the coefficient of friction increases almost linearly with slip until it reaches a maximum value, after which it decreases monotonically. If the wheel becomes locked (λ = 1), the coefficient of
friction decreases, the wheel begins to slide, and steering control may be lost, which is completely unacceptable. To preserve the vehicle's steering capability and lateral stability, and to shorten the stopping distance during braking, the slip value must be maintained within a specified range. Because the slip dynamics are fast and any operating point beyond the peak of the friction curve is open-loop unstable, the slip value is kept within a specified range, referred to as the sweet spot (see the shaded area in Fig. 4). As the side slip angle increases, the longitudinal force decreases. This physical phenomenon is the primary motivation for ABS brakes: avoiding excessive longitudinal slip values preserves the vehicle's steering capability and lateral stability during braking. Manual control is difficult because the slip dynamics are fast and open-loop unstable when operating at wheel slip values to the right of any friction curve peak.

2.5 The Slip Estimator

Slip angle estimation is a critical technical component in the development of vehicle stability control systems. Thus, over the decades, the problem of slip angle estimation has received considerable attention. Slip angle estimation methods can be classified into two categories based on the models used: velocity kinematics-based approaches and dynamics model-based approaches. The kinematics-based approach is robust to changes in vehicle parameters caused by tire-road conditions and driving operations, as it does not depend on vehicle parameters [18]. In real applications, λ is difficult to measure directly because a slip sensor is not fitted to commercial cars. The alternative is to assume that sensors for both acceleration and wheel speed are available. The slip estimator is then defined as Eq. (5):

$$\dot{\lambda}(t) = \frac{1-\lambda(t)}{v(t)}\, a_v(t) - \frac{R_w}{v(t)}\, \dot{\omega}_w(t) \quad (5)$$
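A minimal discrete-time version of the estimator in Eq. (5) might look as follows, assuming the acceleration and wheel-speed signals are sampled with a fixed period dt and the wheel's angular acceleration is obtained by differencing the wheel-speed signal. The guard values are illustrative choices of ours, not part of the paper.

```python
def slip_estimator_step(lam, v, a_v, omega_dot_w, Rw, dt):
    """Integrate Eq. (5): lambda_dot = ((1 - lambda)/v) * a_v - (Rw/v) * omega_dot_w."""
    if v <= 0.1:                                 # guard against division by zero near standstill
        return lam
    lam_dot = ((1.0 - lam) / v) * a_v - (Rw / v) * omega_dot_w
    return min(max(lam + lam_dot * dt, 0.0), 1.0)   # slip is kept in [0, 1]
```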
3 The Theory of Neural Networks and the Error Backpropagation Method

The model of a single neuron, often referred to as a node, is illustrated in Fig. 5, where x is the input vector, w is the weight connection, and θ is a bias serving as a threshold, as shown in Eq. (6). Each neuron determines a net input value based on all of its input connections. The net input of the neuron is defined in Eq. (6):

$$net_i = \sum_j x_j\, w_{ij} + \theta \quad (6)$$
where the output is y_i = f(net_i) and f is an activation function. In this study, we choose a sigmoid function whose output ranges from −1 to 1. A sigmoid function is a mathematical function with a characteristic "S"-shaped curve, as in Eq. (7). A common example of a sigmoid function is the logistic function, defined by the formula:

$$S(x) = \frac{1}{1 + e^{-x}} = \frac{e^x}{e^x + 1} = 1 - S(-x) \quad (7)$$
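The computation of Eqs. (6) and (7) fits in a few lines. Note that the logistic function itself ranges over (0, 1); one common way to obtain the −1 to 1 range mentioned above is the bipolar scaling 2S(x) − 1, which is shown here as an assumption, since the paper does not state the exact scaling it used.

```python
import math

def logistic(x):
    """Logistic sigmoid of Eq. (7): S(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(x, w, theta, bipolar=True):
    """Eq. (6): net = sum_j x_j * w_j + theta, followed by the activation."""
    net = sum(xj * wj for xj, wj in zip(x, w)) + theta
    s = logistic(net)
    return 2.0 * s - 1.0 if bipolar else s   # 2S(x)-1 maps (0,1) onto (-1,1)
```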
Fig. 5. The model of a single neuron

Fig. 6. The architecture for error backpropagation
During training, the network is presented with pairs of input–output patterns. Figure 6 illustrates the architecture used in this study: a three-layer feed-forward neural network trained with error backpropagation (Eq. (8)). The main goal of training is to obtain the weights that minimize the mean square error between the actual network output and the target vectors. Error backpropagation is a generalization of the least mean square method; because the relationship being mapped is nonlinear as well as multi-dimensional, it employs an iterative version of the simple least squares method called the steepest descent technique. The error that is minimized is the sum of the squared errors over all outputs (see Eq. (8)):

$$E_p = \frac{1}{2} \sum_{k=1}^{M} \delta_{pk}^2 \quad (8)$$
where p denotes the pth training vector and k the kth output. To determine the direction in which to change the weights, the negative gradient of E_p with respect to the weights is calculated, and the weights are then adjusted so that the total error is reduced. The chain rule considerably simplifies this calculation. Thus, the weights on the output layer are updated according to Eq. (9):

$$w^o_{kj}(t+1) = w^o_{kj}(t) + \Delta_p w^o_{kj}(t) \quad (9)$$
In Eq. (10) below, η is the learning rate, which controls the speed of convergence. The procedure must be repeated for the hidden layer, whose update is calculated as:

$$\Delta_p w^h_{ji} = \eta\, f_j^{h\prime}(net_{pj}) \left[\sum_k \left(y_{pk} - o_{pk}\right) f_k^{o\prime}(net_{pk})\, w^o_{kj}\right] x_{pi} \quad (10)$$
As the equation above shows, every update on the hidden layer depends on all the error terms δ_pk = y_pk − o_pk of the output layer.
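A compact sketch of the update rules in Eqs. (8)–(10) for a single training pattern of a one-hidden-layer network. It uses the logistic activation, whose derivative is s(1 − s), and plain lists rather than a matrix library so that the correspondence with the indexed equations stays visible; bias updates are omitted for brevity, and all names are ours.

```python
import math

def backprop_step(x, y, w_h, w_o, theta_h, theta_o, eta=0.1):
    """One steepest-descent step of Eqs. (8)-(10); w_h[j][i] and w_o[k][j] are weight lists."""
    # Forward pass through the hidden and output layers
    net_h = [sum(wi * xi for wi, xi in zip(row, x)) + th for row, th in zip(w_h, theta_h)]
    h = [1.0 / (1.0 + math.exp(-n)) for n in net_h]
    net_o = [sum(wj * hj for wj, hj in zip(row, h)) + to for row, to in zip(w_o, theta_o)]
    o = [1.0 / (1.0 + math.exp(-n)) for n in net_o]
    # Output-layer deltas: (y_pk - o_pk) * f'(net_pk), cf. Eqs. (8) and (9)
    delta_o = [(yk - ok) * ok * (1.0 - ok) for yk, ok in zip(y, o)]
    # Hidden-layer deltas propagate the output errors back through w_o, cf. Eq. (10)
    delta_h = [hj * (1.0 - hj) * sum(delta_o[k] * w_o[k][j] for k in range(len(delta_o)))
               for j, hj in enumerate(h)]
    # Weight updates: w(t+1) = w(t) + eta * delta * input
    for k, dk in enumerate(delta_o):
        w_o[k] = [wkj + eta * dk * hj for wkj, hj in zip(w_o[k], h)]
    for j, dj in enumerate(delta_h):
        w_h[j] = [wji + eta * dj * xi for wji, xi in zip(w_h[j], x)]
    return 0.5 * sum((yk - ok) ** 2 for yk, ok in zip(y, o))   # E_p of Eq. (8)
```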
4 The Simulation Results

A Matlab program was developed to simulate the ABS under various road conditions. The vehicle parameters used in the simulation are shown in Table 2.

Table 2. Parameters used in simulation

Parameter  Value
v          80 km/h
μ(λ)       50%
dλm/dt     −10λm(t) + 10λr
λr         0.2
A 200 N pedal force, corresponding to a wheel cylinder pressure of 1.1 × 10^6 Pa (rising from 0 to 1.1 × 10^6 Pa within 1.5 s), is applied to simulate panic-stop braking. The slip model reference follows Layne et al. [16], where λm is the model slip and λr the reference slip (0.2). The road types are dry asphalt, wet asphalt, wet surface, and transitions among them: Transition 1 is the transition from wet asphalt to wet surface, and Transition 2 is the transition from dry asphalt to wet asphalt. The vehicle's equivalent angular speed is

$$\omega_v = \frac{V_v}{r_w} \quad (11)$$
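Equation (11) converts the vehicle speed into an equivalent wheel angular speed; combined with the measured wheel angular speed, the slip follows immediately. A sketch of that calculation (the function name is ours, and the denominator is written with the wheel radius r_w from the nomenclature):

```python
def wheel_slip(v_vehicle, omega_wheel, r_w):
    """Slip from vehicle speed and wheel speed: lambda = (omega_v - omega_w) / omega_v."""
    omega_v = v_vehicle / r_w          # Eq. (11)
    if omega_v <= 0.0:
        return 0.0
    return max(0.0, min(1.0, (omega_v - omega_wheel) / omega_v))
```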
The wheel rotates with an initial angular speed that corresponds to the vehicle speed before the brakes are applied. We used separate integrators to compute the wheel angular speed and the vehicle speed, and the two speeds are used to calculate the slip via Eq. (11). To demonstrate the ABS system's potential, we compare the stopping distance and braking time of three different braking systems: without a controller, with a Bang-Bang controller, and with a Fuzzy Logic Controller (FLC). The figures below show only a subset of the braking control strategies discussed previously; stopping distance and braking time serve as the indicators of each control strategy's effectiveness. Figures 7 and 8 illustrate the simulation results for an ABS without a controller, in which the wheel locks immediately while the car (vehicle) continues to move. This is
Fig. 7. Wheel speed and vehicle speed without controller on dry asphalt

Fig. 8. Wheel speed and vehicle speed without controller on wet asphalt

Fig. 9. Wheel speed and vehicle speed with Bang-bang controller on dry asphalt
an undesirable condition that lengthens the braking distance and braking time; moreover, the steering would become uncontrollable. The Bang-bang control shows improved performance, but it oscillates because the solenoid valve can only be fully opened or fully closed (illustrated in Figs. 9 and 10). The best performance is achieved by the neural-network-based regulator, which decreases both the braking distance and the time needed, as shown in Figs. 11 and 12. The main
Fig. 10. Wheel speed and vehicle speed with Bang-bang controller on wet asphalt

Fig. 11. ABS with neuroregulator on dry asphalt

Fig. 12. ABS with neuroregulator on wet asphalt.
reason is that the slip is maintained at the optimal value of 0.2 (20%). As can be seen in Figs. 13, 14 and 15, the neural network reaches a slip of 0.2 faster than the other control strategies.
Fig. 13. Slip without controller.
Fig. 14. Slip with Bang-bang controller.
Fig. 15. Slip with Neuroregulator.
5 Conclusions

This paper proposes a solution to the problem above by applying a controller that is able to learn, based on an artificial neural network. With this neural network concept, there is no need to learn the inverse dynamics of the controlled plant, as is usual for neural networks in control systems. The design concept stems from the view that the main objective of control system design is to determine the controller that generates the signal achieving the
best performance of the plant output. Consequently, the neural network training takes only a little time (about 300–500 epochs) for the network to reach the desired error target (0.02). The simulation shows good performance with respect to braking criteria, such as maintaining the 20% slip value and staying near the maximum point of the braking coefficient under various adverse road conditions; in addition, the neuro-regulator keeps the slip within 10–30%. The simulation's effectiveness is compared with two other controllers, i.e., the Bang-bang controller and the fuzzy logic controller, under various road conditions.

Acknowledgments. This work was supported by the Ministry of Science and Technology (MOST), Taiwan, under Grant No. MOST 110-2218-E-468-001-MBK. This work was also funded under Grant No. 110-2221-E-468-007. This work was also supported in part by the Ministry of Education under Grant No. I109MD040, in part by Asia University Hospital under Grant No. 10951020, in part by Asia University, Taiwan, and China Medical University Hospital, China Medical University, Taiwan, under Grant No. ASIA-108-CMUH-05, and in part by Asia University, Taiwan, and UMY, Indonesia, under Grant No. 107-ASIA-UMY-02.
References

1. Majid, M.A., Bakar, S.A., Mansor, S., Hamid, M.A., Ismail, N.: Modelling and PID value search for antilock braking system (ABS) of a passenger vehicle. J. Soc. Automot. Eng. Malaysia 1(3) (2017)
2. Bera, T.K., Bhattacharya, K., Samantaray, A.K.: Evaluation of antilock braking system with an integrated model of full vehicle system dynamics. Simul. Model. Pract. Theory 19(10), 2131–2150 (2011)
3. Gowda, V.D., Ramachandra, A., Thippeswamy, M., Pandurangappa, C., Naidu, P.R.: Modelling and performance evaluation of anti-lock braking system. J. Eng. Sci. Technol. 14(5), 3028–3045 (2019)
4. Anwar, S., Ashrafi, B.: A predictive control algorithm for an anti-lock braking system. SAE Trans. 484–490 (2002)
5. Gowda, D., Kumar, P., Muralidhar, K., BC, V.K.: Dynamic analysis and control strategies of an anti-lock braking system. In: 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), pp. 1677–1682. IEEE (2020)
6. Aksjonov, A., Ricciardi, V., Augsburg, K., Vodovozov, V., Petlenkov, E.: Hardware-in-the-loop test of an open loop fuzzy control method for decoupled electro-hydraulic antilock braking system. IEEE Trans. Fuzzy Syst. (2020)
7. Reif, K.: Brakes, Brake Control and Driver Assistance Systems. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-658-03978-3
8. Rajendran, S., Spurgeon, S., Tsampardoukas, G., Hampson, R.: Time-varying sliding mode control for ABS control of an electric car. In: IFAC-PapersOnLine, vol. 50, no. 1, pp. 8490–8495 (2017)
9. Ivanov, V., Savitski, D., Augsburg, K., Barber, P.: Electric vehicles with individually controlled on-board motors: revisiting the ABS design. In: 2015 IEEE International Conference on Mechatronics (ICM), pp. 323–328. IEEE (2015)
10. Habibi, M., Yazdizadeh, A.: A new fuzzy logic road detector for antilock braking system application. In: IEEE ICCA 2010, pp. 1036–1041. IEEE (2010)
11. Fedin, A.P., Kalinin, Y.V., Marchuk, E.A.: Antilock braking system fuzzy controller optimization with a genetic algorithm in a form of cellular automaton. In: 2020 4th Scientific School on Dynamics of Complex Networks and their Application in Intellectual Robotics (DCNAIR), pp. 78–81. IEEE (2020)
12. Pramudijanto, J., Ashfahani, A., Lukito, R.: Designing neuro-fuzzy controller for electromagnetic anti-lock braking system (ABS) on electric vehicle. J. Phys. Conf. Ser. 974(1), 012055 (2018). IOP Publishing
13. Poursamad, A.: Adaptive feedback linearization control of antilock braking systems using neural networks. Mechatronics 19(5), 767–773 (2009)
14. Itani, K., De Bernardinis, A., Khatir, Z., Jammal, A.: Comparison between two braking control methods integrating energy recovery for a two-wheel front driven electric vehicle. Energy Convers. Manag. 122, 330–343 (2016). https://doi.org/10.1016/j.enconman.2016.05.094
15. Rajamani, R.: Vehicle Dynamics and Control. Springer, Heidelberg (2011). https://doi.org/10.1007/978-1-4614-1433-9
16. Layne, J.R., Passino, K.M., Yurkovich, S.: Fuzzy learning control for antiskid braking systems. IEEE Trans. Control Syst. Technol. 1(2), 122–129 (1993)
17. Oudghiri, M., Chadli, M., El Hajjaji, A.: Robust fuzzy sliding mode control for antilock braking system. Int. J. Sci. Tech. Autom. Control 1(1), 13–28 (2007)
18. Lee, S.-H., Son, Y., Kang, C.M., Chung, C.C.: Slip angle estimation: development and experimental evaluation. In: IFAC Proceedings Volumes, vol. 46, no. 10, pp. 286–291 (2013)
Physical Memory Management with Two Page Sizes in Tender OS

Koki Kusunoki, Toshihiro Yamauchi, and Hideo Taniguchi
Graduate School of Natural Science and Technology, Okayama University, Okayama 700-8530, Japan
[email protected], {yamauchi,tani}@cs.okayama-u.ac.jp
Abstract. Physical memory capacity has increased owing to large-scale integration. In addition, memory footprints have grown, as multiple programs are executed on a single computer. Many operating systems manage physical memory by paging with a 4 KB page. Therefore, the number of entries in the virtual-to-physical address translation table increases along with the size of the memory footprints. This causes a decrease in the translation lookaside buffer (TLB) hit ratio, resulting in performance degradation of the application. To address this problem, we propose the implementation of physical memory management with two page sizes: 4 KB and 4 MB. This allows us to expand the range of addresses translated by a single TLB entry, thereby improving the TLB hit rate. This paper describes the design and implementation of the physical memory management mechanism that manages physical memory using two page sizes on The ENduring operating system for Distributed EnviRonment (Tender OS). Our results showed that when the page size is 4 MB, the processing time of memory allocation can be reduced by as much as approximately 99.7%, the processing time of process creation can be reduced by as much as approximately 51%, and the processing time of memory operations can be reduced by as much as approximately 91.9%.
1 Introduction

Memory chips have achieved larger capacity and smaller size owing to large-scale integration (LSI). Therefore, physical memory capacity has increased. In addition, many programs are executed on a single computer having increased physical memory capacity. For example, in CentOS 8.3, 63 resident processes can run in parallel. Many operating systems (OSs) manage physical memory in units of 4 KB. However, the processing demands of applications have increased because applications offer many functions. Therefore, the number of entries in the virtual-to-physical address translation table increases, the translation lookaside buffer (TLB) hit rate decreases, and application performance deteriorates. To solve this problem, technology for managing physical memory with a large page size is being developed [5–8, 10]. By increasing the size of the base unit for managing physical memory (the page size), the number of address translation table entries is reduced.
Fig. 1. Resource identifier used to manage the resources that encapsulate objects manipulated by OS.
The TLB hit rate is improved, and application programs run faster. However, this introduces the problem that memory fragmentation increases. In The ENduring operating system for Distributed EnviRonment (Tender OS) [2], the physical memory resource manages physical memory with a page size of 4 KB. In this paper, we propose the design and implementation of physical memory resources with page sizes of 4 KB and 4 MB to reduce the number of entries in the address translation table. In addition, we report the results of basic evaluations.
2 Tender OS

In Tender OS, objects manipulated by the OS are encapsulated as resources and separated so that they become independent. Resources are identified by a resource name and a resource identifier. The resource identifier, as shown in Fig. 1, is given at the time of resource creation, and has a location, a type, and a serial number within the type of resource. In addition, there is an operations interface for each type of resource (program components) and a table for managing data related to each resource. The program components use a unified interface, called the resource interface controller (RIC). The memory-related resources of Tender OS include physical memory, virtual region, virtual space, virtual user space, and virtual kernel space. The serial number of the resource identifier for the physical memory resource is related directly to the physical address, as shown in Fig. 2. Specifically, the physical address can be fetched by left-shifting the serial number by 12 bits. Therefore, processing based on physical memory resources is fast. The virtual region is a resource that virtualizes a memory image, which resides in physical memory or external storage. The virtual space is the space to be mapped to virtual addresses and corresponds to the address translation table. The virtual user space and virtual kernel space are spaces accessible from the processor through virtual addresses. These spaces are created by attaching a virtual region to a virtual space and are deleted when the virtual region is detached. Here, attaching implies the mapping of a virtual address to a physical address. Specifically, the physical address or the address in external storage is set in the entry of the address translation table that matches the virtual address.
3 Physical Memory Resources with Two Page Sizes

3.1 Background

When the page size is small (e.g., 4 KB), the allocated memory size can vary according to the memory footprint of the running process, thus reducing fragmentation.
Fig. 2. Relationship between resource identifier and physical address.
However, because the number of entries in the address translation table increases, the TLB hit rate decreases. Conversely, if the page size is large (e.g., 4 MB), the allocated memory size tends to exceed the memory footprint of the running process. This results in fragmentation, and there is a high possibility of memory shortage. On the other hand, because the number of address translation table entries is reduced, the TLB hit rate improves. Therefore, considering the advantages and disadvantages of each page size, we implemented a physical memory resource with two page sizes, together with a method that makes the best use of the advantages of both page sizes.

3.2 Design
3.2.1 Requirements
In this study, we attempted to design a memory management mechanism that does not require memory compaction by separating the physical memory space according to page size, unlike other OSs such as Linux. There are two requirements for a physical memory resource with two page sizes.

(Requirement 1) The method must retain high-speed processing. In a physical memory resource, the serial number of the resource identifier and its physical address are related. This speeds up the task of fetching physical addresses and using physical memory resources. Even for physical memory resources with two page sizes, these tasks must retain high-speed processing.

(Requirement 2) The interface for resource manipulation must be unified. To improve the TLB hit rate while considering the advantages and disadvantages of each page size and preventing memory shortage due to fragmentation, the page size must be switched according to the memory footprint of the running process. The interface for each page size should therefore be unified.

3.2.2 Resource Identifier
The two page sizes used in this study were 4 KB and 4 MB, and the physical memory resource identifiers for these page sizes are described as follows and presented in Fig. 3. (1) The most significant bit of the serial number of the resource identifier is used as the page size bit to distinguish between page sizes. For page size bits of 0 and 1, the page sizes are 4 KB and 4 MB, respectively. (2) The lower 15 bits of the serial number of the resource identifier are used as the information corresponding to the physical address.
The physical address space managed by this configuration of the resource identifier is shown in Fig. 4. (1) For the 128 MB space with physical addresses ranging from 0x0 to 0x07ffffff (128 MB), the page size is 4 KB. (2) For the space with physical addresses of 0x08000000 (128 MB) or higher, the page size is 4 MB. The maximum size of this space is 128 GB. As mentioned earlier, the serial number is related directly to the physical address, as with a conventional physical memory resource. This keeps the use of physical memory resources fast, satisfying Requirement 1. Specifically, when the page size bit is 0, the physical address is fetched by left-shifting the serial number of the resource identifier by 12 bits. When the page size bit is 1, the physical address is fetched by left-shifting the serial number by 22 bits and adding the first address of the space managed when the page size bit is 1. With this resource identifier, it is possible to use both page sizes with a single interface, thus satisfying Requirement 2.
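The address computation just described can be written down directly. The following sketch of the bit layout uses field names of ours; the 4 MB region's base address is the 128 MB boundary given in the text, and the serial number is assumed to be 16 bits wide (one page size bit plus 15 address bits, as in Fig. 3).

```python
PAGE_SIZE_BIT = 1 << 15          # most significant bit of the 16-bit serial number
SERIAL_MASK   = PAGE_SIZE_BIT - 1
BASE_4MB      = 0x08000000       # first address of the 4 MB-page space (128 MB)

def physical_address(serial):
    """Serial number of a physical memory resource -> physical address (Sect. 3.2.2)."""
    if serial & PAGE_SIZE_BIT:                        # page size bit = 1 -> 4 MB page
        return ((serial & SERIAL_MASK) << 22) + BASE_4MB
    return serial << 12                               # page size bit = 0 -> 4 KB page
```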
Fig. 3. Resource identifier of physical memory resource for two page sizes.
Fig. 4. Physical address space.
3.2.3 Interface
To satisfy Requirement 2, the interface for the open operation of the physical memory resource is retained, as described below. In the open operation of a physical memory resource, the argument num indicates the number of pages to be allocated, and the maximum number of pages to be allocated is 65536 (16 bits). Therefore, the most significant bit is used as the bit indicating the page size (the page size bit). If the page size bit is 0, the page size is 4 KB; if the page size bit is 1, the page size is 4 MB. This means that the operation for 4 KB pages is the same as that of conventional physical memory resources.
4 Implementation

4.1 Challenges
The implementation of physical memory resources with two page sizes involves the following challenges.

Challenge 1: Efficient structure for the program components of physical memory resources. Regardless of whether the page size of the physical memory resource is 4 KB or 4 MB, creation and deletion are provided through the same interface. This increases efficiency and reduces the amount of modification required. Moreover, adding further page sizes should be easy.

Challenge 2: Creation of an appropriate physical memory resource in the virtual region. The physical memory resource is always used in association with the virtual region. Therefore, the virtual region must create the physical memory resource by making the best use of the advantages and disadvantages of both the 4 KB and 4 MB page sizes.

Challenge 3: Attaching 4 KB and 4 MB pages in the virtual space. The virtual memory space used by the OS and processes consists of multiple virtual kernel spaces and virtual user spaces. Here, the virtual memory space must be able to combine mappings of both 4 KB and 4 MB page sizes to improve the efficiency of memory utilization by the OS and processes. In other words, when a virtual space attaches virtual regions to create a virtual kernel space or virtual user space, it must be possible to combine virtual regions with different page sizes.

We solve these challenges by designing the memory management using the following features of the Intel x86 architecture (32-bit). The 4 MB page is enabled by the page size extension (PSE). In addition, physical memory exceeding 4 GB is enabled by PSE-36 (36-bit page size extension). The maximum physical address available with PSE-36 is 36 bits wide; therefore, the available physical memory is limited to 64 GB.

4.2 Efficient Structure for Program Components of a Physical Memory Resource
Challenge 1 is solved by implementing a method that manages the physical memory resources for both page sizes in a single program component. Physical memory resources manage the physical memory using a bitmap. The program component that manages the two page sizes using the bitmap method is shown in Fig. 5 and described below. The open operation of the physical memory resource involves (1) fetching the address of the bitmap table, (2) searching for free space in the bitmap table, and (3) allocating the free space and putting it to use. Steps (2) and (3) are the same for each page size. In addition, steps (2) and (3) are the same as those in the conventional physical memory
resource. Therefore, only step (1) must be changed in the implementation of a physical memory resource that manages two page sizes. Specifically, step (1) is modified to create a bitmap table for each page size and fetch the bitmap table for the specified page size. This reduces the amount of modification required. In addition, more types of page sizes can easily be added by adding more types of bitmap tables.
Fig. 5. Program component that manages two page sizes.
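A toy version of the component in Fig. 5 is sketched below: one bitmap per page size, with only step (1) depending on the page size. Everything here (the names, the single-page allocation, and Python rather than the OS's implementation language) is an illustrative simplification of the actual Tender OS component.

```python
class PhysMem:
    """Toy bitmap manager with one bitmap per page size (cf. Fig. 5)."""

    def __init__(self, pages_4k, pages_4m):
        self.bitmaps = {"4K": [False] * pages_4k,   # False = free, True = in use
                        "4M": [False] * pages_4m}

    def open(self, page_size):
        bm = self.bitmaps[page_size]          # step (1): fetch the bitmap for the page size
        for i, used in enumerate(bm):         # step (2): search for free space
            if not used:
                bm[i] = True                  # step (3): allocate and put to use
                return i                      # serial number within this page size
        raise MemoryError("no free page of size " + page_size)

    def close(self, page_size, serial):
        self.bitmaps[page_size][serial] = False
```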
4.3 Creation of an Appropriate Physical Memory Resource in a Virtual Region

Challenge 2 is solved by implementing a method for managing physical memory resources using information about the virtual region, considering the advantages and disadvantages of each page size. Specifically, when the requested amount of physical memory is small, the physical memory resource is created with the 4 KB page size, and when the requested amount is large, the physical memory resource is created with the 4 MB page size. In the virtual region interface, the virtual region is created by specifying its size. Therefore, if the size of the requested virtual region is greater than or equal to a threshold value M, a 4 MB page size physical memory resource is created; if the size is less than the threshold, a 4 KB page size resource is created. For example, if M = 3.5 MB, a physical memory resource with a 4 MB page size is used to create a virtual region larger than 3.5 MB. This method allows the creation of physical memory resources in a way that considers the advantages and disadvantages of each page size. For example, a process with a small memory footprint creates a physical memory resource with the 4 KB page size, while a process with a large memory footprint creates a physical memory resource with the 4 MB page size.

4.4 Attaching 4 KB and 4 MB Pages into the Virtual Space

Challenge 3 is solved by implementing a function that maps in 4 KB and 4 MB units during the attachment step, and a method that switches the mapping unit according to the page size. In the virtual space, the virtual address space is managed using paging. The paging structure that uses a physical memory resource with two page sizes is shown in Fig. 6 and described below.
Fig. 6. Paging structure using physical memory with two page sizes.
The page size can be switched at any 4 MB boundary in the virtual address space. For example, a physical memory resource with a page size of 4 KB can be mapped to the space where the process is loaded, and a physical memory resource with a page size of 4 MB can be mapped to the space used as the work space. This switching of page size is achieved by changing the value of the page size bit (PS) in the page directory entry. When the page size is 4 KB, two address translation tables, the page directory and the page table, are required to fetch the physical address from the virtual address. A page directory manages 1024 page tables, and a page table manages 1024 pages with a page size of 4 KB. For example, when 4 MB of memory is used, at least one page directory entry and 1024 page table entries are required. On the other hand, when the page size is 4 MB, only one address translation table, the page directory, is required to fetch the physical address from the virtual address. The page directory manages 1024 pages with a page size of 4 MB. When 4 MB of memory is used, only one page directory entry is required. By setting the page size to 4 MB, the number of page table entries needed to map 4 MB of memory is reduced from 1024 to 1 compared with the case of a 4 KB page size. This implies that the number of TLB entries required to use 4 MB of memory is reduced from 1024 to 1.
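The entry-count arithmetic in this subsection is easy to verify. The sketch below counts the page-directory entries (PDEs) and page-table entries (PTEs) needed to map a region at each page size and, following Sect. 4.3, picks the page size by the threshold M; the threshold value is the example from that section, not a prescribed constant, and the function names are ours.

```python
PAGE_4K, PAGE_4M = 4 << 10, 4 << 20

def translation_entries(size, page_size):
    """Number of (PDEs, PTEs) needed to map `size` bytes at the given page size."""
    if page_size == PAGE_4M:
        pdes = -(-size // PAGE_4M)       # ceil division: one PDE per 4 MB page
        return pdes, 0                   # no page tables are needed
    ptes = -(-size // PAGE_4K)           # one PTE per 4 KB page
    pdes = -(-ptes // 1024)              # each page table holds 1024 PTEs
    return pdes, ptes

def choose_page_size(region_size, M=int(3.5 * 2**20)):   # example threshold from Sect. 4.3
    return PAGE_4M if region_size >= M else PAGE_4K

# Mapping 4 MB: 4 KB pages need (1 PDE, 1024 PTEs); one 4 MB page needs (1 PDE, 0 PTEs)
print(translation_entries(4 << 20, PAGE_4K), translation_entries(4 << 20, PAGE_4M))
```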
5 Evaluation

5.1 Overview

We evaluated the processing times for (1) memory allocation, (2) process creation, and (3) memory operation to show the difference in processing times when using physical memory with each page size. Evaluation (1) shows the effect of differences in the address translation table, due to differences in page size, on the processing time for memory allocation. Evaluation (2) shows the effect of evaluation (1) on the processing time for process creation. Evaluation (3) shows the effect of the difference in the number of TLB misses on the processing time of the memory operation. In this evaluation, regardless of the threshold value M, physical memory is allocated in units of 4 KB for the 4 KB page size and in units of 4 MB for the 4 MB page size. The evaluations were performed on a computer with an Intel Core i7-2600 3.4 GHz CPU (Data TLB0: 2 MB or 4 MB pages, 32 entries; Data TLB: 4 KB pages, 64 entries; Instruction TLB: 4 KB pages, 64 entries; Shared 2nd-level TLB: 4 KB pages, 512 entries) and 8 GB RAM, running Tender OS (single core).

5.2 Processing Time for Memory Allocation

We compared the processing time for allocating physical memory of a specific size for each page size and attaching it to the virtual space as a virtual user space. The evaluation results for memory allocation are presented in Table 1 and described below. The processing time when one physical memory resource was created and attached for each page size (4 KB and 4 MB of memory were allocated for the 4 KB and 4 MB page sizes, respectively) was approximately 20% longer (2.62 μs to 3.26 μs) for the 4 MB page size than for the 4 KB page size. However, this difference is considerably smaller than the effect of increasing the size of the allocated memory; the memory allocation process is essentially the same in both cases because the number of registrations to the address translation table is 1. The processing time when 4 MB of physical memory was created and attached was reduced by approximately 99.7% (1065.16 μs to 3.26 μs) for the 4 MB page size compared with the 4 KB page size. This is because when the page size is 4 KB, the number of registrations to the address translation table is 1024, whereas when the page size is 4 MB, this number is 1, thereby reducing the processing time for memory allocation. In addition, when the page size is 4 KB, the number of physical memory resources created is 1024, whereas it is 1 when the page size is 4 MB. This reduces the number of resource operations, further reducing the processing time for memory allocation.
Table 1. Evaluation results for memory allocation.

Page size  Allocation memory size  Processing time [μs]
4 KB       4 KB                    2.62
4 KB       4 MB                    1065.16
4 MB       4 MB                    3.26
Fig. 7. Evaluation results of the process creation when the program size was increased in increments of 8 KB from 8 to 400 KB.
5.3 Processing Time for Process Creation
We evaluated the processing time to create a process of the same size for each page size. The text and data sections of the process were increased in increments of 4 KB from 4 to 200 KB, meaning that the program size was increased in increments of 8 KB from 8 to 400 KB. The evaluation results for process creation are shown in Fig. 7 and described below. When the size of the process to be created was small, the processing time for process creation was the same for each page size. This is because the size of the memory allocated during process creation is small, and the processing times for memory allocation do not differ significantly. When the size of the process to be created was large, the processing time for process creation was reduced by approximately 51% (325 μs to 160 μs) for the 4 MB page size compared with the 4 KB page size. As shown in Sect. 5.2, the processing time for memory allocation can be reduced by increasing the page size. During process creation, the program contents are copied to the allocated memory. In this case, the TLB hit rate is higher with the 4 MB page size than with the 4 KB page size, and the processing time is reduced.
Table 2. Evaluation results for the memory operation.

Page size  Processing time [μs]
4 KB       128.54
4 MB       10.42
5.4 Processing Time for Memory Operation

To confirm the effect of reducing TLB misses by increasing the page size, we evaluated the processing time for a memory operation. The evaluation method was as follows: (1) randomly write to a 4 MB memory space (in 4 B units); (2) generate random offsets at 4 KB intervals, such as 4096 * (rand() % 1024) (using the same random values for each page size); (3) measure the processing time for 1024 random writes. The evaluation results for the memory operation are shown in Table 2. When the page size is 4 KB, the memory operation is performed at 4 KB intervals, resulting in a maximum of 1024 TLB misses. However, when the page size is 4 MB, the TLB misses only once, during the first memory operation. Thus, when the page size is 4 MB, the number of TLB misses is reduced, and the processing time for the memory operation is reduced by about 91.9% (128.54 μs to 10.42 μs) compared with the 4 KB page size.
6 Related Work

Both Windows and Linux provide a function for creating physical memory with a large page size (huge pages), which is available to users [1, 4]. However, users must be aware of the availability of huge pages when using these functions. Tender OS can automatically manage and allocate huge pages. Linux supports Transparent Huge Pages [3], where the OS tracks the allocation of these huge pages. In addition, the studies in references [5, 6, 8] investigated algorithms for allocating and releasing Transparent Huge Pages. Reference [7] studied the usage of large pages and memory compaction with unmovable pages. Reference [10] studied FreeBSD's algorithm for managing huge pages. Tender OS differs from these OSs in that it manages a separate physical address space for each page size. This allows large pages to be allocated without fragmentation. In addition, Tender OS can easily manage huge pages that require an extremely large contiguous area, such as 1 GB. A technique in which large pages were managed by the OS without memory compaction was studied in reference [9]; however, it required hardware changes.
7 Conclusion

This paper described the design and implementation of a physical memory resource with two page sizes. To reduce the number of address translation table entries by using
such a physical memory resource, we implemented the following functions. The paging structure of the attachment function was changed so that physical memory could be mapped as pages with a size of 4 MB. In addition, because the virtual region is used when attaching physical memory, the function for creating the virtual region was changed to create a physical memory resource by switching the page size according to a threshold on the required memory size. In the evaluation, we showed that when the page size is 4 MB, the processing time for memory allocation can be reduced by as much as approximately 99.7%, and the processing time for process creation can be reduced by as much as approximately 51%. In addition, we showed that the processing time for the memory operation could be reduced by as much as approximately 91.9%. Future studies should investigate the effect of different page sizes on program execution time and implement a method to use different page sizes effectively. They should also investigate how the approach can be applied to other architectures (e.g., AMD).

Acknowledgements. This research was partially supported by Grant-in-Aid for Scientific Research 21K11830.
References

1. Persistent huge pages in Linux. https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
2. Tender project. https://www.swlab.cs.okayama-u.ac.jp/lab/tani/research/tender/index.html
3. Transparent huge pages in Linux. https://www.kernel.org/doc/Documentation/vm/transhuge.txt
4. Windows large page support. https://docs.microsoft.com/en-us/windows/win32/memory/large-page-support
5. Kwon, Y., et al.: Coordinated and efficient huge page management with Ingens. In: 12th USENIX Symposium on Operating Systems Design and Implementation, pp. 705–721 (2016)
6. Michailidis, T., et al.: MEGA: overcoming traditional problems with OS huge page management. In: Proceedings of the 12th ACM International Conference on Systems and Storage, pp. 121–131 (2019)
7. Panwar, A., et al.: Making huge pages actually useful. In: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 679–692 (2018)
8. Panwar, A., et al.: HawkEye: efficient fine-grained OS support for huge pages. In: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 347–360 (2019)
9. Park, C.H., et al.: Perforated page: supporting fragmented memory allocation for large pages. In: 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, pp. 913–925 (2020)
10. Zhu, W., et al.: A comprehensive analysis of superpage management mechanisms and policies. In: 2020 USENIX Annual Technical Conference, pp. 829–842 (2020)
Sensor-Based Motion Analysis of Paralympic Boccia Athletes

Ayumi Ohnishi, Tsutomu Terada, and Masahiko Tsukamoto
Kobe University, Kobe, Hyogo, Japan
{ohnishi,tsutomu}@eedept.kobe-u.ac.jp, [email protected]
Abstract. In this study, we used sensors to analyze the movements of Japan's top-level boccia athletes; boccia is one of the official Paralympic sports. Motion analysis in sports has been widely studied, and a gradual, standardized approach is commonly available to help amateurs improve their skills. However, in two specific cases, individual observation is necessary. One is improving the skills of top athletes. The other is that specific handicaps require individualized support. Because top boccia players fit both cases, we measured the movements of each top player using a sensor-based system. During an actual training camp, we took the approach of summarizing and discussing the results of the daytime measurements at night and applying them to the practice on the following day.
1 Introduction

Boccia is a sport that originated in Europe and was designed for people with severe cerebral palsy or similarly severe functional disabilities of the extremities. International competitions, such as the Paralympics, are divided into four classes, BC1–BC4, depending on the degree of disability. Japan's national team, "Fireball Japan", won a silver medal at the Rio de Janeiro Paralympics and has been strengthening itself over the long term with the aim of winning a medal at the 2020 Tokyo Paralympics. Many practice techniques based on motion analysis have been developed in the field of sports. Previous studies have generally established a standard step-by-step approach to improving the skills of amateurs [1]. The analysis sometimes requires various sensors. However, in two particular cases, personal observation is necessary. The first case is improving the skills of top athletes. In the second case, people with physical limitations also require individualized support. Because physical limitations are unique to each individual [2], a practice technique should be designed based on the different mobility of each player. Top boccia players belong to both of the above categories. In such cases, we should analyze each athlete in detail and tailor the practice style specifically to them. One type of detailed analysis is sensor-based measurement. However, previous research on motion analysis in boccia has been limited, and the manner in which sensors can be used to support top-level athletes is still being explored. In this study, using sensors, we analyzed the motions of top-level boccia athletes in Japan to investigate how to support their improvement. In addition, at the national team
training camp, we went through a cycle of summarizing and discussing the results of the first day's measurements before the evening meeting and applying them to the next day's practice. Based on the results of the experiment, we discuss how sensor data should be used to support athletes. In this paper, we introduce related research in Sect. 2 and describe the experiments in Sect. 3. Section 4 describes the results. Finally, Sect. 5 provides some concluding remarks.
2 Related Research

In this section, we introduce research on improving boccia skills. Calado et al. proposed a ball-detection method using machine learning for boccia game analysis [3]. In addition, Alves et al. developed a Sportive Communication Board to support communication with assistants for BC3 athletes [4]. Leite et al. developed a real-time scoring tool to increase the involvement and commitment of elderly people in boccia games [5]. Moreover, Ichiba et al. investigated the relationships among lung function, throwing distance, and psychological competitiveness in Japanese boccia players [6]; respiratory function was extremely low compared with the normal range, yet highly competitive boccia players performed well. Faria et al. developed a realistic boccia game simulator for BC3 players [7] and also investigated the relationship between a participant's experience in driving a wheelchair and their autonomy and independence [8]. Fong et al. investigated the effects of arm and neck fatigue patterns on the performance of boccia athletes [9]; the trapezius muscle showed fatigue after a prolonged boccia game. Reina et al. examined the effect of throwing distance on the accuracy and kinematics of the top players of the Spanish boccia team [10]; they found a positive correlation between throwing speed and accuracy at medium distances and a negative correlation at long distances. Finally, Lapresa et al. built a system for capturing and observing videos of BC3 players for a technical and strategic analysis of the Spanish boccia Paralympic team [11].

Table 1. Classifications for international competitions such as the Paralympics

Class  Target                       Throwing     Ramp  Assistant
BC1    Cerebral palsy               Available*   –     ◦
BC2    Cerebral palsy               Available    –     –
BC3    Cerebral/Non-cerebral palsy  Unavailable  ◦     ◦
BC4    Non-cerebral palsy           Available    –     –
* Leg kicking is accepted.
Although there have been several studies on boccia, a method for supporting top professionals has yet to be established. According to a survey by Lorenzo et al., wearable inertial sensors and electromyography (EMG) sensors are most commonly used to assist people with disabilities, and motion capture is also often used [12]. Therefore, these sensors were also used in the present research.
3 Experimental Method

3.1 Environmental Settings

In international competitions such as the Paralympics, players are divided into four classes (BC1 to BC4) according to the degree of disability, as shown in Table 1. Players in classes BC1, BC2, and BC4 throw the ball with their arms. BC1 players cannot operate a wheelchair and have severe paralysis of the limbs and trunk. BC2 athletes can operate a wheelchair with their arms and have cerebral palsy. BC4 is a class of athletes with muscular dystrophy and other severe limb functional disabilities equivalent to BC1 and BC2. BC3 athletes are the most severely disabled class and are unable to throw on their own; consultative assistants therefore help them adjust the direction and angle of the ramp (a platform that rolls the ball down a slope). Because the physical constraints of each class differ widely, we considered a measurement method suitable for each class.

3.2 System Configuration

The system configuration was determined through discussions between informatics researchers and boccia coaches. Figure 1 shows the system configuration. Because BC3 players do not throw with their arms, we decided to measure only the seat pressure distribution and video for them. The sensors used in BC1, BC2, and BC4 are listed below.
Fig. 1. Sensor configuration: (a) configuration, (b) positions (EMG Ch1 posterior, Ch2 anterior; acceleration and angular velocity along X, Y, Z)

Sensors used in BC1, BC2, and BC4:
• EMG (electromyography) sensor
• Acceleration and angular velocity sensor
• Seat pressure distribution sensor
• Motion capture (first measurement only)
• Video camera
Electrodes were attached to the muscles used while throwing to measure the myoelectricity. An acceleration and angular velocity sensor was attached to the wrist. The measurement frequency was 50 Hz. A pressure distribution sensor was placed on the seat of the wheelchair and was updated every 0.2 s.

Sensors used in BC3:
• Seat pressure distribution sensor
• Video camera
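Because the streams run at different rates (50 Hz for the EMG and inertial sensors, one frame per 0.2 s for the seat pressure sheet), any joint inspection of the kind shown later requires the samples on a common time base. The following is a minimal zero-order-hold resampling sketch, under the assumption that each stream is a sorted list of (timestamp, value) pairs; the function name and data layout are ours, not part of the measurement system.

```python
import bisect

def resample_to(timestamps, stream):
    """Hold the most recent sample of `stream` at each requested timestamp.

    `stream` is a sorted list of (t, value) pairs; the zero-order hold keeps
    the 5 Hz seat-pressure readings aligned with the 50 Hz EMG/inertial samples.
    """
    times = [t for t, _ in stream]
    out = []
    for t in timestamps:
        i = bisect.bisect_right(times, t) - 1   # index of last sample with time <= t
        out.append(stream[max(i, 0)][1])
    return out
```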
3.3 Procedure
We conducted two types of measurement experiments, as described below. Based on the measurement results, we investigated support methods for top-level boccia players. The participants were Japanese national team boccia players and players at the national team training camp. Each participant threw balls in a game against themselves rather than against other players: the participant first throws a white jack ball and then throws six red balls and six blue balls in turn, trying to get them close to the jack ball. We treated this trial as one set and acquired approximately five to ten sets of data per player.

Experiment 1: Measurements During Practice. In the first experiment, we measured the throws of players in classes BC1, BC2, and BC4, who were capable of throwing on their own, over two practice days. We then observed their throwing from the acquired data, analyzed their motions, and considered a more effective sensor configuration. Six people participated: one BC1 athlete, four BC2 athletes (one of whom was measured twice), and one BC4 athlete.
Fig. 2. Snapshots of Experiment 1
Experiment 2: A Cycle of Measuring Motions and Improving Skills During a Training Camp. During the two-day, one-night national team training camp, we took measurements, discussed the results at a meeting that night, and applied the findings to the next day's practice and measurements. The motions of the BC3 athletes were also measured during the training camp. The participants were one BC2 athlete and four BC3 athletes (one of whom was measured on both days).
4 Result

4.1 Result 1: Measurements in Practice

Motion capture was used in the first measurement but was excluded from the second measurement because it was difficult to attach. Snapshots of the measurements are shown in Fig. 2, and the sensor waveforms are shown in Fig. 3. The results indicate that the movements of each player differed greatly depending on their physical constraints, and the sensor data differed accordingly. In Fig. 3, the upper left panel shows the acceleration, angular velocity, and EMG waveforms of Player 1 (BC2) for three throws: the first is a normal throw, the second is a ball thrown toward the first ball, and the third is a lobbed throw. During this measurement, we asked the player about their aim. The player stated: "The first throw was a failure and the trajectory of the ball was not good. I wanted to hit it from above." In response to the failure of the first throw, the player corrected the second and third throws by raising the trajectory. The angular velocity Y of the second and third throws was larger than that of the first throw.
Fig. 3. Acceleration, angular velocity, and EMG values in Experiment 1 (Players 1–4). Each panel shows acceleration, angular velocity (gyro), and EMG (anterior/posterior channels) over time for Player 1 (BC2), Player 2 (BC4), Player 3 (BC2), and Player 4 (BC1), with throwing actions and ball-release instants marked.
Player 2 (BC4) in Fig. 3 had a unique, pendulum-like throwing motion, and the EMG did not change before the ball was released. It was confirmed that the player used force to achieve the arm swing but did not use force at the moment of release. Although Player 3 (BC2) in Fig. 3 was in the same class as Player 1, so their physical constraints were relatively close, the sensor data of the two players were completely different; however, the data were consistent within each individual. The player from BC1 in Fig. 3 threw the ball with a striking full-body motion. Because there was no periodic motion in the throw, no periodic waveform was observed; however, the shape of the waveform and the order in which the sensor signals changed were almost constant for each throw. Because the EMG of the arm did not change much, we assume that the player was throwing the ball using their whole body. The results confirmed that each player's motion differed greatly depending on their physical constraints, and the sensor data differed accordingly. Therefore, to improve a specific player, data on that player must be collected.
Fig. 4. Photographs of measurements taken at the national team training camp (2 days and 1 night).
Fig. 5. Acceleration, angular velocity, and EMG values in Experiment 2 (Player 3 from BC2 in Experiment 1): throwing motion to a long distance (left) and to a short distance (right), with ball-release instants marked.
4.2 Result 2: Cycle of Measuring Motions and Improving Skills During Training Camp
First Day
On the first day, data from one player from BC2 and one player from BC3 were acquired. In the preliminary discussions, the coaches were concerned that showing the sensor values to the athletes could lead to a deterioration of their form. Therefore, as shown on the left side of Fig. 4, only measurements were taken on the first day, and the data were not shown to the athletes. Figure 5 shows a player from BC2 throwing the ball to the far and near sides, respectively. Comparing the waveforms, the acceleration and angular velocity waveforms are similar except for differences in amplitude. However, the EMG waveforms were extremely different, and artifacts were introduced when aiming at a near target. Although we cannot deny the possibility that this is due to the measurement order or the sensor mounting, it is also possible that different muscles were being used; further analysis is therefore required. The pressure distribution on the seat of the wheelchair of the player from BC3 was also measured. From the measurements on the first day, it was found that the player bent forward when aiming.
Fig. 6. Trial in which targeting became inaccurate (before correcting the initial position of the ramp)
Meeting with Coaches on the First Night
During the evening meeting with the coaches, we created and reported graphs of the sensor data measured on the first day. In addition, we discussed possible forms of support; in particular, we talked about strengthening the trunk and about visualization. In the measurements on the first day, the sensors were attached to the athletes without informing them of the measurement results. We believed that we should provide the measurements to the players who cooperated with the assessment. However, we were concerned that showing the data to the players might have negative effects, such as a deterioration of their throwing form. Therefore, after a discussion with the coaches, we decided to present the sitting pressure distribution related to the trunk on the second day because it was predicted to have a relatively small negative effect.
Second Day
For the measurements taken on the second day, to present the results of the seat pressure measurements in real time, we set up a monitor in a location where the players and coaches could see it, as shown in Fig. 4. On this day, we measured four players from BC3, one of whom had also been measured on the first day. The coach and one
of the authors noticed an interesting aspect when the results were shown in real time during the measurements. The coach stated that the targeting of this player occasionally became inaccurate during the latter half of a game. When we observed the motion and sensor values during the experiment, the center of gravity of the player was biased to the right when aiming, as shown in Fig. 6. In some cases, a small bias remained when the player returned to the original posture. Therefore, we discussed the issue with the coach and the assistant and modified the starting position of the ramp. Figure 7 shows a trial after the modification. As shown in the figure, the rightward bias of the center of gravity was corrected by approximately half a square on the sensor sheet, and the center of gravity shifted toward the center. The surface pressure distribution sensor thus contributed to the fine adjustment of the relationship between the body orientation and the starting position of the ramp.
Fig. 7. Trial after correcting the initial position of the ramp
5 Conclusion
In this study, we used sensors to analyze the movements of top-level Japanese athletes in boccia, an official Paralympic sport. In a 2-day, 1-night national team training camp, we summarized and discussed the results of each day's measurements during a nighttime meeting and applied them to the next day's practice. The experiment confirmed that each player's movements differed greatly depending on their physical constraints, and the sensor data also differed among individuals. The measurement and presentation of the seat surface pressure distribution for the BC3 players were shown to be effective in terms of technical support. However, the long-term effects of such measurement and presentation have yet to be confirmed. To improve players' skills, we believe it will be necessary to conduct continual measurements of each player rather than ending with a single measurement. Only a few measurement data are available for boccia in comparison with other sports, and we are continuing to research technical support for top athletes.
Acknowledgements. This work was supported by JST CREST Grant Number JPMJCR18A3, and JSPS KAKENHI Grant Number JP21K17790, Japan.
A Design and Development of a Near Video-on-Demand Systems Tomoki Yoshihisa(B) Cybermedia Center, Osaka University, Osaka, Japan [email protected]
Abstract. Recently, Video-on-Demand (VoD) services have attracted great attention. In conventional VoD services, the processing load and the communication load on the distribution server increase as the number of clients increases. To distribute the loads on the distribution server, some research has focused on Near VoD (NVoD) systems. Most researchers have investigated the waiting time that arises in NVoD systems using simulation programs, because it is difficult to develop NVoD systems that work in actual situations and no openly available NVoD systems exist. However, the simulated waiting time can differ from the actual waiting time because simulators cannot completely reflect the actual situation. A simulation model that omits some seemingly trivial processing or communication flows can give evaluation results that differ from those of actual systems. In this study, we design and develop an NVoD system to investigate performance metrics such as the waiting time and the interruption time in actual situations.
1 Introduction
Due to the recent increase in the speed of wireless communication, Video-on-Demand (VoD) services have attracted great attention. In VoD services, a video distribution server sends video data to each client that requests it. Therefore, the processing load and the communication load on the distribution server increase as the number of clients increases. When the distribution server suffers from heavy loads, it takes a long time for the server to send the video data to the clients. If a client cannot finish receiving a part of the video data before it starts playing that part, an interruption occurs. To avoid the interruptions that occur when the number of clients is large, some research has focused on Near VoD (NVoD) systems. In NVoD systems, the distribution server periodically broadcasts/multicasts the video data to the clients. Since the clients do not send requests and the distribution server does not need to respond to requests, the server's loads do not increase even when the number of clients increases. However, the clients need to wait for the broadcast of the video data since it is broadcast periodically. Thus, a waiting time occurs before the clients can start playing the video. To reduce the waiting time, some NVoD systems adopt the division-based broadcasting technique [1–6]. In this technique, the video data are divided into several segments, and a more preceding segment is broadcast more frequently. The chance for the clients to receive the preceding segments increases, and the waiting time is thus reduced compared with the case in which the distribution server broadcasts the video data without dividing them.
Most researchers have investigated the waiting time using simulation programs, because it is difficult to develop NVoD systems that work in actual situations and no openly available NVoD systems exist. However, the simulated waiting time can differ from the actual waiting time because simulators cannot completely reflect the actual situation. A simulation model that omits some seemingly trivial processing or communication flows can give evaluation results that differ from those of actual systems. Developing an actual NVoD system has therefore long been required in the NVoD research field. In this study, we design and develop an NVoD system to investigate performance metrics such as the waiting time and the interruption time in actual situations. Our designed NVoD system consists of two software components, one for the distribution server and the other for the clients. The functions required for the distribution server software are broadcasting the segmented video data according to a predetermined broadcast schedule and controlling the data broadcast rate in consideration of the network capacity of the system. The functions required for the client software are constructing the segments from the received broadcast data and playing the received segments in the order of their sequence. We develop our designed NVoD system and measure its performance. The remainder of the paper is organized as follows. We introduce related work in Sect. 2. We explain NVoD systems and schemes in Sect. 3. We explain our design and implementation of the NVoD system in Sect. 4. We show some evaluation results in Sect. 5. Finally, we conclude the paper in Sect. 6.
2 Related Work
To distribute the loads on the video distribution server, a method for duplicating and allocating video files and methods using peer-to-peer technology have been proposed [2–4]. In these methods, the clients receive the video data from other clients. Each client requests playback of the video data from the distribution server. The distribution server determines an appropriate reception schedule for the segmented video data and sends it to the client. The interruption time can be reduced by receiving the segments according to this schedule. In [7–9], the authors proposed methods for reducing the distribution server's loads using multicast for one-to-many communication. These methods assume an environment in which the minimum bandwidth and the data reachability of the clients are guaranteed to enable multicasting. However, in actual systems, the multicast bandwidth may not be guaranteed due to the rules of the communication services, or multicast forwarding may be blocked by routers. Several systems that combine communication and broadcasting were proposed in [10–12]. These systems use broadcast waves for broadcasting data: video data are distributed via broadcasting, and related information is distributed via communication channels such as the Internet. However, on-demand video distribution is not possible with these systems. In [13], the authors proposed a video data distribution method that allows flexible bandwidth allocation with the aim of reducing the waiting time, in consideration of the capacity of client devices to receive data from both broadcast and communication channels. The simulation results demonstrate that, in comparison with conventional methods, the proposed method can achieve the shortest waiting time by suitably allocating bandwidths.
Fig. 1. Our assumed NVoD systems
3 NVoD Systems
In this section, we first explain NVoD systems and then introduce three methods for broadcasting video data in such systems.
3.1 Assumed Systems
Our assumed NVoD system is shown in Fig. 1. The distribution server holds the video contents and is equipped with a broadcasting system such as WiFi, Bluetooth, or a terrestrial broadcasting system. The video data can be divided into several segments. The clients are in the broadcasting range of the distribution server and can receive the broadcast data from the broadcasting system. They can play each segment once its reception finishes. The broadcast data consist of a header and content data. The header includes information about the content data, such as the video number, the segment number, and the data size of the content data. The clients cannot receive the content data unless they receive the broadcast data from the beginning of the header, because the information in the header is required for reception. An example of an NVoD system is broadcasting HTTP Live Streaming (HLS) data to the clients via WiFi. The distribution server can distribute the data over WiFi, and the clients can play the segmented video data using HLS technology. With HLS, the video file is divided into several video data files called Transport Stream (TS) files, and the client can play each TS file independently.
3.2 Conventional NVoD Methods
This subsection briefly describes several conventional methods for reducing the waiting time (time to the start of playback), which is the period from when the client begins
receiving the broadcast data to the time that video playback begins in an NVoD system.
3.2.1 Simple Method
In the simple broadcasting method, the video distribution server repeatedly broadcasts the video data without segmentation. When the broadcast bandwidth is B and the data size of the video data is D, it takes D/B to broadcast the data. Therefore, the average waiting time is (3D)/(2B): the client waits D/(2B) on average for the start of the next broadcast cycle and then needs D/B to receive the data. The broadcast bandwidth should be larger than the bit rate of the video data R to play the data without interruptions. The simple method is easy to implement because the broadcast schedule is very simple, i.e., a video data item is broadcast repeatedly. However, the waiting time is longer than under the other methods.
3.2.2 Binary Broadcasting Method
In the binary broadcasting method, the video file is divided into two segments, i.e., the first half and the second half. By broadcasting the first-half segment more frequently, the client has more opportunities to start playing the video, allowing a reduction in the average waiting time. By choosing the number of times that the first-half segment is broadcast so that the client can finish receiving the second-half segment while the first-half segment is being played, the client can play the second-half segment without interruption after finishing the first half. The analysis in [5] revealed that the average waiting time under the binary broadcasting method is shorter than under the simple method when the broadcast bandwidth is more than twice the bit rate of the video data, i.e., B > 2R. The binary broadcasting method is slightly more complex than the simple method because the video file is divided into two segments. However, merely by dividing the data into two segments and broadcasting the first half several times, the average waiting time can be reduced drastically compared with the simple method.
3.2.3 Parallel Broadcasting Method
In the parallel broadcasting method, the distribution server broadcasts the segmented video data using several broadcast channels. The method assumes that the broadcast bandwidth of every channel is the same because the distribution server cyclically broadcasts each segment using one physical broadcast channel. The data size of each segment is calculated so that the time required for receiving the segment is equal to the time needed to finish playing the video data from the beginning up to the previous segment. Therefore, the clients can play the video without interruptions by starting playback immediately after receiving the first segment. In the parallel broadcasting method, the waiting time decreases in proportion to the number of segments. However, the broadcast schedule is more complex than that of the binary broadcasting method because the video file is divided into more than two segments.
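To make the segment sizing of the parallel broadcasting method concrete, the following Python sketch computes the average waiting time of the simple method and the segment sizes of the parallel method under the assumptions stated above: every channel has the same bandwidth b, the video bit rate is R, and segment i + 1 is sized so that receiving it takes exactly as long as playing segments 1 through i. The function names and the closed-form size of the first segment are our own derivation, not taken from the paper.

def simple_avg_wait(D, B):
    # Average waiting time of the simple method: D/(2B) on average until
    # the next broadcast cycle starts, plus D/B to receive the whole video.
    return 3 * D / (2 * B)

def parallel_segment_sizes(D, n, b, R):
    # Sizing rule from Sect. 3.2.3: receiving segment i+1 at bandwidth b
    # takes as long as playing segments 1..i at bit rate R, i.e.
    # s[i+1] / b = (s[1] + ... + s[i]) / R. Requiring the sizes to sum to
    # D fixes the first segment at D / (1 + b/R)**(n - 1).
    c = b / R
    sizes = [D / (1 + c) ** (n - 1)]  # first (smallest) segment
    for _ in range(n - 1):
        sizes.append(c * sum(sizes))  # each later segment grows by 1 + b/R
    return sizes

# Example with generic units (sizes in Mbit, bandwidths in Mbps):
print(simple_avg_wait(D=40.0, B=2.0))                    # 30.0 s
print(parallel_segment_sizes(D=40.0, n=5, b=2.0, R=1.0))

The average waiting time of the parallel method is then roughly half the broadcast cycle of the first segment, sizes[0] / b, which shrinks as the number of segments grows.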
4 Our NVoD Systems
In this section, we first explain the design of our NVoD system and then explain its implementation.
4.1 Our Design
In this subsection, we explain our design for the NVoD system. Our designed NVoD system consists of two software components, one for the distribution server and the other for the clients. Here, we explain the requirements for each component.
4.1.1 Requirements for the Distribution Server Software
The software for the distribution server runs on the distribution server and broadcasts the segmented video file to the clients. The first required function is broadcasting the segmented video data according to a predetermined broadcast schedule that follows one of the waiting time reduction methods introduced in Subsect. 3.2. For this, the software needs to accept a broadcast schedule as input and broadcast the segments in the order given by that schedule. For the binary broadcasting method, the number of times the first half of the video file is broadcast is a parameter. The number of segments that reduces the waiting time effectively depends on the broadcasting situation, such as the processing load or the communication load. Therefore, the software needs to allow these values to be selected. The other requirement is controlling the data broadcast rate in consideration of the network capacity of the system. The distribution server can control the data broadcast rate by waiting before broadcasting the next segment. Also, the segments are generally divided into chunks, i.e., parts of a segment that are each sent in one communication packet. Therefore, by inserting a time gap between the chunks, the distribution server can also control the data broadcast rate, as sketched in the code below.
4.1.2 Requirements for the Client Software
The software for the clients runs on each client and receives and plays the segmented video file. The first required function is constructing the segments from the received broadcast chunks. This can be achieved by combining the received chunks in sequential order. To prepare the memory buffers for reconstructing each segment, it is better to inform the clients of the number of segments; otherwise, the clients need to prepare a sufficiently large number of buffers. The other requirement is playing the received segments in the order of their sequence. This can be achieved by storing the constructed segments in each client and playing them in sequence. The sequence number can be written in the header of each segment. Some video players need to buffer a certain period of data before playing each segment. Therefore, the clients need to use video players that do not require such buffering.
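The following Python sketch illustrates the inter-chunk pacing idea described in Subsect. 4.1.1. It is our own minimal example, not the paper's implementation: the UDP broadcast socket, the fixed chunk size, and the way the gap is derived from a target bit rate are all assumptions.

import socket
import time

def broadcast_segments(schedule, addr, chunk_size=1400, target_bps=3_800_000):
    # Broadcast the segments in `schedule` (a list of byte strings, in
    # broadcast order) over UDP, pacing the chunks so that the average
    # send rate stays near target_bps.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    gap = chunk_size * 8 / target_bps      # seconds per chunk at target rate
    for segment in schedule:
        for off in range(0, len(segment), chunk_size):
            sock.sendto(segment[off:off + chunk_size], addr)
            time.sleep(gap)                # inter-chunk gap controls the rate

A larger gap lowers the effective broadcast bandwidth; applying the wait per segment instead of per chunk corresponds to the alternative of waiting before broadcasting the next segment.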
4.2 Our Implementation
In this subsection, we explain our implementation. Figure 2 shows a screenshot of our implemented software.
4.2.1 Data Structure
Figure 3 shows the data structure. The first 12 bytes are a header in which a fixed pattern appears. They are followed by the chunk header, which includes several pieces of information about the chunk data that follow. We implemented the system partly to evaluate it; therefore, the chunk header also includes data types, flags, and parameters used for the evaluation.
4.2.2 Distribution Server Software
To satisfy the first requirement, we provide selectors for the broadcasting method, the number of segments, and, for the binary broadcasting method, the number of times the first half of the video file is broadcast. By selecting the preferred method and numbers, the software can broadcast the segments according to the predetermined broadcast schedule. To satisfy the other requirement, we provide gap times between the chunks and between the segments.
Fig. 2. Our implemented software (upper: for the clients, lower: for the distribution server)
By changing these times, the distribution server can control the broadcast bandwidth used to broadcast the segments.
[Fig. 3 depicts the broadcast data format: a 12-byte fixed header pattern (FF,FF,FF,00,00,00, FF,FF,FF,00,00,00), followed by the chunk header (send time, data type, segment ID, flags, parameters, data size) and the chunk data. Data types: 1 Request, 2 Reply, 3 Multicast, 4 Waiting Time, 5 Statics Data, 6 Request Time, 7 Reply Time, 8 Experiment Control. Flag bits: 1 Last Segment, 2 Last Chunk, 3 Experiment Initialize, 4 Next Parameter, 5 Next Experiment, 6 Experiment Finish.]
Fig. 3. Data structure
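As an illustration of how such a frame can be packed and parsed, the following Python sketch uses the field order from Fig. 3. The exact byte widths of the chunk header fields are not given in the paper, so the struct layout below is a hypothetical choice of ours.

import struct

SYNC = bytes.fromhex("ffffff000000") * 2     # the 12-byte fixed pattern
# Hypothetical widths: send time (double), data type (uint8),
# segment ID (uint16), flags (uint8), parameters (uint32), data size (uint32)
CHUNK_HEADER = struct.Struct("<dBHBII")

def pack_chunk(send_time, data_type, segment_id, flags, params, payload):
    header = CHUNK_HEADER.pack(send_time, data_type, segment_id,
                               flags, params, len(payload))
    return SYNC + header + payload

def unpack_chunk(frame):
    assert frame[:12] == SYNC, "resynchronize on the fixed header pattern"
    fields = CHUNK_HEADER.unpack_from(frame, 12)
    send_time, data_type, segment_id, flags, params, size = fields
    payload = frame[12 + CHUNK_HEADER.size:12 + CHUNK_HEADER.size + size]
    return send_time, data_type, segment_id, flags, params, payload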
As shown in the lower side of Fig. 2, we implemented the system so that parameters for the evaluation, such as the range over which the number of segments is changed automatically, can be input.
4.2.3 Client Software
To satisfy the first requirement, our implemented system first sends the number of segments to the clients. The clients prepare as many buffers for combining the chunk data as there are segments. By combining the received chunk data, each client constructs the segments; a sketch of this reassembly is given below. To satisfy the other requirement, the clients store each segment in their storage and play it when its playback time comes. For this, our implemented system adopts HLS, and the segments are stored as TS files. As shown in the upper side of Fig. 2, we implemented the system so that evaluation indexes such as the waiting time and the interruption time can be checked.
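The following Python sketch shows one way the client-side reassembly described above can work. The class name, the on-disk layout, and the use of the Last Chunk flag bit from Fig. 3 to detect segment completion are our assumptions; the paper states only that chunks are combined in sequential order and stored as TS files.

from collections import defaultdict

LAST_CHUNK = 0x02   # flag bit 2 ("Last Chunk") from Fig. 3; bit value assumed

class SegmentAssembler:
    # Reassemble broadcast chunks into playable TS segments (sketch).

    def __init__(self, num_segments, out_dir="segments"):
        self.num_segments = num_segments       # sent by the server first
        self.buffers = defaultdict(bytearray)  # segment ID -> received bytes
        self.out_dir = out_dir

    def on_chunk(self, segment_id, flags, payload):
        # Append a chunk; return the TS file path once a segment completes.
        self.buffers[segment_id] += payload
        if flags & LAST_CHUNK:                 # segment is now complete
            path = f"{self.out_dir}/{segment_id:04d}.ts"
            with open(path, "wb") as f:
                f.write(bytes(self.buffers.pop(segment_id)))
            return path                        # hand this file to the HLS player
        return None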
5 Evaluation
To evaluate our implemented NVoD system, we compare the waiting time obtained with our system to that obtained with our developed simulator. The waiting time is measured from the time a client starts receiving the broadcast data to the time the client starts playing the data. In this section, we explain the results.
5.1 Setup
We use video data with a duration of 1 min, obtained by cutting the first minute of the open video "Big Buck Bunny". The data size is 5 Mbytes. To divide the data into several segments, we use "ffmpeg". To obtain the average waiting time, we have the client start receiving the broadcast data ten times and calculate the average value. The buffering time required by the video player is 1 s.
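For reference, the segmentation step can be done with ffmpeg's HLS muxer; the following command is our own example of cutting the first minute and splitting it into TS files, since the paper does not give the exact options used.

import subprocess

subprocess.run([
    "ffmpeg", "-i", "big_buck_bunny.mp4",
    "-t", "60",             # keep only the first minute
    "-c", "copy",           # no re-encoding
    "-f", "hls",
    "-hls_time", "6",       # target duration of each TS segment in seconds
    "-hls_list_size", "0",  # keep every segment in the playlist
    "playlist.m3u8",
], check=True)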
The distribution server is a laptop computer (Windows 10 Home, Intel Core i7, 16 GB memory) and the client is a laptop computer with the same specifications. We use WiFi for broadcasting data and the average bandwidth is 3.8 Mbps.
[Fig. 4 plots the average waiting time (s) against the number of segments (1–11) for four series: Actual (Simple), Actual (Parallel), Simulation (Simple), and Simulation (Parallel).]
Fig. 4. Average waiting time and the number of segments
5.2 Result
Figure 4 shows the results. The horizontal axis is the number of segments, and the vertical axis is the average waiting time. We show the results for the simple method and the parallel method, because the binary broadcasting method does not depend on the number of segments. We can see that the average waiting time tends to decrease as the number of segments increases. This is because the clients can receive the first segment in a shorter time as the video file is divided into more segments. However, in the parallel method under the actual situation, the average waiting time increases slightly when the number of segments is larger than 7. This is because the video player requires a buffering time: since a larger number of segments makes the data amount of the first segment smaller, the buffering time increases. We can also see that the tendency of the change in the average waiting time is similar between the actual situation and the simulation.
6 Conclusion
The simulated waiting times in previous studies can differ from actual waiting times because simulators cannot completely reflect the actual situation. In this paper, we described our designed and developed NVoD system. Our designed NVoD system
consists of two software components, one for the distribution server and the other for the clients. We implemented two functions in the distribution server software: broadcasting the segmented video data according to a predetermined broadcast schedule, and controlling the data broadcast rate in consideration of the network capacity of the system. Moreover, we implemented two functions in the client software: constructing the segments from the received broadcast data, and playing the received segments in the order of their sequence. Our evaluation results show that the measured performance of the NVoD system differed from the simulated results, but the tendencies were similar. In the future, we will implement other NVoD methods and will extend the software so that clients can communicate with each other via a local area network to further reduce the waiting time.
Acknowledgments. This work was partially supported by JSPS KAKENHI Grant Numbers JP21H03429, JP18K11316, and by G-7 Scholarship Foundation.
References
1. Fratini, R., Savi, M., Verticale, G., Tornatore, M.: Using replicated video servers for VoD traffic offloading in integrated metro/access networks. In: Proceedings of IEEE International Conference on Communications, pp. 3438–3443 (2014)
2. Zhang, G., Liu, W., Hei, X., Cheng, W.: Unreeling Xunlei Kankan: understanding hybrid CDN-P2P video-on-demand streaming. IEEE Trans. Multimedia 17(2), 229–242 (2015)
3. Sheshjavani, A.G., Akbari, B., Ghaeini, H.R.: An adaptive buffer-map exchange mechanism for pull-based peer-to-peer video-on-demand streaming systems. Springer Int. J. Multimedia Appl. 76(5), 7535–7561 (2016). https://doi.org/10.1007/s11042-016-3425-z
4. Araniti, G., Scopelliti, P., Muntean, G.-M., Lera, A.: A hybrid unicast-multicast network selection for video deliveries in dense heterogeneous network environments. IEEE Trans. Broadcast. 65, 83–93 (2018)
5. Yoshihisa, T., Tsukamoto, M., Nishio, S.: A scheduling scheme for continuous media data broadcasting with a single channel. IEEE Trans. Broadcast. 52(1), 1–10 (2006)
6. Yoshihisa, T.: Data piece elimination technique for interruption time reduction on hybrid broadcasting environments. In: Proceedings of IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 6 p. (2017)
7. Guo, J., Gong, X., Liang, J., Wang, W., Que, X.: An optimized hybrid unicast/multicast adaptive video streaming scheme over MBMS-enabled wireless networks. IEEE Trans. Broadcast. 64, 791–802 (2018)
8. Tian, C., Sun, J., Wu, W., Luo, Y.: Optimal bandwidth allocation for hybrid video-on-demand streaming with a distributed max flow algorithm. ACM J. Comput. Netw. 91(C), 483–494 (2015)
9. Boronat, F., Montagud, M., Marfil, D., Luzon, C.: Hybrid broadcast/broadband TV services and media synchronization: demands, preferences and expectations of Spanish consumers. IEEE Trans. Broadcast. 64(1), 52–69 (2018)
10. Boronat, F., Marfil, D., Montagud, M., Pastor, J.: HbbTV-compliant platform for hybrid media delivery and synchronization on single- and multi-device scenarios. IEEE Trans. Broadcast. 64(3), 6 (2017)
11. Christodoulou, L., Abdul-Hameed, O., Kondoz, A.-M.: Toward an LTE hybrid unicast broadcast content delivery framework. IEEE Trans. Broadcast. 63(4), 656–672 (2017)
12. Hang, Y., et al.: Proactive video push for optimizing bandwidth consumption in hybrid CDN-P2P VoD systems. In: Proceedings of IEEE INFOCOM, 9 p. (2018)
13. Matsumoto, S., Ohira, K., Yoshihisa, T.: A mathematical analysis of 2-tiered hybrid broadcasting environments. In: Proceedings of International Workshop on Streaming Media Delivery and Management Systems, pp. 454–460 (2019)
A Consideration of Delivering Method for Super-Resolution Video Yusuke Gotoh(B) and Takayuki Oishi Graduate School of Natural Science and Technology, Okayama University, Okayama, Japan [email protected]
Abstract. The satisfaction of users who watch content from video delivery services is highly dependent on the communication environment between the server and the client. In order to reduce the interruption time while playing video, several methods have been proposed that change the quality of the video according to the connection. In this context, many researchers have studied super-resolution processing techniques that convert low-quality video into high-quality video by increasing the resolution of each frame. However, if the client does not have sufficient computing resources, such as CPU and memory, it is difficult to perform super-resolution processing in real time for all the frames that constitute the received video. In this paper, we consider a delivery method for playing super-resolution video in real time.
1 Introduction
Due to the spread of video delivery services, video traffic is rapidly increasing all over the world [1]. Video delivery systems therefore need to adapt to changes in the communication environment. If the connection with the server is poor, the client may experience interruptions while playing the video. Many researchers have studied super-resolution processing techniques that convert low-quality video into high-quality video by increasing the resolution of each frame. In real-time super-resolution processing, the client must convert low-resolution frames to high-resolution frames, for all frames that constitute the video data, within the buffering time between the moment the data are received and the moment they are played. In this case, super-resolution processing of frames with few features cannot significantly improve the visual quality of the video. In this paper, we consider a delivery method for playing super-resolution video in real time. The remainder of the paper is organized as follows. In Sect. 2, we explain techniques for enlarging images. In Sect. 3, we clarify the relationship between feature values and super-resolution accuracy. Our proposed method is described in Sect. 4. Finally, we conclude the paper in Sect. 5.
Fig. 1. Original image of bird
2 Techniques for Enlarging Images
2.1 Pixel Interpolation
In order to enlarge a video, it is necessary to enlarge the sequence of images that constitute the video data. Since the enlarged image has many more pixels than the original image, the client needs to interpolate pixels that are not present in the original image. Figure 2 shows the rectangular focal area of the original bird image in Fig. 1, enlarged by a factor of 4 using the nearest neighbor, bilinear, and bicubic methods [2]. The nearest neighbor method sets the value of each interpolated pixel to that of the pixel closest to it. Because it simply copies surrounding pixel values, jaggies are generated at the edges of the image. The bilinear method calculates the pixel value as a weighted average of the four pixels around the interpolated pixel. Since the bilinear method cannot generate high-frequency components, the resolution of the image is reduced. The bicubic method calculates the pixel value as a weighted average of the sixteen pixels around the interpolated pixel. Since the bicubic method, like the bilinear method, uses an average of the surrounding pixels, it also cannot generate high-frequency components and cannot emphasize edges.
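These three interpolation methods are available directly in OpenCV; the following Python sketch (our own example, with an assumed input file name) reproduces the 4x enlargement compared in Fig. 2.

import cv2

img = cv2.imread("bird.png")          # assumed file name for Fig. 1's image
h, w = img.shape[:2]
size = (4 * w, 4 * h)                 # enlarge by a factor of 4

nearest  = cv2.resize(img, size, interpolation=cv2.INTER_NEAREST)
bilinear = cv2.resize(img, size, interpolation=cv2.INTER_LINEAR)
bicubic  = cv2.resize(img, size, interpolation=cv2.INTER_CUBIC)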
2.2 Super-Resolution
In super-resolution, unlike the conventional pixel interpolation described in Subsect. 2.1, the client uses the characteristics of the image to increase its resolution. Super-resolution methods can be classified into two types: reconstruction-type super-resolution, which generates a single high-resolution image from multiple similar images, and learning-type super-resolution, which learns the correspondence pattern between high-quality and low-quality images from training images.
Fig. 2. Focal image of bird enlarged four times using three methods
Fig. 3. Enlarged image of bird using SRCNN
Figure 3 shows the rectangular region of the image shown in Fig. 1 enlarged by a factor of 4 using Super-Resolution CNN (SRCNN) [3]. In Fig. 3, SRCNN emphasizes the edges in the image enlargement process compared to the other three types of methods. The client can perform video super-resolution by applying the single-image super-resolution method to each frame. The quality of video super-resolution depends not only on the accuracy of the super-resolution for each frame but
also on the maintenance of continuous playback between frames. Therefore, video super-resolution methods [4, 5] that maintain this continuity have been proposed.
2.3 Super-Resolution Method for Video Delivery
In video delivery services, clients can convert low-resolution video received from a server into high-resolution video by applying super-resolution technology. To apply super-resolution while playing the video, the client needs to process each frame of the received video data in real time. Therefore, real-time video super-resolution methods that exploit the characteristics of video have been proposed.
Fig. 4. Original image with city and image with corners
Zhang et al. proposed a super-resolution method [6] focusing on video compression. In this method, super-resolution is applied only to the key frames of each Group of Pictures (GOP) among all the frames constituting the video data. By propagating the effect of super-resolution to the other frames, this method can ultimately give the effect of super-resolution to all frames.
3 Relationship Between Feature Value and Super-Resolution Accuracy
In research on feature detection, which quantifies image features, fast methods for detecting corner features, such as Features from Accelerated Segment Test (FAST) [7], have been proposed. Clients use these methods in processes such as face recognition and Simultaneous Localization and Mapping (SLAM) to extract the features of objects in real time [8].
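FAST is available in OpenCV, so the corner count used as a feature value in this paper can be computed as in the following sketch (the file name and the detector threshold are our assumptions).

import cv2

img = cv2.imread("city.png", cv2.IMREAD_GRAYSCALE)
fast = cv2.FastFeatureDetector_create(threshold=20)  # threshold is our choice
keypoints = fast.detect(img, None)
print(len(keypoints))   # the corner count used as the image's feature value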
Figure 4 shows the original image of the city and the corners detected by applying FAST to it. The city image is complex because it consists of many objects, such as buildings and cars, and the number of detected corners is 5,407. Figure 5 shows restored versions of the original image in Fig. 4: the image was scaled down by a factor of 0.25 and then enlarged 4 times using the bicubic method and SRCNN, respectively. The similarity between the two types of restored images can be compared in Fig. 5.
Fig. 5. Enlarged image of each method on the reduced original image of city
4 Proposed Method
4.1 Outline
In this paper, we consider a method for playing video while performing super-resolution processing. In this section, we describe the order of super-resolution processing for the buffered video and for all frames in the proposed method.
4.2 Super-Resolution Processing of Buffered Video
In many video delivery systems, the client reduces the number of playback interruptions by playing the video while buffering a certain amount of data. Because super-resolution of every frame may not finish within this buffering time, in real-time video delivery it is difficult for the client to play the video while applying super-resolution to all frames.
4.3 Processing Steps
The processing steps of the proposed method are as follows; a sketch of these steps is given after this list.
1. Split the buffered frames into batches
2. Calculate the total number of corners over all frames in each batch
3. Sort the batches in descending order of their number of corners
Initially, the client splits the buffered frames into batches of N frames each. Next, it calculates the sum of the corners in all the frames that make up each batch. As described in Sect. 3, frames with a large number of corner features improve the estimation accuracy when the image is enlarged by super-resolution. Therefore, we use the number of corner features as an index to select frames for super-resolution. Finally, all batches are sorted according to their number of corners.
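The following Python sketch implements the three steps above. The batch size, the helper names, and the use of OpenCV's FAST detector as the corner counter are our choices; the descending sort follows the paper's statement that feature-rich frames are prioritized.

import cv2

def prioritize_batches(frames, N=30):
    # Steps 1-3 of Sect. 4.3: batch the buffered frames, score each batch
    # by its total corner count, and super-resolve feature-rich batches first.
    fast = cv2.FastFeatureDetector_create()
    batches = [frames[i:i + N] for i in range(0, len(frames), N)]

    def total_corners(batch):
        # Frames are assumed to be BGR images as loaded by OpenCV.
        grays = (cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in batch)
        return sum(len(fast.detect(g, None)) for g in grays)

    return sorted(batches, key=total_corners, reverse=True)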
5 Conclusion
In this paper, we proposed a method for playing video while performing super-resolution processing that prioritizes frames with many features when low-quality video is received. In the proposed method, the client can improve the visual quality of the buffered video by prioritizing the frames that have more features and that are therefore predicted to benefit more from super-resolution. The method performs super-resolution processing on these frames until the video starts to play. In the future, we will implement and evaluate a delivery method for super-resolution video based on frame-by-frame features.
Acknowledgement. This work was supported by JSPS KAKENHI Grant Number 18K11265.
References
1. Cisco Annual Internet Report (2018–2023) White Paper - Cisco (online). https://www.cisco.com/c/en/us/solutions/collateral/executive-perspectives/annual-internet-report/white-paper-c11-741490.html. Accessed 01 June 2021
2. Keys, R.: Cubic convolution interpolation for digital image processing. IEEE Trans. Acoust. Speech Signal Process. 29, 1153–1160 (1981)
3. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8692, pp. 184–199 (2014). https://doi.org/10.1007/978-3-319-10593-2_13
4. Sajjadi, M.S., Vemulapalli, R., Brown, M.: Frame-recurrent video super-resolution. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 6626–6634 (2018)
5. Chu, M., Xie, Y., Mayer, J., Leal-Taixé, L., Thuerey, N.: Learning temporal coherence via self-supervision for GAN-based video generation. ACM Trans. Graphics 39(4) (2020). https://doi.org/10.1145/3386569.3392457
6. Zhang, Z., Sze, V.: FAST: a framework to accelerate super-resolution processing on compressed videos. In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1015–1024 (2017)
7. Rosten, E., Drummond, T.: Machine learning for high-speed corner detection. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 430–443. Springer, Heidelberg (2006). https://doi.org/10.1007/11744023_34
8. Li, Y., Brasch, N., Wang, Y., Navab, N., Tombari, F.: Structure-SLAM: low-drift monocular SLAM in indoor environments. IEEE Robot. Autom. Lett. 5(4), 6583–6590 (2020)
Proposal of a Tele-Immersion Visiting System Reiya Yahada(B) and Tomoyuki Ishida Fukuoka Institute of Technology, Fukuoka, Fukuoka 811-0295, Japan [email protected], [email protected]
Abstract. The COVID-19 pandemic limits visits between hospitalized patients and their families. Therefore, we implemented a tele-immersion visiting system for inpatients and their families. Inpatients can move freely around their homes by using a game controller to operate a remote operation robot equipped with a 360-degree camera placed at home. In addition, inpatients view the 360-degree image via a head-mounted display, so they can experience a highly realistic home space. To verify the effectiveness of this system, we conducted a questionnaire survey of six subjects. As a result, the effectiveness for inpatient families remains an issue.
1 Introduction
With the recent development of wideband networks, tele-immersion technology has attracted much attention. Currently, R&D using large-scale 3D data and video images is being conducted globally in various fields, such as medicine, art, education, and science and technology. R&D using tele-immersion technology includes a tele-immersion conference system using an immersive display device developed by Miyachi et al. [1] and a tele-immersion communication system using a tiled display wall developed by Ebara et al. [2]. These systems use an immersive virtual reality (VR) device called the Cave automatic virtual environment and a large-scale high-resolution display environment called a tiled display wall. Furthermore, 2016 has been called the "first year of VR": with the release of head-mounted displays (HMDs) such as HTC VIVE [3], Oculus Rift [4], and PlayStation VR [5] for general users in 2016, tele-immersion technology became familiar.
2 Research Objective
In this study, we develop a tele-immersion visiting system to realize real-time remote communication between inpatients and their families by combining an HMD, a 360-degree camera, and a remote operation robot. Inpatients wearing the HMD can experience a realistic home environment by using a game controller to operate the remote operation robot equipped with the 360-degree camera at home. This system provides an immersive communication environment between inpatients and their families.
3 System Configuration Figure 1 shows the tele-immersion visiting system’s configuration. This system consists of a remote operating system that remotely operates the robot, a video presentation system that presents video images of remote locations to the HMD, and a voice call system for communicating with family members in remote locations. Inpatient users can freely move around their home by remotely operating the robot equipped with a 360-degree camera in the remote locations and see the state of their homes via the HMD. In addition, they can communicate with their family members in the remote locations using the voice call system.
Fig. 1. System configuration of tele-immersion visiting system.
4 System Architecture Figure 2 shows the system architecture of the tele-immersion visiting system. The remote operating system consists of an HMD function, a virtual space control function, a robot control function, and a voice chat function. In addition, the 360-degree video presentation system consists of a 360-degree camera control function, remote robot function, and voice chat function.
Fig. 2. System architecture of tele-immersion visiting system.
4.1 Remote Operating System
Each function of the remote operating system is described below.
4.1.1 HMD Function
• Display: The display provides the user with a 360-degree video image of the remote location pasted into the virtual space.
4.1.2 Virtual Space Control Function
• Virtual Space Control Manager: The virtual space control manager provides the user with the virtual space, acquires the 360-degree video image from the 360-degree camera control function via the network, and pastes it into the virtual space.
• Virtual Space: The virtual space is the space onto which 360-degree video images are pasted.
4.1.3 Robot Control Function
• User Interface: The user interface provides operation inputs for the remote operation of the robot.
• Robot Controller Manager: The robot controller manager sends the operation information input from the user interface to the remote robot function of the 360-degree video image presentation system via the network interface.
• Network Interface: The network interface sends the operation information to the remote robot function of the 360-degree video image presentation system via the network.
4.1.4 Voice Chat Function
• User Interface: The user interface captures the voice of the user of the remote operating system and sends the voice to family members in the remote location.
• SkyWay SDK: The SkyWay SDK sends the voice of the user of the remote operating system and receives the voices of the family members.
4.2 360-Degree Video Image Presentation System
Each function of the 360-degree video image presentation system is described below.
4.2.1 360-Degree Camera Control Function
• 360-Degree Camera: The 360-degree camera consists of two fisheye lenses and captures 360-degree video images.
• Camera API: The camera API operates the 360-degree camera in response to requests received from the remote operating system via the HTTP server.
• HTTP Server: When the HTTP server receives a 360-degree camera operation request from the virtual space control function, it sends the request to the camera API and sends the 360-degree video image received from the camera API to the remote operating system.
4.2.2 Remote Robot Function
• Robot Interface: The robot interface reflects the operation information received from the robot control manager on the robot.
• Robot Control Manager: The robot control manager sends the robot operation information received from the remote operating system via the HTTP server to the robot interface.
• HTTP Server: When the HTTP server receives a robot operation request from the robot control function, it sends the request to the robot control manager.
5 Prototype System
Inpatients send operation information using the Gamepad API [6] and WebSocket by operating a game controller connected to the PC. When the iRobot Create 2 equipped with a Raspberry Pi [7] at the remote location receives the operation information, it moves in real time accordingly. Figure 3 shows the iRobot Create 2 equipped with the Raspberry Pi for the server and robot operation.
Fig. 3. iRobot Create 2 equipped with the Raspberry Pi for server and robot operation.
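The robot side of this exchange can be sketched in Python as follows. This is our own illustration, not the paper's code: the message format and the drive() helper that would issue serial commands to the Create 2 are assumptions; the paper states only that Gamepad API input is forwarded over WebSocket.

import asyncio
import json
import websockets  # third-party package: pip install websockets

def drive(velocity, radius):
    # Hypothetical helper that would translate the received values into
    # serial commands for the iRobot Create 2; omitted here.
    print(f"drive(velocity={velocity}, radius={radius})")

async def handler(ws, path=None):
    async for message in ws:
        op = json.loads(message)           # e.g. {"velocity": 100, "radius": 0}
        drive(op["velocity"], op["radius"])

async def main():
    async with websockets.serve(handler, "0.0.0.0", 8765):
        await asyncio.Future()             # run until cancelled

asyncio.run(main())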
Inpatients receive 360-degree video images from a THETA V [8] mounted on the iRobot Create 2. By connecting the THETA V to the network, the PC can access the THETA V's Camera API. Inpatients experience a VR environment in which the 360-degree video images received from the THETA V are pasted onto a spherical object in Unity [9]. Figure 4 shows a user experiencing the VR environment with a 360-degree video image received from the THETA V.
6 System Evaluation
We conducted a questionnaire survey of six subjects to evaluate the effectiveness of the tele-immersion visiting system. The subjects answered the questionnaire after using this system.
Fig. 4. User experiencing the VR environment with 360-degree video image received from the THETA V.
Fig. 5. Effectiveness of the tele-immersion visiting system for inpatients (n = 6).
Figure 5 shows the evaluation result regarding the effectiveness for inpatients: about 70% of the subjects answered "effective" or "somewhat effective." Figure 6 shows the evaluation result regarding the effectiveness for inpatient families: 50% of the subjects answered "effective" or "somewhat effective." The evaluation was low because the families cannot see the face of the inpatient who is operating the robot from the hospital.
Fig. 6. Effectiveness of the tele-immersion visiting system for inpatient families (n = 6).
7 Conclusion
In this study, we implemented a tele-immersion visiting system for inpatients and their families. Inpatients wearing HMDs operate, with a game controller, a remote operation robot with a 360-degree camera installed at home. Using this system, inpatients can freely move around their home spaces with a high sense of reality and can talk with their families by voice call. The evaluation of this system revealed that its effectiveness for inpatient families remains an issue. Therefore, we should consider implementing a function that enables communication while both the inpatients and their families can see each other's faces.
References 1. Miyachi, H., Ogi, T., Koyamada, K., Ebara, Y., Hirose, M.: The development of the fundamental software for tele-immersive conference. J. Vis. Soc. Jpn. 26(2), 67–68 (2006)
2. Ebara, Y., Shibata, Y.: Study on tele-immersive communication with multi-video streaming on tiled display wall. In: Proceedings of the 13th International Conference on Network-Based Information Systems, pp. 439–443 (2010)
3. HTC Corporation: VIVE. https://www.vive.com/jp/. Accessed April 2021
4. Facebook Technologies: Oculus Rift. https://www.oculus.com/rift/?locale=ja_JP. Accessed April 2021
5. Sony: PlayStation VR. https://www.playstation.com/ja-jp/explore/playstation-vr/. Accessed April 2021
6. World Wide Web Consortium: Gamepad. https://www.w3.org/TR/gamepad/. Accessed April 2021
7. iRobot Corporation: iRobot Create 2 Programmable Robot. https://www.irobot.com/about-irobot/stem/create-2.aspx. Accessed April 2021
8. Ricoh Company: RICOH THETA V. https://theta360.com/ja/about/theta/v.html. Accessed April 2021
9. Unity Technologies: Unity Real-Time Development Platform | 3D, 2D VR & AR Engine. https://unity.com/. Accessed April 2021
A Study on the Impact of High Refresh-Rate Displays on Scores of eSports Koshiro Murakami1(B) and Hideo Miyachi2 1 Graduate School of Environmental and Information Studies, Tokyo City University,
3-3-1 Ushikubo-nishi Tsuzuki-ku, Yokohama, Japan [email protected] 2 Department of Information Systems, Tokyo City University, 3-3-1 Ushikubo-nishi Tsuzuki-ku, Yokohama, Japan [email protected]
Abstract. In general, 60 Hz displays are often used in daily life. However, in the field of Electronic Sports (eSports), it is claimed that displays above 120 Hz have a strong effect on scores. In this study, the authors measured simple reaction time at six refresh rates, from 60 Hz up to 360 Hz, to clarify the impact of high refresh-rate displays on eSports scores. The results suggest that high refresh-rate displays are useful in the eSports field.
1 Introduction
In recent years, Electronic Sports (eSports) have become very popular around the world. For example, according to a survey of future dreams by insurance provider Sony Life, professional eSports player ranks second among Japanese junior high school boys [1]. eSports are defined as competitive games, such as shooting games, card games, and fighting games, and the positive and negative effects of eSports have often been discussed [2–4]. In Japan, according to a survey on games and children by ASMARQ, 50.7% of parents thought that playing games has a bad educational effect on children [5]. However, everyone can play eSports anytime and anywhere. For example, according to a survey by ESX360, 53% of mothers supported gaming during the COVID-19 pandemic because it occupies the children while they work, and 47% saw gaming as a perfect socially distanced activity [6]. In addition, some high schools and universities in the United States have adopted eSports into their educational programs [7–9]. Therefore, eSports might become national sports that contribute to education and health. The authors previously held an eSports event in which amateurs could not compete with professionals, so professionals must have skills that are developed through eSports. Therefore, the authors started this research to clarify the human skills that are improved by eSports. As a first step, to ensure the fairness of skill measurement, the influence of machine performance was measured. In a previous study [10], the authors clarified the relationship between the refresh rate of a 240 Hz display and reaction time using a simple game. However, 360 Hz displays were introduced to the eSports field last year. In this study, the authors measured the advantage of a 360 Hz display in the eSports field.
2 Related Works
Research on human reaction time began in the 1860s with experiments by Donders et al. [11]. Since then, various studies have been conducted on reaction time, including task differences (Simple Reaction Time (SRT), Choice Reaction Time (CRT), and Discriminative Reaction Time (Go/No-Go reaction time)) and the relationship between stimulus modality, stimulus intensity, and reaction time [12]. Even today, research on human reaction time continues, including studies of differences in reaction time due to the color of visual stimuli [13] and due to gender and age [14]. On the other hand, it has been suggested that sports training may improve a person's reaction time [15]; Mori et al. demonstrated that karate athletes could respond significantly faster and more accurately than novices in choice reaction time tasks [16]. Sports skills can be categorized into open skills, which are influenced by changing external factors, as in soccer and basketball, and closed skills, which are not influenced by external factors and can be performed at one's own pace, as in athletics and gymnastics. Skilled players in each category have been shown to have different and superior sensory-cognitive abilities [17]. However, the characteristics of reaction time related to eSports have not been studied. Therefore, we began to investigate the effect of eSports on the development of reaction time. The first issue to be addressed is the reliability of reaction time measurement with commercially available computer systems. Here, we measured simple reaction time using a gaming machine equipped with a 360 Hz monitor, which has been introduced to the market in recent years, and verified its validity.
3 Experiment
3.1 Experimental Method
In a previous study, the authors created a simple game for measuring simple reaction time using the Unity game engine [10]. Here, simple reaction time is defined as the time it takes for a subject to initiate a prearranged response to a defined stimulus. In this experiment, the screen color changes from white to blue at a random time after the tester starts the game, and the tester clicks a mouse button when the screen color has changed to blue. Eight testers played the game 50 times at each of six refresh rates: 60 Hz, 120 Hz, 144 Hz, 240 Hz, 300 Hz, and 360 Hz. To pick the valid data, the authors sorted each tester's data from fastest to slowest and used the middle 20 values as valid data, removing extreme values; the average of these 20 values was used as the personal value. Before starting the experiment, the authors recorded the display with a high-frame-rate camera that can record 1,000 frames per second in order to verify the display's operation. Recorded pictures of the 360 Hz display are shown in Fig. 1. A 360 Hz display changes the picture every 2.78 ms, and this display took about 3 ms to change the picture, as shown in Fig. 1. Therefore, this display can operate at 360 Hz in this game.
Fig. 1. Recorded pictures of the 360 Hz display taken with a high frame rate camera
3.2 Hardware
All testers performed the experiments using the same display and computer, whose specifications are shown in Table 1.
Table 1. Machine specifications.
4 Results
The average results of this experiment are depicted in Fig. 2. In this figure, the diamonds show the average of the slowest values at each refresh-rate, the circles show the average of the fastest values at each refresh-rate, and the dotted line shows the average of all personal values at each refresh-rate.
Fig. 2. Result of experiment
From Fig. 2, all values get faster as the display refresh-rate gets higher. In addition, the differences between the maximum and the average, and between the average and the minimum, at each refresh-rate are shown in Table 2. From this table, the difference between the average and the minimum is close to zero when the display refresh-rate is 60 Hz or 120 Hz. Conversely, above 240 Hz, these differences become large. According to these results, it is thought that most testers could not play at their best performance at 60 Hz and 120 Hz, because the average value and the fastest value are almost the same, so their skills were not fully exercised. However, the gap between the fastest value and the average value on displays above 240 Hz indicates that the average of human skills can improve when using a higher refresh-rate display, which can lead to higher performance when playing eSports. As a result, the scores when playing at 300 Hz and 360 Hz were higher than at 240 Hz, and further improvement can be expected. Therefore, it is suggested that using a higher refresh-rate display is highly advantageous in the eSports field.
Table 2. Difference values of the results at each refresh-rate.
5 Conclusion
In this study, the authors focused on the impact of high refresh-rates on eSports scores. As a result, the scores when playing at 300 Hz and 360 Hz were higher than at 240 Hz, and further improvement can be expected. Therefore, it is suggested that using a higher refresh-rate display is highly advantageous in the eSports field. As future work, the authors plan to measure the human skills that are improved through eSports. Acknowledgments. This research was supported by a Grant from The Telecommunication Advanced Foundation. In addition, we gratefully acknowledge the work of past and present members of our laboratory.
References
1. What do Japanese kids want to be when they grow up? For 30 percent of boys, YouTubers, survey says. https://soranews24.com/2019/08/17/what-do-japanese-kids-want-to-be-when-they-grow-up-for-30-percent-of-boys-youtubers-survey-says/
2. Happonen, A., Minashkina, D.: Professionalism in Esport: Benefits in Skills and Health & Possible Downsides. LUT Scientific and Expertise Publications (2019). ISBN 978-952-335-375-6 (PDF)
3. Griffiths, M.D.: The psychosocial impact of professional gambling, professional video gaming, and eSports. Casino Gaming Int. 28, 59–63 (2017)
4. Choi, C., Hums, M., Bum, C.H.: Impact of the family environment on juvenile mental health: esports online game addiction and delinquency. Int. J. Environ. Res. Public Health 15(12), 2850 (2018). https://doi.org/10.3390/ijerph15122850
5. Survey on games and children (translated from Japanese). https://www.asmarq.co.jp/data/mr201409game/
6. New Poll Shows 60% of Parents Know More About Their Child's Favorite Video Games Than They Do About Their Classes At School. https://apnews.com/article/video-games-games-lifestyle-health-coronavirus-pandemic-643d89a8a4f17eb53d74a980b67440a0
7. UCI to launch first-of-its-kind official e-sports initiative in the fall. https://news.uci.edu/2016/03/30/uci-to-launch-first-of-its-kind-official-e-sports-initiative-in-the-fall/
8. Rising Stars: All-women's Stephens College breaks ground with varsity esports program. https://www.espn.com/esports/story/_/id/19195390/all-women-school-stephens-college-adds-scholarship-esports-program
9. This Company Is Bringing E-Sports to High Schools. To the Students Who Play, It's Bringing a Life Line. https://www.inc.com/kevin-j-ryan/high-school-esports-first-season-playvs.html?cid=hmsub2
10. Murakami, K., Miyashita, K., Miyachi, H.: A Study on the Relationship Between Refresh-Rate of Display and Reaction Time of eSports (2020)
11. Donders, F.C.: On the speed of mental processes. Acta Psychol. (Oxf.) 30, 412–431 (1969)
12. Kosinski, R.J.: A Literature Review on Reaction Time. Clemson University (2008)
13. Vishteh, R.A.: Evaluation of simple visual reaction time of different colored light stimuli in visually normal students. Clin. Optom. (Auckl.) 11, 167–171 (2019)
14. Hanumantha, S., Kamath, A., Shastry, R.: Diurnal variation in visual simple reaction time between and within genders in young adults: an exploratory, comparative, pilot study. Sci. World J. 2021, Article ID 6695532 (2021)
15. Hamidur Rahman, Md., Shahidul Islam, M.: Investigation of audio-visual simple reaction time of university athletes and non-athletes. J. Adv. Sports Phys. Educ. (2021). https://doi.org/10.36348/jaspe.2021.v04i03.002
16. Mori, S., Ohtani, Y., Imanaka, K.: Reaction times and anticipatory skills of karate athletes. Hum. Mov. Sci. 21(2), 213–230 (2002)
17. Nuri, L., Shadmehr, A., Ghotbi, N., Attarbashi, M.B.: Reaction time and anticipatory skill of athletes in open and closed skill-dominated sport. Eur. J. Sport Sci. 13, 431–436 (2013)
Assessing the Sense of Presence to Evaluate the Effectiveness of Virtual Reality Wildfire Training Huang Heyao(B) and Ogi Tetsuro System Design and Management, Keio University, Yokohama, Kanagawa, Japan [email protected]
Abstract. Virtual Reality is beneficial for high-risk training such as firefighting, extreme weather, and police work due to cost and risk concerns. A fundamental characteristic of VR is creating the sense of being there and training participants to make the right decisions under high pressure. This research presents an experimental study that aims to evaluate the effectiveness of a Virtual Environment for wildfire-fighting training by assessing the sense of presence. We had 10 participants who experienced the same Virtual Environment in two VR conditions, represented by the HTC Vive and Google Cardboard. The two devices represent two immersion levels. To evaluate the effectiveness, we used the Igroup Presence Questionnaire (IPQ) for participants to rate their subjective opinions of spatial presence, the real-time recorded ECG signal to indicate the stress level during the experiment, and skin temperature to show the excitement level.
1 Introduction
For the past decade, we have experienced the warmest time on record. Accompanying this record are widespread and devastating wildfires, from the Amazon to California and from the Arctic Circle to Australia. Wildfire suppression is complex because the interaction of wind, terrain, and plants in the landscape can cause a variety of unusual yet significant effects on fire propagation1. This requires firefighters to maintain situational awareness and make decisions under high pressure and tension in order to put out the fire and protect themselves. Frequent, consistent, and high-quality training plays an important role in the success and safety of a firefighter. Despite these needs, firefighters often hold training sessions with less frequency or quality than desired for several reasons, such as safety concerns and budget cuts. Virtual Reality (VR) is presented as an alternative training method that has the capability of overcoming these problems2. The technology has been tested for disaster exercises and training purposes in many different settings, including fire extinguisher operation and rescue training. Compared to traditional fire training methods, VR provides a simulated experience for trainees to explore 1 Sharples, McRae, and Wilkes, “Wind-terrain effects on the propagation of wildfires in rugged
terrain: Fire Channelling”. 2 Narciso et al., Virtual reality in training.
various emergency scenarios repeatedly without exposure to the hazard. Moreover, the technology allows the study and collection of data, such as human behavioral and psychological data under emergency situations, that are not possible or feasible to obtain in reality3. These advantages are valuable in supporting firefighters beyond traditional fire training methods because they allow trainees to practice the scenarios repeatedly until they are familiar with the situations and master the skills. The effectiveness of the training is the main challenge of using VR training. Ideally, the training environment in which VR immerses the trainee should provide an experience in the same way as reality. However, current technology is not capable of replicating all stimuli in the virtual environment with the same fidelity as in the real environment. Although current VR technology cannot produce a simulation with the same fidelity, it is still promising for effective training. According to Harter et al.4, the suspension of disbelief can cause the same sort of autonomic reactions as if users were experiencing the situation in real life. In this study, the training is considered effective if the virtual environment is able to evoke the mental state where the person feels the sense of presence while knowing that the scenario being experienced is not real. This study presents an experiment with participants who have no wildfire-fighting experience, in which the effectiveness of a training VE was assessed using the participants' sense of presence. The level of presence was measured subjectively using the Igroup Presence Questionnaire (IPQ), and objectively through concentration and stress analysis from Heart Rate Variability (HRV) and excitement analysis from skin temperature. The experiment compared conditions of 3D and 360-degree images, viewing angles, resolutions of the displayed images, and interaction functions using two VR Head Mounted Displays (HMD): the HTC Vive and Google Cardboard. The goal of this experiment is to see whether one of the conditions provided a more concentrated, stressful environment for participants. This study is pertinent because an effective training simulator brings benefits not only to the fire department but also to other fields that need to understand and identify training for dynamic, dangerous situations.
2 First Stage Experiment
This study of evaluating the sense of presence is the second stage of evaluating the effectiveness of VR wildfire training. The first stage was to evaluate the effectiveness of a training VE by comparing the user's response time to the target. The purpose of the first stage was to confirm that this VE is useful for users to locate the head fire (the side of the fire having the fastest rate of spread; the head fire has been considered one of the most lethal causes of firefighter entrapment)5 by using information on wind direction and speed. Two scenarios with different terrains and wind speeds and directions were designed. Users participated in scenario B (the harder one) first, and the time at which the user reached the head fire area and the total time the user spent were recorded. After the first implementation in scenario B, users practiced analyzing wind information and locating the fire area 3 Kinateder et al., “Virtual reality for fire evacuation research”. 4 Harter et al., “An immersive virtual environment for varying risk and immersion for effective
training”. 5 NWCG, “Head, Flank, and Rear Fire Terms”.
in scenario A four times with the same equipment. Then users participated again with the same environment and time recording criteria as the first time in scenario B. The result of the first stage showed that after the practices in scenario A, users improved by 49 s in test 2 compared with test 1 in scenario B. Thus, the effectiveness of the VE was confirmed, and this study, the second stage of evaluating the effectiveness of the VR wildfire training, continued to use the same VE.
3 Method
3.1 Virtual Scenes and Tasks
The study continued to use the non-real-time simulated forest VE designed in Unity for the first stage experiment. In the case of the HTC Vive, one user at a time entered an already burning scenario using the HTC HMD. The user was represented as a computer-generated avatar while participating; he/she controlled the avatar's movement by looking in the direction he/she wanted to go while wearing the HMD and using the joystick to walk forward. The system does not support walking backward; to walk backward, the user needs to turn around and look in that direction. The simulation applies the first-person point of view in order to provide a virtual “first-hand” perspective of the setting (Fig. 1).
Fig. 1. First person point-of-view during the actual virtual environment training.
All the fire is created with a particle system and can be extinguished by using the fire hose attached to the right joystick controlled by the user. When water particles collide with fire particles, the fire slowly disappears. All the flammable elements in the environment, such as grass, trees, and leaves, are controlled by the Fire Manager asset provided for Unity. It allows automatic fire propagation by calculating information on wind speed, direction, topography, and humidity. For Google Cardboard, since it only supports 3D video displayed by a mobile device, a recorded video of a user participating with the HTC Vive is shown in the Google Cardboard on an iPhone X.
3.2 Procedures
10 participants aged between 20 and 30 years old participated in two conditions: Google Cardboard with an iPhone X, with a 30-degree viewing angle, 2436-by-1125 pixel resolution, and no interaction function; and an HTC Vive with a 110-degree viewing angle, 2160-by-1200 pixel resolution, and motion-tracked handheld controllers to interact with the environment. All participants took part in this study without firefighting or fire training experience, since the study is designed to train junior or volunteer firefighters who have minimal knowledge of wild firefighting. Biological information, namely the electrocardiogram and facial skin thermogram, was measured. Participants were asked to wear a chest-worn HRV measurement device, the Polar H10, to record the real-time RRV value. RRV is the variance of R-R intervals. It is known that the RRV value decreases when a human concentrates the mind6. After we explained the flow of the experiment and how to use the devices, we made a 5-s test recording of the Polar H10's signal. All participants took part in the experiment in the order of the 30-degree viewing angle followed by the 110-degree viewing angle. While HRV was recorded continuously during the experiment, the facial skin temperature at the participant's nose was also tracked in real time for temperature changes using a FLIR One (Fig. 2). The temperature at the nose tip is known to be the most significant for emotional arousal; a decrease in nose temperature indicates an increase in mental stress7. After each VE, participants were asked to answer the questionnaire. This not only collected the subjective results but also helped the participants calm down after the first experiment. In addition, although Google Cardboard has no interaction function, this study asked participants to stand during the non-interaction experiment to have a physical performance similar to the interaction-function experiment.
4 Result
The results of the tests were analyzed with the goal of measuring the participants' presence, stress, excitement, and concentration. The study chose a critical p-value lower than 0.05 as significant and a p-value between 0.05 and 0.10 as indicative. Considering that each participant has a different physical condition, all the results are compared individually as pairs with parametric statistical tests.
4.1 HRV
The R-R interval (RRI) shows small changes (milliseconds) in the intervals between successive heartbeats. The greater the RRV (variance of the RRI), the lower the human's concentration. Accordingly, the longer the distance between two successive RRV graphs, the larger the difference in concentration level. This study selected the 20 smallest variances for both conditions during the experiment and compared them with a two-tailed t-test to see which environment made participants more concentrated. 6 Hirose, Ishii: “A method for Objective Assessment of Mental Work, Transactions”. 7 Ogi, et al., “Evaluation of High Presence Sensation based on Biological Information”.
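As a rough illustration of this analysis (a sketch under stated assumptions, not the authors' code), the following Python fragment computes sliding-window RRV values from synthetic R-R interval series for the two conditions, keeps the 20 smallest variances of each, and applies a two-tailed t-test; the window length and the synthetic data are assumptions.

```python
import numpy as np
from scipy import stats

def smallest_rrv(rri_ms, window=20, keep=20):
    """Variance of R-R intervals over a sliding window; return the smallest values."""
    rri = np.asarray(rri_ms, dtype=float)
    variances = np.array([rri[i:i + window].var()
                          for i in range(len(rri) - window + 1)])
    return np.sort(variances)[:keep]

# Synthetic R-R interval series (ms) standing in for the Polar H10 recordings.
rng = np.random.default_rng(0)
rri_vive = rng.normal(800, 15, 300)   # steadier heartbeats (more concentration)
rri_cb = rng.normal(800, 35, 300)     # more variable heartbeats

t, p = stats.ttest_ind(smallest_rrv(rri_vive), smallest_rrv(rri_cb))  # two-tailed
print(f"t = {t:.2f}, p = {p:.4f}")
```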
Fig. 2. Facial temperature recording at the nose tip during the Google Cardboard experiment.
Figure 3 presents the RRV over 20 heartbeats for the 10 participants while participating in the 30-degree non-interactive environment and in the 110-degree interactive environment, compared using a two-tailed t-test. Vive represents the 110-degree interactive environment, and CB represents the 30-degree non-interactive environment. Analysis of the results indicates statistically significant differences in RRV values: participant 1 p = 0.000, participant 2 p = 0.000, participant 3 p = 0.019, participant 4 p = 0.149, participant 5 p = 0.099, participant 6 p = 0.008, participant 7 p = 0.0014, participant 8 p = 0.086, participant 9 p = 0.000, and participant 10 p = 0.0053. There are large differences between the t-test p-values because the physical condition of each participant is different. Although not all 10 participants showed significant differences, 8 out of 10 have p-values < 0.05; in particular, participants 1, 6, and 7 showed consistent and low RRV in the 110-degree interaction-function experience, whereas in the 30-degree non-interaction-function experience their RRV fluctuated with greater variances.
4.2 Facial Thermal
Functional infrared thermal imaging is considered a promising method to measure emotional autonomic responses through facial cutaneous thermal variations8. This study recorded the facial temperature at the nose tip, the important region of interest for facial thermal variations, which has been compared to electrodermal responses, a robust index of emotional arousal9. The lower the nose temperature, the more stress the participant feels during the experiment. 8 Kosonogov et al., “Facial thermal variations”. 9 Donadio et al., “Skin sympathetic”.
Fig. 3. 20 smallest RR-variances comparisons for each participant in 110-degree environment and 30-degree environment for all 10 participants.
The number of participants included in the facial skin temperature analysis was reduced due to data loss caused by the camera delay during nose tracking. The design of the low-quality environment experiment
required participants to turn their heads to navigate, and this caused the camera to be blocked for some data collections. We kept the first-time data, since the emotional arousal would differ between the first experiment and an experiment the participants were already familiar with. The change of the room temperature influences the facial skin temperature; thus, for this study, the difference in nose temperature was measured and used. Table 1 below shows the results of the facial thermal variations per participant. The statistical analysis with a two-tailed t-test shows significant differences between the two conditions except for participants 6 (p = 0.33) and 8 (p = 0.77). Although most of the results are significant, the average temperature in the narrower-viewing-angle, lower-resolution, non-interaction environment decreased during the experiment, while with the wider-viewing-angle, high-resolution device with interaction function the temperature increased or decreased less. This indicates that the participants felt less stress in the low-quality environment than in the high-quality environment. However, studies indicate that a higher-presence environment relaxes people while a digital display causes more stress10.

Table 1. Result of facial temperatures in Celsius

Object   Average   SD     p-value
Vive 1    0.34     0.71   3.364E-07
CB 1     -0.15     0.68
Vive 2    0.22     1.18   5.75E-12
CB 2     -0.70     1.40
Vive 3    0.01     0.87   1.519E-12
CB 3     -0.61     0.79
Vive 4   -0.52     0.93   0.0002
CB 4     -1.08     0.61
Vive 5    0.29     0.58   0.33
CB 5      0.22     0.52
Vive 6    0.91     0.86   5.504E-07
CB 6      1.40     0.83
Vive 7   -0.96     0.74   1.076E-39
CB 7      0.14     0.50
Vive 8   -0.22     1.63   0.77
CB 8     -0.02     1.11
4.3 IPQ The study adapted the igroup presence questionnaire (IPQ) with 14 questions in three subscales and one additional general item. The three subscales emerged from principal component analyses and can be regarded as fairly independent factors. The three subscales are: spatial presence, the sense of being physically present in the VE; Involvement, measuring the attention devoted to the VE and the involvement experienced; and 10 Ogi et al., “Evaluation of high presence sensation based on biological information”.
Fig. 4. The result of the Igroup Presence Questionnaire. The greener the score, the more sense of presence the participant subjectively felt.
Experienced Realism, measuring the subjective experience of realism in the VE. There are three items with reversed wording; the scales of these items were reversed before the analysis. The study performed parametric tests for statistical evaluation with two independent variables: the 30-degree viewing angle, low-resolution, non-interaction condition and the 110-degree viewing angle, high-resolution, interaction condition. The results are shown in Fig. 4; the smaller the score, the less the participant felt the sense of presence, and the sheets are labeled in color from red to green. The results for spatial presence and involvement show a significant difference. The high-quality image with interaction function yields a high sense of presence, of being surrounded by the virtual world, and a sense of acting in the virtual space. For item 9 INV3, “I still paid attention to the real environment.”, and item 13 REAL3, “How real did the virtual world seem to you?”, the results of the high-quality image with interaction function and the low-quality image with non-interaction function show only minor differences.
5 Conclusion
The main goal of this work was to evaluate the effectiveness of a virtual environment to train firefighters for wildfires. To achieve this, the study first established the virtual environment based on the study of past wildfire cases, tested that the environment's fire training function was effective, and studied the influence of our virtual training environment on presence. The elements that influence the sense of presence in this study are video quality, field of view, and operation function. The HTC Vive, with a 110-degree viewing angle, high display resolution, and joystick interaction, represents the high-immersion device. On the other hand, the Google Cardboard represents the low-quality device, with a 30-degree viewing angle, low display resolution, and no interaction function. The HRV analysis used the 20 smallest RRVs from the R-R intervals to represent the most concentrated period of time during the experiment in the high-quality and low-quality environments. The results showed that during the experiment with the HTC Vive, participants had more stable and lower RRV than with the Google Cardboard, which means that participants were more concentrated with the wider viewing angle and higher display resolution with the ability to interact than with the narrower viewing angle and lower display resolution without interaction ability. The results from facial thermography were used to study the emotional arousal of the participants in the environment. Normally, facial temperature decreases if the level of stress increases; however, in this case, participants felt more stress in the digital-like environment, while they were more relaxed and concentrated in the high-sense-of-presence environment. Lastly, the results from the questionnaire were used to study the subjective opinions about feeling presence in the virtual environment. 12 out of 14 items indicated that all 10 participants felt more immersed in and surrounded by the virtual environment with the 110-degree viewing angle than with the 30-degree viewing angle. Given the high concentration and immersion in the results, this VR wildfire training is effective in simulating a wildfire environment for firefighters to train in. Although the results from the virtual training environment were far from those achieved in a real environment, participants showed a higher level of concentration and immersion in the operational VR wildfire training than in the observational one. These are important because
the system will serve as a reference for future wildfire VR training development for both physical and psychological learning.
References
1. Sharples, J.J., McRae, R.H.D., Wilkes, S.R.: Wind-terrain effects on the propagation of wildfires in rugged terrain: fire channelling. Int. J. Wildland Fire 21(3), 282–296 (2012)
2. Narciso, D., Melo, M., Raposo, J.V., Cunha, J., Bessa, M.: Virtual reality in training: an experimental study with firefighters. Multimed. Tools Appl. 79(9–10), 6227–6245 (2019). https://doi.org/10.1007/s11042-019-08323-4
3. Kinateder, M., et al.: Virtual reality for fire evacuation research. In: 2014 Federated Conference on Computer Science and Information Systems, pp. 313–321 (2014)
4. Harter, D., Lu, S., Kotturu, P., Pierce, D.: An immersive virtual environment for varying risk and immersion for effective training. In: Proceedings of the ASME 2011 World Conference on Innovative Virtual Reality, Milan, Italy, June 27–29, pp. 301–307. ASME (2011). https://doi.org/10.1115/WINVR2011-5522
5. 8.6 Head, Flank, and Rear Fire Terms. Nwcg.Gov. https://www.nwcg.gov/course/ffm/fire-behavior/86-head-flank-and-rear-fire-terms. Accessed 29 May 2021
6. Hirose, M., Ishii, T.: A method for objective assessment of mental work. Bulletin of JSME 29(253), 2330–2335 (1986). https://doi.org/10.1299/jsme1958.29.2330
7. Ogi, T., Kubota, Y., Toma, T., Chikakiyo, T.: Evaluation of high presence sensation based on biological information. In: 2013 16th International Conference on Network-Based Information Systems, pp. 327–331. IEEE (2013)
8. Kosonogov, V., et al.: Facial thermal variations: a new marker of emotional arousal. PLOS ONE 12(9), e0183592 (2017). https://doi.org/10.1371/journal.pone.0183592
9. Donadio, V., et al.: Skin sympathetic adrenergic innervation: an immunofluorescence confocal study. Ann. Neurol. 59(2), 376–381 (2006)
3D Measurement and Feature Extraction for Metal Nuts Zhiyi Gao(B) , Tohru Kato, Hiroki Takahashi, and Akio Doi Iwate Prefectural University, 152-52 Susuko, Takizawa, Iwate 020-0693, Japan [email protected], {toru_k,t-hiroki,doia}@iwate-pu.ac.jp
Abstract. The metal nuts used to assemble bridges and buildings inevitably deteriorate over time, which makes it crucial to obtain various parameters related to their conditions during maintenance activities and then determine whether and when they should be replaced. However, since measuring the essential parameters such as height and minimum width of large numbers of nuts is a very time- and energy-consuming project, we developed a method that uses a laser measurement instrument to collect three-dimensional (3D) information on such nuts. We then combined it with a software application that can quickly and automatically calculate their 3D point cloud data and obtain their parameters. As a result, building and bridge inspections can now be conducted more quickly and efficiently, and with reduced manpower requirements.
1 Introduction
Large metal nuts, which are among the most common fastening parts used in the assembly of machines and buildings, typically have a center hole with a female thread and are used in combination with male threaded parts, such as bolts. Generally speaking, hexagonal-shaped nuts and bolts are used to fix steel bars together in buildings and bridges. However, in situations where they are installed exposed to outside environments, they corrode over time, and after 10 or 20 years, need to be replaced. Hence, such nuts need to be examined periodically. In conventional inspections, the examiner conducts an on-site survey and determines the condition of each nut based on his or her experience. However, due to the large number of nuts in a typical bridge or building, this procedure is often time-consuming and potentially dangerous [1]. In previous studies, nuts were measured in three dimensions, and their parameters were obtained by interactive manipulation on a personal computer (PC) using software such as MeshLab and CloudCompare. However, in these situations, the nuts were considered individually, and the parameters of each had to be manually calculated via numerous interactive mouse-based operations, which required significant amounts of time. To solve the above problem, we developed a system that automatically performs a quick inspection by using a laser scanner to obtain a three-dimensional (3D) image of a nut, along with an application that automatically processes its 3D point cloud data to obtain its parameters. These data are used in combination with previously uploaded information on the bridge or building to determine whether the nut is still functional.
Using this system, inspectors can determine which nuts are defective and need to be replaced via a simple, fast, easy, and safe operation.
2 Automatic Measurement of 3D Nut Data Parameters
3D Measurements of Metal Nuts. We have extensive experience in 3D point cloud inspection, having used 3D inspection technology to obtain 3D point cloud data of rocks and streets in Miyako City [2, 3]. In this study, we used a 3D laser inspection device to obtain the 3D point cloud data of nuts installed in the Kanmon Kaikyo Bridge, which connects the main Japanese islands of Honshu and Kyushu. For the processing of 3D point clouds, we took inspiration from the literature [4, 5]. We can calculate the parameters by obtaining the eigenvalues of the nuts. An example of the targeted nuts is shown in Fig. 1. Image data were obtained using a 3DSL Rhino 01 laser imager (Seikowave Energy, Lexington, KY), as shown in Fig. 2. Numerous 3D point cloud datasets of nuts were captured via this system, such as the sample shown in Fig. 3.
Fig. 1. The nuts of Kanmon Kaikyo Bridge
Fig. 2. 3DSL-Rhino-01 laser scanner
Fig. 3. Point cloud data of metal nut.
Principle of the Algorithm. To calculate the nut parameters, we first need to know the plane below the nut, which means we need the coordinate values of three points on that plane in order to calculate the plane equation. In Fig. 3, we see that the points corresponding to the maximum and minimum values on the x-axis of the nut point cloud are on the plane. To avoid problems in choosing the third point, we do not use another extreme point; instead, we set the third point to point P (Xp, Yp, Zp), the point farthest from the midpoint (Xmid, Ymid, Zmid) of points Xmax and Xmin. We then use the Xmax, Xmin, and P points to obtain the plane equation (Ax + By + Cz + D = 0).
Xmid = (Xxmax + Xxmin) / 2
Ymid = (Yxmax + Yxmin) / 2
Zmid = (Zxmax + Zxmin) / 2
Lmax = sqrt((Xmid − Xp)² + (Ymid − Yp)² + (Zmid − Zp)²)
We can then find the point farthest from the plane, (Xd, Yd, Zd). The distance from this point to the plane is the height of the nut, called d:

d = |A·Xd + B·Yd + C·Zd + D| / sqrt(A² + B² + C²)
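A minimal NumPy sketch of this geometry, assuming the three base points and the probe point have already been extracted from the point cloud (an illustration, not the authors' implementation):

```python
import numpy as np

def plane_from_points(p1, p2, p3):
    """Return (A, B, C) and D of the plane Ax + By + Cz + D = 0 through three points."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    normal = np.cross(p2 - p1, p3 - p1)   # (A, B, C)
    return normal, -normal.dot(p1)        # D

def point_plane_distance(point, normal, d_coef):
    """Perpendicular distance from a point to the plane."""
    point = np.asarray(point, dtype=float)
    return abs(normal.dot(point) + d_coef) / np.linalg.norm(normal)

normal, d_coef = plane_from_points((0, 0, 0), (1, 0, 0), (0, 1, 0))
print(point_plane_distance((0.5, 0.5, 20.83), normal, d_coef))  # -> 20.83, the nut height
```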
Next, we divide d into 200 equal parts and count the number of points contained in each part of the space separately. As shown in Fig. 4, the horizontal coordinates
Fig. 4. Correspondence table of the number of point clouds in different intervals.
are d1 to d200, and the vertical coordinates are the number of points contained in each interval. In Fig. 4, it can be seen that d1 has 1460 points, the highest number of all the parts surveyed, indicating that d1 is the base plane (Ax + By + Cz + D = 0), while Fig. 5 shows that d1, d27, d137, and d198 correspond to the four planes of the nut.
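The slab-counting step can be sketched with a simple histogram; the synthetic distances below are hypothetical stand-ins for the per-point plane distances:

```python
import numpy as np

# Bin every point's distance to the base plane into 200 equal intervals
# between 0 and the nut height d; histogram peaks mark the flat faces.
rng = np.random.default_rng(1)
d = 20.83
distances = np.concatenate([
    rng.normal(0.05, 0.03, 1460).clip(0, d),   # dense cluster: the base plane
    rng.normal(14.2, 0.05, 900).clip(0, d),    # a second face of the nut
])
counts, edges = np.histogram(distances, bins=200, range=(0.0, d))
print("densest interval:", counts.argmax() + 1, "with", counts.max(), "points")
```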
Fig. 5. Side view of the nut
Next, we calculate the parameters of plane d137, which is hexagonal in shape, as shown in Fig. 6. Since the Xmax, Xmin, Ymax, and Ymin points can be found in Fig. 7, we need just two more vertices to obtain them all.
Fig. 6. Top view of the nut
Fig. 7. Four poles on the hexagonal plane
The other two vertices may exist at four different positions, as shown in Fig. 8. We count the number of points in the intervals designated as 0, 1, 2, and 3. The two intervals with the highest number of points are where the remaining vertices are located. Next, we find the point in each of these intervals that is farthest from the line on which the corresponding pole lies, from which we obtain the two points Pa and Pb, as sketched below. As a result, we now have six vertices and can calculate the hexagon parameters.
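A hypothetical sketch of that farthest-from-line search (the candidate points and poles below are made up for illustration):

```python
import numpy as np

def farthest_from_line(points, a, b):
    """Return the point with the largest perpendicular distance from line ab."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    pts = np.asarray(points, dtype=float)
    direction = (b - a) / np.linalg.norm(b - a)
    # Perpendicular distance of every point from the line through a and b.
    dists = np.linalg.norm(np.cross(pts - a, direction), axis=1)
    return pts[dists.argmax()]

candidates = np.array([[1.0, 2.0, 0.0], [3.0, 5.0, 0.0], [2.0, 0.5, 0.0]])
print(farthest_from_line(candidates, [0, 0, 0], [1, 0, 0]))  # -> [3. 5. 0.]
```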
Fig. 8. Four positions of two vertices
Study Results. We developed an application to automatically calculate nut data parameters and tested it using the 3D Model Ply_1. Figures 9 and 10 and Table 1 show the results.
Fig. 9. Nut height parameters
Fig. 10. Nut hexagonal parameters
Table 1. Hexagonal parameter calculation results

Height of nut thinning                 20.830475
L1                                     34.022653
L2                                     27.271394
L3                                     29.174537
Shortest line segment (yellow part)    L2 = 27.271394
In order to judge the accuracy of our algorithm, we measured the shortest cross-sectional distance of five datasets using both the present method and direct manual measurement, as shown in Table 2. The comparison confirms that the present method is highly accurate.

Table 2. Comparison table showing the true parameters of five nuts.

Data name   Shortest line segment by software   True value
Ply_1       27.27 mm                            27.3 mm
Ply_2       33.54 mm                            33.5 mm
Ply_3       33.83 mm                            33.9 mm
Ply_4       31.14 mm                            31.2 mm
Ply_5       33.79 mm                            33.9 mm
3 Conclusion and Future Work
In this study, we showed how we could collect the 3D point cloud data of nuts installed on bridges, and reported on the development of a program to automatically calculate nut parameters. Using the 3D space conversion principle, we showed how we could obtain the equation of the lower plane of the nut, calculate the nut height by adjusting the value of d in the equation, and then obtain the hexagonal cross-section of the nut. These data allow us to find the point cloud coordinates of the six hexagon vertices using the polar method and to calculate the minimum width of the hexagonal cross-section. The minimum width indicates the area worn away by weathering and corrosion in the external environment over the years. We then set a threshold value and determine whether the minimum width of any nut is below this threshold. If so, that nut needs to be replaced. The program currently under study is only the first stage. Our ultimate goal is to create a program that can examine multiple nuts in succession, analyze the nut defects, and mark the nuts that need to be replaced. The final form of the program may be reduced to a cellphone application with which the inspector simply opens the phone camera, quickly scans the nut, obtains a 3D model, and calculates the nut parameters. The 3DSL-Rhino-01 laser imager we use is only suitable for close-range inspection, which is not well suited to nuts on bridges in complex environments. In the future, we plan to use a cell phone to control an aerial imager to acquire data, synthesize 3D point clouds with an SfM algorithm [6], and then combine the results with the nut calculation program to mark the damaged nuts.
References
1. Wang, Q., Kim, M.-K.: Applications of 3D point cloud data in the construction industry: a fifteen-year review from 2004 to 2018. Adv. Eng. Inf. 39, 306–319 (2019). https://doi.org/10.1016/j.aei.2019.02.007
2. Gao, Z., Doi, A., Sakakibara, K., Hosokawa, T., Harada, M.: 3D measurement and modeling for gigantic rocks at the sea. In: Barolli, L., Kryvinska, N., Enokido, T., Takizawa, M. (eds.) Advances in Network-Based Information Systems: The 21st International Conference on Network-Based Information Systems (NBiS-2018), pp. 514–520. Springer International Publishing, Cham (2019). https://doi.org/10.1007/978-3-319-98530-5_44
3. Gao, Z., Kato, T., Takahashi, H., Doi, A.: A method for extracting electric pole areas from 3D measurement data and its application. In: Tohoku Branch Conference of Japan Society for Arts and Sciences (2021)
4. Masuda, H., Mori, Y.: Automatic extraction of features around roads from point cloud and images acquired by mobile measurement. Image Lab 30(8), 42–47 (2019)
5. Matsumoto, H., Midagawa, R., Saito, K.: Non-rigid registration for point group processing of cylindrical objects. In: Spring Meeting of the Precision Engineering Society, pp. 567–568 (2018)
6. van Riel, S.: Exploring the use of 3D GIS as an analytical tool in archaeological excavation practice. M.A. thesis in Archaeology, Lund University, p. 16 (2016)
A Machine Learning Approach for Predicting 2D Aircraft Position Coordinates
Kazuma Matsuo1, Makoto Ikeda2(B), and Leonard Barolli2
1 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan [email protected]
2 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan [email protected], [email protected]
Abstract. The prediction of arrival times for buses, trains and other transportation systems using Machine Learning (ML) and Deep Neural Network (DNN) approaches is attracting attention. For daily operations, the data can be collected and used as training data for ML and DNNs. In this paper, we present an ML-based system for predicting two-dimensional aircraft position coordinates by using a part of the data received from ADS-B. The evaluation results show that our proposed system can predict the aircraft's two-dimensional path with an accuracy of about 94%.
Keywords: Aircraft position · Prediction · ADS-B · ML

1 Introduction
Next-generation Air Traffic Management (ATM) automation systems are essential to improve navigation, communications, surveillance and passenger safety [16]. The Automatic Dependent Surveillance-Broadcast (ADS-B) system is at the core of this future [11,17,21,23,24,26]. Recently, Machine Learning (ML) has been used to predict the delay times of some transportation systems such as buses, trains and aircraft [1–3,5,10,14,15]. For routine operations such as daily flights, the data can be collected and used as training data for ML and Neural Networks (NNs) [8,9,13,19]. In this research, we focus on aircraft signals and collect them to predict the aircraft's future position coordinates. In this paper, we propose a system for predicting two-dimensional aircraft position coordinates by using the data received from ADS-B. For the evaluation of the proposed system, we consider aircraft taking off and landing on domestic and international flights at Fukuoka Airport in Japan. Also, our system considers aircraft passing overhead. The system consists of a Raspberry Pi and an ADS-B receiver to collect training data. The ADS-B receiver can receive the aircraft identification
number, altitude, speed, and position coordinates, but only the received time and position coordinates are used to predict the aircraft position. The structure of the paper is as follows. In Sect. 2, we describe the related works. In Sect. 3, we present the design of the aircraft prediction system. In Sect. 4, we provide the evaluation results. Finally, conclusions and future work are given in Sect. 5.
2 Related Works
In this section, we present an overview of aircraft surveillance radars. An Airport Surveillance Radar (ASR) consists of a Primary Surveillance Radar (PSR) [20] and a Secondary Surveillance Radar (SSR) [22]. The deployed radar systems depend on the airport's size and purpose. The PSR is an integrated radar with a transmitter and receiver. The PSR does not provide altitude information or aircraft identification. The SSR uses a transmitter that sends an interrogation signal to a transponder on the aircraft, and the transponder responds with a reply signal. When a transponder is damaged, the communication is interrupted and monitoring becomes impossible. The ADS-B is a more advanced aircraft surveillance system than the SSR. The ADS-B broadcasts the aircraft's current position, altitude, and other information constantly. Anyone can receive this signal from aircraft using an ADS-B receiver [18]. The passive bistatic radar [25] and multi-static primary surveillance radar are applications of PSR [4,6,7,12]. They consist of separate transmitters and receivers. By receiving at multiple points, they are expected to expand the monitoring coverage area and improve the update frequency.
3 Proposed System Design
3.1 Overview
We implemented a system based on the Raspberry Pi to receive the 1,090 MHz signal from ADS-B. The received ADS-B data contains 8 message types, of which the position coordinates are stored in message #3. Each message has 22 fields. We extracted the fields that store latitude, longitude and the time of message transmission as training data for ML. Figure 1 shows the extracted data of a received ADS-B message, sorted by the received time (first column). The second column indicates longitude, the third column indicates latitude, and the fourth column the difference in received time (Timediff). Timediff indicates the difference t0 − tn between the first received time t0 and the received time tn for each aircraft identification code. The measured data are managed in CSV format by aircraft identification code. These data were measured with an ADS-B receiver
Fig. 1. Sample of message format.
installed at Fukuoka Institute of Technology. The distance between Fukuoka Airport and Fukuoka Institute of Technology is about 10 km. This system is also being considered for detecting falsified position reports as one of its possible applications.
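The Timediff bookkeeping can be sketched with pandas; the column names and the toy records below are assumptions, not the actual measurement schema:

```python
import pandas as pd

df = pd.DataFrame({
    "icao": ["A1", "A1", "A1", "B2", "B2"],   # aircraft identification code
    "time": pd.to_datetime(["12:00:00", "12:00:05", "12:00:11",
                            "12:30:00", "12:30:07"]),
    "lon": [130.45, 130.47, 130.50, 130.40, 130.38],
    "lat": [33.68, 33.66, 33.63, 33.59, 33.61],
})
# Offset of every report from the first report received per aircraft.
first = df.groupby("icao")["time"].transform("min")
df["timediff_s"] = (df["time"] - first).dt.total_seconds()
print(df)
```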
3.2 Data for Machine Learning
We use Python to import the CSV data into an array for ML. Then, a padding process is applied using the end of the data, so that each sequence is aligned with the largest data length for training. As the ML technique, we use random forest regression to deal with the time series data. As the training data for this system, we measured data (973 samples) for one week (7 days). We predicted longitude and latitude as performance indicators.
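A minimal sketch of this pipeline under stated assumptions (the feature layout, padding rule, and toy tracks are hypothetical; the real system trains on the 973 measured samples):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def pad_track(track, max_len):
    """Pad a (time, lon, lat) sequence at the end by repeating its last row."""
    track = np.asarray(track, dtype=float)
    pad = np.repeat(track[-1:], max_len - len(track), axis=0)
    return np.vstack([track, pad])

# Toy tracks: columns are (timediff_s, longitude, latitude).
tracks = [np.column_stack([np.arange(n) * 10.0,
                           130.40 + 0.01 * np.arange(n),
                           33.70 - 0.01 * np.arange(n)]) for n in (8, 10, 12)]
max_len = max(len(t) for t in tracks) - 1
X = np.stack([pad_track(t[:-1], max_len).ravel() for t in tracks])  # early track
y = np.stack([t[-1, 1:] for t in tracks])                           # final (lon, lat)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.predict(X[:1]))   # predicted final (lon, lat) of the first track
```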
4 Evaluation Results
We show in Fig. 2 the results for the one-week dataset in a case where the difference between the predicted and actual paths is small. The upper plot shows longitude and the bottom plot shows latitude (see Fig. 2). In Fig. 3, we show the mapping results of Fig. 2. The accuracy is 93.71% and 94.47% for longitude and latitude, respectively.
Fig. 2. Prediction results of latitude and longitude.
Fig. 3. Mapping results.
5 Conclusions
In this paper, we evaluated an ML approach for predicting two-dimensional aircraft position coordinates by using the data received from ADS-B. We used random forest regression for training and measured the data for one week. From the evaluation results, we found that our proposed system can predict the aircraft path. The accuracy is about 94%, but it is only a prediction in two-dimensional coordinates. In future work, it is necessary to consider other parameters and algorithms in order to apply the proposed system to different situations.
References
1. Cai, Q., Alam, S., Duong, V.N.: A spatial-temporal network perspective for the propagation dynamics of air traffic delays. Engineering 7(4), 452–464 (2021). https://www.sciencedirect.com/science/article/pii/S2095809921000485
2. Choi, S., Kim, Y.J., Briceno, S., Mavris, D.: Prediction of weather-induced airline delays based on machine learning algorithms. In: 2016 IEEE/AIAA 35th Digital Avionics Systems Conference (DASC), pp. 1–6 (2016)
3. Duan, Y., Yisheng, L.V., Wang, F.Y.: Travel time prediction with LSTM neural network. In: 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), pp. 1053–1058 (2016)
4. Edrich, M., Schroeder, A.: Design, implementation and test of a multiband multistatic passive radar system for operational use in airspace surveillance. In: 2014 IEEE Radar Conference, pp. 12–16 (2014)
5. Gui, G., Liu, F., Sun, J., Yang, J., Zhou, Z., Zhao, D.: Flight delay prediction based on aviation big data and machine learning. IEEE Trans. Veh. Technol. 69(1), 140–150 (2020)
6. Honda, J., Otsuyama, T., Watanabe, M., Makita, Y.: Study on multistatic primary surveillance radar using DTTB signal delays. In: 2018 International Conference on Radar (RADAR), pp. 1–4 (2018)
7. Honda, J., Otsuyama, T.: Feasibility study on aircraft positioning by using ISDB-T signal delay. IEEE Antennas Wirel. Propag. Lett. 15, 1787–1790 (2016)
8. Kim, Y.J., Choi, S., Briceno, S., Mavris, D.: A deep learning approach to flight delay prediction. In: 2016 IEEE/AIAA 35th Digital Avionics Systems Conference (DASC), pp. 1–6 (2016)
9. Martínez-Prieto, M.A., Bregón, A., García-Miranda, I., Álvarez-Esteban, P.C., Díaz, F., Scarlatti, D.: Integrating flight-related information into a (big) data lake. In: 2017 IEEE/AIAA 36th Digital Avionics Systems Conference (DASC), pp. 1–10 (2017)
10. Moreira, L., Dantas, C., Oliveira, L., Soares, J., Ogasawara, E.: On evaluating data preprocessing methods for machine learning models for flight delays. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2018)
11. Nijsure, Y.A., Kaddoum, G., Gagnon, G., Gagnon, F., Yuen, C., Mahapatra, R.: Adaptive air-to-ground secure communication system based on ADS-B and wide-area multilateration. IEEE Trans. Veh. Technol. 65(5), 3150–3165 (2016)
12. O'Hagan, D.W., Baker, C.J.: Passive bistatic radar (PBR) using FM radio illuminators of opportunity. In: 2008 New Trends for Environmental Monitoring Using Passive Systems, pp. 1–6 (2008)
13. Olive, X., et al.: OpenSky report 2020: analysing in-flight emergencies using big data. In: 2020 AIAA/IEEE 39th Digital Avionics Systems Conference (DASC), pp. 1–10 (2020)
14. Pamplona, D.A., Weigang, L., de Barros, A.G., Shiguemori, E.H., Alves, C.J.P.: Supervised neural network with multilevel input layers for predicting of air traffic delays. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–6 (2018)
15. Peters, J., Emig, B., Jung, M., Schmidt, S.: Prediction of delays in public transportation using neural networks. In: International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce, CIMCA-IAWTIC 2006, vol. 2, pp. 92–97 (2005)
16. Post, J.: The next generation air transportation system of the United States: vision, accomplishments, and future directions. Engineering 7(4), 427–430 (2021). https://www.sciencedirect.com/science/article/pii/S209580992100045X
17. Schäfer, M., Strohmeier, M., Lenders, V., Martinovic, I., Wilhelm, M.: Bringing up OpenSky: a large-scale ADS-B sensor network for research. In: IPSN-14 Proceedings of the 13th International Symposium on Information Processing in Sensor Networks, pp. 83–94 (2014)
18. Sciancalepore, S., Alhazbi, S., Di Pietro, R.: Reliability of ADS-B communications: novel insights based on an experimental assessment. In: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, SAC 2019, pp. 2414–2421. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3297280.3297518
19. Shi, Z., Xu, M., Pan, Q., Yan, B., Zhang, H.: LSTM-based flight trajectory prediction. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2018)
20. Skolnik, M.I.: Introduction to Radar Systems, 3rd edn. McGraw-Hill College, New York (1962)
21. Smith, A., Cassell, R., Breen, T., Hulstrom, R., Evers, C.: Methods to provide system-wide ADS-B back-up, validation and security. In: 2006 IEEE/AIAA 25th Digital Avionics Systems Conference, pp. 1–7 (2006)
22. Stevens, M.C.: Secondary Surveillance Radar. Artech House, Norwood (1988)
23. Strohmeier, M., Lenders, V., Martinovic, I.: On the security of the automatic dependent surveillance-broadcast protocol. IEEE Commun. Surv. Tutor. 17(2), 1066–1087 (2015)
24. Strohmeier, M., Schäfer, M., Lenders, V., Martinovic, I.: Realities and challenges of nextgen air traffic management: the case of ADS-B. IEEE Commun. Mag. 52(5), 111–118 (2014)
25. Willis, N.J.: Bistatic Radar, 2nd edn. Artech House (1995)
26. Yang, A., Tan, X., Baek, J., Wong, D.S.: A new ADS-B authentication framework based on efficient hierarchical identity-based signature with batch verification. IEEE Trans. Serv. Comput. 10(2), 165–175 (2017)
Evaluation of Rainfall Characteristics Between 1-h Precipitation and 10-min Precipitation Observed by AMeDAS Kiyotaka Fujisaki(B) Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan [email protected]
Abstract. In next-generation satellite communication systems, rainfall has a significant impact on link quality. In order to build a reliable satellite link, technologies that suppress the effects of rainfall are required. Based on the precipitation data that the Japan Meteorological Agency has observed so far, we are studying a method to predict the effect of rainfall on satellite links from the expected precipitation. In this paper, based on the data of 1-h precipitation and 10-min precipitation observed by AMeDAS, the statistical properties between 1-h precipitation and 10-min precipitation are reported.
1 Introduction
The Ka-band, such as 21 GHz, can be used to realize future broadcasting satellite services and high-speed satellite communication. However, frequency bands at and above the Ka-band are susceptible to severe climate effects, especially rainfall, and communication may not be possible for a certain period of time during rainfall. Determining the excess path attenuation due to rainfall is a major problem in link engineering for satellite communication systems. Theoretical studies and experiments have been conducted by many researchers to estimate the rainfall attenuation of satellite radio waves [1–9]. Based on these results and ITU recommendations [10], the rainfall margin for satellite links is now estimated. If a sufficient link margin can be supplied to the satellite links at all times, error-free service can be provided even in the rain. On the other hand, signal attenuation increases sharply at higher frequencies. Furthermore, it is difficult to secure a sufficient link margin in satellite communication systems because of various restrictions; for example, the power that can be supplied to the system depends on the capacity of the solar cells mounted on the satellite. Under such circumstances, in order to provide high-data-rate and highly reliable communication, it is necessary to introduce an adaptive control system that responds to the condition of the propagation path.
As a countermeasure against rain attenuation in Ka-band and higher-frequency satellite communication, multi-beam satellite systems have been proposed. By using multi-beam satellite technology, the transmission power of each beam can be adaptively controlled, the influence of rain on the link can be reduced, and reliable communication can be performed [11–15]. In [13–15], we examined the effectiveness of adaptive control methods using 21 GHz multi-beam satellites. From precipitation data, we predicted the quality deterioration of each beam and showed that link quality can be improved by increasing the transmission power of the beams that are greatly affected by rainfall. In these papers, in order to evaluate the quality of satellite links, the 1-h precipitation data of AMeDAS (Automatic Meteorological Data Acquisition System) was regarded as the rainfall intensity at that time. Since rainfall intensity fluctuates greatly over short periods, a large error can easily occur in the evaluation if 1-h precipitation data is used as the rainfall intensity. Therefore, we also reported the need for evaluation based on shorter-term rainfall data. However, it is not easy to use short-term precipitation data for adaptive control because it is difficult to efficiently obtain such data from observation points all over Japan. Therefore, based on the 10-min precipitation data and the 1-h precipitation data observed by the Japan Meteorological Agency, we evaluate the statistical properties of the rainfall intensity that occurs within 1 h. We are also aiming to propose a method for probabilistically estimating the distribution of rainfall intensity from 1-h precipitation data. In this paper, in order to find the statistical properties between 1-h precipitation and 10-min precipitation, the 1-h precipitation was divided into 6 groups, the maximum value of the 10-min precipitation was extracted for each group, and their frequencies of occurrence were evaluated. Furthermore, the occurrence of rainfall in each rainfall group is shown by month. The paper structure is as follows. In Sect. 2, we introduce a regional meteorological observation system in Japan. In Sect. 3, we present the relation between 1-h precipitation and 10-min precipitation by using the observation data of AMeDAS. Finally, in Sect. 4, we conclude the paper.
2 AMeDAS
AMeDAS [16] is a regional meteorological observation system operated by the Japan Meteorological Agency, composed of approximately 1,300 meteorological stations covering Japan. The official name of this system is Automated Meteorological Data Acquisition System. This system has been in operation since November 1974. Using this system, weather conditions such as rain, snow, wind, temperature, and humidity are monitored in detail over short time intervals and for finely subdivided areas, contributing to weather forecasting and to the prevention and mitigation of meteorological disasters. Currently, when observing precipitation using this system, the result of one station can be regarded as the average value over a 17-km-square area. Meteorological data observed by AMeDAS is provided in CSV format. One dataset consists of about 70 fields, and the data is accumulated every 10 min.
Data files are created daily for each observatory. In this research, we use the date, time, 1-h precipitation, and 10-min precipitation fields from the data observed by AMeDAS.
3 Relationship Between 1-h Precipitation and Maximum Value of 10-min Precipitation
We have proposed a method to improve link quality by predicting the link quality of each beam from rainfall data and adaptively controlling the transmission power of each beam using multi-beam satellite technology [13–15]. In the link quality forecast, the amount of rainfall per hour was regarded as the rainfall intensity at that time. However, rain conditions change over short periods, and uniform rainfall does not continue for an hour. Short-term precipitation data is needed to more accurately predict the impact of rainfall on satellite links, but retrieving such data is not easy. To solve this problem, we consider statistically evaluating the impact of rainfall on satellite links from the 1-h precipitation data and 10-min precipitation data obtained by the Japan Meteorological Agency. This paper uses AMeDAS data observed in 2014 to investigate the relationship between the 1-h precipitation and the maximum value of the 10-min precipitation observed in that hour. In this analysis, precipitation was evaluated in 6 groups. Specifically, precipitation is divided into 0–4 mm, 5–14 mm, 15–24 mm, 25–34 mm, 35–44 mm, and 45 mm and above. First, the 1-h precipitation data is assigned to these groups, and then the maximum value is found from the 10-min precipitation data for the hour in which the 1-h precipitation was measured. This process was performed on each data point, and the distribution of the maximum value of the 10-min precipitation was obtained. Figures 1, 2, 3, 4, 5 and 6 show the results of this evaluation. As shown in Fig. 1, when the 1-h precipitation is about 0–4 mm, the 10-min precipitation does not exceed 3 mm, and the maximum value is most often 0.5 mm. On the other hand, as the 1-h precipitation increased, the maximum observed 10-min precipitation increased. In addition, it was shown that not only does the typically observed 10-min precipitation increase, but the variation also increases. In next-generation satellite communications using the Ka-band, if the rainfall intensity exceeds 15 mm/h, the link quality will deteriorate significantly and stable use will not be possible. This means that if the 10-min precipitation exceeds about 3 mm, the rainfall can affect the quality of the satellite links. Therefore, even if the 1-h precipitation is less than 5 mm, the rainfall will affect the satellite links during periods when the 10-min precipitation is 3 mm or more. This situation becomes more pronounced as precipitation increases. Figure 4 shows the results when the 1-h precipitation exceeds 25 mm. It can be seen that more than half of the maximum 10-min precipitation values exceed 3 mm, so more than half of the observed rainfall affected the satellite links. Table 1 shows the results of a monthly evaluation of the frequency of rainfall for each group. Most of the rainfall is less than 5 mm, but some rainfall above
[Bar chart: normalized rate of occurrence (0–1) vs. maximum 10-min precipitation (0–30 mm), 0–4 mm class]
Fig. 1. Distribution of the maximum value of 10-min precipitation in the case where the 1-h precipitation is 0–4 mm.
[Bar chart: normalized rate of occurrence (0–1) vs. maximum 10-min precipitation (0–30 mm), 5–14 mm class]
Fig. 2. Distribution of the maximum value of 10-min precipitation in the case where the 1-h precipitation is 5–14 mm.
[Bar chart: normalized rate of occurrence (0–1) vs. maximum 10-min precipitation (0–30 mm), 15–24 mm class]
Fig. 3. Distribution of the maximum value of 10-min precipitation in the case where the 1-h precipitation is 15–24 mm.
[Bar chart: normalized rate of occurrence (0–1) vs. maximum 10-min precipitation (0–30 mm), 25–34 mm class]
Fig. 4. Distribution of the maximum value of 10-min precipitation in the case where the 1-h precipitation is 25–34 mm.
[Bar chart: normalized rate of occurrence (0–1) vs. maximum 10-min precipitation (0–30 mm), 35–44 mm class]
Fig. 5. Distribution of the maximum value of 10-min precipitation in the case where the 1-h precipitation is 35–44 mm.
[Bar chart: normalized rate of occurrence (0–1) vs. maximum 10-min precipitation (0–30 mm), 45 mm or more class]
Fig. 6. Distribution of the maximum 10-min precipitation in the case where the 1-h precipitation is 45 mm or more.
Table 1. Number of 1-h precipitation data for each 1-h precipitation category.

      0–4 mm   5–14 mm  15–24 mm  25–34 mm  35–44 mm  45 mm or more
Jan   164856     8642       165        13         0        0
Feb   187672    18847       910       172        27       14
Mar   245480    37154      1992       284        46       39
Apr   140885    20387       872       168        25        6
May   144591    38499      2519       472       149      125
Jun   203808    58267      6315      1463       471      216
Jul   164580    56329      8016      2297       748      506
Aug   225622    94284     17619      5468      1984     1240
Sep   114642    31939      4290      1165       370      259
Oct   182072    74015      9573      2391       824      492
Nov   174668    21182       999       290        95       82
Dec   297365    25555       845        79         6        2
Table 2. Monthly average of 10-min precipitation, and annual average and variance, evaluated for each 1-h precipitation category.

          0–4 mm  5–14 mm  15–24 mm  25–34 mm  35–44 mm  45 mm or more
Jan        0.556    1.157     2.691     3.731     0.000     0.000
Feb        0.539    1.246     3.113     4.605     4.796     7.250
Mar        0.565    1.260     3.059     4.153     6.359     8.282
Apr        0.578    1.230     2.981     4.220     5.740     6.250
May        0.604    1.345     3.053     4.732     5.587     8.332
Jun        0.634    1.465     3.126     4.573     5.679     7.104
Jul        0.676    1.615     3.320     4.620     5.830     8.093
Aug        0.715    1.722     3.303     4.720     6.020     7.818
Sep        0.670    1.631     3.434     4.516     5.711     8.207
Oct        0.634    1.409     3.047     4.657     6.300     8.246
Nov        0.601    1.314     3.315     4.752     5.821     7.579
Dec        0.567    1.160     2.903     3.924     5.167     3.000
Average    0.612    1.379     3.112     4.434     5.251     6.680
Variance   0.003    0.033     0.039     0.108     2.676     6.086
25 mm has been observed. Therefore, it is important to consider that rainfall can affect satellite links in any month. It can also be seen that heavy rainfall is concentrated mainly in summer. Table 2 shows the results of a monthly evaluation of the average 10-min precipitation for each group. The results show that when the 1-h precipitation exceeds
15 mm, the average 10-min precipitation exceeds 3 mm, which is likely to affect satellite links. In addition, as the amount of precipitation increases, the variance also increases, so the fluctuation of the observed rainfall increases as well.
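The grouping and statistics described in this section can be summarized by the following Python sketch, which assigns each hour to one of the six 1-h precipitation categories, takes the maximum of the 10-min values observed in that hour, and reports the mean and variance per category (mirroring Tables 1 and 2). The input format, a list of (1-h precipitation, 10-min values) pairs, is assumed for illustration.

from statistics import mean, pvariance

# Category lower bounds (mm): 0-4, 5-14, 15-24, 25-34, 35-44, 45+
BOUNDS = [0, 5, 15, 25, 35, 45]
LABELS = ["0-4", "5-14", "15-24", "25-34", "35-44", "45+"]

def category(rain_1h):
    # Return the index of the 1-h precipitation category.
    for i in reversed(range(len(BOUNDS))):
        if rain_1h >= BOUNDS[i]:
            return i
    return 0

def max_10min_by_category(hours):
    # hours: list of (rain_1h, [six 10-min values for that hour]).
    groups = {label: [] for label in LABELS}
    for rain_1h, ten_min_values in hours:
        groups[LABELS[category(rain_1h)]].append(max(ten_min_values))
    return groups

# Example with two hypothetical hours of data
hours = [(3.5, [0.5, 1.0, 0.5, 0.5, 0.5, 0.5]),
         (18.0, [2.0, 3.5, 4.0, 3.0, 3.0, 2.5])]
for label, maxima in max_10min_by_category(hours).items():
    if maxima:
        print(label, "mm:", "mean", mean(maxima), "var", pvariance(maxima))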
4 Conclusion
In this paper, 1-h precipitation was divided into 6 groups, and the 10-min precipitation within the rainfall of each group was analyzed. The evaluation results have shown that even if the amount of rainfall in one hour is small, rainfall that affects the satellite links may occur for a short time. Also, the quality of the links deteriorates significantly when the 1-h precipitation exceeds 25 mm.

In this work, an analysis considering regional characteristics was not performed. However, rain conditions are regional. For example, even if the amount of rainfall is the same, heavy rain may fall over a short period in one area, while light rain may continue for a long time in another. For this reason, it is not possible to evaluate the quality of satellite links from precipitation alone. To predict the impact of rainfall on satellite links more accurately, it is necessary to understand the statistical relationship between precipitation and rainfall intensity in each region. In the future, we would like to perform a detailed analysis for each region and evaluate the rainfall situation.

Acknowledgements. The author would like to thank Mr. Yuto Koba for cooperating with the data analysis.
References
1. Crane, R.K.: Prediction of attenuation by rain. IEEE Trans. Commun. 28(9), 1717–1733 (1980)
2. Satoh, K.: Studies on spatial correlation of rain rate and raindrop layer height. IEICE Trans. Commun. 66(4), 493–500 (1983). (in Japanese)
3. Kang, J., Echigo, H., Ohnuma, K., Nishida, S., Sato, R.: Three-year measurement by VSAT system and CCIR estimation for rain attenuation in Ku-band satellite channel. IEICE Trans. Commun. 79(5), 722–726 (1996)
4. Dissanayake, A., Allnutt, J., Haidara, F.: A prediction model that combines rain attenuation and other propagation impairments along earth-satellite paths. IEEE Trans. Antennas Propag. 45(10), 1546–1558 (1997)
5. Karasawa, Y., Maekawa, Y.: Ka-band earth-space propagation research in Japan. Proc. IEEE 85(6), 821–842 (1997)
6. Polonio, R., Riva, C.: ITALSAT propagation experiment at 18.7, 39.6 and 49.5 GHz at Spino D'Adda: three years of CPA statistics. IEEE Trans. Antennas Propag. 46(5), 631–635 (1998)
7. Panagopoulos, A.D., Arapoglou, P.M., Cottis, P.G.: Satellite communications at KU, KA, and V bands: propagation impairments and mitigation techniques. IEEE Commun. Surv. Tutorials 6(3), 2–14 (2004)
8. Kamei, M., Tanaka, S., Shogen, K.: Study of the relationship between the transmission capacity and the transmission power for 21 GHz-band broadcasting satellites using phased array antenna. IEICE Trans. Commun. 89(2), 106–114 (2006). (in Japanese)
9. Maekawa, Y., Nakatani, T., Shibagaki, Y., Hatsuda, H.: A study on site diversity techniques related to rain area motion using Ku-band satellite signals. IEICE Trans. Commun. 91(6), 1812–1818 (2008)
10. Rec. ITU-R P.618-7: Propagation data and prediction methods required for the design of earth-space telecommunication systems. ITU (2001)
11. Yoshino, T., Ito, S.: A selection method of beams compensated using AMeDAS data for the multi-beam satellite broadcasting. IEICE Trans. 82(1), 64–70 (1999). (in Japanese)
12. Chodkaveekityada, P., Fukuchi, H.: Evaluation of adaptive satellite power control method using rain radar data. IEICE Trans. Commun. 99(11), 2450–2457 (2016)
13. Iwamoto, T., Fujisaki, K.: Study of beam power control of Ka-band multi-beam broadcasting satellite using meteorological data. In: Proceedings of AINA Workshops, pp. 267–274 (2019). https://doi.org/10.1007/978-3-030-15035-8_25
14. Iwamoto, T., Fujisaki, K.: Study of beam power control methods of Ka-band multi-beam broadcasting satellite system using meteorological data - evaluation of link quality in consideration of beam-directivity. IEICE Tech. Rep. 119(220), 93–98 (2019)
15. Iwamoto, T., Fujisaki, K.: A beam power allocation method for Ka-band multi-beam broadcasting satellite based on meteorological data. In: Proceedings of EIDWT 2020, pp. 237–246 (2020). https://doi.org/10.1007/978-3-030-39746-3_26
16. AMeDAS, Japan Meteorological Agency. https://www.jma.go.jp/jma/en/Activities/amedas/amedas.html
Numerical Analysis of Photonic Crystal Waveguide with Stub by CIP Method Hiroshi Maeda(B) Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Fukuoka 811-0295, Japan [email protected]
Abstract. Transmission and reflection characteristics of a photonic crystal waveguide with a variety of stub structures were numerically analyzed by the constrained interpolation profile (CIP) method. The numerical results show that the structure is applicable as a drop circuit for wavelength division multiplexed (WDM) signals, by changing the length of the stub for a carrier wave with a specific wavelength.
1 Introduction
Photonic crystal structures, or electromagnetic band gap structures, have a periodic distribution of material constants and are applied in practical optical components for signal generation, transmission and reception, because of their unique and sensitive characteristics with respect to the signal frequency. Those characteristics are based on photonic band gap (PBG) phenomena [1–4]. In signal processing and transmission utilizing PBG devices in optical integrated circuits, high-density multiplexing in the frequency domain is expected due to the sensitivity with respect to optical wavelength. This is important for improving the capacity of information transmission in photonic networks with dense wavelength-domain multiplexing of signals. The behavior of electromagnetic waves in a periodic structure can be controlled by selecting the material constants, designing the periodic profile of the structure and choosing the frequency spectrum range of the signal. For various kinds of materials and various frequency ranges, a PBG might be found by designing the structure from a fundamental unit lattice. This means that, by setting the parameters appropriately, confinement and transmission of electromagnetic waves along a line defect in the structure is possible for the desired range of frequencies, from the microwave to the optical domain. In this sense, we examined the propagation and filtering characteristics of two-dimensional photonic crystal waveguides and cavities with a triangular lattice of dielectric pillars at microwave frequencies around 4 GHz. In the experiments [5–8], the authors used ceramic rods as dielectric pillars. For its quite low-loss property and high dielectric constant of εr = 36.0, ceramic is suitable for confining the electromagnetic field tightly when the periodic structure is composed of fewer layers.
As a numerical analysis technique, the finite-difference time-domain (FDTD) method [9] is powerful and widely applicable, as it can handle various boundary shapes in multi-dimensional problems. However, it is known that FDTD shows physically incorrect behavior for problems that include a large jump of material constants at a boundary. It is possible to avoid such behavior by using smaller cells; however, this increases the number of cells in the entire analysis region, with a corresponding increase in memory and computation time. This means that attention must be paid to choosing discrete cells that guarantee reliable results within a reasonable computation time. In contrast, the constrained interpolation profile (CIP) method was proposed by Yabe [10], with the advantage of preventing such spurious behavior seen in the FDTD method. Because the CIP method employs cubic polynomials to express the profile in a cell, it can update not only the profile at each discretized point but also the first-order spatial derivatives of the profile. The authors have applied the CIP method to the analysis of wave propagation in periodic structures composed of ceramic pillars in an air background.
Fig. 1. Top view of fundamental triangular lattice of photonic crystal and side view of ceramic rod with parameters.
In this paper, the filtering characteristics of stubs, placed as defects along a line-defect waveguide, are numerically investigated by the CIP method [11–15]. In the simulation, a band-limited wave with a time-evolving envelope of a sampling function is given as the input. The transmitted frequency peaks were obtained by fast Fourier transform (FFT) of the output electric field in the time domain. The results show that obvious transmission peaks are observed in the Fourier-transformed domain.
2 CIP Method to Solve Two-Dimensional Maxwell's Equations
Maxwell’s curl equations for electric field vector E and magnetic field vector H in lossless, isotropic, non-dispersive and non-conductive material are given as follows;
∇ × H = ε ∂E/∂t,    (1)
∇ × E = −μ ∂H/∂t,    (2)
where ε is the permittivity and μ is the permeability of the space, respectively. Assuming a two-dimensional space uniform along the z axis (i.e. ∂/∂z = 0), the Maxwell equations are decomposed into two sets of polarization. We analyze the TE mode or E-wave, which includes (Hx, Hy, Ez) as its components and propagates along the x axis. For the TE mode, Maxwell's equations reduce to the following set:

∂Ez/∂y = −μ ∂Hx/∂t,    (3)
∂Ez/∂x = μ ∂Hy/∂t,    (4)
∂Hy/∂x − ∂Hx/∂y = ε ∂Ez/∂t.    (5)
From Eqs. (3) to (5), we obtain a vector partial differential equation:

A ∂W/∂x + B ∂W/∂y + C ∂W/∂t = 0,    (6)

where

W = (Hx, Hy, Ez)^T,    (7)

A = ⎛ 0  1  0 ⎞
    ⎜ 0  0  1 ⎟ ,    (8)
    ⎝ 0  0  0 ⎠

B = ⎛ −1  0  0 ⎞
    ⎜  0  0  0 ⎟ ,    (9)
    ⎝  0  0  1 ⎠

C = ⎛ 0   0  −ε ⎞
    ⎜ 0  −μ   0 ⎟ .    (10)
    ⎝ μ   0   0 ⎠
A split-step procedure [10] is used to solve Eq. (6). The transformation into advection equations and the split-step solution procedure for the advection equations are the same as in our previous work [15] and are omitted here; they are described in detail in Refs. [10] and [11].
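As a concrete illustration of the scheme, the sketch below implements one CIP update step for the one-dimensional advection equation ∂f/∂t + u ∂f/∂x = 0, advancing both the profile f and its spatial derivative g = ∂f/∂x with the cell-wise cubic polynomial; the split-step procedure applies such one-dimensional updates along each axis in turn. This is a minimal sketch of the standard CIP formulation of Ref. [10], not the exact code used in this work.

import numpy as np

def cip_advect_1d(f, g, u, dx, dt):
    # One CIP step for df/dt + u*df/dx = 0 (constant u, periodic ends,
    # stable for |u|*dt <= dx). f: profile, g: derivative df/dx.
    iup = -1 if u >= 0 else 1                    # upwind neighbour offset
    D = iup * dx                                 # signed distance to upwind point
    fi, gi = f, g
    fu, gu = np.roll(f, -iup), np.roll(g, -iup)  # upwind values
    # Cubic F(X) = a X^3 + b X^2 + gi X + fi with F(D) = fu, F'(D) = gu
    a = (gi + gu) / D**2 + 2.0 * (fi - fu) / D**3
    b = 3.0 * (fu - fi) / D**2 - (2.0 * gi + gu) / D
    xi = -u * dt                                 # departure point offset
    f_new = ((a * xi + b) * xi + gi) * xi + fi   # F(xi)
    g_new = (3.0 * a * xi + 2.0 * b) * xi + gi   # F'(xi)
    return f_new, g_new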
[Schematic: line-defect waveguide with field components (Hx, Hy, Ez); Port 1 (input), Port 2 (transmittance), Port 3 (reflectance); stubs (defects) along the guide]
Fig. 2. Top view of 2D periodic triangular lattice with a line-defect waveguide and 2 pairs of Fano resonators.
3 Two-Dimensional, Pillar-Type Photonic Crystal Structure, Line-Defect Waveguide and Fano Cavity
In Fig. 2, the top view of a periodic triangular lattice with a line-defect waveguide is shown, together with the electromagnetic field components. The longitudinal axis of the cylinder corresponds to the polarization direction of the electric field Ez of the TE mode. The material of the cylinders is ceramic with a relative dielectric constant εr = 36.0 at frequency f = 4.0 GHz. In the simulation, the input frequency band is from 3.6 to 4.2 GHz, the dielectric loss is negligibly small (on the order of 10⁻⁶), and the real part of the dielectric constant can be assumed uniform over this frequency range. The ceramic rods are fabricated and supplied by the Kyocera company in Japan for general use as microwave circuit elements. The lattice period P = 26.5 mm was designed so that the line-defect structure shows a PBG for the frequency range from 3.6 to 4.2 GHz in the experiment. Following this design, the incident wave is guided along the defect without penetrating into the periodic structure. In Fig. 2, the TE mode with components (Hx, Hy, Ez) is excited at port #1. The electric field Ez has a Gaussian profile along the y-axis with full beam waist w0 = 24.8 mm. As also shown in Fig. 2, six stubs formed by defects are placed along both sides of the waveguide to realize filtering circuits.
4 FFT Analysis of Output Electric Field in Cavity
In the numerical analysis, the discretizations for space and time are set to Δx = Δy = 0.75 mm and Δt = 2.5 × 10⁻¹³ s, respectively. Supposing a band-limited spectrum with a square profile, the time-evolving input wave f(t) is given by the inverse Fourier transform of the spectrum as follows:

Re{f(t)} = −fL × Sa(2πfL t) + fU × Sa(2πfU t),    (11)

where

Sa(x) = sin(x)/x    (12)

is the sampling function, and fL and fU are the lower and upper frequencies [Hz] of the limited band, respectively. Here, fL = 3.6 GHz and fU = 4.2 GHz are used to obtain the flat spectrum. The maximum input amplitude in the simulation comes at time t = 100/fC [s], where 1/fC is the time period of the center frequency of the range and fC = (fL + fU)/2. To obtain a frequency resolution comparable with experimental results, the time-evolving data from the CIP method is sampled every Δtsample = 100Δt. Therefore, a sampled time interval Δtsample = 25.0 ps with Nsample = 4096 samples was set, giving a frequency resolution Δf = (Δtsample × Nsample)⁻¹ ≈ 9.77 MHz. This resolution gives 600/9.77 ≈ 61 points in the input frequency range.

In Fig. 3(a)–(c), three typical structures with different stub lengths are shown. Figure 3(a) illustrates a line-defect waveguide with six stubs of length 2P, where P = 26.5 mm is the lattice period. All stubs are equally separated by 3P horizontally and face each other vertically. As depicted, this structure has point symmetry with respect to the center of the line-defect waveguide. Figures 3(b) and (c) show a similar waveguide structure with stubs, but with stub lengths of 3P and 4P, respectively. In Fig. 4(a) and (b), the simulated S parameters are plotted, where Fig. 4(a) is the transmission spectrum |S21| and Fig. 4(b) is the reflection spectrum |S31|, respectively. In Fig. 4(a), two transmission peaks are observed for 2P and 4P, while for 3P a wide transmission band is obtained. The transmission spectrum for 2P suggests that the structure is applicable as a band-limited filter with two transmission bands, around 3.8 GHz and 4.2 GHz. In Fig. 4(b), the reflection of all three types of structure remains at a high level over the entire input frequency band. It is required to reduce the reflection and to increase the transmission.
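The input waveform of Eqs. (11) and (12) and the frequency resolution quoted above can be checked with the short Python sketch below (a parameter check only, not the field simulation itself; note that numpy's sinc(x) is sin(πx)/(πx), so Sa(2πft) = sinc(2ft)).

import numpy as np

f_L, f_U = 3.6e9, 4.2e9            # band edges [Hz]
f_C = 0.5 * (f_L + f_U)            # centre frequency
dt_sample = 25.0e-12               # sampling interval (100 * dt)
N_sample = 4096

def input_wave(t):
    # Re{f(t)} of Eq. (11)
    return -f_L * np.sinc(2.0 * f_L * t) + f_U * np.sinc(2.0 * f_U * t)

t = (np.arange(N_sample) * dt_sample) - 100.0 / f_C   # peak at t = 100/f_C
wave = input_wave(t)

df = 1.0 / (dt_sample * N_sample)                     # frequency resolution
print(f"df = {df/1e6:.2f} MHz")                       # -> about 9.77 MHz
print(f"points in band = {(f_U - f_L)/df:.0f}")       # -> about 61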
(a) 6 stubs with length of 2P.
(b) Stubs with length of 3P.
(c) Stubs with length of 4P.
Fig. 3. Illustration of a triangular photonic crystal waveguide with 6 stubs of the length of (a) 2P, (b) 3P, and (c) 4P, respectively, where P is the lattice period 26.5 [mm]. The stubs are indicated by blue boxes.
Fig. 4. Transmitted and reflected power spectrum at (a) port 2 and (b) port 3, respectively.
5 Conclusion and Future Subject
The filtering characteristics of a line-defect waveguide with stubs were numerically demonstrated. The output and reflected spectra of the proposed structure suggest that it is applicable as a band-limited filter with dual transmission bands. As future work, the mechanism of the dual transmission band in the structure should be discussed based on the electromagnetic field profile in the stub space, together with reducing the amount of reflection for a variety of stub locations. Comparison with microwave measurements of the structure is also required.
Acknowledgment. The author expresses his appreciation to Mr. K. Kurushima, Mr. N. Tomokiyo and Ms. H. Doi of Fukuoka Institute of Technology for their contributions as part of their undergraduate research under the author's supervision in 2020–21.
References
1. Yasumoto, K. (ed.): Electromagnetic Theory and Applications for Photonic Crystals. CRC Press (2006)
2. Inoue, K., Ohtaka, K. (eds.): Photonic Crystals - Physics, Fabrication and Applications. Springer, New York (2004)
3. Noda, S., Baba, T. (eds.): Roadmap on Photonic Crystals. Kluwer Academic Publishers (2003)
4. Joannopoulos, J.D., Meade, R.D., Winn, J.N.: Photonic Crystals. Princeton University Press, New Jersey (1995)
5. Maeda, H., Inoue, S., Nakahara, S., Hatanaka, O., Zhang, Y., Terashima, H.: Experimental study on X-shaped photonic crystal waveguide in 2D triangular lattice for wavelength division multiplexing system. In: Proceedings of the 26th International Conference on Advanced Information Networking and Applications (AINA-2012), pp. 629–632, March 2012
6. Maeda, H., Zhang, Y., Terashima, H.: An experimental study on X-shaped branching waveguide in two-dimensional photonic crystal structure. In: Proceedings of the 6th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2012), pp. 660–664, July 2012
7. Maeda, H.: Four-branching waveguide in 2D photonic crystal structure for WDM system. J. Space-Based Situated Comput. 3(4), 227–233 (2013)
8. Bao, Y., Maeda, H., Nakashima, N.: Studies on filtering characteristics of X-shaped photonic crystal waveguide in two-dimensional triangular lattice by microwave model. In: Proceedings of International Symposium on Antennas and Propagation (ISAP2015), pp. 842–845, November 2015
9. Taflove, A.: Advances in Computational Electrodynamics - The Finite-Difference Time-Domain Method. Artech House
10. Yabe, T., Feng, X., Utsumi, T.: The constrained interpolation profile method for multiphase analysis. J. Comput. Phys. 169, 556–593 (2001)
11. Maeda, H.: Numerical technique for electromagnetic field computation including high contrast composite material. In: Optical Communications, Chapter 3, pp. 41–54. InTech Open Access Publisher, October 2012
12. Maeda, H., Yasumoto, K., Chen, H., Tomiura, K., Ogata, D.: Numerical and experimental study on Y-shaped branch waveguide by post wall. In: Proceedings of the 16th International Conference on Network-Based Information Systems (NBiS 2013), pp. 508–512, September 2013
13. Jin, J., Bao, Y., Chen, H., Maeda, H.: Numerical analysis of Y-shaped branch waveguide in photonic crystal structures and its application. In: Proceedings of the 7th International Conference on Broadband, Wireless Computing, Communication and Applications (BWCCA-2014), pp. 362–365, November 2014
14. Maeda, H., Ogata, D., Sakuma, N.N., Toyomasu, N., Nishi, R.: Numerical analysis of 1×4 branch waveguide in two dimensional photonic crystal structure. In: Proceedings of International Conference on Advanced Information Networking and Applications (AINA 2015), pp. 366–369, March 2015
15. Maeda, H., Cada, M., Bao, Y., Jin, J., Tomiura, K.: Numerical analysis of transmission spectrum of X-shaped photonic crystal waveguide for WDM system. In: Proceedings of the 10th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2016), pp. 186–189, July 2016
A CCM-Based HC System for Mesh Router Placement Optimization: A Comparison Study for Different Instances Considering Normal and Uniform Distributions of Mesh Clients

Aoto Hirata1, Tetsuya Oda2(B), Nobuki Saito1, Yuki Nagai2, Kyohei Toyoshima2, and Leonard Barolli3

1 Graduate School of Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan {t21jm02zr,t21jm01md}@ous.jp
2 Department of Information and Computer Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan {oda,t18j056tk}@ice.ous.ac.jp, [email protected]
3 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan [email protected]
Abstract. Wireless Mesh Networks (WMNs) enable routers to communicate with each other wirelessly in order to create a stable network over a wide area at a low cost, and they have attracted much attention in recent years. There are different methods for optimizing the placement of mesh routers. In our previous work, we proposed a Coverage Construction Method (CCM) and a CCM-based Hill Climbing (HC) system for the mesh router placement problem considering normal and uniform distributions of mesh clients. In this paper, we propose a processing-time reduction method for the CCM-based HC system and evaluate the performance of the system for different instances considering normal and uniform distributions. From the simulation results, we found that the CCM-based HC system was able to cover more mesh clients than the CCM for different instances, and that the processing time was reduced compared to the previous system.
1 Introduction
Wireless Mesh Networks (WMNs) [1–4] are a wireless network technology that enables routers to communicate with each other wirelessly to create a stable network over a wide area at a low cost, and they have attracted much attention in recent years. The placement of the mesh routers has a significant impact on cost, communication range and operational complexity. Therefore, research is being done to optimize the placement of these mesh routers. In our previous work [5–12], we proposed and evaluated different meta-heuristics such
Fig. 1. Flowchart of the CCM.
as Genetic Algorithms (GA) [13], Hill Climbing (HC) [14], Simulated Annealing (SA) [15], Tabu Search (TS) [16] and Particle Swarm Optimization (PSO) [17] for mesh router placement optimization. Also, we proposed a Coverage Construction Method (CCM) for the mesh router placement problem [18] and a CCM-based Hill Climbing (HC) system [19]. The CCM is able to rapidly create a group of mesh routers with the radio communication ranges of all mesh routers linked to each other. The CCM-based HC system was able to cover many mesh clients generated by normal and uniform distributions. We also showed that, in the two-islands model, the CCM-based HC system was able to find the two islands and cover many mesh clients [20]. In this paper, we propose a processing-time reduction method for the CCM-based HC system and evaluate its performance for different numbers of mesh routers and instances considering normal and uniform distributions. As evaluation metrics, we consider the Size of Giant Component (SGC) and the Number of Covered Mesh Clients (NCMC).
The structure of the paper is as follows. In Sect. 2, we define the mesh router placement problem. In Sect. 3, we describe the CCM system, the CCM-based HC system and the processing-time reduction method. In Sect. 4, we present the simulation results. Finally, conclusions and future work are given in Sect. 5.
2 Mesh Router Placement Problem
We consider a two-dimensional continuous area in which to deploy a number of mesh routers, and a number of mesh clients at fixed positions. The objective of the problem is to find a location assignment for the mesh routers in the two-dimensional continuous area that maximizes the network connectivity and mesh client coverage. Network connectivity is measured by the SGC, while the NCMC is the number of mesh clients that are within the radio communication range of at least one mesh router. An instance of the problem consists of the following:
– An area Width × Height, which is the considered area for mesh router placement. The positions of mesh routers are not pre-determined and are to be computed.
– The mesh routers, each having its own radio communication range, defining a vector of routers.
– The mesh clients, located at arbitrary points of the considered area, defining a matrix of clients.
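Both metrics can be computed directly from the router and client coordinates, as in the following illustrative Python sketch: the NCMC counts clients within the radio communication range of at least one router, and the SGC is the size of the largest connected component of the graph in which two routers are adjacent when their communication ranges overlap (centres within twice the radius).

import math

def covered(clients, routers, radius):
    # NCMC: clients within radius of at least one router.
    return sum(any(math.dist(c, r) <= radius for r in routers)
               for c in clients)

def giant_component(routers, radius):
    # SGC: size of the largest connected component of the
    # overlap graph, found by DFS.
    n, seen, best = len(routers), set(), 0
    adj = [[j for j in range(n) if j != i
            and math.dist(routers[i], routers[j]) <= 2 * radius]
           for i in range(n)]
    for s in range(n):
        if s in seen:
            continue
        stack, size = [s], 0
        seen.add(s)
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best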
3 Proposed System
In this section, we describe the proposed system. Figure 1 and Fig. 2 show the flowcharts of the CCM and the CCM-based HC system, respectively.

3.1 CCM for Mesh Router Placement Optimization
In our previous work, we proposed the CCM [18] for the mesh router placement optimization problem. The flowchart of the CCM is shown in Fig. 1. The CCM searches for a solution with maximized SGC; among the solutions generated, the one with the highest NCMC is the final solution. The operation of the CCM is as follows. First, the mesh clients are generated in the considered area. Next, a single point coordinate is randomly determined to be mesh router 1. Then, another single point coordinate is randomly determined to be mesh router 2. Each mesh router has a radio communication range. If the radio communication ranges of the two routers do not overlap, router 2 is deleted and a new point coordinate is randomly determined for mesh router 2. This process is repeated until the radio communication ranges of the two mesh routers overlap; then the next mesh routers are generated in the same way. If a newly generated mesh router has no overlap in radio communication range with any existing mesh router, it is removed and generated randomly again.
Fig. 2. Flowchart of HC method for mesh router placement optimization.
If the new mesh router's radio communication range overlaps that of any other mesh router, the next mesh router is generated. This process continues until the set number of mesh routers is reached.
This procedure creates a group of mesh routers that are connected to each other, without needing to derive the connected component using Depth-First Search (DFS) [21]. However, this method only creates a population of mesh routers in the considered area and does not take into account the locations of the mesh clients. Therefore, the procedure is repeated for a set number of loops, and each time we determine how many mesh clients are included in the radio communication range group of the mesh routers. The placement with the highest number of covered mesh clients during the repetition process is the solution.
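A minimal Python sketch of this construction is given below (illustrative only: routers are drawn uniformly at random in the Width × Height area, a new router is kept only if its range of radius r overlaps an existing one, i.e. the centres are within 2r, and the construction is repeated for a set number of loops, keeping the placement with the highest NCMC).

import math, random

def ccm_once(n_routers, width, height, radius):
    # Place routers one by one; each new router is re-drawn until its
    # range overlaps that of an already placed router, so the group is
    # connected by construction.
    routers = [(random.uniform(0, width), random.uniform(0, height))]
    while len(routers) < n_routers:
        p = (random.uniform(0, width), random.uniform(0, height))
        if any(math.dist(p, q) <= 2 * radius for q in routers):
            routers.append(p)
    return routers

def ccm(clients, n_routers, width, height, radius, loops=3000):
    # Repeat the construction and keep the placement covering most clients.
    def ncmc(routers):
        return sum(any(math.dist(c, r) <= radius for r in routers)
                   for c in clients)
    best = max((ccm_once(n_routers, width, height, radius)
                for _ in range(loops)), key=ncmc)
    return best, ncmc(best)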
3.2 CCM-Based HC for Mesh Router Placement Optimization
In this subsection, we describe the CCM-based HC system [19]. The flowchart of HC for the mesh router placement problem is shown in Fig. 2. The operation of the CCM-based HC system is as follows. First, we randomly select one of the mesh routers in the group of mesh routers obtained by the CCM as the initial solution, and change the coordinates of the chosen mesh router randomly. Then, we determine the NCMC for all mesh routers. If the NCMC is greater than the previous one, the changed mesh router placement becomes the current solution. If the NCMC is less than the previous NCMC, the changed mesh router coordinates are restored. This process is repeated until the set number of loops is reached. However, this process alone is inadequate for the mesh router placement problem, because depending on the router placement and the radio communication ranges, not all mesh routers may be connected. Therefore, it is necessary to create an adjacency list for the mesh routers each time the placement is changed, and to use DFS to find out whether all mesh routers are connected. The NCMC is evaluated only when all mesh routers are connected, and the solution is updated only when the NCMC is greater than that of the current solution. In this algorithm, to cover many mesh clients, the number of loops is not counted until the radio communication range of the mesh router whose coordinates were changed overlaps the radio communication range of one of the other mesh routers.
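The move-and-accept step of the HC search can be sketched as follows (again an illustrative implementation: the connectivity test is a DFS over the overlap graph, and a move is kept only if the routers remain connected and the NCMC increases).

import math, random

def hc(routers, clients, width, height, radius, loops=100000):
    # Hill climbing: move one random router; keep the move only if the
    # routers stay connected and the number of covered clients increases.
    def ncmc(rs):
        return sum(any(math.dist(c, r) <= radius for r in rs) for c in clients)

    def connected(rs):
        seen, stack = {0}, [0]
        while stack:
            v = stack.pop()
            for w in range(len(rs)):
                if w not in seen and math.dist(rs[v], rs[w]) <= 2 * radius:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == len(rs)

    best = ncmc(routers)
    for _ in range(loops):
        i = random.randrange(len(routers))
        old = routers[i]
        routers[i] = (random.uniform(0, width), random.uniform(0, height))
        if connected(routers) and ncmc(routers) > best:
            best = ncmc(routers)    # accept the move
        else:
            routers[i] = old        # revert the move
    return routers, best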
3.3 CCM-Based HC Reduction Method
In our previous system [19], when the system places a mesh router randomly, the placement area is the whole considered area. As the size of the considered area increases, the probability that the radio communication ranges of mesh routers overlap decreases, which increases the processing time. In the proposed system, Algorithm 1 is applied before the random mesh router generation process in order to reduce the execution time. In Algorithm 1, the maximum values of the random number ranges are Max-X and Max-Y, and the minimum values are Min-X and Min-Y, respectively.
Algorithm 1. Change the size of the mesh router placement area.
Input: Placement list of all mesh routers [X, Y].
Output: Area size for mesh router placement.
1: Max-X ← max X value of the placement list of all mesh routers + diameter of the radio communication range of mesh routers.
2: if Max-X > width of considered area then
3:   Max-X ← width of considered area.
4: end if
5: Min-X ← min X value of the placement list of all mesh routers − diameter of the radio communication range of mesh routers.
6: if Min-X < 0 then
7:   Min-X ← 0.
8: end if
9: Max-Y ← max Y value of the placement list of all mesh routers + diameter of the radio communication range of mesh routers.
10: if Max-Y > height of considered area then
11:   Max-Y ← height of considered area.
12: end if
13: Min-Y ← min Y value of the placement list of all mesh routers − diameter of the radio communication range of mesh routers.
14: if Min-Y < 0 then
15:   Min-Y ← 0.
16: end if
Algorithm 1 changes the size of the mesh router placement area according to the placement of the other mesh routers without changing the coverage performance, and it increases the probability that the radio communication ranges of mesh routers overlap, as illustrated by the sketch below.
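In Python, Algorithm 1 amounts to the clipped bounding box below (a direct transcription of the pseudocode, where d is the diameter of the radio communication range).

def placement_area(routers, width, height, radius):
    # Bounding box of all routers, expanded by the range diameter
    # and clipped to the considered area (cf. Algorithm 1).
    d = 2 * radius
    xs = [x for x, _ in routers]
    ys = [y for _, y in routers]
    min_x = max(min(xs) - d, 0)
    max_x = min(max(xs) + d, width)
    min_y = max(min(ys) - d, 0)
    max_y = min(max(ys) + d, height)
    return min_x, max_x, min_y, max_y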
4 Simulation Results
In this section, we evaluate the proposed system. The parameters used for the simulations are shown in Table 1. We considered 6 types of instances; their parameter settings are shown in Table 2. We performed the simulations 15 times for each instance. The simulation results are shown in Table 3 and Table 4 for the CCM and the CCM-based HC system, respectively. We also show the simulation results for the best SGC, average SGC, best NCMC and average NCMC. For each simulation result, the SGC is always maximized. In instance 1 and instance 2, the proposed system was able to cover most of the mesh clients on average. Table 5 shows the average processing time for each instance for the previous system and the proposed system. From Table 5, we can see that the processing time was reduced by applying Algorithm 1. The visualization results are shown in Figs. 3, 4, 5, 6, 7 and 8. We can see that the proposed CCM-based HC system can cover more mesh clients than the CCM.
Table 1. Parameters and values for simulations.

Width of considered area: 32, 64, 128
Height of considered area: 32, 64, 128
Number of mesh routers: 16, 32, 64
Radius of radio communication range of mesh routers: 2
Number of mesh clients: 48, 96, 192
Distributions of mesh clients: normal distribution, uniform distribution
Standard deviation for normal distribution: width of considered area / 10
Number of loops for CCM: 3000
Number of loops for HC method: 100000
Table 2. Parameter settings of instances.

Instance    Grid size   Distribution  Mesh routers  Mesh clients
Instance 1  32 × 32     Normal        16            48
Instance 2  64 × 64     Normal        32            96
Instance 3  128 × 128   Normal        64            192
Instance 4  32 × 32     Uniform       16            48
Instance 5  64 × 64     Uniform       32            96
Instance 6  128 × 128   Uniform       64            192
Table 3. Simulation results of CCM.

Instance    Best SGC  Average SGC  Best NCMC  Average NCMC [%]
Instance 1  16        16           48         92.778
Instance 2  32        32           67         63.889
Instance 3  64        64           77         36.528
Instance 4  16        16           17         31.806
Instance 5  32        32           18         14.305
Instance 6  64        64           15         6.528
Table 4. Simulation results of CCM-based HC.

Instance    Best SGC  Average SGC  Best NCMC  Average NCMC [%]
Instance 1  16        16           48         99.167
Instance 2  32        32           92         96.297
Instance 3  64        64           149        71.840
Instance 4  16        16           28         43.056
Instance 5  32        32           32         22.430
Instance 6  64        64           19         9.896
Table 5. Simulation results of processing time.

Instance    Previous system [sec]  Proposed system [sec]
Instance 1  2.241                  1.845
Instance 2  23.050                 15.743
Instance 3  108.034                64.438
Instance 4  6.669                  5.772
Instance 5  29.478                 23.541
Instance 6  143.509                101.429
[Scatter plots of mesh clients, mesh routers and their communication ranges: (a) Result of CCM; (b) Result of CCM-based HC.]
Fig. 3. Visualization results of instance 1.
[Scatter plots of mesh clients, mesh routers and their communication ranges: (a) Result of CCM; (b) Result of CCM-based HC.]
Fig. 4. Visualization results of instance 2.
[Scatter plots of mesh clients, mesh routers and their communication ranges: (a) Result of CCM; (b) Result of CCM-based HC.]
Fig. 5. Visualization results of instance 3.
[Scatter plots of mesh clients, mesh routers and their communication ranges: (a) Result of CCM; (b) Result of CCM-based HC.]
Fig. 6. Visualization results of instance 4.
[Scatter plots of mesh clients, mesh routers and their communication ranges: (a) Result of CCM; (b) Result of CCM-based HC.]
Fig. 7. Visualization results of instance 5.
[Scatter plots of mesh clients, mesh routers and their communication ranges: (a) Result of CCM; (b) Result of CCM-based HC.]
Fig. 8. Visualization results of instance 6.
5 Conclusions
In this paper, we proposed a processing-time reduction method for the CCM-based HC system and evaluated the performance of the system for different instances considering normal and uniform distributions. From the simulation results, we found that the proposed system was able to cover more mesh clients than the CCM for different instances, and that the processing time was reduced compared to the previous system. In the future, we would like to consider Simulated Annealing and Genetic Algorithms.

Acknowledgement. This work was supported by JSPS KAKENHI Grant Number JP20K19793 and a Grant for Promotion of OUS Research Project (OUS-RP-20-3).
References
1. Akyildiz, I.F., et al.: Wireless mesh networks: a survey. Comput. Networks 47(4), 445–487 (2005)
2. Oda, T., et al.: Implementation and experimental results of a WMN testbed in indoor environment considering LoS scenario. In: Proceedings of the IEEE 29th International Conference on Advanced Information Networking and Applications (IEEE AINA-2015), pp. 37–42 (2015)
3. Jun, J., et al.: The nominal capacity of wireless mesh networks. IEEE Wirel. Commun. 10(5), 8–15 (2003)
4. Oyman, O., et al.: Multihop relaying for broadband wireless mesh networks: from theory to practice. IEEE Commun. Mag. 45(11), 116–122 (2007)
5. Oda, T., et al.: Evaluation of WMN-GA for different mutation operators. Int. J. Space-Based Situated Comput. 2(3) (2012)
6. Oda, T., et al.: WMN-GA: a simulation system for WMNs and its evaluation considering selection operators. J. Ambient. Intell. Humaniz. Comput. 4(3), 323–330 (2013)
7. Ikeda, M., et al.: Analysis of WMN-GA simulation results: WMN performance considering stationary and mobile scenarios. In: Proceedings of the 28th IEEE International Conference on Advanced Information Networking and Applications (IEEE AINA-2014), pp. 337–342 (2014)
8. Oda, T., et al.: Analysis of mesh router placement in wireless mesh networks using Friedman test. In: Proceedings of the IEEE 28th International Conference on Advanced Information Networking and Applications (IEEE AINA-2014), pp. 289–296 (2014)
9. Oda, T., et al.: Effect of different grid shapes in wireless mesh network-genetic algorithm system. Int. J. Web Grid Serv. 10(4), 371–395 (2014)
10. Oda, T., et al.: Analysis of mesh router placement in wireless mesh networks using Friedman test considering different meta-heuristics. Int. J. Commun. Networks Distributed Syst. 15(1), 84–106 (2015)
11. Oda, T., et al.: A genetic algorithm-based system for wireless mesh networks: analysis of system data considering different routing protocols and architectures. Soft. Comput. 20(7), 2627–2640 (2016)
12. Sakamoto, S., et al.: Performance evaluation of intelligent hybrid systems for node placement in wireless mesh networks: a comparison study of WMN-PSOHC and WMN-PSOSA. In: Proceedings of the 11th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2017), pp. 16–26 (2017)
13. Holland, J.H.: Genetic algorithms. Sci. Am. 267(1), 66–73 (1992)
14. Skalak, D.B.: Prototype and feature selection by sampling and random mutation hill climbing algorithms. In: Proceedings of the 11th International Conference on Machine Learning (ICML-1994), pp. 293–301 (1994)
15. Kirkpatrick, S., et al.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)
16. Glover, F.: Tabu search: a tutorial. Interfaces 20(4), 74–94 (1990)
17. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of the IEEE International Conference on Neural Networks (ICNN-1995), pp. 1942–1948 (1995)
18. Hirata, A., et al.: Approach of a solution construction method for mesh router placement optimization problem. In: Proceedings of the IEEE 9th Global Conference on Consumer Electronics (IEEE GCCE-2020), pp. 1–2 (2020)
19. Hirata, A., et al.: A coverage construction method based hill climbing approach for mesh router placement optimization. In: Proceedings of the 15th International Conference on Broadband and Wireless Computing, Communication and Applications (BWCCA-2020), pp. 355–364 (2020)
20. Hirata, A., et al.: Simulation results of CCM based HC for mesh router placement optimization considering two islands model of mesh clients distributions. In: Proceedings of the 9th International Conference on Emerging Internet, Data & Web Technologies (EIDWT-2021), pp. 180–188 (2021)
21. Tarjan, R.: Depth-first search and linear graph algorithms. SIAM J. Comput. 1(2), 146–160 (1972)
EPOQAS: Development of an Event Program Organizer with a Question Answering System Jun Iio(B) Faculty of Global Informatics, Chuo University, 1-18 Ichigaya-tamachi, Shinjuku-ku, Tokyo 162-8478, Japan [email protected]
Abstract. With the increase in the variety of communication channels, communicating via online chat has become common and is now considered an effective means of communication during real-time meetings. Various communication media, for example, traditional tools (such as Facebook Messenger, Twitter, and Google Docs) or novel technologies (such as Slido and Miro), can be used as side channels; however, all these media remain separate from the main meeting. Therefore, in this paper, we propose an event program organizer containing a simple chatting function that can be used for questions and answers; this system has been named the Event Program Organizer with a Question Answering System (EPOQAS). The system can be easily operated, and it has unique functions that enable the meeting participants to communicate with one another. In addition, the source code of the system is published as open-source software. Hence, implementing an instance of this system in research meetings will accelerate scientific discussions.

Keywords: Online communication tool · Event program organizer · Discussion support system
1 Introduction
In the 5th Science and Technology Basic Plan, the Japanese government proposed Society 5.0 as a future society that Japan should aspire to achieve [1]. Society 5.0 will succeed Society 4.0, the information society, in which there is a lack of cross-sectional knowledge sharing. If society moves to the next stage, "social reform (innovation) in Society 5.0 will achieve a forward-looking society that breaks down the existing sense of stagnation, a society whose members have mutual respect for each other, transcending the generations, and a society in which each and every person can lead an active and enjoyable life" [1]. Regardless of whether the society is organized as Society 4.0 or Society 5.0, information exchange frequently occurs at every event. Here, we focus on the effect of side-channel communication [2]. Side-channel communication is quite common these days. For example, during real-time meetings, participants often use a chat system or discuss with one another on Twitter. In such cases, the participants conduct multiple
discussions both offline (face-to-face) and online. It would be natural to consider the online discussion as a side-channel discussion in such situations. Simple notification service tools, such as Twitter, Facebook Messenger, and LINE, can serve as online side-channel communication tools. Occasionally, co-editing an online document, such as Google Docs, is used as the communication facility. In addition, some novel communication technologies, online whiteboard tools, and team collaboration software (e.g., Slido,1 Miro,2 and Strap3) have been proposed for sharing information. Those tools provide convenient ways for the meeting participants to discuss and communicate with one another. If the meeting is held as an online event, the side-channel communication tools can compensate for the narrow bandwidth of the main communication channel offered by the online meeting tools. However, the tools for side-channel communication are basically independent of the main meeting's information. Therefore, it is sometimes complicated for all participants to manage many tools smoothly without detailed explanation. To solve this problem, this paper proposes an event program organizer that contains a simple chatting function for questions and answers. The system is called the Event Program Organizer with a Question Answering System (hereinafter EPOQAS). The system is simple and can be easily operated. It implements unique functions that enable the meeting participants to communicate with one another. The source code of EPOQAS has been published as open-source software. Hence, it will accelerate scientific discussions if many research meeting organizers implement an instance of the system and operate it for their meetings. There are numerous studies on organizing conference programs. The Cobi program [3] collects preferences, constraints, and affinity data from community members. It also provides a visual interface for event organizers to improve the schedule by solving the constraints obtained from the collected data. Similar research has been reported by Bhardwaj et al. [4], who explored the design space of community sourcing by introducing attendee sourcing instead of the committee sourcing and author sourcing proposed by Cobi. Furthermore, there are many conference scheduling tools, such as POST (Moore et al.) [5], Frenzy (Chilton et al.) [6], iGenda (Costa et al.) [7], and the DODO event scheduler (Ali and Abbas) [8–11]. However, these tools mainly focus on the optimization of event scheduling. Our proposal does not focus on program optimization; it is a combination of an event scheduler and online chatting functions. The rest of this paper is structured as follows. In Sect. 2, the background of the problem is described. In Sect. 3, an overview of the system is presented. In Sect. 4, we propose an additional and useful function implemented in the system. In Sect. 5, we discuss the possibilities of this system. Finally, our conclusions are presented along with the future work in Sect. 6.
1 https://www.sli.do/. 2 https://www.miro.com/. 3 https://product.strap.app/.
2 Background of the Problem
As mentioned in Sect. 1, various collaboration tools are used to realize side-channel communication. Figure 1 shows typical examples using Google Docs (see Fig. 1(a)) and Slido (see Fig. 1(b)). Both tools were prepared to compensate for the narrow bandwidth of the main communication channels provided by the Zoom or Webex online meeting software. The 47th Sig-cyberworld research meeting and the OSC 2021 Spring/Tokyo meeting were held online because of the COVID-19 pandemic. All online meetings have the disadvantage of a narrower information bandwidth for communication than a face-to-face meeting. Lately, organizers of online meetings have been preparing information sharing systems for side-channel communication.
Fig. 1. (a) On the left is a screenshot of a Google document shared by the participants in the 47th Sig-cyberworld research meeting held on the 4th of March, 2021. (b) On the right is the Slido prepared for communication among the participants in the lightning talk session of the Open-Source Conference (OSC) 2021 Spring Tokyo session held on the 6th of March, 2021.
Several event program organization systems have been proposed. The online event support system OLiVES [12] is one of the systems used for organizing event programs. The main objective of OLiVES is to provide serendipity for the event participants. If an event is held offline, the participants have opportunities to join a seminar even when they had not considered joining earlier, which results in better experiences. However, if the event is held online, such serendipity cannot be expected. The OLiVES system therefore provides a recommendation function based on a collaborative filtering algorithm, offering last-minute opportunities to participate in seminars. A problem arises when managing both systems, that is, the event program management system and the information sharing system providing simultaneous side-channel communication. Although both systems can be used only by registering a user account, the user accounts (or user IDs) are neither identical nor tied to one another, except in the special case where a single sign-on architecture is used. Alternatively, we could allow anonymous access to the system. This would be a simple solution, but it could lead to unnecessary discussions.
Therefore, the event organizer has to struggle to manage several types of user accounts: Google accounts, Twitter accounts, Facebook accounts, and unique accounts for our own systems, which makes the problem very complicated. To avoid this complication, we propose EPOQAS, a novel event program organizing system. This system has an online discussion function in addition to an event program database. The user accounts are merged into one ID for both functions, and the use cases are simplified.
3 System Overview
EPOQAS is implemented using Ruby on Rails (version 6.1.3, with Ruby 2.7.1 and PostgreSQL 12.2) in the local development environment. The system has been deployed on the Heroku platform (https://epoqas.herokuapp.com/). The entity-relationship diagram (ERD) of the system is shown in Fig. 2. The two tables Meeting and Talk are the main databases that store the meeting information; each meeting record has several talks (presentations). The gem "devise" introduces the User table. The Post and Comment tables, which are related to the Talk and User tables, represent the online questions and answers.
Fig. 2. ERD of the system. The two tables (Meeting and Talk) are the main databases storing the main meeting information. The other two tables (Post and Comment) represent the discussions on the questions and answers related to the topic presented at the Talk.
The questions and answers are structured in two layers. The entry in the Post table represents the first level of comment from users. It can be just a comment or a question to the speaker. It is tied to the entry in the Talk. Hence, every talk has a record of the discussions on the topic. The entry in the Comment table is basically a response to the comment or an answer to the question stored in the Post table. Both entries are related to the entry in the User table, which represents the submitter of the comment, question, and answer. Figures 3 and 4 are the screenshots of EPOQAS. Once a user accesses the top page of EPOQAS, the list of meetings appears. After selecting the meeting by clicking on the meeting link, the meeting program is displayed, as shown in Fig. 3.
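The two-layer structure can be mirrored schematically as follows. This Python dataclass sketch only restates the ERD of Fig. 2 for illustration; the actual system defines these as Rails ActiveRecord models, and the field names here are illustrative.

from dataclasses import dataclass, field
from typing import List

@dataclass
class User:
    name: str
    role: str = "member"        # e.g. "admin", "reviewer" (see Sect. 4)

@dataclass
class Comment:                   # second layer: reply / answer to a Post
    author: User
    body: str

@dataclass
class Post:                      # first layer: comment / question on a Talk
    author: User
    body: str
    comments: List[Comment] = field(default_factory=list)

@dataclass
class Talk:
    title: str
    posts: List[Post] = field(default_factory=list)

@dataclass
class Meeting:
    name: str
    talks: List[Talk] = field(default_factory=list)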
Fig. 3. Screenshots of EPOQAS. (a) The figure on the left shows a meeting program. Each entry of the presentation can be accessed by clicking the link in the program. (b) The right figure shows the information of a presentation. This page does not have any comments and questions.
Fig. 4. Screenshot of the EPOQAS showing a dialog (a discussion thread) involving questions and answers.
The title of each presentation is clickable. When a user clicks on a presentation link shown in Fig. 3, the information about the presentation appears, which includes the title, date, time, abstract, speaker names, and type of talk (see Fig. 3(a)). The screenshot in Fig. 3(b) shows the case when a user is logged in. The user's name "Jun IIO (Chuo Univ)" is displayed at the center of the navigation bar at the top of the screen. If the page is accessed without logging in, the keyword "EPOQAS" is shown instead of the user's name, and the Login and Sign-up buttons appear at the top left and top right corners of the navigation bar instead of the MyPage and Logout buttons, respectively. When a user is logged in, the text area for entering a comment or a question appears; when a user is not logged in, this area does not appear. This is the main difference between the two cases. The window in Fig. 3(b) has a big text area with the placeholder message "Add your comment here." If a user accesses the page without logging in, the text area and the submit button are not displayed; therefore, no one can enter any
comments and questions. The registered comments, questions, and answers are listed at the bottom of the page. In the case shown on the right side of Fig. 3, there are no opinions yet, and only a message indicating that no comments are registered is displayed. A typical question-and-answer thread is shown in Fig. 4. After a comment or a question is registered on the presenter's page, anyone who is logged in to EPOQAS can reply to the message. A presenter's page can contain multiple discussion threads, so that many questions and answers can be recorded in the appropriate order.
4 Presentation Reviewing System
A further useful function is implemented in EPOQAS. Research meetings occasionally present awards for presentations or paper descriptions. However, if the award is given at the end of the meeting on the same day, the problem is that the time between the end of the presentations and the selection of the award is very short. The EPOQAS system provides an efficient evaluation system by using the following processes (see Fig. 5):
1. Reviewers watch the presentation and obtain information by viewing the presentation information provided by EPOQAS.
2. The presentation pages in EPOQAS also have a link to the evaluation form implemented with Google Forms. Reviewers can easily record their judgments using Google Forms.
3. The results are instantly calculated using the summarize function of the Google Spreadsheet.
Fig. 5. Illustration of the reviewing and evaluation process for the presentation. Reviewers can judge the presentation scores via Google Forms linked from the presentation page in EPOQAS.
As mentioned previously, the User table was introduced by the gem devise. An extra column “role” was added to manage the user’s responsibility (see Fig. 2). The users who have “admin” roles can access the dashboard of EPOQAS, and the users who have “reviewer” roles can access the link to the evaluation sheets and proceed with the reviewing process.
5 Benefits of EPOQAS
A major advantage of introducing EPOQAS is that event programs can be organized instantly. Registering meeting and presentation information is simple, and user account registration is not required if a user only wants to view the registered information.
The simple chatting system implemented in EPOQAS helps accelerate discussions between speakers and audiences. Although side-channel communication can be realized with other tools, such as Twitter and Slack, or by sharing documents, managing the accounts of each tool is somewhat complicated. EPOQAS stores the user accounts in its own database, which solves the problem of user accounts being scattered across tools.
In addition, it is important that the discussion threads are tied to each presentation. Access to the discussion is not restricted to the period of the meeting: even after the meeting is finished, anyone can add comments and questions, and the speaker can reply at any time. Presenters who are not accustomed to giving presentations may fail to record the questions asked and the replies they gave; the EPOQAS online question-and-answer chatting system supports them.
Furthermore, the effectiveness of the presentation evaluating function has already been confirmed. It was recognized as useful by several board members of the HCD-Net in their research meetings held in June 2020 and November 2020, when they used functions like the ones implemented in OLiVES. At that time, the duration to determine the HCD-Net research meeting presentation award was much shorter than before. Those two research meetings were held online, but the system can be used for offline meetings as well. Therefore, it was decided to use the system for future events, whether they are held online, offline, or in a hybrid format.
6 Conclusions and Future Work
In this study, we proposed an event program organizing system with a question-and-answer chatting function named EPOQAS. The system can manage event programs as well as the presentation information and discussions related to each presentation. EPOQAS can be operated intuitively and helps increase scientific discussion among the participants of research meetings. This is the first time that a system like this has been proposed; therefore, additional research is required to confirm the efficiency and effectiveness of EPOQAS, and the system itself needs to be developed further. Currently, EPOQAS can be used only in English and Japanese, so additional languages need to be supported. Further optimizations are also left for future studies.
Acknowledgments. The author would like to thank all the committee members of the sig-cyberworld, IEICE, for the fruitful discussion on this system proposal.
Estimating User's Movement Path Using Wi-Fi Authentication Log
Jun Yamano1(B), Yasuhiro Ohtaki1, and Kazuyuki Yamamoto2
1 Ibaraki University, Naka-narusawa 4-12-1, Hitachi, Ibaraki, Japan, {20nm732h,yasuhiro.ohtaki.lcars}@vc.ibaraki.ac.jp
2 Ibaraki University, Bunkyo 2-1-1, Mito, Ibaraki, Japan, [email protected]
Abstract. In this paper, we propose a method to estimate a user's movement path using Wi-Fi authentication logs. Large organizations often have many Wi-Fi access points (APs) which are centrally managed by a wireless LAN controller, usually with a roaming feature. While the user moves around the area covered by the Wi-Fi network, the AP they connect to changes. Each time the connected AP changes, an authentication such as IEEE802.1x is performed, and the credentials such as user ID and password are sent to an authentication server via the controller. This allows the controller to know who has connected to which access point and when. Our method tracks a certain person's movement path from the authentication log afterward. Assuming the authentication log records where roaming occurred, the area where the device was at that point can be estimated. By combining this with the structure of the building, we can roughly estimate the user's movement path.
1 Introduction
Numerous studies on Wi-Fi location estimation, both indoor and outdoor, have been conducted in the past [1–11]. To achieve higher accuracy, some methods require precise time measurement on the user's device or on the AP. Other methods require building a large database of fingerprints of reference points. However, most organizations already have APs in place and cannot add special devices to obtain additional information later. In this paper, we propose a method to estimate the user's movement path using existing Wi-Fi authentication logs. In large-scale organizations, many Wi-Fi access points (APs) are installed and centrally managed by a wireless LAN controller. In such environments, a roaming capability is generally provided, and the AP to connect to automatically changes as the user moves around the area covered by the wireless network. Whenever the AP the device is connecting to switches, an authentication protocol is performed. Usually, the IEEE802.1x authentication protocol is used, which requires a valid user ID and password. The credentials are sent to the authentication server via the controller. Therefore, the controller can know who has connected to which AP and when.
2 Related Works
Numerous studies on location estimation using Wi-Fi have been conducted in the past. These methods can be roughly divided into the following three types.
(1) Association Method. This is the simplest and most straightforward means of determining a wireless device's position in a Wi-Fi network. The association method uses only the information of which AP the mobile device is connected to. The average area covered by an AP can be represented by a sphere about 20 m in diameter centered on the AP, so a simple association method lacks granularity.
(2) Triangulation Method [1,2]. This method calculates the position of the user's device using triangulation based on the distances from multiple APs whose locations are known in advance. There are several ways to measure the distance between an AP and the device, such as those based on the Received Signal Strength Indicator (RSSI) or on the signal delay time. While these methods are relatively accurate, they usually require dedicated applications to measure the RSSI or delay time, and often require assistance from the OS and hardware.
(3) Scene Analysis Method (Fingerprinting) [8–11]. This is considered the most widely used technology in the field of indoor positioning. In this method, a Wi-Fi fingerprinting system records in a database the RSSI of the APs collected at each location point. To estimate the user's location, the RSSI observed by the user's device is compared with the fingerprints, and the most similar location is used as the estimate (a minimal sketch is given at the end of this section). However, building the fingerprint database involves a laborious site survey.
Estimation methods can also be classified by where the location is calculated:
• The user's device wants to find its own location.
• The organization wants to estimate the user's location.
No matter which method is used, the focus is basically on accurately estimating the current exact location of the user's device. In contrast, the goal of the proposed method is to estimate the movement path of a device. Note that the proposed method does not require accurate information about where the device was at a certain moment.
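To illustrate method (3), the following is a minimal nearest-neighbor fingerprinting sketch in Python. The fingerprint database and the observed RSSI vector are hypothetical examples; real systems collect RSSI vectors at surveyed reference points.

import math

# Fingerprint DB (hypothetical): reference point -> RSSI (dBm) from each AP
FINGERPRINTS = {
    (0.0, 0.0): {"ap1": -40, "ap2": -70, "ap3": -80},
    (10.0, 0.0): {"ap1": -70, "ap2": -45, "ap3": -75},
    (5.0, 8.0): {"ap1": -65, "ap2": -60, "ap3": -50},
}

def estimate_location(observed):
    """Return the reference point whose stored RSSI vector is closest
    (in Euclidean distance) to the observed RSSI vector."""
    def dist(stored):
        aps = stored.keys() & observed.keys()
        return math.sqrt(sum((stored[a] - observed[a]) ** 2 for a in aps))
    return min(FINGERPRINTS, key=lambda p: dist(FINGERPRINTS[p]))

print(estimate_location({"ap1": -68, "ap2": -58, "ap3": -52}))  # -> (5.0, 8.0)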
3 Authentication Log of Wi-Fi Network
In this section, we describe the authentication log that the proposed method uses for location estimation.
3.1 IEEE802.1x Authentication
In large organizations, APs are centrally managed by a wireless LAN controller. Access control when connecting to such a wireless network does not use a simple method such as a unique SSID–WPA key pair. Instead, IEEE802.1x, an authentication protocol that runs EAP (Extensible Authentication Protocol) over LAN and requires a valid user ID and password, is often used. IEEE802.1x authentication involves a client device, an AP, and an authentication server. In the authentication process, credentials are passed from the device to the controller, and the controller queries the authentication server as shown in Fig. 1. Therefore, the controller can observe the authentication processes and results of all connection attempts to the managed APs. This authentication information can be recorded in a log.
Fig. 1. Credentials are sent from AP to the controller and then queried to the authentication server.
The authentication information recorded in the log includes the date and time, the identifier of the connected AP, the user ID, the MAC address of the user's device, and the result of the authentication (success or failure).
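As a concrete illustration, the Python sketch below parses one such log record into the fields listed above. The comma-separated layout, field order, and user ID are assumptions for illustration; actual controllers use vendor-specific formats (e.g., RADIUS or syslog output).

from dataclasses import dataclass
from datetime import datetime

@dataclass
class AuthRecord:
    time: datetime     # date and time of the authentication
    ap_id: str         # identifier of the connected AP
    user_id: str       # user ID presented by the client
    mac: str           # MAC address of the user's device
    success: bool      # result of the authentication

def parse_line(line: str) -> AuthRecord:
    # Hypothetical layout: timestamp,ap_id,user_id,mac,result
    ts, ap_id, user_id, mac, result = line.strip().split(",")
    return AuthRecord(datetime.fromisoformat(ts), ap_id, user_id, mac,
                      result == "success")

rec = parse_line("2020-11-01T10:15:30,AP-3F-12,user01,aa:bb:cc:dd:ee:ff,success")
print(rec.ap_id, rec.success)  # AP-3F-12 True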
3.2 Authentication During Roaming
By broadcasting the same SSID from every AP, a roaming feature can be achieved. As the user moves, the AP to which the user's device is connected switches to a more optimal one, and even when the AP is switched, the upper-level session can be kept from being disconnected. Internally, when the client switches from the currently connected AP to another one, it must go through all the steps from association to authentication. Thus, if IEEE802.1x authentication is used, the EAP authentication process is performed to connect to the new AP every time roaming occurs. It is the client side that determines the best AP from among multiple APs at a certain location and connects to it. The client connects to the AP that is judged to be the best overall, based on signal strength, quality, etc. The criteria for this judgment depend on the client's hardware and software.
However, most clients do not change the connected AP even if there is a more suitable AP nearby, because they want to keep the current session connected. This behavior is called the sticky client problem. In centrally managed Wi-Fi networks, the controller monitors the RSSI of the client and disconnects the client from the AP when it drops below a set threshold [12]. As shown in Fig. 2, the location where roaming occurs depends on the direction of movement.
Fig. 2. Roaming occurs as the client moves, and authentication is performed each time. The location where roaming occurs depends on the direction of movement.
4 Methodology
Since the authentication log only records the identifier of the AP, our method can be categorized as an association method. Let us assume that the user moved with a device already connected to the Wi-Fi network, that roaming occurs when the RSSI of the currently connected AP drops, and that the reconnected AP is the most optimal one. Figure 3 illustrates this situation. If we have a sequence of authentication log entries at two APs, we can assume that roaming occurred in an area where the following two conditions were simultaneously satisfied:
(a) The RSSI of the AP to which the device was originally connected is close to the threshold for disconnection.
(b) In the Voronoi diagram of the APs, the area is the one closest to the newly connected AP.
Therefore, the area hatched in yellow is the estimated location where the roaming occurred. Compared with the simple association method, this region can be expected to be much smaller. A grid-based sketch of these two conditions is given below. Of course, this is an ideal situation; in reality, the story is not so simple. As radio waves pass through walls, bookshelves, and other obstacles, the RSSI drops significantly, and the area covered by an AP deviates greatly from a sphere.
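The following Python sketch evaluates conditions (a) and (b) over a grid, assuming spherical coverage. All coordinates, the threshold radius, and the tolerance are hypothetical placeholders.

import numpy as np

# Hypothetical AP positions (m): device roams from "old" to "new"
aps = {"old": np.array([0.0, 0.0]), "new": np.array([18.0, 0.0]),
       "other": np.array([9.0, 14.0])}
r_thresh, tol = 10.0, 1.0   # disconnection-threshold radius and its tolerance

xs, ys = np.meshgrid(np.linspace(-5, 25, 300), np.linspace(-15, 15, 300))
pts = np.stack([xs, ys], axis=-1)
d = {name: np.linalg.norm(pts - pos, axis=-1) for name, pos in aps.items()}

cond_a = np.abs(d["old"] - r_thresh) < tol                 # (a) RSSI near threshold
cond_b = (d["new"] < d["old"]) & (d["new"] < d["other"])   # (b) Voronoi cell of new AP
area = cond_a & cond_b
print("estimated roaming area:", int(area.sum()), "of", area.size, "grid cells")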
Fig. 3. Two consecutive authentication log entries give us a hint where the user was when roaming occurred.
From the authentication log, we can retrieve the sequence of entries for a given user ID and device MAC address, whether the authentications were successful or not. As Fig. 4 shows, it might be possible to estimate the movement path of the device by connecting, in chronological order, the areas where roaming seems to have occurred.
Fig. 4. By connecting the areas where roaming occurred, we can estimate the movement path of the device.
Let us now consider illustrating the movement path of the device. Assume that we have a sequence of authentication log entries with timestamps t1, t2, and t3 at different APs, as shown in Fig. 5. If the device roams as a sticky client, its location at times t1, t2, and t3 would be the yellow areas, which lie slightly closer to the new AP than the midpoint between the two APs. By connecting these yellow areas, we get a rough movement path of the device. If we can combine more information, such as the elapsed time or the structure of the building, the area where each roaming occurred can be narrowed considerably, as shown by the red outlined areas.
Fig. 5. Simplified movement path estimation using authentication log. Combining with the building structure, we may narrow down the estimated area.
5 Verification Experiment
We conducted an experiment to verify our method. The APs managed by our university are located mainly in the buildings with classrooms. We walked inside a building, mostly in the corridors, carrying a device connected to the Wi-Fi network. Since we did not know in advance when the authentication log would be output, we recorded the time and location many times throughout the walk so that we could find out later when and where we were. Afterwards, we extracted the log of the day of the experiment from the authentication log. Figure 6 shows the sequence of authentication log entries and the estimated location range of the device when each log entry was output. The blue points indicate the locations of the APs. The yellow areas show the estimated locations, which are calculated with the proposed method from the locations of the connected APs before and after roaming. Although these areas are still rather large, they are much smaller than the areas simply associated with the APs (dotted circles). Figure 7 shows the actual movement path and the points where the authentication log was output. We can see that the device is quite sticky and is not always connected to the nearest AP. The actual location of the device when the authentication log was output is close to the estimated area. If we can combine more information, such as the elapsed time or the structure of the building, the accuracy of the estimated location may be improved. If the estimated area can be narrowed down to the area surrounded by the red line, we can estimate the movement path by connecting these areas.
Fig. 6. The estimated location range of the device when the authentication log is output.
Fig. 7. The actual movement path and the point where the authentication log was output.
6 Prototype System Implementation
We are now implementing a prototype system. This system uses the actual authentication logs collected from the controllers which manage all APs in our university. Figure 8 shows the structure of the prototype system. It consists of two major parts: a system that performs anonymization processing and a retrieval system. Since the authentication log is considered to contain sensitive information, it is stored in the retrieval system after a simple anonymization process. The anonymization is performed with a keyed hash function on the organization's administrative side, so only the activities of those who are allowed to be tracked can be searched. Figure 9 is a screenshot of our prototype. Given a (hashed value of a) user ID and the period to be searched, the retrieval system outputs the sequence of authentication log entries that match the conditions. We are now working on a visualizer which renders a graphical output showing the estimated locations and the movement path.
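A minimal Python sketch of the keyed-hash anonymization follows. The paper only states that a keyed hash function is used; the choice of HMAC-SHA256, the key, and the user ID are assumptions for illustration.

import hmac
import hashlib

SECRET_KEY = b"kept-by-the-organization"   # held by the administrative side only

def anonymize(user_id: str) -> str:
    """Keyed hash of a user ID; the retrieval system stores and is
    queried with this token instead of the raw user ID."""
    return hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()

# Only a party holding the key can compute the token for a user who is
# allowed to be tracked, which restricts who can search the log.
print(anonymize("user01")[:16], "...")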
Fig. 8. Prototype system overview
Fig. 9. Screen shot of the prototype system.
7 Conclusion
In this paper, we proposed a method for estimating a user's movement path from the authentication logs collected by a Wi-Fi controller. From the authentication log, we can know the unique identifiers of the APs to which the user's device connected before and after roaming. By taking the direction of movement into account, the area where roaming occurred can be estimated, and by connecting these areas in chronological order, the user's movement path can be estimated. If the structure of the building is combined, the accuracy can likely be improved further. The prototype system displays the sequence of authentication log entries of the connected APs given a hashed value of a user ID and the period to be searched.
Acknowledgments. We would like to thank the Center for Information Technology, Ibaraki University, for facilitating the collection of the AP authentication log.
References
1. Lim, C., Wan, Y., Ng, B., See, C.S.: A real-time indoor WiFi localization system utilizing smart antennas. IEEE Trans. Consumer Electron. 53(2), 618–622 (2007). https://doi.org/10.1109/TCE.2007.381737
2. Soltanaghaei, E., Kalyanaraman, A., Whitehouse, K.: Multipath triangulation: decimeter-level WiFi localization and orientation with a single unaided receiver. In: MobiSys 2018: Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services, pp. 376–388, June 2018. https://doi.org/10.1145/3210240.3210347
3. Sapiezynski, P., Stopczynski, A., Gatej, R., Lehmann, S.: Tracking human mobility using Wi-Fi signals. PLoS One 10(7), e0130824 (2015). https://doi.org/10.1371/journal.pone.0130824
4. Chaitany, K., Simma, J., Mammoli, A., Bogus, S.M.: Real-time occupancy estimation using WiFi network to optimize HVAC operation. Procedia Comput. Sci. 155, 495–502 (2019)
5. Khanh, T.T., Van Dung, N., Pham, X.-Q., Huh, E.-N.: Wi-Fi indoor positioning and navigation: a cloudlet-based cloud computing approach. Human-centric Computing and Information Sciences, vol. 10, Article number: 32 (2020)
6. Jaisinghani, D., Balan, R.K., Naik, V., Misra, A., Lee, Y.: Experiences & challenges with server-side WiFi indoor localization using existing infrastructure. In: Proceedings of the 15th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (MobiQuitous 2018), Association for Computing Machinery, New York, NY, USA, pp. 226–235. https://doi.org/10.1145/3286978.3286989
7. El-Naggar, A., Wassal, A., Sharaf, K.: Indoor positioning using Wi-Fi RSSI trilateration and INS sensor fusion system simulation. In: SSIP 2019: Proceedings of the 2019 2nd International Conference on Sensors, Signal and Image Processing, pp. 21–26 (2019). https://doi.org/10.1145/3365245.3365261
8. Wang, X., Wei, X., Liu, Y., Yang, K., Du, X.: Fingerprint-based Wi-Fi indoor localization using map and inertial sensors. Int. J. Distribut. Sensor Networks 13(12) (2017). https://doi.org/10.1177/1550147717749817
9. Ali, M.U., Hur, S., Park, Y.: Wi-Fi-based effortless indoor positioning system using IoT sensors. Sensors 19, 1496 (2019). https://doi.org/10.3390/s19071496
10. Shin, H.-G., Choi, Y.-H., Yoon, C.-P.: Movement path data generation from Wi-Fi fingerprints for recurrent neural networks. Sensors 21, 2823 (2021). https://doi.org/10.3390/s21082823
11. Li, Q., Fan, Q., Gao, X., Hu, S.: A real-time location-based services system using WiFi fingerprinting algorithm for safety risk assessment of workers in tunnels. Mathematical Problems in Engineering, vol. 2014, pp. 1–10 (2014). https://doi.org/10.1155/2014/371456
12. Hewlett Packard Enterprise Company: RF and Roaming Optimization for Aruba 802.11ac Networks (2019)
Integrating PPN into Autoencoders for Better Information Aggregation Performance
Yudai Okui1, Tatsuhiro Yonekura2, and Masaru Kamada1(B)
1 Ibaraki University, Hitachi 316-8511, Japan, [email protected]
2 National Institute of Technology, Ibaraki College, Hitachinaka 312-8508, Japan
Abstract. The Pulse-in Pattern-out Network (PPN) presented by Yonekura et al. in 1991 has gained a reputation for providing good decoders, whereas it does not provide any encoders. It may be possible, by integrating the PPN into standard neural network models with encoding and decoding capabilities, to enhance their decoding capability and thus improve their overall performance. In this study, we integrate the PPN into the four-layer autoencoder model so that the two network models share the same information aggregation layer and the subsequent layers, and we let the input layer of the autoencoder model and that of the PPN accept the input data in turn. It was clarified that the integration of the PPN into autoencoders improves the information aggregation performance measured by the restoration errors, especially in the cases where the information aggregation layer has fewer nodes and the third layer has many nodes.
1 Introduction
The Pulse-in Pattern-out Network (PPN) [1] presented by Yonekura et al. in 1991 has gained a reputation for providing good decoders, whereas it does not provide any encoders. It may be possible, by incorporating the PPN into standard neural network models with encoding and decoding capabilities, to enhance their decoding capability and thus improve their overall performance. In this study, we integrate the PPN into the autoencoder model [2] so that the two network models share the same information aggregation layer and the subsequent layers. The number of layers in the autoencoder model was set to four. Then we let the input layer of the autoencoder model and that of the PPN accept the input data in turn. The models were implemented with the Chainer framework. Restoration errors of the variant models with different numbers of nodes in the second and third layers were evaluated for four sets of test data of different dimensions. The results show that the PPN-integrated model performed better, giving smaller restoration errors, in 93 cases out of the total 204 cases. The PPN-integrated model achieved smaller errors on the average with smaller variances, especially for lower dimensional data. For higher
Fig. 1. Four-layer autoencoder model
dimensional data, the PPN-integrated model worked better in the cases where the third layer had many nodes. On the contrary, it worked worse in the cases where the information aggregation layer had many nodes. But that would not be a problem, since autoencoders with many information aggregation nodes do not aggregate information in the first place.
2 Preparation
2.1 Autoencoder
The autoencoder is a kind of multi-layer neural network, which can be used as an information compression method for dimensionality reduction. The autoencoder is trained so that the input is restored at the output through a few nodes in the hidden layer. By finding the fewest nodes that retain proper restoration of the input data, we know that the characteristic and essential information of the input data is compressed at those nodes in the hidden layer. Figure 1 shows the four-layer autoencoder model, where the information is compressed at the second layer, called the information aggregation layer. The process of compressing the input data is called the encoder, and the process of restoring the input from the aggregated information is called the decoder. The training process adjusts the weights (w) and biases (b) so that the output data become close to the training data given to the input. The numbers of nodes in the input layer and output layer are the same as the dimensionality of the training data.
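A minimal Chainer sketch of this four-layer autoencoder follows, assuming the activation choices stated later in the experimental conditions (linear input/output connections, ReLU between the second and third layers). The class name and layer sizes are placeholders, not the authors' actual code.

import chainer
import chainer.functions as F
import chainer.links as L

class FourLayerAE(chainer.Chain):
    def __init__(self, n_in, n_agg, n_third):
        super().__init__()
        with self.init_scope():
            self.enc = L.Linear(n_in, n_agg)      # input -> information aggregation layer
            self.mid = L.Linear(n_agg, n_third)   # aggregation -> third layer
            self.dec = L.Linear(n_third, n_in)    # third layer -> output (restoration)

    def forward(self, x):
        h2 = self.enc(x)             # linear connection to the aggregation layer
        h3 = F.relu(self.mid(h2))    # relu between the second and third layers
        return self.dec(h3)          # linear restoration at the output layer

    def loss(self, x):
        # Train so that the output is restored close to the input
        return F.mean_squared_error(self.forward(x), x)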
Fig. 2. Four-layer PPN model
2.2 PPN
The PPN (Pulse-in Pattern-out Network) is a network model where the input layer accepts pulses in the form of unit vectors. The network is trained to give the output close to each sample training data when one and only one input node corresponding to the sample is given 1 and the other input nodes are given 0. So the number of nodes in the input layer is equal to the number K of samples in the training data while the number of nodes in the output layer is the same as the dimensionality of the training data. Figure 2 shows the four-layer PPN model. Focusing on one connection from the input layer to the information aggregation layer in the second layer, we notice that its weight is updated only according to any one sample of the training data yielding 1 into its source node and that the weight is not affected while the network is trained by the other samples. So the composite pattern in the information aggregation layer is formed to lower the errors propagated from the subsequent layers without being affected by the input layer. That is why the PPN model is likely to organize a more desirable composite pattern in the information aggregation layer. In fact, the PPN model can configure a good decoder with appropriate information aggregation and restoration performances. However, the PPN model lacks any encoder functions due to its network structure.
Fig. 3. A model that integrates PPN into autoencoder
3 A Model that Integrates PPN into Autoencoder
We integrate the PPN into the autoencoder model in an attempt to bring the PPN's advantage of forming a good decoder into the autoencoder. Figure 3 shows a model where the PPN is fused with a four-layer autoencoder. In this model, the first layer, which is the input layer, consists of two parts: the autoencoder part and the PPN part. Each of the two input layer parts is connected to the second layer, which is the aggregation layer. The number of nodes of the autoencoder part in the input layer is the dimensionality of the training data, while the number of nodes of the PPN part is the number K of samples in the training data. The two input layer parts, autoencoder and PPN, are switched in turn at intervals of a specified number of epochs during training, as sketched below. After training for a specified number of switchings, we stop using the PPN part and use only the autoencoder part to continue training.
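The following Python sketch expresses this two-step switching schedule with the concrete numbers used in the experiment (switching at every epoch, 100 switchings, 10,000 epochs in total). It is a scheduling sketch only; actual training code would feed minibatches through the corresponding input part.

def input_part_for_epoch(epoch, n_switchings=100):
    """Return which input-layer part accepts the training data in this epoch."""
    if epoch < n_switchings:                       # first step: alternate every epoch
        return "autoencoder" if epoch % 2 == 0 else "ppn"
    return "autoencoder"                           # second step: autoencoder only

schedule = [input_part_for_epoch(e) for e in range(10000)]
print(schedule[:4], schedule.count("ppn"))  # ['autoencoder', 'ppn', ...] 50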
4 Experiment
We compare the information aggregation performance of the PPN-integrated four-layer autoencoder model against the original four-layer autoencoder in terms of the restoration errors in the training process for four standard test data sets with different dimensions. The experiment was carried out under the following conditions:
• The number of nodes in the input and output layers is determined by the training data.
• Linear functions are employed for the connections between the input layer and the second layer, and also between the third and fourth layers.
Table 1. Test data used in the experiment Data set
Iris [3]
Digits [4]
MNIST [5]
CIFAR-100 [6]
Iris varieties Handwritten images Handwritten images RGB images Dimensions
4
64
784
3,072
Number of samples 75
800
10,000
10,000
Table 2. Computing environment
OS        Ubuntu 18.04.3 LTS
CPU       Intel Xeon(R) CPU E5-1603 v4 @2.8 GHz ×4
GPU       GeForce RTX 2070 SUPER
Software  Chainer 7.1.0, CUDA toolkit V.10.2.89, cuDNN, anaconda3-4.2.0
• The relu functions are employed for the connections between the second and third layers.
• Various network models with widely varied numbers of nodes in the second and third layers are tried.
• Ten sets of training are performed for each one of the network models.
• The original four-layer autoencoder is trained for 10,000 epochs in the ordinary way.
• The PPN-integrated four-layer autoencoder is trained in two steps. In the first step, the input layer is switched between the autoencoder part and the PPN part in turn at every epoch. After switching 100 times (consisting of 50 epochs of training by the PPN part and 50 epochs of training by the autoencoder part in total), in the second step, the network is trained only by the autoencoder part for 9,900 epochs.
• The mean square restoration error $MSE = \frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)^2$ is used for the evaluation (see the sketch below), where $N$ is the dimensionality of the data, $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N$ are the values of the restored data from the fourth layer, and $y_1, y_2, \ldots, y_N$ are the values of the training data.
Table 1 lists the four data sets with different dimensions used for this experiment. Table 2 shows the specification of the computing environment for this experiment.
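As referenced in the last condition, the following numpy sketch computes the MSE and the ratio reported in Tables 3–6 (PPN-integrated vs. original). The arrays are random placeholders, not the experimental data.

import numpy as np

def mse(restored, target):
    return np.mean((restored - target) ** 2)

target = np.random.rand(100, 64)                        # e.g., 64-dimensional samples
out_orig = target + 0.05 * np.random.randn(*target.shape)   # stand-in: original AE output
out_ppn = target + 0.04 * np.random.randn(*target.shape)    # stand-in: PPN-integrated output

ratio = mse(out_ppn, target) / mse(out_orig, target)
print("ratio < 1 means the PPN-integrated model restored better:", ratio)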
5 Results
The performance is compared in terms of three indices: the minimum, maximum, and average values of the restoration error over the ten sets of trainings. In Tables 3, 4, 5 and 6, the ratio (restoration error of the PPN-incorporated autoencoder model)/(restoration error of the original autoencoder model) is listed for the different test data sets and for different numbers of nodes in the second and third layers. An index value less than 1 means that the PPN-integrated model performed better, achieving smaller restoration errors.
Table 3. Ratio of restoration errors by the PPN-incorporated autoencoder against that by original autoencoder (Iris data set)
Table 4. Ratio of restoration errors by the PPN-incorporated autoencoder against that by original autoencoder (Digits data set)
6 Discussion
From Tables 3, 4, 5 and 6, we see that the PPN-integrated model performed better, achieving smaller restoration errors, in 93 cases of the total 204 training results. In Table 3 for the Iris data set of lower dimensional data, the PPN-integrated model performed better in almost all the cases regardless of the numbers of nodes in the second and third layers. For higher dimensional data in the other tables, fewer cases appear in general where the PPN-integrated model outperforms the original autoencoder. However, focusing on the cases where there are fewer nodes in the second layer (the information aggregation layer) and more nodes in the third layer, we see that the PPN-integrated model often performs better than the original autoencoder. That is prominent in Table 6 for the CIFAR data set of higher dimensional data. The PPN-integrated autoencoder performs worse in most of the cases
Table 5. Ratio of restoration errors by the PPN-incorporated autoencoder against that by original autoencoder (MNIST data set)
Table 6. Ratio of restoration errors by the PPN-incorporated autoencoder against that by original autoencoder (CIFAR data set)
where the information aggregation layer has many nodes. But that would not be a problem since autoencoders with many information aggregation nodes do not aggregate information in the first place.
7 Conclusions
In this study, in an attempt to improve the information aggregation performance of the four-layer autoencoder, a new neural network model in which the existing four-layer autoencoder model and the PPN model share the information aggregation layer was constructed and implemented on the Chainer framework. In the training process of this model, the input layer of the autoencoder and the input layer of the PPN alternately accept the training data. We compared the information aggregation performance and found that there are some cases where
the PPN-integrated four-layer autoencoder performs better than the original autoencoder. In the future, it will be possible to grasp the characteristics of the PPN-integrated network in more detail by further varying the interval for switching between the PPN part and the autoencoder part in the first layer.
References
1. Yonekura, T., Miyazaki, S., Toriwaki, J.: Analysis of the data integration function of the four layer neural network based on the auto association model and PPN. IEICE Trans. J74–D2(10), 1398–1410 (1991)
2. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006)
3. Iris Data Set. https://archive.ics.uci.edu/ml/datasets/iris. Accessed 22 June 2021
4. Optical Recognition of Handwritten Digits Data Set. https://archive.ics.uci.edu/ml/datasets/optical+recognition+of+handwritten+digits. Accessed 22 June 2021
5. THE MNIST DATABASE of handwritten digits. http://yann.lecun.com/exdb/mnist/. Accessed 22 June 2021
6. The CIFAR-100 dataset. https://www.cs.toronto.edu/~kriz/cifar.html. Accessed 22 June 2021
An AR System to Practice Drums
Kaito Kikuchi1, Michitoshi Niibori2, and Masaru Kamada1(B)
1 Ibaraki University, Hitachi 316-8511, Japan, [email protected]
2 Learning-i Ltd., Nakanarusawa, Hitachi 316-0033, Japan
Abstract. An augmented reality (AR) system is developed that helps beginners practice the electronic drums. The player sitting at the drums is captured by a USB camera mounted on top of a large display device placed in front of the drums and the player. Their mirror image is shown on the display device in real time with superimposed graphical instructions that indicate when and which drum to bang in accordance with a musical note. Each bang on the drum set is sent from the drum controller as a MIDI signal to the PC, where it is compared with the musical note. Throughout the practice, the player only has to look straight at the display screen, because everything, including the drum sticks, arms, and legs of his/her own and the instructions, is integrated on the screen.
1 Introduction
It is so much fun to play the drums that there have been popular video games such as DrumMania [1], where the player hits colored pads at the timing when the chips of corresponding colors falling down the display touch the bottom of the screen. Such video games are useful also for practicing the real drums. They are, however, not as friendly as the augmented reality (AR) system for the guitar [2], where the player can see the instructions superimposed on the images of the real instrument and the player's hand. On the other hand, virtual reality (VR) drum systems such as Paradiddle [3] should be considered a new kind of instrument rather different from the traditional drums, because the virtual drums do not give any mechanical feedback to the player. It has been said that every drummer should sit with correct posture and look straight forward while playing the drums. Suppose that we place a large mirror in front of the player and the drum set. Then the player only has to look straight at him/herself in the mirror together with the drums. Nowadays, it is easy to replace the mirror with a large computer display device with a camera mounted on top. Besides, by superimposing instructions that indicate when and which drum to bang on the mirror image, we can help the player practice the drums better. The player only has to look straight at the display screen where everything, including the drum sticks, arms, legs of his/her own, and the instructions, is integrated. On the basis of this idea, we have developed an AR system that helps beginners practice the drums.
Fig. 1. Schematic diagram of the system
2 System Design
The schematic diagram of this system is shown in Fig. 1, and its final appearance is in Fig. 2. The player sitting at the electronic drums (Medeli electronic drums DD610J) is captured by a USB camera (Buffalo wide-angle CMOS Web camera BSW505MBK) mounted on top of a large display device (I-O DATA 43-in. 4K monitor EX-LD4K432DB) placed in front of the drums and the player. Their mirror image is shown on the display device in real time with superimposed graphical instructions that indicate when and which drum to bang along a given musical note. A green ring is placed to circle each drum or cymbal as shown in Fig. 3. The green rings shall be called target rings since they indicate the target instruments. In accordance with the musical note loaded from a text file, larger rings in different colors, such as the red and yellow ones in Fig. 4, appear around the green target rings. Those larger rings shall be called instruction rings, and they shrink toward the green target rings to indicate in advance the timing to strike the target instruments. The player watches an instruction ring shrinking and strikes the drum or cymbal at the moment when the instruction ring touches the green target ring. The electronic drum set controller detects the strike, encodes it in the MIDI format [4], and produces the respective sound. At the same time, the controller sends the MIDI signal via a MIDI-USB converter (Roland MIDI interface UM-ONE mk2) to the PC (Mouse Computer MB-K700SN2-M2S5-CPSC with the Windows 10 operating system, Core i7 9750H CPU at 2.60 GHz, 16 GB memory, and 512 GB NVMe storage), where the MIDI signal is compared with the musical note to check if the note is played on time. A praising message like "COOL!" appears on the green target ring as shown in Fig. 5 if the note is played correctly.
Fig. 2. Real overview of the system
Fig. 3. Initial screen where drums or cymbals are marked by green target rings
3 Implementation
The whole system has been implemented on the Unity platform [5]. The captured image, rings, and messages are represented by objects in Unity. The MIDI signals are also acquired under the control of Unity.
Fig. 4. Instructions indicated by red and yellow instruction rings shrinking toward the target instruments marked by the green target rings
Fig. 5. Praising message on hitting the right instrument at the right time
3.1 Video Processing
We use Vuforia Engine [6], which is a library for AR development on Unity, to create a game object called ARcamera that shows, in the Unity game background, the video image from the USB camera connected to the PC. To this ARcamera object, a material called Unlit/CorrectARCameraShader, created from a shader capable of flipping the left and right sides of the output image [7], is pasted. By activating the X flip option at the inspector of the material, the output image is flipped so that it looks like a mirror image.
The initial game screen is shown in Fig. 3. The ARcamera object is created and shows the horizontally flipped camera image in the background. Game objects for the green target rings to mark drums and cymbals are also created and placed in the foreground. The green target rings can be manually dragged on the screen with the mouse pointer to the positions of the target instruments. The positions can be stored in a file so that we do not have to re-adjust them unless we move the drum set or the USB camera.
3.2 Musical Note Acquisition
The musical note is prepared as a text file where pairs of the note number in the MIDI format (listed in Table 1) and its time in seconds are recorded in chronological order, as those in Table 2. The game object "Game Controller" reads the file and stores the data in the arrays noteNum[] and timing[] as indicated in Table 2.
Table 1. Note numbers in MIDI and the instruments
Note number in MIDI   Instrument
38                    Snare drum
48                    High-mid tom
45                    Low tom
43                    Floor tom
42                    Hi-hat closed
49                    Crash cymbal
51                    Ride cymbal
36                    Kick drum
Table 2. Example musical note given in the file
notesCount   noteNum[notesCount]      timing[notesCount]
             (note number in MIDI)    (time in seconds)
0            36                       2.425
1            49                       2.496
2            42                       2.856
3            42                       3.232
4            38                       3.232
5            42                       3.576
6            36                       3.937
7            42                       3.966
8            42                       4.367
...          ...                      ...
3.3 Instructions by Shrinking Rings
The shrinking instruction rings, such as the red and yellow ones in Fig. 4, indicating when and which instrument should be banged as the next notes, are also created as game objects. We use the functions CheckNextNotes and SpawnNotes in Listing 1. CheckNextNotes keeps calling SpawnNotes to create an instruction ring in advance, by the time offset of 1.6 s before the specified time of the note. SpawnNotes creates an instruction ring as a game object at the position of the green target ring deployed at the drums and cymbals in accordance with the correspondence in Table 1, by referring to the value of noteNum[notesCount]. The created instruction ring starts shrinking so that it touches the green target ring in 1.6 s. When the instruction ring gets smaller than the green target one, the game object representing the instruction ring is deleted.

Listing 1. CheckNextNotes

void CheckNextNotes() {
    // Spawn the instruction ring timeOffset (1.6 s) before the note time;
    // _timing[notesCount] == 0 marks the end of the note array.
    while (_timing[notesCount] - timeOffset < GetMusicTime()
           && _timing[notesCount] != 0) {
        SpawnNotes(_noteNum[notesCount]);
        notesCount++;
    }
}

Listing 2. Decoding MIDI signal

public MidiMessage(ulong data) {
    source = (uint)(data & 0xffffffffUL);   // sender source
    status = (byte)((data >> 32) & 0xff);   // MIDI status
    data1 = (byte)((data >> 40) & 0xff);    // MIDI message ID (note number)
    data2 = (byte)((data >> 48) & 0xff);    // data associated with data1
}
3.4 MIDI Signal Reception
The MIDI signal from the electronic drum set controller is received via a MIDI-USB converter by a game object called MIDI Receiver included in the Unity plugin distributed under the name "unity-midi-receiver-test" [8]. The MIDI signal is then decoded by Listing 2 according to its data structure. The received MIDI signal is compared with the instructed note. The MIDI message ID (data1 in Listing 2) is the note number of the instrument just hit, and it is first checked against the expected note number. Then the received time is compared with the expected time of the note. In the case that the note number is correct and the time difference is within an allowance, a text object "COOL!" appears at the upper left position of the green target ring, as in Fig. 5, for a short time.
4 Conclusions
We have developed an AR system for practicing the drums, mainly for novice players. The player can practice the drums by looking at a single screen integrating the whole drum set, his/her drum sticks, arms, and legs, and the ring-shaped instructions indicating in advance when and which instrument to bang.
Acknowledgement. The authors thank Daisuke Tanaka for his valuable and helpful discussions in the early stage of this study.
References
1. DrumMania. https://en.wikipedia.org/wiki/GuitarFreaks_and_DrumMania. Accessed 21 June 2021
2. Motokawa, Y., Saito, H.: Support system for guitar playing using augmented reality display. In: Fifth IEEE/ACM International Symposium on Mixed and Augmented Reality, ISMAR 2006, 22–25 October 2006, Santa Barbara (2006). https://doi.org/10.1109/ISMAR.2006.297825
3. Paradiddle. https://paradiddleapp.com/. Accessed 21 June 2021
4. MIDI Association. https://www.midi.org/. Accessed 21 June 2021
5. Unity. https://unity.com/. Accessed 21 June 2021
6. Vuforia engine developer portal. https://developer.vuforia.com/. Accessed 29 May 2021
7. Qiita Zine: How to display the camera image vertically and horizontally flipped. https://qiita.com/ELIXIR/items/d6cd335a697148f61c38. Accessed 21 June 2021
8. Takahashi, K.: unity-midi-receiver-test. https://github.com/keijiro/unity-midi-receiver-test. Accessed 21 June 2021
A Proposal of Learning Feedback System for Children to Promote Self-directed Learning
Yoshihiro Kawano1(B) and Yuka Kawano2
1 Department of Informatics, Tokyo University of Information Sciences, Chiba, Japan, [email protected]
2 Candy, Computer Science School, Tokyo, Japan
Abstract. Since the COVID-19 pandemic, many schools have promoted ICT-based learning activities such as online classes, online project-based learning (PBL), active learning, and learning portfolios. The authors have been studying the development of children's independence, in particular classifying and defining the skills required to promote self-directed learning and developing a learning support system for children. In this paper, the authors propose a learning feedback system for children based on clustering of the learning-data. Training data for clustering the features of learners were collected in community activities in 2020. The authors planned a new community relationship event, "Walk Adventure," in which online and onsite activities are linked to maintain social distancing. A feedback system based on the learning-data collected in this event and simple clustering was developed.
1 Introduction
Since 2020, the global pandemic of COVID-19 has significantly restricted face-to-face activities. COVID-19 greatly affected not only economic activities but also children's learning, by closing schools, promoting online classes, and discouraging extracurricular activities. Since the pandemic, many schools have promoted ICT-based learning activities such as online classes, online project-based learning (PBL), active learning, and learning portfolios. In Japan, programming education in elementary schools started full-scale in April 2020, and many programming classes have been launched at various locations across the country [1]. Programming education is supposed to cultivate logical thinking skills through cross-disciplinary and holistic inquiry-based learning, and to promote efforts to address social issues through programming [2]. On the other hand, face-to-face activities such as multi-generational exchange events, seasonal festivals, and lifelong learning in the community have a positive impact on children's independence; however, many of these activities were restricted. The authors have been studying the development of children's independence, in particular classifying and defining the skills required to promote self-directed learning and developing a learning support system for children [3, 4]. In previous works, the authors defined these skills in three categories, "Computational thinking", "ICT literacy", and "Social perspective", with the aim of facilitating the selection of proactive learning tasks
that will prepare them to tackle social issues. Focusing on these three perspectives, the authors have developed a learning-data collection system (henceforth referred to as the "collection system") that collects data regarding the developmental stages and learning activities of children as the first stage of the learning support system [5]. In this paper, the authors propose a learning feedback system for children based on clustering of the learning-data. Training data for clustering the features of learners were collected in community activities in 2020. "Children's City," which had been conducted in previous years, was not held because it concentrates participants in one place; instead, the authors planned a community relationship event, "Walk Adventure," in which online and onsite activities are linked to maintain social distancing. In this event, the onsite participants cooperate with the online participants to compete for the time to clear tasks while visiting local spots in the manner of a walk rally. To reduce the time required to answer the questionnaire at the event, the questionnaire items were limited to the minimum necessary. A feedback system based on the learning-data collected in this event and simple clustering was developed.
2 Self-directed Learning for Children
2.1 Skills Required to Promote Self-directed Learning
Children need to understand their own disposition and role in society in order to choose their own learning tasks. In order to know the interests of children and their potential contribution in relation to others, it is effective to expand their perspectives by experiential learning. The purpose of this study is to help children choose their own learning tasks, and the skills required for choosing their tasks are classified into three categories: "Computational thinking," "ICT literacy," and "Social perspective" (Fig. 1). These are essential skills for self-directed learning, corresponding to "Creativity", "Cooperativeness", and "Sociality", respectively. In order to identify the social-issue domains in which children can contribute, these three skills have been determined to be necessary.
2.2 Activities Corresponding to the Required Skills
Figure 1 shows the three activities corresponding to computational thinking, ICT literacy, and social perspective: the programming classroom, the IT Advisement Session, and the Children's City. These are the three activities that the authors promote corresponding to the aforementioned skills. The Walk Adventure planned in the 2020 activities is a community activity that corresponds to ICT literacy and social perspective in Fig. 1 (Fig. 2).
Fig. 1. Skills required to promote self-directed learning.
Fig. 2. Picture of walk adventure.
3 Learning System for Children
3.1 Design
In previous work, the authors developed a learning support system for children based on the aforementioned philosophy. As shown in Fig. 3, the level of achievement and satisfaction with respect to each activity is recorded and collected at the end of each learning activity corresponding to each skill, and the degree of similarity between the behavioral traits and interests of the children is calculated. While choosing a learning task, the children are encouraged to choose what they want to learn by being presented with the tasks that other children with highly similar behavioral traits and interests have worked on.
Fig. 3. Design of learning support system for children.
3.2 Learning-Data Collection System
As mentioned in the above discussions, a learning-data collection system (henceforth referred to as the "collection system") has been developed as the first stage of the learning support system for children. In this study, learning-data such as the type and difficulty of activities, the achievement level, and the next target are collected through an input interface, and a DB is used to record the data; this facilitates reviewing the learning activities later. For the collection items, the depth of the content and the way of asking can be adjusted according to the developmental stage of the target children, for example, regarding the level of achievement and satisfaction with the activities. According to the Ministry of Education, Culture, Sports, Science and Technology (MEXT) document "Characteristics of each stage of children's development and issues to focus on", in the early elementary grades, emphasis is placed on the judgement of right and wrong, developing a sense of normalcy in group life, and cultivating emotions [6]. In the collection system, a Web API is developed that can access the questionnaire information recorded in the DB. This API outputs a list of questions and choices in JSON format, with the learning activity and the developmental stage as input parameters; for example, the questions and choices are output when programming is specified as the learning activity and elementary school as the developmental stage. In addition, when storing questionnaire response data, it is desirable to be able to store and analyze a large amount of response data at high speed in a simple manner. Therefore, MongoDB, a document-oriented DB that can store data in a hierarchical structure, is adopted.
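The Python sketch below illustrates such a Web API. Flask and the in-memory list stand in for the real server and MongoDB collection; the route, parameter names, and example data are all hypothetical, not the system's actual API.

from flask import Flask, jsonify, request

app = Flask(__name__)

# Stand-in for the questionnaire collection in MongoDB
QUESTIONS = [
    {"activity": "programming", "stage": "elementary",
     "question": "What did you enjoy?", "choices": ["making", "sharing"]},
    {"activity": "walk_adventure", "stage": "elementary",
     "question": "Why did you enjoy it?", "choices": ["teamwork", "spots"]},
]

@app.route("/api/questions")
def questions():
    # Learning activity and developmental stage are the input parameters
    activity = request.args.get("activity")
    stage = request.args.get("stage")
    hits = [q for q in QUESTIONS
            if q["activity"] == activity and q["stage"] == stage]
    return jsonify(hits)   # list of questions and choices in JSON format

if __name__ == "__main__":
    app.run()  # e.g., GET /api/questions?activity=programming&stage=elementary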
Fig. 4. Architecture of the system.
To smooth data collection, the authors adopted Vue.js, a framework for building UIs for Web applications. The system dynamically generates questionnaires from the data in the DB via the Web API, based on the grade and learning activity selected by the child. This makes it possible to reactively switch the display of the questionnaire items according to the operation without any webpage transition. Figure 4 shows the architecture of the system based on the above design, and Figs. 5 and 6 show the UI of the collection system. According to the schematic in Fig. 4, when users access the system, they select their grade and the learning activity they experienced (Fig. 5), and the questions and choices are displayed according to the input (Fig. 6). The system obtains the questions and choices via the API, with the grade and the learning activity as input parameters.
Fig. 5. Web page of the collection system UI.
Fig. 6. Questionnaire generated based on grades and learning activities.
3.3 Feedback System
The authors have developed a feedback system based on the learning-data collected in the 2020 Walk Adventure activities. The questionnaire was limited to two items to reduce the time required to answer, namely why the participants enjoyed it and what they did. The modification of the collection system was completed by recording the questionnaire items for the Walk Adventure in the DB shown in Fig. 4. Currently, the feedback system requires manual data formatting steps, as follows (a sketch of the clustering steps is given after the list).
1. Collect learning-data by using the collection system.
2. Convert the learning-data from JSON to CSV format manually.
3. Load the file from step 2 and cluster it with the K-means method.
4. Standardize the matrix of clustering results.
5. Execute principal component analysis and present the results.
6. Record the vectors of the first and second principal components, the coordinates of the centers of gravity, and the eigenvalues to the DB.
7. Calculate the clustering results for new response data by using the values from step 6.
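As referenced above, the following Python sketch covers steps 3–6 and the scoring of a new response (step 7) with scikit-learn. The file name and the choice of three clusters (following Fig. 7) are assumptions, not the authors' actual code.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.loadtxt("learning_data.csv", delimiter=",")   # output of step 2

kmeans = KMeans(n_clusters=3, n_init=10).fit(X)      # step 3: K-means clustering
scaler = StandardScaler().fit(X)                     # step 4: standardization
pca = PCA(n_components=2).fit(scaler.transform(X))   # step 5: principal components

# Step 6: the quantities to record in the DB for later use
stored = {"components": pca.components_,             # 1st/2nd principal axes
          "eigenvalues": pca.explained_variance_,
          "centroids": kmeans.cluster_centers_}

def place_new_response(x):                           # step 7
    """Cluster label and 2-D coordinates for one new response vector."""
    label = kmeans.predict(x.reshape(1, -1))[0]
    coords = pca.transform(scaler.transform(x.reshape(1, -1)))[0]
    return label, coords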
Fig. 7. Screenshot of the feedback system.
Because feedback on the day of the event was not possible due to the manual operations, a feedback system presenting simple aggregate results was rapidly developed (Fig. 8). In Fig. 8, the results presented the number of people who gave answers similar to the learner's and the degree of similarity with those who achieved the task. A total of 80 learning-data records were obtained from 33 people over the two days, and these were used as training data for the feedback system. Clearing a mission at a spot, answering the questionnaire, and receiving feedback on the activity were together regarded as one trial, and the relationship between the number of trials and achieved factors was investigated. In this experiment, the number of achieved factors was regarded as the outcome of self-directed learning. Table 1 shows the number of trials and the average of achieved factors. From Table 1, it was confirmed that the average value of achieved factors increased after two or more trials. In the case of five or more trials, the average value of achieved factors was almost the same as that of the first trial. Although more experimental data and detailed analysis are required, it is considered that repeated learning and feedback in community activities contribute to self-directed learning.

The following issues remain for evaluating the degree of self-directed learning in community activities.

1. Feedback on the learning activity should be available in real-time.
2. An assessment methodology for self-directed learning should be established.

To solve issue 1, the manual steps of the feedback system should be automated. Figure 9 shows an image of the conversion of the learning-data from JSON to CSV format for machine learning; an automatic data formatting process is required before machine learning (a minimal automation sketch follows Fig. 9). For issue 2, it is necessary to consider feedback methods that encourage learners to choose learning tasks for self-directed learning, as well as assessment metrics for learning effectiveness.
Fig. 8. Feedback system at the time of the event.
Table 1. The number of trials and the average of achieved factors.

Trials    1 time   2 times   3 times   4 times   5 times over
Average   1.619    2.650     2.267     2.364     1.692
People    21       20        15        11        13
Fig. 9. Conversion of the learning-data from JSON to CSV format for machine learning.
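As a minimal sketch of automating this conversion (step 2 of the pipeline above), assuming Node.js and hypothetical field names:

// Sketch of JSON -> CSV conversion; the input file layout and the
// field names (grade, activity, answers) are assumptions.
const fs = require('fs');

function toCsv(records, questionIds) {
  const header = ['grade', 'activity', ...questionIds].join(',');
  const rows = records.map(r =>
    [r.grade, r.activity, ...questionIds.map(q => r.answers[q] ?? '')].join(','));
  return [header, ...rows].join('\n');
}

const records = JSON.parse(fs.readFileSync('learning-data.json', 'utf8'));
const questionIds = ['what_did', 'why_enjoyed'];  // the two questionnaire items
fs.writeFileSync('learning-data.csv', toCsv(records, questionIds));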
4 Related Works

With respect to the learning process of children, a study on community cooperation for welfare that fosters the independence of children [7] and a study on the necessity of various learning opportunities for children [8] have been published. With regard to the development of children's autonomy, the book "The 7 Habits of Highly Effective People", which promotes a philosophy of life that takes a long-term view, has
been introduced into the "special activities" of elementary schools [9]. Further, a case study regarding the autonomous activities of children has also been reported [10]. Furthermore, with respect to programming education, in light of programming education becoming compulsory in elementary schools in 2020, the following have been reported: a case study on programming from the perspective of manufacturing with robotic control [11], a case study on the creation of teaching materials and practical education in elementary schools from the perspective of computational thinking [12], a workshop report on cooperative work using visual programming tools, and a survey of issues for promoting programming education in elementary schools. However, no studies discuss the necessity of ICT literacy and a social perspective, in addition to computational thinking, for determining the scope of social issues to which children can contribute autonomously. In this study, the authors classify and define the skills necessary for children's learning into three categories, "computational thinking", "ICT literacy", and "social perspective", with the aim of facilitating the selection of proactive learning tasks that will prepare children to tackle social issues. Focusing on these three perspectives, the authors have developed a learning-data collection system that collects data regarding the developmental stages and learning activities of children [5].
5 Conclusions

The authors are studying a learning support system for children to promote self-directed learning. In previous work, the authors developed the collection system that collects data regarding the developmental stages and learning activities of children as the first stage of the learning support system. In this paper, the authors proposed a learning feedback system for children based on clustering of the learning-data. Training data for clustering learners' features were collected through the community activity "Walk Adventure 2020", and a feedback system based on the learning-data collected in this event and simple clustering was developed. Future work includes realizing real-time feedback on learning activities and establishing an assessment methodology for self-directed learning.

Acknowledgments. This work was supported by the JSPS KAKENHI [grant number JP19K02982].
References

1. Ministry of Education, Culture, Sports, Science and Technology (MEXT): Programming Education at the Elementary School Level (Summary of Discussion). http://www.mext.go.jp/b_menu/shingi/chousa/shotou/122/attach/1372525.htm. Accessed 31 May 2020. (in Japanese)
2. Shibuya, K.: Programming Education in Elementary School Integrated Learning Time, Miraino Manabi Consortium. https://miraino-manabi.jp/content/260. Accessed 31 May 2020. (in Japanese)
3. Kawano, Y., Kawano, Y.: Consideration of the e-learning system for children to promote proactive learning. JSiSE Technical report, March 2019. (in Japanese)
4. Kawano, Y., Kawano, Y.: Development of learning data gathering system and analysis of learning data in a programming classroom for children. JSiSE Technical report, 3 March 2020. (in Japanese)
5. Kawano, Y., Kawano, Y.: A proposal of children learning system to promote self-directed choosing of learning tasks and analysis of learning data in a programming classroom. In: The 23rd International Conference on Network-Based Information Systems (NBiS-2020), Victoria, Canada, August 2020
6. Ministry of Education, Culture, Sports, Science and Technology (MEXT): Characteristics of each stage of a child's development and issues to focus on. https://www.mext.go.jp/b_menu/shingi/chousa/shotou/053/gaiyou/attach/1286156.htm. Accessed 10 Feb 2020. (in Japanese)
7. Ushiroyama, E.: Welfare education that fosters children's autonomy. Bull. Tokai Gakuin Univ. 2, 43–46 (2008). (in Japanese)
8. Tamura, M.: A Report of "WAKABA-CBT" Project by Partnership Between University and Regional Area, Uekusa Gakuen University, Departmental Bulletin Paper, vol. 18, pp. 1–7 (2017). (in Japanese)
9. Covey, S.R.: The 7 Habits of Highly Effective People. Free Press (1989)
10. Takahashi, K.: Looking for children to be autonomous in their activities: Focusing on the 'Seven Habits', Center for Educational Research and Development, Joetsu University of Education, Departmental Bulletin Paper, vol. 20, pp. 217–222 (2010). (in Japanese)
11. Matsuda, T.: Necessity of low grade programming education and in fact - through utilization of cutlery apps. In: Japan Society of Digital Textbook, vol. 7 (2018). (in Japanese)
12. Toyoda, M.: Report on considerations for promoting programming class at elementary school, Graduate School of Teacher Education Wakayama University bulletin of Course Specializing in Professional Development in Education, vol. 2, pp. 83–90 (2017). (in Japanese)
A SPA of Online Lecture Contents with Voice

Masaki Kohana1(B), Shusuke Okamoto2, and Masaru Kamada3

1 Chuo University, Ichigaya-Tamachi 1-18, Shinjuku, Tokyo, Japan
[email protected]
2 Seikei University, Tokyo, Japan
[email protected]
3 Ibaraki University, Ibaraki, Japan
[email protected]
Abstract. This paper proposes a way to construct web-based lecture materials that proceed automatically based on audio playback. Because of COVID-19, many classes have become online lectures. There are several types of online lecture, such as real-time online lectures and video lectures; we focus on video lectures. Lecture videos are large files, so video streaming puts a strain on the Internet, and depending on the home Internet environment, it can be difficult for a student to take a lecture. To reduce the file size, our proposed method synchronizes voice playback with the page transitions of a web page. Students can take the same lecture as with the video by having the web pages transition along with the voice playback.
1 Introduction
The COVID-19 pandemic has changed education. University campuses were shut down, and as a result, educational activities moved onto online platforms. In our university, Chuo University in Japan, some small classes are held in classrooms, while the other classes are held online. However, all classes moved online or to a hybrid-style lecture when the government of Japan declared a state of emergency. Our university has three styles of online lecture: the real-time style, the on-demand style, and handouts with narration. The real-time style lecture uses an online meeting system such as Cisco WebEx [1] or Zoom [2]. A teacher and students can communicate using video, voice, and text in real time. This style is suitable for seminars and exercises. The hybrid-style lecture combines the offline class and the real-time lecture: the students join the online meeting system, some from the classroom and the others from their homes. On the other hand, in the on-demand style lecture, a teacher publishes a video for students. Students view the video on their own devices, and the teacher and students communicate with each other using e-mail or a learning management system (LMS).
The home Internet environment is an important factor for online lectures. Both the real-time lecture and the on-demand lecture require the Internet. However, there is a disparity in the Internet environment among students. Even though the general household penetration rate of the Internet was 97.4% in 2018, the FTTH penetration rate was only 63.4% [3]. A student who does not have an FTTH environment connects to the Internet via a mobile connection. In most cases, the mobile Internet environment limits the monthly volume of data transfer; when students use up this volume, they cannot join real-time online lectures. Therefore, such students need to reduce their data transfer consumption.

In this study, we focus on the on-demand style lecture. A teacher provides a lecture video via a video-sharing service, such as YouTube. The students view the video from their homes and sometimes work on a quiz. The teacher and the students communicate with each other using e-mail or the LMS. However, the size of the video is large: for example, one video (1080p, 30 min) is 678 MB, and another (1080p, 70 min) is 1.3 GB. Furthermore, video contents have some disadvantages. When students want to move to a specific slide, they have to search the video using the seek bar, which is cumbersome. Even if the video shows contents such as hyperlinks, students cannot interact with them.

In this paper, we propose a web-based lecture material that replaces the video contents. We provide the material as a single-page application (SPA). Our lecture material consists of just HTML, CSS, JavaScript, and voice data. The material changes the displayed slide based on the voice playback, and the slide transitions synchronize with the position of the voice playback. Section 2 introduces some studies related to our work. Sections 3 and 4 describe the details of our system. Section 5 presents our discussion, and Sect. 6 concludes our study.
2 Literature Survey
This section introduces some studies related to our work. There are a lot of services and studies for online lectures. One of the major online lecture platforms is the Massive Open Online Courses (MOOC) [4]. MOOC provides many contents for learning new skills and for the educational experience, including many online videos. In Japan, DotInstall is a major online learning service [5]. DotInstall provides several courses for learning skills, offering many videos that are each only about 3 min long. A Tour of Go provides a tutorial for the programming language Go [6]. This tutorial provides text descriptions together with a playground, which is a runtime environment; a user can read the description and write a Go program on the same page. Ogawa et al. proposed a system to record chalk annotations and talk voices based on HTML5 [7]. This system adds the annotations and the voice to a
web page. The web page is held as an image file, and the system adds the chalk annotations and the voice onto the image. Okamoto et al. proposed an online programming environment for a programming course at a university [8]. This system uses a web page as the front-end and a virtual machine as the back-end. A user can learn programming skills using their own web browser.
3 System Overview
This section introduces an overview of our online lecture system. Our system provides a lecture material as a Single-Page Application (SPA), so a student can view the material using a Web browser. Figure 1 shows a screenshot of our system, which consists of three parts: the control part, the main view part, and the note part.

The main view part is the main part of our system. It shows the contents of the lecture material. In our system, the teacher makes the contents as an HTML document, so the contents can include text, images, and HTML forms. Since the material is an HTML document, the student can interact with the contents, for example, by copying/pasting text and clicking hyperlinks.

The note part is the text area at the bottom of Fig. 1, where the student takes notes. This text area is tied to the current slide: when the main view switches to another slide, the note also switches to the corresponding note. Our system stores the notes in WebStorage (a small sketch follows at the end of this section).

The control part is placed at the top of the contents and consists of two parts: the audio control and the slide control. The audio control part manages the voice playback: play, stop, and seek. The main view displays a slide page, and the displayed slide changes based on the position of the voice playback.
Fig. 1. Screenshot
The slide control part is a series of buttons that indicate the slide numbers. When the student clicks a button, the main view part displays the slide corresponding to that number. When the slide is changed by a button, our system also changes the position of the voice playback accordingly.
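A minimal sketch of the per-slide note persistence mentioned above, assuming hypothetical element classes and a key scheme of our own:

// Restore and persist each slide's note via WebStorage; the selector
// and the key naming are illustrative assumptions.
document.querySelectorAll('#note .note').forEach((area, slideNum) => {
  area.value = localStorage.getItem(`note-${slideNum}`) ?? '';  // restore
  area.addEventListener('input', () => {
    localStorage.setItem(`note-${slideNum}`, area.value);       // persist
  });
});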
4 Implementation
This section describes the implementation of our system. Our web-based online lecture system is developed as a SPA consisting of HTML, CSS, and JavaScript. We use pure JavaScript without any frameworks.

First, we show the HTML part, whose main contents include the presentation slides and the notes. Figure 2 shows a part of the HTML file. The div element with the view ID is the main part; it includes some div elements classified as slide. The div with the note ID includes the notes for each slide, whose textarea elements are classified as note. The visibility of the slide elements and the note elements is controlled by using JavaScript (Fig. 3). The control part consists of two parts: the audio control and the slide control. The audio control part is just an audio tag, and the system specifies the voice data as the source of the audio tag. The slide control part consists of button tags. These buttons are labeled with numbers, which correspond to the slide numbers.

Second, we describe the JavaScript part. Figure 4 shows the visibility control for the slides. This function operates when the student clicks a button or when the voice playback reaches a certain timing. The system holds the current slide number and monitors the position of the voice playback. The timings of the slide transitions are held as an array, and the system checks the position of the voice playback every 1 s. If the position exceeds the timing of the next slide, the system changes the slide displayed in the main view part. When the student clicks a button, the main view part changes the slide; in this situation, the system also needs to change the position of the voice. The audio tag has the currentTime property, which indicates the position of the playback. The playback position corresponding to each slide change is held in the array; therefore, when a button is clicked, the system assigns that playback position to the currentTime property of the audio tag.
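Since the listings in Figs. 2, 3, and 4 are not reproduced here, the following is a minimal sketch consistent with this description; the element IDs, class names, and timing values are assumptions, not the authors' actual code.

// Slide/note visibility control synchronized with voice playback.
const audio = document.getElementById('player');        // the <audio> element
const slides = document.querySelectorAll('#view .slide');
const notes = document.querySelectorAll('#note .note');
const timings = [0, 35, 90, 140];  // playback positions (s) where slides start
let current = 0;

function showSlide(n) {
  current = n;
  slides.forEach((s, i) => { s.style.display = i === n ? 'block' : 'none'; });
  notes.forEach((t, i) => { t.style.display = i === n ? 'block' : 'none'; });
}

// Check the playback position every 1 s and advance the slide if needed.
setInterval(() => {
  if (current + 1 < timings.length && audio.currentTime >= timings[current + 1]) {
    showSlide(current + 1);
  }
}, 1000);

// When a slide button is clicked, jump both the view and the voice playback.
document.querySelectorAll('.slide-button').forEach((btn, i) => {
  btn.addEventListener('click', () => {
    showSlide(i);
    audio.currentTime = timings[i];
  });
});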
Fig. 2. HTML part
Fig. 3. Slide control
5 Discussion
In this study, we propose a SPA that presents the lecture material with voice playback to reduce the transmitted file size. The file size of our lecture material with 30 min of voice data (in the WAV format) is about 180 MB, while the size of a video of the same length is about 678 MB. We can thus reduce the file size, which also reduces the communication load. However, our system has some problems. A major problem is the sequence of the slides. In our system, each slide selection button corresponds to a slide number. This specification assumes that the slides change from the first slide to the final slide in one direction.
Fig. 4. Visibility control
Fig. 5. Position structure
However, in a real lecture, the teacher changes the slides back and forth. In this situation, even if the expected slide order is 2-3-4-5, the actual slide order may be 2-3-2-4-5. One solution to this problem is an array whose elements include both the playback position and the slide number. Figure 5 shows a sample structure: each element of the array is a structure with two fields, num and pos. The num indicates the slide number, and the pos indicates the playback position. Using this structure, our system can handle a non-straight slide order (a small sketch follows at the end of this section). However, with this structure, the label of the button that indicates the slide number no longer matches the actual slide number. Furthermore, the current button labels do not show the topics of the slides, so we need to improve the user interface.

The other problem is annotation. Our system just shows the presentation slides and plays the voice. The teacher may want to add annotations on the slides. Annotations could be added by using the canvas element, an HTML
element for drawing graphics. If our system can record the voice and the annotations drawn on the canvas, it can provide richer lecture material.
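Returning to the slide-order problem, a hedged sketch of the position structure of Fig. 5 might look as follows; the concrete values are illustrative.

// Each element records which slide (num) is shown from which playback
// position (pos), so a slide may appear more than once (order 2-3-2-4-5).
const positions = [
  { num: 2, pos: 0 },
  { num: 3, pos: 40 },
  { num: 2, pos: 75 },   // the teacher returns to slide 2
  { num: 4, pos: 110 },
  { num: 5, pos: 160 }
];

// Find the slide to display for a given playback position t (in seconds).
function slideAt(t) {
  let entry = positions[0];
  for (const p of positions) {
    if (p.pos <= t) entry = p; else break;
  }
  return entry.num;
}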
6 Conclusion
This paper proposed a system that provides a lecture material as a single-page application. The system displays the presentation slides and plays the voice data, and the slides and the voice work together. When the student jumps to a specific slide, the voice playback also jumps to the related position; conversely, when the voice playback reaches a certain position, the display changes to the related slide. Using this system, the file size of a 30-min lecture material becomes 180 MB, while a video of the same length is 678 MB. As future work, we need to handle non-straight slide transitions and annotations, and we need to develop a contents editor system with a voice recording feature.
References

1. Cisco Webex. https://www.webex.com/ja/index.html. Accessed 25 Jun 2021
2. Zoom. https://zoom.us. Accessed 25 Jun 2021
3. Ministry of Internal Affairs and Communications: Communications Usage Trend Survey 2018 (2017). https://www.soumu.go.jp/johotsusintokei/statistics/data/190531_1.pdf. Accessed 23 Jun 2021. (in Japanese)
4. MOOC.org. https://www.mooc.org. Accessed 25 Jun 2021
5. DotInstall. https://dotinstall.com. Accessed 25 Jun 2021. (in Japanese)
6. A Tour of Go. https://tour.golang.org/list. Accessed 25 Jun 2021
7. Ogawa, S., Niibori, M., Yonekura, T., Kamada, M.: An HTML5 implementation of web-com for recording chalk annotations and talk voices onto web pages. In: Barolli, L., Enokido, T., Takizawa, M. (eds.) Advances in Network-Based Information Systems. NBiS 2017. Lecture Notes on Data Engineering and Communications Technologies, vol. 7. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-65521-5_98
8. Okamoto, S., Sakamoto, S., Kohana, M.: An environment for computer programming classes under COVID-19 situation. In: Barolli, L., Li, K., Enokido, T., Takizawa, M. (eds.) Advances in Networked-Based Information Systems. NBiS 2020. Advances in Intelligent Systems and Computing, vol. 1264. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-57811-4_61
A Dynamic and Distributed Simulation Method for Web-Based Games

Ryoya Fukutani1, Shusuke Okamoto1, Shinji Sakamoto1(B), and Masaki Kohana2

1 Seikei University, Musashino, Tokyo, Japan
[email protected], [email protected], [email protected]
2 Chuo University, Ichigaya Tamachi, Shinjuku-ku, Tokyo, Japan
[email protected]
Abstract. This paper proposes a simulation method for web-based games to implement our tank battles with user programming. Our previous implementation adopted a server-side simulation: the server simulated the tank battles and sent the results to a root web browser through WebSocket, and the browser forwarded them to the others by using a tree-structured WebRTC network. However, the server's load and the root browser's load were high and limited scalability in the number of users. We focus on a game property that allows the calculations of the simulation to be parallelized if a group of tanks is located far from the other groups, owing to the limited gun range. This paper shows an algorithm for dynamically grouping tanks, an implementation of a distributed simulation using this algorithm, and evaluations.
1 Introduction
We are developing a web system for learning Python programming by using WebRTC. This system uses an idea from Robocode [1]: the user can learn Python programming by playing robot battles. Learners of IT skills need to learn how to write program code. Robocode is a programming game whose goal is to develop a program for a robot battle tank that defeats others. The program is written in Java or .NET. A user writes a robot program in advance and runs the simulator to let multiple robots battle. Robocode was originally started by Mathew A. Nelson and was then promoted by IBM as a fun way to get started with learning to program in Java. It has also been used as a machine-learning subject, such as the automatic generation of a program by genetic programming [2]. Figure 1 is a screenshot of Robocode. The names of the robots and their remaining energy levels are listed in the right-side pane, and the battle simulation is displayed in the left-side window. The user selects robots and makes them fight in the battlefield. At the beginning of a battle, all robots have the same energy level. They shoot bullets by using that energy. When a bullet hits a robot, the robot's energy level is decreased; when it reaches zero, the robot is defeated and removed from the field.
Fig. 1. A battle screen of Robocode
Our system uses the idea of Robocode; however, it is a bit different. The user tackles a programming quiz called a stage mission in addition to robot battles, and does not just watch a battle simulation but can also rewrite their own program during the battle. When the user rewrites the program, the robot's behaviour changes instantly, and the stage or battle continues without restarting [3]. Our previous implementation adopted a server-side simulation. The server simulated the tank battles and sent the results to a root web browser through WebSocket. The browser forwarded them to the others by using a tree-structured WebRTC network. However, the server's load and the root browser's load were high and limited scalability in the number of users. We focus on a game property that allows the calculations of the simulation to be parallelized if a group of tanks is located far from the other groups, owing to the limited gun range.
2 Literature Survey
In our previous work, we implemented a web application using WebSocket to communicate between a web server and browsers. The technologies we used were based on an implementation of Robocode by Youchen Lee [4], where the run-time code is written in LiveScript, HTML, and JavaScript. Kohana et al. proposed an information sharing method for a web-based virtual world using WebRTC [5]. They build a ring-form topology with the web browsers to communicate in a virtual world that is not divided into blocks. Each web browser gets only the information necessary to update its status, although the browsers together share a considerable area of the virtual world. CodeCombat is a personal web programming game [6]. It is used for learning programming, where the user writes and rewrites their program to complete
game stages. For example, a stage consists of a corridor, a treasure, and an avatar. The user writes a program to move their avatar to the treasure and presses the RUN button to see the result. Several programming languages, such as Python, JavaScript, and CoffeeScript, can be used for learning.
3 Key Technologies
WebRTC (Web Real-Time Communication) [7] is a real-time communication protocol among web browsers and mobile applications. It can be used to build a peer-to-peer network of browsers, enabling video chat applications, file sharing applications, and so on without any server for the communication itself. We use WebRTC to share robot information, such as locations, angles, and energy levels, among the browsers in our system. The browsers can advance the simulation steps without using any server to communicate.

Web Worker is a mechanism for running multiple threads in a web browser [8]. It is a functionality of JavaScript. Using Web Workers, calculations can run in the background without disturbing the user interface. We use a Web Worker to calculate robot behaviours.

WebSocket is a protocol for communication between clients and a server [9]. It provides simultaneous two-way communication over a TCP connection and enables interaction between a web browser and a web server with low overhead. We use it to transfer a robot program when a user rewrites their program.

Transcrypt is one of the programming languages for web browsers [10]. It has the same syntax as the Python programming language and can be used as a compiler from Python to JavaScript to run code in a web browser. It also offers seamless access to any JavaScript library.
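As an illustration of the WebRTC usage described above, a minimal sketch of sharing robot state over a data channel follows; the signaling needed to connect the peers is omitted, and the channel name and message fields are our assumptions.

// Sketch: exchanging robot information (location, angle, energy) between
// browsers over an RTCDataChannel; peer signaling is omitted for brevity.
const pc = new RTCPeerConnection();
const channel = pc.createDataChannel('robots');  // channel name is illustrative

channel.onopen = () => {
  channel.send(JSON.stringify({ id: 'tank1', x: 120, y: 80, angle: 45, energy: 100 }));
};
channel.onmessage = (e) => {
  const robot = JSON.parse(e.data);
  // update the local simulation with the received robot state
};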
Fig. 2. Screenshot of robot battle
Fig. 3. Network of server batch processing method
4 System Design and Implementation
Figure 2 is a screenshot of a web browser using our web system under development. The canvas on the left side is the battlefield, and the text on the right side is a robot program. There are two robots: one is operated by the user of this browser, and the other is a stage enemy prepared by the system or controlled by another user. As the user changes their robot code, the behaviour of the robot changes immediately. Our system consists of a server and one or more web browsers as users.

4.1 Server Batch Processing Method
We describe the server batch processing method, which was used in the previous version of our system. First, the server replies to a browser when it receives an HTTP request. Our system uses WebSocket to communicate between the server and the browsers. The server uses a Web Worker to manage and simulate a tank battle, and sends information about the simulation progress to the browsers; the browsers then display the tank battles. For example, suppose three browsers join the game, as shown in Fig. 3. The server simulates a battle with three tanks. When Browser1 modifies its tank program, the server receives the information, and the simulation continues; the server then sends data to each browser to display the simulation progress.
Fig. 4. Proposed grouping algorithm
4.2 Dynamic and Distributed Simulation Method
In the server batch processing method, the server's load and the root browser's load were high, which limits the scalability in the number of users. Therefore, we propose a dynamic and distributed simulation method. The method parallelizes the calculations of the simulation by using the game property that a group of tanks can be simulated independently if it is located far from the other groups, owing to the limited gun range. Each parallelized simulation is conducted by one browser per group, called a super browser. The super browser then shares the simulation progress with the other browsers in the same group.

4.3 Grouping Tanks
We describe how to determine the groups. We want to group the tanks so that the number of groups is large and each group remains independent for a long time. The larger the number of groups, the more the simulation can be parallelized. The longer each group stays independent, the less frequently the server has to manage and re-group, so the server's load becomes lower. Thus, we consider a fitness value F to decide on the grouping, as shown in Eq. (1): the larger the value of F, the better the grouping.
Fig. 5. Network of our proposed system
F = Dmin + C1 × Gnum,    (1)
where Dmin is the minimum distance among groups, C1 is an adjustment coefficient, and Gnum is the number of groups. Here, the distance between two groups is defined as the minimum distance between a tank belonging to one group and a tank belonging to the other. A group is completely independent of the other groups during a computation time T calculated by

T = (Dmin − 2L) / (2v),    (2)
where L is the distance a tank can see on its radar and/or its gun range, and v is the maximum velocity of the tanks. L and v are constant in the simulation, so T depends only on Dmin.

We show the proposed grouping algorithm in Fig. 4. First, we put two tanks in the same group if the distance between them is shorter than 2L. Next, we calculate the fitness value F and store the F value and its grouping. Then, if the number of groups is less than three, we apply the grouping for which F is the maximum. If the number of groups is greater than or equal to three, we merge the closest two groups; after that, we calculate the fitness value F and store the F value and its grouping again. This process continues while the number of groups is greater than or equal to three (a hedged sketch of this procedure is given below). Each super browser runs its parallelized simulation for a duration of T, because independence is guaranteed by Eq. (2). Then the server re-groups the surviving tanks and calculates the next computation time T. We show the network of our proposed system in Fig. 5. The browsers communicate with each other to share the simulation progress, because the super browsers run the simulations. The server groups the tanks many times during a simulation, so the information flow paths of the WebRTC network may often change.
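The following sketch illustrates the grouping procedure; the tank coordinates, the constants L, v, and C1, and all helper names are our assumptions, not the authors' implementation.

// Eq. (2): time during which a grouping stays independent.
const computationTime = (dmin, L, v) => (dmin - 2 * L) / (2 * v);

const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);

// Minimum tank-to-tank distance between two groups.
const groupDist = (g1, g2) =>
  Math.min(...g1.flatMap(a => g2.map(b => dist(a, b))));

// Initial grouping: tanks closer than 2L fall into the same group.
function initialGroups(tanks, L) {
  const groups = [];
  for (const t of tanks) {
    const near = groups.filter(g => g.some(u => dist(t, u) < 2 * L));
    const merged = near.flat().concat([t]);  // merge all touching groups
    near.forEach(g => groups.splice(groups.indexOf(g), 1));
    groups.push(merged);
  }
  return groups;
}

// Fitness F = Dmin + C1 * Gnum (Eq. (1)).
function fitness(groups, C1) {
  let dmin = Infinity;
  for (let i = 0; i < groups.length; i++)
    for (let j = i + 1; j < groups.length; j++)
      dmin = Math.min(dmin, groupDist(groups[i], groups[j]));
  return (groups.length > 1 ? dmin : 0) + C1 * groups.length;
}

// Merge the closest two groups while three or more remain, keeping the
// grouping with the best fitness seen so far.
function bestGrouping(tanks, L, C1) {
  let groups = initialGroups(tanks, L);
  let best = { f: fitness(groups, C1), groups };
  while (groups.length >= 3) {
    let pair = [0, 1], d = Infinity;
    for (let i = 0; i < groups.length; i++)
      for (let j = i + 1; j < groups.length; j++) {
        const gd = groupDist(groups[i], groups[j]);
        if (gd < d) { d = gd; pair = [i, j]; }
      }
    const [i, j] = pair;
    groups = groups.filter((_, k) => k !== i && k !== j)
                   .concat([groups[i].concat(groups[j])]);
    const f = fitness(groups, C1);
    if (f > best.f) best = { f, groups };
  }
  return best.groups;
}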
Fig. 6. Example of three sessions
Table 1. Experimental settings for evaluation A

Parameter                  Value
Users                      4, 7, 10
Sessions                   1
Trials                     5 times
Experiment time per trial  60 s
4.4 Game Sessions
We assume that the server needs to handle more than one battlefield simultaneously. Thus, we call one battlefield a session. In other words, the server can
handle several sessions simultaneously, and each browser belongs to one session. We show an example of three sessions in the proposed system in Fig. 6. Each session is entirely independent of the other sessions. New players may create a new session, and a session disappears when it finishes.
5 Evaluations
We evaluate the proposed method by comparing it with the batch processing method. We consider two evaluations: Evaluation A examines the effects of the number of users, and Evaluation B examines the effects of the number of sessions. For both evaluations, each experiment runs for one minute and is repeated five times in the same environment.

5.1 Evaluation A: Effects of Number of Users
The experimental settings for evaluation A are shown in Table 1. We consider different numbers of users accessing the game server with their own browsers; there is just one session during this evaluation. We evaluate the CPU usage and the data traffic of both the server and the browsers.

The experimental results for CPU usage of both methods are shown in Table 2. The server's CPU usage with the proposed method is around 0.1 times that of the batch processing method. On the contrary, the average browser CPU usage increases with the number of users. Thus, the proposed method distributes the server's load to the browsers, and scalability is increased.

The experimental results for data traffic of both methods are shown in Table 3. The server's data traffic with the proposed method is reduced to around 0.1 times that of the batch processing method, while the data traffic among browsers increases by 5 to 20 times.

Table 2. Results of CPU usage for both methods

Number of users   Server CPU usage [%] (Batch / Proposal)   Browser CPU usage [%] (Batch / Proposal)
4                 13.4 / 1.2                                16.0 / 23.6
7                 21.5 / 1.5                                19.8 / 28.3
10                42.7 / 4.2                                13.1 / 35.7
Table 3. Results of data traffic for both methods

Number of users   Server data traffic [KiB] (Batch / Proposal)   Browser data traffic [KiB] (Batch / Proposal)
4                 176 / 18                                       65 / 351
7                 297 / 22                                       26 / 581
10                330 / 30                                       39 / 898
Table 4. Experimental settings for evaluation B

Parameter                  Value
Sessions                   1, 2, 3, 4
Browsers per session       4
Trials                     5 times
Experiment time per trial  60 s
These results show that the proposed method can distribute the server's load to the browsers, so that scalability increases.

5.2 Evaluation B: Effects of Number of Sessions
The experimental settings are shown in Table 4. We consider one to four sessions, with four browsers in each session. As an evaluation setting, each tank behaves differently according to its own distinct program, which is not rewritten by users during the battles.
Fig. 7. Average of server’s CPU usage for different number of sessions
Fig. 8. Average of server's data traffic for different number of sessions
Figure 7 shows that the server's CPU usage of the proposed method remains lower than that of the batch processing method even as the number of sessions increases. Also, the server's data traffic of the proposed method is lower than that of the batch processing method, as shown in Fig. 8. Figure 9 shows a browser's data traffic when the number of sessions is four. At the beginning of the simulation, this browser was selected as a super browser, so it sent many packets to share the progress of the simulation. After time T (see Eq. (2)), the browser was no longer a super browser, so the number of received packets exceeded the number of sent packets.
Fig. 9. A browser’s data traffic load in our proposed method
During the simulation time of around 55 to 95 s, the browser simulated only its own tank without sharing any information with other browsers. This was because a group was formed containing only this browser's tank; in other words, the tank was far enough from the other tanks. These results show that our proposed method works well for distributing the server's load to the browsers.
6 Conclusion
This paper proposed a simulation method for web-based games to implement our tank battles with user programming. In our previous implementation, the server simulated the tank battles and sent the results to a root web browser through WebSocket, and the browser forwarded them to the others by using a tree-structured WebRTC network. We focused on a game property that allows the calculations of the simulation to be parallelized if a group of tanks is located far from the other groups, owing to the limited gun range. We described an algorithm for dynamically grouping tanks, an implementation of a distributed simulation using this algorithm, and evaluations. From the evaluations, we concluded that the proposed algorithm is able to group the tanks and that the proposed method is capable of distributing the server's load to the browsers, compared with the batch processing method. In future work, we would like to improve the grouping algorithm by considering the browsers' status.
References

1. Robocode. http://robocode.sourceforge.net/. Accessed 5 June 2021
2. Shichel, Y., Ziserman, E., Sipper, M.: GP-Robocode: using genetic programming to evolve robocode players. In: Keijzer, M., Tettamanzi, A., Collet, P., van Hemert, J., Tomassini, M. (eds.) EuroGP 2005. LNCS, vol. 3447, pp. 143–154. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-31989-4_13
3. Fukutani, R., Okamoto, S., Sakamoto, S., Kohana, M.: A real-time programming battle web application by using WebRTC. In: Barolli, L., Nishino, H., Enokido, T., Takizawa, M. (eds.) NBiS 2019. AISC, vol. 1036, pp. 731–737. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29029-0_73
4. LiveScript (JavaScript) implementation of Robocode. https://github.com/youchenlee/robocode-js/. Accessed 5 June 2021
5. Kohana, M., Okamoto, S.: A data sharing method using WebRTC for web-based virtual world. In: Barolli, L., Xhafa, F., Javaid, N., Spaho, E., Kolici, V. (eds.) EIDWT 2018. LNDECT, vol. 17, pp. 880–888. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-75928-9_81
6. CodeCombat. https://codecombat.com/. Accessed 5 June 2021
7. WebRTC. https://webrtc.org/. Accessed 5 June 2021
8. Web Worker. https://html.spec.whatwg.org/multipage/#toc-workers. Accessed 5 June 2021
9. WebSocket. https://tools.ietf.org/html/rfc6455. Accessed 5 June 2021
10. Transcrypt. https://transcrypt.org/. Accessed 5 June 2021
Author Index
B Bandyopadhyay, Anjan, 143 Bandyopadhyay, Arghya, 143 Barolli, Admir, 117 Barolli, Leonard, 1, 13, 117, 306, 329 Bułat, Radosław, 213 Bylykbashi, Kevin, 117 C Chen, Hsing-Chung, 105, 223 Chouliaras, Spyridon, 154 D Doi, Akio, 299 Duolikun, Dilawaer, 1, 50 Durresi, Arjan, 94 E Enokido, Tomoya, 1, 23, 50 F Fujisaki, Kiyotaka, 312 Fukutani, Ryoya, 391 G Gao, Zhiyi, 299 Gotoh, Yusuke, 268 H Hasegawa, Toru, 188 Hayashibara, Naohiro, 82 Heyao, Huang, 289 Hirata, Aoto, 329
I Iio, Jun, 341 Ikeda, Makoto, 306 Inenaga, Kentaro, 71 Ishida, Tomoyuki, 275 Izumi, Kiyotaka, 59 J Jono, Shun, 59 K Kamada, Masaru, 359, 367, 384 Kato, Tohru, 299 Kaur, Davinder, 94 Kaveeta, Vivatchai, 179 Kawabe, Yoshinobu, 188 Kawano, Yoshihiro, 374 Kawano, Yuka, 374 Khwanngern, Krit, 179 Kikuchi, Kaito, 367 Kitagawa, Takumi, 188 Kohana, Masaki, 384, 391 Koizumi, Yuki, 188 Koyama, Takuto, 59 Kusunoki, Koki, 238 L Liu, Yi, 13 Lu, Wei, 202 Lung, Chi-Wen, 223 M Maeda, Hiroshi, 320 Manikandan, Saravanan, 105
Matsuo, Kazuma, 306 Meenert, Phornphanit, 179 Miyachi, Hideo, 283 Mukhopadhyay, Sajal, 143 Murakami, Koshiro, 283 N Nagai, Yuki, 329 Nakamura, Shigenari, 23 Natwichai, Juggapong, 179 Nguyen, Hong-Nhu, 131 Nguyen, Ngoc-Lan, 131 Nguyen, Ngoc-Long, 131 Nguyen, Nhat-Tien, 131 Niibori, Michitoshi, 367 Nishigaki, Masakatsu, 188 O Oda, Tetsuya, 329 Ogiela, Lidia, 219 Ogiela, Marek R., 213 Ogiela, Urszula, 219 Ohki, Tetsushi, 188 Ohnishi, Ayumi, 249 Ohtaki, Yasuhiro, 349 Oishi, Takayuki, 268 Okamoto, Shusuke, 13, 384, 391 Okui, Yudai, 359 Osborn, Wendy, 35 Q Qafzezi, Ermioni, 117 R Rai, Ujjwal, 143 Rittichier, Kaley J., 94 S Saikhul, Islam M. D., 105 Saito, Nobuki, 329 Sakamoto, Shinji, 13, 117, 391 Satoh, Kiwamu, 165
Singh, Vikash Kumar, 143 Song, Yu-Lin, 105, 223 Sotiriadis, Stelios, 154 Sriyong, Sawita, 179 Sueyoshi, Chinasa, 71 Sugihara, Koichiro, 82 T Takagi, Hideya, 71 Takahashi, Hiroki, 299 Takizawa, Makoto, 1, 23, 50, 219 Taniguchi, Hideo, 238 Terada, Tsutomu, 249 Tetsuro, Ogi, 289 Toyoshima, Kyohei, 329 Tsujimura, Takeshi, 59 Tsukamoto, Masahiko, 249 U Uchibayashi, Toshihiro, 71 Uslu, Suleyman, 94 V Voznak, Miroslav, 131 W Watanabe, Kota, 59 Widodo, Agung Mulyo, 223 Wisnujati, Andika, 223 X Xue, Ling, 202 Y Yahada, Reiya, 275 Yamamoto, Kazuyuki, 349 Yamano, Jun, 349 Yamashita, Meguru, 165 Yamauchi, Toshihiro, 238 Yonekura, Tatsuhiro, 359 Yoshihisa, Tomoki, 258