Lecture Notes in Networks and Systems 312
Leonard Barolli Hsing-Chung Chen Hiroyoshi Miwa Editors
Advances in Intelligent Networking and Collaborative Systems The 13th International Conference on Intelligent Networking and Collaborative Systems (INCoS-2021)
Lecture Notes in Networks and Systems Volume 312
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/15179
Editors

Leonard Barolli
Department of Information and Communication Engineering, Fukuoka Institute of Technology, Fukuoka, Japan

Hsing-Chung Chen
Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan

Hiroyoshi Miwa
School of Science and Technology, Kwansei Gakuin University, Sanda, Japan
ISSN 2367-3370 / ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-3-030-84909-2 / ISBN 978-3-030-84910-8 (eBook)
https://doi.org/10.1007/978-3-030-84910-8

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Welcome Message from the INCoS-2021 Organizing Committee
Welcome to the 13th International Conference on Intelligent Networking and Collaborative Systems (INCoS-2021), which will be held at Asia University, Taichung, Taiwan, from 1 to 3 September 2021. INCoS is a multidisciplinary conference that covers the latest advances in intelligent social networks and collaborative systems, intelligent networking systems, mobile collaborative systems, secure intelligent cloud systems, etc. Additionally, the conference addresses security, authentication, privacy, data trust and user trustworthiness behaviour, which have become crosscutting features of intelligent collaborative systems.

With the fast development of the Internet, we are experiencing a shift from the traditional sharing of information and applications as the main purpose of networking systems to an emergent paradigm, which places people at the very centre of networks and exploits the value of people's connections, relations and collaboration. Social networks are playing a major role as one of the drivers in the dynamics and structure of intelligent networking and collaborative systems. Virtual campuses, virtual communities and organizations strongly leverage intelligent networking and collaborative systems through a great variety of formal and informal electronic relations, such as business-to-business, peer-to-peer and many types of online collaborative learning interactions, including virtual campuses and e-learning systems. Altogether, this has resulted in entangled systems that need to be managed efficiently and in an autonomous way. In addition, the conjunction of the latest and powerful technologies based on cloud, mobile and wireless infrastructures is currently bringing new dimensions to collaborative and networking applications by facing new issues and challenges.

The aim of this conference is to stimulate research that will lead to the creation of responsive environments for networking and the development of adaptive, secure, mobile and intuitive intelligent systems for collaborative work and learning.

The successful organization of the conference is achieved thanks to the great collaboration and hard work of many people and conference supporters. First, we would like to thank all the authors for their continued support of the conference by submitting their research work, and for their presentations and discussions during the conference days. We would like to thank the PC co-chairs, Track co-chairs, TPC members and external reviewers for carefully evaluating the submissions and providing constructive feedback to authors. We would like to acknowledge the excellent work and support of the International Advisory Committee. Our gratitude and acknowledgement go to the conference keynote speakers for their interesting and inspiring keynote speeches. We greatly appreciate the support of the Web Administrator Co-chairs. We are very grateful to Springer as well as several academic institutions for their endorsement and assistance. Finally, we hope that you will find these proceedings to be a valuable resource in your professional, research and educational activities.
INCoS-2021 Organizing Committee
Honorary Chairs
Jeffrey J. P. Tsai (Asia University, Taiwan)
Makoto Takizawa (Hosei University, Japan)
General Co-chairs
Hsing-Chung Chen (Asia University, Taiwan)
Hiroyoshi Miwa (Kwansei Gakuin University, Japan)
Flora Amato (University of Naples "Federico II", Italy)
Programme Co-chairs
Chao-Tung Yang (Tunghai University, Taiwan)
Omar Hussain (UNSW Canberra, Australia)
Lidia Ogiela (Pedagogical University of Cracow, Poland)
International Advisory Committee
Vincenzo Loia (University of Salerno, Italy)
Albert Zomaya (University of Sydney, Australia)
Fang-Yie Leu (Tunghai University, Taiwan)
Masato Tsuru (Kyushu Institute of Technology, Japan)
International Liaison Co-chairs
Tzu-Liang Kung (Asia University, Taiwan)
Jerry Chun-Wei Lin (Western Norway University of Applied Sciences, Norway)
Aneta Poniszewska-Maranda (Lodz University of Technology, Poland)
Natalia Kryvinska (Comenius University in Bratislava, Slovakia)
Xu An Wang (Engineering University of CAPF, China)
Jana Nowakova (VŠB-Technical University of Ostrava, Czech Republic)
Jakub Nalepa (Silesian University of Technology & Future Processing, Poland)
Award Co-chairs
Tomoya Enokido (Rissho University, Japan)
Marek Ogiela (AGH University of Science and Technology, Poland)
Vaclav Snasel (VŠB-Technical University of Ostrava, Czech Republic)
Web Administrator Co-chairs
Phudit Ampririt (Fukuoka Institute of Technology, Japan)
Kevin Bylykbashi (Fukuoka Institute of Technology, Japan)
Ermioni Qafzezi (Fukuoka Institute of Technology, Japan)
Local Arrangement Co-chairs
Zhi-Ren Tsai (Asia University, Taiwan)
Cheng-Hung Chuang (Asia University, Taiwan)
Charles C. N. Wang (Asia University, Taiwan)
Finance Chair
Makoto Ikeda (Fukuoka Institute of Technology, Japan)
Steering Committee Chair
Leonard Barolli (Fukuoka Institute of Technology, Japan)
Track Areas and PC Members

Track 1: Data Mining, Machine Learning and Collective Intelligence

Track Co-chairs
Carson K. Leung (University of Manitoba, Canada)
Alfredo Cuzzocrea (University of Calabria, Italy)

TPC Members
Fan Jiang (University of Northern British Columbia, Canada)
Wookey Lee (Inha University, Korea)
Oluwafemi A. Sarumi (Federal University of Technology Akure, Nigeria)
Syed K. Tanbeer (University of Manitoba, Canada)
Tomas Vinar (Comenius University in Bratislava, Slovakia)
Kin Fun Li (University of Victoria, Canada)
Track 2: Intelligent Systems and Knowledge Management

Track Co-chairs
Marek Ogiela (AGH University of Science and Technology, Poland)
Chang Choi (Gachon University, Republic of Korea)
Daichi Kominami (Osaka University, Japan)

TPC Members
Hsing-Chung (Jack) Chen (Asia University, Taiwan)
Been-Chian Chien (National University, Taiwan)
Junho Choi (Chosun University, Korea)
Farookh Khadeer Hussain (University of Technology Sydney, Australia)
Hae-Duck Joshua Jeong (Korean Bible University, Korea)
Hoon Ko (Sungkyunkwan University, Korea)
Natalia Krzyworzeka (AGH University of Science and Technology, Poland)
Libor Mesicek (J. E. Purkinje University, Czech Republic)
Lidia Ogiela (Pedagogical University of Cracow, Poland)
Su Xi (Hohai University, China)
Ali Azadeh (Tehran University, Iran)
Jin Hee Yoon (Sejong University, South Korea)
Hamed Shakouri (Tehran University, Iran)
Jee-Hyong Lee (Sungkyunkwan University, South Korea)
Jung Sik Jeon (Mokpo National Maritime University, South Korea)
Track 3: Wireless and Sensor Systems for Intelligent Networking and Collaborative Systems

Track Co-chairs
Do van Thanh (Telenor & Oslo Metropolitan University, Norway)
Shigeru Kashihara (Nara Institute of Science and Technology, Japan)

TPC Members
Dhananjay Singh (HUFS, Korea)
Shirshu Varma (IIIT Allahabad, India)
B. Balaji Naik (NIT Sikkim, India)
Sayed Chhattan Shah (HUFS, Korea, USA)
Madhusudan Singh (Yonsei University, Korea)
Irish Singh (Ajou University, Korea)
Gaurav Tripathi (Bharat Electronics Limited, India)
Jun Kawahara (Kyoto University, Japan)
Muhammad Niswar (Hasanuddin University, Indonesia)
Vasaka Visoottiviseth (Mahidol University, Thailand)
Jane Louie F. Zamora (Weathernews Inc., Japan)
Track 4: Service-Based Systems

Track Co-chairs
Corinna Engelhardt-Nowitzki (University of Applied Sciences, Austria)
Natalia Kryvinska (Comenius University in Bratislava, Slovakia)
Takuya Asaka (Tokyo Metropolitan University, Japan)

TPC Members
Maria Bohdalova (Comenius University in Bratislava, Slovakia)
Ivan Demydov (Lviv Polytechnic National University, Ukraine)
Jozef Juhar (Technical University of Košice, Slovakia)
Nor Shahniza Kamal Bashah (Universiti Teknologi MARA, Malaysia)
Eric Pardede (La Trobe University, Australia)
Francesco Moscato (University of Campania, Italy)
Tomoya Enokido (Rissho University, Japan)
Olha Fedevych (Lviv Polytechnic National University, Ukraine)
Track 5: Networking Security and Privacy

Track Co-chairs
Xu An Wang (Engineering University of CAPF, China)
Mingwu Zhang (Hubei University of Technology, China)

TPC Members
Fushan Wei (The PLA Information Engineering University, China)
He Xu (Nanjing University of Posts and Telecommunications, China)
Yining Liu (Guilin University of Electronic Technology, China)
Yuechuan Wei (Engineering University of CAPF, China)
Weiwei Kong (Xi'an University of Posts & Telecommunications, China)
Dianhua Tang (CETC 30, China)
Hui Tian (Huaqiao University, China)
Urszula Ogiela (Pedagogical University of Krakow, Poland)
Track 6: Big Data Analytics for Learning, Networking and Collaborative Systems

Track Co-chairs
Santi Caballe (Open University of Catalonia, Spain)
Francesco Orciuoli (University of Salerno, Italy)
Shigeo Matsubara (Kyoto University, Japan)

TPC Members
Soumya Barnejee (Institut National des Sciences Appliquées, France)
David Bañeres (Open University of Catalonia, Spain)
Nicola Capuano (University of Basilicata, Italy)
Nestor Mora (Open University of Catalonia, Spain)
Jorge Moneo (University of San Jorge, Spain)
David Gañán (Open University of Catalonia, Spain)
Isabel Guitart (Open University of Catalonia, Spain)
Elis Kulla (Okayama University of Science, Japan)
Evjola Spaho (Polytechnic University of Tirana, Albania)
Florin Pop (University Politehnica of Bucharest, Romania)
Kin Fun Li (University of Victoria, Canada)
Miguel Bote (University of Valladolid, Spain)
Pedro Muñoz (University of Carlos III, Spain)
Track 7: Cloud Computing: Services, Storage, Security and Privacy

Track Co-chairs
Javid Taheri (Karlstad University, Sweden)
Shuiguang Deng (Zhejiang University, China)

TPC Members
Ejaz Ahmed (National Institute of Standards and Technology, USA)
Asad Malik (National University of Science and Technology, Pakistan)
Usman Shahid (COMSATS University Islamabad, Pakistan)
Assad Abbas (North Dakota State University, USA)
Nikolaos Tziritas (Chinese Academy of Sciences, China)
Osman Khalid (COMSATS University Islamabad, Pakistan)
Kashif Bilal (Qatar University, Qatar)
Javid Taheri (Karlstad University, Sweden)
Saif Rehman (COMSATS University Islamabad, Pakistan)
Inayat Babar (University of Engineering and Technology, Pakistan)
Thanasis Loukopoulos (Technological Educational Institute of Athens, Greece)
Mazhar Ali (COMSATS University Islamabad, Pakistan)
Tariq Umer (COMSATS University Islamabad, Pakistan)
Track 8: Social Networking and Collaborative Systems

Track Co-chairs
Nicola Capuano (University of Basilicata, Italy)
Dusan Soltes (Comenius University in Bratislava, Slovakia)
Yusuke Sakumoto (Kwansei Gakuin University, Japan)

TPC Members
Santi Caballé (Open University of Catalonia, Spain)
Thanasis Daradoumis (University of the Aegean, Greece)
Angelo Gaeta (University of Salerno, Italy)
Christian Guetl (Graz University of Technology, Austria)
Miltiadis Lytras (American College of Greece)
Agathe Merceron (Beuth University of Applied Sciences Berlin, Germany)
Francis Palma (Screaming Power, Canada)
Krassen Stefanov (Sofia University "St. Kliment Ohridski", Bulgaria)
Daniele Toti (Roma Tre University, Italy)
Jian Wang (Wuhan University, China)
Jing Xiao (South China Normal University, China)
Jian Yu (Auckland University of Technology, Australia)
Aida Masaki (Tokyo Metropolitan University, Japan)
Takano Chisa (Hiroshima City University, Japan)
Sho Tsugawa (Tsukuba University, Japan)
Track 9: Intelligent and Collaborative Systems for e-Health

Track Co-chairs
Massimo Esposito (Institute for High Performance Computing and Networking - National Research Council of Italy, Italy)
Mario Ciampi (Institute for High Performance Computing and Networking - National Research Council of Italy, Italy)
Giovanni Luca Masala (University of Plymouth, UK)

TPC Members
Tim Brown (Australian National University, Australia)
Mario Marcos do Espirito Santo (Universidade Estadual de Montes Claros, Brazil)
Jana Heckenbergerova (University of Pardubice, Czech Republic)
Zdenek Matej (Masaryk University, Czech Republic)
Michal Musilek (University of Hradec Kralove, Czech Republic)
Michal Prauzek (VSB-TU Ostrava, Czech Republic)
Vaclav Prenosil (Masaryk University, Czech Republic)
Alvin C. Valera (Singapore Management University, Singapore)
Nasem Badr El Din (University of Manitoba, Canada)
Emil Pelikan (Academy of Sciences, Czech Republic)
Joanne Nightingale (National Physical Laboratory, UK)
Tomas Barton (University of Alberta, Canada)
Track 10: Big Data Analytics for Learning, Networking and Collaborative Systems

Track Co-chairs
Miroslav Voznak (VSB-Technical University of Ostrava, Czech Republic)
Akihiro Fujihara (Chiba Institute of Technology, Japan)
Lukas Vojtech (Czech Technical University in Prague, Czech Republic)

TPC Members
Nobuyuki Tsuchimura (Kwansei Gakuin University, Japan)
Masanori Nakamichi (Fukui University of Technology, Japan)
Masahiro Shibata (Kyushu Institute of Technology, Japan)
Yusuke Ide (Kanazawa Institute of Technology, Japan)
Takayuki Shimotomai (Advanced Simulation Technology Of Mechanics R&D, Japan)
Dinh-Thuan Do (Ton Duc Thang University, Vietnam)
Floriano De Rango (University of Calabria, Italy)
Homero Toral-Cruz (University of Quintana Roo, Mexico)
Remigiusz Baran (Kielce University of Technology, Poland)
Mindaugas Kurmis (Klaipeda State University of Applied Sciences, Lithuania)
Radek Martinek (VSB-Technical University of Ostrava, Czech Republic)
Mauro Tropea (University of Calabria, Italy)
Gokhan Ilk (Ankara University, Turkey)
Shino Iwami (Microsoft, Japan)
INCoS-2021 Reviewers

Amato Flora, Barolli Admir, Barolli Leonard, Bylykbashi Kevin, Caballé Santi, Capuano Nicola, Cui Baojiang, Enokido Tomoya, Esposito Christian, Fenza Giuseppe, Ficco Massimo, Fiore Ugo, Fujihara Akihiro, Fun Li Kin, Funabiki Nobuo, Gañán David, Hsing-Chung Chen, Hussain Farookh, Hussain Omar, Ikeda Makoto, Ishida Tomoyuki, Javaid Nadeem, Joshua Hae-Duck, Kohana Masaki, Kolici Vladi, Koyama Akio, Kromer Pavel, Kryvinska Natalia, Kulla Elis, Leu Fang-Yie, Leung Carson, Li Yiu, Maeda Hiroshi, Mangione Giuseppina Rita, Matsuo Keita, Messina Fabrizio, Miguel Jorge, Miwa Hiroyoshi, Natwichai Juggapong, Nadeem Javaid, Nalepa Jakub, Nowakowa Jana, Ogiela Lidia, Ogiela Marek, Orciuoli Francesco, Palmieri Francesco, Pardede Eric, Poniszewska-Maranda Aneta, Rahayu Wenny, Rawat Danda, Sakaji Hiroki, Shibata Masahiro, Snasel Vaclav, Spaho Evjola, Sukumoto Yusuke, Taniar David, Takizawa Makoto, Terzo Olivier, Thomo Alex, Tsukamoto Kazuya, Tsuru Masato, Uchida Masato, Uehara Minoru, Venticinque Salvatore, Wang Xu An, Woungang Isaac
INCoS-2021 Keynotes
Big Data Management for Data Streams

Wenny Rahayu
La Trobe University, Melbourne, Australia
Abstract. One of the main drivers behind big data in recent years has been the proliferation of applications and devices that generate data with high velocity in multiple formats. These devices include IoT sensors, mobile devices, GPS trackers and so on. This new generation of data, called data streams, requires new ways to manage, process and analyse. These data streams drive the need for a new database architecture that is able to manage the complexity of multiple data formats, deal with high-speed data and integrate them into a scalable data management system. In this talk, the primary motivation for using data lakes, a new wave of database management that is underpinned by the need to deal with the volume and variety of big data storage, will be presented. Then, some case studies demonstrating the development of big data ecosystems involving data streams will be discussed. These case studies include the development of a data lake for a smart factory with sensor data collection/ingestion, and a big data system for GPS crowdsourcing as part of community planning.
Convergence of Broadcast and Broadband in 5G Era
Yusuke Gotoh
Okayama University, Okayama, Japan
Abstract. In order to converge broadband and broadcast, the realization of TV viewing systems on mobile devices is a particularly important challenge. Action on standardized mobile communication technologies for multicast transmission started in 2006, and now Further evolved Multimedia Broadcast Multicast Service (FeMBMS) is an official component of 5G in 3GPP as an LTE-based 5G terrestrial broadcasting system. In this talk, I will introduce the technologies for the convergence of broadband and broadcast in the 5G era. Furthermore, I will introduce our recent work related to broadcasting technology that maintains compatibility with 5G mobile networks.
Contents
Performance Comparison of CM and LDVM Router Replacement Methods for WMNs by WMN-PSOSA-DGA Hybrid Simulation System Considering Stadium Distribution of Mesh Clients . . . 1
Admir Barolli, Shinji Sakamoto, Leonard Barolli, and Makoto Takizawa

Effects of Augmented Reality Markers for Networked Robot Navigation . . . 11
Masato Ogata, Masato Inoue, Kiyotaka Izumi, and Takeshi Tsujimura

Algorithm Based on Local Search Method for Examination Proctors Assignment Problem Considering Various Constraints . . . 23
Takahiro Nishikawa and Hiroyoshi Miwa

Bio-inspired VM Introspection for Securing Collaboration Platforms . . . 32
Huseyn Huseynov, Tarek Saadawi, and Kenichi Kourai

Artificial Intelligence-Based Early Prediction Techniques in Agri-Tech Domain . . . 42
Alessandra Amato, Flora Amato, Leopoldo Angrisani, Leonard Barolli, Francesco Bonavolontà, Gianluca Neglia, and Oscar Tamburis

Automatic Measurement of Acquisition for COVID-19 Related Information . . . 49
Alessandra Amato, Flora Amato, Leonard Barolli, and Francesco Bonavolontà

Algorithms for Mastering Board Game Nanahoshi Considering Deep Neural Networks . . . 59
Masato Saito and Hiroyoshi Miwa

Revealing COVID-19 Data by Data Mining and Visualization . . . 70
Carson K. Leung, Tyson N. Kaufmann, Yan Wen, Chenru Zhao, and Hao Zheng

An Approach to Enhance Academic Ranking Prediction with Augmented Social Perception Data . . . 84
Kittayaporn Chantaranimi, Prompong Sugunsil, and Juggapong Natwichai

A Fuzzy-Based System for User Service Level Agreement in 5G Wireless Networks . . . 96
Phudit Ampririt, Ermioni Qafzezi, Kevin Bylykbashi, Makoto Ikeda, Keita Matsuo, and Leonard Barolli

Cognitive Approach for Creation of Visual Security Codes . . . 107
Urszula Ogiela and Marek R. Ogiela

Transformative Computing Based on Advanced Human Cognitive Processes . . . 112
Urszula Ogiela, Makoto Takizawa, and Lidia Ogiela

Topology as a Factor in Overlay Networks Designed to Support Dynamic Systems Modeling . . . 116
Abbe Mowshowitz, Akira Kawaguchi, and Masato Tsuru

A Genetic Algorithm for Parallel Unmanned Aerial Vehicle Scheduling: A Cost Minimization Approach . . . 125
Aprinaldi Jasa Mantau, Irawan Widi Widayat, and Mario Köppen

A Movement Adjustment Method for DQN-Based Autonomous Aerial Vehicle . . . 136
Nobuki Saito, Tetsuya Oda, Aoto Hirata, Kyohei Toyoshima, Masaharu Hirota, and Leonard Barolli

A Self-learning Clustering Protocol in Wireless Sensor Networks for IoT Applications . . . 149
Nhat Tien Nguyen, Thien T. T. Le, Miroslav Voznak, and Jaroslav Zdralek

The Effect of Agents' Diversities on the Running Time of the Random Walk-Based Rendezvous Search . . . 158
Fumiya Toyoda and Yusuke Sakumoto

A Study on Designing Autonomous Decentralized Method of User-Aware Resource Assignment in Large-Scale and Wide-Area Networks . . . 169
Toshitaka Kashimoto, Fumiya Toyoda, and Yusuke Sakumoto

Social Media Data Misuse . . . 183
Tariq Soussan and Marcello Trovati

Deep Learning Approaches to Detect Real Time Events Recognition in Smart Manufacturing Systems – A Short Survey . . . 190
Suleman Awan and Marcello Trovati

A Comparison Study of CM and RIWM Router Replacement Methods for WMNs Considering Boulevard Distribution of Mesh Clients . . . 195
Peng Xu, Admir Barolli, Phudit Ampririt, Shinji Sakamoto, and Leonard Barolli

Consideration of Presentation Timing in Bicycle Navigation Using Smart Glasses . . . 209
Takahiro Uchiya and Ryo Futamura

Graph Convolution Network for Urban Mobile Traffic Prediction . . . 218
Changliang Yu, Zhiyang Ye, and Nan Zhao

Deep Reinforcement Learning for Task Allocation in UAV-enabled Mobile Edge Computing . . . 225
Changliang Yu, Wei Du, Fan Ren, and Nan Zhao

Medical Image Analysis with NVIDIA Jetson GPU Modules . . . 233
Pavel Krömer and Jana Nowaková

Analysis of Optical Mapping Data with Neural Network . . . 243
Vít Doleží and Petr Gajdoš

Evolutionary Multi-level Thresholding for Breast Thermogram Segmentation . . . 253
Arti Tiwari, Kamanasish Bhattacharjee, Millie Pant, Jana Nowakova, and Vaclav Snasel

Identification of the Occurrence of Poor Blood Circulation in Toes by Processing Thermal Images from Flir Lepton Module . . . 264
Martin Radvansky, Martin Radvansky Jr., and Milos Kudelka

License Trading System for Video Contents Using Smart Contract on Blockchain . . . 274
Kosuke Mori, Kota Nakazawa, and Hiroyoshi Miwa

Query Processing in Highly Distributed Environments . . . 283
Akira Kawaguchi, Nguyen Viet Ha, Masato Tsuru, Abbe Mowshowitz, and Masahiro Shibata

Loose Matching Approach Considering the Time Constraint for Spatio-Temporal Content Discovery . . . 295
Shota Akiyoshi, Yuzo Taenaka, Kazuya Tsukamoto, and Myung Lee

Optimized Memory Encryption for VMs Across Multiple Hosts . . . 307
Shuhei Horio, Kouta Takahashi, Kenichi Kourai, and Lukman Ab. Rahim

Blockchain Simulation Environment on Multi-image Encryption for Smart Farming Application . . . 316
Irawan Widi Widayat and Mario Köppen

Author Index . . . 327
Performance Comparison of CM and LDVM Router Replacement Methods for WMNs by WMN-PSOSA-DGA Hybrid Simulation System Considering Stadium Distribution of Mesh Clients

Admir Barolli (1), Shinji Sakamoto (2), Leonard Barolli (3), and Makoto Takizawa (4)

(1) Department of Information Technology, Aleksander Moisiu University of Durres, L.1, Rruga e Currilave, Durres, Albania
(2) Department of Computer and Information Science, Seikei University, 3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan
[email protected]
(3) Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected]
(4) Research Center for Computing and Multimedia Studies, Hosei University, 3-7-2 Kajino-Cho, Koganei-Shi, Tokyo 184-8584, Japan
[email protected]
Abstract. Wireless Mesh Networks (WMNs) have many advantages such as: easy maintenance, low upfront cost and high robustness. The connectivity and stability affect directly the performance of WMNs. However, WMNs have some problems such as node placement problem, hidden terminal problem and so on. In our previous work, we implemented a simulation system to solve the node placement problem in WMNs considering Particle Swarm Optimization (PSO), Simulated Annealing (SA) and Distributed Genetic Algorithm (DGA), called WMN-PSOSA-DGA. In this paper, we compare the performance of Constriction Method (CM) and Linearly Decreasing Vmax Method (LDVM) for WMNs by using the WMN-PSOSA-DGA hybrid simulation system considering the Stadium distribution of mesh clients. Simulation results show that LDVM has better performance than CM.
1 Introduction
The wireless networks and devices are becoming increasingly popular and they provide users access to information and communication anytime and anywhere [6,11,15]. Wireless Mesh Networks (WMNs) are gaining a lot of attention because of their low cost nature that makes them attractive for providing wireless
Internet connectivity. A WMN is dynamically self-organized and self-configured, with the nodes in the network automatically establishing and maintaining mesh connectivity among themselves (creating, in effect, an ad hoc network). This feature brings many advantages to WMNs such as low up-front cost, easy network maintenance, robustness and reliable service coverage [1,12,13]. Mesh node placement in WMNs can be seen as a family of problems, which are shown to be computationally hard to solve for most of the formulations [8,25]. We consider the version of the mesh router nodes placement problem in which we are given a grid area where to deploy a number of mesh router nodes and a number of mesh client nodes of fixed positions (of an arbitrary distribution) in the grid area. The objective is to find a location assignment for the mesh routers to the cells of the grid area that maximizes the network connectivity and client coverage. Network connectivity is measured by the Size of Giant Component (SGC) of the resulting WMN graph, while the user coverage is simply the number of mesh client nodes that fall within the radio coverage of at least one mesh router node and is measured by the Number of Covered Mesh Clients (NCMC). Node placement problems are known to be computationally hard to solve [9,26]. In previous works, some intelligent algorithms have been investigated for the node placement problem [2,14]. In [19], we implemented a Particle Swarm Optimization (PSO) and Simulated Annealing (SA) based simulation system, called WMN-PSOSA. Also, we implemented another simulation system based on Genetic Algorithm (GA), called WMN-GA [2,10], for solving the node placement problem in WMNs. Then, we designed a hybrid intelligent system based on PSO, SA and DGA, called WMN-PSOSA-DGA [18]. In this paper, we compare the performance of Constriction Method (CM) and Linearly Decreasing Vmax Method (LDVM) for WMNs by using the WMN-PSOSA-DGA hybrid simulation system considering the Stadium distribution of mesh clients. The rest of the paper is organized as follows. We present our designed and implemented hybrid simulation system in Sect. 2. The simulation results are given in Sect. 3. Finally, we give conclusions and future work in Sect. 4.
2 Proposed and Implemented Simulation System
Distributed Genetic Algorithms (DGAs) are capable of producing solutions with higher efficiency (in terms of time) and efficacy (in terms of better quality solutions). They have shown their usefulness for the resolution of many computationally hard combinatorial optimization problems. Also, Particle Swarm Optimization (PSO) and Simulated Annealing (SA) are suitable for solving NP-hard problems.
2.1 Velocities and Positions of Particles
WMN-PSOSA-DGA decides the velocity of particles by a random process considering the area size. For instance, when the area size is $W \times H$, the velocity is decided randomly from $-\sqrt{W^2 + H^2}$ to $\sqrt{W^2 + H^2}$. Each particle's velocity is updated by a simple rule.

For the SA mechanism, the next position of each particle is used as the neighbor solution $s'$. The fitness function $f$ gives points to the current solution $s$. If $f(s')$ is larger than $f(s)$, then $s'$ is better than $s$, so $s$ is updated to $s'$. However, if $f(s')$ is not larger than $f(s)$, $s$ may still be updated with probability $\exp\left(\frac{f(s') - f(s)}{T}\right)$, where $T$ is called the "temperature value", which is decreased as the computation proceeds so that the probability of updating also decreases. This mechanism of SA is called a cooling schedule, and the next temperature value of the computation is calculated as $T_{n+1} = \alpha \times T_n$. In this paper, we set the starting temperature, ending temperature and number of iterations, and calculate $\alpha$ as

$$\alpha = \left(\frac{\text{SA ending temperature}}{\text{SA starting temperature}}\right)^{1.0/\text{number of iterations}}.$$
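To make the cooling schedule concrete, the following is a minimal Python sketch of the SA acceptance rule and the geometric temperature decay described above; the solution representation and fitness values are placeholders, not the actual WMN-PSOSA-DGA implementation.

```python
import math
import random

def cooling_rate(t_start, t_end, iterations):
    # alpha = (SA ending temperature / SA starting temperature)^(1/iterations)
    return (t_end / t_start) ** (1.0 / iterations)

def sa_accept(f_current, f_neighbor, temperature):
    # Always accept an improving neighbor solution s'; otherwise accept
    # with probability exp((f(s') - f(s)) / T), which shrinks as T cools.
    if f_neighbor > f_current:
        return True
    return random.random() < math.exp((f_neighbor - f_current) / temperature)

# Example with the settings listed later in Table 1:
# T goes from 10.0 to 0.01 over 64000 iterations.
T = 10.0
alpha = cooling_rate(10.0, 0.01, 64000)
for _ in range(64000):
    # ... generate a neighbor, call sa_accept(...), update the solution ...
    T *= alpha  # T_{n+1} = alpha * T_n
print(round(T, 4))  # ends near 0.01
```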
It should be noted that when the solution $s$ is not updated, the positions are not updated but the velocities are.

2.2 Router Replacement Methods
A mesh router has x, y positions and a velocity. Mesh routers are moved based on their velocities. There are many router replacement methods. In this paper, we use CM and LDVM.

Constriction Method (CM): In CM, PSO parameters are set to a weak stable region (ω = 0.729, C1 = C2 = 1.4955) based on the analysis of PSO by M. Clerc et al. [3,7,21].

Random Inertia Weight Method (RIWM): In RIWM, the ω parameter changes randomly from 0.5 to 1.0, while C1 and C2 are kept at 2.0. The ω can be estimated by the weak stable region; the average of ω is 0.75 [5,23].

Linearly Decreasing Inertia Weight Method (LDIWM): In LDIWM, C1 and C2 are set to 2.0, constantly. On the other hand, the ω parameter is changed linearly from the unstable region (ω = 0.9) to the stable region (ω = 0.4) with increasing iterations of computation [4,24].

Linearly Decreasing Vmax Method (LDVM): In LDVM, PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). A value Vmax, which is the maximum velocity of particles, is considered. With increasing iterations of computation, Vmax is kept decreasing linearly [5,20,22].
Fig. 1. Model of WMN-PSOSA-DGA migration.
Fig. 2. Relationship among global solution, particle-patterns and mesh routers in PSOSA part.
Rational Decrement of Vmax Method (RDVM): In RDVM, PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). The Vmax is kept decreasing with the increasing of iterations as

$$V_{\max}(x) = \sqrt{W^2 + H^2} \times \frac{T - x}{x},$$

where $W$ and $H$ are the width and the height of the considered area, respectively, and $T$ and $x$ are the total number of iterations and the current iteration, respectively [17].
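For illustration, a small Python sketch of per-iteration Vmax schedules for LDVM and RDVM; the final LDVM value is an assumption, since the text only states that Vmax decreases linearly from its initial value.

```python
import math

def vmax_ldvm(width, height, iteration, total_iters, v_end=0.0):
    # LDVM: start from the diagonal of the considered area and
    # decrease linearly; v_end = 0.0 is an assumed final value.
    v_start = math.sqrt(width ** 2 + height ** 2)
    return v_start + (v_end - v_start) * iteration / total_iters

def vmax_rdvm(width, height, iteration, total_iters):
    # RDVM: Vmax(x) = sqrt(W^2 + H^2) * (T - x) / x
    return math.sqrt(width ** 2 + height ** 2) * \
        (total_iters - iteration) / iteration

# Example for the 32.0 x 32.0 area and 64000 iterations used later.
for x in (1, 16000, 32000, 64000):
    print(x, round(vmax_ldvm(32.0, 32.0, x, 64000), 2),
          round(vmax_rdvm(32.0, 32.0, x, 64000), 2))
```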
2.3 DGA Operations
Population of individuals: Unlike local search techniques that construct a path in the solution space jumping from one solution to another through local perturbations, DGAs use a population of individuals, giving the search a larger scope and better chances of finding good solutions. This feature is also known as the "exploration" process, in contrast to the "exploitation" process of local search methods.

Selection: The selection of individuals to be crossed is another important aspect of DGA, as it impacts the convergence of the algorithm. Several selection schemes have been proposed in the literature, trying to cope with premature convergence of DGA. There are many selection methods in GA. In our system, we implement two selection methods: the random method and the roulette wheel method.

Crossover operators: The use of crossover operators is one of the most important characteristics of DGA. The crossover operator is the means by which DGA transmits the best genetic features of parents to offspring during the generations of the evolution process. Many crossover operators have been proposed, such as Blend Crossover (BLX-α), Unimodal Normal Distribution Crossover (UNDX) and Simplex Crossover (SPX).

Mutation operators: These operators intend to improve the individuals of a population by small local perturbations. They aim to provide a component of randomness in the neighborhood of the individuals of the population. In our system, we implemented two mutation methods: uniformly random mutation and boundary mutation.

Escaping from local optima: GA itself has the ability to avoid falling prematurely into local optima and can eventually escape from them during the search process. DGA has one more mechanism to escape from local optima by considering some islands. Each island computes GA for optimization, and the islands migrate their genes to provide the ability to avoid local optima.

Convergence: The convergence of the algorithm is the mechanism by which DGA reaches good solutions. A premature convergence of the algorithm would cause all individuals of the population to be similar in their genetic features; the search would thus become ineffective and the algorithm would get stuck in local optima. Maintaining the diversity of the population is therefore very important to this family of evolutionary algorithms.

In the following, we present the fitness function, migration function, particle-pattern and gene coding. A minimal sketch of the selection and mutation operators used in our system is given below.
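The sketch below illustrates roulette wheel selection and boundary mutation in Python; the gene representation (a list of router coordinates) and the exact boundary behavior are assumptions of this sketch, not the system's actual data structures.

```python
import random

def roulette_wheel_select(population, fitnesses):
    # Pick an individual with probability proportional to its fitness.
    pick = random.uniform(0.0, sum(fitnesses))
    acc = 0.0
    for individual, fit in zip(population, fitnesses):
        acc += fit
        if acc >= pick:
            return individual
    return population[-1]

def boundary_mutate(gene, area_w, area_h, rate=0.2):
    # Boundary mutation: with the given rate, move a router's
    # coordinates to one of the boundaries of the considered area.
    mutated = []
    for x, y in gene:
        if random.random() < rate:
            x = random.choice([0.0, area_w])
            y = random.choice([0.0, area_h])
        mutated.append((x, y))
    return mutated
```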
2.4 Fitness and Migration Functions
The determination of an appropriate fitness function, together with the chromosome encoding, is crucial to the performance. Therefore, one of the most important things is the determination of an appropriate objective function and its encoding. In our case, each particle-pattern and gene has its own fitness value, which is comparable with other fitness values in order to share information about the global solution. The fitness function follows a hierarchical approach in which the main objective is to maximize the SGC in the WMN. Thus, the fitness function of this scenario is defined as

$$\text{Fitness} = 0.7 \times \text{SGC}(x_{ij}, y_{ij}) + 0.3 \times \text{NCMC}(x_{ij}, y_{ij}).$$

Our implemented simulation system uses the Migration function as shown in Fig. 1. The Migration function swaps solutions between the PSOSA part and the DGA part.
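As an illustration, a minimal Python sketch of how this fitness could be evaluated from router and client positions. The connectivity rule (two routers are linked when their coverage circles overlap) is an assumption of this sketch, not necessarily the system's exact graph model.

```python
import math
from collections import deque

def giant_component_size(routers, comm_range):
    # SGC: size of the largest connected component, found by BFS
    # over routers within communication range of each other.
    n, seen, best = len(routers), set(), 0
    for s in range(n):
        if s in seen:
            continue
        seen.add(s)
        size, queue = 1, deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if v not in seen and math.dist(routers[u], routers[v]) <= comm_range:
                    seen.add(v)
                    size += 1
                    queue.append(v)
        best = max(best, size)
    return best

def covered_clients(routers, clients, radius):
    # NCMC: a client counts as covered if it lies within the radio
    # coverage of at least one mesh router.
    return sum(any(math.dist(c, r) <= radius for r in routers) for c in clients)

def fitness(routers, clients, radius):
    # Fitness = 0.7 * SGC + 0.3 * NCMC, as defined above.
    return (0.7 * giant_component_size(routers, 2 * radius)
            + 0.3 * covered_clients(routers, clients, radius))
```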
Fig. 3. Stadium distribution.
Fig. 4. Simulation results of WMN-PSOSA-DGA for SGC.
2.5 Particle-Pattern and Gene Coding
In order to swap solutions, we design particle-patterns and gene coding carefully. A particle is a mesh router. Each particle has a position in the considered area and velocities. A fitness value of a particle-pattern is computed by the combination of mesh router and mesh client positions. In other words, each particle-pattern is a solution, as shown in Fig. 2. A gene describes a WMN. Each individual has its own combination of mesh nodes. In other words, each individual has a fitness value. Therefore, the combination of mesh nodes is a solution.

Table 1. WMN-PSOSA-DGA parameters.

Clients distribution: Stadium distribution
Area size: 32.0 × 32.0
Number of mesh routers: 24
Number of mesh clients: 48
Number of GA islands: 16
Number of particle-patterns: 32
Number of migrations: 200
Evolution steps: 320
Radius of a mesh router: 2.0–3.5
Selection method: Roulette wheel method
Crossover method: SPX
Mutation method: Boundary mutation
Crossover rate: 0.8
Mutation rate: 0.2
SA starting value: 10.0
SA ending value: 0.01
Total number of iterations: 64000
Replacement method: CM, LDVM
Fig. 5. Simulation results of WMN-PSOSA-DGA for NCMC.
3 Simulation Results
In this section, we show the simulation results. In this work, we analyze the performance of WMNs by using the WMN-PSOSA-DGA hybrid intelligent simulation system considering the Stadium distribution [16] shown in Fig. 3. We carried out the simulations 10 times in order to avoid the effect of randomness and obtain a general view of the results. We show the parameter settings for WMN-PSOSA-DGA in Table 1. We show the simulation results in Fig. 4 and Fig. 5. We consider 24 mesh routers. We see that for the SGC, WMN-PSOSA-DGA reaches the maximum for both replacement methods. However, for the NCMC, the performance of LDVM is better compared with the performance of CM.
4 Conclusions
In this work, we evaluated the performance of WMNs by using a hybrid simulation system based on PSO, SA and DGA (called WMN-PSOSA-DGA) considering the Stadium distribution of mesh clients. Simulation results show that LDVM achieves better performance than CM. In our future work, we would like to evaluate the performance of the proposed system for different parameters and patterns.
References

1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
2. Barolli, A., Sakamoto, S., Ozera, K., Barolli, L., Kulla, E., Takizawa, M.: Design and implementation of a hybrid intelligent system based on particle swarm optimization and distributed genetic algorithm. In: International Conference on Emerging Internetworking, Data and Web Technologies, pp. 79–93. Springer (2018)
3. Barolli, A., Sakamoto, S., Durresi, H., Ohara, S., Barolli, L., Takizawa, M.: A comparison study of constriction and linearly decreasing Vmax replacement methods for wireless mesh networks by WMN-PSOHC-DGA simulation system. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 26–34. Springer (2019)
4. Barolli, A., Sakamoto, S., Ohara, S., Barolli, L., Takizawa, M.: Performance analysis of WMNs by WMN-PSOHC-DGA simulation system considering linearly decreasing inertia weight and linearly decreasing Vmax replacement methods. In: International Conference on Intelligent Networking and Collaborative Systems, pp. 14–23. Springer (2019)
5. Barolli, A., Sakamoto, S., Ohara, S., Barolli, L., Takizawa, M.: Performance analysis of WMNs by WMN-PSOHC-DGA simulation system considering random inertia weight and linearly decreasing Vmax router replacement methods. In: Conference on Complex, Intelligent, and Software Intensive Systems, pp. 13–21. Springer (2019)
6. Barolli, A., Sakamoto, S., Ohara, S., Barolli, L., Takizawa, M.: Performance evaluation of WMNs using WMN-PSOHC-DGA considering evolution steps and computation time. In: International Conference on Emerging Internetworking, Data and Web Technologies (EIDWT-2020), pp. 127–137. Springer
7. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
8. Hirata, A., Oda, T., Saito, N., Hirota, M., Katayama, K.: A coverage construction method based hill climbing approach for mesh router placement optimization. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 355–364. Springer (2020)
9. Maolin, T., et al.: Gateways placement in backbone wireless mesh networks. Int. J. Commun. Netw. Syst. Sci. 2(1), 44 (2009)
10. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance analysis of WMNs by WMN-GA simulation system for two WMN architectures and different TCP congestion-avoidance algorithms and client distributions. Int. J. Commun. Netw. Distrib. Syst. 20(3), 335–351 (2018)
11. Ohara, S., Barolli, A., Sakamoto, S., Barolli, L.: Performance analysis of WMNs by WMN-PSODGA simulation system considering load balancing and client uniform distribution. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 25–38. Springer (2019)
12. Ohara, S., Durresi, H., Barolli, A., Sakamoto, S., Barolli, L.: A hybrid intelligent simulation system for node placement in WMNs considering load balancing: a comparison study for exponential and normal distribution of mesh clients. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 555–569. Springer (2019)
13. Ohara, S., Qafzezi, E., Barolli, A., Sakamoto, S., Liu, Y., Barolli, L.: WMN-PSODGA - an intelligent hybrid simulation system for WMNs considering load balancing: a comparison for different client distributions. Int. J. Distrib. Syst. Technol. (IJDST) 11(4), 39–52 (2020)
14. Ozera, K., Sakamoto, S., Elmazi, D., Bylykbashi, K., Ikeda, M., Barolli, L.: A fuzzy approach for clustering in MANETs: performance evaluation for different parameters. Int. J. Space-Based Situat. Comput. 7(3), 166–176 (2017)
15. Ozera, K., Inaba, T., Bylykbashi, K., Sakamoto, S., Ikeda, M., Barolli, L.: A WLAN triage testbed based on fuzzy logic and its performance evaluation for different number of clients and throughput parameter. Int. J. Grid Util. Comput. 10(2), 168–178 (2019)
16. Sakamoto, S., Oda, T., Bravo, A., Barolli, L., Ikeda, M., Xhafa, F.: WMN-SA system for node placement in WMNs: evaluation for different realistic distributions of mesh clients. In: The IEEE 28th International Conference on Advanced Information Networking and Applications (AINA-2014), pp. 282–288. IEEE (2014)
17. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a new replacement method in WMN-PSO simulation system and its performance evaluation. In: The 30th IEEE International Conference on Advanced Information Networking and Applications (AINA-2016), pp. 206–211 (2016). https://doi.org/10.1109/AINA.2016.42
18. Sakamoto, S., Barolli, A., Barolli, L., Takizawa, M.: Design and implementation of a hybrid intelligent system based on particle swarm optimization, hill climbing and distributed genetic algorithm for node placement problem in WMNs: a comparison study. In: The 32nd IEEE International Conference on Advanced Information Networking and Applications (AINA-2018), pp. 678–685. IEEE (2018)
19. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mob. Netw. Appl. 23(1), 27–33 (2018)
20. Sakamoto, S., Ohara, S., Barolli, L., Okamoto, S.: Performance evaluation of WMNs by WMN-PSOHC system considering random inertia weight and linearly decreasing Vmax replacement methods. In: International Conference on Network-Based Information Systems, pp. 27–36. Springer (2019)
21. Sakamoto, S., Ohara, S., Barolli, L., Okamoto, S.: Performance evaluation of WMNs by WMN-PSOHC system considering constriction and linearly decreasing inertia weight replacement methods. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 22–31. Springer (2019)
22. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Glob. Optim. 31(1), 93–108 (2005)
23. Shi, Y.: Particle swarm optimization. IEEE Connect. 2(1), 8–13 (2004)
24. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Evolutionary Programming VII, pp. 591–600 (1998)
25. Vanhatupa, T., Hannikainen, M., Hamalainen, T.: Genetic algorithm to optimize node placement and configuration for WLAN planning. In: The 4th IEEE International Symposium on Wireless Communication Systems, pp. 612–616 (2007)
26. Wang, J., Xie, B., Cai, K., Agrawal, D.P.: Efficient mesh router placement in wireless mesh networks. In: Proceedings of IEEE International Conference on Mobile Adhoc and Sensor Systems (MASS-2007), pp. 1–9 (2007)
Effects of Augmented Reality Markers for Networked Robot Navigation

Masato Ogata (1), Masato Inoue (1), Kiyotaka Izumi (2), and Takeshi Tsujimura (2)

(1) Graduate School of Science and Engineering, Saga University, Saga, Japan
{20713001,21729004}@edu.cc.saga-u.ac.jp
(2) Department of Mechanical Systems Engineering, Saga University, Saga, Japan
[email protected], [email protected]
Abstract. The authors have studied augmented reality techniques applied to a support robot navigation system. It helps care recipients to remotely control robots for themselves with simple commands through a communication network. This paper focuses on the identification and geometrical measurement of planar markers with mosaic patterns. Experiments reveal that the identification rate as well as measurement accuracy is related to feature points of the pattern. They also clarify that there is a positive correlation between the number of feature points and the accuracy of marker recognition. This research project also proposes the optimum AR markers for networked welfare robot navigation.
1 Introduction

Under the current severe situations in social welfare, the amount of labor required of caregivers needs to be reduced. This paper proposes a support system that allows people in a frail state to operate remote robots by themselves using a communication network. The authors are studying the use of augmented reality (AR) technology to recognize the environment. Augmented reality is utilized by superimposing virtual objects onto real space [1]. The authors have applied the AR marker position/orientation estimation function to robot navigation [2–6]. The accuracy of geometric estimation and marker identification is important for robot navigation. This paper focuses on the feature points of two-dimensional AR markers and investigates the effect of the number of points. It also proposes the optimal marker and evaluates its performance.
2 Welfare Robot Navigation System

The authors have planned to establish a welfare robot navigation system based on augmented reality localization and mapping techniques. A conceptual diagram of the welfare robot system is illustrated in Fig. 1. A patient lying on the bed gives commands to the AR navigation system to make robots act on the intended tasks. After interpreting the command, the AR system instructs the robot to perform the desired task on the target. AR markers are attached to the target objects in advance. The robot searches for the
target while identifying AR markers attached to objects in the environment in real time [7, 8]. For example, if an operator sends the command "Fetch me the bottle", the robot will move to the next room and bring back the target bottle by identifying the AR marker "B" on the bottle.

Figure 2 shows an overview of the AR processing system, which is installed in the welfare robot in Fig. 1. The robot is equipped with an android device containing the AR system. The target manager of the Vuforia engine registers images as markers and retrieves the package. It can specify the dimensions of the marker, load the package into Unity, associate the information to overlay, and take a picture of the desired image with the camera. It measures the position and orientation of the marker in Unity. The AR processing procedure is performed according to Fig. 2 as follows:

(1) The android device sends the recognized 3D physical image to the Vuforia object scanner.
(2) The Vuforia object scanner sends the feature point information of the 3D entity to the Vuforia Engine.
(3) The Vuforia engine associates Unity with occlusion.
(4) Unity operates the AR system on the Vuforia Engine.
(5) The Vuforia Engine sends system information to the camera device.
(6) The camera device sends feedback to Unity.
Fig. 1. Welfare robot system.
Fig. 2. AR processing system overview.
3 Experiments on Planar AR Markers

Some experiments have been conducted using the prototype AR system described in the previous section. A Microsoft® LifeCam HD-5000 camera was used. AR markers are prepared by printing mosaic patterns made of small squares on cards, such as illustrated in Fig. 3.
Fig. 3. Standard two-dimensional AR marker.
3.1 Geometrical Measurement

First of all, we have evaluated the fundamental characteristics in estimating the distance and angle of AR markers. An AR marker is set 400 mm apart, facing the camera, as shown in Fig. 4. The coordinate system is fixed on the camera, which moves by 50 mm in the x-direction. The AR system provides the three-dimensional location data of the AR marker. At the initial position O of the camera, the measured z-axis value indicates the distance to the marker.
Fig. 4. Experiment of marker distance estimation.
The experimental results are shown in Table 1. The measurement error is the smallest in x-coordinate when the camera is placed at the point O. Error is the largest when it is ±100 mm away from the point O. As a result of the measurement, the average distance error is about 14%.
Table 1. Estimated distance to markers.

Position   x ideal [mm]   x measured [mm]   y ideal [mm]   y measured [mm]   z ideal [mm]   z measured [mm]
A          100            86.06             50             51.32             400            373.96
B          50             41.53             50             51.30             400            378.68
O          0              0.16              50             50.87             400            391.03
C          −50            −42.14            50             51.19             400            381.31
D          −100           −90.63            50             51.60             400            376.05
Fig. 5. Experiment of marker angle estimation.
The graph of the results of the measurements at each angle show the proportional relationship between the true and measured angles. As a result of the measurement, the average error of the angle was about 1°. The error ratio of the angle is 6%. The estimated value of the angle is about 3.4. It is found that the largest standard deviation is 2.48° when the marker inclines 60°, and that the smallest is 0.86° when the marker inclines 45° as shown in Table 2. All the standard deviations are within a variation of 1 to 3°. Next, the effect of the presence of a black frame on the AR markers were investigated. Two markers are newly prepared: one is the marker shown in Fig. 7(a), which is made by removing the frame from the standard marker shown in Fig. 3. Figure 7(b) displays the other marker that also has no frame and the number of feature points is added to keep the same number as the standard marker.
Effects of Augmented Reality Markers for Networked Robot Navigation
15
Fig. 6. Estimated marker angle.
Table 2. Standard deviation of marker angle estimation.

True value [deg]           −60    −45    −30    30     45     60
Standard deviation [deg]   2.04   0.95   1.47   1.53   0.86   2.48
Fig. 7. (a) Marker removing the boundary frame from the standard marker. (b) Frameless marker containing the same number of feature points as the standard marker.
Three patterns of AR markers, as shown in Figs. 3, 7(a), and 7(b), are compared in inclination estimation experiments as shown in Fig. 5. Experimental results are described in Table 3, where θ0, θ1, and θ2 represent the estimated angles of the AR markers in Figs. 3, 7(a), and 7(b), respectively. The results suggest that the lack of the marker frame degrades the marker pose identification, and that this can be compensated by adding some more feature points.

3.2 Identification Accuracy in Terms of Markers' Feature Points

Next, experiments were carried out to examine the effects of feature points of AR markers. Several markers similar to the standard marker were prepared by adding or removing mosaic portions, as shown in Fig. 8. AR markers with 4 to 160 feature points were used.
Table 3. Angle estimation by frame-less markers.

True value [deg]             −60     −45     −30      30      45      60
Measured value of θ0 [deg]  −59.54  −41.60  −27.59   32.81   47.64   62.31
Measured value of θ1 [deg]  −47.08  −38.64  −17.42   46.86   65.06   85.11
Measured value of θ2 [deg]  −64.85  −51.51  −41.82   25.02   51.24   75.54
Relative error of θ1 [%]     27.45   16.45   72.23  −35.97  −30.83  −29.50
Relative error of θ2 [%]     −7.48  −12.63  −28.26   19.92  −12.17  −20.57
Fig. 8. AR markers with different feature points: (a) 4, (b) 16, (c) 28, (d) 40, (e) 52, (f) 64, (g) 76, (h) 88, (i) 100, (j) 112, (k) 124, (l) 136, (m) 148, and (n) 160 feature points.

Fig. 9. Angle estimation by different markers.

The inclination angle is estimated using the markers placed at an angle of −45°. This orientation is chosen because the standard marker gives its best measurement performance there and the influence of lighting can be avoided at that angle. Figure 9 shows the experimental results in terms of feature points. Markers with 4 to 28 feature points were not recognized at various angles and distances. For 40 and 52 feature points, the AR system provides unstable estimated values. The results of the angle recognition experiment are quantified in Table 4, which gives the standard deviation of the estimated angle, and Fig. 10 shows the relationship between the number of feature points and the relative error of the experimental results. From Table 4, there is a large variation in accuracy when the number of feature points is 40 or 52. As seen in Table 4 and Fig. 10, as the number of feature points increases, the variation in marker accuracy and the relative error decrease.

Table 4. Standard deviation of angle estimation.

Number of feature points   4  16  28  40    52    64    76    88    100   112   124   136   148   160
Standard deviation [deg]   –  –   –   6.49  4.99  0.68  1.51  2.56  1.20  0.14  0.98  0.36  1.19  0.30

Fig. 10. Relative error of angle estimation.

The correlation between the feature points and the marker angle recognition accuracy with regard to the 10 markers is computed as follows:

$$\bar{x} = 106, \quad \bar{y} = -4.75 \tag{1}$$

$$S_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2} = 34.47, \quad S_y = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2} = 2.90 \tag{2}$$

$$S_{xy} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = 68.18 \tag{3}$$

From Eqs. (1), (2), and (3), we obtain

$$r = \frac{S_{xy}}{S_x \times S_y} = 0.68,$$

where
x: feature score data,
y: relative error between measured value and true value for each marker,
x̄: the average value of data x,
ȳ: the average value of data y,
n: total number of data,
Sx: standard deviation of data x,
Sy: standard deviation of data y,
Sxy: covariance of data x and data y,
r: correlation coefficient between data x and data y.
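The value of r above is the standard Pearson correlation coefficient computed with population (1/n) statistics. The following is a minimal sketch of this computation in Python with NumPy; the example x array assumes the 10 markers are those with 52 to 160 feature points, which is consistent with the mean of 106 in Eq. (1), while the y values come from the measured relative errors and are not reproduced here.

```python
import numpy as np

def correlation(x, y):
    """Pearson r from the population statistics of Eqs. (1)-(3)."""
    x_bar, y_bar = x.mean(), y.mean()             # Eq. (1): mean values
    s_x = np.sqrt(np.mean((x - x_bar) ** 2))      # Eq. (2): standard deviation of x
    s_y = np.sqrt(np.mean((y - y_bar) ** 2))      # Eq. (2): standard deviation of y
    s_xy = np.mean((x - x_bar) * (y - y_bar))     # Eq. (3): covariance of x and y
    return s_xy / (s_x * s_y)                     # correlation coefficient r

# Feature-point counts of the markers assumed to enter the correlation
x = np.array([52, 64, 76, 88, 100, 112, 124, 136, 148, 160], dtype=float)
print(x.mean())  # 106.0, matching Eq. (1)
```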
Because the correlation coefficient is 0.68, there is considered to be a positive correlation between the number of feature points and the accuracy of marker identification. Experiments also indicated that errors in the AR process calculations occur when the patterns are too intricate: if the number of feature points exceeds 100, marker recognition becomes intermittent in the Unity tracking mode. From the collected data, the number of feature points of an AR marker carrying such patterns is preferably around 64. The experiments also confirmed that the system does not operate properly when the marker patterns are arranged symmetrically in the vertical and horizontal directions inside the black frame. The recognition angle was determined for three markers: horizontal-line symmetry, vertical-line symmetry, and point symmetry, as shown in Fig. 11(a), (b), and (c). Each marker is measured 10 times while it is placed at −45°. The experimental method is the same as in the previous experiment.
Fig. 11. Symmetric markers: (a) horizontal symmetry, (b) vertical symmetry, (c) point symmetry.
Table 5. Identification of symmetric markers.

                          Horizontal symmetry  Vertical symmetry  Point symmetry
Recognition angle [deg]   Unidentified         Unidentified       −48.64
Table 5 shows the average value of the experimental results. The system did not work with either line symmetry. The error ratio of the point-symmetry marker identification is around 8%, which is not too high, but the variation is large. These results suggest the following design guidelines for an optimum two-dimensional marker (a sketch of a simple generator following these guidelines is given after Fig. 12):
• contain mosaic patterns within a frame,
• keep the number of feature points around 64,
• avoid symmetric patterns.
Figure 12 shows an example of a marker designed on the basis of these guidelines. Mosaic patterns are randomly arranged with 70 feature points.
Fig. 12. The optimum marker.
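As a rough illustration of the guidelines (and not the authors' actual generation procedure), the sketch below draws a random mosaic inside a black frame and rejects line- and point-symmetric patterns. The grid size and the use of the black-cell count as a stand-in for the feature-point count are our assumptions.

```python
import numpy as np

def generate_marker(grid=10, black_cells=50, seed=None):
    """Random framed mosaic with no horizontal, vertical, or point symmetry."""
    rng = np.random.default_rng(seed)
    while True:
        cells = np.zeros(grid * grid, dtype=np.uint8)
        cells[rng.choice(grid * grid, size=black_cells, replace=False)] = 1
        m = cells.reshape(grid, grid)
        # Reject the symmetric patterns that Table 5 shows are not identified
        if not (np.array_equal(m, np.flipud(m)) or      # horizontal-line symmetry
                np.array_equal(m, np.fliplr(m)) or      # vertical-line symmetry
                np.array_equal(m, np.rot90(m, 2))):     # point symmetry
            break
    framed = np.ones((grid + 2, grid + 2), dtype=np.uint8)  # black boundary frame
    framed[1:-1, 1:-1] = m
    return framed  # 1 = black cell, 0 = white cell
```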
The same experiment as the position recognition experiment in Fig. 4 is performed. Table 6 shows the results of the distance estimation experiment using our proposed optimum AR marker, and Table 7 shows a comparison of the errors between the reference marker in Fig. 3 and the optimum marker proposed in Fig. 12.

Table 6. Distance estimation using optimum marker.

      x                          y                          z
      Ideal [mm]  Measured [mm]  Ideal [mm]  Measured [mm]  Ideal [mm]  Measured [mm]
A       100         111.84          50          49.55          400        385.96
B        50          56.80          50          49.35          400        395.56
O         0           7.82          50          49.02          400        401.19
C       −50         −41.66          50          48.62          400        409.74
D      −100         −96.93          50          48.60          400        412.87
Table 7. Comparison of estimated distance by optimum and reference marker.

                   Relative error [%]
                     x       y       z
Reference marker  −13.99    2.50   −9.95
Optimal marker      1.42   −1.94    0.27
The position recognition accuracy of the marker designed under the optimum conditions is higher than that of the reference marker. We also performed the same experiment as the slant estimation experiment in Fig. 5 with the marker in Fig. 12, where θ1 denotes the experimental value obtained using the standard marker and θ2 the experimental value obtained in this experiment. The experimental results and relative errors are shown in Table 8, and a comparison of standard deviations is shown in Table 9. The two tables confirm that the optimum marker has a much higher identification ability than the reference marker, making it possible to build a stable and highly accurate AR system with small errors and small variations.
Table 8. Comparison of estimated angle by optimum and reference marker.

True value [deg]             −60     −45     −30      30      45      60
Measured value of θ1 [deg]  −59.54  −41.60  −27.59   32.81   47.64   62.31
Measured value of θ2 [deg]  −60.27  −44.79  −29.92   30.46   44.02   58.97
Relative error of θ1 [%]     −0.77   −7.56   −8.03    9.37    5.87    3.85
Relative error of θ2 [%]      0.46   −0.47   −0.26    1.53   −2.17   −1.72
Table 9. Standard deviation of estimated angle by optimum and reference marker.

True value [deg]                 −60   −45   −30   30    45    60
Standard deviation of θ1 [deg]   2.04  0.95  1.47  1.53  0.86  2.48
Standard deviation of θ2 [deg]   0.67  0.49  1.08  0.36  0.10  0.46
4 Conclusion

The authors have prototyped a networked welfare robot system based on augmented reality techniques. Remote control performance by untrained operators depends on the measurement accuracy of AR marker location. The effects of AR markers on identification in the real world, in terms of the number and arrangement of their feature points, were investigated. Several examples of two-dimensional markers were used to confirm the accuracy of their distance and orientation measurement. The influence of the feature points of AR markers on their identification was also examined, using similar markers that differ only in the number of feature points. The experiments suggest that markers with more feature points are identified more precisely, but that too many feature points bring no further improvement. We have proposed an optimal AR marker design containing 70 feature points. It successfully identified the location of objects in the real world with a high accuracy of 86% in distance and 94% in angle.

Acknowledgments. This work was supported by JKA and its promotion funds from KEIRIN RACE.
Algorithm Based on Local Search Method for Examination Proctors Assignment Problem Considering Various Constraints

Takahiro Nishikawa and Hiroyoshi Miwa(B)

Graduate School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda-shi, Hyogo 669-1337, Japan
{TakahiroNishikawa,miwa}@kwansei.ac.jp
Abstract. In periodic examinations and general entrance examinations of a university, examination proctors are needed in order to carry out the examination smoothly and rigorously. Since such an examination is typically executed at many sites at the same time, many examination proctors are required and they must be assigned appropriately. When assigning proctors, there is a wide range of constraints to be considered. When not all the constraints can be satisfied, it is necessary to obtain a solution that satisfies them as much as possible by minimizing the total of the penalties for the soft constraints. In this way, the examination proctors assignment problem can be formulated as an optimization problem. We formulate the problem of assigning proctors as an integer programming problem. Furthermore, we design an algorithm based on the 2-opt method of local search, which is one of the meta-heuristic algorithms. The performance of the proposed algorithm is evaluated by applying it to periodic examination assignment instances of actual size. As a result, the proposed algorithm can output a feasible solution at a local minimum even when the instance of the problem is so large that the optimization solver cannot solve it.
1 Introduction
In periodic examinations and general entrance examinations of a university, examination proctors are needed in order to carry out the examination smoothly and rigorously. Since such an examination is typically executed at many sites at the same time, many examination proctors are required and they must be assigned appropriately. When assigning proctors, there is a wide range of constraints to be considered, such as the fact that one proctor cannot be in charge of multiple test sites at the same time, and that the number of times a particular proctor is assigned should not be uneven. There are two main types of constraints: those that must be satisfied (hard constraints) and those that should be satisfied if possible (soft constraints). When not all the constraints can be satisfied, it is necessary to obtain a solution that satisfies them as much as possible by minimizing the total of the penalties for the soft constraints. In this way, the examination proctors assignment problem can be formulated as an optimization problem.

Although it is possible to manually assign proctors on a small scale, it is very difficult to do so on the scale of regular university examinations or general entrance examinations. In addition, since the system changes every year and the candidates for proctors also change, the assignment of proctors must be made every time. Therefore, a system that automatically assigns examination proctors is necessary.

In this paper, we deal with the examination proctors assignment problem. We formulate the problem of assigning proctors as an integer programming problem. The constraints considered in the formulation include whether the same proctor is assigned to two classes at the same time, whether proctors are assigned to classes that are inconvenient for them, whether the maximum number of assignments for each proctor is not exceeded, and so on. The objective function, which is minimized, is the sum of the penalties for failing to meet the soft constraints. Because of the large scale of this problem, it is difficult to find the optimum solution even using an optimization solver; in fact, only small-scale problems can be solved. Therefore, we design an algorithm based on the 2-opt method of local search, which is one of the meta-heuristic algorithms. The performance of the proposed algorithm is evaluated by applying it to periodic examination assignments of actual scale.
2 Previous Research
There are various types of assignment problems: class timetabling problems, exam timetabling problems, nurse scheduling problems, and so on. A high school timetabling problem is solved by an algorithm based on the local search method [1]. A university course scheduling problem is solved by an algorithm based on simulated annealing (SA), a meta-heuristic algorithm [2]. An algorithm based on tabu search is designed for the university course scheduling problem [3], and an algorithm based on the genetic algorithm (GA) is designed for a school time scheduling problem [4]. The nurse scheduling problem [5] is formulated as an integer programming problem; since it is difficult to solve at a realistic scale, many algorithms, including meta-heuristic algorithms such as the local search method, have been studied [6]. A shift scheduling problem in the service industry is investigated in [7], where some instances based on data of actual confectionery stores and restaurants are solved by using an optimization solver. There are many similar problems, such as timetabling for individual cram schools [8], a manpower scheduling problem that considers the assignment of workers to workplaces while taking into account their safety [9], and a job-shop scheduling problem that considers which tasks should be assigned to each machine [10].
Since the constraints, objective functions, and scales depend on the target problem, it is difficult to make a general-purpose system. Indeed, an examination proctor assignment is generally made manually; when it is not, a specialized algorithm is designed and implemented. As for the problem in this paper, existing algorithms cannot be used as they are, because the available dates of the examination proctors and the possible dates to which each subject can be assigned must also be considered simultaneously.
3 Examination Proctors Assignment Problem
In this section, we describe the examination proctors assignment problem in detail, and we formulate it as an integer programming problem. The constraints are as follows:
• There is only one slot that can be assigned to each subject.
• Each subject must be assigned to a slot where the exam of the subject can be conducted.
• Each proctor must be assigned to a slot where the proctor can be assigned.
• A proctor assigned to a slot must be assigned to only one subject conducted in the slot.
• Each subject must have at least the required number of proctors.
• Too many subjects must not be assigned to only specific proctors.
For each of these constraints, a penalty is given if the constraint is not satisfied. Each constraint has a different penalty value, and the more important the constraint, the larger the penalty value. We can formulate the problem as an optimization problem minimizing the sum of the penalties as the objective function. The input data are as follows:
• Set of proctors, M
• Set of slots on a timetable, C
• Set of subjects whose exams are executed, S
• The required number of proctors for subject s, Ns (s ∈ S)
• Ts,c (s ∈ S, c ∈ C): if subject s can be assigned to slot c, Ts,c = 1; otherwise, Ts,c = 0
• Pm,c (m ∈ M, c ∈ C): if proctor m can be assigned to slot c, Pm,c = 1; otherwise, Pm,c = 0
• Maximum number of slots assigned to proctor m, Hm (m ∈ M)
• Minimum number of slots assigned to proctor m, Lm (m ∈ M)
The variables, in other words the output data, are as follows:
• xs,c: if subject s is assigned to slot c, xs,c = 1; otherwise, xs,c = 0
• ys,m: if proctor m is assigned to subject s, ys,m = 1; otherwise, ys,m = 0
The constraints in detail are as follows:
1. A subject is assigned to only one slot:
$$\sum_{c} x_{s,c} = 1 \quad (s = 1, 2, \ldots, |S|)$$
2. A subject is assigned only to a slot to which the subject can be assigned:
$$\sum_{c} x_{s,c} T_{s,c} = 1 \quad (s = 1, 2, \ldots, |S|)$$
3. The number of proctors assigned to a subject is at least the required number of proctors for the subject:
$$\sum_{m} y_{s,m} \geq N_s \quad (s = 1, 2, \ldots, |S|)$$
4. A proctor assigned to a slot must be assigned to only one subject conducted in the slot:
$$\sum_{s} x_{s,c}\, y_{s,m} \leq 1 \quad (c = 1, 2, \ldots, |C|,\; m = 1, 2, \ldots, |M|)$$
5. A proctor is assigned to a slot to which the proctor can be assigned, the proctor is assigned to a subject in the slot, and the required number of proctors for the subject is satisfied:
$$\sum_{m} \sum_{c} x_{s,c}\, y_{s,m}\, P_{m,c} \geq N_s \quad (s = 1, 2, \ldots, |S|)$$
6. The number of slots assigned to proctor m is in the range from Lm to Hm:
$$L_m \leq \sum_{s} y_{s,m} \leq H_m \quad (m = 1, 2, \ldots, |M|)$$
We assume that constraints (1), (2), (3), and (4) are the hard constraints and that constraints (5) and (6) are the soft constraints. We define penalty functions for constraints (5) and (6); when these constraints are not satisfied, the corresponding penalty functions give the penalty. The constants w1, w2, w3 are positive real-valued weights. The objective function is defined as the weighted sum of the penalty functions as follows:

$$w_1 \cdot \sum_{s} \exp\Bigl(N_s - \sum_{m}\sum_{c} x_{s,c}\, y_{s,m}\, P_{m,c}\Bigr) + w_2 \cdot \sum_{m} \exp\Bigl(L_m - \sum_{s} y_{s,m}\Bigr) + w_3 \cdot \sum_{m} \exp\Bigl(\sum_{s} y_{s,m} - H_m\Bigr)$$
We formulate the optimization problem for the examination proctors assignment problem as the problem of minimizing this objective function under constraints (1), (2), (3), and (4).
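For illustration, the objective can be evaluated for a candidate assignment as in the following sketch, assuming the decision variables and input data are stored as NumPy arrays; the function and array names are ours, not the authors'.

```python
import numpy as np

def objective(x, y, P, N, L, H, w1=1.0, w2=1.0, w3=1.0):
    """Weighted exponential penalties for soft constraints (5) and (6).

    x: |S| x |C| 0/1 array (subject-to-slot); y: |S| x |M| 0/1 array
    (proctor-to-subject); P: |M| x |C| proctor availability; N: required
    proctors per subject; L, H: min/max load per proctor.
    """
    # coverage[s] = sum over m, c of x[s,c] * y[s,m] * P[m,c]
    coverage = np.einsum('sc,sm,mc->s', x, y, P)
    load = y.sum(axis=0)          # subjects (hence slots) assigned to each proctor
    return (w1 * np.exp(N - coverage).sum()
            + w2 * np.exp(L - load).sum()
            + w3 * np.exp(load - H).sum())
```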
4 Local Search Algorithm
The optimization problem for the examination proctors assignment problem formulated in the previous section is an integer programming problem. When the problem size is large, it is difficult to solve it with a general-purpose optimization solver. In this paper, we design an algorithm based on the local search method, which is one of the meta-heuristic algorithms. The local search method searches for an improved solution in the neighborhood of the tentative solution and, when an improved solution is found, updates the tentative solution. Since this is an algorithmic framework, we need to design a specific neighborhood for each problem.

Our proposed algorithm consists of two phases. First, for all slots, the subjects assigned to each slot are determined. We define a solution as an assignment of subjects to all slots, and the neighborhood of a solution as the set of all assignments obtained by swapping two subjects chosen from the solution. We thus get a solution by applying the local search method. In the second phase, for a solution of the first phase, proctors are assigned to subjects. In this solution, the slot to which a subject is assigned is determined for all subjects. We define a solution as an assignment of proctors to all subjects, and the neighborhood of a solution as the set of all assignments obtained by swapping two proctors chosen from the solution. We thus get a solution by applying the local search method. We evaluate the quality of a solution by the value of the objective function if the hard constraints are satisfied; otherwise, a solution satisfying the hard constraints is searched for in the neighborhood.
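A minimal sketch of the swap (2-opt) neighborhood search used in both phases is shown below. It assumes the solution is encoded as a flat list (the slot of each subject in the first phase, or the proctor filling each required duty in the second phase) and that `evaluate` and `feasible` are problem-specific helpers such as the objective above. This randomized first-improvement variant is one possible reading of the method, not the authors' exact implementation.

```python
import random

def two_opt(solution, evaluate, feasible, max_no_improve=1000):
    """Local search over the swap neighborhood of a list-encoded solution."""
    best = evaluate(solution)
    stall = 0
    while stall < max_no_improve:
        i, j = random.sample(range(len(solution)), 2)
        solution[i], solution[j] = solution[j], solution[i]      # try a swap
        cand = evaluate(solution)
        if feasible(solution) and cand < best:
            best, stall = cand, 0                                # keep the improvement
        else:
            solution[i], solution[j] = solution[j], solution[i]  # undo the swap
            stall += 1
    return solution, best
```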
5 Performance Evaluation
In this section, we evaluate the performance of the proposed algorithm for the examination proctors assignment problem. First, we investigate the accuracy of the solution by comparing it with the optimum solution. When the size of an instance of the problem is small, we can solve the problem by using a general-purpose optimization solver. We used Gurobi Optimizer version 9.1.0 [11] as the solver on a workstation with an Intel Core i5 CPU at 1.60 GHz and 8 GB RAM. We used the instance in Table 1; the required number of proctors for a subject, the maximum number of slots assigned to a proctor, the minimum number of slots assigned to a proctor, Ts,c, and Pm,c are randomly determined. We assume that the constants w1, w2, w3, the weights in the objective function, are set to one in the rest of this paper. We show an example of the solution by the solver and of the solution by the proposed method, respectively (Table 2, Table 3, Table 4, and Table 5).
Table 1. Instance with small size

Number of proctors                               8
Number of slots                                  25
Number of subjects                               5
Required number of proctors for a subject        2, 3, or 4
Maximum number of slots assigned to a proctor    1, 2, or 3
Minimum number of slots assigned to a proctor    0
Table 2. Optimum assignment of subjects to slots for instance with small size (Monday-Friday, periods 1st-5th)

1st   —
2nd   Subject 1
3rd   Subject 2
4th   Subject 4
5th   Subject 5, Subject 3

Table 3. Optimum assignment of proctors to subjects for instance with small size

Subject     Assigned Proctors' ID
Subject 1   2, 5, 6
Subject 2   2, 4, 5, 6
Subject 3   1, 8
Subject 4   5, 7
Subject 5   3, 4, 5

Table 4. Assignment of subjects to slots for instance with small size by proposed algorithm (Monday-Friday, periods 1st-5th)

1st   Subject 2
2nd   Subject 1, Subject 4
3rd   Subject 3
4th   Subject 5
5th   —

Table 5. Assignment of proctors to subjects for instance with small size by proposed algorithm

Subject     Assigned Proctors' ID
Subject 1   3, 5, 6
Subject 2   1, 2, 4, 5
Subject 3   4, 7
Subject 4   1, 4, 8
Subject 5   1, 2, 4
The computation time is 158.25 s for the solver and 111.78 s for the proposed algorithm, and both output a feasible solution. The value of the objective function of the optimum solution is 217.92308, and that of the solution by the proposed algorithm is 234.12403. We can thus get a sufficiently good solution by the proposed algorithm in almost the same computation time. Next, we solve an instance of large size. The solver cannot solve such an instance; however, the proposed algorithm can. We used the instance in Table 6; the required number of proctors for a subject, the maximum number of slots assigned to a proctor, the minimum number of slots assigned to a proctor, Ts,c, and Pm,c are randomly determined. We show an example of the solution by the proposed method (Table 7 and Table 8).
Table 6. Instance with large size

Number of proctors                               300
Number of slots                                  25
Number of subjects                               121
Required number of proctors for a subject        3 or 4
Maximum number of slots assigned to a proctor    1, 2, or 3
Minimum number of slots assigned to a proctor    0

Table 7. Assignment of subjects to slots for instance with large size by proposed algorithm

1st — Monday: 33, 51, 91, 98, 108; Tuesday: 16, 34, 104; Wednesday: 10, 52, 77, 110, 115; Thursday: 9, 49; Friday: 7, 22, 89, 92, 93, 96, 102, 105
2nd — Monday: 1, 97; Tuesday: 57, 75, 101; Wednesday: 25, 26, 87, 90; Thursday: 43, 54; Friday: 21, 24, 28, 86, 88, 113
3rd — Monday: 30, 56, 59, 80, 85; Tuesday: 11, 38, 64, 66; Wednesday: 4, 17, 23, 50, 63, 69, 74; Thursday: 31, 44, 61, 67, 99; Friday: 20, 39, 41, 70, 78
4th — Monday: 12, 13, 14, 19, 48; Tuesday: 5, 62, 65, 95, 103, 106, 111, 112; Wednesday: 32, 42, 60, 82, 83, 116, 118, 120; Thursday: 2, 8, 35, 37, 81; Friday: 36, 46, 53, 58, 71, 79, 100, 107
5th — Monday: 45; Tuesday: 27, 68, 84, 117; Wednesday: 15, 18, 29, 40, 73, 109; Thursday: 3, 6, 55, 72, 94; Friday: 47, 76, 114, 119
Table 8. A part of assignment of proctors to subjects for instance with large size by proposed algorithm

Subject   Assigned Proctors' ID
1         102, 156, 225
2         80, 140, 293
3         81, 147, 267, 295
4         14, 137, 256, 267
5         37, 123, 297
6         107, 134, 264
7         75, 119, 217, 264
The computation time is 4328.79 s. The proposed algorithm outputs a feasible solution, and the value of the objective function is 14287.982. Figure 1 shows the changes in the value of the objective function: it decreases rapidly at first and then converges to a local minimum. We also conducted numerical experiments for three more large instances generated by randomly determining Ts,c and Pm,c. The computation times are 4287.62 s, 4572.85 s, and 4183.56 s, and the objective values are 14094.525, 13982.462, and 14725.071, respectively. The optimization solver cannot solve these instances; however, the proposed algorithm outputs a feasible solution at a local minimum for all of them.
Fig. 1. Changes in the value of the objective function.
6 Conclusion
In this paper, we dealt with the examination proctors assignment problem. First, we formulated the problem of assigning proctors as an integer programming problem. Because of the large scale of this problem, it is difficult to find the optimum solution even using an optimization solver; in fact, only small-scale problems can be solved. Therefore, we designed an algorithm based on the local search method, which is one of the meta-heuristic algorithms. We evaluated the performance of the proposed algorithm by applying it to periodic examination assignment instances of practical size. When the size of an instance is small, the optimization solver can solve the problem and output the optimum solution, so we can compare it with the solution of the proposed algorithm; as a result, the proposed algorithm yields a good approximate solution. When the size of an instance is large, the optimization solver cannot solve the problem, but the proposed algorithm can still output a feasible solution at a local minimum. Since it is important to get a solution for instances of practical size, the proposed algorithm is useful for solving the periodic examination assignment problem at a practical scale.

Acknowledgements. This work was partially supported by the Japan Society for the Promotion of Science through Grants-in-Aid for Scientific Research (B) (17H01742) and JST CREST JPMJCR1402.
References

1. Schaerf, A.: Local search techniques for large high school timetabling problems. IEEE Trans. Syst. Man Cybern. Part A Syst. Humans 29(4), 368–377 (1999)
2. Aycan, E., Ayav, T.: Solving the course scheduling problem using simulated annealing. In: 2009 IEEE International Advance Computing Conference, pp. 462–466 (2009)
3. Al Tarawneh, H.Y., Ayob, M.: Using tabu search with multi-neighborhood structures to solve university course timetable UKM case study. In: 2011 3rd Conference on Data Mining and Optimization (DMO), pp. 208–212 (2011)
4. Sigl, B., Golub, M., Mornar, V.: Solving timetable scheduling problem using genetic algorithms. In: 25th International Conference on Information Technology Interfaces (ITI), pp. 231–259 (2003)
5. Miller, H.E., Pierskalla, W.P., Rath, G.J.: Nurse scheduling using mathematical programming. Oper. Res. 24(5), 797–1026 (1976)
6. Aickelin, U., White, P.: Building better nurse scheduling algorithms. Ann. Oper. Res. 128, 159–177 (2004)
7. Aykin, T.: Optimal shift scheduling with multiple break windows. Manage. Sci. 42(4), 475–627 (1996)
8. Raghavjee, R., Pillay, N.: A genetic algorithm selection perturbative hyper-heuristic for solving the school timetabling problem. ORiON 31(1), 39–60 (2015)
9. Lau, H.C.: On the complexity of manpower shift scheduling. Comput. Oper. Res. 23(1), 93–102 (1996)
10. Goncalves, J.F., de Magalhaes Mendes, J.J., Resende, M.G.C.: A hybrid genetic algorithm for the job shop scheduling problem. Eur. J. Oper. Res. 167(1), 77–95 (2005)
11. Gurobi Optimization documentation. https://www.gurobi.com/documentation/
Bio-inspired VM Introspection for Securing Collaboration Platforms

Huseyn Huseynov1(B), Tarek Saadawi1, and Kenichi Kourai2

1 Department of Electrical Engineering, City University of New York, City College, New York, NY, USA
{hhuseynov,saadawi}@ccny.cuny.edu
2 Department of Computer Science and Networks, Kyushu Institute of Technology, Fukuoka, Japan
[email protected]
Abstract. As organizations drastically expand their usage of collaborative systems and multi-user applications during this period of mass remote work, it is crucial to understand and manage the risks that such platforms may introduce. Improperly or carelessly deployed and configured systems hide security threats that can impact not only a single organization, but the whole economy. Cloud-based architecture is used in many collaborative systems, such as audio/video conferencing, collaborative document sharing/editing, distance learning and others. Therefore, it is important to understand that safety risks can be triggered by attacks on remote servers and that confidential information might be compromised. In this paper, we present an AI-powered application that constantly introspects multiple virtual servers in order to detect malicious activities based on their anomalous behavior. Once a suspicious process is detected, the application notifies the system administrator about the potential threat in real time. The developed software is able to detect user-space based keyloggers, rootkits, process hiding and other intrusion artifacts via agent-less operation, working directly from the host machine. Remote memory introspection means no software to install and no warning for the malware to evacuate or destroy data. Conducted experiments on more than twenty different types of malicious applications provide evidence of high detection accuracy.
1 Introduction
Collaborative platforms, groupware, or multi-user applications allow groups of users to communicate and manage common tasks. Many companies, industrial infrastructures, government agencies and universities rely on such applications periodically. All of these systems contain information and resources with different degrees of sensitivity. The applications deployed in such systems create, manipulate, and provide access to a variety of protected information and resources. Balancing the competing goals of collaboration and security is difficult, because interaction in collaborative systems is targeted towards making people, information, and resources available to all who need them, whereas information security seeks to ensure the availability, confidentiality, and integrity of these elements while providing them only to those with proper authorization. Protection of contextual information and resources in such systems therefore requires a constant automated mechanism that addresses the necessary vulnerability points.

Among the several areas of security under consideration for collaborative environments, authorization or access control is particularly important, because such systems may offer open access to local desktops or resources through the network. In such environments, some applications can gain the privilege to access text-based chat, audio/video files, a shared whiteboard, or other data. Users need a mechanism not only for identifying collaborators through proper authentication, but also for managing files, applications, system processes, and so forth. The proposed application aims to fulfill these needs.

In this paper, we present a single solution to detect malicious applications that try to surreptitiously gain access to personal files. This solution provides a secure environment by constantly checking servers for the presence of keyloggers, rootkits, trojans and other malicious applications using cutting-edge artificial immune system (AIS) based technology [1,2]. A crucial part of the proposed architecture is KVMonitor, the virtualization module that collects data (interrupts, system calls, memory writes, network activities, etc.) by introspecting remote servers [3].

The rest of this paper is organized as follows: Sect. 2 provides a brief background on security for collaboration platforms and a list of potential threats. Section 3 explains the negative selection algorithm (NSA) and the artificial immune system based IDS. Section 4 describes our proposed end-to-end intrusion detection approach for cloud-based collaboration platforms. Section 5 provides a detailed performance evaluation of the proposed security approach. Section 6 draws conclusions and discusses future work.
2 Security and Privacy in Collaboration Platforms
Collaboration requires communication; hence it is very common for organizations of all sizes to use tools that facilitate connection between their employees. However, as technological collaboration platforms advance, the risk level also goes up. The people who hold the authority to adopt such platforms must therefore be aware of hygiene practices that mitigate risks. Security in collaboration platforms starts with hardening the security of the virtual machines deployed in the cloud servers. For example, with cloud computing, user data is usually stored in the cloud service providers' (CSPs) data centers across the globe, unknown to the user. The security of such data is crucial in any network environment and even more critical in cloud computing, given that files are constantly replicated across different geographical zones. Several kinds of attack exist in this realm. One of the most threatening is the insider attack, which is also considered one of the largest threats in this decentralized cloud computing environment.
The heterogeneity and diversity of the cloud computing environment open up a series of avenues for malicious attacks and privacy issues that could constitute a major threat to the entire system. These threats can be classified from three different perspectives: network, application, and virtualization.
• Security threats from a network perspective. The Denial of Service (DoS) attack is an age-long threat in various computing and networking areas. DoS creates an artificial scarcity or lack of online and network services. It could happen in the form of a distributed denial-of-service (DDoS) attack or wireless jamming, and could be launched on both the virtualization and regular network infrastructures. In the case of Software Defined Networks (SDN), DoS attacks have limited scope: as described in [4], a DoS attack on the network edge will affect only the attacked vicinity and not the entire network. Therefore, due to the autonomous and semi-autonomous nature of edge data centers, the attack might not lead to a complete disruption of the core network infrastructure. Another known network-based attack technique is Man-in-the-Middle (MitM), characterized by the presence of a third malicious party interposed between two or more communicating parties and secretly relaying or altering the communication between them. The potency of a MitM attack on mobile networks has been proven in various works in the literature [5,6]. Such attacks would be even more threatening in the SDN scenario, considering that SDN heavily relies on virtualization; hence launching a MitM attack on multiple VMs could very easily affect all other elements on both sides of the attack (Fig. 1).
Fig. 1. Network layers in cloud computing infrastructure
• Security threats from a system and application perspective. Third-party applications running on remote servers can pose fatal security threats by exposing virtual machines to different malicious applications. When virtualization software such as a hypervisor or container engine is attacked, remote applications can fail and data can be leaked. Attacks through manipulated or malware-infected remote applications can occur, as can the spread of infection to other cloud-based software and data leakage. Keyloggers, rootkits, spyware, adware, ransomware, worms, trojans and other nefarious threats are considered potential risk factors for virtual machines. Exploitation of vulnerabilities in open SDN systems can also occur, known as hyperjacking, in which a hacker takes control over the hypervisor that creates the virtual environment within a VM host.
• Security threats from a virtualization perspective. While virtual machines are relatively secure because they provide a completely isolated computing environment, containers are vulnerable since they share a single operating system. One of the possible threats in SDN is VM manipulation, which mainly affects the virtualization infrastructure. The adversary in VM manipulation is mostly a malicious insider with enough privileges or a VM that has escalated privileges. In addition, arbitrary container access manipulation can lead to a control takeover attack on the container, and there is a possibility of data manipulation or data leakage through open API vulnerabilities in cloud-based applications.

The proposed work focuses on detecting security threats in virtual machines from the system and application perspective. The designed approach employs an artificial immune system (AIS) based algorithm for anomaly detection. One significant feature of immunological theory is the ability to adapt to a changing environment and to learn dynamically. AIS is inspired by the human immune system (HIS), which has the ability to distinguish internal cells and molecules of the body against diseases [1].
3 Artificial Immune System Based Intrusion Detection
An anomaly-based intrusion detection system monitors network traffic and user/system activity for abnormal behavior. Unlike the signature-based detection method, an anomaly-based IDS can detect both known and unknown (zero-day) attacks; hence, it is a better solution than the signature-based detection technique if the system is well designed [4]. The efficiency of an anomaly-based IDS therefore depends on multiple factors, such as what kind of algorithms are deployed, what the main target is, how well the generated input data is understood, the application run-time, and so on.

An Artificial Immune System (AIS) is a type of adaptive system, inspired by theoretical immunology and observed immune functions, principles, and models, which are applied to problem solving. Immunology uses models for understanding the structure and function of the immune system. Simplification of such biological immune system models can produce AIS models that, when applied to specific problems, can be the basis of artificial immune system algorithms and consequently computer programs [1]. An important mechanism of the adaptive immune system is self/nonself recognition. The self-nonself (SNS) model is an immunology model that has been successfully utilized in AIS in the design of IDS systems to detect malicious activities and network attacks in a given operating system. The immune system is able to recognize which cells are its own (self) and which are foreign (nonself); thus, it is able to build its defense against the attacker instead of self-destructing [2].

3.1 Negative Selection Algorithm
The negative selection algorithm is based on the self-nonself model of the human immune system (HIS). The first step of the NSA according to Forrest et al. [7] involves randomly generating detectors (the AIS equivalent of B cells in the HIS) in the complementary space (i.e., the space which contains no seen self elements), and then applying these detectors to classify new (unseen) data as self (no data manipulation) or nonself (data manipulation). Several variations of the NSA have been proposed after the original version was introduced (Forrest et al., 1994); however, the main features of the original algorithm still remain. The whole shape-space U is divided into a self set S and a nonself set N with

$$U = S \cup N \quad \text{and} \quad S \cap N = \emptyset \tag{1}$$
There are two steps or phases in the NSA, known as the detector generation phase and the nonself detection phase. In the first step, a set of detectors is generated by some randomized process that uses a collection of self samples as the input. Candidate detectors that match any of the self samples are eliminated, whereas unmatched ones are kept [2]. Algorithm 1 shows the pseudocode of a basic negative selection algorithm. At the detector generation phase, normal profiles (also called self profiles or self samples) extracted from the training data are used to generate random detectors. Each data instance in the normal profile is obtained from the data instances captured by the system during periods of normal activity (i.e., during the absence of any malicious applications). A detector is defined as d = (C, rd), where C = {c1, c2, ..., cm}, ci ∈ R, is an m-dimensional point that corresponds to the center of a hyper-sphere with rd ∈ R as its radius. For the generic NSA shown in Algorithm 1, rd = rs [8].
Algorithm 1. A Generic Negative Selection Algorithm
1: function GenericNSA(S, Tmax, rs)
2:   ▷ S: set of normal/self profiles, Tmax: max. number of detectors, rs: matching threshold
3:   D ← ∅
4:   while |D| < Tmax do
5:     Generate a random detector d
6:     if d does not match any element in S then
7:       D ← D ∪ {d}
8:     end if
9:   end while
10:  for all new incoming samples ν ∈ U do
11:    if ν matches any element in D then
12:      Classify ν as a nonself sample
13:    end if
14:  end for
15:  return D
16: end function
Figure 2 shows a basic block diagram of the two NSA phases: the detector generation process on the left and nonself detection on the right. Randomly generated candidates that match any self samples are discarded, and the detector generation process is halted when the desired number of detectors is obtained. To determine whether a detector (C, rdi) matches any normal profile, the distance (dValue) between this detector and its nearest self profile neighbor (Xnormal, rs) ∈ S is computed, where Xnormal is an m-dimensional point {x1, x2, ..., xm} that corresponds to the center of a hyper-sphere (with rs as its radius). Here di is a random candidate detector with center C and radius rdi.
Fig. 2. Detector generation process on the left and nonself detection on the right.
In the proposed work, the distance dValue is obtained using the squared (Euclidean) distance; however, depending on the architecture, any real-valued distance measure can be used (such as the Euclidean distance, Manhattan distance, Chebyshev distance, etc.):

$$d(c, x) = \sum_{i=1}^{m} (c_i - x_i)^2 \tag{2}$$
The process of generating random candidates to cover the nonself space employs a genetic algorithm. The self space consists of a set S, a subset of [0, 1]^m; accordingly, a data point is represented as a feature vector x = (x1, x2, ..., xm) in [0, 1]^m. At the beginning, an initial population of candidate detectors is generated at random. Such detectors then mature through an iterative process. In each iteration, the radius of each detector is calculated as rd = dValue − rs, where rs is the variable distance around a self sample [1,2].
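A compact sketch of the two phases, combining Algorithm 1 with the squared-distance matching of Eq. (2) and the radius rule rd = dValue − rs, is given below. For simplicity, candidates are drawn uniformly from [0, 1]^m rather than evolved with a genetic algorithm, so this illustrates the logic rather than the exact maturation process.

```python
import numpy as np

def squared_distance(c, x):
    return float(np.sum((c - x) ** 2))               # Eq. (2)

def generate_detectors(self_set, t_max, r_s, m, seed=None):
    """Detector generation phase of Algorithm 1 (uniform random candidates)."""
    rng = np.random.default_rng(seed)
    detectors = []                                    # list of (center, radius)
    while len(detectors) < t_max:
        c = rng.random(m)                             # random candidate in [0, 1]^m
        d_value = min(squared_distance(c, s) for s in self_set)
        if d_value > r_s:                             # matches no self sample
            detectors.append((c, d_value - r_s))      # radius r_d = dValue - r_s
    return detectors

def is_nonself(sample, detectors):
    """Nonself detection phase: a sample inside any detector is anomalous."""
    return any(squared_distance(c, sample) <= r for c, r in detectors)
```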
4 Proposed Security Approach
The proposed intrusion detection and mitigation approach, the overall architecture of which is depicted in Fig. 3, provides security in cloud-based networks through automated, intelligent analysis of network flows and system-level forensics, followed by mitigation actions taken in accordance with the decision of the IDS component. KVMonitor is the crucial part of the nonself detection phase and provides an API for translating a virtual address to a physical one [3]. To introspect a virtual disk in the qcow2 format, KVMonitor uses the network block device (NBD) for QEMU. By doing so, it allocates real disk space only to used blocks, thereby saving disk space. Several experiments have confirmed the efficiency of memory introspection using KVMonitor [3].
Fig. 3. Basic architecture of the proposed intrusion detection system: the AIS-based IDS runs on the host operating system and monitors the VM (memory, virtual disk, virtual NIC) through the KVMonitor virtual machine introspection module on QEMU-KVM.
The list of detectors obtained from the detector generation phase is used in the second phase. During the nonself detection phase (Fig. 2), KVMonitor constantly introspects multiple VMs and returns raw feature values to the IDS. Next, the application converts these features into binary tuples and begins the matching process. If the application finds a match for any incoming set of features among the detectors, it immediately notifies the administrator about a potential anomaly. The primary focus is on the following features:
• Keyboard Driver: XkbGetState(), XKeysymToString(), XkbRules().
• Memory Usage: system calls Read() and Write(), RssFile(), RssShmem().
• File System: ReadFile and WriteFile, CreateFile, OpenFile.
• Network Flow: Send, Sendto, Sendmsg, TCP socket, UDP socket.
The controller at the IDS periodically collects these entries from the virtual machines, which are retrieved by KVMonitor at regular intervals. Upon retrieval, the features are converted into binary tuples for every flow and the algorithm begins the matching process. While looping over the flow entries, the incoming features are immediately sent to the IDS, without waiting for the creation of other flow entries to finish. One of the main benefits of AIS-based virtual machine introspection is zero load on the VM, since the IDS operates from the host operating system.
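The collection-and-matching loop can be pictured as in the sketch below, which reuses is_nonself from the previous sketch. The `read_vm_features` call stands in for the KVMonitor introspection interface and, like the other helper names, is hypothetical; only the overall flow (periodic agent-less collection on the host, conversion to normalized tuples, matching against the detector set, immediate notification) follows the description above.

```python
import time

def monitor(vms, detectors, read_vm_features, to_tuple, notify, interval=10):
    """Host-side loop: introspect each VM, match features, raise alerts."""
    while True:
        for vm in vms:
            raw = read_vm_features(vm)         # agent-less introspection (hypothetical API)
            sample = to_tuple(raw)             # normalize raw counters into a feature tuple
            if is_nonself(sample, detectors):  # detector matching (nonself detection)
                notify(vm, sample)             # real-time alert to the administrator
        time.sleep(interval)                   # introspection interval (10 s in Sect. 5)
```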
5 Experimental Evaluation
In this section, we provide an experimental evaluation of the proposed security approach using three different types of Linux-based keyloggers taken from an open source software list [9]. The experiments were conducted on a host machine with an Intel Core i5-11400 @2.60 GHz processor and 16 GB RAM. The guest machine was running Ubuntu 18.04 LTS with 2 GB of allocated memory. All the malicious applications listed in Table 1 were initially installed in the VM. To demonstrate the efficiency of the proposed system, we divided the experiments into two parts. First, after logging in to the VM, the user starts typing short sentences with periodic pauses (≈40–80 characters); the text can be entered into any application (browser, text editor, etc.) running inside the VM (Chart (a)). In the second experiment, the user types a long text, making pauses between sentences (≈400–1500 characters), using a default text editor (Chart (b)). We measured both scenarios over time, considering the fluctuation of several features (keyboard tracking, file access, network flow).
Charts: (a) API calls invoked by Firefox using Logkeys; (b) API calls invoked by gedit using Blueberry. Both plot normalized API call frequency values for the keyboard tracking, file access, and network features over time (0–600 s).
Chart (a) shows the anomalous fluctuation of the features depicted by our IDS while typing in the infected VM. The X-axis represents time in seconds and the Y-axis is the normalized value of the API call frequencies. The normalized API call frequency values are the total value obtained during 10 s divided by the maximum value of the whole period (600 s). Chart (b) represents the second part of the experiment, but with a different keylogger. In this case, the keylogger triggers the networking features by trying to send captured keystrokes over the TCP protocol to a remote server.

Table 1. Three different types of keyloggers used in this experiment

Logkeys    : Multi-functional GNU/Linux keylogger. Logs all entered keystrokes, including function keys [10]
Blueberry  : Opens a stream to the keyboard event handler and gets every key press. Creates logs when the buffer reaches 300 characters and sends them to the remote server over the TCP protocol [11]
EKeylogger : Sends recorded keystrokes every 10 s using the SMTP protocol [12]
The proposed IDS was able to detect all listed keyloggers within the first 10 s of their launch. This time is allocated as the interval for VM introspection and can be reduced depending on the IDS configuration. Moreover, the application efficiently detects other types of malicious applications (trojans, rootkits, adware and so on) without human interaction. The Artificial Immune System based IDS is able to track minor deviations from the normal profile triggered by malevolent processes. Conducted experiments on more than twenty different malicious applications demonstrate high detection accuracy and efficient VMI without any user being engaged. The proposed security approach is promising for achieving real-time, highly accurate detection and mitigation of attacks on cloud-based servers, which will be in widespread use in the 5G and beyond era.
6 Conclusions
Collaboration solutions have become key to enabling remote work, and if the proper steps are taken to securely configure and deploy them, the risks they introduce can be mitigated. As these platforms become used more heavily in regular business, it is increasingly imperative that organizations have threat intelligence feeds in place, and that vulnerabilities impacting these platforms are identified and addressed promptly. In this paper, we provided a distributed solution to secure cloud-based servers for collaboration platforms. We began by examining and classifying potential vulnerabilities of such systems. Next, we presented the Artificial Immune System algorithm used in the proposed application, then described the overall IDS architecture and provided an experimental evaluation of the presented application. Our future work will include an extension of the current introspection by accessing virtual machines remotely; initial experiments were successfully conducted to introspect a virtual machine over a GRE tunnel. Continuous tests on many different malicious applications provide the capability to detect a large attack surface in a variety of network structures. We believe that our study helps to introduce a new model for securing collaboration platforms and provides best practices on issues that have a high impact on security and privacy.
References

1. Igbe, O., Saadawi, T., Darwish, I.: Digital Immune System for Intrusion Detection on Data Processing Systems and Networks, March 2020. Patent No. US 10,609,057; Filed 26 June 2017; Issued 31 March 2020
2. Dasgupta, D., Nino, F.: Immunological Computation: Theory and Applications, 1st edn. Auerbach Publications, Boca Raton (2008)
3. Kourai, K., Nakamura, K.: Efficient VM introspection in KVM and performance comparison with Xen. In: Proceedings of the 2014 IEEE 20th Pacific Rim International Symposium on Dependable Computing, PRDC 2014, pp. 192–202, USA. IEEE Computer Society (2014)
4. Roman, R., Lopez, J., Mambo, M.: Mobile edge computing, Fog et al.: a survey and analysis of security threats and challenges. Future Gener. Comput. Syst. 78, 680–698 (2018)
5. Stojmenovic, I., Wen, S., Huang, X., Luan, H.: An overview of fog computing and its security issues. Concurr. Comput. Pract. Experience 28(10), 2991–3005 (2016)
6. Zhang, L., Jia, W., Wen, S., Yao, D.: A man-in-the-middle attack on 3G-WLAN interworking. In: Proceedings of the 2010 International Conference on Communications and Mobile Computing, CMC 2010, vol. 01, pp. 121–125, USA. IEEE Computer Society (2010)
7. Forrest, S., Perelson, A., Allen, L., Cherukuri, R.: Self-nonself discrimination in a computer. In: Proceedings, Research in Security and Privacy, pp. 202–212, USA. IEEE Computer Society Symposium (1994)
8. Igbe, O., Darwish, I., Saadawi, T.: Distributed network intrusion detection systems: an artificial immune system approach. In: 2016 IEEE First International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), pp. 101–106 (2016)
9. Top Open Source Keylogger Projects. https://awesomeopensource.com/projects/keylogger. Accessed 2 May 2021
10. Logkeys - a GNU/Linux keylogger. Source code available at https://github.com/kernc/logkeys. Implemented in C and dual licensed under the terms of either GNU GPLv3 or later, or WTFPLv2 or later. Accessed 2 June 2021
11. Blueberry - simple open source keylogger for Linux. Source code available at https://github.com/PRDeving/blueberry. Implemented in C with an open license. Accessed 2 June 2021
12. EKeylogger. Source code available at https://github.com/aydinnyunus/Keylogger. Implemented in Python for the purpose of testing the security of information systems. Accessed 2 June 2021
Artificial Intelligence-Based Early Prediction Techniques in Agri-Tech Domain

Alessandra Amato1, Flora Amato1,2(B), Leopoldo Angrisani1,2, Leonard Barolli3, Francesco Bonavolontà1,2, Gianluca Neglia4, and Oscar Tamburis4

1 Centro Servizi Metrologici e Tecnologici Avanzati (CeSMA), University of Federico II, Naples, Italy
{alessandra.amato,flora.amato,leopoldo.angrisani,francesco.bonavolonta}@unina.it
2 Department of Electrical Engineering and Information Technologies, University of Federico II, Naples, Italy
3 Department of Information and Communication Engineering, Fukuoka Institute of Technology, Fukuoka, Japan
[email protected]
4 Department of Veterinary Medicine and Animal Productions, University of Federico II, Naples, Italy
{gianluca.neglia,oscar.tamburis}@unina.it
Abstract. This work presents an application of Artificial Intelligence techniques in the Precision Livestock Farming domain. In particular, we focus on the possibility of identifying mastitis using non-invasive IR techniques, delegating to artificial intelligence the task of early detection of ongoing mastitis. We conducted an experimental campaign aimed at verifying that an infrared thermography measurement technique (which captures infrared radiation and generates images based on the amount of heat emitted) can be profitably used in the early detection of mastitis in buffalo populations. For this experimental campaign, the surface temperature of the udder skin was measured using an infrared camera. We use a Deep Learning approach to perform an early classification of the infrared pictures, aimed at distinguishing healthy udders from those with ongoing mastitis symptoms.

Keywords: Precision livestock farming · Smart farming · One Digital Health · Innovation
1 Introduction

Industry 4.0 (I4.0), the so-called fourth industrial revolution, is based on the wide adoption of the IIoT (Industrial Internet of Things) and ICPSs (industrial cyber-physical systems), along with new cloud manufacturing-related paradigms. The purpose of I4.0 is to build a highly flexible production model of personalized and digital products and services, with real-time interactions between people, products and devices during the production process. The application of such principles to the livestock field leads to the implementation of IoHAT-related paradigms (Internet of Animal Health Things) that can turn animals, machinery and processes into "information objects" by connecting them to the network, so as to help improve the whole Farming Data Management set and to develop "data-driven" business models accordingly. This currently goes under the name of Precision Livestock Farming (PLF) [1, 2].

The whole buffalo-related dairy production and supply chain represents a leading sector of the entire Agri-food arena, especially in Southern Italy. A systematic deployment of integrated (and economically sustainable) solutions for the continuous monitoring of quality and salubrity of the production environments is still lacking, although it should be a basic requirement for policies of environmental sustainability of the production sites. Such an approach would imply an integrated vision of production characteristics, animal welfare, and security issues, necessary to achieve better overall quality control of the supply chain as well as a better economic valorisation of the final products. Clearly, the adoption of quality methods [11] and related tools in this sector has to be triggered and accompanied by deep changes in the philosophy and organizational model of the countryside enterprise, an interesting though not yet entirely investigated application area [3]. In order to do that, the following fundamental steps are to be implemented:

• identifying the critical processes that mainly affect productivity and incomes, along with the data to be monitored and measured;
• introducing automated tools for data gathering and measurement [12, 13, 21], to be integrated within the production chain;
• defining protocols for data analysis, in order to detect as early as possible causes of inefficiency within the production processes;
• using the results of the data analysis as input for automated decision-making processes, on which effective management actions rely [10];
• activating automated control systems, whose tasks need to be documented through the deployment of standard operating procedures (SOPs);
• activating procedures for the monitoring of the outcomes of control and documentation activities, in order to guarantee Quality Improvement and Assurance.
2 Research Objectives

Technologies available for precision livestock farming and agriculture are currently still at an early development stage, and researchers can, metaphorically, "sightsee grassland" in front of them. The recent increase in awareness of environment-related concerns is pushing towards the design of sustainable dairy supply chains, since these are recognized as capable of generating value for consumers. To this purpose, new technical facilities are available to perform objective measurements of critical-to-quality characteristics, which can be directly related to the process outcome, like for instance the content of functional molecules in food production, or indirectly related to it, as is the case for a variety of environmental factors that can have an influence. An effective vision of smart farming therefore requires the realization of synergistic actions between farmers, veterinary
doctors, biologists, chemists, and engineers, called to work together with the common goals of defining, in the first place, a set of environmental parameters to be measured via a timely network of sensors (both traditional sensors and biosensors), and then deploying innovative technological platforms to connect the sensor network with an infrastructure for short- and long-range wireless connectivity. The purpose is to obtain structured field data (for instance from the automated milking systems, already during the passage of the buffaloes) to be elaborated via predictive models [9] into complex indicators of environmental sustainability and salubrity, as well as of animal welfare (KPILW: Key Performance Indicators of Livestock Welfare). This accordingly implies [4, 5]:
• developing new knowledge and instruments to realize an "eco-labeling" of the products under a "One Digital Health" perspective, i.e. with a pronounced attention to the ways agribusiness production affects the surrounding environment. Such information, not yet fully available, should stand among the bases for an improvement of the overall quality of the production process;
• designing the specific control-and-tracking technology [16] to be deployed with the aim of generating real-time alerts after contamination phenomena of both natural and anthropic origin, monitoring the production process in order to prevent issues, or guaranteeing rapid corrective interventions;
• re-designing the entire supply chain "from the stable to the table".
In addition, livestock farming and cultivation can benefit from monitoring food processing, food safety, and food quality [14], also by exploiting the newly available facilities for on-field data acquisition and transmission to data collector centers, where data can be analyzed and compared to benchmarks or other data from different monitoring sites. Digitalizing process-related information at the source also supports managerial actions and, in conjunction with diagnostic algorithms, automatic troubleshooting and interventions, which, under steady settled conditions, allow for time and resource savings [15]. All these expected results deal with the capability to figure out, for a smart farm, a realistic dairy supply chain in line, on the one hand, with the development characteristics of I4.0 [17] and, on the other hand, compatible with the so-called TBL (Triple Bottom Line) approach, which seeks the sustainability of a system through the pursuit of a balance between social, economic, and environmental targets [6].
3 A Case Study: Early Warning Prediction of Mastitis

As a case study, we carried out a classification campaign of infrared images to perform early prediction of mastitis pathologies from the analysis of the surface temperature of buffalo udders. We want to classify the infrared images into three classes: "infection", "no infection", "probable infection". We process 400 images manually labelled by domain experts. Infrared pictures are taken using the Testo 881 infrared camera. This camera is able to capture infrared images with a resolution of 160 × 120 pixels, and can measure temperature in a range from −20 to +350 °C. In the following we report a snippet of code implemented for the infrared image classification. We exploit the Python libraries NumPy, TensorFlow, Pandas and os. In
particular, we use the TensorFlow Keras procedures for image and layer processing and for classification.
In the following, we report the instructions for preprocessing the data for the training of the model.
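Since the original listing is not reproduced here, the snippet below is a minimal sketch of such a preprocessing step, assuming a recent TensorFlow 2.x installation; the directory layout (one sub-folder per class under ir_images/), the 80/20 split, and the batch size are our assumptions.

import os
import tensorflow as tf

IMG_SIZE = (120, 160)   # assumed: the native 160 x 120 resolution of the Testo 881
BATCH_SIZE = 16         # assumed batch size

# One sub-folder per label ("infection", "no_infection", "probable_infection")
train_ds = tf.keras.utils.image_dataset_from_directory(
    os.path.join(".", "ir_images"), validation_split=0.2, subset="training",
    seed=42, image_size=IMG_SIZE, batch_size=BATCH_SIZE, label_mode="categorical")
val_ds = tf.keras.utils.image_dataset_from_directory(
    os.path.join(".", "ir_images"), validation_split=0.2, subset="validation",
    seed=42, image_size=IMG_SIZE, batch_size=BATCH_SIZE, label_mode="categorical")

# Rescale pixel intensities to [0, 1]
rescale = tf.keras.layers.Rescaling(1.0 / 255)
train_ds = train_ds.map(lambda x, y: (rescale(x), y))
val_ds = val_ds.map(lambda x, y: (rescale(x), y))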
In Fig. 1 we report infrared images labelled by domain experts: on the left, an image related to an udder with an ongoing infection; on the right, an image related to a probable infection.
Fig. 1. Infrared images with ongoing infection (on the left) and probable infection (on the right)
In Fig. 2 we report infrared images labelled by domain experts: on the left, an image related to an udder with a probable infection; on the right, an image related to no infection.
Fig. 2. Infrared image with probable infection (on the left) and no infection (on the right)
Instructions for the model building follow; they include the model configuration based on the chosen losses and metrics.
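A possible model definition consistent with this description is sketched below; the paper does not give the exact architecture, so the layer sizes are assumptions.

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=IMG_SIZE + (3,)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax")])   # the three target classes

# Model configuration based on the chosen loss and metrics
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])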
In the following, using the Sequential model, we provide training and inference on our model, and we use the model to classify the images.
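A sketch of this training and inference step follows; the number of epochs is an assumption.

import numpy as np

history = model.fit(train_ds, validation_data=val_ds, epochs=30)

# Classify a batch of validation images: each row of probs is a probability
# distribution over the three classes
for images, _ in val_ds.take(1):
    probs = model.predict(images)
    predicted_classes = np.argmax(probs, axis=1)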
As a result of classification, we obtain images labelled with the probability of ongoing infection, as reported in Fig. 3.
Fig. 3. Resulting classified infrared images: on the left, an image classified with the label of probable infection; on the right, an image classified with the label of no infection.
4 Conclusions

Farming enterprises definitely have to adopt and implement quality improvement approaches like the successful ones widely adopted by industry and service companies. The farmer's new role in the 4.0 era requires extensive knowledge, ranging from agronomy and breeding to process development and engineering design, not excluding finance and accounting, marketing, distribution and logistics. Most important, the farmer has to become familiar with new technologies and acquainted with their inherent value and usefulness for the goal he/she is pursuing. This means heading towards a number of challenges to be solved for any of the I4.0-compliant systems - from networked connection of components, to cyber-security issues and enhanced flexibility, to massive data gathering, just to name a few. The skills necessary for mastering manufacturing, informatics, and process technologies [18-20], in addition to a clear integral vision of the trade as well as high creativity, must be identified and nurtured as early as possible in an educational environment [7]. Accordingly, terms like "engineering education 4.0" should definitely become part of the curricula for students of Animal Production or Agriculture Sciences [8].
References

1. Mat, I., Kassim, M.R.M., Harun, A.N., Yusoff, I.M.: Smart agriculture using internet of things. In: 2018 IEEE Conference on Open Systems (ICOS), pp. 54-59. IEEE (2018)
2. Chaudhry, A.A., Mumtaz, R., Zaidi, S.M.H., Tahir, M.A., Syed Hassan Muzammil: Internet of Things (IoT) and machine learning (ML) enabled livestock monitoring. In: 2020 IEEE 17th International Conference on Smart Communities: Improving Quality of Life Using ICT, IoT and AI (HONET), pp. 151-155. IEEE (2020)
3. Salzano, A., et al.: Space allocation in intensive Mediterranean buffalo production influences the profile of functional biomolecules in milk and dairy products. J. Dairy Sci. 102(9), 7717-7722 (2019)
4. Jukan, A., Masip-Bruin, X., Amla, N.: Smart computing and sensing technologies for animal welfare: a systematic review. ACM Comput. Surv. (CSUR) 50(1), 1-27 (2017)
5. Benis, A., Tamburis, O., Chronaki, C., Moen, A.: One Digital Health: a unified framework for future health ecosystems. J. Med. Internet Res. 23(2), e22189 (2021)
6. Khan, I.S., Ahmad, M.O., Majava, J.: Industry 4.0 and sustainable development: a systematic mapping of triple bottom line, Circular Economy and Sustainable Business Models perspectives. J. Clean. Prod. 297, 126655 (2021)
7. Bonavolontà, F., D'Arco, M., Liccardo, A., Tamburis, O.: Remote laboratory design and implementation as a measurement and automation experiential learning opportunity. IEEE Instrum. Meas. Mag. 22(6), 62-67 (2019)
8. Saucier, P.R., Langley, G.C.: An evaluation of the knowledge, performance, and consequence competence in a science, technology, engineering, and mathematics (STEM) based professional development for school-based agricultural science teachers: an assessment of an industry supported Com. J. Agric. Syst. Technol. Manag. 28, 25-43 (2017)
9. Amato, F., Cozzolino, G., Moscato, F., Moscato, V., Xhafa, F.: A model for verification and validation of law compliance of smart-contracts in IoT environment. IEEE Trans. Ind. Inform. (2021)
10. Amato, F., Coppolino, L., Cozzolino, G., Mazzeo, G., Moscato, F., Nardone, R.: Enhancing random forest classification with NLP in DAMEH: a system for DAta Management in eHealth Domain. Neurocomputing 444, 79-91 (2021)
11. Amato, F., Casola, V., Cozzolino, G., De Benedictis, A., Mazzocca, N., Moscato, F.: A security and privacy validation methodology for e-health systems. ACM Trans. Multimedia Comput. Commun. Appl. (TOMM) 17(2s), 1-22 (2021)
12. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 530-536. IEEE, July 2018
13. Amato, A., Bonavolontà, F., Cozzolino, G.: Extracting information from food-related textual sources. AINA (3), 72-80 (2021)
14. Amato, A., Cozzolino, G., Moscato, V.: Big data analytics for traceability in food supply chain. In: Barolli, L., Takizawa, M., Xhafa, F., Enokido, T. (eds.) WAINA 2019. AISC, vol. 927, pp. 880-884. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-15035-8_86
15. Bonavolontà, F., et al.: On the suitability of compressive sampling for the measurement of electrical power quality. In: 2013 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), pp. 126-131. IEEE, May 2013
16. Buonanno, A., et al.: A new measurement method for through-the-wall detection and tracking of moving targets. Meas. J. Int. Meas. Confederation 46(6), 1834-1848 (2013)
17. Angrisani, L., Arpaia, P., Bonavolontà, F., Lo Moriello, R.S.: Academic FabLabs for industry 4.0: experience at University of Naples Federico II. IEEE Instrum. Meas. Mag. 21(1), 6-13 (2018). Article no. 8278802
18. Loffredo, F., Vardaci, E., Quarto, M., Roca, V., Pugliese, M.: Validation of electromagnetic and hadronic physical processes in the interaction of a proton beam with matter: a solar particle events case study with an Al slab. Adv. Space Res. 59(1), 393-400 (2017)
19. Loffredo, F., Scala, A., Adinolfi, G.M., Savino, F., Quarto, M.: A new geostatistical tool for the analysis of the geographical variability of the indoor radon activity. Nukleonika 65(2), 99-104 (2020)
20. Savino, F., Pugliese, M., Quarto, M., Adamo, P., Loffredo, F., De Cicco, F., Roca, V.: Thirty years after Chernobyl: long-term determination of 137Cs effective half-life in the lichen Stereocaulon vesuvianum. J. Environ. Radioact. 172, 201-206 (2017)
21. Tamburis, O., Mangia, M., Contenti, M., Mercurio, G., Mori, A.R.: The LITIS conceptual framework: measuring eHealth readiness and adoption dynamics across the Healthcare Organizations. Health Technol. 2(2), 97-112 (2012). https://doi.org/10.1007/s12553-012-0024-5
Automatic Measurement of Acquisition for COVID-19 Related Information

Alessandra Amato1(B), Flora Amato1, Leonard Barolli2, and Francesco Bonavolontà1

1 Department of Electrical Engineering and Information Technologies and Centro Servizi Metrologici e Tecnologici Avanzati (CeSMA), University of Napoli Federico II, Naples, Italy
{alessandra.amato,flora.amato,francesco.bonavolonta}@unina.it
2 Department of Information and Communication Engineering, Fukuoka Institute of Technology (FIT), Fukuoka, Japan
[email protected]
Abstract. This paper investigates data representation and extraction procedures for the management of domain-specific COVID-19 information. To integrate different data sources, including data contained in COVID-19 related clinical texts written in natural language, we studied Natural Language Processing (NLP) techniques and the main tools available for this purpose. In particular, we use an NLP pipeline implemented in Python to extract relevant information from COVID-19 related literature and apply lexicometric measures to it.
1 Introduction
Since December 2019, when the first cases of people infected with SARS-CoV-2 appeared in China, the entire global medical and scientific community immediately mobilised in the research and sharing of the knowledge obtained about this new virus. One of the first publications dedicated to describing the disease was released on January 24, 2020 by the New England Journal of Medicine. In this article, doctors from the Chinese Center for Disease Control and Prevention analysed patients who were admitted to hospital in the city of Wuhan on December 27, 2019 for coronavirus pneumonia. In the weeks following that publication, hundreds of scientific articles by different research groups were released with the intent of presenting initial answers to frequently asked questions about COVID-19, such as how the virus is transmitted, the most common symptoms, the incubation period, and the mortality rate. However, the publication process is complex and requires a long time to review and evaluate the documents. If this process had been applied unchanged even in the early period of the pandemic, there would have been significant delays in the implementation of specific therapies for patients with acute respiratory symptoms and in the production of the first vaccines. Instead, Artificial Intelligence programs were used that sped up the review of articles by experts and facilitated the dissemination of scientific
research around the world. These systems were designed for document selection and for the retrieval of essential information in the fight against the virus. The results obtained through Information Retrieval and Natural Language Processing techniques were subsequently collected and made freely available in CORD-19, a repository of academic publications on COVID-19. Content was drawn from a variety of sources, such as PubMed Central and bioRxiv, and is ranked by search engines created in the months following the release of the dataset [1-3]. In addition, other repositories were created that expanded the knowledge with new types of documents. The aim of this paper is to analyse Artificial Intelligence programs dedicated to automatic text processing and to illustrate in detail how CORD-19 works, exposing its advantages and limitations and how other applications have improved its efficiency. In an experimental section we propose our approach, aiming to extract information through a Python NLP pipeline.
2 CORD-19
CORD-19 (COVID-19 Open Research Dataset) is a free resource that includes a large collection of scientific publications and articles on COVID-19, SARS, and MERS. It was released on March 16, 2020 by the research group Allen Institute for AI (AI2) in collaboration with the White House Office of Science and Technology Policy (OSTP), the Chan Zuckerberg Initiative (CZI) Foundation, the National Library of Medicine (NLM), Microsoft Research, and the Kaggle platform coordinated by Georgetown University's Center for Security and Emerging Technology (CSET). The first version of the dataset contained 28,000 documents from Biology, Chemistry and Medicine and was downloaded 75,000 times in the first month of release alone. The goal that the creators of CORD-19 hope to achieve is to group all documents on COVID-19 from different sources into a single collection, in order to apply Information Retrieval and Natural Language Processing techniques for the extraction of useful information. The results obtained are made available to physicians to find experimental treatments for sick people in intensive care and to support the realisation of the first vaccines; moreover, they are used by policy makers as an assessment tool for managing the pandemic in different countries around the world. In this section, we discuss how artificial intelligence systems were used for inserting documents into the dataset and for extrapolating relevant concepts. The material included in CORD-19 is sourced from free digital archives containing open access documents. The main datasets used as sources are:
• PubMed Central (PMC) - collection of academic articles published in biomedical journals;
• bioRxiv - collection of articles dedicated to biology;
• medRxiv - collection of articles devoted to medicine;
• World Health Organization (WHO) COVID-19 Database - collection of documents on COVID-19.
CORD-19 is composed of all those articles that have one of the following keywords in their content:
• COVID-19;
• Coronavirus;
• 2019-nCoV;
• SARS-CoV;
• MERS-CoV;
• Severe Acute Respiratory Syndrome;
• Middle East Respiratory Syndrome.
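As an illustration, this inclusion criterion can be expressed as a simple filter; the following is a minimal sketch (not the actual CORD-19 ingestion code), where candidate_papers and its text field are hypothetical names.

KEYWORDS = ["COVID-19", "Coronavirus", "2019-nCoV", "SARS-CoV", "MERS-CoV",
            "Severe Acute Respiratory Syndrome",
            "Middle East Respiratory Syndrome"]

def matches_inclusion_criterion(paper):
    # paper is assumed to be a dict with a "text" field holding title + body
    content = paper.get("text", "").lower()
    return any(kw.lower() in content for kw in KEYWORDS)

# candidate_papers: hypothetical list of paper records
corpus = [p for p in candidate_papers if matches_inclusion_criterion(p)]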
The main documents considered for inclusion in the dataset are divided into metadata papers and articles in PDF or XML format. After processing, each paper is associated with a set of metadata fields (title, authors, publication date, etc.) and with a unique identifier such as a DOI, PubMed Central ID, PubMed ID, WHO Covidence ID, or MAG ID. Metadata are harmonised and deduplicated by Semantic Scholar in a three-step process:
1. Paper clustering - all papers with the same identifier tuple (doi, pmc id, pubmed id) are collected, and a unique CORD UID identifier is assigned to each cluster (see the sketch after the list below). The CORD UID is persistent across releases of the dataset and facilitates the addition of new documents.
2. Canonical metadata selection - canonical metadata are selected for each cluster based on the type of copyright license; empty metadata fields are filled with the values present in the same cluster.
3. Cluster filtering - all metadata related to summaries, indexes, and information documents are removed from the dataset.
The 73,000 metadata records present in the first version of CORD-19 were reduced after this process to 51,100, divided into:
• 28,600 from PMC documents;
• 1,100 from medRxiv documents;
• 800 from bioRxiv documents;
• 1,100 from WHO documents;
• 19,500 from publishers' documents.
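A minimal sketch of the clustering step (step 1 above) follows; the field names and the UID scheme are our assumptions, not the actual Semantic Scholar code.

from collections import defaultdict
import hashlib

def cluster_key(record):
    # records sharing the same (doi, pmcid, pubmed_id) tuple fall into one cluster
    return (record.get("doi"), record.get("pmcid"), record.get("pubmed_id"))

# metadata_records: hypothetical list of raw metadata dicts to be deduplicated
clusters = defaultdict(list)
for record in metadata_records:
    clusters[cluster_key(record)].append(record)

# Assign a persistent CORD UID to each cluster (here simply a hash of the key)
cord_uid = {key: hashlib.sha1(repr(key).encode()).hexdigest()[:8]
            for key in clusters}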
Most of the CORD-19 corpus includes information from articles in XML or PDF format. Pipeline analysis is used to extract the full text and bibliographies from the documents, allowing relevant information such as the title and authors of the text to be obtained directly from the sources, without the use of external programs such as ScienceParse [4, 5]. The process was realized for the S2ORC (Semantic Scholar Open Research Corpus) dataset and includes a step converting all PDFs into TEI XML files using the GROBID library. Articles from PMC are instead converted to the JATS XML format. Subsequently, all files are converted to JSON and cleaned of unnecessary links related to citations and bibliographic entries. CORD-19 has allowed users of the platform to be constantly updated on the evolution of the pandemic. The knowledge obtained from the literature is filtered and organized through the use of several tools:
• BM25 (Best Matching 25) - ranking algorithm used by search engines to estimate the relevance of documents for a given query;
• ScispaCy - package written in Python containing models for the extraction of scientific, biomedical and clinical concepts from texts;
• COVIZ - visualization tool used to identify documents describing possible relationships between extracted entities;
• COVIDASK - Question Answering (QA) system created by a South Korean research group that combines biomedical text mining and QA techniques to answer user questions posed in real time;
• ASReview - learning system designed to assist researchers in finding relevant papers during literature review;
• Vespa - a text search engine from Verizon Media that generates summaries of papers.
Since its release, CORD-19 has been used as a data mining resource through competitions [6]. The Kaggle platform, in collaboration with AI2 and OSTP, organised the CORD-19 Research Challenge. Participants were tasked with answering key scientific questions about COVID-19 by deriving answers from articles in the dataset. The purpose of the competition was to provide an opportunity for global experts in artificial intelligence and the NLP community to develop text mining tools that can help clinicians in the fight against the virus. More than 500 development teams participated in the competition, and the solutions found were judged by a panel of experts in the biomedical field. Thanks to their intervention it was possible to create a table where, for each question, the answers were collected from the different automatic extractions [7]. During a series of TREC conferences, the TREC-COVID Information Retrieval Challenge was organized by AI2, NIST, NLM, Oregon Health & Science University (OHSU) and the University of Texas Health Science Center (UTHealth). In the first round of the competition, participants were presented with thirty questions derived from MedlinePlus searches and Twitter conversations. Each week, teams were required to submit a collection of texts useful in solving the questions posed. The overall goal of the contest was to evaluate systems on their ability to rank documents based on the queries asked. With these evaluations, the most effective methods for managing scientific information in future global biomedical crises may be discovered. Even today, although CORD-19 is constantly being updated, with new documents added daily and many automated mechanisms used for information retrieval [8], it has several limitations. First, the collection appears to be incomplete, as it includes only scientific articles and academic publications with an open access copyright license. The dataset lacks technical reports, informational publications from government agencies, graphs, tables, and text describing the early phase of the pandemic in China. In addition, not all articles are processed correctly during their inclusion in the collection. This is because the PDF format is designed for document and image representation and not for automated analysis. Therefore, the content of the articles and the extracted metadata will be imperfect and will be cleaned of duplicate canonical metadata before being reprocessed again by NLP systems. Finally, the conversion of texts
into JSON turns out to be an expensive and lossy process. Information has to be validated through a model for the verification and validation of domain-related documents [9]. Nevertheless, CORD-19 has been extended by several systems, such as COVID-19 Fatcat Snapshot, CO-Search, CovidQA, COVIDSeer and Neural Covidex, that have improved its efficiency. The operation of these programs is discussed in more detail in the next section.
3 CORD-19 Applications
To retrieve material not contained in the CORD-19 corpus, the Internet Archive (IA) released a dataset called COVID-19 Fatcat Snapshot. In its first version, the archive contained 45,294 articles extracted from PDF files. During processing, each paper was associated with one of the unique identifiers used in CORD-19 for paper grouping. This facilitates the correspondence between the PDFs entered in COVID-19 Fatcat Snapshot and their respective metadata fields present in the archive created by AI2. The task of extracting the content from the documents in the dataset was entrusted to PDFMEF, a tool that derives entities from academic articles thanks to four different internal programs:
1. PDFFigures - a system used for extracting captions, tables and figures. The extracted contents are collected in a directory that has the document identifier as prefix. In the first version of COVID-19 Fatcat Snapshot, 101,937 images were extracted;
2. GROBID - program also used by CORD-19 to extract bibliographic data of documents such as title, authors and place of publication;
3. PDFBOX - tool used for complete extraction of abstracts. It turns out to be more efficient than GROBID in processing documents correctly;
4. Citation-enhanced keyphrase extraction (CeKe) - supervised model for keyword extraction. A list of keywords that correctly describe the content of the document is created for each paper [10, 11].
The extracted information is processed using an NLP pipeline and then indexed by Elasticsearch, a Lucene-based search server. To search for information in COVID-19 Fatcat Snapshot, COVIDSeer was used, a tool developed by the SeerSuite Project that filters the documents in the corpus based on the keywords used. The search engine was realized to simplify the collection of all the documents useful for answering the questions posed by scientists. With COVIDSeer it is possible to perform two types of searches:
1. Faceted search - a technique that allows users to narrow down the results obtained by applying filters. It is a very advanced type of search that classifies papers based on various parameters such as author, source, journal and year of publication;
2. Paper recommendation - a technique that provides users with a list of all papers similar to the searched article. The list is created through the use of templates that compare and find semantic similarities between articles [12, 13]. A similar approach has also been used by the Citeomatic recommendation system [14].
Since their online release in March 2020, COVID-19 Fatcat Snapshot and COVIDSeer have been continuously updated as documents are added to the dataset and new queries are implemented in the search engine [15].
4 NLP Pipeline in Python
In this section we report our NLP pipeline, developed to analyse medical-related texts. We used the nltk package for text-processing primitives. We extract information taken from a sample of COVID-19 related literature1.

1 https://www.epicentro.iss.it/en/coronavirus/symptoms-diagnosis

text = '''Human coronaviruses can sometimes cause lower respiratory tract illnesses, such as pneumonia or bronchitis. This is more common in people with pre-existing chronic cardiovascular and/or respiratory diseases, as well as individuals with weakened immune systems, infants and elderly people. Other human coronaviruses that made the leap from animals to humans, like MERS-CoV and SARS-CoV, can cause severe symptoms. Symptoms of Middle East Respiratory Syndrome typically include a fever, cough and shortness of breath, which often progress to pneumonia, and around 3 to 4 in 10 cases are fatal. MERS cases are still occurring, mostly in the Arabian Peninsula. Symptoms of Severe Acute Respiratory Syndrome include fever, chills and muscle aches, which usually progress to pneumonia. Since 2004, however, no new cases of SARS-CoV infection have been reported anywhere in the world.'''

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

sentences = nltk.sent_tokenize(text)  # sentence splitting
token = nltk.word_tokenize(text)      # tokenization
pos_tagged = nltk.pos_tag(token)      # POS tagging

We show in the following the POS tag associated to each token:

pos_tagged
[('Human', 'JJ'), ('coronaviruses', 'NNS'), ('can', 'MD'), ('sometimes', 'RB'), ('cause', 'VB'), ('lower', 'JJR'), ('respiratory', 'NN'), ('tract', 'NN'), ('illnesses', 'NNS'), (',', ','), ('such', 'JJ'), ('as', 'IN'), ('pneumonia', 'NN'), ('or', 'CC'), ('bronchitis', 'NN'), ('.', '.'), ('This', 'DT'), ('is', 'VBZ'), ('more', 'JJR'), ('common', 'JJ'), ('in', 'IN'), ('people', 'NNS'), ('with', 'IN'), ('pre-existing', 'JJ'), ('chronic', 'JJ'), ('cardiovascular', 'NN'), ('and/or', 'NN'), ('respiratory', 'NN'), ('diseases', 'NNS'), (',', ','), ('as', 'RB'), ('well', 'RB'), ('as', 'IN'), ('individuals', 'NNS'), ('with', 'IN'), ('weakened', 'VBN'), ('immune', 'JJ'), ('systems', 'NNS'), (',', ','), ('infants', 'NNS'), ('and', 'CC'), ('elderly', 'JJ'), ('people', 'NNS'), ('.', '.'), ('Other', 'JJ'), ('human', 'JJ'), ('coronaviruses', 'NNS'), ('that', 'WDT'), ('made', 'VBD'), ('the', 'DT'), ('leap', 'NN'), ('from', 'IN'), ('animals', 'NNS'), ('to', 'TO'), ('humans', 'NNS'), (',', ','), ('like', 'IN'), ('MERS-CoV', 'NNP'), ('and', 'CC'), ('SARS-CoV', 'NNP'), (',', ','), ('can', 'MD'), ('cause', 'VB'), ('severe', 'JJ'), ('symptoms', 'NNS'), ('.', '.'), ('Symptoms', 'NNS'), ('of', 'IN'), ('Middle', 'NNP'), ('East', 'NNP'), ('Respiratory', 'NNP'), ('Syndrome', 'NNP'), ('typically', 'RB'), ('include', 'VBP'), ('a', 'DT'), ('fever', 'NN'), (',', ','), ('cough', 'NN'), ('and', 'CC'), ('shortness', 'NN'), ('of', 'IN'), ('breath', 'NN'), (',', ','), ('which', 'WDT'), ('often', 'RB'), ('progress', 'VBP'), ('to', 'TO'), ('pneumonia', 'VB'), (',', ','), ('and', 'CC'), ('around', 'RB'), ('3', 'CD'), ('to', 'TO'), ('4', 'CD'), ('in', 'IN'), ('10', 'CD'), ('cases', 'NNS'), ('are', 'VBP'), ('fatal', 'JJ'), ('.', '.'), ('MERS', 'JJ'), ('cases', 'NNS'), ('are', 'VBP'), ('still', 'RB'), ('occurring', 'VBG'), (',', ','), ('mostly', 'RB'), ('in', 'IN'), ('the', 'DT'), ('Arabian', 'JJ'), ('Peninsula', 'NNP'), ('.', '.'), ('Symptoms', 'NNP'), ('of', 'IN'), ('Severe', 'NNP'), ('Acute', 'NNP'), ('Respiratory', 'NNP'), ('Syndrome', 'NNP'), ('include', 'VBP'), ('fever', 'NN'), (',', ','), ('chills', 'NNS'), ('and', 'CC'), ('muscle', 'NN'), ('aches', 'NNS'), (',', ','), ('which', 'WDT'), ('usually', 'RB'), ('progress', 'VBP'), ('to', 'TO'), ('pneumonia', 'VB'), ('.', '.'), ('Since', 'IN'), ('2004', 'CD'), (',', ','), ('however', 'RB'), (',', ','), ('no', 'DT'), ('new', 'JJ'), ('cases', 'NNS'), ('of', 'IN'), ('SARS-CoV', 'NNP'), ('infection', 'NN'), ('have', 'VBP'), ('been', 'VBN'), ('reported', 'VBN'), ('anywhere', 'RB'), ('in', 'IN'), ('the', 'DT'), ('world', 'NN'), ('.', '.')]

In order to create an NP-chunker, we must first define a grammar (chunk grammar), which consists of rules that indicate how sentences should be chunked. We define such a grammar by using a regular expression. In our example, the regular expression says that an NP chunk should be formed whenever the chunker finds a determiner (DT) followed by an adjective (JJ) and then a noun (NN). Using this grammar, we create a chunk parser and test it on the sentences. We define the grammar with the label SintagmN. The sequence of POS tags that makes up the rule is given by the tags enclosed in angle brackets, while the whole rule is enclosed in curly brackets and preceded by its label and a colon, which indicates the type of element that we are going to extract. We exploit the mechanism of regular expressions in order to recognize all entities of the form ARTICLE + ADJECTIVE + NOUN.

grammar = "SintagmN: {<DT><JJ><NN>}"
Let's now apply the grammar we have just defined to the text, in order to extract all the DT+JJ+NN sequences that satisfy it:

cp = nltk.RegexpParser(grammar)
SintagmN = cp.parse(pos_tagged)

Now we visualize the resulting parse tree:

print(SintagmN)
(S Human/JJ coronaviruses/NNS can/MD sometimes/RB cause/VB lower/JJR respiratory/NN tract/NN illnesses/NNS ,/, such/JJ as/IN pneumonia/NN or/CC bronchitis/NN ./. This/DT is/VBZ more/JJR common/JJ in/IN people/NNS with/IN pre-existing/JJ chronic/JJ cardiovascular/NN and/or/NN respiratory/NN diseases/NNS ,/, as/RB well/RB as/IN individuals/NNS with/IN weakened/VBN immune/JJ systems/NNS ,/, infants/NNS and/CC elderly/JJ people/NNS ./. Other/JJ human/JJ coronaviruses/NNS that/WDT made/VBD the/DT leap/NN from/IN animals/NNS to/TO humans/NNS ,/, like/IN MERS-CoV/NNP and/CC SARS-CoV/NNP ,/, can/MD cause/VB severe/JJ symptoms/NNS ./. Symptoms/NNS of/IN Middle/NNP East/NNP Respiratory/NNP Syndrome/NNP typically/RB include/VBP a/DT fever/NN ,/, cough/NN and/CC shortness/NN of/IN breath/NN ,/, which/WDT often/RB progress/VBP to/TO pneumonia/VB ,/, and/CC around/RB 3/CD to/TO 4/CD in/IN 10/CD cases/NNS are/VBP fatal/JJ ./. MERS/JJ cases/NNS are/VBP still/RB occurring/VBG ,/, mostly/RB in/IN the/DT Arabian/JJ Peninsula/NNP ./. Symptoms/NNP of/IN Severe/NNP Acute/NNP Respiratory/NNP Syndrome/NNP include/VBP fever/NN ,/, chills/NNS and/CC muscle/NN aches/NNS ,/, which/WDT usually/RB progress/VBP to/TO pneumonia/VB ./. Since/IN 2004/CD ,/, however/RB ,/, no/DT new/JJ cases/NNS of/IN SARS-CoV/NNP infection/NN have/VBP been/VBN reported/VBN anywhere/RB in/IN the/DT world/NN ./.)

This extracted information allows us to apply lexicometric measures on the text, based on the TF-IDF index [12]; here counter, wordsCount, docFreq, N and doc are assumed to be defined elsewhere in the pipeline:

import numpy as np

TFidf = {}
for tok in set(token):
    tf = counter[tok] / wordsCount  # term frequency of tok in this document
    df = docFreq(tok)               # number of documents containing tok
    idf = np.log(N / (df + 1))      # inverse document frequency over N documents
    TFidf[(doc, tok)] = tf * idf
5 Conclusions
The COVID-19 pandemic has had a major global impact in the economic, historical, institutional and social fields, and there will be strong repercussions in the coming years. However, the scientific community has managed to find faster solutions to this new disease, also thanks to the use of Artificial Intelligence systems that have facilitated the collection and dissemination of the work carried out by various research groups around the world. To date, CORD-19 contains more than 400,000 academic articles, of which more than 150,000 in full text, and the various Natural Language Processing and Information Retrieval techniques applied to it have made it possible to collect the information useful in the ongoing fight against the coronavirus.
References

1. Loffredo, F., Vardaci, E., Quarto, M., Roca, V., Pugliese, M.: Validation of electromagnetic and hadronic physical processes in the interaction of a proton beam with matter: a solar particle events case study with an Al slab. Adv. Space Res. 59(1), 393-400 (2017)
2. Loffredo, F., Scala, A., Adinolfi, G.M., Savino, F., Quarto, M.: A new geostatistical tool for the analysis of the geographical variability of the indoor radon activity. Nukleonika 65(2), 99-104 (2020)
3. Savino, F., Pugliese, M., Quarto, M., Adamo, P., Loffredo, F., De Cicco, F., Roca, V.: Thirty years after Chernobyl: long-term determination of 137Cs effective half-life in the lichen Stereocaulon vesuvianum. J. Environ. Radioact. 172, 201-206 (2017)
4. Castiglione, A., Cozzolino, G., Moscato, F., Moscato, V.: Cognitive analysis in social networks for viral marketing. IEEE Trans. Ind. Inf. (2020)
5. Amato, A., Cozzolino, G., Maisto, A., Pelosi, S.: Analysis of COVID-19 data. Lect. Notes Netw. Syst. LNNS 158, 251-260 (2021)
6. Angrisani, L., Bonavolontà, F., Liccardo, A., Moriello, R.S.L., Ferrigno, L., Laracca, M., Miele, G.: Multi-channel simultaneous data acquisition through a compressive sampling-based approach. Measurement 52, 156-172 (2014)
7. Bonavolontà, F., et al.: On the suitability of compressive sampling for the measurement of electrical power quality. In: 2013 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), pp. 126-131. IEEE (2013)
8. Canonico, R., et al.: A smart chatbot for specialist domains. Adv. Intell. Syst. Comput. AISC 1150, 1003-1010 (2020)
9. Amato, F., Cozzolino, G., Moscato, F., Moscato, V., Xhafa, F.: A model for verification and validation of law compliance of smart-contracts in IoT environment. IEEE Trans. Ind. Inf. (2021)
10. Barolli, L., Koyama, A., Durresi, A., De Marco, G.: A web-based e-learning system for increasing study efficiency by stimulating learner's motivation. Inf. Syst. Front. 8(4), 297-306 (2006)
11. Xhafa, F., Barolli, L.: Semantics, intelligent processing and services for big data (2014)
12. Cozzolino, G.: Using semantic tools to represent data extracted from mobile devices. In: 2018 IEEE International Conference on Information Reuse and Integration (IRI), pp. 530-536. IEEE (2018)
13. Amato, F., Coppolino, L., Cozzolino, G., Mazzeo, G., Moscato, F., Nardone, R.: Enhancing random forest classification with NLP in DAMEH: a system for DAta Management in eHealth Domain. Neurocomputing 444, 79-91 (2021)
14. Amato, F., Casola, V., Cozzolino, G., De Benedictis, A., Mazzocca, N., Moscato, F.: A security and privacy validation methodology for e-health systems. ACM Trans. Multimedia Comput. Commun. Appl. (TOMM) 17(2s), 1-22 (2021)
15. Angrisani, L., Bonavolontà, F., Liccardo, A., Moriello, R.S.L., Serino, F.: Smart power meters in augmented reality environment for electricity consumption awareness. Energies 11(9), 2303 (2018)
Algorithms for Mastering Board Game Nanahoshi Considering Deep Neural Networks

Masato Saito and Hiroyoshi Miwa(B)

Graduate School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda-shi, Hyogo 669-1337, Japan
{MasatoSaito,miwa}@kwansei.ac.jp
Abstract. With the recent rapid progress of AI (Artificial Intelligence) technology, the performance of AI in board games such as Chess, Shogi, and Go has become very high. The combination of advanced search-tree methods and deep learning has made it possible for AI to outperform professionals. "Nanahoshi" is a board game similar to Chinese Anqi. Unlike Go and Shogi, Nanahoshi is not a perfect information game, because all pieces are faced down and hidden in the initial state. The uncertainty largely increases the difficulty of search for a game AI. In this paper, we design some algorithms that master Nanahoshi. First, we design an algorithm using the Monte Carlo tree search. Next, we design an algorithm using deep neural networks. Furthermore, we evaluate the performance of these algorithms based on the winning rate obtained by playing games between them.
1 Introduction
With the recent rapid progress of AI (Artificial Intelligence) technology, the performance of AI in board games such as Chess, Shogi, and Go has become very high. The combination of advanced search-tree methods [1] and deep learning has made it possible for AI to outperform professionals. In fact, AlphaGo [2], an AI for Go, defeated the professional Go player Fan Hui 2-dan. It also achieved overwhelming victories against existing Go AIs, including Zen [3], the most powerful Go AI at the time. Furthermore, its successor, AlphaGo Zero [4], defeated the world's top Go player at the time, Ke Jie 9-dan. By applying the ideas of AlphaGo and AlphaGo Zero, various game AIs, such as one for a turn-based strategy game [5] and one for a simplified mahjong game [6], have been studied. "Nanahoshi" is a board game similar to Chinese Anqi, created by professional Shogi player Madoka Kitao and Hiroki Kaneko. Figure 1 shows the rules of the Nanahoshi game. There are 12 cells on the board, and a piece is placed on each cell. At the beginning of the game, all pieces are faced down and hidden.
On the face of each piece, either a red or a yellow ladybug is drawn, together with the points of the piece and the directions in which it can move; a clover is drawn on the back of each piece. Two players, one red and one yellow, can move only the ladybug pieces of their own color. On a player's turn, the player must either turn over a clover (face-down) piece, turning it to the left or the right, or move a ladybug piece of the player's color to a neighboring cell to which the piece can move. If the piece on the destination cell is the opponent's, it is removed and the player gets the points of the opponent's piece. The first player who acquires seven points, as the sum of the points of the opponent's pieces that the player has captured, wins the game.
Fig. 1. Board game Nanahoshi’s rule (Color figure online)
Unlike Go and Shogi, Nanahoshi is not a perfect information game, because all pieces are faced down and hidden in the initial state. The uncertainty largely increases the difficulty of search for a game AI [7]. In this paper, we design some algorithms that master the Nanahoshi game. First, we design an algorithm using the Monte Carlo tree search [8]. Next, we design an algorithm using a deep neural network. Furthermore, we evaluate the performance of these algorithms based on the winning rate obtained by playing games between them.
2 Previous Research
AlphaGo is a board-game AI for Go that uses deep learning technology. AlphaGo uses two types of CNNs (Convolutional Neural Networks).
The first type is the policy network. The input is a state of the board and the output is the cell on which a stone is placed at the next turn. A state of the board includes information about the location of stones of each color and the positions of liberties ("breathing points"). The second type is the value network. The input is a state of the board and the output is the winning rate of that board state. The game records of top players on an Internet Go site are used as learning data. Since it is difficult to define an evaluation value for a Go board, Go AIs before AlphaGo could not achieve high performance; AlphaGo became an excellent Go AI because the value network makes highly accurate board-state evaluation possible. AlphaGo uses the Monte Carlo tree search with these two CNNs. In the Monte Carlo tree search, a vertex of the tree is a board state and an edge is a player's move. During the tree search, each vertex is scored and child vertices are expanded from the vertex with the highest score. In this scoring, values estimated by deep learning are used: the probability of a move output from the policy network determines the weight with which the move is selected, and the value output from the value network is used to evaluate the move. The Monte Carlo tree search is described in detail in the next section. AlphaZero [9], the successor to AlphaGo, was developed as a stronger AI with less game-specific information; it can be applied to Go, Chess, and Shogi alike. It does not use human game record data, but automatically plays games against itself to create game records. The subject of these studies is perfect information games, in which a game is not affected by luck. In this paper, by contrast, we aim to develop a game AI for the imperfect information game Nanahoshi, based on the game AI model of AlphaZero.
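As a schematic illustration of the two network types described above (not the actual AlphaGo architecture; the board size, the number of input planes, and the layer sizes are assumptions):

import tensorflow as tf

def make_policy_net():
    # board state in, probability distribution over the 19 x 19 cells out
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu",
                               input_shape=(19, 19, 17)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(19 * 19, activation="softmax")])

def make_value_net():
    # board state in, scalar evaluation of the position out (in [-1, 1])
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu",
                               input_shape=(19, 19, 17)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1, activation="tanh")])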
3 Game AI for Nanahoshi Using Game Tree Search

3.1 Game Tree Search
In general, a game AI for a board game uses tree search to represent the progress of the game. A vertex corresponds to a board state and an edge corresponds to a player's move. Since the turns of the two players, red and yellow, alternate in the Nanahoshi game, red and yellow edges alternate along any path from the root of the tree. As the tree is descended from the root, the vertices correspond to the successive board states as the game progresses. We develop a game AI using the Monte Carlo tree search, which is one of the tree search algorithms. In the next section, we describe the Monte Carlo tree search in detail.

3.2 Monte Carlo Tree Search
The Monte Carlo tree search is a search that uses random plays to score vertices. According to the score of a vertex, vertices to be searched further are limited to
reduce useless search. Figure 2 shows the concept of the Monte Carlo tree search algorithm. The root corresponds to the initial board state; a vertex corresponds to a board state; an edge corresponds to a player's move. The value of UCB1 (Upper Confidence Bound 1) represents the sum of the winning rate and a bias term. Precisely, the value of UCB1 is defined as w/n + √(2 log t / n), where w is the number of wins for the vertex, n is the number of times that the vertex has been searched, and t is the number of times that all vertices have been searched.
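A direct transcription of this score into code (a sketch, not the authors' implementation) is:

import math

def ucb1(w, n, t):
    # w: wins at the vertex, n: visits of the vertex, t: total searches
    if n == 0:
        return float("inf")   # unvisited vertices are explored first
    return w / n + math.sqrt(2 * math.log(t) / n)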
Fig. 2. Algorithm for Monte Carlo tree search
The bias of a vertex is a value that becomes smaller as the number of times the vertex has been searched increases. Each vertex stores the board state, the number of times that the vertex has been searched, and the winning rate; the value of UCB1 is computed from these values. Descending from the root towards a leaf, the child vertex with the largest UCB1 value is repeatedly chosen. When a leaf is reached, if the number of times that the vertex has been searched is greater than or equal to a threshold, a new vertex adjacent to it is created as a child vertex; otherwise, the game is executed by random play from the board state corresponding to the vertex and the winner is
determined. This random play is called a playout. Then, the values of the vertices are updated. This process is repeated a certain number of times, and the move leading to the vertex that has been searched the largest number of times is selected.
4 Game Tree Search Using Deep Learning

4.1 Basic Idea of Game Tree Search Using Deep Learning
First, we describe the structure of the game AI of Nanahoshi using deep learning; we show the basic concept in Fig. 3. Our proposed game AI uses a combination of deep learning, reinforcement learning, and tree search. We briefly describe the role of each component. The deep neural network learns the relationship among a board state, the winning rate of the board state, and the next move. The value of a vertex in the game tree is determined by the deep neural network. Reinforcement learning is used to improve the neural network made by deep learning.
Fig. 3. Basic idea of game AI using deep learning
Next, we describe the learning cycle. First, self-plays are executed by using the neural network made by deep learning. We denote this neural network by N. In the initial state, the neural network N is determined randomly. In self-plays, the move is determined by using the Boltzmann distribution. After a sufficient number of games, we use the data of the self-plays as training data for deep learning. The data include board states, next moves, and the results (win or lose). We use a learning model called the residual network for deep learning. The self-plays are executed by using the neural network N, and let N′
be the neural network made by using the data of the self-plays as the training data. Self-plays between the game AI of N and the game AI of N′ are executed, and N is updated to whichever of N and N′ has the higher winning rate. By repeating this process, the neural network N is reinforced.
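A minimal sketch of the Boltzmann (softmax) move selection used during the self-plays follows; the temperature parameter is an assumption.

import numpy as np

def boltzmann_choice(move_probs, temperature=1.0):
    # Sharpen or flatten the network's move probabilities, then sample a move
    p = np.power(np.asarray(move_probs, dtype=np.float64), 1.0 / temperature)
    p /= p.sum()
    return np.random.choice(len(p), p=p)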
4.2 Monte Carlo Tree Search Using Deep Learning
We describe the Monte Carlo tree search using deep learning, which plays the role of determining moves in a match. Our proposed game AI evaluates the value of vertices in the Monte Carlo tree search by using deep learning, without playing out games. We can expect that this improves the speed and accuracy of the search. More precisely, the bias in UCB1 of the Monte Carlo tree search is determined by deep learning. We define UCB1 as w/n + c_puct · p · √t/(1 + n), where w is the number of wins for the vertex, n is the number of times that the vertex has been searched, t is the number of times that all vertices have been searched, p is the probability of the move, and c_puct is a constant. Next, we describe the board representation for deep learning of the Nanahoshi game. Nanahoshi has 12 pieces in total: two red (resp. yellow) pieces with one point, two red (resp. yellow) pieces with two points, and two red (resp. yellow) pieces with three points. The board of Nanahoshi has the layout of a 4 × 4 grid with the 4 corner cells removed. A board state is represented by a 4 × 4 matrix with 8 channels. The channels represent the positions of the two pieces of each type, the positions of reversed pieces, and the positions of pieces showing the clover. We show an example in Fig. 4.
Fig. 4. A board state and the representation (Color figure online)
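The representation above can be sketched as follows; the channel ordering is our assumption, since the paper's Fig. 4 fixes the actual layout.

import numpy as np

# 4 x 4 board, 8 feature channels (piece types per colour, reversed pieces,
# and pieces still showing the clover); corner cells are simply left unused
board_state = np.zeros((4, 4, 8), dtype=np.float32)

def place(state, row, col, channel):
    state[row, col, channel] = 1.0   # mark a piece of the given type or status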
Nanahoshi has no large collection of game records like Go, and there are no strong players like professionals. Therefore, we generate the training data by self-plays. Let N be the neural network made by deep learning using some training data. The neural network N is assumed to be the best one at the moment, and N
is assumed to take random values if it has never been trained. A game is played until a winner is determined (at least 7 points are scored by one of the players). If a game continues for more than 300 moves, the game is a draw. Moreover, we use the Boltzmann distribution to select moves. In this paper, we generate the training data by 500 self-plays. The input of the deep neural network is the board state and the output is the probabilities of all possible next moves. We use the residual network (ResNet) [10] as the model for the deep neural network (Figs. 5 and 6).
Fig. 5. Residual block
Fig. 6. Residual network
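A minimal Keras sketch of the residual block of Figs. 5 and 6 follows; the filter count is an assumption, since the figures give only the block diagram, and the input is assumed to already have the same number of channels as the block.

import tensorflow as tf

def residual_block(x, filters=64):
    # x is assumed to already have `filters` channels so the Add matches shapes
    shortcut = x
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(x)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.ReLU()(y)
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(y)
    y = tf.keras.layers.BatchNormalization()(y)
    y = tf.keras.layers.Add()([shortcut, y])   # skip connection
    return tf.keras.layers.ReLU()(y)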
5 Performance Evaluation
We evaluate the performance of the game AIs. The measure is the average winning rate obtained by playing the AIs against each other 100 times on the same initial board state, for each of 100 different initial board states.
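This evaluation protocol can be sketched as follows; play_game and the AI objects are placeholders for the actual implementations.

def evaluate(ai_a, ai_b, n_boards=100, n_games=100):
    rates = []
    for board_seed in range(n_boards):
        # count how often ai_a wins the n_games played on this board
        wins = sum(play_game(ai_a, ai_b, seed=board_seed) == "A"
                   for _ in range(n_games))
        rates.append(wins / n_games)
    return sum(rates) / len(rates)   # average winning rate of ai_a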
First, we show the results of the plays between the naive game AI that takes random moves and the game AI that uses the Monte Carlo tree search. In the Monte Carlo tree search, the number of searches to output a move is 100. The results are shown in Fig. 7. The Monte Carlo tree search game AI wins 49 times and loses 51 times; its performance is equivalent to that of the naive game AI.
Fig. 7. Monte Carlo tree search game AI and random game AI
We increased the number of searches in the Monte Carlo tree search to investigate its influence. In Fig. 8, the number of searches to determine a move in the Monte Carlo tree search is increased from 100 to 1000. The vertical axis is the winning rate (number of wins/number of games) and the horizontal axis is the number of searches. For the Monte Carlo tree search in a perfect information game, a larger number of searches yields a higher winning rate. However, for the Nanahoshi game, the winning rate does not increase even if the number of searches is increased up to 1000 per move. This is because, since Nanahoshi is an imperfect information game, the size of its game tree is huge, and 1000 searches per move are not sufficient to improve the performance. Next, we show the results of our proposed game AI using deep learning. In this numerical experiment, we used the game AI made by 10 iterations of training and reinforcement learning. Figure 9 shows that the proposed game AI had 95 wins, 3 losses, and 2 ties against the random game AI. Next, we show the results of the proposed game AI against the game AI using the Monte Carlo tree search in Fig. 10. The proposed game AI had 67 wins, 3 losses, and 30 ties against the game AI using the Monte Carlo tree search. These results show that the proposed game AI using deep learning achieved higher performance than both the random game AI and the Monte Carlo tree search game AI.
Fig. 8. Number of searches and winning rate
We investigate the influence of the amount of training. We use the game AI made by 25 training iterations. The result is shown in Fig. 11. The improved game AI had 96 wins, 3 losses, and 1 tie against the random game AI. Similarly, the improved game AI had 75 wins, 0 losses, and 25 ties against the Monte Carlo tree search AI (Fig. 12). With further training, we can improve the performance of the proposed game AI; this implies that the performance can be further improved by spending more training time.
Fig. 9. Deep learning AI and random AI
Fig. 10. Deep learning AI and Monte Carlo Tree Search AI
Fig. 11. Deep learning AI with further learning and random AI
Fig. 12. Deep learning AI with further learning and Monte Carlo Tree Search AI
6 Conclusion
In this paper, we designed some algorithms that master the Nanahoshi game. First, we designed an algorithm using the Monte Carlo tree search. Next, we designed an algorithm using deep neural networks and reinforcement learning. We evaluated the performance of these algorithms based on the winning rate obtained by playing games between them. As a result, the proposed game AI based on deep neural networks and reinforcement learning achieved higher performance than both the random game AI and the Monte Carlo tree search game AI. In addition, with further training, the performance of the proposed game AI improves.
References

1. Yoshizoe, K.: Monte-Carlo tree search - a revolutionary algorithm developed for computer Go. IPSJ Mag. 49(6), 686-693 (2008)
2. Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529, 484-489 (2016)
3. Kato, H.: Zen: the world's strongest computer Go player. Trans. Jpn. Soc. Artif. Intell. AI 27(5) (2012)
4. Silver, D., et al.: Mastering the game of Go without human knowledge. Nature 550, 354-359 (2017)
5. Tomihiro, K.: Application of reinforcement learning algorithm using policy network and value network to the turn-based strategy game. In: Game Programming Workshop 2019 (GPW-2019), vol. 2019, pp. 73-79 (2019)
6. Shimizu, T., Tanaka, T.: Building a mahjong player using deep reinforcement learning. In: Game Programming Workshop 2020 (GPW-2020), vol. 2020, pp. 147-154 (2020)
7. Ayato, M., Makoto, M., Takashi, C.: Decision making with imperfect information using UCT search. In: Game Programming Workshop 2009 (GPW-2009), vol. 2009(12), pp. 43-50 (2009)
8. Browne, C.B., et al.: A survey of Monte Carlo tree search methods. IEEE Trans. Comput. Intell. AI Games 4(1), 1-43 (2012)
9. Silver, D., et al.: A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362(6419), 1140-1144 (2018)
10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778 (2016)
Revealing COVID-19 Data by Data Mining and Visualization

Carson K. Leung(B), Tyson N. Kaufmann, Yan Wen, Chenru Zhao, and Hao Zheng

University of Manitoba, Winnipeg, MB, Canada
[email protected]
Abstract. In the current era of big data, huge volumes of valuable data are generated and collected at a rapid velocity from a wide variety of rich data sources. Examples include disease and epidemiological data such as privacy-preserving statistics on patients who suffered from epidemic diseases like the coronavirus disease 2019 (COVID-19). Embedded in the huge volumes of COVID-19 data for large numbers of COVID-19 cases around the world is implicit, previously unknown and potentially useful information and knowledge—which can be discovered by data mining. As “a picture is worth a thousand words”, having the pictorial representation further enhances this knowledge discovery process. Visualization of COVID-19 data helps users discover useful information and knowledge—such as popular features and their associative relationships—related to COVID-19 cases. Moreover, visualization of discovered knowledge helps users get a better understanding and interpretation of discovered knowledge. Hence, in this paper, we present a data science solution that makes good use of both data mining and visualization for conducting data analytics and visual analytics of COVID-19 data to reveal important information and knowledge from COVID-19. Evaluation on real-life COVID-19 data demonstrates the effectiveness of our solution in revealing useful information and knowledge of COVID-19 by data mining and visualization. Keywords: Data mining · Data analytics · Visual analytics · Visualization · COVID-19
1 Introduction

In the current era of big data [1–3], huge volumes of valuable data can be easily generated and collected at a rapid velocity from a wide variety of rich data sources. In recent years, open data initiatives have also led many governments, researchers, and organizations to share their data and make them publicly accessible. Examples of open big data include biodiversity data [4], census data [5], healthcare data and disease reports (e.g., COVID-19 statistics) [6–9], music data [10, 11], patent registers [12, 13], social networks [14–16], time series (e.g., financial time series) [17–20], transportation and urban data [21–23], weather data [24], and web data [25, 26]. Embedded in these big data are useful information and valuable knowledge that can be discovered by data science [27, 28], which makes good use of data mining
algorithms [29–33], data analytics methods [34–38], visual analytics techniques [39, 40], machine learning tools [41–44] and/or mathematical and statistical models [45, 46]. Hence, analyzing and mining these big data can serve social good. For instance, analyzing and mining healthcare data and disease reports helps people get a better understanding of diseases such as the coronavirus disease 2019 (COVID-19). COVID-19 is a viral respiratory disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It was reported to break out in late 2019, became a global pandemic in early 2020, and is still prevailing in 2021. Since its declaration as a pandemic, researchers have contributed to various aspects of COVID-19. For instance, from the medical and health science perspective, health researchers have examined clinical and treatment information for drug discovery and vaccine development [47–51]. Similarly, from the social science and humanities research perspective, social scientists have studied the social and economic impact of the disease [52–54]. In contrast, as computer scientists, we focus on the natural science and engineering perspective. In this paper, we present a data science solution, which makes good use of data mining and visualization, to reveal useful information and knowledge about COVID-19 cases. Since the declaration of the COVID-19 pandemic, many visualizers and dashboards have been developed over the past two years. Notable ones that provide global COVID-19 data include (a) the World Health Organization (WHO) Coronavirus Disease 2019 (COVID-19) Dashboard1, (b) the COVID-19 Dashboard by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU)2, and (c) the COVID-19 dashboard by the European Centre for Disease Prevention and Control (ECDC)3. Others provide mostly local COVID-19 data, including those from governments (e.g., Government of Canada4) and media (e.g., TV5,6,7, Wikipedia8). To provide users or viewers with updated information about COVID-19, most of these rich data sources update their information frequently (e.g., daily). One commonality among these visualizers and dashboards is that they focus mostly on reporting the numbers of new/confirmed cases and deaths, as well as their cumulative totals. While they serve the purpose of fast dissemination of these crucial numbers related to COVID-19 cases, more information and knowledge remain embedded in the data and yet to be discovered. This motivates our work to reveal this information and knowledge from COVID-19 cases by data mining and visualization. To elaborate, existing visualizers and dashboards mostly provide one-dimensional information (e.g., numbers of new or active COVID-19 cases, recoveries, deaths). These numbers show the seriousness of the disease in a certain location. However, to help
1 https://covid19.who.int/
2 https://coronavirus.jhu.edu/map.html
3 https://qap.ecdc.europa.eu/public/extensions/COVID-19/COVID-19.html
4 https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection.html
5 https://newsinteractives.cbc.ca/coronavirustracker/
6 https://www.ctvnews.ca/health/coronavirus/tracking-every-case-of-covid-19-in-canada-1.4852102
7 https://beta.ctvnews.ca/content/dam/common/exceltojson/COVID-19-Canada-New.txt
8 https://en.wikipedia.org/wiki/Template:COVID-19_pandemic_data/Canada_medical_cases
prevent, detect, control and/or combat the disease, it is important to get a better understanding of its characteristics. Hence, in contrast to related works, our data science solution provides users with multi-dimensional information (e.g., connections among characteristics of COVID-19 cases, transmission methods, hospitalization status, clinical outcomes). The key contribution of this paper is our data science solution for revealing COVID-19 data. It does so by making good use of data mining and visualization. In particular, we focus on frequent pattern mining, which aims to discover implicit, previously unknown and potentially useful information and knowledge, such as popular characteristics of COVID-19 cases in terms of discovered frequent patterns. These help reveal the following:

• What is the most common transmission method?
• What are the most frequently observed symptoms?
• Which groups are more vulnerable to the disease?
• What is the most common hospitalization status?
• What is the most common clinical outcome?
• What are the connections among some of the aforementioned characteristics?
The remainder of this paper is organized as follows. The next section discusses related works. Section 3 describes our data science solution for revealing COVID-19 data. It visualizes data, mines frequent patterns, and visualizes these mined patterns. Section 4 shows our evaluation results on real-life COVID-19 cases. Finally, conclusions are drawn in Sect. 5.
2 Related Works

In terms of related works on visualizing frequent patterns, Jentner and Keim [55] surveyed several visualization techniques for frequent patterns. These techniques can be broadly generalized into the following categories:

• Lattice representation, which is the most intuitive representation of frequent patterns. With it, all frequent patterns are visualized as nodes in a lattice (aka concept hierarchy). In the lattice, immediate supersets and subsets of a frequent pattern are all connected by edges.
• Pixel-based visualization, in which a frequent pattern containing k items (i.e., k-itemset) is represented as a pixel. Each pixel captures one or more frequent k-itemsets of the same length k. All patterns of the same length are lexicographically sorted. Their frequencies are indicated by the color of the pixels.
• Linear visualization, in which frequent patterns are represented linearly. For instance, FIsViz [56] visualizes and represents a frequent k-itemset as a polyline that connects k nodes in a 2-dimensional space. With this representation, crucial information about any frequent k-itemset is captured by its associated (x, y)-coordinates in this space. For example, the frequency of a k-itemset is indicated by the y-coordinate.
• Observing that polylines in FIsViz may not be easily distinguishable from one another due to their potential bending and crossing-over for frequent patterns, FpVAT [40]
visualizes a frequent k-itemset as a horizontal line that connects k nodes in a 2-dimensional space in a wiring-type diagram (i.e., orthogonal graph). Again, the frequency of a k-itemset is indicated by the y-coordinate.
• Tree visualization, in which frequent patterns are represented according to a tree hierarchy. For instance, PyramidViz [57] visualizes frequent patterns in a hierarchical layout, namely a building block layout. With it, short patterns are put at the bottom of the pyramid, whereas longer related patterns (which are extensions of short patterns) are put at the top of the pyramid. Their frequencies are indicated by the color of the building blocks.
• While PyramidViz can be considered as visualizing frequent patterns with a side-view of the pyramid, an alternative visualizer, namely FpMapViz [58], visualizes frequent patterns with a top-view. With this top-view representation, short patterns are put in the background, whereas longer related patterns (which are extensions of short patterns) are put in the foreground. Again, their frequencies are indicated by the color.

These visualizers were designed to visualize frequent k-itemsets mined from a single domain. In contrast, when mining COVID-19 epidemiological data, each record captures mostly categorical data. Each categorical attribute is likely to form a domain. As such, frequent k-itemsets are likely to be mined from k domains (i.e., multiple domains), with one item from each domain. Hence, for COVID-19, our data science solution represents frequent k-itemsets mined from these multiple domains in the form of an outward sunburst diagram (i.e., an outward doughnut chart). With it, each ring represents a domain. The frequency of a k-itemset is indicated by its sector area.
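As a rough illustration of this representation, the sketch below draws such an outward sunburst with Plotly Express; the frequency table and most of its counts are hypothetical stand-ins, not the paper's actual data.

```python
import pandas as pd
import plotly.express as px

# Hypothetical multi-domain pattern frequencies: one column per domain,
# plus the support count of the full 3-itemset.
patterns = pd.DataFrame({
    "transmission": ["community", "community", "travel"],
    "hospital": ["not hospitalized", "ICU", "not hospitalized"],
    "outcome": ["recovered", "deceased", "recovered"],
    "count": [694474, 3500, 5100],
})

# Inner ring = first domain; each further ring adds one domain.
# A sector's angular size is proportional to the pattern's frequency.
fig = px.sunburst(patterns, path=["transmission", "hospital", "outcome"], values="count")
fig.show()
```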
3 Our Data Science Solution

Motivated by the existing visualizers and dashboards that mostly report the seriousness of the disease by showing numbers like new or active COVID-19 cases, recoveries, and deaths, we design and develop a data science solution that aims to reveal more information, such as characteristics of COVID-19 cases like their transmission methods, hospital statuses and clinical outcomes. This information helps users get a better understanding of the disease. In turn, users can take part in preventing, detecting, controlling and/or combating the disease.

3.1 Data Collection

The first key step of our data science solution in revealing COVID-19 data is to collect data. Data sources may vary from location to location (e.g., city, province, country). In general, COVID-19 epidemiological data are likely to capture the following:

• Administrative information such as location (e.g., names of hospital or clinic) and episode date (i.e., symptom onset date or its closest date).
• Case details such as gender, age, and occupation (e.g., health care workers, school or daycare workers, long-term care residents).
• Transmission methods (e.g., domestic acquisition via contact with a COVID-19 case or contact with a traveller, international travel).
• Indicators to show whether the case caught the original virus SARS-CoV-2 or one of its more contagious variants. For the latter, data may capture labels of the variants of concern (VOCs) or variants of interest (VOIs).
• Indicators to show whether the case is asymptomatic or symptomatic. For the latter, data may capture symptoms (e.g., chills, cough, diarrhea, fever, headache, irritability, nausea, pain, runny nose, shortness of breath, sore throat, weakness).
• Hospital status such as hospitalized in the intensive care unit (ICU), non-ICU hospitalized, and not hospitalized. Depending on the data sources, some data may capture different levels of the ICU and may contain semi-ICU (SICU).
• Clinical outcomes (e.g., recovery, death).

3.2 Data Preprocessing

Once the COVID-19 epidemiological data are collected, the next key step of our data science solution is to preprocess the data. It builds a hierarchy or taxonomy for several features. For example, to incorporate calendar effects (e.g., delay in reporting case information over the weekend), it generalizes the episode date into an episode week. Moreover, to preserve the privacy of individual identities, episode weeks with an identifiably small number of cases can be grouped into a mega-week. Similarly, to preserve privacy, individual age is generalized into age groups of 10-year intervals (e.g., 20s, 30s, etc.). Along this direction, individual specific locations (e.g., names of hospital or clinic) are also generalized into health regions, which can then be further generalized into provinces and multi-province regions. We also observed that COVID-19 data contain many NULL values, which were probably caused by privacy preservation of individual COVID-19 cases or unavailability of feature values due to fast reporting of COVID-19 cases. Instead of removing records with NULL values, we keep them.

3.3 Data Mining and Visualization

After preprocessing the collected COVID-19 data, the third key step of our data science solution is to mine and visualize the data and the mined results. With these multi-domain features, it is tempting to set up a data cube to capture the frequency of all combinations of features. Once the data cube is built, the frequency of any pattern can be easily looked up. For example, when using the generalization of the features listed in Sect. 3.1, we have generalized features like generalized location (e.g., in terms of multi-province regions), episode week, gender, age range, occupation, transmission methods, indicators for variants, a list of symptoms, hospital status, and clinical outcomes. Note that, given a list of m symptoms, there are 2^m possible combinations of symptoms. Here, the empty combination {} indicates either an asymptomatic case or a NULL case (i.e., a case with unstated symptoms). For the remaining features, there are d_i unique known values for each feature. Incorporating the NULL value (i.e., unstated value), each domain consists of (d_i + 1) possible values in the data cube. All of these lead to a data cube of size
2^m × ∏_i (d_i + 1) for the data entries. Besides the data entries, the data cube should include the aggregated entries. As such, the size would be (2^m + 1) × ∏_i (d_i + 2). Consequently, the resulting data cube can be huge. As a concrete example, let us consider five multi-province regions, 52 + 25 = 77 episode weeks (from the beginning of 2020 to June 26, 2021), two common genders, 10 age ranges, four occupations, two transmission methods (e.g., community transmission, international travel), four VOCs, a list of m = 13 different symptoms, three hospital statuses (e.g., ICU, non-ICU, not hospitalized) and two clinical outcomes (e.g., recovery, death). Then, the resulting data cube would have (2^13 + 1) × (5 + 2) × (77 + 2) × (2 + 2) × (10 + 2) × (4 + 2) × (2 + 2) × (4 + 2) × (3 + 2) × (2 + 2) ≈ 835 trillion entries. Hence, it may not be practical due to its huge size. Moreover, it may not be space efficient in the sense that many entries may have a 0 value (indicating that such combinations do not occur in the real world). As an alternative, our data science solution mines frequent patterns from the COVID-19 data so that patterns with high frequencies are returned to the users. For users to comprehend the frequencies of these patterns with respect to their related patterns (e.g., patterns with alternative values for some domains), our solution enumerates these alternatives, scans the data, accumulates and returns the resulting frequencies. For ease of interpretability, our solution visualizes the patterns in a pie chart or a sunburst diagram (i.e., a doughnut chart). These patterns reveal interesting knowledge of the COVID-19 cases (e.g., the breakdown of cases according to transmission methods to find the most common transmission method). Our data science solution is not confined to mining and visualizing frequent patterns of length 1 covering only 1 domain (i.e., 1-itemsets). It also mines and visualizes frequent patterns of length k (where k > 1) covering k domains (i.e., k-itemsets). We observed that existing frequent pattern visualizers were mainly designed to visualize frequent k-itemsets mined from a single domain. As such, records that support two (k + 1)-extensions or (k + 1)-supersets of a frequent k-itemset may overlap. For example, record t1 = {a1, a2, a3, a4} supports both {a1, a2, a3} and {a1, a2, a4}. Consequently, the frequency of any (k + 1)-extension or (k + 1)-superset is bounded above by the frequency of the frequent k-itemset. In contrast, when mining COVID-19 epidemiological data, each record captures mostly categorical data. Each categorical attribute is likely to form a domain. As such, frequent k-itemsets are likely to be mined from k domains (i.e., multiple domains), with one item from each domain. Hence, the records of two (k + 1)-extensions or (k + 1)-supersets, from the same sets of domains, of a frequent k-itemset are disjoint. For example, record t1 = {a1, b2, c3, d4} supports {a1, b2, c3} but not {a1, b2, c4}. Consequently, the sum of the frequencies of the (k + 1)-extensions or (k + 1)-supersets is bounded above by the frequency of the frequent k-itemset. Hence, our solution visualizes these k-itemsets (with their k items drawn from k domains) in an outward sunburst diagram (i.e., an outward doughnut chart). With it, each ring represents a domain. The relative frequency of a pattern is indicated by its sector area.
These patterns reveal interesting relationships among COVID-19 cases (e.g., what is the common clinical outcome for those who were infected by the most popular transmission methods?).
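A minimal sketch of this multi-domain pattern mining, counting the support of every combination of (domain, value) pairs directly from the records, might look as follows; the data frame and domain names are hypothetical, and for brevity the enumeration is exponential in the (small) number of domains rather than optimized.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

def mine_multidomain_patterns(df, domains, minsup):
    """Count k-itemsets over categorical domains (one value per domain per record)
    and keep those whose support reaches minsup."""
    counts = Counter()
    for _, row in df[domains].iterrows():
        for k in range(1, len(domains) + 1):
            for cols in combinations(domains, k):
                counts[tuple((c, row[c]) for c in cols)] += 1
    return {pattern: n for pattern, n in counts.items() if n >= minsup}

# Hypothetical usage on per-case records:
cases = pd.DataFrame({
    "transmission": ["community", "community", "travel"],
    "outcome": ["recovered", "recovered", "deceased"],
})
print(mine_multidomain_patterns(cases, ["transmission", "outcome"], minsup=2))
```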
Recall from Sect. 3.2 that we keep records with NULL values. Hence, our solution provides users with options to include or exclude records with NULL values in both data mining and visualization.
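This option can be sketched with a small pandas helper; the column name and the "unstated" placeholder label are illustrative assumptions.

```python
import pandas as pd

def breakdown(df, column, include_null=True):
    """Relative frequency (in %) of each value in one domain, optionally
    excluding records whose value is unstated (NULL)."""
    col = df[column] if include_null else df[column].dropna()
    return (col.fillna("unstated").value_counts(normalize=True) * 100).round(2)

# Hypothetical usage, mirroring the include/exclude views of Figs. 1 and 2:
cases = pd.DataFrame({"transmission": ["community", "community", None, "travel"]})
print(breakdown(cases, "transmission"))                      # includes "unstated"
print(breakdown(cases, "transmission", include_null=False))  # stated values only
```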
4 Evaluation

To evaluate our data science solution for revealing COVID-19 data, we applied it to real-life Canadian COVID-19 epidemiological data that were collected from various sources (e.g., Statistics Canada9, a Canadian national TV network10). Our solution preprocessed, mined and visualized the data and the mined results. Frequent singleton patterns reveal the breakdown of certain domains, whereas frequent non-singleton patterns reveal associative relationships among different feature values. Up to May 29, 2021 (i.e., Week 21 of 2021), there had been 1,368,442 COVID-19 cases in Canada. A frequent pattern {community exposures}:1124920 reveals that 82.20% of all Canadian cases were acquired domestically via community exposures. A quick scan of the data seeks patterns related to other transmission methods: Fig. 1(a) reveals the distribution breakdown of transmission methods by showing that the remaining 17.25% were with unstated transmission methods and 0.55% were exposed via (international) travel.
Fig. 1. (a) Visualization of all transmission methods, (b) Visualization of VOCs.
Our solution provides users with options to include or exclude NULL values. With the exclusion of NULL values for transmission methods, the frequent pattern {community exposures}:1124920 reveals that 99.34% of all Canadian cases with stated transmission methods were acquired domestically, whereas the remaining 0.66% were exposed via (international) travel.

9 https://www150.statcan.gc.ca/n1/pub/13-26-0003/132600032020001-eng.htm
10 https://www.ctvnews.ca/health/coronavirus/tracking-variants-of-the-novel-coronavirus-in-canada-1.5296141
We also examined variants of concern (VOCs)11, such as alpha (B.1.1.7), beta (B.1.351), gamma (P.1) and delta (B.1.617.2), which were first reported in the UK, South Africa, Brazil and India respectively, and variants of interest (VOIs) such as epsilon (B.1.427/B.1.429), zeta (P.2), eta (B.1.525), theta (P.3), iota (B.1.526), kappa (B.1.617.1) and lambda (C.37). Among all 205,525 cases that were screened and sequenced, a frequent pattern {alpha}:191425 reveals that 93.14% of VOCs were the alpha variant (B.1.1.7). Again, a quick scan of the data reveals patterns related to other VOCs. Figure 1(b) shows that the second most frequent VOC was the gamma variant (5.72% of cases having VOCs), followed by the beta and delta variants (0.81% and 0.33% of cases having VOCs, respectively). In terms of hospitalization, a frequent pattern {not hospitalized}:880358 reveals that 64.33% of cases did not need hospitalization. By including records with NULL values, our solution reveals that 30.42% of cases were with unstated hospital status. The remaining 4.26% and 0.99% of cases were admitted to regular wards and intensive care units (ICU), respectively. By excluding records with NULL values (i.e., by only considering stated hospital statuses), these numbers increased from 64.33% to 92.46% for no hospitalization, from 4.26% to 6.12% for regular wards, and from 0.99% to 1.43% for the ICU. Figure 2 shows the visualization for (a) including and (b) excluding records with NULL values. In other words, Fig. 2(a) visualizes all hospital statuses, whereas Fig. 2(b) visualizes only stated hospital statuses.
Fig. 2. (a) Visualization of all hospital statuses, (b) Visualization of stated hospital statuses.
Similarly, the most frequent pattern {recovered}:1319299 reveals that 96.41% of the COVID-19 cases recovered. Only 1.86% of the cases were deceased, and the remaining 1.73% were with unstated clinical outcomes. Among cases with stated clinical outcomes, 98.11% recovered and 1.89% died. Data mining and visualization on age reveal that the two most vulnerable groups were youth in their 20s and younger, as indicated by the frequent patterns {20s}:262893 and
{0–19}:258175, i.e., 19.21% and 18.87% of cases. Figure 3(a) shows that the next few age groups (in descending number of COVID-19 cases) were 30s (16.45%), 40s (14.74%) and 50s (13.16%). Seniors aged 80+ were the second smallest group in terms of absolute number of cases (with only 5.10% of all cases). However, it is important to note that the population of the age groups is not evenly distributed. For example, there are more youth (aged 0–19) than seniors (aged 80+). Taking population into consideration (e.g., estimated population on July 01, 2020) [59], Fig. 3(b) visualizes the distribution of COVID-19 based on age with respect to population. The figure reveals that 5.13% of people in their 20s got the disease. So, this age group has the highest number of COVID-19 cases in terms of both the absolute count and the relative percentage with respect to its population. Moreover, people in their 30s (4.25%) and seniors aged 80+ (4.19%) were the next two age groups. This shows that, despite being the second smallest group in terms of absolute number of cases, seniors aged 80+ ranked third among age groups in terms of relative percentage.

11 https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/
Fig. 3. Visualization of age in terms of (a) absolute numbers of cases and (b) relative percentage with respect to their population.
So far, we have only demonstrated the mining and visualization of singleton frequent patterns. It is important to note that our solution is also capable of mining and visualizing non-singleton frequent patterns, which reveal interesting relationships among features in COVID-19 cases. For example, Fig. 4 shows a visualization of transmission methods, then hospital status, and finally clinical outcomes in an outward sunburst fashion, with each ring representing one of these three domains. Recall that 82.20% of all Canadian cases were acquired domestically via community exposures, 17.25% were without stated transmission methods, and the remaining 0.55% were exposed through travel. Among the community-exposed cases, a frequent non-singleton pattern {community exposures, not hospitalized}:715750 (52.30% of all cases) reveals that a majority of them did not require hospitalization. Along this direction, another frequent non-singleton pattern {community exposures, not hospitalized, recovered}:694474 (50.75% of all cases) reveals that, among these non-hospitalized cases who were exposed to the disease via the community, most recovered. Similarly, a large fraction (0.99) of those community-exposed cases without stated hospital status also recovered (25.31% of all cases). However, a smaller fraction
Fig. 4. Visualization of transmission methods, then hospital status, and finally clinical outcomes in an outward sunburst fashion, with each ring representing one of these three domains.
(0.82) of those community-exposed cases admitted to regular wards recovered, and an even smaller fraction (0.69) of those admitted to the ICU. The fatality ratio was 0.16 and 0.28 for those community-exposed cases admitted to the regular wards and the ICU, respectively.
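The fractions quoted above behave like rule confidences: the support of a (k + 1)-itemset divided by the support of its k-itemset prefix. A small sketch using two support counts reported in this section (the function and variable names are ours, not the paper's):

```python
def confidence(support, pattern, extension):
    """Confidence of the rule pattern -> extension:
    sup(pattern | extension) / sup(pattern)."""
    return support[pattern | extension] / support[pattern]

support = {
    frozenset({"community exposures", "not hospitalized"}): 715750,
    frozenset({"community exposures", "not hospitalized", "recovered"}): 694474,
}
print(round(confidence(support,
                       frozenset({"community exposures", "not hospitalized"}),
                       frozenset({"recovered"})), 2))  # 0.97
```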
5 Conclusions

In this paper, we presented a data science solution for revealing COVID-19 data by data mining and visualization. Our solution collected data from a wide variety of rich data sources (e.g., COVID-19 epidemiological data, cases with VOCs, population), preprocessed them, and mined and visualized frequent patterns and their related patterns. These patterns capture categorical feature values from multiple domains. Our solution provides users with the option to include or exclude unstated (i.e., NULL) values, and reveals the breakdown of various features associated with COVID-19 cases and the relationships among these features. This enables users to get a comprehensive understanding of the cases, which in turn helps them take an active role in preventing, detecting, controlling and/or combating the disease. Evaluation results on real-life Canadian COVID-19 cases show the practicality and effectiveness of our data science solution in revealing COVID-19 data. As ongoing and future work, we plan to extend the current solution to mine and visualize temporal aspects of COVID-19 data to explore how the patterns change over time.

Acknowledgments. This project is partially supported by NSERC (Canada) and the University of Manitoba.
References

1. Bo, D., Ai, L., Chen, Y.: Research and application of big data correlation analysis in education. In: Barolli, L., Nishino, H., Miwa, H. (eds.) INCoS 2019. AISC, vol. 1035, pp. 454–462. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29035-1_44
2. Saberi, M., et al.: Challenges in efficient customer recognition in contact centre: state-of-the-art survey by focusing on big data techniques applicability. In: INCoS 2016, pp. 548–554 (2016)
3. Ray, J., Trovati, M.: On the need for a novel intelligent big data platform: a proposed solution. In: Xhafa, F., Barolli, L., Greguš, M. (eds.) INCoS 2018. LNDECT, vol. 23, pp. 473–478. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-98557-2_43
4. Anderson-Grégoire, I.M., et al.: A big data science solution for analytics on moving objects. In: Barolli, L., Woungang, I., Enokido, T. (eds.) AINA 2021. LNNS, vol. 226, pp. 133–145. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-75075-6_11
5. Choy, C.M., et al.: Natural sciences meet social sciences: census data analytics for detecting home language shifts. In: IMCOM 2021, pp. 520–527 (2021). https://doi.org/10.1109/IMCOM51814.2021.9377412
6. Balco, P., Kajanová, H., Linhardt, P.: Economic interpretation of eHealth implementation in countrywide measures. In: Xhafa, F., Barolli, L., Greguš, M. (eds.) INCoS 2018. LNDECT, vol. 23, pp. 255–261. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-98557-2_23
7. Leung, C.K., et al.: Big data analysis and services: visualization of smart data to support healthcare analytics. In: IEEE iThings-GreenCom-CPSCom-SmartData 2019, pp. 1261–1268 (2019)
8. Shang, S., et al.: Spatial data science of COVID-19 data. In: IEEE HPCC-SmartCity-DSS 2020, pp. 1370–1375 (2020)
9. Souza, J., Leung, C.K., Cuzzocrea, A.: An innovative big data predictive analytics framework over hybrid big data sources with an application for disease analytics. In: Barolli, L., Amato, F., Moscato, F., Enokido, T., Takizawa, M. (eds.) AINA 2020. AISC, vol. 1151, pp. 669–680. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-44041-1_59
10. Barkwell, K.E., et al.: Big data visualisation and visual analytics for music data mining. In: IV 2018, pp. 235–240 (2018)
11. Takano, A., Hirata, J., Miwa, H.: Method of generating computer graphics animation synchronizing motion and sound of multiple musical instruments. In: Xhafa, F., Barolli, L., Greguš, M. (eds.) INCoS 2018. LNDECT, vol. 23, pp. 124–133. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-98557-2_12
12. Lee, W., et al.: Reducing noises for recall-oriented patent retrieval. In: IEEE BDCloud 2014, pp. 579–586 (2014)
13. Leung, C., Lee, W., Song, J.J.: Information technology-based patent retrieval models. In: Glänzel, W., Moed, H.F., Schmoch, U., Thelwall, M. (eds.) Springer Handbook of Science and Technology Indicators. SH, pp. 859–874. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-02511-3_34
14. Amato, F., Cozzolino, G., Moscato, F., Xhafa, F.: Semantic analysis of social data streams. In: Xhafa, F., Barolli, L., Greguš, M. (eds.) INCoS 2018. LNDECT, vol. 23, pp. 59–70. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-98557-2_6
15. Jiang, F., et al.: Finding popular friends in social networks. In: CGC 2012, pp. 501–508 (2012)
16. Singh, S.P., Leung, C.K.: A theoretical approach for discovery of friends from directed social graphs. In: IEEE/ACM ASONAM 2020, pp. 697–701 (2020)
17. Busse, V., Gregus, M.: Crowdfunding – an innovative corporate finance method and its decision-making steps. In: Barolli, L., Nishino, H., Miwa, H. (eds.) INCoS 2019. AISC, vol. 1035, pp. 544–555. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29035-1_53
18. Chanda, A.K., et al.: A new framework for mining weighted periodic patterns in time series databases. ESWA 79, 207–224 (2017)
19. Morris, K.J., et al.: Token-based adaptive time-series prediction by ensembling linear and non-linear estimators: a machine learning approach for predictive analytics on big stock data. In: IEEE ICMLA 2018, pp. 1486–1491 (2018)
20. Roy, K.K., Moon, M.H.H., Rahman, M.M., Ahmed, C.F., Leung, C.K.: Mining sequential patterns in uncertain databases using hierarchical index structure. In: Karlapalem, K., et al. (eds.) PAKDD 2021. LNCS (LNAI), vol. 12713, pp. 29–41. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-75765-6_3
21. Audu, A.-R.A., Cuzzocrea, A., Leung, C.K., MacLeod, K.A., Ohin, N.I., Pulgar-Vidal, N.C.: An intelligent predictive analytics system for transportation analytics on open data towards the development of a smart city. In: Barolli, L., Hussain, F.K., Ikeda, M. (eds.) CISIS 2019. AISC, vol. 993, pp. 224–236. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-22354-0_21
22. Balbin, P.P.F., et al.: Predictive analytics on open big data for supporting smart transportation services. Procedia Comput. Sci. 176, 3009–3018 (2020)
23. Leung, C.K., et al.: Data mining on open public transit data for transportation analytics during pre-COVID-19 era and COVID-19 era. In: Barolli, L., Li, K.F., Miwa, H. (eds.) INCoS 2020. AISC, vol. 1263, pp. 133–144. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-57796-4_13
24. Cox, T.S., et al.: An accurate model for hurricane trajectory prediction. In: IEEE COMPSAC 2018, vol. 2, pp. 534–539 (2018)
25. Leung, C.K., et al.: Explainable machine learning and mining of influential patterns from sparse web. In: IEEE/WIC/ACM WI-IAT 2020, pp. 829–836 (2020)
26. Singh, S.P., et al.: Analytics of similar-sounding names from the web with phonetic based clustering. In: IEEE/WIC/ACM WI-IAT 2020, pp. 580–585 (2020)
27. Dierckens, K.E., et al.: A data science and engineering solution for fast k-means clustering of big data. In: IEEE TrustCom-BigDataSE-ICESS 2017, pp. 925–932 (2017)
28. Leung, C.K., Jiang, F.: A data science solution for mining interesting patterns from uncertain big data. In: IEEE BDCloud 2014, pp. 235–242 (2014)
29. Alam, M.T., Ahmed, C.F., Samiullah, M., Leung, C.K.: Mining frequent patterns from hypergraph databases. In: Karlapalem, K., et al. (eds.) PAKDD 2021. LNCS (LNAI), vol. 12713, pp. 3–15. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-75765-6_1
30. Fariha, A., Ahmed, C.F., Leung, C.K.-S., Abdullah, S.M., Cao, L.: Mining frequent patterns from human interactions in meetings using directed acyclic graphs. In: Pei, J., Tseng, V.S., Cao, L., Motoda, H., Xu, G. (eds.) PAKDD 2013. LNCS (LNAI), vol. 7818, pp. 38–49. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37453-1_4
31. Leung, C.K.-S.: Uncertain frequent pattern mining. In: Aggarwal, C.C., Han, J. (eds.) Frequent Pattern Mining, pp. 339–367. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-07821-2_14
32. Leung, C.K., et al.: Distributed uncertain data mining for frequent patterns satisfying anti-monotonic constraints. In: IEEE AINA Workshops 2014, pp. 1–6 (2014)
33. Zhang, J., Li, J.: Retail commodity sale forecast model based on data mining. In: INCoS 2016, pp. 307–310 (2016)
34. Jiang, F., Leung, C.K.: A data analytic algorithm for managing, querying, and processing uncertain big data in cloud environments. Algorithms 8(4), 1175–1194 (2015)
35. Lee, W., Leung, C.K., Nasridinov, A. (eds.): BIGDAS 2018. AISC, vol. 899. Springer, Singapore (2021). https://doi.org/10.1007/978-981-15-8731-3
36. Leung, C.K.: Big data analysis and mining. In: Encyclopedia of Information Science and Technology, 4e, pp. 338–348 (2018)
37. Leung, C.K.-S., Jiang, F.: Big data analytics of social networks for the discovery of "following" patterns. In: Madria, S., Hara, T. (eds.) DaWaK 2015. LNCS, vol. 9263, pp. 123–135. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-22729-0_10
38. Vančová, M.H.: Place of analytics within strategic information systems: a conceptual approach. In: Xhafa, F., Barolli, L., Greguš, M. (eds.) INCoS 2018. LNDECT, vol. 23, pp. 479–485. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-98557-2_44
39. Jezowicz, T., et al.: Visualization of large graphs using GPU computing. In: INCoS 2013, pp. 662–667 (2013)
40. Leung, C.K., Carmichael, C.L.: FpVAT: a visual analytic tool for supporting frequent pattern mining. ACM SIGKDD Explor. 11(2), 39–48 (2009)
41. Ahn, S., et al.: A fuzzy logic based machine learning tool for supporting big data business analytics in complex artificial intelligence environments. In: FUZZ-IEEE 2019, pp. 1259–1264 (2019)
42. Ibrishimova, M.D., Li, K.F.: A machine learning approach to fake news detection using knowledge verification and natural language processing. In: Barolli, L., Nishino, H., Miwa, H. (eds.) INCoS 2019. AISC, vol. 1035, pp. 223–234. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29035-1_22
43. Leung, C.K., et al.: Machine learning and OLAP on big COVID-19 data. In: IEEE BigData 2020, pp. 5118–5127 (2020)
44. Monno, S., Kamada, Y., Miwa, H., Ashida, K., Kaneko, T.: Detection of defects on SiC substrate by SEM and classification using deep learning. In: Xhafa, F., Barolli, L., Greguš, M. (eds.) INCoS 2018. LNDECT, vol. 23, pp. 47–58. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-98557-2_5
45. Leung, C.K.: Mathematical model for propagation of influence in a social network. In: Alhajj, R., Rokne, J. (eds.) Encyclopedia of Social Network Analysis and Mining, 2e, pp. 1261–1269. Springer, New York (2018). https://doi.org/10.1007/978-1-4939-7131-2_110201
46. Nakamura, T., Shibata, M., Tsuru, M.: On retrieval order of statistics information from OpenFlow switches to locate lossy links by network tomographic refinement. In: Barolli, L., Nishino, H., Miwa, H. (eds.) INCoS 2019. AISC, vol. 1035, pp. 342–351. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29035-1_33
47. Arshadi, A.K., et al.: Artificial intelligence for COVID-19 drug discovery and vaccine development. Frontiers Artif. Intell. 3, 65:1-65:13 (2020)
48. Berber, B., Doluca, O.: A comprehensive drug repurposing study for COVID-19 treatment: novel putative dihydroorotate dehydrogenase inhibitors show association to serotonin-dopamine receptors. Briefings Bioinform. 22(2), 1023–1037 (2021)
49. Caruso, F.P., et al.: A review of COVID-19 biomarkers and drug targets: resources and tools. Briefings Bioinform. 22(2), 701–713 (2021)
50. Dagliati, A., et al.: Health informatics and EHR to support clinical research in the COVID-19 pandemic: an overview. Briefings Bioinform. 22(2), 812–822 (2021)
51. Dotolo, S., et al.: A review on drug repurposing applicable to COVID-19. Briefings Bioinform. 22(2), 726–741 (2021)
52. Chen, Y.: A data science solution for supporting social and economic analysis. In: IEEE COMPSAC 2021, pp. 1690–1695 (2021). https://doi.org/10.1109/COMPSAC51774.2021.00252
53. Kuo, W., He, J.: Guest editorial: crisis management - from nuclear accidents to outbreaks of COVID-19 and infectious diseases. IEEE Trans. Reliab. 69(3), 846–850 (2020)
54. Oksanen, A., et al.: COVID-19 crisis and digital stressors at work: a longitudinal study on the Finnish working population. Comput. Hum. Behav. 122, 106853:1-106853:10 (2021)
55. Jentner, W., Keim, D.: Visualization and visual analytic techniques for patterns. In: Fournier-Viger, P., Lin, J.C.-W., Nkambou, R., Vo, B., Tseng, V.S. (eds.) High-Utility Pattern Mining. SBD, vol. 51, pp. 303–337. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-04921-8_12
56. Leung, C.K.-S., Irani, P.P., Carmichael, C.L.: FIsViz: a frequent itemset visualizer. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds.) PAKDD 2008. LNCS (LNAI), vol. 5012, pp. 644–652. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-68125-0_60
57. Leung, C.K., et al.: PyramidViz: visual analytics and big data visualization of frequent patterns. In: IEEE DASC-PICom-DataCom-CyberSciTech 2016, pp. 913–916 (2016)
58. Leung, C.K., et al.: FpMapViz: a space-filling visualization for frequent patterns. In: IEEE ICDM 2011 Workshops, pp. 804–811 (2011)
59. Statistics Canada: Table 17-10-0005-01 population estimates on July 1st, by age and sex (2020). https://doi.org/10.25318/1710000501-eng
An Approach to Enhance Academic Ranking Prediction with Augmented Social Perception Data

Kittayaporn Chantaranimi, Prompong Sugunsil, and Juggapong Natwichai

Data Science Consortium, Faculty of Engineering, Chiang Mai University, Chiang Mai, Thailand
College of Art, Media, and Technology, Chiang Mai University, Chiang Mai, Thailand
Department of Computer Engineering, Faculty of Engineering, Chiang Mai University, Chiang Mai, Thailand
kittayaporn [email protected]; [email protected]; [email protected]
Abstract. Academic rankings are indicators that have a significant influence on the decision-making processes of university stakeholders. In addition, in a digital age with a pandemic situation, social media and technology have revolutionized the way scholars reach and disseminate academic outputs. Thus, ranking considerations should be augmented with social perception data, e.g., altmetrics. In this study, a dataset of 1,752,494 research outputs published between 2015 and 2020, obtained from Altmetric.com and Scival.com, is analyzed. This study assesses whether there are relationships between various kinds of social perception data about scholarly outputs and citations. Moreover, various machine learning models are constructed to predict the citations. Results show weak to moderate positive correlations between social perception data and citations. We found that the best-performing prediction model is Random Forest regression. The findings of our study suggest that social perception data should be considered to enhance academic ranking prediction in conjunction with related features.

Keywords: Social perception data · Machine learning · Citation prediction · Altmetrics · Correlation · SciVal

1 Introduction
The evolution of digital innovation and technologies is causing rapid and inevitable changes in the academic system, such as learning patterns and publication policies, at both the national and international levels. Therefore, the
learning system, the conduct of research, and the dissemination of academic services to the community are educational challenges in creating a world-class university and achieving a preferable academic ranking. There are various research assessment methodologies, for example, peer review, bibliometrics (the quantitative analysis of publications), and other important indicators such as the citation index and the journal impact factor. However, most current institution ranking methods use bibliometric data, e.g., citation counts, as core indicators, together with the results of employer and academic surveys. In addition, in the digital age, where social media and technology have revolutionized the way researchers communicate on online networks, the opportunity to receive votes from scholars or to produce the most-cited papers in each field of study may be influenced by the dissemination of outputs and the reputation of the institution mentioned on social media platforms, especially in a pandemic situation. Thus, the term 'altmetrics', coined in 2010 [9], concerns the use of social media data to measure scholarly impact that is not reflected by traditional bibliometrics; it measures research's impact on society, the economy, and public engagement ecosystems. Besides, the measurement methods and indicators mentioned above have different limitations. Peer review can be costly and time-consuming and relies heavily on human judgment, while bibliometrics is measured over a period and is only one of the dimensions of academic quality [2]. The use of altmetrics requires consideration of the reliability of the data source. Additionally, each measurement provides just one part of the picture. Aksnes et al. (2019) also suggested that "Quality needs to be seen as a multidimensional concept that cannot be captured by any one indicator, and which dimension of quality should be prioritised may vary by field and mission". Ideally, we think these measures should be used in conjunction with and supplemented by each other. Therefore, the use of the results of research assessments must take the limitations of the methods into account. When assessing research in different disciplines, the weights of the indicators should be considered, as each indicator affects the assessment result differently. The Sustainable Development Goals (SDGs) presented by the United Nations are 17 goals to achieve a better and more sustainable future for all. They cover global challenges, including poverty, inequality, climate change, environmental degradation, peace and justice [12]. In our view, the general public should realize the importance of academic contributions to sustainable development, and academic outputs related to the SDGs should be one of the measures of institution ranking. Table 1 presents these goals. The purpose of this study is to investigate an initial approach to enhance academic ranking prediction with augmented social perception data, asking whether such data could substitute for traditional metrics or be part of the approach, with the following key research questions/objectives:

1. Are there relationships between various kinds of social perception data about scholarly outputs and the number of citations?
2. The use of bibliometric data and social perception data to construct a citation predictive model.
Table 1. Sustainable development goals (SDGs)

SDG goal | Title
1  | No poverty
2  | Zero hunger
3  | Good health and well-being
4  | Quality education
5  | Gender equality
6  | Clean water and sanitation
7  | Affordable and clean energy
8  | Decent work and economic growth
9  | Industry, innovation and infrastructure
10 | Reduced inequalities
11 | Sustainable cities and communities
12 | Responsible consumption and production
13 | Climate action
14 | Life below water
15 | Life on land
16 | Peace, justice and strong institutions
17 | Partnerships for the goals

2 Related Work
There are many studies focused on the relationships between social perception data and citations. Regarding Altmetric data [9], several studies used the Spearman correlation test to explore the correlation between altmetric data and citations. Huang, W. [6] tested articles published in six Public Library of Science (PLOS) journals, using Spearman correlation to find the correlation between Altmetric Attention Scores (AAS) and citations. He found a possibility that AAS may be positively associated with citations, but with differences across disciplines. Barbic et al. [5] performed descriptive statistics and the Spearman test on the 50 most frequently cited articles published in emergency medicine (EM) journals, and the result suggested a weakly positive correlation. In the study by Lehane, D.J. and Black, C.S. [7], the Spearman test was used on the following two groups of publications. First, a Scientific Impact Group of critical care medicine publications was formed from the 100 most cited articles, with a moderately positive correlation. Second, a Media Impact Group was formed from the top 100 articles by Altmetric Attention Score, with a weakly positive correlation. In another study, Banshal S.K. [4] used the Spearman correlation test to find the correlation between altmetric mentions and later citation counts on a combined dataset from Web of Science, Altmetric.com and ResearchGate. He found a weakly positive correlation, which was relatively higher for ResearchGate data compared to social media
data. Meschede, C. [8] analyzed research articles related to the SDGs in many aspects (e.g., collaborations) using descriptive statistics, correlation tests and visualization techniques such as bar charts and network graphs. Akella, A.P. et al. [1] intensively studied various machine learning models, covering both classification and regression, to predict citations from Altmetric data. Random Forest, Decision Tree, Multiple Linear Regression and Neural Network were the selected regression models. They used R-squared, mean squared error (MSE) and mean absolute error (MAE) to compare the performance of the regression models. Bai, X. et al. [3] proposed the Paper Potential Index (PPI) and compared it to models proposed earlier by Wang et al. (2013) [13]. The comparison was made in terms of mean absolute error (MAE), root mean squared error (RMSE), range-normalized RMSE (NRMSE), mean absolute percentage error (MAPE), and accuracy. The results showed that PPI improves the citation prediction of scholarly papers. Thelwall, M. and Nevill, T. [11] found from fitted regression models that only Mendeley reader counts are consistent predictors of future citation impact.
3 Data and Methodology

3.1 Data Collection
The academic outputs used to find the correlations and build the predictive models in this study were obtained from two sources, Altmetric.com and Scival.com. The datasets were joined by Digital Object Identifier (DOI), while outputs lacking a DOI were discarded. In addition, although Scival.com provides more than 10 million records, records without a corresponding Altmetric record were excluded. The time span covers outputs published between 2015 and 2020 by institutions that have been ranked in the QS Subject Rankings. These data were collected in May 2021.
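A minimal sketch of this join, assuming both exports are CSV files with a `doi` column (the file and column names are our assumptions):

```python
import pandas as pd

# Outputs lacking a DOI are discarded before the join.
altmetric = pd.read_csv("altmetric_outputs.csv").dropna(subset=["doi"])
scival = pd.read_csv("scival_outputs.csv").dropna(subset=["doi"])

# An inner join on DOI also drops SciVal records that have no
# corresponding Altmetric record, as described above.
merged = scival.merge(altmetric, on="doi", how="inner")
```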
3.2 Data Preparation
To preprocess this dataset, we adopted several procedures: (1) rows with missing values in several columns were eliminated; (2) categorical features were transformed using the one-hot encoding technique; these include open-access status, publication type, publication source type, and the Sustainable Development Goals (SDGs) tag; (3) the All Science Journal Classification (ASJC) field name feature was mapped to the 5 QS subject areas [10]: Arts & Humanities, Engineering & Technology, Life Sciences & Medicine, Natural Sciences, and Social Sciences & Management; (4) a new feature, namely the age of the output, was created from existing ones; (5) before developing the citation predictive model, min-max normalization was used to normalize every feature; (6) for model validation, the dataset was divided into two subsets: a training set and a testing set. After preprocessing, we end up with 1,752,494 research outputs. However, there may be duplicated outputs because each specific output can be categorized into one or more of the five QS subject areas.
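A condensed sketch of steps (1)–(6) with pandas and scikit-learn follows; the column names (`year`, `citations`, and the categorical list) are illustrative assumptions, and we choose to fit the min-max scaler on the training split only, to avoid leakage (a detail the paper does not specify).

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

COLLECTION_YEAR = 2021  # data were collected in May 2021

def preprocess(df, categorical, target="citations"):
    df = df.dropna(subset=categorical)                  # (1) drop rows with missing values
    df = pd.get_dummies(df, columns=categorical)        # (2) one-hot encode categorical features
    df["age_of_output"] = COLLECTION_YEAR - df["year"]  # (4) derived feature
    X, y = df.drop(columns=[target]), df[target]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)  # (6)
    scaler = MinMaxScaler().fit(X_tr)                   # (5) min-max normalization
    return scaler.transform(X_tr), scaler.transform(X_te), y_tr, y_te
```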
A total of 25 features are listed and described in Table 2. The citation counts of these academic outputs were used as the target variable for constructing the predictive model through the machine learning approach. The other features considered in this study were identified to reflect:

1. The dissemination of output: 15 alternative metrics (altmetrics) were used. These variables include the number of Mendeley readers, as well as counts of the times an output has been mentioned on online or social media platforms (e.g., news, Facebook, Twitter). They represent how often outputs are discussed, used, or shared around the world.
2. Traditional academic impact: the bibliometric metrics consist of 8 variables, namely views count, SNIP, Field-Weighted View Impact, and other quantitative measures of scholarly impact.
3. Academic output characteristics: these identify the characteristics of each output. They are the number of institutions, period of output, open access status, publication type and source type.
4. Sustainable Development Goals impact: this represents whether the output is linked to any of the United Nations Sustainable Development Goals (SDGs). For the dataset used in this study, only 16 goals were identified; SDG 17, "Partnerships for the goals", is not addressed.
4 Results

4.1 Exploratory Data Analysis
The descriptive statistics of citation counts for outputs in each subject area are shown in Table 3. Of the 1,752,494 research outputs, most publications are mapped to the Life Sciences & Medicine area. The overall mean is 23.366. The highest mean citation count is in the Engineering & Technology area, with a value of 33.612, whose standard deviation is also the greatest among the areas. The smallest mean is 13.588, for outputs in the Arts & Humanities area. We explored the proportion of outputs referring to SDG goals in each area (Table 4). The most referred-to goal is SDG 3, Good health and well-being, since it is referred to in every subject area. In addition, SDG 11, Sustainable cities and communities, is the most mentioned goal in the Arts & Humanities area, while most outputs published in the Engineering & Technology area focus on SDG 7, Affordable and clean energy. In contrast, SDG 16, Peace, justice and strong institutions, is the most mentioned goal in the Social Sciences & Management area.
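Descriptive statistics of this kind can be reproduced with a one-line pandas aggregation; the tiny data frame below is a stand-in for the real dataset, with one row per (output, subject area) pair.

```python
import pandas as pd

df = pd.DataFrame({
    "subject_area": ["Arts & Humanities", "Arts & Humanities",
                     "Natural Sciences", "Natural Sciences"],
    "citations": [6, 15, 12, 18],
})

# count, mean, std, min, quartiles, and max per subject area, as in Table 3
stats = df.groupby("subject_area")["citations"].describe()
print(stats[["count", "mean", "std", "min", "25%", "50%", "75%", "max"]])
```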
Table 2. List of features

Feature: Feature description
Citations: Citation counts represent the total number of citations received since an item was published, up to the date of the last data cut.
Views count: Views counts represent the total number of abstract views and clicks on the link to view the full text at the publisher's website.
SNIP (publication year): The source normalized impact per publication.
Field-weighted view impact: The ratio of citations received relative to the expected world average.
Number of institutions: Number of unique home institutions for all authors.
Period of output: The difference between the current year and the published year.
Open access status: Indicates whether the output is published in an open access source.
Publication type: Indicates the output type.
Source type: Indicates the output's source type.
SDGs tag: Indicates whether the output is linked to any of the United Nations Sustainable Development Goals (SDGs).
News, blog, policy, patent, Twitter, peer review, Weibo, Facebook, Wikipedia, Google+, Reddit, F1000, Q & A, and video mentions: Number of times an output has been mentioned in the news, blogs, policy, on Twitter or Facebook, and so on.
Number of Mendeley readers: The total number of Mendeley readers.
Table 3. Descriptive statistics of Citations across research areas

Statistic        | A&H    | E&T     | LS&M      | NS      | SS&M    | Total
Number of output | 28,378 | 116,769 | 1,188,254 | 293,012 | 126,081 | 1,752,494
Mean             | 13.588 | 33.612  | 22.688    | 26.965  | 14.096  | 23.366
Std. dev         | 29.894 | 79.625  | 63.245    | 60.908  | 30.143  | 62.063
Min              | 0      | 0       | 0         | 0       | 0       | 0
25%              | 2      | 5       | 4         | 5       | 2       | 4
50%              | 6      | 14      | 10        | 12      | 6       | 10
75%              | 15     | 33      | 23        | 18      | 15      | 24
Max              | 1,710  | 3,677   | 15,188    | 3,677   | 1,583   | 15,188

(A&H = Arts & Humanities; E&T = Engineering & Technology; LS&M = Life Sciences & Medicine; NS = Natural Sciences; SS&M = Social Sciences & Management.)
Table 4. Top 3 referred SDGs in each subject area

Rank | A&H    | E&T   | LS&M  | NS     | SS&M
1    | SDG 11 | SDG 7 | SDG 3 | SDG 3  | SDG 16
2    | SDG 3  | SDG 3 | SDG 2 | SDG 13 | SDG 8
3    | SDG 16 | SDG 9 | SDG 5 | SDG 17 | SDG 3

4.2 Correlation Analysis
4.2.1 Spearman’s Correlation Spearman’s correlation is the appropriate method that was performed to measure of the strength and direction of association that exists between two variables measured since our variables are not normally distributed. Figure 1 shows the Spearman’s correlation coefficient values between citations count and other continuous features. The correlation of citations and number of Mendeley reader is strong positive correlation, with a score of 0.72. The moderate positive correlations are between citations and Views (0.52) and between citations and period of output (0.49). However, it is demonstrated that between other features show a weak positive correlation with citations. 4.2.2 A Point-Biserial Correlation A point-biserial correlation is used to determine the correlation between citations count and dichotomous variables (Open access status, Publication type, Source type, SDGs Tag). Though almost the p-value tested are less than the significance level (α = 0.05) which means that there is a correlation between each pair of variables, the coefficients suggested very weak correlations.
Fig. 1. Spearman’s Correlation matrix between citation count and countinuous features.
4.3 Predictive Model
We compare models constructed from all of the features mentioned above with models constructed only from altmetric features. Thus, various regression models have been constructed to predict the number of citations in the five different subject areas, including multiple linear regression, Ridge regression, Decision Tree, Random Forest, and XGBoost. We also implemented a multilayer neural network. These supervised learning algorithms were trained and tested using the scikit-learn Python package. The three metrics used for model evaluation in this study are mean squared error (MSE), mean absolute error (MAE) and R-squared. R-squared measures the proportion of the total variance in the number of citations explained by the model, whereas MSE and MAE measure the predictive accuracy of the model [1]. Table 5 illustrates the comparison of the evaluation results of the machine learning models for citation prediction based on all variables mentioned. Random Forest outperforms the other models when fitted with the datasets of the Arts & Humanities, Life Sciences & Medicine, and Natural Sciences areas, while XGBoost is the best-performing model when fitted with the datasets of the Engineering & Technology and Social Sciences & Management areas. Among all models, the best is Random Forest fitted with the Life Sciences & Medicine dataset. We also considered constructing predictive models on the dataset with only altmetric variables. Our experiments have two settings: the first also includes the
Table 5. Comparison of evaluation results of machine learning models for citation prediction (all variables).

Subject area | Model | MAE | RMSE | R-squared
Arts & humanities | Linear regression | 0.0045 | 0.0095 | 0.6809
Arts & humanities | Ridge regression | 0.0044 | 0.0095 | 0.6832
Arts & humanities | Decision tree regression | 0.0035 | 0.0099 | 0.6549
Arts & humanities | Random forest regression | 0.0029 | 0.0075 | 0.8016
Arts & humanities | Multilayer perceptron | 0.0062 | 0.0108 | 0.5912
Arts & humanities | XGBoost | 0.0028 | 0.0076 | 0.7993
Engineering & technology | Linear regression | 0.0056 | 0.0138 | 0.5950
Engineering & technology | Ridge regression | 0.0056 | 0.0138 | 0.5953
Engineering & technology | Decision tree regression | 0.0021 | 0.0084 | 0.8513
Engineering & technology | Random forest regression | 0.0022 | 0.0070 | 0.8951
Engineering & technology | Multilayer perceptron | 0.0077 | 0.0128 | 0.6503
Engineering & technology | XGBoost | 0.0025 | 0.0066 | 0.9088
Life sciences & medicine | Linear regression | 0.0009 | 0.0024 | 0.6721
Life sciences & medicine | Ridge regression | 0.0009 | 0.0024 | 0.6670
Life sciences & medicine | Decision tree regression | 0.0002 | 0.0013 | 0.9026
Life sciences & medicine | Random forest regression | 0.0003 | 0.0010 | 0.9419
Life sciences & medicine | Multilayer perceptron | 0.0010 | 0.0023 | 0.6928
Life sciences & medicine | XGBoost | 0.0006 | 0.0012 | 0.9221
Natural sciences | Linear regression | 0.0043 | 0.0095 | 0.6405
Natural sciences | Ridge regression | 0.0042 | 0.0095 | 0.6398
Natural sciences | Decision tree regression | 0.0014 | 0.0061 | 0.8518
Natural sciences | Random forest regression | 0.0015 | 0.0046 | 0.9130
Natural sciences | Multilayer perceptron | 0.0044 | 0.0084 | 0.7159
Natural sciences | XGBoost | 0.0022 | 0.0047 | 0.9111
Social sciences & management | Linear regression | 0.0047 | 0.0099 | 0.7132
Social sciences & management | Ridge regression | 0.0046 | 0.0100 | 0.7071
Social sciences & management | Decision tree regression | 0.0023 | 0.0081 | 0.8110
Social sciences & management | Random forest regression | 0.0022 | 0.0059 | 0.8982
Social sciences & management | Multilayer perceptron | 0.0077 | 0.0114 | 0.6219
Social sciences & management | XGBoost | 0.0027 | 0.0058 | 0.9004
number of Mendeley readers; in the second, the number of Mendeley readers was excluded. We found that the results change only slightly when the models with Mendeley readers are compared to the previous models with all variables included. However, when the number of Mendeley readers is excluded, the predictive performance of the models decreases sharply. Table 6 compares the evaluation results of the machine learning models for citation prediction on the dataset with altmetric variables excluding Mendeley readers. Random Forest still outperformed the other models on the Life Sciences and Medicine and Natural Sciences datasets, while XGBoost is still the
Table 6. Comparison of evaluation results of machine learning models for citation prediction (altmetric variables without Mendeley readers).

Subject area                  Model                     MAE     RMSE    R-squared
Arts & humanities             Linear regression         0.0073  0.0156  0.1487
                              Ridge regression          0.0073  0.0156  0.1459
                              Decision tree regression  0.0070  0.0165  0.0466
                              Random forest regression  0.0069  0.0157  0.0734
                              Multilayer perceptron     0.0070  0.0003  0.0734
                              XGBoost                   0.0069  0.0158  0.1214
Engineering & technology      Linear regression         0.0086  0.0206  0.3857
                              Ridge regression          0.0086  0.0206  0.1002
                              Decision tree regression  0.0072  0.0170  0.3857
                              Random forest regression  0.0074  0.0169  0.3963
                              Multilayer perceptron     0.0089  0.0201  0.1448
                              XGBoost                   0.0074  0.0167  0.4101
Life sciences & medicine      Linear regression         0.0013  0.0037  0.2217
                              Ridge regression          0.0013  0.0037  0.2224
                              Decision tree regression  0.0010  0.0034  0.3287
                              Random forest regression  0.0010  0.0025  0.6467
                              Multilayer perceptron     0.0017  0.0037  0.2367
                              XGBoost                   0.0011  0.0026  0.6228
Natural sciences              Linear regression         0.0064  0.0145  0.1518
                              Ridge regression          0.0064  0.0145  0.1504
                              Decision tree regression  0.0052  0.0122  0.4052
                              Random forest regression  0.0053  0.0119  0.4302
                              Multilayer perceptron     0.0067  0.0144  0.1641
                              XGBoost                   0.0056  0.0119  0.4297
Social sciences & management  Linear regression         0.0077  0.0167  0.1841
                              Ridge regression          0.0076  0.0167  0.1870
                              Decision tree regression  0.0066  0.0165  0.2077
                              Random forest regression  0.0067  0.0148  0.3642
                              Multilayer perceptron     0.0121  0.0182  0.0349
                              XGBoost                   0.0068  0.0149  0.3495
outperforming model on the Engineering and Technology dataset. Among all models, Random Forest fitted on the Life Sciences and Medicine dataset performed better than the others.
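The training and evaluation loop behind Tables 5 and 6 can be expressed compactly with scikit-learn. The sketch below is illustrative only, assuming a feature matrix X and citation counts y for one subject area; the synthetic data and default hyperparameters are ours, not the authors' settings.

```python
# Minimal sketch of the model comparison; synthetic data stands in for the
# bibliometric/altmetric feature matrix of one subject area.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=5000, n_features=10, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Linear regression": LinearRegression(),
    "Ridge regression": Ridge(),
    "Decision tree regression": DecisionTreeRegressor(random_state=0),
    "Random forest regression": RandomForestRegressor(random_state=0),
    # xgboost.XGBRegressor and sklearn.neural_network.MLPRegressor can be
    # added to this dictionary in the same way.
}

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(f"{name}: MAE={mean_absolute_error(y_test, pred):.4f}, "
          f"MSE={mean_squared_error(y_test, pred):.4f}, "
          f"R2={r2_score(y_test, pred):.4f}")
```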
5 Discussion
Although we obtained a large dataset from scival.com, some data had to be omitted from this study due to the limited availability of data from altmetric.com. Outputs from the Life Sciences and Medicine area form by far the largest group, with more
than 1 million records. Academic outputs published in the other areas account for only a small fraction, especially in Arts and Humanities. This reflects differences in the number of outputs and in dissemination patterns across disciplines. The results reveal that SDG 3, Good health and well-being, is the SDG most often mentioned in academic outputs. This is in accordance with Meschede [8], who states that the most predominant SDG among the analyzed research articles is SDG 3. This study used Spearman's correlation and point-biserial correlation to identify the relationships between a scholarly output's data and its citations. As in many previous studies that focused on identifying relationships with citations, most of the results showed weak to moderate positive correlations [4–7]. Although altmetrics data alone could be used to predict citations, the comparison of the machine learning evaluation results shows that integrating bibliometric data and SDGs data with altmetrics data yields better citation prediction, in terms of higher R-squared values and lower MAE and MSE, in almost every compared model. Thus, social perception data should be used in conjunction with traditional approaches to enhance academic ranking prediction in future studies; neither should completely substitute for the other.
6 Conclusion
To answer the first research question, on whether there are relationships between various kinds of scholarly output's social perception data and citation counts, the Spearman and point-biserial correlation tests found that an output's social perception data correlate with citations, with weak to moderate positive correlations. For the predictive models, we constructed machine learning models using bibliometric features and social perception features to predict the citation counts of academic outputs, and found that Random Forest regression outperformed the other models.
References
1. Akella, A.P., Alhoori, H., Kondamudi, P.R., Freeman, C., Zhou, H.: Early indicators of scientific impact: predicting citations with altmetrics. J. Informetrics 15(2), 101128 (2021). https://doi.org/10.1016/j.joi.2020.101128
2. Aksnes, D.W., Langfeldt, L., Wouters, P.: Citations, citation indicators, and research quality: an overview of basic concepts and theories. SAGE Open 9(1), 2158244019829575 (2019). https://doi.org/10.1177/2158244019829575
3. Bai, X., Zhang, F., Lee, I.: Predicting the citations of scholarly paper. J. Informetrics 13(1), 407–418 (2019). https://doi.org/10.1016/j.joi.2019.01.010
4. Banshal, S.K., Singh, V.K., Muhuri, P.K.: Can altmetric mentions predict later citations? A test of validity on data from ResearchGate and three social media platforms. Online Inf. Rev. 45(3), 517–536 (2020). https://doi.org/10.1108/OIR-11-2019-0364
5. Barbic, D., Tubman, M., Lam, H., Barbic, S.: An analysis of altmetrics in emergency medicine. Acad. Emerg. Med. 23(3), 251–265 (2016). https://doi.org/10.1111/acem.12898
6. Huang, W., Wang, P., Wu, Q.: A correlation comparison between Altmetric Attention Scores and citations for six PLOS journals. PLoS One 13(4), 1–15 (2018). https://doi.org/10.1371/journal.pone.0194962
7. Lehane, D.J., Black, C.S.: Can altmetrics predict future citation counts in critical care medicine publications? J. Intensive Care Soc. 22(1), 60–66 (2021). https://doi.org/10.1177/1751143720903240
8. Meschede, C.: The sustainable development goals in scientific literature: a bibliometric overview at the meta-level. Sustainability 12(11), 4461 (2020). https://doi.org/10.3390/su12114461
9. Priem, J., Taraborelli, D., Groth, P., Neylon, C.: Altmetrics: a manifesto (2010). http://altmetrics.org/manifesto/. Accessed 18 Feb 2021
10. QS Quacquarelli Symonds Limited: QS world university rankings by subject. https://www.topuniversities.com/subject-rankings/2021. Accessed 18 Feb 2021
11. Thelwall, M., Nevill, T.: Could scientists use Altmetric.com scores to predict longer term citation counts? J. Informetrics 12(1), 237–248 (2018). https://doi.org/10.1016/j.joi.2018.01.008
12. United Nations: Take action for the sustainable development goals. https://www.un.org/sustainabledevelopment/sustainable-development-goals/. Accessed 18 Feb 2021
13. Wang, D., Song, C., Barabási, A.L.: Quantifying long-term scientific impact. Science 342(6154), 127–132 (2013). https://doi.org/10.1126/science.1237825, https://science.sciencemag.org/content/342/6154/127
A Fuzzy-Based System for User Service Level Agreement in 5G Wireless Networks

Phudit Ampririt1(B), Ermioni Qafzezi1, Kevin Bylykbashi1, Makoto Ikeda2, Keita Matsuo2, and Leonard Barolli2

1 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
2 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected], {kt-matsuo,barolli}@fit.ac.jp
Abstract. The Fifth Generation (5G) wireless network is expected to be flexible enough to satisfy user requirements, and Software-Defined Networking (SDN) with Network Slicing is a good approach for admission control. In 5G wireless networks, resources are limited and the number of devices is increasing far beyond what the system can support, so overloading will be a very critical problem. In this paper, we propose a fuzzy-based system for user Service Level Agreement considering three parameters: Reliability (Re), Availability (Av) and Latency (La). We carried out simulations to evaluate the performance of our proposed system. From the simulation results, we conclude that the considered parameters have different effects on the SLA: when Re and Av increase, the SLA parameter increases, but when La increases, the SLA parameter decreases.
1 Introduction
Recently, wireless technologies and users' demand for services have been growing rapidly. Especially in 5G wireless networks, there will be billions of new devices with unpredictable traffic patterns requiring high data rates. With the appearance of the Internet of Things (IoT), these devices will send Big Data over the Internet, which will cause congestion and deteriorate QoS [1]. The 5G wireless network will provide users with new experiences such as Ultra High Definition Television (UHDT) over the Internet and will support many IoT devices with long battery life and high data rates in hotspot areas with high user density. In 5G technology, routing and switching technologies are no longer as important, and the coverage area is smaller than in 4G because high frequencies are used to handle the higher volume of devices in areas of high user density [2–4]. There are many research works that try to build systems suitable for the 5G era. SDN is one of them [5]. For example, a mobile handover mechanism with SDN is used to reduce the delay in handover processing and improve QoS.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): INCoS 2021, LNNS 312, pp. 96–106, 2022. https://doi.org/10.1007/978-3-030-84910-8_10
Also, by using SDN, QoS can be improved by applying Fuzzy Logic (FL) on the SDN controller [6–8]. In our previous work [9,10], we proposed a fuzzy-based scheme for the evaluation of QoS in 5G wireless networks considering Slice Throughput (ST), Slice Delay (SD), Slice Loss (SL) and Slice Reliability (SR). In this paper, we propose a fuzzy-based system for user Service Level Agreement (SLA) in 5G wireless networks considering three parameters: Reliability (Re), Availability (Av) and Latency (La). The rest of the paper is organized as follows. In Sect. 2, we present an overview of SDN. In Sect. 3, we present the application of Fuzzy Logic for admission control. In Sect. 4, we describe the proposed fuzzy-based system and its implementation. In Sect. 5, we discuss the simulation results. Finally, conclusions and future work are presented in Sect. 6.
2 Software-Defined Networks (SDNs)
SDN is a new networking paradigm that decouples the data plane from the control plane in the network. In traditional networks, the whole network is controlled by each network device. However, traditional networks are hard to manage and control because they rely on the physical infrastructure: network devices must stay connected at all times when a user wants to connect to other networks, and those processes must be based on the settings of each device, making it difficult to control the operation of the network. Therefore, devices have to be set up one by one. In contrast, SDN is easy to manage and provides network-software-based services from a centralised control plane. The SDN control plane is managed by an SDN controller or a cooperating group of SDN controllers. The SDN structure is shown in Fig. 1 [11,12].
• The Application Layer builds an abstracted view of the network by collecting information from the controller for decision-making purposes. The types of applications are related to network configuration and management, network monitoring, network troubleshooting, and network policies and security.
• The Control Layer receives instructions or requirements from the Application Layer and controls the Infrastructure Layer by using intelligent logic.
• The Infrastructure Layer receives orders from the SDN controller and sends data between devices.
SDN can manage network systems while enabling new services. In traffic congestion situations, the management system can be flexible, allowing users to easily control and adapt resources appropriately throughout the control plane. Mobility management is easier and quicker when forwarding across different wireless technologies (e.g. 5G, 4G, WiFi and WiMAX). Also, the handover procedure is simple and the delay can be decreased.
Fig. 1. Structure of SDN.
3 Outline of Fuzzy Logic
An FL system is a nonlinear mapping of an input data vector into a scalar output, able to handle numerical data and linguistic knowledge simultaneously. FL can deal with statements which may be true, false or of intermediate truth-value; such statements are impossible to quantify using traditional mathematics. FL systems are used in many control applications such as aircraft control (Rockwell Corp.), Sendai subway operation (Hitachi), and TV picture adjustment (Sony) [13–15]. Figure 2 shows the Fuzzy Logic Controller (FLC) structure, which contains four components: fuzzifier, inference engine, fuzzy rule base and defuzzifier.
• The Fuzzifier is needed for combining the crisp values with rules which are linguistic variables and have fuzzy sets associated with them.
• The Rules may be provided by an expert or extracted from numerical data. In engineering, the rules are expressed as a collection of IF-THEN statements.
• The Inference Engine infers the fuzzy output by considering the fuzzified input values and the fuzzy rules.
• The Defuzzifier maps the output set into crisp numbers.
Fig. 2. FLC structure.
3.1 Linguistic Variables
A concept that plays a central role in the application of FL is that of a linguistic variable. Linguistic variables may be viewed as a form of data compression: one linguistic variable may represent many numerical variables. It is suggestive to refer to this form of data compression as granulation. The same effect can be achieved by conventional quantization, but in the case of quantization the values are intervals, whereas in the case of granulation the values are overlapping fuzzy sets. The advantages of granulation over quantization are as follows:
• it is more general;
• it mimics the way in which humans interpret linguistic values;
• the transition from one linguistic value to a contiguous linguistic value is gradual rather than abrupt, resulting in continuity and robustness.
For example, let Temperature (T) be interpreted as a linguistic variable. It can be decomposed into a set of terms: T (Temperature) = {Freezing, Cold, Warm, Hot, Blazing}. Each term is characterised by a fuzzy set which can be interpreted, for instance, as "Freezing" being a temperature below 0 °C and "Cold" a temperature close to 10 °C.

3.2 Fuzzy Control Rules
Rules are usually written in the form "IF x is S THEN y is T", where x and y are linguistic variables expressed by S and T, which are fuzzy sets. Here x is a control (input) variable and y is the solution (output) variable. Such a rule is called a fuzzy control rule. The form "IF ... THEN" is called a conditional sentence; "IF" is called the antecedent and "THEN" the consequent.

3.3 Defuzzification Method
There are many defuzzification methods, including the following:
• The Centroid Method;
• Tsukamoto's Defuzzification Method;
• The Center of Area (COA) Method;
• The Mean of Maximum (MOM) Method;
• Defuzzification when the Output of Rules are Functions of Their Inputs.

4 Proposed Fuzzy-Based System
In this work, we use FL to implement the proposed system. In Fig. 3, we show an overview of our proposed system. Each evolved Base Station (eBS) receives control orders from the SDN controller and can communicate and exchange data with User Equipment (UE). The SDN controller collects all the data about the network traffic status and controls the eBSs by using the proposed fuzzy-based system; it acts as a communication bridge between the eBSs and the 5G core network. The proposed system is called Integrated Fuzzy-based Admission Control System (IFACS) in 5G wireless networks. The structure of IFACS is shown in Fig. 4. For the implementation of our system, we consider four input parameters: Quality of Service (QoS), Slice Priority (SP), Slice Overloading Cost (SOC) and Service Level Agreement (SLA); the output parameter is Admission Decision (AD).
Fig. 3. Proposed system overview.
In this paper, we apply FL to evaluate the SLA. For the SLA evaluation, we consider three parameters: Re, Av and La. The output parameter is the SLA.
Fig. 4. Proposed system structure.
Reliability (Re): The Re value is the required user reliability. When the Re value is high, the SLA is high.
Availability (Av): The Av value is the required user availability. When the Av value is high, the SLA is high.
Latency (La): The La value is the required user latency. When the La value is high, the SLA is low.
Service Level Agreement (SLA): The SLA is described based on these three parameters. The user request is characterized by its required SLA.

Table 1. Parameters and their term sets.

Parameters                     Term set
Reliability (Re)               Low (Lo), Medium (Me), High (Hi)
Availability (Av)              Low (Lw), Medium (Md), High (Hg)
Latency (La)                   Low (L), Medium (M), High (H)
Service Level Agreement (SLA)  SLA1, SLA2, SLA3, SLA4, SLA5, SLA6, SLA7
The membership functions are shown in Fig. 5. We use triangular and trapezoidal membership functions because they are more suitable for real-time operations [16–19]. We show parameters and their term sets in Table 1. The Fuzzy Rule Base (FRB) is shown in Table 2 and has 27 rules. The control rules have
Table 2. FRB.

Rule  Re  Av  La  SLA
1     Lo  Lw  L   SLA3
2     Lo  Lw  M   SLA2
3     Lo  Lw  H   SLA1
4     Lo  Md  L   SLA4
5     Lo  Md  M   SLA3
6     Lo  Md  H   SLA2
7     Lo  Hg  L   SLA5
8     Lo  Hg  M   SLA4
9     Lo  Hg  H   SLA3
10    Me  Lw  L   SLA4
11    Me  Lw  M   SLA3
12    Me  Lw  H   SLA2
13    Me  Md  L   SLA5
14    Me  Md  M   SLA4
15    Me  Md  H   SLA3
16    Me  Hg  L   SLA6
17    Me  Hg  M   SLA5
18    Me  Hg  H   SLA4
19    Hi  Lw  L   SLA5
20    Hi  Lw  M   SLA4
21    Hi  Lw  H   SLA3
22    Hi  Md  L   SLA6
23    Hi  Md  M   SLA5
24    Hi  Md  H   SLA4
25    Hi  Hg  L   SLA7
26    Hi  Hg  M   SLA6
27    Hi  Hg  H   SLA5
the form: IF “condition” THEN “control action”. For example, for Rule 1: “IF Re is Lo, Av is Lw and La is L THEN SLA is SLA3”.
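To make the inference pipeline concrete, here is a minimal Python sketch of a Mamdani-style evaluation. The membership-function breakpoints, the min conjunction, the output-centroid placement and the weighted-average defuzzifier are illustrative assumptions rather than the paper's exact design; only the rule pattern, captured below by the closed form 3 + i + j − k, is taken from Table 2.

```python
# Illustrative fuzzy SLA evaluation; ranges and breakpoints are guesses.
from itertools import product

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def terms(x, lo, hi):
    """Fuzzify x over [lo, hi] into three overlapping terms indexed 0..2."""
    m = (lo + hi) / 2.0
    return [tri(x, lo - (m - lo), lo, m),   # low term
            tri(x, lo, m, hi),              # medium term
            tri(x, m, hi, hi + (hi - m))]   # high term

def sla(re, av, la):
    """Crisp SLA in [0, 1] for reliability re (%), availability av (%) and
    latency la (ms), combining rule strengths by a weighted average."""
    mu_re, mu_av, mu_la = terms(re, 0, 100), terms(av, 0, 100), terms(la, 0, 10)
    num = den = 0.0
    for i, j, k in product(range(3), repeat=3):
        w = min(mu_re[i], mu_av[j], mu_la[k])  # rule firing strength
        level = 3 + i + j - k                  # reproduces Table 2: SLA1..SLA7
        num += w * (level - 1) / 6.0           # assumed centroid of SLA_level
        den += w
    return num / den if den else 0.0

print(sla(50, 90, 5))  # roughly 0.63 with these assumed memberships
```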
5 Simulation Results
In this section, we present the simulation results for our proposed system. The simulation results are shown in Figs. 6, 7 and 8; they show the relation of the SLA with Re, Av and La. We keep Re constant, and change the Av value from 10% to 90% and La from 0 ms to 10 ms. In Fig. 6, we consider
Fig. 5. Membership functions: (a) Reliability, (b) Availability, (c) Latency, (d) Service Level Agreement.
the Re value as 10%. When La increases from 0 ms to 10 ms, we see that the SLA decreases. When La is 5 ms, the SLA increases by 15% when Av is increased from 10% to 50% and from 50% to 90%, respectively. We compare Fig. 6 with Fig. 7 to see how Re affects the SLA. When La is 5 ms and Av is 50%, the SLA increases by 11.38% when Re is increased from 10% to 50%. In Fig. 7, when Av is 90%, all SLA values are higher than 0.5. This means
Fig. 6. Simulation results for Re = 10%.
Fig. 7. Simulation results for Re = 50%.

Fig. 8. Simulation results for Re = 90%.
that users fulfill the required SLA. In Fig. 8, we increase the value of Re to 90%. We see that the SLA value increases much more compared with the results of Figs. 6 and 7.
6 Conclusions and Future Work
In this paper, we proposed and implemented a fuzzy-based system for user SLA. The admission control mechanism will search a set of slices and try to establish a connection between a new user and a slice that matches the required SLA, so the SLA parameter will be used as an input parameter for Admission Control in 5G wireless networks. We evaluated the proposed system by simulations. From the simulation results, we found that the three parameters have different effects on
the SLA: when Re and Av increase, the SLA parameter increases, but when La increases, the SLA parameter decreases. In future work, we would like to evaluate the Admission Control system by considering other parameters.
References
1. Navarro-Ortiz, J., Romero-Diaz, P., Sendra, S., Ameigeiras, P., Ramos-Munoz, J.J., Lopez-Soler, J.M.: A survey on 5G usage scenarios and traffic models. IEEE Commun. Surv. Tutorials 22(2), 905–929 (2020)
2. Hossain, S.: 5G wireless communication systems. Am. J. Eng. Res. (AJER) 2(10), 344–353 (2013)
3. Giordani, M., Mezzavilla, M., Zorzi, M.: Initial access in 5G mmWave cellular networks. IEEE Commun. Mag. 54(11), 40–47 (2016)
4. Kamil, I.A., Ogundoyin, S.O.: Lightweight privacy-preserving power injection and communication over vehicular networks and 5G smart grid slice with provable security. Internet Things 8, 100–116 (2019)
5. Hossain, E., Hasan, M.: 5G cellular: key enabling technologies and research challenges. IEEE Instrum. Measure. Mag. 18(3), 11–21 (2015)
6. Yao, D., Su, X., Liu, B., Zeng, J.: A mobile handover mechanism based on fuzzy logic and MPTCP protocol under SDN architecture. In: 18th International Symposium on Communications and Information Technologies (ISCIT-2018), pp. 141–146, September 2018
7. Lee, J., Yoo, Y.: Handover cell selection using user mobility information in a 5G SDN-based network. In: 2017 Ninth International Conference on Ubiquitous and Future Networks (ICUFN-2017), pp. 697–702, July 2017
8. Moravejosharieh, A., Ahmadi, K., Ahmad, S.: A fuzzy logic approach to increase quality of service in software defined networking. In: 2018 International Conference on Advances in Computing, Communication Control and Networking (ICACCCN-2018), pp. 68–73, October 2018
9. Ampririt, P., Ohara, S., Qafzezi, E., Ikeda, M., Barolli, L., Takizawa, M.: Integration of software-defined network and fuzzy logic approaches for admission control in 5G wireless networks: a fuzzy-based scheme for QoS evaluation. In: Barolli, L., Takizawa, M., Enokido, T., Chen, H.-C., Matsuo, K. (eds.) Advances on Broad-Band Wireless Computing, Communication and Applications, vol. 159, pp. 386–396. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-61108-8_38
10. Ampririt, P., Ohara, S., Qafzezi, E., Ikeda, M., Barolli, L., Takizawa, M.: Effect of slice overloading cost on admission control for 5G wireless networks: a fuzzy-based system and its performance evaluation. In: Barolli, L., Natwichai, J., Enokido, T. (eds.) Advances in Internet, Data and Web Technologies, pp. 24–35. Springer, Cham (2021)
11. Li, L.E., Mao, Z.M., Rexford, J.: Toward software-defined cellular networks. In: 2012 European Workshop on Software Defined Networking, pp. 7–12, October 2012
12. Mousa, M., Bahaa-Eldin, A.M., Sobh, M.: Software defined networking concepts and challenges. In: 2016 11th International Conference on Computer Engineering and Systems (ICCES-2016), pp. 79–90. IEEE (2016)
13. Jantzen, J.: Tutorial on fuzzy logic. Technical report, Technical University of Denmark, Department of Automation (1998)
14. Mendel, J.M.: Fuzzy logic systems for engineering: a tutorial. Proc. IEEE 83(3), 345–377 (1995)
15. Zadeh, L.A.: Fuzzy logic. Computer 21, 83–93 (1988)
16. Norp, T.: 5G requirements and key performance indicators. J. ICT Stand. 6(1), 15–30 (2018)
17. Parvez, I., Rahmati, A., Guvenc, I., Sarwat, A.I., Dai, H.: A survey on low latency towards 5G: RAN, core network and caching solutions. IEEE Commun. Surv. Tutorials 20(4), 3098–3130 (2018)
18. Kim, Y., Park, J., Kwon, D., Lim, H.: Buffer management of virtualized network slices for quality-of-service satisfaction. In: 2018 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN-2018), pp. 1–4 (2018)
19. Barolli, L., Koyama, A., Yamada, T., Yokoyama, S.: An integrated CAC and routing strategy for high-speed large-scale networks using cooperative agents. IPSJ J. 42(2), 222–233 (2001)
Cognitive Approach for Creation of Visual Security Codes

Urszula Ogiela1 and Marek R. Ogiela2(B)

1 Pedagogical University of Krakow, Podchorążych 2 Street, 30-084 Kraków, Poland
2 Cryptography and Cognitive Informatics Laboratory, AGH University of Science and Technology, 30 Mickiewicza Avenue, 30-059 Kraków, Poland
[email protected]
Abstract. Modern cryptography makes it possible to apply personal features or cognitive parameters in the creation of security protocols. This allows personally oriented security cryptosystems to be defined for different applications such as service management and secure exploration of big data. We can also create thematic visual codes used for personal authentication. In such procedures it is possible to use contactless devices to select the visual parts necessary for authentication. In this paper, we propose such an approach for the creation of security procedures based on visual paths. Passing such paths requires special cognitive skills and the ability to follow commands during authentication.

Keywords: Cognitive cryptography · User-oriented security systems · Eye tracking interfaces
1 Introduction

Modern security protocols are very often based on the application of personal or cognitive features. Many procedures can apply selected personal characteristics or even biometric templates in the creation of user-oriented security systems [1, 2]. The application of cognitive parameters requires special interfaces which can register users' attention, movements, or unique personal features. In the conducted research, we propose a cognitive-based approach to the creation of authentication protocols which generate sequences of visual patterns that should be recognized or understood by users during authentication. The definition of such protocols introduces cognitive cryptography technologies, which use so-called cognitive information systems [1]. The application of cognitive systems allows different sequences to be generated containing visual patterns representing selected areas of knowledge [3, 4]. The generated security codes are presented to the user, who, according to his cognitive skills and knowledge, has to find the answer to the presented questions. Being able to understand the presented patterns, the user can find the proper answer and select the corresponding visual parts. This selection can be done using contactless interfaces such as eye-tracking devices, sensor cameras, etc.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): INCoS 2021, LNNS 312, pp. 107–111, 2022. https://doi.org/10.1007/978-3-030-84910-8_11
2 Cognitive Solutions in Security Technologies

Cognitive approaches implement resonance techniques which imitate the natural way of human thinking and cognition. The implementation of such processes in computer systems was described in [1]. Some possible applications of such systems in developing security solutions, defining the area of so-called cognitive cryptography, were presented in [5]. The same approaches can be successfully applied in the creation of visual security codes, especially thematic sequences focused on the user authentication task. The application of cognitive systems allows human-centered protocols to be defined which consider users' thematic knowledge and perception skills. Cognitive systems model brain functions and thinking processes [1], which is possible thanks to the comparison of observed features with gained experiences or expectations stored in databases. During such a comparison, a generated hypothesis is verified and a decision about similarity to the expectation is made. This is an implementation of the human model of visual perception called knowledge-based perception. This model assumes that our mind cannot recognize any pattern or situation if we have no knowledge or previous experience in the analysis of this pattern. With the resonance paradigm, it is possible to implement different classes of cognitive systems connected with multimedia interfaces, which allow efficient technologies for moving-attention analysis and security authentication to be constructed.
3 An Idea of Visual Security Codes

In this section we propose security procedures which use sequences of visual patterns, from which a particular subset should be selected according to some requirements. To follow such protocols, users should have specific information or thematic knowledge representing a selected area of interest. The application of such procedures makes it possible to check whether the user is a human (as in CAPTCHA codes) and whether he belongs to an authorized group [6, 7]. Such security protocols work in the following manner:
• A sequence of visual patterns representing a particular area of expertise is generated.
• Verification questions are formulated and presented to the user.
• The user should quickly find the proper answer, according to his knowledge and perception skills.
In such protocols it is necessary to quickly and correctly select the semantic combination of visual elements which creates an answer to the presented question. The proper selection should follow the requested goals or fulfil requirements about semantic meaning. During the analysis, the user should understand the question and recognize all visual parts of the presented sequence, then find the answer by moving his attention and concentrating on particular elements, selected by blinking his eyes [8, 9]. The signal can be registered using eye-tracker sensors, which makes it possible to trace the attention points and create answers in the form of visual paths. Verification of users with such codes can be done in two different ways.
The first approach requires the selection of visual patterns in the proper order according to the verification questions. This means that all visual parts should be permuted in such a manner that they finally fulfill the requested conditions.
Fig. 1. Visual security codes based on symmetric encryption procedures. A set of symmetric cryptosystems is presented: 1-DES, 2-AES, 3-SAFER, 4-RC6. The authentication procedure can require the user to select procedures with increasing numbers of iterations (8, 10, 12, 14, 16, etc.), or to sort them by increasing key length (64 bits, 256 bits).
The second approach relies on answering multiple questions connected with a single pattern placed among others. Each question requires placing the analyzed pattern in the proper position among the others, but the position changes depending on the question.
In both approaches, only a qualified user can understand the asked questions connected with the meaning of the patterns, which allows the proper relations between the visual elements presented in the whole sequence to be found. Figure 1 presents an example with several block ciphers [10]. When the user understands them, he can be asked about the number of encryption iterations. In the next stage he can be asked to select systems using keys of a particular length, and then to sort procedures according to complexity, etc. Such a protocol allows determining whether the user is a human, whether he or she possesses specific knowledge of a particular area, and whether he or she is able to find proper sequences reflecting the semantic meaning of the asked question.
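As a simple illustration of the verification step, the sketch below checks a user's ordering answer against pattern metadata. The cipher attributes here are illustrative values only, and the dictionary-based state is hypothetical; a real deployment would generate the pattern sequence with a cognitive system and capture the user's selections through an eye-tracking interface.

```python
# Illustrative verification of one thematic ordering question; the attribute
# values are examples, and the user's selection would come from eye tracking
# rather than a hard-coded list.
PATTERNS = {
    "DES":   {"iterations": 16, "key_bits": 56},
    "AES":   {"iterations": 10, "key_bits": 128},
    "SAFER": {"iterations": 8,  "key_bits": 64},
    "RC6":   {"iterations": 20, "key_bits": 128},
}

def expected_order(attribute):
    """The correct answer: pattern names sorted by the given attribute."""
    return sorted(PATTERNS, key=lambda name: PATTERNS[name][attribute])

def verify(user_order, attribute):
    """Check the sequence of patterns selected by the user."""
    return list(user_order) == expected_order(attribute)

# A knowledgeable user sorts the ciphers by increasing number of iterations.
print(verify(["SAFER", "AES", "DES", "RC6"], "iterations"))  # True
```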
4 Conclusions

In this paper, a new approach was proposed for creating authentication procedures in the form of visual pattern sequences. The described protocols can be implemented using cognitive systems, which generate sets of thematic visual patterns. Such patterns should be understood and recognized by users via contactless interfaces such as eye-tracking devices. The proposed procedures define a type of user-oriented security protocol that considers different levels of users' thematic knowledge. During verification, basic perception skills are also checked, which confirms that the user is a human and not a machine [11, 12]. The application of contactless interfaces allows mobile security solutions to be developed that work in changing environments and consider user features as well as external parameters describing the surrounding world.

Acknowledgments. This work has been supported by Pedagogical University of Krakow research Grant No BN.711-79/PBU/2021. This work has been supported by the AGH University of Science and Technology research Grant No 16.16.120.773.
References
1. Ogiela, L., Ogiela, M.R.: Cognitive security paradigm for cloud computing applications. Concurrency Comput. Pract. Exp. 32(8), e5316 (2020). https://doi.org/10.1002/cpe.5316
2. Ogiela, M.R., Ogiela, L., Ogiela, U.: Biometric methods for advanced strategic data sharing protocols. In: 2015 9th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS 2015), pp. 179–183 (2015). https://doi.org/10.1109/IMIS.2015.29
3. Ogiela, U., Ogiela, L.: Linguistic techniques for cryptographic data sharing algorithms. Concurrency Comput. Pract. Exp. 30(3), e4275 (2018). https://doi.org/10.1002/cpe.4275
4. Ogiela, M.R., Ogiela, U.: Secure information splitting using grammar schemes. In: Nguyen, N.T., Katarzyniak, R.P., Janiak, A. (eds.) New Challenges in Computational Collective Intelligence. Studies in Computational Intelligence, vol. 244, pp. 327–336 (2009). https://doi.org/10.1007/978-3-642-03958-4_28
5. Ogiela, L., Ogiela, M.R., Ogiela, U.: Efficiency of strategic data sharing and management protocols. In: The 10th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS 2016), Fukuoka, Japan, 6–8 July 2016, pp. 198–201 (2016). https://doi.org/10.1109/IMIS.2016.119
6. Alsuhibany, S.: Evaluating the usability of optimizing text-based CAPTCHA generation. Int. J. Adv. Comput. Sci. Appl. 7(8), 164–169 (2016)
7. Osadchy, M., Hernandez-Castro, J., Gibson, S., Dunkelman, O., Perez-Cabo, D.: No bot expects the DeepCAPTCHA! Introducing immutable adversarial examples, with applications to CAPTCHA generation. IEEE Trans. Inf. Forensics Secur. 12(11), 2640–2653 (2017)
8. Ancheta, R.A., Reyes, F.C., Jr., Caliwag, J.A., Castillo, R.E.: FEDSecurity: implementation of computer vision thru face and eye detection. Int. J. Mach. Learn. Comput. 8, 619–624 (2018)
9. Guan, C., Mou, J., Jiang, Z.: Artificial intelligence innovation in education: a twenty-year data-driven historical analysis. Int. J. Innov. Stud. 4(4), 134–147 (2020)
10. Menezes, A., van Oorschot, P., Vanstone, S.: Handbook of Applied Cryptography. CRC Press, Waterloo (2001)
11. Yang, S.J.H., Ogata, H., Matsui, T., Chen, N.-S.: Human-centered artificial intelligence in education: seeing the invisible through the visible. Comput. Educ. Artif. Intell. 2, 100008 (2021)
12. Ogiela, L.: Cryptographic techniques of strategic data splitting and secure information management. Pervasive Mobile Comput. 29, 130–141 (2016)
Transformative Computing Based on Advanced Human Cognitive Processes

Urszula Ogiela1, Makoto Takizawa2, and Lidia Ogiela1(B)

1 Pedagogical University of Krakow, Podchorążych 2 Street, 30-084 Kraków, Poland
2 Department of Advanced Sciences, Hosei University, 3-7-2, Kajino-cho, Koganei-shi, Tokyo 184-8584, Japan
[email protected]
Abstract. Data analysis processes based on individual, personalized recording techniques are used more and more. Such processes allow intelligent data analysis that considers only those elements that are of significant importance to a specific recipient. In addition, they make it possible to apply individual analysis results to deep data analysis processes. In this paper, such possibilities of data analysis are described, with particular emphasis on the use of transformative computing technology that takes advanced human cognitive processes into account.

Keywords: Transformative computing · Cognitive processes · Personalised data description
1 Introduction

The possibilities related to the use of transformative computing in data analysis and processing are very large, and their diversity depends, for example, on the extent of the network of connected signal recorders, the speed of processing of the data recorded by sensors, and the degree of diversity of the obtained information [1–5]. The ability to process data recorded by independent sensors is also one of the stages of transformative computing implementation; such processing is necessary because sensors can record data in various forms. Transformative computing is used to obtain a variety of data recorded by independent sensors which collect huge data sets, and it allows this data to be analyzed in real time. A characteristic feature of this type of computation is the possibility of modifying both the set of sensors recording signals and the amount of recorded data. A new direction in the discussed processes is cognitive transformative computing, for the meaningful analysis of recorded data and linking it in terms of causality, significance and similarity of results. Cognitive computing makes it possible to analyze only the data whose importance is significant for the entire interpretation process, while ignoring data that is irrelevant in a specific process [6–9].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): INCoS 2021, LNNS 312, pp. 112–115, 2022. https://doi.org/10.1007/978-3-030-84910-8_12
2 Advanced Human Cognitive Processes in Data Interpretation

Cognitive methods of information description, characteristic of human cognitive processes, are currently used in the machine description and interpretation of big and complex data sets. Computer solutions aimed at mapping human cognitive-interpretative processes as accurately as possible include both the conscious and subconscious stages of human analysis [6, 9]. Therefore, they are implemented in the areas of memory analysis and the processes of perceiving and recording information, as well as in cognitive processes and the analysis of the causes of their occurrence. The basic processes of data analysis include the steps of recording, saving and processing data, convergent with the processes of cognition, perception, acting and triggering specific reactions. Advanced cognitive data analysis includes the processes by which cognition and inference are possible. Advanced cognitive processes make it possible to answer questions related to the impulses of a given phenomenon, the reasons for its occurrence, the reasons for its changes, and its connections with other similar and/or different phenomena. The use of advanced data interpretation techniques based on human cognitive processes allows full data analysis to be carried out by intelligent information systems, whose task is not only to understand the analyzed data but also to respond quickly to the reasons for the occurrence of a given phenomenon, to modify and remove existing limitations, and to include in the analysis the favorable factors shaping a given phenomenon. Figure 1 shows a diagram of the application of cognitive solutions in data analysis and interpretation processes.
Fig. 1. The scheme of use of cognitive techniques in advanced data analysis and interpretation processes.
Undoubtedly, the advantage of this type of data interpretation is the ability to determine the significance of the analyzed data and its role in the entire analysis process, taking into account both the analyzed data sets and the information that has not been collected in them. The semantic analysis of data allows information that is not included in the analyzed data sets, but whose interpretation may be
important for a more complete analysis, to be supplemented in real time. Such an algorithm is possible as a result of the system learning new solutions based on solutions previously unknown to the system. The new knowledge acquired by the system enriches the base set, which again becomes the basis for the interpretation process being carried out. Currently, a new application of advanced cognitive processes is transformative computing in the field of real-time data processing and analysis, aimed at an in-depth understanding of the analysed phenomena.
3 New Transformative Computing Value Based on Cognitive Processes

Standard transformative computing solutions are used for data recording and processing carried out in real time, when the data is obtained from various sources and is registered by independent sensors. Each sensor records data without analyzing it, and all the collected signals are processed into a form that allows them to be compared with each other. This also makes it possible to distinguish the information that supports a broad analysis of the phenomenon from the information that does not fit at all. A feature added to basic transformative computing is the ability to perform data interpretation tasks by determining the meaning of the data [1, 3, 5]. If certain information has an impact on a given phenomenon, it can change it at a given moment or cause a change in the future. In this case, a change in the way it is interpreted, consisting in including in the description the information that may cause the change, will bring specific benefits. The essence of the proposed approach is the inclusion of cognitive processes for understanding the data registered by various sensors into the transformative processes. This makes it possible to identify the data that has the greatest impact on the occurrence of a given phenomenon, and then to assess its significance for all possible modifications, long- and short-wave. This method of analysis is therefore extremely flexible, adapted to the needs of the moment and of a specific situation. In particular, it allows the following to be considered:
the variability of the phenomenon, changes in its determinants, the period of its implementation, the influence of various groups of factors, modification of base sets, changes of sensors recording both in a given group and outside it.
Its usefulness is measured by the time taken to process the data from specific sensors, the correctness of the data understanding process, and the influence of a given factor on the changes.
4 Conclusions

Transformative computing now provides broad scope for data analysis, to the point where everything depends on a proper initial assessment of the situation. Selecting data according to its importance for the entire analysis process, based on processes convergent with the human way of interpretation and description, allows flexible modification of both the data sets and the sets of recorders. This, in turn, orients the entire interpretation process towards certain expectations and towards gaining knowledge of specific results of the analysis. At the same time, it should be emphasized that the possibilities of using this type of solution are extremely universal, due to the applied techniques of transformative computing and the cognitive techniques for data description and interpretation.

Acknowledgments. This work has been supported by the National Science Centre, Poland, under project number DEC-2016/23/B/HS4/00616. This work has been supported by Pedagogical University of Krakow research Grant No BN.711-78/PBU/2021.
References
1. Benlian, A., Kettinger, W.J., Sunyaev, A., Winkler, T.J.: The transformative value of cloud computing: a decoupling, platformization, and recombination theoretical framework. J. Manag. Inf. Syst. 35, 719–739 (2018)
2. Gil, S., et al.: Transformative effects of IoT, blockchain and artificial intelligence on cloud computing: evolution, vision, trends and open challenges. Internet Things 8, 100118 (2019)
3. Ko, H., et al.: Smart home energy strategy based on human behaviour patterns for transformative computing. Inf. Process. Manag. 57(5), 102256 (2020)
4. Nakamura, S., Ogiela, L., Enokido, T., Takizawa, M.: Flexible synchronization protocol to prevent illegal information flow in peer-to-peer publish/subscribe systems. In: Barolli, L., Terzo, O. (eds.) CISIS 2017. AISC, vol. 611, pp. 82–93. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-61566-0_8
5. Ogiela, L.: Transformative computing in advanced data analysis processes in the cloud. Inf. Process. Manag. 57(5), 102260 (2020)
6. Ogiela, L., Ogiela, M.R.: Cognitive security paradigm for cloud computing applications. Concurr. Comput. Pract. Exp. 32(8), e5316 (2020)
7. Ogiela, M.R., Ogiela, U.: Secure information splitting using grammar schemes. In: Nguyen, N.T., Katarzyniak, R.P., Janiak, A. (eds.) New Challenges in Computational Collective Intelligence. Studies in Computational Intelligence, vol. 244, pp. 327–336. Springer, Berlin (2009). https://doi.org/10.1007/978-3-642-03958-4_28
8. Ogiela, M.R., Ogiela, U.: Linguistic cryptographic threshold schemes. Int. J. Future Gener. Commun. Netw. 2(1), 33–40 (2009)
9. Ogiela, U., Takizawa, M., Ogiela, L.: Classification of cognitive service management systems in cloud computing. In: Barolli, L., Xhafa, F., Conesa, J. (eds.) BWCCA 2017. LNDECT, vol. 12, pp. 309–313. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-69811-3_28
Topology as a Factor in Overlay Networks Designed to Support Dynamic Systems Modeling

Abbe Mowshowitz1(B), Akira Kawaguchi1, and Masato Tsuru2

1 Department of Computer Science, The City College of New York, New York, USA
{amowshowitz,akawaguchi}@ccny.cuny.edu
2 Kyushu Institute of Technology, Iizuka, Japan
[email protected]
Abstract. Overlay networks are logical networks embedded in physical substrate networks. They are useful for supporting specialized applications involving computation and information exchange among subsets of users. This paper examines the characteristics of overlays that make them suitable for different applications in dynamic, distributed database environments. One such characteristic is the ease with which distances between nodes can be calculated. Examples of overlay graph structures that exhibit certain desirable characteristics include the hypercube, toroidal grid graph and the Kautz graph. Applications involving mobile elements are examined, and a system designed to support comparative analysis of the performance of different overlay structures is discussed.
1 Introduction

Characteristics of network topology can have significant effects on network performance. This is especially true for a highly distributed data network such as a 5G platform. Distance between nodes, for example, can affect latency, and may also play a role in traffic congestion if large amounts of data must be moved in response to queries. Previous research proved in theory that traffic congestion in overlay networks can be reduced by engineering the network to conform to a hypercube structure [1, 6, 10, 11, 16]. In a current project we are implementing a hypercube overlay by means of software-defined networking. The aim of this experimental research is to provide a practical demonstration of increased querying efficiency in distributed database systems, achieved by taking account of distance between nodes [14]. In this paper we generalize the previous work through a systematic examination of the advantages and disadvantages of different overlay network topologies for different applications. The principal contribution of this work lies in demonstrating the utility of particular overlay topologies designed to support dynamic systems modeling.
2 Overlay Network

An overlay network is a virtual or logical network embedded in a physical substrate such as a WAN, MAN or the Internet [15]. To illustrate the relationship between overlay and
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 L. Barolli et al. (Eds.): INCoS 2021, LNNS 312, pp. 116–124, 2022. https://doi.org/10.1007/978-3-030-84910-8_13
substrate, suppose nodes a and b are connected by a path P of length k in the overlay. If a message is sent from a to b in the overlay, it could follow an entirely different path Q in the substrate whose length could be less than or greater than k. If an overlay is well aligned with the substrate, the difference in path lengths will be minimal. Overlay networks are typically used to establish communication and information exchange among a subset of substrate users for specialized activities.
3 Network Topologies and Graph Models of Networks

Network structure can be characterized abstractly by means of graphs. Our concern here is with dynamically changing networks, whose structure varies over time due to the entrance of new nodes and the disappearance of existing ones [1]. If there are no constraints on the evolution of such a network, the structure is in effect a random graph. However, evolving networks can be structured to conform to predetermined graph types. The rationale for structuring growth is that the resulting overlay network will resemble a particular type of graph with known properties. For example, if distance between nodes is important, a graph structure can be chosen that has a small diameter and low average distance between nodes [8]. Moreover, with some graphs (e.g., the hypercube), the node labels allow for easy determination of distance and computation of paths between pairs of nodes [16]. This property provides a systematic way of optimizing distributed data exchange so as to minimize overall data transfer on the network. Broadcasting offers another example of the potential utility of a structured overlay: if the graph associated with an overlay has a Hamiltonian circuit, that structure could be exploited to establish an efficient algorithm for broadcasting. One structure long used in computational modeling is the mesh or grid graph, also known as a lattice graph [18]. The simplest type of this structure is the square grid graph, defined as the Cartesian product Pm × Pn of two paths Pm and Pn, having m and n vertices (m−1 and n−1 edges), respectively. In a notable application of this graph, the vertices correspond to processors and edges represent direct communication links between processors. Path lengths in Pm × Pn can be as high as (m−1) + (n−1), the diameter of the graph. If a large amount of data has to be shared between processors separated by considerable distance, network performance can be compromised. By contrast, the hypercube Hk with 2^k vertices has diameter k and offers an average path length smaller than that of the square grid graph. For example, if m and n are 32 and k is 10, both graphs have 1024 vertices, but the diameters are far apart: the hypercube has diameter 10, while the square grid graph has diameter 62. One additional property of the hypercube is of special interest, namely the labels associated with its vertices. In fact, the labels can be used to define the hypercube: Hk = (V, E), where V is a set of 2^k vertices and E is a set of edges. Each vertex has a unique binary k-sequence label, and e = uv ∈ E if the labels of u and v differ in exactly one position. The distance between two vertices is the Hamming distance between their respective labels. Thus, structuring an overlay network as a hypercube makes distance computation almost immediate, whereas in the random-graph case it is necessary to exchange messages between nodes to determine the network distance between them.
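Under these label conventions, distance and shortest-path computation reduce to bit operations. The following minimal Python sketch (with labels held as integers, and function names of our own choosing) illustrates the idea:

```python
# Label-based routing in a hypercube overlay: distance is the Hamming
# distance between labels, and a shortest path is built by flipping one
# differing bit at a time.
def hypercube_distance(u, v):
    """Hamming distance between two vertex labels."""
    return bin(u ^ v).count("1")

def hypercube_path(u, v):
    """One shortest path from u to v, flipping differing bits low to high."""
    path, diff, bit = [u], u ^ v, 1
    while diff:
        if diff & 1:        # this bit differs, so traverse that edge
            u ^= bit
            path.append(u)
        diff >>= 1
        bit <<= 1
    return path

print(hypercube_distance(0b000, 0b111))                      # 3
print([format(x, "03b") for x in hypercube_path(0b000, 0b111)])
```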
4 What Constitutes a Good Overlay Network Topology?

Some network properties such as fault tolerance or resistance to attack, as measured, for example, by the difficulty of breaking a network into isolated components, are critical for all network applications. However, a 'good topology' is generally one that answers to the needs of a particular type of application. We have already called attention to diameter and average path length as important properties for dynamic, distributed database operations, since a small diameter and low average path length can help to reduce message traffic and thus improve latency and throughput. Unlike its unstructured cousin, a structured overlay must be maintained, which requires effort and entails cost. This consideration leads us to examine the tradeoffs between desirable overlay topologies and the cost of maintaining them. To illustrate the trade-off issues we will discuss three different overlay topologies that can be used to support dynamic, distributed database operations. In addition we will examine an application involving mobile devices. Distributed database operations call for overlay networks in which distance between nodes can be determined easily. Other applications may require close alignment of real-world features, such as physical proximity, with network properties; and the maintenance costs of a given overlay topology may be a critical concern.

4.1 Hypercube

As noted earlier, the hypercube (see Fig. 1) is well suited to support distributed database operations inasmuch as a simple bit-vector operation on pairs of labels can be used to compute distance. Moreover, paths can be constructed easily using the node labels. The cost of maintaining the hypercube overlay is more than offset by the savings in bandwidth utilization if the overlay network is relatively stable under conditions of intensive querying. If, on the other hand, the rate of change in the network caused by nodes entering and leaving is high, maintenance costs can become excessive. The cost of maintaining a hypercube overlay structure stems in part from the relatively high edge density of this type of graph: the hypercube Hk has k/2 edges per node. Connections between nodes in an overlay network must be checked to be sure they are intact, so a high ratio of edges to nodes incurs considerable maintenance cost. This high ratio, which makes the hypercube robust and resistant to attack, is thus a negative for maintenance.
Fig. 1. Hypercube. Labels run from 000 to 111; nodes differing in one bit are adjacent.
4.2 Toroidal Grid Graph

A topology that also has the virtue of natural labels affording simple inter-node distance calculation is the toroidal grid graph, so called because it can be embedded in a torus. This graph (see Fig. 2) is the Cartesian product Cm × Cn of cycles Cm, Cn of lengths m and n, respectively; it has mn nodes and 2mn edges, constant edge density 2, and is regular of degree 4. Compared with the hypercube, the low edge density offers a reduction in the cost of edge-connection maintenance if it is used as an overlay network. The Cartesian product offers a natural labeling scheme, since nodes have the form (i,j), and the distance d[(i,j),(p,q)] between nodes (i,j) and (p,q) is given by the formula d[(i,j),(p,q)] = d_Cm(i,p) if j = q; d_Cn(j,q) if i = p; and d_Cm(i,p) + d_Cn(j,q) otherwise, where the cycle distance is d_Ck(x,y) = |x−y| if |x−y| ≤ k/2, and k − |x−y| otherwise. So the toroidal grid graph can be used to support distributed database operations at a lower cost than the hypercube. The downside of the relatively low edge density is greater vulnerability to attack, since fewer elements than in the hypercube need to fail to break the network into disjoint pieces that cannot communicate with each other.
Fig. 2. Toroidal grid graph [8]
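For concreteness, the label-based distance above can be computed directly; the following Python sketch (ours, implementing the wrap-around formula with threshold k/2) illustrates this:

def cycle_distance(x: int, y: int, k: int) -> int:
    # Distance on a cycle C_k: |x - y| or k - |x - y|, whichever is smaller.
    d = abs(x - y)
    return min(d, k - d)

def torus_distance(a, b, m: int, n: int) -> int:
    # Distance between labels a = (i, j) and b = (p, q) in C_m x C_n.
    (i, j), (p, q) = a, b
    return cycle_distance(i, p, m) + cycle_distance(j, q, n)

# In C_4 x C_6, nodes (0, 0) and (3, 5) are adjacent in both coordinates:
assert torus_distance((0, 0), (3, 5), m=4, n=6) == 2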
Note that every Cartesian product affords a natural labeling scheme that can be used to compute distance easily, assuming the constituent graphs in the product also have an easy-to-compute distance metric. The different Cartesian product graphs constitute an infinite class of potential overlay networks that could be used in applications involving distributed database operations. Interestingly, the hypercube too can be defined recursively using the Cartesian product. Other graph product operations (e.g., the lexicographic product) could also serve as potential overlay topologies.

4.3 Kautz Graph

This graph (see Fig. 3), like the hypercube and the toroidal grid graph, has been used in the design and analysis of interconnection networks, linking processors or switching elements [17]. The undirected Kautz graph is typically defined in two steps: first as a directed graph, which is then turned into an undirected graph with a straightforward modification. Node labels play a critical role, just as in the definition of the hypercube. A vertex of the directed Kautz graph K(d,n) has as label a length-n sequence of the form x1, x2, ..., xn, where the xi take values in the set S = {0, 1, ..., d} for some integer d > 0,
subject to the condition that xi+1 ≠ xi. A vertex u is adjacent to vertex v if the label of v is a one-position right shift of u's label with a new value from the set S tacked on to the end. K(d,n) is d-regular (i.e., indegree = outdegree = d), with (d + 1)d^(n−1) vertices and (d + 1)d^n edges. Like the toroidal grid graph, K(d,n) has a fixed number d of paths between vertices. The undirected Kautz graph UK(d,n) is obtained from K(d,n) by deleting the orientation of all directed edges and keeping one edge of each pair of multiple (bidirectional) edges. Like the toroidal grid graph Cm x Cn, UK(d,n) has a constant edge density approximately equal to d. UK(d,n) is fault tolerant, but not as resistant to attack as the hypercube Hk. With a relatively small diameter (n) and relatively high connectivity (2d−1), UK(d,n) is desirable as a network topology [2, 3]. However, inter-node distance computation based on the labels is more complex than in the hypercube or the toroidal grid graph, which makes this topology less attractive for applications in which distance between nodes plays a critical role.
Fig. 3. Directed Kautz graphs with |S| = 3 and string length 2 and 3 [19]
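The adjacency rule can be checked mechanically from the labels. The following Python sketch (our illustration of the definition, not the paper's code) enumerates the labels of K(d,n) and verifies the vertex count and out-degree for d = 2, n = 2:

from itertools import product

def kautz_labels(d: int, n: int):
    # Length-n sequences over {0,...,d} with no two equal consecutive symbols.
    return [s for s in product(range(d + 1), repeat=n)
            if all(s[i] != s[i + 1] for i in range(n - 1))]

def kautz_adjacent(u, v) -> bool:
    # u -> v iff v is u shifted one position with a fresh symbol appended.
    return u[1:] == v[:-1]

labels = kautz_labels(d=2, n=2)
assert len(labels) == (2 + 1) * 2 ** (2 - 1)            # (d+1)d^(n-1) vertices
for u in labels:
    assert sum(kautz_adjacent(u, v) for v in labels) == 2   # out-degree d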
The three graphs discussed here do not exhaust the possibilities. Graph operations such as tensor product and composition could also be used to design overlay topologies. The graphs examined here are offered as examples for the purpose of demonstrating the issues of concern in evaluating topologies for overlay networks designed to support particular applications.
5 Example of an Overlay Topology to Support a Mobile Application

Thus far we have emphasized the need for labeling schemes that permit simple computation of inter-node distances in overlay networks. Other properties may be equally important in some applications. Another important consideration has to do with maintenance of the overlay topology. Connections between nodes must be checked and sometimes re-established, so the edge density of an overlay network may be an important parameter. Consider a smart environment application in which cars circulating in a city need to share information and to determine which other cars are nearby. For simplicity we speak of cars without mentioning drivers explicitly, although the latter are the agents utilizing the information being exchanged. The scheme proposed here as an illustration
is different from a Mobile Ad Hoc Network (MANET). We assume the cars can connect to a certain edge-based network (or possibly a WAN or the Internet) and make use of an overlay network embedded in that substrate to exchange messages with other cars (see Fig. 4). To facilitate communication and message passing between neighbors, the messaging system should take account of both network distance between nodes representing cars in the overlay and distance between cars in real physical space. Ideally, the assignment of nodes to cars would group neighboring cars close together in the overlay network.
Fig. 4. A mobile application utilizing a toroidal grid graph overlay
An important requirement in this case is a close alignment between network distance and physical distance. The proposed scheme uses the toroidal grid graph and works as follows. A car announces its intention to join the overlay network by broadcasting a message which includes its GPS location. Nodes currently in the network receiving the message respond by providing their GPS location, their node label, and a free label in their immediate neighborhood. The new entrant compares its GPS location with those of the first, say three, responding nodes and accepts the label sent by the node closest to itself in both real space and the overlay network, based on a fitness function of both parameters; a sketch of this join step is given below. Taking account of the movement of cars in the city, and to assure a close alignment of network and physical distance, a car remains connected to the overlay network for only a specified period of time. This means that at the end of the time interval the car disconnects and repeats the joining operation, presumably from a new physical location. To determine all the cars within a given radius of its current location, a car broadcasts a request giving its GPS location, its overlay network label, and a specification of the target distance from its current location. A receiving car compares its current location with that of the calling car and responds appropriately. The toroidal grid graph is a suitable topology for this application because its vertex labels permit easy determination of network distance between nodes, and its edge density is relatively low, thus keeping the maintenance cost relatively low. Hypercube and Kautz graphs could also be used, which would allow for comparative analysis of the advantages and disadvantages of different topologies.
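The following Python sketch illustrates the label-selection step of the join procedure. All names and the fitness weights are our own assumptions; the paper does not specify the fitness function, so a weighted sum of physical and overlay distance is used as a stand-in:

import math

def physical_distance(gps_a, gps_b):
    # Simplified planar distance; a real system would use geodesic distance.
    return math.hypot(gps_a[0] - gps_b[0], gps_a[1] - gps_b[1])

def choose_label(new_gps, responses, overlay_distance, alpha=1.0, beta=1.0):
    # responses: list of (responder_gps, responder_label, free_label).
    # overlay_distance: distance on overlay labels (e.g., torus_distance above).
    def fitness(resp):
        responder_gps, responder_label, free_label = resp
        return (alpha * physical_distance(new_gps, responder_gps)
                + beta * overlay_distance(responder_label, free_label))
    # Consider, say, the first three responders and take the best free label.
    return min(responses[:3], key=fitness)[2]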
This example of a possible application is meant to illustrate the importance of topology as a factor in overlay network design. The element of mobility draws attention to the relationship between physical and network distance. In practice, a number of issues would have to be addressed to make the application work. In particular, the substrate network would have to be specified, as well as the precise method of accessing the substrate. The conceptual design described above could in principle be adapted to any setting in which mobile devices connected to some substrate need to share information. Air traffic control, for example, could be managed in this way, if aircraft were supplied with the requisite instruments needed to connect to the substrate. This approach could offer a more effective and cheaper way of managing control than is currently available. Another interesting area involves the spread of contagious diseases. Treating contagion as a kind of message passing among mobile individuals is an innovative way of modeling epidemics that could offer analytic tools for designing effective interventions [4].
6 Work in Progress

We are implementing a comprehensive simulation system in support of the comparative investigation of overlay network topologies. The design is based on multiple open-source, heterogeneous, relational database systems such as MariaDB, PostgreSQL, Firebird, and SQLite3. These systems are built on top of an OpenFlow environment embedded with a software-oriented Address Resolution Protocol. To ensure portability of the simulated environment, the experimental system makes use of the Ubuntu 18.04 operating system built with a Mininet virtual network enhancement (see Fig. 5). Initially, the system will create a static hypercube to accommodate a predetermined number of nodes, and will activate an overlay structure by accessing a designated IP address that hosts a specific database system. Our aim is to build a working and practical demonstration of the advantages of overlay networks.
Fig. 5. Simulated software environment
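As an illustration of how such a static hypercube might be declared, the sketch below uses Mininet's Python topology API; the class name and the host/switch naming scheme are our assumptions, not the project's actual scripts:

from mininet.topo import Topo

class HypercubeTopo(Topo):
    # H_k substrate: one switch per vertex with an attached host, and a link
    # between vertices whose labels differ in exactly one bit.
    def build(self, k=3):
        switches = {}
        for label in range(2 ** k):
            switches[label] = self.addSwitch('s%d' % (label + 1))
            host = self.addHost('h%d' % (label + 1))
            self.addLink(host, switches[label])
        for label in range(2 ** k):
            for bit in range(k):
                neighbor = label ^ (1 << bit)
                if label < neighbor:   # add each hypercube edge once
                    self.addLink(switches[label], switches[neighbor])

topos = {'hypercube': HypercubeTopo}  # mn --custom this_file.py --topo hypercube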
The experiments will pay special attention to the data requirements of users. To this end, we are building several distributed query processing applications designed to gather data from a publicly available database system, namely New York City OpenData. Collections of interest in this system include records of parking violations, housing litigation, arrests, complaints, etc. Currently, the data gathered from New York City OpenData (ranging in size from a hundred MB to one GB) are placed in a centralized repository. For this application the data sets are divided into several relational databases corresponding to the grouping of district IDs, precincts, building IDs, etc. These databases are connected through WebSockets and can therefore process bidirectional data transfers. Critical parameters of query performance are response time and the total amount of data transmitted on the network. A comparison of performance between a hypercube overlay network and a set of randomly built networks is underway.
7 Conclusion

Overlays are useful tools for connecting subgroups of network users for applications involving dynamic systems, and the topology of an overlay is a critical success factor in such applications. Further research is needed to categorize overlay network topologies based on features needed to support particular applications. Such a categorization of topologies, based on selected features of interest, investigated both analytically and experimentally, would allow for selecting the best topology for any given dynamic modeling application.
References

1. Bent, G., Dantressangle, P., Vyvyan, D., Mowshowitz, A., Mitsou, V.: A dynamic distributed federated database. In: Annual Conference of the International Technology Alliance, Imperial College-London (2008)
2. Bermond, J.-C., Homobono, N., Peyrat, C.: Connectivity of Kautz networks. Discret. Math. 114, 51–62 (1993)
3. Du, D.Z., Hsu, D.F., Lyuu, Y.D.: On the diameter vulnerability of Kautz digraphs. Discret. Math. 151, 81–85 (1996)
4. Enright, J., Kao, R.R.: Epidemics on dynamic networks. Epidemics 24, 88–97 (2018)
5. Imrich, W., Klavzar, S.: Product Graphs: Structure and Recognition. Wiley, New York (2000)
6. Kawaguchi, A., et al.: A model of query performance in dynamic distributed federated databases taking account of network topology. In: Annual Conference of the International Technology Alliance, Southampton, UK (Sept 2012)
7. Marquez, A., de Mier, A., Noy, M., Revuelta, M.P.: Locally grid graphs: classification and Tutte uniqueness. Discret. Math. 266, 327–352 (2003)
8. MathOverflow. Torus graph dynamics (2021). https://mathoverflow.net/questions/270761/torus-graph-dynamics
9. Mowshowitz, A., Mitsou, V., Bent, G.: Evolving networks and their vulnerabilities. In: Dehmer, M., et al. (eds.) Modern and Interdisciplinary Problems in Network Science: A Translational Research Perspective, CRC Press, pp. 173–192 (2018)
10. Mowshowitz, A., et al.: Network topology as a cost factor in query optimization. In: Annual Conference of the International Technology Alliance, Adelphi, MD (2011)
11. Mowshowitz, A., et al.: Query optimization in a distributed hypercube database. In: Annual Conference of the International Technology Alliance, Imperial College-London (Sept 2010)
12. Mowshowitz, A., Bent, G., Dantressangle, P.: Aligning network structures: embedding a logical dynamic distributed database in a MANET. In: Annual Conference of the International Technology Alliance, University of Maryland (2009)
13. Mowshowitz, A., Bent, G.: Formal properties of distributed database networks. In: Annual Conference of the International Technology Alliance, University of Maryland (2007)
14. Sadaawi, T., Kawaguchu, A., Lee, M., Mowshowitz, A.: Secure resilient edge cloud designed network. IEICE Transactions on Communications (in press)
15. Tarkoma, S.: Overlay Networks: Toward Information Networking. CRC Press, Boca Raton (2010)
16. Toce, A., Mowshowitz, A., Kawaguchi, A., Stone, P., Dantressangle, P., Bent, G.: An efficient hypercube labeling scheme for dynamic peer-to-peer networks. J. Parallel Distrib. Comput. 102, 186–198 (2017)
17. Wang, S., Lin, S.: The k-restricted edge connectivity of undirected Kautz graphs. Discret. Math. 309, 4649–4652 (2009)
18. Wikipedia. Lattice graph (2020). https://en.wikipedia.org/wiki/Lattice_graph
19. Wikipedia. Kautz graph (2021). https://en.wikipedia.org/wiki/Kautz_graph
A Genetic Algorithm for Parallel Unmanned Aerial Vehicle Scheduling: A Cost Minimization Approach

Aprinaldi Jasa Mantau(B), Irawan Widi Widayat, and Mario Köppen

Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
[email protected], [email protected], [email protected]
Abstract. In recent years, there have been many research achievements in the Unmanned Aerial Vehicle (UAV) field. UAVs can be used for logistics delivery as well as surveillance. Two main problems in this field are UAV routing and UAV scheduling. In this paper, we focus on the UAV scheduling problem, which is the problem of finding a scheduling order for a fixed number of UAVs and a fixed number of targets. The objective of this paper is to minimize the total cost for efficient realization. A Genetic Algorithm (GA) method is used to solve the UAV-scheduling problem considering time-varying cost, or Time-of-Use (TOU) tariff, constraints. A Job Delay Mechanism is also used to improve cost optimization, as a kind of post-processing for the fitness evaluation of an individual schedule, and we show that GA alone cannot find such improvements. Finally, a numerical experiment is conducted to validate the idea in this paper. The experimental results show that the proposed method is quite promising and effective in solving the related problem.

Keywords: Unmanned Aerial Vehicle · Genetic algorithm · Scheduling · Job delay mechanism · Cost efficiency
1 Introduction
The empowering Internet-of-Things (IoT) innovation has inspired countless novel platforms and applications in the last decade. One of the popular IoT innovations is the Unmanned Aerial Vehicle (UAV). UAVs are used because they have several advantages: they are flexible, fast, relatively cheap, lightweight, and also easy to use [8]. Therefore, UAVs have been widely used for military and various civilian purposes. Some of the fields that utilize UAVs are wireless sensor networks [1], including wireless sensor networks for special purposes like data harvesting [14], and logistics delivery, e.g., optimization using drones in the traveling salesman problem [7], while [10] studies crowd intelligence in last-mile parcel delivery for smart cities. Also, in the surveillance field, [4] investigates joint routing and scheduling for vehicle-assisted multi-drone surveillance, while [3] uses a
multi-drone-based inspection system. Another field that utilizes UAVs is disaster patrol and warning systems using remote sensing techniques [15]. There is also the odor source localization problem: a UAV-based odor source localization method using a simulated annealing algorithm is proposed in [13], and a modification of particle swarm optimization that reforms the global best term to accelerate the search for odor sources is proposed in [11]. However, current technological developments are directly proportional to resource use, especially electrical resources. Inefficient use of electrical resources can accelerate the depletion of existing resources. For this reason, future technological developments must consider the efficiency of electricity use, and this UAV technology is no exception. In the end, this kind of energy-efficient research will help us to solve the energy problem in the future and also protect the environment. In this paper, we employ a Genetic Algorithm (GA) to solve the UAV-scheduling problem. Like other NP-complete problems, UAV scheduling has constraints including minimizing the total cost, asymmetric routing and planning, heterogeneous vehicle types, and time traveled or the number of vehicles [9]. To realize efficiency, we adopt the Time-of-Use (TOU) tariff condition found in the real world. A TOU rate is a type of electricity rate used by some utilities that may differ across periods. For example, the electricity price during the peak period (daytime) differs from that in the off-peak period (nighttime). Drones used for surveillance are exemplified in Fig. 1: due to limited range and electricity requirements, drones usually work only in certain limited areas, like the red circles in Fig. 1, and must return to the drone base location (depot) at certain times for charging. However, the model of the depot and each drone's allocation area is not included in this research.
Fig. 1. General illustration of multiple UAVs patrolling specific areas.
Previous studies on TOU were mainly done for machine scheduling. Because the characteristics of the problems tend to be similar, we adapt that approach to the UAV-scheduling problem. UAV scheduling under TOU tariffs
can be said to be a different field from existing UAV research so far. One line of research on TOU tariffs is [5]. In that study, the authors propose a hybrid insertion algorithm for solving unrelated parallel machine scheduling under tariff difference constraints to minimize the total cost [6]. They provide a blank-job insertion algorithm that delays tasks and their sequence to minimize the cost. Therefore, we adapt the machine scheduling problem under TOU electricity cost with a meta-heuristic technique to minimize costs. The paper is organized as follows: Sect. 2 gives the problem formulation of this research. In Sect. 3, we present some necessary background for the following parts and the concept for solving the problem. The results of a numerical experiment testing our proposal are shown in Sect. 4. The conclusions about the significance of the results in this work and some feasibility studies are described in Sect. 5.
Table 1. The daily electricity price [5].

Period  Price  Period  Price  Period  Price  Period  Price
0       45.0   6       45.0   12      158.9  18      90.8
1       45.0   7       45.0   13      90.8   19      90.8
2       45.0   8       45.0   14      158.9  20      90.8
3       45.0   9       45.0   15      158.9  21      90.8
4       45.0   10      90.8   16      158.9  22      90.8
5       45.0   11      90.8   17      158.9  23      45.0

2 Problem Formulation
In this preliminary research, we want to investigate the effect of a modified genetic algorithm under TOU tariffs on the parallel UAV-scheduling problem. We make several constraints to limit the scope of this research, as follows:
1. The numbers of tasks and drones (UAVs) are predefined; they are denoted by N and M, respectively.
2. All drones have the same function, but the energy required to complete a task and the completion time differ. So, we can say that the drones are unrelated, with the time to complete a specific job assigned per drone.
3. Each task t (t = 1, . . . , N) has to be processed on exactly one of the M drones.
4. For simplification, the unit of time is 1 h for each period; in other words, there are 24 periods a day. We add the limitation that the heaviest task must be completable by the drone with the lowest capability within at most 24 periods.
5. For cost calculation simulations, the applicable rates for energy consumption may differ across periods. We use the data from Table 1 to calculate the electricity rates.
2.1 Notation

We present in this section the notation used in this paper.
• N: Number of tasks, t = 1, 2, 3, . . . , N
• M: Number of drones, d = 1, 2, 3, . . . , M
• p_td: The processing time of task t on drone d, with max(p_td) = 24
• w_d: Energy consumption rate per period of drone d while it performs a task
• K: Number of periods. We assume the period intervals are uniform, so a day is divided into 24 periods, K = {k | k = 1, 2, 3, . . . , 24}
• c_k: Electricity cost in period k; thus, up to 24 different tariffs may apply
• s_t: Starting period for the first task of drone t
• d_t: Finishing period for the last task of drone t
• TC: Total cost of the output schedule
The model involves the following decision variables:

x_dtk = 1 if task t is processed at time k by drone d, and 0 otherwise;
y_dt = 1 if task t is processed by drone d, and 0 otherwise.
Objective:

Minimize: \( TC = \sum_{d=1}^{M} \sum_{t=1}^{N} \sum_{k=1}^{K} x_{dtk} \times c_k \times w_d \)   (1)

Subject to:

\( \sum_{t=1}^{N} \sum_{k=1}^{s_t - 1} x_{dtk} = 0, \) where \( 1 \le d \le M \)   (2)

\( \sum_{t=1}^{N} \sum_{k=d_t + 1}^{24} x_{dtk} = 0, \) where \( 1 \le d \le M \)   (3)

\( \sum_{d=1}^{M} y_{dt} = 1, \quad 1 \le t \le N \)   (4)

\( \sum_{t=1}^{N} x_{dtk} \le 1, \quad 1 \le d \le M,\ 0 \le k \le 24 \)   (5)

\( \sum_{k=s_t}^{d_t} x_{dtk} = d_t - s_t + 1, \quad s_t < d_t \le 24 \)   (6)
Equation (1) shows the objective function of this study, which aims to minimize the total cost of electricity used. The Time-of-Use tariff and the capability of the drones are considered in this equation. Equations (2) and (3) tell us that all tasks must be processed between the s_t period, as the starting period of the first task, and the d_t period, as the finishing period of the last task. That a single task can be processed on only one drone at a time is expressed in constraint (4). Equation (5) represents our assumption that each drone handles at most one task in any period. Finally, the assumption of continuous drone operation on a task is expressed in Eq. (6). All of these constraints are used to compute the objective of this research. In principle, the objective min(TC) could be solved using the IBM CPLEX Optimizer, a high-performance mathematical programming solver for linear programming, mixed-integer programming and quadratic programming. However, because the problem is NP-hard, we cannot solve it efficiently in a short time. Therefore, a modified genetic algorithm with a job delay mechanism is proposed.
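To make Eq. (1) concrete, the following Python sketch (variable and schedule-format names are our own) evaluates TC for a schedule in which each assigned task carries its drone and its occupied periods, using the tariff of Table 1:

# c_k for k = 0..23, taken from Table 1.
C = [45.0]*10 + [90.8, 90.8, 158.9, 90.8] + [158.9]*4 + [90.8]*5 + [45.0]

def total_cost(schedule, w):
    # schedule: list of (drone d, start period, finish period), inclusive.
    # w[d]: energy consumption rate per period of drone d.
    # Every occupied period k contributes c_k * w_d, as in Eq. (1).
    return sum(C[k] * w[d] for (d, start, finish) in schedule
               for k in range(start, finish + 1))

# Two drones, three tasks: the off-peak task (periods 0-3) is cheapest.
w = [1.0, 1.5]
print(total_cost([(0, 0, 3), (1, 12, 13), (0, 19, 22)], w))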
3 Methodology

3.1 Genetic Algorithm (GA)
The Genetic Algorithm (GA) has been used in various fields of science. GA is a popular computational technique introduced by John Holland and his collaborators in the 1960s and 1970s [2]. As the first evolutionary algorithm, the genetic algorithm paved the way for contemporary evolutionary computation [12]. The idea of this method is inspired by biological evolution based on Charles Darwin's theory of natural selection. The fundamental genetic operators are crossover, mutation, and selection. Various popular methods have since been developed based on nature-inspired ideas, such as Particle Swarm Optimization, the Artificial Bee Colony Algorithm, the Firefly Algorithm, Cuckoo Search, and to date numerous others.
3.2 Genetic Algorithm with Job Delay Mechanism
In this subsection, we introduce, step by step, the genetic algorithm with the Job Delay Mechanism and its modifications.
• Individual Representation
Most studies in the scheduling field use, for each machine, an array of jobs or tasks that represents the processing order of the jobs or tasks assigned to that machine. It is also common to randomly generate the initial population in a genetic algorithm. For example, Fig. 2 shows the concept of chromosome, gene, and population. There is a population with five individuals, where each
Algorithm 1. Standard Genetic Algorithm
1: procedure Genetic Algorithm
2:   Generate an initial population
3:   while stopping criterion is unsatisfied do
4:     Select two parents from the population using tournaments
5:     Produce two children from the parents using crossover
6:     Mutate the children
7:     Evaluate fitness of the children
8:     Choose the best child
9:     if entry criteria are satisfied by the chosen child then
10:      Choose population member to be replaced
11:      The child enters into population
12:    end if
13:  end while
14: end procedure
Fig. 2. Genetic algorithm data structure.
individual has five chromosomes. From that concept we can see that a schedule is represented by an individual, with each chromosome in it representing a task. As mentioned in Sect. 2, two drones can have different processing times for an identical task, which means that they have different capabilities or performance levels for processing the same task. After we have the population, we need to assign all of the individuals' tasks to available drones. The most commonly used assignment method is the earliest completion time first (ECT) rule. In this paper, we propose two assignment rules to assign all tasks to drones. The first assignment rule comes from the first-come, first-served concept. The main idea is to assign each task t from a chromosome based on its position in the sequence. After calculating the total cost of task t on all drones, we assign task t to the drone d that gives the smallest total cost. We then remove task t from the chromosome sequence and repeat this procedure for all tasks in the chromosome. The second assignment rule comes from the possibility of an unbalanced load across drones. If there is a "best" drone that always gives the lowest total cost, all tasks will always be assigned to this drone, which makes drone utilization weak. Therefore, the idea of this rule is not to assign a task t to a drone that currently processes more tasks than the other drones, even if that drone gives the best total cost. Otherwise, the procedure is the same as in the first rule: we calculate all of the costs for each task and assign the tasks to the appropriate drones. A sketch of the first rule is given below.
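The sketch below (Python; all names are our own, and cost_on_drone is a stand-in for the real evaluation, which would consult the TOU tariff as in total_cost above) illustrates the first assignment rule:

def assign_tasks(chromosome, num_drones, processing_time, cost_on_drone):
    # chromosome: task sequence; processing_time[t][d]: periods task t needs
    # on drone d. Assigns each task, in order, to the drone with the smallest
    # total cost for it; returns drone -> list of (task, start, finish).
    schedule = {d: [] for d in range(num_drones)}
    next_free = [0] * num_drones            # first free period per drone
    for t in chromosome:
        d = min(range(num_drones),
                key=lambda dd: cost_on_drone(t, dd, next_free[dd]))
        start = next_free[d]
        finish = start + processing_time[t][d] - 1
        schedule[d].append((t, start, finish))
        next_free[d] = finish + 1
    return schedule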
When chromosome decoding has been completed, the schedule is obtained. At this stage, we can calculate the objective function of this research as the fitness function in GA. To minimize the cost, we add a shifting mechanism to find the best fitness of the chromosome. This shifting mechanism helps find a better period and avoid expensive electricity costs.
• Job Delay Mechanism
The Job Delay Mechanism (JDM) is a method to shift a specific task from a particular period to another. JDM aims to reduce the cost of a task under the tariff difference constraint. For example, a task with a high cost during the day, if shifted slightly to the evening, will incur lower costs. This shifting mechanism allows a task to be performed at the lowest possible cost. The concept of the shifting method we use is simple: it tries to find which period gives the lowest cost for every task (according to the electricity rate periods). The process starts with the last task, which is shifted to later periods as far as possible. For example, if the last task t (the last task to be completed, or in other words, the chromosome in the last position of the sequence) needs 4 periods and its previous starting period is period 15, there are 5 (24 − 15 − 4) possible shifts for task t. If during this process we find a lower cost for task t, a new starting period for task t is determined. Next, for the second-to-last task t2, we apply the same shifting method, but with the maximum shifting limit being the period when the next task starts. For example, if the best period found for the last task is 19–23, the maximum shift for task t2 is 19 minus the starting period of task t2 minus the period length of task t2. This shifting method is carried out on all existing tasks until every task finds its "best" period, as sketched in the code example below.
• Selection, Crossover and Mutation Operator
The next step, after we evaluate the objective value of each individual, is selection. In this step we select individuals to become parents that will breed offspring for the next generation. Once the selection process is done, the next step is crossover, in which we produce offspring for the next-generation population. After this step, the next step is mutation. Mutation is an operation that searches the local neighborhood for a better solution; the mutation procedure is used to maintain genetic diversity in the next generation. This study uses a swapping operation, which means we exchange two randomly selected genes of a selected chromosome. In the end, a new population is formed, which can contain better individuals than before.
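The following Python sketch (ours; names are assumptions, and the tariff re-uses Table 1) shows the delay step for one task: every feasible delayed start up to a limit is tried, and the cheapest one is kept:

TARIFF = [45.0]*10 + [90.8, 90.8, 158.9, 90.8] + [158.9]*4 + [90.8]*5 + [45.0]

def best_start(length, earliest, limit, w_d, tariff=TARIFF):
    # length: periods the task needs; earliest: its current start period;
    # limit: first period that must stay free (next task's start, or 24).
    # Returns (start, cost) minimizing the cost over the occupied span.
    def span_cost(start):
        return w_d * sum(tariff[k] for k in range(start, start + length))
    candidates = range(earliest, limit - length + 1)
    best = min(candidates, key=span_cost)
    return best, span_cost(best)

# The example from the text: a 4-period task starting at period 15 has
# 5 (24 - 15 - 4) possible shifts besides its original start.
print(best_start(length=4, earliest=15, limit=24, w_d=1.0))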
4 Result and Discussion
After the preceding procedures are completed, a schedule that satisfies the objective function TC is obtained. Figures 3 and 4 show example resulting sequences with seven tasks and two drones. The two figures show example results without and with the Job Delay Mechanism (JDM). The schedule in
Fig. 4 yields a better objective value than the schedule in Fig. 3. As we can see from these figures, the starting times of tasks 1, 2, and 3 are delayed to wait for cheaper electricity consumption periods. For this example, GA with JDM achieves a value of 158,460, compared with 188,424 without JDM.
Fig. 3. The illustration of the example result sequence from 5 tasks and 2 drones without JDM
Fig. 4. The illustration of the example result sequence from 5 tasks and 2 drones with JDM
Numerical results were analyzed by comparing the proposed method with standard GA. The evaluation was conducted by comparing the average objective values of the proposed method with those of standard GA. We implemented the proposed algorithm in the JAVA programming language on a machine with a Dual-Core Intel Core i5 processor and 16 GB of RAM. The parameters used in this research can be seen in Table 2. The Mean Absolute Deviation (MAD) is also used to measure the performance of the algorithms, based on Eq. (7). TC_avg denotes the average objective value, while TC_best denotes the best objective value reported in this research.
Table 2. Standard parameter values of GA.

Parameter                 Value
Population size           50
Number of generations     30
Probability of crossover  0.75
Probability of mutation   0.05
\( MAD = \frac{TC_{avg} - TC_{best}}{TC_{avg}} \times 100 \)   (7)
The results of the test method can be seen in Table 3. In the table, we present the outcomes of our proposed method and standard GA. The test was carried out with ten test sets, where each test has a different problem size. For each test instance, ten trials were carried out. In this research, we compare the standard Genetic Algorithm (GA) with our proposed methods, which are GA with the Job Delay Mechanism (JDM). GA+JDM1 represents GA+JDM with the first assignment rule, whilst GA+JDM2 represents GA+JDM with the second assignment rule. Based on the experiments, the proposed methods produce better solutions than the standard GA method.
Table 3. Experiment result.

Test number  Problem size  GA                    GA + JDM1             GA + JDM2
             (D × T)       TC          MAD       TC          MAD       TC          MAD
1            2 × 5         188,424     9.672     168,536     5.595     158,460     11.239
2            3 × 7         338,613     11.285    285,316     8.075     288,900     6.920
3            3 × 10        348,147     11.974    288,900     8.717     295,710     6.566
4            4 × 10        405,008     13.730    327,784     6.900     335,550     4.694
5            5 × 10        407,318     13.722    328,146     8.185     339,042     5.136
6            5 × 15        588,878     15.597    452,896     9.030     486,492     2.282
7            8 × 20        910,684     14.127    734,572     5.951     748,606     4.154
8            10 × 20       1,224,167   15.308    982,056     4.774     978,726     5.097
9            10 × 30       1,289,089   15.489    1,013,636   6.101     1,045,870   3.115
10           15 × 45       1,974,043   11.371    1,681,488   5.567     1,661,966   6.663
Also, from our experiments, we can see that neither of the two assignment rules for GA with the Job Delay Mechanism is consistently superior to the other. This can happen because a given configuration of drones and tasks may perform better with the first assignment rule, while the opposite holds for a different configuration. Therefore, we still do not know which configurations suit which assignment rule, so further investigation is needed.
5 Conclusions and Future Research
This research proposed a Genetic Algorithm (GA) modified with a Job Delay Mechanism (JDM) for the parallel UAV-scheduling problem. In implementing JDM, we also use two different assignment rule approaches. From the experiments performed in this research, it can be concluded that the proposed method can improve scheduling performance by employing JDM. Using JDM, we can delay the starting period of a task to avoid the electricity rate of a certain period. Future research directions involve the consideration of more complex methods for initialization, selection, and mutation. To improve the methodology
or application range, we want to solve the problem with flexible processing times for the tasks, uncertain electricity conditions, flexible time periods, and long-term tasks (spanning more than a day). Lastly, we also consider using other metaheuristic algorithms suitable for combinatorial problems, such as Binary Particle Swarm Optimization, the Firefly Algorithm, the Binary Bat Algorithm, or Cuckoo Search, to solve this UAV-scheduling problem.
References

1. Ebrahimi, D., Sharafeddine, S., Ho, P.H., Assi, C.: UAV-aided projection-based compressive data gathering in wireless sensor networks. IEEE IoT J. 6(2), 1893–1905 (2019). https://doi.org/10.1109/JIOT.2018.2878834
2. Holland, J.H., et al.: Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. MIT Press (1992)
3. Hu, M., et al.: On the joint design of routing and scheduling for vehicle-assisted multi-UAV inspection. Futur. Gener. Comput. Syst. 94, 214–223 (2019). https://doi.org/10.1016/j.future.2018.11.024
4. Hu, M., et al.: Joint routing and scheduling for vehicle-assisted multidrone surveillance. IEEE IoT J. 6(2), 1781–1790 (2019). https://doi.org/10.1109/JIOT.2018.2878602
5. Moon, J., Shin, K., Park, J.: Optimization of production scheduling with time-dependent and machine-dependent electricity cost for industrial energy efficiency. Int. J. Adv. Manuf. Technol. 68, 523–535 (2013)
6. Moon, J.Y., Shin, K., Park, J.: Optimization of production scheduling with time-dependent and machine-dependent electricity cost for industrial energy efficiency. Int. J. Adv. Manuf. Technol. 68 (2013). https://doi.org/10.1007/s00170-013-4749-8
7. Murray, C.C., Chu, A.G.: The flying sidekick traveling salesman problem: optimization of drone-assisted parcel delivery. Transp. Res. Part C Emerg. Technol. 54, 86–109 (2015). https://doi.org/10.1016/j.trc.2015.03.005
8. Peng, K., et al.: A hybrid genetic algorithm on routing and scheduling for vehicle-assisted multi-drone parcel delivery. IEEE Access 7, 49191–49200 (2019). https://doi.org/10.1109/ACCESS.2019.2910134
9. Toth, P., Vigo, D.: The vehicle routing problem. Soc. Ind. Appl. Math. (2002). https://doi.org/10.1137/1.9780898718515
10. Wang, F., Wang, F., Ma, X., Liu, J.: Demystifying the crowd intelligence in last mile parcel delivery for smart cities. IEEE Netw. 33(2), 23–29 (2019). https://doi.org/10.1109/MNET.2019.1800228
11. Widiyanto, D., Purnomo, D., Jati, G., Mantau, A., Jatmiko, W.: Modification of particle swarm optimization by reforming global best term to accelerate the searching of odor sources. Int. J. Smart Sens. Intell. Syst. 9(3), 1410–1430 (2016). https://doi.org/10.21307/ijssis-2017-924
12. Yang, X.S.: Genetic algorithms, Chap. 6, 2nd edn. In: Yang, X.S. (ed.) Nature-Inspired Optimization Algorithms, pp. 91–100. Academic Press (2021). https://doi.org/10.1016/B978-0-12-821986-7.00013-5
13. Yang, Z.Z., Jing, T., Meng, Q.H.: UAV-based odor source localization in multi-building environments using simulated annealing algorithm. In: 2020 39th Chinese Control Conference (CCC), pp. 3806–3811 (2020). https://doi.org/10.23919/CCC50068.2020.9189425
14. You, C., Zhang, R.: 3D trajectory optimization in Rician fading for UAV-enabled data harvesting. IEEE Trans. Wirel. Commun. 18(6), 3192–3207 (2019). https://doi.org/10.1109/TWC.2019.2911939
15. Yuan, C., Zhang, Y., Liu, Z.: A survey on technologies for automatic forest fire monitoring, detection, and fighting using unmanned aerial vehicles and remote sensing techniques. Can. J. For. Res. 45(7), 783–792 (2015). https://doi.org/10.1139/cjfr-2014-0347
A Movement Adjustment Method for DQN-Based Autonomous Aerial Vehicle

Nobuki Saito1, Tetsuya Oda2(B), Aoto Hirata1, Kyohei Toyoshima2, Masaharu Hirota3, and Leonard Barolli4

1 Graduate School of Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan
{t21jm01md,t21jm02zr}@ous.jp
2 Department of Information and Computer Engineering, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan
[email protected], [email protected]
3 Department of Information Science, Okayama University of Science (OUS), 1-1 Ridaicho, Kita-ku, Okayama 700-0005, Japan
[email protected]
4 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan
[email protected]
Abstract. The Deep Q-Network (DQN) is one of the deep reinforcement learning algorithms; it uses a deep neural network to estimate the Q-value in Q-learning. In previous work, we designed and implemented a DQN-based Autonomous Aerial Vehicle (AAV) testbed and proposed a Tabu List Strategy based DQN (TLS-DQN). In this paper, we propose a movement adjustment method for decreasing the movement fluctuations caused by TLS-DQN during autonomous movement control. The performance evaluation results show that the proposed method can decrease the movement fluctuations.
1 Introduction
The Unmanned Aerial Vehicle (UAV) is expected to be used in different fields such as aerial photography, transportation, search and rescue of humans, inspection, land surveying, observation and agriculture. The Autonomous Aerial Vehicle (AAV) [1] has the ability to operate autonomously without human control and is expected to be used in a variety of fields, similar to the UAV. So far many AAVs [2–4] have been proposed and used practically. However, existing autonomous flight systems are designed for outdoor use and rely on location information provided by the Global Navigation Satellite System (GNSS) or other sources. In environments where it is difficult to obtain position information from GNSS, it is necessary to determine a path without using position information. Therefore, autonomous
movement control is essential to achieve operations that are independent of the external environment, including non-GNSS environments such as indoors, tunnels and underground. In [5–8] the authors consider Wireless Sensor and Actuator Networks (WSANs), which can act autonomously for disaster monitoring. A WSAN consists of wireless network nodes, all of which have the ability to sense events (sensors) and perform actuation (actuators) based on the sensing data collected by the sensors. WSAN nodes in these applications are nodes with integrated sensors and actuators that have high processing power, high communication capability, high battery capacity and may include other functions such as mobility. The application areas of WSAN include AAV [9], Autonomous Underwater Vehicle (AUV) [10], Autonomous Surface Vehicle (ASV) [11], Heating, Ventilation, Air Conditioning (HVAC) [12], Internet of Things (IoT) [13], Ambient Intelligence (AmI) [14], ubiquitous robotics [15], and so on. Deep reinforcement learning [16] is an intelligent algorithm that is effective in controlling autonomous robots such as AAVs. Deep reinforcement learning is an approximation method using deep neural networks for the value function and policy function in reinforcement learning. The Deep Q-Network (DQN) is a deep reinforcement learning method using a Convolutional Neural Network (CNN) as a function approximator of the Q-values in the Q-learning algorithm [16,17]. DQN combines neural fitted Q-iteration [18,19] and experience replay [20], shares the hidden layers of the action value function across action patterns, and can stabilize learning even with nonlinear function approximators such as CNN [21,22]. However, learning progresses with difficulty for problems with complex operations and rewards, or problems where it takes a long time to obtain a reward. In this paper, we propose a movement adjustment method for decreasing the movement fluctuations caused by the Tabu List Strategy based DQN (TLS-DQN) during autonomous movement control of the DQN-based AAV. Also, we present simulation results for AAV control using TLS-DQN [23,24] considering an indoor single-path environment. The structure of the paper is as follows. In Sect. 2, we show the DQN-based AAV testbed. In Sect. 3, we describe the proposed method. In Sect. 4, we discuss the simulation results of TLS-DQN. Finally, conclusions and future work are given in Sect. 5.
2 DQN Based AAV Testbed

In this section, we discuss the quadrotor for AAV and the DQN for AAV mobility.
Fig. 1. Snapshot of AAV.
Table 1. Components of quadrotor.

Component                  Model
Propeller                  15 × 5.8
Motor                      MN3508 700 kv
Electric speed controller  F45A 32bitV2
Flight controller          Pixhawk 2.4.8
Power distribution board   MES-PDB-KIT
Li-Po battery              22.2 v 12000 mAh XT90
Mobile battery             Pilot Pro 2 23000 mAh
ToF ranging sensor         VL53L0X
Raspberry Pi               3 Model B Plus
PVC pipe                   VP20
Acrylic plate              5 mm
Fig. 2. AAV control system.
2.1 Quadrotor for AAV
For the design of the AAV, we consider a quadrotor, which is a type of multicopter. A multicopter is highly maneuverable and can operate in places that are difficult for people to enter, such as disaster areas and dangerous places. It also has the advantage of not requiring space for takeoffs and landings and of being able to hover in mid-air during flight, therefore enabling activities at fixed points. The quadrotor is a type of rotary-wing aircraft that uses four rotors for takeoff and propulsion; it can operate with less power than a hexacopter or octocopter and is less expensive to manufacture. In Fig. 1 is shown a snapshot of the quadrotor used for designing and implementing the AAV testbed. The quadrotor frame is mainly composed of polyvinyl chloride (PVC) pipe and acrylic plate. The components for connecting the battery, motor, sensor, etc. to the frame are created using an optical 3D printer. Table 1 shows the components of the quadrotor. The size specifications of the quadrotor (including the propellers) are length 87 cm, width 87 cm, height 30 cm and weight 4259 g. In Fig. 2 is shown the AAV control system. The Raspberry Pi reads the saved data of the best episode obtained when carrying out the simulations by DQN and uses telemetry communication to send commands such as up, down, forward, back, left, right and stop to the flight controller. Also, multiple Time-of-Flight (ToF) range sensors using Inter-Integrated Circuit (I2C) communication and General-Purpose Input Output (GPIO) are used to acquire and save flight data. The Flight Controller (FC) is a component that calculates the optimum motor rotation speed for flight based on the information sent from the built-in acceleration sensor and gyro sensor. The Electronic Speed Controller (ESC) is a part that controls the rotation speed of the motors in response to commands from the FC. Through this sequence, the AAV reproduces in the real world the movement obtained in simulation.
Fig. 3. DQN for AAV mobility control.
2.2 DQN for AAV Mobility
The structure of the DQN for AAV movement control is shown in Fig. 3. The DQN for AAV mobility is implemented in the Rust programming language [25]. In this work, we use a Deep Belief Network (DBN), because its computational complexity is smaller than that of a CNN for the DNN part of the DQN. The environment is set as v_i. At each step, the agent selects an action a_t from the action set of the mobile actuator nodes and observes a position v_t from the current state. The change of the mobile actuator node score r_t is regarded as the reward for the action. For reinforcement learning, we can treat these mobile actuator node sequences m_t directly as a Markov decision process, where the sequences of observations and actions are m_t = v_1, a_1, v_2, ..., a_{t-1}, v_t. A method known as experience replay is used to store the experiences of the agent at each time step, e_t = (m_t, a_t, r_t, m_{t+1}), in a dataset D = e_1, ..., e_N, cached over many episodes into an Experience Memory. Defining a discount factor γ for future rewards, the sum of future rewards until termination is \( R_t = \sum_{t'=t}^{T} \gamma^{t'-t} r_{t'} \), where T is the termination time step of the mobile actuator nodes. After running experience replay, the agent selects and executes an action according to an ε-greedy strategy. Since using histories of arbitrary length as inputs to a neural network can be difficult, the Q-function instead works on a fixed-length representation of histories produced by a function φ. The target is to maximize the action-value function Q*(m, a) = max_π E[R_t | m_t = m, a_t = a, π], where π is the strategy for selecting the best action. From the Bellman equation (see Eq. (1)), it is possible to maximize the expected value of r + γQ*(m', a') if the optimal value Q*(m', a') of the sequence at the next time step is known:

\( Q^*(m, a) = E_{m' \sim \xi}[r + \gamma \max_{a'} Q^*(m', a') \mid m, a]. \)   (1)

Instead of using an iterative updating method to optimize the equation, it is common to estimate it with a function approximator. The Q-network in DQN is a neural network function approximator with weights θ and Q(m, a; θ) ≈ Q*(m, a). The loss function used to train the Q-network is shown in Eq. (2):

\( L_i(\theta_i) = E_{m,a \sim \rho(\cdot)}[(y_i - Q(m, a; \theta_i))^2]. \)   (2)

Here y_i is the target, which is calculated from the previous iteration result θ_{i−1}, and ρ(m, a) is the probability distribution over sequences m and actions a. The gradient of the loss function is shown in Eq. (3):

\( \nabla_{\theta_i} L_i(\theta_i) = E_{m,a \sim \rho(\cdot);\, m' \sim \xi}[(y_i - Q(m, a; \theta_i)) \nabla_{\theta_i} Q(m, a; \theta_i)]. \)   (3)
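As a minimal illustration of the experience replay and ε-greedy selection just described (the testbed itself is written in Rust; this Python sketch and its names are ours), consider:

import random
from collections import deque

class ExperienceMemory:
    def __init__(self, capacity=300 * 100):        # size as in Table 2
        self.buffer = deque(maxlen=capacity)       # e_t = (m_t, a_t, r_t, m_{t+1})

    def store(self, experience):
        self.buffer.append(experience)

    def sample(self, batch_size=32):               # batch size as in Table 2
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

ACTIONS = ["up", "down", "forward", "back", "left", "right", "stop"]

def select_action(q_values, epsilon):
    # Epsilon-greedy: random move with probability epsilon, else argmax Q.
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values[a])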
We consider tasks in which an agent interacts with an environment. In this case, the AAV moves step by step in a sequence of observations, actions and rewards. Taking AAV mobility into consideration, we consider 7 movement patterns (up, down, forward, back, left, right, stop). In order to decide the reward function, we considered the Distance between AAV and Obstacle (DAO) parameter. The initial weight values are assigned by Normal Initialization [26]. The input layer uses the AAV and destination positions, the total reward values
Algorithm 1. Tabu List for TLS-DQN.
Require: The coordinate with the highest evaluated value in the section is (x, y, z).
1: if (x_before ≤ x_current) ∧ (x_current ≤ x) then
2:   tabu_list ⇐ ((x_min ≤ x_before) ∧ (y_min ≤ y_max) ∧ (z_min ≤ z_max))
3: else if (x_before ≥ x_current) ∧ (x_current ≥ x) then
4:   tabu_list ⇐ ((x_before ≤ x_max) ∧ (y_min ≤ y_max) ∧ (z_min ≤ z_max))
5: else if (y_before ≤ y_current) ∧ (y_current ≤ y) then
6:   tabu_list ⇐ ((x_min ≤ x_max) ∧ (y_min ≤ y_before) ∧ (z_min ≤ z_max))
7: else if (y_before ≥ y_current) ∧ (y_current ≥ y) then
8:   tabu_list ⇐ ((x_min ≤ x_max) ∧ (y_before ≤ y_max) ∧ (z_min ≤ z_max))
9: else if (z_before ≤ z_current) ∧ (z_current ≤ z) then
10:  tabu_list ⇐ ((x_min ≤ x_max) ∧ (y_min ≤ y_max) ∧ (z_min ≤ z_before))
11: else if (z_before ≥ z_current) ∧ (z_current ≥ z) then
12:  tabu_list ⇐ ((x_min ≤ x_max) ∧ (y_min ≤ y_max) ∧ (z_before ≤ z_max))
in Experience Memory, and the AAV movement patterns. The hidden layers are connected with 256 rectified linear units (ReLU) [27]. The output Q-values are the AAV movement patterns.
3 Proposed Method

3.1 TLS-DQN
The idea of the Tabu List Strategy (TLS) is motivated by Tabu Search (TS), proposed by F. Glover [28] to achieve an efficient search for various optimization problems by prohibiting movements to previously visited search areas in order to prevent getting stuck in local optima.

\( r = \begin{cases} 3 & \text{if } ((x_{current} = x_{global\,destination}) \land (y_{current} = y_{global\,destination}) \land (z_{current} = z_{global\,destination})) \\ & \lor (((x_{before} < x_{current}) \land (x_{current} \le x_{local\,destination})) \\ & \lor ((x_{before} > x_{current}) \land (x_{current} \ge x_{local\,destination})) \\ & \lor ((y_{before} < y_{current}) \land (y_{current} \le y_{local\,destination})) \\ & \lor ((y_{before} > y_{current}) \land (y_{current} \ge y_{local\,destination})) \\ & \lor ((z_{before} < z_{current}) \land (z_{current} \le z_{local\,destination})) \\ & \lor ((z_{before} > z_{current}) \land (z_{current} \ge z_{local\,destination}))) \\ -1 & \text{otherwise.} \end{cases} \)   (4)
In this paper, the reward value is decided by Eq. (4), where x, y and z denote the X-axis, Y-axis and Z-axis, respectively. The subscript current denotes the current coordinates of the actor node in the DQN, and before denotes the coordinates before the selected action was performed. Also, the global destination denotes the destination in the problem area, and the local destinations denote the target passage points on the way to the global destination. A sketch of this reward function is given below.
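The case analysis of Eq. (4) can be written compactly; the following Python sketch (our reading of the equation) returns 3 on reaching the global destination or on moving closer to the local destination along some axis, and −1 otherwise:

def reward(before, current, local_dest, global_dest):
    # Coordinates are (x, y, z) tuples.
    if current == global_dest:
        return 3
    for axis in range(3):
        b, c, dest = before[axis], current[axis], local_dest[axis]
        if (b < c <= dest) or (b > c >= dest):   # moved toward the target
            return 3
    return -1

assert reward((75, 75, 0), (75, 100, 0), (75, 150, 150), (75, 1925, 0)) == 3
assert reward((75, 75, 0), (75, 50, 0), (75, 150, 150), (75, 1925, 0)) == -1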
Fig. 4. Tabu rule addition method.
The considered area is partitioned based on the target passage points, and one destination is set in each area. If the current coordinate is closer to the destination than the coordinate before the move, or if the current coordinate is equal to the destination, the reward value is 3. In all other cases, the reward value is −1. The tabu list in TLS is used when an actor node of the DQN selects an action or when the reward for that action is determined. The tabu list is referred to when selecting an action whose direction of movement has been randomly determined. If the direction of movement leads into an area included in the tabu list, the actor node reselects the action. Also, the tabu list is used when the reward is determined: when the reward value is 3, a prohibited area is added to the tabu list based on the rule shown in Algorithm 1. The tabu list holds the added prohibited areas until the end of the episode and is initialized for each episode. Figure 4 shows an example of adding a prohibited area to the tabu list according to Algorithm 1. The n in Fig. 4 is a natural number and refers to the number of iterations in each episode. In Step: [n] of Fig. 4, the actor node has moved in the Y-axis direction and is closer to the destination than before the move, so (y_before < y_current) and (y_current ≤ y_local destination) in Algorithm 1 are satisfied. Therefore, the black-filled area [(x_min ≤ x_max), (y_min ≤ y_before), (z_min ≤ z_max)] is added to the tabu list. Also, in Step: [n+1], the actor node has moved in the X-axis direction and is closer to the destination than before the move, so (x_before < x_current) and (x_current ≤ x_local destination) in Algorithm 1 are satisfied. Therefore, the black-filled area [(x_min ≤ x_before), (y_min ≤ y_max), (z_min ≤ z_max)] is added to the tabu list. The search by TLS-DQN covers a wider range and performs better than search with randomly chosen movement directions. A sketch of the tabu bookkeeping is given below.
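The following Python sketch (our interpretation of Algorithm 1; the box construction mirrors the Y-axis case of Fig. 4) keeps prohibited areas as axis-aligned boxes and tests candidate moves against them:

def add_tabu_box(tabu_list, bounds, axis, before, moved_positive):
    # bounds: [(min, max)] per axis for the whole area. After moving closer
    # along `axis`, prohibit everything behind the pre-move coordinate.
    box = [list(b) for b in bounds]
    if moved_positive:
        box[axis][1] = before[axis]      # e.g. y in [y_min, y_before]
    else:
        box[axis][0] = before[axis]      # e.g. y in [y_before, y_max]
    tabu_list.append(box)

def is_tabu(tabu_list, point):
    # True if the candidate point lies inside any prohibited box.
    return any(all(lo <= point[a] <= hi for a, (lo, hi) in enumerate(box))
               for box in tabu_list)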
3.2 Movement Adjustment Method
The movement adjustment method is used for reducing the movement fluctuations caused by TLS-DQN. Algorithm 2 takes as input the movement of coordinates (X, Y, Z) in the best episode derived by TLS-DQN and generates the Adjustment Point Coordinates List. The number of coordinates included in the movement is the same as the number of iterations of the DQN. In Algorithm 2,
the Number of divided lists indicates the number of divisions of the coordinate movements; the Number of coordinates indicates how many coordinates are included in each Divided List; and (x_center, y_center, z_center) indicates the center coordinates derived from the maximum and minimum coordinate values along the X-axis, Y-axis and Z-axis included in the Divided List. A sketch in code follows Algorithm 2.

Algorithm 2. Movement Adjustment Decision.
Input: Movement Coordinates ← The movement of coordinates (X, Y, Z) by TLS-DQN
Output: Adjustment Point Coordinates List
1: Number of divided lists ← Any number.
2: Number of coordinates ← Number of Iterations in TLS-DQN / Number of divided lists
3: i ← 0, j ← 0
4: for k = 0 to Number of coordinates in Movement Coordinates do
5:   Divided List[j] ← Movement Coordinates[k]
6:   j ← j + 1
7:   if j ≥ Number of coordinates then
8:     (x_min, x_max) ← Min. and Max. values for X-axis in the Divided List
9:     (y_min, y_max) ← Min. and Max. values for Y-axis in the Divided List
10:    (z_min, z_max) ← Min. and Max. values for Z-axis in the Divided List
11:    (x_center, y_center, z_center) ← ((x_min + x_max)/2, (y_min + y_max)/2, (z_min + z_max)/2)
12:    Adjustment Point Coordinates List[i] ← (x_center, y_center, z_center)
13:    i ← i + 1, j ← 0
14:  end if
15: end for
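Algorithm 2 can be expressed compactly in Python. The sketch below (ours; it assumes the iteration count divides evenly into chunks) replaces each chunk of the best episode's path with the center of its bounding box:

def movement_adjustment(coords, number_of_divided_lists):
    # coords: list of (x, y, z) coordinates from the best TLS-DQN episode.
    chunk = len(coords) // number_of_divided_lists
    points = []
    for i in range(0, len(coords) - chunk + 1, chunk):
        part = coords[i:i + chunk]
        # Center of the bounding box of this chunk, per axis.
        center = tuple((min(c[a] for c in part) + max(c[a] for c in part)) / 2
                       for a in range(3))
        points.append(center)
    return points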
(a) From the initial placement to the global destination. (b) From the global destination to the initial placement.
Fig. 5. Snapshot of considered area.
Fig. 6. Considered area for simulation.
4 Performance Evaluation
In this section, we discuss the simulation results of TLS-DQN and the performance evaluation of the movement adjustment method.

4.1 Simulation Results of TLS-DQN
We consider for the simulations operations such as takeoffs, flights and landings between the initial position and the destination. The target environment is a corridor, an indoor single-path environment. Figure 5 shows snapshots of the area used in the simulation scenario, taken on the ground floor of Building C4 at Okayama University of Science, Japan. Figure 6 shows the considered area based on actual measurements of the area in Fig. 5. In this simulation,

Table 2. Simulation parameters of DQN.

Parameters                        Values
Number of episodes                50000
Number of iterations              2000
Number of hidden layers           3
Number of hidden units            15
Initial weight value              Normal initialization
Activation function               ReLU
Action selection probability (ε)  0.999 − (t/Number of episodes) (t = 0, 1, 2, . . ., Number of episodes)
Learning rate (α)                 0.04
Discount rate (γ)                 0.9
Experience memory size            300 × 100
Batch size                        32
Number of AAVs                    1
Fig. 7. Simulation results of rewards (reward vs. iteration for the Best, Median and Worst episodes).
Fig. 8. Visualization results (TLS-DQN and Adjustment [20], [10], [5]; initial placement, local destinations and global destination shown on the X, Y and Z axes).
the initial placement is [75, 75, 0], and the local destinations in areas 1, 2 and 3 are [75, 150, 150], [75, 1850, 150] and [75, 1925, 0] (the global destination), respectively. Table 2 shows the parameters used in the simulation. Figure 7 shows the change in reward value of the action in each iteration for the Worst, Median and Best episodes of TLS-DQN. In the Best and Median episodes, it can be seen that the reward value shows a rising trend.
(a) XY plane.
(b) Y Z plane.
Fig. 9. Simulation results of rewards.
Table 3. The distance of movement in the XY and Y Z plane.

Plane  Minimum  TLS-DQN  Adjustment 20  Adjustment 10  Adjustment 5
XY     370.00   850.00   397.61         390.94         370.65
YZ     406.19   890.00   395.06         390.39         387.91
4.2 Results of Movement Adjustment Method
Figure 8 shows the visualization results of the movement adjustment method in the Best episode of TLS-DQN, when the Number of divided lists is 20, 10 and 5. Figure 9 shows the results of TLS-DQN and the proposed movement adjustment method on the XY and Y Z planes. Table 3 shows the minimum distance of movement and the distances of movement for TLS-DQN and the proposed movement adjustment method. The distance of movement is derived from the total Euclidean distance between coordinates. The performance evaluation shows that the movement adjustment method can decrease the distance of movement and the movement fluctuations in both the XY and Y Z planes.
5 Conclusions
In this paper, we proposed a movement adjustment method for decreasing the movement fluctuations caused by TLS-DQN during autonomous movement control of the DQN-based AAV. Also, we presented simulation results for AAV control by TLS-DQN considering an indoor single-path environment. From the performance evaluation results, we conclude as follows.
– By using the proposed movement adjustment method, the movement fluctuations can be decreased.
– The distance of movement decreases by suppressing the movement fluctuations.
– The proposed method is a good approach for indoor single-path environments.
In the future, we would like to improve the TLS-DQN for AAV mobility by considering different scenarios.

Acknowledgement. This work was supported by JSPS KAKENHI Grant Number JP20K19793 and Grant for Promotion of OUS Research Project (OUS-RP-20-3).
References
1. Stöcker, C., et al.: Review of the current state of UAV regulations. Remote Sens. 9(5), 1–26 (2017)
2. Artemenko, O., et al.: Energy-aware trajectory planning for the localization of mobile devices using an unmanned aerial vehicle. In: Proceedings of the 25th International Conference on Computer Communication and Networks, ICCCN 2016, pp. 1–9 (2016)
3. Popović, M., et al.: An informative path planning framework for UAV-based terrain monitoring. Auton. Robot. 44, 889–911 (2020)
4. Nguyen, H., et al.: LAVAPilot: lightweight UAV trajectory planner with situational awareness for embedded autonomy to track and locate radio-tags, pp. 1–8. arXiv:2007.15860 (2020)
5. Oda, T., et al.: Design and implementation of a simulation system based on deep Q-network for mobile actor node control in wireless sensor and actor networks. In: Proceedings of the 31st IEEE International Conference on Advanced Information Networking and Applications Workshops, IEEE AINA-2017, pp. 195–200 (2017)
6. Oda, T., et al.: Performance evaluation of a deep Q-network based simulation system for actor node mobility control in wireless sensor and actor networks considering three-dimensional environment. In: Proceedings of the 9th International Conference on Intelligent Networking and Collaborative Systems, INCoS 2017, pp. 41–52 (2017)
7. Oda, T., et al.: A deep Q-network based simulation system for actor node mobility control in WSANs considering three-dimensional environment: a comparison study for normal and uniform distributions. In: Proceedings of the CISIS 2018, pp. 842–852 (2018)
8. Saito, N., et al.: Design and implementation of a DQN based AAV. In: Proceedings of the 15th International Conference on Broadband and Wireless Computing, Communication and Applications, BWCCA 2020, pp. 321–329 (2020)
9. Sandino, J., et al.: UAV framework for autonomous onboard navigation and people/object detection in cluttered indoor environments. Remote Sens. 12(20), 1–31 (2020)
10. Scherer, J., et al.: An autonomous multi-UAV system for search and rescue. In: Proceedings of the 6th ACM Workshop on Micro Aerial Vehicle Networks, Systems, and Applications, DroNet 2015, pp. 33–38 (2015)
11. Moulton, J., et al.: An autonomous surface vehicle for long term operations. In: Proceedings of MTS/IEEE OCEANS, pp. 1–10 (2018)
12. Oda, T., et al.: Design of a deep Q-network based simulation system for actuation decision in ambient intelligence. In: Proceedings of the 33rd International Conference on Advanced Information Networking and Applications, AINA 2019, pp. 362–370 (2019)
13. Oda, T., et al.: Design and implementation of an IoT-based e-learning testbed. Int. J. Web Grid Serv. 13(2), 228–241 (2017)
14. Hirota, Y., et al.: Proposal and experimental results of an ambient intelligence for training on soldering iron holding. In: Proceedings of the 15th International Conference on Broadband and Wireless Computing, Communication and Applications, BWCCA 2020, pp. 444–453 (2020)
15. Hayosh, D., et al.: Woody: low-cost, open-source humanoid torso robot. In: Proceedings of the 17th International Conference on Ubiquitous Robots, ICUR 2020, pp. 247–252 (2020)
16. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015)
17. Mnih, V., et al.: Playing Atari with deep reinforcement learning, pp. 1–9. arXiv:1312.5602 (2013)
18. Lei, T., Ming, L.: A robot exploration strategy based on Q-learning network. In: IEEE International Conference on Real-time Computing and Robotics, IEEE RCAR 2016, pp. 57–62 (2016)
19. Riedmiller, M.: Neural fitted Q iteration – first experiences with a data efficient neural reinforcement learning method. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 317–328. Springer, Heidelberg (2005). https://doi.org/10.1007/11564096_32
20. Lin, L.J.: Reinforcement learning for robots using neural networks. Technical report, DTIC Document (1993)
21. Lange, S., Riedmiller, M.: Deep auto-encoder neural networks in reinforcement learning. In: Proceedings of the International Joint Conference on Neural Networks, IJCNN 2010, pp. 1–8 (2010)
22. Kaelbling, L.P., et al.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1–2), 99–134 (1998)
23. Saito, N., et al.: Proposal and evaluation of a Tabu list based DQN for AAV mobility. In: Proceedings of the 9th International Conference on Emerging Internet, Data & Web Technologies, EIDWT 2021, pp. 189–200 (2021)
24. Saito, N., et al.: A Tabu list strategy based DQN for AAV mobility in indoor single-path environment: implementation and performance evaluation. IoT J. 14, 100394 (2021)
25. Takano, K., et al.: Design of a DSL for converting Rust programming language into RTL. In: Proceedings of the 8th International Conference on Emerging Internet, Data & Web Technologies, EIDWT 2020, pp. 342–350 (2020)
26. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, AISTATS 2010, pp. 249–256 (2010)
27. Glorot, X., et al.: Deep sparse rectifier neural networks. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, AISTATS-2011, pp. 315–323 (2011)
28. Glover, F.: Tabu search – part I. ORSA J. Comput. 1(3), 190–206 (1989)
A Self-learning Clustering Protocol in Wireless Sensor Networks for IoT Applications Nhat Tien Nguyen1,2(B) , Thien T. T. Le2 , Miroslav Voznak1 , and Jaroslav Zdralek1 1 VSB Technical University of Ostrava, 17 Listopadu 2172/15, 708 00 Ostrava, Czech Republic
[email protected], {miroslav.voznak,jaroslav.zdralek}@vsb.cz 2 Faculty of Electronics and Telecommunications, Sai Gon University, Ho Chi Minh City, Vietnam [email protected]
Abstract. The integration of wireless sensor networks (WSN) and the Internet of Things (IoT) performs many tasks to control or monitor the surrounding area or the environment. A WSN-based IoT consists of many connected sensor nodes which transmit the collected data about the environment to the manager through the Internet. The network topology requires highly reliable connections while requiring low energy consumption at the sink node and a long network lifetime. In this paper, we introduce a self-learning clustering protocol to discover the neighbors and the network topology. The cluster head is selected based on the information of the neighbors and the residual energy of the node. The maximum number of cluster members is set according to the network density. The proposed protocol can adapt to the changes of the dynamic network with low energy consumption, therefore ensuring the network connectivity. The simulation results show that the proposed clustering protocol performs well in terms of long network lifetime and high throughput when compared to other clustering protocols.
1 Introduction

Wireless sensor networks have been deployed in many applications to monitor the environment or collect data about objects. The sensors are categorized into many types, such as static or mobile devices, which can be deployed in small or large areas to collect information about their deployment area. In the era of the Internet of Things (IoT), the sensors can be deployed for many applications such as sensing, monitoring, and controlling [1]. WSN-based IoT has been considered as the integration of WSN into IoT, which is composed of the WSN, the gateway server, middleware, and mobile clients [2, 3]. WSN-based IoT can be deployed in smart cities to control or monitor the environment and building areas with low energy consumption [4–6]. Some WSN-based IoT networks can be described as Device-to-Device (D2D) or Machine-to-Machine (M2M) connections, which have attracted considerable attention from the research community. The quality of service (QoS) of the network can be considered in terms of good connectivity, long lifespan, high throughput, and low latency. The IoT objects or the sensors are randomly distributed in the network; therefore, the network lifetime can be
affected by the link connectivity and the transmission activities of the sensors. In WSN-based IoT, the network topology can be divided into many clusters, which saves energy consumption as well as extends the lifespan [7, 8]. The sensor nodes can be grouped into clusters in which each cluster consists of a cluster head and several cluster members [5–9]. The network consists of many sensors which collect the information of their location and then forward it to the sink. In LEACH, the cluster head is selected by using a random value [7]. In [8], the modified LEACH selects some nodes to become cluster relays, which forward data from one cluster to the sink. An effective cluster head selection algorithm may result in a long network lifespan and high throughput, which satisfies the QoS requirements. Another work has selected the cluster head by considering the network topology [9]. In this paper, the network scenario of WSN-based IoT applications consists of sensor nodes, routers and a sink. We propose a Self-learning Clustering Protocol (SLCP) to maintain the network topology by considering the number of neighbors and the residual energy of a node. The cluster head is selected by using the number of neighbors, the average distance to the neighbors, and the residual energy of the node. The maximum number of cluster members is defined by the density of nodes, which ensures the strong connectivity of nodes. The rest of the paper is organized as follows. In Sect. 2, the Self-learning Clustering Protocol (SLCP) is introduced in detail. In the next section, the network performance is evaluated in comparison to other protocols. Finally, the paper is concluded in Sect. 4.
2 A Self-learning Clustering Protocol in WSN-IoT

2.1 Network Model

The network consists of a sink and multiple nodes which are deployed in an area of L(m) × L(m). The sensor nodes support IoT applications which send the collected data to the sink. The number of nodes in the network is N, where each node is denoted as ni. Each node knows its location and the sink location. Each cluster has one cluster head (CH) and M cluster members (CM). A node which does not belong to any cluster is called a cluster relay (CRe). The CRe can directly transmit data to the sink. Some reasonable assumptions can be adopted as follows:
• Nodes are randomly distributed in the area of L × L meters with the same initial energy 0.5 J. The location of the sink is fixed.
• The sensor nodes have certain limitations on energy, storage, radio communication capabilities, and bandwidth.
• The CM will transmit data to the CH and, then, the CH forwards the aggregated data to the sink. The CH is selected according to the residual energy, the number of neighbors, and the average distance to the neighbors.
• The CH maintains the number of CMs based on the network density.
2.2 Energy Consumption Model

We assume that the energy consumption model follows the energy consumption model in [7]. The energy consumed to transmit an l-bit message (Etx) and to receive this message (Erx) are respectively given by [7] as follows:

Etx = l · Eelec + (l/R) · Pt  (1)
Erx = l · Eelec  (2)
where Eelec is the electronics energy dissipated per bit, R denotes the transmission rate (bit/s), and Pt is the transmitted power. Here, l/R expresses the time for sending the message.

2.3 The Self-learning Clustering Protocol (SLCP)

The SLCP. The flowchart of the SLCP is shown in Fig. 1. The SLCP consists of three phases: phase 1 is neighbor discovery; phase 2 is cluster head selection and cluster formation; phase 3 is the data transmission phase.
Fig. 1. A self-learning clustering protocol in WSN for IoT applications.
Neighbor Discovery. In this phase, each node broadcasts a HELLO message which consists of the node ID (ni), the location of the node (xi, yi), and the residual energy of the node (Eres(ni)).
Fig. 2. The neighbor discovery procedure.
If any node nj is in the transmission range of ni, node nj will successfully receive the HELLO message. Then, node ni will add node nj to its neighbor list as follows: {Nei(i) = Nei(i) ∪ nj | d(i,j) < Tx}, and vice versa (Fig. 2).

Cluster Head Selection and Cluster Formation Procedure. The cluster head is chosen according to its residual energy, number of neighbors and average distance to the neighbors. The number of neighbors is obtained after receiving the "HELLO" messages as in Phase 1. The average distance of node ni is the average of the distances to all its neighbors, which is calculated as follows: {dst(ni) = average(distance(ni, nj)) | nj ∈ Nei(ni)}. In order to control the network topology, the average number of cluster members (M) is set according to the density of nodes in the network. The density of nodes in the network is denoted as K, which is calculated as follows:

K = (L × L) / (π × Tx²)  (3)
M ≤ K  (4)

where L × L is the network area and Tx is the transmission range of a node.
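For concreteness, here is a minimal Python sketch of Eqs. (3)–(4): it derives the density bound K from the deployment area and transmission range, and caps the cluster size M accordingly. The function name is ours, and the transmission range Tx = 10 m is an assumed value, not one from the paper.

```python
import math

def max_cluster_members(area_side_m, tx_range_m):
    """K = (L * L) / (pi * Tx^2): the density-based bound on cluster size (Eqs. 3-4)."""
    return (area_side_m * area_side_m) / (math.pi * tx_range_m ** 2)

K = max_cluster_members(area_side_m=50, tx_range_m=10)  # Tx is an assumption
M = math.floor(K)  # choose the number of cluster members so that M <= K
print(K, M)        # ~7.96, 7
```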
Each node can play the role of cluster head by calculating the probability of cluster head selection. The probability of node ni to become a cluster head is called pCH(i), which is calculated using the desired node degree, the number of neighbors, and the residual energy as follows.

(5)

α + β + γ = 1  (6)
where popt is the optimal probability to become a CH, Eres is the residual energy, Einit is the initial energy, dst is the average distance to all neighbors, Tx is the transmission
range, NK is the number of neighbors, and K is the maximum number of cluster members; α, β, γ are the coefficients of the energy, the average distance and the number of cluster members, respectively. In order to balance the number of cluster heads in the network, we denote by G(r) the set of nodes that have not been CH in the previous rounds r, as in the LEACH protocol in [7, 8]. Each node generates a random number (rdn) chosen between 0 and 1. If the random number is less than the threshold Pth(ni), the node becomes a CH. The probability threshold to choose a cluster head is defined by
(7)

In phase 2, each node generates the random number rdn and then compares it to Pth(ni) – the probability threshold to become a cluster head – as in Fig. 3. If rdn is less than Pth(ni), node ni elects itself as a cluster head. Then, node ni sends the join request "JOIN_RQT" to its neighbors. Otherwise, node ni can become a cluster member or a cluster relay. If node ni receives a "JOIN_RQT" from one CH, node ni will join the cluster by sending the message "JOIN_ACK" to the CH. If node ni does not receive any "JOIN_RQT", it becomes a CRe and directly sends data to the sink. If node ni receives more than one "JOIN_RQT", node ni will select the CH with the minimum distance to join. In phase 3, the CH schedules the transmission of each CM in a TDMA frame. Each CM wakes up and transmits data during the predefined slot. The CH forwards the aggregated data to the sink node by performing CSMA/CA.
3 Performance Evaluation

3.1 Simulation Environment

In our simulations, nodes are uniformly deployed at random in a 50 × 50 m² field [3]. The sink node is fixed at the center of the network area. The percentage of cluster heads varies over the network during the simulation time. Each node calculates the density of nodes and the probability to become a cluster head as in (4) and (5), respectively. For WSN-based IoT applications, we assume the network consists of N heterogeneous nodes. Any node can elect to be a cluster head or a cluster relay. The energy consumption model for transmitting and receiving data is calculated as in [7, 8]. We compare the SLCP to LEACH and Modified LEACH in [7, 8]. The detailed simulation parameters are listed in Table 1. In order to focus on energy efficiency, we set the coefficient of energy to 0.5 and the other coefficients to 0.25.
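The per-message energy bookkeeping used in such simulations follows directly from Eqs. (1)–(2) of Sect. 2.2. The Python fragment below is our own hedged illustration: Eelec and the packet size come from Table 1, while the transmitted power Pt and rate R are assumed inputs rather than values from the paper.

```python
E_ELEC = 5e-9  # J/bit, per Table 1

def tx_energy(l_bits, p_t_watt, rate_bps):
    """Eq. (1): energy to transmit an l-bit message."""
    return l_bits * E_ELEC + (l_bits / rate_bps) * p_t_watt

def rx_energy(l_bits):
    """Eq. (2): energy to receive an l-bit message."""
    return l_bits * E_ELEC

# 4000-bit packet (Table 1); Pt and R are illustrative assumptions.
print(tx_energy(4000, p_t_watt=1e-3, rate_bps=250_000))
print(rx_energy(4000))
```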
Fig. 3. The cluster formation procedure.
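To make the phase-2 decision logic of Fig. 3 concrete, the following Python sketch mirrors the branching described above. Since the closed forms of pCH(i) in (5) and Pth(ni) in (7) are not reproduced here, the threshold is passed in as a precomputed value; all names are our own, not from the paper's implementation.

```python
import random

def phase2_decision(node, p_th, join_requests):
    """One node's cluster-formation decision following Fig. 3.

    node: dict with a 'distance_to' map of advertising CH candidates.
    p_th: precomputed threshold Pth(n_i) from Eq. (7).
    join_requests: ids of CHs whose JOIN_RQT this node received.
    """
    if random.random() < p_th:
        return ("CH", None)            # elect itself; broadcast JOIN_RQT
    if not join_requests:
        return ("CRe", None)           # cluster relay: send directly to the sink
    # Join the nearest advertising CH and reply with JOIN_ACK.
    nearest = min(join_requests, key=lambda ch: node["distance_to"][ch])
    return ("CM", nearest)

node = {"distance_to": {"ch1": 12.0, "ch2": 7.5}}
print(phase2_decision(node, p_th=0.1, join_requests=["ch1", "ch2"]))
```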
3.2 Simulation Results

Network Lifetime. The network lifetime is measured by the number of alive nodes in the network to evaluate the network performance. The network lifetime is calculated as the duration until half of the nodes die due to the energy depletion of their batteries. The simulation result in Fig. 4 shows that the number of alive nodes in SLCP is higher than that of LEACH and Modified LEACH. In SLCP, the cluster head is selected according to the number of neighbors, which may change due to mobility. Therefore, the SLCP can select an effective CH, which saves energy consumption during transmission.

Residual Energy. In Fig. 5, the average residual energy of SLCP is higher than that of LEACH and Modified LEACH. The number of CMs is limited to a predefined number, which results in a lower number of transmissions in the network. The CH is selected based on residual energy; therefore, a node with low residual energy will not become
Table 1. Simulation parameters.

Parameter | Value
Network size | 50 m × 50 m
Number of nodes | 100
Packet size | 4000 bit
Probability popt | 0.5
α (the coefficient of energy) | 0.5
β (the coefficient of distance) | 0.25
γ (the coefficient of cluster member) | 0.25
Initial energy E0 | 0.5 J
Energy for data aggregation | 5 nJ/bit/signal
Transmitting and receiving energy Eelec | 5 nJ/bit
Fig. 4. The number of live nodes.
CH. In addition, the CM will select the CH with the minimum distance to join, which results in lower energy consumption during transmission. In phase 2, the CHs are selected according to the network topology in terms of the network density and the number of neighbors. If there is any dead node nearby, the CH learns this from the received "JOIN_ACK" messages. In consequence, the CH of the previous round may become CH again in the next round, which reduces the number of broadcast messages. Each node also calculates the density of nodes in the network; therefore, the CH will select a reasonable set of cluster members, which prolongs the network connectivity and increases the number of received packets.
The Successfully Received Packets at the Sink. The number of successfully received packets at the sink is the total number of packets delivered from the nodes to the sink node, as shown in Fig. 6. The SLCP achieves better performance than LEACH and Modified LEACH in terms of more successfully received packets. The SLCP achieves a longer network lifetime and higher residual energy, which ensures more transmitted and received packets.
Fig. 5. Average residual energy at one node.
Fig. 6. Total successfully received packets at the sink.
4 Conclusions

In this paper, we have proposed the SLCP algorithm to maintain the number of cluster members while considering the changes of the network topology. The sensors for IoT applications require a long network lifespan as well as high residual energy. The simulation evaluation shows that the SLCP is effective in a dynamic network by learning the network topology. Therefore, the SLCP outperforms the other clustering protocols in terms of longer network lifetime and more successfully received packets at the sink node. In our future work, we will improve the cluster head selection in order to adapt to the dynamic network with low energy consumption. The network density is affected by the mobility of nodes, which will also be considered as another factor of the network lifetime. Due to mobility, the cluster members will be re-selected to optimize the cluster, which will reduce the energy consumption.

Acknowledgments. The authors would like to thank the anonymous reviewers for the helpful comments and suggestions. This research was supported by the Ministry of Education, Youth and Sports of the Czech Republic under the grant SP2021/25 and e-INFRA CZ (ID:90140). Correspondence should be addressed to Nhat Tien Nguyen ([email protected]).
References
1. Ghayvat, H., Mukhopadhyay, S., Gui, X., Suryadevara, N.: WSN- and IoT-based smart homes and their extension to smart buildings. Sensors 15(5), 10350–10379 (2015)
2. Khalil, N., Abid, M.R., Benhaddou, D., Gerndt, M.: Wireless sensors networks for Internet of Things. In: 2014 IEEE Ninth International Conference on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), pp. 1–6. IEEE (2014)
3. Bajaj, K., Sharma, B., Singh, R.: Integration of WSN with IoT applications: a vision, architecture, and future challenges. In: Rani, S., Maheswar, R., Kanagachidambaresan, G.R., Jayarajan, P. (eds.) Integration of WSN and IoT for Smart Cities. EICC, pp. 79–102. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-38516-3_5
4. Sharma, H., Haque, A., Blaabjerg, F.: Machine learning in wireless sensor networks for smart cities: a survey. Electronics 10(9), 1012 (2021)
5. Bensaid, R., Said, M.B., Boujemaa, H.: Fuzzy C-means based clustering algorithm in WSNs for IoT applications. In: 2020 International Wireless Communications and Mobile Computing (IWCMC), pp. 126–130. IEEE (2020)
6. Asiri, M., Sheltami, T., Al-Awami, L., Yasar, A.: A novel approach for efficient management of data lifespan of IoT devices. IEEE Internet Things J. 7(5), 4566–4574 (2019)
7. Heinzelman, W., Chandrakasan, A., Balakrishnan, H.: Efficient routing protocols for wireless microsensor networks. In: Proceedings of the 33rd Hawaii International Conference on System Sciences (HICSS), USA, p. 10 (2000)
8. Nguyen, N., Ho, C.V., Le, T.T.T.: A topology control algorithm in wireless sensor networks for IoT-based applications. In: 2019 International Symposium on Electrical and Electronics Engineering (ISEE), pp. 141–145 (2019). https://doi.org/10.1109/ISEE2.2019.8921357
9. Alharbi, M.A., Kolberg, M., Zeeshan, M.: Towards improved clustering and routing protocol for wireless sensor networks. EURASIP J. Wirel. Commun. Netw. 2021(1), 1–31 (2021). https://doi.org/10.1186/s13638-021-01911-9
The Effect of Agents’ Diversities on the Running Time of the Random Walk-Based Rendezvous Search Fumiya Toyoda(B) and Yusuke Sakumoto Graduate School of Science and Technology, Kwansei Gakuin University 2-1 Gakuen, Sanda, Hyogo 669-1337, Japan {fumiya.toyoda,sakumoto}@kwansei.ac.jp
Abstract. A rendezvous search aims to make multiple searchers existing on different nodes of a network meet efficiently. The blind type of rendezvous search is lightweight since it only uses part of the network information (the adjacency information of each node), and is expected to be used for various kinds of networks. As such a blind rendezvous search, we have proposed a random walk-based rendezvous search (RRS) that utilizes agents performing random walks with the preferential selection of high degree nodes. In our previous work, we have conducted the analysis and the experiment of an RRS to clarify its characteristics under the simple circumstance where all agents start at the same time, and move every unit time by random walks with the same stochastic rule for selecting a next node from the adjacent nodes. In this paper, as the first step to understand an RRS under complex circumstances, we investigate the effect of agents' diversities on the running time of an RRS through several experiments. As the result of the experiments, we clarify the following things: (a) the running time of RRSes is almost the same when agents start random walks from different nodes at different times, (b) the difference of the stochastic rules to select a next node from adjacent nodes greatly affects the running time of an RRS, and (c) the difference of the frequencies of moving random walk agents greatly affects the running time of an RRS.
1 Introduction

A rendezvous search [1] is used for multiple searchers existing on different nodes of a network to meet efficiently. Many studies of rendezvous search have been conducted [2–6]. Among those, a blind rendezvous search aims to find other searchers using only part of the network information (i.e., the adjacency information among nodes), and is expected to be used for various kinds of networks such as unstructured peer-to-peer networks, mobile ad-hoc networks, and social networks. However, as far as we know, an efficient way to perform a blind rendezvous search has not been clarified yet. A flooding-based rendezvous search (FRS) is the simplest blind rendezvous search. When two searchers aim to meet with an FRS, either one searcher or both searchers existing on different nodes generate messages packed with their location information. A node having the message transfers it to all its adjacent nodes. If the messages of different searchers arrive at the same node simultaneously, the FRS can finish because
Fig. 1. Flooding-based rendezvous search (FRS)
Fig. 2. Random walk-based rendezvous search (RRS)
each searcher can know the location information of the other searcher from the message. Figure 1 shows the example of an FRS. In this FRS, the searchers exist on nodes a and b, respectively. Each searcher generates a message packed with its own location information, and transfers it on the network. The FRS can theoretically accomplish the smallest search time, but it has the major drawback that the network load is huge since a tremendous amount of messages is diffused to all the nodes of the network. In [7] we have proposed a random walk-based rendezvous search (RRS) that utilizes random walk agents to realize an efficient blind rendezvous search. In an RRS, each of the searchers existing on two different nodes generates multiple agents, and the agents perform random walks with the preferential selection of high degree nodes. More specifically about the preferential selection of a high degree node: when an agent selects a next node from the adjacent nodes, it is likely to select a high degree node more preferentially than a lower degree node. If agents starting from different nodes meet at the same node, an RRS can finish because each searcher can know the location information of the other searcher from the agent. Figure 2 shows the example of an RRS. In this RRS, the searchers exist on nodes a and b, and they generate two agents and one agent, respectively. Their agents perform independent random walks. In our previous works [7, 8], we have clarified the essential characteristics of an RRS and the difference of the performance (i.e., running time and search load) between an RRS and an FRS through analyses and experiments. However, in our previous works, we have conducted the analysis and the experiment of an RRS to clarify its characteristics under the circumstance where all agents start at the same time, and move every unit time following random walks with the same stochastic rule for selecting a next node from the adjacent nodes. The characteristics of an RRS are understood under such a simple circumstance. To deepen our understanding of an RRS, we should investigate its characteristics under complex circumstances. In this paper, as the first step to understand an RRS under complex circumstances, we investigate the effect of agents' diversities on the running time of an RRS through several experiments. While there are many kinds of diversities of agents, we choose the following three representative diversities and conduct the experiments using agents with these diversities. Firstly, we investigate the effect of agents' diversity about the starting times of the random walks. Secondly, we clarify the effect of the difference of the stochastic rules for selecting adjacent nodes
on the running time of an RRS, through the experiment in which agents use different preferential selections of high degree nodes. Finally, we investigate the effect of agents' diversity about the frequencies of moving the random walk agents. In this experiment, we adjust the frequency of moving a random walk agent by having the agent probabilistically wait at the current node. As the result of the experiments, while the difference of the starting times of the random walks does not affect the running time of an RRS, the difference of the stochastic rules for selecting adjacent nodes and of the frequencies of moving random walk agents greatly affects it. Furthermore, through our consideration based on the results of the experiments, we show that the following two things are necessary to minimize the running time of an RRS: (a) the same stochastic rule for selecting adjacent nodes should be set, and (b) agents should be set to wait with appropriate probabilities. This paper is organized as follows. In Sect. 2, we explain random walks on a network, which are necessary to understand an RRS. Next, in Sect. 3, we explain the overview of an RRS. Next, in Sect. 4, we perform several experiments to investigate the effect of the agents' diversities. Finally, in Sect. 5, we conclude this paper and discuss future work.
2 Random Walk on a Network

In an RRS, each agent performs a random walk on a network. In order to understand an RRS, we explain random walks on a network. The symbols necessary for the explanation are summarized in Table 1.

Table 1. Definition of symbols

Network | G
Set of nodes | V
Set of links | E
Number of nodes | n
Set of adjacent nodes of node i | ∂i
Degree of node i | ki
Weighted degree of node i | di
Weight of link (i, j) | wij
Transition probability from node i to node j | pi→j
Expected first meeting time with starting nodes a and b | μa,b
l-th power sum of weighted degrees | sl
Let G = (V, E) be the network where V and E are the sets of nodes and links, respectively. We denote the number of elements of the node set of G (the number of nodes) by n. Let ∂i be the set of the adjacent nodes of node i, and ki be |∂i| (the degree of node i). Link (i, j) ∈ E is weighted by wij. Weighted degree di of node i is defined by

di := ∑j∈∂i wij.  (1)
The random walk agent on network G moves from node i to node j following probability pi→j. Transition probability pi→j is given by

pi→j = wij / ∑l∈∂i wil.  (2)
In [9], we analyzed the expected time μa,b of two random walk agents starting from nodes a and b (a ≠ b) until first meeting at the same node on network G. According to the analysis result, first meeting time μa,b is approximated by

μa,b ≈ s1² / s2,  (3)
where sl is the l-th power sum of the weighted degrees di of the nodes i, and more specifically, is defined by

sl := ∑i∈V di^l.  (4)
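As a quick numeric illustration of Eqs. (3)–(4), the following Python sketch computes the approximation μa,b ≈ s1²/s2 from the weighted degrees of a small graph; the example degree values are our own, not from the paper.

```python
def meeting_time_approx(weighted_degrees):
    """mu_{a,b} ~= s1^2 / s2, with s_l the l-th power sum of weighted degrees."""
    s1 = sum(weighted_degrees)
    s2 = sum(d ** 2 for d in weighted_degrees)
    return s1 ** 2 / s2

# Heterogeneous degrees lower s1^2/s2, i.e., agents are expected to meet sooner.
print(meeting_time_approx([1, 1, 1, 1]))   # homogeneous: 4.0
print(meeting_time_approx([6, 1, 1, 1]))   # heterogeneous: ~2.08
```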
3 Random Walk-Based Rendezvous Search (RRS)

3.1 Overview

In [7], we have proposed an RRS that utilizes random walk agents, and investigated the characteristics of an RRS through the analysis and experiments. In [7], each agent performs the stochastically same random walk. Namely, all agents start at the same time, and move every unit time following random walks with the same stochastic rule for selecting a next node from the adjacent nodes. However, this is impractical because the searchers on different nodes need to negotiate in advance to perform the same random walks. Therefore, in this paper, in order to investigate the characteristics of an RRS, we suppose that each agent performs a more complex random walk than in our previous work. We assume that two searchers exist on node a and node b where a ≠ b. Each searcher generates na and nb agents performing the random walks explained in Sect. 2. If the agents starting from different nodes meet at the same node, the searchers can learn the location information of the other side through the agents' exchanging of the searchers' location information. In this case, the running time of an RRS is determined by the expected first meeting time μa,b.

3.2 Procedure of a Random Walk Agent

We explain the procedure of a random walk agent discussed in this paper. The definition of the symbols necessary for the explanation is shown in Table 2. In this paper, we discuss an agent behaving more complexly (it waits at the current node following probability ηs). By changing waiting probability ηs, we can adjust the frequency of moving random walk agents. An agent starts moving from node s ∈ {a, b} where the searchers exist. Every unit time, an agent existing on node i ∈ V performs the following procedure to move to an adjacent node j with the stochastic rule.
Table 2. The definition of symbols concerning an agent starting from node s in RRS

Starting time of an agent | ts
Number of agents | ns
Preferential selection of a high degree node | αs
An agent's waiting probability | ηs
pi→ j =
(5)
where αi ≥ 1, and means the strength of preferentially selecting a high degree node as a next node in the random walk. Equation (5) can be derived by giving link degree wi j by wi j = (ki k j )αi −1 .
(6)
Table 3. Parameter configuration Number of nodes
n
1,000
Average degree
kavg 8
Strength of preferentially selecting of a high degree node αs
6
Number of agents
ns
1
Agent’s waiting probability
ηs
0
4. Reiterate step 1 to 3 until agents starting from different nodes meet at the same time. 3.3
Characteristics
In [9], we analyzed the running time of an RRS under the situation where all agents perform the stochastically same random walk, by using the approximated first meeting time given by Eq. (3). According to the analysis result in [9], we clarified that the running time of RRSes becomes a simply decreasing function of αs since the value of the right side in Eq. (3) decreases as αs increases. However, contrary to the analysis result, the running time of RRSes in the experiment of [7] becomes the convex function of αs , since Eq. (3) is inaccurate when αs is an exceedingly large value. According to the experiment results, αs of an RRS should be carefully set to avoid the large running time.
The Effect of Agents’ Diversities on the Running Time
163
4 Experiment In this section, we conduct the experiment of simulation using agents with the three diversities to be represented, and investigate the effect of agents’ diversities on the running time of an RRS. Firstly, by conducting the experiment that agents of node a and b start at different times, we examine the effect of agents’ diversity about the starting times of random walks on the running time of an RRS. Next, by conducting the experiment that agents of searchers existing on different nodes a and b use the different strength αs of preferentially selecting a high degree node, we examine the effect of agents’ diversity about stochastic rules for selecting adjacent nodes on the running time of an RRS. Finally, by conducting the experiment agents of the different nodes use the different waiting probabilities ηs , we examine the effect of the agents’ diversity about frequencies of moving random walk agents. We use the parameter configuration shown in Table 3 as a default parameter configuration. 4.1 Procedures of Experiment Firstly, we generate network G for an experiment. In this paper, we use the random networks with the number n of nodes and average degree kavg following the BA (Barab´asi– Albert) model [10] and the ER (Erd¨os–R´enyi) model [11]. In order to adjust average degree kavg of the BA network, the number n0 of nodes as the initial cluster in the BA model and the number m of additional links of the new node are given by n0 = m, kavg m= . 2
(7) (8)
In order to adjust average degree kavg of the ER network, probability pER of link connections is given by pER =
kavg . n−1
(9)
Next, we calculate the average running time of RRSes using the following procedures. 1. Randomly set node a and node b on G (a, b ∈ V and a = b). 2. Perform an RRS, and calculate the running time of RRS. 3. Reiterate steps 1 to 2, and calculate the mean of the running time of RRSes. The reiteration continues until the 95% confidence interval to the mean becomes sufficiently small. 4.2 The Effect of Agents’ Diversity About Starting Times on the Running Time of RRS By conducting the experiment that agents of node a and node b start at different times, we examine the effect of agents’ diversity about the starting times of random walks on
164
F. Toyoda and Y. Sakumoto
Running time of RRS[step]
the running time of an RRS. In this experiment, we set ta = 0 and change tb to a negative direction, and by doing this, investigate the effect of different times when each agent starts. Also, we define the difference of agents’ starting times ta and tb as Δ t := |ta −tb |. Moreover, we calculate the running time of an RRS using the criteria that both agents start, as t = ta = 0 Figure 3 shows that the average running time of RRSes when changing difference Δ t of starting times. The average running time for average degree kavg = 2 is calculated by using the running times of the experiments if the distance (number of hops) between nodes a and b is even. This is because network G with average degree kavg = 2 is the tree, and has no cycles. Thus, as in Fig. 4, the variation in distance between nodes where the agents exist is always ±2 at t = ta ≥ 0 after both agents have already started. Therefore, if the distance between node a and b is odd, each agent never meets at the same node. This phenomenon is ascribed to the effect of experiment’s setting where both agents are synchronized and move every unit time, so this is not essential. Hence, in order to calculate the average running time for average degree kavg = 2, we use the experiment results aside from the case where both agents never meet. According to the result shown in Fig. 3, even when difference Δ t of starting times increases, the running time of RRS is only slightly changed. The reason why for this is 800 600 400 kavg = 2 kavg = 6 kavg = 8 kavg = 10
200 0
1
10 Δt
100
Fig. 3. Average running time of RRSes in the BA network with various average degree kavg when changing difference Δ t of starting times 14
kavg = 2
12 Distance
10 8 6 4 2 0
0
2000
4000
6000
8000
10000
t
Fig. 4. Distance between both agents in the BA network with average degree kavg = 2 when changing the time
that a random walk of each agent is a kind of Markov chain, so the condition of the starting times does not significantly affect the characteristics of an RRS. Therefore, the difference of agents' starting times hardly affects the running time of an RRS.

4.3 The Effect of the Difference of Agents' Stochastic Rules for Selecting Adjacent Nodes on the Running Time

Next, by conducting the experiment in which agents starting from different nodes use different strengths αs of preferentially selecting high degree nodes, we examine the effect of the difference of the stochastic rules for random walk agents' selecting adjacent nodes on the running time of an RRS. Figures 5, 6, and 7 show the average running time of RRSes in the BA network and the ER network with various average degrees kavg when changing strengths αa and αb of preferentially selecting high degree nodes. According to the results, depending on the combination of αa and αb, the running time of an RRS varies largely. Moreover, according to the results, we should set αa = αb to minimize the running time of an RRS. From the experiment, the effect of the difference of an agent's stochastic rule for selecting an adjacent node on the running time is profound. Although many kinds of agents' rules for selecting adjacent nodes are conceivable, the agents' selection rules should be the same to shorten the running time of an RRS.
Fig. 5. Average running time of RRSes in the networks with average degree kavg = 6 when changing agents’ strength αa and αb of preferentially selecting high degree nodes
Fig. 6. Average running time of RRSes in the networks with average degree kavg = 8 when changing agents’ strength αa and αb of preferentially selecting high degree nodes
Fig. 7. Average running time of RRSes in the networks with average degree kavg = 10 when changing agents’ strength αa and αb of preferentially selecting high degree nodes
Fig. 8. Average running time of RRSes in the networks when changing ηa and ηb
4.4 The Effect of the Agents' Diversity About Frequencies of Moving Agents on the Running Time
Additionally, by conducting the experiment in which agents starting from the different nodes use different waiting probabilities ηs, we examine the effect of agents' diversity about the frequencies of moving random walk agents. Figure 8 shows the average running time of RRSes in the BA network and the ER network when changing waiting probabilities ηa and ηb. According to the results, the running time of an RRS changes largely depending on the combination of ηa and ηb. The results also show that the running time of an RRS is minimal when ηa + ηb is 0.2 or 0.3, regardless of the network topologies. Hence, in order to minimize the running time of an RRS, waiting probabilities ηa and ηb should be set to a value above a certain value, and not to 0. From the experiment, the effect of agents' diversity about the frequencies of moving random walk agents on the running time of RRSes is large. Finally, we investigate the effect of waiting probabilities ηa and ηb on setting parameter αs, where we use ηa = ηb and αa = αb. Figure 9 shows the average running time of RRSes in the BA network when changing αa and αb. The result of ηa = ηb = 0 shows that the running time of an RRS sharply increases when we set αa and αb larger than a certain value. On the other hand, the result of ηa > 0 and ηb > 0 does not show such sharp increases in the running time of an RRS. Therefore, by using ηa > 0 and ηb > 0, the setting of parameter αs becomes easy.
Fig. 9. Average running time of RRSes in the BA network when changing waiting probabilities ηa and ηb
5 Conclusion and Future Work

In this paper, we clarified the effect of agents' diversities on the running time of an RRS through several experiments. In order to do this, we conducted the following three representative experiments. Firstly, we conducted the experiment in which the agents of two searchers start at different times. The result showed that the agents' diversity in the starting times of random walks hardly affects the running time of an RRS. Next, we conducted the experiment in which the agents of each searcher follow various stochastic rules for selecting adjacent nodes. In the experiment, we especially obtained results where the combination of the agents' strengths of preferentially selecting high degree nodes is widely changed. The result showed that the effect of the difference of an agent's stochastic rule for selecting an adjacent node on the running time is profound. Finally, we examined the effect of agents' diversity in the frequencies of moving random walk agents on the running time of an RRS. In this experiment, we introduced the probability for agents to wait at the current node to adjust the frequencies of moving random walk agents. Through the experiment in which the agents' waiting probabilities are changed, it was clarified that the effect of agents' diversity in the frequencies of moving random walk agents on the running time of an RRS is large. Moreover, we considered what is necessary to minimize the running time of an RRS on the basis of the results of the experiments. The consideration showed that the following two things are necessary to minimize the running time of an RRS: (a) the same selection rule of adjacent nodes should be set, and (b) appropriate waiting probabilities of agents should be set. As future work, we will consider the optimization of the parameters used in an RRS (i.e., the agents' strength of preferentially selecting high degree nodes and the waiting probabilities). Additionally, we will also consider the evaluation of an RRS for actual network topologies.

Acknowledgements. This work was supported by JSPS KAKENHI Grant Number 19K11927.
References
1. Kranakis, E., Krizanc, D., Rajsbaum, S.: Mobile agent rendezvous: a survey. In: Proceedings of the 13th International Colloquium on Structural Information and Communication Complexity, SIROCCO 2006, pp. 1–9 (2006)
2. Avin, C., Koucký, M., Lotker, Z.: Cover time and mixing time of random walks on dynamic graphs. Random Struct. Algorithms 52, 576–596 (2018)
3. Kowalski, D.R., Malinowski, A.: How to meet in anonymous network. In: Flocchini, P., Gasieniec, L. (eds.) SIROCCO 2006. LNCS, vol. 4056, pp. 44–58. Springer, Heidelberg (2006). https://doi.org/10.1007/11780823_5
4. Ribeiro, R., Silvestre, D., Silvestre, C.: A rendezvous algorithm for multi-agent systems in disconnected network topologies. In: Proceedings of the 28th Mediterranean Conference on Control and Automation, MED 2020, pp. 592–597 (2020)
5. Alpern, S.: Rendezvous search: a personal perspective. Oper. Res. 50, 751–922 (2002)
6. Thomas, S., Luca, Z.: Random walks on dynamic graphs: mixing times, hitting times, and return probabilities. arXiv:1903.01342 (2019)
7. Toyoda, F., Sakumoto, Y., Ohsaki, H.: Proposal of an efficient blind search utilizing the rendezvous of random walk agents. In: Proceedings of the 43rd IEEE Signature Conference on Computers, Software, and Applications, COMPSAC 2020, pp. 568–575 (2020)
8. Toyoda, F., Sakumoto, Y., Ohsaki, H.: Study on the effectivity of rendezvous search using random walk in large-scale unknown networks. The Institute of Electronics, Information and Communication Engineers, Technical Committee on Internet Architecture, Technical Report (CQ2020-73), pp. 15–20 (2020)
9. Sakumoto, Y., Ohsaki, H.: Graph degree heterogeneity facilitates random walker meetings. IEICE Trans. Commun. E104-B(6), 604–615 (2021)
10. Barabási, A.L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
11. Erdős, P., Rényi, A.: On random graphs. Publicationes Mathematicae 6, 290–297 (1959)
A Study on Designing Autonomous Decentralized Method of User-Aware Resource Assignment in Large-Scale and Wide-Area Networks Toshitaka Kashimoto(B) , Fumiya Toyoda, and Yusuke Sakumoto Graduate School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda, Hyogo 669-1337, Japan {t.kashimoto,fumiya.toyoda,sakumoto}@kwansei.ac.jp
Abstract. The assignment problem on networks is a fundamental problem associated with various methods such as distributed computing and data delivery. In particular, in order to perform efficient data delivery, the necessary resources should be assigned near users while accomplishing high fairness among users. In order to accomplish such a user-aware resource assignment on large-scale and wide-area networks, an autonomous decentralized method is needed because of the computational complexity and the difficulty of gathering information from the whole of such networks. In this paper, we design an autonomous decentralized method for user-aware resource assignment without the global information of networks, and investigate the performance of the designed method through an experiment. From the results, we show that the designed method accomplishes the same performance as the centralized method using the global information.
1 Introduction
The assignment problem on networks is a fundamental problem associated with various methods such as distributed computing and data delivery [1,2]. For example, the realization of efficient data delivery is deeply related to the optimization of the assignment of necessary resources (e.g., content replicas and data storages) to nodes on the basis of the location of users. Many studies [3–7] on the assignment problem have been conducted. Such studies have proposed (a) methods to appropriately assign adjacent peers to each peer on overlay networks, and (b) methods to appropriately place content replicas on the caches of nodes in information centric networks (ICNs), but many studies discussed solutions specialized for individual cases. Since the assignment problem on networks is a kind of combinatorial problem, which is NP-hard, it is not easy to discover the exact solution of the general case. However, the discovery of a general solution applicable to many cases enables us not only to deepen the understanding of the assignment problem on networks, but also to systematically design various methods associated with the assignment problem.
The methods to solve the assignment problem on networks are roughly categorized into centralized methods and autonomous decentralized methods. A centralized method first gathers the information of the entire network, and calculates the optimal solution on the basis of the gathered information. Next, the method assigns the resources to the nodes using the optimal solution. However, it is theoretically impossible to obtain the optimal solution of the assignment problem on large-scale and wide-area networks in terms of computational complexity. Also, it is difficult to gather all information from the entire large-scale and wide-area network. Moreover, the state of an actual network dynamically changes, and accordingly, the optimal solution constantly changes. Therefore, even if the problem of computational complexity in a centralized method is solved, it is still difficult to conduct the optimal assignment on a large-scale and wide-area network with the centralized method. In order to avoid such difficulties of the centralized method, autonomous decentralized methods have been proposed in [8,9]. In an autonomous decentralized method, each resource is managed by an entity, and each entity autonomously assigns its resource using only its own adjacent information. However, each entity cannot act independently since it is generally impossible to divide a combinatorial problem into independent subproblems. Therefore, when building an autonomous decentralized method, we need to not only simply divide the assignment problem on the entire network into the subproblems for the individual entities, but also design their autonomous actions so that they are associated with the optimization of the assignment on the entire network. In [8,9], the authors have proposed schemes to design the autonomous action of entities to accomplish the optimal assignment on the entire network. In [8], they designed the autonomous action of each entity on the basis of the Markov approximation, and clarified that the autonomous action can stochastically solve combinatorial problems on networks. The scheme in [8] is applicable to the assignment problem since it is a kind of combinatorial problem, but the authors have considered only using the scheme for the combinatorial problem formulated by a single objective function. In many cases, the assignment problem is multi-objective in nature. For example, the delivery of data requires assigning the necessary resources near users while accomplishing high fairness among users. In [9], we have proposed a designing scheme of autonomous actions which can be used for the multi-objective assignment problem. More concretely, the scheme designs the autonomous action of entities on the basis of Markov Chain Monte Carlo (MCMC), and enables us to optimize the trade-off between multiple objectives in the assignment problem. Moreover, we applied it to the assignment problem of virtual machines necessary to efficiently perform distributed computing such as Hadoop and MapReduce, and clarified the effectiveness of the designed autonomous decentralized method. In the applied case, virtual machines should be assigned to nodes (physical machines) near other virtual machines while distributing the load of each node. We expect that the proposed scheme in [9] can be used for various cases related to the assignment problem. However, the effectiveness of the scheme for cases other than that of virtual machines has not been clarified yet.
In this paper, in order to clarify the effectiveness of the designing scheme in [9] for various cases related to the assignment problem, we design an autonomous decentralized method of user-aware resource assignment to efficiently perform data delivery, and investigate the performance of the designed method through an experiment. In the designed method, an entity manages a resource, and assigns it appropriately to a node by its autonomous action. Although our resource assignment is autonomously decentralized, we aim to assign resources to nodes near users while accomplishing high fairness among the users. The case of the assignment problem discussed in this paper is related to the assignment of content replicas in ICNs [4–7], and the assignment of data storages in publish-subscribe networks [7]. The assignment problem considering the location of users is essentially different from that of virtual machines in [9]. More specifically, the assignment problem of virtual machines aims to decrease the distances between entities, but that in this paper aims to decrease the distances between any user and entity. By applying our designing scheme to the essentially different assignment problem in this paper, we can clarify its effectiveness for various cases. In the experiment, we investigate the performance of the autonomous decentralized method by comparison with simple centralized methods in complete binary trees. The paper is organized as follows. In Sect. 2, we explain the system model used in this paper. Next, in Sect. 3, we design the autonomous action of entities to perform the user-aware resource assignment. Moreover, in Sect. 4, we conduct the experiment to clarify the effectiveness of the designed action. Finally, in Sect. 5, we conclude this paper, and discuss the future work.
2 System Model
In this section, we explain the system model for the data delivery. The example of the data delivery is as follows. For an ICN, content replicas are assigned to the caches of some nodes, and users can receive data by accessing one of the replicas. Also, for a publish-subscribe network, storages are assigned to some of the nodes on the network, and users (publishers or subscribers) receive data by accessing one of the storages. Figure 1 shows the example for a publish-subscribe network. The symbols defined in this section are summarized in Table 1.
Fig. 1. An example of the system model for a publish-subscribe network
Table 1. Definition of symbols

  G        Network
  V        Set of nodes
  E        Set of links
  n        Number of nodes
  k_i      Degree of node i
  ∂i       Set of adjacent nodes of node i
  d(i, j)  Distance between nodes i and j
  ℰ        Set of entities
  x_l      Node where entity l exists
  U        Set of user nodes
  r_i(l)   Rate requested from user node i to entity l
  a_i      Total rate requested from user node i
Let G = (V, E) be the network where the data delivery to users is performed. V and E are the sets of nodes and links, respectively. We denote the number of nodes in G by n. Let ∂i be the set of the adjacent nodes of node i, and k_i be the degree of node i (i.e., k_i = |∂i|). Let d(i, j) be the distance between nodes i and j, where (i, j) ∈ V². In this paper, for simplicity, we assume that the distance d(i, j) is time invariant, and that d(i, j) is given by the shortest-path length from node i to node j.

An entity exists on a node, where it assigns and manages a resource for the data delivery. Let ℰ be the set of entities. For simplicity, ℰ does not change with time. Let x_l ∈ V be the node where entity l ∈ ℰ exists. We assume that there is at most one entity on a node at the same time, and thus x_l ≠ x_k if l ≠ k for l, k ∈ ℰ. The vector X = (x_l)_{l∈ℰ} corresponds to the assignment of resources. Entity l ∈ ℰ autonomously changes its location on the basis of the access history from users to its own resource.

On a node i ∈ U, there are users that access resources. We call such a node a user node, and we denote the set of user nodes by U. Note that U ⊆ V and |U| ≥ |ℰ|. For simplicity, we assume that the users on node i ∈ U access the nearest resource, managed by some entity l ∈ ℰ, and receive the data via that resource. Let r_i(l) be the rate requested from user node i ∈ U to the resource of entity l, and a_i be the total rate requested from user node i ∈ U. For a_i and r_i(l), the equation

$$\sum_{l \in \mathcal{E}} r_i(l) = a_i \qquad (1)$$

is satisfied.
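To make the system model concrete, the following minimal Python sketch (our illustration, not part of the original model; the example network, rates, and all identifiers are assumptions) builds a small network, computes the shortest-path distances d(i, j) by breadth-first search, and derives the request rates r_i(l) under the nearest-resource access rule, so that Eq. (1) holds by construction.

```python
from collections import deque

def bfs_distances(adj, src):
    """Shortest-path (hop) distances from src in an unweighted graph."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# A small example network G = (V, E) given as an adjacency list.
adj = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6], 3: [1], 4: [1], 5: [2], 6: [2]}
d = {i: bfs_distances(adj, i) for i in adj}      # d[i][j] = d(i, j)

x = {"l1": 1, "l2": 2}                           # assignment X: entity -> node x_l
a = {3: 1.0, 4: 0.9, 5: 1.2, 6: 0.8}             # total rate a_i of each user node

# Nearest-resource access rule: user node i sends its whole rate a_i to the
# entity whose node is closest, so the sum over l of r_i(l) equals a_i (Eq. (1)).
r = {i: {l: 0.0 for l in x} for i in a}
for i, rate in a.items():
    nearest = min(x, key=lambda l: d[i][x[l]])
    r[i][nearest] = rate

for i in a:
    assert abs(sum(r[i].values()) - a[i]) < 1e-12   # Eq. (1) holds
```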
3 Design
Using the scheme proposed in [9], we design the autonomous decentralized method of user-aware resource assignment that considers the location of users.
In the designed method, each entity manages a resource for the data delivery, and assigns its own resource to an appropriate node by changing its location. Through the autonomous actions of entities, we aim to assign the resources near users while accomplishing high fairness among users. We derive such an autonomous action as follows. First, we discuss the assignment problem formulated by the distance function M(X) between any user and entity. Through this discussion, we obtain a stochastic solution to minimize M(X). Then, we divide the stochastic solution into the autonomous action of each entity on the basis of the designing scheme in [9]. Finally, we discuss how the autonomous action of each entity can assign the resources near users while accomplishing high fairness among users, by adjusting the strength of minimizing M(X).

3.1 Formulation of the Assignment Problem and Its Stochastic Solution
The problem of assigning the resources near users while accomplishing high fairness among users is multi-objective. In [9], we first discussed the objective function of an assignment problem formulated by a single objective function, and then proposed a scheme to design an autonomous decentralized method that solves a multi-objective problem by adjusting the strength of optimizing the objective function. Similarly to [9], we first discuss the minimization problem of the distance function M(X) between any user and entity in order to design the autonomous action of the user-aware resource assignment. The minimization problem of M(X) is formulated as

$$\min_{X \in \Omega_X} M(X), \qquad (2)$$

where $\Omega_X = V^{|\mathcal{E}|}$ is the set of possible X. For simplicity, we assume that all entities are homogeneous, and M(X) is given by

$$M(X) = \sum_{l \in \mathcal{E}} m_l(x_l), \qquad (3)$$

where m_l(x_l) denotes the distance function between users and entity l. In this paper, we define m_l(x_l) as

$$m_l(x_l) := \sum_{i \in U} r_i(l)\, d(i, x_l). \qquad (4)$$
According to the above equation, m_l(x_l) is the sum of the distances d(i, x_l) between user node i and entity l, weighted by the requesting rates r_i(l). Hence, in order to minimize M(X), a resource l is likely to be assigned near user nodes i with larger r_i(l).

In general, it is impossible to simply divide the minimization problem of M(X) in Eq. (2) into minimization subproblems of m_l(x_l) for each entity l ∈ ℰ. According to Eq. (4), m_l(x_l) for entity l seems to depend only on its own location x_l. However, each user accesses its nearest entity, so the value of m_l(x_l) implicitly depends on the locations x_{l'} of the other entities l' ∈ ℰ \ {l}. Because of this dependency, entity l cannot deterministically minimize m_l(x_l) while ignoring the behaviour of the other entities.

In [8,9], the authors discussed stochastic solutions to combinatorial problems in order to handle this difficulty of dividing a combinatorial problem into subproblems, as in the aforementioned minimization problem of M(X). Similarly to [8,9], we also discuss a stochastic solution to minimize M(X). According to [8], the minimal value of M(X) is approximated by a function I(M) using the log-sum-exp approximation. More concretely, I(M) is given by

$$I(M) = -\frac{1}{\lambda} \log \left( \sum_{X \in \Omega_X} e^{-\lambda M(X)} \right) \approx \min_{X \in \Omega_X} M(X), \qquad (5)$$

where λ > 0. Note that I(M) satisfies

$$\min_{X \in \Omega_X} M(X) \le I(M) \le \min_{X \in \Omega_X} M(X) + \frac{|\mathcal{E}|}{\lambda} \log |V|. \qquad (6)$$
Hence, the upper bound of the approximation error of I(M) is given by (|ℰ|/λ) log |V|. Since I(M) is a convex function, using the Legendre transformation, the conjugate function I*(P) of I(M) is obtained as

$$I^*(P) = \max_M \left[ \sum_{X \in \Omega_X} P(X)\, M(X) - I(M) \right] = \max_M H(M) = -\frac{1}{\lambda} \sum_{X \in \Omega_X} P(X) \log P(X), \qquad (7)$$

where $\sum_{X \in \Omega_X} P(X) = 1$. To derive the above equation, we used

$$\frac{\partial H(M)}{\partial M(X)} = P(X) - \frac{\partial I(M)}{\partial M(X)} = P(X) - \frac{e^{-\lambda M(X)}}{\sum_{Y \in \Omega_X} e^{-\lambda M(Y)}} = 0, \quad \text{i.e.,} \quad P(X) = \frac{e^{-\lambda M(X)}}{\sum_{Y \in \Omega_X} e^{-\lambda M(Y)}}. \qquad (8)$$

Since the right side of Eq. (7) is the entropy, the probability distribution that maximizes I*(P) is given by Eq. (8).
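As a hedged numerical illustration of Eqs. (5)–(8) (our addition, not taken from [8] or [9]), the sketch below evaluates the log-sum-exp approximation on a toy state space and checks that its gap to the true minimum is within (1/λ) log of the number of states, corresponding to the (|ℰ|/λ) log |V| bound of Eq. (6) since |Ω_X| = |V|^|ℰ|.

```python
import math, random

random.seed(1)
lam = 2.0
# Toy state space: the value M(X) for each of a handful of assignments X.
M = [random.uniform(1.0, 5.0) for _ in range(64)]

# Eq. (5): log-sum-exp approximation of the minimum of M(X).
I = -(1.0 / lam) * math.log(sum(math.exp(-lam * m) for m in M))

# Eq. (6): the approximation error shrinks like (1/lam) * log(number of states).
assert abs(I - min(M)) <= math.log(len(M)) / lam

# Eq. (8): the optimizing distribution is the Gibbs (Boltzmann) distribution.
Z = sum(math.exp(-lam * m) for m in M)
P = [math.exp(-lam * m) / Z for m in M]
print(min(M), I, sum(P))   # I approaches min(M) as lam grows; P sums to 1
```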
Moreover, by applying the Legendre transformation to the conjugate function I*(P), I(M) is obtained as

$$I(M) = I^{**}(M) = \min_P \left[ \sum_{X \in \Omega_X} P(X)\, M(X) - I^*(P) \right] = \min_P \left[ \sum_{X \in \Omega_X} P(X)\, M(X) + \frac{1}{\lambda} \sum_{X \in \Omega_X} P(X) \log P(X) \right]. \qquad (9)$$

According to the duality principle, in order to minimize the right side of I**(M), the probability distribution P(X) is given by Eq. (8). Therefore, by changing X according to the probability distribution P(X) given by Eq. (8), we can stochastically solve the minimization problem of M(X).

3.2 Autonomous Action of Entities
Using the scheme based on MCMC in [9], we divide the stochastic solution to minimize M(X), and derive the action by which entity l adjusts m_l(x_l) in an autonomous way. In the stochastic solution to minimize M(X), X needs to be changed according to the probability distribution P(X) given by Eq. (8). MCMC is a method to design a Markov process for which a given probability distribution becomes the stationary distribution. By using MCMC, it is possible to derive a state transition X → X′ under which X follows the probability distribution P(X). Let P(X → X′) be the state transition probability of the state transition X → X′. According to MCMC, in order for X to follow the probability distribution P(X), P(X → X′) should satisfy

$$P(X \to X')\, P(X) = P(X' \to X)\, P(X'). \qquad (10)$$

By substituting Eq. (8) into Eq. (10), we obtain

$$\frac{P(X \to X')}{P(X' \to X)} = \exp\left[-\lambda \left( M(X') - M(X) \right)\right] = \prod_{l \in \mathcal{E}} \exp\left[-\lambda\, \Delta m_l(x_l \to x'_l)\right], \qquad (11)$$

where Δm_l(x_l → x′_l) is

$$\Delta m_l(x_l \to x'_l) = m_l(x'_l) - m_l(x_l) = \sum_{i \in U} r_i(l) \left( d(i, x'_l) - d(i, x_l) \right). \qquad (12)$$
By dividing the state transition probability P(X → X′) that follows Eq. (11), we derive the state transition probability P(x_l → x′_l). We consider the case where only the location x_l of entity l is changed, to x′_l ∈ ∂x_l, in the state transition X → X′. Then, Eq. (11) simplifies to

$$\frac{P(X \to X')}{P(X' \to X)} = \prod_{l \in \mathcal{E}} \exp\left[-\lambda\, \Delta m_l(x_l \to x'_l)\right] = \exp\left[-\lambda\, \Delta m_l(x_l \to x'_l)\right]. \qquad (13)$$
With the scheme in [9], we derive the state transition probability P(x_l → x′_l) from Eq. (13), which is the ratio of the state transition probabilities related to x_l. Namely, from Eq. (13), we obtain

$$\frac{P(x_l \to x'_l)}{P(x'_l \to x_l)} = \exp\left[-\lambda\, \Delta m_l(x_l \to x'_l)\right] = \frac{\exp\left[-\kappa \lambda\, \Delta m_l(x_l \to x'_l)\right]}{\exp\left[(1-\kappa)\lambda\, \Delta m_l(x_l \to x'_l)\right]} = \frac{\dfrac{1}{k_{x_l}} \exp\left[-\kappa \lambda\, \Delta m_l(x_l \to x'_l)\right]}{\dfrac{1}{k_{x'_l}} \exp\left[(1-\kappa)\lambda\, \Delta m_l(x_l \to x'_l)\right]}, \qquad (14)$$

where 0 ≤ κ ≤ 0.5. From the numerators and denominators on the left and right sides of the above equation, we derive the state transition probability P(x_l → x′_l) of x_l associated with the stochastic solution to minimize M(X) as

$$P(x_l \to x'_l) = \begin{cases} \dfrac{1}{k_{x_l}} \exp\left[-\kappa \lambda\, \Delta m_l(x_l \to x'_l)\right] & \text{if } \Delta m_l(x_l \to x'_l) \le 0 \\[2mm] \dfrac{1}{k_{x_l}} \exp\left[-(1-\kappa)\lambda\, \Delta m_l(x_l \to x'_l)\right] & \text{otherwise.} \end{cases} \qquad (15)$$

In the above equation, λ is the parameter that also appears in Eq. (5). According to Eq. (6), as λ increases, M(X) approaches its minimum value. Hence, λ expresses the strength of minimizing M(X). Entity l ∈ ℰ changes its position x_l following Eq. (15). Δm_l(x_l → x′_l) in Eq. (15) is calculated from locally available information (the requesting rates r_i(l) and the distances d(i, x_l)). Entity l can estimate r_i(l) and d(i, x_l) from the request messages transferred from user nodes i ∈ U. If another entity already exists on an adjacent node x′_l ∈ ∂x_l, we set P(x_l → x′_l) = 0 to avoid multiple entities existing on the same node at the same time.
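As an illustration of the derived action (a sketch of our own; the helper names, data layout, and the capping of the acceptance probability at 1 are assumptions, not part of the scheme in [9]), the following code performs one autonomous move of entity l: a candidate neighbour is drawn uniformly from ∂x_l, which supplies the 1/k_{x_l} factor of Eq. (15), Δm_l is computed from the locally observable rates r_i(l) and distances via Eq. (12), and the move is accepted with the corresponding probability, or rejected outright if the target node is already occupied.

```python
import math, random

def delta_m(r_l, d, x_old, x_new):
    """Eq. (12): change of m_l when entity l moves from x_old to x_new.
    r_l maps each user node i to its rate r_i(l); d[i][j] = d(i, j)."""
    return sum(rate * (d[i][x_new] - d[i][x_old]) for i, rate in r_l.items())

def move_entity(x, l, adj, d, r_l, lam, kappa=0.1):
    """One autonomous action of entity l following Eq. (15).
    x maps each entity to its node; adj is the adjacency list of G."""
    x_old = x[l]
    x_new = random.choice(adj[x_old])   # uniform choice gives the 1/k_{x_l} factor
    if x_new in x.values():             # occupied node: P(x_l -> x'_l) = 0
        return x
    dm = delta_m(r_l, d, x_old, x_new)
    if dm <= 0:
        p = math.exp(-kappa * lam * dm)
    else:
        p = math.exp(-(1.0 - kappa) * lam * dm)
    if random.random() < min(1.0, p):   # capped at 1 for safety (our assumption)
        x[l] = x_new
    return x
```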
3.3 Macroscopic Characteristics and Solving Multi-objective Problem
When each entity l ∈ ℰ changes its position x_l following the designed Eq. (15) based on MCMC, the probability distribution P(M) follows

$$P(M) = \frac{G(M)\, \exp[-\lambda M]}{\sum_{Y \in \Omega_M} G(Y)\, \exp[-\lambda Y]}, \qquad (16)$$

where G(Y) is the number of states X in Ω_X for which M(X) is equal to Y, and Ω_M is the set of possible M. A probability distribution like Eq. (16) is known as the Boltzmann distribution in statistical mechanics. Since users are more interested in M(X) than in X, P(M) is more important than P(X). Using the function F(M) = M − (1/λ) log G(M), Eq. (16) is transformed into

$$P(M) = \frac{\exp[-\lambda F(M)]}{\sum_{Y \in \Omega_M} \exp[-\lambda F(Y)]}, \qquad (17)$$

where F(M) is the function called the Helmholtz free energy in statistical mechanics, which plays the central role in understanding the macroscopic characteristics of a statistical system. In this paper, we discuss the macroscopic characteristics of the designed autonomous decentralized method through F(M).

Equation (17) indicates that the minimal point M* of F(M) is the value of the distance function M that is most likely to appear in network G when assigning resources with the autonomous action of entities. The minimal point M* of F(M) is determined by the sum of the first term and the second term of F(M). These terms of F(M) have a trade-off relationship: as the first term M decreases, −log G(M) in the second term increases exponentially. If −log G(M) increases, the number of states, G(M), decreases. The value of G(M) is strongly associated with the fairness among the users. When X is chosen randomly from the set Ω_X, an X with large G(M) usually appears, and a fair assignment for users is performed. On the other hand, in order to decrease G(M), X must be chosen under a strong restriction (i.e., assigning an entity to a user node). In this case, the fairness among users is extremely low. Hence, the ratio of the first term to the second term governs the balance between the minimization of M and the fairness among users. The ratio of these terms is determined by the value of 1/λ. Therefore, by adjusting λ, the autonomous decentralized method can accomplish resource assignment balancing the minimization of M and high fairness among users.
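The role of F(M) can be made tangible with a toy enumeration (our illustration, not from the paper; under the nearest-access rule, M(X) reduces to Σ_{i∈U} a_i · min_l d(i, x_l), which the sketch exploits). It lists every placement of two entities on a 7-node complete binary tree, builds the histogram G(M), and reports the value of M that minimizes F(M) = M − (1/λ) log G(M), i.e., the value most likely to be observed.

```python
import math, itertools
from collections import Counter, deque

def bfs(adj, s):
    dist, q = {s: 0}, deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Complete binary tree of height 2 (7 nodes); two user nodes with unit rates.
adj = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6], 3: [1], 4: [1], 5: [2], 6: [2]}
d = {i: bfs(adj, i) for i in adj}
users = {3: 1.0, 6: 1.0}

def M_of(X):
    # Nearest-access rule: M(X) = sum over users of a_i * min_l d(i, x_l).
    return sum(a * min(d[i][x] for x in X) for i, a in users.items())

lam = 1.0
Ms = [M_of(X) for X in itertools.permutations(adj, 2)]   # all placements X
G = Counter(Ms)                                          # G(M): states per value of M
F = {m: m - math.log(g) / lam for m, g in G.items()}     # F(M) = M - (1/lam) log G(M)
print(min(F, key=F.get))   # minimal point M* of the free energy
```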
4 Experiment
In this section, we investigate the effectiveness of the designed autonomous decentralized method, which performs the user-aware resource assignment, through simulation experiments assuming the assignment problem of data storages on publish-subscribe networks [7]. This problem aims to improve the efficiency of the data delivery from publishers to subscribers by assigning the data storages to appropriate nodes while considering the location of users.

4.1 Setting
In this paper, we use a complete binary tree as the network G to evaluate the effectiveness of the designed autonomous decentralized method. In a complete binary tree, n is given by 2^{h+1} − 1, where h is the distance d(i, j) from the root node i to a leaf node j. If G is limited to complete binary trees, the user-aware resource assignment considering both the minimization of M and high fairness among users can also be conducted by simple centralized methods. Therefore, we can investigate how effectively the designed autonomous decentralized method accomplishes the assignment for the multi-objective problem.

Users are placed on leaf nodes of the complete binary tree G. We randomly select half of the leaf nodes and add them to U. The number of leaf nodes in G is 2^h, so the number |U| of user nodes is 2^{h−1}. The total rate a_i of user node i ∈ U is randomly given by the uniform distribution with range (r_min, r_max). The number |ℰ| of resources (entities) is set to half of |U|; namely, |ℰ| = 2^{h−2}. At the start of the simulation, entities are placed on randomly selected |ℰ| nodes. At every simulation time step, one of the entities is randomly selected and changes its position on the basis of the autonomous action explained in the previous section.

Although there are various metrics for fairness among users, in this paper we use the coefficient of variation, c_v(X), of the distance m̂_i(X) between user node i and the entities as the fairness metric. c_v(X) is defined by

$$c_v(X) := \frac{\sqrt{\frac{1}{|U|} \sum_{i \in U} \left( \hat{m}_i(X) - \overline{\hat{m}}(X) \right)^2}}{\overline{\hat{m}}(X)}, \qquad (18)$$

where $\overline{\hat{m}}(X)$ is the average of the distances m̂_i(X), given by

$$\overline{\hat{m}}(X) = \frac{1}{|U|} \sum_{i \in U} \hat{m}_i(X). \qquad (19)$$

Also, the distance m̂_i(X) is defined by

$$\hat{m}_i(X) := \sum_{l \in \mathcal{E}} r_i(l)\, d(i, x_l). \qquad (20)$$

Smaller c_v(X) means higher fairness among users. In order to evaluate the effectiveness of the assignment by the autonomous decentralized method, we calculate both the average of M and the average of c_v using the following procedure (a sketch of the measurement step is given below):

1. Generate the complete binary tree G.
2. Randomly place users and entities on G.
3. Perform the resource assignment by the autonomous decentralized method until the simulation time t reaches T. During the simulation, we measure M and m̂_i at every simulation time step t.
4. Calculate the time averages of M and m̂_i measured in the simulation, and calculate c_v from the time averages m̂_i.
5. Reiterate steps 1 to 4, and calculate the means of the time average of M and the time average of c_v. The reiteration continues until the 95% confidence interval of the means becomes sufficiently small.

We use the parameter configuration shown in Table 2 as the default parameter configuration.
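As an illustration of the measurements in steps 3 and 4 (our sketch; the data layout follows the earlier system-model sketch and is an assumption), the following functions compute m̂_i(X) and c_v(X) according to Eqs. (18)–(20).

```python
import math

def m_hat(i, X, r, d):
    """Eq. (20): rate-weighted distance between user node i and the entities.
    X maps entity l to node x_l; r[i][l] is r_i(l); d[i][j] is d(i, j)."""
    return sum(r[i][l] * d[i][x_l] for l, x_l in X.items())

def coefficient_of_variation(X, r, d, users):
    """Eqs. (18)-(19): coefficient of variation c_v(X) over the user nodes.
    Smaller values indicate higher fairness among users."""
    vals = [m_hat(i, X, r, d) for i in users]
    mean = sum(vals) / len(vals)
    std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return std / mean
```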
Table 2. Default parameter configuration

  Number of nodes, n                            511
  Number of users, |U|                          128
  Number of entities, |ℰ|                       64
  Parameter in Eq. (15), κ                      0.1
  Lower bound of total requesting rate, r_min   0.8
  Upper bound of total requesting rate, r_max   1.2
  End time of the simulation, T                 100,000

4.2 Comparison
We use two centralized methods as baselines to evaluate the effectiveness of the designed autonomous decentralized method.

First, we consider a centralized method (CM-F) emphasizing the fairness among users. CM-F assigns the resources to all nodes at depth h − 2. Since |ℰ| = 2^{h−2} in the setting explained in Subsect. 4.1, the number of these nodes is equal to the number of resources, |ℰ|. In the assignment by CM-F, the distance between a user and its nearest resource is always 2. Therefore, CM-F accomplishes perfect fairness among the users while minimizing M to a certain level. When CM-F is used for the assignment, the average of M is given by 2 |U| r_avg = 2^h, where r_avg is the expected value of the total requesting rate. With the default parameter configuration, r_avg is 1. Since the fairness using CM-F is perfect, c_v is always 0.

Second, we consider a centralized method (CM-M) emphasizing the distances between any user and resource. CM-M first assigns the resources to the nodes at depth h − 2, as CM-F does. Then, we divide the complete binary tree G into the subtrees rooted at the nodes at depth h − 2, and move resources so that the assignment in each subtree is optimal. The placements of user nodes in the subtrees are divided into 16 cases. These 16 cases can be classified into the 6 patterns shown in Fig. 2. Each figure shows the optimal assignment in the subtree for each pattern. There are no user nodes in the subtree for pattern E; therefore, the resource assigned to the root of a subtree of pattern E is moved to a subtree of pattern A.

We analyze the expected values of M and c_v in CM-M. For simplicity, we discuss a complete binary tree G′ that is composed of the same number of subtrees for each of the above 16 cases. According to our analysis of the assignment on G′ by CM-M, the expected values of M and c_v are given as 3 |U|/2 and 322/900 ≈ 0.358, respectively. The derivation process is omitted due to space limitations. The expected values of M and c_v for G′ are slightly smaller than those for G. Hence, our analysis result is a lower bound of the performance of CM-M. Note that CM-M cannot always minimize M in G: in order to minimize M, it would be necessary to assign resources to user nodes, but CM-M does not perform such an assignment. Hence, CM-M also considers not only the distance between users and resources but also the fairness among users.
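The CM-F average can be checked numerically. The sketch below (our illustration; the heap-style node indexing is an assumption) places users on a random half of the leaves of a complete binary tree and confirms that the average M under CM-F is close to 2 |U| r_avg = 2^h, since every user's nearest resource is its grandparent at depth h − 2, i.e., at distance 2.

```python
import random

h = 8                                                 # tree height: n = 2**(h+1) - 1
leaves = range(2**h - 1, 2**(h + 1) - 1)              # heap indexing: leaves at depth h
users = random.sample(list(leaves), 2**(h - 1))       # half of the leaf nodes
a = {i: random.uniform(0.8, 1.2) for i in users}      # rates drawn from (r_min, r_max)

# CM-F puts one resource on every depth-(h-2) node, so each user accesses a
# resource at distance exactly 2.
M_cmf = sum(2 * a_i for a_i in a.values())
print(M_cmf, 2**h)   # close to 2|U| * r_avg = 2**h because r_avg = 1
```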
Fig. 2. Optimal assignment patterns A to F in the subtrees of the complete binary tree
4.3 Result
First, we visually confirm the assignment by the autonomous decentralized method. Figures 3(a) and (b) show snapshots (i.e., the placement of user nodes and entities) of the simulation using the autonomous decentralized method with λ = 1 and 10, respectively. In these figures, the red nodes are user nodes, and the blue nodes are nodes where an entity exists. The purple nodes are nodes where both a user and an entity exist. Comparing these figures, if λ is set to a large value, the method assigns the entities (resources) nearer to user nodes.

Next, we investigate the effectiveness of the autonomous decentralized method quantitatively. Figure 4(a) shows the decrease ratio M/M_0, obtained by dividing the mean of M by M_0, the mean of M with λ = 0. Figure 4(b) shows the mean of c_v for different values of λ. As n increases, the mean of M naturally increases. Hence, using M/M_0, we can compare the level of minimization of M across results with different numbers of nodes, n. For reference, the value of M_0 for different n is shown in Table 3. In these figures, we also show the results of CM-F and CM-M, the centralized methods for the complete binary tree. The values of M/M_0 for CM-F and CM-M are calculated using the M_0 of the autonomous decentralized method with the same n.

From Fig. 4(a), M/M_0 decreases as λ increases. On the other hand, from Fig. 4(b), c_v increases as λ increases. Hence, by adjusting λ, the balance between the fairness among users and the distance between users and resources can be changed. In addition, the autonomous decentralized method with λ = 0.5, 1.0, and 2.0 accomplishes the same level of performance as CM-F and CM-M. Therefore, by setting λ appropriately, the autonomous decentralized method is expected to be able to assign the resources near users while maintaining fairness among users at the same level as the centralized methods.
Fig. 3. Snapshots (i.e., placement of user nodes and entities) of the simulations by the autonomous decentralized method: (a) λ = 1; (b) λ = 10
Fig. 4. (a) Decrease ratio M/M_0 and (b) mean of c_v for different values of λ

Table 3. Mean of M_0 (M with λ = 0)

  n     127   255   511
  M_0   94.9  192   387
5 Conclusion and Future Work
In this paper, we first designed an autonomous decentralized method of user-aware resource assignment on the basis of the scheme proposed in [9]. In the designed method, an entity manages a resource and uses its autonomous action to assign it to a node near users while accomplishing high fairness among users. In order to build such an autonomous action, we discussed a stochastic solution to minimize the distance function between users and resources, and divided the stochastic solution into the autonomous action of entities on the basis of the scheme in [9]. Then, we investigated the effectiveness of the designed method through experiments assuming the assignment problem of data storages on publish-subscribe networks [7]. In the experiments, we evaluated the effectiveness of the autonomous decentralized method by comparison with simple centralized methods on complete binary trees. As a result, we showed that the autonomous decentralized method can assign the resources near users while maintaining fairness among users at the same level as the centralized methods. As future work, we will clarify the effectiveness of the autonomous decentralized method on various networks, design a scheme to appropriately set the parameter λ in the method, and evaluate its robustness in dynamic environments.
Acknowledgements. This work was supported by JSPS KAKENHI Grant Number 19K11927.
References

1. Eugene, L.L.: The quadratic assignment problem. Manage. Sci. 9(4), 586–599 (1963)
2. Eliane, M.L., de Areu, N.M.M., Boaventura-Netto, P.O., Peter, H., Tania, Q.: A survey for the quadratic assignment problem. Eur. J. Oper. Res. 176(2), 657–690 (2007)
3. Yunhao, L., Li, X., Lionel, N.: Building a scalable bipartite P2P overlay network. IEEE Trans. Parallel Distrib. Syst. 18(9), 1296–1306 (2007)
4. Yonggong, W., Zhenyu, L., Gareth, T., Steve, U., Gaogang, X.: Optimal cache allocation for content-centric networking. In: Proceedings of the 21st IEEE International Conference on Network Protocols, ICNP 2013, pp. 1–7 (October 2013)
5. Mai, V.S., Ioannidis, S., Pesavento, D., Benmohamed, L.: Optimal cache allocation under network-wide capacity constraint. In: Proceedings of the 10th IEEE International Conference on Computing, Networking and Communications, ICNC 2019, pp. 816–820 (February 2019)
6. Haozhe, W., Jia, H., Geyong, M., Wang, M., Nektarios, G.: Cost-aware optimisation of cache allocation for information-centric networking. In: Proceedings of the 60th IEEE Global Communications Conference, GLOBECOM 2017, pp. 1–6 (December 2017)
7. Vasilis, S., Paris, F., Georgios, S.P., Dimitrios, K., Leandros, T.: Storage planning and replica assignment in content-centric publish/subscribe networks. Comput. Netw. 55(18), 4021–4032 (2011)
8. Minghua, C., Soung, C.L., Ziyu, S., Kai, C.: Markov approximation for combinatorial network optimization. IEEE Trans. Inf. Theory 59(10), 6301–6327 (2013)
9. Yusuke, S., Masaki, A., Hideyuki, S.: Autonomous decentralized control for indirectly controlling system performance variable of large-scale and wide-area networks. IEICE Trans. Commun. E98-B, 2248–2258 (2015)
Social Media Data Misuse

Tariq Soussan(B) and Marcello Trovati

School of Computing, Edge Hill University, Ormskirk, UK
[email protected], [email protected]
Abstract. The present high-tech landscape has allowed institutes to undergo digital transformation and to store exceptional bulks of information from several resources, such as mobile phones, debit cards, GPS, transactions, online logs, and e-records. With the growth of technology, big data has grown to be a huge resource for several corporations, helping to encourage enhanced strategies and innovative enterprise prospects. This advancement has also offered the expansion of linkable data resources. One of the most prominent data sources is social media platforms. Ideas and different types of content are posted by thousands of people via social networking sites. These sites have provided a modern method for operating companies efficiently. However, some studies have shown that social media platforms can be a source of misinformation, and some users tend to misuse social media data. In this work, the ethical concerns and conduct in online communities have been reviewed to see how social media data from different platforms has been misused, and to highlight some of the ways to avoid the misuse of social media data.
1 Introduction

The growth of technology has transformed the lifestyle that people lead. However, it has also created major issues of overpopulation, societal difficulties, and ecological obstacles. Ethical aspects are becoming extremely valuable in light of progressively serious demographic, societal, and ecological concerns [7]. Many incidents that have happened in the world recently require users to highlight and rediscover the ethical aspect of technology [8]. Ethics have advanced as an essential aspect of technology with its expansion. Technology requiring morals is a fundamental necessity for itself, and this view is primarily divided into three parts. The first part explains that technological excellence is eventually assessed by nature, and that nature and technology should be compatible. The second part describes that the openness of technology means that technology must be controlled. Lastly, the deliberateness of technology ethics stands in contrast to the mere use of technology. Thus, an essential aspect of technology is its ethical aspect, and for technology to grow, there is an inexorable need for ethics [7].

Because of several developments in technology, including the growth of smart gadgets, the rise of cloud computing, and the internet of things, big amounts of data are produced every day at an outstanding pace from various sources such as wellbeing, government, social networks, advertising, and business [5]. The objective of creating technology is frequently to get it to fulfill a specific purpose [1]. Technology and its purpose have a clear and uncontroversial correlation. Technology has ethical consequences irrespective of whether it has been produced with this in mind or not [1]. Being capable of managing both function and ethics during the design procedure is an aim and far-reaching desire [1]. An example of an ethical problem related to technology is in China, where the technologies used in big social engineering schemes involving social media and face recognition have been used to generate a worldwide 'social-credit' scheme [29]. This scheme improves the social status and benefits of people based on a mark which the authority system assigns as a measurement of their social integrity, defined as their capability to encourage a scheme of "social coherence" [29]. Overall, designers need to predict as many usage situations as they can while planning technology, so that they can foresee any potential ethical obstacles [1].
uncontroversial correlation. Technology maintains ethical consequences irrespective if the technology has been produced with this considered or not [1]. To be capable of managing the function and ethics during the design procedure is an aim and far-reaching desire [1]. An example of an ethical problem related to technology is in China during which the technologies that are being used in big social engineering schemes with social media and face recognition have been used to generate a worldwide ‘social-credit’ scheme [29]. This scheme improves the social status and benefits of people based on a mark which the authority system appoints as a measurement of their social integrity. This is defined as their capability to encourage a scheme of “social coherence” [29]. Overall, the designers need to predict as much usage situations as can be while planning technology so that they can foresee any potential ethical obstacles [1]. Big sets of data were collected from digital detectors that detect signals or stimuli, networks, computation, and memory receiving information devices which are of significance to industry, science, administrations, and organization. New companies were established on obtaining free to access information on the internet and delivering it to individuals in beneficial ways. Microsoft, Google, and Yahoo are examples of such businesses [25]. New facilities like satellite pictures, steering instructions, and picture recovery are added by such corporations that gather loads of bytes of information each day. While search portals have changed how data can be retrieved, the actions of corporations, scientific scholars, medical physicians, and security and intelligence functions have been altered by additional types of big data computing [25]. In the coming sections, Sect. 2 will discuss the social media data misuse. Section 3 will discuss the data misuse in Facebook platform. Section 4 and 5 will discuss the data misuse in Twitter and Instagram platforms, respectively. Section 6 will discuss the data misuse in LinkedIn platform. Section 7 will discuss some of the ways to restrict social media data misuse.
2 Social Media Data Misuse

Social media data is now shared by millions of users who spread information and provide online points of view. Users of social media platforms can disregard or unfollow points of view that contrast with their own local network [21]. Corporations use social media platforms to distribute data about the goods and services they propose [14]. In recent years, a rising issue has appeared related to how social media networks are being utilized for propaganda and false campaigns [10]. These platforms host customer discussions, prospects, and points of view. Using current online platforms through robot accounts and remote users may introduce misinformation into social media, generate a fad, and promptly propagate the message in a quicker and less expensive way than any other channel in the past [21]. Beneficial choices can be achieved and collaborations between customers and shareholders can be reached when corporations use these platforms [17]. Discussions on corporate social media profiles affect the choices the corporation can make. With ethical conduct and corporate social responsibility, organizations can receive substantial long-term profits, yet occasionally corporations may become engaged with unconventional methods leading to quicker, short-term benefits [15].
By fabricating data on social media, organizations can fraudulently attract clients to their goods, improve sales and revenues, attract additional workers, and decrease enrollment expenses. From their clients' engagement on social media, organizations can generate various analytics to be used in profit-making business approaches [15]. The ethical side of technology has led to a modern research area called technoethics, meaning technological morals. Within the framework of technological intelligence, the areas of AI and robotics are confronting vital ethical judgments, which has made technoethics grow to be more valuable [7]. Posted data on social media can lack authenticity, and profile pages may contain accessible personal information [15]. In addition, there are other risks that may arise from social media. Intruders, trollers, and psychopaths contact victims through social media [12]. Child safety concerns are raised because some users may harass kids and young individuals online [18]. Due to failed political revolutions, some regimes have intensified management of social media platforms [23]. Moreover, these platforms can encourage terrorist actions and distribute rumors in tragic circumstances [2]. These examples of data misuse have resulted in intensified supervision of civilians and online networks. Other problems exist when users embrace dangerous information other users have posted, when they adopt harmful conduct other users have displayed, or when they misuse personal data with illegal intention [16]. An example of this is online forums where users post about their experiences; there is a possible risk if wrong, useless, or even dangerous information is posted on those discussion forums [16]. There are several moral concerns in social media that ought to be considered in spite of its practical and valuable potential in the social and progressive evolution of human interaction [15]. One concern is anonymous data. Organizations receive fake customer product feedback which is false and could harm the reputation of organizations. Some social media networks allow users to remain unknown, whereby their identity is switched to a number or an icon, so users will not realize who they are speaking to [28].
3 Facebook Data Misuse

Facebook is a social media platform that allows users to socialize, interact, and post pictures and videos. It has shown some issues which cannot be disregarded. With regard to privacy of personal information, devices that have its application installed and permissions approved allow the application to assemble data and log on to services, as per the data policy, such as time zone, Wi-Fi signals, mobile operator, ISP, GPS, language, and Bluetooth [27]. In such a case, the data being collected by Facebook cannot be controlled by users. In addition, some immoral Facebook posts that include sentiments containing racism or religious hate have led Facebook to fulfill governments' requirements to prevent or limit massive amounts of user content in some countries [11]. In terms of information leakage, details about eighty-six million Facebook users were distributed by a researcher to a political consulting organization called "Cambridge Analytica", which dealt with the Trump campaign [6]. Another issue that Facebook faced was identity theft. Identity theft occurs when a user builds a fake account to copy someone's identity, or when a user hacks someone's password and impersonates the owner of the account [15]. Furthermore, an additional issue regarding Facebook data is fake news, as Facebook has had fake news reports shared more widely than famous mainstream news reports [24].
4 Twitter Data Misuse

The Twitter social media platform has also seen some unethical behaviors in which its data has been misused. Fake accounts on Twitter can produce junk email, false web ratings, and false information. Fake accounts may also be used to disclose prohibited download links and to stalk other users [15]. Some corporations have used Twitter to push their undesirable goods into people's timelines through paid tweets. Another issue related to Twitter data is the absence of context of tweets, which occurs when Twitter users misread messages because they lack all the vital information related to a concept, which may be confusing [19]. Moreover, another ethical dilemma with Twitter is ghost tweets, in which people hire ghostwriters to use their Twitter accounts and tweet as if it were them. The problem here is that users cannot know whether the content of the tweets is true or false [15]. In addition, there is the further issue of selling Twitter data for commercial benefits.
5 Instagram Data Misuse

Another social media platform subject to data misuse is Instagram. Like Facebook, Instagram allows posting and sharing pictures and videos. In the terms and conditions, it is compulsory for users, before joining the network, to accept that their private pictures on the platform can be acquired by major advertising organizations or third-party purchasers without further approval [15]. Official and legitimate approval is needed from Instagram users for Instagram to access private data; nonetheless, users sometimes tend to agree to the permission forms before reviewing them [3]. Thus, this might put users' private data at risk of being sold. Moreover, another issue regarding Instagram data misuse is influencer digital advertising. Businesses expect that customers will be more persuaded to seek the goods or services that the corporation proposes when these businesses display influencers on Instagram to their target marketplace [9].
6 LinkedIn Data Misuse

LinkedIn is a professional social media platform where users create profiles containing their employment and private data, which is shared with other users. LinkedIn has also had some ethical problems. The platform has a job board on which company owners can post job offers related to their company. However, some companies have advertised vacancies that do not exist because they want to collect information, such as an archive of CVs, and create traffic on their LinkedIn webpage [15]. In addition, businesses must pay money to LinkedIn for posted vacancies and for accessing the CVs [15]. The accuracy and reliability of LinkedIn data cannot be verified: previous work has shown that 50% of the CVs used by HR specialists to evaluate job candidates can include factual faults [20]. When studying the conditions for recruiting through the LinkedIn platform, an absence of legal guidance was observed, which is another issue related to the platform. Incorrect recruitment may occur if HR specialists are not trained in how to retrieve
online data, since they might hire new employees based on falsely acquired information [15]. Another issue with the LinkedIn platform is the absence of privacy. Companies collect data about potential candidates before a job is offered. Employers use this data through LinkedIn to perform screening checks prior to recruitment [22].
7 Approaches to Restrict Social Media Data Misuse

A survey from previous work showed that the social media job screening performed by corporations in the United States for job candidates rose to 70% in 2018, after being around 11% in 2006 [10]. Companies tend to authenticate applicants' skills, evaluate whether they introduce themselves in a professional way, and guarantee that these applicants do not have any posts that can be categorized as bullying or insulting content [10]. Thus, when employers or hiring managers have the chance to reach such "reliable data", the practice can be referred to as cybervetting [4]. This kind of data can help determine crucial metrics, such as academic capability and job efficiency [13].

Since not all users are acquainted with the methods by which their social media data can be misused or breached, they find this a barrier when they want to use social media. Some previous work proposes that beginner users be aided by a facilitator or internet-knowledgeable tutors, so that those users can become familiarized with data confidentiality and safety concerns and other important information [16]. Although some social media websites guide their new users through registration, the conversations and content being shared may not be strongly checked by a moderator. Users with little knowledge prefer to have a moderator whom they can turn to if they have queries [16].

Moreover, policymakers need to read, envision, and cross-examine how the analytics from social networks can be integrated ethically, so that the worries of individuals and businesses are looked after [15]. It is vital that the objective and the extended consequences of social media analytics be recognized by policymakers. Businesses need to build, distribute, and regularly employ specific rules concerning their utilization of social media analytics [15].

In businesses, there are certain approaches for trying to reduce the exploitation of social media data by workers. Social media misuse can take place beyond the physical sites of the businesses, in the nonworking hours of the employees, and can be done from any device regardless of whether it belongs to the company or not [26]. Thus, it is vital for companies to deliver proper employee training related to the rules, and to data security and equality law in particular, to try to reduce the harm that can be produced by the misuse of social media platform data [26]. Another approach that businesses can apply is usage policies for social media. These include the rules of conduct required from workers in relation to their usage of social media, stated in a clear way, regardless of whether they are at the workplace or not [26]. Another approach is related to the use of companies' official social media accounts. There should be rules on creating such accounts, in addition to updating and supervising them. Access to those accounts should be limited to authorized employees only [26]. Any references used on these accounts should be cited as per copyright law.

Moreover, another strategy for avoiding social media data misuse is following some guidance. An example is for employees to change their accounts from public accounts
to private accounts, so that only users on their friend list can see their posts' content, provided that the friend list is reviewed often [26]. Employers should offer guidance to try to guarantee that workers are considerate about their company, their colleagues, and others that might be associated with the company.
8 Conclusion

In this work, the possible misuse of social networking platforms has been reviewed. Many examples were given from previous work about how social media have been misused, with a focus on data misuse on social media platforms such as Facebook, Twitter, Instagram, and LinkedIn. Several approaches have been mentioned to deal with such data misuse. To restrict or minimize misuse, certain usage rules need to be applied. Providing beginner users with some tutoring can help them learn more about data confidentiality and safety concerns and other information that might help them. Future work can suggest more ways of making stakeholders aware of the misuse of their social media data, so that they can take the necessary precautions to keep their data safe from being misused.
References

1. Albrechtslund, A.: Ethics and technology design. Ethics Inf. Technol. 9(1), 63–72 (2007)
2. Alexander, D.E.: Social media in disaster risk reduction and crisis management. Sci. Eng. Ethics 20(3), 717–733 (2014)
3. Bechmann, A., Vahlstrup, P.B.: Studying Facebook and Instagram data: the digital footprints software. First Monday (2015)
4. Berkelaar, B.L.: Cybervetting, online information, and personnel selection: new transparency expectations and the emergence of a digital social contract. Manag. Commun. Q. 28(4), 479–506 (2014)
5. Botta, A., De Donato, W., Persico, V., Pescapé, A.: Integration of cloud computing and internet of things: a survey. Futur. Gener. Comput. Syst. 56, 684–700 (2016)
6. Cadwalladr, C., Confessore, N., Rosenberg, M.: How Trump consultants exploited the Facebook data of millions. The New York Times (2018)
7. Fan, Z., Ge, Y.: The influence of technoethics on industrial design. In: MATEC Web of Conferences, vol. 167, p. 01008. EDP Sciences (2018)
8. Galvan, J.M.: On technoethics. IEEE-RAS Mag. 10(4), 58–63 (2003)
9. Glucksman, M.: The rise of social media influencer marketing on lifestyle branding: a case study of Lucie Fink. Elon J. Undergraduate Res. Commun. 8(2), 77–87 (2017)
10. Gruzd, A., Jacobson, J., Dubois, E.: Cybervetting and the public life of social media data. Soc. Media + Soc. 6(2), 2056305120915618 (2020)
11. Isaac, M.: Facebook said to create censorship tool to get back into China. The New York Times. http://www.nytimes.com
12. Kim, W., Jeong, O.R., Lee, S.W.: On social Web sites. Inf. Syst. 35(2), 215–236 (2010)
13. Kluemper, D.H., Rosen, P.A., Mossholder, K.W.: Social networking websites, personality ratings, and the organizational context: more than meets the eye? J. Appl. Soc. Psychol. 42(5), 1143–1172 (2012)
14. Kumar, P., Mittal, S.: The perpetration and prevention of cyber crime: an analysis of cyber terrorism in India. Int. J. Technoethics (IJT) 3(1), 43–52 (2012)
15. Kumar, V., Nanda, P.: Social media to social media analytics: ethical challenges. Int. J. Technoethics (IJT) 10(2), 57–70 (2019)
16. Leist, A.K.: Social media use of older adults: a mini-review. Gerontology 59(4), 378–384 (2013)
17. Naik, D.A.: Organizational use of social media: the shift in communication, collaboration and decision-making (2015)
18. O'Keeffe, G.S., Clarke-Pearson, K.: The impact of social media on children, adolescents, and families. Pediatrics 127(4), 800–804 (2011)
19. Onook, O., Manish, A., Raghave, R.H.: Community intelligence and social media services: a rumor theoretic analysis of tweets during social crisis. Manag. Inf. Syst. Q. 37(2), 407–426 (2013)
20. Parez, M.E.: Linked into a job?: the ethical considerations of recruiting through LinkedIn (2013)
21. Prier, J.: The command of the trend: social media as a weapon in the information age. School of Advanced Air and Space Studies, Air University, Maxwell AFB, United States (2017)
22. Saylin, G., Horrocks, T.: The risks of pre-employment social media screening. SHRM. https://www.shrm.org/resourcesandtools/hr-topics/talent-acquisition/pages/preemployment-social-media-screening.aspx. Accessed 27 Feb 2019
23. Shirky, C.: The political power of social media: technology, the public sphere, and political change. Foreign Affairs, 28–41 (2011)
24. Silverman, C., Singer-Vine, J.: Most Americans who see fake news believe it, new survey says. BuzzFeed News, 6 (2016)
25. Syed, A., Gillela, K., Venugopal, C.: The future revolution on big data. Future 2(6), 2446–2451 (2013)
26. Taylor, M., Haggerty, J., Gresty, D., Wren, C., Berry, T.: Avoiding the misuse of social media by employees. Netw. Secur. 2016(5), 8–11 (2016)
27. Thompson, C.: What you really sign up for when you use social media. CNBC (2015)
28. Turculeţ, M.: Ethical issues concerning online social networks. Procedia Soc. Behav. Sci. 149, 967–972 (2014)
29. Vallor, S., Green, B., Raicu, I.: Ethics in Technology Practice. The Markkula Center for Applied Ethics, Santa Clara University (2018)
Deep Learning Approaches to Detect Real Time Events Recognition in Smart Manufacturing Systems – A Short Survey

Suleman Awan(B) and Marcello Trovati

Department of Computer Science, Edge Hill University, Ormskirk, UK
{awans,trovatim}@edgehill.ac.uk
Abstract. When steam- and water-powered engines started the first industrial revolution, production rates increased and gave way to large production facilities, which transitioned into the mass production of identical products on an assembly line with the introduction of electricity in the second revolution. Subsequently, the processing of different product categories along the same assembly line was facilitated with the help of automation and robotics. This led to the third industrial period, aided by the introduction of computers and the advancement of electronics. Currently, we are in the process of transitioning into an autonomous and intelligent manufacturing system where cyber and physical systems connect through data analytics and machine learning. In this article, a short overview of deep learning approaches to the detection and recognition of real time events is discussed.
1 Introduction
With the introduction of Industry 4.0, manufacturing companies are now focusing their strategies on designing and adopting new technologies which can provide a variety of customised products along the same assembly line, while maintaining quality and reducing production costs. This way forward will provide a better response and reconfigurability to changing and emerging threats in the environment, internal or external, to maintain competitive advantage [1]. As discussed in [2], the new technologies currently utilised in Industry 4.0 include:
• Internet of Things: sensors, scanners and processors that help in generating data used for timely decision making in production
• Cyber physical systems: real time data generated from the interaction between sensors and physical systems (i.e. machines) to formulate a decentralised response of a system, making decisions under uncertainty in real time
• Virtualisation: a computer-generated 3D model of a work place, product, machine, or component
• 3D printing: fabrication or production of a product through computer aided design without actually utilising real material
• Augmented reality: real time information displayed through a headset about a product or process and their current status
• Cloud computing: virtual space for the storage of data, which can be shared among multiple users at different locations simultaneously
• Cyber security: security of the information, which can be on cloud or physical servers, generated during the entire product lifecycle
• Advanced robotics: pre-programmed machines doing complex tasks which were once carried out by humans, but with more precision and fewer errors
• Big data and machine learning: information generated through the interaction of physical systems with sensors and processors.

It is not compulsory for companies to have all these technologies, and they should only opt for those technologies which satisfy their business and transformation strategy.

1.1 The Advantages of a Data Driven Industry 4.0
There are multiple advantages of Industry 4.0, starting with more customised products through increased flexibility and productivity (reduction in production time and better utilisation of resources), improved quality (real time monitoring of products), reduction in production costs, and reduced lead time in the delivery of the final product to the customer. Furthermore, advancements in information and communication technologies (ICT) have also caused a shift in the manufacturing industries towards adopting Industry 4.0 technologies to help them in their production processes. One such advantage is the use of various sensors throughout the production processes, which monitor the production process flow by taking various measurements including temperature, pressure, vibrations, etc. These sensors generate large amounts of data which can describe the production process steps, quality, and safety, and help in detecting and monitoring unplanned stoppages [3]. This has led to the extraction of information from the cyber physical assets in the production processes through various data mining and machine learning methods, by analysing, understanding, and helping in decision making.

As mentioned above, one of the main objectives for the manufacturing industry is to reduce costs by reducing unplanned stoppages. Currently, three types of maintenance are usually carried out in factories to keep the production processes running: reactive, preventive, and proactive [4]. Reactive maintenance is normally carried out once a breakdown or fault occurs, and the cost, resources, and downtime (the time for which machinery is not available to production processes due to fault or breakdown) associated with such maintenance are very high compared with the other two types. Preventive maintenance activities are normally carried out after a machine or piece of equipment has completed a certain number of hours, as per the parameters set by the original equipment manufacturer (OEM) of that machinery. The maintenance activities in this category follow a set pattern as laid down by the OEM and require the arrangement of resources in advance. However, the cost and downtime associated with this maintenance are low compared to reactive maintenance. Proactive or predictive maintenance is carried out after an analysis of the equipment that assesses when the said equipment will no longer
be performing at the desired level of functioning as per its OEM-set parameters. Based on this prediction, maintenance is then carried out in advance to keep the equipment in functional condition. The costs and downtime associated with such maintenance are quite low compared to reactive and preventive maintenance.

Sensors generate large amounts of data which can be acquired, stored in the cloud or other distributed storage channels, processed, and analysed to ascertain the condition of the production process. In this regard, data driven prognosis of production floor equipment has gained considerable importance in assessing the remaining useful life (RUL) and the unplanned stoppages encountered on the production floor [5]. This is very important because, by using this large amount of data and various supervised and unsupervised learning algorithms, one can predict the health status of the production process flow. These data driven approaches are also significant because one does not have to completely understand the physical processes involved in the production process. In this regard, the Cross Industry Standard Process for Data Mining (CRISP-DM) is used as the methodology for deploying data driven prognosis models; it focuses on the understanding of data, its pre-processing and preparation for learning models, the building of learning models, evaluating and improving them, and finally deployment [5]. Section 2 will focus on the application of various deep learning models which have been utilised to predict unplanned stoppages and RUL, or for real time event monitoring, in the manufacturing industry.
2 Deep Learning for Real Time Events Recognition
In [4], a Random Forest (RF) based algorithm is introduced which follows a data driven prognosis approach for predicting tool wear in a manufacturing setup. The work compares RF performance with a simple feed-forward back-propagation Artificial Neural Network (FFBP ANN) and a support vector regression (SVR) based model, and shows that RF performs better at accurately predicting tool wear than the others, though it takes a longer time to train the model on the training data set. The performance of the algorithms was evaluated on data collected from 315 milling tests. The performance measures for the three models were Mean Squared Error, R-squared, and training time. Features were extracted from raw data including cutting force, vibration, and acoustic emission signal channels, and 28 features were extracted from these signals, including max, median, mean, and standard deviation. A stopping parameter was given to halt the models once that threshold was achieved. RF performed better than a simple FFBP ANN and SVR, even though it typically takes a longer training time compared to the other two approaches. Moreover, the ANN used in this context consisted of a single hidden layer only, with 2, 4, 8, 16, or 32 neurons.

In [6], the authors propose a deep learning model based on stacked auto-encoders (SAE) and Long Short-Term Memory (LSTM) algorithms to detect faults in mechanical equipment using unsupervised learning approaches, especially when the historical data is unlabelled and a pattern needs to be found, and practical knowledge regarding the working and operation of the mechanical
equipment is also not available. The SAE is used for learning the feature representation from unlabelled training data, whereas the LSTM is used to detect anomalies in the equipment. Data from the vibration sensors (both during normal and fault states) is characterised in sequence through wavelet packet decomposition (WPD) to extract wavelet-coefficient-based and energy-based feature sequences. This is then fed to the SAE auto-encoders to form a five-fold feature representation in time series, which is then fed to the LSTM model to detect anomalies, owing to the LSTM's capability to lower the loss of signal caused by vanishing gradients. This resulted in anomaly detection of 99% using unsupervised learning methods where data is unlabelled. To validate this, 200 samples were used, with 150 being normal and 50 being faulty.

In [7], a 2D convolutional LSTM auto-encoder (AE) model is introduced which predicts the speed of a can manufacturing machine from sequential data obtained from speed sensors embedded within the machine. The model comprises three layers, i.e., a multistep convolutional LSTM encoder, followed by a bidirectional LSTM AE capable of long- and short-term representation learning, and finally a fully connected supervised learning layer. The model was tested against traditional machine learning (ML) models and some advanced deep learning (DL) models and was found to be very effective. The sliding window method was used to convert the sequential time series data into a supervised learning sequence, i.e., input and output values in a 2D matrix format. A convolutional LSTM is used to determine and retain the hidden state across the time steps. After encoding, the data is fed, via a repeat layer, to a layered bidirectional LSTM for fostering representation learning, and then finally to the FC layer for predicting the output. This model uses supervised learning approaches. The training data was converted from a univariate input sequence to a matrix of shape n × 60 × 1. The model outperformed the statistical models as well as the advanced DL models in predictive performance. Also, the model required less training time compared to the others, making it more efficient for manufacturing settings.

In [8], a model comprising a CNN and a bidirectional LSTM with SAE is discussed, whose aim is event detection from a large amount of data collected from a can manufacturing machine. The CNN performs automatic feature extraction, whereas the LSTM helps in capturing long-term time dependencies through its sequential learning ability in a supervised manner. The model offers good predictive performance with faster training times. In particular, the input data, consisting of time series data, was fed to the 3 CNN layers for feature extraction. The features were subsequently fed to the 4 SAE LSTM layers for sequential learning, and finally the output went to the sigmoid and classification layer. The model was tested against CNN-LSTM and Conv-2DLSTM, performed better than the other deep learning models in event detection, and achieved a shorter training time.

In [9], data from a single source is taken from a packing machine to predict near-future stoppages by using the forecasting models ARIMA and Prophet. The model predicts the frequency and the time duration of the stoppages based on the present status of the stoppages being experienced by the machine. The model works in isolation from other machines and their data, and therefore does not consider a large amount of data.
An IoT controller was installed to measure the start and stop function of the packing machine. One year of data was collected
with around 75K entries. Stoppages were divided into minor, breakdown, and major, ranging from 10 s to 5 min, 5 min to 40 min, and greater than 40 min, respectively. After data cleaning, ARIMA and Prophet are used to forecast the machine stoppages. Mean Absolute Error, Mean Absolute Percentage Error and Normalised Root Mean Square Error were used to evaluate the models. Both models performed well in predicting future stoppages based on a single source of past data from the packing machine. However, it would be desirable to have data from multiple sources rather than from a single one.
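Several of the surveyed approaches first recast a sensor stream as a supervised learning problem via the sliding-window method and then fit a regressor such as the random forest of [4]. The following is a minimal sketch of that pattern, not the exact pipeline of [4] or [7]: the synthetic signal, the window length of 28 (chosen to echo the 28 extracted features mentioned above) and the train/test split are all illustrative assumptions.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def sliding_window(series, n_lags):
    # Each row of X holds n_lags consecutive past values; y is the next value.
    X, y = [], []
    for i in range(len(series) - n_lags):
        X.append(series[i:i + n_lags])
        y.append(series[i + n_lags])
    return np.array(X), np.array(y)

# Hypothetical stand-in for an extracted sensor feature stream.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20, 500)) + 0.1 * rng.standard_normal(500)
X, y = sliding_window(signal, n_lags=28)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:400], y[:400])
print("held-out R^2:", model.score(X[400:], y[400:]))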
3 Conclusions
Industry 4.0 relies on data-driven approaches to enable a more agile and computationally efficient intelligent environment. However, the corresponding data is often unstructured, generated in real time, and contains high levels of noise. To address these aspects, deep learning methods and frameworks have been developed. Nevertheless, more research is needed to enable a more in-depth analysis leading to more targeted and accurate information extraction and prediction.
References

1. Brunelli, J., Lukic, V., Milon, T., Tantardini, M.: Five Lessons from the Frontlines of Industry 4.0 (2017). https://www.bcg.com/en-gb/publications/2017/industry-4.0-lean-manufacturing-five-lessons-frontlines.aspx. Accessed 23 May 2021
2. Ahuett-Garza, H., Kurfess, T.: A brief discussion on the trends of habilitating technologies for Industry 4.0 and smart manufacturing. Manuf. Lett. 15, 60–63 (2018). https://doi.org/10.1016/j.mfglet.2018.02.011
3. Huber, M.F., Voigt, M., Ngomo, A.-C.N.: Big data architecture for the semantic analysis of complex events in manufacturing (2016)
4. Wu, D., Jennings, C., Terpenny, J., Gao, R.X., Kumara, S.: A comparative study on machine learning algorithms for smart manufacturing: tool wear prediction using random forests. J. Manuf. Sci. Eng. Trans. ASME 139(7) (2017). https://doi.org/10.1115/1.4036350
5. Diez-Olivan, A., Del Ser, J., Galar, D., Sierra, B.: Data fusion and machine learning for industrial prognosis: trends and perspectives towards Industry 4.0. Inf. Fusion 50, 92–111 (2019). https://doi.org/10.1016/j.inffus.2018.10.005
6. Li, Z., Li, J., Wang, Y., Wang, K.: A deep learning approach for anomaly detection based on SAE and LSTM in mechanical equipment. Int. J. Adv. Manuf. Technol. 103(1–4), 499–510 (2019). https://doi.org/10.1007/s00170-019-03557-w
7. Essien, A., Giannetti, C.: A deep learning model for smart manufacturing using convolutional LSTM neural network autoencoders. IEEE Trans. Ind. Inf. 16(9), 6069–6078 (2020). https://doi.org/10.1109/TII.2020.2967556
8. Giannetti, C., Essien, A., Pang, Y.O.: A novel deep learning approach for event detection in smart manufacturing. In: Proceedings of International Conference on Computers and Industrial Engineering (CIE), vol. 2019, pp. 1–11 (2019)
9. Filios, G., Katsidimas, I., Nikoletseas, S., Panagiotou, S., Raptis, T.P.: An agnostic data-driven approach to predict stoppages of industrial packing machine in near future. In: Proceedings of the 16th Annual International Conference on Distributed Computing in Sensor Systems (DCOSS 2020), pp. 236–243 (2020). https://doi.org/10.1109/DCOSS49796.2020.00046
A Comparison Study of CM and RIWM Router Replacement Methods for WMNs Considering Boulevard Distribution of Mesh Clients

Peng Xu1, Admir Barolli2, Phudit Ampririt3, Shinji Sakamoto4, and Leonard Barolli1(B)

1 Department of Information and Communication Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
[email protected]
2 Department of Information Technology, Aleksander Moisiu University of Durres, L.1, Rruga e Currilave, Durres, Albania
3 Graduate School of Engineering, Fukuoka Institute of Technology, 3-30-1 Wajiro-Higashi, Higashi-Ku, Fukuoka 811-0295, Japan
4 Department of Computer and Information Science, Seikei University, 3-3-1 Kichijoji-Kitamachi, Musashino-shi, Tokyo 180-8633, Japan
[email protected]
Abstract. Wireless Mesh Networks (WMNs) have attracted attention for different applications. They are an important networking infrastructure and have many advantages such as low cost and high-speed wireless Internet connectivity. However, they have some problems such as router placement, coverage of mesh clients and load balancing. To deal with these problems, in our previous work we implemented a hybrid simulation system based on Particle Swarm Optimization (PSO) and Distributed Genetic Algorithm (DGA), called WMN-PSODGA. Moreover, we added to the fitness function a new parameter for the load balancing of the mesh routers called NCMCpR (Number of Covered Mesh Clients per Router). In this paper, we consider the Boulevard distribution of mesh clients and two router replacement methods: Constriction Method (CM) and Random Inertia Weight Method (RIWM). We carry out simulations using the WMN-PSODGA system considering 16 and 18 mesh routers and compare the performance of CM with RIWM. The simulation results show that RIWM has better load balancing than CM.
1 Introduction
The wireless networks and devices can provide users access to information and communication anytime and anywhere [3,8–11,14,20,26,27,29,33]. Wireless Mesh Networks (WMNs) are gaining a lot of attention because of their low cost, which makes them attractive for providing wireless Internet connectivity. A WMN is dynamically self-organized and self-configured. The nodes in the network automatically establish and maintain the mesh connectivity among themselves by creating
an ad-hoc network. This feature brings many advantages to WMNs, such as easy network maintenance, robustness and reliable service coverage [1]. Moreover, such infrastructure can be deployed in community networks, metropolitan area networks and municipal networks to support applications for urban areas, medical, transport and surveillance systems. Mesh node placement in WMNs can be seen as a family of problems, which is shown (through graph theoretic approaches or placement problems, e.g. [6,15]) to be computationally hard to solve for most of the formulations [37]. In this work, we consider the version of the mesh router nodes placement problem in which we are given a grid area, where we deploy a number of mesh router nodes, together with a number of mesh client nodes at fixed positions (of an arbitrary distribution) in the grid area. The objective is to find a location assignment of the mesh routers to the cells of the grid area that maximizes the network connectivity and client coverage while considering load balancing for each router. Network connectivity is measured by the Size of Giant Component (SGC) of the resulting WMN graph, while the user coverage is simply the number of mesh client nodes that fall within the radio coverage of at least one mesh router node and is measured by the Number of Covered Mesh Clients (NCMC). For load balancing, we added to the fitness function a new parameter called NCMCpR (Number of Covered Mesh Clients per Router). Node placement problems are known to be computationally hard to solve [12,13,38], and some intelligent algorithms have been investigated for the node placement problem in previous works [4,7,16,18,21–23,31,32]. In [24], we implemented a Particle Swarm Optimization (PSO) based simulation system called WMN-PSO, and another simulation system based on Genetic Algorithm (GA) called WMN-GA [19], for solving the node placement problem in WMNs. Then, we designed and implemented a hybrid simulation system based on PSO and Distributed GA (DGA), which we call WMN-PSODGA. In this paper, we present the performance analysis of WMNs using the WMN-PSODGA system considering the Boulevard distribution of mesh clients and two router replacement methods: Constriction Method (CM) and Random Inertia Weight Method (RIWM). We carry out simulations considering 16 and 18 mesh routers and compare the performance of CM with RIWM. The rest of the paper is organized as follows. We present our designed and implemented hybrid simulation system in Sect. 2. The simulation results are given in Sect. 3. Finally, we give conclusions and future work in Sect. 4.
2 Proposed and Implemented Simulation System

2.1 Particle Swarm Optimization
In PSO, a number of simple entities (the particles) are placed in the search space of some problem or function, and each evaluates the objective function at its current location. The objective function is often minimized, and the exploration of the search space is not through evolution [17].
Each particle then determines its movement through the search space by combining some aspect of the history of its own current and best (best-fitness) locations with those of one or more members of the swarm, with some random perturbations. The next iteration takes place after all particles have been moved. Eventually the swarm as a whole, like a flock of birds collectively foraging for food, is likely to move close to an optimum of the fitness function. Each individual in the particle swarm is composed of three D-dimensional vectors, where D is the dimensionality of the search space. These are the current position x_i, the previous best position p_i and the velocity v_i. The particle swarm is more than just a collection of particles. A particle by itself has almost no power to solve any problem; progress occurs only when the particles interact. Problem solving is a population-wide phenomenon, emerging from the individual behaviors of the particles through their interactions. In any case, populations are organized according to some sort of communication structure or topology, often thought of as a social network. The topology typically consists of bidirectional edges connecting pairs of particles, so that if j is in i's neighborhood, i is also in j's. Each particle communicates with some other particles and is affected by the best point found by any member of its topological neighborhood. This is just the vector p_i for that best neighbor, which we will denote with p_g. The potential kinds of population "social networks" are hugely varied, but in practice certain types have been used more frequently. We show the pseudo code of PSO in Algorithm 1. In the PSO process, the velocity of each particle is iteratively adjusted so that the particle stochastically oscillates around the p_i and p_g locations.
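As a concrete illustration of this update rule, the following is a minimal sketch of one PSO iteration over all particles at once. The array shapes and the default parameter values (the CM settings ω = 0.729, C1 = C2 = 1.4955 quoted later in this section) are illustrative assumptions, not the exact WMN-PSODGA implementation.

import numpy as np

def pso_step(x, v, p_best, g_best, omega=0.729, c1=1.4955, c2=1.4955):
    # One PSO iteration: each particle is pulled toward its own best
    # position p_best and the neighbourhood best g_best, with random
    # perturbations; x, v, p_best are (m, D) arrays, g_best is (D,).
    r1 = np.random.rand(*x.shape)
    r2 = np.random.rand(*x.shape)
    v = omega * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    return x + v, v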
2.2 Distributed Genetic Algorithm
The Distributed Genetic Algorithm (DGA) has been used in various fields of science and has shown its usefulness for the resolution of many computationally hard combinatorial optimization problems. We show the pseudo code of DGA in Algorithm 2. Population of individuals: Unlike local search techniques that construct a path in the solution space by jumping from one solution to another through local perturbations, DGA uses a population of individuals, thus giving the search a larger scope and better chances of finding good solutions. This feature is also known as the "exploration" process, in contrast to the "exploitation" process of local search methods. Fitness: The determination of an appropriate fitness function, together with the chromosome encoding, is crucial to the performance of DGA. Ideally we would construct objective functions with "certain regularities", i.e. objective functions for which any two individuals that are close in the search space have similar objective values. Selection: The selection of individuals to be crossed is another important aspect of DGA, as it impacts the convergence of the algorithm. Several selection schemes have been proposed in the literature, trying to cope with the premature convergence of DGA. There are many selection methods in GA; in our system, we implement two selection methods: the Random method and the Roulette wheel method.
Algorithm 1. Pseudo code of PSO.
/* Initialize all parameters for PSO */
Computation maxtime := Tp_max, t := 0;
Number of particle-patterns := m, 2 ≤ m ∈ N^1;
Particle-pattern initial solution := P_i^0;
Particle-pattern initial position := x_ij^0;
Particle initial velocity := v_ij^0;
PSO parameter := ω, 0 < ω ∈ R^1;
PSO parameter := C1, 0 < C1 ∈ R^1;
PSO parameter := C2, 0 < C2 ∈ R^1;
/* Start PSO */
Evaluate(G^0, P^0);
while t < Tp_max do
  /* Update velocities and positions */
  v_ij^{t+1} = ω · v_ij^t + C1 · rand() · (best(P_ij^t) − x_ij^t) + C2 · rand() · (best(G^t) − x_ij^t);
  x_ij^{t+1} = x_ij^t + v_ij^{t+1};
  /* If the fitness value is increased, a new solution will be accepted. */
  Update_Solutions(G^t, P^t);
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return Best found pattern of particles as solution;
Crossover operators: The use of crossover operators is one of the most important characteristics of DGA. The crossover operator is the means by which DGA transmits the best genetic features of parents to offspring during the generations of the evolution process. Many crossover operators have been proposed, such as Blend Crossover (BLX-α), Unimodal Normal Distribution Crossover (UNDX) and Simplex Crossover (SPX). Mutation operators: These operators intend to improve the individuals of a population by small local perturbations. They aim to provide a component of randomness in the neighborhood of the individuals of the population. In our system, we implemented two mutation methods: uniformly random mutation and boundary mutation. Escaping from local optima: GA itself has the ability to avoid falling prematurely into local optima and can eventually escape from them during the search process. DGA has one more mechanism to escape from local optima: it considers several islands, each of which runs a GA for optimization, and genes migrate between islands to provide the ability to avoid local optima (see Fig. 1). Convergence: The convergence of the algorithm is the mechanism of DGA for reaching good solutions. A premature convergence of the algorithm would cause all individuals of the population to be similar in their genetic features
and thus the search would become ineffective and the algorithm would get stuck in local optima. Maintaining the diversity of the population is therefore very important for this family of evolutionary algorithms.
Algorithm 2. Pseudo code of DGA.
/* Initialize all parameters for DGA */
Computation maxtime := Tg_max, t := 0;
Number of islands := n, 1 ≤ n ∈ N^1;
Initial solution := P_i^0;
/* Start DGA */
Evaluate(G^0, P^0);
while t < Tg_max do
  for all islands do
    Selection();
    Crossover();
    Mutation();
  end for
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return Best found pattern of particles as solution;
Fig. 1. Model of Migration in DGA.
2.3 WMN-PSODGA Hybrid Simulation System
In this subsection, we present the initialization, particle-pattern, gene coding, fitness function and replacement methods. The pseudo code of our implemented system is shown in Algorithm 3. Our implemented simulation system also uses a Migration function, as shown in Fig. 2. The Migration function swaps solutions among the islands included in the PSO part.
Algorithm 3. Pseudo code of WMN-PSODGA system.
Computation maxtime := T_max, t := 0;
Initial solutions: P;
Initial global solutions: G;
/* Start PSODGA */
while t < T_max do
  Subprocess(PSO);
  Subprocess(DGA);
  WaitSubprocesses();
  Evaluate(G^t, P^t);
  /* Migration() swaps solutions (see Fig. 2). */
  Migration();
  t = t + 1;
end while
Update_Solutions(G^t, P^t);
return Best found pattern of particles as solution;
Fig. 2. Model of WMN-PSODGA migration.
Initialization
We decide the velocity of particles by a random process considering the area size. For instance, when the area size is W × H, the velocity is decided randomly in the range from −√(W² + H²) to √(W² + H²).
Particle-Pattern
A particle is a mesh router. A fitness value of a particle-pattern is computed by the combination of mesh router and mesh client positions. In other words, each particle-pattern is a solution, as shown in Fig. 3.
Gene Coding
A gene describes a WMN. Each individual has its own combination of mesh nodes; in other words, each individual has a fitness value. Therefore, the combination of mesh nodes is a solution.
Fitness Function
The fitness function of WMN-PSODGA is used to evaluate the temporary solution of the routers' placements. The fitness function is defined as:

Fitness = α · NCMC(x_ij, y_ij) + β · SGC(x_ij, y_ij) + γ · NCMCpR(x_ij, y_ij).
Fig. 3. Relationship among global solution, particle-patterns, and mesh routers in PSO part.
This function uses the following indicators.
• NCMC (Number of Covered Mesh Clients): the number of clients covered by routers.
• SGC (Size of Giant Component): the maximum number of connected routers.
• NCMCpR (Number of Covered Mesh Clients per Router): the number of clients covered by each router. The NCMCpR indicator is used for load balancing.
WMN-PSODGA aims to maximize the value of the fitness function in order to optimize the placement of routers using the above three indicators. The weight coefficients of the fitness function are α, β, and γ for NCMC, SGC, and NCMCpR, respectively. Moreover, the weight coefficients satisfy α + β + γ = 1.
Router Replacement Methods
A mesh router has x, y positions and a velocity, and mesh routers are moved based on their velocities. There are many router replacement methods, as shown in the following. In this paper, we consider CM and RIWM.
Constriction Method (CM): CM is a method in which the PSO parameters are set to a weak stable region (ω = 0.729, C1 = C2 = 1.4955) based on the analysis of PSO by M. Clerc et al. [2,5,35].
Random Inertia Weight Method (RIWM): In RIWM, the ω parameter changes randomly from 0.5 to 1.0, while C1 and C2 are kept at 2.0. The ω can be estimated by the weak stable region; the average of ω is 0.75 [28,35].
Linearly Decreasing Inertia Weight Method (LDIWM): In LDIWM, C1 and C2 are set to 2.0, constantly. On the other hand, the ω parameter is changed linearly from the unstable region (ω = 0.9) to the stable region (ω = 0.4) as the iterations of computation increase [35,36].
Linearly Decreasing Vmax Method (LDVM): In LDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). A value Vmax, which is the maximum velocity of the particles, is considered and kept decreasing linearly as the iterations of computation increase [30,34].
Rational Decrement of Vmax Method (RDVM): In RDVM, the PSO parameters are set to the unstable region (ω = 0.9, C1 = C2 = 2.0). The Vmax is kept decreasing with the increasing number of iterations as

Vmax(x) = √(W² + H²) × (T − x) / x,

where W and H are the width and the height of the considered area, respectively, and T and x are the total number of iterations and the current iteration number, respectively [25].
3 Simulation Results
In this section, we present the simulation results. Table 1 shows the common parameters for each simulation. Figure 4 shows the visualization results after the optimization for the case of 16 mesh routers. For this scenario, the SGC is maximized because all routers are connected, while Fig. 5 shows the number of covered clients by each router. Both router replacement methods cover almost the same number of clients. Figure 6 shows the transition of the standard deviations. The standard deviation is related to load balancing: when the standard deviation increases, the numbers of mesh clients per router tend to differ among routers, whereas when it decreases, they tend to come closer to each other. The value of r in Fig. 6 is the correlation coefficient. Comparing Fig. 6(a) and Fig. 6(b), we see that the standard deviation of both router replacement methods decreases, but the correlation coefficient of RIWM decreases more than that of CM. So, the load balancing of RIWM is better than that of CM.

Table 1. The common parameters for each simulation.

Parameters                     Values
Distribution of mesh clients   Boulevard distribution
Number of mesh clients         48
Number of mesh routers         16, 18
Radius of a mesh router        3.0–3.5
Number of GA islands           16
Number of migrations           200
Evolution steps                9
Crossover rate                 0.8
Mutation rate                  0.2
Replacement methods            CM, RIWM
Area size                      32.0 × 32.0
Figure 7 shows the visualization results after the optimization for 18 mesh routers. Also for this case, the SGC is maximized because all routers are connected, while Fig. 8 shows the number of covered clients by each router. RIWM and CM cover almost the same number of clients. Figure 9 shows the transition of the standard deviations. Comparing Fig. 9(a) and Fig. 9(b), we see that the standard deviation for RIWM decreases, while for CM it does not change. So, also in the case of 18 mesh routers, the load balancing of RIWM is better than that of CM. Comparing the cases of 16 and 18 mesh routers, the performance for 18 mesh routers is better than that for 16 mesh routers.
Fig. 4. Visualization results after the optimization (16 routers): (a) CM; (b) RIWM.
Fig. 5. Number of covered clients by each router after the optimization (16 routers): (a) CM; (b) RIWM.
Fig. 6. Standard deviation for CM and RIWM (16 routers): (a) CM (r = −0.573064); (b) RIWM (r = −0.699929).
Fig. 7. Visualization results after the optimization (18 routers): (a) CM; (b) RIWM.

Fig. 8. Number of covered clients by each router after the optimization (18 routers): (a) CM; (b) RIWM.
Fig. 9. Standard deviation for CM and RIWM (18 routers): (a) CM (r = −0.0205041); (b) RIWM (r = −0.335459).
4 Conclusions
In this work, we evaluated the performance of WMNs using a hybrid simulation system based on PSO and DGA (called WMN-PSODGA). We considered the Boulevard distribution of mesh clients and two router replacement methods: CM and RIWM. We carried out simulations considering 16 and 18 mesh routers and compared the performance of CM with RIWM. The simulation results show that RIWM has better load balancing than CM. In future work, we will consider other distributions of mesh clients.
References

1. Akyildiz, I.F., Wang, X., Wang, W.: Wireless mesh networks: a survey. Comput. Netw. 47(4), 445–487 (2005)
2. Barolli, A., Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L., Takizawa, M.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering constriction and linearly decreasing Vmax methods. In: International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 111–121. Springer (2017). https://doi.org/10.1007/978-3-319-69835-9_10
3. Barolli, A., Sakamoto, S., Barolli, L., Takizawa, M.: Performance analysis of simulation system based on particle swarm optimization and distributed genetic algorithm for WMNs considering different distributions of mesh clients. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 32–45. Springer (2018). https://doi.org/10.1007/978-3-319-93554-6_3
4. Barolli, A., Sakamoto, S., Ozera, K., Barolli, L., Kulla, E., Takizawa, M.: Design and implementation of a hybrid intelligent system based on particle swarm optimization and distributed genetic algorithm. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 79–93. Springer (2018). https://doi.org/10.1007/978-3-319-75928-9_7
5. Clerc, M., Kennedy, J.: The particle swarm-explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 6(1), 58–73 (2002)
6. Franklin, A.A., Murthy, C.S.R.: Node placement algorithm for deployment of two-tier wireless mesh networks. In: Proceedings of Global Telecommunications Conference, pp. 4823–4827 (2007)
7. Girgis, M.R., Mahmoud, T.M., Abdullatif, B.A., Rabie, A.M.: Solving the wireless mesh network design problem using genetic algorithm and simulated annealing optimization methods. Int. J. Comput. Appl. 96(11), 1–10 (2014)
8. Goto, K., Sasaki, Y., Hara, T., Nishio, S.: Data gathering using mobile agents for reducing traffic in dense mobile wireless sensor networks. Mob. Inf. Syst. 9(4), 295–314 (2013)
9. Inaba, T., Elmazi, D., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A secure-aware call admission control scheme for wireless cellular networks using fuzzy logic and its performance evaluation. J. Mob. Multimed. 11(3&4), 213–222 (2015)
10. Inaba, T., Obukata, R., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of a QoS-aware fuzzy-based CAC for LAN access. Int. J. Space-Based Situated Comput. 6(4), 228–238 (2016)
11. Inaba, T., Sakamoto, S., Oda, T., Ikeda, M., Barolli, L.: A testbed for admission control in WLAN: a fuzzy approach and its performance evaluation. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 559–571. Springer (2016). https://doi.org/10.1007/978-3-319-49106-6_55
12. Lim, A., Rodrigues, B., Wang, F., Xu, Z.: k-Center problems with minimum coverage. Theor. Comput. Sci. 332(1–3), 1–17 (2005)
13. Maolin, T., et al.: Gateways placement in backbone wireless mesh networks. Int. J. Commun. Netw. Syst. Sci. 2(1), 44–50 (2009)
14. Matsuo, K., Sakamoto, S., Oda, T., Barolli, A., Ikeda, M., Barolli, L.: Performance analysis of WMNs by WMN-GA simulation system for two WMN architectures and different TCP congestion-avoidance algorithms and client distributions. Int. J. Commun. Netw. Distrib. Syst. 20(3), 335–351 (2018)
15. Muthaiah, S.N., Rosenberg, C.P.: Single gateway placement in wireless mesh networks. In: Proceedings of the 8th International IEEE Symposium on Computer Networks, pp. 4754–4759 (2008)
16. Naka, S., Genji, T., Yura, T., Fukuyama, Y.: A hybrid particle swarm optimization for distribution state estimation. IEEE Trans. Power Syst. 18(1), 60–68 (2003)
17. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization. Swarm Intell. 1(1), 33–57 (2007)
18. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study of simulated annealing and genetic algorithm for node placement problem in wireless mesh networks. J. Mob. Multimed. 9(1–2), 101–110 (2013)
19. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A comparison study of hill climbing, simulated annealing and genetic algorithm for node placement problem in WMNs. J. High Speed Netw. 20(1), 55–66 (2014)
20. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: A simulation system for WMN based on SA: performance evaluation for different instances and starting temperature values. Int. J. Space-Based Situated Comput. 4(3–4), 209–216 (2014)
21. Sakamoto, S., Kulla, E., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Performance evaluation considering iterations per phase and SA temperature in WMN-SA system. Mob. Inf. Syst. 10(3), 321–330 (2014)
22. Sakamoto, S., Lala, A., Oda, T., Kolici, V., Barolli, L., Xhafa, F.: Application of WMN-SA simulation system for node placement in wireless mesh networks: a case study for a realistic scenario. Int. J. Mob. Comput. Multimed. Commun. (IJMCMC) 6(2), 13–21 (2014)
23. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: An integrated simulation system considering WMN-PSO simulation system and network simulator 3. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 187–198. Springer (2016). https://doi.org/10.1007/978-3-319-49106-6_17
24. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation and evaluation of a simulation system based on particle swarm optimisation for node placement problem in wireless mesh networks. Int. J. Commun. Netw. Distrib. Syst. 17(1), 1–13 (2016)
25. Sakamoto, S., Oda, T., Ikeda, M., Barolli, L., Xhafa, F.: Implementation of a new replacement method in WMN-PSO simulation system and its performance evaluation. In: The 30th IEEE International Conference on Advanced Information Networking and Applications (AINA-2016), pp. 206–211 (2016)
26. Sakamoto, S., Obukata, R., Oda, T., Barolli, L., Ikeda, M., Barolli, A.: Performance analysis of two wireless mesh network architectures by WMN-SA and WMN-TS simulation systems. J. High Speed Netw. 23(4), 311–322 (2017)
27. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Implementation of an intelligent hybrid simulation systems for WMNs based on particle swarm optimization and simulated annealing: performance evaluation for different replacement methods. Soft Comput. 23(9), 3029–3035 (2017)
28. Sakamoto, S., Ozera, K., Barolli, A., Ikeda, M., Barolli, L., Takizawa, M.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering random inertia weight method and linearly decreasing Vmax method. In: International Conference on Broadband and Wireless Computing, Communication and Applications, pp. 114–124. Springer (2018). https://doi.org/10.1007/978-3-319-69811-3_10
29. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Implementation of intelligent hybrid systems for node placement problem in WMNs considering particle swarm optimization, hill climbing and simulated annealing. Mob. Netw. Appl. 23(1), 27–33 (2017)
30. Sakamoto, S., Ozera, K., Ikeda, M., Barolli, L.: Performance evaluation of WMNs by WMN-PSOSA simulation system considering constriction and linearly decreasing inertia weight methods. In: International Conference on Network-Based Information Systems, pp. 3–13. Springer (2017). https://doi.org/10.1007/978-3-319-65521-5_1
31. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of intelligent hybrid systems for node placement in wireless mesh networks: a comparison study of WMN-PSOHC and WMN-PSOSA. In: International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, pp. 16–26. Springer (2017)
32. Sakamoto, S., Ozera, K., Oda, T., Ikeda, M., Barolli, L.: Performance evaluation of WMN-PSOHC and WMN-PSO simulation systems for node placement in wireless mesh networks: a comparison study. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 64–74. Springer (2017). https://doi.org/10.1007/978-3-319-59463-7_7
33. Sakamoto, S., Ozera, K., Barolli, A., Barolli, L., Kolici, V., Takizawa, M.: Performance evaluation of WMN-PSOSA considering four different replacement methods. In: International Conference on Emerging Internetworking, Data & Web Technologies, pp. 51–64. Springer (2018). https://doi.org/10.1007/978-3-319-75928-9_5
34. Schutte, J.F., Groenwold, A.A.: A study of global optimization using particle swarms. J. Global Optim. 31(1), 93–108 (2005)
35. Shi, Y.: Particle swarm optimization. IEEE Connections 2(1), 8–13 (2004)
36. Shi, Y., Eberhart, R.C.: Parameter selection in particle swarm optimization. In: Evolutionary Programming VII, pp. 591–600 (1998). https://doi.org/10.1007/BFb0040810
37. Vanhatupa, T., Hannikainen, M., Hamalainen, T.: Genetic algorithm to optimize node placement and configuration for WLAN planning. In: Proceedings of the 4th IEEE International Symposium on Wireless Communication Systems, pp. 612–616 (2007)
38. Wang, J., Xie, B., Cai, K., Agrawal, D.P.: Efficient mesh router placement in wireless mesh networks. In: Proceedings of IEEE International Conference on Mobile Adhoc and Sensor Systems (MASS-2007), pp. 1–9 (2007)
Consideration of Presentation Timing in Bicycle Navigation Using Smart Glasses

Takahiro Uchiya(B) and Ryo Futamura

Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya, Aichi 466-8555, Japan
[email protected], [email protected]
Abstract. In recent years, applications for bicycle navigation and navigation devices attached near bicycle handlebars have appeared, and the numbers of outings and home deliveries using them are increasing. However, when checking and operating these navigation systems, one must devote close attention to the handlebars, so safety confirmation of the surroundings can be neglected. As described herein, we use smart glasses as a means to resolve this difficulty. Investigations of smart glasses for cyclists used overseas have shown that they present much information that is unnecessary for navigation, making them unsuitable for this purpose. Therefore, this research examines the navigation application specifically and verifies the contents necessary for a system suitable for bicycle riding. We examine and evaluate the appropriate timing for presenting information.
1 Introduction
In recent years, applications for bicycle navigation and bicycle navigation devices attached near bicycle handlebars have appeared. Furthermore, the numbers of outings and home deliveries using them are increasing. However, to check the navigation while guiding the bicycle, users must devote close attention to the handlebars, and safety confirmation of their own surroundings is neglected. Therefore, we use smart glasses for cyclists to resolve this difficulty. Investigations of existing smart glasses for cyclists have shown that much information unnecessary for navigation is given, and that such systems are unsuitable for navigation because the information is presented continuously. Therefore, we specifically examine the operation of smart glasses for cyclists as a navigation system, with the aim of designing a system that is suitable for use when riding a bicycle. Specifically, we propose a method by which navigation information is not presented constantly, but the direction of travel can be understood in a short time. As described herein, we examine and evaluate the appropriate presentation timing of navigation information.
2 Smart Glasses
Smart glasses are wearable devices worn on the head like eyeglasses. Many smart glasses display information superimposed on the actual scene. Because of this feature, they are sometimes called "augmented reality (AR) glasses". Another feature is that such
glasses generally do not work standalone: they work by connecting to a smartphone or the internet using Wi-Fi, Bluetooth, etc. Moreover, some have controllers and interfaces that can execute commands and be operated by voice recognition. Smart glasses have been studied for military purposes since the 1970s. Currently, several companies sell such devices, such as Google Glass by Alphabet, Inc. and Moverio by Seiko Epson Corp. Some smart glasses on the market are targeted at cyclists and athletes. Everysight's Raptor [1] is an example of smart glasses for cyclists. For this study, we designed a system for evaluation experiments using the Sony smart glasses for developers "SmartEyeglass SED-E1" [2] (Fig. 1).
Fig. 1. Sony SmartEyeglass SED-E1.
3 Presentation Timing Evaluation Experiment

3.1 Experiment Purpose
This experiment was conducted to examine the presentation of navigation information about the direction of travel at an intersection while using smart glasses. Specifically, the timing of presentation is evaluated using the position relative to the intersection or the number of seconds until the intersection is reached. Ito et al. [3] concluded that an appropriate time exists for presenting navigation information to motorcycle drivers. Therefore, it is assumed for this study that there is an appropriate timing for presenting navigation information when using a bicycle as well. We assess the relation between speed and distance and measure the time necessary to reach an intersection. In this way, the distance to the intersection and the time to reach the intersection that are appropriate for the presentation timing were obtained and evaluated. In addition, Miyazaki et al. [4] report that differences exist in traffic safety awareness depending on the means of transportation. For this reason, participants are grouped by their usual means of transportation, and each group is considered and evaluated separately.

3.2 Experiment Environment
Participants examined for this study were 15 male students in their 20s. The experiment site was a road at our university, as presented in Fig. 2. Smart glasses (SmartEyeglass SED-E1) and smartphones (Xperia Z, Galaxy S10) were used. The bicycle is a cross bike suitable for long distances, assuming travel over a long distance that requires navigation. The information presentation time was set to 1.5 s, reflecting the fact that a visual inspection which does not violate the Road Traffic Act is approximately 2 s.
The information presented in Fig. 3 is of three types (left turn, right turn, and straight ahead), based on the information used in the experiment reported by Ito et al. [3]. The information is presented near the center of the field of view. Figure 4 shows an image of the information as actually presented. The collected data are the speed and the distance to the intersection at the time when navigation information is presented. The distance from the navigation provisioning point to the intersection is defined by the orange arrow in Fig. 5. The collected data are acquired from the location information and speed given by the smartphone GPS. The position information comprises the latitude and longitude of the current position. The distance to the intersection is found by spherical trigonometry using the latitude and longitude of the current position and those of the position assumed to be the intersection. The speed is calculated from the distance between the GPS fix acquired at the time of information presentation and the fix acquired immediately before it, divided by the difference between their acquisition times.
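The distance computation by spherical trigonometry can be realized, for example, with the haversine formula, and the speed follows from two successive fixes. The sketch below is our own illustration of that calculation; the function names and the (lat, lon, unix_time) fix format are assumptions, not part of the study's software.

import math

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance in metres between two GPS fixes,
    # via spherical trigonometry (haversine formula).
    r = 6371000.0  # mean Earth radius [m]
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def speed_kmh(fix_prev, fix_now):
    # Speed from two (lat, lon, unix_time) fixes, matching the
    # difference-based calculation described above.
    d = haversine_m(fix_prev[0], fix_prev[1], fix_now[0], fix_now[1])
    dt = fix_now[2] - fix_prev[2]
    return 3.6 * d / dt if dt > 0 else 0.0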
Fig. 2. Experiment place details.
Fig. 3. Provided navigation information.
3.3 Experiment Conditions
The minimum necessary conditions for conducting the experiment are described below.
Fig. 4. Navigation image using smart glasses.
Fig. 5. Distance between navigation point and intersection (orange arrow).
The first condition is that the participant be able to ride a bicycle with the naked eye or with contact lenses on. This condition is imposed because smart glasses cannot be used together with eyeglasses: a person who cannot ride a bicycle with the naked eye or with contact lenses cannot ride safely and is therefore unsuitable for participation. The second condition is that the weather be sunny or cloudy and that the wind not affect the bicycle riding behavior. The reason for imposing conditions on the weather is to reduce weather effects when riding a bicycle. Fundamentally, the weather when using a bicycle is often sunny or cloudy, and we inferred that data obtained in rainy weather would be meaningless for this study. Therefore, we chose not to conduct experiments in rainy weather. Next, we considered the wind velocity. If the handlebar operation becomes difficult because of wind effects, the user might concentrate on handlebar operations even when no information is presented, which would confound the results.
The third condition is that the experimental time period be daytime, since experiments can be conducted safely only during daytime.

3.4 Experiment Procedure
The experiment procedures are explained below.
Step 1: Explain to the participant how to use the smart glasses.
Step 2: Inform the participant of the experimental precautions.
Step 3: The participant becomes accustomed to operations while riding a bicycle.
Step 4: The participant runs the course nine times.
In step 1, we first ask the participant to wear the smart glasses and adjust their position so that the participant can see the presented information. After adjustment, we ask them to check the three types of arrow images presented in the experiment and to report the direction of each arrow. If the direction of the arrow in the presented image cannot be recognized, the experiment does not proceed. Participants able to recognize the direction of the arrow correctly proceed to step 2. In step 2, the experiment contents are explained while confirming the place where the experiment will be conducted; for example, the participant is told the position of the intersection and the number of runs. The instructions given to the participants are presented below.
• When driving, ride near the road center line.
• Ride as usual.
• Be sure to return to the starting point after each run.
The reason for giving these instructions is to arrange the experiment conditions so that the obtained data will not be affected. In step 3, the participant is asked to actually ride the bicycle. The purpose of riding at this time is for the participant to find the optimal personal presentation timing and to become accustomed to the experiment. This experiment uses the controller attached to the smart glasses, so it was necessary to become accustomed to it beforehand. Additionally, by becoming accustomed to the experiment, the number of outlier data points is expected to decrease. In step 4, the participant is presented information while riding the bicycle. The experiment was temporarily interrupted only when a car passed and commenced again after the car had passed. During the experiment, we checked the participant and the surrounding conditions continuously.

3.5 Experiment Results
The experimentally obtained results are presented below. Figure 6 portrays the relation between the speed [km/h] and the distance [m] to the intersection at the timing at which the participant was presented the information. Figure 7 shows the distribution of the time taken from the presentation position to arrival at the intersection. In addition, the
data in Fig. 6 and Fig. 7 are classified into two groups according to whether or not the participant drives a car or a motorcycle four or more days a week. Group 1 (blue) is the group whose usual movement is by vehicle; group 2 (orange) is the group whose usual movement is not by vehicle. Table 1 shows, for group 1, the mean and standard deviation of the speed, and the mean, standard deviation, median and 95% confidence interval of both the distance to the intersection and the estimated arrival time. Table 2 shows the same quantities for group 2.
Fig. 6. Relation between distance and speed at the time of presentation.
Fig. 7. Distribution of time required from the presentation position to the arrival at the intersection.
3.6 Consideration
[Group 1: Usual movement is by vehicle] Figure 6 shows that group 1 was often presented information at a position 35 [m] or more away from the intersection (81% of the participant data are applicable). This appears to be related to the timing of issuing the turn indicator, which, upon investigation, was found to be about 30 [m] before the intersection.
Table 1. Results for group 1.

Number of data                                         22
Average of velocity [km/h]                             12.63
S.D. of velocity                                       2.24
Average of distance to the intersection [m]            42.30
S.D. of distance                                       7.43
Median of distance [m]                                 42.45
95% confidence interval of distance [m]                39.19–45.51
Estimated arrival time [s]                             12.49
S.D. of estimated arrival time                         3.13
Median of estimated arrival time [s]                   13.02
95% confidence interval of estimated arrival time [s]  11.18–13.80
Table 2. Results for group 2.

Number of data                                         95
Average of velocity [km/h]                             15.25
S.D. of velocity                                       2.47
Average of distance to the intersection [m]            29.37
S.D. of distance                                       7.20
Median of distance [m]                                 30.00
95% confidence interval of distance [m]                27.92–30.82
Estimated arrival time [s]                             7.24
S.D. of estimated arrival time                         2.47
Median of estimated arrival time [s]                   7.00
95% confidence interval of estimated arrival time [s]  6.49–7.51
Considering this point, if the indicator is issued 30 [m] before the intersection, then the navigation information should be issued at a position more than 30 [m] away. Therefore, the average, median, and 95% confidence interval of the distance from the intersection are inferred to be reasonable values. We consider that this group can form the intention of "turning/going straight" early and can then concentrate on checking the safety of the surroundings until entering the intersection. The rationale for this inference is that the predicted arrival time is 10 s or more for the mean, the median, and the 95% confidence interval. Based on these findings, it can be inferred that people who usually drive automobiles on public roads tend to use the system with the same feeling as the navigation used in cars.
[Group 2: Usual movement is not by vehicle] Participants in group 2 were often presented information in the range of 20 [m] – 40 [m] from the intersection (81% of the participant data are applicable). The average and median values are about 30 [m], and the 95% confidence interval is about 3 [m] wide. The average velocity in group 2 is 15.25 [km/h], which corresponds to travelling about 4 [m] in 1 [s]. Therefore, if information were presented at 30 [m] from the intersection, the navigation information could be presented while passing through the 95% confidence interval. Therefore, a distance of 30 [m] from the intersection is appropriate when the presentation timing is defined by distance. Regarding the predicted arrival time, information was often presented at a position reached within 6–8 s, which indicates that the participants might consider how much extra time should be allowed before the intersection given their own speed. However, if the appropriate information presentation timing were defined as "time to reach the intersection", the presentation position would differ depending on the speed at the time of use: the faster the speed, the farther from the intersection the presentation would occur, and at slower speeds it would occur closer to the intersection. If the presentation position becomes variable in this way, the information might be associated with the wrong intersection, especially when intersections are closely spaced. Because this would not function as a navigation system, it is desirable that the information presentation timing not be variable. Therefore, the timing of information presentation should be defined by "distance". The appropriate timing for group 2 is regarded as 30 [m] before the intersection.
[Statistical Analysis] The Mann–Whitney U test was performed to confirm whether a significant difference exists between the two groups with respect to the median distance from the intersection and the median predicted arrival time. A significant difference was found at the 1% significance level (distance from the intersection: p = 1.25 × 10⁻⁸; estimated arrival time: p = 7.98 × 10⁻⁹). Therefore, the results suggest that the appropriate timing for presenting information differs depending on the usual means of transportation. In addition, the results indicated that the appropriate information presentation timing of group 1 was farther from the intersection than that of group 2, and that the predicted arrival time was longer. This is regarded as a difference in awareness of traffic safety, as described by Miyazaki et al. [4].
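A test of this kind can be run with SciPy's mannwhitneyu. The following is a minimal sketch; the sample values are illustrative placeholders, not the study's measurements.

from scipy.stats import mannwhitneyu

# Hypothetical per-trial presentation distances [m] for the two groups.
group1_dist = [42.5, 39.8, 45.1, 41.0]   # usual movement by vehicle
group2_dist = [30.2, 28.7, 31.5, 29.9]   # usual movement not by vehicle

stat, p = mannwhitneyu(group1_dist, group2_dist, alternative="two-sided")
print(f"U = {stat}, p = {p:.3g}")  # reject at the 1% level if p < 0.01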
4 Conclusion
This study was conducted to design a system suitable for use when riding a bicycle, particularly addressing the operation of smart glasses for cyclists as a navigation system. As described in this paper, an experiment on the presentation timing of navigation information, which is necessary for designing such a system, was conducted using smart glasses, and we evaluated and considered the suitable presentation timing. For the presentation timing evaluation experiment, participants were divided into two groups. For Group 1, the timing at which the navigation was presented was at a position at least 35 [m] away from the
intersection. For Group 2 participants, results showed that the appropriate presentation timing was at a position 30 [m] from the intersection. These results demonstrated that the appropriate information presentation timing is influenced by the usual means of transportation. In the future, we plan to increase the number of participants and to improve the reliability of the experimentally obtained results.
References

1. Raptor. https://everysight.com/product/raptor/
2. SmartEyeglass SED-E1. https://developer.sony.com/ja/develop/smarteyeglass-sed-e1
3. Ito, K., Nishimura, H., Ogi, T.: Evaluation of the information presentation timing for motorcycle head-up display. Trans. JSME 83(853) (2017)
4. Miyazaki, K., Mikuni, C., Mikuni, S.: A consideration of consciousness of a traffic safety using a traffic safety test. In: Papers of Research Meeting on Civil Engineering Planning 39, No. 357 (2009)
Graph Convolution Network for Urban Mobile Traffic Prediction

Changliang Yu1, Zhiyang Ye2(B), and Nan Zhao2

1 Wuhan Fisilink Microelectronics Technology Co., Ltd., Wuhan, China
[email protected]
2 Hubei University of Technology, Wuhan, China
[email protected], [email protected]
Abstract. Cellular traffic prediction has always been a great challenge in the communication field. Efficiently and correctly predicting future mobile traffic can improve the scheduling of communication resources. However, changes in mobile traffic are affected by nearby areas; therefore, prediction methods should be more sensitive to spatial features. Recently, deep learning methods have shown excellent ability in extracting spatial-temporal features. In this work, a mobile traffic prediction approach based on graph convolution networks is proposed. We evaluate our model on the Telecom Italia dataset and compare it with other traditional models. Experiments show that our model can significantly improve the prediction accuracy.
1 Introduction
Urban mobile traffic prediction [1, 2] aims to study patterns of crowd communication behavior and to make real-time evaluations of the mobile traffic state, which can contribute to improving the quality of communication services. Actually, the change of mobile traffic is affected by geographical location and population distribution effects []. Therefore, improving the accuracy of mobile traffic prediction is still a challenge for communication researchers. Recently, a large number of researchers have devoted themselves to the research of mobile traffic prediction methods [4], such as traditional time series forecasting analysis methods. Cellular traffic prediction is regarded as a time series analysis issue, mainly depending on statistical models, for example, the Kalman filter model [5], auto-regressive integrated moving average (ARIMA) [6], support vector regression (SVR) [7] and alpha-stable models [8]. Due to the influence of factors such as user mobility and the diversity of user needs, the pattern of cellular traffic has become more and more complicated, and it is increasingly obvious that these linear models are no longer suitable for mobile traffic prediction [9]. In recent years, deep learning has been successfully applied in human behavior prediction and traffic flow prediction, and more and more researchers have studied mobile traffic prediction techniques with deep learning, such as LSTM [10], which however cannot fully capture the features of the neighboring cells of the input data by automatic encoding. For the purpose of obtaining the spatial-temporal correlation in the original data, a convolutional neural network (CNN) based on mobile traffic data is proposed in [11]. However, most of the above work is based on grid-structured mobile traffic data, without considering cellular traffic data with a graph structure.
Based on the above considerations, this paper proposes a graph convolution network (GCN) based on graph-structured mobile traffic data to predict citywide cellular traffic. To acquire the spatial-temporal correlations effectively, the graph convolution neural network framework models the hourly, daily and weekly temporal components of cellular traffic. In the graph convolution module, the GCN is utilized to capture the spatial features of citywide cellular traffic. By adaptively aggregating the outputs of the above temporal components, the final traffic prediction result is obtained. To evaluate and compare the performance of the proposed cellular traffic prediction technique, a large number of experiments are executed on real-world cellular traffic datasets. The structure of the paper is as follows. The prediction problem formulation is described in Sect. 2. Section 3 investigates the graph convolution cellular traffic prediction method. Simulation results are displayed in Sect. 4. Section 5 concludes this paper.
2 Problem Formulation
In this paper, the traffic network is seen as an undirected graph G = (ν, ε, A), where ν represents the set of N nodes, ε is the set of edges between the nodes, and A is the adjacency matrix of the graph G. Each area in the traffic network is regarded as a node in the undirected graph G. The traffic of node i at time t is expressed as x_i^t, and the cellular traffic of the N nodes over τ time slices can be expressed as:

X = (X_1, X_2, . . . , X_τ),    (1)

where X_t = (x_1^t, x_2^t, . . . , x_N^t)^T expresses the mobile traffic of the N nodes at time t. Then, the predicted mobile traffic at the N nodes is represented as:

Y = (Y_1, Y_2, . . . , Y_N),    (2)

where Y_i = (y_i^{t+1}, y_i^{t+2}, . . . , y_i^{t+t_p})^T denotes the mobile traffic of node i over the next t_p time slices after time t, and y_i^t expresses the traffic of node i at time t.
3 Traffic Prediction Algorithm
In this section, the traffic prediction model is described, consisting of a fusion layer and three independent temporal components with hourly, daily and weekly inputs, respectively. In the hourly component, in order to capture the influence of the mobile traffic of adjacent periods on the predicted traffic, the traffic data of the t_h hours before t_s is selected as the input data. It is given by

X_h = (X_{t_s−t_h+1}, X_{t_s−t_h+2}, . . . , X_{t_s}),  1 ≤ t_h ≤ 24 − t_p.    (3)
In order to obtain the influence degree of the predicted traffic in the same time period on the past few days, the mobile traffic of the td days before the current time ts is intercepted as the input of the daily component, which can be expressed as: Xd = (Xt −24×(t +1)+1 , . . . , Xt −24×(t +1)+t , s s p d d Xts −24×t +1 , . . . , Xts −24×td +t p , . . . , d Xts −24+1 , . . . , Xts −24+t p ), 1 ≤ td ≤ 6.
(4)
Similarly, the input of the weekly component is composed of the same time period in the $t_w$ weeks before the current time $t_s$, which can be expressed as:

$$X_w = (X_{t_s - 24 \times 7 (t_w+1) + 1}, \ldots, X_{t_s - 24 \times 7 (t_w+1) + t_p},\; X_{t_s - 24 \times 7 t_w + 1}, \ldots, X_{t_s - 24 \times 7 t_w + t_p},\; \ldots,\; X_{t_s - 24 \times 7 + 1}, \ldots, X_{t_s - 24 \times 7 + t_p}), \quad t_w \ge 1. \quad (5)$$
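The slicing of the three temporal inputs can be sketched in a few lines of Python. The array and function names are ours, and the sketch keeps only the segments for the days/weeks $t_d, \ldots, 1$ (Eqs. (4)–(5) also list a $(t_d+1)$-th segment), so it is a simplified approximation of the exact index sets:

```python
import numpy as np

def temporal_inputs(traffic, ts, tp, th=1, td=1, tw=3):
    """Slice hourly, daily and weekly input segments from an hourly
    traffic array of shape [T, N] (cf. Eqs. (3)-(5))."""
    hourly = traffic[ts - th:ts]                       # t_h hours just before t_s
    daily = np.concatenate([                           # same hours, past t_d days
        traffic[ts - 24 * d:ts - 24 * d + tp] for d in range(td, 0, -1)
    ])
    weekly = np.concatenate([                          # same hours, past t_w weeks
        traffic[ts - 24 * 7 * w:ts - 24 * 7 * w + tp] for w in range(tw, 0, -1)
    ])
    return hourly, daily, weekly
```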
Then, $X_h$, $X_d$, and $X_w$ are passed into the graph convolution networks of the three components, respectively. The input of the $l$-th graph convolution layer of the hourly component can be expressed as:

$$X^{(l)} = (X_1, X_2, \ldots, X_T) \in \mathbb{R}^{N \times T}, \quad (6)$$

where $1 \le l \le 2$ and $T = t_h$ is the length of the hourly component. Moreover, since the cellular traffic network is modeled as the undirected graph $G$, the GCN [3] is applied to acquire the features of the neighboring nodes. The convolution operation on $X^{(l)}$ with the convolution kernel $g_\theta(\Lambda)$ is given by

$$X^{(g)} = g_\theta * X = g_\theta(\tilde{L})\, X^{(l)}, \quad (7)$$
where $\tilde{L} = \frac{2}{\lambda_{max}} L - I_N$, $L$ is the normalized Laplacian matrix of $G$, and $\lambda_{max}$ is the maximum eigenvalue of $L$. Considering that the dimensions of each component's output are inconsistent with the target dimension, a fully connected (FC) layer is used to make the dimensions of the three components' outputs the same. This FC layer is a convolutional neural network. By passing the output $X^{(g)}$ of the last graph convolution module through it, the output is given by

$$\hat{Y}_f = \sum_{i=1}^{N} w_{ii}\, X^{(L)} + B_f, \quad (8)$$
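The Chebyshev-style graph convolution behind Eq. (7) can be sketched as follows. The dense-matrix form and the per-order scalar weights are our simplifications (the experiments in Sect. 4 use Chebyshev order K = 3):

```python
import numpy as np

def scaled_laplacian(A):
    """L~ = (2 / lambda_max) * L - I for adjacency matrix A (cf. Eq. 7)."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian
    lam_max = np.linalg.eigvalsh(L).max()
    return (2.0 / lam_max) * L - np.eye(len(A))

def cheb_conv(X, A, theta):
    """Chebyshev graph convolution: sum_k theta_k T_k(L~) X, K = len(theta)."""
    Lt = scaled_laplacian(A)
    Tk_prev, Tk = np.eye(len(A)), Lt
    out = theta[0] * X                                 # T_0(L~) = I
    for k in range(1, len(theta)):
        out += theta[k] * (Tk @ X)
        Tk_prev, Tk = Tk, 2 * Lt @ Tk - Tk_prev        # Chebyshev recurrence
    return out
```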
where $w_{ii}$ are the learnable weight parameters and $B_f$ is the bias matrix of the FC layer. The final output is then obtained as

$$\hat{Y} = W_h \otimes \hat{Y}_h + W_d \otimes \hat{Y}_d + W_w \otimes \hat{Y}_w, \quad (9)$$

where $\otimes$ is the Hadamard product, $\hat{Y}_h$, $\hat{Y}_d$, and $\hat{Y}_w$ represent the outputs of the three components, respectively, and $W_h \in \mathbb{R}^{N \times t_p}$, $W_d \in \mathbb{R}^{N \times t_p}$, and $W_w \in \mathbb{R}^{N \times t_p}$ are learnable weight matrices. Then, by computing the loss between the ground truth $Y_i$ and the predicted result $\hat{Y}_i$, the loss function is given by

$$L(\theta) = \frac{1}{2} \sum_i \left\| Y_i - \hat{Y}_i \right\|^2, \quad (10)$$

where $\theta$ represents all trainable parameters of the graph convolution network.
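The adaptive fusion of Eq. (9) and the loss of Eq. (10) can be sketched as follows (the function names are ours):

```python
import numpy as np

def fuse(Yh, Yd, Yw, Wh, Wd, Ww):
    """Adaptive fusion (Eq. 9): element-wise weighted sum of components."""
    return Wh * Yh + Wd * Yd + Ww * Yw   # '*' is the Hadamard product

def loss(Y_true, Y_pred):
    """Squared-error loss of Eq. (10)."""
    return 0.5 * np.sum((Y_true - Y_pred) ** 2)
```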
4 Performance Evaluation

In the original dataset, the sampling period is 10 min. As a consequence, the traffic of many nodes is zero and the experimental dataset is very large, which leads to low efficiency or instability of the network. Based on these considerations, we aggregate the original dataset into hourly values, so that each data point represents the mobile traffic within one hour in a certain area. Then, in order to reduce the computational difficulty of the network, the Min-Max normalization strategy is adopted to scale the traffic data to [0, 1]. The predicted values are restored to the original scale when evaluating the network performance. In addition, we selected the last 263 h of traffic as the test dataset.

The graph convolution network is optimized using the Adam optimizer. The training process uses 100 epochs and batch size 16. The adaptive learning rate δ is set to 0.01. The graph convolution networks have 64 filters and the number of terms of the Chebyshev polynomial is K = 3. The lengths of the input data of the three temporal components are Δh = 1, Δd = 1 and Δw = 3, respectively, and the prediction size tp is 6.

Based on the above experimental parameters, and with the aim of verifying the effectiveness of the graph convolution traffic prediction method, we used two types of data (call-in and call-out) for the experiments. Figure 1 and Fig. 2 show the comparison of the predicted traffic and the ground-truth traffic of the call-in and call-out services in randomly selected areas, respectively. Even during the rapid traffic changes from the 50th hour to the 100th hour, the prediction results closely match the ground-truth traffic.
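The Min-Max scaling used above, and its inversion for evaluation, can be sketched as follows (the function names are ours):

```python
def minmax_fit(x):
    """Return the range of the training traffic values."""
    return x.min(), x.max()

def minmax_scale(x, lo, hi):
    """Scale traffic values to [0, 1]."""
    return (x - lo) / (hi - lo)

def minmax_restore(x01, lo, hi):
    """Map predictions back to the original traffic scale."""
    return x01 * (hi - lo) + lo
```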
Fig. 1. Traffic prediction results with call in.
The prediction performance of different learning techniques is then analyzed with RMSE and MAE. For comparison, the history average (HA), ARIMA and SVR learning approaches are also considered. The two evaluation results for the call-in and call-out services are plotted in Fig. 3 and Fig. 4, respectively. Figures 3 and 4 clearly show that the prediction method based on the graph convolution network achieves the best prediction results
Fig. 2. Traffic prediction results with call out.
Fig. 3. RMSE with different approaches.
Fig. 4. MAE with different approaches.
compared with the other three prediction methods. The contribution of the proposed model is mainly attributed to three aspects. Firstly, the original data is processed to improve the stability of the model. Secondly, three temporal components are established to capture the dependencies in the temporal dimension. Finally, the features of adjacent areas are captured by the graph convolution network.
5 Conclusion

This paper proposed a traffic prediction technique based on a GCN. Due to the spatial-temporal correlation of mobile traffic, components for three temporal dimensions are established, and a graph convolution network is applied to graph-structured mobile data. The traffic can be predicted efficiently by treating the mobile traffic data as a graph. Experimental results demonstrate that the proposed mobile traffic prediction method based on the graph convolutional network achieves better performance in terms of RMSE and MAE compared with traditional approaches.
References
1. Liu, Z., Li, Z., Wu, K., Li, M.: Urban traffic prediction from mobility data using deep learning. IEEE Netw. 32(6), 40–46 (2018)
2. Wang, X., et al.: Spatio-temporal analysis and prediction of cellular traffic in metropolis. IEEE Trans. Mob. Comput. 18(9), 2190–2202 (2019)
3. Zhao, N., Ye, Z., Pei, Y., Liang, Y.-C., Niyato, D.: Spatial-temporal attention-convolution network for citywide cellular traffic prediction. IEEE Commun. Lett. 24(11), 2532–2536 (2020)
4. Aceto, G., Bovenzi, G., Ciuonzo, D., Montieri, A., Persico, V., Pescapé, A.: Characterization and prediction of mobile-app traffic using Markov modeling. IEEE Trans. Netw. Serv. Manage. 18(1), 907–925 (2021)
5. Li, R., Zhao, Z., Zheng, J., Mei, C., Cai, Y., Zhang, H.: The learning and prediction of application-level traffic data in cellular networks. IEEE Trans. Wireless Commun. 16(6), 3899–3912 (2017) 6. Xu, F., et al.: Big data driven mobile traffic understanding and forecasting: a time series approach. Serv. Comput. 9(5), 796–805 (2016) 7. Li, R., Zhao, Z., Zhou, X., Palicot, J., Zhang, H.: The prediction analysis of cellular radio access network traffic: from entropy theory to networking practice. IEEE Commun. Mag. 52(6), 234–240 (2014) 8. Sun, H., Liu, H.X., Xiao, H., He, R.R., Ran, B.: Short term traffic forecasting using the local linear regression model. In: Proceedings of 82nd Annual Meeting of the Transportation Research Board, Washington, DC, USA, Jan 2003, pp. 1–30 (2003) 9. Rago, A., Piro, G., Boggia, G., Dini, P.: Multi-task learning at the mobile edge: an effective way to combine traffic classification and prediction. IEEE Trans. Veh. Technol. 69(9), 10362–10374 (2020) 10. Sapankevych, N.I., Sankar, R.: Time series prediction using support vector machines: a survey. IEEE Comput. Intell. Mag. 4(2), 24–38 (2009) 11. Qiu, C., Zhang, Y., Feng, Z., Zhang, P., Cui, S.: Spatial-temporal wireless traffic prediction with recurrent neural network. IEEE Wireless Commun. Lett. 7(4), 554–557 (2018)
Deep Reinforcement Learning for Task Allocation in UAV-enabled Mobile Edge Computing

Changliang Yu1, Wei Du2(B), Fan Ren2, and Nan Zhao2

1 Wuhan Fisilink Microelectronics Technology Co., Ltd., Wuhan, China
[email protected]
2 Hubei University of Technology, Wuhan, China
[email protected]
Abstract. Mobile edge computing (MEC) is a new technology for reducing the network delay of computing tasks. Since unmanned aerial vehicles (UAVs) have the advantages of strong mobility and reliability, UAV-assisted MEC systems are often adopted and researched to provide computing services over wide areas. However, since the task allocation problems are highly nonconvex, it is difficult to reach the optimal solution. In this paper, deep reinforcement learning is used to solve the task allocation problem in the UAV-assisted MEC system. With the modeling of the UAV-assisted MEC system, the task allocation problems are formulated. A Markov Decision Process (MDP) is developed to handle the nonconvexity of the task allocation problem. Since the MDP has a continuous action space, a twin delayed deep deterministic policy gradient algorithm is suggested for obtaining the joint optimal task allocation strategy. Experiment results show the better performance of the proposed method compared with other optimization approaches.
1 Introduction

As a new technique for solving the network delay problem, Mobile Edge Computing (MEC) [1] places cloud computing resources at the edge of the mobile network for resource-intensive applications. However, the traditional mobile edge server is usually installed in a fixed cellular base station, which cannot effectively provide computation offloading services for mobile devices in the Internet of Things in case of emergencies [2]. To solve this problem, unmanned aerial vehicles (UAVs) [3] with low cost and strong mobility are used to enhance the equipment's connectivity in the MEC system. Since the communication, computing and storage resources of mobile users (MUs) are very limited, we propose a UAV-assisted MEC task allocation method.

However, the task allocation problems of the UAV-assisted MEC system have strongly non-convex characteristics, and it is difficult to find a globally optimal method. Scholars have proposed a rich set of methods, such as the greedy algorithm [4], Q-learning [5], iterative algorithms [6], and Lyapunov optimization [7]. Although these methods obtain suboptimal solutions for nonconvex optimization problems, some of this work does not take into
account the exact information of the system, so it is difficult to find the best strategy, and some of it is unsuitable for dealing with the continuous high-dimensional action spaces of joint optimization problems. In this article, a deep reinforcement learning (DRL) method is proposed for the task allocation problems of the UAV-assisted MEC system. The task allocation problem of the UAV-assisted MEC system is described as a Markov Decision Process (MDP). Then, the twin delayed deep deterministic policy gradient (TD3) technique is adopted to seek the optimal strategy in the continuous action space of the MDP. Simulation results show that this optimization strategy has better performance than others.

The paper is organized as follows. Section 2 describes the modeling of the system and the problem. In Sect. 3, a DRL method is suggested to solve the task allocation optimization problems of the UAV-assisted MEC system. Its performance is evaluated with experimental results in Sect. 4. Finally, Sect. 5 concludes this article.
2 Modeling of System and Problem

The UAV-assisted MEC system consists of $N$ MUs, $K$ UAVs and $M$ ground edge clouds (ECs). The sets of MUs, UAVs and ECs are denoted as $\mathcal{N}$, $\mathcal{K}$ and $\mathcal{M}$, respectively. The position of UAV $k \in \mathcal{K}$ is denoted by $Q_k(t) = (x_k(t), y_k(t), H)$, where $H$ is the fixed flight height and $x_k(t)$ and $y_k(t)$ are the horizontal coordinates of the UAV. The fixed position of MU $n \in \mathcal{N}$ is denoted as $Q_n = (x_n, y_n, 0)$, where $x_n$ and $y_n$ are the horizontal coordinates of the MU. The fixed position of EC $m \in \mathcal{M}$ is denoted as $Q_m = (x_m, y_m, 0)$, where $x_m$ and $y_m$ are the horizontal coordinates of the EC.

Based on Shannon's capacity, the upstream rate from MU to UAV is expressed as:

$$R_n^1(t) = B_n^1 \log_2 \left( 1 + \frac{\alpha P_n^1}{\sigma^2 \left\| Q_n - Q_k(t) \right\|^2} \right), \quad (1)$$

where $\alpha$ is the unit received power, $\sigma^2$ represents the noise power at the UAV, $B_n^1$ is the bandwidth allocated to the MU, and $P_n^1$ represents the MU's transmission power. Correspondingly, the downstream rate from UAV to EC is written as

$$R_m^2(t) = B_m^2 \log_2 \left( 1 + \frac{\alpha P_2}{\sigma^2 \left\| Q_k(t) - Q_m \right\|^2} \right), \quad (2)$$

where $B_m^2$ represents the per-device bandwidth preassigned to the EC and $P_2$ is the transmission power of the UAV. The transmission delay from MU to UAV is calculated from the input data size $L_n$ as follows:

$$T_n^1(t) = \frac{L_n}{R_n^1(t)}. \quad (3)$$

When receiving offloaded tasks from the MU, the energy consumed by the UAV is given by $E_n^1(t) = P_3 T_n^1(t)$, where $P_3$ represents the UAV's receiving power. Denote $\beta_{nm}(t) \in [0, 1]$
and $\beta_{nk}(t) \in [0, 1]$ as the parts of the received tasks from the MU that are handled at the EC and the UAV, respectively. Hence, the computation delay at the UAV can be calculated as

$$T_n^2(t) = \frac{\beta_{nk}(t) L_n C_n}{f_n^1}, \quad (4)$$

where $C_n$ is the number of CPU cycles and $f_n^1$ (in CPU cycles/s) represents the computation resource assigned by the UAV to the MU. It is assumed that the power consumption of the CPU in the UAV is modeled as $\kappa (f_n^1)^3$. When processing the offloaded tasks, the computation energy consumption of the UAV is given by:

$$E_n^2(t) = \kappa (f_n^1)^3\, T_n^2(t). \quad (5)$$

The transmission delay from MU to EC is written as:

$$T_{nm}^3(t) = \frac{\beta_{nm}(t) L_n}{R_m^2(t)}. \quad (6)$$

When offloading the task from the MU to the EC, the energy consumption of the UAV is given by $E_{nm}^3(t) = P_2 T_{nm}^3(t)$. Then, the computation delay at the EC is given by

$$T_{nm}^4(t) = \frac{\beta_{nm}(t) L_n C_n}{f_{nm}^2}, \quad (7)$$

where $f_{nm}^2$ (in CPU cycles/s) represents the computation resource assigned by the EC to the MU. Based on the above description, the total energy consumption of the UAV under the task arrival rate $\lambda_n$ is given by:

$$E_n^s(t) = \lambda_n \left( E_n^1(t) + E_n^2(t) + \sum_m E_{nm}^3(t) \right). \quad (8)$$

The total service delay of the MU is

$$T_n^s(t) = T_n^1(t) + \max_m \left\{ T_n^2(t),\; T_{nm}^3(t) + T_{nm}^4(t) \right\}. \quad (9)$$
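A small numeric sketch of Eqs. (1)–(9) follows; all function and parameter names are ours, and the scalar, single-UAV form is a simplification:

```python
import numpy as np

def rate(B, P_tx, alpha, sigma2, q_a, q_b):
    """Shannon-style rate of Eqs. (1)-(2) for positions q_a, q_b."""
    d2 = np.sum((np.asarray(q_a) - np.asarray(q_b)) ** 2)
    return B * np.log2(1.0 + alpha * P_tx / (sigma2 * d2))

def uav_cost(L, C, beta_k, beta_m, R_up, R_down, f_uav, f_ec,
             P2, P3, kappa, lam):
    """Total UAV energy (Eq. 8) and MU service delay (Eq. 9)."""
    T1 = L / R_up                          # MU -> UAV transmission (Eq. 3)
    T2 = beta_k * L * C / f_uav            # computation at the UAV (Eq. 4)
    T3 = beta_m * L / R_down               # UAV -> EC transmission (Eq. 6)
    T4 = beta_m * L * C / f_ec             # computation at the EC (Eq. 7)
    E = lam * (P3 * T1 + kappa * f_uav**3 * T2 + P2 * T3)
    T = T1 + max(T2, T3 + T4)
    return E, T
```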
We formulate the problem posed above as the following optimization problem:

$$\min_{Q_k(t),\, \beta_{nk}(t),\, \beta_{nm}(t)} \; \sum_n E_n^s(t) + \sum_n T_n^s(t)$$
$$\text{s.t.} \quad \sum_k \beta_{nk}(t) + \sum_m \beta_{nm}(t) = 1, \quad 0 \le \beta_{nm}(t) \le 1, \quad 0 \le \beta_{nk}(t) \le 1. \quad (10)$$
3 DRL Algorithm

A deep reinforcement learning (DRL) method is proposed here to solve the optimization problem. A Markov decision process (MDP) is composed of a state space S,
an action space A, a transition function P, a discount factor γ, and a reward function R. Each UAV is considered a DRL agent. At time t, the system state S(t) is denoted as:

$$S(t) = \{x_1(t), \ldots, x_K(t), y_1(t), \ldots, y_K(t)\}. \quad (11)$$

At discrete time t, the action space of each UAV k, $A_k(t)$, is defined as:

$$A_k(t) = \{D_k(t), \Phi_k(t), \beta_{nm}(t)\}, \quad (12)$$

where $D_k(t)$ is the horizontal flying distance of the UAV, $\Phi_k(t)$ represents the direction angle of the UAV in the horizontal dimension, and $\beta_{nm}(t) \in [0, 1]$ represents the task partition variable. Based on the above optimization problem, the reward function R(t) can be expressed as:

$$R(t) = -\left( \sum_n E_n^s(t) + \sum_n T_n^s(t) \right) - \xi_1 \left( N - \sum_k N(k) \right) - \sum_k \xi_2^k, \quad (13)$$
where $N(k)$ represents the number of MUs covered by UAV $k$, $\xi_1$ represents the punishment coefficient related to the MUs' coverage, and $\xi_2^k$ represents the punishment for UAV collisions.

The proposed TD3 algorithm consists of a main network and a target network [8]. The main network contains an actor network with parameter $\phi_k$ and two critic networks with parameters $\eta_{k,1}$ and $\eta_{k,2}$, respectively. Similarly, the target network contains an actor network with parameter $\phi_k'$ and two critic networks with parameters $\eta_{k,1}'$ and $\eta_{k,2}'$, respectively. A mini-batch of $Y$ samples (state $s_t$, action $a_t^k$, reward $r_t$, next state $s_{t+1}$) is drawn randomly from the experience replay memory F, which stores the transitions (state S(t), action $A_k(t)$, reward R(t), next state S(t+1)). A soft target update is used for the actor-critic algorithm [9]. The target network is updated with the parameter $\tau \in [0, 1]$ as follows:
$$\phi_k' = \tau \phi_k + (1 - \tau)\, \phi_k', \qquad \eta_{k,j}' = \tau \eta_{k,j} + (1 - \tau)\, \eta_{k,j}', \quad j = 1, 2. \quad (14)$$
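The soft update of Eq. (14) can be sketched in a few lines; the parameter containers are plain NumPy arrays here and the τ value is illustrative, both our simplifications:

```python
def soft_update(target_params, main_params, tau=0.005):
    """Polyak averaging of Eq. (14): target <- tau*main + (1-tau)*target.
    Both arguments are lists of NumPy arrays updated in place."""
    for tgt, src in zip(target_params, main_params):
        tgt *= (1.0 - tau)
        tgt += tau * src
```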
Then, TD3 adopts the Clipped Double Q-learning algorithm [10] to compute the target value as follows:

$$y_t^k = r_t + \gamma \min_{j=1,2} Q_{\eta_{k,j}'}(s_{t+1}, \tilde{a}_k), \quad (15)$$

where $Q_{\eta_{k,j}'}(s_{t+1}, \tilde{a}_k)$ are the two target critic networks used to calculate the Q value, and $\tilde{a}_k = \pi_{\phi_k'}(s_{t+1}) + \sigma$ is the action obtained by using target policy smoothing regularization with noise $\sigma \sim \mathrm{clip}(\mathcal{N}(0, \sigma), -c, c)$. Subsequently, TD3 updates the parameters $\eta_{k,j}$ by minimizing the mean-squared Bellman error as follows:

$$L_k(\eta_{k,j}) = \mathbb{E}_{a_t^k, s_t, r_t} \left[ \left( y_t^k - Q_{\eta_{k,j}}(s_t, a_t^k) \right)^2 \right], \quad j = 1, 2, \quad (16)$$

where $\mathbb{E}$ denotes expectation and $Q_{\eta_{k,j}}(s_t, a_t^k)$ are the two main critic networks used to calculate the Q value.
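A sketch of the target computation of Eq. (15) and the critic loss of Eq. (16) follows; `actor_t`, `critic1_t` and `critic2_t` stand for the target networks and are assumed callables, and the noise constants are illustrative:

```python
import numpy as np

def td3_target(r, s_next, actor_t, critic1_t, critic2_t,
               gamma=0.99, noise_std=0.2, clip_c=0.5):
    """Clipped Double Q-learning target (Eq. 15) with policy smoothing."""
    noise = np.clip(np.random.normal(0.0, noise_std), -clip_c, clip_c)
    a_next = actor_t(s_next) + noise
    q_min = np.minimum(critic1_t(s_next, a_next), critic2_t(s_next, a_next))
    return r + gamma * q_min

def critic_loss(y, q_values):
    """Mean-squared Bellman error of Eq. (16)."""
    return np.mean((y - q_values) ** 2)
```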
TD3 uses the Delayed Policy Updates algorithm [10] to update the main actor network parameter $\phi_k$ through the main actor policy $\pi_{\phi_k}^k(s_t)$:

$$\nabla_{\phi_k} J(\phi_k) = \mathbb{E} \left[ \nabla_{a_t^k} Q_{\eta_{k,1}}(s_t, a_t^k) \Big|_{a_t^k = \pi_{\phi_k}(s_t)} \nabla_{\phi_k} \pi_{\phi_k}^k(s_t) \right]. \quad (17)$$
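The actor step of Eq. (17) can be sketched in an autodiff style; the two-input Keras critic and all names are our assumptions, not the paper's implementation:

```python
import tensorflow as tf

def actor_update(actor, critic1, states, optimizer):
    """Deterministic policy gradient step of Eq. (17):
    ascend Q_eta1(s, pi_phi(s)) with respect to the actor weights."""
    with tf.GradientTape() as tape:
        actions = actor(states)
        # negative sign: gradient ascent on Q equals descent on -Q
        loss = -tf.reduce_mean(critic1([states, actions]))
    grads = tape.gradient(loss, actor.trainable_variables)
    optimizer.apply_gradients(zip(grads, actor.trainable_variables))
```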
4 Performance Evaluation

Simulations are presented here to evaluate the performance of the proposed TD3 algorithm. We consider a UAV-assisted cellular network over a 400 × 400 m² area; the network comprises 2 UAVs, 4 ECs and 50 MUs. As presented in Fig. 1, the MUs are divided into two groups of 20 MUs and 30 MUs, located randomly in the square regions [50, 150] × [250, 350] and [250, 300] × [250, 350], respectively. We set the UAV height H to 80 m, the receiving power of the UAV P3 to 0.1 W, the noise power at the UAV σ² to –174 dBm/Hz, and the task arrival rate λn to 0.0001. In the TD3 algorithm, neural networks with two fully-connected hidden layers (128 and 32 neurons) are employed for both the actor and the critic networks. The Adam optimizer is utilized for updating the actor and critic networks. We set the mini-batch size Y to 64 and the discount rate γ to 0.99. The suggested TD3 model is trained for 2000 episodes.
Fig. 1. Network layout with UAVs, ECs and MUs.
Firstly, the TD3 algorithm is analyzed under different learning rates δ, as shown in Fig. 2. As the learning proceeds, the training cost tends to converge. Moreover, with
the learning rate δ increasing, fewer training steps are needed to minimize the delay and the energy consumption of transmission. The learning converges faster under δ = 0.001 than under δ = 0.1 and δ = 0.01. However, with a smaller learning rate there is a higher risk that the search for the global optimal solution ends in a local one. It can also be noticed that training under δ = 0.0001 is slower than under δ = 0.001. Therefore, in consideration of the real-time computation ability of the algorithm, δ is set to 0.001 for our training model.
Fig. 2. Training overall system cost with different learning rates δ .
Next, we evaluate the performance of TD3 under various numbers of UAVs K. Figure 3 shows the smoothed training cost for the settings K = 1, K = 2, K = 3 and K = 4. When K = 4, the UAVs tend to reach the optimal hovering locations within the shortest time, minimizing the delay and power consumption; a greater number of UAVs corresponds to a greater number of channels. In contrast, the performance with K = 2 is better than with K = 3, which is because the TD3 model was trained under the condition K = 2. In particular, the convergence is fastest for K = 4, so higher uplink and downlink transmission rates are more likely to be achieved.
Fig. 3. Training overall system cost with different numbers of UAVs K.
Finally, we evaluate the performance of different optimization methods. The trends in Fig. 4 suggest that TD3 performs slightly better than DDPG. Moreover, TD3 converges faster than DDPG in the first 250 episodes. We also consider the situation in which all computing tasks are processed by the UAVs alone (fixed β) and notice that the joint computation of UAVs and ECs needs only a few epochs to converge. The reason is that
Fig. 4. Smoothed training cost with different methods.
the evaluation of the action value by the TD3 optimization strategy is more accurate than that by DDPG. The TD3 approach consistently achieves the system performance index at a lower cumulative time cost.
5 Conclusion

In this paper, a UAV-assisted MEC system is investigated to provide computing services over wide areas. Since the task allocation problems are highly nonconvex, it is difficult to reach the optimal solution; deep reinforcement learning is therefore used to solve the task allocation problem in the UAV-assisted MEC system. With the modeling of the UAV-assisted MEC system, the task allocation problems are formulated, and a Markov decision process (MDP) is developed to handle their nonconvexity. Then, since the MDP has a continuous action space, a twin delayed deep deterministic policy gradient algorithm is suggested to acquire the joint optimal task allocation strategy. Experiment results show the better performance of the proposed method compared with other optimization approaches.
References
1. Shahidinejad, A., Farahbakhsh, F., Ghobaei-Arani, M., et al.: Context-aware multi-user offloading in mobile edge computing: a federated learning-based approach. J. Grid Comput. 19(2), 1570–7873 (2021). https://doi.org/10.1007/s10723-021-09559-x
2. Pan, Y., Da, X., Hu, H., et al.: Efficient design optimisation for UAV-enabled mobile edge computing in cognitive radio networks. IET Commun. 14(15), 2509–2515 (2020)
3. Pfeifer, C., Rümmler, M.C., Mustafa, O.: Assessing colonies of Antarctic shags by unmanned aerial vehicle (UAV) at South Shetland Islands, Antarctica. Antarct. Sci. 33, 1–17 (2021)
4. Wang, Y., Ru, Z.Y., Wang, K., et al.: Joint deployment and task scheduling optimization for large-scale mobile users in multi-UAV-enabled mobile edge computing. IEEE Trans. Cybern. 50(9), 3984–3997 (2020)
5. Elgendy, I.A., Zhang, W.Z., He, H., et al.: Joint computation offloading and task caching for multi-user and multi-task MEC systems: reinforcement learning-based algorithms. Wireless Netw. 27(3), 2023–2038 (2021)
6. He, Y., Zhai, D., Huang, F., et al.: Joint task offloading, resource allocation, and security assurance for mobile edge computing-enabled UAV-assisted VANETs. Remote Sens. 13(8), 1547 (2021)
7. Wang, G., Yu, X., Xu, F., et al.: Task offloading and resource allocation for UAV-assisted mobile edge computing with imperfect channel estimation over Rician fading channels. EURASIP J. Wireless Commun. Netw. 2020(1), 169 (2020)
8. Zhong, S., Tan, J., Dong, H., et al.: Modeling-learning-based actor-critic algorithm with Gaussian process approximator. J. Grid Comput. 18(2), 181–195 (2020)
9. Liu, M., Li, J., Hu, Z., et al.: A dynamic bidding strategy based on model-free reinforcement learning in display advertising. IEEE Access 8, 213587–213601 (2020)
10. Fujimoto, S., Hoof, H., Meger, D.: Addressing function approximation error in actor-critic methods. In: International Conference on Machine Learning, PMLR, pp. 1587–1596 (2018)
Medical Image Analysis with NVIDIA Jetson GPU Modules

Pavel Krömer(B) and Jana Nowaková

Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VSB–Technical University of Ostrava, 17. listopadu 2172/15, 708 00 Ostrava-Poruba, Czech Republic
{pavel.kromer,jana.nowakova}@vsb.cz
Abstract. Medical imaging and image analysis are important elements of modern diagnostic and treatment methods. Intelligent image processing, pattern recognition, and data analysis can be leveraged to introduce a new level of detection, segmentation, and, in general, understanding to medical image analysis. However, modern image analysis methods such as deep neural networks are often connected with significant computational complexity, slowing their adoption. Recent embedded systems such as the NVIDIA Jetson general-purpose GPUs have become a viable platform for the efficient execution of some computational models. This work analyzes the performance and the time and energy costs of several neural models for medical image analysis on different kinds of NVIDIA Jetson modules. The experiments are performed with lung X-ray medical images in connection with the COVID-19 disease.
1 Introduction
Deep learning, and especially deep neural networks, are nowadays a hot trend in medical image processing and analysis for many reasons. Medical imaging is at present executed practically exclusively in digital form. It produces large amounts of data that can be used by different healthcare stakeholders. Healthcare providers ought to utilize this information, which is already too big for manual processing and analysis exclusively by human force [13]. Automated computer-aided approaches can also be useful in situations when experienced medical staff and/or the usual resources (instruments, chemicals, laboratory materials, analytical devices) are not available. This applies, for example, to field medicine in remote places and different corners of the world and during times of major epidemics, when standard medical protocols are suspended due to the depletion of resources, as witnessed during the 2020/2021 COVID-19 pandemic. Deep learning has emerged as a premier method for intelligent image analysis. The areas where deep learning approaches have found a place range from image detection and recognition, image segmentation [26], and image registration, to pattern
recognition, computer-aided diagnosis [18], and beyond. The enabling technologies for medical image processing by deep learning include the advancements in hardware, computational resources and data storage capabilities [2], but it is a common understanding that such resources are restricted as well. The use of deep learning is not limited to any particular type of medical images. On the contrary, it is widely used for different types of medical images such as MRI [17], CT scans [10], X-rays [21], ultrasound [16], or positron emission tomography [19] (sorted by the frequency of use of the imaging methods (modalities)). Brain images, followed by the cardiac system, lungs, retina, and prostate, are among the body parts most often analyzed by deep learning [3].

The use of deep learning in medical image analysis has to deal with many challenges. Label noise, which significantly affects the ability of deep models to learn and analyze data, is a major problem from the data-quality point of view [12]. Other challenges are associated with the algorithmic and technical background of deep neural networks. They include the time complexity (time costs) of learning and inference of deep models, especially in the presence of non-functional requirements on the designed solutions such as real-time usage [15]. The energy costs (energy-awareness) are also an often discussed question, not only in connection with traditional industry but also in the context of computer engineering and the use of mobile devices. According to recent findings [25], simplistic attempts to reduce network model sizes or limit the amount of computation are not appropriate. Instead, novel methods appear to be more efficient. For example, Yang et al. [25] proposed an energy-aware CNN optimization algorithm that directly uses energy consumption to guide the network pruning process.

In this work, we study the ability of CNNs to analyze lung X-ray images in the presence of constrained computing resources. Several network configurations are developed, implemented, and evaluated. Special attention is paid to the simplicity of the network and its ability to run (perform inference) on embedded devices, in particular NVIDIA Jetson modules. We analyze the precision and the time and energy costs associated with the execution of the models on several types of Jetson modules.

The rest of the paper is organized as follows: Sect. 2 provides a brief overview of the use of accelerators (GPUs) for medical image analysis. Section 3 gives a description of the embedded modules used to measure the time and energy costs of the models. The test problem, COVID-19 classification from chest X-ray images, and the test data are described in Sect. 4. Experiments and results are provided in Sect. 5 and conclusions are drawn in Sect. 6.
2 Medical Imaging and GPU Computing
Accurate deep learning models are inherently computationally expensive. This is especially pronounced in the medical imaging area, where the accuracy of a model is a critically important requirement with life-saving potential. In general, expensive models can be run on CPUs or GPUs. In many contexts and for many model types, CPUs are most suitable for model execution. However, highly parallel
floating-point accelerators represent significant computational power and can be used to execute suitable models efficiently. Moreover, the use of mobile GPUs is associated with rapid acceleration and better energy efficiency of the computation [7]. Nevertheless, all comparisons between CPU and GPU approaches should be made only after careful optimization for both platforms [20]. According to surveys [6,24], the use of GPUs for medical imaging is still not commonplace. Likely, the authors of medical imaging solutions are more focused on the development of novel algorithms than on parallelization and optimization. Nevertheless, there are exceptions [24], such as [23], where the authors discussed the possibilities and future of massive parallelization in medical image registration using supercomputers. As the authors concluded, the future lies in inexpensive and easily accessible GPUs. Eleven years later, it can be said that this has not changed. A wider expansion of GPU usage came with the Compute Unified Device Architecture [8,11,14], which allows the GPU to be used to accelerate critical parts of algorithms. The 3D medical image registration proposed in [22] is an example of a data-parallel algorithm that benefits from parallel execution on GPUs. The average execution time of the algorithm was compared on several GPU types and a traditional CPU. Another example of a method for medical image segmentation on the GPU is the fuzzy C-means algorithm proposed in [1]. The potential of GPU computing for medical imaging in healthcare was summarized in a clever way by Després and Jia [6]. The authors observed that: 1) speed is speed, i.e., the speed of model execution achieved by a GPU-based solution is the speed in the clinical workflow; 2) speed is accuracy, i.e., faster model execution provides an opportunity for greater accuracy; and 3) speed is big data, i.e., better processing power enables the use of larger data sets. Clearly, as all these statements are interrelated, it can be said that GPUs represent a great opportunity for medical imaging.
3 NVIDIA Jetson Modules
NVIDIA Jetson is a family of embedded computers with integrated GPUs (modules) designed for high-performance computing in constrained scenarios such as edge and in-car computing [4]. Four mobile NVIDIA Jetson modules, common as of mid-2021, were used in this work to run the inference of different types of CNNs for medical image classification. The main technical parameters of the modules are summarized in Table 1.
Table 1. Modules' properties

|                          | AGX Xavier                   | Xavier NX                    | TX2                                        | Nano                    |
| GPU architecture         | Volta                        | Volta                        | Pascal                                     | Maxwell                 |
| CUDA cores               | 512                          | 384                          | 256                                        | 128                     |
| Tensor cores             | 64                           | 48                           | –                                          | –                       |
| Max. GPU freq. (MHz)     | 1377                         | 1100                         | 1300                                       | 921.6                   |
| CPU type                 | ARMv8.2 64-bit (8 × Carmel)  | ARMv8.2 64-bit (6 × Carmel)  | ARMv8 64-bit (2 × Denver + 4 × Cortex-A57) | ARMv8 (4 × Cortex-A57)  |
| Max. CPU freq. (MHz)     | 2265.6                       | 1900                         | 2000                                       | 1479                    |
| Memory (GiB)             | 32                           | 8                            | 8                                          | 4                       |
| Mem. bandwidth (GiB/s)   | 137                          | 51.2                         | 59.7                                       | 25.6                    |
The Jetson GPUs use the Compute Unified Device Architecture (CUDA) [14]. CUDA is a hardware and software platform which enables the development and execution of general purpose programs. The CUDA runtime takes care of the scheduling and execution of kernels (CUDA routines) by many concurrent tasks (threads and thread groups) executed on the available hardware in parallel. A CUDA application is split between the CPU (host) and one or more GPUs (device). The host code can contain arbitrary operations, data types, and functions while the device code can contain only a subset of operations and functions implemented on the device. In this work, we use the modules to run inference of several CNNs for medical image classification and analyze their results, performance, and energy and time costs. The classification problem and the data used for the evaluation are described in the next section.
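The host/device split can be illustrated with a small Python sketch using Numba's CUDA bindings; the tooling choice, names and the normalization example are ours, and a CUDA-capable device is assumed:

```python
import numpy as np
from numba import cuda

@cuda.jit
def scale_kernel(img, lo, hi, out):
    """Device code: each thread normalizes one pixel to [0, 1]."""
    i, j = cuda.grid(2)
    if i < img.shape[0] and j < img.shape[1]:
        out[i, j] = (img[i, j] - lo) / (hi - lo)

# Host code: allocate data, configure the launch, copy results back.
img = np.random.randint(0, 65535, (1024, 1024)).astype(np.float32)
out = cuda.device_array_like(img)
threads = (16, 16)
blocks = ((img.shape[0] + 15) // 16, (img.shape[1] + 15) // 16)
scale_kernel[blocks, threads](cuda.to_device(img), img.min(), img.max(), out)
result = out.copy_to_host()
```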
4 COVID-19 Detection by Deep Learning
In order to analyze the performance of CNNs on NVIDIA Jetson modules, a classification problem relevant for the COVID-19 pandemic was solved by deep learning (CNNs). The problem consists in the analysis of chest X-ray images of COVID-19 positive and negative subjects. The data (annotated radiographs) was taken from the Covid Radiography Database (CRD) [5]. The CRD data set was downloaded from Kaggle1 and contained 219 chest radiographs of patients with COVID-19, 1345 chest radiographs of patients with viral pneumonia, and 1341 radiographs of patients with neither disease (normal). All images had a resolution of 1 MP (i.e., 1024 × 1024 pixels). In this work, only the COVID-19 positive and normal images were used and a binary classification problem was considered. For processing by the CNNs, the images were preprocessed in the following way: they were downscaled to dimensions 128 × 128, randomly rotated in the counterclockwise direction (by up to 0.2°), and randomly zoomed by a factor
https://www.kaggle.com/tawsifurrahman/covid19-radiography-database.
from the range [0.8, 1.2]. These transformations enriched the data set and emulated real-world conditions in which photos of the radiographs could be taken under a variety of conditions. Four subsets of the images were created for the experiments. The TRAIN data set contained two balanced classes with 92 COVID-19 positive and 92 COVID-19 negative radiographs and was used for the training of the network. The VALID data set consisted of 39 COVID-19 positive and 39 COVID-19 negative radiographs. It was used in the training process for validation. The TEST data set contained all remaining images: 88 chest X-ray images of COVID-19 positive subjects and 1,210 radiographs of COVID-19 negative subjects. It was used to evaluate the generalization ability of the trained networks on previously unknown data. The EVAL data set contained 50 images of COVID-19 positive and 50 images of COVID-19 negative subjects randomly drawn from the TEST data set. It was used for the evaluation of the trained models on the Jetson modules.
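The preprocessing described above can be sketched with Keras layers; the original pipeline is not given in this detail, so the parameterization below merely mirrors the text:

```python
import tensorflow as tf

# Downscale to 128x128, rotate by up to 0.2 deg CCW, zoom in [0.8, 1.2].
# RandomRotation expects fractions of a full turn, hence the conversion.
augment = tf.keras.Sequential([
    tf.keras.layers.Resizing(128, 128),
    tf.keras.layers.RandomRotation(factor=(0.0, 0.2 / 360.0)),
    tf.keras.layers.RandomZoom(height_factor=(-0.2, 0.2)),
])
```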
5 Experiments and Results
The experimental pipeline consisted of two main stages: model search (training) and execution (inference) analysis.

5.1 Model Search
Model search was performed on a compute server and involved model structure definition and an extensive grid search for model parameters. The overall structure of the network was determined on the basis of the data, the type of application, and CNN best practices. The CNN consisted of an input layer with dimension (128 × 128), two 2D convolution layers with (3 × 3) convolution windows and Rectified Linear Unit (ReLU) activation (denoted C1 and C2), a dense layer with ReLU activation (denoted D), and an output layer with one neuron and the sigmoid activation function. Although the types of all layers were fixed in the CNN design, an extensive grid search was performed to determine the optimum sizes of the three hidden layers, C1, C2, and D. The network was implemented using the Keras framework with the TensorFlow backend.

The grid search was executed as follows: the sizes of C1, C2, and D were set to every combination of sizes drawn from three size vectors, C1 = C2 = [2, 4, 8, 16, 32] and D = [2, 4, 8, 16]. In total, 100 different network configurations with large differences in the number of parameters and estimated model size were evaluated. For each configuration, the corresponding CNN was trained 31 times independently to mitigate the stochastic nature of CNN training, and the accuracy, sensitivity, and specificity of the trained network were recorded. The training process used input batches of size 4 and was executed for 400 epochs. As a result, 3100 CNNs were trained.

The results of network training were analyzed by statistical methods. First, the correspondence between training and test (generalization) accuracy was verified. Spearman's rank correlation test confirmed a strong positive correlation
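A sketch of the network constructor and the grid search loop follows. The layer sequence and search space follow the text, while unstated details (grayscale single-channel input, the Adam optimizer, the absence of pooling) are our assumptions:

```python
import itertools
import tensorflow as tf

def build_cnn(c1, c2, d):
    """CNN with two 3x3 ReLU convolutions, a ReLU dense layer and a
    single sigmoid output neuron, as described in the text."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(128, 128, 1)),
        tf.keras.layers.Conv2D(c1, (3, 3), activation="relu"),
        tf.keras.layers.Conv2D(c2, (3, 3), activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(d, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

# 5 * 5 * 4 = 100 configurations, each trained repeatedly.
for c1, c2, d in itertools.product([2, 4, 8, 16, 32],
                                   [2, 4, 8, 16, 32],
                                   [2, 4, 8, 16]):
    model = build_cnn(c1, c2, d)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_ds, validation_data=valid_ds, epochs=400, batch_size=4)
```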
(ρ = 0.953, p = 1.910e−52) between the mean accuracy of the different configurations on the TRAIN and TEST data sets. Moreover, a positive correlation (ρ = 0.778, p = 1.640e−21) was identified between the median accuracy of the different CNN configurations on the TRAIN and TEST data sets. The results of the models corresponding to every CNN configuration were further investigated using other statistical tests, implemented in the Autorank package [9]. The non-parametric Friedman test was used to determine whether there are significant differences between the CNN configurations on the TRAIN and TEST data sets, and the Nemenyi post-hoc test was applied to see which differences are significant at α = 0.01. The tests revealed that there are several groups of configurations without significant differences in classification accuracy. Table 2 lists all models not significantly different from the best-ranked ones on both the TRAIN and the TEST data. It shows for each model the number of trainable parameters, the estimated in-memory size, and the mean and median accuracy on the TEST data set. All models in Table 2 are sorted according to in-memory size.

The table shows that a wide variety of models falls into the category of best-performing ones. In order to select some of them for further evaluation, the sensitivity and specificity of the models were evaluated. The results are illustrated in Fig. 1, which shows for each configuration (rectangle) its sensitivity and specificity on the TRAIN and TEST data sets. The second row of Fig. 1 focuses on the on-average best-performing configurations. The configurations non-dominated from the sensitivity and specificity point of view are labeled by a code corresponding to the configuration (C1-C2-D). The plots show that the configuration (16-8-16) is superior in sensitivity on both data sets and that the configurations (32-32-16) and (4-8-8) are the best on TRAIN and TEST, respectively, in terms of specificity.

Based on the results of all experiments, 4 models were selected for further investigation on the Jetson modules. They provide classification accuracy without significant differences (α = 0.01) and are, at the same time, of very different sizes. In particular, the configuration (2-2-8) is, with memory usage of 6.25 MiB, the smallest of the best-performing models. (16-8-16) is the best configuration in terms of average sensitivity on the TRAIN and TEST data sets and requires 44.60 MiB. (32-32-16) is, with memory usage of 100.887 MiB, the largest of the evaluated configurations; however, it is also the best configuration in terms of specificity on the TRAIN data set. Finally, (4-8-8) is the configuration with the highest specificity on the TEST data set and has modest memory requirements (15.305 MiB).
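The rank-correlation check can be reproduced with SciPy; the function name is ours and the accuracy arrays are assumed inputs:

```python
from scipy.stats import spearmanr

def rank_correlation(train_acc, test_acc):
    """Spearman's rank correlation between per-configuration accuracies
    on the TRAIN and TEST data sets."""
    rho, p = spearmanr(train_acc, test_acc)
    return rho, p
```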
Table 2. Models w/o statistical difference in accuracy from the best-ranked ones on both the TRAIN and TEST data sets (sorted by in-memory size).

| Model (C1-C2-D) | Parameters [-] | RAM [MiB] | Mean acc. [%] | Median acc. [%] |
| 2-2-8    | 14511  | 6.250   | 96.053 | 96.302 |
| 2-2-16   | 28927  | 6.307   | 96.051 | 95.994 |
| 2-4-4    | 14541  | 7.597   | 92.211 | 96.302 |
| 2-4-8    | 28949  | 7.653   | 95.706 | 96.225 |
| 2-4-16   | 57765  | 7.765   | 96.282 | 96.456 |
| 2-8-4    | 29017  | 10.348  | 92.843 | 96.533 |
| 2-8-8    | 57825  | 10.459  | 96.230 | 96.687 |
| 2-8-16   | 115441 | 10.681  | 96.252 | 96.533 |
| 4-2-8    | 14603  | 11.096  | 95.191 | 95.917 |
| 4-2-16   | 29019  | 11.153  | 95.728 | 95.763 |
| 4-4-8    | 29077  | 12.499  | 96.389 | 96.379 |
| 4-4-16   | 57893  | 12.611  | 96.140 | 96.225 |
| 4-8-4    | 29217  | 15.194  | 87.296 | 96.918 |
| 4-8-8    | 58025  | 15.305  | 95.922 | 96.687 |
| 4-8-16   | 115641 | 15.527  | 96.598 | 96.687 |
| 2-16-4   | 57969  | 15.850  | 88.983 | 96.379 |
| 2-16-8   | 115577 | 16.071  | 95.286 | 96.764 |
| 2-16-16  | 230793 | 16.513  | 93.181 | 96.533 |
| 8-2-8    | 14787  | 20.786  | 95.924 | 96.071 |
| 8-2-16   | 29203  | 20.843  | 96.118 | 96.302 |
| 4-16-8   | 115921 | 20.917  | 92.477 | 96.764 |
| 4-16-16  | 231137 | 21.359  | 94.165 | 96.687 |
| 8-4-8    | 29333  | 22.190  | 95.842 | 96.302 |
| 8-4-16   | 58149  | 22.302  | 96.165 | 96.302 |
| 8-8-8    | 58425  | 24.996  | 92.639 | 96.533 |
| 8-8-16   | 116041 | 25.218  | 96.429 | 96.456 |
| 2-32-8   | 231081 | 27.295  | 81.970 | 96.071 |
| 2-32-16  | 461497 | 28.176  | 95.594 | 96.764 |
| 8-16-8   | 116609 | 30.610  | 94.699 | 96.533 |
| 8-16-16  | 231825 | 31.051  | 96.247 | 96.687 |
| 4-32-8   | 231713 | 32.142  | 85.419 | 93.606 |
| 4-32-16  | 462129 | 33.023  | 87.358 | 97.072 |
| 16-2-8   | 15155  | 40.168  | 95.971 | 96.071 |
| 16-2-16  | 29571  | 40.224  | 95.847 | 95.994 |
| 16-4-8   | 29845  | 41.571  | 95.763 | 96.302 |
| 16-4-16  | 58661  | 41.683  | 96.295 | 96.456 |
| 8-32-8   | 232977 | 41.837  | 84.500 | 96.918 |
| 8-32-16  | 463393 | 42.718  | 86.779 | 96.533 |
| 16-8-4   | 30417  | 44.268  | 86.639 | 95.686 |
| 16-8-8   | 59225  | 44.379  | 92.408 | 96.533 |
| 16-8-16  | 116841 | 44.601  | 96.329 | 96.533 |
| 16-16-8  | 117985 | 49.995  | 84.823 | 96.918 |
| 16-16-16 | 233201 | 50.437  | 96.347 | 96.533 |
| 16-32-8  | 235505 | 61.227  | 75.729 | 95.300 |
| 16-32-16 | 465921 | 62.108  | 87.402 | 96.533 |
| 32-2-2   | 5079   | 78.887  | 90.104 | 95.686 |
| 32-2-8   | 15891  | 78.930  | 95.633 | 95.994 |
| 32-2-16  | 30307  | 78.987  | 95.830 | 96.071 |
| 32-4-2   | 9257   | 80.251  | 83.921 | 96.533 |
| 32-4-4   | 16461  | 80.279  | 93.886 | 95.917 |
| 32-4-8   | 30869  | 80.335  | 96.091 | 96.302 |
| 32-4-16  | 59685  | 80.447  | 96.342 | 96.379 |
| 32-8-4   | 32017  | 83.034  | 91.906 | 96.225 |
| 32-8-8   | 60825  | 83.145  | 93.121 | 96.533 |
| 32-8-16  | 118441 | 83.367  | 96.252 | 96.610 |
| 32-16-4  | 63129  | 88.545  | 83.665 | 93.760 |
| 32-16-8  | 120737 | 88.765  | 83.635 | 95.455 |
| 32-16-16 | 235953 | 89.207  | 95.860 | 96.533 |
| 32-32-8  | 240561 | 100.006 | 79.107 | 96.841 |
| 32-32-16 | 470977 | 100.887 | 91.138 | 96.687 |
5.2 Inference on the Jetson GPUs
The best models representing the CNN configurations selected in the previous step were executed on the four Jetson modules described in Sect. 3. Each model was run using the Keras API with the TensorFlow v2 backend. All modules were switched to the highest performance mode (nvpmodel 0) and each model processed 100 images from the EVAL data set. The accuracy, execution time, and energy consumption of the CPU and GPU were recorded. The inference was repeated 31 times for each model on every module to obtain average execution times and energy costs.

The results of the analysis are shown in Table 3. It illustrates very well the consistency of model accuracy across the different modules. The fastest inference is achieved by AGX Xavier, the most powerful module, which processes 26.3 (2-2-8) or more images per second. The Xavier NX module is on average 1.34 times slower than AGX Xavier, the TX2 is 1.22 times slower than
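The measurement loop can be sketched as follows; the energy readout (e.g., from the on-board power monitors) is omitted and all names are ours:

```python
import time
import numpy as np
import tensorflow as tf

def timed_inference(model_path, images, repeats=31):
    """Load a trained Keras model and measure the average time needed
    to classify the EVAL images."""
    model = tf.keras.models.load_model(model_path)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        preds = model.predict(images, batch_size=4, verbose=0)
        times.append(time.perf_counter() - start)
    labels = (preds > 0.5).astype(np.int32)    # sigmoid output -> class
    return labels, float(np.mean(times))
```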
Fig. 1. Sensitivity and specificity of the best-performing models on the TRAIN and TEST data sets (higher is better). Each model is represented by a rectangle with colors encoding the configuration of layers (C1-C2-D) as the intensity of red, green, and blue.

Table 3. The best models from each configuration executed on the EVAL data set on the Jetson modules.

| Module     | Model    | Accuracy [–] | Time [s] | Energy [mJ] |
| AGX Xavier | 2-2-8    | 0.910        | 3.805    | 1112.203    |
| AGX Xavier | 4-8-8    | 0.970        | 3.261    | 1007.935    |
| AGX Xavier | 16-8-16  | 0.950        | 3.513    | 1167.719    |
| AGX Xavier | 32-32-16 | 0.940        | 3.251    | 1426.632    |
| Xavier NX  | 2-2-8    | 0.910        | 5.065    | 1327.510    |
| Xavier NX  | 4-8-8    | 0.970        | 4.421    | 1396.065    |
| Xavier NX  | 16-8-16  | 0.950        | 4.424    | 1519.529    |
| Xavier NX  | 32-32-16 | 0.940        | 4.545    | 1874.384    |
| TX2        | 2-2-8    | 0.910        | 5.278    | 1688.171    |
| TX2        | 4-8-8    | 0.970        | 5.924    | 1741.616    |
| TX2        | 16-8-16  | 0.950        | 5.624    | 2057.113    |
| TX2        | 32-32-16 | 0.940        | 5.573    | 2639.242    |
| Nano       | 2-2-8    | 0.910        | 6.757    | 1935.671    |
| Nano       | 4-8-8    | 0.970        | 6.690    | 1864.603    |
| Nano       | 16-8-16  | 0.950        | 6.848    | 2090.629    |
| Nano       | 32-32-16 | 0.940        | 7.072    | 3024.019    |
Xavier NX, and the Nano is 1.22 times slower than the TX2. The energy consumption associated with the inference corresponds to the execution times. The CPU and GPU of the Xavier NX module required, on average, 1.30 times more energy than the CPU and GPU of AGX Xavier to run the same models. The TX2 needed 1.32
times more energy than the Xavier NX, and the Jetson Nano used 1.09 times more energy than the TX2. The results indicate that the energy consumption is dominated by the inference time: the differences in system architectures (CPU and GPU types) do not cause large changes in energy consumption.
6 Conclusions
In this work, we studied the ability of convolutional neural networks to detect lung diseases, in particular COVID-19, from chest X-ray images on NVIDIA Jetson GPUs. A wide range of network configurations was evaluated to find the best-performing ones. The inference of the models on the Jetson modules was studied from the accuracy, time, and energy perspectives. The results suggest that NVIDIA Jetson modules provide a reasonable platform for the execution of CNN models, with application potential for, e.g., field medicine and scenarios with limited availability of standard medical procedures.

Acknowledgment. This work was supported from ERDF in project "A Research Platform focused on Industry 4.0 and Robotics in Ostrava", reg. no. CZ.02.1.01/0.0/0.0/17 049/0008425, by the Technology Agency of the Czech Republic in the frame of the project no. TN01000024 "National Competence Center – Cybernetics and Artificial Intelligence", and by the project of the Student Grant System no. SP2021/24, VSB - Technical University of Ostrava.
References
1. Al-Ayyoub, M., Abu-Dalo, A.M., Jararweh, Y., Jarrah, M., Al Sa'd, M.: A GPU-based implementations of the fuzzy C-means algorithms for medical image segmentation. J. Supercomput. 71(8), 3149–3162 (2015). https://doi.org/10.1007/s11227-015-1431-y
2. Anwar, S.M., Majid, M., Qayyum, A., Awais, M., Alnowami, M., Khan, M.K.: Medical image analysis using convolutional neural networks: a review. J. Med. Syst. 42(11), 1–13 (2018)
3. Boveiri, H.R., Khayami, R., Javidan, R., Mehdizadeh, A.: Medical image registration using deep neural networks: a comprehensive review. Comput. Electr. Eng. 87, 106767 (2020)
4. Cass, S.: Nvidia makes it easy to embed AI: the Jetson nano packs a lot of machine-learning power into DIY projects - [Hands on]. IEEE Spectr. 57(7), 14–16 (2020). https://doi.org/10.1109/MSPEC.2020.9126102
5. Chowdhury, M.E., et al.: Can AI help in screening viral and covid-19 pneumonia? arXiv preprint arXiv:2003.13145 (2020)
6. Després, P., Jia, X.: A review of GPU-based medical image reconstruction. Physica Medica 42, 76–92 (2017)
7. Eklund, A., Dufort, P., Forsberg, D., LaConte, S.M.: Medical image processing on the GPU - past, present and future. Med. Image Anal. 17(8), 1073–1094 (2013)
8. Fluck, O., Vetter, C., Wein, W., Kamen, A., Preim, B., Westermann, R.: A survey of medical image registration on graphics hardware. Comput. Methods Prog. Biomed. 104(3), e45–e57 (2011)
9. Herbold, S.: Autorank: a python package for automated ranking of classifiers. J. Open Source Softw. 5(48), 2173 (2020). https://doi.org/10.21105/joss.02173
10. Huang, X., Sun, W., Tseng, T.L.B., Li, C., Qian, W.: Fast and fully-automated detection and segmentation of pulmonary nodules in thoracic CT scans using deep convolutional neural networks. Comput. Med. Imaging Graph. 74, 25–36 (2019)
11. Kalaiselvi, T., Sriramakrishnan, P., Somasundaram, K.: Survey of using GPU CUDA programming model in medical image analysis. Inform. Med. Unlocked 9, 133–144 (2017)
12. Karimi, D., Dou, H., Warfield, S.K., Gholipour, A.: Deep learning with noisy labels: exploring techniques and remedies in medical image analysis. Med. Image Anal. 65, 101759 (2020)
13. Ker, J., Wang, L., Rao, J., Lim, T.: Deep learning applications in medical image analysis. IEEE Access 6, 9375–9389 (2017)
14. Kirk, D.: Nvidia CUDA software and GPU parallel computing architecture. In: Proceedings of the 6th International Symposium on Memory Management, ISMM 2007, pp. 103–104. Association for Computing Machinery, New York (2007). https://doi.org/10.1145/1296907.1296909
15. Kumar, K.K., Kumar, M.D., Samsonu, C., Krishna, K.V.: Role of convolutional neural networks for any real time image classification, recognition and analysis. Mater. Today Proc. (2021)
16. Luchies, A.C., Byram, B.C.: Deep neural networks for ultrasound beamforming. IEEE Trans. Med. Imaging 37(9), 2010–2021 (2018)
17. Lundervold, A.S., Lundervold, A.: An overview of deep learning in medical imaging focusing on MRI. Zeitschrift für Medizinische Physik 29(2), 102–127 (2019)
18. Maier, A., Syben, C., Lasser, T., Riess, C.: A gentle introduction to deep learning in medical image processing. Zeitschrift für Medizinische Physik 29(2), 86–101 (2019)
19. Pinochet, P., et al.: Evaluation of an automatic classification algorithm using convolutional neural networks in oncological positron emission tomography. Front. Med. 8, 117 (2021)
20. Pratx, G., Xing, L.: GPU computing in medical physics: a review. Med. Phys. 38(5), 2685–2697 (2011)
21. Salehinejad, H., Valaee, S., Dowdell, T., Colak, E., Barfett, J.: Generalization of deep neural networks for chest pathology classification in x-rays using generative adversarial networks. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 990–994. IEEE (2018)
22. Shams, R., Sadeghi, P., Kennedy, R., Hartley, R.: Parallel computation of mutual information on the GPU with application to real-time registration of 3D medical images. Comput. Methods Prog. Biomed. 99(2), 133–146 (2010)
23. Shams, R., Sadeghi, P., Kennedy, R.A., Hartley, R.I.: A survey of medical image registration on multicore and the GPU. IEEE Sig. Process. Mag. 27(2), 50–60 (2010)
24. Shi, L., Liu, W., Zhang, H., Xie, Y., Wang, D.: A survey of GPU-based medical image computing techniques. Quant. Imaging Med. Surg. 2(3), 188 (2012)
25. Yang, T.J., Chen, Y.H., Sze, V.: Designing energy-efficient convolutional neural networks using energy-aware pruning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5687–5695 (2017)
26. Zhou, T., Ruan, S., Canu, S.: A review: deep learning for medical image segmentation using multi-modality fusion. Array 3, 100004 (2019)
Analysis of Optical Mapping Data with Neural Network

Vít Doleží(B) and Petr Gajdoš

Department of Computer Science, FEECS, VSB–Technical University of Ostrava, 70800 Ostrava, Czech Republic
{vit.dolezi,petr.gajdos}@vsb.cz
Abstract. Optical mapping is a method of DNA sequencing that can be used to detect large structural variations in genomes. To create these optical maps, a restriction enzyme is mixed with DNA; the enzyme binds to the DNA, creating labels called restriction sites. These restriction sites can be captured by a fluorescent microscope with a camera. One of the tools that can capture such images is the Bionano Genomics Saphyr mapping instrument. Its system produces high-resolution images of DNA molecules with restriction sites, and its software detects them. Molecules in these images are visualized as gray lines with restriction sites appearing brighter. Some of these molecules have very low brightness and, with static noise around them, are almost indistinguishable from the background. This work proposes a GPU-accelerated method for molecule detection. A neural network is used for molecule segmentation in the image, and the dataset for the network was created from the results of the Bionano Genomics tools. This detection method can be used as a substitute for restriction map detection in the Bionano Genomics processing pipeline or as a tool that can highlight the regions of interest in the raw optical map images.
1 Motivation
Optical mapping is a technique for creating ordered restriction maps from stained molecules of DNA. When a restriction enzyme is mixed with the genomic DNA, it creates cuts and markings called restriction sites. The molecules are linearized, flow through nanochannels, and are captured by a fluorescence microscope [26,27]. Optical maps of single molecules are derived from the fluorescent intensity. Many images are taken over many cycles and over many nanochannels. The molecules are extracted from the images, and then, by overlapping multiple molecule maps, the optical map of a complete genome is created. This whole-genome map can be used to detect large and small structural variants when compared to a reference genome and can help with the detection of diseases [26,27].

Bionano Genomics provides software that does everything needed for optical maps to be processed and analyzed: from detecting molecules in the images and creating BNX files, to creating consensus genome maps and aligning the result to the reference genome to detect structural variants [10]. The extraction of the molecules
from the captured images is the first building block in the Bionano processing pipeline. While there are many other works that focus on consensus genome map construction or alignment to a reference genome, not much work can be found about the actual image processing/analysis and the extraction of the molecules from the images. There are other companies that provide tools for optical mapping, like OpGen or NABsys [11], but this work focuses solely on data provided by the Bionano Genomics Saphyr instrument.

Neural networks are being used more widely every year, from self-driving cars [6] and code suggestion in editors [7] to image analysis in medicine, like cancer detection [8] or COVID-19 detection from X-rays [9], and many other fields. Although there are mentions of neural networks and their usage in the past [12,13], their wide usage only started in the last couple of years, the main reason being the increasing processing power of graphics cards. At their core, the computations of a neural network are mostly matrix multiplications, which can be easily parallelized to use the maximum potential of the GPU. In image analysis, neural networks are also less prone to errors or noise in the images; since they are trained on noisy images or images with artificial errors, they are better at overcoming new errors that might appear later and still give a correct output.

Given the points mentioned above (the extraction of molecules from images is not widely described, neural networks are great at working with noisy images, and the Bionano Genomics Saphyr instrument produces noisy images with other artefacts), we have decided to extract the molecules from the images with the help of machine learning, with the intent to create a tool that is less prone to errors and can be GPU accelerated.

This paper focuses on the analysis/extraction of molecules. All images are provided by the Bionano Genomics Saphyr instrument. The construction of the whole genome map and the alignment of the maps to a reference genome map are not part of the scope of this paper. This tool can potentially replace the first building block in the Bionano processing pipeline, or it can be used to highlight the points of interest in the black-and-white images of fluorescent molecules.
2 Image Data Analysis
The Bionano Genomics Saphyr mapping instrument produces many images with optical maps and other data. The size of the produced data ranges from 10 GB to multiple TB. The system not only produces images, but also analyses them and creates BNX files that contain information about the images. Bionano puts its images into the following folder structure: FloatCell (FC) > Scans > Bank1-4 > Images for channel CH1-2. Each FC has an arbitrary number of scans, each scan has 4 banks, and each bank has 137 pairs of images of molecule data. A pair is made of 2 channels, where each channel represents a different colored laser illumination. Each color can highlight different things in the same image; the same molecule in these two images can show different restriction sites in each channel. One scan can contain about 1096 images of molecule data and, depending on the file format used to store the images, can take 7–8 GB of space. Each FC has its own BNX file.
The format of the file is specified here [26]. From this file it is possible to get number of base pairs per pixel and information about molecules found in all images like where the molecule is located in the image, the location of all restriction sites found on the molecule represented as a base pair distance from the beginning of the molecule, quality score for each of the restriction site and from the ColumnId and RunId it is possible to locate the image file from where the molecule originates. The images are saved in the tif or jxr format. Resolution of the image is 1024 × 8196 with 16 bit of pixel depth. The image is composed of four parts called Field of View (FOV), because the camera sensor that captures the images of the DNA is not big enough to capture the whole length of area at once, the camera has to take the image, then position itself further and take another image, repeating the cycle 3 times results in 4 images (FOVs). The neighbour images usually have exact same molecules in the bottom part as neighbour image in the top part, this feature can be used to align the image so it can be processed. Without this alignment the molecules are artificially split and further analysis would be inaccurate. The black color in the image represents background, gray lines represent molecules that have brighter white spots on them. These white spots is where the restriction enzyme bound to DNA, it florescence under the laser light and camera captures it as a bright spot. There can also be some other unwanted artefacts and defects from the capturing sensor itself. The images are always in pairs for each channel, where channel means different coloured illumination laser was used. Sometimes the second channel images contains useful data for more accurate analysis, but sometimes it is only black. 2.1
2.1 Image Alignment
The alignment of the FOVs can be done programmatically via some feature extractor; this paper uses AKAZE [15] local feature matching, but others like SURF [16], SIFT [17] or ORB [18] can be used too. A 250-pixel-wide strip is cut from the bottom and top of neighbouring FOVs. Each of these crops is then processed with the AKAZE feature extractor, which returns keypoints in the image. Most of the time a keypoint is the end of a molecule or a restriction site. The keypoints are matched together with the k-nearest-neighbours algorithm. The matched keypoint positions are subtracted, and the result is a shift vector that says in which direction the bottom FOV should be moved to be aligned with the top FOV. Figure 1 shows that not all matches in a single image are correct. The matching therefore has to be done on multiple images, and only the most frequently occurring vector is used for alignment. If no keypoint matches are found, fallback shift values found in previous tests are used; the usual values shift the bottom FOV by 218 px up and 7 px to the right. The alignment of the FOVs can also be done by hand once, and the same shift vector can then be used for all other alignments. In Fig. 2, the shifted FOVs are visualized by coloured rectangles (Figs. 1 and 2).
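To make the matching step concrete, the sketch below estimates the shift vector between two neighbouring FOVs with AKAZE and k-nearest-neighbour matching in OpenCV. It is a minimal illustration, not the paper's actual implementation: the function name, the ratio-test constant of 0.75 and the assumption of 8-bit input strips are ours.

```python
import cv2
import numpy as np
from collections import Counter

def estimate_shift(top_fov, bottom_fov, strip_height=250):
    """Estimate the shift vector aligning the bottom FOV to the top FOV.

    Assumes 8-bit grayscale inputs (the 16-bit Bionano images would be
    rescaled first); returns None when no matches are found, in which case
    the caller falls back to the default shift (218 px up, 7 px right).
    """
    # Crop the overlapping strips: bottom of the top FOV, top of the bottom FOV
    top_strip = top_fov[-strip_height:, :]
    bottom_strip = bottom_fov[:strip_height, :]

    akaze = cv2.AKAZE_create()
    kp1, des1 = akaze.detectAndCompute(top_strip, None)
    kp2, des2 = akaze.detectAndCompute(bottom_strip, None)
    if des1 is None or des2 is None:
        return None

    # k-nearest-neighbour matching with Lowe's ratio test
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    shifts = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            p1 = np.array(kp1[pair[0].queryIdx].pt)
            p2 = np.array(kp2[pair[0].trainIdx].pt)
            shifts.append(tuple(np.round(p1 - p2).astype(int)))
    if not shifts:
        return None
    # The most frequent shift vector (in the paper, collected over multiple
    # images) is taken as the alignment shift
    return Counter(shifts).most_common(1)[0][0]
```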
Fig. 1. Feature-matched image crops with AKAZE. Highlighted matches show that not all matches are correct in one image, and the alignment has to be done on multiple images to get the correct shift values.
3 Proposed Method
The analysis process is split into three main parts. The first part is the alignment of the input image for the neural network. The second part is semantic segmentation of the DNA molecules in the image via a neural network; it outputs a mask image that clearly shows where the DNA molecules in the image are. The third step is to use the watershed algorithm [21] on the mask image; it clearly separates each molecule found by the neural network. The images contain a lot of noise, and each molecule has a different brightness level, so it is hard to filter the molecules with threshold values alone. Molecules have varying brightness even within the same image: one threshold value could clearly separate molecules in one part of the image and make them easily analysable, but it could filter out other molecules in a different part of the image, and the information would be lost. Molecules in the images are about 5 pixels wide, and there are many cases where molecules lying right next to each other have a border between them that is brighter than molecules elsewhere. There are also some capturing artefacts from the camera that make the image analysis harder. All these obstacles make this a great candidate for using a neural network for the analysis: where threshold filtering would fail, the neural network can learn to overcome the brightness variances and camera artefacts easily. A zoomed example of the molecule image is shown in Fig. 3. The problem that needs to be solved by the neural network is called semantic segmentation. There are many other works focusing on medical image segmentation that use a neural network architecture called U-net [22] and its variants [23–25]. The work that tried to solve the most similar problem was the analysis of kymograph output [5]; a version of U-net is used in that work. The images that they show in their examples are similar to Bionano images: grey lines on a black background with a lot of noise.
Fig. 2. Original image on the top, aligned image on the bottom; each coloured rectangle is one FOV. This image is only a crop of 2 FOVs; in reality, 4 of these FOVs are aligned. (Brightness levels were adjusted to make the lines visible.)
Fig. 3. Example of molecules in an image. The image on the right is a close-up of the image on the left. (Brightness levels were adjusted to make the lines visible.)
They provide a website where it is possible to upload a small image sample and get the result from their model. The test showed promising results, but the lines detected by their network jumped from one molecule to another; the network architecture was therefore a good choice, but it had to be trained on our own data.
U-net is a Fully Convolutional Network (FCN) [14]. The only things trained in the network are the weights of the convolutional filters; there are no fully connected layers, which means that the network can accept images of different sizes than those it was trained on. The training process is more computationally intensive, so smaller images had to be used. When the trained network is used to segment an image, full-size images can be used and there is no need to split the image. The input to the network is an aligned image with molecules, and the output is an image of the same size in which only the molecules are clearly highlighted. This makes it easier to navigate through the image, distinguish molecules from each other, and analyse them.
3.1 Training Pipeline
Fig. 4. Training pipeline: 1. align FOVs; 2. extract molecule data from BNX and create masks; 3. split images into halves; 4. train the neural network (U-net)
The training pipeline is visualized in Fig. 4. To train the U-net architecture for semantic segmentation, a training dataset has to be created. At first, all images from the Banks have to be aligned, and for every aligned picture the desired output mask picture is created. The mask is a binary image that is 0 for background and 1 for molecule. The positions of the molecules are extracted from the BNX file; the positions have to be recalculated with respect to the FOV alignment. The TensorFlow framework [20] was used to create the U-net neural network. Dropout layers were added between each convolutional block to prevent overfitting the network during training. The Adam optimizer was used with a learning rate of 1e-4. The dataset is split into training and validation parts, so that meaningful accuracy results can be shown during training.
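A small U-net of this kind can be assembled in TensorFlow/Keras as sketched below, with a dropout layer after each convolutional block and the Adam optimizer at a learning rate of 1e-4, as stated above. The depth, filter counts and binary cross-entropy loss are illustrative assumptions, since the paper does not list the exact layer configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters, dropout=0.2):
    """Two 3x3 convolutions followed by dropout to reduce overfitting."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Dropout(dropout)(x)

def build_unet(input_shape=(None, None, 1), base_filters=16):
    """A minimal U-net sketch; spatial dimensions are assumed divisible by 4."""
    inputs = layers.Input(input_shape)
    # Encoder
    c1 = conv_block(inputs, base_filters)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, base_filters * 2)
    p2 = layers.MaxPooling2D()(c2)
    # Bottleneck
    b = conv_block(p2, base_filters * 4)
    # Decoder with skip connections to the encoder blocks
    u2 = layers.Conv2DTranspose(base_filters * 2, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.Concatenate()([u2, c2]), base_filters * 2)
    u1 = layers.Conv2DTranspose(base_filters, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.Concatenate()([u1, c1]), base_filters)
    # One output channel: per-pixel probability of "molecule"
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)

    model = Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
    return model
```

Because the model is fully convolutional (the input shape leaves the spatial dimensions open), it can be trained on half-images with batch size 1 and later applied to full-resolution images, as described in Sect. 4.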
3.2 Molecule Extraction Pipeline
The input image has to be aligned before it is fed to the network, the same as during the training phase. The output of the neural network, the mask, is processed
with the watershed algorithm, which labels each molecule with a unique colour, making it easier to work with. The extraction pipeline is visualized in Fig. 5.

Fig. 5. Extraction pipeline: 1. align FOVs; 2. neural network; 3. watershed
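A sketch of step 3 with OpenCV is shown below, following the watershed tutorial cited as [21]; the dilation count and the distance-transform fraction are illustrative assumptions rather than values from the paper.

```python
import cv2
import numpy as np

def label_molecules(mask, fg_fraction=0.5):
    """Separate molecules in the binary network mask via the watershed algorithm."""
    binary = (mask > 0).astype(np.uint8) * 255
    kernel = np.ones((3, 3), np.uint8)
    # Sure background: a dilated version of the mask
    sure_bg = cv2.dilate(binary, kernel, iterations=2)
    # Sure foreground: pixels far from any molecule boundary
    dist = cv2.distanceTransform(binary, cv2.DIST_L2, 3)
    _, sure_fg = cv2.threshold(dist, fg_fraction * dist.max(), 255, 0)
    sure_fg = sure_fg.astype(np.uint8)
    unknown = cv2.subtract(sure_bg, sure_fg)
    # One integer marker per molecule seed; 0 marks the region to be flooded
    _, markers = cv2.connectedComponents(sure_fg)
    markers = markers + 1
    markers[unknown == 255] = 0
    # Watershed expects a 3-channel image
    markers = cv2.watershed(cv2.cvtColor(binary, cv2.COLOR_GRAY2BGR), markers)
    return markers  # unique label per molecule; -1 marks watershed boundaries
```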
4 Experiments
The data we used for the experiments was a scan of a human genome. The total amount of data produced by Bionano in this one folder is 953 GB. There are 28 folders with scans, each having 4 Banks that contain 137 pairs of images with molecule data. The BNX text file holds 2.5 GB of molecule data. The GPU used for this test was an NVIDIA V100 with 32 GB of VRAM [19]. An NVIDIA GTX 1080 8 GB card was also used at one point, but because of its low amount of memory, it could neither train the network nor be used after the training was done on the other card. For reasons of simplicity and to save time, only one 4 GB Bank folder was used for these experiments. 5 images were removed from this Bank so they could be used for testing after the training was done.
4.1 Memory Problems and Training Results
The combination of the neural network architecture and the high resolution of the images is too demanding even for modern high-end GPUs, so during training each image had to be split in half; otherwise, the training process would crash with the GPU running out of memory. The batch size had to be set to 1 because of the amount of VRAM required by the high-resolution images; with a higher number, the training would not start. The training ended after the 46th epoch with precision 0.8561 and recall 0.8499 for training images and precision 0.7624 and recall 0.7584 for validation images. These numbers might seem low, but they are affected by the dropout layers that are used only during training to increase robustness against errors.
4.2 Molecule Extraction Test
Since the VRAM usage when using the network is far lower than during training, the images do not have to be split in half, and the full-size image is used as input.
The number and overall positions of our detected molecules are then compared to Bionano's detected molecules. As the results in Table 1 show, the neural network was not able to detect the same number of molecules as Bionano. Further improvements to the neural network architecture, or data augmentation during training, are needed to get closer to the Bionano results. These results were obtained on images that were part of neither the training nor the validation dataset; the network did not work with them during training in any way (Fig. 6).
Fig. 6. The image on the left is a crop from the input image, the middle image is the output from the neural network (the mask), and on the right is the image with the labelled molecules after the watershed algorithm
Table 1. Neural network molecule extraction results, where the neural network numbers represent the number of molecules detected in the image by our method and Bionano is the number of molecules present in the BNX file. Precision and recall are calculated from the output image of the neural network compared to the Bionano results. These results were obtained on images that were part of neither the training nor the validation dataset.

Image    Neural network   Bionano   Precision   Recall
Image1   1035             1075      89.35       94.56
Image2   1268             1342      88.78       82.97
Image3   1469             1567      90.37       94.79
Image4   1564             1613      89.54       94.65
Image5   1192             1284      89.40       94.21
5 Conclusions
This paper introduced the usage of a neural network to help with the analysis of optical map images produced by the Bionano Genomics Saphyr system. From the experiments we found that this approach is not yet as accurate as the results produced
by the Bionano Genomics software. Further improvements are needed: more training images can be used to train the model, the lines in the training mask data can be made wider, the input data can be augmented during training, or the structure of the neural network can be changed so that the network outputs the positions of the restriction sites directly instead of the mask, in which case further manual post-processing could be skipped. During the training of the network there were many problems with insufficient GPU memory; the training dataset images had to be split in half, even though the GPU used had 32 GB of VRAM. The FCN architecture of the network allows it to be used with different sizes of input images; during the tests, the network processed images in full resolution and no splitting was required. The output of the network can be used for future work. Each extracted molecule has some restriction sites on it, and they have to be precisely detected. The pixel values need to be read along the line. If the values are plotted, we get the molecule signal function, in which we are interested in the local maxima. This analysis needs to be done for both channels, and the maxima then have to be combined. The maxima on the line are where the restriction sites are located. Once we have the positions of the maxima, we need to calculate the distances between them. These distances have to be multiplied by the number of base pairs per pixel at which the scans were done. The lengths between the restriction sites are used when creating the optical map of a complete genome and during the alignment of scanned molecules to a reference DNA genome.

Acknowledgement. This work was supported by a grant from the Ministry of Health of the Czech Republic (NU20-06-00269), and the Internal Grant Agency of VSB-Technical University of Ostrava (SP2021/94).
References

1. France Génomique: Optical Mapping - France Génomique (2021). https://www.france-genomique.org/technological-expertises/whole-genome/optical-mapping/?lang=en. Accessed 25 June 2021
2. Aston, C., Mishra, B., Schwartz, D.C.: Optical mapping and its potential for large-scale sequencing projects. Trends Biotechnol., pp. 297–302 (1999). https://doi.org/10.1016/S0167-7799(99)01326-8
3. Yuan, Y., Chung, C.Y.L., Chan, T.F.: Advances in optical mapping for genomic research. Comput. Struct. Biotechnol. J. 18, 2051–2062 (2020)
4. Bionano Saphyr. https://bionanogenomics.com/products/saphyr/. Accessed 24 June 2021
5. Jakobs, M., Dimitracopoulos, A., Franze, K.: KymoButler, a deep learning software for automated kymograph analysis. eLife 8, e42288. https://doi.org/10.7554/elife.42288
6. comma.ai - introducing openpilot. https://comma.ai/. Accessed 24 June 2021
7. Kite - Free AI Coding Assistant and Code Auto-Complete Plugin. https://www.kite.com/. Accessed 24 June 2021
8. Nasser, I.M., Abu-Naser, S.S.: Lung cancer detection using artificial neural network. Int. J. Eng. Inf. Syst. (IJEAIS) 3(3), 17–23 (2019)
9. Wang, L., Lin, Z.Q., Wong, A.: COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Sci. Rep. 10(1), 1–12 (2020)
10. Shelton, J.M., et al.: Tools and pipelines for BioNano data: molecule assembly pipeline and FASTA super scaffolding tool. BMC Genom. 16(1) (2015). https://doi.org/10.1186/s12864-015-1911-8
11. Yuan, Y., Yik-Lok Chung, C., Chan, T.-F.: Advances in optical mapping for genomic research. Comput. Struct. Biotechnol. J. (2020). https://doi.org/10.1016/j.csbj.2020.07.018
12. Egmont-Petersen, M., de Ridder, D., Handels, H.: Image processing with neural networks - a review. Pattern Recogn. 35(10), 2279–2301 (2002). https://doi.org/10.1016/s0031-3203(01)00178-9
13. Lin, C.T., Lee, C.S.G.: Neural-network-based fuzzy logic control and decision system. IEEE Trans. Comput. 40(12), 1320–1336 (1991)
14. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
15. Alcantarilla, P.F., Solutions, T.: Fast explicit diffusion for accelerated features in nonlinear scale spaces. IEEE Trans. Patt. Anal. Mach. Intell. 34(7), 1281–1298 (2011)
16. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 404–417. Springer, Heidelberg (2006). https://doi.org/10.1007/11744023_32
17. Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157. IEEE (1999)
18. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: 2011 International Conference on Computer Vision, pp. 2564–2571. IEEE (2011)
19. NVIDIA V100 - Tensor Core GPU. https://images.nvidia.com/content/technologies/volta/pdf/volta-v100-datasheet-update-us-1165301-r5.pdf. Accessed 24 June 2021
20. TensorFlow. https://www.tensorflow.org/. Accessed 24 June 2021
21. Image Segmentation with Watershed Algorithm. https://docs.opencv.org/4.5.2/d3/db4/tutorial_py_watershed.html. Accessed 24 June 2021
22. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation (2015)
23. Weng, Y., Zhou, T., Li, Y., Qiu, X.: NAS-Unet: neural architecture search for medical image segmentation. IEEE Access 7, 44247–44257 (2019)
24. Huang, H., et al.: UNet 3+: a full-scale connected UNet for medical image segmentation. In: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1055–1059. IEEE (2020)
25. Yan, W., et al.: The domain shift problem of medical image segmentation and vendor-adaptation by Unet-GAN. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11765, pp. 623–631. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32245-8_69
26. Chan, S., et al.: Structural variation detection and analysis using Bionano optical mapping (2018)
27. Bocklandt, S., Hastie, A., Cao, H.: Bionano genome mapping: high-throughput, ultra-long molecule genome analysis system for precision genome assembly and haploid-resolved structural variation discovery (2019)
Evolutionary Multi-level Thresholding for Breast Thermogram Segmentation Arti Tiwari1(B) , Kamanasish Bhattacharjee1 , Millie Pant1 , Jana Nowakova2 , and Vaclav Snasel2 1 Department of Applied Science and Engineering, Indian Institute of Technology Roorkee,
Roorkee 247667, Uttarakhand, India {atiwari1,kbhattacharjee,pant.milli}@as.iitr.ac.in 2 Department of Computer Science, VŠB-Technical University of Ostrava, 17. listopadu 2172/15, Ostrava-Poruba 708 00, Czech Republic {jana.nowakova,vaclav.snasel}@vsb.cz
Abstract. In the pre-processing of digital thermograms, multi-level thresholding plays a crucial role in the segmentation of thermographic images for better clinical decision support. This paper attempts to optimize the multi-level thresholding method for thermographic image segmentation using Differential Evolution (DE) with Otsu's between-class variance. We have compared the results of the proposed method with other popular metaheuristics: PSO, GWO and WOA. We have applied the Wilcoxon rank-sum test for the performance evaluation.
1 Introduction

The advent of computer-aided diagnosis in the medical field has established thermography for the early diagnosis of breast tumors. Thermography in medical imaging captures the temperature distribution over the surface of the human body based on the infrared radiation emitted by the body surface. The presence of a tumor increases the temperature of the cells around it; thus, the temperature difference among the cells can be recognized by thermography. Thermography has major advantages over other diagnostic imaging techniques (mammography, X-ray, ultrasonography): it is non-invasive, non-ionizing, painless, and provides real-time screening [1]. Segmentation is the preliminary step in the process of any sort of image analysis. For better visualization and to focus on the region of interest, images are segmented based on colour, texture, shape, size or discontinuity. Among the existing segmentation techniques, thresholding is a simple and sufficiently robust method that takes less storage space and also provides speedy processing. Multi-level thresholding is a well-established method in image processing. However, in conventional methods, selecting the optimal thresholds is a time-consuming process, which can be resolved by metaheuristic algorithms [2]. In the past few years, researchers have explored metaheuristic algorithms for optimizing the threshold values to segment images. In [3], Akay explored two successful swarm-intelligence-based global optimization algorithms - Particle Swarm Optimization
(PSO) [4] and Artificial Bee Colony (ABC) [5] - to find the optimal multi-level thresholds using Kapur's entropy and between-class variance as fitness functions. In [6] and [7], Kumar et al. used Differential Evolution (DE) [8] and a DE variant named DE with Modified Random Localization (MRLDE) [9], respectively, for image thresholding. In [10], Samantaray et al. explored two nature-inspired metaheuristic techniques, Harris Hawks Optimization (HHO) [11] and Cuckoo Search (CS) [12], and proposed a hybrid metaheuristic HHO-CS that incorporates the exploitation capability of CS into HHO. A novel multi-level thresholding algorithm using Krill Herd Optimization (KHO) [13] was proposed by Resma et al. [14]; here, optimum threshold values are obtained by maximizing Kapur's and Otsu's objective functions. In [15], He and Huang proposed an Efficient Krill Herd (EKH) algorithm and also used Tsallis entropy along with Otsu's and Kapur's. They compared the results of EKH with those of seven different metaheuristics: KH without any genetic operators (KH I), KH with crossover operator (KH II), KH with crossover and mutation operators (KH IV), modified firefly algorithm (MFA), modified grasshopper optimization algorithm (MGOA), bat algorithm (BA) and water cycle algorithm (WCA). In [16], Aziz et al. examined the ability of two nature-inspired algorithms, Whale Optimization Algorithm (WOA) and Moth-Flame Optimization (MFO), to determine the optimal multi-level thresholding for image segmentation. MFO showed better results than WOA, as well as providing a good balance between exploration and exploitation in all images at small and high threshold numbers. In [17], Elaziz and Lu treated multi-level thresholding as a many-objective optimization (MaOP) problem and solved it using the Knee Evolutionary Algorithm (KnEA) [18]. In [19], Sowjanya and Injeti investigated the ability of two different algorithms, namely BOA (Butterfly Optimization Algorithm) [20] and GBMO (Gases Brownian Motion Optimization) [21], to determine the optimal threshold levels for image segmentation. In [22], Srikanth et al. discussed the drawbacks of using the histogram for image segmentation. Wang et al. [23] used a modified ant lion optimizer based on opposition-based learning (MALO) to determine the optimum threshold values for image segmentation by maximizing Otsu's and Kapur's entropy; by introducing the opposition-based learning strategy, the search accuracy and convergence performance are increased. Apart from the mentioned algorithms, numerous other metaheuristic techniques [24–27] have been discussed for the multi-level thresholding problem. In this paper, multi-level thresholding segmentation has been performed on breast tumor thermograms; for optimizing the threshold values, we have employed the Differential Evolution (DE) algorithm. The results have been compared with other metaheuristics: Particle Swarm Optimization (PSO), Grey Wolf Optimizer (GWO) [28] and Whale Optimization Algorithm (WOA) [29]. The proposed technique shows better results in terms of between-class variance. We have applied the Wilcoxon rank-sum test as a nonparametric alternative, and the proposed technique has shown efficient performance in almost every test.
2 Methodology In this paper, multi-level thresholding has been used for breast tumor thermogram segmentation. In multi-level thresholding, a number of thresholds are selected to divide the
image into different regions for the ease of further analysis. For optimizing the thresholds, we have employed Differential Evolution (DE) with the DE/rand/1 mutation strategy. The objective of the proposed method is to maximize Otsu's between-class variance [30]:

$f = \text{maximize} \; \sigma_{bv}^2$   (1)

where $\sigma_{bv}^2$ is the between-class variance after segmenting the image into various classes based on the different thresholds. $\sigma_{bv}^2$ can be defined as

$\sigma_{bv}^2 = \sigma_{tv}^2 - \sigma_{wv}^2$   (2)

where $\sigma_{tv}^2$ is the total variance of the image and $\sigma_{wv}^2$ denotes the within-class variance:

$\sigma_{wv}^2 = \sum_{i=2}^{C} w_i \, \sigma_{c_i}^2$   (3)

where $C$ represents the number of classes. If there are in total $N_t$ thresholds, represented as $Th = [th_1, th_2, \ldots, th_{N_t}]$, then the number of classes is $C = N_t + 1$. Here, the initial value of $i$ is taken as 2 to represent the minimum of 2 classes (binarization of a gray-level image). $\sigma_{c_i}^2$ and $w_i$ represent the variance of an individual class and the weight (pixel probability) of the corresponding class $i$. Suppose there is a single threshold $th$; then the pixel intensity ranges for the two classes $C_1$ and $C_2$ will be $[0, th-1]$ and $[th, I]$ respectively, where $I$ is the maximum pixel intensity in the image. The individual class variances $\sigma_{c_i}^2$ are then evaluated as

$\sigma_{c_1}^2 = \sum_{i=0}^{th-1} \left(i - \mu_{c_1}\right)^2 \frac{P_i}{w_{c_1}}, \qquad \sigma_{c_2}^2 = \sum_{i=th}^{I} \left(i - \mu_{c_2}\right)^2 \frac{P_i}{w_{c_2}}$   (4)

where

$w_{c_1} = \sum_{i=0}^{th-1} P_i, \qquad w_{c_2} = \sum_{i=th}^{I} P_i$   (5)

and

$P_i = \frac{n_i}{n}$   (6)

where $n_i$ is the number of pixels with intensity $i$ and $n$ is the total number of pixels in the image. $\mu_{c_i}$ denotes the mean of class $i$:

$\mu_{c_1} = \sum_{i=0}^{th-1} i \, \frac{P_i}{w_{c_1}}, \qquad \mu_{c_2} = \sum_{i=th}^{I} i \, \frac{P_i}{w_{c_2}}$   (7)
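For illustration, the objective of Eqs. (1)–(7) can be evaluated directly from an image histogram, as in the Python/NumPy sketch below; the function name and the skipping of empty classes are our own assumptions.

```python
import numpy as np

def otsu_between_class_variance(hist, thresholds):
    """Between-class variance for a set of thresholds, per Eqs. (1)-(7).

    Computed as total variance minus within-class variance over the
    classes induced by the thresholds.
    """
    levels = np.arange(len(hist))
    p = hist / hist.sum()                                  # P_i = n_i / n
    total_mean = (levels * p).sum()
    total_var = ((levels - total_mean) ** 2 * p).sum()     # sigma^2_tv

    bounds = [0] + sorted(int(t) for t in thresholds) + [len(hist)]
    within = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = p[lo:hi].sum()                                 # class weight w_ci
        if w == 0:
            continue                                       # skip empty classes
        mu = (levels[lo:hi] * p[lo:hi]).sum() / w          # class mean mu_ci
        var = ((levels[lo:hi] - mu) ** 2 * p[lo:hi]).sum() / w
        within += w * var                                  # sigma^2_wv term
    return total_var - within                              # sigma^2_bv
```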
DE is applied to optimize the threshold values and to attain the best fitness function value. The step-by-step procedure is described below.
Step 1: Read the thermogram and store it as a gray-level image.
Step 2: Obtain the histogram of the gray-level image.
Step 3: Initialize the DE parameters: scaling factor (F), crossover rate (Cr) and population size (NP).
Step 4: Initialize the population X of NP uniformly distributed random individuals of dimension Nt, where NP is the number of individuals in the population, Nt is the dimension of an individual, and G below denotes the generation.
Step 5: Evaluate Otsu's between-class variance objective function using Eq. (1).
Step 6: While the termination criterion is not satisfied, repeat Step 7.
Step 7: Perform the mutation operation to expand the search space. In the DE/rand/1 mutation strategy, for each target vector $x_i^G$ (i = 1 to NP), three mutually distinct random indices $r_1, r_2, r_3 \neq i$ are selected, and a corresponding mutant vector is generated as $v_i^G = x_{r_1}^G + F \cdot (x_{r_2}^G - x_{r_3}^G)$. Perform crossover to increase diversity in the search space: the trial vector is generated as a combination of the target vector and the mutant vector, for j = 1 to Nt, $u_{i,j}^G = v_{i,j}^G$ if $rand_j \leq Cr$ and $u_{i,j}^G = x_{i,j}^G$ otherwise, where $rand_j$ is a randomly generated number between 0 and 1 and Cr is the crossover probability. Then, for i = 1 to NP, perform tournament selection between the trial vector and the target vector: the individual with the better objective function value (here, the maximum fitness value) survives to the next generation, i.e. $x_i^{G+1} = u_i^G$ if $f(u_i^G) \geq f(x_i^G)$ and $x_i^{G+1} = x_i^G$ otherwise, where $f$ is the objective function.
Step 8: Select the best thresholds (Th) for the segmentation of the thermograms.
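A compact sketch of Steps 1–8 in Python is given below. It reuses otsu_between_class_variance from the previous listing and the parameter values reported later in Table 1 (F = 0.55, Cr = 0.52, NP = 10, 35 generations); the clipping of mutant vectors to the valid intensity range and the forced crossover position are standard DE details assumed here.

```python
import numpy as np

def de_thresholds(hist, nt=4, np_pop=10, f=0.55, cr=0.52, generations=35, seed=0):
    """DE/rand/1/bin search for Nt thresholds maximizing Otsu's objective."""
    rng = np.random.default_rng(seed)
    levels = len(hist)
    # Step 4: uniformly distributed initial population of threshold vectors
    pop = rng.uniform(1, levels - 1, size=(np_pop, nt))
    fitness = np.array([otsu_between_class_variance(hist, ind) for ind in pop])

    for _ in range(generations):                           # Step 6
        for i in range(np_pop):
            # Step 7: DE/rand/1 mutation with three distinct random indices
            r1, r2, r3 = rng.choice([k for k in range(np_pop) if k != i],
                                    3, replace=False)
            mutant = np.clip(pop[r1] + f * (pop[r2] - pop[r3]), 1, levels - 1)
            # Binomial crossover between target and mutant vectors
            mask = rng.random(nt) < cr
            mask[rng.integers(nt)] = True   # at least one gene from the mutant
            trial = np.where(mask, mutant, pop[i])
            # Tournament selection (maximization)
            trial_fit = otsu_between_class_variance(hist, trial)
            if trial_fit >= fitness[i]:
                pop[i], fitness[i] = trial, trial_fit
    best = pop[np.argmax(fitness)]                         # Step 8
    return np.sort(best.astype(int))
```

For Nt = 4 thresholds on a 256-bin histogram, de_thresholds(hist, nt=4) returns the four threshold values with the highest between-class variance found.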
3 Experimental Analysis and Results

3.1 Experimental Setup (Software and Hardware Specification)

The experiments with the proposed method were conducted in the Spyder 4.1.4 Integrated Development Environment (IDE) with Python 3.7.7 through the Anaconda distribution on
an Intel Xeon Gold 6240 2.6 GHz dual-processor system with 192 GB RAM, an Nvidia Quadro RTX 8000 GPU and a 64-bit Windows 10 Education operating system.

3.2 Parametric Setup

The parameters of the proposed method and the other comparative algorithms are detailed in Table 1.

Table 1. Parameters used for DE and the other algorithms.

DE parameters               Value        PSO parameters          Value        GWO parameters          Value        WOA parameters          Value
Scaling Factor (F)          0.55         C1, C2                  2, 2         NP                      10           NP                      10
Crossover Probability (Cr)  0.52         Inertia weight          0.85         Generation              35           Generation              35
NP                          10           NP                      10           Iteration               5            Iteration               5
Generation                  35           Generation              35           No of thresholds (Nt)   2, 3, 4, 5   No of thresholds (Nt)   2, 3, 4, 5
Iteration                   5            Iteration               5
No of thresholds (Nt)       2, 3, 4, 5   No of thresholds (Nt)   2, 3, 4, 5
3.3 Dataset

The dataset used in this paper is the Database For Mastology Research (DMR) [31]. This dataset provides five different profiles of human breast thermograms. The thermograms are of size 480 × 640 pixels. Figure 1 shows the different test images taken for the experiments.
Fig. 1. Test images used for the experiments: front, left 45 degree, left 90 degree, right 45 degree, right 90 degree
Fig. 2. Segmented images obtained from the different algorithms (rows: test images H_F, H_L45, H_L90, H_R45, H_R90; columns: test image, DE, PSO, GWO, WOA)
Table 2. Comparative study of the objective function values of DE and the other algorithms

Test image   Nt   DE        PSO       GWO       WOA
F            2    5184.92   5183.62   5184.89   5184.89
F            3    5292.06   5290.20   5291.60   5275.70
F            4    5326.54   5323.92   5318.53   5305.70
F            5    5351.74   5345.36   5345.16   5342.03
L45          2    7166.05   7165.99   7166.05   7165.94
L45          3    7225.63   7222.85   7224.72   7219.98
L45          4    7249.30   7244.72   7246.83   7240.58
L45          5    7270.18   7262.02   7259.32   7259.10
L90          2    6308.71   6308.57   6308.68   6308.58
L90          3    6360.26   3660.17   6360.18   6354.35
L90          4    6390.91   6387.97   6389.15   6377.04
L90          5    6407.93   6400.53   6403.04   6396.33
R45          2    6850.50   6850.19   6850.48   6850.46
R45          3    6906.23   6905.98   6905.27   6903.74
R45          4    6930.56   6917.94   6922.19   6922.46
R45          5    6951.27   6944.39   6944.76   6944.55
R90          2    6096.55   6096.55   6096.29   6095.72
R90          3    6135.06   6133.35   6134.21   6134.94
R90          4    6154.84   6150.50   6151.85   6146.66
R90          5    6172.17   6168.91   6168.39   6162.52
3.4 Results Analysis

The results shown in Fig. 2 are the segmented thermograms generated by the different algorithms used in the experiments. For the comparative analysis, Table 2 shows the resulting values of the objective functions obtained after evaluating the different algorithms. The results show that the proposed method performs quite well in all cases. Table 3 presents the optimum threshold values obtained from the different algorithms for distinct numbers of thresholds. For the experiments, we have taken four different numbers of thresholds, Nt = [2, 3, 4, 5], which segment the thermogram into 3, 4, 5 and 6 classes, respectively. Figure 3 shows the thresholds mapped over the histograms for the distinct threshold values. The Wilcoxon rank-sum test is used to determine which algorithm ranks higher in terms of the resulting values of the objective functions. With 35 samples, the significance value alpha is taken as 0.05. The null hypothesis (H0) is "there is no significant difference between the final values of the objective function", and the alternative hypothesis (H1) is "there is a significant difference between the final values of the objective function". Depending on the p-values, Table 4 shows the acceptance or rejection of the null hypothesis. The results show that the proposed method outperformed the other algorithms in most cases. Regarding the distinct threshold values, it can be generalized that when the number of thresholds is low, i.e. Nt = 2 or 3, all the algorithms perform almost equally, but when the number of thresholds is increased to 4 and 5, the proposed method performs more efficiently than the others.
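The decision rule used to fill Table 4 can be expressed in a few lines with SciPy, as sketched below; the function name is illustrative, and the input arrays are assumed to hold the 35 final objective values per algorithm.

```python
from scipy.stats import ranksums

def compare_algorithms(de_values, other_values, alpha=0.05):
    """Wilcoxon rank-sum test between the final objective values of DE
    and another algorithm; returns the Accept/Reject decision on H0."""
    _, p_value = ranksums(de_values, other_values)
    # H0: no significant difference between the final objective values
    return ("Reject" if p_value < alpha else "Accept"), p_value
```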
Table 3. Threshold values obtained from DE and the other algorithms

Test image   Nt   DE                       PSO                      GWO                      WOA
F            2    80, 164                  100, 191                 78, 163                  82, 167
F            3    73, 154, 199             67, 155, 201             72, 149, 198             95, 146, 200
F            4    73, 155, 192, 211        73, 155, 196, 214        54, 111, 163, 200        56, 76, 156, 201
F            5    45, 102, 158, 190, 211   54, 92, 164, 196, 213    34, 83, 147, 190, 209    42, 74, 156, 190, 211
L45          2    98, 193                  100, 193                 799, 192                 101, 192
L45          3    88, 173, 205             101, 179, 204            87, 174, 207             75, 173, 205
L45          4    56, 123, 177, 207        71, 138, 178, 207        51, 113, 179, 206        26, 104, 179, 204
L45          5    53, 120, 169, 192, 214   80, 128, 177, 197, 222   22, 78, 160, 183, 210    21, 76, 127, 178, 206
L90          2    95, 187                  96, 189                  97, 189                  96, 187
L90          3    57, 129, 191             54, 129, 189             52, 128, 190             65, 110, 190
L90          4    51, 116, 174, 199        58, 138, 178, 201        47, 112, 174, 200        24, 55, 123, 190
L90          5    23, 55, 121, 173, 198    51, 118, 167, 188, 214   51, 115, 161, 180, 201   49, 91, 129, 173, 198
R45          2    97, 190                  99, 190                  98, 189                  101, 187
R45          3    92, 175, 205             94, 175, 206             86, 174, 203             104, 171, 202
R45          4    87, 161, 187, 211        96, 170, 202, 228        50, 114, 176, 204        63, 117, 175, 205
R45          5    29, 91, 166, 190, 213    51, 109, 172, 194, 214   69, 129, 167, 194, 215   67, 118, 163, 189, 213
R90          2    101, 187                 98, 188                  94, 189                  107, 188
R90          3    95, 172, 200             103, 170, 202            95, 167, 196             95, 170, 200
R90          4    93, 165, 189, 21         56, 122, 177, 206        28, 97, 172, 203         44, 94, 175, 203
R90          5    30, 95, 164, 186, 210    45, 109, 165, 184, 211   51, 100, 165, 187, 208   28, 84, 127, 172, 201
Fig. 3. Mapping of threshold values (2, 3, 4, 5) over the histograms and segmented images
Table 4. Wilcoxon rank-sum test results (acceptance or rejection of H0 based on the p-values) obtained between DE and the other algorithms

Test image   Nt   DE vs PSO   DE vs GWO   DE vs WOA
HF           2    Reject      Reject      Accept
HF           3    Reject      Reject      Reject
HF           4    Accept      Reject      Reject
HF           5    Accept      Reject      Reject
H_L45        2    Accept      Reject      Reject
H_L45        3    Reject      Reject      Reject
H_L45        4    Accept      Accept      Accept
H_L45        5    Reject      Reject      Reject
H_L90        2    Reject      Accept      Accept
H_L90        3    Accept      Accept      Reject
H_L90        4    Accept      Accept      Reject
H_L90        5    Reject      Reject      Reject
H_R45        2    Reject      Reject      Reject
H_R45        3    Reject      Reject      Reject
H_R45        4    Reject      Reject      Accept
H_R45        5    Reject      Reject      Reject
H_R90        2    Reject      Reject      Reject
H_R90        3    Accept      Reject      Accept
H_R90        4    Reject      Accept      Reject
H_R90        5    Reject      Reject      Reject
4 Conclusion and Future Scope

This paper showed the application of DE-leveraged Otsu's method for thermographic image segmentation for breast tumor identification. Numerical and statistical results, along with a comparison with the other methods - Particle Swarm Optimization (PSO), Grey Wolf Optimization (GWO), and Whale Optimization Algorithm (WOA) - indicate the competence of the proposed technique. The results have been validated by comparing the maximum between-class variance and by applying the Wilcoxon rank-sum test to the results. In the future, this method can be applied to different datasets or to different application areas. This work can be extended for further analysis by employing other objective functions, and comparisons can be made with other existing metaheuristic algorithms.

Acknowledgements. This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic in the project "Metaheuristics Framework for Multi-objective Combinatorial Optimization Problems (META MO-COP)", reg. no. LTAIN19176 and DST/INT/Czech/P-12/2019.
References

1. Singh, D., Singh, A.K.: Role of image thermography in early breast cancer detection - past, present and future. Comput. Methods Programs Biomed. 183, 105074 (2020)
2. Abd Elaziz, M., Nabil, N., Moghdani, R., Ewees, A.A., Cuevas, E., Lu, S.: Multi-level thresholding image segmentation based on improved volleyball premier league algorithm using whale optimization algorithm. Multimed. Tools Appl. 80(8), 12435–12468 (2021)
3. Akay, B.: A study on particle swarm optimization and artificial bee colony algorithms for multi-level thresholding. Appl. Soft Comput. J. 13(6), 3066–3091 (2013)
4. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of ICNN 1995 - International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
5. Karaboga, D.: An idea based on honey bee swarm for numerical optimization. Technical Report tr06, Erciyes University, Engineering Faculty, Computer Engineering Department, vol. 200 (2005)
6. Kumar, S., Pant, M., Ray, A.: Differential evolution embedded Otsu's method for optimized image thresholding. In: Proceedings of the 2011 World Congress on Information and Communication Technologies, WICT 2011, pp. 325–329 (2011)
7. Kumar, S., Kumar, P., Sharma, T.K., Pant, M.: Bi-level thresholding using PSO, artificial bee colony and MRLDE embedded with Otsu method. Memetic Comput. 5(4), 323–334 (2013). https://doi.org/10.1007/s12293-013-0123-5
8. Storn, R., Price, K.: Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997)
9. Kumar, P., Pant, M.: Enhanced mutation strategy for differential evolution. In: 2012 IEEE Congress on Evolutionary Computation, CEC 2012 (2012)
10. Samantaray, L., Hembram, S., Panda, R.: A new Harris Hawks-Cuckoo search optimizer for multi-level thresholding of thermogram images. Rev. d'Intelligence Artif. 34(5), 541–551 (2020)
11. Heidari, A.A., Mirjalili, S., Faris, H., Aljarah, I., Mafarja, M., Chen, H.: Harris hawks optimization: algorithm and applications. Futur. Gener. Comput. Syst. 97, 849–872 (2019)
12. Yang, X.S., Deb, S.: Cuckoo search via Lévy flights. In: 2009 World Congress on Nature and Biologically Inspired Computing, NABIC 2009 - Proceedings, pp. 210–214 (2009)
13. Gandomi, A.H., Alavi, A.H.: Krill herd: a new bio-inspired optimization algorithm. Commun. Nonlinear Sci. Numer. Simul. 17(12), 4831–4845 (2012)
14. Baby Resma, K.P., Nair, M.S.: Multi-level thresholding for image segmentation using Krill Herd Optimization algorithm. J. King Saud Univ. Comput. Inf. Sci. 33(5), 528–541 (2018)
15. He, L., Huang, S.: An efficient Krill Herd algorithm for color image multi-level thresholding segmentation problem. Appl. Soft Comput. J. 89, 106063 (2020)
16. El Aziz, M.A., Ewees, A.A., Hassanien, A.E.: Whale optimization algorithm and Moth-Flame optimization for multi-level thresholding image segmentation. Expert Syst. Appl. 83, 242–256 (2017)
17. Elaziz, M.A., Lu, S.: Many-objectives multi-level thresholding image segmentation using Knee Evolutionary Algorithm. Expert Syst. Appl. 125, 305–316 (2019)
18. Zhang, X., Tian, Y., Jin, Y.: A knee point-driven evolutionary algorithm for many-objective optimization. IEEE Trans. Evol. Comput. 19(6), 761–776 (2015)
19. Sowjanya, K., Injeti, S.K.: Investigation of butterfly optimization and gases Brownian motion optimization algorithms for optimal multi-level image thresholding. Expert Syst. Appl. 182, 115286 (2021)
20. Arora, S., Singh, S.: Butterfly optimization algorithm: a novel approach for global optimization. Soft Comput. 23(3), 715–734 (2018). https://doi.org/10.1007/s00500-018-3102-4
21. Abdechiri, M., Meybodi, M.R., Bahrami, H.: Gases brownian motion optimization: an algorithm for optimization (GBMO). Appl. Soft Comput. J. 13(5), 2932–2946 (2013)
22. Srikanth, R., Bikshalu, K.: Multi-level thresholding image segmentation based on energy curve with harmony search algorithm. Ain Shams Eng. J. 12(1), 1–20 (2021)
23. Wang, S., Sun, K., Zhang, W., Jia, H.: Multi-level thresholding using a modified ant lion optimizer with opposition-based learning for color image segmentation. Math. Biosci. Eng. 18(4), 3092–3143 (2021)
24. Kucukugurlu, B., Gedikli, E.: Meta-heuristic algorithms based multi-level thresholding. In: 27th Signal Processing and Communications Applications Conference, SIU 2019 (2019)
25. Jena, B., Naik, M.K., Wunnava, A., Panda, R.: A comparative study on multi-level thresholding using meta-heuristic algorithms. In: Proceedings - 2019 International Conference on Applied Machine Learning, ICAML 2019, pp. 57–62 (2019)
26. Oliva, D., Abd Elaziz, M., Hinojosa, S.: Multi-level thresholding for image segmentation based on metaheuristic algorithms. In: Studies in Computational Intelligence, vol. 825, pp. 59–69. Springer Verlag (2019). https://doi.org/10.1007/978-3-030-12931-6_6
27. Hammouche, K., Diaf, M., Siarry, P.: A comparative study of various meta-heuristic techniques applied to the multi-level thresholding problem. Eng. Appl. Artif. Intell. 23(5), 676–688 (2010)
28. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Grey wolf optimizer. Adv. Eng. Softw. 69, 46–61 (2014)
29. Mirjalili, S., Lewis, A.: The whale optimization algorithm. Adv. Eng. Softw. 95, 51–67 (2016)
30. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. SMC-9(1), 62–66 (1979)
31. Silva, L.F., et al.: A new database for breast research with infrared image. J. Med. Imaging Health Inform. 4(1), 92–100 (2014)
Identification of the Occurrence of Poor Blood Circulation in Toes by Processing Thermal Images from Flir Lepton Module Martin Radvansky1(B) , Martin Radvansky Jr.2 , and Milos Kudelka1 1
Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VSB–TU, Ostrava–Poruba, Czech Republic {martin.radvansky,milos.kudelka}@vsb.cz 2 Department of Control and Instrumentation, Faculty of Electrical Engineering and Communication, BUT, Brno, Czech Republic [email protected]
Abstract. Poor blood circulation in the toes is a common problem in many diseases, and it is often connected with diabetes. Identifying the onset of this problem is very difficult, especially in the early stages of the disease. Specialized medical examinations are often expensive, so practitioners mostly send patients to them for clearly recognized limb problems or a risky family anamnesis. We focus on designing a tool for identifying poor peripheral blood circulation that uses data obtained from infrared thermography of the toes using a cheap thermal camera module. We identify problems with poor blood circulation from the external symptoms of this disease: local changes in surface temperature. Our proposed method can be used as a decision support tool for general practitioners.
1 Introduction

The blood circulation system is responsible for sending blood, oxygen, and nutrients throughout the body. When blood flow to a specific part of the body is reduced, that part may experience the symptoms of poor circulation. Poor circulation is most common in the body's extremities, such as the legs and arms. This disease, widely known as Peripheral Artery Disease (PAD), mostly results from other health issues. In fact, leg artery blockages represent a danger to health that is equal to or greater than blockages in the arteries to the brain or heart, and can carry an equal or higher risk of heart attack, stroke, amputation, or death. Artery damage caused by ageing, obesity, smoking, heart conditions, diabetes, adverse family history, high blood pressure or high cholesterol levels occurs throughout the body, not in just one place; therefore, PAD is a serious condition that should be diagnosed as soon as possible, and practitioners have to reduce the patient's possible risk as quickly as possible. PAD may be the first warning sign of a serious health risk: one in five people with PAD, if left undiagnosed and untreated, will suffer a heart attack, stroke or death within several years. In addition, untreated PAD can have other serious consequences, including leg muscle pain, discomfort during exertion and subsequent
loss of mobility. Blockages and blood clots in the arteries may lead to pain at rest, chronically cold hands or feet, foot skin ulcers or amputation, and often lead to diabetic foot syndrome (DFS). Identifying PAD, especially in the early stages of the disease, is not an easy task. Based on the known family history and associated diseases, the physician can use basic examinations and patient questioning to get an idea of the possible onset of the disease. If PAD symptoms are not directly visible or detectable on the limb, the physician must rely on the patient to self-report the symptoms. However, these patient symptom reports can be classified as "soft" factors, where patients report non-verifiable feelings (feeling cold, tingling, pain). From the knowledge of the family history and a physical examination, doctors can diagnose the disease or refer patients for further detailed examinations. These specialized tests (CT, X-ray, ultrasonography) are not only very expensive, but the waiting time for them can also be very long. This article describes a method for identifying poor blood circulation problems in the toes that can help doctors decide whether to send a patient for further advanced testing. Our approach is based on a low-cost device, designed by us, that uses infrared thermography to recognize the early stages of PAD. The main idea of using toe thermography is based on the assumption that if the small arteries in the toes are narrowed, they are less able to move blood through the toes and foot. This causes the toes and foot to contain cold spots on the skin, which can be detected by thermography. The designed equipment, together with a decision support tool (DST), can improve the quality of the diagnostics and save money on specialized medical tests.
2 Related Work

One of the first papers about using Infrared Thermal Imaging (TI) for diagnostic purposes in medicine was published by Barnes [1] in 1964. In his work, he introduced a method for analysing physiological processes through abnormal changes in the skin temperature of a human body. Hot or cold areas on the skin indicate different metabolic processes. On the other hand, these changes strongly depend on the heat exchange conditions, the measurement environment and, of course, the patient's thermoregulation. Therefore, interpreting the skin temperature distribution is not an easy task and requires careful preparation of the patient under stable environmental conditions, controlled physical activity, and many other factors. A description of many aspects of using TI can be found in the Diakides monograph [2]. Further overviews of the history and use of TI for diagnostic purposes in medicine may be found in many articles, for example [3, 4]. The papers about using TI can be divided by the aspects of the problems they address. Most articles, especially from the last decade, focus on using different machine learning techniques and artificial intelligence together with machine vision algorithms to identify different types of diseases that can be recognized by thermography. Saxena [5] describes a method primarily focused on active dynamic thermography. In this method, transient thermal images of the skin surface are captured using a thermal camera while the skin surface is stimulated externally (cooling), followed by a recovery phase, to obtain high-contrast thermal images. The authors used
several steps for processing the images. They utilised an adaptive direct phase substitution algorithm (based on the FFT) to remove motion artefacts; then the authors used image subtraction between images taken before and after stimulation. Otsu's method and manual segmentation were used for the extraction of regions of interest (ROIs). For these regions, they computed the coefficient of the tissue activity ratio. Finally, they used cut-off-value-based classification (ROC, AUC) for the final decision on the tissue class. A statistical approach to the assessment of the local skin temperature response can be found in [6]. The authors evaluate the relations between adiposity (biophysical characteristics) and temperature profiles under thermoneutral conditions for groups of normal-weight and overweight females, based on the thermogenic response to a metabolic stimulus performed with an oral glucose tolerance test. Thermal images of the right hand and abdomen were processed by basic statistical methods and used to compare the groups. Singh, in his paper [7], summarizes methods and trends for the extraction of ROIs in medical thermography. The authors describe methods for image processing and segmentation, covering a wide range of well-known methods, from histogram-based thresholding, FFT, wavelet transformation, PCA, LDA, SVM, ANN, RNN and deep neural networks to many others, for identifying hot/cold spots in a thermal image. They describe manual, semi-automated and fully automated methods for the segmentation of images and their use on particular parts of the body. The use of convolutional neural networks to diagnose breast cancer using thermography is described in [8]. The authors used feature extraction based on outer boundary detection of the thermogram, width and height standardization of the ROI, and finally detection of the central axis of the breasts and separation of the left and right breast. The extracted features were used as input for a CNN with two hidden convolutional layers. The output of the CNN was used for the prediction of the probability of benign or malignant breast cancer. Another area of research interest is the design of devices that help the clinician obtain thermal images of parts of the human body with maximal repeatability, to obtain images for progress monitoring and to improve the accuracy of diagnostics. One interesting approach is the design of the high-accuracy multispectral 3D surface scanning system RoScan [9], which can make a 3D model of a scanned human body with different types of surface layers. The authors used thermography mapped onto the surface of the 3D model to achieve a precise position of the thermal spots on the skin of the human body. These days, the use of DSTs is one of the key factors for precise diagnosis, for achieving maximal efficiency at the point of care, and for the personalised medicine approach. With growing computer power, the availability of electronic health records for machine processing and advanced machine learning algorithms, there exists considerable potential for developing systems that support clinicians' decisions about treatment or about sending patients to advanced (mostly expensive) tests. Computer-aided clinical decisions are considered essential for a significant improvement of patients' treatment, as described in the report of the Institute of Medicine in 2000 [10]. Historically, the development of diagnostic clinical decision support systems started in the 1970s. These tools have become more popular and widely accepted, and today they are clinically desired.
Growing computer power and the use of algorithms
based on artificial intelligence, deep neural networks, and many others from the machine learning area provide high-quality decision support for doctors. The impact and effectiveness of DSTs on clinical decision-making, and the various challenges of developing them, are described in [11]. Various aspects of using DSTs are described in [12]. The author divides DSTs into two main categories: firstly, systems that support decisions passively, by error recognition using computerized physician order entry or patient monitoring; secondly, systems that support decisions actively, providing alerts in unusual or dangerous situations, recommending medications, and supporting physicians in diagnostic and treatment decisions. We aim to develop a simple and cheap device that can be used at the point of care and provide the practitioner with decision support for the early diagnosis of PAD.
3 Thermal Imaging Device

The use of thermal imaging cameras in diagnosing inflammatory processes or in sports medicine is common practice, but these devices are usually unaffordable for the general practitioner. In this paper, the Flir Lepton 3.5 [14] low-resolution LWIR thermal imager is used. It is a module that can be easily interfaced with microcontrollers or PCs, and in the radiometric mode it is possible to obtain a thermal image of a scene with a resolution of 160 × 120 points. The manufacturer declares a thermal sensitivity of 0.05 °C, and the accuracy of the measured temperature in high-gain mode is ±5 °C or 5% over a range of −10 to +140 °C. For our approach, the exact temperature value is not essential; we focus on finding cold spots (on the skin of the toes) in the thermal image, and for determining them, thermal sensitivity plays the significant role. Our device was developed for several tasks. The main aim is to obtain thermal images of the toes. To get the best results, we need to move the thermal camera module up and down to maximize the size of the toe scanning area. To achieve the best accuracy of the measured temperature, we added a medical contactless infrared thermometer to the scanning head for temperature compensation. Finally, the scanning head includes a camera and LED lights for taking a photograph of the toes. The scanning head is connected to a Raspberry Pi 4 module by USB cables, with some extra power, control and communication wires. To achieve reproducibility and stability of the thermography, we need to prepare a measuring environment that eliminates external factors like airflow, low surrounding temperature and other factors that negatively influence the skin temperature. We prepared a measurement box from thermal insulation foam, and inside this box the scanning head is placed on a sliding arm.

3.1 Process of Taking Thermal Images

According to the literature [15] and our experimental measurements, some requirements need to be fulfilled before thermal images are taken. The entire procedure takes about 20 min, including disinfection: the surrounding temperature must be around 20–24 °C, the patient must not have consumed alcohol for four hours before the measurement, and there must be no physical activity for at least 30 min before the measurement. The thermal imaging process contains several steps:
• The patient's foot is put inside the measurement box, and the imaging procedure is started.
• An image of the toes is taken by the camera.
• 15 min are spent waiting to achieve thermal equilibrium in the box. During the waiting, the distance of the scanning head is optimized.
• Thermal images are recorded for 20 s.
• The imaging procedure ends, and processing starts.
4 Segmentation of Thermal Images

An important part of identifying the early stages of PAD is the comparison between each toe's temperature and the temperature of the first third of the foot. The thermal images of the toes have to be processed to identify regions of interest (ROI). We can divide these regions into three main parts. The first region, the biggest part, is the image of the foot (approximately a third of the foot size); it is used as the reference temperature area. The second region is the part containing all the toes, and the last one consists of five small regions representing each toe. The processing of a thermogram is similar to that of any other image, but there are some limitations. The thermal images are very noisy due to the detection principle. Another issue is reflection from the surroundings of the measured object, which causes fuzzy object borders. Different approaches are summarized in [7], where mainly various thresholding algorithms are suggested. To the best of our knowledge, we did not find any paper where the authors extracted toes from infrared images. Our method is based on the principles recommended in the literature but enhances them by automatically finding the threshold and by image masking. The main steps of ROI extraction are depicted in Fig. 1.
Fig. 1. Main steps of ROI extraction algorithm
The thermal camera module produces nine unique frames per second; therefore, a recording of 20 s contains 180 frames, each of size 160 × 120 points. The value of each
point is a 14-bit number representing thermal intensity. By averaging all frames, we obtained the final thermal image for the following processing. Thermal images are very noisy; therefore, we used Gaussian filtering with a kernel size of 5 × 5 for noise reduction.

4.1 Selection of the Best Threshold Value

Proper thresholding of the thermal image is a crucial part of the segmentation. To threshold the image, we used a combination of two standard threshold methods. Firstly, we used Otsu's thresholding, which chooses the threshold $Th_{Ot}$ to minimise the intra-class variance of the thresholded black and white pixels. Because the images often have large areas with very similar temperatures, we used a slightly modified binary thresholding given by Eq. (1):

$g(x, y) = \begin{cases} 1, & \text{if } f(x, y) > Th_{Ot}(1 - R_f) \\ 0, & \text{otherwise} \end{cases}$   (1)

Our method returns the thresholded image $g(x, y)$ from the source image $f(x, y)$. The previously computed value of Otsu's threshold $Th_{Ot}$ is reduced by the reduction factor $R_f$. The value of $R_f$ is determined by the following process. For each thermal image, we annotated the optimal ROI by hand. We performed the thresholding by Eq. (1) in a loop for different settings of $R_f$. The thresholded image was compared with the annotated image, and in case the difference between these two images was lower than the fixed value of 3% (576 pixels), we appropriately updated the confusion table. From the confusion table, we computed the hit ratio curve shown in Fig. 2. The best value of the reduction factor $R_f$ is obtained from the hit ratio curve; in this particular case, the optimal value of $R_f$ is 20%, when the hit ratio reached 76% of correctly thresholded images.
Fig. 2. Hit ratio curve for determination of the reduction factor $R_f$. The red line is the case when $R_f$ is equal to zero, which means we worked only with the threshold value suggested by Otsu's thresholding algorithm.
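A minimal sketch of this thresholding step with OpenCV is shown below. It assumes the averaged 14-bit image has already been rescaled to 8 bits, since OpenCV's Otsu implementation operates on 8-bit images; the function name is ours.

```python
import cv2
import numpy as np

def threshold_with_reduction(thermal_img, reduction_factor=0.20):
    """Binary thresholding with a reduced Otsu threshold, per Eq. (1)."""
    # Gaussian filtering with a 5x5 kernel for noise reduction
    smoothed = cv2.GaussianBlur(thermal_img, (5, 5), 0)
    # Otsu's method chooses Th_Ot minimizing the intra-class variance
    th_ot, _ = cv2.threshold(smoothed, 0, 255,
                             cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Eq. (1): g(x, y) = 1 if f(x, y) > Th_Ot * (1 - R_f), else 0
    reduced = th_ot * (1.0 - reduction_factor)
    return (smoothed > reduced).astype(np.uint8)
```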
The result of this step was used as a mask on the original thermal image, and we obtained the first ROI of the thermal image.
4.2 Segmentation of the Thermal Image
In our method, we work with three types of ROI. The first of them (approximately the first third of the foot) was obtained in the previous step. Now we have to get a region containing only all the toes and five regions for the toes separately (see Fig. 3). The Canny edge detector [16] is used for the preprocessing of each image. By applying the Harris corner detection algorithm [17], we can detect corners in the image, i.e., regions with significant variation in intensity in all directions. Using this algorithm on an image containing a contour, we can find spots with a high probability of being a corner. The spots inside the contour are where the edges between the toes exist, and this helps us obtain a segmentation of the image. To extract the toe area from the image, it is necessary to find two points on the border of the foot. We used a method based on the intersection of a line and the boundary of the foot. The lines are constructed from the two preceding points representing the lower end of the border between the toes. With this method, we obtained all the points necessary for segmenting the toe area and each toe separately. The borderline between the toe area and the rest of the foot is not linear, so we used spline interpolation, and this interpolation curve is used as the border for masking the toe area. The segmentation of each toe is done by masking the area between the lines passing through the top and bottom points on the foot's contour. The steps from the thermal image to segment extraction are depicted in Fig. 4, where the starting point is in the upper-left corner and the final segmented big toe is at the bottom right.
Fig. 3. Regions of interest. The blue points between the toes are detected by Harris's algorithm. The green lines are constructed from two preceding blue points. The red points are the intersections between the border of the foot and the green lines. Spline interpolation of all points creates the black curve, which forms the lower border of the ROI.
Fig. 4. Image processing
5 Processing of ROI

The experiments in our paper were done on a relatively small set of ten unique persons. For each person, we took thermal images of both feet on five different days, so we processed 100 thermal images in total. Together with the thermal images, we recorded blood pressure and heart rate. These thermal images were taken from a group of people without a PAD diagnosis. Box plots of the average temperature for the different ROIs are depicted in Fig. 5. Figure 6 displays the average ROI temperatures for a patient with medically diagnosed PAD.
Fig. 5. ROI temperature profiles, no PAD
Fig. 6. ROI temperature profiles, PAD diagnosed
The average ROI temperatures were used for statistical testing, with the result that there is a statistically significant difference between the average ROI temperatures of persons with and without PAD. Based on the analysis of the data, we can hypothesize that people with colder areas on the skin surface of the toes may be affected by PAD, which corresponds with the medical knowledge in the literature [18].
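The paper does not name the specific test applied; the following sketch illustrates one plausible choice, a two-sample Welch t-test in SciPy, over invented placeholder temperatures rather than our measured data.

```python
# Sketch of the statistical comparison of average ROI temperatures.
import numpy as np
from scipy import stats

roi_no_pad = np.array([30.1, 29.8, 30.5, 31.0, 29.6])  # hypothetical values, deg C
roi_pad    = np.array([26.9, 27.4, 26.2, 27.8, 26.5])  # hypothetical values, deg C

t, p = stats.ttest_ind(roi_no_pad, roi_pad, equal_var=False)  # Welch's t-test
if p < 0.05:
    print("significant difference between PAD and non-PAD ROI temperatures")
```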
We designed a tool, based on the analysis of thermal images of the foot and on the comparison of the average temperatures of the foot and toe ROIs, for alerting the practitioner to the possible occurrence of PAD. This tool can inform the decision about sending a patient for an advanced medical examination. Our approach shows that a cheap thermal camera module can provide advanced information for practitioners at the point of care.
6 Conclusion

In this paper, we introduced a method with the ability to identify the early stages of PAD. Our method is based on the analysis of thermal images captured by a cheap thermal camera module. We designed a device for capturing thermal images under controlled conditions, and we summarized, from the literature and from our own experience, rules for obtaining comparable thermal images. Finally, we prepared a tool to support clinicians in deciding whether to send patients for more advanced examinations to confirm or rule out PAD. Using the developed measuring box with a cheap thermal camera module can improve diagnostic possibilities at the point of care. This project is the first part of a more extensive study, and the current results give us optimism for starting to collect data from a large group of patients. Using advanced techniques such as deep neural networks seems to be a promising direction for achieving high accuracy in PAD detection.

Acknowledgements. This work is supported by SGS, VSB–Technical University of Ostrava, under the grant no. SP2021/94.
References
1. Barnes, R.B.: Thermography and its clinical applications. Ann. N.Y. Acad. Sci. 121, 34–48 (1964)
2. Diakides, M., Bronzino, J.D., Peterson, D.R. (eds.): Medical Infrared Imaging: Principles and Practices. CRC Press, T&F Group, Boca Raton (2013)
3. Ring, E.F.J., Ammer, K.: Infrared thermal imaging in medicine. Physiol. Meas. 33, 33–46 (2012)
4. Kaczmarek, M., Nowakowski, A.: Active IR-thermal imaging in medicine. J. Nondestr. Eval. 35, 19 (2016)
5. Saxena, A., Ng, E., Lim, S.T.: Active dynamic thermography to detect the presence of stenosis in the carotid artery. Comput. Biol. Med. 120, 103718 (2020)
6. Jalil, B., Hartwig, V., Moroni, D., et al.: A pilot study of infrared thermography based assessment of local skin temperature response in overweight and lean women during oral glucose tolerance test. J. Clin. Med. 8(2), 260 (2019)
7. Singh, J., Arora, A.: Automated approaches for ROIs extraction in medical thermography: a review and future directions. Multimed. Tools Appl. 79, 15273–15296 (2019)
8. Ekici, S., Jawzal, H.: Breast cancer diagnosis using thermography and convolutional neural networks. Med. Hypotheses 137, 109542 (2020)
9. Chromy, A., Zalud, L.: The RoScan thermal 3D body scanning system: medical applicability and benefits for unobtrusive sensing and objective diagnosis. Sensors 20, 6656 (2020)
10. Kohn, L.T., Corrigan, J.M., Donaldson, M.S. (eds.): To Err Is Human: Building a Safer Health System. Institute of Medicine, The National Academies Press, Washington, DC (2000)
11. Berner, E.S.: Clinical Decision Support Systems: State of the Art. Agency for Healthcare Research and Quality (2009)
12. Teich, J.M., Merchia, P.R., Schmiz, J.L., et al.: Effects of computerized physician order entry on prescribing practices. Arch. Internal Med. 160(18), 2741–2747 (2000)
13. Osheroff, J.A., Teich, J.M., Levick, D., et al.: Improving Outcomes with Clinical Decision Support: An Implementer's Guide. Himss Publishing, Chicago (2012)
14. LWIR Micro Thermal Camera Module. FLIR Systems Inc., Wilsonville (2021). https://www.flir.com/products/lepton/?model=3.5. Accessed 4 June 2021
15. Vollmer, M., Möllmann, K.P.: Infrared Thermal Imaging: Fundamentals, Research and Applications. Wiley, New York (2017)
16. Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8(6), 679–698 (1986)
17. Harris, C., Stephens, M.: A combined corner and edge detector. In: Proceedings of the 4th Alvey Vision Conference, pp. 147–151 (1988)
18. Dieter, R.S., Dieter, R.A., Dieter, R.A.: Peripheral Arterial Disease. McGraw-Hill Medical, New York (2009)
License Trading System for Video Contents Using Smart Contract on Blockchain

Kosuke Mori, Kota Nakazawa, and Hiroyoshi Miwa(B)

Graduate School of Science and Technology, Kwansei Gakuin University, 2–1 Gakuen, Sanda-shi, Hyogo 669–1337, Japan
{mkosuke,K.Nakazawa,miwa}@kwansei.ac.jp
Abstract. Recently, with the spread of SNS (Social Networking Services) such as Twitter and Instagram, many images and videos have been posted and shared. These postings are generally evaluated and spread among SNS users, but basically there is no direct profit for the posters. However, when a posting of videos or images from a disaster area spreads, the videos and images are requested and bought by broadcasters such as public media companies. This means that such videos and images often have real monetary value. It is desirable that the negotiation on a contents license between a poster and a broadcaster proceed promptly and easily. The application of blockchain technology outside the financial domain has recently attracted attention. In particular, Ethereum, a platform that enables application construction using smart contracts, is currently being studied for many applications. In this paper, we propose a transaction system for buying and selling licenses of video contents using smart contracts on a blockchain and a distributed file storage. We implement the proposed system on the Ethereum blockchain and the distributed file storage Swarm and evaluate its performance.
1 Introduction

Recently, with the spread of SNS (Social Networking Services) such as Twitter and Instagram, many images and videos are posted and widely shared. These posts are generally evaluated and spread by SNS users through the "like" function and similar features, but basically there is no direct profit, such as money, for the posting users. However, when a posting of videos or images from a disaster area spreads, the videos and images are requested and bought by broadcasters such as public media companies. The public media companies send a message requesting permission to use the videos and images, and if the poster actually agrees, they are used in television news programs. From this point of view, some posted videos and images have real financial value. When using such videos and images posted on SNS in television news programs, the time cost of direct negotiation with the poster is a serious problem, because news requires fresh information. Therefore, a new social network system for prompt and easy negotiation on a contents license between a poster and a broadcaster is needed. The application of blockchain technology outside the financial field has been attracting attention recently. In particular, Ethereum [1], which is an open-source platform that
enables building applications using smart contracts, is being studied for many applications. Applications running on Ethereum are called DApps (Decentralized Applications), and they allow the implementation of smart contracts. Smart contracts are effective for implementing a system that carries out business transactions on the Internet, because they can realize trustless transactions. In this paper, we propose a transaction system for buying and selling licenses of video contents using smart contracts on a blockchain. Since data of large size, such as videos, cannot be handled on a blockchain, such data is generally handled by putting it in a distributed file storage. Our proposed system handles video contents by using Swarm, a distributed file storage. We also implement this system on the Ethereum blockchain and verify its operation.
2 Related Works

In recent years, blockchain technology has attracted attention as a technology with the potential to solve problems of data portability, falsification prevention, and high security. Blockchain is a distributed ledger technology that was originally developed as the underlying technology supporting Bitcoin [2]. In a narrow sense, blockchain is a protocol, or its implementation, ensuring that, among a number of nodes possibly including Byzantine failures [3], the probability that an agreement will be reversed converges to zero over time. In a broad sense, however, blockchain refers to a technology that uses electronic signatures and hash pointers, has a data structure in which tampering is easy to detect, and realizes high availability and data identity by keeping the data distributed over a large number of nodes in a network. A smart contract is a protocol that automatically executes a contract in a blockchain network. When a smart contract is implemented in a blockchain, the definition of the contract is written programmatically and stored in the blockchain as data that cannot be altered. When an event matching the condition described in the definition of the contract occurs, the contract is automatically executed by the program. The smart contract itself was proposed by Nick Szabo in 1994 [4]. Smart contracts mainly refer to the automatic execution of contracts, and Szabo gave vending machines as an example, but the availability of smart contract technology in some blockchains has broadened its application. As research on rights transfer using smart contracts, a car-sharing system has been proposed [5]. This study uses smart contracts, exploiting their automatic-execution feature, to realize car sharing without a third-party organization. Ownership of a car and the personal information of a borrower can be linked to an address and published on the blockchain. A similar study addresses land sales through smart contracts [6]. By depositing the right to the land and the sales price of the land with the smart contract, the contract is executed automatically once the conditions of the seller and the buyer are satisfied. As research on digitizing public documents using smart contracts, there is a study [7] on an electronic report system based on smart contracts. By extending tokens with the non-fungible nature of ERC721, documents related to school reports on students' grades and conduct in high schools are digitized on the blockchain.
There is also a study realizing an SNS using blockchain [8]. Using smart contracts on Ethereum, an SNS is realized as a distributed application (DApps). The problem that a blockchain has difficulty handling large data is solved by using IPFS (InterPlanetary File System) [9], one of the distributed storage systems. The smart contract part is implemented by separating the contract for managing accounts (Account Manager Contract) from the contract that functions as the SNS (User Contract). In the Account Manager Contract, addresses are managed using a mapping data structure to ensure that only one account is created for each address. In the User Contract, functions modeled on Twitter are designed and implemented. In addition, since Solidity functions do not support returning complicated data structures that include arrays and structs, the processing that retrieves tweets is realized in the front end.
3 License Trading System Using Smart Contract on Blockchain and Distributed File Storage

In this section, we describe our proposed license trading system for video contents using smart contracts on a blockchain and a distributed file storage.

3.1 License Trading System Using Smart Contract on Blockchain
This system enables licenses for video contents posted on SNS to be traded on a blockchain. The basic SNS functions and the commerce system itself are all designed using smart contracts. However, in addition to the fact that content of large data size, such as videos, cannot be stored in a blockchain, handling such data in a smart contract would require a very large fee, called GAS, to process the transactions. Therefore, the smart contract of this system does not handle the video contents directly; it solves the above problem by handling video contents in combination with Swarm, a distributed file storage. The system is implemented to run on Ethereum, a platform for constructing smart contracts as distributed applications. Since the assumed usage scene of this system is SNS, the video contents carried there must be public information. However, as this system carries out commercial transactions of video contents, the video contents opened to the public and the video contents to be purchased should be handled separately. Therefore, a part of the original video is cut out and posted to the SNS as public data, while the original video is used for the commercial transaction. The system is designed so that a token conforming to ERC721, one of the Ethereum smart contract standards, is given the information of a video (URI), and the use-license transaction is carried out using the token. The information held by the token is divided into public and non-public information. The public information is a set of the following six items: the address of the token issuer, the address of the token owner, the text data posted to the SNS, the token price (ETH), the evaluation value of the token, and the URI of the clipped video. The non-public information is managed as data linked to the token and is divided into two parts. One is the data set managing the URI of the original video and the password for accessing that URI; the other is the data set managing the addresses with which a commercial transaction has been established and signed.
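To make the token's data model concrete, the following sketch mirrors the six public items and the two linked non-public data sets as Python dataclasses; the field names are our own illustration, not the contract's actual storage layout.

```python
# Illustrative data model of the token's public and linked non-public data.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PublicTokenData:
    issuer: str        # Ethereum address of the token issuer
    owner: str         # Ethereum address of the current owner
    text: str          # text data posted to the SNS
    price_eth: float   # minimum selling price in ETH
    evaluation: int    # evaluation value ("likes")
    clip_uri: str      # Swarm URI of the clipped (public) video

@dataclass
class PrivateTokenData:
    original_uri: str  # Swarm URI of the original video
    password: str      # password to access the URI
    signers: List[str] = field(default_factory=list)  # addresses of signed buyers
```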
Because of the nature of ERC721 tokens, the token itself can be referred to as public information by anyone who has an Ethereum address; therefore, the non-public information is not written directly in the token but treated as linked data. The URI of the original video and the password for accessing it can be referred to only by the issuer of the token or by users at addresses with which a commercial transaction has been established and signed. The token signer list is visible only to the token issuer.

3.2 Swarm Processing and Encryption

Swarm [10] is a distributed file storage mechanism built on the Ethereum blockchain. In the vision of Ethereum Web3.0, Swarm is positioned as the storage base layer and is strongly associated with the Ethereum blockchain. The main difference from IPFS is that Swarm has a mechanism that generates an incentive for the nodes holding files, which enables content to be offered to Swarm users. Our proposed system handles video contents directly in neither the blockchain nor the smart contract. Instead, it stores video contents in the Swarm distributed storage so that they can be handled by a smart contract. A user posting to the SNS first cuts the video content. This resembles the function for trimming a video or adjusting its length when posting to Twitter; however, our system executes this procedure in order to separate the public video content posted to the SNS from the non-public original video content used for the commercial transaction. The cut video and the original video are stored separately in Swarm, and a fixed-length hash value for accessing each content is returned. Since a fixed-length hash value is generated regardless of the format and data size of the stored file, the data size of the video contents need not be considered. The video content can then be obtained by accessing the URI generated from the returned hash value. As a feature, Swarm has an access control function that can restrict access to stored files, and the basic HTTP authentication scheme can be used as-is. Therefore, this system restricts access to the contents using a password. However, since the exchange of passwords is done on the blockchain and the smart contract, this is not sufficient from a security standpoint. Therefore, by encrypting the video itself outside the blockchain, the system ensures security both inside and outside the blockchain. The system is designed on the premise that a hybrid encryption scheme, combining a common-key (symmetric) system with a public-key cryptosystem, is used for the transaction data. In order to store the data for a commercial transaction in Swarm in encrypted form before the trading partner is decided, the video itself is encrypted with a common key. When a commercial transaction with a trading partner is established on the smart contract, the common key must be shared with the partner, while leakage of the common key must be avoided. Once the trading partner is determined, however, the common key itself can be encrypted with the partner's public key. With this mechanism, the system can be constructed so that the video contents cannot be accessed except by users with whom a commercial transaction has been established.
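A minimal sketch of such a hybrid scheme, using the Python cryptography package (Fernet for the common key, RSA-OAEP to wrap it for the buyer); the actual system may use different primitives, and the file name and key handling shown here are illustrative only.

```python
# Hybrid encryption sketch: symmetric key for the video, wrapped per buyer.
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

video = open("original.mp4", "rb").read()        # hypothetical file name

common_key = Fernet.generate_key()               # symmetric (common) key
ciphertext = Fernet(common_key).encrypt(video)   # stored in Swarm before any buyer exists

# When a transaction is established, wrap the common key with the buyer's public key.
buyer_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped = buyer_key.public_key().encrypt(common_key, oaep)

# Only the buyer can unwrap the common key and decrypt the video.
unwrapped = buyer_key.decrypt(wrapped, oaep)
assert Fernet(unwrapped).decrypt(ciphertext) == video
```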
3.3 Functions of Smart Contract of License Trading System
The SNS and commerce functions are all implemented by the smart contract. The smart contract part of this system consists of six functions: a submission function, a signature function, an ETH collection function, an evaluation function, a public data inquiry function, and a private data inquiry function. These functions can be divided into two groups. The submission function, signature function, ETH collection function, and evaluation function each issue a transaction that changes data in the blockchain; therefore, a fee called GAS is required for their execution. The public and private data inquiry functions do not require GAS, because they are transactions that only read data from the blockchain. This system uses an Ethereum account as the identity of the account used in the SNS; therefore, there is no need to create a new account just for the SNS. In the following sections, we describe the above functions of the smart contract.

3.3.1 Submission Function
The submission function is available to anyone who has an Ethereum account, for posting to the SNS. Before executing this function, the poster stores the video in Swarm and receives the hash value and URI for the video. When the function is executed, the poster passes to the smart contract the text to be posted, the minimum selling price of the video, the URI of the public data of the video, the URI of the private data of the video, and the password for accessing the private data, and the token is issued.

3.3.2 Signature Function
The signature function is a function for signing a token; by signing the token, the private data of the video can be inquired. This function can be executed by anyone except the issuer (poster) of the token itself, but it is designed so that only the owner of the token can sign it, and only once. This enforces the steps for purchasing the use license and prevents the token from being freely signed by a third party. In addition, the minimum selling price information exists in the token, and the signature function cannot be executed by an address that does not hold ETH exceeding the purchase price. Furthermore, the ETH paid at the time of purchase is not sent directly to the seller but to the address of the smart contract itself, called the contract address. This is a security feature of smart contracts: exchanging ETH between individual addresses through smart contract functions is not recommended. Therefore, posters (token issuers) are required to recover their own ETH held at the contract address. The token is signed as soon as it is purchased, but ownership of the token is returned to the issuer's address immediately after signing. A smart contract only runs when someone performs a transaction, so the system will not act automatically unless someone executes a smart contract function. In other words, if the processing that returns the token to the issuer were not performed in the same series of processing, another contract function for recovering the token would have to be prepared and executed separately.
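To illustrate the GAS distinction before the remaining functions, the sketch below shows how a client might drive such a contract with web3.py (a recent version is assumed); the contract address, ABI, and function names (submit, sign, getPublicData) are our assumptions, not the system's actual interface.

```python
# Sketch of client-side interaction: transact() changes state and costs GAS,
# while call() only reads blockchain data and is free.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))   # local test node
ABI = [...]                                             # compiled contract ABI (placeholder)
ADDR = "0x" + "00" * 20                                 # placeholder contract address
contract = w3.eth.contract(address=ADDR, abi=ABI)

# Submission: a state-changing transaction, so GAS is incurred.
tx = contract.functions.submit(
    "posted text", w3.to_wei(0.5, "ether"),             # text and minimum price
    "bzz://clip-uri", "bzz://original-uri", "password",
).transact({"from": w3.eth.accounts[0]})
w3.eth.wait_for_transaction_receipt(tx)

# Signature: the purchase price travels with the transaction and is held
# at the contract address until the poster collects it.
contract.functions.sign(1).transact(
    {"from": w3.eth.accounts[1], "value": w3.to_wei(0.5, "ether")})

# Read-only inquiry: call() consumes no GAS.
public_data = contract.functions.getPublicData(1).call()
```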
3.3.3 ETH Collection Function
The ETH collection function allows a poster to recover the profit generated in a commercial transaction after the trade of ownership is completed. The ETH transferred by the transaction is held at the contract address, linked to each Ethereum account. Not everyone can retrieve the ETH held at the contract address; only the ETH linked to one's own address can be recovered. In addition, one collection recovers all ETH linked to the Ethereum account.

3.3.4 Evaluation Function
The evaluation function is for evaluating tokens, in the manner of the "like" function of conventional SNS. This function is also available to anyone with an Ethereum account, but it is designed to limit the number of times a token can be evaluated to one per account, by mapping and managing the number of evaluations and Ethereum addresses for each token.

3.3.5 Public Data Inquiry Function
The public data inquiry function is for viewing the contents of a token and can be used by anyone who has an Ethereum account. With this function, reading the public data of a token is equivalent to viewing a posting on the SNS. Since this processing only reads data stored in the blockchain, no GAS is incurred when executing the contract. Because of the limits on return values in Solidity, displaying a user's posting list as in a conventional SNS requires processing in the front end.

3.3.6 Private Data Inquiry Function
The private data inquiry function is for inquiring the non-public data associated with a token; the users who can refer to each of the two types of non-public data are limited, and the processing is performed separately. Querying the URI of an original video and the password needed to access it is designed to be possible only for the issuer of the token (the poster) or a user whose Ethereum account has signed the token (the token owner). A user on the signature list of the token accesses the non-public original video using the information obtained here. The signature list itself is also kept private, and only the token issuer can query it. Since the mass-media user addresses assumed by this system are expected to purchase video contents frequently, there is a danger that their Ethereum accounts could be identified if the signature list were managed as public information. Therefore, by limiting reference to the issuer only, the risk of account identification is suppressed.

3.4 Flow in Execution of System
We show the flow of the license trading system's execution in Fig. 1.
Fig. 1. Flow in execution of system
4 Performance Evaluation

The system proposed in the previous section is implemented using smart contracts that run on Ethereum. Transaction costs, in the form of a fee called GAS, are incurred when processing changes to blockchain data. Although this system is implemented as an SNS, users must pay this fee whenever they use its functions. Therefore, the system cannot be expected to be used unless the cost of executing each function is low. We therefore evaluated the transaction cost of executing each function of this system in a test environment. Table 1 shows the results, together with the price in Japanese Yen calculated at an Ethereum price of 60,000 Yen (as of December 16, 2020). 1 wei is the smallest unit of ETH, and 1 ETH = 1.0 × 10^18 wei.

Table 1. Transaction cost for each function

Function                 Transaction cost (wei)   Japanese Yen
Deployment of contract   4211286                  2.5267716 × 10^-7
Submission               574753                   3.448518 × 10^-8
Signature                143094                   8.58564 × 10^-9
ETH collection           20862                    1.25172 × 10^-9
Evaluation               85740                    5.1444 × 10^-9

The deployment of the contract, which has the highest transaction cost, is not a cost borne by the users, because it is a one-time fee for deploying this system on Ethereum. Even this transaction cost is almost negligible at the current Ethereum price range, and all the other transaction costs are likewise at a negligible price level. One of the reasons for the low transaction costs is the use of Swarm. With Swarm, a fixed-length hash value is generated regardless of the data size of a video, and the video information read through a token is only the URI generated from the hash value and the password to access it. The data handled in the token is thus comparatively simple. For each posting, the part where the transaction cost varies is the number of characters of the text posted to the SNS. Figure 2 shows the change in transaction cost depending on the number of characters in the posted text.
Fig. 2. Changes in GAS according to the number of characters of the text to be posted
Figure 2 shows that GAS increases linearly with the number of characters posted. Although the transaction cost at 300 characters is nearly double that at 0 characters, the transaction cost is still very low and negligible considering the number of characters typically posted on real SNS.
5 Conclusion

In this paper, we proposed a license trading system for video contents using smart contracts on a blockchain, and implemented it on Swarm and Ethereum using the Solidity language. Since this system uses smart contracts, it can be expected to reduce the cost of communication between mass media and an SNS poster through the automation of contracts. In addition, the system is advantageous for users who post videos to the SNS, because they can obtain financial benefits. In the design of the system, the fee called GAS required to execute each contract was kept at a realistic price. In this paper, we discussed a method of complete transfer of the license of a video. A method in which the license of the video reverts to the issuer (the owner of the original video) when an expiration date arrives is also conceivable. In the former case, the purchaser has a monopoly on the video content and can distribute copies of the video for free after purchase. If this were to happen, the issuer would not be able to sell it to other users. However, since this study targets content for which freshness of information is important, the possibility of other users purchasing the content after a certain period
of time is considered to be low. Therefore, a method in which the license of a video is returned to the issuer after a certain period of time is not practical. A method in which the license is transferred to multiple users at the same time is also conceivable, but the algorithm for such a method is left for future study.

Acknowledgements. This work was partially supported by the Japan Society for the Promotion of Science through Grants-in-Aid for Scientific Research (B) (17H01742).
References
1. ethereum.org: Ethereum. https://ethereum.org/
2. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system. https://bitcoin.org/bitcoin.pdf
3. Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982)
4. Szabo, N.: Smart contracts. http://www.fon.hum.uva.nl/rob/Courses/InformationInSpeech/CDROM/Literature/LOTwinterschool2006/szabo.best.vwh.net/smart.contracts.html
5. Zhou, Q., Yang, Z., Zhang, K., Zheng, K., Liu, J.: A decentralized car-sharing control scheme based on smart contract in internet-of-vehicles. In: Proceedings of 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), Antwerp, Belgium, 25–28 May 2020
6. Karamitsos, I., Papadaki, M., Barghuthi, N.A.: Design of the Blockchain smart contract: a use case for real estate. J. Inf. Secur. 09(03), 177–190 (2018)
7. Mori, K., Miwa, H.: Digital university admission application system with study documents using smart contracts on blockchain. Adv. Intell. Syst. Comput. 1035, 172–180 (2020)
8. Xu, Q., Song, Z., Siow, R., Goh, M., Li, Y.: Building an Ethereum and IPFS-based decentralized social network system. In: Proceedings of ICPADS, pp. 11–13, Singapore, December 2018
9. IPFS. https://ipfs.io/
10. Trón, V.: The book of Swarm. https://gateway.ethswarm.org/
Query Processing in Highly Distributed Environments

Akira Kawaguchi1(B), Nguyen Viet Ha2, Masato Tsuru2, Abbe Mowshowitz1, and Masahiro Shibata2

1 Department of Computer Science, City College of New York, New York City, NY 10031, USA
{akawaguchi,amowshowitz}@ccny.cuny.edu
2 Department of Computer Science and Networks, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
{ha,tsuru,shibata}@cse.kyutech.ac.jp
Abstract. This paper will demonstrate a novel method for consolidating data in an engineered hypercube network for the purpose of optimizing query processing. Query processing typically calls for merging data collected from a small subset of server nodes in a network. This poses the problem of managing efficiently the exchange of data between processing nodes to complete some relational data operation. The method developed here is designed to minimize data transfer, measured as the product of data quantity and network distance, by delegating the processing to a node that is relatively central to the subset. A hypercube not only supports simple computation of network distance between nodes, but also allows for identifying a node to serve as the center for any data consolidation operations. We will show how the consolidation process can be performed by selecting a subgraph of a complex network to simplify the selection of a central node and thus facilitate the computations required. We will also show a prototype implementation of a hypercube using Software-Defined Networking to support query optimization in a distributed heterogeneous database system, making use of network distance information and data quantity.
1 Introduction
Today, scientists and engineers are building a unique computational infrastructure and testbed to support large-scale computing and information management that encompasses big data, data science, data analytics and visualization research. The instruments and infrastructure will be used for developing next-generation algorithms and software platforms to support these efforts while allowing precise experimentation with multiple architectural options. As we attempt to solve increasingly complex problems, a combination of computing platforms through effective data networking is often desirable.

(This material is based upon work supported by the National Science Foundation under Grant No. 1818884.)

A wide range
of computing platforms, from traditional clusters to cloud servers, can be connected through fast, high-bandwidth networks, including 5G. The landscape of network services is becoming increasingly diverse with recent advances in technology, particularly applications of visual computing and machine learning. This new technology can make dramatic improvements in the management of large urban environments. In edge computing [1,2], geographically distributed applications can run throughout the network utilizing a computing paradigm that relies on user or near-user (network-edge) devices to conduct the required processing. This architecture extends cloud computing with more flexibility than that found in conventional networks by allowing the integration of a massive number of components and services [28,32,33]. The challenges of information management derive in part from the dynamic and distributed features of operations in such a network. In particular, to fully exploit computing at the edge, it is necessary to take account of the effect of querying on message traffic in the network. For instance, consider a smart city [10,18] covered by e-services and e-resource management using a collection of sensors, communication devices, and data-processing facilities. The absence of centralized management dictates policies designed to maximize the use of edge nodes, which typically have relatively modest storage and processing power. Edge nodes must thus work together on information processing tasks, and this occasions the movement of data between nodes. To minimize the message traffic, innovative strategies are required. One such strategy, detailed here, is the development of an engineered overlay network structured as a hypercube graph [21,30,31]. This strategy was originally proposed in the International Technology Alliance project [3,19,20]. Research undertaken in that project partially demonstrated the feasibility of using hypercube routing for query optimization in a Dynamic Distributed Federated Database (the GAIAN DB) built by IBM-UK [4]. The choice of hypercube allows for determining the number of inter-node hops by computing the Hamming distance between the corresponding node labels. This leads naturally to a reexamination of distributed query optimization with a view to minimizing the total network traffic associated with querying. Assuming the presence of distributed databases connected by a hypercube network, our simulation study [16] confirmed a significant theoretical advantage of a query optimization approach that incorporates inter-node distances into the cost model. This cost model produces much better plans in terms of the overall network usage for data transmission. Our study also confirmed a significant advantage of this cost model in the "peer-to-peer collaboration" of databases in the network, and compared the performance of a greedy algorithm to that of dynamic programming. The latter gives useful results for a small number of participating nodes, but computational complexity limits its usefulness for a large number of nodes. Based on these findings, we apply practical considerations to develop another approach, a so-called "delegation to centrality" model, to reduce data transmission cost in a scalable way when performing database operations in a network with a larger number of nodes. A hybrid approach that combines peer-to-peer collaboration
and delegation to centrality is also feasible. This approach exploits a divide-and-conquer principle, applying peer-to-peer collaboration within small clusters and then gathering results by delegation to centrality, or vice versa. Therefore, we must take account of the performance differences and trade-offs of both approaches. Fine-tuning the optimization based on the advanced techniques covered in [8,15,29] can be introduced later. A comparative performance analysis needs a systems environment with a typical distributed database application. To this end, we have developed a relational-oriented but heterogeneous database application that runs on a hypercube generated by Software-Defined Networking. In this paper, we report on these studies to showcase the engineered hypercube as an efficient and practical distributed database tool, and to provide preliminary performance data obtained from an experiment.
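As a concrete note on the distance computation mentioned above: with hypercube nodes labeled by d-bit strings, the hop count between two nodes is simply the Hamming distance of their labels. A minimal sketch:

```python
# Hop count on a hypercube overlay: Hamming distance of node labels.
def hops(a: int, b: int) -> int:
    """Number of hypercube hops between node labels a and b."""
    return bin(a ^ b).count("1")

# In a 4-dimensional (16-node) cube, 0b0101 and 0b0011 differ in two
# bit positions, hence are two hops apart.
assert hops(0b0101, 0b0011) == 2
```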
2 Approach for Distributed Query Optimization
The engineered hypercube allows for determining inter-node distances cheaply, but query performance depends on the number of participating nodes and the sequence of operations required [7,16,26]. A key element in our proposed research is the exploitation of knowledge of the network structure to facilitate optimization. To characterize this computing situation formally, let us consider a distributed relational database environment. We can use it to provide consolidated cost estimates for logical tables, which are in effect fragmented relations, thus decreasing the query-execution-plan search space. Most importantly, we consider inter-node network distances and the role of topology. Since many nodes in a network may rely on batteries and wireless radios, we would like to minimize network information transfer so as to maximize network availability and lifetime.
2.1 Problem Definition
How the data is distributed over the nodes of a network can affect the relative advantage of query optimization. This is especially true when the data is spread widely over the network. The optimization problem involves finding the most cost-efficient way of performing the actions specified in a query. In a distributed processing environment, the optimization must take account of the amount of data exchanged between nodes over the network. First of all, let us define a query Q to be expressed in the form

Q(R_1, ..., R_m) = R_1 ◦ R_2 ◦ ... ◦ R_m

where the R_i (1 ≤ i ≤ m, with m > 2) are relations in a relational database, and the operator ◦ is a set-oriented operation: union (∪), intersection (∩), subtraction (−), or join (⋈). The query is expressed in standard relational algebra notation; parentheses may be added to specify the order of operations explicitly. To start with the most fundamental optimization, we do not nail down the details of select and project operations, nor do we include aggregate operations for Q, such as computing an average or a sum of values, which would be applied after
consolidating all the data in some way. Furthermore, we do not handle the trivial query plan of dispatching Q to each database and gathering the responses to construct a final result. We are most interested in dealing with complex decision-support or analytic queries that mandate shuffling R_1 through R_m over the network. To shed light on such an application, consider the following examples.

Example 1. Suppose there is a criminal investigation involving sophisticated drug smuggling into country A from country B. Furthermore, suppose that international law enforcement can access a relation R1 holding records of recent immigrants from A, and a relation R2 holding records of recent immigrants from B. Country A owns R1, while country B owns R2. Suppose also that law enforcement has obtained records of convictions of their spouses as a relation R3. A relation R4 holds records of those who filed customs declarations in country A, and a relation R5 holds records of those who filed customs declarations in country B. Relations R4 and R5 are owned by A and B, respectively. Law enforcement could apply a query

Q(R1, ..., R5) = (R1 ∩ R2) ⋈ R3 − (R4 ∪ R5)

to identify individuals suspected of being drug smugglers, and thereby prevent criminal acts or take preventive measures to enhance security at immigration. The distributed query Q enables the two countries to obtain information without constructing a shared, centralized database. Note that the relation instances could be concrete instances or temporary materializations resulting from localized operations at the hosting nodes. Note also that the schemas of the relations appearing as operands of union, intersection, and subtraction are assumed union-compatible (i.e., they have identical structure). These presuppose the availability of a spectrum of optimization methodologies, especially for the select and project relational operations. For instance, selection on date and time windows as well as age ranges for R1 and R2 can be applied to reduce the data space prior to gathering and consolidation. A sophisticated query optimizer may be able to parse the query and rewrite it so as to apply predicate conditions for data selection at each site, thereby reducing the amount of data to exchange for the final computation.
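As a toy illustration of this query shape, the following Python fragment evaluates Example 1 over invented single-attribute records, where the join over the shared key degenerates to an intersection:

```python
# Example 1 over sets of person ids (union-compatible, invented data).
R1 = {"alice", "bob", "carol"}   # recent immigrants from A
R2 = {"bob", "carol", "dave"}    # recent immigrants from B
R3 = {"carol", "erin"}           # persons whose spouses have convictions
R4 = {"dave"}                    # filed customs declarations at A
R5 = {"erin", "frank"}           # filed customs declarations at B

# Q = (R1 ∩ R2) ⋈ R3 − (R4 ∪ R5); with single-attribute records the
# join over the shared key is just another intersection.
suspects = ((R1 & R2) & R3) - (R4 | R5)
print(suspects)                  # {'carol'}
```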
2.2 Communication-Cost in Distributed Query Optimization
For a typical query, the total number n of nodes in the network is much larger than the number m of data-hosting nodes. The challenge is to find an optimal distributed execution plan for this m-fold set query such that the plan guarantees the smallest total amount of data transmission over the n-node network. Specifically, our approach investigates the following two optimization paradigms:

Peer-to-peer collaboration: a pair of nodes hosting R_i and R_j communicate and send an instance, either from R_i to R_j or vice versa. The node receiving the instance computes R_i ◦ R_j and communicates again to combine its result with a partial result held by another node in the party of data-hosting nodes.
The peer-to-peer collaboration continues until it derives the final result. Deriving the execution plan relies on an estimate of the result size of each ◦ operation; each estimate cascades to approximate the size of the next operation. The plan may improve by reflecting the actual result sizes, but possibly at the expense of significant computational overhead.

Delegation to centrality: the group of m nodes hosting relations send their data instances to a so-called central node, one that can be reached with the smallest total distance from the group. The central node then executes the query, using local optimization over the gathered instances, and holds the result. The central node could be one of the group or a node that hosts no relation at all but has the power to process the query. Similarly, the central node is not necessarily the node that generated the query. The algorithm selects the central node so as to minimize the overall data transfer with the data server nodes during query processing.

The heart of distributed query optimization is to minimize the overall amount of data transmission on the network, since data has to move over the network to perform operations. Classic query optimization assumes that there is no cost associated with obtaining knowledge of inter-node distances or with the overhead of data transmission. One critical measure of message traffic is the amount of data to be moved multiplied by the distance moved [16,17]. Thus, message traffic associated with querying can be reduced by minimizing the product of data size and distance of travel, as a network hop count, at each operation involving several participating nodes in the execution of a query.

Fig. 1. Query process on hypercube

Our previous studies [16,20] showed that better plans could be found with the use of inter-node distances than without. The reasoning here is to minimize the bandwidth usage of the entire network system. Reducing the total occupancy of data in the network will mitigate network congestion and yield faster query response times; the longer the distance over which data is sent, the higher the occupancy of data in the network and thus the slower the query response time. This is especially true when the network accommodates a large number of queries requiring data from different nodes.

Example 2. Consider the processing of a query Q(R1, R2, R3) = R1 ⋈ R2 − R3 on the hypercube illustrated in Fig. 1. The size of each table is R1 = 15 MB, R2 = 12 MB, and R3 = 25 MB. If the estimate [9,14] of the result size of R1 ⋈ R2 is 8 MB, peer-to-peer collaboration derives an execution plan as (1) send R2 to R1, compute R1 ⋈ R2 at R1's site, and send R3 to R1's site to compute the final result, (2) do (1) but send the result of R1 ⋈ R2 to R3 instead, (3) do (1) but start by sending R1 to R2, and (4) do (3) but send the result of R1 ⋈ R2 to R3 instead. The transmission cost is respectively (1) R2 × 2 + R3 × 3 = 99 MB, (2) R2 × 2 + R1 ⋈ R2 × 3 = 48 MB, (3)
R1 × 2 + R3 × 1 = 55 MB, (4) R1 × 2 + R1 ⋈ R2 × 1 = 38 MB. Therefore, the peer-to-peer method will select plan (4). On the other hand, the delegation-to-centrality method finds a server node eligible to gather the data and process the query. Suppose all three servers have the requisite processing capabilities: for R1 to be the center, the total amount of data transfer is R2 × 2 + R3 × 3 = 99 MB; for R2, R1 × 2 + R3 × 1 = 55 MB; for R3, R1 × 3 + R2 × 1 = 57 MB; therefore R2 should act as the center.

Plan computation for peer-to-peer collaboration is particularly demanding. For example, if 3 nodes hosting R1, R2, and R3 participate in a join R1 ⋈ R2 ⋈ R3, it is necessary to examine 3!/2 sequences, owing to the commutative property of the join operation, to determine which one gives the minimum cost. Evaluating Example 2 rewritten as ((R1 − R3) ⋈ (R2 − R3)) − R3 for possible optimization is likewise costly, and clearly a brute-force approach to this problem is of exponential complexity. An alternative is to designate a node that is relatively central to those providing data for the query, and to send all the data to that node to complete the relational operation [17]. The first step in this delegation procedure is finding the central node. The simplest way to do this is by means of the Floyd-Warshall algorithm [12,13], a dynamic-programming approach for finding the shortest paths between every pair of vertices in the subgraph formed by the nodes involved in the query operation. From the matrix of shortest distances produced by this algorithm, a central node can be chosen by comparing the sums of data times distance for each of the nodes in the subgraph. Floyd-Warshall executes in O(m^3) steps, where m is the number of nodes serving as data servers.
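A sketch of this selection with NetworkX, reproducing the distances of Example 2 (d(R1,R2) = 2, d(R1,R3) = 3, d(R2,R3) = 1) on a small invented graph; the same computation underlies the simulator described in the next section.

```python
# Delegation to centrality: pick the server minimizing sum(data * distance).
import networkx as nx

G = nx.Graph([("R1", "x"), ("x", "R2"), ("R2", "R3"),
              ("R1", "y"), ("y", "z"), ("z", "R3")])   # invented topology
data = {"R1": 15, "R2": 12, "R3": 25}                  # MB held at each server

dist = nx.floyd_warshall(G)                            # all-pairs shortest paths

def cost(center):
    """Total data transferred if `center` gathers every hosted instance."""
    return sum(size * dist[host][center] for host, size in data.items())

center = min(data, key=cost)
print(center, cost(center))                            # R2 55.0
```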
3 Delegation to Centrality: Simulation Study
To investigate the effect that a central node exerts on data transmission and network congestion, we have implemented a simulation utilizing the Python NetworkX [24] package. A patent [5] filed by IBM-UK for deriving a hypercube center in linear time does not work for our case, because the derived center may not be able to act as a server. The delegation of work must occur at a node equipped with sufficient processing power as well as the software and hardware that enable it to process queries. We have therefore developed a new method to find an appropriate center, choosing one of the m servers in the n-node network.
3.1 Experimental Method
As illustrated in Fig. 2, the simulator generates an n-node random geometric graph [25] with a single connected component, in which the m nodes representing data servers are colored red and the network center (node 39), derived by the Floyd-Warshall method for the entire network, is colored yellow. It then constructs a subgraph that connects the m nodes and finds a center (node 47), colored sky blue. Note that the figure shows a small-scale network with m = 10
Fig. 2. Random geometric graph and derived centers
and n = 50 for purposes of illustration. While this is a center of the m-node subgraph, it may not be a data server, as the figure indicates. Hence, the simulator reconstructs an m-node subgraph connecting only servers, by finding the pairwise shortest distances of the m nodes and applying the Floyd-Warshall method again to find a center. The figure shows this server center colored dark blue. We conducted a series of experiments with different combinations of n and m and different network structures, deriving the average total size (product of data size and network hop count) to observe the growth of data circulating in the network. Specifically, for each combination of n and m, we generated 10 different graphs, and for each such graph we generated 10 sets of different data-sending schemes and averaged the total size of data transmission. In other words, each measure was derived from 100 simulation experiments.
3.2 Experimental Result
Here we present one result from the experiment set, with the following configuration: n = 500 and m = 50, 100, 150, 200, and 250. Data servers are randomly selected from 1/4 of the m nodes, and each holds a data segment whose size (in arbitrary units) is drawn from a uniform distribution [10, 100]. The following explains the plot and values in Fig. 3. (1) The worst (and largest) data amount to be transferred, due to the choice of the most distant node for data processing.
(2) The data amount to be transferred when choosing a node at random among the m nodes for data processing. (3) The data amount to be transferred when selecting the center without the use of data weights. (4) The best (and smallest) data amount to be transferred, obtained by deriving the center that minimizes data transfer using data weights. The percentage values in the table are relative to the results in the best cases. The experimental results shown in Fig. 3 highlight the finding that a network center derived from the m-node subgraph, without taking account of the amount of data to be transferred, gives near-optimal performance in every case. Experiments with other variations produced similar results. This outcome could be due to the relatively high capacity of the transmission links, which apparently did not allow for distinguishing transmission times. However, the result we observed should generalize to a network with a large number of participating servers in which the volume of data to be exchanged is reasonably uniform. In this case, the delegation-to-centrality method could choose a center from the m nodes based solely on distance considerations, without using information about the amount of data to exchange. Our finding indicates that, in principle, investment in a server located at the fixed center of the m nodes should be most advantageous if the network structure of the group of data-hosting sites is fairly stable. On the other hand, an unsupervised choice of center, i.e., random selection, performs less well, requiring about 20% more data to be transmitted.
m     (1) worst           (2) random          (3) fixed           (4) best
50    396.04 (184.4%)     257.82 (120.1%)     245.94 (114.5%)     214.76 (100%)
100   774.60 (169.5%)     572.90 (125.3%)     474.08 (103.7%)     457.12 (100%)
150   1,105.10 (164.9%)   861.78 (128.6%)     713.56 (106.5%)     670.30 (100%)
200   1,464.72 (175.3%)   1,032.70 (123.6%)   878.22 (107.5%)     835.52 (100%)
250   1,937.28 (163.0%)   1,477.52 (124.3%)   1,223.00 (102.9%)   1,189.08 (100%)

Fig. 3. Experiment results with n = 500 and m = 50 ∼ 250
4 Implementation by Software-Defined Networking
The objectives of this research are to investigate the performance characteristics of the above method and to develop and deliver the tools for achieving optimal
or semi-optimal performance of the set-oriented relational algebra operations in distributed data environments. The tools will be designed to work in random and engineered networks as well as in more constrained environments, such as low-bandwidth dynamic networks established across a set of cooperating organizations. Further research is needed to determine the precise conditions of data distribution that favor the use of such set-oriented operations either prior to or after query operations. Distributed query optimization at this comprehensive level has not been addressed in the existing literature. Therefore, we believe a successful outcome of our research will contribute to the solution of an outstanding problem.
4.1 Systems Framework
Our initial approach is to implement an operating systems environment consisting of a set of independent processes serving virtual data hosts. Communication among these processes can be achieved efficiently by means of a hypercube overlay network implemented with commonly available software-defined networking tools. Distributed applications can be built on multiple open-source, heterogeneous, relational database systems such as MariaDB, PostgreSQL, Firebird, and SQLite3. These databases connect through a network maintained by the OpenFlow [23] environment and a software-oriented Address Resolution Protocol implemented with Ryu [27] component packages. Portability of the system is ensured through the use of the Ubuntu 18.04 operating system built with a Mininet [22] virtual machine. Our experimental system creates a static hypercube to accommodate a predetermined number of nodes, and activates an overlay structure by accessing a designated IP address that hosts a specific database system. We have built several distributed query processing applications designed to gather data from a publicly available data repository, namely New York City OpenData [11]. Collections of interest in this system include records of parking violations, housing litigation, arrests, complaints, etc. Figure 4 shows the software structure of our distributed applications: the system will (1) erase the central
Fig. 4. Distributed data processing application environment
server's common data repository at the start of query execution, (2) deposit the central server's own data segment into the common repository, (3) communicate with the other databases through websockets to access their data segments and obtain them as JSON streams, (4) unpack the JSON streams and insert the data segments into the common repository, each as a database transaction, (5) execute the query as view selections, and (6) report the response time spent for steps (1) through (5).
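A condensed sketch of steps (1) through (5), assuming the websockets package, SQLite as the central store, and invented peer endpoints, table layout, and JSON format:

```python
# Consolidation flow at the central node: erase, deposit, fetch, insert, query.
import asyncio, json, sqlite3
import websockets

PEERS = ["ws://10.0.0.2:8765", "ws://10.0.0.3:8765"]   # hypothetical data hosts

def local_segment():
    return [("Q101", "BRONX"), ("Q102", "QUEENS")]     # placeholder rows

async def consolidate(db):
    db.execute("DELETE FROM arrests")                  # (1) erase the common repository
    db.executemany("INSERT INTO arrests VALUES (?, ?)",
                   local_segment())                    # (2) deposit the local segment
    for peer in PEERS:                                 # (3) fetch remote segments
        async with websockets.connect(peer) as ws:
            rows = json.loads(await ws.recv())         # (4) unpack the JSON stream
            db.executemany("INSERT INTO arrests VALUES (?, ?)", rows)
            db.commit()                                # one transaction per segment

db = sqlite3.connect("central.db")
db.execute("CREATE TABLE IF NOT EXISTS arrests (arrest_key TEXT, borough TEXT)")
asyncio.run(consolidate(db))
result = db.execute("SELECT borough, COUNT(*) FROM arrests "
                    "GROUP BY borough").fetchall()     # (5) view-style selection
```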
4.2 Preliminary Performance
A preliminary performance measure has been obtained from one distributed application that manipulates ’arrest records’ and ’complaint records’ stored in OpenData. The two data sets are divided into four databases as shown in Table 1. The queries retrieve the following: (1) the number of arrests from the three boroughs with the most complaints (2) the top 10 offenses complained about in boroughs that do not have the fewest arrests (3) the percentage of arrests per complaint across all boroughs for all racial groups. The four database are placed in a 16-node (24 ) hypercube and two sets of experiments were done to measure the data preparation time (steps (1) through (4) in Fig. 4) and query response time (step (5) in Fig. 4). The response times in Table 1 indicate the time spent by each of the four central database to complete the above three queries. The experiment showed that the insertion time to Firebird was long compared to other three databases, and that in this kind of lightweight data congestion there would be no significant difference in the database locations in the hypercube. Comparison of performance between a hypercube overlay network and a set of randomly built networks is underway. Table 1. OpenData’s record distributions and response time on hypercube Database
| Database   | Data period | Arrest row count | Complaint row count | Min. distance (1 hop away) | Max. distance (3 hops away) | Query time |
| Firebird   | Jan.–Mar.   | 44,822           | 106,111             | 226.8 s                    | 226.9 s                     | 4.3 s      |
| SQLite3    | Apr.–Jun.   | 29,959           | 89,075              | 43.6 s                     | 46.2 s                      | 1.2 s      |
| MariaDB    | Jun.–Sep.   | 28,593           | 107,990             | 41.3 s                     | 41.2 s                      | 1.1 s      |
| PostgreSQL | Oct.–Dec.   | 37,039           | 101,253             | 36.7 s                     | 38.3 s                      | 1.3 s      |
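The hop counts in Table 1 follow directly from the hypercube labeling: the network distance between two nodes is the Hamming distance of their binary labels, which also makes it cheap to choose a consolidation center. A minimal sketch (not the authors' code; the placement map is hypothetical):

```python
def hop_distance(a, b):
    # Network distance between hypercube nodes = Hamming distance of labels.
    return bin(a ^ b).count('1')

def consolidation_center(volumes, dim=4):
    """Choose the node minimizing total (data quantity x network distance);
    'volumes' maps a node label to the amount of data held there."""
    return min(range(2 ** dim),
               key=lambda n: sum(v * hop_distance(n, src)
                                 for src, v in volumes.items()))
```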
5 Conclusions
We have discussed a novel method for consolidating data in an engineered hypercube network for the purpose of optimizing distributed query processing. The principal approach works at the application layer to minimize data transfer, measured as the product of data quantity and network distance, by delegating the query processing to a node that is relatively central to the network. The hypercube facilitates computation of the network distance between nodes and of the location of a node to serve as the center for any data consolidation operations. We
sketched the demonstration effort and described a prototype implementation of a hypercube using Software-Defined Networking to support query optimization in distributed, heterogeneous database applications. At the time of writing, we are engaged in measuring performance and plan to report more results from our ongoing work. We will also incorporate a two-phase exchange method, adapted from the well-known semi-join operations [6], to ensure that data transfer between the central node and the data server nodes substantially reduces network congestion [17]. Current research on 5G networks shows the need for computational resources that are neither readily available nor adequate with existing facilities in enterprise cloud environments. IoT entails a shift to edge computing, making use of urban datasets coupled with expertise across disciplines to address challenges in urban environments. We believe that the approach introduced in this paper offers a viable solution to reduce the congestion due to the high volume of data traffic in emerging network applications. Additionally, we believe that our approach will prove useful for collaborative projects that address problems of national importance, including health management, urban informatics, and climate science.

Acknowledgement. The authors of this paper are grateful to the students in the Senior Design courses offered during the academic year 2020–2021 at the City College of New York. Each of the six teams produced a distributed database application that runs on the 2^4-node hypercube and successfully completed the performance experiments whose results are exhibited in this paper.
References

1. Arabi, K.: Mobile computing opportunities, challenges and technology drivers. In: IEEE DAC 2014 Keynote (2014)
2. Arabi, K.: Trends, opportunities and challenges driving architecture and design of next generation mobile computing and IoT devices. In: MIT MTL Seminar (2015)
3. Bent, G., Dantressangle, P., Stone, P., Vyvyan, D., Mowshowitz, A.: Experimental evaluation of the performance and scalability of a dynamic distributed federated database. In: Proceedings of the Second Annual Conference of ITA (2009)
4. Bent, G., Dantressangle, P., Vyvyan, D., Mowshowitz, A., Mitsou, V.: A dynamic distributed federated database. In: Proceedings of the 2nd Annual Conference of the International Technology Alliance (2008)
5. Bent, G.A., Dantressangle, P., Stone, P.D.: Optimising data transmission in a hypercube network. Technical report, IBM-UK (2019)
6. Bernstein, P.A., Chiu, D.-M.W.: Using semi-joins to solve relational queries. J. ACM 28(1), 25–40 (1981)
7. Bernstein, P.A., Goodman, N., Wong, E., Reeve, C.L., Rothnie, J.B., Jr.: Query processing in a system for distributed databases (SDD-1). ACM Trans. Database Syst. 6(4), 602–625 (1981)
8. Bouganim, L., Fabret, F., Mohan, C., Valduriez, P.: Dynamic query scheduling in data integration systems. In: 16th International Conference on Data Engineering, pp. 425–434 (2000)
9. Chaudhuri, S.: An overview of query optimization in relational systems. In: Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 34–43. ACM (1998)
10. Chourabi, H., et al.: Understanding smart cities: an integrative framework. In: 2012 45th Hawaii International Conference on System Sciences, pp. 2289–2297 (2012)
11. City of New York: NYC OpenData. https://opendata.cityofnewyork.us
12. Cormen, T.: Introduction to Algorithms, vol. 15, 2nd edn. The MIT Press, Cambridge (2001)
13. Floyd, R.W.: Algorithm 97: shortest path. Commun. ACM 5(6), 345 (1962)
14. Garcia-Molina, H., Ullman, J.D., Widom, J.: Database Systems - The Complete Book, 2nd edn. Pearson Education (2009)
15. Jiang, Y., Taniar, D., Leung, C.: High performance distributed parallel query processing. Comput. Syst. Sci. Eng. 16, 277–289 (2001)
16. Kawaguchi, A., et al.: A model of query performance in dynamic distributed federated databases taking account of network topology. In: Annual Conference of International Technology Alliance in Network and Information Science (ACITA2012) (2012)
17. Kawaguchi, A., Mowshowitz, A., Shibata, M.: Semi-operational data reductions for query processing in highly distributed data environments (extended abstract). In: US-Japan Workshop on Programmable Networking, Kyoto, Japan (2020)
18. Miorandi, D., Sicari, S., De Pellegrini, F., Chlamtac, I.: Internet of things: vision, applications and research challenges. Ad Hoc Netw. 10(7), 1497–1516 (2012)
19. Mowshowitz, A., Bent, G.: Formal properties of distributed database networks. In: Annual Conference of the International Technology Alliance, University of Maryland (2007)
20. Mowshowitz, A., et al.: Query optimization in a distributed hypercube database. In: Proceedings of the Fourth Annual Conference of ITA (2010)
21. Mowshowitz, A., Kawaguchi, A., Tsuru, M.: Topology as a factor in overlay networks designed to support dynamic systems modeling. In: 13th International Conference on Intelligent Networking and Collaborative Systems (INCoS-2021) (2021, in press)
22. Open Networking Foundation: Mininet. https://mininet.org
23. Open Networking Foundation: OpenFlow. https://opennetworking.org
24. NetworkX Organization: NetworkX - network analysis in Python. https://networkx.org
25. Penrose, M.: Random Geometric Graphs. Oxford University Press, Oxford (2003)
26. Rothnie, J., Jr., Bernstein, P., Fox, S., Goodman, N., Hammer, M., Landers, T., Reeve, C., Shipman, D., Wong, E.: Introduction to a system for distributed databases (SDD-1). ACM Trans. Database Syst. (TODS) 5(1), 1–17 (1980)
27. Ryu SDN Framework Community: Ryu. https://ryu-sdn.org
28. Saadawi, T., Kawaguchi, A., Lee, M.J., Mowshowitz, A.: Secure resilient edge cloud designed network (invited). IEICE Trans. Commun. E103-B(4) (2020)
29. Taniar, D., Leung, C.H.C., Rahayu, J.W., Goel, S.: High Performance Parallel Database Processing and Grid Databases. John Wiley & Sons, Hoboken (2008)
30. Toce, A., Mowshowitz, A., Kawaguchi, A., Stone, P., Dantressangle, P., Bent, G.: HyperD: analysis and performance evaluation of a distributed hypercube database. In: Proceedings of the Sixth Annual Conference of ITA (2012)
31. Toce, A., Mowshowitz, A., Kawaguchi, A., Stone, P., Dantressangle, P., Bent, G.: An efficient hypercube labeling schema for dynamic peer-to-peer networks. J. Parallel Distrib. Comput. 102, 186–198 (2017)
32. Yi, S., Li, C., Li, Q.: A survey of fog computing: concepts, applications and issues. In: Proceedings of the 2015 Workshop on Mobile Big Data (2015)
33. Yildirim, I.: Query operations in highly distributed environment. Master's thesis, City College of New York (2014)
Loose Matching Approach Considering the Time Constraint for Spatio-Temporal Content Discovery

Shota Akiyoshi1(B), Yuzo Taenaka2, Kazuya Tsukamoto1, and Myung Lee3

1 Kyushu Institute of Technology, Iizuka, Japan [email protected], [email protected]
2 Nara Institute of Science and Technology, Ikoma, Japan [email protected]
3 CUNY, City College, New York City, USA [email protected]
Abstract. Cross-domain data fusion is becoming a key driver in the growth of numerous and diverse applications in the IoT era. We have proposed the concept of a new information platform, the Geo-Centric Information Platform (GCIP), that enables IoT data fusion based on geolocation. The GCIP dynamically produces spatio-temporal content (STC) by combining cross-domain data in each geographic area and then provides the STC to users. In this environment, it is difficult to find particular STC requested by a user because the user cannot determine beforehand which STC is created in each area. Although we proposed a content discovery method for the GCIP in a previous study to address this difficulty, the temporal property of STC was not taken into account, despite the fact that the available (effective) period of each STC is limited. In the present paper, we propose a new loose matching approach considering the time constraint for STC discovery. Simulation results showed that the proposed method successfully discovers appropriate STC in response to a user request.
1 Introduction
With the development of IoT technologies, the collaboration of cross-domain data obtained by sensing a wide variety of things has attracted attention [1]. We proposed the Geo-Centric Information Platform (GCIP) [2], which collects, processes, and distributes IoT data (i.e., realizes IoT data fusion) in a geolocation-aware manner. As shown in Fig. 1, the GCIP divides the space into a hierarchical mesh structure based on longitude and latitude, configures a network corresponding to the mesh structure, and generates content from the IoT data collected in each mesh. Two types of servers, a data store server (DS server) and a data fusion server (DF server), are deployed in a mesh. The DS server collects all IoT data in the geospatial range corresponding to the mesh, and the DF server uses this IoT data to generate spatio-temporal content (STC), which is local content for the region.
The data properties vary in terms of collection interval, data volume, and data presence, depending on the type and/or movement of the IoT devices. The DF server generates STC dynamically according to the data present at the time of STC creation. That is, the generated STC depends on the IoT data being collected at that time. Therefore, it is difficult for users to find particular STC because they cannot know in advance which STC the DF server generates. A previous study [3] proposed an STC discovery method that enabled users to find appropriate STC by matching the statistics of IoT data usage of each DF server with user requests. Although this method can select the DF server having the largest amount of STC, it has the problem that a DF server that does not have the largest amount of STC is never selected, even if that DF server has STC appropriate for the user request. In addition, STC has an available period, but this is not taken into account. In the present paper, we propose an extended STC discovery method that considers the available period of STC. The proposed method enables the discovery of servers that have a large amount of STC that is appropriate for user requests and also remains available, by exploiting cosine similarity. The remainder of the present paper is organized as follows. Section 2 introduces related research on content retrieval, and Sect. 3 describes our previous research. Section 4 describes the proposed method, and Sect. 5 explains the comparison methods, evaluation metrics, and simulation. Section 6 presents a summary of our paper.
2 Related Research
In this section, we review existing content retrieval methods. Reference [4] summarizes existing studies that focus on content retrieval (location-based [5], metadata-based [6], and event-based [7]). These methods use the elements of time, location, and content as different search metrics. Information-centric networking (ICN) is a promising concept for efficient content retrieval and distribution. Since ICN operates on a content basis rather than an IP basis, users can directly search for content using the content name without knowing the location of the content [8]. However, in the case of cross-domain data fusion, several pieces of content are created in response to the IoT data collected at that time, and the user cannot know the name of the content at the time of search in ICN. This makes it very difficult to search for the content. In the present paper, we use the topics that are used to compose content as a metric for content search.
Fig. 1. Assumed environment for GCIP
3 GCIP: Geo-Centric Information Platform
The GCIP can arrange a transmission route based on physical location because it uses a network address with a unique ID called the mesh ID, which is a hierarchically defined number that depends on the mesh structure, as shown in Fig. 1. This enables location-aware search simply by designating a particular mesh ID in a content-search request packet. In the present study, we focus on STC retrieval by matching the statistics of IoT data usage of the DF server with the user request after a search packet reaches the designated mesh. In each mesh of the GCIP, DS servers are supposed to be installed by local governments, such as prefectures and municipalities, and DF servers are installed by content providers who want to provide STC to users in the region. In the present study, we assume that there is one DS server and several DF servers in a mesh of the GCIP. In order to make STC generation occur asynchronously, we use Publish/Subscribe (Pub/Sub) communication (Fig. 1). The mesh router is the Publisher, the DS server is the Broker, and the DF server is the Subscriber. The mesh router duplicates all data sent from IoT devices to a particular cloud server (the original destination) along the way and publishes these data to the DS server with a topic indicating the type of data. For STC generation, a DF server sends a subscription request to the DS server specifying multiple topics and processes the received data to generate an STC. At this time, the available period of each STC is set by the DF server. In the present study, we assume that one STC is generated from the data collected for one subscription request and that the DS server cannot tell whether a subscription consisting of the same topics is for the same STC or for a different STC.
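The paper does not name a specific Pub/Sub implementation; assuming an MQTT-style broker for illustration, a DF server's multi-topic subscription could look like the following sketch (the broker hostname and topic names are invented, and the callback style is that of paho-mqtt 1.x):

```python
import paho.mqtt.client as mqtt  # third-party package: pip install paho-mqtt

buffers = {}  # per-topic buffers of received IoT data

def on_message(client, userdata, msg):
    # Collect the data published by the mesh router via the DS server (Broker).
    buffers.setdefault(msg.topic, []).append(msg.payload)

client = mqtt.Client()
client.on_message = on_message
client.connect('ds-server.example', 1883)      # the DS server acts as the Broker
client.subscribe([('sensors/temperature', 0),  # one subscription request
                  ('sensors/humidity', 0)])    # naming multiple topics
client.loop_forever()  # the DF server would fuse the buffered data into one STC
```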
4 Matching-Based STC Discovery

4.1 Conceptual Design of the Matching Search Method
Since each DF server generating STC is managed by a different operator and the type and/or timing of the content generated by each DF server vary, users cannot know what type of STC is being generated or when and where this generation occurs, and thus cannot directly send a search request to any DF server. In this circumstance, since it is also difficult for a user to specify the name of content or explicit keywords, a new search method is required. Therefore, we focus on the fact that the DS server receives subscription requests from all DF servers in a mesh at the time of STC generation and thus can use this statistical information. The following is an overview of the matching-based search. The user first sends a search packet to the DS server, which serves as the anchor point in each mesh because the DS server has all the data in each mesh and keeps subscription statistics. When the DS server receives the search packet, the server tries to match the subscription statistics of all DF servers with the user request and chooses a DF server appropriate for the request. The DS server then forwards the search packet to that DF server, and the DF server sends the user the STC that is appropriate for the user request. In the following, we describe the previous method and its problems, and then explain the user requirements and matching procedures of the proposed method.

4.2 Previous Search Method and the Remaining Problems
In the previous search method, a user is supposed to specify several topics that are highly related to the interest of the user. Each topic has a priority from 0 to 100, and the sum of all topic priorities is 100. This request is sent to the DS server, and the DS server chooses the DF server that is expected to have the largest amount of STC matching the user request. To calculate the expected amount of STC $E_i$ at DF server $i$, let $c$ ($c \in C$) be a combination of topics that satisfies the user request, let $P_{i,j}$ be the request probability of topic $j$ at DF server $i$, and let the probability that DF server $i$ has combination $c$ be $G_i(c) = \prod_{j \in c} P_{i,j}$. Using these definitions, the expected amount of STC on DF server $i$ can be expressed as $E_i = \sum_{c \in C} (G_i(c) \times N_i)$. Although this method can select a DF server having the largest amount of STC, it has two problems. First, a DF server that does not hold the largest amount of STC can never be selected, even if that server has STC appropriate for the user request. Second, all STC have an available period, which should be taken into account in the search procedure; without this consideration, a selected DF server might have only old (no longer useful) STC.

4.3 Proposed Method
In the proposed method, a user specifies three types of information for STC discovery: the location information of the target area, the search keywords of the
desired content, and the ambiguity level. The search keywords are translated (or decomposed) into topics by using any intention extraction technique. At this time, a priority, a number from 0 to 100, is also assigned to each topic. The sum of the priorities of all topics is 100. The ambiguity level, a number from 0 (strictly the same) to 100 (allowing anything), indicates the degree to which the user allows the search results to differ from the specified keywords. Once this information is received as a request, the DS server tries to identify an appropriate DF server by the proposed two-stage search method and then forwards the request based on the information. The definition of the optimal DF server is described in Sect. 4.3.1. The first stage, selecting several candidates for the optimal DF server, is described in Sect. 4.3.2, and the second stage, selecting the optimal DF server, is described in Sect. 4.3.3.

4.3.1 Defining the Optimal Data Fusion Server
Since the proposed method takes the freshness and amount of STC into account in the STC search, we need a definition of the optimal DF server based on these two factors. We define the optimal DF server as a server that has the largest amount of STC that matches the topics of a user request and whose remaining available period is long. A server having a large amount of fresh STC is more beneficial to users than a server having a large amount of old STC, which is sometimes identified by the previous method described in Sect. 4.2. The formal definition is as follows. We make two lists in which the DF servers are sorted by the amount of appropriate STC and by the total available period of all appropriate STC, respectively. The score of each DF server is calculated as the sum of the two numerical values indicating its order on these lists. The DF server having the smallest score is treated as optimal. Although this definition of the optimal DF server is useful for determining a theoretically optimal server, in practice nobody has a global view of all DF servers. This is why the proposed method tries to identify an optimal server by using the statistics of subscription requests from every DF server, as described in the next section. If the sums of the ranks are the same, then the DF server with the largest amount of STC satisfying the user requirements is defined as the optimal DF server. In this way, servers with a large amount of only old STC, or servers with a small amount of STC but a very long available period for one piece of STC, are not selected, and a server with a large amount of STC whose overall available period is long can be determined as the optimal server.

4.3.2 Stage 1: Matching Algorithm for Selecting Several Candidates for the Optimal DF Server
Figure 2 shows the matching procedure of the proposed method. In order to identify an optimal DF server, a DS server estimates the optimal DF server by matching the subscription statistics of the DF servers against the user request. Specifically, the DS server counts the number of subscription requests for each topic sent from each DF server and calculates the ratio of subscriptions for each topic among all subscriptions. A higher ratio for a topic indicates that a DF server is more likely
to have a larger amount of STC on that topic. In contrast, since a user request includes several topics with priority values, it can be interpreted as the user expecting STC composed of topics in proportion to those priority values. Because of this shared structure, the DS server, which has information about both the subscription requests and the user request, matches them to find an optimal DF server.
Fig. 2. Matching procedure of the proposed method
In order to perform matching, we use cosine similarity to evaluate the similarity between the topic composition of the subscriptions of a DF server and the user request. The DS server keeps the combination of topics subscribed to by each DF server, the last subscription time of each topic from each DF server, the subscription interval for each subscription with the same topic combination, and the total number of subscriptions. In order to describe the procedure for identifying an optimal DF server, we use the following notation for the subscription information of the DF servers and the user request:

• Total number of DF servers in the mesh: $L$
• All topic combinations subscribed to by DF server $i$: $C_i = \{c_{i1}, c_{i2}, ..., c_{im}, ..., c_{iM}\}$
• Set of combinations $c_m$ containing topic $j$ in $C$: $C_j$
• Last subscription time for $c_m$: $t_{c_m}$
• Subscription interval for $c_m$: $i_{c_m}$
• Total number of subscriptions for $c_m$: $n_{c_m}$
• Total number of subscriptions of STC that satisfy the user request: $n_{sum}$

Next, we calculate the ratio of the combination $c_m$ to the subscriptions of one DF server. We set the ambiguity level specified in a user request as a threshold α and select only the topic combinations $c_{m\alpha}$ for which the total value of the priority in the
user request exceeds α. That is, the topic combinations depend on the ambiguity level α and could consist of only one topic, even if the user request includes several topics. This allows a search result to involve related information. We define the weight of each element in $c_{m\alpha}$ as $w_{c_m} = n_{c_m}/n_{sum}$, and the weight $w_j$ of topic $j$ as the sum of the $w_{c_m}$ of the elements in $C_j$. The normalized vector of the weights of all topics is defined as the weight vector $W$ of the DF server. We define $W_U$ as the analogous weight vector built from the topic priorities specified by the user. Using these vectors, we calculate the cosine similarity as in Eq. (1):

$CS_l = \mathrm{CosSim}(W, W_U)$   (1)
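The weight vectors and Eq. (1) can be computed as in the following sketch; the data structures are hypothetical, and the α-based filtering of combinations is omitted for brevity:

```python
import math
from collections import defaultdict

def weight_vector(sub_counts, topics):
    """sub_counts maps a topic combination c_m (a frozenset) to n_cm for one
    DF server; returns the normalized weight vector W over a fixed topic order."""
    n_sum = sum(sub_counts.values())
    w = defaultdict(float)
    for combo, n_cm in sub_counts.items():
        w_cm = n_cm / n_sum            # weight of the combination, w_cm
        for topic in combo:
            w[topic] += w_cm           # w_j: sum over combinations containing j
    vec = [w[t] for t in topics]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cos_sim(w, w_u):                   # Eq. (1)
    dot = sum(a * b for a, b in zip(w, w_u))
    na = math.sqrt(sum(a * a for a in w))
    nb = math.sqrt(sum(b * b for b in w_u))
    return dot / (na * nb) if na and nb else 0.0
```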
Next, we explain how the cosine similarity is used. The larger the value of the cosine similarity, the more optimal the DF server is considered to be. Therefore, the DF server with the largest $CS_l$ calculated by Eq. (1) is estimated to be the optimal DF server. However, the simulation results make clear that the server with the highest cosine similarity is not necessarily the optimal DF server. We therefore investigated the distribution of the optimal DF servers when the servers are arranged in descending order of their cosine similarity values, as well as the range of possible values of the cosine similarity. The results show that about 90% of the optimal DF servers are contained in the top five in terms of cosine similarity; therefore, the servers that are within the top five in terms of cosine similarity and satisfy the thresholds (MIN = 0.82, MAX = 0.97) are considered as candidate search results.

4.3.3 Stage 2: Optimal DF Server Selection
The next step is to select one of the five candidates chosen in Sect. 4.3.2. We use the available period to select a candidate. The DF server with the largest median STC available period is defined as the server with the longest available period. This definition eliminates the possibility of selecting a server having only old STC. However, since the DS server cannot know the available period of each STC, it uses a value $p_{c_m}$ drawn from the Poisson distribution with mean available period λ to estimate the distribution. Specifically, the remaining available period of the combination $c_m$ is calculated as $e_{c_m}$ in Eq. (2), where the arrival time of the user's search packet is $t_{now}$:

$e_{c_m} = p_{c_m} - (t_{now} - t_{c_m})$   (2)
Let $U_{S_l}$ be the median of the $e_{c_m}$ values aggregated for DF server $S_l$. Among all the DF servers in the mesh, the server with the largest $U_{S_l}$ has the highest probability of being the optimal DF server, so the DS server forwards the user's search packet to it and performs the search. The DF server that receives the forwarded request searches for STC composed of topic combinations $c_{m\alpha}$ and then returns all found STC to the user.
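A hedged sketch of this second stage, with the Poisson estimate of Eq. (2) drawn via NumPy (the candidate bookkeeping is hypothetical):

```python
import numpy as np

def select_optimal(candidates, t_now, lam, seed=0):
    """candidates maps a DF server id to the last subscription times t_cm of
    its matching combinations; returns the server with the largest median
    estimated remaining available period U_Sl."""
    rng = np.random.default_rng(seed)
    best, best_med = None, -np.inf
    for server, t_cms in candidates.items():
        p_cm = rng.poisson(lam, size=len(t_cms))      # estimated periods
        e_cm = p_cm - (t_now - np.asarray(t_cms))     # Eq. (2)
        med = float(np.median(e_cm))                  # U_Sl
        if med > best_med:
            best, best_med = server, med
    return best
```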
5 Performance Evaluation

5.1 Simulation Environment
We use a simulation environment focusing on one mesh, as shown in Fig. 3. There are 10 DF servers, and each DF server is biased to request many subscriptions for a particular topic. The probability of requesting its specific topic in a subscribe request is set to 50% (subscription bias 0.5, hereafter SB = 0.5) or 100% (SB = 1). In addition, we assume that each DF server requires several different topics. The number of topics constituting an STC is set randomly. We search using an unclear request (UR), in which there is little difference in the importance of each topic. In the simulation, STC is generated for 10 min, and then a search packet is sent for evaluation. We set the parameters as shown in Table 1. The user sends a UR to the DS server 1,000 times to evaluate the performance of the search.
Fig. 3. Simulation topology
Table 1. Simulation parameters

| Parameter | Value |
| Number of servers | 10 units |
| Number of content generated per unit of time | 100 [piece/unit] |
| Topic type | 10 [types] |
| Number of topics linked | 2–5 |
| STC mean period to available | 10, 30, 60 [min] |
| Threshold α | 80 |
| Mean period to available λ | 10, 30, 60 [min] |

5.2 Evaluation Index
We use three evaluation indexes: 1) the estimation accuracy, which is the probability that each method selects the DF server of each rank; 2) the amount
of appropriate STC, matching the user request, obtained from the estimated server; and 3) the distribution of the remaining available period of the obtained STC (Eq. (2)). Note that we denote by cp the condition in which the obtained STC is composed of exactly the same topics as those included in the user request. Furthermore, cp+1 denotes the amount of obtained STC that contains one topic beyond the user request, and cp+2 the amount that contains two such topics. We use two comparison methods to evaluate the effectiveness of the proposed method. Comparison method 1 is the method of the previous study described in Sect. 4.2, referred to as the expected value basis (EV). Comparison method 2 uses only the cosine similarity to select the appropriate DF server and is referred to as the cosine similarity basis (CS). Since, unlike the proposed method, CS does not take the remaining available period into account, comparing the two lets us evaluate the effect of that factor on the search results. In this method, the server with the largest value of $CS_l$ derived by the proposed method is selected as the optimal server.

5.3 Results and Discussion
Figure 4 shows the ratio of the identified DF servers in experiments on two environments combining the user request (UR) with subscription bias (SB) 1.0 or 0.5. The figure shows the data for the proposed method, the expected value basis (EV, comparison method 1), and the cosine similarity basis (CS, comparison method 2). The order of the identified DF server, denoted as 1st, 2nd, and 3rd, follows the order of the score used in the definition of the optimal DF server in Sect. 4.3.1. Here, 1st indicates that a method found the optimal DF server. For the case in which both the UR and SB are highly biased (SB = 1, UR), the proposed method is able to estimate a DF server within the top three with 92% accuracy. In the environment of UR and SB = 0.5, the proposed method can estimate a DF server within the top three with 81% accuracy. Next, Table 2 shows the average amount of STC acquired by users for each SB, and Fig. 5 shows the remaining available period of the STC acquired by users. We show the results of the proposed method and the CS-based method, which collects a large amount of STC. Figure 4 shows that the proposed method has the highest accuracy in estimating the optimal DF server in all environments, with 47% accuracy even for SB = 0.5 and UR. The cosine similarity-based method has the next highest accuracy, 20% for SB = 0.5 and UR. EV has an estimation accuracy of less than 1% in all environments. EV is a method for selecting the server with the largest amount of STC that satisfies the user requirements and does not take the available period of STC into account. As a result, EV fails to select the best DF server within the first five servers, and the worst server among the 10 DF servers is selected most often. CS is a significant improvement over EV and is able to select the optimal DF server. Since EV does not take cp+1 and cp+2 into account, it cannot estimate the DF server that maximizes the acquired STC including cp+1 and cp+2.
Fig. 4. Estimation accuracy (proposed method)

Table 2. Number of STC acquisitions

| Method | cp (SB=1) | cp+1 (SB=1) | cp+2 (SB=1) | cp (SB=0.5) | cp+1 (SB=0.5) | cp+2 (SB=0.5) |
| Proposed method | 4 | 15 | 29 | 3 | 8 | 14 |
| Expected value basis (EV) | 0 | 3 | 11 | 1 | 4 | 13 |
| Cosine similarity based (CS) | 8 | 20 | 32 | 5 | 10 | 12 |
Since the time characteristics are not taken into account, the number of times CS estimated the optimal DF server was smaller than for the proposed method in all environments. The difference in the median available period between the STC obtained by the proposed method and by CS was approximately 4 min for SB = 1 and UR, and approximately 5 min for SB = 0.5 and UR. The proposed method can thus provide STC with a longer available period than the similarity-only method in the SB = 0.5 and UR environment. With all methods, users can obtain STC that include topics they did not specify, which can give them new insights. In addition, since EV does not select the server with the highest ranking, it cannot obtain STC that perfectly match the user requirements, but only STC that contain topics not specified by the user.
Fig. 5. Remaining available period of acquired STC
6 Conclusion
In the GCIP, users cannot know when and where any STC is generated, nor can they directly request any DF server to search for STC. Therefore, we proposed a matching approach that finds STC satisfying the user request by focusing on the similarity between the subscription statistics of the DF servers and the user request, as well as on the available period of STC. The simulation results showed that the user can obtain fresh STC. In the future, we intend to estimate the amount of STC generated from the transmission interval of identical subscriptions and to study methods for improving the search accuracy.

Acknowledgements. The present study was supported by JSPS KAKENHI Grant Number JP21H03430, by NICT Grant Number 19304, and by USA Grant Numbers 1818884 and 1827923.
References

1. Al-Fuqaha, A., et al.: Internet of things: a survey on enabling technologies, protocols and applications. IEEE Commun. Surv. Tutor. 17(4), 2347–2376 (2015)
2. Tsukamoto, K., et al.: Geolocation-centric information platform for resilient spatio-temporal content management. IEICE Trans. Commun. (2020). Online ISSN 1745-1345, Print ISSN 0916-8516
3. Nagashima, K., et al.: Matching based content discovery method on Geo-Centric Information Platform. In: INCoS 2020, vol. 1263, pp. 470–479 (2020)
4. Pattar, S., et al.: Searching for the IoT resources: fundamentals, requirements, comprehensive review, and future directions. IEEE Commun. Surv. Tutor. 20, 2101–2132 (2018)
5. Mayer, S., Guinard, D., Trifa, V.: Searching in a web-based infrastructure for smart things. In: 2012 3rd IEEE International Conference on the Internet of Things, pp. 119–126 (2012)
6. Mayer, S., Guinard, D.: An extensible discovery service for smart things. In: WoT 2011: Second International Workshop on the Web of Things, pp. 1–6 (2011)
7. Pintus, A., Carboni, D., Piras, A.: Paraimpu: a platform for a social Web of Things. In: Proceedings of the 21st International Conference on World Wide Web (WWW Companion), pp. 401–404 (2012)
8. Xylomenos, G., et al.: A survey of information-centric networking research. IEEE Commun. Surv. Tutor. 16, 1024–1049 (2013)
Optimized Memory Encryption for VMs Across Multiple Hosts

Shuhei Horio1(B), Kouta Takahashi1, Kenichi Kourai1, and Lukman Ab. Rahim2

1 Kyushu Institute of Technology, 680-4 Kawazu Iizuka, Fukuoka, Japan {horio,takahashi,kourai}@ksl.ci.kyutech.ac.jp
2 Universiti Teknologi Petronas, 32610 Seri Iskandar, Perak Darul Ridzuan, Malaysia [email protected]
Abstract. Recently, virtual machines (VMs) with a large amount of memory are widely used. It is often not easy to migrate such a large-memory VM because VM migration requires one large destination host. To address this issue, split migration divides the memory of a VM into small pieces and transfers them to multiple destination hosts. The migrated VM exchanges its memory data between the hosts using remote paging. To prevent information leakage from the memory data in an untrusted environment, memory encryption can be used. However, encryption overhead largely affects the performance of the hosts and the VM. This paper proposes SEmigrate for optimizing the memory encryption in split migration and remote paging. SEmigrate avoids decrypting memory data at most of the destination hosts to reduce the overhead and completely prevent information leakage. Also, it selectively encrypts only the memory data containing sensitive information by analyzing the memory of the guest operating system in a VM. SEmigrate could reduce the CPU utilization during encrypted split migration by 6–20% point and improve the performance of the migrated VM with encrypted remote paging to 1.9×.
1 Introduction

Recently, virtual machines (VMs) with a large amount of memory are widely used. For example, Amazon EC2 provides VMs with 24 TB of memory [1]. Upon host maintenance, a VM can be moved to another host using VM migration without disrupting the services provided by the VM. For large-memory VMs, however, it is not cost-efficient to always preserve large hosts with a sufficient amount of memory as the destination of occasional VM migration. These hosts cannot run other VMs even while they are not used for VM migration. To make the migration of such large-memory VMs more flexible, split migration has been proposed [7]. Split migration divides the memory of a VM into small pieces and transfers them to multiple smaller destination hosts, which consist of one main host and one or more sub-hosts. It transfers likely accessed memory data to the main host as well as the state of the VM core such as virtual CPUs. The rest of the memory is transferred to sub-hosts. After split migration, the main host runs the VM core, while the sub-hosts provide the memory to the VM. When the VM accesses the memory existing in a sub-host, it exchanges memory data between the main host and the sub-host using remote
paging. The sub-host transfers the required memory data to the main host, while the main host transfers unnecessary memory data to the sub-host. However, it is possible to eavesdrop on the memory data of a VM during split migration and remote paging in an untrusted execution environment. For example, information leakage easily occurs if the memory data is transferred via untrusted networks. If the administrators of some of the sub-hosts are untrusted, they can eavesdrop on the memory data held in the sub-hosts. In general, the encryption of the memory data can prevent such information leakage. However, encryption overhead largely affects the performance of the hosts and the VM because encryption and decryption are performed every time memory data is transferred. To address this performance issue, this paper proposes SEmigrate for optimizing memory encryption in split migration and remote paging. SEmigrate avoids decrypting memory data at sub-hosts to reduce encryption overhead and completely prevent information leakage at sub-hosts. Upon split migration, it encrypts memory data at the source host and decrypts that data only at the destination main host. Upon remote paging, it encrypts or decrypts memory data only at the main host. In addition, SEmigrate selectively encrypts only memory data containing sensitive information to further reduce encryption overhead. To obtain information needed for the selective encryption, SEmigrate analyzes the memory of the guest operating system (OS) in a VM using a technique called VM introspection (VMI) [3]. For example, it considers that the free memory of the guest OS does not contain sensitive information. If SEmigrate determines that memory data to be transferred is part of the free memory, it does not encrypt that memory data. When the user specifies that an application does not deal with sensitive information, SEmigrate does not encrypt the entire memory of all the processes executing that application. We have implemented SEmigrate in KVM supporting split migration and remote paging. To confirm performance improvement by SEmigrate, we ran an application using a large amount of memory in a VM and examined the performance of encrypted split migration and the migrated VM. As a result, it was shown that SEmigrate could reduce the CPU utilization during encrypted split migration by 6–20% point. Also, we showed that SEmigrate could improve the performance of the migrated VM with encrypted remote paging to 1.9×. The organization of this paper is as follows. Section 2 describes split migration and remote paging and the issues of memory encryption. Section 3 proposes SEmigrate for optimizing memory encryption in split migration and remote paging. Section 4 explains the implementation of SEmigrate and Sect. 5 shows our experimental results. Section 6 describes related work and Sect. 7 concludes this paper.
2 Memory Encryption of VMs Across Hosts

2.1 Split Migration
Split migration [7] divides the memory of a VM into small pieces and transfers them to multiple small destination hosts, as illustrated in Fig. 1. The destination hosts consist of one main host and one or more sub-hosts. Split migration transfers likely accessed memory data to the main host as much as possible. It also transfers the state of the VM
core such as virtual CPUs and devices. The rest of the memory data is transferred to one of the sub-hosts.
Fig. 1. Split migration.
After split migration, the migrated VM runs across the main host and the sub-hosts. The main host runs the VM core, while the sub-hosts provide the memory to the VM core. When the VM core requires the memory data existing in a sub-host, that data is exchanged between the main host and the sub-host using remote paging. The sub-host transfers the memory data required by the VM core, which is called a page-in. At the same time, the main host transfers unlikely accessed memory data to the sub-host, which is called a page-out.

2.2 Encryption of Memory Data

In an untrusted execution environment, it is possible to eavesdrop on the memory data of a VM transferred during split migration and remote paging. For example, information leakage easily occurs if the memory data is transferred via untrusted networks. Similarly, the memory data can be exposed if remote paging is performed between the main host and sub-hosts via untrusted networks. In addition, the memory data can be stolen by untrusted administrators at any host. Fortunately, several mechanisms have been proposed to protect the memory of a running VM [4, 8, 10]. Since we can use such memory protection mechanisms at the source host and the main host running the VM core, we assume that information leakage does not occur at these hosts in this paper. In general, information leakage can be prevented by encrypting the memory data of a VM. Upon split migration, the memory data is encrypted at the source host by using an encrypted communication channel such as SSL. Encrypted data is transferred to the destination main host and sub-hosts and is then decrypted by the channel. After that, the sub-hosts re-encrypt the decrypted memory data to securely hold it against untrusted administrators. The reasons why decryption and re-encryption are required are that the channel automatically decrypts memory data and that it is difficult to continue to use the decryption key created for the channel after the communication. Note that such re-encryption is not necessary at the main host because the memory data is managed in a protected manner.
Upon remote paging, the memory data held in a sub-host is first decrypted. Then, it is re-encrypted at the sub-host by using an encrypted communication channel established between the sub-host and the main host. It is transferred to the main host for a page-in and is then decrypted by the channel. For a page-out, unnecessary memory data is encrypted at the main host by using this channel and is then transferred to the sub-host. It is decrypted at the sub-host and is then re-encrypted to be securely held. However, using encrypted communication channels imposes a large overhead because encryption and decryption are performed whenever memory data is transferred. Figure 2(a) shows the average CPU utilization at the source host during split migration of a VM. This measurement was done using the experimental setup in Sect. 5. Although the migration time almost did not increase by memory encryption, the CPU utilization became 1.7× higher. Figure 2(b) shows the performance of the VM migrated by split migration. The execution time of a memory benchmark running in the VM increased to 2.2× due to memory encryption in remote paging.
Fig. 2. The encryption overhead during and after split migration: (a) CPU overhead during split migration (CPU utilization, unencrypted vs. encrypted); (b) VM performance after split migration (execution time, unencrypted vs. encrypted).
In terms of security, the memory data of a VM is still exposed at sub-hosts due to re-encryption. Since it is decrypted temporarily by an encrypted communication channel, attackers can eavesdrop on the memory data before the data is re-encrypted. In addition, the memory data held in sub-hosts can be easily decrypted if its decryption keys are stolen by the administrators of sub-hosts.
3 SEmigrate

This paper proposes SEmigrate for optimizing memory encryption in split migration and remote paging. As illustrated in Fig. 3, SEmigrate avoids decrypting the memory data of a VM at sub-hosts to reduce encryption overhead and completely prevent information leakage. To further reduce the overhead, it selectively encrypts only the memory that contains sensitive information. To use memory attributes and process information for these optimizations, it analyzes the memory of the guest OS in a VM using VMI [3].
Fig. 3. The optimization of the encryption of memory data in SEmigrate.
3.1 No Decryption at Sub-hosts

SEmigrate avoids the decryption of the memory data of a VM at sub-hosts. Upon split migration, the source host encrypts memory data, while only the destination main host decrypts it. The destination sub-hosts hold it without decrypting it. To enable decrypting the encrypted memory data later, SEmigrate uses an encryption key that is available through the life cycle of a VM. Therefore, it does not use an encrypted communication channel that is established only for VM migration. This can reduce the overhead of the re-encryption of memory data as well as that of the decryption. Also, this can prevent the information leakage caused by temporarily decrypting memory data in an encrypted communication channel. Since the sub-hosts do not manage the decryption keys of the encrypted memory data, even the administrators of the sub-hosts cannot decrypt or eavesdrop on the memory data. Upon remote paging, SEmigrate encrypts and decrypts the memory data of a VM only at the main host. For a page-in, a sub-host does not decrypt the memory data requested by the main host. Then, it can securely transfer encrypted memory data to the main host without using an encrypted communication channel. The main host decrypts the received memory data and uses it. For a page-out, the main host encrypts unnecessary memory data without using an encrypted communication channel and then securely transfers it to a sub-host. The sub-host holds it without decrypting it. SEmigrate uses the encryption and decryption keys shared in split migration to encrypt and decrypt memory data at the main host in remote paging.

3.2 Selective Encryption Based on In-VM Information

SEmigrate selectively encrypts the memory data of a VM at the source host and the main host to reduce encryption overhead. Upon split migration, the source host encrypts only the memory data that contains sensitive information, while it does not encrypt the other memory data. The destination main host decrypts the received data only if the memory data is encrypted. The destination sub-hosts hold it without encrypting it even if the memory data is not encrypted. Since the memory data does not contain sensitive information, it does not need encryption at sub-hosts either. Upon remote paging, a sub-host transfers the memory data requested for a page-in to the main host without encrypting it even if the memory data is not encrypted. The main host decrypts the received data only if the memory data is encrypted. For a page-out, the main host transfers unnecessary memory data to a sub-host without encrypting it if
the memory data does not contain sensitive information. The sub-host holds it without encrypting it even if the memory data is not encrypted. SEmigrate considers various memory regions of the guest OS in a VM not to be sensitive. For example, the memory regions that are not used by the guest OS do not contain sensitive information. Therefore, SEmigrate does not encrypt such free memory. When memory data is transferred, SEmigrate obtains its memory attribute from the guest OS using VMI and examines whether it is free memory or not. Since sensitive information may be left in free memory after used memory is released, SEmigrate transfers zero-filled data instead of the actual data of free memory. When the user specifies an application that does not deal with sensitive information in a VM, SEmigrate does not encrypt the memory of the corresponding processes used to execute that application. For example, if an in-memory database deals with only encrypted data, its memory data does not need to be further encrypted when it is transferred. When memory data is transferred, SEmigrate finds the memory region to which that data belongs and identifies the process that owns that memory region. If the name of that process is equal to the specified one, SEmigrate does not encrypt the memory data to be transferred.
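The per-page decision of Sects. 3.1 and 3.2 can be summarized by the following sketch. It is written in Python for readability (the actual system works inside QEMU-KVM with OpenSSL, as described in Sect. 4), and the whitelist name is hypothetical:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

PAGE_SIZE = 4096
KEY = os.urandom(16)              # stands in for the VM-lifetime 128-bit key
NON_SENSITIVE_APPS = {'inmemdb'}  # user-specified non-sensitive applications

def prepare_page(data, is_free, owner_process):
    """Return (encrypted?, payload) for one guest page to be transferred."""
    if is_free:
        # Free memory may hold stale secrets, so send zero-filled data instead.
        return False, b'\x00' * PAGE_SIZE
    if owner_process in NON_SENSITIVE_APPS:
        return False, data        # specified application: no encryption needed
    enc = Cipher(algorithms.AES(KEY), modes.ECB()).encryptor()
    return True, enc.update(data) + enc.finalize()  # 4096 is a multiple of 16
```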
4 Implementation

We have implemented SEmigrate in QEMU-KVM 2.11.2, which supports split migration and remote paging. We assume Linux 4.18 as the guest OS for applying VMI in selective encryption, but other versions of Linux can also be used. We used AES-ECB with AES-NI in OpenSSL and a 128-bit key for memory encryption. To transparently analyze and obtain information on the guest OS using VMI, we have ported the LLView framework [5] to QEMU-KVM. LLView enables writing programs for VMI using the source code of the Linux kernel. It transforms a written program at compile time so that the program accesses the memory of a VM when it accesses the data of the guest OS. When SEmigrate transfers a 4-KB memory page of a VM, it checks whether that page is included in a free memory region. The Linux kernel allocates contiguous 2^n physical pages at once using the buddy system. Also, it manages free memory pages as free memory regions that consist of 2^n pages. SEmigrate analyzes the memory of the VM using VMI and finds the page structure corresponding to the target page in the Linux kernel. On the basis of the information stored in that page structure, it can determine whether the target page is free. SEmigrate also checks whether a page to be transferred is part of the specified application process. For this purpose, it finds the process that owns the target page. First, it finds the page structure corresponding to the target page and obtains the index in the red-black tree used for managing virtual memory areas from that page structure. Using this index, SEmigrate searches the red-black tree and finds the target virtual memory area. Next, it finds the process owning that area and obtains its process name from the task_struct structure. If the process name is equal to the specified one, SEmigrate determines that the target page is part of the process memory that should not be encrypted.
5 Experiments

We conducted several experiments to examine the performance improvement by SEmigrate in encrypted split migration and remote paging. For comparison, we used two methods: one that encrypted all the memory data (AllEnc) and one that encrypted no memory data (NoEnc). For the source host and the destination main host, we used two PCs with an Intel Core i7-8700 processor, 64 GB of memory, and 10 Gigabit Ethernet (GbE). For the destination sub-host, we used a PC with an Intel Xeon E3-1226 v3 processor, 32 GB of memory, and 10 GbE. We created a VM with one virtual CPU and 20 GB of memory. On all of the PCs and in the VM, we ran Linux 4.18. Upon split migration, we equally divided the memory of the VM. To examine the performance improvement in split migration, we ran an application using 10 GB of memory in the VM and specified that its memory did not need to be encrypted. As shown in Fig. 4(a), the migration time was almost unchanged among the three methods. However, SEmigrate could reduce the CPU utilization during split migration. Figure 4(b) shows that the reduction was 6% point, 13% point, and 20% point at the source host, the destination main host, and the destination sub-host, respectively. The CPU utilization of SEmigrate was almost the same as that of NoEnc at the destination hosts. This is because the destination hosts did not decrypt most of the memory data thanks to the optimization of memory encryption. In contrast, the CPU utilization was higher than NoEnc at the source host, probably due to the overhead of selective encryption.
Fig. 4. The performance of split migration: (a) migration time; (b) CPU utilization at the source host, main host, and sub-host, each for AllEnc, SEmigrate, and NoEnc.
To examine the performance of the VM after split migration, we measured the execution time of a memory benchmark running in the VM. This benchmark allocated 10 GB of memory and repeatedly wrote one byte per page to cause a lot of remote paging. As shown in Fig. 5(a), SEmigrate could reduce the execution time by 47% thanks to the optimization of memory encryption in remote paging. The performance was comparable to that of NoEnc. Figure 5(b) shows the CPU utilization at the main host and the sub-host. It was almost unchanged at the main host, whereas SEmigrate could reduce it by 26% point at the sub-host. This is because the sub-host avoided encryption and decryption completely.
Fig. 5. The performance of a memory benchmark in the migrated VM: (a) execution time; (b) CPU utilization at the main host and sub-host, each for AllEnc, SEmigrate, and NoEnc.
6 Related Work

For traditional VM migration, various optimizations using VMI have been proposed. MiG [6] obtains memory attributes from the guest OS in a VM and optimizes the compression algorithm for memory data using the obtained information. For example, it transfers only page frame numbers for free memory. It compresses the heap area using gzip because of its high redundancy. Such optimizations can reduce the amount of transferred memory by 51–61% and halve the migration time. However, MiG first saves all the state of a VM and then compresses the memory data. Therefore, unlike SEmigrate, it does not support live migration, which migrates a VM without stopping it. IntroMigrate [2] identifies the free memory of the guest OS in a VM and avoids transferring it to shorten the migration time. It obtains information on the entire free memory at the beginning of VM migration. Therefore, it can apply the optimization based on old information to memory that becomes in use or free during the migration. It relies on the re-transfer mechanism to transfer the memory data that became in use. In the optimization of memory encryption, using such old information can cause information leakage. Similar work [9] avoids transferring the page cache of the guest OS as well as free memory. The secure virtualization architecture [4] and VMCrypt [8] prevent information leakage from the memory of a running VM. They provide the unencrypted version of the memory to the VM, while they provide the encrypted version to the management VM used by the administrators. Upon VM migration, they obtain and transfer the encrypted memory of a VM in the management VM. SEmigrate can apply such memory protection mechanisms to the source host and the destination main host to prevent information leakage unless the hypervisor is compromised. CloudVisor [10] can protect the memory of a VM without relying even on the hypervisor by running a security monitor under the hypervisor.
7 Conclusion

This paper proposed SEmigrate for optimizing memory encryption in split migration and remote paging. SEmigrate avoids decrypting the memory data of a VM to reduce
encryption overhead and completely prevent information leakage at sub-hosts. In addition, it selectively encrypts only the memory data that contains sensitive information to further reduce encryption overhead. To enable this, it analyzes the memory of the guest OS and its applications in a VM using VMI and identifies free memory and the specified applications. We have implemented SEmigrate in KVM and showed that SEmigrate could reduce the CPU utilization during encrypted split migration by 6–20% point. Also, it could improve the performance of the migrated VM with encrypted remote paging to 1.9×. One direction for future work is to extend the target of our selective encryption to other memory regions, e.g., the code area and specified data in applications. To use application-specific information, SEmigrate needs to analyze the memory of applications as well as that of the OS kernel. After that, we are planning to apply SEmigrate to real applications and confirm that SEmigrate can perform selective encryption using application-specific information. Then, we will show the performance improvement in encrypted split migration and remote paging.

Acknowledgements. The research results have been achieved by the "Resilient Edge Cloud Designed Network (19304)," the Commissioned Research of the National Institute of Information and Communications Technology (NICT), Japan.
References

1. Amazon Web Services Inc: Amazon EC2 High Memory Instances (2019). https://aws.amazon.com/ec2/instance-types/high-memory/
2. Chiang, J., Li, H., Chiueh, T.: Introspection-based memory de-duplication and migration. In: Proceedings of the ACM International Conference on Virtual Execution Environments, pp. 51–62 (2013)
3. Garfinkel, T., Rosenblum, M.: A virtual machine introspection based architecture for intrusion detection. In: Proceedings of the Network and Distributed Systems Security Symposium, pp. 191–206 (2003)
4. Li, C., Raghunathan, A., Jha, N.: A trusted virtual machine in an untrusted management environment. IEEE Trans. Serv. Comput. 5(4), 472–483 (2012)
5. Ozaki, Y., Kanamoto, S., Yamamoto, H., Kourai, K.: Detecting system failures with GPUs and LLVM. In: Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems, pp. 47–53 (2019)
6. Rai, A., Ramjee, R., Anand, A., Padmanabhan, V., Varghese, G.: MiG: efficient migration of desktop VMs using semantic compression. In: Proceedings of the USENIX Annual Technical Conference, pp. 25–36 (2013)
7. Suetake, M., Kashiwagi, T., Kizu, H., Kourai, K.: S-memV: split migration of large-memory virtual machines in IaaS clouds. In: Proceedings of the IEEE International Conference on Cloud Computing, pp. 285–293 (2018)
8. Tadokoro, H., Kourai, K., Chiba, S.: Preventing information leakage from virtual machines' memory in IaaS clouds. IPSJ Online Trans. 5, 156–166 (2012)
9. Wang, C., Hao, Z., Cui, L., Zhang, X., Yun, X.: Introspection-based memory pruning for live VM migration. Int. J. Parallel Program. 45(6), 1298–1309 (2017)
10. Zhang, F., Chen, J., Chen, H., Zang, B.: CloudVisor: retrofitting protection of virtual machines in multi-tenant cloud with nested virtualization. In: Proceedings of the ACM Symposium on Operating Systems Principles, pp. 203–216 (2011)
Blockchain Simulation Environment on Multi-image Encryption for Smart Farming Application
Irawan Widi Widayat and Mario Köppen
Faculty of Computer Science and Systems Engineering (CSSE), Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka-shi, Fukuoka 820-8502, Japan
[email protected], [email protected]
Abstract. The certainty of food provenance in the agriculture ecosystem has made smart farming one of the most popular research topics. The global pandemic has forced consumers to pay attention to the originality of their food supply, and various technologies are being unified to address this problem. Heterogeneous IoT sensors that sense agricultural or plantation land provide system automation and monitoring for certain commodities. Sensor readings and the images captured by surveillance cameras are the output of these IoT devices' sensing capability. However, the authenticity of the stored data can be questioned; even worse, the data records can be deleted. Such problems occur for several reasons: network congestion, device reliability, storage media, and operator error. The approach proposed in this research stores the sensing results from each sensor and surveillance camera in a global database that is decentralized, immutable, and synchronized, known as the blockchain. All the parties involved in the smart farming system, such as farmers, food suppliers, and customers, are connected to a global blockchain network. Multi-Image Encryption (MIE) is used to secure the authenticity of images captured from multiple cameras: an MIE algorithm compiles and randomizes the captured images to produce encrypted images that are stored in the blockchain database with unique identifiers. This study provides a simulated model of blockchain technology that can be implemented in a smart farming environment, using Ganache as the test net. In the smart contract, every entity connected to the blockchain network appears as a node account that is digitally assigned a role, and transactions were successfully executed between nodes. This research is the initial stage of implementing a smart farming system that unifies various technologies for the development of sustainable agriculture.
1 Introduction
Smart farming has become a popular term in the intelligent agriculture ecosystem. Many researchers worldwide have applied various emerging computing technologies to maintain the sustainable development of agricultural cultivation. Food safety is one of the topics that has recently attracted the most global attention: the Mad Cow epidemic in the United Kingdom in 1996, SARS in Hong Kong in
2003, the global H1N1 (swine influenza) pandemic in 2009, and currently the most challenging case, the COVID-19 global pandemic. Traceability of food provenance helps consumers trace the origin of their food supply [1]. The unification of IoT devices and blockchain technology has been applied to solve several difficulties and challenges in intelligent farming. IoT devices, in the form of many types of sensors, are utilized in intelligent farming to gather, store, and analyze agricultural data [2]. Moreover, they can strengthen the effectiveness and performance of all users, who obtain information immediately without the barriers of place and time zone [3]. Various efforts have been made to maintain sustainable agricultural cultivation by preserving traditional farming methods while utilizing modern technology. The use of various types of sensor devices, together with computational devices that monitor land and agricultural commodities, is one way smart farming is implemented [3, 4]. Furthermore, various systems have been developed for food-traceability supply chains [1]. Such systems utilize blockchain technology to support the availability and originality of an agricultural commodity. Blockchain, in general, is an immutable database that is distributed and duplicated over many computers [5]. The key concept that lets blockchain nodes communicate with each other through message codes is cryptography. The innovative aspect of blockchain as a network database is its ability to handle transactions consistently even when the nodes on the network receive those transactions in different orders. Blockchain combines three technologies: peer-to-peer networking, asymmetric cryptography, and cryptographic hashing. Even large companies like IBM have developed blockchain solutions for food supply availability [6]. Consolidating various IoT sensors with blockchain for various purposes is a ubiquitous and growing technology trend, and security monitoring devices such as surveillance cameras and numerous types of IoT sensors [7] are no exception. Securing the captured image data, which is then stored in the blockchain database, is the central goal of the research reported here. An extensive database is required for the images captured by the monitoring cameras that an intelligent farming system uses to observe the growth of a plant or commodity. Furthermore, to discover whether a captured image is the original or has been modified, various image encryption techniques developed to date can be used; one of them is multi-image encryption. A multi-image encryption technique using a Piecewise Linear Chaotic Map (PWLCM) [8] can encrypt the images captured by a surveillance camera. This technique consolidates several original images into one large image and then divides it into sections of original image elements. These image elements are randomized using a chaotic sequence generated by the PWLCM and recombined into one random form. The next step divides the combined, randomized large image into several small images of the same size as the originals. Finally, the small images are encrypted and named according to filenames generated by another PWLCM system.
The resulting encrypted images are then forwarded to the blockchain network for storage, ensuring that the database of original images cannot be tampered with.
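As an illustration of the shuffling step just described, the following is a minimal TypeScript sketch of PWLCM-driven block permutation. The control parameter p, the seed, and the block count are hypothetical example values, not the authors' implementation or key material.

```typescript
// Minimal sketch of PWLCM-driven image-block shuffling (hypothetical parameters).
function pwlcm(x: number, p: number): number {
  // Piecewise linear chaotic map, symmetric about 0.5:
  // f(x) = x/p on [0, p), (x - p)/(0.5 - p) on [p, 0.5), f(1 - x) otherwise.
  const y = x < 0.5 ? x : 1 - x;
  return y < p ? y / p : (y - p) / (0.5 - p);
}

// Derive a key-dependent permutation of n block indices by sorting the
// chaotic sequence and reading off the original positions.
function chaoticPermutation(n: number, seed: number, p: number): number[] {
  const pairs: Array<{ value: number; index: number }> = [];
  let x = seed;
  for (let i = 0; i < n; i++) {
    x = pwlcm(x, p);
    pairs.push({ value: x, index: i });
  }
  pairs.sort((a, b) => a.value - b.value);
  return pairs.map((pair) => pair.index);
}

// Reorder the image blocks according to the permutation.
function shuffleBlocks<T>(blocks: T[], perm: number[]): T[] {
  return perm.map((src) => blocks[src]);
}

// Example: permute 16 blocks with the (hypothetical) key (seed = 0.3456, p = 0.271).
const perm = chaoticPermutation(16, 0.3456, 0.271);
console.log(shuffleBlocks([...Array(16).keys()], perm));
```

Because the permutation is derived entirely from the map's seed and parameter, the same key regenerates the inverse permutation for decryption.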
This paper is organized as follows: Sect. 2 covers previous experiments by researchers whose work correlates with this research; Sect. 3 discusses the various tools used and their specifications, both applications and devices, as well as the proposed framework design; Sect. 4 discusses the results of the experimental phase of the research; and Sect. 5 provides insight into various development issues in smart farming systems that are answered by this research.
2 Related Works
A database that cannot be modified, distributed across many nodes globally, with virtually unlimited storage capacity: these are the most compelling features of blockchain [9]. The agriculture sector likewise takes part in utilizing this technology. Blockchain can be used to ascertain the origin of food ingredients; it helps provide confidence in the food supply chain and increases trust between food producers (farmers) and consumers [10]. Adopting IoT devices in intelligent farming adds value by improving the effectiveness and operational productivity of agricultural production. For instance, sensors measuring pH, temperature, and soil moisture placed on agricultural land help farmers monitor acidity and soil moisture, and these data can be stored for later analysis to produce products of high-grade quality. In addition to sensors placed on agricultural land, IoT devices can also acquire data on a commodity through a surveillance camera carried by a drone [11]. Image processing is applied to assess the quality of an agricultural or plantation commodity, as done in [12, 13]: images taken periodically by the camera are stored and analyzed with a specific algorithm to determine the quality of the commodity. However, an image may be edited or, in the most extreme case, deleted. Thus, an effort is needed to ensure the originality of the image, and a storage medium that is immutable to change is needed. Combining image processing and blockchain technology can solve these problems [14]. This study examines a blockchain environment simulation on a test net using Ganache. Ganache has also been used as a blockchain test-net tool for monitoring and controlling UAV devices such as drones and other IoT devices [15]. The final result of this study is expected to apply multi-image encryption to images captured from several surveillance cameras, which are then stored on the blockchain with unique identifiers.
3 Materials and Methods
As the goal of this research is to thoroughly test a blockchain network simulation for an intelligent farming system, several existing simulation applications are used, as described in the materials and methods below. The materials section covers the tools used throughout this research, including applications and device specifications, while the methods section covers the proposed framework and the implementation steps.
3.1 Materials
The application simulation utilized Parity combined with Geth [16], and the Truffle framework combined with Geth, which has become a top-rated tool known as Ganache-IDE [17]. These tools can be installed quickly and are available for various operating systems. Running a blockchain test-net simulation also does not require a high-performance device, so it runs smoothly on desktop or laptop devices with a current version of Windows, macOS, or Linux [18]. This research was conducted on a laptop with the following specifications:
Hardware:
– Processor: Intel Core i7, 4 cores
– Memory: 8 GB RAM
– Storage: 100 GB
The software needed in this experiment is as follows:
– Operating system: Linux Ubuntu 20.04 (64-bit)
– Ganache v2.5.4
– Remix-IDE
– Metamask Ethereum Wallet
– Brave web browser
– Web3.js library compilation
In general, Ganache provides 10 test accounts with distinct addresses [19], shown in Table 1, for executing transactions on the local Ethereum network. In addition, each account holds 100 ETH, used to pay the value (fee) of each transaction; in this environment, these values can be reset repeatedly. Remix [20] is a compiler tool used to create and manage smart contracts in the Solidity programming language; contract code can be written online or offline. Smart contracts can be written in various high-level programming languages such as Solidity or Vyper [21]. As in trading transactions, blockchains, and Ethereum in particular, also have a scheme for storing the results of a node's efforts to produce a block, depicted as a digital wallet [22, 23]. Digital wallets on the blockchain are also used to receive and send ether to and from other blockchain users. Metamask [24] is a widely used digital wallet application, often installed on mobile devices; it makes it easy to send and receive Ethereum tokens from both desktop and mobile devices. The accounts in Table 1 can be impersonated as the parties involved in the smart farming process, such as farmers, suppliers, and customers. Each account address is generated automatically by Ganache according to the account addressing scheme of the Ethereum blockchain: a 160-bit identifier written as 40 hexadecimal characters (42 characters including the 0x prefix) [22, 23]. The Ethereum blockchain uses this account address to send and receive ether (the Ethereum cryptocurrency).
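For illustration, here is a minimal Web3.js sketch that connects to the local test net and lists the ten funded accounts with their balances. The RPC endpoint (http://127.0.0.1:7545, Ganache's usual default) is an assumption and may differ per setup.

```typescript
// Minimal sketch: list the Ganache test accounts and their ether balances.
import Web3 from 'web3';

const web3 = new Web3('http://127.0.0.1:7545'); // assumed Ganache RPC endpoint

async function listAccounts(): Promise<void> {
  const accounts = await web3.eth.getAccounts();
  for (const address of accounts) {
    const balanceWei = await web3.eth.getBalance(address); // returned in wei
    console.log(address, web3.utils.fromWei(balanceWei, 'ether'), 'ETH');
  }
}

listAccounts().catch(console.error);
```

Against a fresh Ganache workspace this would print the ten addresses of Table 1, each with a balance of 100 ETH.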
Table 1. Account addresses for the test environment.

Node number | Account address                            | Roles
Node1       | 0x860f475a59835b4f14f47b8f0952e5136b78bf3d | Farmer1
Node2       | 0x3a695450f465d5b67e559a2129259e3d6b19802f | Farmer2
Node3       | 0xf5addfeef7c938590a7f6201d31dd6f500a99f4e | Customer1
Node4       | 0x6a4ab3b0f2ecba7aba8b1025f1796371df5b3a27 | Customer2
Node5       | 0x9aa3df334ac5fe5489e8b95849ddcaa0a3f85d66 | Supplier
Node6       | 0x019e80eb39de6a566c778cc40a71bccac4cbeaf4 | Sensor1
Node7       | 0xf76fc52fc7b998d35d9a9904af385a3a8694aacb | Sensor2
Node8       | 0xc7fda2af5bea337672d34922101da7b14a121e9c | Camera1
Node9       | 0x4f1bf68040d0395c0f359224937fcf3b3cc92c5c | Camera2
Node10      | 0xaa14bd471b893e61bf7d5e01d9275aedec505886 | Controller
3.2 Methods
Test transactions for blockchain smart contracts are executed in a local environment on the Ethereum Virtual Machine (EVM), as depicted in Fig. 1. The EVM is the virtual machine used to execute any smart contract code created on the Ethereum blockchain, and it is an integral part of the blockchain. Its principal responsibility is to interpret every command executed in a smart contract and to ensure that the contract is executed on all nodes and produces the same results. The EVM is a virtual machine based on a stack architecture: a stack machine holds operands temporarily using a last-in-first-out discipline. Each stack item on the EVM is 256 bits wide, and the stack holds a maximum of 1024 items. The EVM also has a temporary, byte-addressable memory for the word data it forms. Program code is held separately in read-only storage (ROM) and can only be accessed through specific instructions.
Fig. 1. Local environment on Ethereum Virtual Machine (EVM).
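To make the stack discipline concrete, the following is a toy TypeScript sketch of the behavior just described: 256-bit words, a depth limit of 1024, and last-in-first-out operand handling. It is only an illustration of the principle, not the real EVM.

```typescript
// Toy model of a 256-bit-word, depth-1024 stack machine (illustrative only).
const WORD_MASK = (1n << 256n) - 1n;

class ToyStack {
  private items: bigint[] = [];

  push(value: bigint): void {
    if (this.items.length >= 1024) throw new Error('stack overflow');
    this.items.push(value & WORD_MASK); // truncate to a 256-bit word
  }

  pop(): bigint {
    const value = this.items.pop();
    if (value === undefined) throw new Error('stack underflow');
    return value;
  }
}

// An ADD-style operation: pop two operands (last in, first out),
// then push their sum modulo 2^256.
const stack = new ToyStack();
stack.push(2n);
stack.push(3n);
stack.push((stack.pop() + stack.pop()) & WORD_MASK);
console.log(stack.pop()); // 5n
```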
An implementation framework unifying the multi-image encryption algorithm and all IoT sensors within the Ethereum blockchain network for an intelligent farming solution is proposed in this study, as shown in Fig. 2. This research uses Ganache as a local Ethereum test network instead of the public Ethereum network to simplify the test stage. The main components of the smart farming concept shown in Fig. 2 are:
– Sensor devices, such as temperature and humidity sensors, soil pH sensors, and surveillance cameras.
– Parties included in the system, such as farmers, suppliers, customers, the digital contractor, and the digital verifier.
– Computational devices, such as the microcontroller system and cloud and storage computing.
– Application and library requirements, such as the mobile device application, machine-to-machine communication libraries, and the blockchain web application.
Fig. 2. Simulation diagram implementation of the local Ethereum network.
In the scheme of Fig. 2, the IoT controller, sensors, and cameras sit on the farm side, while the blockchain network comprises all devices connected to the blockchain. The sensing results from sensors and surveillance cameras are encrypted by the controller and then forwarded to the blockchain network. At this stage, the controller communicates with the blockchain network using machine-to-machine communication. If the controller is registered as one of the blockchain nodes, the previously encrypted data is forwarded by the microservice application; this stage can also be referred to as node verification by the blockchain. Each transaction requested by a node is then verified by a smart contract, which calculates the transaction cost based on a digitally agreed contract. Multiple test nodes can run on a single PC or server using a virtual machine (VM) engine. The VM engine has a hypervisor whose job is to share and allocate device resources (such as CPU, memory, network, and storage) owned by the server
among the installed guest machines. Two virtual machines are used as mining nodes, and the EVM is automatically installed on each of them. After all submitted transactions are verified, the EVMs start working and contend against each other to create blocks for each transaction. The node that wins the race, with the transaction also verified, successfully creates a new block on the ledger and is rewarded for the effort. Every party in the blockchain network has its own role, as shown in Table 1, and each role relates to a value defined in the smart contract, also known as the transaction cost. The transaction value can be calculated from several determinants, such as the gas limit, the gas used, and the gas price. Gas is the unit of cost used by a transaction. The gas price is the amount of ether paid per unit of gas, assessed in "gwei"; the smallest fraction of ether is called "wei," so 1 ether equals 1 × 10^9 (1,000,000,000) gwei, or 1 × 10^18 (1,000,000,000,000,000,000) wei. The transaction fee can be calculated using the following equation [22, 23]:

Gas used × Gas price = Transaction fee    (1)
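As a worked example of Eq. (1): a plain ether transfer consumes 21,000 gas, and the 20 gwei gas price below is an assumed example value.

```typescript
// Worked example of Eq. (1): transaction fee = gas used × gas price.
const gasUsed = 21_000n;                 // gas consumed by a plain ETH transfer
const gasPriceWei = 20n * 10n ** 9n;     // assumed gas price: 20 gwei, in wei
const feeWei = gasUsed * gasPriceWei;    // 420,000,000,000,000 wei
console.log(`fee = ${feeWei} wei = ${Number(feeWei) / 1e18} ETH`); // 0.00042 ETH
```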
4 Results
The Ganache local test-net experiment yielded several results. Figure 3 shows Ganache automatically running the transaction process upon receiving a transaction request from a connected node; this behavior corresponds to the auto-mining setting. Alternatively, Ganache can consistently generate blocks at a set interval even when there are no transactions: Fig. 4 shows the block-building process being performed every 30 s (per the Ganache setting). These transactions can be pictured as sensors located in the farming area producing data for the blockchain.
Fig. 3. Result 1: auto-mining process when receiving a transaction.
Details of the transactions occurring in one of the blocks are shown in Fig. 5. Several parameters are displayed there, such as the sender address, the recipient address, the contract creator address, the gas price, the gas used, the encrypted transaction data, and information about the block in which the transaction took place. Figures 5 and 6 show a sequence of transactions on the blockchain network: Fig. 5 shows a node requesting a transaction on the blockchain network, and the result of the transaction is shown in Fig. 6.
Fig. 4. Result 2: Ethereum mining process running at 30-s intervals.
Fig. 5. Result 3: sender request for a transaction.
Figure 7(a) shows the farming transaction process: a blockchain account (customer) sends ether tokens to another blockchain account (supplier). This transaction describes one of the supply chain processes, in which customers buy farm products from suppliers. It can occur when a verified blockchain account (one of the parties in the smart farming system) uses a digital wallet to send a certain amount of ether to another account, as shown in Fig. 7(b). Using the Metamask wallet, account 1 (customer) sends a total of 5 ether tokens to account 2 (supplier or farmer). The results of the transaction are shown in Fig. 8(a) and (b). The recipient's balance is automatically increased by the amount of ether sent (see Fig. 8(a)), while the sender's balance is reduced by the same amount plus the transaction cost, per Eq. (1). The digital wallet web application used by farmers, customers, and suppliers during transactions is depicted in Fig. 8.
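The same 5 ETH transfer can also be issued programmatically with Web3.js against the Ganache test net rather than through the Metamask UI; a minimal sketch follows, again assuming Ganache's usual default RPC endpoint.

```typescript
// Minimal sketch of the 5 ETH customer-to-supplier transfer via web3.js.
import Web3 from 'web3';

const web3 = new Web3('http://127.0.0.1:7545'); // assumed Ganache RPC endpoint

async function sendFiveEth(): Promise<void> {
  // Account 1 plays the customer, account 2 the supplier, as in the text.
  const [customer, supplier] = await web3.eth.getAccounts();
  const receipt = await web3.eth.sendTransaction({
    from: customer,
    to: supplier,
    value: web3.utils.toWei('5', 'ether'),
  });
  console.log('mined in block', receipt.blockNumber, '- gas used:', receipt.gasUsed);
}

sendFiveEth().catch(console.error);
```

The receipt's gas usage, multiplied by the gas price, gives exactly the transaction cost deducted from the sender's balance in Fig. 8.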
Fig. 6. Result 4: Smart contract creation on Ganache.
Fig. 7. Farming transaction process. (a) Transaction details of a blockchain account. (b) Transaction details using the Metamask wallet.
Fig. 8. Transactions using the Metamask wallet. (a) Sending ETH. (b) Recipient's total wallet balance.
5 Discussion and Conclusions
In this study, we applied a local environment to achieve blockchain transactions in a smart farming model, using Ganache as an Ethereum blockchain tool. The experiment takes several steps [5, 6, 22], including setting up a simulation environment,
generating transaction-validation code in the form of a smart contract using Remix-IDE, examining transactions to correlate the sender, receiver, and validator, and evaluating whether Ganache successfully carried out the transactions as a simulation of the Ethereum blockchain network. The framework proposed in this study answers several challenges in strengthening the implementation of smart farming applications, especially in:
– Security. Data security is one of the most intrinsic elements in applying blockchain technology to various fields, especially for critical data coming from IoT devices such as sensors and surveillance cameras in the smart farming system. The data encryption process (for image and sensor data) on the controller is a preprocessing stage before the data is passed to the blockchain network. The encrypted information is then forwarded over a channel secured with the secure shell or web protocol. After the verification process succeeds, the data is stored in an immutable, distributed, decentralized, and synchronized database [7, 8].
– Traceability. Customers in the system can easily access the origin of foodstuffs sold in the market without worrying about data manipulation, and the secrecy of the original data remains safe. Distributors are also helped by a shortened distribution system that provides the latest status of foodstuffs at each distribution stage. Moreover, farmers finally learn the market demand for their products and can improve product quality through continuously renewed data [1, 3].
– Availability. The availability of blockchain as a globally distributed database is supported by the robustness of decentralized technology. Data verified as a transaction automatically synchronizes to all nodes connected to the blockchain network; therefore, data availability is maintained and the data cannot be edited [4]. All sensing devices produce continuous data, which requires dependable storage and network resilience to guarantee the high availability of the intelligent farming system.
Future work on this research will test the multi-image cryptography [8, 10, 11] of the captured images linked to the Ethereum blockchain test net and IoT devices, addressing one of the technological problems of developing sustainable agriculture [1–3].
Acknowledgments. The authors gratefully acknowledge the support by the Kyushu Institute of Technology.
References
1. Demestichas, K., Peppes, N., Alexakis, T., Adamopoulou, E.: Blockchain in agriculture traceability systems: a review. Appl. Sci. (2020). https://doi.org/10.3390/app10124113
2. Walter, A., Finger, R., Huber, R., Buchmann, N.: Opinion: smart farming is key to developing sustainable agriculture. Proc. Natl. Acad. Sci. U.S.A. 114, 6148–6150 (2017). https://doi.org/10.1073/pnas.1707462114
3. Klerkx, L., Jakku, E., Labarthe, P.: A review of social science on digital agriculture, smart farming and agriculture 4.0: new contributions and a future research agenda. NJAS Wagening. J. Life Sci. 90–91, 100315 (2019). https://doi.org/10.1016/j.njas.2019.100315
4. Wolfert, S., Ge, L., Verdouw, C., Bogaardt, M.J.: Big data in smart farming – a review. Agric. Syst. 153, 69–80 (2017). https://doi.org/10.1016/j.agsy.2017.01.023
5. Dannen, C.: Introducing Ethereum and Solidity: Foundations of Cryptocurrency and Blockchain Programming for Beginners. Springer, New York (2017). https://doi.org/10.1007/978-1-4842-2535-6
6. IBM Food Trust. https://www.ibm.com/blockchain/solutions/food-trust. Accessed 10 May 2021
7. Mohanty, S.N., et al.: An efficient lightweight integrated blockchain (ELIB) model for IoT security and privacy. Future Gener. Comput. Syst. 102, 1027–1037 (2020). https://doi.org/10.1016/j.future.2019.09.050
8. Zhang, X., Wang, X.: Multiple-image encryption algorithm based on mixed image element and chaos. Comput. Electr. Eng. 62, 401–413 (2017). https://doi.org/10.1016/j.compeleceng.2016.12.025
9. Song, H., Chen, J., Lv, Z.: Design of personnel big data management system based on blockchain. Future Gener. Comput. Syst. 101, 1122–1129 (2019). https://doi.org/10.1016/j.future.2019.07.037
10. Hang, X., Tobias, D., Puqing, W., Jiajin, H.: Blockchain technology for agriculture: applications and rationale. Front. Blockchain 3, 7 (2020). https://doi.org/10.3389/fbloc.2020.00007
11. Lottes, P., Khanna, R., Pfeifer, J., Siegwart, R., Stachniss, C.: UAV-based crop and weed classification for smart farming. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 3024–3031 (2017). https://doi.org/10.1109/ICRA.2017.7989347
12. Sa, I., et al.: weedNet: dense semantic weed classification using multispectral images and MAV for smart farming. IEEE Robot. Autom. Lett. 3(1), 588–595 (2018). https://doi.org/10.1109/LRA.2017.2774979
13. Udendhran, R., Balamurugan, M.: Towards secure deep learning architecture for smart farming-based applications. Complex Intell. Syst. 7(2), 659–666 (2020). https://doi.org/10.1007/s40747-020-00225-5
14. Bhowmik, D., Feng, T.: The multimedia blockchain: a distributed and tamper-proof media transaction framework. In: 22nd International Conference on Digital Signal Processing (DSP), pp. 1–5 (2017). https://doi.org/10.1109/ICDSP.2017.8096051
15. Rupa, C., Srivastava, G., Gadekallu, T.R., Maddikunta, P.K.R., Bhattacharya, S.: Security and privacy of UAV data using blockchain technology. J. Inf. Secur. Appl. 55, 102670 (2020). https://doi.org/10.1016/j.jisa.2020.102670
16. Parity website. https://github.com/openethereum/parity-ethereum. Accessed 10 May 2021
17. Ganache website. https://www.trufflesuite.com/ganache. Accessed 10 May 2021
18. Pustišek, M., Umek, A., Kos, A.: Approaching the communication constraints of Ethereum-based decentralized applications. Sensors 19(11), 2647 (2019). https://doi.org/10.3390/s19112647
19. Ethereum website. https://ethereum.org/en/developers/docs/accounts. Accessed 10 May 2021
20. Remix website. https://remix.ethereum.org/. Accessed 10 May 2021
21. Almakhour, M., Sliman, L., Samhat, A.E., Mellouk, A.: Verification of smart contracts: a survey. Pervasive Mob. Comput. 67, 101227 (2020). https://doi.org/10.1016/j.pmcj.2020.101227
22. Kasireddy, P.: How does Ethereum work, anyway? https://preethikasireddy.medium.com/how-does-ethereum-work-anyway-22d1df506369. Accessed 10 May 2021
23. Solomon, M.G.: Ethereum for Dummies. Wiley, Hoboken (2019)
24. Metamask website. https://metamask.io/. Accessed 10 May 2021
Author Index
A
Akiyoshi, Shota, 295
Amato, Alessandra, 42, 49
Amato, Flora, 42, 49
Ampririt, Phudit, 96, 195
Angrisani, Leopoldo, 42
Awan, Suleman, 190
B
Barolli, Admir, 1, 195
Barolli, Leonard, 1, 42, 49, 96, 136, 195
Bhattacharjee, Kamanasish, 253
Bonavolontà, Francesco, 42, 49
Bylykbashi, Kevin, 96
C
Chantaranimi, Kittayaporn, 84
D
Doleží, Vít, 243
Du, Wei, 225
F
Futamura, Ryo, 209
G
Gajdoš, Petr, 243
H
Ha, Nguyen Viet, 283
Hirata, Aoto, 136
Hirota, Masaharu, 136
Horio, Shuhei, 307
Huseynov, Huseyn, 32
I
Ikeda, Makoto, 96
Inoue, Masato, 11
Izumi, Kiyotaka, 11
K
Kashimoto, Toshitaka, 169
Kaufmann, Tyson N., 70
Kawaguchi, Akira, 116, 283
Köppen, Mario, 125, 316
Kourai, Kenichi, 32, 307
Krömer, Pavel, 233
Kudelka, Milos, 264
L
Le, Thien T. T., 149
Lee, Myung, 295
Leung, Carson K., 70
M
Mantau, Aprinaldi Jasa, 125
Matsuo, Keita, 96
Miwa, Hiroyoshi, 23, 59, 274
Mori, Kosuke, 274
Mowshowitz, Abbe, 116, 283
N
Nakazawa, Kota, 274
Natwichai, Juggapong, 84
Neglia, Gianluca, 42
Nguyen, Nhat Tien, 149
Nishikawa, Takahiro, 23
Nowaková, Jana, 233, 253
O
Oda, Tetsuya, 136
Ogata, Masato, 11
Ogiela, Lidia, 112
Ogiela, Marek R., 107
Ogiela, Urszula, 107, 112
P
Pant, Millie, 253
Q
Qafzezi, Ermioni, 96
R
Radvansky Jr., Martin, 264
Radvansky, Martin, 264
Rahim, Lukman Ab., 307
Ren, Fan, 225
S
Saadawi, Tarek, 32
Saito, Masato, 59
Saito, Nobuki, 136
Sakamoto, Shinji, 1, 195
Sakumoto, Yusuke, 158, 169
Shibata, Masahiro, 283
Snasel, Vaclav, 253
Soussan, Tariq, 183
Sugunsil, Prompong, 84
T
Taenaka, Yuzo, 295
Takahashi, Kouta, 307
Takizawa, Makoto, 1, 112
Tamburis, Oscar, 42
Tiwari, Arti, 253
Toyoda, Fumiya, 158, 169
Toyoshima, Kyohei, 136
Trovati, Marcello, 183, 190
Tsujimura, Takeshi, 11
Tsukamoto, Kazuya, 295
Tsuru, Masato, 116, 283
U
Uchiya, Takahiro, 209
V
Voznak, Miroslav, 149
W
Wen, Yan, 70
Widayat, Irawan Widi, 125
Widi Widayat, Irawan, 316
X
Xu, Peng, 195
Y
Ye, Zhiyang, 218
Yu, Changliang, 218, 225
Z
Zdralek, Jaroslav, 149
Zhao, Chenru, 70
Zhao, Nan, 218, 225
Zheng, Hao, 70