Applied Systemic Studies 3031274695, 9783031274695

This book is a collection of a wide range of research papers that combine both the humanities and sciences in applied systemic studies.


English · Pages 265 [266] · Year 2023


Table of contents:
Contents
Artificial Intelligence
Decision-Support for Additive Repair Processes with a Multi-agent System
1 Introduction
2 Theoretical Background and Related Work
3 Multi-agent System as Decision-Support System for Additive Repair Processes
3.1 Methodical Procedure
3.2 Architecture of the Multi-agent System
4 Application Example
5 Discussion and Conclusion
References
AI Education for Middle/High School Level: A Proposal of a System that Supports Teachers to Design Their AI Lessons
1 Introduction
2 Existing Applications/Services that Support Teachers with Lesson Plans and AI Education
3 Purpose of the Proposed Application
4 Functions of the Application
4.1 Recommend AI Products in Accordance with AI Concepts that Teachers Need to Introduce
4.2 Automatic Update on Requirements of National Education Curriculum
4.3 Open Reference Resource and Integrated Search Engine
4.4 Teachers’ Social Networking and Co-designing Lesson Plans
5 How to Use the Application
6 Conclusion
7 Future Work
References
Sensor Data Restoration in Internet of Things Systems Using Machine Learning Approach
1 Introduction
2 Related Work
3 Solution Approach
4 Experiments and Results
5 Conclusions
References
Potential of eXtended Intelligence (XI) for Extending Human Expression from Digitization of Analog Elements
1 Introduction: Digital Scarcity?
2 Guarantee of Digital Value and Originality
3 Value of Analog Content and Value of Digital Content
4 Extended Intelligence (XI / EI) as a Post-AI
5 What is XI?
References
ML-Based System Failure Prediction Using Resource Utilization
1 Introduction
2 Related Work
3 Methodology
3.1 Data Preparation
3.2 Tracking Class Imbalance
3.3 Model Selection
4 Results
4.1 LSTM Model Accuracy
4.2 Experiment with XGBoost Model
5 Conclusion
References
Interactive System
NEOTEN¥ (NExt Optimum Trade ENvironment of ¥en): A Cryptocurrency with Market Capitalization Linked to the Japanese Yen, Non-speculative Crypto-Assets with Economic Security
1 Introduction
1.1 Background
1.2 Focus of Research
2 Purpose of Research: Non-speculative NEOTEN¥ with Economic Security
3 NEOTEN¥ Mechanism
3.1 System Structure of NEOTEN¥
3.2 User Definition by SmaCoN¥
3.3 Circulation Mechanism of NEOTEN¥
4 Expected NEOTEN¥ Use
4.1 NEOTEN¥ in Practical Use
4.2 Examination of NEOTEN¥ Usability
5 Conclusion
References
XI Guitar: Development of a Slide-Type Guitar Application that Reproduces the Playing Sensation of Analog Guitar
1 Introduction
2 Related Research
3 Slide-type Analog Guitar
3.1 Basic Design
4 Experiments on Accelerometer Accuracy
4.1 Purpose of the Experiment
4.2 Experimental Methods
4.3 Experimental Results
4.4 Primary Function Realization Mechanism
5 Implementation
6 Summary
7 Future Issues
7.1 Addition of Friction-Based Tactile Feedback
7.2 Prior Studies
7.3 Application to Analog Guitar
References
Audio Streams Synchronization for Music Performances
1 Introduction
2 Related Works
3 Multimedia Transmission Concept
4 Prototype Architecture
4.1 Audio Streams Synchronization
5 Efficiency Analysis
6 Summary
References
Research to Enhance the Sense of Presence Based on the Positional Relationship Between the Sound Source and the Distance
1 Background
2 Purpose
3 Definition of Reality
4 Acoustic System that Enhances the Sense of Reality
5 Experiment 1
5.1 Overview
5.2 Results/Discussion
6 Experiment 2
6.1 Overview
6.2 Results/Discussion
7 Summary
References
Development of an Image Processing Application that Represents Onomatopoeia
1 Background
2 Purpose
3 Types of Onomatopoeia
4 Related Research/Cases
4.1 Kana Wave
4.2 Digital Picture Book System for Foreign Learners Who Study Japanese Onomatopoeia [3]
5 Development of the Application
5.1 Application Overview
5.2 Image Processing Methods
5.3 Execution Example
6 Conclusion
7 Future Work
References
MIMO State Prediction Based Adaptive Control of Aeroelastic Systems with Gust Load
1 Introduction
2 Aeroelastic Model and Control Problem
3 Adaptive Control System
4 State Predictor, Composite Adaptation Law, and Closed-Loop Stability
5 Simulation Results
6 Conclusions
References
Data Science and Data Analysis
Game Theory as a Method for Assessing the Security of Database Systems
1 Introduction
2 Related Works
3 Database Security Threats
4 Our Approach
5 Experimental Evaluation
6 Conclusions
References
Prediction of Web Service Execution QoS Parameters with Utilization of Machine Learning Methods
1 Introduction
2 Related Works
3 Web Service QoS Prediction Model
4 Web Service QoS Prediction Algorithm
5 Experimental Evaluation
6 Conclusions
References
Psychological Influence on Older Adults’ Preferences for Nursing Care
1 Introduction
2 Method
2.1 Survey Participants
2.2 Survey Procedures
2.3 Questions on Awareness of Family Nursing Care
2.4 General Self-efficacy Scale
2.5 Satisfaction with Life Scale (SWLS)
2.6 Questions on Psychological Indebtedness
2.7 UCLA Loneliness Scale in the Elderly
2.8 Method of Analysis
3 Results
3.1 Analysis of Results for Survey of Japanese Participants
3.2 Results of Survey of Chinese Participants
4 Discussions
4.1 Discussion of the Results of the Japanese
4.2 Discussion of the Survey Results of the Chinese
5 Conclusion
References
Observations of Global Software Engineering Classes During the Pandemic: Students’ Perspective
1 Introduction
2 Related Work
2.1 Global Software Engineering
2.2 Teaching During the Pandemic
3 Methods
3.1 Pedagogy
3.2 Organization of the Course
3.3 Data Collection Methods
4 Experiences
4.1 Experiences on Japanese Side
4.2 Experiences on German Side
4.3 Discussion of Experiences
4.4 Comparison of Education and Grading
4.5 Limitations
5 Conclusions and Future Work
References
IoT-Based Smart Logistics Model to Enhance Supply Chain
1 Introduction
2 Review of Literature
3 Proposed Model
4 Results and Discussion
4.1 Selection Presentation Management Using IoT
4.2 Automatic Supply of Goods
4.3 Robotics Management Using IoT
4.4 Radio Frequency Identification (RFID)
4.5 Autonomous Transport Management Using IoT
4.6 Improved GPS Management Using IoT
4.7 Internet of Things Logistical Management Using IoT
4.8 Social Media Logistical Management Using IoT
5 Conclusion
References
Two-Stage Approach to Cluster Categorical Medical Data
1 Introduction
1.1 Description of the Dataset
2 State-of-the-Art Approaches
3 Description of the Two-Stage Procedure
4 Numerical Study
4.1 Using MDS and K-Means
4.2 Using t-SNE and K-Means
4.3 Using K-Means and K-Modes Only
5 Conclusions
References
Case Studies in Access Control
1 Introduction
2 Background
3 Case Study 1: Partial Captures and Stitching
4 Case Study 2: Automated Teller Machine (ATM)
5 Case Study 3: Pharmacy Fridge
6 Case Study 4: Assisted Driving
7 Conclusion
References
Virtual Communication
A Study on Drama Content Production Using 360-Degree Video
1 Introduction
2 Related Work
3 360-Degree Video Production
3.1 Scenarios and Storyboards
3.2 Equipment for Shooting 360-Degree Images
3.3 Video Editing and Viewing Equipment
4 Video Works
5 Consideration
5.1 Planning and Scenario
5.2 Storyboard
5.3 Camera Work
6 Conclusion
References
Knowledge Space as a Way of Creating Knowledge: Framing Future Digital Library
1 Introduction
2 The Role of Traditional Libraries
2.1 Library to Input and Output
2.2 Libraries as a Place to Encounter Unknown Knowledge/Information by Chance
3 Organization and Analysis of Existing Online Libraries
3.1 Search-Oriented Online Library
3.2 Experience-Oriented Online Libraries
3.3 The Future of Online Libraries
4 Experiments on Encounters with Unknown Knowledge
4.1 Outline of the Experiment
4.2 Results and Analysis
5 Proposal of Future Digital Library Mechanism
5.1 Dynamic Reproduction of the Library with a 360° Field of View
5.2 360° Video Playback Based on the User's Actual Movements
6 Summary and Future Works
6.1 Summary
6.2 Future Works
References
Developing a Meta-AR Space Construction System and Mapping Coverage Visualization
1 Introduction
2 Related Work
3 Meta-AR Space Construction
3.1 Texture Assignment Algorithm Using Depth Map
3.2 Mapping Support
4 Evaluation
4.1 Performance Evaluation
4.2 Mapping Coverage Evaluation
4.3 Usefulness of Mapping Visualization
5 Discussion
6 Conclusion
References
Development of Digitalization of Video Device in the Pre-Cinema Era
1 Introduction
1.1 Background of the Study
1.2 Purpose of the Study
1.3 What is Pre-Cinema?
2 What is Theatre Optique?
2.1 Overview of Theatre Optique
2.2 An Inventor, Emile Reynaud
2.3 Invention of Theatre Optique
2.4 Show with Animated Image Device
2.5 How Theatre Optique Presents Its Content at the Box Office
2.6 The Decline of Theatre Optique
3 Significance of Digitizing Past Technologies and Analog Devices
4 Prototype of Digital Optique
4.1 Overview of the Application
4.2 How to Operate the Application
5 Conclusion and Future Works
References
Visualization of the Relationship Between the Vehicle and Its Passengers
1 Introduction
2 Acquisition of Information
3 Measurement Experiment
4 Visualization
4.1 Comparative Display
4.2 Improved Vehicle Information
5 Conclusion and Future Work
References
Design of a Memory Palace Memory Aid Application Based on Trajectory Mnemonics
1 Introduction
2 Background
3 Purpose
4 Associated Research
5 Application Design
5.1 Application Overall Structure Design
5.2 Application Usage Design
5.3 Application-Specific Functional Design
6 Summary
References
Author Index

Lecture Notes in Networks and Systems 611

Henry Selvaraj · Takayuki Fujimoto, Editors

Applied Systemic Studies

Lecture Notes in Networks and Systems

611

Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Türkiye
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong

The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and other. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).

Henry Selvaraj · Takayuki Fujimoto Editors

Applied Systemic Studies

Editors

Henry Selvaraj
Department of Electrical and Computer Engineering, University of Nevada, Henderson, NV, USA

Takayuki Fujimoto
Faculty of Information Sciences and Arts, Toyo University, Kawagoe, Japan

ISSN 2367-3370 · ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-3-031-27469-5 · ISBN 978-3-031-27470-1 (eBook)
https://doi.org/10.1007/978-3-031-27470-1

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Contents

Artificial Intelligence

Decision-Support for Additive Repair Processes with a Multi-agent System . . . 3
Stefan Plappert, Nicola Viktoria Ganter, Thimo von Hören, Paul Christoph Gembarski, and Roland Lachmayer

AI Education for Middle/High School Level: A Proposal of a System that Supports Teachers to Design Their AI Lessons . . . 12
Bich Hong Cu and Takayuki Fujimoto

Sensor Data Restoration in Internet of Things Systems Using Machine Learning Approach . . . 21
Saugat Sharma, Grzegorz Chmaj, and Henry Selvaraj

Potential of eXtended Intelligence (XI) for Extending Human Expression from Digitization of Analog Elements . . . 31
Takayuki Fujimoto

ML-Based System Failure Prediction Using Resource Utilization . . . 40
Ittipon Rassameeroj, Naphat Khajohn-udomrith, Mangkhales Ngamjaruskotchakorn, Teekawin Kirdsaeng, and Piyorot Khongchuay

Interactive System

NEOTEN¥ (NExt Optimum Trade ENvironment of ¥en): A Cryptocurrency with Market Capitalization Linked to the Japanese Yen, Non-speculative Crypto-Assets with Economic Security . . . 53
Takayuki Fujimoto

XI Guitar: Development of a Slide-Type Guitar Application that Reproduces the Playing Sensation of Analog Guitar . . . 64
Liu PeiQiao and Takayuki Fujimoto

Audio Streams Synchronization for Music Performances . . . 77
Klaudia Tomaszewska, Patryk Schauer, Arkadiusz Warzyński, and Łukasz Falas

Research to Enhance the Sense of Presence Based on the Positional Relationship Between the Sound Source and the Distance . . . 88
Tomo Moriguchi and Takayuki Fujimoto

Development of an Image Processing Application that Represents Onomatopoeia . . . 99
Yuka Kojima and Takayuki Fujimoto

MIMO State Prediction Based Adaptive Control of Aeroelastic Systems with Gust Load . . . 110
Keum W. Lee and Sahjendra N. Singh

Data Science and Data Analysis

Game Theory as a Method for Assessing the Security of Database Systems . . . 125
Arkadiusz Warzyński, Katarzyna Łabuda, Łukasz Falas, and Patryk Schauer

Prediction of Web Service Execution QoS Parameters with Utilization of Machine Learning Methods . . . 135
Łukasz Falas, Adam Sztukowski, Arkadiusz Warzyński, and Patryk Schauer

Psychological Influence on Older Adults' Preferences for Nursing Care: A Comparative Study Between Japan and China Based on Data Analysis . . . 146
Zihan Zhang, Chieko Kato, and Koichiro Aoki

Observations of Global Software Engineering Classes During the Pandemic: Students' Perspective . . . 157
Julian Titze, Patricia Brockmann, Daniel Moritz Marutschke, and Victor V. Kryssanov

IoT-Based Smart Logistics Model to Enhance Supply Chain . . . 168
Thirumurugan Shanmugam, Mohamed Abdul Karim Sadiq, and Kamalavelu Velayutham

Two-Stage Approach to Cluster Categorical Medical Data . . . 178
Jerzy Świątek, Jarosław Drapała, Remigiusz Szczepanowski, Izabella Uchmanowicz, Michał Czapla, Jan Biegus, Krzysztof Reczuch, and Tomasz Guszkowski

Case Studies in Access Control . . . 187
Zenon Chaczko, Grzegorz Borowik, Mohammad Alsawwaf, and Marek Kulbacki

Virtual Communication

A Study on Drama Content Production Using 360-Degree Video . . . 201
Ryuta Motegi and Shin Tsuchiya

Knowledge Space as a Way of Creating Knowledge: Framing Future Digital Library . . . 211
Xuezhen Li and Takayuki Fujimoto

Developing a Meta-AR Space Construction System and Mapping Coverage Visualization . . . 222
Koki Yasue, Masato Kikuchi, and Tadachika Ozono

Development of Digitalization of Video Device in the Pre-Cinema Era . . . 232
Nanami Kuwahara and Takayuki Fujimoto

Visualization of the Relationship Between the Vehicle and Its Passengers . . . 242
Masasuke Yasumoto, Kazuya Kojima, and Norimasa Kishi

Design of a Memory Palace Memory Aid Application Based on Trajectory Mnemonics . . . 253
Wang Nan and Takayuki Fujimoto

Author Index . . . 263

Artificial Intelligence

Decision-Support for Additive Repair Processes with a Multi-agent System

Stefan Plappert(B), Nicola Viktoria Ganter, Thimo von Hören, Paul Christoph Gembarski, and Roland Lachmayer

Institute of Product Development, Leibniz University of Hannover, An der Universität 1, 30823 Garbsen, Germany
{plappert,ganter,gembarski,lachmayer}@ipeg.uni-hannover.de, [email protected]

Abstract. The repair of defective metal parts using additive manufacturing processes (additive repair) instead of disposal offers the potential to manufacture parts more cost-effectively, faster, and in a way that preserves resources. One hurdle to a broader application of additive repair is that users without expertise in additive repair lack adequate guidance on its technical feasibility. Moreover, as the specific domain knowledge for the different repair processes is held in a decentralized manner by different experts, and negotiation between them has to be performed to select the appropriate process, this paper investigates the application of a multi-agent system (MAS) using a deep-drawing die as an example.

1 Introduction

How can the disposal of metallic components resulting from a lack of knowledge of repair procedures or from errors in design or manufacturing be avoided? On the one hand, disposal can be counteracted by increasing communication between the design department and the manufacturing department, which is, however, made more difficult by the geographical distance between the departments. On the other hand, knowledge about manufacturing or repair has to be made available to decision-makers so that they can avoid errors or pursue a suitable repair strategy at an early stage. This problem is further complicated when repair processes are not yet widely known, such as repair by additive manufacturing processes (additive repair), in which the component is restored to its original shape by depositing material. These processes can be used, for example, to repair parts that were previously considered unrepairable, in a more material- and energy-efficient manner than manufacturing a replacement part [1]. Additive manufacturing processes also make it possible to provide spare parts at short notice and in small batch sizes.

For additive repair of metallic parts, certain additive processes have been identified in the literature as suitable and have been applied accordingly [2]. These processes are Directed Energy Deposition (DED) and Powder Bed Fusion (PBF), as well as the Cold Spray (CS) process. In the DED process, the filler material is melted by means of a laser beam, so that material deposition is also possible on free-form surfaces. PBF is a powder-based process in which the powder is fused by a laser or electron beam at specific points. In contrast to PBF, in CS the powder is applied to the component at high speed; the powder particles are thereby deformed and bond with the substrate.

The decision on whether a component can be additively repaired is based on the limitations of additive repair techniques [3] and on the experience of whether comparable components have already been successfully repaired by additive repair. Challenges include the need to consider numerous pieces of production equipment, since the limitations differ significantly between equipment types. Furthermore, component properties including material, size, and geometry have to be considered. Due to these various influencing factors, a separate examination has to be carried out for each component, for which corresponding expert knowledge of the respective repair processes must be available. The hurdle here is that this specific domain knowledge is usually not available when the repair decision is made [4], and can only be determined by consulting various experts.

To overcome this hurdle, agent-based technology provides a possibility to model a problem based on distributed knowledge, since real-world entities and their interactions can be represented by separate problem-solving agents [5]. For this purpose, this paper presents the development of a multi-agent system (MAS) that represents the possible repair procedures and enables interaction between agents so that an appropriate repair procedure can be determined.

This paper is organized as follows: Sect. 2 presents the theoretical background and related work on multi-agent systems. Then, in Sect. 3, the methodical procedure and architecture of the MAS are described. The use of the MAS is presented in Sect. 4 with a deep-drawing die. The application of the MAS to the comparison of additive repair processes is discussed and summarized in Sect. 5.

2 Theoretical Background and Related Work

The beginnings of research on distributed artificial intelligence and program agents date back to the 1990s [6]. Although the term "agent" is loosely defined in the literature, it can be agreed that an agent is an autonomous, computational entity that senses its environment using sensors and acts on it using effectors [7]. Autonomous action is one of the main features that distinguishes intelligent agents from conventional software [6]. A MAS consists of many agents that typically interact with each other by exchanging messages over a computer network infrastructure [8]. Characteristically, individual agents have limited knowledge about the system and the problem, and thus can contribute only partial solutions [9]. The basic idea of networked agents is that they collaborate in solving problems, share knowledge, work asynchronously, are modular and robust through redundancy, and represent multiple viewpoints and areas of expertise [7]. Successful interaction therefore requires the skills of cooperation, coordination, and negotiation [8].

One of the key components of a MAS is communication. Agents have to be able to communicate with users, with system resources, and with each other if they cooperate, collaborate, or negotiate [10]. Collaboration is a controlled process that takes place in a collaborative environment where a group of actors work together towards a specific goal. Negotiation, in turn, is a process in which a group of actors communicate with each other to reach a mutually acceptable agreement on a particular matter; it is one of the most important methods to establish collaboration among agents [11]. To give structure to collaboration, a well-known organizational approach is negotiation via a blackboard. Contracting as a negotiation mechanism is embodied in the classic contract-net protocol of [12], in which a manager agent announces a contract, receives bids from other interested contractor agents, evaluates the bids, and awards the contract to a successful contractor. The simplicity of the system makes it one of the most widely used coordination mechanisms, with many variations in the literature [13].

For the implementation as an information system, the most widely used holistic MAS framework of the last two decades is JADE (Java Agent DEvelopment Framework) [14], which is FIPA-compliant and builds the network of agent containers using the object-oriented Java language [10]. Message-based asynchronous communication is the basic form of communication between agents in JADE; an agent wishing to communicate has to send a message to a specific destination (or set of destinations). Moreover, JADE supports the development of multi-agent systems through a predefined, programmable, and extensible agent model and a set of management and testing tools [10].

MAS are already used in production to model and realize intelligent manufacturing systems (IMS) by having agents map, e.g., production stations, workpieces, or transport units and coordinate supply and demand [15]. Papakostas et al. [16] also apply a MAS to decision support for additive manufacturing of plastic parts by using machine agents representing 3D printers to analyze part geometry and technical specifications, and then provide information on process cost and time performance, as well as process configurations, to the user. Design support systems based on MAS offer the possibility to identify, store, and process specific domain knowledge in a decentralized manner [17] so that a holistic view of the value-added process can be taken, and thereby engineers can be supported already in the product development phase [18, 19]. For additive repair processes, only generalized specifications in the form of lists or rule-based and case-based decision support systems exist so far [3, 4]. However, these decision support systems do not address the challenge that, when choosing a suitable repair process, the repair material and process are mutually dependent and that the different repair processes have different requirements, which are usually available in a decentralized manner. Therefore, this paper examines the suitability of MAS as a decision support system for additive repair and how the different repair processes can be represented within the JADE framework.
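As an illustration of how such a contractor could look in JADE, the following is a minimal sketch of a process agent built on jade.proto.ContractNetResponder. It is not the authors' code: the task encoding and the isExcluded/computeScore placeholders are assumptions for illustration only.

```java
import jade.core.Agent;
import jade.domain.FIPANames;
import jade.domain.FIPAAgentManagement.FailureException;
import jade.domain.FIPAAgentManagement.RefuseException;
import jade.lang.acl.ACLMessage;
import jade.lang.acl.MessageTemplate;
import jade.proto.ContractNetResponder;

public class ProcessAgent extends Agent {
    @Override
    protected void setup() {
        // Answer only CFP messages that follow the FIPA contract-net protocol
        MessageTemplate mt = MessageTemplate.and(
                MessageTemplate.MatchProtocol(FIPANames.InteractionProtocol.FIPA_CONTRACT_NET),
                MessageTemplate.MatchPerformative(ACLMessage.CFP));

        addBehaviour(new ContractNetResponder(this, mt) {
            @Override
            protected ACLMessage handleCfp(ACLMessage cfp) throws RefuseException {
                String task = cfp.getContent();          // repair task announced by the manager
                if (isExcluded(task)) {                  // exclusion criterion fulfilled -> REFUSE
                    throw new RefuseException("exclusion criterion fulfilled");
                }
                ACLMessage propose = cfp.createReply();  // otherwise bid with a suitability score
                propose.setPerformative(ACLMessage.PROPOSE);
                propose.setContent(Double.toString(computeScore(task)));
                return propose;
            }

            @Override
            protected ACLMessage handleAcceptProposal(ACLMessage cfp, ACLMessage propose,
                    ACLMessage accept) throws FailureException {
                ACLMessage inform = accept.createReply(); // acknowledge the award, ending the negotiation
                inform.setPerformative(ACLMessage.INFORM);
                return inform;
            }
        });
    }

    // Placeholders for the process-specific knowledge each agent encapsulates
    private boolean isExcluded(String task) { return false; }
    private double computeScore(String task) { return 0.0; }
}
```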

3 Multi-agent System as Decision-Support System for Additive Repair Processes

3.1 Methodical Procedure

The MAS presented is intended to overcome the hurdle of expert knowledge not being available centrally, and the need for an individual assessment of components with regard to additive repair. To do this, the MAS has to address requirements such as querying component and damage information, and identifying, storing, and processing specific domain knowledge. The execution of the decision-making for repair or disposal ensures that the component is not unnecessarily disposed of and that an appropriate repair procedure is recommended.

In order to map these requirements for a MAS in a structured way, the CommonKADS method is suitable, which follows an approach of knowledge-based system development by capturing the most important properties of the system and its environment in separate models. The CommonKADS method consists of six models that form different levels of abstraction [20, 21]. The organization model identifies and analyzes the specific organizational and application contexts into which the MAS is to be implemented. The task model describes the tasks to be performed by the MAS, independently of an agent. The agent model describes all relevant capabilities and characteristics of agents that can perform the tasks identified in the task model; an agent may be a human, computer software, or another entity. The exchange of information between the different agents is defined in the communication model. A key model in the CommonKADS methodology is the expertise model, which models an agent's problem-solving behavior in terms of the knowledge used to perform a particular task. Finally, the design model is developed, which combines the other models into a system and is thus used to describe the architecture and technical design.

An important part of the additive repair process chain is the selection of a suitable repair process; the context or scope of the organization model therefore lies here. The main task of the MAS is to support the decision on whether the technical feasibility of additive repair is given and, if so, to provide a recommendation for a suitable additive repair process.

3.2 Architecture of the Multi-agent System

The JADE framework was chosen to build the MAS because of its widespread use and its FIPA-compliant communication infrastructure [14]. In addition, JADE provides a set of graphical tools within a user interface to monitor, control, or debug the agents in the MAS [10]. The architecture of the presented approach for decision support for additive repair processes consists of six different agents (Fig. 1) and is divided into three levels: the interface to the environment, the coordination level, and the functional level.

The interface to the environment is the information agent, which requests the necessary information about the component and the damage from the user via a graphical user interface and stores the task in the blackboard. In the coordination layer, the coordination agent orchestrates the other agents and regulates their access to the blackboard. In addition, the coordination agent regulates the process of negotiation among the process agents to find the most appropriate repair procedure; this is done according to the contract-net protocol. The three process agents and the material agent act at the functional level. These agents hold only the specific knowledge that is necessary for the evaluation of their procedure or the materials. With this structure, further procedures in the form of agents can easily be added to the system, since they act only at the functional level. Furthermore, depending on the manufacturing process, further information can be requested from the user if no exclusion criteria are fulfilled. In addition, the management of agents can be decentralized, so that the basis for decision-making always reflects the current manufacturing capabilities. The material agent is responsible for the preselection of suitable materials, and the process agent checks the exclusion criteria and selects the optimal material and the optimal machine for the given additive repair task of the considered additive repair processes.

A key requirement of the system is the review of appropriate design guidelines to evaluate the suitability of the repair procedures. Design guidelines can be used to derive evaluation criteria for the various repair procedures, which can be assigned to two categories. The exclusion criteria represent design rules whose fulfillment is mandatory to apply the repair procedure; these usually cannot be corrected by post-processing. An example is the exclusion of PBF for a part that cannot be disassembled, since repair by PBF can only be carried out in a closed installation space. The evaluation criteria are used to quantify the suitability of the process. To increase the decision quality, additional parameters such as part size and element thickness have to be related to the available production equipment, e.g., the size of its process chamber. These design guidelines can also be fulfilled by post-processing. However, direct fulfillment by the repair process is still desirable, because post-processing increases process time and costs.

Fig. 1. The architecture of the multi-agent system

In this MAS, both direct messages and a blackboard are used for communication between the individual agents. The blackboard serves the agents as a storage location for the knowledge gained, in the form of tables. Since exchanging these tables among all agents via direct messages would involve an enormous volume of messages, and thus a very high computational load, it was decided to use a blackboard in which the information is stored in a database.
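As a rough sketch of the blackboard just described, the following stands in for the database-backed store with an in-memory table map; the class, method, and table names are assumptions, not taken from the paper.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class Blackboard {
    // One named "table" per kind of shared knowledge, e.g. "task" or "materials";
    // in the paper this role is played by a database and CSV files.
    private final Map<String, List<String[]>> tables = new ConcurrentHashMap<>();

    public void write(String table, String[] row) {
        tables.computeIfAbsent(table, t -> new CopyOnWriteArrayList<>()).add(row);
    }

    public List<String[]> read(String table) {
        return tables.getOrDefault(table, List.of());
    }
}
```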

4 Application Example

An additively manufactured deep-drawing die (Fig. 2a) for the production of lipstick caps, which experiences abrasive wear during service, serves as an application example. When the wear limit is reached, the tool must be repaired or replaced with a new one. In order to extend the service life, the MAS should examine whether it is possible to repair the die using additive manufacturing processes, and which processes are suitable for this purpose.

Fig. 2. a) Deep-drawing die as example, b) output of the negotiation between the agents

In the following, the process of verifying the suitability of the repair procedures is presented; it is also shown in Fig. 3. After the inputs are made in the GUI, the MAS is started. First, the information agent queries the relevant information about the part and the damage from the user, stores the information in the blackboard, and informs the coordination agent. Then the material agent checks which materials are compatible with the component material. To do this, it needs a list of all materials with their characteristic values, as well as the part and damage data and the lists of compatible materials. In addition to the compatibility check, it also examines whether the materials meet the required hardness, strength, and corrosion resistance. If this is not the case, the corresponding materials are deleted from the list of all materials. Then the updated material list is written to a CSV file and stored in the blackboard. For the deep-drawing die, the MAS selects a tool steel as a suitable material, which corresponds to the material of the damaged die and meets the specified requirements. The updated material list, which is further processed in the system, contains only the tool steel material.

Fig. 3. Sequence diagram of the multi-agent system

The orchestration of the negotiation is done according to the contract-net protocol (Fig. 2b). Here, the coordination agent acts as the manager, while the process agents represent the contractors. At the beginning of the negotiation, the coordination agent sends a call for proposals (CFP) for an additive repair task to the process agents. When a process agent receives such a request, it checks whether it is suitable for the task. The check of the exclusion criteria can be divided into three steps. In the first step, the material availability is checked. If no material from the revised list of the material agent is available for a process, an exclusion criterion is fulfilled and this is documented as 'process excluded'. In the next step, the machine-specific exclusion criteria, such as sufficient installation space or falling below the minimum wall thickness, are checked. For this purpose, the program runs through the list of machines for the corresponding process and compares the machine data with the requirements of the task. If these are not sufficient, i.e., if a machine-specific exclusion criterion is met, the machine is deleted from the list. If no suitable machine is available, an exclusion criterion is fulfilled. The last step comprises the process-specific queries, such as the need for a monocrystalline repair; it is only checked whether these are required in the task. In the case of the die, filigree structures are present near the damaged area, and the restoration of filigree structures represents an exclusion criterion for the CS process. If no exclusion criteria are met, further information is requested from the user, e.g., for PBF the volume to be removed.

To create comparability between the different repair processes, a numerical value (p_score) is used to evaluate suitability. This p_score is composed of the capability level of properties for the manufacturing processes, a factor for material costs, and the effort for rework. The higher the p_score, the more suitable the process is for the repair. If no exclusion criterion is fulfilled, the agent starts calculating the p_score and selecting the most suitable machine. With the p_score of the best machine, the process agent participates in the auction and places a bid (PROPOSE). If the process is unsuitable for the task, the process agent sends a rejection to the coordination agent (REFUSE). The CS process is unsuitable because the material cannot be applied to hardened surfaces with CS, so an exclusion criterion is fulfilled. In addition, only a few complicated components can be repaired with this process, which is why a further exclusion criterion is fulfilled at this point. However, no exclusion criterion is fulfilled for the DED and PBF procedures, which therefore begin with the calculation of the p_score.

Once the coordination agent has received all bids, it uses the p_score to determine the most suitable procedure and notifies the associated agent (ACCEPT_PROPOSAL). The remaining procedures receive a rejection (REJECT_PROPOSAL). If the process agent's offer was chosen, the process agent acknowledges receipt with a message to the coordination agent, thereby terminating the negotiation process. Due to the large difference between the damaged volume and the volume that has to be removed for PBF, the p_score of PBF is lower. The p_score of DED is many times higher, which is why this method wins the auction of the coordination agent and is recommended by the system to the user. Since the load is abrasive, the system searches for a material with particularly high hardness. The recommendation of the system was validated by human experts.
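The composition of the p_score above (capability level, material-cost factor, rework effort) is described only qualitatively in the paper. The following is a hedged sketch of one way such a score could be combined; the formula, field names, and example values are assumptions for illustration, not the authors' implementation.

```java
public final class PScore {
    // Higher capability raises the score; cost and rework effort lower it.
    public static double compute(double capabilityLevel,
                                 double materialCostFactor,
                                 double reworkEffort) {
        return capabilityLevel / (materialCostFactor * (1.0 + reworkEffort));
    }

    public static void main(String[] args) {
        // DED-like candidate: high capability, moderate cost, little rework
        System.out.println(PScore.compute(0.9, 1.2, 0.1)); // ~0.68
        // PBF-like candidate: much material to remove -> high rework effort
        System.out.println(PScore.compute(0.8, 1.2, 2.0)); // ~0.22
    }
}
```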

5 Discussion and Conclusion

In this work, a MAS for assessing the suitability of additive repair techniques was presented. The CommonKADS method was used, and an architecture was built based on the JADE framework, which was extended by a blackboard. A deep-drawing die was used as an application example to verify the suitability assessment and the negotiation of the repair processes. It could be shown that a MAS can be a promising tool to represent expert knowledge in a decentralized way in the form of agents, and that it is easy to maintain and extend due to its modular structure. The existing system is a first step towards decision support on the suitability of additive repair processes, as it stores specific knowledge on additive repair processes within agents, performs a comparison based on design guidelines, and handles this via negotiation using a contract-net protocol, so that a recommendation for the most suitable repair process can be given and documented.

However, the MAS still has a few limitations at this stage: a lot of information about the part and the damage still has to be provided manually by the user, the stored material and machine information is limited to a few essentials, and the suitability of the p_score needs to be validated using sensitivity analysis. In addition, it would be conceivable to split the individual components of the p_score in order to discuss and negotiate them separately between the agents.

For further research, it is to be examined to what extent a connection to a CAD system is possible, in order to read out component information and to check the suitability of the procedures on the basis of geometric simulations. This can reduce user input and the scope for interpretation. To offer the user a complete system for process determination, such a system would also be profitable for designers who strive for a design for remanufacturing based on component repair by additive repair. The MAS could also be extended to include agents that represent the alternatives to additive repair, e.g., the production of a new spare part. The MAS could then recommend the most suitable option for spare part provision based on the criteria of process and material costs, as well as the duration until availability.

References

1. Wilson, J.M., Piya, C., Shin, Y.C., Zhao, F., Ramani, K.: Remanufacturing of turbine blades by laser direct deposition with its energy and environmental impact analysis. J. Clean. Prod. 80, 170–178 (2014). https://doi.org/10.1016/j.jclepro.2014.05.084
2. Wahab, D.A., Azman, A.H.: Additive manufacturing for repair and restoration in remanufacturing: an overview from object design and systems perspectives. Processes 7(11), 802 (2019). https://doi.org/10.3390/pr7110802
3. Lahrour, Y., Brissaud, D.: A technical assessment of product/component remanufacturability for additive remanufacturing. Procedia CIRP 69, 142–147 (2018). https://doi.org/10.1016/j.procir.2017.11.105
4. Ganter, N.V., Plappert, S., Gembarski, P.C., Lachmayer, R.: Assessment of repairability and process chain configuration for additive repair. In: Andersen, A.-L., et al. (eds.) CARV/MCPC 2021. LNME, pp. 261–268. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-90700-6_29
5. Jennings, N.R., Wooldridge, M.: Applying agent technology. Appl. Artif. Intell. 9(4), 357–369 (1995). https://doi.org/10.1080/08839519508945480
6. Dostatni, E., Diakun, J., Grajewski, D., Wichniarek, R., Karwasz, A.: Multi-agent system to support decision-making process in design for recycling. Soft. Comput. 20(11), 4347–4361 (2016). https://doi.org/10.1007/s00500-016-2302-z
7. Weiss, G.: Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. MIT Press (2000)
8. Wooldridge, M.: An Introduction to Multiagent Systems. John Wiley & Sons (2009)
9. Eymann, T.: Grundlagen der Software-Agenten, pp. 17–107. Springer, Berlin, Heidelberg (2003). https://doi.org/10.1007/978-3-642-55622-7_2
10. Bellifemine, F., Caire, G., Greenwood, D.: Developing Multi-Agent Systems with JADE, 1st edn. Wiley Series in Agent Technology. John Wiley & Sons, Chichester, UK (2007). https://doi.org/10.1002/9780470058411
11. Ganzha, M., Paprzycki, M.: Implementing rule-based automated price negotiation in an agent system. J. Univ. Comput. Sci. 13(2), 244–266 (2007)
12. Davis, R., Smith, R.G.: Negotiation as a metaphor for distributed problem solving. Artif. Intell. 20(1), 63–109 (1983). https://doi.org/10.1016/0004-3702(83)90015-2
13. Nwana, H.S., Ndumu, D.T., Lee, L.C., Collis, J.C.: Zeus: a toolkit for building distributed multiagent systems. Appl. Artif. Intell. 13(1–2), 129–185 (1999). https://doi.org/10.1080/088395199117513
14. Bergenti, F., Caire, G., Monica, S., Poggi, A.: The first twenty years of agent-based software development with JADE. Auton. Agent. Multi-Agent Syst. 34(2), 1–19 (2020). https://doi.org/10.1007/s10458-020-09460-z
15. Upadhyay, D.B.: A review paper on multi-agent based intelligent manufacturing systems. Int. J. Adv. Eng. Res. Sci. 1(3) (2014)
16. Papakostas, N., Newell, A., George, A.: An agent-based decision support platform for additive manufacturing applications. Appl. Sci. 10(14), 4953 (2020). https://doi.org/10.3390/app10144953
17. Plappert, S., Gembarski, P.C., Lachmayer, R.: Multi-agent systems in mechanical engineering: a review. In: Jezic, G., Chen-Burger, J., Kusek, M., Sperka, R., Howlett, R.J., Jain, L.C. (eds.) Agents and Multi-Agent Systems: Technologies and Applications 2021. SIST, vol. 241, pp. 193–203. Springer, Singapore (2021). https://doi.org/10.1007/978-981-16-2994-5_16
18. Gembarski, P.C.: On the conception of a multi-agent analysis and optimization tool for mechanical engineering parts. In: Jezic, G., Chen-Burger, J., Kusek, M., Sperka, R., Howlett, R.J., Jain, L.C. (eds.) Agents and Multi-Agent Systems: Technologies and Applications 2020. SIST, vol. 186, pp. 93–102. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-5764-4_9
19. Plappert, S., Gembarski, P.C., Lachmayer, R.: Knowledge-based design evaluation of rotational CAD-models with a multi-agent system. In: Borzemski, L., Selvaraj, H., Świątek, J. (eds.) ICSEng 2021. LNNS, vol. 364, pp. 47–56. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-92604-5_5
20. Iglesias, C.A., Garijo, M., González, J.C., Velasco, J.R.: A methodological proposal for multiagent systems development extending CommonKADS. In: Proceedings of the 10th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop, vol. 1, pp. 21–25 (1996)
21. Schreiber, G., Wielinga, B., de Hoog, R., Akkermans, H., Van de Velde, W.: CommonKADS: a comprehensive methodology for KBS development. IEEE Expert 9(6), 28–37 (1994)

AI Education for Middle/High School Level: A Proposal of a System that Supports Teachers to Design Their AI Lessons

Bich Hong Cu(B) and Takayuki Fujimoto

Toyo University, Kawagoe, Japan
[email protected]

Abstract. The rapid development of Artificial Intelligence (AI) has increased the importance of AI education for users. While AI has been researched continuously in higher education, teaching AI to middle and high school students is still a new idea. Although the governments of various countries have attempted to promote AI education for students from primary to high school, teachers have encountered challenges in designing their teaching materials. In this paper, we propose an application that helps teachers create easy-to-understand lesson plans and introduce AI knowledge to middle and high school students. By recommending to teachers AI-related products or services that are familiar to the young generation as a way to introduce AI concepts, the application is expected to mitigate teachers' difficulties in designing their AI lesson plans.

Keywords: AI education · Middle and high schools · Lesson plans · AI-related products

1 Introduction

Artificial Intelligence (AI) has been developing rapidly, leading to a world where people have to work and live with it [6]. With its continuous evolution in voice and facial recognition technologies and machine learning algorithms, AI has become ubiquitous [8]. These days, the young generation is exposed to AI-related devices so early that they need to be conscious of AI's frequent appearance around them, as well as of how to use it safely and effectively. Eguchi, Okada, and Muto (2021) stated that children are surrounded by AI assistants and AI-assisted smart household devices as they grow up, including smart speakers powered by Google and applications equipped with Siri or Alexa [15]. Smart toys and music streaming services have likewise become familiar in children's daily life [10]. Because AI's impacts on our society are increasingly obvious, educators in various fields such as AI, computer science, and education strongly claim that understanding the science behind AI is important. As there are both beneficial and unintended negative impacts of AI advances [17], people need to know AI's limits and potential influence in the future [4]. Eguchi, Okada, and Muto (2021) also demonstrated that it is crucial and urgent to equip K-12 students with AI-related knowledge and the capacity to use AI-enhanced technologies in the future [8]. Not only middle and high school students but also younger children need to know examples of AI in their world, understand how AI algorithms function, and evaluate the effect of AI systems on their society [3].

Elementary AI concepts and techniques have traditionally been introduced at the university level [12]. AI-related knowledge has been considered a major component of higher education, with various teaching strategies (for example, [14] and [9]). In contrast, the idea of teaching AI to middle and high school students is relatively new to most of us. Although research on AI in education has developed recently, most of it focuses on how AI-related systems can support education; intelligent tutoring systems are an example. AI curricula have been designed mainly for higher education, whereas AI education for students from primary to high school has not been developed sufficiently [6]. Teaching basic AI concepts and techniques at the school level, separate from any platform or programming language, is incredibly uncommon [5]. Although AI education has received increasing attention from K-12 teachers, little research has studied how AI education can be made more accessible for all students [16]. In particular, the AI curriculum is still new to schools of all grades, which has led to a serious lack of relevant studies on planning and implementing AI curricula [7]. The study of teachers' perspectives in creating an AI curriculum is especially insufficient, which increases the difficulty of making AI curricula accessible to students [17].

Since AI curricula at schools are believed to be an essential global strategic goal in educating the young generation [7, 10], AI has started to be introduced to students from grade 1 to grade 12 by various educational organizations in the form of unofficial AI learning programs. Teaching AI at the school level is also a major UNESCO initiative. Responding to this goal, several projects and activities have been organized. For instance, UNESCO has partnered with Technovation, a global tech education nonprofit, to construct a free, online tech education program for girls, which lasts for 5 weeks (as described in [8]). Recently, AI has gradually been integrated into the current education systems of several nations, such as China and the United States. In China, learning with AI-related technology and learning about AI concepts have gathered increasing attention [7]. The Chinese Ministry of Education released the "Artificial Intelligence Innovation Action Plan for Institutions of Higher Education" in 2018 [7]. This strategy aims to encourage and support young individuals to become involved in AI work, and also school teachers to convey AI knowledge to their students. Similar to China, different regions in the United States have made efforts to integrate AI into their K-12 classrooms. According to the report by Peterson, Goode, and Gehlhaus (2021), Seckinger High School in Georgia and the North Carolina School of Science and Mathematics have introduced their own AI curricula to K-12 education, as well as instructed students to understand, design, and use AI [11]. Besides China and the United States, other developing countries have started to implement AI learning materials in their official education systems. In Vietnam, Le Hong Phong High School for gifted students in Ho Chi Minh City is the first high school to teach AI to students who specialize in technology.

In brief, AI education for middle and high school students is expected to become a future educational trend. However, there are challenges in introducing AI concepts to students, one of which is that teachers encounter difficulties in designing AI lesson plans. Teachers often feel insecure about their lack of qualified knowledge to teach AI, or their insufficient ability to integrate AI on top of their existing curriculum [16]. Therefore, in this paper, we propose an application that can mitigate teachers' pressure in creating AI lesson plans. It can recommend an effective AI teaching method to teachers while fulfilling the requirements of the Ministry of Education.
their lack of qualified knowledge to teach AI, or their insufficient ability to integrate AI on top of their existing curriculum [16]. Therefore, in this paper, we proposed an application that can mitigate teachers’ pressure of creating AI lesson plans. It can recommend an effective AI teaching method to teachers, while fulfilling the requirements of the Ministry of Education.

2 Existing Applications/ Services that Support Teachers with Lesson Plans and AI Education To fulfill the demand for an AI curriculum for middle and high school students, various organizations have developed AI learning materials for students until grade 12. In the report published in 2022, UNESCO mentioned AI4ALL as an open learning platform, which empowers high school educators of all backgrounds to bring AI education to their communities. AI4ALL also provides various learning programs for teachers and college students to gain more knowledge of Artificial Intelligence [2]. AI4ALL, a nonprofit program, cooperated with the Association for the Advancement of Artificial Intelligence (AAAI) and the Computer Science Teachers Association (CSTA) to launch AI4K12 in 2018 [15]. AI4K12 act as a framework for middle and high school teachers to design their lesson plans. Also, this website states that its guidelines are organized around the 5 Big Ideas in AI. The guidelines will serve as a framework to assist teachers and curriculum developers on AI concepts, essential knowledge, and skills by grade band [1]. On the other hand, there have been various applications and services that support teachers with curriculum design and lesson plans. Top Hat, an all-in-one teaching and learning platform can be an example. Top Hat offers available AI lesson plans and curriculum templates for educators. In Top Hat, teachers will have access to different reference resources and customizable interactive textbooks. This platform also allows teachers to keep track of students’ progress and support students easily. Top Hat provides users with completed lesson plans for various subjects, which teachers can use for teaching immediately [13]. Although users can enjoy various AI educational content offered by the platforms mentioned above, there are several drawbacks that deserve our attention. First, most of the AI educational programs mentioned above are privately developed by different organizations but governments. Therefore, their lesson content about AI varies and it is difficult to evaluate if they met the requirements of the national education standards. Second, available teaching materials might not fit all of the educational environments and teachers’ teaching styles. Thus, in this paper, we proposed an application with an aim to fill these gaps that many AI lesson design tools have. Available AI lesson plans offered by existing educational platforms have a common tendency in the lesson design. Regardless of the teaching content, the method of conveying AI-related knowledge needs to be effective. Instead of starting from AI concepts and then explaining how they are used to create high-tech devices, the proposed application offers teachers a different way of teaching AI. To introduce fundamental AI knowledge, the application enables teachers to start from the AI-related application examples that students are interested in.


3 Purpose of the Proposed Application

The main purpose of this application is to support middle and high school teachers in creating their AI lesson plans, given that they have been trained in AI education and understand fundamental AI concepts. Specifically, the proposed application is expected to encourage teachers' creativity in designing their lesson plans while meeting the requirements of the national Ministry of Education (MOE) regarding the AI curriculum. Besides supporting teachers in designing AI lesson plans in their own styles (a free manual design section), this application mainly helps teachers introduce AI concepts by analyzing a real AI-related product. We plan to develop the application around the concept of 'starting an AI lesson with a real AI-related product'. The content of AI-related lessons designed with the proposed application will be divided into three main categories: AI fundamental concepts, Ethics and social impacts of AI, and Using and developing AI. These categories are standardized in the UNESCO report on AI education for students from primary to high school level. However, in this paper, we simplified and adjusted the details of the three categories mentioned above so that we can create an AI curriculum suitable for middle and high school students. The proposed application aims to recommend to teachers a new way of introducing fundamental AI concepts to middle and high school students. Instead of only lecturing about AI-related knowledge, teachers can create student-centered lesson plans by encouraging students to discuss real AI products and AI-related issues. In this way, teacher-student interaction in AI classes can become livelier.

4 Functions of the Application

The application is designed for computers because teachers may need to open various windows simultaneously while working on their lesson plans. Teachers can also view their teaching materials on their smartphones. The application has the following functions:

4.1 Recommend AI Products in Accordance with AI Concepts that Teachers Need to Introduce

The application allows teachers to create their lesson plans manually with full functions, except for analyzing real AI products. However, since the application's priority is to orient users to start their lessons with an AI application or website, our focus is on how teachers can exploit this real-product analysis function. Take the AI concept of 'Computer Vision' as an example. When a teacher wants to teach this content, he/she enters 'Computer Vision' and other related keywords into the application. The proposed application then recommends corresponding software, applications, or websites (such as Facebook and TikTok). Generally, starting with a real AI product can be a great hook for the lesson: it provokes students' interest and helps them link what they are learning directly to how the related AI products are used in their daily lives.
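As a rough illustration of how such a recommendation could work internally, consider a keyword lookup over a curated mapping from AI concepts to example products. The mapping, names, and function below are purely illustrative assumptions, not a specification of the actual system:

```python
# Illustrative concept-to-product mapping; the real application would use
# a curated, regularly updated database rather than a hard-coded dict.
RECOMMENDATIONS = {
    "computer vision": ["Facebook photo tagging", "TikTok face filters"],
    "speech recognition": ["Siri", "Google Assistant"],
    "recommender systems": ["YouTube recommendations", "Netflix"],
}

def recommend_products(keywords):
    """Return example AI products matching any entered keyword."""
    hits = []
    for keyword in keywords:
        hits.extend(RECOMMENDATIONS.get(keyword.lower().strip(), []))
    return sorted(set(hits))

print(recommend_products(["Computer Vision"]))
# -> ['Facebook photo tagging', 'TikTok face filters']
```

In practice, such a lookup would need synonym handling and ranking rather than exact keyword matches.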


4.2 Automatic Update on Requirements of National Education Curriculum

The education curriculum endorsed by the MOE provides educators with a framework for the standard knowledge that students are expected to acquire. This framework guarantees fundamental and unified education content that students all over the country go through. At the same time, the national education curriculum empowers local authorities and schools to take the initiative and responsibility in selecting and supplementing some educational content, as well as implementing educational plans suited to local and institutional objectives and conditions. The proposed application will update the national education curriculum according to country, state, and grade. By entering their location and grade, teachers receive fundamental information about national education frameworks. This official information is organized and displayed in the form of clickable choices. Teachers can also get a general picture of how their AI lessons will unfold across the whole semester. Before designing a single lesson in detail, teachers can sketch out lesson titles (which can express the main lesson content) and learning objectives for all lessons they will have in that semester, and fill in the date and time of each lesson. These learning objectives are standards-based by default and can be adjusted flexibly by each teacher. After sketching out learning objectives for all lessons, teachers start to design lesson details (a minimal sketch of the data shape behind this lookup appears after Sect. 4.4 below).

4.3 Open Reference Resource and Integrated Search Engine

Reference resources are indispensable for designing lesson plans, especially for a new subject such as AI. Therefore, this function is expected to provide users a fast and convenient way to search for reference documents. The application offers an integrated search engine, which allows users to search for reference or teaching resources by entering the lesson's keywords. The reference search results are displayed side by side with the area where teachers are creating their teaching slides.

4.4 Teachers' Social Networking and Co-designing Lesson Plans

This function allows teachers to share their lesson plans, comment on others' lessons, or co-design AI teaching materials. Because the idea of teaching AI to middle and high school students is relatively new, teachers need to connect with other teachers to support each other. Also, creating effective lesson plans requires continuous updates and adjustments based on evaluations from students and other teachers.
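The following sketch shows one possible data shape behind the Sect. 4.2 lookup. All entries and names are placeholders; real records would be synced from each Ministry of Education's official publications:

```python
# Placeholder curriculum records keyed by (country, state, grade); a real
# deployment would sync these from official MOE sources, not hard-code them.
CURRICULUM = {
    ("USA", "Nevada", 8): [
        "Perception: describe how computers sense the world (5 Big Ideas)",
        "Societal impact: discuss effects of AI decisions on communities",
    ],
}

def lookup_standards(country, state, grade):
    """Return the clickable standard choices for a location and grade."""
    return CURRICULUM.get((country, state, grade), [])

print(lookup_standards("USA", "Nevada", 8))
```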

5 How to Use the Application

Following the application's instructions, teachers can design their lesson plans from the most basic stage (confirming the national education curriculum) to the most detailed stage (planning class activities). A completed lesson plan consists of slides containing the main lesson content as text, videos, or pictures. Every classroom is therefore assumed to have a projector to display AI-related content to students. How to use the application is described below.


Fig. 1. The process of creating an AI lesson plan.

Figure 1 illustrates, in order, the stages by which teachers create their lesson plans. First, teachers define the national standards before preparing their teaching materials. To search the applicable national education curriculum accurately, teachers fill in their country, specific state (in the case of America), and grade. As national education standards are organized in clickable categories, users can easily select the options that suit their schools' conditions. Based on the national education frameworks, teachers define detailed learning objectives for each scheduled lesson, which can be adjusted flexibly. This process is described in detail in Fig. 2 below.

Fig. 2. Choose national education standards to define lessons’ learning objectives

As indicated in Fig. 2, users can arrange learning objectives for every lesson in a semester, and these can be adjusted flexibly. After setting learning objectives for all lessons scheduled for the whole semester, teachers start to work on individual lesson design. This is the Instructional Strategies stage. There are four steps in this stage, as


indicated in Fig. 1. To design each AI lesson, teachers start by choosing exemplified AI-related products from the alternatives recommended by the application.

Fig. 3. Recommended applications based on keywords

Figure 3 shows how AI-related applications are recommended based on keywords from the lesson content that teachers have entered, taking computer vision as an example. When users enter 'computer vision' and 'Convolutional Neural Network (CNN)' into the search bar, the proposed system suggests Facebook, Instagram, or TikTok, in which CNNs have been exploited for face recognition (this function was later removed from Meta's services due to ethical issues). The recommended applications can serve as demonstrational examples for teachers to explain fundamental AI concepts, ethics and social impacts, and how to use and develop AI, all of which relate strongly to the suggested applications. Upon completing an AI lesson plan, teachers can set the timeframe for class activities and share their work with other teachers for co-designing or giving/receiving feedback.

Fig. 4. A sample screen for teachers to design AI lesson plans


In Fig. 4, learning objectives are always displayed side-by-side with the content writing areas and a reference search engine. Only 'Teaching content' is shown to students, whereas 'Students' activities/ Lecture content' remains visible only to teachers. On the left side of the teaching slides is the time allocation box, which lets teachers anticipate how long they should spend on each part of their teaching materials. Moreover, a search engine is integrated into the application to help teachers with open reference resources.

6 Conclusion

As Artificial Intelligence (AI) has become an integral part of people's daily lives, users need to understand how AI operates and how it impacts our society. While research on AI has developed continuously in higher education, teaching AI to middle and high school students is not yet common. In an age when the young generation has frequent exposure to AI-powered devices, AI education for children from grades 1 to 12 deserves more attention. Both the Ministry of Education and teachers encounter many challenges in this field. Therefore, in this paper, we proposed an application to partially mitigate teachers' difficulties in designing AI lesson plans for middle and high school students. By analyzing AI-related products, the proposed application can suggest interesting and easy-to-understand instructions for teachers to conduct their AI classes. Based on these instructions, teachers can introduce to their students AI concepts, AI's impacts on society, and how to use AI effectively.

7 Future Work

This paper demonstrated the first developmental stage of an application specialized for AI lesson plans for middle and high school students. Several issues remain to be studied in depth, such as how the proposed application can be used in countries whose governments have not yet decided on a national AI curriculum and national AI textbooks for students. Moreover, we need to understand other challenges teachers encounter when they introduce AI knowledge through existing websites that support AI lesson design (AI4K12, for example). Investigating these issues will allow us to upgrade the proposed application. In our next paper, we will research specific content for national AI curricula and use a sample of a detailed AI lesson plan to illustrate how the proposed application operates. We will also create a database for AI education and describe the application's algorithm. Ultimately, we believe that government cooperation is the key to acquiring official information about each country's national education standards. We may also need to form a content team for smooth application operation, since AI education for middle and high school students is still a new idea and it is hard to automatically update national standards from official resources alone. Members of this team would be responsible for keeping the system's AI educational standards up to date. All the considerations mentioned above will be described in detail in our next paper.


References

1. AI4K12. Accessed 28 April 2022. https://ai4k12.org/
2. AI4ALL. Accessed 28 April 2022. https://ai-4-all.org/
3. Ali, S., Payne, B.H., Williams, R., Park, H.W., Breazeal, C.: Constructionism, ethics, and creativity: developing primary and middle school artificial intelligence education. In: International Workshop on Education in Artificial Intelligence K-12 (EDUAI'19) (2019)
4. Alonso, J.M.: Teaching explainable artificial intelligence to high school students. International Journal of Computational Intelligence Systems 13(1), 974 (2020). https://doi.org/10.2991/ijcis.d.200715.003
5. Burgsteiner, H., Kandlhofer, M., Steinbauer, G.: iRobot: teaching the basics of artificial intelligence in high schools. In: Proceedings of the AAAI Conference on Artificial Intelligence 30(1) (2016)
6. Chai, C.S., Wang, X., Xu, C.: An extended theory of planned behavior for the modelling of Chinese secondary school students' intention to learn artificial intelligence. Mathematics 8(11), 2089 (2020). https://doi.org/10.3390/math8112089
7. Chiu, T., Chai, C.-S.: Sustainable curriculum planning for artificial intelligence education: a self-determination theory perspective. Sustainability 12(14), 5568 (2020). https://doi.org/10.3390/su12145568
8. Eguchi, A., Okada, H., Muto, Y.: Contextualizing AI education for K-12 students to enhance their learning of AI literacy through culturally responsive approaches. KI - Künstliche Intelligenz 35(2), 153–161 (2021). https://doi.org/10.1007/s13218-021-00737-3
9. McGovern, A., Tidwell, Z., Rushing, D.: Teaching introductory artificial intelligence through Java-based games. In: Proceedings of the Second Symposium on Educational Advances in Artificial Intelligence (2011)
10. Pedró, F., et al.: Artificial intelligence in education: challenges and opportunities for sustainable development. UNESCO, Paris, France (2019)
11. Peterson, D., Goode, K., Gehlhaus, D.: AI education in China and the United States: a comparative assessment (2021). https://doi.org/10.51593/20210005
12. Steinbauer, G., Kandlhofer, M., Chklovski, T., Heintz, F., Koenig, S.: Education in artificial intelligence K-12. KI - Künstliche Intelligenz 35(2), 127–129 (2021). https://doi.org/10.1007/s13218-021-00734-6
13. TOP HAT. Accessed 28 April 2022. https://tophat.com/
14. Torrey, L.: Teaching problem-solving in algorithms and AI. In: 3rd Symposium on Educational Advances in Artificial Intelligence (2012)
15. UNESCO: K-12 AI curricula: a mapping of government-endorsed AI curricula. Accessed 28 April 2022. https://unesdoc.unesco.org/ark:/48223/pf0000380602
16. Van Brummelen, J., Lin, P.: Engaging teachers to co-design integrated AI curriculum for K-12 classrooms. arXiv preprint arXiv:2009.11100 (2020)
17. Zawacki-Richter, O., Marín, V.I., Bond, M., Gouverneur, F.: Systematic review of research on artificial intelligence applications in higher education – where are the educators? Int. J. Educ. Technol. High. Educ. 16(1), 1–27 (2019). https://doi.org/10.1186/s41239-019-0171-0

Sensor Data Restoration in Internet of Things Systems Using Machine Learning Approach

Saugat Sharma, Grzegorz Chmaj(B), and Henry Selvaraj
University of Nevada, Las Vegas, USA
[email protected]

Abstract. The Internet of Things (IoT) consists of a network of physical objects, generally referred to as "things," connected for the purpose of exchanging data with other devices and systems over the internet. IoT systems include multiple sensors, software, and other technologies. Typically, sensors send data over the internet to the cloud, where the readings are analyzed to perform specific operations. However, sometimes, due to sensor or network failure, readings are not delivered to the cloud—these sensor readings are considered missing data. In this paper, we focus on the recovery/restoration of missing data using mean, median, mode, KNN and ANN approaches. Two data sets are used for experimentation, with the assumption that 20% of sensor readings are missing. KNN performed best on both datasets, with RMSE values of 0.00267605 and 0.00121473 respectively, and ANN performed worst. Among the statistical approaches, Mean showed the best result on the first dataset with an RMSE value of 0.03618, whereas on the second dataset, Mode showed the best result, with an RMSE value 0.0021 lower than Mean's.

Keywords: Data Imputation · Internet of Things · Machine learning · K-nearest neighbors (KNN) · Artificial Neural Networks (ANN)

1 Introduction

Internet of Things (IoT) systems consist of multiple machines, sensors, and devices interconnected over the cloud [1]. Devices in IoT send huge amounts of data, usually delivered to the cloud and then analyzed to produce an outcome used further by certain applications. For various reasons—network loss, sensor failures or maintenance—data may go missing, leading to consequences such as unavailability of data for research or timely decision making [2]. These missing data readings prevent instantaneous decision-making when it is needed, and hence missing data should be handled properly. One way to handle missing data is to completely remove the missing data points. This is viable when only a few values are missing in a large data set, but not in a small dataset or when a large share of values is missing. Another approach is to request retransmission of the missed data; however, this is not always possible (e.g. a sensor reading cannot be repeated for past conditions). Further, when certain actions must be taken instantly based on received sensor values, but these sensor values go missing,


then we cannot eliminate or ignore the missing data. In this case, the missing data must be imputed, as must cases where a huge amount of data goes missing in the data sets. Imputation means filling a missing value with the closest value that fits [3]. There are several approaches, including statistical and machine learning ones, to impute missing data. Mean, Median and Mode are the most used statistical imputation approaches [4]. In the mean imputation technique, missing data is replaced by the mean of the same feature over the whole dataset. This is a single imputation technique [5, 6]. Similarly, in the median imputation technique, the median is used to fill the missing data value, and the mode—the most frequent value—is used in the mode imputation technique. These are also single imputation techniques [6]. Some studies suggest mean imputation can be effective when only a few data values are missing, roughly less than 10% [7–9]. Nowadays, with the advance of machine learning, various machine learning approaches are being used for data imputation and prediction [4, 10]. We have also used KNN and ANN techniques in our experiments. Such machine learning algorithms have shown better results than the statistical approaches [11, 12]. Data can go Missing Completely at Random (MCAR), Missing Not At Random (MNAR) or Missing at Random (MAR). When all data values have an equal probability of going missing, the data are missing completely at random (MCAR). When the data points are not equally probable to go missing, the data are MNAR. When the probability of being missing is the same only within groups defined by the observed data, the data are missing at random (MAR). In our case, we removed 20% of the data from both of our datasets at random, thus obtaining MAR sets. In this paper, we use two datasets and make 20% of their data go missing. Then, we implement mean, median, mode, ANN and KNN algorithms to make up the missing data and compare the results based on Root Mean Square Error (RMSE).
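To make this setup concrete, the following sketch reproduces it on synthetic data using scikit-learn's imputers. The library choice and the synthetic data are our illustrative assumptions; only the 20% masking rate matches the setup described above:

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

rng = np.random.default_rng(0)
X = rng.random((500, 5))            # stand-in for a normalized sensor dataset

# Remove 20% of the readings at random, as in the paper's setup.
mask = rng.random(X.shape) < 0.20
X_missing = np.where(mask, np.nan, X)

def rmse_on_missing(imputer):
    """RMSE between imputed and true values, on the masked entries only."""
    X_hat = imputer.fit_transform(X_missing)
    return np.sqrt(np.mean((X_hat[mask] - X[mask]) ** 2))

for name, imputer in [
    ("mean",   SimpleImputer(strategy="mean")),
    ("median", SimpleImputer(strategy="median")),
    ("mode",   SimpleImputer(strategy="most_frequent")),
    ("knn",    KNNImputer(n_neighbors=5)),
]:
    print(name, rmse_on_missing(imputer))
```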

2 Related Work

Most of the existing work on data loss in Internet of Things systems concentrates on increasing the reliability of data transmission, where lost data is retransmitted. There are some approaches for cases in which data is lost and is not sent to the recipient again. Transport errors from the perspective of security are described in [13]. The authors present a solution that reduces the data restoration error rate to 0% with a small computational overhead. The proposed Time-Based Dynamic Response (TBDR) method negotiates randomized parameters and dynamic IDs of both parties during the data exchange to ensure accurate data restoration. The authors of [14] present a mathematical approach to noisy/incomplete sensor readings using linear models that approximate the related behaviors. The work includes the computation of the minimum data rate required for such models with imperfect measurements; the results are used for decision making. Estimation of the packet error rate for data distribution in IoT networks is proposed in [15], using a moving average during broadcasting. The estimated packet error rate is used to obtain a sufficient number of output packets to achieve a high data restoration rate. The authors of [16] use a General Regression Neural Network to restore data that is lost in the communication between IoT systems. Experiments were carried out using air pollution data and demonstrated high


accuracy of the GRNN method compared to Adaptive Boosting, Support Vector Regression and Linear Regression using gradient descent (with Mean Absolute Percentage Errors of 23%, 28%, 30% and 33%, respectively). An approach for minimizing/avoiding data loss by optimizing the transmission and energy utilization of each IoT node is presented in [17]. It is applied to fog IoT networks and achieves the desired reliability while also improving the energy consumption balance across system nodes. Our approach aims to efficiently restore sensor readings that are already lost and cannot be retransmitted, because the source of failure is a sensor problem or another hardware-related issue.

3 Solution Approach

For the data imputation techniques, we consider statistical and machine learning approaches.

Statistical Approach
The statistical approach includes Mean, Median and Mode (most frequent) imputation. In Mean Imputation (MI), the mean is obtained from the observed values of each variable, and the missing values of that variable are imputed with it. In the median imputation technique, the median of the entire feature column is used to impute the missing value. Another method is mode imputation, where the most frequent value of the entire feature column imputes the missing data.

Machine Learning Approach
For our experiments, we used K-nearest neighbors (KNN) and an Artificial Neural Network (ANN) for the data imputation.

a) K-nearest neighbors (KNN) algorithm
KNN is a supervised machine learning algorithm. In KNN, the k samples in the dataset that are closest in space are identified and used to estimate the missing data points [11]. For each sample, missing values are imputed using the mean value of the k neighbors found in the dataset [11]. KNN uses the Euclidean distance to calculate how close the samples are in space.

b) Artificial Neural Network (ANN) algorithm
ANN is a supervised machine learning algorithm that aims to simulate the behavior of biological systems composed of neurons. It has input layers, hidden layers, and output layers. The architecture of our model:


Table 1. ANN model used

Layer (type)       Output shape   Param #
Dense (Dense)      (None, 256)    2560
Dense_1 (Dense)    (None, 128)    32896
Dense_2 (Dense)    (None, 64)     8256
Dense_3 (Dense)    (None, 1)      65
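A minimal Keras sketch that reproduces the layer shapes and parameter counts in Table 1. The input dimension of 9 is inferred from the first layer's 2,560 parameters, while the activations, optimizer and loss are our assumptions, since they are not stated here:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Four dense layers matching Table 1; 9 input features give
# 9*256 + 256 = 2560 parameters in the first layer, as listed.
model = keras.Sequential([
    layers.Dense(256, activation="relu", input_shape=(9,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),  # regression output: the imputed sensor reading
])
model.compile(optimizer="adam", loss="mse",
              metrics=[keras.metrics.RootMeanSquaredError()])
model.summary()  # Total params: 43,777 as in Table 1
```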

From Table 1 we can conclude that there are: Total params: 43,777; Trainable params: 43,777; Non-trainable params: 0.

Evaluation: For evaluation, we use the Root Mean Square Error (RMSE) value. RMSE is the standard deviation of the prediction errors. First, we split the data into training and test sets; RMSE values are then obtained on the test set. The equation for the RMSE is:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left\| y(i) - \hat{y}(i) \right\|^{2}}{N}}$$

where N is the number of missing data points, y(i) is the i-th measurement (the actual data value), and ŷ(i) is the corresponding prediction for the i-th measurement (the predicted value for the missing data).

Data Sets
For our experiments we used two different data sets, described below. They differ in the number of features and the number of datapoints: one has a very large number of datapoints, about 36733, and the other has 100 datapoints. We want to compare how data imputation works on large and small data sets. Originally, both were complete and had no missed readings; for our experimentation, 20% of the data values were removed from each dataset, at random.

Gas Emissions Data Set (DS1)
We obtained the CO and NOx Gas Emissions data set from the UCI repository; it has 36733 data points [18]. Each data point contains measurements from 11 different sensors, collected over a period of 1 h in Turkey to study CO and NOx emissions. This data set does not contain missing data, but data were removed for the purpose of the experiment (Table 2).


Table 2. Sample of DS1 data

AT     AP      AH      AFDP    GTEP    TIT     TAT     TEY     CDP    CO   NOX
1.953  1020.1  84.985  2.530   20.116  1048.7  544.92  116.27  10.79  7.4  113.2
1.219  1020.1  87.523  2.393   18.584  1045.5  548.50  109.18  10.34  6.4  112.0
0.949  1022.2  78.335  2.0778  22.264  1068.8  549.95  125.88  11.25  3.6  88.14
1.007  1021.7  76.942  2.817   23.358  1075.2  549.63  132.21  11.70  3.1  87.07

The attributes are ambient temperature, ambient pressure, ambient humidity, air filter difference pressure, gas turbine exhaust pressure, turbine inlet temperature, turbine after temperature, compressor discharge pressure and turbine energy yield.

Thingspeak Dataset (DS2)
The other data set was obtained from the Thingspeak service [19]. It also originally had no missing data points, but we removed 20% of the data during data preparation. It contains data from 5 different sensors—pressure, controller temperature, humidity, temperature, and CO2 [19]—with a total of 100 datapoints for each of the five features. These data were collected in Germany for the purpose of air quality measurement and uploaded to Thingspeak, making them public (Table 3).

Table 3. Sample of DS2 data

Pressure  Cont_temp  Humidity  Temp   Co2
989.73    39.07      29.11     18.08  781.0
989.78    38.97      29.18     18.04  780.0
989.78    38.92      29.21     18.02  780.1
989.81    39.02      29.24     18.00  784.5

The ranges of the data were vastly different in both DS1 and DS2. For DS1, datapoints ranged from below 1 for one feature to above 1000 for another. This can slow convergence during training and may result in disproportionate weight assignment to some parts of the system. Thus, to bring the features into the same range, we applied min-max normalization to DS1. Since the ranges of the datapoints were also vastly different for DS2, we applied min-max normalization there as well. All the data used were therefore normalized using the min-max normalization technique:

$$\text{normalized data} = \frac{\text{data value} - \min(\text{data set})}{\max(\text{data set}) - \min(\text{data set})}$$


This maps the minimum value to zero, the maximum value to one, and the remaining values into the range (0, 1). Min-max normalization does not handle outliers well; fortunately, our datasets contained no such outliers, so we decided to use min-max normalization.
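A small sketch of this per-feature normalization step (equivalent to applying scikit-learn's MinMaxScaler; the sample values are taken from Table 2):

```python
import numpy as np

def min_max_normalize(X):
    """Scale each feature column to [0, 1]; assumes no constant columns."""
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return (X - col_min) / (col_max - col_min)

X = np.array([[1.953, 1020.1], [1.219, 1021.7], [0.949, 1022.2]])
print(min_max_normalize(X))
```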

4 Experiments and Results

For each dataset we apply the statistical approach and the ML approach. For DS2, the ANN algorithm was not tested because of data specifics.

1. Gas emission dataset (DS1):
a) KNN algorithm
We experimented with how well each algorithm performed for DS1 in terms of RMSE values. First, we experimented within KNN—which value of k gives the least RMSE—then with ANN, Mean, Median and Mode (most frequent). For DS1, Fig. 1 shows the RMSE values (y-axis) for various values of k (x-axis). First, the RMSE value for k = 1 was obtained; similarly, RMSE values for all k from 1 to 20 were obtained and plotted. The RMSE value was high at k = 1 and slowly decreased until k = 5, where it reached its minimum, 0.00267605, and increased thereafter. Thus, for data imputation using KNN, we can choose k = 5 for this dataset to obtain the best result—the most plausible values for the missing data.

b) ANN algorithm
The RMSE plot of training and validation using the ANN approach is shown in Fig. 3. The DS1 dataset was partitioned with an 80/20 training/testing split, so the 20% testing data had sensor readings missing; a further 20% of the data was used as a validation set. From Fig. 3, we can see that the RMSE values keep decreasing and become lowest at the final epochs (30000). The validation RMSE was 1.2457 and the training RMSE was 1.1047. These values are very high compared to our KNN approach. Thus, we can say that ANN will give less plausible values for our missing data.
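The sweep over k summarized in Fig. 1 can be reproduced with a loop like the following (a sketch on synthetic stand-in data, using scikit-learn's KNNImputer rather than the paper's exact pipeline):

```python
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
X = rng.random((500, 5))                # synthetic stand-in for DS1
mask = rng.random(X.shape) < 0.20       # 20% of readings go missing
X_missing = np.where(mask, np.nan, X)

rmse_per_k = {}
for k in range(1, 21):
    X_hat = KNNImputer(n_neighbors=k).fit_transform(X_missing)
    rmse_per_k[k] = np.sqrt(np.mean((X_hat[mask] - X[mask]) ** 2))

best_k = min(rmse_per_k, key=rmse_per_k.get)
print(best_k, rmse_per_k[best_k])       # the paper found k = 5 optimal on DS1
```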

Fig. 1. RMSE as function of k (DS1)

Fig. 2. RMSE for statistical approach (DS1)


c) Mean, Median and Mode:
Mean has the lowest RMSE value, whereas Mode has the highest—the comparison between the statistical approaches in terms of RMSE values is shown in Fig. 2. Therefore, we can say that the mean imputation approach delivered the best quality of data restoration, with an RMSE value of 0.03618, compared to median and mode with RMSE values of 0.037589 and 0.05939, respectively.

Fig. 3. Training and validation loss – RMSE plot

d) Comparison chart for KNN, ANN, Mean, Median and Mode:
Table 4 compares the RMSE values of KNN, ANN, Mean, Mode and Median for DS1. ANN has the highest RMSE value and KNN the lowest. Among the statistical approaches, Mean showed a better result than Median and Mode, with Mode showing the worst result; however, the difference in performance between Mean and Median was not large. Overall, KNN was far better than the rest and provides the most plausible values for the missing data, while ANN was far worse.

Table 4. Comparison among five different imputation approaches

Approach   RMSE values
KNN        0.002676
ANN        1.091
Mean       0.03618
Median     0.037589
Mode       0.05939


2. Thingspeak dataset (DS2):
a) KNN algorithm
For DS2, we used the KNNImputer function provided by scikit-learn. Figure 4 shows the RMSE values (y-axis) for various values of k (x-axis). First, the RMSE value for k = 1 was obtained; similarly, RMSE values for all k from 1 to 20 were obtained and plotted. Contrary to DS1, in DS2, k = 1 showed the least RMSE value, 0.00121473, and the RMSE increased after that. This implies that for a dataset with very few data points, k = 1 performed best. Since this dataset has only 100 data values, we decided not to use ANN.

b) Mean, Median and Mode:
Figure 5 shows the comparison between the statistical approaches in terms of RMSE values. Mean and Mode have quite similar values—Mean was greater than Mode by 0.0021—whereas Median has the highest RMSE value. From this we can say that the Mean or Mode imputation approach will provide the most plausible values, with RMSE values of 0.308 and 0.3059, respectively.

Fig. 4. RMSE value against k (DS2)

Fig. 5. RMSE for statistical approach (DS2)

c) Comparison chart for KNN, Mean, Median and Mode:
Table 5 compares the RMSE values of KNN, Mean, Median and Mode. The lower the RMSE value, the better the result. Table 5 shows that the RMSE value for KNN is the lowest and Median's is the highest; among the statistical approaches, Mode was better and Median worse. This illustrates that KNN provides more plausible values for the missing data than the others. Thus, for this small data set with 20% of its values missing, KNN gave the most suitable replacement values compared to the other three.

A) Comparison between DS1 and DS2:
KNN gave the best result on both DS1 and DS2 (Fig. 6). Further, the RMSE values for KNN on DS1 and DS2 differed by only about 0.0013. However, mean, median and mode gave higher prediction errors, especially on DS2: the smaller the dataset, the higher the error rate for the statistical approaches. Moreover, ANN was the worst among all the approaches and gave the highest prediction error by a huge margin. Thus, looking at two different types of data sets—one with many data values (DS1) and the other with few (DS2)—we can say that KNN gives good predictions of missing data in both kinds of data sets.


Table 5. Comparison chart for four different imputation approaches

Approach   RMSE values
KNN        0.0013
Mean       0.308
Median     0.3349
Mode       0.3059

Fig. 6. Comparison between DS1 and DS2 (log scale)

5 Conclusions

We experimented with data imputation techniques using five different approaches—KNN, ANN, Mean, Median, Mode. The KNN approach outperformed the others on both the large dataset of about thirty-six thousand datapoints and the small one with one hundred datapoints. This shows that KNN can be used as an imputation technique when sensor data goes missing. However, different values of k performed with different efficiency: for the first data set, k = 5 gave the least RMSE value, which increased after that; for the second dataset, which has few data points, k = 1 gave the lowest RMSE value. The mean imputation technique showed a good result compared to the other two statistical approaches. ANN was the worst among all the methods used. These experiments were done with 20% of the data missing; in future, the work can be extended by experimenting with these algorithms on higher percentages of missing data and comparing which performs best.

References

1. Sharma, S., Chmaj, G., Selvaraj, H.: Machine learning applied to internet of things applications: a survey. In: Borzemski, L., Selvaraj, H., Świątek, J. (eds.) ICSEng 2021. LNNS, vol. 364, pp. 301–309. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-92604-5_27
2. Rani, S., Solanki, A.: Data imputation in wireless sensor network using deep learning techniques. In: Khanna, A., Gupta, D., Pólkowski, Z., Bhattacharyya, S., Castillo, O. (eds.) Data Analytics and Management. LNDECT, vol. 54, pp. 579–594. Springer, Singapore (2021). https://doi.org/10.1007/978-981-15-8335-3_44


3. Rubin, D.B.: Inference and missing data. Biometrika 63(3), 581–592 (1976). https://doi.org/10.1093/biomet/63.3.581
4. Jerez, J.M., et al.: Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Artif. Intell. Med. 50(2), 105–115 (2010). https://doi.org/10.1016/j.artmed.2010.05.002
5. Zhang, Z.: Missing data imputation: focusing on single imputation. Ann. Transl. Med. 4(1), 9 (2016). https://doi.org/10.3978/j.issn.2305-5839.2015.12.38
6. Kabir, G., Tesfamariam, S., Hemsing, J., Sadiq, R.: Handling incomplete and missing data in water network database using imputation methods. Sustainable and Resilient Infrastructure 5(6), 365–377 (2020). https://doi.org/10.1080/23789689.2019.1600960
7. Graham, J.W., Hofer, S.M., Donaldson, S.I., MacKinnon, D.P., Schafer, J.L.: Analysis with missing data in prevention research. In: The Science of Prevention: Methodological Advances from Alcohol and Substance Abuse Research, pp. 325–366. American Psychological Association, Washington (1997). https://doi.org/10.1037/10222-010
8. Tsikriktsis, N.: A review of techniques for treating missing data in OM survey research. J. Oper. Manag. 24(1), 53–62 (2005). https://doi.org/10.1016/j.jom.2005.03.001
9. Raymond, M.R.: Missing data in evaluation research. Eval. Health Prof. 9(4), 395–420 (1986). https://doi.org/10.1177/016327878600900401
10. Jadhav, A., Pramod, D., Ramanathan, K.: Comparison of performance of data imputation methods for numeric dataset. Appl. Artif. Intell. 33(10), 913–933 (2019). https://doi.org/10.1080/08839514.2019.1637138
11. Batista, G.E., Monard, M.C.: A study of K-nearest neighbor as an imputation method. His 87(251–260), 48 (2002)
12. Batista, G.E.A.P.A., Monard, M.C.: An analysis of four missing data treatment methods for supervised learning. Appl. Artif. Intell. 17(5–6), 519–533 (2003). https://doi.org/10.1080/713827181
13. Lin, C.-T., Tsai, C.-Y., Kao, C.-K.: Lower power data transport protection for Internet of Things (IoT). In: IEEE Conference on Dependable and Secure Computing 2017, pp. 468–470 (2017). https://doi.org/10.1109/DESEC.2017.8073865
14. Bekiroglu, K., Srinivasan, S., Png, E., Su, R., Lagoa, C.: Recursive approximation of complex behaviours with IoT-data imperfections. IEEE/CAA J. Automatica Sinica 7(3), 656–667 (2020). https://doi.org/10.1109/JAS.2020.1003126
15. Jeon, S.Y., Ahn, J.H., Lee, T.-J.: Data distribution in IoT networks with estimation of packet error rate. In: 2016 10th International Conference on Next Generation Mobile Applications, Security and Technologies (NGMAST), pp. 94–98 (2016). https://doi.org/10.1109/NGMAST.2016.25
16. Izonin, I., Kryvinska, N., Vitynskyi, P., Tkachenko, R., Zub, K.: GRNN approach towards missing data recovery between IoT systems. In: Barolli, L., Nishino, H., Miwa, H. (eds.) INCoS 2019. AISC, vol. 1035, pp. 445–453. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-29035-1_43
17. Abkenar, F.S., Jamalipour, A.: A reliable data loss aware algorithm for fog-IoT networks. IEEE Trans. Veh. Technol. 69(5), 5718–5722 (2020). https://doi.org/10.1109/TVT.2020.2981970
18. Dua, D., Graff, C.: UCI Machine Learning Repository (2017). http://archive.ics.uci.edu/ml
19. tthosta: CO2 measurement. In: CO2 Measurement – ThingSpeak IoT. https://thingspeak.com/channels/1350261. Accessed 2 May 2022

Potential of eXtended Intelligence (XI) for Extending Human Expression from Digitization of Analog Elements

Takayuki Fujimoto(B)
Toyo University, Tokyo, Japan
[email protected]

Abstract. In recent years, AI technology has developed dramatically and become more versatile. AI technology has been integrated into every aspect of our lives. Many people use AI frequently without being aware that they are using AI. The use of AI has led to the digitization of every analog tool in society, enriching our cultural life. Old antique-like machines are digitized and reproduced as software, and AI algorithms clean up and reproduce analog devices that had slipped into the position of relics of the past. The proliferation of AI and the digitization of content and tools have rapidly increased and expanded the convenience of our lives. On the other hand, digital materials are easy to copy and it is difficult to guarantee originality. It is no exaggeration to say that the value of content is equal to its rarity, that is, the presence or absence of guaranteed originality. In this sense, our AI digital society is causing the loss of originality and a decrease in the value of content itself in exchange for convenience. For creators there is no difference in the effort required to produce content, whether digital or analog. However, as long as content is digital, creators' efforts will not be rewarded. These days, NFTs can provide a guarantee of originality and uniqueness to digital content. However, the daily development of AI technology will lead to the problem of the originality of the AI itself, which produces digital content. Is there any originality in the AI itself that produces content with guaranteed originality? As AI is a computer program, an algorithm, it is ultimately nothing more than source code. It is easy to copy and mass-produce AI that produces content with originality. Given these issues, this paper focuses on XI as a post-AI concept, introduces the authors' efforts, and discusses XI's potential and importance.

Keywords: eXtended intelligence · XI · EI · Artificial intelligence · AI · Post-AI · Digital value · Originality · Creativity support

1 Introduction: Digital Scarcity?

The digitization of content has various merits and demerits. The digitization of analog machines, tools, and content has been rapidly gaining popularity in recent years due to digitization's greatest advantages of high processing and storage capacity. People can enjoy Hollywood movies in their original resolution without any deterioration in quality,


and can experience antique analog machines on digital computers no matter where they are. On the other hand, the digitization of analog materials has also led to many problems due to its ease of use. For example, digital content is easy to reproduce and it is difficult to determine originality. Digital content and digitized materials can be easily and completely reproduced by anyone without degrading the original. In today's digital society, the differences between originals and reproductions have almost disappeared; as a result, the originality of content has become blurred, making it difficult to ensure the uniqueness of content. Van Gogh's paintings are valuable because there is only one of each in the world. If Van Gogh's original work could be mass-produced inexhaustibly, the paintings would have no value. The value of content is equal to its scarcity, and the digitization of content induces the loss of content scarcity. Needless to say, a decrease in scarcity is a decrease in the value of the content itself. In other words, it is inevitable that it will be difficult to increase the artistic and economic value of digital content. This undeniable reality shows the social problem with digital content, because for creators, whether the content is digital or analog, there is always a "first one," or original. The effort required by creators to produce an original work is the same whether it is digital or analog; the only difference is in the tools and media used. However, from the moment of its creation, digital content becomes subject to easy copying, rapidly diminishing its scarcity and value.

2 Guarantee of Digital Value and Originality

Recently, NFTs (non-fungible tokens) have been gaining attention as a way to add value to digital content [18]. An NFT is a mechanism that provides proof of originality for digital content that can be easily copied, protects the originality of digital content, and recognizes its scarcity value. Since the advent of NFTs, digital content that was previously not fully valued has taken on the same artistic and economic value as fine art, and a variety of digital content has rapidly increased in price. It was global news when Beeple produced and sold the NFT digital content "Everydays - The First 5000 Days" for a record $69.35 million at the auction house Christie's in March 2021 [8]. Of course, it is undeniable that a large part of the recent NFT craze is a speculative money game by capitalists looking for a place to invest [19]. On the other hand, it is true that NFTs have given a ray of hope to digital content and digital content creators, who until then had been unable to enhance the value of their content due to the difficulty of proving its originality. Today, the number of digital content creators is rapidly increasing. The role of NFTs, which guarantee original value to digital content, is expected to transform from a speculative object into a system that guarantees the value of works for creators (Fig. 1).


Fig. 1. Everydays - the first 5000 days (Beeple, 2021)

3 Value of Analog Content and Value of Digital Content

Digitization means losing the value of the original and making it difficult to prove originality. Thus, the digitization of analog materials means that originality is lost in exchange for convenience, making it difficult to increase value. However, content and other creative productions, whether they are analog or digital, are always original and unique to their creators. Even with content that has been reproduced so often that the existence of the original cannot be confirmed, there is always a "first one" when it was created. In other words, an original always exists for all content, regardless of media. Despite differences in production environments and media, the energy and hard work required by creators to produce content is the same. Digital content, which is easier to reproduce, is no less labor intensive to produce than analog content. With this in mind, we have examined the possibility of digitizing analog


materials. We have been working on various developments and designs for digitizing analog content [3–6, 12]. For example:
• An e-book system that can express the deterioration and staining of paper
• An e-book interface system using physical contact with paper
• Disposable camera applications that can only take as many pictures as there are sheets of film [7]
• A mobile interface that replicates the feel of using a feature phone on a smartphone
• 19th century pre-cinema imaging device applications (Fig. 2) [1, 2]

Fig. 2. Examples of development

These examples are all digital software with mundane functionality compared to today’s applications and digital devices. Rather, our digitized applications limit functionality and remove the convenience that only digital content can offer. This is because we believe that when digitizing analog content and analog machines, daring to limit their functionality can reaffirm the appeal of analog content. Of course, that does not mean downgrading through digitization. These restrictions are intended to bring out human creativity and emotions, and in that way, the users can reaffirm the appeal of analog content and analog machines. In other words, the purpose of our challenge is to digitize in order to rediscover techniques and expressions that have been forgotten as old expressions or old technologies due to the development of new devices and content, and to provide opportunities to discover completely new expressions and technologies today.


With the rapid development of digital technology, many human skills and expressions are being lost in exchange for convenience. There are many fascinating analog techniques and expressions that existed in the past that are not used today and have been forgotten. Today, digital technology is saturated, making it difficult to invent new techniques and expressions. Thus, we believe that by unearthing the forgotten legacies of the past, we can find the seeds of new technologies and expressions. In short, we believe that, in today’s internet age, new technologies and expressions can be realized through the rediscovery of lost technologies rather than their emergence out of nowhere.

4 Extended Intelligence (XI / EI) as a Post-AI

The development of digital content and devices has lately become easier with the use of AI technology. However, the greater the influence of AI in content creation, the more its originality is reduced. NFTs are no exception: even if an NFT allows for "originality," the AI or computer that controls the production of that original content can itself easily be duplicated. Therefore, even if the tools, media, and productions are digital, it is essential for a human being to be the creator in order to create original value. Today, when the value of content is blurred, what is required in the development of digital content is not AI, a computer system that replaces human creativity, but eXtended intelligence (abbreviated as XI or EI; in this paper we use XI for simplicity), a computer system that is an extension of human creativity [9–11]. XI is a keyword in the post-AI field that has attracted much attention in recent years [13, 14, 16]. The concept of XI is quite simple [17]. The goal of AI technology development is to create computers that approach human intelligence and can express human-like intelligence that surpasses human intelligence. If a complex thinking function that enables human-like decision-making were combined with computing speed and memory capacity far surpassing those of humans, it would create an AI that enables rational decision-making far beyond human ability [15]. In general, that is the ideal AI. However, at the current stage, it seems to be only an assumption that the ideal of AI is something that will be an alternative to human beings. Of course, this assumption has been called into question over the years as the singularity of AI; the singularity has rather been thought of as the diminution of human value. In contrast, XI is the concept of an intelligent computer system that contributes to increasing human value, because XI is not intended to replace human intelligence but to extend it. In intelligent computing based on the concept of XI, the creative subject is the human being; AI or intelligent computers will never be the creative subject. Humans can extend and enrich their own creativity by using advanced intelligent computer systems. XI as an intelligent computer will be developed to support humans' independent creative activities. While AI technology based on data science continues to develop, the value of computers as tools is becoming blurred. Is there any value in paintings generated by an AI that can create paintings that look as if they were drawn by a human? With algorithms, they could be mass-produced inexhaustibly. What should we think about the relationship between the creativity of the AI that creates paintings and the creativity of the researcher


who developed the AI? With the recent development of AI, there are many issues to consider, and XI will be a key concept for addressing them. Research on XI as a key concept for exploring the post-AI era is expected to deepen in the future (Fig. 3).

Fig. 3. Difference between AI and XI

5 What is XI?

The study of XI is still in its infancy, but several important papers have appeared in recent years. One of the earliest texts was "Extended Intelligence" by Joi Ito, published as a discussion at the MIT Media Lab in 2016. One passage in this paper sums up the basic stance of artificial and extended intelligence:

We propose a kind of Extended Intelligence (EI), understanding intelligence as a fundamentally distributed phenomenon. As we develop increasingly powerful tools to process information and network that processing, aren't we just adding new pieces to the EI that every actor in the network is a part of? Marvin Minsky conceived AI not just as a way to build better machines, but as a way to use machines to understand the mind itself. In this construction of Extended Intelligence, does the EI lens bring us closer to understanding what makes us human, by acknowledging that part of what makes us human is that our intelligence lies so far outside any one human skull?

In their paper, Joi Ito et al. present 16 specific initiatives for XI that they were working on. They are all interdisciplinary, transcending positions, specialties, and experiences, and are diverse and complex approaches to exploring the next generation of intelligence


with computer systems. These initiatives are thought-provoking for post-AI research and development. From the 1980s there was a theme called "Creativity Support" in applied research in the fields of AI and cognitive science. Creativity Support is a field of study in which the machine power of computers is used to complement human creativity and technology, and to support the expansion of a uniquely human ability: creativity. This field can be considered the earliest XI research. The biggest difference between Creativity Support and XI is the difference in computer performance and the level of AI technology used in the studies. The computers used in Creativity Support research in the 80s and 90s were not as fast as today's computers, and the algorithms considered intelligent computer systems were not as complex as they are today. In other words, neither the computers nor the intelligent computer systems of the time were capable enough to complement and support human intellectual activity and creativity. It was not until the 2010s that this field merged with sophisticated artificial intelligence technology and developed dramatically. Today, it is being systematized in practice through a new concept and approach called XI. As mentioned earlier, our research team is working on "digitizing analog machines and tools" as intelligent computer systems to extend human creativity and technology. Of course, these are not intended to provide a nostalgic experience through software imitating the look and function of analog materials. They are challenges based on the XI concept to extend and enrich uniquely human creativity through the use of intelligent computer systems. Various approaches have been proposed through these attempts. In our case, we dare to incorporate the inconveniences and limitations of analog equipment into the advanced and highly convenient processing capabilities of computers, in an attempt to extend uniquely human intellectual abilities.

Fig. 4. XI concepts and old proverbs in Japanese


There is an old Japanese proverb, which originally came from China, that says, "窮すれば通ず" (kyusureba tsuzu: there is always some way out of a difficulty if you really look for one). This is paraphrased today as "necessity is the mother of invention." We believe that the basic idea of XI is connected to this philosophy (Fig. 4). Today, many examples of digitization reproduce the controls, look, and feel of analog equipment. However, the appeal of analog equipment is not necessarily anachronism or nostalgia. Rather, it is about understanding the history of how human creativity has been expanded by the "limitations" and "inconveniences" unique to analog equipment and devices. Utilizing the convenience of computers and machines while ensuring humans remain the main actors of creativity and intellectual activity will allow the development of a better society in the future, where digital and analog, humans and AI, will inevitably coexist. Of course, there are no dreams or fears of AI singularity in that society. The role of an intelligent computer in this future is a complementary function for human creativity, not a replacement for human imagination. This is the greatest appeal and potential of XI.

References

1. Kuwahara, N.: Digital optique: digital reproduction of 19th century pre-cinema movie equipment. In: The 29th International Conference on Systems Engineering (ICSEng 2022) (2022)
2. Kuwahara, N.: A smartphone application that recreates the pre-cinema system "theatre optique". In: The 18th IEEE Transdisciplinary-Oriented Workshop for Emerging Researchers (2021)
3. Fujimoto, T.: Integrated eRosary: design of a rosary application focusing on functionality. International Journal of Arts and Technology 13(4), 300–314 (2021)
4. Fujimoto, T.: Ideology of AoD: analog on digital - operating digitized objects and experiences with analog-like approach. In: 2018 7th International Congress on Advanced Applied Informatics (IIAI-AAI), IEEE, pp. 901–906 (2018)
5. Fan, Z., Fujimoto, T.: Digital application of analogue-like time perception mechanism based on analogue on digital theory. Int. J. Internet Technol. Secured Trans. 11(5–6), 518–528 (2021)
6. Li, X., Fujimoto, T.: Examination of problems and improvements of virtual library systems. Lecture Notes in Networks and Systems 182, 348–357 (2021)
7. Fan, Z., Fujimoto, T.: Design of camera application with physical sensation based on analog on digital (AoD) theory. Information Engineering Express 6(1), 27–38 (2020)
8. Beeple: EVERYDAYS: THE FIRST 5000 DAYS. CHRISTIE'S (1981). https://onlineonly.christies.com/s/beeple-first-5000-days/beeple-b-1981-1/112924
9. Ito, J.: Extended Intelligence (2016). https://doi.org/10.21428/f875537b
10. Ito, J.: Forget about artificial intelligence, extended intelligence is the future. MIT Media Lab (2019). https://www.media.mit.edu/articles/forget-about-artificial-intelligence-extended-intelligence-is-the-future/
11. Karachalios, K., Ito, J.: Human intelligence and autonomy in the era of 'extended intelligence'. The Council on Extended Intelligence. https://globalcxi.org/wp-content/uploads/CXI_Essay.pdf
12. Watanabe, Y., Tanaka, Y., Fujimoto, T.: Prototyping and evaluation of drawing software to provide analog-like feeling by sound. In: Selvaraj, H., Chmaj, G., Zydek, D. (eds.) ICSEng 2020. LNNS, vol. 182, pp. 297–306. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-65796-3_28

Potential of eXtended Intelligence (XI)

39

13. Esau, N., Kleinjohann, B., Kleinjohann, L., Stichling, D.: MEXI: Machine with Emotionally eXtended Intelligence. In: HIS, pp. 961–970 (2003) 14. Gioti, A.-M.: From artificial to extended intelligence in music composition. Organised Sound 25(1), 25–32 (2020) 15. Cabitza, F.: From Artificial Intelligence to Humanistic Intelligence and then Extended Intelligence (2020). http://sigchitaly.eu/wp-content/uploads/2020/07/Cabitza_HCI4AI_Syllabus_ paper_8.pdf 16. Day, G.S., Schoemaker, P.J., Snyder, S.A., Kleindorfer, P.R., Wind, Y.J., Gunther, R.E.: Extended intelligence networks: minding and mining the periphery. The Network Challenge: Strategy, Profit, and Risk in an Interlinked World, pp. 277–295 (2009) 17. Gunasekaran, S.S., Mostafa, S.A., Ahmad, M.S.: Personal and extended intelligence in collective emergence. In: 2013 13th International Conference on Intelligent Systems Design and Applications, IEEE, pp. 199–204 (2013) 18. Wang, Q., Li, R., Wang, Q., Chen, S.: Non-fungible token (NFT): overview, evaluation, opportunities and challenges. Cryptography and Security (2021). https://doi.org/10.48550/ arXiv.2105.07447 19. Ante, L.: The non-fungible token (NFT) market and its relationship with bitcoin and ethereum. FinTech 1(3), 216–224 (2022). https://doi.org/10.3390/fintech1030017

ML-Based System Failure Prediction Using Resource Utilization

Ittipon Rassameeroj1(B), Naphat Khajohn-udomrith1, Mangkhales Ngamjaruskotchakorn1, Teekawin Kirdsaeng1, and Piyorot Khongchuay2

1 Faculty of Information and Communication Technology, Mahidol University, Nakhon Pathom, Thailand
[email protected], {naphat.kha,mangkhales.nga,teekawin.kir}@student.mahidol.ac.th
2 Inspektion Co. Ltd., Nonthaburi, Thailand
[email protected]

Abstract. Digital transactions are growing exponentially, with no sign of slowing. Unfortunately, most online service providers cannot operate 24/7 due to the limitations of system components. Online service providers would benefit greatly if an accurate system failure prediction could be obtained: the adverse effects of computer failure can be mitigated if a proper prediction is made beforehand. In this paper, we propose a simple model training approach to detect failures that might arise in a system by parsing log files and conducting a probabilistic analysis of future performance values in advance. We utilize a recurrent neural network (RNN), namely Long Short-Term Memory (LSTM), to predict system failure by computing hardware resource utilization values and using the predicted future values of the log data as a system benchmark. The significant inputs considered for the calculation are the utilization of CPU, memory (MEM), disk (DISK), and network (NET). Apart from utilization, another essential input is the system callout, an alert signal that informs the information system whenever it should reject incoming transactions in order to keep the server systems running. Keywords: Failure prediction · Resource utilization · Log files · NMON · Machine learning · Recurrent neural networks

1 Introduction

At present, the digital world is deeply embedded in human society, and transaction log data is readily accessible whenever it is needed. Because online services are so attractive, a massive number of people want an account with an online service provider for general use: almost everything that can be done online is more convenient and faster than its traditional counterpart. When an information system has an enormous number of users on a regular basis, it must be implemented meticulously and developed continuously in order to handle that load every day. Unfortunately, even carefully engineered information systems, especially banking systems, still have vulnerabilities that can render the system unusable or cause malfunctions for short periods, which typically happens on lottery dates. It would be far better if the failures and obstacles that make a banking system unavailable could be predicted, or an optimal solution found, in advance. We regard this high-priority problem as a serious threat that should be addressed as early as possible, so we decided to tackle it by implementing and developing an efficient assistance system combining AI and ML to detect and eradicate the vulnerabilities that cause banking system failures.

Online banking systems are expected to serve nonstop, 24/7, on a regular basis. Individuals use online connections to transfer money or digital money, whether within one bank or between different banks. Hence, several bank service providers in Thailand share a middleware service that facilitates money transfers between all financial institutions. Because of the heavy use of this convenient transfer system, the banking system requires downtime for internal maintenance and improvement; this usually occurs on the first and the sixteenth of each month, mainly because lottery winnings are paid via online transfer. Accordingly, we decided to apply AI and ML concepts to support a prediction process that can cooperate with the banking systems of financial institutions. We hope the proposed solution can serve as an exemplary guideline for researchers who read this article and clarify the transaction issues that cause system failure.

2 Related Work

The authors of [14] look for a way to model cellular network service problems and present the results of fault prediction models. Because the behaviors of cellular networks are unpredictable, the paper uses a Bayesian network to model them. The data consist mainly of alerts created during the operation of the monitored network elements, or information received from prior correlation operations. To investigate the models' performance, extensive simulation studies were carried out in a variety of cellular network scenarios. [11] presents a simple approach to detecting failure by analyzing log files: when log analysis finds something wrong, an alert SMS or email can be sent to the user before the problem occurs. For the train-test model, the authors used Long Short-Term Memory (LSTM) [13] for prediction; using LSTM for time-series forecasting helped obtain lower errors for future timestamps. [16] is another paper that used an LSTM approach, focusing on anomaly detection. In terms of accuracy and efficiency, this method outperforms three prominent anomaly detection algorithms: one-class SVM, GMM, and principal component analysis. The LSTM model works very well with log lines: it can infer the anomaly probability of each log line and estimate the probability distribution of the next character. However, the method has a weakness: the baselines trained with these vectors will not produce appropriate results if the statistical method cannot directly reflect the pattern of abnormal logs. Finally, [12] proposed a CNN-LSTM model to forecast stock prices, which has the advantage of assessing correlations among time-series data through its memory function. The paper showed that CNN-LSTM is a good model for stock price forecasting, including for other time-series data, and can serve as a useful reference for investors looking to maximize their returns. Moreover, in comparison to MLP, CNN, RNN, LSTM, and CNN-RNN, the CNN-LSTM model had the highest prediction accuracy and performance.

3 Methodology

This section describes the techniques used in this paper. First, we describe how to prepare the dataset: the data are converted to spreadsheet form and loaded into a dataframe for the utilization calculation. Second, we present how to deal with imbalanced data. Finally, we explain the models used in this paper and how the most suitable one was selected for future prediction.

3.1 Data Preparation

3.1.1 Data Collection

First of all, we prepare the data by monitoring system performance. All of the data come from the log files, or NMON files, of servers that experienced problems; we selected a few days on which failures occurred as the dataset. NMON (Nigel's Monitor) is a computer performance monitoring tool for the AIX and Linux operating systems. The NMON tool has two modes: it either displays the performance stats on-screen in a condensed format, or saves the same stats to a CSV data file for later graphing and analysis, to aid the understanding of computer resource use, tuning options, and bottlenecks. The CSV files are processed with the NMON analyzer, a tool for analyzing performance data captured by NMON that allows a performance specialist to view the data in spreadsheet form. The NMON analyzer is designed to work with the latest version of NMON, but it is also tested with older versions for backward compatibility; it is updated whenever NMON is updated, and at irregular intervals for new functions. After the data are exported to a spreadsheet, they are ready to be imported into the dataframe.

3.1.2 Resource Utilization

After the data are in the dataframe, we calculate the system performance in utilization format.

Utilization is a computer's usage of processing resources, or the amount of work handled by the hardware. Actual utilization varies depending on the amount and type of managed computing tasks: certain tasks require heavy CPU time, while others require less because of non-CPU resource requirements. Utilization helps us find the cause of a system failure.

Table 1. Utilization formula

CPU    User% + Sys% + Wait%
MEM    log(swap used) + log(mem used)
DISK   read total + write total + xfer total
NET    NETPKT-reads/s + NETPKT-writes/s + NET-read-KB/s + NET-write-KB/s

Table 1 shows how each utilization value is calculated. Firstly, CPU utilization is the total CPU performance used during the day, calculated as the sum of the user, system, and wait percentages. Secondly, MEM utilization is an average statistic derived from the percentage of memory in use at a given moment, averaged over the reporting interval; its formula is the log of swap used plus the log of memory used. NMON provides only memory total, memory free, and swap used. Memory total is the total amount of memory available to the system, memory free is the amount not in use, and swap used is governed by the kernel parameter that defines how much memory content the Linux kernel copies to swap [5]. The one remaining quantity, memory used, is found by subtracting free memory from total memory. Thirdly, DISK utilization measures the disk performance used during the day; it is calculated by adding the disk reads, disk writes, and disk transfers (xfer total) of all storage together. The last one is NET utilization, the total amount of network reading and writing. There are two kinds of read/write values: the total numbers of packets read (reads/s) and written (writes/s) per second from the NETPACKET value (Adapter only), and the total kilobytes read (read-KB/s) and written (write-KB/s) per second from the NET value (including SP Switch and Adapter). The networks we observed include both adapter and switch, but some servers have only an adapter, so to make the approach applicable to every server we add NET and NETPACKET together. NET utilization is therefore calculated by adding total NETPKT-reads/s, total NETPKT-writes/s, total NET-read-KB/s, and total NET-write-KB/s.
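To make the formulas of Table 1 concrete, the following is a minimal pandas sketch of the calculation. The column names are hypothetical stand-ins, not the labels actually produced by the NMON analyzer, which names its sheets and columns differently.

```python
import numpy as np
import pandas as pd

def compute_utilization(df: pd.DataFrame) -> pd.DataFrame:
    """Derive the four utilization series of Table 1 from an NMON-analyzer
    spreadsheet loaded into a dataframe. Column names are illustrative
    assumptions; real NMON exports label their columns differently."""
    out = pd.DataFrame(index=df.index)
    # CPU: user + system + I/O-wait percentages
    out["cpu"] = df["user_pct"] + df["sys_pct"] + df["wait_pct"]
    # MEM: log(swap used) + log(mem used), where mem used = total - free
    mem_used = df["mem_total"] - df["mem_free"]
    out["mem"] = np.log(df["swap_used"]) + np.log(mem_used)  # assumes positive values
    # DISK: reads + writes + transfers summed over all storage
    out["disk"] = df["read_total"] + df["write_total"] + df["xfer_total"]
    # NET: packet rates (NETPACKET) plus throughput in KB/s (NET)
    out["net"] = (df["netpkt_reads_s"] + df["netpkt_writes_s"]
                  + df["net_read_kb_s"] + df["net_write_kb_s"])
    return out
```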

3.2 Tracking Class Imbalance

Once the utilization values are calculated, the next issue we faced was imbalanced data: the callout label takes only the values 0 (system did not fail) and 1 (system failed). A callout is the situation that occurs when the system's transactions overflow and its performance reaches the usage limit; the system then calls the network to break off all transactions so the servers can be rebooted. The number of callouts that occurred is very small compared with the number that did not. To address this, we generated synthetic samples, starting by selecting random data from the minority class. This method is the Synthetic Minority Oversampling Technique (SMOTE) [15]. SMOTE is an oversampling technique in which synthetic samples are created for the minority class, as opposed to random oversampling, which simply copies random examples from the minority class [7]. This algorithm avoids the overfitting caused by random oversampling, which makes it suitable for our datasets. Figure 1 shows the SMOTE process: it selects samples from the minority class, computes the Euclidean distance between a random sample and its k nearest neighbors, and adds synthetic points between the selected point and its neighbors [3].

Fig. 1. The Process of SMOTE [6]
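A minimal sketch of this oversampling step, assuming the imbalanced-learn library and toy stand-in data; the paper's real features are the NMON utilization values.

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

# Toy stand-in for the utilization features (X) and callout labels (y):
# 0 = no failure (majority class), 1 = failure (rare minority class).
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 4))            # cpu, mem, disk, net utilization
y = (rng.random(1000) < 0.02).astype(int)  # ~2% failures

# SMOTE interpolates between each minority sample and its k nearest
# minority neighbors to create synthetic failure examples (Fig. 1).
X_res, y_res = SMOTE(k_neighbors=5, random_state=42).fit_resample(X, y)
print("before:", Counter(y), "after:", Counter(y_res))
```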

3.3 Model Selection

After classifying all the utilization values, we decided to use a recurrent neural network, namely Long Short-Term Memory (LSTM), as the model that predicts the future value of each utilization. This is the first stage of the process: the model provides the future values. Once the first stage is complete, the predicted (future) values are passed to a supervised learning model trained to predict the callout situation, as shown in Fig. 2.

Fig. 2. Architecture of Prediction Process


3.3.1 Long Short-Term Memory (LSTM)

LSTM [13] is an artificial recurrent neural network (RNN) architecture. A simple LSTM unit contains a cell, an input gate, an output gate, and a forget gate (Fig. 3). The gated feedback connections let the cell remember values over a span of time, so LSTM works well for classifying, processing, and making predictions on timestamped data. This makes it well suited to this project, which is why we selected this model.

Fig. 3. LSTM Architecture [2]

Several steps were used to develop the model. First is the window size: a moving window of size 50, meaning we use the first 50 data points as input to predict the 51st data point; the window then moves forward one step to predict the 52nd data point, and so on. Second is the layer structure: we build a two-layered LSTM model with a dense output layer. Lastly, the model is applied to every utilization and produces timestamped forecasts focused on the machine learning server [4].
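As an illustration of the setup just described, here is a minimal Keras sketch of a two-layered LSTM with a dense output layer fed by a moving window of 50 points. The layer widths and the synthetic series are assumptions, not the paper's exact configuration.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

WINDOW = 50  # use 50 past points to predict the 51st

def make_windows(series: np.ndarray, window: int = WINDOW):
    """Slice a 1-D utilization series into sliding input/target windows."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., np.newaxis], y  # shape (samples, timesteps, features)

# Toy series standing in for one 10-minute-averaged utilization signal.
series = np.sin(np.linspace(0, 60, 2000)) + 0.05 * np.random.randn(2000)
X, y = make_windows(series)

# Two stacked LSTM layers with a dense output layer, as described above.
model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(WINDOW, 1)),
    LSTM(64),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=200, batch_size=32, verbose=0)  # 200 epochs as in Sect. 4.1
```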

3.3.2 Supervised Learning Model

In this section, an extensive comparative analysis of 16 simple classification algorithms from the Sklearn library was carried out, as shown in Fig. 4. Binary classification was used to generate the classification model for the dataset, and all models were created without tuning parameters or hyperparameters. Each model's results are plotted in Fig. 4. The best model was selected using five scores as criteria: accuracy, ROC AUC, recall, precision, and F1-score. Based on these five scores, XGBoost ranks first with 98.90, 1.00, 0.99, 0.98, and 0.99 respectively, followed by Random Forest, CatBoost, and LightGBM. It is evident that the XGBoost model [10] scores highly compared with the other models. Moreover, ROC curves of all 16 models are plotted in Fig. 5: Random Forest, CatBoost, XGBoost, and LightGBM score well, because the closer the curve is to the top, the better the prediction.

Fig. 4. Performance Metrics with 16 Models

Fig. 5. ROC curve plot with 16 models
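As a rough illustration of the comparison summarized in Figs. 4 and 5, the following sketch evaluates candidate classifiers with the five selection metrics. It assumes the balanced data (X_res, y_res) from the SMOTE sketch above, an assumed 80/20 split, and shows only three of the sixteen models.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Three of the sixteen candidates, all with default hyperparameters.
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForest": RandomForestClassifier(),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
}

X_tr, X_te, y_tr, y_te = train_test_split(
    X_res, y_res, stratify=y_res, test_size=0.2, random_state=42)

for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    proba = clf.predict_proba(X_te)[:, 1]
    print(name,
          f"acc={accuracy_score(y_te, pred):.3f}",
          f"auc={roc_auc_score(y_te, proba):.3f}",
          f"rec={recall_score(y_te, pred):.3f}",
          f"prec={precision_score(y_te, pred):.3f}",
          f"f1={f1_score(y_te, pred):.3f}")
```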

4 Results

This section presents the experiment with the LSTM model, which covers all of the predicted future utilization values, and the experiment with the XGBoost model, which describes how XGBoost performs the prediction process and which model is optimal for it.


4.1 LSTM Model Accuracy

Fig. 6. Train and Test Value with 200 Epochs

Figure 6 shows the actual and predicted values: the red line is the predicted value, and the blue line is the actual value, taken from data not used in the sampling process. Plots are shown for every utilization: CPU, MEM, DISK, and NET. The test predicts values 10 minutes into the future, so each index corresponds to timestamps grouped into 10-minute bins; the graph plots the data by index as a 10-minute moving average over a 24-hour record. The red (predicted) line is plotted one step ahead of the blue (actual) line so the difference between the two can be seen. The model was trained for 200 epochs. From Fig. 6, we observed that the LSTM fits and works well for every utilization except DISK, where the model cannot follow the pattern of the data points and therefore produces large errors at every training point.

Fig. 7. R2 Score and Root Mean Square Error (RMSE) of all Utilizations

After all predictions were completed (Fig. 6), we evaluated them using the R2 score and the root mean square error (RMSE), shown in Fig. 7. The R2 score is closely related to the MSE but not identical: it expresses how much of the variation in the observed values is explained by the predicted ones. RMSE measures the prediction error, i.e., the deviation between the model's predictions and the regression line. We observed that the training data for DISK utilization cannot be fitted by the LSTM, so we decided to remove DISK utilization.
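For reference, the standard definitions of these two metrics, with y_i the observed values, ŷ_i the predictions, and ȳ the mean of the observations:

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
```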


4.2 Experiment with XGBoost Model

In this section, the XGBoost model from the previous section is tested on the data. Several significant values are used to measure the tested model: accuracy, for how precisely the model predicts; ROC area under the curve, for sensitivity; time taken to fit the model (in seconds); the ROC curve; and the cross-validation score. In this experiment, the XGBoost package was used for the implementation. In addition, the XGBoost hyperparameters were tuned using Optuna, an automatic hyperparameter tuning software framework designed specifically for machine learning [8]. Even though Optuna is a great library, the search space should be reduced to make the optimization easier [9]. There are 35 different hyperparameters in XGBoost, but not all of them matter equally: some have more impact than others. The hyperparameters most frequently tuned with Optuna to find the optimal values are lambda, alpha, the subsample ratio of columns, the subsample ratio of the training instances, the step-size shrinkage (learning rate), the number of trees, the maximum depth of a tree, and the minimum sum of instance weight [1]. Table 2 shows the accuracy score, ROC area under the curve, and time taken for the XGBoost model with default hyperparameters (D.XGB) and with tuned hyperparameters (T.XGB). Accuracy and ROC AUC were not very different, while the execution time differed greatly, from 0.587 to 48.164 seconds.

Table 2. Accuracy, AUC, and time taken for the XGBoost model with default hyperparameters and with tuned hyperparameters

Score                  D.XGB             T.XGB
Accuracy               98.834            97.086
ROC Area under Curve   0.9883            0.970
Time Taken             0.587 (seconds)   48.164 (seconds)
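A minimal sketch of such a tuning run with Optuna over the eight hyperparameters named above. The search ranges are illustrative assumptions, and X_res, y_res are the balanced data from the SMOTE sketch.

```python
import optuna
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

def objective(trial: optuna.Trial) -> float:
    """Search the eight hyperparameters named above; ranges are assumed."""
    params = {
        "reg_lambda": trial.suggest_float("reg_lambda", 1e-3, 10.0, log=True),
        "reg_alpha": trial.suggest_float("reg_alpha", 1e-3, 10.0, log=True),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 3, 12),
        "min_child_weight": trial.suggest_int("min_child_weight", 1, 10),
    }
    clf = XGBClassifier(eval_metric="logloss", **params)
    return cross_val_score(clf, X_res, y_res, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```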

Table 3 displays the iterations of stratified 10-fold cross-validation for the XGBoost model with default hyperparameters (D.XGB) and with tuned hyperparameters (T.XGB). The overall score is quite good, but the cross-validation score of the default XGBoost model is not always stable: the last two cross-validation scores are not good enough. This problem was mitigated by tuning with Optuna, as shown in the last two rows of Table 3; the last two validation scores improved from 75.53 to 86.51 and from 66.87 to 68.62, along with the mean accuracy.


Table 3. Cross-validation score using stratified 10-fold for the XGBoost model with default hyperparameters and with tuned hyperparameters

Iteration   D.XGB         T.XGB
1           98.00249688   97.4198918
2           99.16770703   98.54348731
3           99.50062422   98.6267166
4           99.58385352   98.29379942
5           99.20932168   98.91801914
6           99.62546816   97.00374532
7           83.39575531   84.10320433
8           95.13108614   92.09321681
9           75.53058677   86.51685393
10          66.87473991   68.62255514
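A sketch of how such a stratified 10-fold evaluation can be produced with scikit-learn. The data names follow the earlier sketches; the tuned variant would simply pass study.best_params from the Optuna sketch to the classifier.

```python
from sklearn.model_selection import StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

# Stratified 10-fold cross-validation as used for Table 3.
cv = StratifiedKFold(n_splits=10, shuffle=False)
scores = cross_val_score(XGBClassifier(eval_metric="logloss"),
                         X_res, y_res, cv=cv, scoring="accuracy")
for i, s in enumerate(scores, 1):
    print(f"fold {i}: {100 * s:.2f}")
print(f"mean accuracy: {100 * scores.mean():.2f}")
```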

5 Conclusion

After researching and studying the failure scenarios of banking systems from many sources, we decided to use a recurrent neural network, namely the Long Short-Term Memory (LSTM) model, to forecast the upcoming utilization values. The LSTM implementation has three core elements: window sizing, the layered model, and forecasting at future timestamps. Once the LSTM finishes its process, its predicted values are used as input to another model that computes the system slowdown, i.e., the callout. Apart from the LSTM model used for utilization, we attempted to utilize various other models for predicting the callout value; after evaluating many of them, the optimal model providing the best accuracy turned out to be XGBoost, a decision-tree-based ensemble machine learning algorithm that uses a gradient boosting framework. In addition to the model implementation, another significant part is the experimental testing used to observe the prediction results. It comprises two major sessions: an experiment with the LSTM model, presenting all the predicted test values, and an experiment with 16 models, showing the results of all the models used to predict the callout value. The future prediction value obtained here is only one of the planned components. Another essential task remains as future work: implementing an interface system to gather the information easily and make the model training work done so far more tangible, so that it becomes more accessible and convenient for all users.


References

1. https://xgboost.readthedocs.io/en/stable/parameter.html
2. (September 2017). https://medium.com/@kangeugine/long-short-term-memory-lstmconcept-cb3283934359
3. (December 2019). https://machinelearningmastery.com/what-is-imbalanced-classification/
4. (December 2019). https://towardsdatascience.com/system-failure-prediction-using-loganalysis-8eab84d56d1
5. (August 2020). https://www.ibm.com/support/pages/swap-space-handled-ibm-cloud-private
6. (August 2020). https://medium.com/analytics-vidhya/bank-data-smote-b5cb01a5e0a2
7. (April 2021). https://towardsdatascience.com/imbalanced-classification-in-python-smotetomek-links-method-6e48dfe69bbc
8. (September 2021). https://www.analyticsvidhya.com/blog/2021/09/optimize-youroptimizations-using-optuna/
9. (September 2021). https://www.architecture-performance.fr/ap_blog/optuna-xgboost-on-a-tabular-dataset/
10. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pp. 785-794. Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2939672.2939785
11. Dutta, A., Sharma, A.: System failure prediction using log analysis. International Journal of Advance Research in Science and Engineering 8(8), 38-46 (2019)
12. Hassanien, A.E.I.B., Lu, W., Li, J., Li, Y., Sun, A., Wang, J.: A CNN-LSTM-based model to forecast stock prices. Complexity 2020, 6622927 (2020). https://doi.org/10.1155/2020/6622927
13. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Computation 9(8), 1735-1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735
14. Kogeda, O.P., Agbinya, J.I.: Prediction of faults in cellular networks using Bayesian network model (2007)
15. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 16, 321-357 (2002)
16. Zhao, Z., Xu, C., Li, B.: A LSTM-based anomaly detection model for log analysis. Journal of Signal Processing Systems 93(7), 745-751 (2021). https://doi.org/10.1007/s11265-021-01644-4

Interactive System

NEOTEN¥ (NExt Optimum Trade ENvironment of ¥en): A Cryptocurrency with Market Capitalization Linked to the Japanese Yen, Non-speculative Crypto-Assets with Economic Security Takayuki Fujimoto(B) Department of Information Sciences and Arts, Toyo University, Tokyo, Japan [email protected]

Abstract. This paper proposes NEOTEN¥, a non-speculative cryptocurrency with economic security based on the Japanese Yen (JPY); the specific mechanism and design theory of the asset are described in detail. Although cryptocurrency is rapidly gaining popularity, there are many issues in terms of market stability, in contrast to its robustness in terms of security and other factors. In particular, unstable market prices and exchange rates that fluctuate greatly over short periods are barriers to daily use. While cryptocurrency is expected to become the key currency on the Internet in the future, the instability of the market and the market rate, as well as a sense of uneasiness about cryptocurrency, remain. This is a major obstacle to its spread among young people, who are expected to use it for purposes other than investment and speculation. NEOTEN¥, proposed in this research, is a non-speculative cryptocurrency with economic security: its total value is linked to the cash circulation of the Japanese yen, a legal tender, and it is equipped with an algorithm that incorporates several restrictions.

1 Introduction

1.1 Background

Recently, cryptocurrency has come to be used as a substitute for legal tender, not just as an investment [4]. A recent example is El Salvador adopting Bitcoin [2], one kind of cryptocurrency, as legal tender in 2021; subsequently, in 2022, the Central African Republic also adopted Bitcoin as legal tender. On the other hand, a considerable number of people feel a sense of crisis or risk regarding nations adopting Bitcoin as legal tender. In response to El Salvador's adoption, the ratings agency Fitch Ratings downgraded El Salvador's credit rating from B- to CCC on the ninth of February 2022; this rating indicates that El Salvador falls below investment grade. Furthermore, in March 2022, Bitcoin experienced a 40% drop, leading to El Salvador suffering losses equal to the nation's next interest payment to bondholders. This news made the world realize the risk of cryptocurrency.


As this example shows, despite cryptocurrency's rapid spread, there are still many security issues to be resolved. Cryptocurrency differs from conventional legal tender in that there is no market or exchange intervention by central banks, governments, or international financial organizations, which means it carries a high risk of abnormal situations such as fluctuating value and control difficulties. Even when an abnormal situation occurs, there is no entity that can conduct a proper intervention. Unlike conventional legal tender, cryptocurrency, as a decentralized market, has great potential to realize a borderless market. However, it lacks the reliability of conventional legal tender, and this is the primary concern of users. It goes without saying that the cryptocurrency market is full of up-and-down fluctuations, to a degree not usually seen in exchanges or stock markets, driven by speculative trade by institutional investors. In short, every day there is a risk of losses as devastating as the 2008 financial crisis. Research and discussion on enhancing the stability of cryptocurrency markets are expected to continue, but reliability is still low, and for the meantime it is difficult to say that cryptocurrency is a secure, stable asset for its users and general consumers.

1.2 Focus of Research

On the other hand, it is easy to imagine cryptocurrency becoming the asset that people actively use as currency on the Internet [5]. Now, cryptocurrency is more often used as an investable financial instrument, or as a money game. Gradually, however, cryptocurrency will become a key currency on the Internet, used on a daily basis instead of credit cards. Because its security and confidentiality are higher than those of credit card transactions, cryptocurrency should prevail among underage and young adult segments in the years to come. Thus, we can imagine a future in which cryptocurrency spreads widely and rapidly. However, cryptocurrency in its present state does not work as a key currency for general use. Currently, it is not an asset that people actively want to use unless they are investors or understand the speculative money game. The impression of the current state of cryptocurrency is that it is too risky to keep or use, especially for underage people and young adults, who are the high-potential target groups for future cryptocurrency diffusion [8]. As an example, imagine a child who exchanges money saved from an allowance or a part-time job to use for online shopping. If they exchanged Japanese Yen (JPY) for American Dollars (USD) and kept the money, the amount would not change significantly; even over the long term, the exchange-rate fluctuation stays within a range of about ±10 cents at most. But what happens when they exchange their money for cryptocurrency? In an extreme case, an exchange of ¥12,000 = $100 could collapse to $70 or $60 in a few days, and this is not a rare case. This would mean they could not make their purchase at the price they expected. Looking at El Salvador's loss, cryptocurrency experienced a 40% drop in a short period [3]. Far from a simple drop, there is a risk that cryptocurrency could become scrap paper (scrap data). Of course, the opposite could also occur. However, it is clear that cryptocurrency is not a currency that people without a good understanding of investment or gambling, such as children, can control.


There is no problem with the current condition as long as cryptocurrency is thought of as an investment or gambling. However, when cryptocurrency is thought of as "money easy to use on the Internet" or "safe money with high security and confidentiality," the risk of unpredictable, large financial losses tends to be overlooked. As a currency or an asset for general use, cryptocurrency is far too risky. That is the problem to be solved.

2 Purpose of Research: Non-speculative NEOTEN¥ with Economic Security

While virtual currency ensures safety in terms of security, it is also always accompanied by great risk in terms of fluctuating market value. Considering the expected expansion of use by underage people and children, this risk is a serious obstacle and concern; indeed, it can be said that this point is preventing cryptocurrency from widespread diffusion and natural prevalence. In this paper, we design a high-security cryptocurrency with low exposure to fluctuating rates, intended for use by young people, including the underaged. Then, from a completely novel point of view, we propose NEOTEN¥, a non-speculative system with high financial security. NEOTEN¥ is an abbreviation of "NExt Optimum Trade ENvironment of ¥en." Cryptocurrency created by the NEOTEN¥ system is called NEOTEN¥coin (N¥coin), and 1 NEOTEN¥ is written as "1N¥." NEOTEN¥ is a novel cryptocurrency platform that resolves cryptocurrency's highly speculative and difficult-to-control market while maintaining its original convenience and security stability. Specifically, our aim is for it to be used as a training ground for underage people, such as junior high, high school, and university students, before they start to use ordinary cryptocurrencies, which they will use on a daily basis in the coming years. In this respect, NEOTEN¥ is similar in positioning to a debit card. Debit cards are used by underage people who cannot use credit cards: they have functions similar to a credit card, but they do not allow users to borrow funds as credit cards do, letting users experience cashless payment without the risk of debt. Originally, the utilization of and agreements related to cryptocurrency, which are managed by blockchain technology, are highly confidential, and their security is robust against manipulation or errors [6]. Infinite freedom is guaranteed because there is no party or function such as a central control organization or market mediation. However, that infinite freedom has led to the risk of "control difficulty." For most people, cryptocurrency is not for saving but for investment, and among the various types of investments it is thought of as gambling with extremely high risk. No matter how robust it is with respect to security, cryptocurrency cannot be called an appropriate currency for general use, given its unstable market and unpredictable value fluctuations. In particular, for children and the underaged, the volatile price fluctuations are difficult to understand or deal with. To overcome this problem, by default the NEOTEN¥ system verifies a user's identity in addition to the transaction information, and only when both match can the user use the system.


3 NEOTEN¥ Mechanism

3.1 System Structure of NEOTEN¥

This chapter describes the mechanism of the NEOTEN¥ system. In the NEOTEN¥ system, a smart contract implemented on a blockchain, "Smart Contract for NEOTEN¥ (SmaCoN¥)," which runs programmed instructions automatically when specific conditions are met, defines user attributes, user privileges, and restrictions on functions. Within SmaCoN¥, rule execution for NEOTEN¥ information and restrictions is automated [1, 7]. The NEOTEN¥ system consists of the rules defined and executed by SmaCoN¥ on a blockchain, and the personal information database used for rule judgment, which is stored on local devices such as personal smartphones and computers. Only transaction information is recorded on the blockchain; personal information is saved on the local devices. Every time a user uses the system, SmaCoN¥ verifies the personal information against the transaction information on the blockchain. By saving user information locally, and the transaction information and smart contract on the blockchain, robust security is maintained (Fig. 1).

Fig. 1. Relation between user’s personal information and blockchain

3.2 User Definition by SmaCoN¥

In SmaCoN¥, four kinds of contract conditions regarding a user are predefined as e-restrictions and executed: User Attribute, User Privileges, Coin Function, and Market.


3.2.1 User Attribute Definition

(A) N¥coin users consist of two types: "coreUSER" and "subUSER".
(B) coreUSERs are people from 8 to 18 years old.
(C) subUSERs are all people other than coreUSERs.
(D) On the date a coreUSER comes of age (the Reference Date), the user's attribute is automatically changed to "subUSER". At the same time, the N¥coins the user holds are automatically converted to exN¥coins.

3.2.2 Coin Function Definition

(A) Only coreUSERs can hold N¥coins.
(B) The only legal tender with which people can buy N¥coin is the Japanese Yen (JPY).
(C) The value of N¥coins that a holder can exchange back into JPY or other crypto-assets is limited to the value of N¥coins that the holder personally bought or exchanged using JPY or other crypto-assets.
(D) Only subUSERs can hold exN¥coins.
(E) exN¥coin can be exchanged freely for JPY (legal tender) and other crypto-assets.

3.2.3 USER Privileges Definition

(A) A coreUSER can conduct commercial transactions paid in N¥coin (with NEOTEN¥ certified business operators).
(B) A coreUSER can buy, and pay for, merchandise or services provided by NEOTEN¥ certified business operators with N¥coin.
(C) N¥coin gained by a subUSER is auto-converted to exN¥coins.
(D) A subUSER cannot hold N¥coin received in transactions (Fig. 2).

Fig. 2. Correlation between N¥coin and exN¥coin

3.2.4 Exchange Market Definition

(A) The exchange rate between exN¥coin and the Japanese Yen (JPY) or other cryptocurrencies changes depending on each market.
(B) exN¥coin is convertible on JPY or other cryptocurrency exchange markets.
(C) N¥coin is not convertible to other cryptocurrencies.
(D) If the total NEOTEN¥ cryptocurrency is 1, it is composed of N¥coin (0.5) and exN¥coin (0.5) in equal proportions. This means that when N¥coin is exchanged for exN¥coin, the converted coin disappears after the exchange. For example, "N¥coin a," which a coreUSER has paid, is automatically converted to "exN¥coin a" as soon as the subUSER receives it. At the moment "exN¥coin a" is generated, "N¥coin a" disappears; therefore, "N¥coin a" and "exN¥coin a" never coexist (Fig. 3).

Fig. 3. Exchange markets concerning with NEOTEN¥
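The definitions of Sects. 3.2.1-3.2.4 can be expressed as executable rules. The following is a hypothetical plain-Python model of SmaCoN¥'s core transfer rule, not the actual on-chain contract (which the paper envisions in Solidity); the names, the exact age boundary handling, and the coreUSER-to-coreUSER case are assumptions.

```python
from dataclasses import dataclass

COMING_OF_AGE = 18  # coreUSERs are 8-18 years old (Sect. 3.2.1); boundary assumed

@dataclass
class User:
    name: str
    age: int

    @property
    def attribute(self) -> str:
        return "coreUSER" if 8 <= self.age < COMING_OF_AGE else "subUSER"

def transfer(sender: User, receiver: User, amount: float,
             balances_n: dict, balances_exn: dict) -> None:
    """Value transfer under SmaCoN¥ rules: only coreUSERs hold and pay
    N¥coin; N¥coin received by a subUSER is auto-converted to exN¥coin,
    and the original N¥coin ceases to exist (Sect. 3.2.4 (D))."""
    if sender.attribute != "coreUSER":
        raise PermissionError("only coreUSERs can pay with N¥coin")
    if balances_n.get(sender.name, 0) < amount:
        raise ValueError("insufficient N¥coin balance")
    balances_n[sender.name] -= amount  # the paid N¥coin disappears here
    if receiver.attribute == "coreUSER":
        balances_n[receiver.name] = balances_n.get(receiver.name, 0) + amount
    else:  # certified business operators are subUSERs
        balances_exn[receiver.name] = balances_exn.get(receiver.name, 0) + amount
```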

3.3 Circulation Mechanism of NEOTEN¥

3.3.1 Limitation of N¥coin/exN¥coin Market Capitalization

NEOTEN¥ has no maximum supply, but there is a particular limitation on market capitalization: the limit for the N¥coin/exN¥coin market capitalization is restricted to equal JPY's "cash in circulation" (the total value of bills and coins). Therefore, N¥coin/exN¥coin exceeding the JPY cash in circulation will never be mined or distributed. Even if N¥coin/exN¥coin is excessively mined, the total value will never exceed the value of JPY's cash in circulation: when the amount of N¥coin/exN¥coin in circulation increases, the value of each coin simply falls, in accordance with the total value of JPY cash in circulation at the time. Thus the market capacity of N¥coin/exN¥coin will never rise due to changes in the amount of currency in circulation. For example, using 2020 figures, the limit on market capitalization is 121 trillion yen. If 1,000,000 NEOTEN¥coins are in circulation, 1 NEOTEN¥ is 121,000,000 JPY; if the number of issued coins increases to 10,000,000, the unit value of 1 NEOTEN¥ falls to 12,100,000 JPY. Under this framework, the total value of exN¥coin never exceeds JPY's cash in circulation through speculation, even though exN¥coin can be exchanged and used in the same way as other cryptocurrencies, because the overall number of issued exN¥coins and the total value of exN¥coins in circulation are controlled to equal those of N¥coin.
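The capitalization rule can be stated compactly. With C_JPY the total JPY cash in circulation and S the number of coins issued, the unit value v of one coin is

```latex
v \;=\; \frac{C_{\mathrm{JPY}}}{S}, \qquad v \cdot S \;\le\; C_{\mathrm{JPY}}
```

so with C_JPY = 121 trillion yen (the 2020 figure) and S = 10^6 coins, v = 1.21 x 10^8 JPY; raising S to 10^7 lowers v to 1.21 x 10^7 JPY.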


In short, for NEOTEN¥ the most powerful and influential market rate is that of the JPY, known as "Asia's stable currency," which gives NEOTEN¥ quite high stability.

3.3.2 Personal Information Authentication

In the NEOTEN¥ system, transaction information is recorded on a blockchain. On the other hand, the User Attribute information controlled by the smart contract, which is indispensable to the NEOTEN¥ workflow, is stored as personal information in each user's local database. When a user uses the NEOTEN¥ system, the personal information held by the individual (the information proving that the person is the authenticated N¥coin holder) is verified against the transaction information on the blockchain by means of an encryption key.

Information recorded on the blockchain:
• transaction date
• who paid
• to whom it was paid
• amount of transaction

Information recorded in the local database:
• User Attribute

The two are verified against each other with an encryption key.

The overall structure of the NEOTEN¥ system is shown in Fig. 4. When an event concerning an N¥coin transaction occurs, the contract and the value transfer based on it are executed, and the transaction is settled; this series of processing runs automatically on the blockchain.
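One conceivable realization of this verification, sketched in Python: the chain stores only a salted digest of the locally held attribute record, and processing proceeds only when the digest recomputed from the local database matches the on-chain commitment. The commitment scheme itself is an assumption; the paper specifies only that the two sides are verified with an encryption key.

```python
import hashlib
import json

def attribute_digest(user_attribute: dict, salt: bytes) -> str:
    """Digest of the locally stored user-attribute record. A salted hash
    keeps personal data off-chain while letting the chain commit to it."""
    payload = json.dumps(user_attribute, sort_keys=True).encode() + salt
    return hashlib.sha256(payload).hexdigest()

def verify_holder(local_record: dict, salt: bytes, onchain_commitment: str) -> bool:
    """SmaCoN¥-style check: the transaction proceeds only when the local
    record matches the commitment recorded with the on-chain transaction."""
    return attribute_digest(local_record, salt) == onchain_commitment
```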

4 Expected NEOTEN¥ Use

4.1 NEOTEN¥ in Practical Use

The NEOTEN¥ system can solve a critical problem of current cryptocurrency and cover its flaws. In particular, the majority of prospective users, general consumers and underage people who do not have a good understanding of investment or gambling, represent a goldmine of needs. As the NFT market shows, buying and selling content with cryptocurrency has recently become active, and NEOTEN¥ has quite high potential availability. For example, active use for the following payment purposes is expected:

• purchases of books and learning tools
• participation fees for educational and other events
• admission fees for museums and art museums, and purchases at museum gift shops
• a new NFT marketplace for underage people
• school fees and monthly fees to other educational institutions

In this safe market, children can create artworks, writings, or music of their own and do NFT business without an investment purpose. Under such conditions, children with special talents can earn usable exN¥coins, in line with the educational intent.


Fig. 4. Overall structure of NEOTEN¥ system

Use as financial support for educational purposes for people in developing countries, or as school enrollment subsidies for impoverished families, can also be expected. If N¥coin is used for education support, it prevents the risk of misappropriation for purposes other than education. N¥coins paid by underage people (coreUSERs) are auto-converted to exN¥coins on a mandatory basis at the moment the N¥coin is transferred to NEOTEN¥ certified business operators (subUSERs). Those exN¥coins can be exchanged for JPY or other cryptocurrencies at a cryptocurrency exchange. Consequently, payments in N¥coin pose no particular risk or extra effort for the NEOTEN¥ certified operators who provide merchandise or services. N¥coin is little affected by market fluctuation and rarely becomes a target of investment, because it is premised on use by underage people: it specializes in underage people's use of cryptocurrency on the Internet. For the business operator on the other end, N¥coins are replaced with exN¥coins as soon as they are received, and exN¥coin is a cryptocurrency that can be exchanged for legal tender and has market volatility. This works as an incentive for NEOTEN¥ business operators to earn exN¥coins and do business with them.

4.2 Examination of NEOTEN¥ Usability

Regarding blockchain implementation, a variety of libraries and sample code are available, and the required implementation environment is basically simple. It can be implemented in various programming languages, for example JavaScript, a popular scripting language.


Since the NEOTEN¥ system proposed in this research relies on smart contracts on a blockchain, Solidity is considered the appropriate choice. In principle, a blockchain itself does not dictate a particular development environment or programming language. As for the database used to authenticate personal information, it is stored on ordinary computers or smartphones, and there is no need to create a dedicated environment. In short, because the technology assumes quite general and varied implementation environments, realizing the system is relatively easy. NEOTEN¥'s basic algorithm flow is shown in Fig. 5 below.

Fig. 5. Basic algorithm flow of NEOTEN¥

As a brief assessment, we carried out a questionnaire survey of 40 people for impression evaluation, to examine NEOTEN¥'s superiority compared with other cryptocurrencies. Those surveyed were ordinary people with general knowledge of cryptocurrency. They were asked to evaluate and compare after the mechanism of NEOTEN¥ had been explained to them in detail. Specifically, they evaluated their impressions of the cryptocurrencies on five items (Security, Versatility, Market Stability, Gambling Element, and Value Reliability) using a seven-grade scale from ◎ (quite positive) to × (none). Each evaluation grade and its score allocation is listed in the table below.

Meaning           Score
Quite positive    3
Rather positive   2.5
Not sure          2
Rather negative   1.5
Negative          1
Quite negative    0.5
None              0

Regarding "Gambling Element," a high score on this item generally indicates high uncertainty and is thus a negative evaluation. Therefore, the score given to this item is in practice treated as a negative number (Figs. 6 and 7).


Fig. 6. Impression evaluation score

Fig. 7. Comparison of impression evaluation score

As a result, both N¥coin and exN¥coin were evaluated as having low "gambling element" risk, and in particular the participants judged N¥coin as having no gambling element at all. "Gambling Element" is the item that measures certainty as an asset, so it can be said that both N¥coin and exN¥coin have relatively higher market stability and greater reliability than other cryptocurrencies. By way of contrast, that certainty and stability come from restrictions on use and a mechanism that is complex to understand. Therefore, regarding "Versatility," in other words "handiness" or "ease of use," they scored lower than other currencies. However, relatively low versatility is not necessarily a negative factor for a cryptocurrency's sound popularization, because the unreliability of cryptocurrencies, which attracts attention as a problem, originates from their overwhelmingly high versatility. Therefore, NEOTEN¥ has high potential for future development: it can enforce security and reliability while maintaining versatility and flexibility as a cryptocurrency.

5 Conclusion

In this paper, we proposed NEOTEN¥, a cryptocurrency based on the Japanese Yen (JPY), a crypto-asset with no speculative nature but with economic security, and explained its mechanism and control concept in detail. Although cryptocurrency is rapidly becoming prevalent, there are problems to be solved for general use and investment management. In particular, cryptocurrency is expected to become the key currency on the Internet in the future, but obstacles remain, namely its instability as a currency due to its borderless versatility and the lack of a controlling organization. This instability can be rephrased as "unreliability." Cryptocurrency radically spread and diversified after Satoshi Nakamoto's 2008 paper on Bitcoin; conversely, however, its reliability as a currency and its risks are often pointed out. Some countries have approved cryptocurrency as legal tender, but this resulted in those countries' national credit ratings being notably downgraded due to the unreliability of Bitcoin as a currency. For cryptocurrency to become the key currency on the Internet, its reliability as a currency needs to be improved. In short, to achieve this goal, general consumers need to see cryptocurrency as a currency as safe as other kinds of legal tender, one that can be used by people with no investment purpose. Specifically, what is required is a controllable cryptocurrency that can be used risk-free by the generations that cannot hold credit cards, similar to the use of debit cards. NEOTEN¥ was designed as an altcoin that can play such a role, and it will be further researched and developed to achieve this goal.

References

1. Szabo, N.: Smart Contracts: Building Blocks for Digital Markets (1996). http://www.fon.hum.uva.nl/rob/Courses/InformationInSpeech/CDROM/Literature/LOTwinterschool2006/szabo.best.vwh.net/smart_contracts_2.html
2. Nakamoto, S.: Bitcoin: A Peer-to-Peer Electronic Cash System (2008). https://bitcoin.org/
3. Bloomberg: El Salvador's Bitcoin Losses Are Equal to Next Bond Payment (12 May 2022). https://www.bloomberg.com/news/articles/2022-05-12/el-salvador-s-bitcoin-losses-are-as-big-as-its-next-bond-payment
4. Buterin, V.: Ethereum: a next-generation cryptocurrency and decentralized application platform. Bitcoin Magazine (23 January 2014). https://bitcoinmagazine.com/business/ethereum-next-generation-cryptocurrency-decentralized-application-platform-1390528211
5. Miyazaki, H., Nakamura, S.: The Social Change that Blockchain Brings. Unisys Technology Review 133 (2017)
6. Akiya, T.: Overview the Blockchain Technology from Its Challenges and Solutions
7. Furuichi, T., Mineno, H.: Proposal of IoT System with Smart Contract
8. Center for Research and Development Strategy, Japan Science and Technology Agency: The Next Generation Blockchain Technology: To establish secure and trustworthy infrastructures for data-sharing and value-exchange among people and society. CRDS-FY2019-SP-09 (2019)

XI Guitar: Development of a Slide-Type Guitar Application that Reproduces the Playing Sensation of Analog Guitar Liu PeiQiao(B) and Takayuki Fujimoto Toyo University, Tokyo, Japan [email protected]

Abstract. In this paper, we propose and prototype a virtual slide-type guitar application for a smartphone. Many guitar applications already exist; however, most of them place the guitar's fretboard and strings on the same smartphone screen and require users to tap buttons or images of strings. In short, they can produce guitar sounds, but users cannot experience the sensation of playing an analog guitar. A possible substitute for a real guitar is JammyG, which has a MIDI controller function, but this requires a device other than a computer or smartphone and is not really a smartphone application that simulates an analog guitar: it is in fact a complicated system in which the smartphone merely works as a controller. Besides, it is expensive and hard to call a popular system. Many problems remain to be solved. In this research, inspired by the method of playing a slide guitar, we developed a prototype eXtended Intelligence Guitar (XI Guitar) that can be played like an actual guitar on a single smartphone, with the physical sensation of the horizontal movement of a finger, by utilizing the smartphone's acceleration sensor. Keywords: Playing sensation · Analog · Acceleration sensor · Slide-type guitar

1 Introduction

Today, with the acceleration of IT, a variety of analog devices and everyday equipment are being reproduced on smartphones [2]. While an increasing number of these reproductions are practical, a considerable number fall into other categories, especially when it comes to tools used in entertainment situations. For example, digitalization in the field of music has seen tremendous development and diversification: old and valuable analog synthesizers from decades ago have been revived as software, and there are many examples of digitized analog instruments that are not easily obtainable in real life [1]. The analog guitar has been reproduced as smartphone applications in many ways, but most of them are merely "keyboard instruments that can play guitar sounds" and do not allow users to enjoy the same quality of performance as playing a regular guitar. In particular, as far as the authors know, there are no applications that convey the "sensation of fingers moving on the fretboard," which is the real thrill of playing the guitar.


This paper describes the development of an application that solves this problem and allows the user to experience the sensation of playing an analog guitar using just a single smart device [4]. The design focuses on the sensation of the finger movement that presses the strings, based on observation and analysis of the action of playing a slide guitar. Using the acceleration sensor built into the smartphone, the application incorporates the feel of playing a slide guitar and changes the pitch according to the distance the smartphone moves in response to the movement of the hand holding it.

2 Related Research

Various analog guitar applications have been released in the past, and in recent years high-performance digital guitars and guitar software have also been introduced. Nevertheless, many players prefer real guitars to high-performance guitar software. This is because people's purpose in playing and enjoying musical instruments is believed not to rest simply on the instruments' function; it depends strongly on the experience of the "feel of playing," the physical feedback the instrument provides. The challenge for musical instrument applications, therefore, is to provide the same sensation of use as a real instrument while maintaining the convenience of a digital device. For example, InstaChord is an iOS application that allows even inexperienced users to produce something that sounds like a "guitar performance" with little practice. By holding the device like a guitar neck, the main chords used in a song are aligned within reach of the fingers, allowing the user to play a variety of songs without moving the hand position significantly. It is recognized and loved for being extremely easy to play, without demanding playing ability. However, InstaChord's interface and "feel" cannot be described as those of a stringed instrument. It lacks the "real thrill" of playing a stringed guitar, which is experienced through "fretting the strings": a series of movements that involves changing the hand position and pressing down on the strings with the fingers. A design that prioritizes "ease of use" may be convenient, but it does not provide the player with the essential "playing" experience. It goes without saying that such an application can hardly be a substitute for a real guitar for performance use (Fig. 1).

Fig. 1. InstaChord, a chord keyboard-type guitar application


In recent years, there have also been products that attempt to replace the guitar electronically. For example, the MIDI guitar "JammyG" (Fig. 2) not only provides the user with the experience of playing, but also allows the user to create the music that he or she envisions and control it in conjunction with a smartphone application. Furthermore, it can be disassembled and has a long-lasting battery for high mobility. Since JammyG aims to replace a real guitar, it is more functional and practical than previous applications. On the other hand, while the combined use of a smartphone application and a MIDI-type interface enables a realistic playing experience, the smartphone application still cannot be used on its own.

Fig. 2. Explanation of JammyG

Both of these cases were designed based on the model of an analog guitar, and each has the inherent analog characteristics of the tool. However, it is difficult to say that they can be used with a smartphone alone and still provide the 'feel of playing a guitar'. The reason is that it is extremely difficult to simulate both 'movements of fingers on the fretboard' and 'picking' at the same time on a smartphone, which has a limited playing space. In particular, it is quite difficult to reproduce the physical and tactile sensations caused by the drag force from the strings when simulating 'picking' on a smartphone, which is made of a different material. Among many guitar playing techniques, there is one called 'slide guitar' (Fig. 3) that enables a player to focus more on the sensation of the hand touching the strings, without being affected by changes in the picking hand's strength. In this technique, the slide bar (or bottleneck) is put on a finger or held with the fingers, the bar is held in contact with the strings at an arbitrary position without being firmly pressed down onto the fretboard, and only a little plucking (by the other hand) is required to produce a sound. What differs from other common guitar playing methods is that one can focus on the physical feedback from one hand (the slide hand), which also makes it a useful reference for designing an analog-style guitar application. Based on this inspiration, a 'more convenient one-handed play' is feasible. In addition,


Fig. 3. How to play the slide guitar

when the physical experience of horizontal movement is given, the intuitive ‘feel of playing’ can also be enhanced [5].

3 Slide-type Analog Guitar

3.1 Basic Design

In this research, a prototype of a slide-type guitar application, "eXtended Intelligence Guitar: XI Guitar," was developed to reproduce the 'feel of analog guitar playing', inspired by the mechanism and playing method of the slide guitar. As described above, the XI Guitar focuses on giving users the sensation of 'one-handed playing'. Therefore, with reference to the slide guitar, the system was designed around a mechanism that changes the pitch and the screen according to the horizontal movement distance of the hand pressing the strings, without being affected by the movement of picking [6]. The built-in acceleration sensor is used to calculate the movement distance, making the most of the smartphone's hardware. Measurement starts at an initial speed of 0 m/s and ends when the speed returns to 0 m/s. Using the built-in acceleration sensor, discrete acceleration values are detected and recorded, and the movement distance is calculated by integrating twice with the trapezoidal formula. The movement distance determines the portion of the fretboard to be displayed on screen and the value of the pitch change (Fig. 4).


Fig. 4. Primary function
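As a concrete illustration of the distance calculation just described, the following is a minimal sketch, assuming a 0.1 s sampling interval as in the experiment below; function names and sample values are illustrative, not taken from the actual implementation.

```typescript
// Sketch: estimate horizontal movement distance from discrete acceleration
// samples via double trapezoidal integration. Assumes the motion starts and
// ends at rest (v = 0 m/s), as described in the paper.

/** Integrate a sampled signal once using the trapezoidal rule. */
function integrateTrapezoidal(samples: number[], dt: number): number[] {
  const result: number[] = [0];
  for (let i = 1; i < samples.length; i++) {
    result.push(result[i - 1] + ((samples[i - 1] + samples[i]) / 2) * dt);
  }
  return result;
}

/** Distance travelled, from acceleration samples taken every dt seconds. */
function movementDistance(accel: number[], dt = 0.1): number {
  const velocity = integrateTrapezoidal(accel, dt);    // m/s at each sample
  const position = integrateTrapezoidal(velocity, dt); // m at each sample
  return position[position.length - 1];
}

// Example: a short accelerate-then-decelerate gesture along the Y axis.
const accelY = [0, 0.4, 0.8, 0.4, 0, -0.4, -0.8, -0.4, 0]; // m/s^2
console.log(movementDistance(accelY)); // net displacement in metres
```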

4 Experiments on Accelerometer Accuracy

4.1 Purpose of the Experiment

Today, the performance of accelerometers built into smartphones is quite high, and higher sensitivity is generally considered better. On the other hand, the sensitivity is so high that sound source control using accelerometers cannot avoid the problem of generating sounds in response to slight movements that would produce no sound on an analog guitar. Furthermore, it is necessary to compensate for detection errors. To solve these problems, not only the detection of acceleration values but also correction thresholds for those values are indispensable for sound source control based on movement distance calculated from the acceleration sensor.

4.2 Experimental Methods

In general, players describe the speed of a performance in BPM (beats per minute) rather than in the usual m/s (meters per second). However, BPM is a unit of time only and cannot be converted to velocity without a distance variable. Therefore, the number of frets on the fretboard is used as an indicator of distance, which can be combined with BPM and converted to the usual m/s. The conversion formula is velocity v = 1.8 × (fret number) [m/min], or a = 60v when converting to acceleration. Using the smartphone's accelerometer, the smartphone's motion state is measured over different fret spans at a specified BPM. The smartphone is moved along the horizontal direction, and the Y-axis data is used as the primary reference. The acceleration sampling interval is 0.1 s, and 10 back-and-forth movements are performed in time with the BPM. Since 'pressing the start/stop button' affects the correctness and accuracy of the data, the first and last two sets of data are discarded to ensure the accuracy of the data [5] (Fig. 5).


Fig. 5. Measurement direction of 3-axis accelerometer
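Since the conversion formula above is only partially legible in the source, the sketch below shows one possible reading of it, under the assumption that the stated relation v = 1.8 × (fret number) m/min holds at 60 BPM and scales linearly with tempo; both the interpretation of the constant and the scaling rule are our assumptions.

```typescript
// Sketch: convert a performance tempo (BPM) and a fret span into a linear
// speed, following the paper's v = 1.8 x (fret number) [m/min] relation.
// Assumption (ours, since the source formula is garbled): the relation is
// stated for 60 BPM, so other tempi scale it linearly.

function speedMetersPerSecond(fretSpan: number, bpm: number): number {
  const metersPerMinuteAt60 = 1.8 * fretSpan;          // paper's relation
  const metersPerMinute = metersPerMinuteAt60 * (bpm / 60);
  return metersPerMinute / 60;                         // m/min -> m/s
}

// Example: traversing 5 frets at 120 BPM.
console.log(speedMetersPerSecond(5, 120).toFixed(3), "m/s"); // ~0.300 m/s
```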

4.3 Experimental Results

The accelerometer graphs are observed and the distances calculated by double integration. Results that are irregular and deviate greatly from the actual values are considered undetectable. Results that deviate from the actual values but are regular are accepted as usable with correction. Data sets that are regular and have negligible errors can be used as they are (Fig. 6).

Fig. 6. Available utilization data
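The three-way classification described above could be implemented along the following lines; the numeric thresholds and the regularity test are placeholder assumptions of ours, since the paper does not state explicit criteria.

```typescript
// Sketch: classify accelerometer-derived distance estimates against known
// ground truth. The 5% / 20% error thresholds and the spread-based
// "regularity" test are illustrative assumptions.

type Usability = "usable-as-is" | "usable-with-correction" | "undetectable";

function classify(measured: number[], actual: number): Usability {
  const errors = measured.map(m => Math.abs(m - actual) / actual);
  const mean = errors.reduce((a, b) => a + b, 0) / errors.length;
  // Treat a low spread of errors across repetitions as "regular".
  const spread = Math.max(...errors) - Math.min(...errors);
  const regular = spread < 0.1;

  if (regular && mean < 0.05) return "usable-as-is";
  if (regular && mean < 0.2) return "usable-with-correction";
  return "undetectable";
}

console.log(classify([0.305, 0.298, 0.310], 0.3)); // "usable-as-is"
```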


The following results were obtained thereby (Fig. 7).

Fig. 7. Experimental results

4.4 Primary Function Realization Mechanism

The built-in acceleration sensor and timer are used to calculate the movement distance. The calculated movement distance is used to determine the change in string length, and the relationship between string length and pitch determines the degree of screen movement and pitch change. The primary function is realized by generating the gliding sound and switching the screen accordingly. The flow of the mechanism is as follows.

(STEP 1) When the user starts to touch the screen, the built-in acceleration sensor and timer are called.
(STEP 2) While the user is touching the screen, the system determines whether to record or process the data, using the detection and correction thresholds calculated from the aforementioned experimental results as the two branching conditions.
(STEP 3) If the data is smaller than the detection threshold, the more sensitive angular rate sensor is called to determine the pitch change in response to tilt. If it is larger than the threshold above which data can be used directly, distance conversion is performed; data falling between the two thresholds is corrected first and then converted [9].
(STEP 4) The converted distance is substituted into Mersenne's law, and the smartphone screen and speaker operate according to the resulting pitch and distance changes (Fig. 8).


Fig. 8. Flowchart of primary function realization
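A minimal sketch of STEP 4 is shown below: the slide distance shortens the effective string length, and Mersenne's law f = (1/2L)√(T/μ) gives the new frequency. The string tension, linear density, and scale length used here are typical illustrative values, not parameters from the paper.

```typescript
// Sketch of STEP 4: convert a slide distance into a pitch via Mersenne's
// law  f = (1 / (2 * L)) * sqrt(T / mu), where L is the vibrating string
// length [m], T the tension [N], and mu the linear density [kg/m].
// The baseline values below are illustrative, not taken from the paper.

const TENSION = 70;           // N, typical order for a guitar string
const LINEAR_DENSITY = 0.004; // kg/m
const OPEN_LENGTH = 0.65;     // m, a common guitar scale length

function mersenneFrequency(length: number): number {
  return (1 / (2 * length)) * Math.sqrt(TENSION / LINEAR_DENSITY);
}

/** Sliding toward the bridge shortens the vibrating length, raising pitch. */
function pitchAfterSlide(slideDistance: number): number {
  const length = Math.max(0.05, OPEN_LENGTH - slideDistance); // clamp
  return mersenneFrequency(length);
}

console.log(pitchAfterSlide(0).toFixed(1));   // open-string frequency [Hz]
console.log(pitchAfterSlide(0.1).toFixed(1)); // higher after a 10 cm slide
```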

At the final output stage, the pitch data converted from distances calculated at a 0.1 s sampling interval changes in discrete steps. This causes a dissonance that the user can perceive while the sound is changing [12]. To smooth out the pitch transitions, it is essential that the analog output of the distance compensates for this quantization error.

Fig. 9. Quantization error correction of analog output
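One straightforward way to perform such a correction, sketched below as our own assumption rather than the authors' documented method, is to interpolate linearly between successive pitch values over each 0.1 s interval so that the output glides instead of stepping.

```typescript
// Sketch: smooth step-like pitch updates by linear interpolation between
// successive values, so the audible glide is continuous rather than jumping
// every 0.1 s. audioRate is the control-signal rate of the synthesizer.

function smoothPitch(pitches: number[], dt = 0.1, audioRate = 100): number[] {
  const out: number[] = [];
  const stepsPerInterval = Math.round(dt * audioRate);
  for (let i = 0; i < pitches.length - 1; i++) {
    for (let s = 0; s < stepsPerInterval; s++) {
      const t = s / stepsPerInterval; // 0..1 within the interval
      out.push(pitches[i] + (pitches[i + 1] - pitches[i]) * t);
    }
  }
  out.push(pitches[pitches.length - 1]);
  return out;
}

// 440 -> 466 -> 494 Hz in 0.1 s steps becomes a smooth glide.
console.log(smoothPitch([440, 466, 494]).length); // 21 control values
```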

5 Implementation

The initial value of the Y-axis acceleration is –1, and the coordinate values are set to –1. Constraints between guitar pins are set by the string length formula, and the button


is connected to the input guitar sound source [15]. Finally, the guitar is slid according to the gravitational acceleration detected by the acceleration sensor and the acceleration generated by the one-handed slide (Fig. 10).

Fig. 10. Prototype screen

6 Summary

Today, it is the users, rather than the engineers, who are forced to adapt to ever-evolving technology. The emergence of new technologies is changing our lifestyle and altering 'common sense' in various fields. As the next-generation technologies that form the foundation of our daily life continue to gain momentum, users are forced to digest and cope with the changes these technologies bring about. The application proposed in this paper focuses on the needs of instrumentalists and reflects them in the design of the application. The 'need' is not to use complex technology, but rather to experience the physical sensations that traditional analog tools and devices provide [4]. The application captures the usability characteristic specific to analog tools, like a real musical instrument, and reproduces that characteristic in the digital medium of the smartphone. The method of designing and reproducing analog elements digitally is a concept not limited to this application; the idea can be applied to all digital content, including other smartphone applications. This paper reemphasizes the importance of the physical sensation of playing a musical instrument in content design and suggests new possibilities beyond conventional computing methods. In the future, we would like to verify its effectiveness from various aspects. In other words, we aim to create a new performance experience by incorporating analog elements into digital media, which boast superior convenience and functionality. To that end, effective use of digital technology must be considered, and analog elements must


be redesigned in those terms. Accordingly, this paper examined the mechanism of the physical sensations that occur during slide guitar playing, and the resulting design emphasizes the sensation of one-handed playing. The slide guitar, which emphasizes the sensation of one hand over both hands, was referenced as the design model, and the main principle was to reproduce the movement of 'gliding' on a smartphone. This paper is the beginning and one part of such an integrated design theory. In the future, we will apply this method of prototyping a system based on the analog guitar to many other analog instruments. We would like to improve the method to a more practical level, and we will also develop our research with a focus other than physical feedback and analog feel. Rather than keeping users busy figuring out how to handle the technology, we aim to design and provide technology in a way that makes users want to use it.

7 Future Issues

7.1 Addition of Friction-Based Tactile Feedback

For future work, we would like to move toward a more practical implementation of the prototype model. We will explore ways to further improve the tactile sensation and realize the aforementioned functions. The sensation of one-handed play is essential to guitar playing, and it derives from the horizontal movement of the hand position along the guitar fretboard. In other words, the tactile feedback generated at the fingertips by friction when switching between chords is also an important element, and we will prepare an implementation that incorporates it. Although there have been some previous studies on reproducing the tactile feedback of frictional touch using the vibration function of a smartphone screen, no method combines enhancement of the one-handed play sensation with its horizontal movement.

7.2 Prior Studies

The mechanism of the friction-induced illusion is shown in Fig. 11. For example, if the frictional force decreases during a finger slide, an assisting force supporting the motion is pseudo-perceived. If the frictional force increases while the finger is sliding, it can induce an illusion of minute bumps called the sticky-band illusion. Combined, these effects can present a page-turning sensation, tactile perception of buttons, or assistive flicking [23]. In addition, by presenting the changes in frictional resistance generated when rubbing a material as changes in frictional resistance on an ultrasound tactile display, it is possible to reproduce the 'tactile feel' of the material to some extent [18] (Fig. 11).


Fig. 11. Mechanism of tactile illusion

7.3 Application to Analog Guitar

It is expected that the mechanism explained in the previous section can improve the operability and 'real feel' of playing, because it has a strong affinity with the sense of friction, coupled with the haptic feedback of picking a guitar and the interaction with audiovisual sensation (cross-modality) [19] (Fig. 12).

Fig. 12. Image of application to analog guitar

As described above, several methods already exist that could help solve the issues addressed by this study. On the other hand, they are all fragmentary and do not function as an ultimate solution to the problem pointed out in this paper: reproducing the physical feedback of an analog musical instrument on a single smartphone [16]. The main challenge addressed in this paper is to realize the tactile feedback of an analog instrument, which is the key to the practical utilization of a digital musical instrument application.


References

1. Fujimoto, T.: Ideology of AoD: analog in digital, operating digitized objects and experiences with analog-like approach. In: 7th International Congress on Advanced Applied Informatics (2018)
2. Fujimoto, T.: Understandability design: what is "Information Design"? J. Inf. Sci. Technol. Assoc. 65, 450–456 (2015)
3. Fujimoto, T.: Toward Information Design 3.0: the information design for 'Communicate'. Build. Mainten. Manag. 34, 42–46 (2013)
4. Shimasaki, M.: Considering analog and digital. Publ. Inf. Eng. 47, Kyoto University (2007)
5. Cui, M., Watanabe, Y.: A study of comparison of learning algorithms for pedestrian identification using the 3-axis accelerometer of a smartphone. IEICE Technical Report, vol. 118, pp. 113–118 (2019) (in Japanese)
6. Sasaki, M.: A study on methods of converting acceleration information into sound. Graduation thesis, Faculty of Information Sciences and Arts, Toyo University (2017) (in Japanese)
7. Hermann, T.: Taxonomy and definitions for sonification and auditory display
8. Parseihian, G., Gondre, C., Aramaki, M., Ystad, S., Kronland-Martinet, R.: Comparison and evaluation of sonification strategies for guidance tasks. IEEE Trans. Multimedia 18(4) (2016)
9. Terasawa, H., Parvizi, J., Chafe, C.: Sonifying ECoG seizure data with overtone mapping: a strategy for creating auditory gestalt from correlated multichannel data. In: Proceedings of the International Conference on Auditory Display 2012 (ICAD 2012), pp. 129–134 (2012)
10. Kamimura, T., Kitani, T., Kovacs, D.L.: Automatic classification of motorcycle motion sensing data. In: Proceedings of the 2014 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), pp. 145–146. IEEE (2014)
11. A proposal of an automatic ground-truth labeling method for motorcycle motion-sensing data using location information. IPSJ SIG Technical Report, Multimedia Communication and Distributed Processing (DPS), 2013(6), 1–6 (2013) (in Japanese)
12. http://www.keishicho.metro.tokyo.jp/bicyclette/jmp/bicyclette.pdf
13. 下山直起, 安藤輝, 山和人, 石井貴拓, 平山雅之: A study of a bicycle accident prevention system based on riding analysis using multiple sensors. IPSJ SIG Technical Report, System and LSI Design Methodology (SLDM), 2015(51), 1–6 (2015) (in Japanese)
14. 井将徳, 中村嘉隆, 高橋修: A bicycle violation-driving detection system using motion sensors. In: IPSJ Multimedia, Distributed, Cooperative, and Mobile (DICOMO 2015) Symposium, vol. 2015, pp. 265–271 (2015) (in Japanese)
15. Goto, H., Miura, M.: Behavior recognition of bicycles using an acceleration sensor. In: Proceedings of IPSJ Interaction 2014, vol. 2014, pp. 309–312 (2014) (in Japanese)
16. Tamagawa Seiki: Gyro Sensor Technology. Tokyo Denki University Press (2011) (in Japanese)
17. Goto, H., Miura, M.: Examination of sensor positions to detect bicycle speeding behavior. In: KES-IIMSS, pp. 204–211 (2013)
18. How to calculate angles from gyro (angular velocity) data. https://garchiving.com/angular-from-angular-acceleration/; Peterson, L.E.: K-nearest neighbor. Scholarpedia 4(2), 1883 (2009)
19. 誠 et al.: Tactile recognition mechanisms and applied technologies: tactile sensors and tactile displays, expanded edn. S&T Publishing (2014) (in Japanese)
20. Takasaki, M.: Between smoothness and stickiness. In: 2015 IEEE World Haptics Conference [D-46]
21. Watanabe, T., et al.: A method for controlling tactile sensation of surface roughness using ultrasonic vibration. In: Proceedings of IEEE ICRA, pp. 1134–1139 (1995)
22. Lille University, Project STIMTAC. http://12ep.univ-lille1.fr/?page_id=2033


23. Sednaoui, T.: Ultrasonic lubrication tablet computer. In: 2015 IEEE World Haptics Conference [D-40]
24. Wiertlewski, M.: Power optimization of ultrasonic friction-modulation tactile interfaces. IEEE Trans. Haptics 8(1) (2015)

Audio Streams Synchronization for Music Performances

Klaudia Tomaszewska, Patryk Schauer(B), Arkadiusz Warzyński, and Łukasz Falas

Department of Computer Science and Systems Engineering, Wrocław University of Science and Technology, Wrocław, Poland
[email protected]

Abstract. For many years, mechanisms for transmitting audio streams have been gaining popularity. The SARS-CoV-2 pandemic completely remodeled people's habits by preventing participation in concerts, yet the technical possibilities of musicians' remote cooperation have not been fully used. The popularity of remote communication is unquestionable; however, so far this type of communication has been based on a one-to-many model. In the case of music events, or music production in general, a many-to-one or, more generally, many-to-many model must be implemented. For this to be possible, it is necessary to solve the problem of synchronization of streams originating sequentially from many creators. In addition to the aspect of audio stream synchronization discussed in this article, one of the assumptions was also the ease of adapting the proposed solution as part of a web application.

Keywords: Audio streaming · Web services · Stream synchronisation

1 Introduction

The Internet is an inseparable part of everyone's life, and more and more familiar activities are being transferred to the digital space. With the development of technology, more and more new solutions are available to users, some of which we owe to the development of telecommunications. Technologies have emerged that enable users to send multimedia data via Internet connections. Thanks to instant messaging services such as Facebook Messenger, MS Teams or Google Meet, remote calls in real time have become fast and, additionally, free. The need for continuous digitization became even more apparent when the SARS-CoV-2 virus began to spread around the world in 2019. The pandemic has forced major changes to the economy, technology, financial markets and lifestyles of millions of people around the world [1]. The music industry has also been hit by the pandemic. Festivals and concerts that had been organized often and willingly ceased to exist after 2019. The artists did not want to cancel concerts, so they started announcing their activity online. The pandemic also created new possibilities in cultural consumption, such as online concerts [2]. However,


if an ensemble consisted of more members, or in the case of festivals with appearances by many artists, this caused technical complications. The solution described in this paper enables multi-person music ensembles, orchestras and various artists to conduct a rehearsal or concert by transferring these activities to the Internet. The solution focuses on eliminating the problem that arises during "live" communication, resulting from the lack of synchronization of streams coming from other people.

2 Related Works

Currently, there is a wide range of solutions for multimedia transmission over the Internet and real-time streaming. Most of them are used as software for videoconferencing or simply for chatting with friends. They focus on synchronizing audio and video streams to provide users with the best audiovisual experience possible. The Zoom software is a commercial example of such a solution. It allows not only efficient communication with users via Internet links, but also functions such as file transfer and screen sharing [3]. An alternative is Microsoft Teams, a web-based service that contains a set of tools and services especially useful for teamwork [4]. An interesting proposition may also be Discord, a free application used in particular for voice and text conversations. The project, whose target group was initially gamers, turned out to be so universal over time that it gained the approval of a wide community. In December 2020, it was announced that 140 million monthly active users willingly use the Discord application [5]. The application offers a lot of freedom in modifying the audio settings, thanks to which users are able to enjoy voice calls in reasonable quality even with less professional devices. Instant messaging is already very well developed. Virtually all solutions of this type offer trouble-free and high-quality communication with other users of the same software. Such messengers can be successfully used by artists to conduct an online concert or even a rehearsal with their band members. However, there is a limitation: when we join a conversation or meeting, we can hear and see everything, but only from the moment we joined; we do not know what data was transmitted before. In such a situation, in order to start a band rehearsal, for example, everyone has to be ready at practically the same time. IRCAM, the Acoustics and Music Research and Coordination Center, and the multimedia research team from CNAM-CEDRIC have been running a project called the "distributed virtual orchestra" since 2002. The aim of this project is to ensure proper conditions for musicians to play in real time via the Internet [6]. The scope of their design is as follows:

• Artists must be able to play virtually together in real time, despite being physically separated.
• The audio engineer must be able to adjust the parameters of the various sound sources in real time.
• The audience must be able to virtually participate in the concert at home, using a standard audio/video mechanism, or in a room with a dedicated installation.


The authors [6] explain that the transmission between musicians is carried out using PCM audio, the most popular method of representing an analog signal in digital systems [7], which makes it possible to hear the various audio streams in high quality. However, the main goal of these teams is to plan the various processes using appropriate real-time scheduling techniques to improve the global operation of the application. The above-mentioned project brings together aspects that are particularly important for people responsible for the proper conduct and coordination of concerts. A great advantage is the presence of a sound engineer who controls the sound streams in real time, which certainly increases the quality of the broadcast; it is a solution suitable for professional musicians or a world-famous orchestra.

3 Multimedia Transmission Concept

In this work, a prototype of a solution imitating a virtual orchestra was prepared. The target audience is users with any level of musical experience and developers who want to use such a solution in their web applications. The multimedia stream synchronization module was developed following the principles of designing distributed systems based on microservices. The developed component is to enable multiplexing of audio streams. Synchronization of audio streams is meant to give users the possibility of playing with other system users without having to worry about possible time differences, i.e. the time of joining the concert. The designed solution requires a protocol that enables efficient streaming of multimedia data via a web application. WebRTC is an open standard for real-time multimedia streaming via web browsers. It supports media such as audio, video and general data transferred between users (peers). The technologies behind WebRTC are implemented as an open standard and are available in the form of a JavaScript API supported by all leading browsers. Thanks to this standard, the user of the web application will be able to stream audio without problems to the module where the synchronization of this stream takes place [8]. In this work, receiving and synchronizing such streams takes place via a module specially designed for this purpose, so the standard use of the WebRTC protocol must be slightly modified. The most common use is to establish a connection between browsers (PeerConnection) in order to emit events between them in real time. However, if such streams need to be synchronized with a strictly defined method of operation, the connection should exist between the browser and the module. Each connection is handled by the PeerConnection mechanism, which uses the ICE (Interactive Connectivity Establishment) protocol. Before such communication is established, the parties wishing to establish it must exchange connection information with each other. Since network conditions may vary depending on various factors, this service is used to discover possible connection candidates. ICE uses either STUN (Session Traversal Utilities for NAT) or TURN (Traversal Using Relays around NAT), depending on the situation. STUN is a more common choice than TURN, which is used for more advanced solutions [18]. The STUN server can detect the presence of NAT and, in this case, obtain the external NAT-type address and port for the current connection. The TURN server, on the other hand, is used to forward network traffic, because


a direct connection between clients is often impossible unless they are in the same local network. Ready-to-use TURN server implementations are available, such as the COTURN project.
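A minimal sketch of the browser side of such a connection is shown below; the ICE server URLs and the /webrtc/offer signaling endpoint are illustrative assumptions, not part of the described system.

```typescript
// Sketch: browser side of a WebRTC connection to the sync module.
// The ICE server URLs and the signaling endpoint are assumptions.

async function connectToSyncModule(stream: MediaStream): Promise<RTCPeerConnection> {
  const pc = new RTCPeerConnection({
    iceServers: [
      { urls: "stun:stun.example.org:3478" },
      { urls: "turn:turn.example.org:3478", username: "user", credential: "secret" },
    ],
  });

  // Attach the microphone audio track(s) to be streamed for synchronization.
  stream.getAudioTracks().forEach(track => pc.addTrack(track, stream));

  // Standard offer/answer exchange, here carried over HTTP via the Fetch API.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const response = await fetch("/webrtc/offer", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(pc.localDescription),
  });
  const answer = await response.json();
  await pc.setRemoteDescription(answer);
  return pc;
}

// Usage: navigator.mediaDevices.getUserMedia({ audio: true }).then(connectToSyncModule);
```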

4 Prototype Architecture

The microservice architecture brings many benefits, such as independent and simple deployment, scalability, the use of various technologies, error isolation, and simplicity in changing the code [9]. The system, based on the principles of microservice design, consists of two components: one is a web application, and the other is the audio stream synchronization module. The choice of such an architecture opens up many possibilities for using the designed synchronization component.

Fig. 1. Component diagram

Figure 1 shows a diagram of the components that make up the described system. The "Virtual Orchestra" component is the web application, while the "Media Stream Sync" component is the module for audio stream synchronization. The Virtual Orchestra component consists of a client that provides an audiovisual experience to the user and a server that hosts the logical part of the web application. The "Media Stream Sync" component consists of a storage space for audio files and a server that contains the logic behind the synchronization of audio streams. The component responsible for synchronizing audio streams has a defined interface, thanks to which communication between it and the "Virtual Orchestra" component is possible. Figure 2 shows how the system functions with any number of clients, where a client is a system user's web browser. There is a separate connection to the stream-sync component for each client that interacts with the web application. The situation presented in the figure concerns a single web application; when there are several different web applications, a separate instance of the audio-sync component is created for each of them. Communication between these components uses the WebSocket protocol and, in some cases, HTTP (Hypertext Transfer Protocol). The WebSocket protocol enables two-way communication between the client and the server [10]. Its use makes it possible to keep the network load small and deliver messages in real time. Events related to user interactions with the site or other users are handled via


Fig. 2. Component diagram with any number of clients

the WebSocket protocol. Events are transmitted to users depending on the situation, using delivery types such as broadcast or multicast (one-to-many). With their help, specific data from the server is sent to all users (broadcast) or only to selected users (multicast). However, the peer-to-peer communication required to use the WebRTC standard is implemented with the help of the Fetch API, which provides access to and the ability to manipulate parts of the HTTP protocol, such as requests and responses.

4.1 Audio Streams Synchronization

The component dealing with the synchronization of audio streams must have a strictly defined mechanism of operation in order to perform its task properly. Referring to the design assumptions, synchronization of audio streams should make it possible to play with other system users without having to worry about possible time differences when joining the game. The most important aspect is to synchronize the streams in such a way that anyone joining can hear the previous streams from their beginning. This means that a user neither has to wait for a participant to join the game, nor has to worry that the players' streams did not start at the same time. It is worth recalling, however, that the subject of synchronization is the combining of audio streams, so any sound desynchronization resulting from the inattention or loss of concentration of a user is not taken into account. Streams are merged sequentially, which means that each subsequent person hears the previous ones, which are already in sync. Figure 3 shows the sequence of combining streams, where the circles indicate users, and the number assigned to them indicates the order of joining. The arrows show the direction of the flow of the streams, and the numbers above them indicate how many streams have already been synchronized and therefore how many streams the user can hear. The first person cannot hear any stream, because no one has played before him. The second person, at the start


of the play, will already hear the first person, and the stream that this person hears will be heard from the very beginning. The next person to join, the third, will hear the synchronized stream of the first and second person, also from the very beginning. A similar situation occurs when the fourth and fifth person join. Shifting the beginnings of the streams for subsequent people generates a delay exactly equal to the difference between the joining time of the last person and that of the first person in the chain. The first person joins at t = 0 s; the last, fifth person joins at t = 20 s. The delay in this case is 20 s, as this is the time between the first person and the last player starting to play. Players do not experience this delay directly, as it is a consequence of the timed stream offsets, which allow them to play together at the same time from start to finish. However, in order to minimize this delay, the solution can be expanded.

Fig. 3. Synchronization concept with taps
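The alignment rule illustrated in Fig. 3 can be sketched as follows; sample-level details such as buffering and clock drift are omitted, and all names are illustrative.

```typescript
// Sketch: sequential stream alignment. When player n joins, the mix of
// players 1..n-1 is sent to them from its very beginning, so their own
// captured stream is aligned with the start of the mix rather than with
// wall-clock time. All names and structures here are illustrative.

interface PlayerStream {
  joinTime: number;      // wall-clock join time in seconds
  samples: Float32Array; // mono PCM captured from the moment of joining
}

// Merge streams that are already musically aligned at position 0.
function mergeSequentially(players: PlayerStream[]): Float32Array {
  const length = Math.max(0, ...players.map(p => p.samples.length));
  const mix = new Float32Array(length);
  for (const p of players) {
    for (let i = 0; i < p.samples.length; i++) mix[i] += p.samples[i];
  }
  return mix;
}

// The delay of the finished mix equals the difference between the last and
// the first join time (20 s in the Fig. 3 example): the complete mix exists
// only once the last player, who started 20 s later, has finished.
function totalDelaySeconds(players: PlayerStream[]): number {
  const times = players.map(p => p.joinTime);
  return Math.max(...times) - Math.min(...times);
}
```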

The principle of operation is analogous; however, it is possible to detach a stream. This happens when two people start playing at a very similar time: the stream is unlinked, and all transmitted data is held until that stream is synchronized. The disconnected stream is synchronized to the very end of the chain. This means that the possibility of playing together is turned off, and then all the taps are synchronized to the main stream. This situation requires that the unlinked streams be kept in a buffer. The time window that defines whether a given stream will be detached is 4 s from the time the predecessor attached. This can be illustrated in Fig. 4: the third person joined at t = 10 s, the fourth person at t = 11 s. In this case, the fourth person was disconnected from the main stream, because the time elapsed between the third person joining and the fourth person joining is 1 s, which is less than 4 s, qualifying the latter as a stream that needs to be unhooked. A person whose stream is detached will hear the same streams as his predecessor. The use of unlinked streams also overcomes the problem of a poor internet connection: if such a stream does not reach the synchronizing component due to an unstable connection or an abrupt disconnection, it is simply forgotten. The merging of the streams takes place on an ongoing basis, which requires certain assumptions. In line with the purpose of the audio stream synchronization component, it is assumed that each person who starts playing follows the emitted stream to its very end or longer. The number of people playing at the same time can theoretically be arbitrary, but in practice it is determined by technical limitations such as equipment and


bandwidth. Users can join the game at any time while the synced stream exists. It is very unlikely that two or more people will join at exactly the same time, so the order is arbitrarily determined by their joining times. The quality of the internet connection is not under consideration, but it is nevertheless important to cover the basic cases that may influence the operation of synchronization. Such events are:

• The user has started to play, is not the first, but plays for a shorter time than his predecessor.
• The user has started to play, is not the first, but his internet connection has been abruptly cut.
• The user has started to play, is not the first, but the browser has been shut down abruptly.

Another assumption is the presence of listeners, because users, apart from playing with each other, may want to listen to the current progress of the players. Additionally, this solution creates opportunities for teachers who would like to listen to the progress of their pupils.

Fig. 4. Learners in the concept of synchronization

Figure 4 shows how listeners are realized in the described mechanism of audio stream synchronization; their presence is represented by circles labeled "S1", where "1" indicates the order in which the listener joined. The audience behaves as follows in the situation shown in Fig. 3: listeners receive the stream that was synchronized at the moment they initiated the connection. In the example of Fig. 5, if the first listener joined the listening session before a second person who would like to play, he will only hear the stream of the first player. On the other hand, if the second listener expresses a desire to listen at a moment when the third player is not yet playing, he will hear the synchronized first and second streams. The same holds for each subsequent listener. Of course, the joining of a listener does not in any way stop users who want to play together from joining the game. The listener will only hear those streams that were synchronized at the time of joining; if another person comes and wishes to play, the listener who already joined will not hear that person's playing. In that case, listening has to be restarted.
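The listener behaviour described above amounts to a snapshot rule: a listener receives, from its beginning, whatever mix existed at the moment of connecting, and later joiners are excluded until listening is restarted. A minimal sketch, with illustrative names, follows.

```typescript
// Sketch: listeners receive a snapshot of the streams that were already
// synchronized when they connected; players who join later are not added
// to an ongoing listening session (it must be restarted to include them).

interface Session {
  playerIds: string[]; // players currently part of the synchronized mix
}

interface ListenerSession {
  heardPlayerIds: string[]; // frozen at the moment of joining
}

function attachListener(session: Session): ListenerSession {
  // Copy, not reference: later joins must not appear in this session.
  return { heardPlayerIds: [...session.playerIds] };
}

const session: Session = { playerIds: ["p1", "p2"] };
const listener = attachListener(session); // hears p1 + p2 from the start
session.playerIds.push("p3");             // p3 joins the play afterwards
console.log(listener.heardPlayerIds);     // ["p1", "p2"]; p3 is not heard
```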


5 Efficiency Analysis

The correctness of the implemented solution should be checked by means of specially prepared tests. Two tests were prepared to check the correct operation of the system and the performance of the stream synchronization module. Test 1 checks the efficiency of the solution. The critical function is tested because it is where the entire synchronization and stream-merging process takes place. The average execution time of the critical function is checked: first, the average time for each user is calculated, and then, from the obtained values, the average time needed to perform all operations in the function. The test is run for three users, then five, and finally ten users. In Test 2, two system users play with each other, creating a common output stream. Both people start at about the same time, but not simultaneously. The person who starts first is asked to say every second letter of the alphabet. The second person who "plays" is supposed to fill in the alphabet with the remaining letters. The time that must elapse between the first person and the second person streaming is not less than a second. The test is repeated 5 times using the same scenario.

Test 1
The test was conducted with the following assumptions:

• Each subsequent person started playing no longer than 5 s after the predecessor.
• Everyone had to play for at least a minute.
• People cannot hang up while playing together.
• There were three repetitions of the test for each group.

Table 1. Test 1 results: average execution time of the critical function [s]

No. | 3 users | 5 users | 10 users
----|---------|---------|---------
1   | 0.00359 | 0.00433 | 0.00279
2   | 0.00320 | 0.00405 | 0.00261
3   | 0.00257 | 0.00371 | 0.00271

Table 1 summarizes the results obtained for the performed test. The time of a single execution of the function was calculated, corresponding to one pass through the function and the receipt of one data packet: the total execution time of the function for each user was divided by the number of its calls, giving the average execution time of the critical function. For a correct analysis of the results, a standard is needed against which to assess whether the obtained results are satisfactory. For this purpose, data from WebRTC and the Opus codec are used. This codec is the default choice when using WebRTC


and it was also used in the tested solution. The sampling frequency is 48 kHz, and a single data packet received on the server contains 960 samples, so each packet covers 960 / 48000 = 0.02 s, and about 50 such packets are needed for a full second of audio. The same figure can be found for the number of packets sent per second with WebRTC [11]. Thus, to determine whether the server processes all data quickly enough, the average time of a single critical-function execution should be less than 0.02 s. Considering the results of all tested groups, the average execution time of a function call does not exceed the order of milliseconds. Based on this, it can be concluded that the obtained results are satisfactory.

Test 2
The test was conducted with the following assumptions:

• The first person begins speaking the alphabet with the letter "A" and continues with every second letter.
• The second person must not start playing earlier than one second after the first person.
• The second person starts speaking the alphabet from the letter "B" and continues with every second letter, so as to finally create the entire alphabet.
• People cannot hang up while playing together.
• Letters from both sources are spoken at as similar a pace as possible (a letter is spoken about every second).

The test was performed five times with the same assumptions each time. The resulting audio tracks were analyzed using an audio editing program. Adjacent letters come from two different sources, so it is necessary to check whether they appear in the right places. Figure 5 shows an example of an analysis in which the letters in question occur in a track resulting from synchronization.

Fig. 5. An example of the analyzed audio track

If all the letters are found in the right places, the next step is to find out how far apart they are. The spacing between letters was calculated for each track. Both sources emit sounds at similar intervals of a little over a second, which means that a correct track should have spaces between letters of about half this time. Of course, one must take into account that the letters were spoken by users, which may also cause some deviations in the test. Table 2 shows the mean values of letter spacing collected during this study, and they show that the averaged time differences between the letters are similar. Given the sensitivity

Table 2. Test 2 results: average spacing between letters [s]

Trial       | 1     | 2     | 3     | 4     | 5
------------|-------|-------|-------|-------|------
Spacing [s] | 0.607 | 0.570 | 0.615 | 0.650 | 0.567

of the test to human error, such as saying a letter too quickly or too slowly, it is unlikely that the same result would be obtained every time. Nevertheless, this test allowed for visual verification that the implemented solution works correctly.
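The spacing analysis can be sketched as follows: given the onset times of the letters detected in a merged track, adjacent letters from the two alternating sources should be roughly half the per-source interval (about 0.5 s) apart. The onset values below are illustrative, not measured data from the study.

```typescript
// Sketch: verify the merged track from Test 2. Letters alternate between
// two sources; with each source speaking about once per second, adjacent
// letters in a correctly synchronized track should be ~0.5 s apart.

function averageLetterSpacing(onsetTimes: number[]): number {
  const gaps: number[] = [];
  for (let i = 1; i < onsetTimes.length; i++) {
    gaps.push(onsetTimes[i] - onsetTimes[i - 1]);
  }
  return gaps.reduce((a, b) => a + b, 0) / gaps.length;
}

// Example onsets (seconds) for A, B, C, D, E in the merged track:
const onsets = [0.00, 0.58, 1.12, 1.71, 2.26];
console.log(averageLetterSpacing(onsets).toFixed(3)); // ~0.565 s, plausible
```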

6 Summary

In the implementation phase of the audio stream synchronization component, the previously defined synchronization concept was taken into account, except for the unlinking of streams. The result of the work is a solution that allows the user of a web application to play together with other participants in real time, without losing data due to delays caused by different starting times. The tests also ended with positive results. The direction of further work should certainly include issues related to security. The concept of synchronizing and joining audio streams is also worth developing, as the existing one may turn out to be insufficient to meet the possibly growing requirements of users. This may include extending the presented work with the possibility of disconnecting streams and synchronizing them later. A further direction of work may be the possibility of appending to already saved synchronization results. It is also worth considering extending the module with various synchronization concepts, thanks to which users could choose how they would like to use the solution made available to them.

References

1. Soto-Acosta, P.: COVID-19 pandemic: shifting digital transformation to a high-speed gear. Inf. Syst. Manag. 37(4), 260–266 (2020). https://doi.org/10.1080/10580530.2020.1814461
2. Váradi, J.: New possibilities in cultural consumption. The effect of the global pandemic on listening to music. Central Euro. J. Educ. Res. 3(1), 1–15 (2021). https://doi.org/10.37441/CEJER/2021/3/1/9345
3. Sevilla, G.: Zoom vs. Microsoft Teams vs. Google Meet: which top video conferencing app is best. PC Mag (2020)
4. Microsoft Teams for Education (2021). Available: https://www.microsoft.com/pl-pl/education/products/teams. Accessed 05 Dec 2021
5. Discord Transparency Report, July-Dec 2020 (2021). https://blog.discord.com/discord-transparency-report-july-dec-2020-34087f9f45fb. Accessed 05 Dec 2021
6. Cordry, J., Bouillot, N., Bouzefrane, S.: Performing real-time scheduling in an interactive audio-streaming application
7. Black, H.S., Edson, J.O.: Pulse code modulation. Trans. Am. Inst. Electr. Eng. 66(1), 895–899 (1947)


8. WebRTC – documentation. https://webrtc.org/?hl=en. Accessed 09 Dec 2021
9. Hasselbring, W., Steinacker, G.: Microservice architectures for scalability, agility and reliability in e-commerce. In: 2017 IEEE International Conference on Software Architecture Workshops (ICSAW), pp. 243–246 (2017). https://doi.org/10.1109/ICSAW.2017.11
10. Fette, I., Melnikov, A.: The WebSocket Protocol, RFC 6455, December 2011. https://doi.org/10.17487/RFC6455, https://www.rfc-editor.org/info/rfc6455
11. WebRTC samples. https://webrtc.github.io/samples/src/content/peerconnection/audio/. Accessed 10 Dec 2021

Research to Enhance the Sense of Presence Based on the Positional Relationship Between the Sound Source and the Distance

Tomo Moriguchi(B) and Takayuki Fujimoto

Graduate School of Information Sciences and Arts, Toyo University, Tokyo, Japan
[email protected]

Abstract. In this research, through a survey and experiments, we examined whether reflecting distance in sound leads to an improvement in the sense of reality, and whether it is possible to enhance the sense of presence by reproducing the feeling of distance in an environment that each audience member can easily build. The basic idea stems from the needs of the times: the increase in online events due to the spread of COVID-19, and the loss of the sense of distance and lack of realism when watching online events. The survey results on the 'sense of reality', gathered at the first stage of the research, made it clear that the 'sense of reality' is composed of multiple kinds of sensory elements. Therefore, it is thought that if sounds reflect a sense of distance, this would enhance the 'three-dimensional feeling' and the 'sense of cause and effect', which are components of the 'sense of presence', and thus enhance the sensory reality of "being there" when viewing a video. Sound is attenuated by various causes, and in the large-scale venues where events take place, it is the 'sound attenuated by the distance from the sound source' that reaches most of the seats. To reproduce a sense of distance, two kinds of experiments were conducted: one to confirm whether people recognize it, and one to check whether it is reproducible. The results showed that the feeling of distance from an object can be perceived almost accurately from audio information, and that 'distance perception' can be altered to some extent by volume changes. Recognizing distance improves the three-dimensional sense, one of the components of the sense of reality, and matching the feeling of distance perceived by sound with the feeling of distance perceived from the image is thought to improve the 'sense of cause and effect'. Therefore, based on the results of the experiments, it is considered possible to enhance the sense of presence by changing the sound volume in line with what is shown in the video, even in a viewing environment that can be generally adjusted, as when participating in an event online.

Keywords: Sense of reality · Feeling of distance · Online event

1 Background

These days, due to the spread of COVID-19, events such as stage plays and musicals, which were originally held by gathering an audience at a venue, are often distributed online.


With the spread of such systems, viewers can enjoy an event without actually visiting the venue. In addition, organizers can deliver performances to far more people at once than they could by gathering an audience at the venue. In this respect, event streaming is an innovative system that benefits both organizers and participants. However, in many cases the distributed content is emphasized as an 'online event', with additional performances or effects that were not seen at real-world venue events. As for online events without special online effects, intended to be enjoyed at home as they are, they differ greatly from real events because of the lack of 'reality' when the audience watches the content. In fact, people who participated in online events often report that they felt as if they were watching DVDs rather than 'participating'. One possible cause of this is the loss of the 'feeling of distance'. Indeed, in the content of distributed events, images from various angles and perspectives are frequently switched, just like the images on DVDs released after events. Despite that, the 'sense of presence' gained from visual information may be faint. Even though the visually obtained perspective changes each time the screen switches, the sound is always at a constant volume, which is significantly different from how we would actually hear the sound at the real venue. As a whole, the audience's perspective does not match the image on the screen. The 'feeling of distance' that we would perceive when actually watching an event at the real venue is faint, and this leads to 'no sense of presence'. In other words, if it is possible to reflect a feeling of distance in both the image and the audio of the video, this problem can be solved and the audience's sense of presence when watching the delivered content can be enhanced. Regarding the feeling of distance in visual information, it seems easy to reflect it to some extent by keeping a constant distance between the object and the camera while shooting. However, with this method the range of viewing is limited, the value of participating in online events may decrease, and there may be a sense of confusion due to differences between visual and audio information. For these reasons, in this research, the relationship between the audio reproduction of distance perception and the enhancement of reality was examined, on the assumption that sound effects reflecting the feeling of distance can improve the audience's sense of presence.

2 Purpose

In this research, we first clarify the relationship between distance perception and the sense of presence. Next, we clarify whether auditory distance perception through sound can be a method of improving the sense of reality of video content, and whether it is possible to reproduce the feeling of distance with sound in an environment that each participant can build for online events. By clarifying a method for improving the sense of presence, it will become easier to pursue the sense of reality of content to be delivered in the future. In today's lifestyle, it is of great significance to increase the sense of presence of video content, bridging the gap between watching distributed events, such as a stage play or concert, and visiting the venue to watch events in reality. Under the influence of the spread of the new coronavirus disease 2019 (COVID-19), on February 26, 2020, the Japanese government announced a request to cancel or postpone events, and since


then, the number of events held online has been increasing sharply. According to a questionnaire survey conducted by Peatix Japan Co., Ltd. from May 26 to June 2, 2020, since March 2020 the number of online events has far exceeded the number of events gathering an audience in reality. Regarding online events, participants obviously do not have to go to the venue, and the organizers do not have to select and manage a venue. Hosting small online events is "handy" for the organizer. This is one of the causes behind the increase in online events and also one of their attractive points. However, participation fees for events held online are often set free or comparatively low compared with events that gather an audience at a venue. With that in mind, it is thought to be difficult to host online events such as stage dramas and concerts: even a one-time event requires costs for the venue, its infrastructure and stage effects, and it is impossible to earn profits unless a large audience can be expected. If it is possible to improve the sense of reality when an audience participates in online events, by enhancing their 'sense of presence' with respect to the delivered content, it will also be possible to set the participation fee higher. Just as each person in each seat of a real venue views and hears the stage differently, if the audience of an online event could select the venue and seat reflected in the content they watch, different pricing depending on the venue and seat would also be possible. By clarifying the means to improve the 'sense of presence', the profit of the organizer can be increased. It would also lead to the discovery of new things that can be done online, in addition to improved audience satisfaction and an expansion of the ways of enjoying online events.

3 Definition of Reality

The "sense of reality" is described in the dictionary as "a feeling of being there," and can be regarded as a complex sense consisting of multiple sensory elements. "The feeling of being there" can easily be assumed to be a state in which the surrounding environment can be sensed in detail and accurately; therefore, one might assume that the 'more detailed the environment', the 'more realistic' the experience. However, being able to accurately sense the surrounding situation does not necessarily mean that one feels one's own presence on the spot. It is by perceiving the surrounding conditions in detail and, at the same time, feeling the presence of oneself, that it is possible to obtain a lively sense of being there. Research on the sense of reality is widely conducted. In "the Perceptual Cognitive Mechanism of Realism and Evaluation Techniques" (2020), three components are listed: the "spatial element," the "time element," and the "body element." Furthermore, the "spatial element" is composed of [three-dimensional effect, surrounding feeling, texture], the "time element" of [motion feeling, simultaneous feeling, causal feeling], and the "body element" of [self-existence, interactive feeling, emotion]. It is said that the sense of reality is improved by having many of these elements. The explanations and definitions of each sensation appearing in multiple documents are summarized briefly below (Table 1). We suppose that it is the perception of the three-dimensional effect and the cognition of the causal feeling that change greatly when the feeling of distance is reflected in the voice or sound that accompanies the video. This is because reflecting the feeling of


Table 1. Elements of reality

Spatial element:
- Three-dimensional effect: Sense of the depth of the object and the distance between the observer and the object existing in the same space
- Texture: Properties of the surface of the object (hardness/softness, gloss (surface reflectance), transparency, sense of hot and cold (heat transfer coefficient)), which lead to the estimation of the material of the object
- Feeling of siege: A feeling that you can sense the spatial expanse around yourself, not just inside yourself

Time element:
- Dynamic feeling: The sensation of capturing temporal changes in the surroundings, which leads to the estimation of the movement and movement speed of an object
- Causal feeling: A feeling of relevance between events: that one event is caused by another, and that one event is the result of another
- Simultaneous feeling: The feeling that multiple different events are caused by the same factor

Body element:
- Self-existence: The sense of feeling the position, direction, and movement of the entire body or each part of the body
- Interactive feeling: A sense of interaction, such as a reaction when a subject interacts with an object or another person
- Emotion: Feelings of comfort and discomfort that the body feels about the object

distance to the sound creates a causal relationship such as "I can hear the sound like this because I am here". It also enables viewers to perceive and estimate the distance from the sound source to themselves not only visually but also auditorily.
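As a concrete illustration of reflecting a feeling of distance in a single audio stream (the approach examined in this paper), playback gain can be scaled with an inverse-distance law. The attenuation model and the 1 m reference distance below are standard free-field assumptions of ours, not values given in the paper.

```typescript
// Sketch: reflect a feeling of distance in a single sound source by scaling
// volume with an inverse-distance law (about -6 dB per doubling of distance
// in a free field). The 1 m reference distance is our assumption.

function distanceGain(distanceMeters: number, refDistance = 1): number {
  return refDistance / Math.max(distanceMeters, refDistance);
}

// With the Web Audio API, the gain can follow the camera/viewpoint distance
// shown in the video:
function applyDistance(ctx: AudioContext, source: AudioNode, d: number): GainNode {
  const gain = ctx.createGain();
  gain.gain.value = distanceGain(d);
  source.connect(gain).connect(ctx.destination);
  return gain;
}

// Example: a cut from a 2 m close-up to a 16 m wide shot.
console.log(distanceGain(2));  // 0.5    (-6 dB)
console.log(distanceGain(16)); // 0.0625 (-24 dB)
```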

4 Acoustic System that Enhance the Sense of Reality

Similar in focus to this research, Dolby Atmos is a system that improves the sense of presence by making use of information obtained from hearing. Dolby Atmos is a surround recording/playback method based on object audio, developed by Dolby Laboratories. Recently, the technology has also been adopted in movie theaters, where it is called 'Dolby Cinema'. Dolby Cinema enables realistic sound reproduction by installing multiple speakers on the left and right walls and the ceiling of the movie theater. For each scene of the movie, the sound to be heard from each direction is played back, with adjusted volume, from the speaker installed at the corresponding position. This system creates a sense of siege and three-dimensionality by reproducing the sound so that it surrounds the viewer. For that reason, compared to the case where the number of

92

T. Moriguchi and T. Fujimoto

sound sources is small or the sound sources are installed only in the same direction to the viewer, the playback method can give audience a strong sense of presence. In addition, it is thought that the ‘perception of self-existence’ is enhanced because different sounds can be heard depending on the direction, and eventually the sense of presence is further improved. In line with this, there is a multi-channel acoustic method as a method of acoustic reproduction that has already been put into practical use. As a premise, the sounds that humans can hear are not limited to the sounds that can be heard directly from the sound source when a person hears a sound generated from a remote point such as a stage, in a large space such as a hall. Sounds that echo in the surrounding environment, including walls and ceilings, can be heard at the same time.

Fig. 1. Multi-channel acoustic method

Figure 1 illustrates a multi-channel acoustic system. The method requires a large number of microphones and speakers. For recording, multiple microphones are installed so as to surround the player at an even distance. For playback, speakers are placed at the positions where the microphones were installed, and the sound is reproduced at the same positions and in the same orientations as it was recorded. As a result, the sound, including the echoes of the actual venue, is played from each point, creating a sense of envelopment and three-dimensionality and thus enhancing the sense of presence. However, this system has a drawback: highly realistic sound is obtained only while the viewer stays within a predetermined listening range. To reproduce the "feeling of being there," it is also important that when viewers change their viewing position, the realistic sound changes accordingly. From this point of view, an acoustic technique has already been devised for the case where the viewer switches viewing positions or where multiple people listen at different positions at the same time: the high-quality live sound field reproduction method. In this method, the direct sound generated by the source and the sound echoed by the surrounding environment are recorded separately. During playback, the directly recorded sound is played from the position where the source was installed, while the recorded echoes are played from the left, right, above, and behind the space where the viewer is. As an example, consider the reproduction of an orchestra performance in a concert hall by this method. First, a large number of microphones and speakers are installed throughout the venue, as shown in Fig. 2. On the stage where the orchestra plays, the direct sound from each instrument is recorded; reverberations and audience sounds such as clapping are recorded at the left, right, above, and behind the seating area. The characteristics of the sound in the recording environment are then analyzed to create the reverberant sound. During playback, the direct instrument sound is played from the stage speakers, and the reverberant sound is played from the speakers surrounding the seating area.

Fig. 2. High-quality live sound field reproduction method

By this means, the reproducible range becomes much wider than with the multi-channel acoustic method, and the system also responds to changes in viewing position. If viewers move closer to the stage, they hear the nearby instruments from each position they occupy: to hear the sound of an individual instrument clearly, one moves closer to the stage; to hear the music of the whole orchestra, one keeps away from it. In this way, the differences in sound caused by distance across the stage can be reproduced just as in an actual hall. Not only the sense of envelopment but also the senses of self-existence, interactivity, and causality are thought to improve, further enhancing the sense of presence.

These systems and studies share the aim of this research in that they improve the sense of reality through audio information. However, this research aims to improve the sense of presence in a reproduction environment that each participant can build individually for online events. It examines whether the sense of presence can be improved by reflecting the feeling of distance in sound generated from a single sound source, as when watching TV at home. Our method therefore differs in that it relies only on controlling the volume of sound flowing from one source to reflect the feeling of distance. Since multiple sound sources are not used, the sense of envelopment and the sense of self-existence, which strongly drive presence in the acoustic systems described above, cannot be obtained; that constraint is the novelty of this approach to acoustic reproduction. Two experiments were conducted in this study: one to clarify whether people can perceive the volume differences caused by distance differences that can occur inside a building, and one to clarify whether such differences can be reproduced by changing the sound volume.
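As a rough numerical illustration of the proposed manipulation, the sketch below (our own, not the authors' implementation; the function names and the 1 m reference distance are assumptions) scales the gain of a single mono source to mimic free-field attenuation, under which sound pressure halves (a drop of about 6 dB) for every doubling of distance:

```python
import math

def distance_gain(distance_m: float, ref_distance_m: float = 1.0) -> float:
    """Linear gain mimicking free-field attenuation of a point source:
    sound pressure halves (-6 dB) for every doubling of distance."""
    return ref_distance_m / max(distance_m, 1e-6)

def gain_db(distance_m: float, ref_distance_m: float = 1.0) -> float:
    """The same relation in decibels: -20 * log10(d / d_ref)."""
    return -20.0 * math.log10(max(distance_m, 1e-6) / ref_distance_m)

# A source "placed" 4 m away plays at one quarter of the amplitude
# (about -12 dB) it would have at the 1 m reference distance.
print(distance_gain(4.0), gain_db(4.0))  # 0.25  -12.04...
```

Whether listeners actually map such gain changes onto distance is exactly what the two experiments below test.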

5 Experiment 1

5.1 Overview

Experiment 1 confirms whether the sound attenuation due to distance can be perceived even when the differences in distance are small. First, we determined three points at different distances from the subject. After showing each position to the subject, we explained that sound would be played from each point at the same volume. The subject was then asked to stand where the installed (and subsequently moved) sound source could not be seen, and a sound of identical volume was played from each point. For each sound, subjects answered the question: "From which position do you think the sound was played?" (Fig. 3).

Fig. 3. Positional relationship upon the experiment

If a large percentage of subjects correctly identified the position where the sound source was actually placed, or if the results differed between placements near the subject and far from the subject, this would indicate that humans can perceive sound attenuation even over distance differences that occur within a single room, and that they can perceive distance from how the sound is heard.

5.2 Results/Discussion

In this experiment, the sound source was moved in the order point B → point A → point C (Fig. 4). When the sound was played from point B, 27% of the subjects judged the source position to be point A, 67% point B, and 7% point C; the majority perceived the correct position. When the source was moved to point A, all subjects identified point A, a correct-answer rate of 100%. When the sound was played from point C, 33% answered point B and 67% point C; none of the subjects answered point A.


Fig. 4. Questionnaire results of experiment 1

Compared to when the sound was played from point B, the percentage answering point B fell to less than half, while the number answering point C rose sharply. The concordance rate between the actual source position and the subjects' estimates was 78% (the average of the per-point correct rates: (67% + 100% + 67%)/3 ≈ 78%), meaning that nearly 80% of the estimates located the sound source almost accurately from auditory information alone. It was also found that even when the position estimate was not exact, the difference in how the sound is heard due to distance attenuation was clearly perceived: all subjects chose point A when the sound was played from point A, and no subject chose point A when it was played from point C. The points were placed at equal intervals of 3 m or less. Since everyone perceived a difference in how the sound was heard even at this small distance difference, people would find an even greater difference in the large venues actually used for events.

6 Experiment 2

6.1 Overview

Experiment 2 confirms whether the difference in how sound is heard at different distances can be reproduced by changing the sound volume. Beforehand, subjects were told that the same procedure as Experiment 1 would be carried out again with a 'different sound source'. In fact, the sound source was not changed: sounds whose volume was varied in three stages were played from point B, without moving the source. Subjects answered the same question as in Experiment 1: "From which position do you think the sound was played?" If the results resembled Experiment 1, or if a high percentage of subjects reported hearing sounds from different positions, this would indicate that the feeling of distance can be reproduced by adjusting the volume.

6.2 Results/Discussion

First, when the sound was played at the loudest of the three stages, 53% answered point A, 33% point B, and 13% point C.


Second, when the volume was lowered to the middle of the three stages, the percentage answering point A dropped sharply to 7%, while the percentage answering point B rose sharply to 60%. The percentage predicting point C was 33%, more than 2.5 times higher than before the volume was turned down. When the volume was further reduced to the lowest of the three stages, 87% of the subjects answered point C and the remaining 13% answered point B; none answered point A (Fig. 5).

Fig. 5. Questionnaire results of experiment 2

For each volume, the differences were as clear as in Experiment 1. This shows that the perceived position of a sound source can be altered by changing the volume, i.e., by manipulating the information obtained through hearing; it is therefore considered that the feeling of distance can be reproduced by adjusting the sound volume. Although some subjects judged the sound to come from point C even at the loudest volume, their number was distinctly smaller than when the volume was turned down. Moreover, as the volume was reduced so that the source would be perceived as far away (point C), the number answering point A decreased accordingly. A similar effect should occur when the source is meant to be perceived as near (point A): by further increasing the volume, the number of people answering point C should gradually decrease, eventually reaching zero.

7 Summary

The experiments in this study showed that humans can perceive a feeling of distance from differences in how they hear a sound. They also revealed that this distance perception can be altered to some extent by adjusting the volume, i.e., by manipulating auditory information. It follows that people's sense of presence when viewing a video can be enhanced by adjusting the volume of the sound in concert with what appears in the video. To link image and volume, the feeling of distance perceived visually from the image must be matched with the feeling of distance perceived aurally from the volume. How volume changes with the type of sound source and the distance from it has already been clarified by calculation. However, the experiments in this paper found that how people hear a sound, and what feeling of distance they derive from it, vary greatly between individuals; the individual differences in distance perception in response to the volume adjustments of Experiment 2 were especially large. At this point it is therefore difficult to establish a clear standard, perceivable uniformly by many people, for how much the volume should be raised or lowered in response to a given change in the depth of the image. Even if an online event producer adjusts the volume according to image depth, some viewers may not obtain a sense of reality at all.

At the outset of this research, we expected that improving participants' sense of presence at online events would enable various benefits: increasing organizers' profits, improving participant satisfaction, and discovering new things that can be done online. However, if only some participants can perceive the enhanced reality, it will be difficult to make much progress on these points. Therefore, to enhance the sense of presence by adjusting sound volume while people watch an online event video, it is necessary to find the average relation between volume and distance perception, and to set clear criteria, commonly perceivable by a majority of people, linking changes in the depth of the video images to the accompanying changes in volume.

References

1. Ando, H., et al.: The perceptual cognitive mechanism of realism and evaluation techniques. Feature of super immersive communication. Inf. Commun. Res. Organ. Q. Bull., 157–165 (2020)
2. Hamasaki, M.: Super-realistic acoustic technology. Basics of super-realistic technology. Video Inf. Media Assoc. Mag. 65(5), 604–609 (2011)
3. Inoue, N.: Overview of super-realistic technology. Feature: super-realistic technology. Gen. Found. J. Inf. Media Assoc. 65(5), 583–592 (2011)
4. Kaku, J.: Fundamentals of sound: sound generation and propagation. Complaints about noise and how to solve it. Minist. Internal Aff. Commun. Environ. Pollut. Coord. Comm. (66) (2011)
5. Watanabe, K.: Tutorial: background to create a realistic sensation by acoustic techniques. Res. Rep. Music Inf. Sci. 4, 1–2 (2017)
6. Oode, S., Taniguchi, T., Ando, A.: Relationship between sense of presence and subjective nearness in music listening. Technical research report of the Electronic Information and Communication Association. EA Appl. Acoust. 110(71), 1–6 (2010)
7. Oode, S., Taniguchi, T., Ando, A.: Kandoh class and auditory presence evoked by reproduced musical sound. Hear. Soc. Data 40(1), 1–6 (2010). Japan Acoustic Society's Hearing Research Committee
8. Teramoto, W., Yoshida, K., Asai, N., Hidaka, S., Gyoba, J., Suzuki, Y.: What is "sense of presence"? A non-researcher's understanding of sense of presence. Virtual Real. Soc. Jpn. 15(1), 7–16 (2010)
9. Ando, A.: High immersive audio. J. Jpn. Soc. Acoust. 78(3), 105–107 (2022)
10. 2020 Online Event Survey. Peatix Blog, Peatix Japan Co., Ltd. https://blog.peatix.com/featured/2020_onine_event_survey.html

Development of an Image Processing Application that Represents Onomatopoeia

Yuka Kojima(B) and Takayuki Fujimoto

Graduate School of Information Sciences and Arts, Toyo University, Tokyo, Japan
[email protected]

Abstract. Onomatopoeia is used frequently in daily Japanese conversation. However, for foreigners whose languages do not use onomatopoeia naturally and who are unfamiliar with its distinctive word forms and nuances, onomatopoeia is a difficult point in learning Japanese. Against this background, this research proposes an image processing application that displays images representing onomatopoeia so that non-native Japanese learners can grasp the senses and impressions of Japanese onomatopoeia from visual information alone. Each character of the Japanese 50-character syllabary that composes onomatopoeia carries its own impression; voiced sounds, for example, give an impression of 'dirtiness' or 'muddiness'. In the proposed system, an image processing method, determined subjectively by the authors, is assigned to each Japanese syllable, and users execute the processing by pressing the button representing each syllable character. In addition, for onomatopoeic sounds that are naturally easy to visualize, such as zuba (describing a slash), we provide dedicated sample buttons. The application thus lets users see the impressions and sensations of Japanese onomatopoeia, which are not originally visible, by loading images from a computer and applying processing through simple button presses. To improve the accuracy with which the application represents the impressions and sensations of onomatopoeia, individual differences in the impressions obtained from the images must be reduced; a questionnaire survey on the impressions of the Japanese syllables will therefore be required when determining the image processing assigned to each syllable.

Keywords: Japanese onomatopoeia · Japanese 50-character syllabary · Image processing

1 Background

Onomatopoeia is a string of letters representing actual sounds, or sounds that evoke images of situations or actions; a wide variety of onomatopoeia is used in manga, for example. When depicting a high-stress situation, a string of letters (English: ba-dump ba-dump, Japanese: doki-doki) is used to describe a beating heart.


Onomatopoeia describes various situations or movements with combinations of sounds. It is classified into three types: onomatopoeia created by imitating animal noises and human voices (English: ruff-ruff, screaming; Japanese: wan-wan, kya-), onomatopoeia created by imitating various real-world sounds in nature (English: pouring, slam; Japanese: zaa-zaa, gatan), and mimetic words that describe the condition of things and actions, situations, feelings, and psychological states [1]. Onomatopoeia is frequently used in everyday conversation, and at the same time it is a highly sensory vocabulary whose meaning can be understood intuitively, which smooths communication between people who share a culture and living area [6]. Japanese in particular not only uses onomatopoeia frequently in daily conversation but also has more varieties of it than other languages: there are said to be approximately 2,000 onomatopoeic words in the dictionary and 400 to 700 in everyday use. In Western countries, by contrast, onomatopoeia and mimetic words are rarely used in daily life; they are strongly associated with children and are mainly used by adults when communicating with babies or toddlers. In short, when people from countries that do not use onomatopoeia and mimetic words in daily conversation (e.g., English-speaking countries) learn Japanese, they can be expected to be confused by the sheer number of onomatopoeic and mimetic words they encounter. In seeking an easy way for non-native Japanese speakers to understand onomatopoeia used in a variety of situations, we focused on onomatopoeia that expresses the texture of objects, such as 'rough' (Japanese: zara-zara) and 'smooth' (Japanese: tsuru-tsuru) [5]. In recent years, the spread of smartphones has made it possible for anyone to take pictures and process images easily, including processing that changes the texture of an image using the functions of photo-editing applications; one example is expressing a rough texture by adding noise. Similarly, by using image processing to represent onomatopoeia, we thought we could give users a visual understanding of onomatopoeic and mimetic words that would normally be conveyed only by speech or text.

2 Purpose

This research aims to create an application that lets users freely add expression to original images by using onomatopoeia. Like other words, onomatopoeia is composed of combinations of the Japanese syllables; and since Japanese onomatopoeia is a sensory vocabulary, its range of expression is varied and flexible. In other words, an individual can freely create and express onomatopoeia even if it is not commonly used or familiar to many people. We produced an application with an entertainment aspect: anyone can easily change the texture of an image through free expression converted from onomatopoeia. Furthermore, since the proposed system is meant to convey the impression and sense of onomatopoeia through image processing, the development goal naturally includes creating a new image processing application usable by both Japanese speakers and non-native Japanese learners [9]. It may not be perfectly appropriate to use 'textures' to represent onomatopoeia imitating animal or human sounds (English: bow-wow, screaming; Japanese: wan-wan, kya-), onomatopoeia describing inanimate objects or real-world sounds (English: pouring, slam; Japanese: zaa-zaa, gatan), or mimetic words that include expressions of emotions or psychological states, because humans naturally recognize the former simply as physical sounds. Even so, the application is designed with an entertainment aspect: an image processing method is assigned to each character of the Japanese syllabary, and as onomatopoeia is entered by pressing buttons, the corresponding processing is applied in turn. In this way, users can create expressions and textures of images that are not originally visible. The proposed application lets users represent their own unique onomatopoeia through image processing by using or inventing various onomatopoeic patterns. Since each Japanese syllable in the entered onomatopoeia is recognized and reflected in the image processing one by one, no special knowledge of image processing or complicated procedure is needed, and the combination patterns that can be entered as onomatopoeia are countless.

3 Types of Onomatopoeia

Japanese onomatopoeia can be classified into several types. 'Japanese onomatopoeia' includes mimetic words, which describe actions, feelings, and psychological states. However, onomatopoeia composed of the same Japanese syllables does not necessarily fall into only one type. For example, don-don expressing the sound of hitting something is classified as onomatopoeia imitating an actual sound, while don-don expressing the way things progress one after another functions as a mimetic word: the two have the same sound but describe different things. That a single word has a variety of meanings and usages is thus a notable characteristic of Japanese onomatopoeia. As for word forms, particularly common are types in which the same sound is repeated, such as doki-doki (English: ba-dump ba-dump) and zara-zara (English: rough). These distinctive word forms distinguish sound-imitating onomatopoeia and mimetic words from other common words, and they are therefore called "onomatopoeic signs" in Japanese. Also, by combining a single word base with onomatopoeic signs, various derived forms can be created; many words with similar meanings derive from the word base "doki", such as "doki", "dokiri", "dokkiri", and "doki-doki" [4]. This too is a characteristic of Japanese onomatopoeia. It is said that Japanese onomatopoeia is generally expressed as an 'adverb + verb' while English onomatopoeia is expressed as 'a single active verb', so Japanese and English onomatopoeia have different characteristics [2]. In the first place, the definition of onomatopoeia differs between Japanese and English. In Japanese, 'onomatopoeia' covers both words imitating real-world sounds and mimetic words describing the condition of things and actions, situations, feelings, and psychological states. In English, onomatopoeia basically indicates only words imitating real-world sounds, and mimetic words are referred to separately by the term "mimetic words". This difference in definition may be one reason why Japanese onomatopoeia is more abundant than English onomatopoeia.


Another reason for the wide variety of Japanese onomatopoeia is the limited number of Japanese verbs. Taking 'laugh' as an example, English has many single verbs such as 'laugh', 'smile', 'giggle', and 'grin', whereas Japanese is essentially limited to warau and hohoemu, meaning 'laugh' or 'smile'. Instead, onomatopoeia expressing states of laughing is abundant: gera-gera (English: guffaw), niyari (English: grin), hera-hera (English: laughing foolishly), and so on. There is also said to be a difference in people's awareness of onomatopoeia between Japanese and English. In Japan, it is common and natural for people of all ages to use onomatopoeia in daily conversation, but in English, onomatopoeia is mostly used in communication between adults and babies or in picture books for babies and toddlers, and is treated as so-called 'baby talk'. This difference in awareness may be another reason why non-native Japanese learners find it difficult to master Japanese onomatopoeia. In addition, Japanese manga translated into English shows that Japanese and English formulate onomatopoeia differently. Japanese onomatopoeia is formed by combining Japanese syllables based on the real-world sounds actually heard. When such onomatopoeia is translated into English, it is not simply converted into romaji; instead, the choice of a verb or noun lets readers imagine the real-world sound from the word's meaning. For example, the sound of waves expressed as zappa-n in Japanese may be rendered as 'splash' in English. The actual sound of the waves does not resemble the word 'splash', but since the verb means 'to scatter' or 'to spread', English speakers clearly imagine and understand the situation, including the real-world sound, from the meaning of the verb. For English speakers, whose richness of imaginative expression relies mainly on the variety of verbs, the word forms, uses, and senses of Japanese onomatopoeia are unfamiliar, and they can be expected to find them difficult to learn.

4 Related Research/Cases

4.1 Kana Wave

As with onomatopoeic renderings such as 'boom' or 'bang' for explosions, sound effects are often converted into character strings; conversely, the sound of onomatopoeia can be played back to reproduce the original atmosphere as a sound effect. Kana Wave, software developed by Shogo Kawai, derives the elements 'tone', 'scale', and 'volume' from the input onomatopoeia together with a few settings. The 'tone', which changes the mood of a sound, is determined by the consonant: a common tone is assigned to all syllables on the same "line" of the Japanese syllabary. The 'scale', which changes the pitch, is determined by the vowel: a different scale step is set for each of "(consonant +) a", "(consonant +) i", "(consonant +) u", "(consonant +) e", and "(consonant +) o" within the same tone. As for 'volume', which changes the loudness, when "n" is appended to a word, the system lowers the volume gradually from the sound just before the "n"; in an onomatopoeia such as "go-o-n", the volume falls from after the "go" to the "n". In the case of vowels used in succession, as in "go-o-o-o-o", the system gradually increases the volume. Functions for adjusting the playback time and the overall pitch of the onomatopoeic sound also make it possible to create more complex sound effects. In this way, the software lets users actually listen to onomatopoeia that is usually only expressed as text. Our proposed application adopts 'image processing' rather than 'sound effects' as the means of expressing onomatopoeia.

4.2 Digital Picture Book System for Foreign Learners Who Study Japanese Onomatopoeia [3]

In recent years, the number of people from abroad learning Japanese has been increasing. Onomatopoeia is used often in daily life in Japan and also frequently in communication between infants and their parents, because it is easy for infants to understand; for non-native Japanese learners, however, it is unfamiliar and difficult. Maeda et al. (2015) therefore propose a digital picture book to support learning onomatopoeia. Earlier picture books supporting onomatopoeia learning were based on still images, which made it difficult for foreign learners to understand onomatopoeia expressing movement or change. The proposed digital picture book system instead gives the illustrations dynamic expression and adds interactive operations, so that the subtle nuances of onomatopoeia become easier to understand and learners can study them actively. Experimental results on usability show that the system is particularly well suited to expressing onomatopoeia of actions such as repetition and speed.

5 Development of the Application

5.1 Application Overview

This application was developed in Python with Tkinter, a library for building and manipulating GUI (Graphical User Interface) applications that run on desktop computers. In the proposed system, a user loads images from the computer into the application, and when onomatopoeia is input by pressing the buttons representing the Japanese syllables, the images are processed. The image processing assigned to each Japanese syllable, including the voiced sounds, semi-voiced sounds, long sounds, geminate consonants, the syllabic nasal, and contracted sounds, was determined subjectively by the authors, and the assigned processing is applied to the image at each button press. Rather than assigning completely different processing to every syllable, similar processing, or variations in processing intensity, was applied to syllables that sound similar or give similar impressions.
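The paper does not reproduce its source code, so the sketch below is our own minimal reconstruction of the described structure in Python/Tkinter (the class and button names are hypothetical, and PROCESSORS is a syllable-to-function table like the one sketched after Sect. 5.2 below):

```python
import tkinter as tk
from tkinter import filedialog
from PIL import Image  # Pillow

PROCESSORS = {}  # syllable -> processing function; see the Sect. 5.2 sketch

class OnomatopoeiaApp:
    def __init__(self, root: tk.Tk):
        self.image = None       # original image as loaded
        self.processed = None   # image after the entered onomatopoeia
        self.entered = tk.StringVar(value="")

        tk.Button(root, text="Browse", command=self.browse).pack(anchor="ne")
        tk.Label(root, textvariable=self.entered).pack()
        frame = tk.Frame(root)
        frame.pack()
        # One button per syllable; a real build would cover the full syllabary.
        for syllable in ("ka", "ha", "ga"):
            tk.Button(frame, text=syllable,
                      command=lambda s=syllable: self.press(s)).pack(side=tk.LEFT)
        tk.Button(frame, text="cancel", command=self.reset).pack(side=tk.LEFT)

    def browse(self):
        path = filedialog.askopenfilename()
        if path:
            self.image = Image.open(path).convert("RGB")
            self.reset()

    def press(self, syllable):
        if self.processed is not None:
            self.entered.set(self.entered.get() + syllable)
            # Apply the processing assigned to this syllable, if any.
            self.processed = PROCESSORS.get(syllable, lambda im: im)(self.processed)

    def reset(self):
        self.entered.set("")
        self.processed = self.image.copy() if self.image else None

if __name__ == "__main__":
    root = tk.Tk()
    OnomatopoeiaApp(root)
    root.mainloop()
```

The key design point this illustrates is that each button press composes one more processing step onto the current image, so any syllable sequence, i.e. any onomatopoeia, yields a chain of filters.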


5.2 Image Processing Methods

In this study, the authors subjectively determined the image processing method assigned to each Japanese syllable. Basically, the syllables on the same line of the syllabary were assigned the same type of processing, with the intensity varied according to the syllable's vowel: "a" and "o" were set strongest, "e" second strongest, and "i" and "u" weakest. These intensity levels are based on the difference in mouth opening when uttering each syllable. To give a few examples: first, as shown in Fig. 1, the "ka" line applies 'unsharp masking' to sharpen the outlines of the image. The "ka" line is classified as an "obstruent" in the Japanese syllabary. Obstruents are the voiced sounds, or sounds that can be voiced, which give an angular impression; they have a sharp image because they are pronounced by holding the breath for a moment and then exhaling with force [7]. For this reason, we assigned processing that sharpens the outlines of the image.

Fig. 1. Original image (left) and image representing the “Ka” line (right)

In addition, the "ha" line converts the image to RGBA and lowers the opacity of the entire image, as shown in Fig. 2. The "ha" line has no voiced-consonant mark and is classified as a "voiceless sound", which gives a light impression; among the Japanese syllables it is pronounced especially with the breath, so its image is translucent rather than one of vivid colors and solid contours [8]. For this reason, we assigned processing that changes the image's transparency by manipulating the alpha channel.


Fig. 2. Original image (left) and image representing the “ha” line (right)

As shown in Fig. 3, the "ga" line mosaics the entire image. Mosaic processing obscures an image block by block, filling each square area with a single color. Since the "ga" line is an "obstruent", giving an angular impression, and is also classified as a "voiced sound", giving an impression of dirtiness or heaviness, we assigned processing that converts a smooth image into an angular mosaic.

Fig. 3. Original image (left) and the image representing the “ga” line (right)
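The exact filter parameters are not given in the paper, so the following Pillow sketch should be read as one plausible realization of the three mappings described above ("ka": unsharp masking, "ha": alpha-channel translucency, "ga": mosaic), with intensity scaled by vowel as in Sect. 5.2; the numeric strengths are our assumptions:

```python
from PIL import Image, ImageFilter

# Processing intensity by vowel (Sect. 5.2): 'a'/'o' strongest, 'e' middle,
# 'i'/'u' weakest, based on mouth opening during utterance.
VOWEL_STRENGTH = {"a": 1.0, "o": 1.0, "e": 0.6, "i": 0.3, "u": 0.3}

def ka_line(img, vowel="a"):
    """'ka' line (obstruent, sharp impression): unsharp masking."""
    s = VOWEL_STRENGTH[vowel]
    return img.filter(ImageFilter.UnsharpMask(radius=2, percent=int(150 * s)))

def ha_line(img, vowel="a"):
    """'ha' line (voiceless, breathy): make the whole image translucent."""
    s = VOWEL_STRENGTH[vowel]
    out = img.convert("RGBA")
    out.putalpha(int(255 * (1.0 - 0.5 * s)))  # stronger vowel -> more translucent
    return out

def ga_line(img, vowel="a"):
    """'ga' line (voiced obstruent, angular/heavy): mosaic (pixelation)."""
    s = VOWEL_STRENGTH[vowel]
    block = max(2, int(16 * s))  # stronger vowel -> coarser blocks
    small = img.resize((max(1, img.width // block), max(1, img.height // block)),
                       Image.NEAREST)
    return small.resize(img.size, Image.NEAREST)

PROCESSORS = {"ka": ka_line, "ha": ha_line, "ga": ga_line}
```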

In addition, dedicated buttons for particular onomatopoeia were provided as a further way to express, through image processing, onomatopoeia that is naturally easy to visualize. Figure 4 shows an image depicting zuba (describing a 'slash' in Japanese). Since zuba generally describes the sound of an object being cut by a blade or similar, the image is rendered as if 'cut apart by a single sword stroke'.


Fig. 4. Original image (left) and Image depicting zuba: a ‘slash’ image (right)

5.3 Execution Example

To load an image into the application, the user first opens a file dialog by pressing the Browse button in the upper right corner, as shown in Fig. 5, and then selects an image file. The original image is displayed on the left and the processed image on the right, so the user can compare the two as buttons are operated. As shown in Fig. 5, when a Japanese syllable button is pressed, the entered characters are displayed, so even onomatopoeia entered at random can be checked visually. Pressing the cancel button resets the entered onomatopoeia and returns the image on the right to the original.

Fig. 5. Application screen
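As an illustration of the comparison view described above (again our own assumption about the layout, not the authors' code), Tkinter can show the original and processed images side by side as follows:

```python
import tkinter as tk
from PIL import Image, ImageTk

def show_side_by_side(root, original, processed):
    """Show the original (left) and processed (right) images for comparison."""
    photos = (ImageTk.PhotoImage(original),
              ImageTk.PhotoImage(processed.convert("RGB")))
    for col, photo in enumerate(photos):
        label = tk.Label(root, image=photo)
        label.image = photo  # keep a reference so Tk does not discard the image
        label.grid(row=0, column=col, padx=4)
```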


In addition, some buttons representing specific onomatopoeia are provided by default as samples. Pressing one of these buttons expresses, through image processing, an onomatopoeia that is easy to visualize: zuba, often used to describe the sound of an object being slashed with a blade, and zukyun, used to describe a heart becoming lovestruck. In Fig. 6, as an example, zuba is represented as an image 'cut apart by a single sword stroke'. We therefore expect the application to help non-native Japanese learners understand onomatopoeia easily in two respects: the input method using the Japanese syllabary buttons, and the image processing function that converts to images those onomatopoeia whose impressions and sensations are easy to visualize.

Fig. 6. Image processing set to the sample button

6 Conclusion

In this research, we created an image processing application as learning support, so that non-native Japanese learners can easily understand, through visual information, the impressions and sensations of onomatopoeia, which is essential to everyday Japanese. Japanese onomatopoeia is among the most difficult points for non-native learners to master, not only because of its sheer volume but also because it is a sensory vocabulary. We therefore devised an application that conveys Japanese onomatopoeia through visual information alone, without audio. The recent spread of smartphones has made it easy for people to take pictures and process images; in this context, we believe there is novelty in an image processing method that makes visible onomatopoeia, which is originally impossible to 'see'. The application can also be used for entertainment, since users can input and create onomatopoeia original to them and apply further changes to the produced image. We think the application can support non-native Japanese learners in learning onomatopoeia effectively on two conditions. One is the simple means of pressing the buttons for the syllables that compose an onomatopoeia. The other is that the image processing assigned to each syllable appropriately reflects the impression of the syllable, which is 'common' to both Japanese people and foreigners.

7 Future Work

In this research, an image processing application reflecting onomatopoeia was created; however, the processing applied to the Japanese syllables was determined by the authors' subjective judgment, so the application's accuracy with respect to helping non-native learners understand onomatopoeia is open to question. In the future, using questionnaires and other methods, the processing must be studied carefully to ensure that the image processing assigned to each syllable conveys the meaning and impression of the onomatopoeia in a universal manner. Each Japanese syllable is known to carry its own impression. This is 'sound symbolism': the connection between the sound of a word and its meaning or sensory impression is common regardless of a hearer's linguistic or cultural background. We therefore assume that by clarifying the impression of each 'sound', it is possible to determine image processing from which both Japanese people and people from abroad receive a common impression. The application needs to be improved toward a more accurate standard by incorporating image processing determined in this way, and evaluation is needed of whether people from abroad can properly understand the impressions of Japanese onomatopoeia.

References

1. Nishimura, K.: A contrastive study on onomatopoetic expressions in Japanese and English: "The Tale of Peter Rabbit" and others by Beatrix Potter and their translations in Japanese. Kinki University Center for Liberal Arts and Foreign Language Education Journal, Foreign Language Edition 5(1), 55–72 (2014)
2. Ogura, Y.: A comparative study of Japanese and English onomatopoeia. Center for Japanese Language and Culture Education, Osaka University 14, 23–33 (2016)
3. Maeda, A., Uema, H., Shirozu, N., Matsushita, M.: Digital picture book system for foreign learners who studies Japanese onomatopoeia. Trans. Jpn. Soc. Artif. Intell. 30(1), 204–215 (2015)
4. Nasu, A.: Phonological structure of mimetic neologisms and segmental unmarkedness. Jpn. Linguist. Osaka Univ. Foreign Stud. 16, 69–91 (2004)
5. Sakamoto, M., Yoshino, J., Doizaki, R., Haginoya, M.: Metal-like texture design evaluation using sound symbolic words. Int. J. Des. Creat. Innov. 4(3–4), 181–194 (2016)
6. Tamori, I.: Characteristics of Japanese onomatopoeia. J. Acoust. Soc. Jpn. 54(3), 215–222 (1998)
7. Fujisawa, N., Iwamiya, S., Takada, M.: Auditory imagery associated with Japanese onomatopoeic representation. J. Physiol. Anthropol. Appl. Hum. Sci. 23(6), 351–355 (2004)
8. Fujisawa, N., Obata, F., Takada, M., Iwamiya, S.: Impression of auditory imagery associated with Japanese 2-mora onomatopoeic representation. J. Acoust. Soc. Jpn. 62(11), 774–783 (2006)
9. Mikami, M.: Selection and teaching of the basic Japanese onomatopoeias. ICU Stud. Jpn. Lang. Educ. 3, 49–63 (2007)

MIMO State Prediction Based Adaptive Control of Aeroelastic Systems with Gust Load

Keum W. Lee1(B) and Sahjendra N. Singh2

1 Department of Electronic Engineering, Catholic Kwandong University, Gangneung, Gangwon 25601, Republic of Korea
[email protected]
2 Department of Electrical and Computer Engineering, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
[email protected]

Abstract. This article presents a state predictor-based composite adaptive flutter control system for a nonlinear multi-input multi-output (MIMO) aeroelastic system. The plunge and pitch motion of this 2-D airfoil is controlled by trailing- and leading-edge control flaps. All of the model parameters are assumed to be unknown. The uncontrolled airfoil exhibits limit cycle oscillations (LCOs) when the freestream velocity exceeds a certain value. First, an adaptive flutter control module is derived for the model, including pitch-axis and plunge-axis structural nonlinearities. Second, the design of a state predictor is presented and a composite adaptation control law is derived. Unlike traditional certainty-equivalence adaptive (CEA) systems, the composite adaptation law uses information not only on the plunge and pitch tracking errors, but also on the state prediction error. Through Lyapunov analysis, it is shown that the state vector as well as the state prediction error converge to zero. Third, simulation results for the closed-loop system, including wind gust, are presented. The results show stabilization of oscillatory plunge and pitch motion despite parameter uncertainties and wind gust.

Keywords: Aeroelastic system · Composite adaptive control · Flutter control · State predictor · Parameter identification · Airfoil oscillation · Nonlinear system · Lyapunov stability analysis

1 Introduction

The aeroelastic instabilities of aircraft (stable and unstable LCOs, divergence, and resonance), arising from the interaction of inertial, elastic, and aerodynamic forces, have been a subject of considerable importance [1,2]. These aeroelastic phenomena affect the maneuverability of aircraft and can be catastrophic [3]. In the past, various robust and adaptive control systems have been implemented for the stabilization of aeroelastic systems. Researchers at the NASA Langley Research Center developed the Benchmark Active Control Technology Wind-Tunnel (BACT) model for the study of aeroelastic instability and later implemented controllers for stabilization [4–7]. At Texas A&M University, researchers constructed a two-degree-of-freedom (plunge and pitch) device for wind-tunnel tests and explored the existence of LCOs using computational as well as experimental methods [8–10]. For this model, several authors later implemented linear, nonlinear, and adaptive control laws for the removal of LCOs using a single trailing-edge flap or trailing- and leading-edge control surfaces [8,11–18]. Later, noncertainty-equivalence adaptive (NCEA) laws were designed using the immersion and invariance principle [19,20]. The identifiers of certainty-equivalence adaptive (CEA) systems use only the tracking error for estimation; therefore, these adaptive systems cannot obtain the parameter excitation required for estimation error convergence. To overcome this limitation of CEA systems, composite adaptive laws using model prediction-based (observer-based) identifiers have been proposed [21–23]. Recently, a composite adaptive system for a single-input aeroelastic system was designed by combining a gradient-based update rule with the adaptation law of an NCEA system [24]. Identifiers of composite systems use the trajectory tracking error as well as the model prediction error for parameter estimation; therefore, these identifiers have stronger stability properties.

The objective of this paper is to design a composite adaptive control system for the stabilization of a 2-DOF MIMO aeroelastic system equipped with trailing- and leading-edge flaps. The model's parameters are presumed unknown. This aeroelastic model exhibits LCOs beyond a critical freestream velocity. The contribution of this paper is threefold. First, a control module is designed for plunge-pitch motion control. Second, a state predictor is constructed, and then a composite adaptive law is derived; the parameter update law uses feedback of the tracking error as well as the state prediction error, and asymptotic convergence of the tracking and prediction errors is established using Lyapunov analysis. Third, simulation results for the closed-loop system are obtained. These results show that the composite adaptive system including the state predictor gives better transient and steady-state responses than the CEA law despite parameter uncertainties and a triangular gust.

2 Aeroelastic Model and Control Problem

Figure 1 shows a two-dimensional airfoil equipped with leading- and trailing-edge control surfaces. In the figure, α is the pitch angle, h is the plunge displacement, b is the semichord of the wing, L and M are the aerodynamic lift and moment, and β and γ are the trailing- and leading-edge control flap deflection angles, respectively. Furthermore, xα is the nondimensional distance from the elastic axis to the center of mass, U is the freestream velocity, and kh(h) and kα(α) are the nonlinear plunge and pitch spring stiffnesses. The complete derivation of the coupled second-order differential equations for the pitch angle and plunge displacement of this laboratory apparatus is given by Platanitis and Strganac [10]. These equations for α and h are:


$$\begin{bmatrix} I_\alpha & m_W x_\alpha b \\ m_W x_\alpha b & m_t \end{bmatrix} \begin{bmatrix} \ddot{\alpha} \\ \ddot{h} \end{bmatrix} + \begin{bmatrix} c_\alpha & 0 \\ 0 & c_h \end{bmatrix} \begin{bmatrix} \dot{\alpha} \\ \dot{h} \end{bmatrix} + \begin{bmatrix} k_\alpha(\alpha) & 0 \\ 0 & k_h(h) \end{bmatrix} \begin{bmatrix} \alpha \\ h \end{bmatrix} = \begin{bmatrix} M + M_g \\ -L - L_g \end{bmatrix} \quad (1)$$

where mW is the wing section mass; mt is the total mass; Iα is the moment of inertia; xα is the nondimensionalized distance of the center of mass from the elastic axis; Lg and Mg are the lift and moment owing to wind gust; and cα and ch are the pitch and plunge damping coefficients, respectively. The structural stiffness nonlinearities kα(α) and kh(h) are specified in Eq. (5). Here, we consider a simple quasi-steady form of the aerodynamic force and moment:

$$L = \rho U^2 b C_{l\alpha} s_p \left[\alpha + (\dot{h}/U) + (0.5 - a) b (\dot{\alpha}/U)\right] + \rho U^2 b C_{l\beta} s_p \beta + \rho U^2 b C_{l\gamma} s_p \gamma$$
$$M = \rho U^2 b^2 C_{m\alpha\text{-}eff} s_p \left[\alpha + (\dot{h}/U) + (0.5 - a) b (\dot{\alpha}/U)\right] + \rho U^2 b^2 C_{m\beta\text{-}eff} s_p \beta + \rho U^2 b^2 C_{m\gamma\text{-}eff} s_p \gamma \quad (2)$$

where a is the nondimensionalized distance from the midchord to the elastic axis, sp is the span, and β and γ are the trailing- and leading-edge surface deflections, respectively. The lift and moment derivatives owing to α and the control surface deflections are Clα, Clβ, Clγ and Cmα-eff, Cmβ-eff, Cmγ-eff, respectively, where

$$C_{m\alpha\text{-}eff} = (0.5 + a) C_{l\alpha} + 2 C_{m\alpha}; \quad C_{m\beta\text{-}eff} = (0.5 + a) C_{l\beta} + 2 C_{m\beta}; \quad C_{m\gamma\text{-}eff} = (0.5 + a) C_{l\gamma} + 2 C_{m\gamma} \quad (3)$$

and Cmα = 0 for a symmetric airfoil. Similar to [19], the lift and moment generated by wind gust are presumed to be of the form

$$L_g = \rho U^2 b s_p C_{l\alpha} w_G(\tau)/U = \rho U b s_p C_{l\alpha} w_G(\tau); \quad M_g = (0.5 - a) b L_g \quad (4)$$

where wG(τ) denotes the disturbance velocity and τ is a dimensionless time variable given by τ = Ut/b. Although higher-order polynomial nonlinearities can be included, here the nonlinear functions kα(α) and kh(h) are selected as:

$$k_\alpha(\alpha) = k_{\alpha 0} + k_{\alpha 1}\alpha + k_{\alpha 2}\alpha^2; \quad k_h(h) = k_{h0} + k_{h1} h^2 \quad (5)$$
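For readers who want to experiment numerically, Eqs. (2) and (5) translate directly into code. The sketch below is our own, and the coefficient values (given in the paper's appendix, not reproduced here) must be supplied by the user:

```python
def stiffness(alpha, h, ka, kh):
    """Eq. (5): k_alpha(alpha) = ka0 + ka1*alpha + ka2*alpha^2 and
    k_h(h) = kh0 + kh1*h^2, with ka = (ka0, ka1, ka2), kh = (kh0, kh1)."""
    return (ka[0] + ka[1] * alpha + ka[2] * alpha ** 2,
            kh[0] + kh[1] * h ** 2)

def quasi_steady_loads(alpha, alpha_dot, h_dot, beta, gamma, U,
                       rho, b, sp, a, Cla, Clb, Clg,
                       Cma_eff, Cmb_eff, Cmg_eff):
    """Quasi-steady lift L and moment M of Eq. (2)."""
    eff_aoa = alpha + h_dot / U + (0.5 - a) * b * alpha_dot / U
    L = rho * U ** 2 * b * sp * (Cla * eff_aoa + Clb * beta + Clg * gamma)
    M = rho * U ** 2 * b ** 2 * sp * (Cma_eff * eff_aoa + Cmb_eff * beta
                                      + Cmg_eff * gamma)
    return L, M
```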

where kαj and khj are constant parameters. Let y = [α, h]^T ∈ R² be the controlled output. Solving Eq. (1) gives:

$$\ddot{y} = A_a [y^T, \dot{y}^T]^T + \theta_a n_l(y) + B_a u + D_G w_G \quad (6)$$


where Aa ∈ R^{2×4}, θa ∈ R^{2×5}, Ba ∈ R^{2×2}, and DG ∈ R². For low velocities U, the linearized open-loop aeroelastic model with wG = 0 is stable. However, for freestream velocities beyond a critical value of U, a supercritical Andronov-Hopf bifurcation occurs and the system exhibits LCOs. Figure 2 shows the closed orbits in the α–α̇ and h–ḣ phase planes for two values of U (20 m/s and 25 m/s). (The model parameters are collected in the appendix.) Assume that a smooth reference trajectory yr = (αr, hr)^T ∈ R², converging to zero, is given. The objective is to regulate (y − yr) to zero. For the derivation of the adaptive law, the matrices Aa, Ba, and θa are presumed unknown. Since yr converges to zero, (α, h) will converge to the origin, and the LCOs will therefore be suppressed.

3 Adaptive Control System

This section deals with the implementation of a stabilizing control law for the uncertain system Eq. (6). For design purposes, it is presumed that wG = 0. (The effect of nonzero velocity wG will be examined later by simulation.) Define the tracking error w1 = y − yr ∈ R² and w2 = ẇ1 ∈ R². Then, using Eq. (6),

$$\ddot{w}_1 = A_a [y^T, \dot{y}^T]^T + \theta_a n_l(y) + B_a u - \ddot{y}_r \quad (7)$$

Adding and subtracting −Ω²w1 − 2ζΩw2 in Eq. (7) gives:

$$\ddot{w}_1 = -\Omega^2 w_1 - 2\zeta\Omega w_2 + A_a [y^T, \dot{y}^T]^T + \theta_a n_l(y) + B_a u - \ddot{y}_r + \Omega^2 w_1 + 2\zeta\Omega w_2 \quad (8)$$

where ζ > 0 and Ω = diag{Ω1, Ω2} ∈ R^{2×2} with Ωi > 0. Define

$$y_d(y, \dot{y}, t) = -\ddot{y}_r + \Omega^2 w_1 + 2\zeta\Omega w_2 \in \mathbb{R}^2 \quad (9)$$

Utilizing Eq. (9) in Eq. (8) and factoring the input matrix Ba gives:

$$\ddot{w}_1 = -\Omega^2 w_1 - 2\zeta\Omega w_2 + B_a \left[ B_a^{-1}\{A_a (y^T, \dot{y}^T)^T + \theta_a n_l(y) + y_d\} + u \right] \quad (10)$$

Define

$$\Theta\Phi(y, \dot{y}, t) = B_a^{-1} \left[ A_a (y^T, \dot{y}^T)^T + \theta_a n_l(y) + y_d \right] \quad (11)$$

where Θ denotes the unknown parameter matrix and the regressor matrix Φ is given by

$$\Theta = [B_a^{-1} A_a, \; B_a^{-1} \theta_a, \; B_a^{-1}] \in \mathbb{R}^{2\times(4+3+2)}; \quad \Phi(y, \dot{y}, t) = [y^T, \dot{y}^T, n_l^T, y_d^T]^T \in \mathbb{R}^9 \quad (12)$$


Fig. 1. Aeroelastic system

Fig. 2. Limit cycles in the phase planes for U = 20 and 25 [m/s], a = −0.6719

It is presumed that the matrix Ba is not known, but Ba can be factored as Ba = BnΛ. Here, Bn ∈ R^{2×2} is a known matrix, while the diagonal matrix Λ = diag{λ1, λ2} ∈ R^{2×2} has unknown elements λ1 > 0 and λ2 > 0. In fact, λ1 and λ2 represent unknown variations in the effectiveness of the two control surfaces. Then, in view of Eq. (12), Eq. (10) can be written in the compact form

$$\ddot{w}_1 = -\Omega^2 w_1 - 2\zeta\Omega w_2 + B_n \Lambda[\Theta\Phi + u]; \quad \Omega = \mathrm{diag}\{\Omega_1, \Omega_2\} \quad (13)$$

Defining w = [w_1^T, w_2^T]^T = [w_1^T, \dot{w}_1^T]^T ∈ R^4, a state-variable representation of Eq. (13) is given by

$$\dot{w} = \begin{bmatrix} 0 & I \\ -\Omega^2 & -2\zeta\Omega \end{bmatrix} w + \begin{bmatrix} 0 \\ B_n \end{bmatrix} \Lambda(u + \Theta\Phi) = Aw + B\Lambda[\Theta\Phi + u] \quad (14)$$
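As a sketch (our own; the Bn values are the ones quoted in Sect. 5 for U = 16 m/s), the known matrices A and B of Eq. (14) can be assembled as follows:

```python
import numpy as np

def tracking_error_matrices(Omega1, Omega2, zeta, Bn):
    """Assemble A and B of Eq. (14): w_dot = A w + B Lambda (Theta Phi + u).

    A is Hurwitz by construction for Omega_i > 0 and zeta > 0.
    """
    Omega = np.diag([Omega1, Omega2])
    A = np.block([[np.zeros((2, 2)), np.eye(2)],
                  [-Omega @ Omega, -2.0 * zeta * Omega]])
    B = np.vstack([np.zeros((2, 2)), np.asarray(Bn, dtype=float)])
    return A, B

# Example with the paper's Case A gains (Omega1 = Omega2 = 50, zeta = 1).
A, B = tracking_error_matrices(50.0, 50.0, 1.0,
                               [[-101.1703, -16.2995], [-8.5944, 1.1070]])
```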

where A ∈ R^{4×4} and B ∈ R^{4×2} are known matrices, but Λ and Θ are unknown. Of course, Φ(y, ẏ, t) is a known regressor matrix. Note that A is a Hurwitz matrix. Let Θ̂ be an estimate of Θ, and Θ̃ = Θ̂ − Θ be the parameter estimation error. The purpose now is to stabilize the uncertain nonlinear system Eq. (14) by a proper choice of the control input u. Considering Eq. (14), the control law is chosen as

$$u = -\hat{\Theta}\Phi \quad (15)$$

Using the control law Eq. (15) in Eq. (14) gives:

$$\dot{w} = Aw - B\Lambda\tilde{\Theta}\Phi \quad (16)$$

If Θ̃ is zero, then w converges to zero exponentially.

Although a traditional CEA control system can be derived by utilizing a quadratic Lyapunov function of (w, Θ̃), the resulting update law would use only information on the trajectory tracking error and its derivative for synthesis. Furthermore, the large adaptation gains used in CEA systems for faster adaptation can cause undesirable oscillations in the transient period. Here, a state predictor-based composite adaptation law for the MIMO aeroelastic model is derived instead.

4 State Predictor, Composite Adaptation Law, and Closed-Loop Stability

We begin with the tracking error dynamics Eq. (16) of the closed-loop system. Here, a composite predictor-based (observer-based) identifier is designed [22,23]. Let ŵ be a predicted value of w. In view of Eq. (16), consider the state predictor

$$\dot{\hat{w}} = A_p(\hat{w} - w) + Aw \quad (17)$$

where ŵ ∈ R^4 is the predicted value of w and Ap is a Hurwitz matrix of the form

$$A_p = \begin{bmatrix} 0 & I \\ -\Omega_p^2 & -2\zeta_p\Omega_p \end{bmatrix} \quad (18)$$

where Ωp = diag{Ωp1, Ωp2} with Ωpi > 0. The matrix Ap should be chosen so that the state predictor has faster dynamics than the tracking error dynamics; this can be achieved by selecting Ωpi > Ωi (i = 1, 2) and ζp ≥ ζ. Let w̃ = ŵ − w be the prediction error. Subtracting Eq. (16) from Eq. (17) yields the following prediction error dynamics:

$$\dot{\tilde{w}} = A_p \tilde{w} + B\Lambda\tilde{\Theta}\Phi \quad (19)$$

The tracking error dynamics Eq. (16) and the prediction error dynamics Eq. (19) involve the uncertain matrices Λ and Θ̃. The aim now is to force w and w̃ asymptotically to zero, in spite of the uncertainties in Eq. (16) and Eq. (19), by an appropriate selection of the update law. The matrices A and Ap have stable eigenvalues; therefore, there exist unique positive definite symmetric matrices P and Pp satisfying the Lyapunov equations

$$A^T P + P A = -Q; \quad A_p^T P_p + P_p A_p = -Q_p \quad (20)$$

for any positive definite symmetric matrices Q > 0 and Qp > 0. Now, consider the quadratic positive definite Lyapunov function V given by

$$V(w, \tilde{w}, \tilde{\Theta}) = g_1 w^T P w + g_2 \tilde{w}^T P_p \tilde{w} + \mathrm{trace}(\tilde{\Theta}\Gamma^{-1}\tilde{\Theta}^T\Lambda) \quad (21)$$

where 0 < Γ ∈ R^{9×9} is an adaptation gain matrix and g1 > 0 and g2 > 0 are constants. Next, we differentiate V and use Eq. (16) and Eq. (19) to obtain:

$$\dot{V} = 2g_1 w^T P[Aw - B\Lambda\tilde{\Theta}\Phi] + 2g_2 \tilde{w}^T P_p[A_p\tilde{w} + B\Lambda\tilde{\Theta}\Phi] + 2\,\mathrm{trace}(\tilde{\Theta}\Gamma^{-1}\dot{\tilde{\Theta}}^T\Lambda) \quad (22)$$


Note that, in view of Eq. (20), one has:

$$2w^T P A w = -w^T Q w; \quad 2\tilde{w}^T P_p A_p \tilde{w} = -\tilde{w}^T Q_p \tilde{w} \quad (23)$$

Also, utilizing the relation trace(MN) = trace(NM), we have

$$-2w^T P B\Lambda\tilde{\Theta}\Phi = -2\,\mathrm{trace}(\tilde{\Theta}\Phi w^T P B\Lambda); \quad 2\tilde{w}^T P_p B\Lambda\tilde{\Theta}\Phi = 2\,\mathrm{trace}(\tilde{\Theta}\Phi \tilde{w}^T P_p B\Lambda) \quad (24)$$

Utilizing Eq. (23) and Eq. (24) in Eq. (22) gives:

$$\dot{V} = -g_1 w^T Q w - g_2 \tilde{w}^T Q_p \tilde{w} + 2\,\mathrm{trace}\left[\tilde{\Theta}\Phi(-g_1 w^T P + g_2 \tilde{w}^T P_p)B\Lambda + \tilde{\Theta}\Gamma^{-1}\dot{\tilde{\Theta}}^T\Lambda\right] \quad (25)$$

The Θ̃-dependent terms can be eliminated from Eq. (25) by selecting the adaptation law as:

$$\dot{\tilde{\Theta}}^T = \dot{\hat{\Theta}}^T = \Gamma\Phi(g_1 w^T P - g_2 \tilde{w}^T P_p)B \quad (26)$$
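Putting Eqs. (15), (17), and (26) together, one simulation step might look as follows. This explicit-Euler discretization is our own illustration (the paper works in continuous time), with shapes w, w_hat ∈ R^4, Theta_hat ∈ R^{2×9}, Phi ∈ R^9:

```python
import numpy as np

def composite_step(w, w_hat, Theta_hat, Phi, A, Ap, B, P, Pp,
                   Gamma, g1=1.0, g2=1e5, dt=1e-4):
    """One explicit-Euler step of control Eq. (15), predictor Eq. (17),
    and the composite update law Eq. (26)."""
    u = -Theta_hat @ Phi                               # Eq. (15)
    w_tilde = w_hat - w                                # prediction error
    w_hat_next = w_hat + dt * (Ap @ w_tilde + A @ w)   # Eq. (17)
    # Eq. (26): Theta_hat_dot^T = Gamma Phi (g1 w^T P - g2 w_tilde^T Pp) B
    row = g1 * (w @ P) - g2 * (w_tilde @ Pp)           # 1 x 4 row vector
    Theta_dot_T = Gamma @ np.outer(Phi, row) @ B       # 9 x 2
    Theta_hat_next = Theta_hat + dt * Theta_dot_T.T
    return u, w_hat_next, Theta_hat_next
```

Setting g2 = 0 here recovers the CEA update of Remark 1 below, since the prediction-error term then drops out.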

Then, substituting Eq. (26) into Eq. (25) gives:

$$\dot{V} = -g_1 w^T Q w - g_2 \tilde{w}^T Q_p \tilde{w} \le 0 \quad (27)$$

The Lyapunov function V(w, w̃, Θ̃) is a positive definite function of (w, w̃, Θ̃), and V̇ ≤ 0. Therefore, V(t) ≤ V(0) < ∞, and w, w̃, and Θ̃ are bounded. By integrating Eq. (27), one finds that w, w̃ ∈ L2 (the set of square-integrable functions). Because the reference trajectory yr, ẏr, and w are bounded, the signals (y, ẏ) and the elements of the regressor matrix Φ are bounded; therefore, all signals in the system, as well as the control input u, are bounded. This also implies that V̈ is bounded, so V̇ is uniformly continuous. Thus, according to Barbalat's lemma [21,25], w and w̃ tend to zero as t → ∞. Since the reference output yr was selected to converge to zero, α, α̇, h, and ḣ converge to zero. This completes the design of the composite adaptive system for LCO stabilization.

Remark 1. A traditional CEA law for regulating w to zero can be obtained using the Lyapunov function

$$V_s(w, \tilde{\Theta}) = w^T P w + \mathrm{trace}(\tilde{\Theta}\Gamma^{-1}\tilde{\Theta}^T\Lambda) \quad (28)$$

Taking its derivative and selecting an update law ˜˙ T = Θ ˆ˙ T = ΓΦwT P B Θ

(29) T ˙ one can show that Vs = −w Qw ≤ 0. Therefore, the CEA system also achieves convergence of w to zero. We note that Eq. (29) uses only the tracking error w, but the composite adaptation law Eq. (26) is a function of w as well as w. ˜ Additional signal w ˜ in the composite adaptation law Eq. (26) aids in improving the stability properties in the closed-loop system.
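As a numerical companion to the derivation, the following Python sketch shows how the Lyapunov gains of Eq. (20) can be obtained and how one Euler integration step of the composite law (Eq. (26)) and of the CEA law (Eq. (29)) might look. All matrices here are placeholders that a full simulation of the aeroelastic model would supply; this is a sketch, not the authors' implementation.

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def lyapunov_gain(A, Q):
    """Solve A^T P + P A = -Q for the unique symmetric P > 0 (Eq. (20))."""
    return solve_continuous_lyapunov(A.T, -Q)

def composite_update(Theta_hat, Phi, w, w_tilde, P, P_p, B, Gamma, g1, g2, dt):
    """One Euler step of Eq. (26): Theta_dot^T = Gamma Phi (g1 w^T P - g2 w~^T P_p) B."""
    dTheta_T = Gamma @ np.outer(Phi, g1 * w @ P - g2 * w_tilde @ P_p) @ B
    return Theta_hat + dt * dTheta_T.T

def cea_update(Theta_hat, Phi, w, P, B, Gamma, dt):
    """One Euler step of Eq. (29): Theta_dot^T = Gamma Phi w^T P B."""
    dTheta_T = Gamma @ np.outer(Phi, w @ P) @ B
    return Theta_hat + dt * dTheta_T.T

The only structural difference between the two functions is the prediction error term weighted by g2, which is exactly what Remark 1 points out.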

5 Simulation Results

This section presents simulation results for the aeroelastic model Eq. (1) perturbed by a triangular gust w_G(t) with a height of 0.7 and a duration of 1 [sec]. All cases were simulated in the presence of this gust. (To save space, the robustness of the system in the presence of other types of gust loads is not shown.) The model parameters given in [8–10] were used for simulation; these are collected in the appendix. The poles of the linearized system's transfer function for U = 16 m/s and a = −0.6719 are 2.3990 ± j13.2147 and −4.0173 ± j12.5057, and the zeros are −1.8957 ± j16.8403 for the β input and −1.2252 ± j12.4191 for the γ input. The poles of the linearized system's transfer function for U = 20 m/s and a = −0.6719 are 3.6460 ± j13.6247 and −5.3736 ± j12.6925, and the zeros are −2.0234 ± j16.8255 for the β input and −1.3437 ± j12.4069 for the γ input. The initial conditions h(0) = 0.02 [m], α(0) = 5.729 [deg], and \dot{α}(0) = \dot{h}(0) = 0 were used for simulation.

For trajectory control, y_r(t) was implemented by a fourth-order command generator of the form

(p^2 + 2\rho_1 \omega_1 p + \omega_1^2)(p^2 + 2\rho_2 \omega_2 p + \omega_2^2) y_r(t) = 0

The initial values were y_r(0) = (α_r(0), h_r(0))^T = (5.729 [deg], 0.02 [m])^T, and all derivatives of y_r at t = 0 were zero. The parameters of the command generator were set as ρ1 = ρ2 = 1, ω1 = ω2 = 10. The controller parameters were chosen to be λ1 = λ2 = 0.7, ζ = ζp = 1, Q = Qp = I4, Ω1 = Ω2 = 50 or 100, Ωpi = fm Ωi, and fm = 2 or 10.

The initial values of the parameter estimates were randomly set as \hat{Θ}(0) = Θ(0) + [0.5, −1, 0.1, −0.1, −1, 1, 1, −0.1, 0.5; 1, 8, −0.1, −1, −4, 10, −4, 0.01, −0.1]. The adaptation gain was Γ = I. The first and second rows of Bn were presumed to be [−101.1703, −16.2995] and [−8.5944, 1.1070] for U = 16 [m/s], and [−89.5141, −35.5178] and [−17.4963, 1.4796] for U = 20 [m/s]. Simulation was done using (g1 = 1, g2 = 10^5) as well as (g1 = 1, g2 = 0). In view of Remark 1, the composite adaptive system reduces to a traditional CEA system if g2 = 0. For comparison, responses of the CEA system obtained with g2 = 0 and of the composite adaptive system with (g1 = 1, g2 = 10^5) are plotted in the left and right columns of each figure, respectively.

Case A. Composite Control: U = 16 m/s, a = −0.6719, (Ω1, Ω2, fm) = (50, 50, 2), wG ≠ 0

First, the closed-loop systems with the CEA law (g1 = 1, g2 = 0) and the composite adaptive law (g1 = 1, g2 = 10^5) are simulated for (U = 16 m/s, a = −0.6719) in the presence of the triangular gust. For a realistic simulation, the maximum values of β and γ are kept within 15°. The selected responses of the CEA system (left column) and the composite adaptive system (right column) in Fig. 3 are somewhat similar. One can observe that the LCOs are suppressed and (α, h) converges almost to zero in a short time. However, due to small errors in the steady state, (β, γ) tend to nonzero constant values.
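The fourth-order command generator above can be reproduced with a few lines of SciPy; the parameter values follow the text, while the integration horizon and step are assumptions of this sketch.

import numpy as np
from scipy.integrate import solve_ivp

rho, omega = 1.0, 10.0
# Coefficients of (p^2 + 2*rho*omega*p + omega^2)^2 expanded in powers of p.
c = np.polymul([1, 2 * rho * omega, omega**2], [1, 2 * rho * omega, omega**2])

def command_generator(t, x):
    # State x = [y_r, y_r', y_r'', y_r''']; the highest derivative follows
    # from the characteristic polynomial with leading coefficient c[0] = 1.
    return [x[1], x[2], x[3],
            -(c[1] * x[3] + c[2] * x[2] + c[3] * x[1] + c[4] * x[0])]

# Pitch-angle channel: y_r(0) = 5.729 [deg], all derivatives zero at t = 0.
sol = solve_ivp(command_generator, (0.0, 2.0), [5.729, 0.0, 0.0, 0.0], max_step=1e-3)
# sol.y[0] decays smoothly to zero, as required of the reference trajectory.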


Fig. 3. Composite control: U = 16 [m/s], a = −0.6719, Γ = I, Ω1 = Ω2 = 50, fm = 2

Fig. 4. Composite control: U = 20 [m/s], a = −0.6719, Γ = I, Ω1 = Ω2 = 50, fm = 2

Case B. Composite Control: U = 20 m/s, a = −0.6719, (Ω1, Ω2, fm) = (50, 50, 2), wG ≠ 0

Next, simulation is done for the higher velocity U = 20 [m/s] with a = −0.6719. The remaining conditions of Case A are retained. Simulated responses of the CEA and composite adaptive systems are plotted in the left and right columns of Fig. 4. Again, it is seen that the composite adaptive system regulates (α, h) close to zero and the flap deflections settle to nonzero values. The CEA law, however, fails to suppress the oscillations of h, and γ undergoes persistent saturation (see the left column of Fig. 4).

Case C. Composite Control: U = 20 m/s, a = −0.6719, (Ω1, Ω2, fm) = (100, 100, 10), wG ≠ 0

Finally, the effect of the parameters Ωi and fm on the responses is examined. For this purpose, simulation is done for Ωi = 100 and fm = 10 with U = 20 [m/s]; the eigenvalues of the matrices A and Ap now lie deeper in the stable region of the complex plane. The remaining design parameters of the controller of Case A are retained. We observe that both the CEA law and the composite adaptive law are able to regulate (α, h) and (β, γ) to zero (see Fig. 5). However, the composite adaptive law has better transient responses.

These results confirm that the composite adaptive law (Eq. (15) with the parameter adaptation law Eq. (26)) achieves superior stabilization properties compared with the CEA law (Eq. (15) with the simplified update rule Eq. (29)). The advantage of prediction error feedback is apparent from Eq. (27) for the derivative of the Lyapunov function V: the additional negative definite quadratic term -g_2 \tilde{w}^T Q_p \tilde{w} in \dot{V} accelerates the convergence of V(t) (i.e., of w and \tilde{w}) to zero.


Fig. 5. Composite control: U = 20 [m/s], a = −0.6719, Γ = I, Ω1 = Ω2 = 100, fm = 10

6 Conclusions

In this paper, we designed a state predictor-based composite adaptive flutter control system for a nonlinear multi-input multi-output (MIMO) 2-DOF airfoil model. The uncontrolled model exhibits LCOs when the freestream velocity exceeds a critical value. The wing model was controlled by trailing- and leading-edge control surfaces, and the model parameters were assumed to be unknown. First, an adaptive control module was constructed by utilizing a linearly parameterized form of the system dynamics. Then, a state predictor was constructed and a composite adaptive control system was developed by utilizing the plunge displacement and pitch angle trajectory tracking errors and the state prediction error. Via Lyapunov analysis, asymptotic convergence of the trajectory tracking error as well as the state prediction error was confirmed. Simulation results showed robust performance despite parameter uncertainties and wind gust. Furthermore, it was shown that the composite adaptive law performs better than the CEA system.


Appendix

1. System Parameters

a = −0.6719
b = 0.1905 [m]
sp = 0.5945 [m]
ρa = 1.225 [kg/m³]
rcg = −b(0.0998 + a) [m]
xα = rcg/b
ch = 27.43 [kg/s]
cα = 0.0360 [N·s]
kh0 = 2844 [N/m]
kh1 = 0.09 kh0 [N/m]
kα = 12.77 + 53.47α + 1003α² [N·m]
mwing = 4.340 [kg]
mw = 5.23 [kg]
mt = 15.57 [kg]
Icgw = 0.04342 [kg·m²]
Icam = 0.04697 [kg·m²]
Iα = Icam + Icgw + mwing · rcg² [kg·m²]
clα = 6.757 [rad⁻¹]
clβ = 3.774 [rad⁻¹]
clγ = −0.1566 [rad⁻¹]
cmα = 0 [rad⁻¹]
cmβ = −0.6719 [rad⁻¹]
cmγ = −0.1005 [rad⁻¹]

2. Performance due to different control laws

Type | State predictor | Parameter update | Performance
CEA | X | No prediction error | Transient oscillations
Composite control | O | Prediction error used | Stabilized outputs

References

1. Lee, B.H.K., Price, S.J., Wong, Y.S.: Nonlinear aeroelastic analysis of airfoils: bifurcation and chaos. Prog. Aerosp. Sci. 35(3), 205–334 (1999). https://doi.org/10.1016/S0376-0421(98)00015-3
2. Thomas, J.P., Dowell, E.H., Hall, K.C.: Nonlinear inviscid aerodynamic effects on transonic divergence, flutter, and limit-cycle oscillations. AIAA J. 40(4), 638–646 (2002)
3. Mukhopadhyay, V.: Historical perspective on analysis and control of aeroelastic responses. J. Guidance Control Dyn. 26(5), 673–684 (2003)
4. Waszak, M.R.: Robust multivariable flutter suppression for benchmark active control technology wind-tunnel model. J. Guidance Control Dyn. 24(1), 147–153 (2001)
5. Mukhopadhyay, V.: Transonic flutter suppression control law design and wind-tunnel test results. J. Guidance Control Dyn. 23(5), 930–937 (2000)
6. Barker, J.M., Balas, G.J.: Comparing linear parameter-varying gain-scheduled control techniques for active flutter suppression. J. Guidance Control Dyn. 23(5), 948–955 (2000)
7. Scott, R.C., Pado, L.E.: Active control of wind-tunnel model aeroelastic response using neural networks. J. Guidance Control Dyn. 23(6), 1100–1108 (2000)
8. Ko, J., Kurdila, A.J., Strganac, T.W.: Nonlinear control of a prototypical wing section with torsional nonlinearity. J. Guidance Control Dyn. 20(6), 1181–1189 (1997)
9. Block, J.J., Strganac, T.W.: Applied active control for a nonlinear aeroelastic structure. J. Guidance Control Dyn. 21(6), 838–845 (1998)


10. Platanitis, G., Strganac, T.W.: Control of a nonlinear wing section using leading- and trailing-edge surfaces. J. Guidance Control Dyn. 27(1), 52–58 (2004)
11. Xing, W., Singh, S.N.: Adaptive output feedback control of a nonlinear aeroelastic structure. J. Guidance Control Dyn. 23(6), 1109–1116 (2000)
12. Behal, A., Marzocca, P., Rao, V.M., Gnann, A.: Nonlinear adaptive control of an aeroelastic two-dimensional lifting surface. J. Guidance Control Dyn. 29(2), 382–390 (2006)
13. Gujjula, S., Singh, S.N., Yim, W.: Adaptive and neural control of a wing section using leading- and trailing-edge surfaces. Aerosp. Sci. Technol. 9(2), 161–171 (2005)
14. Behal, A., Rao, V.M., Marzocca, P., Kamaludeen, M.: Adaptive control for a nonlinear wing section with multiple flaps. J. Guidance Control Dyn. 29(3), 744–749 (2006)
15. Reddy, K.K., Chen, J., Behal, A., Marzocca, P.: Multi-input/multi-output adaptive output feedback control design for aeroelastic vibration suppression. J. Guidance Control Dyn. 30(4), 1040–1048 (2007)
16. Lee, K.W., Singh, S.N.: L1 adaptive control of a nonlinear aeroelastic system in spite of gust load. J. Vib. Control 19(12), 1807–1821 (2013)
17. Zhang, K., Wang, Z., Behal, A., Marzocca, P.: Novel nonlinear control design for a two-dimensional airfoil under unsteady flow. J. Guidance Control Dyn. 36(6), 1681–1694 (2013)
18. Wang, Z., Behal, A., Marzocca, P.: Model-free control design for multi-input multi-output aeroelastic system subject to external disturbance. J. Guidance Control Dyn. 34(2), 446–458 (2011)
19. Lee, K.W., Singh, S.N.: Multi-input noncertainty-equivalent adaptive control of an aeroelastic system. J. Guidance Control Dyn. 33(5), 1451–1460 (2010). https://doi.org/10.2514/1.48302
20. Mannarino, A., Mantegazza, P.: Multifidelity control of aeroelastic systems: an immersion and invariance approach. J. Guidance Control Dyn. 37(5), 1568–1582 (2014)
21. Slotine, J.-J.E., Li, W.: Applied Nonlinear Control. Prentice-Hall, Hoboken, NJ (1991)
22. Krstic, M., Kanellakopoulos, I., Kokotovic, P.: Nonlinear and Adaptive Control Design. Wiley, New York (1995)
23. Lavretsky, E.: Predictor-based model reference adaptive control. J. Guidance Control Dyn. 33(4), 1195–1201 (2010)
24. Lee, K.W., Singh, S.N.: Generalized composite noncertainty-equivalence adaptive control of a prototypical wing section with torsional nonlinearity. Nonlinear Dyn. 103(3), 2547–2561 (2021). https://doi.org/10.1007/s11071-021-06227-3
25. Ioannou, P., Sun, J.: Stable and Robust Adaptive Control, pp. 85–134. Prentice-Hall, Hoboken, NJ (1995)

Data Science and Data Analysis

Game Theory as a Method for Assessing the Security of Database Systems

Arkadiusz Warzyński, Katarzyna Łabuda, Łukasz Falas, and Patryk Schauer(B)

Department of Computer Science and Systems Engineering, Wrocław University of Science and Technology, Wrocław, Poland
[email protected]

Abstract. This paper addresses the topic of database security, which is a critical component of many systems. The goal of the work is to investigate the effectiveness of methods for ensuring database security. The work introduces an innovative way of evaluating the effectiveness of launched attacks. A literature review identifies current solutions in areas relevant to database security and establishes the state of the art. The final analysis introduces cost estimation and a transformation of the results into a payoff matrix, which allows the use of decision-making methods from the field of game theory. Neither party, defender or attacker, obtained a dominant strategy, but the application of the min-max criterion showed that, given the introduced assumptions, the defenders' best strategy is to implement all means of protection. The presented evaluation method can be applied to decision making in cyber security and contribute to cost optimization in the organization.

Keywords: Databases · Computer security · Game theory · Security assurance methods · Vulnerability · Attacks

1 Introduction

Database systems are collections of data stored according to specific rules. In the case of relational databases, it is also important to reflect dependencies between data, in the form of entities and relationships. Storing information in databases facilitates access to data because of its formally defined structure. Interaction between a user or application and a database is conducted through SQL queries.

The management of databases and other resources in an organization is governed by ISO/IEC 27001:2013 [1]. This standard is a set of general protection rules and principles to be followed when handling an organization's resources. The principles contained in ISO/IEC 27001:2013 classify an organization's resources according to three security attributes: confidentiality, integrity, and availability. Maintaining confidentiality, integrity, and availability often becomes a fundamental objective for maintaining information protection in an organization. Achieving this goal can be thought of as a game theory problem in which the task is to find the optimal solution


to the conflict between players: the attackers/hackers, who want to obtain protected data, and the defenders/victims, who are trying to maintain security.

This work presents a method for evaluating the effectiveness of database security assurance methods using elements of game theory. The results presented in this work are intended to illustrate the possibility of applying a new approach to the analysis of costs and risks related to breaches of database security. This should lead to easier decision making in terms of investing in security solutions, developing one's own security measures, and planning the time needed to achieve the required level of data security.

2 Related Works

Research works regarding system and database security assessment fall into several typical approaches. One common approach is based on static risk assessment performed with defined checklists, which is later translated into a score value that indicates the calculated security level [2, 3]. Several works also propose a modeling-based approach, where the system's entities and relations are considered and, similarly to risk-based assessment, each element of the model is assigned a value which is later utilized for the calculation of an overall security level [4]. In the field of modeling-based assessments, some research works also propose ontologies for threat description, which model the threat level by defining its properties and relations, supported by data from security databases like the NVD or the Snort rules database [5, 6]. Some of the proposed solutions for security assessment also introduce models based on machine learning and prediction methods to assess the security level on the basis of gathered system parameters [7]. Finally, some of the assessment methods, similarly to this work, try to model the security assessment problem from a game theory standpoint, depicting the attackers and defenders as players participating in a game whose rules depend on the system under consideration and the type of threat that is analyzed [8, 9].

3 Database Security Threats

The scope of securing systems may vary, and ensuring database security is affected by many factors that seemingly do not directly concern databases; these may even be aspects beyond the technological framework. Key issues in ensuring database security are: injection-class attacks (especially SQL injection), attacks on service availability, broken access control, activity monitoring, data encryption, social engineering, and open-source intelligence. This work presents a preliminary concept for the use of game theory elements in the assessment of database security; conducting additional tests and measurements will still require the preparation of a larger research environment. For this reason, the further considerations cover vulnerabilities related to the SQL language and the possibility of injecting one's own code, attacks on service availability, and data encryption.


SQL Injection
Queries written and run by the user directly in the database are not usually a direct source of vulnerability, since the user has control over what the query contains. The situation is different on the backend side, which is the intermediary element between the application user and the database. This is indeed a vulnerable point in the discussed data flow path. Vulnerabilities in the code implemented in the logical layer allow SQL queries to be injected. SQLi attacks are often identified as the most impactful threat among attacks on web applications. They allow data to be taken over or, in extreme cases, even control of the system to be seized [10]. Vulnerable applications using sensitive data can perform many dangerous operations in case of an SQLi attack. When investigating the effectiveness of SQL injection attacks, the focus should be on tools that are able to detect application vulnerabilities. Systems such as OWASP ZAP or SQLmap can be used for activities of this nature. The tool from OWASP is run as an application with a user interface, in which it is possible to indicate places for vulnerability testing. The SQLmap tool has a different interface and is used in a console by running appropriate commands. OWASP ZAP has a large set of tools which allows various types of attacks to be carried out using additional libraries; to perform SQL injections, OWASP ZAP uses an add-on based on SQLmap. (A schematic contrast between a vulnerable query and a parameterized one is sketched at the end of this section.)

Service Availability
Availability is a characteristic of system reliability [11]. Attacks on availability may seem harmless, but it is important to be aware of the associated risks. There are situations where limiting accessibility is motivated by an attempt to force a connection to a fake application: while the attacked application is not responding to queries, an almost identical instance controlled by the attacker is launched. With the help of such a system, numerous thefts of user data can be carried out, which also entails the possibility of further breaches and frauds. Defining the required availability over time is important in connection with maintenance work, but it also involves the need to secure the system at an appropriate level against DoS (Denial of Service) attacks. DoS attacks directed at a server instance aim to saturate its computing power resources, which are used to handle requests [12]. DoS attacks can also be carried out directly against databases; load in the context of databases refers to the number and complexity of SQL queries that must be sent in a very short time. Limiting the impact of attacks on availability can be done on many levels [13]. The basic element is to limit the traffic flowing into the system, which can be done by reducing the traffic to necessary queries while discarding the part that is not required, e.g. disabling interfaces and ports that are not necessary for the system's operation.

Data Encryption
Managing data processing in an organization is not easy, and a large part of the data is sensitive. According to GDPR (RODO) requirements and security recommendations, such data should be particularly protected against unauthorized access. Securing sensitive data involves making it difficult to read for unauthorized persons, including even ordinary users who use the system but do not directly use the displayed data [14]. Three types of securing sensitive data by making information difficult to read can be distinguished: anonymization, pseudonymization, and masking.
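The sketch announced above contrasts string concatenation with query parameterization. It uses Python's built-in sqlite3 module purely for illustration; the application actually tested in this work (OWASP Juice Shop) is not a Python system.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('a@example.com', 'secret')")

email = "' OR '1'='1"  # a classic injection payload

# Vulnerable: the payload becomes part of the SQL text and matches every row.
vulnerable = conn.execute(
    "SELECT * FROM users WHERE email = '" + email + "'").fetchall()

# Parameterized: the payload is bound as data, so no row matches.
safe = conn.execute(
    "SELECT * FROM users WHERE email = ?", (email,)).fetchall()

print(len(vulnerable), len(safe))  # prints: 1 0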


4 Our Approach

The research includes testing the effectiveness of selected methods of protection against attacks in selected areas of database systems. It covers testing the code vulnerability of a particular query before and after protection by the selected method. Vulnerabilities will be detected using the attacks offered by the SQLmap tool; the application whose vulnerabilities will be detected is the Juice Shop maintained by OWASP.

The research conducted in the availability area will include an examination of the effectiveness of detecting and blocking a DoS attack. The results obtained in four situations will be analyzed: application server unsecured and not attacked, application server unsecured and attacked, application server secured and not attacked, and application server secured and attacked. The application used for the research will again be OWASP Juice Shop. DoS attacks will be mimicked with the Hulk DoS Tool, which generates many HTTP requests.

The data encryption area will be tested using a free version of the commercial Acra product provided by the manufacturer, Cossack Labs. Research in this area consists of checking the effectiveness of data encryption by the Acra system within a database linked to an application (a blog) made available as part of the demonstration project. This will allow us to determine what data would be obtained during a leak involving unauthorized access directly to the database.

In our approach, considering cyber security as a game should be done under certain conditions, due to the specificity of the environment in which such a game takes place. With the constant emergence of new vulnerabilities, attackers have an infinite number of chances to win, limited only by their time in attempting attacks, as well as their resources in the form of will and determination, the performance of the devices used to carry out attacks, or the amount of electricity consumed, which comes down to a limitation of money. The situation is similar on the defenders' side: as more vulnerabilities are detected, the construction of defenses also becomes easier, leading to continuous improvements in protection. To consider cyber security in the context of game theory, it is necessary to set certain conditions:

• a limited time-space in which the game takes place,
• limited resources to be used within a certain time period,
• attacks and defenses take place in a defined technological environment, excluding the existence of undetected vulnerabilities,
• the defenders and attackers do not know what strategies the opponent adopted.

The conditions for attackers and defenders formulated in Table 1 allow us to conclude that the cyber security game is not a zero-sum game; formally, the attacker never loses and the defender never wins.

Running SQLmap to detect vulnerabilities produced two types of results, for the secured and unsecured portions of the application code. Tests were conducted on unprotected application code using five types of techniques. For two of them, the SQLmap tool identified vulnerable parameters: the Boolean-based blind SQL injection and time-based blind SQL injection attack methods. For the remaining attack methods, no vulnerabilities were detected. For the modified code, no vulnerabilities were detected in any of the cases, and the attacks generated by SQLmap failed.

Game Theory as a Method for Assessing the Security

129

Table 1. Playing conditions

Attackers | Defenders
There is always a dominant strategy | Defenders never know how they will be attacked
Attackers have an infinite number of chances of winning | They do not know if the protection is sufficient
Their payoffs in the event of failure will not be harmful | They should keep a balance between potential difficulties for users and good security
One successful attack is enough | Full protection is required - one failure is enough to break the defense

Detailed results are shown in Table 2. The vulnerable parameter for both successful attack methods is email. In both cases, no vulnerabilities were detected after securing the code snippet. The changes between unprotected and protected code are also visible in the scope of generated errors: in the case of unprotected code, the 401 code, i.e. an unauthorized access attempt, is detected, while in the case of protected code no such error occurs for any method. In all cases, however, error 500 appears, i.e. an internal server error, indicating that the user has entered incorrect data.

Table 2. SQL injection attack results

Attack method | Code state | Vulnerable parameter | Number of 401 HTTP responses | Number of 500 HTTP responses
Boolean-based blind SQL injection | Original | email | 3491 | 59
Boolean-based blind SQL injection | Protected | – | – | 1758
Time-based blind SQL injection | Original | email | 140 | 16
Time-based blind SQL injection | Protected | – | – | 910
Error-based SQL injection | Original | – | 10 | 2
Error-based SQL injection | Protected | – | – | 12
UNION query SQL injection | Original | – | 9156 | 902
UNION query SQL injection | Protected | – | – | 752
Stacked queries SQL injection | Original | – | 347 | 18
Stacked queries SQL injection | Protected | – | – | 365


The loading time of the main page of the launched OWASP Juice Shop was measured. The results of the "Measure page load" test from the Ubuntu virtual machine are as follows:

• without Snort protection, without DoS attack: about 4 s, page fully loaded
• without Snort protection, with DoS attack: about 10 s, page did not fully load - the product list in the shop does not appear, so using the shop is not possible
• with Snort protection, without DoS attack: about 6 s, page fully loaded
• with Snort protection, with DoS attack: about 11 s, page fully loaded.

The results show that the operation of the Snort IPS increases the loading time of the shop page, which is due to its inline mode of operation: to be able to block connections, Snort analyses each request before passing it on. For the encryption tests, the contents of the database were checked before and after running the encryption module of the Acra system. Adding new entries in the test application, which simulates a blog, resulted in the appearance of new plaintext and encrypted records, respectively.

5 Experimental Evaluation

For the purpose of the analysis, we introduce the time cost of applying the selected strategies, which can be expressed in terms of the amount of employee time required (FTE) and is convertible into a monetary cost associated with employment. Each undefended attack also has a cost, which is converted into money in an appropriate way: the losses and gains resulting from the attack. Collecting the relevant values, after transforming the payoff table, allows the optimal defense strategy to be chosen. To simplify the analyses presented here, all effects of successful attacks are reduced to single and immediate costs, without considering them as long-term effects. The introduced assessment method includes the following steps:

• determining the cost for both players, in case of a successful and an unsuccessful attack,
• determining the profit made by both players, in case of a successful and an unsuccessful attack,
• checking the existence of dominant strategies as optimal solutions and of dominated strategies as solutions to be removed from the game,
• in the absence of dominant strategies, finding the most favorable strategies of the defenders according to the min-max criterion proposed by Leonard Savage [19].

In this case we consider an online shop that stores sensitive data such as addresses, emails, etc. and makes medium-sized sales with an average monthly profit of x. Depending on the attack, specific effects affecting the amount of loss are possible. For the three areas selected for study, the most damaging consequences are assumed in the case of no protection and a successful attack. For simplicity of comparison, the estimated costs are calculated as the average monthly cost of the activities carried out. The values for the individual protection methods are presented in Table 3, for the attack methods in Table 4, and for the losses and profits in Table 5.


Table 3. Costs for defenders relating to the application of particular methods of protection

Protection method | Software and hardware | Employee | Sum
Query parameterization | 0 | 0.02x | 0.02x
Snort IPS implementation | 0.025x | 0.03x | 0.055x
Encryption with Acra | 0.001x | 0.03x | 0.031x

Table 4. Costs to attackers of carrying out individual attacks

Attack method | Software and hardware | Attacker | Sum
SQL injection | 0 | 0.1x | 0.1x
DoS with Hulk | 0.025x | 0 | 0.025x

Table 5. Gains of attackers and losses of defenders in case of a successful attack with a specific effect

Effect of attack | Loss of defender | Profit of attacker
Lack of availability | 0.25x | 0.125x
Data leak | 0.75x | 0.5x

The analyzed environment includes the three areas indicated in the research, which consequently allows a variety of strategies to be created. The combinations of player strategies include the selected methods of protection or attack for defenders and attackers, respectively; both the defender and the attacker have 8 strategies to choose from (a sketch of this enumeration follows below). The cost of each strategy is equal to the sum of the costs of the selected methods. The gain from winning or losing is equal to the sum of the corresponding gains/losses. Abbreviations used in Table 6: P - query parameterization, I - Snort IPS, E - Acra encryption, S - SQL injection, H - DoS with Hulk, D - data stealing, _ - lack of protection/attack with a given method. At the intersection of the strategies of the attacker and the defender, the payoff for each side can be read, which means a profit or a loss depending on the value. It is assumed that the implementation of a protection method in a given area always protects against attacks in that area, while any attack directed at an unprotected area is successful. The attackers' strategies are focused on profit maximization; despite the resources invested, they have the potential to benefit from attacks. From the defenders' perspective, it is important to minimize losses: if the organization is protected, they will achieve their goal, but there will be no tangible financial gain.
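The strategy space mentioned above can be reproduced mechanically. The sketch below enumerates the defender's 2³ = 8 strategies and derives each strategy's cost from the per-method sums of Table 3 (working with coefficients of x); attack strategies would be enumerated in the same way.

from itertools import combinations

# Monthly cost of each protection method, as a multiple of x (Table 3).
method_cost = {"P": 0.02, "I": 0.055, "E": 0.031}

strategies = []
for r in range(len(method_cost) + 1):
    for combo in combinations("PIE", r):
        # Label with '_' for an omitted method, e.g. 'P_E'.
        label = "".join(m if m in combo else "_" for m in "PIE")
        cost = sum(method_cost[m] for m in combo)
        strategies.append((label, round(cost, 3)))

for label, cost in strategies:
    print(label, f"{cost}x")
# e.g. 'PIE 0.106x' matches the defender's outlay visible in Table 6.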


Table 6. Table of payoffs depending on the choice of defense/attack strategy. The values on the left refer to the defender and on the right to the attacker.

Defense \ Attack | SHD | SH_ | S_D | S__ | _HD | _H_ | __D | ___
PIE | −0.106x, −0.345x | −0.106x, −0.125x | −0.106x, −0.32x | −0.106x, −0.125x | −0.106x, −0.245x | −0.106x, −0.025x | −0.106x, −0.125x | −0.106x, 0.0x
PI_ | −0.825x, 0.145x | −0.075x, −0.125x | −0.825x, 0.18x | −0.075x, −0.125x | −0.825x, 0.265x | −0.075x, −0.025x | −0.825x, 0.28x | −0.075x, 0.0x
P_E | −0.301x, −0.22x | −0.301x, 0.0x | −0.051x, −0.32x | −0.051x, −0.125x | −0.301x, −0.125x | −0.301x, 0.1x | −0.051x, −0.125x | −0.051x, 0.0x
P__ | −1.02x, 0.28x | −0.27x, 0.0x | −0.77x, 0.18x | −0.02x, −0.125x | −1.02x, 0.4x | −0.27x, 0.1x | −0.77x, 0.28x | −0.02x, 0.0x
_IE | −0.836x, 0.155x | −0.836x, 0.375x | −0.831x, 0.18x | −0.831x, 0.4x | −0.086x, −0.245x | −0.086x, −0.025x | −0.086x, −0.125x | −0.086x, 0.0x
_I_ | −0.805x, 0.155x | −0.805x, 0.375x | −0.805x, 0.18x | −0.805x, 0.4x | −0.805x, 0.265x | −0.055x, −0.025x | −0.805x, 0.28x | −0.055x, 0.0x
__E | −1.031x, 0.28x | −1.031x, 0.5x | −0.781x, 0.18x | −0.781x, 0.4x | −0.281x, −0.145x | −0.281x, 0.1x | −0.031x, −0.125x | −0.031x, 0.0x
___ | −1.0x, 0.28x | −1.0x, 0.5x | −0.75x, 0.18x | −0.75x, 0.4x | −1.0x, 0.4x | −0.25x, 0.1x | −0.75x, 0.28x | 0.0x, 0.0x

As part of solving the game, the results are read from the payoff table, which shows the profits or costs incurred by the players. In the search for dominant strategies, all the payoffs possible with each strategy are compared depending on the strategy of the opponent. Comparing the payoffs in the table shows that there is no dominant strategy for either side, which means that there is no strategy that is best regardless of the opponent's chosen strategy. If the attacker does not use the no-attack strategy, one can identify a dominated strategy among the defender's strategies, one whose payoffs are never better than those of the other strategies: the no-protection strategy. The defender should not choose it. According to the payoff table, the defenders never gain anything as a winning; their success is the lowest possible cost incurred in the implementation of the protection methods. Choosing the best defender strategy in this case therefore requires the analysis of individual payoffs under the min-max criterion proposed by Leonard Savage [19].
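The selection step can be checked mechanically. The sketch below uses the defender payoffs transcribed from Table 6 (coefficients of x), tests for a dominant defense strategy, and then applies a minimax-regret rule, which is our reading of the criterion attributed to Savage [19]; it reproduces the conclusion that all three protections should be implemented.

import numpy as np

rows = ["PIE", "PI_", "P_E", "P__", "_IE", "_I_", "__E", "___"]
D = np.array([  # defender payoff for each (defense, attack) pair, from Table 6
    [-0.106, -0.106, -0.106, -0.106, -0.106, -0.106, -0.106, -0.106],
    [-0.825, -0.075, -0.825, -0.075, -0.825, -0.075, -0.825, -0.075],
    [-0.301, -0.301, -0.051, -0.051, -0.301, -0.301, -0.051, -0.051],
    [-1.020, -0.270, -0.770, -0.020, -1.020, -0.270, -0.770, -0.020],
    [-0.836, -0.836, -0.831, -0.831, -0.086, -0.086, -0.086, -0.086],
    [-0.805, -0.805, -0.805, -0.805, -0.805, -0.055, -0.805, -0.055],
    [-1.031, -1.031, -0.781, -0.781, -0.281, -0.281, -0.031, -0.031],
    [-1.000, -1.000, -0.750, -0.750, -1.000, -0.250, -0.750,  0.000],
])

# No row weakly dominates all the others: no dominant defense strategy exists.
dominant = [rows[i] for i in range(len(rows))
            if all(np.all(D[i] >= D[j]) for j in range(len(rows)) if j != i)]
print("dominant:", dominant)  # prints: dominant: []

# Savage criterion: regret = best payoff in the column minus own payoff;
# choose the strategy whose worst-case regret is smallest.
regret = D.max(axis=0) - D
best = rows[int(np.argmin(regret.max(axis=1)))]
print("min-max regret choice:", best)  # prints: PIE -> implement all protections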

6 Conclusions

On the basis of the test results, it was possible to demonstrate that the indicated protection methods work for the declared cases. Parameterization of queries protected against SQL injection attacks. The Snort IPS ensured service availability during a simulated DoS attack with the Hulk tool. Encryption of data in the database using Acra ensured the protection of key information in the case of unauthorized access to the database, without affecting the availability of information to end users.

The evaluation of the effectiveness of protection methods and the estimation of tool costs and potential losses does not, however, directly allow selecting the best, in terms


of benefits, set of solutions in which the organization should invest time and money. The transformations indicated in this paper, however, made it possible to obtain a table of payoffs depending on the strategies used, which makes it possible to further apply elements of game theory. On this basis, the absence of a dominant strategy for each player was shown. Additionally, under the assumption that some attack occurs, a dominated strategy for the defender was identified, namely the lack of application of any protection method. The application of a formalized method of strategy selection, in the form of the min-max criterion, allowed the best strategy for the defender to be determined, i.e. the implementation of all three methods of protection.

References

1. ISO/IEC 27001:2013 - Information technology — Security techniques — Information security management systems — Requirements. https://www.iso.org/standard/54534.html. Accessed 14 June 2022
2. Kristen, E., Kloibhofer, R., Díaz, V.H., Castillejo, P.: Security assessment of agriculture IoT (AIoT) applications. Appl. Sci. 11(13), 5841 (2021). MDPI AG. https://doi.org/10.3390/app11135841
3. Wang, P., Ali, A., Kelly, W.: Data security and threat modeling for smart city infrastructure. In: 2015 International Conference on Cyber Security of Smart Cities, Industrial Control System and Communications (SSIC). IEEE, August 2015. https://doi.org/10.1109/ssic.2015.7245322
4. Hallberg, J., Hunstad, A., Peterson, M.: A framework for system security assessment. In: Proceedings from the Sixth Annual IEEE SMC Information Assurance Workshop, pp. 224–231 (2005)
5. Gao, J., Zhang, B., Chen, X., Luo, Z.: Ontology-based model of network and computer attacks for security assessment. J. Shanghai Jiaotong Univ. Sci. 18(5), 554–562 (2013). https://doi.org/10.1007/s12204-013-1439-5
6. de Franco Rosa, F., Jino, M., Bonacin, R.: Towards an ontology of security assessment: a core model proposal. In: Latifi, S. (ed.) Information Technology – New Generations. AISC, vol. 738, pp. 75–80. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-77028-4_12
7. Kamra, A., Ber, E.: Survey of machine learning methods for database security. In: Yu, P.S., Tsai, J.J.P. (eds.) Machine Learning in Cyber Trust, pp. 53–71. Springer, Boston (2009). https://doi.org/10.1007/978-0-387-88735-7_3
8. Wu, Y., Lyu, Y., Shi, Y.: Cloud storage security assessment through equilibrium analysis. Tsinghua Sci. Technol. 24(6), 738–749 (2019). https://doi.org/10.26599/tst.2018.9010127
9. Luh, R., Temper, M., Tjoa, S., Schrittwieser, S., Janicke, H.: PenQuest: a gamified attacker/defender meta model for cyber security assessment and education. J. Comput. Virol. Hacking Tech. 16(1), 19–61 (2019). https://doi.org/10.1007/s11416-019-00342-x
10. Nagels, J.: Availability and notification. Pract. Imaging Inform. (2021)
11. Tripathi, N., Hubballi, N.: Application layer denial-of-service attacks and defense mechanisms. ACM Comput. Surv. (CSUR) 54, 1–33 (2021)
12. Mahjabin, T., Xiao, Y., Sun, G., Jiang, W.: A survey of distributed denial-of-service attack, prevention, and mitigation techniques. Int. J. Distrib. Sens. Netw. 13 (2017)


13. Murthy, S., Abu Bakar, A., Abdul Rahim, F., Ramli, R.: A comparative study of data anonymization techniques. In: 2019 IEEE 5th International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC) and IEEE International Conference on Intelligent Data and Security (IDS), pp. 306–309 (2019). https://doi.org/10.1109/BigDataSecurity-HPSC-IDS.2019.00063
14. Savage, L.J.: The theory of statistical decision. J. Am. Stat. Assoc. 46, 55–67 (1951)

Prediction of Web Service Execution QoS Parameters with Utilization of Machine Learning Methods

Łukasz Falas, Adam Sztukowski, Arkadiusz Warzyński, and Patryk Schauer(B)

Department of Computer Science and Systems Engineering, Wrocław University of Science and Technology, Wrocław, Poland
[email protected]

Abstract. With the ever-increasing advancement of ICT (Information and Communications Technology), new challenges and requirements resulting from societal expectations emerge. Modern cloud and Internet of Things solutions feature functionalities that often involve not only data manipulation but also mission-critical tasks requiring ICT systems to strictly comply with defined non-functional requirements, especially in terms of web service execution time. These applications may range from solutions developed for e-Health, Industry 4.0, and Smart Cities to applications related to fields like Smart Home or autonomous vehicles. Often, the QoS (Quality of Service) parameters are strongly correlated with user-provided input and the computational resources allocated to the services handling the requests. To meet the needs of modern systems, new methods for QoS prediction are required, which can later be used by middleware responsible for resource allocation in service-based systems. This paper focuses on research related to QoS prediction methods. It discusses the general web service QoS parameter estimation problem and related challenges. Then a proposition of a machine learning based approach for prediction of service QoS parameters based on the request data is presented, followed by experimentation aimed at evaluating the proposed solution on real-world web services in a simulated environment.

Keywords: Service-based systems · Web services · Quality of service · Prediction algorithms · Machine learning

1 Introduction

Compliance with defined requirements, both functional and non-functional, is one of the base concepts of computer science. Computer systems that meet both types of requirements allow users to complete their tasks according to their expectations, while providing the safety and reliability necessary to build users' confidence and trust in the system. While computer scientists and software engineers have been able to provide solutions and methods that allow professionals to build systems that meet the defined requirements, with each new technology new challenges emerge.


Considering the challenges related to non-functional requirements compliance, it can easily be seen that at the very beginning of network communication between computer systems the main challenges were related to communication reliability and service availability. With developments in the area of new network and application protocols, new challenges became noticeable, mainly the increasing number of users interacting with web systems. With this increase, performance problems started to occur, which required the development of resource allocation methods ensuring the stability and proper performance of web services. One of the most basic methods of coping with this problem was resource overprovisioning, which considered the expected maximal load generated by users and assumed that resource allocation for a web application or a web service should allow the system to function correctly even in the worst-case scenario in terms of load. This idea was further developed by the introduction of concepts like load balancers and automated scaling, which provide additional instances of web services with exclusive new computing resources that can handle the additional load and which can later be released if the load decreases. While these methods are still relevant, one of their drawbacks is the fact that they assume that user requests are similar to each other. Given the fact that new solutions in the field of cloud computing often allow dynamic computing resource allocation, one of the key challenges that could enable optimization of resource utilization is the ability to predict the QoS parameters of a web service with specific computing resources on the basis of the user input data sent in the requests. A reliable method enabling such prediction would allow for a per-user (even a per-request) approach to dynamic resource allocation, which would enable better QoS guarantees while optimizing the resource allocation.

This paper focuses on the prediction of QoS parameter values based on the user input data sent in the request, especially on prediction of the service execution time, which is one of the most difficult parameters to predict and one that is most likely to be dependent on user input. This paper proposes a machine learning based approach for QoS parameter prediction to solve this problem. The idea behind the utilization of machine learning stems from the observation that, in the case of many services, it is not possible to build a reliable analytical QoS model based on their computational complexity and, on the other hand, utilization of machine learning methods can automate the process of building a prediction model and should potentially enable estimations for services that are difficult for people to analyze. Also, such an approach enables the possible utilization of reinforcement learning, which could lead to prediction model self-improvement in a production environment.

2 Related Works

Regarding current solutions for QoS management, one of the most commonly used mechanisms for ensuring compliance with non-functional requirements is the load balancer. In the work "Load-balancing algorithms in cloud computing: A survey" [1], the authors identified the main problems emerging in large, distributed systems. The indicated problems were mainly related to scalability, quality assurance, data management, power consumption, and the availability of individual services. Hence, load


balancing middleware, which is purposely designed to distribute requests among the available resources in a way that maximizes the usability of services while minimizing costs and response time [1], is a type of mechanism that could benefit from a solution able to predict the QoS parameters for a given request.

According to "Cloud Management Systems and Virtual Desktop Infrastructure Load Balancing Algorithms - A Survey", the currently used load balancing algorithms can be divided into two main types: static algorithms and dynamic algorithms. Static algorithms are based only on the statistical behavior of the system and predetermined rules, while dynamic algorithms analyze the current state of the entire application to make an informed decision, which makes them usually more advanced and translates to better results in terms of QoS assurance [2]. Static algorithms utilize commonly known methods like Round-Robin, request counting, or random assignment of requests, which do not involve any additional request processing, putting them out of the scope of this paper. The dynamic algorithms, however, can be considered an area related to the topic of this paper. A common approach which may be observed in this field relies on the utilization of various algorithms. Many of these solutions are based on gathering information about the system's load and utilizing ant-colony optimization methods [3, 4], genetic algorithms [5, 6], swarm optimization [7], stochastic hill climbing [8], or Min-Max optimization [9, 10] in order to find the best server to handle the request at a given moment.

Regardless of the methods being used, QoS management aims at optimizing resource allocation in a way that ensures or maximizes the probability of non-functional requirements compliance. In order to achieve better results in QoS management, statistical analysis may not be sufficient, and it is crucial to consider methods that proactively try to predict how a request will be handled in terms of its non-functional characteristics. In papers related to the subject of QoS, various approaches to QoS prediction can be encountered. In one paper [11], the authors focus on the problem of developing a service recommendation system for users, utilizing filtering methods based on deep learning. A similar approach is also proposed in other papers [12, 13]. This type of method makes it possible to recommend services that users should use in order to maximize quality based on the average response time and the geographic location. Paper [14] focuses on predicting the occupancy of web services and assigning one of three categories - low load, medium load, overload - which can be used as an indication for web service instance selection. Some papers [15] utilize prediction methods for reliability assessment, which extends the prediction-based approach beyond performance analysis. Despite the similarity of the research topic, none of the found papers proposes a solution similar to the one discussed in this article. The study presented in this article aims at solving this problem by proposing a solution that analyzes the request characteristics to indicate the predicted QoS values related to its execution.


3 Web Service QoS Prediction Model

When discussing the QoS prediction problem, it has to be noted that the prediction is performed for a specific service. Parameters such as algorithm time complexity, input data structure, compiler quality, and the architecture of the device running the service are specific to each service. In the developed prediction model, it is assumed that for a single service only the input data is variable, while the other parameters remain constant, because the service infrastructure and implementation method are strictly defined. This means that the prediction for each of the services is performed separately, which should allow for a model that is simpler and easier to implement in real-world systems.

The main feature of the model is its universality, enabling its practical utilization without additional modification in order to fit a specific service. The output of the model is a single numerical value indicating the service execution time prediction, expressed in seconds. As far as the model input is considered, it is difficult to identify universal features that would appear in the model for any web service. Due to this fact, the input of the model is fairly general: it takes an array of features which are selected for each of the services individually. The overview of the model is shown in Fig. 1. The inputs are marked with the symbols w_1, w_2, \ldots, w_n, where n is the number of inputs. This allows any adjustment of the model to the service for which a prediction should be made. A single training sample is defined as (\{w_1, w_2, \ldots, w_n\}, p), where w_1, w_2, \ldots, w_n are the model inputs and p is the prediction value. The task of the developed prediction algorithm is therefore to find an appropriate prediction function of the form f(w) = p, where w = [w_1, w_2, \ldots, w_n] is the feature input vector and p is the prediction value.

Summarizing, the developed model is flexible and can be used for the majority of web services. It also incorporates a "per service" approach, which means that the model is trained for each service individually, which should increase the quality of prediction. It is worth noting that the presented model is abstract; however, it was verified with specific ML method implementations. The methods and verification results are discussed in the following sections of the paper.


Fig. 1. General overview of the developed prediction model
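A minimal sketch of this abstraction in Python; the type aliases and the registry class are illustrative names, not part of the paper.

from typing import Callable, Dict, List, Tuple

Sample = Tuple[List[float], float]          # (feature vector w, execution time p)
Predictor = Callable[[List[float]], float]  # the prediction function f(w) = p

class PerServiceRegistry:
    """Keeps one independently trained prediction function per web service."""

    def __init__(self) -> None:
        self._models: Dict[str, Predictor] = {}

    def register(self, service: str, model: Predictor) -> None:
        self._models[service] = model

    def predict(self, service: str, w: List[float]) -> float:
        return self._models[service](w)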

4 Web Service QoS Prediction Algorithm

The proposed algorithm tries to solve the QoS prediction problem through analysis of the input data provided in the requests, incorporating multi-method prediction. The algorithm analyzes the request structure and related characteristics (e.g. request arrival time) and, on their basis, performs the prediction of QoS parameter values utilizing different machine learning methods. On the basis of defined criteria, the algorithm automatically verifies the quality of the prediction provided by each of the methods and selects the best prediction method for the given web service.

Currently, the proposed algorithm has two criteria which can be used in the process of method verification. The first of them is the mean absolute error (MAE), which is calculated by averaging the absolute value of the difference between the prediction value obtained from the model and the measurement label:

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i^p - y_i \right|

where:
• n - number of measurements,
• y_i^p - prediction value,
• y_i - observed value.

The second implemented criterion is the mean absolute percentage error (MAPE). While this criterion may seem similar to the MAE, the difference is that the percentage error determines how much the prediction differs from the actual value in relative terms. The mean absolute percentage error can be calculated in the following way:

\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i^p - y_i}{y_i} \right| \cdot 100\%

where:
• n - number of measurements,
• y_i^p - prediction value,
• y_i - observed value.

The main goal of the algorithm is the selection of the best prediction method for a given service, which can later be utilized for prediction in a production environment, e.g. as part of some QoS management middleware.
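Both criteria translate directly into a few lines of NumPy; the arrays below are illustrative values only.

import numpy as np

def mae(y_pred: np.ndarray, y_true: np.ndarray) -> float:
    # Mean absolute error, in the same units as the measurements (seconds).
    return float(np.mean(np.abs(y_pred - y_true)))

def mape(y_pred: np.ndarray, y_true: np.ndarray) -> float:
    # Mean absolute percentage error, expressed in percent.
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)) * 100.0)

y_true = np.array([1.573, 6.872, 9.733])   # example observed response times [s]
y_pred = np.array([1.6, 6.5, 10.1])        # example predicted response times [s]
print(mae(y_pred, y_true), mape(y_pred, y_true))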

140

Ł. Falas et al.

Due to the fact that the proposed solution is in a preliminary phase of development, the following five method families were selected for the experimentation:

• k-nearest neighbors method in variants using 5, 10, 15 and 20 neighbors (KNN),
• linear regression model (LR),
• decision tree model (DT),
• random forest model (RF),
• five models of neural networks with a linear activation function and five models of neural networks with a sigmoidal activation function, consisting of 100 neurons per hidden layer and the number of hidden layers ranging from 1 to 5 (NN).

The proposed algorithm meets all the defined requirements, implements the proposed model, and enables a multi-method approach to QoS prediction. The next section discusses the results of the experimental evaluation conducted with the algorithm.
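For illustration, the sketch below reproduces the selection step with scikit-learn counterparts of the listed method families; the validation split, random seed, and training hyperparameters are assumptions of this sketch rather than the paper's exact configuration.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

def select_best_model(X: np.ndarray, y: np.ndarray):
    """Train every candidate and return (validation MAE, model) for the best one."""
    candidates = [KNeighborsRegressor(n_neighbors=k) for k in (5, 10, 15, 20)]
    candidates += [LinearRegression(), DecisionTreeRegressor(), RandomForestRegressor()]
    # 'identity' corresponds to a linear activation, 'logistic' to a sigmoid.
    candidates += [MLPRegressor(hidden_layer_sizes=(100,) * n, activation=act,
                                max_iter=2000)
                   for n in range(1, 6) for act in ("identity", "logistic")]

    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    scored = [(mean_absolute_error(y_val, m.fit(X_tr, y_tr).predict(X_val)), m)
              for m in candidates]
    return min(scored, key=lambda s: s[0])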

5 Experimental Evaluation

The experimental evaluation started with the preparation of a test environment incorporating software for request generation, request execution, request execution monitoring, and request execution measurement storage on one server, and a set of servers with isolated test services deployed on them (one specific service per server). The general architecture of the test environment is presented in Fig. 2.

Fig. 2. Testing environment architecture.


In order to perform the tests, a set of six well-known algorithms that process data and differ in computational complexity was selected and implemented as RESTful web services. This work resulted in the following set of web services:

• Bubble sort algorithm web service (Bubble sort svc.)
• Factorization algorithm web service (Factorization svc.)
• Hash breaking web service (Hash breaking svc.)
• Graph coloring web service (Graph coloring svc.)
• Kruskal algorithm web service (Kruskal alg. svc.)
• Dijkstra's algorithm web service (Dijkstra's alg. svc.)

The experimentation was performed by starting a generation process for each of the prepared services. The request generation process consisted of the sequential execution of requests sent to the developed services and saving the result in the form of the request execution date, the response time, and other parameters characteristic of a given service after initial preprocessing (a sketch of such a measurement loop is shown after Table 1). The characteristics of the training sets generated in this way are presented in Table 1.

Table 1. Parameters of generated datasets

Parameter | Bubble sort svc. | Factorization svc. | Hash breaking svc. | Graph coloring svc. | Kruskal alg. svc. | Dijkstra's alg. svc.
Number of measurements | 8289 | 8271 | 8164 | 5354 | 8144 | 7644
Minimum request time | 0.224 s | 0.222 s | 0.224 s | 0.225 s | 0.226 s | 0.223 s
Maximum request time | 25.458 s | 44.969 s | 5.438 s | 19.477 s | 19.981 s | 41.529 s
Average request time | 6.872 s | 3.022 s | 1.573 s | 5.402 s | 6.739 s | 9.733 s
Request time standard deviation | 6.651 s | 0.834 s | 4.325 s | 3.883 s | 8.131 s | 6.096 s
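The measurement loop announced before Table 1 could look as follows; the endpoint URL, payload shape, and feature fields are hypothetical placeholders, since the paper does not publish the services' interfaces.

import csv
import time
import random
import requests

with open("bubble_sort_measurements.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for _ in range(100):
        size = random.randint(100, 5000)            # request feature: array length
        payload = {"data": random.sample(range(10**6), size)}
        start = time.perf_counter()
        # Hypothetical endpoint of the bubble sort test service.
        requests.post("http://testserver:8080/sort", json=payload, timeout=60)
        elapsed = time.perf_counter() - start
        # Record: execution date, request features, measured response time.
        writer.writerow([time.time(), size, elapsed])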

The next step of the experimentation was the learning process. The algorithm was run twice for each of the datasets, once for each of the defined criteria. The results for the mean absolute error and mean absolute percentage error criteria were gathered and are presented in Table 2 and Table 3, with an additional graphical comparison depicted in Fig. 3 and Fig. 4. For the sake of clarity, in the case of method families like k-nearest neighbors and neural networks, only the variants with the best results are depicted on the charts.


Fig. 3. Comparison of mean absolute error values for experimentation [s].

Table 2. Mean absolute error values for experimentation

Service | KNN [s] | Linear reg. [s] | Decision tree [s] | Rand. forest [s] | NN [s]
Bubble sort svc. | 0.659 | 1.567 | 0.868 | 0.758 | 0.584
Factorization svc. | 1.844 | 1.773 | 2.287 | 2.069 | 1.659
Hash breaking svc. | 0.657 | 0.642 | 0.817 | 0.722 | 0.640
Graph coloring svc. | 0.817 | 1.054 | 0.872 | 0.866 | 0.802
Kruskal alg. svc. | 1.128 | 1.196 | 1.461 | 1.172 | 1.092
Dijkstra's alg. svc. | 0.799 | 1.010 | 1.035 | 0.840 | 0.773

Summarizing the results of the various request execution time prediction methods, it can be stated that in all examined cases the neural network based approach gave the best results. In typical cases, where services are clearly dependent on the size of the input data, the execution time prediction quality is at a satisfactory level. The only service whose predictions show a fairly high mean absolute percentage error is the integer factorization service; however, this is probably related to the sparsity of the measurements. Among the examined services, better results were achieved by neural networks in which the number of hidden layers was greater than or equal to two. A similar observation can be made in the case of the k-nearest neighbors model: in most studies, the best prediction result was obtained by the model measuring the distance to 15 neighbors. In none of the examined cases was the model measuring the distance to 20 neighbors selected, while the model with the parameter k equal to 5 was selected only once, for the integer factorization service. Comparing the decision tree model with the random forest model, it can be concluded that in all examined cases the random forest model obtained a result at a comparable or better level than the decision tree model. The overall prediction efficiency of the linear regression model was low. In the case of the service implementing Dijkstra's algorithm, the graph coloring service, the hash breaking service, and the service implementing the bubble sort algorithm, its percentage prediction error significantly exceeds the results obtained by the other models; hence it can be concluded that it is not suited for this kind of prediction problem.


Fig. 4. Comparison of mean absolute percentage error for experimentation [%].

Table 3. Mean absolute percentage error for experimentation.

Service | KNN [%] | Linear reg. [%] | Decision tree [%] | Rand. forest [%] | NN [%]
Bubble sort svc. | 9.504 | 120.249 | 12.203 | 12.074 | 8.057
Factorization svc. | 287.597 | 277.064 | 355.232 | 323.331 | 235.113
Hash breaking svc. | 42.088 | 41.407 | 52.928 | 44.185 | 39.269
Graph coloring svc. | 17.514 | 48.762 | 18.915 | 18.652 | 14.187
Kruskal alg. svc. | 16.557 | 25.301 | 21.829 | 18.190 | 15.924
Dijkstra's alg. svc. | 12.955 | 54.209 | 16.566 | 13.572 | 12.258

Finally, it can be stated that the experimentation results have shown that this approach can be viable for execution time prediction, especially when the multi-method learning algorithm is applied. The only drawback of the proposed approach is that the learning process has to be performed separately for each of the services; however, this also positively influences the quality of the prediction.
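As a rough illustration of such a multi-method scheme, the sketch below fits one model per examined method family and keeps the one with the lowest validation MAE; it would be run once per service. The concrete model settings (e.g. the network size) are assumptions, not the configuration used in the experiments:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

def select_model(X_train, y_train, X_val, y_val):
    """Fit every candidate method and keep the one with the lowest validation MAE."""
    candidates = {
        "knn": KNeighborsRegressor(n_neighbors=15),
        "linear": LinearRegression(),
        "tree": DecisionTreeRegressor(random_state=0),
        "forest": RandomForestRegressor(n_estimators=100, random_state=0),
        "nn": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                           random_state=0),
    }
    results = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        results[name] = (mean_absolute_error(y_val, model.predict(X_val)), model)
    best = min(results, key=lambda n: results[n][0])  # lowest error wins
    return best, results[best]
```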

6 Conclusions

This paper discussed the problem of web service QoS prediction with machine learning methods. It reviewed the challenges related to QoS prediction and related research in this field, proposed a QoS prediction model, and introduced a multi-method machine learning algorithm built according to that model, which was then tested on web services implementing various algorithms.


The experimentation has shown that the proposed solution is viable and can likely be utilized in practical, real-world use cases. It has also shown that the best prediction results were obtained by neural networks with more than one hidden layer. Given these promising results, further research will focus on improving the algorithm itself, mainly the neural network-based approach, with the utilization of hyperparameter optimization. Further research will also consider further automation of the web service request feature extraction methods via request decomposition, in order to make practical application of this solution feasible for professionals without a machine learning background.


Psychological Influence on Older Adults' Preferences for Nursing Care: A Comparative Study Between Japan and China Based on Data Analysis

Zihan Zhang1(B), Chieko Kato2, and Koichiro Aoki2

1 Graduate School of Information Sciences and Arts, Toyo University, Tokyo, Japan
[email protected]
2 Faculty of Information Sciences and Arts, Toyo University, Tokyo, Japan

Abstract. The issue of nursing care for older people is of great concern in Japan and China, where aging is becoming increasingly severe. In this context, the development of public nursing care and the traditional Eastern cultural value of family nursing present a contrasting and complementary situation. This study conducted a questionnaire survey focusing on older Japanese and Chinese adults' nursing care preferences and the psychological factors influencing them: awareness of family nursing care, psychological indebtedness, life satisfaction, loneliness, and self-efficacy. The participants were Japanese and Chinese adults above 65 years old, and the relationship between psychological factors and their nursing care preference was analyzed using multiple regression analysis. As a result, awareness of family nursing care, physical ability, and psychological indebtedness had significant influences on Japanese older adults' preference toward family nursing care, as well as on their preference toward public nursing care. In contrast, awareness of family nursing care, physical ability, and psychological indebtedness had significant influences on Chinese older adults' preference toward family nursing care, while awareness of family nursing care and intellectual ability had significant influences on their preference toward public nursing care. By comparing the analysis results, the influence of these factors on older adults' nursing care preferences and the differences between Japanese and Chinese older adults were clarified. Keywords: Psychological factors · Older adults · Preferences for nursing care · Comparative study · Data analysis

1 Introduction

Japan has the highest rate of aging globally, and this rate continues to rise. According to the White Paper on Aging Society in 2021, the population aged 65 and over in Japan was 36.19 million, an aging rate of 28.8%. By 2065, the aging rate will reach 38.4%, meaning about 38.4% of the population will be 65 years old or older, and about 1 in 3.9 people will be 75 years old or older [1]. With the accelerated aging of the population, health maintenance and long-term nursing care have become more pressing social issues than ever.


According to a survey of middle-aged men and women regarding their "old age," 42.0% of the subjects were pessimistic about their old age. In addition to concerns about their livelihoods, such as "I am worried about my retirement fund/pension" and "I am worried about my health," many respondents also expressed concerns about nursing care, such as "I am single," "I have no one to take care of me," and "I am afraid I will die alone because I am single" [2]. Although its aging rate is not as high as Japan's, China is also a country with a rapidly aging population. In China, average life expectancy increased from 44.6 years in 1950 to 75.3 years in 2015 and will reach about 80 years by 2050. The share of the population aged 60 and over is expected to increase from 12.4% (168 million) in 2010 to 28% (402 million) in 2040; the population aged 80 and over reached 22.6 million in 2013 and is expected to grow to 90.4 million by 2050, which will be the largest elderly population in the world [3]. To make matters worse, the declining birthrate has also gradually become severe in both Japan and China. In Japan, the number of new-borns in 2019 was 865,239, the lowest on record, and the ratio of the population aged 75 and over to children aged 0 to 14 grows every year [1]. In China, the number of new-borns has dropped since 1987 to 10.62 million [3]. In a Chinese survey, nearly half of the respondents were concerned about the decreasing opportunities for their children to take care of them or accompany them in old age [4]. In Japan and China, where populations are aging, issues related to nursing care are becoming increasingly apparent, and it is necessary to consider caregiving from the perspectives of both the caregiver and the cared-for. In particular, the issue of nursing care preferences, i.e., who will take care of one's parents and how, is an essential consideration for both parties. The position of older people is especially important; after all, the goal of nursing care is to enable older people to have a happier old age. For instance, will older people who feel alone because their children cannot be with them look forward to public nursing care? Or will they prefer to stay at home for nursing care because they fear being alone, away from familiar surroundings and family? Andersen's behavioral model of medical service use has been proposed as a model of patients' use of public medical and nursing care [5]. It is highly relevant to whether older people are motivated to choose public nursing care services (the opposite case being family nursing care). The model assumes that the use of health care services is determined by social factors, health care service system factors, and personal factors. The personal factors are categorized as needs, facilitators, and predisposing factors, and "beliefs," which correspond to values about care, are categorized as predisposing factors. However, Bradley et al. (2002) pointed out that even though the Andersen model treats beliefs as a predisposing factor, the influence of psychological factors on the preferences and behaviors of older adults has not been thoroughly examined [5]. Bradley et al.'s study compared two groups, noting that White Americans were more likely to use nursing homes than African Americans. The results revealed differences between the groups in access to information about long-term nursing care, social norms regarding expectations and burdens of nursing, and concerns about privacy and self-determination [5].


This study indicates the importance of focusing on psychological factors when exploring the factors that influence preferences for nursing care. Indeed, recent studies suggest that various psychological factors are determinants of the preference for how to be cared for. Russell (1997) stated that when older adults have difficulty maintaining close relationships with family, friends, and relatives, they tend to feel lonelier and look forward to opportunities to meet others in public nursing care facilities; hence, he argues that older adults who are more prone to loneliness are more likely to prefer public nursing care [6]. In Japan, Karasawa (2001) focused on awareness of family nursing care as a psychological factor that influences the nursing care preference: continuation of family nursing care or enrollment in public nursing care. Awareness of family nursing care denotes the idea that family caregiving is desirable and natural [7]. Sugisawa et al. (2002) showed that the awareness of family nursing care held by caregivers and care recipients is associated with underutilization of public nursing care [8]. In other words, the awareness or attitude that "family members should do the caregiving" is a factor that promotes preference for family nursing care. In addition, as a concept related to the psychological side of nursing care recipients, some studies have focused on the "sense of psychological indebtedness." Greenberg (1980) defined the sense of psychological indebtedness as "the degree to which one feels obligated to return the help to the caregiver" and conceptualized it as a psychological reaction of obligation that the assisted person feels toward the caregiver [9]. In a previous study, Watanabe et al. (2011) found that the higher the sense of psychological indebtedness, the higher the preference for family nursing care. They explained this result by noting that people with a higher sense of psychological indebtedness preferred family nursing care because they felt less need for return behavior toward close relationships such as family members [10]. Therefore, along with awareness of family nursing care, high psychological indebtedness is thought to lead to a higher preference toward family nursing care. In China, Luo et al. (2018) noted that the preference to use public nursing care, such as nursing homes, is determined by loneliness, self-efficacy, and life satisfaction. This study of older Chinese adults showed that loneliness negatively influences attitudes toward public nursing care and indirectly discourages its use [11]. On the other hand, older adults with higher life satisfaction had better family relationships, and their families did not want them to be cared for in public nursing care institutions, which reduced subjective norms toward public nursing care and indirectly discouraged its use. In addition, higher self-efficacy was associated with confidence in the ability to adjust well to nursing homes, which positively influenced attitudes toward public nursing care and indirectly promoted its use. In China, the common value was that family members are responsible for caring for older adults, and nursing homes were seen as something to be used only when family members could not provide support or income [11].

From this perspective, concerning older Chinese adults, loneliness and life satisfaction, as well as family nursing care attitudes, may promote preference toward family nursing care. On the other hand, concerning older Japanese adults, it has been reported that there is no association between public nursing care facilities and life satisfaction [12]. Thus, the above-mentioned associations between psychological factors and the preference to receive nursing care may be a pattern peculiar to older Chinese adults.


Based on the findings of previous studies, the awareness and attitudes of older adults toward nursing care and the related psychological factors are believed to be important determinants of nursing care preference. However, few studies have focused on the relationship between nursing care preference and a range of psychological factors, or compared these relationships among people with different social backgrounds. Therefore, this study surveyed older adults in Japan and China regarding their nursing care preferences and psychological factors. The purpose of this study is to identify the psychological factors that influence the nursing care preferences of older adults in Japan and China and to test the following hypotheses. For older Japanese adults, in line with the previous study by Watanabe et al. (2011), awareness of family nursing care is expected to increase preference toward family nursing care and suppress preference toward public nursing care [10]. Furthermore, older adults with a high sense of psychological indebtedness are expected to feel less need for return behavior toward family members and thus have a higher preference for family nursing care. Hence, when older Japanese adults have higher awareness of family nursing care and higher psychological indebtedness, their preference for family nursing care will increase and their preference for public nursing care will decrease (Hypothesis 1). For older Chinese adults, the traditional value that taking care of older parents has been the responsibility and obligation of every child since ancient times may correspond to awareness of family nursing care. Therefore, individual differences in this value, i.e., higher or lower awareness of family nursing care, are expected to raise or lower preference for family nursing care. In addition, based on the previous research by Luo et al. (2018), older adults with higher loneliness show negative attitudes toward public caregiving, and older adults with high life satisfaction, backed by good family relationships, suppress their preference for public nursing care. In contrast, high self-efficacy, which brings confidence in one's ability to adapt to life in a nursing care facility, is thought to promote preference for public nursing care. Hence, when awareness of family nursing care, loneliness, and life satisfaction are high and self-efficacy is low, preference for family nursing care increases and preference for public nursing care decreases (Hypothesis 2).

2 Method

2.1 Survey Participants

The survey, with the same scales and questions, was administered to 300 Japanese and 300 Chinese men and women. The sample size was calculated using G*Power 3.1.9 (Heinrich-Heine-University, Düsseldorf, Germany), a free power analysis program for a variety of statistical tests, with the following settings: test family, F tests; statistical test, linear multiple regression (fixed model, R² deviation from zero); type of power analysis, a priori; significance level, 0.05; power, 0.95; effect size f², 0.1904762 (calculated from R² = 0.16); and 8 associated variables. The minimum sample size was 168. To ensure a sufficient number of responses based on this sample size, 300 elderly Japanese and 300 elderly Chinese participants (all able to live independently at home) were asked to complete the questionnaire online through a research institute. Consideration was also given to the gender ratio of the participants: the Japanese sample comprised 150 men and 150 women, and the Chinese sample 152 men and 148 women.
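The reported a-priori power analysis can also be reproduced outside G*Power; the sketch below, assuming SciPy, iterates over sample sizes until the noncentral F test reaches the target power. Depending on rounding conventions, the result may differ slightly from G*Power's output:

```python
from scipy import stats

R2 = 0.16
f2 = R2 / (1 - R2)            # Cohen's f^2, approx. 0.1904762 as reported above
p = 8                         # number of associated predictor variables
alpha, target_power = 0.05, 0.95

n = p + 2
while True:
    df1, df2 = p, n - p - 1
    crit = stats.f.ppf(1 - alpha, df1, df2)        # critical F under H0
    power = stats.ncf.sf(crit, df1, df2, f2 * n)   # noncentrality lambda = f^2 * N
    if power >= target_power:
        break
    n += 1
print(n, round(power, 3))                          # minimum N reaching the target
```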


2.2 Survey Procedures

An Internet survey was conducted based on a questionnaire containing the following psychological scales and questions. Cross Marketing Inc. was used for the survey of Japanese participants, and Credamo for the survey of Chinese participants.

2.3 Questions on Awareness of Family Nursing Care

Based on Karasawa (2006), three items were used to measure awareness of family nursing care: "Family members should provide nursing care," "It is the family's duty to care for the older adults," and "If the older adults want family nursing care, they should be cared for by family members." The respondents answered on a 4-point scale (1: Disagree, 2: Somewhat disagree, 3: Somewhat agree, 4: Agree). The sum of the responses to the three items was calculated as the score of awareness of family nursing care [13].

2.4 General Self-Efficacy Scale

The General Self-Efficacy Scale for the Elderly (GSESE) by Matsuda and Nakagawa (2021) was used as a measure of self-efficacy. The GSESE measures the extent to which older adults tend to perceive their general self-efficacy as high or low. The scale consists of 19 questions, such as "I am not very active or engaged in anything," "I am able to go about my daily life without help," and "I often feel that my intellectual abilities have declined." The participants answered each item on a 4-point scale (1: no, 2: somewhat no, 3: somewhat yes, 4: yes), and scores were calculated for three factors: "proactivity," "physical ability," and "intellectual ability" [14].

2.5 Satisfaction with Life Scale (SWLS)

The Satisfaction with Life Scale (SWLS) of Kakuno (1994) was used to measure satisfaction with life. The questions asked were: "In most ways, my life is close to my ideal," "The conditions of my life are excellent," "I am satisfied with my life," "So far I have gotten the important things I want in life," and "If I could live my life over, I would change almost nothing." For each item, the respondents answered on a 7-point scale (1: Strongly disagree, 2: Disagree, 3: Slightly disagree, 4: Neither agree nor disagree, 5: Slightly agree, 6: Agree, 7: Strongly agree). The total of the responses to these items was calculated as the score of satisfaction with life [15].


2.6 Questions on Psychological Indebtedness

Based on the studies by Aikawa and Yoshimori (1995) and Watanabe et al. (2011), the questions on psychological indebtedness comprised eleven items such as "If you receive a favor from your friend, you will return a favor as soon as possible to maintain the friendship," "If someone tells you 'I owe you one,' you feel embarrassed," and "When someone treats you, you think you should treat him/her next time." For each item, respondents answered on a 4-point scale (1: Not applicable, 2: Somewhat not applicable, 3: Somewhat true, 4: True). The sum of the responses was calculated as the score of psychological indebtedness [16].

2.7 UCLA Loneliness Scale in the Elderly

The Japanese version of the UCLA Loneliness Scale in the Elderly (3rd edition) by Masuda et al. (2012) was used as a measure of loneliness. This scale is a Japanese translation of the third version of the University of California, Los Angeles Loneliness Scale (UCLA LS), a self-administered loneliness scale developed by Russell (1978). The questionnaire consists of 20 items, including "I am unable to reach out and communicate with those around me." Each item was answered on a 4-point scale (1: I often feel this way, 2: I sometimes feel this way, 3: I rarely feel this way, 4: I never feel this way), and the total of the responses was calculated as the score of loneliness [17].

Preferences Toward Family Nursing Care or Public Nursing Care

In this study, we used items on preference for family nursing care and public nursing care based on Watanabe (2011) as an indicator of the preference to receive care. The preference for family nursing care was measured through four items: "I am happy to be taken care of by my immediate family," "I want to be taken care of by people I know," "If possible, I don't want to be taken care of by my family (reversal item)," and "It is desirable to be taken care of by family members." The preference for public nursing care was measured through four items: "I am happy to be taken care of by public nursing care services," "I want to be taken care of by the welfare service," "I don't want to use welfare services if I can help it (reversal item)," and "It is desirable to use welfare services." Respondents answered each item on a 4-point scale (1: Disagree, 2: Somewhat disagree, 3: Somewhat agree, 4: Agree), and the sum of the responses to the four items was calculated as the preference score for family nursing care and for public nursing care, respectively [10].
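All of the scales above are scored the same way: reversal items are flipped on the response scale and item responses are summed. A minimal sketch of this scoring, with hypothetical column names:

```python
import pandas as pd

def scale_score(df: pd.DataFrame, items: list[str],
                reverse: tuple = (), points: int = 4) -> pd.Series:
    """Sum item responses; reverse-keyed items are flipped on the 1..points scale."""
    responses = df[items].copy()
    for col in reverse:
        responses[col] = (points + 1) - responses[col]
    return responses.sum(axis=1)

# e.g. preference for family nursing care: four items, one reversal item
# (the column names here are hypothetical)
# df["pref_family_care"] = scale_score(df, ["fam1", "fam2", "fam3_rev", "fam4"],
#                                      reverse=("fam3_rev",))
```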


2.8 Method of Analysis

Multiple regression analysis was conducted to determine the influence of psychological factors on preferences toward family and public nursing care among Japanese and Chinese participants. First, multiple regression analysis with the forced entry method was conducted on the Japanese data, with preference for family nursing care as the dependent variable and the psychological factor scores (awareness of family nursing care, proactivity, physical ability, intellectual ability, life satisfaction, psychological indebtedness, and loneliness) as independent variables. Multiple regression analysis was conducted similarly with preference for public nursing care as the dependent variable and the psychological factor scores as independent variables. Next, the same analyses, with preference for family nursing care or public nursing care as the dependent variable and the psychological factor scores as independent variables, were conducted on the Chinese data. IBM SPSS Statistics 26 was used for all statistical analyses.
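A standardized multiple regression with forced entry can also be reproduced with open tools; the sketch below, assuming pandas and statsmodels and a hypothetical file of scale scores, z-scores all variables so that the coefficients correspond to standardized β values:

```python
import pandas as pd
import statsmodels.api as sm

predictors = ["awareness_family_care", "proactivity", "physical_ability",
              "intellectual_ability", "life_satisfaction",
              "psychological_indebtedness", "loneliness"]

df = pd.read_csv("survey_japan.csv")      # hypothetical file of scale scores
z = (df - df.mean()) / df.std()           # z-scoring yields standardized betas

X = sm.add_constant(z[predictors])        # forced entry: all predictors at once
model = sm.OLS(z["pref_family_care"], X).fit()
print(model.summary())                    # coefficients, R-squared, F(7, 292)
```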

3 Results

3.1 Analysis of Results for Survey of Japanese Participants

Multiple regression analysis of the Japanese data yielded significant standardized partial regression coefficients for awareness of family nursing care (p < .001), physical ability (p = .08, p < .10), and psychological indebtedness (p = .05, p < .10) when preference for family nursing care was the dependent variable. No significant standardized partial regression coefficients were obtained when preference for public nursing care was the dependent variable. The results of each analysis are presented in Table 1.

Table 1. Analysis of results of a preference on nursing care of Japanese participants

Independent variable | Preference for family nursing care (β) | Preference for public nursing care (β)
Awareness of family nursing care | .44*** | –
Proactivity | -.06 | –
Physical ability | -.11† | –
Intellectual ability | -.07 | –
Life satisfaction | -.06 | –
Psychological indebtedness | .11† | –
Loneliness | -.08 | –
R² | .24** | .03
F-value (df) | 13.01 (7, 292) | 1.489 (7, 292)

†p < .10; **p < .01; ***p < .001