Advances in Intelligent Systems and Computing 1127
Zhengbing Hu Sergey Petoukhov Matthew He Editors
Advances in Intelligent Systems, Computer Science and Digital Economics
Advances in Intelligent Systems and Computing Volume 1127
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Nikhil R. Pal, Indian Statistical Institute, Kolkata, India Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba Emilio S. Corchado, University of Salamanca, Salamanca, Spain Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil Ngoc Thanh Nguyen , Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **
More information about this series at http://www.springer.com/series/11156
Zhengbing Hu · Sergey Petoukhov · Matthew He

Editors
Advances in Intelligent Systems, Computer Science and Digital Economics
Editors Zhengbing Hu School of Educational Information Technology Central China Normal University Wuhan, Hubei, China
Sergey Petoukhov Mechanical Engineering Research Institute Russian Academy of Sciences Moscow, Russia
Matthew He Halmos College of Natural Sciences and Oceanography Nova Southeastern University Davie, FL, USA
ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-3-030-39215-4 ISBN 978-3-030-39216-1 (eBook) https://doi.org/10.1007/978-3-030-39216-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents
Advances in Intelligent Systems and Intellectual Approaches

Methods for Studying the Post-buckling Behavior of Axisymmetric Membrane ..... 3
Sergey A. Podkopaev, Sergey S. Gavriushin, and Tatiana B. Podkopaeva

Mathematical Modeling of DC Motors for the Construction of Prostheses ..... 16
D. A. Fonov, I. A. Meshchikhin, and E. G. Korzhov

Complex Risks Control for Processes in Heat Technological Systems ..... 28
Vadim Borisov, Vladimir Bobkov, and Maxim Dli

The Synthesis of Virtual Space in the Context of Insufficient Data ..... 39
Aleksandr Mezhenin, Vladimir Polyakov, Vera Izvozchikova, Dmitry Burlov, and Anatoly Zykov

Reconstruction of Spatial Environment in Three-Dimensional Scenes ..... 47
Alexander Mezhenin, Vera Izvozchikova, and Vladimir Shardakov

Railway Rolling Stock Tracking Based on Computer Vision Algorithms ..... 56
Andrey V. Sukhanov

Evaluating of Word Embeddings Hyper-parameters of the Master Data in Russian-Language Information Systems ..... 64
Dudnikov Sergey, Mikheev Petr, and Grinkina Tatyana

Development of an Intelligent Control System Based on a Fuzzy Logic Controller for Multidimensional Control of a Pumping Station ..... 76
Artur Sagdatullin

From Algebraic Biology to Artificial Intelligence ..... 86
Georgy K. Tolokonnikov and Sergey V. Petoukhov

Concept of Active Traffic Management for Maximizing the Road Network Usage ..... 96
Andrey M. Valuev

Collection of Individual Packet Statistical Information in a Flow Based on P4-switch ..... 106
Vladimir A. Mankov and Irina A. Krasnova

A Model of Cognitive Disorders upon the Algebra of Fourier-Dual Operations ..... 117
A. V. Pavlov

Intelligent OFDM Telecommunication Systems Based on Many-Parameter Complex or Quaternion Fourier Transforms ..... 129
Valeriy G. Labunets and Ekaterina Ostheimer

Vibration Monitoring Systems for Power Equipment as an Analogue of an Artificial Neural Network ..... 145
Oleg B. Skvorcov and Elena A. Pravotorova

Integrated Computer Analysis of Genomic Sequencing Data Based on ICGenomics Tool ..... 154
Yuriy L. Orlov, Anatoly O. Bragin, Roman O. Babenko, Alina E. Dresvyannikova, Sergey S. Kovalev, Igor A. Shaderkin, Nina G. Orlova, and Fedor M. Naumenko

Statistical and Linguistic Decision-Making Techniques Based on Fuzzy Set Theory ..... 165
Nikolay I. Sidnyaev, Iuliia I. Butenko, and Elizaveta E. Bolotova

Studying the Crack Growth Rate Variability by Applying the Willenborg's Model to the Markov's Simulated Trials ..... 175
Irina V. Gadolina, Andrey A. Bautin, and Evgenii V. Plotnikov

Advances in Computer Science and Their Technological Applications

Creating Spaces of Temporary Features for the Task of Diagnosing Complex Pathologies of Vision ..... 187
A. P. Eremeev, S. A. Ivliev, O. S. Kolosov, V. A. Korolenkova, A. D. Pronin, and O. D. Titova

A Modified Particle Swarm Algorithm for Solving Group Robotics Problem ..... 205
Kang Liang and A. P. Karpenko
Analysis of Diagnostic Signs of Defective States of Mechatronic Mechanisms of Cyclic Action ..... 218
Aleksandr K. Aleshin, Georgy I. Firsov, Viktor A. Glazunov, and Natalya L. Kovaleva

Development and Performance Evaluation of a Software System for Multi-objective Design of Strain Gauge Force Sensors ..... 228
Sergey I. Gavrilenkov and Sergey S. Gavryushin

Optimization of the Structure of the Intelligent Active System as a Necessary Condition for the Harmonization of Creative Solutions ..... 238
N. Yu. Mutovkina and V. N. Kuznetsov

Parallel Hybrid Genetic Algorithm for Solving Design and Optimization Problems ..... 249
L. A. Gladkov, N. V. Gladkova, and E. Y. Semushin

Optimal Real-Time Image Processing with Imperfect Information on Convolution-Type Distortion ..... 259
Peter Golubtsov

Scalability and Parallelization of Sequential Processing: Big Data Demands and Information Algebras ..... 274
Peter Golubtsov

Aggregate Estimates for Probability of Social Engineering Attack Success: Sustainability of the Structure of Access Policies ..... 299
Artur Azarov, Alena Suvorova, Maria Koroleva, and Olga Vasileva

A Machine Learning Approach to the Vector Prediction of Moments of Finite Normal Mixtures ..... 307
Andrey Gorshenin and Victor Kuzmin

Diagnostic Data Fusion Collected from Railway Automatics and Telemechanics Devices on the Basis of Soft Computing Technologies ..... 315
Sergey M. Kovalev, Anna E. Kolodenkova, and Vladislav S. Kovalev

Towards a Parallel Informal/Formal Corpus of Educational Mathematical Texts in Russian ..... 325
Alexander Kirillovich, Olga Nevzorova, Konstantin Nikolaev, and Kamilla Galiaskarova

Hyperbolic and Fibonacci Numbers in Modeling Natural Phenomena ..... 335
Sergey V. Petoukhov
Application of a Modified Ant Colony Imitation Algorithm for the Traveling Salesman Problem with Time Windows When Designing an Intelligent Assistant ..... 346
Larisa Kuznetsova, Arthur Zhigalov, Natalia Yanishevskaya, Denis Parfenov, and Irina Bolodurina

Tensor Generalizations of the Fibonacci Matrix ..... 356
Matthew He, Z. B. Hu, and Sergey V. Petoukhov

An Approach to Online Fuzzy Clustering Based on the Mahalanobis Distance Measure ..... 364
Zhengbing Hu and Oleksii K. Tyshchenko

Application of a Novel Model "Requirement – Object – Parameter" for Design Automation of Complex Mechanical System ..... 375
Bui V. Phuong, Sergey S. Gavriushin, Dang H. Minh, Phung V. Binh, and Nguyen V. Duc

Studies of Structure and Impact Damage of Composite Materials by a Computer Tomograph ..... 385
Oleg N. Bezzametnov, Victor I. Mitryaykin, and Yevgeny O. Statsenko

On the Possibility of Applying a Multi-frequency Dynamic Absorber (MDA) to Seismic Protection Tasks ..... 395
S. B. Makarov and N. V. Pankova

About the Calculation by the Method of Linearization of Oscillations in a System with Time Lag and Limited Power-Supply ..... 404
Alishir A. Alifov and M. G. Farzaliev

Mathematical Model of Dot Peen Marker Operating in Self-exciting Vibration Mode ..... 414
A. M. Gouskov, E. V. Efimova, I. A. Kiselev, and E. A. Nikitin

Advances in Digital Economics and Methodological Approaches

Study of the Mechanisms of Perspective Flexible Manufacturing System for a Newly Forming Robotic Enterprise ..... 427
Vladimir V. Serebrenniy, Dmitriy V. Lapin, and Alisa A. Mokaeva

Approach to Forecasting the Development of Crisis Situations in Complex Information Networks ..... 437
Andrey V. Proletarsky, Ark M. Andreev, Dmitry V. Berezkin, Ilya A. Kozlov, Gennady P. Mozharov, and Yury A. Sokolov

Systems Theory for the Digital Economy ..... 447
Georgy K. Tolokonnikov, Vyacheslav I. Chernoivanov, Sergey K. Sudakov, and Yuri A. Tsoi
Agile Simulation Model of Semiconductor Manufacturing ..... 457
Igor Stogniy, Mikhail Kruglov, and Mikhail Ovsyannikov

Author Index ..... 467
Advances in Intelligent Systems and Intellectual Approaches
Methods for Studying the Post-buckling Behavior of Axisymmetric Membrane

Sergey A. Podkopaev, Sergey S. Gavriushin, and Tatiana B. Podkopaeva

Bauman Moscow State Technical University, Building 1, 5, 2nd Baumanskaya Street, 105005 Moscow, Russia
[email protected], [email protected], [email protected]
Abstract. The paper considers an intellectual approach to the synthesis of the structural properties of thin-walled models, which makes it possible to create devices with guided properties. The theoretical foundations of the nonlinear straining of thin-walled axisymmetric shells are considered. The operational characteristics of membranes in various switching devices, valves and pressure sensors are presented. The types of nonlinear post-buckling behavior of axisymmetric membranes are considered. A mathematical model for describing the nonlinear straining of axisymmetric membranes is presented, together with a discrete continuation-by-parameter method and the "changing the subspace of control parameters" technique. Using a hinged spherical shell as an example, a study of post-buckling behavior is performed. A rational mathematical model has been selected to describe the nonlinear straining of thin-walled axisymmetric shells. A numerical algorithm for studying the processes of nonlinear straining of multi-parameter systems has been developed and implemented as an author's program.

Keywords: Intellectual approach · Synthesis of properties · Nonlinear straining · Thin-walled axisymmetric shell · Membrane · Post-buckling behavior · Discrete switching · Continuation by parameter · Change of the subspace of parameters
1 Introduction

A variety of structural elements and technical devices made in the form of an axisymmetric shell (membrane) are widely used in various industries. At present, owing to the digital industrial revolution (Industry 4.0), various switching devices, fuses and valves used in the industrial Internet of Things have become especially popular [2, 6, 11, 12]. A membrane (Fig. 1a) is a thin-walled axisymmetric shell which, under the influence of an external load, can change its deflection in steps. This mode of loss of stability, not accompanied by destruction of the shell, will be referred to as buckling [14, 15].
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 3–15, 2020. https://doi.org/10.1007/978-3-030-39216-1_1
2 The Operational Characteristics of the Membrane, Made in the Form of Shells The shell-shaped Membrane finds application in various switching devices (Fig. 2b), valves, alarms, and pressure sensors in the intellectual mechatronics and robotics (Fig. 2a). An axisymmetric shell as a membrane is also used in the construction of fuel tanks in the aircraft industry and shipbuilding as a pressure membrane (Fig. 2c). The features of the proposed approach include the ability to analyze the family of shells into which a particular structure is immersed. Intellectual design approach allows to create devices with desire properties. The main operational element of the membrane is the elastic characteristic, which reflects the relationship between the displacement of the membrane control point and the change in external load (Fig. 1b) [13, 16].
Fig. 1. (a) Example of a membrane; (b) typical elastic characteristic of the membrane
The sections AB and CD of the graph shown in Fig. 1b are the stable parts of the elastic characteristic of the membrane. Under the influence of an external load, upon reaching the first extreme point (point B, Fig. 1b), corresponding to the critical pressure pcr1, the membrane, bypassing the unstable part BC of the graph, changes its deflection step-wise, and the straining process continues on segment CD of the elastic characteristic. When the external load is reduced (removed), the reverse step-wise change in deflection of the shell occurs, corresponding to the second critical pressure pcr2. It should be noted that after the initial loading (first step) the stresses in the membrane remain elastic.
Fig. 2. (a) Valve (1-membrane; 2-casing); (b) Switching device (button); (c) Fuel tank (F - fuel; G - gas; M - displacing membrane)
It should be noted that, as a result of geometrical imperfections and inaccuracies in the axisymmetric loading of the shell during operation, buckling will occur at external pressure values other than pcr1 and pcr2. The pressure at which the shell buckles will be denoted as pb. The pressure pb lies in the interval between the upper (pcr1) and lower (pcr2) critical pressures.
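The snap-through hysteresis described above (loading up to pcr1, a jump, then unloading down to pcr2 and a reverse jump) can be reproduced on the simplest bistable system. The sketch below uses a classical two-bar von Mises truss as a minimal analogue of the membrane's N-shaped elastic characteristic; it is not the membrane model of this paper, and the geometry (a, h) and stiffness EA are arbitrary assumed values.

```python
import numpy as np

# Two-bar (von Mises) truss: the simplest system with an N-shaped
# load-deflection curve like Fig. 1b. Illustrative analogue only.
def truss_load(w, a=1.0, h=0.3, EA=1.0):
    """Vertical load P that balances an apex deflection w of the truss."""
    L0 = np.hypot(a, h)          # initial bar length
    L = np.hypot(a, h - w)       # current bar length
    return 2.0 * EA * (L0 - L) / L0 * (h - w) / L

w = np.linspace(0.0, 0.6, 2001)  # deflection sweep over the snap-through range
P = truss_load(w)

p_cr1 = P.max()                  # analogue of the upper critical pressure (point B)
p_cr2 = P.min()                  # analogue of the lower critical pressure
assert p_cr1 > 0.0 > p_cr2
# between the two extrema the branch is descending (unstable), as on segment BC
i1, i2 = P.argmax(), P.argmin()
assert i1 < i2 and np.all(np.diff(P[i1:i2 + 1]) <= 0.0)
```

For this symmetric truss the lower extremum is negative, whereas for real membranes pcr2 is typically positive, so the example is only a qualitative analogue of the elastic characteristic in Fig. 1b.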
3 Types of Study of Post-buckling Membrane Behavior

There are two main approaches to the study of the post-buckling behavior of nonlinear mechanical systems. The first approach is the classical one. It consists in finding the critical values of the external load at which adjacent equilibrium forms of the structure appear [1, 5, 9]. The second approach consists in the direct construction of the surface of equilibrium states in the parameter space of the system. The advantage of this approach is the possibility of a detailed study of the post-buckling behavior of the shell, which often characterizes the operational properties of the elastic element [3, 4, 6, 7, 22–24]. In this paper, the authors propose and use a multiparameter approach to the study of the nonlinear post-buckling straining of buckling shells (membranes). The proposed approach uses the idea of "immersing" a specific problem in a multiparameter family of similar problems. Such an approach is a convenient tool for designing and constructing technical structures, which makes it possible to obtain not one particular solution, but the solution of the entire family of buckling-membrane problems at once, depending on the external parameters of the system.
To construct the surface of the equilibrium states of the membrane, the discrete method of continuation by parameter was used together with the method of "changing the subspace of control parameters". In other words, a complex multiparameter problem can be considered as a set of one-parameter problems, each of which has its own discretely varying parameter. For the transition between the one-parameter problems, the method of changing the subspace of control parameters was used [6, 7, 10, 17, 18]. The task of studying the post-buckling behavior is thus reduced to the analysis of the surface of equilibrium states. For some values of the parameters of the system, many-valued solutions, singular points and isolated solutions can be observed. A distinctive feature of this approach is that, instead of solving the bifurcation problem, it is proposed to numerically examine the pictures of the rearrangement of the solution when passing through a particular point [4, 6, 7, 20, 21]. To construct the rearrangement picture and to bypass the neighborhoods of bifurcation points, the technique of "changing the subspace of external parameters" is used. The use of the finite element method (FEM) in the study of the post-buckling behavior of shells is associated with a number of difficulties, such as the impossibility of moving through the geometry parameters, thickness or radius of curvature: varying any geometric parameter of the structure entails repeated rebuilding of the finite element mesh, which in turn requires significant computing power. In particular, the FEM does not allow finding isolated solutions on the equilibrium state curve, nor simulating the nucleation of the so-called "loops" on the elastic characteristic [17, 19].
4 The Mathematical Model for Describing the Process of Nonlinear Straining of Axisymmetric Shell The paper used the correlation theory of thin axisymmetric shells, modernized for the subsequent application of the numerical algorithm. In describing the geometry of the membrane (shell of rotation), the middle surface of the shell was used as a reference surface (Fig. 3). The point B0 on the median surface is characterized by two coordinates: the angle u, which determines the position of the meridional section, and the arc coordinate S0 , directed along the meridian. In axisymmetric deformations, all meridional sections of the shell are equivalent and, therefore, only the S0 coordinate is essential. The geometry of the meridional section of the membrane (axisymmetric shell) (Fig. 3b) in the initial non-straining state was specified in a parametric form: X0 ¼ X0 ðS0 Þ;
Y0 ¼ Y 0 ð S 0 Þ
ð4:1Þ
where S0 is the independent coordinate, measured from a previously fixed point A0 of the meridian to the current point B0 of the meridian; X, Y is the Cartesian coordinate system in which the X-axis is directed along the radius of the shell and the Y-axis coincides with the axis of rotation of the shell; X0, Y0 are the Cartesian coordinates of the current meridian point before straining. Denote the current angle of inclination of the tangent to the meridian in the undeformed state by ψ0. The geometric relations are:

dX0/dS0 = cos ψ0,  dY0/dS0 = sin ψ0    (4.2)
Fig. 3. On the derivation of the basic axisymmetric shell relations
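As a quick numerical sanity check of (4.1)-(4.2), the sketch below parametrizes a spherical meridian of radius R1 = 32 mm (the reference value used later in Table 1), for which the tangent angle is simply ψ0 = S0/R1, and verifies the derivative relations by finite differences. The arc range and the grid are assumed values chosen only for the test.

```python
import numpy as np

# Spherical reference meridian: psi0 = S0 / R1, so (4.1) becomes
# X0 = R1*sin(psi0), Y0 = R1*(1 - cos(psi0)). R1 = 32 mm as in Table 1.
R1 = 32.0                                  # mm
S0 = np.linspace(0.0, 2.0, 2001)           # arc coordinate along the meridian, mm
psi0 = S0 / R1                             # tangent angle psi0(S0)
X0 = R1 * np.sin(psi0)                     # X0(S0)
Y0 = R1 * (1.0 - np.cos(psi0))             # Y0(S0)

# relations (4.2): dX0/dS0 = cos(psi0), dY0/dS0 = sin(psi0)
dX0 = np.gradient(X0, S0)
dY0 = np.gradient(Y0, S0)
assert np.max(np.abs(dX0 - np.cos(psi0))) < 1e-4
assert np.max(np.abs(dY0 - np.sin(psi0))) < 1e-4
```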
Parameters related to the undeformed state will be denoted by the subscript "0". After deformation, the point B0 belonging to the middle surface of the shell moves to a new spatial position B, characterized by the coordinates X, Y, and S, respectively (Fig. 3b). The geometric relations for the strained state are then written in the form:

dX/dS = cos ψ,  dY/dS = sin ψ    (4.3)
The displacement u of point B along the X-axis, its displacement v along the Y-axis, and the angle of rotation of the normal θ are determined by the following relations:

u = X − X0,  v = Y − Y0,  θ = ψ − ψ0    (4.4)
To denote values corresponding to the meridional and circumferential directions, the subscripts "1" and "2" are used, respectively. The expressions for the principal radii of curvature in the initial and strained states of the shell are:

1/ρ10 = dψ0/dS0,  1/ρ1 = dψ/dS,  1/ρ20 = sin ψ0 / X0,  1/ρ2 = sin ψ / X    (4.5)
The complete changes of the principal curvatures are denoted as follows:

κ1 = dψ/dS − dψ0/dS0,  κ2 = sin ψ/X − sin ψ0/X0    (4.6)
The complete change of curvature in the meridional direction consists of the change of curvature due to the rotation of the normal (bending) and that due to the elongation of the meridian: κcompl = κbend + κgeom. Then

κbend = κcompl − κgeom = (dψ/dS − dψ0/dS0) − (dψ0/dS − dψ0/dS0) = dψ/dS − dψ0/dS    (4.7)
We introduce the notation:

κ10 = dψ/dS − dψ0/dS    (4.8)
Similarly, for the change of curvature in the circumferential direction:

κ20 = sin ψ/X − sin ψ0/X    (4.9)
The linear strain of an element of the middle surface in the meridional direction for the current state is:

e10 = (dS − dS0)/dS0    (4.10)
The linear strain of an element of the middle surface in the circumferential direction for the current state is:

e20 = ((X0 + u)φ − X0 φ) / (X0 φ) = u / X0    (4.11)
Transforming expression (4.8), taking into account dS = (1 + e10) dS0, gives:

κ10 = (1/(1 + e10)) (dψ/dS0 − dψ0/dS0)    (4.12)
Using relations (4.1)–(4.4), (4.10) and (4.12), we obtain the following geometric relations for a thin-walled axisymmetric shell:

du/dS0 = (1 + e10) cos ψ − cos ψ0
dv/dS0 = (1 + e10) sin ψ − sin ψ0
dψ/dS0 = (1 + e10) κ10 + dψ0/dS0    (4.13)
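The relations (4.13) can be checked numerically on a synthetic deformation: map the reference spherical meridian (radius R1) onto a sphere of radius R with a uniform meridional stretch 1 + e10, so that ψ0 = S0/R1 and ψ = (1 + e10)S0/R. The values R = 35 mm and e10 = 0.01 below are assumed purely for the test.

```python
import numpy as np

# Synthetic deformation: reference sphere R1 -> sphere R with uniform
# meridional stretch lam = 1 + e10. R and lam are assumed test values.
R1, R, lam = 32.0, 35.0, 1.01
S0 = np.linspace(0.0, 2.0, 4001)
psi0, psi = S0 / R1, lam * S0 / R          # tangent angles before/after straining
X0, Y0 = R1 * np.sin(psi0), R1 * (1.0 - np.cos(psi0))
X,  Y  = R * np.sin(psi),   R * (1.0 - np.cos(psi))
u, v = X - X0, Y - Y0                      # displacements, relations (4.4)
e10 = lam - 1.0                            # meridional strain (4.10), constant here

# first two relations of (4.13), checked by finite differences
assert np.max(np.abs(np.gradient(u, S0) - ((1 + e10) * np.cos(psi) - np.cos(psi0)))) < 1e-4
assert np.max(np.abs(np.gradient(v, S0) - ((1 + e10) * np.sin(psi) - np.sin(psi0)))) < 1e-4

# third relation: with kappa10 from (4.12), dpsi/dS0 = (1 + e10)*kappa10 + dpsi0/dS0
kappa10 = (lam / R - 1.0 / R1) / lam       # (4.12) evaluated for this deformation
assert np.allclose(np.gradient(psi, S0), (1 + e10) * kappa10 + 1.0 / R1)
```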
5 The Transition from a Multiparameter Problem to a Set of One-Parameter Problems. Discrete Continuation Method by Parameter

To study complex multiparameter straining processes, the family of nonlinear boundary-value problems in ordinary differential equations was reduced to a multiparameter family of systems of nonlinear equations of the form

F({X1}, {X2}) = 0    (5.1)

The system (5.1) of order m depends on the vector of "internal" parameters {X1} of dimension m×1 and the vector of "external" parameters {X2} of dimension n×1. The vector {X1} comprises the parameters characterizing the state of the system under consideration, namely the generalized displacements and the generalized internal force factors. The external, or "control", parameters forming the vector {X2} are variable and, as a rule, are related to the geometric dimensions of the structure, the material properties, the fixing conditions, the external load, etc. The set of all solutions of (5.1) can be considered as a certain surface (hypersurface) of equilibrium states constructed in a Euclidean space of dimension R^(m+n). The construction of such a hypersurface is a very laborious process; therefore, the solution of a complex multiparameter problem is often reduced to solving a family of one-parameter problems, each of which has its own discretely varying parameter. Such an approach makes it possible to construct cross-sections of the surface of equilibrium states, which greatly simplifies the problem under study and makes it possible to analyze the influence of the "external" parameters on the straining character of the element under consideration. In this case, the vector of "external" parameters {X2} is determined through one independent scalar parameter q:

{X2} = q    (5.2)
To study the loss-of-stability problem of a buckling membrane loaded by external pressure, the method of continuation by parameter was used. Its main idea is that any complex multiparameter problem can be considered as a set of one-parameter problems, each of which has its own discretely varying parameter with fixed values of all the other external parameters. As the variable parameter q, the external load, a geometrical characteristic of the system, a material property, the fixing conditions, etc. can be taken.

5.1 Two-Point Procedure According to the "Predictor-Corrector" Scheme

In implementing the procedure of discrete continuation by parameter, a two-stage "predictor-corrector" procedure was used. At the "predictor" stage, the initial vector {Xextr}^k = {{X}^k, q^k} is predicted for the new value of the parameter q^k by extrapolating the solution based on its history. At the "corrector" stage, the initial approximation {X0} is refined using an iterative method; Newton's method and its modifications are used.

5.2 Method of Changing the Subspace of Control Parameters
To bypass the bifurcation points, or branch points of the equilibrium states arising on the curve, a numerical technique is used which is called the "method of changing the subspace of control parameters" [4, 6, 7, 11, 17]. The main idea of the method is that, in approaching the neighborhood of a suspected bifurcation point, one should move to a new system for which there are no bifurcation points or branch points on the equilibrium state curve (the presence of limit points is allowed). In fact, the transition to the new system can be interpreted as the solution of a problem with a slightly modified configuration, which belongs to a whole family of similar problems. After passing through the critical section (a neighborhood of the bifurcation point), one can make the reverse transition and continue solving the problem with the initial configuration. To implement the described procedure, it must be possible to vary at least two control parameters, for example, the external load parameter and some geometrical parameter of the shell. Then the one-parameter system should be written as follows:

r({X1}, q1, q2) = r({Xextr}) = 0    (5.3)

where {Xextr} = {{X1}, q1, q2} is the extended vector; {X1} is the vector of the main unknowns, of dimension m×1; and q1, q2 are the two control parameters. The system (5.3) has dimension m + 2.
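A minimal sketch of the machinery of this section (predictor-corrector continuation plus a switch of the continuation parameter near limit points, in the spirit of Sect. 5.2) can be given on a toy scalar equilibrium equation r(x, q) = x^3 − x − q = 0, whose curve q(x) has two folds, qualitatively like the elastic characteristic in Fig. 1b. This stands in for (5.1)–(5.3) and is not the shell model; the step sizes and thresholds are assumed values.

```python
import numpy as np

# Toy equilibrium equation: x = "internal" state parameter, q = "external"
# control parameter. The branch is traced by continuation in q, and near the
# folds the continuation parameter is switched to x ("subspace change").
def r(x, q):
    return x**3 - x - q

def drdx(x):
    return 3.0 * x**2 - 1.0

def corrector(x0, q, tol=1e-12):
    """Newton corrector at fixed q; None if it diverges or jumps branches."""
    x = x0
    for _ in range(40):
        d = drdx(x)
        if d == 0.0:
            return None
        step = r(x, q) / d
        x -= step
        if abs(step) < tol:
            return x if abs(x - x0) < 0.2 else None   # reject remote branches
    return None

q, x = -1.875, -1.5            # a starting point satisfying r(x, q) = 0
dq, dx = 0.02, 0.02
path = [(q, x)]
while x < 1.5 and len(path) < 1000:
    xn = corrector(x, q + dq)  # predictor: previous x; corrector: Newton
    if xn is not None and drdx(xn) > 0.3:
        q, x = q + dq, xn      # ordinary continuation in the control parameter q
    else:
        # near a fold: continue in x instead and recover q directly from the
        # equilibrium condition (the change of control-parameter subspace)
        x = x + dx
        q = x**3 - x
    path.append((q, x))

qs = np.array([p[0] for p in path])
d = np.diff(qs)
# the traced branch turns around at both folds (q near +0.385 and -0.385)
assert (d[1:] * d[:-1] < 0).sum() >= 2
```

A plain q-stepping scheme would stall or jump branches at the first fold; switching the stepped parameter lets the trace pass both limit points in one run, which is the point of the subspace-change technique.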
6 The Study of Post-buckling Behavior of a Hinged Spherical Shell

Let us consider the multiparameter approach and the method of changing the subspace of control parameters using the example of the post-buckling behavior of a spherical shell hinged on the external contour and loaded by external pressure (Fig. 4). Table 1 shows the initial data.
Fig. 4. Scheme of a spherical shell hinged on the external contour

Table 1. Initial data

Bearing radius: a = 2.8 mm
Shell thickness: h = 0.05 mm
Original radius of curvature of the meridian: R1 = 32 mm
Elastic modulus: E = 13·10^4 MPa
Poisson's ratio: μ = 0.3
Method of fixing the edge of the shell: hinged
The multiparameter approach uses the strategy of sequential investigation of one-parameter families of dimension R^(m+1), each of which belongs to a multiparameter family of problems of dimension R^(m+n). Each one-parameter problem has its own discretely varying external parameter, while the remaining (n − 1) control (external) parameters have fixed values. In this example, the study of the post-buckling behavior of the spherical shell was carried out in the space: deflection of the shell at the center (v), radius of curvature (R), external pressure (p). The radius of curvature and the external pressure are the control parameters, while the deflection of the shell at the center is one of the components of the vector of internal parameters characterizing the current state of the structure. Figure 5 shows the projection of the surface of equilibrium states into the space v, R, p. This surface was constructed using the method of continuation by parameter in conjunction with the method of changing the subspace of control parameters. The figure shows that, with an increase in the radius of curvature, the projection of the surface of equilibrium states becomes more complicated, and at a certain value of the curvature parameter a separate surface of isolated solutions appears, which subsequently merges with the main surface.
S. A. Podkopaev et al.
Fig. 5. The projection of the surface of equilibrium states in space: external pressure (p) - radius of curvature (R) - deflection at the center (v)
To demonstrate the method of changing the subspace of control parameters, consider the cross-sections of the projection of the surface of equilibrium states for values of the radius of curvature R in the range from 32 mm to 36 mm (Fig. 6). Trajectory 1 corresponds to a radius of curvature of 35.5 mm, and trajectory 3 corresponds to a radius of curvature of 32 mm. Observing the qualitative change in the trajectories under a monotonic change of the curvature parameter, we can assume that there is a certain singular point, a bifurcation point, corresponding to a critical value of the curvature parameter R_cr (33.5185 < R_cr < 33.5188). In Fig. 6, the section corresponding to the critical value of the curvature parameter is depicted by the shaded plane.
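The bracketing of R_cr quoted above can be reproduced with a simple bisection over the curvature parameter: classify the trajectory type on each side of the interval and halve it until the bracket is tight enough. The sketch below is a toy illustration: the classifier simply compares against an assumed critical value; in the actual computation it would be the numerically observed trajectory type.

```python
# Toy bisection for bracketing a critical parameter value. The classifier is a
# stand-in: it 'observes' the trajectory type by comparing against an assumed
# critical value R_CR (midpoint of the interval quoted in the text); in
# practice the type would be determined by running the continuation at the
# given R and inspecting the resulting trajectory.
R_CR = 33.51865  # assumed critical value, for illustration only

def trajectory_type(R):
    return 3 if R < R_CR else 1   # 'type 3' below R_cr, 'type 1' above

lo, hi = 32.0, 35.5               # endpoints known to have different types
while hi - lo > 1e-4:
    mid = 0.5 * (lo + hi)
    if trajectory_type(mid) == trajectory_type(lo):
        lo = mid
    else:
        hi = mid
# afterwards, (lo, hi) is a tight bracket around R_cr
```
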
Fig. 6. Cross-sections of the projection of the surface of equilibrium states in space: external pressure (p) - radius of curvature (R) - deflection at the center (v)
Approaching the critical value of the curvature parameter R_cr from below, the trajectories lying in the pressure–deflection plane look similar to trajectory 3; approaching from above, they look similar to trajectory 1. Obtaining the exact value of the critical parameter is practically impossible: as the neighborhood of the bifurcation point is approached, the convergence of the numerical solution deteriorates, and a spontaneous transition to another branch of the solution, or a turn back, can occur. In this paper, an alternative strategy is proposed within the framework of the multiparameter approach that avoids directly solving the branching problem while still providing all the necessary information about the behavior of the system in the neighborhood of the bifurcation point. In other words, it is possible to bypass the singular point from different sides and build a so-called restructuring picture. Using the change of the subspace of control parameters, one can at the right moment branch off from trajectory 3, which corresponds to the one-parameter problem with variable pressure (p = var) and fixed curvature (R = 32 mm), and start moving along trajectory 4 (Fig. 6), which corresponds to another one-parameter problem with variable curvature (R = var) but fixed external pressure (p = 0.084 MPa). In fact, the method of changing the subspace of control parameters is a switch between different one-parameter problems. Having reached the specified value of the radius of curvature, for example R = 35.5 mm, it is possible to branch off from trajectory 4 and thus reach trajectory 5 (Fig. 6), which corresponds to the isolated solution and has no common points with the main trajectory 1. By continuing along trajectory 4, one can determine the "depth" of penetration of the isolated solution.
As can be seen from the obtained results, trajectory 4 also has a limit point (in this case, R_pr = 35.8 mm). Consequently, for values of the radius of curvature greater than R_pr, no isolated solution similar to trajectories 5 and 6 exists. To overcome the limit point on trajectory 4, the method of changing the continuation parameter is used. Continuing along trajectory 4, bypassing the limit point, one can again return to the initial trajectory 3; thus the neighborhood of the bifurcation point is bypassed. This algorithm was implemented in the authors' program, and to confirm the reliability of the numerical solution, the problem in question was also solved by the finite element method in the ANSYS software package. The results of the authors' program coincided with the ANSYS results to high accuracy, which confirms the reliability of the presented results.
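The switching strategy described above can be sketched in miniature. The code below is a hedged illustration, not the authors' program: it continues a toy equilibrium equation g(v, p) = p − (v³ − v) = 0, which has a limit (fold) point, with v playing the role of the deflection and p the pressure. Continuation in p proceeds until convergence breaks down or the solution jumps to another branch, and then the continuation parameter is switched to v to walk around the fold.

```python
def g(v, p):
    # toy equilibrium equation with a limit (fold) point: p = v**3 - v
    return p - (v**3 - v)

def newton(f, x0, tol=1e-12, maxit=60):
    # 1-D Newton iteration with a numerical derivative
    x = x0
    for _ in range(maxit):
        fx = f(x)
        h = 1e-7 * (1.0 + abs(x))
        x_new = x - fx / ((f(x + h) - fx) / h)
        if abs(x_new - x) < tol:
            return x_new, True
        x = x_new
    return x, False

# Phase 1: natural continuation, p is the discretely varied parameter.
v = -1.5
p = v**3 - v                        # start on the equilibrium curve
branch = [(v, p)]
p_t = p + 0.05
while p_t < 1.0:
    v_new, ok = newton(lambda vv: g(vv, p_t), v)
    if not ok or abs(v_new - v) > 0.5:
        break                       # stall or branch jump: limit point reached
    v, p = v_new, p_t
    branch.append((v, p))
    p_t += 0.05

# Phase 2: 'change the subspace of control parameters': v is now the varied
# parameter and p is solved for, which walks around the fold instead of
# failing at it.
v_t = v + 0.05
while v_t < 1.5:
    p_new, ok = newton(lambda pp: g(v_t, pp), p)
    if ok:
        v, p = v_t, p_new
        branch.append((v, p))
    v_t += 0.05
```

The same predictor-free switch generalizes directly to the shell problem: the role of g is played by the discretized equilibrium equations, and the role of (v, p) by the state vector and the currently active control parameter.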
7 Conclusions

The paper provides an overview of existing approaches and methods for studying the post-buckling behavior of the axisymmetric membrane. A rational mathematical model was chosen to describe the nonlinear straining of symmetric shells.
A numerical algorithm for studying the nonlinear straining processes of multiparameter systems has been developed and implemented as the authors' program. Within the framework of the algorithm, a method for calculating one-parameter problems using the discrete parameter continuation method was developed, and the method of "changing the continuation parameter" was implemented to overcome singular limit points of the equilibrium-state curve.
References 1. Alfutov, N.A.: Fundamentals of calculation on the stability of elastic systems. Mechanical Engineering (1977). (Calculator library) 488 p. with illustration 2. Andreeva, L.E.: Elastic elements of devices. Tutorial, Mashinostroenie, 456 p. (1982) 3. Biderman, V.L.: Mechanics of thin-walled designs. Statics, Mechanical Engineering (1977). (Calculator library) 488 p. with illustrations 4. Valishvili, N.V.: Methods for calculating the shells of rotation on electronic digital computer. Mashinostroenie, 278 p. (1976) 5. Volmir, A.S.: Resistance of deformable systems. Fizmatgiz, 984 p. (1967) 6. Gavrushin, S.S.: Development of methods for calculating and designing elastic shell structures of instrument devices. Dissertation, Moscow, 316 p. (1994) 7. Gavrushin, S.S., Baryshnikova, O.O., Boriskin, O.F.: Numerical methods in dynamics and strength of machines. Publishing House of Bauman Moscow State Technical University, 492 p. (2012) 8. Grigolyuk, E.I., Lopanitsyn, E.A.: Finite deflections, stability and post-buckling behavior of thin shallow shells. Moscow, MSTU “MAMI”, 162 p. (2004) 9. Grigolyuk, E.I., Kabanov, V.V.: The stability of the shells. The main editors of the physical and mathematical literature publishing house “Science”, Moscow, 360 p. (1978) 10. Report at the International Scientific Conference “Boundary Problems of Continuum Mechanics and Their Applications” dedicated to the 100th anniversary of the birth of G.G. Tumashev and the 110th anniversary of the birth of Kh.M. Mushtari, vol. 42, pp. 5–19. Proceedings of the N.I. Lobachevsky Mathematical Center, Math Society, Kazan (2010) 11. Podkopaev, S.A., Gavrushin, S.S., Nikolaeva, A.S.: Analysis of the process of nonlinear straining of corrugated membranes. Interuniversity compilation of scientific papers: Mathematical modeling and experimental mechanics of a deformable solid, issue 1. Tver State Technical University, Tver (2017). 162c. UDC: 539.3/8, BBK: 22.251+22.19. 
http://elibrary.ru/item.asp?id=29068499 12. Podkopaev, S.A., Gavrushin, S.S., Nikolaeva, A.S., Podkopaeva, T.B.: Calculation of the working characteristics of promising microactuator designs. Interuniversity compilation of scientific papers: Mathematical modeling and experimental mechanics of a deformable solid, issue 1. Tver State Technical University, Tver (2017). 162c. UDC: 539.3/8, BBK: 22.251+22.19. http://elibrary.ru/item.asp?id=29068499 13. Ponomarev, S.D., Biderman, V.L., Likharev, K.K., et al.: Strength Calculations in Mechanical Engineering, vol. 2. Mashgiz (1958) 14. Feodosyev, V.I.: Elastic Elements of Precision Instrument Engineering. Oborongiz (1949) 15. Feodosyev, V.I.: To the calculation of the clapping membrane. Appl. Math. Mech. 10(2), 295–306 (1946) 16. Belhocine, A.: Exact analytical solution of boundary value problem in a form of an infinite hypergeometric series. Int. J. Math. Sci. Comput. (IJMSC) 3(1), 28–37 (2017). https://doi.org/10.5815/ijmsc.2017.01.03
17. Crisfield, M.A.: A fast incremental/iterative solution procedure that handles "snap-through". Comput. Struct. 13(1), 55–62 (1981) 18. Chuma, F.M., Mwanga, G.G.: Stability analysis of equilibrium points of Newcastle disease model of village chicken in the presence of wild birds reservoir. Int. J. Math. Sci. Comput. (IJMSC) 5(2), 1–18 (2019). https://doi.org/10.5815/ijmsc.2019.02.01 19. Gupta, N.K., Venkatesh: Experimental and numerical studies of dynamic axial compression of thin-walled spherical shells. Int. J. Impact Eng. 30, 1225–1240 (2004) 20. Marguerre, K.: Zur Theorie der gekrümmten Platte großer Formänderung. In: Proceedings of the Fifth International Congress for Applied Mechanics, pp. 93–101. Wiley, Cambridge (1939) 21. Moshizi, M.M., Bardsiri, A.K.: The application of metaheuristic algorithms in automatic software test case generation. Int. J. Math. Sci. Comput. (IJMSC) 1(3), 1–8 (2015). https://doi.org/10.5815/ijmsc.2015.03.01 22. Mescall, J.: Numerical solution of nonlinear equations for shell of revolution. AIAA J. 4(11), 2041–2043 (1966) 23. Reissner, E.: Asymmetrical deformations of thin shells of revolution. In: Proceedings of Symposia in Applied Mathematics, vol. 3, pp. 27–52. American Mathematical Society (1950) 24. Riks, E.: The application of Newton's method to the problem of elastic stability. J. Appl. Mech. 39, 1060–1065 (1972)
Mathematical Modeling of DC Motors for the Construction of Prostheses

D. A. Fonov1, I. A. Meshchikhin1, and E. G. Korzhov2

1 Bauman University, ul. Baumanskaya 2-ya, 5, Moscow 105005, Russia [email protected]
2 Kosygin State University of Russia (Technology. Design. Art), ul. Sadovnicheskaya 33, Moscow 115035, Russia
Abstract. The article describes the static and dynamic characteristics of a DC motor with independent excitation using mathematical modeling in the MATLAB environment, so that models of electric drive systems with this motor type can subsequently be built. Designing bionic prostheses requires developing a control system for the prosthesis. This complex system includes many different subsystems, in particular DC drive systems. This study examines the control of the knee module of a bionic prosthesis based on a hydraulic cylinder, which is controlled by rotating a spool that, in turn, determines the moment of resistance. The speed of the electric motor is therefore a key factor in the operation of the bionic knee module; an inadequate assessment may result in personal injury. A brief theory describing the basic principles of calculating the electrical and mechanical parts of a DC motor is presented. A basic structural diagram of a DC motor with an input action, a transfer function and an output signal is constructed. Cases of structural schemes with constant and variable magnetic flux of the excitation winding and with variable armature resistance are given. From these structural diagrams, a mathematical model of the DC motor is compiled in MATLAB/Simulink. Simulation results are obtained for direct starting of the DC motor with constant and variable magnetic flux of the excitation winding. Using the special linearization function, the transfer function of the motor is obtained. Start-up modes at idle and under load were modeled. A conclusion is drawn about the advisability of using DC motors in electric drive systems and the importance of the studies carried out for the further modeling of electric drive systems and real mechanisms.

Keywords: Automated control · DC motor · PID control · Positioning · Automated control system · Control basics · Simulink · Amesim · MATLAB
1 Introduction

Modern engineering design is based on the integrated application of digital modeling tools. The development of the model is carried out at various levels of control system construction: from the detailed to the super-element level.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 16–27, 2020. https://doi.org/10.1007/978-3-030-39216-1_2
The DC motor is widely used in industry and in the development of new products that involve wide regulation of rotational speed, shaft positioning, and motor operation in transient modes of various types. The various complications arising in the operation of both the DC motor itself and the supply network, including additional active power losses, make an in-depth study of the processes by mathematical modeling methods necessary [1]. The main task of the electric motor is the conversion of electrical energy into mechanical energy, the rotational energy of the rotor. The object of the study is an electric motor designed to simulate the hip joint in the knee prosthesis test stand.

1.1 Basic Equations for a DC Motor
The structure of the electromechanical system of a DC motor is shown in Fig. 1; it includes i_a(t), the armature current; R_a, the armature resistance; L_a, the armature inductance; v(t), the source voltage; T_M(t), the torque on the motor shaft; and T_L(t), the external resistance torque (load).
Fig. 1. Structural diagram of a DC motor
According to Kirchhoff's voltage law, the DC motor armature equation is as follows [2]:

R_a i_a(t) + L_a di_a(t)/dt + e(t) = v(t)

where e(t) is the back EMF. The back EMF is proportional to the angular velocity of the rotor of the motor:

e(t) = k_v ω(t)

where k_v is the speed constant. In addition, the motor generates a torque M proportional to the armature current i_a(t):
M(t) = k_M i_a(t)

If the input voltage is constant, v(t) = V, then the current is also constant, i_a(t) = I_a, and in steady state ω(t) = Ω and M(t) = M. That way:

R_a I_a + k_v Ω = V,
M = k_M I_a.

From the power balance we know that N = I_a V is the input power, which splits into the mechanical output power M Ω and the power dissipated in the armature resistance, R_a I_a². In this way:

V I_a = M Ω + R_a I_a²,

from which it follows that k_M = k_v = k (in consistent SI units), so M(t) = k i_a(t). In addition, if an external resistance torque M_c acts on the motor shaft, its mechanical behavior is described as follows:

J_M dω(t)/dt + B_M ω(t) = M − M_c

where J_M is the moment of inertia of the rotor and B_M is the viscous friction coefficient. Based on the above, the dynamic response of a DC motor can be written as follows:

R_a i_a(t) + L_a di_a(t)/dt + k_v ω(t) = v(t),
J_M dω(t)/dt + B_M ω(t) − k_M i_a(t) = −M_c.

1.2 Analytical Calculation of a DC Motor
Now, based on the above, we proceed to the creation of a DC motor model, describing the state space and the input and output characteristics. The mathematical model will be described for the motor type FL86BLS125. Technical specification of the electric motor: the FL86BLS125 series brushless motor is designed to convert electrical energy into the rotational mechanical energy of the motor rotor. The FL86BLS125 is equipped with three built-in Hall sensors, spaced 120° apart, intended for shaft positioning.
We introduce the following notation and convert the original parameters to SI units:
• L_a = 0.0003 H – armature inductance;
• R_a = 0.16 Ohm – active resistance of the armature;
• i_a(t) = 55 A – armature circuit current;
• v(t) = 48 V – armature circuit voltage;
• e(t) = 11.5 V – back EMF;
• M(t) = 6.178 N·m – torque on the motor shaft;
• ω(t) = 3000 rev/min – rotational speed of the motor shaft;
• β_DC(t), rad – angle of rotation of the motor shaft;
• J_M = 0.00024 kg·m² – axial moment of inertia;
• B_M = 0.00135 N·m·s – viscous friction coefficient.

From relations (2–3) we determine the torque constant K_M and the speed constant K_v:

6.178 = K_M · 55,  11.5 = K_v · 3000,

K_M = 0.11233 N·m/A,  K_v = 0.0038 V/(rev/min).
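As a numerical cross-check and a dynamic illustration, the constants above can be recomputed and the two dynamic equations integrated directly. The following is a hedged sketch in Python rather than the paper's Simulink model; the time step, horizon and the SI conversion of K_v are my own choices.

```python
import math

# Recomputing the motor constants from the rated data quoted above
# (M = 6.178 N*m at I_a = 55 A; e = 11.5 V at n = 3000 rev/min).
kM = 6.178 / 55.0                            # torque constant [N*m/A]
kv = 11.5 / 3000.0                           # speed constant [V/(rev/min)]
kv_si = 11.5 / (3000.0 * 2 * math.pi / 60)   # the same constant in [V*s/rad]

# Forward-Euler integration of the two dynamic equations above:
# direct start at rated voltage, idle (M_c = 0).
Ra, La = 0.16, 3e-4                          # armature resistance [Ohm], inductance [H]
JM, BM = 2.4e-4, 1.35e-3                     # inertia [kg*m^2], viscous friction [N*m*s]
V, Mc = 48.0, 0.0                            # step voltage [V], load torque [N*m]

dt = 1e-4                                    # integration step [s] (my choice)
ia = w = 0.0                                 # current [A], angular velocity [rad/s]
for _ in range(int(0.5 / dt)):               # simulate 0.5 s
    dia = (V - Ra * ia - kv_si * w) / La
    dw = (kM * ia - BM * w - Mc) / JM
    ia += dia * dt
    w += dw * dt

# Analytic steady-state speed from the static equations, for comparison.
w_ss = kM * V / (Ra * BM + kv_si * kM)       # [rad/s]
```

The simulated speed settles onto the analytic steady-state value, confirming that the static and dynamic equations are mutually consistent.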
Let us write Eqs. (9–10) for the electrical and mechanical parts of the system:

R_a i_a(t) + L_a di_a(t)/dt + e(t) = v(t)

J_M dω(t)/dt + B_M ω(t) = M

It is necessary to find the dependence of the output coordinate β_DC(t) of the system on the input v(t) (Fig. 2).
Fig. 2. The dependence of the voltage on the position
Let's consider the case of idling. From the equations for the electrical and mechanical systems (9–10) we eliminate the variables i_a(t), di_a(t)/dt, e(t) and M, and obtain:

a_0 d²ω(t)/dt² + a_1 dω(t)/dt + a_2 ω(t) = b_0 v(t)

where

a_0 = L_a J_M / (R_a B_M + K_v K_M) = 2.8757,
a_1 = (R_a J_M + L_a B_M) / (R_a B_M + K_v K_M) = 0.1542,
b_0 = K_M / (R_a B_M + K_v K_M) = 147.7872.
With a_2 = 1, the equation is written in one of the standard forms. Let's use the Laplace transform with zero initial conditions to find one of the transfer functions of the DC motor:

W_DC(p) = ω(p) / v(p) = b_0 / (a_0 p² + a_1 p + 1)

We introduce the notation:
– link transfer ratio: K_DC = b_0;
– time constant: T_DC = √a_0, s;
– attenuation parameter: ξ = a_1 / (2√a_0).

To obtain the transfer function of the DC motor for the case when the output coordinate is the shaft rotation angle β_DC(t), it is necessary to take into account that:

β_DC(t) = ∫₀ᵗ ω(τ) dτ,  ω(t) = dβ_DC(t)/dt.

In the Laplace domain (with zero initial conditions):

β_DC(p) = ω(p) / p,  ω(p) = p β_DC(p).
Then:

W_DC^β(p) = β_DC(p) / v(p) = K_DC / (p (T_DC² p² + 2 ξ T_DC p + 1)),

W_DC^β(p) = 147.79 / (p (2.8757 p² + 2 · 0.0455 · 1.695 · p + 1)).
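The printed parameters can be cross-checked against each other (a minimal consistency check; a_0 is taken as 2.8757, the value consistent with the final transfer-function expression and with T_DC = 1.695):

```python
import math

# Cross-checking the link parameters against the quoted coefficients
# a0 = 2.8757, a1 = 0.1542, b0 = 147.7872.
a0, a1, b0 = 2.8757, 0.1542, 147.7872
T_dc = math.sqrt(a0)            # time constant T_DC [s]
xi = a1 / (2 * math.sqrt(a0))   # attenuation (damping) parameter
K_dc = b0                       # link transfer ratio
```

Evaluating these expressions reproduces T_DC ≈ 1.695 s and ξ ≈ 0.0455, matching the numeric transfer function above.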
2 Modeling with MATLAB Simulink

MATLAB is a proprietary tool that is widely used in various biomedical applications due to its flexibility, accuracy and timing constraints [3]. Based on the obtained transfer function and the calculated coefficients, let's construct a structural model of the transfer-function link with input parameters in the form of an external torque T_M(t) and voltage v(t), and output characteristics in the form of shaft speed and position. Thus, we will check the accuracy of the mathematical calculations and of the method used to obtain the transfer function (Fig. 3).
Fig. 3. Structural model of the motor of the third order
Let's also assign an Input Perturbation and an Output Measurement for further linearization and extraction of the transfer function in the Simulink environment. We perform the linearization with the Linear Analysis Tool and obtain the transfer function (Fig. 4).
Fig. 4. Transfer function
Note that after performing the standard transformations we arrive at the form obtained by the analytical method.

2.1 Testing and Regulating the Model of a DC Motor
After drawing up the structural model, let’s enclose it in the super-component and substitute the nominal voltage and the external moment as input values (Figs. 5 and 6).
Fig. 5. Subsystem
Fig. 6. Output characteristic
Note that the output frequency characteristic matches the motor's datasheet value. Let's add PID voltage regulation, limiting the input voltage and current, and introduce feedback on the position of the motor shaft. The PID controller is the classic example of a closed-loop regulator: in 1911 Elmer Sperry developed a practical PID controller, and in 1922 Nicolas Minorsky published the first theoretical analysis of the PID controller. PID controllers have played a major role in control systems engineering over the past few decades [4]. As input parameters, we use the set-point angle and the external torque needed to perform the bending in the hip joint of the bionic knee. Let's calculate the external torque on the motor shaft from the ball-screw relation M_ex · 2π = F · h:

M_ex = F · h / (2π) = (600 · 0.01) / (2π) ≈ 0.95 N·m

Let's set the input values (Fig. 7):
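The torque conversion above is just the ideal (loss-free) ball-screw relation; as a quick sanity check:

```python
import math

# Ideal ball-screw relation: a force F on the nut, with lead h of travel per
# revolution, corresponds to the shaft torque M_ex = F*h / (2*pi), losses
# neglected. F = 600 N and h = 0.01 m are the values used in the text.
F, h = 600.0, 0.01
M_ex = F * h / (2 * math.pi)
print(round(M_ex, 3))   # -> 0.955 (torque in N*m)
```
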
Fig. 7. DC Motor model with position control
Since the system uses a ball-screw gear, one revolution of the motor shaft produces only 10 mm of translational motion. The task was set for the system to travel 0.1 m in 0.1 s, and the PID coefficients must be selected accordingly. First, we use the Tune tool to select the PID parameters; then we refine them using standard transformations (Fig. 8).
Fig. 8. Parameters of the PID controller
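For illustration, the position loop can be sketched with a discrete PID acting on a simplified motor model. All numbers below (speed gain, time constant, PID gains) are illustrative placeholders, not the values identified in the paper or produced by the Tune tool; only the 10 mm/rev ball-screw lead, the 48 V limit, and the 0.1 m target are taken from the text.

```python
import math

# Minimal discrete PID position loop on a simplified motor: first-order speed
# dynamics (gain K, time constant tau -- illustrative round numbers) plus an
# integrator for the shaft angle, with the 10 mm/rev ball screw converting the
# angle into carriage travel.
K, tau = 15.0, 0.05          # speed gain [(rad/s)/V], speed time constant [s]
lead = 0.01                  # ball-screw lead [m/rev]
kp, ki, kd = 1000.0, 1.0, 50.0   # illustrative PID gains
dt, x_ref = 1e-4, 0.1        # integration step [s], target travel [m]

w = theta = x = integ = 0.0
e_prev = x_ref
for _ in range(int(0.5 / dt)):           # simulate 0.5 s
    e = x_ref - x
    integ += e * dt
    u = kp * e + ki * integ + kd * (e - e_prev) / dt
    u = max(-48.0, min(48.0, u))         # supply-voltage limit, as in the text
    e_prev = e
    w += (K * u - w) / tau * dt          # first-order speed response
    theta += w * dt                      # shaft angle [rad]
    x = theta / (2 * math.pi) * lead     # carriage travel [m]
```

With these placeholder gains the carriage settles close to the 0.1 m set point well within the simulated half second; the real tuning question the Tune tool answers is how to trade this settling time against overshoot under the voltage limit.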
An optimization technique can be used to select the best values of the parameters that affect performance [5] (Fig. 9).
Fig. 9. Output characteristic (position)
3 Mathematical Modelling of a DC Motor in a SIEMENS Amesim Environment

Let us build a similar control system for the DC motor in the SIEMENS Amesim mathematical modeling environment in order to obtain more realistic motor behavior. Let's add a third output characteristic, the motor load capacity [N], to determine the force arising on the ball-screw transmission. To do this, we perform the reverse transformation of the effort arising on the shaft of the electric motor (Fig. 10):

M_B · 2π = F · h,
M_B = F · h / (2π),
F(M_B) = M_B · 2π / 0.01.
Fig. 10. DC motor model.
Here, 5 is the input voltage, 4 is the winding current, 3 is the output frequency, 2 is the shaft position, and 1 is the external torque. We add the PID regulator and feedback on the position of the motor shaft to the model (Fig. 11).
Fig. 11. Model of DC motor with PID position control and feedback loop
Let’s perform a calculation with similar input parameters used earlier and determine the mechanical characteristics (Fig. 12).
Fig. 12. Output characteristic (position)
Note that in the Amesim environment there is no dip from the nominal torque; the set-point angle is reached in 0.1 s and no oscillation is observed.
4 Conclusions

The paper presents the results of mathematical modeling of a brushless motor with a position feedback system and a PID control system. The model of the brushless electric motor is developed from the transfer function obtained by the basic electromechanical laws. This method allows one to evaluate whether the speed of the motor is sufficient for the control system of the bionic prosthesis and, on this basis, to judge the suitability of the motor. The bionic knee module control system describes the interdisciplinary behavior of not only electromechanical but also hydraulic and mechanical elements. The ability to create mathematical models in various software systems allows the results of a numerical experiment to be compared and the full potential of the software to be used. In particular, the motor model in the Amesim environment can be integrated into more complex engineering systems: the Amesim software package describes the behavior of hydraulic systems most accurately, while the model in MATLAB Simulink makes it possible to debug the behavior of a higher-level control system in State Flow modules. Qualitatively new results were thus obtained in the field of prosthetics. The simulation results allow us to judge the suitability of a particular motor, based on its characteristics, for use in a particular system, as well as to select the PID-regulator coefficients for further integration into a pulse-width modulation system. At the same time, an integral part of verifying the mathematical model and confirming the motor's parameters is setting up a physical experiment and processing its results.
References 1. Yudina, O.I.: Mathematical modelling of additional losses in DC motors with pulsing power. Abstract of the thesis for the degree of candidate of technical sciences – P.2.1 2. Chen, Y.-P.: Modeling of DC Motor. NCTU Department of Electrical and Computer Engineering Spring Course (2015) 3. Sharma, G.: Performance analysis of image processing algorithms using matlab for biomedical applications. Int. J. Eng. Manuf. (IJEM) 7(3), 8–19 (2017) 4. Alla, R.R., Lekyasri, N., Rajani, K.: PID control design for second order systems. Int. J. Eng. Manuf. 9(4), 45–56 (2019) 5. Elnaghi, B.E., Mohammed, R.H., Dessouky, S.S., Shehata, M.K.: Load test of induction motors based on PWM technique using genetic algorithm. Int. J. Eng. Manuf. 9(2), 1–15 (2019) 6. Derian, M.: Typology of prostheses and interface modes between humans and digital systems. Cogn. Prosthet. 1–27 (2018) 7. Wentink, E.C., Koopman, H.F.J.M., Stramigioli, S., Rietman, J.S., Veltink, P.H.: Variable stiffness actuated prosthetic knee to restore knee buckling during stance: a modeling study. Med. Eng. Phys. 35(6), 838–845 (2013) 8. Wu, S.-K., Waycaster, G., Shen, X.: Electromyography-based control of active above-knee prostheses. Control Eng. Practice 19(8), 875–882 (2011) 9. Severijns, P., Vanslembrouck, M., Vermulst, J., Callewaert, B., Innocenti, B., Desloovere, K., Vandenneucker, H., Scheys, L.: High-demand motor tasks are more sensitive to detect persisting alterations in muscle activation following total knee replacement. Gait Posture 50, 151–158 (2016) 10. Xu, L., Wang, D.-H., Fu, Q., Yuan, G., Bai, X.-X.: A novel motion platform system for testing prosthetic knees. Measurement (2019)
Complex Risks Control for Processes in Heat Technological Systems

Vadim Borisov, Vladimir Bobkov, and Maxim Dli
Smolensk Branch of the National Research University “Moscow Power Engineering Institute”, Smolensk, Russia [email protected], [email protected], [email protected]
Abstract. The tasks of risk control in heat technological systems (HTS) cannot be solved by the traditional methods that imply an exact representation of problem situations, because such tasks are characterized by a large number of parameters, uncertainty and data fuzziness. A model of complex risk control for processes in HTS is proposed. This model is based on the hybridization and complexation of fuzzy models. The proposed model contains fuzzy models to analyze HTS processes, to assess processes in HTS, and to assess risks in HTS. The need to manage risks in a complex way when setting up resource- and energy-saving processes in HTS stems from the requirement of keeping the violation risks within an acceptable level. It should be noted that the acceptable risk level can be established for all stages of the heat technological processes in HTS. The approach to complex risk control for the processes in HTS with the use of the proposed model is described. It consists in setting the combinations of the control parameters for the processes in HTS, and in modeling and determining the combinations of control parameters that ensure resource and energy efficiency of the processes in HTS without exceeding the permissible level of the disruption risks of these processes. The article presents the experimental results of complex risk control in setting up resource and energy saving in HTS using the proposed model and approach, taking as an example the heat technological process of drying phosphate pellets in the calcining conveyor machine. The obtained results are intended to be used for complex risk control in setting up resource and energy saving for various processes in HTSs.

Keywords: Complex risk control · Heat technological system · Fuzzy analysis and modeling
1 Introduction

The pragmatic reason for investigating processes in heat technological systems (HTS) is the insufficiently high resource and energy efficiency of the existing equipment. The modeling of heat technological processes in HTS and the search for the optimum control parameters are rather complex problems because of the set of factors analyzed and the need to take into account the uncertainty of these processes. The risk of disrupting processes in HTS is an unavoidable factor of their operation. The risk is understood as the possibility of negative events in combination with their consequences [1, 2]. The complexity of risk control for the processes in HTS [3, 4] is caused by the requirement not to exceed the permissible disruption-risk level at all stages of all processes in HTS. As the complexity of processes in HTS increases, the risk control problems cannot be solved by methods based on an exact representation of the problem situation: such problems are characterized by a great number of parameters, uncertainty, and fuzzy data. External factors for these processes are characterized by uncertainty due to the variability of the external environment and the uniqueness of the problem situations [5–11]. These considerations substantiate the expediency of developing risk control methods for processes in HTS based on the hybridization and complexation of fuzzy models.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 28–38, 2020. https://doi.org/10.1007/978-3-030-39216-1_3
2 Problem Statement

The problem of complex risk control of HTS processes in formalized form is as follows:

over {Tg0_jk, Tgn_jk, Wg_jk : j = 1…J_k, k = 1…K}: S → min, subject to R ≤ R_(per),

S = s_E E + s_H H,
E = F_E(E_1, …, E_k, …, E_K),
E_k = F_Ek(E^k_1, …, E^k_j, …, E^k_Jk), k = 1…K,
E^k_j = F_E^k_j(Tg0_jk, Tgn_jk, Wg_jk), j = 1…J_k, k = 1…K,
H = F_H(H_1, …, H_k, …, H_K),
H_k = F_Hk(H^k_1, …, H^k_j, …, H^k_Jk), k = 1…K,
H^k_j = F_H^k_j(Tg0_jk, Tgn_jk, Wg_jk), j = 1…J_k, k = 1…K,
R = F_R(R_1, …, R_k, …, R_K),
R_k = F_Rk(R^k_1, …, R^k_j, …, R^k_Jk), k = 1…K,
R^k_j = F_R^k_j(E^k_j, H^k_j), j = 1…J_k, k = 1…K,
where S is the total expenditure on electrical and heat energy; R is the generalized risk of process disruption in the HTS as a whole; R_(per) is the permissible level of the generalized risk of process disruption in the HTS; E and H are the total expenditures on electrical and heat energy, respectively; s_E and s_H are the incremental costs of electrical and heat energy, respectively; E_k and H_k are the expenditures on electrical and heat energy of the k-th process, respectively; F_E is the relation between the electrical energy expenditures of specific processes and the total electrical energy expenditure; F_H is the relation between the heat energy expenditures of specific processes and the total heat energy expenditure; R_k is the disruption risk of the k-th process; F_R is the relation between the disruption risks of specific processes and the generalized risk of HTS process disruption as a whole; E^k_j and H^k_j are the expenditures on electrical and heat energy at the j-th stage of the k-th process, respectively; F_Ek is the relation between the electrical energy expenditures of the stages of the k-th process and the electrical energy expenditure of the whole k-th process; F_Hk is the relation between the heat energy expenditures of the stages of the k-th process and the heat energy expenditure of the whole k-th process; R^k_j is the disruption risk of the j-th stage of the k-th process; F_Rk is the relation between the disruption risks of the stages of the k-th process and the disruption risk of the whole k-th process; F_E^k_j is the relation between the control parameters and the electrical energy expenditure at the j-th stage of the k-th process; F_H^k_j is the relation between the control parameters and the heat energy expenditure at the j-th stage of the k-th process; F_R^k_j is the relation between the energy expenditures at the j-th stage of the k-th process and the disruption risk of this stage of this process; P = {p_jkl} is the set of process control parameters in the HTS (l = 1…L_j, j = 1…J_k, k = 1…K).

A permissible risk level can also be established both for process disruption in the HTS as a whole and for specific processes, as well as for all stages of these processes:

R_k ≤ R_k(per), k = 1…K;
R^k_j ≤ R^k_j(per), j = 1…J_k, k = 1…K.
Consider a conveyor indurating machine for producing phosphorite pellets as an example of an HTS. The processes of this indurating machine are: drying, heating, high-temperature roasting, recuperation and cooling. In this case, the electrical energy is consumed to form a gas-carrier flow with the required control parameters, and the heat energy provides the initial temperature of the gas carrier. The mentioned processes differ in the values of the control parameters at each stage (for each vacuum chamber). The control parameters of the processes are:
• Tg0_jk – the temperature of the heat-carrier gas at the entrance of the layer for the j-th stage of the k-th process;
• Tgn_jk – the temperature of the heat-carrier gas after all n layers for the j-th stage of the k-th process;
• Wg_jk – the heat-carrier gas velocity for the j-th stage of the k-th process.
The criterion for increasing the resource and energy efficiency of the industrial processes of phosphorite pellet production in a conveyor indurating machine is specified in the following way:

over {Tg0_jk, Tgn_jk, Wg_jk : j = 1…J_k, k = 1…K}: S → min, subject to R ≤ R_(per).
For the indurating machine OK-520/536 under consideration: k = 1…K, K = 5; in this case:
• for the drying process (k = 1) – J1 = 11;
• for the heating process (k = 2) – J2 = 2;
• for the high-temperature roasting process (k = 3) – J3 = 8;
Complex Risks Control for Processes in HTS
31
• for the recuperation process (k = 4) – J4 = 2;
• for the cooling process (k = 5) – J5 = 10.
3 The Model of Complex Risks Control for Processes in Heat Technological Systems
The relations between the control parameters and the expenditures for electrical and heat energy at all stages of the processes in HTS, as well as their impact on the disruption risks of these processes, are of non-linear character [12]. This is due to the specifics of the processes, the equipment uniqueness [13], as well as the impact of external factors in the conditions of uncertainty. Therefore, to analyze the processes in HTS and to assess the risks, it is reasonable to use hybridization methods and fuzzy models complexation.
[Figure 1 depicts the four-cascade model structure: cascade 1 contains the sets of fuzzy component models to analyze the 1-st, …, K-th processes in HTS; cascade 2 contains the models MEjk, MHjk and MRkj for the stage expenditures Ejk, Hjk and risks Rkj; cascade 3 contains the models MFEk, MFHk and MRk for the process totals Ek, Hk and Rk; cascade 4 contains the models MFE, MFH and MR yielding E, H, R and the criterion S = sE·E + sH·H.]
Fig. 1. The model structure of the complex risks control for processes in HTS.
Taking into account this approach, a model of complex risks control for processes in HTS is proposed. The approach consists in giving sets of control parameters at each stage for all processes [14, 15], and in modeling and defining such combinations of control parameters [16, 17] which ensure the increase of resource and energy efficiency for the processes in HTS provided the permissible level of disruption risks of these processes is not exceeded [18]. The structure of this model is shown in Fig. 1. The proposed model consists of the following cascades.
Cascade 1 includes the sets of fuzzy component models {Mcomp_i^jk | i = 1…n, j = 1…J_k} to analyze the k-th (k = 1…K) processes of drying, heating, high-temperature roasting, recuperation and cooling of pellets. These models correspond to
32
V. Borisov et al.
the decomposition of these processes. Each fuzzy component model is designed to solve the internal problem of heat conductivity for pellets taking into account the uncertainty of the thermal physics characteristics and temperature distribution. The analysis with the use of this model is based, firstly, on building a system of differential equations with fuzzy thermal physics characteristics (volumetric heat capacity, heat conductivity, heat-transfer coefficient from the surface) [19] and, secondly, on the approach proposed in [11] to solving this equations system by fuzzy numerical methods.
[Figure 2 depicts, for the k-th process, the stage-by-stage fuzzy component models Mcomp_i^jk of cascade 1 receiving the control parameters Tgjk0, Tgjkn and Wgjk at the 1-st, j-th and J_k-th stages, whose outputs feed the cascade-2 models MEjk (the set of fuzzy models estimating the cost Ejk of electrical energy) and MHjk (the set of fuzzy models estimating the cost Hjk of heat energy) for the k-th process in HTS.]
Fig. 2. Interaction of fuzzy component models {Mcomp_i^jk | i = 1…n, j = 1…J_k} of the 1-st cascade with models {MEjk | j = 1…J_k} and {MHjk | j = 1…J_k} of the 2-nd cascade for the k-th process.
Cascade 2 consists of the model sets {MEjk | j = 1…J_k, k = 1…K} and {MHjk | j = 1…J_k, k = 1…K}, which realize the relations FEjk, FHjk and are designed to assess the expenditures Ejk, Hjk for electrical and heat energy, respectively, at the j-th stages (j = 1…J_k) of the k-th (k = 1…K) processes of pellets production.
Figure 2 shows the interaction of the fuzzy component models {Mcomp_i^jk | i = 1…n, j = 1…J_k} of the 1-st cascade with the models {MEjk | j = 1…J_k} and {MHjk | j = 1…J_k} of the 2-nd cascade for the k-th process in HTS.
The 2-nd cascade models represent coordinated bases of fuzzy rules [10]. The design procedure and the use of the proposed models are considered on the example of the model MEjk for the assessment of the expenditures Ejk for electrical energy.
Step 1. Specification of the input and output fuzzy variables of the model. The input fuzzy variables of the model are the control parameters Tgjk0, Tgjkn and Wgjk. The output fuzzy variable is the expenditures Ejk for electrical energy.
Step 2. Building linguistic scales for the input and output fuzzy variables. To unify the description, let us define the same terms for all fuzzy variables: {L – small, M – middle, H – high}.
Step 3. Formation of the fuzzy rule base of the model. As a result of this step, a set of coordinated rules of the following type is formed:

R1: If Tgjk0 is L AND Tgjkn is L AND Wgjk is L, Then Ejk is L;
…
Ry: If Tgjk0 is M AND Tgjkn is M AND Wgjk is M, Then Ejk is M;
…
RY: If Tgjk0 is H AND Tgjkn is H AND Wgjk is H, Then Ejk is H.
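As a hedged illustration of Steps 1–3, the rule base above can be run through a minimal Mamdani-type inference sketch; the triangular membership functions, the normalized [0, 1] universes and the three-rule base are illustrative assumptions, not the parameters used by the authors.

```python
# Minimal Mamdani-type inference sketch for the expenditure model MEjk.
# Membership functions, universes and the rule base are assumed, not the authors' data.

def tri(x, a, b, c):
    """Triangular membership with support (a, c) and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# The same terms {L, M, H} on normalized universes for Tgjk0, Tgjkn, Wgjk and Ejk.
TERMS = {"L": (-0.5, 0.0, 0.5), "M": (0.0, 0.5, 1.0), "H": (0.5, 1.0, 1.5)}

# Coordinated rules: "If Tgjk0 is t AND Tgjkn is t AND Wgjk is t, Then Ejk is t".
RULES = [("L", "L"), ("M", "M"), ("H", "H")]

def infer_E(tg0, tgn, wg, steps=101):
    """min for AND, max for rule aggregation, centroid defuzzification."""
    fired = []
    for ante, cons in RULES:
        a, b, c = TERMS[ante]
        w = min(tri(tg0, a, b, c), tri(tgn, a, b, c), tri(wg, a, b, c))
        fired.append((w, TERMS[cons]))
    num = den = 0.0
    for i in range(steps):
        e = i / (steps - 1)                      # point of the output universe
        mu = max(min(w, tri(e, *t)) for w, t in fired)
        num += e * mu
        den += mu
    return num / den if den else 0.5

print(round(infer_E(0.9, 0.9, 0.9), 2))  # high inputs give a high Ejk estimate
```

With all three inputs high the clipped H-consequent dominates the aggregation, so the centroid lands well above the mid-scale value 0.5.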
The assessment of the expenditures Ejk for electrical energy is performed on the basis of algorithms for fuzzy logic inference [10]. Similarly, all models {MEjk | j = 1…J_k, k = 1…K} and {MHjk | j = 1…J_k, k = 1…K} are built and used. The 2-nd cascade also contains the models {MRkj | j = 1…J_k, k = 1…K}, realizing the relation FRkj and designed to assess the disruption risks of the j-th stages (j = 1…J_k) of the k-th (k = 1…K) processes. These models represent coordinated bases of fuzzy rules. The input fuzzy variables of the model MRkj are the variables Ejk, Hjk, and the output fuzzy variable is Rkj. Figure 3 shows an example of the relation Rkj = FRkj(Ejk, Hjk) realized by this model. Cascade 3 includes the sets of fuzzy models {MFEk | k = 1…K} and {MFHk | k = 1…K}, which are used to assess the expenditures Ek and Hk for electrical and heat energy, respectively, for the k-th (k = 1…K) processes (Fig. 3).
Fig. 3. The example of the relation Rkj = FRkj(Ejk, Hjk) realized by the model MRkj.
Table 1 illustrates the building of the models for the 3-rd cascade on the example of the model MFEk to assess the expenditures Ek for electrical energy of the k-th process, realizing the relation Ek = FEk(E1k, …, Ejk, …, E_{Jk}k).
Table 1. A fragment of the fuzzy rule base structure for model MFEk and the assessment of the expenditures Ek for electrical energy of the k-th process

Rule number | Fuzzy input variables          | Fuzzy output variable
            | E1k    …    Ejk    …    EJkk  | Ek
R1          | L      …    L      …    L     | L
…           | …           …           …     | …
Rg          | M      …    M      …    M     | M
…           | …           …           …     | …
RQ          | H      …    H      …    H     | H
In addition to the models {MFEk | k = 1…K} and {MFHk | k = 1…K}, the 3-rd cascade includes the models {MRk | k = 1…K} for disruption risk assessment of the processes in HTS, which realize the relation Rk = FRk(R_k^1, …, R_k^j, …, R_k^{Jk}). Different models for the risk Rk assessment can be used, for example:

Rk = max(R_k^1, …, R_k^j, …, R_k^{Jk}).
Cascade 4 includes fuzzy models MFE and MFH, which are used to assess the total expenditures E and H for electrical and heat energy, respectively. Table 2 shows a fragment of the rule base structure for the model MFE, the input fuzzy variables of which are the output fuzzy variables Ek of the 3-rd cascade models {MFEk | k = 1…K}, and the output fuzzy variable is E.

Table 2. A fragment of the fuzzy rule base structure for model MFE and the assessment of the total expenditures E for electrical energy

Rule number | E1    …    Ek    …    EK  | E
R1          | L     …    L     …    L   | L
…           | …          …          …   | …
Ru          | M     …    M     …    M   | M
…           | …          …          …   | …
RU          | H     …    H     …    H   | H
Similarly, the model MFH is designed to assess the total expenditures H for heat energy. In addition to the models MFE and MFH, the 4-th cascade includes the model MR for the assessment of the total disruption risk of the processes in HTS as a whole, which realizes the relation R = FR(R1, …, Rk, …, RK). Different models for the risk R assessment can be used, for example:

R = max(R1, …, Rk, …, RK).
4 Experiment Results and Analysis
The proposed approach to the complex risks control for processes in HTS consists, firstly, in setting combinations of the control parameters Tgjk0, Tgjkn, Wgjk at each stage for all processes in HTS, taking into account the restrictions imposed on these processes, including the disruption risks at all stages of the processes, Rkj ≤ Rkj^(per) (j = 1…J_k, k = 1…K); secondly, in modeling and identifying the combinations of control parameters which ensure an increase of resource and energy efficiency for processes in HTS provided the permissible level of the generalized disruption risk for technological processes in HTS as a whole is not exceeded. Figure 4 shows the results of the complex risks control for the process of phosphorite pellets drying in a conveyor indurating machine OK-520/536 (J1 = 11, in the form of fuzzy result sets) when the permissible risk level of this process, equal to 0.6, is not exceeded (R1 ≤ 0.6). In Fig. 5, these results are shown in the form of defuzzified values.
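The two-step scheme just described (enumerate control-parameter combinations, then keep the feasible combination minimizing the criterion) can be sketched as follows; the surrogate expenditure and risk functions and the normalized parameter grid are hypothetical stand-ins, not the paper's fuzzy cascade models.

```python
# Constrained search over control-parameter combinations: minimize the total
# expenditures S subject to the permissible risk level R <= R_PER.
# expenditure() and risk() are illustrative stand-ins for cascades 1-4 and MR.
from itertools import product

def expenditure(tg0, tgn, wg):
    # stand-in for the total expenditures criterion S
    return 0.6 * tg0 + 0.3 * tgn + 0.1 * wg

def risk(tg0, tgn, wg):
    # stand-in: low gas temperature and velocity raise the disruption risk
    return 1.0 - 0.5 * (tg0 + wg)

R_PER = 0.6                          # permissible risk level, as in this section
grid = [0.0, 0.25, 0.5, 0.75, 1.0]  # normalized levels of Tg0, Tgn, Wg

feasible = [c for c in product(grid, repeat=3) if risk(*c) <= R_PER]
best = min(feasible, key=lambda c: expenditure(*c))
print(best)
```

The same enumerate-then-filter pattern applies when the stand-in functions are replaced by the cascaded fuzzy models: only combinations satisfying the risk constraint compete on the expenditure criterion.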
[Figure 4 plots the fuzzy result sets E′dry for the drying stages of the k-th process: membership μ from 0 to 1 over the drying energy Edry, Joule, and time τ, s.]
Fig. 4. Assessment results of energy efficiency for the pellets drying process in a conveyor indurating machine OK-520/536 at R1 ≤ 0.6.
[Figure 5 plots energy efficiency, MJ/t, against the number (stage) of the vacuum chamber in the drying zone of the OK-520/536.]
Fig. 5. Defuzzified results of energy efficiency evaluation for different stages of pellet drying in a conveyor indurating machine OK-520/536 at R1 ≤ 0.6.
These results are supposed to be used for complex risk control in setting up resource and energy saving for various processes in HTSs.
5 Conclusions
The problem of complex risk control in setting up resource and energy saving of processes in HTS is formulated, and a model of complex risks control for processes in a HTS is proposed. The proposed model is based on a fuzzy approach and on hybridization and complexation of fuzzy methods. The described model consists of fuzzy models for carrying out a componentwise analysis of the heat technological processes, for estimating the resource and energy efficiency of heat technological processes, and for assessing the risks in HTS. An approach to solving the problem of complex risks control for the processes in HTS using the developed model is described; it consists in giving sets of control parameters at each stage for all processes, and in modeling and defining such combinations of control parameters which ensure the increase of resource and energy efficiency for the processes in HTS provided the permissible level of disruption risks of these processes is not exceeded. The experimental results of complex risks control with the use of the proposed model and approach, based on the example of the phosphorite pellets drying process in a conveyor indurating machine, are presented. The obtained results are supposed to be used for complex risk control in setting up resource and energy saving for various processes in HTSs.
Acknowledgments. This work was supported in the framework of the basic part of Government Assignment of the Ministry of Education and Science of the Russian Federation for performing scientific research (project no. 13.9597.2017/BCh).
References
1. Bobkov, V.I., Borisov, V.V., Dli, M.I., Meshalkin, V.P.: Intensive technologies for drying a lump material in a dense bed. Theor. Found. Chem. Eng. 51(1), 70–75 (2017)
2. Panchenko, S.V., Shirokikh, T.V.: Thermophysical processes in burden zone of submerged arc furnaces. Theor. Found. Chem. Eng. 48(1), 72–79 (2014)
3. Butkarev, A.A., Butkarev, A.P., Zhomiruk, P.A., Martynenko, V.V., Grinenko, N.V.: Pellet heating on modernized OK-124 roasting machine. Steel Transl. 40(3), 259–266 (2010)
4. Akberdin, A.A., Kim, A.S., Sultangaziev, R.B.: Planning of numerical and physical experiment in the simulation of technological processes. Inst. News. Ferr. Metall. 61(9), 737–742 (2018)
5. Petrosino, A., Fanelli, A.M., Pedrycz, W.: Fuzzy Logic and Applications. Springer, Heidelberg (2011)
6. Mohammadi, F., Bazmara, M., Pouryekta, H.: A new hybrid method for risk management in expert systems. Int. J. Intell. Syst. Appl. (IJISA) 6(7), 60–65 (2014)
7. Bodyanskiy, Y.V., Tyshchenko, O.K., Kopaliani, D.S.: A multidimensional cascade neuro-fuzzy system with neuron pool optimization in each cascade. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 7(2), 16–23 (2014)
8. Rhmann, W., Vipin, S.: Fuzzy expert system based test cases prioritization from UML state machine diagram using risk information. Int. J. Math. Sci. Comput. (IJMSC) 3(1), 17–27 (2017)
9. Borisov, V.V.: Hybridization of intellectual technologies for analytical tasks of decision-making support. J. Comput. Eng. Inform. 2(1), 148–156 (2014)
10. Bobkov, V.I., Borisov, V.V., Dli, M.I., Meshalkin, V.P.: Multicomponent fuzzy model for evaluating the energy efficiency of chemical and power engineering processes of drying of the multilayer mass of phosphorite pellets. Theor. Found. Chem. Eng. 52(1), 786–799 (2018)
11. Bobkov, V.I., Borisov, V.V., Dli, M.I.: Approach to a heat conductivity research by fuzzy numerical methods in the conditions of indeterminacy thermal characteristics. Syst. Control. Commun. Secur. 3, 73–83 (2017). (in Russian)
12. Elgharbi, S., Horchani-Naifer, K., Férid, M.: Investigation of the structural and mineralogical changes of Tunisian phosphorite during calcinations. J. Therm. Anal. Calorim. 119(1), 265–271 (2015)
13. Yur'ev, B.P., Gol'tsev, V.A.: Thermophysical properties of Kachkanar titanomagnetite pellets. Steel Transl. 46(5), 329–333 (2016)
14. Luis, P., Van der Bruggen, B.: Exergy analysis of energy-intensive production processes: advancing towards a sustainable chemical industry. J. Chem. Technol. Biotechnol. 89(9), 1288–1303 (2014)
15. Butkarev, A.A., Butkarev, A.P., Ptichnikov, A.G., Tumanov, V.P.: Boosting the hot-blast temperature in blast furnaces by optimal control system. Steel Transl. 45(3), 199–206 (2015)
16. Bokovikov, B.A., Bragin, V.V., Shvydkii, V.S.: Role of the thermal-inertia zone in conveyer roasting machines. Steel Transl. 44(8), 595–601 (2014)
17. Yuryev, B.P., Goltsev, V.A.: Change in the equivalent porosity of the pellet bed along the length of the indurating machine. Inst. News. Ferr. Metall. 60(2), 116–123 (2018)
18. Novichikhin, A.V., Shorokhova, A.V.: Control procedures for the step-by-step processing of iron ore mining waste. Inst. News. Ferr. Metall. 60(7), 565–572 (2018)
19. Shvydkii, V.S., Yaroshenko, Y., Spirin, N.A., Lavrov, V.V.: Mathematical model of roasting process of ore and coal pellets in an indurating machine. Inst. News. Ferr. Metall. 60(4), 329–335 (2018)
The Synthesis of Virtual Space in the Context of Insufficient Data
Aleksandr Mezhenin1, Vladimir Polyakov1, Vera Izvozchikova2, Dmitry Burlov1, and Anatoly Zykov1
1
Saint Petersburg National Research University of Information Technologies, Mechanics and Optics, ITMO University, Kronverksky Ave. 49, St. Petersburg 197101, Russia [email protected] 2 Orenburg State University, Ave. Pobedy 13, Orenburg 460018, Russia
Abstract. The paper deals with the issues of computer simulation of virtual space along an arbitrary trajectory. In this case, the shooting can be carried out under interference conditions, and the video data itself may have noise, blur and defocus. The simulated medium is represented as a cloud of points of different density. To improve the efficiency of perception, point clouds are represented in the form of heat maps. Particular consideration is given to modeling in the context of incomplete data and the task of preserving meaningful information. The mathematical model is built on algorithms for estimating the density of the distribution of points and on non-parametric and parametric restoration of the distribution density. The results of testing are presented: the analysis and visualization of the distribution density, and the result of the analysis of the density of the point cloud obtained from the video sequence.
Keywords: Virtual space · Cyber-visualization technologies · Modular objects · Real-time algorithm · Virtual modeling · Point clouds · Thermal maps
1 Introduction
Virtual space is an essential element of any 3D visualization system, virtual and augmented reality, cyber visualization and telepresence. The fields of application of these technologies are the following: modeling the behavior of virtual 3D objects during the design phase, managing complex human-machine systems, remote control and piloting, and monitoring and video surveillance [14]. These technologies are particularly important in high-risk fields where the use of optical and television facilities is limited or simply impossible [5]. At present, various approaches are used to synthesize the virtual environment, among which the most common are computer graphics and photogrammetric modeling. In addition, neural network information processing systems are being developed to create fully artificial interactive worlds based on real-world video [13]. Data for modeling can be obtained by scanning 3D objects with special devices or as a result of processing optical scan data. In this paper, video sequences obtained from surveillance cameras moving along an arbitrary trajectory are considered as source data. In this case,
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 39–46, 2020. https://doi.org/10.1007/978-3-030-39216-1_4
40
A. Mezhenin et al.
the shooting can be carried out in the conditions of interference, and the video data itself may have noise, blur and defocus. The simulated medium is represented as a cloud of points of different density. To improve the efficiency of perception, point clouds are represented in the form of heat maps. Particular consideration is given to modeling in the context of incomplete data and the task of preserving meaningful information.
2 Modeling a 3D Point Cloud
Building a point cloud is not a trivial task. In this paper, we consider the problem of building a point cloud of three-dimensional space based on video data obtained as a result of shooting with a single camera moving along an arbitrary trajectory. The problem of incomplete data and the issues of preserving meaningful information are particularly addressed. Despite the large number of different methods for constructing a point cloud in three-dimensional space, the typical mathematical approaches used in them turn out to be ineffective in problems whose initial data are a set of discrete vertex points in the space of point objects. Representation of data in this form follows either from the specifics of the distribution of point objects in space or, for example, from the insufficiency of an array of measurement data at points in space. In such cases, the initial set of point objects is usually referred to by the special term "point cloud". In the task of analyzing the distribution of such a point cloud, first of all, the probability of finding points in a particular area, or rather the density of their distribution, is of interest. Such a problem can be solved using voxels. However, the practical implementation of this method requires colossal computational resources, and in cases of strong sparseness of point data in space, the methods of reconstruction using voxels do not allow revealing information that is significant for analysis.
Algorithms for Estimating the Density of the Distribution of Points. A direct approach to solving the problem of reconstructing the spatial density of the distribution of points is one in which the set of points X = {x^(1), …, x^(m)}, x ∈ R^r, is considered as a sample realization from one unknown distribution with density q(x), and some approximation q̂(x) ≈ q(x) is sought. There are three main types of distribution density recovery algorithms: non-parametric, parametric, and recovery of mixtures of distributions.
Nonparametric Recovery of Distribution Density. The basic non-parametric method for restoring the distribution density is the Parzen-Rosenblatt method (kernel density estimation), an algorithm for Bayesian classification based on non-parametric density recovery from an existing sample. The approach is based on the idea that the density is higher at those points near which there is a large number of sample objects. The Parzen density estimate is:
q̂_h(x) = (1/m) Σ_{i=1}^{m} Π_{j=1}^{r} (1/h_j) K((x_j − x_j^(i)) / h_j)    (1)
where h ∈ R^r is the width of the window, and K(u) is the kernel (an arbitrary even, normalized function) specifying the degree of smoothness of the distribution function. The term "window" comes from the classic form of the function:

K(u) = (1/2) I{|u| ≤ 1}    (2)
where I{…} is an indicator function, but in practice, smoother functions are usually used, for example, the Gaussian kernel function

K(u) = (1/√(2π)) exp(−u²/2)    (3)
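Equations (1) and (3) can be sketched directly in Python; the sample cloud and the window widths below are illustrative assumptions.

```python
# Parzen-Rosenblatt estimate, Eq. (1), with the Gaussian kernel of Eq. (3),
# for r-dimensional sample points; the cloud and widths h are assumed data.
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def parzen_density(x, sample, h):
    """q_h(x) at a point x, given sample points x^(i) and window widths h."""
    m, r = len(sample), len(x)
    total = 0.0
    for xi in sample:
        prod = 1.0
        for j in range(r):
            prod *= gaussian_kernel((x[j] - xi[j]) / h[j]) / h[j]
        total += prod
    return total / m

# The estimate is higher where the sample points cluster:
cloud = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1), (3.0, 3.0)]
print(parzen_density((0.0, 0.0), cloud, h=(0.5, 0.5)) >
      parzen_density((3.0, 0.0), cloud, h=(0.5, 0.5)))  # True
```

The product over coordinates realizes the multiplicative kernel of Eq. (1); narrowing h sharpens the estimate around the sample points.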
The window width h and the type of the kernel K(u) are the structural parameters of the method, on which the quality of the restoration depends. The width of the window has the main influence on the quality of restoration of the density distribution, whereas the type of kernel function does not affect the quality in a decisive way. This method is widely used in machine learning for classification problems in cases where the general form of the distribution function is unknown and only certain properties are known, for example, smoothness and continuity. To find the optimal width of the window, the maximum likelihood principle is usually used with leave-one-out (LOO) exclusion of objects one at a time.
Parametric Recovery of Distribution Density. Parametric estimation relies on families of density functions, which are specified using one or several numerical parameters: q(x) = φ(x; θ), x ∈ R^r, θ ∈ Θ. One of the ways to choose, from this family, the density function that best approximates the original one is the maximum likelihood method.
θ* = argmax_{θ∈Θ} Π_{i=1}^{m} φ(x^(i); θ),  q̂(x) = φ(x; θ*)    (4)
For example, for the multivariate normal distribution function:
φ(x; θ) = N(x; μ, Σ) = exp(−(1/2) (x − μ)^T Σ^(−1) (x − μ)) / √((2π)^r det Σ),  x, μ ∈ R^r, Σ ∈ R^(r×r),    (5)

maximum likelihood estimates are written explicitly:
μ* = (1/m) Σ_{i=1}^{m} x^(i),  Σ* = (1/m) Σ_{i=1}^{m} (x^(i) − μ*)(x^(i) − μ*)^T    (6)
Parametric recovery methods are used when the form of the distribution function is known to within a set of parameters that allow you to control the flexibility of the model.
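A minimal sketch of Eqs. (4)-(6) for the multivariate normal case: the maximum likelihood estimates reduce to the sample mean and the (1/m-normalized) sample covariance. The synthetic data parameters are assumptions for the demonstration.

```python
# Parametric density recovery for a multivariate normal: ML estimates (6)
# plugged into the density (5). Synthetic data parameters are assumed.
import numpy as np

def fit_gaussian(X):
    """ML estimates (mu*, Sigma*) for samples X of shape (m, r), Eq. (6)."""
    mu = X.mean(axis=0)
    centered = X - mu
    sigma = centered.T @ centered / len(X)   # note 1/m, not 1/(m-1)
    return mu, sigma

def gaussian_density(x, mu, sigma):
    """phi(x; mu, Sigma) from Eq. (5)."""
    r = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(sigma, diff)
    norm = np.sqrt((2 * np.pi) ** r * np.linalg.det(sigma))
    return np.exp(-0.5 * quad) / norm

rng = np.random.default_rng(0)
X = rng.multivariate_normal([1.0, -2.0], [[1.0, 0.3], [0.3, 0.5]], size=5000)
mu, sigma = fit_gaussian(X)
print(np.round(mu, 1))  # close to [ 1. -2.]
```

With 5000 samples the recovered mean and covariance agree with the generating parameters to within sampling error.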
3 Algorithm Restoration of Mixtures of Distributions
This approach can be considered a generalization of the parametric one for cases where the distribution has a complex form that is not accurately described by a single distribution. The distribution density q(x) in this approach is represented as a mixture, i.e. the sum of distributions with certain coefficients:

q(x) = Σ_{j=1}^{k} w_j q_j(x) = Σ_{j=1}^{k} w_j φ(x; θ_j) = q(x | {w_j, θ_j}),  x ∈ R^n,  Σ_{j=1}^{k} w_j = 1,  w_j ≥ 0    (7)
In (7), q_j(x) is the density of the distribution of a component of the mixture, belonging to one parametric family φ(x; θ_j); w_j is its a priori probability (weight); k is the number of components in the mixture. The function q(x | {w_j, θ_j}) is called the likelihood function. Each of the above methods for determining the density of distributions (non-parametric, parametric, and recovery of mixtures of distributions) is applied with certain a priori knowledge of the density of distribution (of the form or properties of the function). Although these approaches seem to have different areas of applicability, similarities between them can be identified. The non-parametric method can be considered a limiting special case of a mixture of distributions, in which each x^(i) corresponds to exactly one component with a priori probability w_i = 1/m and the selected density function (kernel) centered at the point x^(i). The parametric approach is the other extreme case of a mixture consisting of one component. Thus, the three approaches described differ, first of all, in the number of additive components in the distribution model: 1 ≤ k ≤ m. Therefore, restoring a mixture with an arbitrary number of components k is, in a sense, the more general case of restoring a continuous distribution density from a discrete sample.
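A mixture of two one-dimensional Gaussians, Eq. (7), can be restored with a compact EM iteration; the initialization, iteration count and synthetic data below are illustrative assumptions rather than the authors' procedure.

```python
# EM sketch for restoring a two-component 1-D Gaussian mixture, Eq. (7).
# Initialization, iteration count and data are assumed for the demonstration.
import numpy as np

def em_gmm(x, iters=50):
    """Fit weights w, means mu and variances var of a 2-component mixture."""
    w = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])          # crude initialization
    var = np.array([x.var(), x.var()])
    for _ in range(iters):
        # E-step: responsibilities gamma[i, j] = P(component j | x_i)
        pdf = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        gamma = w * pdf
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances
        nj = gamma.sum(axis=0)
        w = nj / len(x)
        mu = (gamma * x[:, None]).sum(axis=0) / nj
        var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nj
    return w, mu, var

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.0, 0.5, 700), rng.normal(3.0, 1.0, 300)])
w, mu, var = em_gmm(x)
print(np.round(np.sort(mu), 1))  # component means near -2 and 3
```

For well-separated components the responsibilities quickly concentrate and the recovered weights approach the 0.7/0.3 generating proportions.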
4 Approbation
Visualization of the distribution requires significant computational resources. Figure 1 presents the result of the work of an interactive application developed by the authors for analyzing the density of the distribution of data obtained from the video stream. The individual components of the mixture of normal distributions are used to visualize the entire mixture of distributions. Distribution density data are displayed both for the entire data region and for the region under study.
Fig. 1. Analysis and visualization of distribution density.
Point Cloud Modeling. Below is the practical implementation of the algorithm for modeling a point cloud in three-dimensional space based on a video sequence obtained by a camera moving along an arbitrary trajectory.
1. Decomposition of the video into a set of sequential images: the files needed for the reconstruction are obtained from the frame-by-frame decomposition of the video sequence;
2. Detection of key points and their descriptors in the images;
3. By comparing descriptors, key points corresponding to each other in different images are found;
4. Based on the set of matched key points, an image transformation model is built, with which one image can be obtained from another;
5. Knowing the camera transformation model and the correspondence of points on different frames, three-dimensional coordinates are calculated and a point cloud is built.
Figure 2 shows the frames of the original video data. The shooting was done with a Blackberry Priv mobile phone video camera; video format MOV Full HD 29.97 fps H264 17 Mbps. The files needed to build a point cloud of three-dimensional space were obtained from the frame-by-frame decomposition of the video sequence. To reduce the rendering and processing time, every third frame of the video sequence was used; processing was done in the Adobe AE package. To correct defocus and blur [15], image processing was performed in the MATLAB system. To obtain stabilization and determine key points, software based on the considered algorithms and the Point Cloud Library [10] was used (Fig. 3).
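Step 5 of the pipeline can be illustrated by linear (DLT) triangulation of one matched point pair; the camera matrices and the 3-D point below are synthetic, and a real pipeline would take the matches from the key-point descriptors of steps 2-4 (e.g. via OpenCV).

```python
# DLT triangulation: given two camera projection matrices and a matched pair
# of image points, recover the 3-D point. Cameras and the point are synthetic.
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Solve A X = 0 for the homogeneous 3-D point seen at uv1 and uv2."""
    A = np.vstack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

# Two synthetic cameras: identity pose and a unit baseline along x.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, -0.2, 4.0])
X_rec = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.round(X_rec, 6))  # recovers the synthetic point [ 0.5 -0.2  4. ]
```

Repeating this for every matched key-point pair across the frames yields the point cloud whose density is then analyzed.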
Fig. 2. Source video frames.
Fig. 3. The result of the density analysis of the point cloud constructed from the video sequence.
5 Conclusions
This paper covers the issues of modeling virtual space in the context of insufficient data based on arbitrary video sequences. The recovery of distribution mixtures was used as a mathematical model. The approbation results are presented in the form of visualization and analysis of the distribution density. The results of modeling the point cloud using a heat map were also presented. The results are pertinent and subject to further specification and elaboration. The subject of the research is the methods of modeling and visualization of the virtual environment and the construction of a point cloud of three-dimensional space, both in 3D modeling systems and on the basis of video data. Particularly considered is the
problem of incomplete data and the issues of preserving information relevant to analysis. The results of the study are a conceptual description of virtual environment modeling methods, based on a mathematical model of parametric and non-parametric restoration of the distribution density of point objects of space from the available sample. Visually, the simulation results are presented in the form of a cloud of points of uneven density. To improve visual perception, the points of the model are presented in the form of heat maps with an adaptive scale. The proposed approach will improve the accuracy and clarity of the virtual environment for the subsequent detailed analysis of the area of the simulated space. As examples, we consider: analysis of the distribution density of data obtained from the video stream, when the individual components of the normal distribution mixture are used to visualize the entire distribution mixture, and modeling a point cloud of three-dimensional space based on video data obtained as a result of shooting with a single camera moving along an arbitrary trajectory. The proposed virtual environment modeling approaches can be applied in such areas as: monitoring the infrastructure of the space area of intelligent security systems, emergency control and prevention, building three-dimensional maps of cities and three-dimensional surrounding space in simulators, determining the location of robots, visualization and analysis of large spatial data, as well as CAD and medicine.
Acknowledgments. The research has been supported by the RFBR, according to the research project No. 17-07-00700 A.
References
1. Dyshkant, N.: Measures for surface comparison on unstructured grids with different density. In: LNCS: Discrete Geometry for Computer Imagery, vol. 6607, pp. 501–512 (2011)
2. Dyshkant, N.: An algorithm for calculating the similarity measures of surfaces represented as point clouds. Pattern Recogn. Image Anal.: Adv. Math. Theory Appl. 20(4), 495–504 (2010)
3. Tomaka, A.: The application of 3D surfaces scanning in the facial features analysis. J. Med. Inform. Technol. 9, 233–240 (2005)
4. Szymczak, A., Rossignac, J., King, D.: Piecewise regular meshes: construction and compression. Graph. Models 64(3–4), 183–198 (2002)
5. Stepanyants, D.G., Knyaz, V.A.: PC-based digital close-range photogrammetric system for rapid 3D data input in CAD systems. Int. Arch. Photogrammetry Remote Sens. 33(part B5), 756–763 (2000)
6. Mitra, N.J., Guibas, L.J., Pauly, M.: Partial and approximate symmetry detection for 3D geometry. In: Proceedings ACM SIGGRAPH, pp. 560–568 (2006)
7. Liu, Y., Rodrigues, M.A.: Geometrical analysis of two sets of 3D correspondence data patterns for the registration of free-form shapes. J. Intell. Robotic Syst. 33, 409–436 (2002)
8. Knyaz, V.A., Zheltov, S.Yu.: Photogrammetric techniques for dentistry analysis, planning and visualisation. In: Proceedings ISPRS Congress Beijing 2008, Proceedings of Commission V, pp. 783–788 (2008)
9. MathWorks - Makers of MATLAB and Simulink. https://www.mathworks.com/
10. CloudCompare - Open Source project. https://www.danielgm.net/cc/
11. New NVIDIA Research Creates Interactive Worlds with AI (2018). https://nvidianews.nvidia.com/news/new-nvidia-research-creates-interactive-worlds-with-ai?utm_source=ixbtcom
12. Shardakov, V.M., Parfenov, D.I., Zaporozhko, V.V., Izvozchikova, V.V.: Development of an adaptive module for visualization of the surrounding space for cloud educational environment. In: 11th International Conference Management of Large-Scale System Development, MLSD (2018)
13. Mezhenin, A., Izvozchikova, V., Ivanova, V.: Use of point clouds for video surveillance system cover zone imitation. In: CEUR Workshop Proceedings, vol. 2344 (2019)
14. Mezhenin, A., Zhigalova, A.: Similarity analysis using Hausdorff metrics. In: CEUR Workshop Proceedings, vol. 2344 (2019)
15. Sizikov, V.S., Stepanov, A.V., Mezhenin, A.V., Burlov, D.I., Eksemplyarov, R.A.: Determining image-distortion parameters by spectral means when processing pictures of the earth's surface obtained from satellites and aircraft. J. Opt. Technol. 85(4), 203–210 (2018)
16. Goel, V., Raj, K.: Removal of image blurring and mix noises using gaussian mixture and variation models. Int. J. Image, Graph. Sig. Process. (IJIGSP) 10(1), 47–55 (2018). https://doi.org/10.5815/ijigsp.2018.01.06
17. Sharma, S., Ghanekar, U.: Spliced image classification and tampered region localization using local directional pattern. Int. J. Image, Graph. Sig. Process. (IJIGSP) 11(3), 35–42 (2019). https://doi.org/10.5815/ijigsp.2019.03.05
18. Ye, Z., Hou, X., Zhang, X., Yang, J.: Application of bat algorithm for texture image classification. Int. J. Intell. Syst. Appl. (IJISA) 10(5), 42–50 (2018). https://doi.org/10.5815/ijisa.2018.05.05
Reconstruction of Spatial Environment in Three-Dimensional Scenes Alexander Mezhenin1(&), Vera Izvozchikova2, and Vladimir Shardakov2 1
Saint Petersburg National Research University of Information Technologies, Mechanics and Optics, (ITMO University), Kronverksky Avenue 49, St. Petersburg 197101, Russia [email protected] 2 Orenburg State University, Avenue Pobedy 13, Orenburg 460018, Russia
Abstract. The reconstruction of real images and their digitization is one of the central themes of modern cybernetics, so this work addresses the need for technological modernization of three-dimensional data processing. The article deals with the automatic creation of three-dimensional models of macro-level urban infrastructure in real time. Two reconstruction methods are considered: the Poisson method and a method based on elevation maps. The proposed methods can be recommended for fast generation of three-dimensional macro-level reconstructions. The practical implementation of obtaining three-dimensional macro-level models was carried out using a set of open-source software, and the resulting models were evaluated in terms of speed and visual quality. On the basis of this work, a pipeline of existing software products was assembled that produces an automatic reconstruction of the urban environment of acceptable quality in terms of the number of polygons, and a survey of respondents was conducted to assess the visual quality of the reconstructed models. The results of the study can be used in GIS systems, video games, human behavior modeling systems, and medical imaging. Keywords: Reconstruction of cities · Level of detail · Models of cities · Real-time graphics · Poisson method
1 Introduction Currently, the ability to reconstruct three-dimensional models of scenes from real images is one of the most sought-after capabilities of computer vision systems. At the moment, there are three key ways to obtain three-dimensional models: active methods (for example, laser scanners), modeling in design environments (AutoCad, 3D Studio Max, Unity, Unreal Engine), and passive methods that derive models from stereo imagery. Because the reconstruction of a large number of polygons requires automation of the design of three-dimensional models, the authors use a method of macro-level reconstruction of cities and districts on the basis of automatic © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 47–55, 2020. https://doi.org/10.1007/978-3-030-39216-1_5
design. It should be borne in mind that the final scene must correspond to the parameters of the real image. Quite often, the external parameters (scale, shift, rotation of the scene) relative to the real image are undefined. Reconstruction of vast areas such as a city, city block, street, park area, or road junction can be used in various fields: navigation, video games, urban planning, education, preservation of historical heritage, the road industry, and tourism. As a rule, reconstruction is carried out with different levels of detail, determined by the requirements of the project in which the three-dimensional model is used. The level of detail determines the degree of elaboration of model details, i.e., how much geometric information and how many attributes are required for a particular model element [1, 2, 14]. Attributes are understood as element type, texture or color, and so on. The need for attributes is determined by the degree of design development. Model geometry is usually represented as a polygon mesh [3, 15]. The standard for determining the level of detail is shown in Fig. 1:
Fig. 1. Representation of levels of detail (LOD)
Level 0 models represent solid surfaces and landscapes and are used in the early stages of a project when high precision is not necessary. Level 1 models use satellite imagery, including satellite texture data. Level 2 uses aerial photography materials. The scene can also be divided into objects, which in turn form layers with specific themes. Often, these objects are assigned unique identifiers and a database of objects is created. Level 3 is a refinement of the local geometry of objects from survey data and detection of changes in textures and properties of objects from photographic images. Level 4 is used in engineering projects for high-precision modeling. These criteria are used as requirements of construction companies for projects [4].
2 Theoretical Aspects of the Research There are several approaches to modeling three-dimensional cities. Consider their possible classification [5]:
– by level of automation: automatic, semi-automatic, manual;
– by source data type:
– using the results of photogrammetry (aerial photography, satellite photography, short-range photography);
– using the results of laser scanning (air and ground scanning);
– a hybrid method that can combine the specifics of several approaches.
The automatic approach to city modeling uses geometric shape reconstruction algorithms. Lidar technology or aerial laser scanning results are used to obtain accurate geometric model data and realistic texture. Lidar technology makes it possible to obtain data about an object based on processing of the reflected light signal. Its advantage is coverage of a large area in a short time; its disadvantage is that part of the data is lost when the original points are reconstructed into a three-dimensional model. At the moment, there is no single approach to automated three-dimensional reconstruction; the choice usually depends on the source data and characteristics of the project. The source data for automatic reconstruction are photos, videos, or point clouds. As a rule, the point cloud is obtained by laser scanning [6]; photos and videos can also yield a point cloud. Visual perception of macro-level spaces is natural for a person, who perceives the world as three-dimensional objects. The image of the city is a broad concept that can be defined as a set of characteristics of the city reflected in the minds of people [7, 8, 11–13]. It can include iconic landmarks of the city or territory, the plant environment characteristic of the area, and the location of objects within the city.
3 Mathematical Model Two methods are considered as mathematical models: Poisson reconstruction and reconstruction based on elevation maps. Reconstruction by the Poisson Method. The values of some function f(p), depending on the position in space, are calculated. The resulting values are positive outside the object, negative inside the object, and zero on the desired surface. The task is to find all these values. Because the density of points in space is variable, for the best operation of the algorithm it is advisable to divide the entire volume into cubes of different sizes and find the values of the function in each:
\left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) u(x, y, z) = f(x, y, z) \quad (1)
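As a toy illustration of Eq. (1) (our own sketch, not the paper's implementation), the discrete Poisson equation can be solved on a small regular grid by Jacobi iteration; actual Poisson surface reconstruction solves it on an adaptive octree, as the text notes:

```python
import numpy as np

def solve_poisson_jacobi(f, iterations=2000, h=1.0):
    """Jacobi iteration for the discrete Poisson equation (Eq. 1)
    on a regular 2D grid with zero (Dirichlet) boundary values."""
    u = np.zeros_like(f, dtype=float)
    for _ in range(iterations):
        # five-point Laplacian stencil: u = (neighbors - h^2 * f) / 4
        u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                u[1:-1, :-2] + u[1:-1, 2:] -
                                h * h * f[1:-1, 1:-1])
    return u

# A single positive source in the middle produces a smooth negative "well",
# matching the sign convention above (negative inside, zero on the surface).
f = np.zeros((33, 33))
f[16, 16] = 1.0
u = solve_poisson_jacobi(f)
```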
Reconstruction Based on Height Maps. This method uses an elevation map: a two-dimensional representation of landscape elevation values inscribed in a regular grid [9].
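A minimal sketch of the idea (our own illustration, not taken from the paper): a regular-grid elevation map can be triangulated directly, with one vertex per grid node and two triangles per grid cell:

```python
import numpy as np

def heightmap_to_mesh(z, cell=1.0):
    """Triangulate a regular-grid elevation map z[h, w] into a mesh:
    one vertex per grid node, two triangles per grid cell."""
    h, w = z.shape
    ys, xs = np.mgrid[0:h, 0:w]
    verts = np.column_stack([xs.ravel() * cell, ys.ravel() * cell, z.ravel()])
    tris = []
    for y in range(h - 1):
        for x in range(w - 1):
            i = y * w + x
            tris.append((i, i + 1, i + w))          # first triangle of the cell
            tris.append((i + 1, i + w + 1, i + w))  # second triangle of the cell
    return verts, np.array(tris)

verts, tris = heightmap_to_mesh(np.random.rand(4, 5))
print(len(verts), len(tris))  # 20 vertices, 24 triangles (2 per cell)
```

The polygon count is fully determined by the grid resolution, which is why (as the experiments below show) the method scales differently from Poisson reconstruction as the input point count grows.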
4 Experimental Part Practical implementation of obtaining three-dimensional macro-level models was carried out using a set of open-source software: CloudCompare (CC) and MeshLab (ML) [9, 10]. Figure 2 shows a diagram of the point cloud modeling pipeline.
Fig. 2. Pipeline for automatic generation of macro-level models
The first stage is point cloud segmentation, calculation and correction of normal vectors (Fig. 3).
Fig. 3. Work segment selection in the point cloud and normal vector calculation results
Textures are obtained from the point cloud using MeshLab. There are different types of textures (taking, for example, those in the standard set of the three-dimensional graphics editor 3ds Max): color (diffuse) textures, which determine the color of the model; specular textures for highlights, which can give the surface shine; bump textures for roughness; displacement textures for relief; and opacity textures for transparency.
In the framework of this work, the following approach was used. Each point of the source data has its own RGB value, which was used to obtain the color texture. To do this, this color is “transferred” to the polygon mesh. The approach proposed by the authors is to calculate the normals for each of the representations (point cloud, polygonal model), then create an “empty” texture at the required resolution for the model and project the RGB values of the source points onto it. The resulting polygon mesh is converted to the OBJ format while preserving the texture map. The resulting models and textures were imported into Unreal Engine. The final view of the models is shown in Fig. 4.
Fig. 4. Results of model visualization using Poisson method and elevation maps
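The color-transfer step described above can be sketched as a nearest-neighbor projection of point RGB values onto mesh vertices (an illustrative simplification of the authors' texture projection; the function name and brute-force search are our own, and a k-d tree would be used in practice):

```python
import numpy as np

def transfer_colors(points, colors, vertices):
    """Assign each mesh vertex the RGB value of its nearest source point."""
    out = np.empty((len(vertices), 3), dtype=colors.dtype)
    for i, v in enumerate(vertices):
        d2 = np.sum((points - v) ** 2, axis=1)  # squared distances to all points
        out[i] = colors[np.argmin(d2)]
    return out

points = np.array([[0.0, 0, 0], [1.0, 0, 0]])
colors = np.array([[255, 0, 0], [0, 0, 255]])   # a red and a blue source point
verts = np.array([[0.1, 0, 0], [0.9, 0, 0]])
print(transfer_colors(points, colors, verts))   # red for the first vertex, blue for the second
```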
The FPS value was used to evaluate the display performance of the obtained models. The measurements were made in the Unreal Engine environment using a script developed in the visual programming language Blueprints (Fig. 5).
Fig. 5. Setting Blueprints to output frame rate per second in UE 4
The samples in this experiment are independent because the FPS was measured at different moments while moving around the scene, which does not guarantee that identical parts of the reconstruction are rendered. The
difference results are presented on a quantitative scale. We simplify it to a dichotomous scale with respect to the median (the value that divides an ordered set in half), assuming that the statistical power is sufficient to obtain a meaningful result for testing our hypothesis. Fisher's exact test was used for the calculations [9]:

P = \frac{(a+b)! \, (c+d)! \, (a+c)! \, (b+d)!}{a! \, b! \, c! \, d! \, n!} \quad (2)
The median for the set of 30 values was calculated using standard Microsoft Excel tools; its value is 73.4 FPS. For ease of calculation, we construct contingency Table 1.
Table 1. Contingency table of FPS values relative to the median for the two reconstruction methods

                      Poisson model   Height-map model   Total
FPS values > median   11 (a)          4 (b)              15
FPS values ≤ median   4 (c)           11 (d)             15
Total                 15              15                 30
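For illustration, Eq. (2) applied to Table 1 can be reproduced in a few lines of Python (a standard-library sketch; the two-sided p-value sums the probabilities of all tables with the same margins that are no more probable than the observed one):

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact test for a 2x2 contingency table,
    built from the hypergeometric table probabilities of Eq. (2)."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    def p_table(x):
        # probability of the table whose top-left cell equals x
        return comb(row1, x) * comb(n - row1, col1 - x) / comb(n, col1)
    p_obs = p_table(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    return sum(p for p in (p_table(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Table 1: 11 vs 4 FPS values above the median for the two methods.
p = fisher_exact_2x2(11, 4, 4, 11)
print(round(p, 4))  # 0.0268 — significant at the 0.05 level
```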
We plot the time spent on reconstruction as a function of the number of points in the original data for the Poisson and elevation-map models. In addition, the experiment recorded the number of polygons obtained depending on the number of points in the original data (Fig. 6).
Fig. 6. Reconstruction time and resulting polygon count versus the number of points in the source data
The graphs show that the elevation-map method reconstructs faster. The polygon counts of the three-dimensional models obtained by the two methods remain comparable for up to 500 thousand initial points.
Beyond this value, the elevation-map method reconstructs the environment with a noticeably larger number of polygons than the Poisson triangulation method. The next step is to evaluate the visual perception of the models. The obtained models had obvious visual differences (Fig. 7). Due to extrusion, part of the information along the height axis was lost in the second model; in addition, the reconstruction visually resembled crystals.
Fig. 7. Reconstruction obtained by the Poisson method and the method of elevation maps
A survey was conducted for visual assessment. One working segment was chosen for the survey, based on the variety of objects represented in it. Identical travel routes over the resulting reconstructions were then visualized, and the movements were recorded as mp4 videos, each 20 s long. Each respondent was asked to watch one of the two reconstruction videos, selected at random, and answer a questionnaire. The questions were divided into two parts. The first concerned only the scene presented in the video: it was necessary to choose photos related to this reconstruction, or, conversely, not related to it. There was also a question on the memorability of the locations of objects in the scene, answered by selecting the scheme that, in the respondent's opinion, best corresponds to what was seen. These questions revealed the level of recognition of the objects and how well the terrain structure of the presented scene was conveyed. The second set of questions was comparative. Respondents were asked to rate the scene and the realism of its textures on a ten-point scale and, finally, to state whether they would like to explore the area, thereby determining the level of subjective evaluation of the model in the scene. Ordinal scales from 1 to 10 points were used for the questions on the overall scene and the realism of the textures. A summary of these two questions is presented in Fig. 8.
Fig. 8. Summary of respondents' ratings of the scene and texture realism for the two reconstruction methods
Based on the survey of respondents, it can be concluded that the most effective method for automating construction is the elevation-map method. Poisson reconstruction gave the best result in terms of the number of polygons and visual quality: the model turned out to be more realistic and free of distortion with respect to the urban infrastructure.
5 Conclusions In this paper, analogues of macro-level reconstructions were collected and analyzed, and approaches to modeling such reconstructions were identified. Free software was used, which made it possible to recreate the territory by different methods (Poisson triangulation and elevation maps). A method for obtaining textures from a point cloud was also proposed. The models were converted to FBX format and placed in Unreal Engine. To compare the approaches, the time spent on reconstruction and the number of polygons of the resulting model were measured and evaluated. In addition, a questionnaire was developed and a survey was conducted to identify respondents' preferences and the level of memorability of object locations. The following conclusions can be drawn from this work. Elevation maps allow reconstructions to be created faster, but it is advisable to use them only if the number of points in the original data does not exceed 450 thousand. The survey established neither respondent preferences nor statistically significant differences in the memorability of object locations between the two approaches; however, in percentage terms, the triangulation method showed better results. The proposed methods can be recommended for fast generation of three-dimensional macro-level reconstructions. However, the recognition of these models is low, so such reconstructions should be used in the first stages of a project, when time is limited and the result does not require high accuracy, or when the observation point is located far enough from the model. Further development of this study will be aimed at improving the polygonal mesh of the resulting models, which will increase the recognition and accuracy of the three-dimensional scene. It is also possible to investigate the distance of the observation point
from the model and the required accuracy of the reconstruction while maintaining the recognizability of buildings. Acknowledgements. The research has been supported by the RFBR, according to research project No. 17-07-00700 A.
References 1. Ashraf, M., Arif, S., Basit, A., Khan, M.: Provisioning quality of service for multimedia applications in cloud computing. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 10(5), 40–47 (2018) 2. Excerpt of the CityGML model (LOD2) of The Hague, Netherlands (open dataset). Access mode. https://www.citygml.org/about/ 3. Memos, V.: Efficient multimedia transmission over scalable IoT architecture. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 10(5), 27–39 (2018) 4. Mezhenin, A., Izvozchikova, V., Ivanova, V.: Use of point clouds for video surveillance system cover zone imitation. In: CEUR Workshop Proceedings, vol. 2344, 9p. (2019) 5. Sizikov, V.S., Stepanov, A.V., Mezhenin, A.V., Burlov, D.I., Eksemplyarov, R.A.: Determining image-distortion parameters by spectral means when processing pictures of the earth’s surface obtained from satellites and aircraft. J. Opt. Technol. 85(4), 203–210 (2018) 6. Vipul, G., Krishna, R.: Removal of image blurring and mix noises using Gaussian mixture and variation models. Int. J. Image, Graph. Signal Process. (IJIGSP) 10(1), 47–55 (2018). https://doi.org/10.5815/ijigsp.2018.01.06 7. Surbhi, S., Umesh, G.: Spliced image classification and tampered region localization using local directional pattern. Int. J. Image, Graph. Signal Process. (IJIGSP) 11(3), 35–42 (2019). https://doi.org/10.5815/ijigsp.2019.03.05 8. Zhiwei, Y., Xiangfeng, H., Xu, Z., Juan, Y.: Application of bat algorithm for texture image classification. Int. J. Intell. Syst. Appl. (IJISA) 10(5), 42–50 (2018). https://doi.org/10.5815/ ijisa.2018.05.05 9. Chih-Fan, C., Bolas, M., Rosenberg, E.: Rapid creation of photorealistic virtual reality content with consumer depth cameras. In: IEEE Virtual Reality (VR), pp. 473–474. IEEE, Los Angeles (2017) 10. Argelaguet, F., Andujar, C.: A survey of 3D object selection techniques for virtual environments. Comput. Graph. 37(3), 121–136 (2013) 11. 
Fernandez-Palacios, B.J., Morabito, D., Remondino, F.: Access to complex reality-based 3D models using virtual reality solutions. J. Cult. Herit. 23, 40–48 (2017) 12. Salako, E.A., Adewale, O.S., Boyinbode, O.K.: Appraisal on perceived multimedia technologies as modern pedagogical tools for strategic improvement on teaching and learning. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 11(8), 15–26 (2019) 13. Gonzalez-Delgado, J.A., Martınez-Grana, A.M., Civis, J., Sierro, F.J., Goy, J.L., Dabrio, C.J., Ruiz, F., Gonzalez-Regalado, M.L., Abad, M.: Virtual 3D tour of the neogene palaeontological heritage of Huelva, 10 p. Springer, Guadalquivir Basin (2014) 14. Papadopoulos, C., Mirhosseini, S., Gutenko, I., Petkov, K., Kaufman, A.E., Laha, B.: Scalability limits of large immersive high-resolution displays. In: Conference Proceedings 2015 IEEE Virtual Reality (VR), IEEE, Arles (2015). https://doi.org/10.1109/vr.2015. 7223318 15. Chih-Fan, C., Bolas, M., Rosenberg, E.S.: Rapid creation of photorealistic virtual reality content with consumer depth cameras. In: IEEE Virtual Reality (VR), pp 473–474. IEEE, Los Angeles (2017)
Railway Rolling Stock Tracking Based on Computer Vision Algorithms Andrey V. Sukhanov(&) JSC NIIAS, Rostov Branch, Rostov State Transport University, Rostov-on-Don, Russia [email protected]
Abstract. The paper presents the problem statement of rolling stock control in the sorting bowls of railway hump yards. It is shown that the technological processes taking place on railway sorting tracks are the least automated part of the hump yard, which is why they require the development of suitable control approaches. The criteria for sorting bowl automation are presented. To satisfy these criteria, a new automation tool is proposed. This tool is novel and differs substantially from conventional ones because it is based on computer vision. Its core is a two-phase algorithm (a detection phase and a tracking phase) that provides full control over the sorting tracks of a railway hump yard. Experiments on a real railway facility are presented in this work. Keywords: Railway automation · Intelligent sorting yard · Rolling stock tracking
1 Introduction Freight stations with sorting hump yards are the core of railway transportation. The typical scheme of a hump yard is presented in Fig. 1. It consists of an arrival yard, a hump, a classification bowl (also called a classification yard), and a departure yard [1]. The hump yard breaks up inbound trains standing in the arrival yard into cuts of several cars, which are then humped and made up into new outbound trains proceeding to the departure yard. This process is also called humping. In recent decades, humping has been expected to become unmanned [2, 3], because unmanned technologies based on full automation increase train processing throughput while reducing unsafe operations connected with human activity [4]. In Russian railways, humping processes have already been automated in the hump area (the heart of the hump yard), where cut braking, directing, and tracking are performed. After a cut leaves the hump area and enters the sorting bowl, its velocity is not directly controlled and tracking is available only for the tail [5]. This leads to the necessity of applying secondary approaches based on simulation modeling and the involvement of human resources. Thus, existing tools do not provide accurate and objective control over the sorting tracks of a hump yard. This fact leads to subjective assessment of sorting processes and prevents full automation of marshalling operations in terms of ensuring the permissible © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 56–63, 2020. https://doi.org/10.1007/978-3-030-39216-1_6
Fig. 1. Layout of a hump yard
collision speed in the classification bowl and predicting the gaps between cars standing on the sorting tracks. The paper formalizes the task of car control in a railway sorting yard and proposes an algorithm for its solution based on object detection and object tracking approaches. Section 2 provides the problem statement for control over railway cuts in the sorting bowl of a freight station and presents the scheme of the proposed two-phase algorithm. In Sect. 3, the key steps of the algorithm are presented together with experiments on a real railway facility. Conclusions and future work are given in Sect. 4.
2 Problem Statement Modern railway sorting automation systems provide full control over rolling stock moving through the heart of the hump yard, the hump itself [6, 7]. In the area of the brake positions, control is performed by computing position and velocity via Doppler velocimeters, track circuits, and axle counters (Fig. 2). Between brake positions, velocimeters are not used, because axle counters are sufficient for speed computation.
Fig. 2. The typical scheme of hump and adjoining sorting tracks (BP is brake position)
Hump devices allow cut velocity to be computed with accuracy above 99.5%. This helps automated hump systems, such as the Integrated Automation System for Marshalling Process Control (KSAU SP) [8] and the Microcomputer System for Marshalling Yards (MSR-32) [9], to adjust their parameters in real time and respond promptly to abnormal situations. After a cut leaves the last BP and enters the sorting track, its position must be determined to predict the gaps, and its velocity to predict unacceptable collisions with cars standing on the track. The second problem is more important because it affects safety in the hump yard. As noted above, control here is significantly poorer than at the hump, because the existing track-filling control system [5] can locate only the tail. According to statistics collected from the reports of operational personnel, the collision velocity exceeds the normal value (5 km/h for Russian railways) for every fifth cut. However, these statistics are only a subjective view, collected from a small part of the situations that occur. For these reasons, the task of controlling the sorting bowl becomes very relevant for marshalling processes. The solution adopted for hump automation (axle counters, track circuits, and Doppler velocimeters) does not meet economic and operational criteria, because the total length of the sorting tracks can reach 50 km. This work proposes a novel tool that allows cuts on sorting tracks to be controlled while satisfying the mentioned criteria. Video cameras placed on the light towers of the sorting bowl can serve as the eyes of this tool (Fig. 3).
Fig. 3. Positions of video cameras
As the brain of this tool, it is proposed to use computer vision approaches to tracking moving objects. The basic scheme of the proposed algorithm follows the one from [9] (Fig. 4).
Fig. 4. The scheme of the algorithm
The following section describes the main steps of the algorithm.
3 Proposed Algorithm

3.1 Detection Phase
First of all, segmentation of moving areas from the background should be performed. Moving areas are considered to be entities whose position differs in successive frames [10]. The frame-to-frame difference (FFD) image I is computed for each t-th frame and used to implement the segmentation:

I(x, y) = \begin{cases} 1, & \text{if } |I_t(x, y) - I_{t-1}(x, y)| > s \\ 0, & \text{otherwise} \end{cases}
where x and y are the pixel coordinates and s is an empirically obtained threshold. Figure 5b shows the result of the FFD obtained on an experimental video from a real railway facility. To suppress noise and emphasize the contours of moving objects, the well-known morphological operation of dilation is applied [11] (Fig. 5c). To obtain disjoint and clearly distinguishable objects in terms of the established problem, the size of the structuring element for dilation is chosen as half the distance between the controlled track and the neighboring one. To detect the contours of moving objects from the segmented areas, the idea from [12] is used, where contours of the objects (Fig. 5d), obtained from the boundary pixels of areas in the binary image (Fig. 5c), are transformed into convex hulls (Fig. 5e). A convex hull is the set of contour points P = {p1, p2, …, pn} for which the angle between the lines pi−1pi and pipi+1 is maximal.
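The segmentation and dilation steps can be sketched in NumPy as follows (an illustrative toy; the paper's implementation uses OpenCV, and the threshold and kernel size here are arbitrary):

```python
import numpy as np

def ffd_mask(prev, cur, s=25):
    """Binary frame-to-frame difference: 1 where |I_t - I_{t-1}| > s."""
    return (np.abs(cur.astype(int) - prev.astype(int)) > s).astype(np.uint8)

def dilate(mask, k=3):
    """Naive binary dilation with a k x k square structuring element."""
    h, w = mask.shape
    padded = np.pad(mask, k // 2)
    out = np.zeros_like(mask)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

prev = np.zeros((8, 8), np.uint8)
cur = prev.copy()
cur[3:5, 3:5] = 200                 # a "moving object" appears between frames
mask = dilate(ffd_mask(prev, cur))  # the 2x2 blob grows to 4x4 after dilation
print(int(mask.sum()))              # 16
```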
Fig. 5. Detection phase steps
The obtained convex hulls should be classified into key items and non-key ones. Key hulls are those satisfying the following criteria:
1. The item intersects the line of the analyzed track;
2. The item has a size comparable with the known size of rolling stock.
The first criterion check is obvious. The second one requires relating image pixels to real track coordinates. For this purpose, the following calibration formula presented in [13] can be used:

D = \frac{L K}{W/x - 1 + K} \quad (1)

where D is the sought distance in meters, L is the length of the visible part of the track in meters, W is the length of the track in pixels, x is the distance to the analyzed point in pixels, and K is the incline coefficient, computed as

K = \frac{W - M}{M},
where M is the distance to the center of the visible part of the track in pixels. Figure 5f shows the real scale obtained on the basis of (1). According to preliminary information available from KSAU SP, the cut consists of one car 14 m long. Therefore, the key hull can be chosen as illustrated in Fig. 5f. Figure 5g presents the bounding box for the found key hull.

3.2 Tracking Phase
Tracking algorithms solve the task of calculating the change in position of a labeled object from frame to frame, taking the initial coordinates of the object as input data. Because the initial information is known, this approach has a lower computational cost than repeated detection in every frame [14]. The image bounded by the box obtained in the previous phase is the input of the tracking phase.
For tracking, the Discriminative Correlation Filter with Channel and Spatial Reliability (CSR-DCF) algorithm proposed in [15] is chosen (Fig. 6). The key advantage of the algorithm is its real-time operation on a CPU.
Fig. 6. The scheme of CSR-DCF algorithm
CSR-DCF searches for the area that is nearest to one of the images obtained as affine transformations of the input image. The search criterion is the maximal response G between a candidate area I and the ideal filter h characterizing the input image:

G = F(I) \odot F(h)^{*},

where F(\cdot) is the fast Fourier transform (FFT), which is frequently used in image processing tasks [16], \odot is the Hadamard product, and * denotes complex conjugation. The task is to find the ideal filter whose FFT satisfies the following equation:

\min_{F(h)} \left( \sum_i w_i \left| F(I_i) \odot F(h) - F(g_i) \right|^2 \right),

where w_i is the weight of the i-th filter based on its affinity to the initial image, and g_i is the ideal response to the i-th transformation, computed as a Gaussian function:

g = \exp\left( -\frac{(x - x_c)^2 + (y - y_c)^2}{2.0} \right),

where x_c and y_c are the center coordinates and x and y are the coordinates of the searched point. The results of the second phase are presented in Fig. 7. The mean FPS on an Intel Core i5-4200U using OpenCV 4.0 is 7, which is acceptable for the task at hand. OpenCV is the Open Source Computer Vision Library, which is used to solve tasks similar to the one considered in this paper [17].
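The response computation at the core of DCF-style trackers can be illustrated with a short NumPy sketch (our own toy example, not the CSR-DCF implementation): the FFT-domain product F(I) ⊙ F(h)* yields a correlation map whose peak gives the object position:

```python
import numpy as np

def correlation_peak(image, template):
    """Correlation response G = F(I) ⊙ conj(F(h)) evaluated via the FFT;
    returns the (row, col) location of the maximum response."""
    I = np.fft.fft2(image)
    H = np.fft.fft2(template, s=image.shape)  # zero-pad template to image size
    G = np.fft.ifft2(I * np.conj(H)).real
    return np.unravel_index(np.argmax(G), G.shape)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
img -= img.mean()                   # zero-mean input sharpens the peak
tmpl = img[20:36, 30:46].copy()     # a 16x16 patch to locate again
print(correlation_peak(img, tmpl))  # peak at (20, 30): where the patch sits
```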
To compute the velocity and track the position of the rolling stock using results of the previous step, the scale calculated in (1) is used.
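A sketch of the calibration formula (1) in Python (the numeric values of L, W, and M below are illustrative only, not from the paper):

```python
def pixel_to_meters(x, L, W, M):
    """Eq. (1): distance in meters to a point x pixels along the track.
    L: visible track length (m), W: track length (px),
    M: pixel distance to the center of the visible track part."""
    K = (W - M) / M          # incline coefficient, Eq. K = (W - M) / M
    return L * K / (W / x - 1 + K)

# Illustrative values: 100 m of track spanning 800 px, with the center of the
# visible part at 300 px due to perspective foreshortening.
L_m, W_px, M_px = 100.0, 800.0, 300.0
print(round(pixel_to_meters(800, L_m, W_px, M_px), 6))  # 100.0 — far end of the track
print(round(pixel_to_meters(300, L_m, W_px, M_px), 6))  # 50.0 — the center maps to L/2
```

Note the built-in sanity checks: the far end of the track (x = W) maps to the full visible length L, and the perspective center (x = M) maps to exactly L/2.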
Fig. 7. The results obtained by the presented algorithm
The video illustrating the proposed algorithm performance can be found at YouTube [18].
4 Conclusions and Future Work The proposed solution is relevant because existing approaches are expensive, which inhibits automation of marshalling processes. The experimental results demonstrated the applicability of the presented approach to sorting bowl control. Nevertheless, this work only shows the feasibility of using such algorithms in railway sorting processes. The algorithm still has a shortcoming in detecting cuts that consist of multiple cars, and it has limitations in bad weather conditions. These can be mitigated by involving additional information, such as cut data collected from the hump automation system. Future research is required to implement and assess the approach within existing hump automation systems, such as KSAU SP. In addition, the research group led by the author is developing a deep learning architecture that can serve as an alternative to the detection phase of the algorithm. Publication of those results is planned in future work.
Acknowledgments. The reported study was funded by RFBR, project number 20-07-00100.
Evaluating of Word Embeddings Hyper-parameters of the Master Data in Russian-Language Information Systems
Dudnikov Sergey, Mikheev Petr, and Grinkina Tatyana
Bauman Moscow State Technical University, Moscow, Russia
[email protected], [email protected], [email protected]
Abstract. This work evaluates word-embedding hyper-parameters for the task of supporting master data quality. We introduce the structure and management of master data, describe a method of training an embedding model for its elements, and present methods for evaluating the resulting vectors. Using a corpus of 264 thousand records for training and validation, we conducted experiments on models with different parameter sets. On specific industry texts, our vectors give good results in mapping and classification problems compared with standard approaches. In conclusion, we present the main recommendations for setting hyper-parameters in the task of master data management under industry conditions. Our methods are successfully used in RPA systems and in a Data Warehouse as a text-analysis module.
Keywords: Master data · Quality of master data · Word2Vec model · Hyper-parameter settings · Word embeddings
1 Introduction
Learning semantic representations for words is a fundamental NLP task that has been studied in many works. But the meaning of a word often varies from one domain to another [1]. Domain adaptation is an important research topic and has been considered for many NLP tasks, for example in biomedical NLP [2], software engineering texts [3], and the oil and gas domain [4]. In Industry 4.0, master data is domain-specific text data unique to each type of production. The quality of master data plays an important role, because master data brings together and exchanges data such as customer, supplier or product records between disparate applications and data silos. Operational processing of master data should be addressed effectively as part of the business processes of the enterprise. To meet this requirement, processing systems must have the capacity and capability to make operational decisions. Such systems should not only support the quality of the data available in them, but also update the database without loss of information quality. An important condition is the preservation of the lexical-semantic specifics of the enterprise's area of activity,
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 64–75, 2020. https://doi.org/10.1007/978-3-030-39216-1_7
which is not possible using only standard methods and approaches for constructing vector representations of words. This article explores the settings of a model of vector representations of master data text elements, aiming to create an optimal semantic space that can accurately solve the main tasks of maintaining master data quality in Russian-language information systems.
2 Features of the Master Data Elements and Their Vector Representations
The structure of the master data is an entity with descriptive or numerical characteristics. For instance, in Ключ гаечный комбинированный 41*41 (Combination wrench 41*41), Ключ гаечный (wrench) is the entity, комбинированный (combination) is a descriptive parameter, and 41*41 is a numerical parameter. The standard approach to maintaining the quality of this type of information consists of two stages: developing a multi-level classifier and creating an element template for each subgroup. Based on the created template, elements are first reduced to a single form and then compared with each other parameter by parameter. This approach does not treat each element as an independent unit, so the quality of the master data remains low. A much better option is to reduce master data elements to comparable representations, such as vectors. The main problem with this approach is the lack of open corpora of trained vectors for this type of text data, owing to the many abbreviations and domain-specific words in the corpus. Existing vector corpora cover from 30% to 40% of the data array, and the remaining groups of words need independent training. In this paper, we study the parameters of the word-embedding training model in order to choose the most optimal set and give the necessary recommendations for training such master data corpora.
3 Vector Representation of the Master Data Elements
The Master Data Corpus. The corpus of the master data is a corporate directory of nomenclature with a volume of 264 thousand records. It includes a list of basic materials, spare-part sets, and storage services provided. The vocabulary volume is 45 thousand words. Figure 1 shows the frequency distribution in the vocabulary, from which we can conclude that the main mass of words has frequencies in the range 1–5 and makes up approximately 70% of the total volume.
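The frequency statistic described above can be computed with a simple counter; a minimal sketch over a toy corpus (the records and the 1–5 frequency band are illustrative, not the article's data):

```python
from collections import Counter

def low_frequency_share(records, lo=1, hi=5):
    """Share of vocabulary words whose corpus frequency lies in [lo, hi]."""
    freq = Counter(tok for rec in records for tok in rec)
    rare = sum(1 for count in freq.values() if lo <= count <= hi)
    return rare / len(freq)

# toy corpus: 'a' occurs 6 times (frequent), 'b', 'c', 'd' occur once (rare)
records = [["a", "a"], ["a", "a"], ["a", "a"], ["b"], ["c"], ["d"]]
print(low_frequency_share(records))  # 0.75: 3 of 4 vocabulary words are rare
```

On the corporate corpus this kind of check gives the roughly 70% figure cited in the text.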
Fig. 1. Vocabulary frequency allocation
To assess the number of unique (domain-specific) words in the corpus, the existing vocabulary is compared with the vocabulary of the Wikipedia corpus. Words belonging to both sets are considered covered. The rest require either additional processing (abbreviations and acronyms, two-part words) or self-training. The percentage of the groups in the case under consideration is presented in Fig. 2.
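The vocabulary comparison reduces to set intersection; a minimal sketch with toy word sets standing in for the corporate and Wikipedia vocabularies:

```python
def coverage(corpus_vocab, reference_vocab):
    """Fraction of the corpus vocabulary covered by an external vocabulary,
    plus the words that require additional processing or self-training."""
    covered = corpus_vocab & reference_vocab
    return len(covered) / len(corpus_vocab), corpus_vocab - reference_vocab

vocab = {"wrench", "combination", "steel", "gost41"}   # toy corporate vocabulary
wiki = {"wrench", "combination", "steel"}              # toy reference vocabulary
share, unknown = coverage(vocab, wiki)
print(share)    # 0.75 of the vocabulary is covered
print(unknown)  # {'gost41'} needs self-training
```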
Fig. 2. The percentage distribution of words in the vocabulary
Word2Vec Model. The Skip-gram [5] (Word2Vec) architecture shown in Fig. 3 was chosen as the model for training vectors. Firstly, this architecture shows better results for rare words than the CBOW architecture. Secondly, further training of deep contextual models can build on the results obtained in the study of skip-gram parameters.
Fig. 3. Skip-gram
The principle of training this architecture is to predict the context of the word $w(t)$:

C = \{w(t-2),\ w(t-1),\ w(t+1),\ w(t+2)\}   (1)

The first step initializes two weight matrices $W_{N\times V}$ and $W'_{V\times N}$, where $N$ is the dictionary size and $V$ is the embedding dimension. An input vector is

x_i = \{x_1, \ldots, x_k, \ldots, x_V\}   (2)

where $x_k = 1$ if $k = i$ and $x_k = 0$ if $k \neq i$. The embedding on the hidden layer is

h_i = x^T W   (3)

Then each column $v_j$ of the matrix $W'_{V\times N}$ is scalarly multiplied by the hidden-layer vector, giving a score for each word in the dictionary:

u_j = v_j^T h_i   (4)

The posterior probability for each word from the context is calculated with the softmax formula

p(w_j \mid w_I) = y_j = \frac{\exp(u_j)}{\sum_{j'=1}^{N} \exp(u_{j'})}   (5)

where $w_I$ is the input word and $w_O$ is the output word. The loss function is

L = -\log p(w_{O,1}, \ldots, w_{O,C} \mid w_I)   (6)

The weights of the matrices $W_{N\times V} = \{w_{ij}\}$ and $W'_{V\times N} = \{w'_{ij}\}$ are updated as

w_{ij}^{new} = w_{ij}^{old} - \eta\,\frac{\partial L}{\partial w_{ij}}   (7)

w_{ij}'^{\,new} = w_{ij}'^{\,old} - \eta\,\frac{\partial L}{\partial w'_{ij}}   (8)

where $\eta$ is the learning rate. Additionally, the quality of the vectors is improved by the following two methods. The first is a subsampling method used to balance the training corpus: words are probabilistically discarded from the training set, and the probability that a word is kept is given by Formula (9):

P(w_i) = \left(\sqrt{\frac{z(w_i)}{sample}} + 1\right)\frac{sample}{z(w_i)}   (9)

where $z(w_i)$ is the word's share in the vocabulary and $sample$ is a customizable sampling parameter. The second method, negative sampling, increases the speed of learning. Since updating all weights at every training step is a rather time-consuming computation, it is more efficient to update the weights only for a few negative examples (erroneous answers); the number of such examples is the parameter $negative$. The probability of a word falling into the subsample is calculated according to Formula (10):

P(w_i) = \frac{z(w_i)^{3/4}}{\sum_{i=0}^{V} z(w_i)^{3/4}}   (10)
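Formulas (9) and (10) can be sketched directly in a few lines; a minimal stdlib Python illustration (the toy word shares are assumptions):

```python
import math

def keep_probability(z, sample=1e-3):
    """Formula (9): probability that a word with corpus share z survives subsampling."""
    return (math.sqrt(z / sample) + 1.0) * (sample / z)

def negative_sampling_dist(shares):
    """Formula (10): unigram distribution raised to the 3/4 power, renormalised."""
    weights = [z ** 0.75 for z in shares]
    total = sum(weights)
    return [w / total for w in weights]

# frequent words are kept with low probability, rare ones almost surely
print(keep_probability(0.1))    # ~0.11 for a very frequent word
print(keep_probability(1e-4))   # > 1, i.e. always kept
probs = negative_sampling_dist([0.5, 0.3, 0.2])
print(round(sum(probs), 6))     # 1.0
```

Raising the unigram distribution to the 3/4 power flattens it, so mid-frequency words are drawn as negatives more often than a plain unigram draw would allow.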
Parameters of the Word2Vec Model. The skip-gram model has the following set of parameters:
1. Size – the word-embedding dimension.
2. Window – the maximum distance from the target word to the predicted word in the Skip-gram architecture.
3. Min count – the minimum frequency of words in the corpus; a word occurring less often is ignored.
4. Negative – the negative-sampling parameter.
5. Learning rate.
6. Sample – the subsampling rate.
Test values of the parameters are given in Table 1; default values are bold [6, 7].
Table 1. Test values of the parameters
Parameters     Values
Size           25/50/100/200/400/800
Window         1/2/3/4/5/6/7/8
Min count      0/5/10/20/50/100/200/400/800/1000/1200/2400
Negative       1/2/3/5/8/10/15
Learning rate  0.0125/0.025/0.05/0.1
Sample         0/1e−1/1e−2/1e−3/1e−4/1e−5/1e−6/1e−7/1e−8/1e−9
4 Experiment
The basic preprocessing of the word corpus includes lowercasing; then all punctuation and special characters are removed from the elements, all numerical values are removed, and each element is split into tokens (words). The algorithm is implemented using the NLTK Python library (Bird, 2016). The resulting vectors, obtained by training the model with the optimal set of hyper-parameters, are compared with the vectors obtained by the fastText method with default parameter values.

Intrinsic Evaluation. Learned vector representations of words should solve sufficiently well the problem of determining semantically close elements of reference information [8, 9]. Semantic similarity here refers to the cosine similarity between word embeddings:

similarity(w_x, w_y) = \frac{w_x \cdot w_y}{\|w_x\|\,\|w_y\|}   (11)

Since vectors are trained for tokens (words), each element can be represented as a set of $k$ tokens:

el = \{w_1, w_2, \ldots, w_k\}   (12)

Then the element vector is calculated according to Formula (13):

x_{el} = \frac{1}{k}\sum_{i=1}^{k} w_i   (13)
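Formulas (11)–(13) amount to averaging token vectors and taking a cosine; a minimal sketch with toy two-dimensional vectors:

```python
import math

def element_vector(token_vectors):
    """Formula (13): element embedding as the mean of its token vectors."""
    k = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(v[d] for v in token_vectors) / k for d in range(dim)]

def cosine_similarity(x, y):
    """Formula (11): cosine similarity between two element vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

el = element_vector([[1.0, 0.0], [0.0, 1.0]])   # a two-token element
print(el)                                        # [0.5, 0.5]
print(cosine_similarity(el, [1.0, 1.0]))         # 1.0: same direction
```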
As a test sample, labeled elements of the corporate nomenclature directory are used: 379 pairs of elements that uniquely correspond to each other in meaning, i.e., the similarity indicator for each pair is 1. The quality score in this case is the average semantic proximity of each pair of test-set elements over the various vector representations.

Extrinsic Evaluation. The master data array of information systems is used for a wide range of tasks, one of which is the classification of master data elements by categories. The quality of the vectors will also affect the quality of the classifier.
To evaluate the effectiveness of the model, a classifier for 12 classes is trained on each vector model using the support vector method (SVM) [10], and then the F1-score is calculated. The classifier intentionally has very low quality and is not tuned to the classification problem, for a more accurate assessment of the quality of the vectors.

Experiment Results
Size. The optimal size parameter is 25. This is because the size of the corpus of words is 15 thousand, while vector dimensions such as 200, 400, etc. lead to overfitting of the model; therefore the classifier on overfitted vectors shows a better result than on vectors of smaller dimension (Tables 2, 3 and Fig. 4).

Table 2. Research results for parameter size
Size  Intrinsic evaluation  Extrinsic evaluation
25    0.928                 0.045
50    0.909                 0.047
100   0.902                 0.057
200   0.901                 0.059
400   0.903                 0.061
800   0.904                 0.057
Fig. 4. Research results for parameter size

Table 3. Research results for parameter min count
Min count  Intrinsic evaluation  Extrinsic evaluation
0          0.920                 0.066
5          0.902                 0.06
10         0.898                 0.051
20         –                     0.054
50         –                     0.056
100        –                     0.051
200        –                     0.055
400        –                     0.059
800        –                     0.048
1000       –                     0.045
1200       –                     0.048
2400       –                     0.054
Min Count. The master data corpus contains quite a lot of unique words (on average more than half the volume of the entire dictionary); as a result, when words whose frequency is below 20 are discarded, uncertainties arise. Consequently, only a small part of the test words can be found among the trained ones, and the test is either incorrect, as in the case of the extrinsic evaluation, or fails, as in the case of the intrinsic one (Fig. 5).
Fig. 5. Research results for parameter min count
Window. Since master data elements have a rather limited length (on average one element contains about 5 words), the intrinsic vector quality estimate worsens when the window size exceeds this average. At the same time, lower values do not allow a sufficiently accurate learning of the topical similarity of tokens (Table 4 and Fig. 6).
Table 4. Research results for parameter window
Window  Intrinsic evaluation  Extrinsic evaluation
1       0.908                 0.054
2       0.908                 0.055
3       0.905                 0.050
4       0.904                 0.058
5       0.902                 0.059
7       0.901                 0.058
8       0.900                 0.056
Fig. 6. Research results for parameter window
Learning Rate. At low values of the learning rate the model is more accurate, but the learning speed is quite low; at the same time, the learning process is unstable at high learning rates. The best option is a value of 0.025 or 0.05 (Table 5 and Fig. 7).

Table 5. Research results for parameter learning rate
Learning rate  Intrinsic evaluation  Extrinsic evaluation
0.0125         0.932                 0.053
0.025          0.924                 0.056
0.05           0.902                 0.058
0.1            0.883                 0.048
Fig. 7. Research results for parameter learning rate
Sample. At low thresholds, informative words are more likely to be dropped. The maximum quality of the model according to both estimates is achieved at 1e−05; beyond that point the values drop sharply in both cases. This is because informative high-frequency words begin to be rejected with the same probability as non-informative ones (Table 6 and Fig. 8).

Table 6. Research results for parameter sample
Sample  Intrinsic evaluation  Extrinsic evaluation
0       0.5                   0.049
0.1     0.899                 0.064
0.01    0.899                 0.061
0.001   0.902                 0.055
0.0001  0.944                 0.049
1e−05   0.999                 0.044
1e−06   0.587                 0.046
1e−07   0.585                 0.045
1e−08   0.585                 0.042
1e−09   0.585                 0.043
Fig. 8. Research results for parameter sample
Negative. The higher the negative-sampling parameter, the higher the results for both the intrinsic and the extrinsic evaluation (Table 7 and Fig. 9).

Table 7. Research results for parameter negative
Negative  Intrinsic evaluation  Extrinsic evaluation
1         0.895                 0.057
2         0.904                 0.058
3         0.905                 0.059
5         0.903                 0.060
8         0.901                 0.063
10        0.901                 0.065
15        0.902                 0.068
Fig. 9. Research results for parameter negative
Comparison with Baseline Vectors. The optimal set of parameters in comparison with the baseline vectors is presented in Table 8.

Table 8. Result set of parameters
                 Size  Window  Min count  Negative  Learning rate  Sample
Baseline values  100   5       0          5         0.025          1e−03
Optimal values   25    5       5          15        0.05           1e−05
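Table 8's optimal values can be collected into one configuration; a hedged sketch using gensim-style parameter names (vector_size, window, min_count, negative, alpha, sample; sg=1 selects skip-gram in gensim >= 4). The training call itself is only shown as a comment, since the corporate corpus is not available here.

```python
# Optimal settings from Table 8, expressed as keyword arguments in the style
# of gensim's Word2Vec (names assume gensim >= 4.0; adjust for other toolkits).
optimal = dict(
    sg=1,             # skip-gram architecture (Fig. 3)
    vector_size=25,   # Size
    window=5,         # Window
    min_count=5,      # Min count
    negative=15,      # Negative
    alpha=0.05,       # Learning rate
    sample=1e-5,      # Sample
)
# model = gensim.models.Word2Vec(tokenized_records, **optimal)  # hypothetical call
print(optimal["vector_size"], optimal["negative"])
```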
When solving this type of problem, the Size, Window, and Min count parameters should be selected from the properties of the training data corpus. The Learning rate, Negative, and Sample parameters basically have the same effect regardless of the properties of the reference elements; their values should be chosen from general recommendations for solving applied problems.

Table 9. Vector comparison
Vectors            Intrinsic evaluation  Extrinsic evaluation
Word2Vec baseline  0.754                 0.037
Word2Vec optimal   0.904                 0.058
fastText           0.602                 0.021
As can be seen from Table 9, the tuned Word2Vec method gives a more accurate result than the more universal method (fastText) and captures the semantic relations of the directory elements in more detail.
5 Conclusions
As a result of this work, we presented a method of training word embeddings for master data in Industry 4.0. We proposed an effective framework for setting hyper-parameters and gave recommendations for choosing them (size, window, min count, negative, learning rate, sample) for better productivity in solving classification tasks with master data. The model with the chosen hyper-parameters is used in the Module of Intelligence Analysis (MIA). This module for the analysis of industrial text data reduces operational time and increases the quality of processes; the MIA is successfully used in systems for managing master data. Future work can include hyper-parameter tuning for bigger datasets and experiments with the usage of different domain text corpora. It could also be measured how many master data elements are needed to adapt to a new domain.
References 1. Bollegala, D., Maehara, T., Kawarabayashi, K.I.: Unsupervised cross-domain word representation learning. arXiv preprint arXiv:1505.07184 (2015) 2. Chiu, B., Crichton, G., Korhonen, A., Pyysalo, S.: How to train good word embeddings for biomedical NLP. In: Proceedings of the 15th Workshop on Biomedical Natural Language Processing, pp. 166–174, August 2016 3. Biswas, E., Vijay-Shanker, K., Pollock, L.: Exploring word embedding techniques to improve sentiment analysis of software engineering texts. In: Proceedings of the 16th International Conference on Mining Software Repositories, pp. 68–78. IEEE Press, May 2019
4. Nooralahzadeh, F., Øvrelid, L., Lønning, J.T.: Evaluation of domain-specific word embeddings using knowledge resources. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018), May 2018 5. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013) 6. Fanaeepour, M., Makarucha, A., Lau, J.H.: Evaluating word embedding hyper-parameters for similarity and analogy tasks. arXiv preprint arXiv:1804.04211 (2018) 7. Wang, B., Wang, A., Chen, F., Wang, Y., Kuo, C.C.J.: Evaluating word embedding models: methods and experimental results. arXiv preprint arXiv:1901.09785 (2019) 8. Rather, N.N., Patel, C.O., Khan, S.A.: Using deep learning towards biomedical knowledge discovery. Int. J. Math. Sci. Comput. (IJMSC) 3(2), 1–10 (2017). https://doi.org/10.5815/ ijmsc.2017.02.01 9. Adamuthe, A.C., Jagtap, S.: Comparative study of convolutional neural network with word embedding technique for text classification. Int. J. Intell. Syst. Appl. (IJISA) 11(8), 56–67 (2019). https://doi.org/10.5815/ijisa.2019.08.06 10. Chatterjee, S., Jose, P.G., Datta, D.: Text classification using SVM enhanced by multithreading and CUDA. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 11(1), 11–23 (2019). https://doi.org/10.5815/ijmecs.2019.01.02
Development of an Intelligent Control System Based on a Fuzzy Logic Controller for Multidimensional Control of a Pumping Station
Artur Sagdatullin
Kazan National Research Technical University named after A.N. Tupolev-KAI, LF KNITU-KAI, 22 Lenin Str., Leninogorsk 423250, Russian Federation
[email protected]
Abstract. This article discusses the process of developing, synthesizing and optimizing a new control system for a technologically important facility. The technological processes of collecting, preparing, producing and transporting oil, associated gas and water consume up to 70% of all energy. Therefore, increasing the energy efficiency both of the technological processes themselves and of their component objects using methods and approaches of intelligent control is a relevant issue. To this end, an intelligent controller has been developed for multidimensional logical control of a pumping station. This control system is part of an intelligent control system for a group of pumping units in a series-parallel switching circuit. The main control unit is a multidimensional fuzzy logic controller with three control loops. To compensate for the mutual influence of the control levels of the fuzzy controller, a three-loop system of logical control has been developed. The implementation of this system as an additional block of production rules for the fuzzy controller made it possible to achieve continuity of the process and stabilization of the fluid level at a given point with a small error.
Keywords: Artificial intelligence · Control theory · Automation · Fuzzy logic controller · Pumping station · Electric drive · Frequency converter · Induction motor
1 Introduction
Pumps and pumping stations form the basis of the technological processes for collecting, maintaining reservoir pressure, transporting and preparing oil and gas in the fields. These processes ensure the functioning of the most important oil production processes by performing the following basic operations: measuring the quantity and quality of well products at automated group metering units (AGMU), providing metrological indicators for oil and gas metering units, collecting and preparing well products at booster pump stations (BPS), preparing produced water and associated gas at the preliminary water discharge installations (PWDI) and the integrated gas treatment plant units (GTPU), pumping liquid to cluster pumping stations (CPS), and transporting commercial oil to oil treatment plants (OTP).
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 76–85, 2020. https://doi.org/10.1007/978-3-030-39216-1_8
The transportation and distribution of water in wells is an important problem for the oil and gas field, since pumping equipment is very energy intensive. A solution to this problem is possible by various methods: for example, in [4] a mixed-integer nonlinear programming problem was proposed to minimize total transport costs. For solving such models, [5] proposed an evolutionary algorithm with decomposition. In [6], an algorithm was proposed for solving the pump planning problem based on the grid-search Hooke–Jeeves method. In [7], the authors proposed optimization of the water distribution network; topological and dimensional variables were introduced when solving the problem by evolutionary algorithms. In [8], a solution to the problem of optimizing the operation of pumps with adjustable and enabled/disabled drives is proposed, based on the Bayesian optimization method. Further examples of pump planning optimization methods can be found in [1–9]. Preliminary analysis showed that solving the problem of improving the energy efficiency of the processes considered in this article is relevant and important. Another way to reduce the cost of transportation and oil treatment operations is based on optimizing the operation of the pumps themselves. These issues can be solved based on the prediction of their operating modes, as well as by direct control methods based on automation schemes with various types of controllers, such as proportional-integral-derivative and fuzzy controllers. However, in this case the important issue is to build the correct hierarchy of control schemes and algorithms. For example, in [10] it is proposed to forecast production based on a segregated-flow model, which simplifies the prediction model. For flow control, closed-loop torsion valve control is considered in [11]. Zhang et al. (2019) discuss methods for proportional-integral-derivative (PID) control of hydraulic subsystems, including pumping units and valves. In [13], the application of an adaptive PID controller system based on swarm optimization is reviewed. Lallouani, Tuncer, Mohammed and Tripathy et al. examined the use of fuzzy logic in object control systems and systems of different types [10–14]. According to this analysis, it is relevant and important to solve the problem of improving the energy efficiency of these technological processes. To do so, it is proposed to use an intelligent control system, because control schemes based on proportional-integral-derivative (PID) controllers are not effective enough when managing complex oil and gas production objects [15–20].
2 The Technological Process as an Object of Control
The pumping station includes high-voltage asynchronous electric motors (AM), a low-voltage frequency converter, step-up and step-down transformers, an integrated controller and a tank. The technological scheme of the pumping station under consideration is presented in Fig. 1.
Fig. 1. Technological scheme of the pumping station of the transport and oil treatment system
The technological scheme of the pumping station of the transport and oil treatment system includes the following main controlled and observed parameters: P1, ω1 and Qout1 are the pressure, angular velocity and flow rate, respectively, at the outlet of the main pump; P2, ω2 and Qout2 are the same quantities at the output of the backup pump; PoutP and QoutP are the pressure and flow rate on the flow line of the pumping station. The scheme itself consists of a tank (separator E3), pipe fittings including non-return valves, adjustable and non-adjustable valves, and two electric drives with asynchronous motors (induction machines) of the centrifugal pumps (main P-1 and backup P-2). The emulsion enters the tank through the valves Qin1, Qin2 and Qin3; opening and closing the valves, with QinP the main one, feeds the emulsion to the pumps, and controlling the valves purposefully changes the output characteristic of the pumps. The second level of the model hierarchy is represented by the output pumping-station variables: the fluid level (LE3) in the tank, the fluid pressure (PoutP) at the flow line, and the flow rate QoutP (emulsion flow rate at the main outlet valve).
3 Development of a Conceptual Model for Multidimensional Control of a Pumping Station
Figure 2 shows a conceptual model of the multidimensional control of a pumping station, which has a hierarchical structure. At the first level of the hierarchy, the input variables are Q33 = QinP, the flow rate through the inlet valves of the emulsion entering the tank; Q20 and Q21, the flow rates of the emulsion at the output of pumps P-1 and P-2; and GS1–GS6, sensors for monitoring the tightness of the pipelines and the reservoir.
Fig. 2. Conceptual model of multidimensional control of a pumping station
Moreover, the flow rate of the oil emulsion at the Q33 valve affects the head and flow rate of the oil emulsion at the outlet of the P-1 and P-2 pumps, i.e. the variables P1, Qout1 and P2, Qout2, and also the level LE3 in the tank. The dependence of the pump head and flow rate on the angular velocity of the electric motor is described by the following expression [21]:

H = H_f\,\frac{\omega^2}{\omega_N^2} - S_f Q^2,

where $H_f$ is the fictitious static head (meters); $S_f$ is the fictitious hydraulic resistance of the pump (s²/m⁵); $Q$ is the pump flow (the volume of fluid pumped per unit time, m³/s); $\omega$ and $\omega_N$ are the current and nominal rotation speeds of the pump shaft, respectively. By its nature, LE3 = f(Q33, Q20, Q21) is a function of three arguments; therefore, the pumping station can be characterized as a multiply connected nonlinear control object of dimension three. The main adjustable parameters of the pumping station are: in the 1st and 2nd loops, the angular velocity of the rotor of the main and backup pump, respectively, and in the 3rd loop, a controlled valve on the flow line. The electric motors and the K-33 valve are controlled by the signals of a microprocessor controller (MPC), which receives information from the sensors: the angular velocities of the electric motors, the position of the working body of the K-33
valve (valve closure in %), and the pressure and liquid level LE3 in the tank. The level in the tank should be maintained at 2.5 m (tank height 5 m, length 10.2 m, volume 200 m³).
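The head expression above can be checked numerically; a minimal sketch, where the coefficient values (H_f, S_f) and operating point (ω, Q) are illustrative assumptions, not figures from the article:

```python
def pump_head(omega, omega_n, h_f, s_f, q):
    """Head of a centrifugal pump at angular speed omega (see the expression above):
    H = H_f * (omega / omega_n)**2 - S_f * Q**2
    """
    return h_f * (omega / omega_n) ** 2 - s_f * q ** 2

# illustrative numbers: H_f = 120 m, S_f = 5e4 s^2/m^5, nominal speed, Q = 0.02 m^3/s
print(pump_head(omega=157.0, omega_n=157.0, h_f=120.0, s_f=5e4, q=0.02))  # ~100 m (120 - 20)
```

Reducing ω below ω_N shrinks the static term quadratically, which is exactly why variable-frequency drives save pumping energy at partial flow.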
4 Development of the Logical Diagram and Process Control Algorithm
The logical diagram of the process control algorithm is presented in Fig. 3. It follows that the controllers ω1, ω2, and K-33 are turned on when the conditional transition operators (LE3 > 2.5 m) and (LE3 ≤ 2.5 m) are true. For high-quality control of the pumping station, a necessary condition is a quick response of the pump speed and of the movement of the controlled valves to external influences, ensuring the specified values of the controlled parameters (in this case, flow, pressure and level). To maintain a constant flow rate at the pumping station, a constant pressure (head) in the pipeline is maintained.
Fig. 3. Logical diagram of the control algorithm of the technological process of the pumping station
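The exact switching logic of Fig. 3 is not fully recoverable from the text, but the threshold test named above can be sketched as a toy function; the action strings and loop priorities are placeholders, not the article's notation:

```python
def select_action(level_m, setpoint=2.5):
    """Toy sketch of the level-threshold branch in the control algorithm:
    above the setpoint the speed loops (omega1, omega2) and the K-33 valve
    act to draw the level down; at or below it the current point is held.
    The concrete actions are assumptions for illustration only."""
    if level_m > setpoint:
        return "increase omega1/omega2, open K-33"
    return "hold setpoint"

print(select_action(2.8))  # level too high: speed up pumping
print(select_action(2.4))  # level at/below setpoint: hold
```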
In the vast majority of cases, PID controllers are used to control the pressure in the pipeline. However, a continuous change in the oil supply to the pumping station in question leads to fluctuations in flow rate and pressure, both in the pipeline and at the OTP. This uncertainty does not allow obtaining analytical dependences and an adequate mathematical model of the system. Therefore, control systems based on PID controllers require periodic and labor-intensive tuning, which leads to increased operating costs and energy overruns [8–16]. To increase the efficiency of the control process, it is proposed to use a multidimensional discrete logical controller, whose feature is the representation of the input and output variables, as well as of the compensation functions for the influence of the control loops, by a set of discrete terms, that is, terms with a rectangular membership function [18, 20]. Figure 4 presents the interpretation of the adjustable parameter LE3 as a combination of 16 precise terms LE31, LE32, …, LE315, LE316 (for example, TLE31 < LE3 < TLE316, etc.). The interpretation of the remaining adjustable parameters Qout, Hout is similar. The analytical expression of the term set for the variable LE3 has the following form:

T(L_{E3}) = \sum_{i=1}^{16} L_{E3}^{\,i}\bigl((i-1)\cdot 0.25 \le L_{E3} < i\cdot 0.25\bigr),

where $i$ is the number of a discrete term.
Fig. 4. Placing discrete terms of the level of oil emulsion in the tank on the universal numerical axis
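The discrete terms place the level on a uniform 0.25 m grid (16 terms covering 0–4 m, assuming the grid starts at zero); a minimal sketch of the term index for a given level:

```python
def discrete_term(level_m, step=0.25, n_terms=16):
    """Index i of the discrete term containing the level:
    (i - 1) * step <= L_E3 < i * step, for i = 1..n_terms."""
    if not 0.0 <= level_m < n_terms * step:
        raise ValueError("level outside the term-set range")
    return int(level_m // step) + 1

print(discrete_term(2.5))   # term 11: 2.5 <= L_E3 < 2.75
print(discrete_term(0.1))   # term 1
```

The 2.5 m setpoint thus falls on the boundary of term 11, consistent with the ±0.25 m accuracy stated later in the article.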
The advantage of the developed controller over classical PID controllers and typical fuzzy controllers, for the automation of complex, nonlinear and multiply connected control objects of high order represented by a verbal model, is its low error and high speed. These characteristics make it possible to construct a multidimensional fuzzy controller with good compensation of the mutual influence of the control loops.
5 Design of the Experiment: Data Gathering for the Introduction of Compensation Functions into the Program of the Multidimensional Regulator
It is proposed to obtain the initial data for developing such a compensator from an experiment that records, for each control loop of the multidimensional discrete logical controller, two input-output characteristics: in the autonomous and in the multiply connected modes of operation of the loops. With regard to the control of the pumping station, the autonomous mode means the following: at the onset of the operating mode, the liquid level in the tank is maintained by only one of the three regulators (for example, pump P-1), while the other regulators are turned off; thereby their influence on the studied loop is minimized. In the multiply connected mode, all the regulators work normally; therefore, the input-output characteristic of the controller under such conditions takes into account the influence of the remaining loops of the multidimensional controller on it. In accordance with the experimental design, the input-output characteristics were obtained for the liquid-level regulator in the tank (Fig. 5).
Fig. 5. “Input-output” characteristics of the LE3 controller in stand-alone (LE3a) and multiply connected (LE3m) operating modes with the influence function TLE3v = TLE3m − TLE3a
Figure 5 shows the "Input-output" characteristics for the stand-alone TLE3a = f(Q20) and multiply connected TLE3m = f(Q20) modes of operation of the controller related to the pump motor P-1 with angular speed of rotation ω1 and flow rate Q20. Their difference along the ordinate axis is represented by the function of the influence of all other control loops on the TLE3 circuit: TLE3v = TLE3m − TLE3a. From this it follows that the value of the LE3 parameter is maintained at 2.5 m with an accuracy of ±0.25 m, and to compensate for the influence of the other control loops on the LE3 parameter, the discrete terms LE3k_1, …, LE3k_16 are introduced in the consequents of the production rules. The systems of production rules related to the adjustable parameters LE3, PoutP, QoutP are constructed according to a similar structure.
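The influence function is the pointwise difference of the two experimentally measured characteristics. A minimal sketch, assuming both characteristics are sampled on the same ascending flow-rate grid and interpolated linearly between samples; the grid and values below are hypothetical placeholders, not the paper's data.

```python
from bisect import bisect_left

def influence_function(q, q_grid, t_autonomous, t_multiconnected):
    """Piecewise-linear estimate of T_LE3v(q) = T_LE3m(q) - T_LE3a(q) from
    two measured 'Input-output' characteristics sampled on q_grid."""
    def interp(xs, ys, x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        j = bisect_left(xs, x)
        frac = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
        return ys[j - 1] + frac * (ys[j] - ys[j - 1])
    return interp(q_grid, t_multiconnected, q) - interp(q_grid, t_autonomous, q)

# Hypothetical sampled characteristics (illustration only):
q_grid = [0.0, 10.0, 20.0, 30.0]
t_a = [0.0, 1.0, 2.0, 3.0]    # autonomous mode
t_m = [0.0, 1.4, 2.9, 4.5]    # multiply connected mode
print(influence_function(15.0, q_grid, t_a, t_m))  # -> 0.65
```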
6 Results and Discussion

The introduction of compensation functions has changed the structure of the program that implements the multidimensional regulator of the pumping station. In such a program, for each control circuit, along with the regulatory schemes of production rules (RSPR(Q20), RSPR(Q21), RSPR(Q33)) there are compensating systems of production rules (CSPR(Q20), CSPR(Q21), CSPR(Q33)). This minimizes the mutual influence of the contours of the multidimensional discrete logical controller and increases the quality indicators of the regulation of the liquid level in the tank, as in the program implementations [18, 20]. Replacing the regulating (ω1, ω2, Q33) and adjustable (Q20, Q21, LE3, Q33, PoutP) parameters of the pumping station with combinations of discrete terms allows us to construct a system of production rules for a three-loop discrete controller. For a circuit with adjustable parameter LE3, such a system has the following structure:

IF LE3 = LE3_12 ∨ LE3_13 ∨ … ∨ LE3_15 ∨ LE3_16, THEN Q20 = 0;
IF LE3 = LE3_11, THEN Q20 = TQ20_1 ∨ TQ20k_1;
…
IF LE3 = LE3_2, THEN Q20 = TQ20_15 ∨ TQ20k_15;
IF LE3 = LE3_1, THEN Q20 = TQ20_16 ∨ TQ20k_16.

According to the obtained results and the "Input-output" characteristics, classic fuzzy controllers with two or more circuits exhibit errors that extend over the universal numerical set. This feature allows the mutual influence of the control loops to be compensated more accurately using a compensating system of production rules. The application of the proposed approach and the developed regulator to regulating the liquid level in the separator tank made it possible to narrow the flow-rate maintenance error zone from 28% to 5%, minimizing the error by 23%.
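Such a rule system is naturally encoded as a lookup table from the antecedent term index to the consequent. The sketch below fills in only the rules explicitly stated in the text; the intermediate rows are elided in the source, so they are left undefined rather than guessed.

```python
# Consequents explicitly stated in the rule system; intermediate rows are
# elided in the source text and deliberately left undefined here.
RULES = {16: "0", 15: "0", 14: "0", 13: "0", 12: "0",
         11: "TQ20_1 v TQ20k_1",
         2:  "TQ20_15 v TQ20k_15",
         1:  "TQ20_16 v TQ20k_16"}

def q20_consequent(le3_term):
    """Look up the consequent of the LE3-loop rule for a given term index."""
    if not 1 <= le3_term <= 16:
        raise ValueError("term index must lie in 1..16")
    return RULES.get(le3_term)  # None for rows the source does not spell out
```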
7 Conclusions

This paper proposes a control system for a pumping station based on a three-dimensional discrete logic controller with compensation for the mutual influence of the control loops in the form of an additional system of production rules, which made it possible to stabilize the fluid level at a given point of 2.5 m with an absolute error of ±0.2 m. This improved the quality of the oil emulsion at the oil treatment unit and the central collection point. The control system is designed as part of the intelligent control of a group of pumping stations in a series-parallel switching circuit, controlled by a multidimensional fuzzy logic controller with three control loops. The representation of the defining information and output characteristics (inputs and outputs of the control system) in the form of a system of production rules with discrete terms improves the accuracy and speed of existing logical control systems
based on fuzzy logic or proportional-integral-differential principles. This is due to the use of discrete terms and compensation logic in the control function and adjustable parameters. Therefore, for three-loop logical control based on a fuzzy controller, a system has been developed to compensate the mutual influence of the control loops. Finally, the simulation results show that the proposed multidimensional fuzzy logic controller effectively minimizes the mutual influence of the contours, increasing the quality indicators of fluid level control in the tank-separator. The implementation of this system in the form of an additional block of production rules for the fuzzy controller made it possible to achieve these results. It can also be noted that the proposed system makes it possible to solve one of the most important problems associated with the transportation of oil in oil fields: maintaining the continuity of the process, which under current conditions with the existing control system was almost impossible. It should also be noted that, due to the impossibility of stopping a real-life facility, the experiments conducted as part of the research were carried out using experimental data of the pumping station and a laboratory installation; as a result, not all characteristics could be identified from experimental data. The main disadvantages of the proposed approach to the development of intelligent controllers based on multidimensional fuzzy control with discrete terms include the need to describe a larger number of possible situations. As a result, this brings the proposed system closer to logical control systems and requires 1.5–2 times more memory of the controller that controls the process. On the other hand, this gives significant advantages in the speed and accuracy of the system.
In future work, since each pumping station has its own uncertain model-equation parameters to be estimated, tuning of the proportional-integral-differential controller by fuzzy logic will be introduced to address the quasi-linearity problem. The developed multidimensional three-loop control system based on a fuzzy controller with discrete terms can be extended to the control of nonlinear systems with unknown parameters, in the form of a classical computational model with an elaborated inference system and an optimal loop-dimension scheme.
References

1. Devold, H.: Oil and Gas Production Handbook. An Introduction to Oil and Gas Production, Transport, Refining and Petrochemical Industry, p. 162. ABB Oil and Gas, Oslo (2013) 2. Szilas, A.P.: Production and Transport of Oil and Gas, Second Completely Revised Edition. Part B: Gathering and Transportation, p. 353. Elsevier, Amsterdam (1986) 3. Sagdatullin, A.M.: Development and modeling of automation and control system of Sucker-Rod Well Pump with beam drive. Chem. Pet. Eng. 52(1–2), 29–32 (2016) 4. Wang, B., Liang, Y., Yuan, M.: Water transport system optimisation in oilfields: environmental and economic benefits. J. Clean. Prod. 237 (2019). https://doi.org/10.1016/j.jclepro.2019.117768 5. Azadeh, A., Shafiee, F., Yazdanparast, R., Heydari, J., Fathabad, A.M.: Evolutionary multiobjective optimization of environmental indicators of integrated crude oil supply chain under uncertainty. J. Clean. Prod. 152, 295–311 (2017). https://doi.org/10.1016/j.jclepro.2017.03.105
6. Bagirov, A.M., Barton, A.F., Mala-Jetmarova, H., Al Nuaimat, A., Ahmed, S.T., Sultanova, N., Yearwood, J.: An algorithm for minimization of pumping costs in water distribution systems using a novel approach to pump scheduling. Math. Comput. Model. 57(3–4), 873– 886 (2013). https://doi.org/10.1016/j.mcm.2012.09.015 7. Bureerat, S., Sriworamas, K.: Simultaneous topology and sizing optimization of a water distribution network using a hybrid multiobjective evolutionary algorithm. Appl. Soft Comput. J. 13(8), 3693–3702 (2013). https://doi.org/10.1016/j.asoc.2013.04.005 8. Candelieri, A., Perego, R., Archetti, F.: Bayesian optimization of pump operations in water distribution systems. J. Glob. Optim. 71(1), 213–235 (2018). https://doi.org/10.1007/ s10898-018-0641-2 9. Zaghian, A., Mostafaei, H.: An MILP model for scheduling the operation of a refined petroleum products distribution system. Oper. Res. 16(3), 513–542 (2016) 10. Male, F.: Using a segregated flow model to forecast production of oil, gas, and water in shale oil plays. J. Petrol. Sci. Eng. 180, 48–61 (2019) 11. Jafari, R., Razvarz, S., Vargas-Jarillo, C., Yu, W.: Control of flow rate in pipeline using PID controller. In: Proceedings of the 2019 IEEE 16th International Conference on Networking, Sensing and Control, ICNSC 2019, pp 293–298 (2019) 12. Zhang, P., Li, Y.: Research on control methods for the pressure continuous regulation electrohydraulic proportional axial piston pump of an aircraft hydraulic system. Appl. Sci. (Switzerland) 9(7), 1376 (2019) 13. Zhang, N., Li, C., Lai, X.: Design of a multi-conditions adaptive fractional order PID controller for pumped turbine governing system using multiple objectives particle swarm optimization. In: Proceedings - 2019 4th International Conference on Electromechanical Control Technology and Transportation, ICECTT 2019, pp. 39–44 (2019) 14. 
Lallouani, H., Saad, B., Letfi, B.: DTC-SVM based on interval Type-2 fuzzy logic controller of double stator induction machine fed by six-phase inverter. Int. J. Image Graph. Signal Process. (IJIGSP) 11(7), 48–57 (2019). https://doi.org/10.5815/ijigsp.2019.07.04 15. Tuncer, T., Dogan, S., Akbal, E.: Discrete complex fuzzy transform based face image recognition method. Int. J. Image Graph. Signal Process. (IJIGSP) 11(4), 1–7 (2019). https://doi.org/10.5815/ijigsp.2019.04.01 16. Mohammed, R.: Quadrotor control using advanced control techniques. Int. J. Image Graph. Signal Process. (IJIGSP) 11(2), 40–47 (2019). https://doi.org/10.5815/ijigsp.2019.02.05 17. Tripathy, B., Bhambhani, U.: Properties of multigranular rough sets on fuzzy approximation spaces and their application to rainfall prediction. Int. J. Intell. Syst. Appl. (IJISA) 10(11), 76–90 (2018). https://doi.org/10.5815/ijisa.2018.11.08 18. Kayashev, A., Muravyova, E., Sharipov, M., Emekeev, A., Sagdatullin, A.: Verbally defined processes controlled by fuzzy controllers with input/output parameters represented by set of precise terms. In: Proceedings of 2014 International Conference on Mechanical Engineering, Automation and Control Systems, MEACS 2014 (2014). https://doi.org/10.1109/meacs.2014.6986847 19. Abdelwanis, M., El-Sehiemy, R.: A fuzzy-based controller of a modified six-phase induction motor driving a pumping system. Iran. J. Sci. Technol. Trans. Electr. Eng. 43(1), 153–165 (2019) 20. Sagdatullin, A., Muravyova, E., Sharipov, M.: Modelling of fuzzy control modes for the automated pumping station of the oil and gas transportation system. IOP Conf. Ser.: Mater. Sci. Eng. 132(1) (2016). https://doi.org/10.1088/1757-899x/132/1/012028 21. Leznov, B.: Energy saving and adjustable drive in pump and blower units, p. 360. Energoatomizdat, Moscow (2006)
From Algebraic Biology to Artificial Intelligence

Georgy K. Tolokonnikov¹ and Sergey V. Petoukhov²

¹ Federal Scientific Agro-Engineering Center VIM, Russian Academy of Sciences, 1st Institute Passage, 5, Moscow, Russia
[email protected]
² Mechanical Engineering Research Institute, Russian Academy of Sciences, M. Kharitonievsky pereulok, 4, Moscow, Russia
[email protected]
Abstract. Algebraic biology is engaged in mathematical modeling of the structural features of the molecular genetic system and the inherited properties of organisms (phyllotaxis, inheritance of traits in Mendelian genetics, the Weber-Fechner law, etc.) without a detailed analysis of genome reading processes. At the same time, algebraic biology is a deeply systemic science. The consideration of the human person as a system, including the subsystem of his intellect, modeled to some degree by artificial intelligence, leads to the necessity of using the methods of the categorical theory of systems in addition to classical algebraic methods. Examples that are not reducible to the set-theoretic approach are given; generalizations of the U-numbers of matrix genetics to the categorical case are outlined.

Keywords: Artificial intelligence · Neural networks · Genetic code · Tensor product · Systems theory · Categories · Topos · DNA
1 Introduction

One of the main tasks of algebraic biology, the beginnings of which were developed in the works [1–5] (many other works are available at http://petoukhov.com), is to identify structural relationships of the inherited physiological properties of organisms with the peculiarities of the genetic coding system by strict mathematical methods. Of particular interest is the creation, by means of the achievements of algebraic biology, of new approaches to artificial intelligence (AI) systems that reproduce a person's natural abilities for the phased development of intellectual activity. In modern science, the available methods of AI are successfully used to analyze DNA and other issues of molecular biology. For example, in [6] a technology was proposed for predicting DNA and protein binding sites using convolutional neural networks. An improved model for recognizing splice sites was implemented in software (http://bioit2.irc.ugent.be/splicerover/). Other achievements of AI in this rapidly developing field can be found in the review [7]. Our efforts in the field of algebraic biology are aimed at helping to solve key problems of modern biology: "Our knowledge of gene functions today usually rests on information such as "gene X encodes (or is involved in the coding) the
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 86–95, 2020. https://doi.org/10.1007/978-3-030-39216-1_9
characteristic Y"… The lack of information about what happens in the interval between genes and the phenotype applies to all genes and all traits. The worst thing is that it is unclear how to fill it. We need a new scientific revolution, similar to the one made by the structure of DNA in genetics" [8]. Algebraic modelling of the structural features of the content of the genetic molecules of DNA and RNA generates heuristic associations that connect biology with other fields of science and give hope for including biology, in the future, among the developed mathematical sciences. For example, in algebraic biology, the alphabets of nitrogenous bases of DNA (adenine A, cytosine C, guanine G and thymine T) are represented as symbolic matrices (Fig. 1). Taking into account some characteristics of these nitrogenous bases, these matrices turn into numerical Fibonacci matrices, whose powers give the series of Fibonacci numbers 0, 1, 1, 2, 3, 5, 8, 13, 21, …; these numbers are realised in known phenomena of inherited phyllotaxis. But Fibonacci numbers are also actively used in computer science and many other areas.
Fig. 1. Genetic matrices [C A; G T] and [T A; C G].
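The claim that powers of the numeric genetic matrix generate the Fibonacci series can be checked directly. In this sketch the symbolic matrix is replaced by the standard Fibonacci matrix [1 1; 1 0], one numeric representation consistent with the text; the exact numeric substitution used in [1, 2] may differ.

```python
def mat_mul(a, b):
    """Product of two 2x2 integer matrices."""
    return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
            [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]

F = [[1, 1], [1, 0]]          # numeric Fibonacci matrix
P = [[1, 0], [0, 1]]          # identity
fibs = []
for n in range(1, 9):
    P = mat_mul(P, F)         # P = F**n = [[F(n+1), F(n)], [F(n), F(n-1)]]
    fibs.append(P[0][1])
print(fibs)                   # -> [1, 1, 2, 3, 5, 8, 13, 21]
```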
One of many important inherited properties is the implementation of the Fibonacci numbers 5, 8, 13, 21 in the microtubules of the cytoskeleton consisting of tubulin dimers (5, 8 and 21 are the quantities of right- and left-screw structures of the tubules of the outer layer, and 13 is the number of tubulin dimers in the cross section of microtubules). Note that interest in the nature of tubulin microtubules is very high, including from the point of view of AI [9], since it underlies the indirectly confirmed hypothesis of R. Penrose and S. Hameroff about the quantum nature of consciousness [10]. In particular, the involvement of tubulin microtubules in the phenomenon of consciousness is manifested in the fact that under the influence of anesthesia, consciousness is turned off along with the activity of microtubules (a transition between two conformations of tubulin dimers), while the removal of anesthesia restores the consciousness and activity of microtubules [11]. One of the interesting results of algebraic biology is the algebraic modeling of the rules of inheritance of traits in Mendelian genetics based on Punnett squares. Heredity is the passing of traits from parent to offspring. Traits are controlled by genes. The different forms of a gene for a certain trait are called alleles. There are two alleles for every trait. Alleles can be dominant or recessive. Each cell in an organism's body contains two alleles for every trait. One allele is inherited from the female parent and one allele is inherited from the male parent. The method of Punnett squares, known since 1905, is a simple method for predicting the ways in which alleles can be combined. It is a popular method in most textbooks of Mendelian genetics. Punnett squares represent alphabets of genotypes or, more precisely, alphabets of possible combinations of male and female gametes in Mendelian crosses of organisms from the viewpoint of a certain number of inherited traits taken into account.
A possibility was revealed [3] of effectively interpreting a set of Punnett squares for polyhybrid crosses not as tables but as
square matrices of Kronecker products of (2 × 2)-matrices for monohybrid crosses. Based on Kronecker multiplication of matrices, this approach gives a simple algebraic method for constructing Punnett squares for complex cases of multi-hybrid crosses. In addition, it was shown [3] that dyadic-shift decompositions of these "Punnett matrices" lead in some cases to a certain method of classifying different subsets of combinations of alleles from male and female gametes. A living organism is a complex chorus of coordinated oscillatory processes. Since ancient times, chronomedicine has claimed that all diseases are caused by a violation of the coordination of these oscillatory processes. Algebraic analysis of the parametric features of the alphabets of DNA and RNA revealed their structural connection with the matrix formalisms of the theory of resonances of vibrational systems having many degrees of freedom. These results formed the basis of the fundamental concept of multiresonance genetics, according to which resonant interactions play a key role in the organization of molecular genetic structures and trait inheritance systems [4]. In particular, this approach made it possible to propose an algebraic model of the basic Weber-Fechner psychophysical law [4]. In algebraic biology, data have been obtained on the relationship of the parametric structure of DNA molecules with musical harmony, more precisely, with the relations of the Pythagorean musical scales [1, 2]. This gives new material to ancient ideas about the meaning of musical harmony in the structure of nature, and also allows us to rethink the general harmony of biological bodies and people's love for music and musical creativity. At the same time, a whole network of simultaneously existing parameter systems is discovered in the structure of chain-like DNA molecules, the relations between which are associated with the Pythagorean musical scales.
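The Kronecker-product construction of Punnett squares described above can be sketched symbolically, with string concatenation standing in for entry-wise multiplication (a toy illustration, not the exact notation of [3]):

```python
def kron_str(a, b):
    """Kronecker product of two symbolic matrices, combining entries
    by string concatenation instead of numeric multiplication."""
    return [[ra + rb for ra in row_a for rb in row_b]
            for row_a in a for row_b in b]

mono_Aa = [["AA", "Aa"], ["aA", "aa"]]   # monohybrid Punnett square, Aa x Aa
mono_Bb = [["BB", "Bb"], ["bB", "bb"]]
di = kron_str(mono_Aa, mono_Bb)          # 4x4 square for the dihybrid cross
print(di[0])                             # -> ['AABB', 'AABb', 'AaBB', 'AaBb']
```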
A DNA molecule can be considered as a polyatomic structure having the form of a braid woven from many threads or chains. One of the chains included in the common DNA braid is a chain of hydrogen bonds, another is a chain of rings of nitrogenous bases, a third is a chain of individual types of atoms, etc. Each of these chains has its own Pythagorean sequences of parameters, that is, its own individual "melodies". As a result, from the stated "musical Pythagorean" point of view, the whole DNA molecule appears as a carrier of these parallel quint sequences of discrete parameters, or "melodies". Their combination forms a certain general polyphonic melody (quint polyphony), which, if desired, can be reproduced on musical instruments or through polyphonic singing and which is specific to each gene. One can not only listen to this music but also compose it by playing new musical instruments [1]. The direction of genetic music that arose on the basis of algebraic analysis of DNA content is being intensively developed at the Moscow State Conservatory [12]. The well-known role of music in the work of many representatives of the exact sciences should also be noted. We think that some aspects of musical harmony will undoubtedly be used in future AI modeling of human intelligence. Algebraic biology has become an independent science with its own subject, tasks and methods; a certain topic within its framework covers the main tasks of AI. In this report, a modern systems approach is considered that has found a natural application in algebraic biology.
2 Category Systems in Algebraic Biology

The human organism is a complex biological system with subsystems of thinking, the modeling of which is the main goal of AI. A systematic approach dictates the need for models of a person and his subsystems in the form of algebraic systems. For algebraic biology, a similar broader thesis is that a DNA molecule should be considered in conjunction with the mechanisms for reading biological information, which are realized, in particular, by the cell. A suitable system approach must be chosen from among the many directions of the mathematical theory of systems [13, 14], the categorical systems used in the software industry [15], and the categorical theory of systems [19–24]. In addition to the requirement of mathematical rigor, in our case of biological systems a categorical language is required for a systematic approach. According to [16, 17], a system is most often understood as a relation on the Cartesian product of the set of inputs and the set of outputs (a black box); a generalization of this definition to process systems [17] covers most of the many definitions of systems in the traditional mathematical theory of systems. In the software development industry, one of the key tasks is the assembly of individual software modules into a single program (system); a categorical approach to systems has arisen here, which we explain in more detail for comparison with the approach of categorical systems theory [19–24]. Let C be a category with collections of objects and arrows Ob(C) and Ar(C), respectively, and let UC be its graph (U is the forgetful functor from the category of categories CAT to the category of graphs Grph); let G be some graph called a diagram scheme. Then a diagram D is a morphism of graphs D: G → UC. The universal cone over a diagram is called the limit of the diagram. To define a system and its properties according to [15], it is important to consider diagrams as a category.
The class [G, C] of all diagrams in the category C with a fixed scheme G can be turned into a category by defining morphisms f as follows: f = {f_i}, where D_i = D(i) and, for each arrow m: i → j of the scheme, f_j ∘ D(m) = D′(m) ∘ f_i. Using the graph G, we can construct a free category; the diagrams then turn into functors, and the category considered instead of the graph G is called the index category. Let a category C and an index category D be given. A system is a diagram as a functor S: D → C. In addition to representing systems as diagram functors, the approach of [15] models the behavior and interaction of systems using the limits of diagrams. If the diagram of the system S ∈ DC has a limit, then it is called the behavior of the system. The interaction of systems is defined as a diagram S: X → DC in the category DC of diagrams; if the indicated diagram has a (co)limit, then it is called the result of the interaction of the systems. There is no concept of a system-forming factor (see below) in the approach of [15], nor in the case of the set-theoretic definition of a system ("black box") [13, 14]. However, for biological systems, simply fixing the inputs and outputs of the system is fundamentally insufficient. In his theory of functional systems, the neuroscientist P. K. Anokhin introduced the notion of a "system-forming factor", which is responsible for the process of assembling a system from subsystems, corresponding to the result that the functional system will be seeking. The specified assembly process is based on the postulated principle of building a system in motion from the whole to the parts. In addition to the system-forming factor, in the theory of functional systems the principles
of isomorphism of systems and their hierarchy have been postulated. In fact, the theory of functional systems implicitly adopts a categorical approach, and its numerous developed schemes of functional systems become, after appropriate editing, category diagrams. The categorical theory of systems [16–21] uses convolutional polycategories, developed by the first author of this article for other purposes but ideally suited as a system language for a categorical generalization of the theory of functional systems, of the system-forming factor, and of the principles of isomorphism and hierarchy for systems. Polycategories were introduced in 1975 [23]; we call them Szabo polycategories (after their founder M. Szabo), but, unlike categories and multicategories, their theory turned out to be very complex, and of the fundamental works there are, in fact, only a few articles [24]. Convolutional polycategories include Szabo polycategories as a special case. Convolutions of poly-arrows are much more expressive than the composition operation of Szabo polycategories and prove sufficient for modeling systems. Category science uses arrows in categories, multi-arrows in multicategories, poly-arrows in Szabo polycategories and poly-arrows in convolutional polycategories. These varieties of arrows have a composition operation, generalized in the theory of convolutional polycategories to convolution operations. Functors are defined by sets of functions preserving the operations of composition and convolution.
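The commutativity condition defining a morphism of diagrams (f_j ∘ D(m) = D′(m) ∘ f_i, as above) can be checked concretely on a toy scheme with a single arrow i → j; all sets and functions below are made-up placeholders for illustration.

```python
# Two diagrams over the one-arrow scheme i --m--> j, and a candidate morphism
# of diagrams (f_i, f_j); the defining condition is that the square commutes:
# f_j(D(m)(x)) == D'(m)(f_i(x)) for every x.
D_i, D_j = {0, 1, 2}, {"even", "odd"}
D_m = lambda x: "even" if x % 2 == 0 else "odd"     # D(m)

Dp_i, Dp_j = {0, 1, 2, 3}, {0, 1}
Dp_m = lambda x: x % 2                              # D'(m)

f_i = lambda x: x                                   # inclusion D_i -> D'_i
f_j = lambda s: 0 if s == "even" else 1

commutes = all(f_j(D_m(x)) == Dp_m(f_i(x)) for x in D_i)
print(commutes)  # -> True
```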
Fig. 2. Convolution example (dotted line)
A convolutional polycategory is a set of poly-arrows with a set of convolutions (Fig. 2) satisfying several properties (see [16]). A system in the categorical theory of systems is a poly-arrow in a given convolutional polycategory. The principle of isomorphism corresponds to the concept of similarity; the principle of hierarchy is realized by building systems from subsystems using convolutions. The role of the system-forming factor is played by a functor paired with a convolution: the functor itself transforms the poly-arrows (subsystems), and the convolution combines the transformed subsystems into a new integral system, in accordance with the way this is formulated in the theory of P. K. Anokhin [22]. The systems of [13, 14] have suitable categorical models. Note that the categorical approach [15] has completely different tasks in view and uses ordinary categories without resorting to multicategory theory. Artificial neural networks, currently used extensively in AI [25–30], are a very simple example of convolutional polycategories with an associative compositional convolution of the corona type, and a number of questions in the categorical model of neural networks become more natural; in particular, this concerns a rigorous substantiation of Osovsky's formula for the backpropagation method [21]. The categorical theory of systems not only formalized the theory of functional systems but also made it possible to solve a number of problems that had remained unsolved for decades in the theory of functional
systems itself [20]. Algebraic biology considers DNA in the cellular environment as a categorical system with an appropriate system-forming factor and, in general, adopts the paradigm of categorical systems theory.
3 Mathematical Methods of Algebraic Biology

The initial stage of development of algebraic biology took place as matrix genetics [1, 2], using various methods of associative algebra. The requirements of a systematic approach to considering organisms as systems have led to extending the mathematical apparatus of algebraic biology with modern categorical mathematics, in which the usual set-theoretic approach is only one particular tool. In computer science, AI researchers and practitioners have long gone beyond the bounds of set-theoretic mathematics: in functional programming, in the above-mentioned categorical approach to software industry technologies, and in the categorical quantum mechanics used for the computational problems of quantum computers. In algebraic biology, categorical science is necessary, for example, when describing the transition of biological systems to the state of achieving a result, especially if research is to be conducted on a large aggregate of such systems. We illustrate what has been said with a simple example. Every living system in its life cycle passes from one result, that is, the corresponding functional system, to another result (or state). Models of results or functional systems usually contain a number of parameters (blood pH, pressure, etc.) characterizing the functional system; we denote them a1, a2, …, ak, …. The parameters run through certain ranges of values, both while the system is achieving the result, a1 ∈ D1, a2 ∈ D2, …, ak ∈ Dk, …, and when the result is reached, a1 ∈ D′1 ⊂ D1, a2 ∈ D′2 ⊂ D2, …, ak ∈ D′k ⊂ Dk, …. It is assumed that we can answer the question of whether a parameter satisfies the result or not (ai ∈ D′i means "yes", ai ∉ D′i means "no"), and also that there are mechanisms for transferring parameters into the result region (the system achieves the result). A parameter may itself consist of several components.
In other words, we have the sets Di through which the parameters ai run, and an action of the monoid M2 = {0, 1} (with 1·1 = 1, 1·0 = 0, 0·0 = 0, 0·1 = 0) on the set Xi, the Boolean Xi = P(Di) of subsets of Di: the element 1 goes into the function λ1(ξ) = ξ, ξ ∈ Xi, and 0 goes into the function λ0(ξ) = D′i ∈ Xi. The action of the monoid, as we see, models the process of a functional system achieving its result. It turns out that a collection of such actions is a category whose properties, reflecting the properties of functional systems, require study; on the other hand, it is a non-classical topos, and when developing a theory in its universe one has to deal with phenomena unusual for set theory. Indeed, we have a collection of sets Xi (since the choice of systems is unlimited, we may assume these sets are arbitrary) with pairs of functions λ = (λ0, λ1). Of the functions between such objects (Xi, λ), we consider only those that leave the diagram commutative, that is, for which the compositions λ(j) ∘ f = f ∘ λ(i) are equal. The set of objects (Xi, λ) with the indicated arrows f forms a topos, a category (we will call it
the topos M2), which satisfies all the topos requirements, as the category of sets does, yet differs greatly from the category of sets. It can be shown that every nonzero object of the topos M2 is non-empty, that the elements of the subobject classifier Ω are {∅, {∅}} and ∅, and that M2 is two-valued, with the truth arrows И (true) and Л (false); nevertheless, the topos M2 is not classical.
For example, the extensionality principle for arrows turns out to be violated: there are distinct arrows f_X ≠ 1_X that no element of X distinguishes. Moreover, not only are ordinary sets, by and large, insufficient for algebraic biology, but the logic of reasoning begins to differ from the usual classical one. The rules of topos logic are set by operations on the subobject classifier. In the category of sets Set, the operations of negation, conjunction, disjunction and implication are defined by the well-known functions ¬, ∧, ∨, → on the set {0, 1}. When interpreting formulas of the propositional calculus, the logical connectives of the language (negation ¬, conjunction ∧, disjunction ∨, implication →) pass into these functions, respectively. For example, the conjunction a ∧ b of the propositions a and b has the values 1 ∧ 1 = 1, 1 ∧ 0 = 0, 0 ∧ 1 = 0, 0 ∧ 0 = 0, which corresponds to the function ∩(1, 1) = 1, ∩(1, 0) = 0, ∩(0, 1) = 0, ∩(0, 0) = 0. These functions define the logic in Set, and it is classical; the algebra of subsets of any set is Boolean, that is, a distributive lattice with complements. The algebra of subobjects of every object in an arbitrary topos is a distributive lattice with zero and one. On the subobject classifier of the topos M2, it turns out that the join of И with its negation is not isomorphic to the unit of the lattice. From this, and from the corresponding general fact for an arbitrary topos, it follows that the algebra of subobjects of the topos M2 is not Boolean, and therefore this topos does not define classical logic, unlike Set. In the general case, it turns out that the logic of a topos is intuitionistic, not classical.
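The failure of classical logic described above can be illustrated in miniature with the three-element Heyting algebra on the chain 0 < 0.5 < 1. This is the standard textbook example of intuitionistic truth values, not the subobject classifier of M2 itself:

```python
# Heyting algebra on the chain 0 < 0.5 < 1: meet = min, join = max,
# implication a -> b is 1 if a <= b, else b; negation is a -> 0.
vals = (0.0, 0.5, 1.0)
imp = lambda a, b: 1.0 if a <= b else b
neg = lambda a: imp(a, 0.0)

# The law of excluded middle, a v ~a = 1, fails at the middle value:
assert max(0.5, neg(0.5)) == 0.5   # since neg(0.5) == 0.0
# ...while a <= ~~a holds for all values, as in any Heyting algebra:
assert all(a <= neg(neg(a)) for a in vals)
print("excluded middle fails: the logic is intuitionistic")
```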
Thus, if we want to develop a theory in the universe of systems with a specific requirement to achieve a result, then we will have to switch from Set to the non-classical topos M2 and take into account its distinguishing features considered above.
4 U-Numbers in a Categorical Generalization

Matrix genetics has led to new numerical systems, not yet studied in detail (matrions, bisexes, etc.), generalized to U-numbers [5], whose properties reflect genetic patterns. In general, the concept of number plays a central role in algebraic biology;
From Algebraic Biology to Artificial Intelligence
a special program is being developed to develop the theory and applications of those types of generalized numbers that are adequate to the structure of the genetic code [1]. Categorical methods lead to new kinds of numbers generalizing the U-numbers, and this generalization is caused by the deeply systemic nature of algebraic biology. However, all numbers are built from the series of natural numbers; as Kronecker said: "God gave the natural numbers, the rest is the work of human hands." The set $\omega$ of finite ordinals, which corresponds as closely as possible to the intuitive properties of the natural series, comes with the successor function $s\colon \omega \to \omega$, $s(n) = n + 1$, and the diagram $1 \xrightarrow{\,0\,} \omega \xrightarrow{\,s\,} \omega$, $0\colon \{0\} \to \omega$. In many toposes besides Set (for example, in $Set^C$ with a small category C) such a diagram defines a natural numbers object N, that is, a topos object together with arrows $1 \xrightarrow{\,0\,} N \xrightarrow{\,s\,} N$ such that for any object a and arrows $1 \xrightarrow{\,x\,} a \xrightarrow{\,f\,} a$ there is exactly one arrow $h\colon N \to a$ for which the diagram commutes.
In any topos that has a natural numbers object there is a full-fledged (almost as in Set) theory of generalized natural numbers as arrows $1 \to N$. Categorical analogues of the numerical systems of positive, integer, rational and real numbers are also constructed in a topos with a natural numbers object. An object $N^{+}$ of positive numbers is defined as a subobject of N, an object of integers as the coproduct $Z = N + N^{+}$, and a rational-numbers object Q can be constructed; for the real numbers there is a splitting: the Cauchy and Dedekind definitions, which are equivalent in Set, diverge in a general topos. Numerical systems in toposes are quite developed [31] and deliver a whole range of numbers. All these possibilities can be used for the development of the numerical systems of algebraic biology, generalizing the U-numbers of matrix genetics.
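As an illustrative sketch (a Set-based model only, with an assumed helper name `nno_arrow`), the universal property above is ordinary primitive recursion: the data x and f determine the unique h with h(0) = x and h(n + 1) = f(h(n)):

```python
def nno_arrow(x, f):
    """The unique h: N -> a with h(0) = x and h∘s = f∘h (primitive recursion)."""
    def h(n):
        value = x
        for _ in range(n):
            value = f(value)
        return value
    return h

# Example: taking a = N, x = 1 and f = doubling gives h(n) = 2**n.
h = nno_arrow(1, lambda m: 2 * m)
assert [h(n) for n in range(5)] == [1, 2, 4, 8, 16]
```

Uniqueness of h is what makes the natural numbers object a genuine recursion principle inside the topos.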
5 Conclusions

Algebraic biology studies, by rigorous mathematical methods, the general biological systems of DNA and RNA molecules in the cellular environment that provides the reading of genetic information. This leads to heuristic associations and new methods for modeling the inherited properties of organisms, including the human organism. Algebraic biology is a deeply systemic science; the use of the modern categorical theory of systems in it has led to the application of category theory methods. The research results obtained by the methods of algebraic biology for a human as a system, with its subsystem of intelligence and thinking, are used for modeling in AI. In addition to the initially applied algebraic methods, categorical systems theory provides algebraic biology with categorical methods for studying biological systems, including
G. K. Tolokonnikov and S. V. Petoukhov
expanding the set-theoretic universe of reasoning and constructing categorical-system theories based on the achievements of modern mathematics.
References 1. Petoukhov, S.V.: Matrix genetics, algebra of genetic code, noise immunity. RHD (2008) 2. Petoukhov, S.V., He, M.: Symmetrical Analysis Techniques for Genetic Systems and Bioinformatics: Advanced Patterns and Applications. IGI Global, Hershey (2009) 3. Petoukhov, S.V.: Matrix genetics and algebraic properties of the multi-level system of genetic alphabets. Neuroquantology 9(4), 60–81 (2011) 4. Petoukhov, S.V.: The system-resonance approach in modeling genetic structures. Biosystems 139, 1–11 (2016) 5. Petoukhov, S., Petukhova, E., Hazina, L., Stepanyan, I., Svirin, V., Silova, T.: The genetic coding, united-hypercomplex numbers and artificial intelligence. In: Hu, Z.B., Petoukhov, S., He, M. (eds.) Advances in Artificial Systems for Medicine and Education. Advances in Intelligent Systems and Computing, vol. 658. Springer, Cham (2018) 6. Zeng, H., et al.: Convolutional neural network architectures for predicting DNA–protein binding. Bioinformatics 32(12), i121–i127 (2016) 7. Min, S., Lee, B., Yoon, S.: Deep learning in bioinformatics. Brief. Bioinform. 18(5), 851– 869 (2017) 8. Sverdlov, E.D.: The great discovery: revolution, canonization, dogma and heresy. Herald RAS 73(6), 587–601 (2003) 9. Zhao, T., et al.: Consciousness: new concepts and neural networks. Front. Cell. Neurosci. 13, 302 (2019) 10. Hameroff, S., Penrose, R.: Consciousness in the universe: a review of the ‘Orch OR’ theory. Phys. Life Rev. 11, 39–78 (2014) 11. Craddock, T.J.A., Kurian, P., Preto, J., Hameroff, S.R., et al.: Anesthetic alterations of collective terahertz oscillations in tubulin correlate with clinical potency: implications for anesthetic action and post-operative cognitive dysfunction. Sci. Rep. 7, 9877 (2017) 12. Koblyakov, A.A., Petukhov, S.V., Stepanyan, I.V.: Genetic code and genetic musical systems. Biomash Syst. 2(3), 208–230 (2018) 13. Mesarovich, M., Takahara, I.: General Theory of Systems: Mathematical Foundations. Mir, Moscow (1978). 311 p. 14. 
Matrosov, V.M., Anapolsky, L.Yu., Vasiliev, S.N.: A Comparison Method in the Mathematical Theory of Systems. Nauka, Novosibirsk (1980). 480 p. 15. Goguen, J.: A categorical manifesto. Math. Struct. Comput. Sci. 1(1), 49–67 (1991) 16. Tolokonnikov, G.K.: Mathematical categorical theory of systems. In: Theory and Applications. Biomachsystems, vol. 2, pp. 22–114. Rosinformagroteh (2016) 17. Tolokonnikov, G.K.: Manifesto: neurographs, neurocategories and categorical gluing. Biomachsystems 1(1), 59–146 (2017) 18. Tolokonnikov, G.K.: Informal categorical theory of systems. Biomachystems 2(4), 63–128 (2018) 19. Tolokonnikov, G.K.: Classification of functional and other types of systems in their modeling by convolutional polycategories. Neurocomputers Dev. Appl. 20(6), 8–18 (2018) 20. Chernoivanov, V.I., Sudakov, S.K., Tolokonnikov, G.K.: Biomash systems, functional systems, categorical theory of systems. Research Institute of Normal Physiology. P.K. Anokhin, RAS (2018). 445 p.
21. Tolokonnikov, G.K.: Convolution polycategories and categorical splices for modeling neural networks. In: ICCSEEA 2019, pp. 259–267 (2019) 22. Anokhin, P.K.: Fundamental questions of the general theory of functional systems. The principles of systemic organization of functions, pp. 5–61. Science (1973) 23. Szabo, M.E.: Polycategories. Commun. Algebra 3(8), 663–689 (1975) 24. Garner, R.H.G.: Polycategories via pseudo-distributive laws. Adv. Math. 218, 781–827 (2008) 25. Karande, A.M., Kalbande, D.R.: Weight assignment algorithms for designing fully connected neural network. Int. J. Intell. Syst. Appl. (IJISA) 10(6), 68–76 (2018) 26. Dharmajee Rao, D.T.V., Ramana, K.V.: Winograd's inequality: effectiveness for efficient training of deep neural networks. Int. J. Intell. Syst. Appl. (IJISA) 10(6), 49–58 (2018) 27. Hu, Z., Tereykovskiy, I.A., Tereykovska, L.O., Pogorelov, V.V.: Determination of structural parameters of multilayer perceptron designed to estimate parameters of technical systems. Int. J. Intell. Syst. Appl. (IJISA) 10(10), 57–62 (2017) 28. Awadalla, H.A.: Spiking neural network and bull genetic algorithm for active vibration control. Int. J. Intell. Syst. Appl. (IJISA) 10(2), 17–26 (2018) 29. Abuljadayel, A., Wedyan, F.: An approach for the generation of higher order mutants using genetic algorithms. Int. J. Intell. Syst. Appl. (IJISA) 10(1), 34–35 (2018) 30. Kumar, A., Sharma, R.: A genetic algorithm based fractional fuzzy PID controller for integer and fractional order systems. Int. J. Intell. Syst. Appl. (IJISA) 10(5), 23–32 (2018) 31. Johnstone, P.T.: Topos Theory. Nauka, Moscow (1986). 440 p.
Concept of Active Traffic Management for Maximizing the Road Network Usage

Andrey M. Valuev

Mechanical Engineering Research Institute of the Russian Academy of Sciences, 4, Malyi Kharitonievsky pereulok, 101990 Moscow, Russia
[email protected]

Abstract. The recently developed and implemented concept of active traffic and demand management (ATDM) on a highway by regulation of on-ramp traffic flows is extended in the paper to a carcass city road network (CRN). With the same aim of average trip time minimization and the same way of regulation, it puts forward the original idea of limiting the incoming flow to the maximum value that the CRN admits, the flows through entrances to the CRN being assigned according to the known distribution matrix of correspondences between traffic regions. The proposed method of determining the maximum input flow is based on a linear programming problem, from which the distribution of traffic correspondences between roads is established as well. For signalized intersections the traffic lights cycle parameters are variables in the problem, traffic organization on intersections being taken into account. The main features of the proposed approach are demonstrated by an example; aspects of its use for intelligent ATDM systems are discussed, including the use of info-communication technologies.

Keywords: Vehicular traffic flow · Intelligent traffic system · Active traffic and demand management · Road throughput · Signalized intersection · Optimization
1 Introduction

Intellectualization of control and decision making in complicated environments, especially those including both elements under human control and the action of intelligent control systems, is now a domain of intensive research [1–4], including problems related to transport systems [3, 4]. Despite significant investments in many megacities, the level of satisfaction of the population's needs in daily movements and commodity transportation remains far from satisfactory. Increasing it by investments in road infrastructure, changes in traffic organization and traffic light regulation regimes requires reliable justification. One of the most promising concepts in this area is active transportation and demand management (ATDM), that is, "a comprehensive approach to facility management and operation that seeks to increase facility productivity by proactively balancing supply and demand to avoid or delay facility breakdown"; ATDM measures may include "adaptive ramp metering, congestion pricing, speed harmonization, traveler information systems, and adaptive traffic signal control systems" [6]. Papers [5, 7] show the successful implementation of some ATDM features in TOPL and its application to freeways in California. Unfortunately, both theoretical and practical progress in the realm of ATDM systems © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 96–105, 2020. https://doi.org/10.1007/978-3-030-39216-1_10
after TOPL development was very slow [8, 9]. Although means of measurement and the necessary informational exchange between ATSs and drivers are developing successfully [10, 11], no significant new ideas have been proposed, so "traffic congestion is getting progressively worse in the United States" [11] and many other countries. For TOPL, reduction of the on-ramp flows to prevent over-congestion plays the crucial role. The reason is that a redundant density of traffic flows (TFs) decreases not only the average speed (and so enlarges the average trip time) but the TF intensity as well, so the urban or interurban network capacity is used inefficiently. On the other hand, this inefficiency results from inadequate drivers' behavior, both in the aspect of driving modes and, more importantly, in route choices. The latter, being spontaneous, tend to result in a Nash–Wardrop equilibrium in the urban road network that is not efficient in terms of throughput usage [12]. To cope with the problem, the paper proposes an approach for simultaneous determination of the maximum total incoming traffic flow into a carcass road network (CRN) formed by city highways and of the TF distribution over the CRN according to the known matrix of correspondences [12]. The latter describes intensities of passenger and cargo correspondences between urban regions (referred to whole days or certain periods of several hours) that are quite stable and can be considered as an expression of social needs for today and the foreseeable future. The maximum incoming flow and its distribution between routes are treated as those for which the total incoming flow into each CRN road does not exceed the road throughput, so that over-congested flows do not occur and average speeds on all CRN roads stay relatively high. Section 2 presents the problem setup for the case when CRN junctions provide permanent traffic passage without any interruptions. In Sect.
3, ways of implementation of the obtained recommendations are discussed, including the use of info-communication technologies. Section 4 proposes a generalization of the problem for the usual case when some junctions are signalized intersections; then the parameters of traffic light regulation are incorporated in the problem.
2 The Problem of Establishment of Maximum Traffic Flow Through the Carcass Network with Ideal Junctions

The main elements of the setup of the problem in question are: the representation of the carcass road network (CRN) formed by highways and of the traffic flows on it; balance equations linking flows originating and terminating in traffic districts and branching or merging in CRN nodes; and restrictions on road throughputs. At first, restrictions caused by the limited capacities of road junctions are not introduced; they are treated further on. A network is treated as an oriented graph whose arcs correspond to single-direction roads between nodes representing road junctions. Limitations on the passage of each junction are introduced: from the i-th arc entering the node k = END(i) it is possible to pass to the subset NEXT(i) of arcs exiting the k-th node. Traffic flows on the i-th arc are treated as the sums of flows with all possible destinations, which are traffic regions (TRs). The TR for which the i-th arc serves as a part of its bound is denoted by TR(i). The reason why TFs are distinguished by their destinations only, but not by their origins, is explained below.
We do not determine where vehicles enter or leave the TF but suppose that definite shares of the counts of vehicles entering and leaving the TR(i)-th traffic region, namely $c^{IN}_i$ and $c^{OUT}_i$, do it on the i-th arc. The distribution matrix elements are denoted by $C_{pr}$, where p and r are, respectively, the indices of the TRs of origin and destination. The main value we are seeking is the total incoming TF Q for the CRN. With these notations, $c^{OUT}_i C_{TR(i)\,r}\,Q$ vehicles leave the TR(i)-th TR for the r-th one and $c^{IN}_i C_{r\,TR(i)}\,Q$ vehicles departing in the r-th TR reach the TR(i)-th one on the i-th arc per time unit. We define balance equations linking TF intensities with definite destinations in the beginning and in the end of the i-th arc, together with non-negativity constraints, as

$$q^1_{ir} = q^0_{ir} + c^{OUT}_i C_{TR(i)\,r}\, Q, \quad q^0_{ir} \ge 0, \quad r \ne TR(i), \quad i = 1, \ldots, N, \eqno(1)$$

$$q^1_{i\,TR(i)} = q^0_{i\,TR(i)} - c^{IN}_i\, Q \sum_{r \ne TR(i)} C_{r\,TR(i)}, \quad q^1_{i\,TR(i)} \ge 0, \quad i = 1, \ldots, N. \eqno(2)$$

Let us define the coefficient of TF change for the i-th arc as

$$k_i = \sum_{r \ne TR(i)} \bigl(c^{OUT}_i C_{TR(i)\,r} - c^{IN}_i C_{r\,TR(i)}\bigr), \quad i = 1, \ldots, N. \eqno(3)$$
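A small numeric sanity check, on invented toy data, that the balance equations (1)-(2) aggregate to relation (5) with the coefficient k_i defined by (3):

```python
import random

random.seed(0)
R, N = 3, 4                                   # toy sizes: 3 regions, 4 arcs
C = [[random.random() if p != r else 0.0 for r in range(R)] for p in range(R)]
TR = [i % R for i in range(N)]                # region served by arc i
c_out = [random.random() for _ in range(N)]
c_in = [random.random() for _ in range(N)]
Q = 100.0
q0 = [[random.uniform(50.0, 100.0) for _ in range(R)] for _ in range(N)]

for i in range(N):
    t = TR[i]
    # (1) for destinations r != TR(i); (2) for r == TR(i).
    q1 = [q0[i][r] + c_out[i] * C[t][r] * Q if r != t
          else q0[i][t] - c_in[i] * Q * sum(C[s][t] for s in range(R) if s != t)
          for r in range(R)]
    # (3): coefficient of TF change for arc i.
    k = sum(c_out[i] * C[t][r] - c_in[i] * C[r][t] for r in range(R) if r != t)
    # (5): the destination totals (4) satisfy q1_i = q0_i + k_i * Q.
    assert abs(sum(q1) - (sum(q0[i]) + k * Q)) < 1e-9
print("aggregate relation (5) verified on toy data")
```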
Then, according to (1) and (2), the relationship between the total TF intensities

$$q^s_i = \sum_{r=1}^{R} q^s_{ir}, \quad s = 0, 1, \quad i = 1, \ldots, N, \eqno(4)$$

in the beginning and in the end of the i-th arc is

$$q^1_i = q^0_i + k_i Q, \quad i = 1, \ldots, N. \eqno(5)$$
As Q > 0, then

$$q^1_i \le q^{max}_i \ \text{if } k_i > 0, \quad \text{and} \quad q^0_i \le q^{max}_i \ \text{otherwise}, \quad i = 1, \ldots, N. \eqno(6)$$
To express the possibility of forbidding some directions of the junction passage, extra variables are introduced, namely $q^0_{ijr}$, the intensities of TFs passing from the i-th arc through the node k = END(i) to the j-th arc, $j \in NEXT(i)$. Balance equations in nodes and the restriction on the total TF on the i-th arc caused by its limited throughput are expressed as

$$q^0_{ijr} \ge 0, \ j \in NEXT(i); \quad \sum_{j \in NEXT(i)} q^0_{ijr} = q^1_{ir}, \quad r = 1, \ldots, R, \quad i : END(i) = k, \eqno(7)$$

$$q^0_{jr} = \sum_{i :\, j \in NEXT(i)} q^0_{ijr}, \quad r = 1, \ldots, R, \quad j : BEG(j) = k, \quad k = 1, \ldots, M. \eqno(8)$$

With the above relationships the maximum value of Q may be found from the linear programming problem of maximizing Q under restrictions (1), (2), (4)–(8) with the use of (3).
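The following is only a hedged sketch of the kind of linear program involved, on an invented miniature network (one OD pair with inflow share 0.6 choosing between routes A and B, a second OD pair with share 0.4 on a fixed route; all throughputs invented), solved with scipy.optimize.linprog rather than the paper's full system (1)–(8):

```python
from scipy.optimize import linprog

# Variables: [Q, xA, xB]; xA, xB are the route flows of the first OD pair.
c = [-1.0, 0.0, 0.0]                      # maximize Q  <=>  minimize -Q
A_eq = [[-0.6, 1.0, 1.0]]                 # xA + xB = 0.6 * Q (OD demand split)
b_eq = [0.0]
A_ub = [
    [0.0, 1.0, 0.0],                      # arc 1: xA          <= 300
    [0.4, 0.0, 1.0],                      # arc 2: 0.4*Q + xB  <= 500
    [0.0, 0.0, 1.0],                      # arc 3: xB          <= 200
]
b_ub = [300.0, 500.0, 200.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, method="highs")
Q_max = -res.fun
print(f"maximum admissible total inflow Q = {Q_max:.1f}")  # 800.0 here
```

For this toy instance the binding restrictions are arcs 1 and 2, so the optimum is Q = 800; in the paper's setting the same mechanics apply with variables $q^0_{ijr}$ and restrictions (1), (2), (4)–(8).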
We do not determine TFs with different origins separately, since they may be computed from the solution of the above problem in the following way. Let p be the index of the origin traffic region. The set of relationships (9)–(13) below entirely determines all values of $q^0_{jpr}$, $q^1_{jpr}$.

$$\text{If } q^0_{ir} = 0, \text{ then } q^0_{ipr} = 0, \quad p, r = 1, \ldots, R, \quad i = 1, \ldots, N. \eqno(9)$$

$$q^1_{i\,TR(i)\,r} = q^0_{i\,TR(i)\,r} + c^{OUT}_i C_{TR(i)\,r}\, Q; \quad q^1_{ipr} = q^0_{ipr}, \quad p \ne TR(i), \ r \ne TR(i), \quad i = 1, \ldots, N. \eqno(10)$$

$$q^1_{i\,p\,TR(i)} = q^0_{i\,p\,TR(i)} - \frac{q^0_{i\,p\,TR(i)}}{q^0_{i\,TR(i)}}\, c^{IN}_i\, Q \sum_{r \ne TR(i)} C_{r\,TR(i)}, \quad q^1_{i\,p\,TR(i)} \ge 0, \quad p = 1, \ldots, R, \quad i = 1, \ldots, N. \eqno(11)$$

$$\text{If } q^1_{ir} > 0, \text{ then } q^0_{ijpr} = \frac{q^1_{ipr}}{q^1_{ir}}\, q^0_{ijr}, \quad p, r = 1, \ldots, R, \quad j \in NEXT(i), \quad i = 1, \ldots, N. \eqno(12)$$

$$q^0_{jpr} = \sum_{i :\, j \in NEXT(i)} q^0_{ijpr}, \quad p, r = 1, \ldots, R, \quad j : BEG(j) = k, \quad k = 1, \ldots, M. \eqno(13)$$
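Once the aggregate problem is solved, the proportional splitting (12) and the summation (13) are mechanical; a toy sketch with invented numbers:

```python
# Invented post-solution values at one junction: arc i enters node k.
q1_ir = 120.0                        # total flow leaving arc i toward destination r
q1_ipr = {"p1": 80.0, "p2": 40.0}    # the same flow resolved by origin p
q0_ijr = {"j1": 90.0, "j2": 30.0}    # its split over exit arcs j in NEXT(i)

# (12): origin-resolved junction flows follow the shares q1_ipr / q1_ir.
q0_ijpr = {(j, p): (q1_ipr[p] / q1_ir) * q0_ijr[j]
           for j in q0_ijr for p in q1_ipr}

# Summing over origins p recovers q0_ijr; (13) would likewise sum over
# entering arcs i to give q0_jpr on each exit arc.
for j in q0_ijr:
    assert abs(sum(q0_ijpr[j, p] for p in q1_ipr) - q0_ijr[j]) < 1e-9
```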
To illustrate the approach we propose a simple methodical example of a small network having three TRs. The general network structure is presented by Fig. 1 and principal traffic organization on its junctions by Fig. 2, traffic organization for node 3 being the same as for node 2. Conditional places of entrance and exit from TRs are marked by double arrows.
Fig. 1. Example of a CRN structure in geometrical (a) and graph-theoretical (b) representation. Here 1, 2, 4 are symbols of a TR, a node and an arc, resp.
We assign to all arcs the same throughput, 5400, which may correspond to roads with three lanes in every direction, and treat the presented CRN as a segment of a network with quadratic cells (see Fig. 1a), the length of each side being 4 km; these data are typical for the Moscow CRN. The positions of the entrance and exit points of the TRs T1, T2, T3 on their arcs 1, 5, 3 are the following: the lengths of the paths T1–1, T2–2, T3–3 are 6 km, 4 km and 6 km, resp.; the correspondence matrix is presented in Table 1.
Fig. 2. Principal structure of crossroads: a – node 1, b – node 2

Table 1. Correspondence matrix

TRs of departure | TRs of arrival: 1 | 2    | 3
1                | –                 | 0.15 | 0.1
2                | 0.2               | –    | 0.3
3                | 0.1               | 0.15 | –
It is obvious that without any centralized control drivers prefer the shortest paths, which are the following successions of arcs (see Table 2).

Table 2. Shortest paths between traffic regions and average path lengths on the optimum solution

Beginning | End | Shortest path            | Length, km | L^OPT_Av, km
1         | 2   | 1–9–5                    | 14         | 14
1         | 3   | 1–9–5–8–3 or 1–7–6–10–3  | 30         | 30
2         | 1   | 5–1                      | 10         | 38
2         | 3   | 5–8–3                    | 14         | 24
3         | 1   | 3–5–1                    | 20         | 56
3         | 2   | 3–5                      | 10         | 10
However, traffic reduction to the set of shortest paths cannot provide the maximum flow. To prove this fact we solved the problem (1), (2), (4)–(8) excluding the arcs not included in any shortest path. The maximum Q value was 9000, whereas maximizing Q on the entire CRN we obtain the optimum value 10800; its distribution between paths is shown in Table 3 and the average path lengths in Table 2.

Table 3. Paths between traffic regions on the optimum solution

Correspondence | Route          | Share | Length, km
(1, 2)         | 1–9–5          | 1     | 14
(1, 3)         | 1–7–6–10–3     | 1     | 30
(2, 1)         | 5–1            | 1/2   | 20
               | 5–8–9–4–7–1    | 1/2   | 56
(2, 3)         | 5–8–3          | 1/3   | 12
               | 5–1–7–6–10–3   | 2/3   | 30
(3, 1)         | 3–10–2–6–4–7–1 | 1     | 56
(3, 2)         | 3–5            | 1     | 10
We conclude that the average path length over all trips increased to 27 km = (0.15·14 + 0.1·30 + 0.2·38 + 0.3·24 + 0.1·56 + 0.15·10) km, in comparison with 14.8 km for the shortest paths, but the increase of speeds undoubtedly compensates for the increase of path lengths. The problem is that not all correspondences benefit from the proposed traffic situation improvement.
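The quoted averages can be recomputed directly from the shares of Table 1 and the path lengths of Table 2:

```python
# Shares from Table 1 and path lengths (shortest / optimum) from Table 2.
shares = {(1, 2): 0.15, (1, 3): 0.10, (2, 1): 0.20,
          (2, 3): 0.30, (3, 1): 0.10, (3, 2): 0.15}
shortest = {(1, 2): 14, (1, 3): 30, (2, 1): 10, (2, 3): 14, (3, 1): 20, (3, 2): 10}
optimum = {(1, 2): 14, (1, 3): 30, (2, 1): 38, (2, 3): 24, (3, 1): 56, (3, 2): 10}

def average(lengths):
    # Weighted average path length over all correspondences.
    return sum(shares[od] * lengths[od] for od in shares)

assert abs(average(shortest) - 14.8) < 1e-9   # shortest-path routing
assert abs(average(optimum) - 27.0) < 1e-9    # maximum-flow routing
```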
3 Ways of Implementation of Recommendations Obtained from the Optimization Problem Solution

To make drivers' behavior conform to the obtained recommendations, administrative and/or economic measures are necessary. The reasons for drivers to adopt them do exist: with the maximum use of a road's throughput, the TF on it must move with a sufficiently high speed, comfortably; the trip as a whole, even along a longer route but in a shorter time, may be assessed as more profitable. The weighted sum of the trip time and fuel expenses is a usual criterion in route optimization problems, e.g. for flight plans [13]. The problem consists, however, in the above-noted necessity to split some correspondences between paths whose benefits differ substantially. An ITS based on the proposed type of recommendations must first limit the number of departing vehicles, either by granting permissions to depart at a definite time to individual drivers, or by assigning a fare for it, or by controlling bars or traffic lights at the points of vehicles' admission into the CRN. As to route choice, impersonal ways of TF control have a limited utility. In our example we find that the TFs entering the 2nd node by arc 5 are shared between arcs 1 and 8 in proportions depending on their destinations (1:1 and 1:2); so the universal
recommendation for vehicles approaching this crossroads is useless. On the other hand, TFs on CRN roads are massive processes, so it is necessary only to maintain the needed proportions of their branching at intersections, e.g. by transmitting messages that are received by all drivers of moving vehicles. As to the economic way of traffic control, the message must inform drivers of the price of a definite route choice. However, to send a driver a bill for their trips it is necessary to trace each vehicle and to store data about each traced trip. At present this is achievable with the use of existing info-communication technologies. Software and hardware are available and practically used on individual highways for recognizing the numbers of all cars crossing a certain line. Besides controlling individual vehicles, such information can be used to identify the intensity of entry and exit at all sections of the CRN and to determine the matrices of correspondence.
4 Generalization of the Problem for the Case of Signalized Intersection Presence

The traffic organization of a signalized intersection, which affects its throughput, means the separation of permitted passage directions by phases of the traffic lights cycle (TLC). For the CRN in question, the schemes of the nodes' passage on TLC phases are presented in Fig. 3. The passage scheme for the 2nd phase of node 1 has the same structure, and the traffic organization on nodes 2 and 3 is the same. We treat the traffic organization and control of a signalized intersection as in [14, 15].
Fig. 3. Passage schemes on the 1st TLC phase for node 1 (a) and on all TLC phases for node 1 (b, c, d)
The intersection traffic capacity depends on many factors, including the structure of the TFs through it and the TLC parameters. Reasonable values of the latter must correspond to the distribution of TFs among the permissible directions of the intersection passage expressed by $q^0_{ijr}$; to be more exact, they are expressed by

$$q^0_{ij} = \sum_{r=1}^{R} q^0_{ijr}. \eqno(14)$$

Let the k-th node be a signalized intersection. The set of permissible passage directions for it, expressed by pairs of arcs (i, j), is separated between TLC phases; some directions may be allowed on more than one phase. If the TLC has a sufficient duration, then the average speed during the intersection passage for each direction is almost constant, but it differs between directions and depends on the topology and the geometry of routes. These speeds, in turn, determine the average headways on them and, finally, the maximum traffic flow $q^{max}_{ij}$ along the direction (i, j). The latter is determined under the assumption that crossing the intersection along the direction is permitted permanently. In fact, it is permitted only on the set of green light phases $PH(i, j) \subseteq \{1, \ldots, N_{PH}(k)\}$ (here $N_{PH}(k)$ is the number of TLC phases); usually $N_{PH}$ = 2, 3 or 4. To express the conditions of the intersection passage, new variables are introduced, namely the shares of phase durations in the TLC, $\delta_{kf}$, $f = 1, \ldots, N_{PH}(k)$; restrictions of two new kinds are introduced for the k-th node:

$$\sum_{f=1}^{N_{PH}(k)} \delta_{kf} = 1, \eqno(15)$$

$$q^0_{ij} \le q^{max}_{ij} \sum_{f \in PH(i,j)} \delta_{kf}, \quad \text{where } k = END(i) \text{ and } j \in NEXT(i). \eqno(16)$$

These restrictions are added to (1), (2), (4)–(8) for each signalized intersection. As to $\delta_{kf}$, $f = 1, \ldots, N_{PH}(k)$, they may be either parameters or optimization problem variables; certainly, the latter case yields a bigger value of Q. The problem with the additional constraints and variables was solved for the same example in several variants. Nodes 2 and 3, with the traffic organization shown in Fig. 3, were treated as having 5 lanes on entries instead of 3 lanes on arcs. Such traffic organization is typical for intersections of highways in the South-West district of Moscow, designed as a whole in the 1950–1960s. Two variants of the distribution of these lanes between passage directions were considered: 2 straight lanes and 3 curved ones, and vice versa. The following values of these directions' throughputs were assigned: 3600 for straight traces and 4200 for curved traces in the 1st variant, 5400 and 2800 in the 2nd. The maximum value of Q was 10507 for the 1st variant and 10254 for the 2nd, in comparison with 10800 for ideal junctions. Therefore, even in the case of wide crossroads the total throughput was diminished. When only nodes 2 and 3 were treated as signalized intersections and node 1 as an ideal junction, no loss of the CRN throughput took place for either variant. So costly road construction with the use of bridges and/or tunnels is not necessary for all junctions. But when the number of both straight and curved lanes for nodes 2 and 3 was set to 2, the optimum value of Q diminished to 9333.
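A minimal sketch (all numbers invented) of checking a candidate solution against the phase-share restrictions (15)-(16) at one signalized node:

```python
# Invented TLC data for one signalized node k.
delta = {1: 0.45, 2: 0.30, 3: 0.25}              # phase shares delta_kf
PH = {("i1", "j1"): {1}, ("i1", "j2"): {2, 3}}   # green phases per direction (i, j)
q_max = {("i1", "j1"): 3600.0, ("i1", "j2"): 2800.0}
q0 = {("i1", "j1"): 1500.0, ("i1", "j2"): 1400.0}

assert abs(sum(delta.values()) - 1.0) < 1e-9     # (15): shares sum to 1
for d, flow in q0.items():
    green_cap = q_max[d] * sum(delta[f] for f in PH[d])
    assert flow <= green_cap                     # (16): flow within green-time capacity
print("candidate TLC settings are feasible")
```

In the full problem the shares delta_kf enter the LP as extra variables, which is what makes their optimized choice outperform fixed, "reasonable-looking" values.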
Calculations showed the crucial role of the choice of TLC parameters. When these parameters were assigned fixed values that seemed reasonable, only the values 6825 and 7000 were obtained.
5 Conclusions

The present paper develops the idea of on-ramp traffic flow limitation as a means of traffic situation improvement by ATDM systems, spreading it from a single highway to a carcass road network. Its main contribution is the formulation of an optimization problem with respect to the total incoming flow, its distribution over the CRN according to the known matrix of correspondences, and the TLC parameters, the problem criterion being the maximum total incoming traffic flow into the CRN. It is a classical linear programming problem; its dimension, although high for an entire CRN, does not exceed the dimensions of optimization problems for modern big systems. It must be emphasized that the relationships constituting the problem seem beyond doubt: they are balance equations and restrictions on road throughputs. The utility of the problem is demonstrated on a simple example. The problem solution should be compared with the traffic situation in peak hours and used in operative traffic management for limitation of flows through entrances to the CRN as well as for active control of the route choice of moving vehicles. We see two directions for the future development of the research: first, the selection of the necessary techniques and means of data acquisition and transfer for the implementation of the proposed approach, and second, the development of the approach for time periods preceding and succeeding the peak hours for better satisfaction of the needs for efficient transportation.
References 1. Rashad, L.J., Hassan, F.A.: Artificial neural estimator and controller for field oriented control of three-phase LM. Int. J. Intell. Syst. Appl. (IJISA) 11(6), 40–48 (2019) 2. Agrawal, P., Agrawal, H.: Adaptive algorithm design for cooperative hunting in multirobots. Int. J. Intell. Syst. Appl. (IJISA) 10(12), 47–55 (2018) 3. Dennouni, N., Peter, Y., Lancieri, L., Slama, Z.: Towards an incremental recommendation of POIs for mobile tourists without profiles. Int. J. Intell. Syst. Appl. (IJISA) 10(10), 42–52 (2018) 4. Adebiyi, R.F.O., Abubilal, K.A., Tekanyi, A.M.S., Adebiyi, B.H.: Management of vehicular traffic system using artificial bee colony algorithm. Int. J. Image Graph. Signal Process. (IJIGSP) 9(11), 18–28 (2017) 5. Kurzhanskiy, A.A., Varaiya, P.: Active traffic management on road networks: a macroscopic approach. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 368(1928), 4607–4626 (2010) 6. Dowling, R., Margiotta, R., Cohen, H., Skabardonis, A., Elias, A.: Methodology to evaluate active transportation and demand management strategies. Procedia-Soc. Behav. Sci. 16, 751–761 (2011)
7. Chow, A., Dadok, V., Dervisoglu, G., Gomes, G., Horowitz, R., Kurzhanskiy, A.A., Sánchez, R.O.: TOPL: Tools for operational planning of transportation networks. In: ASME 2008 Dynamic Systems and Control Conference, American Society of Mechanical Engineers, pp. 1035–1042, January 2008 8. Mize, J., Park, S., Matkowski, L.: Identification of congestion factors for active transportation and demand management: case study of operations data from delaware valley regional planning. Transp. Res. Rec. 2470(1), 105–112 (2014) 9. Hong, Z., Mahmassani, H.S., Xu, X., Mittal, A., Chen, Y., Halat, H., Alfelor, R.M.: Effectiveness of predictive weather-related active transportation and demand management strategies for network management. Transp. Res. Rec. 2667(1), 71–87 (2017) 10. Zhou, C., Weng, Z., Chen, X., Zhizhe, S.: Integrated traffic information service system for public travel based on smart phones applications: a case in China. Int. J. Intell. Syst. Appl. (IJISA) 5(12), 72–80 (2013) 11. Goyal, K., Kaur, D.: A novel vehicle classification model for urban traffic surveillance using the deep neural network model. Int. J. Educ. Manag. Eng. (IJEME) 6(1), 18–31 (2016) 12. Treiber, M., Kesting, A.: Traffic Flow Dynamics: Data, Models and Simulation. Springer, Heidelberg (2013) 13. Valuev, A.M., Velichenko, V.V.: On the problem of planning a civil aircraft flight along a free route. J. Comput. Syst. Sci. Int. 41(6), 979–987 (2002) 14. Solovyev, A.A., Valuev, A.M.: Optimization of the structure and parameters of the light cycle aimed at improving traffic safety at an intersection. In: Tsvirkun, A. (ed.) Proceedings of 2018 Eleventh International Conference “Management of Large-Scale System Development” (MLSD), Russia, Moscow, 1–3 October 2018, pp 1–5. IEEE Xplore Digital Library (2018). https://doi.org/10.1109/mlsd.2018.8551900 15. Solovyev, A.A., Valuev, A.M.: Structural and parametric control of a signalized intersection with real-time “Education” of Drivers. 
In: Hu, Z., Petoukhov, S.V., He, M. (eds.) AIMEE2018: Advances in Artificial Systems for Medicine and Education II. Advances in Intelligent Systems and Computing, vol. 902, pp. 517–526. Springer, Cham (2020)
Collection of Individual Packet Statistical Information in a Flow Based on P4-switch

Vladimir A. Mankov¹ and Irina A. Krasnova²

¹ Training Center Nokia, 8A, Aviamotornaya, 111024 Moscow, Russian Federation
[email protected]
² Moscow Technical University of Communications and Informatics, 8A, Aviamotornaya, 111024 Moscow, Russian Federation
[email protected]
Abstract. The paper addresses machine-learning-based solutions to the problems that emerge when collecting statistical information about flows for real-time traffic classification in SDN networks. The solution is based on a P4 switch with a specially designed memory organization that stores only the essential per-packet data about each flow. Three ways of flow identification were discovered during the research. The method avoids additional load both on the network between the switches and the SDN controller and on the controller itself, thanks to preliminary real-time data processing at the switch. The approach thus avoids increasing the controller response delay in the process of traffic flow classification. The theoretical foundation received experimental validation. A methodology is developed for defining the optimal parameters of the switch, and the optimal parameters for the switch planned for further research have been defined experimentally.

Keywords: P4-switch · SDN · Packet monitoring · Flow identification
1 Introduction

At the design stage of modern telecommunications networks, user traffic is hard to predict: demand for some services may go up while demand for others goes down, so meeting user needs and providing adequate quality of service (QoS) becomes very difficult. Moreover, with ever more new services, the rapid pace of their change, and user privacy policies, the operator finds it difficult to determine the type of traffic the network receives and the users' expectations regarding its quality. Currently the most popular networks are SDN networks, which ensure dynamic control of network status. SDN (Software-Defined Networking) is a paradigm that separates the data and control planes [1–3]. One approach that allows a service type to be determined dynamically is analysis of network traffic routes by Machine Learning methods. As is known, the classification results produced by statistical analysis methods depend critically on the collected database and are best with the most accurate, complete and correct samples. Taking into account the dynamically
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 106–116, 2020. https://doi.org/10.1007/978-3-030-39216-1_11
changed and newly added applications, and consequently new network traffic types, detailed tracking becomes increasingly problematic. In [4], Albert Mestres et al. describe a new paradigm of configured SDN networks called KDN (Knowledge-Defined Networking). The KDN architecture is a symbiosis of the existing SDN paradigm, Network Analytics (NA) and artificial intelligence methods. However, KDN monitoring relies on protocols such as NETCONF (RFC 6241), NetFlow (RFC 3954) and IPFIX (RFC 7011), which, on the one hand, provide a more exact view of the network status and, on the other hand, add significant extra load on devices and the network due to frequent requests. Besides, information may be duplicated, and small traffic flows may be omitted from consideration. In [5], Sebastian Troia et al. present a network model with a new control system, Machine Learning Routing Computation (MLRC), that classifies the input flow according to traffic matrices generated with Net2Plan. However, the network information is collected every 5 s, which casts doubt on the applicability of this approach to flows with finer granularity; increasing the monitoring rate would increase the load on network devices, the controller and the network, and monitoring that passes through the controller causes a certain delay in receiving data. Several articles [6] classify traffic by Machine Learning methods, but without the possibility of real-time implementation. Gomes and Madeira [7] created a real-time traffic classifier agent in OpenFlow networks to improve quality of service (QoS), but libpcap is used as the monitoring tool, transmitting a large amount of information redundant for classification. Thus the above articles show the efficiency of ML methods for traffic classification; their downside is that they do not minimize the network load without reducing data accuracy.
Our earlier article [8] proposed an algorithm for real-time traffic classification by Machine Learning methods based on per-packet statistics, using the obtained results when routing the flows. For network monitoring, the features of the OpenFlow protocol were considered, but it proved insufficiently flexible for the task. The paper is organized as follows: Sect. 2 outlines the general layout of the developed switch, Sect. 3 describes the memory organization for extracting and storing statistical information about every packet of a flow, and Sect. 4 details the ways of flow identification developed in the course of the research. Section 5 suggests a method for evaluating the switch parameters to be set before compilation, Sect. 6 sums up the switch parameters determined experimentally, and Sect. 7 compiles the basic results of the work.
2 Pipeline Architecture of the P4-switch for Statistical Monitoring

The project covered designing and building a P4 switch for gathering statistical information about each packet individually and classifying the flows in real time. The design was based on the well-known open v1model. The packet processing flow chart of the switch is shown in Fig. 1.
The programming approach was chosen based on the features provided by the P4 language [9]. P4 is simultaneously a programming language designed for programming switches in SDN networks, the name of the corresponding P4-switches, and the name of the interface through which the data and control planes interact. P4 allows the parser and packet processing to be defined from the controller, and settings and capabilities to be changed during switch operation, so the network scales easily. It is also protocol-independent: switches are not tied to any specific protocols, so any protocols can be applied, including OpenFlow, or none at all; in the latter case, monitoring is carried out purely through P4 programming. Functional packet processing thus does not depend on the specification of the underlying hardware; the compiler tracks all the capabilities listed in the P4 program code and compiles them onto the data plane.
Fig. 1. Pipeline processing in designed P4-switch
Reading and analysis of headers and validation of the IPv4 header are carried out at the header parser stage. The next two units are intended for TCP and UDP segments (if necessary, the language structure of the created P4 switches makes it easy to define new headers, even ones not specified in any standard); hereinafter only these segments are considered. The flow identification stage determines which of the known flows the incoming packet belongs to; if necessary, a new flow is created. The next step is the register update, when information about the newly received packet is added. The routing table routes flows and is built like the Flow_Table of OpenFlow switches, but with more advanced capabilities. Finally, the headers are reassembled in the appropriate sequence at the header deparser stage.
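The stage sequence above can be mirrored in a short Python sketch. This is a conceptual stand-in only — the real switch is written in P4 against the v1model, and all function and field names below are illustrative, not taken from the actual switch program:

```python
# Conceptual Python stand-in for the pipeline of Fig. 1 (not P4 code;
# all names are illustrative).

def parse_headers(pkt):
    # header parser stage: extract the fields used for identification
    return {k: pkt[k] for k in ("src", "dst", "proto", "sport", "dport")}

def identify_flow(hdrs, state):
    # flow identification stage: known flow -> existing id, else open a new one
    key = tuple(hdrs.values())
    if key in state["flows"]:
        return state["flows"][key], False
    fid = state["flow_ID"]            # single register: next free flow slot
    state["flows"][key] = fid
    state["flow_ID"] += 1
    return fid, True

def update_registers(fid, pkt, state):
    # register update stage: store per-packet arrival time and length
    state["stats"].setdefault(fid, []).append((pkt["ts"], pkt["len"]))

def process_packet(pkt, state):
    hdrs = parse_headers(pkt)
    fid, is_new = identify_flow(hdrs, state)
    update_registers(fid, pkt, state)
    return fid, is_new                # routing and deparsing omitted here

state = {"flows": {}, "flow_ID": 0, "stats": {}}
p = {"src": "10.0.0.1", "dst": "10.0.0.2", "proto": "UDP",
     "sport": 5000, "dport": 53, "ts": 1.0, "len": 120}
print(process_packet(p, state))   # -> (0, True): first packet opens flow 0
print(process_packet(p, state))   # -> (0, False): second packet updates flow 0
```

The dictionary lookup here hides exactly the part that is hard in P4 (no loops, no dynamic data structures), which is what Sect. 4 addresses.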
3 Memory Organization for Statistical Information About Flows in P4-switches

P4 switches have special memory cells, called registers, for storing information. Based on the number and purpose of the memory cells, three types of registers can be distinguished in the proposed implementation of the switch: the single register, the flow registers and the packet registers (Fig. 2).
The single register is a single memory cell; flow_ID belongs to this type. It serves as a unique flow identifier within the switch. Its values are the addresses of the register memory cells in which information on the corresponding flow is recorded; the value held in flow_ID at a given moment is the number of the memory cell into which the next new flow will be written. Flow registers are one-dimensional arrays whose size n equals the number of flows whose information should be stored in switch memory. The flow registers are: srcAddr (IP source address); dstAddr (IP destination address); protocol (transport layer protocol); srcPort (source port); dstPort (destination port); tos (the value of the DSCP field of the IP packet); packet_counter (a counter of the flow's packets that have passed through the switch); Packet_ID (the analog of flow_ID for packets).
Fig. 2. Types of registers organised in the developed switch structure
The register fields srcAddr, dstAddr, protocol, srcPort and dstPort form the 5-tuple value used to identify the flow at the network-wide level. The tos register field can be used not only directly to provide QoS by standard methods, but also as a possible label for further classification of flows by supervised learning. The packet registers are two-dimensional arrays organised from one-dimensional ones, in which the cell addresses are determined using the flow_ID and Packet_ID registers. For traffic classification by Machine Learning methods, two such registers were introduced: ingress_global_timestamp, the packet arrival time at the switch processing stage in microseconds, and packet_length, the total packet length in bytes. The size of such a register is n × m fields, where n is the number of flows and m is the number of packets whose information should be stored in switch memory.
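For illustration, the three register types map onto ordinary arrays as follows. This is a sketch with small made-up values of n and m; the field names follow the text above:

```python
# Sketch of the register memory layout (illustrative, not P4).
n, m = 4, 3          # flows and packets per flow kept in memory (demo values)

flow_ID = 0          # single register: one cell, the next free flow slot

# flow registers: one-dimensional arrays of size n, one entry per flow
FLOW_FIELDS = ("srcAddr", "dstAddr", "protocol", "srcPort", "dstPort",
               "tos", "packet_counter", "Packet_ID")
flow_regs = {field: [0] * n for field in FLOW_FIELDS}

# packet registers: n x m cells, flattened so that the cell of packet
# Packet_ID of flow flow_ID is at address flow_ID * m + Packet_ID
ingress_global_timestamp = [0] * (n * m)
packet_length = [0] * (n * m)

def packet_cell(fid, pid):
    # address computation used in Sect. 4
    return fid * m + pid

print(packet_cell(2, 1))   # -> 7: flow 2, packet 1
```

Flattening the two-dimensional packet registers into one array reflects the fact that P4 registers are one-dimensional; the address arithmetic does the indexing.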
4 Flows Identification Approaches

The correct functioning of the memory registers requires a flow identification mechanism capable of performing the following actions:

(1) Compare the 5-tuple of the received packet with the existing entries in the registers and determine whether the packet belongs to a new or an old flow relative to the switch, i.e. whether the switch holds information about this flow.

(2) If the packet belongs to an old flow, write the corresponding values into the register fields ingress_global_timestamp[i] and packet_length[i], where the cell address is i = flow_ID (of the flow found) × m + Packet_ID, and then increase packet_counter and Packet_ID for this flow by 1.

(3) If the packet belongs to a new flow, write its 5-tuple and DSCP into the corresponding cells of the flow registers at the address equal to the current value of flow_ID, record the packet arrival time and packet_length in the cells at address flow_ID × m, and then increase packet_counter, Packet_ID and flow_ID by 1.

The task is complicated by the absence of loops and similar constructs in the P4 programming language. In [10] it was proposed to store information about only one flow on a P4 switch: if a packet arrives from a new flow, the records about the previous flow are sent to the Control Plane, which then handles the storage of flow information. The disadvantages of this approach are the high load on the Control Plane, difficulties with a large number of flows passing simultaneously through the switch (the 5-tuple would require constant updating), and the inability to obtain information about each packet of active flows, because in this case the flow and packet registers hold a single value: instead of ingress_global_timestamp and packet_length, only average flow metrics are kept.
While developing the version of the P4 switch used here, three more methods of flow identification were devised, each capable of recording and storing information about m packets for each of n flows on the switch: using a flow register pool, using flow identification tables, and using the register pool and identification tables simultaneously.

a. The flow register pool consists of the 5-tuple registers and a special flag. When a packet arrives at the switch, it is compared one by one with all available 5-tuple register values. If the packet comes from an old flow, the flag is raised on the found flow and its flow_ID value becomes known; if the flag is never raised, the flow is new. The advantage of this identification method is that information about a new flow can be recorded in the Data Plane simultaneously with packet processing. The approach is effective when there are relatively few new flows; when the number of new flows is large, the program code becomes complicated, the identification speed decreases and the load on the switch grows significantly.

b. The flow identification table is based on the match-action table template actively used for routing. A feature of this type of table is the use of special memory types (TCAM and SRAM) with specific keyed-access algorithms that allow the necessary flows to be found in the table quickly.
It is proposed to use a table in which the 5-tuple values serve as the key; on a match, the specified action updates the registers, with the flow_ID taken from the table's action-data column as the address. If no match is found in the identification table, all register cells are filled at the address given by the current flow_ID value, and a special notification frame about the new flow is created (Fig. 3). The notification frame contains all 5-tuple fields, flow_ID and a flag identifying the exact type of notification sent. The created notification frame is then sent to the special switch port listened to by the notification server. The notification server identifies the notification by its flag, selects the 5-tuple fields as the key and flow_ID as the action data. Based on this information, the identification table server generates and sends a request to add the new entry to the flow identification table. This approach efficiently processes a rather large number of new flows: the code complexity does not depend on the number of flows, and the identification time and switch load do not grow as quickly as with the flow register pool. However, with this method the notification and identification table servers are involved in filling the table and creating records in it. Although records on all packets are formed as they pass through the switch, regardless of whether flow information is present in the identification table, a new flow entry appears in the table only after the packet has passed the switch. This can cause difficulties: if the entry for a new flow has not yet appeared before the second, and possibly the third, packet of the flow arrives, those packets will be considered new to the switch, will occupy memory cells as separate flows, and notification frames will be created and sent for each of them.

c. The approach proposed for subsequent use compensates for the disadvantages of the second method with the advantages of the first and vice versa; therefore, both the register pool and the flow identification tables are used (Fig. 3). The 5-tuple of each packet arriving at the switch port is first checked against the records of the flow identification table. If a record is found, the packet registers (ingress_global_timestamp and packet_length) and some flow registers (packet_counter and Packet_ID) are updated. If no record is found, the packet is considered to have arrived from a new flow and is sent to a backup flow register pool. Only a small backup pool is required (z records), and it does not have access to all cells of the flow register memory, only to a limited number of them; these records hold information on the last z new flows that have arrived at the switch. If a record of the arrived packet's flow is found in this pool, the packets of this flow have already passed through the switch but the new-flow record has not yet been added to the identification table; in this case, as in the previous one, the respective registers are simply updated. If the record is not found, the packet proceeds to the new flow generation stage. When a new flow is generated, the fields of all registers are filled in according to the current flow_ID, a notification
frame is generated and sent, and its subsequent processing is carried out according to the procedure described above. Also, flow_ID is incremented by 1, and since the flow register pool depends on flow_ID and shows only the last z values, the oldest existing record automatically becomes unavailable from the pool and a new record opens in its place.
Fig. 3. An algorithm for recording statistical information using identification tables and a backup pool of flow identifiers
As an additional control against duplicated flows, in parallel with the flow register pool in the Data Plane, information is checked in the Application Plane, where packets of one flow that accidentally fell into different flows can be recombined into a single flow. Note, however, that this does not protect against filling the flow register cells or generating and sending excessive notifications to the server; therefore this method does not replace the backup pool but only complements it. The number of backup pool records z should be selected so that the scenario in which duplicated flows are combined in the Application Plane is used only in exceptional cases. Thus, with the proposed approach to flow identification, it is possible to keep recording information about each packet, an increase in the number of flows does not critically affect performance, and the system is protected against duplicated flows and errors in the statistics records.
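A minimal Python model of this combined scheme (method c) may help fix the idea. The class below is an illustrative stand-in, not the P4 implementation: table installation by the server is modeled as an explicit call, and all names are invented for the sketch:

```python
from collections import deque

class FlowIdentifier:
    """Sketch of identification method (c): identification table first,
    then a backup pool of the last z new flows (illustrative, not P4)."""

    def __init__(self, z):
        self.table = {}              # identification table: 5-tuple -> flow_ID
        self.pool = deque(maxlen=z)  # backup pool: last z new 5-tuples
        self.pool_ids = {}           # 5-tuple -> flow_ID for pool entries
        self.flow_ID = 0             # single register: next free flow slot
        self.notifications = []      # frames sent to the notification server

    def handle(self, five_tuple):
        if five_tuple in self.table:             # fast path: table hit
            return self.table[five_tuple], "table"
        if five_tuple in self.pool:              # new flow, table row pending
            return self.pool_ids[five_tuple], "pool"
        fid = self.flow_ID                       # genuinely new flow
        self.flow_ID += 1
        self.pool.append(five_tuple)             # oldest entry auto-evicted
        self.pool_ids[five_tuple] = fid
        self.notifications.append((five_tuple, fid))  # notify the server
        return fid, "new"

    def server_adds(self, five_tuple, fid):
        # the identification-table server installs the row some time later
        self.table[five_tuple] = fid

ident = FlowIdentifier(z=2)
t = ("10.0.0.1", "10.0.0.2", "UDP", 5000, 53)
print(ident.handle(t))   # -> (0, 'new'): first packet of the flow
print(ident.handle(t))   # -> (0, 'pool'): table row not installed yet
ident.server_adds(t, 0)
print(ident.handle(t))   # -> (0, 'table'): later packets hit the table
```

Note how the second packet is served from the backup pool while the table entry is still pending — exactly the race the pool is introduced to cover; if the pool is too small, eviction reproduces the duplicated-flow problem described above.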
5 Method for Determining the Size of the Backup Pool of Flow Identifiers

One of the most important tasks when operating P4 switches is the correct definition of the switch parameters. In the program created in this work, n, m and z can be changed. n and m depend entirely on the network and the purpose of using the
information obtained from the switch. A more complex problem is determining the size of the backup flow register pool. The handling time of the first packet of the i-th flow in the Data Plane (Δt) includes the time of packet processing in the forwarding/routing table (t_m), the time of packet processing in the identification table as a function of the number k of flows recorded in it (t_ident_table(k)), the time of comparing the 5-tuple of a new flow with the z records of the register pool (z·t_pool), and the time of creating a new flow and generating and sending a notification frame to the notification server (t_new_flow):

Δt = t_m + t_ident_table(k) + z·t_pool + t_new_flow.

In Fig. 3, the time of transmitting a notification frame/request frame between the Data and Control Planes for adding a record to the identification table is t_transmission; the time of processing a notification and generating a request to add a record to the identification table is t_server; the arrival times of the first and second packets of the flow at the switch are τ1 and τ2; the arrival time of the i-th flow at the switch is T_i. Thus, the condition under which the backup flow register pool is required is

min(τ2 − τ1) < t_m + t_ident_table(k) + t_new_flow + 2·t_transmission + t_server,   (1)

that is, packets from one flow arrive before the record on the new flow is generated in the identification table. On the one hand, the number of records z can be determined from the ratio

Σ_{i=1}^{z+1} (T_{i+1} − T_i) > t_m + t_ident_table(k) + z·t_pool + t_new_flow + 2·t_transmission + t_server, where z > 0,   (2)

that is, if the register pool (z) is too small for the frequency of new flow arrivals, it overflows too soon, which results in duplicated records of the same flow. On the other hand, the processing time of a new packet at the Data Plane level should not be large, to minimize the packet processing delay:

(t_m + t_ident_table(k) + z·t_pool + t_new_flow) → min.   (3)
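Under these conditions the smallest admissible z can be found numerically. The sketch below implements condition (2) for a list of new-flow arrival times; the timing values are invented purely for illustration:

```python
def min_backup_pool_size(arrivals, t_m, t_ident, t_pool, t_new,
                         t_trans, t_server, z_max=100):
    """Smallest z satisfying condition (2): the time spanned by z+1
    consecutive inter-arrival gaps of new flows must exceed the
    per-flow processing budget (all timings are illustrative)."""
    for z in range(1, z_max + 1):
        budget = t_m + t_ident + z * t_pool + t_new + 2 * t_trans + t_server
        # worst case over all windows of z+1 consecutive gaps
        worst = min(arrivals[i + z + 1] - arrivals[i]
                    for i in range(len(arrivals) - z - 1))
        if worst > budget:
            return z
    return None

# new flows arriving every 0.2 ms; made-up timings in milliseconds
T = [0.2 * i for i in range(10)]
print(min_backup_pool_size(T, t_m=0.01, t_ident=0.02, t_pool=0.005,
                           t_new=0.05, t_trans=0.1, t_server=0.2))  # -> 2
```

Condition (3) then argues for taking the smallest such z, since every extra pool record adds another t_pool to the per-packet processing time.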
6 Assessment of the Number of Records in the Flow Register Pool in the Test Environment

The testbed for determining the optimal z parameter is implemented in the Mininet SDN network emulator using special P4 code that emulates the operation of the P4-switch. Since per-packet information is required for real-time classification, relatively small values are reserved for m and n: m = n = 1000. 1000 generated exponentially distributed UDP flows arrive at the switch.
Test 1. To determine how the time Δt + 2·t_transmission + t_server changes depending on z, z is varied from 0 to 100, and the identification table is not filled during the test runs. The results Δt and T are recorded. Based on the T values, Σ_{i=1}^{z+1}(T_{i+1} − T_i) is calculated for each measured size of the register pool; the 2·t_transmission + t_server values are recorded separately. The average measurement results are given in Fig. 4 (left), and the per-packet values for different z are shown in Fig. 5. The test showed that the minimum permissible size of the flow register pool in this situation is z = 10, and that the delay introduced into packet processing by the switch grows significantly when the size of the flow register pool increases considerably.
Fig. 4. Left: the dependence of handling time and the time between z + 1 flows on z. Right: the relationship between handling time and k
For comparison, the average controller response delay to the first packet is 5 ms in [11], whereas in our work a flow passing through the register pool is processed without waiting for a response, and the average time is about 0.5 ms. Test 2. To determine how the flow processing time in the switch at the ingress stage depends on the number of active flows recorded in the identification table, the switch is configured with z = 10, and k varies from 0 to 1000 during the test run. To maintain experimental integrity, no new records are stored in the identification table. Δt is calculated here. The measurement results are given in Fig. 4 (right). As the runs show, Δt changes insignificantly even when the identification table is filled to the maximum, so in this case its variation can be neglected. Tests with increasing numbers of records in various formats also confirm that identification tables should be used when operating with many active flows.
Fig. 5. The handling time per packet and the time between z + 1 flows for different z
7 Conclusions

The paper presents a new method for gathering statistical information on packets, in which a record of the packet length and its arrival time at the switch is stored individually for each packet. Flexible monitoring capabilities make it possible to register both all records for a flow and selected fields for specific packets. The implemented memory register system extends the scope of P4-switches to real-time traffic classification by Machine Learning methods. The developed flow identification methods enable selection of the optimal switch operation mode and initial parameters, minimising the packet processing delay and preventing the occurrence of duplicate information. As a result of the test runs, the parameters of the created P4-switch were determined; the switch is planned for use in further studies.
References

1. Sahoo, K.S., Mishra, S.K., Sahoo, S., Sahoo, B.: Software defined network: the next generation internet technology. Int. J. Wirel. Microwave Technol. (IJWMT) 7(11), 13–24 (2017). https://doi.org/10.5815/ijwmt.2017.02.02
2. Kumar, P., Dutta, R., Dagdi, R., Sooda, K., Naik, A.: A programmable and managed software defined network. Int. J. Comput. Netw. Inf. Security (IJCNIS) 9(12), 11–17 (2017). https://doi.org/10.5815/ijcnis.2017.12.02
3. Fathi, A., Kia, K.: A centralized controller as an approach in designing NoC. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 9(1), 60–67 (2017). https://doi.org/10.5815/ijmecs.2017.01.07
4. Mestres, A., Rodriguez-Natal, A., Carner, J., Barlet-Ros, P., Alarcon, E., Sole, M., Muntes-Mulero, V., Meyer, D., Barkai, S., Hibbett, M.J., et al.: Knowledge-defined networking. SIGCOMM Comput. Commun. Rev. 47(3), 2–10 (2017)
5. Troia, S., Martin, N., Rodriguez, A., Hernandez, J.A., et al.: Machine-learning-assisted routing in SDN-based optical networks. In: 44th European Conference on Optical Communication (ECOC), Rome, September 2018. https://doi.org/10.1109/ECOC.2018.8535437
6. Xie, J., Yu, F.R., et al.: A survey of machine learning techniques applied to software defined networking (SDN): research issues and challenges. IEEE Communications Surveys and Tutorials (2018). https://doi.org/10.1109/COMST.2018.2866942
7. Gomes, R.L., Madeira, M.E.R.: A traffic classification agent for virtual networks based on QoS classes. IEEE Latin Am. Trans. 10(3), 1734–1741 (2012)
8. Mankov, V.A., Krasnova, I.A.: Algorithm for dynamic classification of flows in a multiservice software defined network. T-Comm 11(12), 37–42 (2017)
9. Bosshart, P., Daly, D., Gibb, G., Izzard, M., McKeown, N., Rexford, J., Schlesinger, C., Talayco, D., Vahdat, A., Varghese, G., et al.: P4: programming protocol-independent packet processors. ACM SIGCOMM Comput. Commun. Rev. 44(3), 87–95 (2014)
10. Sonchack, J.: Feature rich flow monitoring with P4. https://open-nfp.org/the-classroom/feature-rich-flow-monitoring-with-P4. Accessed 15 May 2019
11. He, K., Khalid, J., Aaron, G.-J., et al.: Measuring control plane latency in SDN-enabled switches. In: The 1st ACM SIGCOMM Symposium (2015). https://doi.org/10.1145/2774993.2775069
A Model of Cognitive Disorders upon the Algebra of Fourier-Dual Operations

A. V. Pavlov

ITMO University, 49 Kronverksky Pr., St. Petersburg 197101, Russia
[email protected]
Abstract. A model of cognitive disorders based on the algebra of Fourier-dual operations is proposed. The model meets the criterion of biological motivation and describes the main stages of perceiving new information that contradicts existing subjective knowledge: cognitive dissonance and its reduction. It is demonstrated that weakening of the low frequencies in the existing pattern of knowledge representation leads to dynamics of cognitive dissonance reduction similar first to the manifestation of hysteria and then to schizophrenia.

Keywords: Cognitive system · Cognitive disorders · Non-monotonic logics · Fourier-duality · Dynamical system · Cognitive dissonance · Hysteria · Hysterics
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 117–128, 2020. https://doi.org/10.1007/978-3-030-39216-1_12

1 Introduction

One of the most important attributes of a cognitive system is the ability to self-correct the inner representation of the outer world (IRW), in other words the inner image of the world, by including new information in it. The greatest difficulties in this task arise in the perception of information that contradicts previously established rules represented by the IRW. In this case, cognitive dissonance arises in the system [1, 2], whose reduction normally leads to the formation of a new IRW in which both old and new information are consistent. Both the scenario of cognitive dissonance reduction and the characteristics of the new IRW are inextricably associated with the cognitive style of the system. Two extreme types are conventionally distinguished: the first is the "scientific" style, focused on the adequacy of the corrected IRW to the real world; the second is the "ordinary" cognitive style, whose priority is IRW stability, even despite possible inadequacy to reality [3]. Functional causes of cognitive disorders are of practical interest both in medicine and psychology and in the design of artificial cognitive systems. As applied to humans, this concerns, first of all, deviations from accepted mental-health norms manifested during the perception of new information. The approach to this problem is based on the fact that the individual characteristics of a cognitive system, including the cognitive style and the scenario of cognitive dissonance reduction, are determined by the characteristics of its material carrier. These are the characteristics of the sensors and sensory pathways that work at the stage of initial sensory perception, as well as the properties of the neural network that implements the apperception stage as the highest
cognitive function of matching new information with the existing IRW and forming a new corrected IRW. Thus, following this approach, in order to identify functional causes of the disorders, it would be useful to have a model that links the properties of the material carrier of the cognitive system with the characteristics of the IRW and describes the manifestations of the disorders [4]. If perceived information contradicts previously established logical rules, then non-monotonic logic must be used to describe the logical conclusion generated by the cognitive system [5]. This approach was used in [6, 7], where it was demonstrated that a model of logic with exception, built upon the biologically motivated algebra of Fourier-dual operations, described a number of scenarios of cognitive dissonance reduction. In [6] the dependence of the scenario on filtration was studied, and it was found that in a very narrow range of filtrations with weakening of the low frequencies the logical conclusion was unstable; the character of this instability was phenomenologically analogous to hysterics, that is, to the manifestation of hysteria. Hysteria has been well known since ancient times, but many myths and misconceptions surround its functional causes, and there is still no universally accepted theory of them. In the context of searching for the functional causes of hysteria, attention is drawn to [7], which described, for people with hysterical symptoms, increased attention to details at the level of existing rules. These results are consistent with those mentioned above, since weakening of the low frequencies in the IRW leads to increased attention to details. The aim of this research is to find possible causes of neural network response behavior similar to hysterics as a manifestation of hysteria. The rest of the paper is organized as follows. In Sect.
2, the algebra of Fourier-dual operations and the neural network that generates the model are described. Section 3 presents the conditions and results of the numerical investigation. We demonstrate that such a peculiarity of perception as the weakening of the low frequencies in a narrow range of filtrations leads to a weak instability of the system response under the perception of new information that contradicts existing knowledge, i.e. to an alternation mode of quasi-stable responses, and this intermittency is phenomenologically similar to manifestations of hysteria. Further weakening of the low frequencies leads to the formation of a new stable response of the system, but the new response and the internal picture of the world are partly similar to those in schizophrenia. The article ends with a conclusion that discusses and analyzes the obtained results and formulates the final conclusions.
2 Model Let
be an algebra, where F is a set of the model elements, defined
by , X is an universal set, C is a field of complex numbers so, that 8c 2 C : ReðcÞ 2 ½0:1, and are the operation of algebraic product and convolution, the latter implements abstract addition, their duality is established by the Fourier transform:
A Model of Cognitive Disorders upon the Algebra of Fourier-Dual Operations
ð1Þ
where δ and U are the special constants that are the additive and multiplicative neutral elements, respectively. Following the definition of abstract subtraction as addition with the opposite element [8], subtraction in this algebra is implemented by the correlation:
A(x) − B(x) = A(x) ∗ B*(−x) = ∫_{x_min}^{x_max} A(x) B*(Δ − (−x)) dx = ∫_{x_min}^{x_max} A(z − Δ) B*(z) dz = Corr_AB(−Δ) = A(x) ⊗ B(x)   (2)
where the asterisk denotes complex conjugation, ⊗ denotes correlation, and the sign “−” at the argument of the correlation function Δ arises from the unrealizability of the inverse Fourier transform. It is easy to see that the algebra of Fourier-dual operations is, by definition, an algebra of fuzzy sets: even if the original elements of the model are crisp, the Fourier transform, convolution, or correlation transforms them into fuzzy sets. Accordingly, this is an algebra of fuzzy logic, in which the logical conclusion “Generalized Modus Ponens” (GMP) is also implemented by the subtraction operation (2). The choice of the latter instead of addition for the GMP implementation gives the unimodality of the logical conclusion, which is necessary for meeting the condition of unambiguity of the command for the executive bodies. Note that, according to the results of neurophysiological studies, the neural structures of the human visual system implement matched filtering, i.e., a double Fourier transform [9]. Thus, the model meets the criterion of biological motivation. To make the step from monotonic logic (GMP) to logic with exclusion, the logic is to be complemented by the operation of exclusion. Very often the negation operation is used; however, a more general approach is based on the operation of duality, here the Fourier transform (1). Thus, the logic with exclusion has to store two rules linked by the Fourier transform: GMP and exclusion (E). In the neural network (NN) implementation of the logic, the rules are stored in the interconnection weights. The GMP rule is stored by the interconnection matrix formed in accordance with the Hebb learning rule:

W^GMP(ν) = η_GMP(F(A_Low(x)))   (3)
A. V. Pavlov
where A_Low(x) describes the neural ensemble representing the lowest value of the input variable that was used to train the NN, and η_GMP is the operator of the recording medium (synaptic sensitivity). The exception rule E is stored by the weight matrix:

W^E(ξ) = F(A_E(ν)) · C^GMP_OutHigh(ξ)   (4)
where A_E(ν) is the neural ensemble that was used to train the NN on the exception, and C^GMP_OutHigh(ξ) is the response of the NN according to the GMP rule when the highest value of the input variable is presented at the input. The NN is shown schematically in Fig. 1; its functional diagram is shown in Fig. 2.
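Before turning to the network dynamics, the Fourier-dual subtraction (2) can be illustrated numerically: the sketch below computes the correlation through the frequency domain and checks it against the direct sum. This is plain NumPy; the Gaussian ensembles and sizes are illustrative assumptions, not the paper's reference images.

```python
import numpy as np

# Sketch of Eq. (2): abstract "subtraction" in the Fourier-dual algebra is a
# cross-correlation, computed via the duality Corr_AB = F^-1{ F(A) . conj(F(B)) }
# (circular version; the Gaussian ensembles below are illustrative only).
N = 256
x = np.arange(N)
A = np.exp(-0.5 * ((x - 100) / 8.0) ** 2)   # fuzzy ensemble A(x)
B = np.exp(-0.5 * ((x - 120) / 8.0) ** 2)   # fuzzy ensemble B(x)

corr_fft = np.fft.ifft(np.fft.fft(A) * np.conj(np.fft.fft(B))).real

# Direct circular-correlation sum for comparison: sum_m A(m) B((m - d) mod N)
corr_direct = np.array([np.sum(A * np.roll(B, d)) for d in range(N)])

assert np.allclose(corr_fft, corr_direct, atol=1e-9)
# The global maximum sits at the circular shift between the two ensembles
assert int(np.argmax(corr_fft)) == (100 - 120) % N
```

The single global maximum of the correlation is the unimodal conclusion that the GMP stage relies on.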
Fig. 1. Three-layer NN implementing the logic with exception: In, C, E are the input, GMP-conclusion, and exception-conclusion layers, respectively; W^GMP and W^E are the weight matrices (3) and (4); F denotes the Fourier-transform devices; GMP and Ex are the stages of monotonic logic (GMP) and exclusion, respectively
Fig. 2. Functional diagram of the NN: W^GMP W^E is the composite weight matrix
If the neural ensemble B(x) in the input layer is activated by the perceived information, then the NN response in layer C is represented by the ensemble C(ξ):

C(ξ) = [B(x) ⊗ A(x)]_η   (5)
where ξ is a coordinate in the C-layer and the subscript η denotes filtration on the matrix (3) due to the nonlinearity of the weight recording. Response (5), transmitted through matrix (4), forms the neural ensemble in the exception layer E, described as follows:

E(ν) = F{ C(ξ) · η_E[ F(A_E(ν)) · C^GMP_OutHigh(ξ) ] }   (6)
where η_E is the operator that takes into account the nonlinearity of the recording medium used for recording matrix (4), i.e., the nonlinearity of the synaptic sensitivity. Since the NN of Fig. 1 is a loop formed by linking layer E with layer In, the neural ensemble (6) is transferred to the input layer In, the pass In → C → E → In is repeated, and a new neural ensemble is activated in the correlation layer C:

C₂(ξ) = E(x) ⊗ A_η(x)   (7)
Expressions (5)–(7) describe the first iteration, as a result of which the logical conclusion (7) takes into account both the old knowledge and the new information. It is therefore convenient to consider a new composite matrix storing both the old and the new information:

W(ν) = W^GMP(ν) W^E(ν)   (8)
where W is formed of vector-columns. Then the state of the system at the n-th iteration is described by a pair of dual equations:

C_n(x) = F[ C_{n−1}(x) · F(W(ν)) ],   (9)

F(C_n(x)) = C_{n−1}(x) · W(ν).   (10)
The term F(W(ν)) in (9) describes the new composite IRW, including the old rule and the new exception; at the same time it is also a dissipative factor causing the convergence of the system dynamics. The steady state of the system is given by solving Eqs. (9), (10); for analysis it is convenient to present it in the form:

W(ν) = F(C_n(x)) / C_{n−1}(x)   (11)
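A toy numerical sketch can convey how the looped dynamics (9) behaves: each pass maps the current response through a fixed weight spectrum in the Fourier domain, which acts as a dissipative factor. The Gaussian weight spectrum, sizes, and normalization below are illustrative assumptions, not the recorded matrix (8).

```python
import numpy as np

# Toy illustration of the dissipative loop (9): each iteration maps the
# current response through a fixed weight spectrum in the Fourier domain.
# W_hat is an assumed Gaussian weight spectrum, not the paper's matrix (8).
N = 512
nu = np.fft.fftfreq(N, d=1.0 / N)
W_hat = np.exp(-(nu / 8.0) ** 2)              # assumed low-pass weight spectrum
C = np.random.default_rng(0).normal(size=N)   # arbitrary initial response

for n in range(50):
    C = np.fft.ifft(np.fft.fft(C) * W_hat).real
    C /= np.abs(C).max()                      # keep the amplitude bounded

# After many passes only the weakly attenuated frequency content survives,
# i.e. the iteration converges toward a stable response
spectrum = np.abs(np.fft.fft(C))
assert spectrum[:4].max() > 100 * spectrum[N // 4]
```

With this kind of smooth weighting the iteration converges; the paper's point is precisely that a particular low-frequency weakening breaks this convergence.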
Since an analytical solution of Eq. (11) is difficult in general, numerical simulation was used to analyze the factors affecting the characteristics of the IRW in conjunction with the dynamics of cognitive dissonance reduction and the properties of the logical conclusion C(ν). The numerical investigation was based on the following considerations. The IRW should be characterized by its inner connectivity, reflecting such an attribute of real information as the interconnection of its elements. To evaluate the inner connectivity it is natural to use a correlation measure: the correlation length, or the radius of the global maximum of the autocorrelation function (GM ACF). If the GM ACF is monotonous, and monotony of the GM ACF is characteristic of real information, then this approach is correct. At the same time, the filtration that is always present in a real system may lead to
a change in the form of the GM ACF, and, under certain filtering parameters, non-monotony of the GM ACF is possible. In this case, using the correlation radius alone is no longer enough; an analysis of the whole GM ACF is necessary.
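The correlation-length measure just described can be sketched numerically: the radius at the level 0.606 is taken as the first lag at which the normalized ACF drops below that level. The test signals here are illustrative, not the IRW itself.

```python
import numpy as np

# Sketch of the inner-connectivity measure: the correlation length is the
# radius of the global maximum of the autocorrelation function (GM ACF) at
# the level 0.606. The test signals are illustrative, not the IRW itself.
def correlation_radius(signal, level=0.606):
    acf = np.correlate(signal, signal, mode="full")
    acf = acf / acf.max()                  # normalize the GM to 1
    right = acf[len(acf) // 2:]            # one-sided ACF from the GM
    below = np.flatnonzero(right < level)  # first crossing below the level
    return int(below[0]) if below.size else len(right)

x = np.arange(1024)
smooth = np.exp(-0.5 * ((x - 512) / 30.0) ** 2)
sharp = np.exp(-0.5 * ((x - 512) / 5.0) ** 2)
# A smoother signal has a larger correlation length (higher connectivity)
assert correlation_radius(smooth) > correlation_radius(sharp)
```

When the GM ACF becomes non-monotonous, this single radius is exactly what stops being informative, as discussed above.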
3 Modeling

The filter was added to the interconnection weights (3) and was set by the model:

H_F(ν) = exp( −(ν − ν₀)² / (2(σ_{ν0.606})²) ) if ν ≤ ν₀;  H_F(ν) = 1 if ν > ν₀   (12)
where ν is a frequency, the parameter σ_{ν0.606} is the radius at the level 0.606, and ν₀ is the central frequency of the Gaussian function. The effect of the weakening of low frequencies by the filter (12) on the dynamics of the system response C_n(ξ) was investigated. Filtration is evaluated by the relative bandwidth in the range of low-frequency weakening, ν_R = σ_{ν0.606}/ν₀. The system response is evaluated by the radius of its maximum at the level 0.606. The dynamics of the response C_n(ξ) was investigated in the range of iterations n ∈ [0, 524200] for a number of spectra ratios V = ν^Low_{0.606}/ν^E_{0.606}, where ν^Low_{0.606} and ν^E_{0.606} are the parameters of the Gaussian amplitude spectra (frequencies at the level 0.606) of the reference images used to record the GMP (3) and exclusion (4) matrices, respectively. Three typical kinds of dynamics of the investigated system are presented in Fig. 3; the corresponding responses are shown in Fig. 4.
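The filter model (12) and the relative bandwidth ν_R = σ_{ν0.606}/ν₀ can be transcribed directly; the sketch below uses the parameter values of the mode in Fig. 3b.

```python
import numpy as np

# Direct transcription of the filter model (12): Gaussian weakening of the
# frequencies below the central frequency nu0, flat response above it.
def H_F(nu, nu0, sigma):
    nu = np.asarray(nu, dtype=float)
    gauss = np.exp(-((nu - nu0) ** 2) / (2.0 * sigma ** 2))
    return np.where(nu <= nu0, gauss, 1.0)

nu = np.arange(0, 128)
nu0, nu_R = 40, 0.43                     # parameters of the mode in Fig. 3b
h = H_F(nu, nu0, sigma=nu_R * nu0)       # nu_R = sigma / nu0

assert np.all(h[nu > nu0] == 1.0)        # no attenuation above nu0
assert h[0] < h[20] < h[40] == 1.0       # low frequencies are weakened
```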
Fig. 3. Three typical examples of the dynamics of the NN of Fig. 1 for V = 8, ν₀ = 40: a – convergent dynamics, unimodal response, ν_R = 0.45; b – alternation mode, quasi-stable response, ν_R = 0.43 (hysterics); c – mode of a stable multimodal response, ν_R = 0.35
Fig. 4. Responses in the correlation plane for the dynamic modes shown in Fig. 3, respectively; the response in Fig. 4a corresponds to the quasi-stable stage in Fig. 3b
Similar types of dynamics were obtained for other values of the system parameters: the spectra ratio V and the filter parameters ν₀ and ν_R. Let us pay attention to mode b in Fig. 3. This kind of dynamics is similar to the phenomenon of hysterics, a manifestation of hysteria: the quasi-stable unimodal response shown in Fig. 4b from time to time loses its stability and turns into an unstable multimodal one, each time returning by itself to the quasi-stable unimodal one. To illustrate this kind of dynamics, typical responses for a number of iterations within one hysterics episode are shown in Fig. 5. The figure shows that during the “hysterics” the unimodal quasi-stable response (Fig. 5b.1) loses its stability and transforms into a multimodal one (Fig. 5b.2 and b.3). As the “hysterics” subsides, the response returns to a unimodal one. To illustrate the dependence of this kind of dynamics on the filter parameters, Fig. 6 presents the hysterics frequency versus the relative bandwidth ν_R of the filter (12) for three values of the spectra ratio V and the filter parameter ν₀.
Fig. 5. Typical responses of the NN of Fig. 1 for the quasi-stable mode of the dynamics shown in Fig. 3b for V = 8, ν₀ = 40, ν_R = 0.43: b.1 – n = 170, quasi-stable response; b.2 – n = 200, failure of stability, multimodal response; b.3 – n = 205, unstable response; b.4 – n = 209, attenuation of the instability
Fig. 6. Dependences of the hysterics frequency on the relative frequency bandwidth of the filter ν_R: V = 4 – solid lines, V = 6 – dotted lines, V = 8 – dashed lines
For a more visual identification of the relationship between the hysterical modes and the peculiarities of the IRW represented by the matrix (8), Fig. 7 shows the autocorrelation functions (ACF) of the IRW in the GM ACF region at V = 8, ν₀ = 40 for a number of ν_R values, and Fig. 8 shows the corresponding filtering functions.
Fig. 7. Sections of the GM ACF: 1 – ν_R = 1; a – ν_R = 0.45; b – ν_R = 0.43; c – ν_R = 0.35
Fig. 8. Filtering functions; the curves correspond to those in Fig. 7
Attention should be paid to the modification of the GM ACF due to the filtration: a flexure of the GM ACF that develops into a new local minimum. This GM ACF flexure reflects a violation of the integrity of the IRW, namely the appearance of a gap at small distances (in Fig. 7, approximately 30 pixels) between general and detailed features, resulting from the weakening of low frequencies. This differs significantly from what happens under “usual” high-frequency filtering, which results in a decrease of the GM ACF radius at all levels, i.e., while maintaining its shape. Figures 9 and 10 show the dependences of the correlation length on the parameter ν_R for V = 6 (Fig. 9) and V = 8 (Fig. 10); the ranges of unstable modes are bounded by the vertical dashed lines.
Fig. 9. Dependences of the correlation length for a number of levels for V = 6
Fig. 10. Dependences of the correlation length for a number of levels for V = 8
It can be seen from Figs. 9 and 10 that a sharp decrease in the correlation length precedes the transition of the system to the intermittency mode both on the left (when filtering is strengthened), predicting the transition from a stable unimodal response to an unstable one, and on the right (when filtering is weakened), preceding the transition from a stable multimodal response to an unstable unimodal one. Thus, the correlation length of the composite image of the references A(x,y) and E(x,y) recorded in the weight matrices is a system order parameter whose control allows one to predict the transition of the system to an unstable mode with intermittency.
Figures 11 and 12 show the dependences of the standard deviation obtained by approximating the IRW ACF with a model that is adequate to the ACF of a number of real processes [10]:

C(ξ) = cos(aξ) · exp( −ξ² / (2b²) )   (13)
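A fit of model (13) can be sketched on synthetic data; the standard deviation of the residuals is the quantity plotted in Figs. 11 and 12. The grid-search fitting procedure, parameter values, and noise level below are illustrative assumptions.

```python
import numpy as np

# Sketch of the ACF approximation by model (13): C(xi) = cos(a*xi)*exp(-xi^2/(2 b^2)).
# Parameters are found by a simple grid search on synthetic data; the residual
# standard deviation corresponds to the quantity shown in Figs. 11 and 12.
def model(xi, a, b):
    return np.cos(a * xi) * np.exp(-xi ** 2 / (2.0 * b ** 2))

xi = np.linspace(-100.0, 100.0, 401)
rng = np.random.default_rng(1)
data = model(xi, 0.05, 40.0) + rng.normal(scale=0.01, size=xi.size)

mse, a_fit, b_fit = min(
    (float(np.mean((data - model(xi, a, b)) ** 2)), a, b)
    for a in np.linspace(0.03, 0.07, 81)
    for b in np.linspace(20.0, 60.0, 81)
)
residual_std = float(np.sqrt(mse))

assert abs(a_fit - 0.05) < 0.002 and abs(b_fit - 40.0) < 3.0
assert residual_std < 0.02   # small residuals: the ACF obeys the model
```

In the paper's setting, a sharp growth of this residual standard deviation signals that the IRW ACF no longer obeys model (13), i.e., the approach of a hysterics range.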
The dependences of the parameters of the approximation function (13) on the filter (12) parameter ν_R are shown in Figs. 13 and 14 for V = 6 and V = 8, respectively. It can be seen that in the range of hysterics, bounded in the graphs by the vertical dashed lines, the standard deviation of the approximation sharply increases, reflecting the rearrangement of the IRW ACF. Then, on passing to the stable multimodal response mode, the standard deviation decreases again: the IRW ACF is rebuilt to new values of the model (13) parameters. A further increase in the standard deviation reflects the approach of the next range of hysterics, accompanied by the restructuring of the stable response described above.
Fig. 11. Standard deviation of the approximation of the IRW ACF by (13) as a function of ν_R for V = 6: solid line – approximation over a range of 200 pixels, dotted line – 2000 pixels
Fig. 12. Standard deviation of the approximation of the IRW ACF by (13) as a function of ν_R for V = 8: solid line – over a range of 200 pixels, dotted line – 2000 pixels
Fig. 13. Dependences of the approximation parameters for V = 6
Fig. 14. Dependences of the approximation parameters for V = 8
4 Conclusions

Thus, when a neural network implementing logic with exception perceives new information that contradicts previously learned rules, instability of the logical conclusion is possible. This instability occurs in a narrow filtering range characterized by strong attenuation of low frequencies in a narrow band; this filtering range corresponds to the restructuring of the network response dynamics from stable unimodal to stable multimodal. The instability of the logical conclusion manifests itself in intermittency of the network response: a stable unimodal response periodically loses its stability, passes to an unstable multimodal response, and then returns by itself to a quasi-stable unimodal response. The physical reason for this filtering is the limited dynamic range of the media that record the weights of interneuron connections. The informational cause of the instability is a filtering-induced violation of the internal coherence of the stored information within the correlation length: the appearance of a gap between general and particular features in the internal picture of the world.
Since such instability is similar to the phenomenon of hysterics, i.e., a manifestation of hysteria, it can be assumed that a functional cause of hysteria may be a personal feature of perception: excessive attention to detail to the detriment of attention to the general. These conclusions are also relevant for artificial cognitive systems, as they show the danger of rejecting low frequencies while optimizing their technical parameters: the weakening of low frequencies is often used to improve the detail of the perceived information and to reduce the influence of environmental factors. Acknowledgments. This work was supported by the Russian Foundation for Basic Research, grant 18-01-00676-a. I would like to thank Prof. Igor B. Fominykh and Alexander V. Mullin for fruitful and helpful discussions.
References

1. Festinger, L.: A Theory of Cognitive Dissonance. Stanford University Press, Stanford (1962)
2. Heckhausen, H.: Motivation und Handeln. Lehrbuch der Motivationspsychologie. Springer, Berlin (2010)
3. Kuznetsov, O.P.: Bystrye processy mozga i obrabotka obrazov [Fast brain processes and image processing]. Novosti iskusstvennogo intellekta 2 (1998). (in Russian)
4. Reiter, R.: A logic for default reasoning. Artif. Intell. 13(1–2), 81–132 (1980)
5. Pavlov, A.V.: Logic with exception upon the algebra of Fourier-dual operations: a mechanism of cognitive dissonance reducing. Nauchno-technicheskii vestnik of IT, Mech. Opt. 89(1), 17–25 (2014). (in Russian)
6. Pavlov, A.V.: The influence of hologram recording conditions and nonlinearity of recording media on the dynamic characteristics of the Fourier holography scheme with resonance architecture. Opt. Spectrosc. 119(1), 146–154 (2015)
7. Edwards, M.J., Adams, R.A., Brown, H., Pareés, I., Friston, K.J.: A Bayesian account of “hysteria”. Brain, 1–18 (2012). https://doi.org/10.1093/brain/aws129
8. Dubois, D., Prade, H.: Fuzzy numbers: an overview. In: Bezdek, J.C. (ed.) Analysis of Fuzzy Information, vol. 1, pp. 3–39. Boca Raton, FL (1987)
9. Glezer, V.D.: Matched filtering in the visual system. J. Opt. Technol. 66(10), 853–856 (1999)
10. Yaglom, A.M.: Correlation Theory of Stationary Random Functions. Gidrometeoizdat, Leningrad (1981). (in Russian)
Intelligent OFDM Telecommunication Systems Based on Many-Parameter Complex or Quaternion Fourier Transforms

Valeriy G. Labunets¹ and Ekaterina Ostheimer²

¹ Ural State Forest Engineering University, 37 Sibirskiy Trakt, 620100 Ekaterinburg, Russian Federation, [email protected]
² Capricat LLC, Pompano Beach, FL, USA
Abstract. In this paper, we propose novel Intelligent quaternion OFDM telecommunication systems (TCS) based on many-parameter complex and quaternion Fourier transforms (MPFTs). The new systems use the inverse MPFT (IMPFT) for modulation at the transmitter and the direct MPFT (DMPFT) for demodulation at the receiver. The purpose of employing the MPFT is to improve: (1) the physical-layer security (PHY-LS) of wireless transmissions against wide-band eavesdropping and jamming; (2) the bit error rate (BER) performance with respect to the conventional OFDM-TCS; (3) the peak-to-average power ratio (PAPR). Each MPFT depends on a finite set of independent parameters (angles). When the parameters are changed, the many-parameter transform changes as well, taking the form of different quaternion orthogonal transforms. For this reason, the concrete values of the parameters serve as a specific “key” for entry into the OFDM-TCS. The vector of parameters belongs to a multi-dimensional torus space; scanning this space to find the “key” (the concrete values of the parameters) is a hard problem. Keywords: Many-parameter transforms · Complex and quaternion Fourier transform · OFDM · Noncommutative modulation and demodulation · Telecommunication system · Anti-eavesdropping communication
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 129–144, 2020. https://doi.org/10.1007/978-3-030-39216-1_13

1 Introduction

In today’s world, an important aspect of communication and technology is security. Wars are being fought in the virtual world rather than in the real world, and there is a rapid increase in cyber warfare. Ensuring information security is of paramount importance for wireless communications. Due to the broadcast nature of radio propagation, any receiver within the coverage range can listen to and analyze the transmission without being detected, which makes wireless networks vulnerable to eavesdropping and jamming attacks. Orthogonal Frequency-Division Multiplexing (OFDM) has been widely employed in modern wireless communication networks. Unfortunately, conventional OFDM signals are vulnerable to malicious eavesdropping and jamming attacks due to their distinct time and frequency characteristics. The communication that happens between two legitimate agents needs to be authorized, authentic and secured. Hence, in order to design a secured communication, we need a secret key that can be
used to encode the data in order to prevent phishing. Therefore, there is a need to generate a secret key from the information already available. This key should not be shared, as the wireless channel remains vulnerable to attack; it should be generated by the communicating legitimate agents. Traditionally, cryptographic algorithms/protocols implemented at the upper layers of the open systems interconnection (OSI) protocol stack have been widely used to prevent information disclosure to unauthorized users [1]. However, the layered design architecture with a transparent physical layer leads to a loss in security functionality [2], especially for wireless communication scenarios, where a common physical medium is always shared by legitimate and non-legitimate users. Moreover, cryptographic protocols can only offer computational security [3]. As an alternative, exploiting physical-layer characteristics for secure transmission has become an emerging hot topic in wireless communications [4–7]. The pioneering work by Wyner [4] introduced the concept of “secrecy capacity” as a metric for PHY-layer security (PHY-LS). It pointed out that perfect security is in fact possible without the aid of encryption keys when the source-eavesdropper channel is a degraded version of the source-destination (main) channel. Since adversaries can blindly estimate the parameters of OFDM signals at the physical layer, traditional upper-layer security mechanisms cannot completely address security threats in wireless OFDM systems. Physical-layer security, which targets communication security at the physical layer, is emerging as an effective complement to traditional security strategies in securing wireless OFDM transmission. The physical-layer security of OFDM systems over wireless channels was investigated from an information-theoretic perspective in [8].
In this paper, we propose a simple and effective anti-eavesdropping and anti-jamming Intelligent OFDM system based on many-parameter complex or quaternion Fourier transforms (MPFTs) U_{2^n}(φ_1, φ_2, ..., φ_q), and we aim to investigate the superiority and practicability of MPFTs from the physical-layer security (PHY-LS) perspective. The main advantage of using MPFTs in an OFDM TCS is that they yield a very flexible anti-eavesdropping and anti-jamming Intelligent OFDM TCS. The paper is organized as follows. Section 2 presents a brief introduction to the conventional OFDM system, the notation used in the paper, and the novel Intelligent OFDM-TCS based on the MPFTs U_{2^n}(φ_1, φ_2, ..., φ_q). Sections 3 and 4 present many-parameter complex- and quaternion-valued Fourier transforms, respectively. In Sect. 5 we introduce new many-parameter complex- and quaternion-valued all-pass filters.
2 Intelligent Complex and Quaternion OFDM TCS

The conventional OFDM is a multi-carrier modulation technique: a basic technology offering high-speed transmission capability with bandwidth efficiency and robust performance in multipath fading environments. OFDM divides the available spectrum into a number of parallel orthogonal sub-carriers, and each sub-carrier is then modulated by a low-rate data stream at a different carrier frequency. In the OFDM system, the
modulation and demodulation can be applied easily by means of the inverse and direct discrete Fourier transforms (DFT). The conventional OFDM will be denoted by the symbol F_N-OFDM. Conventional OFDM-TCS makes use of the orthogonality of the multiple sub-carriers e^{j2πkn/N} (discrete complex exponential harmonics). All sub-carriers {subc_k(n)}_{k=0}^{N-1} = {e^{j2πkn/N}}_{k=0}^{N-1} form the matrix of the discrete orthogonal Fourier transform F_N = [subc_k(n)]_{k,n=0}^{N-1} = [e^{j2πkn/N}]_{k,n=0}^{N-1}. In time, the idea of using fast algorithms of different orthogonal transforms U_N = [subc_k(n)]_{k,n=0}^{N-1} for a software-based implementation of the OFDM modulator and demodulator transformed this technique from an attractive theoretical concept into a practical one [9, 10]. An OFDM-TCS based on an arbitrary orthogonal (unitary) transform U_N will be denoted as U_N-OFDM. The idea that links F_N-OFDM and U_N-OFDM is that, in the same manner that the complex exponentials {e^{j2πkn/N}}_{k=0}^{N-1} are orthogonal to each other, the members of a family of U_N sub-carriers {subc_k(n)}_{k=0}^{N-1} (rows of the matrix U_N) satisfy the same property. The U_N-OFDM reshapes the multi-carrier transmission concept by using the carriers {subc_k(n)}_{k=0}^{N-1} instead of OFDM's complex exponentials {e^{j2πkn/N}}_{k=0}^{N-1}. There are a number of candidates for the orthogonal function sets used in OFDM-TCS: discrete wavelet sub-carriers [11, 12], Golay complementary sequences [13–15], Walsh functions [16, 17], and pseudo-random sequences [18, 19]. An Intelligent OFDM TCS can be described as a dynamically reconfigurable TCS that adaptively regulates its internal parameters in response to changes in the surrounding environment. One of the most important capacities of Intelligent OFDM systems is their capability to optimally adapt their operating parameters based on observations and previous experiences. There are several possible approaches towards realizing such intelligent capabilities.
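The F_N-OFDM principle above can be sketched in a few lines: modulation by the inverse DFT, demodulation by the direct DFT, with exact recovery guaranteed by sub-carrier orthogonality. The sizes and symbol alphabet are illustrative assumptions.

```python
import numpy as np

# Sketch of the F_N-OFDM principle: data symbols modulate orthogonal
# sub-carriers via the inverse DFT and are recovered by the direct DFT.
N = 8
rng = np.random.default_rng(2)
S = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=N)  # data symbols

s = np.fft.ifft(S)       # modulation: composite signal on N sub-carriers
S_hat = np.fft.fft(s)    # demodulation: projection onto the sub-carriers

assert np.allclose(S_hat, S)                    # exact recovery
# Recovery is exact because the sub-carrier (DFT) matrix is orthogonal:
F = np.fft.fft(np.eye(N))
assert np.allclose(F.conj().T @ F, N * np.eye(N))
```

Replacing `np.fft.ifft`/`np.fft.fft` by any other unitary pair gives exactly the U_N-OFDM scheme discussed above.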
In this work, we aim to investigate the superiority and practicability of MPFTs from the physical-layer security perspective, and we propose a simple and effective anti-eavesdropping and anti-jamming Intelligent OFDM system based on many-parameter transforms. In our Intelligent OFDM-TCS we use complex or quaternion MPFTs U_N(φ_1, φ_2, ..., φ_q) instead of the ordinary DFT F_N. Each MPFT depends on a finite set of free parameters θ = (φ_1, φ_2, ..., φ_q), each of which can take values from 0 to 2π. When the parameters are changed, the MPFT changes too, taking the form of known (and unknown) complex or quaternion transforms. The vector of parameters θ = (φ_1, φ_2, ..., φ_q) ∈ Tor_q[0, 2π] = [0, 2π]^q belongs to the q-D torus. When the vector (φ_1, φ_2, ..., φ_q) runs over the whole q-D torus Tor_q[0, 2π], an ensemble of orthogonal quaternion transforms is created. The Intelligent OFDM system uses some concrete values of the parameters φ_1 = φ_1^0, φ_2 = φ_2^0, ..., φ_q = φ_q^0, i.e., a concrete realization of the MPFT QU_N^0 = QU_N(φ_1^0, φ_2^0, ..., φ_q^0). The vector (φ_1^0, φ_2^0, ..., φ_q^0) plays the role of an analog key (see Fig. 1), knowledge of which is necessary for entering the TCS with the aim of intercepting the confidential information.
Fig. 1. Key of parameters (φ_1, φ_2, ..., φ_q)
The number of parameters can reach q = 10,000, so searching for the vector key by scanning the 10000-dimensional torus [0, 2π]^10000 with the aim of finding the working parameters (φ_1^0, φ_2^0, ..., φ_q^0) is a very difficult problem for enemy cyber means. But if, nevertheless, this key were found by the enemy in a cyber attack, the system could change the values of the working parameters to repel the attack: it would then transmit the confidential information on new sub-carriers (i.e., in a new orthogonal basis). As a result, the system counteracts enemy radio-electronic attacks. The MPFT U_N(θ) has the form of a product of sparse Jacobi rotation matrices, which also describes a fast algorithm for this transform. The main advantage of using MPFTs in an OFDM TCS is that they yield a very flexible anti-eavesdropping and anti-jamming Intelligent OFDM TCS. To the best of our knowledge, this is the first work that utilizes the MPT theory to facilitate PHY-LS through the parameterization of unitary transforms.
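The parameterization can be illustrated with real Givens (Jacobi) rotations: the sketch below builds an orthogonal N×N transform from q angles. The pairing scheme and the number of angles are illustrative assumptions, not the authors' particular construction.

```python
import numpy as np

# Sketch of a many-parameter transform as a product of sparse Jacobi (Givens)
# rotations: every choice of the angle vector gives an orthogonal transform,
# and different choices give different sub-carrier bases.
def mpt(angles, N):
    U = np.eye(N)
    pairs = [(i, j) for i in range(N) for j in range(i + 1, N)]
    for (i, j), phi in zip(pairs, angles):
        G = np.eye(N)
        G[i, i] = G[j, j] = np.cos(phi)
        G[i, j], G[j, i] = -np.sin(phi), np.sin(phi)
        U = G @ U
    return U

N = 8
q = N * (N - 1) // 2                       # number of free angles here
rng = np.random.default_rng(3)
theta = rng.uniform(0, 2 * np.pi, size=q)  # the "analog key"
U = mpt(theta, N)

assert np.allclose(U @ U.T, np.eye(N))     # orthogonal for any key
assert not np.allclose(mpt(theta + 0.05, N), U)  # another key, another basis
```

Since every rotation is sparse, applying the product factor by factor is the fast algorithm mentioned above.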
Fig. 2. Block diagram of Intelligent OFDM-TCS
We study the Intelligent U_N(θ)-OFDM-TCS to find values of the parameters that optimize the PAPR, BER, SER, and the anti-eavesdropping and anti-jamming effects (see the subsequent parts of our work). For simplicity, we consider a single-input single-output OFDM (SISO-OFDM) setup with N sub-carriers (see Fig. 2). Let
CD_{2^d} = { Z^(b) = Z^(b^0, b^1, ..., b^{d-1}) ∈ C | b = (b^0, b^1, ..., b^{d-1}) ∈ {0, 1}^d }  or  CD_{2^d} = { Q^(b) = Q^(b^0, b^1, ..., b^{d-1}) ∈ H | b = (b^0, b^1, ..., b^{d-1}) ∈ {0, 1}^d }

be constellation diagrams on the complex plane C or on the quaternion algebra H, consisting of 2^d complex points Z^(b) ∈ C or quaternion points Q^(b) ∈ H (stars) numbered by binary d-digit numbers ⟨b| = (b^0, b^1, ..., b^{d-1}) ∈ {0, 1}^d. Here {0, 1}^d is the d-D Boolean cube. We interpret the row-vector ⟨b| = (b^0, b^1, ..., b^{d-1}) as the address of the star Z^(b) (or Q^(b)) in computer memory. Let us introduce the constellation direct and inverse mappings:

CM(b^0, b^1, ..., b^{d-1}) = Z^(b^0, b^1, ..., b^{d-1}) (or Q^(b^0, b^1, ..., b^{d-1})),  CM^{-1}(Z^(b^0, b^1, ..., b^{d-1})) = (b^0, b^1, ..., b^{d-1}) ∈ {0, 1}^d.

The principle of any OFDM system is to split the input 1-bit stream b[m], m = 0, 1, ..., into a d-bit (B_2^d-valued) stream: b[m] = b[nd + r] → b[n] = (b^0[n], ..., b^r[n], ..., b^{d-1}[n]), where b ∈ B_2^d = {0, 1}^d, m = nd + r, r = 0, 1, ..., d − 1, and n = 0, 1, 2, .... Here m is the real discrete time and n is the “time” for the d-bit stream b(n) (i.e., the d-decimation “time” with respect to the real discrete time). The B_2^d-valued sequence b(n) is split into N sub-sequences (sub-streams): b[n] = b[lN + k] → ⟨B[l]| = (b^0[l], ..., b^k[l], ..., b^{N-1}[l]), where n = lN + k, k = 0, 1, ..., N − 1, and l = 0, 1, 2, .... The row-vector ⟨B[l]| = (b^0[l], b^1[l], ..., b^{N-1}[l]) is called the l-th {0, 1}^d-valued time-slot. Here l is the “time” for the time-slot ⟨B[l]| (i.e., the N-decimation “time” with respect to the d-bit stream “time” n and the Nd-decimation “time” with respect to the real discrete time m). The data of the l-th time-slot ⟨B[l]| are first processed by the complex or quaternion constellation mappings b^k[l] → CM(b^k[l]) = Z_k^(b^k[l]) or b^k[l] → CM(b^k[l]) = Q_k^(b^k[l]), k = 0, 1, ..., N − 1, i.e.,

Z^(B[l]) = (CM(b^0[l]), ..., CM(b^{N-1}[l]))^T = (Z_0^(b^0[l]), ..., Z_{N-1}^(b^{N-1}[l]))^T  or  Q^(B[l]) = (Q_0^(b^0[l]), ..., Q_{N-1}^(b^{N-1}[l]))^T.

The complex numbers Z_k^(b^k[l]) or quaternions Q_k^(b^k[l]), k = 0, 1, ..., N − 1, are called data symbols, and S^(B[l]) := Z^(B[l]) or Q^(B[l]). The data symbols are then input into the inverse (complex or quaternion) MPFT U_N^{-1}(θ) block:

s^(B[l]) = (s_0^(B[l]), ..., s_ν^(B[l]), ..., s_{N-1}^(B[l]))^T = U_N^{-1}(θ) S^(B[l]).

The sequences of coefficients s_0^(B[l]), ..., s_ν^(B[l]), ..., s_{N-1}^(B[l]) can be conveniently visualized as discrete composite complex-valued or quaternion-valued signals to be transmitted. They are sums of modulated complex-valued or quaternion-valued sub-carriers subc_k(ν|θ):

s_ν^(B[l])(θ) = Σ_{k=0}^{N-1} S_k^(b^k[l]) subc_k(ν|θ), i.e., s_ν^(B[l])(θ) = Σ_{k=0}^{N-1} Z_k^(b^k[l]) subc_k(ν|θ)

for a complex TCS, and

s_ν^(B[l])(θ) = Σ_{k=0}^{N-1} Q_k^(b^k[l]) subc_k(ν|θ)  or  s_ν^(B[l])(θ) = Σ_{k=0}^{N-1} subc_k(ν|θ) Q_k^(b^k[l])

for a quaternion TCS (noncommutative modulation), where N is the number of sub-carriers. All sub-carriers transmit D = dN data bits. Let the symbol key_k = 0 mean multiplication of the k-th component of the data vector Q^(B[l]) = (Q_0^(B[l]), ..., Q_ν^(B[l]), ..., Q_{N-1}^(B[l]))^T by the matrix element subc_k(ν|θ) of U_N^{-1}(θ) = [subc_k(ν|θ)]_{k,ν=0}^{N-1} from the left (L), and the symbol key_k = 1 mean the multiplication from the right (R). Then the N-D binary vector key = (key_0, key_1, ..., key_{N-1}) is the digital vector key (see Fig. 3) specifying the way in which the multiplication of the MPFT matrix U_N^{-1}(θ) by the data vector Q^(B[l]) is implemented:

s^(B[l])(θ, key) = U_N^{-1} Q^(B[l]) ⇒ { subc_k(ν|θ) · Q_k^(b^k[l]) if key_k = 0;  Q_k^(b^k[l]) · subc_k(ν|θ) if key_k = 1 },

where {subc_k(ν|θ)} are the matrix elements of the quaternion transform, i.e., U_N^{-1}(θ, key) = [subc_k^{key_k}(ν|φ_1, φ_2, ..., φ_p)]_{k,ν=0}^{N-1}. So, the number of such keys is equal
to 2^N; they form the Boolean cube B_2^N. Knowing this digital key is necessary to enter the Intelligent OFDM TCS. The digital data s^(B[l])(θ) are interpolated by a digital-to-analog converter (DAC), s^(B[l])(θ) → s^(B[l])(t|θ), t ∈ [0, T), generating the analog signal s^(B[l])(t|θ). This signal is then AM-modulated, (1 + m·s^(B[l])(t|θ)) e^{j2πf_0 t}, to the carrier frequency f_0 and radiated into the wireless medium, the so-called radio channel (RF), before it is picked up at the receiver side. Here m is the AM-modulation index. At the receiver side, after AM demodulation and discretization by an analog-to-digital converter (ADC), from the received signal r(t|θ) we obtain the received symbols
Intelligent OFDM Telecommunication Systems
135
$\mathbf{r}^{(B[l])}(\theta)$. They are the transmitted symbols $\mathbf{s}^{(B[l])}(\theta)$ plus additive Gaussian noise samples:

$$\mathbf{r}^{(B[l])}(\theta)=\left(r_0^{(B[l])}(\theta),\ldots,r_v^{(B[l])}(\theta),\ldots,r_{N-1}^{(B[l])}(\theta)\right)=\mathbf{s}^{(B[l])}(\theta)+\langle\mathbf{n}|=\left(s_0^{(B[l])}(\theta)+n_0(l),\ldots,s_v^{(B[l])}(\theta)+n_v(l),\ldots,s_{N-1}^{(B[l])}(\theta)+n_{N-1}(l)\right).$$
Fig. 3. N-D binary vector key $\mathbf{key}=(\mathrm{key}_0,\mathrm{key}_1,\ldots,\mathrm{key}_{N-1})$.
At the receiver side the process is reversed to obtain the data. The signal $\mathbf{r}^{(B[l])}=\left(r_0^{(B[l])},\ldots,r_v^{(B[l])},\ldots,r_{N-1}^{(B[l])}\right)$ is demodulated by the direct MPFT. The output of the DMPFT is represented as follows:

$$\langle\mathbf{R}^{(b_k[l])}(\theta)|=\langle\mathbf{r}^{(B[l])}(\theta)|\,\mathcal{U}_N(\theta)=\begin{cases}\langle\mathbf{Z}^{(B[l])}+\mathbf{N}(l|\theta)|,\\[2pt]\langle\mathbf{Q}^{(B[l])}+\mathbf{N}(l|\theta)|.\end{cases}$$

After that the maximum-likelihood algorithm (MLA) gives the optimal estimation of the signal $\mathbf{Z}^{(B[l])}$ or $\mathbf{Q}^{(B[l])}$:

$$\mathrm{MLA}\!\left[\langle\mathbf{Z}^{(B[l])}+\mathbf{N}(l|\theta)|\right]=\left(\mathrm{MLA}\!\left[\hat Z_0^{(b_0[l])}+N_0(l|\theta)\right],\ldots,\mathrm{MLA}\!\left[\hat Z_{N-1}^{(b_{N-1}[l])}+N_{N-1}(l|\theta)\right]\right)$$
$$=\left(\min_{Z\in 2^d\text{-}\mathrm{CD}}\rho\!\left\{\hat Z_0^{(b_0[l])}+N_0(l|\theta),\,Z\right\},\ldots,\min_{Z\in 2^d\text{-}\mathrm{CD}}\rho\!\left\{\hat Z_{N-1}^{(b_{N-1}[l])}+N_{N-1}(l|\theta),\,Z\right\}\right)=\hat{\mathbf{Z}}_{\mathrm{opt}}^{(B[l])}=\left(\hat Z_0^{(b_0[l])},\ldots,\hat Z_k^{(b_k[l])},\ldots,\hat Z_{N-1}^{(b_{N-1}[l])}\right),$$

and analogously for the quaternion case,

$$\mathrm{MLA}\!\left[\langle\mathbf{Q}^{(B[l])}+\mathbf{N}(l|\theta)|\right]=\hat{\mathbf{Q}}_{\mathrm{opt}}^{(B[l])}=\left(\hat Q_0^{(b_0[l])},\ldots,\hat Q_k^{(b_k[l])},\ldots,\hat Q_{N-1}^{(b_{N-1}[l])}\right),$$
136
V. G. Labunets and E. Ostheimer
where $\rho$ is the Euclidean distance on $\mathbb{C}$ or $\mathbb{H}$ and the symbol "$\hat{\ }$" denotes an estimated value. Finally, the estimation of the bit stream is given as

$$\hat{\mathbf{B}}[l|\theta]=\left(\hat b^0[l|\theta],\ldots,\hat b^{N-1}[l|\theta]\right)=\mathrm{QCM}^{-1}\!\left\{\hat{\mathbf{Z}}_{\mathrm{opt}}^{(B[l])}(\theta)\right\}=\left(\mathrm{QCM}^{-1}\!\left\{\hat Z_0^{(b_0[l])}(\theta)\right\},\ldots,\mathrm{QCM}^{-1}\!\left\{\hat Z_{N-1}^{(b_{N-1}[l])}(\theta)\right\}\right)$$

for complex TCS, and

$$\hat{\mathbf{B}}[l|\theta]=\left(\hat b^0[l|\theta],\ldots,\hat b^{N-1}[l|\theta]\right)=\mathrm{QCM}^{-1}\!\left\{\hat{\mathbf{Q}}_{\mathrm{opt}}^{(B[l])}(\theta)\right\}=\left(\mathrm{QCM}^{-1}\!\left\{\hat Q_0^{(b_0[l])}(\theta)\right\},\ldots,\mathrm{QCM}^{-1}\!\left\{\hat Q_{N-1}^{(b_{N-1}[l])}(\theta)\right\}\right)$$

for quaternion TCS, where $\hat b_k[l|\theta]\to\hat b[lN+k\,|\,\theta]\to\hat b[(lN+k)d+r\,|\,\theta]=\hat b[m|\theta]$ is an estimation of the initial bit stream. Here $m=(lN+k)d+r=lNd+kd+r$ and $l=0,1,2,\ldots$; $k=0,1,\ldots,N-1$; $r=0,1,\ldots,d-1$. The BER and SER for the $l$th time slot are defined as

$$\mathrm{BER}_{\mathcal{U}}[l|\theta]=\frac{1}{Nd}\sum_{m=0}^{Nd-1}\left(b[m|\theta]\oplus\hat b[m|\theta]\right),\qquad \mathrm{SER}_{\mathcal{U}}[l|\theta]=\frac{1}{N}\sum_{k=0}^{N-1}\left(b_k[l|\theta]\ne\hat b_k[l|\theta]\right).$$
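For additive Gaussian noise, the MLA above reduces to picking the constellation point nearest in Euclidean distance $\rho$, after which BER and SER are counted exactly as defined. The following sketch uses a hypothetical 4-point (QPSK-like) constellation standing in for the $2^d$-point set CD; all names and sizes are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 4-point constellation (QPSK-like), a stand-in for the
# 2^d-point constellation diagram CD; d = 2 bits per symbol here.
const = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

tx_idx = rng.integers(0, 4, size=1000)           # transmitted symbol indices
rx = const[tx_idx] + 0.1 * (rng.normal(size=1000) + 1j * rng.normal(size=1000))

# MLA under Gaussian noise: choose the nearest point in Euclidean distance.
rx_idx = np.abs(rx[:, None] - const[None, :]).argmin(axis=1)

ser = np.mean(rx_idx != tx_idx)                  # symbol error rate
tx_bits = (tx_idx[:, None] >> np.arange(2)) & 1  # index -> 2 data bits
rx_bits = (rx_idx[:, None] >> np.arange(2)) & 1
ber = np.mean(tx_bits != rx_bits)                # bit error rate
```

Because every bit error implies a symbol error and a symbol error causes at most $d$ bit errors, the computed `ber` never exceeds `ser`.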
As we see, in the digital Intelligent-OFDM TCS many-parameter sub-carriers are used to carry the digital data $\left\{S_k^{(b_k[l])}\right\}_{k=0}^{N-1}$. For this reason, all coefficients

$$s_0^{(B[l])}(\theta,\mathbf{key}),\ldots,s_v^{(B[l])}(\theta,\mathbf{key}),\ldots,s_{N-1}^{(B[l])}(\theta,\mathbf{key})$$

depend on the parameters $\theta=(\varphi_1,\ldots,\varphi_q)$ and on the vector $\mathbf{key}=(\mathrm{key}_0,\mathrm{key}_1,\ldots,\mathrm{key}_{N-1})$. This dependence can be used for multiple purposes, such as anti-eavesdropping and anti-jamming, in order to increase the system secrecy. It is interesting to minimize the peak-to-average power ratio $\mathrm{PAPR}_{\mathcal{U}}[l|\theta]$, the bit error rate $\mathrm{BER}_{\mathcal{U}}[l|\theta]$, the symbol error rate $\mathrm{SER}_{\mathcal{U}}[l|\theta]$, and the inter-symbol interference $\mathrm{ISI}_{\mathcal{U}}[l|\theta]$ by changing the parameters $\theta$.
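Since minimizing PAPR over the parameters is one of the stated goals, it helps to see how PAPR is measured for a single OFDM symbol. A minimal sketch for the classical Fourier sub-carriers, i.e. the particular parameter choice that the MPFT generalizes (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 64  # number of sub-carriers

def papr_db(s):
    """Peak-to-average power ratio of a discrete OFDM symbol, in dB."""
    p = np.abs(s) ** 2
    return 10 * np.log10(p.max() / p.mean())

# QPSK data on N sub-carriers, modulated by the inverse Fourier transform
# (the classical sub-carrier family; an MPFT would replace np.fft.ifft).
data = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)
s = np.fft.ifft(data) * np.sqrt(N)   # unitary-normalized inverse transform

papr = papr_db(s)
```

The PAPR of any length-$N$ symbol lies between 0 dB and $10\log_{10}N$ dB; parameter search over $\theta$ aims to push typical values toward the lower end.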
3 Fast Many-Parameter Complex Fourier Transforms

The fast Fourier transform is the following iteration procedure (see Fig. 4):

$$\mathcal{F}=\frac{1}{\sqrt{2^n}}\prod_{r=1}^{n}\left[I_{2^{r-1}}\otimes\left(I_{2^{n-r}}\oplus D_{2^{n-r}}\!\left(\varepsilon^{2^{r-1}}\right)\right)\right]\left[I_{2^{r-1}}\otimes F_2\otimes I_{2^{n-r}}\right],\tag{1}$$

where $F_2=\begin{bmatrix}1&1\\1&-1\end{bmatrix}$ and $D_{2^{n-r}}\!\left(\varepsilon^{2^{r-1}}\right)=\mathrm{Diag}_{2^{n-r}}\!\left(1,\varepsilon^{2^{r-1}\cdot 1},\varepsilon^{2^{r-1}\cdot 2},\ldots,\varepsilon^{2^{r-1}\left(2^{n-r}-1\right)}\right)$. Its iteration steps are indexed by the integer $r\in\{1,2,\ldots,n\}$. For each iteration $r$ we introduce a digital representation for $p\in\{0,1,\ldots,2^{n-1}-1\}$: $p=p(k_r,s_r)=2^{r-1}k_r+s_r,$
where $k_r\in\{0,1,\ldots,2^{r-1}-1\}$ and $s_r\in\{0,1,\ldots,2^{n-r}-1\}$. Let $q(k_r,s_r)=p(k_r,s_r)+2^{n-1}$. Obviously, $q\in\{2^{n-1},2^{n-1}+1,\ldots,2^n-1\}$.
Fig. 4. Fast Fourier transform for $N=8$.
For a fixed integer $r\in\{1,2,\ldots,n\}$ and $p_r,q_r\in\{0,1,\ldots,2^{n-r}-1\}$, let

$$D_{2^{n-r+1}}^{(p_r,q_r)}\!\left(\varepsilon^{a(p_r)}\,\middle|\,\varepsilon^{b(q_r)}\right)=\mathrm{Diag}_{2^{n-r+1}}\Big(\underbrace{1,\ldots,1}_{p_r},\varepsilon^{a(p_r)},1,\ldots,1\ \Big|\ \underbrace{1,\ldots,1}_{p_r},\varepsilon^{b(q_r)},1,\ldots,1\Big),$$

where $a=a(p_r)$ and $b=b(q_r)$ are integers depending on the positions $p_r$ and $q_r=p_r+2^{n-r}$, respectively. Here $\varepsilon=\exp(2\pi j/2^n)$. For $I_{2^{n-r}}\oplus D_{2^{n-r}}\!\left(\varepsilon^{2^{r-1}}\right)$ from (1) we have

$$I_{2^{n-r}}\oplus D_{2^{n-r}}\!\left(\varepsilon^{p_r 2^{r-1}}\right)=\mathrm{Diag}_{2^{n-r}}(1,1,\ldots,1)\oplus\mathrm{Diag}_{2^{n-r}}\!\left(1,\varepsilon^{1\cdot 2^{r-1}},\varepsilon^{2\cdot 2^{r-1}},\ldots,\varepsilon^{(2^{n-r}-1)2^{r-1}}\right)=\prod_{p_r=0}^{2^{n-r}-1}D_{2^{n-r+1}}^{(p_r,\,p_r+2^{n-r})}\!\left(1_{p_r}\,\middle|\,\varepsilon^{p_r 2^{r-1}}_{p_r+2^{n-r}}\right).$$
Now we are going to use in the fast Fourier transform the following radix-$(2^{r-1},2^{n-r})$ representation of $p,q\in\{0,1,\ldots,2^{n-1}-1\}$: $p=p(k_r,s_r)=2^{r-1}k_r+s_r$, $q(k_r,s_r)=p(k_r,s_r)+2^{n-r}$, where $k_r\in\{0,1,\ldots,2^{r-1}-1\}$, $s_r\in\{0,1,\ldots,2^{n-r}-1\}$ and $r\in\{1,2,\ldots,n\}$. We can write the diagonal matrices of the FFT (for all $r\in\{1,2,\ldots,n\}$) as

$$I_{2^{r-1}}\otimes\left[I_{2^{n-r}}\oplus D_{2^{n-r}}\!\left(\varepsilon^{p_r 2^{r-1}}\right)\right]=\prod_{s_r=0}^{2^{n-r}-1}\prod_{k_r=0}^{2^{r-1}-1}D_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(1_{p(k_r,s_r)}\,\middle|\,\varepsilon^{s_r 2^{r-1}}_{q(k_r,s_r)}\right).\tag{2}$$
Then the fast DFT (1) takes the following form:

$$\mathcal{F}=\prod_{r=1}^{n}\left[\prod_{k_r=0}^{2^{r-1}-1}\prod_{s_r=0}^{2^{n-r}-1}D_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(1_{p(k_r,s_r)}\,\middle|\,\varepsilon^{s_r 2^{r-1}}_{q(k_r,s_r)}\right)\right]\left[\prod_{k_r=0}^{2^{r-1}-1}\prod_{s_r=0}^{2^{n-r}-1}J_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}(\pi/4)\right]$$
$$=\prod_{r=1}^{n}\prod_{s_r=0}^{2^{n-r}-1}\prod_{k_r=0}^{2^{r-1}-1}\left[D_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(1_{p(k_r,s_r)}\,\middle|\,\varepsilon^{s_r 2^{r-1}}_{q(k_r,s_r)}\right)J_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}(\pi/4)\right],\tag{3}$$

where

$$J_N^{(p,q)}(\varphi_{p,q})=\begin{bmatrix}1&\cdots&0&\cdots&0&\cdots&0\\ \vdots&&\vdots&&\vdots&&\vdots\\ 0&\cdots&c_{p,q}&\cdots&s_{p,q}&\cdots&0\\ \vdots&&\vdots&&\vdots&&\vdots\\ 0&\cdots&s_{p,q}&\cdots&-c_{p,q}&\cdots&0\\ \vdots&&\vdots&&\vdots&&\vdots\\ 0&\cdots&0&\cdots&0&\cdots&1\end{bmatrix}$$

is the Jacobi rotation acting on rows and columns $p$ and $q$, with $c_{p,q}=\cos(\varphi_{p,q})$ and $s_{p,q}=\sin(\varphi_{p,q})$.
Fig. 5. Fast complex-valued many-parameter Fourier transforms for $N=8$, where we have $n\,2^{n-1}=12$ $\varphi$-parameters $\varphi^1=(\varphi^1_0,\varphi^1_1,\varphi^1_2,\varphi^1_3)$, $\varphi^2=(\varphi^2_0,\varphi^2_1,\varphi^2_2,\varphi^2_3)$, $\varphi^3=(\varphi^3_0,\varphi^3_1,\varphi^3_2,\varphi^3_3)$, and $n\,2^n=24$ $\gamma$-parameters $\gamma^1=(\gamma^1_0,\gamma^1_1,\ldots,\gamma^1_7)$, $\gamma^2=(\gamma^2_0,\gamma^2_1,\ldots,\gamma^2_7)$, $\gamma^3=(\gamma^3_0,\gamma^3_1,\ldots,\gamma^3_7)$.
Our generalization of (3) is based on Jacobi matrices $J_{2^n}^{(p,q)}\!\left(\varphi^r_{(p,q)}\right)$ instead of $J_{2^n}^{(p,q)}(\pi/4)$ and on arbitrary phasors:

$$D_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(1_{p(k_r,s_r)}\,\middle|\,\varepsilon^{s_r 2^{r-1}}_{q(k_r,s_r)}\right)\longrightarrow D_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(e^{j\gamma^r_{p(k_r,s_r)}}\,\middle|\,e^{j\gamma^r_{q(k_r,s_r)}}\right):$$

$$\mathcal{F}\!\left(\varphi^1,\varphi^2,\ldots,\varphi^n;\gamma^1,\gamma^2,\ldots,\gamma^n\right)=\prod_{r=1}^{n}\prod_{s_r=0}^{2^{n-r}-1}\prod_{k_r=0}^{2^{r-1}-1}\left[D_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(e^{j\gamma^r_{p(k_r,s_r)}}\,\middle|\,e^{j\gamma^r_{q(k_r,s_r)}}\right)J_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(\varphi^r_{(p(k_r,s_r),\,q(k_r,s_r))}\right)\right].\tag{4}$$

It is a $3n\,2^{n-1}$-parameter complex-valued Fourier-like transform (see Fig. 5) with $n\,2^{n-1}$ $\varphi$-parameters $\varphi^1=\left(\varphi^1_0,\varphi^1_1,\ldots,\varphi^1_{2^{n-1}-1}\right),\ldots,\varphi^n=\left(\varphi^n_0,\varphi^n_1,\ldots,\varphi^n_{2^{n-1}-1}\right)$ and $n\,2^n$ $\gamma$-parameters $\gamma^1=\left(\gamma^1_0,\gamma^1_1,\ldots,\gamma^1_{2^n-1}\right),\ldots,\gamma^n=\left(\gamma^n_0,\gamma^n_1,\ldots,\gamma^n_{2^n-1}\right)$.
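The factorized structure of (4) is easy to exercise numerically: each factor is a two-point diagonal phasor times a Jacobi rotation, so any choice of the $3n\,2^{n-1}$ angles yields a unitary transform. The sketch below is a hedged illustration; the standard radix-2 butterfly pairing of the indices $(p,q)$ is assumed, and all helper names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
N = 2 ** n  # transform size, N = 8 as in Fig. 5

def jacobi(N, p, q, phi):
    """Jacobi rotation J_N^{(p,q)}(phi): identity except rows/columns p and q."""
    J = np.eye(N, dtype=complex)
    c, s = np.cos(phi), np.sin(phi)
    J[p, p], J[p, q], J[q, p], J[q, q] = c, s, s, -c
    return J

def phasor_pair(N, p, q, gp, gq):
    """Two-point diagonal phasor matrix D_N^{(p,q)}(e^{j gp} | e^{j gq})."""
    D = np.eye(N, dtype=complex)
    D[p, p], D[q, q] = np.exp(1j * gp), np.exp(1j * gq)
    return D

F = np.eye(N, dtype=complex)
for r in range(1, n + 1):
    for k in range(2 ** (r - 1)):
        for s in range(2 ** (n - r)):
            p = (k << (n - r + 1)) + s   # assumed radix-2 butterfly pairing
            q = p + 2 ** (n - r)
            gp, gq, phi = rng.uniform(0.0, 2 * np.pi, 3)
            F = F @ phasor_pair(N, p, q, gp, gq) @ jacobi(N, p, q, phi)

# 3 angles per factor and n * 2^{n-1} factors: 3n * 2^{n-1} free parameters,
# and every parameter choice produces a unitary matrix.
assert np.allclose(F @ F.conj().T, np.eye(N))
```

Since every factor is unitary, the product is unitary for any angle assignment; this is exactly the property that makes each fixed parameter vector a valid orthogonal (unitary) modulation transform.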
4 Fast Many-Parameter Quaternion Fourier Transforms

The space of quaternions, denoted by $\mathbb{H}$, was first introduced by W.R. Hamilton in 1843 as an extension of the complex numbers into four dimensions [20]. General information on quaternions may be obtained from [21].

Definition 1. Numbers of the form $^4q=a1+bi+cj+dk$, where $a,b,c,d\in\mathbb{R}$, are called quaternions; here (1) $1$ is the real unit and (2) $i,j,k$ are three imaginary units. The addition and subtraction of two quaternions $^4q_1=a_1+b_1i+c_1j+d_1k$ and $^4q_2=a_2+b_2i+c_2j+d_2k$ are given by

$$^4q_1\pm{}^4q_2=(a_1\pm a_2)+(b_1\pm b_2)i+(c_1\pm c_2)j+(d_1\pm d_2)k.$$

The product of quaternions in the standard format Hamilton defined as follows:

$$^4q_1\cdot{}^4q_2=(a_1+b_1i+c_1j+d_1k)(a_2+b_2i+c_2j+d_2k)=(a_1a_2-b_1b_2-c_1c_2-d_1d_2)+(a_1b_2+b_1a_2+c_1d_2-d_1c_2)i+(a_1c_2+c_1a_2+d_1b_2-b_1d_2)j+(a_1d_2+d_1a_2+b_1c_2-c_1b_2)k,$$

where $i^2=j^2=k^2=-1$, $ij=-ji=k$, $ki=-ik=j$, $jk=-kj=i$. The set of quaternions with the operations of multiplication and addition forms the 4-D algebra $\mathbb{H}=\mathbb{H}(\mathbb{R}\,|\,1,i,j,k):=\mathbb{R}+\mathbb{R}i+\mathbb{R}j+\mathbb{R}k$ over the real field $\mathbb{R}$.
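The Hamilton product above can be checked numerically. A minimal sketch (quaternions stored as 4-vectors; helper names are hypothetical) that also shows the noncommutativity exploited by the left/right modulation keys of Sect. 2:

```python
import numpy as np

def qmul(q1, q2):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 + c1*a2 + d1*b2 - b1*d2,
                     a1*d2 + d1*a2 + b1*c2 - c1*b2])

one = np.array([1.0, 0.0, 0.0, 0.0])
i   = np.array([0.0, 1.0, 0.0, 0.0])
j   = np.array([0.0, 0.0, 1.0, 0.0])
k   = np.array([0.0, 0.0, 0.0, 1.0])

assert np.allclose(qmul(i, i), -one)   # i^2 = -1
assert np.allclose(qmul(i, j), k)      # ij = k
assert np.allclose(qmul(j, i), -k)     # ji = -k: multiplication is noncommutative,
                                       # which is what the left/right keys exploit
```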
The number component $a$ and the direction component $^3r=bi+cj+dk\in\mathbb{R}^3$ are called the scalar and 3-D vector parts of a quaternion, respectively. A non-zero element $^3r=bi+cj+dk$ is called a pure vector quaternion. Since $ij=k$, a quaternion $^4q=a+bi+cj+dk=(a+bi)+(c+di)j=z+wj$ is the sum of two complex numbers $z=a+bi$ and $w=c+di$ with a new imaginary unit $j$.

Definition 2. Let $^4q=a+bi+cj+dk\in\mathbb{H}(\mathbb{R})$ be a quaternion. Then $^4\bar q=\overline{a+bi+cj+dk}=a-bi-cj-dk$ is the conjugate of $^4q$, and $N(^4q)=\|{}^4q\|=\sqrt{a^2+b^2+c^2+d^2}=\sqrt{{}^4q\cdot{}^4\bar q}=\sqrt{{}^4\bar q\cdot{}^4q}$ is the norm of $^4q$.

Definition 3. Pure vector quaternions of unit norm, $N(^3r)=1$, are called unit pure vector quaternions and are denoted $^3\mu$, $N(^3\mu)=1$. They form the 2-D sphere $S^2\subset\mathbb{R}^3$ and are parameterized by the Euler angles: $^3\mu(\beta,\theta)=i\cos\beta+j\sin\beta\cos\theta+k\sin\beta\sin\theta\in S^2$. For each quaternion $^3\mu(\beta,\theta)$ we have $^3\mu^2(\beta,\theta)={}^3\mu(\beta,\theta)\cdot{}^3\mu(\beta,\theta)=-1$. This unit-vector product identity generalizes the complex-variable identity $i^2=-1$. This means that whereas in the ordinary theory of complex numbers there are only two different square roots of negative unity ($+i$ and $-i$), differing only in sign, in quaternion theory there are infinitely many different square roots of negative unity. The exponential function $e^{{}^3\mu(\beta,\theta)\gamma}=\cos\gamma+{}^3\mu(\beta,\theta)\sin\gamma$ is called a quaternion-valued phasor.

We are going to use expression (4) to obtain a many-parameter quaternion Fourier-like transform. It is based on left- and right-side quaternion-valued phasors:
$$\mathcal{QF}\!\left(x\,\middle|\,\{\mathbf{key}^r\}_{r=1}^{n}\right)=\prod_{r=1}^{n}\prod_{s_r=0}^{2^{n-r}-1}\prod_{k_r=0}^{2^{r-1}-1}\left[D_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(\mathrm{key}^r_{p(k_r,s_r)}e^{{}^3\mu^r_{p(k_r,s_r)}(\beta^r,\theta^r)\,\gamma^r_{p(k_r,s_r)}}\,\middle|\,\mathrm{key}^r_{q(k_r,s_r)}e^{{}^3\mu^r_{q(k_r,s_r)}(\beta^r,\theta^r)\,\gamma^r_{q(k_r,s_r)}}\right)J_{2^n}^{(p(k_r,s_r),\,q(k_r,s_r))}\!\left(\varphi^r_{(p(k_r,s_r),\,q(k_r,s_r))}\right)\right],\tag{5}$$

where $x=\left(\varphi^1,\ldots,\varphi^n;\gamma^1,\ldots,\gamma^n;\beta^1,\ldots,\beta^n;\theta^1,\ldots,\theta^n\right)$, and $\mathrm{key}^r_{p(k_r,s_r)}$ and $\mathrm{key}^r_{q(k_r,s_r)}$ are binary keys attached to the phasors $e^{{}^3\mu^r_{p(k_r,s_r)}(\beta^r,\theta^r)\,\gamma^r_{p(k_r,s_r)}}$ and $e^{{}^3\mu^r_{q(k_r,s_r)}(\beta^r,\theta^r)\,\gamma^r_{q(k_r,s_r)}}$. They indicate the left-side or the right-side multiplication, respectively. Here $^3\mu^r_p(\beta^r_p,\theta^r_p)\in\left\{{}^3\mu^r_0(\beta^r_0,\theta^r_0),{}^3\mu^r_1(\beta^r_1,\theta^r_1),\ldots,{}^3\mu^r_{2^n-1}(\beta^r_{2^n-1},\theta^r_{2^n-1})\right\}$ are quaternion-valued imaginary units parameterized by the angles $(\beta^r_p,\theta^r_p)$. The new transform $\mathcal{QF}\!\left(x\,|\,\{\mathbf{key}^r\}_{r=1}^{n}\right)$ is a $7n\,2^{n-1}$-parameter quaternion-valued Fourier-like transform with $n\,2^{n-1}$ $\varphi$-parameters, $n\,2^n$ $\gamma$-, $\beta$- and $\theta$-parameters, and a branch of binary crypto-keys $\{\mathbf{key}^r\}_{r=1}^{n}$.
5 Many-Parameter Complex and Quaternion All-Pass Filters

In this section we introduce special classes of many-parameter all-pass discrete cyclic filters. The output/input relation of a discrete cyclic filter is described by the discrete cyclic convolution:

$$y(n)=\mathrm{Filt}_{\mathcal{F}}\{x(n)\}=\sum_{m=0}^{N-1}h(n\ominus m)\,x(m)=(h\circledast x)(n)=\mathcal{F}^{\dagger}\,\mathrm{Diag}\!\left\{|H(k)|e^{i\varphi(k)}\right\}\mathcal{F}\{x(n)\},$$

where $x(n),y(n)$ are the input and output signals, respectively, $h(n)$ is the impulse response, $H(k)=|H(k)|e^{i\varphi(k)}=\mathcal{F}\{h(n)\}$ is the frequency response, $\ominus$ is difference modulo $N$, $\circledast$ is the symbol of cyclic convolution, and $\mathrm{Filt}_{\mathcal{F}}=[h(n\ominus m)]_{n,m=0}^{N-1}$ is the cyclic $(N\times N)$ matrix with kernel $h(n)$. We will concentrate our analysis on all-pass filters, whose frequency response can be expressed in the form $H(k)=|H(k)|e^{i\varphi(k)}$, where the frequency-response magnitude is constant for all frequencies, for example $|H(k)|\equiv 1$, $k=0,1,2,\ldots,N-1$. So an all-pass filter $\mathrm{Filt}_{\mathcal{F}}$ has the complex-valued impulse response $|h(n)\rangle=\mathcal{F}\{e^{i\varphi(k)}\}$ and frequency response $|H(k)\rangle=e^{i\varphi(k)}$. Hence, $y(n)=\mathrm{Filt}_{\mathcal{F}}\{x(n)\}=\mathcal{F}^{\dagger}\,\mathrm{Diag}\{e^{i\varphi(k)}\}\,\mathcal{F}\{x(n)\}$. We are going to consider this filter as a parametric filter

$$\mathrm{Filt}_{\mathcal{F}}^{(\varphi)}=\mathrm{Filt}_{\mathcal{F}}^{(\varphi_0,\varphi_1,\ldots,\varphi_{N-1})}=\mathcal{F}^{\dagger}\,\mathrm{Diag}\{e^{i\varphi(k)}\}\,\mathcal{F}=\mathcal{F}^{\dagger}\,\mathrm{Diag}\!\left(e^{i\varphi_0},e^{i\varphi_1},\ldots,e^{i\varphi_{N-1}}\right)\mathcal{F}\tag{6}$$

with $N$ free parameters $\varphi=(\varphi_0,\varphi_1,\ldots,\varphi_{N-1})$. Obviously, the all-pass filter $\mathrm{Filt}_{\mathcal{F}}^{(\varphi)}$ (as a linear transform) is a many-parameter unitary cyclic $(N\times N)$ matrix. Our first natural generalization of (6) is based on an arbitrary unitary transform $\mathcal{U}$ instead of the Fourier transform $\mathcal{F}$:

$$\mathrm{Filt}_{\mathcal{U}}^{(\varphi)}=\mathrm{Filt}_{\mathcal{U}}^{(\varphi_0,\varphi_1,\ldots,\varphi_{N-1})}=\mathcal{U}^{\dagger}\,\mathrm{Diag}\{e^{i\varphi(k)}\}\,\mathcal{U}=\mathcal{U}^{\dagger}\,\mathrm{Diag}\!\left(e^{i\varphi_0},e^{i\varphi_1},\ldots,e^{i\varphi_{N-1}}\right)\mathcal{U}.\tag{7}$$
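The parametric all-pass filter (6) is easy to exercise numerically. A minimal sketch using NumPy's FFT as the transform $\mathcal{F}$ (the phase vector is arbitrary and the function name is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 16
phi = rng.uniform(0.0, 2 * np.pi, N)   # N free parameters phi_0 ... phi_{N-1}

def allpass(x, phi):
    """Filt^{(phi)} x = F^dagger Diag(e^{i phi(k)}) F x: cyclic all-pass filter."""
    X = np.fft.fft(x)                         # direct transform F
    return np.fft.ifft(np.exp(1j * phi) * X)  # phase-only shaping, then F^dagger

x = rng.normal(size=N)
y = allpass(x, phi)

# |H(k)| = 1 for all k, so the filter is unitary and preserves signal energy.
assert np.isclose(np.linalg.norm(y), np.linalg.norm(x))
```

Setting all phases to zero reduces the filter to the identity, which is a convenient sanity check on the implementation.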
The second generalization is based on quaternion-valued exponents (phasors) $^C\mathrm{key}_p\,e^{{}^3\mu(\gamma^C_p,\theta^C_p)\,\alpha^C_p}$ $(p=0,1,\ldots,N-1)$ and two quaternion Fourier transforms:

$$\mathrm{QFilt}_{\mathcal{QF}}\!\left(\alpha^C,\gamma^C,\theta^C,{}^Lx,{}^Rx\,\middle|\,\mathbf{key}^C,\{{}^L\mathbf{key}^r\}_{r=1}^{n},\{{}^R\mathbf{key}^r\}_{r=1}^{n}\right)=\mathcal{QF}^{\dagger}\!\left({}^Lx\,\middle|\,\{{}^L\mathbf{key}^r\}_{r=1}^{n}\right)\mathrm{Diag}\!\left({}^C\mathrm{key}_0\,e^{{}^3\mu(\gamma^C_0,\theta^C_0)\,\alpha^C_0},\ldots,{}^C\mathrm{key}_{N-1}\,e^{{}^3\mu(\gamma^C_{N-1},\theta^C_{N-1})\,\alpha^C_{N-1}}\right)\mathcal{QF}\!\left({}^Rx\,\middle|\,\{{}^R\mathbf{key}^r\}_{r=1}^{n}\right).\tag{8}$$
Here $\mathbf{key}^C=({}^C\mathrm{key}_0,{}^C\mathrm{key}_1,\ldots,{}^C\mathrm{key}_{N-1})$, $\alpha^C=(\alpha^C_0,\alpha^C_1,\ldots,\alpha^C_{N-1})$, $\gamma^C=(\gamma^C_0,\gamma^C_1,\ldots,\gamma^C_{N-1})$, $\theta^C=(\theta^C_0,\theta^C_1,\ldots,\theta^C_{N-1})$, and $\left({}^Lx,\{{}^L\mathbf{key}^r\}_{r=1}^{n}\right)$ and $\left({}^Rx,\{{}^R\mathbf{key}^r\}_{r=1}^{n}\right)$ are the parameters of the left and right quaternion Fourier transforms $\mathcal{QF}^{\dagger}\!\left({}^Lx\,|\,\{{}^L\mathbf{key}^r\}_{r=1}^{n}\right)$ and $\mathcal{QF}\!\left({}^Rx\,|\,\{{}^R\mathbf{key}^r\}_{r=1}^{n}\right)$, respectively.

The quaternion cyclic transform $\mathrm{QFilt}_{\mathcal{QF}}\!\left(\alpha^C,\gamma^C,\theta^C,{}^Lx,{}^Rx\,|\,\mathbf{key}^C,\{{}^L\mathbf{key}^r\}_{r=1}^{n},\{{}^R\mathbf{key}^r\}_{r=1}^{n}\right)$ has $7n\,2^{n-1}$ left parameters $^Lx$, $7n\,2^{n-1}$ right parameters $^Rx$, and $3\cdot 2^n$ center parameters $(\alpha^C,\gamma^C,\theta^C)$. The total number of parameters is $(14n+6)\,2^{n-1}$. Moreover, $\mathrm{QFilt}_{\mathcal{QF}}$ has three branches of binary crypto-keys: $\mathbf{key}^C$, $\{{}^L\mathbf{key}^r\}_{r=1}^{n}$, $\{{}^R\mathbf{key}^r\}_{r=1}^{n}$.
6 Conclusions

In this paper, we proposed novel Intelligent OFDM telecommunication systems based on a new unified approach to the many-parameter representation of complex and quaternion Fourier transforms. The new systems use the inverse MPFT for modulation at the transmitter and the direct MPFT for demodulation at the receiver. Each MPFT depends on a finite set of independent parameters (angles), which can be changed independently of one another. For each fixed set of parameter values we get a unique orthogonal transform. When the parameters are changed, the multi-parameter transform changes too, taking the form of a family of known (and unknown) orthogonal (or unitary) transforms. The main advantage of using MPFTs in OFDM TCS is flexibility: they allow constructing Intelligent OFDM TCS for electronic warfare (EW). EW is a type of armed struggle using electronic means against an enemy to "change the quality of information". EW consists of a suppressor and a protector. The suppressor aims to "reduce the effectiveness" of enemy forces, including command and control and their use of weapons systems, and targets enemy communications and reconnaissance by changing the "quality and speed" of information processes. In turn, EW in defense protects such assets and those of friendly forces. In order to protect corporate privacy and sensitive client information against the threat of electronic eavesdropping and jamming, the protector uses Intelligent OFDM TCS based on MPFTs. The system model used in this work is known as the wiretap channel model, introduced in 1975 by Wyner [4]. This model is composed of two legitimate users, named Alice and Bob: a legitimate user (Alice) transmits her confidential messages to a legitimate receiver (Bob), while Eve tries to eavesdrop on Alice's information and an active jammer, named Jamie, attempts to jam this information.

Alice transmits her data using OFDM with $N$ complex- or quaternion-valued sub-carriers $\left\{\mathrm{qsubc}_k\!\left(n\,|\,\varphi^0_1,\ldots,\varphi^0_q\right)\right\}_{k=0}^{N-1}$, i.e., she uses the unitary transform $\mathcal{QF}_N=\mathcal{QF}_N(\theta^0)$ with fixed parameters $\theta^0=(\varphi^0_1,\ldots,\varphi^0_q)$. When the sub-carriers $\left\{\mathrm{qsubc}_k\!\left(n\,|\,\varphi^0_1,\ldots,\varphi^0_q\right)\right\}_{k=0}^{N-1}$ (i.e., the unitary transform $\mathcal{QF}_N(\theta^0)$) of Alice's and Bob's Intelligent-OFDM TCS are identified by Eve (or Jamie), this TCS can be eavesdropped (or jammed) by means of a Radio-Electronic Eavesdropping Attack (REEA). As an anti-eavesdropping
and anti-jamming measure, Alice and Bob can use the following strategy: they can select new sub-carriers by changing the parameters of $\mathcal{QF}_N(\theta)$ in a periodic (or pseudo-random) manner: $\mathcal{QF}_N(\theta^0)\to\mathcal{QF}_N(\theta^1)\to\ldots\to\mathcal{QF}_N(\theta^r)\to\ldots$, $r=0,1,2,\ldots$, where $\theta^r=\theta^0+r\,\Delta\theta$ and $\theta^0$ are the initial values of the parameters at the initial time $t_0$.

Acknowledgements. The reported study was funded by RFBR, project number 19-29-09022-мк, and by the Ural State Forest Engineering's Center of Excellence in "Quantum and Classical Information Technologies for Remote Sensing Systems".
References

1. Liang, X., Zhang, K., Shen, X., Lin, X.: Security and privacy in mobile social networks: challenges and solutions. IEEE Wirel. Commun. 21(1), 33–41 (2014)
2. Jorswieck, E., Tomasin, S., Sezgin, A.: Broadcasting into the uncertainty: authentication and confidentiality by physical-layer processing. Proc. IEEE 103(10), 1702–1724 (2015)
3. Zhang, N., Lu, N., Cheng, N., Mark, J.W., Shen, X.S.: Cooperative spectrum access towards secure information transfer for CRNs. IEEE J. Sel. Areas Commun. 31(11), 2453–2464 (2013)
4. Wyner, A.D.: The wiretap channel. Bell Syst. Tech. J. 54(8), 1355–1387 (1975)
5. Renna, F., Laurenti, N., Poor, H.V.: Physical-layer secrecy for OFDM transmissions over fading channels. IEEE Trans. Inf. Forensics Secur. 7(4), 1354–1367 (2012)
6. Chorti, A., Poor, H.V.: Faster than Nyquist interference assisted secret communication for OFDM systems. In: Proceedings of the IEEE Asilomar Conference on Signals, Systems and Computers, pp. 183–187 (2011)
7. Wang, X.: Power and subcarrier allocation for physical-layer security in OFDMA-based broadband wireless networks. IEEE Trans. Inf. Forensics Secur. 6(3), 693–702 (2011)
8. Wang, H.M., Yin, Q., Xia, X.G.: Distributed beamforming for physical-layer security of two-way relay networks. IEEE Trans. Signal Process. 60(7), 3532–3545 (2012)
9. Manhas, P., Soni, M.K.: Comparison of OFDM system in terms of BER using different transform and channel coding. Int. J. Eng. Manuf. 1, 28–34 (2016)
10. Patchala, S., Sailaja, M.: Analysis of filter bank multi-carrier system for 5G communications. Int. J. Wirel. Microw. Technol. (IJWMT) 9(4), 39–50 (2019)
11. Gupta, M.K., Tiwari, S.: Performance evaluation of conventional and wavelet based OFDM system. Int. J. Electron. Commun. 67(4), 348–354 (2013)
12. Kaur, H., Kumar, M., Sharma, A.K., Singh, H.P.: Performance analysis of different wavelet families over fading environments for mobile WiMax system. Int. J. Future Gener. Commun. Netw. 8, 87–98 (2015)
13. Halford, K., Halford, S., Webster, M., Andren, C.: Complementary code keying for rake-based indoor wireless communication. In: Proceedings of IEEE International Symposium on Circuits and Systems, pp. 427–430 (1999)
14. Golay, M.J.E.: Complementary series. IEEE Trans. Inform. Theory 7, 82–87 (1961)
15. Davis, J.A., Jedwab, J.: Peak-to-mean power control in OFDM, Golay complementary sequences, and Reed-Muller codes. IEEE Trans. Inform. Theory 45, 2397–2417 (1999)
16. Michailow, N., Mendes, L., Matthe, M., Festag, A., Fettweis, G.: Robust WHT-GFDM for the next generation of wireless networks. IEEE Commun. Lett. 19, 106–109 (2015)
17. Xiao, J., Yu, J., Li, X., Tang, Q., Chen, H., Li, F., Cao, Z., Chen, L.: Hadamard transform combined with companding transform technique for PAPR reduction in an optical direct-detection OFDM system. IEEE J. Opt. Commun. Netw. 4(10), 709–714 (2012)
18. Wilkinson, T.A., Jones, A.E.: Minimization of the peak to mean envelope power ratio of multicarrier transmission schemes by block coding. In: Proceedings of the IEEE 45th Vehicular Technology Conference, pp. 825–829 (1995)
19. Wilkinson, T.A., Jones, A.E.: Combined coding for error control and increased robustness to system nonlinearities in OFDM. In: Proceedings of the IEEE 46th Vehicular Technology Conference, pp. 904–908 (1996)
20. Hamilton, W.R.: Elements of Quaternions, p. 242. Chelsea Publishing, New York (1969)
21. Ward, J.P.: Quaternions and Cayley Numbers: Algebra and Applications, p. 218. Kluwer Academic Publishers, Dordrecht (1997)
Vibration Monitoring Systems for Power Equipment as an Analogue of an Artificial Neural Network

Oleg B. Skvorcov(1,2) and Elena A. Pravotorova(1)

(1) Mechanical Engineering Research Institute of the Russian Academy of Sciences (IMASH RAN), 4, M. Kharitonyevskiy Pereulok, 101990 Moscow, Russian Federation, [email protected]
(2) Scientific and Technical Center "Zavod Balansirovochnykh mashin" Limited Liability Company, 46, Varshavskoye shosse, 115230 Moscow, Russian Federation
Abstract. This article discusses the analogies between modern systems for vibration monitoring of power equipment and neural networks. Vibration monitoring systems solve the problems of diagnosing faults and emergency protection of equipment. Modern equipment condition monitoring systems have a distributed structure. This allows us to consider them as a neural network. Such an analogy makes it possible to use the principles of training networks to make decisions on large volumes of data that cannot be analyzed by analytical or expert algorithms.

Keywords: Vibration · Condition monitoring · Power turbine generator · Artificial neural networks · Fuzzy logic · Ergodic processes · Emergency protection system
1 Introduction

Artificial neural networks have been successfully used for several decades in the diagnosis of defects of various rotor nodes [1]. In power equipment, such solutions have found application in diagnosing malfunctions of induction motors [2]. For monitoring the condition of complex and powerful energy equipment, the application of these methods has not yet been reflected in the regulatory documentation: API Standard 670, ISO 13381, ISO/TR 19201. When constructing such monitoring systems, more conservative equipment protection methods are used. Protection of energy rotary machines is based on determining vibration intensity levels in three orthogonal directions (vertical, transverse and axial) and comparing these levels with predetermined threshold values. If the threshold level of emergency vibration is exceeded, the monitoring system generates an emergency stop signal for the operator or for automatic shutdown of the equipment. To increase the reliability of measurements when determining vibration intensity levels, statistical processing of current estimates is used.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 145–153, 2020. https://doi.org/10.1007/978-3-030-39216-1_14
146
O. B. Skvorcov and E. A. Pravotorova
To increase the reliability of operation of the emergency protection system, a number of techniques are used: operation delay, logic of confirmation of measurement results on different channels, and redundancy. Despite the widespread use of these protection methods, emergencies at large power units have been repeatedly observed [3–6], and false alarms of the emergency protection system also occur. False trips of powerful equipment are extremely undesirable, since such shutdowns often lead to the development of emergency situations or entail increased financial costs. The main problem in ensuring the reliability of the emergency protection system is associated with the complexity of the processes in the equipment, whereas protection operation is based on the use of simplified models to describe them. Such models do not take into account the individual characteristics of each of these units. At the same time, a large amount of data collected by the continuous state monitoring system is not fully used. The use of neural network processing allows more efficient use of the data collected by the monitoring system. Some new features of the implementation of vibration monitoring systems are considered below, such as: analogy to artificial neural networks, features of preliminary multi-channel processing of vibration signals from sensors, vector and matrix data transformations in a multi-channel monitoring system, and adaptation of threshold levels depending on wear over time and operating conditions.
2 Distributed State Monitoring System as the Basic Structure of Neural Network Data Processing

Modern systems for monitoring complex equipment consist of primary data collection and processing modules distributed over the unit, which often have additional cross-links to ensure reliability and mutual duplication. The primary data collection and processing modules are implemented on modern microcontrollers, which not only provide the collection of analog information from the sensors but also provide self-diagnostics of the measuring channels, data verification, and decision making on its reliability. Such modules, in connection with the spread of the technology of the Industrial Internet of Things (IIoT), contain a wide range of input and output interfaces in the structure of the chip and are made in a "system on chip" (SoC) structure. An example of such a microcontroller is the TM4C1294 microcircuit. An example of the structure of the vibration monitoring system of a multi-support rotor unit is shown in Fig. 1. It is known that the operation of a simple model (the McCulloch-Pitts model) of a neuron is described by the expression

$$y_j=f\!\left(\sum_{i=1}^{N}w_{i,j}\,x_i\right),\tag{1}$$

where $x_i$ is the input impact on the $i$-th input, $y_j$ is the output signal of the $j$-th neuron, and $w_{i,j}$ are the weighted values of the inputs that are configured when training the neural network. Various hardware and software implementations of such an adaptive model used to generate the output signal are possible [2, 7]. The primary data acquisition and processing modules also implement functions for evaluating input signals and converting such estimates into output logical messages, which are determined by comparing the total effects with predetermined threshold values.
Fig. 1. Power unit monitoring system. Multicomponent vibration sensors - 1, primary data acquisition and processing modules - 2, asynchronous data stream concentrators - 3, signal bus for automatic protection of the unit when exceeding the permissible vibration - 4, bus for an Internet data exchange channel with a personal (industrial) computer or programmable logic controller - 5.
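A sketch of model (1) in the protection context: a step activation compares the weighted sum of channel levels with a threshold, just as a protection module compares vibration intensity estimates with an alarm level. The weights and threshold below are illustrative assumptions, not standard values:

```python
import numpy as np

def neuron(x, w, threshold):
    """McCulloch-Pitts model: y = f(sum_i w_i * x_i) with a step activation f."""
    return 1 if np.dot(w, x) >= threshold else 0

# Hypothetical protection channel: three orthogonal vibration levels (mm/s)
# weighted equally and compared with a predetermined alarm threshold.
w = np.array([1.0, 1.0, 1.0]) / 3.0
alarm_threshold = 7.1   # illustrative alarm level, not a standard value

assert neuron(np.array([4.2, 3.9, 4.0]), w, alarm_threshold) == 0  # normal operation
assert neuron(np.array([9.5, 8.7, 7.4]), w, alarm_threshold) == 1  # emergency stop
```

Training a network amounts to tuning the weights $w_{i,j}$; in the monitoring analogy this corresponds to adjusting how strongly each measuring channel contributes to the trip decision.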
3 Input Data Generation and Signal Preprocessing in Multichannel Monitoring Systems

The creation of low-cost monitoring systems involves the use of low-cost elements. Modules of primary data collection and processing made using SoC technology provide multi-channel processing, and their cost per channel is relatively low. Distributed physical-quantity sensors made with MEMS technology also have low cost. In most cases such sensors are supplied in the form of integrated circuits and require refinement for use in a monitoring system. There are solutions that can significantly reduce the cost of such refinement without significantly increasing the final cost of the sensor: in these solutions, the printed circuit board on which the MEMS microcircuit is mounted serves at the same time as the sensor housing. As in biological systems, the transmission of information from sensitive elements in modern monitoring systems is carried out over a single wire, which can significantly reduce resources for the implementation of data transmission lines and reduce the space
required for their placement. The single-wire transmission of analog information from a sensitive element in this implementation allows, first of all, a significant reduction in the size of the connecting elements, which practically determine the geometric dimensions and cost of the multi-channel modules for primary data collection and processing. A number of interface solutions have been developed for organizing the transfer of such analog data, including two-wire current-loop interfaces and IEPE (TEDS). Since these interfaces require significant power consumption, it is preferable to use more economical solutions for organizing two-wire interfaces to work with MEMS sensitive elements. Another solution for building economical input data collection is the use of MEMS sensors with single-wire transmission of the output signal by pulses modulated in duration, duty cycle, or frequency. Such a solution is also functionally close to the organization of pulsed neuromorphic data exchange systems [7]. Another analogy with the properties of neural-like input data processing is seen when adaptive structural solutions are used in the primary data collection and processing modules that can carry out signal transmission with integration or respond to fast changes. This structure is the functional equivalent of the solutions used in PID devices. Such adaptation in the process of parallel signal processing also provides a significant expansion of the frequency and dynamic ranges of input signals. The considered features of modern multichannel monitoring systems of complex equipment in terms of dynamic parameters such as vibration or acoustic signals show that their structure is close to the neuromorphic structural solutions used in the construction of artificial neural networks.
4 Matrix Conversion of Multi-channel Monitoring Data

The structural similarity of the solutions used in the construction of equipment status monitoring systems and trained artificial neural networks will probably allow more efficient solutions for processing large amounts of data in stationary systems for continuous monitoring of complex equipment by dynamic parameters. Consider the currently used principles of generating an emergency signal from vibration parameters, based on confirming that the emergency-vibration threshold has been reached by a level increase in other measuring channels (as a rule, adjacent channels). The analysis of real data on vibration levels collected by the monitoring system for a month on a horizontal turbine unit with a capacity of 300 MW at a thermal power plant shows that in some cases such confirmation is not valid. This conclusion follows from the data on the mutual correlation coefficients of the levels of multicomponent vibration on the six rotor bearings, which are presented in Table 1. In some cases these data show that an increase in any of the vibration components can be accompanied not only by very weak changes but also by changes of the opposite sign in the vibration level in other channels. Similar results were previously obtained in the analysis of vibration data by condition monitoring systems of vertical hydraulic units [3, 4].
Table 1. Correlation coefficients of vibration intensity levels.

   V1 H1 A1 V2 H2 A2 V3 H3 A3 V4 H4 A4 V5 H5 A5 V6 H6 A6
V1 1.00 0.28 0.18 0.19 0.10 0.43 0.54 0.39 0.42 −0.15 −0.31 −0.27 0.23 0.28 −0.01 −0.21 0.41 −0.33
H1 0.28 1.00 −0.39 −0.10 0.32 −0.13 −0.35 −0.14 −0.30 −0.04 0.36 0.32 0.08 0.26 0.02 −0.20 0.28 0.11
A1 0.18 −0.39 1.00 −0.13 −0.19 0.01 0.28 0.15 0.13 0.24 −0.15 −0.33 0.13 −0.16 0.06 0.18 −0.22 −0.09
V2 0.19 −0.10 −0.13 1.00 0.16 0.64 0.38 0.29 0.58 −0.56 −0.41 −0.25 −0.02 0.44 −0.32 −0.23 0.60 −0.15
H2 0.10 0.32 −0.19 0.16 1.00 0.28 0.03 0.29 0.16 −0.29 0.19 0.11 −0.09 0.37 −0.21 −0.08 0.49 0.23
A2 0.43 −0.13 0.01 0.64 0.28 1.00 0.57 0.43 0.92 −0.57 −0.55 −0.33 −0.12 0.42 −0.35 −0.24 0.57 −0.31
V3 0.54 −0.35 0.28 0.38 0.03 0.57 1.00 0.69 0.74 −0.31 −0.66 −0.48 0.18 0.09 −0.10 0.00 0.32 −0.36
H3 0.39 −0.14 0.15 0.29 0.29 0.43 0.69 1.00 0.59 −0.33 −0.22 −0.28 −0.08 0.24 −0.32 0.15 0.43 −0.07
A3 0.42 −0.30 0.13 0.58 0.16 0.92 0.74 0.59 1.00 −0.50 −0.66 −0.36 0.01 0.34 −0.32 −0.05 0.49 −0.23
V4 −0.15 −0.04 0.24 −0.56 −0.29 −0.57 −0.31 −0.33 −0.50 1.00 0.47 0.00 0.19 −0.55 0.52 0.24 −0.68 0.04
H4 −0.31 0.36 −0.15 −0.41 0.19 −0.55 −0.66 −0.22 −0.66 0.47 1.00 0.20 −0.26 −0.12 0.21 0.06 −0.18 0.26
A4 −0.27 0.32 −0.33 −0.25 0.11 −0.33 −0.48 −0.28 −0.36 0.00 0.20 1.00 0.13 0.13 −0.20 0.22 −0.03 0.39
V5 0.23 0.08 0.13 −0.02 −0.09 −0.12 0.18 −0.08 0.01 0.19 −0.26 0.13 1.00 −0.01 0.34 0.28 0.02 0.25
H5 0.28 0.26 −0.16 0.44 0.37 0.42 0.09 0.24 0.34 −0.55 −0.12 0.13 −0.01 1.00 −0.62 0.14 0.85 0.37
A5 −0.01 0.02 0.06 −0.32 −0.21 −0.35 −0.10 −0.32 −0.32 0.52 0.21 −0.20 0.34 −0.62 1.00 −0.18 −0.45 −0.22
V6 −0.21 −0.20 0.18 −0.23 −0.08 −0.24 0.00 0.15 −0.05 0.24 0.06 0.22 0.28 0.14 −0.18 1.00 −0.10 0.73
H6 0.41 0.28 −0.22 0.60 0.49 0.57 0.32 0.43 0.49 −0.68 −0.18 −0.03 0.02 0.85 −0.45 −0.10 1.00 0.12
A6 −0.33 0.11 −0.09 −0.15 0.23 −0.31 −0.36 −0.07 −0.23 0.04 0.26 0.39 0.25 0.37 −0.22 0.73 0.12 1.00
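Coefficients such as those in Table 1 can be computed directly from the per-channel level time series; a minimal sketch (the data below are random stand-ins, not the real month-long measurements behind Table 1):

```python
import numpy as np

# Hypothetical data: six bearings x three components (V, H, A) = 18 channels,
# each a time series of vibration intensity levels. Random values stand in
# for real measurements.
rng = np.random.default_rng(0)
levels = rng.normal(size=(18, 1000))   # shape: (channels, samples)

# Pearson cross-correlation matrix of the channel levels, analogous to Table 1
corr = np.corrcoef(levels)
```

The resulting 18x18 matrix is symmetric with a unit diagonal, so each off-diagonal entry plays the role of one cell of Table 1.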
Vibration Monitoring Systems for Power Equipment 149
O. B. Skvorcov and E. A. Pravotorova
Thus, the correct use of the principle of confirming the occurrence of an emergency in multi-channel parallel control requires taking into account a priori information about the individual characteristics of a complex energy unit. In this case such information is contained in the matrix of correlation coefficients shown above, which can be generated by accumulating and processing data while "training" the monitoring system installed on a specific object. It should be noted that the matrix representation of neural network settings follows from the neuron model itself (1) and is widely used in descriptions of learning algorithms; even the names of modules that simulate neural network processing include this notation. The use of matrix operations on input data vectors is also typical of data collection in modern multichannel monitoring systems. Applying such an operation to the input vector makes it possible to increase measurement accuracy by compensating sensor errors. Multiplication by a matrix of coefficients also allows optimizing the choice of direction of the measuring axes and increasing the reliability of the emergency protection system. Optimizing the choice of vibration measurement directions makes it possible, on the one hand, to maximize the reliability of diagnosing incipient defects at an early stage of their formation and, on the other hand, to provide functional redundancy of the measurements of the multicomponent sensor used for equipment emergency protection. Moreover, each of these tasks has its own optimal direction of the axes in space.
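The matrix operations described above can be sketched as follows; every numerical value is a hypothetical illustration of a cross-axis calibration matrix and an axis-rotation matrix, not a real calibration:

```python
import numpy as np

# Sketch of the matrix operations on a triaxial input vector (all values hypothetical).
x = np.array([1.2, 0.4, -0.3])            # raw V, H, A readings of one sensor
C = np.array([[1.02, -0.01, 0.00],        # calibration matrix compensating
              [-0.01, 0.98, 0.02],        # cross-axis sensitivity errors
              [0.00, 0.02, 1.01]])
theta = np.deg2rad(15.0)                  # re-orienting the V-H axes by 15 degrees
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

corrected = C @ x           # error compensation of the input vector
reoriented = R @ corrected  # measurement along optimized axis directions
```

Since the rotation is orthogonal, the total vibration magnitude is preserved while the distribution over the measuring axes changes, which is what the axis-direction optimization exploits.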
5 Adaptation of Response Thresholds Taking into Account Time and Operating Modes

In the standards regulating the vibration of rotary machines, the state of the unit is determined by the level of vibration intensity. For units of a certain power or rotation speed, zones with boundary values of vibration intensity are defined. These zones may be arbitrarily named, for example, A, B, C and D. As the vibration intensity increases and enters a certain zone, the vibrational state is classified from "very good" (A) to "very bad" (D). This is typical of the ISO 10816 group of standards. The technique used is a typical example of estimation using fuzzy logic [8]. The vibration level itself can be estimated as the average or maximum estimate of, respectively, the extreme or average instantaneous values of the measured vibration intensities. In most cases it is recommended to use the maximum RMS value of the vibration velocity. The resulting tables describing the vibrational state at a measurement point in terms of fuzzy logic can be used to describe the state of the entire unit by constructing a logical function over all measuring points. When constructing such a function, fuzzy logic operators should be used. It follows from Table 1 that such a function cannot always be limited to simple confirmations in the form of a combination of disjunction (OR) and conjunction (AND), owing to the presence of negative correlation coefficients. Since an increase of the vibration intensity level at one measuring point may be accompanied by a decrease at another, it is important to monitor not the absolute values at a given moment of time but the changes of the vibration intensity levels. Thus, a correct logical analysis of the state functions should also include a dependence on discrete time. The time discretization step may correspond to the minimum formation time of one estimate of the vibration intensity level. Taking the time factor into account when constructing fuzzy-logic functions and a genetic algorithm [9] makes it possible to account for jumps in the vibration intensity level. In the ISO 10816 group of standards such jumps are considered a sign of a malfunction, regardless of whether the jump increases or decreases the vibration intensity level. The need to take the time factor into account in assessing the condition of the rotor assembly follows from the fatigue wear of the structural materials and components of the assembly. Since vibration usually creates dynamic local stresses that are significantly lower than the static strength threshold, the nucleation and development of defects occurs as a relatively long process. It was shown in [10] that the S-N diagrams used to describe these processes can be replaced by equivalent A-N diagrams, where A is the vibration overload determined by the locally acting acceleration at each time instant.
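The zone classification just described can be written as a simple fuzzy membership function; in the sketch below the zone boundaries and transition width are hypothetical illustration values (real boundaries depend on the machine class in the ISO 10816 standards):

```python
# Fuzzy classification of an RMS vibration level into ISO 10816-style zones
# A ("very good") .. D ("very bad"). Boundaries and width are hypothetical.
def zone_memberships(v_rms, boundaries=(2.8, 7.1, 18.0), width=0.5):
    """Return soft membership degrees for zones A..D.

    Inside a zone the membership is 1; across a band of half-width `width`
    around each boundary it falls linearly, so a near-boundary measurement
    belongs partly to both adjacent zones (the fuzzy character of the estimate).
    """
    edges = (float("-inf"),) + tuple(boundaries) + (float("inf"),)
    m = {}
    for zone, lo, hi in zip("ABCD", edges[:-1], edges[1:]):
        up = 1.0 if lo == float("-inf") else min(max((v_rms - lo) / (2 * width) + 0.5, 0.0), 1.0)
        down = 1.0 if hi == float("inf") else min(max((hi - v_rms) / (2 * width) + 0.5, 0.0), 1.0)
        m[zone] = min(up, down)
    return m
```

A level well inside zone A gets membership 1.0 in A and 0 elsewhere; a level exactly on the A/B boundary belongs to A and B with degree 0.5 each, which is the kind of graded estimate the fuzzy-logic operators then combine across measuring points.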
Fig. 2. S-N diagram and equivalent vibration impacts. [Figure: left axis, stress in MPa (100 to 600); right axis, equivalent vibration acceleration in m/s² (500 to 3000); horizontal axis, number of cycles N from 10⁰ to 10¹⁰, spanning low-cyclic, multicyclic and gigacyclic fatigue; regions marked for static fatigue, cyclic (vibration) fatigue, shocks and dynamic loads, vibro-acoustic processes and random vibrations, polyharmonic vibrations, and friction/fretting.]
An example of such a diagram of cyclic strength and equivalent vibration exposure is shown in Fig. 2. Since local deformations are described by acceleration, the role of acceleration as the most important parameter can be underestimated if vibration intensity is estimated by velocity alone. This is especially important when taking into account the effect of gigacyclic fatigue, which manifests itself most quickly under high-frequency vibrations. Exposure to such relatively high-frequency vibration can cause fatigue and emergency situations over relatively short time intervals (weeks and months) [3, 4]. Moreover, such relatively high-frequency vibrations are usually not detected by regular vibration monitoring systems, since their frequency is above the upper cutoff frequency of the control. However, such high-frequency vibration in the form of high-intensity acceleration can usually be detected with special equipment at objects such as hydraulic units operating in the recommended modes at high power. Based on the foregoing, it is advisable to supplement assessments of the state of controlled equipment with the influence of the time factor. It is also advisable to supplement the velocity-based estimates of vibration intensity with average and extreme acceleration-based estimates, including in the region of increased frequencies. Considering the time factor in estimates of the vibrational state presumes that equipment in the normal state exhibits identical vibrational characteristics under the same operating conditions. The independence of such state estimates from time suggests that the vibrational processes are ergodic in this case. This condition is usually satisfied for stationary operating conditions [11]. Owing to this, robust statistical estimates of the vibrational state can be obtained even on a single unit, taking into account the Birkhoff-Khinchin ergodic theorem.
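The jump criterion discussed in this section reduces to comparing successive discrete-time intensity estimates; a sketch in which the relative threshold is a hypothetical illustration value:

```python
def detect_jumps(levels, rel_threshold=0.25):
    """Flag discrete time steps where the vibration intensity estimate jumps.

    A jump in either direction (increase or decrease) counts as a fault
    indicator, as in the ISO 10816 family of standards. The 25% relative
    threshold is a hypothetical illustration value.
    """
    flags = []
    for k in range(1, len(levels)):
        prev, cur = levels[k - 1], levels[k]
        if prev > 0 and abs(cur - prev) / prev > rel_threshold:
            flags.append(k)
    return flags

# A sudden drop is flagged as well as a sudden rise:
detect_jumps([4.0, 4.1, 4.0, 6.0, 6.1, 3.0])  # → [3, 5]
```

Each list element corresponds to one time discretization step, i.e. one formed estimate of the vibration intensity level.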
6 Conclusions

Modern systems for monitoring the state of equipment by static and dynamic parameters are close in structure to artificial neural networks, and the control algorithms used to process vibration intensity information are similar to fuzzy logic. This allows us to recommend applying training principles designed for neural networks. Such training can be implemented effectively even with a limited set of control objects, taking into account the ergodicity of the dynamic processes controlled in this case. Since the time factor is important, algorithms based on estimates of vibration intensity should be supplemented with a functional dependence on time. To account for vibrational fatigue, the monitored frequency range of the dynamic parameters should also be extended toward higher frequencies, with control of the intensity of local accelerations. Using the training principle to adjust the levels against which vibration intensity estimates are compared when deciding on the condition of equipment makes it possible to abandon subjective threshold values, first defined in ISO 2372 and then repeated in later international standards on assessing the vibrational state of rotor equipment, even though they are not related to the actual vibration strength characteristics. Such training also eliminates incorrect operation of the emergency protection signal generation logic.
References

1. Chang, C.-W., Lee, H.-W., Liu, C.-H.: A review of artificial intelligence algorithms used for smart machine tools. Inventions 3, 41, 28 p. (2018)
2. Haykin, S.: Neural Networks and Learning Machines, 3rd edn. Prentice Hall, Upper Saddle River (2009)
3. Trunin, E.S., Skvortsov, O.B.: Operational monitoring of the technical condition of hydroelectric plants. Power Technol. Eng. 44(4), 314–321 (2010)
4. Skvorcov, O.: Development of vibrating monitoring for hydro power turbines under operating condition. J. Mech. Eng. Autom. 4(11), 878–886 (2014)
5. Legeza, V., Dychka, I., Hadyniak, R., Oleshchenko, L.: Mathematical model of the dynamics in a one nonholonomic vibration protection system. Int. J. Intell. Syst. Appl. (IJISA) 10, 20–26 (2018)
6. Hu, Z., Legeza, V.P., Dychka, I.A., Legeza, D.V.: Mathematical modeling of the process of vibration protection in a system with two-mass damper pendulum. Int. J. Intell. Syst. Appl. (IJISA) 3, 18–25 (2017)
7. Skvortsov, O.B.: Formal neuron model. Invention SU № 437103 (1974)
8. Deore, K.S., Khandekar, M.A.: Design machine condition monitoring system for ISO 10816-3 standard using fuzzy logic. Int. J. Eng. Res. Technol. (IJERT) 4(01), 726–729 (2015)
9. Awadalla, M.H.A.: Spiking neural network and bull genetic algorithm for active vibration control. Int. J. Intell. Syst. Appl. (IJISA) 2, 17–26 (2018)
10. Lenk, A., Rehnitz, J.: Schwingungsprüftechnik, 270 p. VEB Verlag Technik, Berlin (1974)
11. Pravotorova, E.A., Skvortsov, O.B.: Modelling of vibration tests of winding elements of power electric equipment. J. Mach. Manuf. Reliab. 44(5), 479–484 (2015)
Integrated Computer Analysis of Genomic Sequencing Data Based on ICGenomics Tool

Yuriy L. Orlov1,2,3, Anatoly O. Bragin2,3, Roman O. Babenko2, Alina E. Dresvyannikova2,3, Sergey S. Kovalev2,3, Igor A. Shaderkin1, Nina G. Orlova1,4, and Fedor M. Naumenko2

1 I.M. Sechenov First Moscow State Medical University of the Ministry of Health of the Russian Federation (Sechenov University), Trubetskaya 8-2, 119991 Moscow, Russia
[email protected]
2 Novosibirsk State University, Pirogova, 1, 630090 Novosibirsk, Russia
3 Institute of Cytology and Genetics SB RAS, Lavrentyeva, 10, 630090 Novosibirsk, Russia
4 Moscow Witte University, 2nd Kozhukhovskiy proezd, 12-1, 115432 Moscow, Russia
Abstract. Fast growth of sequencing data volume demands the development of new program systems for processing, storage and analysis of sequencing data. Here we review approaches for bioinformatics data integration using complementary methods in genomics, proteomics and supercomputer calculations, on the example of the ICGenomics tool. The program complex ICGenomics was designed previously in Novosibirsk for storage, mining, and analysis of genomic sequences. This tool enables wet-lab biologists to perform high-quality processing of sequencing data in the fields of genomics, biomedicine, and biotechnology. Overall, integrated software tools have to include novel methods for processing initial high-throughput sequencing data, including gene expression data. Examples of the application areas are: ChIP-seq analysis; functional annotation of gene regulatory regions in nucleotide sequences; prediction of nucleosome positioning; and structural and functional annotation of proteins, including prediction of their allergenicity parameters, as well as estimates of evolutionary changes in protein families. Applications of ICGenomics to the analysis of genomic sequences in model genomes are shown. We conclude by discussing the adaptation of machine learning methods in bioinformatics. The ICGenomics tool is available at http://www-bionet.sscc.ru/icgenomics/.

Keywords: ICGenomics · Integrated computer analysis · Genomic sequencing data analysis · Biomedicine · Biotechnology
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 154–164, 2020. https://doi.org/10.1007/978-3-030-39216-1_15

1 Introduction

The development of computational algorithms, databases and tools is necessary for the efficient processing, management, storage and analysis of large-scale sequencing data [1, 2]. However, analyzing such big data, deriving biological knowledge from it, and applying that knowledge back for predictions and further experimentation is becoming a challenging
task demanding new software development [2]. There are sets of computer tools that handle separate problems of sequencing, protein structure and phylogeny analysis. They may cover a broad scope of sequencing problems, but their program architecture is not intended for joining different application areas. The computer complex ICGenomics is intended to support research in genomics, molecular biology, biotechnology and biomedicine, complementing existing tools. The research problem highlighted here is the functional annotation of genomic sequences obtained by high-throughput sequencing using technologies such as ChIP-seq and RNA-seq. It includes sequencing quality control, clustering of short sequencing reads, and then analysis of gene regulatory regions (gene promoters, enhancers). The pilot version of the program complex ICGenomics was designed for the analysis of symbol sequences, DNA and amino acids. The key approach here is the analysis of sequences (texts) with data exchange between program modules in text formats. The development of new experimental genomics methods, first of all sequencing, has led to exponential growth of experimental data volumes (an "information explosion") and to diversification of application areas and typical data analysis scenarios, which requires adaptation of the software architecture to varying tasks [3, 4]. The main objective of the computer analysis of genomic data consists in functional annotation and integration of results with molecular-biological information resources, such as general databanks (NCBI GenBank, EMBL, PDB) and specialized databases on gene transcription regulation, as well as medical and patient-oriented resources [3]. In this regard, the development of computer technologies for automatic analysis (short sequence read filtering and processing) and for the functional annotation of genomic sequences (prediction of functional sites, gene regulatory regions, transcription factor and protein binding sites) is in demand [4].
A number of programs for the extraction, integration and visual representation of such data in the form of genomic profiles [5–9] and microarray expression data [10–12] have been developed. Such data are presented on the servers of the international bioinformatics centers NCBI (http://www.ncbi.nlm.nih.gov), UCSC Genome Browser (http://genome.ucsc.edu/) and EBI (http://www.ebi.ac.uk/), and in specialized databases such as TCGA (The Cancer Genome Atlas, cancer.gov). A set of specialized workflows and pipelines for sequencing data processing has appeared recently [7, 8, 13]. The major objects of theoretical and applied genomics are the molecular-genetic systems coordinating the function of genomes, genes, RNA, proteins, and gene and metabolic pathways at the hierarchical levels of life: cell, tissue, organ, organism, and population [14]. Such a hierarchy presumes a modular structure of the integration software, from DNA to RNA transcripts, then proteins and more complex network structures. Despite the availability of bioinformatics software, the growing data volumes leave a number of bioinformatics problems (biomedical and patient-oriented) in need of more detailed development [15, 16]. In addition to classical problems, we singled out the following areas of genomic sequence research for the integrated program complex:

1. Development of a computer pipeline approach for preprocessing, processing and mapping to a reference genome of DNA fragments received during large-scale parallel sequencing [17, 18].
2. Functional annotation of genomic sequences (the human genome and genomes of model organisms) for marking regulatory regions, prediction of nucleosome formation sites and chromatin structure annotation [19].
3. Development of programs for marking protein functional sites, determining the properties of protein fragments encoded in nucleotide sequences, and predicting the potential allergenicity of proteins using original methods and databases [20–23].
4. Comparison of the functional properties of newly sequenced genes of various organisms [24, 25], and research on adaptive models of evolution at the gene family and genome levels.

The solution of these tasks is necessary to provide technical support for genome research. Software for these research fields is presented in the program complex ICGenomics, initially developed at the Institute of Cytology and Genetics SB RAS [4]. Special attention was given to original methods that do not repeat standard algorithms for routine tasks, such as prediction of coding sequences or prediction of transcription factor binding sites (TFBS) from the nucleotide sequence alone (for example, by weight matrix), for which standard solutions are presented on the NCBI, UCSC and EBI servers and in stand-alone programs [26–29]. The program complex ICGenomics was implemented and tested on the computing equipment of the Shared Facility Center "Bioinformatics" of the Siberian Branch of the Russian Academy of Science, http://www-bionet.sscc.ru/icgenomics.
Fig. 1. Structure of the ICGenomics complex. [Figure: the user interface (ICGenomics-web / ICGenomics-start, interface and workflow) connects four components organized as a computer pipeline: ICGenomics-Processing (sequencing data processing; transcription factor binding site recognition by ChIP-seq profiles), ICGenomics-GenomeAnnotation (nucleosome annotation; exon search; search of miRNA gene promoters; linked to an exons database), ICGenomics-Allergen (module of protein allergenicity prediction), and ICGenomics-Evolution (reconstruction of protein evolution history; phylogenetic analysis).]
Thus, the tasks realized for the computer analysis of genomic sequences include the following procedures [1]:
• Processing of DNA sequences from the fragments produced by next-generation sequencing devices.
• Functional annotation of genomic nucleotide sequences, including functional annotation of nucleosomes, exon search, and prediction of miRNA gene promoters.
• Prediction of protein allergenicity from structural and functional properties on the basis of a functional annotation method, including prediction of functional sites in protein spatial structures and prediction of the specific activity of proteins from their primary and spatial structure [20].
• Research on the modes of evolution of protein-coding genes, including reconstruction of the evolutionary history of proteins [22].
2 Materials and Methods. Structure of the Program Complex

The program complex ICGenomics fulfills the following distinct functions (Fig. 1):
– processing of extended nucleotide sequences from next-generation sequencing data, including: processing of sequencing data from the 454 and Illumina technological platforms, processing of sequencing data from the SOLiD platform, and processing of whole-genome ChIP-seq profiles, including peak calling and TFBS prediction [4, 28, 29];
– annotation of genomic sequences, including: marking of nucleosome positions on the basis of wavelet transformation of whole-genome profiles and prediction of nucleosome formation sites. This functional module recognizes nucleosome formation sites from sequencing data, using linker DNA for prediction. It also includes exon search and prediction of miRNA gene promoters using specific nucleotide structure motifs [30];
– prediction of protein allergenicity by structural and functional properties, and functional annotation of protein spatial structure, including prediction of functional sites in protein 3D structure [20, 21];
– research on the evolution modes of protein-coding genes, including reconstruction of the evolutionary history of proteins on the basis of ortholog prediction in sequenced genomes, phylogenetic analysis and investigation of selection modes [24].

Each of the functions listed above is realized in the corresponding program component (module) of the complex (see Fig. 1). The program complex consists of the management module (the ICGenomics-web component and the operating ICGenomics-start program) and four program components: ICGenomics-Processing, ICGenomics-GenomeAnnotation, ICGenomics-Allergen and ICGenomics-Evolution. The general user interface is presented in Fig. 2.
Input data for the system are files of nucleotide and amino acid sequences in the FASTA format, and also in the formats of the next-generation sequencing platforms Illumina and SOLiD. It is also possible to use genome data formats such as bed files (genomic coordinates) and wig (a numerical profile). The complex refers to the SiteEx [22] and PDBSite [23] databases, which contain compiled information about exons and protein spatial sites. The ICGenomics-Processing component includes data processing modules (format conversion and sequencing signal filtering): filtering of DNA fragments with low quality, conversion of sequences to the standard FASTA format for further processing, processing of sequencing data from the 454 and Illumina platforms (fastq and qseq formats), processing of SOLiD platform sequencing data (in the color-coding csfasta format), and a pipeline for SOLiD mapping data processing. This component (the ChIP-seq pipeline module) also processes whole-genome ChIP-seq profiles, performing peak calling from such a profile and TFBS prediction in the genome, as applied in [3, 4].
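The quality-filtering and format-conversion step can be illustrated with a minimal sketch; the Phred threshold, the in-memory record list and the function name are hypothetical illustrations, not the actual ICGenomics implementation:

```python
# Minimal sketch: read FASTQ records, drop reads whose mean Phred quality is
# below a (hypothetical) threshold, and emit the survivors in FASTA format.
def fastq_to_fasta(lines, min_mean_q=20):
    out = []
    it = iter(lines)
    for header in it:
        seq, _plus, qual = next(it), next(it), next(it)
        phred = [ord(c) - 33 for c in qual.strip()]  # Sanger/Illumina 1.8+ offset
        if phred and sum(phred) / len(phred) >= min_mean_q:
            out.append(">" + header.strip()[1:])
            out.append(seq.strip())
    return out

records = [
    "@read1", "ACGTACGT", "+", "IIIIIIII",   # high quality (Phred 40)
    "@read2", "ACGTACGT", "+", "!!!!!!!!",   # low quality (Phred 0)
]
fastq_to_fasta(records)  # → [">read1", "ACGTACGT"]
```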
Fig. 2. Interface and program components of the ICGenomics complex.
3 Application Areas

Typical tasks at the preprocessing stage are: transformation of the data received from an experiment to standard formats; analysis of sequence quality and filtering by quality; and preparation of the results of these operations. The recognition method is realized in the ChIP-Seq pipeline program and is intended for processing the output data of experiments on mass high-throughput sequencing of functional sites. The program complex pursues the following objectives: (a) data processing and mapping to the genome; (b) verification of the genomic loci found by various existing bioinformatics software (TFBS recognition programs), an approach that allows the exclusion of errors and artifacts present in ChIP-Seq experiments; (c) correct adjustment of parameters at various stages of data processing (mapping onto the genome, choice of the minimum number of reads of a site in a genomic locus, etc.); (d) obtaining, as a result, the complete list of potential target genes for the TF studied [3, 4]. As the program for mapping DNA reads (the primary data of a ChIP-Seq experiment) we used SOLiD™ BioScope™ Software, recommended by the manufacturer, with default settings. Conversion of the output file to the target format (.ma -> .bam) was followed by use of the MACS program output [29]. The MACS program implements the search for ChIP-Seq profile peaks (ChIP-Seq peak calling). It is one of the most widely used programs and, besides, possesses the greatest accuracy in binding site detection. The result of the program is a whole-genome profile in wig format, which represents a list of positions and "coverage" values. A "position" is a chromosomal localization, including the chromosome number and the position in chromosome coordinates. "Coverage" is the number of reads, i.e. the number of recorded bindings of the studied protein (TF) to DNA in the chromosomal locus.
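The position/coverage profile described above can be read and thresholded with a few lines; this is a sketch with a hypothetical minimum read count over a simplified variableStep wig fragment, not the actual MACS or ICGenomics code:

```python
# Sketch: scan a variableStep wig profile ("position"/"coverage" pairs) and
# report loci whose coverage reaches a hypothetical minimum read count.
def high_coverage_loci(wig_lines, min_reads=10):
    chrom, hits = None, []
    for line in wig_lines:
        if line.startswith("variableStep"):
            chrom = line.split("chrom=")[1].split()[0]
        elif line.strip():
            pos, cov = line.split()
            if int(cov) >= min_reads:
                hits.append((chrom, int(pos), int(cov)))
    return hits

profile = ["variableStep chrom=chr1", "100 3", "250 14", "300 27"]
high_coverage_loci(profile)  # → [("chr1", 250, 14), ("chr1", 300, 27)]
```

Raising `min_reads` corresponds to the parameter adjustment in step (c): choosing the minimum number of reads required to accept a genomic locus.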
Further, the nucleotide sequences containing TFBS can be determined, and the frequencies of oligonucleotides can be analyzed by the program developed earlier [27]. The Illumina and ABI SOLiD sequencing technologies are characterized by features of their experimental procedures that are reflected in the input formats used by ICGenomics. The technology of the Illumina (Solexa) company (www.illumina.com) uses optical scanning of the fluorescence of labeled nucleotides in cloned colonies of DNA molecules on a solid surface, while the ABI (Applied Biosystems) SOLiD technology (Sequencing by Oligonucleotide Ligation) uses ligation and, correspondingly, encoding by two nucleotides. For processing of sequencing data, the program complex accepts the following genomic data formats: FASTA, fastq, clustal. Using the same FASTA format, the ICGenomics-GenomeAnnotation component for functional annotation of genomic sequences solves the following problems:
– functional annotation of nucleosomes (including application of wavelet transformation for the analysis of whole-genome profiles of nucleosome formation sites and recognition of nucleosome formation sites from whole-genome sequencing data of linker DNA);
– exon search in newly sequenced loci for more detailed gene and protein annotation;
– miRNA gene promoter search in nucleotide sequences by specific structure motifs.
Separate modules are called from the general interface step by step. The Phase program for nucleosome prediction was successfully applied to the analysis of the yeast genome and to comparing the efficiency of gene transcription depending on the predicted localization of nucleosomes in gene promoters [19]. In the same way, the exon search and miRNA promoter prediction modules can be called, including the SiteEx database. The program component ICGenomics-Allergen predicts the allergenic properties of proteins (peptides) using conformational peptides [18]. Besides, the module predicts functional sites in the spatial structure of proteins and the specific activity of proteins based on their structure. The program calculates allergenicity values for a set of peptide sequences (a numerical value and a text description). The same data can be transferred to homology analysis against exon sequences in the corresponding module, and to comparative analysis of the properties of the protein family giving rise to allergenic properties. The program component ICGenomics-Evolution allows analysis of the modes of evolution of proteins and protein-coding genes based on samples of aligned sequences. Reconstruction of the evolutionary history of proteins is based on ortholog prediction in sequenced genomes. The component is realized in the form of a data processing pipeline. The methods of phylogenetic analysis embedded in this program component were successfully used in [24] for the analysis of cyclin proteins.
4 Results

The program complex ICGenomics, containing several unique modules, has been developed. The tool is available at http://www-bionet.sscc.ru/icgenomics/. The program fulfills unique procedures of data processing and analysis of genomic sequences. We applied transcriptome sequencing to the analysis of the parasitic fluke O. felineus [31]. Three samples of O. felineus tissues (the life stages maritae without eggs, maritae with eggs, and metacercariae), as well as preparations of O. viverrini and C. sinensis tissues, were investigated. Sequence reads were mapped onto the genome of a parasitic flatworm, the schistosome Schistosoma japonicum, a parasitic worm that affects the blood system of an organism. It is the closest sibling species whose genomic sequence has been deciphered almost completely. The genome of S. japonicum was functionally annotated earlier, and 55 microRNA sequences were identified; these genes take part in the regulation of the developmental stages of the worm organism. Analysis of the localization of sequence reads of the three organisms revealed that 17 of these 55 microRNA genes are also present in the genomes of O. felineus, O. viverrini and C. sinensis. It was shown that the number of mapped sequences for these genes depends both on the stage of organism development and on the species.
5 Conclusions

Overall, the program complex uses unique methods developed at the Institute of Cytology and Genetics SB RAS: protein allergenicity prediction based on amino acid sequences (conformational peptides), prediction of nucleosome positions, and prediction of transcription factor binding sites. The main constructive characteristic of the developed methods is the ability to process large volumes of next-generation sequencing data. Modern biomedical applications, such as telemedicine, also need integration of data and tools [1, 32, 33]. The development of sequencing data processing tools, including quality estimates [34], miRNA analysis [35] and alternative splicing analysis [36], raises the problem of integrating these resources into a universal computer tool. This is important for plant biology applications [37, 38]. We have organized a series of international bioinformatics events on computer genomics and database integration in Novosibirsk, Russia [39–42], as an important step toward joining efforts in developing integrated bioinformatics standards [1]. A perspective extension of bioinformatics data integration lies in plant biology applications [38] and the analysis of crop plants [35].

Acknowledgments. The authors are grateful to Drs M.P. Ponomarenko and P.S. Demenkov for technical help in system development, and to Prof Ming Chen for scientific discussion. The initial work was supported by the Russian Ministry of Education and Science, with subsequent support by ICG SB RAS budget projects. Testing was done on the supercomputer cluster of the Siberian Branch of the Russian Academy of Science, Shared Facility Center "Bioinformatics". Open genomic sequencing data from GEO NCBI and sequencing data obtained at ICG SB RAS were used. The analysis of TFBS sites (for YLO and AED) was supported by the RFBR (18-04-00483).
References

1. Chen, M., Harrison, A., Shanahan, H., Orlov, Y.: Biological big bytes: integrative analysis of large biological datasets. J. Integr. Bioinform. 14(3) (2017)
2. Wilkinson, M.D.: Genomics data resources: frameworks and standards. Methods Mol. Biol. 856, 489–511 (2012)
3. Orlov, Y.L.: Computer-assisted study of the regulation of eukaryotic gene transcription on the base of data on chromatin sequencing and immunoprecipitation. Vavilovskii Zhurnal Genetiki i Selektsii = Vavilov J. Genet. Breeding 18(1), 193–206 (2014). (in Russian)
4. Orlov, Y.L., Bragin, A.O., Medvedeva, I.V., Gunbin, K.V., Demenkov, P.S., Vishnevsky, O.V., Levitsky, V.G., Oshchepkov, D.Y., Podkolodnyy, N.L., Afonnikov, D.A., Grosse, I., Kolchanov, N.A.: ICGenomics: a program complex for analysis of symbol sequences in genomics. Vavilovskii Zhurnal Genetiki i Selektsii = Vavilov J. Genet. Breeding 16(4/1), 732–741 (2012). (in Russian)
5. Chavan, S.S., Shaughnessy Jr., J.D., Edmondson, R.D.: Overview of biological database mapping services for interoperation between different 'omics' datasets. Hum. Genomics 5(6), 703–708 (2011)
6. Orjuela, S., Huang, R., Hembach, K.M., Robinson, M.D., Soneson, C.: ARMOR: an automated reproducible modular workflow for preprocessing and differential analysis of RNA-seq data. G3 (Bethesda) 9(7), 2089–2096 (2019)
Y. L. Orlov et al.
7. Backman, T.W.H., Girke, T.: systemPipeR: NGS workflow and report generation environment. BMC Bioinformatics 17, 388 (2016)
8. Love, M.I., Anders, S., Kim, V., Huber, W.: RNA-seq workflow: gene-level exploratory analysis and differential expression. F1000Res 4, 1070 (2015)
9. Vieira, V., Ferreira, J., Rodrigues, R., Liu, F., Rocha, M.: A model integration pipeline for the improvement of human genome-scale metabolic reconstructions. J. Integr. Bioinform. 16(1) (2018). pii:/j/jib.2019.16.issue-1/jib-2018-0068/jib-2018-0068.xml
10. Ghai, K., Malik, S.K.: Proximity measurement technique for gene expression data. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 7(10), 40–48 (2015). https://doi.org/10.5815/ijmecs.2015.10.06
11. Bhalla, A.R., Agrawal, K.: Microarray gene-expression data classification using less gene expressions by combining feature selection methods and classifiers. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 5(5), 42–48 (2013). https://doi.org/10.5815/ijieeb.2013.05.06
12. Arowolo, M.O., Abdulsalam, S.O., Isiaka, R.M., Gbolagade, K.A.: A hybrid dimensionality reduction model for classification of microarray dataset. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 9(11), 57–63 (2017). https://doi.org/10.5815/ijitcs.2017.11.06
13. Mohamed, E.M., Mousa, H.M., Keshk, A.E.: Comparative analysis of multiple sequence alignment tools. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 10(8), 24–30 (2018). https://doi.org/10.5815/ijitcs.2018.08.04
14. Ignatieva, E.V., Podkolodnaya, O.A., Orlov, Y.L., Vasiliev, G.V., Kolchanov, N.A.: Regulatory genomics: combined experimental and computational approaches. Russ. J. Genet. 51(4), 334–352 (2015). https://doi.org/10.1134/S1022795415040067
15. Stepanyan, I.V., Mayorova, L.A., Alferova, V.V., Ivanova, E.G., Nesmeyanova, E.S., Petrushevsky, A.G., Tiktinsky-Shklovsky, V.M.: Neural network modeling and correlation analysis of brain plasticity mechanisms in stroke patients. Int. J. Intell. Syst. Appl. (IJISA) 11(6), 28–39 (2019). https://doi.org/10.5815/ijisa.2019.06.03
16. Kunjir, A., Shah, J., Singh, N., Wadiwala, T.: Big data analytics and visualization for hospital recommendation using HCAHPS standardized patient survey. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 11(3), 1–9 (2019). https://doi.org/10.5815/ijitcs.2019.03.01
17. Devailly, G., Joshi, A.: Insights into mammalian transcription control by systematic analysis of ChIP sequencing data. BMC Bioinform. 19(Suppl 14), 409 (2018)
18. Ernlund, A.W., Schneider, R.J., Ruggles, K.V.: RIVET: comprehensive graphic user interface for analysis and exploration of genome-wide translatomics data. BMC Genom. 1, 809 (2018)
19. Matushkun, Y.G., Levitsky, V.G., Orlov, Y., Likhoshvai, V.A., Kolchanov, N.A.: Translation efficiency in yeasts correlates with nucleosome formation in promoters. J. Biomol. Struct. Dyn. 31(1), 96–102 (2013)
20. Bragin, A.O., Demenkov, P.S., Kolchanov, N.A., Ivanisenko, V.A.: Accuracy of protein allergenicity prediction can be improved by taking into account data on allergenic protein discontinuous peptides. J. Biomol. Struct. Dyn. 31(1), 59–64 (2013)
21. Ivanisenko, V.A., Demenkov, P.S., Pintus, S.S., Ivanisenko, T.V., Podkolodny, N.L., Ivanisenko, L.N., Rozanov, A.S., Bryanskaya, A.V., Kostrjukova, E.S., Levizkiy, S.A., Selezneva, O.V., Chukin, M.M., Larin, A.K., Kondratov, I.G., Lazarev, V.N., Peltek, S.E., Govorun, V.M., Kolchanov, N.A.: Computer analysis of metagenomic data-prediction of quantitative value of specific activity of proteins. Dokl. Biochem. Biophys. 443, 76–80 (2012)
22. Medvedeva, I., Demenkov, P., Kolchanov, N., Ivanisenko, V.: SitEx: a computer system for analysis of projections of protein functional sites on eukaryotic genes. Nucleic Acids Res. 40(Database issue), 278–283 (2012)
Integrated Computer Analysis of Genomic Sequencing Data
23. Ivanisenko, V.A., Pintus, S.S., Grigorovich, D.A., Kolchanov, N.A.: PDBSite: a database of the 3D structure of protein functional sites. Nucleic Acids Res. 33(Database issue), 183–187 (2005)
24. Gunbin, K.V., Suslov, V.V., Turnaev, I.I., Afonnikov, D.A., Kolchanov, N.A.: Molecular evolution of cyclin proteins in animals and fungi. BMC Evol. Biol. 11, 224 (2011)
25. Gupta, S., Singh, M.: Phylogenetic method for high-throughput ortholog detection. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 7(2), 51–59 (2015). https://doi.org/10.5815/ijieeb.2015.02.07
26. Glinskiy, B.M., Kuchin, N.V., Chernykh, I.G., Orlov, Y.L., Podkolodnyi, Y.L., Likhoshvai, V.A., Kolchanov, N.A.: Bioinformatics and high performance computing. Program Syst.: Theory Appl. 4(27), 99–112 (2015). (in Russian)
27. Putta, P., Orlov, Y.L., Podkolodnyy, N.L., Mitra, C.K.: Relatively conserved common short sequences in transcription factor binding sites and miRNA. Vavilov J. Sel. Breeding 15(4), 750–756 (2011)
28. Lee, K.L., Orlov, Y.L., le Yit, Y., Yang, H., Ang, L.T., Poellinger, L., Lim, B.: Graded Nodal/Activin signaling titrates conversion of quantitative phospho-Smad2 levels into qualitative embryonic stem cell fate decisions. PLoS Genet. 7(6), e1002130 (2011)
29. Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., Liu, X.S., et al.: Model-based analysis of ChIP-seq (MACS). Genome Biol. 9(9), R137 (2008)
30. Babichev, S., Skvor, J., Fišer, J., Lytvynenko, V.: Technology of gene expression profiles filtering based on wavelet analysis. Int. J. Intell. Syst. Appl. (IJISA) 10(4), 7 (2018). https://doi.org/10.5815/ijisa.2018.04.01
31. Pakharukova, M.Y., Ershov, N.I., Vorontsova, E.V., Katokhin, A.V., Merkulova, T.I., Mordvinov, V.A.: Cytochrome P450 in fluke Opisthorchis felineus: identification and characterization. Mol. Biochem. Parasitol. 181(2), 190–194 (2012)
32. Ullah, Z., Fayaz, M., Iqbal, A.: Critical analysis of data mining techniques on medical data. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 8(2), 42–48 (2016). https://doi.org/10.5815/ijmecs.2016.02.05
33. Voropaeva, E.N., Pospelova, T.I., Voevoda, M.I., Maksimov, V.N., Orlov, Y.L., Seregina, O.B.: Clinical aspects of TP53 gene inactivation in diffuse large B-cell lymphoma. BMC Med. Genom. 12(Suppl 2), 35 (2019)
34. Naumenko, F.M., Abnizova, I.I., Beka, N., Genaev, M.A., Orlov, Y.L.: Novel read density distribution score shows possible aligner artefacts, when mapping a single chromosome. BMC Genom. 19(Suppl 3), 92 (2018)
35. Wang, J., Meng, X., Dobrovolskaya, O.B., Orlov, Y.L., Chen, M.: Non-coding RNAs and their roles in stress response in plants. Genom. Proteomics Bioinform. 15(5), 301–312 (2017). https://doi.org/10.1016/j.gpb.2017.01.007
36. Babenko, V.N., Bragin, A.O., Spitsina, A.M., Chadaeva, I.V., Galieva, E.R., Orlova, G.V., Medvedeva, I.V., Orlov, Y.L.: Analysis of differential gene expression by RNA-seq data in brain areas of laboratory animals. J. Integr. Bioinform. 13(4), 292 (2016)
37. Masood, R., Khan, S.A., Khan, M.N.A.: Plants disease segmentation using image processing. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 8(1), 24–32 (2016). https://doi.org/10.5815/ijmecs.2016.01.04
38. Orlov, Y.L., Salina, E.A., Eslami, G., Kochetov, A.V.: Plant biology research at BGRS2018. BMC Plant Biol. 19(Suppl 1), 56 (2019)
39. Orlov, Y.L., Galieva, E.R., Melerzanov, A.V.: Computer genomics research at the bioinformatics conference series in Novosibirsk. BMC Genom. 20(Suppl 7), 537 (2019)
40. Orlov, Y.L., Hofestädt, R., Tatarinova, T.V.: Bioinformatics research at BGRS\SB-2018. J. Bioinform. Comput. Biol. 17(1), 1902001 (2019). https://doi.org/10.1142/S0219720019020013
41. Orlov, Y.L., Tatarinova, T.V., Zakhartsev, M.V., Kolchanov, N.A.: Introduction to the 9th young scientists school on systems biology and bioinformatics (SBB '2017). J. Bioinform. Comput. Biol. 16(1), 1802001 (2018). https://doi.org/10.1142/S0219720018020018
42. Yi, Y., Xu, Z.: Bioinformatics analysis and characteristics of the giant panda interferon-alpha. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 3(1), 45–54 (2011)
Statistical and Linguistic Decision-Making Techniques Based on Fuzzy Set Theory
Nikolay I. Sidnyaev, Iuliia I. Butenko, and Elizaveta E. Bolotova
Bauman Moscow State Technical University, 2-nd Baumanskaya Street, 5/1, Moscow 105005, Russian Federation
[email protected], [email protected], [email protected]
Abstract. The article describes the production of sentences with the help of several grammars of interest for stenograph processing. Grammars and types of rules are determined by a theory with a high degree of formalization. A statistical approach to the distribution of stenographs is proposed. It can be used in cases when the available information is insufficient to describe the symbols or classes that may be included in the data set under consideration. It is shown that decision-making is similar to hierarchical division, but in the latter case the decision at each vertex is made on the basis of the statistical properties of the corresponding feature. It is noted that discriminant analysis is closely related to the notion of "non-recognition", meaning that an object is not classified by the separating function; if necessary, the classification should then be carried out by other means (for example, manually). It is shown that allowing rejection of classification can dramatically improve the quality of classification, although the number of classification decisions made decreases. Techniques for estimating a separating function on the basis of its classification error are stated. The simplest possibility is to apply the function to the objects used in training and count the number of incorrect decisions; it is shown that this technique results in an "optimistic" bias. The possibility of separating the set of objects into training and control subsets (the technique of concealing part of the available information) is shown: the training set is used to form the separating function, and the control set is used to estimate its quality. It is shown that this technique results in a "pessimistic" bias of the result relative to a separating function formed on the basis of the full set of objects. A comparison of the techniques for controlling the training set is given.
A further technique excludes one element from the training set and uses it as a single-element control set: only one object is removed from the training set, and it is then used to estimate the quality of the separating function.
Keywords: Grammar · Recognition · Decision making · Distribution · Information processing · Formalization
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 165–174, 2020. https://doi.org/10.1007/978-3-030-39216-1_16
1 Introduction
Automatic systems that replace recognition by the human senses must carry out the same identification as a human confronted with a certain object. However, recognition systems must use the characteristic features of the object explicitly. Classification includes all processes that end with the indication of some class (or of belonging to a class) for the objects or data under consideration. The result of recognition can also be represented as such an indication of a class; in this sense pattern recognition is a variety of classification. Classification also includes simple processes based, for example, on a measured value and a threshold (to determine whether a temperature is within safe limits). In the case of fuzzy classes, an object can be characterized by degrees of membership in one or several classes, with values in the range 0–1. Approaches based on fuzzy concepts can be useful for solving recognition problems in which the classes themselves are fuzzy. They can also play the role of intermediate means in recognition problems whose solution ends in a move to discrete classes (which include the class of "rejection of recognition", or questionable area). Sometimes the result of the recognition process is the assignment of a value to an analog (continuous) variable by interpolation between discrete classes. When recognition techniques are applied at the training stage, a number of discrete positions (discrete classes) are used. However, instead of indicating one of these classes, it is appropriate to correct the recognition result of a position according to muscle activity to some interpolated position value lying between the values corresponding to discrete classes [1, 2]. Recognition can thus result in a completely new type of analog data.
The input information is characterized by a certain degree of complexity and is "produced" by a "real" source (a physical object immersed in some medium, a process, an experiment, an economic system, etc.). The output information is relatively simple: it is reduced to the indication of a class [3–5].
2 Recognition of Non-stylized Written Signs and Estimation of the Similarity Measure
As an example, we consider methodological approaches for complex experiments in the recognition of non-stylized handwritten characters. Modern publications note the advantages and disadvantages of representing a recognizable object as a whole, or any characteristic part of it, as a multidimensional vector in feature space. The disadvantage is that the properties of the vector as a mathematical object are explicitly or implicitly extended to the recognizable object itself, which in real identification problems is apparently always incorrect, and in any case never sufficiently justified [1]. It is more correct to describe an object by a tuple, or p-characteristic, but then the visibility, and the possibility of unconditional use of the metric of Euclidean spaces as a measure of similarity of classified objects, is completely lost. The inadequacy of a vector interpretation of handwritten characters under their standard way of writing is immediately apparent, so the scheme for
recognition of handwritten characters must from the outset be adapted to the peculiarities of this task, the most important of which are:
– the absence of any hints about what to consider an adequate system of descriptions and features in this seemingly simple problem;
– the complete absence of objective data on the similarity criterion used by humans in the recognition of written characters;
– very strict requirements on the result, determined by human capabilities in this problem;
– the admissibility and presence of "teacher errors" due to inevitable errors in the encoding of the source information and the inability to control the content of the samples because of their size.
These features led to the understanding that the problem of recognition of handwritten signs must be studied by heterogeneous techniques [2]. For tasks comparable in complexity to the recognition of written characters, when the nature of the source is not at all clear and seemingly obvious solutions inexplicably fail, a cascade of decisions is one of the best possibilities [3]. In this case, by breaking the original problem into a number of much simpler subtasks within the framework of an assumed model of the situation generating the problem, one can expect to achieve success by a sequence of local but mutually agreed solutions. The most natural of these subtasks, or stages, within the accepted model are: (a) selecting or designing a system of original descriptions of the objects of recognition and of their similarities, agreed if possible with the similarity assessments that a person gives in this task, insofar as this is feasible; and (b) on the basis of the found descriptions and similarity measure, searching for meaningful features and the procedures for making decisions on them, which in turn may allow the original descriptions and the similarity measure to be refined, and so on, Fig. 1.
Fig. 1. Similarity measures for stenograph images.
The model of recognition of handwritten characters is based on representing the flat line portrait of a character by the same line, but in an extended multidimensional space of natural descriptions, such as subspaces of the plane of the symbol [4]. In the first phase of validation of this model it was necessary to refine the original coordinates and the measure of similarity of the portraits, and then to begin a formal search for features within the framework of the model, corresponding to local regions of grouping of portraits in the space of natural descriptions. Consideration of the main features of the problem of recognition of handwritten characters, from the standpoint of step-by-step techniques for solving such problems, shows that hypotheses about the suitability and adequacy of the original descriptions and the similarity criterion should be estimated by the degree of collapse of the sample into clots. To carry out this part of the experiments, a program was developed that allowed the descriptions and the nature of their processing to be refined on the material obtained with its help, and a similarity criterion to be formulated for handwritten characters given by portraits in the space of natural descriptions. The result of the test on the clot standards was at the level of the best works on automatic character recognition. This allowed the conclusion that, at least at the first stage, the model and the adopted decisions are meaningful, and that opportunities to further improve the results relate to accounting for handwriting restrictions, that is, the manner of writing, and to the transition to the recognition of local identities, since the comparison generally used at the first stage could explain the obtained errors of 3/4 and 1/2 rejection rates in seemingly simple cases.
Experiments with the manner of writing have not yet yielded a noticeable result; as for local features, the technique was developed precisely to find them, as well as to refine the initial descriptions and the similarity measure in the space of natural descriptions. The technique belongs to the widely used techniques of cluster analysis and implements a modification of gradient search. To estimate the gradient of the point set, the center of gravity of the set of points inside a sample sphere of relatively small radius is computed; it is shifted from the center of the sphere. The center of the sphere at the next step is aligned with the center of gravity determined in the previous position of the sphere. It is easy to see that the magnitude of this shift does not exceed the radius of the sphere, and therefore the movement toward the mode is stable. The radius of the sphere changes during the search, adapting to the number of points that fall inside it. With too small a radius, single points away from a clot would have to be treated as separate clots, while with too large a radius the closest clots would merge. Therefore, if there are few points in the sphere, its radius is slightly increased at the next step; if there are many points, the radius is decreased (see Fig. 1). This kind of adaptation allows, for small samples where the restoration of smooth distribution functions is a serious problem, a high resolution to be obtained near the clots with a low probability of allocating false clots. An extremely important point in the work of search procedures is to find out whether all modes have been identified when their number is not known in advance and can fluctuate within very wide limits. To solve this problem, the technique provides for the marking of points in the process of moving the test sphere to the mode and on the mode, and the movement to the next mode can begin only from an unmarked point.
Since the overwhelming number of points is marked on the modes and in their immediate vicinity, the number of unmarked points
decreases very quickly; when none remain, the procedure stops. In the observations, the number of ascents to a mode did not exceed an average of 8% of the sample volume, and repeated ascents to an already selected mode amounted on average to 50–70%, indicating a high efficiency of this search procedure with "staining". There are two marking modes. In the first, adequate to situations of the peak type, one label is used; in the second, adequate to ridges or taxa, realizations near the selected mode are additionally labeled. The radius of the additional marking slightly exceeds the radius of the sphere established on the mode, and all realizations marked with such a label are excluded from further consideration. Thus, in the first markup mode only the conditions for starting the search for modes change, while in the second the sample size changes as well. The second mode is therefore significantly faster (repeated ascents to a selected mode are excluded and, although their share is relatively small, each found mode drastically reduces the sample size) and more suitable for situations such as ridges, since it is a gradient version of the covering technique. The sets of modes to be allocated contain, in addition to the true modes, quasi-modes on the slopes of a true mode. This increases the role of selecting the radius of the sphere so as to exclude realizations in the vicinity of a mode from the sample when the mode is found. If there are no prescriptions, the choice of the markup mode more adequate to the problem of approximating the distributions is made on a trial sample, based on the results of an examination for the two types of modes, comparing the numbers of errors and refusals and the number and specificity of the modes. In particular, in observations conducted on handwritten recognition of digits in the space of natural descriptions, the situation corresponded precisely to the second markup mode, which could be due to the absence in these observations of neutralizing transformations at the input.
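The sphere-shift search for modes described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the adaptation factors, point-count bounds and starting radius are invented for the example.

```python
import numpy as np

def find_mode(points, start, radius=1.0, min_pts=5, max_pts=50, tol=1e-6):
    """Shift a test sphere toward a local mode: at each step the center of
    the sphere moves to the center of gravity of the points inside it, and
    the radius adapts to the number of points captured."""
    center = np.asarray(start, dtype=float)
    for _ in range(200):
        inside = points[np.linalg.norm(points - center, axis=1) <= radius]
        if len(inside) == 0:
            radius *= 1.5          # empty sphere: grow and retry
            continue
        new_center = inside.mean(axis=0)
        if len(inside) < min_pts:  # few points: grow the radius slightly
            radius *= 1.1
        elif len(inside) > max_pts:  # many points: shrink the radius
            radius *= 0.9
        if np.linalg.norm(new_center - center) < tol:
            return new_center
        center = new_center
    return center

rng = np.random.default_rng(0)
clot = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(100, 2))
mode = find_mode(clot, start=[4.5, 4.5])
print(mode)  # a point near the clot center (5, 5)
```

Because the center of gravity of the points inside the sphere always lies within the sphere, the step length is bounded by the radius, matching the stability argument in the text.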
The methodology makes a decision using the two standards (modes) closest to the identified realization in the sense of the similarity measure used. However, when the difference of the two minimum distances is less than an allowable clearance and the two nearest standards have different meanings, or when the distance to the nearest standard exceeds a threshold, no decision is accepted. Otherwise, the decision is made by the nearest standard, and if its meaning does not coincide with the meaning of the realization specified in its header, an identification error is recorded. In case of error or refusal of identification, the title of the second nearest standard and the contents of the counter accumulating the examination result are additionally printed [4]. In the dialogue, the program obtains all the necessary information about the characteristics of the problem being solved (apart from a few parameters of the organization of the processed data arrays, this is an indication of the number of classes and a few settings specific to the program), and its operating mode is also set by keys. The time spent on an iteration depends mostly on the dimension of the description space: to a first approximation, it is proportional to the sample size, the dimension of the space and the number of features in one cell.
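The decision rule with refusal described above (nearest standard, a minimum clearance between the two smallest distances, and a threshold on the distance to the nearest standard) can be sketched as follows; the standards, labels and threshold values are made up for illustration.

```python
import numpy as np

def classify_with_rejection(x, standards, labels, gap=0.5, d_max=3.0):
    """Decide by the nearest standard, but refuse to classify when the two
    nearest standards of different meaning are almost equally far away, or
    when even the nearest standard is too distant."""
    d = np.linalg.norm(standards - x, axis=1)
    order = np.argsort(d)
    i, j = order[0], order[1]
    if d[i] > d_max:
        return None   # refusal: too far from every standard
    if labels[i] != labels[j] and (d[j] - d[i]) < gap:
        return None   # refusal: ambiguous between two different meanings
    return labels[i]

standards = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])
labels = ["A", "B", "C"]
print(classify_with_rejection(np.array([0.2, 0.1]), standards, labels))    # A
print(classify_with_rejection(np.array([2.0, 0.0]), standards, labels))    # None (ambiguous)
print(classify_with_rejection(np.array([10.0, 10.0]), standards, labels))  # None (too far)
```

Raising `gap` or lowering `d_max` trades more refusals for fewer identification errors, the effect noted in the abstract.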
3 Statistical and Linguistic Decision-Making Techniques
There are a number of decision-making techniques suitable for assigning an object to one of several classes for given features, taking into account the variability of the images. These techniques depend significantly on the types of features and on the intraclass and interclass variability of the respective images. In addition to specialized and heuristic techniques, there are two broad families of decision-making techniques [6–9].
1. Techniques of the statistical theory of decisions are applicable when working with objects whose features are numerical values, and these values differ among objects belonging to the same class [10]. A priori information is essential for feature selection; information about deviations of feature values within the same class should be collected using statistical analysis. These two types of information contribute to the formation of decision functions that provide classification [11–15].
2. Linguistic techniques (also called structural or syntactic) are applicable when operating with objects whose characteristics are non-derivative elements (or conglomerates of non-derivative elements) and their relations. A grammar defines the rules for constructing an image from non-derivative elements. Many types of grammars (including stochastic grammars) are used, with varying breadth of grammatical description and complexity. Such grammars can often be determined on the basis of a priori information about the images; otherwise, to obtain them, one has to resort to grammatical inference from a representative sample of images [16, 17]. Performing classification over prescribed non-derivative components and their relations requires parsing. A combination of these two techniques may be appropriate.
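As a toy illustration of the linguistic approach, the following sketch produces "images" from non-derivative elements with a small context-free grammar; the grammar, non-terminals and element names are invented for the example.

```python
import random

# A toy context-free grammar: a SIGN is one or more STROKEs, and a STROKE
# expands to one of three non-derivative (terminal) elements.
GRAMMAR = {
    "SIGN":   [["STROKE", "SIGN"], ["STROKE"]],
    "STROKE": [["arc"], ["line"], ["loop"]],
}

def produce(symbol, rng):
    """Expand a non-terminal by a randomly chosen rule; terminals are kept
    as-is (a stochastic grammar would attach probabilities to the rules)."""
    if symbol not in GRAMMAR:
        return [symbol]
    out = []
    for s in rng.choice(GRAMMAR[symbol]):
        out.extend(produce(s, rng))
    return out

rng = random.Random(1)
print(produce("SIGN", rng))  # prints a random sequence of 'arc'/'line'/'loop'
```

Classification in this family runs the process in reverse: parsing a given sequence of non-derivative elements against the grammars of the candidate classes.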
Reduction of the noise introduced by communication channels and/or measuring systems is achieved quite simply in those (particular) cases where measurements and data transmission from some sources can be carried out repeatedly. Averaging independent values from all input data sources ensures noise reduction over time. If noise of this type completely determines the variability (an extremely rare case), does not depend on the initial values, and its distribution is, say, normal, then for some separation procedures it is possible to calculate the classification error (and the limits of its reduction). If the noise depends on the initial values in a way known to us, then applying a (nonlinear) transformation of the value scale allows this case to be reduced to the previous model of normally distributed noise with distributions that do not depend on the initial signal.
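The effect of averaging repeated measurements can be checked numerically. In this sketch (illustrative parameters, zero-mean Gaussian noise assumed), the standard deviation of the average of n independent measurements shrinks roughly by the factor 1/sqrt(n):

```python
import numpy as np

rng = np.random.default_rng(42)
true_signal = 2.0
noise_std = 1.0
n_repeats = 100

# 10,000 trials; in each trial the same source is measured n_repeats times,
# each measurement corrupted by independent zero-mean Gaussian noise
measurements = true_signal + rng.normal(0.0, noise_std, size=(10_000, n_repeats))

single = measurements[:, 0]            # one measurement per trial
averaged = measurements.mean(axis=1)   # average of n_repeats measurements

print(single.std())    # close to 1.0
print(averaged.std())  # close to 1.0 / sqrt(100) = 0.1
```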
4 Analysis Techniques Used in Feature Recognition
The statistical approach to pattern recognition can be used in cases where the available information is insufficient to describe the images or classes that may be contained in the data set under consideration [15–19]. In such circumstances, the solution may be the use of statistical techniques of analysis, which allow all available a priori information to be used. Sometimes new observations need to be made and analysed. The presentation of these data in statistical language (in the form of, for example, distribution densities or probabilities) is sometimes extremely difficult. If these difficulties cannot be
overcome in a satisfactory manner, a multi-step procedure can be used, in which statistical, "physical" and heuristic approaches are applied alternately. The source material for a statistical procedure is a certain set of objects, each of which is specified by a certain set of feature values. The necessary a priori information concerns the possible densities of the distribution of feature values, the adequacy of the features, etc. In the statistical approach it is immaterial whether the objects of recognition are real physical objects or such "non-physical" categories as "social behavior" or "economic progress", as long as they allow a uniform representation through features. Consider the following data analysis techniques:
1. Discriminant analysis: functions are constructed that depend on the features and provide a separation, optimal in some sense, of objects belonging to different classes.
2. Feature selection and extraction: a subset of the "best" features, or combinations of features, is selected from a certain (redundant) set according to some criterion.
3. Cluster analysis: the data are divided into groups of objects that are similar in one way or another.
In addition, some attention is paid to problems related to the sample size and the number of features. The formation of a discriminant function separating two or more classes in feature space is usually based on one of the following techniques. Statistical techniques are mainly based on minimizing an estimate of the classification error. This error $\varepsilon$ represents the probability of incorrect classification of an arbitrary $k$-dimensional object $x$ received for recognition:

$$\varepsilon = \sum_{l=1}^{L} p_l \, P\big(D(x) \neq l \mid x \in l\big), \qquad (1)$$

where $L$ is the number of classes, $D(x)$ is the function making the classification decision (it can take one of the values $1, 2, \dots, l, \dots, L$), and $p_l = \mathrm{Prob}(x \in \text{class } l)$ is the a priori probability that an arbitrary object $x$ belongs to class $l$ ($\mathrm{Prob}(a \mid b)$ denotes the probability of event $a$ when condition $b$ is fulfilled). Sometimes the expected loss is minimized instead:

$$c = \sum_{l=1}^{L} \sum_{l'=1}^{L} c_{ll'} \, p_{l'} \, P\big(D(x) = l \mid x \in l'\big), \qquad (2)$$

where $c_{ll'}$ is the loss associated with assigning object $x$ to class $l$ when in fact $x$ belongs to class $l'$. If $c_{ll'} = 1$ for $l \neq l'$ and $c_{ll} = 0$, then expression (2) coincides with (1). Let us use criterion (1). It can be proved that the value of (1) reaches its minimum if

$$D(x) = l \quad \text{when} \quad p_l f_l(x) > p_{l'} f_{l'}(x) \;\text{ for all } l' \neq l, \qquad (3)$$

where $f_l(x)$ is the density of the distribution of class $l$ in the $k$-dimensional feature space. Consider, for illustration, the case of one feature ($k = 1$) and three classes ($L = 3$): each point $x$ belongs to the class for which the product $p_l f_l$ of the a priori probability and the distribution density is maximal. The
classification rule (3) is called Bayesian: points $x$ belong to the class to which the maximum value of $p_l f_l$ corresponds, and classification decisions are made in accordance with rule (3). When solving real problems, information about a priori probabilities and distribution densities has to be extracted from the available source data.
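The Bayesian rule (3) can be sketched for the one-feature ($k = 1$), three-class ($L = 3$) illustration; the priors, means and standard deviations below are made-up example values, and normal class densities are assumed.

```python
import math

# three one-dimensional classes with illustrative (invented) parameters
priors = [0.5, 0.3, 0.2]   # p_l
means  = [0.0, 3.0, 6.0]   # class means
stds   = [1.0, 1.0, 1.5]   # class standard deviations

def density(x, mu, sigma):
    """Normal density f_l(x)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_decision(x):
    """Rule (3): choose the class l maximizing p_l * f_l(x)."""
    scores = [p * density(x, m, s) for p, m, s in zip(priors, means, stds)]
    return max(range(3), key=lambda l: scores[l]) + 1   # classes numbered 1..L

print(bayes_decision(0.5))  # 1
print(bayes_decision(2.8))  # 2
print(bayes_decision(7.0))  # 3
```

The decision boundaries fall where two of the products $p_l f_l$ cross, so an unlikely class (small prior) claims a narrower interval than its density alone would suggest.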
5 Estimation of Distribution Densities in Decision-Making
Many practical techniques do not use the error minimization principle described above, which is based on estimates of distribution densities. These techniques use other criteria, often directly related to the available initial data; they require less a priori information about the distribution density (say, its normality), or can use other a priori information [17–19]. Let us now look at a few classification procedures. They differ from each other, in particular, in the required a priori information about the classes, the number of estimated parameters and the computational complexity [20–22].
1. Separation by quadratic functions based on normal distribution densities. If all the distribution densities $f_l(x)$, $l = 1, \dots, L$, can be considered normal, then a simple separating function providing minimal classification error can be constructed. The distribution density is defined (for $l = 1, \dots, L$) by the expression

$$f_l(x) = (2\pi)^{-k/2}\,|\Sigma_l|^{-1/2} \exp\!\left(-\tfrac{1}{2}(x-\mu_l)^T \Sigma_l^{-1}(x-\mu_l)\right), \qquad (4)$$

where $k$ is the number of features, $\mu_l$ is the mean value for class $l$, $\Sigma_l$ is the covariance matrix of class $l$, and $x^T$ denotes the transpose of vector $x$. It is easy to verify that $p_l f_l(x) > p_{l'} f_{l'}(x)$ if

$$S_{ll'}(x) = -\tfrac{1}{2}(x-\mu_l)^T \Sigma_l^{-1}(x-\mu_l) + \tfrac{1}{2}(x-\mu_{l'})^T \Sigma_{l'}^{-1}(x-\mu_{l'}) - \log(p_{l'}/p_l) - \tfrac{1}{2}\log\big(|\Sigma_l|/|\Sigma_{l'}|\big) > 0. \qquad (5)$$

Thus $D(x) = l$ if $S_{ll'}(x) > 0$ for all $l' \neq l$. Note that $S_{ll'}(x)$ is a quadratic function. This function is optimal if the true distributions are normal. When solving real problems, the parameters $\mu_l$, $\mu_{l'}$, $\Sigma_l$, $\Sigma_{l'}$, $p_l$ and $p_{l'}$ are estimated from the training set.
2. Nearest neighbor rules. An object is assigned to the class of its nearest neighbor in the training set (nearest neighbor rule) or of the majority of its K nearest neighbors (K-nearest neighbor rule). This technique, however, has the disadvantage that all training objects must be used in obtaining each classification decision.
3. Optimization of some error criterion. This heuristic technique reduces to optimizing the parameters of the chosen separating function (linear, quadratic or other) with respect to some error criterion. As the latter one can, for example, use the number of incorrectly classified objects from the training set, or the average or "weighted" distance between the training set and the separating function. It may be useful to use this technique in conjunction with the single-control-object technique described below.
Statistical and Linguistic Decision-Making Techniques
4. Hierarchical separation. Using a decision tree is especially useful if the number of features is large. At each vertex of the tree one of the features is examined and, depending on its value, the next branch is selected. In the end, the classification decision is made at a terminal (leaf) vertex. Such trees are a very flexible tool for the use of a priori knowledge. Unfortunately, there are no optimal training schemes.

5. Adaptive separating function. When solving some applied problems, it is necessary to change the separating function in the course of the solution, i.e. to make small changes to the values of its parameters in case of incorrect classification of one or more objects. A number of works are specifically devoted to this problem.
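The quadratic separating rule of item 1 (Eqs. (4)–(5)) can be sketched as follows. This is a minimal illustration with NumPy, not code from the paper; in practice the means, covariances and priors would be estimated from the training set.

```python
import numpy as np

def quad_score(x, mu_l, S_l, p_l, mu_m, S_m, p_m):
    """S_ll'(x) of Eq. (5): positive when p_l * f_l(x) > p_l' * f_l'(x)."""
    d_l, d_m = x - mu_l, x - mu_m
    q_l = d_l @ np.linalg.inv(S_l) @ d_l
    q_m = d_m @ np.linalg.inv(S_m) @ d_m
    return (-0.5 * q_l + 0.5 * q_m
            - np.log(p_m / p_l)
            - 0.5 * np.log(np.linalg.det(S_l) / np.linalg.det(S_m)))

def classify(x, mus, Sigmas, priors):
    """D(x) = l if S_ll'(x) > 0 for all l' != l (Bayes rule for normal classes)."""
    L = len(mus)
    for l in range(L):
        if all(quad_score(x, mus[l], Sigmas[l], priors[l],
                          mus[m], Sigmas[m], priors[m]) > 0
               for m in range(L) if m != l):
            return l
    return None  # point on a decision boundary (measure-zero set)
```

With two well-separated classes of equal covariance and equal priors, the rule reduces to assigning the object to the nearer class mean, as expected from (5).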
6 Conclusions

In statistical recognition of stenographs, the amount of input data is of great importance. If the sample size is small, the properties of the classes are estimated only roughly, and only global techniques should be used that require a small number of parameters to be taken into account. This usually comes down to the use of linear separation and simple feature selection techniques, with cluster analysis techniques allowing only very pronounced clusters to be identified. As the sample size increases, accuracy increases and more detail can be detected; in addition, more subtle techniques can be used. However, large samples are difficult not only to obtain but also to process; working with them requires large amounts of computer memory. A small set of features is useful from a statistical point of view. It is assumed that a small number of variables are selected by non-statistical means and that this set is sufficient to solve the recognition problem; therefore, there is no need to select or highlight features. If new features are added to the existing set, the separating power of the feature set can increase, and the quality of recognition will improve accordingly. The opposite effect is also possible: an increase in the number of features means an increase in the dimension and, consequently, in the number of parameters that require estimation. As a result, the estimation errors will increase, which can lead to a decrease in the quality of recognition. Significant problems in recognition are the following: which input data can be considered relevant, and which (pre)processing of the input data (usually characterized by extreme redundancy) leads to the "properties" or "features" that really allow classification. Much attention is paid to these problems but, unfortunately, there are as yet no formalized techniques that would allow these questions to be answered.
A priori knowledge, intuition, trial and error, and experience are somehow used in the determination of features.

Acknowledgements. The work was carried out under the financial support of the Russian Science Foundation, Project No. 19-11-00230 "Development of hierarchical mathematical models and efficient computational algorithms for solving complex scientific and technical problems".
N. I. Sidnyaev et al.
References

1. Zhuravlev, Yu.N.: On the algebraic approach to solving recognition and classification problems. In: Problems of Cybernetics, vol. 33, pp. 5–68. Science, Moscow (1978)
2. Zhuravlev, Yu.N.: Nonparametric problems of pattern recognition. Cybernetics 12(6), 93–103 (1976)
3. Bongard, M.M.: The Problem of Recognition, p. 320. Science, Moscow (1967)
4. Gurevich, N.B., Zhuravlev, Yu.N.: Minimization of Boolean functions and effective recognition algorithms. Cybernetics 10(3), 16–20 (1974)
5. Zhuravlev, Yu.I.: Correct algebras over sets of ill-posed (heuristic) algorithms. Cybernetics, Part I 4, 14–21 (1977); Part II 6, 21–27 (1977); Part III 2, 35–43 (1978)
6. Zhuravlev, Y., Zenkin, A.A., Zenkin, A.I., et al.: Tasks of recognition and classification with standard training information. J. Comput. Math. Math. Phys. 20(5), 1294–1309 (1980)
7. Ayvazyan, S.A., Bezhaeva, Z.N., Staroverov, O.V.: Classification of Multi-dimensional Observations, p. 240. Statistics, Moscow (1974)
8. Braverman, E.M., Muchnik, N.B.: Structural Methods for Processing Empirical Data, p. 464. Science, Moscow (1983)
9. Vapnik, V.N.: Dependencies Recovery from Empirical Data, p. 448. Science, Moscow (1979)
10. Vapnik, V.N., Chervonenkis, A.Ya.: Pattern Recognition Theory. Statistical Problems of Learning, p. 416. Science, Moscow (1974)
11. Vasiliev, V.N.: Recognition Systems: A Handbook, 2nd revised edn., p. 424. Naukova Dumka, Kiev (1983)
12. Gorelik, A.L., Skripkin, V.A.: Recognition Methods: Textbook for Universities, 2nd revised and enlarged edn., p. 208. Higher School, Moscow (1984)
13. Patrick, E.: Foundations of the Theory of Pattern Recognition, p. 408. Sov. Radio, Moscow (1980). Trans. from English
14. Tu, J., Gonzalez, R.: Principles of Pattern Recognition, p. 416. Mir, Moscow (1978). Trans. from English
15. Fomin, V.N.: Mathematical Theory of Learning Identification Systems, p. 236. Publishing House of Leningrad State University, Saint Petersburg (1976)
16. Sidnyaev, N.I., Hrapov, P.V.: Neural Networks and Neuromathematics: Study Guide, p. 83. BMSTU Publishing, Moscow (2016)
17. Andrews, G.: The Use of Computers for Image Processing, p. 161. Energy, Moscow (1977). Trans. from English
18. Duda, R., Hart, P.: Pattern Recognition and Scene Analysis, p. 512. Mir, Moscow (1976). Trans. from English
19. Fu, K.: Structural Methods in Pattern Recognition, p. 320. Mir, Moscow (1977). Trans. from English
20. de Almeida, L.L., de Paiva, M.S.V., da Silva, F.A., Artero, A.O.: Super-resolution image created from a sequence of images with application of character recognition. Int. J. Intell. Syst. Appl. (IJISA) 6(1), 11–19 (2014)
21. Tuncer, T., Dogan, S., Akbal, E.: Discrete complex fuzzy transform based face image recognition method. Int. J. Image Graph. Signal Process. (IJIGSP) 11(4), 1–7 (2019)
22. Khan, R., Debnath, R.: Multi class fruit classification using efficient object detection and recognition techniques. Int. J. Image Graph. Signal Process. (IJIGSP) 11(8), 1–18 (2019)
Studying the Crack Growth Rate Variability by Applying the Willenborg's Model to the Markov's Simulated Trials

Irina V. Gadolina1(&), Andrey A. Bautin2, and Evgenii V. Plotnikov1

1 Mechanical Engineering Research Institute of the Russian Academy of Sciences, 4, Malyi Kharitonievsky pereulok, 101990 Moscow, Russian Federation
[email protected]
2 Central Aerohydrodynamic Institute (TSAGI), 1, Zhukovsky Street, 140180 Zhukovsky, Moscow Region, Russian Federation
Abstract. There is great interest in studying the stage of crack propagation in aircraft structures under fatigue. The problem of longevity scatter at that stage is considered. Under irregular loading, the variability of durability can be extremely significant. To study this variability, a two-step procedure was proposed. As the first step, Markov imitation of the loading processes was performed and several replica trials were obtained. At the second step, the fatigue longevity for each replica was calculated using the Willenborg's model for the crack propagation rate. As a basis for the imitation model for random processes, the TWIST sequence employed in aviation was used. The idea was to investigate, in a numerical experiment prior to an experimental investigation, the probable scatter of longevities due to uncertainties of information about loading. Some data concerning aluminum alloys were employed in this study. The most interesting problem here was to investigate how the sequence of the turning points in the loading process, modelled by the Markov approach, influences the calculated longevity.

Keywords: Metal fatigue · Random sequence of the events · Crack propagation · Random process imitation
1 Introduction

Generally speaking, there are two sources of uncertainty in fatigue analysis: the uncertainty of construction properties and the uncertainty due to varied loading [1]. Concerning the first uncertainty, it is possible to refer to [2, 3]. In [4] it is said that a local microstructural feature is responsible for fatigue crack initiation. [5] refers to the fact that the grain size, coupled with the presence of non-metallic inclusions at the high end of the size distribution, contributes strongly to the fatigue life variability. The second uncertainty (loading uncertainty) also strongly influences the total uncertainty and scatter. In [6] a massive investigation of the measured flight parameters of a fighter in China was performed, and special software for data collection and processing was developed. A computational study has been performed on aircraft

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 175–184, 2020. https://doi.org/10.1007/978-3-030-39216-1_17
materials which concludes that heavy spectra with high damage under operational loads may reduce the life but not the fatigue scatter factor [7]. During recent years, extensive experimental studies of fatigue crack growth have been performed, and many theories which try to explain the peculiarities of this process have been proposed [8–11]. The important question here is the investigation of the influence of the sequence of events at the stage of fatigue crack propagation. This problem, in particular, is very relevant for the reliability estimation of structural elements of aircraft. Concerning the influence of the sequence of events upon the longevities, the work [11] should be mentioned, where the most severe and the least severe loading conditions were constructed and analyzed. To form those extreme conditions, the author of [11] created the optimum composition of the flight numbers in TWIST (Transport Wing Standard [12]). In this paper, the program TWIST [12] was also adopted as the basic sequence of extrema. This standard sequence is not only widely used for the testing of aircraft elements but has also been comprehensively studied. The fact that a random TWIST sequence is an example of a process with a simple structure (i.e. to one extremum corresponds one crossing of the mean level) to some extent simplifies the task of the analysis. For processes of this kind, all cycle counting methods, namely extrema, ranges and rain-flow counting, give nearly the same result. The only exception here is the 'ground-air-ground' cycle. This cycle has a negative minimum on the ground and grows to its maximum in the air. This cycle brings significant damage and needs special consideration in the simulation. The ways of describing the randomness of the loads causing fatigue are abundant. At the very beginning of the history of random testing, imitation that reproduced the signal spectral densities was widespread.
This approach might be useful when considering multiple loading inputs, in order to take into account the interaction between them. It also helps to take into account the randomness of loading [13]. In one-dimensional applications, the sequence of the turning points (extrema) plays a critical role, because exactly this sequence is responsible for the fatigue damage [14]. Because in design as well as in testing the researcher should consider the random nature of the loading, the question of randomness must be addressed. This fact is very important in fatigue and should not be ignored during the simulation of a random process. Various testing methods with imitation of turning points, or simply standards of turning points [15], have been developed since the emergence of servo-hydraulic machines in the 1970s. Another way of performing the experiment is to apply a genetic algorithm, that is, an optimization technique for performing the equivalent load test [16]. It is also a good idea to use the MATLAB/Simulink environment for conducting the experiments [17]. Because of the specifics of data representation in fatigue crack growth experiments, it is desirable to apply Big Data analytics to find hidden ties [18].
3 Markov's Matrix Representation of the Random Loading

One of the promising ways to account for process randomness is to apply Markov modelling [19]. It deals directly with the peaks of a random process and is able to carry over the intrinsic properties of the initial process as well as to account for loading variability. It was applied, for example, to the testing of hydraulic pump components [20]. When applying the Markov approach, it should be taken into consideration that, according to the Markov model, each event depends only on the previous event, and earlier information does not play any role. In fatigue, and specifically in random loading simulation, the turning points (extrema: maximums MAX and minimums MIN) serve as the random events. As will be shown later, the sequence of extrema can play an important role. The analysis stage of applying the Markov matrix [19] consists of filling up a square matrix (32 × 32 or 64 × 64) on the basis of the real process's sequence of extrema; at the synthesis stage, a sequence of extrema is generated using the transition frequencies obtained at the analysis stage (recorded in the matrix) and a random generator. In this way it is possible to obtain a number of so-called "replicas", each of which possesses its individual features. In this study, these replicas were used for analyzing the fatigue scatter. An example of a Markov matrix filled up on the basis of the TWIST sequence [12] is shown in Fig. 1. In the cells, the numbers of transition events from MAX to MIN (lower left triangle of the matrix) and from MIN to MAX (upper right triangle) are shown. For better graphical representation, only 12 classes were employed in this example, so the matrix dimension became 12 × 12.
        1    2    3     4   5   6   7     8    9   10  11  12
  1     0    0    0     0   0   0   0     2    0    0   0   0
  2     0    0    0     0   0   0   0    17    1    1   0   0
  3     0    0    0     0   0   0   0   191   32    1   0   0
  4     0    0    0     0   0   0   0  2277  191   17   0   2
  5     0    0    0     0   0   0   0     0    0    0   0   0
  6     0    0    0     0   0   0   0     0    0    0   0   0
  7     0    0    0     0   0   0   0     0    0    0   0   0
  8     1   19  196  2271   0   0   0     0    0    0   0   0
  9     1    0   26   197   0   0   0     0    0    0   0   0
 10     0    0    2    17   0   0   0     0    0    0   0   0
 11     0    0    0     0   0   0   0     0    0    0   0   0
 12     0    0    0     0   0   0   0     0    0    0   0   0
*For illustration, only 12 classes are shown here. In the further study, 64 classes were used to satisfy the prediction accuracy.

Fig. 1. Simplified Markov matrix filled up on the basis of the TWIST sequence
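The analysis stage (filling the transition matrix from a sequence of extrema) and the synthesis stage (generating replicas from the recorded frequencies) can be sketched as follows. This is an illustrative reconstruction under an assumed uniform class discretization, not the authors' code:

```python
import numpy as np

def analyze(turning_points, n_classes=12, lo=None, hi=None):
    """Analysis stage: count transitions between discretized turning points."""
    lo = min(turning_points) if lo is None else lo
    hi = max(turning_points) if hi is None else hi
    # map each extremum to a class index 0..n_classes-1
    idx = [min(int((v - lo) / (hi - lo) * n_classes), n_classes - 1)
           for v in turning_points]
    M = np.zeros((n_classes, n_classes), dtype=int)
    for a, b in zip(idx, idx[1:]):
        M[a, b] += 1  # MIN->MAX and MAX->MIN land in opposite triangles
    return M

def synthesize(M, start, length, rng):
    """Synthesis stage: generate a replica by sampling transitions with the
    frequencies recorded in the matrix."""
    seq = [start]
    for _ in range(length - 1):
        row = M[seq[-1]]
        if row.sum() == 0:
            break  # absorbing class: no recorded outgoing transitions
        seq.append(rng.choice(len(row), p=row / row.sum()))
    return seq
```

Each synthesized replica reproduces the transition statistics of the source sequence while the particular order of extrema varies from replica to replica, which is exactly the property exploited in this study.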
3 Willenborg's Crack Growth Model

Analysis of the growth of fatigue cracks is quite complicated. The experimental determination of crack growth duration exhibits a huge scatter. The main experimental dependence here is the fatigue fracture diagram [1]. It is an S-shaped curve and shows the dependence of the fatigue crack growth rate on the SIF (K, the stress intensity
factor). A few modifications of that dependence are used nowadays. It goes without saying that the problem of dealing with the variability of the results is highly relevant, because it affects the reliability of the designed structures. The problem is aggravated when the need arises to deal with irregular loading, that is, when the extrema of loading vary. To deal with the problem of predicting durabilities for such processes, several approaches were developed. Among them, the best known are the Elber's, Wheeler's and Willenborg's models [21, 22]. All of them take into consideration the existence of the plastic zone at the crack tip. Partly due to the presence of that zone, or for some other reasons [8], the effect of crack retardation plays an important role in this process. So far there is no final decision as to which of the mentioned methods gives the best results. In this study, the method of [22] has been chosen. It works well on aluminum specimens for processes with rare compressive stresses. To estimate the crack growth under irregular loading, the geometric parameters of the element and the characteristics of the particular material are necessary. For simplicity, while modeling crack growth in this study, a plate with an open hole was considered as the structural element. The cracks were supposed to grow at both sides of the hole. The initial data for the calculation according to [22] are: the dimensions of the plate with the open hole (cracks grow on both sides of the hole), the material characteristics (yield strength, kinetic diagram of fatigue fracture), and the loading history. The material characteristics, for the sake of generality, have been taken from the reference book for a typical aviation alloy, namely D16AT, similar to the 2024 alloy [23]. The algorithm is as follows:

1. Set a sequence of turning points of each modelled trial to calculate the stress intensity factor (SIF) values at each step of the algorithm.
For this purpose, the initial sequence of the loading history or the sequence of rain-flow cycles obtained by one of the generally accepted procedures can be used;

2. Obtain the values of the SIF at the minimum and maximum points of each cycle of the sequence obtained in step 1, taking into account the geometric features of the element:

K = \sigma_{brutto} \sqrt{\pi a \, \sec\!\left(\frac{\pi a}{W}\right)} \;\; [\mathrm{MPa}\sqrt{\mathrm{m}}], \quad (1)

here \sigma_{brutto} [MPa] is the stress value in the specimen far enough from the crack of length a [mm]; W [mm] is the finite width of the band;

3. Calculate the radius of the plastic deformation zone at the crack tip:

r_u = \frac{1}{2\pi}\left(\frac{K}{\sigma_{0.2}}\right)^{2} \;\; [\mathrm{mm}], \quad (2)
4. Next, the location of the crack end and the maximum distance from the crack boundary of the current plastic deformation zone should be obtained (Fig. 2):
Fig. 2. Model of crack retardation because of the load’s interaction (overloading effect)
In Fig. 2: a_0 is the crack length at the overload; a_i is the current crack length; r_p^{OL} is the dimension of the plastic zone under the overload; r_i is the plastic zone under the current cycle.

5. The extreme position of the current plastic deformation zone is compared with the plastic deformation zones obtained at previous iterations (if this is not the first iteration of the algorithm). If the current plastic deformation zone lies within the plasticity field of earlier cycles, formula (3) is used:

K = \sigma_{0.2} \sqrt{2\pi\left(r_p^{OL} + a_0 - a_i\right)} \quad (3)
6. Calculate the difference between the SIFs (\Delta K) at the minimum and maximum points of the cycle and determine the increment of the crack and the new crack length from the kinetic fatigue fracture diagram by Forman's equation:

\frac{da}{dn} = \frac{C\,\Delta K^{n}}{(1 - R)K_c - \Delta K} \quad (4)
here C, n are constants of the material; R is the stress ratio (asymmetry factor); K_c is the critical value of the SIF, that is, the fracture toughness;

7. The algorithm is repeated from step 2 until the failure criterion is met (either the critical crack length or the critical SIF = K_c is reached).
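Steps 1–7 can be sketched in code as follows. This is a simplified illustration with placeholder material constants; only Eqs. (1), (2) and (4) are implemented, and the overload-retardation bookkeeping of steps 4–5 is omitted, so it is not the authors' implementation:

```python
import math

def sif(sigma, a, W):
    """Eq. (1): SIF with the finite-width secant correction
    (consistent units, e.g. stress in MPa, lengths in m)."""
    return sigma * math.sqrt(math.pi * a / math.cos(math.pi * a / W))

def plastic_zone(K, sigma_y):
    """Eq. (2): radius of the plastic deformation zone at the crack tip."""
    return (1.0 / (2.0 * math.pi)) * (K / sigma_y) ** 2

def grow_crack(cycles, a0, W, C, n, Kc, a_crit):
    """Cycle-by-cycle crack growth by Forman's equation, Eq. (4)."""
    a = a0
    history = [a]
    for s_min, s_max in cycles:              # step 1: turning-point sequence
        K_max = sif(s_max, a, W)             # step 2: SIF at the extrema
        K_min = sif(max(s_min, 0.0), a, W)
        dK = K_max - K_min
        R = s_min / s_max if s_max > 0 else 0.0
        denom = (1.0 - R) * Kc - dK
        if a >= a_crit or K_max >= Kc or denom <= 0:
            break                            # step 7: failure criterion
        a += C * dK ** n / denom             # step 6: crack increment
        history.append(a)
    return history
```

For an irregular sequence such as a TWIST replica, the retardation steps 4–5 would additionally reduce the effective ΔK after overloads, which is what produces the longevity differences discussed below.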
4 Numerical Experiment

The TWIST sequence [12] was treated by the Markov matrix instrumentation with division into 64 classes and the filling up of a matrix of size 64 × 64. The initial TWIST sequence served here as the information source. As the next step, some new processes (replicas) were imitated. Their integral characteristics, such as the main statistical parameters and the distribution of the rain-flow cycles, coincide with the parameters of the initial sequence. A very different situation takes place when the longevity at the crack propagation stage is considered. As mentioned before, the presence and distribution of overloads throughout the realization significantly influence longevity. An example of a simplified Markov matrix was shown in Fig. 1. Based on the real matrix, which is 64 × 64, trial replicas were simulated by employing the algorithm for random sequence imitation [19]. Further on, the algorithm described in Sect. 3 [22] was applied to each replica to get the longevities for each of them.
5 Results and Discussion

In Fig. 3 the fatigue crack growth for the initial TWIST realization, as well as the fatigue crack growth for some replicas, is shown. Each line represents a particular replica realization. The step-wise calculations were stopped when the calculated crack length exceeded 10 mm. It can be seen that the scatter is large enough, although it is not as big as the experimental scatter in laboratory testing. Among those replicas, the most severe one ('severe') and the lightest one ('light') were selected for further analysis as the marginal cases (they are shown among the others in Fig. 3). In Fig. 3 it can be seen that the calculated longevity for the 'light' replica is almost twice the longevity estimated for the 'severe' replica. Based on the model accepted in this research (namely, the Willenborg's model [22]), some analytical speculations have been performed. In Fig. 4 the major peak densities of the two marginal processes ('light' and 'severe') are compared. These distributions demonstrate that an increased presence of larger peak values somehow improves the situation, i.e. the longevity becomes bigger, however paradoxical it might seem at first glance. For the 'light' process, the median of the stress amplitudes is bigger than that of the 'severe' process, and this helps (according to the Willenborg's model) to endure the loading. As a matter of fact, this follows from the physics of crack propagation: the greater loading peaks retard fatigue crack growth. In Fig. 5 the influence of the moment of occurrence of the major peaks throughout the loading realization is illustrated. Only the significant events are shown in the figure. From that figure it follows that the later the overloading occurs, the tougher the situation will be.
Fig. 3. Calculated dependences of fatigue crack growth for initial TWIST and for some replica realizations
Fig. 4. Comparison of the stress densities of “light” and “severe” replicas
Fig. 5. Time of appearance of stresses, more than 180 MPa in two marginal processes
6 Conclusions

Offering a good possibility to incorporate the randomness of the loading process and simultaneously adhere to a realistic prototype, the Markov matrix approach has been applied to the investigation of the probable scatter of longevity at the crack propagation stage. Until now there has been no detailed numerical investigation of the fatigue crack growth rate under random loading taking into consideration the probable loading divergence. Conclusions were made about the influence upon longevity of the distribution of the large extrema, both by their values and throughout the length of the realization. The comparison was made on the basis of the so-called 'light' and 'severe' realizations, which were selected among the replicas modelled by Markov imitation. These 'light' and 'severe' realizations represented two limiting cases of loading. The method can be applied to varied random processes and varied materials. It should be noted, however, that the Willenborg's model employed in this study is not the only one existing nowadays [8]; a different model might lead to slightly different results. These conclusions need to be confirmed in laboratory tests in addition to the numerical study described in this paper.

Acknowledgements. Most of the calculations and the data analysis have been performed using R: A language and environment for statistical computing [24].
References

1. Kogaev, V.P.: Strength Design Under Non-stationary Stresses, p. 363. Mashinostroenie, Moscow (1993)
2. Doubrava, R.: Effect of mechanical properties of fastener on stress state and fatigue behaviour of aircraft structures as determined by damage tolerance analyses. In: Proceedings of 3rd International Conference VAL 2015, Prague, 23–26 March, pp. 135–142 (2015)
3. Gubenko, S.I., Ivanov, I.A., Kononov, D.P.: The impact of steel quality on the fatigue strength of wrought wheels. Zavodskaya Laboratoriya. Diagnostika Materialov, Moscow, no. 3, pp. 52–60 (2018)
4. Su, X.: Toward an understanding of local variability of fatigue strength with microstructures. Int. J. Fatigue 30(6), 1007–1015. https://doi.org/10.1016/j.ijfatigue.2007.08.016
5. Texier, D., Gómez, A.C., Pierret, S., et al.: Microstructural features controlling the variability in low-cycle fatigue properties of Alloy Inconel 718DA at intermediate temperature. Metall. Mat. Trans. A 47, 1096 (2016). https://doi.org/10.1007/s11661-015-3291-8
6. He, X., Sui, F., Zhai, B., Liu, W.: Probabilistic and testing analysis for the variability of load spectrum damage in a fleet. Eng. Fail. Anal. 33, 419–429 (2013)
7. Zhang, F.: Heavy spectra under operational loads may reduce life but not fatigue scatter factor (2013). https://doi.org/10.7527/S1000-6893.2013.0066
8. Sunder, R., Andronik, A., Biakov, A., Eremin, E., Panin, S., Savkin, A.: Combined action of crack closure and residual stress under periodic overloads: a fractographic analysis. Int. J. Fatigue 82, 667–675 (2016)
9. Romanov, A.N., Nesterenko, G.I., Filimonova, N.I.: Damage accumulation under variable loading of cyclically hardening material at the stages of formation and development of cracks. J. Mach. Manuf. Reliab. 47(5), 414–419 (2018). https://doi.org/10.3103/S1052618818050102
10. Lebedinskii, S.G.: Design modeling of propagation of the fatigue cracks in the steel of molded parts of the railway structures. J. Mach. Manuf. Reliab. 47(1), 62–66 (2018)
11. Gal'chenko, E.V.: Influence of loading sequence in aviation constructions at two stages of fatigue by identical integral characteristics. Thesis of Ph.D. in engineering, TSAGI, 29 p. (2003)
12. De Jonge, J.B., et al.: A standardized load sequence for flight simulation tests on transport aircraft wing structures. NLR-TR-73029U, National Lucht- en Ruimtevaartlaboratorium, The Netherlands (also LBF-Bericht-FM-106, Laboratorium für Betriebsfestigkeit, Darmstadt, German Federal Republic) (1973)
13. Benasciutti, D., Tovo, R.: Frequency-based analysis of random fatigue loads: models, hypotheses, reality. Mat.-Werkstofftech 49, 345–361 (2018)
14. Běhala, J., Homola, P.: Simulation of operational loading spectrum and its effect on fatigue crack growth. Procedia Eng. 101, 26–33 (2015)
15. Heuler, P., Klatschke, H.: Generation and use of standardised load spectra and load–time histories. Int. J. Fatigue 27, 974–990 (2005)
16. Elnaghi, B.E., Mohammed, R.H., Dessouky, S.S., Shehata, M.K.: Load test of induction motors based on PWM technique using genetic algorithm. Int. J. Eng. Manuf. (IJEM) 9(2), 1–15 (2019). https://doi.org/10.5815/ijem.2019.02.01
17. Dorji, S.: Modeling and real-time simulation of large hydropower plant. Int. J. Eng. Manuf. (IJEM) 9(3), 29–43 (2019). https://doi.org/10.5815/ijem.2019.03.03
18. More, R., Goudar, R.H., More, R.: DataViz model: a novel approach towards big data analytics and visualization. Int. J. Eng. Manuf. (IJEM) 7(6), 43–49 (2017). https://doi.org/10.5815/ijem.2017.06.04
19. Fisher, R., Haibach, E.: Modeling of loading functions in experiments on the evaluation of materials. In: Dahl, V. (ed.) Behavior of Steel Under Cyclic Loads, pp. 368–405 (1983)
20. Carboni, M., et al.: Load spectra analysis and reconstruction for hydraulic pump components. Fatigue Fract. Eng. Mater. Struct., 251–261 (2008). https://doi.org/10.1111/j.1460-2695.2008.01221.x
21. Nesterenko, B.G.: Computational and experimental study of methods to ensure the operational survivability of aircraft. Thesis of Ph.D. in engineering, TSAGI, 153 p. (2003)
22. Willenborg, J., Engle, R.H., Wood, H.A.: AFFDL-TM-71-1 FBR, WPAFB, OH (1971)
23. The design values of the characteristics of aircraft metallic structural materials: Handbook, p. 300. PJSC «UAC», Moscow (2012)
24. R Core Team: R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2019). https://www.R-project.org/
Advances in Computer Science and Their Technological Applications
Creating Spaces of Temporary Features for the Task of Diagnosing Complex Pathologies of Vision

A. P. Eremeev, S. A. Ivliev(&), O. S. Kolosov, V. A. Korolenkova, A. D. Pronin, and O. D. Titova

National Research University "MPEI", Moscow, Russia
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Abstract. A purposeful transformation of periodic time dependencies resulting from retinographic studies of retinal pathologies was carried out in order to study their frequency properties. Such a study is intended to expand the space of formalized signs of pathologies that can be used in diagnostic systems based on artificial intelligence methods. A technique has been developed for constructing the amplitude-frequency characteristics of the retina, taking into account the mathematical description of impulse test stimuli. A procedure has been proposed for a polynomial approximation of the frequency response of the retina, which allows the coefficients of approximating polynomials to be used as new formalized signs in diagnostics. It is shown that for complex retinal pathologies, it is advisable to take into account not only its amplitude-frequency characteristics under different stimulation conditions but also its phase-frequency characteristics by analyzing retinal hodographs on the complex plane. When searching for additional formalized signs of retinal pathologies, it is proposed to use a new generalized frequency response of the retinal hodograph, which facilitates the search and formalization of additional signs of pathologies.

Keywords: Feature extraction · Digital signals processing · Frequency response · Hodographs · Signal analyzing
1 Introduction

Methods of artificial intelligence are now widely used for the analysis and diagnosis of complex problem situations in technical, organizational, biological and other systems. Active research and development on this issue is also carried out at the National Research University "MPEI" [1, 2]. This research includes the computerization of the processes of analysis and diagnosis of complex vision pathologies, which is carried out in conjunction with the physiology specialists from the FGBU "MNII GB named after Helmholtz" [3–5]. One of the main things in the analysis of pathologies of vision is an accurate diagnosis; the choice of an appropriate method of treatment depends on this decision.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 187–204, 2020. https://doi.org/10.1007/978-3-030-39216-1_18
There is a large set of methods. One of them is electroretinography [6], the essence of which lies in the study of the biopotentials that occur on the retina of the eye under specific light stimulation. A comparison of the existing approaches to the analysis of electroretinograms (ERG) is presented in [7–10]. Currently, the requirements for the level of processing and analyzing the data, and for finding new mathematical tools to increase the information content of the ERG, are sharply increasing. Of interest is the construction of diagnostic systems (decision support systems in diagnostics) of pathologies based on formalized feature descriptions of the retina. The decisive factor here is the selection and evaluation of formalized informative features and their comparison with reference values. There is a need to form new training samples in the form of patterns, taking into account time (temporal) dependencies. The values of such features, being signs of different pathologies, often overlap substantially, which complicates diagnosis. One of the possible ways to obtain informative features is an additional mathematical transformation of the registered types of ERG. A purposeful transformation of periodic time dependencies resulting from retinographic studies of retinal pathologies was carried out in order to study their frequency properties. Such a study is intended to expand the space of formalized signs of pathologies that can be used in diagnostic systems based on artificial intelligence methods. A technique has been developed for constructing the amplitude-frequency characteristics (AFC) of the retina, taking into account the mathematical description of impulse test stimuli. A procedure has been proposed for a polynomial approximation of the AFC of the retina, which allows the coefficients of approximating polynomials to be used as new formalized signs in diagnostics.
It is shown that for complex retinal pathologies it is advisable to take into account not only the amplitude-frequency characteristics of the retina under different stimulation conditions but also its phase-frequency characteristics (PFC), by analyzing retinal hodographs on the complex plane. When searching for additional formalized signs of retinal pathologies, it is proposed to use a new generalized frequency characteristic of the retinal hodograph, which facilitates the search for and formalization of additional signs of pathologies.
2 Transformation of Temporal Signals and Creation of Feature Spaces of Temporal Dependencies

Frequency conversions are an essential and widespread type of transformation of periodic time signals. These transformations are aimed at obtaining the amplitude-frequency and phase-frequency characteristics (frequency response and phase response) of the object of interest, as well as its amplitude-phase response (APR).

2.1 Purpose of Work and Research Methods
The aim of the work is to obtain additional formalized signs of retinal responses to rhythmic stimuli using examples of ERG processing. Since these signals are periodic, they can be represented as a set of harmonic oscillations that together make up the source signal. It is easy to verify that the signal of the
Creating Spaces of Temporary Features for the Task
189
ERG, as the function being expanded, satisfies the Dirichlet conditions. Thus, a periodic ERG signal can be represented by an infinite trigonometric Fourier series [16]:

$$f(t) = a_0 + \sum_{n=1}^{\infty} \left( a_n \cos n\omega t + b_n \sin n\omega t \right) \quad (1)$$

where the coefficients of the series $a_0, a_n, b_n$ are determined from the relations:

$$a_0 = \frac{1}{T}\int_0^T f(t)\,dt, \quad a_n = \frac{2}{T}\int_0^T f(t)\cos(n\omega t)\,dt, \quad b_n = \frac{2}{T}\int_0^T f(t)\sin(n\omega t)\,dt \quad (2)$$
In expression (1), the coefficient $a_0$ determines the constant component of the periodic signal (2). Using the truncated Fourier series (3), containing only the first N terms instead of an infinite sum, leads to an approximate representation of the signal:

$$f(t) \approx a_0 + \sum_{n=1}^{N} \left( a_n \cos n\omega t + b_n \sin n\omega t \right) \quad (3)$$
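As an illustration, the coefficients of Eqs. (2) and the truncated series (3) can be computed numerically for a sampled periodic signal. The sketch below (in Python with NumPy, not part of the original work) uses rectangle-rule integration over one period:

```python
import numpy as np

def fourier_coefficients(f_samples, T, N):
    """Coefficients a0, a_n, b_n of Eqs. (2) for a signal sampled
    uniformly over one period T (rectangle-rule integration)."""
    t = np.linspace(0.0, T, len(f_samples), endpoint=False)
    w = 2 * np.pi / T                      # fundamental angular frequency
    a0 = f_samples.mean()                  # (1/T) * integral of f(t) dt
    a = np.array([2 * np.mean(f_samples * np.cos(n * w * t)) for n in range(1, N + 1)])
    b = np.array([2 * np.mean(f_samples * np.sin(n * w * t)) for n in range(1, N + 1)])
    return a0, a, b

def truncated_series(a0, a, b, T, t):
    """Approximate reconstruction of f(t) by Eq. (3) with N = len(a) terms."""
    w = 2 * np.pi / T
    n = np.arange(1, len(a) + 1)[:, None]
    return a0 + (a[:, None] * np.cos(n * w * t) + b[:, None] * np.sin(n * w * t)).sum(axis=0)
```

For a band-limited test signal the truncated series with N above the highest harmonic reproduces the samples exactly, up to floating-point error.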
The choice of the number N in this paper depends on the level of the noise component of the recorded ERG signal and on the duration of the test pulse; this is discussed further below. The obtained spectra of rhythmic (flicker) ERG (FERG) of healthy individuals and of patients with primary open-angle glaucoma (POAG) were then used to assess the frequency properties of the retina. Digitized FERG recordings were obtained in healthy individuals and in patients of the same age group with a confirmed diagnosis of POAG of stages Ia and IIa. Registration was performed on the TOMEY EP-1000 diagnostic system in compliance with the standards of the International Society for Clinical Electrophysiology of Vision (ISCEV) [12, 13].

2.2 Frequency Response and Phase Response of the Retina
The retina is a non-linear object, so its frequency response and phase response can only be discussed for specific types of input test stimuli. Consider the retinal responses to the following stimuli: rhythmic periodic light pulses of fixed duration with repetition frequencies of 8.3 Hz, 10 Hz, 12 Hz, 24 Hz and 30 Hz (flicker ERG, FERG) and alternating pattern stimuli - reversals of black-and-white checkerboard cells (pattern ERG, PERG) [7, 8]. The process of transformation of the input test stimulus for FERG can be represented by the scheme in Fig. 1.
[Block diagram: X → W_L → X1 → W_R → Y → W_D → Z]

Fig. 1. Transformation of input signal X for FERG (recorded output signal is Z)
The input test signal X is a switch signal that turns the flash lamp on and off. The flash lamp, with its own frequency response $W_L$, converts the input signal X into a change in the luminous flux $X_1$. The retina, with its frequency response $W_R$, then converts the luminous flux into the biopotential Y, which is recorded by the corresponding device, with its frequency response $W_D$, as the FERG (Z). Thus, the spectrum $Z(f)$ of the recorded FERG signal is the spectrum $X(f)$ of the input signal transformed by three dynamic links (flash lamp, retina, and recording device). The amplitude A of the pulses is a relative value and is selected during the processing of the FERG taking into account the normalization of the obtained frequency responses with respect to healthy subjects; this process is described further in the paper. The transformation can be expressed by the relation:

$$Z(f) = X(f)\,\frac{X_1(f)}{X(f)}\,\frac{Y(f)}{X_1(f)}\,\frac{Z(f)}{Y(f)} = X(f)\,W_L(f)\,W_R(f)\,W_D(f) \quad (4)$$

where f is the switching frequency, and each dynamic link in (4) is represented by a spectrum converter (or its frequency response) in the form:

$$W_L(f) = \frac{X_1(f)}{X(f)}, \quad W_R(f) = \frac{Y(f)}{X_1(f)}, \quad W_D(f) = \frac{Z(f)}{Y(f)} \quad (5)$$
An analogue of such a converter for a linear dynamic link is its transfer function, which describes the amplifying or attenuating properties of the converter with respect to the amplitude of each particular harmonic in the spectrum of the input signal. The spectrum of a periodic pulse signal is calculated by expanding it in a Fourier series on time intervals equal to one pulse repetition period $T_u$. The harmonics of the spectrum with numbers $n = 1, 2, 3, \ldots$ are calculated analytically as [17]:

$$A_n(n\omega_u) = 2A\,\frac{\sin\left(n\omega_u \tau/2\right)}{n\omega_u T_u/2} \quad (6)$$

where $\tau$ is the pulse duration.
Here $\omega_u = 2\pi f$ is the angular pulse repetition frequency. The spectra of the input signal X(f) for pulse repetition frequencies of 10 Hz and 30 Hz are shown in Fig. 2. The spectra of pulse sequences (6) have several features [6]:

• the number of harmonics of the spectrum within a limited frequency interval is inversely proportional to the pulse repetition rate;
• the amplitudes of the harmonics in the signal spectrum are proportional to the pulse repetition rate;
• there are no harmonics in the spectra at frequencies that are multiples of $1/\tau$, and in the vicinity of these frequencies the harmonic amplitudes are close to zero. For FERG, these are the neighborhoods of 200 Hz, 400 Hz, 600 Hz, etc.
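A hedged sketch of Eq. (6): the harmonic amplitudes of a rectangular pulse train can be tabulated numerically. The pulse duration used below is an assumption (τ = 5 ms, consistent with the text's statement that the spectral zeros for FERG fall near 200 Hz, 400 Hz, etc.):

```python
import numpy as np

def pulse_harmonics(A, tau, f_rep, n_max):
    """Harmonic amplitudes A_n of a rectangular pulse train per Eq. (6):
    amplitude A, pulse duration tau (s), repetition frequency f_rep (Hz).
    Returns |A_n| for n = 1..n_max."""
    Tu = 1.0 / f_rep                      # repetition period
    wu = 2 * np.pi * f_rep                # angular repetition frequency
    n = np.arange(1, n_max + 1)
    # zeros occur where n * f_rep is a multiple of 1/tau
    return 2 * A * np.abs(np.sin(n * wu * tau / 2)) / (n * wu * Tu / 2)
```

With τ = 5 ms and a 10 Hz train, the 20th harmonic (200 Hz = 1/τ) vanishes, and the first-harmonic amplitude of a 30 Hz train is roughly three times that of a 10 Hz train, matching the listed features.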
Fig. 2. Spectrum of the input signal X(f) for pulse repetition frequencies of 10 Hz and 30 Hz
The results of the study show that the inertial properties of the flash lamp can be neglected; that is, the frequency response of the flash lamp can be represented as a fixed transmission coefficient $K_L$ over the entire frequency range of interest. Its influence can thus be taken into account as a correction of the amplitude A of the input test stimulus X. A high-pass filter is installed at the input of the recording device $W_D(f)$; it blocks the constant component of the recorded signal and limits the low frequencies to the lower boundary of the uniform bandwidth of approximately $f_0 = 1$ Hz (for the TOMEY EP-1000 device). Above this frequency the device has a uniform bandwidth with a transmission coefficient $K_D$. Thus, the
recording device can also be considered a linear proportional link in the bandwidth of interest, and its transmission coefficient $K_D$ can likewise be taken into account as a correction of the amplitude A of the input signal. The retinal frequency response can therefore be obtained by dividing the harmonics of the spectrum of the recorded signal Z by the harmonics (with corresponding numbers) of the spectrum of the test stimulus X [18]. The FERG used as the initial signal, expanded in a Fourier series over the pulse repetition period at a repetition frequency of 10 Hz, is shown in Fig. 3a. The FERG of a healthy subject with normal vision is considered.
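The harmonic-by-harmonic division described above can be sketched as follows; the correction factor `K_corr` (lumping the lamp and recorder gains K_L·K_D into the stimulus amplitude) and the handling of near-zero stimulus harmonics are our assumptions:

```python
import numpy as np

def retina_frequency_response(z_amps, x_amps, K_corr=1.0, eps=1e-12):
    """Harmonic-by-harmonic estimate of W_R: amplitude spectrum of the
    recorded FERG (Z) divided by the stimulus spectrum (X), per Eq. (5).
    Harmonics where X_n is essentially zero (near multiples of 1/tau)
    are returned as NaN, since the ratio is undefined there."""
    z = np.asarray(z_amps, float)
    x = np.asarray(x_amps, float)
    mask = x > eps
    w = np.full_like(z, np.nan)
    w[mask] = z[mask] / (K_corr * x[mask])
    return w
```

The NaN marking mirrors the paper's observation that the frequency response cannot be evaluated in the neighborhoods of 200 Hz, 400 Hz, etc., where the stimulus harmonics vanish.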
Fig. 3. a (at top) - FERG subjected to Fourier expansion (initial 10 Hz), b (at bottom) - FERG subjected to Fourier expansion (with zero extension to 4.15 Hz)
It should be noted that the use of FERG spectra for diagnostics is complicated by the fact that the amplitudes of their harmonics depend on the spectra of the input signals, in which, according to (6), the amplitudes of the corresponding harmonics are proportional to the frequency of the applied light pulses. Constructing the frequency response of the retina ($W_R(f) = A(f)$) by dividing the harmonic amplitudes of the spectra Z(f) by the corresponding harmonic amplitudes of the input signals X(f), in accordance with (5), eliminates this undesirable effect [14, 15]. In addition, the permissible upper limit of the harmonic frequencies under consideration narrows to 120–150 Hz, owing to the influence of the spectrum of the noise component of the FERG on the shape of the frequency response in the vicinity of the critical frequency of 200 Hz, where the harmonic amplitudes of the input pulse spectrum are close to zero. As a consequence, the number of frequency response points differs for different pulse repetition frequencies and is very small for high frequencies: for a pulse frequency of 30 Hz, there are only 4–5 such points. Intermediate points of the retinal frequency response can be obtained by artificially lengthening the decomposition period (or "window") of the observed signal with zero values, a technique used in radio engineering to determine intermediate harmonics of a signal under study [19]. In [19], spectral analysis is performed on a continuous signal with zero mathematical expectation inside the window (for the FERG this condition is ensured by the high-pass filter of the recording device). In our case, by conditionally delaying the arrival of the next light pulse for a specific time and simultaneously extending the output signal Z with zero values, we artificially increase the pulse repetition period.
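A minimal sketch of the zero-extension idea, assuming one recorded period sampled at rate `fs`; the normalization by the original signal length is our choice, not prescribed by the paper:

```python
import numpy as np

def zero_extend_spectrum(signal, fs, pseudo_period):
    """Pad one recorded period with zeros up to `pseudo_period` seconds
    and return (frequencies, amplitude spectrum).  The zeros are appended
    to the signal itself (not to any analysing function), which yields
    intermediate harmonics spaced at 1 / pseudo_period Hz."""
    n_total = int(round(pseudo_period * fs))
    padded = np.zeros(n_total)
    padded[:len(signal)] = signal
    spec = np.fft.rfft(padded) / len(signal)   # normalise by original length
    freqs = np.fft.rfftfreq(n_total, d=1.0 / fs)
    return freqs, 2 * np.abs(spec)
```

Padding a 0.1 s (10 Hz) period out to 1/4.15 s gives a spectral grid at multiples of 4.15 Hz, i.e. the intermediate frequency-response points at the pseudo-frequency used in the paper.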
In this way, we can obtain intermediate (additional) points of the frequency response of the retina [7]. In essence, the approach proposed and developed in this work is similar to the well-known wavelet transform with the Haar wavelet, with the only difference that the added zero values in the integrand belong not to the wavelet but to the signal under study [20]. Figure 3b shows the lengthening of the processing period of the FERG (10 Hz signal at the input) with zero values; the duration of such a period corresponds to light pulses arriving at a pseudo-frequency of 4.15 Hz. Figure 4 presents the frequency response of the retina of a healthy subject with an artificial lengthening of the period of the light pulses (pseudo-frequency of pulse supply 1 Hz). The frequency response points of the same subject, obtained by processing the original FERG shown in Fig. 3a, are also marked with circles. Note that the frequency response gradually decreases as the pulse repetition frequency increases; that is, the transmitting properties of the retina weaken. The frequency response of patients with retinal pathology can be compared against that of a healthy subject with normal vision. Figure 5 shows the frequency responses of the two eyes of a subject with suspected glaucoma (GL1), the two eyes of a subject with a diagnosis of POAG Ia (GL2), and two eyes of healthy control subjects (norms N1 and N2). In this example, N1 is a subject with normal refraction, and N2 a subject with mild myopia. All frequency responses are reduced to a pseudo-frequency of 4.15 Hz. The retinal
frequency responses obtained reflect the ability of the retina to convert (amplify or attenuate) the corresponding harmonic amplitudes of the input signal spectrum.
Fig. 4. The family of frequency responses of the retina with artificial lengthening of the period of repetition of light pulses (pseudo-frequency of impulses 1 Hz)
Fig. 5. Retinal frequency response family with norm and glaucoma (f = 8.3 Hz, pseudo-frequency 4.15 Hz)
The amplitude A of the input signal is selected so that the frequency response of the retina of a particular healthy subject is bounded from above by a conventional unit, or slightly exceeds it, as shown in Fig. 5.

2.3 Approximation of the Frequency Response of the Retina to Obtain Formalized Signs for Diagnosis
It is proposed to formalize the features extracted from the frequency response of the retina by approximating it with algebraic power polynomials, with the frequency of the corresponding harmonics as the argument. The coefficients of these polynomials can then be taken as formalized features characterizing the FERG. Known numerical methods implemented in a number of well-known mathematical packages are suitable for research in this direction; here, the Mathcad package was used. An attempt to approximate the frequency response of the retina by a single power polynomial over the entire frequency range up to 120 Hz gives a highly smoothed curve, regardless of the degree of the smoothing polynomial. It was therefore proposed to approximate the frequency response separately over two frequency ranges: by a second-order polynomial in the range 0 < f < 50 Hz and by a first-order polynomial in the range 50 < f < 120 Hz, skipping the harmonic at 50 Hz. Thus, the approximating curve takes the form:

$$N(f) = \begin{cases} a_0 + a_1 f + a_2 f^2, & 0 < f < 50 \\ b_0 + b_1 f, & 50 < f < 120 \end{cases} \quad (7)$$
Consider this approximation of the frequency response for a subject with normal vision. Figure 6 presents its form at a stimulation frequency of 8.3 Hz (pseudo-frequency 4.15 Hz), as well as the approximation result in the form of smoothed curves constructed from the found dependencies:

$$N(f) = \begin{cases} 0.13705 + 0.06206 f - 0.00099 f^2, & 0 < f < 50 \\ 0.67308 - 0.00189 f, & 50 < f < 120 \end{cases} \quad (8)$$
As a result of the approximation carried out in this way, we obtain five numerical values of the coefficients of the approximating polynomials for each stimulation frequency, which are further used as new additional formalized signs for the diagnosis of retinal pathologies.
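The piecewise fit of Eq. (7) can be sketched with ordinary least squares; the 1 Hz exclusion band around the 50 Hz mains harmonic is an illustrative assumption:

```python
import numpy as np

def fit_piecewise_afc(freqs, amps, split=50.0, f_max=120.0, mains=50.0):
    """Piecewise approximation of the retinal frequency response, Eq. (7):
    a 2nd-order polynomial below `split` Hz and a 1st-order polynomial
    between `split` and `f_max` Hz, skipping the mains harmonic."""
    freqs = np.asarray(freqs, float)
    amps = np.asarray(amps, float)
    keep = np.abs(freqs - mains) > 1.0          # drop the 50 Hz harmonic
    lo = keep & (freqs < split)
    hi = keep & (freqs > split) & (freqs < f_max)
    a2, a1, a0 = np.polyfit(freqs[lo], amps[lo], deg=2)
    b1, b0 = np.polyfit(freqs[hi], amps[hi], deg=1)
    return (a0, a1, a2), (b0, b1)               # five formalized features
```

Applied to data generated from the example dependencies (8), the fit recovers the five coefficients, which are the formalized features the paper proposes to use.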
Fig. 6. Combined source (grey) and smoothed (black) frequency response (normal vision, 8.3 Hz; amplitude in µV vs. frequency in Hz)
2.4 Amplitude-Phase Characteristics (AFC) of the Retina
The use of the additional signs obtained from the approximated frequency response for the diagnosis of various pathologies in many cases gives poorly distinguishable results. Figure 7 presents the frequency responses for different conditions of the eyes obtained by processing PERG. There are two characteristics each for patients with normal vision
Fig. 7. Frequency responses of retinas for different pathologies
(N) and for retinal pathologies such as glaucoma (GL), diabetic retinopathy (DR), myopia (M) and age-related macular dystrophy (AMD). Although the presented characteristics differ from each other, the use of their signs (for example, the coordinates of extrema) for diagnosing diseases is insufficient, owing to the high correlation with signs already extracted from the ERG. In addition, there are significant areas of mutual intersection of the signs for different pathologies. When analyzing the frequency properties of objects, researchers are usually content with the frequency response alone, as there is normally a one-to-one correspondence between the frequency response and the phase response. However, studies show that the retina, by its properties in all test modes, should be attributed to non-minimum-phase objects, for which there is no unambiguous connection between the frequency response and the phase response. Using the retinal phase response along with its frequency response demonstrates the possibility of obtaining new characteristic features that accompany the most dangerous pathologies, such as glaucoma or diabetic retinopathy. For these purposes, we can use the amplitude-phase characteristic (hodograph). Figure 8 shows the hodographs of the retina of a subject with normal vision after processing a FERG with a test stimulus frequency of 24 Hz. The dots on the hodograph mark the positions of the ends of the vectors corresponding to the harmonics with numbers from 1 to 20 inclusive.
Fig. 8. Hodographs of the right and left eyes of a subject with a vision norm at a frequency of 24 Hz (imaginary part vs. real part).
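Hodograph points such as those in Fig. 8 can be obtained directly from one period of the recorded signal via the FFT; the scaling convention below (giving the point for harmonic n as a_n − j·b_n in the notation of Eq. (1)) is our assumption:

```python
import numpy as np

def hodograph_points(signal, n_harmonics=20):
    """Complex hodograph points for harmonics 1..n_harmonics of one
    period of the recorded signal, via the FFT.  The 2/len scaling makes
    the modulus of point n equal the harmonic amplitude A_n; the point
    itself equals a_n - j*b_n in the notation of Eq. (1)."""
    spec = np.fft.fft(np.asarray(signal, float))
    return 2 * spec[1:n_harmonics + 1] / len(signal)
```

Plotting the real parts against the imaginary parts of these points reproduces a discrete hodograph on the complex plane.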
Fig. 9. Retina hodographs for various pathologies
The hodographs shown in Fig. 9 demonstrate this: the two upper hodographs in the first quadrant correspond to GL, the two below them to N, and the two hodographs starting in the fourth quadrant to DR. Visual inspection of these hodographs allows different pathologies to be distinguished. At the same time, formalizing the features requires some additional transformation of the obtained frequency characteristics. The hodograph is more effective for visual perception than the frequency response, but it is difficult to obtain additional formalized signs from it, since this requires taking into account both the amplitude and the phase of the corresponding harmonic vectors for different pathologies. To facilitate this task, we introduce a generalized characteristic of the hodograph.

2.5 A Generalized Frequency Characteristic of the Hodograph
Let us consider as a generalized frequency characteristic of the hodograph $W_{GCC}(f_n)$ the dependence on the frequency $f_n$ of the product of the projections of the vector $A_n$ of the harmonic with number n ($1 \le n \le N$) on the real axis ($A_n \cos\varphi_n$) and the imaginary axis ($A_n \sin\varphi_n$) of the hodograph:

$$W_{GCC}(f_n) = A_n^2 \sin\varphi_n \cos\varphi_n = 0.5\,A_n^2 \sin 2\varphi_n \quad (9)$$

where, according to (1), $\varphi_n = \arctan\frac{b_n}{a_n}$.
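Eq. (9) is straightforward to compute from the Fourier coefficients; note that with a_n = A_n cos φ_n and b_n = A_n sin φ_n the characteristic reduces to the product a_n·b_n. The sketch below uses `arctan2` (rather than the paper's arctg(b_n/a_n)) so that all four quadrants are handled, which is our choice:

```python
import numpy as np

def generalized_characteristic(a, b):
    """W_GCC(f_n) of Eq. (9) from the Fourier coefficients a_n, b_n.
    Equals 0.5 * A_n^2 * sin(2*phi_n), which algebraically reduces to
    the product a_n * b_n of the hodograph-vector projections."""
    a = np.asarray(a, float)
    b = np.asarray(b, float)
    A2 = a**2 + b**2                 # A_n squared
    phi = np.arctan2(b, a)           # four-quadrant phase
    return 0.5 * A2 * np.sin(2 * phi)
```

The reduction to a_n·b_n makes the characteristic cheap to evaluate over all harmonics of the expansion at once.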
The meaning of introducing such a characteristic is explained by the example of the hodograph of the "transport delay" link with a transfer function of the form:

$$W(j2\pi f) = e^{-j2\pi f \tau} \quad (10)$$

where $\tau$ is the delay value. The hodograph of such a link, as is known, is a circle of unit radius traversed continuously on the complex plane. The generalized characteristic in accordance with (9) is then a harmonic dependence on frequency with a fixed period and amplitude. All of its zero crossings occur at frequencies corresponding to transitions of the hodograph from one quadrant to another, and one period of the dependence determines the frequency range over which the hodograph is located in two adjacent quadrants of the complex plane.
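A quick numerical check of this delay-link example: for W(j2πf) = e^(−j2πfτ), the product of the real and imaginary parts of W equals −0.5·sin(2·2πfτ), a harmonic function of frequency whose zeros coincide with the quadrant transitions of the unit-circle hodograph:

```python
import numpy as np

def delay_link_gcc(freqs, tau):
    """Generalized characteristic, Eq. (9), of the pure-delay link
    W(j*2*pi*f) = exp(-j*2*pi*f*tau): the product of its real and
    imaginary parts."""
    theta = 2 * np.pi * np.asarray(freqs, float) * tau
    w = np.exp(-1j * theta)                 # unit-circle hodograph
    return w.real * w.imag                  # = -0.5 * sin(2*theta)
```

The zeros at θ = kπ/2 (frequencies f = k/(4τ)) are exactly where the hodograph crosses from one quadrant to the next.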
Fig. 10. The generalized frequency characteristics of the hodographs shown in Fig. 8 for two eyes of a subject with normal vision at a stimulation frequency of 24 Hz (plotted against harmonic numbers).
Figure 10 shows the generalized frequency characteristics of the hodographs of Fig. 8 for the two eyes of a subject with normal vision at a stimulation frequency of 24 Hz. On the abscissa axis, the numbers of the harmonics are plotted, starting with the first (4.15 Hz). Thanks to such a restructuring of the hodograph, it becomes convenient to choose characteristic harmonics for comparing pathologies. For example, Fig. 11(a) shows the combined generalized frequency characteristics of the hodographs for the eyes of subjects: 1, 2 – the right and left eyes with normal vision (N); 3, 4 – glaucoma (GL); 5, 6 – diabetic retinopathy (DR). For clarity of comparison, Fig. 11(b) shows the characteristics of the same eyes, but without those of the eyes with the norm.
Fig. 11. Combined generalized frequency characteristics of the hodographs of the eyes of subjects – a: normal vision (N – 1, 2), glaucoma (GL – 3, 4), diabetic retinopathy (DR – 5, 6); b: glaucoma (GL – 3, 4), diabetic retinopathy (DR – 5, 6).
An analysis of the characteristics in Fig. 11 shows a significant decrease in the harmonic amplitudes for the eyes of subjects with pathologies. In addition, the frequency response of the subject with DR lags significantly in phase behind the similar characteristics of the other subjects. Similar differences are observed at other stimulation frequencies. The differences revealed in this way can be formalized and used as additional features in diagnostic systems based on artificial intelligence methods.
Fig. 12. Generalized frequency responses in the decomposition of the PERG in a Fourier series
In further work, it is proposed to use generalized frequency characteristics when expanding the PERG in a Fourier series. Figure 12 shows such characteristics for different pathologies. Note that all transitions of the graphs through the abscissa axis correspond to transitions of the hodographs into a new quadrant. For healthy eyes (N), these are the two graphs with the largest amplitudes. A healthy eye has characteristics in the first, fourth, and third quadrants that are similar to the delay characteristics, which is confirmed by Fig. 11. The characteristics of eyes with pathologies do not possess such properties and, at the same time, differ significantly from each other.
Fig. 13. Generalized frequency responses for the norm (N)
Figure 13 shows six generalized frequency responses for the norm (N). Note that, despite significant differences in amplitudes, these graphs remain similar to the graphs and hodographs of the delay link in the same three quadrants discussed above. The graphs of the generalized frequency characteristics (products) for six eyes with glaucoma (GL), shown in Fig. 14, do not possess this property: they show a significant departure from the delay behavior in the first quadrant, which is also clearly seen in Fig. 11.
Fig. 14. Generalized frequency responses for glaucoma (GL)
Summing up the consideration of the proposed graphs of the products of real and imaginary parts as functions of the harmonic frequencies in the decomposition of the PERG, it should be concluded that they can be used to introduce several new formalized signs of retinal pathologies that take their temporal dependencies into account. The results of this article are planned to be used in the intellectual decision support system being developed for analyzing and diagnosing complex pathologies of vision, based on the integration of various approaches and methods for processing dynamic information (such as the ERG) with artificial intelligence methods (fuzzy logic, evidence methods and Bayes theory, the neural network approach, cognitive graphics, etc.). Note that a Bayesian LMMSE/MAP estimator [9], the efficient mathematical procedural model for brain signals [10], and an improved image compression algorithm using wavelet and fractional cosine transforms [11] can be used for preprocessing ERG signals.
3 Conclusions

The following new results are presented in this work:

1. An extensive and purposeful study of the results of retinographic studies of retinal pathologies with periodic test stimuli was carried out to study the frequency properties of the retina and expand the space of formalized signs of pathologies that can be used in diagnostic systems based on artificial intelligence methods;
2. A technique has been developed for constructing the retinal frequency response taking into account the mathematical description of pulsed test stimuli, which objectively reflects the retina's ability to convert stimulus spectra into ERG spectra;
3. A procedure for a polynomial approximation of the frequency response of the retina is proposed, which allows the coefficients of the approximating polynomials to be used as new formalized signs in diagnosis;
4. It is shown that for complex pathologies of the retina, it is advisable to take into account not only its amplitude-frequency properties under different stimulation conditions but also the phase-frequency characteristics, by analyzing retinal hodographs on the complex plane;
5. When searching for additional formalized signs of retinal pathologies, it is proposed to use a new generalized frequency response of the retinal hodograph, which facilitates the search for and formalization of additional signs of pathologies.

Acknowledgement. The work was supported by RFBR projects 17-07-00553, 18-01-00201, 18-51-00007, 18-29-03088, 19-01-00143.
References

1. Vagin, V.N., Eremeev, A.P., Dzegelenok, I.I., Kolosov, O.S., Frolov, A.B.: Formation and development of the scientific school of artificial intelligence in the Moscow Energy Institute. In: Software Products and Systems #3, pp. 3–16. Research Institute "Centerprogramsystem", Tver (2010)
2. Vagin, V.N., Eremeev, A.P., Kutepov, V.P., Falk, V.N., Fominykh, I.B.: On the 40th anniversary of the department of applied mathematics: research and development in the field of education, programming, information technology and artificial intelligence. In: Vestnik MEI #4, pp. 117–128. Publishing House MPEI, Moscow (2017)
3. Anisimov, D.N., Vershinin, D.V., Zueva, I.V., Kolosov, O.S., Khripkov, A.V., Tsapenko, M.V.: The use of a tunable dynamic model of the retina of the eye in component analysis for the diagnosis of pathologies using artificial intelligence methods. In: Vestnik MEI #5, pp. 70–74. Publishing House MPEI, Moscow (2008)
4. Eremeev, A.P., Khaziev, R.R., Zueva, M.V., Tsapenko, I.V.: The prototype of the diagnostic decision-making system based on the integration of Bayesian trust networks and Dempster-Shafer parameters. In: Software Products and Systems #1, pp. 11–16. Research Institute "Centerprogramsystem", Tver (2013)
5. Eremeev, A., Ivliev, S.: The use of convolutional neural networks for the analysis of nonstationary signals for diagnosing problems of vision pathology. In: Proceedings of the 16th Russian Conference on Artificial Intelligence RCAI 2018, Moscow, Russia, 24–27 September 2018, pp. 164–175. Springer, Heidelberg (2018)
6. Shamshinova, A.M.: Electroretinography in Ophthalmology. Medica, Moscow (2009)
7. Kolosov, O.S., Balarev, D.A., Pronin, A.D., Zueva, M.V., Tsapenko, I.V.: Evaluation of the frequency properties of a dynamic object using pulsed testing signals. In: Mechatronics, Automation, Control #18, pp. 219–226. New Technologies, Moscow (2017)
8. Kolosov, O.S., Korolenkova, V.A., Pronin, A.D., Zueva, M.V., Tsapenko, I.V.: The construction of the amplitude-frequency characteristics of the retina of the eye and the formalization of their parameters for use in diagnostic systems. In: Mechatronics, Automation, Control #19, pp. 451–457. New Technologies, Moscow (2018)
9. Sardari, I., Harsini, J.S.: Thresholding or Bayesian LMMSE/MAP estimator, which one works better for despeckling of true SAR images. Int. J. Image Graph. Signal Process. (IJIGSP) 1, 1–11 (2019). MECS Press, Hong Kong
10. Chowdhury, R., Saifuddin Saif, A.F.M.: Efficient mathematical procedural model for brain signal improvement from human brain sensor activities. Int. J. Image Graph. Signal Process. (IJIGSP) 10, 46–53 (2018). MECS Press, Hong Kong
11. Naveen Kumar, R., Jagadale, B.N., Bhat, J.S.: An improved image compression algorithm using wavelet and fractional cosine transforms. Int. J. Image Graph. Signal Process. (IJIGSP) 11, 19–27 (2018). MECS Press, Hong Kong
12. McCulloch, D.L., Marmor, M.F., Brigell, M.G., Hamilton, R., Holder, G.E., Tzekov, R., Bach, M.: ISCEV standard for full-field clinical electroretinography (2015 update). Doc. Ophthalmol. 130, 1–12 (2015). Springer, Heidelberg
13. Hood, D.C., Bach, M., Brigell, M., Keating, D., Kondo, M., Lyons, J.S., Marmor, M.F., McCulloch, D.L.: ISCEV standard for clinical multifocal electroretinography (mfERG) (2011 edition). Doc. Ophthalmol. 124, 1–13 (2012). Springer, Heidelberg
14. Bach, M., Brigell, M.G., Hawlina, M., Holder, G.E., Johnson, M.A., McCulloch, D.L., Meigen, T., Viswanathan, S.: ISCEV standard for clinical pattern electroretinography (PERG) – 2012 update. Doc. Ophthalmol. 124, 1–13 (2013). Springer, Heidelberg
15. Zueva, M.V., Tsapenko, I.V., Kolosov, O.S., Vershinin, D.V., Korolenkova, V.A., Pronin, A.D.: Assessment of the amplitude-frequency characteristics of the retina with its stimulation by flicker and chess pattern-reversed incentives and their use to obtain new formalized signs of retinal pathologies. Biomed. J. Sci. Tech. Res. 19(5), 14575–14583 (2019). Crimson Publishers, Westchester
16. de Broglie, L.: Mathematics for Electrical and Radio Engineers, 2nd edn. Science, Moscow (1967)
17. Netushil, A.V.: Teoriya Avtomaticheskogo Upravleniya [Theory of Automatic Control], 2nd edn., revised. Vysshaya Shkola, Moscow (1976)
18. Kolosov, O.S., Pronin, A.D.: Features of identification of dynamic objects by impulse testing actions. In: Vestnik MEI 2018, #3, pp. 116–125. Publishing House MPEI, Moscow (2018)
19. Sergienko, A.B.: Tsifrovaya Obrabotka Signalov [Digital Signal Processing], 3rd edn. BKhV-Peterburg, St. Petersburg (2013)
20. Sviridov, V.G., Sviridov, E.V., Filaretov, G.F.: Fundamentals of Automation of Thermal Physics Experiment. Textbook for universities. Publishing House MPEI, Moscow (2019)
A Modified Particle Swarm Algorithm for Solving Group Robotics Problem

Kang Liang¹ and A. P. Karpenko²(✉)

¹ Shanghai Polytechnic University, Pudong District, Jin Hai Road, 2360, Shanghai 201209, China
[email protected]
² Bauman Moscow State Technical University, ul. Baumanskaya 2-ya, 5/1, 105005 Moscow, Russian Federation
[email protected]
Abstract. The paper proposes modifications of the PSO+ global optimization algorithm, which is based on the canonical particle swarm optimization (PSO) algorithm. The PSO+ algorithm is designed to solve the group robotics problem that comes down to localizing the global extremum of a certain scalar objective function when there are no obstacles to the robots' movement. The modifications proposed in this paper allow for the possibility that such obstacles are present. The objective of the paper is to study the effect of obstacles on the efficiency of the modified PSO+ algorithm. We consider two possible variants of the problem: a priori information about the obstacle sizes and locations is available, or such information is unavailable. The results of computational experiments studying the effectiveness of the modified PSO+ algorithm are presented. They show that the proposed modifications provide a high-quality solution to the considered group robotics task under conditions when there is an obstacle to the movement of robots in the search area.

Keywords: Group robotics · Modified particle swarm optimization algorithm · Global optimization
1 Introduction

This paper is a sequel to an earlier paper [1] in which we considered the problem of finding the maximum of a certain scalar physical field using a group of robots. The problems of localizing areas of radioactive, chemical, bacteriological and other contamination, of finding malignant algae, ocean turbulence, temperature and other anomalies, etc., can be stated in this form [2–5]. We focus on a decentralized strategy for controlling a group of robots. The strategy assumes that the overarching goal of the operation is achieved through self-organization of the participants rather than through coordination of their actions. The natural basis for such a technique of controlling a group of robots is a bionic approach in the form of evolutionary and population algorithms [6].
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 205–217, 2020. https://doi.org/10.1007/978-3-030-39216-1_19
206
K. Liang and A. P. Karpenko
Note that the bionic approach is widely used not only in robotics, but also for solving a wide range of problems in other areas. For example, [7] uses the Particle Swarm Optimization (PSO) algorithm for material balance computation in the alumina production process; [8] uses the Ant Colony System Algorithm (ACSA) for the knapsack problem; [9] uses the Monkey Algorithm (MA) for drug design; and [10] uses the Bacterial Foraging Algorithm (BFA) for performance assessment of a power system stabilizer in a multi-machine power system. In group robotics, bionic approaches are used primarily to search for various kinds of goals. For example, in [11] this approach is used to map the distribution of weeds in agricultural fields using quadrocopters. In [12], the multicopter search problem is considered; a feature of that problem statement is the discretization of the search space. Closest to our work is [13], which also uses the particle swarm algorithm to solve the search problem. However, it concerns the use of quadrocopters for search, whereas our work focuses on the use of ground-based robots. As the basic algorithm, we use the canonical Particle Swarm Optimization (PSO) algorithm [6], which is based on a socio-psychological model of the behavior of a group of individuals. In a large number of applications, the PSO algorithm has proved to be an easy-to-implement and efficient global optimization algorithm; for example, the papers [14–17] discuss various modifications of it. There is little sense in using the canonical PSO algorithm and its known modifications to control a group of robots, since these algorithms require global information about the entire group (to localize the best robot), and realizing this requirement in practice is a challenge. In the paper [18], to solve the group robotics problem under consideration, we proposed a modified particle swarm algorithm, PSO+, which does not use global information about the particles.
The main feature of this algorithm is that the concept of particle proximity in topological space [6] is replaced with the concept of their proximity in the search space. In this paper, we propose a modification of the PSO+ algorithm, called PSO++, for solving the group robotics problem under consideration when one or more obstacles are present in the search area. The objective of the paper is to study the effect of obstacles on the efficiency of the PSO++ algorithm. We consider two possible variants of the problem: a priori information about the obstacle sizes and locations is available (PSO1++ algorithm) or unavailable (PSO2++ algorithm).
2 Problem Description

We consider the deterministic optimization problem with unacceptable subdomains (obstacles) in the search area:

$$\min_{X \in D \subset R^{|X|},\ X \notin \hat{D}} f(X) = f(X^*) = f^* \qquad (1)$$
Here $|X|$ is the dimension of the vector of variable parameters $X$; $R^{|X|}$ is the $|X|$-dimensional arithmetic space; $D$ is the search area; $\hat{D}$ is the set of unacceptable subdomains (obstacles); $f(X)$ is the objective function (optimality criterion) with values in the space $R^1$; $X^*$ and $f^*$ are the required optimal solution and the corresponding value of the objective function, respectively. We distinguish two variants of problem (1), which differ in the volume of a priori information on obstacles.

Problem 1. There is complete and accurate a priori information on the obstacles $\hat{D}$; that is, the limiting functions $G(X) = \{g_i(X),\ i \in [1 : |G|]\}$ that form these obstacles are known a priori:

$$\hat{D} = \{X \mid g_i(X) \ge 0,\ i \in [1 : |G|]\}.$$

Problem 2. There is no a priori information on the obstacles $\hat{D}$, and the robots have to detect them in the process of their evolution in the search space.
3 Canonical and Modified Particle Swarm Optimization Algorithms

The canonical PSO algorithm is designed to solve problem (1) without obstacles in the search area (when the set $\hat{D}$ is empty). The scheme of the canonical PSO algorithm is as follows [6].

(1) Specify the values of the free parameters of the algorithm and initialize the population of particles $S = \{s_i,\ i \in [1 : |S|]\}$. Set the iteration counter $t = 1$.

(2) Update the positions of all particles $s_i,\ i \in [1 : |S|]$ in the population by the formulas

$$X_i' = X_i + V_i', \qquad (2)$$

$$V_i' = \beta_I V_i + U_{|X|}(0, \beta_C) \otimes (X_i^{l} - X_i) + U_{|X|}(0, \beta_S) \otimes (X_i^{g} - X_i), \qquad (3)$$

where $X_i^{l}$, $X_i^{g}$ are the current locally and globally best positions of the particle $s_i$, respectively; $U_{|X|}(0, \beta)$ is a $|X|$-dimensional vector whose components are uniformly distributed random values in the interval $(0, \beta)$, $\beta > 0$; $\otimes$ is the symbol of the direct (component-wise) product of vectors; $\beta_I, \beta_C, \beta_S$ are free parameters of the algorithm, whose recommended values are $\beta_I = 0.7298$, $\beta_C = \beta_S = 1.49618$; $V_i' = V_i(t)$ is the $|X|$-dimensional increment vector of the particle coordinates; $V_i = V_i(t-1)$; $t \ge 0$ is the iteration number.

(3) Check the fulfilment of the condition for ending the iterations. If this condition is fulfilled, complete the iterations; otherwise, set $t = t + 1$ and go to step 2.

We emphasize that the vector $X_i^{g}$ in formula (3) is found among the “neighbours” of the given particle (in the sense of the neighbourhood topology used) [6].
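The update formulas (2)–(3) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `pso_step` and the array layout are assumptions.

```python
import numpy as np

def pso_step(X, V, X_loc, X_glob,
             beta_i=0.7298, beta_c=1.49618, beta_s=1.49618):
    """One iteration of the canonical PSO update, formulas (2)-(3).

    X, V          -- (|S|, |X|) arrays of particle positions and velocity increments
    X_loc, X_glob -- locally and globally best positions of each particle
    """
    rng = np.random.default_rng()
    # U_|X|(0, beta): component-wise uniform random vectors;
    # element-wise '*' plays the role of the direct product
    u_c = rng.uniform(0.0, beta_c, size=X.shape)
    u_s = rng.uniform(0.0, beta_s, size=X.shape)
    V_new = beta_i * V + u_c * (X_loc - X) + u_s * (X_glob - X)   # formula (3)
    return X + V_new, V_new                                       # formula (2)
```

With the recommended constriction-type values $\beta_I = 0.7298$, $\beta_C = \beta_S = 1.49618$, the stochastic pull toward the local and global best positions dominates once the inertia term decays.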
The PSO+ algorithm is also designed to solve problem (1) with no obstacles. The intent of the modifications of the canonical PSO algorithm used in the PSO+ algorithm is as follows [1].

(a) We control the velocity of the particle $s_i,\ i \in [1 : |S|]$ according to the rule

$$v_{i,j} = \begin{cases} v_{i,j}, & |v_{i,j}| < v_j^{max}, \\ v_j^{max}, & \text{otherwise}, \end{cases} \qquad j \in [1 : |X|].$$

Here $v_j^{max}$ is the maximum allowable value of the $j$-th component of the velocity vector (a free parameter of the algorithm).

(b) With an increasing number of iterations $t$, we linearly reduce the maximum allowable absolute velocity of the particle $s_i,\ i \in [1 : |S|]$ according to the formula

$$V^{max}(t) = -\frac{V^{max}(0) - V^{min}}{t_{max}}\, t + V^{max}(0), \qquad V^{max}(t) = \sqrt{\sum_{j=1}^{|X|} \left(v_j^{max}(t)\right)^2},$$

where $t_{max}$, $V^{min}$, $V^{max}(0)$ are free parameters of the algorithm.

(c) We assume that the neighbours of the particle $s_i,\ i \in [1 : |S|]$ are all particles of the population $S$ in its current state whose Euclidean distance to this particle does not exceed the communication radius $\rho_{max}$:

$$N_i = \{ s_j \mid \|X_j - X_i\| \le \rho_{max},\ j \in [1 : |S|] \}. \qquad (4)$$

Note that, according to formula (4), the size of the set $N_i$ equals one, that is $|N_i| = 1$, if there are no particles other than $s_i$ within the communication radius.
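The three PSO+ rules (a)–(c) admit a compact sketch; the helper names below are illustrative, and the linear schedule in `v_max_schedule` follows the reconstruction of rule (b) above.

```python
import numpy as np

def clamp_velocity(V, v_max):
    """Rule (a): component-wise clamp |v_ij| <= v_max_j, preserving sign."""
    return np.clip(V, -v_max, v_max)

def v_max_schedule(t, t_max, v_max0, v_min):
    """Rule (b): linear decrease of the allowable speed from V_max(0) to V_min."""
    return v_max0 - (v_max0 - v_min) * t / t_max

def neighbours(X, i, rho_max):
    """Rule (c), formula (4): indices of particles within communication radius
    rho_max of particle i (the particle itself is always included, so |N_i| >= 1)."""
    d = np.linalg.norm(X - X[i], axis=1)
    return np.nonzero(d <= rho_max)[0]
```

Because `neighbours` uses only pairwise distances, each robot can evaluate its own set $N_i$ from local communication, which is the point of replacing topological proximity with proximity in the search space.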
4 Modified PSO++ Particle Swarm Optimization Algorithms for Solving the Problem with Obstacles

We present the PSO1++ and PSO2++ modifications of the PSO+ algorithm for solving Problems 1 and 2 with obstacles, respectively (Sect. 2). The following components are common to the PSO1++ and PSO2++ algorithms.

Use of Penalty Functions. We determine the locally and globally best positions of the particle $s_i,\ i \in [1 : |S|]$ from the values of the fitness function rather than from the values of the objective function $f(X)$, as in the PSO+ algorithm:

$$\varphi(X) = f(X) + p(X, \beta),$$

where $p(X, \beta)$ is a penalty function that formalizes the robot's obstacle information, and $\beta > 0$ is a free parameter. We define the function $p(X, \beta)$ differently for the PSO1++ and PSO2++ algorithms (see below).
Isolated Particles Evolution. The PSO1++ and PSO2++ algorithms take into account the risk of isolated particles appearing. The particle $s_i,\ i \in [1 : |S|]$ is isolated at a given iteration if at this iteration the equality $|N_i| = 1$ holds. For such particles, instead of formulas (2), (3), the population evolution uses the formulas

$$X_i' = X_i + V_i', \qquad V_i' = \beta_I V_i + U_{|X|}(0, \beta_C) \otimes (X_i^{l} - X_i), \qquad (5)$$

that is, the social component of formula (3) is no longer used. The evolution of the particle $s_i$ according to formulas (5) proceeds until the particle acquires a neighbour (or neighbours) or until the iteration process is completed. The recommended value of the free parameter $\beta_C$ in this case is $\beta_C = 1.0$.

Preventing Collision with Obstacles. Despite the use of penalty functions to prevent collisions with obstacles, a situation may arise in the PSO1++ and PSO2++ algorithms when a point $X_i',\ i \in [1 : |S|]$ falls within the domain $\hat{D}$. In this situation, we find a new point $X_i' \notin \hat{D}$ using the iterative method of halving the segment $[X_i, X_i']$.

PSO1++ Algorithm. In this case, we recall, every robot $s_i,\ i \in [1 : |S|]$ “knows” the exact boundary coordinates of each obstacle. As a penalty, the PSO1++ algorithm uses the function

$$p(X, \beta) = \beta \sum_{j=1}^{|G|} \left( g_j^{+}(X) \right)^2, \qquad g_j^{+}(X) = \begin{cases} 0, & g_j(X) < 0, \\ g_j(X), & g_j(X) \ge 0. \end{cases}$$

If, for instance, an obstacle is a sphere with diameter $d_g$ and centre at the point $X_g = (x_{g,1}, \ldots, x_{g,|X|})$, then $|G| = 1$ and

$$g_1(X) = \left( \frac{d_g}{2} \right)^2 - \sum_{j=1}^{|X|} \left( x_j - x_{g,j} \right)^2.$$
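As a minimal sketch, assuming a circular obstacle and hypothetical function names, the PSO1++ penalty and the segment-halving repair step could look like:

```python
import numpy as np

def g_circle(X, Xg, dg):
    """Limiting function of a circular obstacle: g(X) >= 0 inside the circle."""
    return (dg / 2.0) ** 2 - np.sum((np.asarray(X) - np.asarray(Xg)) ** 2)

def penalty(X, beta, g_funcs):
    """p(X, beta) = beta * sum_j (g_j^+(X))^2 with g_j^+ = max(0, g_j)."""
    return beta * sum(max(0.0, g(X)) ** 2 for g in g_funcs)

def repair(X_old, X_new, g_funcs, n_halvings=30):
    """Bisection on the segment [X_old, X_new]: halve the step until the point
    leaves every obstacle (X_old is assumed to be feasible)."""
    X_old, X_new = np.asarray(X_old, float), np.asarray(X_new, float)
    for _ in range(n_halvings):
        if all(g(X_new) < 0 for g in g_funcs):
            return X_new
        X_new = 0.5 * (X_old + X_new)   # shrink toward the feasible end
    return X_old
```

The quadratic penalty keeps the fitness $\varphi(X) = f(X) + p(X, \beta)$ smooth on the obstacle boundary, while the repair step guarantees that no accepted position lies inside $\hat{D}$ even when the penalty alone fails to deter the move.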
PSO2++ Algorithm. In this case, the robot modelled by the particle can determine, in some way or other, that there is an obstacle in its path once it is at a distance $d_d > 0$ from this obstacle. Suppose that the particle $s_i,\ i \in [1 : |S|]$, while moving from the current point $X_i$ to the point $X_i'$, localizes an obstacle at position $X_i^{+}$. The scheme of the PSO2++ algorithm for such a particle is shown in Fig. 1.
Fig. 1. The scheme of the PSO2++ algorithm: an obstacle localized by the particle $s_i$
(1) As the target position of the particle $s_i$ at this iteration we take the point $X_i^{+}$, i.e. we assume that $X_i' = X_i^{+},\ i \in [1 : |S|]$.

(2) We form a rectangular marquee $\hat{D}_i$ whose symmetry axis is the straight line $(X_i, X_i')$ and whose lateral side lengths are $a_d$, $b_d$ (free parameters of the algorithm).

(3) For the particle $s_i$ and all its neighbours (4), we set the penalty value $p(X, \beta)$ at all points of the domain $\hat{D}_i$ to be equal to $\beta$: $p(X, \beta) = \text{const} = \beta$.
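Steps (2)–(3) can be sketched for the two-dimensional case as follows. Fig. 1 only sketches the marquee geometry, so the orientation convention below ($b_d$ measured along the axis from $X_i$, $a_d$ across it) is an assumption, as are the function names.

```python
import numpy as np

def in_marquee(P, Xi, Xi_new, a_d, b_d):
    """True if the 2-D point P lies in the rectangular marquee D_i whose
    symmetry axis is the line (Xi, Xi_new)."""
    Xi, Xi_new, P = (np.asarray(v, float) for v in (Xi, Xi_new, P))
    axis = Xi_new - Xi
    e = axis / np.linalg.norm(axis)           # unit vector along the symmetry axis
    r = P - Xi
    along = r @ e                             # coordinate along the axis
    across = abs(r[0] * e[1] - r[1] * e[0])   # distance from the axis (2-D cross product)
    return (0.0 <= along <= b_d) and (across <= a_d / 2.0)

def marquee_penalty(P, Xi, Xi_new, a_d, b_d, beta):
    """Step (3): constant penalty beta at every point of the marquee."""
    return beta if in_marquee(P, Xi, Xi_new, a_d, b_d) else 0.0
```

A constant penalty (rather than the quadratic one of PSO1++) is all that is possible here, since the robot knows only that an obstacle exists somewhere beyond $X_i^{+}$, not its shape.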
5 Software Implementation and Computational Experiments

The PSO1++ and PSO2++ algorithms are implemented in MATLAB. Computational experiments have been conducted for three target functions for which exact solutions are known [6]: a spherical function (sphere), the Rastrigin function, and the Rosenbrock function. Since the PSO++ algorithm is aimed at solving two-dimensional group robotics problems, in all experiments the dimension of the vector $X$ is taken equal to two: $|X| = 2$. We generate the initial population in the hypercube

$$[-a, a]^2 = P = \{ X \mid -a \le x_i \le a,\ i = 1, 2 \}.$$

Suppose that in the search area there is a circular obstacle with centre $O_g$ at the point with coordinates $x_1 = x_{g1}$, $x_2 = x_{g2}$ and diameter $d_g$ (Fig. 2).
Fig. 2. The search area with a circular obstacle centred at $O_g$
To estimate the algorithm efficiency, we use a multi-start method with $M$ starts. The estimate of the algorithm efficiency was based on seven indicators:

$U_1$ – the best value of the target function reached over the multi-start, $\tilde{f}^*$;
$U_2$ – the corresponding best value of the vector of variable parameters, $\tilde{X}^*$;
$U_3$ – the estimate of the expectation of the values $\tilde{f}^*_k$, denoted $\bar{f}^*$;
$U_4$ – the estimate of the expectation of the values $\Delta \tilde{X}_k$, denoted $\Delta \bar{X}$;
$U_5$ – the estimate of the probability of localizing the target function minimum with a specified accuracy $\varepsilon_f$ in the space of the function values, $\tilde{p}_f(\varepsilon_f) = \tilde{p}_f$;
$U_6$ – the similar estimate of the probability of localizing the target function minimum with a specified accuracy $\varepsilon_X$ in the space of the function arguments, $\tilde{p}_X(\varepsilon_X) = \tilde{p}_X$;
$U_7$ – the multi-start average number $\tilde{N}$ of tests (computations of the target function values).

Besides, we use an integrating efficiency indicator $\tilde{U}$ that represents an additive scalar convolution of the normalized indicators:

$$\tilde{U} = \lambda_1 \tilde{U}_1 + \lambda_2 \tilde{U}_2 + \lambda_3 \tilde{U}_3 + \lambda_4 \tilde{U}_4 - \lambda_5 \tilde{U}_5 - \lambda_6 \tilde{U}_6 + \lambda_7 \tilde{U}_7 \to \min_X.$$
Here $\lambda_k,\ k \in [1 : 7]$ are weighting factors, and

$$\tilde{U}_k = \frac{U_k}{U_k^{max}},\ k = 1, 2, 3, 4, 7; \qquad \tilde{U}_k = \frac{U_k}{U_k^{min}},\ k = 5, 6,$$

where $U_k^{max}$ is the highest obtained value of the indicator $U_k$, and $U_k^{min}$ is the lowest value of this indicator.

The experiments have been carried out at the following values of the free parameters of the algorithms: $|S| = 50$; $|X| = 2$; $a = 10.0$; $x_{g1} = 5.0$; $x_{g2} = 0$; $V^{min} = 0.01\sqrt{2}\,a$; $M = 100$; $d_g \in \{1.0, 2.0, 3.0, 4.0, 5.0\}$; $v_1^{max}(0) = v_2^{max}(0) = 0.2\,a$; $\rho_{max} = 0.5\,a$; $V^{max}(0) = 0.2\sqrt{2}\,a$; $d_d = 0.1\,a$; $a_d = 0.1\,a$; $b_d = 0.2\,a$; $\beta = 100$; $\varepsilon_f = \varepsilon_X = 0.01$.

Stagnation of the computational process was used as the condition for ending the iterations: the iterations end if, after $\delta t$ iterations, the inequality $|\tilde{\varphi}_k(t) - \tilde{\varphi}_k(t + \delta t)| \le \delta_f$ holds, where $\delta t = 20$ and $\delta_f = 0.01$.
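The stagnation criterion above admits a one-function sketch; the function name and the list-based history representation are illustrative assumptions.

```python
def stagnated(history, dt=20, df=0.01):
    """Stopping condition used in the experiments: iterations end when the
    best fitness value has changed by at most df over the last dt iterations.

    history -- best fitness value recorded at each iteration, oldest first.
    """
    if len(history) <= dt:
        return False               # not enough iterations to judge stagnation
    return abs(history[-1 - dt] - history[-1]) <= df
```

Such a criterion is natural for the group robotics setting, where no global iteration budget is shared among robots and each start terminates on its own lack of progress.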
6 Results and Discussion

Some results of the computational experiments are shown in Tables 1, 2 and 3 and in Figs. 3, 4, 5, 6 and 7, which present the values of the efficiency indicators $U_1 = \tilde{f}^*$, $U_3 = \bar{f}^*$, $U_5 = \tilde{p}_f$, $U_7 = \tilde{N}$, and $\tilde{U}$. The last value is obtained with all weighting factors $\lambda_k,\ k \in [1 : 7]$ equal to 1.
Table 1. Results of a computational experiment: $d_g = 1.0$

| Algorithm | f(X) | U1 | U2 | U3 | U4 | U5 | U6 | U7 |
|---|---|---|---|---|---|---|---|---|
| PSO1++ | Sphere | 4.8E−7 | 6.9E−4 | 3.2E−4 | 0.01 | 1.00 | 0.37 | 34.71 |
| PSO1++ | Rastrigin | 6.5E−9 | 5.7E−6 | 0.17 | 0.15 | 0.85 | 0.85 | 80.33 |
| PSO1++ | Rosenbrock | 1.5E−7 | 2.6E−4 | 1.8E−3 | 0.04 | 0.96 | 0.40 | 87.15 |
| PSO2++ | Sphere | 2.9E−8 | 1.7E−4 | 3.2E−4 | 0.01 | 1.00 | 0.45 | 34.79 |
| PSO2++ | Rastrigin | 1.2E−7 | 2.3E−5 | 0.21 | 0.16 | 0.88 | 0.88 | 78.53 |
| PSO2++ | Rosenbrock | 3.1E−7 | 1.6E−4 | 1.1E−3 | 0.03 | 0.97 | 0.37 | 88.13 |
Table 2. Results of a computational experiment: $d_g = 3.0$

| Algorithm | f(X) | U1 | U2 | U3 | U4 | U5 | U6 | U7 |
|---|---|---|---|---|---|---|---|---|
| PSO1++ | Sphere | 2.7E−6 | 1.6E−3 | 0.3E−3 | 0.01 | 1.00 | 0.42 | 35.01 |
| PSO1++ | Rastrigin | 5.7E−8 | 1.7E−5 | 0.29 | 0.21 | 0.82 | 0.82 | 78.93 |
| PSO1++ | Rosenbrock | 1.2E−10 | 4.9E−6 | 0.01 | 0.07 | 0.95 | 0.35 | 87.78 |
| PSO2++ | Sphere | 2.5E−6 | 1.6E−3 | 2.9E−4 | 0.01 | 1.00 | 0.52 | 35.09 |
| PSO2++ | Rastrigin | 9.8E−9 | 7.0E−6 | 0.20 | 0.14 | 0.87 | 0.87 | 78.33 |
| PSO2++ | Rosenbrock | 5.3E−7 | 4.4E−4 | 9.9E−4 | 0.03 | 0.99 | 0.32 | 87.79 |
Table 3. Results of a computational experiment: $d_g = 5.0$

| Algorithm | f(X) | U1 | U2 | U3 | U4 | U5 | U6 | U7 |
|---|---|---|---|---|---|---|---|---|
| PSO1++ | Sphere | 1.0E−7 | 3.2E−4 | 0.4E−3 | 0.02 | 1.00 | 0.43 | 34.27 |
| PSO1++ | Rastrigin | 2.3E−9 | 3.4E−6 | 0.26 | 0.23 | 0.78 | 0.78 | 79.90 |
| PSO1++ | Rosenbrock | 6.7E−7 | 3.3E−4 | 0.8E−3 | 0.03 | 0.99 | 0.40 | 88.00 |
| PSO2++ | Sphere | 1.5E−7 | 3.9E−4 | 3.0E−4 | 0.01 | 1.00 | 0.44 | 34.70 |
| PSO2++ | Rastrigin | 3.0E−8 | 1.2E−5 | 0.29 | 0.18 | 0.85 | 0.85 | 80.40 |
| PSO2++ | Rosenbrock | 2.6E−7 | 8.4E−4 | 8.3E−3 | 0.08 | 0.94 | 0.32 | 88.08 |
Figures 3, 4, 5, 6 and 7 allow us, first, to estimate the efficiency of each of the PSO1++ and PSO2++ algorithms in terms of the efficiency indicators under consideration and, second, to compare the efficiency of these algorithms by the mentioned indicators.
Fig. 3. Values of efficiency indicator $U_1$, the best value of the target function reached over the multi-start: (a) PSO1++; (b) PSO2++
Fig. 4. Values of efficiency indicator $U_3$, the estimate of the expectation of the values $\tilde{f}^*_k$, equal to $\bar{f}^*$: (a) PSO1++; (b) PSO2++
Fig. 5. Values of efficiency indicator $U_5$, the estimate of the probability of localizing the target function minimum: (a) PSO1++; (b) PSO2++
Fig. 6. Values of efficiency indicator $U_7$, the multi-start average number $\tilde{N}$ of tests: (a) PSO1++; (b) PSO2++
Fig. 7. Values of the integrating efficiency indicator $\tilde{U}$: (a) PSO1++; (b) PSO2++
The main conclusion from all indicators is that, when a priori information on obstacles is available (PSO1++ algorithm), the efficiency does not increase drastically as compared to the situation when such information is absent (PSO2++ algorithm). As with the PSO+ algorithm [18], and as expected, the efficiency of the PSO1++ and PSO2++ algorithms is higher for the unimodal spherical function than for the unimodal but ravine-like Rosenbrock function, and especially than for the multimodal Rastrigin function. It is important that, even with large obstacles in the search area, the estimates $U_5$, $U_6$ of the probability of localizing the global minimum remain quite high for the PSO1++ and PSO2++ algorithms. For example, if the obstacle diameter is $d_g = 5.0$, then for the Rastrigin function these estimates are approximately 0.8. Note that these estimates are achieved with acceptable numbers of tests $U_7$: on a per-particle basis, the number of tests is at most 90 (multi-start average).
7 Conclusions

Two modifications of the canonical PSO algorithm, called PSO1++ and PSO2++, are proposed in the paper. The PSO algorithm does not assume that there are obstacles in the search area; the modifications were intended to improve it so that it can be used under such conditions. The PSO1++ and PSO2++ algorithms cover two possible situations: (1) information on the location and shape of obstacles is a priori available to the robots; (2) the robots have only a posteriori information, obtained during the search process. Thus, the scientific novelty of the work is in extending the PSO algorithm to a class of problems with obstacles in the search area. A wide parametric study of the effectiveness of the PSO1++ and PSO2++ algorithms was carried out using well-known two-dimensional test functions of three different
classes: a unimodal “spherical” function, a unimodal ravine function, and a multimodal function. We assumed that the search area contains one obstacle in the form of a circle, whose diameter varied widely. The results of the study showed a rather high efficiency of the PSO1++ and PSO2++ algorithms in the considered group robotics task, as well as the prospects of using these algorithms to solve this problem. The robot model used in the work does not take into account the robots' dynamics as mechanical systems, nor the dynamics of the robot control systems. In a continuation of this work, the authors plan to study the effectiveness of the PSO1++ and PSO2++ algorithms taking into account the dynamics of the robots and their control systems.
References 1. Karpenko, A.P., Leschev, I.A.: Nature-inspired algorithms for global optimization in group robotics problems. In: Smart Electromechanical Systems: Group Interaction. Studies in Systems, Decision and Control, vol. 174, pp. 91–106. Springer (2018) 2. Pettersson, L.M., Durand, D., Johannessen, O.M., Pozdnyakov, D.: Monitoring of Harmful Algal Blooms, 300 p. Praxis Publishing, London (2012) 3. White, B.A., Tsourdos, A., Ashokoraj, I., Subchan, S., Zbikowski, R.: Contaminant cloud boundary monitoring using UAV sensor swarms. In: Proceedings of the AIAA Guidance, Navigation, and Control Conference, San Francisco, USA, pp. 1037–1043 (2005) 4. Hayes, A.T., Martinoli, A., Goodman, R.M.: Distributed odor source localization. IEEE Sensors J. 2(3), 260–271 (2002) 5. Lilienthal, A., Duckett, T.: Creating gas concentration grid maps with a mobile robot. In: Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems, Las Vegas, USA, pp. 118–123 (2003) 6. Karpenko, A.P.: Sovremennye algoritmy poiskovoj optimizacii – Algoritmy vdohnovlennye prirodoj, 446 p. Izdatel’stvo MGTU im. N.E. Baumana (2017). (in Russian) 7. Song, S., Kong, L., Cheng, J.: A novel particle swarm optimization algorithm model with centroid and its application. Int. J. Intell. Syst. Appl. (IJISA) 1, 42–49 (2009) 8. Alzaqebah, A., Abu-Shareha, A.A.: Ant colony system algorithm with dynamic pheromone updating for 0/1 knapsack problem. Int. J. Intell. Syst. Appl. (IJISA) 2, 9–17 (2019) 9. Devi, R.V., Sathya, S.S., Kumar, N., Coumar, M.S.: Multi-objective monkey algorithm for drug design. Int. J. Intell. Syst. Appl. (IJISA) 3, 31–41 (2019) 10. Ibrahim, N.M.A., Elnaghi, B.E., Ibrahim, H.A., Talaat, H.E.A.: Performance assessment of bacterial foraging based power system stabilizer in multi-machine power system. Electr. Power Compon. Syst. 42(10), 1016–1028 (2014) 11. Albani, D., IJsselmuiden, J., Haken, R., Trianni, V.: Monitoring and mapping with robot swarms for agricultural applications. 
In: Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Lecce, Italy, 29 August–1 September 2017, pp. 1–6 (2017) 12. Garcia-Aunon, P., Barrientos Cruz, A.: Comparison of heuristic algorithms in discrete search and surveillance tasks using aerial swarms. Appl. Sci. 8(5), 711 (2018). https://doi.org/10.3390/app8050711 13. Lee, K.-B., Kim, Y.-J., Hong, Y.-D.: Real-time swarm search method for real-world quadcopter drones. Appl. Sci. 8(7), 1169 (2018). https://doi.org/10.3390/app8071169
14. Karpenko, A.P., Seliverstov, E.Yu.: Global’naya optimizaciya metodom roya chastic. Obzor. Informacionnye tekhnologii №. 2, pp. 25–34 (2010). (in Russian) 15. Bonyadi, M.R., Michalewicz, Z.: Particle swarm optimization for single objective continuous space problems: a review. Evol. Comput. 25(1), 1–54 (2017) 16. Sengupta, S., Basak, S., Peters II, R.A.: Particle swarm optimization: a survey of historical and recent developments with hybridization perspectives. Mach. Learn. Knowl. Extr. 1(1), 157–191 (2018) 17. Zhang, Y.: A comprehensive survey on particle swarm optimization algorithm and its applications. Math. Probl. Eng. 2011, 157–191 (2015) 18. Kan, L., Karpenko, A.P.: Modificirovannyj algoritm optimizacii roem chastic, orientirovannyj na zadachu monitoringa gruppoj robotov. Aviakosmicheskoe priborostroenie №. 2, pp. 34–43 (2019). (in Russian)
Analysis of Diagnostic Signs of Defective States of Mechatronic Mechanisms of Cyclic Action Aleksandr K. Aleshin, Georgy I. Firsov(&), Viktor A. Glazunov, and Natalya L. Kovaleva Blagonravov Mechanical Engineering Research Institute of the RAS, 4, Malyi Kharitonievsky pereulok, 101990 Moscow, Russian Federation [email protected]
Abstract. It is proposed to use the random component of time intervals as a diagnostic signal, treating the intervals as random variables, which reflects their physical properties more fully and accurately and offers possibilities for increasing their informativeness as diagnostic signals. Different characteristics of the distribution laws of the random values of the time intervals are examined. It is shown that the random component possesses a sufficient depth of diagnosis for the recognition of nascent defects.

Keywords: Random processes · Diagnostics of mechanisms · Informative parameter · Entropy · Density of distribution
1 Introduction

Time as a source of diagnostic information is widely used for monitoring and diagnostics of mechanical systems of cyclic action. Traditionally, time intervals are considered deterministic values [1, 2], and their deviations beyond permissible limits are diagnostic signs of defects. However, this view of time intervals significantly reduces their informativeness about the current technical condition of the mechanical system. The operation of the mechanism is dynamically connected with a number of random force factors, both on the drive side of the movement and in the friction forces in kinematic pairs, the forces of useful resistance, etc. A working mechanism is a stochastic dynamic system, and the cycle time t of a single movement is a random value. Consequently, treating the time intervals of movement of the mechanism over a given displacement as random variables reflects their physical properties more fully and accurately and opens up opportunities to increase their informativeness as diagnostic signals. This is especially important for organizing preventive diagnostics of incipient defects. The statistical array of cycle time values, as random variables, has a high sensitivity to small changes in the properties of the dynamic system of the mechanism and to an incipient defect. It is proposed to measure and analyze the time at which the executive link of the mechanism reaches a given point in the process of movement. There is a relationship between the character of the change in time of the speed of motion as a random process, © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 218–227, 2020. https://doi.org/10.1007/978-3-030-39216-1_20
Analysis of Diagnostic Signs of Defective States of Mechatronic Mechanisms
219
and the time at which this process reaches a constant given value. This connection follows from the Pontryagin equation for the distribution law $f(p_i, t)$ of the time of the first attainment of a given value by a random process, as a function of the parameters $p_i\ (i = 1, 2, \ldots, N)$ of the dynamical system [3]. Each defect is a deviation $\delta p_k\ (k = 1, 2, \ldots, s)$ of one or more parameters of the diagnosed mechanical system from the normative values. The deterministic functional dependence of $f(p_i, t)$ on $p_i$ leads to characteristic changes in the distribution law $f(p_i, t)$ in the case of a defect. It is this feature of the time intervals t that is proposed to be used for defect recognition.
2 Diagnostics of Cyclic Mechanisms According to the Statistical Characteristics of the Law of Motion

The peculiarity of dynamic systems of cyclic action consists in the repeated fulfillment of a given law of motion with a period t. These are oscillatory systems or mechanisms with periodic translational or rotational motions of links. For them, the cycle time t is an important technological parameter that determines the speed and the synchronization with other devices. In addition, it is used as an external sign of the occurrence of defects, in the form of a deviation of t beyond permissible limits [1, 2]. However, time t as a physical parameter is only rarely considered as an independent diagnostic signal and a source of information for defect recognition. This is because knowledge of the actual value of the time interval t does not allow one to indicate a specific defect and does not provide the necessary depth of diagnosis [4–6]. This limited informativeness is connected with the notion of time intervals t as deterministic quantities that take fixed and very specific values. In this interpretation, defects of different physical nature can lead to the same changes in t, and it is impossible to indicate a specific cause. The notion of time intervals t as random variables reflects their physical properties more fully and accurately and opens up possibilities for increasing their informativeness as diagnostic signals. The fact is that, with sufficiently accurate multiple measurements of the period t, its random deviations are found around a certain average value. Depending on the dynamic properties of the system being diagnosed, the mean values may also undergo evolution. It turns out that the random component of t carries a significant amount of diagnostic information about the current state of the dynamic system.
Thus, along with the analysis of a specific physical process x(t) as a diagnostic signal, it is proposed to measure and analyze the time when this process reaches a certain value, for example, the time when a link of the mechanism reaches a given point in the process of movement. There is a relationship between the character of the change in time of the physical process x(t) as a random process and the distribution law f(t) of the time when this process reaches a constant given value. In addition to the well-known statistical characteristics of random processes, such as expectation, variance, standard deviation (rms), asymmetry and kurtosis, the use of the entropy coefficient of the probability density is constructive:
$$K_H = \Delta_{\text{э}}\,\sigma^{-1} = 0.5\,\exp\{I_{sh}(n)\}\,\sigma^{-1} = \frac{bN}{2\sigma}\, 10^{-\frac{1}{N}\sum_{i=1}^{d} n_i \lg n_i},$$

where $I_{sh}(n)$ is the entropy (information in the sense of Shannon) [7], determined from

$$I_{sh}(W_0) = -M\{\ln W_0(\xi)\} = -\int_{-\infty}^{\infty} W_0(\xi) \ln W_0(\xi)\, d\xi;$$

$\sigma$ is the rms, b is the histogram column width, N is the sample size, d is the number of histogram columns, and $n_i$ is the number of observations in the i-th column of the histogram. Note that for any distribution law the value of $K_H$ lies within $0 \ldots 2.066$, the maximum value $K_H = 2.066$ being attained by the Gaussian distribution. In addition, it is more convenient to use not the kurtosis coefficient $k_{\text{э}}$, which varies from 1 to $\infty$, but the counter-kurtosis $k_{\text{э}}^{-0.5}$, whose value can vary from 0 to 1. To obtain an adequate numerical characteristic of a random process in the presence of correlation, it is necessary to specify the number N of measurements (sample elements), the time of each measurement $\tau$ and the interval T between consecutive measurements, which may differ from $\tau$ by the dead time $(T - \tau)$. After this, it is possible to determine for this data set the so-called N-point sample variance for a given number of measurements N and given values of T and $\tau$:

$$\sigma^2(N, T, \tau) = \frac{1}{N-1} \sum_{i=1}^{N} \left( y_i - \frac{1}{N} \sum_{j=1}^{N} y_j \right)^2. \qquad (1)$$
Currently, it is generally accepted [9] to follow the proposal of Dave Allan [8] and use the sample variance with N = 2 and T = $\tau$. This so-called Allan variance $\sigma_y^2(2, \tau, \tau)$, for which the shorter designations $\sigma_y^2(2, \tau)$ or $\sigma_y^2(\tau)$ are also used, can be defined using formula (1) as

$$\sigma_y^2(\tau) = \left\langle \sum_{i=1}^{2} \left( y_i - \frac{1}{2} \sum_{j=1}^{2} y_j \right)^2 \right\rangle = \frac{1}{2} \left\langle (y_2 - y_1)^2 \right\rangle. \qquad (2)$$
The Allan variance and its square root, called the Allan deviation, rely on measuring the difference of two adjacent successive measurements of the cycle duration, rather than measuring the deviation of the cycle time from the average value, as in the classical definition of the standard deviation. For a given interval τ, y_i = (x_{i+1} − x_i)/τ. Substituting this relation into formula (2) gives

σ_y²(τ) = ⟨(x_{i+2} − 2x_{i+1} + x_i)²⟩ / (2τ²).
In the case of linear drift of the cycle duration, y(t) = at, where a is the drift velocity. Taking into account that y₁ = [at₀ + a(t₀ + τ)]/2 and y₂ = [a(t₀ + τ) + a(t₀ + 2τ)]/2, it follows from formula (2) that σ_y(τ) = aτ/√2 = (a/√2)·τ.
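The linear-drift result can be checked directly: with y(t) = at the two interval averages differ by exactly aτ, so the Allan deviation is aτ/√2 for any drift rate. A quick numerical check with an assumed drift velocity a:

```python
import numpy as np

a, t0 = 0.01, 5.0                      # assumed drift velocity and start time
for tau in (0.1, 1.0, 10.0):
    y1 = (a * t0 + a * (t0 + tau)) / 2            # average of y over the first interval
    y2 = (a * (t0 + tau) + a * (t0 + 2 * tau)) / 2  # average over the second interval
    sigma_y = np.sqrt(0.5 * (y2 - y1) ** 2)
    assert np.isclose(sigma_y, a * tau / np.sqrt(2))
```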
Analysis of Diagnostic Signs of Defective States of Mechatronic Mechanisms
221
Consequently, linear drift of the cycle duration leads to an Allan deviation that depends linearly on the measurement time τ.

With harmonic modulation of the cycle duration

y(t) = (δν₀/ν₀)·sin(2πf_m t),   (3)

where f_m is the modulation frequency, substitution of (3) into (2) gives

σ_y(τ) = (δν₀/ν₀)·sin²(πf_m τ)/(πf_m τ).

This shows that the contribution of frequency modulation to the Allan deviation becomes zero when τ = 1/f_m, that is, when the time τ is a multiple of the modulation period 1/f_m and the modulation effect is zeroed by time averaging. The deviation is maximal for τ = n/(2f_m), where n is an odd integer.

For random variations of the cycle duration, the spectral density of frequency noise can be approximated, to a first approximation, by a power function of the form S(f) = h_α·f^α, which corresponds to the following types of noise: white phase noise (α = 2), flicker phase noise (α = 1), white frequency noise (α = 0), flicker frequency noise (α = −1), and random-walk frequency noise (α = −2). In this case, the Allan variance can also be approximated by a power function σ_y²(τ) = k_μ·τ^μ. Between the values of α and μ there is the dependence μ = −α − 1. This dependence holds unambiguously in the range −2 ≤ α ≤ 0, while for phase noise (α = 1 and α = 2), which manifests itself at small measurement intervals, there is a certain ambiguity: white phase noise looks the same in the time domain as flicker phase noise. To overcome this drawback, the so-called modified Allan variance was proposed:

Mod σ_y²(τ) = (1/2)·⟨ [ (1/n)·Σ_{i=1}^{n} ( (1/n)·Σ_{k=1}^{n} y_{i+k+n,τ₀} − (1/n)·Σ_{k=1}^{n} y_{i+k,τ₀} ) ]² ⟩,

where τ = nτ₀ and y_{j,τ₀} denote values averaged over the base interval τ₀.
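The relation μ = −α − 1 can be illustrated numerically for white frequency noise (α = 0): averaging y over blocks of length m emulates τ = mτ₀, and the Allan variance should fall off as τ^(−1), i.e. μ = −1. A sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(size=2 ** 16)                 # white frequency noise (alpha = 0)

ms, avars = [1, 2, 4, 8, 16, 32], []
for m in ms:
    # non-overlapping block averages emulate averaging time tau = m * tau0
    blocks = y[: y.size - y.size % m].reshape(-1, m).mean(axis=1)
    avars.append(0.5 * np.mean(np.diff(blocks) ** 2))

slope = np.polyfit(np.log(ms), np.log(avars), 1)[0]   # expect mu close to -1
```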
The modified Allan variance eliminates this ambiguity due to the artificial narrowing of the bandwidth of the measuring system, but it has an increased sensitivity to white phase noise.

Additionally, the histogram can be evaluated with indicators obtained by triangulation interpolation, the essence of which is to represent the histogram as an isosceles triangle. The so-called “St. George index” is equal to the width of the base of the triangle closest to the histogram of the distribution of intervals. The size of the base of the histogram is taken as the base of the triangle obtained by approximating the distribution with the method of least squares. To calculate this base, two points A and B are set on the time axis of the histogram, after which a multilinear function q(t) is constructed such that q(t) = 0 for t ≤ A and t ≥ B, and the integral ∫₀^{+∞} (D(t) − q(t))² dt is minimal over all possible values between A and B. Another indicator, called the triangulation index, is equal to the ratio of the total number of intervals to the height of the histogram (its mode). In other words, the triangulation index is the integral of the distribution density D divided by the maximum of the distribution density.
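The triangulation index is simply the total number of intervals divided by the histogram mode height; for example (the histogram counts below are hypothetical):

```python
import numpy as np

def triangulation_index(counts):
    """Ratio of the total number of intervals to the histogram height (its mode)."""
    counts = np.asarray(counts)
    return counts.sum() / counts.max()

counts = np.array([2, 10, 40, 25, 8, 1])   # hypothetical histogram columns
print(triangulation_index(counts))         # 86 / 40 = 2.15
```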
222
A. K. Aleshin et al.
3 Using the Laws of Distribution of Parameters of the Laws of Motion in the Problems of Operational Diagnostics of Cyclic Mechanisms

The dynamic properties of mechanical systems, including deformations of links and elements of kinematic pairs, the values of gaps, friction forces, etc., manifest themselves differently depending on the law of motion and the inertial loads [1]. This is clearly reflected in the graphs of the distribution laws for the time intervals t. The histograms of the distribution of the time t of turning the arm of the “PUMA” robot by a given angle were considered. The gripper of the robot, fitted with an optical position sensor, repeatedly moved in the same plane, but with different laws of speed change: “trapezoidal” and “triangular”. The optical sensor, consisting of a light source and a phototransistor, reacts to a change in the light beam. In the two extreme positions, lightproof plates are fixedly mounted. Opening the luminous flux of the sensor when the arm moves from the initial position starts the timer that measures the time of movement, and blocking the flux by the plate in the final position stops the measurement. As a result of repeated movements, a statistical array is formed for t, from which a histogram is built. Figure 1 shows the histograms of the distribution of the time t of turning the arm of the “PUMA” robot by a given angle. The dynamic properties of the electromechanical system of the robot, including the deformation of the links and elements of kinematic pairs, the magnitudes of the gaps, friction forces, etc., manifested themselves differently depending on the law of motion and the inertial loads. This is clearly reflected in the difference in the distribution laws for t.
Fig. 1. Histograms of the distribution of time t of turning the arm of the robot “PUMA” at a given angle
For the comparison of histograms as estimates of the empirical distribution density, so-called homogeneity criteria are usually used [9, 10]. One example is the most commonly used χ² criterion, similar to the Pearson goodness-of-fit criterion. This criterion, like all other traditional homogeneity criteria, is not sensitive to the shape of the histograms and is not suitable for comparing complex forms of distributions. Therefore, to compare complex discrete forms of histograms and to determine the degree of non-randomness of their similarity, it is advisable to use the correlation criterion [11], based on the calculation of the correlation coefficient of the ordinates of two histograms after normalizing each of them to the density of the normal distribution fitted by the mean value and variance, and also a rank correlation criterion based on the calculation of the Spearman statistic for the ordinates of the compared histograms. In addition to these methods, the Kullback measure [12] can be used to distinguish a distribution f(x) from a given f₀(x):

H = ∫_{−∞}^{+∞} f(x)·ln[f(x)/f₀(x)] dx,

where f(x) is the probability density of the fluctuations (variations) of the time intervals. This measure has shown high efficiency in solving problems of diagnosing cyclic machines such as clockwork, gear, and turbine mechanisms [13].

To check the stable conformity of the forms of the histograms to the laws of motion, the experiments were repeated: the law of motion and the measurement process for t were reproduced anew. The dotted line in Fig. 1 shows the histograms obtained by reproducing the experiments. The close coincidence of the corresponding histograms indicates a stable correspondence between the dynamic properties of the mechanism and the corresponding distribution laws for t. We show that this is due to the presence of a deterministic functional dependence of the distribution laws f(p_i, t) on the parameters p_i.

Let x(t) be the generalized coordinate of a dynamical system. A probabilistic description of x(t) as a random process is possible if it is a Markov process [14, 15]. It is known that processes satisfying this condition are described by first-order differential equations or systems of first-order differential equations. A rather large number of real systems reduce to such a description. Let the equation of the dynamical system be given in the general form

ẋ = F(p₁, p₂, …, p_n, x, t) + Gξ(t)   (4)
with the initial condition x(t₀) = x₀ at t = t₀. Here Gξ(t) is a random effect of white-noise type; G is the intensity of the white noise; p_i are the parameters of the dynamical system, deviations of which from the permissible values cause a defect to appear. The probabilistic description of a random process x(t) satisfying Eq. (4) is the conditional probability density function u(p₁, p₂, …, p_n, x, t | x₀, t₀), which is a solution of the Kolmogorov equation

∂u/∂t₀ + k₁(p_i, x₀, t₀)·∂u/∂x₀ + (1/2)·k₂(p_i, x₀, t₀)·∂²u/∂x₀² = 0,   (5)
where k₁(p_i, x₀, t₀) and k₂(p_i, x₀, t₀) are the drift and diffusion coefficients, respectively. They are determined from the equation of motion (4):

k₁(p_i, x₀, t₀) = lim_{Δt₀→0} (1/Δt₀)·[F(p_i, x, t)·Δt₀ + m_ξ(t)·Δt₀],

k₂(p_i, x₀, t₀) = lim_{Δt₀→0} (1/Δt₀)·∫_{t₀}^{t₀+Δt₀} ∫_{t₀}^{t₀+Δt₀} K_ξ(t₁, t₂) dt₁ dt₂,

where K_ξ(t₁, t₂) is the correlation function of the random perturbation Gξ(t). For stationary random perturbations of white-noise type with zero expectation m_ξ(t) = 0
and variance σ² = G, the function K_ξ(t₁, t₂) is the delta function δ(τ), where τ = t₂ − t₁. From the Kolmogorov Eq. (5) we can obtain the equation for the function f(p_i, x₀, t), the probability density of the distribution of the time of first attainment of a given value x₁ by the random process x(t) [14, 15]. Assuming the process is stationary, the equation for f(p_i, x₀, t) is

∂f/∂t = k₁(p_i, x₀)·∂f/∂x₀ + (1/2)·k₂(p_i, x₀)·∂²f/∂x₀²   (6)
with the initial and boundary conditions f(p_i, x, t₀) = δ(t − t₀), f(p_i, x₀, t) = δ(x − x₀), f(p_i, x₁, t) = δ(x − x₁). In Eq. (6), the drift and diffusion coefficients do not depend on time. For problems of diagnostics and identification, Eq. (6) gives an important result: its solution determines the law of distribution of the time at which the generalized coordinate x(t) reaches the given value x₁ as a deterministic function of the parameters p_i of the dynamical system. The origin of a defect (a deviation δp_k) entails changes in the coefficients k₁(p_i, x₀) and k₂(p_i, x₀) in Eq. (6) and, as a result, changes the function f(p_i, x₀, t). This is a sign of a defect. Since each fault changes the functions k₁(p_i, x₀) and k₂(p_i, x₀) in a specific way, characteristic changes will also be observed in f(p_i, x₀, t). Having a previously obtained set of distributions (histograms) for each defect and presenting an experimentally obtained histogram for recognition, a specific defect can be determined from the results of the comparison. However, a feature of defects in mechanical systems is the continuous, evolutionary nature of their development from the nucleation stage to emergency failure. In addition, in complex systems the occurrence of one defect stimulates the emergence of another, so that a very wide variety of combinations of defective states is possible. In such a situation, the preliminary creation of a “bank of defects” with the corresponding distribution laws is practically impossible. Here, the localization of defects is effectively carried out on the basis of recording and analyzing additional diagnostic signals that are directly related to the process of defect formation. For example, for the mechanisms of rotary tables of machine tools, such parameters are the angular speed ω and the angular acceleration ε of the faceplate, the platform on which the workpiece is fixed.
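To illustrate how a first-passage-time density f(p_i, x₀, t) arises from a model of the form (4), one can simulate the process with the Euler-Maruyama scheme and histogram the times at which x(t) first reaches x₁. The constant drive F = v and the parameter values below are hypothetical, chosen only for this sketch:

```python
import numpy as np

# Euler-Maruyama simulation of dx = F dt + G dW with a hypothetical constant
# drive F = v; the first-passage times to x1 are collected into a histogram.
rng = np.random.default_rng(3)
v, G = 1.0, 0.2                  # assumed drift velocity and white-noise intensity
x1, dt, n_paths = 1.0, 2e-3, 400

times = []
for _ in range(n_paths):
    x, t = 0.0, 0.0
    while x < x1:
        x += v * dt + G * np.sqrt(dt) * rng.standard_normal()
        t += dt
    times.append(t)

f_hist, edges = np.histogram(times, bins=30, density=True)  # empirical f(t)
```

For a constant drift the mean first-passage time is x1/v, so the sample mean of `times` should sit near 1.0 here.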
The nature of the change in the oscillogram of ω and the maximum values of the acceleration ε of the faceplate during braking indicate the appearance of a defect in the braking mechanism. The defect consists in the blocking of the throttling holes of the braking mechanism by fragments of wear and destruction of the piston-seal material of the hydraulic cylinder of the drive that turns the faceplate. The significant dynamic loads repeatedly arising during braking lead to faceplate displacement, loss of accuracy by the turntable, failure of parts, and long emergency downtime of the equipment. Thus, the proposed method in this case plays the role of an indicator, preventively signaling at an early stage the birth of a defect and the need for a deeper diagnostic procedure.

The histograms presented for recognition reflect not only the intrinsic properties of the dynamical system, as follows from Eq. (6), but also the metrological properties of the measuring path for time intervals. The measurement process has inevitable errors, which have their own distribution law w(t), determined by the properties of the measuring system.
The measurement error creates a masking background that increases the uncertainty of the distribution laws f(p_i, x₀, t). In practice, this manifests itself in a “blurring” of the forms of the histograms and a reduction in the reliability of recognition. To a first approximation, the model of the random component of the measured interval t can be represented as the sum

t = t₁ + t₂,   (7)
where t₁ and t₂ are random components determined by the properties of the dynamical and measuring systems, respectively. According to this model, the experimentally obtained distribution laws f̃(p_i, x₀, t) in the form of histograms (Fig. 1) are compositions (convolutions) of the distribution laws f(p_i, x₀, t₁) and w(t₂):

f̃(p_i, x₀, t) = f(p_i, x₀, t₁) * w(t₂) = ∫_{−∞}^{+∞} f(p_i, x₀, t₁)·w(t − t₁) dt₁.
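Numerically, this composition is a convolution of the two sampled densities; a sketch using np.convolve, with hypothetical Gaussian densities standing in for f and w:

```python
import numpy as np

dt = 0.01
t = np.arange(-5.0, 5.0, dt)

def gauss(t, mu, s):
    return np.exp(-((t - mu) ** 2) / (2 * s * s)) / (s * np.sqrt(2 * np.pi))

f = gauss(t, 2.0, 0.3)        # density of the dynamic component t1 (hypothetical)
w = gauss(t, 0.0, 0.4)        # density of the measurement error t2 (hypothetical)

f_tilde = np.convolve(f, w) * dt                 # observed density, f convolved with w
tc = np.arange(f_tilde.size) * dt + 2 * t[0]     # time grid of the 'full' convolution

mean = np.sum(tc * f_tilde) * dt                 # means add: close to 2.0
var = np.sum((tc - mean) ** 2 * f_tilde) * dt    # variances add: close to 0.3^2 + 0.4^2
```

The broadening of the observed histogram relative to f is exactly the added variance of w, which is the “blurring” described in the text.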
Consequently, the “images” of defects in the form of histograms depend on the adopted measurement scheme and are not universal. For the reliability of the entire recognition procedure, it is important that the function w(t₂) be stationary and depend minimally on the properties of the system being diagnosed. The function w(t₂) is mainly determined by the accuracy of the device that starts and stops the timer for the time measurement. If this device (the sensor) does not change its characteristics during operation and with a change in the state of the system being diagnosed, the stationarity of the law w(t₂) is ensured. The accuracy of the time measurement by the timer itself can be made almost arbitrarily high, so its own error is much less than the sensor error. Stationarity is also desirable for the component t₁ in (7). It is provided by the property of time t as a physical parameter. The interval to be measured is the time in which the generalized coordinate reaches the same constant value. This means that in repeated measurements the same complex of factors acts in the formation of t₁, and new, previously unaccounted-for factors are not involved in the experiment. In addition, as a result of each measurement of the time interval, its value is formed throughout the entire change of the generalized coordinate. In the process of this change, non-stationary sources of disturbances mutually neutralize each other, and only after that does the measurement of the time interval end. In this respect the diagnostic signal (the time t) compares favorably with the vibro-acoustic signal [16], which responds to the whole complex of disturbances at every instant and therefore has a pronounced non-stationary character. For the time t, there is no need for instrumental and algorithmic filtering of the non-stationary component [17, 18].
A sign of the stability and stationarity of t is the close coincidence of the forms of the histograms when the measurement scheme is re-created and the experiments are reproduced.
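The histogram-comparison criteria discussed above (correlation of histogram ordinates, Spearman rank correlation, and the Kullback measure) might be sketched as follows in plain NumPy; the Spearman statistic is computed from ranks without tie averaging, which is sufficient for a sketch:

```python
import numpy as np

def ranks(v):
    """Rank transform (ties not averaged)."""
    r = np.empty(len(v), dtype=float)
    r[np.argsort(v)] = np.arange(1, len(v) + 1)
    return r

def compare_histograms(n1, n2, bin_width=1.0):
    n1, n2 = np.asarray(n1, float), np.asarray(n2, float)
    f1 = n1 / (n1.sum() * bin_width)                 # empirical densities
    f2 = n2 / (n2.sum() * bin_width)
    pearson = np.corrcoef(n1, n2)[0, 1]              # correlation of ordinates
    spearman = np.corrcoef(ranks(n1), ranks(n2))[0, 1]
    mask = (f1 > 0) & (f2 > 0)                       # Kullback measure H
    kullback = np.sum(f1[mask] * np.log(f1[mask] / f2[mask])) * bin_width
    return pearson, spearman, kullback

h = [3, 12, 30, 22, 9, 2]                            # hypothetical histogram
print(compare_histograms(h, h))                      # identical: (1.0, 1.0, 0.0)
```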
4 Conclusions In conclusion, it should be noted that the results of time measurement are obtained in digital form, which excludes the stage of analog-digital conversion and the associated distortion of the initial information. The proposed estimates and algorithms were used in the diagnosis of unified units of aggregate machines. These are power heads, rotary tables, as well as robotic complexes of technological equipment. In addition, these methods were used in the diagnosis of textile and printing production equipment, where the mechanisms of cyclic action are widely used.
References

1. Pronyakin, V.I.: Problems in diagnosing cyclic machines and mechanisms. Meas. Tech. 51(10), 1058–1064 (2008)
2. Kiselev, M.I., Pronyakin, V.I.: A phase method of investigating cyclic machines and mechanisms based on a chronometric approach. Meas. Tech. 44(9), 898–902 (2001)
3. Park, K.I.: Fundamentals of Probability and Stochastic Processes with Applications to Communications, 273 p. Springer, Heidelberg (2018)
4. Niu, G.: Data-Driven Technology for Engineering Systems Health Management: Design Approach, Feature Construction, Fault Diagnosis, Prognosis, Fusion and Decision, 364 p. Springer, Science Press, Singapore, Beijing (2017)
5. Isermann, R.: Fault-Diagnosis Systems: An Introduction from Fault Detection to Fault Tolerance, 475 p. Springer, Heidelberg (2006)
6. Isermann, R.: Fault-Diagnosis Applications. Model-Based Condition Monitoring: Actuators, Drives, Machinery, Plants, Sensors, and Fault-Tolerant Systems, 372 p. Springer, Heidelberg (2011)
7. Song, G.: Research on feature selection algorithm in rough set based on information entropy. Int. J. Educ. Manag. Eng. 1, 6–11 (2011)
8. Riehle, F.: Frequency Standards: Basics and Applications, 526 p. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim (2004)
9. Suhov, Y., Kelbert, M.: Probability and Statistics by Example. Vol. I: Basic Probability and Statistics, 373 p. Cambridge University Press, Cambridge (2005)
10. Hajek, B.: Random Processes for Engineers, 427 p. Cambridge University Press, Cambridge (2015)
11. Shnoll, S.E.: Fractality, “coastline of the universe”, movement of the earth, and “macroscopic fluctuations”. Biophysics 58(2), 265–282 (2013)
12. Bulinski, A., Dimitrov, D.: Statistical estimation of the Shannon entropy. Acta Math. Sin. Engl. Ser. 35(1), 17–46 (2019)
13. Morozov, A.N., Nazolin, A.L.: Determinate and random processes in cyclic and dynamic systems. J. Eng. Math. 55(1–4), 273–294 (2006)
14. Gamerman, D., Lopes, H.F.: Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, 323 p. Springer, Heidelberg (2006)
15. Brooks, S., Gelman, A., Jones, G.L., Meng, X.-L. (eds.): Handbook of Markov Chain Monte Carlo, 620 p. Chapman and Hall/CRC, Taylor and Francis Group, Boca Raton (2011)
16. Dobrynin, S.A., Feldman, M.S., Firsov, G.I.: Methods of Computer-Aided Study of Machine Dynamics, 218 p. Mechanical Engineering, Moscow (1987). (in Russian)
17. Lubbad, M.A.H., Ashour, W.M.: Cosine-based clustering algorithm approach. Int. J. Intell. Syst. Appl. (IJISA) 4(1), 53–63 (2012)
18. Kumar, K.V., Kumar, S.S.: LabVIEW based condition monitoring of induction machines. Int. J. Intell. Syst. Appl. (IJISA) 4(3), 56–62 (2012)
Development and Performance Evaluation of a Software System for Multi-objective Design of Strain Gauge Force Sensors

Sergey I. Gavrilenkov¹ and Sergey S. Gavryushin¹,²

¹ Bauman Moscow State Technical University, 5, Baumanskaya St., 105005 Moscow, Russian Federation
[email protected], [email protected]
² Mechanical Engineering Research Institute of the Russian Academy of Sciences, 4, Malyi Kharitonievsky pereulok, 101990 Moscow, Russian Federation
Abstract. This paper presents a software system for designing strain gauge force sensors. The system is based on the Parameter Space Investigation (PSI) method coupled with methods of multicriteria optimization. The method gives the Decision Maker a high degree of control over the design process. The values of the design objectives are calculated using a parametric finite element model of the force sensor being designed. The system is written in C# and Python, utilizing the capabilities of the open-source FEA system Salome-Meca/Code_Aster. The paper presents a design case study, and the performance of the proposed design method is compared to that of other common algorithms for multi-objective optimization to demonstrate the system’s efficacy.

Keywords: Strain gauge force sensor · Parameter Space Investigation method · Multi-objective optimization · Computer-aided design
1 Introduction

Force sensors based on strain gauges are widely used across industry in platform scales, checkweighers, and weighbridges. They are also used in load monitoring applications [1] and for force control in robotics. One of the current trends in the sensor industry is to bring the sensor price down while retaining adequate accuracy and strength of the sensors, which serve as structural members in various constructions.

A strain gauge force sensor (FS), Fig. 1, is comprised of an elastic element and a number of strain gauges bonded to the elastic element. During measurement, the elastic element is deformed. The strain of the elastic element is transferred to the strain gauges, making them change their electrical resistance. The strain gauges are connected to form a bridge measuring circuit, where the resistance change of the strain gauges in the individual arms of the bridge circuit generates a voltage across the circuit output terminals. The voltage magnitude is proportional to the magnitude of the force applied. The characteristics of a strain gauge force sensor (bridge output voltage magnitude, nonlinearity, creep and hysteresis of the output signal) are determined by the stress-strain state of its elastic element and the properties of the strain gauges used.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020
Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 228–237, 2020. https://doi.org/10.1007/978-3-030-39216-1_21
Fig. 1. A strain gauge force sensor (Schematic representation).
The stress-strain state is, in turn, governed by the shape of the elastic element. Hence, the characteristics of a strain gauge force sensor can be improved by finding the right shape of the elastic element and the optimal placement of strain gauges on it. The goal of this article is to develop a unified system for the design of strain gauge FS and to demonstrate its efficacy with a case study. To evaluate the performance of the proposed design algorithm, the case study is also solved using common algorithms for multi-objective optimization.
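The bridge-circuit behavior described in the introduction follows the standard small-signal Wheatstone relation: the output per volt of excitation is a quarter of the alternating sum of the relative resistance changes of the four arms. A sketch (the gauge factor and strain values are illustrative, not taken from the paper):

```python
def bridge_output_mV_per_V(gauge_factor, eps1, eps2, eps3, eps4):
    """Small-signal full-bridge output in mV per V of excitation.

    eps1..eps4 are the strains seen by the four bridge arms; adjacent
    arms enter with opposite signs.
    """
    return 1000.0 * gauge_factor * (eps1 - eps2 + eps3 - eps4) / 4.0

# full bridge: two gauges in tension, two in compression, |eps| = 1e-3
print(bridge_output_mV_per_V(2.0, 1e-3, -1e-3, 1e-3, -1e-3))  # 2.0 mV/V
```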
2 Literature Review

There is a considerable body of research on designing strain gauge force sensors. The issue of material selection for force sensors was studied in depth in [2]. Various optimization techniques combined with Finite Element Analysis (FEA) were applied to the task of force sensor design in [3–5]. FEA was also used to estimate the errors of strain gauge pressure sensors and to design overload protection devices in [6–8]. Besides, there is a lot of research [9–12] on the design of multi-axis force sensors with low interference between individual measuring channels (so-called ‘crosstalk’), for example, when a lateral force interferes with the measurement of the vertical force. Although most research employed FEA to analyze sensor characteristics, certain types of sensors can be modeled using analytical methods [13, 14]. This approach is much faster than FEA, but its use is limited to a narrow range of elastic element constructions. Thus, the use of FEA is preferable for the design of strain gauge FS with an arbitrary shape. The aforementioned papers considered only parametric design of force sensors, i.e., the shape of the force sensor elastic element remained within a prescribed topology, which somewhat limits the design space. This limitation was overcome in [15], where topological optimization was used for designing a strain gauge force sensor. In addition, in [16], the authors proposed a design method combining primitive measuring shapes, or modules, to construct multi-axis force/moment sensors. To summarize, there is a demand for improving the characteristics of strain gauge force sensors and much research in this area, but a unified system tailored to this design task does not exist. Thus, the development of such a system is an essential task for the force sensors industry.
3 Design Algorithm

The design algorithm employs techniques of multi-objective optimization and the Parameter Space Investigation method used in [17]. Figure 2 synopsizes the proposed algorithm.
Fig. 2. Design algorithm
According to the algorithm, the design procedure is as follows. First, for a chosen topology of the elastic element (bending beam, membrane, shear beam, etc.), the Decision Maker (DM) chooses the parameters varied in the design process. These parameters are called design variables. The DM also imposes constraints on the design variables. The chosen design variables and the constraints make up the Parameter Space. Second, the Parameter Space is investigated using the Sobol sequence strategy. Each element of this sequence corresponds to a particular sensor design. For each sensor design, the design objectives are calculated from a finite element simulation of the force sensor elastic element. The simulation models the stress-strain state of the element for a ramping measured load. The distribution of strain is then used to calculate the output voltage magnitude of the sensor’s bridge circuit for the optimal placement of strain gauges, using the formulas given in [18]. The simulations are run automatically using a parametric finite element model built with the open-source software Salome-Meca/Code_Aster [19]. Third, based on the constraints imposed on the design variables and the design objectives (for example, the value of the design objective ‘maximum stress’ must be less than 700 MPa, and so on), the set of designs meeting all these constraints is constructed: the set of feasible designs. The set of Pareto-optimal solutions is then constructed from the set of feasible designs. Finally, the information about the sets of feasible and Pareto-optimal designs is given to the DM in the form of various charts, histograms, and tables.
The DM analyzes the information about the sets of feasible and Pareto-optimal solutions and can proceed with the following actions:
• If the set of feasible solutions is empty, weaken the criteria constraints and re-analyze the new sets of feasible and Pareto-optimal design candidates;
• Change the boundaries of the Parameter Space and reinvestigate it;
• Conduct a local investigation around a design that seems promising.
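The loop described above (sample the Parameter Space, evaluate objectives, filter feasible designs, extract the Pareto set) might be sketched as follows. A cheap analytic function stands in for the FEA evaluation, and plain uniform sampling stands in for the Sobol sequence; all of these stand-ins are assumptions made only for the sketch:

```python
import numpy as np

def evaluate(hh, hl):
    """Stand-in for the FEA evaluation; returns (output SO, safety SF, nonlinearity NL)."""
    return hh / hl, hl / 25.0, 0.025 * hh / 25.0

def pareto_mask(objs):
    """True for non-dominated rows, all objectives to be maximized."""
    mask = np.ones(objs.shape[0], dtype=bool)
    for i in range(objs.shape[0]):
        dominated = np.all(objs >= objs[i], axis=1) & np.any(objs > objs[i], axis=1)
        mask[i] = not dominated.any()
    return mask

rng = np.random.default_rng(4)                   # uniform sampling stands in for Sobol
hh = rng.uniform(15, 25, size=400)               # design-variable ranges of the case study
hl = rng.uniform(30, 50, size=400)
so, sf, nl = evaluate(hh, hl)

feasible = (sf >= 1.5) & (nl < 0.02)             # criteria constraints
objs = np.column_stack([so, sf, -nl])[feasible]  # negate NL so everything is maximized
front = objs[pareto_mask(objs)]                  # Pareto-optimal designs
```

In practice a proper low-discrepancy sequence (e.g. scipy.stats.qmc.Sobol) would replace the uniform sampler, and the evaluation would call the FEA pipeline.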
The system keeps a global archive of sensor designs for the current study. Every time the Parameter Space is investigated, the results are appended to the archive, and the sets of feasible and Pareto-optimal solutions are updated. The process of investigating the Parameter Space and refining the set of Pareto-optimal designs is repeated until the DM finds the most preferable design.
4 Software Implementation of the Design Algorithm

The proposed design algorithm is implemented in a software system. The modules for manipulating the archive of designs and for visualizing data are written in C#. The modules for the automatic calculation of the FEA model are written in Python, as it is the language of the Application Program Interface of Salome-Meca. Figure 3 presents some of the UI forms of the software developed. When a particular sensor design is evaluated, the module for evaluating the design objectives first rebuilds the geometry of the elastic element according to the given values of the design variables. Then the module automatically creates a finite element mesh using the GMSH mesher (the default option in Salome-Meca). After that, the module calls the Code_Aster solver and runs a preset simulation script created by the designer. At the end of a simulation, Code_Aster writes the information about the mesh and the stress-strain state to a text file. The text file is read and interpreted by the module to calculate the values of the design objectives from the simulation results.
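The post-processing step, reading the solver's text output and turning it into objective values, might look like this; the record format below is purely hypothetical and stands in for the actual Code_Aster result file:

```python
import re

# hypothetical solver dump: node id, von Mises stress [MPa], axial strain
RESULT_TEXT = """\
NODE 1  VMIS 412.5  EPS 0.00071
NODE 2  VMIS 688.1  EPS 0.00112
NODE 3  VMIS 235.0  EPS 0.00038
"""

def parse_results(text):
    """Parse (node, von Mises stress, strain) triples from the hypothetical dump."""
    rows = re.findall(r"NODE\s+(\d+)\s+VMIS\s+([\d.]+)\s+EPS\s+([\d.-]+)", text)
    return [(int(n), float(v), float(e)) for n, v, e in rows]

nodes = parse_results(RESULT_TEXT)
max_vmis = max(v for _, v, _ in nodes)   # design objective 'maximum stress'
print(max_vmis)                          # 688.1
```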
Fig. 3. Some of the UI forms of the software: the module for defining the Parameter Space (left), the module for visualizing the Pareto-optimal solutions in the form of a parallel coordinates plot.
5 Case Study

The following case study demonstrates the efficacy of the software developed. The task is to design a double-beam force sensor with a rated load of 1000 N. Figure 4 displays the elastic element geometry and the design variables. Such sensors are often used in platform scales. The elastic element material is 630 stainless steel (17-4 PH) with the following properties: elastic modulus E = 197 GPa, Poisson’s ratio µ = 0.3, hardness 42.5 HRC, yield strength σ_y = 1200 MPa.
Fig. 4. The geometry of the elastic element. The bridge circuit is comprised of active strain gauges. The strain gauges experiencing tension are shown with red rectangles; the strain gauges experiencing compression are shown with blue rectangles.
The design objectives are the following:
• Maximize the bridge output voltage SO at the rated load;
• Maximize the sensor overloading factor SF;
• Minimize the maximum nonlinearity NL of the bridge output voltage vs. load characteristic.

The sensor overloading factor is given by

SF = 0.6·σ_y / σ_max^Mises,   (1)

where σ_max^Mises is the maximum von Mises stress in the elastic element, σ_y is the yield strength of the elastic element material, and 0.6 is a coefficient taking into account the fact that the yield strength is determined at a plastic strain of 0.2%. Plastic strains in the elastic element manifest themselves as a zero shift of the sensor, and a plastic strain of 0.2% is quite large. Besides, the fatigue strength of the sensor is also a concern, especially in dynamic weighing applications. In this study, the value of 0.6 was chosen based on experience.

The value of hysteresis is not considered in this study, for several reasons. In force sensors, hysteresis is typically related to two factors. The first is hysteresis from the friction between the sensor and the load-introducing hardware; contrary to cylindrical and membrane compression force sensors, there is little room for external friction for the elastic element geometry considered. The second is internal friction hysteresis. Yagmur et al. showed [2] that this contribution can be almost eliminated by proper heat treatment of 17-4 PH stainless steel. The constraints on the design objectives are as follows: SO ≥ 1 mV per 1 V of supply voltage, SF ≥ 1.5, NL < 0.02%.
The Parameter Space is two-dimensional. The design variables are HH and HL, and their ranges of variation are 15…25 mm for HH and 30…50 mm for HL. Other than those, there are no constraints imposed on the design variables. Usually, the placement of the strain gauges on the elastic element (locations and orientations) would also have to be considered as design variables, and the module for evaluating the design objectives would have to determine the optimal placement from simulation results; for simplicity, the gauge locations are fixed to the thinnest points on the beam. The strain gauges are generic constantan strain gauges for transducer manufacturing, having a gauge factor of 2 and a 3 mm × 3 mm measuring grid.

The case study is solved as follows. First, the task is solved using the software developed. Second, the task is solved using common multi-objective genetic algorithms (MOGA): NSGA-II, NSGA-III, IBEA, and SPEA2. Genetic algorithms are a tool of choice for various engineering optimization problems where the objective functions may be diverse and have multiple extrema [20–23]. Third, the quality of the Pareto front obtained using the proposed software and the MOGAs is compared with different quality indicators [24]: Generational Distance (GD), Hypervolume (HV), and Spacing (SP). This study uses the open-source Python library Platypus [25] for the implementation of the MOGAs and the quality indicators. The GD indicator requires the true set of Pareto-optimal designs as a reference set. To find it, the Parameter Space was investigated with a very large number of possible designs (20000), which gives a very close approximation of the true set of Pareto-optimal designs. The settings of the MOGAs are as follows. The mutation operator is Polynomial Mutation (PM) with a probability of 0.05. The crossover operator is Simulated Binary Crossover (SBX) with a probability of 0.8. The population size is varied in the case study.
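Two of the quality indicators, GD and SP, reduce to nearest-neighbour distance computations and can be sketched in plain NumPy; one common variant of each is shown, and the Hypervolume indicator, being more involved, is left to the Platypus library:

```python
import numpy as np

def generational_distance(front, reference):
    """Mean distance from each front point to its nearest reference point."""
    d = np.linalg.norm(front[:, None, :] - reference[None, :, :], axis=2)
    return d.min(axis=1).mean()

def spacing(front):
    """Std of nearest-neighbour distances within the front."""
    d = np.linalg.norm(front[:, None, :] - front[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    return d.min(axis=1).std()

ref = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])   # hypothetical 2-D front
print(generational_distance(ref, ref))                 # a front equal to its reference: 0.0
```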
The archive size is 1000. As MOGAs are probabilistic by nature, each MOGA was run 50 times, and the averages and confidence intervals were calculated for each of the quality indicators. The computational budget, i.e., the maximum number of times the objective functions can be evaluated (one sensor design = one FEA calculation of the parametric model of the elastic element), was set to 1000. This parameter serves as the termination condition for the MOGAs.

5.1 Solving the Case Study Using the Developed Software System
To investigate the distribution of Pareto solutions in the Parameter Space, the DM conducted a preliminary investigation comprising 400 designs. Figure 5 shows the distribution of feasible and Pareto-optimal solutions in the Parameter Space obtained using the visualization module of the developed software. Based on this distribution, the DM limited the subsequent investigations to just two subareas marked with rectangles in Fig. 5. The first subarea is located at the ‘top’ of the Parameter Space; the second is a small ‘isle’ at its bottom. The DM divided the remaining computational budget (600 designs) equally between the subareas.
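The core operation of each such investigation — probing candidate designs inside a chosen box of the Parameter Space and keeping the non-dominated ones — can be sketched as follows. This is a minimal illustration with hypothetical names; in the real system the `evaluate` callback would run an FEA calculation, and all objectives are assumed to be minimized:

```python
import random

def pareto_filter(objectives):
    """Return indices of non-dominated objective vectors (all minimized)."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    return [i for i, fi in enumerate(objectives)
            if not any(dominates(fj, fi) for j, fj in enumerate(objectives) if j != i)]

def investigate(bounds, n, evaluate, seed=0):
    """One investigation: probe n random designs inside the given box."""
    rng = random.Random(seed)
    designs = [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(n)]
    objs = [evaluate(d) for d in designs]
    keep = pareto_filter(objs)
    return [designs[i] for i in keep], [objs[i] for i in keep]
```

For the initial investigation over the whole Parameter Space one would call, e.g., `investigate([(15, 25), (30, 50)], 400, evaluate)`, and then repeat with tighter bounds for each subarea.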
S. I. Gavrilenkov and S. S. Gavryushin
Fig. 5. Distribution of the feasible (left) and Pareto-optimal (right) designs in the Parameter Space. The plots were obtained using the visualization tools of the developed software system. Black rectangles mark the areas chosen for further investigation.
Figure 6 shows the distribution of the Pareto-optimal designs in the objective space (a screenshot from the developed software). Investigation 1 was the initial investigation; Investigations 2 and 3 were conducted in Subareas 1 and 2, respectively.
Fig. 6. The plot of Pareto-optimal designs in the space of design objectives from different investigations, obtained using the system’s visualization tools. The system uses different markers for different investigations to help the Decision Maker understand how the design process progresses. The units for Design Objectives #1, 2, and 3 are %, none, and mV/V, respectively.
It is noteworthy that the Pareto designs from the investigation of Subarea 1 are localized in a single small ‘cluster’ and do not yield results as diverse as those of the investigation of Subarea 2.

5.2 Discussion of Results
Table 1 shows the indicator values. For the MOGAs, Table 1 shows mean indicator values and the 0.95 confidence intervals.

Table 1. Indicator values

Algorithm       | Npop | GD·10⁻³     | HV·10⁻³       | SP·10⁻³
Proposed method | –    | 0.49183     | 447.22        | 23.711
IBEA            | 50   | 2.62 ± 0.18 | 242.65 ± 3.15 | 74.70 ± 3.19
IBEA            | 100  | 1.74 ± 0.10 | 237.95 ± 2.00 | 52.55 ± 7.44
SPEA2           | 50   | 2.10 ± 0.06 | 443.60 ± 1.31 | 18.01 ± 5.43
SPEA2           | 100  | 1.44 ± 0.03 | 466.62 ± 0.77 | 15.60 ± 7.24
NSGA-III        | 50   | 2.34 ± 0.07 | 431.29 ± 3.11 | 49.93 ± 3.14
NSGA-III        | 100  | 1.66 ± 0.03 | 468.92 ± 3.03 | 60.95 ± 3.33
NSGA-II         | 50   | 2.36 ± 0.03 | 425.10 ± 0.93 | 48.61 ± 2.40
NSGA-II         | 100  | 1.62 ± 0.02 | 451.90 ± 0.47 | 38.01 ± 11.23
On the one hand, the Pareto set obtained using the proposed software and design method has the lowest GD value, i.e., it is closest to the reference Pareto set. Besides, the hypervolume of this set is around the same as that of the MOGAs. On the other hand, its spacing SP is rather poor compared to the other algorithms. This means that the proposed method does not capture the Pareto-optimal designs with the optimal values of the individual design objectives, i.e., the MOGAs give somewhat more diverse results.
6 Conclusions

In this paper, a software system for designing strain gauge force sensors was presented. The system is based on the Parameter Space Investigation method and methods of multicriteria optimization. The developed software system can be used by engineers and designers in the force sensor industry both to improve the characteristics of existing force sensors and to create new sensor designs. The software gives the Decision Maker great control over the design process. The design principle the software is based upon can be extended to other sensor types, for instance, fiber-optic force sensors or pressure sensors. However, the conducted case study indicated that the Pareto-optimal set of designs obtained by the software is somewhat inferior to the ones obtained by multi-objective genetic algorithms in terms of certain quality indicators (spacing and hypervolume of the Pareto set). This limitation can be overcome by using multi-objective genetic algorithms for individual investigations of the Parameter Space.
Besides, future research will aim to introduce economic criteria (the cost of materials and manufacturing) into the design process, which will help the Decision Maker estimate the economic impact of the considered sensor designs.
References

1. Saxena, P., Pahuja, R., Singh Khurana, M., Satija, S.: Real-time fuel quality monitoring system for smart vehicles. Int. J. Intell. Syst. Appl. (IJISA) 8(11), 19–26 (2016)
2. Yagmur, L., Aydemir, B.: A comparative study for material selection of sensor element using analytic hierarchy process. MAPAN 33(4), 459–468 (2018)
3. Kolhapure, R., Shinde, V., Kamble, V.: Geometrical optimization of strain gauge force transducer using GRA method. Measurement 101, 111–117 (2017)
4. Sun, Y., Liu, Y., Zou, T., Jin, M., Liu, H.: Design and optimization of a novel six-axis force/torque sensor for space robot. Measurement 65, 135–148 (2015)
5. Uddin, M.S., Songyi, D.: On the design and analysis of an octagonal–ellipse ring based cutting force measuring transducer. Measurement 90, 168–177 (2016)
6. Gavryushin, S.S., Skvortsov, P.A., Skvortsov, A.A.: Optimization of semiconductor pressure transducer with sensitive element based on “silicon on sapphire” structure. Periodico Tche Quimica 15(30), 679–687 (2018)
7. Gavryushin, S.S., Skvortsov, P.A.: Evaluation of output signal nonlinearity for semiconductor strain gage with ANSYS software. In: Solid State Phenomena, vol. 269, pp. 60–70 (2017)
8. Andreev, K.A., Vlasov, A.I., Shakhnov, V.A.: Silicon pressure transmitters with overload protection. Autom. Remote Control 77(7), 1281–1285 (2016)
9. Tavakolpour-Saleh, A.R., Sadeghzadeh, M.R.: Design and development of a three-component force/moment sensor for underwater hydrodynamic tests. Sens. Actuators A: Phys. 216, 84–91 (2014)
10. Tavakolpour-Saleh, A.R., Setoodeh, A.R., Gholamzadeh, M.: A novel multi-component strain-gauge external balance for wind tunnel tests: simulation and experiment. Sens. Actuators A: Phys. 247, 172–186 (2016)
11. Li, X., He, H., Ma, H.: Structure design of six-component strain-gauge-based transducer for minimum cross-interference via hybrid optimization methods. Struct. Multidiscip. Optim. 60(1), 301–314 (2019)
12. Feng, L., Lin, G., Zhang, W., Pang, H., Wang, T.: Design and optimization of a self-decoupled six-axis wheel force transducer for a heavy truck. Proc. Inst. Mech. Eng. Part D: J. Automob. Eng. 229(12), 1585–1610 (2015)
13. Wang, Y., Zuo, G., Chen, X., Liu, L.: Strain analysis of six-axis force/torque sensors based on analytical method. IEEE Sens. J. 17(14), 4394–4404 (2017)
14. Payo, I., Adánez, J.M., Rosa, D.R., Fernandez, R., Vázquez, A.S.: Six-axis column-type force and moment sensor for robotic applications. IEEE Sens. J. 18(17), 6996–7004 (2018)
15. Takezawa, A., Nishiwaki, S., Kitamura, M., Silva, E.C.: Topology optimization for designing strain-gauge load cells. Struct. Multidiscip. Optim. 42(3), 387–402 (2010)
16. Liang, Q.K., Zhang, D., Coppola, G., Wu, W.N., Zou, K.L., Wang, Y.N., Ge, Y.: Modular design and development methodology for robotic multi-axis F/M sensors. Sci. Rep. 6, 24689 (2016)
17. Statnikov, R.B., Gavriushin, S.S., Dang, M., Statnikov, A.: Multicriteria design of composite pressure vessels. Int. J. Multicriteria Decis. Making 4(3), 252–278 (2014)
18. Stefanescu, D.M.: Handbook of Force Transducers: Principles and Components. Springer, Berlin (2010)
19. Homepage of the training materials of the Code_Aster project. https://code-aster.org/spip.php?rubrique2. Accessed 19 June 2019
20. Soukkou, A., Belhour, M.C., Leulmi, S.: Review, design, optimization and stability analysis of fractional-order PID controller. Int. J. Intell. Syst. Appl. (IJISA) 8(7), 73 (2016)
21. Kumar, R.: Internet of things for the prevention of black hole using fingerprint authentication and genetic algorithm optimization. Int. J. Comput. Netw. Inf. Secur. (IJCNIS) 9(8), 17 (2018)
22. Sakharov, M., Karpenko, A.: Parallel multi-memetic global optimization algorithm for optimal control of polyarylenephthalide’s thermally-stimulated luminescence. In: Proceedings of World Congress on Global Optimization, pp. 191–201. Springer, Cham (2019)
23. Karpenko, A., Sakharov, M.: New adaptive multi-memetic global optimization algorithm. Herald Bauman Moscow State Tech. Univ. Ser. Nat. Sci. 83, 17–31 (2019)
24. Chand, S., Wagner, M.: Evolutionary many-objective optimization: a quick-start guide. Surv. Oper. Res. Manag. Sci. 20, 35–42 (2015)
25. Homepage of the Platypus project. https://github.com/Project-Platypus/Platypus. Accessed 15 June 2019
Optimization of the Structure of the Intelligent Active System as a Necessary Condition for the Harmonization of Creative Solutions

N. Yu. Mutovkina and V. N. Kuznetsov
Tver State Technical University, Tver, Russia [email protected]
Abstract. This article is devoted to the problem of identifying the intelligent agents whose personal qualities make them most effective at solving non-standard, complex, creative tasks. The effectiveness of the decisions made by intelligent agents determines the productivity of the entire system. The possibilities of generating creative solutions to complex, non-standard problems largely depend on the states of the intelligent agents. An intelligent agent, given its inherent anthropomorphic properties and qualities, can be in various psycho-emotional states: from a depressed mood to creative uplift and insight. The states of an agent change over time, and the agent informs the system about them at specified points in time. It is assumed that all agents provide reliable information about their states to the system. Based on the data obtained, the system should determine the most and least stable agents and redistribute the functions and responsibilities between them so as to maximize its own utility function. The task is characterized by the need to analyze large volumes of subjective information and belongs to the class of difficult-to-formalize tasks. To solve it, fuzzy clustering algorithms, which are readily implemented in the Matlab software environment, can be used.

Keywords: Intelligent active system · Intelligent agent · Fuzzy clustering · Creative solutions · Coordination problem · Optimization · Agent status · FCM algorithm
1 Introduction

At present, characterized by the ever-increasing pace of scientific and technological progress, the increasing complexity of business and personal interactions and relationships, the emergence of new tasks, ideas, and developments, and, as a result, new dangers for the existence of human society as a whole, the problem of building intelligent active systems populated by creative agents is gaining particular importance. A creative intelligent agent is understood as an intelligent, thinking entity able to put itself in the place of another individual and, guided by anthropomorphic qualities and ideas about itself, other agents, the system, and the world around it, build strategies of behavior and generate non-standard solutions to problem situations, while increasing the value of both its own utility function and the system utility function.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020
Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 238–248, 2020. https://doi.org/10.1007/978-3-030-39216-1_22
Intelligent agents can use various prohibited behavioral strategies to achieve their goals:

(1) ostentatious friendliness;
(2) demonstration of their regression, imitation of their death;
(3) creating copies of themselves and hiding them from other agents of the system [1];
(4) deliberate transmission of false information to mislead the rest of the agents of the system [2].
As practice shows, the choice by most agents of any of these strategies leads to an increase in mistrust in the intelligent active system (IAS), disruption of the relationships between agents and, ultimately, to the collapse of the system [3]. Therefore, it is extremely important to identify in a timely manner the agents whose interests are contrary to the interests of the system. As a rule, the goal of such agents is to maximize their own utility function, not the utility function of the system and certainly not those of other agents. Practice shows that agents of the evasive or coercive psycho-behavioral type are not capable of creating and harmonizing creative solutions to creative problems. They cannot create, since all their energy is directed towards satisfying their own needs and nothing else. The purpose of this work is to identify offending agents as well as agents whose qualities suit them for solving non-standard, complex, creative tasks. The latter can only be friendly agents – agents that have a positive effect on the system. Friendliness means that an intelligent agent should never be hostile or ambivalent about the system in which it is located. Moreover, this attitude should not be affected by its goals, interests, preferences, etc. A friendly intelligent agent must have a moving scale of values that accounts for changes in their relevance over time. That is, an intelligent agent must possess the so-called coherent extrapolated volition and be able to anticipate the desires of other agents and users of the IAS [4]. The initial data of the task of optimizing the composition of the IAS are the results of observations of the behavior of agents in the IAS, supported by the agents’ own messages about their states. It is assumed that all agents are interested in reporting reliable information about their conditions, since they will be punished for broadcasting false information, up to exclusion from the IAS.
2 Theoretical Aspects of the Research

The task of identifying the most “friendly”, capable, and at the same time purposeful agents is a derivative of the task of team building, well known in conflict management, the theory of socio-economic systems management, and applied sociology [5–8]. In an IAS, it is also important to divide agents into groups (subgroups) within which they will be able to carry out their tasks most effectively. To achieve the best result, a consistently friendly climate in the IAS is needed. But this is a difficult task, since hundreds and even thousands of agents are involved in an IAS, and they do not always get along with each other. Open conflict is not even required: the joint work of some agents may be ineffective due to a simple mismatch of their interests and preferences. This is especially
characteristic when a complex, non-standard problem appears in the system whose solution is required within a limited time. Creativity, finding innovative, progressive ways out of difficult situations, has always been the main condition for the development of society. Creativity is an activity whose preliminary regulation contains some degree of uncertainty. The result of creative activity is always new information suggesting self-organization [9]. To solve creative problems, it is necessary to have special abilities that can develop under the following conditions: sufficient resource endowment, including knowledge; the absence of conflicts in the system and of those who can create such conflicts; each agent’s awareness of its significance and its ability to put itself in the place of another and look at itself (its behavior) from the side. In general, creativity can be defined as a set of properties and qualities of an intelligent agent which provide it with the opportunity to express itself in any kind of activity as a creative person [9, p. 5]. Psychological research shows that many creative tasks can have several correct solutions. The nature of each proposed solution largely depends on the specific situation, of which the intelligent agent itself is the most important part [10]. The problem is the coordination of the solutions proposed by the agents and the achievement of some consensus, which is impossible if there is a large number of antagonist agents in the IAS. Agents that impede the effective functioning of the IAS are divided into two types: explicit antagonists and hidden antagonists. Explicit antagonists do not hide their true states and report them openly to the system. Their position, as a rule, is always far from a compromise balance: they either shy away from fulfilling their assigned functions or try to seize the initiative from other agents and seize their resources. Hidden antagonists try to conceal their true states.
The difference between the desired and the actual for such agents is excessively large [2].
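One simple way to operationalize this criterion — our illustration rather than a formula from the paper — is to flag an agent as a hidden antagonist when the average gap between its self-reported states and the states observed by the system exceeds a threshold:

```python
def is_hidden_antagonist(reported, observed, threshold=0.3):
    """Flag an agent whose self-reports diverge too far from observation.
    Both arguments are sequences of state estimates in [0, 1] over time;
    the threshold value here is an illustrative assumption."""
    gap = sum(abs(r - o) for r, o in zip(reported, observed)) / len(reported)
    return gap > threshold
```

An explicit antagonist, by contrast, would be detected directly from the observed state trajectory itself.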
3 Formalization and Methods for Solving the Problem

3.1 Formalization of the State of an Intelligent Agent
The concept of the “state of an intelligent agent” is multifaceted and includes determination, goodwill, willingness to share one’s results, a sense of freedom of choice, self-sufficiency, etc. The behavior of the agent in the IAS is expressed by its transitions from state to state: s1 → … → st−1 → st → st+1 → … → sT, where st−1 is the state of the agent preceding the time t, sT is the state of receiving the creative solution, and T is the number of iterations over which the solution is obtained. In general, the state of an agent can be represented as a combination of the following concepts: knowledge (X1), beliefs (X2), desire (X3), intentions (X4), goals (X5), obligations (X6), mobility (X7), benevolence (X8), truthfulness (X9), rationality (X10). An intelligent agent has some knowledge of itself, other agents, and the environment. Knowledge does not change over time, but can only be supplemented. Based on this knowledge, some beliefs are formed regarding a particular situation in which the agent finds itself, i.e. a certain “view” of the problem. Beliefs (faith) can be interpreted as
a subjective perception of knowledge, which can change over time. The reasons for this may be the result the agent obtains from applying objective knowledge (information) in practice, or the experience of other agents. Desires are states and situations whose achievement, for various reasons, is an end in itself for the agent; they can be contradictory, and therefore the agent cannot expect all of them to be achieved. Intentions are what the agent is obliged to do by virtue of its obligations towards other agents (it is entrusted with the solution of a problem and has taken upon itself the obligation to solve it), or what follows from its desires (i.e. a consistent subset of desires, chosen for one reason or another, that is compatible with the obligations assumed). Goals are a specific set of final and intermediate states whose achievement the agent has adopted as its current strategy of behavior. Obligations towards other agents are tasks that the agent takes upon itself at the request (instruction) of other agents in the framework of cooperative goals or the goals of individual agents in the framework of a partnership. Mobility is the agent’s ability to move to and from the IAS (via the Internet) in search of the information necessary to solve its problems. An important condition for the cooperation of agents is the property of benevolence, i.e. the willingness of agents to help each other and to solve exactly those tasks that the user entrusts to a specific agent, which implies the absence of conflicting goals for the agent. No less important properties of agents are truthfulness and rationality. The truthfulness of an agent lies in the fact that it does not manipulate information it knows to be false. Rationality is the ability of an agent to act in such a way as to achieve its goals, and not to shy away from their achievement, at least within the framework of its knowledge and beliefs [11].
All of these concepts can be interpreted as fuzzy sets [12]. For example, at the moment t = 1 the agent knows 50% of the 100% necessary to solve the problem; at the moment t = 2 the volume of knowledge increases to 65%, etc. The value of 50% can be interpreted as “insufficient” to solve the problem and be estimated in the interval (0; 0.5); 65% is “insufficient” or “average” and has estimates in the intervals (0; 0.5) and (0.5; 0.8), respectively. The values from these intervals are the degrees of membership of the element x1t in the fuzzy set X1. Another example: at time t = 1 the agent is ready to undertake obligations to solve a problem with a readiness of 37%, and at the next moment this readiness can increase to 60%. When t = 3, due to the influence of exogenous and endogenous factors, for example, under the influence of the opinions of other agents or due to a change in the statement of the problem by the user, the agent’s readiness to assume obligations to solve the problem may decrease to 52%, etc. Thus, the values of 37%, 60%, 52% can also be evaluated on the linguistic scale “not ready”, “rather not ready than ready”, “rather ready than not ready”, “ready” and expressed by numbers in the interval (0; 1), which are the degrees of membership of the element x6t in the fuzzy set X6. Therefore, we can write:

X1 = {(x1t, μX1(x1t)) | x1t ∈ U},
X2 = {(x2t, μX2(x2t)) | x2t ∈ U},
…
X10 = {(x10t, μX10(x10t)) | x10t ∈ U},          (1)
where μX1, μX2, …, μX10 are membership functions, i.e. μX1, μX2, …, μX10 : U → (0; 1) are characteristic functions of the sets X1, X2, …, X10 ⊆ U, whose values indicate whether x1t, x2t, …, x10t ∈ U are elements of the corresponding sets X1, X2, …, X10; U is the so-called universal set, from the elements of which all the other sets considered in this class of problems are formed. The values μX1(x1t), μX2(x2t), …, μX10(x10t) are called the degrees of membership of the elements x1t, x2t, …, x10t in the fuzzy sets X1, X2, …, X10, respectively [12]. The agent state model at any time can be represented as follows:

S = P1 ∩ P2,  P1 = X1 ∪ X2,  P2 = X3 ∪ … ∪ X10,          (2)
where the set P1 is the position of the agent, its point of view; P2 is the set that defines the behavior of the agent. The type of an agent (r, r ∈ R), reflecting its preferences, is an assessment of the states it is in while performing a specific task; R is the set of agent preferences. The type of an agent can be calculated in a simplified way by the formula:

r = (1/T) Σ_{t=1}^{T} s_t.          (3)
The following rules are accepted: an agent is evasive if r ∈ (0; 0.4); an agent is called compromise (cooperative) if r ∈ [0.4; 0.6]; and an agent is coercive if r ∈ (0.6; 1). The type of an agent can change in the process of its functioning and is considered separately at each iteration. In this case, the equality [2] holds:

rt = st.          (4)
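Formula (3) and the classification rules above can be expressed directly; a small sketch (the function names are ours, not from the paper):

```python
def agent_type(states):
    """Simplified agent type r from formula (3): the time average of the
    agent's states s_t, each given as a number in (0, 1)."""
    return sum(states) / len(states)

def classify(r):
    """Classification rules: evasive, compromise (cooperative), coercive."""
    if r < 0.4:
        return "evasive"
    if r <= 0.6:
        return "compromise"
    return "coercive"
```

For example, an agent whose observed states hover around 0.5 is classified as the compromise type.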
In [13] it was established that an IAS is in an equilibrium state if all agents can be classified as the compromise type. Changing the type of an agent is carried out by applying models and methods of coordinated management as a set of decision-making rules for agents in the form of dependencies that assign to each state (type) of the agent (4) a specific value of the control action vit, i ∈ 1, …, N, vit ∈ Vt, where N is the number of agents in the IAS and Vt is the set of control impacts on the agent at time t from the side of the Center agent or the IAS user [14]. The presence of control mechanisms at each iteration of the functioning of the IAS makes it possible to obtain the best result and to solve the problems of coordination and optimization in the interaction of agents with each other [15].

3.2 Fuzzy Clustering of the Composition of the Intelligent Active System
It is proposed to optimize the composition of the IAS in two independent stages: explicit and hidden antagonists are identified separately. For this, fuzzy clustering algorithms, which were considered in detail in [16, 17] and are easily implemented in the Matlab software environment, are quite suitable.
The initial data for the identification of explicit antagonists are the results of observations of the states of agents as they change in the process of solving a creative problem. To identify hidden antagonists, the results of observations of the states of agents are compared with the agents’ own estimates of their states. All estimates are expressed by numbers in the interval [0; 1]. Since the number of clusters is known (it corresponds to the number of psycho-behavioral types r), the FCM algorithm and the corresponding command-line function (fcm) were used to solve the fuzzy clustering problem in the Matlab system:

[center, U, obj_fcn] = fcm(data, cluster_n, options).          (5)
The input arguments of function (5) are:
data: the matrix of initial clustering data (S), whose i-th row contains the estimates of the states of the i-th agent in the form of a vector si = (si1, si2, …, sit, …, siT), where sit is the quantitative state value for the i-th agent at time t;
cluster_n: the number of desired fuzzy clusters c (c = 3);
options: additional arguments that control the process of fuzzy clustering, the stopping criterion of the algorithm, and/or the display of information on the monitor screen. There are 4 such arguments:
(1) options(1): the exponential weight m for calculating the fuzzy partition matrix U (default m = 2);
(2) options(2): the maximum number of iterations q (default q = 100);
(3) options(3): the convergence parameter of the algorithm e (default e = 0.00001);
(4) options(4): whether information about the current iteration is displayed on the monitor screen (by default, this value is 1).
The results of the fcm function are listed in the left-hand part of expression (5):
center: the matrix of centers of the resulting fuzzy clusters, each row of which is the coordinates of the center of one of the fuzzy clusters in the form of a vector;
U: the membership-function matrix of the resulting fuzzy partition;
obj_fcn: the values of the objective function at each iteration of the algorithm.
The fcm function terminates when the FCM algorithm performs the maximum number of iterations q or when the difference between the values of the objective function at two consecutive iterations is less than the a priori value of the convergence parameter e.
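The behavior of the fcm routine described above can be mirrored in a short pure-Python sketch of the FCM algorithm (an illustration of the method, not the Matlab implementation; the defaults follow the options listed: m = 2, q = 100, e = 1e-5):

```python
import math
import random

def fcm(data, cluster_n, m=2.0, max_iter=100, eps=1e-5, seed=0):
    """Minimal fuzzy c-means mirroring the interface described above:
    returns (centers, U, obj_fcn).  data is a list of feature vectors."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # random initial fuzzy partition matrix, columns normalized to sum to 1
    U = [[rng.random() for _ in range(n)] for _ in range(cluster_n)]
    for k in range(n):
        col = sum(U[i][k] for i in range(cluster_n))
        for i in range(cluster_n):
            U[i][k] /= col
    obj_fcn = []
    for _ in range(max_iter):
        # update cluster centers as membership-weighted means
        centers = []
        for i in range(cluster_n):
            w = [U[i][k] ** m for k in range(n)]
            sw = sum(w)
            centers.append([sum(w[k] * data[k][d] for k in range(n)) / sw
                            for d in range(dim)])
        # squared Euclidean distances, floored to avoid division by zero
        d2 = [[max(math.dist(data[k], centers[i]) ** 2, 1e-12)
               for k in range(n)] for i in range(cluster_n)]
        obj_fcn.append(sum(U[i][k] ** m * d2[i][k]
                           for i in range(cluster_n) for k in range(n)))
        # update memberships from the standard FCM formula
        for k in range(n):
            for i in range(cluster_n):
                U[i][k] = 1.0 / sum((d2[i][k] / d2[j][k]) ** (1.0 / (m - 1))
                                    for j in range(cluster_n))
        if len(obj_fcn) > 1 and abs(obj_fcn[-1] - obj_fcn[-2]) < eps:
            break
    return centers, U, obj_fcn
```

For the case study below one would pass the 100 × 8 state matrix S as `data` and `cluster_n = 3`.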
4 Software Implementation of the FCM Algorithm in Matlab

An IAS comprising 100 agents is considered. The period of observation of the operation of each agent is T. Time is assumed to be discrete: t = 1, …, T. It is necessary to determine how effectively an IAS with such an agent composition is capable of solving non-standard, creative tasks. If the matrix of centers of the resulting fuzzy clusters contains approximately the same values, then such a system is considered effective. Control
influences vit can be applied to agents of the evasive and coercive types, as a result of which conditions will be created for harmonizing their creative decisions. Let the agent state estimates form the matrix S (100 × 8) and be contained in an external file: fcmdata_2.dat. A fragment of the original data is shown in Fig. 1.
Fig. 1. Initial data for modeling
The program code that implements the FCM fuzzy clustering algorithm is shown in Fig. 2. It is implemented as an m-file: Clastering_2.m.
Fig. 2. FCM algorithm code
The graph (Fig. 3) displays the data matrix values for the first two states.
Fig. 3. The scattering diagram of agents in IAS for the first two states and the result of fuzzy clustering
A total of 8 states of intelligent agents were investigated. A fragment of the obtained numerical characteristics center, U, and obj_fcn is shown in Fig. 4.
Fig. 4. Results of solving the fuzzy clustering problem in the command window of the Matlab system
Fig. 5. Visualization of agent states after control influences
As can be seen from the calculations, the coordinates of the centers of the clusters are quite close to each other. This indicates the possibility of adaptation of intelligent agents to each other and the flexibility of changing their states when solving non-standard, creative tasks. Using certain control influences on the agents, it is possible to achieve a change in their states (Fig. 5), which leads to an increase in the efficiency of the IAS. In Figs. 3 and 5, agents of the compromise type are highlighted in green, agents of the coercive type in red, and agents of the evasive type in blue.
5 Conclusions

The results of fuzzy clustering are approximate and can only serve as a preliminary structuring of the information contained in the set of source data. When solving fuzzy clustering problems, one must keep in mind the features and limitations of the process of measuring the features of the clustered objects. Since fuzzy clusters are formed on the basis of the Euclidean metric, the corresponding attribute space must satisfy the axioms of a metric space. The coincidence of cluster centers indicates the coordination of the activities of the intelligent agents. The application of “soft” control influences to intelligent agents, as a rule, gives good results. In 97% of cases, such influences ensure the achievement of the IAS equilibrium
state, contributing to the solution of complex, creative tasks with sufficient accuracy and speed. The “soft” control influences include:

(1) timely identification of controversial issues for antagonist agents;
(2) emphasis on mutual interests;
(3) an objective assessment of the situation during negotiations;
(4) convincing the antagonist agents that each of them will receive a greater gain if they focus on solving the problem.
The art of management in the IAS also consists in showing its participants that broadcasting truthful information is much more profitable for them than broadcasting false information.

Acknowledgments. The reported study was funded by RFBR according to research project No. 17-01-00817A.
References

1. Barrat, J.: Our Final Invention: Artificial Intelligence and the End of the Human Era (Book Review), 336 p. New York Journal of Books. Accessed 30 Oct 2013
2. Mutovkina, N.Yu.: Methods of Coordinated Control in Active Systems: Monograph, 164 p. Tver State Technical University, Tver (2018)
3. Mutovkina, N.Y., Kuznetsov, V.N., Klyushin, A.Y.: Autom. Remote Control 76, 1088 (2015). https://doi.org/10.1134/S0005117915060120
4. Yudkowsky, E.: Friendly artificial intelligence. In: Eden, A., Moor, J., Søraker, J., et al. (eds.) Singularity Hypotheses: A Scientific and Philosophical Assessment. The Frontiers Collection, pp. 181–195. Springer, Berlin (2012). https://doi.org/10.1007/978-3-642-32560-1_10. ISBN 978-3-642-32559-5
5. Babosov, E.M.: Conflictology: Tutorial, 2nd edn, 446 p. TetraSystems, Minsk (2000)
6. Myerson, R.B.: Game Theory: Analysis of Conflict, 568 p. Harvard Univ. Press, London (2001)
7. Novikov, D.A.: Mathematical Models of the Formation and Functioning of Teams, 184 p. Publishing House of Physical and Mathematical Literature (2008)
8. Majeed, N., Shah, K.A., Qazi, K.A., Maqsood, M.: Performance evaluation of I.T project management in developing countries. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 5(4), 68–75 (2013). https://doi.org/10.5815/ijitcs.2013.04.08
9. Popov, A.I.: The Solution of Creative Professional Tasks: Textbook, 80 p. Publishing House of the Tambov State Technical University, Tambov (2004)
10. Tyurin, P.T.: Creative tasks: variety of solutions and ambiguity of interpretations. Siberian Pedagogical J. 3, 218–227 (2013)
11. Gorodetsky, V.I., Grushinsky, M.S., Khabalov, A.V.: Multi-agent systems (review). News Artif. Intell. 2, 64–116 (1998)
12. Bellman, R., Zadeh, L.: Decision making in vague conditions. In: Shakhnova, I.F. (ed.) Questions of Analysis and Decision-Making Procedures, 230 p., pp. 172–215. Mir (1976)
13. Kuznetsov, V.N., Klyushin, A.Yu., Mutovkina, N.Yu., Kuznetsov, G.V.: Models and methods of coordinated control in multi-agent systems. Softw. Prod. Syst. 4, 231–235 (2012)
14. Novikov, D.A.: Control Mechanisms, 192 p. Lenand (2011)
248
N. Yu. Mutovkina and V. N. Kuznetsov
15. Chkhartishvili, A.G.: Agreed information management. Problems Manag. 3, 43–48 (2011) 16. Leonenkov, A.V.: Fuzzy Modeling in MATLAB and fuzzyTECH, 736 p. SPb, BHVPetersburg (2005) 17. Aneja, D., Rawat, T.K.: Fuzzy clustering algorithms for effective medical image segmentation. Int. J. Intell. Syst. Appl. (IJISA), 5(11), 55–61 (2013). https://doi.org/10.5815/ijisa. 2013.11.06
Parallel Hybrid Genetic Algorithm for Solving Design and Optimization Problems
L. A. Gladkov, N. V. Gladkova, and E. Y. Semushin
Southern Federal University, Taganrog, Russia
[email protected], [email protected]
Abstract. The paper considers the problem of building a hybrid algorithm for solving optimization design tasks on the basis of integrating different methods of computational intelligence. The authors describe the definition of and the main approaches to building hybrid systems and demonstrate the possibilities of integrating evolutionary design and multi-agent system methods. Different approaches to the evolutionary design of agents are considered. Different methods of parallelizing the computational process and the main models of parallel genetic algorithms, with their benefits and shortcomings, are described and analyzed. A hybrid parallel genetic algorithm for the search and optimization of design decisions is developed. The algorithm is implemented as a software subsystem and investigated in terms of its effectiveness.
Keywords: Design tasks · Bioinspired algorithms · Hybrid methods · Parallel genetic algorithm · Multi-agent systems
1 Introduction
Modern trends in scientific and technical development include the constantly growing complexity of management systems and technological processes. This stimulates developers to create and implement new instruments for improving the effectiveness of information systems. Another aspect of the development of information systems involves high-performance computations (supercomputers, computational clusters, GRID and cloud computing, computational jungles), which are characterized as parallel and distributed [1–3].
Synergetics is a multidisciplinary branch of science which investigates the common regularities of transitions from chaos to order and vice versa in open non-linear systems of different nature (technical, economic, social, etc.). Synergetics is based on the similarity of mathematical models and ignores the different nature of the described systems. This distinguishes synergetics from other scientific fields standing on the border of different disciplines, where one discipline gives the subject and another gives the research method to a new discipline [4].
The need for the synthesis of multidisciplinary research manifests itself not only in the appearance and development of studies in synergetics. One of the leading trends defining the development of science is the growing popularity of integrated and hybrid systems. A hybrid artificial system can be
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020
Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 249–258, 2020. https://doi.org/10.1007/978-3-030-39216-1_23
considered as a system that contains two or more integrated heterogeneous subsystems (of different kinds), which are united by cooperative actions [5]. The aim of this research is to develop a hybrid algorithm for the effective solution of complex multi-criteria design and optimization problems.
2 Hybridization as a Paradigm of Computational Intelligence
Hybrid systems are composed of different elements united to achieve the desired goals. Integration and hybridization of different methods and information technologies allow us to solve complex tasks which cannot be solved by any separate method or technology. In a hybrid architecture that unites several paradigms, the effectiveness of one approach can compensate for the weaknesses of others. By combining several approaches, we can avoid the shortcomings of the separate methods. This phenomenon is referred to as a synergetic effect [4].
Unfortunately, the artificial systems created by people are not usually able to develop, self-organize, dynamically change their structure, and adapt to changing environmental conditions. Thus, the main task for developers of new information systems and data processing technologies is to build into the designed systems the ability to adapt, to self-organize, and to use the experience accumulated by nature and previous generations. In this connection, it is well justified to use methods and approaches inspired by nature when creating decision support systems. Such methods include evolutionary computations and swarm intelligence [6, 7].
From a conceptual point of view, the created information systems, as well as search and decision support systems, can be classified as hybrid artificial systems, i.e. human-created systems which combine artificial and natural subsystems [8]. They can also be considered as objective-oriented systems whose main functions are based on factors of practicability.
A model corresponding to the level of evolutionary systems can be represented as follows [9]:

SYS = (GN, KD, MB, EV, FC, RP),

where GN denotes the genetic origin (creation of the initial set of decisions); KD denotes the existence conditions; MB denotes the exchanging events (evolutionary and genetic operators); EV denotes the development (strategy of evolution); FC denotes functioning; RP denotes reproduction.
A scientific direction combining the principles of fuzzy control with the search abilities of bioinspired methods is also actively developed today. We can distinguish two main approaches to using such hybrid methods [10, 11]:
1. To use evolutionary algorithms for solving problems of optimization and search under fuzzy, ambiguous or insufficient information on the object, parameters or criteria of the solved tasks, i.e. to use systems based on fuzzy rules (genetic fuzzy rule-based systems, GFRBS). In this case, hybrid systems are
used for training and adjusting different components of a fuzzy rule system: automated generation of the GFRBS knowledge base, its testing, and adjustment of its output function [12].
2. To use methods based on fuzzy logic for modeling the structure and operators of evolutionary algorithms, and for managing and adapting the parameters of evolutionary algorithms [12].
This paper follows the second approach. To control and adjust the parameters of the evolutionary algorithm, we propose including a fuzzy logic controller (FLC), which uses the experience and knowledge of experts in the considered area and can change the search parameters dynamically [11, 12]. The rule production system performs logical inference on the basis of reasoning and knowledge, and the inference result is transformed into controlling actions after defuzzification. A change of the algorithm parameters results in a change of the search process and in new results, which are transformed into fuzzy sets in the fuzzification block [10].
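As an illustration of this second approach, the following sketch shows how a simple Mamdani-style fuzzy controller could map two search statistics (population diversity and recent fitness improvement, both hypothetically scaled to [0, 1]) to a corrected mutation rate. The membership functions, rules and output values here are invented for illustration and are not the ones used in the paper.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def flc_mutation_rate(diversity, improvement):
    """Map fuzzified search statistics (both in [0, 1]) to a mutation-rate
    value via simple Mamdani-style rules with weighted-average defuzzification."""
    low_d = tri(diversity, -0.5, 0.0, 0.5)     # "diversity is low"
    high_d = tri(diversity, 0.5, 1.0, 1.5)     # "diversity is high"
    stalled = tri(improvement, -0.5, 0.0, 0.5) # "progress has stalled"
    # Rule firing strengths paired with output singletons (mutation rates)
    rules = [
        (min(low_d, stalled), 0.9),  # low diversity and stalled: raise mutation
        (high_d, 0.1),               # high diversity: lower mutation
        (1.0 - stalled, 0.3),        # steady progress: keep a moderate rate
    ]
    num = sum(w * y for w, y in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.3

print(round(flc_mutation_rate(0.1, 0.05), 3))  # → 0.833
```

A converged, stagnating population thus receives a high mutation rate, while a diverse, still-improving one keeps a low rate; the same pattern extends to crossover probability or migration frequency.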
3 Evolutionary Design and the Agent Approach
One of the most promising approaches to organizing search and optimization structures is using intelligent agents and multi-agent architectures [1, 13]. Holland [14] (one of the founders of the genetic algorithms theory) defined the agent as an artificial organism that exists and evolves in a population of similar organisms and tries to survive under the competitive conditions ("natural selection") of self-improvement and adaptation to the external environment. This definition includes the key set of characteristics (evolution, natural selection, a population of similar individuals, etc.) which also pertain to the terms "genetic algorithm" and "evolutionary design".
There are different approaches to the evolutionary design of agents and multi-agent systems (MAS), which are based on different models of evolution [13]. Let us consider the evolutionary design of the agent as the process of forming its hereditary variation and its evolutionary adaptation to the external environment. That means that evolutionary design can be defined as a process of formation and development of the genotype and phenotype of the agent. The genotype corresponds to all hereditary (genetically determined) information obtained from the parents of the agent. The phenotype contains a set of structures of the agent, determined by the contextual rules resulting from the genotype development in a certain environment.
It is also required to process qualitative fuzzy information and to consider different strategies and computer models of evolution. Evolutionary theory and modeling together with fuzzy logic allow us to create an algorithm for determining the agents' interaction. The agents are characterized by parameters defined in the interval [0, 1]; thus, using fuzzy logic, we can modify the genetic operators and the mutation operator in the algorithm.
As a result of the crossing-over algorithm applied to the parent-agents, we obtain child-agents, which compose a family (an agency) in conjunction with the parent-agents. We propose using graphs to represent the agents and the common structure of the MAS.
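The paper does not spell out the crossover operator, so the sketch below uses a standard blend (BLX-α) crossover as a stand-in: since agent parameters lie in [0, 1], each child gene is sampled from a fuzzily widened interval between the parent genes, and the resulting child-agents are grouped with their parents into an agency.

```python
import random

def fuzzy_crossover(p1, p2, alpha=0.3):
    """Blend (BLX-alpha) crossover for agents whose parameters lie in [0, 1]:
    each child gene is drawn from the interval between the parent genes,
    widened by a fuzzy margin, then clipped back into [0, 1]."""
    child = []
    for g1, g2 in zip(p1, p2):
        lo, hi = min(g1, g2), max(g1, g2)
        spread = alpha * (hi - lo)               # fuzzy widening of the interval
        gene = random.uniform(lo - spread, hi + spread)
        child.append(min(1.0, max(0.0, gene)))   # keep parameters in [0, 1]
    return child

parents = ([0.2, 0.8, 0.5], [0.4, 0.6, 0.9])
# Parent-agents and their child-agents together form a family (an agency)
agency = {"parents": parents,
          "children": [fuzzy_crossover(*parents) for _ in range(2)]}
```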
There are several models of evolution that are mostly used in computer science. All of them describe separate aspects of evolution. To select the common scheme and model of evolution for a certain task, it is necessary to investigate these aspects and problems and to consider other modern evolutionary research.
The population of agents is represented as an evolving multi-agent system (EMAS) with a certain set of parameters. To build the model, we propose using a modified genetic algorithm for finding effective structures of interaction between the agents and for forming the child-agents. The genetic algorithm plays the role of an upper-level coordinator, which imposes restrictions on the actions of the whole agency. The analysis of the results of these restrictions allows us to accumulate useful features in the population and to form the structures of the agents that are most suitable for a concrete situation.
4 Parallel Genetic Algorithm
The abilities and effectiveness of evolutionary algorithms are significantly enhanced when the search process is considered as the evolution of several separate populations. To organize such a process, we can successfully use the principles of distributed computation. Unlike usual genetic algorithms, where the evolution process is the evolvement of a single population of decisions, parallel genetic algorithms work with several separate subpopulations simultaneously. These subpopulations are characterized by a certain morphologic (structural) similarity of the individuals in each subpopulation [15, 16]. Genetic operators are also applied within each subpopulation. Partitioning the population into several smaller subpopulations allows us to reduce the execution time of the computations and to improve the management of the evolutionary process in comparison with the usual genetic algorithm.
In terms of the parallelizing process, we can distinguish three main groups of parallel genetic algorithms: global single-population genetic algorithms (master-slave GAs); single-population genetic algorithms (fine-grained GAs); and multi-population genetic algorithms (coarse-grained GAs) [15, 17]. For the tasks considered here, multi-population genetic algorithms are the most interesting. The main population is divided into several subpopulations of rather large size. Decisions are exchanged between them by means of the migration operator. In this case, a very important aspect is the simplicity of transforming a genetic algorithm with sequential computations into a multi-population parallel genetic algorithm: we add the mechanism of exchange between the subpopulations, while the evolutionary process of the whole population does not change.
The paper analyzes the abilities of the most popular models of parallel genetic algorithms, including the island, buffer and cellular models [15, 17].
The cellular model is based on the following ideas. Each decision (individual) can be represented as a cell in a certain closed space. Each individual can interact with its neighbors (above, below, left or right). Each cell represents only one decision. At the beginning, all decisions are assigned random values of the fitness function. After several generations, the individuals are united into areas in accordance with
the "similarity" of their fitness function values. Subsequently, the size of the areas grows, and competition between different areas occurs.
The island model (Fig. 1) is based on dividing the population into a certain number of subpopulations (islands). Each of them evolves individually. In the evolutionary process the islands exchange individuals using the migration operator. In the island model, it is necessary to choose the correct frequency of applying the migration operator. This is very important since the subpopulation size is rather small, i.e. we need to keep a balance between population diversity and convergence speed. If the exchange is applied too often, the quality of the decisions decreases, and the parallel algorithm reduces to the usual sequential one. If migrations are too rare, the computation process can converge prematurely.
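A minimal sketch of the island model, assuming a toy fitness function (sum of genes in [0, 1]) and a ring migration topology; the elitist selection scheme, the mutation step and the migration policy here are illustrative stand-ins, not the paper's operators.

```python
import random

def island_ga(n_islands=4, pop_size=20, genome=8, generations=50, migrate_every=10):
    """Island-model GA sketch: each subpopulation evolves independently;
    every `migrate_every` generations the best individual of each island
    replaces the worst individual of the next island in a ring."""
    fitness = lambda ind: sum(ind)  # toy maximization problem
    islands = [[[random.random() for _ in range(genome)] for _ in range(pop_size)]
               for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        for pop in islands:  # one elitist evolution step per island
            pop.sort(key=fitness, reverse=True)
            elite = pop[:pop_size // 2]
            pop[pop_size // 2:] = [[min(1.0, g + abs(random.gauss(0, 0.05)))
                                    for g in random.choice(elite)]
                                   for _ in range(pop_size - len(elite))]
        if gen % migrate_every == 0:  # migration operator (ring topology)
            best = [max(pop, key=fitness) for pop in islands]
            for i, pop in enumerate(islands):
                worst = min(range(pop_size), key=lambda j: fitness(pop[j]))
                pop[worst] = list(best[i - 1])
    return max((max(pop, key=fitness) for pop in islands), key=fitness)
```

Raising `migrate_every` keeps the islands diverse longer; lowering it makes the search behave more like a single sequential population, mirroring the trade-off described above.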
Fig. 1. The island model
The principles of the buffer model (Fig. 2) can be represented as a "star" graph. This model has a central block (buffer) which implements the interaction between the different subpopulations. During the algorithm execution, the buffer is filled with chromosomes (decisions) from the different subpopulations. Under certain pre-determined conditions, one of the existing subpopulations can access the buffer and take chromosomes. At the end of the evolutionary cycle, it returns the requested number of individuals back to the buffer.
Fig. 2. The buffer model
Each subpopulation evolves separately. At each iteration, the algorithm verifies the conditions required for an exchange between the subpopulations (the migration operator). If the conditions are fulfilled, an exchange between the current subpopulation and the buffer is performed. After the exchange, the algorithm checks the current size of the buffer. If the number of chromosomes exceeds the available limit, the algorithm applies the selection operator to reduce the buffer to a permissible size. The migration conditions and the types of genetic operators can differ between subpopulations. The main benefits of this model include the flexibility of its organization and the ability to adjust the parameters individually [18].
As a result of the analysis of the available models of organizing parallel computations, we have chosen the island and buffer models for organizing the evolutionary process in the developed algorithm.
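The buffer-model exchange can be sketched as follows; the deposit/withdraw policy and the selection-based trimming rule are illustrative assumptions, not the paper's exact mechanism.

```python
import random

def migrate_via_buffer(pop, buffer, fitness, k=2, buffer_limit=10, take=False):
    """Buffer-model exchange sketch: the subpopulation deposits copies of its
    k best chromosomes into the shared buffer; when `take` is set it also
    withdraws k random chromosomes, replacing its own worst individuals.
    If the buffer outgrows `buffer_limit`, selection trims it back to size."""
    pop.sort(key=fitness, reverse=True)
    buffer.extend(list(c) for c in pop[:k])      # deposit best decisions
    if take:
        for i in range(min(k, len(buffer))):     # withdraw random migrants
            migrant = buffer.pop(random.randrange(len(buffer)))
            pop[-(i + 1)] = migrant              # replace the worst individuals
    if len(buffer) > buffer_limit:               # trim buffer by selection
        buffer.sort(key=fitness, reverse=True)
        del buffer[buffer_limit:]
```

Each island calls this independently, so the buffer decouples the subpopulations: they never need to synchronize with each other directly, only with the central block.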
5 Description of the Developed Algorithm
Let us present a parallel multi-population genetic algorithm for finding and optimizing decisions. It assumes the parallel performance of the evolutionary process over several populations. As mentioned above, to organize the process of parallel evolution, we used the island and buffer models of the parallel genetic algorithm. The structure of the developed integrated parallel algorithm is demonstrated in Fig. 3.
Synchronization of the asynchronous processes is implemented at the migration points. The migration points are determined by the occurrence of certain asynchronous events, which can appear in the process of evolution of each subpopulation. Upon the occurrence of such an event in a subpopulation, another randomly chosen process is stopped. One of the nearby populations is chosen to implement the migration. After that, the migration operator is applied to the population where the event occurred and to the population of the randomly chosen process.
To improve the execution speed of the algorithm, the paper presents a scheme of multi-threading on the local level for calculating the fitness function values. The fitness function of each separate chromosome is calculated in an individual thread. The population is separated into S/N blocks of chromosomes, where S denotes the population size and N denotes the pre-assigned number of threads. Each block contains N chromosomes. Within a block, the fitness function of each chromosome is calculated in an individual thread. After all the running threads have finished, we start calculating the fitness functions of the chromosomes in the next block. This process continues until the fitness function values are calculated for all chromosomes.
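The block-wise multi-threading scheme described above can be sketched with a thread pool: the population is processed in blocks of N chromosomes, one thread per chromosome, and the next block starts only after the current one has finished (the helper name and toy fitness are our own).

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_population(population, fitness, n_threads=4):
    """Block-parallel fitness evaluation sketch: split the population into
    S/N blocks of N chromosomes; within a block, each chromosome's fitness
    is computed in its own thread, and the next block is started only after
    every thread of the current block has finished."""
    values = []
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for start in range(0, len(population), n_threads):
            block = population[start:start + n_threads]
            values.extend(pool.map(fitness, block))  # one thread per chromosome
    return values
```

Note that for pure-Python fitness functions the interpreter lock limits the speedup; the scheme pays off when fitness evaluation releases the lock (e.g. native code or I/O), which is the situation the paper's timing experiments target.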
Fig. 3. The scheme of the parallel hybrid algorithm
To regulate the frequency of migration, the paper proposes using a fuzzy logic controller (FLC), whose input receives information on the effectiveness of the evolutionary process. A hybrid migration operator allows us to execute the migration only when it is needed, e.g. under conditions of premature convergence. The FLC transforms the initial parameters into fuzzy form. On the basis of the available knowledge and rules, the FLC determines the controlling action and returns the corrected values of the control parameters.
To train the neuro-fuzzy logic controller, we propose using the genetic algorithm. The results of the fuzzy control module are defined by the parameters of the membership functions of the input variables (x_i^k, σ_i^k) and a coefficient (y^k) used to calculate the output value.
The developed hybrid parallel algorithm was investigated on two hardware configurations: Intel® Core™ i7-3630QM CPU @ 2.40 GHz, 8 GB RAM (configuration 1); Intel® Core™ 2 Quad CPU Q8200 @ 2.40 GHz, 4 GB RAM (configuration 2). The research includes several experiments with the number of elements ranging from 100 to 3000 with a step of 100. The number of connections, the number of iterations and the population size were kept constant. The time complexity of the proposed algorithm is quadratic. The results of the experiments are represented as diagrams of the average execution time against the total number of elements in Fig. 4.
Fig. 4. The dependence of the execution time of the algorithm on the number of elements
To improve the effectiveness and control the parameters of the genetic algorithm, we used the fuzzy logic controller (FLC). The execution time when using the FLC exceeds the execution time without the FLC by no more than 0.5% for the same number of elements. However, the FLC allows us to improve the quality of the obtained decisions on average by 25% in comparison with the sequential genetic algorithm with the same number of iterations (Fig. 5). The effectiveness of the FLC is further improved after the introduction of the training block based on an artificial neural network model.
Fig. 5. The quality of the solutions obtained with and without the FLC
6 Conclusions
The paper presents an architecture of parallel genetic search on the basis of the buffer and island models. The algorithm uses a hybrid mutation operator whose application is managed by a fuzzy control module. The authors developed the structure of the fuzzy control module on the basis of integrating artificial neural networks and fuzzy control to change the management parameters of the parallel genetic algorithm dynamically. The paper proposes a scheme of interaction between the genetic algorithm and the fuzzy control module.
The authors developed a software and algorithmic complex for solving design optimization tasks. On the basis of this complex, we carried out a set of computational experiments to determine the effectiveness of the developed algorithms using benchmarks. The paper presents the time complexity of the developed algorithm and the results of applying the neuro-fuzzy logic controller and the controller training block.
Acknowledgments. This research is supported by a grant from the Russian Foundation for Basic Research (project 17-01-00627).
References
1. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs (2003)
2. Luger, G.F.: Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 6th edn. Addison Wesley, Boston (2009)
3. Tawfeek, M.A., Elhady, G.F.: Hybrid algorithm based on swarm intelligence techniques for dynamic tasks scheduling in cloud computing. Int. J. Intell. Syst. Appl. (IJISA) 8(11), 61–69 (2016)
4. Haken, H.: The Science of Structure: Synergetics. Van Nostrand Reinhold, New York (1981)
5. Glagkov, L.A., Glagkova, N.V., Legebokov, A.A.: Organization of knowledge management based on hybrid intelligent methods. In: Proceedings of the 4th Computer Science On-Line Conference 2015, Vol. 3: Software Engineering in Intelligent Systems, vol. 349, pp. 107– 113 (2015) 6. Gladkov, L.A., Kureychik, V.M., Kureychik, V.V., Sorokoletov, P.V.: Bioinspirirovannye metody v optimizatsii. Phizmatlit, Moscow (2009) 7. Prajapati, P.P., Shah, M.V.: Performance estimation of differential evolution, particle swarm optimization and cuckoo search algorithms. Int. J. Intell. Syst. Appl. (IJISA) 10(6), 59–67 (2018) 8. Prangishvili, I.V.: Sistemnyy podkhod i obshchesistemnye zakonomernosti. SINTEG, Moscow (2000) 9. Borisov, V.V., Kruglov, V.V., Fedulov, A.S.: Nechetkie modeli i seti. Goryachaya liniya – Telekom, Moscow (2007) 10. Herrera, F., Lozano, M.: Fuzzy adaptive genetic algorithms: design, taxonomy, and future directions. Soft. Comput. 7(8), 545–562 (2003) 11. Gladkov, L.A., Gladkova, N.V., Gromov, S.A.: Hybrid fuzzy algorithm for solving operational production planning problems. In: Advances in Intelligent Systems and Computing. vol. 573, pp. 444–456. Springer (2017) 12. Michael, A., Takagi, H.: Dynamic control of genetic algorithms using fuzzy logic techniques. In: Proceedings of the 5th International Conference on Genetic Algorithms, pp. 76–83. Morgan Kaufmann (1993) 13. Tarasov, V.B.: Ot mnogoagentnykh sistem k intellektual’nym organizatsiyam. Editorial URSS, Moscow (2002) 14. Holland, J.H.: Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor (1975) 15. Alba, E., Tomassini, M.: Parallelism and evolutionary algorithms. IEEE Trans. Evol. Comput. 6, 443–461 (2002) 16. Praveen, T., Arun Raj Kumar, P.: Multi-objective memetic algorithm for FPGA placement using parallel genetic annealing. Int. J. Intell. Syst. Appl. (IJISA) 8(4), 60–66 (2016) 17. 
Xiong, Z., Zhang, Y., Zhang, L., Niu, S.: A parallel classification algorithm based on hybrid genetic algorithm. In: Proceedings of the 6th World Congress on Intelligent Control and Automation, Dalian, China, pp. 3237–3240 (2006) 18. Gladkov, L.A., Gladkova, N.V., Leiba, S.N., Strakhov, N.E.: Development and research of the hybrid approach to the solution of optimization design problems. In: Advances in Intelligent Systems and Computing, vol. 875, pp. 246–257. Springer, Cham (2019)
Optimal Real-Time Image Processing with Imperfect Information on Convolution-Type Distortion
Peter Golubtsov
1 Lomonosov Moscow State University, Moscow, Russia
[email protected]
2 National Research University Higher School of Economics, Moscow, Russia
3 Russian Institute for Scientific and Technical Information (VINITI RAS), Moscow, Russia
Abstract. We study the problem of designing an optimal two-dimensional circularly symmetric convolution kernel (or point spread function, PSF) with a circular support of a chosen radius R. Such a function will be optimal for estimating an unknown signal (image) from an observation obtained through a convolution-type distortion with additive random noise. This technique is then generalized to the case of an imprecisely known or random PSF of the measurement distortion. It is shown that the construction of the optimal convolution kernel reduces to a one-dimensional Fredholm equation of the first or second kind on the interval [0, R]. If the reconstruction PSF is sought in a finite-dimensional class of functions, the problem naturally reduces to a finite-dimensional optimization problem or even a system of linear equations. We also analyze how the reconstruction quality depends on the radius of the convolution kernel. This allows finding a good balance between computational complexity and the quality of the image reconstruction.
Keywords: Optimal image processing · Convolution kernel · Deconvolution · Circular symmetry · Point spread function · Fredholm equation · Imperfect information on distortion · Isotropic homogeneous random field
1 Introduction
The paper studies optimal image processing problems for imaging systems on a continuous field of view. Such systems are often invariant under the group of plane motions, i.e., translations and rotations. The processing problem is formulated as a problem of finding an optimal reconstruction map from a class of maps with a given support of its point spread function (PSF), which defines the algorithm for processing the observed image. It turns out that under certain, pretty general, conditions the optimal reconstruction map will also be invariant (Filatova and Golubtsov 1991). The restriction to the class of invariant mappings allows us to substantially narrow the class in which the optimal reconstruction map is constructed, which essentially simplifies the solution of the problem.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020
Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 259–273, 2020. https://doi.org/10.1007/978-3-030-39216-1_24
This paper develops the approach considered in (Filatova and Golubtsov 1991, 1992, 1993) for discrete imaging systems, which used invariance with respect to shifts, rotations and reflections. It was shown that the wider the group of transformations with respect to which the measuring system is invariant, the narrower the class of invariant reconstruction mappings, which simplifies constructing the optimal mapping. However, the use of space homogeneity is often limited by the homogeneity inherent in the field of view, namely, by the discreteness of the field of view. For example, a hexagonal grid has symmetry with respect to the group of rotations, shifts, and reflections that translate this grid into itself, which in turn is determined by the symmetry of a regular hexagon (Filatova and Golubtsov 1993).
In this paper, an attempt is made to develop this approach for the plane in the continuous case. This allows us to remove the restrictions on the symmetry groups of measuring systems and, thus, to exploit the possible invariance maximally (Golubtsov et al. 2003). A detailed study of the properties of optimal invariant estimates in image analysis problems is contained in (Pyt'ev 1973).
In the majority of studies an image is represented by a function defined on a discrete finite grid, usually of a rectangular shape. In this paper we will assume that an image is a function on a continuous infinite plane. Formally, an image is modeled as a random field with a given correlation structure. Under the assumptions of a Gaussian Markov random field, the correlation and mean completely describe the stochastic properties of the image (Rue and Held 2005). The books (Cressie 1993; Rue and Held 2005) provide an overview of the history and current statistical methods for spatial data. The image forming system is described by a convolution-type integral transformation on the plane.
The kernel of such a transformation depends only on the distance between points on the plane and is determined by some function a given on an interval. The goal is to construct the kernel of the processing integral transformation r, which also depends only on the distance, vanishes outside a given radius ρ_r, and minimizes the average error in estimating the signal f. This distinguishes this approach from the classical approaches, where such a possibility of narrowing the class of admissible processing transformations is not considered (see, for example, (Pyt'ev 1973)).
As shown in this paper, the high level of symmetry of the imaging system allows us to reduce the problem of constructing the kernel of the optimal processing transformation for images on the plane to a one-dimensional Fredholm equation of the first or second kind on the interval [0, ρ_r]. We also analyze how the reconstruction quality depends on the radius of the convolution kernel. This allows finding a good balance between the quality of the image reconstruction and the computational complexity, both of constructing the transformation r and of applying it to a processed image. This technique is then generalized to the case of an imprecisely known or random PSF of the measurement distortion. The problem is reduced to similar integral equations with modified ingredients.
Since the Fredholm equation is an infinite-dimensional problem, its numeric solution requires transforming it to a finite-dimensional approximation. Therefore, it could be
attractive from the very beginning to search for the reconstruction PSF r in a finite-dimensional class of functions. The appropriate formalism is also studied. It is shown that the corresponding problem naturally reduces to a finite-dimensional optimization problem or even a system of linear equations.
Most of the tasks of increasing the quality of images involve de-blurring (or, more generally, restoration) and de-noising processes. These two goals are conflicting and are often studied separately. For example, in (Binh and Tuyet 2015) the proposed method is divided into two steps: de-noising and de-blurring. In contrast to such approaches, ours constructs an optimal algorithm which fights both factors simultaneously, providing an optimal balance of noise and blur for the reconstructed image. The authors of (Ciulla et al. 2016) also try to deal with noise and blur simultaneously by highlighting specific features of the images, while suppressing other features using certain information about the analyzed MRI image. In our study we also use a certain, very general form of prior information about the reconstructed image. Specifically, we assume that it is a homogeneous random field with a known covariance function.
The image restoration problem becomes much more difficult when the degradation model is not known or is known imprecisely. As a result, restoration techniques (Gonzalez and Woods 2008; Jayaraman et al. 2011; Sridhar 2011; Singh and Khare 2016) can be broadly classified into blind and non-blind ones. Non-blind techniques have knowledge of the blur PSF, while blind techniques try to restore images without any knowledge of the distortion. Effectively, blind techniques extract certain information about the blur PSF from the image itself and use this approximation for image processing (see, e.g., Sasi and Jayasree 2016).
However, since such information about the blur PSF is inherently imprecise, the reconstructed image may suffer from various artifacts, such as the ringing effect (Gupta and Porwal 2016). To accurately use the imprecise knowledge about the distortion, we directly introduce such inaccuracy into the problem formulation in Sect. 5. Let us also mention that fairly accurate information about the distortion can be obtained from specially arranged calibration observations of known signals (Golubtsov and Starikova 2002), for example, from the observation of a sharp edge (Joyce 2016).
2 Imaging System Model

Consider an imaging system with a convolution-type distortion:

$$g = a * f + e,$$

where the unknown image f is a function on ℝ²; a is a point spread function (PSF), representing distortion of the imaging system; e is an additive noise, represented by a random field on ℝ²; * is the convolution operation; and g is an observed image, a function on ℝ².
P. Golubtsov

2.1 Circularly Symmetric PSF
A widely used model for blurring (Hansen 2010; Jain 1989; Vogel 2002; Epstein 2008) describes it as a linear filter that maps an ideal image f to a blurred image a * f, i.e., by the convolution on ℝ², which has the form

$$(a * f)(x) = \iint_{\mathbb{R}^2} a(y)\, f(x - y)\, dy, \quad x \in \mathbb{R}^2.$$
A system that is not biased with respect to spatial orientation is said to exhibit isotropic blur, so that, in the convolution model, the PSF is circularly symmetric. In fact, many parametrically modeled PSFs exhibit such symmetry (Doering et al. 1992; Jain 1989; Kundur and Hatzinakos 1996; Watson 1993). More precisely, we will assume that the PSF a is a circularly symmetric function

$$a(x) = a(\|x\|), \quad x \in \mathbb{R}^2,$$

with a bounded support on the plane ℝ², i.e.,

$$a(x) = 0 \quad \text{if} \quad \|x\| > \rho_a.$$

It implies that a(x) is completely determined by its one-dimensional profile a(ρ), defined on a finite segment, i.e., ρ ∈ [0, ρ_a] = supp(a); see Fig. 1.

Fig. 1. Example of a PSF a(x) (left) and its profile a(ρ) (right).
For a circularly symmetric bounded-support PSF a the convolution with f can be expressed directly through the profile a(ρ) and written as

$$(a * f)(x) = \int_0^{\rho_a} a(\rho)\,\rho\, d\rho \int_0^{2\pi} f(x + \rho e_\varphi)\, d\varphi, \quad x \in \mathbb{R}^2.$$

Here $e_\varphi = (\cos\varphi,\ \sin\varphi)^T$ is the unit vector with the angle φ ∈ [0, 2π).
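The profile-based convolution above can be sketched numerically as follows. This is a rough midpoint-quadrature illustration of the formula; the function name and discretization parameters are ours, not from the paper.

```python
import numpy as np

def conv_profile_image(a_profile, rho_a, f, x, n_rho=100, n_phi=128):
    """Evaluate (a * f)(x) through the 1D profile a(rho):
    (a*f)(x) = int_0^rho_a a(rho) rho drho  int_0^2pi f(x + rho e_phi) dphi,
    using midpoint quadrature for both the radial and the angular integral."""
    rho = (np.arange(n_rho) + 0.5) * (rho_a / n_rho)        # radial midpoint nodes
    phi = (np.arange(n_phi) + 0.5) * (2 * np.pi / n_phi)    # angular midpoint nodes
    e_phi = np.stack([np.cos(phi), np.sin(phi)])            # unit vectors e_phi, shape (2, n_phi)
    # sample points x + rho * e_phi, shape (2, n_rho, n_phi)
    pts = x[:, None, None] + rho[None, :, None] * e_phi[:, None, :]
    # inner angular integral for every radius rho
    inner = f(pts[0], pts[1]).sum(axis=-1) * (2 * np.pi / n_phi)
    # outer radial integral with weight a(rho) * rho
    return np.sum(a_profile(rho) * rho * inner) * (rho_a / n_rho)
```

For a flat profile a(ρ) ≡ 1 on [0, 1] and a constant image f ≡ 1, the formula gives (a * f)(x) = 2π ∫₀¹ ρ dρ = π, which the sketch reproduces.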
2.2 Convolution of Symmetric PSFs
For any bounded-support circularly symmetric PSFs a(x), x ∈ [0, ρ_a], and b(x), x ∈ [0, ρ_b], their convolution a * b is circularly symmetric with a bounded support determined by the radius ρ_{a*b} = ρ_a + ρ_b. Its profile (a * b)(x) can be expressed directly through the profiles a(x) and b(x):

$$(a * b)(x) = \int_0^{\rho_a} a(y)\, y\, dy \int_0^{2\pi} b\!\left(\sqrt{x^2 + y^2 - 2xy\cos\varphi}\right) d\varphi.$$
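A direct numerical sketch of this profile-domain convolution (our own illustration with an assumed midpoint discretization; not code from the paper):

```python
import numpy as np

def profile_conv(a, rho_a, b, rho_b, x, n_y=200, n_phi=256):
    """Profile of a*b at radius x via the formula above;
    b is treated as zero outside its support [0, rho_b]."""
    y = (np.arange(n_y) + 0.5) * (rho_a / n_y)              # radial midpoint nodes
    phi = (np.arange(n_phi) + 0.5) * (2 * np.pi / n_phi)    # angular midpoint nodes
    # distances sqrt(x^2 + y^2 - 2 x y cos(phi)), guarded against tiny negatives
    d = np.sqrt(np.maximum(
        x**2 + y[:, None]**2 - 2.0 * x * y[:, None] * np.cos(phi)[None, :], 0.0))
    # enforce the bounded support of b
    bv = np.where(d <= rho_b, b(np.minimum(d, rho_b)), 0.0)
    inner = bv.sum(axis=1) * (2 * np.pi / n_phi)            # angular integral
    return np.sum(a(y) * y * inner) * (rho_a / n_y)         # radial integral
```

For flat unit profiles on [0, 1] this yields (a * b)(0) = 2π ∫₀¹ y dy = π, and the result vanishes for x > ρ_a + ρ_b, in agreement with the support radius ρ_{a*b}.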
Obviously, convolution is a commutative, associative, bilinear operation on such functions.

2.3 Assumptions About Noise and Unknown Image
We assume that the additive random noise e is represented by an isotropic homogeneous random field (Gikhman and Skorokhod 1977; Rue and Held 2005) with zero mean and a circularly symmetric covariance function:

$$E e(x) = 0, \qquad E e(x) e(y) = s(\|x - y\|).$$

In the important case of white (uncorrelated) noise e its covariance function is represented by Dirac's δ-function, i.e., s(‖x − y‖) = σ²δ(x − y). We will also assume that certain prior information about the unknown image f is available. Specifically, f is an isotropic homogeneous random field with zero mean (for simplicity) and a circularly symmetric covariance function:

$$E f(x) = 0, \qquad E f(x) f(y) = t(\|x - y\|).$$
The random fields e and f are assumed to be independent.

3 Optimal Reconstruction Problem

3.1 Linear Reconstruction of an Unknown Image

In the reconstruction problem an estimate $\hat f$ of the signal f is computed by applying the reconstruction PSF r to the observed image:

$$\hat f = r * g,$$

where r is a symmetric PSF with some fixed radius ρ_r. It implies that the PSF r(x) is completely determined by its profile r(x), x ∈ [0, ρ_r] = supp(r).
3.2 Estimation Quality
By the optimal reconstruction PSF r we mean a function that provides a minimum to the expected mean-square estimation error:

$$H(r) = E[\hat f(x) - f(x)]^2 = E[(r * g)(x) - f(x)]^2 \;\to\; \min_{r:\ \mathrm{supp}(r) = [0,\, \rho_r]}.$$

Since f and e are homogeneous random fields, H(r) does not depend on x ∈ ℝ² and is given by the following expression:

$$H(r) = (r * p * r - 2\, r * q + t)(0), \quad \text{where} \quad p = a * t * a + s, \qquad q = a * t.$$

3.3 General Case - Correlated Noise
The optimal reconstruction PSF r satisfies the Fredholm integral equation of the first kind (Vasil'eva and Tikhonov 1989):

$$P r = q,$$

or, more explicitly,

$$\int_0^{\rho_r} P(x, y)\, r(y)\, dy = q(x) \qquad \forall x \in [0, \rho_r], \tag{1}$$

where the kernel P(x, y) is defined as

$$P(x, y) = y \int_0^{2\pi} p\!\left(\sqrt{x^2 + y^2 - 2xy\cos\varphi}\right) d\varphi \qquad \forall x, y \in [0, \rho_r]. \tag{2}$$

It is well known that the Fredholm integral equation of the first kind is an ill-posed problem (Tikhonov and Arsenin 1977) and, as a result, requires special treatment in numerical solution. On the other hand, the Fredholm integral equation of the second kind is a well-posed problem.

3.4 Special Case - White Noise
In the important special case of "white noise", i.e., when s = σ²δ, the problem of constructing the optimal PSF r reduces to the Fredholm integral equation of the second kind (Vasil'eva and Tikhonov 1989) with a different kernel:

$$\sigma^2 r + \bar{P} r = q,$$

or, more explicitly,

$$\sigma^2 r(x) + \int_0^{\rho_r} \bar{P}(x, y)\, r(y)\, dy = q(x) \qquad \forall x \in [0, \rho_r], \tag{3}$$

where the new kernel $\bar{P}(x, y)$ is defined as

$$\bar{P}(x, y) = y \int_0^{2\pi} \bar{p}\!\left(\sqrt{x^2 + y^2 - 2xy\cos\varphi}\right) d\varphi \qquad \forall x, y \in [0, \rho_r], \tag{4}$$

and $\bar{p} = a * t * a$.
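A minimal Nyström-type sketch of solving the discretized Eq. (3): the interval [0, ρ_r] is sampled at quadrature nodes and the integral equation becomes a well-conditioned linear system. This is our own illustration; the profiles p̄ and q are assumed to be available as callables, and all names and grid sizes are assumptions.

```python
import numpy as np

def solve_fredholm_second_kind(p_bar, q, rho_r, sigma2, n=100, n_phi=128):
    """Solve sigma^2 r(x) + int_0^rho_r Pbar(x,y) r(y) dy = q(x)
    on midpoint nodes (Nystrom method)."""
    h = rho_r / n
    x = (np.arange(n) + 0.5) * h                            # quadrature nodes
    phi = (np.arange(n_phi) + 0.5) * (2 * np.pi / n_phi)
    xi = x[:, None, None]
    yj = x[None, :, None]
    # kernel (4): Pbar(x_i, y_j) = y_j * int_0^2pi p_bar(sqrt(...)) dphi
    d = np.sqrt(np.maximum(
        xi**2 + yj**2 - 2.0 * xi * yj * np.cos(phi)[None, None, :], 0.0))
    Pbar = x[None, :] * p_bar(d).sum(axis=-1) * (2 * np.pi / n_phi)
    # discretized eq. (3): (sigma^2 I + Pbar * h) r = q at the nodes
    r = np.linalg.solve(sigma2 * np.eye(n) + Pbar * h, q(x))
    return x, r
```

The σ² term on the diagonal is what makes the second-kind equation well posed: for p̄ ≡ 0 the solver simply returns r = q/σ², and growing σ² damps the reconstruction toward zero.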
4 Numerical Results

4.1 Optimal Processing Kernel and the Synthesized Measurement-Processing System
The convolution of the original distortion PSF a and the optimal reconstruction PSF r represents the PSF of the synthesized "measurement + processing" system r * a. Ideally, such a complete synthesized system should be close to the delta-function. The closer it is to the delta-function, the weaker the systematic distortions introduced by the synthesized system. Figures 2 and 3 show examples of the optimal reconstruction PSFs r and the corresponding synthesized PSFs r * a for the case of a flat PSF a with a slightly blurred edge. It is seen that if ρ_r is too small, r * a has a prominent circular structure, which will lead to circular artifacts on the reconstructed image. For the medium size of the reconstruction mapping support (ρ_r = 40) the circular structure of r * a becomes less visible and there is a prominent bright spot in the middle. Its radius, compared to the radius of the measurement system a, reflects the "sharpening effect" of the processing. Finally, for a large enough radius of the reconstruction mapping support (ρ_r = 50) the circular structure of r * a almost disappears and the small bright spot in the middle reflects the dramatic increase of resolution for the synthesized system.

4.2 Balancing Size of r and Reconstruction Quality

According to the problem statement, a larger radius ρ_r of the reconstruction PSF r should result in better estimation quality, i.e., smaller H_Opt. Figure 4 shows how the average estimation error H_Opt(ρ_r) drops with the increase of ρ_r. However, increasing ρ_r would also require increased computational complexity for computing the optimal r and for applying it to the observed image g. Besides, processing with a larger sliding window requires a larger area of the observation g to reconstruct the same area of the unknown image f. Figure 5 shows the optimal reconstruction PSF with a very large radius.
Fig. 2. Example: flat PSF a with a slightly blurred edge, ρ_a = 41 (left); optimal reconstruction PSF r with ρ_r = 30, 40, and 50 (middle, from top to bottom), and the synthesized PSF r * a with ρ_{r*a} = 71, 81, and 91 (right, from top to bottom), respectively.
Fig. 3. Profiles of the PSF a (dashed line), optimal reconstruction PSF r for ρ_r = 50 (top), and the synthesized PSF r * a (bottom). Curves are rescaled to be shown on the same graph.
Fig. 4. Reconstruction quality H_Opt(ρ_r) as a function of ρ_r.
Fig. 5. Profiles of the optimal reconstruction PSF r for very large ρ_r = 200 (solid) and the distortion PSF a with ρ_a = 41 (dashed).
5 Imperfect Information on the Distortion

The above approach can be used in the case where information about the PSF a is not precise. This may happen if the knowledge of a is not perfect or if a changes randomly from measurement to measurement. Specifically, suppose that the PSF a is a linear combination

$$a(x) = \sum_{i=1}^{n} a_i e_i(x), \quad x \in [0, \rho_a],$$

where e_i(x), i = 1, …, n, is a set of given linearly independent functions and the coefficients a_i are random values. The information about a is represented by the mean values and covariances

$$E a_i = \bar{a}_i, \qquad E(a_i - \bar{a}_i)(a_j - \bar{a}_j) = V_{ij}, \quad i, j = 1, \ldots, n.$$
Define the estimation error

$$H(r) = E[\hat f(x) - f(x)]^2 = E[(r * g)(x) - f(x)]^2 \tag{5}$$

the same way as before, but now the expectation E is assumed to be taken over the independent distributions of f, e, and a. It can be shown that

$$H(r) = (r * p * r - 2\, r * q + t)(0),$$

with the modified functions p and q:

$$p = \bar{a} * t * \bar{a} + w + s, \qquad q = \bar{a} * t. \tag{6}$$
Here

$$\bar{a}(x) = \sum_{i=1}^{n} \bar{a}_i e_i(x), \qquad w(x) = \sum_{i,j=1}^{n} V_{ij}\, (e_i * t * e_j)(x), \quad x \in [0, \rho_a].$$

The expectation $\bar{a}$ is used in place of the unknown a, and the additional term w reflects the effects of uncertainty in a. As a result, like in Sect. 3, construction of the optimal PSF r can be reduced to a Fredholm integral equation. Specifically, in the case of correlated noise it reduces to the Fredholm integral equation of the first kind

$$P r = q.$$

In the important special case of "white noise", the problem of constructing the optimal PSF r reduces to the Fredholm integral equation of the second kind:

$$\sigma^2 r + \bar{P} r = q,$$

but with the modified expressions for the functions p, q given in (6) and $\bar{p}$ defined as

$$\bar{p} = \bar{a} * t * \bar{a} + w.$$

The Fredholm kernels P(x, y) and $\bar{P}(x, y)$ are defined through the modified p and $\bar{p}$ by Eqs. (2) and (4).
6 Finite-Dimensional Representation of Reconstruction PSF

6.1 Finite-Dimensional Representation of the Reconstruction PSF

The Fredholm equation is an infinite-dimensional problem. Its numeric solution requires transforming it to a finite-dimensional approximation. Therefore, it could be attractive from the very beginning to search for r in a finite-dimensional form. Specifically, suppose that the PSF r is constructed in the form of a linear combination

$$r(x) = \sum_{i=1}^{m} r_i b_i(x), \quad x \in [0, \rho_r],$$

where b_i(x), i = 1, …, m, is a set of given linearly independent functions, which can be considered as a basis in the corresponding finite-dimensional subspace. Obviously, any such function r(x) is represented by the vector r = [r₁, …, r_m]ᵀ.
6.2 Optimal "Finite-Dimensional" PSF r
It can be shown that the estimation error (5) now can be represented in the form

$$H(\mathbf{r}) = \sum_{i,j=1}^{m} r_i r_j\, (b_i * p * b_j)(0) - 2 \sum_{i=1}^{m} r_i\, (b_i * q)(0) + t(0),$$

which can be written in the matrix form as

$$H(\mathbf{r}) = \langle \mathbf{r},\, P\mathbf{r} \rangle - 2 \langle \mathbf{r},\, \mathbf{q} \rangle + t(0).$$

Here P and q are an m × m matrix and an m-dimensional vector with the following components:

$$p_{ij} = (b_i * p * b_j)(0), \qquad q_i = (b_i * q)(0).$$

Minimizing H(r) in such form is a simple finite-dimensional optimization problem, which reduces to the matrix equation

$$P\mathbf{r} = \mathbf{q}.$$

It can be shown that P = Pᵀ and P ≥ 0, i.e., P is a symmetric positive semidefinite matrix. If P is nonsingular, there exists a unique optimal PSF r determined by the vector

$$\mathbf{r} = P^{-1}\mathbf{q}.$$

In the particularly important case when e is white noise and the functions b_i are orthonormal with the weight ρ, i.e., with respect to the inner product defined as

$$\langle u,\, v \rangle = \int_0^{\rho_a} u(\rho)\, v(\rho)\, \rho\, d\rho,$$

the matrix P can be written as

$$P = \bar{P} + \sigma^2 I,$$

where $\bar{P}$ is an m × m matrix with the components

$$\bar{p}_{ij} = (b_i * \bar{p} * b_j)(0).$$

Since $\bar{P}$ is symmetric positive semidefinite, the matrix P is positive definite and, as a result, nonsingular. It implies that there exists a unique optimal PSF r determined by the vector

$$\mathbf{r} = (\bar{P} + \sigma^2 I)^{-1}\mathbf{q}.$$
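A sketch of this final finite-dimensional step, assuming the matrix P̄ and the vector q have already been assembled from the basis functions (function and variable names are ours):

```python
import numpy as np

def optimal_coefficients(P_bar, q, sigma2):
    """Coefficients of the optimal finite-dimensional PSF:
    r = (Pbar + sigma^2 I)^(-1) q, with Pbar symmetric positive semidefinite."""
    P_bar = np.asarray(P_bar, dtype=float)
    q = np.asarray(q, dtype=float)
    A = P_bar + sigma2 * np.eye(P_bar.shape[0])
    # sigma^2 > 0 makes A positive definite, so Cholesky factorization succeeds
    L = np.linalg.cholesky(A)
    # two triangular solves: A r = q
    return np.linalg.solve(L.T, np.linalg.solve(L, q))
```

Using the Cholesky factorization here both exploits and verifies the positive definiteness guaranteed by the σ²I term; no regularization beyond the noise level itself is needed.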
7 Conclusions

We have seen that in many image processing problems constructing the optimal processing algorithm can be reduced to the Fredholm equation of the first kind (in general) or of the second kind (in the case of white noise). The problem can even be formulated in such a form that it directly reduces to a finite-dimensional linear equation. It does not matter much whether the measurement PSF is known precisely or imprecisely. In the latter case we just need to take the inaccuracy of our knowledge into account by adequately introducing it into the problem statement. Since we assume that the original image f and its observation g are defined on the infinite plane, we can ignore the problem of edge handling. In practice it means that to reconstruct a certain part of the image f one would need to have a properly extended part of the observed image g. The corresponding extension, which can be called "overhead", is determined by the radius of the reconstruction PSF. The resulting processing algorithm is completely determined by the PSF and can be applied to images of any, potentially infinite, size. This distinguishes our approach from classical image processing approaches, where the processing, in fact, depends on the image size. However, in the case of "small" images our approach may have certain disadvantages. Indeed, if the reconstruction PSF is large, we will be able to properly reconstruct only an even smaller part of the image, since the reconstruction PSF should not go outside the edges. If we allow the PSF to go outside the edges, we may get undesirable artifacts. This effect and ways of minimizing such artifacts are to be studied in future research. Approaches proposed in this paper provide various options for computational optimization and parallelization.
In particular, adjusting the radius of the reconstruction PSF allows choosing a good compromise between reconstruction quality and computational demands, both for computing the reconstruction PSF and for applying it. Using a finite-dimensional representation for the reconstruction PSF allows completely avoiding solving integral equations. In this study, we focused on designing an optimal processing algorithm determined by the reconstruction PSF. We did not discuss aspects of applying such a PSF. The application of the constructed convolution kernel to a distorted image reduces to a 2D convolution. Convolution algorithms require significant computing time by themselves; however, they allow very efficient GPU acceleration (Bozkurt et al. 2015; Afif et al. 2018). Finally, when processing a large image, it can be divided into several smaller pieces. Such pieces can be processed in parallel on different computers and then combined into one image. All these aspects allow very efficient strategies for processing massive amounts of large images by using various parallel architectures. Let us note that our entire study was made in continuous settings, which led to certain analytical benefits. However, digital images are always specified on discrete rectangular grids. It means that at some point the designed continuous reconstruction PSF should be transformed to a corresponding discrete grid. Such discretization may introduce certain errors, not accounted for in this study. To estimate the corresponding effects on image processing quality, additional analysis and numerical simulations are required.
Acknowledgments. The reported study was supported by RFBR, research project number 19-29-09044.
References

Afif, M., Said, Y., Atri, M.: Efficient 2D convolution filters implementations on graphics processing unit using NVIDIA CUDA. Int. J. Image Graph. Sig. Process. (IJIGSP) 8, 1–8 (2018)
Bozkurt, F., Yaganoglu, M., Günay, F.B.: Effective Gaussian blurring process on graphics processing unit with CUDA. Int. J. Mach. Learn. Comput. 5(1), 57–61 (2015)
Ciulla, C., Yahaya, F., Adomako, E., Shikoska, U.R., Agyapong, G., Veljanovski, D., Risteski, F.A.: A novel approach to T2-weighted MRI filtering: the classic-curvature and the signal resilient to interpolation filter masks. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 1, 1–10 (2016)
Cressie, N.A.C.: Statistics for Spatial Data. Wiley, New York (1993)
Doering, E.R., Gray, J., Basart, J.P.: Point spread function estimation of image intensifier tubes. In: Thompson, D.O., Chimenti, D.E. (eds.) Review of Progress in Quantitative Nondestructive Evaluation. Advances in Cryogenic Engineering, vol. 28, pp. 323–329. Springer, Boston (1992). https://doi.org/10.1007/978-1-4615-3344-3_40
Epstein, C.L.: Introduction to the Mathematics of Medical Imaging. Society for Industrial and Applied Mathematics, Philadelphia (2008)
Filatova, S.A., Golubtsov, P.V.: Invariant measurement computer systems. Pattern Recognit. Image Anal. 1(2), 224–235 (1991)
Filatova, S.A., Golubtsov, P.V.: Invariance and synthesis of optimum image formation measurement computer systems. In: Houstis, E.N., Rice, J.R. (eds.) Artificial Intelligence, Expert Systems and Symbolic Computing, pp. 243–252. Elsevier Science Publishers B.V. (North-Holland) (1992)
Filatova, S.A., Golubtsov, P.V.: Invariance considerations in design of image formation measurement computer systems. In: Sadjadi, F.A. (ed.) Proceedings of SPIE, Automatic Object Recognition III, vol. 1960, pp. 483–494 (1993). https://doi.org/10.1117/12.160623
Gikhman, I.I., Skorokhod, A.V.: Introduction to the Theory of Random Processes. Nauka, Moscow (1977). (in Russian)
Golubtsov, P.V., Starikova, O.V.: Invariance considerations in calibration problem of measurement computer systems. Math. Model. 14(4), 45–56 (2002). (in Russian)
Golubtsov, P.V., Sizarev, D.V., Starikova, O.V.: Synthesis of optimal invariant image generation systems on a plane. Mosc. Univ. Phys. Bull. 58(2), 1–5 (2003)
Gonzalez, R., Woods, R.: Digital Image Processing, 3rd edn. Pearson Education Inc., London (2008)
Gupta, S., Porwal, R.: Implementing blind de-convolution with weights on x-ray images for lesser ringing effect. Int. J. Image Graph. Sig. Process. (IJIGSP) 8, 30–36 (2016)
Hansen, P.C.: Discrete Inverse Problems: Insight and Algorithms. SIAM, Philadelphia (2010)
Jain, A.K.: Fundamentals of Digital Image Processing. Prentice-Hall Information and System Sciences Series. Prentice Hall, Englewood Cliffs (1989)
Jayaraman, S., Esakkirajan, S., Veerakumar, T.: Digital Image Processing. Tata McGraw-Hill Education (2011)
Joyce, K.: Point spread function estimation and uncertainty quantification. Ph.D. dissertation. The University of Montana, Missoula, MT, United States (2016). https://doi.org/10.2172/1508604
Kundur, D., Hatzinakos, D.: Blind image deconvolution. IEEE Sig. Process. Mag. 13(3), 43–64 (1996)
Pyt'ev, Yu.P.: (G, G)-covariant transformations and image estimates. Kibernetika [Cybernetics], no. 6, pp. 126–134 (1973). (in Russian)
Rue, H., Held, L.: Gaussian Markov Random Fields: Theory and Applications. CRC Press, Boca Raton (2005)
Sasi, N.M., Jayasree, V.K.: Reduction of blur in image by hybrid de-convolution using point spread function. Int. J. Image Graph. Sig. Process. (IJIGSP) 8(6), 21–28 (2016)
Sridhar, S.: Digital Image Processing. Oxford University Press, Oxford (2011)
Singh, D.P., Khare, A.: Restoration of degraded gray images using genetic algorithm. Int. J. Image Graph. Sig. Process. (IJIGSP) 3, 28–35 (2016)
Tikhonov, A.N., Arsenin, V.Ya.: Methods of Solving Ill-Posed Problems. Moscow (1977). (in Russian)
Vasil'eva, A.B., Tikhonov, N.A.: Integral Equations. Moscow (1989). (in Russian)
Binh, N.T., Tuyet, V.T.H.: Enhancing the quality of medical images containing blur combined with noise pair. Int. J. Image Graph. Sig. Process. (IJIGSP) 11, 16–25 (2015)
Vogel, C.R.: Computational Methods of Inverse Problems. Society for Industrial and Applied Mathematics, Philadelphia (2002)
Watson, S.A.: Real-time spot size measurement for pulsed high-energy radiographic machines. In: Proceedings of the 1993 Particle Accelerator Conference, vol. 4, pp. 2447–2449 (1993)
Scalability and Parallelization of Sequential Processing: Big Data Demands and Information Algebras

Peter Golubtsov 1,2,3

1 Lomonosov Moscow State University, Moscow, Russia
[email protected]
2 National Research University Higher School of Economics, Moscow, Russia
3 Russian Institute for Scientific and Technical Information (VINITI RAS), Moscow, Russia

Abstract. Procedures of sequential updating of information are important for "big data streams" processing because they avoid accumulating and storing large data sets. As a model of information accumulation, we study the Bayesian updating procedure for linear experiments. Analysis and gradual transformation of the original processing scheme in order to increase its efficiency lead to certain mathematical structures: information spaces. We show that processing can be simplified by introducing a special intermediate form of information representation. Thanks to the rich algebraic properties of the corresponding information space, it allows unifying and increasing the efficiency of the information updating. It also leads to various parallelization options for the inherently sequential Bayesian procedure, which are suited for distributed data processing platforms, such as MapReduce. Besides, we will see how a certain formalization of the concept of information and its algebraic properties can arise simply from adapting data processing to big data demands. Approaches and concepts developed in the paper allow increasing the efficiency and uniformity of data processing and present a systematic approach to transforming sequential processing into parallel.

Keywords: Big data streams · Sequential Bayesian updating · Information spaces · Algebra of information · Distributed processing
1 Introduction

Present time exhibits a sharp increase in the number of studies related to big data. It has become clearly recognized that large amounts of data often contain unexpected valuable information. Typically, in big data problems, such new information is too well hidden and it has to be extracted from the original data, transformed, transmitted, accumulated and, eventually, converted to a form suitable for interpretation or decision-making. Besides, the whole concept of information contained in data becomes even less clear in the realm of big data, since information becomes completely hidden in the heaps of data, can be distributed among many sites, and new data, containing more information, can be constantly generated.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 274–298, 2020. https://doi.org/10.1007/978-3-030-39216-1_25
Big data usually have a huge volume, are distributed among numerous sites, and are constantly replenished. As a result, even the simplest analysis of big data faces serious difficulties. Indeed, the traditional approaches to information processing assume that the data intended for processing is collected in one place, organized in the form of convenient structures (for example, matrices), and only then the appropriate algorithm processes these structures and produces the result of the analysis. In the case of big data, it is impossible to collect all the data needed for a research project on a single computer. Moreover, it would be impractical, because one computer would not be able to process them in a reasonable time. As a result, there emerges a need to transform existing algorithms so that they can process separate data fragments independently and in parallel. The corresponding data analysis algorithm must, working on many computers in parallel, extract from each set of source data some intermediate compact "information", gradually combine and update it and, finally, use the accumulated information to produce the result. Upon the arrival of new pieces of data, it should be able to add the information extracted from them to the accumulated information in real time and, eventually, update the result. Usually it is not obvious how to design such a completely parallel and ultimately scalable version of an existing algorithm. In this study, we will try to approach this problem gradually, by starting with a sequential version of an algorithm and then transforming it into a completely parallel one. Since, typically, in big data problems it is not possible to collect all the relevant data in one place before processing, we will first focus attention on approaches which would allow "digesting" incoming pieces of data and accumulating the information contained there, without storing all the raw data.
One of the general approaches to such sequential updating of information is based on the Bayesian procedure of transition from a prior to a posterior distribution (Barra 1971; Lindley 1972; Borovkov 1998). The Bayesian procedure for sequential updating of information is considered one of the most important tools in expert systems (Spiegelhalter and Lauritzen 1990; Spiegelhalter et al. 1993). Special interest in this procedure is observed in the context of Big Data (Oravecz et al. 2016; Zhu et al. 2017), since it allows updating information about the object of interest as data is received. As a result, there is no need to accumulate and store the original raw data. In this paper, we will study various aspects of Bayesian information accumulation from the point of view of big data systems. The settings we will be using provide an interesting opportunity to explore and compare different forms of information representation, starting with the "natural" ones, such as initial "raw" information and "explicit" information, which is the most convenient for interpretation. We will also design a special "canonical" form of information representation, most convenient for intermediate manipulation of information (such as merging, updating, transfer, storage, etc.). It will be shown that all these three forms of presenting information lead to information spaces with various algebraic properties and important relations between these spaces. Of particular interest to us are the features of these spaces in the context of distributed processing of large amounts of data. As will be shown below, the choice of an adequate form of canonical information space makes it possible to improve the efficiency of the data processing by unifying and minimizing computations. Moreover, special attention in Big Data problems is paid to the methods of data analysis that allow parallel and distributed processing
(Bekkerman et al. 2012; Kholod and Yada 2012; Fan et al. 2013). Below we will see that the introduction of a suitable intermediate form of information representation opens the possibility of flexible parallelization and scaling of the procedure for updating information in distributed data processing systems. Approaches and data structures studied in this paper are close to the ones intuitively introduced within the MapReduce framework (Dean and Ghemawat 2010). We believe that a systematic study of the emerging algebraic structures and an explicit formulation of their mathematical properties could serve as important guidelines in developing efficient and highly scalable algorithms dealing with big data.
2 Big Data and Information Theory

The use of the term "information" has recently increased significantly, especially in the context of data analysis. Usually it is understood too broadly and informally. However, in the author's opinion, such an increased frequency of use of this term indicates an increasing need for a more accurate and formal understanding of the phenomenon of information. Can the area of big data bring us closer to this understanding? Studies related to big data systems are aimed primarily at the problems of processing large amounts of distributed data and have, as a rule, a practical and technical orientation. At the same time, most research on information theory is carried out in the context of probability theory and mathematical statistics and is of predominantly theoretical interest. Perhaps the most applied part of information theory, originating in Shannon's works (Shannon and Weaver 1949), is related to the transmission of messages in the presence of interference. It is not so much about the "meaning" or quality of information, but about its quantity. It focuses on the measure of information, but not on its essence. As a result, it provides tools for optimizing the throughput of information transmission channels, but not for extracting or accumulating information, especially when the data is distributed. A special place in mathematical statistics is occupied by Fisher's information, which is represented by a matrix, the Fisher information matrix (Barra 1971; Borovkov 1998). It provides a more detailed reflection of the concept of information and, in particular, has an important additive structure in which the union of independent statistics (data sets) corresponds to the sum of their information matrices. Currently, the areas of interest of big data and various approaches to the notion of information are poorly connected.
However, as we will see, the problems of big data require a more precise, formal description of the very concept of information or, at least, of information representation. This becomes important for constructing effective tools for manipulating information, based on mathematical (for example, algebraic) properties of information. In this regard, big data problems might soon become the main driver and beneficiary of the general information theory. In this paper, we will try to show how a certain formalization of the concept of information and its algebraic properties can arise simply from the consideration of the problem in the context of big data.
We will discuss the features of the appropriate well-organized intermediate form of information representation, reveal its natural algebraic properties, and will see how we can use these properties to transform the processing algorithm into a parallel one. It appears that such an intermediate form of information representation in some sense reflects the very essence of the information contained in the data. This leads us to a completely new, “practical” approach to the notion of information. Conceptually our understanding of information is close to the Central Dogma of Computer Science (He and Petoukhov 2010), where information is considered as an intermediate link between data and knowledge. In this article, as well as in (Golubtsov 2017, 2018), we study some mathematical models of such relationships.
3 Bayesian Sequential Update of Information

In this section, we outline in general terms how Bayesian update of information can avoid storing all the raw data, thus providing a simple workaround in certain big data scenarios.

3.1 Traditional Data Processing and Big Data Problems
Let us focus on the following features of information processing problems in big data systems:

• Typically, such problems deal with huge amounts of data.
• Usually such data is not collected in one place, but distributed over numerous, possibly remote sites.
• New data constantly emerges and should be immediately included in the processing.

Traditional processing methods usually do not take into account such features and, as a result, require fundamental revision to become applicable to big data problems. Usually, for a small fixed data set, the processing consists in applying an appropriate transformation (algorithm, method), which represents the processing P, to a data set and obtaining the result of processing (for example, an estimate of some unknown value) as in Fig. 1.
Fig. 1. Standard approach to data processing.
Any data processing can be considered as a process of transformation between two “natural” forms of information. Specifically it transforms original raw information into
P. Golubtsov
the desired explicit form, which is the most convenient for interpretation. Hereafter, raw information will be shown in red, while explicit information will be shown in green. To utilize such a simple scheme, it is critical for all the data to be collected in one place, presented in the form of suitable structures, say, matrices, and ready for the processing transformation to be applied to them. If the data is distributed among many different locations, it must be collected in one place and organized in the form of suitable structures before the processing (Fig. 2).
Fig. 2. Standard processing scheme for distributed data. Double wavy arrows indicate data transmission in its original, unprocessed form.
The drawbacks of such an approach to the processing of distributed data are obvious:
• Transferring large amounts of raw data creates excessive traffic.
• Keeping the combined data set in one place requires huge amounts of memory.
• Processing all data on one computer requires excessive computational and time resources.
• As new data becomes available, the combined data set grows in size and, therefore, requires ever-increasing (potentially infinite) storage resources.
• The processing algorithm has to be reapplied to the constantly increasing amount of accumulated data.

3.2 Sequential Updating of Information
Probabilistic settings provide a rather natural and widely accepted formalization for the notion of information. Specifically, information about an unknown element from some space is represented by a probability distribution on this space. Moreover, the notion of Bayesian conditional probability provides a natural conceptual and practical framework for the idea of sequential information update. According to the Bayesian approach, the initial knowledge is represented by some prior distribution. The result of a statistical experiment, combined with the prior distribution, leads to the posterior (conditional) distribution (Barra 1971; Lindley 1972), which is regarded as the new, updated information, Fig. 3.
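As a toy illustration of this prior-to-posterior step (our example, not from the paper), a conjugate Beta prior for an unknown Bernoulli parameter can be combined with observed 0/1 data in closed form:

```python
def beta_update(alpha, beta, data):
    """Fold 0/1 observations into a Beta(alpha, beta) prior.

    Returns the parameters of the posterior Beta distribution.
    """
    successes = sum(data)
    failures = len(data) - successes
    return alpha + successes, beta + failures

# A uniform Beta(1, 1) prior updated with four coin flips:
posterior = beta_update(1.0, 1.0, [1, 0, 1, 1])
print(posterior)  # (4.0, 2.0)
```

The posterior summarizes the observations; once the two parameters are updated, the raw 0/1 data can be discarded.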
Fig. 3. Bayesian prior to posterior updating of information.
Moreover, the obtained posterior distribution is then considered as a new prior for the next observation and so on, as shown in Fig. 4. In (Lindley 1972) this idea is expressed in the form: “today’s posterior is tomorrow’s prior”; see also (Robert 2010; Oravecz et al. 2016).
Fig. 4. Bayesian sequential updating of information. Each data set transforms the accumulated explicit information.
Bayesian information update provides a natural workaround for big data processing. Instead of accumulating large volumes of raw data, it allows gradually updating the accumulated information in the explicit form. Such a procedure of sequential updating of information is especially important in “big data streams” processing (Zhu et al. 2017) because it avoids accumulating and storing large amounts of raw data.
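A minimal sketch of this “today’s posterior is tomorrow’s prior” loop (our illustration, assuming a scalar Gaussian model with known noise variance `s2`):

```python
def gaussian_update(m, v, chunk, s2=1.0):
    """Fold one chunk of observations into the belief N(m, v) about a mean."""
    for y in chunk:
        v_new = 1.0 / (1.0 / v + 1.0 / s2)  # combine precisions
        m = v_new * (m / v + y / s2)        # precision-weighted mean
        v = v_new
    return m, v

belief = (0.0, 100.0)  # vague prior
for chunk in ([1.2, 0.8], [1.1], [0.9, 1.0]):
    belief = gaussian_update(*belief, chunk)  # posterior becomes the prior
# `belief` now summarizes all five observations; no chunk had to be stored.
```

Folding the chunks one by one yields exactly the same belief as a single batch update over all observations, which is why the raw stream never has to be retained.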
4 Bayesian Information Updating for Linear Experiment

As an example of Bayesian information updating, we will study the case of a linear experiment with additive noise, where all probability distributions, i.e., the prior distribution and the noise distribution, are normal. Being quite important from the practical point of view, this case also leads to quite interesting properties of merging
information spaces and can be regarded as a model of information transformation and accumulation in distributed systems.

4.1 Linear Measurement
Consider a linear experiment scheme of the form (Pyt’ev 1983, 1984, 2012)

$$y = Ax + \nu,$$

where $x \in \mathcal{D}$ is an object of measurement (a vector of a Euclidean space), $y \in \mathcal{R}$ is a measurement result, $A: \mathcal{D} \to \mathcal{R}$ is a linear mapping, and $\nu \in \mathcal{R}$ is a normally distributed random noise vector with zero mean $E\nu = 0$ and a given positive definite covariance operator $S > 0$, i.e., $P_\nu = N(0, S)$. All the information about the measurement is represented by a triplet of the form $(y, A, S)$, where $y \in \mathcal{R}$, $A: \mathcal{D} \to \mathcal{R}$, $S: \mathcal{R} \to \mathcal{R}$, $S > 0$, and $\mathcal{R}$ is some vector space. The set of all such triples will be denoted by

[…]

$$u_j(k) = \frac{\left((x(k)-v_j(k))^T F_j (x(k)-v_j(k))\right)^{1/(1-\beta)}}{\sum_{l=1}^{m}\left((x(k)-v_l(k))^T F_l (x(k)-v_l(k))\right)^{1/(1-\beta)}}, \qquad v_j = \frac{\sum_{k=1}^{N} u_j^{\beta}(k)\, x(k)}{\sum_{k=1}^{N} u_j^{\beta}(k)},$$

$$S_j = \sum_{k=1}^{N} u_j^{\beta}(k)\,(x(k)-v_j)(x(k)-v_j)^T, \qquad F_j = |S_j|^{1/n}\, S_j^{-1}.$$

The matrix $F_j$ deduced in this fashion is symmetric and positive definite. Furthermore, the scale it sets, which is established for each coordinate by a square root of the corresponding eigenvalue, is optimal [15] according to the criterion (1). The eigenvalues’ magnitudes in the matrix $F_j$ can alter the shape of the corresponding cluster, which enables using this procedure in a wide spectrum of applications. Since some estimation schemes involve the evaluation of the covariance matrix and its inversion (pseudo-inversion), the Gustafson-Kessel method suffers from a high computational cost (which also grows quickly with the dimensionality of the representation), from a significant sensitivity to outliers, and from processing information only in a batch (offline) mode. Applying the Sherman-Morrison transformation for the required inversion of the initial matrix and the Arrow-Hurwicz-Uzawa nonlinear programming scheme for deriving the system of recursive minimization yields an adaptive variant of the Gustafson-Kessel method capable of running in a fully recurrent style:

$$u_j(k+1) = \frac{\|x(k+1)-v_j(k)\|_{F_j(k)}^{2/(1-\beta)}}{\sum_{l=1}^{m}\|x(k+1)-v_l(k)\|_{F_l(k)}^{2/(1-\beta)}},$$
$$S_j(k+1) = S_j(k) + u_j^{\beta}(k+1)\,(x(k+1)-v_j(k))(x(k+1)-v_j(k))^T,$$

$$S_j^{-1}(k+1) = S_j^{-1}(k) - \frac{u_j^{\beta}(k+1)\, S_j^{-1}(k)\,(x(k+1)-v_j(k))(x(k+1)-v_j(k))^T S_j^{-1}(k)}{1 + u_j^{\beta}(k+1)\,(x(k+1)-v_j(k))^T S_j^{-1}(k)\,(x(k+1)-v_j(k))},$$

$$|S_j(k+1)| = |S_j(k)|\left(1 + u_j^{\beta}(k+1)\,\|x(k+1)-v_j(k)\|_{S_j^{-1}(k)}^{2}\right),$$
Z. Hu and O. K. Tyshchenko
$$F_j(k+1) = |S_j(k+1)|^{1/n}\, S_j^{-1}(k+1), \qquad v_j(k+1) = v_j(k) + \eta(k)\, u_j^{\beta}(k+1)\, F_j(k+1)\,(x(k+1)-v_j(k)).$$

The most distinctive attribute of this method is the recursive inversion of the covariance matrix $S_j^{-1}(k+1)$ by means of the Sherman-Morrison formula and the recurrent estimate of its determinant $|S_j(k+1)|$.
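One step of the recurrent scheme above can be sketched for a single cluster as follows (our reading of the formulas: `beta` stands for the fuzzifier, `eta` for the learning rate, and the membership `u` of the new point is taken as given; all names are our choice):

```python
import numpy as np

def gk_step(x, v, S, S_inv, det_S, u=1.0, beta=2.0, eta=0.05):
    """Fold one observation x into the recurrent state of one GK cluster."""
    n = x.size
    e = x - v                        # deviation from the current center
    w = u ** beta                    # fuzzy weight u^beta of the new point
    quad = float(e @ (S_inv @ e))    # e^T S^{-1} e with the *old* inverse
    denom = 1.0 + w * quad
    det_S_new = det_S * denom        # matrix determinant lemma
    Se = S_inv @ e
    S_inv_new = S_inv - (w * np.outer(Se, Se)) / denom  # Sherman-Morrison
    S_new = S + w * np.outer(e, e)   # plain rank-1 accumulation, for reference
    F = det_S_new ** (1.0 / n) * S_inv_new              # induced metric
    v_new = v + eta * w * (F @ e)    # gradient-style center update
    return v_new, S_new, S_inv_new, det_S_new
```

The Sherman-Morrison identity and the matrix determinant lemma keep `S_inv_new` and `det_S_new` consistent with the directly accumulated `S_new`, so no full matrix inversion is ever performed.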
3 Simulation Results

Employing clustering validity measures, the clustering methods may be evaluated efficiently. To illustrate this, a synthetic data set was employed. We generated a set of points (observations) randomly and ran the processing procedure several times to generalize the output. In this section, only a simple example is given. The overall size of the data set was 534 points. The results based on the validity measures are gathered in Table 1. In the first instance, all these methods are initialized randomly, so different runs may produce different outcomes. On the other hand, the outcomes obtained depend slightly on the data topology, and there is no universal validity index that fits every clustering problem perfectly. For the case under consideration, two results obtained with different initializations are presented in Fig. 2. What we can finally see is that a varying number of clusters has been identified. The principal obstacle of this type of algorithm is that the prototypes’ initialization is run randomly. It is highly recommended to run the method several times to obtain accurate results. To mitigate the problem illustrated above, the cluster prototypes are initialized by randomly picked data points. Judging from the outcomes, the hard clustering algorithm may be quite a reasonable solution to a task, although it is highly likely that this method meets the initialization problem. When it comes to the fuzzy procedures, the only distinction is the shape of the clusters obtained; at the same time, the GK procedure can detect elongated clusters much better. For this reason, GK demonstrates slightly better results based on all three validity measures. For example, it was shown in [16, 17] that real data are characterized by the Mahalanobis norm rather than the Euclidean norm. DI is meant initially to identify compact and well-separated clusters.
XB quantifies the ratio of the total variation within clusters to the separation between clusters. PC estimates how much the clusters overlap. Based on the scores of the two most widely known fuzzy clustering indices (the PC and XB indexes), the modified GK method shows the best results for this data set.
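For reference, the partition coefficient has a one-line definition, PC = (1/N) Σ_k Σ_j u_jk²; a hypothetical helper (names ours) for a membership matrix whose rows sum to one over the clusters:

```python
def partition_coefficient(U):
    """PC for a fuzzy partition U given as rows of per-cluster memberships."""
    N = len(U)
    return sum(u ** 2 for row in U for u in row) / N

print(partition_coefficient([[1.0, 0.0], [0.0, 1.0]]))  # 1.0 (crisp partition)
print(partition_coefficient([[0.5, 0.5], [0.5, 0.5]]))  # 0.5 (maximally fuzzy)
```

Values close to 1 indicate crisp, well-separated clusters, which matches the interpretation of the PC column in Table 1.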
An Approach to Online Fuzzy Clustering
Fig. 2. The results obtained by various initializations based on the synthetic data
Table 1. The clustering results

Method       PC      XB      DI
K-Means      0.8176  0.5184  0.5617
FCM          0.8743  0.3465  0.5874
Modified GK  0.9103  0.1897  0.6732
4 Conclusion

The paper presented a new approach to recursive methods for online fuzzy clustering. With the mentioned technique, the groups of patterns are allowed to take an arbitrarily oriented hyperellipsoidal form. The offered approach mainly deals with the Mahalanobis distance metric. Based on the results obtained, the introduced procedure demonstrated a higher clustering quality compared to the algorithms under comparison. Since the majority of real-world tasks are characterized by ellipsoidal shapes of clusters, the developed procedure is a more useful tool from this point of view.

Acknowledgment. This scientific work was partially supported by RAMECS and self-determined research funds of CCNU from the colleges’ primary research and operation of MOE (CCNU19TS022). The investigation of Oleksii K. Tyshchenko was also funded by the National Science Agency of the Czech Republic within the project TACR TL01000351.
References

1. Höppner, F., Klawonn, F., Kruse, R., Runkler, T.: Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition. Wiley, Chichester (1999)
2. Kohonen, T.: Self-Organizing Maps. Springer, Berlin (1995)
3. Du, K.-L., Swamy, M.N.S.: Neural Networks and Statistical Learning. Springer, London (2014) 4. Bezdek, J.-C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York (1981) 5. Bezdek, J.C., Keller, J., Krishnapuram, R., Pal, N.: Fuzzy Models and Algorithms for Pattern Recognition and Image Processing. The Handbook of Fuzzy Sets. Springer, Dordrecht (1999) 6. Hu, Z., Bodyanskiy, Ye.V., Tyshchenko, O.K., Boiko, O.O.: A neuro-fuzzy Kohonen network for data stream possibilistic clustering and its online self-learning procedure. Appl. Soft Comput. J. 68, 710–718 (2018) 7. Bodyanskiy, Ye.V., Tyshchenko, O.K., Kopaliani, D.S.: An evolving connectionist system for data stream fuzzy clustering and its online learning. Neurocomputing 262, 41–56 (2017) 8. Deineko, A., Zhernova, P., Gordon, B., Zayika, O., Pliss, I., Pabyrivska, N.: Data stream online clustering based on fuzzy expectation-maximization approach. In: Proceedings of the 2nd IEEE International Conference on Data Stream Mining and Processing, DSMP 2018, pp. 171–176 (2018) 9. Mao, J., Jain, A.K.: A self-organizing network for hyperellipsoidal clustering. IEEE Trans. Neural Netw. 7, 16–29 (1996) 10. Gustafson, D.E., Kessel, W.C.: Fuzzy clustering with fuzzy covariance matrix. In: Proceedings of the IEEE CDC, San Diego, pp. 761–766 (1979) 11. Gan, G., Ma, Ch., Wu, J.: Data Clustering: Theory. Algorithms and Applications. SIAM, Philadelphia (2007) 12. Xu, R., Wunsch, D.C.: Clustering. IEEE Press Series on Computational Intelligence. Wiley, Hoboken (2009) 13. Aggarwal, C.C., Reddy, C.K.: Data Clustering: Algorithms and Applications. CRC Press, Boca Raton (2014) 14. Krishnapuram, R., Jongwoo, K.: A note on the Gustafson-Kessel and adaptive fuzzy clustering algorithms. IEEE Trans. Fuzzy Syst. 7(4), 453–461 (1999) 15. Hu, Z., Bodyanskiy, Ye.V., Tyshchenko, O.K.: Self-Learning and Adaptive Algorithms for Business Applications. 
A Guide to Adaptive Neuro-fuzzy Systems for Fuzzy Clustering under Uncertainty Conditions. Emerald Publishing Limited, Edinburgh (2019) 16. Izonin, I., Trostianchyn, A., Duriagina, Z., Tkachenko, R., Tepla, T., Lotoshynska, N.: The combined use of the wiener polynomial and SVM for material classification task in medical implants production. Int. J. Intell. Syst. Appl. (IJISA) 10(9), 40–47 (2018) 17. Babichev, S., Škvor, J., Fišer, J., Lytvynenko, V.: Technology of gene expression profiles filtering based on wavelet analysis. Int. J. Intell. Syst. Appl. (IJISA) 10(4), 1–7 (2018)
Application of a Novel Model “Requirement – Object – Parameter” for Design Automation of Complex Mechanical System

Bui V. Phuong1, Sergey S. Gavriushin1,2, Dang H. Minh3, Phung V. Binh4, and Nguyen V. Duc5

1 Bauman Moscow State Technical University, 5c1, 2nd Baumanskaya Street, 105005 Moscow, Russian Federation, [email protected], [email protected]
2 Mechanical Engineering Research Institute of the Russian Academy of Sciences, 4, Malyi Kharitonievsky pereulok, 101990 Moscow, Russian Federation
3 Industrial University of Ho Chi Minh City, Ho Chi Minh City, Vietnam, [email protected]
4 Le Quy Don Technical University, Hanoi, Vietnam, [email protected]
5 Thuyloi University, 175 Tay Son, Dong Da, Hanoi, Vietnam, [email protected]
Abstract. This paper presents a novel approach to mathematical modeling in the design process of a complex machine by using the so-called “Requirement – Object – Parameter” or ROP model. This novel model allows for coordinating the related experts in order to overcome the limitations of the conventional design method. Based on the ROP model, for each particular production context, a priority order of technical requirements is set, and consequently a corresponding suitable design process is defined. Indeed, the proposed model played a significant role in the design process of an innovative fruit and vegetable washer: three different configurations of the washer corresponding to particular production contexts were effectively created as design options for producers in a flexible and convenient manner. Additionally, the model allows for minimizing test steps and optimizing machine quality in general.

Keywords: Intelligence system design · Man-machine interaction · Machine design synthesis · Product life cycle management · Fruit and vegetable washer
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 375–384, 2020. https://doi.org/10.1007/978-3-030-39216-1_34

1 Introduction

Design of an intricate machine or mechanical system has always been a complex problem [1–5]. The quality revolution, which appeared in the early 1980s, changed the concept of product design [6–8]. Accordingly, the design process needs to comply with a common standard as follows: recognition of need, definition of problem, synthesis,
analysis and optimization, evaluation, and finally market representation. It is noteworthy that these steps can be repeated until the quality of the designed product meets the producer’s requirements. Along with the development of science and technology, this model has become widespread and remains in use; indeed, it is suitable for producing sustainable products at a factory or company with a firm financial basis [7, 9–11]. Today, in order to improve product competitiveness, the above-mentioned design procedure is often accompanied by the concept of lifecycle management [12–15]. This allows experts such as engineers, technologists, economists, market professionals, graphic designers and others to participate in the design process. Accordingly, product quality control is carried out continuously from the start to the end of the process. Although the experts play an important role in improving product quality and saving time, in the case of creating a new machine that the experts are not familiar with, this management model becomes much more intricate [12]. In fact, the design of “super complex” systems like vehicles, ships, trains, airplanes, and others can be split into small modules. Each of them can in turn be split into smaller ones, such as components and/or parts of the subject, that are well-managed by the experts. However, since many professionals are involved in the design process, undesirable inconsistencies arise in terms of technical requirements, product quality and/or the importance grade of product quality assurance [16, 17]. This could cause stagnation and even deadlock in production [13].
Based on the literature review and the authors’ experience, two limitations in the design and development of new products still exist: (i) the time and expense of coordinating the related experts in the field in order to find a suitable production option, and (ii) the limited efficiency of expert idea exchange, because opinions of different experts are usually provided at different times. Besides, every expert has a private view and bias, while the chief engineer or business owner is the person who ultimately decides on the design option. Thus, the selected one might still be inappropriate for production. To deal with the above-mentioned limitations, in this paper the authors propose an approach to the design problem based on the concept of product lifecycle management, in which stages such as technology, computing, execution, maintenance, cost estimation, and others are analyzed comprehensively. This novel approach offers flexibility and convenience for engineers and producers, and it also allows for minimizing test steps and optimizing product quality.
2 Model “Requirements – Objects – Parameters”

It is evident that during the design process of a complex mechanical system, there is a vast amount of technical requirements for components/parts and constraints on operating conditions that need to be complied with. Thus, in order to have the best overview of design guidelines, it is necessary to generalize all the input conditions for the design. From this point of view, the authors propose a new design model, the so-called
“Requirements – Objects – Parameters”, or ROP, as illustrated in Fig. 1. To develop a mathematical model for any machine design, firstly one needs to determine the technical requirements for all functions by answering the following questions: What characteristics does the machine need? What functions need to be performed? Here, one needs to analyze the machine lifecycle to define the stages and the related experts involved in the design process. Next, the criteria such as quality, cost, productivity, etc. of the machine are set by the related experts. These criteria are placed in the group “Requirements”. Besides, this group also includes production conditions and all physical phenomena that may occur.

Fig. 1. Design model ROP in a general form (three linked columns: Requirements, i.e., desires, conditions, and phenomena; Objects, i.e., parts/details; and Parameters)
Next, consider the parts, components, details, and others that form the machine. This group is called “Objects”. The criteria in the group “Requirements” are linked to different parts in this group, which helps to visualize the specific characteristics of the parts in the machine. The last group in the ROP model is called “Parameters”. These are the geometry, mass, material properties, kinematic and dynamic data, and other properties of the machine parts. These parameters directly contribute to the development of the mathematical model, as they represent the group “Objects” and form the quality standard chain of the machine to be achieved.
Once the formulation of those groups is created, it is possible to observe the specific relationships among them, which is very important for generalizing the machine design concept. For instance, looking into Fig. 1, the desire B directly affects parts α and ω, and in order to design these parts, it is necessary to consider parameters (a, b, c) and (a, c, e). This implies that every requirement is driven by a series of parameters. Hence, it is important to note that the ROP model can actually be considered as a generalized one used for designing any mechanical system. In practice, for a complex machine, there is a large number of parameters that make the model bulky. To solve this problem, an approximation concept is proposed as follows:
– Reducing free variables by classifying them into groups of equivalent functions, which can be substituted according to the production context;
– Developing high-quality explicit approximations for complex constraints, or in other words, simplifying the constraints.
If there are other requirements and/or a new priority order, say Desire C before Desires B and A instead of the initial order A–B–C (Fig. 1), the design model can change completely, and the design outcome would differ from the previous one. Indeed, the ROP model can be customized: depending on the particular case, the order of “Requirements” could be different, and the design process also changes accordingly.
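The Desire B example above can be captured by a tiny bookkeeping sketch (illustrative names, not from the paper): requirements point to objects, objects point to parameters, and every requirement is traced to the parameter set that drives it:

```python
# Hypothetical ROP tables for the example of Fig. 1.
requirements_to_objects = {
    "Desire B": ["part_alpha", "part_omega"],
}
objects_to_parameters = {
    "part_alpha": ["a", "b", "c"],
    "part_omega": ["a", "c", "e"],
}

def parameters_for(requirement):
    """Collect the de-duplicated parameters behind one requirement."""
    params = []
    for obj in requirements_to_objects.get(requirement, []):
        for p in objects_to_parameters.get(obj, []):
            if p not in params:
                params.append(p)
    return params

print(parameters_for("Desire B"))  # ['a', 'b', 'c', 'e']
```

Reordering the requirement priorities then amounts to reordering which requirements are traced first, which is exactly the customization described above.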
3 Application of ROP Model for Designing an Innovative Fruit and Vegetable Washer

To assess the capability of the ROP model, let us consider the design of an innovative fruit and vegetable washer [18]. This is a brand new machine that has many superior features in comparison with conventional ones [19]. The machine is designed in the drum form and works based on two mechanisms: horizontal and rotational motions. The sketch of the machine structure is illustrated in Fig. 2. Based on the concept of product lifecycle management together with the ROP model, engineers need to analyze several principal stages in the washer lifecycle, as shown in Fig. 3, such as:
(I) Design: Engineers determine the quality criteria of the washer (A), motor vibration during operation (C), washer service life (E), and the safety factor during operation (F).
(II) Elaboration: Technologists analyze and collect requirements on manufacturing technology, detailed machining and assembly methods.
(III) Cost estimation: Engineers estimate the washer production cost (D), the energy consumption norms of the motor and the water consumption norms used for washing (B).
(IV) Operation and maintenance: Engineers set requirements on convenience of operation, repair, maintenance and replacement.
Fig. 2. Concept of an innovative fruit and vegetable washer. 1- Main part: A- drum; B- drive motor; C- water tank; D- auxiliary details; 2- Part of horizontal motion; 3- Part of rotational motion.
Regarding the ROP model for designing an innovative fruit and vegetable washer (Fig. 3), the “Requirements” (A)–(F) relate to the following “Objects”: drum (α), water tank (β), machine frame (η), slider-crank mechanism with springs, or spring-SCM (ω), high-pressure nozzles and drainpipes (μ), motor and inverter controller (γ). The “Parameters” in the ROP model are: mass of fruit-vegetables (a), amplitude and frequency of horizontal motion (b), rotating speed (c), size of drum (d), size of drum grid (e), spring stiffness (f), motor power (g), size of machine frame (h), size of water tank (i), size of support plate (j). It is noted that there is a sign […] at the bottom of Fig. 3, which means that apart from the requirements, objects and parameters mentioned, there might be other ones that can be customized by the producers. It is remarkable that the essential parts of the washer are the following: spring-SCM, drum, water tank, machine frame. Thus, prior to design, the information on these parts is crucial. Based on machine theory, such as dynamics or hydrodynamics, it is possible to establish the relationships among requirements, objects and parameters, as shown by the arrow indications in Fig. 3. By using the diagram in this figure, it is viable to control the whole machine structure as well as to perceive the mutual influence among design objects. Accordingly, depending on the particular production context, the corresponding mathematical model for the design of the washer can be established automatically.
Fig. 3. ROP model for designing an innovative fruit and vegetable washer (Requirements (A)–(F) linked to Objects α, β, η, ω, μ, γ and Parameters (a)–(j), as listed above)
4 Design Options and Discussion

Once there is a complete ROP model for designing the fruit and vegetable washer, depending on the production context and real-life demand, there will be an appropriate design process. It is noteworthy that in this paper the authors aim to study the suitability of the proposed model for the design process rather than to evaluate deeply one design option with components and sketches. Hence, only a preliminary study of the design options is presented.
4.1 Case 1
If producers are not worried about financial expenses and set the quality and productivity of the washer as the most important factors, the stage “Elaboration” will be the first priority. Looking into Fig. 3, it is seen that the drum (part 1A), nozzle-drainpipes (part 1D), and spring-SCM (part 2) directly affect the utmost concern of producers. In particular, the amplitude and frequency of horizontal motion and the drum size are the parameters that need attention in order to obtain an effective washer in compliance with producer requirements. With the main parts shown in Fig. 2, the proposed washer for this case is illustrated in Fig. 4.
Fig. 4. Configuration of the fruit and vegetable washer for the Case 1
4.2 Case 2
When manufacturing technology directly influences the design (stage I), for instance when current technology does not allow for producing a bevel gear (part 3, Fig. 4) with the desirable geometrical configuration, it is normally necessary to purchase prefabricated ones available in the market. For this case, the authors propose the replacement of the bevel gear by a cardan joint. This results in a new configuration of the washer, shown as part 3 in Fig. 5. Besides, in the ROP model, the bevel gear is substituted by the cardan joint, as are the corresponding parameters. Afterwards, the design process carries on concerning quality, productivity and other requirements.
Fig. 5. Configuration of the fruit and vegetable washer for the Case 2
4.3 Case 3
In order to produce a washer mockup used for research, it is necessary to modify the whole structure, reduce the size of parts, and simplify several details in comparison with a full-version complete washer. For example, one (part 3) or two drive motors can be temporarily replaced by a crank, as shown in Fig. 6, and when required, the motor can be mounted again in the usual place.
Fig. 6. Configuration of the fruit and vegetable washer for the Case 3
5 Conclusions

This paper proposed the “Requirements – Objects – Parameters”, or ROP, model on the basis of the product lifecycle concept for designing complex mechanical systems. This model helps the experts to maintain comprehensive views and consistent, up-to-date information on the design circumstances. Besides, it also offers flexibility and convenience for engineers and manufacturers in different production contexts. Regarding the design of an innovative fruit and vegetable washer, application of the ROP model allowed the creation of three different design options of the washer corresponding to particular production circumstances. It is noted that, apart from the fruit and vegetable washer, the ROP model has the potential to be implemented for the design and development of other devices, because it addresses the current limitations of the conventional design method. In the future, the ROP model can be advanced to automate the building and solving process of multi-objective mathematical models, which would definitely simplify the design process in practice.
References 1. Gavryushin, S.S., Nguyen, C.D., Dang, H.M., Phung, V.B. Automation of decision-making when designing the saw unit of a multirip bench using the multiple-criteria approach. J. High. Educ. Inst.: Mach. Build. 12(689), 51–65 (2017). (In Russian) 2. Dang, H.M., Phung, V.B., Bui, V.P., Nguyen, V.D.: Multi-criteria design of mechanical system by using visual interactive analysis tool. J. Eng. Sci. Technol. 14(3), 1187–1199 (2019) 3. Aliyev, A.G., Shahverdiyeva, R.O.: Perspective directions of development of innovative structures on the basis of modern technologies. Int. J. Eng. Manuf. (IJEM) 8(4), 1–12 (2018). https://doi.org/10.5815/ijem.2018.04.01 4. Ugwuoke, I.C., Abolarin, M.S.: Design and development of class 2B-lpl compliant constantforce compression slider mechanism. Int. J. Eng. Manuf. (IJEM) 9(3), 19–28 (2019). https:// doi.org/10.5815/ijem.2019.03.02 5. Ugwuoke, I.C., Ikechukwu, I.B., Ifianyi, O.E.: Design and development of a mixed-mode domestic solar dryer. Int. J. Eng. Manuf. (IJEM) 9(3), 55–65 (2019). https://doi.org/10.5815/ ijem.2019.03.05 6. Hurmuzlu, Y., Nwokah, O.D.I.: The Mechanical Systems Design Handbook: Modeling, Measurement, and Control, 1st edn. 872 p. CRC Press (2017) 7. Ullman, D.G.: The Mechanical Design Process, 6th edn. 480 p. David Ullman LLC (2017) 8. Perel’muter, A.V.: Synthesis problems in the theory of structures (brief historical review). Vestn. Tomsk State Univ. Archit. Build. 2(2016), 70–106 (2016). (In Russian) 9. Budynas, R., Nisbett, K.: Shigley’s Mechanical Engineering Design, 10th edn. 1104 p. McGraw-Hill Education (2014) 10. Bhandari, V.B.: Design of Machine Elements, 4th edn. 944 p. Mc Graw Hill (2016) 11. Ugural, A.C.: Mechanical Design of Machine Components, 2nd edn. 1034 p. CRC Press (2015) 12. Dang, H.M.: Automation and management of design and production of composite pressure vessel by winding method, Ph.D. Dissertation, Moscow (2013). (In Russian) 13. 
Phung, V.B.: Automation and management of the decision-making process for multi-criteria design of the saw unit of a multirip bench, Ph.D. Dissertation, Moscow (2017). (In Russian)
14. Nazir, A., Raana, A., Majeed, N.: Highlighting the role of requirement engineering and user experience design in product development life cycle. IJMECS 6(1), 34–40 (2014). https:// doi.org/10.5815/ijmecs.2014.01.04 15. Phung, V.B., Dang, H.M., Gavriushin, S.S.: Development of mathematical model for management lifecycle process of new type of multirip saw machine. Sci. Educ. Sci. Publ. BMSTU 1(2), 87–109 (2017) 16. Dang, H.M., Phung, V.B., Nguyen, V.D.: Multi-objective design for a new type of frame saw machine. Int. J. Mech. Prod. Eng. Res. Dev. (IJMPERD) 9(2), 449–466 (2019) 17. Statnikov, R.B., Gavriushin, S.S., Dang, M., Statnikov, A.: Multicriteria design of composite pressure vessels. Int. J. Multicriteria Decis. Making 4(3), 252–278 (2014) 18. Dang, H.M., Phung, V.B., Nguyen, V.D., Tran, T.T.: Multifunctional fruit and vegetable washer. Vietnamese Patent Application № VN2019324A2 (2019, In submission) 19. Kenghe, R.N., Magar, A.P., Kenghe, K.R.: Design, development and testing of small scale mechanical fruit washer. Int. J. Trend Res. Dev. 2(4), 168–171 (2015)
Studies of Structure and Impact Damage of Composite Materials by a Computer Tomograph

Oleg N. Bezzametnov1, Victor I. Mitryaykin1, and Yevgeny O. Statsenko2

1 Kazan National Research Technical University named after A.N. Tupolev – KAI, K. Marksa str. 10, 420111 Kazan, RT, Russia [email protected], [email protected]
2 Kazan (Volga region) Federal University, Kremlevskaya str. 18, 420008 Kazan, RT, Russia [email protected]
Abstract. The paper discusses the possibilities of computed tomography (CT) in studying the internal structure of composite-material samples with different types of reinforcement. A technique has been developed for recording impact-damage parameters on a stand with a vertically falling weight. Using CT data, the distribution of internal damage after a low-velocity impact was studied. The technique makes it possible to reconstruct the 3D geometry of an object and analyze its defects spatially.

Keywords: Composite materials · Computed tomography · Non-destructive testing · Porosity · Impact damage
1 Introduction

Composite structures are widely used in modern aviation and space technology. Evaluating their strength characteristics and determining their initial and residual life is impossible without a deep study of the material structure, both during production and while monitoring how it changes in service. There is a relationship between the structure and the mechanical properties of materials, which makes it possible to judge the strength characteristics of a material from a study of its structure. One of the main physical characteristics of a material is its density ρ, which is used in calculating most of its physical and mechanical parameters. In turn, the density depends on the porosity Vp, a dimensionless quantity expressing the percentage of pores in the solid. In fiberglass, porosity can range from 1% to 26%. It is established that porosity degrades the mechanical properties of the material. In [1], the strength of fiberglass panels was investigated as a function of pore content. For porosities from 0.5% to 16%, the compressive strength σ−1 and
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 385–394, 2020. https://doi.org/10.1007/978-3-030-39216-1_35
386
O. N. Bezzametnov et al.
the horizontal shear strength τ decrease linearly with an increase in the pore percentage Vp. These dependencies were approximated by the following formulas:

σ−1 = 112 − 73·Vp (1)

τ = 6.7 − 0.55·Vp (2)
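As a minimal sketch, the two fits (1) and (2) can be coded directly; treating Vp as a pore volume fraction is an assumption here, since the source does not restate the units of [1]:

```python
# Linear strength-porosity fits from [1] (Eqs. (1) and (2)).
# Assumption: Vp is a pore volume fraction (0..1); the constants are used
# exactly as printed, with units as in the original study.
def compressive_strength(vp):
    """sigma_{-1} = 112 - 73 * Vp"""
    return 112.0 - 73.0 * vp

def shear_strength(vp):
    """tau = 6.7 - 0.55 * Vp"""
    return 6.7 - 0.55 * vp
```

Both strengths decrease monotonically over the porosity range 0.5–16% investigated in [1].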
Production porosity values are determined on witness samples cut from the allowances of manufactured structures, but during operation these indicators degrade under the influence of climatic factors (moisture, temperature, radiation, acidity, etc.). In addition, structures accumulate fatigue and impact damage, which leads to debonding of the filler fibers from the epoxy matrix. This increases the volumetric porosity and, as a result, the strength characteristics decrease. Therefore, by monitoring the porosity, the residual life can be predicted. Currently, acoustic, radiation and dielectric non-destructive testing methods are used to solve these problems [2–4]. These methods are relative and do not yield the absolute values of density and porosity; instead, a correlation is established between the density and the physical parameters measured. To identify impact damage to composite skins, visual and instrumental inspection methods are used. The entire surface of the object is inspected visually; a more thorough inspection is carried out in limited areas at the most highly stressed places of the structure (targeted visual inspection). Critical structural sites, where impact damage can lead to failure of the unit, are inspected using instrumental non-destructive testing methods (targeted comprehensive inspection). In [5], X-ray diffraction patterns of the dynamics of deformation and fracture of composites upon impact are presented. The damage zone in [6] is estimated by ultrasonic testing using phased arrays. However, ultrasound yields data only on the projection of the delamination onto the plane and gives no information on how the size of the delaminated region varies with the through-thickness coordinate of the sample. It is therefore proposed to use layer-by-layer analysis by means of tomography.
In [7], CT was applied to assess the structure of a three-dimensionally reinforced composite, which made it possible to reduce the number of material research methods. Analysis of the microstructure in the initial state allowed the material production modes to be optimized, and studying the microstructure in the bulk of the material after static loading allowed the effectiveness of the joint work of filler and binder to be evaluated. In [8], the main stages of the formation of the composite microstructure were investigated, and the most characteristic tomogram projections are presented with the lengths of the void defects indicated. From the obtained projections, a 3D volume model of the material was built. Special software was used for image processing and for quantitative analysis of the material, with defect sizes described throughout the volume in three mutually perpendicular planes. In this paper, this approach is used to identify the mechanical characteristics of composite structures using X-ray computed tomography (CT). Of particular interest is the study of the impact-damage zone, in order to determine its size and the nature of the destruction of the layers in material samples.
2 X-Ray Computed Tomography Method to Study the Structure of the Material

The CT method is a non-destructive investigation of the internal structure of an object based on measuring the attenuation of X-rays by parts of the object that differ in density, composition and thickness. To form three-dimensional images of the internal structure, complex computer processing is applied to the set of its two-dimensional shadow projections. Nowadays there are many methods to analyze CT data. To obtain the orthotropic directions quickly, the fabric tensor is used [9–11]. To obtain all the mechanical constants, the representative-element method is used, usually based on the finite element method. The representative element can be built as an assembly of finite elements [11–13] or as a single element with a modified technique for integrating the stiffness matrix [14]. In both cases, a large number of numerical experiments must be run, which takes a long time. Additionally, for all types of simulation the CT data must be segmented into materials; for this purpose the noise must be removed and the binarization threshold calculated [15–17]. Scanning was performed using a Phoenix V|tome|X S240 micro/nanofocal X-ray system for CT and 2D inspections in the X-ray CT laboratory of the Institute of Geology and Petroleum Technologies of Kazan Federal University. The system is equipped with two X-ray tubes, including a microfocus tube with a maximum accelerating voltage of 240 kV and a power of 320 W. For primary data processing and creating a volumetric (voxel) model of a sample from the X-ray projections, the datos|x reconstruction software was used. VG Studio MAX 2.1 and Avizo Fire 7.1 were used to visualize and analyze the volumetric image. Initially, the structure and porosity of polymer composite samples based on glass cloth and carbon cloth were investigated.
In this work, samples of two types of materials were chosen as objects of study: fiberglass based on EE380 fabric and carbon fiber based on CC201 fabric. The fiberglass composite package had the following properties: twill weave, monolayer thickness 0.33 mm, 12 layers, filling ratio 45%. The CFRP samples had the following characteristics: plain weave, monolayer thickness 0.25 mm, 16 layers, filling factor 44%. Twill ("diagonal") weave is created by interlacing one or more warp elements with two or more weft elements in a regular pattern (Fig. 1, a). The peculiarity of such a fabric is greater flexibility and better drapability than fabrics with plain or net weaves. In a plain weave, the threads cross at a 90° angle (Fig. 1, b). Fabrics of this type are equally strong in both directions and have high strength and rigidity; they are usually used to reinforce highly loaded structural zones.
Fig. 1. Schematic images of twill fabric (a) and plain weave (b).
Epoxy resin SICOMIN SR 8500 and hardener SICOMIN SZ 8525 were chosen as the binder. This epoxy system has a low viscosity and a short curing time. The properties of the components are shown in Table 1.

Table 1. Properties of epoxy resin SICOMIN SR 8500 and hardener SICOMIN SZ 8525.

Property | Unit | SR 8500 | SZ 8525
Viscosity (25 °C) | mPa·s | 4200 ± 500 | 25 ± 5
Density (20 °C) | kg/l | 1.17 ± 0.01 | 0.94 ± 0.01
Complex viscosity (25 °C) | mPa·s | 960 ± 200 |
In this study, extrusion molding was used to manufacture the samples; it is widely used owing to its technological and instrumental simplicity and versatility. An important parameter in molding composites is the heating mode (Fig. 2); it is selected in accordance with the requirements for the polymerization conditions of the binder. The dimensions of the molded plate were 950 × 330 × 4 mm. For scanning on the CT system, an appropriate holder is made for each sample depending on its geometric shape and size.
Fig. 2. Manufacturing process parameters graph
The sample fixed in the holder was placed on the rotating table of the X-ray computed tomography chamber at the optimum distance from the X-ray source, and the microfocus tube was used. The survey was carried out at an accelerating voltage of 100 kV and a current of 300 mA. The linear voxel size was 34 µm. Images and videos of 2D slices were obtained in the VG Studio MAX 2.1 software. The total porosity of the samples was calculated using the Avizo Fire 7.1 software. To eliminate scanning artifacts, only volume elements larger than two voxels (>68 µm) were used in the calculations. Figure 3 shows the scanning results for the EE380 fiberglass (a) and CC201 carbon fiber (b) samples in the XY section, and Fig. 4 in the YZ section, respectively.
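The noise-filtering rule above (keeping only pores larger than two voxels) can be sketched as a small post-processing step on a per-pore voxel-count list, such as a label-analysis export; the export format and function names are assumptions for illustration:

```python
import numpy as np

VOXEL_MM = 0.034              # linear voxel size, 34 um (from the scan settings)
VOXEL_VOL = VOXEL_MM ** 3     # voxel volume, mm^3

def porosity_percent(pore_voxel_counts, sample_volume_mm3, min_voxels=3):
    """Total porosity from per-pore voxel counts, discarding pores of two
    voxels or fewer as scan noise, as in the procedure described above.
    The input format (one voxel count per detected pore) is an assumption."""
    counts = np.asarray(pore_voxel_counts)
    kept = counts[counts >= min_voxels]        # drop <= 2-voxel artifacts
    return 100.0 * kept.sum() * VOXEL_VOL / sample_volume_mm3
```

A pore list of only one- and two-voxel detections therefore contributes zero porosity.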
Fig. 3. XY sample cross section
Fig. 4. YZ sample cross section
The fiber laying pattern, the distribution of pores in the volume and the voids between the layers are clearly visible (see Figs. 3 and 4). Table 2 shows the results of measuring the porosity of samples cut from the studied materials: the minimum and maximum pore volumes in the image, the number of pores and the porosity Vp. The calculated volumetric pore content of the polymer composite material (PCM) samples with different reinforcement types is given in Table 2.

Table 2. Calculated results for porosity

Sample | Volume, mm³ | Pore volume, mm³ (Min / Max / Sum) | Number of pores | Porosity, %
Fiberglass EE380 | 25.948 | 1.42E-06 / 0.058 / 1.002 | 46134 | 3.8628
Carbon fiber CC201 | 25.948 | 1.42E-06 / 0.012 / 0.409 | 7623 | 1.5766
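The porosity column is simply the summed pore volume divided by the sample volume; a quick cross-check of the tabulated values (agreement is to within rounding of the printed sums):

```python
# Cross-check of Table 2: porosity [%] = total pore volume / sample volume * 100.
samples = {
    "Fiberglass EE380": (25.948, 1.002),    # (sample volume, summed pore volume), mm^3
    "Carbon fiber CC201": (25.948, 0.409),
}
porosity = {name: 100.0 * vp / v for name, (v, vp) in samples.items()}
# Fiberglass EE380  -> ~3.86 %
# Carbon fiber CC201 -> ~1.58 %
```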
3 The Method of Applying Impact Damage and the Analysis of Internal Damage by CT

The impact on the samples was applied using an Instron Dynatup 9250HV drop-weight impact machine (Fig. 5, a). The test machine is equipped with a highly sensitive piezoelectric load sensor with a 16-mm-diameter impact tup, which records the load with an accuracy of ±1%; the machine has a pneumatic system
to prevent re-impact; the required impact energy is set by raising the suspended weight to a predetermined height. The tests were carried out in accordance with the requirements of the international standards ASTM D7136 and GOST 33496-2015. A general view of the sample installed in the specialized fixture is shown in Fig. 5.
Fig. 5. Instron Dynatup 9250 HV Vertical drop test machine (a), Impact test equipment (b).
The experimental data were processed using the Impulse software. During the tests, the dependence of the contact force on contact duration during the impact was recorded. The impact energy was chosen taking into account the recommendations of the standards [18, 19] and amounted to 20 J. From the test results, the damage, the damage-initiation energy, the absorbed energy and the maximum impact load were evaluated. Tomograms of the impact zone in two planes were obtained, which made it possible to determine the damage size with high accuracy through the sample volume and in individual layers. The dimensions of the visually recorded dent on the surface are much smaller than the damage zone inside the specimen. Fiber fracture, fiber damage with delamination, and delamination without fiber damage are all observed. All these damages change the structure of the material and thereby reduce its mechanical characteristics. Figures 6 and 7 show the results of microtomographic studies of the PCM samples after impact, establishing the nature of the change in the void space.
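In a drop-weight test the impact energy is set purely by the drop height, E = m·g·h; a minimal sketch (the 5.5 kg carriage mass is an assumed illustrative value, not one given in the paper):

```python
G = 9.81  # gravitational acceleration, m/s^2

def drop_height_m(energy_j, carriage_mass_kg):
    """Drop height that delivers a target impact energy, from E = m*g*h."""
    return energy_j / (carriage_mass_kg * G)

def impact_velocity_ms(energy_j, carriage_mass_kg):
    """Velocity at first contact, from E = m*v^2/2."""
    return (2.0 * energy_j / carriage_mass_kg) ** 0.5

# Example: the 20 J level used in the tests, with an assumed 5.5 kg carriage.
h = drop_height_m(20.0, 5.5)
v = impact_velocity_ms(20.0, 5.5)
```

The machine's pneumatic brake then catches the carriage on rebound so the sample is not struck twice.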
Fig. 6. The cross-section of the EE380 sample in the XZ plane (a), the cross section in the YZ plane at the impact point (b), 3 mm from the impact (c).
On the tomographic XZ sections of the EE380 sample, radial fracture from the impact is observed; on the YZ sections, fracturing, decompaction along the material layers and complete destruction near the center of the impact are observed (Fig. 6).
Fig. 7. The cross-section of the sample CC201 in the XZ plane (a), the cross section in the YZ plane at the impact point (b), 4 mm from the impact (c).
On the tomographic XZ sections of sample CC201, a cross-shaped crack from the impact is observed; on the YZ sections, fracturing and decompaction along the layers of the material are observed (Fig. 7).
4 Conclusions

In this paper, computed tomography is used for non-destructive testing of impact damage. Tomographic studies of specimens subjected to mechanical or other kinds of loading can provide important information on the specifics of the processes of deformation, fracture and pre-fracture of the objects of interest. With the development of industrial X-ray tomography, qualitatively new possibilities open up for obtaining and using information on the internal structure of materials at the macro and micro levels. The application of these methods makes it possible to determine the location and size of impact damage fairly accurately, to assess structural changes and to predict the decrease in the mechanical characteristics of the material in the impact zone.

Acknowledgement. The work was supported by the Russian Foundation for Basic Research (project No. 19-08-00577).
References

1. Fried, N.: In: Proceedings of the 20th Conference, SPI Reinforced Plastics Division, Sec. 1-C (1965)
2. Kayumov, R.A., Tazyukov, B.F., Mukhamedova, I.Z.: Identification of mechanical characteristics of a nonlinear-viscoelastic composite by results of tests on shells of revolution. Mech. Compos. Mater. 55(2), 171–180 (2019)
3. Paimushin, V.N., Kayumov, R.A., Kholmogorov, S.A.: Deformation features and models of [±45]2s cross-ply fiber-reinforced plastics in tension. Mech. Compos. Mater. 55(2), 141–154 (2019)
4. Kayumov, R.A., Tazyukov, B.F., Shakirzyanov, F.R., Mukhamedova, I.Z.: Large deflections of beams, arches and panels in an elastic medium with regard to deformation shifts. Lobachevskii J. Math. 40(3), 321–327 (2019)
5. Gareyev, A.R., Danilov, Ye.A., Pylayev, A.Ye., Yelizarov, P.G., Kolesnikov, S.A.: Determination of the microstructure of 3D reinforced carbon fiber "Facets" by X-ray tomography. Zavodskaya Laboratoriya. Diagnostika Materialov 11(80), 31–35 (2014)
6. Boychuk, A.S., Generalov, A.S., Dalin, M.A., Stepanov, A.V.: Non-destructive testing of technological discontinuities of the T-shaped zone of an integral structure made of PCM using ultrasonic phased arrays. Entsiklopedicheskiy Spravochnik M, No. 10, 38–43 (2012)
7. Tolkachev, V.F., Zheykov, V.V.: The destruction of structural materials and composites during high-speed impact. Vestnik TGU 18(4), 1741–1742 (2013)
8. Dolgodvorov, A.V., Syromyatnikova, A.I.: The study of the volumetric microstructure of the structural carbon-carbon composite material and the creation of a 3D computer model of the test sample. Vestnik PNIPU. Aerokosmicheskaya Tekhnika 37, 202 (2014)
9. Kharin, N., Vorob'yev, O., Bol'shakov, P., Sachenkov, O.: Determination of the orthotropic parameters of a representative sample by computed tomography. J. Phys.: Conf. Ser. 1158(3), 032012 (2019). https://doi.org/10.1088/1742-6596/1158/3/032012
10. Gerasimov, O., Kharin, N., Vorob'yev, O., Semenova, E., Sachenkov, O.: Determination of the mechanical properties distribution of the sample by tomography data. J. Phys.: Conf. Ser. 1158(2), 022046 (2019). https://doi.org/10.1088/1742-6596/1158/2/022046
11. Gerasimov, O., Koroleva, E., Sachenkov, O.: Experimental study of evaluation of mechanical parameters of heterogeneous porous structure. In: IOP Conference Series: Materials Science and Engineering, vol. 208, no. 1, p. 012013 (2017). https://doi.org/10.1088/1757-899x/208/1/012013
12. Kharin, N.V., Vorobyev, O.V., Berezhnoi, D.V., Sachenkov, O.A.: Construction of a representative model based on computed tomography. PNRPU Mech. Bull. 3, 95–102 (2018). https://doi.org/10.15593/perm.mech/2018.3.10
13. Sachenkov, O.A., Gerasimov, O.V., Koroleva, Y.V., Mukhin, D.A., Yaikova, V.V., Akhtyamov, I.F., Shakirova, F.V., Korobeynikova, D.A., Chzhi, K.K.: Building the inhomogeneous finite element model by the data of computed tomography. Russ. J. Biomech. 22(3), 291–303 (2018). https://doi.org/10.15593/RJBiomeh/2018.3.05
14. Gerasimov, O.V., Berezhnoi, D.V., Bolshakov, P.V., Statsenko, E.O., Sachenkov, O.A.: Mechanical model of a heterogeneous continuum based on numerical-digital algorithm processing computer tomography data. Russ. J. Biomech. 23(1), 87–97 (2019). https://doi.org/10.15593/RJBiomech/2019.1.10
15. Marwa, F., Wajih, E.Y., Philippe, L., Mohsen, M.: Improved USCT of paired bones using wavelet-based image processing. Int. J. Image Graph. Signal Process. (IJIGSP) 10(9), 1–9 (2018). https://doi.org/10.5815/ijigsp.2018.09.01
16. Mithun, K.P.K., Gauhar, A., Mohammad, M.R., Delowar, A.S.M.H.: Automatically gradient threshold estimation of anisotropic diffusion for Meyer's watershed algorithm based optimal segmentation. Int. J. Image Graph. Signal Process. (IJIGSP) 6(12), 26–31 (2014). https://doi.org/10.5815/ijigsp.2014.12.04
17. Mithun, K.P.K., Mohammad, M.R.: Metal artifact reduction from computed tomography (CT) images using directional restoration filter. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 6(6), 47–54 (2014). https://doi.org/10.5815/ijitcs.2014.06.07
18. ASTM D7136/D7136M-12: Standard Test Method for Measuring the Damage Resistance of a Fiber-Reinforced Polymer Matrix Composite to a Drop-Weight Impact Event
19. ASD-STAN prEN 6038 P1: Fiber Reinforced Plastics – Test Method – Determination of the Compression Strength after Impact
On the Possibility of Applying a Multi-frequency Dynamic Absorber (MDA) to Seismic Protection Tasks

S. B. Makarov and N. V. Pankova

Mechanical Engineering Research Institute of the Russian Academy of Sciences, Moscow, Russia [email protected], [email protected]
Abstract. The question discussed is how best to protect from seismic impact a building in which high-tech production is planned, or for which there are other reasons for severe restrictions on the acceptable vibration level. Is a massive foundation needed, or is there another way of protection if the building is already built? It is proposed to use a passive multi-frequency dynamic absorber (MDA) in the form of an elastic continuum of rubber-like material, which has a large number of eigenfrequencies in the frequency range of seismic effects. When the MDA has eigenfrequencies close to the resonances of the protected object, part of the building's oscillation energy is pumped into the absorber's oscillations, and the peak level of the frequency response function is significantly reduced.

Keywords: Multi-frequency Dynamic Absorber (MDA) · Elastic systems · Resonance · Eigenfrequencies and modes of vibration · Seismic effects
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 395–403, 2020. https://doi.org/10.1007/978-3-030-39216-1_36

1 Introduction

The authors believe that the experience of the Mechanical Engineering Research Institute of RAS in studying the vibrations of structurally complex systems will be useful for protection against the seismic impacts that shake the planet and cause enormous damage [1–6]. Earthquakes, like any unwanted dynamic effects, can be fought in only three ways: suppressing the earthquake at the source (impossible for now), weakening its action along the propagation path and, finally, protecting the structure itself (for example, a building). With dense urban development, in fact, only the third method is feasible. To protect the structure itself, its dynamic properties must be changed, i.e. its response to dynamic impacts must be reduced by increasing its damping properties [7]. An effective, inexpensive and long and widely used means of reducing vibrations is passive dynamic vibration dampers [8, 9] tuned to a certain frequency. Various approaches to monitoring and recording damping are used in the study of specific complex structures [10–12]. Since about 2012, our Institute (IMASH RAS) has been researching the protection of various building structures from seismic effects with the passive multi-frequency
396
S. B. Makarov and N. V. Pankova
dynamic oscillation absorbers (MDA). The novelty of using an MDA consists in reducing the vibration level of the protected structure not at one resonance but simultaneously at several resonances lying in a given frequency range. The main causes of seismic effects are well known: tectonic (movement of tectonic plates), volcanic, and anthropogenic causes associated with human activity (filling of reservoirs, water injection into wells during various works, man-made explosions, etc.). Seismic waves of various types can occur: P (longitudinal), S (transverse), and surface waves (Rayleigh and Love waves). The propagation speeds of these waves differ, so buildings can be affected by excitations of different frequencies. Accordingly, various elastic structures have been proposed as multi-frequency absorbers: beam-type absorbers [13], elastic containers filled with liquid [14, 15], and an elastic medium with a large set of natural resonant frequencies (a continuum) [16]. The studies were conducted by numerical modeling. In computer simulations, encouraging results were obtained in reducing the vibration levels of protected structures on whose roofs (i.e., at the places of their highest vibroactivity at low frequencies) MDAs were installed. The building oscillations are significantly reduced owing to energy absorption by the MDA, which is tuned to the range of the main low-frequency resonances of the building. This paper discusses the use of MDAs to reduce low-frequency vibrations of buildings for which seismic effects are among the most dangerous dynamic impacts. In fact, to protect a particular building from seismic effects, it is necessary to determine (by calculation or by measurement on dynamically similar models) its natural frequencies and modes of vibration. One also needs to know the most likely frequency spectrum of earthquakes in the given area, or at least the frequency range of seismic excitations, which seismologists may know.
Then, on the basis of computer modeling, an MDA with the corresponding resonant properties must be created; the performance of such protection on a building must be checked in simulation and, if possible, in a full-scale or model experiment; and the MDA can then be recommended for reducing the vibrational response of the protected building.
2 Study of Vibration of a Building of Tower Type

Consider the model task of protecting a concrete tower building from seismic effects, and let us compare two approaches to solving this problem. The first approach is to use a massive concrete foundation for the building, which requires considerable material resources and a large amount of construction work carried out before the construction of the building itself. In the model problem, a concrete tower building with a base of 8 by 5 m and a height of 20 m, with wall and ceiling thickness 0.5 m, was considered. The density of the concrete was taken as 1730 kg/m³, the Young's modulus as 24.5 × 10⁹ N/m², and the Poisson's ratio as 0.2. The mass of the building was 449.8 tons. A coordinate system was introduced in which the relative displacements of the building points were determined: the X axis is directed along the wide part of the building parallel to the
base, the Y axis along the narrow part of the building, and the Z axis is directed vertically upwards.
Fig. 1. Two first eigen modes of building vibrations.
40 eigenfrequencies and mode shapes of the building were calculated with the base clamped. These frequencies range from 7 to 130 Hz. Figure 1 shows the two lowest modes of the building's vibrations. At a frequency of 7.73 Hz, a bending oscillation of the building occurs in the YOZ plane, and at 11.04 Hz a bending oscillation occurs in the XOZ plane. According to seismologists, the largest number of natural earthquakes occurs in the range up to 20–25 Hz, and the building under study has resonant frequencies in this range. The amplitude-frequency responses of the accelerations at the observation points of the protected object were determined under kinematic excitation of the base of the building by acceleration with unit amplitude along all three coordinate axes in the range up to 35–40 Hz, i.e. so-called truncated white noise. The choice of excitation is based on one of the three linearity postulates used in modal analysis: superposition. The frequency characteristics do not depend on the type and form of the excitation; excitation by a sinusoidal force with a frequency sweep gives the same results as excitation by a wide-band random force. The observation points were selected at the corners and in the upper part of the side panels of the building, which move most strongly along the Y axis in the specified frequency range. Figure 2 shows the amplitude-frequency response of the vibration accelerations of the observation points along the Y axis; the maximum vibration acceleration at 7.731 Hz (the first eigenfrequency of the building) was 27 units (2.8 g).
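A crude hand check of the order of magnitude of these lowest frequencies can be made by idealizing the tower as a hollow-box Euler–Bernoulli cantilever with the model data from the text. The idealization (no floor slabs, no shear deformation) is an assumption, so the estimate is only expected to land in the same band as the FE results, below the 25 Hz seismic range:

```python
import math

E = 24.5e9        # Pa, concrete Young's modulus (model value)
RHO = 1730.0      # kg/m^3, concrete density (model value)
L = 20.0          # m, building height
B, H, T = 8.0, 5.0, 0.5   # outer plan dimensions and wall thickness, m

A = B * H - (B - 2 * T) * (H - 2 * T)                      # hollow section area
I_yoz = (B * H**3 - (B - 2 * T) * (H - 2 * T)**3) / 12.0   # bending in YOZ plane
I_xoz = (H * B**3 - (H - 2 * T) * (B - 2 * T)**3) / 12.0   # bending in XOZ plane

def cantilever_f1(I):
    """First bending frequency of a uniform clamped-free beam, Hz."""
    lam1 = 1.8751  # first root of cosh(x) * cos(x) = -1
    return lam1**2 / (2.0 * math.pi * L**2) * math.sqrt(E * I / (RHO * A))

f_yoz, f_xoz = cantilever_f1(I_yoz), cantilever_f1(I_xoz)
```

Both estimates fall in the same low-frequency band as the FE values of 7.73 and 11.04 Hz, confirming that the building's fundamental resonances sit inside the earthquake excitation range.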
Fig. 2. The amplitude-frequency response of the vibration accelerations of the observation points along the Y axis.
Similarly, at 11.04 Hz (the second eigenfrequency of the building), the maximum acceleration along the X axis was 12 units (1.2 g). Next, we compare two approaches to the task: building a massive foundation, or changing the resonant properties of the building itself in order to reduce oscillations at all frequencies in a given frequency range.
3 Vibration of Building on a Powerful Foundation

As the foundation, we choose a concrete parallelepiped with a base of 9 by 6 m and a height of 3 m, on which the building discussed above is installed rigidly and symmetrically. The whole foundation is considered buried in the ground. The mass of the building with the foundation is 823.48 tons, i.e. the mass of the object increased by 373.68 tons (which may increase construction costs). A calculation of 40 eigenfrequencies and modes, similar to the above, showed that the first (lowest) frequencies of the object in the 2–25 Hz range did not change much: they decreased slightly from 7.731 Hz to 7.373 Hz and from 11.044 Hz to 10.606 Hz (Fig. 3), while the mass of the building with the foundation increased by 373.68 tons (by 83%).
Fig. 3. The 3 first (lowest) eigen frequencies of the object.
Similarly to the above, under kinematic excitation of the base of the building by truncated white noise, we obtain the frequency response of the vibration accelerations at the observation points along the Y axis (perpendicular to the wide part of the building) in the range up to 25 Hz (Fig. 4). The acceleration amplitude of the oscillations decreased by no more than 23%.
Fig. 4. The amplitude-frequency response of the vibration accelerations of the observation points along the Y axis.
It should be noted here that in earthquake-resistant construction, elastic shock absorbers are used on foundations; their use shifts resonances to lower frequencies, but the amplitudes of the vibration accelerations do not decrease significantly. And if an "unexpected" frequency is present in the seismic excitation (of technogenic origin or from explosions, for example), the damping effect may be even less noticeable.
4 Vibration of Building with Installed MDA

The task is to damp the oscillations not at a specific frequency but in a certain frequency range (up to 20–25 Hz). This requires a continuum: an elastic object with a sufficiently large set of natural oscillation frequencies in the given range. As the MDA we select an object made of rubber-like material in the form of a "box" measuring 8 by 5 m, with wall thickness 0.5 m, height 2 m and depth 1.5 m. The density of the material is 1300 kg/m³, the Young's modulus is 7 × 10⁷ N/m², and the Poisson's ratio is 0.4. The mass of the MDA is 49.4 tons. This MDA is installed on the roof of the building and rigidly fastened to it. As part of the modal analysis, a computer study of the building vibrations caused by seismic effects and of their damping was performed. 200 eigenfrequencies and mode shapes of the building with the MDA were calculated in the range from 6 to 80 Hz. It should be noted that in this arrangement the building is only slightly deformed at most of these frequencies (Fig. 5).
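The energy-pumping mechanism can be sketched with the classical two-mass model: a lightly damped main oscillator (one building mode, normalized to ω₁ = 1 rad/s) with and without an attached absorber. Den Hartog's single-frequency tuning and the 10% mass ratio are stand-in assumptions for the continuum MDA, not values from the paper:

```python
import numpy as np

# Main structure (normalized) and absorber parameters.
m1, k1, zeta1 = 1.0, 1.0, 0.01
mu = 0.10                                   # absorber-to-structure mass ratio
m2 = mu * m1
f_opt = 1.0 / (1.0 + mu)                    # Den Hartog frequency tuning
zeta2 = np.sqrt(3.0 * mu / (8.0 * (1.0 + mu) ** 3))
k2 = m2 * f_opt**2
c1 = 2.0 * zeta1 * np.sqrt(k1 * m1)
c2 = 2.0 * zeta2 * m2 * f_opt

def peak_response(with_absorber, w=np.linspace(0.5, 1.5, 4001)):
    """Peak of |X1(w)| under a unit harmonic force on the main mass."""
    amps = []
    for wi in w:
        if with_absorber:
            # 2-DOF impedance matrix, solved per frequency.
            Z = np.array([[k1 + k2 - m1 * wi**2 + 1j * wi * (c1 + c2),
                           -k2 - 1j * wi * c2],
                          [-k2 - 1j * wi * c2,
                           k2 - m2 * wi**2 + 1j * wi * c2]])
            amps.append(abs(np.linalg.solve(Z, [1.0, 0.0])[0]))
        else:
            amps.append(abs(1.0 / (k1 - m1 * wi**2 + 1j * wi * c1)))
    return max(amps)
```

Running `peak_response` with and without the absorber shows the sharp main-mode resonance replaced by two much lower peaks: the absorber mass oscillates strongly while the main mass is quieted, which is exactly the energy transfer the MDA exploits over a whole band of resonances.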
Fig. 5. The building is little deformed at most of its frequencies.
The panel mode of the wall oscillations of the building under study appears for the first time at 29.359 Hz (the 32nd frequency of the building), the second panel mode at 42.522 Hz (the 69th frequency), and the third at 51.964 Hz (the 91st frequency). Thus, in the frequency range of seismic effects (up to 20–25 Hz), the building does not exhibit its resonant frequencies in a destructive manner. Having carried out the computer procedure of kinematic excitation of the base of the building, we obtain the frequency response of the vibration accelerations at the observation points in its upper part (Fig. 6). It is seen that the vibration acceleration of the points of the building has decreased by more than 2 times in the range up to 40 Hz. At the same time, there are significant oscillations of the absorber points (Fig. 7) in the frequency range of seismic effects (up to 20–25 Hz): more than
90 units (9 g) in acceleration. The frequency response of the vibration accelerations is given for the Y direction, as the most vibroactive direction in this frequency range. Consequently, it was possible to transfer the energy of the building oscillations caused by the seismic effect into oscillations of the MDA. At the same time, there is no need for labor-intensive and expensive work on the foundation, which can reduce the cost of construction. Thus, the model study demonstrated a clear effect of the application of the MDA. Of course, the strength properties of the MDA material should be analyzed, but this is beyond the scope of this work. We also do not claim that this form of MDA is the best; most likely its shape can be optimized.
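The energy transfer described above can be illustrated with a strongly simplified two-degree-of-freedom analogue (a lumped building mass carrying a lumped absorber on a moving base) rather than the authors' finite-element model; all masses, stiffnesses, and damping values below are illustrative assumptions, not the parameters of the studied building:

```python
import numpy as np

def frf(m1, c1, k1, m2, c2, k2, freqs_hz):
    """|X1| per unit base displacement for a 2-DOF chain under base excitation:
    primary structure (m1, c1, k1) with an absorber (m2, c2, k2) on top;
    x1, x2 are displacements relative to the moving base."""
    out = []
    for f in freqs_hz:
        w = 2.0 * np.pi * f
        # dynamic stiffness matrix of the relative coordinates
        K = np.array([[k1 + k2 - m1*w**2 + 1j*w*(c1 + c2), -(k2 + 1j*w*c2)],
                      [-(k2 + 1j*w*c2), k2 - m2*w**2 + 1j*w*c2]])
        # inertial load produced by unit harmonic base motion
        x = np.linalg.solve(K, w**2 * np.array([m1, m2]))
        out.append(abs(x[0]))
    return np.array(out)

f = np.linspace(0.5, 25.0, 500)                     # seismic range, Hz
bare = frf(1e6, 2e5, 4e8, 1e-9, 0.0, 1e-3, f)       # absorber switched off
damped = frf(1e6, 2e5, 4e8, 5e4, 2.5e5, 2e7, f)     # ~5% absorber mass, tuned
print("resonance peak reduced by a factor of", bare.max() / damped.max())
```

Even this crude sketch reproduces the qualitative effect reported above: adding a tuned, damped mass of a few percent of the structure's mass cuts the resonance peak several-fold.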
Fig. 6. The vibration acceleration of the points of the building decreased by more than a factor of 2 in the range up to 40 Hz.
S. B. Makarov and N. V. Pankova
Fig. 7. The energy of the building oscillations caused by the seismic effect is transferred to the oscillations of the MDA.
5 Conclusions
1. The proposed passive MDA makes it possible to protect existing buildings (for example, the historical stone buildings of cities) from seismic impacts over a wide frequency range, even when the excitation spectrum is not known in advance.
2. The MDA transfers part of the energy of the building oscillations caused by seismic impacts into oscillations of the MDA itself. In this case, there is no need to build a massive foundation with a mass comparable to that of the building, which can reduce the cost of construction.
3. The use of MDAs has a significant drawback: it requires a noticeable increase in the mass of the structure. This disadvantage practically rules out the use of MDAs in land and flying vehicles, as well as in many engineering constructions, due to the
reduction of their energy efficiency. However, for the protection of buildings from seismic loads and earthquakes they can be very useful.
4. The presented study illustrates the principal capability of a passive MDA to reduce the level of vibration of building structures caused by seismic impacts. The authors believe that the application of MDAs to real practical problems requires additional studies of the behavior of MDAs made of new and developing materials (elastomers) under different types of loading. This will make it possible to take into account the extremely varied climatic conditions under which passive MDAs may be used.
About the Calculation by the Method of Linearization of Oscillations in a System with Time Lag and Limited Power-Supply

Alishir A. Alifov¹ and M. G. Farzaliev²

¹ Mechanical Engineering Research Institute of the Russian Academy of Sciences, Moscow 101990, Russia
[email protected]
² Azerbaijan State University of Economics, Baku, Azerbaijan
Abstract. A self-oscillatory system under external excitation with an energy source of limited power (limited excitation) is considered. The friction force that causes the self-oscillations contains a time lag. Using the methods of direct linearization, the equations of non-stationary motions are obtained. Stationary oscillations and their stability are considered; stability conditions for steady-state oscillations are derived using the Routh-Hurwitz criteria. Calculations were performed to obtain information on the effect of the time lag on the characteristics of stationary modes. The results are qualitatively identical to those obtained using the well-known methods of nonlinear mechanics, with only small quantitative differences. At the same time, the application of the direct linearization method is quite simple and requires significantly (several orders of magnitude) less time and labor compared with the known methods of nonlinear mechanics.

Keywords: Self-oscillatory system · Energy source · Limited power · Friction force · Lag · Method · Direct linearization
1 Introduction
Thanks to the use of artificial intelligence systems (manipulators, automatic machines, robots, flexible production systems, etc.), cybernetic machines are able to adapt to environmental changes. In the modern definition, a machine is a device that performs the functions assigned to it; it consists of interconnected parts and uses energy to function. Any machine can work only if it has an energy source (engine) under whose influence it operates. When creating devices for any purpose, calculations are carried out and various schemes and models are used. The study of dynamic models of devices with a multidimensional structure involves, in many cases, solving differential equations (linear and nonlinear). Such equations also describe oscillatory processes. As is known, virtually all physical objects (and not only physical ones) are potentially oscillatory systems, and oscillations appear in the vast majority of real systems, which are characterized by a mode of motion continuously changing in time.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 404–413, 2020. https://doi.org/10.1007/978-3-030-39216-1_37
In the control system of an artificial intelligence, various variables (displacements, speed,
temperature, voltage, etc.) included in the dynamic models describing the device can act as the object of control. Systems with lag are widespread in devices of various kinds (automatic control systems, tracking systems, radio devices, electronics, regulators, vibration machines, etc.). It is known [1] that in mechanical systems lag is caused by internal friction in materials, imperfection of their elastic properties, etc. The presence of lag in a system can be both useful and harmful. A large number of works [2–4 and others] are devoted to the problems of oscillations in systems with lag without taking into account the limited power of the energy source. At the same time, there are relatively few studies of oscillatory systems with lag that take into account the properties of the energy source sustaining the oscillations. For the analysis of nonlinear oscillatory systems with lag, the known approximate methods of nonlinear mechanics are used: averaging, energy balance, harmonic linearization, etc. [2, 5–10]. All these methods are laborious, and the labor grows with increasing degree of nonlinearity. One of the main problems of the nonlinear dynamics of systems is the large labor and time cost. The presence of this problem in the analysis of coupled oscillator networks, which play an important role in biology, chemistry, physics, electronics, neural networks, etc., is pointed out in [11] with reference to [12–15]. In contrast, the direct linearization methods [16–19 and others] are several orders of magnitude less laborious and are simple to use. Below, using the methods of direct linearization, a model of a self-oscillating system with an energy source of limited power, external excitation, and a lag in the friction force is considered. The aim of the work is to develop, on the basis of direct linearization methods, a procedure for calculating mixed oscillations in oscillatory systems that interact with energy sources of limited power.
2 System Model
The differential equations of motion of the system shown in Fig. 1 have the form

$$m\ddot{x} + k_0\dot{x} + c_0 x = T(U_\Delta) + k\sin pt, \qquad J\ddot{\varphi} = M(\dot{\varphi}) - r_0 T(U_\Delta) \tag{1}$$

where $k\sin pt$ is the external force with amplitude $k$ and frequency $p$; $T(U_\Delta)$ is the nonlinear friction force, which depends on the time lag $\Delta$ and causes the self-oscillations; $U_\Delta = V - \dot{x}_\Delta$, $\dot{x}_\Delta = \dot{x}(t-\Delta)$, $V = r_0\dot{\varphi}$; $r_0 = \mathrm{const}$ is the radius at which the friction force $T(U_\Delta)$ is applied at the point of contact between the body of mass $m$ and the belt; $k_0$ is the damping factor; $c_0 = \mathrm{const}$; $J$ is the total moment of inertia of the rotating parts; $M(\dot{\varphi})$ is the difference between the torque of the energy source and the moment of resistance to rotation; and $\dot{\varphi}$ is the engine rotation speed.
Fig. 1. System model.
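A minimal time-domain sketch of system (1) can be obtained by straightforward numerical integration, storing past velocities in a ring buffer to realize the lag; here the belt speed is held constant (an idealized, unlimited energy source), and the cubic falling friction characteristic and all numeric values are illustrative assumptions, not the paper's computation:

```python
import math

# Euler integration of the oscillator equation of (1) with a delayed friction
# characteristic T(U) = R*(sgn U - d1*U + d3*U**3) (sign convention assumed).
m, k0, c0 = 1.0, 0.02, 1.0
R, d1, d3 = 0.5, 0.84, 0.18
k_ext, p = 0.02, 1.0        # external force amplitude and frequency
V, D = 0.5, 0.3             # constant belt speed and time lag
dt = 1.0e-3
nd = int(D / dt)            # delay expressed in time steps

def T(U):
    return R * (math.copysign(1.0, U) - d1*U + d3*U**3)

x, v = 0.0, 0.0
hist = [0.0] * (nd + 1)     # ring buffer of past velocities
amp = 0.0
for i in range(200000):     # 200 s of model time
    v_del = hist[i % (nd + 1)]          # velocity delayed by ~D
    a = (T(V - v_del) + k_ext*math.sin(p*i*dt) - k0*v - c0*x) / m
    x += v * dt
    v += a * dt
    hist[i % (nd + 1)] = v
    if i*dt > 150.0:
        amp = max(amp, abs(x))          # steady-state amplitude estimate
print("steady-state amplitude ~", amp)
```

With these values the effective damping near zero velocity is negative, so self-oscillations grow until the cubic term limits them; the printed amplitude is the bounded steady level.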
The nonlinear friction force $T(U_\Delta)$ is represented, lowering the index $\Delta$ on $\dot{x}$, in the form

$$T(U_\Delta) = R\left[\,\mathrm{sgn}\,U_\Delta + F(\dot{x})\,\right], \qquad F(\dot{x}) = \sum_i \delta_i U_\Delta^i = \sum_{n=0}^{5} a_n \dot{x}^n \tag{2}$$
where $\delta_i = \mathrm{const}$, $i = 1, 2, 3, \ldots$; $R$ is the normal reaction force; $\mathrm{sgn}\,U_\Delta = 1$ for $U_\Delta > 0$ and $\mathrm{sgn}\,U_\Delta = -1$ for $U_\Delta < 0$; and

$$a_0 = \delta_1 V + \delta_2 V^2 + \delta_3 V^3 + \delta_4 V^4 + \delta_5 V^5, \qquad a_1 = -(\delta_1 + 2\delta_2 V + 3\delta_3 V^2 + 4\delta_4 V^3 + 5\delta_5 V^4),$$
$$a_2 = \delta_2 + 3\delta_3 V + 6\delta_4 V^2 + 10\delta_5 V^3, \qquad a_3 = -(\delta_3 + 4\delta_4 V + 10\delta_5 V^2), \qquad a_4 = \delta_4 + 5\delta_5 V, \qquad a_5 = -\delta_5.$$

As a result of averaging, $V = r_0\dot{\varphi}$ is replaced by $u = r_0\Omega$. The nonlinear function $F(\dot{x})$ is replaced, by the method of direct linearization [13], with the linear function

$$F^*(\dot{x}) = B_F + k_F \dot{x}_\Delta \tag{3}$$

where $B_F$, $k_F$ are the linearization coefficients defined by the expressions

$$B_F = \sum_n N_n a_n \upsilon^n \quad (n = 0, 2, 4,\ \text{even}), \qquad k_F = \sum_n \bar{N}_n a_n \upsilon^{\,n-1} \quad (n = 1, 3, 5,\ \text{odd}),$$
$$N_n = \frac{2r+1}{2r+1+n}, \qquad \bar{N}_n = \frac{2r+3}{2r+2+n},$$

$\upsilon = \max|\dot{x}|$, and $r$ is the linearization accuracy parameter. Regardless of the value of $r$, $N_0 = 1$ and $\bar{N}_1 = 1$.
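The coefficients of Eq. (3) are simple to evaluate; a minimal sketch (the coefficient vector $a_0,\ldots,a_5$ passed in the usage example is illustrative):

```python
# Direct-linearization coefficients of Eq. (3) for F(x') = sum_{n=0}^{5} a_n x'^n:
# B_F sums the even n with N_n = (2r+1)/(2r+1+n),
# k_F sums the odd n with Nbar_n = (2r+3)/(2r+2+n); v = max|x'|.
def linearize(a, v, r=1.5):
    B_F = sum((2*r + 1) / (2*r + 1 + n) * a[n] * v**n for n in (0, 2, 4))
    k_F = sum((2*r + 3) / (2*r + 2 + n) * a[n] * v**(n - 1) for n in (1, 3, 5))
    return B_F, k_F

# N_0 = 1 and Nbar_1 = 1 regardless of r, as stated in the text:
print(linearize([1, 0, 0, 0, 0, 0], v=2.0))   # (1.0, 0.0)
print(linearize([0, 1, 0, 0, 0, 0], v=3.0))   # (0.0, 1.0)
```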
Equations (1), in view of (3), take the form

$$m\ddot{x} + k_0\dot{x} + c_0 x = R\,(\mathrm{sgn}\,U_\Delta + B_F + k_F\dot{x}_\Delta) + k\sin pt, \qquad J\ddot{\varphi} = M(\dot{\varphi}) - r_0 R\,(\mathrm{sgn}\,U_\Delta + B_F + k_F\dot{x}_\Delta) \tag{4}$$

Before proceeding to the solutions of system (4), we note that they differ for $U_\Delta > 0$ and $U_\Delta < 0$.
3 Equation Solutions
To solve system (4), we use the method of replacing variables with averaging [13]. It allows one to consider stationary and non-stationary processes in the resonance region and its close vicinity. In this method, for a nonlinear equation of the form

$$\ddot{x} + F(\dot{x}) + f(x) = H(t, x) \tag{5}$$

with the nonlinear functions replaced by their linearized counterparts $F^*(\dot{x})$ and $f^*(x)$, solutions are sought on the basis of the form

$$x = a\cos\psi, \qquad \dot{x} = -ap\sin\psi, \qquad \psi = pt + \xi \tag{6}$$

This yields the following standard-form equations for determining the non-stationary values $\upsilon$ and $\xi$:

$$\frac{d\upsilon}{dt} = -\frac{\bar{k}\upsilon}{2} - H_s(\upsilon, \xi), \qquad \frac{d\xi}{dt} = \frac{\omega^2 - p^2}{2p} - \frac{1}{\upsilon} H_c(\upsilon, \xi)$$

where $\bar{k}$ and $\omega^2$ denote the coefficients of the linearized functions $F^*(\dot{x})$ and $f^*(x)$, $\upsilon = ap$ follows from $\upsilon = \max|\dot{x}|$ and the expression for $\dot{x}$ in (6), and

$$H_s(\upsilon, \xi) = \frac{1}{2\pi}\int_0^{2\pi} H(\cdot)\sin\psi\,d\psi, \qquad H_c(\upsilon, \xi) = \frac{1}{2\pi}\int_0^{2\pi} H(\cdot)\cos\psi\,d\psi$$

Comparing the first equations of (4) and (5), one can determine and calculate the expressions $H_s(\upsilon, \xi)$ and $H_c(\upsilon, \xi)$, with allowance for $\dot{x}_\Delta = -ap\sin(\psi - p\Delta)$. To solve the second equation of (4), we use the procedure described in [19]. As a result of these actions, we obtain the following equations for the non-stationary values of the amplitude $a$, phase $\xi$, and speed $u$:
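For the forcing term of (4), $H = (k/m)\sin pt = (k/m)\sin(\psi - \xi)$, the averaging integrals have the closed forms $H_s = (k/2m)\cos\xi$ and $H_c = -(k/2m)\sin\xi$ (under the sign conventions of (6)); a short numerical check:

```python
import numpy as np

# Rectangle-rule average over one period: for a smooth periodic integrand the
# uniform-grid mean is spectrally accurate, so the comparison is near-exact.
k, m, xi = 0.02, 1.0, 0.7
psi = np.linspace(0.0, 2.0*np.pi, 4096, endpoint=False)
H = (k/m) * np.sin(psi - xi)
H_s = (H * np.sin(psi)).mean()   # = (1/2pi) * integral of H sin(psi) d(psi)
H_c = (H * np.cos(psi)).mean()
print(H_s - (k/(2*m))*np.cos(xi), H_c + (k/(2*m))*np.sin(xi))  # both ~0
```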
ð7aÞ
408
A. A. Alifov and M. G. Farzaliev
(b) u\ap da a 4R pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi k 2 p2 u2 ¼ ðk0 RkF cos pDÞ þ cos n a dt 2m p a2 p2 2pm dn x2 p2 k R kF ¼ 0 sin n þ þ sin pD dt 2apm 2p 2m du r0 u r0 R ¼ ð3p 2w Þ Mð Þ r0 Rð1 BF Þ dt r0 p J
ð7bÞ
Here $\psi_* = 2\pi - \arcsin(u/ap)$, $\omega_0^2 = c_0/m$, and in the case $u < ap$ the technique described in [3, 4] is used. To determine the stationary values of the amplitude and phase from (7a), we have

$$\left[m(\omega_0^2 - p^2) + pRk_F\sin p\Delta\right]^2 + p^2\left(k_0 - Rk_F\cos p\Delta\right)^2 = \left(\frac{k}{a}\right)^2,$$
$$\tan\xi = \frac{m(\omega_0^2 - p^2) + pRk_F\sin p\Delta}{p\,(k_0 - Rk_F\cos p\Delta)} \tag{8}$$
In the case $u < ap$, the amplitude of stationary oscillations is determined by the approximate equality $ap \approx u$. From the condition $\dot{u} = 0$ follows an equation of general form for finding the stationary values of the velocity:

$$M(u/r_0) - S(u) = 0 \tag{9}$$

where

(a) for $u \ge ap$: $\;S(u) = r_0 R\,(1 + B_F)$;
(b) for $u < ap$: $\;S(u) = r_0 R\left[(1 - B_F) + \pi^{-1}(3\pi - 2\psi_*)\right]$.

The expression for $S(u)$ in the case $u < ap$ is simplified by taking into account the approximate equality $ap \approx u$ for the stationary amplitude.
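Because $k_F$ depends on the amplitude through $\upsilon = ap$, Eq. (8) is implicit in $a$ and is conveniently solved numerically. The sketch below does this by bisection at resonance for the parameter set of Sect. 5; the cubic falling characteristic $F = -\delta_1 U + \delta_3 U^3$ and the resulting odd expansion coefficients $a_1 = \delta_1 - 3\delta_3 u^2$, $a_3 = -\delta_3$ are a sign-convention assumption:

```python
import math

m, k0, k, R = 1.0, 0.02, 0.02, 0.5
w0, p, pD = 1.0, 1.0, 0.0
d1, d3, u, r = 0.84, 0.18, 1.16, 1.5

def k_F(a):
    a1 = d1 - 3.0*d3*u**2            # odd expansion coefficients of F in x'
    a3 = -d3
    N3 = (2*r + 3) / (2*r + 2 + 3)   # Nbar_3 = 0.75 for r = 1.5
    return a1 + N3*a3*(a*p)**2

def residual(a):                     # Eq. (8) rearranged to f(a) = 0
    A = m*(w0**2 - p**2) + p*R*k_F(a)*math.sin(pD)
    B = p*(k0 - R*k_F(a)*math.cos(pD))
    return a*math.hypot(A, B) - k

lo, hi = 0.8, 2.0                    # bracket: residual(lo) < 0 < residual(hi)
for _ in range(60):
    mid = 0.5*(lo + hi)
    if residual(lo) * residual(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
print("stationary amplitude a =", 0.5*(lo + hi))
```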
4 Stability of Stationary Modes
Stationary motions need to be checked for stability. For this purpose, we form the equations in variations for (7a) and (7b) and use the Routh-Hurwitz criteria

$$D_1 > 0, \qquad D_3 > 0, \qquad D_1 D_2 - D_3 > 0 \tag{10}$$

where

$$D_1 = -(b_{11} + b_{22} + b_{33}),$$
$$D_2 = b_{11}b_{33} + b_{11}b_{22} + b_{22}b_{33} - b_{23}b_{32} - b_{12}b_{21} - b_{13}b_{31},$$
$$D_3 = b_{11}b_{23}b_{32} + b_{12}b_{21}b_{33} + b_{13}b_{22}b_{31} - b_{11}b_{22}b_{33} - b_{12}b_{23}b_{31} - b_{13}b_{21}b_{32}.$$

Stationary regimes are stable if conditions (10) are satisfied. In the case $u \ge ap$ we have

$$b_{11} = \frac{r_0}{J}\left(Q - r_0R\frac{\partial B_F}{\partial u}\right), \quad b_{12} = -\frac{r_0^2R}{J}\frac{\partial B_F}{\partial a}, \quad b_{13} = 0, \quad b_{21} = \frac{aR}{2m}\frac{\partial k_F}{\partial u}\cos p\Delta,$$
$$b_{22} = -\frac{1}{2m}\left(k_0 - Rk_F\cos p\Delta - aR\frac{\partial k_F}{\partial a}\cos p\Delta\right), \quad b_{23} = \frac{k}{2pm}\sin\xi,$$
$$b_{32} = -\frac{k}{2pma^2}\sin\xi + \frac{R}{2m}\frac{\partial k_F}{\partial a}\sin p\Delta, \quad b_{33} = \frac{k}{2pma}\cos\xi,$$

where

$$Q = \frac{d}{du}\,M\!\left(\frac{u}{r_0}\right).$$
In the case $u < ap$, only the following coefficients change:

$$b_{11} = \frac{r_0}{J}\left[Q - r_0R\frac{\partial B_F}{\partial u} - \frac{2r_0R}{\pi\sqrt{a^2p^2 - u^2}}\right], \qquad b_{12} = -\frac{r_0^2R}{J}\left[\frac{\partial B_F}{\partial a} + \frac{2u}{\pi a\sqrt{a^2p^2 - u^2}}\right],$$
$$b_{21} = \frac{a}{2m}\left[R\frac{\partial k_F}{\partial u}\cos p\Delta + \frac{4uR}{\pi a^2p^2\sqrt{a^2p^2 - u^2}}\right],$$
$$b_{22} = -\frac{1}{2m}\left[k_0 - Rk_F\cos p\Delta - aR\frac{\partial k_F}{\partial a}\cos p\Delta + \frac{4Ru^2}{\pi a^2p^2\sqrt{a^2p^2 - u^2}}\right].$$
To express the friction force $F(\dot{x}) = \sum_{n=0}^{5} a_n\dot{x}^n$ in terms of $\dot{\varphi} = \Omega$, $u = r_0\Omega$ after averaging, we have

$$\frac{\partial B_F}{\partial u} = \frac{\partial a_0}{\partial u} + N_2(ap)^2\frac{\partial a_2}{\partial u} + N_4(ap)^4\frac{\partial a_4}{\partial u}, \qquad \frac{\partial k_F}{\partial u} = \bar{N}_1\frac{\partial a_1}{\partial u} + \bar{N}_3(ap)^2\frac{\partial a_3}{\partial u} + \bar{N}_5(ap)^4\frac{\partial a_5}{\partial u},$$
$$\frac{\partial B_F}{\partial a} = 2ap^2\left(N_2 a_2 + 2N_4 a_4 a^2p^2\right), \qquad \frac{\partial k_F}{\partial a} = 2ap^2\left(\bar{N}_3 a_3 + 2\bar{N}_5 a_5 a^2p^2\right),$$

with

$$a_0 = \delta_1 u + \delta_2 u^2 + \delta_3 u^3 + \delta_4 u^4 + \delta_5 u^5, \qquad a_1 = -(\delta_1 + 2\delta_2 u + 3\delta_3 u^2 + 4\delta_4 u^3 + 5\delta_5 u^4),$$
$$a_2 = \delta_2 + 3\delta_3 u + 6\delta_4 u^2 + 10\delta_5 u^3, \qquad a_3 = -(\delta_3 + 4\delta_4 u + 10\delta_5 u^2), \qquad a_4 = \delta_4 + 5\delta_5 u, \qquad a_5 = -\delta_5,$$
$$\frac{\partial a_0}{\partial u} = \delta_1 + 2\delta_2 u + 3\delta_3 u^2 + 4\delta_4 u^3 + 5\delta_5 u^4, \qquad \frac{\partial a_1}{\partial u} = -2(\delta_2 + 3\delta_3 u + 6\delta_4 u^2 + 10\delta_5 u^3),$$
$$\frac{\partial a_2}{\partial u} = 3(\delta_3 + 4\delta_4 u + 10\delta_5 u^2), \qquad \frac{\partial a_3}{\partial u} = -4(\delta_4 + 5\delta_5 u), \qquad \frac{\partial a_4}{\partial u} = 5\delta_5, \qquad \frac{\partial a_5}{\partial u} = 0.$$

Note that when calculating $\partial B_F/\partial u$ and $\partial B_F/\partial a$, only the even powers $n$ and the coefficients $a_0$, $a_2$, $a_4$ are taken into account, while for $\partial k_F/\partial u$ and $\partial k_F/\partial a$ the odd powers $n$ and the coefficients $a_1$, $a_3$, $a_5$ are used.
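Once the matrix $b = (b_{ij})$ of the variational equations has been evaluated numerically, conditions (10) are mechanical to check; a generic sketch (with $D_3$ taken as the full six-term expansion of $-\det b$):

```python
import numpy as np

def stable(b):
    """Routh-Hurwitz conditions (10) for a 3x3 variational matrix b."""
    D1 = -np.trace(b)
    D2 = (b[0,0]*b[1,1] - b[0,1]*b[1,0]
          + b[0,0]*b[2,2] - b[0,2]*b[2,0]
          + b[1,1]*b[2,2] - b[1,2]*b[2,1])   # sum of principal 2x2 minors
    D3 = -np.linalg.det(b)
    return bool(D1 > 0 and D3 > 0 and D1*D2 - D3 > 0)

# Sanity check: eigenvalues -1, -2, -3 (stable) vs. +1, -2, -3 (unstable)
print(stable(np.diag([-1.0, -2.0, -3.0])), stable(np.diag([1.0, -2.0, -3.0])))  # True False
```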
5 Calculations
To obtain information on the effect of the time lag on the characteristics of stationary modes, calculations were performed with the following parameters: $\omega_0 = 1\ \mathrm{s^{-1}}$, $m = 1\ \mathrm{kgf\cdot s^2\cdot cm^{-1}}$, $k_0 = 0.02\ \mathrm{kgf\cdot s\cdot cm^{-1}}$, $k = 0.02\ \mathrm{kgf}$, $r_0 = 1\ \mathrm{cm}$, $J = 1\ \mathrm{kgf\cdot s^2\cdot cm}$. For the calculations, the friction force characteristic was used in a form quite common in practice [20]:

$$T(U_\Delta) = R\left[\,\mathrm{sgn}\,U_\Delta + F(\dot{x})\,\right]$$

Here $F(\dot{x}) = -\delta_1 U_\Delta + \delta_3 U_\Delta^3$, where $\delta_1$ and $\delta_3$ are positive constants. Note that a friction force characteristic of this form was also observed when measuring friction forces under the conditions of a space experiment [21].
Fig. 2. Amplitude-frequency curves.
The calculated parameters are $R = 0.5\ \mathrm{kgf}$, $\delta_1 = 0.84\ \mathrm{s\cdot cm^{-1}}$, $\delta_3 = 0.18\ \mathrm{s^3\cdot cm^{-3}}$. For the time lag, the values $p\Delta = 0,\ \pi/2,\ \pi,\ 3\pi/2$ were taken. Figure 2 shows some results of the calculations with a linear characteristic of the elastic force and the velocity value $u = 1.16\ \mathrm{cm\cdot s^{-1}}$. The practically coincident solid curve 1 and the points were obtained, respectively, with the accuracy parameters $r = 1.5$ and $r = 1.3$ in the case $\Delta = 0$. Note that the results of the direct linearization method with the accuracy parameter $r = 1.5$ completely coincide with the results based on the asymptotic averaging method. Curves 2, 3, and 4 were obtained for $r = 1.5$ with $p\Delta = \pi/2$, $p\Delta = \pi$, and $p\Delta = 3\pi/2$, respectively. Oscillations with amplitudes corresponding to points A, B, C are stable if the characteristic of the energy source $M(u/r_0)$ lies within the shaded sectors.
6 Conclusions
We have considered a procedure for applying the methods of direct linearization to calculate the interaction of forced oscillations and self-oscillations in the presence of an energy source of limited power and a lag in the friction force that causes the self-oscillations. This procedure is applicable to a wide class of nonlinear oscillatory systems interacting with energy sources. The calculations show a complete qualitative similarity, and a very small quantitative difference, between the results obtained by the known methods of nonlinear mechanics and by the methods of direct linearization. At the same time, there are significant differences between the methods themselves. The advantages of the direct linearization methods are: the ability to obtain final design relations regardless of the specific type of nonlinear characteristic; fairly low resource costs; simplicity and ease of use; and the
absence of the laborious and complex approximations of various orders used in the known methods of nonlinear mechanics. This is especially important from a practical point of view for calculating the parameters of mechanisms, machines, and equipment for various purposes at the design stage.
References
1. Encyclopedia of engineering. https://mash-xxl.info/info/174754/
2. Rubanik, V.P.: Oscillations of Quasilinear Systems with Lag. Nauka, Moscow (1969). (in Russian)
3. Zhirnov, B.M.: On self-oscillations of a mechanical system with two degrees of freedom in the presence of a delay. J. Appl. Mech. 9(10), 83–87 (1973). (in Russian)
4. Astashev, V.K., Hertz, M.E.: Auto-oscillations of a visco-elastic rod with limiters under the action of a lagging force. J. Mashinovedenie 5, 3–11 (1973). (in Russian)
5. Butenin, N.V., Neymark, Yu.I., Fufaev, N.A.: Introduction to the Theory of Nonlinear Oscillations. Nauka, Moscow (1976). (in Russian)
6. Bogolyubov, N.N., Mitropolsky, Yu.A.: Asymptotic Methods in the Theory of Nonlinear Oscillations. Nauka, Moscow (1974). (in Russian)
7. Tondl, A.: On the interaction between self-excited and parametric vibrations. National Research Institute for Machine Design Bechovice, Series: Monographs and Memoranda, no. 25 (1978)
8. Wang, Q., Fu, F.: Variational iteration method for solving differential equations with piecewise constant arguments. Int. J. Eng. Manuf. (IJEM) 2(2), 36–43 (2012). https://doi.org/10.5815/ijem.2012.02.06
9. Lin, Y., Zhou, L., Bao, L.: A parameter free iterative method for solving projected generalized Lyapunov equations. Int. J. Eng. Manuf. (IJEM) 2(1), 62 (2012). https://doi.org/10.5815/ijem.2012.01.10
10. Chen, D.-X., Liu, G.-H.: Oscillatory behavior of a class of second-order nonlinear dynamic equations on time scales. Int. J. Eng. Manuf. (IJEM) 1(6), 72–79 (2011). https://doi.org/10.5815/ijem.2011.06.11
11. Gourary, M.M., Rusakov, S.G.: Analysis of oscillator ensemble with dynamic couplings. In: The Second International Conference of Artificial Intelligence, Medical Engineering, Education, AIMEE 2018, pp. 150–160 (2018)
12. Acebrón, J.A., et al.: The Kuramoto model: a simple paradigm for synchronization phenomena. Rev. Mod. Phys. 77(1), 137–185 (2005)
13. Bhansali, P., Roychowdhury, J.: Injection locking analysis and simulation of weakly coupled oscillator networks. In: Li, P., et al. (eds.) Simulation and Verification of Electronic and Biological Systems, pp. 71–93. Springer, Heidelberg (2011)
14. Ashwin, P., Coombes, S., Nicks, R.J.: Mathematical frameworks for oscillatory network dynamics in neuroscience. J. Math. Neurosci. 6(2), 1–92 (2016)
15. Ziabari, M.T., Sahab, A.R., Fakhari, S.N.S.: Synchronization new 3D chaotic system using brain emotional learning based intelligent controller. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 7(2), 80–87 (2015). https://doi.org/10.5815/ijitcs.2015.02.10
16. Alifov, A.A.: Methods of Direct Linearization for Calculation of Nonlinear Systems. RCD, Moscow (2015). (in Russian). ISBN 978-5-93972-993-2
17. Alifov, A.A.: Method of the direct linearization of mixed nonlinearities. J. Mach. Manuf. Reliab. 46(2), 128–131 (2017). https://doi.org/10.3103/S1052618817020029
18. Alifov, A.A.: About some methods of calculation nonlinear oscillations in machines. In: Proceedings of International Symposium of Mechanism and Machine Science, Izmir, Turkey, 5–8 October, pp. 378–381 (2010)
19. Alifov, A.A., Farzaliev, M.G., Jafarov, E.N.: Dynamics of a self-oscillatory system with an energy source. Russ. Eng. Res. 38(4), 260–262 (2018). https://doi.org/10.3103/S1068798X18040032
20. Alifov, A.A., Frolov, K.V.: Interaction of Nonlinear Oscillatory Systems with Energy Sources. Hemisphere Publishing Corporation/Taylor & Francis Group, New York (1990). ISBN 0-89116-695-5
21. Bronovec, M.A., Zhuravljov, V.F.: On self-oscillations in systems for measuring friction forces. Izv. RAN, Mekh. Tverd. Tela, no. 3, pp. 3–11 (2012). (in Russian)
Mathematical Model of Dot Peen Marker Operating in Self-exciting Vibration Mode

A. M. Gouskov¹,², E. V. Efimova¹, I. A. Kiselev¹, and E. A. Nikitin¹

¹ Bauman Moscow State Technical University, Moscow 105005, Russia
[email protected], [email protected]
² Mechanical Engineering Research Institute of the Russian Academy of Sciences, Moscow 101990, Russia
Abstract. This paper investigates the dynamics of a pneumatic dot peen marker without a slide valve. Dot peen marking is a promising method of marking parts at any stage of the manufacturing process. A mathematical model of the marker is proposed, in which the vibration and impact properties of the system are described using the methods of stereo-mechanical impact theory, and the loads are calculated by solving a coupled gas-dynamics problem. The design model comprises a nonlinear oscillator with one degree of freedom. The problem was solved by numerical integration. The proposed approach makes it possible to describe the major characteristics of dot peen marker operation, such as the boundaries of the effective self-excited mode, the amplitude and frequency of peen oscillation, and the pressure and temperature in the working chamber. The influence of the marker's characteristics on the operating parameters was investigated using the calculated time histories of the state-vector elements.

Keywords: Pneumatic dot peen marker · Vibro-impact system · Mathematical model · Nonlinear dynamics
1 Introduction
Marking of parts and workpieces is an integral part of any manufacturing process. There are many ways of marking parts: laser marking, painting, labeling, dot peen marking. Dot peen marking is one of the most reliable marking methods [1, 2]. It works by striking the marker peen against the surface of the part being marked. Markings obtained by the dot peen method are resistant to aggressive environments, painting, and moderate mechanical impact. There are two main types of dot peen markers: electric and pneumatic [1]. Compared to electric markers, pneumatic markers are smaller and have fewer moving parts. A pneumatic marker can be installed in the tool holder of a CNC milling machine and use its pressure line, which eliminates the need for a standalone pneumatic system to control the marker. Kharkevich [3] described the layout and the principle of operation of a pneumatic marker.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 414–423, 2020. https://doi.org/10.1007/978-3-030-39216-1_38
In terms of dynamics, a pneumatic marker is a self-excited system with hard self-excitation. If a system has a negative feedback valve [4] and a source of
nonlinearity, self-excited oscillations are possible [5, 6]. In the marker, the function of the valve is performed by the distribution channel in the peen. The source of nonlinearity is the nonlinear process of the peen impacting the part [7]. Besides, the thermodynamic process of the airflow, which drives the peen, is also nonlinear. The dynamics of self-excited oscillatory systems with pneumatic excitation, in the case of pneumatic vibro-exciters, was considered by Ragulskis et al. [8, 9]. The model proposed by them allows estimating the influence of the pressure in the main pressure line on the frequency of oscillations. This paper is organized as follows. A thermo-mechanical model of the pneumatic dot peen marker dynamics is developed in the third section. The peen dynamics and its interaction with the marked part are described in the fourth section. The fifth section deals with the numerical integration of the system of nonlinear ODEs that constitutes the mathematical model of the marker. Results and discussion conclude the paper. It is shown that the marker's parameters can be optimized taking into account the design requirements [10].
2 Key Assumptions
The mathematical model of the pneumatic marker is based on the following assumptions:
1. The heat exchange between the walls and the working mass (air) is neglected; all processes are adiabatic.
2. Air is treated as an ideal gas governed by the ideal gas law.
3. The flow is steady-state for all cases considered.
4. The main pressure line volume is infinite; the air temperature and pressure in the main pressure line are constant.
5. Energy losses are approximated using flow coefficients.
6. Air leakage due to non-tight connections is neglected.
7. The peen is a rigid body performing linear motion along the marker axis.
8. The impact of the peen on the surface being marked is governed by Newton's inelastic impact hypothesis [11].
The error due to neglecting the heat exchange with the walls in pneumatic calculations is less than 10% [12]. Assumption 8 does not allow estimating the peen indentation depth, but it greatly simplifies the simulation. The relationship between the imprint depth and the design parameters of the model can be acquired by additional experimental or numerical investigation of the process of dynamic indentation for different structural materials. The imprint depth can be estimated from the peen kinetic energy at the moment of impact using indentation diagrams.
416
A. M. Gouskov et al.
3 Dynamics of the “Gas-Peen” System During Impactless Motion
The mathematical model starts with the equation of peen motion. According to the design model shown in Fig. 1, the peen is a mass M having one degree of freedom.
Fig. 1. Design model of the pneumatic marker; 1 – marker’s chuck, 2 – first chamber, 3 – second chamber, 4 – peen, 5 – spring, 6 – indenter, 7 – distribution channel, 8 – inlet channel, 9 – outlet channel, 10 – case, 11 – part being marked
The forces acting on the peen are the resistance force $-b\dot{x}$, which is proportional to velocity, and the restoring force $cH(x+\delta)x$ from the compression spring installed in the lower part of the marker case. As the spring is not rigidly connected to the peen, it creates a force only when compressed, so the stiffness $c$ is multiplied by the Heaviside function $H(x+\delta)$, where $\delta$ is the spring displacement. The peen is displaced by compressed air coming from the main pressure line through the inlet channel to the first chamber and flowing into the second chamber when the distribution channel opens. In the first and second chambers the air acts on the peen with the forces $p_1F_1$ and $p_2F_2$, respectively, where $F_1$ and $F_2$ are the corresponding peen surface areas in the first and second chambers. It is assumed that the device is installed vertically in the chuck, hence the force of gravity $Mg$ has to be taken into account. The peen equation of motion is

$$M\ddot{x} + b\dot{x} + cH(x+\delta)x = p_1F_1 - p_2F_2 - Mg \tag{1}$$
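Between impacts, Eq. (1) is an ordinary ODE and can be integrated directly once $p_1(t)$ and $p_2(t)$ are known. The sketch below freezes the chamber pressures at constant values (a simplification; in the full model they evolve according to the gas-dynamics relations of this section) and uses illustrative numbers throughout:

```python
import math

M, b, c, delta = 0.05, 2.0, 4.0e3, 2.0e-3   # kg, N*s/m, N/m, m (assumed)
g = 9.81
F1, F2 = 1.2e-4, 0.8e-4                     # effective peen areas, m^2 (assumed)
p1, p2 = 4.0e5, 1.0e5                       # frozen chamber pressures, Pa

def accel(x, v):
    spring = c * x if x + delta > 0.0 else 0.0   # H(x + delta) * c * x
    return (p1*F1 - p2*F2 - M*g - b*v - spring) / M

x, v, dt = 0.0, 0.0, 1.0e-5
for _ in range(20000):          # 0.2 s of motion, semi-implicit Euler
    v += accel(x, v) * dt
    x += v * dt
print("peen displacement after 0.2 s, m:", x)
```

With constant pressures the motion simply settles to the static equilibrium $(p_1F_1 - p_2F_2 - Mg)/c$; self-excited oscillation appears only when the pressure dynamics and the impact law are added.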
The pressures $p_1$ and $p_2$ in the first and second chambers are functions of temperature and velocity. Therefore, differential relations coupling pressure, temperature, and velocity have to be derived to formulate the system of equations and its state vector. The gas flow from the $i$-th chamber to the $j$-th chamber is governed by the first law of thermodynamics [13]:

$$dQ_i = dU_j + dL_j \tag{2}$$

where $dQ_i = c_p T_i\,dm$ is the quantity of heat brought by the air supplied, $c_p$ is the heat capacity at constant pressure, $T_i$ is the temperature of the air supplied, $dm$ is the mass of air supplied during the time $dt$, $dU = d(mc_V T_j)$ is the internal-energy change, $c_V$ is the gas heat capacity at constant volume, $T_j$ is the air temperature in the chamber into which the air is supplied, $V_j$ is the chamber volume, and $dL_j = p_j\,dV_j$ is the total external work. The mass of air in the $j$-th chamber is denoted by $m$. Equation (2) can be reduced to the following form using Mayer's relation $c_p - c_V = R$ and the ideal gas law $pV = mRT$:

$$kRT_i\,dm = V_j\,dp_j + kp_j\,dV_j \tag{3}$$
where k = c_p/c_V = 1.4 is the heat capacity ratio for air and R = 287 J/(kg·K) is the gas constant for air. The amount of gas flowing from the i-th chamber to the j-th chamber can be expressed through the mass flow rate dm = G_{i→j} dt given by the Saint-Venant and Wantzel equation [14]

$$G_{i\to j} = \mu_j f_j \,\frac{K p_i}{\sqrt{R T_i}}\,\varphi\!\left(\frac{p_j}{p_i}\right), \qquad (4)$$
where μ_j is the flow coefficient of the inlet orifice, f_j is the cross-section area of the inlet orifice, K = \sqrt{2k/(k-1)}, and φ is the flow rate function

$$\varphi(\sigma) = \begin{cases} \sqrt{\sigma_*^{2/k} - \sigma_*^{(k+1)/k}}, & \sigma < \sigma_* = \left[\,2/(k+1)\,\right]^{k/(k-1)},\\[2pt] \sqrt{\sigma^{2/k} - \sigma^{(k+1)/k}}, & \sigma_* \le \sigma < 1,\\[2pt] 0, & \sigma \ge 1. \end{cases}$$

The balance equations (3) for the first and second chambers are derived as follows. Indices 1, 2, M, A denote the variables of the first chamber, the second chamber, the main pressure line, and the atmosphere, respectively. Generally, the flow rate in a chamber is given by the sum of the individual flow rates coming from the different sources. The flow rate for the first chamber is

$$G_1 = G_{M\to 1} - G_{1\to M} + \left(G_{2\to 1} - G_{1\to 2}\right) H(x - x_c), \qquad (5)$$
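The flow-rate function φ(σ) in Eq. (4) is piecewise: choked below the critical pressure ratio σ*, subsonic between σ* and 1, and zero for σ ≥ 1. As an illustration (the authors worked in MATLAB; this is our sketch, not their code), the three branches can be written as:

```python
import math

def phi(sigma, k=1.4):
    """Saint-Venant-Wantzel flow-rate function, sigma = p_j / p_i.

    Below the critical ratio sigma* the flow is choked and phi is
    frozen at phi(sigma*); at sigma >= 1 there is no forward flow.
    """
    sigma_star = (2.0 / (k + 1.0)) ** (k / (k - 1.0))  # critical pressure ratio
    if sigma >= 1.0:
        return 0.0
    s = max(sigma, sigma_star)                         # choked-flow clamp
    return math.sqrt(s ** (2.0 / k) - s ** ((k + 1.0) / k))
```

For air (k = 1.4) the critical ratio is σ* ≈ 0.528, where φ reaches its choked plateau of about 0.259.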
418
A. M. Gouskov et al.
where G_{M→1} is the air inflow from the main pressure line, G_{1→M} is the air outflow into the main pressure line, G_{2→1} is the air outflow from the second chamber, G_{1→2} is the air outflow into the second chamber, and x_c is the position of the distribution valve. The flow rate for the second chamber is

$$G_2 = \left(G_{1\to 2} - G_{2\to 1}\right) H(x - x_c) - G_{2\to A}\, H(-x + q), \qquad (6)$$
where G_{2→A} is the air outflow into the atmosphere and q is the coordinate at which the outlet orifice opens. The volumes of the chambers are expressed in terms of the peen position as V_1 = F_1(x_1 + x) and V_2 = F_2(x_2 − x), where F_1 x_1 and F_2 x_2 are the parasitic volumes of the first and second chamber, respectively. The following expressions are obtained by substituting Eqs. (5) and (6) into Eq. (3) and using Eq. (4):

$$\dot{p}_1 = \frac{k \mu_1 f_1 K p_M \sqrt{R T_M}}{F_1 (x_1 + x)} \left[ \varphi\!\left(\frac{p_1}{p_M}\right) - \frac{p_1}{p_M}\sqrt{\frac{T_1}{T_M}}\,\varphi\!\left(\frac{p_M}{p_1}\right) + \Omega_c H(x - x_c)\!\left( \frac{p_2}{p_M}\sqrt{\frac{T_2}{T_M}}\,\varphi\!\left(\frac{p_1}{p_2}\right) - \frac{p_1}{p_M}\sqrt{\frac{T_1}{T_M}}\,\varphi\!\left(\frac{p_2}{p_1}\right)\right) \right] - \frac{k\dot{x}}{x_1 + x}\,p_1, \qquad (7)$$

$$\dot{p}_2 = \frac{k \mu_1 f_1 K p_M \sqrt{R T_M}}{F_2 (x_2 - x)} \left[ \Omega_c H(x - x_c)\!\left( \frac{p_1}{p_M}\sqrt{\frac{T_1}{T_M}}\,\varphi\!\left(\frac{p_2}{p_1}\right) - \frac{p_2}{p_M}\sqrt{\frac{T_2}{T_M}}\,\varphi\!\left(\frac{p_1}{p_2}\right)\right) - \Omega_A H(-x + q)\,\frac{p_2}{p_M}\sqrt{\frac{T_2}{T_M}}\,\varphi\!\left(\frac{p_A}{p_2}\right) \right] + \frac{k\dot{x}}{x_2 - x}\,p_2, \qquad (8)$$
where Ω_c = μ_c f_c/(μ_1 f_1) is the distribution valve flow coefficient and Ω_A = μ_A f_A/(μ_1 f_1) is the outlet orifice flow coefficient. Differential relations for the air temperatures T_1 and T_2 are required to obtain a complete system of equations. The following expression is obtained by differentiating the ideal gas law:

$$\dot{T}_j = \left(\frac{\dot{p}_j}{p_j} + \frac{\dot{V}_j}{V_j}\right) T_j - \frac{G_{i\to j}\,R}{p_j V_j}\,T_j^{\,2}. \qquad (9)$$
Substituting the expressions for the flow rates (4) and the pressure derivatives (7), (8) into (9) yields

$$\dot{T}_1 = \frac{\mu_1 f_1 K p_M \sqrt{R T_M}\;T_1}{F_1 (x_1 + x)\,p_1} \left[ \frac{kT_M - T_1}{T_M}\,\varphi\!\left(\frac{p_1}{p_M}\right) - (k-1)\frac{p_1}{p_M}\sqrt{\frac{T_1}{T_M}}\,\varphi\!\left(\frac{p_M}{p_1}\right) + \Omega_c H(x - x_c)\!\left( \frac{p_2}{p_M}\,\frac{kT_2 - T_1}{\sqrt{T_M T_2}}\,\varphi\!\left(\frac{p_1}{p_2}\right) - (k-1)\frac{p_1}{p_M}\sqrt{\frac{T_1}{T_M}}\,\varphi\!\left(\frac{p_2}{p_1}\right) \right)\right] - \frac{(k-1)\dot{x}}{x_1 + x}\,T_1,$$

$$\dot{T}_2 = \frac{\mu_1 f_1 K p_M \sqrt{R T_M}\;T_2}{F_2 (x_2 - x)\,p_2} \left[ \Omega_c H(x - x_c)\!\left( \frac{p_1}{p_M}\,\frac{kT_1 - T_2}{\sqrt{T_M T_1}}\,\varphi\!\left(\frac{p_2}{p_1}\right) - (k-1)\frac{p_2}{p_M}\sqrt{\frac{T_2}{T_M}}\,\varphi\!\left(\frac{p_1}{p_2}\right) \right) - \Omega_A H(-x + q)\,(k-1)\frac{p_2}{p_M}\sqrt{\frac{T_2}{T_M}}\,\varphi\!\left(\frac{p_A}{p_2}\right)\right] + \frac{(k-1)\dot{x}}{x_2 - x}\,T_2.$$
Mathematical Model of Dot Peen Marker Operating
419
Eqs. (1), (7), (8), and (9) form a complete system of equations describing the marker operation within one cycle. The state vector is given by

$$\mathbf{y}^T = \{\,x \;\; v \;\; p_1 \;\; p_2 \;\; T_1 \;\; T_2\,\}, \qquad (10)$$

where v = ẋ is the peen velocity.
4 Impact and State Vector's Change

Integration is carried out until the peen reaches the position H of the part being marked, at which point an inelastic impact takes place. Newton's hypothesis allows modeling the impact by introducing just one parameter, the coefficient of restitution r, which defines the peen velocity right after the impact. Figure 2 illustrates the process of impact between the peen and the part.
Fig. 2. Illustration of impact modeling: v₊ = −r v₋
The initial conditions for the next operating cycle are set equal to the state vector of the previous cycle y₋:

$$\mathbf{y}_{0+}^T = \{\,x \;\; -rv \;\; p_1 \;\; p_2 \;\; T_1 \;\; T_2\,\} \qquad (11)$$
Therefore, the impact is treated as instantaneous. The impact time can be calculated more accurately if the deformation diagram of the part material under an indenter of known geometry is known.
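The instantaneous impact only flips and scales the velocity component of the state vector, as in Eq. (11). A minimal sketch of this cycle-boundary update (our illustration, with the state ordering of Eq. (10)):

```python
def apply_impact(state, r=0.6):
    """Newton impact update at the cycle boundary.

    state = [x, v, p1, p2, T1, T2] as in Eq. (10); only the velocity
    changes, v_plus = -r * v_minus, r being the restitution coefficient.
    """
    x, v, p1, p2, T1, T2 = state
    return [x, -r * v, p1, p2, T1, T2]
```

Pressures and temperatures carry over unchanged, which is exactly why the impact can be modeled with the single parameter r.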
5 Numerical Integration

The system of equations has to be represented in the matrix form ẏ = A(y)y in order to perform numerical integration. The system can be expressed in dimensionless form to reduce the number of design variables. The following dimensionless variables were introduced:
420
A. M. Gouskov et al.
$$\xi = \frac{x}{x_c}, \quad \sigma_1 = \frac{p_1}{p_M}, \quad \sigma_2 = \frac{p_2}{p_M}, \quad \theta_1 = \frac{T_1}{T_M}, \quad \theta_2 = \frac{T_2}{T_M}, \quad \tau = \frac{t}{t_d}, \qquad (12)$$
and the dimensionless quantities

$$t_d = \frac{F_1 x_c}{\mu_1 f_1 K \sqrt{R T_M}}, \quad N = \frac{\mu_1 f_1 K}{F_1}\sqrt{\frac{M R T_M}{p_M F_1 x_c}}, \quad D = \frac{b\,\mu_1 f_1 K \sqrt{R T_M}}{p_M F_1^2}, \quad C = \frac{c\,x_c}{p_M F_1},$$

$$\nu = \frac{Mg}{p_M F_1}, \quad \Pi = \frac{F_2}{F_1}, \quad \xi_1 = \frac{x_1}{x_c}, \quad \xi_2 = \frac{x_2}{x_c}, \quad \xi_q = \frac{q}{x_c}, \quad \xi_d = \frac{d}{x_c}, \quad \sigma_A = \frac{p_A}{p_M}, \quad h = \frac{H}{x_c}.$$
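As a reading aid, the dimensionless groups above can be evaluated from physical inputs as sketched below. The groupings follow the definitions as reconstructed here, and all numerical values in the test inputs are illustrative, not the authors' data:

```python
import math

def dimensionless_parameters(M, b, c, F1, F2, mu1, f1, xc, pM, pA, TM,
                             R=287.0, k=1.4, g=9.81):
    """Evaluate the dimensionless groups of the model.

    The groupings follow the definitions as reconstructed in the text;
    treat this as an illustration, not the authors' verified code.
    """
    K = math.sqrt(2.0 * k / (k - 1.0))
    sqrtRT = math.sqrt(R * TM)
    return {
        "td": F1 * xc / (mu1 * f1 * K * sqrtRT),            # time scale
        "N": (mu1 * f1 * K / F1) * math.sqrt(M * R * TM / (pM * F1 * xc)),
        "D": b * mu1 * f1 * K * sqrtRT / (pM * F1 ** 2),    # damping group
        "C": c * xc / (pM * F1),                            # spring group
        "nu": M * g / (pM * F1),                            # weight group
        "Pi": F2 / F1,
        "sigma_A": pA / pM,
    }
```

For a 4 bar main line and atmospheric ambient pressure this immediately gives σ_A = 0.25, the value used in Table 1.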
The system was solved numerically in MATLAB [15] using a variable-order Adams–Bashforth–Moulton method with adaptive step size. Owing to its low calculation error, this method allows switching the model to the next operating cycle when required [16, 17]. The initial conditions in dimensionless form for the first operating cycle are

$$\tilde{\mathbf{y}}_0^T = \{\,h \;\; 0 \;\; 1 \;\; \sigma_A \;\; 1 \;\; 1\,\}.$$
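The cycle-by-cycle integration logic (integrate until the peen reaches the part, apply the Newton impact, restart per Eq. (11)) can be sketched as follows. This is a structural illustration only: the chamber pressures are held constant instead of being coupled through Eqs. (7)–(9), a simple fixed-step scheme stands in for the adaptive Adams–Bashforth–Moulton solver, and all parameter values are made up:

```python
def simulate_cycles(M=0.05, b=2.0, c=1e4, d=0.002, F1=5e-4, F2=1e-3,
                    p1=1e5, p2=1e5, g=9.81, H=-0.005, r=0.6,
                    dt=1e-5, t_end=0.05):
    """Cycle-wise integration of the peen equation of motion (1).

    Illustrative sketch: p1 and p2 are frozen (the full model couples
    them via Eqs. (7)-(9)) and a semi-implicit Euler step replaces the
    adaptive solver. Returns the number of peen impacts on the part
    located at x = H.
    """
    def heaviside(z):
        return 1.0 if z >= 0.0 else 0.0

    def acc(x, v):
        spring = c * heaviside(x + d) * x          # one-sided spring term
        return (p1 * F1 - p2 * F2 - M * g - b * v - spring) / M

    x, v, t, impacts = 0.0, 0.0, 0.0, 0
    while t < t_end:
        v += acc(x, v) * dt                        # semi-implicit Euler
        x += v * dt
        t += dt
        if x <= H and v < 0.0:                     # peen reached the part
            x, v = H, -r * v                       # Newton impact, Eq. (11)
            impacts += 1
    return impacts
```

With the pressure dynamics frozen the motion merely bounces and decays; the self-excited regime of the full model arises only once the valve-switched pressure equations are coupled in.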
6 Result Discussion

Figures 3 and 4 show the results of simulating the system in the working mode. Table 1 gives the values of the dimensionless parameters used; they correspond to a steel peen with a maximum diameter of 25 mm, a bottom diameter of 17 mm, and a height of 25 mm. The pressure in the main pressure line is 4 bar, and the operating conditions are normal (ambient pressure 0.1 MPa, air temperature 290 K).

Table 1. Values of dimensionless parameters used for numerical integration

N = 11, D = 0.04, C = 0.67, ν = 0.006, Π = 2, μ₁ = 0.7, Ω_A = 5, Ω_c = 1,
ξ₁ = 0.8, ξ₂ = 5, ξ_q = 0, ξ_d = 0, σ_A = 0.25, k = 1.4, h = 0, r = 0.6
The time histories of the temperatures of the first and second chamber are not shown because the temperature change during the marker’s operation is negligible (less than 5%).
Fig. 3. Time histories of state vector components for the values of dimensionless parameters given in Table 1.
Fig. 4. Indicator diagrams and a phase portrait of the system for the values of dimensionless parameters given in Table 1.
A series of calculations was conducted to investigate the influence of the dimensionless parameters on the peen oscillation frequency; the parameters varied were the main structural parameter Π and the ratio σ_A of atmospheric pressure to the pressure in the main pressure line. Three diagrams of the impact frequency vs. Π corresponding to three values of the parameter N were obtained for the constant value σ_A = 0.25. Figure 5 shows these diagrams. They can be used to determine the boundaries of the self-excitation mode, beyond which the peen stops impacting the part and the peen oscillations decay. Figure 6 shows the plots of the impact frequency vs. σ_A for different values of N.
Fig. 5. Peen impact frequency vs. parameter Π for different values of N: 1 – N = 10, 2 – N = 15, 3 – N = 20
Fig. 6. Peen impact frequency vs. σ_A for different values of N: 1 – N = 10, 2 – N = 15, 3 – N = 20
The peen impact frequency is practically unaffected by the air pressure down to a certain threshold value σ_A^th = 0.47 (which corresponds to a main pressure line pressure of p^th = 2.13 atm under normal conditions). It should be noted that the threshold pressure does not depend on the value of the structural variable N. If the pressure in the main pressure line drops below p^th, the peen stops impacting the part, i.e. the self-excited mode is terminated.
7 Conclusions

This paper proposed the simplest concept of a pneumatic dot peen marker with only one moving part. The new approach makes it possible to describe the process of peen self-excited vibrations and the boundaries of the effective working mode. The simulation results can be used for the optimal design of pneumatic markers with the required characteristics. The mathematical model can be validated by a series of experiments with physical models and then used for tuning dot peen markers of various sizes.
The model accuracy can be improved by experimental or numerical investigation of the dynamic indentation of the peen into the part being marked. If dynamic force–displacement curves are determined, the proposed mathematical model makes it possible to predict the indentation depth and the required feed speed of the milling machine. Also of great interest is the study of the dynamics of the «marker – part» system, as the part installed on the marker's table is not isolated from vibrations coming from the marker.
Acknowledgements. The work has been realized under the support of the Russian Scientific Foundation, project №18-19-00708.
References

1. Tate, C.: Make your mark. Cut. Tool Eng. 68(12), 54–55 (2016)
2. Dragičević, D., Tegeltija, S., Ostojić, G., Stankovski, S., Lazarević, M.: Reliability of dot peen marking in product traceability. Int. J. Ind. Eng. Manag. (IJIEM) 8(2), 71–76 (2017)
3. Kharkevich, A.A.: Auto-Oscillation. Gostekhizdat, Moscow (1954). (in Russian)
4. Panovko, G.Ya., Shokhin, A.: Self-synchronization features of inertial vibration exciters in two-mass system. J. VibroEng. 21(2), 498–506 (2019). https://doi.org/10.21595/jve.2019.20083
5. Gouskov, A.M., Tung, D.D.: Nonlinear dynamics of multi-edge trepanning vibration drilling. In: IOP Conference Series: Materials Science and Engineering, vol. 489, p. 012036 (2019). https://doi.org/10.1088/1757-899X/489/1/012036
6. Gouskov, A.M., Voronov, S.A., Novikov, V.V., Ivanov, I.I.: Chatter suppression in boring with tool position feedback control. J. VibroEng. 19(5), 3512–3521 (2017). https://doi.org/10.21595/jve.2017.17777
7. Voronov, S.A., Ivanov, I.I., Kiselev, I.A.: Investigation of the milling process based on a reduced dynamic model of cutting tool. J. Mach. Manuf. Reliab. 44(1), 70–78 (2015). https://doi.org/10.3103/S1052618815010100
8. Kibirkštis, E., Pauliukaitis, D., Miliūnas, V., Ragulskis, K.: Synchronization of pneumatic vibroexciters under air cushion operating mode in a self-exciting autovibration regime. J. Mech. Sci. Technol. 31(9), 4137–4144 (2017). https://doi.org/10.1007/s12206-017-0809-6
9. Kibirkštis, E., Pauliukaitis, D., Miliūnas, V., Ragulskis, K.: Synchronization of pneumatic vibroexciters operating on air cushion with feeding pulsatile pressure under autovibration regime. J. Mech. Sci. Technol. 32(1), 81–89 (2018). https://doi.org/10.1007/s12206-017-1209-7
10. Kuts, V.A., Nikolaev, S.M., Voronov, S.A.: The procedure for subspace identification optimal parameters selection in application to the turbine blade modal analysis. Procedia Eng. 176, 56–65 (2017). https://doi.org/10.1016/j.proeng.2017.02.273
11. Panovko, Y.G.: Introduction to the Theory of Impact. Nauka, Moscow (1977). (in Russian)
12. Vibrations in Technics, vol. 4. Mashinostroenie, Moscow (1979). (in Russian)
13. Bazarov, I.P.: Thermodynamics. Vyshaya Shkola, Moscow (1991). (in Russian)
14. Gertz, E.V., Kreynin, G.V.: Calculations of the Pneumatic Drives. Mashinostroenie, Moscow (1975). (in Russian)
15. Nehra, V.: MATLAB/Simulink based study of different approaches using mathematical model of differential equations. Int. J. Intell. Syst. Appl. (IJISA) 6(5), 1–24 (2014)
16. Kahaner, D., Moler, C., Nash, S.: Numerical Methods and Software. Prentice-Hall International, NJ (1989)
17. Wang, Q., Fu, F.: Numerical oscillations of Runge-Kutta methods for differential equations with piecewise constant arguments of alternately advanced and retarded type. Int. J. Intell. Syst. Appl. (IJISA) 3(4), 49–55 (2011)
Advances in Digital Economics and Methodological Approaches
Study of the Mechanisms of Perspective Flexible Manufacturing System for a Newly Forming Robotic Enterprise Vladimir V. Serebrenniy, Dmitriy V. Lapin(&), and Alisa A. Mokaeva Bauman Moscow State Technical University, 5c1, 2nd Baumanskaya Street, 105005 Moscow, Russian Federation {vsereb,lapindv,alisa.mokaeva}@bmstu.ru
Abstract. This paper is aimed at forming the concept of a perspective manufacturing system for newly forming hi-tech robotic enterprises. Small innovative enterprises and large production centers at the stage of modernization are typical companies in terms of the systems mentioned above. An overview of perspective manufacturing systems was given, and the principles of structure formation and the functional mechanisms were shown. The following flexible manufacturing systems are analyzed: ORCA-FMS, ADACOR, POLLUX. Their features, functionality, and structure were examined. A concept of the perspective flexible manufacturing system was suggested. It is based on the mutual integration of two entities: a heterogeneous multi-agent system with a coalition-forming mechanism and a scalable technology of end-to-end identification of complex technical systems. Primary detailing was carried out, highlighting possible advantages and disadvantages of the concept. Organization issues of the mechanisms of the flexible manufacturing system connected with the multiagent system, identification, and their interaction were considered. The intermediate result is the compilation of functional mechanisms for the system and its subsystems, as applied to the topical technical case of creating a collaborative robotic technological cell. It is concluded that the organization and observation subsystems, developed in the frame of the perspective flexible manufacturing system, make it possible to implement a clear, transparent, and effective manufacturing system that allows using the capabilities of collaborative robotics.

Keywords: Flexible manufacturing system · Multiagent system · System identification · Robotic manufacturing · Robotic enterprise
1 Introduction Countries all over the globe have their own ways to implement the Industry 4.0 paradigm [1]. Principles of the Russian national technology initiative (NTI) were taken as the basis for this study. In particular, cross-market and cross-industry focus area TechNet [2] that provides technological support for the development of NTI markets and hi-tech industries by forming Factories of the Future [2].
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020 Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 427–436, 2020. https://doi.org/10.1007/978-3-030-39216-1_39
428
V. V. Serebrenniy et al.
Implementing perspective manufacturing systems (PMS) is one of the key technological barriers. However, assuming the universal properties of the above-mentioned manufacturing systems, the question arises of the high efficiency and synergy of deploying the system on specific platforms taking into account particular technical and economic circumstances [3]. If taken separately, none of the advanced manufacturing technologies can provide a long-term competitive advantage in the market. Complex technological solutions are therefore needed to design and manufacture a new generation of globally competitive products in the shortest time possible. These solutions, made up of the best world-class technologies, are referred to in the TechNet Roadmap as Digital, Smart, and Virtual Factories of the Future; their interconnection is shown in Fig. 1.
Fig. 1. Interconnection of the structures of the Factories of the Future (Digital, Smart, and Virtual Factories)
The newly forming robotic enterprises are part of technological business development [4]. There are two scenarios for their elaboration: creation from scratch and modification of an existing production. These scenarios are considered in more detail below; they define the key technological solutions and form the requirements for an effective manufacturing system. The goal of the study is the development of a specialized perspective manufacturing system for the newly forming robotic enterprises. This article presents the analysis of the existing PMS and highlights their essential features and properties, describes the concept of a PMS for the newly forming robotic enterprises, its structure, and operating mechanisms, and details the substantive organization and observation subsystems and their mechanisms of interaction. Also, it examines a case study on the partial automation of riveting operations in the aerospace industry.
2 Analysis

To enable efficient use of the data available in Industry 4.0 oriented processes, it becomes necessary to adapt the control architecture in order to make it flexible, reactive, and adaptable enough to reach the objectives previously described. For the last 20 years, Holonic Control Architectures (HCA) have been widely studied and developed. Their use at the industrial level is starting to spread due to their effectiveness. A Holonic Control Architecture is an architecture composed of holons, called a holarchy. A holon is a communicating decisional entity (with inputs and outputs) composed of a set of sub-level holons and, at the same time, part of a wider organization composed of
higher-level holons (recursive, called the Janus effect [5]). It is important to note that a holon is also composed of a physical part associated with a digital one (that can be modeled as a digital agent, avatar, or digital twin), and finally, holons are able to decide according to a certain degree of autonomy [5]. Dynamic HCA are interesting because they integrate an optimal scheduling module used in the normal state, coupled with reactive abilities executed when a disturbance occurs. When this happens, the HCA may modify its own organization to minimize the impact of the disturbance. Such architectures guarantee that the performance of the manufacturing system is optimal in the normal state, but not always in degraded mode.

ORCA-FMS
In the manufacturing domain, ORCA (dynamic and heterogeneous hybrid Architecture for Optimized and Reactive Control) [6] was one of the first dynamic HCA formalized in the literature. ORCA is divided into three layers: the physical system (PS) layer, the local control (LC) layer, and the global control (GC) layer. The GC has a global view of the system and is composed of one global optimizer. Its role is to guarantee good performance based on the PS's initial state. The LC is composed of many local optimizers, which have a local view of the system only. Their goal is to react to unexpected events that occur on the PS by making rapid decisions online, thus providing a feasible solution that is suitable for the current system state. Each local optimizer in the LC is associated with a physical part in the PS. Both the physical part and the local optimizer constitute one entity.

ADACOR
ADACOR (ADAptive holonic COntrol aRchitecture) is a holonic reference architecture for distributed manufacturing systems [7]. ADACOR is a decentralized control architecture, but it also considers centralization in order to tend toward global optimization of the system.
Holons belong to the following classes: Product Holons (ProdH), Task Holons (TH), Operation Holons (OpH), and Supervisor Holons (SupH). An evolution of the ADACOR mechanism has also been presented as ADACOR2 [8]. Its objective is to let the system evolve dynamically through configurations discovered online, and not only between one stationary and one transient state. The rest of the architecture is nevertheless quite similar to ADACOR.

POLLUX
The most recent architecture is denoted POLLUX [9]. Its main novelty is the adaptation mechanism of the architecture, which uses governance parameters that enlarge or constrain the behavior of the low-level holons with respect to the disturbances observed by the higher level; the idea is to find the "best" architecture that suits the detected disturbances. It is a hybrid control architecture presented as a reference control system that supports the switching of the control system between hierarchical and heterarchical architectures.
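The switching principle shared by these dynamic architectures (a global optimizer in the normal state, local reactive optimizers under disturbance) can be caricatured in a few lines. This is a toy sketch of the idea only, not an implementation of ORCA, ADACOR, or POLLUX; all class and method names are our own:

```python
class LocalOptimizer:
    """Reactive local controller attached to one physical part (LC layer)."""
    def __init__(self, name):
        self.name = name

    def decide(self, disturbance):
        # fast, myopic decision based only on the local view
        return f"{self.name}: reroute around {disturbance}"


class GlobalOptimizer:
    """Predictive scheduler with a global view of the system (GC layer)."""
    def decide(self):
        return "globally optimal schedule"


class SwitchingController:
    """Toy dynamic-HCA switch: global optimization in the normal state,
    local reactive control when a disturbance is observed."""
    def __init__(self, local_optimizers):
        self.gc = GlobalOptimizer()
        self.lc = local_optimizers

    def step(self, disturbance=None):
        if disturbance is None:
            return self.gc.decide()
        return [opt.decide(disturbance) for opt in self.lc]
```

The real architectures differ in how (and how far back) control is handed between the two layers, which is precisely the design axis POLLUX parameterizes.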
3 Concept Description

The system is based on dynamic organization and observation subsystems. The subsystems' implementation and their interaction mechanisms are shown in Fig. 2.
Fig. 2. PMS subsystems implementation and their interaction mechanisms (the work environment, organization, and observation subsystems exchange complex state information, global tasks/solutions, and the enterprise behavioral model)
This effect is achieved due to the implementation features of the organization and observation subsystems. The organization subsystem is based on the multiagent systems approach; the observation subsystem, on an end-to-end structural and parametric identification tool.

Structure and Mechanisms
The overall structure of the proposed production system is based on two subsystems that carry out both independent functioning and mutual influence. The total effect of adaptive management is due to the effective decomposition of the global task, as well as to the continuous exchange of information at all levels of control, as shown in Fig. 3.

Fig. 3. Coverage of management levels (observing and organizing subsystems)
The subsystems' interaction is based on the complex behavioral model of the enterprise shown in Fig. 2. The main object in the behavioral setting is the "behavior" – the set of all signals compatible with the system. An important feature of this approach is that it does not assign a priority between input and output variables. Apart from putting system theory and control on a rigorous basis, the behavioral approach unified the existing approaches and brought new results on controllability for nD systems, control via interconnection, and system identification.
4 Detailing of Mechanisms

This section describes the organization and observation subsystem structures in more detail.

Multiagent System
The system used to organize manufacturing tools is a multi-agent group control system that uses a dynamic mechanism of homogeneous and/or heterogeneous coalition formation [10–12]. The implementation of such a mechanism brings the system closer to a hybrid control architecture, with the system formation rules driven by real-time interpretation of the identification tool's results. The analysis of modern multiagent architectures showed that the coalition model of the system is of greatest interest; its essence lies in the formation of subgroups of agents. Each group may be considered both as a separate self-sufficient system and as part of the global system. For example, homogeneous coalitions are well suited for quick readjustment of the production line, and heterogeneous ones for an established manufacturing process, as shown in Fig. 4.
Fig. 4. Types of agents and coalitions
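A minimal, hypothetical sketch of coalition formation over agent capabilities may clarify the mechanism: each task greedily recruits free agents whose capabilities cover its requirements. The actual mechanism in [10–12] is richer; the greedy rule and all names below are our assumptions:

```python
def form_coalitions(agents, tasks):
    """Greedy coalition formation sketch for the organization subsystem.

    `agents` maps agent name -> set of capabilities; `tasks` maps task
    name -> set of required capabilities. Each task recruits free agents
    (most relevant first) until its requirements are covered; homogeneous
    coalitions arise when members share one capability profile,
    heterogeneous ones otherwise.
    """
    free = dict(agents)
    coalitions = {}
    for task, required in tasks.items():
        members, covered = [], set()
        for name, caps in sorted(free.items(),
                                 key=lambda kv: -len(kv[1] & required)):
            if covered >= required:
                break
            if caps & (required - covered):     # agent adds a missing capability
                members.append(name)
                covered |= caps & required
        if covered >= required:                 # coalition is viable
            for name in members:
                del free[name]
            coalitions[task] = members
    return coalitions
```

A dynamic mechanism would additionally dissolve and re-form coalitions as the observation subsystem reports changes in the cell.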
Identification Tools
The system used to observe manufacturing tools is an end-to-end structural and parametric identification tool. Its mathematical part is based on wavelet transform theory. Wavelet transforms of the input and output parameters [13–15] are proposed as the basis for end-to-end process identification, as shown in Fig. 5.
Fig. 5. Complex identification tool
Using the real-time representation of the system in the combined time-frequency domain, it is possible to critically evaluate linear and non-linear processes in the system, as well as to obtain interpretable information about the real generalized structure and parameters of the system.
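As a numpy-only illustration of this idea (our sketch, not the authors' tool), a Morlet-based scalogram localizes a change in a plant's dominant frequency in time, which is the kind of interpretable time-frequency information the observation subsystem relies on:

```python
import numpy as np

def morlet_scalogram(signal, scales, dt=1.0, w0=6.0):
    """Minimal Morlet continuous wavelet transform (magnitude only).

    Returns a (len(scales), len(signal)) array whose rows are the
    responses at each scale; a ridge at scale s marks energy near
    frequency w0 / (2 * pi * s * dt).
    """
    n = len(signal)
    t = (np.arange(n) - n // 2) * dt
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        u = t / s
        wavelet = np.exp(1j * w0 * u - 0.5 * u ** 2) / np.sqrt(s)
        # correlation with the wavelet = convolution with its reversed conjugate
        out[i] = np.abs(np.convolve(signal, np.conj(wavelet)[::-1], mode="same"))
    return out
```

For a signal whose frequency doubles halfway through, the energy ridge jumps from the larger-scale row to the smaller-scale row at the changeover instant.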
5 Case Example

The concept of manufacturing robotization is illustrated by the example of the assembly process of aircraft hull structures. The essence of the PMS integration is the cooperation between a worker and a collaborative robot within the framework of one technological process – drilling and riveting work. In this concept, robotization means the use of collaborative robotic systems whose functional features provide a fairly flexible response to changes in the work area.

Background
Time spent on assembly operations is about 50–75% of the total aircraft manufacturing time, and their labor intensity is 30–40% of the total labor intensity [16, 17]. Assembly performance improvement is provided by the mechanization and automation of the basic typical technological operations: marking, cutting, drilling, and riveting. Airframe pieces such as longerons, ribs, and frames are plane frameworks. The rivet joint is the basic method of connecting plane frameworks. Drilling and riveting account for 30–45% of the overall labor intensity of the assembly process. Within them, drilling accounts for 30%, countersinking for 13%, rivet inserting for 4%, and riveting for 53% [18].
Riveting automats are widely used nowadays. However, the specifics of production, the complexity of the aircraft design, the variety of conditions for approaching the riveting area, the different rivet diameters, and the short seams determine the use of hand drills and riveting hammers. This does not allow achieving high labor productivity, does not guarantee quality stability, and adversely affects the human body, causing such occupational diseases as hand-arm vibration syndrome and hearing loss. The approach proposed by the authors is aimed, first of all, at the robotization of drilling and riveting work with minimal equipment costs. The system will consist of one collaborative robot equipped with a special tool. This configuration allows the simultaneous work of human and robot in a shared technological environment.

Solution
The collaborative robot performs most of the monotonous operations; the worker is involved when performing operations in a work area inaccessible to the robot. Such a combination makes it possible to reduce the total operational time and the overall labor intensity with minimal interference with the existing process. Figure 6 illustrates the interaction between human and robot while performing drilling and riveting of the airframe.
Fig. 6. Example of human and robot collaboration: d.H – human working area, d.K – robot working area.
Technical Implementation
Consider the basic technical solutions on the example of the drilling and riveting work block. The proposed technical implementation of the system is presented in Fig. 7.
Fig. 7. The structure of the robotic system for drilling and riveting
The equipment can be divided into the following groups according to its properties:
• A – base manipulator: the basis of the collaborative robot, an n-link industrial manipulator;
• B – modified tool: an end-of-arm tool for drilling and riveting, whose technical requirements are formed on the basis of the manipulator's ergonomics. It is also notable that, due to the human-like ergonomics of modern collaborative robots, the development of the tool modification is a part of the future work [18];
• C – sensing system: one of the key points of the concept implementation is the modification of the existing robot, tool, and tooling for drilling and riveting into collaborative ones, similarly safe for humans. The approach is based on a special sensing system for the robot, tool, and tooling, and on the development of a simplified lashing diagram [19, 20];
• D – control unit: a hardware unit implementing part of the hybrid control system of the collaborative multiagent robotic system [21, 22]. It can be integrated into the united information field, while also being capable of decentralized control with operator assistance.
Within the framework of the proposed concept, the robot performs a significant part of the drilling and riveting work. The human not only acts as an observer but also has the ability to perform the same operations as the robot, for example, in areas inaccessible to the robot. The robot, tool, and tooling have to be equipped with sensors to meet the strict safety requirements for work in cooperation with humans [22]. However, these technical solutions will increase the total cost of the robotic system due to the design complexity and the additional requirements for the control algorithms.
6 Results and Discussion

In comparison with the holonic architectures presented in the analysis, the proposed concept has the properties of scalability in depth and width, while maintaining the principle of minimizing the number of entities inside and outside the system. Thus, the functions of the organization and observation subsystems are constructed so as to maintain efficiency regardless of scale. This conclusion is reinforced when the concept is extended to the control system of the collaborative production cell as a whole. The organization subsystem, based on the multiagent system with a dynamic mechanism of coalition formation, is designed to connect groups of workers, industrial and collaborative robots, and other equipment within a single-process production cell. The observation subsystem, based on end-to-end structural and parametric wavelet identification, is a monitoring tool that provides the ability to respond to events in the cell.
7 Conclusions

Based on the results of this study, it can be concluded that the developed concept has possibilities for further development and detailing, in particular within the framework of the actual task of creating a control system for the collaborative robotic production cell. Further, it is planned to continue research in two directions: the lifecycle evolution of the concept as a system and the development of the subsystems and their interaction.
References

1. Lee, J., Bagheri, B., Kao, H.A.: A cyber-physical systems architecture for industry 4.0-based manufacturing systems. Manuf. Lett. 3, 18–23 (2015). https://doi.org/10.1016/j.mfglet.2014.12.001
2. Gromova, E.A.: Digital economy development with an emphasis on automotive industry in Russia. Rev. ESPACIOS 40(6), 27–29 (2019)
3. Aliyev, A., Shahverdiyeva, R.: Perspective directions of development of innovative structures on the basis of modern technologies. Int. J. Eng. Manuf. (IJEM) 8(4), 1–12 (2018). https://doi.org/10.5815/ijem.2018.04.01
4. Sharan, R., Onwubolu, G.: Automating the process of work-piece recognition and location for a pick-and-place robot in a SFMS. Int. J. Image Graph. Signal Process. (IJIGSP) 6(4), 9–17 (2014). https://doi.org/10.5815/ijigsp.2014.04.02
5. Babiceanu, R.F., Chen, F.F.: Development and applications of holonic manufacturing systems: a survey. J. Intell. Manuf. 17(1), 111–131 (2006). https://link.springer.com/article/10.1007/s10845-005-5516-y
6. Pach, C., Berger, T., Bonte, T., Trentesaux, D.: ORCA-FMS: a dynamic architecture for the optimized and reactive control of flexible manufacturing scheduling. Comput. Ind. 65(4), 706–720 (2014)
7. Leitão, P., Restivo, F.: ADACOR: a holonic architecture for agile and adaptive manufacturing control. Comput. Ind. 57(2), 121–130 (2006)
8. Barbosa, J., Leitão, P., Adam, E., Trentesaux, D.: Dynamic self-organization in holonic multi-agent manufacturing systems: the ADACOR evolution. Comput. Ind. 66, 99–111 (2015)
9. Jimenez, J.F., Bekrar, A., Zambrano-Rey, G., Trentesaux, D., Leitão, P.: Pollux: a dynamic hybrid control architecture for flexible job shop systems. Int. J. Prod. Res. 55(15), 4229–4247 (2017)
10. Vorotnikov, S., Ermishin, K., Nazarova, A., Yuschenko, A.: Multi-agent robotic systems in collaborative robotics. In: International Conference on Interactive Collaborative Robotics, vol. 11097, pp. 270–279 (2018)
11. Pechoucek, M., Marik, V., Stepankova, O.: Coalition formation in manufacturing multiagent systems. In: Proceedings 11th International Workshop on Database and Expert Systems Applications, pp. 241–246 (2000). https://doi.org/10.1109/dexa.2000.875034
12. Chouhan, S., Rajdeep, N.: An analysis of the effect of communication for multi-agent planning in a grid world domain. Int. J. Intell. Syst. Appl. (IJISA) 4(5), 8–15 (2012). https://doi.org/10.5815/ijisa.2012.05.02
13. Bakhtadze, N., Sakrutina, E.: Wavelet-based identification and control of variable structure systems. In: 2016 International Siberian Conference on Control and Communications (SIBCON), pp. 1–6 (2016). https://doi.org/10.1109/sibcon.2016.7491757
14. Bakhtadze, N., Sakrutina, E.: Applying the multi-scale wavelet-transform to the identification of non-linear time-varying plants. IFAC-PapersOnLine 49(12), 1927–1932 (2016). https://doi.org/10.1016/j.ifacol.2016.07.912
15. Karabutov, N.: Structural identification of nonlinear dynamic systems. Int. J. Intell. Syst. Appl. (IJISA) 7(9), 1–11 (2015). https://doi.org/10.5815/ijisa.2015.09.01
16. Fedorov, V.: Tehnologija sborki izdelij aviacionnoj tehniki: Tekst lekcij [Technology of assembly of aviation equipment: text of lectures]. SUSU Publishing House, Chelyabinsk (2011). (in Russian)
17. Vashukov, J.: Tehnologija i oborudovanie sborochnyh processov [Technology and equipment of assembly processes]. Samar State University, Samara (2011). (in Russian)
18. Lysenko, J.: Mehanizacija i avtomatizacija sborochno-klepal'nyh rabot na baze mashin impul'snogo dejstvija: ucheb. posobie [Mechanization and automation of assembly and riveting works based on pulsed machines: a tutorial]. Publishing House of Samar State Aerospace University, Samara (2017). (in Russian)
19. Pang, G., Deng, J., Wang, F., Zhang, J., Pang, Z., Yang, G.: Development of flexible robot skin for safe and natural human–robot collaboration. Micromachines 9(11), 576–586 (2018). https://doi.org/10.3390/mi9110576
20. Mazzocchi, T., Diodato, A., Ciuti, G., De Micheli, D.M., Menciassi, A.: Smart sensorized polymeric skin for safe robot collision and environmental interaction. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 837–843 (2015). https://doi.org/10.1109/iros.2015.7353469
21. Serebrenny, V., Shereuzhev, M., Metasov, I.: Approaches to the robotization of agricultural mobile machines. In: MATEC Web of Conferences, vol. 161, p. 3014 (2018). https://doi.org/10.1051/matecconf/201816103014
22. Volodin, S.Y., Mikhaylov, B.B., Yuschenko, A.S.: Autonomous robot control in partially undetermined world via fuzzy logic. In: Advances on Theory and Practice of Robots and Manipulators, pp. 197–203 (2014). https://doi.org/10.1007/978-3-319-07058-2_23
Approach to Forecasting the Development of Crisis Situations in Complex Information Networks

Andrey V. Proletarsky, Ark M. Andreev, Dmitry V. Berezkin, Ilya A. Kozlov, Gennady P. Mozharov, and Yury A. Sokolov

Bauman Moscow State Technical University, 5/1, 2nd Baumanskaya Street, 105005 Moscow, Russian Federation
{pav,berezkind}@bmstu.ru, [email protected], [email protected], [email protected], [email protected]
Abstract. We propose an approach to solving the problem of forecasting and decision support in managing processes in complex information networks. The approach is based on the analysis of heterogeneous data streams and involves solving two consecutive selection problems: the formation of the set of possible strategies of the decision maker and the choice of the best strategy. We propose a hybrid scenario approach to solve the problem of determining the set of possible strategies. The approach consists in detecting events in heterogeneous data streams, forming situations, and constructing scenarios for their further development. We describe a method of scenario formation based on case-based reasoning. Each scenario is supplied with recommendations regarding actions that should be taken to promote or hinder the development of the current situation according to the given scenario. The generated recommendations are considered as possible strategies of the decision maker. The optimal strategy is identified via a game-theoretic method based on searching for Nash equilibrium. The article provides an example of forecasting and decision support in resolving the conflict of economic interests of firms competing for sales markets.

Keywords: Game-theoretic models · Situational analysis · Forecasting · Bimatrix games · Nash equilibrium
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 437–446, 2020. https://doi.org/10.1007/978-3-030-39216-1_40

1 Introduction

The environment surrounding a person is gradually turning into a cyber-physical system, which leads to a significant change in social and economic relations: society becomes "networked" [1], as a result of which well-known crisis phenomena appear in it and new ones arise. Over time the structure of socio-economic relations should completely turn into a global network of agents, which in turn should lead to the fact that the structure of socio-political relations will also gradually become networked. These socio-political relations will require the creation of fundamentally new systems of state and even
military-political management, as previous rigid control systems would be unable to control processes on such a network. Besides, it is necessary to develop new models that describe processes in complex socio-technical systems with a network structure in order to predict possible risks and crisis phenomena.

The crisis development model represents the situation as a product of interactions of agents in a complex network. Users of social networks affected by informational and psychological impact can act as such agents. The situation also involves players seeking to influence the agents and spread some information in the network [2]. Such players may be companies or enterprises, state or private organizations. Since the situation usually involves several players under conditions of confrontation or cooperation, various game-theoretic models can be used to describe their interaction.

Each player acts based on decision making processes. To make the best decisions, the player must take into account the current situation, as well as the forecast for the development of the situation and possible actions of other players. Such a forecast and decision support can be based on the analysis of heterogeneous data streams coming from open and specialized sources and containing information on the development of situations in the analyzed system over time. A significant part of this data is usually presented in text form.

Section 2 of the paper provides the mathematical formulation of the optimal control problem and demonstrates that solving it consists of two tasks: preparing the set of possible alternatives and selecting the optimal ones. Generation of possible alternatives based on the analysis of heterogeneous data streams is described in Sect. 3. In Sect. 4, we consider the task of selecting the optimal alternatives as a game-theoretic problem and propose to solve it by searching for Nash equilibrium.
The solution for non-cooperative games is considered in Sect. 5. Section 6 provides an example of application of the proposed approach to finding optimal control in a confrontation between two companies competing for sales markets. We wrap up with conclusions in Sect. 7.
2 The Mathematical Formulation of the Optimal Control Problem

In the decision making process, it is required to choose a subset of decisions from a set of possible options (alternatives). Each decision must be evaluated from various points of view, taking into account political, economic, scientific, technical, and other aspects. The notion of the quality of the alternatives is characterized by the optimality principle, which requires constructing decision optimization models according to several criteria [3].

Thus, we represent the decision making problem by a pair (X, OP), where X is the set of alternatives and OP is the optimality principle. The solution to the problem (X, OP) is the set X° ⊆ X obtained using the optimality principle. The mathematical expression of the optimality principle is the selection function Co, which associates any subset X′ ⊆ X with its part Co(X′). The solution X° of the original problem is the set Co(X).

The process of solving the problem (X, OP) is organized in the following way: we form the set X, i.e., prepare alternatives, and then solve the selection problem. We use conditions for the possibility and admissibility of alternatives to form the set X. The
specific problem constraints determine these conditions. The problem of forming the set X is a selection problem (Xu, OP1), where Xu is the universal set of all alternatives (considered known) and OP1 is the optimality principle expressing the conditions for the admissibility of alternatives. The set X = Co1(Xu) obtained by solving the indicated selection problem is called the initial set of alternatives. Thus, the general problem of decision making consists of solving two consecutive selection problems.

In practical cases, alternatives have many properties that affect the solution. Let all the properties k1, …, km taken into account when solving the problem (X, OP) be criteria. We associate the criterion kj with the j-th axis of the space E^m (j = 1, …, m). We map the set X to E^m by assigning to each alternative x ∈ X the point φ(x) = (φ1(x), …, φm(x)) ∈ E^m, where φj(x) is the estimate of x by the criterion kj (j = 1, …, m). The space E^m will be called the criterion space.

When solving the problem of optimal control in complex networks, the alternatives X are the possible controls u ∈ U. The vector φ(u) ∈ E^m represents the profit from the control u, and E^m represents the profit space. The optimality principle is defined not on the initial set of controls U but on the set of profits Φ = φ(U) ⊆ E^m, i.e., the selection of controls is carried out not directly but based on the profits φ(u) corresponding to the controls u ∈ U.

Let a binary relation R be set on the space E^m. A control u* ∈ U is optimal if it is impossible to obtain a profit vector φ(u) that is preferable to φ(u*) according to R for any other control, i.e., ¬(φ(u) R φ(u*)) for all u ∈ U. We call such a control R-optimal. The set of all R-optimal controls is denoted by O(U, φ, R). The multicriteria optimal control problem (U, φ, R) consists in identifying O(U, φ, R) for given U, φ, and R.
If the map φ associates each u ∈ U with a number φ(u) ∈ E^1, and the relation R is the "greater than" relation on the set of real numbers, then we obtain the classical single-criterion optimal control problem.
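The R-optimal selection described above can be illustrated for the common case where R is the Pareto dominance relation on the profit space. The sketch below is an illustration, not the authors' implementation; the control names and profit values are invented for the example:

```python
def pareto_dominates(a, b):
    """True if profit vector a is preferable to b under the Pareto relation:
    at least as good in every criterion and strictly better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def r_optimal(controls, profit, prefers=pareto_dominates):
    """O(U, phi, R): the controls u for which no other control v yields a
    profit vector phi(v) preferable to phi(u) according to R."""
    return [u for u in controls
            if not any(prefers(profit(v), profit(u)) for v in controls if v != u)]

# Toy profit space E^2: each control is scored by two criteria.
profits = {"u1": (3, 1), "u2": (2, 2), "u3": (1, 1), "u4": (3, 0)}
print(r_optimal(list(profits), profits.get))  # → ['u1', 'u2']
```

Here u3 is dominated by u1 and u2, and u4 by u1, so only the non-dominated controls survive the selection.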
3 Formation of a Set of Alternatives Based on the Analysis of Heterogeneous Data Streams

At the first stage of solving the decision making problem, it is necessary to prepare the set of acceptable alternatives X = Co1(Xu) applicable in the current situation. In order to construct possible strategies for the decision maker, it is necessary to determine how the current situation may evolve in the future. That requires forming a set of possible scenarios for the situation's development [4]. Each scenario corresponds to some control actions that the decision maker must take to facilitate or hinder the development of the situation according to the given scenario.

In [5], the authors proposed a hybrid approach to analyzing and forecasting the development of situations, which allows one to form a set of scenarios based on processing streams of heterogeneous data dynamically coming from various sources. We consider changes that occur in the data stream and reflect various stages of development of the analyzed situations as events. Various models of events are used in different subject areas, in particular, a logical rule, a frame, a burst in a time series, a separate document, and a cluster of documents describing the event [6–10]. Sequential
detection of interrelated events allows for tracking the development of the situation over time. The formation of scenarios for the further development of this situation consists in identifying events that may occur in the future. The proposed approach consists of the following steps:

In the first step, we regularly download various (numerical, tabular, textual) data from different sources, such as news portals on the Internet, documentary and relational databases.

In the second step, we perform preliminary processing of the downloaded data, then cleanse it and convert it to the required format.

In the third step, we detect events ei in the data stream. The event model and the method for detecting events are specific to each data type.

In the fourth step, we form situations s = (es^1, es^2, …, es^n), which are chains of interrelated events. The data type also determines the method of event chain construction.

In the fifth step, we form a set of possible scenarios for further development of the current situation. Each scenario ξ = (eξ^1, eξ^2, …, eξ^m) represents a potential continuation of the current situation. Scenario generation is performed uniformly for different types of data.

In the sixth step, we form recommendations recξ for decision makers corresponding to each of the proposed scenarios.

The proposed approach allows working with heterogeneous data streams. Depending on the type of data and the event model used, suitable methods for detecting events and combining them into situational chains are selected. In particular, the authors used a method of detecting events based on incremental clustering to work with text streams. The method can be adjusted to various subject areas through the use of machine learning [11].

The formation of scenarios for further development of situations is based on case-based reasoning [12, 13]: the current situation sc is compared with sample situations se ∈ Se from the sample database Se prepared by experts.
Such samples reflect the development of various situations in the past. If the current event chain is similar to the initial part st(se, sc) of the sample chain, we can assume that the further development of the situation sc will be similar to the final part fin(se, sc) of the sample chain. Thus, we can consider the sequence of events fin(se, sc) as a possible scenario for further development of the current situation.

The method used to compare situations is a modification of the Levenshtein distance: the normalized total weight of the operations needed to convert st(se, sc) to sc determines the distance between the chains:

ρ(se, sc) = (hdel·Wdel + hadd·Wadd + hrep·Wrep + htrep·Wtrep) / len(st(se, sc)) = h^T W / len(st(se, sc))

where len(st(se, sc)) is the length of the initial part of the sample chain.

W = (Wdel, Wadd, Wrep, Wtrep) is a vector containing the total weights of the various types of chain conversion operations: deleting an event from the sample situation (Wdel), adding an event to the current situation (Wadd), replacing an event with
its analogue (Wrep), and changing the time interval between events (Wtrep). The method of calculating the weights depends on the type of analyzed data and is selected based on the used event models. h = (hdel, hadd, hrep, htrep) is a vector of coefficients which determine the impact of the different operations on the distance value.

The distance value ρ(se, sc) is used to determine whether the current situation sc is analogous to the sample situation se. We consider the task of analogy determination as a logistic regression problem. We introduce a variable y which assumes the value "1" if the chains are not analogous and "0" otherwise. We assume that the probability of the event "y = 0" (that is, the probability that the current situation is analogous to the sample chain) is defined by a logistic function of the distance between the chains:

P(y = 0 | se, sc) = 1 − 1 / (1 + e^(−ρ(se, sc)))
We determine the values of the coefficients h via the maximum likelihood method, using a training set consisting of pairs of analogous and non-analogous situations. Sample chains that satisfy the condition P(y = 0 | se, sc) > 0.5 are considered analogues of the situation sc, and their final parts are recognized as probable scenarios of its further development.

To support decision making, we provide each of the generated scenarios with recommendations regarding actions that need to be taken to facilitate or prevent the development of the situation according to this scenario. In order to generate such proposals, experts provide each event of the sample situations in the base Se with recommendations indicating what actions, in what period, and by which person should be taken if a similar event occurs in the future.

We consider the set of generated recommendations as the set of possible alternatives for the decision maker in the current situation, that is, the set of acceptable controls U. This set is the initial set of alternatives for the optimal control problem (U, φ, R).
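The distance and analogy test above can be sketched in a few lines. This is an illustration only: the coefficient values in H and the operation costs in W are invented for the example (in the paper, h is fitted by maximum likelihood on labelled pairs of situations):

```python
import math

# Illustrative coefficient vector h; fitted by maximum likelihood in the paper.
H = {"del": 1.0, "add": 1.0, "rep": 0.7, "trep": 0.3}

def chain_distance(W, chain_len, h=H):
    """rho(se, sc): normalized weighted cost h^T W of converting the initial
    part st(se, sc) of a sample chain into the current situation sc.
    W maps operation type -> total weight of operations of that type."""
    return sum(h[op] * W.get(op, 0.0) for op in h) / chain_len

def p_analogous(rho):
    """P(y = 0 | se, sc): probability that the situations are analogous;
    a logistic function that decreases as the distance rho grows."""
    return 1.0 - 1.0 / (1.0 + math.exp(-rho))

# Hypothetical conversion costs for one sample chain of length 4:
W = {"del": 0.0, "add": 1.0, "rep": 1.0, "trep": 2.0}
rho = chain_distance(W, chain_len=4)
print(round(rho, 3), round(p_analogous(rho), 3))  # → 0.575 0.36
```

At ρ = 0 the probability is exactly 0.5 and it falls monotonically with distance, so the acceptance threshold P > 0.5 effectively requires the fitted weighted cost to be negative for accepted analogues.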
4 Optimal Control Under Conditions of Confrontation

In [5], the proposed solution to the problem of choosing the optimal decision making strategy is to select the most profitable alternative from the generated set using the analytic hierarchy process (AHP) [14]. However, this approach does not take into account the presence of other players and the relations between them, including confrontation and cooperation. It is reasonable to apply information network models to describe the behavior of interacting players.

We consider an information network to be a structure consisting of many agents (individuals, groups, organizations, states) and relations between them (interaction, cooperation, communication) [3, 15]. Formally, an information network is a graph G(N, E), in which N = {1, 2, …, n} is a finite set of vertices (agents) and E is a set of edges reflecting the relations between agents. Agents are affected by players, each of whom seeks to distribute some information on the network and influence the agents of other players.
We suppose that players choose strategies simultaneously and independently, i.e., play a bimatrix game in standard form. We consider the problem of optimal control under conditions of confrontation of n players A1, …, An, each of which chooses a control action ui ∈ U^i. In this case U can be represented in the form U = U^1 × … × U^n. We determine the profit of the player Ai by the function φi(u1, …, ui, …, un), i.e., it depends on the controls chosen by all players. The goal of each player Ai is to maximize its profit φi (i = 1, …, n).

The optimal control problem (U, φ, R) is to find the set of control actions of all players that is optimal according to the binary relation R set on E^n. Usual choices of R are the Pareto relation and the majority relation. The control u* is R-optimal if condition (1) is satisfied for all u ∈ U:

¬(φ(u) R φ(u*))    (1)
where φ(u) = (φ1(u), …, φn(u)). Note that in this case the profits φi(u) of individual players are scalar functions. When using vector profits of individual players (φi(u) ∈ E^m), the profit φ(u) is defined by a vector in the space E^mn: φ(u) = (φ1^1(u), …, φ1^m(u), …, φn^1(u), …, φn^m(u)). In this case R is a binary relation on E^mn.

Let φi(u) (i = 1, …, n) be scalar functions. We define the notion of control optimality using the Nash principle. The control u* is called Nash optimal if condition (2) is satisfied for every i = 1, …, n:

ui* ∈ arg maxui φi(u1*, …, ui−1*, ui, ui+1*, …, un*)    (2)
This means that no player should individually change their control ui*, since, with the other players' controls unchanged, the player Ai cannot obtain a profit exceeding φi(u*). The point u* is called the Nash point [16, 17]. The Nash optimal control problem is denoted by (U, φ, N), where U = U^1 × … × U^n, φ = (φ1, …, φn), and N indicates Nash optimality. A partial solution to the problem (U, φ, N) is any Nash optimal control u* ∈ U, and the general solution is the set of all partial solutions. Note that the Nash point u* may not satisfy relation (1), i.e., with a simultaneous change of controls, all players may be able to increase their profits. In such cases, the Nash point is not Pareto optimal.

When using vector profits of players, the functions φi(u) (i = 1, …, n) take values in the space E^m and the binary relation R is defined on this space. We call a control u* ∈ U Nash R-optimal if for each i = 1, …, n and any ui ∈ U^i it satisfies the condition:

¬(φi(u1*, …, ui−1*, ui, ui+1*, …, un*) R φi(u1*, …, ui−1*, ui*, ui+1*, …, un*))    (3)
If m = 1 and R is the "greater than" relation, condition (3) reduces to the Nash optimal control condition (2) described above. The optimal control problem with vector profits under conditions of confrontation is to find all or some of the Nash R-optimal controls. We denote this problem by (U, φ, R, N).
Let R1, …, Rn be binary relations on E^m that express the preferences of players A1, …, An respectively. We consider a control u* ∈ U the cumulative Nash optimal control if for each i = 1, …, n and any ui ∈ U^i:

¬(φi(u1*, …, ui, …, un*) Ri φi(u1*, …, ui*, …, un*))    (4)
If Ri = R for each i = 1, …, n, the cumulative Nash optimal control reduces to the Nash R-optimal control (3).
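The Nash R-optimality check of condition (3) can be sketched directly: a profile is Nash R-optimal if no unilateral deviation produces a vector profit that is R-preferable. The game below is hypothetical (a 2×2 game with two criteria per player, values invented for the example), with R taken as the Pareto relation:

```python
def pareto_prefers(a, b):
    """a R b under the Pareto relation on E^m: a is at least as good in
    every criterion and strictly better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def is_nash_r_optimal(profile, strategy_sets, profits, prefers=pareto_prefers):
    """Condition (3): for every player i and every unilateral deviation u_i,
    the deviated vector profit is not R-preferable to the profit at profile."""
    for i, U_i in enumerate(strategy_sets):
        base = profits[i](profile)
        for u_i in U_i:
            deviated = list(profile)
            deviated[i] = u_i
            if prefers(profits[i](tuple(deviated)), base):
                return False
    return True

# Hypothetical vector payoffs (two criteria) for each outcome of a 2x2 game:
PHI1 = {(0, 0): (2, 2), (1, 0): (1, 3), (0, 1): (0, 0), (1, 1): (1, 1)}
PHI2 = {(0, 0): (3, 1), (0, 1): (2, 0), (1, 0): (5, 5), (1, 1): (1, 1)}
players = [PHI1.__getitem__, PHI2.__getitem__]
print(is_nash_r_optimal((0, 0), [[0, 1], [0, 1]], players))  # → True
```

In this toy game the profile (1, 1) is not Nash R-optimal, since player B's deviation to strategy 0 yields (5, 5), which Pareto-dominates (1, 1).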
5 Non-cooperative Game with Vector Profits

Depending on the possibility or impossibility of joint actions between the participants, games are divided into non-cooperative, cooperative, and coalition games. In the non-cooperative version of the game, each player chooses their strategy to achieve their highest profit [18, 19].

We consider a non-cooperative bimatrix game with two players A and B. The game is defined by payment matrices A and B of size nA × nB, where nA (nB) is the number of possible controls of player A (B). The rows of the matrices correspond to the controls of player A, the columns to the controls of player B. The profit of player A (B) for given controls is equal to the element of the matrix A (B) at the intersection of the selected row and column.

Let A^1, …, A^m, B^1, …, B^m be matrices of size nA × nB with elements aij^1, …, aij^m, bij^1, …, bij^m (i = 1, …, nA; j = 1, …, nB). The profits of the players A and B for given controls are denoted by aij = (aij^1, …, aij^m) and bij = (bij^1, …, bij^m). The average profit of the player A, that is, the vector φA(x, y), is determined as Σi Σj aij xi yj, where x = ⟨x1, …, xnA⟩ and y = ⟨y1, …, ynB⟩ are the mixed strategies of players A and B, Σi xi = 1, Σj yj = 1. Player B's average profit is determined similarly based on the matrix B.

The solution to the matrix game with vector profits is a pair x⁰, y⁰ that satisfies the following conditions:

¬(Σi Σj aij xi yj⁰ R Σi Σj aij xi⁰ yj⁰) for each x;
¬(Σi Σj bij xi⁰ yj R Σi Σj bij xi⁰ yj⁰) for each y.

The set of solutions to the matrix game with vector profits is denoted by O(A, B, R), where A = (A^1, …, A^m), B = (B^1, …, B^m). The problem of finding O(A, B, R) is the problem of determining Nash-optimal controls.
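The average-profit formula Σi Σj aij xi yj can be evaluated directly for vector payoffs. The sketch below is an illustration with a hypothetical payoff matrix, not the authors' code:

```python
def average_profit(A, x, y):
    """phi(x, y) = sum_i sum_j a_ij * x_i * y_j for mixed strategies x and y,
    where each payoff a_ij is a vector of m criteria."""
    m = len(A[0][0])
    phi = [0.0] * m
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            for k in range(m):
                phi[k] += A[i][j][k] * xi * yj
    return tuple(phi)

# Hypothetical 2x2 payoff matrix of player A with two criteria per cell:
A = [[(3, 1), (1, 0)],
     [(0, 2), (-1, 1)]]
print(average_profit(A, x=(0.5, 0.5), y=(0.5, 0.5)))  # → (0.75, 1.0)
```

Player B's average profit is computed the same way from the matrix B; checking the solution conditions then amounts to verifying that no alternative x (or y) yields an R-preferable average profit.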
6 An Example of Finding Optimal Control in a Confrontation Between Two Players

As an example of a non-cooperative game, we consider the confrontation of two firms (players A and B) in the field of technological leadership. A previously had technological superiority, but currently has less financial resources for research than its competitor. Both firms A and B have to decide whether to try to achieve a dominant market position in the technological field by making large investments.

In order to obtain the sets of possible controls for both players, their current situations were compared to sample situations from the database prepared by experts. Sample situations analogous to the current ones were identified, yielding possible scenarios for their further development. Recommendations for the decision makers corresponding to the generated scenarios were considered as possible controls, that is, strategies of the players. An analysis of the scenario generation method's quality is given in [5]. A series of experiments was carried out to estimate the method's ability to correctly identify analogous situations. The experiments showed that the values of precision, recall and F-measure may reach 75.8%, 84.9% and 80.1% respectively when using 90 training samples.

The scenario analysis has provided two possible strategies for player A, presented in Table 1.

Table 1. Possible strategies for the firm A.
Analogous situation              | Strategy
Sanofi cuts R&D programs         | Reduce research project costs
IBM increases investment in IoT  | Invest heavily in research to maintain excellence

For B, two possible strategies have also been identified, as presented in Table 2.

Table 2. Possible strategies for the firm B.
Analogous situation                                                       | Strategy
Google abandoned competition with Apple in the field of tablet computers  | Abandon technological competition with A
Specialized invests in R&D to enter the electric bike market              | Invest heavily in research to compete with A
If both firms invest heavily in the business, then A will have better prospects for success, although this will incur high financial costs for both companies; negative profit values represent this situation. For the firm A, it would be best if the firm B abandoned competition: its payoff in this case would be 3. With a high probability, the firm B would win the rivalry if the firm A adopted a reduced investment
program and the firm B adopted a wider one. We get a bimatrix game with the following payment matrices (we will use scalar profits of the players): A¼
3 0
1 ; 1
B¼
0 0
3 : 2
We search for a pure-strategy Nash equilibrium (in this case, in the vectors x and y one element equals 1 and the rest equal 0). The maximal elements in the columns of the matrix A are at positions (1; 1) and (1; 2). The maximal elements in the rows of the matrix B are at positions (1; 2) and (2; 1). The intersection of these two sets is (1; 2). This pair of controls corresponds to the equilibrium in pure strategies [20].

The analysis of the matrices shows that the balance in pure strategies occurs at high R&D costs for B and low costs for A. In any other scenario, one of the competitors has a reason to deviate from the strategic combination. For example, a reduced budget is preferable for the firm A if the firm B abandons competition. However, if A adopts a reduced investment program, it is beneficial for B to invest heavily in R&D.

The firm A, which has a technological advantage, can use the results of the situation analysis based on game theory in order to achieve an optimal outcome for itself. It should make a specific signal (such as a decision to purchase new laboratories or to hire additional staff) to show that it is ready to make significant investments. The absence of such a signal would tell the firm B that A is abandoning the competition.
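The column-maxima/row-maxima search described above can be reproduced in a few lines. The payoff values follow the example's matrices (with the negative entries as reconstructed from the equilibrium analysis in the text); indices below are 0-based, so (0, 1) corresponds to the pair (1; 2):

```python
def pure_nash(A, B):
    """Pure-strategy Nash equilibria of a bimatrix game: cells that are
    simultaneously a column maximum of A (no profitable row deviation for
    player A) and a row maximum of B (no profitable column deviation for B)."""
    rows, cols = len(A), len(A[0])
    col_max = {(i, j) for j in range(cols) for i in range(rows)
               if A[i][j] == max(A[r][j] for r in range(rows))}
    row_max = {(i, j) for i in range(rows) for j in range(cols)
               if B[i][j] == max(B[i][c] for c in range(cols))}
    return sorted(col_max & row_max)

# Row 0: reduced budget for A; column 1: heavy R&D investment by B.
A = [[3, 1], [0, -1]]
B = [[0, 3], [0, -2]]
print(pure_nash(A, B))  # → [(0, 1)]: low R&D costs for A, high costs for B
```

The single equilibrium found matches the pair (1; 2) identified in the text.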
7 Conclusions

The article proposes an approach to solving the optimal control problem under conditions of confrontation with vector profits, which consists in finding all or some of the Nash R-optimal controls. The novelty of the proposed approach lies in the possibility of automatically forming a set of possible strategies for the players based on the analysis of heterogeneous data streams. The paper describes a hybrid approach to analyzing and forecasting the development of situations and a method for generating possible scenarios based on case-based reasoning. The recommendations provided for the generated scenarios are considered as possible controls. The use of game-theoretic models for determining the optimal control actions allows one to take into account the basic principles underlying the rational behavior of players in crisis situations in complex information networks. A possible direction for further development of the proposed approach is the analysis of games with incomplete information, as well as coalition games.
References 1. Castells, M.: The Information Age: Economy, Society and Culture. Blackwell Publishers, Oxford (1996) 2. Okhapkina, E., Okhapkin, V., Kazarin, O.: The cardinality estimation of destructive information influence types in social networks. In: 16th European Conference on Cyber Warfare and Security, ECCWS 2017, pp. 282–287. Academic Conferences and Publishing International Limited, Reading (2017)
3. Yeung, D.W.K., Petrosyan, L.A.: Subgame Consistent Economic Optimization. Springer Press, New York (2012) 4. Vučijak, B., Kurtagić, S.M., Silajdžić, I.: Multicriteria decision making in selecting best solid waste management scenario: a municipal case study from Bosnia and Herzegovina. J. Clean. Prod. 130, 166–174 (2016) 5. Andreev, A., Berezkin, D., Kozlov, I.: Approach to forecasting the development of situations based on event detection in heterogeneous data streams. In: International Conference on Data Analytics and Management in Data Intensive Domains, pp. 213–229. Springer, Cham (2017) 6. Weng, J., Lee, B.S.: Event detection in Twitter. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media (ICWSM 2011), pp. 401–408. AAAI Press, Palo Alto (2011) 7. Yang, Y., Carbonell, J.G., Brown, R.D., Pierce, T., Archibald, B.T., Liu, X.: Learning approaches for detecting and tracking news events. IEEE Intell. Syst. 14(4), 32–43 (1999) 8. Aggarwal, C.C., Yu, P.S.: On clustering massive text and categorical data streams. Knowl. Inf. Syst. 24(2), 171–196 (2010) 9. Yao, W., Chu, C.H., Li, Z.: Leveraging complex event processing for smart hospitals using RFID. J. Netw. Comput. Appl. 34(3), 799–810 (2011) 10. Song, H., Wang, L., Li, B., Liu, X.: New trending events detection based on the multirepresentation index tree clustering. Int. J. Intell. Syst. Appl. (IJISA) 3(3), 26–32 (2011) 11. Andreev, A.M., Berezkin, D.V., Kozlov, I.A.: Automated topic monitoring based on event detection in text stream. Inf. Meas. Control Syst. 15(3), 49–60 (2017). (in Russian) 12. Leśniak, A., Zima, K.: Cost calculation of construction projects including sustainability factors using the Case Based Reasoning (CBR) method. Sustainability 10(5), 1608 (2018) 13. Kozaev, A., Alexandrov, D., Saleh, H., Bukhvalov, I.: Application of case-based method to choose scenarios to resolve emergency situations on main gas pipeline. 
In: Proceedings of the 5th International Conference on Actual Problems of System and Software Engineering (APSSE 2017), pp. 120–126. CEUR-WS.org, Aachen (2017) 14. Saaty, T.L.: The Analytic Hierarchy Process. McGraw-Hill, New York (1980) 15. Saaty, T.L., Kevin, K.P.: Analytical Planning. The Organization of Systems. Pergamon Press, Oxford (2007) 16. Nash, J.F.: Non-cooperative games. Ann. Math. 54(2), 286–295 (1951) 17. Beltadze, G.N.: Game theory-basis of higher education and teaching organization. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 8(6), 41–49 (2016) 18. Salukvadze, M.E., Beltadze, G.N.: Strategies of nonsolidary behavior in teaching organization. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 9(4), 12–18 (2017) 19. Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V.V. (eds.): Algorithmic Game Theory. Cambridge University Press, New York (2007) 20. Bai, S.X.: A note on determining pure-strategy equilibrium points of bimatrix games. Comput. Math. Appl. 32(7), 29–35 (1996)
Systems Theory for the Digital Economy

Georgy K. Tolokonnikov¹, Vyacheslav I. Chernoivanov¹, Sergey K. Sudakov², and Yuri A. Tsoi¹

¹ Federal Scientific Agro-Engineering Center VIM, Russian Academy of Sciences, 1st Institute Passage, 5, Moscow, Russia
[email protected]
² Federal State Budgetary Institution Scientific and Research Institute of Normal Physiology named after P.K. Anokhin, Russian Academy of Sciences, Baltic St., 8, Moscow, Russia
[email protected]
Abstract. In the framework of the categorical theory of systems, which generalizes functional systems and biomachsystems, an analysis is made of the methods for implementing the digital economy. It is proved that the proven methods of deep learning neural networks should be classified as weak artificial intelligence: these methods cover the level of reflexes in the physiological formulation. A strong artificial intelligence should model not only the level of reflexes but also the level of functional systems, at which thinking takes place and consciousness manifests itself. It is proposed to implement this higher level in the neurocategories of categorical systems theory. It is also proposed to include in the definition of a smart enterprise for the digital economy the ability to develop new algorithms of system behavior and the strict implementation of the categorical system approach in the interaction of subsystems. The indicated properties cannot be realized in traditional neural networks; they require the methods of functional systems and biomachsystems, as well as the categorical approach to systems.

Keywords: Artificial intelligence · Neural networks · Functional systems · Biomachsystems · Categorical systems theory · Categories · Smart enterprise
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 447–456, 2020. https://doi.org/10.1007/978-3-030-39216-1_41
1 Introduction
The introduction of modern methods of artificial intelligence (AI), including the intellectual tools of the digital economy, raises great hopes of ensuring the breakthrough character of the current stage of development of industry and agriculture, both in Russia and in other countries. The boom in applications of artificial neural networks that has been observed for more than five years [1] substantiates, according to many authors, the thesis that the basis for breakthrough technologies should be precisely the technologies of artificial neural networks. As it seems to these authors, big data breeds intelligence by itself, an intelligence that may even surpass human intelligence in the future. The successes of the neural network approach are undeniable and underlie national plans and programs for creating a digital economy. The purpose of this report is to reach the following conclusions by relying on the truly outstanding achievements of the neural network approach to AI [2–9], as well as on modern
systems theory, including functional systems, biomachsystems, and categorical systems theory [10–16]. Certainly, the neural network approach is effective, and its potential should be realized further, but hopes for the unlimited development of AI within its framework are not justified. It is time to move on from the Pavlov principle, which embodies these neural network hopes, to the Anokhin-Sudakov principle, based on a categorical generalization of the theory of functional systems and the theory of biomachsystems. The theory of functional systems goes from the reflex to human consciousness and thinking, whose modeling is the goal of AI; the theory of biomachsystems offers for AI, in particular, a solver for generating new algorithms of system behavior.
2 Artificial Neural Networks and Reflexes in Physiology
Hilbert's thirteenth problem, "Is it possible to solve the general seventh-degree equation using functions depending on only two variables?", was solved by Kolmogorov (with the participation of V.I. Arnold) in a much more general form. It was proved [17] that any continuous function of many variables can be represented as a superposition of continuous functions of one variable and a single function of two variables, namely addition. Artificial neural networks are built on the basis of composition, and A.N. Kolmogorov's result, adapted to them [18], means that a neural network is just an approximation of a function of several variables by functions of fewer variables using composition. Approximating functions is something mathematicians learned to do long ago in a variety of ways, for example, by expanding functions in a Taylor series. Training a neural network by selecting the weights of its neurons corresponds to selecting the coefficients of the Taylor series of the approximated function. Given its arguments (the stimulus), the function computes and produces a result (the response). By and large, there is a direct correspondence with the reflexes of living organisms (stimulus - reaction). To look for anything here beyond reflexes seems rather strange, unless one succumbs to the fashion for neural networks. Nevertheless, in [1] one can read about the widespread hope that intelligence does not need to be programmed: given big data, it appears by itself. A soberer and, in our opinion, correct statement was made by A.A. Zhdanov, who devoted an entire talk to this idea, "Intelligence not as a set of reflexes, but as freedom of choice", on 03.02.2018 at the III Congress of the Neuronet industrial union: "Deep learning and Big Data cover only a small part of the brain functions - recognition and the search for certain correlations.
What kind of control can be built on the basis of a recognition system alone? On the basis of a recognition system, one can build only a control system that implements a reflex … The assumption that the activity of the higher parts of the brain is a combination of reflexes is deeply mistaken!" (http://www.aac-lab.com/files/aac-rus-03-02-20181.ppsx). The whole technology of neural networks comes down to the art of selecting the number of neurons, their weights and their connections so that the network learns; A.N. Kolmogorov's theorem guarantees that such sets of neurons and weights exist. Ideologically, the method is very simple, if not primitive (see the opinion of P.K. Anokhin
regarding the use of artificial neurons [10]); it is not surprising that around this neural network boom "… there is a growing concern that another 'bubble' is being inflated, and the expectations of AI adherents, as has already happened more than once, are significantly overstated" [1]. Success came, in the form of convolutional networks, through the use of multilayer networks. In the last century it was possible to train such networks with one, two or three layers, but networks with more layers could not be trained for lack of computing power. Layers of artificial neurons mimic the layers of the neocortex, as they do in the deeper approach of Hawkins with his HTM technology [19]. The convolutional networks mentioned were proposed in 1979 [20]; in 2006 Yann LeCun [9] successfully applied the error backpropagation method to convolutional networks, and the "revolution" began in 2012, when the technology surpassed all competitors in the recognition of medical images, Chinese characters and road signs (in the latter the network already surpassed humans). Of course, one cannot believe that such a technology, operating with such minimal means, can solve the cardinal issues of autonomous control and become the only effective basis needed for a technological breakthrough. The network acts like a reflex: an external signal is applied, a suitable answer is received. Turning to the functional systems of P.K. Anokhin, we see how deeply functional systems reflect the properties of the networks of living neurons of the nervous system and brain. Need, motivation, result, and other phenomena that are plainly inexpressible in the popular neural network approach are successfully modeled by a functional system. Nevertheless, the successes of deep-learning neural networks, as we have already noted, gave rise to the hope that programming of artificial intelligence is no longer necessary: artificial intelligence appears by itself when neural networks are trained on large data sets.
Moreover, the following so-called Pavlov principle was put forward [10]: "A network of neurons, each of the connections between which gradually changes as a function of the locally accessible components of the error and activity signals of the neurons joined by the connection, arrives at error-free operation in the course of the network's functioning. … The general meaning of the Pavlov principle is that neural networks with mutable connections 'almost always' acquire over time the ability to respond to the inputs of the neural network as is needed by the device or subject that trains the network." Confidence in the validity of the principle is supported by the analogy between the functioning of artificial neural networks and the reflex theory of Academician I.P. Pavlov in neurobiology. The error backpropagation method, as is known, cannot be implemented on living neurons; the search for more biological methods of training artificial neural networks has led to some suitable methods, which made it possible to connect the phenomenon of network learning with I.P. Pavlov's hypothesis that "a properly organized change in the connections between elements and structures of the nervous system is the basis of the higher nervous activity of a person … An analysis of the working conditions of such networks shows that they are in fact based on a variety of conditioned reflexes, that is, on the very same bricks of behavior which, I.P. Pavlov believed, provide the formation of the mental reactions of all living organisms, including the process of human thinking" [21].
It has long been known that I.P. Pavlov's theory of reflexes is far from sufficient to explain human thinking and many other functions of the nervous system. The next breakthrough step in neuroscience is the theory of functional systems of P.K. Anokhin and K.V. Sudakov. Thus, the Pavlov principle should be generalized, at least to the case of functional systems. But a direct generalization does not work, since the type of neural networks covered by the Pavlov principle is too limited. The polycategory approach [12–16] revealed significant additional possibilities through the organization of neurons into new mathematical structures called neurographs and neurocategories. I.P. Pavlov's theory of reflexes turned out to be insufficient, so the artificial neural networks and reflexes corresponding to it, described by the Pavlov principle, are insufficient for fully functional modeling of the brain. Models of functional systems should be used, as well as polycategory methods, which lead to an extensive class of systems beyond the functional ones. A variant of such a generalization of the Pavlov principle was proposed in [16] and called the categorical principle of P.K. Anokhin and K.V. Sudakov. We state the fundamental conclusion at once: the prospects of the currently successful application of deep-learning neural networks are limited. Within this approach, the most fundamental solutions for the autonomy of control systems and for strong AI cannot be implemented; such solutions correspond to the capabilities of the categorical Anokhin-Sudakov principle, which lies in the field of categorical systems theory.
3 Categorical Systems and the Principle of P.K. Anokhin and K.V. Sudakov
In this section we first give a brief definition of convolutional polycategories and their higher analogs, including neurographs and neurocategories, and show that artificial neural networks are a very special case of convolutional polycategories. Secondly, we give a brief definition of categorical systems, within whose framework the problem of formalizing the theory of functional systems of P.K. Anokhin and K.V. Sudakov is solved. This allows us to formulate the categorical Anokhin-Sudakov principle and to explain why it is more adequate to nature for modeling human intelligence within strong AI than the Pavlov principle. A category is a set of arrows having one input and one output, together with objects (unit arrows). A composition operation is defined on the arrows and, together with the arrows, satisfies certain properties. Sets of multi-arrows, having several inputs and one output, and of poly-arrows, having several inputs and several outputs, similarly define a multicategory and a polycategory, respectively. In the traditional definition of a polycategory, a composition connects one output of an arrow to one input of another arrow. Such a composition cannot describe the connections of neurons in a neural network, where simultaneous connections of two or more inputs and outputs are possible. In convolutional polycategories [12], compositions are generalized to convolutions, which can connect poly-arrows in an arbitrary way. For example, Fig. 1 shows a convolution of three poly-arrows.
Fig. 1. An example of three poly-arrows n1, n2, n3 connected by a convolution S = (s1, s2, s3) into the network shown on the right; on the left are the convolutions themselves. Arrows (lines of inputs and outputs) are directed from bottom to top.
Neurons in the brain can connect not only through synapses but also in the form "soma-dendrite-soma-axon-soma", when the connecting neuron joins not the synapses but the two neurons themselves. This corresponds to a model in which arrows connect the original arrows; these connecting arrows can in turn be connected by arrows of the next level. We are, in fact, talking about higher categories, multicategories, and polycategories. A neurograph is a set of poly-arrows of different levels with corresponding convolutions. A neurocategory is a neurograph with additional properties, just as a category is a graph with additional properties: the neurocategory is obtained from a neurograph by adding all possible convolutions, similar to how a category is constructed from an arbitrary graph by adding compositions of arrows. Functors between polycategories are pairs of functions (one for poly-arrows, the other for objects) preserving the structure of the polycategories. Conventional artificial neural networks are a very special case of convolutional polycategories.
Theorem. Let there be an artificial neural network with neurons n : b^k → b having several inputs (k ∈ N) and one output, each neuron with its own activation function, with signals from a set b arriving at the inputs of the neuron's synapses. Connections between neurons are made through the single output, which branches into a finite number m ∈ N of lines connecting to the inputs of other neurons. Then the neural networks built from such neurons form an associative compositional convolutional polycategory with convolutions of the "crown" type.
As can be seen, the expressive capabilities of ordinary neural networks and of neurographs are simply incomparable. By this theorem, the connections of neurons in the brain, which constitute a neurograph, are not modeled by conventional artificial neural networks.
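As an informal illustration of the poly-arrow idea (the data structures and names below are our own sketch, not the authors' formalism), a poly-arrow carries several input and several output ports, a convolution is a set of links gluing output ports to input ports, and the glued network is again a poly-arrow whose ports are the ones left free. An ordinary artificial neuron is the special case of one output fanned out to many inputs, the "crown"-type convolution of the theorem.

```python
from dataclasses import dataclass, field

@dataclass
class PolyArrow:
    name: str
    inputs: list   # list of input port labels
    outputs: list  # list of output port labels

@dataclass
class Convolution:
    # each link glues ((arrow name, out port), (arrow name, in port))
    links: list = field(default_factory=list)

    def apply(self, arrows):
        """Glue arrows into a network; the free ports become its ports."""
        used_out = {src for src, _ in self.links}
        used_in = {dst for _, dst in self.links}
        free_in = [(a.name, p) for a in arrows for p in a.inputs
                   if (a.name, p) not in used_in]
        free_out = [(a.name, p) for a in arrows for p in a.outputs
                    if (a.name, p) not in used_out]
        return PolyArrow("(" + "*".join(a.name for a in arrows) + ")",
                         free_in, free_out)

# a "neuron" n1 feeding a "neuron" n2 through a crown-type gluing
n1 = PolyArrow("n1", ["x1", "x2"], ["y"])
n2 = PolyArrow("n2", ["a"], ["b"])
s = Convolution(links=[(("n1", "y"), ("n2", "a"))])
net = s.apply([n1, n2])
# net keeps n1's two free inputs and n2's free output
```

The point of the comparison in the text is that this gluing is not restricted to one-output-to-one-input links: a convolution may connect any set of ports at once, and in a neurograph further arrows may connect the connections themselves, which ordinary neural networks cannot express.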
There are many mathematically rigorous system approaches, but for biological systems, such as the human being and its intelligence subsystem, whose modeling is the goal of AI, the theory of functional systems of P.K. Anokhin and K.V. Sudakov (TFS) is the most adequate. Unfortunately, its creators presented it only at an intuitive level. TFS turned out to be formalizable in the language of convolutional polycategories. The key principle of TFS is the presence of a system-forming factor (the result toward which the whole system strives), which gathers the other necessary systems into the whole system, whereupon they become subsystems; this movement is from the whole to the
parts [10]. For such a system approach ordinary set-theoretic mathematics is not suitable; the language of category theory, adequate to this principle, should be used. Further cornerstones of TFS are the principles of isomorphism and of hierarchy for systems [10]; both are formalized in the categorical theory of systems. Thus, a categorical system is a category arrow with a corresponding convolution of a suitable convolutional polycategory. The system-forming factor of a composite system is the pair (F, S), where F is a functor that transforms the systems to be included in the composite system, and S is the convolution connecting these systems in the appropriate way. Isomorphic objects, as in any theory, are described by invariants. One invariant is the number of required components of the system: for example, there are three of them in a biomachsystem (human, machine, and living component) and four in a functional system (afferent synthesis unit, decision-making unit, acceptor of the action result, and action program unit). A more detailed invariant, which includes the number of required components, is the image under a forgetful functor from the category of systems to the category of ω-hypergraph constructions. This functor is called the similarity of categorical systems: systems are similar if the functor takes them to isomorphic constructions. For example, biomachsystems are not similar, and therefore not isomorphic, to functional systems, since their numbers of required components differ. The language of convolutional polycategories generalizes the concept of a calculus of ω-hypergraph constructions.
Now, having extended the artificial neural networks on which the Pavlov principle rests to categorical systems and neurocategories with categorical calculi of ω-hypergraph constructions, we arrive at the categorical Anokhin-Sudakov principle (stated in a form parallel to Pavlov's): "A network of neurons whose connections, as in a neurocategory or neurograph, gradually change under the influence of a categorical system-forming factor that implements both analogs of the training procedures underlying the Pavlov principle and categorical procedures of the calculus of ω-hypergraph constructions, arrives at error-free operation, achieving the result."
4 Smart Enterprises for the Digital Economy
It turns out that the definitions of smart manufacturing that have become popular do not differ qualitatively from the usual ones. In the literature and in discussions we do not find formulations giving the qualitative differences from traditional enterprises that would justify introducing the new concept of "smart manufacturing". For example, let us see what is usually meant by the concept of a "smart farm" [22] and whether traditional concepts are sufficient for the problem of a technological breakthrough in the agricultural sector. According to the usual definition of a "smart" farm, the following conditions are necessary for its functioning: informatization of all processes carried out on the farm using Big Data elements; minimization of uncertainties, including the influence of the "human" factor; maximum consideration of the climatic and socio-economic characteristics of the region; and the availability of trained personnel. In this description the smart farm exists on its own: there is no indication of the connections with and requirements of a higher-level system, and it is not clear what increase in "productivity and profitability" can be expected from applying the
indicated technologies (Big Data, etc.). The question is whether there is a qualitative difference between smart and "not smart" farms if one relies only on the number of installed sensors and the methods of information processing. The key to describing, or attempting to define, what "mind" is, is obviously its opposition to instinct and reflex; in particular, "mind" is uniquely associated with the ability to find new algorithms of behavior in new problem situations, to solve newly arisen, not yet solved, "mind-demanding" tasks. Fully autonomous behavior of a robot cannot be realized without endowing the robot with the ability to develop new algorithms not embedded in its control system at the time of manufacture. Of course, completely autonomous behavior is exhibited by living creatures already at the level of bacteria. A detailed, unbiased analysis of the situation shows that modern robots (drones, driverless cars, avatar soldiers, etc.), despite the presence in them of "smart" deep-learning neural networks, are far from even the beginnings of the "mind" found in bacteria [22–24]. In view of the above, it is necessary to discuss whether there are ways to solve the problem of autonomy, as well as other additional resources besides those planned for use in existing programs of digitalization of the economy. The problem of the "smartness" of an enterprise and the problem of the autonomy of its management subsystems have much in common and largely coincide. The creation of autonomous control systems is one of the key problems in the developed theory of biomachsystems, where a number of solutions have been proposed [12, 23, 24]. Among these solutions is the use of new types of solvers for control subsystems: the bioblock and the Post block. In the latter, new types of calculi are proposed that are more powerful than the known ones, which is confirmed by the model examples considered.
The Post block is based on universal calculi, such as the universal calculus of E. Post and its generalization to the case of ω-hypergraph calculi [12], which in principle allow one to construct any predetermined calculus, for example, one that yields the required conclusion (a new algorithm of machine behavior). The bioblock relies on the implantation of a functional, artificially grown section of the mammalian neocortex with its function of generating new algorithms preserved. Even a partial implementation of the Post block and/or the bioblock can increase autonomy, which amounts to a real increase in the "intelligence" of the enterprise's management subsystem. The issues of control subsystems in the theory of biomachsystems are described in detail in the monographs [12, 14] and the literature cited there. Thus, the use of biomachsystem solvers together with autonomy raises the "intelligence" of the enterprise's management subsystem without limit and thereby constitutes a qualitative difference between an enterprise with such management subsystems and both "smart" enterprises (houses, farms, etc.) in the popular sense and ordinary enterprises. Besides the use of new solvers, implementing an enterprise on the principles of biomachsystems offers a number of other possibilities for increasing its efficiency. The most important of them is strict adherence, during the design and operation of the enterprise, to the principles of the categorical system approach used in biomachsystems and their categorical generalization. Only by starting from the goal, the result of the system itself, can one obtain the basic requirements for building the system. For example, in the case of smart agriculture this makes it possible to answer the questions of which subsystems (smart field, smart farm,
smart greenhouse, etc.) have priority, how much financing they should receive and on what schedule, and how they should interact with each other in order to achieve the goals of the system, and so on. A complete misunderstanding of and disregard for the system approach (often even while it is being declared) are characteristic of the widespread idea of a smart farm as "unmanned" production. From the point of view of systems theory, one should start not from the "unmanned" character of production but from the goal, the result (for example, income) that the smart farm must achieve. The conditions of the farm's existence and the requirement to achieve the result may then not lead to the requirement of "unmanned" production at all, which is a secondary factor. A prerequisite for the "smartness" of an enterprise is the understanding, built into the management subsystem, that all subsystems must be configured to achieve the result of the enterprise itself as a system, and that there must be effective capabilities for the appropriate and prompt adjustment of the subsystems. We can say that the enterprise should be "smart enough" to bring its subsystems to the state required by the system itself. Whether this is carried out automatically or by the hands of employees is a secondary question; what matters is that it is unacceptable for a smart enterprise to use digitalization where it is less effective than other capabilities. So, an enterprise can be called smart only when its management system is capable of developing new algorithms for production processes and is built strictly according to system laws. In other words, the enterprise must be a categorical biomachsystem.
As can be seen, in the proposed definition of a smart enterprise, the presence of some number of sensors, particular methods of information processing, unmanned operation, or the application of Big Data analysis alone are not qualitative criteria of the "smartness" of an enterprise. Of course, further deepening of the concept of a "smart enterprise" will lead to new criteria not considered here. The material presented applies to all types of smart enterprises; the issues raised in connection with the smart farm are developed in more detail in [22–24].
5 Conclusions
It is customary to subdivide artificial intelligence into weak artificial intelligence, which implements cognitive tasks by means that are, generally speaking, in no way connected with analogs of human thinking in the sense of bionics, and strong artificial intelligence, which recreates human intelligence in artificial systems (computers, etc.). The popular approaches of deep-learning artificial neural networks should be attributed to weak AI. The concept of strong AI largely amounts to the concept of a model of human thinking, which is a complex biological system. This means that an adequate mathematical model of strong AI must not merely take the systemic aspects of human thinking into account but must itself be a system. As shown in the report, a categorical generalization of functional systems and biomachsystems should be used, i.e., a categorical theory of systems. This is a further development that includes the methods of artificial neural networks, which, as indicated above, are a very special case of neurographs: convolutional polycategories with associative convolutions of the "crown" type. For the digital economy, the transition from the methods of artificial neural
networks described by the Pavlov principle to the Anokhin-Sudakov principle means the beginning of the transition to strong AI; certain steps in this direction have been taken in the authors' works described in this report.
References
1. Shumsky, S.A.: Deep learning: 10 years later. In: Lectures on Neuroinformatics, Neuroinformatics 2017, pp. 98–131 (2017)
2. Nikolenko, S., Kadurin, A., Arkhangelskaya, E.: Deep Learning, 480 p. (2018)
3. Osovsky, S.: Neural Networks for Information Processing. Hot.-Telecom, 448 p. (2018)
4. Galushkin, A.I.: Neural Networks: The Basics of Theory. Hot.-Telecom, 496 p. (2017)
5. Karande, A.M., Kalbande, D.R.: Weight assignment algorithms for designing fully connected neural network. Int. J. Intell. Syst. Appl. (IJISA) 10(6), 68–76 (2018)
6. Dharmajee Rao, D.T.V., Ramana, K.V.: Winograd's inequality: effectiveness for efficient training of deep neural networks. Int. J. Intell. Syst. Appl. (IJISA) 6, 49–58 (2018)
7. Hu, Z., Tereykovskiy, I.A., Tereykovska, L.O., Pogorelov, V.V.: Determination of structural parameters of multilayer perceptron designed to estimate parameters of technical systems. Int. J. Intell. Syst. Appl. (IJISA) 10(10), 57–62 (2017)
8. Awadalla, H.A.: Spiking neural network and bull genetic algorithm for active vibration control. Int. J. Intell. Syst. Appl. (IJISA) 10(2), 17–26 (2018)
9. LeCun, Y.: Off-road obstacle avoidance through end-to-end learning. In: Advances in Neural Information Processing Systems (NIPS 2005) (2005)
10. Anokhin, P.K.: Fundamental questions of the general theory of functional systems. In: Principles of the Systematic Organization of Functions, Nauka, pp. 5–61 (1973)
11. Sudakov, K.V., Kuzichev, I.A., et al.: The Evolution of Terminology and Schemes of Functional Systems in the Scientific School of P.K. Anokhin, 238 p. (2010)
12. Biomachsystems: Theory and Applications, vols. 1–2. Rosinformagroteh, 228 and 216 p. (2016)
13. Chernoivanov, V.I., Sudakov, S.K., Tolokonnikov, G.K.: Category theory of systems, functional systems and biomachsystems, parts 1–2. In: Collection of Scientific Papers of the International Scientific and Technical Conference "Neuroinformatics 2017", vol. 2, pp. 131–147 (2017)
14.
Chernoivanov, V.I., Sudakov, S.K., Tolokonnikov, G.K.: Biomachsystems, Functional Systems, Categorical Theory of Systems. Scientific Research Institute of Normal Physiology named after P.K. Anokhin, FNATs "VIM", 447 p. (2018)
15. Tolokonnikov, G.K.: Classification of functional and other types of systems in their modeling by convolutional categories. Neurocomput. Dev. Appl. 6, 8–18 (2018)
16. Tolokonnikov, G.K.: The Anokhin-Sudakov category theory principle. In: XIV International Interdisciplinary Congress "Neuroscience for Medicine and Psychology", Sudak, Crimea, Russia, 30 May–10 June 2018, pp. 455–544. MAKS Press, Moscow, 568 p. (2018). ISBN 978-5-317-05822-7
17. Kolmogorov, A.N.: On the representation of continuous functions of several variables as a superposition of continuous functions of one variable and addition. Dokl. USSR Academy of Sciences 114(5), 953–956 (1957)
18. Gorban, A.N.: A generalized approximation theorem and computational capabilities of neural networks. Sibirsk. J. Calc. Math. RAS Sib. Sep. 1(1), 11–24 (1998)
19. Hawkins, J.: On Intelligence, 240 p. (2007)
20. Fukushima, K.: Neocognitron: a neural network model for a mechanism of pattern recognition unaffected by shift in position. Trans. IECE J62-A(10), 658–665 (1979)
21. Dunin-Barkovsky, V.L., Solovieva, K.P.: Pavlov's principle in the problem of reverse brain construction. In: XVIII International Conference Neuroinformatics-2016, Scientific Works, Part 1, pp. 11–23 (2016)
22. Tsoi, Yu.A.: The management system of a "smart" farm under conditions of uncertainty. Biomachsystems 2(1), 216–224 (2018)
23. Chernoivanov, V.I., Tolokonnikov, G.K.: The concept of a smart enterprise in the biomachsystems paradigm. Mach. Equip. Village 10, 2–7 (2018)
24. Chernoivanov, V.I., Tsoi, Yu.A., Elizarov, V.P., Tolokonnikov, G.K., Perednya, V.I.: On the concept of creating a smart dairy farm. Mach. Equip. Village 11, 2–9 (2018)
Agile Simulation Model of Semiconductor Manufacturing
Igor Stogniy, Mikhail Kruglov, and Mikhail Ovsyannikov
Institute of Applied Computer Science, Technische Universität Dresden, 01062 Dresden, Germany ([email protected]); Bauman Moscow State Technical University, 2nd Baumanskaya 5, 105005 Moscow, Russia ([email protected], [email protected])
Abstract. In the semiconductor industry there is a need for different simulation models for different purposes. There are several universal simulation tools with a high level of agility, but they lack a dedicated semiconductor library. There are specialized simulation tools for the semiconductor industry, but they do not offer an acceptable level of agility. We developed, on top of a universal simulation tool, an agile simulation model that meets this need. The main feature of the model is a universal process flow step. The experiments demonstrate the application of the model.
Keywords: Simulation · Semiconductor industry · Simplification
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020. Z. Hu et al. (Eds.): CSDEIS 2019, AISC 1127, pp. 457–465, 2020. https://doi.org/10.1007/978-3-030-39216-1_42
1 Introduction
Intensive research on simulation in the semiconductor industry has been carried out over the last thirty years. Simulation is one of the nine Industry 4.0 technologies [1]. It is a form of prescriptive analytics and is focused on "why did it happen?" questions [2]. Simulation allows organizations to implement the digital firm concept [3]. There is a need for simplified models in simulation experiments for two reasons: to reduce the run time of the model and to reduce the effort of manual model calibration. A comprehensive overview of simulation model simplification can be found in [4]. As Van der Zee points out, simplification "is still very much a green field" [4]. In recent years several authors have focused on simplified models in which only the bottleneck is modeled in detail (e.g. [5, 6]). Others have investigated different levels of model complexity (e.g. [7]). Rank et al. pointed out the importance of having the correct level of model complexity in semiconductor fab simulation and concluded that different tasks need different levels of simplification [8]. In this paper we present an approach that offers the ability to create simulation models of different granularities. Using this approach, we can achieve a simulation model with the necessary complexity level without significant development effort. As a basis for our research the well-known MIMAC model was used [9]. The raw data
I. Stogniy et al.

can be found in [10]. Hassoun and Kalir showed that this dataset is still the only current dataset for the semiconductor industry [11]. Anylogic was used as the simulation tool [12]. There are three different methods in simulation modeling: discrete event, agent based, and system dynamics. All three are used in the semiconductor industry [13]. To simulate a production planning process, a common approach is Discrete Event Simulation (DES). DES software has proven to be very useful, as confirmed by many leading semiconductor manufacturers (e.g. IBM, Intel) in numerous publications (e.g. [14, 15]). We used a combination of agent-based and discrete event modeling to achieve a higher scalability of the simulation model. The difficulty of simulation lies not in the execution of a simulation experiment, but in the creation of a simulation model [16]. The design of the experiment can also be problematic, as it often leads to a compromise between the quality of the results and the efficiency (e.g. duration) of the experiment. Our approach allows for gradual simplification to achieve a better compromise for a particular simulation purpose. This paper is organized as follows. The universal process flow step and modeling concept are presented in Sect. 2. The design of experiments is described in Sect. 3. Section 4 shows experimental results and includes discussion. Section 5 presents conclusions.
2 Modeling Concept

The wafer manufacturing process is a sequence of several steps (operations). Each step consists of the following phases: batching, setup, processing, scrap, rework, and transport [17]. We consider loading/unloading time as part of the processing time. Products pass through the manufacturing system as lots. Some operations include batching, in which case processing time is calculated per batch. The main idea of our approach is to build a model of a universal process flow step which describes all possible actions. Thus we represent a process flow as a cycle in which each iteration is an operation with particular parameters (see Fig. 1).
Fig. 1. Universal process flow step as a basis of the simulation model
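The cycle in Fig. 1 can be sketched as follows. This is an illustrative Python sketch only, not the authors' Anylogic implementation: one generic routine advances a lot through the phases of every operation, and the operation dictionaries, field names, and rework mechanism shown here are all assumptions.

```python
# Illustrative sketch of the universal process flow step. One generic
# routine is applied at every iteration of the process-flow cycle; each
# operation is a dict of parameters. Phase order follows the paper:
# batching, setup, processing, rework decision, transport.

def run_flow(flow, rng=None):
    """Advance one lot through its process flow; return total elapsed time."""
    t = 0.0
    step = 0
    while step < len(flow):
        op = flow[step]
        t += op.get("batch_wait", 0.0)   # batching phase (if the op batches)
        t += op.get("setup", 0.0)        # setup phase
        t += op["processing"]            # processing (per batch if batched)
        if rng is not None and rng.random() < op.get("rework_prob", 0.0):
            step = op["rework_step"]     # jump into the rework flow
            continue
        t += op.get("transport", 0.0)    # transport to the next tool group
        step += 1                        # next iteration of the cycle
    return t

flow = [
    {"processing": 2.0, "setup": 0.5, "transport": 0.1},
    {"processing": 3.0, "batch_wait": 1.0, "transport": 0.1},
]
total = run_flow(flow)
```

Because every operation is just another iteration of the same step with different parameters, adding or removing steps only changes the input data, not the model structure; this is the source of the model's agility.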
Agile Simulation Model of Semiconductor Manufacturing
Lots are modeled as active objects – agents. Machine tools and operators are resources – objects necessary for agents to perform actions. Steps are modeled with the help of specific blocks that implement a particular phase of the process (e.g. wait and batch for batching). Functions represent logical events (e.g. getPriority and calcReleasePriority for dispatching). Variables are necessary to save the values of parameters (e.g. counters, currentType) for the particular step. Events (event and event1) are used to track particular time periods to record statistics. Input data are tables of data sets, built following the form of the MIMAC data sets. Lot parameters are as follows:

• Step – number of the operation
• Type – product type
• No – lot number
• Release Priority – priority according to the dispatching rule
• Entered System – time of the beginning of the first operation
• Unit – number of the machine tool in the tool group
• Toolset – number of the machine tool group
• Start time – time of the beginning of processing on a machine tool
• End time – time of the end of processing on a machine tool
• Rework type – rework process flow ID.
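In Python terms, a lot agent carrying the parameters above could be represented as a simple record; the field names mirror the list, while the types and default values are assumptions (the actual agents are Anylogic objects):

```python
from dataclasses import dataclass

# Illustrative lot agent; fields mirror the paper's parameter list.
@dataclass
class Lot:
    step: int                      # number of the current operation
    type: int                      # product type
    no: int                        # lot number
    release_priority: float = 0.0  # priority under the dispatching rule
    entered_system: float = 0.0    # time the first operation began
    unit: int = 0                  # machine tool number within the tool group
    toolset: int = 0               # machine tool group number
    start_time: float = 0.0        # processing start on the machine tool
    end_time: float = 0.0          # processing end on the machine tool
    rework_type: int = -1          # rework process flow ID (-1: no rework)

lot = Lot(step=1, type=1, no=1)
```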
To ensure the adaptability of the simulation model, several system parameters can be varied to study their impact on manufacturing performance. Before running the model, each parameter can be changed by the user to obtain results for a specific system configuration. Those model parameters include the following types:

• Batch type – batching group
• Lot size – number of wafers in a lot (48 in the model)
• Tool value – number of machine tool groups
• Types value – number of product types
• Process steps – number of steps in the process flow for the particular product type
• Priority rule – dispatching rule
• Release method – lot release rule.
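As a sketch, such a user-facing configuration might look like the dictionary below. The default values are taken from the text where stated (lot size 48; 83 tool groups and two products with 210 and 245 steps appear later, in Sect. 3); the key names and the helper function are illustrative assumptions.

```python
# Illustrative run configuration; not the Anylogic parameter interface.
DEFAULT_CONFIG = {
    "batch_type": 1,              # batching group
    "lot_size": 48,               # wafers per lot (48 in the model)
    "tool_value": 83,             # machine tool groups (MIMAC Dataset 1)
    "types_value": 2,             # number of product types
    "process_steps": [210, 245],  # steps per product type
    "priority_rule": "FIFO",      # dispatching rule
    "release_method": "random",   # lot release rule
}

def make_config(**overrides):
    """Build a run configuration, letting the user vary any parameter."""
    unknown = set(overrides) - set(DEFAULT_CONFIG)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**DEFAULT_CONFIG, **overrides}

cfg = make_config(priority_rule="SPT", tool_value=10)
```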
During the execution of the model, the selected parameters of the system are displayed, as well as changes of variables and the number of incoming and outgoing lots for each phase of the process step. Machine group and machine tool utilization statistics and the lot cycle time are also displayed. An output table specifies the start and end time of each operation, as well as the specific machine tool and the waiting time (Table 1). The model output reports are very flexible and can be easily adjusted to the particular simulation purpose. For example, the table below allows for a calculation of the average waiting time not only for a particular machine tool, but even for specific combinations of machine tool and product type. Several other diagrams are presented below which describe the simulation experiments.
Table 1. Output information

Product type | Lot number | Step number | Tool group | Machine tool number | Start time | End time | Waiting time
1 | 1 | 1 | 59 | 1 | 2.72 | 31.52 | 0
1 | 1 | 2 | 45 | 1 | 31.52 | 85.52 | 0
1 | 1 | 3 | 31 | 1 | 5445.92 | 5771.42 | 5187.6
1 | 1 | 4 | 9 | 1 | 9795.32 | 9861.36 | 4023.9
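For illustration, the aggregation described above — average waiting time for a specific combination of tool group and product type — can be reproduced from the Table 1 rows with a few lines of ordinary Python. The model exports such tables directly; this sketch only shows how little post-processing the flat output format needs.

```python
from collections import defaultdict

# Table 1 rows: product, lot, step, tool group, machine, start, end, waiting
rows = [
    (1, 1, 1, 59, 1, 2.72, 31.52, 0.0),
    (1, 1, 2, 45, 1, 31.52, 85.52, 0.0),
    (1, 1, 3, 31, 1, 5445.92, 5771.42, 5187.6),
    (1, 1, 4, 9, 1, 9795.32, 9861.36, 4023.9),
]

def avg_waiting(rows, key=lambda r: (r[3], r[0])):
    """Average waiting time grouped by a key (default: tool group, product)."""
    groups = defaultdict(list)
    for r in rows:
        groups[key(r)].append(r[7])
    return {k: sum(v) / len(v) for k, v in groups.items()}

stats = avg_waiting(rows)  # e.g. stats[(31, 1)] is the mean wait at group 31
```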
3 Design of Experiments

MIMAC Dataset 1 was chosen as the basis for the experiments. It consists of 83 machine tool groups (each group has one or several machine tools). There are two products, both of which are non-volatile memory chips. The process flows include 210 and 245 steps, respectively. The average model run time for 16,000 lots (about 1,000 h) is 10 s. Without using additional tools (see [5, 6]), we were able to produce standard lot cycle time (see Fig. 2) or machine tool utilization diagrams as a result of simulation runs.
Fig. 2. Lot cycle time diagram as a standard simulation report.
But it is much more interesting to use the model for optimization tasks. We carried out several scenario experiments to show the practical value of the agile simulation model. We considered four controlled variables (and, accordingly, four experiment types):

• tool (bottleneck) numbers in the highest utilized machine tool group: 2, 3, 4;
• batch size: 1, 2, 4;
• dispatching rule: First In First Out, Shortest Processing Time, Longest Processing Time;
• lot release (product) order: 1–2 (change the product type every other lot, 1 -> 2 -> 1), 100 (change every 100 lots), random.

We considered the following effectiveness criteria: average lot cycle time, average lot waiting time for an operation, and average machine tool group utilization. It is necessary to emphasize that our agile model allows for the use of different variables and criteria. For example, we can calculate the waiting time for a particular machine tool group and a particular product (see Table 1). Producing similar data from some other specialized simulation tool could be very time consuming, as it would require significant additional programming effort. In our case, that effort is replaced by only one line of code as a filter for the reported data.
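To make the design concrete, the four controlled variables above span the following scenario space. The full factorial crossing shown is illustrative only; the experiments in the paper vary one factor at a time, which needs far fewer runs.

```python
from itertools import product

# The four controlled variables of the scenario experiments.
bottleneck_tools = [2, 3, 4]              # tools in the bottleneck group
batch_sizes = [1, 2, 4]
dispatch_rules = ["FIFO", "SPT", "LPT"]
release_orders = ["1-2", "100", "random"]

# A full crossing of all factors would be 3 * 3 * 3 * 3 = 81 runs.
scenarios = list(product(bottleneck_tools, batch_sizes,
                         dispatch_rules, release_orders))
```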
4 Experimental Results and Discussion

Experimental results are presented below. In Figs. 3 and 4, the dependencies of the average lot cycle time on the lot release frequency are shown. The lot release frequency changes from 1 to 5 lots per day for each experimental series; thus each series is drawn as a line on the diagrams. In the left diagram in Fig. 3, the results of three experimental series with 2, 3, and 4 machine tools in the highest utilized machine tool group (bottleneck) are represented. The difference between the lines is small. This suggests that there are other highly utilized machine tools in the production system; in this case we cannot produce a model based only on detailed bottleneck modeling, but instead need to model other machine tools as well. Our agile simulation allows for this. In the right diagram in Fig. 3, the experiments with different batching strategies are shown.
Fig. 3. Average lot cycle time. Experiments: bottleneck numbers (left); batch size (right)
The results correlate with [9] and show that a greedy batching policy (batch size = 1) is better than a full batch strategy (batch size = 4). In the left diagram in Fig. 4, the results of the experiments with three dispatching rules are displayed: First In First Out (FIFO), Shortest Processing Time (SPT), and Longest Processing Time (LPT). The diagram shows that, among these dispatching rules, FIFO provides the best productivity. We also ran three experimental series with different product release orders: 1–2 (change the product type every other lot, 1 -> 2 -> 1), 100 (change every 100 lots), and random. We see that, for this production system, the product release order does not have a significant impact on productivity for most lot release frequencies. When the lot release frequency is set at 3, the random product release order shows better productivity. But with a value of 5, the best product release strategy is 100 (i.e., change the product type every 100 lots). Understanding this phenomenon is one focus of our future research.
Fig. 4. Average lot cycle time. Experiments: lot release rule (left); product release order (right)
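The three compared dispatching rules differ only in the key by which waiting lots are ranked. A minimal sketch, with assumed lot field names (not the model's actual getPriority implementation):

```python
# Each rule is a sort key; the lot with the smallest key is served first.
def fifo_key(lot):
    return lot["arrival"]        # First In First Out: earliest arrival

def spt_key(lot):
    return lot["processing"]     # Shortest Processing Time first

def lpt_key(lot):
    return -lot["processing"]    # Longest Processing Time first

queue = [
    {"no": 1, "arrival": 0.0, "processing": 5.0},
    {"no": 2, "arrival": 1.0, "processing": 2.0},
]
next_by_fifo = min(queue, key=fifo_key)  # lot 1: arrived first
next_by_spt = min(queue, key=spt_key)    # lot 2: shortest job
next_by_lpt = min(queue, key=lpt_key)    # lot 1: longest job
```

Swapping the key function is the only change needed to switch rules, which is why such comparisons are cheap to run in the agile model.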
In Figs. 5 and 6, the dependencies of the average lot waiting time for an operation and the average machine tool group utilization on the controlled parameters are shown. The diagrams are drawn for a lot release value of 4 lots per day. Based on the left diagram in Fig. 5, we conclude that the larger the number of machine tools in the bottleneck group, the better the values of the effectiveness criteria. This seems rather obvious and can be considered evidence of the accuracy of our model. For the batch size (the right diagram in Fig. 5), we can see that bigger batch sizes produce lower utilization. This happens because a batch tool must wait longer before a batch is actually processed, due to the batching waiting time. The machine tools next in line after the batch tool in the process flow must also wait. Thus we see lower utilization and longer average lot waiting time for the full batch strategy.
In the left diagram in Fig. 6, we see that the best dispatching rule is FIFO. It provides higher average utilization and lower average lot waiting time. The higher average utilization in this case indicates a better balanced production system. As for the product release order, the best solution proves to be random release (see the right diagram in Fig. 6).
Fig. 5. Average waiting time and average utilization. Experiments: bottleneck number (left); batch size (right)
Fig. 6. Average waiting time and average utilization. Experiments: lot release rule (left); product release order (right)
Based on the experimental results, we conclude that the best configuration for the production system is as follows: 4 machine tools in the highest utilized machine tool group, a greedy batching policy (1 lot per batch), the FIFO dispatching rule, and the random product release rule. More importantly, the results demonstrate that our agile model is suitable for such optimization experiments and can be easily adjusted to other simulation needs.
5 Conclusions

We developed an agile simulation model for semiconductor manufacturing using the Anylogic simulation tool. This work enhances the previous research mentioned in the paper and references. The core of the model is a universal process flow step which describes all possible actions. The model allows for the easy implementation of all necessary production logic. The model reports are configurable and can be adjusted for various research tasks. We carried out several scenario optimization experiments and demonstrated the work of the model with "standard" reports used in the literature. The model can be used both in supply chain and in factory simulation in the semiconductor industry. In future research we plan to enhance the model by adding several other dispatching rules (earliest due date, critical ratio), group setup (not only step setup, as is currently realized in the model), plant maintenance, and exceptional down events. We also plan deeper research into the different product release strategies.

Acknowledgments. Part of this work was carried out with the financial support of RFBR, grant No. 18-07-01311.
References

1. Wang, L., Wang, G.: Big data in cyber-physical systems, digital manufacturing and industry 4.0. Int. J. Eng. Manuf. (IJEM) 6(4), 1–8 (2016)
2. Anitha, P., Patil, M.M.: A review on data analytics for supply chain management: a case study. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 10(5), 30–39 (2018)
3. Al-Samawi, Y.: Digital firm: requirements, recommendations, and evaluation the success in digitization. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 1, 39–49 (2019)
4. Van der Zee, D.J.: Model simplification in manufacturing simulation – review and framework. Comput. Ind. Eng. 127, 1056–1067 (2019)
5. Sprenger, R., Rose, R.: On the simplification of semiconductor wafer factory simulation models. In: Robinson, S., Brooks, R., Kotiadis, K., van der Zee, D.J. (eds.) Conceptual Modeling for Discrete-Event Simulation, pp. 451–470. CRC Press, Taylor & Francis Group, Boca Raton (2010)
6. Ewen, H., Mönch, L., Ehm, H., Ponsignon, T., Fowler, J.W., Forstner, L.: A testbed for simulating semiconductor supply chains. IEEE Trans. Semicond. Manuf. 30(3), 293–305 (2017)
7. Johnson, R.T., Fowler, J.W., Mackulak, G.T.: A discrete event simulation model simplification technique. In: Proceedings of the 37th Conference on Winter Simulation, pp. 2172–2176 (2005)
8. Rank, S., Hammel, C., Schmidt, T., Müller, J., Wenzel, A., Lasch, R., Schneider, G.: The correct level of model complexity in semiconductor fab simulation – lessons learned from practice. In: 27th Annual SEMI Advanced Semiconductor Manufacturing Conference, pp. 133–139. Institute of Electrical and Electronics Engineers (2016)
9. Fowler, J., Robinson, J.: Measurement and improvement of manufacturing capacity (MIMAC) designed experiment report: technology transfer, pp. 3–16. SEMATECH (1995)
10. MIMAC Datasets. http://p2schedgen.fernuni-hagen.de/index.php?id=242. Accessed 05 July 2019
11. Hassoun, M., Kalir, A.: Towards a new simulation testbed for semiconductor manufacturing. In: Proceedings of the 2017 Winter Simulation Conference, pp. 3612–3623 (2017)
12. Anylogic Homepage. http://www.anylogic.ru/. Accessed 05 July 2019
13. Sadeghi, R., Dauzère-Pérès, S., Yugma, C.: A multi-method simulation modelling for semiconductor manufacturing. IFAC-PapersOnLine 49(12), 727–732 (2016)
14. Bagchi, S., Chen-Ritzo, C.H., Shikalgar, S.T., Toner, M.: A full-factory simulator as a daily decision-support tool for 300 mm wafer fabrication productivity. In: Proceedings of the Winter Simulation Conference, pp. 2021–2029 (2008)
15. Rozen, K., Byrne, N.M.: Using simulation to improve semiconductor factory cycle time by segregation of preventive maintenance activities. In: Proceedings of the Winter Simulation Conference, pp. 2676–2684 (2016)
16. März, L., Krug, W., Rose, O., Weigert, G. (eds.): Simulation und Optimierung in Produktion und Logistik: Praxisorientierter Leitfaden mit Fallbeispielen. Springer, Heidelberg (2010)
17. Baur, D., Nagel, W., Berger, O.: Systematics and key performance indicators to control a GaAs volume production. In: Proceedings of the International Conference on Compound Semiconductor Manufacturing Technology, pp. 24–27 (2001)